Codestats
With an existing DuckDB:
- Extract the database to `data/git.duckdb`:

      mkdir data/
      mv ~/Downloads/git.duckdb.gz data/
      gunzip data/git.duckdb.gz
- Install dependencies and start Streamlit:

      poetry install
      poetry run streamlit run Recent.py

  Or, if you don't want to install Python, there is a `Dockerfile` and a `docker-compose.yml` too, although the containerized app seems to run somewhat slower, perhaps due to the default cgroups limits. DuckDB seems rather hungry for resources.

      docker compose build
      docker compose up
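If `gunzip` is not available (for example on a workstation without the usual Unix tools), the extraction step above can be approximated with the Python standard library. This is a minimal sketch, not part of the project: the function name `gunzip_file` and its `keep_source` parameter are made up for illustration, and the paths in the usage comment are simply the ones from the extraction step.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src: Path, dst: Path, keep_source: bool = True) -> Path:
    """Decompress the gzip file `src` into `dst`, streaming in chunks.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    # Create the target directory (e.g. data/) if it does not exist yet.
    dst.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(src, "rb") as compressed, open(dst, "wb") as out:
        # copyfileobj streams the data, so a large DB is never fully in memory.
        shutil.copyfileobj(compressed, out)
    if not keep_source:
        # Plain `gunzip` deletes the archive; here that is opt-in.
        src.unlink()
    return dst


# Usage (paths as in the extraction step):
# gunzip_file(Path.home() / "Downloads" / "git.duckdb.gz", Path("data/git.duckdb"))
```

Unlike `gunzip`, the sketch keeps the downloaded archive by default; pass `keep_source=False` to remove it after extraction.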
Without an existing DB
- Update your SSH config (`~/.ssh/config`). I use this config for multiplexing to speed up cloning:

      Host redmine-git
          User git
          HostName redmine.mgmtprod
          Port 2223
          IdentityFile ~/.ssh/id_rsa
          ControlPath ~/.ssh/connections/%r@%h.ctl
          ControlMaster auto
          ControlPersist 10m
          IdentitiesOnly yes
- Create an `.env` in this directory:

      export GITLAB_HOST=gitlab.mgmtprod
      export GITLAB_USER=<username>
      export GITLAB_TOKEN=<personal access token>
      export GITLAB_ROOT="${HOME}/repos/gitlab"
      export GITOLITE_HOST="redmine-git"
      export GITOLITE_ROOT="${HOME}/repos/gitolite"

  If necessary, create the GitLab personal access token first.
- The indexing process happens in four steps:
  - repository discovery (`poetry run python discover_gitlab.py` and `discover_gitolite.py`)
    - this produces `data/repos-*.csv`
  - cloning (or fetching) the repositories (`fetch_known_repos.py`)
    - this produces bare repositories in `GITLAB_ROOT` and `GITOLITE_ROOT`
    - I have, in the past, used `git worktree` to work with bare repos locally too:

          git -C ~/repos/gitlab/odoo/odoo.git worktree add ~/work/odoo main
          git -C ~/work/odoo commit
          rm -rf ~/work/odoo
          git -C ~/repos/gitlab/odoo/odoo.git worktree prune

  - indexing the repositories by parsing the output of `git ls-tree` and `git log --numstat`
    - this produces `data/git_*.csv`
  - and lastly loading the CSVs into a DuckDB database
    - this produces `data/git.duckdb`
    - to be compressed into `data/git.duckdb.gz` using `gzip -k data/git.duckdb`
    - originally, this project ran on PostgreSQL
    - but DuckDB is useful for a workshop format, and for sharing the DB index in general
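To give a feel for the third step, here is a minimal sketch of parsing `git log --numstat` output. It is not the project's actual indexer. It assumes the log was produced with `git log --numstat --format=%H`, so each commit starts with a bare hash line followed by tab-separated `added`, `deleted`, `path` rows. The names `FileChange` and `parse_numstat` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterator, Optional

# A bare commit hash line (40 hex chars for SHA-1, 64 for SHA-256 repos).
COMMIT_RE = re.compile(r"^[0-9a-f]{40,64}$")


@dataclass
class FileChange:
    commit: str
    path: str
    added: Optional[int]    # None for binary files, which git reports as "-"
    deleted: Optional[int]  # None for binary files


def _count(field: str) -> Optional[int]:
    """Convert a numstat count to int; '-' (binary file) becomes None."""
    return int(field) if field.isdigit() else None


def parse_numstat(text: str) -> Iterator[FileChange]:
    """Yield one FileChange per numstat row, tagged with its commit hash."""
    commit = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines carry no data
        if COMMIT_RE.match(line):
            commit = line
            continue
        parts = line.split("\t", 2)
        if commit is None or len(parts) != 3:
            continue  # not a row this sketch understands
        added, deleted, path = parts
        # Renames keep git's raw "old => new" path notation here.
        yield FileChange(commit, path, _count(added), _count(deleted))
```

Each `FileChange` maps naturally onto one CSV row, for example via `csv.writer`, which is roughly the shape of data that `data/git_*.csv` would hold under these assumptions.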