gabrie30 / ghorg

Quickly clone an entire org/users repositories into one directory - Supports GitHub, GitLab, Bitbucket, and more 🐇🥚


Reclone not working inside container

afonsoc12 opened this issue · comments

Describe the bug
I'm packaging this amazing tool in a docker container for Unraid. All is working fine except for reclone.
The same command works with ghorg clone but not ghorg reclone.
It is able to find the command and starts running it, but exits with ERROR: Running ghorg clone cmd: ghorg clone kubernetes, err: exit status 1.
Here is the repository for all docker supporting scripts

To Reproduce
Steps to reproduce the behavior:

  1. Use the following reclone.yaml. I'll assume it is stored at $HOME/.config/ghorg/reclone.yaml
kubernetes:
     cmd: "ghorg clone kubernetes"
  2. Run the following docker command and see that it works:
docker run --rm \
        -e GHORG_CONFIG="" `#Dont use config file` \
        -e GHORG_GITHUB_TOKEN=<API_TOKEN> \
        -v $HOME/.config/ghorg:/config \
        -v $HOME/repos:/data \
        afonsoc12/ghorg:latest \
        clone kubernetes
  3. Now use reclone. It will start running the command but will error:
docker run --rm \
        -e GHORG_CONFIG="" `#Dont use config file` \
        -e GHORG_GITHUB_TOKEN=<API_TOKEN> \
        -v $HOME/.config/ghorg:/config \
        -v $HOME/repos:/data \
        afonsoc12/ghorg:latest \
        reclone

Environment (please complete the following information):
Tried in M1 Mac and Linux/Unraid

Additional context
Looking for a way to debug this, or get more information on the error.

Thank you and keep up the great work!

Thanks for your kind words and the very interesting issue!

It seems to be coming from this call to Wait() https://github.com/gabrie30/ghorg/blob/master/cmd/reclone.go#L177

I was able to get the reclone command to work inside docker by following these commands, so I'd wager there's something missing from your image that the .Wait() call needs to do its thing.

Might be worth trying the golang:alpine base image in your Dockerfile to see if that works.

Hi,

Thanks for your prompt response.

I was able to run a few more tests and I don't think there is anything wrong with my image or Docker whatsoever.

The issue is that ghorg is able to read from a file at a different location, but throws an error when GHORG_RECLONE_PATH is set.

Take a look at the following examples:

# This command works fine (just unset all container default variables)
docker run --rm \
        -e XDG_CONFIG_HOME="" \
        -e GHORG_CONFIG="" \
        -e GHORG_RECLONE_PATH="" \
        -e GHORG_GITHUB_TOKEN=<API_TOKEN> \
        -v $HOME/.config/ghorg:/home/.config/ghorg \
        afonsoc12/ghorg \
        reclone
# Now if I explicitly define locations for `GHORG_CONFIG` and `GHORG_RECLONE_PATH`,
# it reads from both locations, but throws the error
docker run --rm \
        -e XDG_CONFIG_HOME="" \
        -e GHORG_CONFIG="/config/conf.yaml" \
        -e GHORG_RECLONE_PATH="/config/reclone.yaml" \
        -e GHORG_GITHUB_TOKEN=<API_TOKEN> \
        -v $HOME/.config/ghorg:/config \
        afonsoc12/ghorg \
        reclone

Hope this helps you troubleshoot further. I am not experienced in Go, so I can't be of much help investigating the issue.
I think we can mark this as a bug.

Cheers

This is working for me locally. Can you try this to see if it works for you too?

  1. clone ghorg repo
  2. cd into repo
  3. docker build . -t ghorg-docker
  4. run
docker run --rm \
-e GHORG_CONFIG="/config/conf.yaml" \
-e GHORG_DEBUG="true" \
-e GHORG_RECLONE_PATH="/config/reclone.yaml" \
-e GHORG_GITHUB_TOKEN=xxxxxxxxxx \
-v $HOME/.config/ghorg:/config \
-v $HOME/repos:/data \
ghorg-docker \
ghorg reclone --verbose

Hi @gabrie30,

I probably didn't explain what my objective is. Let me rephrase:
My idea is to bypass conf.yaml, since those settings can be defined as environment variables. So, using your container, this does not work:

# conf.yaml does not exist in this test, only reclone
docker run --rm \
-e GHORG_DEBUG="true" \
-e GHORG_RECLONE_PATH="/config/reclone.yaml" \
-e GHORG_GITHUB_TOKEN=<API_TOKEN> \
-v $HOME/.config/ghorg:/config \
-v $HOME/repos:/data \
ghorg-docker \
ghorg reclone --verbose

The file isn't mandatory, I assume, so why throw the error? If the file exists but is empty, the result is the same, regardless of whether the necessary env variables are defined.

# conf.yaml exists but is the sample one (nothing set)
docker run --rm \
-e GHORG_DEBUG="true" \
-e GHORG_CONFIG="/config/conf.yaml"  \
-e GHORG_RECLONE_PATH="/config/reclone.yaml" \
-e GHORG_GITHUB_TOKEN=<API_TOKEN> \
-v $HOME/.config/ghorg:/config \
-v $HOME/repos:/data \
ghorg-docker \
ghorg reclone --verbose

I can easily solve this by just using the conf file. I'm just wondering if this is expected behaviour or if I'm doing something wrong.

You're correct: the expected behavior of ghorg clone is that it does not need a configuration file. You're not doing anything wrong. The way reclone works is that it basically calls ghorg clone on the user's behalf. But because each ghorg clone called within reclone can use a different configuration, some setup must happen between the clones, such as resetting all the GHORG envs except for a few of them. So one thing you might run into is that the GHORG_GITHUB_TOKEN env that gets passed in via docker will likely get reset.

However, I've yet to pinpoint the exact issue you are running into. I think it might have something to do with GHORG_CONFIG being a global flag and how viper initializes it, then how reclone skips resetting it. I'll need more time to poke around with this one. Glad you at least have a workaround.
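
To make the reset described above concrete, here is a minimal shell sketch of the idea. It is not ghorg's actual Go code: the function name run_with_ghorg_env_reset and the RECLONE_KEEP list are made up for illustration, and the two kept variables are simply the ones this thread says reclone skips.

```shell
#!/bin/sh
# Illustrative only, NOT ghorg's implementation.
# Hypothetical list of variables that survive the reset.
RECLONE_KEEP="GHORG_RECLONE_PATH GHORG_CONFIG"

# Run "$@" in a subshell after unsetting every exported GHORG_* variable
# that is not listed in RECLONE_KEEP. The caller's environment is untouched
# because the function body is a subshell.
run_with_ghorg_env_reset() (
  for name in $(env | sed -n 's/^\(GHORG_[A-Za-z0-9_]*\)=.*/\1/p'); do
    case " $RECLONE_KEEP " in
      *" $name "*) ;;        # keep skipped variables
      *) unset "$name" ;;    # drop everything else, including tokens
    esac
  done
  "$@"
)
```

Under this sketch, a child started with run_with_ghorg_env_reset would still see GHORG_RECLONE_PATH but not a GHORG_GITHUB_TOKEN passed in with docker run -e, which matches the "Could not find a valid github token" symptom.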

@afonsoc12 I made some updates. I noticed I was resetting GHORG_RECLONE_PATH, which should have been one of the envs being skipped; I believe that was the issue in your first comment. GHORG_GITHUB_TOKEN is also going to get reset. The reclone.yaml should hold the tokens for those clones. If not, then the GHORG_CONFIG file should hold them.

I think this should fix your issue. Would you be able to use master to test? I'd rather not cut a new release until I know it solves your issue. But I can if we need to.

Hi @gabrie30,

Thanks for putting some time into this. As far as I have tested, it still exhibits the same behaviour as before.

I see what you mean about the environment being reset between clone calls. But would it be possible to store the reclone environment at the beginning and then restore it before each clone call? And make sure that ENV variables always take precedence over the config file (which I think they do).

Just a thought. As I've mentioned, it's not urgent, but it would be nice and would make tweaking an image for Unraid easier.

@afonsoc12 I made some changes, and I'm pretty sure I've found the issue. If this doesn't fix it, let me know the output you get when you use the --verbose flag and what your reclone.yaml looks like.

I don't want to store the reclone environment before the reclone because each reclone should be able to use its own conf.yaml if it needs to.

Hi,
Thanks for the effort you're putting into this issue. I'm afraid it still shows the same problem. I'm using your Dockerfile for all the troubleshooting.

Have you been able to make it work on your end? It's not working on mine, though.

The image was built on commit 1db38f.

Here's the output of

docker run --rm \
-e GHORG_DEBUG="true" \
-e GHORG_RECLONE_PATH="/config/reclone.yaml" \
-e GHORG_GITHUB_TOKEN=<API TOKEN> \
-v $HOME/.config/ghorg:/config \
-v $HOME/repos:/data \
ghorg-docker \
ghorg reclone --verbose
Output of docker run
$ docker run --rm \
-e GHORG_DEBUG="true" \
-e GHORG_RECLONE_PATH="/config/reclone.yaml" \
-e GHORG_GITHUB_TOKEN=<API TOKEN> \
-v $HOME/.config/ghorg:/config \
-v $HOME/repos:/data \
ghorg-docker \
ghorg reclone --verbose

-------- Setting Default ENV values ---------
GHORG_ABSOLUTE_PATH_TO_CLONE_TO: /root/ghorg/
GHORG_BRANCH:
GHORG_CLONE_PROTOCOL: https
GHORG_CLONE_TYPE: org
GHORG_SCM_TYPE: github
GHORG_SKIP_ARCHIVED: false
GHORG_SKIP_FORKS: false
GHORG_NO_CLEAN: false
GHORG_FETCH_ALL: false
GHORG_PRUNE: false
GHORG_PRUNE_NO_CONFIRM: false
GHORG_DRY_RUN: false
GHORG_CLONE_WIKI: false
GHORG_INSECURE_GITLAB_CLIENT: false
GHORG_BACKUP: false
GHORG_RECLONE_VERBOSE: false
GHORG_RECLONE_QUIET: false
GHORG_CONCURRENCY: 25
GHORG_INCLUDE_SUBMODULES: false
GHORG_EXIT_CODE_ON_CLONE_INFOS: 0
GHORG_EXIT_CODE_ON_CLONE_ISSUES: 1
GHORG_GITHUB_TOKEN: <API TOKEN>
GHORG_COLOR: disabled
GHORG_TOPICS:
GHORG_GITLAB_TOKEN:
GHORG_BITBUCKET_USERNAME:
GHORG_BITBUCKET_APP_PASSWORD:
GHORG_BITBUCKET_OAUTH_TOKEN:
GHORG_SCM_BASE_URL:
GHORG_PRESERVE_DIRECTORY_STRUCTURE: false
GHORG_OUTPUT_DIR:
GHORG_MATCH_REGEX:
GHORG_EXCLUDE_MATCH_REGEX:
GHORG_MATCH_PREFIX:
GHORG_EXCLUDE_MATCH_PREFIX:
GHORG_GITLAB_GROUP_EXCLUDE_MATCH_REGEX:
GHORG_IGNORE_PATH: /root/.config/ghorg/ghorgignore
GHORG_RECLONE_PATH: /config/reclone.yaml
GHORG_QUIET: false
GHORG_GIT_FILTER:
Aliases:
map[string]string{}
Override:
map[string]interface {}{}
PFlags:
map[string]viper.FlagValue{"color":viper.pflagValue{flag:(*pflag.Flag)(0x40001808c0)}, "config":viper.pflagValue{flag:(*pflag.Flag)(0x4000180960)}}
Env:
map[string][]string{}
Key/Value Store:
map[string]interface {}{}
Config:
map[string]interface {}{}
Defaults:
map[string]interface {}{"config":"/root/.config/ghorg/conf.yaml"}
Viper config file used:
GHORG_CONFIG SET TO: none
$ ghorg clone kubernetes
Could not find a valid github token. GHORG_GITHUB_TOKEN or (--token, -t) flag must be set. Create a personal access token, then set it in your $HOME/.config/ghorg/conf.yaml or use the (--token, -t) flag, see 'GitHub Setup' in README.md
ERROR: Running ghorg clone cmd: ghorg clone kubernetes, err: exit status 1
Env inside the container
PATH=/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=427bcd4d2685
GHORG_DEBUG=true
GHORG_RECLONE_PATH=/config/reclone.yaml
GHORG_GITHUB_TOKEN=<API TOKEN>
GOLANG_VERSION=1.19.2
GOPATH=/go
HOME=/root

Thanks for the output. So it looks like the reason it's failing is:

Could not find a valid github token. GHORG_GITHUB_TOKEN or (--token, -t) flag must be set. Create a personal access token, then set it in your $HOME/.config/ghorg/conf.yaml or use the (--token, -t) flag, see 'GitHub Setup' in README.md

This is because you are passing in GHORG_GITHUB_TOKEN via an env var, and that env is reset by the reclone command. The reset is required so that config files can be used inside the reclone.yaml, which allows users to set a different config file for each of their reclone commands if needed, e.g.:

kubernetes:
  cmd: "ghorg clone kubernetes --config=/home/user/ghorg/conf.yaml"
kubernetes-some-other-way:
  cmd: "ghorg clone kubernetes --config=/home/user/ghorg/special-conf.yaml"

The GHORG_GITHUB_TOKEN should be set in the reclone.yaml command. You can either set it as a command line flag or reference another config file.

kubernetes:
  cmd: "ghorg clone kubernetes --token=xxxxxxxxx"
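
If the token only exists as a container environment variable, one possible approach is to render the reclone.yaml from it before calling ghorg reclone. The sketch below assumes a hypothetical helper name (write_reclone_yaml) and a wrapper or entrypoint script you control; it only uses the --token flag shown above. Note that the token ends up in a plain-text file.

```shell
#!/bin/sh
# Hypothetical helper: write a one-entry reclone.yaml that passes the token
# from the GHORG_GITHUB_TOKEN environment variable as a --token flag.
write_reclone_yaml() {
  out_path=$1
  if [ -z "${GHORG_GITHUB_TOKEN:-}" ]; then
    echo "GHORG_GITHUB_TOKEN is not set" >&2
    return 1
  fi
  (
    umask 077   # the file contains a secret, so keep it private
    printf 'kubernetes:\n  cmd: "ghorg clone kubernetes --token=%s"\n' \
      "$GHORG_GITHUB_TOKEN" > "$out_path"
  )
}
```

A wrapper script could then call write_reclone_yaml /config/reclone.yaml and run ghorg reclone with GHORG_RECLONE_PATH pointing at that file.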

I see, that makes sense!
Just wondering why they are reset rather than defaulted.
For example, if you don't define any setting that would override the container's GHORG_GITHUB_TOKEN (via config file or --token), that one would be used as the default; otherwise the setting with the closest scope would win.
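
The defaulting being asked about could be sketched roughly like this. It is purely an illustration of the proposal, not how ghorg behaves today; resolve_token is a made-up name, and the order (flag, then config file, then container env) is my reading of "closest scope".

```shell
#!/bin/sh
# Hypothetical: resolve_token FLAG_VALUE CONFIG_VALUE ENV_VALUE
# Prints the first non-empty value, checked from closest scope to widest.
# Returns 1 if no token is available at any level.
resolve_token() {
  for candidate in "$1" "$2" "$3"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

With this order, the container's token would only be used when neither --token nor the config file provides one.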

Each clone within the reclone should be independent of the next; that's why the envs are reset. If you wanted to have the same configuration for each run, such as using the same GitHub token, you would do that via the configuration file. It starts to get difficult to keep track of which configurations are set and where if you start passing envs down into clones and only using them when they are not overwritten in the configuration file or by a flag. It's much simpler to look at the reclone.yaml and know exactly what configuration will be used.

Current approach to reclone

I see that the current approach, "ghorg reclone shall take care of selected env variables", is surprising and apparently misleads some users.

Proposed approach to reclone

Consider the following updated approach:

  • the user can configure clone and reclone by means of conf.yaml, env variables (as described for clone), and command line options
  • reclone is free to use command line options where needed (thus overriding what is set by env variables or conf.yaml)
  • reclone will not touch env variables and will pass them down to the clone command as is
  • it is the user who is responsible for resolving possible configuration conflicts, not the reclone subcommand.

It could be summarized as "ghorg reclone runs one or more ghorg clone commands as defined in a YAML configuration file".

This is very confusing indeed. I followed the README, which gives this example:

docker run --rm \
        -e GHORG_GITHUB_TOKEN=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2 \
        -v $HOME/.config/ghorg:/config `# optional` \
        -v $HOME/repositories:/data \
        ghcr.io/gabrie30/ghorg:latest \
        clone kubernetes --match-regex=^sig

but I changed clone to reclone, expecting the token to still work. Then I spent an hour debugging and finally found this thread.
So, what's the recommended way to pass in the token to a docker container and use reclone?
(Maybe you didn't originally design this tool with docker in mind, so just add some notes warning against using reclone with docker and env vars?)

P.S. I tried to work around this by setting a different ENV variable via docker and using that one in the reclone command. I was hoping that the new ENV var would not be cleared.

# +-+-+-+-+-+-+-+-+-+-+-+-+-+
# |G|H|O|R|G| |R|E|C|L|O|N|E|
# +-+-+-+-+-+-+-+-+-+-+-+-+-+
my-gitlab:
  cmd: "ghorg clone dev --scm=gitlab --token=$(GITLAB_TOKEN) --fetch-all --prune --prune-no-confirm --base-url=https://gitlab.myserver.io"

But this doesn't seem to work either :(

As far as I understand (and I haven't tested this in a while), you can't set it as an env variable.
Either specify it in conf.yaml or hard-code it in your reclone.yaml.
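
For the conf.yaml route, a small wrapper could write the file from whatever variable docker passes in, and the reclone.yaml entry could then point at it with --config, as in the earlier examples in this thread. This is only a sketch: write_conf_yaml is a made-up helper, and it assumes conf.yaml accepts keys named like the environment variables (for example GHORG_GITLAB_TOKEN), which I have not verified here.

```shell
#!/bin/sh
# Hypothetical helper: write_conf_yaml OUT_PATH KEY VALUE
# Writes a single "KEY: VALUE" line to OUT_PATH with private permissions.
# Assumes (unverified) that conf.yaml keys match the GHORG_* env names.
write_conf_yaml() {
  out_path=$1
  key=$2
  value=$3
  if [ -z "$value" ]; then
    echo "refusing to write an empty value for $key" >&2
    return 1
  fi
  (
    umask 077   # the file contains a secret
    printf '%s: %s\n' "$key" "$value" > "$out_path"
  )
}
```

For example, write_conf_yaml /config/conf.yaml GHORG_GITLAB_TOKEN "$GITLAB_TOKEN" before running ghorg reclone, with the cmd in reclone.yaml using --config=/config/conf.yaml instead of --token=$(GITLAB_TOKEN). Note that $(NAME) is shell command substitution rather than variable expansion, and nothing in this thread suggests reclone expands either form.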