PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.

Home Page:www.pecanproject.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

GitHub actions failing

robkooper opened this issue · comments

Bug Description

If we have 2 or more PRs github builds fail This can be due to rate limiting.

To Reproduce

start 2 or more pr builds at the same

@Aariq suggested:

Did you ever figure out why the actions are hitting the GitHub api limit? Could changing the actions to install dependencies from r-universe instead of github help with that? You can add any github-only dependencies to https://pecanproject.r-universe.dev/

For example, here I just added it as an argument to the setup-r action:
https://github.com/Aariq/pecan_ed2/actions/runs/5271732802/workflow#L37-L42

Yeah, I think the only reason an action would be using the GitHub API is to install packages from GitHub. But r-universe is already building binaries of the dev versions of all PEcAn packages, so if the GitHub actions aren't using those, that might be low-hanging fruit to fix this issue.

hey @Aariq, thanks for the heads up on slack, we did some tests on the workflow trying to figure out where exactly the calls are used, I will make a detailed comment on the findings in some time, would really appreciate your help with this issue :)

From the False PR #3209, we got to know that

Each Docker workflow approximately uses 26 - 27 (including the 8 cURL calls we did in the makefile) GitHub API calls, and hence we hit the rate limit on running 2 or more checks simultaneously,

A Sample workflow can be found over here

#12 73.24 curl -s -I https://api.github.com/orgs/pecanproject | grep ratelimit
#12 73.32 x-ratelimit-limit: 60
#12 73.32 x-ratelimit-remaining: 34
#12 73.32 x-ratelimit-reset: 1691483147
#12 73.32 x-ratelimit-resource: core
#12 73.32 x-ratelimit-used: 26
 

This is because everytime we install the modules/ dependencies using the GitHub API in the makefile, so to avoid this, one suggestion came up from @Aariq that we should install dependencies from r-universe instead of github

He also gave one sample workflow which uses the same concept: Sample Workflow

So now the question is, should we skip the installation part of the dependencies in the Makefile altogether or only some specific parts in the makefile need to be removed?

Also in the workflow, after the step of creating the image is completed, if we check the remaining rate limit then it shows that remaining rate is 0, meaning there are parts in the ./docker.sh script which tend to use the API calls and exhaust them:

Run curl -s -I https://api.github.com/orgs/pecanproject | grep ratelimit
x-ratelimit-limit: 60
x-ratelimit-remaining: 0
x-ratelimit-reset: 1692279788
x-ratelimit-resource: core
x-ratelimit-used: 60

A more general way to use our r-universe is to just set the repos option in R before installing anything. I don't know how the docker workflow works, but you could put this in a .Rprofile for example

options(repos = c(CRAN = 'https://cloud.r-project.org', pecan = 'https://pecanproject.r-universe.dev'))

The number 60 is significant here: Unless they’ve changed any rate limits since last I looked, 60/hour is the limit for unauthenticated calls, so seeing it means our access token is somehow not getting passed through/recognized.

Yes you are right, According to the latest github docs:

For unauthenticated requests, the rate limit allows for up to 60 requests per hour. Unauthenticated requests are associated with the originating IP address, and not the person making requests

For authenticated user:

User access token requests are limited to 5,000 requests per hour and per authenticated user 

So this may leave us with an possibility that we can use one single persons account / access token in the workflow (I think this may sign off the whole built to that persons github account

I don't think you should have to use someone's token. You should be able to use ${{ secrets.GITHUB_TOKEN }} in the workflow if you need one and I assume that counts as a authenticated user. https://docs.github.com/en/actions/security-guides/automatic-token-authentication

We do use the token in the current workflow

env:
  MASTER_REPO: PecanProject/pecan
  DOCKERHUB_ORG: pecan
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

But these are not getting passed through/recognized.

I know next to nothing about docker, but does this env variable need to get passed to docker.sh somehow maybe?

ahh, let me think on this one and get back to you :)

I see it here https://github.com/PecanProject/pecan/blob/78c347a9aaedae223d958ecd9fb98f989b4dae2d/docker.sh#L150C34-L150C49

But should it also be in all the other docker build commands maybe?

But should it also be in all the other docker build commands maybe?

umm, Let me test it out in the false PR which is already up #3209

See also discussions from the last time we battled this hydra of a problem: #2986, #3083.

Here's a great example
https://github.com/PecanProject/pecan/actions/runs/5950411179/job/16138150098?pr=3203#step:12:1189

./scripts/time.sh "install  models/biocro" Rscript -e "options(Ncpus = 2)" -e "devtools::install('models/biocro', upgrade=FALSE)"

This script should probably be running install.packages('PEcAn.BIOCRO', repos = c('https://pecanproject.r-universe.dev', 'https://cloud.r-project.org')) instead of devtools::install('models/biocro')

Okay I see it here, but there one problem with it:

We have automated the installation in the makefile as follows:

install_R_pkg = ./scripts/time.sh "install ${1}" Rscript \
	-e ${SETROPTIONS} \
	-e "devtools::install('$(strip $(1))', upgrade=FALSE)"

Not all the packages/modules are present at https://pecanproject.r-universe.dev, and in the makefile we have defined the modules in form of arguments so we iterate over each of them. So i think we have to manually update the makefile to install models which are present at the website and rest modules would be installed as before, this may add to the complexity of the code :)

But should it also be in all the other docker build commands maybe?

I have added the --secrets flag in the code, but don't seem to find a way to test it, Let me make 2 3 simultaneous commits and check whether the Actions fail

Edit:

Nope doesn't work, the actions still fail due to api rate limit

hey @infotroph @Aariq , after discussion with Rob sir, we are sure that the GITHUB_PAT is not passed down to the docker build command. Do you guys have any reference to a github action that passes secrets to build the docker images?

I know there are reasons for these actions to be extra complicated, but why not just use something like devtools::install_deps() or pak::local_install_deps() which figures out what the dependencies of a package are from DESCRIPTION? Then you just need to set our r-universe as a repo (e.g. with options(repos = c('https://pecanproject.r-universe.dev', 'https://cloud.r-project.org')) and then run one of these functions. If there are GitHub-only dependencies not on pecanproject.r-universe.dev, then we should just add them.

If you need to add packages to our r-universe, make a PR to modify this file: https://github.com/PecanProject/universe/blob/main/packages.json

A more general way to use our r-universe is to just set the repos option in R before installing anything. I don't know how the docker workflow works, but you could put this in a .Rprofile for example

options(repos = c(CRAN = 'https://cloud.r-project.org', pecan = 'https://pecanproject.r-universe.dev'))

@Aariq , I checked on your suggestions about installing the packages from Runiverse, turns out that it supports installation for R-4.3, whereas in our CI workflow we use R-4.0 and R-4.1 so what would you suggest? should we upgrade the versions of R or do you see any workaround for this?

Here are the logs for your reference:

  Warning message:
  package ‘PEcAn.base/logger’ is not available for this version of R

This is what i see on Runiverse
IMG_6162

Also note that, the container that we use i.e. pecan/depends is only available for R4.0 and R4.1 and lower on dockerhub, so that should also be considered
IMG_6161

This isn't an r-universe specific thing. CRAN also builds binaries only for r-release, and r-oldrel. You should be able install from source still, but it will take much longer and the container might need build tools or something. install.packages('PEcAn.logger', repos = c('https://pecanproject.r-universe.dev', 'https://cloud.r-project.org')) works on my mac with R 4.0. Generally for tests and checks is best practice to use r-release, r-devel, and maybe r-oldrel so that you're always checking that packages build with the most recent (and upcoming) versions of R with no issues. You might want to just take the same approach with the docker containers and update R in them more often to take advantage of the pre-built binaries on r-universe.

okay this does install some binaries but it is case sensitive, so can you please merge my PR in the pecanproject.r-universe.dev Repository? @Aariq @mdietze

I have updated the names of the packages which are now case sensitive and as per the names of their directories, this will help us in downloading the packages from the runiverse directly and help to avoid hitting the Github API rate limit.

I guess we have indeed found a fix to this issue :)

The PR can be found below:

PecanProject/pecanproject.r-universe.dev#7

Followup: Currently i am getting error for packages like db, because on R universe. the name is PEcAn.DB but in our makefile and the subdirectory, the name is db (case sensitive) and this is the reason why we are getting errors in github actions, binaries like util and logger get installed without any problem when we install them directly from Runiverse :)

Apparently this issue was solved in the PR #3241 by @infotroph , kudos to him :)

Closing this issue for the time, may reopen this issue (Hopefully never) if we face same problem in the nearby future