asreview / asreview

Active learning for systematic reviews

Home Page: https://asreview.ai

Docker issues

abelsiqueira opened this issue

This issue serves to keep track of the following two issues related to the Docker images:

1. Docker deploy tests take too long

The Docker deploy workflow takes around 40 minutes for every build.
My understanding of the issue:

The Docker image is built for arm64 (for Mac M2 chips) as well as amd64 (Linux and older Macs): https://github.com/asreview/asreview/blob/master/.github/workflows/docker.yml#L54

The arm64 build takes 2040.6 seconds (~34 minutes): https://github.com/asreview/asreview/actions/runs/5324415292/jobs/9643679858#step:9:4115
The amd64 build takes 223.3 seconds (~4 minutes): https://github.com/asreview/asreview/actions/runs/5324415292/jobs/9643679858#step:5:4111
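
To put the two timings side by side, here is a quick calculation that uses only the two durations quoted above. It assumes the two build steps run one after the other in the same job, and the variable names are only illustrative:

```python
# Build durations quoted above, in seconds.
arm64_seconds = 2040.6
amd64_seconds = 223.3

# Combined time of the two build steps (other workflow steps are not counted).
total_seconds = arm64_seconds + amd64_seconds
arm64_share = arm64_seconds / total_seconds
slowdown = arm64_seconds / amd64_seconds

print(f"both builds: {total_seconds / 60:.1f} min")    # both builds: 37.7 min
print(f"arm64 share: {arm64_share:.0%}")               # arm64 share: 90%
print(f"arm64 vs amd64: {slowdown:.1f}x slower")       # arm64 vs amd64: 9.1x slower
```

The arm64 step alone accounts for roughly nine tenths of the combined build time, which is why the solutions below focus on when arm64 is built.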

Possible solutions (ordered from simplest to implement), assuming that you want to keep arm64:

  • Don't build, push, or test Docker images unless it is a tag/release;
  • Don't build or push Docker images unless it is a tag/release;
  • Don't build or push for arm64 unless it is a tag/release, but build for amd64 on every commit.
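
The third option amounts to choosing the target platforms from the Git ref that triggered the workflow. A minimal sketch of that decision follows. The helper `platforms_for_ref` is hypothetical and not part of the repository; it assumes refs in the `refs/tags/...` and `refs/heads/...` form that GitHub Actions exposes:

```python
from typing import List


def platforms_for_ref(git_ref: str) -> List[str]:
    """Return the Docker platforms to build for a given Git ref.

    Hypothetical helper: every ref builds amd64, and only tags
    additionally build the slow arm64 image.
    """
    platforms = ["linux/amd64"]
    if git_ref.startswith("refs/tags/"):
        platforms.append("linux/arm64")
    return platforms


# A regular commit on a branch builds only amd64.
print(",".join(platforms_for_ref("refs/heads/master")))
# A tagged release builds both architectures.
print(",".join(platforms_for_ref("refs/tags/v1.2")))
```

In an actual workflow the same condition would more likely be written directly in the workflow file; the function only makes the rule explicit.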

2. asreview version on Docker is dirty

Running

docker run ghcr.io/asreview/asreview:v1.2 -V

shows version 1.2+0.gd606288.dirty instead of v1.2.
My understanding of the issue:

The Docker image builds from the source code, so the version is obtained by the Python package versioneer. I think the compile_assets call dirties the repo, which produces a dirty version. The Docker image v1.2 shows the version 1.2+0.gd606288.dirty for that reason. Probably the cloning for building the image is not cloning the tags, so it doesn't even know that this is version 1.2+??. Instead, versioneer thinks that it is a tagless package and uses 0+untagged.
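
To make the reported version string easier to read, the sketch below splits it into its parts. It assumes versioneer's default `pep440` style for tagged builds, `TAG[+DISTANCE.gHEX[.dirty]]`. The names `VersionInfo` and `parse_version` are illustrative and are not part of versioneer or asreview. Other forms, such as the `0+untagged` one, are not handled and raise `ValueError`:

```python
import re
from typing import NamedTuple, Optional


class VersionInfo(NamedTuple):
    tag: str                  # closest Git tag, e.g. "1.2"
    distance: Optional[int]   # commits since that tag, if present
    commit: Optional[str]     # short commit hash, if present
    dirty: bool               # True if the working tree had local changes


# Assumed format: TAG[+DISTANCE.gHEX[.dirty]]
_PATTERN = re.compile(
    r"^(?P<tag>[^+]+)"
    r"(?:\+(?P<distance>\d+)\.g(?P<commit>[0-9a-f]+)(?P<dirty>\.dirty)?)?$"
)


def parse_version(version: str) -> VersionInfo:
    """Split a versioneer-style version string into its components."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    distance = match["distance"]
    return VersionInfo(
        tag=match["tag"],
        distance=int(distance) if distance is not None else None,
        commit=match["commit"],
        dirty=match["dirty"] is not None,
    )


info = parse_version("1.2+0.gd606288.dirty")
print(info.tag, info.distance, info.commit, info.dirty)  # 1.2 0 d606288 True
```

Read this way, the string says the build sits exactly on tag 1.2 (distance 0) at commit d606288, and only the `.dirty` suffix separates it from a clean `1.2`.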

Originally the idea of building from source was to be able to control what is built, have a leaner build, and to split into different docker images later. Based on previous discussions, it looks like these are not essential things, so an alternative would be going back to the old way of building and using the pip version directly.
Since the automation depends on the pip version being installed, some conditionals need to be added - or, to simplify, just move part of the docker workflow into the python package deployment workflow (with limitations).


My initial suggestion to fix this is:

  • Change the Dockerfile to build from pip.
  • Delete the Docker deployment workflow.
  • Add extra steps to the Python package deployment workflow to build for amd64 and arm64.
  • Don't test the Docker build (limitation of the steps above).

One alternative would be:

  • Fix the versioneer issue and keep building from source using the existing deploy.yml.
  • Only run the docker deploy on tags.

Given the criticality of the build time, I will create a Pull Request disabling the build on every commit. This will allow time to discuss a solution that can be maintained once I leave the project.

Hotfix #1485 was updated to include a fix for the dirty tags (hopefully). If that is indeed the case, then the alternative solution proposed above was implemented, and this issue can be resolved. Of course, it depends on whether this is the desired solution or not.