Performance evaluation for Artifact Storage used in GitHub Actions
y4ssi opened this issue
The purpose of this issue is to compare the performance of uploading and downloading artifacts in a GitHub Actions workflow using the native GitHub Actions artifact storage versus using Google Cloud Storage (GCS) integrated with GitHub Actions.
The workflow snippets used for testing are provided at the end of this issue.
The configuration of GCS includes a bucket set up in the Standard storage class (Google Cloud Storage Classes) in the us-central1 region. The bucket was configured with a lifecycle policy that automatically deletes objects after 1 day. Note that GCS storage charges accrue per hour.
In GitHub Actions, generated artifacts can only be removed after one day at the earliest. Hence, storage charges apply per day.
Below are the respective costs (excluding free tiers):
Storage Costs:
- GitHub Actions: $0.08 USD per GB per day
- GCS: $0.02 USD per GB per month
Note: GitHub Actions charges per day, and GCS charges per hour.
For the conversions below, consider 1 month = 750 hrs.
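To compare the two storage prices on the same basis, both can be converted to USD per GB per day. The following is a minimal sketch that uses only the prices and the 750-hour month stated above; the constant and function names are illustrative, not part of any API.

```python
# Prices as quoted in this issue (free tiers excluded).
GH_USD_PER_GB_DAY = 0.08       # GitHub Actions artifact storage
GCS_USD_PER_GB_MONTH = 0.02    # GCS Standard storage
HOURS_PER_MONTH = 750          # convention used in this issue
HOURS_PER_DAY = 24


def gcs_usd_per_gb_hour() -> float:
    """GCS storage price converted to an hourly rate."""
    return GCS_USD_PER_GB_MONTH / HOURS_PER_MONTH


def gcs_usd_per_gb_day() -> float:
    """GCS storage price for keeping 1 GB for 24 hours."""
    return gcs_usd_per_gb_hour() * HOURS_PER_DAY


if __name__ == "__main__":
    print(f"GCS: {gcs_usd_per_gb_day():.5f} USD per GB per day")
    print(f"GH:  {GH_USD_PER_GB_DAY:.5f} USD per GB per day")
    print(f"GH / GCS storage price ratio: {GH_USD_PER_GB_DAY / gcs_usd_per_gb_day():.0f}x")
```

With these inputs, storage alone comes out roughly 125 times more expensive per GB per day on GitHub Actions than on GCS. Retrieval costs are not included in this ratio.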
Retrieval Costs:
- GCS: $0.01 USD per GB.
- GitHub Actions: Free.
Tests Conducted
Test 1
Using the Linux dd command, create 1024 files of 1,024,000 bytes each (just under 1 MiB), for a total of 1000 MiB. These files are uploaded to GitHub Actions in one job and downloaded in another job. The same procedure is followed for GCS.
Test 2
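Using the Linux dd command, create 1 file of roughly 1000 MiB (the dd invocation in the snippet at the end of this issue writes 1,024,000,000 bytes, about 977 MiB). This file is uploaded to GitHub Actions in one job and downloaded in another job. The same procedure is followed for GCS.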
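The payload sizes of both tests follow from the dd parameters in the workflow snippets at the end of this issue (bs=1024000 count=1 repeated 1024 times, and bs=1024 count=1000000). The sketch below only multiplies those parameters, assuming every block is written in full; the helper name dd_bytes is hypothetical.

```python
MIB = 1024 * 1024


def dd_bytes(bs: int, count: int) -> int:
    """Bytes written by `dd bs=<bs> count=<count>` if every block is full."""
    return bs * count


# Test 1: 1024 files, each written with bs=1024000 count=1
multi_total = 1024 * dd_bytes(bs=1_024_000, count=1)

# Test 2: one file written with bs=1024 count=1000000
single_total = dd_bytes(bs=1024, count=1_000_000)

print(multi_total, multi_total / MIB)    # 1048576000 1000.0
print(single_total, single_total / MIB)  # 1024000000 976.5625
```

The two tests therefore move slightly different amounts of data (1000 MiB versus about 977 MiB), which is worth keeping in mind when comparing their timings directly.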
Benchmark results:
Although GitHub Actions is slightly faster, by 3.68% (summing the averages of the workflows in both tests), GCS is much more cost-effective once costs are considered.
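The 3.68% figure can be reproduced from the per-GB averages quoted in the "Storage Cost vs Compute Cost" section of this issue (40.9 seconds per GB for GCS, 39.45 seconds per GB for GitHub Actions). A small sketch, with an illustrative function name:

```python
GCS_SECONDS_PER_GB = 40.9    # average quoted later in this issue
GH_SECONDS_PER_GB = 39.45    # average quoted later in this issue


def slowdown_percent(slower: float, faster: float) -> float:
    """How much longer `slower` takes than `faster`, in percent."""
    return (slower / faster - 1.0) * 100.0


print(f"{slowdown_percent(GCS_SECONDS_PER_GB, GH_SECONDS_PER_GB):.2f}%")  # 3.68%
```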
Cost example for 1 GB
Storing 1 GB for the minimum of 1 day in GitHub Actions vs GCS, assuming that the artifact is retrieved 3 times during a pipeline:
GCS Storage: ($0.02 USD * 1 GB / 750 hrs) * 24 hrs = $0.00064 USD per 1 GB. (Note that the GCS price is fixed per month, where 1 month = 750 hrs.)
GitHub Actions Storage: ($0.08 USD * 1 GB / 24 hrs) * 24 hrs = $0.08 USD per 1 GB. (The GitHub price is fixed per day, where 1 day = 24 hrs.)
GCS Retrieval: $0.01 USD * 1 GB * 3 times = $0.03 USD.
GitHub Actions Retrieval: $0 USD.
Total GH: $0.08 USD
Total GCS: $0.03064 USD
As can be seen, summing up the costs of retrieval and storage, it is much more cost-effective to use Google Cloud Storage (GCS).
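The example above can be parameterized by artifact size and number of retrievals. The sketch below uses only the prices stated in this issue; the function names are illustrative, and the break-even search at the end is an extra calculation that is not part of the original comparison.

```python
GH_STORAGE_USD_PER_GB_DAY = 0.08
GCS_STORAGE_USD_PER_GB_MONTH = 0.02
GCS_RETRIEVAL_USD_PER_GB = 0.01
HOURS_PER_MONTH = 750  # convention used in this issue


def gh_cost(size_gb: float, days: int = 1) -> float:
    """GitHub Actions: storage billed per day, retrieval free."""
    return GH_STORAGE_USD_PER_GB_DAY * size_gb * days


def gcs_cost(size_gb: float, retrievals: int, hours: float = 24) -> float:
    """GCS: storage billed per hour plus a per-GB retrieval charge."""
    storage = GCS_STORAGE_USD_PER_GB_MONTH * size_gb / HOURS_PER_MONTH * hours
    retrieval = GCS_RETRIEVAL_USD_PER_GB * size_gb * retrievals
    return storage + retrieval


print(round(gh_cost(1), 5))      # 0.08
print(round(gcs_cost(1, 3), 5))  # 0.03064

# Largest number of retrievals for which GCS stays cheaper for 1 GB over 1 day.
max_retrievals = max(n for n in range(100) if gcs_cost(1, n) < gh_cost(1))
print(max_retrievals)            # 7
```

Under these prices, GCS remains cheaper as long as each stored GB is retrieved at most 7 times per day; beyond that, the per-GB retrieval charge outweighs the storage savings.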
Storage Cost vs Compute Cost
The GitHub Actions pricing table lists the per-minute cost of pipeline execution time.
We typically use an 8 vCPU machine for Zcash pipelines, and the generated artifacts are around 8 GB (not all of them are used, but for practical purposes we assume that all are).
Based on that per-minute rate, these are the costs of uploading and downloading artifacts using the native method versus using GCS. (It is assumed that the 8 GB are uploaded once and downloaded once, and the averages from the benchmark results above are used.)
GCS
In GCS, uploading and downloading 8 GB would take 327.2 seconds (40.9 seconds per GB) and the storage cost is $0.00512 USD ($0.02 USD * 8 GB / 750 hrs * 24 hrs, i.e. per day).
Linux: $0.032 USD / 60 sec * 327.2 sec + $0.00512 USD = $0.1796 USD
Windows: $0.032 USD / 60 sec * 327.2 sec * 2 + $0.00512 USD = $0.3541 USD
The per-minute multiplier is 1 for Linux and 2 for Windows.
GH
In GH, uploading and downloading 8 GB would take 315.6 seconds (39.45 seconds per GB) and the storage cost is $0.64 USD ($0.08 USD per GB * 8 GB, i.e. per day).
Linux: $0.032 USD / 60 sec * 315.6 sec + $0.64 USD = $0.8083 USD
Windows: $0.032 USD / 60 sec * 315.6 sec * 2 + $0.64 USD = $0.9766 USD
As above, the per-minute multiplier is 1 for Linux and 2 for Windows.
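The four totals above combine runner time and one day of storage. They can be reproduced with the following sketch, which takes the $0.032 USD per minute rate, the OS multipliers, and the per-GB timings directly from this issue; the function and variable names are illustrative.

```python
RUNNER_USD_PER_MINUTE = 0.032               # 8 vCPU Linux rate used in this issue
OS_MULTIPLIER = {"linux": 1, "windows": 2}  # multipliers stated in this issue


def transfer_cost(seconds_per_gb: float, size_gb: float,
                  storage_usd: float, os_name: str) -> float:
    """Runner time spent on upload + download, plus one day of storage."""
    seconds = seconds_per_gb * size_gb
    compute = RUNNER_USD_PER_MINUTE / 60 * seconds * OS_MULTIPLIER[os_name]
    return compute + storage_usd


SIZE_GB = 8
gcs_storage = 0.02 * SIZE_GB / 750 * 24  # 0.00512 USD for one day
gh_storage = 0.08 * SIZE_GB              # 0.64 USD for one day

for backend, spg, storage in (("GCS", 40.9, gcs_storage),
                              ("GH", 39.45, gh_storage)):
    for os_name in ("linux", "windows"):
        total = transfer_cost(spg, SIZE_GB, storage, os_name)
        print(f"{backend:3} {os_name:7} {total:.4f} USD")
```

Note that, like the figures above, this sketch does not add the GCS retrieval charge from the earlier cost example.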
Conclusion
Considering all these factors, my recommendation is to use GCS, as it is more cost-effective, and the degradation in time is almost negligible.
Workflow Snippets
Multiple files
name: CI
on:
  pull_request:
  push:
    branches: master
jobs:
  generate-and-upload-to-GH-multiple:
    runs-on: ubuntu-latest
    steps:
      - name: Generate files
        run: |
          mkdir generated_files
          for i in {1..1024}; do
            dd if=/dev/urandom of=generated_files/file_${i}.txt bs=1024000 count=1
          done
      - name: Upload files to artifacts
        uses: actions/upload-artifact@v4
        with:
          name: my-artifacts
          path: generated_files
  download-artifacts-GH-multiple:
    needs: generate-and-upload-to-GH-multiple
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          name: my-artifacts
  generate-and-upload-to-GCS-multiple:
    runs-on: ubuntu-latest
    steps:
      - name: Generate files
        run: |
          mkdir generated_files
          for i in {1..1024}; do
            dd if=/dev/urandom of=generated_files/file_${i}.txt bs=1024000 count=1
          done
      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - name: Upload files to artifacts
        uses: google-github-actions/upload-cloud-storage@v2
        with:
          path: generated_files
          destination: gh-zcash/${{ github.run_id }}
  download-artifacts-GCS-multiple:
    needs: generate-and-upload-to-GCS-multiple
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: 'Set up Cloud SDK'
        uses: 'google-github-actions/setup-gcloud@v2'
        with:
          version: '>= 363.0.0'
      - name: Download artifact
        run: |
          ./.github/gcs-download-artifacts.sh ${{ secrets.GCP_SA_KEY }} ${{ github.run_id }} generated_files ./
Single file
name: CI
on:
  pull_request:
  push:
    branches: master
jobs:
  generate-and-upload-to-GH-single:
    runs-on: ubuntu-latest
    steps:
      - name: Generate files
        run: |
          mkdir generated_files
          # One file of 1024 * 1000000 = 1,024,000,000 bytes
          dd if=/dev/urandom of=generated_files/file_1.txt bs=1024 count=1000000
      - name: Upload files to artifacts
        uses: actions/upload-artifact@v4
        with:
          name: my-artifacts
          path: generated_files
  download-artifacts-GH-single:
    needs: generate-and-upload-to-GH-single
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          name: my-artifacts
  generate-and-upload-to-GCS-single:
    runs-on: ubuntu-latest
    steps:
      - name: Generate files
        run: |
          mkdir generated_files
          # One file of 1024 * 1000000 = 1,024,000,000 bytes
          dd if=/dev/urandom of=generated_files/file_1.txt bs=1024 count=1000000
      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - name: Upload files to artifacts
        uses: google-github-actions/upload-cloud-storage@v2
        with:
          path: generated_files
          destination: gh-zcash/${{ github.run_id }}
  download-artifacts-GCS-single:
    needs: generate-and-upload-to-GCS-single
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download artifact
        run: |
          ./.github/gcs-download-artifacts.sh ${{ secrets.GCP_SA_KEY }} ${{ github.run_id }} generated_files ./