zcash / zcash

Zcash - Internet Money

Home Page:https://z.cash/

Performance evaluation for Artifact Storage used in GitHub Actions

y4ssi opened this issue

The purpose of this issue is to compare the performance of uploading and downloading artifacts in a GitHub Actions workflow using GitHub's native artifact storage versus Google Cloud Storage (GCS) integrated with GitHub Actions.

The workflow snippets used for testing are provided at the end of this issue.

The GCS configuration consists of a bucket in the Standard storage class (Google Cloud Storage Classes) in the us-central1 region, with a lifecycle policy that automatically deletes objects after 1 day. Note that GCS storage is charged per hour.
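The issue does not show how the 1-day deletion policy was created. As a minimal sketch, assuming the standard GCS lifecycle configuration format (a `Delete` action with an `age` condition in days), the policy could be generated like this. The helper name `delete_after_days` is hypothetical and only used for illustration.

```python
import json


def delete_after_days(days: int) -> dict:
    """Build a GCS lifecycle configuration that deletes objects older than `days` days."""
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": days},
            }
        ]
    }


# Policy matching the description above: delete objects after 1 day.
config = delete_after_days(1)
print(json.dumps(config, indent=2))
```

The resulting JSON could be saved to a file and applied with `gsutil lifecycle set FILE gs://BUCKET_NAME`, where the file and bucket names are placeholders.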

In GitHub Actions, generated artifacts are retained for a minimum of one day before they can be removed, so storage is effectively charged per day.

Below are the respective costs (excluding free tiers):

Storage Costs:

  • GitHub Actions: $0.08 USD per GB per day
  • GCS: $0.02 USD per GB per month

Note: GitHub Actions charges per day, while GCS charges per hour.
For the calculations in this issue, 1 month is taken as 750 hours.

Retrieve Costs:

  • GCS: $0.01 USD per GB.
  • GitHub Actions: Free.

Tests Conducted

Test 1

Using the Linux dd command, create 1024 files of 1,024,000 bytes (1000 KiB) each, for a total of 1000 MiB. These files are uploaded to GitHub Actions artifact storage in one job and downloaded in another. The same procedure is followed for GCS.

Test 2

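Using the Linux dd command, create 1 file of roughly 1000 MiB (the snippet writes 1,000,000 blocks of 1024 bytes, which is 1,024,000,000 bytes, or about 977 MiB). This file is uploaded to GitHub Actions artifact storage in one job and downloaded in another. The same procedure is followed for GCS.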
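To make the test payload sizes explicit, the following sketch computes the bytes written by the dd invocations in the workflow snippets at the end of this issue. The function name `dd_total_bytes` is an illustrative assumption; the `bs` and `count` values are taken from the snippets.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def dd_total_bytes(bs: int, count: int, files: int = 1) -> int:
    """Total bytes written by `files` runs of `dd bs=<bs> count=<count>`."""
    return bs * count * files


# Test 1: 1024 files, each written with bs=1024000 count=1
multi = dd_total_bytes(bs=1_024_000, count=1, files=1024)
# Test 2: one file written with bs=1024 count=1000000
single = dd_total_bytes(bs=1024, count=1_000_000)

print(multi, multi / MIB)               # 1048576000 1000.0
print(single, round(single / MIB, 2))   # 1024000000 976.56
```

Under these parameters the two tests move slightly different amounts of data (1000 MiB versus about 977 MiB), which is worth keeping in mind when comparing their timings directly.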

Benchmark results:

image

Although GitHub Actions is slightly faster, by 3.68% (summing the averages of the workflows in both tests), GCS is much more cost-effective once costs are considered.
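The 3.68% figure appears consistent with the per-GB averages quoted later in this issue (40.9 seconds per GB for GCS and 39.45 seconds per GB for GitHub Actions). A small sketch of that check, assuming the percentage is the relative slowdown of GCS with respect to GitHub Actions:

```python
GCS_SECONDS_PER_GB = 40.9   # average quoted in this issue for GCS
GH_SECONDS_PER_GB = 39.45   # average quoted in this issue for GitHub Actions


def slowdown_percent(slower: float, faster: float) -> float:
    """Relative extra time of the slower option, as a percentage of the faster one."""
    return (slower - faster) / faster * 100


print(round(slowdown_percent(GCS_SECONDS_PER_GB, GH_SECONDS_PER_GB), 2))  # 3.68
```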

Cost example for 1 GB

Consider storing 1 GB for the minimum of 1 day in GitHub Actions versus GCS, with the artifact downloaded 3 times during a pipeline.

GCS Storage: ($0.02 USD * 1 GB / 750 hrs) * 24 hrs = $0.00064 USD per 1 GB. (The GCS price is fixed per month, where 1 month = 750 hrs.)

GitHub Actions Storage: ($0.08 USD * 1 GB / 24 hrs) * 24 hrs = $0.08 USD per 1 GB. (The GitHub price is fixed per day, where 1 day = 24 hrs.)

GCS Retrieval: $0.01 USD * 1 GB * 3 times = $0.03 USD.

GitHub Actions Retrieval: $0 USD.

Total GH: $0.08 USD
Total GCS: $0.03064 USD

As can be seen, summing up the costs of retrieval and storage, it is much more cost-effective to use Google Cloud Storage (GCS).
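The 1 GB example can be reproduced with a short calculation. This is a sketch using only the prices stated in this issue; the function and constant names are illustrative assumptions.

```python
HOURS_PER_MONTH = 750            # convention used in this issue
GCS_STORAGE_PER_GB_MONTH = 0.02  # USD
GCS_RETRIEVAL_PER_GB = 0.01      # USD
GH_STORAGE_PER_GB_DAY = 0.08     # USD, retrieval is free


def gcs_cost(gb: float, hours_stored: float, retrievals: int) -> float:
    """GCS cost: hourly pro-rated storage plus per-GB retrieval charges."""
    storage = GCS_STORAGE_PER_GB_MONTH * gb / HOURS_PER_MONTH * hours_stored
    retrieval = GCS_RETRIEVAL_PER_GB * gb * retrievals
    return storage + retrieval


def gh_cost(gb: float, days_stored: int) -> float:
    """GitHub Actions cost: storage billed per whole day, retrieval free."""
    return GH_STORAGE_PER_GB_DAY * gb * days_stored


print(round(gcs_cost(gb=1, hours_stored=24, retrievals=3), 5))  # 0.03064
print(round(gh_cost(gb=1, days_stored=1), 5))                   # 0.08
```

With these prices, GCS stays cheaper for a one-day artifact as long as it is downloaded fewer than 8 times, since each retrieval adds $0.01 USD per GB.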

Storage Cost vs Compute Cost

The following pricing table gives the per-minute cost of pipeline execution time in GitHub Actions:

image

We typically use an 8 vCPU machine for Zcash pipelines, and the generated artifacts total about 8 GB (not all of them are used, but for practical purposes we assume they are).

The following are the combined compute and storage costs of uploading and downloading artifacts with the native method versus GCS. It is assumed that the 8 GB is uploaded once and downloaded once, and the transfer times are averages taken from the benchmark table presented earlier.

GCS

In GCS, uploading and downloading 8 GB would take 327.2 seconds (40.9 seconds per GB), and the storage cost is $0.00512 USD ($0.02 USD * 8 GB / 750 hrs * 24 hrs, i.e. one day of storage).

Linux: $0.032 USD / 60 sec * 327.2 sec + $0.00512 USD = $0.1796 USD
Windows: $0.032 USD / 60 sec * 327.2 sec * 2 + $0.00512 USD = $0.3541 USD

The per-minute rate multiplier is 1 for Linux and 2 for Windows.

GH

In GH, uploading and downloading 8 GB would take 315.6 seconds (39.45 seconds per GB), and the storage cost is $0.64 USD ($0.08 USD per GB * 8 GB, i.e. one day of storage).

Linux: $0.032 USD / 60 sec * 315.6 sec + $0.64 USD = $0.8083 USD
Windows: $0.032 USD / 60 sec * 315.6 sec * 2 + $0.64 USD = $0.9766 USD

The per-minute rate multiplier is 1 for Linux and 2 for Windows.
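The four totals in this section follow the same formula: compute time at the per-minute rate (scaled by the OS multiplier) plus one day of storage. A minimal sketch, using only figures from this issue and illustrative names; like the figures above, it does not include GCS retrieval charges.

```python
RATE_PER_MINUTE = 0.032  # USD per minute for the 8 vCPU runner, as used in this issue
ARTIFACT_GB = 8


def pipeline_cost(seconds_per_gb: float, storage_cost: float, os_multiplier: int) -> float:
    """Compute cost of the transfer time plus one day of artifact storage."""
    transfer_seconds = seconds_per_gb * ARTIFACT_GB
    compute = RATE_PER_MINUTE / 60 * transfer_seconds * os_multiplier
    return compute + storage_cost


gcs_storage = 0.02 * ARTIFACT_GB / 750 * 24  # 0.00512 USD for one day
gh_storage = 0.08 * ARTIFACT_GB              # 0.64 USD for one day

for label, secs, storage in (("GCS", 40.9, gcs_storage), ("GH", 39.45, gh_storage)):
    linux = pipeline_cost(secs, storage, os_multiplier=1)
    windows = pipeline_cost(secs, storage, os_multiplier=2)
    print(f"{label}: Linux ${linux:.4f}  Windows ${windows:.4f}")
# GCS: Linux $0.1796  Windows $0.3541
# GH: Linux $0.8083  Windows $0.9766
```

The storage term dominates the GitHub Actions totals, which is why the small speed advantage of the native method does not offset its cost here.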

Conclusion

Considering all these factors, my recommendation is to use GCS, as it is more cost-effective, and the degradation in time is almost negligible.

Workflow Snippets

Multiple files

name: CI

on:
  pull_request:
  push:
    branches: master

jobs:
  generate-and-upload-to-GH-multiple:
    runs-on: ubuntu-latest

    steps:
    - name: Generate files
      run: |
        mkdir generated_files
        for i in {1..1024}; do
          dd if=/dev/urandom of=generated_files/file_${i}.txt bs=1024000 count=1
        done

    - name: Upload files to artifacts
      uses: actions/upload-artifact@v4
      with:
        name: my-artifacts
        path: generated_files

  download-artifacts-GH-multiple:
    needs: generate-and-upload-to-GH-multiple
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

    - name: Download artifacts
      uses: actions/download-artifact@v4
      with:
        name: my-artifacts

  generate-and-upload-to-GCS-multiple:
    runs-on: ubuntu-latest

    steps:
    - name: Generate files
      run: |
        mkdir generated_files
        for i in {1..1024}; do
          dd if=/dev/urandom of=generated_files/file_${i}.txt bs=1024000 count=1
        done

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v2
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Upload files to artifacts
      uses: google-github-actions/upload-cloud-storage@v2
      with:
        path: generated_files
        destination: gh-zcash/${{ github.run_id }}

  download-artifacts-GCS-multiple:
    needs: generate-and-upload-to-GCS-multiple
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

    - name: 'Set up Cloud SDK'
      uses: 'google-github-actions/setup-gcloud@v2'
      with:
        version: '>= 363.0.0'

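    - name: Download artifact
      env:
        # Pass the secret through an environment variable so the shell does not word-split or interpret it
        GCP_SA_KEY: ${{ secrets.GCP_SA_KEY }}
      run: |
        ./.github/gcs-download-artifacts.sh "$GCP_SA_KEY" "${{ github.run_id }}" generated_files ./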

Single file

name: CI

on:
  pull_request:
  push:
    branches: master

jobs:
  generate-and-upload-to-GH-single:
    runs-on: ubuntu-latest

    steps:
    - name: Generate files
      run: |
        mkdir generated_files
        # No loop here, so ${i} would be unset; use a fixed file name instead
        dd if=/dev/urandom of=generated_files/file_1.txt bs=1024 count=1000000

    - name: Upload files to artifacts
      uses: actions/upload-artifact@v4
      with:
        name: my-artifacts
        path: generated_files

  download-artifacts-GH-single:
    needs: generate-and-upload-to-GH-single
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

    - name: Download artifacts
      uses: actions/download-artifact@v4
      with:
        name: my-artifacts

  generate-and-upload-to-GCS-single:
    runs-on: ubuntu-latest

    steps:
    - name: Generate files
      run: |
        mkdir generated_files
        # No loop here, so ${i} would be unset; use a fixed file name instead
        dd if=/dev/urandom of=generated_files/file_1.txt bs=1024 count=1000000

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v2
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Upload files to artifacts
      uses: google-github-actions/upload-cloud-storage@v2
      with:
        path: generated_files
        destination: gh-zcash/${{ github.run_id }}

  download-artifacts-GCS-single:
    needs: generate-and-upload-to-GCS-single
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

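    - name: Download artifact
      env:
        # Pass the secret through an environment variable so the shell does not word-split or interpret it
        GCP_SA_KEY: ${{ secrets.GCP_SA_KEY }}
      run: |
        ./.github/gcs-download-artifacts.sh "$GCP_SA_KEY" "${{ github.run_id }}" generated_files ./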