slsa-framework / slsa-github-generator

Language-agnostic SLSA provenance generation for Github Actions

[feature] Set an environment variable in docker-based generator

asraa opened this issue

commented

Is your feature request related to a problem? Please describe.

In trying to demo scorecards runs with the docker-based generator, I ran into an issue where the GITHUB_TOKEN env var must be set.

Also, for Trivy scanning, setting a volume would be helpful.

@asraa Just to confirm, are you suggesting to add these to the user-provided config file?

@tiziano88: Do you have any comments or concerns about (1) adding env variables, and (2) mounting custom volumes?

commented

@asraa Just to confirm, are you suggesting to add these to the user-provided config file?

Correct.

One thing that complicates the security picture is that the env vars will often need to be resolved in the workflow that calls the builder. For example, the config may include:

environment = ["GITHUB_TOKEN": "${GITHUB_TOKEN}"]

where ${GITHUB_TOKEN} must be resolved, but also not written in plaintext in the attestation.
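To make the "resolved, but not written in plaintext" requirement concrete, here is a minimal Python sketch. It is only an illustration, not how the generator works: the `resolve_env` helper and the idea of recording the unexpanded template in the attestation are assumptions of mine, and the token value is made up.

```python
import string


def resolve_env(config_env, environ):
    """Expand ${NAME} templates from a config against an environment mapping.

    Hypothetical helper. Returns (resolved, recorded):
    resolved holds the real values to hand to the container,
    recorded keeps the unexpanded templates for the attestation.
    """
    resolved = {}
    recorded = {}
    for name, template in config_env.items():
        # substitute() raises KeyError when a referenced variable is missing,
        # which is safer than silently passing an empty token.
        resolved[name] = string.Template(template).substitute(environ)
        recorded[name] = template
    return resolved, recorded


# Made-up token value, standing in for what the calling workflow would supply.
config_env = {"GITHUB_TOKEN": "${GITHUB_TOKEN}"}
resolved, recorded = resolve_env(config_env, {"GITHUB_TOKEN": "ghs_example"})
print(recorded)  # only the template form is printed, never the secret
```

With this split, only `recorded` would ever be serialized, so the attestation shows that a token was injected without revealing its value.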

We do something like this to resolve the ldflags in the Go build configuration dynamically in workflows, by adding an evaluated-envs input to the Go builder:

jobs:
  # Generate ldflags dynamically.
  # Optional: only needed for ldflags.
  args:
    runs-on: ubuntu-latest
    outputs:
      commit-date: ${{ steps.ldflags.outputs.commit-date }}
      commit: ${{ steps.ldflags.outputs.commit }}
      version: ${{ steps.ldflags.outputs.version }}
      tree-state: ${{ steps.ldflags.outputs.tree-state }}
    steps:
      - id: checkout
        uses: actions/checkout@ec3a7ce113134d7a93b817d10a8272cb61118579 # tag=v2.3.4
        with:
          fetch-depth: 0
      - id: ldflags
        run: |
          echo "commit-date=$(git log --date=iso8601-strict -1 --pretty=%ct)" >> "$GITHUB_OUTPUT"
          echo "commit=$GITHUB_SHA" >> "$GITHUB_OUTPUT"
          echo "version=$(git describe --tags --always --dirty | cut -c2-)" >> "$GITHUB_OUTPUT"
          echo "tree-state=$(if git diff --quiet; then echo "clean"; else echo "dirty"; fi)" >> "$GITHUB_OUTPUT"
  # Trusted builder.
  build:
    permissions:
      id-token: write # To sign the provenance.
      contents: write # To upload assets to release.
      actions: read # To read the workflow path.
    needs: args
    uses: slsa-framework/slsa-github-generator/.github/workflows/builder_go_slsa3.yml@v1.5.0
    with:
      go-version: 1.17
      # Optional: only needed if using ldflags.
      evaluated-envs: "COMMIT_DATE:${{needs.args.outputs.commit-date}}, COMMIT:${{needs.args.outputs.commit}}, VERSION:${{needs.args.outputs.version}}, TREE_STATE:${{needs.args.outputs.tree-state}}"

But here some of the inputs may be sensitive (like the token). We could still use the evaluated-envs option, but we would need to mask those values in the provenance and restrict the input to an allow-listed set of environment variables.
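A rough Python sketch of what the allow-list plus masking could look like. The function names, the `[REDACTED]` marker, and the parsing rules (comma-separated pairs, split on the first colon) are assumptions for illustration; the builder's real parsing may differ, and values containing commas are not handled here.

```python
REDACTED = "[REDACTED]"  # hypothetical marker, not an existing provenance convention


def parse_evaluated_envs(raw):
    """Parse 'KEY:value, KEY2:value2' into a dict, splitting on the first colon."""
    envs = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        name, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"expected KEY:value, got {pair!r}")
        envs[name.strip()] = value.strip()
    return envs


def envs_for_provenance(envs, allowed, sensitive):
    """Reject names outside the allow-list and mask the sensitive ones."""
    unknown = sorted(set(envs) - set(allowed))
    if unknown:
        raise ValueError(f"environment variables not allow-listed: {unknown}")
    # Names stay visible so a verifier can see what was injected;
    # only the values of sensitive variables are replaced.
    return {k: (REDACTED if k in sensitive else v) for k, v in envs.items()}
```

For example, `envs_for_provenance(parse_evaluated_envs("VERSION:1.2.3, GITHUB_TOKEN:abc"), {"VERSION", "GITHUB_TOKEN"}, {"GITHUB_TOKEN"})` would keep `VERSION` as-is and record `GITHUB_TOKEN` with the redaction marker.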

I was hoping that we could separate the env variables passed to the GH action (which would contain the token, and other stuff) from those that are passed to the Docker command itself. Those are semantically very different IMO. Or is that not possible because of how GH Actions work?

commented

env variables passed to the GH action (which would contain the token, and other stuff) from those that are passed to the Docker command itself.

Oh yes, they are! In the scorecard container in particular, it requires this GH env var to be passed to the Docker command:

docker run -e GITHUB_AUTH_TOKEN=<your access token> gcr.io/openssf/scorecard:stable --repo=<your choice of repo>

In our builders, I don't think env vars passed to the GH action would affect the build, so they wouldn't need to be passed on.
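As a sketch of keeping the two sets of variables apart, the container could be given only the names listed in the config. The `docker_run_args` and `run_container` helpers below are hypothetical. The one Docker behavior relied on is that `-e NAME` without `=value` propagates `NAME` from the calling process's environment, so the token never appears in the command line that might end up in the provenance.

```python
import os
import subprocess


def docker_run_args(image, container_env_names, image_args):
    """Build a `docker run` command that forwards only the named variables."""
    args = ["docker", "run"]
    for name in container_env_names:
        # Name only, no value: docker reads the value from its own environment.
        args += ["-e", name]
    return args + [image] + list(image_args)


def run_container(image, secrets, image_args):
    """Run the image, supplying secret values via the process environment only."""
    cmd = docker_run_args(image, sorted(secrets), image_args)
    # The secrets live in `env`, not in `cmd`, so `cmd` is safe to record.
    return subprocess.run(cmd, env={**os.environ, **secrets}, check=True)
```

Here `docker_run_args("gcr.io/openssf/scorecard:stable", ["GITHUB_AUTH_TOKEN"], ["--repo=<your choice of repo>"])` yields the scorecard command above, minus the token value.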

In the scorecard container in particular, it requires this GH env var to be passed to the Docker command

Ah I see, makes sense, thanks for the context! Though perhaps that's a sign that that particular action should be split into two: generating the scorecard results, and then uploading them (similar to what we are discussing in #1867 for arbitrary artifacts)

commented

Though perhaps that's a sign that that particular action should be split into two: generating the scorecard results, and then uploading them (similar to what we are discussing in #1867 for arbitrary artifacts)

Actually, the GH env var is required for generating the scorecard result, since the scorecard API relies on using a GH token to avoid rate limiting when querying for data. That docker run doesn't upload anything :)

I'll still need to think about how to redact sensitive environment variables here, though.