Vulnerability Report Image Scan fails due to read only root filesystem
KyleMartin901 opened this issue · comments
What steps did you take and what happened:
We installed Starboard using Helm chart 0.9.1 with the following values:
targetNamespaces: ""
operator:
  scanJobsConcurrentLimit: 30
  vulnerabilityScannerScanOnlyCurrentRevisions: true
  vulnerabilityScannerReportTTL: 24h
starboard:
  scanJobTolerations:
    - operator: Exists
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XXX:role/XXX
trivy:
  severity: HIGH,CRITICAL
  githubToken: XXX
  resources:
    requests:
      cpu: 500m
      memory: 512M
    limits:
      cpu: 2000m
      memory: 4G
resources:
  requests:
    cpu: 500m
    memory: 512M
  limits:
    cpu: 2000m
    memory: 2G
We have a private ML image deployed in our Kubernetes cluster, and the vulnerability scan (Trivy) fails with the following error: failed to open: failed to create the temp file: open /tmp/fanal-291399187: read-only file system
{
"level": "error",
"ts": 1643685779.1531503,
"logger": "reconciler.vulnerabilityreport",
"msg": "Scan job container",
"job": "starboard-operator/scan-vulnerabilityreport-5b65b58f76",
"container": "ml-prediction-service",
"status.reason": "Error",
"status.message": "2022-02-01T03:22:58.840Z FATAL scan error: image scan failed: failed analysis: analyze error: failed to analyze layer: sha256:711287f45789dcf7beddd6550018938fcc3a70f33736d99163f8ec057db41c54 : walk error: failed to process the file: failed to analyze file: failed to analyze usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so: unable to open usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so: failed to open: failed to create the temp file: open /tmp/fanal-291399187: read-only file system",
"stacktrace": "github.com/aquasecurity/starboard/pkg/operator/controller.(*VulnerabilityReportReconciler).reconcileJobs.func1
/home/runner/work/starboard/starboard/pkg/operator/controller/vulnerabilityreport.go:320
sigs.k8s.io/controller-runtime/pkg/reconcile.Func.Reconcile
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/reconcile/reconcile.go:102
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"
}
I can see that Trivy v0.22.0 uses the fanal package, which is where the error comes from. The name of the created temp file starts with fanal, which makes it easy to track down.
When I looked at fanal, I found that it will try to untar the file (https://github.com/aquasecurity/fanal/blob/672696605858a4bc4f6cb20fda9208d70220c311/walker/tar.go#L133). Before it does that, it checks the file size: if the file is bigger than 200 MiB (const ThresholdSize = int64(200) << 20), it writes it to a temp file within the /tmp directory. This is the point where the scan errors out, because the scan-vulnerabilityreport container defaults to a ReadOnlyRootFilesystem.
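The behaviour described above can be sketched in Go. This is only an illustration under assumptions, not fanal's actual code: openEntry and spillDemo are hypothetical names, and the exact comparison fanal uses against ThresholdSize is not confirmed here. The sketch buffers small entries in memory and spills larger ones to a temp file, which is the step that fails when the temp directory is not writable.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// ThresholdSize mirrors the constant quoted in the issue:
// 200 << 20 bytes = 209715200 bytes = 200 MiB.
const ThresholdSize = int64(200) << 20

// openEntry is a hypothetical helper (not fanal's API). Entries no larger than
// threshold are buffered in memory; larger ones are spilled to a temp file in
// tmpDir. An empty tmpDir means the default temp directory.
func openEntry(r io.Reader, size, threshold int64, tmpDir string) (io.Reader, func(), error) {
	if size <= threshold {
		data, err := io.ReadAll(r)
		if err != nil {
			return nil, nil, err
		}
		return bytes.NewReader(data), func() {}, nil
	}
	// This call fails if tmpDir is missing or on a read-only filesystem.
	f, err := os.CreateTemp(tmpDir, "fanal-*")
	if err != nil {
		return nil, nil, fmt.Errorf("failed to create the temp file: %w", err)
	}
	cleanup := func() {
		f.Close()
		os.Remove(f.Name())
	}
	if _, err := io.Copy(f, r); err != nil {
		cleanup()
		return nil, nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		cleanup()
		return nil, nil, err
	}
	return f, cleanup, nil
}

// spillDemo uses a tiny 4-byte threshold so the temp-file branch runs
// without having to allocate 200 MiB.
func spillDemo(tmpDir string) error {
	payload := "larger than four bytes"
	r, cleanup, err := openEntry(strings.NewReader(payload), int64(len(payload)), 4, tmpDir)
	if err != nil {
		return err
	}
	defer cleanup()
	_, err = io.Copy(io.Discard, r)
	return err
}

func main() {
	fmt.Println("threshold bytes:", ThresholdSize)
	fmt.Println("writable temp dir:", spillDemo(""))
	fmt.Println("unusable temp dir:", spillDemo("/nonexistent-dir-for-sketch"))
}
```

The second spillDemo call uses a directory that does not exist as a stand-in for a read-only /tmp. The error text differs, but the failure happens at the same step, when the temp file is created.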
To reproduce the issue, you can also run Trivy in Docker and scan the TensorFlow image with a read-only root filesystem:
docker run --rm --read-only -v $HOME/Library/Caches:/root/.cache/ aquasec/trivy:0.22.0 tensorflow/tensorflow:2.7.0
2022-02-01T10:53:51.746Z WARN The root command will be removed. Please migrate to 'trivy image' command. See https://github.com/aquasecurity/trivy/discussions/1515
2022-02-01T10:54:51.717Z FATAL scan error: image scan failed: failed analysis: analyze error: failed to analyze layer: sha256:4a1c45155a198a0cac890fc6adcb7ca37973e374444f65e83729e79b0496f65b : walk error: failed to process the file: failed to analyze file: failed to analyze usr/local/lib/python3.8/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so: unable to open usr/local/lib/python3.8/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so: failed to open: failed to create the temp file: open /tmp/fanal-2995727774: read-only file system
Environment:
- Starboard version (use starboard version): 14.1
- Kubernetes version (use kubectl version): 1.20
- Kubernetes Distribution: EKS
Thank you for pointing this issue out, @KyleMartin901. In this case we could mount an emptyDir volume at the /tmp path in the scan pod to make it writable.
Thanks @danielpacak, sounds good to me. I wasn't sure whether you wanted it to be a config option, but adding an emptyDir for /tmp sounds perfect for my needs.
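As a rough illustration of the emptyDir approach discussed above, the Go sketch below prints the two pod spec pieces it would need. It is not Starboard's implementation. The structs are minimal stand-ins for the Kubernetes volumes and volumeMounts fields rather than the real k8s.io/api types, and the volume name "tmp" is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal stand-ins for the Kubernetes Volume and VolumeMount fields used
// here. A real operator would use the k8s.io/api/core/v1 types instead.
type emptyDir struct{}

type volume struct {
	Name     string    `json:"name"`
	EmptyDir *emptyDir `json:"emptyDir"`
}

type volumeMount struct {
	Name      string `json:"name"`
	MountPath string `json:"mountPath"`
}

// fragment groups both pieces for printing. In a real pod spec, volumes sits
// at the pod level and volumeMounts at the container level.
type fragment struct {
	Volumes      []volume      `json:"volumes"`
	VolumeMounts []volumeMount `json:"volumeMounts"`
}

// tmpVolumeFragment returns an emptyDir volume and a mount of it at /tmp.
// The volume name "tmp" is a hypothetical choice for this sketch.
func tmpVolumeFragment() fragment {
	return fragment{
		Volumes:      []volume{{Name: "tmp", EmptyDir: &emptyDir{}}},
		VolumeMounts: []volumeMount{{Name: "tmp", MountPath: "/tmp"}},
	}
}

func main() {
	out, err := json.MarshalIndent(tmpVolumeFragment(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With a mount like this, the root filesystem can stay read-only while /tmp is backed by a writable emptyDir volume.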