anchore / anchore-engine

A service that analyzes docker images and scans for vulnerabilities

Analysis fails when container image has uncompressed layers

tsaarni opened this issue · comments

Is this a BUG REPORT or a FEATURE REQUEST? (choose one): BUG REPORT

Version of Anchore Engine and Anchore CLI if applicable:
Anchore Engine v0.9.4
Anchore CLI v0.9.1

What happened:
Analysis fails for images that contain uncompressed layers.

That is, layers with media type application/vnd.docker.image.rootfs.diff.tar instead of application/vnd.docker.image.rootfs.diff.tar.gzip.

What did you expect to happen:
Analysis should have succeeded.

Any relevant log output from analyzer:

The analyzer raises the following exception, and the analysis fails with status analysis_failed.

[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.skopeo_wrapper/download_image()] [DEBUG] command succeeded: cmd=/bin/sh -c skopeo   copy --remove-signatures --src-tls-verify=false   docker://docker.io/tsaarni/uncompressed-layer-demo@sha256:cb7e1611d9ff0e9090a13c5bdf343bb0cde9a8e30e60c9892e47c83a4ac97634 oci:/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw:image stdout=b'Copying blob sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116\nCopying config sha256:cdce9ebeb6e8364afeac430fe7a886ca89a90a5139bc3b6f40b5dbd0cf66391c\nWriting manifest to image destination\nStoring signatures\n' stderr=b''
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/squash()] [DEBUG] Layers to process: ['sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116']
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/squash()] [DEBUG] Pass 1: generating layer file timeline
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/squash()] [DEBUG] processing layer sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116 - None
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/squash()] [DEBUG] Pass 3: closing layer tarfiles
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/delete_staging_dirs()] [DEBUG] keep_image_analysis_tmpfiles is enabled - leaving analysis tmpdir in place {'unpackdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89', 'copydir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw', 'rootfs': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/rootfs', 'outputdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/output', 'cachedir': None}
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/delete_staging_dirs()] [DEBUG] keep_image_analysis_tmpfiles is enabled - leaving analysis tmpdir in place {'unpackdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89', 'copydir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw', 'rootfs': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/rootfs', 'outputdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/output', 'cachedir': None}
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/delete_staging_dirs()] [DEBUG] keep_image_analysis_tmpfiles is enabled - leaving analysis tmpdir in place {'unpackdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89', 'copydir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw', 'rootfs': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/rootfs', 'outputdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/output', 'cachedir': None}
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.clients.localanchore_standalone/delete_staging_dirs()] [DEBUG] keep_image_analysis_tmpfiles is enabled - leaving analysis tmpdir in place {'unpackdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89', 'copydir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw', 'rootfs': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/rootfs', 'outputdir': '/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/output', 'cachedir': None}
[service:worker] 2021-05-27 17:04:26+0000 [-] Traceback (most recent call last):
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 1092, in analyze_image
[service:worker] 2021-05-27 17:04:26+0000 [-]     imageSize = unpack(staging_dirs, layers)
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 876, in unpack
[service:worker] 2021-05-27 17:04:26+0000 [-]     squashtar, imageSize = squash(unpackdir, cachedir, layers)
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 330, in squash
[service:worker] 2021-05-27 17:04:26+0000 [-]     tarfiles[l] = tarfile.open(layertar, mode="r", format=tarfile.PAX_FORMAT)
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/lib64/python3.8/tarfile.py", line 1590, in open
[service:worker] 2021-05-27 17:04:26+0000 [-]     raise ValueError("nothing to open")
[service:worker] 2021-05-27 17:04:26+0000 [-] ValueError: nothing to open
[service:worker] 2021-05-27 17:04:26+0000 [-]
[service:worker] 2021-05-27 17:04:26+0000 [-] During handling of the above exception, another exception occurred:
[service:worker] 2021-05-27 17:04:26+0000 [-]
[service:worker] 2021-05-27 17:04:26+0000 [-] Traceback (most recent call last):
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/services/analyzer/analysis.py", line 333, in process_analyzer_job
[service:worker] 2021-05-27 17:04:26+0000 [-]     image_data = perform_analyze(
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/services/analyzer/analysis.py", line 193, in perform_analyze
[service:worker] 2021-05-27 17:04:26+0000 [-]     analyzed_image_report, manifest_raw = localanchore_standalone.analyze_image(
[service:worker] 2021-05-27 17:04:26+0000 [-]   File "/usr/local/lib/python3.8/site-packages/anchore_engine/clients/localanchore_standalone.py", line 1125, in analyze_image
[service:worker] 2021-05-27 17:04:26+0000 [-]     raise AnalysisError(
[service:worker] 2021-05-27 17:04:26+0000 [-] anchore_engine.clients.localanchore_standalone.AnalysisError: failed to download, unpack, analyze, and generate image export (docker.io/tsaarni/uncompressed-layer-demo@sha256:cb7e1611d9ff0e9090a13c5bdf343bb0cde9a8e30e60c9892e47c83a4ac97634) - exception: nothing to open
[service:worker] 2021-05-27 17:04:26+0000 [-] [Thread-75] [anchore_engine.services.analyzer.analysis/process_analyzer_job()] [ERROR] problem analyzing image - exception: failed to download, unpack, analyze, and generate image export (docker.io/tsaarni/uncompressed-layer-demo@sha256:cb7e1611d9ff0e9090a13c5bdf343bb0cde9a8e30e60c9892e47c83a4ac97634) - exception: nothing to open
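
The trailing "- None" on the "processing layer" debug line suggests that the layer path had already resolved to None before tarfile.open() was called. A minimal sketch of how that produces the "nothing to open" message, using only the Python standard library (the variable name layertar comes from the traceback; assigning None to it is an assumption about the analyzer's state, not its actual code):

```python
import tarfile

# Assumption: the analyzer could not map the layer digest to a file on disk,
# so the path handed to tarfile.open() is None.
layertar = None

try:
    tarfile.open(layertar, mode="r", format=tarfile.PAX_FORMAT)
except ValueError as err:
    # tarfile.open() rejects a call that has neither a name nor a fileobj.
    message = str(err)

print(message)
```

The error is therefore about a missing path rather than about the tar format of the layer itself.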

What docker images are you using:
docker.io/tsaarni/uncompressed-layer-demo:latest

The image is alpine:latest, but it was deliberately built and uploaded so that it contains an uncompressed layer.

How to reproduce the issue:

$ docker run -e ANCHORE_CLI_USER=$ANCHORE_CLI_USER -e ANCHORE_CLI_PASS=$ANCHORE_CLI_PASS -e ANCHORE_CLI_URL=$ANCHORE_CLI_URL --network host -it anchore/engine-cli:v0.9.1 anchore-cli image add docker.io/tsaarni/uncompressed-layer-demo:latest

...

$ docker run -e ANCHORE_CLI_USER=$ANCHORE_CLI_USER -e ANCHORE_CLI_PASS=$ANCHORE_CLI_PASS -e ANCHORE_CLI_URL=$ANCHORE_CLI_URL --network host -it anchore/engine-cli:v0.9.1 anchore-cli image get docker.io/tsaarni/uncompressed-layer-demo:latest
Image Digest: sha256:cb7e1611d9ff0e9090a13c5bdf343bb0cde9a8e30e60c9892e47c83a4ac97634
Parent Digest: sha256:cb7e1611d9ff0e9090a13c5bdf343bb0cde9a8e30e60c9892e47c83a4ac97634
Analysis Status: analysis_failed
Image Type: docker
Analyzed At: None
Image ID: 6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec
Dockerfile Mode: None
Distro: None
Distro Version: None
Size: None
Architecture: None
Layer Count: None

Full Tag: docker.io/tsaarni/uncompressed-layer-demo:latest
Tag Detected At: 2021-05-27T17:04:19Z

Anything else we need to know:
The root cause is as follows:

(1) When an image is added, Anchore first runs skopeo inspect to download the manifest:

$ skopeo inspect --raw docker://tsaarni/uncompressed-layer-demo:latest | jq .
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1472,
    "digest": "sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
      "size": 5879808,
      "digest": "sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116"
    }
  ]
}

Note that the layer sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116 is uncompressed.
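
A check like this can be automated. The following is a minimal sketch, not part of Anchore, that lists the layers of a manifest whose media type indicates no compression. The function name uncompressed_layers and the constant UNCOMPRESSED_LAYER_TYPES are hypothetical; the manifest is the one shown above.

```python
import json
from typing import List

# Media types that denote a layer stored as a plain (uncompressed) tarball.
UNCOMPRESSED_LAYER_TYPES = {
    "application/vnd.docker.image.rootfs.diff.tar",
    "application/vnd.oci.image.layer.v1.tar",
}


def uncompressed_layers(manifest_json: str) -> List[str]:
    """Return the digests of layers whose media type indicates no compression."""
    manifest = json.loads(manifest_json)
    return [
        layer["digest"]
        for layer in manifest.get("layers", [])
        if layer.get("mediaType") in UNCOMPRESSED_LAYER_TYPES
    ]


manifest_json = """
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1472,
    "digest": "sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
      "size": 5879808,
      "digest": "sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116"
    }
  ]
}
"""

print(uncompressed_layers(manifest_json))
```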

(2) Next, it uses skopeo copy to download the image and store it as an OCI directory:

$ skopeo copy docker://tsaarni/uncompressed-layer-demo:latest oci:/analysis_scratch/...

(3) Skopeo automatically compresses the layer on the fly during the copy operation.

This is the new manifest in the OCI directory:

$ cat /analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/f465bb97f5825dd1aae65c816886574984edcd22e908897648fcd26d819bd9a2 | jq .
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:cdce9ebeb6e8364afeac430fe7a886ca89a90a5139bc3b6f40b5dbd0cf66391c",
    "size": 585
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f12e03b5ef6655f2e0c3cc99f5f940e9be541a3b04703a27a1baf72325412af6",
      "size": 2902398
    }
  ]
}

Note that the layer sha256:f12e03b5ef6655f2e0c3cc99f5f940e9be541a3b04703a27a1baf72325412af6 is now a compressed version of the layer in the original image and that the digest changed due to compression.

(4) The analyzer still uses the manifest from skopeo inspect, in which the layer digest refers to the uncompressed layer.

The analyzer therefore tries to open /analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116 (the digest from the original manifest), which does not exist, and the exception is raised.

It should have opened /analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/f12e03b5ef6655f2e0c3cc99f5f940e9be541a3b04703a27a1baf72325412af6 (the digest after skopeo copy compressed the layer).
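
The mismatch follows from blobs in an OCI directory being named after the SHA-256 digest of their stored bytes, so compressing a layer changes its file name. A minimal sketch with stand-in data (the byte string and the blob_path helper are hypothetical and not taken from Anchore):

```python
import gzip
import hashlib
import os


def blob_path(oci_dir: str, data: bytes) -> str:
    """Path at which an OCI layout stores a blob with exactly these bytes."""
    return os.path.join(oci_dir, "blobs", "sha256", hashlib.sha256(data).hexdigest())


# Stand-in for the uncompressed layer tarball served by the registry.
layer_tar = b"stand-in layer contents\n" * 1000
# Stand-in for what lands on disk when the layer is gzip-compressed during copy.
layer_tar_gz = gzip.compress(layer_tar)

expected_by_analyzer = blob_path("raw", layer_tar)
written_by_copy = blob_path("raw", layer_tar_gz)

print(expected_by_analyzer)
print(written_by_copy)
print(expected_by_analyzer == written_by_copy)  # False: different bytes, different digest
```

The content is unchanged after decompression, but a lookup keyed on the original digest cannot find the compressed blob.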

Here are the contents of the work directory:

$ find /analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/index.json
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/oci-layout
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/parent_manifest.json
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/manifest.json
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/f12e03b5ef6655f2e0c3cc99f5f940e9be541a3b04703a27a1baf72325412af6
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/cdce9ebeb6e8364afeac430fe7a886ca89a90a5139bc3b6f40b5dbd0cf66391c
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/raw/blobs/sha256/f465bb97f5825dd1aae65c816886574984edcd22e908897648fcd26d819bd9a2
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/rootfs
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/output
/analysis_scratch/cf296952-8c5f-4e5b-b807-35dc652edc89/docker_history.json

Here manifest.json and parent_manifest.json are the original manifests from skopeo inspect.

Note that media type application/vnd.docker.image.rootfs.diff.tar was never explicitly mentioned in the image manifest v2 spec but it is supported by runtimes and the image works just fine:

$ docker run --rm -it tsaarni/uncompressed-layer-demo:latest ash
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.13.5
PRETTY_NAME="Alpine Linux v3.13"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"

I tried one simple approach to the problem:

diff --git a/anchore_engine/clients/skopeo_wrapper.py b/anchore_engine/clients/skopeo_wrapper.py
index ba337330..7c0515d0 100644
--- a/anchore_engine/clients/skopeo_wrapper.py
+++ b/anchore_engine/clients/skopeo_wrapper.py
@@ -187,7 +187,7 @@ def download_image(
             cmd = [
                 "/bin/sh",
                 "-c",
-                "skopeo {} {} copy {} {} {} {} docker://{} oci:{}:image".format(
+                "skopeo {} {} copy --dest-oci-accept-uncompressed-layers=true {} {} {} {} docker://{} oci:{}:image".format(
                     os_override_str,
                     global_timeout_str,
                     remove_signatures_string,

This got me a bit further, but it only postponed the problem to a later phase:

$ syft -vv -o json oci-dir:/analysis_scratch/2a278dcf-3085-481c-9396-e5b2f49f2a43/raw
[0000] DEBUG Application config:
output: json
scope: Squashed
quiet: false
log:
  structured: false
  level: debug
  file: ""
dev:
  profilecpu: false
  profilemem: false
check-for-app-update: true
anchore:
  upload-enabled: false
  host: ""
  path: ""
  username: ""
  password: ""
  dockerfile: ""
  overwrite-existing-image: false

[0000]  INFO new version of syft is available: 0.16.1
[0000] DEBUG image: source=OciDirectory location=/analysis_scratch/2a278dcf-3085-481c-9396-e5b2f49f2a43/raw from-lib=stereoscope
[0000] DEBUG image metadata: digest=sha256:cdce9ebeb6e8364afeac430fe7a886ca89a90a5139bc3b6f40b5dbd0cf66391c mediaType=application/vnd.oci.image.manifest.v1+json tags=[] from-lib=stereoscope
[0000] ERROR failed to catalog input: could not fetch image '/analysis_scratch/2a278dcf-3085-481c-9396-e5b2f49f2a43/raw': could not read image: unexpected media type: application/vnd.oci.image.layer.v1.tar for layer: sha256:b2d5eeeaba3a22b9b8aa97261957974a6bd65274ebd43e1d81d0a7b8b752b116
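
The media type application/vnd.oci.image.layer.v1.tar in the error above is the OCI media type for an uncompressed layer. As an illustration only (syft and stereoscope are separate Go projects, and this is not their code), Python's tarfile can read a layer without knowing in advance whether it is gzip-compressed by opening it in "r:*" mode. The helper names build_tar and list_members are hypothetical:

```python
import gzip
import io
import tarfile
from typing import List


def build_tar(name: str, payload: bytes) -> bytes:
    """Create a small uncompressed tarball in memory holding one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_members(layer_bytes: bytes) -> List[str]:
    """List member names; 'r:*' lets tarfile detect the compression itself."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as tar:
        return tar.getnames()


plain_layer = build_tar("etc/os-release", b"ID=alpine\n")
gzip_layer = gzip.compress(plain_layer)

print(list_members(plain_layer))
print(list_members(gzip_layer))
```

Both calls list the same member, so a reader written this way would not need to branch on the layer's media type.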

Another simple approach could be to use the skopeo-generated OCI manifest during analysis instead of the one from the registry.
I guess this could be done in the anchore_engine/clients/localanchore_standalone.py:analyze_image() call without big cascading impacts elsewhere, but I wonder what else Anchore uses the original manifest for. Will there be other impacts when the skopeo inspect and skopeo copy results do not match?
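
A rough sketch of that idea, assuming the standard OCI image layout in which index.json points to the manifest blob and the manifest lists the layer blobs. The function names are hypothetical and this is not Anchore's implementation; the demo builds a throwaway layout that reuses the manifest and layer digests shown earlier.

```python
import json
import os
import tempfile
from typing import List


def blob_path(oci_dir: str, digest: str) -> str:
    """Map a digest such as 'sha256:abc...' to its file in an OCI layout."""
    algorithm, hex_value = digest.split(":", 1)
    return os.path.join(oci_dir, "blobs", algorithm, hex_value)


def oci_layer_paths(oci_dir: str) -> List[str]:
    """Return the layer blob paths listed by the manifest stored in the layout."""
    with open(os.path.join(oci_dir, "index.json")) as f:
        index = json.load(f)
    # Assumption: the layout holds exactly one image manifest.
    manifest_digest = index["manifests"][0]["digest"]
    with open(blob_path(oci_dir, manifest_digest)) as f:
        manifest = json.load(f)
    return [blob_path(oci_dir, layer["digest"]) for layer in manifest["layers"]]


# Demo on a temporary layout; only the fields the sketch reads are written.
MANIFEST = "sha256:f465bb97f5825dd1aae65c816886574984edcd22e908897648fcd26d819bd9a2"
LAYER = "sha256:f12e03b5ef6655f2e0c3cc99f5f940e9be541a3b04703a27a1baf72325412af6"

with tempfile.TemporaryDirectory() as oci_dir:
    os.makedirs(os.path.join(oci_dir, "blobs", "sha256"))
    with open(os.path.join(oci_dir, "index.json"), "w") as f:
        json.dump({"schemaVersion": 2, "manifests": [{"digest": MANIFEST}]}, f)
    with open(blob_path(oci_dir, MANIFEST), "w") as f:
        json.dump({"schemaVersion": 2, "layers": [{"digest": LAYER}]}, f)
    layer_paths = oci_layer_paths(oci_dir)
    relative_paths = [os.path.relpath(p, oci_dir) for p in layer_paths]

print(relative_paths)
```

Because the paths are derived from what is actually on disk, they stay valid whether or not the copy step re-compressed a layer.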

@tsaarni thanks for reporting this. Great issue writeup, and appreciate the significant time you have put into this already.

We'll review, but I think your latter suggestion, to use the OCI manifest, is the right one. The manifest used should reflect the state of the image on disk. We may also want to add better handling for non-gzip layers in Syft itself, but the manifest the analyzer uses to process the image should be accurate. We currently don't validate digests against the content. If that were added, it would by definition have to run against the raw downloads from the registry, with no transformations applied during copy, so the skopeo updates and/or an intermediate staging of the raw bits before OCI-format conversion would be needed to support a digest-validation stage. That concern can be deferred for now.

I'll provide other comments in the PR directly.