quay / clair

Vulnerability Static Analysis for Containers

Home Page: https://quay.github.io/clair/

Clair Indexing timeout in offline mode

bluefriday opened this issue

I ran Clair in an offline environment.

After that, when I call the index_report API (/indexer/api/v1/index_report), the indexer tries to send a request to the internet.

The indexing operation starts only after a wait of about 30 seconds (probably a timeout).

How can I change the configuration of the indexer to offline mode so that the timeout does not occur? (I tried using the airgap flag, but it didn't work.)

My Clair indexer configuration:

  config.yaml: |
    introspection_addr: :8089
    http_listen_addr: :6060
    log_level: debug-color
    indexer:
      connstring: host=svc-indexer-db port=5432 dbname=clair user=postgres password=postgres sslmode=disable
      scanlock_retry: 10
      layer_scan_concurrency: 5
      migrations: true
      airgap: true
  • Clair version/image: v4.3.2
  • Clair client name/version: 0.2.0
  • Host OS: centos7.4
  • Kernel (e.g. uname -a): 3.10.0-957.el7.x86_64
  • Kubernetes version (use kubectl version): 1.22.0
  • Network/Firewall setup: X

Please provide information (if you have it) about what request was made.

This smells like it could be the rhel package scanner, but it's hard to say without any logging.

Ok, I reproduced my test :)

Indexer log:

4:41AM INF index request start component=libindex/Libindex.Index manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108
4:41AM DBG configured search API URL api=https://search.maven.org/solrsearch/select component=java/Scanner.Configure manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108 version=3
4:41AM DBG attempting fetch of repo2cpe mapping file component=rhel/repo2cpe/UpdatingMapper.do manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108 url=https://access.redhat.com/security/data/metrics/repository-to-cpe.json version=1.1


4:41AM ERR configuration failed error="Get \"https://access.redhat.com/security/data/metrics/repository-to-cpe.json\": dial tcp 23.25.172.12:443: i/o timeout" component=internal/indexer/layerscannner/New manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108 scanner=rhel-repository-scanner
4:41AM DBG locking attempt component=libindex/Libindex.Index manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108
4:41AM DBG locking OK component=libindex/Libindex.Index manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108
4:41AM INF starting scan component=internal/indexer/controller/Controller.Index manifest=sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108

Clair indexer configuration:

apiVersion: v1
kind: ConfigMap
...
data:
  config.yaml: |
    introspection_addr: :8089
    http_listen_addr: :6060
    log_level: debug-color
    indexer:
      connstring: host=svc-indexer-db port=5432 dbname=clair user=postgres password=postgres sslmode=disable
      scanlock_retry: 10
      layer_scan_concurrency: 5
      migrations: true
      airgap: true
      scanner: {}

I found this URL in the claircore source (quay/claircore#525).
@hdonnay Could you explain how to use Repo2CPEMappingFile in an offline environment?

We are experiencing the same issue. Is there a fix for it?
Airgap mode is on; the errors are:

{
"level":"error",
"component":"internal/indexer/controller/Controller.Index",
"manifest":"sha256:b984358fc6c6880705bdf71f07f82a56e735b4d2a8dc4cb050a9ce1cda0ff82a",
"state":"CheckManifest",
"error":"failed to retrieve repositories for sha256:7b1a6ab2e44dbac178598dabe7cff59bd67233dba0b27e4fbd1f9d4b3c877a54: store:repositoriesByLayer failed to retrieve id for scanner \"rhel-repository-scanner\": no rows in result set",
"time":"2022-01-06T19:32:49Z",
"message":"error during scan"
}

{
"level":"error",
"component":"internal/indexer/layerscannner/New",
"manifest":"sha256:74d47b7cc487e2dc5ed912e04a010809365c0580d96bd5ac1564ead60f43b2ed",
"scanner":"rhel-repository-scanner",
"error":"Get \"https://www.redhat.com/security/data/metrics/repository-to-cpe.json\": dial tcp 173.223.245.199:443: i/o timeout",
"time":"2022-01-06T23:00:19Z",
"message":"configuration failed"
}

EDIT:

After some testing with airgap mode, it looks like Clair is missing some scanners after the initial setup of the database, so it errors out when it encounters a layer that requires one of them.

clairctl export/import-updaters does not seem to solve this problem. The only fix is to let Clair talk to the internet to retrieve the scanners it needs for a layer, which defeats the purpose of airgap mode.

We have tried this route and it still fails. We could not find the entire contents of the file you point to in the database, and Clair still seems to want to go out to the internet to retrieve information.

@hdonnay - any assistance on this would be greatly appreciated.

@hdonnay - Please look at this PR (quay/claircore#548).

I think it makes it possible to avoid going out to the internet in airgap mode.

The configuration for the repository indexer is here; setting the "file" key should work around the 30-second timeout.
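
As a rough sketch of that suggestion, the indexer section of config.yaml could point the RHEL repository scanner at a local copy of the mapping file. The nesting under scanner.repo and the key name repo2cpe_mapping_file are assumptions inferred from the Repo2CPEMappingFile field mentioned earlier in this thread, and they may differ between Clair versions. The path /data/repository-to-cpe.json is hypothetical. The file itself would be a copy of the repository-to-cpe.json that the log above shows the scanner trying to download, fetched on a connected machine and mounted into the indexer pod.

```yaml
indexer:
  connstring: host=svc-indexer-db port=5432 dbname=clair user=postgres password=postgres sslmode=disable
  scanlock_retry: 10
  layer_scan_concurrency: 5
  migrations: true
  airgap: true
  scanner:
    repo:
      # Assumed scanner name, taken from the "scanner=rhel-repository-scanner" log field.
      rhel-repository-scanner:
        # Assumed key name; hypothetical path to a local copy of repository-to-cpe.json.
        repo2cpe_mapping_file: /data/repository-to-cpe.json
```

If the key is picked up, the "attempting fetch of repo2cpe mapping file" debug line and the following i/o timeout should no longer appear when an index request starts.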