quay / clair

Vulnerability Static Analysis for Containers

Home Page: https://quay.github.io/clair/

regression in v4.5.0: rpm: error parsing ndb db: ndb: package: slot 212: unexpected error: slot: nonsense block count (0)

majewsky opened this issue

After upgrading from 4.4.4 to 4.5.0, I started getting a bunch of indexing errors on newly pushed customer images, for example:

{"err": "failed to scan all layer contents: rpm: error parsing ndb db: ndb: package: slot 233: unexpected error: slot: nonsense block count (0)", "state": "IndexError", "success": false, "packages": {}, "repository": {}, "environments": {}, "distributions": {}, "manifest_hash": "sha256:9154877e0d56e5e28880d21583c6278a450a65551dcc75f9bafa4872197b5012"}

All other indexing errors are identical except that sometimes the number in "slot 233" is different.
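For anyone triaging a batch of failed index reports, the pattern described above can be checked mechanically. The following is a minimal sketch, not part of Clair or Keppel: the `indexReport` type and the `ndbSlot` helper are hypothetical names, and the regular expression assumes the error text has exactly the shape quoted in this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// indexReport holds only the report fields needed for triage (hypothetical type).
type indexReport struct {
	Err          string `json:"err"`
	Success      bool   `json:"success"`
	ManifestHash string `json:"manifest_hash"`
}

// ndbSlotErr matches the error text quoted in this issue and captures the slot number.
var ndbSlotErr = regexp.MustCompile(`ndb: package: slot (\d+): unexpected error: slot: nonsense block count \(0\)`)

// ndbSlot reports the slot number if raw is a failed index report
// carrying the ndb "nonsense block count" error.
func ndbSlot(raw []byte) (slot int, ok bool, err error) {
	var r indexReport
	if err := json.Unmarshal(raw, &r); err != nil {
		return 0, false, err
	}
	if r.Success {
		return 0, false, nil
	}
	m := ndbSlotErr.FindStringSubmatch(r.Err)
	if m == nil {
		return 0, false, nil
	}
	slot, err = strconv.Atoi(m[1])
	if err != nil {
		return 0, false, err
	}
	return slot, true, nil
}

func main() {
	// The report quoted above, reduced to the fields this sketch reads.
	raw := []byte(`{"err": "failed to scan all layer contents: rpm: error parsing ndb db: ndb: package: slot 233: unexpected error: slot: nonsense block count (0)", "state": "IndexError", "success": false, "manifest_hash": "sha256:9154877e0d56e5e28880d21583c6278a450a65551dcc75f9bafa4872197b5012"}`)
	slot, ok, err := ndbSlot(raw)
	if err != nil {
		fmt.Println("invalid report:", err)
		return
	}
	fmt.Println("affected by this regression:", ok, "slot:", slot)
}
```

Reports for which `ndbSlot` returns `ok == true` are candidates for reindexing once a fixed build is available.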

I downgraded to 4.4.4, deleted one of the broken index reports, and reindexed the image. That time, it came out clean, so this is definitely a regression introduced in 4.5.0. I cannot provide the image to you since it's a customer image, but I can answer questions about it as necessary. As far as I can see from the image config blob, it uses a SLES base image. There are various layers where zypper is used to install packages, and these layers sometimes end with && zypper clean -a, but the RPM database should always be left intact from what I can tell.

Environment

  • Clair version/image: 4.5.0 built from Dockerfile in release tarball
  • Clair client name/version: Keppel latest
  • Host OS: Flatcar Linux 3227.2.1
  • Kernel (e.g. uname -a): Linux 5.15.58-flatcar #1 SMP Wed Aug 3 19:35:04 -00 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Kubernetes version (use kubectl version): 1.23.6
  • Network/Firewall setup: irrelevant

This is because the newest version handles ndb databases, whereas the previous version does not.

My intuition is that the RPM database isn't being repacked to be contiguous, for some reason. I thought that was standard behavior, but my spelunking through the C could be wrong. I've got a patch that should fix this, but I don't have a database to test against.

This is because the newest version handles ndb databases, whereas the previous version does not.

Good to know that.

I've got a patch that should fix this, but I don't have a database to test against.

If you provide us with a branch or Dockerfile, we may have time to test this, but I don't want to promise anything.
If not, what would be the steps for us to test this? Clone the clair repo, bump claircore to the feature branch, and build an image from the Dockerfile?

Thanks @SuperSandro2000, we have identified a problematic Packages.DB and it is included as a regression test in the PR. When the PR is merged we can tag claircore and update Clair's dep. From there you can test on main or wait for a release.

We tend to deploy the latest code to the service and let it sit before tagging a Clair release, in the hope of assessing stability and addressing errors. In this case, v4.5.0 brought a lot of bug fixes, and the resulting significant drop in the overall error rate masked this error.

The fix should be available in the next nightly.