elastic / package-registry

Elastic Package Registry (EPR)

Some GA packages no longer appear in `/search` without experimental/pre-release flag

jen-huang opened this issue

This may have been caused by changes in #785

Steps:

  1. View the package search endpoint for v7.16.2: https://epr.elastic.co/search?kibana.version=7.16.2
  2. Observe that the security_detection_engine package, which is GA, is not part of the search results
  3. Add the experimental flag (prerelease works too): https://epr.elastic.co/search?kibana.version=7.16.2&experimental=true
  4. Observe that security_detection_engine now appears
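
The steps above can be scripted to list every package that only appears when the flag is set. This is a minimal sketch: the `/search` endpoint and the `kibana.version` and `experimental` parameters come from the URLs above, while the helper names and the assumption that each entry in the JSON response carries a `name` field are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

REGISTRY = "https://epr.elastic.co"


def search_url(kibana_version, experimental=False):
    """Build a /search URL like the ones used in the steps above."""
    params = {"kibana.version": kibana_version}
    if experimental:
        params["experimental"] = "true"
    return f"{REGISTRY}/search?{urlencode(params)}"


def fetch(url):
    """Fetch and parse a search response (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


def package_names(results):
    """Collect package names, assuming each entry has a "name" field."""
    return {entry["name"] for entry in results}


def only_with_flag(default_results, flagged_results):
    """Return the names present only in the flagged search results."""
    return sorted(package_names(flagged_results) - package_names(default_results))
```

Passing `fetch(search_url("7.16.2"))` and `fetch(search_url("7.16.2", experimental=True))` to `only_with_flag` would list the packages hidden from the default search.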

This package version is GA, so it ought to be returned in the search results regardless of any release parameter.

This is expected. 0.x packages can no longer be considered GA, because we now rely on semantic versioning only (see elastic/package-spec#225). Some weeks ago we had an internal thread about consistency between release labels and package versions. I mentioned the security_detection_engine package there, but sadly I overlooked that no action was taken for this package, or I missed it.
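
To make the rule concrete, here is a minimal sketch of a semver-only GA check. The thread only states that 0.x versions are not GA; treating versions with a pre-release tag as non-GA as well is an assumption taken from general semantic versioning practice, and `is_ga` is a hypothetical helper, not the registry's implementation.

```python
import re

# major.minor.patch with optional pre-release and build metadata
_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def is_ga(version):
    """Return True if the version counts as GA under the assumed rule.

    Assumed rule: major >= 1 and no pre-release tag.
    """
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    major = int(match.group(1))
    prerelease = match.group(4)
    return major >= 1 and prerelease is None
```

Under this rule `is_ga("0.14.3")` is false and `is_ga("1.0.0")` is true, which matches the behavior described in this issue.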

This shouldn't be a problem with current versions of Fleet, because experimental=true is always used. What I seem to have missed is that the /api/fleet/epm/packages endpoint can be used to forward queries directly to the registry.

I think that there are very few packages in this situation, probably only this one. The other one I knew of was osquery_manager, for which a 1.0.0 version with support for 7.x was released.

I see that there is a 1.0.1 version, but it seems that it was only published for ^8.0.0.

I see the following possible action points:

  1. Add some logic in the package registry to somehow identify these requests and implicitly look for experimental packages.
  2. Add the experimental=true parameter by default to the queries made by /api/fleet/epm/packages, consistent with what the Fleet UI does.
  3. Release a >= 1.0.0 version of the security_detection_engine package targeting 7.x.
  4. Do nothing.
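
Option 2 amounts to a default on the forwarded query. The following is a minimal sketch of that idea in Python, not Kibana's actual code: `with_default_experimental` is a hypothetical helper, and representing the query as a plain dict is an assumption made for illustration.

```python
def with_default_experimental(query):
    """Return a copy of the query with experimental=true unless the caller set it.

    Hypothetical helper: the real Fleet endpoint is implemented in Kibana,
    and its internals are not shown in this thread.
    """
    forwarded = dict(query)  # do not mutate the caller's query
    forwarded.setdefault("experimental", "true")
    return forwarded
```

An explicit `experimental=false` from the caller is left untouched, so clients that want GA-only results can still ask for them.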

If we want to do anything, I think we should go with option 3, which should solve the problem for the only package known to be affected. @elastic/protections, would this be an option? If you want to keep maintaining a different "branch" for 7.x, I would suggest releasing 1.1.0 for 7.x and using 1.1.x for fixes there, and using 1.2.0 and above for 8.x.

Option 1 would be too error-prone: it may be impossible to distinguish queries made by Kibana for this case from queries coming from other clients, or from Kibana itself for other cases. Even if it were possible, it would mean introducing tricky logic to support a legacy use case. We could also consider reverting #785 and related changes and reconsidering alternatives for elastic/package-spec#225, but this would be quite a setback for us at this point.

Option 2 could be a nice-to-have if it makes sense, but as with option 1, it could be an unexpected effort to cover a single corner case. It also wouldn't solve the problem for existing versions of Kibana.

Option 4 can also be considered if we think we can live with this known issue, which doesn't affect most users and has an easy workaround for the users who are affected.

@jsoriano I missed the rule about 0.x packages not being considered GA; that makes sense. I do believe direct usage of the /api/fleet/epm/packages endpoint is an edge case, and this behavior change should not affect most users.

Apart from security_detection_engine (and osquery_manager, which was already fixed), can we do another audit to see if there are any other 0.x packages marked as GA? I agree that option 3 is the most consistent approach.

The security_detection_engine package is kind of unique vs the other packages. Fleet is leveraged to release rule updates out of band from new Kibana releases. As such, we may maintain multiple concurrent versions of a package. To reflect this, the versioning strategy we have adopted is: relative-to-major.stack-minor.iteration. So 0.14.3 is the 3rd package released compatible with ^7.14+ (0 = 7 since the integration went GA in 7.14, so 1.0.1 is 8.0 package 1).
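
As a small illustration of the relative-to-major.stack-minor.iteration scheme described above, the sketch below decodes a package version into the stack release it targets. The offset of 7 comes from the "0 = 7" mapping in the comment; `decode` and `DecodedVersion` are hypothetical names.

```python
from typing import NamedTuple

BASE_STACK_MAJOR = 7  # package major 0 maps to stack major 7


class DecodedVersion(NamedTuple):
    stack_major: int
    stack_minor: int
    iteration: int


def decode(package_version):
    """Decode relative-to-major.stack-minor.iteration into stack terms."""
    major, minor, iteration = (int(part) for part in package_version.split("."))
    return DecodedVersion(BASE_STACK_MAJOR + major, minor, iteration)
```

For example, `decode("0.14.3")` gives stack 7.14, iteration 3, and `decode("1.0.1")` gives stack 8.0, iteration 1.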

> This shouldn't be a problem with current versions of Fleet, because experimental=true is always used.

I verified that Kibana 7.14+ loads packages with experimental=true, so the packages are loading properly.

We also bump the lowest supported package version that we target for updates with each new Kibana release, so at the moment we are in the 0.16.x series, pushing updates to ^7.16+.

With all that said, I believe we should be good to leave things as is and continue with the same versioning process.

Any concerns with this approach?

@brokensound77 thanks for your explanation. The main concern I see is that, if we continue with this scheme, this little discrepancy is going to be here for the whole lifetime of 7.x:

  • The Registry API is only going to show this package if the experimental and/or prerelease parameters are set to true.
  • The same applies to the /api/fleet/epm/packages Fleet API.
  • elastic-package status security_detection_engine --kibana-version 7.17.0 is going to show the package as "Technical Preview".

These are all things that in principle only affect developers or advanced users, and they have an easy workaround, so it may be OK. But this has already come up in at least one support case, and it can keep coming up for the whole lifetime of 7.x. Our future selves may have forgotten this discussion if/when this issue appears again in six months or a year, and new issues will be opened and time will have to be spent re-investigating this.

If you release using the relative-to-major.stack-minor.iteration scheme, would it be a problem to change it to stack-major.stack-minor.iteration? This would mean releasing 0.16.x packages as 7.16.y, and 1.0.x packages as 8.0.y. This would solve this issue while keeping something close to the scheme you would like to follow.
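
A minimal sketch of the proposed renumbering, to show why it avoids the problem: every converted version has a major of at least 7, so it is no longer a 0.x release. `to_stack_version` is a hypothetical helper, and carrying the iteration number over unchanged is an assumption made for illustration.

```python
def to_stack_version(package_version, base_stack_major=7):
    """Convert relative-to-major.stack-minor.iteration to stack-major.stack-minor.iteration."""
    major, minor, iteration = package_version.split(".")
    return f"{int(major) + base_stack_major}.{minor}.{iteration}"


def is_zero_major(version):
    """Return True for 0.x versions, which the registry no longer treats as GA."""
    return int(version.split(".")[0]) == 0
```

For example, `to_stack_version("0.16.2")` yields `"7.16.2"`, for which `is_zero_major` is false.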

I think we can close this as in principle there is nothing we are going to do about this in the registry.

If this continues being a problem, please follow the recommendation of releasing the affected package with >= 1.0 stable versions.