prometheus / pushgateway

Push acceptor for ephemeral and batch jobs.

[Feature request] Remote storage integrations (similar to Prometheus)

migueleliasweb opened this issue

Feature request

For some use cases, some of which were already highlighted in previous issues, users want to be able to rely more heavily on the pushgateway. To make that possible, things like an HA deployment or a different storage backend would be required.

I completely understand why these extra features wouldn't be 100% desired by the core team, as they wouldn't be simple to implement or maintain.

On the other hand, the pushgateway does 95% of the work, and it feels sad to reimplement basically the same functionality just to get the extra 5% of features that won't be present.

Having said that, I think it would be awesome for the community and the Prometheus project as a whole if there were some kind of storage integration that allowed an external storage system to be used without the need to rewrite the whole project.

The Prometheus project already does that and their view is very clear from the docs:

https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations

Prometheus's local storage is limited to a single node's scalability and durability. Instead of trying 
to solve clustered storage in Prometheus itself, Prometheus offers a set of interfaces that allow
integrating with remote storage systems.

Since a storage interface already seems to exist, I'm sure a gRPC-based implementation (or similar) that calls an external endpoint is doable.

This could allow HA (#319), persistent storage (#111), metrics timeout (#19), leader election (maybe?) and much more, without burdening this project with the maintenance of all this extra code.

What do you think?

I was thinking of something along the lines of:


POD --(sends metrics)--> PushGateway --(remote write)--> CustomAPI --> Persistence layer (Redis, MySQL, MongoDB, DynamoDB, anything really)
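
The flow above could be sketched roughly as follows. This is only an illustration under assumptions, written in Python for brevity, and not Pushgateway code: the `MetricStore` protocol, `InMemoryStore`, and `Gateway` are hypothetical names invented for the sketch, and the external persistence layer is stood in for by an in-memory class. Label-value escaping is ignored.

```python
from typing import Dict, Protocol, Tuple

# A grouping key identifies one pushed group,
# e.g. (("instance", "a"), ("job", "batch")).
GroupKey = Tuple[Tuple[str, str], ...]


class MetricStore(Protocol):
    """Hypothetical storage interface the gateway writes through to."""

    def put(self, key: GroupKey, metrics: Dict[str, float]) -> None: ...

    def all(self) -> Dict[GroupKey, Dict[str, float]]: ...


class InMemoryStore:
    """Stand-in for the external persistence layer (Redis, MySQL, ...)."""

    def __init__(self) -> None:
        self._groups: Dict[GroupKey, Dict[str, float]] = {}

    def put(self, key: GroupKey, metrics: Dict[str, float]) -> None:
        # Replace the whole group, as a push to one grouping key would.
        self._groups[key] = dict(metrics)

    def all(self) -> Dict[GroupKey, Dict[str, float]]:
        return {key: dict(metrics) for key, metrics in self._groups.items()}


class Gateway:
    """Hypothetical gateway front end: same push/scrape surface, pluggable store."""

    def __init__(self, store: MetricStore) -> None:
        self._store = store

    def push(self, job: str, labels: Dict[str, str], metrics: Dict[str, float]) -> None:
        # The job name plus the extra labels form the grouping key.
        key = tuple(sorted({"job": job, **labels}.items()))
        self._store.put(key, metrics)

    def scrape(self) -> str:
        # Render every stored group as simple text lines for a scraper.
        lines = []
        for key, metrics in sorted(self._store.all().items()):
            label_str = ",".join(f'{name}="{value}"' for name, value in key)
            for metric_name, value in sorted(metrics.items()):
                lines.append(f"{metric_name}{{{label_str}}} {value}")
        return "\n".join(lines) + "\n"
```

Swapping `InMemoryStore` for a class that calls an external API would leave `Gateway` untouched, which is the point of the proposal.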

This would turn the PGW into a remote-write proxy, more or less… If you actually want to push to a remote-write endpoint, why not do that directly?

Pushing to the PGW is semantically something else than remote-writing metrics. All those feature requests you have linked are more or less coming from this misunderstanding.

This would turn the PGW into a remote-write proxy, more or less…

The short answer is yes, indeed. The more comprehensive answer is a combination of all my comments here.

If you actually want to push to a remote-write endpoint, why not do that directly?

This can be explained by the same reason why someone implementing a remote storage for Prometheus wouldn't just ditch Prometheus itself and use their new API directly. It's because of the interface.

Just like Prometheus, the PushGateway has a well-known HTTP interface that many projects already integrate with (either directly or via SDKs). Its features include grouping incoming metrics, exposing metrics to Prometheus, etc.

Having a different storage integration means users can still leverage the same tools and integrations already in place and just extend their use cases for the PushGateway transparently.
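
For readers unfamiliar with that interface, here is a minimal sketch of a push using only the Python standard library. The helper names (`push_url`, `exposition_body`, `push`) are made up for illustration. The sketch assumes the documented `/metrics/job/<job>/<label>/<value>` path layout and rejects path parts containing `/` rather than implementing the gateway's encoding for such values.

```python
from typing import Dict, Optional
from urllib.parse import quote
from urllib.request import Request, urlopen


def push_url(base: str, job: str, labels: Optional[Dict[str, str]] = None) -> str:
    """Build the push URL; the job and labels in the path form the grouping key."""
    parts = ["metrics", "job", job]
    for name, value in (labels or {}).items():
        parts += [name, value]
    for part in parts:
        # Simplification: empty parts or parts containing "/" are not handled.
        if not part or "/" in part:
            raise ValueError(f"unsupported path part: {part!r}")
    return base.rstrip("/") + "/" + "/".join(quote(part, safe="") for part in parts)


def exposition_body(metrics: Dict[str, float]) -> bytes:
    """Render unlabelled samples as 'name value' lines, one per metric."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items()).encode()


def push(base: str, job: str, labels: Dict[str, str], metrics: Dict[str, float],
         method: str = "PUT") -> int:
    """Send the metrics to a running gateway and return the HTTP status code."""
    request = Request(push_url(base, job, labels),
                      data=exposition_body(metrics), method=method)
    with urlopen(request) as response:
        return response.status
```

With a gateway listening on its default port, `push("http://localhost:9091", "batch", {"instance": "a"}, {"jobs_done": 3.0})` would replace all metrics in that group. Passing `method="POST"` would replace only the metrics with the same names.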

Pushing to the PGW is semantically something else than remote-writing metrics.

You're 100% right. That's exactly the reason one would want to implement a different metrics storage rather than reimplement the whole set of features the PushGateway already provides. The PushGateway's capabilities would stay basically untouched, whilst an external storage would provide some extra features on top of what's already offered.

All those feature requests you have linked are more or less coming from this misunderstanding.

I think there's a fine line here between what the core PushGateway team is willing to support and what could be considered a valid use case for the PushGateway. These could be two very different things, but I see them both being thrown around interchangeably.

This is precisely the reason I created this issue and came up with this idea for the remote storage. It's a compromise between adding a bunch of features to the core of the PushGateway and just allowing it to be extended without bringing in a lot of extra code plus maintenance.

This is illustrated in my initial words here:

I completely understand why these extra features wouldn't be 100% desired by the core team, as they wouldn't be simple to implement or maintain.

On the other hand, the pushgateway does 95% of the work, and it feels sad to reimplement basically the same functionality just to get the extra 5% of features that won't be present.

A good example, already mentioned in other issues, is HA, which is basically the ability to run multiple PushGateways against the same storage backend (optionally with persisted storage). It is a super valid use case when users can't easily afford missing metrics due to a pod restart, or overall disruption due to the PushGateway itself having a single replica.
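
As a toy illustration of that HA idea, the sketch below has two gateway replicas sharing one backend, so a push accepted by one is visible through the other and survives a replica restart. `SharedStore` and `Replica` are hypothetical names, the backend is a plain in-process object, and real-world concerns such as concurrent writes and network failures are deliberately ignored.

```python
from typing import Dict, Tuple

GroupKey = Tuple[str, str]  # (job, instance): a simplified grouping key


class SharedStore:
    """Stand-in for an external backend shared by all replicas."""

    def __init__(self) -> None:
        self._groups: Dict[GroupKey, Dict[str, float]] = {}

    def put(self, key: GroupKey, metrics: Dict[str, float]) -> None:
        self._groups[key] = dict(metrics)

    def snapshot(self) -> Dict[GroupKey, Dict[str, float]]:
        return {key: dict(metrics) for key, metrics in self._groups.items()}


class Replica:
    """One hypothetical gateway instance; it keeps no metric state of its own."""

    def __init__(self, store: SharedStore) -> None:
        self._store = store

    def push(self, job: str, instance: str, metrics: Dict[str, float]) -> None:
        self._store.put((job, instance), metrics)

    def scrape(self) -> Dict[GroupKey, Dict[str, float]]:
        return self._store.snapshot()


store = SharedStore()
replica_a = Replica(store)
replica_b = Replica(store)

# A batch job pushes to replica A; a scrape of replica B sees the same group.
replica_a.push("nightly_backup", "db-1", {"backup_duration_seconds": 42.0})
seen_by_b = replica_b.scrape()

# Simulate replica A's pod restarting: a fresh instance attached to the same
# store still exposes the previously pushed metrics.
replica_a = Replica(store)
seen_after_restart = replica_a.scrape()
```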

To take one step back: My desire is to keep the PGW simple and lightweight and focused on its fairly narrow use case. I do not believe that it's really doing 95% of the job you are proposing and we only need to add the remaining 5%. My guess would be more 50/50, or even worse…

But I don't want to stop anyone from trying other things. And if you believe the current PGW codebase is a good start, just fork the repo and take it from there. If things work out as you expect, it will be very easy to merge your fork back into the "official" repo. If things deviate too much, no harm done, then there will be two different repos for the different use cases.

Finally, if you would like this to be discussed by more Prometheus developers than just me, there is the dev-summit. It happens monthly as a short online session, and there are roughly half-yearly all-day in-person sessions at PromCon and KubeCon EU. The next one is happening in two weeks' time, and you can propose a topic in this doc.