kedacore / external-scaler-azure-cosmos-db

KEDA External Scaler for Azure Cosmos DB

[Feature Request] Scaling on custom query

ankitbko opened this issue · comments

Are there any plans to support scaling based on a custom query on Cosmos DB, as the MSSQL and MongoDB scalers do?

You can do that as part of KEDA core - http://keda.sh/

@tomkerkhove I will need to write an external scaler similar to this one, right?

No, you can just use KEDA which supports both systems.

Does it support scaling based on a Cosmos DB query? I don't see a scaler for that. Apologies if I am not understanding this correctly; I am new to KEDA.

Oh I misunderstood your original question - Sorry!

I thought you wanted to read a custom query from MSSQL & MongoDB.

@ankitbko - The Cosmos DB scaler will work by monitoring the change feed that is being (or will be) processed by the target application. From the Cosmos DB documentation, it does not look like we can have filtered change feeds. The estimator documented here does not support filtering through a query either.

It should not be possible to read the content of these change feeds in the scaler, apply the custom query, and mark the external event as 'active' only if there are non-zero results after filtering. That would require taking over the leases on these change feeds, which would conflict with the leases taken by the target application and may result in missed events.

Let me know if you have any suggestions for enabling custom queries using existing support in Cosmos DB.

What would be a good improvement, though, would be to call it the "Azure Cosmos DB Changefeed scaler" to emphasize this @JatinSanghvi

The original request is for the data plane for which I can see value as well, but that would rather be a different scenario/scaler

@tomkerkhove, I am unable to understand. I was talking about the data plane actually. Change feeds are the only way new changes to the Cosmos DB container can be processed, and we expect all target applications to use change feeds for change processing.

The number of change feeds cannot be scaled on demand; it is the same as the count of underlying physical partitions of the Cosmos DB container, which in turn depends on the storage size of the container (and also on the provisioned throughput).

If there are multiple instances of processor apps running (maybe because they were scaled out by KEDA), each of these instances will acquire a lease on one or more of these feeds and start consuming the changes. This caps the number of useful app instances at the count of physical partitions in the Cosmos DB container. The leases ensure that two different apps don't process the same data. But this also limits what the scaler can possibly do. For example, it can estimate the size of the data pending processing, but it cannot steal leases from the scaled-target apps to read data, say to apply a filter through a custom query.
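To illustrate the limit described above, here is a minimal sketch of how an estimated backlog could translate into a replica count that never exceeds the number of physical partitions. The function name and parameters (`pending_changes`, `changes_per_replica`, `physical_partitions`) are hypothetical and chosen only for illustration; this is not the scaler's actual code.

```python
import math


def max_useful_replicas(pending_changes: int,
                        changes_per_replica: int,
                        physical_partitions: int) -> int:
    """Return a replica count capped at the number of physical partitions.

    Assumption: each replica can hold at least one lease, so replicas
    beyond the partition count would sit idle without a lease.
    """
    if changes_per_replica <= 0:
        raise ValueError("changes_per_replica must be positive")
    if pending_changes <= 0:
        # Nothing pending, so no replicas are needed.
        return 0
    wanted = math.ceil(pending_changes / changes_per_replica)
    # Extra replicas past the partition count cannot acquire a lease.
    return min(wanted, physical_partitions)
```

With 4 physical partitions, a backlog that would call for 10 replicas is still capped at 4, because only 4 leases exist.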

Let me know if I misunderstood your comment.

Changefeed is indeed for data processing of changes, but there is more than that.

If somebody would want to scale based on the # of docs returned from a query that is a valid scenario as well (which is the request here) but not the focus of this scaler.

Hence why I suggested making it explicit that it's an Azure Cosmos DB Changefeed Scaler.

Yes, the ask here was to run a query on Cosmos DB and scale based on the returned result set, similar to the image below. Will this feature be added to this scaler (as it's named the Cosmos DB scaler), or should another one be created to handle that scenario?
[image]

It should; the question is whether it will be this scaler or another one.

That's why the name of this one should be very explicit about it @JatinSanghvi @pragnagopa

The ask here is specifically to scale the Cosmos DB container listeners based on the result of running a query. As I explained in earlier comments, Cosmos DB does not support this, as it would require the scaler to steal leases from the listener app. If this were possible, @ankitbko's ask could be addressed by the current scaler itself.

That is not correct; the ask above is fully unrelated to the change feed and leases. The ask is to scale based on the result of a query and a target value.

For example, what is the count of documents where the field status is Unprocessed? The rest is up to the application.
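As a rough sketch of that scenario, the snippet below pairs a Cosmos DB SQL count query with a target value to decide whether the workload is active and how many replicas it would want. The container alias `c`, the `status` field, the `run_count_query` callable, and `target_per_replica` are all assumptions made for illustration; how the query is actually executed (SDK, authentication) is deliberately left out, and no such scaler exists in this repository.

```python
import math
from typing import Callable, Tuple

# Hypothetical query: counts documents whose `status` field is "Unprocessed".
UNPROCESSED_COUNT_QUERY = (
    "SELECT VALUE COUNT(1) FROM c WHERE c.status = 'Unprocessed'"
)


def scale_decision(run_count_query: Callable[[str], int],
                   target_per_replica: int) -> Tuple[bool, int]:
    """Return (is_active, wanted_replicas) from a query result and a target.

    `run_count_query` is a caller-supplied function that executes the query
    against Cosmos DB and returns the count as an integer.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    count = run_count_query(UNPROCESSED_COUNT_QUERY)
    is_active = count > 0
    # Divide the metric by the target value and round up.
    replicas = math.ceil(count / target_per_replica) if is_active else 0
    return is_active, replicas
```

Passing the query runner in as a callable keeps the scaling arithmetic separate from the data access, so the decision logic can be exercised with a stub such as `lambda query: 25`.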

So the bottom line here is, this scaler is only for changefeed and should be named accordingly

cc @lpapudippu @brettsam @ahmelsayed

In general I agree with the feature request.

@tomkerkhove - regarding

So the bottom line here is, this scaler is only for changefeed and should be named accordingly

If the team agrees to support this feature, we should extend the current scaler implementation to support scaling based on a query. Let's not rename anything yet!

That's possible for sure; the question is what the user experience would be then, but I'm happy to wait as long as we don't have to "break" things.

I am willing to contribute to this functionality, and some members of my team may be interested as well. If the team agrees to support the feature, let me know what the plan is.