jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform

Home Page: https://www.jaegertracing.io/

[Bug]: All Operation search not functioning under Azure CosmosDB-for-Cassandra storage

Wraith2 opened this issue

What happened?

I have installed Jaeger as a proof of concept and, to minimize costs, am using Azure CosmosDB for Cassandra as the database backend, which I believe has been made to work based on the previously closed issues #1667 and #2467. The Jaeger parts of the tooling work successfully, but CosmosDB's Cassandra API does not appear to be complete, which means it is not possible to use the "all" operation selection to see the recent traces ordered by time.

Steps to reproduce

Set up a tracing database using Azure CosmosDB for Cassandra. Customize the install cqlsh commands for a single node and run them; they will succeed. Then set up something to send data through the collector to the backend, and everything should work normally.
Run the query UI pointing to the Azure DB instance. Select the "all" operation and click the "Find Traces" button. An error will be returned:

ORDER BY requires creating a custom index: CosmosClusteringIndex. Please create a custom index and re-issue this query

The query that causes this to occur is:

SELECT trace_id FROM service_name_index WHERE bucket IN (0,1,2,3,4,5,6,7,8,9) AND service_name = ? AND start_time > ?  AND start_time < ? ORDER BY start_time DESC LIMIT ?;

The definition of service_name_index already states that start_time is part of the key and ordered, WITH CLUSTERING ORDER BY (start_time DESC), so I think this is CosmosDB not conforming to the Cassandra API correctly. This is backed up by a question on the Microsoft Learn site, https://learn.microsoft.com/en-us/answers/questions/1181520/cassandra-api-unable-to-run-query?page=1&orderby=Helpful&comment=answer-1177286#newest-answer-comment, where the user is directed to enable a preview Cassandra feature that I cannot find.

Expected behavior

It would be good if Jaeger could be made to work around this issue, or if some Azure-specific schema change could be identified that lets it work in spite of the missing feature in CosmosDB.
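To illustrate what such a workaround might look like, here is a minimal Python sketch of sorting on the client instead of relying on ORDER BY. It is not how Jaeger works today and has not been tested against CosmosDB. It assumes the same query would be accepted per bucket once ORDER BY is dropped, and `fetch_bucket` and `latest_trace_ids` are hypothetical names standing in for whatever issues that query and merges the rows.

```python
import heapq
from itertools import islice
from typing import Callable, Iterable, List, Tuple

# One row from service_name_index, reduced to the two columns that matter here.
Row = Tuple[int, bytes]  # (start_time, trace_id)


def latest_trace_ids(
    fetch_bucket: Callable[[int], Iterable[Row]],
    buckets: Iterable[int],
    limit: int,
) -> List[bytes]:
    """Return up to `limit` trace IDs, newest first, without a server-side ORDER BY.

    `fetch_bucket` is a hypothetical callback that runs the query for a single
    bucket with the ORDER BY clause removed and yields (start_time, trace_id).
    """
    # Sort each bucket locally, newest first, since the server is assumed
    # not to guarantee any ordering once ORDER BY is removed.
    per_bucket = [
        sorted(fetch_bucket(b), key=lambda row: row[0], reverse=True)
        for b in buckets
    ]
    # Merge the already-sorted buckets into one newest-first stream.
    merged = heapq.merge(*per_bucket, key=lambda row: row[0], reverse=True)
    # Apply the LIMIT on the client.
    return [trace_id for _, trace_id in islice(merged, limit)]


if __name__ == "__main__":
    # Made-up rows purely for demonstration; real rows would come from the database.
    fake_rows = {0: [(5, b"a"), (1, b"b")], 1: [(9, b"c"), (3, b"d")]}
    print(latest_trace_ids(lambda b: fake_rows.get(b, []), range(10), 3))
```

The obvious cost is that every row in the time window is pulled from each bucket before the limit is applied, so this would only be reasonable for narrow time ranges or small deployments.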

I understand that this is not likely to be a problem in Jaeger. However, when researching whether this backend would function, all the information I could find suggested that it would work. If Azure CosmosDB for Cassandra is not a viable backend because it lacks a required feature of the real Cassandra system, then it may be useful for others to be able to find this issue in searches.

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

1.53

SDK

OpenTelemetry Dotnet package 1.7.0

Pipeline

azure appservice -> jaeger-collector -> azurecosmosdb-for-cassandra

Storage backend

azurecosmosdb-for-cassandra

Operating system

Windows

Deployment model

No response

Deployment configs

create-schema-clean.txt

Apologies for the random ping @TheovanKraay but you may be well placed to help with this.

The Cosmos DB API for Apache Cassandra does have some compatibility gaps. I would recommend running Jaeger with Azure Managed Instance for Apache Cassandra. This is an offering under Azure Cosmos DB, but it is a fully managed service for pure open-source Apache Cassandra with 100% compatibility. You should not have any problems with any of the Jaeger commands if you use this service instead.

Ok, thanks.

No action is needed here from Jaeger then. To anyone who finds this in a search: you will need to move to full Cassandra or to the Elasticsearch storage backend.