python ORM support for composite primary key tuple filtering - `WHERE (a,b) IN ( (1,2), (1, 3), (2, 4) )`
johnny-smitherson opened this issue · comments
In CQL, when the table Clustering Key is made of many columns, it's possible to make batch queries like this:
CREATE TABLE example_table (
a int, b int, c int, d int,
PRIMARY KEY ((a, b), c, d)
);
x> SELECT * FROM example_table WHERE a=1 AND b=2 AND (c,d) IN ( (1,2), (1, 3), (2, 4) );
With a default cardinality limit of 100 rows, this type of query speeds up batch processing by reducing the number of round trips by up to 100-fold.
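To make the batching concrete, here is a minimal sketch of how a client might split a long list of (c, d) keys into chunks that stay under the limit, so that each chunk becomes one IN query. The helper name `chunk_keys` and its `limit` parameter are illustrative assumptions, not part of the driver; the value 100 is simply the limit mentioned above.

```python
from typing import Iterator, List, Sequence, Tuple

Key = Tuple[int, int]


def chunk_keys(keys: Sequence[Key], limit: int = 100) -> Iterator[List[Key]]:
    """Yield consecutive slices of `keys`, each holding at most `limit` tuples.

    `limit` is an assumed per-query cap on the number of tuples in the IN list.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(keys), limit):
        yield list(keys[start:start + limit])


# 250 clustering-key tuples need 3 queries instead of 250 single-row lookups.
keys = [(c, d) for c in range(5) for d in range(50)]
chunks = list(chunk_keys(keys))
print(len(keys), "keys ->", len(chunks), "queries")
```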
This is not possible with the cassandra.cqlengine.query ORM builder, as there is no way to specify .filter(XXX__in=((1,2),(3,4))) queries on a list of tuples of the PK (or from indexes).
Is this a desirable feature to have? Or should I just keep this functionality in raw CQL?
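For the raw CQL route, a minimal sketch of a helper that builds the tuple IN clause with positional placeholders, so the values are still bound by the driver instead of being formatted into the string by hand. The name `build_tuple_in` is hypothetical. The `%s` placeholder style is the one the Python driver uses for non-prepared statements, and the column names are assumed to come from trusted code, since they are interpolated directly.

```python
from typing import Any, List, Sequence, Tuple


def build_tuple_in(columns: Sequence[str],
                   rows: Sequence[Sequence[Any]]) -> Tuple[str, List[Any]]:
    """Return a '(c, d) IN ((%s, %s), ...)' clause and its flattened parameters."""
    if not columns or not rows:
        raise ValueError("need at least one column and one row")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"row {row!r} does not match columns {list(columns)!r}")
    # One "(%s, %s)" group per row, one "%s" per column inside each group.
    row_placeholder = "(" + ", ".join("%s" for _ in columns) + ")"
    clause = "({}) IN ({})".format(
        ", ".join(columns),
        ", ".join(row_placeholder for _ in rows),
    )
    params = [value for row in rows for value in row]
    return clause, params


clause, params = build_tuple_in(["c", "d"], [(1, 2), (1, 3), (2, 4)])
query = "SELECT * FROM example_table WHERE a=%s AND b=%s AND " + clause
all_params = [1, 2] + params
# With an already connected session (assumed, not created here):
# rows = session.execute(query, all_params)
```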
Implementation ideas:
- add special pk=(1,2,3,4) and ck__in=[(1,2), (3,4)] filter functions that assume the tuples given are prefixes of the composite primary key in the correct order
- or, add special syntax with explicit column names:
  .filter(a=1, b=2, c__d__in=((1,2), (4,5)))
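To illustrate the second idea, here is a rough sketch of how a hypothetical `c__d__in` keyword could be split into column names and checked against the clustering key order. Nothing like this exists in cqlengine today; `split_tuple_lookup` and its arguments are made up for this example, and it assumes column names never contain a double underscore themselves.

```python
from typing import List, Sequence, Tuple


def split_tuple_lookup(key: str,
                       clustering_columns: Sequence[str]) -> Tuple[List[str], str]:
    """Split 'c__d__in' into (['c', 'd'], 'in') and validate the column order.

    `clustering_columns` is the table's clustering key in declaration order;
    the named columns must form a prefix of it.
    """
    *columns, operator = key.split("__")
    if operator != "in" or len(columns) < 2:
        raise ValueError(f"expected '<col>__<col>__in', got {key!r}")
    if list(clustering_columns[:len(columns)]) != columns:
        raise ValueError(
            f"{columns!r} is not a prefix of the clustering key "
            f"{list(clustering_columns)!r}"
        )
    return columns, operator


columns, operator = split_tuple_lookup("c__d__in", ["c", "d"])
```

A real implementation would also have to decide how such a keyword interacts with existing single-column lookups like `c__in`.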
SO discussion on WHERE with column tuples: https://stackoverflow.com/questions/62047786/cassandra-where-clause-as-a-tuple/62050254#62050254
I'm not sure I would put the column names in the filter arguments, but I might consider extending the where clause:
.filter(a=1, b=2, _and='(c,d) IN ( (1,2), (1, 3), (2, 4) )')
# or
.filter(a=1, b=2, _extra_where='AND (c,d) IN ( (1,2), (1, 3), (2, 4) )')
But anyhow, since this fork doesn't have any Scylla-specific modifications in cqlengine, I would recommend suggesting it to the Cassandra upstream of this driver, at:
https://datastax-oss.atlassian.net/jira/software/c/projects/PYTHON/issues
If it is agreed on and accepted there, it will eventually land here as well.
Is there an issue opened in upstream for this?
I don't think we are going to implement any new features in cqlengine in our fork.
A thought: maybe we should even mark it as unsupported/unmaintained from a certain release?
There are integration tests for it and we do run them in CI - so it's not completely forgotten.
I know there are clients that use it so declaring it unsupported without replacement might not be a good idea.