Query on partitioned table doesn't filter out partitions in a smart way
matriv opened this issue
CrateDB version
5.7.0
CrateDB setup information
No response
Problem description
A query on a partitioned table doesn't prune partitions if the WHERE clause filters on the source column of the generated partition column instead of on the partition column itself.
Steps to Reproduce
CREATE TABLE IF NOT EXISTS "doc"."weather_data_partitioned"(
"timestamp" TIMESTAMP WITHOUT TIME ZONE,
"location" TEXT,
"temperature" DOUBLE PRECISION,
"humidity" DOUBLE PRECISION,
"wind_speed" DOUBLE PRECISION,
"ts_month" TIMESTAMP GENERATED ALWAYS AS date_trunc('month', "timestamp")
) CLUSTERED INTO 1 SHARDS
PARTITIONED BY ("ts_month");
COPY weather_data_partitioned
FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/data_weather.csv.gz'
WITH (format='csv', compression='gzip', empty_string_as_null=true);
The following query doesn't filter out partitions (it hits all of them):
EXPLAIN ANALYZE SELECT
ts_month,
AVG("temperature") AS "avg_temperature",
AVG("humidity") AS "avg_humidity",
AVG("wind_speed") AS "avg_wind_speed"
FROM "doc"."weather_data_partitioned"
WHERE "timestamp" >= '2023-01-01' AND "timestamp" < '2023-03-01'
GROUP BY "ts_month"
ORDER BY "ts_month" ;
whereas:
EXPLAIN ANALYZE SELECT
ts_month,
AVG("temperature") AS "avg_temperature",
AVG("humidity") AS "avg_humidity",
AVG("wind_speed") AS "avg_wind_speed"
FROM "doc"."weather_data_partitioned"
WHERE "ts_month" >= '2023-01-01' AND "ts_month" < '2023-03-01'
GROUP BY "ts_month"
ORDER BY "ts_month" ;
filters the partitions correctly.
Actual Result
Partitions are not filtered out if the timestamp column is used in the WHERE clause.
Expected Result
Automatically recognize that timestamp is the source column of the generated expression that builds the PARTITIONED BY column ts_month, and use it to filter out the partitions.
For a simpler table:
cr> create table t_p(a int, b generated always as (a % 10)) partitioned by (b);
CREATE OK, 1 row affected (0.048 sec)
cr> insert into t_p(a) select * from generate_series(1, 20, 1);
INSERT OK, 20 rows affected (3.329 sec)
cr> refresh table t_p;
REFRESH OK, 10 rows affected (0.001 sec)
cr> explain analyze select * from t_p where a = 11;
the mechanism works correctly. This is done in GeneratedColumnExpander, so the query becomes:
((a = 11) AND (b = (11 % 10)))
which is used to filter out partitions.
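The effect of that rewrite can be modelled outside the database. The following is only a minimal sketch in plain Python, not CrateDB code: the helper names (generated_b, partitions_for_equality) are made up for illustration, and partitions are modelled as a dict keyed by the generated value b.

```python
# Toy model of the t_p table: b is generated as a % 10 and used as partition key.
def generated_b(a):
    """Mirror of the generated column expression (a % 10)."""
    return a % 10


# Rows inserted by generate_series(1, 20, 1), grouped into partitions by b.
partitions = {}
for a in range(1, 21):
    partitions.setdefault(generated_b(a), []).append(a)


def partitions_for_equality(value):
    """Partitions that can contain rows with a = value.

    The predicate a = value implies b = generated_b(value), so only the
    partition with that key has to be searched.
    """
    target = generated_b(value)
    return [key for key in partitions if key == target]


selected = partitions_for_equality(11)
print(len(partitions), selected, partitions[selected[0]])
```

With the 20 inserted rows there are 10 partitions, and a = 11 only needs the partition with b = 1, which holds the rows 1 and 11.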
For the timestamp case, though, an implicit _cast function is introduced whose signature doesn't support Feature.COMPARISON_REPLACEMENT, so the GeneratedColumnExpander exits here: https://github.com/crate/crate/blob/master/server/src/main/java/io/crate/analyze/GeneratedColumnExpander.java#L185
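As a rough mental model of why the wrapping cast matters, the check can be pictured as looking only at the outermost function of the generated expression. The sketch below is a toy in plain Python, not the actual GeneratedColumnExpander logic: the Function class, the comparison_replacement flag, the argument layout of _cast and the assumption that date_trunc carries the feature are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """Hypothetical stand-in for a function symbol in the generated expression."""
    name: str
    args: tuple
    comparison_replacement: bool = False  # models Feature.COMPARISON_REPLACEMENT


def can_expand(generated_expr):
    """Only expand when the outermost function supports comparison replacement."""
    return isinstance(generated_expr, Function) and generated_expr.comparison_replacement


# Generated column declared without a type: the expression is date_trunc itself.
untyped = Function("date_trunc", ("month", "timestamp"), comparison_replacement=True)

# Generated column declared with a type: date_trunc is wrapped in an implicit _cast.
typed = Function("_cast", (untyped, "timestamp without time zone"))

print(can_expand(untyped), can_expand(typed))
```

In this model the untyped definition can be expanded, while the typed one cannot, because the check stops at _cast and never reaches date_trunc.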
Is this a regression? If not, it's not a bug, but a potential performance improvement
Based on #14250, I tested on 5.3.2, 5.6.1 and the current nightly. It worked with a modified CREATE TABLE statement: the original declares timestamp as TIMESTAMP WITHOUT TIME ZONE but ts_month as TIMESTAMP (see below). Omitting the data type on the generated column gives the expected effect: only the matching partitions are selected. The query touches three partitions (maybe because the upper bound is the 1st of March, but no docs are fetched from that partition).
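One plausible explanation for the third partition: a range predicate on timestamp can only be turned into a safe predicate on ts_month if the strict < is relaxed to <= on the truncated bound. With an upper bound such as '2023-03-15', a row from March 10th satisfies the original predicate and has ts_month equal to the truncated bound, so a strict comparison would drop it. The sketch below assumes the expander applies this relaxation; that is an assumption about CrateDB's behaviour, and trunc_month and month_partitions are hypothetical helpers.

```python
from datetime import datetime


def trunc_month(ts):
    """Python equivalent of date_trunc('month', ts)."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def month_partitions(first, last):
    """All month-start values from first to last, both inclusive."""
    months = []
    current = first
    while current <= last:
        months.append(current)
        # Step to the first day of the next month, rolling over the year in December.
        current = current.replace(
            year=current.year + (current.month == 12),
            month=current.month % 12 + 1,
        )
    return months


lower = datetime(2023, 1, 1)  # "timestamp" >= '2023-01-01'
upper = datetime(2023, 3, 1)  # "timestamp" <  '2023-03-01'

# Derived partition predicate (assumed):
#   ts_month >= trunc_month(lower) AND ts_month <= trunc_month(upper)
touched = month_partitions(trunc_month(lower), trunc_month(upper))
print([m.strftime("%Y-%m") for m in touched])
```

Under that assumption the January, February and March partitions are all candidates, and the March partition returns no rows because the original timestamp < '2023-03-01' predicate is still applied inside it.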
Modified CREATE TABLE statement (data type omitted on the generated column; using TIMESTAMP WITHOUT TIME ZONE in the column definition of ts_month still leads to a scan of all partitions):
CREATE TABLE IF NOT EXISTS "doc"."weather_data_partitioned"(
"timestamp" TIMESTAMP WITHOUT TIME ZONE,
"location" TEXT,
"temperature" DOUBLE PRECISION,
"humidity" DOUBLE PRECISION,
"wind_speed" DOUBLE PRECISION,
"ts_month" GENERATED ALWAYS AS date_trunc('month', "timestamp")
) CLUSTERED INTO 1 SHARDS
PARTITIONED BY ("ts_month");
EXPLAIN ANALYZE of the query filtered by WHERE "timestamp" >= '2023-01-01' AND "timestamp" < '2023-03-01':
Question
Based on the outline of @matriv above, an implicit _cast kicks in for the timestamp / weather_data example. Maybe omitting the data type completely is the solution to this issue and we should mention it in the docs?
Thx for checking it further @ckurze!
@mfussenegger, yep it's not a bug, as I missed this issue with the datatype which adds the implicit cast, and I thought that the mechanism to filter out the irrelevant partitions was broken in general.
So the issue is the type definition in the table schema? o_O
Is there a difference between TIMESTAMP and TIMESTAMPTZ?
The issue (implicit cast) seems to exist for both data types (WITH or WITHOUT TIME ZONE) whenever a type, even the same one, is also declared on the generated column (which is the PARTITIONED BY column).
Workaround is to just omit the data type for the generated column that is used for partitioning, but the implicit cast should be fixed. I see it moved into the candidates for 5.8. Thank you :)