sql: support correlated subqueries
maddyblue opened this issue · comments
Fails with:
logic_test.go:420: ../../sqllogictest/test/select1.test:109: expected success, but found pq: qualified name "x.b" not found
I believe this is because the subquery references both x
and t1
(where t1
is from the parent FROM
clause), but only x
is in the visible tables list.
The description above was conflating two bugs into one. This is a real bug, but it isn't caused by the linked query.
Yeah, this is what I meant the other day when I stated that only basic subqueries work. Subqueries can utilize variables from their surrounding context, but we don't handle that yet.
For reference, the query:
query IIIII nosort
SELECT a+b*2+c*3+d*4+e*5,
CASE WHEN a<b-3 THEN 111 WHEN a<=b THEN 222
WHEN a<b+3 THEN 333 ELSE 444 END,
abs(b-c),
(a+b+c+d+e)/5,
a+b*2+c*3
FROM t1
WHERE (e>c OR e<d)
AND d>e
AND EXISTS(SELECT 1 FROM t1 AS x WHERE x.b<t1.b)
ORDER BY 4,2,1,3,5
Subqueries can utilize variables from their surrounding context, but we don't handle that yet.
I don't think that's what's happening here.
Closing in favour of #3291.
You're correct that this is not a correlated subquery, but we need to support those as well.
Yikes! Correlated subqueries look scary (I had no idea such a thing was possible) and seem impossible to support efficiently except in limited cases (when they can be transformed into joins, which is what amazon redshift seems to do). I'm entirely comfortable with not supporting them.
Apparently correlated subqueries can always be transformed into joins. At least, I recall reading literature that stated that. The point of using a correlated subquery instead of a join is that the query can sometimes be written more naturally that way. I don't think we should support correlated subqueries before we support joins and then we should only support them by transforming them into joins.
@knz is this done?
Not yet, wanna take it?
I was just making sure this wasn't done with the other JOIN work and that the issue should remain open.
This feature is required by the ActiveRecord query issued in #12783.
In case it's useful, here is a simple query that requires this feature that Postgres supports and we don't. In Postgres:
jordan=# SELECT nspname, (SELECT n.nspname) FROM pg_namespace n LIMIT 1;
nspname | nspname
----------+----------
pg_toast | pg_toast
(1 row)
In CockroachDB:
root@:26257> SELECT nspname, (SELECT n.nspname) FROM pg_namespace n LIMIT 1;
pq: source name "n" not found in FROM clause
The example I posted seems to be a limited subcase of generalized correlated queries. Since ActiveRecord only needs that limited subcase, I've opened #12993 to track this particular subcase.
I take it back. The query in #12783 does actually need full correlated subquery support.
@jordanlewis is this correctly labeled as 1.0 milestone? Please remove if not.
Definitely not going to happen for 1.0.
This is also required for FlywayDB.
Any chance to tame that beast any time soon @jordanlewis @knz ?
@tlvenn Correlated subqueries are tentatively planned for the release next October.
Thanks for the update @petermattis
Any changes on this issue?
Referencing parent table alias in the subquery still not working.
SELECT
"t"."TownId",
"t"."Name"
FROM "Towns" AS "t"
WHERE EXISTS(
SELECT 1
FROM "Zones" AS "z"
WHERE ("z"."Name" = "Hello") AND ("t"."TownId" = "z"."TownId")
)
ORDER BY "t"."TownId";
t
in "t"."TownId" = "z"."TownId"
generates error as below.
no data source matches prefix: t
@rkdrnfds The work is ongoing. There won't be progress before 2.1 however.
Correlated subquery support is now part of 2.1, so closing this issue down.
I'm running v2.1.0-beta.20180904, hoping to get Symfony+Doctrine ORM working with Cockroachdb. The query below, however, returns this error message:
SQLSTATE[42P01]: Undefined table: 7 ERROR: no data source matches prefix: a HINT: some correlated subqueries are not supported yet - see https://github.com/cockroachdb/cockroachissues/3288
Can you provide guidance? I'd assume that either the correlated subquery stuff somehow didn't make it into v2.1.0-beta.20180904, or that the error message is also used for a different issue.
An exception occurred while executing 'SELECT
a.attnum,
quote_ident(a.attname) AS field,
t.typname AS type,
format_type(a.atttypid, a.atttypmod) AS complete_type, (SELECT tc.col
lcollate FROM pg_catalog.pg_collation tc WHERE tc.oid = a.attcollation) AS collation,
(SELECT t1.typname FROM pg_catalog.pg_type t1 WHERE t1.oid = t.typbas
etype) AS domain_type,
(SELECT format_type(t2.typbasetype, t2.typtypmod) FROM
pg_catalog.pg_type t2 WHERE t2.typtype = 'd' AND t2.oid = a.atttypi
d) AS domain_complete_type,
a.attnotnull AS isnotnull,
(SELECT 't'
FROM pg_index
WHERE c.oid = pg_index.indrelid
AND pg_index.indkey[0] = a.attnum
AND pg_index.indisprimary = 't'
) AS pri,
(SELECT pg_get_expr(adbin, adrelid)
FROM pg_attrdef
WHERE c.oid = pg_attrdef.adrelid
AND pg_attrdef.adnum=a.attnum
) AS default,
(SELECT pg_description.description
FROM pg_description WHERE pg_description.objoid = c.oid AND a.att
num = pg_description.objsubid
) AS comment
FROM pg_attribute a, pg_class c, pg_type t, pg_namespace n
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema', 'pg_toast
') AND c.relname = 'backward_dependencies' AND n.nspname = 'crdb_internal'
AND a.attnum > 0
AND a.attrelid = c.oid
AND a.atttypid = t.oid
AND n.oid = c.relnamespace
ORDER BY a.attnum':
/cc @andy-kimball @knz
sqlfmt formatted for the interested
I think the issue should stay open until all correlated subqueries are supported, including the apply operator for those that cannot be decorrelated.
@andy-kimball Lauren and I are planning to make a writeup in the docs of the limits of decorrelation currently. Perhaps we could link to that from here.
The posted query has two issues that prevent decorrelation:
- Use of a subscript expression
[0]
; this is not currently supported by the cost-based optimizer and so triggers fallback to the heuristic planner. - Absence of unique index assertions on the
pg_catalog
tables, so the optimizer cannot statically prove that correlated subqueries return <= 1 result.
Thanks for clarifying.
Typically, I'd be more than happy to simply rewrite a query or two. In this case though, this query is generated by Doctrine when it tries to see what the current schema looks like when creating migration scripts (./doctrine migrations:diff)
As such, there's not an easy work-around where I can just rewrite one or two of my own queries. For the time being I'll just use mysql instead, but figured this bit of context might help in determining the appropriate priority for support of the posted query..
@jordanlewis can we close this now?
It seems this doesn't work for DELETE queries yet. When I run this, I get the expected result:
SELECT * FROM list WHERE NOT EXISTS(SELECT 1 FROM list_file WHERE list_id = list.id);
id | public_id | title | views | date_created | user_id | user_ip
+----+-----------+-------+-------+----------------------------------+---------+-----------------+
44 | feV9vhnW | bob | 0 | 2019-04-07 09:34:53.223956+00:00 | 1 | 127.0.0.1:53948
(1 row)
But when I replace SELECT with DELETE it breaks:
DELETE FROM list WHERE NOT EXISTS(SELECT 1 FROM list_file WHERE list_id = list.id);
pq: no data source matches prefix: list
I'm on cockroach 2.1.6
@Fornax96 I expect this to work on our upcoming 19.1 release. You can try a beta here (https://www.cockroachlabs.com/docs/releases/v19.1.0-rc.2.html).
Awesome, I'll try it out.
Upgraded to 19.1, it works now!
I'm closing this issue out, since we've now released 19.1 with support for correlated subqueries both in read-only and mutation statements.
Hi I'm not sure if this is related to this issue, but this is where I ended up on a search of my problem. I'm trying stuff out in cockroachdb, learning the ropes still, and I'm trying to use the RETURNING
feature of DELETE
to then modify a row on another table, as follows:
UPDATE u SET c = true WHERE id = (DELETE FROM uu WHERE code = 'some code' RETURNING id);
But I am getting an error:
invalid syntax: statement ignored: at or near "from": syntax error
DETAIL: source SQL:
UPDATE u SET c = true WHERE id = (DELETE FROM uu WHERE code = 'some code' RETURNING id)
^
HINT: try \h UPDATE
If I run these separately they work fine (supplying the data as needed).
I also tried:
WITH cu AS (DELETE FROM uu WHERE code = 'some code' RETURNING id) UPDATE u SET c = true WHERE id = cu.id
but I get:
pq: no data source matches prefix: cu
I know there are workarounds, but these introduce (worse) race conditions that I would like to avoid if possible. Am I doing something wrong, or is what I am trying to do not yet implemented? Thanks.
build: CCL v19.2.6 @ 2020/04/06 18:05:31 (go1.12.12)
Edit:
Argh, I see this might be more closely related to: #43963 (which I only saw after posting the above msg)
@insaner this is a legitimate need but it's a new issue. Please file a new issue and copy-paste the text of your submission.
Perfect, thanks for looking into this. I've filed a new issue.
In case someone else comes upon this issue, this does work in CRDB and PG, it's just that the syntax wasn't quite right. It should be like this instead:
WITH cu AS (DELETE FROM uu WHERE code = 'some code' RETURNING id)
UPDATE u SET c = true FROM cu WHERE id = cu.id