cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.

Home Page:https://www.cockroachlabs.com

sql: support correlated subqueries

maddyblue opened this issue · comments

https://github.com/cockroachdb/sqllogictest/blob/a88396b84bb1fe62edf2e072e585d68c0b2ecdca/test/select1.test#L109

Fails with:

logic_test.go:420: ../../sqllogictest/test/select1.test:109: expected success, but found pq: qualified name "x.b" not found

I believe this is because the subquery references both x and t1 (where t1 is from the parent FROM clause), but only x is in the visible tables list.


The description above was conflating two bugs into one. This is a real bug, but it isn't caused by the linked query.

Yeah, this is what I meant the other day when I stated that only basic subqueries work. Subqueries can utilize variables from their surrounding context, but we don't handle that yet.

For reference, the query:

query IIIII nosort
SELECT a+b*2+c*3+d*4+e*5,
       CASE WHEN a<b-3 THEN 111 WHEN a<=b THEN 222
        WHEN a<b+3 THEN 333 ELSE 444 END,
       abs(b-c),
       (a+b+c+d+e)/5,
       a+b*2+c*3
  FROM t1
 WHERE (e>c OR e<d)
   AND d>e
   AND EXISTS(SELECT 1 FROM t1 AS x WHERE x.b<t1.b)
 ORDER BY 4,2,1,3,5

"Subqueries can utilize variables from their surrounding context, but we don't handle that yet."

I don't think that's what's happening here.

Closing in favour of #3291.

You're correct that this is not a correlated subquery, but we need to support those as well.

Yikes! Correlated subqueries look scary (I had no idea such a thing was possible) and seem impossible to support efficiently except in limited cases (when they can be transformed into joins, which is what Amazon Redshift seems to do). I'm entirely comfortable with not supporting them.

Apparently correlated subqueries can always be transformed into joins. At least, I recall reading literature that stated that. The point of using a correlated subquery instead of a join is that the query can sometimes be written more naturally that way. I don't think we should support correlated subqueries before we support joins, and even then we should only support them by transforming them into joins.
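As a rough illustration of the subquery-to-join rewrite mentioned above, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module purely as a stand-in (not CockroachDB), and the customers/orders tables and their data are hypothetical. It also shows why the rewrite needs care: a plain inner join can return duplicate rows where EXISTS does not.

```python
import sqlite3

# SQLite is only a convenient stand-in here; the schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 3);
""")

# Correlated form: the subquery refers to c.id from the outer query.
correlated = conn.execute("""
    SELECT c.id, c.name
    FROM customers AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

# Join form: de-duplicate the inner side first so the join behaves like a semi-join.
joined = conn.execute("""
    SELECT c.id, c.name
    FROM customers AS c
    JOIN (SELECT DISTINCT customer_id FROM orders) AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Naive join without de-duplication: Ann appears once per matching order.
naive = conn.execute("""
    SELECT c.id, c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(correlated)
print(joined)
print(naive)
```

The first two queries return the same rows, while the naive join returns one extra row for the customer with two orders.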

@knz is this done?

Not yet, wanna take it?

I was just making sure this wasn't done with the other JOIN work and that the issue should remain open.

This feature is required by the ActiveRecord query issued in #12783.

In case it's useful, here is a simple query that requires this feature that Postgres supports and we don't. In Postgres:

jordan=# SELECT nspname, (SELECT n.nspname) FROM pg_namespace n LIMIT 1;
 nspname  | nspname
----------+----------
 pg_toast | pg_toast
(1 row)

In CockroachDB:

root@:26257> SELECT nspname, (SELECT n.nspname) FROM pg_namespace n LIMIT 1;
pq: source name "n" not found in FROM clause

The example I posted seems to be a limited subcase of generalized correlated queries. Since ActiveRecord only needs that limited subcase, I've opened #12993 to track this particular subcase.

I take it back. The query in #12783 does actually need full correlated subquery support.

@jordanlewis is this correctly labeled as 1.0 milestone? Please remove if not.

Definitely not going to happen for 1.0.

This is also required for FlywayDB.

Any chance to tame that beast any time soon @jordanlewis @knz ?

@tlvenn Correlated subqueries are tentatively planned for the release next October.

Thanks for the update @petermattis

Any changes on this issue?
Referencing the parent table alias in the subquery is still not working.

SELECT
  "t"."TownId",
  "t"."Name"
FROM "Towns" AS "t"
WHERE EXISTS(
    SELECT 1
    FROM "Zones" AS "z"
    WHERE ("z"."Name" = "Hello") AND ("t"."TownId" = "z"."TownId")
)
ORDER BY "t"."TownId";

The t in "t"."TownId" = "z"."TownId" generates the error below.

no data source matches prefix: t

@rkdrnfds The work is ongoing. There won't be progress before 2.1 however.

Zendesk ticket #2713 has been linked to this issue.

Correlated subquery support is now part of 2.1, so closing this issue down.

I'm running v2.1.0-beta.20180904, hoping to get Symfony+Doctrine ORM working with CockroachDB. The query below, however, returns this error message:

SQLSTATE[42P01]: Undefined table: 7 ERROR: no data source matches prefix: a HINT: some correlated subqueries are not supported yet - see https://github.com/cockroachdb/cockroach/issues/3288

Can you provide guidance? I'd assume that either the correlated subquery stuff somehow didn't make it into v2.1.0-beta.20180904, or that the error message is also used for a different issue.

  An exception occurred while executing 'SELECT
                      a.attnum,
                      quote_ident(a.attname) AS field,
                      t.typname AS type,
                      format_type(a.atttypid, a.atttypmod) AS complete_type,
                      (SELECT tc.collcollate FROM pg_catalog.pg_collation tc WHERE tc.oid = a.attcollation) AS collation,
                      (SELECT t1.typname FROM pg_catalog.pg_type t1 WHERE t1.oid = t.typbasetype) AS domain_type,
                      (SELECT format_type(t2.typbasetype, t2.typtypmod) FROM
                        pg_catalog.pg_type t2 WHERE t2.typtype = 'd' AND t2.oid = a.atttypid) AS domain_complete_type,
                      a.attnotnull AS isnotnull,
                      (SELECT 't'
                       FROM pg_index
                       WHERE c.oid = pg_index.indrelid
                          AND pg_index.indkey[0] = a.attnum
                          AND pg_index.indisprimary = 't'
                      ) AS pri,
                      (SELECT pg_get_expr(adbin, adrelid)
                       FROM pg_attrdef
                       WHERE c.oid = pg_attrdef.adrelid
                          AND pg_attrdef.adnum=a.attnum
                      ) AS default,
                      (SELECT pg_description.description
                          FROM pg_description WHERE pg_description.objoid = c.oid AND a.attnum = pg_description.objsubid
                      ) AS comment
                      FROM pg_attribute a, pg_class c, pg_type t, pg_namespace n
                      WHERE n.nspname NOT IN ('pg_catalog', 'information_schema', 'pg_toast') AND c.relname = 'backward_dependencies' AND n.nspname = 'crdb_internal'
                          AND a.attnum > 0
                          AND a.attrelid = c.oid
                          AND a.atttypid = t.oid
                          AND n.oid = c.relnamespace
                      ORDER BY a.attnum':

/cc @andy-kimball @knz

sqlfmt formatted for the interested

I think the issue should stay open until all correlated subqueries are supported, including the apply operator for those that cannot be decorrelated.

@andy-kimball Lauren and I are planning to write up the current limits of decorrelation in the docs. Perhaps we could link to that from here.

The posted query has two issues that prevent decorrelation:

  1. Use of a subscript expression [0]; this is not currently supported by the cost-based optimizer and so triggers fallback to the heuristic planner.
  2. Absence of unique index assertions on the pg_catalog tables, so the optimizer cannot statically prove that correlated subqueries return <= 1 result.
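To make the second point concrete, here is a minimal sketch of why a uniqueness guarantee matters when a scalar correlated subquery is rewritten as a left join. It uses SQLite via Python's sqlite3 module as a stand-in, not CockroachDB's optimizer, and the attrs, types_unique and types_dup tables are hypothetical. Note that SQLite silently takes the first row when a scalar subquery yields several, whereas PostgreSQL raises an error in that case.

```python
import sqlite3

# Stand-in database; all table names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attrs (attnum INTEGER PRIMARY KEY, typid INTEGER);
    CREATE TABLE types_unique (oid INTEGER PRIMARY KEY, typname TEXT);
    CREATE TABLE types_dup (oid INTEGER, typname TEXT);  -- no uniqueness on oid
    INSERT INTO attrs VALUES (1, 100), (2, 200);
    INSERT INTO types_unique VALUES (100, 'int8'), (200, 'text');
    INSERT INTO types_dup VALUES (100, 'int8'), (100, 'int8_alias'), (200, 'text');
""")

def via_subquery(types_table):
    # Scalar correlated subquery: one output row per row of attrs.
    return conn.execute(f"""
        SELECT a.attnum,
               (SELECT t.typname FROM {types_table} AS t WHERE t.oid = a.typid)
        FROM attrs AS a
        ORDER BY a.attnum
    """).fetchall()

def via_left_join(types_table):
    # Decorrelated form: equivalent only if each attrs row matches at most one type row.
    return conn.execute(f"""
        SELECT a.attnum, t.typname
        FROM attrs AS a
        LEFT JOIN {types_table} AS t ON t.oid = a.typid
        ORDER BY a.attnum, t.typname
    """).fetchall()

print(via_subquery("types_unique") == via_left_join("types_unique"))
print(len(via_subquery("types_dup")), len(via_left_join("types_dup")))
```

With the unique key the two forms agree. Without it, the join produces an extra row, which is the case a planner has to rule out before it can apply the rewrite.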

Thanks for clarifying.

Typically, I'd be more than happy to simply rewrite a query or two. In this case though, this query is generated by Doctrine when it tries to see what the current schema looks like when creating migration scripts (./doctrine migrations:diff)

As such, there's not an easy work-around where I can just rewrite one or two of my own queries. For the time being I'll just use MySQL instead, but figured this bit of context might help in determining the appropriate priority for support of the posted query.

@jordanlewis can we close this now?

It seems this doesn't work for DELETE queries yet. When I run this, I get the expected result:

SELECT * FROM list WHERE NOT EXISTS(SELECT 1 FROM list_file WHERE list_id = list.id);
  id | public_id | title | views |           date_created           | user_id |     user_ip      
+----+-----------+-------+-------+----------------------------------+---------+-----------------+
  44 | feV9vhnW  | bob   |     0 | 2019-04-07 09:34:53.223956+00:00 |       1 | 127.0.0.1:53948  
(1 row)

But when I replace SELECT with DELETE it breaks:

DELETE FROM list WHERE NOT EXISTS(SELECT 1 FROM list_file WHERE list_id = list.id);  
pq: no data source matches prefix: list

I'm on cockroach 2.1.6

@Fornax96 I expect this to work on our upcoming 19.1 release. You can try a beta here (https://www.cockroachlabs.com/docs/releases/v19.1.0-rc.2.html).

Awesome, I'll try it out.

Upgraded to 19.1, it works now!

I'm closing this issue out, since we've now released 19.1 with support for correlated subqueries both in read-only and mutation statements.
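For anyone who wants to check the expected semantics of a correlated subquery inside a mutation, here is a small sketch of the SELECT/DELETE pair discussed above. It runs on SQLite through Python's sqlite3 module as a stand-in rather than on CockroachDB, and the table definitions are cut down to a few assumed columns with made-up rows.

```python
import sqlite3

# Stand-in database with a reduced, hypothetical version of the list/list_file schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE list (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE list_file (list_id INTEGER, file_name TEXT);
    INSERT INTO list VALUES (43, 'alice'), (44, 'bob');
    INSERT INTO list_file VALUES (43, 'a.txt');
""")

# Read-only form: lists that have no files attached.
orphans = conn.execute("""
    SELECT id, title FROM list
    WHERE NOT EXISTS (SELECT 1 FROM list_file WHERE list_id = list.id)
""").fetchall()

# Mutation form: the same correlated predicate, now driving a DELETE.
deleted = conn.execute("""
    DELETE FROM list
    WHERE NOT EXISTS (SELECT 1 FROM list_file WHERE list_id = list.id)
""").rowcount

remaining = conn.execute("SELECT id, title FROM list ORDER BY id").fetchall()

print(orphans, deleted, remaining)
```

The DELETE removes exactly the rows the SELECT returned, which is the behavior reported as working after the upgrade to 19.1.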

Hi, I'm not sure if this is related to this issue, but this is where I ended up when searching for my problem. I'm trying stuff out in CockroachDB, still learning the ropes, and I'm trying to use the RETURNING feature of DELETE to then modify a row in another table, as follows:

UPDATE u SET c = true WHERE id = (DELETE FROM uu WHERE code = 'some code' RETURNING id);

But I am getting an error:

invalid syntax: statement ignored: at or near "from": syntax error
DETAIL: source SQL:
UPDATE u SET c = true WHERE id = (DELETE FROM uu WHERE code = 'some code'  RETURNING id)
                                         ^
HINT: try \h UPDATE

If I run these separately they work fine (supplying the data as needed).

I also tried:

WITH cu AS (DELETE FROM uu WHERE code = 'some code' RETURNING id) UPDATE u SET c = true WHERE id = cu.id

but I get:

pq: no data source matches prefix: cu

I know there are workarounds, but these introduce (worse) race conditions that I would like to avoid if possible. Am I doing something wrong, or is what I am trying to do not yet implemented? Thanks.

build: CCL v19.2.6 @ 2020/04/06 18:05:31 (go1.12.12)


Edit:
Argh, I see this might be more closely related to: #43963 (which I only saw after posting the above msg)

@insaner this is a legitimate need but it's a new issue. Please file a new issue and copy-paste the text of your submission.

Perfect, thanks for looking into this. I've filed a new issue.

In case someone else comes upon this issue: this does work in CRDB and PG; the syntax just wasn't quite right. It should be like this instead:

WITH cu AS (DELETE FROM uu WHERE code = 'some code' RETURNING id)
UPDATE u SET c = true FROM cu WHERE id = cu.id