Getting trouble when creating VT table

Question

rainnnnny opened this issue a year ago · comments

Describe the bug
Queries work when the table name is unqualified, as in select * from table_a, but fail when the name carries a schema prefix, as in select * from schema.table_a;

I'm writing a custom GraphQL adapter based on cancan101/graphql-db-api. The problem doesn't exist in that version; it appeared after I wrote an almost entirely different adapter.

To Reproduce
Steps to reproduce the behavior:

Query with select bla,bla from 'tunnel_account_cash' where blabla: works fine.
Query with select bla,bla from 'main.tunnel_account_cash' where blabla: fails.
in apsw/db.py, debug like:

The debug output shows "except create main.tunnel_account_cash" twice,
and the run ends with apsw.SQLError: SQLError: table "tunnel_account_cash" already exists

except create main.tunnel_account_cash
except create main.tunnel_account_cash
Traceback (most recent call last):
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 223, in execute
    self._cursor.execute(operation, parameters)
  File "src/cursor.c", line 1081, in APSWCursor_execute.sqlite3_prepare
apsw.SQLError: SQLError: no such table: main.tunnel_account_cash

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rss/python38/lib/python3.8/runpy.py", line 200, in _run_module_as_main
    return _run_code(code, main_globals, None, "__main__", mod_spec)
  File "/home/rss/python38/lib/python3.8/runpy.py", line 92, in _run_code
    exec(code, run_globals)
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/rss/.vscode-server/extensions/ms-python.python-2023.7.11081008/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/rss/new40g/scripts/graphqltest/graphql_cli.py", line 31, in <module>
    for row in connection.execute(
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 81, in wrapper
    return method(self, *args, **kwargs)
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 234, in execute
    self._create_table(uri)
  File "/home/rss/new40g/python/venv/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 294, in _create_table
    self._cursor.execute(
  File "src/cursor.c", line 240, in resetcursor
apsw.SQLError: SQLError: table "tunnel_account_cash" already exists

Expected behavior
The virtual table should be created properly.

My extra questions
Correct me if I'm wrong: it seems the adapter only matters for generating the CREATE TABLE SQL. I checked the SQL from VTTable.get_create_table and it is identical with or without the schema prefix main., so I can only conclude that the problem is not in my adapter but in the apsw cursor's execute, which fails to find the table.

I'm not familiar with C and find the apsw C code hard to read, and I can't stop wondering why I need the virtual table at all. What I want is to convert SQL to a GraphQL query, get the result, and return it. So what is the tradeoff of shillelagh being based on SQLite and apsw? Could I write something like gsheets-db-api, which seems to get rid of all the SQLite parts?

Desktop (please complete the following information):

OS: [centos]
Version [1.2.0]

Beto Dealmeida · Answer 1 · Thu Apr 20 2023 05:34:47 GMT+0800 (China Standard Time)

I think the problem is that your syntax is incorrect. This:

SELECT * FROM "main.t";  -- or 'main.t'

Means select everything from the table called "main.t". What you want to do is:

SELECT * FROM "main"."t";  -- or 'main'.'t'
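
The difference can be reproduced with plain SQLite. The following is a minimal sketch that uses Python's standard sqlite3 module rather than shillelagh or apsw, and the table t and column x are made up for illustration. It only shows how SQLite resolves the two spellings, not how shillelagh reacts to the resulting error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")

# Schema and table quoted separately: resolves to table "t" in schema "main".
rows_ok = conn.execute('SELECT * FROM "main"."t"').fetchall()

# A single quoted identifier: SQLite looks for a table literally named "main.t".
try:
    conn.execute('SELECT * FROM "main.t"')
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(rows_ok)
print(error)
```

The second query fails with a "no such table" error of the same kind as the first exception in the traceback above.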

rainnnnny · Answer 2 · Thu Apr 20 2023 11:12:38 GMT+0800 (China Standard Time)

Sorry I made a stupid mistake, the problem's gone, thanks a lot

But I would still like to know why shillelagh is based on SQLite. Would you please take a look at my questions? (copied from 'My extra questions' above)

I'm not familiar with C and find the apsw C code hard to read, and I can't stop wondering why I need the virtual table at all. What I want is to convert SQL to a GraphQL query, get the result, and return it. So what is the tradeoff of shillelagh being based on SQLite and apsw? Could I write something like gsheets-db-api, which seems to get rid of all the SQLite parts?

Beto Dealmeida · Answer 3 · Thu Apr 20 2023 22:27:39 GMT+0800 (China Standard Time)

No worries, I'm glad it was something simple!

There are pros and cons to both approaches.

gsheets-db-api tries to solve the problem by parsing the SQL and translating it to REST calls. Parsing the SQL, doing the calls, and building the result set is not trivial. For simple queries like

SELECT * FROM "https://docs.google.com/..."
WHERE col > 1

it's relatively easy, but handling any valid SQL query is a lot of work. Because of this, gsheets-db-api supports only a small subset of SQL, mostly the queries produced by Apache Superset.

The advantage of this approach is that it can be more efficient. For example, in gsheets-db-api we can push the aggregations to the server, potentially greatly reducing the amount of data that has to be downloaded by the client. This is not true for shillelagh.

For shillelagh I used the virtual table approach. This means that all the SQL parsing and result set generation is done by SQLite. Any query that is a valid SQLite query, no matter how complex, will work. All the adapter needs to worry about is fetching data, filtering it, and sorting it, which are easy to do.

The disadvantage is that now the aggregations are done outside of the virtual table by SQLite. A simple query like SELECT COUNT(*) FROM some_table requires all the data to be downloaded and passed to SQLite so it can be counted.
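
To make the tradeoff concrete, here is a rough sketch in plain Python. It uses neither shillelagh nor gsheets-db-api; FakeRemote, fetch_all, count, and rows_sent are hypothetical names invented for this illustration. It compares how many values leave the "server" when a count is computed on the client versus pushed down to the server.

```python
class FakeRemote:
    """Hypothetical remote source that tracks how many values it sends."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.rows_sent = 0

    def fetch_all(self):
        # Virtual-table style: the server only hands back raw rows.
        for row in self._rows:
            self.rows_sent += 1
            yield row

    def count(self):
        # Pushdown style: the server aggregates and returns a single value.
        self.rows_sent += 1
        return len(self._rows)


def count_locally(remote):
    # Every row is downloaded just so the client can count it.
    return sum(1 for _ in remote.fetch_all())


def count_pushed_down(remote):
    # Only the aggregated result crosses the wire.
    return remote.count()


vt_remote = FakeRemote(range(10_000))
pushdown_remote = FakeRemote(range(10_000))

local_result = count_locally(vt_remote)
pushed_result = count_pushed_down(pushdown_remote)

print(local_result, vt_remote.rows_sent)
print(pushed_result, pushdown_remote.rows_sent)
```

Both calls return the same count, but the first transfers every row while the second transfers a single value. The price of the second approach is that something has to recognize the aggregation in the SQL and translate it for the server.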

rainnnnny · Answer 4 · Fri Apr 21 2023 10:27:54 GMT+0800 (China Standard Time)

Got it, and thank you so much for the detailed explanation. My work is precisely for Superset queries, as there seem to be no open-source GraphQL-based BI tools available so far. I've built a demo based on shillelagh and will give it more consideration for production use. shillelagh is really a great idea and very convenient for situations like mine!