sql_to_ibis
is a Python package that translates SQL syntax into ibis expressions. This lets you use a single SQL dialect to target many different backends.
To use an ibis table in sql_to_ibis, you must first register it. For joins or other queries that involve more than one table, every ibis table must be created with the same ibis client. Once a table is registered, you can query it with SQL using the query function. In the example below, we create and query a pandas DataFrame:
import ibis.pandas
import pandas
import sql_to_ibis
# Build a pandas DataFrame and wrap it as an ibis table.
df = pandas.DataFrame({"column1": [1, 2, 3], "column2": ["4", "5", "6"]})
ibis_table = ibis.pandas.Backend().from_dataframe(
    df, name="my_table", client=ibis.pandas.connect({})
)
# Register the table under the name used in the SQL, then run the query.
sql_to_ibis.register_temp_table(ibis_table, "my_table")
sql_to_ibis.query(
    "select column1, cast(column2 as integer) + 1 as my_col2 from my_table"
).execute()
This would output a DataFrame that looks like:
column1 | my_col2 |
---|---|
1 | 5 |
2 | 6 |
3 | 7 |
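The same-client requirement for multi-table queries can be sketched as follows. This is a minimal sketch rather than a tested recipe: the `purchases` and `customers` tables, their columns, and the `run_join_query` helper are hypothetical, and the `inner join ... on` form is assumed to be accepted by the parser. The imports sit inside the function so the snippet loads even where the packages are not installed.

```python
# Hypothetical two-table query; table and column names are illustrative only.
JOIN_SQL = (
    "select purchases.purchase_id, customers.customer_name "
    "from purchases inner join customers "
    "on purchases.customer_id = customers.customer_id"
)


def run_join_query():
    """Register two tables created with ONE shared client, then join them."""
    import ibis.pandas
    import pandas
    import sql_to_ibis

    purchases = pandas.DataFrame({"purchase_id": [10, 11], "customer_id": [1, 2]})
    customers = pandas.DataFrame(
        {"customer_id": [1, 2], "customer_name": ["Ann", "Bo"]}
    )

    # Create the client once and pass it to every from_dataframe call.
    client = ibis.pandas.connect({})
    purchases_table = ibis.pandas.Backend().from_dataframe(
        purchases, name="purchases", client=client
    )
    customers_table = ibis.pandas.Backend().from_dataframe(
        customers, name="customers", client=client
    )

    sql_to_ibis.register_temp_table(purchases_table, "purchases")
    sql_to_ibis.register_temp_table(customers_table, "customers")
    return sql_to_ibis.query(JOIN_SQL).execute()
```

Calling `run_join_query()` should return the joined result as a pandas DataFrame, provided the assumptions above hold.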
The SQL syntax for sql_to_ibis is as follows (note that all syntax is case-insensitive):
Example:
Note that columns with spaces in them can be expressed using double quotes. For example:
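A query that refers to a column whose name contains a space might look like the sketch below. The column name `my column 1`, the alias `first_column`, and the `run_quoted_column_query` helper are hypothetical; only the registration and query calls come from the example earlier in this document. Imports are kept inside the function so the snippet loads without the packages installed.

```python
# The double quotes belong to the SQL text, so the Python string uses single quotes.
QUOTED_COLUMN_SQL = 'select "my column 1" as first_column, column2 from my_table'


def run_quoted_column_query():
    """Query a column whose name contains spaces (illustrative data)."""
    import ibis.pandas
    import pandas
    import sql_to_ibis

    df = pandas.DataFrame({"my column 1": [1, 2, 3], "column2": ["4", "5", "6"]})
    ibis_table = ibis.pandas.Backend().from_dataframe(
        df, name="my_table", client=ibis.pandas.connect({})
    )
    sql_to_ibis.register_temp_table(ibis_table, "my_table")
    return sql_to_ibis.query(QUOTED_COLUMN_SQL).execute()
```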
Example:
Example:
<aggregate>() OVER(
[PARTITION BY (<expression> [, <expression>...])]
[ORDER BY (<expression> [, <expression>...])]
[ ( ROWS | RANGE ) ( <preceding> | BETWEEN <preceding> AND <following> ) ]
)
<preceding>: UNBOUNDED PRECEDING | <unsigned_integer> PRECEDING | CURRENT ROW
<following>: UNBOUNDED FOLLOWING | <unsigned_integer> FOLLOWING | CURRENT ROW
- Anything in <> is meant to be some string
- Anything in [] is optional
- Anything in {} is grouped together
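As an illustration of the window syntax above, the sketch below computes a running total per group. It assumes that `sum` is an accepted aggregate, that `ORDER BY` is written as two words as in standard SQL, and that the chosen backend supports this window frame; the data and the `run_window_query` helper are hypothetical. Imports are kept inside the function so the snippet loads without the packages installed.

```python
# Running total of column1 within each column2 group (illustrative query).
WINDOW_SQL = (
    "select column1, column2, "
    "sum(column1) over ("
    "partition by (column2) "
    "order by (column1) "
    "rows between unbounded preceding and current row"
    ") as running_total "
    "from my_table"
)


def run_window_query():
    """Register a small table and run the window query against it."""
    import ibis.pandas
    import pandas
    import sql_to_ibis

    df = pandas.DataFrame(
        {"column1": [1, 2, 3, 4], "column2": ["a", "a", "b", "b"]}
    )
    ibis_table = ibis.pandas.Backend().from_dataframe(
        df, name="my_table", client=ibis.pandas.connect({})
    )
    sql_to_ibis.register_temp_table(ibis_table, "my_table")
    return sql_to_ibis.query(WINDOW_SQL).execute()
```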
Supported data types for cast expressions include:
- VARCHAR, STRING
- INT16, SMALLINT
- INT32, INT, INTEGER
- INT64, BIGINT
- FLOAT16
- FLOAT32
- FLOAT, FLOAT64
- BOOL
- DATETIME64, TIMESTAMP
- CATEGORY
- OBJECT
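Assuming the type names listed above are the ones accepted in cast expressions, a query using two of them could be sketched as follows. The aliases and the `run_cast_query` helper are hypothetical; the table mirrors the DataFrame from the first example. Imports are kept inside the function so the snippet loads without the packages installed.

```python
# Cast a string column to INT64 and an integer column to FLOAT64 (illustrative).
CAST_SQL = (
    "select cast(column2 as int64) as column2_int, "
    "cast(column1 as float64) as column1_float "
    "from my_table"
)


def run_cast_query():
    """Register the example table and run the cast query against it."""
    import ibis.pandas
    import pandas
    import sql_to_ibis

    df = pandas.DataFrame({"column1": [1, 2, 3], "column2": ["4", "5", "6"]})
    ibis_table = ibis.pandas.Backend().from_dataframe(
        df, name="my_table", client=ibis.pandas.connect({})
    )
    sql_to_ibis.register_temp_table(ibis_table, "my_table")
    return sql_to_ibis.query(CAST_SQL).execute()
```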