ibis-project / ibis

the portable Python dataframe library

Home Page: https://ibis-project.org

feat: convert SQL to Ibis code

galenseilis opened this issue · comments

Is your feature request related to a problem?

My team primarily develops in Python, and overall I see substantial benefits to using Ibis.

My team needs to exchange SQL with other teams. The ibis.to_sql function takes care of generating SQL for others, but it is also important that we can readily ingest SQL. I see that ibis.sql is a first approximation of that, but it still leaves SQL code mixed into our Python code (or means parsing the SQL from other files).

It would be a major boon to be able to convert SQL to Ibis expressions programmatically.

What is the motivation behind your request?

This issue is more about some previous work that I wasn't able to quickly find, rather than something completely novel to the Ibis dev team.

In the presentation "Cloud + Forsyth- Ibis- Expressive analytics in Python at any scale | PyData NYC 2022" (sorry, I am not sure of the exact timestamp; it is some time after t=514), it was mentioned that there may be functionality in the future to convert SQL code into Ibis code (i.e. Python).

Did anything happen with implementing that?

Describe the solution you'd like

I'd like for this to be a thing that exists. Clearly someone has already thought about it, but I have no idea what the current state is. This issue I have raised here is primarily to find out what the current state looks like.

What version of ibis are you running?

I have not started using it yet. I am in an evaluation stage where I am trying to decide how well suited it is for my team's stack. I'll be aiming to use whatever the latest stable release is that is compatible with the kedro-dataset.

Also, I would like to stay up to date with security advisories such as this: GHSA-x563-6hqv-26mr

What backend(s) are you using, if any?

MSSQL and PostgreSQL for the most part. Occasionally, for small locally stored or in-memory things, I will use SQLite.

Code of Conduct

  • I agree to follow this project's Code of Conduct

xref: #9267

this functionality does exist -- here is an example: https://gist.github.com/lostmygithubaccount/f6b4b02e626d8b9ef965015daf3d48a3

however, it is in an experimental state and was added a while ago. it doesn't currently work for most SQL you'd want to use this on. we were discussing this earlier this week

Let us discuss this early next week and plan to provide an update! I think the main unknown is just how difficult the problem of accurate conversion really is.

Hey @galenseilis --

Our current plan is to see what it takes to take TPC-H query 1 from SQL -> Ibis Python and then use that as a guide for how hard this will be to get working for most SQL functionality.

We agree that it would be awesome if it worked, we just aren't sure how much of a rabbit hole that is.

Definitely ping us if we haven't gotten back here in 2 weeks!

Just wanted to echo that this would be incredibly useful for scenarios I face in my current work as well (mainly in helping onboard analysts to ibis, while still allowing them to utilize functions, etc, written by others that operate on Ibis expressions).