ClearTax / sqlcli

A CLI tool written to parse SQL statements and answer questions about them.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

sqlcli

sqlcli is a CLI tool written to parse SQL statements and answer questions about them.

Currently, it only really does two things:

  1. Given a SELECT statement, return the tables we read from
  2. Given a SELECT statement, return the columns that will be present in the output resultset — i.e., return the SQL Projection.

sqlcli was built as a glue layer in some parts of ClearTax's data platform. It relies heavily on the sqlparser crate — you can consider this to be a thin layer over the actual hard work that sqlparser does.

It's written in rust. Probably not very idiomatic rust, since it was written in a hurry. Feedback welcome!

Usage

Display help:

$ sqlcli
==>
sqlcli 0.1.0
Parse sql queries and answer questions about them.

USAGE:
    sqlcli [FLAGS] <SUBCOMMAND>

FLAGS:
    -d, --debug      Turn on debug logging
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    ast        Print the AST as generated by sqlparser-rs.
    columns    Print output column names of the SQL query.
    help       Prints this message or the help of the given subcommand(s)
    tables     Print tables queried from in SQL.

Table names

Parse table names from a SQL select statement:

# Use filename argument
$ sqlcli tables some_file.sql
# Or as a filter
$ cat some_file.sql | sqlcli tables

Examples (output after ==>):

$ sqlcli tables << EOF
select * from video_views where user_id in (
    select id from users where plan = 'Ultra'
)
EOF
==>
users
video_views

sqlcli should be able to handle arbitrarily complex SQL with joins, CTEs, group by & partitions, etc.

$ sqlcli tables << EOF
with last_month_active_users as (
    select user_id, last_active_time from users
    where last_active_time > current_timestamp - interval '1 month'
),
renewal_due as (
    select active.id, billing.is_due
    from last_month_active_users active join billing_dimension billing on active.user_id = billing.user_id
)

select 
    (select count(*) from last_month_active_users) total_active_users,
    (select count(*) from renewal_due where is_due = 1) active_and_renewal_due

EOF

==>
billing_dimension
users

Note: sqlcli tables will always return the unique source tables any given query reads from.

In the above example, the final SELECT projection reads from an inner common table expression. These temporary tables are excluded from the output.

Column names

The columns subcommand returns the fields that should be present in the resultset.

# Use filename argument
$ sqlcli columns some_file.sql
# Or as a filter
$ cat some_file.sql | sqlcli columns

Examples:

$ sqlcli columns << EOF
SELECT user_id, first_name, last_name, concat(first_name, ' ', last_name) as full_name from users
EOF
==>
NAMED: user_id
NAMED: first_name
NAMED: last_name
NAMED: full_name

It will highlight if the query contains any wildcards or columns whose name cannot be determined (eg: selecting an expression like COUNT(*) without an alias).

This can be used to lint queries running in your data systems.

$ sqlcli columns << EOF
SELECT date(created_at) dt, count(*) from events
EOF
=>
NAMED: dt
UNNAMED
$ sqlcli columns << EOF
SELECT * from table_1
EOF
=>
WILDCARD: *
$ sqlcli columns << EOF
SELECT a.id, b.* 
from table_1 a left outer join table_2 b
on a.id = b.external_id
EOF
=>
NAMED: id
WILDCARD: b.*

Build & test

Install rust, and run:

# test
$ cargo test

# build binary
$ cargo build --release

The Fine Print

sqlcli:

  • only works on SELECT statements
  • only accepts one statement at a time (either via an input file or on STDIN)
  • will exit with return code 65 in case of malformed input
  • does not actually run the query, so you can only consider this to be a lint tool

All heavy-lifting is done by the excellent sqlparser library.

sqlcli was built as a side project for internal use at ClearTax. It's been open-sourced, but our primary use cases will remain internal. Pull requests / bug reports are welcome, but we may not be able to respond to everyone.

About

A CLI tool written to parse SQL statements and answer questions about them.

License:MIT License


Languages

Language:Rust 100.0%