HIVE - `GROUPING SETS` cause parsing error

Question

antonitto opened this issue 3 months ago

Anton commented 3 months ago

Search before asking

I searched the issues and found no similar issues.

What Happened

I ran `sqlfluff lint` with the hive dialect and got a parsing error (see the example test.sql under "How to reproduce" below):

== [test.sql] FAIL
L:   6 | P:  12 | CP01 | Keywords must be consistently upper case.
                       | [capitalisation.keywords]
L:   8 | P:   8 | LT01 | Expected single whitespace between numeric literal and
                       | raw comparison operator '='.
                       | [layout.spacing]
L:   8 | P:   9 | LT01 | Expected single whitespace between raw comparison
                       | operator '=' and numeric literal.
                       | [layout.spacing]
L:  10 | P:   1 |  PRS | Line 10, Position 1: Found unparsable section: 'GROUPING
                       | SETS (\n    (col1, col2),\n    (c...'
L:  14 | P:   2 | CV06 | Statements must end with a semi-colon.
                       | [convention.terminator]
WARNING: Parsing errors found and dialect is set to 'hive'. Have you configured your dialect correctly?
All Finished 📜 🎉!

Expected Behaviour

HiveQL supports this syntax and the query runs fine, so linting should not report an unparsable section.

Observed Behaviour

See the lint output under "What Happened" above.

How to reproduce

SELECT
    col1,
    col2,
    col3,
    col4,
    sum(1) as cnt
FROM tbl
WHERE 1=1
GROUP BY col1, col2, col3, col4
GROUPING SETS (
    (col1, col2),
    (col1, col2, col3),
    (col1, col2, col3, col4)
)
;
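
The same check can probably be scripted through sqlfluff's simple Python API instead of the CLI. This is a minimal sketch, assuming sqlfluff is installed and that `sqlfluff.lint` returns a list of dicts carrying a `code` key, with parse failures reported as `PRS` as in the CLI output above. The names `REPRO_SQL`, `parse_errors`, and `lint_hive` are made up for illustration and are not part of sqlfluff.

```python
# Hypothetical reproduction helper; only sqlfluff.lint() is a real sqlfluff call.
REPRO_SQL = """SELECT
    col1,
    col2,
    col3,
    col4,
    sum(1) as cnt
FROM tbl
WHERE 1=1
GROUP BY col1, col2, col3, col4
GROUPING SETS (
    (col1, col2),
    (col1, col2, col3),
    (col1, col2, col3, col4)
)
;
"""


def parse_errors(violations):
    """Keep only the records whose code is 'PRS' (unparsable section)."""
    return [v for v in violations if v.get("code") == "PRS"]


def lint_hive(sql):
    """Lint a SQL string with the hive dialect via sqlfluff's simple API."""
    import sqlfluff  # imported lazily so parse_errors() works without it

    return sqlfluff.lint(sql, dialect="hive")


def main():
    try:
        violations = lint_hive(REPRO_SQL)
    except ImportError:
        print("sqlfluff is not installed; run `pip install sqlfluff` first")
        return
    # Assumption: each record has a 'description' key alongside 'code'.
    for v in parse_errors(violations):
        print(v["code"], v.get("description"))


if __name__ == "__main__":
    main()
```

If the bug is present, the loop should print at least one `PRS` record; once the hive dialect parses `GROUPING SETS`, `parse_errors` should return an empty list for this query.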

Dialect

hive

Version

sqlfluff: 3.0.1
python: 3.8-11

Configuration

[sqlfluff]
sql_file_exts = .sql,.sql.j2,.dml,.ddl,.hql
output_line_length = 512
max_line_length = 128
ignore_comment_lines = True
large_file_skip_byte_limit = 30000
exclude_rules = L031,L035,L011,L042

[sqlfluff:rules]
allow_scalar = True
single_table_references = consistent
unquoted_identifiers_policy = all

[sqlfluff:indentation]
tab_space_size = 4
indent_unit = space
indented_joins = false
indented_using_on = true
template_blocks_indent = true

[sqlfluff:rules:convention.terminator]
multiline_newline = True
 
[sqlfluff:layout:type:comma]
line_position = trailing

Are you willing to work on and submit a PR to address the issue?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct

Anton · Answer 1 · Sat Mar 23 2024 10:14:37 GMT+0800 (China Standard Time)

Similar issue filed here for BigQuery: #5674