ms705 / nom-sql

Rust SQL parser written using nom

support MySQL special character escape sequences

lovasoa opened this issue

Hi!
Thank you for this great library.
I am trying to use it to parse Wikipedia dumps in my project wikipedia-externallinks-fast-extraction.
Unfortunately, the dumps contain MySQL escape sequences that this library does not currently support.

Unsupported characters

The escape characters are:

\0
\'
\"
\b
\n
\r
\t
\Z
\\
\%
\_
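To make the list above concrete, here is a minimal sketch of how the body of a MySQL string literal could be decoded once the surrounding quotes have been stripped. It is a hand-rolled illustration, not nom-sql's actual implementation; the function name `unescape_mysql` is hypothetical. It assumes the behaviour described in the MySQL manual: `\%` and `\_` keep their backslash outside pattern-matching contexts, and a backslash before any other character is simply dropped.

```rust
/// Hypothetical helper (not part of nom-sql): decodes MySQL backslash
/// escapes in the body of a string literal whose quotes were already removed.
/// Returns `None` if the input ends with a dangling backslash.
fn unescape_mysql(body: &str) -> Option<String> {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash must be followed by exactly one more character.
        match chars.next()? {
            '0' => out.push('\0'),      // ASCII NUL
            'b' => out.push('\u{8}'),   // backspace
            'n' => out.push('\n'),
            'r' => out.push('\r'),
            't' => out.push('\t'),
            'Z' => out.push('\u{1a}'),  // Control+Z
            // Assumption from the MySQL manual: outside LIKE patterns these
            // two sequences evaluate to themselves, backslash included.
            '%' => out.push_str("\\%"),
            '_' => out.push_str("\\_"),
            // Covers \', \" and \\ as well as any unknown escape:
            // the backslash is dropped and the character is kept.
            other => out.push(other),
        }
    }
    Some(out)
}
```

With this sketch, the URL fragment from the example below, `apliut.htm\'`, would decode to `apliut.htm'`.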

Example

INSERT INTO externallinks VALUES (23481,120102,'http://home.arcor.de/jean-polmartin/aufsaetze/apliut.htm\'','http://de.arcor.home./jean-polmartin/aufsaetze/apliut.htm\'','http://de.arcor.home./jean-polmartin/aufsaetze/apliut.htm\'');

SQLite escape sequences don't seem to be supported either. According to the README:

We try to support both the SQLite and MySQL syntax; where they disagree, we choose MySQL. (It would be nice to support both via feature flags in the future.)

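So I think:

  • '''' should parse as ' (doubled-quote escaping; this is the only form SQLite supports, and MySQL accepts it too)
  • '\' should parse as \ (SQLite-only; in MySQL the backslash escapes the closing quote, so the string is unterminated)
  • '\\' is a valid string in both SQLite and MySQL, but with a different meaning: it is \\ in SQLite and \ in MySQL
  • '\'' should parse as ' (MySQL-only)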
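The four cases above can be sketched as a small literal scanner with a dialect switch, in the spirit of the feature flags the README mentions. This is an illustration only: `Dialect` and `string_literal` are hypothetical names, not nom-sql's API. It assumes a doubled quote is accepted in both dialects, and in MySQL mode it keeps the escaped character verbatim, so special escapes such as `\n` or `\Z` are not translated here.

```rust
/// Hypothetical dialect switch; nom-sql does not expose this type.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Dialect {
    MySql,
    Sqlite,
}

/// Reads one single-quoted literal from the start of `input`.
/// Returns the decoded value and the unparsed remainder, or `None`
/// if the input does not start with a quote or the literal is unterminated.
fn string_literal(input: &str, dialect: Dialect) -> Option<(String, &str)> {
    let mut chars = input.char_indices().peekable();
    if chars.next()?.1 != '\'' {
        return None;
    }
    let mut out = String::new();
    while let Some((i, c)) = chars.next() {
        match c {
            '\'' => {
                if let Some(&(_, '\'')) = chars.peek() {
                    // A doubled quote stands for one literal quote.
                    chars.next();
                    out.push('\'');
                } else {
                    // Closing quote: the rest of the input starts after it.
                    return Some((out, &input[i + 1..]));
                }
            }
            // Only MySQL treats the backslash as an escape character.
            // Simplification: the escaped character is kept as-is.
            '\\' if dialect == Dialect::MySql => {
                let (_, escaped) = chars.next()?;
                out.push(escaped);
            }
            other => out.push(other),
        }
    }
    None // ran out of input before a closing quote
}
```

Under these assumptions, `'\''` is a complete literal only in MySQL mode, while `'\'` is complete only in SQLite mode, which matches the list above.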

It should not be difficult to implement using nom::escaped_transform.

Good catch -- I actually independently ran into this issue last week (also parsing MySQL dumps) and made a mental note to fix it!

I originally looked at nom::escaped_transform for this, but hadn't yet figured out exactly how to use it for this purpose. Looks like you ended up hand-rolling the parse rule instead, probably for good reasons.

I'll check out the PR 👍

I also tried to use nom::escaped_transform but with no success. I think this is because of that part of the documentation:

WARNING: if you do not use the verbose-errors feature, this combinator will currently fail to build because of a type inference error

I think we have all of them supported now, thanks to @lovasoa's work 👍