matrixorigin / matrixone

Hyperconverged cloud-edge native database

Home Page:https://docs.matrixorigin.cn/en


[Bug]: `load data inline` does not work when specifying `fields enclosed by '"'` by throwing an error `ERROR 20301 (HY000): invalid input: csvParser error: unterminated quoted field`

aronchanisme opened this issue · comments


Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

1.2-dev

Commit ID

a5c552e

Other Environment Information

- Hardware parameters: 8c 16g ssd
- OS type: Debian 12 (kernel: 6.1.0-18-amd64)
- Others: none

Actual Behavior

load data inline fails when fields enclosed by '"' is specified, returning ERROR 20301 (HY000): invalid input: csvParser error: unterminated quoted field. The same data loads successfully through load data local infile, as the session below shows.

MySQL [test]> drop table user;
Query OK, 0 rows affected (0.016 sec)

MySQL [test]> create table user (name varchar(25), age int, city varchar(25));
Query OK, 0 rows affected (0.013 sec)

MySQL [test]> system cat /tmp/1.csv
"A","23","Hello"
MySQL [test]> load data local infile '/tmp/1.csv' into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
Query OK, 1 row affected (0.011 sec)

MySQL [test]> select * from user;
+------+------+-------+
| name | age  | city  |
+------+------+-------+
| A    |   23 | Hello |
+------+------+-------+
1 row in set (0.004 sec)

MySQL [test]> load data inline format='csv', data=$XXX$
    -> "zhangsan","26","Shanxi;Xian" $XXX$
    -> into table user
    -> fields terminated by ',' enclosed by '"'
    -> lines terminated by '\n'
    -> (name,age,city);
ERROR 20301 (HY000): invalid input: csvParser error: unterminated quoted field
MySQL [test]> load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi;Xian" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
ERROR 20301 (HY000): invalid input: csvParser error: unterminated quoted field
MySQL [test]> load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
ERROR 20301 (HY000): invalid input: csvParser error: unterminated quoted field
MySQL [test]> system vi /tmp/2.csv
MySQL [test]> load data local infile '/tmp/2.csv' into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
Query OK, 1 row affected (0.004 sec)

MySQL [test]> system cat /tmp/2.csv
"zhangsan","26","Shanxi;Xian"
MySQL [test]> select * from test.user;
+----------+------+-------------+
| name     | age  | city        |
+----------+------+-------------+
| A        |   23 | Hello       |
| zhangsan |   26 | Shanxi;Xian |
+----------+------+-------------+
2 rows in set (0.001 sec)

MySQL [test]> select git_version();
+---------------+
| git_version() |
+---------------+
| a5c552ef5     |
+---------------+
1 row in set (0.001 sec)

Expected Behavior

Given a CSV file with the following content:

"zhangsan","26","Shanxi;Xian"

Both MO and MySQL can load this file with load data local infile and insert the row into the table. However, MO's load data inline, which loads the data directly from the SQL statement, fails. The following statements are expected to work:

load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi;XiAn" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);

Expected result of a select * from test.user after the load:

MySQL [test]> select * from test.user;
+----------+------+-------------+
| name     | age  | city        |
+----------+------+-------------+
| zhangsan |   26 | Shanxi      |
| zhangsan |   26 | Shanxi;XiAn |
+----------+------+-------------+
2 rows in set (0.001 sec)
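
As an illustration of the parse this report expects, the minimal Python sketch below feeds the same line through the standard-library csv module, using the delimiter and quote character from the statement. It is not MatrixOne's parser, and parse_line is a hypothetical helper introduced only for this example.

```python
import csv
import io


def parse_line(line, delimiter=",", quotechar='"'):
    """Parse one CSV line into a list of fields (illustrative helper only)."""
    reader = csv.reader(io.StringIO(line), delimiter=delimiter, quotechar=quotechar)
    return next(reader)


# The enclosing quotes are stripped and the ';' inside the last field is kept.
fields = parse_line('"zhangsan","26","Shanxi;Xian"\n')
print(fields)  # ['zhangsan', '26', 'Shanxi;Xian']
```

A parser that honors enclosed by '"' would be expected to yield the same three fields for the inline payload.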

Steps to Reproduce

create database if not exists test;
use test;
drop table if exists user;
create table user (name varchar(25), age int, city varchar(25));
load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
load data inline format='csv', data=$XXX$ "zhangsan","26","Shanxi;XiAn" $XXX$ into table test.user fields terminated by ',' enclosed by '"'  lines terminated by '\n' (name,age,city);
select * from user;

Additional information

#16790


#16790 is blocked by this issue.


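@aronchanisme load data infile is supported correctly. I checked the documentation, and load data inline has no description of terminated by, enclosed by, lines terminated by, and similar clauses, so load data inline probably does not parse them. When 莫尘 implemented this feature, he probably did not intend to support these options.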
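
Since the file-based path is the one confirmed to work in this thread, a possible interim approach is to write the rows to a temporary CSV file and load that file instead. The Python sketch below only prepares the file and the statement text; it does not connect to a database. The helper names write_quoted_csv and build_load_statement are hypothetical, and the statement mirrors the load data local infile form used earlier in this issue.

```python
import csv
import os
import tempfile


def write_quoted_csv(rows):
    """Write rows to a temporary file with every field enclosed in double quotes."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return path


def build_load_statement(path, table="test.user", columns=("name", "age", "city")):
    """Build the file-based statement in the form shown earlier in this issue."""
    return (
        f"load data local infile '{path}' into table {table} "
        "fields terminated by ',' enclosed by '\"' "
        "lines terminated by '\\n' "
        f"({','.join(columns)});"
    )


csv_path = write_quoted_csv([["zhangsan", 26, "Shanxi;Xian"]])
statement = build_load_statement(csv_path)
print(statement)
```

The generated statement can then be run from a client that has local infile enabled.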

On leave

No progress

No progress

Same as above

Working on the insert pprof for 中移物联

Working on the insert pprof for 中移物联


not working on it