techouse / sqlite3-to-mysql

Transfer data from SQLite to MySQL

Home Page: https://techouse.github.io/sqlite3-to-mysql/

Conversion failure due to collation and unique indexes

andrewguertin opened this issue

Describe the bug
If a SQLite file using the (default) BINARY collation has a column with a unique index and two values that differ only in case, the conversion fails. sqlite3mysql creates databases and tables with the case-insensitive utf8mb4_general_ci collation, and under that collation the two values are no longer unique.
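The collision can be reproduced on the SQLite side alone. The following is a minimal sketch, not the converter's code: the table and index names are borrowed from the error message, the values `abc` and `ABC` are made up, and SQLite's NOCASE collation is used only as a stand-in for a case-insensitive MySQL collation (NOCASE folds ASCII letters only, so it is an approximation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Default BINARY collation: 'abc' and 'ABC' are distinct, so the
# unique index accepts both rows.
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.execute("CREATE UNIQUE INDEX mykey ON mytable (name)")
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)", [("abc",), ("ABC",)]
)
binary_rows = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]

# Case-insensitive collation (NOCASE as an approximation of a *_ci
# collation): the same two values now violate the unique index.
conn.execute("CREATE TABLE mytable_ci (name TEXT COLLATE NOCASE)")
conn.execute("CREATE UNIQUE INDEX mykey_ci ON mytable_ci (name)")
try:
    conn.executemany(
        "INSERT INTO mytable_ci (name) VALUES (?)", [("abc",), ("ABC",)]
    )
    nocase_rejected = False
except sqlite3.IntegrityError:
    nocase_rejected = True

print(binary_rows, nocase_rejected)
```

The second insert fails for the same reason the MySQL index creation fails: once case is ignored, the two values compare as equal.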

Expected behaviour
The conversion should succeed.

Actual result
MySQL failed adding indices to table mytable: 1062 (23000): Duplicate entry '' for key 'mykey'

System Information

$ sqlite3mysql --version
| software               | version                                                                        |
|------------------------|--------------------------------------------------------------------------------|
| sqlite3-to-mysql       | 1.4.5                                                                          |
|                        |                                                                                |
| Operating System       | Linux 5.14.2                                                                   |
| Python                 | CPython 3.9.6                                                                  |
| MySQL                  | mysql  Ver 15.1 Distrib 10.5.10-MariaDB, for Linux (x86_64) using readline 8.1 |
| SQLite                 | 3.35.5                                                                         |
|                        |                                                                                |
| click                  | 8.0.1                                                                          |
| mysql-connector-python | 8.0.26                                                                         |
| pytimeparse            | 1.1.8                                                                          |
| simplejson             | 3.17.3                                                                         |
| six                    | 1.16.0                                                                         |
| tabulate               | 0.8.9                                                                          |
| tqdm                   | 4.62.0                                                                         |

Additional context
Modifying sqlite3mysql to use utf8mb4_bin collation worked fine.

Documentation links:
https://www.sqlite.org/datatype3.html#collation
https://mariadb.com/kb/en/supported-character-sets-and-collations/
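Before converting, it may help to check whether a SQLite column contains values that would collide under a case-insensitive collation. The sketch below is an assumption-laden illustration rather than part of sqlite3mysql: the helper name `case_insensitive_duplicates` is hypothetical, and it relies on SQLite's NOCASE collation, which only folds ASCII letters, so it can miss collisions that utf8mb4_general_ci would still report (for example accented characters).

```python
import sqlite3


def case_insensitive_duplicates(conn, table, column):
    """Return (folded_value, count) pairs for values in `column` that
    differ only by ASCII case. Hypothetical helper, not a sqlite3mysql API.

    `table` and `column` are interpolated as identifiers, so they must
    come from a trusted source.
    """
    query = (
        f'SELECT lower("{column}") AS folded, COUNT(*) AS n '
        f'FROM "{table}" '
        f'GROUP BY "{column}" COLLATE NOCASE '
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY folded"
    )
    return conn.execute(query).fetchall()


# Small demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)", [("abc",), ("ABC",), ("xyz",)]
)
duplicates = case_insensitive_duplicates(conn, "mytable", "name")
print(duplicates)
```

An empty result suggests the column is safe to index under a case-insensitive collation, at least for ASCII data; any rows returned are candidates for the 1062 duplicate-entry error.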

That's somewhat of an edge case. MySQL is case-insensitive by default, so I'm not sure how wise it would be, in terms of compatibility, to make the database case-sensitive. It would certainly open up a can of worms 😄

I could add an option to select your MySQL collation, I guess. 🤷

What's your suggestion?

I added these two CLI options for specifying a custom charset and collation:

--mysql-charset TEXT         MySQL database and table character set
                             [default: utf8mb4]
--mysql-collation TEXT       MySQL database and table collation
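As a rough illustration of how these two options might be combined with the binary collation mentioned in the original report, here is a small sketch that only assembles the command line and does not run it. The helper name, file name, database name, and user name are placeholders; the `--sqlite-file`, `--mysql-database`, and `--mysql-user` options are assumed from the tool's documented CLI and should be checked against `sqlite3mysql --help` for the installed version.

```python
import shlex


def build_sqlite3mysql_command(
    sqlite_file,
    mysql_database,
    mysql_user,
    charset="utf8mb4",
    collation="utf8mb4_bin",
):
    """Assemble a sqlite3mysql argument list (hypothetical helper)."""
    return [
        "sqlite3mysql",
        "--sqlite-file", sqlite_file,
        "--mysql-database", mysql_database,
        "--mysql-user", mysql_user,
        # The two options added in this thread:
        "--mysql-charset", charset,
        "--mysql-collation", collation,
    ]


# Placeholder values for illustration only.
command = build_sqlite3mysql_command("example.db", "example_db", "example_user")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the connection details are real. A binary collation preserves SQLite's default case-sensitive uniqueness, at the cost of case-sensitive comparisons on the MySQL side.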