dolthub / dolt

Dolt – Git for Data

dolt workflows for manipulating the collation of a DB needed

macneale4 opened this issue

If you modify the collation of your database today, there is no way to stage the change or to run any of the workflows you would expect for data and schema changes.

First, when you modify the collation, nothing in the dolt_status table tells you that it changed, and the dolt status CLI doesn't report it either. It's not clear how this would be shown in the dolt_status table given its current schema. We may need an additional column, but I'm not certain of the right way forward there.

Second, there is no mechanism to stage the collation change once you decide it's ready to commit. dolt add is currently table-based, so we probably need an additional flag to indicate that we want to add the collation change.

Third, merging two branches which have different collations is currently not deterministic. There is no conflict workflow, and no way to tell a user that the collation has been changed on both branches. The dolt_conflicts and dolt_schema_conflicts tables are both awkward places to put this information. Maybe add it to the dolt_merge_status table with a new column? TBD.
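
To make the desired merge behavior concrete, here is a minimal sketch of a deterministic three-way merge for a single value such as a database collation. It is not Dolt's implementation; the function name merge_collation and the (value, is_conflict) return shape are hypothetical, chosen only for illustration.

```python
from typing import Optional, Tuple


def merge_collation(base: str, ours: str, theirs: str) -> Tuple[Optional[str], bool]:
    """Three-way merge of one scalar setting (hypothetical helper).

    Returns (merged_value, is_conflict). merged_value is None on conflict.
    """
    if ours == theirs:
        # Both branches agree (either unchanged, or changed to the same value).
        return ours, False
    if ours == base:
        # Only the other branch changed the collation: take their value.
        return theirs, False
    if theirs == base:
        # Only our branch changed the collation: keep our value.
        return ours, False
    # Both branches changed the collation to different values.
    return None, True
```

The last case is the one that needs a conflict workflow: both branches moved away from the common ancestor in different directions, so there is no value that can be picked automatically.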

Finally, we need a way to render this information in diffs, so that you can look through history and see what the collation value changed from and to.

See: #7812

PR #7823 makes dolt aware of database collation changes, providing a way to view, commit, and merge these changes.

It's a little awkward, but to preserve some backwards compatibility (by not changing the dolt system tables), database collation changes now appear in dolt_status as a table schema change. It'll have __DATABASE__<db-name> as a table name.
A release will come out with this change soon.
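
For anyone scripting against dolt_status, a small sketch of how the __DATABASE__<db-name> entries could be separated from ordinary table entries. It assumes rows have already been fetched as (table_name, staged, status) tuples; the helper name split_status and the sample status strings are hypothetical and not part of Dolt.

```python
from typing import Iterable, List, Tuple

# Prefix used for database collation entries, per the comment above.
DB_PREFIX = "__DATABASE__"

# Assumed row shape: (table_name, staged, status).
StatusRow = Tuple[str, bool, str]


def split_status(rows: Iterable[StatusRow]) -> Tuple[List[StatusRow], List[StatusRow]]:
    """Split status rows into (table_changes, database_collation_changes).

    For database entries the prefix is stripped, so the first field
    holds the plain database name.
    """
    tables: List[StatusRow] = []
    databases: List[StatusRow] = []
    for name, staged, status in rows:
        if name.startswith(DB_PREFIX):
            databases.append((name[len(DB_PREFIX):], staged, status))
        else:
            tables.append((name, staged, status))
    return tables, databases
```

A real table whose name begins with __DATABASE__ would be misclassified by this check, so treat it as a convenience filter rather than a guarantee.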