krasserm / akka-persistence-cassandra

A replicated Akka Persistence journal backed by Apache Cassandra

Schema migration tool

krasserm opened this issue

With the new plugin API in akka-persistence 2.4, the implementation of CassandraJournal and the underlying schema can be significantly simplified (for example, we can remove headers and markers, ...).

Although it would be possible to stay backwards compatible with the existing schema, it would make working on #48 and the maintenance of the plugin unnecessarily complex. Hence, a migration tool seems to be a better solution than backwards compatibility.

  • depends on #48
  • supersedes #54.

Happy to work on #48 and this issue if you want @krasserm

Have you seen https://github.com/comeara/pillar ?
Looks good from my point of view.

@chbatey this would be fantastic. Thanks for offering help, and looking forward to seeing PRs :-)

@matlockx thanks for the hint, will take a closer look soon.

I use pillar on my project, but it is not well maintained. I have a fork here https://github.com/smr-co-uk/pillar/tree/patch with merged pull requests, a couple of improvements, and improved documentation.

@PeterLappo thanks for sharing!

I've been getting some questions from people trying to migrate their systems from 0.3 to 0.4. As far as I can tell, the schema changes are:

  • Rename the processor_id column to persistence_id
  • Drop the marker (clustering key) column.
  • Add the static used column.

However, 0.3 creates tables with the COMPACT STORAGE option. This means, amongst other things, that the marker column can't be dropped, nor can the used column be added.
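
To illustrate, here is a minimal sketch using the DataStax Java driver, assuming the default akka keyspace and messages table names from the 0.3 configuration (adjust to your own config). It only demonstrates why the in-place ALTERs are rejected on a COMPACT STORAGE table:

```scala
import com.datastax.driver.core.Cluster

object CompactStorageCheck extends App {
  // Keyspace and table names are assumptions based on the default 0.3 config.
  val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
  val session = cluster.connect("akka")

  // Both statements fail on a table created WITH COMPACT STORAGE, which is why
  // the 0.3 -> 0.4 migration cannot be done with in-place ALTERs.
  try session.execute("ALTER TABLE messages DROP marker")
  catch { case e: Exception => println(s"DROP marker rejected: ${e.getMessage}") }

  try session.execute("ALTER TABLE messages ADD used boolean static")
  catch { case e: Exception => println(s"ADD used rejected: ${e.getMessage}") }

  session.close()
  cluster.close()
}
```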

This leads me to ask:

  1. Has anyone ever actually tried migrating a table? (A large one?)
  2. How is this tool supposed to work?

Right now it looks like there's no simple migration path for people with data, and any eventual path will involve a stop-the-world process during which a new table is created and the data copied into it.

Have I missed something?

@asnare copying data into the new schema can be done in parallel with the running old application. When the initial data migration is finished, the old application needs to be stopped and the remaining small fraction of old data migrated in a second step. This should keep the downtime to a minimum.
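
A rough sketch of that two-phase copy using the DataStax Java driver directly; the table and column names (messages, messages_new, processor_id, ...) are assumptions for illustration, not the plugin's actual statements:

```scala
import com.datastax.driver.core.Cluster
import scala.collection.JavaConverters._

object JournalCopy extends App {
  val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
  val session = cluster.connect("akka")

  val insert = session.prepare(
    "INSERT INTO messages_new (persistence_id, partition_nr, sequence_nr, message) VALUES (?, ?, ?, ?)")

  // Copies every row from the old table into the new one, mapping processor_id
  // to persistence_id. Handling of the 0.3 marker rows is omitted for brevity.
  def copyAll(): Unit =
    session.execute("SELECT processor_id, partition_nr, sequence_nr, message FROM messages")
      .iterator.asScala.foreach { row =>
        session.execute(insert.bind(
          row.getString("processor_id"),
          row.getLong("partition_nr"): java.lang.Long,
          row.getLong("sequence_nr"): java.lang.Long,
          row.getBytes("message")))
      }

  copyAll() // phase 1: bulk copy while the old application is still running
  // ... stop the old application here ...
  copyAll() // phase 2: re-run for the remaining rows; the inserts are idempotent

  session.close()
  cluster.close()
}
```

In practice phase 2 would only re-copy data written after phase 1 started rather than scanning the whole table again, but the idea is the same.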

I've removed old data so far. I plan to write a small Spark job to migrate old data in the future.
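
For example (hypothetical, assuming the spark-cassandra-connector and made-up table/column names), such a Spark job could look roughly like this:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object SparkJournalMigration extends App {
  val conf = new SparkConf()
    .setAppName("journal-migration")
    .set("spark.cassandra.connection.host", "127.0.0.1")
  val sc = new SparkContext(conf)

  // Read the old table and write it into the new schema, renaming
  // processor_id to persistence_id along the way.
  sc.cassandraTable("akka", "messages")
    .map(row => (
      row.getString("processor_id"),
      row.getLong("partition_nr"),
      row.getLong("sequence_nr"),
      row.getBytes("message")))
    .saveToCassandra("akka", "messages_new",
      SomeColumns("persistence_id", "partition_nr", "sequence_nr", "message"))

  sc.stop()
}
```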

Another schema change I would like to do at some point is to not store the PersistentRepr as a serialized blob, but to add additional columns and only store the event as a blob. I think that can be done in a backwards-compatible way, but I wanted to mention it here also.
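
To sketch the idea (the column names here are purely illustrative, not a committed design), the envelope fields would become regular columns and only the event payload would stay binary:

```scala
object ProposedSchema {
  // Illustrative only: envelope fields as columns, only the event kept as a blob.
  val createMessagesTable: String =
    """CREATE TABLE IF NOT EXISTS messages (
      |  persistence_id text,
      |  partition_nr   bigint,
      |  sequence_nr    bigint,
      |  used           boolean static,
      |  writer_uuid    text,
      |  ser_id         int,
      |  ser_manifest   text,
      |  event          blob,
      |  PRIMARY KEY ((persistence_id, partition_nr), sequence_nr)
      |)""".stripMargin
}
```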

@patriknw that would be a big improvement.