embulk / embulk-output-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC output plugins for Embulk

Key duplication occurs when embulk-output-mysql retries

hito4t opened this issue

embulk-output-mysql retries when a deadlock occurs.
However, the retry sometimes fails with a key duplication error.
If the number of records to load is large, some records are committed even though a deadlock occurred.
As a result, key duplication occurs when embulk-output-mysql retries loading all records.

This is a bug in PR #251.
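
The failure mode can be illustrated without a database. The following is a toy model under stated assumptions, not the plugin's code: a `HashSet` stands in for a table with a unique key, and the hypothetical `load` method leaves the first `failAt` records stored before throwing a simulated deadlock. Retrying the whole list then hits a duplicate key, unless nothing was stored before the failure.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RetryDuplicationDemo {
    // Stand-in for a table with a unique key (assumption: not real JDBC).
    public static final Set<Integer> table = new HashSet<>();

    /**
     * Stores keys in order. If failAt >= 0, throws once failAt keys have been
     * stored and leaves them in the table, modelling a partial commit.
     */
    public static void load(List<Integer> keys, int failAt) {
        for (int i = 0; i < keys.size(); i++) {
            if (i == failAt) {
                throw new IllegalStateException("simulated deadlock at record " + (i + 1));
            }
            if (!table.add(keys.get(i))) {
                throw new IllegalStateException("duplicate key " + keys.get(i));
            }
        }
    }

    /** Returns the error message of the failed retry, or null if the load succeeded. */
    public static String loadWithNaiveRetry(List<Integer> keys, int failAt) {
        try {
            load(keys, failAt);
            return null;
        } catch (IllegalStateException deadlock) {
            try {
                load(keys, -1); // naive retry: reload ALL records
                return null;
            } catch (IllegalStateException e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) {
        // Two records are stored before the simulated deadlock, so the retry fails.
        System.out.println(loadWithNaiveRetry(List.of(1, 2, 3, 4), 2)); // prints "duplicate key 1"
    }
}
```

With `failAt` set to 0 nothing is stored before the failure and the retry succeeds, which corresponds to the case where no records were committed.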

Values of Statement#executeBatch when it succeeds :

  • number of records to load : 4
  • length of value : 4
  • count of Statement.SUCCESS_NO_INFO : 4

Values of BatchUpdateException#getUpdateCounts when deadlock occurs :

  • number of records to load : 4
  • length of value : 3 (deadlock occurred at the 3rd record)
  • count of Statement.EXECUTE_FAILED : 3
  • count of committed records : 0

Values of BatchUpdateException#getUpdateCounts when deadlock occurs (many records) :

  • number of records to load : 250,000
  • length of value : 249,999
  • count of 0 : 174,760
  • count of Statement.EXECUTE_FAILED : 75,239
  • count of committed records : 174,760
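
A small helper can make the numbers above easier to check. This is only a sketch: the class and method names are hypothetical, and the array built in `main` assumes the non-failed entries come first, which the report does not state. In real use the array would come from `BatchUpdateException#getUpdateCounts`. Whether an entry other than `Statement.EXECUTE_FAILED` always means the record was committed depends on the driver. In the observation above, the number of 0 entries matched the number of committed records.

```java
import java.sql.Statement;
import java.util.Arrays;

public class UpdateCountSummary {
    /** Number of entries marked Statement.EXECUTE_FAILED. */
    public static long countFailed(int[] updateCounts) {
        return Arrays.stream(updateCounts)
                .filter(c -> c == Statement.EXECUTE_FAILED)
                .count();
    }

    /** Number of entries NOT marked EXECUTE_FAILED (candidates for "already committed"). */
    public static long countNotFailed(int[] updateCounts) {
        return updateCounts.length - countFailed(updateCounts);
    }

    public static void main(String[] args) {
        // Rebuild an array shaped like the large case: 249,999 entries,
        // 174,760 zeros followed by EXECUTE_FAILED (ordering is an assumption).
        int[] counts = new int[249_999];
        Arrays.fill(counts, 174_760, counts.length, Statement.EXECUTE_FAILED);
        System.out.println(countNotFailed(counts) + " not failed, "
                + countFailed(counts) + " failed");
        // prints "174760 not failed, 75239 failed"
    }
}
```

If `countNotFailed` is greater than zero after a deadlock, reloading every record is unsafe, because some of them may already be in the table.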