brendon / ranked-model

An acts_as_sortable/acts_as_list replacement built for Rails 4+

Home Page:https://github.com/mixonic/ranked-model

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Another race conditions when inserting records

fschwahn opened this issue · comments

This is a continuation of #182 - for which I supplied a bugfix a few years back. We are still seeing a race condition, but this one is much rarer.

When concurrently creating / deleting items, it can happen that rearrange_ranks raises an error because the reason for the rearranging has been deleted in the meantime. This can be reproduced pretty consistently with the following script:

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"
  gem "activerecord", "~> 7.1"
  gem "ranked-model"
  gem "pg"
  gem 'byebug'
end

require "active_record"
require "minitest/autorun"
require "logger"

# This connection will do for database-independent bug reports.
ActiveRecord::Base.establish_connection(adapter: "postgresql", database: "ranked_model_issue", url: "postgres://postgres:@localhost:5434")
ActiveRecord::Base.logger = Logger.new(STDOUT)

ActiveRecord::Schema.define do
  create_table :ducks, force: true do |t|
    t.string :name
    t.integer :row_order
    t.timestamps
  end
end

class Duck < ActiveRecord::Base
  include RankedModel

  ranks :row_order
end

class BugTest < Minitest::Test
  def test_no_duplicate_row_values
    10.times do
      threads = 20.times.map do
        Thread.new do
          ActiveRecord::Base.connection_pool.with_connection do
            duck = Duck.create!
            duck.destroy
          end
        end
      end
      threads.each(&:join)
    end
  end
end

I solved this in my app by making the following change to rearrange_ranks:

def rearrange_ranks
  reset_cache  
  return if current_first.nil? || current_last.nil?

  # ...

Meaning if there's no current_first or current_last we can just skip rearranging, as there's nothing to re-arrange anymore.

Hi @fschwahn, good to hear from you again :)

I think I kind of understand, but can you walk me through it in a bit more detail?

Are you saying that duck = Duck.create! kicks off callbacks that don't complete before duck.destroy is called?

current_first.nil? || current_last.nil? would imply an empty list wouldn't it? If so, couldn't we just call finder.exists?.

Also, just a quick note, I've released a new ordering gem: https://github.com/brendon/positioning If you have some time, I'd love it if you could give it a whirl if you think it'd solve a need for you. My aim is for it to be resistant to race conditions but it's a hard one to test. Currently I'm just making assumptions that because all the code runs in a transaction (before_save etc...) there shouldn't be any issues. Probably a bad assumption :D

Are you saying that duck = Duck.create! kicks off callbacks that don't complete before duck.destroy is called?

Yes, I guess that's what's happening. When rank_taken? is called the record still exists in the database (otherwise rearrange_ranks wouldn't be called), but when current_first or current_last is called it is already gone. As I said, this doesn't happen often, but happened 3 times in the last month in our app, which is why I took another look.

current_first.nil? || current_last.nil? would imply an empty list wouldn't it? If so, couldn't we just call finder.exists?.

I think that would still be susceptible to the race condition, as the record could be destroyed between finder.exists? and the calls to current_first or current_last. If we check current_first & current_last explicitly those values are cached. That means that rearrange_ranks works on outdated values, but if the list is empty anyway that's not a problem I believe.

Also, just a quick note, I've released a new ordering gem: https://github.com/brendon/positioning If you have some time, I'd love it if you could give it a whirl if you think it'd solve a need for you

thanks for the heads up, I'll take a look

Thanks @fschwahn, I've had an interesting journey over the last few days. I adapted your test for my positioning library and found that it had issues with overlapping threads too. I ended up implementing an advisory lock strategy to ensure sequential ordering operations and I'm wondering if that might be the solution here too.

Check out:
https://github.com/brendon/positioning/blob/main/lib/positioning/advisory_lock.rb

And where the class is used here:

https://github.com/brendon/positioning/blob/d7d67e6ef996a778b0697d27c9ac88ff90cf307c/lib/positioning.rb#L46

The biggest drama with an advisory lock was with SQLite because it doesn't support that. Instead I used a file lock strategy that I saw in the Advisory Lock gem. It seems to work well, but I also had to hack SQLite for my tests to use 'IMMEDIATE' transactions, otherwise the test gets too hectic and depletes the connection pool.

Would you be open to something like this in ranked-model?

Sure, I don't have any preference how this scenario is solved, and locking seems more correct. Just curious if this could lead to deadlocks, or will negatively impact insert speed (ie. if several items are inserted simultaneously which are not part of the same list).

It would probably be slower but only because things aren't done asynchronously but that caused the issue to begin with. Advisory locks are done with a key so you can have a key that mentions the namespace, database, table, and positioning column which will prevent locks from affecting other tables (and) positioning columns if you use more than one on your table. I'll put up a branch with this concept for you to test out on your end if you like?

Sure, I'll give it a go 👍

Hi @fschwahn, this took a bit of time but try out this branch: https://github.com/brendon/ranked-model/tree/advisory-lock

It wraps creates, updates, and deletes in an advisory lock which gets released on commit or rollback. That should ensure that the deletion of a record can't proceed while a create or update is occurring. The lock is keyed to the table and column used to keep the position integer.

How did you find the branch @fschwahn?

I hadn't had the chance to test it in production yet.

All good. Keep me posted :)

Keen to progress this @fschwahn :) Are you able to give it a test?