brendon / ranked-model

An acts_as_sortable/acts_as_list replacement built for Rails 4+

Home Page: https://github.com/mixonic/ranked-model

Race condition when inserting records

fschwahn opened this issue · comments

When inserting several records simultaneously the following error is sometimes raised:

NoMethodError: undefined method `rank' for nil:NilClass
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model/ranker.rb line 194 in rearrange_ranks
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model/ranker.rb line 185 in assure_unique_position
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model/ranker.rb line 61 in handle_ranking
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model.rb line 33 in block in handle_ranking
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model.rb line 32 in each
/app/vendor/bundle/ruby/3.0.0/gems/ranked-model-0.4.7/lib/ranked-model.rb line 32 in handle_ranking 

I was able to somewhat reliably reproduce this error with the following test:

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"
  gem "activerecord", "~> 6.1"
  gem "ranked-model"
  gem "pg"
end

require "active_record"
require "minitest/autorun"
require "logger"

ActiveRecord::Base.establish_connection(adapter: "postgresql", database: "ranked_model_issue", url: "postgres://postgres:@localhost:5434")
ActiveRecord::Base.logger = Logger.new(STDOUT)

ActiveRecord::Schema.define do
  create_table :ducks, force: true do |t|
    t.string :name
    t.integer :row_order
    t.timestamps
  end
end

class Duck < ActiveRecord::Base
  include RankedModel

  ranks :row_order
end

class BugTest < Minitest::Test
  def test_error_during_rebalance
    threads = 5.times.map do
      Thread.new do
        ActiveRecord::Base.connection_pool.with_connection do
          Duck.create!
        end
      end
    end

    threads.each(&:join)

    # Minitest expects the expected value first, then the actual value
    assert_equal 5, Duck.count
    # Secondary issue
    # assert_equal 5, Duck.distinct.pluck(:row_order).size
  end
end
  • This does not work with SQLite, as SQLite raises an error because the database is locked.
  • The test does not always fail, as is the nature of race conditions. It might take a few tries.
  • This test also exhibits a secondary issue: the same row_order value can be assigned several times, even when the test does not outright fail.

Hi @fschwahn, that error rings a bell. I thought we'd fixed it or at least guarded against it. I assume you don't have a default value on that column, as that would be prevented on boot.

Would you be interested in looking at a solution to this one? I suspect it'll involve locking of some kind :)

I've looked a bit into it, and I found the problem:

def current_last
  @current_last ||= begin
    if (ordered_instance = finder.
                             reverse.
                             first)
      RankedModel::Ranker::Mapper.new ranker, ordered_instance
    end
  end
end

Calling reverse on finder loads the ActiveRecord::Relation into memory (i.e. finder.loaded? returns true). Every subsequent call to finder.first no longer performs a DB lookup, but takes the result from the loaded relation in memory. In concurrent settings this loaded relation might be outdated (in my example it returns nil instead of a record).

Loading the entire relation into memory never seems like a good idea here, as only one record is of interest. An immediate fix would be to use last, so that the entire relation is not loaded:

if (ordered_instance = finder.last)

I know too little about the code to tell if this has other side-effects, but it seems like a safe change.
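
To make the stale-snapshot behaviour described above concrete, here is a minimal pure-Ruby sketch. FakeRelation is a hypothetical toy, not ActiveRecord: it only mimics the assumed behaviour that Array-style calls such as reverse load and cache all rows, while last on an unloaded relation looks up a single row.

```ruby
# Toy stand-in for a lazily loaded relation. NOT ActiveRecord itself;
# the class and its behaviour are assumptions made for illustration.
class FakeRelation
  def initialize(table)
    @table = table   # shared array playing the role of the database table
    @records = nil   # snapshot, filled on first full load
  end

  def loaded?
    !@records.nil?
  end

  # A full load takes a snapshot and reuses it from then on.
  def records
    @records ||= @table.sort
  end

  # Array-style method: forces a full load.
  def reverse
    records.reverse
  end

  # Uses the snapshot when loaded, otherwise asks the "database" for one row.
  def last
    loaded? ? records.last : @table.max
  end
end

table = []                          # the table starts out empty
finder_a = FakeRelation.new(table)
finder_b = FakeRelation.new(table)

finder_a.reverse.first              # nil, and the empty snapshot is now cached
finder_b.last                       # nil, but nothing is cached

table << 10 << 20                   # another "connection" inserts two rows

stale = finder_a.reverse.first      # served from the old, empty snapshot
fresh = finder_b.last               # looked up again against the table
```

In this toy, stale stays nil while fresh sees the newly inserted row, which mirrors the nil that reaches rearrange_ranks in the backtrace.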

That's a good idea in any case. However, in case current_first already returned something, it is memoized in an ivar, and might also be outdated. So it might make sense to call reset_cache at the top of rearrange_ranks. However, it is much less obvious to me if this has other side-effects.
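
The memoization concern can be sketched in a few lines of plain Ruby. ToyRanker below is hypothetical and only borrows the names current_first and reset_cache from the discussion; it is not the gem's implementation.

```ruby
# Hypothetical toy class illustrating a memoized lookup going stale.
class ToyRanker
  def initialize(table)
    @table = table   # shared array standing in for the ranked rows
  end

  # Memoized in an ivar: computed once, then reused.
  def current_first
    @current_first ||= @table.min
  end

  # Dropping the ivar forces the next call to look again.
  def reset_cache
    @current_first = nil
  end
end

table = [5, 9]
ranker = ToyRanker.new(table)

before = ranker.current_first   # 5, now memoized
table << 1                      # a concurrent insert adds a lower rank
cached = ranker.current_first   # still 5: the memoized value is outdated

ranker.reset_cache
after = ranker.current_first    # 1: recomputed after the reset
```

Resetting at the top of a rearranging step trades one extra lookup for a value that reflects the rows as they are at that moment. It does not close the race window between that lookup and the later write.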

That makes a lot of sense :) Even with .last I suspect there's still a chance for stale data unless we wrap all of this in a locking transaction?

The decision for .reverse was a bit bizarre:

26d35cf

But it was a simplification of what came before it, though prior to that the code was genuinely trying to reverse the sort order of the query. That code is a lot simpler these days with modern Rails :)

I'll close this for now, but let me know if you have any more thoughts on how to tighten this up.

> I suspect there's still a chance for stale data unless we wrap all of this in a locking transaction?

Yes, I think so. However I don't know much about pessimistic locking, and I'd be afraid of introducing deadlocks 😐
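
As an in-process analogy only, the read-then-write step can be serialized with a single Mutex. This is a sketch under loud assumptions: the shared array stands in for the table, and a Mutex only protects threads inside one Ruby process. Several app processes writing to one database would need a database-level lock instead, which this sketch does not show.

```ruby
# In-memory analogy for "lock, read the last rank, write the next one".
# The names ranks and lock are illustrative, not part of ranked-model.
ranks = []          # stands in for the row_order column
lock = Mutex.new    # one lock guarding the whole read-then-write step

threads = 5.times.map do
  Thread.new do
    lock.synchronize do
      last = ranks.max                  # read the current last rank
      ranks << (last ? last + 1 : 0)    # write a rank derived from it
    end
  end
end

threads.each(&:join)
```

Because the read and the write happen under the same lock, no two threads can derive their rank from the same "last" value, so every rank in this toy is distinct.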

> I'll close this for now, but let me know if you have any more thoughts on how to tighten this up.

One more idea I had was introducing an optional jitter (applied in rank_at_average) to reduce the likelihood that 2 items are given the exact same row_order value (which is what ultimately leads to this issue here). However, that might not be desirable in situations where lots of re-ordering happens and the entire available space must be used.
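
A rough sketch of the jitter idea, assuming integer ranks and a simple midpoint rule. The helper midpoint_with_jitter and its max_jitter parameter are hypothetical and not part of ranked-model; the gem's actual rank_at_average may work differently.

```ruby
# Hypothetical helper: pick a rank near the midpoint of two neighbours,
# shifted by a bounded random offset so concurrent inserts are less
# likely to land on the exact same value.
def midpoint_with_jitter(lower, upper, max_jitter:, rng: Random.new)
  mid = (lower + upper) / 2
  # Cap the offset so the result stays strictly between the neighbours.
  room = [max_jitter, mid - lower - 1, upper - mid - 1].min
  return mid if room <= 0   # no space left for jitter, fall back to the midpoint
  mid + rng.rand(-room..room)
end

rank = midpoint_with_jitter(0, 100, max_jitter: 10)   # somewhere in 40..60
tight = midpoint_with_jitter(0, 2, max_jitter: 10)    # gap too small: plain midpoint
```

This only lowers the chance of a collision. Two inserts can still draw the same offset, and the jitter consumes rank space, which is the trade-off noted above for heavily re-ordered lists.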

Fair enough :) I think the best way forward would be to look for a guaranteed solution rather than one that reduces the risk.

At least the bug is fixed now :D