soveran / ohm

Object-Hash Mapping for Redis

Home Page: http://ohm.keyvalue.org

Ohm upgrade data corruption

gingerlime opened this issue · comments

We tried to upgrade Ohm from v1.3 to v2.0 (and later v2.0.1), but ran into some strange behaviour. Unfortunately I wasn't able to reproduce it at the time, even after trying for several hours. We have since downgraded back to v1.3.

I have now come across some kind of data corruption that looks related to the upgrade / downgrade. The updated_at timestamps of some of our database objects (linked to the Ohm objects) fall at around the time the upgrade took place.

I'm not sure whether I'll have time to try to reproduce this, but I want to report the nature of the data corruption we see, in case someone else comes across it or is able to reproduce it.

We use Ohm to store statistics on our learning platform. Our code looks something like this:

require 'ohm'
require 'ohm/contrib'

class Stats::UserStats < Ohm::Model
  include Ohm::DataTypes
  include Ohm::Timestamps
  include Ohm::Scope
  include Ohm::Versioned

  # user of these stats
  attribute :user_id
  # total number of terms seen
  counter :terms_seen
  # ... more attributes

  index :user_id
  # ... 
end

What we see is objects that appear empty apart from an id, e.g.:

irb(main):039:0> stats = Stats::UserStats.find(user_id: 18259)
=> #<Ohm::Set:0x00000001e89dc8 @key="Stats::UserStats:indices:user_id:18259", @namespace="Stats::UserStats", @model=Stats::UserStats>
irb(main):040:0> stats.first
=> #<Stats::UserStats:0x00000001e85070 @attributes={}, @_memo={}, @id="292090">

There are no attributes, not even ones that we expect, like updated_at and created_at, which are part of Ohm::Timestamps.

When we search for the Ohm id:

irb(main):041:0> Stats::UserStats[292090]
=> nil

If we look in Redis, it seems the index is still there, but it points to a non-existent key:

$ redis-cli
127.0.0.1:6379> smembers Stats::UserStats:indices:user_id:18259
1) "366316"
127.0.0.1:6379> hgetall Stats::UserStats:366316
(empty list or set)

We're not entirely sure how it happens. Possibly we deleted those UserStats from within a Sidekiq job and somehow they weren't fully deleted? But this is what we see right now.

If we find time, we might attempt to reproduce it, but unfortunately we're very limited on time. I thought it was worth reporting anyway.
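The dangling index entry described above can also be detected from Ruby. The following is only a minimal sketch and not part of Ohm: `orphaned_ids` is a hypothetical helper, and `redis_call` is assumed to be any object responding to `call` that forwards raw commands to Redis. With Ohm 2.x one could presumably pass `->(*args) { Stats::UserStats.redis.call(*args) }`, assuming the model's `redis` is a Redic client; Ohm 1.3 uses a different client, so the wrapper would need adjusting.

```ruby
# Hypothetical helper (not part of Ohm): returns the ids stored in an index
# set for which no model hash exists any more.
#
# redis_call - any callable that forwards raw Redis commands (assumption)
# model_name - key prefix of the model, e.g. "Stats::UserStats"
# index_key  - full key of the index set to check
def orphaned_ids(redis_call, model_name, index_key)
  ids = redis_call.call("SMEMBERS", index_key)

  # EXISTS returns 1 when the key is present and 0 otherwise.
  ids.reject do |id|
    redis_call.call("EXISTS", "#{model_name}:#{id}") == 1
  end
end
```

Run against the data shown above, such a check would be expected to report the id held in `Stats::UserStats:indices:user_id:18259`, since the corresponding hash is empty.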

I think this is something that can't be addressed directly, as the database structure changes from one version to the other. I will close the issue for now, but I'm glad you reported it because at some point, when we have a way to reproduce it, we can reopen it and try to fix it or alert other users about how to avoid this scenario.

Just to get this straight - are you saying there's no way to upgrade from Ohm v1.3 to v2.0 since the data structures change between those versions?

It is possible to upgrade, but as there are some incompatible changes, you have to study carefully how the data will be migrated (hence the note in the README). I say there are ways to upgrade because I've done it myself with some apps, but in any case, Ohm 2.0 is not a drop-in replacement for Ohm 1.3.

I did see the note and checked the changelog, we have pretty good test coverage, and everything was passing. We still bumped into a couple of unpleasant issues (which I reported and which weren't solved as far as I can tell, but things like this one are also hard to reproduce). We dropped back down to 1.3, and currently we're not sure whether we will upgrade or move away from Ohm.

Ohm is very neat, but there are too many odd things going on that we don't seem able to deal with, unfortunately.

Do you want to share a sample database with me? That way I can try to figure out what's happening. Ohm is a tiny library, there aren't odd things happening, because in essence if you look at the Lua scripts, that's mostly it. I understand that you are very frustrated, but believe me that I try to do my best, and in this case, without having your database, there's not a lot I can do.

Yes, I understand. And I wish we could more easily reproduce it - it will be a big step towards solving this.

Our Redis is currently about 2.5 GB and includes other stuff (like sessions etc.). Any idea how to export a sample for you with only the Ohm data, or even just a subset of it?

(btw, at least one problem we reported #171 - albeit small and not so important - was reproducible...)
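On the export question, one possible approach (a sketch only, not something proposed in the replies here) is to collect just the keys under the model's prefix with SCAN and then copy those keys elsewhere, for example with DUMP and RESTORE. `keys_with_prefix` is a hypothetical helper name, and `redis_call` is assumed to be any callable that forwards raw commands to Redis. SCAN requires Redis 2.8 or later.

```ruby
# Hypothetical helper: collects every key starting with `prefix` by walking
# the keyspace with SCAN, which avoids blocking the server the way KEYS can.
#
# redis_call - any callable that forwards raw Redis commands (assumption)
# prefix     - key prefix to match, e.g. "Stats::UserStats:"
# count      - hint for how many keys Redis examines per SCAN step
def keys_with_prefix(redis_call, prefix, count: 1000)
  cursor = "0"
  keys = []

  loop do
    # SCAN replies with the next cursor and one batch of matching keys.
    cursor, batch = redis_call.call("SCAN", cursor, "MATCH", "#{prefix}*", "COUNT", count)
    keys.concat(batch)
    break if cursor.to_s == "0" # a cursor of 0 means the iteration is complete
  end

  # SCAN may return the same key more than once, so remove duplicates.
  keys.uniq
end
```

The resulting list would cover the model hashes as well as the index sets, since both live under the same prefix in the keys shown earlier.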

I think I have the same or a similar issue, and I am able to reproduce it.

According to Cart:1:items I should have 4 items, while in reality 2 have been deleted.

Output from the console

irb(main):001:0> Cart[1].items.count
=> 4
irb(main):002:0> Cart[1].items.to_a
=> [#<CartItem:0x007fe681da33f0 @attributes={:token=>"53", :type=>"product", :title=>"Incredible Rubber Hat", :link=>"incredible-rubber-hat", :amount=>"102", :tax=>"21"}, @_memo={}, @id="1">, #<CartItem:0x007fe681da2d88 @attributes={}, @_memo={}, @id="2">, #<CartItem:0x007fe681da2ba8 @attributes={}, @_memo={}, @id="3">, #<CartItem:0x007fe681da26f8 @attributes={:token=>"0", :type=>"shipping", :title=>"Verzendkosten", :amount=>"499", :tax=>"21"}, @_memo={}, @id="4">]

I know very little about Redis, but if you tell me how to send you my database, I'd be happy to help.

Ruby 2.2.2
Rails 4.2.1
Ohm 2.2.1
I also run Sidekiq & Foreman with web:2,worker:2

require 'ohm'

class Cart < Ohm::Model
  set :items, :CartItem
end

class CartItem < Ohm::Model
  reference :cart, :Cart
  attribute :title
  index :title
end

cart = Cart.create

cart.items.add CartItem.create title: 'test1'
cart.items.add CartItem.create title: 'test2'

cart_item = cart.items.find(title: 'test1').first

# cart_item.delete
cart.items.delete(cart_item) # this deletes the index and the model

puts "Total items: #{cart.items.size}"

cart.items.each do |item|
  puts item.inspect
end

I'm new to Ruby, and to Ohm for that matter, but I got it working as I expected.
If you delete the item on the set instead of the model itself then it's removed from the index and the model is deleted as well.

@zenry If you have a reference in place, you can use a collection on Cart to refer to all the CartItem instances:

require 'ohm'

class Cart < Ohm::Model
  collection :items, :CartItem
end

class CartItem < Ohm::Model
  reference :cart, :Cart
  attribute :title
  index :title
end

That way you don't need to update a set by hand.

@soveran thanks, will change it