basho / riak_core

Distributed systems infrastructure used by Riak.

When to clear tainted ring state or what's the meaning of tainted ring [JIRA: RIAK-1864]

jsvisa opened this issue

Hi all. In riak_core_ring_manager.erl, every time the ring is refreshed or transferred, riak_core_ring:set_tainted/1 is called, but I don't see anywhere that clears the tainted flag. So my question is: what does riak_core_ring:set_tainted/1 actually do? What does a tainted ring mean? You may say a tainted ring is dangerous, and that riak_core can be stopped when app_helper:get_env(riak_core, exit_when_tainted, false) returns true. But after the cluster is set up and all nodes have started, the ring is always tainted, so the flag seems useless?

@jsvisa +1, also curious. The changes were introduced by this P/R by @jtuple in 2011.

I never really understood the distinction between riak_core_ring_manager:get_raw_ring and riak_core_ring_manager:get_my_ring.

Examining the data structure returned by riak_core_ring_manager:get_raw_ring() - the only significant difference seems to be the following field: [[riak_core_ring_tainted|{meta_entry,t....
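For anyone who wants to repeat the comparison, here is a rough sketch for an attached console (e.g. via riak attach). It assumes both accessors return {ok, Ring} and that riak_core_ring:get_meta/2 is callable from the shell; the variable names Raw and Mine are just illustrative, and I have not pinned this to a specific riak_core release.

```erlang
%% Run on a live node; sketch only, results depend on the node's state.
{ok, Raw}  = riak_core_ring_manager:get_raw_ring(),
{ok, Mine} = riak_core_ring_manager:get_my_ring(),

%% Look up the tainted marker in each ring's metadata.
%% get_meta/2 returns {ok, Value} when the key is set, undefined otherwise.
RawTaint  = riak_core_ring:get_meta(riak_core_ring_tainted, Raw),
MineTaint = riak_core_ring:get_meta(riak_core_ring_tainted, Mine),
{RawTaint, MineTaint}.
```

Comparing the two results shows which of the two rings carries the riak_core_ring_tainted entry on that node.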

And the tainted ring check you mention is not enabled by default:

check_tainted(Ring=?CHSTATE{}, Msg) ->
    Exit = app_helper:get_env(riak_core, exit_when_tainted, false),
    case {get_meta(riak_core_ring_tainted, Ring), Exit} of
        {{ok, true}, true} ->
            riak_core:stop(Msg),
            ok;
        {{ok, true}, false} ->
            lager:error(Msg),
            ok;
        _ ->
            ok
    end.
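
For reference, a minimal sketch of turning the check on. The key name comes from the app_helper:get_env call above; the surrounding file layout (an app.config or advanced.config style term list) is an assumption about your install.

```erlang
%% Configuration fragment for the riak_core application environment.
%% Sketch only; not verified against a specific release.
[
 {riak_core, [
   %% Read by check_tainted/2; treated as false when unset.
   {exit_when_tainted, true}
 ]}
].
```

With that set, the first clause in check_tainted/2 applies, so a tainted ring triggers riak_core:stop(Msg) instead of only logging an error.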

The ring state is held per node in ETS, which is cheap to update but comparatively slow to query. After a certain period of cluster stability (60 seconds, I think), it is stored as a module definition via mochiglobal for faster lookups.

I guess the taint checks might have something to do with this interplay?