Consistency break when full sync includes an entry's deletion and re-creation
akkornel opened this issue · comments
Pondering the code, I've discovered a way to break consistency between our data store and the client's. It will happen if a full sync is needed and the changes being caught up on include an entry being deleted, then re-created with the same DN.
I haven't seen this happen yet, but I know it will.
Here's the (theoretical) sequence of what might happen:
1. At some point in the past, you were in sync. But right now you're not connected.
2. On the LDAP server, an entry (which you have in your local cache) is deleted.
3. On the LDAP server, a new entry is created, and the entry has the same DN as the entry from Step 2.
4. You connect, but your cookie is so old that the LDAP server begins a full sync.
5. The `Syncrepl` instance gets a call to `syncrepl_entry` where `uuid` is new to us, but `dn` is one we've seen before.
6. Eventually the `Syncrepl` instance gets a call to `syncrepl_present` with no UUIDs, and with `refreshDeletes` set to `False`. That tells us "All the UUIDs you have, which we haven't already said are present, are gone," and it triggers us to delete the old UUID. That triggers a call to `syncrepl_delete`.
That's how things look to the `Syncrepl` class. This is actually OK, because our data stores (the UUID-to-DN map and the UUID-to-attributes map) are both keyed on UUID, so there is no conflict between these two entries.
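To make the "no conflict" point concrete, here is a minimal toy model of two UUID-keyed maps going through Steps 5 and 6. It is only a sketch: the function names, the UUID strings, and the DN are made up for illustration, and it does not reproduce the real `Syncrepl` class or its storage.

```python
# Toy model only: two UUID-keyed dicts standing in for the data stores.
# All names and values here are hypothetical, not the library's own.
DN = "cn=alice,dc=example,dc=org"

# State left over from Step 1: the old entry is in the local cache.
uuid_to_dn = {"uuid-old": DN}
uuid_to_attrs = {"uuid-old": {"mail": ["old@example.org"]}}
seen_this_refresh = set()  # UUIDs the server has sent during this full sync


def handle_entry(uuid, dn, attrs):
    """Roughly what a syncrepl_entry call does to the stores (Step 5)."""
    uuid_to_dn[uuid] = dn
    uuid_to_attrs[uuid] = attrs
    seen_this_refresh.add(uuid)


def handle_present_all_others_gone():
    """Roughly what syncrepl_present with no UUIDs and
    refreshDeletes=False implies (Step 6): drop every UUID not seen."""
    stale = [u for u in uuid_to_dn if u not in seen_this_refresh]
    for u in stale:
        del uuid_to_dn[u]
        del uuid_to_attrs[u]
    return stale


# Step 5: a new UUID arrives with a DN we already know.
handle_entry("uuid-new", DN, {"mail": ["new@example.org"]})
coexisting = sorted(uuid_to_dn)  # both UUIDs live side by side for a while

# Step 6: everything not marked present is removed.
removed = handle_present_all_others_gone()
```

In this model the old and new entries coexist under different keys, and only the old UUID is removed, so the UUID-keyed stores end up correct.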
However, the callbacks are not keyed on UUID; they are keyed on DN! So, here's the sequence of callbacks which would be triggered in the above scenario:
1. A `bind_complete` callback, triggered by Step 4.
2. A `record_add` callback, triggered by Step 5. This will be confusing to the client, as they've already added the DN previously (as per Step 1).
3. A `record_delete` callback, triggered by Step 6. This will completely break consistency, as our client deletes what is probably still a valid entry.
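The effect on a client can be sketched with a toy DN-keyed store receiving that callback sequence. This is an illustration under assumptions: the callback signatures are simplified stand-ins (the real callbacks may take different arguments), and the DN and attribute values are invented.

```python
# Toy client that, like the callbacks, is keyed on DN.
# Signatures are simplified and hypothetical, for illustration only.
DN = "cn=alice,dc=example,dc=org"

# Client state from Step 1: it already holds the old entry under this DN.
client_store = {DN: {"mail": ["old@example.org"]}}
callback_log = []


def record_add(dn, attrs):
    callback_log.append(("record_add", dn))
    client_store[dn] = attrs  # silently overwrites the entry it already had


def record_delete(dn):
    callback_log.append(("record_delete", dn))
    client_store.pop(dn, None)  # removes whatever is stored under this DN


# What the LDAP server actually holds after the delete and re-create.
server_view = {DN: {"mail": ["new@example.org"]}}

record_add(DN, {"mail": ["new@example.org"]})  # triggered by Step 5
record_delete(DN)                              # triggered by Step 6

client_is_missing_entry = DN in server_view and DN not in client_store
```

Because the delete arrives after the add and both name the same DN, the client ends up without an entry that still exists on the server.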
At this point, our data store and the client's data store will be out of sync. The only ways they would come back into sync automatically are:
- A future reconnect causes another full re-sync.
- The entry is deleted and re-added on the LDAP server.
- The entry is modified on the LDAP server.
In any case, it will still be bad!