jonhoo / flurry

A port of Java's ConcurrentHashMap to Rust

Expose non-replacing `insert`

jonhoo opened this issue

By calling put with no_replacement = true, we can provide a version of insert that does not update the value if the key is already present. I can imagine that being a relatively useful method to provide. I'm not sure what to call it though? std::collections::HashMap does not have an equivalent, so we're on our own here.

commented

noreplace comes to mind but that's too cheesy

Perhaps insertifnull

commented

A lot of the APIs which deal with concurrency use try_ for methods that depend on state shared between multiple threads. For example, a lot of the locks have a try_lock() method which returns false if the lock is already held by another thread. Such methods can also be found in places where an operation might not succeed, such as try_from for fallible conversions.

As I assume we have a similar use case for the proposed method (i.e. inserting if no value for the given key is currently stored by a different thread) and it is definitely a case of an operation that might not happen depending on the state of the map, I suggest following this naming scheme and calling the method try_insert.

Edit: I also just think the name would be very intuitive ^^
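To make the try_ convention concrete, here is a minimal sketch using only the Rust standard library. The helper names lock_is_free and to_byte are made up for illustration. Note that std's Mutex::try_lock reports failure through a Result rather than a bool, but the idea is the same: the operation may not happen, depending on current state.

```rust
use std::convert::TryFrom;
use std::sync::Mutex;

/// Hypothetical helper: true if the lock could be taken without blocking.
/// `try_lock` returns `Err` when the mutex is already held.
fn lock_is_free(m: &Mutex<i32>) -> bool {
    m.try_lock().is_ok()
}

/// Hypothetical helper: a fallible conversion in the `try_from` style.
/// Values outside 0..=255 do not fit in a `u8`, so the conversion fails.
fn to_byte(n: i32) -> Option<u8> {
    u8::try_from(n).ok()
}
```

In both cases the caller learns from the return value whether the operation took effect, which is the property a try_insert would share.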

Oh, wow, yes, try_insert is a great name!

commented

try_insert feels weird compared to the other APIs. For example, if we call try_lock() and it fails, the lock wasn't acquired. But if we call try_insert() and it fails, the data is there anyway.

It feels like the wrong name given what I usually expect from try: something that can fail. If the data is already there, is that a failure?

commented

The proposal only mentions checking if an entry already exists for the given key. The map may still contain a value different from what you try to insert, so in this case you "failed" to store your provided value in the map. So while there might be data in the map after such a failed try_insert, it is not generally the data you wanted it to contain. That's how I think of this at least.

We can also discuss whether we should check the already stored value against the given value and what to return in case they match.

commented

Hmmm, you are right. It makes a lot of sense to use try_insert, considering that something else can be there.

Checking the value may have a high cost, but could be necessary to decide what to return.

I don't think we should check for equality of the value.

One alternative is to rename the current insert to upsert, which is a short-form often used for "update or insert", and then call this method insert. But that's probably more confusing, since we then would not match the std API.

commented

I agree that checking for the value is unnecessary. I do think however that this method should return not only information about whether the insert succeeded, but also the previous value in case of failure (which is returned by put anyway). That way, the caller can check himself whether the value blocking his insert matches his value or not if he requires this.

So we could simply propagate the return of put like insert does (in which case the caller knows he was successful if he reads None as previous value), or we could use something like a Result to indicate success. While the Option contains all relevant information, I personally feel it is confusing to obtain None on success and Some on failure, so I'd lean towards the second alternative.

I agree, returning a Result with the old value in Err seems like a good solution.
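A minimal sketch of the Result-returning shape discussed above. It uses std::collections::HashMap as a stand-in, so there is no guard argument and no concurrency; the free function try_insert here is hypothetical and is not flurry's API.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical helper over std's HashMap (an assumption, not flurry's method).
/// Inserts `value` only if `key` is absent. On failure, the error carries a
/// reference to the value that is already stored and blocked the insert.
fn try_insert<'a, K, V>(map: &'a mut HashMap<K, V>, key: K, value: V) -> Result<(), &'a V>
where
    K: Hash + Eq,
{
    match map.entry(key) {
        Entry::Vacant(slot) => {
            slot.insert(value);
            Ok(())
        }
        Entry::Occupied(slot) => {
            // Leave the stored value untouched and hand it back to the caller.
            let existing: &'a V = slot.into_mut();
            Err(existing)
        }
    }
}
```

With this shape, the caller can compare the blocking value against their own if they need to, without the map itself checking values for equality.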

> std::collections::HashMap does not have an equivalent, so we're on our own here.

I think you would use map.entry(key).or_insert(value).
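For reference, a short sketch of the std Entry approach mentioned above. The wrapper name insert_if_absent and the concrete key and value types are chosen only for illustration.

```rust
use std::collections::HashMap;

/// Illustrative wrapper around std's `entry(..).or_insert(..)`.
/// Stores `value` only when `key` is absent, then returns whatever value is
/// stored under `key` afterwards.
fn insert_if_absent(map: &mut HashMap<&'static str, i32>, key: &'static str, value: i32) -> i32 {
    *map.entry(key).or_insert(value)
}
```

One thing to keep in mind: or_insert returns a reference to the stored value in both cases, so by itself it does not tell the caller whether the insert actually happened.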

Sorry, yes, I meant as a free-standing method. The Entry API would be great to support, but we have other issues there sadly.

Then what about insert_if_absent as proposed in #12 (comment)?
The implementation could just be map.compute_if_absent(key, || value, guard).

I think the implementation is just

self.put(key, value, true, guard)

We don't currently have compute_if_absent (only compute_if_present), but yes, that would also work.

Between insert_if_absent and try_insert, I'm not sure which one I prefer to be honest. I'd be fine with either I think. I'm not a huge fan of the *_if_absent + *_if_present naming, but it does match ConcurrentHashMap, so 🤷‍♂️

@jonhoo I'm working on this right now. What exactly does HashMap::put return?

I think naming the new method try_insert and returning a Result<V, V> or Result<(), V> would be a better approach. Anyway, I'll go with whatever you guys decide.

I believe Result<&T, &T> is the ideal return type for this method. That way map.try_insert(key, val, guard) would return Ok(&val) if map doesn't have an entry for key, and Err(&old_value) otherwise.

However, val is consumed by HashMap::put before the function returns, so returning a reference to it would require some extra work. So maybe we should just return a Result<(), &T> (to match std's try_whatever type signature)?

I don't think we want to make HashMap::put return Result, since Err(old_value) would imply that the put did not happen, when that is not the case for put. I do think it makes sense for Result to be the return type of try_insert (if that's what we call it), since there Err() really does mean that the operation didn't do anything. It may be worth adding a new (internal-only) enum to provide a more fine-grained return type for HashMap::put. That way we could also have a NotReplaced(T) variant to use when no_replacement = true, which gives back the T.
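A rough sketch of what such an internal enum could look like. Everything here is an assumption for illustration: the name PutResult, its variants other than NotReplaced, and the put and try_insert_via_put functions, which operate on std's HashMap rather than on flurry's lock-free table.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical fine-grained outcome of a `put`-style operation.
#[derive(Debug, PartialEq)]
enum PutResult<T> {
    /// The key was absent and the value was stored.
    Inserted,
    /// The key was present and its value was replaced; carries the old value.
    Replaced(T),
    /// The key was present and `no_replacement` was set; gives the new value back.
    NotReplaced(T),
}

/// Stand-in for an internal `put` (assumption: std HashMap, no guard).
fn put<K: Hash + Eq, T>(map: &mut HashMap<K, T>, key: K, value: T, no_replacement: bool) -> PutResult<T> {
    match map.entry(key) {
        Entry::Vacant(slot) => {
            slot.insert(value);
            PutResult::Inserted
        }
        Entry::Occupied(_) if no_replacement => PutResult::NotReplaced(value),
        Entry::Occupied(mut slot) => PutResult::Replaced(slot.insert(value)),
    }
}

/// Public-facing wrapper: `Err` means the map was left unchanged.
fn try_insert_via_put<K: Hash + Eq, T>(map: &mut HashMap<K, T>, key: K, value: T) -> Result<(), T> {
    match put(map, key, value, true) {
        PutResult::Inserted => Ok(()),
        PutResult::NotReplaced(rejected) => Err(rejected),
        PutResult::Replaced(_) => unreachable!("put never replaces when no_replacement is true"),
    }
}
```

The point of the split is that the internal function can describe all three outcomes, while the public wrapper only exposes a Result whose Err really means "nothing happened".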

I'm not entirely sure what the return semantics of HashMap::try_insert should be. My instinct is that the Ok type should be (), and that the Err type should be T, the value that was not inserted. Returning &T with the current value (which caused the try_insert to fail) also makes sense to me though. Perhaps what we really want is:

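struct TryInsertError<'a, T> {
    // The value already stored under the key, which blocked the insert.
    current: &'a T,
    // The value that was not inserted, handed back to the caller.
    new: T,
}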

The names here are inspired by crossbeam_epoch::CompareAndSetError.
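As a sketch of how such an error type might be used, the following builds it on top of std::collections::HashMap. The function try_insert_detailed is hypothetical, and the Ok(&T) return follows the suggestion made earlier in the thread; a concurrent map would additionally need a guard to tie the references to.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Error for a failed non-replacing insert (field names as proposed above).
#[derive(Debug, PartialEq)]
struct TryInsertError<'a, T> {
    /// The value already stored under the key.
    current: &'a T,
    /// The value that was not inserted.
    new: T,
}

/// Hypothetical helper over std's HashMap (an assumption, not flurry's API).
/// Returns a reference to the newly stored value, or both the blocking value
/// and the rejected value if the key was already present.
fn try_insert_detailed<'a, K, T>(
    map: &'a mut HashMap<K, T>,
    key: K,
    value: T,
) -> Result<&'a T, TryInsertError<'a, T>>
where
    K: Hash + Eq,
{
    match map.entry(key) {
        Entry::Vacant(slot) => {
            let inserted: &'a T = slot.insert(value);
            Ok(inserted)
        }
        Entry::Occupied(slot) => {
            let current: &'a T = slot.into_mut();
            Err(TryInsertError { current, new: value })
        }
    }
}
```

Handing back both values means the caller neither loses the value they tried to insert nor needs a second lookup to see what blocked it.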

commented

This seems to be covered now with #74. Closing the issue.