Expose non-replacing `insert`
jonhoo opened this issue · comments
By calling `put` with `no_replacement = true`, we can provide a version of `insert` that does not update the value if the key is already present. I can imagine that being a relatively useful method to provide. I'm not sure what to call it though? `std::collections::HashMap` does not have an equivalent, so we're on our own here.
`noreplace` comes to mind, but that's too cheesy.
Perhaps `insertifnull`?
A lot of the APIs which deal with concurrency use `try_` for methods that depend on state shared between multiple threads. For example, a lot of the locks have a `try_lock()` method which returns false if the lock is already held by another thread. Such methods can also be found in places where an operation might not succeed, such as `try_from` for fallible conversions.
As I assume we have a similar use case for the proposed method (i.e. inserting if no value for the given key is currently stored by a different thread) and it is definitely a case of an operation that might not happen depending on the state of the map, I suggest following this naming scheme and calling the method `try_insert`.
Edit: I also just think the name would be very intuitive ^^
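For reference, a minimal sketch of the `try_` convention as it appears in Rust's standard library. Note that `std::sync::Mutex::try_lock` reports a held lock as an `Err` rather than as `false`; the helper names `can_lock` and `to_byte` are made up for this illustration.

```rust
use std::convert::TryFrom;
use std::sync::Mutex;

/// Hypothetical helper: true if the lock could be taken without blocking.
/// `try_lock` returns `Err(TryLockError::WouldBlock)` when the lock is held.
fn can_lock(m: &Mutex<i32>) -> bool {
    m.try_lock().is_ok()
}

/// Hypothetical helper: a fallible conversion via `TryFrom`.
/// Fails (returns `None`) when the value does not fit into a `u8`.
fn to_byte(n: i32) -> Option<u8> {
    u8::try_from(n).ok()
}
```

In both cases the caller learns from the return value whether the operation took effect, which is the property a `try_insert` would share.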
Oh, wow, yes, `try_insert` is a great name!
`try_insert` feels weird compared to the other APIs. For example, if we do a `try_lock()` and it fails, the lock wasn't acquired. Now do a `try_insert()`: the insert failed, but the data is there anyway.
It feels like the wrong name based on what I usually expect from `try`: something that can fail. If the data is already there, is it a failure?
The proposal only mentions checking if an entry already exists for the given key. The map may still contain a value different from what you try to insert, so in this case you "failed" to store your provided value in the map. So while there might be data in the map after such a failed `try_insert`, it is not generally the data you wanted it to contain. That's how I think of this at least.
We can also discuss whether we should check the already stored value against the given value and what to return in case they match.
Hmmm, you are right, it makes a lot of sense to use `try_insert`, considering that something else can be there.
Checking the value may have a high cost, but could be necessary to decide what to return.
I don't think we should check for equality of the value.
One alternative is to rename the current `insert` to `upsert`, which is a short form often used for "update or insert", and then call this method `insert`. But that's probably more confusing, since we then would not match the `std` API.
I agree that checking for the value is unnecessary. I do think however that this method should return not only information about whether the insert succeeded, but also the previous value in case of failure (which is returned by `put` anyway). That way, the caller can check himself whether the value blocking his insert matches his value or not if he requires this.
So we could simply propagate the return of `put` like `insert` does (in which case the caller knows he was successful if he reads `None` as previous value), or we could use something like a `Result` to indicate success. While the `Option` contains all relevant information, I personally feel it is confusing to obtain `None` on success and `Some` on failure, so I'd lean towards the second alternative.
I agree, returning a `Result` with the old value in `Err` seems like a good solution.
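To make the `Result`-based return concrete, here is a rough sketch over a plain, single-threaded `std::collections::HashMap`. It is only a stand-in: the real method would go through `put` and take a guard, and the free function `try_insert` below is an assumption made for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Illustrative stand-in: insert only if `key` is absent.
/// Ok(()) means the value was stored; Err(&old) hands back a reference
/// to the value that is already in the map and blocked the insert.
fn try_insert<K: Eq + Hash, V>(map: &mut HashMap<K, V>, key: K, value: V) -> Result<(), &V> {
    match map.entry(key) {
        Entry::Occupied(e) => {
            // Downgrade the mutable borrow to a shared reference.
            let current: &V = e.into_mut();
            Err(current)
        }
        Entry::Vacant(e) => {
            e.insert(value);
            Ok(())
        }
    }
}
```

With this shape the caller never has to interpret `None` as success: `Ok` means the insert happened, and `Err` carries the value that was there instead.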
> `std::collections::HashMap` does not have an equivalent, so we're on our own here.

I think you would use `map.entry(key).or_insert(value)`.
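A small sketch of that `std` idiom, showing that `or_insert` leaves an existing value untouched. The function name `first_write_wins` is made up for the example.

```rust
use std::collections::HashMap;

/// Illustrative only: the second `or_insert` is a no-op because the key exists.
fn first_write_wins() -> HashMap<&'static str, i32> {
    let mut map = HashMap::new();
    map.entry("a").or_insert(1); // key absent: stores 1
    map.entry("a").or_insert(2); // key present: value stays 1
    map
}
```

Note that `or_insert` returns a mutable reference to whichever value ends up in the map, so on its own it does not tell the caller whether the insert happened.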
Sorry, yes, I meant as a free-standing method. The `Entry` API would be great to support, but we have other issues there sadly.
Then what about `insert_if_absent` as proposed in #12 (comment)?
The implementation could just be `map.compute_if_absent(key, || value, guard)`.
I think the implementation is just
self.put(key, value, true, guard)
We don't currently have `compute_if_absent` (only `compute_if_present`), but yes, that would also work.
Between `insert_if_absent` and `try_insert`, I'm not sure which one I prefer to be honest. I'd be fine with either I think. I'm not a huge fan of the `*_if_absent` + `*_if_present` naming, but it does match `ConcurrentHashMap`, so 🤷♂️
@jonhoo I'm working on this right now. What exactly does `HashMap::put` return?
I think naming the new method `try_insert` and returning a `Result<V, V>` or `Result<(), V>` would be a better approach. Anyway, I'll go with whatever you guys decide.
I believe `Result<&T, &T>` is the ideal return type for this method. That way `map.try_insert(key, val, guard)` would return `Ok(&val)` if `map` doesn't have an entry for `key`, and `Err(&old_value)` otherwise.
However, `val` is consumed by `HashMap::put` before the function returns, so returning a reference to it would require some extra work. So maybe we should just return a `Result<(), &T>` (to match `std`'s `try_whatever` type signature)?
I don't think we want to make `HashMap::put` return `Result`, since `Err(old_value)` would imply that the put did not happen, when that is not the case for `put`. I do think it makes sense for `Result` to be the return type of `try_insert` (if that's what we call it), since there `Err()` really does mean that the operation didn't do anything. It may be worth adding a new (internal-only) enum to provide a more fine-grained return type for `HashMap::put`; that way we could also have a `NotReplaced(T)` variant to use when `no_replacement = true`, which gives back the `T`.
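One possible shape for such an internal enum, sketched over a plain `std::collections::HashMap` rather than the concurrent map. Only `NotReplaced(T)` and the `no_replacement` flag come from the discussion; the enum name `PutResult`, the other two variants, and this toy `put` are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical internal return type for `put`.
#[derive(Debug, PartialEq)]
enum PutResult<T> {
    /// The key was absent and the value was stored.
    Inserted,
    /// The key was present; the old value was swapped out and is returned.
    Replaced(T),
    /// The key was present and `no_replacement` was set; the new value is given back.
    NotReplaced(T),
}

/// Toy, single-threaded stand-in for `put` (no guard, no atomics).
fn put<K: Eq + Hash, T>(map: &mut HashMap<K, T>, key: K, value: T, no_replacement: bool) -> PutResult<T> {
    if no_replacement && map.contains_key(&key) {
        return PutResult::NotReplaced(value);
    }
    match map.insert(key, value) {
        Some(old) => PutResult::Replaced(old),
        None => PutResult::Inserted,
    }
}
```

A public `insert` could then map `Replaced(old)` to `Some(old)`, while `try_insert` could map `NotReplaced(v)` to an `Err`.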
I'm not entirely sure what the return semantics of `HashMap::try_insert` should be. My instinct is that the `Ok` type should be `()`, and that the `Err` type should be `T`: the value that was not inserted. Returning `&T` with the current value (which caused the `try_insert` to fail) also makes sense to me though. Perhaps what we really want is
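struct TryInsertError<'a, T> {
    // the value already in the map, which caused the insert to fail
    current: &'a T,
    // the value that was not inserted, handed back to the caller
    new: T,
}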
The names here are inspired by `crossbeam_epoch::CompareAndSetError`.
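A rough sketch of how an error type along these lines could be returned, again using a single-threaded `std::collections::HashMap` as a stand-in for the concurrent map. The lifetime parameter is added so the reference field compiles; the free function `try_insert` and the derives are assumptions for the example.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Error carrying both the blocking value and the rejected one.
#[derive(Debug, PartialEq)]
struct TryInsertError<'a, T> {
    /// Value currently stored under the key.
    current: &'a T,
    /// Value the caller tried to insert, returned unchanged.
    new: T,
}

/// Illustrative stand-in: insert only if `key` is absent.
fn try_insert<'a, K: Eq + Hash, T>(
    map: &'a mut HashMap<K, T>,
    key: K,
    value: T,
) -> Result<(), TryInsertError<'a, T>> {
    match map.entry(key) {
        Entry::Occupied(e) => {
            let current: &'a T = e.into_mut();
            Err(TryInsertError { current: current, new: value })
        }
        Entry::Vacant(e) => {
            e.insert(value);
            Ok(())
        }
    }
}
```

The caller gets ownership of the rejected value back, so nothing is dropped silently, and can compare it against `current` if equality matters.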