vorner / contrie

Concurrent hash trie

get_or_insert_with_element should take &K instead of K

schungx opened this issue · comments

There is no reason for get_or_insert_with_element to take ownership of the key, since the whole key/value pair is inserted in one go as an Element<K, V>.

Changing the method to pub fn get_or_insert_with_element<F>(&self, key: &K, create: F) can enable some interesting new possibilities.

For example, if I create a ConMap<Vec<u8>, String>, I can normally pass in &[u8] as the lookup key to get(). However, currently I cannot do this:

let key = b"hello";
let x = map.get_or_insert_with_element(key, || Arc::new(Element::new(key.to_vec(), "world".to_string())));

In other words: first look up the element via a slice reference, without allocating a key structure. If not found, then create the key structure for storage.

Currently, I must do:

let key = b"hello";
if map.get(key).is_none() {
    let x = map.get_or_insert(key.to_vec(), "world".to_string());
}

Of course, the programmer is then responsible for not doing anything stupid, like letting the lookup key and the key stored in Element<K,V> differ...

You're right, this could be improved.

Do you want to send a pull request?

> Do you want to send a pull request?

I tried, but your code takes ownership of the key (which is of type C::Key) all the way down into the core. As such, it looks like major surgery to simply change key: K to key: &Q with Q: Eq + Hash, C::Key: Borrow<Q>.

I got stopped at TraverseState: if I change the Future to take key: &Q, it requires adding a lifetime, and I am not sure which lifetime to put in. And everything keeps propagating deeper and deeper. Which is understandable, because if the key is not really owned but a reference, it probably gets hairy once you put it into a future...

And I'm quite sure I'd eventually need a key constructor function somewhere, because it doesn't look like I can rely on the Payload creation function running in order to create the key. That would be prohibitive if the payload creation itself is expensive and we just want to check whether the key is in the trie... if it is, we want to skip the payload creation, hence the catch-22.

So, what that means is that TraverseState will need to keep around not only the payload creation function, but also a key creation function.

Right now, it means the crate assumes the key is not costly to create or clone, like primitive types. It does rule out using any type that is expensive to create as the key.

OK, then I'll have a look myself to see whether there's some conceptual reason to mandate an owned key, or whether I can bend it without redesigning everything.

Probably not. After thinking about it some more, it doesn't even need to carry around a separate key construction function.

You search for a key in the trie by first casting the trie's stored key into a simpler representation via Borrow::borrow, as long as that simpler representation can be compared.

You can do the same thing by having TraverseState carry the simpler representation (a reference to a slice typically), which is used to search the trie and identify whether the key exists.

If not, then a new node is created by the carried creation function, which should create both the Payload and the proper key structure in a single go.

I'm playing with it, and certainly I could change the function prototype to fn get_or_insert_with<Q, F>(key: &Q, constructor: F) where K: Borrow<Q>, F: FnOnce(&Q) -> Element. That would allow calling it with a key reference.

On the other hand, the current state allows embedding the key into the generated Element. In the new way, a caller who already has an owned key could still pass only a reference and would have to clone it in the constructor function. So I'm not so sure the proposed way is better, and I don't think I want to have both variants around.

Do you have some argument (numbers, statistics, use cases…) to support one or the other?

I think we can look at this in the following scenarios:

1: Q -> K is cheap (e.g. copying a Copy primitive type)
2: Q -> K is expensive
3: Q -> K is forbidden, meaning K -> Q is one-way

This assumes K->Q always succeeds and is cheap (because you can always just pass a reference, I guess).


1: The new and old ways are equivalent, with the old way slightly simpler.
2: The new way can avoid an expensive operation; the old way requires a two-step operation (first checking whether the key exists).

3 is the tricky (if uncommon) situation:

K must exist anyhow, so the key construction cannot be avoided, and simply moving K into the closure would work. However, that also means Q must be detached from K, otherwise K cannot be moved into the closure. Alternatively, Q could refer to a cloned copy of K, but then K needs to be Clone.

The old way, of course, handles this situation nicely.


So, in conclusion, the old way works better across all cases. However, the new way can avoid the synchronization problems of a two-stage operation when the user wants to avoid unnecessary allocations (because they are expensive), which may be more common than you'd think whenever the key is not a simple primitive type -- for example, when the key is String and you want to look up with &str.

Therefore, my suggestion is to have two separate functions.

I don't know if I follow you completely. I think the bigger problem is not whether there's K or Q, but whether only a reference is passed or an owned value. Even if the function were instantiated with K = Q, the problem is turning the reference into an owned value to insert into the map.

Furthermore, with the lock-free data structure, the chance the value will get created and not used is always there due to races between the threads. It's just hidden in the current function.

Anyway, I really don't like adding yet another function. There are already 4 get_or_insert functions, which looks like too many. Furthermore, I wrote the lock-free core as an exercise, but I've never had a chance to put the code to production use so I don't know if it's too slow or too memory hungry or something. I'd like to have some form of actual use of the crate before it is made more complex.

So, is the request because you think it would be nice to have or because you have some actual use case that needs it? Is the cloning in some way a performance bottleneck for you?

Well, my use case is a global string cache that maps raw byte streams to the relevant Strings, so I can avoid allocating a huge number of temporary strings when decoding.

Therefore, my lookups are done using &[u8] from the input data stream, but the keys can't be such references, so they are copied into Vec<u8>. The map I use is IndexMap<Vec<u8>, String>.

Right now, lookup is a two-step process: (1) check whether the slice is in the map, which can be done with &[u8]; (2) if not, add it by first cloning into a Vec<u8> and then calling one of the get_or_insert functions.

So this is a pretty valid use case, where the key to look up originally comes in some other format (e.g. a byte stream) and is expensive to create, but trivial to compare and hash. In such cases, which I'd assume are quite normal in low-level data processing, it is best to create the key together with the payload in one go, only when the element is not in the map.

Nevertheless, I can work with two steps. I don't think it will cause much harm, apart from some unnecessary key creations when lookups race.

Hmm. I see. I still have some doubts this is the right data structure for the purpose. You might get much better memory footprint and likely better speed if you do something like two-level hashing & granular locking. Have something like 2*N buckets, where N is the number of threads, hash the input value and pick a bucket. Each bucket then contains a mutex-protected ordinary HashMap. Furthermore, you can probably put some kind of bump allocator or arena inside each, so you allocate the strings in a cheaper way as well. This approach doesn't have the lock-free guarantees, but what you do doesn't sound like something that needs them.

If the proposed solution is slower than what you can get with the ConMap, then it would be a good argument to include such method.

> Furthermore, you can probably put some kind of bump allocator or arena inside each, so you allocate the strings in a cheaper way as well.

That's a good idea. I just haven't gone through all the different crates implementing concurrent hash maps. I picked the one (yours!) with the API that best fits my needs... Most implementations make more assumptions about API usage, and none of them is geared towards my needs -- which is a concurrent cache.

This has become silent. Are you still working on a case/benchmark to show such API would reasonably help, or have you found another (better) solution in the meantime? If the latter, is it OK to close this?

Well, I ended up going back to DashMap instead. It is OK to close this.

I can think of a simple use case for this API, but I now see that it would involve deep changes to the current code.

The case is this: assume a scenario where (key, value) pairs are stored in the map, and the key can be looked up via a reference type it borrows as (for example, looking up a String key via &str).

The goal is a single function call that, when passed such a reference, looks up the key in the map. If that fails, a new entry is added, with the key created from the reference (perhaps via a factory closure, similar to the value-creation closure used when the value is expensive to create).

This scenario occurs whenever a key is compared via some sort of surrogate (e.g. a reference) that is cheap to compare and hash but expensive to turn into the owned key, and that surrogate cannot itself be stored in the map as the key type -- references, for example.

A typical use case is ConMap<String, String> where you mostly work with &str references (perhaps pointing into a structure deserialized by serde). You look up by &str and get the matching &String, which coerces back into &str. In that case it is impossible to do a get_or_insert with &str as the key, because that is not the actual key type. The only way to get/add in a single function call is to always create the key String, then throw it away if the entry already exists.

Therefore, my conclusion is that such a new API function would be a great addition for these use cases, especially when dealing with mapping tables for serde-deserialized data structures.