[ENHANCEMENT]: Add option to separate container initialization from memory allocation

Question

[ENHANCEMENT]: Add option to separate container initialization from memory allocation

sleeepyjack opened this issue a year ago · comments

Is your feature request related to a problem? Please describe.

Our current process involves memory allocation for a container within its constructor and instant initialization through the clear() function. This sequence happens immediately and is also enclosed within the constructor.

However, there are instances where it's beneficial to postpone the initialization step to a more appropriate time in the program flow.

Describe the solution you'd like

To facilitate this, we must implement a strategy allowing deferred initialization while ensuring the container is fully initialized before its first use. This is vital for maintaining data integrity and avoiding runtime errors.

One potential strategy could be an initialize-on-first-touch approach: Any function call that could potentially access the container's memory (like insert(), find(), size(), ref(), etc.) would first check a boolean member indicating whether the container has been initialized or not.

If the container is not yet initialized (i.e., the boolean is false), the clear() function would be invoked to initialize the container, thus ensuring it's ready for use.

Describe alternatives you've considered

No response

Additional context

No response

Daniel Jünger · Answer 1 · Sat Jul 29 2023 05:51:20 GMT+0800 (China Standard Time)

cc @esoha-nvidia

Jake Hemstad · Answer 2 · Tue Aug 01 2023 00:52:07 GMT+0800 (China Standard Time)

there are instances where it's beneficial to postpone the initialization step to a more appropriate time in the program flow.

What are those instances?

This sounds a little wonky to try and provide via the owning types.

If someone really wants this behavior, I think it would be best served by having them allocate their own storage and then using a view type around that storage.

Eyal Soha · Answer 3 · Tue Aug 01 2023 02:17:57 GMT+0800 (China Standard Time)

For my case, it's because I want to malloc the memory just once, then time the rest of the process. The malloc adds a lot of time to the program and I don't want to include it in my calculations. However, I do want to measure how long it takes to zero the memory.

I support the idea of having the constructor do malloc. That makes sense and fits in with the c++ paradigm of RAII. But having the constructor also zero the memory is not great, similar to how we often use arrays instead of std::vector in CUDA code because we don't want to waste time initializing memory.

Searching the web, there are a lot of people who would like std::vector to not initialize data: https://www.google.com/search?q=std+vector+without+initialization&oq=std+vector+without+initialization

It's so popular and idea that you have to wonder if std::vector made a mistake in forcing initialization. With std::vector, though, there is a solution where you can prevent initialization by wrapping your data member in a class with a constructor that does nothing. It's awkward but it works! I think that cuco does not even allow for the awkward solution.

At the very least, maybe add the clear function already?! esoha-nvidia@966da5c std::unordered_map has it. https://en.cppreference.com/w/cpp/container/unordered_map/clear

Yunsong Wang · Answer 4 · Tue Aug 01 2023 02:32:34 GMT+0800 (China Standard Time)

Just FYI, clear has been added in #333.

Eyal Soha · Answer 5 · Tue Aug 01 2023 02:58:24 GMT+0800 (China Standard Time)

@PointKernel Yeah but only in the experimental code, right? I've got enough on my plate without switching over to be a guinea pig. 😆 Perhaps port that to the non-experimental static_map?

Are there any other members in this list of functions that std::unordered_map offers that could be added to cuco in 5 minutes or fewer? clear took me like 5 minutes. If there are others, maybe add those, too?

Yunsong Wang · Answer 6 · Tue Aug 01 2023 04:50:08 GMT+0800 (China Standard Time)

Perhaps port that to the non-experimental static_map?

We will deprecate the "non-experimental" static_map in one month or two and experimental::static_map will be official. That said, I'm inclined to introduce new features and meet users' requirements in the new map instead of updating the old deprecating one.

If there are others, maybe add those, too?

Fully understand. We have a long list of work to wrap up the experimental containers. Matching STL behavior is definitely a nice thing to have but it's not the top priority for now.