Allow SharedMemoryDict to be initialized like a regular dictionary
cassiobotaro opened this issue · comments
There is an PR suggesting the addition of *args and **kwargs in our object initialization, but I would like to discuss some points and maybe a possible postponement of this implementation to release a major version, since it can change the method signature.
Some ideas and considerations:
- Adding args and kwargs along with dictionary initialization will throw an error about creating a dictionary, wouldn't it be better for a specific non-dictionary exception?
- Today "name" is a positional or named parameter and it is mandatory. But when initializing shared memory it is not.
- Maybe we should remove serializer from
__init__
and use it from serializer module, likelogging
module. - ShareableList uses the size from the
iterator
param. Why not do the same, but optionally allow to define a size? - We should consider putting name and size as mandatory named parameters. This will imply that both cannot be used as a key. Removing the serializer from
__init__
also helps here, considering it will be one less name.
I would like to listen to your opinions @spaceone, @renanivo, @jeanmask, @itheodoro, @thyagomininelli.
First of all: I proposed the change. For me this is an enhancement, which I currently don't necessarily need. So I have no problem with postponing it and waiting for a nice solution.
I also changed my opinion about having a manager. It would probably be nice if this library looks similar to what mulitprocessing looks like - so that one could easily change between the implementations.
One general point which should be considered is:
What to do when initializing a dictionary (with values) when the dictionary already exists?
* Adding args and kwargs along with dictionary initialization will throw an error about creating a dictionary, wouldn't it be better for a specific non-dictionary exception?
I don't understand this point. What exceptions will be thrown in which case?
* Today "name" is a positional or named parameter and it is mandatory. But when initializing shared memory it is not.
When name
is left out, a name is automatically generated. Specifying a/Knowing the name is mandatory If you initialize the memory in subprocesses after forking.
* Maybe we should remove serializer from `__init__` and use it from serializer module, like `logging` module.
How would it work then to use different serializers for different instances? This is a absolutely must for me to support nested dictionary structures.
* ShareableList uses the size from the `iterator` param. Why not do the same, but optionally allow to define a size?
This is something different - iiuc ShareableList cannot change it's size afterwards anymore - Therefor it's not really a list but a tuple. The size is also not the size of the shared memory but the count of the items.
* We should consider putting name and size as mandatory named parameters. This will imply that both cannot be used as a key. Removing the serializer from `__init__` also helps here, considering it will be one less name.
Mabye something like the following can be implemented:
class Manager(object):
def __init__(self, name=None, size, serializer=Pickle) -> None:
…
def dict(self, *args, **kwargs) -> SharedDictionary:
return SharedDictionary(*args, **kwargs)
# or
return SharedDictionary.create(self.name, self.size, self.serializer)(*args, **kwargs)
And last: When implementing the nested serializers I had the idea: why not changing this whole project into more than dictionaries - so that e.g. lists are also supported in the same serializing way. If you plan to do a major version update this would be very nice and probably not that hard.