luizalabs / shared-memory-dict

A very simple shared memory dict implementation


Allow SharedMemoryDict to be initialized like a regular dictionary

cassiobotaro opened this issue · comments

There is a PR suggesting the addition of *args and **kwargs to our object initialization, but I would like to discuss some points, and possibly postpone this implementation until a major release, since it can change the method signature.

Some ideas and considerations:

  • Adding args and kwargs along with dictionary initialization will throw an error about creating a dictionary. Wouldn't a specific, non-dictionary exception be better?
  • Today "name" is a positional or named parameter and it is mandatory. But when initializing shared memory it is not.
  • Maybe we should remove serializer from __init__ and use it from the serializer module, like the logging module.
  • ShareableList uses the size from the iterator param. Why not do the same, but optionally allow to define a size?
  • We should consider putting name and size as mandatory named parameters. This will imply that both cannot be used as a key. Removing the serializer from __init__ also helps here, considering it will be one less name.
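The first and last points can be sketched roughly as follows. `DictLikeInit` and `InitialDataError` are hypothetical names, not part of the library; the class stores its data in a plain `dict` rather than shared memory and only illustrates how keyword-only `name` and `size` could coexist with dict-style initialization, with a dedicated exception for bad initial data.

```python
class InitialDataError(Exception):
    """Hypothetical error for initial data that dict() cannot accept."""


class DictLikeInit:
    """Hypothetical stand-in; it does not touch shared memory."""

    def __init__(self, *args, name, size, **kwargs):
        # `name` and `size` are keyword-only and mandatory here.
        self.name = name
        self.size = size
        try:
            # Delegate to dict() so initial data behaves like a regular dict.
            self.data = dict(*args, **kwargs)
        except (TypeError, ValueError) as exc:
            # Wrap dict()'s own error in a library-specific exception.
            raise InitialDataError(f"invalid initial data: {exc}") from exc


d = DictLikeInit({"a": 1}, b=2, name="demo", size=1024)
```

With this signature, `name=` and `size=` are always consumed as parameters, so those two keys could only be supplied through a mapping argument such as `{"name": "x"}`.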

I would like to hear your opinions @spaceone, @renanivo, @jeanmask, @itheodoro, @thyagomininelli.

First of all: I proposed the change. For me this is an enhancement, which I currently don't necessarily need. So I have no problem with postponing it and waiting for a nice solution.

I also changed my opinion about having a manager. It would probably be nice if this library looked similar to multiprocessing, so that one could easily switch between the implementations.

One general point which should be considered is:
What to do when initializing a dictionary (with values) when the dictionary already exists?
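One way to detect that case, assuming the dictionary is backed by the standard library's `multiprocessing.shared_memory.SharedMemory`: creating a block under a name that is already taken raises `FileExistsError`. The helper `open_or_create` below is hypothetical, and what to do with the initial values afterwards (ignore, merge, or reject) is deliberately left to the caller.

```python
from multiprocessing.shared_memory import SharedMemory


def open_or_create(name, size):
    """Return (block, created). Hypothetical helper, not part of the library."""
    try:
        return SharedMemory(name=name, create=True, size=size), True
    except FileExistsError:
        # The block already exists: attach to it and let the caller decide
        # whether initial values should be ignored, merged, or rejected.
        return SharedMemory(name=name), False
```

The `created` flag is the piece of information an `__init__` would need in order to apply whichever policy is chosen.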

* Adding args and kwargs along with dictionary initialization will throw an error about creating a dictionary. Wouldn't a specific, non-dictionary exception be better?

I don't understand this point. What exceptions will be thrown in which case?

* Today "name" is a positional or named parameter and it is mandatory. But when initializing shared memory it is not.

When name is left out, a name is automatically generated. Specifying (or at least knowing) the name is mandatory if you initialize the memory in subprocesses after forking.
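For reference, this is how the standard library's `SharedMemory` behaves, which is presumably the behaviour meant here: omitting `name` on creation yields a generated one, and any other handle has to be given that name to attach. The sketch attaches from the same process only to stay short; in practice the second handle would live in another process.

```python
from multiprocessing.shared_memory import SharedMemory

# No name given: the standard library generates one.
owner = SharedMemory(create=True, size=16)
generated_name = owner.name

# A second handle (normally in another process) must be given that name.
peer = SharedMemory(name=generated_name)
owner.buf[0] = 42
value_seen_by_peer = peer.buf[0]

peer.close()
owner.close()
owner.unlink()  # free the block once every handle is closed
```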

* Maybe we should remove serializer from `__init__` and use it from the serializer module, like the `logging` module.

How would it then be possible to use different serializers for different instances? This is an absolute must for me in order to support nested dictionary structures.
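To make the concern concrete, here is a minimal sketch of per-instance serializers. `SerializedStore` and `JSONSerializer` are hypothetical stand-ins (the store keeps one serialized blob in ordinary memory, not shared memory); the only assumption is that a serializer exposes `dumps` and `loads`, as the `pickle` module does.

```python
import json
import pickle


class JSONSerializer:
    """Hypothetical serializer exposing dumps/loads, like the pickle module."""

    @staticmethod
    def dumps(obj) -> bytes:
        return json.dumps(obj).encode("utf-8")

    @staticmethod
    def loads(data: bytes):
        return json.loads(data.decode("utf-8"))


class SerializedStore:
    """Hypothetical stand-in that keeps one serialized blob in memory."""

    def __init__(self, serializer=pickle):
        # The serializer is chosen per instance rather than module-wide.
        self._serializer = serializer
        self._blob = serializer.dumps({})

    def __setitem__(self, key, value):
        data = self._serializer.loads(self._blob)
        data[key] = value
        self._blob = self._serializer.dumps(data)

    def __getitem__(self, key):
        return self._serializer.loads(self._blob)[key]


pickled = SerializedStore()
as_json = SerializedStore(serializer=JSONSerializer)
```

With a single module-level serializer, `pickled` and `as_json` could not coexist in one process with different formats.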

* ShareableList uses the size from the `iterator` param. Why not do the same, but optionally allow to define a size?

This is something different. IIUC, ShareableList cannot change its size afterwards, so it's not really a list but rather a tuple. Also, the size there is not the size of the shared memory but the number of items.
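A short check against the standard library's `ShareableList` illustrates both remarks: the length is the item count fixed at creation, there is no `append`, and the byte size of the backing block is a separate number. Note that existing slots can still be reassigned, so "fixed-length" is perhaps a closer description than "tuple".

```python
from multiprocessing.shared_memory import ShareableList

items = ShareableList([1, 2, 3])
item_count = len(items)        # number of items, fixed at creation
block_bytes = items.shm.size   # size of the backing block in bytes
can_append = hasattr(items, "append")

items[0] = 10                  # existing slots can still be reassigned
first_item = items[0]

items.shm.close()
items.shm.unlink()
```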

* We should consider putting name and size as mandatory named parameters. This will imply that both cannot be used as a key. Removing the serializer from `__init__` also helps here, considering it will be one less name.

Maybe something like the following could be implemented:

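class Manager:
    # `size` is made keyword-only so that it can follow the defaulted `name`;
    # a required parameter directly after a default is a syntax error.
    def __init__(self, name=None, *, size, serializer=Pickle) -> None:
        …

    def dict(self, *args, **kwargs) -> SharedDictionary:
        return SharedDictionary(*args, **kwargs)
        # or
        # return SharedDictionary.create(self.name, self.size, self.serializer)(*args, **kwargs)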

And last: when implementing the nested serializers I had an idea: why not extend this whole project to more than dictionaries, so that e.g. lists are also supported with the same serialization approach? If you plan to do a major version update, this would be very nice and probably not that hard.