davidlatwe / montydb

Monty, Mongo tinified. MongoDB implemented in Python!

Confusing behavior of set_storage vs. MontyClient with memory

rkingsbury opened this issue

To create an in-memory DB per the docs, I can do:

from montydb import set_storage, MontyClient
MontyClient(":memory:")

which results in nothing being written to disk. In the above, ":memory:" is a special repository name.

However, if I first call set_storage with ":memory:" as the repository name:

set_storage(":memory:")
MontyClient(":memory:")

then a directory called :memory: is created in the current directory with a monty.storage.cfg file in it, and that config file has nothing to do with the MontyClient that was created.

Similarly, if I pass "memory" as the repository name to MontyClient, a directory called memory is created and the actual underlying storage type is flatfile:

>>> mc=MontyClient("memory")
>>> mc
MontyClient(repository='memory', document_class=builtins.dict, storage_engine=MontyStorage(engine: 'FlatFileStorage'))

Finally, if I specify storage="memory" (note the absence of :), with or without a repository,

set_storage(":memory:", storage="memory")
MontyClient(":memory:")

or

set_storage(storage="memory")
MontyClient(":memory:")

then all works as expected and nothing is written to disk.

Strictly speaking, none of these are "bugs", but I find this behavior quite confusing, and it risks unintended results such as stray directories or an on-disk database where an in-memory one was expected. I suggest a few ways to address this:

  1. Recognize ":memory:" as a special repo name in set_storage and prevent it from being created as a directory.
  2. Warn the user if they pass "memory" as a repository name in MontyClient.__init__(). It is conceivable that someone might want their database to live in a folder called "memory", but it is just as plausible that a user would pass that argument expecting an in-memory database.
  3. Expose all the set_storage kwargs in MontyClient.__init__() so that one could say, for example, MontyClient(storage='sqlite', use_bson=True) rather than having to understand the nuances of set_storage and invoke it separately.
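To make suggestions 1 and 2 concrete, here is a rough sketch of the kind of check I have in mind. It is standalone Python and does not touch montydb internals; the helper name check_repository_name and the "memory" / "path" return labels are made up for illustration, and where such a check would actually live in set_storage or MontyClient.__init__() is up to you.

```python
import warnings

# The special repository name described in this issue.
MEMORY_REPO = ":memory:"


def check_repository_name(repository):
    """Hypothetical helper: classify a repository name before touching disk.

    Returns "memory" for the special in-memory name (the caller should then
    skip creating a directory), otherwise "path". Emits a warning for names
    that look like a mistyped in-memory request.
    """
    if repository == MEMORY_REPO:
        return "memory"

    # Names such as "memory" or ":memory" are valid directory names, so only
    # warn here instead of rejecting them.
    if isinstance(repository, str) and repository.strip(":").lower() == "memory":
        warnings.warn(
            "repository %r will be created as a directory on disk; "
            "use %r for an in-memory database" % (repository, MEMORY_REPO),
            UserWarning,
            stacklevel=2,
        )
    return "path"
```

With something like this, set_storage(":memory:") could return early without writing a config directory, while MontyClient("memory") would still work as it does today but would tell the user what is about to happen.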

I'm willing to work on implementing the above, but wanted to hear your feedback first to make sure I'm not misunderstanding some of the intended behavior.