redis / redis-om-python

Object mapping, and more, for Redis and Python

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Indexing boolean fields breaks something

ilyatovbin-pp opened this issue · comments

class Something(JsonModel):    
    name: str
    should_do_something: bool = Field(index=True, default=True)

Migrator().run()

Something(name="test").save()

result:

Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True)

executing a basic get by pk works fine.

Something.get('01H0HYP3C4YFF1MR6JV9MEZSF4') 

result:

Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True)

when running find().all(), the result is empty:

In [22]: Something.find().all()
Out [22]: []

If I set the bool value to Index=False, the indexing starts working fine:

In [24]: class Something(JsonModel):
    ...:     class Meta:
    ...:         database = get_redis_instance()
    ...: 
    ...:     name: str
    ...:     should_do_something: bool = Field(index=False, default=True)
    ...: 

In [25]: Migrator().run()

In [26]: Something.find().all()
Out[26]: 
[Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True),
 Something(pk='01H0HYGDBJF50GFRYK9MSPA8DQ', name='test 2', should_do_something=True),
 Something(pk='01H0HYB0MKSMV4QWJ7BM5G2DGH', name='test', should_do_something=True)]

I can confirm this as well, that the JSONModel doesn't support indexing boolean fields. Here is my testcase that shows the same thing:

"""This file demonstrates that redis_om_python cannot index/search boolean fields"""
from redis_om import Field, JsonModel, Migrator

class MyObjectBroken(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)
    my_bool: bool = Field(index=True)


class MyObjectWorks(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)

Migrator().run()

# Add some objects that are broken
object1 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object1.save()
object2 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object2.save()

# Add some objects that work
object3 = MyObjectWorks(my_string="", my_int=1)
object3.save()
object4 = MyObjectWorks(my_string="", my_int=1)
object4.save()

objects = MyObjectBroken.find().all()
print(f"Found {len(objects)} MyObjectBroken expected 2")

object_get = MyObjectBroken.get(object1.pk)
print(f"Found {object_get=} MyObjectBroken")


objects = MyObjectWorks.find().all()
print(f"Found {len(objects)} MyObjectWorks expected 2")

object_get = MyObjectWorks.get(object3.pk)
print(f"Found {object_get=} MyObjectWorks")

If you run this, you'll get the following output showing the object with a boolean field doesn't work:

$ python boolean_field_search_suppport.py
Found 0 MyObjectBroken expected 2
Found object_get=MyObjectBroken(pk='01H5K85HDADT687Y25Y7BAEWKK', my_string='', my_int=1, my_bool=False) MyObjectBroken
Found 2 MyObjectWorks expected 2
Found object_get=MyObjectWorks(pk='01H5K85HDCCAS49NADBDY39CXQ', my_string='', my_int=1) MyObjectWorks

See #193 for where it's defined that booleans aren't supported yet, but seem to be supported at some point?

From my sleuthing, since the bool class is implemented as a subclass of the integer class, we end up getting a NUMERIC index created:

"ft.create" ":main.MyObjectBroken:index" "ON" "JSON" "PREFIX" "1" ":main.MyObjectBroken:" "SCHEMA" "$.pk" "AS" "pk" "TAG" "SEPARATOR" "|" "$.my_string" "AS" "my_string" "TAG" "SEPARATOR" "|" "$.my_int" "AS" "my_int" "NUMERIC" "$.my_bool" "AS" "my_bool" "NUMERIC"

This is coming from this line:

elif any(issubclass(typ, t) for t in NUMERIC_TYPES):

From the docs of FT.CREATE there isn't native support for boolean types, so it likely has to be mapped to NUMERIC for this to work, however it's unclear what's going on under the hood here, but I'm guessing that at least a part of it is related to saving actual RedisJSON boolean fields, but telling RediSearch that it's a numeric field.

I found a workaround for this issue. Since the bool class is a subclass of the int class, you can declare the model field as int and then convert between bool/int as necessary.

Here is the code that I tested on redis-om 0.1.3:

"""This file demonstrates that redis_om_python cannot index/search boolean fields"""
from redis_om import Field, JsonModel, Migrator

class MyObjectBroken(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)
    my_bool: int = Field(index=True) # declare as int, but pass in boolean values works


class MyObjectWorks(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)

Migrator().run()

# Add some objects that are broken
object1 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object1.save()
object2 = MyObjectBroken(my_string="", my_int=1, my_bool=True)
object2.save()

# Add some objects that work
object3 = MyObjectWorks(my_string="", my_int=1)
object3.save()
object4 = MyObjectWorks(my_string="", my_int=1)
object4.save()

objects = MyObjectBroken.find().all()
print(f"Found {len(objects)} MyObjectBroken expected 2")

objects = MyObjectBroken.find(MyObjectBroken.my_bool == int(False)).all()
print(f"Found {len(objects)} MyObjectBroken with bool as False expected 1")

object_get = MyObjectBroken.get(object1.pk)
print(f"Found {object_get=} MyObjectBroken")


objects = MyObjectWorks.find().all()
print(f"Found {len(objects)} MyObjectWorks expected 2")

object_get = MyObjectWorks.get(object3.pk)
print(f"Found {object_get=} MyObjectWorks")

Here is the output from it:

$ python boolean_field_search_suppport.py
Found 2 MyObjectBroken expected 2
Found 1 MyObjectBroken with bool as False expected 1
Found object_get=MyObjectBroken(pk='01H5KBCJMMCY9Z9SY74C87B20N', my_string='', my_int=1, my_bool=0) MyObjectBroken
Found 2 MyObjectWorks expected 2
Found object_get=MyObjectWorks(pk='01H5KBCJMP4RM26QSSVB77YKTQ', my_string='', my_int=1) MyObjectWorks