Refactor node provider DB Schema

Question

louisevelayo opened this issue 9 months ago

louisevelayo commented 9 months ago

🚧 WIP: Add changes that need to be made to the DB schema below:

Create a NP principal <> slack channel name table
Create a NP principal <> telegram chat id table
Remove column telegram_channel_id

Ioannis Mourginakis · Answer 1 · Fri Oct 27 2023 06:52:36 GMT+0800 (China Standard Time)

This PR will mostly deal with refactoring our Postgres DB schema and our NodeProviderDB class.

Intro

Right now the NodeProviderDB class contains 4 tables:

subscribers (one to one)
email_lookup (one to many)
channel_lookup (one to many)
node_label_lookup (one to one)

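The problem is that this doesn't accurately reflect the nature of our data. I propose the following, which splits channel_lookup into separate Slack and Telegram tables and makes node_label_lookup one to many: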

subscribers (one to one)
email_lookup (one to many)
slack_channel_lookup (one to many)
telegram_chat_lookup (one to many)
node_label_lookup (one to many)

Refactoring The Schema

We will refactor the class itself so that the desired database schema is stored directly in class attributes. This is sort of already the case, but I want to increase modularity so that:

The class isn't directly responsible for creating the database schema, just defining it (right now the class creates the tables on init)
The database schema can be validated directly in the tests instead of at runtime.
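
To make the two goals above concrete, here is a minimal sketch of a schema that is only declared in class attributes, with the SQL built by a separate function so a test can check it without a database connection. The relationship types come from the proposal above; the class name NodeProviderSchema, the helper create_table_sql, and every column name are hypothetical, since the real columns aren't listed here.

```python
from typing import Dict

ONE_TO_ONE = "one_to_one"    # natural key: one row per node provider principal
ONE_TO_MANY = "one_to_many"  # surrogate key: many rows per principal


class NodeProviderSchema:
    """Declares the desired tables; it never connects to the database."""

    # Relationship types come from the proposal above.
    RELATIONSHIPS: Dict[str, str] = {
        "subscribers": ONE_TO_ONE,
        "email_lookup": ONE_TO_MANY,
        "slack_channel_lookup": ONE_TO_MANY,
        "telegram_chat_lookup": ONE_TO_MANY,
        "node_label_lookup": ONE_TO_MANY,
    }

    # Hypothetical columns: the real column names and types may differ.
    COLUMNS: Dict[str, Dict[str, str]] = {
        "subscribers": {"node_provider_id": "TEXT", "notify_email": "BOOLEAN"},
        "email_lookup": {"node_provider_id": "TEXT", "email_address": "TEXT"},
        "slack_channel_lookup": {"node_provider_id": "TEXT", "slack_channel_name": "TEXT"},
        "telegram_chat_lookup": {"node_provider_id": "TEXT", "telegram_chat_id": "TEXT"},
        "node_label_lookup": {"node_provider_id": "TEXT", "node_label": "TEXT"},
    }


def create_table_sql(table_name: str) -> str:
    """Build a CREATE TABLE statement from the declared schema."""
    columns = [
        f"{name} {sql_type}"
        for name, sql_type in NodeProviderSchema.COLUMNS[table_name].items()
    ]
    if NodeProviderSchema.RELATIONSHIPS[table_name] == ONE_TO_ONE:
        # One-to-one tables use the principal itself as the primary key.
        columns.append("PRIMARY KEY (node_provider_id)")
    else:
        # One-to-many tables get an auto-incrementing surrogate key.
        columns.insert(0, "id SERIAL PRIMARY KEY")
    return f"CREATE TABLE IF NOT EXISTS {table_name} ({', '.join(columns)});"
```

A test could then assert that every table in RELATIONSHIPS has an entry in COLUMNS and that the generated SQL has the expected key, and whatever sets up the database would simply execute the strings returned by create_table_sql.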

Read/Write Functions

Right now, every table has its own functions to add, read, and delete data. The only real difference between these functions is the schema they operate on, which leads to a lot of code bloat and development inertia. The benefit of denoting each table as one to many or one to one (surrogate or natural keys, respectively) is that we can write a much smaller set of functions to handle the read/write operations. For example:

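from typing import List

def read_one_to_many(table_name: str, value: str) -> List[dict]:
    """Return every row in table_name that matches value."""
    pass

def read_one_to_one(table_name: str, value: str) -> dict:
    """Return the single row in table_name that matches value."""
    pass

def write_one_to_many(table_name: str, _) -> None:
    """Placeholder: the remaining parameters are not decided yet."""
    pass

# ...
# This interface is totally incomplete and names will change!
# We need better names for these functions too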

With this interface, we can read from and write to each table with simple key/value pairs. This keeps the schema flexible in case we need to add new columns or change table names later in development. Currently, trying to access data that doesn't exist can break node monitor. If we work with dynamic attributes like this, rather than static definitions, we get the added benefit of being able to use Python's dict.get() method, which returns None for a missing key instead of raising a KeyError.
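
As a rough illustration of how such generic functions might behave, the sketch below runs against an in-memory SQLite database so that it works standalone; the real class talks to Postgres. The conn and row parameters, the node_provider_id column, the KNOWN_TABLES allow-list, and the choice to return an empty dict for a missing row are all assumptions, not part of the actual interface.

```python
import sqlite3
from typing import List

# Hypothetical allow-list: table names cannot be bound as SQL parameters,
# so they are checked against known names before being formatted in.
KNOWN_TABLES = {"subscribers", "email_lookup"}


def _check_table(table_name: str) -> None:
    if table_name not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table_name}")


def read_one_to_many(conn: sqlite3.Connection, table_name: str, value: str) -> List[dict]:
    """Return every row for the given principal as a plain dict."""
    _check_table(table_name)
    query = f"SELECT * FROM {table_name} WHERE node_provider_id = ?"
    return [dict(row) for row in conn.execute(query, (value,)).fetchall()]


def read_one_to_one(conn: sqlite3.Connection, table_name: str, value: str) -> dict:
    """Return the single matching row, or an empty dict if there is none."""
    rows = read_one_to_many(conn, table_name, value)
    return rows[0] if rows else {}


def write_one_to_many(conn: sqlite3.Connection, table_name: str, row: dict) -> None:
    """Insert one row given as column/value pairs (column names are trusted here)."""
    _check_table(table_name)
    column_names = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO {table_name} ({column_names}) VALUES ({placeholders})",
        tuple(row.values()),
    )


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted with dict(row)
conn.execute("CREATE TABLE subscribers (node_provider_id TEXT PRIMARY KEY, notify_email INTEGER)")
conn.execute("CREATE TABLE email_lookup (id INTEGER PRIMARY KEY, node_provider_id TEXT, email_address TEXT)")

write_one_to_many(conn, "email_lookup", {"node_provider_id": "np-1", "email_address": "a@example.com"})
write_one_to_many(conn, "email_lookup", {"node_provider_id": "np-1", "email_address": "b@example.com"})

emails = read_one_to_many(conn, "email_lookup", "np-1")
missing = read_one_to_one(conn, "subscribers", "np-2")

# dict.get() returns None for the absent key instead of raising KeyError.
notify = missing.get("notify_email")
```

Because a missing row comes back as an empty dict, callers can chain .get() without first checking whether the principal exists, which is the failure mode described above.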

Please note that this architecture is incomplete. I will continue to modify it as I settle on more of the design.

I expect this PR to roughly halve the size of the NodeProviderDB class while maintaining all of its current functionality.