DLR-SC / corpus-annotation-graph-builder

Corpus Annotation Graph builder (CAG) is an architectural framework that employs the build-and-annotate pattern for creating a graph.

Improve _edge_definitions

muelldlr opened this issue

Currently, _edge_definitions specifies the collections as strings. To make introspection work properly, it should also be possible to specify the collection class directly. I would implement this as a backwards-compatible change, so existing string-based definitions keep working. Example:

class CollectionA(GenericOOSNode):
    _name = "CollectionA"
    _fields = {"value": Field(), "value2": Field(), **GenericOOSNode._fields}


class CollectionB(GenericOOSNode):
    _name = "CollectionB"
    _fields = {"value": Field(), **GenericOOSNode._fields}


class CollectionC(GenericOOSNode):
    _name = "CollectionC"
    _fields = {"value": Field(), **GenericOOSNode._fields}


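class HasRelation(GenericEdge):
    _fields = GenericEdge._fields


# Defined so that the second edge definition below refers to an existing relation.
class HasAnotherRelation(GenericEdge):
    _fields = GenericEdge._fields


class SampleGraphCreator(GraphCreatorBase):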
    _name = "SampleGraphCreator"
    _description = "Sample Graph"

    _edge_definitions = [
        {
            "relation": "HasRelation",
            "from_collections": [CollectionA],
            "to_collections": [CollectionB],
        },
        {
            "relation": "HasAnotherRelation",
            "from_collections": [CollectionC],
            "to_collections": [CollectionC],
        },
    ]

    def init_graph(self):
        pass

Currently, examples often use CollectionA._name instead. Accessing a protected attribute from outside the class like this also triggers a warning in IDEs. We should stay PEP 8 compliant where possible.
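To illustrate how the backwards-compatible part could work, here is a minimal sketch that accepts both strings and classes and resolves them to collection names. The helpers collection_name and normalize_edge_definitions are hypothetical names, not part of CAG, and the GenericOOSNode below is only a stand-in so the snippet runs on its own:

```python
from typing import Union


class GenericOOSNode:
    """Stand-in for CAG's node base class, only so this sketch is self-contained."""
    _name = "GenericOOSNode"


class CollectionA(GenericOOSNode):
    _name = "CollectionA"


class CollectionB(GenericOOSNode):
    _name = "CollectionB"


def collection_name(collection: Union[str, type]) -> str:
    """Hypothetical helper: resolve a collection given as a string or a class."""
    if isinstance(collection, str):
        # Old style: the name is already a string, keep it unchanged.
        return collection
    if isinstance(collection, type):
        # New style: read the name inside the framework, not in user code.
        return getattr(collection, "_name", collection.__name__)
    raise TypeError(f"Expected str or class, got {type(collection).__name__}")


def normalize_edge_definitions(edge_definitions: list) -> list:
    """Hypothetical helper: return edge definitions with all collections as names."""
    return [
        {
            "relation": definition["relation"],
            "from_collections": [
                collection_name(c) for c in definition["from_collections"]
            ],
            "to_collections": [
                collection_name(c) for c in definition["to_collections"]
            ],
        }
        for definition in edge_definitions
    ]


# Mixed old-style (string) and new-style (class) entries are both accepted.
normalized = normalize_edge_definitions(
    [
        {
            "relation": "HasRelation",
            "from_collections": [CollectionA],
            "to_collections": ["CollectionB"],
        }
    ]
)
```

With something like this, user code never has to touch _name, while the original classes stay available in _edge_definitions for introspection.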

Furthermore, this change allows a good solution for #24, where I have to introspect _edge_definitions.