DLR-SC / corpus-annotation-graph-builder

Corpus Annotation Graph builder (CAG) is an architectural framework that employs the build-and-annotate pattern for creating a graph.

Fix OpenAlexConcept and Keyterms pipeline

johndolier opened this issue

I found several bugs and potential improvements when using the "OpenAlexConcept" and "Keyterms" pipes in a single pipeline:

  • In get_oaconcepts() (oaconcept.py), loading the .env file only works when the file sits in the same directory as oaconcept.py. A .env file placed in the directory of the running script is not picked up.
  • save_annotations(..) in openalex_concept_orchestrator.py throws an error. Most likely the output format of the endpoint called in oaconcept.py has changed, so the function needs to be adapted.
  • save_annotations(..) in keyterms_orchestrator.py does not create a "text_key" column in the final dataframe, which leads to an error when save() (pipeline_base.py) merges the resulting dataframes of the two pipes.
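For the first point, one possible direction is to look for the .env file next to the running script first and fall back to the module directory only if nothing is found there. The snippet below is a minimal, standard-library-only sketch of that lookup order, not the actual code in oaconcept.py. The helper names find_env_file and parse_env_file are hypothetical, and the parser only handles plain KEY=VALUE lines.

```python
from pathlib import Path
from typing import Optional


def find_env_file(script_dir: Path, module_dir: Path, filename: str = ".env") -> Optional[Path]:
    """Return the first existing env file, preferring the running script's directory.

    Hypothetical helper: `script_dir` is where the user's script lives,
    `module_dir` is where the library module (e.g. oaconcept.py) lives.
    """
    for directory in (script_dir, module_dir):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def parse_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments.

    Hypothetical helper: it does not handle multi-line values or variable expansion.
    """
    values = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Inside the library this could be called as find_env_file(Path.cwd(), Path(__file__).parent), assuming the script is started from its own directory. If it is not, Path(sys.argv[0]).resolve().parent may be a better choice for the first argument.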