aphp / edsnlp

Modular, fast NLP framework, compatible with PyTorch and spaCy, offering tailored support for French clinical notes.

Home Page: https://aphp.github.io/edsnlp/

Harmonize processing utils

percevalw opened this issue

Description

@aricohen93

Parallel and distributed pipelines behave differently when it comes to adding a note_id column once documents have been processed.

TODO:

  • either remove the note_id select from distributed.py
  • or add note_id after parallel
  • or something else?
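
To make the second option concrete, here is a rough sketch of a shared post-processing step that both code paths could call so that their outputs always carry note_id. This is not edsnlp code: the helper name `with_note_id`, the row format (plain dicts), and the assumption that results arrive grouped per document are all hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def with_note_id(
    note_ids: Iterable[Any],
    rows_per_doc: Iterable[List[Row]],
) -> List[Row]:
    """Flatten per-document results and tag every row with its note_id.

    Hypothetical helper, not part of edsnlp. It assumes `note_ids` and
    `rows_per_doc` are aligned (one list of rows per input document).
    """
    out: List[Row] = []
    for note_id, rows in zip(note_ids, rows_per_doc):
        for row in rows:
            # A note_id already present in the row (e.g. set by the
            # backend) takes precedence over the one passed in.
            out.append({"note_id": note_id, **row})
    return out
```

If both the parallel and the distributed paths went through a single step like this, the output schema would no longer depend on the backend, whichever of the options above is retained.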