asyml / forte

Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/

Simplify the entry adding interfaces in `DataStore`

mylibrar opened this issue

Is your feature request related to a problem? Please describe.
This is a follow-up issue from #834 (comment). Right now DataStore keeps a long list of entry adding methods:

  • add_annotation_raw
  • add_audio_annotation_raw
  • add_image_annotation_raw
  • add_link_raw
  • add_group_raw
  • ...

This is cumbersome for both users and developers. DataStore already provides a _is_subclass method, so there is no need to register a different method for each top-level entry type. We should try to merge them into add_entry_raw and simplify the interface.

Describe the solution you'd like

  • Design the interface. Now that we are merging all the entry adding methods, we should take care of the different input arguments. For example, add_link_raw accepts begin_tid and end_tid while add_group_raw takes in a member_type. The merged function should account for all these differences.
  • Implement the new function. The basic structure might be a long list of if...elif...elif...else with each branch conditioning on DataStore._is_subclass.
  • Update all the references. Since add_***_raw() is already invoked in a few places in the codebase, we need to update these references.
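
To make the proposal concrete, here is a minimal sketch of what a merged add_entry_raw could look like. It is not Forte's implementation: ToyDataStore, the toy type table, the attribute_data parameter, and the storage layout are all hypothetical, and _is_subclass is reimplemented here as a simple parent-chain walk purely for illustration.

```python
import itertools
from typing import Any, Dict, List, Optional

# Hypothetical names standing in for the top-level entry types.
ANNOTATION, LINK, GROUP = "Annotation", "Link", "Group"


class ToyDataStore:
    """Toy stand-in for illustration only; not Forte's DataStore."""

    def __init__(self) -> None:
        # Toy ontology: each type name maps to its parent type name.
        self._parents: Dict[str, Optional[str]] = {
            ANNOTATION: None, LINK: None, GROUP: None,
            "Token": ANNOTATION, "Dependency": LINK, "Coref": GROUP,
        }
        self._entries: Dict[int, List[Any]] = {}
        self._tids = itertools.count()

    def _is_subclass(self, type_name: str, base: str) -> bool:
        # Walk up the parent chain until we hit `base` or run out of parents.
        current: Optional[str] = type_name
        while current is not None:
            if current == base:
                return True
            current = self._parents.get(current)
        return False

    def add_entry_raw(
        self,
        type_name: str,
        attribute_data: List[Any],
        tid: Optional[int] = None,
        allow_duplicate: bool = True,
    ) -> int:
        # One branch per top-level entry type, as suggested in the issue.
        if self._is_subclass(type_name, ANNOTATION):
            required = ("begin", "end")
        elif self._is_subclass(type_name, LINK):
            required = ("begin_tid", "end_tid")
        elif self._is_subclass(type_name, GROUP):
            required = ("member_type",)
        else:
            raise ValueError(f"Unknown entry type: {type_name}")

        if len(attribute_data) != len(required):
            raise ValueError(f"{type_name} expects {required}")

        record = [type_name, *attribute_data]

        # In this sketch, allow_duplicate only applies to annotation-like entries.
        if not allow_duplicate and self._is_subclass(type_name, ANNOTATION):
            for existing_tid, existing in self._entries.items():
                if existing == record:
                    return existing_tid

        if tid is None:
            # Skip ids that were already taken by explicitly supplied tids.
            tid = next(self._tids)
            while tid in self._entries:
                tid = next(self._tids)
        self._entries[tid] = record
        return tid
```

With this shape, callers that used add_annotation_raw, add_link_raw, or add_group_raw would pass the type name plus a list of type-specific values, for example store.add_entry_raw("Token", [0, 5]). Whether the type-specific values are better passed as a list or as keyword arguments is a design question this sketch does not settle.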

Describe alternatives you've considered
Notes:

  • Some arguments like tid are common to all add_***_raw() methods while others (e.g., allow_duplicate) may only apply to annotation-like entries.
  • Actually, many _new_***() methods (e.g., _new_link, _new_annotation, _new_group, etc.) are also pretty similar. We might consider merging them as well.
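
If the _new_***() methods are merged as well, one possible shape is a single table-driven constructor. The sketch below is only an assumption about how that could look: the _FIELDS table, the new_entry function, and the list layout (type-specific fields first, then tid and type name) are hypothetical and do not reflect how Forte stores entries.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical field layout per top-level entry type (illustrative names only).
_FIELDS: Dict[str, Tuple[str, ...]] = {
    "Annotation": ("begin", "end"),
    "Link": ("begin_tid", "end_tid"),
    "Group": ("member_type",),
}


def new_entry(base_type: str, type_name: str, tid: int, **fields: Any) -> List[Any]:
    """Build a raw entry list for any top-level type from one code path."""
    try:
        expected = _FIELDS[base_type]
    except KeyError:
        raise ValueError(f"Unknown base type: {base_type}") from None

    # Reject both missing and unexpected type-specific arguments.
    missing = [name for name in expected if name not in fields]
    extra = [name for name in fields if name not in expected]
    if missing or extra:
        raise ValueError(f"{base_type}: missing={missing}, unexpected={extra}")

    # Type-specific fields first, then the arguments common to every entry.
    return [*(fields[name] for name in expected), tid, type_name]


print(new_entry("Annotation", "Token", 7, begin=0, end=5))  # [0, 5, 7, 'Token']
```

Keeping the per-type differences in a table means that adding a new top-level entry type only requires a new table row rather than another _new_***() method.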

Additional context

  • This is part of the data efficiency project
  • This PR should be made to the master branch.