KarelDO / xmc.dspy

In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.

Hi, I'd like to work on the following issues:

Tsomoriri opened this issue · comments

Adding the issues I want to work on and some initial doubts:

  • Issue
    - My doubt / line of thought

  • Add a HuggingFace dataloader to apply IReRa to any multi-label classification task
    - Add a dataloader such that the loader.py logic is changed to accommodate HuggingFace datasets.
  • Add a CLI argument parsing class
    - Wrap the argparse setup in a dedicated class.
  • Create a config for the Optimizer class
    - Use dataclasses to define the config.
  • Make the Optimizer class agnostic of the implementation details of the program that gets optimized, so it can define a high-level strategy which gets compiled across many programs
    - Generalise and abstract away implementation-specific methods so that the strategy logic can be implemented at a high level.
  • Track amount of student and teacher calls during optimization so system-cost can be compared.
    - Simply count the calls made to each LM (student and teacher).
  • Make src/programs/retriever.py more efficient
    - Work on the retrieval logic to make it more efficient.
  • Track intermediate pipeline steps and log these traces for debugging
    - Track each step of the IReRa pipeline and log its intermediate outputs.
  • Use logging instead of print-statements, and dump logs to experiment file
    - Define a logger in the loader and experiment scripts, and dump the logs to the experiment file.
  • Control seeds in LM calls and data loaders
    - Add a seed argument to the optimiser and the data loaders.
I hope I am thinking in the right direction. I am intrigued by the IReRa style and want to help the project get better.
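For the HuggingFace dataloader item, a rough sketch of the adapter I have in mind. The `(text, labels)` pair format and the `text`/`labels` field names are my assumptions, not the repo's actual loader.py contract:

```python
def hf_to_examples(split, text_field="text", label_field="labels"):
    """Convert an iterable of HuggingFace dataset rows into (text, labels)
    pairs. The field names are assumptions about the dataset schema."""
    return [(row[text_field], list(row[label_field])) for row in split]
```

The idea is that the output of `datasets.load_dataset(...)` would be fed through this adapter, so loader.py itself needs only minimal changes.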
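For the argument-parsing item, a minimal sketch of the class I'd write; the flag names here are examples, not the repo's actual CLI arguments:

```python
import argparse

class ArgumentHandler:
    """Hypothetical class that groups all CLI setup in one place.
    Flag names and defaults are placeholders."""
    def __init__(self):
        self.parser = argparse.ArgumentParser(
            description="Run an IReRa experiment"
        )
        self.parser.add_argument("--dataset", type=str, default="dummy")
        self.parser.add_argument("--seed", type=int, default=0)

    def parse(self, argv=None):
        # argv=None falls back to sys.argv, as argparse normally does
        return self.parser.parse_args(argv)
```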
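To make the Optimizer-config item concrete, a dataclass sketch; every field name below is an assumption about what the Optimizer might need, not its actual API:

```python
from dataclasses import dataclass, field

@dataclass
class OptimizerConfig:
    """Sketch of a dataclass-based config for the Optimizer.
    All fields are illustrative placeholders."""
    strategy: str = "bootstrap"
    num_candidates: int = 10
    seed: int = 0
    # default_factory avoids the shared-mutable-default pitfall
    extra_kwargs: dict = field(default_factory=dict)
```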
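For tracking student/teacher call counts, the simplest version I can think of is a counting wrapper around any LM callable; the wrapper idea is mine and the real LM interface in the repo may differ:

```python
from collections import Counter

class CountingLM:
    """Wrap an LM callable and count invocations, so student and teacher
    call volumes can be compared after optimization."""
    def __init__(self, lm, name, counts=None):
        self.lm = lm
        self.name = name
        # share one Counter across wrappers to aggregate system cost
        self.counts = counts if counts is not None else Counter()

    def __call__(self, *args, **kwargs):
        self.counts[self.name] += 1
        return self.lm(*args, **kwargs)
```

Student and teacher would share one `Counter`, so the total system cost is visible in a single place after a run.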
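For tracking intermediate pipeline steps, a minimal trace-recorder sketch; the stage names used below are illustrative, not the program's actual module names:

```python
class TraceRecorder:
    """Collect each pipeline step's intermediate output so the whole
    trace can be dumped for debugging. Minimal sketch."""
    def __init__(self):
        self.steps = []

    def record(self, stage, payload):
        self.steps.append({"stage": stage, "payload": payload})

    def dump(self):
        # return a copy so callers can't mutate the internal trace
        return list(self.steps)
```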
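For replacing print-statements, a sketch of the logger setup I'd use; the logger name and format are placeholders:

```python
import logging

def get_experiment_logger(logfile, name="irera"):
    """Return a logger that writes to the experiment's log file.
    Logger name and format string are placeholders."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeat calls
        handler = logging.FileHandler(logfile)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```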
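And for seed control, a small helper along these lines; seeding for torch (and whichever libraries the repo actually uses) would be added behind guarded imports, shown here for numpy only:

```python
import random

def seed_everything(seed):
    """Seed the RNGs we control so runs are reproducible.
    Sketch only: extend with torch etc. as needed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # numpy not installed; stdlib RNG is still seeded
```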