KarelDO / xmc.dspy

In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.

Hi, I'd like to work on the following issues:

Tsomoriri opened this issue · comments

Adding the issues I want to work on and some initial doubts:

  • Issue
    - My doubt / line of thought

  • Add a HuggingFace dataloader to apply IReRa to any multi-label classification task
    - Add a dataloader such that the loader.py logic is changed to accommodate HuggingFace datasets.
  • Add a CLI argument parsing class
    - Wrap the argparse setup in a dedicated class.
  • Create a config for the Optimizer class
    - Use dataclasses to define the config.
  • Make the Optimizer class agnostic of the implementation details of the program that gets optimized, so it can define a high-level strategy which gets compiled across many programs
    - Generalise and abstract away implementation-specific methods so that the strategy logic can be implemented at a high level.
  • Track amount of student and teacher calls during optimization so system-cost can be compared.
    - Simply count the calls made to each LM (student and teacher).
  • Make src/programs/retriever.py more efficient
    - Work on the retrieval logic to make it more efficient.
  • Track intermediate pipeline steps and log these traces for debugging
    - Track each step of the IReRa pipeline and log its intermediate outputs.
  • Use logging instead of print-statements, and dump logs to experiment file
    - Define a logger in the loader and experiment scripts, and dump the logs to the experiment file.
  • Control seeds in LM calls and data loaders
    - Add a seed argument to the optimiser and the data loaders.
I hope I am thinking in the right direction. I am intrigued by the IReRa style and want to help the project get better.
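For the HuggingFace dataloader item, a rough sketch of the adapter I have in mind. The `(text, labels)` pair format and the `text`/`labels` field names are my assumptions, not the repo's actual loader.py contract:

```python
def hf_to_examples(split, text_field="text", label_field="labels"):
    """Convert an iterable of HuggingFace dataset rows into (text, labels)
    pairs. The field names are assumptions about the dataset schema."""
    return [(row[text_field], list(row[label_field])) for row in split]
```

The idea is that the output of `datasets.load_dataset(...)` would be fed through this adapter, so loader.py itself needs only minimal changes.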
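For the argument-parsing item, a minimal sketch of the class I'd write; the flag names here are examples, not the repo's actual CLI arguments:

```python
import argparse

class ArgumentHandler:
    """Hypothetical class that groups all CLI setup in one place.
    Flag names and defaults are placeholders."""
    def __init__(self):
        self.parser = argparse.ArgumentParser(
            description="Run an IReRa experiment"
        )
        self.parser.add_argument("--dataset", type=str, default="dummy")
        self.parser.add_argument("--seed", type=int, default=0)

    def parse(self, argv=None):
        # argv=None falls back to sys.argv, as argparse normally does
        return self.parser.parse_args(argv)
```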
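To make the Optimizer-config item concrete, a dataclass sketch; every field name below is an assumption about what the Optimizer might need, not its actual API:

```python
from dataclasses import dataclass, field

@dataclass
class OptimizerConfig:
    """Sketch of a dataclass-based config for the Optimizer.
    All fields are illustrative placeholders."""
    strategy: str = "bootstrap"
    num_candidates: int = 10
    seed: int = 0
    # default_factory avoids the shared-mutable-default pitfall
    extra_kwargs: dict = field(default_factory=dict)
```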
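For tracking student/teacher call counts, the simplest version I can think of is a counting wrapper around any LM callable; the wrapper idea is mine and the real LM interface in the repo may differ:

```python
from collections import Counter

class CountingLM:
    """Wrap an LM callable and count invocations, so student and teacher
    call volumes can be compared after optimization."""
    def __init__(self, lm, name, counts=None):
        self.lm = lm
        self.name = name
        # share one Counter across wrappers to aggregate system cost
        self.counts = counts if counts is not None else Counter()

    def __call__(self, *args, **kwargs):
        self.counts[self.name] += 1
        return self.lm(*args, **kwargs)
```

Student and teacher would share one `Counter`, so the total system cost is visible in a single place after a run.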
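For tracking intermediate pipeline steps, a minimal trace-recorder sketch; the stage names used below are illustrative, not the program's actual module names:

```python
class TraceRecorder:
    """Collect each pipeline step's intermediate output so the whole
    trace can be dumped for debugging. Minimal sketch."""
    def __init__(self):
        self.steps = []

    def record(self, stage, payload):
        self.steps.append({"stage": stage, "payload": payload})

    def dump(self):
        # return a copy so callers can't mutate the internal trace
        return list(self.steps)
```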
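For replacing print-statements, a sketch of the logger setup I'd use; the logger name and format are placeholders:

```python
import logging

def get_experiment_logger(logfile, name="irera"):
    """Return a logger that writes to the experiment's log file.
    Logger name and format string are placeholders."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeat calls
        handler = logging.FileHandler(logfile)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```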
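And for seed control, a small helper along these lines; seeding for torch (and whichever libraries the repo actually uses) would be added behind guarded imports, shown here for numpy only:

```python
import random

def seed_everything(seed):
    """Seed the RNGs we control so runs are reproducible.
    Sketch only: extend with torch etc. as needed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # numpy not installed; stdlib RNG is still seeded
```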