Cache me if you can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models
This repository is the official implementation of the OCaTS Framework that was introduced in the paper: Cache me if you can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models.
Acknowledgements
This work is supported from Google's TPU Research Cloud (TRC) program
Table of Contents
Installation
In order to use the framework, you need to clone the repository and
install the packages in the requirements.txt
file. We recommend using a virtual environment to install the packages. You can use the following commands to create a virtual environment and install the packages:
- Conda:
conda create -n ocats python=3.11
conda activate ocats
pip install -r requirements.txt
- Virtualenv:
virtualenv ocats
source ocats/bin/activate
pip install -r requirements.txt
- Pyenv:
pyenv virtualenv 3.11 ocats
pyenv activate ocats
pip install -r requirements.txt
We recommend using conda to install the packages as it is the easiest way to install the packages. If you don't have conda installed, you can install it from here.
Usage
In order to use the framework, you need to first train the student model, if you are using a main.py
file.
Important notes:
- 🚩 This repository is specific to experiments employed for the paper and we only provide the code for the MLP and
$k$ -NN students. However, the framework can be used with any teacher and student model. Please feel free to create a fork of this repository and add your own teacher and student models. We would be happy to merge your pull request. - 🚩 We plan to make the code more user-friendly in the future. If you have any suggestions, please feel free to open an issue or create a pull request.
- 🚩 Finally, we plan to implement the full abstraction of the framework in the future. Currently, the framework is implemented in the
main.py
file and it is not abstracted. We also plan to open-source the code for the abstracted framework in the future for you to use.
Training
To train the MLP on top of MPNet embeddings, you can use the following command:
python train.py --config <config_file> --train_path <train_path> --dev_path <dev_path> --model_dir <model_dir>
Tuning
To tune the decision thresholds with the
python tune_knn.py --config <config_file> --lambdas <lambdas> --train_path <train_path> --dev_path <dev_path> --study_name <study_name>
To tune the decision thresholds with the MLP student, you can use the following command:
python tune.py --config <config_file> --lambdas <lambdas> --train_path <train_path> --dev_path <dev_path> --study_name <study_name>
Evaluation
To evaluate the framework, you can use the following command:
python main.py --config <config_file> --lambdas <lambdas> --train_path <train_path> --test_path <test_path> --model <model>
You can also use the scripts without any arguments. In this case, the scripts will use the default values which are for the main experiment in the paper.
Prompting LLMs
To prompt GPT-3.5-turbo and GPT-4 with this framework, you can use the prompt.py
script in the misc
folder. You can use the following commandz inside the misc folder:
python prompt.py --model <model> --prompt <prompt> --train_data <train_data> --test_data <test_data>
Contributing
As mentioned above, we plan to make the code more general user-friendly in the future. If you have any suggestions, please feel free to open an issue or create a pull request with your changes.
License
This repository is licensed under the MIT license. See LICENSE for details.
Citation
@article{stogiannidis_etal2023,
title={Cache me if you can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models},
author={Ilias Stogiannidis and Stavros Vassos and Prodromos Malakasiotis and Ion Androutsopoulos},
journal={Findings of 2023 Conference on Empirical Methods in Natural Language Processing},
year={2023},
publisher={Association for Computational Linguistics}
}