Made a little script to compare the speed and cost of classification via an LLM versus vector embeddings.
Currently tests the classification of 50 sentences into positive or negative using three approaches:
- ChatCompletion [gpt-3.5-turbo]
- Comparing vector embeddings (to positive and negative) [text-embedding-ada-002]
- Comparing vector embeddings (to positive and negative) [spacy]
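Both embedding approaches come down to the same comparison: embed the sentence and the two anchor labels, then pick whichever anchor the sentence is closer to. A minimal sketch of that step, assuming the vectors come from ada-002 or spaCy (the function names here are illustrative, not the script's actual API):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify(sentence_vec, positive_vec, negative_vec):
    # Label the sentence by its nearer anchor embedding.
    pos = cosine_similarity(sentence_vec, positive_vec)
    neg = cosine_similarity(sentence_vec, negative_vec)
    return "positive" if pos >= neg else "negative"
```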
Stats Tracked:
- Speed is tracked for all three methods.
- Distance to the positive and negative anchors is tracked for both vector embedding methods.
- Token count and cost are tracked for ChatCompletion and the ada-002 embeddings.
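The cost side is simple arithmetic: tokens used divided by 1,000, times the per-1K-token price. A sketch with the OpenAI list prices at the time this was written (these rates are an assumption and change over time; check current pricing):

```python
# Assumed per-1K-token prices (USD) circa the original experiment.
PRICE_PER_1K = {
    "gpt-3.5-turbo": 0.002,
    "text-embedding-ada-002": 0.0004,
}

def estimate_cost(model, token_count):
    # Cost scales linearly with token count.
    return token_count / 1000 * PRICE_PER_1K[model]
```

At these rates, classifying via ada-002 embeddings is roughly 5x cheaper per token than via gpt-3.5-turbo, before even counting the prompt overhead the chat approach adds.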
Here's the original tweet thread about this.
Update 1: Added multi.py for testing classification into more than two options, using movies and movie genres (gpt-3.5-turbo and ada-002 only).
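Generalizing the two-anchor comparison to many genres just means taking an argmax over the label embeddings. A hedged sketch of that idea (genre names and the `classify_multi` helper are illustrative, not taken from multi.py):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_multi(movie_vec, genre_vecs):
    # genre_vecs maps genre name -> embedding; return the most similar genre.
    return max(genre_vecs, key=lambda g: cosine_similarity(movie_vec, genre_vecs[g]))
```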
```shell
pip install -r requirements.txt
python -m spacy download en_core_web_md
export OPENAI_API_KEY=<your key here>
python main.py
```