disler / elm-itv-benchmark

Efficient Language Model Personal Viability Benchmarking

A simple, opinionated benchmark for testing the viability of Efficient Language Models (ELMs) for personal use cases.

Uses Bun, promptfoo, and Ollama for a minimalist, cross-platform, local LLM prompt testing and benchmarking experience.

Zero Cost Prompts

Setup

  • Install Bun
  • Install Ollama
  • Set up .env variables
    • cp .env.sample .env
    • Add your OpenAI API key to the .env file (see the sample after this list)
  • Install dependencies: bun i
  • Run the minimal tests: bun minimal
  • Open test viewer: bun view
  • Run the ELM-ITV tests: bun elm
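
For the .env step above, the only secret you typically need is an OpenAI API key. The variable name below is an assumption based on promptfoo's defaults, so confirm it against .env.sample:

    # .env (sketch; key name assumed from promptfoo's defaults, check .env.sample)
    OPENAI_API_KEY=<your key>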

Guide

  • First, watch the video where we walk through ELMs and this codebase.
  • To get started, take a look at BENCH__minimal_test_suite/ to get an idea of how to structure a basic test suite.
  • Next, look at the BENCH__efficient_language_models/ test suite to see how you can set up your own viability tests for ELMs.
  • Explore other Ollama-based models you can test.
  • Modify BENCH__minimal_test_suite/ or BENCH__efficient_language_models/ to suit your needs.
  • Create a new test suite with the "Create a new test suite" script (listed under Scripts below).

Folder Structure

  • /BENCH__<name of test suite>
    • /prompt.txt - the prompt(s) to test
    • /test.yaml - variables and assertions
    • /promptfooconfig.yaml - LLM model config (see the sketch below)
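
As a rough sketch of how these files typically fit together with promptfoo (not taken from this repo; the provider id, variable, and assertion value are placeholder assumptions, and exact keys can vary by promptfoo version):

    # promptfooconfig.yaml - wires the prompt, the model(s), and the tests together
    prompts:
      - file://prompt.txt            # the prompt(s) to test
    providers:
      - ollama:chat:llama3.2         # placeholder; any model you have pulled locally
    tests: file://test.yaml          # variables and assertions

    # test.yaml - a list of test cases, each with vars and assertions
    - vars:
        topic: bash one-liners       # placeholder variable referenced in prompt.txt
      assert:
        - type: icontains            # case-insensitive substring assertion
          value: grep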

Scripts

  • Create a new test suite: bun run ./scripts/new_prompt_test
  • Run a test prompt against a running Ollama server: bun run ./scripts/ollama_local_model_call (a sketch of this kind of call is below)
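
For reference, a call like the one ollama_local_model_call makes generally boils down to a POST against the local Ollama HTTP API. The sketch below is an assumption about what such a call looks like in Bun/TypeScript, not the repo's actual script; the model name and prompt are placeholders:

    // ollama_call_sketch.ts - run with `bun run ollama_call_sketch.ts`
    // Assumes an Ollama server is listening on its default port 11434.
    const res = await fetch("http://localhost:11434/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama3.2",                    // any model pulled via `ollama pull`
        prompt: "Say hello in five words.",   // placeholder prompt
        stream: false,                        // single JSON response instead of a stream
      }),
    });
    const data = await res.json();
    console.log(data.response);               // the generated text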

Resources

Languages

TypeScript 54.4%, JavaScript 45.6%