MaksymPetyak / medplexity

Evaluating LLMs for medical applications


Medplexity


Medplexity explorer • Frontend GitHub repository • Substack

Medplexity is a Python library for evaluating LLMs in medical applications.


It is designed to help with the following tasks:

  • Evaluating the performance of LLMs on existing medical datasets and benchmarks, e.g. MedQA and PubMedQA.
  • Comparing the performance of different prompts, models, and architectures.
  • Exporting evaluation results for visualisation and further analysis.

The goal is to help answer questions like "How much better would GPT-4 perform if given a vector database from which to retrieve relevant resources?".
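To make the first point concrete, here is a minimal, library-agnostic sketch of what evaluating a model on a multiple-choice medical benchmark boils down to. It deliberately does not use the Medplexity API (see the documentation and the "Getting Started" notebook for the real interface); the toy benchmark and dummy_model below are illustrative placeholders only.

# Library-agnostic sketch of "evaluate an LLM on a multiple-choice
# medical benchmark". It does NOT use the Medplexity API; dummy_model
# stands in for a real LLM call.

from typing import Callable

# A toy benchmark: each item has a question, options, and the correct key.
benchmark = [
    {"question": "Which vitamin deficiency causes scurvy?",
     "options": {"A": "Vitamin A", "B": "Vitamin C", "C": "Vitamin D"},
     "answer": "B"},
    {"question": "Which organ produces insulin?",
     "options": {"A": "Liver", "B": "Kidney", "C": "Pancreas"},
     "answer": "C"},
]

def dummy_model(question: str, options: dict[str, str]) -> str:
    """Placeholder for an LLM call; always picks the first option."""
    return next(iter(options))

def evaluate(model: Callable[[str, dict[str, str]], str]) -> float:
    """Run the model over the benchmark and return its accuracy."""
    correct = sum(
        model(item["question"], item["options"]) == item["answer"]
        for item in benchmark
    )
    return correct / len(benchmark)

print(f"Accuracy: {evaluate(dummy_model):.2f}")

Medplexity packages this kind of loop for real benchmarks and models, and lets you swap prompts, models, and architectures while exporting the per-question results.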

🔧 Quick install

pip install medplexity

📖 Documentation

Documentation can be found here.

Example

See our "Getting Started" notebook for a full example with MedMCQA dataset.

Contributions

Contributions are welcome! Check out the todos below, and feel free to open a pull request. Remember to install pre-commit to be compliant with our standards:

pre-commit install

Feel free to raise any questions on Discord.

Explorer

In addition to the library, we are building a web app to explore evaluation results. The explorer is available at medplexityai.com and is open source; see the frontend repository.

📜 License

Medplexity is licensed under the MIT License. See the LICENSE file for more details.
