In this repo, we walk you through an experiment for a common use case of Large Language Models (LLMs): text summarization.
We compare two strong open-source models: Mixtral 8x7B and Llama 2 70B.
We compare them along two axes:
- inference performance, when running on NVIDIA GPUs for hardware acceleration, and
- task performance, evaluating the generated summaries with a suitable NLP evaluation metric.
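For the task-performance axis, a common choice of summarization metric is ROUGE, which scores n-gram overlap between a generated summary and a reference. The notebooks specify the exact metric used; as a self-contained illustration of the idea (not the repo's implementation), here is a minimal pure-Python sketch of a ROUGE-N F1 score:

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    if not ref or not cand:
        return 0.0
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: five of six unigrams match in each direction, so F1 = 5/6.
score = rouge_n("the cat sat on the mat", "the cat is on the mat", n=1)
print(f"ROUGE-1 F1: {score:.3f}")
```

In practice you would use a maintained library (e.g. the `rouge-score` package) rather than this sketch, but the core computation is the same.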
You can follow the notebooks in order for a walk-through of the experiments.