Hamed's repositories
search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
chatbot-ui
The open-source AI chat app for everyone.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
faiss
A library for efficient similarity search and clustering of dense vectors.
promptbench
A unified evaluation framework for large language models.
LLMLingua
Speeds up LLM inference and improves the model's perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.
promptsource
Toolkit for creating, sharing and using natural language prompts.
auto-cot
Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)