Tom Aarsen (tomaarsen)

Company: Hugging Face

Location: Netherlands

Home Page: https://tomaarsen.com

Twitter: @tomaarsen


Organizations
embeddings-benchmark
Hugging-Face-Helping-Hand
huggingface
nltk

Tom Aarsen's repositories

attention_sinks

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Language: Python | License: Apache-2.0 | Stargazers: 654 | Issues: 12 | Issues: 30

SpanMarkerNER

SpanMarker for Named Entity Recognition

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 380 | Issues: 9 | Issues: 42

AnglE

Angle-optimized Text Embeddings | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, ModelScope, etc.

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 1 | Issues: 0

AIR-Bench

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

bm25s

BM25S is an ultra-fast lexical search library that implements BM25 using scipy

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

canopy

Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

EMO

[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

setfit

Efficient few-shot learning with Sentence Transformers

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

blog

Public repo for HF blog posts

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

GLiNER

Generalist model for NER (Extract any entity types from texts)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

Hotel-ID-2022

7th place entry to the Hotel-ID 2022 Kaggle challenge

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

huggingface.js

Utilities to use the Hugging Face hub API

Language: TypeScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

huggingface_hub

All the open source things related to the Hugging Face Hub.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

langchain

⚡ Building applications with LLMs through composability ⚡

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

llama_index

LlamaIndex (GPT Index) is a data framework for your LLM applications

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

postgresml

The GPU-powered AI application database. Get your app to market faster using the simplicity of SQL and the latest NLP, ML + LLM models.

Language: Rust | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

stsb-multi-mt

Machine translated multilingual STS benchmark dataset.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

tomaarsen.com-backend

Backend for www.tomaarsen.com

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

tomaarsen.com-frontend

Frontend for www.tomaarsen.com

Language: HTML | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0