embeddings

There are 49 repositories under the embeddings topic.

supabase / supabase
The open source Firebase alternative.
deno embeddings firebase graphql postgres postgresql postgrest realtime supabase vectors websockets
Language:TypeScript 66776
chroma-core / chroma
the AI-native open-source embedding database
document-retrieval embeddings llms
Language:Python 12574
Embedding / Chinese-Word-Vectors
100+ Chinese Word Vectors (more than a hundred pre-trained Chinese word vectors)
chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings
Language:Python 11621
h2oai / h2ogpt
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
chatgpt llm ai embeddings generative gpt gpt4all pdf private privategpt vectorstore llama2 mixtral
Language:Python 10691
embedchain / embedchain
Personalizing LLM Responses
ai chatgpt llm python chatbots rag application embeddings vector-database
Language:Python 8610
neuml / txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
python search machine-learning nlp semantic-search neural-search vector-search txtai llm vector-database language-model transformers sentence-embeddings large-language-models information-retrieval search-engine vector-search-engine embeddings retrieval-augmented-generation rag
Language:Python 7096
KevinMusgrave / pytorch-metric-learning
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning
Language:Python 5796
postgresml / postgresml
The GPU-powered AI application database. Get your app to market faster using the simplicity of SQL and the latest NLP, ML + LLM models.
ai ann approximate-nearest-neighbor-search artificial-intelligence classification clustering embeddings forecasting javascript knn llm machine-learning ml postgres python rag regression search sql vector-database
Language:Rust 5483
FlagOpen / FlagEmbedding
Retrieval and Retrieval-augmented LLMs
embeddings information-retrieval llm retrieval-augmented-generation sentence-embeddings text-semantic-similarity
Language:Python 5195
shibing624 / text2vec
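text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.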
similarity nlp text-similarity text2vec word2vec embeddings sentence-embeddings
Language:Python 4103
llmware-ai / llmware
Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
ai bert embedding-vectors embeddings faiss generative-ai information-retrieval large-language-models machine-learning milvus nlp parsing python pytorch question-answering retrieval-augmented-generation semantic-search transformers rag
Language:Python 3814
tensorflow / hub
A library for transfer learning by reusing parts of TensorFlow models.
tensorflow machine-learning transfer-learning embeddings image-classification python ml
Language:Python 3444
langchain4j / langchain4j
Java version of LangChain
huggingface java langchain openai chatgpt gpt llama milvus pinecone weaviate onnx embeddings vector-database chroma gemini ollama anthropic openai-api pgvector
Language:Java 3361
lancedb / lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming.
apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust
Language:Rust 3325
towhee-io / towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
machine-learning convolutional-networks embedding-vectors embeddings computer-vision image-processing video-processing feature-extraction image-retrieval unstructured-data feature-vector transformer milvus towhee vision-transformer vit pipeline llm
Language:Python 3015
lightly-ai / lightly
A python library for self-supervised learning on images.
computer-vision contrastive-learning deep-learning embeddings machine-learning pytorch self-supervised-learning
Language:Python 2781
eugeneyan / ml-surveys
📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.
computer-vision deep-learning embeddings machine-learning nlp recommender-system reinforcement-learning survey
2773
SamurAIGPT / EmbedAI
An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks
chatbot chatgpt gpt gpt4 langchain openai privategpt embedai embeddings generative gpt4all models vectorstore whisper
Language:JavaScript 2760
hegelai / prompttools
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
deep-learning large-language-models machine-learning prompt-engineering python embeddings llms vector-search developer-tools
Language:Python 2460
PetrochukM / PyTorch-NLP
Basic Utilities for PyTorch Natural Language Processing (NLP)
pytorch nlp natural-language-processing pytorch-nlp torchnlp data-loader embeddings word-vectors python deep-learning dataset metrics neural-network sru machine-learning
Language:Python 2202
filipecalegario / awesome-generative-ai
A curated list of Generative AI tools, works, models, and references
ai-art awesome awesome-list chatgpt dall-e dalle2 embeddings generative-ai gpt-4 llm llm-agent midjourney openai prompt-engineering semantic-search stable-diffusion text-to-image txt2img
2067
huggingface / text-embeddings-inference
A blazing fast inference solution for text embeddings models
ai embeddings huggingface llm ml
Language:Rust 2064
brianpetro / obsidian-smart-connections
Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3
chatgpt embeddings claude gemini llama3 obsidian obsidian-plugin
Language:JavaScript 1937
vearch / vearch
Distributed vector search for AI-native applications
vectors vector-search cloud-native document-retrieval embeddings vector-database hybrid-search rag retrieval-augmented-generation ai-native ai-native-database
Language:Go 1932
Kav-K / GPTDiscord
A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!
artificial-intelligence asyncio gpt3 help-wanted openai openai-api python dalle2 embeddings extractive-question-answering pinecone moderator-bot hacktoberfest github digitalocean multi-modal collaborate chatbot code-interpreter discord
Language:Python 1794
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
embeddings information-retrieval language-model text-classification text-clustering text-embedding text-evaluation text-semantic-similarity prompt-retrieval text-reranking
Language:Python 1726
Hironsan / awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.
embeddings embedding-models word2vec machine-learning natural-language-processing awesome papers
Language:Jupyter Notebook 1718
featureform / featureform
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
machine-learning data-science vector-database embeddings-similarity embeddings hacktoberfest feature-store mlops data-quality feature-engineering ml python
Language:Jupyter Notebook 1711
yongzhuo / Keras-TextClassification
Chinese text classification with Keras NLP: long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity; base classes for building character-, word-, and sentence-vector embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN
text-classification keras rcnn dcnn charcnn bert nlp textcnn fasttext dpcnn embeddings capsule vdcnn crnn han xlnet albert keras-textclassification leam transformer
Language:Python 1710
lilianweng / stock-rnn
Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.
lstm rnn-tensorflow stock-price-prediction embeddings
Language:Python 1679
plasticityai / magnitude
A fast, efficient universal vector embedding utility package.
python natural-language-processing nlp machine-learning vectors embeddings word2vec fasttext glove gensim fast memory-efficient machine-learning-library word-embeddings
Language:Python 1611
jasonwei20 / eda_nlp
Data augmentation for NLP, presented at EMNLP 2019
nlp data-augmentation text-classification synonyms embeddings sentence classification rnn cnn swap position
Language:Python 1552
google / generative-ai-docs
Documentation for Google's Gen AI site - including the Gemini API and Gemma
ai chatbot documentation embeddings llm machine-learning gemini gemini-api gemma
Language:Jupyter Notebook 1330
eliorc / node2vec
Implementation of the node2vec algorithm.
machine-learning-algorithms embeddings deep-learning
Language:Python 1181
MilaNLProc / contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
topic-modeling bert transformer embeddings text-as-data topic-coherence multilingual-topic-models multilingual-models neural-topic-models nlp nlp-library nlp-machine-learning
Language:Python 1170
bheinzerling / bpemb
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
embeddings subword-embeddings natural-language-processing nlp multilingual
Language:Python 1159