There are 16 repositories under the embeddings topic.
100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors
A library for transfer learning by reusing parts of TensorFlow models.
📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.
A Python library for self-supervised learning on images.
Basic Utilities for PyTorch Natural Language Processing (NLP)
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
A curated list of awesome embedding models tutorials, projects and communities.
A fast, efficient universal vector embedding utility package.
Chinese text classification with Keras NLP: long-text classification, short-sentence classification, multi-label classification, and two-sentence similarity; base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, capsule network (CapsuleNet), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN
Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.
Data augmentation for NLP, presented at EMNLP 2019
Modern columnar data format for ML implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
A robust, all-in-one GPT3 interface for Discord. ChatGPT-style conversations, image generation, AI moderation, custom indexes/knowledge base, YouTube summarizer, and more!
Implementation of triplet loss in TensorFlow
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
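text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.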
A tool for learning vector representations of words and entities from Wikipedia
Library for faster pinned CPU <-> GPU transfer in PyTorch
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM), and word embeddings on TensorFlow.
Compute Sentence Embeddings Fast!
Named Entity Recognition using multilayered bidirectional LSTM
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Vector Hub - Library for easy discovery and consumption of state-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc.)
Nimfa: Nonnegative matrix factorization in Python
🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
A curated list of Generative AI tools, works, models, and references
A list of recommender systems papers that I am interested in