There are 4 repositories under the fasttext-embeddings topic.
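Implementations of common NLP tasks, including new word discovery, as well as PyTorch-based word vectors, Chinese text classification, named entity recognition, abstractive text summarization, sentence similarity judgment, triple extraction, pre-trained models, and more.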
Spanish word embeddings computed with different methods and from different corpora
Tools for shrinking fastText models (in gensim format)
Text to abstract art generation for the holidays!
A monolingual and cross-lingual meta-embedding generation and evaluation framework
Persian sentiment analysis
PyTorch repository for text categorization and NER experiments in Turkish and English.
An evaluation of word-embeddings for classification
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Ensemble PhoBERT with FastText Embedding to improve performance on Vietnamese Sentiment Analysis tasks.
Romanian Word Embeddings. Here you can find pre-trained word embeddings. Current methods: CBOW, Skip-Gram, FastText (from the Gensim library). The .vec and .model files are available for download (all in one archive).
Machine learning-based solution to the problem of duplication in the bug reports repository.
Biomedical Word embeddings generated from Spanish Biomedical corpora.
Spanish Word Embeddings computed from large corpora and different sizes using fastText.
This project contains the code to use custom fasttext embeddings with flair framework.
Repository for the free online book Oddly Satisfying Deep Learning from Scratch (link below!)
Detect hate speech in tweets
Persian Word Embedding using FastText, BERT, GPT and GloVe | Persian word embedding with different methods
Final Project of the Data Science postgraduate class at UFC. To be published!
Let's hunt Fake News using Word2Vec, GloVe, FastText, or German embeddings learnt from the corpus.
🔍 A simple topic detector.
Spam classifier with a dataset of 5,000 emails
Machine learning-based online article analysis service
Project files contain PyTorch implementations of Siamese BiLSTM models for the Semantic Text Similarity task on the SICK Dataset using FastText embeddings. Also contains Siamese BiLSTM-Transformer Encoder and SBERT fine-tuning implementations on the STS Data tasks.
Unofficial minified fastText API. Use it to run NLP DL models that require word embeddings on the client-side.
This is one of my fun projects. It's a review classifier based on Amazon's reviews dataset hosted on Kaggle. I used FastText and an LSTM deep learning model to build it.
Topic Modeling on BBC News using Facebook's FastText embeddings and LDA probabilistic model.
A benchmark for embeddings evaluation for Kyrgyz language
Machine Learning and Deep Learning approach to IR - Contextual Embeddings - Clustering Documents
Experiments in the field of Semantic Search using the BM25 algorithm and Mean of Word Vectors, along with state-of-the-art Transformer-based models, namely USE and SBERT.