Handsomeqqqqqqq's starred repositories

ML-NLP

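This project covers the knowledge points and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews, which are also the theoretical fundamentals every algorithm engineer must know.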

Language: Jupyter Notebook | Stargazers: 15604 | Issues: 383 | Issues: 39

gensim

Topic Modelling for Humans

Language: Python | License: LGPL-2.1 | Stargazers: 15523 | Issues: 432 | Issues: 1847

NLP_ability

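A summary of the knowledge that a natural language processing (NLP) engineer needs to accumulate, including interview questions, fundamentals, and engineering skills, to improve core competitiveness.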

KeyBERT

Minimal keyword extraction with BERT

Language: Python | License: MIT | Stargazers: 3386 | Issues: 32 | Issues: 200

NLP-Interview-Notes

This repository mainly records interview questions for NLP algorithm engineers.

named_entity_recognition

Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)

yake

Single-document unsupervised keyword extraction

Language: Python | License: NOASSERTION | Stargazers: 1613 | Issues: 30 | Issues: 66

CLUENER2020

CLUENER2020: Fine Grained Named Entity Recognition for Chinese

lda

Topic modeling with latent Dirichlet allocation using Gibbs sampling

Language: Python | License: MPL-2.0 | Stargazers: 1224 | Issues: 49 | Issues: 94

BERT-NER

Pytorch-Named-Entity-Recognition-with-BERT

Language: Python | License: AGPL-3.0 | Stargazers: 1195 | Issues: 23 | Issues: 98

contextualized-topic-models

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Language: Python | License: MIT | Stargazers: 1189 | Issues: 17 | Issues: 108

tagger

Named Entity Recognition Tool

Language: Python | License: Apache-2.0 | Stargazers: 1157 | Issues: 63 | Issues: 84

RAKE

A Python implementation of Rapid Automatic Keyword Extraction

Language: Python | License: MIT | Stargazers: 973 | Issues: 58 | Issues: 9

Chinese-NLP-Corpus

Collections of Chinese NLP corpus

text-classification-surveys

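A collection of text classification resources. It covers deep learning text classification models such as SpanBERT, ALBERT, RoBerta, Xlnet, MT-DNN, BERT, TextGCN, MGAN, TextCapsule, SGNN, SGM, LEAM, ULMFiT, DGCNN, ELMo, RAM, DeepMoji, IAN, DPCNN, TopicRNN, LSTMN, Multi-Task, HAN, CharCNN, Tree-LSTM, DAN, TextRCNN, Paragraph-Vec, TextCNN, DCNN, RNTN, MV-RNN, and RAE; shallow learning models such as LightGBM, SVM, XGboost, Random Forest, C4.5, CART, KNN, NB, and HMM; text classification datasets such as MR, SST, MPQA, IMDB, Yelp, 20NG, AG, R8, DBpedia, Ohsumed, SQuAD, SNLI, MNLI, MSRP, MRDA, RCV1, and AAPD; evaluation metrics such as accuracy, Precision, Recall, F1, EM, MRR, HL, Micro-F1, Macro-F1, and P@K; and technical challenges, including multi-label text classification.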

ETM

Topic Modeling in Embedding Spaces

Language: Python | License: MIT | Stargazers: 537 | Issues: 14 | Issues: 36

python-topic-model

Implementation of various topic models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 370 | Issues: 29 | Issues: 15

sccl

PyTorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

Language: Python | License: MIT-0 | Stargazers: 287 | Issues: 6 | Issues: 29

Black-Box-Tuning

ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'2022: BBTv2: Towards a Gradient-Free Future with Large Language Models

Language: Python | License: MIT | Stargazers: 256 | Issues: 7 | Issues: 14

TRIME

[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674

CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

Language: Python | License: MIT | Stargazers: 187 | Issues: 9 | Issues: 52

topicModelling

A project with topic model implementations

Language: Python | License: GPL-3.0 | Stargazers: 131 | Issues: 12 | Issues: 2

PyTorchOT

Implements optimal transport algorithms in PyTorch

jate

NEWS: JATE2.0 Beta.11 Released, see details below.

Language: Java | License: LGPL-3.0 | Stargazers: 81 | Issues: 9 | Issues: 46

NER-FunTool

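This NER project includes several Chinese datasets. The models use BiLSTM+CRF, BERT+Softmax, BERT+Cascade, BERT+WOL, and others, and TFServing is used at the end for model deployment, with both online and offline inference.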

Language: Python | License: MIT | Stargazers: 78 | Issues: 2 | Issues: 3

termsuite-core

A Java UIMA-based toolbox for multilingual and efficient terminology extraction and multilingual term alignment

Language: Java | License: Apache-2.0 | Stargazers: 37 | Issues: 11 | Issues: 103

atr4s

Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala

Language: Scala | License: Apache-2.0 | Stargazers: 34 | Issues: 19 | Issues: 11

JurisLMs

JurisLMs: Jurisprudential Language Models

Language: Python | Stargazers: 19 | Issues: 3 | Issues: 0

NeuralSinkhornTopicModel

Neural Topic Model via Optimal Transport, ICLR 2021

Language: Python | License: MIT | Stargazers: 15 | Issues: 1 | Issues: 0
Language: Python | Stargazers: 14 | Issues: 0 | Issues: 0