There are 40 repositories under the nlp-library topic.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
A comprehensive list of PyTorch-related content on GitHub, such as models, implementations, helper libraries, and tutorials.
An Open-Source Framework for Prompt-Learning.
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Awesome-pytorch-list: translation work in progress...
Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.
Underthesea - Vietnamese NLP Toolkit
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
skweak: A software toolkit for weak supervision applied to NLP tasks
💁 Awesome Treasure of Transformers Models for Natural Language Processing, containing papers, videos, blogs, and official repos along with Colab notebooks. 🛫☑️
A Japanese Tokenizer for Business
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter).
A curated list of resources for Japanese NLP: Python libraries, LLMs, dictionaries, and corpora.
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗 Hugging Face.
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Python version of Sudachi, a Japanese tokenizer.
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.