nlp-library

There are 45 repositories under the nlp-library topic.

huggingface / transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
nlp natural-language-processing pytorch pytorch-transformers transformer model-hub pretrained-models speech-recognition hacktoberfest python machine-learning deep-learning audio deepseek gemma glm llm qwen vlm
Language:Python 149808
explosion / spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
natural-language-processing data-science machine-learning python cython nlp artificial-intelligence ai spacy nlp-library neural-network neural-networks deep-learning named-entity-recognition entity-linking text-classification tokenization
Language:Python 32479
bharathgs / Awesome-pytorch-list
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
pytorch python machine-learning deep-learning tutorials papers awesome awesome-list pytorch-tutorials data-science nlp nlp-library cv computer-vision natural-language-processing facebook probabilistic-programming utility-library neural-network pytorch-model
16144
thunlp / OpenPrompt
An Open-Source Framework for Prompt-Learning.
nlp pre-trained-language-models ai nlp-machine-learning natural-language-processing natural-language-understanding deep-learning pre-trained-model nlp-library pytorch transformer prompt prompt-toolkit prompts prompt-based-tuning prompt-learning
Language:Python 4724
fastnlp / fastNLP
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
chinese-nlp deep-learning natural-language-processing nlp-library nlp-parsing text-classification text-processing
Language:Python 3126
FudanNLP / fnlp
Toolkit for Chinese natural language processing
fnlp fudannlp java nlp-library
Language:Java 2667
xavier-zy / Awesome-pytorch-list-CNVersion
Awesome-pytorch-list: translation work in progress...
awsome-pytorch-list cnversion computer-vision cv data-sicence deep-learning facebook machine-learning neural-network nlp nlp-library papers probabilistic-programming python pytorch pytorch-models pytorch-tutorials tutorials utility-library
Language:Jupyter Notebook 1766
deepset-ai / FARM
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
bert deep-learning germanbert language-models ner nlp nlp-framework nlp-library pretrained-models pytorch question-answering roberta transfer-learning xlnet-pytorch
Language:Python 1753
chrismattmann / tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
tika-server python tika-python tika-server-jar parser-interface parse translation-interface usc text-extraction mime buffer memex text-recognition detection recognition nlp nlp-machine-learning nlp-library covid-19 extraction
Language:Python 1615
undertheseanlp / underthesea
Underthesea - Vietnamese NLP Toolkit
dependency-parser dependency-parsing named-entity-recognition natural-language-processing ner nlp nlp-library pos-tagging sentence-segmentation vietnamese vietnamese-nlp vietnamese-tokenizer word-segmenter
Language:Python 1503
MilaNLProc / contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
topic-modeling bert transformer embeddings text-as-data topic-coherence multilingual-topic-models multilingual-models neural-topic-models nlp nlp-library nlp-machine-learning
Language:Python 1226
datadreamer-dev / DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
deep-learning machine-learning natural-language-processing nlp nlp-library python pytorch transformers alignment fine-tuning gpt instruction-tuning llm llmops llms openai synthetic-data synthetic-dataset-generation
Language:Python 1058
ashishpatel26 / Treasure-of-Transformers
💁 Awesome Treasure of Transformers Models for Natural Language Processing, containing papers, videos, blogs, and official repos along with Colab notebooks. 🛫☑️
transformer python nlp natural-language-processing tensorflow pytorch speech-recognition seq2seq pretrained-models language-models natural-language-generation nlp-library bert natural-language-understanding language-model pytorch-transformers model-hub jax awesome
Language:Jupyter Notebook 1049
thunlp / OpenDelta
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
deep-learning nlp nlp-library parameter-efficient-learning pretrained-language-model
Language:Python 1035
PyThaiNLP / pythainlp
Thai natural language processing in Python
python thai-nlp nlp-library thai-language natural-language-processing thai-nlp-library thai-soundex soundex word-segmentation thai hacktoberfest computational-linguistics text-processing
Language:Python 1026
atilika / kuromoji
Kuromoji is a self-contained and very easy-to-use Japanese morphological analyzer designed for search
japanese morphological-analyser nlp-library part-of-speech-tagger
Language:Java 974
NorskRegnesentral / skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
weak-supervision nlp-machine-learning distant-supervision nlp-library spacy python data-science training-data natural-language-processing
Language:Python 922
mocobeta / janome
Japanese morphological analysis engine written in pure Python
japanese-language nlp-library python
Language:Python 871
ikawaha / kagome
Self-contained Japanese Morphological Analyzer written in pure Go
hacktoberfest japanese japanese-language korean morphological-analysis nlp-library pos-tagging segmentation tokenizer
Language:Go 857
mindspore-lab / mindnlp
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
deep-learning large-language-models llm mindspore natural-language-processing nlp nlp-library python
Language:Jupyter Notebook 851
WorksApplications / Sudachi
A Japanese Tokenizer for Business
morphological-analysis nlp-library pos-tagging segmentation
Language:Java 838
taishi-i / awesome-japanese-nlp-resources
A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese
awesome awesome-list cc0 japanese japanese-language llm natural-language-processing nlp nlp-library
797
MIND-Lab / OCTIS
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
topic-modeling latent-dirichlet-allocation latent-semantic-analysis evaluation-metrics natural-language-processing non-negative-matrix-factorization neural-topic-models bayesian-optimization hyperparameter-optimization hyperparameter-tuning hyperparameter-search topic-models nlp nlproc nlp-library
Language:Python 783
pemistahl / lingua
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
language-detection language-processing kotlin-library java-library nlp-library nlp nlp-machine-learning natural-language natural-language-processing android-library language-identification language-classification language-recognition
Language:Kotlin 740
Ailln / cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
arabic-numbers arabic-numerals asr chinese-numerals cn2an nlp-library nlp-tool pypi python speech-recognition
Language:Python 714
cbaziotis / ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two big corpora (English Wikipedia and 330 million English tweets from Twitter).
nlp nlp-library semeval spell-corrector spelling-correction text-processing text-segmentation tokenization tokenizer word-normalization word-segmentation
Language:Python 668
wyounas / homer
Homer, a text analyser in Python, can help make your text clearer, simpler, and more useful for your readers.
nlp nlp-library python python-library python-script python3 text-analysis
Language:Python 631
medspacy / medspacy
Library for clinical NLP with spaCy.
nlp nlp-library spacy clinical-nlp pipeline medspacy
Language:Jupyter Notebook 557
fhamborg / Giveme5W1H
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
question answering news event-detection event-extraction fivewoneh fivew 5w1h 5w question-answering news-articles text-analysis nlp nlp-library
Language:HTML 517
proycon / pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
nlp python computational-linguistics linguistics library folia machine-learning language-modelling search-algorithms evaluation-metrics text-processing nlp-library natural-language-processing
Language:Python 477
linuxscout / pyarabic
pyarabic
nlp-library arabic-language text-processing
Language:Python 455
CAMeL-Lab / camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
arabic arabic-dialects dialect-identification morphological-analysis morphological-disambiguation morphological-generation morphological-reinflection named-entity-recognition nlp nlp-apis nlp-library pos-tagging sentiment-analysis stemming
Language:Python 446
ElizaLo / NLP-Natural-Language-Processing
Projects and useful articles / links
nlp nlp-machine-learning machine-learning nlp-library articles natural-language-processing paper awesome awesome-list awesome-nlp google-colaboratory coursera-course coursera-data-science coursera-machine-learning deep-learning nlp-resources naturallanguageprocessing natural-language-understanding natural-language awesome-natural-language-processing
Language:Jupyter Notebook 408
WorksApplications / SudachiPy
Python version of Sudachi, a Japanese tokenizer.
morphological-analysis nlp-library pos-tagging segmentation
Language:Python 404
taishi-i / nagisa
A Japanese tokenizer based on recurrent neural networks
dynet japanese natural-language-processing nlp nlp-library pos-tagging sequence-labeling tokenizer word-segmentation
Language:Python 397
hellohaptik / multi-task-NLP
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
context-awareness entailment intent-classification machine-comprehension multitask-learning named-entity-recognition nli-tasks nlp nlp-apis nlp-datasets nlp-library pytorch ranking sentence-classification sequence-labeling transformers
Language:Python 372