rajeev3983 / awesome-nlp

A curated list of resources dedicated to Natural Language Processing

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

awesome-nlp

A curated list of resources dedicated to Natural Language Processing

Maintainers - Keon Kim, Martin Park

Contributing

Please feel free to pull requests, email Keon Kim (keon.kim@nyu.edu) to add links.

Table of Contents

Tutorials and Courses

  • Tensor Flow Tutorial on Seq2Seq Models

videos

Codes

Implementations

Libraries

  • TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text

  • Node.js and Javascript - Node.js Libaries for NLP

    • Twitter-text - A JavaScript implementation of Twitter's text processing library
    • Knwl.js - A Natural Language Processor in JS
    • Retext - Extensible system for analyzing and manipulating natural language
    • NLP Compromise - Natural Language processing in the browser
    • Natural - general natural language facilities for node
  • Python - Python NLP Libraries

    • Scikit-learn: Machine learning in Python
    • Natural Language Toolkit (NLTK)
    • Pattern - A web mining module for the Python programming language. It has tools for natural language processing, machine learning, among others.
    • TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
    • YAlign - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora.
    • jieba - Chinese Words Segmentation Utilities.
    • SnowNLP - A library for processing Chinese text.
    • KoNLPy - A Python package for Korean natural language processing.
    • Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
    • BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
    • PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for FoLiA, but also ARPA language models, Moses phrasetables, GIZA++ alignments.
    • python-ucto - Python binding to ucto (a unicode-aware rule-based tokenizer for various languages)
    • python-frog - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
    • python-zpar - Python bindings for ZPar, a statistical part-of-speech-tagger, constiuency parser, and dependency parser for English.
    • colibri-core - Python binding to C++ library for extracting and working with with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
    • spaCy - Industrial strength NLP with Python and Cython.
    • PyStanfordDependencies - Python interface for converting Penn Treebank trees to Stanford Dependencies.
  • C++ - C++ Libraries

    • MIT Information Extraction Toolkit - C, C++, and Python tools for named entity recognition and relation extraction
    • CRF++ - Open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data & other Natural Language Processing tasks.
    • CRFsuite - CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.
    • BLLIP Parser - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
    • colibri-core - C++ library, command line tools, and Python binding for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
    • ucto - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
    • libfolia - C++ library for the FoLiA format
    • frog - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
    • MeTA - MeTA : ModErn Text Analysis is a C++ Data Sciences Toolkit that facilitates mining big text data.
    • Mecab (Japanese)
    • Mecab (Korean)
  • Java - Java NLP Libraries

  • Clojure

    • Clojure-openNLP - Natural Language Processing in Clojure (opennlp)
    • Infections-clj - Rails-like inflection library for Clojure and ClojureScript

Articles

Review Articles

Word Vectors

General Natural Language Processing

Named Entity Recognition

Machine Translation

Neural Network

Supplementary Materials

Blogs

Multilingual

Spanish

Credits

part of the lists are from

About

A curated list of resources dedicated to Natural Language Processing