vngrs-ai / vnlp

State-of-the-art, lightweight NLP tools for the Turkish language. Developed by VNGRS.

Home Page: https://vngrs.com

VNLP: Turkish NLP Tools

State-of-the-art, lightweight NLP tools for the Turkish language.

Developed by VNGRS.

https://vngrs.com/

Functionality:

  • Sentence Splitter

  • Normalizer

    • Spelling/Typo correction
    • Convert numbers to word form
    • Deasciification
  • Stopword Remover

    • Static
    • Dynamic
  • Stemmer: Morphological Analyzer & Disambiguator

  • Named Entity Recognizer (NER)

  • Dependency Parser

  • Part of Speech (PoS) Tagger

  • Sentiment Analyzer

  • Turkish Word Embeddings

    • FastText
    • Word2Vec
    • SentencePiece Unigram Tokenizer
  • News Summarization

  • News Paraphrasing

  • Summarization and Paraphrasing models are available in the demo. Contact us at vnlp@vngrs.com for API access.
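To make the "Convert numbers to word form" item under Normalizer concrete, here is a minimal pure-Python sketch of the idea for Turkish cardinal numbers. It is an illustration only and is not VNLP's implementation; the names `number_to_turkish_words` and `_three_digits_to_words` are hypothetical, and the supported range (0 up to, but not including, 10**12) is an arbitrary choice for the sketch.

```python
from typing import List

ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]
SCALES = ["", "bin", "milyon", "milyar"]  # units, thousands, millions, billions


def _three_digits_to_words(n: int) -> List[str]:
    """Spell out 0 <= n < 1000 as a list of words (empty list for 0)."""
    words: List[str] = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        # Turkish says "yüz" for 100, not "bir yüz".
        if hundreds > 1:
            words.append(ONES[hundreds])
        words.append("yüz")
    tens, ones = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if ones:
        words.append(ONES[ones])
    return words


def number_to_turkish_words(n: int) -> str:
    """Hypothetical helper: spell out a non-negative integer in Turkish."""
    if n < 0 or n >= 10 ** 12:
        raise ValueError("supported range is 0 <= n < 10**12")
    if n == 0:
        return "sıfır"

    # Split into groups of three digits, least significant group first.
    groups: List[int] = []
    while n:
        n, group = divmod(n, 1000)
        groups.append(group)

    words: List[str] = []
    for scale_index in range(len(groups) - 1, -1, -1):
        group = groups[scale_index]
        if group == 0:
            continue
        # 1000 is just "bin"; "bir bin" is not used.
        if not (scale_index == 1 and group == 1):
            words.extend(_three_digits_to_words(group))
        if SCALES[scale_index]:
            words.append(SCALES[scale_index])
    return " ".join(words)
```

A production normalizer also has to handle decimals, ordinals, and numbers embedded in running text, which this sketch leaves out.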

Demo:

Installation

pip install vngrs-nlp
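
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `vngrs-nlp` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "vngrs-nlp") -> Optional[str]:
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report the result for the current environment.
found = installed_version()
print(f"vngrs-nlp {found}" if found else "vngrs-nlp is not installed; run: pip install vngrs-nlp")
```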

Documentation:

  • See the Documentation for details on usage, classes, functions, datasets and evaluation metrics.

Metrics:

Usage Example:

Part of Speech (PoS) Tagger

from vnlp import PoSTagger
pos_tagger = PoSTagger()

pos_tagger.predict("Oğuz'un kırmızı bir Astra'sı vardı.")
[("Oğuz'un", 'PROPN'),
 ('kırmızı', 'ADJ'),
 ('bir', 'DET'),
 ("Astra'sı", 'PROPN'),
 ('vardı', 'VERB'),
 ('.', 'PUNCT')]

Dependency Parser

# spaCy's displaCy submodule can be used to visualize the DependencyParser result.
from spacy import displacy
from vnlp import DependencyParser
dependency_parser = DependencyParser()
result = dependency_parser.predict("Oğuz'un kırmızı bir Astra'sı vardı.", displacy_format=True)
displacy.render(result, style="dep", manual=True)
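
For readers curious about what a displaCy-ready structure looks like, the sketch below builds the dictionary that displaCy accepts when called with `manual=True` (a `words` list and an `arcs` list). It assumes a parse represented as `(token_index, token, head_index, label)` tuples with 1-based indices and head `0` for the root; that layout, the helper name `to_displacy_manual`, and the two-token parse are assumptions written by hand for illustration, not output of the VNLP model.

```python
from typing import Dict, List, Tuple

# Assumed layout: (token_index, token, head_index, relation_label),
# 1-based indices, head_index 0 marks the root.
ParsedToken = Tuple[int, str, int, str]


def to_displacy_manual(parsed: List[ParsedToken]) -> Dict[str, list]:
    """Hypothetical helper: convert a parse into displaCy's manual 'dep' format."""
    words = [{"text": token, "tag": ""} for _, token, _, _ in parsed]
    arcs = []
    for index, _, head, label in parsed:
        if head == 0:
            continue  # the root has no incoming arc to draw
        child, parent = index - 1, head - 1  # displaCy uses 0-based positions
        arcs.append({
            "start": min(child, parent),
            "end": max(child, parent),
            "label": label,
            # "left" when the dependent precedes its head, otherwise "right"
            "dir": "left" if child < parent else "right",
        })
    return {"words": words, "arcs": arcs}


# Hand-written toy parse ("kırmızı araba" = "red car"), not model output.
toy_parse = [(1, "kırmızı", 2, "amod"), (2, "araba", 0, "root")]
data = to_displacy_manual(toy_parse)
print(data)
```

The resulting `data` can be passed to `displacy.render(data, style="dep", manual=True)` in the same way as `result` above.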

Citation

@article{turker2024vnlp,
  title={VNLP: Turkish NLP Package},
  author={Turker, Meliksah and Ari, Erdi and Han, Aydin},
  journal={arXiv preprint arXiv:2403.01309},
  year={2024}
}