urduhack / urduhack

An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.

Home Page: https://urduhack.readthedocs.io/en/stable/

Urduhack: A Python NLP library for the Urdu language

Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.

You can reach the core contributor, Mr. Ikram Ali, at https://github.com/akkefa

Our Goal

  • Academic users: Easier experimentation to test hypotheses without coding from scratch.
  • NLP beginners: Learn how to build an NLP project with production-level code quality.
  • NLP developers: Build a production-level application within minutes.

🔥 Features Support

  • Normalization
  • Preprocessing
  • Tokenization
  • Pipeline Module
  • Models
    • POS tagger
    • Lemmatizer
    • Named entity recognition
    • Sentiment analysis
    • Image to text
    • Question answering system
  • Datasets loader
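
To give a feel for what the normalization and tokenization features listed above involve, here is a minimal standard-library sketch. It is not Urduhack's implementation: the function names `normalize_chars` and `split_sentences` and the two-entry character map are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; this is NOT Urduhack's code.
# Assumption: a tiny map from Arabic code points to the Urdu letters
# that look the same, and the Urdu full stop as sentence delimiter.

ARABIC_TO_URDU = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}

URDU_FULL_STOP = "\u06D4"  # ARABIC FULL STOP, used to end Urdu sentences


def normalize_chars(text: str) -> str:
    """Replace Arabic look-alike letters with their Urdu code points."""
    return text.translate(str.maketrans(ARABIC_TO_URDU))


def split_sentences(text: str) -> list:
    """Split text on the Urdu full stop, dropping empty pieces."""
    return [piece.strip() for piece in text.split(URDU_FULL_STOP) if piece.strip()]
```

A real normalizer covers far more characters, and a real tokenizer handles more delimiters; the library's own modules are documented in its API reference.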

🛠 Installation

Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.

Install with the CPU version of TensorFlow:

$ pip install urduhack[tf]

Install with the GPU version of TensorFlow:

$ pip install urduhack[tf-gpu]
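
Since official support is stated for Python 3.6 to 3.7 only, it can be worth checking the interpreter before installing. A minimal sketch; the helper name `is_supported_python` is hypothetical and not part of Urduhack:

```python
import sys

# Range taken from the support statement above: Python 3.6 through 3.7.
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 7)


def is_supported_python(version=None) -> bool:
    """Return True if (major, minor) lies in the officially supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Calling `is_supported_python()` with no argument checks the running interpreter.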

Usage

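import urduhack

# Download the pretrained models
urduhack.download()

# Build the processing pipeline
nlp = urduhack.Pipeline()
text = ""  # replace with the Urdu text to analyze
doc = nlp(text)

for sentence in doc.sentences:
    print(sentence.text)
    # Part-of-speech tag for each word
    for word in sentence.words:
        print(f"{word.text}\t{word.pos}")

    # Named-entity label for each token
    for token in sentence.tokens:
        print(f"{token.text}\t{token.ner}")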
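
The loop above prints each annotation as it goes. When the results are needed later, they can be gathered into plain Python structures instead. A minimal sketch, assuming only the attributes used in the example (`doc.sentences`, `sentence.words`, `sentence.tokens`, `.text`, `.pos`, `.ner`); the helper name `collect_annotations` is hypothetical and not part of Urduhack's API:

```python
def collect_annotations(doc):
    """Gather (text, pos) and (text, ner) pairs per sentence.

    Works with any object that exposes the attributes used in the
    usage example; nothing else about the pipeline is assumed.
    """
    results = []
    for sentence in doc.sentences:
        results.append({
            "text": sentence.text,
            "pos": [(word.text, word.pos) for word in sentence.words],
            "ner": [(token.text, token.ner) for token in sentence.tokens],
        })
    return results
```

Calling `collect_annotations(nlp(text))` would then return one dictionary per sentence.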

🔗 Documentation

Fantastic documentation is available at https://urduhack.readthedocs.io/

  • Installation: How to install Urduhack and download models.
  • Quickstart: New to Urduhack? Here's everything you need to know!
  • API Reference: The detailed reference for Urduhack's API.
  • Contribute: How to contribute to the code base.

👏 Contributors

Special thanks to everyone who contributed to getting Urduhack to its current state.

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. [Become a sponsor]

📝 Copyright and license

Code released under the MIT License.
