junaidpk / urduhack

An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.

Home Page: https://urduhack.readthedocs.io/

Urduhack: A Python NLP library for Urdu language

Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.

You can reach the core contributor, Mr Ikram Ali, at https://github.com/akkefa

Our Goal

  • Academic users: Easier experimentation to test hypotheses without coding from scratch.
  • NLP beginners: Learn how to build an NLP project with production-level code quality.
  • NLP developers: Build a production-level application within minutes.

🔥 Features Support

  • Normalization
  • Preprocessing
  • Tokenization
  • Pipeline Module
  • Models
    • POS tagger
    • Sentiment analysis
    • Sentence classification
    • Document classification
    • Named entity recognition
    • Image to text
    • Speech to text
  • Datasets loader
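
To make the first few features above concrete, here is a minimal standard-library sketch of what character normalization and tokenization can mean for Urdu text. It is not Urduhack's implementation or API: the function names and the two-entry character mapping are assumptions chosen for illustration, and a real normalizer covers many more cases.

```python
import re

# Illustrative mapping only (an assumption, not Urduhack's table):
# Arabic code points that are commonly replaced by their Urdu counterparts.
ARABIC_TO_URDU = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
_TRANSLATION = str.maketrans(ARABIC_TO_URDU)

# Sentence terminators: Urdu full stop (U+06D4) and Arabic question mark (U+061F).
_TERMINATORS = "\u06D4\u061F"
_SENTENCE_RE = re.compile("[^{0}]+[{0}]?".format(_TERMINATORS))


def normalize_chars(text):
    """Replace look-alike Arabic characters with Urdu ones."""
    return text.translate(_TRANSLATION)


def split_sentences(text):
    """Split text into sentences, keeping each terminator with its sentence."""
    return [s.strip() for s in _SENTENCE_RE.findall(text) if s.strip()]


def split_words(sentence):
    """Split a sentence on whitespace after dropping trailing terminators."""
    return sentence.rstrip(_TERMINATORS).split()


if __name__ == "__main__":
    sample = "یہ ایک جملہ ہے۔ یہ دوسرا جملہ ہے۔"
    for sentence in split_sentences(normalize_chars(sample)):
        print(split_words(sentence))
```

Whitespace splitting is only a rough stand-in for word tokenization, since Urdu text often omits or misplaces spaces between words; that is one reason to prefer a dedicated library.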

🛠 Installation

Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.

Install with the CPU version of TensorFlow:

$ pip install "urduhack[tf]"

Install with the GPU version of TensorFlow:

$ pip install "urduhack[tf-gpu]"
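
Because the officially supported range is narrow, it can help to check the interpreter version before installing. The snippet below is a small sketch of such a check; the `is_supported` helper is hypothetical and not part of Urduhack, and the supported set simply mirrors the versions stated above.

```python
import sys

# Versions taken from the support statement above (Python 3.6 and 3.7).
SUPPORTED = {(3, 6), (3, 7)}


def is_supported(version=None):
    """Return True if the (major, minor) version is in the supported set.

    `version` may be any tuple starting with (major, minor); it defaults
    to the running interpreter's version.
    """
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED


if __name__ == "__main__":
    if not is_supported():
        print("Warning: this Python version is outside the officially supported range.")
```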

Usage

import urduhack

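# Download the models used by the pipeline
urduhack.download()

# Build the pipeline and run it on a piece of Urdu text
nlp = urduhack.Pipeline()
text = ""  # put the Urdu text to analyze here
doc = nlp(text)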

for sentence in doc.sentences:
    print(sentence.text)
    for word in sentence.words:
        print(f"{word.text}\t{word.pos}")

    for token in sentence.tokens:
        print(f"{token.text}\t{token.ner}")
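
If you want the results as data rather than printed lines, the loops above can be wrapped in small helpers. The sketch below relies only on the attribute names shown in the usage snippet (`sentences`, `words`, `tokens`, `text`, `pos`, `ner`); the helper names, the stand-in objects and the tag values are hypothetical, used here so the example runs without downloading any models.

```python
from types import SimpleNamespace


def collect_pos(doc):
    """Return, per sentence, a list of (word text, POS tag) pairs."""
    return [[(w.text, w.pos) for w in s.words] for s in doc.sentences]


def collect_ner(doc):
    """Return, per sentence, a list of (token text, NER tag) pairs."""
    return [[(t.text, t.ner) for t in s.tokens] for s in doc.sentences]


# Hypothetical stand-in for a pipeline result; the tag values are placeholders.
fake_doc = SimpleNamespace(
    sentences=[
        SimpleNamespace(
            text="w1 w2",
            words=[
                SimpleNamespace(text="w1", pos="TAG_A"),
                SimpleNamespace(text="w2", pos="TAG_B"),
            ],
            tokens=[
                SimpleNamespace(text="w1", ner="ENTITY"),
                SimpleNamespace(text="w2", ner="OTHER"),
            ],
        )
    ]
)

if __name__ == "__main__":
    print(collect_pos(fake_doc))
    print(collect_ner(fake_doc))
```

With a real pipeline you would pass the `doc` returned by `nlp(text)` instead of `fake_doc`.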

🔗 Documentation

Fantastic documentation is available at https://urduhack.readthedocs.io/

  • Installation: How to install Urduhack and download models.
  • Quickstart: New to Urduhack? Here's everything you need to know!
  • API Reference: The detailed reference for Urduhack's API.

👍 How to Contribute

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
  2. Write a test which shows that the bug was fixed or that the feature works as expected.
  3. Send a pull request and bug the maintainer until it gets merged and published. :)

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. [Become a sponsor]

📝 Copyright and license

Code released under the MIT License.
