HectorPulido / peque-nlu

Peque-NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts intents, features and information.

Peque NLU - Natural Language Understanding with Machine Learning

Peque-NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts intents, features and information.

For example, the Spanish input

quiero conocer el ultimo blogpost de unity

("I want to see the latest blog post about unity") gives the result:

Timing -> latest, Technology -> unity, Intention -> search

Features

  • Feature extraction from text
  • Algorithm agnostic: you can use SGD, MLNN, LLMs, Word2Vec, etc.
  • 100% Free and Open source

Use cases

  • Chatbots: detect the user's intention and extract features
  • Search engines: extract keywords and intention from the semantic content of a query
  • Data mining: classify text and unstructured data without boilerplate

Getting Started

Prerequisites

  • Python 3.6+

Installation

Warning

pip installation coming soon

  1. Clone this repo
git clone git@github.com:HectorPulido/peque-nlu.git
  2. Install the requirements
pip install -r requirements.txt
  3. Use the library
from peque_nlu.intent_engines import SGDIntentEngine
from peque_nlu.intent_classifiers import ModelIntentClassifier


# DATASET_PATH is the path to your training dataset (see the format under Usage)
intent_engine = SGDIntentEngine("spanish")
model = ModelIntentClassifier("spanish", intent_engine)
model.fit(DATASET_PATH)

prediction = model.multiple_predict(
    [
        "Hola como te encuentras?",
        "Quiero aprender sobre lo último de python",
        "describeme usando un meme",
    ]
)

assert len(prediction) == 3
first_prediction = prediction[0]
assert "intent" in first_prediction
assert "probability" in first_prediction
assert "text" in first_prediction
assert "features" not in first_prediction

assert first_prediction["intent"] == "small_talk"

Usage

Before you start, you need to provide the algorithm with a training dataset. You can use the following format as a base:

{
    "intents": {
        "small_talk": [
            "hola",
            ...

        ],
        "fun_phrases": [
            "eres gracioso",
            ...
        ],
        "meme": [
            "¿conoces algun buen meme?",
            ...
        ],
        "thanks": [
            "gracias",
            ...
        ]
    },
    "entities": {
        "technology": [
            "python",
            ...
        ],
        "timing": [
            "recient",
            ...
        ]
    }
}
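
As a minimal sketch, a dataset in this layout can be assembled and written to disk with Python's standard json module. The helper build_dataset and the file name dataset.json are illustrative assumptions and not part of Peque-NLU; the example phrases are taken from the format above.

```python
import json
import os
import tempfile


def build_dataset(intents, entities):
    """Return a dataset dict in the intents/entities layout shown above.

    Hypothetical helper: it only checks the shape, it is not a Peque-NLU API.
    """
    for section_name, section in (("intents", intents), ("entities", entities)):
        for label, examples in section.items():
            if not examples or not all(isinstance(e, str) for e in examples):
                raise ValueError(
                    f"{section_name}.{label} must be a non-empty list of strings"
                )
    return {"intents": intents, "entities": entities}


dataset = build_dataset(
    intents={"small_talk": ["hola"], "thanks": ["gracias"]},
    entities={"technology": ["python"]},
)

# Write the dataset as UTF-8 JSON; this path plays the role of DATASET_PATH.
dataset_path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(dataset_path, "w", encoding="utf-8") as handle:
    json.dump(dataset, handle, ensure_ascii=False, indent=4)
```

Passing ensure_ascii=False keeps accented characters readable in the file, which is convenient for Spanish training phrases.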

Once your dataset is in this format, you can load it and fit the model.

intent_engine = SGDIntentEngine("spanish")
model = ModelIntentClassifier("spanish", intent_engine)
model.fit(DATASET_PATH)

You can also save and load your models, which avoids retraining and saves time and resources.

# PickleSaver must be imported from the library before use.
# Save
saver = PickleSaver()
saver.save(intent_engine, PICKLE_PATH)

# Load (saver.load returns the saved engine, so no new engine is created first)
intent_engine_loaded = saver.load(PICKLE_PATH)
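
For intuition, saving and loading an object with Python's standard pickle module looks roughly like the sketch below. This illustrates the general technique under stated assumptions; it is not the PickleSaver implementation. The class SimplePickleSaver, the file name engine.pkl, and the dictionary standing in for a trained engine are all hypothetical.

```python
import os
import pickle
import tempfile


class SimplePickleSaver:
    """Hypothetical stand-in illustrating a pickle round trip."""

    def save(self, obj, path):
        # Serialize the object to a binary file.
        with open(path, "wb") as handle:
            pickle.dump(obj, handle)

    def load(self, path):
        # Only unpickle files you trust: pickle can execute arbitrary code.
        with open(path, "rb") as handle:
            return pickle.load(handle)


# A plain dict stands in for a trained engine in this sketch.
trained_engine = {"language": "spanish", "labels": ["small_talk", "thanks"]}

pickle_path = os.path.join(tempfile.mkdtemp(), "engine.pkl")
saver = SimplePickleSaver()
saver.save(trained_engine, pickle_path)
restored_engine = saver.load(pickle_path)
```

The restored object is an equal copy of the original, so training only has to happen once.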

Then you can predict the intent of a text and extract its features:

prediction = model.predict("quiero conocer el ultimo blogpost de unity")

Response:

{
    "intent": "search",
    "features": [
        {
            "word": "ultimo",
            "entity": "timing",
            "similarities": 1
        },
        {
            "word": "otro_ejemplo",
            "entity": "otra_entidad",
            "similarities": 0.9
        }
    ]
}
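
A response like this can be post-processed with plain Python. The sketch below groups the matched words by entity and keeps only matches above a similarity threshold. The helper features_by_entity and the 0.95 threshold are illustrative assumptions, not part of the library.

```python
def features_by_entity(prediction, min_similarity=0.95):
    """Group matched words by entity, keeping only confident matches.

    Hypothetical helper; the 0.95 default is an arbitrary illustration.
    """
    grouped = {}
    for feature in prediction.get("features", []):
        if feature["similarities"] >= min_similarity:
            grouped.setdefault(feature["entity"], []).append(feature["word"])
    return grouped


# The response shown above, as a Python dict.
prediction = {
    "intent": "search",
    "features": [
        {"word": "ultimo", "entity": "timing", "similarities": 1},
        {"word": "otro_ejemplo", "entity": "otra_entidad", "similarities": 0.9},
    ],
}

confident = features_by_entity(prediction)
print(prediction["intent"], confident)  # search {'timing': ['ultimo']}
```

A chatbot could then route on the intent and use the grouped entities as parameters for the handler.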

Contributing

Your contributions are greatly appreciated! Please follow these steps:

  1. Fork the project
  2. Create your feature branch: git checkout -b feature/MyFeature
  3. Commit your changes: git commit -m "my cool feature"
  4. Push to the branch: git push origin feature/MyFeature
  5. Open a Pull Request

License

All base code written by me is released under the MIT license.

Contact


Let's connect 😋

Hector's LinkedIn     Hector's Twitter     Hector's Twitch     Hector's Youtube     Pequesoft website    
