rexlow / CuriousKid

Curious Kid

A proof-of-concept (POC) repository to get some ideas out of my head. Some of the work that will be included in this repository:

  1. Important word extraction
  2. Identify important word segments from a sentence
  3. Tokenization and Part-of-Speech (POS) tagging with spaCy
  4. Identify clauses and verbs
  5. NER tagger
  6. Generate questions from text blobs
  7. Maybe deep learning approach?
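
To make the first item concrete, here is a minimal sketch of important word extraction. It is not the method used by importance.py, which relies on the downloaded encoders and word vectors; it swaps in a plain term-frequency count with a small stopword filter, using only the standard library. The function name `important_words`, the `top_n` parameter, and the stopword list are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not taken from the repository).
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "was", "were", "with",
})


def important_words(text, top_n=3):
    """Return the top_n most frequent non-stopword tokens in text.

    Hypothetical helper: a frequency baseline, not the repository's approach.
    """
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    # most_common sorts by count; ties keep first-seen order.
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "The cat chased the mouse, and the mouse escaped from the cat. The cat slept."
    print(important_words(sample, top_n=2))
```

A frequency count like this ignores meaning entirely, which is presumably why an embedding-based approach is worth exploring instead.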

To build

Detailed instructions will be added when the work is done.

Download Encoders and Word Vectors

bash download_importance.sh

Install dependencies

pip3 install -r requirements.txt

Install spaCy models

Pick either en_core_web_sm or en_core_web_trf for the named entity recognition task.

python -m spacy download en_core_web_sm
python -m spacy download en_core_web_trf
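
Once a model is downloaded, the following sketch shows one way to pick whichever of the two models is installed and run POS tagging and NER with it. The helper names (`pick_model`, `tag_text`), the preference for the transformer model, and the sample sentence are assumptions for illustration; only `spacy.load`, `token.pos_`, `doc.ents`, and `ent.label_` are standard spaCy API.

```python
import importlib.util

# Model names come from this README; preferring the transformer model is an assumption.
CANDIDATE_MODELS = ("en_core_web_trf", "en_core_web_sm")


def is_installed(name):
    """Return True if a Python package with this name can be found."""
    return importlib.util.find_spec(name) is not None


def pick_model(candidates=CANDIDATE_MODELS, installed=is_installed):
    """Return the first candidate model that is installed, or None."""
    for name in candidates:
        if installed(name):
            return name
    return None


def tag_text(text, model_name):
    """Return (token, POS) pairs and (entity, label) pairs for text."""
    import spacy  # imported lazily so the helpers above work without spaCy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    tokens = [(token.text, token.pos_) for token in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


if __name__ == "__main__":
    name = pick_model()
    if name is None:
        print("No spaCy model found; run one of the download commands first.")
    else:
        tokens, entities = tag_text("Marie Curie was born in Warsaw.", name)
        print(tokens)
        print(entities)
```

The small model loads quickly and is usually enough for experimenting; the transformer model is larger and slower but generally more accurate on NER.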

Usage

Important word extraction

python3 importance.py
