upura / nlp-recipes-ja

Sample code for natural language processing in Japanese

NLP Recipes for Japanese

This repository contains sample code for natural language processing in Japanese. It is heavily inspired by microsoft/nlp-recipes.

Content

The following is a summary of the commonly used NLP scenarios covered in the repository. Each scenario is demonstrated in one or more scripts or Jupyter notebook examples that make use of the core code base of models and repository utilities.

| Category | Methods |
| --- | --- |
| Basic | Cleaning, Normalization, Stopwords, Sentence Segmentation, Ruby |
| Embeddings | Word2Vec, fastText, Universal Sentence Encoder |
| Feature Engineering | Bag-of-Words, TF-IDF, BM25, SWEM, SCDV |
| Morphological Analysis | Konoha, nagisa |
| Sentence Similarity | Cosine Similarity |
| Sentiment Analysis | oseti |
| Text Classification | TF-IDF & Logistic Regression, TF-IDF & LightGBM, BERT, T5 |
| Visualization | Visualization with Japanese texts |
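
As a rough illustration of two of the listed methods (Bag-of-Words and Cosine Similarity), the following is a minimal, dependency-free sketch. It is not taken from the repository: the function names `char_ngrams`, `bag_of_words`, and `cosine_similarity` are hypothetical, and character bigrams are assumed here as a stand-in for a real morphological analyzer such as Konoha or nagisa.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 2) -> list:
    """Split text into overlapping character n-grams.

    Assumption: this replaces proper Japanese tokenization,
    which would normally come from a morphological analyzer.
    """
    text = text.strip()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bag_of_words(text: str) -> Counter:
    """Count n-gram occurrences to form a sparse bag-of-words vector."""
    return Counter(char_ngrams(text))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    # Counter returns 0 for missing keys, so unshared n-grams contribute nothing.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sentences = [
    "今日は天気が良い",
    "今日は天気が悪い",
    "自然言語処理を勉強する",
]
vectors = [bag_of_words(s) for s in sentences]

# The first two sentences share most bigrams; the third shares none with the first.
sim_close = cosine_similarity(vectors[0], vectors[1])
sim_far = cosine_similarity(vectors[0], vectors[2])
print(f"{sim_close:.3f} {sim_far:.3f}")
```

Swapping `char_ngrams` for the output of a tokenizer would keep the rest of the sketch unchanged, since `bag_of_words` only needs a list of tokens.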

Environment

docker-compose up -d --build
docker exec -it nlp-recipes-ja bash

Languages

Python 98.9%, Dockerfile 1.1%