chck / AugLy-jp

Data Augmentation for Japanese Text on AugLy

Augmenter

base_text = "あらゆる現実をすべて自分のほうへねじ曲げたのだ"

| Augmenter | Augmented | Description |
|---|---|---|
| SynonymAugmenter | あらゆる現実をすべて自身のほうへねじ曲げたのだ | Substitutes a word with a similar one from the Sudachi synonym dictionary |
| WordEmbsAugmenter | あらゆる現実をすべて関心のほうへねじ曲げたのだ | Leverages word2vec, GloVe or fastText embeddings to apply augmentation |
| FillMaskAugmenter | つまり現実を、未来な未来まで変えたいんだ | Uses a masked language model to generate text |
| BackTranslationAugmenter | そして、ほかの人たちをそれぞれの道に安置しておられた | Leverages two translation models for augmentation |
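
To illustrate the idea behind synonym substitution, here is a minimal, self-contained sketch. It does not use the AugLy-jp API: `substitute_synonyms` and `TOY_SYNONYMS` are hypothetical names, and the one-entry table stands in for the Sudachi synonym dictionary. It also matches on raw substrings rather than tokenizing the sentence, which a real augmenter would be expected to do.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration only);
# the real augmenter draws on the Sudachi synonym dictionary.
TOY_SYNONYMS = {"自分": ["自身"]}


def substitute_synonyms(text, synonyms, rng, p=1.0):
    """Replace each known word in `text` with a random synonym, with probability `p`."""
    out = text
    for word, candidates in synonyms.items():
        # rng.random() is in [0, 1), so p=1.0 always substitutes.
        if word in out and rng.random() < p:
            out = out.replace(word, rng.choice(candidates))
    return out


base_text = "あらゆる現実をすべて自分のほうへねじ曲げたのだ"
augmented = substitute_synonyms(base_text, TOY_SYNONYMS, random.Random(0))
print(augmented)
```

With the toy table above, the only possible substitution is 自分 → 自身, which reproduces the SynonymAugmenter row of the table.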

Prerequisites

| Software | Install Command |
|---|---|
| Python 3.8.11 | `pyenv install 3.8.11` |
| Poetry 1.1.* | `curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py \| python` |

Get Started

Installation

pip install augly-jp

Or clone this repository:

git clone https://github.com/chck/AugLy-jp.git
cd AugLy-jp
poetry install
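
To confirm that either route succeeded, one option is to ask Python for the installed distribution's version. This is a minimal sketch using only the standard library (Python 3.8+). `installed_version` is a hypothetical helper, and the distribution name `augly-jp` is assumed to match the pip command above. For the Poetry route, it would need to run inside the project's environment.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of `dist_name`, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "augly-jp" is assumed to be the distribution name used on PyPI.
print(installed_version("augly-jp"))
```

A result of `None` means the distribution is not visible to the interpreter that ran the check.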

Test with reformat

poetry run task test

Reformat

poetry run task fmt

Lint

poetry run task lint

Inspired

License

This software is released under the MIT License. It includes work that is distributed under the Apache License 2.0 [1].