Marley-Mulvin-Broome / JapaneseFrequencyProcessor

A Japanese word frequency processor written in Python


JPFreq


JPFreq is a word frequency processor for Japanese text. It tokenizes text with Fugashi, a Cython wrapper for MeCab, and counts how often each word appears.
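
To make the counting step concrete, here is a minimal sketch that works on a list of words that has already been tokenized. The token list is written by hand to stand in for tokenizer output, and `count_words` is a hypothetical helper; this is an illustration of the idea, not JPFreq's actual implementation.

```python
from collections import Counter


def count_words(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


# Hand-written stand-in for tokenizer output (an assumption, not real MeCab output).
tokens = ["私", "は", "猫", "です", "。", "猫", "は", "かわいい", "です", "。"]

# Show the three most frequent tokens.
print(count_words(tokens)[:3])
```

A real tokenizer also has to decide what counts as one word (for example, whether inflected forms are grouped together), which is the part a MeCab-based library handles.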

Installation

  1. Install Fugashi and UniDic (the quotes keep shells such as zsh from interpreting the brackets)
    pip install 'fugashi[unidic]'
    python3 -m unidic download
  2. Install JPFreq
    pip install jpfreq
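
After installing, a quick check like the following sketch can confirm that the packages are visible to the Python interpreter in use. `missing_packages` is a hypothetical helper built on the standard library; it only checks that the modules can be found, not that the UniDic dictionary data finished downloading.

```python
from importlib.util import find_spec


def missing_packages(names=("fugashi", "unidic", "jpfreq")):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("All packages found.")
```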

Usage

For detailed usage, see the documentation.

Getting the most frequent words

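from jpfreq.jp_frequency_list import JapaneseFrequencyList

# Create an empty frequency list and feed it one line of text
freq_list = JapaneseFrequencyList()
freq_list.process_line("私は猫です。")

# Print the most frequent words seen so far
print(freq_list.get_most_frequent())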

Reading from a file

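from jpfreq.jp_frequency_list import JapaneseFrequencyList

# Create an empty frequency list and process the contents of a text file
freq_list = JapaneseFrequencyList()
freq_list.process_file("path/to/file.txt")

# Print the most frequent words found in the file
print(freq_list.get_most_frequent())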
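
If you want explicit control over the file encoding or want to skip blank lines, one option is to read the file yourself and pass each line to `process_line`. The sketch below assumes only that the object has a `process_line` method, as shown in the first example; `process_nonblank_lines` is a hypothetical helper, not part of JPFreq, and how `process_file` itself treats encodings and blank lines is not covered here.

```python
def process_nonblank_lines(freq_list, path, encoding="utf-8"):
    """Feed each non-blank line of a text file to freq_list.process_line.

    Returns the number of lines that were processed.
    """
    processed = 0
    with open(path, encoding=encoding) as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip empty and whitespace-only lines
                freq_list.process_line(line)
                processed += 1
    return processed
```

With JPFreq installed, this could be called as `process_nonblank_lines(JapaneseFrequencyList(), "path/to/file.txt")`, followed by `get_most_frequent()` as above.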

License

MIT License

