jonghwanhyeon / python-mecab-ko

A Python binding for mecab-ko

Add a method that returns all the original features.

Kyubyong opened this issue

Hi,

The original mecab-ko returns many features for each token, but this repo and konlpy extract only the first one, the POS tag.
I think it would be useful to add a method that returns all the features, like this:

def analyze(self, sentence):
    lattice = _create_lattice(sentence)
    if not self.tagger.parse(lattice):
        raise MeCabError(self.tagger.what())

    # Return the raw feature string for every node, not just the POS tag
    return [
        (node.surface, node.feature)
        for node in lattice
    ]
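
For illustration, here is a rough sketch of how a raw feature string could be split into named fields. It assumes the mecab-ko-dic layout of eight comma-separated fields with '*' marking an empty value. RawFeature, FIELDS and split_feature are hypothetical names made up for this example, not part of this repo or of mecab-ko.

```python
from collections import namedtuple

# Hypothetical field names, assuming the eight-column mecab-ko-dic layout.
FIELDS = [
    'pos', 'semantic', 'has_jongseong', 'reading',
    'type', 'start_pos', 'end_pos', 'expression',
]
RawFeature = namedtuple('RawFeature', FIELDS)


def split_feature(feature):
    """Split a raw feature string into a RawFeature (illustrative helper)."""
    # A bare '*' means "no value"; map it to None.
    values = [None if value == '*' else value for value in feature.split(',')]
    # Pad short rows so every field is present, and ignore any extra columns.
    values += [None] * (len(FIELDS) - len(values))
    return RawFeature(*values[:len(FIELDS)])


example = split_feature('VA+ETM,*,T,즐거운,Inflect,VA,ETM,즐겁/VA/*+ᆫ/ETM/*')
print(example.pos, example.expression)
```

Note that has_jongseong stays as the raw 'T' or 'F' string here; converting it to a boolean would be a separate design choice.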

Hello!

I added a parse(sentence) method. It returns a list of (surface, Feature) pairs, where Feature is a namedtuple.

Please refer to the example snippet below!

import mecab

# Use a separate name so the instance does not shadow the mecab module
tagger = mecab.MeCab()

tagger.parse('즐거운 하루 보내세요!')
# [
#     ('즐거운', Feature(
#         pos='VA+ETM', semantic=None, has_jongseong=True, reading='즐거운',
#         type='Inflect', start_pos='VA', end_pos='ETM',
#         expression='즐겁/VA/*+ᆫ/ETM/*')),
#     ('하루', Feature(
#         pos='NNG', semantic=None, has_jongseong=False, reading='하루',
#         type=None, start_pos=None, end_pos=None,
#         expression=None)),
#     ('보내', Feature(
#         pos='VV', semantic=None, has_jongseong=False, reading='보내',
#         type=None, start_pos=None, end_pos=None,
#         expression=None)),
#     ('세요', Feature(
#         pos='EP+EF', semantic=None, has_jongseong=False, reading='세요',
#         type='Inflect', start_pos='EP', end_pos='EF',
#         expression='시/EP/*+어요/EF/*')),
#     ('!', Feature(
#         pos='SF', semantic=None, has_jongseong=None, reading=None,
#         type=None, start_pos=None, end_pos=None,
#         expression=None))
# ]
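
As a rough sketch of how output shaped like the above might be consumed, the snippet below filters tokens by POS prefix and splits the expression field into its component morphemes. To stay runnable without mecab installed, it redefines a stand-in Feature namedtuple with the field names shown above and copies part of the sample output by hand. content_words and morphemes are hypothetical helpers, not part of the library.

```python
from collections import namedtuple

# Stand-in for the library's Feature, mirroring the fields in the output above.
Feature = namedtuple('Feature', [
    'pos', 'semantic', 'has_jongseong', 'reading',
    'type', 'start_pos', 'end_pos', 'expression',
])

# Hand-copied subset of the sample output, used in place of a real parse() call.
parsed = [
    ('즐거운', Feature('VA+ETM', None, True, '즐거운',
                    'Inflect', 'VA', 'ETM', '즐겁/VA/*+ᆫ/ETM/*')),
    ('하루', Feature('NNG', None, False, '하루', None, None, None, None)),
    ('세요', Feature('EP+EF', None, False, '세요',
                   'Inflect', 'EP', 'EF', '시/EP/*+어요/EF/*')),
    ('!', Feature('SF', None, None, None, None, None, None, None)),
]


def content_words(pairs, prefixes=('NN', 'VV', 'VA')):
    """Keep surfaces whose POS tag starts with one of the given prefixes."""
    return [surface for surface, feature in pairs
            if feature.pos.startswith(prefixes)]


def morphemes(feature):
    """Split 'a/TAG/*+b/TAG/*' into [(form, tag), ...]; empty if no expression."""
    if feature.expression is None:
        return []
    return [tuple(part.split('/')[:2]) for part in feature.expression.split('+')]


print(content_words(parsed))
print(morphemes(parsed[0][1]))
```

Splitting on '+' and '/' is a simplification that assumes neither character appears inside a morpheme itself.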

Thanks. Awesome!