Hyunjoong Kim (lovit)

Company: Naver Corp.

Home Page: https://lovit.github.io


Organizations
ko-nlp

Hyunjoong Kim's repositories

soynlp

A Python library for Korean natural language processing. It provides word extraction, tokenization, part-of-speech tagging, and preprocessing.

Language: Python | License: NOASSERTION | Stargazers: 905 | Issues: 45 | Issues: 114

KR-WordRank

A library that automatically extracts words and keywords from Korean text using unsupervised learning.

Language: Python | License: NOASSERTION | Stargazers: 343 | Issues: 11 | Issues: 12

textrank

Implementation of TextRank and related utilities

Language: Python | License: MIT | Stargazers: 83 | Issues: 2 | Issues: 2

huggingface_konlpy

Training Transformers of Huggingface with KoNLPy

Language: Jupyter Notebook | Stargazers: 67 | Issues: 6 | Issues: 2

KoBERTScore

BERTScore for Korean

WordPieceModel

A lightweight Python version of the WordPiece model with tokenize/save/load functions

namuwikitext

Wikitext-format dataset of Namuwiki (the most famous Korean-language wiki)

naver_news_search_scraper

Python code that collects Naver News articles and their comments for a given search query

soykeyword

Python library for keyword extraction

clustering4docs

Clustering algorithm library. Implements spherical k-means.

Language: Python | License: GPL-3.0 | Stargazers: 36 | Issues: 4 | Issues: 7

naver_movie_scraper

A collector for Naver Movie information and user-written movie reviews and ratings

kmrd

Synthetic dataset for recommender systems, created from the Naver Movie rating system

python_ml_intro

Fast Campus: practice code for an introduction to machine learning with Python

Language: Jupyter Notebook | Stargazers: 21 | Issues: 8 | Issues: 0

levenshtein_finder

Similar string search using Levenshtein distance

petitions_archive

Data archive of Blue House (Cheong Wa Dae) national petitions

synthetic_dataset

Synthetic data generator for machine learning

Language: Python | Stargazers: 15 | Issues: 4 | Issues: 0

pycrfsuite_spacing

A Korean spacing corrector built with python-crfsuite

Language: Python | Stargazers: 13 | Issues: 3 | Issues: 0

kmeans_to_pyLDAvis

Visualizing k-means using pyLDAvis

flask_api_tutorial

A tutorial for building an API with Flask

text-dedup

Python package for memory-friendly text de-duplication

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 2 | Issues: 0

python_upload_webserver

Flask- and Waitress-based file upload web server

python-stopwatch

Python stopwatch

tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Language: Rust | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

kwnlp-sql-parser

Utilities for parsing Wikipedia MySQL/MariaDB dumps.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

parallelformers

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

wikiextractor

A tool for extracting plain text from Wikipedia dumps

Language: Python | License: AGPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0