Takahashi Kanji's starred repositories

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34015 | Issues: 0

wikipedia-utils

Utility scripts for preprocessing Wikipedia texts for NLP

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 0

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python | License: MIT | Stargazers: 33937 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C++ | License: MIT | Stargazers: 33345 | Issues: 0

ibis

The portable Python dataframe library

Language: Python | License: Apache-2.0 | Stargazers: 4651 | Issues: 0

entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.

Language: Python | License: MIT | Stargazers: 1463 | Issues: 0

jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Language: Python | License: Apache-2.0 | Stargazers: 576 | Issues: 0

kwja

An integrated Japanese analyzer based on foundation models

Language: Python | License: MIT | Stargazers: 119 | Issues: 0

RapidFuzz

Rapid fuzzy string matching in Python using various string metrics

Language: C++ | License: MIT | Stargazers: 2510 | Issues: 0

pytorch-partial-crf

CRF, Partial CRF and Marginal CRF in PyTorch

Language: Python | License: MIT | Stargazers: 31 | Issues: 0

applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

License: MIT | Stargazers: 26852 | Issues: 0

gokart

Gokart addresses reproducibility, task dependencies, the constraints of good code, and ease of use for machine learning pipelines.

Language: Python | License: MIT | Stargazers: 302 | Issues: 0

ML-Workflow-with-SageMaker-and-StepFunctions

Example of ML Workflow using SageMaker and StepFunctions

Language: Python | License: MIT | Stargazers: 3 | Issues: 0

lit-NER

TorchServe+Streamlit for easily serving your HuggingFace NER models

Language: Python | License: NOASSERTION | Stargazers: 32 | Issues: 0

sample-codes-for-aiml

Sample code for AI/ML using AWS is published here.

Language: Jupyter Notebook | License: MIT | Stargazers: 8 | Issues: 0

PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python | License: AGPL-3.0 | Stargazers: 4687 | Issues: 0

best-of-ml-python

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

License: CC-BY-SA-4.0 | Stargazers: 16100 | Issues: 0

OpenNRE

An Open-Source Package for Neural Relation Extraction (NRE)

Language: Python | License: MIT | Stargazers: 4285 | Issues: 0

text

Models, data loaders and abstractions for language processing, powered by PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 3482 | Issues: 0

LSTM-CRF-pytorch-faster

A parallelized LSTM-CRF implementation, more than 1000x faster than the version in the official PyTorch tutorial (https://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html) from which it was modified.

Language: Python | Stargazers: 204 | Issues: 0

ripgrep

ripgrep recursively searches directories for a regex pattern while respecting your gitignore

Language: Rust | License: Unlicense | Stargazers: 46610 | Issues: 0

ner-wikipedia-dataset

A Japanese named entity recognition dataset built from Wikipedia

License: NOASSERTION | Stargazers: 129 | Issues: 0

100-nlp-papers

100 Must-Read NLP Papers

Stargazers: 3718 | Issues: 0

UD_Japanese-GSD

Japanese data from the Google UDT 2.0.

Language: Python | License: NOASSERTION | Stargazers: 28 | Issues: 0

awesome-mlops

A curated list of references for MLOps

Stargazers: 12312 | Issues: 0

inappropriate-words-ja

A collection of inappropriate expressions in Japanese. It should be useful for tasks such as data cleaning in natural language processing.

Language: Python | License: MIT | Stargazers: 159 | Issues: 0

cudf

cuDF - GPU DataFrame Library

Language: C++ | License: Apache-2.0 | Stargazers: 8080 | Issues: 0