Naoto Shimakoshi (shimacos37)

Company: LayerX

Twitter: @nt_4o54

Naoto Shimakoshi's starred repositories

Kindai-OCR

OCR system for recognizing modern Japanese magazines

Language: Python | Stargazers: 132 | Issues: 0

bm25s

Fast lexical search library implementing BM25 in Python using Scipy (on average 2x faster than Elasticsearch in single-threaded setting)

Language: Python | License: MIT | Stargazers: 650 | Issues: 0

SelfDocSeg

[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)

Language: Python | License: GPL-3.0 | Stargazers: 33 | Issues: 0

fastapi-tips

FastAPI Tips by The FastAPI Expert!

Stargazers: 1756 | Issues: 0

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | Stargazers: 9146 | Issues: 0

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9236 | Issues: 0

cruft

Allows you to maintain all the necessary cruft for packaging and building projects separate from the code you intentionally write. Built on top of, and fully compatible with, CookieCutter.

Language: Python | License: MIT | Stargazers: 1194 | Issues: 0

automlops

Build MLOps Pipelines in Minutes

Language: Python | License: Apache-2.0 | Stargazers: 143 | Issues: 0

Awesome-LLM4IE-Papers

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

Stargazers: 584 | Issues: 0

stabilizer

Stabilize and achieve excellent performance with transformers

Language: Python | License: Apache-2.0 | Stargazers: 41 | Issues: 0

DAWG

DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library.

Language: C++ | License: MIT | Stargazers: 296 | Issues: 0

usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

Language: C++ | License: Apache-2.0 | Stargazers: 1955 | Issues: 0

Language: Jupyter Notebook | Stargazers: 156 | Issues: 0

simple-simcse-ja

Exploring Japanese SimCSE

Language: Python | Stargazers: 57 | Issues: 0

doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Language: Python | License: Apache-2.0 | Stargazers: 3363 | Issues: 0

price-parser

Extract price amount and currency symbol from a raw text string

Language: Python | License: BSD-3-Clause | Stargazers: 308 | Issues: 0

namedivider-python

A tool for dividing the Japanese full name into a family name and a given name.

Language: Python | License: MIT | Stargazers: 237 | Issues: 0

mtl

Unofficial implementation of: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics

Language: Python | License: BSD-2-Clause | Stargazers: 528 | Issues: 0

ballerine

Open-source infrastructure and data orchestration platform for risk decisioning

Language: TypeScript | License: NOASSERTION | Stargazers: 2012 | Issues: 0

HojiChar

The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.

Language: Python | License: Apache-2.0 | Stargazers: 112 | Issues: 0

tabular-dl-tabr

The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"

Language: Python | License: MIT | Stargazers: 243 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19193 | Issues: 0

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python | License: MIT | Stargazers: 5552 | Issues: 0

LightGBMLSS

An extension of LightGBM to probabilistic modelling

Language: Python | License: Apache-2.0 | Stargazers: 251 | Issues: 0

SportsLabKit

A Python package for turning sports video into CSV files

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 220 | Issues: 0

machine-learning-round-table

Gather around the table and have a discussion to catch up on the latest trends in machine learning 🤖

Stargazers: 299 | Issues: 0

skypilot

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

Language: Python | License: Apache-2.0 | Stargazers: 6293 | Issues: 0

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | Stargazers: 5616 | Issues: 0

whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4261 | Issues: 0

StableLM

StableLM: Stability AI Language Models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 15849 | Issues: 0