Kazuhiro Homma's starred repositories

SiFi-VITS2-44100-Ja

DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.

Language: Python | License: MIT | Stargazers: 47 | Issues: 0

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 18536 | Issues: 0

EternalTerminal

Re-Connectable secure remote shell

Language: C++ | License: Apache-2.0 | Stargazers: 2973 | Issues: 0

chat-gpt-jupyter-extension

A browser extension to provide various AI helper functions in Jupyter Notebooks, powered by ChatGPT.

Language: TypeScript | License: GPL-3.0 | Stargazers: 302 | Issues: 0

sd-webui-xldemo-txt2img

Stable Diffusion XL 0.9 Demo webui extension

Language: Python | License: MIT | Stargazers: 194 | Issues: 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook | License: BSD-2-Clause | Stargazers: 3221 | Issues: 0

whisper-asr-webservice

OpenAI Whisper ASR Webservice API

Language: Python | License: MIT | Stargazers: 1947 | Issues: 0

VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Language: Python | License: Apache-2.0 | Stargazers: 4684 | Issues: 0

vits-finetuning

Fine-Tuning your VITS model using a pre-trained model

Language: Python | License: MIT | Stargazers: 541 | Issues: 0

manga-ocr

Optical character recognition for Japanese text, with the main focus being Japanese manga

Language: Python | License: Apache-2.0 | Stargazers: 1578 | Issues: 0

Language: TypeScript | License: MIT | Stargazers: 683 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10618 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 9095 | Issues: 0

DDSP-SVC

Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)

Language: Python | License: MIT | Stargazers: 1771 | Issues: 0

sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!

Language: C++ | License: MIT | Stargazers: 1663 | Issues: 0

Language: Jupyter Notebook | License: Unlicense | Stargazers: 515 | Issues: 0

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

Language: Python | License: Apache-2.0 | Stargazers: 42047 | Issues: 0

PaddleOCR-ONNX-Sample

A sample of ONNX inference for PaddleOCR in Python

Language: Python | License: Apache-2.0 | Stargazers: 39 | Issues: 0

sd-webui-controlnet

WebUI extension for ControlNet

Language: Python | License: GPL-3.0 | Stargazers: 16732 | Issues: 0

adetailer

Auto detecting, masking and inpainting with detection model.

Language: Python | License: AGPL-3.0 | Stargazers: 4045 | Issues: 0

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | Stargazers: 64024 | Issues: 0

llm.cpp

Fork of llama.cpp, extended for GPT-NeoX, RWKV-v4, and Falcon models

Language: C++ | License: MIT | Stargazers: 28 | Issues: 0

speech-to-text-webcam-overlay

A web page that overlays subtitles, generated by Web Speech API speech recognition, onto webcam video

Language: JavaScript | License: CC0-1.0 | Stargazers: 313 | Issues: 0

speech-to-text

Real-time transcription using faster-whisper

Language: HTML | License: MIT | Stargazers: 355 | Issues: 0

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | License: MIT | Stargazers: 4267 | Issues: 0

text-generation-webui

A Gradio web UI for Large Language Models.

Language: Python | License: AGPL-3.0 | Stargazers: 39158 | Issues: 0

whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4322 | Issues: 0

rvc-webui

liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project

Language: Python | License: MIT | Stargazers: 478 | Issues: 0

openai-cookbook

Examples and guides for using the OpenAI API

Language: MDX | License: MIT | Stargazers: 58239 | Issues: 0

awesome-keyword-spotting

This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).

License: MIT | Stargazers: 233 | Issues: 0