xiexukang

Master's student at JiangNan University

China

kevin_up's repositories

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT

3D-Speaker

A repository for single- and multi-modal speaker verification, speaker recognition and speaker diarization.

License: Apache-2.0

awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

License: Apache-2.0

awesome-cpp

A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.

License: MIT

Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

License: CC0-1.0

awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

License: MIT

awesome-ncnn

😎 A Collection of Awesome NCNN-based Projects


Cantonese-learning

Cantonese learning materials

ChatWaifu_Mobile

Mobile version of an anime-style AI waifu chat app

License: MIT

code-switching-papers

A curated list of research papers and resources on code-switching

License: Apache-2.0

ctc_decoder

A CTC decoder for both online and offline ASR models

data2vec-pytorch

PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI

License: MIT

espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

License: NOASSERTION

expert_readed_books

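Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ** category, mathematics, and biographies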

FastASR

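This project implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference fast; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is the open-source WenetSpeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.

License: Apache-2.0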

FastDeploy

⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud 📱Mobile and 📹Edge. Including Image, Video, Text and Audio 20+ mainstream scenarios and 150+ SOTA models with end-to-end optimization, multi-platform and multi-framework support.

License: Apache-2.0

findpapers

Findpapers: A tool for helping researchers who are looking for related works

License: MIT

Grounded-Segment-Anything

Segment anything

License: Apache-2.0

json

JSON for Modern C++

License: MIT

keyword-spot

End-to-end keyword spotting (voice wake-up) toolkit, from model training to model inference.

License: MIT

myblog

myblog, powered by Django and xadmin

Language: Python

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

License: NOASSERTION

OpenAI_Whisper_ASR

A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models

License: MIT

pocolm

Small language toolkit for creation, interpolation and pruning of ARPA language models

License: NOASSERTION

sherpa-ncnn

Real-time speech recognition using next-gen Kaldi with ncnn

License: NOASSERTION

torchaudio

Data manipulation and transformation for audio signal processing, powered by PyTorch

License: BSD-2-Clause

wenet_trt8

License: Apache-2.0

wespeaker


WeTextProcessing

License: Apache-2.0

whisper

License: MIT