zlh11

zlh11's starred repositories

Language: Python | License: MIT | Stargazers: 188 | Issues: 0

libfacedetection

An open-source library for face detection in images. Detection speed can reach 1000 FPS.

Language: C++ | License: NOASSERTION | Stargazers: 12201 | Issues: 0

libfacedetection.train

The training program for libfacedetection for face detection and 5-landmark detection.

Language: Python | License: Apache-2.0 | Stargazers: 753 | Issues: 0

Latex-Paper-Templates

LaTeX paper templates, including Elsevier, arXiv and IEEE Access.

Language: TeX | Stargazers: 222 | Issues: 0

conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Language: Python | License: Apache-2.0 | Stargazers: 922 | Issues: 0

books

Collected eBooks

Stargazers: 103 | Issues: 0

REDPlayer

the REDPlayer :)

Language: C | License: LGPL-3.0 | Stargazers: 222 | Issues: 0

Language: C | Stargazers: 4 | Issues: 0

spider_collection

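Python web crawlers. Current collection: NetEase Cloud Music song scraping, Bilibili video scraping, Zhihu Q&A scraping, wallpaper scraping, xvideos video scraping, audiobook scraping, Weibo crawler, Anjuke listing scraping with data visualization, Bilibili video cover extractor, IP proxy pool wrapper, Zhihu million-user crawler with data analysis, GitHub user crawler.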

Language: Python | License: MIT | Stargazers: 1178 | Issues: 0

P.808

This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).

Language: HTML | License: MIT | Stargazers: 199 | Issues: 0

abseil-cpp

Abseil Common Libraries (C++)

Language: C++ | License: Apache-2.0 | Stargazers: 14573 | Issues: 0

protobuf

Protocol Buffers - Google's data interchange format

Language: C++ | License: NOASSERTION | Stargazers: 64829 | Issues: 0

gpac

GPAC Ultramedia OSS for Video Streaming & Next-Gen Multimedia Transcoding, Packaging & Delivery

Language: C | License: LGPL-2.1 | Stargazers: 2677 | Issues: 0

fdk-aac

A standalone library of the Fraunhofer FDK AAC code from Android.

Language: C++ | License: NOASSERTION | Stargazers: 1164 | Issues: 0

opus

Modern audio compression for the internet.

Language: C | License: NOASSERTION | Stargazers: 2204 | Issues: 0

Awesome-Music-Recommendation-Datasets

Awesome Datasets for Music Recommendation

Stargazers: 64 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 84 | Issues: 0

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python | License: MIT | Stargazers: 727 | Issues: 0

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

License: NOASSERTION | Stargazers: 26078 | Issues: 0

Language: Python | License: MIT | Stargazers: 11 | Issues: 0

pystoi

Python implementation of the Short-Term Objective Intelligibility (STOI) measure

Language: MATLAB | License: MIT | Stargazers: 317 | Issues: 0

conferencing-speech-2022

Source code for the LCN submission to the ConferencingSpeech2022 challenge.

Language: Python | License: MIT | Stargazers: 13 | Issues: 0

WebRTC-audio-processing

WebRTC audio processing

Language: C++ | Stargazers: 369 | Issues: 0

DNN-binaural-localization

Simulate directional sound – deep neural network (DNN) – layer-wise relevance propagation (LRP)

Language: Jupyter Notebook | License: MIT | Stargazers: 2 | Issues: 0

SimpleVQA

A Deep Learning based No-reference Quality Assessment Model for UGC Videos

Language: Python | License: Apache-2.0 | Stargazers: 55 | Issues: 0

SNR-Estimation-Using-Deep-Learning

An implementation of frame-level speech signal-to-noise ratio (SNR) estimation using deep learning

Language: Jupyter Notebook | License: MIT | Stargazers: 28 | Issues: 0

BinauralLocalizationCNN

Code to create networks that localize sound sources in 3D environments

Language: Python | Stargazers: 37 | Issues: 0

sound-source-localization-algorithm_DOA_estimation

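A collection of traditional algorithms for sound source localization and direction of arrival (DOA) estimation of speech signals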

Language: MATLAB | License: Apache-2.0 | Stargazers: 348 | Issues: 0

BinauralSDM

This repository contains a set of tools to render Binaural Room Impulse Responses (BRIRs) using the Spatial Decomposition Method (SDM). The implementation features a series of improvements presented in Amengual et al. 2020, such as quantization of the direction of arrival (DOA) estimates to improve the spectral properties of the rendered BRIRs, and RTMod and RTMod+AP equalization for the late reverberation. The repository also contains the files needed to 3D print an array holder with a topology optimized for estimating DOA information.

Language: MATLAB | License: CC-BY-4.0 | Stargazers: 44 | Issues: 0

Language: Matlab | Stargazers: 148 | Issues: 0