Beast code in Giters

zlh11's starred repositories

libfacedetection

An open-source library for face detection in images. The face detection speed can reach 1000 FPS.

Language: C++ | License: NOASSERTION | 1220100

libfacedetection.train

The training program for libfacedetection for face detection and 5-landmark detection.

Language: Python | License: Apache-2.0 | 75300

Latex-Paper-Templates

LaTeX paper templates, including Elsevier, arXiv and IEEE Access.

Language: TeX | 22200

conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Language: Python | License: Apache-2.0 | 92200

spider_collection

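Python web crawlers. Current collection: NetEase Cloud Music song scraping, Bilibili video scraping, Zhihu Q&A scraping, wallpaper scraping, xvideos video scraping, audiobook scraping, a Weibo crawler, Anjuke listing scraping with data visualization, a Bilibili video cover extractor, an IP proxy pool wrapper, a Zhihu million-user crawler with data analysis, and a GitHub user crawler.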

Language: Python | License: MIT | 117800

An open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform and implements Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).

Language: HTML | License: MIT | 19900

abseil-cpp

Abseil Common Libraries (C++)

Language: C++ | License: Apache-2.0 | 1457300

protobuf

Protocol Buffers - Google's data interchange format

Language: C++ | License: NOASSERTION | 6482900

gpac

GPAC Ultramedia OSS for Video Streaming & Next-Gen Multimedia Transcoding, Packaging & Delivery

Language: C | License: LGPL-2.1 | 267700

fdk-aac

A standalone library of the Fraunhofer FDK AAC code from Android.

Language: C++ | License: NOASSERTION | 116400

opus

Modern audio compression for the internet.

Language: C | License: NOASSERTION | 220400

Awesome-Music-Recommendation-Datasets

Awesome Datasets for Music Recommendation

6400

you-only-hear-once

Language: Jupyter Notebook | License: MIT | 8400

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python | License: MIT | 72700

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

License: NOASSERTION | 2607800

InQSS

Language: Python | License: MIT | 1100

pystoi

Python implementation of the Short-Time Objective Intelligibility (STOI) measure

Language: MATLAB | License: MIT | 31700

conferencing-speech-2022

Source code for the LCN submission to the ConferencingSpeech2022 challenge.

Language: Python | License: MIT | 1300

WebRTC-audio-processing

WebRTC audio processing

Language: C++ | 36900

DNN-binaural-localization

Simulate directional sound – deep neural network (DNN) – layer-wise relevance propagation (LRP)

Language: Jupyter Notebook | License: MIT | 200

SimpleVQA

A deep-learning-based no-reference quality assessment model for UGC videos

Language: Python | License: Apache-2.0 | 5500

SNR-Estimation-Using-Deep-Learning

An implementation of frame-level speech signal-to-noise ratio estimation using deep learning

Language: Jupyter Notebook | License: MIT | 2800

BinauralLocalizationCNN

Code to create networks that localize sound sources in 3D environments

Language: Python | 3700

sound-source-localization-algorithm_DOA_estimation

Some traditional algorithms for sound source localization (DOA estimation) of speech signals

Language: MATLAB | License: Apache-2.0 | 34800

BinauralSDM

This repository contains a set of tools to render Binaural Room Impulse Responses (BRIR) using the Spatial Decomposition Method (SDM). The implementation features a series of improvements presented in Amengual et al. 2020, such as quantization of the direction of arrival (DOA) estimates to improve the spectral properties of the rendered BRIRs, and RTMod and RTMod+AP equalization for the late reverberation. The repository also contains the files needed to 3D print an array holder of optimized topology for the estimation of DOA information.

Language: MATLAB | License: CC-BY-4.0 | 4400

Perceptual-Coding-In-Python

Language: Matlab | 14800
