HughLan1214's starred repositories

Ignite

A static site generator for Swift developers.

Language: Swift | License: MIT | Stargazers: 1598 | Issues: 0

mixpanel-js-wrapper

A GitHub project created under the Mixpanel organization to store the Mixpanel JS wrapper

Language: JavaScript | Stargazers: 2 | Issues: 0

Dialogue-Topic-Segmenter

Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring

Language: Python | Stargazers: 57 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 440 | Issues: 0

SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

Language: Python | License: MIT | Stargazers: 1119 | Issues: 0

Speaker_Verification

TensorFlow implementation of "Generalized End-to-End Loss for Speaker Verification"

Language: Python | License: MIT | Stargazers: 349 | Issues: 0

VoiceprintRecognition-Pytorch

This project implements several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

Language: Python | License: Apache-2.0 | Stargazers: 723 | Issues: 0

You-Only-Speak-Once

Deep learning: one-shot learning for speaker recognition using filter banks

Language: Jupyter Notebook | Stargazers: 148 | Issues: 0

camerakit-js

Library for the Web Camera API. Increases ease of use and compatibility in your next project.

Language: TypeScript | License: MIT | Stargazers: 40 | Issues: 0

meetingsdk-react-sample

Use the Zoom Meeting SDK in React

Language: JavaScript | License: NOASSERTION | Stargazers: 149 | Issues: 0

flask-video-stream

Simple webcam video streaming Python 3 script using Flask.

Language: Python | License: MIT | Stargazers: 70 | Issues: 0

jpeg_camera

JpegCamera – JavaScript webcam image capture library

Language: CoffeeScript | License: MIT | Stargazers: 369 | Issues: 0

webcamjs

HTML5 Webcam Image Capture Library with Flash Fallback

Language: ActionScript | License: MIT | Stargazers: 2492 | Issues: 0

streamlit-webrtc

Real-time video and audio streams over the network, with Streamlit.

Language: Python | License: MIT | Stargazers: 1323 | Issues: 0

Language: Python | License: MIT | Stargazers: 34 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5713 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 66022 | Issues: 0

SuperDialseg

Supervised Dialogue Segmentation

Language: Java | License: MIT | Stargazers: 5 | Issues: 0

DialogLM

Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization."

Language: Python | License: MIT | Stargazers: 135 | Issues: 0

BERT-like-is-All-You-Need

The code for our INTERSPEECH 2020 paper, Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition

Language: Python | License: MIT | Stargazers: 112 | Issues: 0

FAb-Net

PyTorch code for BMVC 2018 paper

Language: Jupyter Notebook | License: MIT | Stargazers: 85 | Issues: 0

Self-Supervised-Embedding-Fusion-Transformer

The code for our IEEE Access (2020) paper, Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion.

Language: Python | License: MIT | Stargazers: 106 | Issues: 0

mlx-examples

Examples in the MLX framework

Language: Python | License: MIT | Stargazers: 5690 | Issues: 0

sparrow-donut

Data extraction with Donut ML model

Language: Python | License: Apache-2.0 | Stargazers: 46 | Issues: 0

sparrow

Data processing with ML and LLM

Language: Python | License: GPL-3.0 | Stargazers: 2567 | Issues: 0

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python | License: MIT | Stargazers: 5619 | Issues: 0

Language: Jupyter Notebook | Stargazers: 3 | Issues: 0

EMO-AffectNetModel

Dynamic and static models for real-time facial emotion recognition

Language: Jupyter Notebook | License: MIT | Stargazers: 74 | Issues: 0

soxan

Wav2Vec for speech recognition, classification, and audio classification

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 239 | Issues: 0