Anton Mitrofanov (medbar)

Company: Special Technological Center Ltd.

Location: Saint Petersburg

Home Page: https://scholar.google.ru/citations?user=T-kDfn4AAAAJ&hl=ru

Anton Mitrofanov's starred repositories

duckdb

DuckDB is an analytical in-process SQL database management system

Language: C++, License: MIT, Stargazers: 23097, Issues: 0

m2d

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 64, Issues: 0

GoodbyeDPI

GoodbyeDPI — Deep Packet Inspection circumvention utility (for Windows)

Language: C, License: Apache-2.0, Stargazers: 23871, Issues: 0

prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

Language: Python, License: Apache-2.0, Stargazers: 15954, Issues: 0

FAdam_PyTorch

An implementation of FAdam (Fisher Adam) in PyTorch

Language: Python, License: MIT, Stargazers: 31, Issues: 0

fense

Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, together with the benchmark datasets AudioCaps-Eval and Clotho-Eval.

Language: Python, Stargazers: 19, Issues: 0

Dasheng

Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"

Language: Python, License: Apache-2.0, Stargazers: 40, Issues: 0

ONE-PEACE

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Language: Python, License: Apache-2.0, Stargazers: 943, Issues: 0

AIR-Bench

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Language: Python, Stargazers: 38, Issues: 0

DataProcessingFramework

Framework for processing and filtering datasets

Language: Python, License: Apache-2.0, Stargazers: 25, Issues: 0

AudioLLM

Audio Large Language Models

Stargazers: 68, Issues: 0

AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

Language: Python, License: NOASSERTION, Stargazers: 74, Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python, License: NOASSERTION, Stargazers: 6154, Issues: 0

zeta

Build high-performance AI models with modular building blocks

Language: Python, License: Apache-2.0, Stargazers: 384, Issues: 0

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python, Stargazers: 1119, Issues: 0

py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

Language: C, License: NOASSERTION, Stargazers: 2023, Issues: 0

webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Language: Python, License: BSD-3-Clause, Stargazers: 2230, Issues: 0

YaFSDP

YaFSDP: Yet another Fully Sharded Data Parallel

Language: Python, License: Apache-2.0, Stargazers: 824, Issues: 0

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python, License: MIT, Stargazers: 512, Issues: 0

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

Stargazers: 197, Issues: 0

rir-classifier

Recipe for training and testing RIR-Classifier

Language: Python, License: MIT, Stargazers: 3, Issues: 0

jsalt2020_simulate

Training data simulation

Language: Python, License: Apache-2.0, Stargazers: 40, Issues: 0

CTranslate2

Fast inference engine for Transformer models

Language: C++, License: MIT, Stargazers: 3256, Issues: 0

einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Language: Python, License: MIT, Stargazers: 8395, Issues: 0

C8DASR-Baseline-NeMo

NeMo: a toolkit for conversational AI

Language: Python, License: Apache-2.0, Stargazers: 12, Issues: 0

Pengi

An Audio Language model for Audio Tasks

Language: Python, License: MIT, Stargazers: 282, Issues: 0

meeteval

MeetEval - A meeting transcription evaluation toolkit

Language: Python, License: MIT, Stargazers: 75, Issues: 0

chime-utils

Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.

Language: Python, License: MIT, Stargazers: 20, Issues: 0

NOTSOFAR1-Challenge

NOTSOFAR-1 Challenge: Distant Diarization and ASR

Language: Python, License: MIT, Stargazers: 42, Issues: 0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python, License: NOASSERTION, Stargazers: 1418, Issues: 0