n400peanuts / mexca-sd-experiment

A repository for comparing potential speaker diarization tools to be used in the MEXCA pipeline.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

mexca-sd-experiment

A repository for comparing potential speaker diarization tools to be used in the MEXCA pipeline.

Structure

The repository contains subdirectories for different parts of the experiment:

  • speaker-diarization\: Contains all files for the speaker diarization part
    • embeddings\: Contains the encoded speaker embeddings as .pt files
    • results\: Contains the .rttm files with speaker annotations
    • clustering.py: Script for clustering the speaker embeddings and assigning the speaker labels to speaker segments
    • sd_*.py: Scripts for applying the respective speaker encoding models
    • compare_sd.ipynb: Notebook for comparing the speaker diarization approaches
    • speaker_diarization.py: Script to run all speaker encoding scripts after each other
    • speaker_representation.py: Helper functions for performing speaker diarization
  • voice-activity-detection\: Contains all files for the voice activity detection part
    • results\: Contains the .rttm files with speech segments
    • compare_vad.ipynb: Notebook for comparing the voice activity detection approaches
    • custom.conf: Configuration file for the opensmile feature extractor
    • opensmile_helper_functions: Helper functions for extracting opensmile voice activity features
    • vad_*.py: Scripts for applying the voice activity detection models
  • explore_ami_corpus.ipynb: Notebook for exploring the properties of the AMI corpus
  • rttm.py: Functions for creating, reading, modifying, and writing .rttm files and objects
  • rttm_test.py: Preliminary test suite for rttm.py

About

A repository for comparing potential speaker diarization tools to be used in the MEXCA pipeline.

License:Apache License 2.0


Languages

Language:Jupyter Notebook 99.6%Language:Python 0.3%Language:R 0.0%