MSDWild Dataset

MSDWILD: MULTI-MODAL SPEAKER DIARIZATION DATASET IN THE WILD

Demo

Labels

rttms (all)

rttms (few train)

rttms (few val)

rttms (many val)
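
The label files above can be read with a few lines of Python. This is a minimal sketch, assuming the files follow the common NIST RTTM layout (`SPEAKER <file> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>`); this README does not state the layout, so check a file before relying on it. The names `Segment`, `parse_rttm`, and `speech_time_per_speaker` are hypothetical helpers, not part of the dataset.

```python
from collections import namedtuple

# Hypothetical container for one diarization segment.
Segment = namedtuple("Segment", "file_id onset duration speaker")


def parse_rttm(lines):
    """Parse RTTM lines into Segment tuples.

    Assumes the common NIST RTTM layout (not confirmed by this README):
    SPEAKER <file> <chan> <onset> <dur> <NA> <NA> <speaker> <NA> <NA>
    """
    segments = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and any record that is not a SPEAKER entry.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        segments.append(
            Segment(fields[1], float(fields[3]), float(fields[4]), fields[7])
        )
    return segments


def speech_time_per_speaker(segments):
    """Sum segment durations (seconds) per (file_id, speaker) pair."""
    totals = {}
    for seg in segments:
        key = (seg.file_id, seg.speaker)
        totals[key] = totals.get(key, 0.0) + seg.duration
    return totals
```

To read a file from disk, pass an open file handle, for example `parse_rttm(open(path))`, where `path` points at one of the downloaded RTTM files.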

Wavs

Download (8.1 GB) password: mbc2

md5: 0057f82daaddf2ce993d1bf0679929c4
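
The checksum above can be used to confirm that the download is intact. The following is a small sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are illustrative, and the archive path is whatever name the downloaded file has on your machine (this README does not give it).

```python
import hashlib
import sys

# MD5 listed in this README for the wav archive.
EXPECTED_MD5 = "0057f82daaddf2ce993d1bf0679929c4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads avoid loading the 8.1 GB archive into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Usage: python verify_md5.py <path-to-downloaded-archive>
    ok = verify_download(sys.argv[1])
    print("checksum OK" if ok else "checksum MISMATCH")
```

A mismatch usually means the download was truncated or corrupted, in which case downloading the archive again is the simplest fix.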

Video part

The video part includes cropped videos and the corresponding talking faces. To use this part, you must first sign a license agreement (students may not sign) and send it to Administration.

Note:

The database is ONLY for research purposes.
The copyright of the videos belongs to their original authors. If you have any questions, please contact us (email).

Reference

@inproceedings{liu22t_interspeech,
  author={Tao Liu and Shuai Fan and Xu Xiang and Hongbo Song and Shaoxiong Lin and Jiaqi Sun and Tianyuan Han and Siyuan Chen and Binwei Yao and Sen Liu and Yifei Wu and Yanmin Qian and Kai Yu},
  title={{MSDWild: Multi-modal Speaker Diarization Dataset in the Wild}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={1476--1480},
  doi={10.21437/Interspeech.2022-10466}
}
