ZhaZhaFon / MSDWILD

MSDWild Dataset

MSDWILD: MULTI-MODAL SPEAKER DIARIZATION DATASET IN THE WILD

Demo

Labels

rttms (all)

rttms (few train)

rttms (few val)

rttms (many val)
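
The label files above are distributed as RTTM. A minimal sketch of reading them, assuming they follow the standard NIST RTTM layout (space-separated `SPEAKER` records with the file ID, onset, duration, and speaker name in fields 2, 4, 5, and 8), might look like the following. The names `Segment`, `parse_rttm_lines`, and `speaking_time` are illustrative and not part of the dataset, and the sample records are made up.

```python
from collections import namedtuple

# One diarization segment: who spoke in which recording, from when, for how long.
Segment = namedtuple("Segment", "file_id start duration speaker")


def parse_rttm_lines(lines):
    """Parse RTTM text lines into Segment tuples.

    Assumes the standard layout:
    SPEAKER <file> <chan> <onset> <dur> <NA> <NA> <speaker> <NA> <NA>
    Lines that are not SPEAKER records are skipped.
    """
    segments = []
    for line in lines:
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        segments.append(
            Segment(fields[1], float(fields[3]), float(fields[4]), fields[7])
        )
    return segments


def speaking_time(segments):
    """Sum the segment durations (in seconds) per speaker."""
    totals = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + seg.duration
    return totals


# Hypothetical records, only to show the expected shape of the input.
sample = [
    "SPEAKER 00001 1 0.00 2.00 <NA> <NA> A <NA> <NA>",
    "SPEAKER 00001 1 2.25 1.25 <NA> <NA> B <NA> <NA>",
    "SPEAKER 00001 1 4.00 1.50 <NA> <NA> A <NA> <NA>",
]
segments = parse_rttm_lines(sample)
totals = speaking_time(segments)
```

To read a real label file, pass an open file object to `parse_rttm_lines`, since iterating over a file yields its lines.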

Wavs

Download (8.1 GB) password: mbc2

md5: 0057f82daaddf2ce993d1bf0679929c4
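
One way to check the downloaded archive against the md5 above is sketched below, using only the Python standard library. The archive's file name is not given here, so the path in the usage note is a placeholder; `md5_of_file` and `verify_download` are illustrative helper names.

```python
import hashlib

# Checksum published for the wav archive (copied from the line above).
EXPECTED_MD5 = "0057f82daaddf2ce993d1bf0679929c4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks
    so that a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("path/to/downloaded_archive")` should return `True` for an intact download; a `False` result suggests the file is incomplete or corrupted and should be fetched again.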

Video part

The video part includes cropped videos and the corresponding talking faces. To use this part, you must first sign a license agreement (it may not be signed by students) and send it to the Administration.

Note:

  • The database is ONLY for research purposes.
  • The copyright of each video belongs to its original author. If you have any questions, please contact us (email).

Reference

@inproceedings{liu22t_interspeech,
  author={Tao Liu and Shuai Fan and Xu Xiang and Hongbo Song and Shaoxiong Lin and Jiaqi Sun and Tianyuan Han and Siyuan Chen and Binwei Yao and Sen Liu and Yifei Wu and Yanmin Qian and Kai Yu},
  title={{MSDWild: Multi-modal Speaker Diarization Dataset in the Wild}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={1476--1480},
  doi={10.21437/Interspeech.2022-10466}
}

About

License: Other
