Postprocessing of the labels

Question

victorywys opened this issue 4 years ago

Hi,
Thanks for this fantastic work! I'm currently trying to replicate your results and build my own model. While looking into how you handle the data, I found two functions in core/dataset.py, _postprocess_speech_label and _post_process, which seem to map SPEAKING_NOT_AUDIBLE to NOT_SPEAKING. As far as I understand, this turns the original 3-category classification task into a 2-category one during training. Does that influence the results, and does it conform to the official guide? Maybe I'm misunderstanding something; please correct me if so. Thanks!

fuankarion · Answer 1 · Wed Sep 09 2020 12:17:03 GMT+0800 (China Standard Time)

Hi, you are right. We turned the problem into a binary one because the official evaluation is itself binary (active speaker vs. anything else).
Additionally, less than 2% of the labels are SPEAKING_NOT_AUDIBLE, so the current dataset is not the best option for evaluating the 3-category problem.
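
For readers following along, here is a rough sketch of what such a binary mapping could look like. It is not the code from core/dataset.py: the helper names to_binary_label and label_fractions are hypothetical, the SPEAKING_AUDIBLE string is assumed to be the positive label, and the sample labels are made up purely for illustration.

```python
from collections import Counter

# Assumed label strings; only SPEAKING_NOT_AUDIBLE and NOT_SPEAKING
# are named in this thread, SPEAKING_AUDIBLE is an assumption.
SPEAKING_AUDIBLE = "SPEAKING_AUDIBLE"
SPEAKING_NOT_AUDIBLE = "SPEAKING_NOT_AUDIBLE"
NOT_SPEAKING = "NOT_SPEAKING"


def to_binary_label(label: str) -> int:
    """Hypothetical helper: 1 for an audible active speaker, 0 otherwise.

    SPEAKING_NOT_AUDIBLE falls into the 0 class together with NOT_SPEAKING,
    which mirrors the "active speaker vs. anything else" evaluation.
    """
    return 1 if label == SPEAKING_AUDIBLE else 0


def label_fractions(labels):
    """Hypothetical helper: fraction of each label, to check class balance."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Made-up sample, not real dataset statistics.
labels = [NOT_SPEAKING, SPEAKING_AUDIBLE, SPEAKING_NOT_AUDIBLE, SPEAKING_AUDIBLE]
binary = [to_binary_label(label) for label in labels]
fractions = label_fractions(labels)
```

Running label_fractions over the real annotation files would be one way to check the share of SPEAKING_NOT_AUDIBLE labels for yourself before deciding whether a 3-category setup is worth training.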