google-research / sound-separation

jams annotation order vs filenames?

mcusi opened this issue

Hey, I'm confused about whether the order of the FUSS JAMS annotations relates to the filenames of the separated sources (e.g., background0, foreground0, foreground1, etc.).

I downloaded the dry ssdata from Zenodo. For some examples, the order of the annotations in the JAMS file (ordered by time, I believe) differs from the order of the foreground sounds in the filenames. For example:

>>> m = jams.load("./ssdata/train/example13537.jams")
>>> m["annotations"][0].data
SortedKeyList([Observation(time=0.0, duration=10.0, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/155571.wav', 'source_time': 1.3854725120886693, 'event_time': 0, 'event_duration': 10.0, 'snr': 0, 'role': 'background', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=0.6711880000000008, duration=9.328812, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/372821.wav', 'source_time': 0.0, 'event_time': 0.6711880000000008, 'event_duration': 9.328812, 'snr': -0.3880649923842725, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=1.6686094265599583, duration=4.748812, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/375026.wav', 'source_time': 0.0, 'event_time': 1.6686094265599583, 'event_duration': 4.748812, 'snr': 23.017906447229286, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=3.5576066287429065, duration=1.59025, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/349792.wav', 'source_time': 0.0, 'event_time': 3.5576066287429065, 'event_duration': 1.59025, 'snr': 12.441491562340214, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0)], key=<bound method Annotation._key of <class 'jams.core.Annotation'>>)

But looking at the separated sources, foreground1 appears to be the sound that begins at 3.5 s, whereas foreground2 begins at 1.66 s (I expected the opposite).
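One way to check the mismatch without relying on filename order is to match each separated source to an annotation by its estimated onset time. The sketch below is only illustrative: `estimate_onset` and `match_by_onset` are hypothetical helpers (not part of jams or the FUSS tooling), the amplitude threshold is an arbitrary assumption, and it assumes each foreground file spans the full mixture with silence before its event. The onset for foreground0 is also an assumed value; the other two are the onsets described above.

```python
def estimate_onset(samples, sample_rate, threshold=1e-3):
    """Return the time in seconds of the first sample whose magnitude
    exceeds `threshold`, or None if the signal never exceeds it."""
    for i, x in enumerate(samples):
        if abs(x) > threshold:
            return i / sample_rate
    return None


def match_by_onset(event_times, source_onsets):
    """Greedily pair each source name with the closest unused annotation.

    event_times:   list of foreground `event_time` values, in JAMS order.
    source_onsets: dict mapping a source name to its estimated onset (s).
    Returns a dict mapping source name -> index into `event_times`.
    Greedy matching is a simplification; it can mispair events whose
    onsets are very close together.
    """
    unused = list(range(len(event_times)))
    pairs = {}
    for name in sorted(source_onsets):
        onset = source_onsets[name]
        if onset is None or not unused:
            continue  # silent source, or no annotations left to assign
        best = min(unused, key=lambda i: abs(event_times[i] - onset))
        pairs[name] = best
        unused.remove(best)
    return pairs


# Foreground event times from example13537.jams, in annotation order.
event_times = [0.6711880000000008, 1.6686094265599583, 3.5576066287429065]
# Onsets as observed in the separated sources (foreground0 is assumed).
source_onsets = {"foreground0": 0.67, "foreground1": 3.5, "foreground2": 1.66}

mapping = match_by_onset(event_times, source_onsets)
# mapping == {"foreground0": 0, "foreground1": 2, "foreground2": 1}
```

With these values, foreground1 pairs with the third foreground annotation and foreground2 with the second, which is the swap described above.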

I'm wondering if I'm missing a way to order the JAMS annotations so that they consistently match the indices in the filenames. Thanks so much!