muellerzr / Practical-Deep-Learning-for-Coders-2.0

Notebooks for the "A walk with fastai2" Study Group and Lecture Series

issue with CropSignal in Audio notebook

turgut090 opened this issue

Hi. The notebooks are very helpful! However, I could not find the following function in this notebook:

CropSignal(1000)

Has it been changed to CropTime(1000)?

However, in this case I get:

>>> get_y = lambda x: x.name[0]
>>> 
>>> aud_digit = DataBlock(blocks=(AudioBlock, CategoryBlock),  
...                  get_items=get_audio_files, 
...                  splitter=RandomSplitter(),
...                  item_tfms = item_tfms,
...                  get_y=get_y)
>>> 
>>> dls = aud_digit.dataloaders(path_dig, bs=64)
>>> dls.show_batch(max_n=3)
RuntimeError: stack expects each tensor to be equal size, but got [1, 128, 1076] at entry 0 and [1, 128, 431] at entry 1

Looks like the fastai audio library has been redone/moved around. I'll look into updating this notebook soon.

ResizeSignal is the new name, since the signal isn't always cropped to the duration passed. If you were to apply CropSignal(1000) to an 800 ms audio clip, it would pad the signal to make it 1000 ms.
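
To illustrate the pad-or-crop behaviour described above, here is a minimal sketch in plain Python with no fastaudio dependency. The function name `resize_signal`, the 8 kHz sample rate, and the choice to pad with trailing zeros are assumptions made for the example; the real ResizeSignal transform may place or fill its padding differently.

```python
def resize_signal(samples, sample_rate, duration_ms, pad_value=0.0):
    """Hypothetical helper: crop or right-pad `samples` to exactly `duration_ms`."""
    target_len = int(sample_rate * duration_ms / 1000)
    if len(samples) >= target_len:
        # Clip is long enough: keep only the first `target_len` samples.
        return list(samples[:target_len])
    # Clip is too short: append padding until it reaches the target length.
    return list(samples) + [pad_value] * (target_len - len(samples))


sample_rate = 8000                 # assumed sample rate in Hz
short_clip = [0.1] * 6400          # 800 ms of audio
long_clip = [0.1] * 12000          # 1500 ms of audio

short_fixed = resize_signal(short_clip, sample_rate, 1000)
long_fixed = resize_signal(long_clip, sample_rate, 1000)

# Both clips now have the same number of samples, so spectrograms
# computed from them would share a width and could be stacked into a batch.
print(len(short_fixed), len(long_fixed))
```

Because every item ends up the same length, the "stack expects each tensor to be equal size" error from the original post should not occur once a resize step like this runs before the spectrogram transform.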

Interesting. one_batch works, but show_batch does not.

>>> dls.show_batch(max_n=3)
AttributeError: 'Axes' object has no attribute 'get_array'
>>> dls.one_batch()
(AudioSpectrogram([[[[ -8.3181, -10.2000, -12.6481,  ..., -22.9956, -22.1275, -21.1268],
          [-14.6939,  -9.6992,  -5.7920,  ..., -26.1172, -20.2113, -19.8419],
          [-11.3864,  -9.1018,  -6.8128,  ..., -28.3644, -22.6686, -19.2496],
          ...,
          [-37.3387, -38.2659, -40.5929,  ..., -23.8105, -25.2935, -26.5386],
          [-38.1808, -39.1065, -41.6412,  ..., -22.2388, -24.5692, -27.2292],
          [-43.2318, -43.8571, -45.6337,  ..., -26.2471, -28.8644, -31.5725]]],


        [[[ -0.6156,  -1.5698,  -7.2658,  ...,  -8.8658,  -3.9938,  -2.8227],
          [  1.2372,  -1.2855,  -4.1364,  ...,  -7.8216,  -3.2151,  -1.4409],
          [  1.0344,   1.8436,  -1.6778,  ...,  -4.4782,  -2.6898,  -2.9348],
          ...,
          [-22.0949, -22.1624, -22.8304,  ..., -12.0152, -13.1652, -13.7102],
          [-19.2713, -19.8037, -21.3614,  ..., -10.6371, -11.5922, -11.7593],
          [-20.5059, -20.6659, -21.2461,  ...,  -9.1825, -11.1106, -12.0782]]],


        [[[-18.3703, -19.5959, -24.2047,  ..., -14.5802, -12.5391, -11.7561],
          [-20.9951, -22.8386, -28.1964,  ..., -20.4750, -19.0906, -19.6360],
          [-22.2578, -22.9181, -25.7134,  ..., -25.1796, -42.6287, -37.7331],
          ...,
          [-49.6489, -49.1947, -48.2453,  ..., -49.5460, -49.6013, -49.4587],
          [-49.0422, -48.9614, -49.0389,  ..., -51.5584, -51.6974, -51.9679],
          [-50.9010, -51.6192, -53.4695,  ..., -56.1152, -56.0454, -55.8673]]],


        ...,


        [[[ -6.9652,  -8.0364, -11.4502,  ..., -33.1305, -35.5177, -42.4881],
          [ -9.6232,  -9.7865, -10.7225,  ..., -33.2689, -32.6359, -32.6976],
          [-24.9720, -14.7013, -12.5733,  ..., -35.2582, -35.6559, -35.3164],
          ...,
          [-20.1046, -20.1812, -20.5957,  ..., -44.7309, -42.6575, -41.8783],
          [-22.2047, -21.5025, -20.6947,  ..., -51.7986, -52.2079, -52.0918],
          [-30.9515, -29.7638, -28.4658,  ..., -50.2453, -49.8501, -49.8834]]],


        [[[-49.3383, -46.1890, -40.8572,  ..., -17.6733, -12.8186, -11.3199],
          [-45.6537, -42.6323, -41.6800,  ..., -17.2358, -11.9648, -11.0973],
          [-36.1794, -37.3899, -40.3495,  ..., -14.3630, -11.4240,  -8.9613],
          ...,
          [-37.8178, -37.6950, -37.0713,  ...,   5.2818,   5.9530,   6.1184],
          [-41.7684, -42.2734, -43.5165,  ...,   5.7649,   5.0034,   4.2097],
          [-44.3601, -44.7549, -45.3909,  ...,   2.0408,   1.5633,   0.9212]]],


        [[[ -1.4403,  -2.8520,  -7.0391,  ..., -19.8113, -15.4940, -14.0321],
          [  0.8504,  -0.3509,  -4.8551,  ..., -18.1012, -14.2483, -13.2409],
          [  9.0323,   7.8041,   4.4682,  ..., -15.9741, -10.1831,  -8.4963],
          ...,
          [-44.5452, -44.8842, -45.5113,  ...,  -3.3812,  -4.8031,  -5.7803],
          [-48.4126, -48.5500, -48.6792,  ...,  -4.9052,  -6.3405,  -7.0175],
          [-47.1862, -47.6854, -48.7584,  ...,  -2.8045,  -4.2454,  -5.1359]]]],
       device='cuda:0'), TensorCategory([1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1,
        0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1], device='cuda:0'))

Yes, apologies. We're onto that one as well: fastaudio/fastaudio#38