Testing CNN model using sound generated from pyroomacoustics room simulation
kehinde-elelu opened this issue · comments
I have generated a large set of audio using Pyroomacoustics for room simulation, employing a circular microphone array and a single sound source.
I have successfully trained and tested a CRNN (Convolutional Recurrent Neural Network) model with this audio dataset to predict sound events and estimate DOA.
However, when using the trained model to analyze audio from a Respeaker v4 mic-array, the results are unsatisfactory, despite both setups having a similar mic-arrangement in the simulated scenarios.
Can I accurately estimate the Direction of Arrival (DOA) for a circular microphone array with a small radius, especially given that the Respeaker mic-array has microphones spaced less than 0.05 m (5 cm) apart from each other?
I've observed differences in the spectrograms between the WAV files generated from the Pyroomacoustics room simulation and the Respeaker audio. Is it possible to adjust the room simulation parameters to generate audio with a spectrogram more closely resembling that of the Respeaker?
Have you solved this problem?
Hello, first of all sorry to @kehinde-elelu as I never replied 🙇
There is not yet a perfect solution for matching the simulation to specific hardware, or for making the model generalize across devices.
Enabling the randomized image source model (by setting use_rand_ism=True; see the documentation) will help the model generalize in practice.
However, the simulation will still be missing the response of the microphone array you are using. If you have a way to measure it, you could try to include it in the simulation.
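If such a measurement is available, one simple way to fold it into the simulation is to convolve each simulated channel with the measured impulse response of the corresponding physical microphone. The function and the placeholder IRs below are hypothetical, for illustration only:

```python
import numpy as np
from scipy.signal import fftconvolve

def apply_mic_responses(sim_audio, mic_irs):
    """Convolve each simulated channel with the measured impulse response
    of the matching physical microphone (hypothetical measurement data)."""
    out = [fftconvolve(ch, ir, mode="full") for ch, ir in zip(sim_audio, mic_irs)]
    n = max(len(o) for o in out)
    # Zero-pad so all channels have equal length again
    return np.stack([np.pad(o, (0, n - len(o))) for o in out])

fs = 16000
sim = np.random.randn(6, fs)  # simulated 6-channel clip (placeholder)
irs = [np.random.randn(256) * np.hanning(256) for _ in range(6)]  # placeholder IRs
matched = apply_mic_responses(sim, irs)
```

This captures the per-microphone frequency response but not the scattering of the device body itself, so it is only a partial correction.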