LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing in indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Home Page: https://pyroomacoustics.readthedocs.io


Testing CNN model using sound generated from pyroomacoustics room simulation

kehinde-elelu opened this issue · comments

I have generated a large set of audio using Pyroomacoustics for room simulation, employing a circular microphone array and a single sound source.

I have successfully trained and tested a CRNN (Convolutional Recurrent Neural Network) model on this dataset to predict sound events and estimate the direction of arrival (DOA).

However, when I use the trained model on audio from a Respeaker v4 mic array, the results are unsatisfactory, even though the simulated setup has a similar microphone arrangement.

Can I accurately estimate the direction of arrival (DOA) with a circular microphone array of small radius, especially given that the Respeaker's microphones are spaced less than 0.05 m (5 cm) apart?

I've also observed differences between the spectrograms of the WAV files generated by the Pyroomacoustics room simulation and those of the Respeaker recordings. Is it possible to adjust the room simulation parameters so that the generated audio's spectrogram more closely resembles the Respeaker's?
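For context, a minimal sketch of this kind of simulation. The room dimensions, absorption coefficient, array radius, source position, and file name are placeholders chosen only for illustration, not the setup actually used in the issue:

```python
import numpy as np
import pyroomacoustics as pra
from scipy.io import wavfile

# Load a mono source signal (placeholder path)
fs, signal = wavfile.read("event.wav")

# Shoebox room with frequency-independent absorption (illustrative values)
room = pra.ShoeBox([6, 5, 3], fs=fs, materials=pra.Material(0.3), max_order=10)

# Circular array of 4 microphones with a 3.2 cm radius at 1.2 m height,
# roughly the scale of a small commercial mic array
R = pra.circular_2D_array(center=[3.0, 2.5], M=4, phi0=0, radius=0.032)
R = np.concatenate([R, 1.2 * np.ones((1, R.shape[1]))])  # append z coordinate
room.add_microphone_array(pra.MicrophoneArray(R, room.fs))

# Single sound source somewhere in the room
room.add_source([1.0, 4.0, 1.5], signal=signal)

# After this call, room.mic_array.signals holds the multichannel audio
room.simulate()
```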


Have you solved this problem?

Hello, first of all, apologies to @kehinde-elelu, as I never replied 🙇

There is not yet a perfect way to match the simulation to specific hardware, nor to make a model trained on simulated data generalize universally.
Enabling the randomized image source model (by setting use_rand_ism=True, see the docs) should help the model generalize in practice.
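For example, a sketch of how the option is passed to the room constructor (geometry, sampling rate, and absorption are placeholder values):

```python
import pyroomacoustics as pra

# The randomized image source model slightly perturbs the image source
# positions, which reduces the sweeping-echo artifacts of the exact image
# method and tends to make learned models less overfit to the simulator.
room = pra.ShoeBox(
    [6, 5, 3],
    fs=16000,
    materials=pra.Material(0.3),
    max_order=10,
    use_rand_ism=True,   # enable the randomized image source model
    max_rand_disp=0.05,  # max image source displacement, in meters
)
```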

However, the simulation will still be missing the response of the microphone array you are using. If you have a way to measure it, you could try to include it in the simulation.
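One way to fold in a measured response, assuming you can record one impulse response per channel (the file names, the 4-channel count, and the measurement itself are hypothetical, and `room` is the simulated room from the sketch above), is to filter the simulated signals after room.simulate():

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

# Hypothetical: one measured impulse response per microphone channel,
# e.g. obtained with a sine sweep played from a known position.
mic_irs = [wavfile.read(f"mic_ir_{m}.wav")[1] for m in range(4)]

# room.mic_array.signals has shape (n_mics, n_samples) after room.simulate()
sim = room.mic_array.signals

# Convolve each simulated channel with its measured response,
# trimming back to the original length
out = np.stack(
    [fftconvolve(sim[m], mic_irs[m], mode="full")[: sim.shape[1]] for m in range(4)]
)
```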