LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Home Page: https://pyroomacoustics.readthedocs.io

How to simulate energy decrease?

coreeey opened this issue · comments

In theory, as the speech signal travels further into the far-field, we expect to observe a significant decrease in energy, leading to a noticeable attenuation in the spectrum. This attenuation typically manifests as a shift from dense resonance peaks to gradually sparse ones. However, in my simulation experiments, I noticed a curious anomaly: regardless of the distance simulated, the spectrum only exhibited aliasing effects without any observable attenuation. So why does this phenomenon occur?
(The code I used is room_L_shape_3d_rt.py from the examples.)
The last line is the original signal; the others are the reverberated signals.
[pic1: spectrograms of the reverberated signals and the original signal]

Hello @coreeey , this looks pretty good to me.
I suppose the absence of attenuation may be due to a global rescaling of the signal before saving to file. Please check that.
Also, I don't see any aliasing occurring in these spectrograms (aliasing would be copies of high frequencies folded into low frequencies).
The further the source is from the microphone, the longer the reverberation time will be.
This causes the longer tail that is observed in your simulated signals.
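
If it helps to double-check, here is a minimal sketch of my own (using a plain shoebox room instead of the L-shaped example, with made-up dimensions and absorption) that prints the peak and RMS of the raw simulated output for a near and a far source, before any rescaling. The levels should clearly drop with distance:

import numpy as np
import pyroomacoustics as pra

fs = 16000
signal = np.random.randn(fs)  # 1 s of noise standing in for speech

for src_pos in ([2.0, 2.0, 1.5], [5.5, 4.5, 1.5]):  # near and far source
    room = pra.ShoeBox([6.0, 5.0, 3.0], fs=fs, materials=pra.Material(0.2), max_order=10)
    room.add_source(src_pos, signal=signal)
    mic = pra.MicrophoneArray(np.array([[1.0], [1.0], [1.5]]), fs)  # single microphone
    room.add_microphone_array(mic)
    room.simulate()
    out = room.mic_array.signals[0]
    print(src_pos, "peak:", np.max(np.abs(out)), "rms:", np.sqrt(np.mean(out ** 2)))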

Thanks @fakufaku for your very prompt reply, I did indeed perform a global rescaling of the signal before saving it to file, which might explain the absence of attenuation. I will attempt another approach to address this issue. Additionally, I realize now that I misunderstood the 'aliasing occurring in these spectrograms.' I initially thought it referred to aliasing of spectra over time.

Thanks @fakufaku, I managed to address this issue by multiplying the normalized signal by 1000, resulting in a spectrum that closely resembles the actual microphone audio.

import numpy as np
from scipy.io import wavfile
from scipy.signal import convolve

# Convolve the anechoic signal with the simulated RIR, then drop the leading axis
s = convolve(audio_anechoic, rir)
s = np.squeeze(s, axis=0)
# Peak-normalize, then scale to a fixed amplitude before writing 16-bit PCM
s_norm = s / np.max(np.abs(s))
# s_norm_ = np.int16(s_norm * 32767)
s_norm_ = np.int16(s_norm * 1000)
wavfile.write("tmp_out.wav", 16000, s_norm_)

However, I have a minor question: what is the relationship between the 1000 in 's_norm_ = np.int16(s_norm * 1000)' and the 32767 in 's_norm_ = np.int16(s_norm * 32767)'? Will the volume increase when the signal is multiplied by a larger value? And if I want the simulated microphone's sound pressure to reach 65 dB, can I achieve this by modifying this constant?

Hi @coreeey, can you clarify whether the signals used to create the spectrogram plots in the initial comment were

  1. loaded from disk, or
  2. the direct room-processed output, without writing to disk

Hi @DanTremonti, I processed the output with a max-based normalization and plotted the spectrogram with Audacity.

@coreeey Thanks for the clarification :)

@coreeey The normalization of audio before saving to a format like WAV is one of the finer and more confusing points of audio processing. The problem is that WAV (when saved with integer-valued samples) has finite precision.
Many files are saved in 16 bits, and you want to make the most of those 16 bits to represent the amplitude of the sound.
If the maximum amplitude is too small, only a few bits will be used to encode all the values. For this reason, we often rescale the maximum to a value close to 2^15, the top of the 16-bit range, to maximize the precision used.
In practice, this rescaling only changes the volume of the audio.
This seemingly innocuous operation loses the relative amplitude differences between files, as you noticed in your original issue.

The trick, if you want to preserve the relative differences, is to rescale all files by the same value so that none of them goes outside the 16-bit range. This is usually done by taking the maximum absolute amplitude across all the signals you want to compare and mapping it to just under 2^15.

Here is an example for two signals.

import numpy as np

# One common scale: the louder signal maps to full scale (32767), relative levels are kept
scale = max(np.abs(signal1).max(), np.abs(signal2).max())
signal1 = (signal1 * 32767 / scale).astype(np.int16)
signal2 = (signal2 * 32767 / scale).astype(np.int16)
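
On the question about 1000 vs 32767: 32767 is simply the largest value a signed 16-bit sample can take, so multiplying the peak-normalized signal by a larger constant raises the digital level (and the playback volume) until it clips at full scale. A target like 65 dB SPL cannot be fixed in the file alone, because the actual sound pressure also depends on the playback gain and calibration; what you can set is the digital level in dBFS. A rough sketch (the target level here is made up, and s_norm is the peak-normalized float signal from the earlier snippet):

import numpy as np

FULL_SCALE = 32767                      # largest positive signed 16-bit value
target_dbfs = -25.0                     # hypothetical target RMS level re full scale

rms = np.sqrt(np.mean(s_norm ** 2))
gain = 10 ** (target_dbfs / 20) / rms   # linear gain that reaches the target RMS
y = np.clip(s_norm * gain, -1.0, 1.0)   # keep samples inside the 16-bit range
y_int16 = np.int16(y * FULL_SCALE)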

@fakufaku thank you for the detailed and kind reply, and for developing such a great project.