micsthepick / JSVocalRedIso

Implementation of an FFT center/vocals isolation plugin based on an Audacity plugin

Some noises before a transient

FelipeZanabria opened this issue · comments

Hello! There is a function in the Nyquist plugin that is worth porting to avoid this problem. First of all, you have to adjust the step size (hop), which sets how far each frame jumps relative to the FFT size. In vocalRediso.ny the FFT size is 16*512 = 8192 and the hop is 512*7. There is also zero padding, which is hop2, and finally the windows are adapted to the previous settings.
The Hann window, which is the simplest and best, is built like this:
Sequence two sounds, and set the frequency (sample rate) of the control signal to size.
Inside the sequence we place an LFO that oscillates between 0 and 0.5, with these parameters:
Frequency parameter: size/(hop*2.0).
Duration parameter: (size-zeros)/size. The waveform is a sine.
The phase is 270 degrees.
The second element of the sequence is a constant signal of amplitude 0; its sample rate is size and its duration is floatZeros/size.
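
For readers who want to see the shape this recipe produces, here is a minimal sketch in Python (not Nyquist or JSFX). It assumes an FFT size of 8192, a hop of 7*512 = 3584 and a zero tail of size - 2*hop = 1024 samples; the zero-tail length and the helper names `padded_hann` and `overlap_sum` are illustrative assumptions, not taken from vocalRediso.ny.

```python
import math

def padded_hann(size, hop, zeros, peak=0.5):
    """Raised-cosine segment of (size - zeros) samples followed by `zeros` zeros."""
    # A frequency of size/(hop*2.0) at a rate of `size` is one cycle per 2*hop samples.
    period = 2.0 * hop
    active = size - zeros
    window = []
    for n in range(active):
        # Sine started at 270 degrees (value -1), rescaled to the range [0, peak].
        angle = 2.0 * math.pi * n / period + 1.5 * math.pi
        window.append(0.5 * peak * (1.0 + math.sin(angle)))
    window.extend([0.0] * zeros)  # the constant-zero second element of the sequence
    return window

def overlap_sum(window, hop):
    """Sum of all hop-shifted copies of the window at each position within one hop."""
    return [sum(window[n + k] for k in range(0, len(window) - n, hop))
            for n in range(hop)]

size = 16 * 512
hop = 7 * 512
zeros = size - 2 * hop  # assumption: 1024 samples of zero padding
win = padded_hann(size, hop, zeros)
sums = overlap_sum(win, hop)
```

Under these assumptions the raised cosine spans exactly two hops, so the shifted copies add up to a constant (equal to `peak`), which is the property that keeps overlap-add free of amplitude ripple.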

Hello, and thank you for contributing to this repository! I encourage you to take a look at the development branch; it may already contain some of these improvements.

Set the frequency of the control signal in size.
Inside the sequence we place an lfo that oscillates between 0 and 0.5 with these data:
in the frequency parameter: size/(hop*2.0).
Duration parameter: (size-zeros)/size. The waveform is sine.
The phase is 270 degrees.
The second element of the sequence is a constant signal of amplitude 0, its sample rate is size, floatZeros/size.

This appears to be the window function (if I am not mistaken), which means it's either the sine window (from the Modulated Lapped Transform?) or the Hann window, which is already used in the main branch. Do you happen to have more information about this one?

In the develop branch (edit - oops forgot to finish this sentence) all of the versions of the plugin currently use the MLT sine window, which seems to sound the best in all settings without having any choppiness.

In vocalRediso.ny the size of fft is 16*512 = 8192. The hop is 512*7

As before, in develop: vocalrediso-more-overlap.jsfx experiments with a different hop size; I believe it is 1/4 * 8192 (2048 samples).
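
As a rough check on why the sine window behaves well at these hop sizes, the Python sketch below sums the squared window over all overlapping frames. It assumes the usual MLT definition w[n] = sin(pi*(n+0.5)/size) and that the same window is applied before and after the FFT; the function names are illustrative and are not the plugin's.

```python
import math

def sine_window(size):
    # MLT sine window: w[n] = sin(pi * (n + 0.5) / size)
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]

def squared_overlap(window, hop):
    """Sum of w^2 over every frame that covers each position within one hop."""
    return [sum(window[n + k] ** 2 for k in range(0, len(window), hop))
            for n in range(hop)]

size = 8192
w = sine_window(size)
half = squared_overlap(w, size // 2)     # 50% overlap: constant gain of 1
quarter = squared_overlap(w, size // 4)  # hop of 1/4 * 8192: constant gain of 2
```

Because the summed gain is constant in both cases, an unmodified signal comes back without ripple; at the quarter-size hop the output only needs a fixed scale factor of 1/2.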

So check out develop. I might even fix up some of the stuff in develop, and see if I can do another release into main with that code.

Yes, it is the Hann window. I will test the develop version. The noise removal plug-in from Norbert has the same problem.

In the development version, there seems to be an inverted reverb before the transients. Have you added a post-FFT window?
If you lower that (in the large-FFT version), the 8192 version will sound perfect.

I'm not sure which file you are referring to from the develop branch, but I can see some inverted reverb with the vocalrediso-more-overlap.jsfx file. However, as I understand it, processing after applying the FFT may violate the assumption that the waveform is zero at the frame endpoints (correct me if I am wrong!). That causes a discontinuity, which sounds like horrid regular clicking.

Yes, the biggest problem is hearing clicks. I don't know if it's possible to make the overlaps play at a lower volume. Or try the Hann window. The Audacity version applies the Hann window both before the FFT and after it (when it applies the IFFT).

To be clear, I have actually added a post-FFT window (applied after the IFFT). This did solve the issue with clicks. The version in master lets you choose which window it uses for both stages (including Hann); however, I don't see much benefit in that, given that it will lead to clicking and vibrato.
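
The effect described in this exchange can be reproduced on a toy frame. The Python sketch below uses a naive DFT on 64 samples, and a half-frame delay applied in the frequency domain stands in for whatever the plugin does to the bins; all names and sizes here are assumptions made for illustration.

```python
import cmath
import math

def dft(x, sign):
    """Naive O(N^2) DFT; sign=-1 is the forward transform, sign=+1 the unscaled inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 64
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # hann[0] == 0

# Analysis: a constant input of 1.0 times the window, so the frame starts at zero.
frame = [1.0 * w for w in hann]

# Stand-in spectral processing: a delay of N/2 samples (linear phase per bin).
d = N // 2
spectrum = dft(frame, -1)
shifted = [X * cmath.exp(-2j * math.pi * k * d / N) for k, X in enumerate(spectrum)]
processed = [v.real / N for v in dft(shifted, +1)]

# Synthesis window applied after the inverse transform.
rewindowed = [p * w for p, w in zip(processed, hann)]
```

Here `processed[0]` is close to 1.0 even though `frame[0]` was 0, so adding this frame into the output would produce a step at the frame boundary, which is the kind of regular click discussed above. Multiplying by the window again brings the boundary sample back to 0.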

I found this article on the Audacity wiki about the built-in noise reduction effect, which sounds better than ReaFir and the Ratio denoiser. It may help us get the best sound.
https://wiki.audacityteam.org/wiki/How_Audacity_Noise_Reduction_Works

Hello again! After modifying and testing the code many times, I have found a solution to the artifacts. I used the stable version, since its code is the closest to what I wanted.
First, I was able to add an offset to the window buffers in size, which results in a sound starting in the middle or quarter of the window, eliminating spectral leakage.
I have also corrected the calculation of out1 and out2.
The second tile in the windows is bound to the first window buffer, so out2 is there too. Try it and see whether this can be a new stable version.
vocalrediso.txt

A couple of things:

outScale = windowBuffer1[tilePos1] * windowBuffer2[tilePos1] + windowBuffer2[tilePos2] * windowBuffer2[tilePos2];

This line looks wrong: you refer to windowBuffer2 twice in the right-hand term, but not to windowBuffer1 twice in the left-hand term. I'm guessing you meant

outScale = windowBuffer1[tilePos1] * windowBuffer1[tilePos1] + windowBuffer2[tilePos2] * windowBuffer2[tilePos2];

or perhaps something similar?
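
For context on what a scale factor like outScale is for, here is a generic two-tile overlap-add sketch in Python. It is not the plugin's code: `win_in`, `win_out`, `out_scale` and `reconstruct` are hypothetical names, both windows are assumed to be the same periodic Hann, and the second tile is assumed to sit half a frame away from the first. In that setup the gain at a sample is the sum over both tiles of input window times output window, which reduces to a sum of squares when the two windows are identical.

```python
import math

SIZE = 16
HALF = SIZE // 2

# Hypothetical input and output windows (both periodic Hann in this sketch).
win_in = [0.5 - 0.5 * math.cos(2 * math.pi * n / SIZE) for n in range(SIZE)]
win_out = list(win_in)

def out_scale(pos1):
    """Combined gain of the two overlapping tiles at tile position pos1."""
    pos2 = (pos1 + HALF) % SIZE
    return win_in[pos1] * win_out[pos1] + win_in[pos2] * win_out[pos2]

def reconstruct(sample, pos1):
    """Overlap-add one unprocessed sample from both tiles, then normalise."""
    pos2 = (pos1 + HALF) % SIZE
    part1 = sample * win_in[pos1] * win_out[pos1]
    part2 = sample * win_in[pos2] * win_out[pos2]
    return (part1 + part2) / out_scale(pos1)

scales = [out_scale(p) for p in range(SIZE)]
```

With a Hann window used twice, the combined gain swings between 0.5 and 1.0 across the tile, so without the division the output level would wobble once per hop.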

Secondly:

windowTile1 = windowBuffer1[tilePos1+SIZE];
windowTile2 = windowBuffer1[tilePos2+SIZE];
...
outPart1 = bufferO1C[tilePos1] * windowBuffer1[tilePos1];
outPart2 = bufferO2C[tilePos2] * windowBuffer1[tilePos2];

What you've essentially done here is swap the input and output windows. Are you sure you want to do that?

Also, I'm planning on replacing the current version in master with the version in develop. Make sure that you check out the latest changes from the develop branch (it should be a commit where I create the buffers with a custom simple_alloc function instead of doing raw pointer arithmetic), and see if you can implement the changes there. Preferably, create a pull request! I've also removed the options to select the windows in that branch, which means there are fewer things for the end user to worry about.