obtain the noise data

Question

YueWenWu opened this issue 8 years ago · comments

Hi, my aim is to get the noise data from an audio file recorded in a classroom. In other words, I want to remove the teacher's voice. What should I do?

James Lyons · Answer 1 · Mon Apr 03 2017 17:33:32 GMT+0800 (China Standard Time)

This isn't the right library for that. Your task is closer to speech enhancement. You could try something like spectral subtraction, though it probably won't do a very good job. I would try to just use the parts where the teacher is not talking.
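For readers unfamiliar with spectral subtraction, here is a minimal NumPy sketch of the general idea. It is not part of this library. The function name, the frame sizes, the spectral floor, and above all the assumption that the first few frames contain only background noise are illustrative choices, not something stated in this thread.

```python
import numpy as np

def spectral_subtraction(signal, frame_len=512, hop=256, noise_frames=10, floor=0.02):
    """Subtract an average noise magnitude spectrum from every frame.

    Hypothetical helper for illustration only.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Short-time Fourier transform, one row per windowed frame.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the first `noise_frames` frames contain no speech.
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate, keeping a small fraction of the
    # original magnitude so no bin goes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    # Overlap-add the processed frames back into one signal.
    out = np.zeros(len(signal))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean[i]
    return out
```

Note that this suppresses the estimated background and keeps the speech, which is the usual enhancement direction. The averaged `noise_mag` spectrum is the part that describes the noise, and it is only as good as the speech-free frames it is computed from.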

宗吾 · Answer 2 · Mon Apr 03 2017 18:11:35 GMT+0800 (China Standard Time)

If I build a speech eigenvalue model to represent the teacher's voice, and then subtract the teacher's model from the eigenvalues of the original speech, can the remainder represent the noise part?

James Lyons · Answer 3 · Mon Apr 03 2017 18:26:35 GMT+0800 (China Standard Time)

Enhancement algorithms like this are called 'subspace methods'. It may work the way you are describing it, but the result won't sound very good; there will be artifacts remaining. There is really no way I know of to remove speech without it being obvious that something was there, except perhaps by copying other silence regions and pasting them over the speech.

宗吾 · Answer 4 · Mon Apr 03 2017 18:34:47 GMT+0800 (China Standard Time)

In fact, my purpose is to assess the quality of classroom teaching, for example whether there is noise interference (everything other than the teacher's voice is considered interference). Do you have any good ways to achieve this?

My idea is to remove the teacher's voice and treat the rest as interference!

宗吾 · Answer 5 · Mon Apr 03 2017 18:44:35 GMT+0800 (China Standard Time)

I have another idea: how can I assess how many individuals are talking in the speech? Is that easy?

Thank you very much.

James Lyons · Answer 6 · Mon Apr 03 2017 19:07:55 GMT+0800 (China Standard Time)

You build a speech activity detector using MFCCs, which basically detects speech vs. silence (or background). Then you could measure the background interference during the silence regions. Counting speakers is hard, but speech vs. silence is much easier.
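As a rough illustration of the "detect speech vs. silence, then measure the background" idea, here is a small NumPy sketch. To keep it self-contained it uses a plain frame-energy threshold as a stand-in for a real MFCC-based detector, so it is a simplification of what is suggested above. The function names, the 25 ms / 10 ms framing at an assumed 16 kHz sample rate, the 10th-percentile noise floor, and the 6 dB margin are all assumptions made for the example.

```python
import numpy as np

def frame_log_energy(signal, frame_len=400, hop=160):
    """Log energy in dB for each frame (400/160 samples = 25/10 ms at 16 kHz)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)

def background_level(signal, frame_len=400, hop=160, margin_db=6.0):
    """Label frames as speech or background and average the background energy.

    Hypothetical helper: an energy threshold, not an MFCC classifier.
    """
    energy = frame_log_energy(signal, frame_len, hop)
    # Assumption: the quietest 10% of frames are background only.
    noise_floor = np.percentile(energy, 10)
    is_speech = energy > noise_floor + margin_db
    # Mean log energy (dB) over the frames labeled as background.
    return is_speech, energy[~is_speech].mean()
```

A call such as `is_speech, level_db = background_level(samples)` returns a per-frame speech mask and one background level in dB. In a real classroom recording a fixed energy margin will confuse loud background noise with speech, which is where features such as MFCCs and a trained classifier would come in.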

宗吾 · Answer 7 · Mon Apr 03 2017 19:22:53 GMT+0800 (China Standard Time)

In this case, I have two questions:
1. Can I use the background interference during the silence regions to assess the quality of classroom teaching?
2. Can blind source separation be used to separate the voices in this recording and determine the number of voice sources? Isn't that very difficult? Do you have any good advice?
Thanks!