zhuker / lamejs

mp3 encoder in javascript

What size is the begin/end padding measured in samples and/or duration?

VRciF opened this issue · comments

commented

Many thanks for this great library.
I'm trying to encode microphone audio recorded in the browser using lamejs.
The encoded chunks are then sent to another browser tab through a Node.js websocket server and decoded with audioContext.decodeAudioData, which works great.
The websocket connections are handled in a web worker thread. If I send the raw PCM data there are no glitches in the audio output, but with the encoded MP3 chunks I get crackling noises.
I noticed that the MP3 chunks are around 50 ms longer in duration, and each decoded chunk is also around 2000 bytes bigger.

I tested encoding PCM samples of a fixed value. The decoded audio then seems to have a padding of around 25 ms at the beginning and end of the MP3, which seems to be just how MP3 works according to the LAME FAQ.

So I tried simply ignoring 25 ms at the beginning and end; the crackling got better, but it is still there.

My question is: is there a way to know exactly how many samples (or how much duration) have been added at the beginning and end of the MP3? Is it encoded in the MP3, as mentioned in the FAQ above, or is it some kind of fixed value?
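For what it's worth, LAME itself records the encoder delay and padding in the "Info"/"Xing" tag of the first frame, packed as two 12-bit values. A sketch of reading them, assuming the standard LAME tag layout (the 3-byte delay/padding field at offset 21 past the "LAME" version string); note that lamejs may not write this tag at all, in which case the search finds nothing:

```javascript
// Sketch: read the encoder delay/padding values that LAME stores in the
// Info/Xing tag of the first MP3 frame. The two values are packed as
// 12 bits each into 3 bytes at offset 21 after the "LAME" marker
// (offsets taken from the LAME tag spec; treat them as assumptions).
function readLameDelayPadding(bytes) {
  // Scan the first few KB for the ASCII marker "LAME".
  const limit = Math.min(bytes.length - 24, 4096);
  for (let i = 0; i < limit; i++) {
    if (bytes[i] === 0x4c && bytes[i + 1] === 0x41 &&
        bytes[i + 2] === 0x4d && bytes[i + 3] === 0x45) {
      const b = i + 21; // 3 delay/padding bytes follow the 21-byte header
      const delay = (bytes[b] << 4) | (bytes[b + 1] >> 4);
      const padding = ((bytes[b + 1] & 0x0f) << 8) | bytes[b + 2];
      return { delay, padding };
    }
  }
  return null; // no LAME tag present (lamejs may not emit one)
}
```

If the tag is absent you are left with measuring the padding empirically, as discussed below in the thread.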

Kind regards

Hi @VRciF
I know this is very old, but did you ever find a solution? I am in pretty much the same situation.
Thanks

commented

There is no easy solution to splicing multiple MP3 files seamlessly. Why it's hard, and how to do it, is described in the LAME Technical FAQ linked above.
But it sounds like you are trying to fix the wrong thing: instead of transferring multiple MP3 files, try sending chunks of a single MP3 stream.

Hi @geeee,
Thanks so much for your response and your contributions to this very useful resource. I have read the FAQ, and to me it sounds like the issue concerns the spread of data across multiple frames. I should clarify that my plan for splicing together these multiple files was to convert each to WAV format (16-bit linear PCM) and then splice those files together. I figured that if I knew the padding added to each MP3 file, I could just ignore the WAV data (after conversion) that fell into the padding. My thinking was that the conversion to WAV (server side, by a different library) would handle the combination of data from multiple frames. Is there something else I am missing that would cause this not to work? My main purpose for converting the data into MP3 format in the browser is just to decrease bandwidth usage.

I'll look into the process of sending an MP3 stream in chunks, since I'm sure that would be the more elegant solution. I didn't consider this as strongly before because I was going for something quick and dirty that I could easily work into the framework of an existing project, where WAV data is base64-encoded and then sent over a websocket at a certain interval. I thought it might be quick to just throw in the extra step of converting the WAV data to MP3 before the base64 encoding, so I could make minimal changes. It seems like it might not be that simple, though.

commented

Padding at the start is going to be 576 samples. Padding at the end will vary: at the very least it's going to be 288 samples, plus more silence to make the total number of samples an exact multiple of 1152. On top of that, decoders might also contribute by adding even more silence.
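The arithmetic above can be written as a small helper that estimates how many samples a decoded chunk will contain. This is only a sketch based on the 576/288/1152 figures in this comment; a real decoder may add further silence on top:

```javascript
// Estimate the decoded length of an MP3 encoded from `n` PCM samples:
// 576 samples of encoder delay at the start, at least 288 of padding at
// the end, rounded up so the total fills whole 1152-sample MP3 frames.
function estimateDecodedSamples(n) {
  const withPadding = 576 + n + 288;
  return Math.ceil(withPadding / 1152) * 1152;
}

// How many samples to skip at the start, and to trim at the end,
// to recover the original n samples from the decoded buffer.
function trimOffsets(n) {
  const total = estimateDecodedSamples(n);
  return { start: 576, end: total - 576 - n };
}
```

For a one-second chunk at 44.1 kHz this predicts 46080 decoded samples, i.e. 576 to skip at the start and 1404 to trim at the end.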
Try generating one PCM chunk with all sample values set to the maximum. After encoding and decoding this chunk it will be easy to tell the exact number of samples added at the start and end.
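The measuring side of that experiment can be scripted: after round-tripping a full-scale chunk through lamejs and decodeAudioData, count the near-silent samples at either end of the decoded Float32Array. A sketch (the encode/decode step itself is left out; the 0.01 threshold is an arbitrary choice for a full-scale test signal):

```javascript
// Count how many near-silent samples pad the start and end of a decoded
// buffer. Samples with |value| below `threshold` count as silence.
function measurePadding(decoded, threshold = 0.01) {
  let first = 0;
  while (first < decoded.length && Math.abs(decoded[first]) < threshold) first++;
  let last = decoded.length - 1;
  while (last >= first && Math.abs(decoded[last]) < threshold) last--;
  return { start: first, end: decoded.length - 1 - last };
}
```

With an all-maximum input chunk, `start` should come out near 576 and `end` near the frame-alignment remainder described above.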

Thanks very much, I will try that!