Srijith-rkr / Whispering-LLaMA

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

What changes did you make to whisper_openAI?

rrscholarship opened this issue

Hi @Srijith-rkr, I saw you cloned whisper_openAI rather than installing it, and I wonder what changes you made to this library? Also, using Large-v2 leads to OOM on my machine (24 GB VRAM); any advice?

Are there pretrained weights for this repo?

Yes, there are weights for this repo, as listed at https://github.com/Srijith-rkr/Whispering-LLaMA/blob/main/README.md#model-weights

changes you made to this library?

At the time we worked on this project, there was no beam search algorithm or temperature-based decoding, so I guess the modification might be in that part. @Srijith-rkr any thoughts?

In the paper, we generate multiple hypotheses from the Whisper model to use as a prompt input to the LLM. We modified the beam search in the Whisper code to select the next token via temperature sampling, so as to generate multiple candidates that do not capture the utterance very well. We do this to model a weak acoustic model for the LLM to improve upon.
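
As an illustrative sketch of temperature-based next-token sampling (not the repo's actual modification; the function name, temperature value, and vocabulary size are hypothetical, and it assumes PyTorch is available):

import torch

def sample_next_token(logits, temperature=1.2):
    # A higher temperature flattens the distribution, so repeated sampling
    # yields more diverse (and often noisier) candidate tokens.
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)

# Sampling the same decoding step several times gives several different candidates.
logits = torch.randn(1, 50000)  # arbitrary vocabulary size, for illustration only
candidates = [sample_next_token(logits) for _ in range(5)]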

You can get the Whisper model weights with just
import whisper
model = whisper.load_model("mention size")
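
For example, assuming the openai-whisper package is installed, loading a checkpoint and transcribing a file looks roughly like this (the audio path is just a placeholder):

import whisper

model = whisper.load_model("tiny")      # or "large-v2" if you have the VRAM
result = model.transcribe("audio.wav")  # placeholder path to your audio file
print(result["text"])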

When we wrote the paper, we did not have an instruction-tuned model, so we used Alpaca weights. The weights of that model, converted to the lit-llama format (the repo our code is built on), are attached at https://huggingface.co/Srijith-rkr/Whispering-LLaMA
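
If it helps, one way to pull those converted weights locally is with huggingface_hub (the local_dir path below is just an example):

from huggingface_hub import snapshot_download

# Download the converted Alpaca / lit-llama checkpoint from the Hugging Face Hub.
snapshot_download(repo_id="Srijith-rkr/Whispering-LLaMA", local_dir="checkpoints/whispering-llama")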

We also shared our dataset here.

Regarding the OOM when using Large-v2:
We also have a baseline using Whisper Tiny in the paper, but I don't think you will be able to fine-tune LLaMA 7B even then. We used 2 A100s (80 GB VRAM).

You should be able to run LLaMA inference with 24 GB of VRAM using quantization.
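
For example, the lit-llama repository (which this code base builds on) documents 8-bit quantized inference along the lines of python generate.py --quantize llm.int8 --prompt "Hello, my name is"; please verify the exact flags against its README.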

I hope that helps. Feel free to reopen this issue if you have any questions.

Thank you for your reply, it's really helpful @Srijith-rkr! One last question: I'm still quite confused about how Alpaca works as the "ASR selector" prompt if there is no instruction-tuned model. Also, I did not find the "ASR selector" prompt in the code base; does that mean the Alpaca weights are already instruction-finetuned as the "ASR selector"?