Gadersd / whisper-burn

A Rust implementation of OpenAI's Whisper model using the burn framework


Changes to run on CUDA with TchBackend

jbrough opened this issue · comments

This library is awesome, thank you. Incredibly fast and a much nicer API than alternatives.

I was hoping it would be the magic bullet that works on M2 and CUDA so that it can be deployed (running services from a MacBook seems the only option with these models!).

I tried last night on AWS with TchBackend and ran into:

`Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend.`

After that I noticed your chunk branch used the same settings I'd used.

It looks like `empty_strided` isn't available on CUDA at all, and models using it need to be moved to the CPU.

Is it possible to use alternative methods in the tensor constructors so that the code is compatible with both WGPU and CUDA? Or do you have any pointers? Did you get it working with Tch initially?

Most likely an incorrect CUDA version is specified. Please see this ticket for a similar problem and resolution: Gadersd/stable-diffusion-burn#2

This worked, thank you.

For anyone else:

I also needed libtorch and had to set the LIBTORCH environment variable, which is alluded to in the ticket, but there are more details in the tch dependency's README:

https://github.com/LaurentMazare/tch-rs
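For anyone following along, the setup looks roughly like this (a sketch, not an exact transcript: the `/opt/libtorch` path is an example, and you should point it at wherever you extracted the CUDA build of libtorch for the version your tch release expects):

```shell
# Point tch-rs at a manually downloaded libtorch.
# /opt/libtorch is an example path; use your actual extraction directory.
export LIBTORCH=/opt/libtorch

# The libtorch shared objects must also be on the loader path at runtime,
# or the binary will fail to start even though it compiled.
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"
```

After this, rebuild so the tch build script picks up the new location.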

@antimora I get a "Segmentation fault" as soon as it attempts to load a model when using WGPU on an AWS G4ad instance (AMD Radeon Pro V520 GPU, which supports the Vulkan and OpenGL APIs).

I've tried it with AutoGraphicsApi, and with Vulkan and OpenGL specifically.

Is there anything I should try, or do you think this is a fatal issue at the moment?

My objective is to get whisper-burn deployable as a service (without shipping a Mac). I'm already there with Conda on the Tch backend, but it would be good to be able to use AutoGraphicsApi and PR things back into this repo.

@jbrough I am not sure, but it's worth looking into. Do you mind refiling this in the burn repo so it gets the needed attention? I don't think anyone else from the burn team is monitoring this repo. You can just copy your last comment. It'd also be super helpful if you could share the actual error you're seeing.