Is it possible to run a music / sound generation model?
flatsiedatsie opened this issue
Question
I'd love to create a browser-based music generation tool, or one that can turn text into sound effects. Is that supported?
I guess my more general question is: can Transformers.js run pretty much any .onnx I throw at it, or does each model require some level of implementation before it can be used?
Well would you look at that plot thickening: https://huggingface.co/Xenova/musicgen-small
Except... the repo was created a day ago. Whoa.
Yep! Somehow your feature request came at the perfect time! This is something I've been working on already for a few days. This is all now possible thanks to this PR by @fxmarty.
Stay tuned for updates! This might only be available in v3 due to WebGPU support (I think CPU/WASM-only will be too slow).
Very very cool. Nice work @fxmarty. You wouldn't happen to have an early online test somewhere that I can tinker with?
I haven't implemented it in the PR yet. I may do it with ONNX Runtime, patching some transformers methods to use ORT instead of torch.
I have no idea what that means, except that I should be patient :-)
Whoop! #545 (comment)
- It's now possible! For example: https://flatsiedatsie.github.io/transformers_js_musicgen/
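For anyone landing here later, this is roughly what the generation code can look like. Treat it as a minimal sketch rather than the code behind the demo above: it assumes Transformers.js v3 is installed as `@huggingface/transformers`, that it exposes `AutoTokenizer` and `MusicgenForConditionalGeneration`, and that model options such as dtype and device are left at their defaults. `buildGenerateArgs` and `generateMusic` are hypothetical helper names, and the sampling values are illustrative, not tuned.

```javascript
// Sketch only: assumes Transformers.js v3 is installed as "@huggingface/transformers".
const MODEL_ID = 'Xenova/musicgen-small'; // the repo linked earlier in this thread

// Hypothetical helper: merge the tokenizer output with sampling settings.
// Values passed in `overrides` win over the illustrative defaults.
function buildGenerateArgs(inputs, overrides = {}) {
  return {
    ...inputs,
    max_new_tokens: 512, // more tokens means longer audio and slower generation
    do_sample: true,
    guidance_scale: 3,
    ...overrides,
  };
}

// Hypothetical helper: turn a text prompt into raw audio samples.
async function generateMusic(prompt, overrides = {}) {
  // Dynamic import, so the library is only loaded when generation is requested.
  const { AutoTokenizer, MusicgenForConditionalGeneration } =
    await import('@huggingface/transformers');

  // Both calls download and cache the model files on first use.
  const tokenizer = await AutoTokenizer.from_pretrained(MODEL_ID);
  const model = await MusicgenForConditionalGeneration.from_pretrained(MODEL_ID);

  const inputs = tokenizer(prompt);
  const audio = await model.generate(buildGenerateArgs(inputs, overrides));

  return {
    samples: audio.data, // Float32Array of raw audio samples
    sampleRate: model.config.audio_encoder.sampling_rate,
  };
}
```

Calling `await generateMusic('80s pop track with bassy drums and synth')` should give you the samples and their sample rate, which you can then hand to the Web Audio API or encode as a WAV file. As noted above, expect this to be slow without WebGPU.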