Boo is a local REST API for using ML models from desktop apps.
You need to install each tool by following the instructions in its repository.
- 🐸 Coqui for text to speech
- 🦙 Llama for large language models
- 👁️‍🗨️ PaddleOCR for OCR (in progress)
- Whisper for speech to text
Additional requirements are listed in the requirements.txt file.
To start the server, run:
./run.sh
- Open a new WebSocket connection to ws://localhost:5000/ws/{client_id}
- Call the REST API using the same client_id.
- Tasks are queued through the REST API, and each result is sent back over the WebSocket.
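
The WebSocket side of the steps above can be sketched in Python. This is a minimal sketch, assuming the third-party `websockets` package is installed and that results arrive as plain messages; `ws_url`, `new_client_id`, and `listen` are hypothetical helper names, not part of Boo. The REST calls that queue tasks are left out because the endpoints are not listed here.

```python
import asyncio
import uuid

# Host and port taken from the WebSocket URL in the steps above.
BASE_WS = "ws://localhost:5000"


def new_client_id() -> str:
    """Generate a client_id (assumption: any unique string is accepted)."""
    return uuid.uuid4().hex


def ws_url(client_id: str) -> str:
    """Build the WebSocket URL for a given client_id."""
    return f"{BASE_WS}/ws/{client_id}"


async def listen(client_id: str) -> None:
    """Print every message the server pushes for this client_id."""
    # Imported lazily so the helpers above work without the package.
    import websockets  # third-party: pip install websockets

    async with websockets.connect(ws_url(client_id)) as ws:
        # Queue tasks through the REST API with the same client_id
        # while this loop is running; outcomes arrive here.
        async for message in ws:
            print("result:", message)
```

To try it with the server running, call `asyncio.run(listen(new_client_id()))` and reuse the same client_id in the REST requests.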
Adds a button to the right of the selected <p> tag to generate audio from the selected text, and a button on the bottom input to transcribe audio to text.
Adds a parrot button to the right of the selected <p> tag to generate audio from the selected text.