askmeegs / papermusic

draw an instrument, then play the notes πŸ—’οΈ πŸ’› 🎡

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

🎡 papermusic ✏️

draw an instrument, and play it! (fun with PaliGemma and SCAMP)

video walkthrough

(click to play)

Video Walkthrough

(πŸ“Ί Original implementation)

architecture

screenshot

how to run

Note - this is a prototype, not a production-grade app. (The OpenCV streaming setup is a bit buggy and occasionally crashes. You may need to restart.)

You will need:

  • A local machine with a webcam, and Python 3.11+. (Client has been tested on MacOS Sonoma / M1 Mac).
  • A Google Cloud project, with access to Google Compute Engine.
  • A HuggingFace account with an Access Token

Setup (Server)

  1. Create a Google Compute Engine instance with at least one GPU. I used two NVIDIA T4 GPUs, but adjust to whatever your quota allows. Set "Allow HTTP/HTTPS" traffic to true.
  2. SSH into the instance.
  3. Install Python packages for the server workloads.
git clone https://github.com/askmeegs/papermusic 
cd papermusic 
python3 -m venv . 
source ./bin/activate 
pip install -r requirements.txt
  1. Set your HuggingFace Access Token as an environment variable.
export HUGGINGFACE_USER_ACCESS_TOKEN=your_token_here

Setup (Client)

  1. Clone the repo on your local machine.
git clone https://github.com/askmeegs/papermusic 
cd papermusic 
  1. Install client packages.
python3 -m venv . 
source ./bin/activate 
pip install -r requirements.txt

Run all components

Place a hand-drawn musical instrument in front of your webcam, like the screenshot shown above. Make sure the notes are written clearly on the instrument.

  1. Start server/handlestream.py on GCE, to listen for the webcam stream.
  2. Start client/webcamclient.py on your local machine, to send your webcam stream to GCE.
  3. Start server/server.py on GCE, to process the webcam stream and identify the notes.
  4. Start client/audioclient.py on your local machine, to poll the Server and play the notes over local audio.

πŸ“š sources

About

draw an instrument, then play the notes πŸ—’οΈ πŸ’› 🎡

License:Apache License 2.0


Languages

Language:Jupyter Notebook 33.3%Language:Python 31.7%Language:PowerShell 20.4%Language:Roff 7.5%Language:Shell 7.1%