jpauwels / in-latent-space

In Latent Space, No One Can Hear You Scream

This is my submission to the Timbre Tools Hackathon. It navigates the latent space of an auto-encoder using sensor data, specifically the spatial orientation of a mobile phone.
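As a rough illustration of the idea, the sketch below linearly maps three orientation angles to three latent coordinates. It is not taken from the bird-rave.pd patch: the function names, the angle ranges in degrees, and the latent range of ±3 are all assumptions made for this example, and the actual patch may scale and route the sensor values differently.

```python
def angle_to_latent(angle_deg, min_deg, max_deg, latent_range=3.0):
    """Linearly map an angle in [min_deg, max_deg] to [-latent_range, latent_range].

    Values outside the expected range are clamped first.
    latent_range=3.0 is an assumed exploration range, not a value from the patch.
    """
    clamped = max(min_deg, min(max_deg, angle_deg))
    unit = (clamped - min_deg) / (max_deg - min_deg)  # normalised to 0..1
    return (2.0 * unit - 1.0) * latent_range


def orientation_to_latent(yaw, pitch, roll):
    """Turn one orientation reading into three latent coordinates.

    The angle ranges below are assumptions; check what the sensor app really sends.
    """
    return [
        angle_to_latent(yaw, -180.0, 180.0),
        angle_to_latent(pitch, -90.0, 90.0),
        angle_to_latent(roll, -180.0, 180.0),
    ]
```

With these assumed ranges, a phone lying flat and pointing straight ahead (all angles zero) lands at the centre of the explored region, and tilting it moves each coordinate towards one end of its range.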

My hack was created by:

  • Training a RAVE model on the first 100 files of the public Warblr b10k dataset, which contains bird sounds recorded with mobile phones. The default configuration was used:
    rave preprocess --input_path /audio/folder --output_path ./bird-dataset --channels 1
    rave train --config v2 --db_path ./bird-dataset --out_path bird-rave --name bird-rave --channels 1
    rave export --streaming --run bird-rave/bird-rave_e18d54798e/version_0
    
    The trained model is included in this repo as bird-rave_e18d54798e_streaming.ts.
  • Installing MultiSense OSC on an Android phone and configuring it to send the orientation data to the OSC addresses /yaw, /pitch and /roll, and the proximity data to the address /proximity.
  • Installing the nn~ external for Pure Data and creating the included bird-rave.pd patch, which uses the phone sensor data to explore the latent space.
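To try the patch without a phone, the four OSC addresses listed above can be fed from a small script. The following is a minimal sketch using only the Python standard library. It encodes each value as a single float32 argument following the OSC 1.0 message layout. The helper names, the host, and the port are hypothetical, and it is an assumption that the patch accepts float arguments on these addresses.

```python
import socket
import struct


def osc_string(text):
    """Encode an OSC string: ASCII bytes, null-terminated, padded to a multiple of 4."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_float_message(address, value):
    """Build an OSC message carrying one big-endian float32 argument."""
    return osc_string(address) + osc_string(",f") + struct.pack(">f", value)


def send_orientation(host, port, yaw, pitch, roll, proximity):
    """Send one reading to the addresses used in this README over UDP.

    host and port are placeholders; they must match the patch's OSC receiver.
    """
    readings = (
        ("/yaw", yaw),
        ("/pitch", pitch),
        ("/roll", roll),
        ("/proximity", proximity),
    )
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for address, value in readings:
            sock.sendto(osc_float_message(address, value), (host, port))
```

A call such as `send_orientation("127.0.0.1", 9000, 45.0, 10.0, 0.0, 1.0)` would then stand in for one sensor update, where 9000 is only a placeholder for whichever port the patch listens on.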

About

License: Apache License 2.0