kukas / constant-q

A naive implementation of the Constant Q transform (suited for melodic inputs) programmed in Haskell

Constant Q transform

A naive Constant Q transform (CQT) implementation in Haskell. The CQT suits melodic input better than the discrete Fourier transform because its frequency resolution is finer in the lower frequency bands. In other words, the ratio of each bin's center frequency to its resolution (the Q factor) remains constant.
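In the usual formulation of the transform (an assumption here; the repository's code may use different conventions), with b bins per octave, minimum frequency f_min and sample rate f_s, the constant ratio can be written as:

```latex
f_k = f_{\min} \cdot 2^{k/b}, \qquad
Q = \frac{f_k}{\Delta f_k}, \qquad
N_k = \frac{Q \, f_s}{f_k}

X[k] = \frac{1}{N_k} \sum_{n=0}^{N_k - 1} w_k[n] \, x[n] \, e^{-2\pi i \, Q n / N_k}
```

Here f_k is the center frequency of bin k, Δf_k its bandwidth, N_k the window length in samples (rounded to an integer in practice) and w_k a window function of that length. Low-frequency bins get longer windows, which is where the finer resolution comes from.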

Example of Constant Q transform of a violoncello sample

Running

The code is structured as a Stack package. The build should be easily reproducible by running stack build in the root directory.

Usage

Usage: constant-q-exe [OPTIONS]... -i INPUTFILE -o OUTPUTFILE
Process INPUTFILE (in WAV format) and save the Constant Q transform 
spectrogram to OUTPUTFILE (png image format).
Options:
  -min FREQ Set minimum frequency to FREQ (default 110.0)
  -max FREQ Set maximum frequency to FREQ (default 11000.0)
  -b NUM    Set the number of frequency bins in one octave (default 48)
  -p NUM    Set the hop size (default 1024)
  -q NUM    Set the Q factor (quality of frequency resolution) (default 72)
  -h        Print a help message and exit

Usage example

stack exec -- constant-q-exe -i input/cello.wav -o doc/cello.png -q 50 -p 200 -min 220

Raising the minimum frequency to 220 Hz and lowering the Q factor to 50 speeds up the computation considerably. The hop size was reduced to 200 to produce a wider image.
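One way to see why these two options matter is to estimate the work per image column. The sketch below assumes the usual relation N = Q * fs / f between window length and bin frequency, and a direct summation over each window. The function names are hypothetical and are not taken from the repository.

```haskell
-- Window length in samples for a bin centred at `freq` Hz, assuming the
-- standard relation N = Q * fs / f (an assumption, not the repository's code).
windowLength :: Double -> Double -> Double -> Int
windowLength sampleRate q freq = ceiling (q * sampleRate / freq)

-- Centre frequencies from fMin up to fMax with `b` bins per octave.
binFrequencies :: Double -> Double -> Int -> [Double]
binFrequencies fMin fMax b = takeWhile (<= fMax) (map freqAt [0 ..])
  where
    freqAt :: Int -> Double
    freqAt k = fMin * 2 ** (fromIntegral k / fromIntegral b)

-- Number of samples touched per output column when every bin is evaluated
-- by direct summation over its own window.
costPerColumn :: Double -> Double -> Double -> Double -> Int -> Int
costPerColumn sampleRate q fMin fMax b =
  sum [windowLength sampleRate q f | f <- binFrequencies fMin fMax b]
```

At an assumed sample rate of 44100 Hz, windowLength gives 28866 samples for the defaults (Q 72, 110 Hz) and 10023 samples for the example settings (Q 50, 220 Hz). Raising the minimum frequency also drops the lowest octave of bins, which are the most expensive ones.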

The output image is shown in the introduction section of the readme.

Implementation

The input file is read using the WAVE-0.1.3 library; only WAV files are supported. The input samples are sliced into a list of suffixes whose starting points are one hop size apart. The program then evaluates the CQ transform on each of these suffixes, which is similar to computing a short-time Fourier transform through the direct, naive approach. The result for each suffix becomes one pixel column of the output image. To reduce spectral leakage across frequency bins, I used a Hann window.
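The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names hopSuffixes, hann, cqBin and cqColumn are hypothetical, the window length is assumed to follow N = Q * fs / f, and WAV decoding and image output are left out.

```haskell
import Data.Complex (Complex, magnitude, mkPolar)

-- Suffixes of the input whose starting points are `hop` samples apart.
hopSuffixes :: Int -> [a] -> [[a]]
hopSuffixes hop xs = takeWhile (not . null) (iterate (drop hop) xs)

-- Periodic Hann window of length n.
hann :: Int -> [Double]
hann n = [0.5 * (1 - cos (2 * pi * fromIntegral i / fromIntegral n)) | i <- [0 .. n - 1]]

-- Magnitude of one CQ bin, evaluated by direct summation over the first
-- n samples of a suffix (fewer if the suffix is shorter than the window).
cqBin :: Double -> Double -> Double -> [Double] -> Double
cqBin sampleRate q freq samples = magnitude (sum terms) / fromIntegral n
  where
    n = ceiling (q * sampleRate / freq) :: Int
    terms = zipWith3 term (hann n) samples [0 ..]
    term :: Double -> Double -> Int -> Complex Double
    term w x i = mkPolar (w * x) (-2 * pi * q * fromIntegral i / fromIntegral n)

-- One pixel column: the magnitudes of all bins for a single suffix.
cqColumn :: Double -> Double -> [Double] -> [Double] -> [Double]
cqColumn sampleRate q freqs suffix = [cqBin sampleRate q f suffix | f <- freqs]
```

Mapping cqColumn over hopSuffixes hop samples would then give the columns of the spectrogram, one per hop.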

To speed up the computation, repeated computations are cached (see transformFactorMemoized). I tried straightforward parallelisation using the parallel package, but it did not help, and in some cases the program actually ran slower. Unfortunately I don't understand why, since the image columns are computed independently of one another.
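As an illustration of the caching idea only (how transformFactorMemoized actually works is not shown here), the windowed complex factors depend on the bin and the sample index but not on the column, so they can be built once and shared. The sketch below does this with an ordinary lazily evaluated binding. All names in it are hypothetical and the window length N = Q * fs / f is an assumption.

```haskell
import Data.Complex (Complex ((:+)), magnitude, mkPolar)

-- Precomputed factors for one bin: Hann weight times complex exponential.
type Kernel = [Complex Double]

binKernel :: Double -> Double -> Double -> Kernel
binKernel sampleRate q freq = [mkPolar (weight i) (phase i) | i <- [0 .. n - 1]]
  where
    n = ceiling (q * sampleRate / freq) :: Int
    weight i = 0.5 * (1 - cos (2 * pi * fromIntegral i / fromIntegral n))
    phase i = -2 * pi * q * fromIntegral i / fromIntegral n

-- Apply a precomputed kernel to the start of one suffix of the input.
applyKernel :: Kernel -> [Double] -> Double
applyKernel kernel samples =
  magnitude (sum (zipWith scale samples kernel)) / fromIntegral (length kernel)
  where
    scale x c = (x :+ 0) * c

-- The kernels do not depend on the column, so the `where` binding is
-- evaluated at most once per call and reused for every column.
spectrogram :: Double -> Double -> [Double] -> [[Double]] -> [[Double]]
spectrogram sampleRate q freqs columns =
  [map (`applyKernel` col) kernels | col <- columns]
  where
    kernels = map (binKernel sampleRate q) freqs
```

With this structure the trigonometric functions are evaluated once per bin and sample index rather than once per bin, sample index and column.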
