jonathanlimsc / CS4347

Sound and Music Computing

Assignment 1

Part 1A

Reads an input_file line by line, where each line has the format <audio_file_path, label>. Each audio file is read and its audio features are extracted: root mean square (RMS), peak-to-average ratio, zero crossings, median absolute deviation, and mean absolute deviation. The audio file path and the calculated feature values are then written to a CSV output file. Output string format per audio file: <audio_file_path>,feature1,feature2,...,featureN
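
As an illustration of the five features listed above, here is a minimal sketch using NumPy. The function names `extract_features` and `format_row`, the zero-crossing definition (fraction of adjacent sample pairs whose signs differ), and the six-decimal output formatting are assumptions made for this example and may not match the code in the repository.

```python
import numpy as np

def extract_features(samples):
    """Return [rms, peak_to_average, zero_crossing, median_abs_dev, mean_abs_dev]."""
    x = np.asarray(samples, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    # Peak-to-average ratio: largest absolute sample divided by the RMS.
    # Guard against an all-zero (silent) signal.
    par = np.max(np.abs(x)) / rms if rms > 0 else 0.0
    # Zero crossings (assumed definition): fraction of neighbouring
    # sample pairs whose product is negative, i.e. the sign flips.
    zcr = np.mean(x[:-1] * x[1:] < 0)
    # Median absolute deviation around the median.
    mad_median = np.median(np.abs(x - np.median(x)))
    # Mean absolute deviation around the mean.
    mad_mean = np.mean(np.abs(x - np.mean(x)))
    return [rms, par, zcr, mad_median, mad_mean]

def format_row(audio_file_path, features):
    """Build one CSV line: <audio_file_path>,feature1,...,featureN."""
    return audio_file_path + "," + ",".join(f"{v:.6f}" for v in features)
```

Calling `format_row(path, extract_features(samples))` for each input line would produce one CSV row per audio file.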

Part 1B

Reads an input_file line by line, where each line has the format <audio_file_path, label>. Each audio file is read into a list of floats and split into buffers; the number of buffers depends on the buffer's WINDOW_SIZE and STEP_SIZE. Feature extraction is applied to every buffer, producing a feature matrix with one row per buffer. Finally, the mean and standard deviation (std) of each feature are calculated across all buffers.

Hence, each audio file is represented as a list of mean and std values. An output string is written to the output_file in this format: feature1_mean,feature2_mean,...,featureN_mean,feature1_std,feature2_std,...,featureN_std,label
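
The buffering and summary steps above could be sketched as follows. The values chosen for `WINDOW_SIZE` and `STEP_SIZE`, the helper names, and the choice to drop a trailing partial buffer are assumptions for illustration only; the repository may use different settings.

```python
import numpy as np

WINDOW_SIZE = 1024  # assumed value, for illustration only
STEP_SIZE = 512     # assumed value, for illustration only

def split_into_buffers(samples, window_size=WINDOW_SIZE, step_size=STEP_SIZE):
    """Slice the signal into overlapping buffers; a trailing partial buffer is dropped."""
    x = np.asarray(samples, dtype=float)
    if len(x) < window_size:
        return np.empty((0, window_size))
    num_buffers = 1 + (len(x) - window_size) // step_size
    return np.stack([x[i * step_size : i * step_size + window_size]
                     for i in range(num_buffers)])

def summarize(samples, feature_fns, window_size=WINDOW_SIZE, step_size=STEP_SIZE):
    """Return [feature means..., feature stds...] computed over all buffers."""
    buffers = split_into_buffers(samples, window_size, step_size)
    # Feature matrix: one row per buffer, one column per feature.
    matrix = np.array([[fn(b) for fn in feature_fns] for b in buffers])
    return np.concatenate([matrix.mean(axis=0), matrix.std(axis=0)])

def format_row(summary, label):
    """Build one output line: means, then stds, then the label."""
    return ",".join(f"{v:.6f}" for v in summary) + "," + label
```

Here `feature_fns` is a list of functions that each map one buffer to a single number, so the same framing code can be reused with any feature set.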

Assignment 2

Similar to Assignment 1, but with a different set of features.

Assignment 3

Generates MFCCs (mel-frequency cepstral coefficients) for each audio file using 26 mel-spaced filters. Reference: http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/#computing-the-mel-filterbank
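
As a rough sketch of the filterbank step only, the code below builds 26 triangular mel-spaced filters using the mel conversion formulas from the referenced guide. The FFT size of 512, the 16 kHz sample rate, and the function names are assumptions for this example, not values taken from the repository. A full MFCC pipeline would additionally apply the filters to each frame's power spectrum, take the logarithm, and apply a DCT.

```python
import numpy as np

def hz_to_mel(f):
    """Convert frequency in Hz to mels: M(f) = 1125 * ln(1 + f / 700)."""
    return 1125.0 * np.log(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse conversion: f = 700 * (exp(m / 1125) - 1)."""
    return 700.0 * (np.exp(m / 1125.0) - 1.0)

def mel_filterbank(num_filters=26, nfft=512, sample_rate=16000,
                   low_hz=0.0, high_hz=None):
    """Return a (num_filters, nfft // 2 + 1) matrix of triangular filters."""
    if high_hz is None:
        high_hz = sample_rate / 2.0
    # num_filters + 2 points equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(low_hz), hz_to_mel(high_hz), num_filters + 2)
    # Map each point to the nearest lower FFT bin.
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((num_filters, nfft // 2 + 1))
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        # Rising edge from 0 at `left` up to (but not including) `center`.
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        # Falling edge from 1 at `center` down towards 0 at `right`.
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank
```

Multiplying this matrix by a frame's power spectrum (length `nfft // 2 + 1`) gives 26 filterbank energies for that frame.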
