ktam069 / speech_separation_modified


Google Audiovisual Model

Description

Adapted version of the model for Part IV Project #80 (2019). Commands are adapted to be run on Windows.

A modified version of https://github.com/bill9800/speech_separation (adapted for Part IV Project #80).

Dependencies

To run this project fully, the following libraries and tools are required.

Language and version:

Python 3.6

Download and install:

Use pip install:

  • keras

  • tensorflow

  • librosa

  • youtube-dl

  • pytest

  • sox

To install all the required libraries listed above, run the following command:

python -m pip install -r requirements.txt
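
For reference, requirements.txt is expected to list the pip packages above. A minimal sketch of its contents (exact version pins, if any, are not shown here):

keras
tensorflow
librosa
youtube-dl
pytest
sox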

Install manually:

  • ffmpeg

  • cmake

Ensure that the following are in your system path environment variables (a quick check is sketched after this list):

  • ...\Python\Scripts

  • ...\ffmpeg\bin

  • ...\sox
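
As a quick sanity check (this helper is not part of the repository), the following Python snippet reports whether the external tools are reachable on the system path:

import shutil

# External tools that must be discoverable on the system PATH.
for tool in ("ffmpeg", "sox", "youtube-dl"):
    path = shutil.which(tool)
    print(f"{tool}: {path if path else 'NOT FOUND - check your PATH'}")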

Instructions for Running

Dataset

Follow the instructions in the README in the data folder.

Steps 2-7* for data downloading can be run by calling:

python download_dataset.py

from within the data folder.

Training the Model

Once a range of data has been downloaded into the data folder, the audiovisual model can be trained or run.

From within /model/model_v2, run the following to train the model:

python AV_train.py
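
AV_train.py is expected to save the trained weights to an H5 file (used in the next section). If you want to save intermediate checkpoints yourself, a typical Keras pattern looks like the sketch below; the filename and arguments are illustrative, not taken from AV_train.py:

from keras.callbacks import ModelCheckpoint

# Illustrative only: keep the best weights seen so far in an H5 file.
checkpoint = ModelCheckpoint("AV_model_checkpoint.h5", save_best_only=True, verbose=1)
# Pass it to training, e.g. model.fit(..., callbacks=[checkpoint])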

Using the Trained Model

Put the saved H5 model file into the /model/model_v2 directory, and change the model path parameter in predict_video.py to match its filename.
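
For reference, loading a saved H5 model with Keras follows the pattern below. The filename is a placeholder, and the actual variable name inside predict_video.py may differ:

from keras.models import load_model

# Placeholder filename - replace with the name of your trained H5 file.
MODEL_PATH = "AV_model.h5"

# If the model was trained with custom losses or layers, they must be
# supplied via the custom_objects argument of load_model.
model = load_model(MODEL_PATH)
model.summary()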

From within /model/model_v2, run the following to download a test video:

python download_test_video.py

From within /model/model_v2, run the following to run the demo:

python predict_video.py

The resulting clean audio files will be found in the pred folder.
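
To quickly inspect one of the separated tracks, librosa can load it and report its duration; the filename below is hypothetical, since the actual naming comes from predict_video.py:

import librosa

# Hypothetical output filename inside the pred folder.
audio, sr = librosa.load("pred/output_speaker1.wav", sr=None)
print(f"Loaded {len(audio) / sr:.2f} s of audio at {sr} Hz")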

License

MIT License

