javiferfer / prednet_pytorch

version 32

PredNet in PyTorch Version

Kenta Tanaka, Manabu Kosaka & Eiji Watanabe, 2022. Fork by Javier Fdez, 2022.

Overview

The software trains a deep neural network (PredNet) on a sequence of still images extracted from a video, and generates predicted future frames.

Set environment

Using Poetry to manage the dependencies:

` poetry install `

Preparing data

Put the target video file "YOUR_VIDEO" in the "YOUR_DATA" folder, then execute the following command to generate still images from the video.

` python PredNet/generate_image.py YOUR_DATA/YOUR_VIDEO -d YOUR_DATA `

To change the width of the image, use the -w option.

` python PredNet/generate_image.py YOUR_DATA/YOUR_VIDEO -d YOUR_DATA -w 160 `

To change the height of the image, use the -g option.

` python PredNet/generate_image.py data/YOUR_VIDEO -d data -w 160 -g 120 `

"train_list.txt", describing the list of files used for training, and "test_list.txt", describing the list of files used for testing, are saved in the YOUR_DATA folder. By default, the latter half of the video becomes the test data.
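The split itself is done by generate_image.py; a minimal sketch of the documented default behavior (latter half of the frames becomes test data) might look like this — the function name and one-filename-per-line list format are assumptions, not the script's actual code:

```python
from pathlib import Path

def split_frame_lists(data_dir: str, frame_names: list) -> None:
    """Write train_list.txt / test_list.txt into data_dir, putting the
    latter half of the frames into the test set (the documented default)."""
    half = len(frame_names) // 2
    data = Path(data_dir)
    (data / "train_list.txt").write_text("\n".join(frame_names[:half]) + "\n")
    (data / "test_list.txt").write_text("\n".join(frame_names[half:]) + "\n")
```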

Execute the following command to generate dB data from an audio file.

` python generate_spectrum.py wave_to_db rain_1.mp3 --with_image `

Execute the following command to generate audio files from dB files in the result folder.

` python generate_spectrum.py db_to_wave result --with_image `

Training

Execute the following command to train the model.

` python PredNet/main.py -i YOUR_DATA/train_list.txt `

E.g.:

` python main.py -i data/train_list.txt `

The trained models are saved in the "models" folder.

If you have multiple "train_list.txt" files, create a "sequence_list.txt" listing the path of each one per line, as follows, and then execute the command below.

```
data1/train_list.txt
data2/train_list.txt
data3/train_list.txt
data4/train_list.txt
...
```

```
python PredNet/main.py -seq sequence_list.txt -g 0
python main.py -i data/train_list.txt --save 40 --period 100
```

If you train from dB files, execute the following command.

` python main.py --channels 2,48,96,192 --size 160,512 `

In case you are having trouble with the dependencies, go to https://pytorch.org/get-started/locally/ and look for the appropriate cudatoolkit. If you are using conda:

` conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch `

Prediction

Generate predicted frames with the following command.

` python PredNet/main.py -i YOUR_DATA/test_list.txt --initmodel models/YOUR_MODEL -l NUMBER_OF_INPUT_IMAGES --ext NUMBER_OF_PREDICTED_IMAGES `

E.g.: ` python src/test/main.py -i data/test_list.txt --initmodel models/100.pth `

Predicted images (test_#y_0.jpg) of all the images described in "test_list.txt" are generated in the "result" folder. In addition, for each input sequence, extended-prediction images (test_#y_1.jpg, test_#y_2.jpg, ...) are generated, one for each predicted frame.
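Assuming the naming pattern above, where # stands for the sequence index, the expected output filenames for one sequence can be sketched as:

```python
def predicted_filenames(seq_index: int, num_predicted: int) -> list:
    """Filenames following the documented test_#y_N.jpg pattern:
    index 0 is the base prediction, 1..num_predicted are the
    extended predictions (the exact semantics are an assumption)."""
    return [f"test_{seq_index}y_{n}.jpg" for n in range(num_predicted + 1)]
```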

Options

```python
import argparse
from distutils.util import strtobool

parser = argparse.ArgumentParser(description='PredNet')
parser.add_argument('--images', '-i', default='data/train_list.txt',
                    help='Path to image list file')
parser.add_argument('--sequences', '-seq', default='',
                    help='Path to sequence list file')
parser.add_argument('--device', '-d', default="", type=str,
                    help='Computational device')
parser.add_argument('--root', '-r', default='.',
                    help='Root directory path of sequence and image files')
parser.add_argument('--initmodel', default='',
                    help='Initialize the model from given file')
parser.add_argument('--size', '-s', default='160,120',
                    help='Size of target images. width,height (pixels)')
parser.add_argument('--channels', '-c', default='3,48,96,192',
                    help='Number of channels on each layer')
parser.add_argument('--offset', '-o', default='0,0',
                    help='Center offset of clipping input image (pixels)')
parser.add_argument('--input_len', '-l', default=20, type=int,
                    help='Input frame length for extended prediction on test (frames)')
parser.add_argument('--ext', '-e', default=10, type=int,
                    help='Extended prediction on test (frames)')
parser.add_argument('--bprop', default=20, type=int,
                    help='Back propagation length (frames)')
parser.add_argument('--save', default=10000, type=int,
                    help='Period of save model and state (frames)')
parser.add_argument('--period', default=1000000, type=int,
                    help='Period of training (frames)')
parser.add_argument('--saveimg', dest='saveimg', action='store_true')
parser.add_argument('--useamp', dest='useamp', action='store_true',
                    help='Flag for using AMP')
parser.add_argument('--lr', default=0.001, type=float,
                    help='Learning rate')
parser.add_argument('--lr_rate', default=1.0, type=float,
                    help='Reduction rate for Step lr scheduler')
parser.add_argument('--min_lr', default=0.0001, type=float,
                    help='Lower bound learning rate for Step lr scheduler')
parser.add_argument('--batchsize', default=1, type=int,
                    help='Input batch size')
parser.add_argument('--shuffle', default=False, type=strtobool,
                    help='True enables random data sampling (default: False)')
parser.add_argument('--num_workers', default=0, type=int,
                    help='Num. of dataloader processes (default: num of cpu cores)')
parser.add_argument('--tensorboard', dest='tensorboard', action='store_true',
                    help='Enable logging for Tensorboard')
parser.add_argument('--up_down_up', action='store_true',
                    help='Enable cycling up-down-up in order')
parser.add_argument('--color_space', default='RGB', type=str,
                    help='Image color space (RGB, HSV, LAB, CMYK, YCbCr) - the dimension of this color space and the 1st channel must be the same.')
parser.add_argument('--loss', type=str, default='mse',
                    help='Loss name for training: "mse", "corr_wise", or "ensemble" (default: mse).')
parser.add_argument('--amp', default=0.0, type=float,
                    help='Amplitude for sine function')
parser.add_argument('--omg', default=1.0, type=float,
                    help='Angular velocity for sine function')
parser.set_defaults(test=False)
args = parser.parse_args()
```
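The comma-separated string options such as --size and --channels presumably get split into integers before use; a minimal sketch of that parsing, under that assumption (the real main.py may do it differently):

```python
import argparse

# Hypothetical parsing of the comma-separated options, mirroring the
# defaults shown above; values like '160,512' must become integers
# somewhere before reaching the model.
parser = argparse.ArgumentParser(description='PredNet options sketch')
parser.add_argument('--size', '-s', default='160,120')
parser.add_argument('--channels', '-c', default='3,48,96,192')
args = parser.parse_args(['--size', '160,512', '--channels', '2,48,96,192'])

width, height = (int(v) for v in args.size.split(','))
channels = tuple(int(v) for v in args.channels.split(','))
```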

Tensorboard logs

Execute the software with the "--tensorboard" option. Tensorboard logs will be saved in the "runs" folder.

Then execute the following command.

```
$ python main.py --tensorboard
$ tensorboard --logdir runs
```

From pth to csv, From csv to pth

from pth to csv

` python csv_serializer.py pth_to_csv <path to pth file> -dir <csv_directory> `

E.g.:

` python3 csv_serializer.py pth_to_csv model_x.pth -dir model_x_folder `

from csv to pth

` python3 csv_serializer.py csv_to_pth <output_directory> -dir <csv_directory> `

E.g.:

` python3 csv_serializer.py csv_to_pth model_x -dir model_x_folder `
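csv_serializer.py presumably flattens each tensor in the checkpoint into a CSV file and rebuilds the state dict on the way back. The round-trip idea can be sketched without torch using plain lists — the file layout and function names here are illustrative assumptions, not the script's actual format:

```python
import csv
from pathlib import Path

def weights_to_csv(weights: dict, out_dir: str) -> None:
    """Write one CSV file per named weight array (illustrative layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, values in weights.items():
        with open(out / f"{name}.csv", "w", newline="") as f:
            csv.writer(f).writerow(values)

def csv_to_weights(in_dir: str) -> dict:
    """Rebuild the name -> values mapping from the CSV directory."""
    weights = {}
    for path in Path(in_dir).glob("*.csv"):
        with open(path, newline="") as f:
            weights[path.stem] = [float(v) for v in next(csv.reader(f))]
    return weights
```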

Deterministic learning

For deterministic learning, set "torch.backends.cudnn.enabled = False" (https://pytorch.org/docs/stable/backends.html#torch-backends-cudnn) and initialize from a fixed-weight model via the --initmodel option.
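A configuration sketch for a reproducible run — the cuDNN line is the one the README names; the seeding calls are an additional common practice, not something the README prescribes:

```python
import random

import numpy as np
import torch

# Seed every RNG involved (an addition beyond what the README mentions).
SEED = 0
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

# Disabling cuDNN avoids its non-deterministic kernels, as noted above;
# combine with --initmodel to start from fixed initial weights.
torch.backends.cudnn.enabled = False
```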

Reference

- Original PredNet: https://coxlab.github.io/prednet/
- Chainer implementation: https://github.com/quadjr/PredNet
- PyTorch implementation: https://github.com/leido/pytorch-prednet

Application to the study of the brain function

Illusory Motion Reproduced by Deep Neural Networks Trained for Prediction https://doi.org/10.3389/fpsyg.2018.00345

Code structure

Project based on https://drivendata.github.io/cookiecutter-data-science/

About


License: MIT License


Languages

Python 93.1%, Cuda 6.1%, C++ 0.8%