andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features

Home Page: http://andrewowens.com/multisensory/


RuntimeError: Command failed! ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4"


python sep_video.py data/translator.mp4 --model unet_pit --duration_mult 4 --out results/
Start time: 0.0
GPU = 0
Spectrogram samples: 512
(8.298, 8.288)
100.0% complete, total time: 0:00:00. 0:00:00 per iteration. (11:29 AM Tue)
Struct(alg=sourcesep, augment_audio=False, augment_ims=True, augment_rms=False, base_lr=0.0001, batch_size=24, bn_last=True, bn_scale=True, both_videos_in_batch=False, cam=False, check_iters=1000, crop_im_dim=224, dilate=False, do_shift=False, dset_seed=None, fix_frame=False, fps=29.97, frame_length_ms=64, frame_sample_delta=74.5, frame_step_ms=16, freq_len=1024, full_im_dim=256, full_model=False, full_samples_len=105000, gamma=0.1, gan_weight=0.0, grad_clip=10.0, im_split=False, im_type=jpeg, init_path=None, init_type=shift, input_rms=0.14142135623730953, l1_weight=1.0, log_spec=True, loss_types=['pit'], model_path=results/nets/sep/unet-pit/net.tf-160000, mono=False, multi_shift=False, net_style=no-im, normalize_rms=True, num_dbs=None, num_samples=173774, opt_method=adam, pad_stft=False, phase_type=pred, phase_weight=0.01, pit_weight=1.0, predict_bg=True, print_iters=10, profile_iters=None, resdir=/home/study/PycharmProjects/results/nets/sep/unet-pit, samp_sr=21000.0, sample_len=None, sampled_frames=248, samples_per_frame=700.7007007007007, show_iters=None, show_videos=False, slow_check_iters=10000, spec_len=512, spec_max=80.0, spec_min=-100.0, step_size=120000, subsample_frames=None, summary_iters=10, test_batch=10, test_list=../data/celeb-tf-v6-full/test/tf, total_frames=149, train_iters=160000, train_list=../data/celeb-tf-v6-full/train/tf, use_3d=True, use_sound=True, use_wav_gan=False, val_list=../data/celeb-tf-v6-full/val/tf, variable_frame_count=False, vid_dur=8.288, weight_decay=1e-05)
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -r 29.97 -vf scale=256:256 "/tmp/tmpw4889ppn/small_%04d.png"
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -r 29.97 -vf "scale=-2:'min(600,ih)'" "/tmp/tmpw4889ppn/full_%04d.png"
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -ar 21000.0 -ac 2 "/tmp/tmpw4889ppn/sound.wav"
Running on:
2019-05-14 11:29:30.212532: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-14 11:29:30.329825: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-14 11:29:30.330229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
totalMemory: 7.77GiB freeMemory: 7.19GiB
2019-05-14 11:29:30.330244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-05-14 11:29:30.547596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:29:30.547627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-05-14 11:29:30.547632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-05-14 11:29:30.547797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6920 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
Raw spec length: [1, 514, 1025]
Truncated spec length: [1, 512, 1025]
('gen/conv1', [1, 512, 1024, 2], '->', [1, 512, 512, 64])
('gen/conv2', [1, 512, 512, 64], '->', [1, 512, 256, 128])
('gen/conv3', [1, 512, 256, 128], '->', [1, 256, 128, 256])
('gen/conv4', [1, 256, 128, 256], '->', [1, 128, 64, 512])
('gen/conv5', [1, 128, 64, 512], '->', [1, 64, 32, 512])
('gen/conv6', [1, 64, 32, 512], '->', [1, 32, 16, 512])
('gen/conv7', [1, 32, 16, 512], '->', [1, 16, 8, 512])
('gen/conv8', [1, 16, 8, 512], '->', [1, 8, 4, 512])
('gen/conv9', [1, 8, 4, 512], '->', [1, 4, 2, 512])
('gen/deconv1', [1, 4, 2, 512], '->', [1, 8, 4, 512])
('gen/deconv2', [1, 8, 4, 1024], '->', [1, 16, 8, 512])
('gen/deconv3', [1, 16, 8, 1024], '->', [1, 32, 16, 512])
('gen/deconv4', [1, 32, 16, 1024], '->', [1, 64, 32, 512])
('gen/deconv5', [1, 64, 32, 1024], '->', [1, 128, 64, 512])
('gen/deconv6', [1, 128, 64, 1024], '->', [1, 256, 128, 256])
('gen/deconv7', [1, 256, 128, 512], '->', [1, 512, 256, 128])
('gen/deconv8', [1, 512, 256, 256], '->', [1, 512, 512, 64])
('gen/fg', [1, 512, 512, 128], '->', [1, 512, 1024, 2])
('gen/bg', [1, 512, 512, 128], '->', [1, 512, 1024, 2])
Restoring from: results/nets/sep/unet-pit/net.tf-160000
predict
samples shape: (1, 173774, 2)
samples pred shape: (1, 173774, 2)
(512, 1025)
Writing to: results/
ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4"
[wav @ 0x558b3f868b40] Estimating duration from bitrate, this may be inaccurate
[wav @ 0x558b3f868b40] Could not find codec parameters for stream 0 (Audio: none, 1065353216 Hz, 16256 channels, 9481256 kb/s): unknown codec
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Unknown encoder 'h264'
Traceback (most recent call last):
File "sep_video.py", line 455, in
ut.make_video(full_ims, pr.fps, pj(arg.out, 'fg%s.mp4' % name), snd(full_samples_fg))
File "/home/study/PycharmProjects/untitled/util.py", line 3176, in make_video
% (sound_flags_in, fps, input_file, sound_flags_out, flags, out_fname))
File "/home/study/PycharmProjects/untitled/util.py", line 917, in sys_check
fail('Command failed! %s' % cmd)
File "/home/study/PycharmProjects/untitled/util.py", line 14, in fail
def fail(s = ''): raise RuntimeError(s)
RuntimeError: Command failed! ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4"

I ran into the problem above. I ran the code as you describe, on Python 3, but what is going wrong here? Thank you for your prompt reply!

I just want to separate a mixed file (WAV format), but I get so many errors when I use a mix.wav in place of translator.mp4.

The version of ffmpeg that you have cannot write an h264 video, since it is missing the codec. You could reinstall it, or you could try a precompiled version: e.g. https://johnvansickle.com/ffmpeg/.
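For anyone hitting the same error, a quick way to confirm this diagnosis (assuming `ffmpeg` is on your `PATH`) is to list the encoders your build was compiled with and filter for H.264:

```shell
# List ffmpeg's compiled-in encoders and look for an H.264 entry
# (typically "libx264"). If nothing matches, this build cannot write
# the .mp4 outputs that sep_video.py produces.
ffmpeg -hide_banner -encoders 2>/dev/null | grep -i 264 \
  || echo "no h264 encoder found"
```

If the fallback message prints, switch to a build that includes libx264 (e.g. the static build linked above) before re-running sep_video.py.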

Thank you for your quick answer. Also, can I use a mix.wav in place of translator.mp4?

Excuse me, did you see my question? Can you give me a little help? I don't know how to proceed after downloading "ffmpeg-release-amd64-static.tar.xz - md5". What should I do next?