jhogsett / EMA-VFI-WebUI

Advanced AI-Based Video Renovation UI Using EMA-VFI & Real-ESRGAN


Help with installation

VicMahoney opened this issue · comments

I wrote to you on YouTube. I couldn't get started, although I seemed to have fulfilled the requirements. But I couldn't carry out interpolation in the original EMA-VFI repo either. Do you have more detailed installation instructions, or a similar Python-based installation guide?

@VicMahoney Hi Vic, I'd like to help.

What happened when you tried to interpolate with the EMA-VFI engine directly? Can you send me any console output?

I have been meaning to make a YT video showing step-by-step instructions. I probably ought to do this soon. In the meantime, I'll try going through the process myself on a laptop and see if the installation instructions still make sense.

Jerry

Here is what I did:

  1. I installed Python 3.10.9
  2. I installed Torch 1.13.1 in the Python folder - to check, I ran the import command and it showed the number of tensor cores
  3. Next, I installed the requirements in order: skimage 0.19.2, numpy 1.23.1, opencv-python 4.6.0, timm 0.6.11, tqdm - the latest versions available on the site. Python reported that almost all components were already installed. (Do they need to be installed in the Python folder, as I did, or in the EMA-VFI-main folder?)
  4. I downloaded the model checkpoints ("Download the model checkpoints" in their instructions) - this is the "ckpt" folder - and put it in the EMA-VFI-main folder.
  5. I run all these commands -
    python demo_2x.py # for 2x interpolation
    python demo_Nx.py --n 8 # for 8x interpolation
    After running them, a new GIF file of the train appears.
    Next in the instructions there is a testing section, and this is where the problems begin.
    For example, I want to run a test using the command -
    python benchmark/dataset.py --model model[ours/ours_small] --path /where/is/your/dataset
    I adapted this line by first downloading one dataset, the "Xiph dataset" (it is stored on GitHub and weighs about 5 kilobytes), putting it into a folder called dataset, which gives the command:
    python benchmark/Xiph.py --model model[ours/ours_small] --path /where/is/your/D:\WP\EMA-VFI-main\dataset
    This command displays this message:
    "Traceback (most recent call last):
      File "D:\WP\EMA-VFI-main\benchmark\Xiph.py", line 23, in <module>
        assert args.model in ['ours', 'ours_small'], 'Model not exists!'
    AssertionError: Model not exists!"
    Is the command I have given incorrect? Is it actually possible to test something without downloading an 80GB dataset?
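
For reference, the assertion in that traceback means --model expects the literal string ours or ours_small; the brackets in the README command are just shorthand for picking one of those, and the --path placeholder should be replaced entirely by your dataset path. Assuming the path above, a corrected invocation would look like:

    python benchmark/Xiph.py --model ours --path D:\WP\EMA-VFI-main\dataset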

It looks like you have their engine installed correctly since the two demo scripts worked.

The testing part of their instructions can be skipped, unless one is evaluating the engine itself, for instance to establish benchmarks*. I was personally confused by this when I started using these open source AI engines. What I found is they generally have three activities: training, testing and inference. It's only the inference that's needed for EMA-VFI WebUI.

I run Python via Anaconda, and for each application I use, I set up an Anaconda environment so each one can have its own specific versions of Python, CUDA, and other dependencies without interfering.
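
A minimal sketch of that per-application setup (the environment name here is illustrative; the Python version matches the one mentioned above):

    conda create -n ema-vfi python=3.10.9
    conda activate ema-vfi
    pip install -r requirements.txt

Each new Anaconda prompt needs the conda activate step again before running the application.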

I recommend these next steps:

I apologize for the installation being so messy and difficult. I have plans to make the installation easier by automatically installing the other components, but I haven't been able to do that yet. I have leads on the techniques needed, from Automatic1111 Stable Diffusion WebUI (the original inspiration for my application design).

*If you do indeed want to run the testing part of their instructions, I recommend reaching out to those folks directly.

I thought that testing would show me whether the installation was correct, since creating the GIFs seems fairly simple.
Did I understand correctly that you need to install the required additional components in the program folder itself, and not in the Python folder, for example?
Before that, I did everything according to your instructions, and after merging the "main" and "web UI" folders I was asked to install realesrgan and some other components (I can't remember which right now), but according to your instructions this should not happen.
There is no need to apologize for anything; I understand that in this environment most people already understand the subject.
I’ve only been doing this for a few days. It’s my mistake that I didn’t study to be an engineer or programmer(

Did I understand correctly that you need to install the required additional components in the program folder itself, and not in the Python folder, for example?

Here's how I have my folders laid out:

C:\AI - root for my AI-based projects
C:\AI\CONDA - root for my Anaconda Python projects
C:\AI\CONDA\EMA-VFI - installed the EMA-VFI engine here initially
C:\AI\CONDA\Real-ESRGAN - installed the Real-ESRGAN engine here initially
C:\AI\CONDA\EMA-VFI-WebUI - root for EMA-VFI WebUI app

I copied the needed folders & files from EMA-VFI and Real-ESRGAN to C:\AI\CONDA\EMA-VFI-WebUI per my installation notes

  • EMA-VFI: copy folders benchmark, ckpt, model; files config.py, dataset.py, Trainer.py
  • Real-ESRGAN: copy folder realesrgan

And then have FFmpeg installed on Windows in the system path.
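
As a sketch, the copying can be done from a Windows command prompt like this (paths follow the layout above; xcopy /E /I copies a folder tree and creates the destination, and only two of the items are shown):

    xcopy /E /I C:\AI\CONDA\EMA-VFI\ckpt C:\AI\CONDA\EMA-VFI-WebUI\ckpt
    copy C:\AI\CONDA\EMA-VFI\Trainer.py C:\AI\CONDA\EMA-VFI-WebUI\
    ffmpeg -version

The last command simply confirms FFmpeg is reachable on the system path.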

I’ve only been doing this for a few days. It’s my mistake that I didn’t study to be an engineer or programmer(

Not a mistake! 😄 I would really like to make my app more easily usable by folks and not require expert-level knowledge to install it.

@VicMahoney I've started going through a from-scratch install and I've run into a couple small issues already (like EMA-VFI "requiring" slightly different versions of Python and Torch than what I'm actually using elsewhere). It might take a day or two but I'll put together a step-by-step that I think will lead to a successful installation.

@VicMahoney just ran through a from-scratch installation and have a working application afterwards. I'll clean up my notes and post them here in a bit.

@VicMahoney My detailed notes are here: https://github.com/jhogsett/EMA-VFI-WebUI/wiki/Example-Windows-11-Install-Steps-November-3,-2023

I was able to get a successful interpolation after running through all these steps. I have not double-checked the steps for accuracy by doing a new from-scratch full run, so there may be a typo here and there I haven't caught.

Please see how far you're able to get with this!


I need time to check everything, I will write back in the next two days. Thank you for your promptness!

@jhogsett It's working! The interface has launched!
I did everything according to your new instructions in 20 minutes. Apparently I didn't succeed before due to some inaccuracies and a misunderstanding of how the pieces connect - as they say, the devil is in the details.
I will now explore the capabilities of the program. Thank you very much for your support!

@VicMahoney that's great news! 🎉

Let me know if anything is unclear or not working. I'm very interested in user feedback.

The most evolved part of the application is the Video Remixer tool. I've focused most of my recent work on that, and it's what I used to create my YouTube videos.

@jhogsett Of course, boss)
Best regards, Victor

New problem: after trying to restart the interface the next day, it again did not want to open.
Now, in the "Set Up EMA-VFI WebUI" section, specifically when running "pip install -r requirements.txt", an error appears in red:
ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5 Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11; 1.6.2 Requires-Python >=3.7,<3.10; 1.6.3 Requires-Python >=3.7,<3.10; 1.7.0 Requires-Python >=3.7,<3.10; 1.7.1 Requires-Python >=3.7,<3.10; 1.7.2 Requires-Python >=3.7,<3.11; 1.7.3 Requires-Python >=3.7,<3.11; 1.8.0 Requires-Python >=3.8,<3.11; 1.8.0rc1 Requires-Python >=3.8,<3.11; 1.8.0rc2 Requires-Python >=3.8,<3.11; 1.8.0rc3 Requires-Python >=3.8,<3.11; 1.8.0rc4 Requires-Python >=3.8,<3.11; 1.8.1 Requires-Python >=3.8,<3.11
ERROR: Could not find a version that satisfies the requirement torch==1.13.1 (from versions: 2.0.0, 2.0.1, 2.1.0)
ERROR: No matching distribution found for torch==1.13.1

@VicMahoney You might need to open the Anaconda prompt and perform the activation step for that created environment. That will arrange the right Python version and all the installed dependencies to run the application. That activation is necessary each time it's run.
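
For context, that pip output is typical of running under a newer Python than the one the environment was created with: torch 1.13.1 has no wheels for Python 3.11+, which is why pip only offers 2.0.0-2.1.0. The activation step looks like this (assuming the environment name used at install time):

    conda activate ema-vfi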

@jhogsett I did it the same way as in the earlier successful attempt. The prompt just moved down a line and that was it. Can this be redone?

Maybe now everything will work as it should - I removed Anaconda and simply installed the required version of Python directly, and the interface opened. Sorry to bother you.

@jhogsett To increase the FPS of a video, for example by two times, do I need to go to the "Interpolate Video" module, having first converted the video into a folder of PNGs?

@VicMahoney no bother! I'm glad you got it working again!

There are a few ways to increase the FPS. Only one of them will include the audio in the final video.

I'll send another message in a moment with some tips.

@jhogsett As I understand it, the "Video Remixer" item lets you do everything in one package? Interpolation (inserting additional frames) and upscaling?

If audio is important in the final video, use the Video Remixer tab:

  • On the Remix Home tab of Video Remixer, enter the full path to the video on the left, and click New Project
    • The video can be in .mp4 format, or any other format supported by FFmpeg, which is just about all of them
    • This will take you to the Remix Settings tab
  • On the Remix Settings tab, do the following:
    • Set the Split Type to None (unless you need scene detection)
    • Double-check that the path in the Set Project Path field is what you want; all the work files will be placed there
    • Click Next, this will take you to the Set Up Project tab
  • On the Set Up Project tab, leave the defaults and click Set Up Project

The application will do a series of things to get the project ready; this will take some time:

  • Breaking out the PNG frames from the source video
  • Duplicating them into a single scene (or multiple scenes if detection was done)
  • Creating thumbnails

When done setting up, it'll leave you on the Choose Scenes tab:

  • If there's only one scene (scene detection not used), it will automatically be set to "Kept"
    • If so, just click Done Choosing Scenes
  • With multiple scenes, you'll need to choose Keep for the scenes you want in the final video
  • This will take you to the Compile Scenes tab

  • On the Compile Scenes tab, click Compile Scenes (this ensures only the Kept scenes will be used in the remix video) and you'll be taken to the Process Remix tab
  • On the Process Remix tab, ensure only the Inflate New Frames checkbox is checked
    • This will insert frames between all existing frames, doubling the frame rate
    • This process takes some time

When done, you'll be taken to the Save Remix tab

  • On the Save Remix tab, click Save Remix to finalize and save the FPS-doubled video
    • This will take some time
    • It will slice the necessary .WAV audio from the original video and merge it with the processed frames to create the final video

@jhogsett As I understand it, the "Video Remixer" item lets you do everything in one package? Interpolation (inserting additional frames) and upscaling?

Exactly right - it uses most of the other features of the application, plus adds the handling of audio as well.

I just posted a new YT video using this feature. I cut the raw footage down to a nice 7 minute video from the original 20 minutes.
https://youtu.be/oSdqVQRe1AQ

@jhogsett Got it, great. I've already started test processing and will try out the possibilities. I just noticed that the load on the video card is not constant, compared to other programs I've used (both for upscaling and interpolation) - is there any room for optimization here to increase performance?

@VicMahoney this is something I've noticed and wondered about too.

There's not much control over how the two engines do their work. When brainstorming about it, it occurred to me that maybe the EMA-VFI engine could be reworked to operate on batches, to take advantage of more GPU VRAM. I'm not sure if that's possible - I haven't reached out to those folks about it but they might have an idea.

I'm also not sure if the Real-ESRGAN engine can be optimized. I feel like I often bump into memory limits with it. On my 24GB 3090 I can upscale full HD without tiling, given NVIDIA's recent driver change to borrow system RAM, but above that it needs triple-digit GBs of RAM and crashes.
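
If memory is the limiting factor, the upstream Real-ESRGAN command-line script exposes a tile option that trades speed for VRAM - a sketch of its standalone usage (not necessarily how this WebUI invokes the engine):

    python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs -o results --tile 256

Smaller tile sizes use less VRAM, at some cost in speed and possible seam artifacts.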

My long-term hope is:

  • to refactor my code to make the various engines easily switchable
  • to use improved technology as it emerges

Jerry

@jhogsett
I just ran a video with these items enabled:
Resize/Crop Frames
Resynthesize Frames
Inflate New Frames
It took more than an hour (I didn't time it exactly) for a video that itself lasts only 20 seconds (so with upscaling it will take even longer) - is this normal?
And is it possible to choose the final video bitrate, rather than a quality factor?

It took more than an hour (I didn't time it exactly) for a video that itself lasts only 20 seconds (so with upscaling it will take even longer) - is this normal?

That sounds about typical; it will really depend on your system and disk drive I/O. I run on a souped-up gaming PC with these specs. Also, I always process video on SSDs, as HDDs are generally too slow.

Alienware Aurora R15

  • 13th Gen Intel(R) Core(TM) i9-13900KF, 3000 MHz, 24 Cores, 32 Logical Processors
  • 64 GB RAM
  • 4TB NVMe storage
  • RTX 3090 w/ 24 GB

Is it possible to choose the final video bitrate, rather than a quality factor?

Yes. When saving the video, there is a "Create Custom Remix" tab that lets you provide custom FFmpeg arguments to guide the creation of video clips from the processed PNG frames and the merging of the audio files with those clips. I would need to do some experimenting to give a specific bitrate recommendation (I haven't done that often with FFmpeg). But that's the point of that tab: to do anything that FFmpeg will allow.

Info here may help with this: https://trac.ffmpeg.org/wiki/Limiting%20the%20output%20bitrate
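
As a sketch based on that wiki page, arguments like these cap the output at roughly a constant 4 Mb/s (the numbers are illustrative, and exactly where they go depends on what the Create Custom Remix tab expects):

    -b:v 4M -maxrate 4M -bufsize 8M

Here -b:v sets the target video bitrate, while -maxrate and -bufsize constrain how far the encoder can drift from it.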

I also have an M.2 SSD. OK, got it.

But how do I choose, for example, triple rather than double frame enlargement?

But how do I choose, for example, triple rather than double frame enlargement?

Video Remixer is intentionally feature-limited to make it easier to use, so it only offers the 1X, 2X and 4X upscaling options.

However, the Upscale Frames tab under the top-level Film Restoration tab will let you upscale PNG frames by arbitrary amounts from 1X to 8X. It's not tied directly into Video Remixer, but it will happily operate on a directory of PNG frames, or batch process a directory of Video Remixer scene directories.
However the Upscale Frames tab under the top-level Film Restoration tab will let you upscale PNG frames arbitrary amounts from 1x to 8x. It's not tied directly in to Video Remixer but it will happily operate on a directory of PNG frames, or batch process a directory of Video Remixer Scene directories.