mingkin/FasterLivePortrait

FasterLivePortrait: Bring portraits to life in Real Time!

Original repository: LivePortrait. Thanks to the authors for sharing it.

New features:

Runs LivePortrait in real time on an RTX 3090 GPU using TensorRT, reaching 30+ FPS. This figure is the end-to-end speed for rendering a single frame, including pre- and post-processing, not just the model inference speed.
Converted the LivePortrait models to ONNX, achieving an inference speed of about 70ms/frame (~12 FPS) with onnxruntime-gpu on an RTX 3090, which facilitates cross-platform deployment.
Seamless support for the native gradio app, running several times faster and supporting simultaneous inference on multiple faces. Some results can be seen here: pr105
Refactored the code structure so that it no longer depends on PyTorch; all models use ONNX or TensorRT for inference.
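
To illustrate the difference between end-to-end speed and model-only speed mentioned above, here is a minimal sketch that times a whole per-frame pipeline. The names `measure_fps`, `preprocess`, `infer`, `postprocess` and `full_pipeline` are hypothetical placeholders for illustration, not functions from this repository.

```python
import time
from typing import Callable, Iterable


def measure_fps(frames: Iterable, pipeline: Callable, warmup: int = 2) -> float:
    """Return end-to-end frames per second for `pipeline` over `frames`.

    The first `warmup` frames run untimed, since early calls often include
    one-off costs such as session or engine initialisation.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        pipeline(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warmup iterations")
    start = time.perf_counter()
    for frame in timed:
        pipeline(frame)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed


# Hypothetical stand-ins for the real stages (assumptions for illustration).
def preprocess(frame):
    return frame


def infer(frame):
    time.sleep(0.005)  # pretend the model call takes about 5 ms
    return frame


def postprocess(frame):
    return frame


def full_pipeline(frame):
    # Time everything a frame goes through, not only infer().
    return postprocess(infer(preprocess(frame)))


if __name__ == "__main__":
    fps = measure_fps(range(12), full_pipeline)
    print(f"end-to-end: {fps:.1f} FPS ({1000.0 / fps:.1f} ms/frame)")
```

Passing only `infer` instead of `full_pipeline` would give the model-only number, which is usually the more optimistic of the two.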

If you find this project useful, please give it a star ❤️❤️


Changelog

2024/07/17: Added support for Docker environment, providing a runnable image.
2024/07/18: Added macOS support (no Docker needed, Python is enough). M1/M2 chips are faster, but it's still quite slow 😟
- Install ffmpeg: brew install ffmpeg
- Set up a Python 3.10 virtual environment. miniforge is recommended: conda create -n flip python=3.10 && conda activate flip
- Install requirements: pip install -r requirements_macos.txt
- Download ONNX files: huggingface-cli download warmshao/FasterLivePortrait --local-dir ./checkpoints
- Test: python app.py --mode onnx
Windows integration package, supports one-click run

Option 1: Docker (recommended). A Docker image is provided, eliminating the need to install onnxruntime-gpu and TensorRT manually.
- Install Docker according to your system
- Download the image: docker pull shaoguo/faster_liveportrait:v1
- Run the following command, replacing $FasterLivePortrait_ROOT with the local directory where you downloaded FasterLivePortrait:
```
docker run -it --gpus=all \
--name faster_liveportrait \
-v $FasterLivePortrait_ROOT:/root/FasterLivePortrait \
--restart=always \
-p 9870:9870 \
shaoguo/faster_liveportrait:v1 \
/bin/bash
```
Option 2: Create a new Python virtual environment and install the necessary Python packages manually.
- First, install ffmpeg
- Run pip install -r requirements.txt
- Then follow the tutorials below to install onnxruntime-gpu or TensorRT. Note that this has only been tested on Linux systems.
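
Before moving on, it can help to confirm what the current interpreter can actually see. The following is a minimal sketch, not part of this repository: `check_environment` is a hypothetical helper, and it assumes the import names `onnxruntime` and `tensorrt` for the two inference backends.

```python
import importlib.util
import shutil
import sys


def check_environment(packages=("onnxruntime", "tensorrt")):
    """Report which prerequisites are visible from the current interpreter."""
    report = {
        "python": "%d.%d" % sys.version_info[:2],
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }
    for name in packages:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `False` next to `tensorrt` is expected if you only plan to use the onnxruntime path, and vice versa.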

First, download the converted ONNX model files: huggingface-cli download warmshao/FasterLivePortrait --local-dir ./checkpoints
(Ignored in Docker) If you want to use onnxruntime CPU inference, simply pip install onnxruntime. However, CPU inference is extremely slow and not recommended. The latest onnxruntime-gpu still doesn't support grid_sample on CUDA, but I found a branch that supports it. Follow these steps to install onnxruntime-gpu from source:
- git clone https://github.com/microsoft/onnxruntime
- cd onnxruntime && git checkout liqun/ImageDecoder-cuda. Thanks to liqun for the CUDA implementation of grid_sample!
- Run the following commands to compile, changing cuda_version and CMAKE_CUDA_ARCHITECTURES according to your machine:
```
./build.sh --parallel \
--build_shared_lib --use_cuda \
--cuda_version 11.8 \
--cuda_home /usr/local/cuda --cudnn_home /usr/local/cuda/ \
--config Release --build_wheel --skip_tests \
--cmake_extra_defines CMAKE_CUDA_ARCHITECTURES="60;70;75;80;86" \
--cmake_extra_defines CMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
--disable_contrib_ops \
--allow_running_as_root
```
- pip install build/Linux/Release/dist/onnxruntime_gpu-1.17.0-cp310-cp310-linux_x86_64.whl
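
If you are unsure which value to pass for CMAKE_CUDA_ARCHITECTURES, the sketch below shows one way to derive it from the GPUs in your machine. It is an illustration rather than part of the build: `to_cmake_arch` and `query_compute_caps` are hypothetical helpers, and the query assumes an NVIDIA driver recent enough to support the `compute_cap` field of nvidia-smi.

```python
import subprocess


def to_cmake_arch(compute_caps):
    """Turn compute capabilities like ["8.6", "7.5"] into "75;86"."""
    archs = {cap.strip().replace(".", "") for cap in compute_caps if cap.strip()}
    return ";".join(sorted(archs, key=int))


def query_compute_caps():
    """Ask nvidia-smi for the compute capability of each visible GPU.

    Assumes the installed driver supports the `compute_cap` query field.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


if __name__ == "__main__":
    try:
        print(to_cmake_arch(query_compute_caps()))
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query GPUs: {exc}")
```

An RTX 3090 has compute capability 8.6, which corresponds to the `86` entry in the list used above. Building only for the architectures you need generally shortens compile time.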

Test the pipeline using onnxruntime:

    python run.py \
      --src_image assets/examples/source/s10.jpg \
      --dri_video assets/examples/driving/d14.mp4 \
      --cfg configs/onnx_infer.yaml
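
If the test above runs but is very slow, onnxruntime may be running on the CPU. A minimal sketch for checking which execution providers your onnxruntime build exposes follows; `choose_providers` is a hypothetical helper for illustration and is not how this repository necessarily selects providers.

```python
def choose_providers(available):
    """Prefer CUDA when present, then fall back to CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        available = ort.get_available_providers()
        print("available:", available)
        print("would use:", choose_providers(available))
        if "CUDAExecutionProvider" not in available:
            print("CUDA provider missing: inference would run on CPU")
```

The resulting list has the shape expected by the `providers` argument of `onnxruntime.InferenceSession`.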

(Ignored in Docker) Install TensorRT and note its installation path.
(Ignored in Docker) Install the grid_sample TensorRT plugin. The model uses a grid_sample that requires 5D input, which the native grid_sample operator does not support.
- git clone https://github.com/SeanWangJS/grid-sample3d-trt-plugin
- Modify line 30 in CMakeLists.txt to: set_target_properties(${PROJECT_NAME} PROPERTIES CUDA_ARCHITECTURES "60;70;75;80;86")
- export PATH=/usr/local/cuda/bin:$PATH
- mkdir build && cd build
- cmake .. -DTensorRT_ROOT=$TENSORRT_HOME, replacing $TENSORRT_HOME with your own TensorRT root directory.
- make. Note the path of the resulting .so file, and replace /opt/grid-sample3d-trt-plugin/build/libgrid_sample_3d_plugin.so in scripts/onnx2trt.py and src/models/predictor.py with your own .so file path.
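
Before editing the two scripts, you may want to check that the compiled library can be loaded at all. The sketch below uses `ctypes.CDLL`, a common way to load TensorRT plugin libraries from Python; whether this repository loads the plugin in exactly this way is an assumption, and `load_plugin` is a hypothetical helper.

```python
import ctypes
import os


def load_plugin(so_path):
    """Load a shared library so that its plugin code can register itself.

    Raises FileNotFoundError for a missing file and OSError if the dynamic
    linker cannot load it (for example, when TensorRT libraries are not on
    the library search path).
    """
    if not os.path.isfile(so_path):
        raise FileNotFoundError(f"plugin library not found: {so_path}")
    return ctypes.CDLL(so_path)


if __name__ == "__main__":
    # Default path taken from the scripts mentioned above; replace with yours.
    path = "/opt/grid-sample3d-trt-plugin/build/libgrid_sample_3d_plugin.so"
    try:
        load_plugin(path)
        print("plugin library loaded")
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        print(f"failed to load plugin: {exc}")
```

A load failure at this stage usually points to a wrong path or to missing TensorRT or CUDA libraries, which is easier to diagnose here than in the middle of model conversion.
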
Download the ONNX model files with huggingface-cli download warmshao/FasterLivePortrait --local-dir ./checkpoints, then convert all ONNX models to TensorRT by running sh scripts/all_onnx2trt.sh

Test the pipeline using TensorRT:

    python run.py \
      --src_image assets/examples/source/s10.jpg \
      --dri_video assets/examples/driving/d14.mp4 \
      --cfg configs/trt_infer.yaml

To run in real-time using a camera:

    python run.py \
      --src_image assets/examples/source/s10.jpg \
      --dri_video 0 \
      --cfg configs/trt_infer.yaml \
      --realtime

Follow my shipinhao channel for continuous updates on my AIGC content. Feel free to message me for collaboration opportunities.