animate-anyone-reproduction

reproduction of AnimateAnyone using SVD

To Do list

  • pipeline based on SVD
  • train V0.9, which can only generate 14 frames per ref-image
  • train the AnimateAnyone-like pipeline V1, which can generate an arbitrary number of frames per ref-image
  • enhance face quality and temporal consistency (tricks based on analysing Animate Anyone app cases)
  • release V1 inference code and model

2024-02-25 update

  • The V1 checkpoint can be downloaded now.
  • We cannot release V1.1, which is the latest version, but we will release V1.1 once we have V1.2; the released version will always be one version behind the latest.
  • We also provide test cases below to reproduce the V1 results.
  • The original results have poor quality on human faces, so we use SimSwap to enhance the faces. More details can be found in the issue.
  • You should first download the SVD model and then replace its original UNet with the UNet we provide (see the sketch after this list).
  • We find that the model generalizes to some degree in appearance and temporal consistency but lacks the ability to generalize to unseen poses, so V1 performs better on UBC poses.
  • We only added 300 high-quality videos to achieve the V1.1 results; you can fine-tune on your own dataset.
  • We do not have any plans to release the training script, but svd-temporal-controlnet may work.
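
A minimal sketch of this UNet-swap step, assuming the released checkpoint follows the diffusers UNetSpatioTemporalConditionModel layout; the local path below is hypothetical, and actual inference should go through the pose-conditioned pipeline in this repo rather than the vanilla SVD pipeline:

```python
import torch
from diffusers import StableVideoDiffusionPipeline, UNetSpatioTemporalConditionModel

# Hypothetical path to the released V1 UNet; point it at wherever you
# downloaded the checkpoint.
V1_UNET_PATH = "./checkpoints/animate-anyone-v1"

# Load the released UNet (assumed to be stored in the diffusers
# UNetSpatioTemporalConditionModel format, with a "unet" subfolder).
unet = UNetSpatioTemporalConditionModel.from_pretrained(
    V1_UNET_PATH, subfolder="unet", torch_dtype=torch.float16
)

# Load the official SVD weights and override the original UNet with ours;
# the VAE, image encoder, and scheduler still come from the SVD release.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    unet=unet,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
```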

2024-02-05 update

  • Because of the issue, we decided to release the inference code in advance; it is not well organized, but it works.
  • For face post-processing, you can use any video face-swap framework (a sketch is given after this list). More details can be found in the issue.
  • Our inference code is mainly based on svd-temporal-controlnet; you can also use its training code to train your own model.
  • Our dataset is only UBC, but the model can generalize to other simple domains. We will continue collecting high-quality video data.
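
The face post-processing itself is not part of this repo; below is a minimal frame-by-frame sketch of how it could be wired up, where swap_face is a hypothetical hook for whichever face-swap framework you choose (e.g. SimSwap):

```python
import cv2

def enhance_faces(input_video, ref_face_image, output_video, swap_face):
    """Run a face swap over every frame of a generated video.

    swap_face(frame_bgr, ref_face_bgr) -> frame_bgr is a hypothetical hook:
    plug in your preferred face-swap framework here (e.g. SimSwap).
    """
    ref_face = cv2.imread(ref_face_image)
    cap = cv2.VideoCapture(input_video)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (
        int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
    )
    writer = cv2.VideoWriter(output_video, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Replace the (often low-quality) generated face with the reference face.
        writer.write(swap_face(frame, ref_face))

    cap.release()
    writer.release()
```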

2024-01-25 update

  • After analysing Animate Anyone app cases, we find there may be some tricks involved rather than additional model training, so we will update the cases with better face quality achieved training-free.
  • The face-enhancement result is shown below in the V1 part.

V1.1 animate-anyone ref-image case

2.19.mp4

V1

cross-domain case

test_12.4.mp4

with face enhance

474d4434-cf9f-40a1-a63a-8474d38bbb09.mp4

original result

test-_7_.mp4
test.9.mp4

V0.9

test-_4_.mp4
test-_2_.mp4
