DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video (AAAI2023)

A well-documented version of the DINet model for inference on custom videos. This repo provides a step-by-step guide to run the DINet model either on a local machine or on Google Colab.

Environment Setup

  1. Clone this repository:
cd DINet
  2. Create a conda environment with Python 3.7 and install all dependencies:
conda create -n dinet python==3.7 -y
conda activate dinet
pip install -r requirements.txt

If running in Google Colab, you will first need to install Miniconda and then install the packages in the requirements.txt file. The steps are as follows:

!chmod +x
!bash ./ -b -f -p /usr/local
import sys
!conda install python=3.7 -y
!python --version
!conda create --name dinet python=3.7 -y
!source activate dinet
!pip install -r requirements.txt
  3. Download the model and sample data from Google Drive, unzip the archive, and put it in the root directory:
gdown --id 1CkeEn7l3PuubuJIMWNjWpIrt0HDd_AB3


  • To run inference on the example videos, use the following command:
python --mouth_region_size 256 --source_video_path test.mp4 --source_openface_landmark_path test.csv --driving_audio_path test.wav --res_video_dir test_output/ --pretrained_clip_DINet_path ./asserts/clip_training_DINet_256mouth.pth
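
Before starting a long inference run, it can help to confirm that every file named in the command exists. The following is a minimal sketch and not part of the DINet code: `missing_inputs` is a hypothetical helper, and the paths simply mirror the example command above.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the entries of `paths` (label -> file path) that do not exist on disk."""
    return {label: p for label, p in paths.items() if not Path(p).exists()}


if __name__ == "__main__":
    # Paths copied from the example command; adjust them for your own video.
    inputs = {
        "source_video_path": "test.mp4",
        "source_openface_landmark_path": "test.csv",
        "driving_audio_path": "test.wav",
        "pretrained_clip_DINet_path": "./asserts/clip_training_DINet_256mouth.pth",
    }
    for label, path in missing_inputs(inputs).items():
        print(f"missing --{label}: {path}")
```

If the script prints nothing, all four inputs were found.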

Additional Setup Notes


The .csv file passed to --source_openface_landmark_path must contain the facial landmarks detected by OpenFace. Installing OpenFace from scratch can be difficult because of dependency issues, so Docker is the quicker route. To run OpenFace with Docker:

  1. Run the OpenFace Docker container:

    docker run -it --rm algebr/openface:latest
  2. Find the container ID by running this (in a different terminal):

    docker ps

    (Let's say it shows a52fea727822)

  3. Transfer any video you want to run OpenFace on:

    docker cp test.mp4 a52fea727822:/home/openface-build
  4. In the first shell (where the container is running), run OpenFace on the video:

    build/bin/FaceLandmarkVidMulti -f test.mp4 -2Dfp
  5. The output will be saved as test.csv in the processed directory. Transfer it back to your local machine (in a different terminal):

    docker cp a52fea727822:/home/openface-build/processed/test.csv .
  • Note that we use FaceLandmarkVidMulti for videos here rather than FaceLandmarkVid (for images, use FaceLandmarkImg). FaceLandmarkVid requires a display, which is not available in a Docker container by default. If you need FaceLandmarkVid, you can set up X11 forwarding on your local machine to provide the display (Ref. this issue).
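
Once test.csv is back on your machine, a quick check of its columns can catch a bad OpenFace run early. This is a rough sketch rather than anything from the DINet code: `check_landmark_csv` is a hypothetical helper, and it assumes the usual OpenFace 2D landmark columns x_0 to x_67 and y_0 to y_67.

```python
import csv


def check_landmark_csv(csv_path, num_points=68):
    """Return (row_count, missing_columns) for an OpenFace landmark CSV.

    Assumes the 2D landmark columns are named x_0..x_67 and y_0..y_67.
    """
    expected = [f"{axis}_{i}" for axis in ("x", "y") for i in range(num_points)]
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        # Header names may carry leading spaces, so strip them before comparing.
        header = [name.strip() for name in next(reader, [])]
        # Count the non-empty data rows that follow the header.
        row_count = sum(1 for row in reader if row)
    missing = [name for name in expected if name not in header]
    return row_count, missing
```

For example, `check_landmark_csv("test.csv")` returns the number of data rows and a list of missing landmark columns; an empty list means all 136 expected columns were found.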


Ensure that FFMPEG is installed on your system to enable audio and video merging functionality in the DINet model.

To install FFMPEG, if you have root access to your system, run the following command:

sudo apt-get install ffmpeg

If you don't have root access, follow the instructions below to place a static FFMPEG build in the root directory of the DINet repository:

  1. Download the correct static build for your system (check the architecture with uname -m):

    # For i686:
    wget -O ffmpeg.tar.xz
    # For x86_64 or amd64:
    wget -O ffmpeg.tar.xz
    # For arm64 or aarch64:
    wget -O ffmpeg.tar.xz
  2. Unzip and Unpack the Build:

    tar xvf ffmpeg.tar.xz
    rm ffmpeg.tar.xz
  3. Go to the unzipped directory: (Naming convention is ffmpeg-git-[YYYYMMDD]-[platform]-static)

    cd ffmpeg-git-*-static
  4. Check the Installation:

    ./ffmpeg -version
  5. Move the binary to the DINet root directory (this assumes the build was unpacked next to the DINet folder):

    cd ..
    mv ffmpeg-git-*-static/ffmpeg DINet/
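
The two install routes above leave ffmpeg either on your PATH or as a binary inside the DINet folder. The sketch below checks both places; `find_ffmpeg` is a hypothetical helper written for this guide, not something the DINet code provides, and it makes no claim about how DINet itself locates ffmpeg.

```python
import shutil
from pathlib import Path


def find_ffmpeg(repo_root="."):
    """Return a path to an ffmpeg binary, or None if none is found.

    Looks on PATH first (system install), then for a static binary
    placed directly in `repo_root`.
    """
    on_path = shutil.which("ffmpeg")
    if on_path:
        return on_path
    local = Path(repo_root) / "ffmpeg"
    if local.is_file():
        return str(local.resolve())
    return None


if __name__ == "__main__":
    print(find_ffmpeg() or "ffmpeg not found; see the install steps above")
```

Run it from the DINet root directory so that the default `repo_root` points at the folder holding the static binary.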


This code is taken from the original repository of DINet