Added GRU to achieve video consistency #14

Jerry-Master · 2023-08-30T14:46:11Z

First of all, your work is amazing!! I just want to make it clear that I absolutely love this result, together with Matte Anything. However, the main problem for real-world applications of this type of model is temporal inconsistency. Since you apply the model frame by frame, it is impossible to achieve temporal consistency for videos. This pull request is an attempt to include all the features that made RobustVideoMatting temporally consistent, so that you can easily retrain and see whether it solves the temporal inconsistency problem.

The main change is the addition of convolutional GRUs (ConvGRUs) in the detail capture module. To make it possible to reuse already trained models, I added the ConvGRU layers in a way similar to ControlNet: they are initialized at zero and attached through residual connections. This way, a hidden state can be shared across frames, which lets the model achieve temporal consistency. That alone is not enough, however, so I have also added another loss function that explicitly guides the model toward temporal consistency.
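To illustrate the two ideas in the paragraph above, here is a minimal PyTorch sketch. It is not the code from this pull request: the names `ResidualConvGRU`, `out_proj`, and `temporal_coherence_loss` are hypothetical, the zero-initialized 1x1 projection is only one possible way to realize "initializing at zero", and the loss is a plain frame-difference MSE in the spirit of a temporal coherence term, with no weighting factor assumed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualConvGRU(nn.Module):
    """Hypothetical sketch: a ConvGRU on a residual branch whose output
    projection starts at zero, so the wrapped features pass through unchanged
    until training moves the projection away from zero."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        self.channels = channels
        # Reset and update gates, computed jointly from input and hidden state.
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=padding)
        # Candidate hidden state.
        self.candidate = nn.Conv2d(2 * channels, channels, kernel_size, padding=padding)
        # Zero-initialized 1x1 projection (assumption: ControlNet-style "zero conv").
        self.out_proj = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.out_proj.weight)
        nn.init.zeros_(self.out_proj.bias)

    def step(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Standard GRU update applied with convolutions on one frame.
        r, z = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).split(self.channels, dim=1)
        c = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * c

    def forward(self, x: torch.Tensor, h: torch.Tensor = None):
        # x: (B, T, C, H, W); h: (B, C, H, W) hidden state carried across calls.
        if h is None:
            h = torch.zeros_like(x[:, 0])
        outputs = []
        for x_t in x.unbind(dim=1):
            h = self.step(x_t, h)
            outputs.append(x_t + self.out_proj(h))  # residual connection
        return torch.stack(outputs, dim=1), h


def temporal_coherence_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Penalize mismatch between frame-to-frame changes of prediction and target.

    Both tensors are (B, T, 1, H, W) alpha mattes with T >= 2.
    """
    pred_delta = pred[:, 1:] - pred[:, :-1]
    target_delta = target[:, 1:] - target[:, :-1]
    return F.mse_loss(pred_delta, target_delta)
```

With this kind of initialization the layer is an identity mapping at the start of training, which is what allows pretrained image weights to be loaded without changing their output. The returned hidden state `h` can be passed back in on the next clip so that state is shared across frames.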

All the code is more or less recycled from the RobustVideoMatting repository. To avoid breaking anything, I duplicated the affected files and added a '_video' suffix. The code is meant to be backward compatible, except that it works with 5D tensors instead of 4D. I tried to integrate it as much as possible so that you can quickly try this idea. However, I am aware that the difficult part, managing the data, is not included in this pull request. You would need to download the RobustVideoMatting dataset and train on it.
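For readers unfamiliar with the 4D versus 5D distinction: image models take tensors shaped (B, C, H, W), while video models add a time axis, giving (B, T, C, H, W). The sketch below shows one common way to reuse a per-frame layer on video input by folding time into the batch axis. The helper name `apply_per_frame` is hypothetical and is not taken from this pull request.

```python
import torch
import torch.nn as nn


def apply_per_frame(module: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run a module that expects (N, C, H, W) on either image or video input.

    4D input (B, C, H, W) is passed through as is.
    5D input (B, T, C, H, W) is flattened to (B*T, C, H, W), processed,
    and reshaped back to (B, T, ...).
    """
    if x.ndim == 4:
        return module(x)
    if x.ndim != 5:
        raise ValueError(f"expected a 4D or 5D tensor, got {x.ndim}D")
    b, t = x.shape[:2]
    y = module(x.flatten(0, 1))   # merge batch and time
    return y.unflatten(0, (b, t))  # split them again
```

A single image can also be fed to a 5D-only model by adding a time axis of length one with `image.unsqueeze(1)`.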

I will be more than glad to help with any questions or to contribute further if you give me some direction on the hardware or environment you use. I really want this model to have temporal consistency so that it can be used in real-world applications.

skyler14 · 2023-10-04T20:51:45Z

I was wondering whether you went ahead and trained a model with this, or whether anything further has happened?

Added GRU to achieve video consistency

7a45aa6
