
Runs out of memory and crashes when encoding an image sequence. #6

Open
sam598 opened this issue Jun 15, 2019 · 8 comments
Labels
bug Something isn't working

Comments

sam598 commented Jun 15, 2019

When encoding a large number of images, the encoding time slowly increases until it reaches 2x-3x the time it took to encode the first image, and then the script encode_images.py crashes. On my system it always crashes on the 56th image.

The culprit appears to be these lines in perceptual_model.py

    self.sess.run(tf.assign(self.features_weight, weight_mask))
    self.sess.run(tf.assign(self.ref_img_features, image_features))
    self.sess.run(tf.assign(self.ref_weight, image_mask))
    self.sess.run(tf.assign(self.ref_img, loaded_image))
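
The slowdown is consistent with new tf.assign ops being added to the graph for every image; building a placeholder-fed assign op once and reusing it avoids that. A minimal sketch of the pattern, shown for one of the variables above in the same self-attribute style (the placeholder and op names are hypothetical):

    import tensorflow as tf

    # Build the assign op once, at graph-construction time (hypothetical names).
    self.features_weight_ph = tf.placeholder(tf.float32, self.features_weight.shape)
    self.assign_features_weight = tf.assign(self.features_weight, self.features_weight_ph)

    # Per image, reuse the same op and just feed new data -- no new graph nodes.
    self.sess.run(self.assign_features_weight,
                  feed_dict={self.features_weight_ph: weight_mask})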

I posted a pull request on Puzer's original stylegan-encoder (Puzer#4), but I'm not familiar enough with your changes to know how to fix it here. There is more information in Puzer#3.

The changes you have made and collected are a fantastic step forward and actually make frame-to-frame StyleGAN animations possible. A fix for this bug would go a long way toward helping encode image sequences.

pbaylies (Owner) commented Jun 15, 2019

Thanks @sam598 -- interesting use case, I've never tried to do that many images at once before! Have you tried linearly interpolating between some of your images to cut down on the number of frames, or encoding a keyframe first and then copying that dlatent to use as an initial value for the rest of the frames?

Oh, also, the above commit should fix your issue!
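
For the interpolation idea, working directly on the saved dlatents might look something like this sketch (the file names, keyframe spacing, and output paths are assumptions):

    import numpy as np

    # Load dlatents for two encoded keyframes (shape (18, 512) for a 1024x1024 model).
    key_a = np.load('latent_representations/frame_0000.npy')
    key_b = np.load('latent_representations/frame_0010.npy')

    # Linearly interpolate the in-between frames instead of encoding each one.
    n_between = 9
    for i in range(1, n_between + 1):
        t = i / (n_between + 1)
        dlatent = (1.0 - t) * key_a + t * key_b
        np.save('latent_representations/frame_%04d_interp.npy' % i, dlatent)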

sam598 (Author) commented Jul 8, 2019

Thanks for adding that so fast @pbaylies!

It seems that there is still a memory leak somewhere: 100 iterations go from 24s to 90s, and it crashes around the 70th image. Is there anything else that could use placeholders, or anything else that could leak? Performance is definitely better, though.

I have tried several experiments with encoding initial keyframes, as well as keeping the previous frame as the initial value for the next one. Something very interesting happens where facial features begin to be "burned in". Even if the first frame gets 500 iterations and every subsequent frame only gets 50, the head pose and facial structure begin to get stuck, and higher-level features like reflections and hair begin to "seep" down to lower layers and affect the structure.

The best results I have gotten so far have been from encoding each frame from scratch, and then temporally smoothing them. I really need to do a writeup of these tests.
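
As a sketch of that temporal smoothing step (the window size and file layout here are assumptions), a centered moving average over the independently encoded per-frame dlatents already goes a long way:

    import glob
    import numpy as np

    # Per-frame dlatents, each encoded from scratch (assumed file layout).
    paths = sorted(glob.glob('latent_representations/frame_*.npy'))
    dlatents = np.stack([np.load(p) for p in paths])  # (n_frames, 18, 512)

    # Centered moving average to reduce frame-to-frame jitter.
    window = 5
    half = window // 2
    smoothed = np.empty_like(dlatents)
    for i in range(len(dlatents)):
        start, stop = max(0, i - half), min(len(dlatents), i + half + 1)
        smoothed[i] = dlatents[start:stop].mean(axis=0)

    for p, d in zip(paths, smoothed):
        np.save(p.replace('.npy', '_smoothed.npy'), d)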

pbaylies (Owner) commented Jul 8, 2019

I've added enough features that I'm sure there are some leaks; the whole thing is due for a more careful rewrite at this point. There are also a bunch of parameters you can tweak, and the different parts of the loss function can be tuned or turned off. I will see what I can do, and patches are welcome, of course!

@SystemErrorWang

The same problem happens in my case when trying to encode a large number of images at 1024x1024 resolution. It even runs out of memory on a 16 GB Tesla V100 GPU with batch size 1. I tried this (Puzer#4) but the problem remains.

pbaylies (Owner) commented Sep 1, 2019

@SystemErrorWang yes, this still needs improvement. For now, feel free to use larger batch sizes if you like, but see if you can work around this by running the tool itself on smaller batches of images at a time.
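
One way to script that workaround is a small driver that copies the source images into chunks and runs encode_images.py once per chunk, so each run starts with a fresh TensorFlow graph (a sketch; the chunk size, directory names, and the usual src/generated/dlatent argument order are assumptions):

    import os
    import shutil
    import subprocess

    SRC = 'raw_images'   # all aligned source images (assumed layout)
    CHUNK = 20           # images per encode_images.py run

    images = sorted(os.listdir(SRC))
    for start in range(0, len(images), CHUNK):
        chunk_dir = 'chunk_tmp'
        shutil.rmtree(chunk_dir, ignore_errors=True)
        os.makedirs(chunk_dir)
        for name in images[start:start + CHUNK]:
            shutil.copy(os.path.join(SRC, name), chunk_dir)
        # Each subprocess gets a fresh TF graph, so leaks cannot accumulate across chunks.
        subprocess.run(['python', 'encode_images.py', chunk_dir,
                        'generated_images/', 'latent_representations/'], check=True)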

@pbaylies pbaylies reopened this Sep 1, 2019
@pbaylies pbaylies removed their assignment Sep 1, 2019
@pbaylies pbaylies added the bug Something isn't working label Oct 11, 2019
minha12 commented Mar 4, 2020

Hello, is there any way to fix this bug?

@ChengBinJin

@minha12 @pbaylies
The original optimize function in perceptual_model.py should be split into two steps, init_optimizer and run_optimizer, so that only one optimizer graph is initialized for all images.

    def init_optimizer(self, vars_to_optimize, iterations=200):
        # Build the optimizer graph once; it is reused for every image.
        self.vars_to_optimize = vars_to_optimize if isinstance(vars_to_optimize, list) else [vars_to_optimize]

        if self.use_optimizer == 'lbfgs':
            self.optimizer = tf.contrib.opt.ScipyOptimizerInterface(
                self.loss, var_list=self.vars_to_optimize, method='L-BFGS-B', options={'maxiter': iterations})
        else:
            if self.use_optimizer == 'ggt':
                self.optimizer = tf.contrib.opt.GGTOptimizer(learning_rate=self.learning_rate)
            else:
                self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            min_op = self.optimizer.minimize(self.loss, var_list=self.vars_to_optimize)
            self.sess.run(tf.variables_initializer(self.optimizer.variables()))
            self.fetch_ops = [min_op, self.loss, self.learning_rate]

        self.sess.run(self._reset_global_step)

    def run_optimizer(self, iterations=200):
        # Only run the ops built in init_optimizer; nothing new is added to the graph here.
        for _ in range(iterations):
            if self.use_optimizer == 'lbfgs':
                self.optimizer.minimize(self.sess, fetches=[self.vars_to_optimize, self.loss])
                yield {"loss": self.loss.eval()}
            else:
                _, loss, lr = self.sess.run(self.fetch_ops)
                yield {"loss": loss, "lr": lr}

@Akila-Ayanthi

Hello, is there a way to fix this bug?
