add efficient upsample layer #6384 (Open)

twmht wants to merge 1 commit into master from upsample
Conversation

@twmht (Contributor) commented May 6, 2018

Add an upsample layer that is faster than the current deconvolution-based implementation. On my GTX 970, it is almost 6x faster than the deconvolution approach.

For example, to upsample the input 2x the deconvolution way, we have to define the following:

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 384
      dim: 224
      dim: 224
    }
  }
}
layer {
  name: "upsample"
  type: "Deconvolution"
  bottom: "data"
  top: "upsample"
  param {
    lr_mult: 0.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 384
    bias_term: false
    pad: 1
    kernel_size: 4
    group: 384
    stride: 2
    weight_filler {
      type: "bilinear"
    }
  }
}

Then measure the forward time with the built-in caffe timing tool:

./build/tools/caffe time --model test.prototxt --gpu 0 --iterations 1000

This reports the forward time of the deconvolution layer:

I0506 21:00:06.151249 11728 caffe.cpp:400]   upsample   forward: 87.2994 ms.

It seems to me that it is worth implementing an upsample layer that is much faster than deconvolution.

Here is a use case; it is also worth noting that the parameters are much simpler:

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 384
      dim: 224
      dim: 224
    }
  }
}
layer {
  name: "upsample"
  type: "Upsample"
  bottom: "data"
  top: "upsample"
  upsample_param {
    scale: 2
  }
}

And this is really fast:

I0506 20:57:59.810890 11527 caffe.cpp:400]   upsample   forward: 15.0124 ms.
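For reference, that is 87.2994 ms / 15.0124 ms ≈ 5.8x, consistent with the "almost 6x" figure above.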

The other difference is that the Upsample layer implements nearest-neighbor interpolation, while the Deconvolution layer implements bilinear interpolation. However, in my experiments with FPN (https://arxiv.org/abs/1612.03144), the choice of algorithm does not affect accuracy.
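To make the speed difference concrete, here is a minimal CUDA sketch of a nearest-neighbor upsample forward pass (my own illustration, not the kernel from this PR), assuming NCHW float data and an integer scale factor:

// Sketch only: one global-memory read and one write per output element,
// with no filter arithmetic, unlike a 4x4 grouped deconvolution.
// count = num * channels * out_h * out_w.
__global__ void upsample_nearest_forward(const int count, const float* in,
    float* out, const int out_h, const int out_w, const int scale) {
  const int in_h = out_h / scale;
  const int in_w = out_w / scale;
  for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < count;
       index += blockDim.x * gridDim.x) {
    const int w = index % out_w;
    const int h = (index / out_w) % out_h;
    const int nc = index / (out_w * out_h);  // flattened (batch, channel)
    out[index] = in[(nc * in_h + h / scale) * in_w + w / scale];
  }
}

Launched with one thread per output element (for example via Caffe's CAFFE_GET_BLOCKS and CAFFE_CUDA_NUM_THREADS helpers), a pure copy kernel like this is consistent with the large gap between the two timings above.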

@twmht force-pushed the upsample branch 3 times, most recently from 7334c28 to 4d2400e on May 7, 2018
@eliseyang commented:

I found the same problem when comparing computation time between Keras and Caffe: Keras's UpSampling2D is much faster than Caffe's upsampling via deconvolution. In Caffe, upsampling by deconvolution sets the group parameter (to the channel count, not 1), and the sequential computation over the groups costs extra time. It is worth adding an upsample layer for models that upsample images.

@aPonza commented Feb 21, 2019

@twmht: I tried adding the layer to my clCaffe installation and am facing a compilation error, seemingly because I don't have CUDA (or even a CUDA-capable GPU) installed (see here for more details on the errors).

Looking at other layers' source code, I see they have pervasive guards in their *.cu files with #ifdef USE_CUDA (and #ifdef USE_GREENTEA, though I'm not sure whether that matters), so I'm left wondering if that might be the only reason I'm facing the issue. At the same time, I don't know where I should put these guards, so I'm somewhat stumped. Would you say that could be the problem?

EDIT:
I read a bit more and, comparing with other layers, I figure this patch would get me where I need to be: upsample_layer_patch.txt. It does compile, but it's missing the greentea alternative code, and I still get a linker error afterwards, so something is surely still off somewhere. I'll update the issue with details on that problem.

@twmht (Contributor, Author) commented Nov 18, 2019

@sergiev

I am going to update the branch to resolve the conflict.

Thank you.

@twmht (Contributor, Author) commented Nov 18, 2019

@sergiev

Now it can be merged without conflicts.

Thank you.

@gasgallo commented:

I tested your code as well; in my case I was looking for an upsampling op that implements the nearest-neighbor algorithm instead of the bilinear one, because I was getting different results.

Your code works like a charm!
