
Second order gradient wrt inputs, expected behaviour. #14991

Closed
larroy opened this issue May 18, 2019 · 7 comments · Fixed by #14779

Comments

@larroy
Contributor

larroy commented May 18, 2019

What would be the expected behaviour of this code?

It tries to calculate the second-order gradient of a function by taking the gradient, with respect to the inputs, of the first gradient.

import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.ones((3, 3))
    y = mx.nd.ones((3, 3))
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = x + y
        # First-order gradients dz/dx and dz/dy, kept differentiable
        # via create_graph=True.
        x_grad_y_grad = mx.autograd.grad(z, [x, y], create_graph=True, retain_graph=True)
        print(x_grad_y_grad)
        first_grad = nd.concat(*[g.reshape(-1) for g in x_grad_y_grad], dim=0)
        fg_f = 2 * first_grad
        second_grad = mx.autograd.grad(fg_f, [x, y], retain_graph=True)
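
For context, my reading of the math (not MXNet output): z = x + y is linear, so dz/dx and dz/dy are constant arrays of ones, and the quantity being differentiated the second time no longer depends on the inputs:

# z = x + y          =>  dz/dx = dz/dy = ones((3, 3))
# f = 2 * dz/dx      =>  f is constant in x and y
# df/dx = df/dy = 0  =>  the second gradient is mathematically zero, so the
#                        framework may also refuse, since no input requires gradients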
@vdantu
Contributor

vdantu commented May 19, 2019

@mxnet-label-bot add [question]

@apeforest
Contributor

apeforest commented May 20, 2019

Calling autograd.grad on a first-order gradient ndarray does not seem to work this way; the API design could have been better documented.
The following block works:

import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.array([1, 2, 3])
    y = mx.nd.array([2, 3, 4])
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = nd.elemwise_add(x, y)
        # First-order gradient dz/dx, kept differentiable via create_graph=True.
        first_grad = mx.autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
        print(first_grad)
        fg_f = 2 * first_grad
    fg_f.backward()
    print(x.grad)
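
For a case where the second-order gradient is actually nonzero, a minimal variant of the same pattern (a sketch assuming the same autograd semantics; the test name and values here are made up): with z = x * x the first gradient is 2x, so after backward on 2 * first_grad, x.grad should be 4 everywhere.

import mxnet as mx
from mxnet import nd

def test_ag_grad_square():
    x = mx.nd.array([1, 2, 3])
    x.attach_grad()
    with mx.autograd.record():
        z = nd.elemwise_mul(x, x)  # z = x^2
        # dz/dx = 2x, kept differentiable via create_graph=True.
        first_grad = mx.autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
        fg_f = 2 * first_grad      # f = 4x
    fg_f.backward()
    print(x.grad)                  # expected: [4. 4. 4.], since df/dx = 4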

@larroy
Contributor Author

larroy commented May 20, 2019

Which branch are you using? I'm getting the following when running your example:

  File "/home/ANT.AMAZON.COM/pllarroy/devel/mxnet/python/mxnet/ndarray/ndarray.py", line 2216, in backward
    ctypes.c_void_p(0)))
  File "/home/ANT.AMAZON.COM/pllarroy/devel/mxnet/python/mxnet/base.py", line 254, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:37:55] /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/src/imperative/imperative.cc:357: Check failed: var_nodes.variable_nodes.size() > 0 (0 vs. 0) : There are no inputs in computation graph that require gradients.
Stack trace:
  [bt] (0) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4a) [0x7f6590fc7b5e]
  [bt] (1) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(mxnet::Imperative::CreateGradientVariableNodes(std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<nnvm::NodeEntry, std::allocator<nnvm::NodeEntry> > const&)+0x535) [0x7f65911c63e1]
  [bt] (2) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(mxnet::Imperative::Backward(std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, bool, bool, bool)+0x2ff) [0x7f65911c6ae1]
  [bt] (3) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/cmake-build-debug/libmxnet.so(MXAutogradBackwardEx+0x249) [0x7f6591034c12]
  [bt] (4) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f65aac85dae]
  [bt] (5) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x22f) [0x7f65aac8571f]
  [bt] (6) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2b4) [0x7f65aae99524]
  [bt] (7) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x11b93) [0x7f65aae99b93]
  [bt] (8) /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/py3_venv/bin/python(_PyObject_FastCallKeywords+0x19c) [0x5a730c]


-------------------- >> begin captured logging << --------------------
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1526948194 to reproduce.
--------------------- >> end captured logging << ---------------------


@larroy
Contributor Author

larroy commented May 20, 2019

I merged your branch and it works, thanks.

@larroy
Contributor Author

larroy commented May 21, 2019

I get the warning: "[16:36:03] /home/ANT.AMAZON.COM/pllarroy/devel/mxnet/src/imperative/imperative.cc:362: There are no inputs in computation graph that require gradients."

@larroy
Contributor Author

larroy commented May 21, 2019

This example works for me. I'm able to call grad twice, and the second gradient has the correct value: the second function is 3 * 2 * x, so the gradient is 6 for all elements.

When using ag.grad, the gradient is not stored in x, though. One key issue here is that create_graph in the second call to grad has to be set to False; otherwise we are re-creating the graph and hitting the problem I had before, where the graph contains only backward nodes and NOT the original nodes.


import mxnet as mx
from mxnet import nd

def test_ag_grad():
    x = mx.nd.array([[2, 2, 2], [2, 2, 2], [2, 2, 4]])
    y = mx.nd.array([[2, 2, 2], [3, 3, 3], [4, 4, 4]])
    x.attach_grad()
    y.attach_grad()
    with mx.autograd.record():
        z = nd.elemwise_add(nd.elemwise_mul(x, x), y)  # z = x^2 + y
        # dz/dx = 2x, kept differentiable via create_graph=True.
        x_grad_y_grad = mx.autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
        print("dz/dx |x")
        print(type(x_grad_y_grad))
        print(x_grad_y_grad)
        fg_f = 3 * x_grad_y_grad                       # f = 3 * 2x = 6x
        # create_graph must be False here, see above.
        second_grad = mx.autograd.grad(fg_f, [x], create_graph=False, retain_graph=True)
        print("second grad")
        print(second_grad)
        print("x.grad")
        print(x.grad)
    # fg_f.backward()
    print("x.grad")
    print(x.grad)

test_autograd.test_ag_grad ... variables 
[[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 4.]]
<NDArray 3x3 @cpu(0)>
var_handles: <mxnet.base.c_void_p_Array_1 object at 0x7f0b9e536a60>
dz/dx |x: 

[[4. 4. 4.]
 [4. 4. 4.]
 [4. 4. 8.]]
<NDArray 3x3 @cpu(0)>
variables 
[[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 4.]]
<NDArray 3x3 @cpu(0)>
var_handles: <mxnet.base.c_void_p_Array_1 object at 0x7f0b9e536b70>
second grad: 

[[6. 6. 6.]
 [6. 6. 6.]
 [6. 6. 6.]]
<NDArray 3x3 @cpu(0)>
x.grad: 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
<NDArray 3x3 @cpu(0)>
x.grad: 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
<NDArray 3x3 @cpu(0)>
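
As a sanity check (a sketch against the variables in the example above, using numpy for the comparison), the expected values can be asserted instead of eyeballed:

import numpy as np

# Inside test_ag_grad, after the two grad calls:
# dz/dx = 2x elementwise, and d(6x)/dx = 6 for every element.
np.testing.assert_allclose(x_grad_y_grad.asnumpy(), 2 * x.asnumpy())
np.testing.assert_allclose(second_grad[0].asnumpy(), 6 * np.ones((3, 3)))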
