Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix memory leak reported by ASAN in NNVM to ONNX conversion #15516

Merged
merged 4 commits into from
Nov 26, 2019

Conversation

kevinzh92
Copy link
Contributor

Description

Fix memory leak reported by ASAN in NNVM to ONNX conversion of a constant.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http:https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Fixed a memory leak in NNVM to ONNX constants conversion code. The shared_ptr, as written, does not call delete[] on a dynamically allocated array.

Comments

Copy link
Member

@wkcn wkcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for the fix!

@wkcn wkcn added the pr-awaiting-review PR is waiting for code review label Jul 11, 2019
Copy link
Contributor

@larroy larroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR @kevinzh92 I agree with the fix. May I suggest to change this to a vector of float and rename the variable to a more semantic name?

@ChaiBapchya
Copy link
Contributor

@larroy isn't it consistent with the existing naming system?
That is, to call the instance of that class as the corresponding data pointer.

@larroy
Copy link
Contributor

larroy commented Jul 12, 2019

Not really, in those cases you want a semantic name, even if you are wrapping your Cat in a pointer you want to name it:

unique_ptr<Cat> cat = make_unique<Cat>()

And not

unique_ptr<Cat> unique_ptr = make_unique<Cat>()

Semantic naming is important, we have already type information from the compiler and we are in a statically typed language, no need to name a pointer _ptr. And definitely not "data" for the name of a variable, this is a code smell.

@ChaiBapchya
Copy link
Contributor

Gotcha! Makes sense!

@karan6181
Copy link
Contributor

@kevinzh92 Could you please address the review comments?

@kevinzh92 kevinzh92 closed this Jul 23, 2019
@kevinzh92 kevinzh92 deleted the asanNnvmOnnxFix branch July 23, 2019 15:26
@kevinzh92 kevinzh92 restored the asanNnvmOnnxFix branch July 23, 2019 15:28
@kevinzh92 kevinzh92 reopened this Jul 23, 2019
@kevinzh92
Copy link
Contributor Author

Sorry guys. Things have been busy at work and this fell off my radar. @larroy, I hope I addressed all of your concerns.

@kevinzh92 kevinzh92 force-pushed the asanNnvmOnnxFix branch 2 times, most recently from c419ce4 to 856fe16 Compare July 23, 2019 17:43
initializer_proto->add_float_data(data_ptr[blob_idx]);
}
std::vector<float> constants(size);
nd.SyncCopyToCPU(&constants, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙅‍♂️this is not correct. You are taking address to the vector itself, you want to take address to the data.

Please use .data() member or the address of the first element.

@KellenSunderland
Copy link
Contributor

Looks like there's still a compiler issue:

/work/mxnet/src/operator/subgraph/tensorrt/nnvm_to_onnx.cc:567:54: error: reinterpret_cast from type 'const short unsigned int*' to type 'int32_t* {aka int*}' casts away qualifiers

                   reinterpret_cast<int32_t&>(constant));

@larroy
Copy link
Contributor

larroy commented Aug 7, 2019

I also have concerns about this line:

                  reinterpret_cast<int32_t&>(constant));

@wkcn
Copy link
Member

wkcn commented Nov 26, 2019

Hi, I update this code to use T[] as the template type of shared_ptr.

Thank @kevinzh92 for pointing out this issue.

@larroy
Copy link
Contributor

larroy commented Nov 26, 2019

Please rebase

Copy link
Contributor

@larroy larroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wkcn wkcn merged commit d2d4876 into apache:master Nov 26, 2019
@wkcn
Copy link
Member

wkcn commented Nov 26, 2019

Merged.
It is a tiny change to modify the template type, and keep the original variable name.
We can modify the semantic name in another PR : )
std::vector is not suitable in this PR, since it will allocate extra capacity.

Thank you!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants