Initialize a structure in operator ReduceSum #6005

xadupre · 2020-12-02T17:47:57Z

Description:

Fixes an issue raised by valgrind on the line: if (last_results.last_loop_red_size == 0 || last_results.last_loop_size == 0). last_results.last_loop_size may be uninitialized at that point, but whenever that happens last_loop_red_size is 0 and the function returns. The fix initializes the structure last_results so that valgrind no longer raises the issue.

==6974== Conditional jump or move depends on uninitialised value(s)
==6974==    at 0x117FE0B: void onnxruntime::NoTransposeReduce<float, onnxruntime::ReduceAggregatorSum<float, float> >(onnxruntime::Tensor*, onnxruntime::TensorShape const&, onnxruntime::Tensor const&, std::vector<long, std::allocator<long> > const&, onnxruntime::concurrency::ThreadPool*, onnxruntime::ResultsNoTransposePrepareForReduce&) (reduction_ops.cc:362)
==6974==    by 0x117F3B8: void onnxruntime::CommonReduce<float, onnxruntime::ReduceAggregatorSum<float, float> >(onnxruntime::OpKernelContext*, std::vector<long, std::allocator<long> >, long, onnxruntime::ResultsNoTransposePrepareForReduce&, bool) (reduction_ops.cc:502)
==6974==    by 0x117C08B: onnxruntime::ReduceSum<float>::Compute(onnxruntime::OpKernelContext*) const (reduction_ops.cc:567)
==6974==    by 0x15330E9: onnxruntime::SequentialExecutor::Execute(onnxruntime::SessionState const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)> > > > const&, onnxruntime::logging::Logger const&) (sequential_executor.cc:315)
==6974==    by 0x152984B: onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)> > > > const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool) (utils.cc:456)
==6974==    by 0x1529F49: onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool) (utils.cc:515)
==6974==    by 0xE43DF1: onnxruntime::InferenceSession::Run(OrtRunOptions const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<OrtValue, std::allocator<OrtValue> >*, std::vector<OrtDevice, std::allocator<OrtDevice> > const*) (inference_session.cc:1545)
==6974==    by 0xE44582: onnxruntime::InferenceSession::Run(OrtRunOptions const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OrtValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OrtValue> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<OrtValue, std::allocator<OrtValue> >*) (inference_session.cc:1613)
==6974==    by 0x871E4B: std::vector<OrtValue, std::allocator<OrtValue> > onnxruntime::test::OpTester::ExecuteModel<onnxruntime::InferenceSession>(onnxruntime::Model&, onnxruntime::InferenceSession&, onnxruntime::test::OpTester::ExpectResult, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, OrtRunOptions const*, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OrtValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OrtValue> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (provider_test_utils.cc:564)
==6974==    by 0x86DE71: onnxruntime::test::OpTester::Run(onnxruntime::SessionOptions, onnxruntime::test::OpTester::ExpectResult, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, OrtRunOptions const*, std::vector<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >, std::allocator<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> > > >*, onnxruntime::Graph::ResolveOptions const&) (provider_test_utils.cc:866)
==6974==    by 0x86BB2B: onnxruntime::test::OpTester::Run(onnxruntime::test::OpTester::ExpectResult, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, OrtRunOptions const*, std::vector<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> >, std::allocator<std::unique_ptr<onnxruntime::IExecutionProvider, std::default_delete<onnxruntime::IExecutionProvider> > > >*, ExecutionMode, onnxruntime::Graph::ResolveOptions const&) (provider_test_utils.cc:656)
==6974==    by 0xB5241A: onnxruntime::test::ReductionOpTest_ReduceDimWithZero_Test::TestBody()::{lambda(onnxruntime::test::OpTester&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(onnxruntime::test::OpTester&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const (reduction_ops_test.cc:1971)
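
The kind of fix described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the struct name ReduceResultsSketch, the member projected_index, and the helper NothingToReduce are assumptions made for the example. Only the field names last_loop_red_size and last_loop_size come from the description. The point is that a constructor giving both fields a defined value makes the early-exit test safe to evaluate even before the preparation step has filled the structure in.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the structure discussed above.
// Only last_loop_red_size and last_loop_size are names taken from the PR;
// everything else is illustrative.
struct ReduceResultsSketch {
  std::vector<int64_t> projected_index;  // illustrative member
  int64_t last_loop_red_size;
  int64_t last_loop_size;

  // The constructor gives every scalar member a defined value, so reading
  // either field before the preparation step runs is well defined.
  ReduceResultsSketch() : last_loop_red_size(0), last_loop_size(0) {}
};

// Mirrors the early-exit condition quoted in the description.
inline bool NothingToReduce(const ReduceResultsSketch& r) {
  return r.last_loop_red_size == 0 || r.last_loop_size == 0;
}
```

With this constructor, a freshly created structure reports that there is nothing to reduce, and no branch depends on an indeterminate value.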


faxu · 2020-12-02T20:46:07Z

/azp run Linux GPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS NoContribops CI Pipeline, Windows CPU CI Pipeline


* Fix PR #5550 reverted in #5911 (performance improvement for operator Transpose) (#5916)
  * Improves implementation of transpose operator
  * Fix issue mentioned in #5911
  * adding unit test for function DoTransposeImpl
* Make operator TreeEnsemble 5x faster for batches of size 100.000 (#5965)
  * improves processing time by 10
  * extend unit test coverage
  * better implementation for the multi regression case
  * better comment, keep parallelization by trees when not enough trees
* Initialize a structure in operator ReduceSum (#6005)
  * fix initialisation issue
* Fuse MatMulIntegerToFloat only when scales are scalar (#6008)
  MatMulIntegerToFloat fusion fuses per-row and per-column MatMulInteger, which is not supported by the MatMulIntegerToFloat kernel now. Limit the fusion to per-matrix only before we fully support per-channel.
* Disable Python 3.9 for training Python packaging build. (#6012)
  Disable Python 3.9 for training Python packaging build. Python 3.9 is not supported by the PyTorch dependency.
* Fix bugs for 1: Calibrator should check model inputs; 2: (#6017) quantize_inupts forgot to use parameter initializer_use_weight_qtyp.
* Bump highlight.js from 10.2.1 to 10.4.1 in /nodejs
  Bumps [highlight.js](https://github.com/highlightjs/highlight.js) from 10.2.1 to 10.4.1.
  - [Release notes](https://github.com/highlightjs/highlight.js/releases)
  - [Changelog](https://github.com/highlightjs/highlight.js/blob/master/CHANGES.md)
  - [Commits](highlightjs/highlight.js@10.2.1...10.4.1)

  Signed-off-by: dependabot[bot] <[email protected]>
* work around of the build break in mac (#6069)
  * Fix the build break in macos release
  * revert android change
* Bump up API version for 1.6 release (#6076)
* Update version to 1.6.0 (#6041)
  * Update version to 1.6.0
  * Add v 1.5.3 info
  * Updating WindowsAI and ONNX version

  Co-authored-by: Du Li <duli@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Revert "Fuse MatMulIntegerToFloat only when scales are scalar (#6008)"
  This reverts commit beb950e.

Co-authored-by: Xavier Dupré <[email protected]>
Co-authored-by: Yufeng Li <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Zhang Lei <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pranav Sharma <[email protected]>
Co-authored-by: Du Li <duli@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

sdpython added 2 commits December 2, 2020 15:50

fix initialisation issue

df73532

constructor

efc91e8

xadupre requested a review from a team as a code owner December 2, 2020 17:47

xadupre changed the title ~~[WIP] Initialize a structure in operator ReduceSum~~ Initialize a structure in operator ReduceSum Dec 2, 2020

snnn added the release:1.6 label Dec 2, 2020

yuslepukhin approved these changes Dec 2, 2020


sdpython added 2 commits December 3, 2020 00:41

Merge branch 'master' of https://github.com/microsoft/onnxruntime into redval

6511f7f

Merge branch 'master' of https://github.com/microsoft/onnxruntime into redval

3a61d4c

xadupre merged commit 524b9fa into microsoft:master Dec 3, 2020

faxu added the triage:approved label Dec 4, 2020

duli2012 pushed a commit that referenced this pull request Dec 8, 2020

Initialize a structure in operator ReduceSum (#6005)

3e39b38

* fix initialisation issue

duli2012 mentioned this pull request Dec 8, 2020

Second round of cherry-pick #6083

Merged

xadupre deleted the redval branch September 28, 2021 22:00
