Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8333098: ubsan: bytecodeInfo.cpp:318:59: runtime error: division by zero #20615

Closed
wants to merge 3 commits into from

Conversation

MBaesken
Copy link
Member

@MBaesken MBaesken commented Aug 16, 2024

When running test
compiler/classUnloading/methodUnloading/TestOverloadCompileQueues.java
with ubsan enabled binaries we run into the issue reported below.
Reason seems to be that we divide by zero in the code in some special cases (we should instead check for CompilationPolicy::min_invocations() == 0 and handle it separately).

/jdk/src/hotspot/share/opto/bytecodeInfo.cpp:318:59: runtime error: division by zero
#0 0x7f5145c0dda2 in InlineTree::should_not_inline(ciMethod*, ciMethod*, int, bool&, ciCallProfile&) src/hotspot/share/opto/bytecodeInfo.cpp:318
#1 0x7f51466366d7 in InlineTree::try_to_inline(ciMethod*, ciMethod*, int, JVMState*, ciCallProfile&, bool&) src/hotspot/share/opto/bytecodeInfo.cpp:382
#2 0x7f514663d36b in InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, bool&) src/hotspot/share/opto/bytecodeInfo.cpp:596
#3 0x7f51470dffd6 in Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool) src/hotspot/share/opto/doCall.cpp:189
#4 0x7f51470e18ab in Parse::do_call() src/hotspot/share/opto/doCall.cpp:641
#5 0x7f514887dbf1 in Parse::do_one_block() src/hotspot/share/opto/parse1.cpp:1607
#6 0x7f514887fefa in Parse::do_all_blocks() src/hotspot/share/opto/parse1.cpp:724
#7 0x7f514888d4da in Parse::Parse(JVMState*, ciMethod*, float) src/hotspot/share/opto/parse1.cpp:628
#8 0x7f51469d8418 in ParseGenerator::generate(JVMState*) src/hotspot/share/opto/callGenerator.cpp:99
#9 0x7f5146d99cff in Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*) src/hotspot/share/opto/compile.cpp:793
#10 0x7f51469d5ebf in C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*) src/hotspot/share/opto/c2compiler.cpp:142
#11 0x7f5146db0274 in CompileBroker::invoke_compiler_on_method(CompileTask*) src/hotspot/share/compiler/compileBroker.cpp:2303
#12 0x7f5146db2826 in CompileBroker::compiler_thread_loop() src/hotspot/share/compiler/compileBroker.cpp:1961
#13 0x7f51478d475a in JavaThread::thread_main_inner() src/hotspot/share/runtime/javaThread.cpp:759
#14 0x7f51491620ea in Thread::call_run() src/hotspot/share/runtime/thread.cpp:225
#15 0x7f51487ac201 in thread_native_entry src/hotspot/os/linux/os_linux.cpp:846
#16 0x7f514e5cf6e9 in start_thread (/lib64/libpthread.so.0+0xa6e9) (BuildId: 2f8d3c2d0f4d7888c2598d2ff6356537f5708a73)
#17 0x7f514db1550e in clone (/lib64/libc.so.6+0x11850e) (BuildId: f732026552f6adff988b338e92d466bc81a01c37)


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8333098: ubsan: bytecodeInfo.cpp:318:59: runtime error: division by zero (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20615/head:pull/20615
$ git checkout pull/20615

Update a local copy of the PR:
$ git checkout pull/20615
$ git pull https://git.openjdk.org/jdk.git pull/20615/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20615

View PR using the GUI difftool:
$ git pr show -t 20615

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20615.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 16, 2024

👋 Welcome back mbaesken! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Aug 16, 2024

@MBaesken This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8333098: ubsan: bytecodeInfo.cpp:318:59: runtime error: division by zero

Reviewed-by: kvn, iveresov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 113 new commits pushed to the master branch:

  • 72a4900: 8338888: SystemDictionary::class_name_symbol has incorrect length check
  • a8ac287: 8339126: JNI exception pending in Inflater.c
  • d08b5bd: 8258483: [TESTBUG] gtest CollectorPolicy.young_scaled_initial_ergo_vm fails if heap is too small
  • d03ec7a: 8339030: frame::print_value_on(outputStream* st, JavaThread *thread) doesn't need thread argument
  • eff6d9c: 8339167: Remove AbstractPoolEntry.PrimitiveEntry to reduce boxing overheads
  • a98ecad: 8338897: Small startup regression remains after JDK-8309622 and JDK-8331932
  • 3d49fb8: 8338103: Stabilize and open source a Swing OGL ButtonResizeTest
  • 0c2b175: 8328608: Multiple NewSessionTicket support for TLS
  • 379f3db: 8339175: ProblemList runtime/interpreter/LastJsrTest.java on all platforms with Xcomp
  • b670009: 8338729: Retire the test jdk/java/util/zip/TestZipError.java
  • ... and 103 more: https://git.openjdk.org/jdk/compare/ddbc0b6a39148cb30a8fda80fa7290e90e2a77d6...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8333098: ubsan: bytecodeInfo.cpp:318:59: runtime error: division by zero 8333098: ubsan: bytecodeInfo.cpp:318:59: runtime error: division by zero Aug 16, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 16, 2024
@openjdk
Copy link

openjdk bot commented Aug 16, 2024

@MBaesken The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Aug 16, 2024

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I think Tier4MinInvocationThreshold and Tier3MinInvocationThreshold should be double flags. I see that they are used only in double type expressions (usually with scaling value which is double). May be another RFE.

I agree that we need to check Tier4MinInvocationThreshold == 0 because it is acceptable value and can be specified on command line. I think 0 means that (freq < min_freq) is always true. So your change could be changed:

   if (cp_min_inv == 0) {
       // Tier4MinInvocationThreshold == 0 means we should not inline
       min_freq = freq + 1.;

double min_freq = 0.0;
int cp_min_inv = CompilationPolicy::min_invocations();
if (cp_min_inv == 0) {
min_freq = MinInlineFrequencyRatio;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to be 1.0 or infinity to preserve the existing behavior?
However, I'm not convinced the existing behavior is correct. If min_invocations decreases, shouldn't that make it easier to inline, not harder? @veresov, do you agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the existing behavior is correct - the goal is to compute a minimum call freq that is considered for inlining. 1.0/min_invocations() means call site was reached at least once before the method was submitted for a C2 compile.

The question is what should we do if min_invocations() is 0. That basically means that there was no profiling. I would probably just return false in that case, which would mean "we can't really say that's low frequency, because we don't have data". It's a meaningless mode of operation regardless though...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think the existing behavior is correct - the goal is to compute a minimum call freq that is considered for inlining. 1.0/min_invocations() means call site was reached at least once before the method was submitted for a C2 compile.

Do we really need to take 1.0/min_invocations() into account? InlineSmallCode should handle cases like that.

Also MinInlineFrequencyRatio is 0.0085 which corresponds to Tier4MinInvocationThreshold value equal approximately 117. The default value is 600. The only way to change it is to use CompileThreshold or CompileThresholdScaling which nobody use in production but only in testing. So in production we use MinInlineFrequencyRatio as min_freq. Why complicate the code.

An other issue. This is should_not_inline() method. Returning false means inlining called method. I was curious why we return false when we run with -Xcomp (see check (UseInterpreter) at line 301) ?
If we return false when min_invocations() is 0 it will be similar to -Xcomp mode.

Is it intentional to inline when we don't have profiling info?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think the existing behavior is correct - the goal is to compute a minimum call freq that is considered for inlining. 1.0/min_invocations() means call site was reached at least once before the method was submitted for a C2 compile.

Do we really need to take 1.0/min_invocations() into account? InlineSmallCode should handle cases like that.

I'm not sure I understand the question. InlineSmallCode acts on already compiled methods. This logic analyzes call site frequencies.

Also MinInlineFrequencyRatio is 0.0085 which corresponds to Tier4MinInvocationThreshold value equal approximately 117. The default value is 600. The only way to change it is to use CompileThreshold or CompileThresholdScaling which nobody use in production but only in testing. So in production we use MinInlineFrequencyRatio as min_freq. Why complicate the code.

Can't exactly tell. But I'm sure I had a reason, probably some corner case exposed by testing. Probably something didn't get inlined with very low thresholds.

An other issue. This is should_not_inline() method. Returning false means inlining called method. I was curious why we return false when we run with -Xcomp (see check (UseInterpreter) at line 301) ? If we return false when min_invocations() is 0 it will be similar to -Xcomp mode.

Is it intentional to inline when we don't have profiling info?

I think some tests run with -Xcomp and expect inlining to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So lowering CompileThreshold to below 117, means "compile sooner", but inline less, because should_not_inline will return true more often?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the problem is that the tests still expect you to inline things even with small or 0 thresholds. Hence the 1/min_invocations() floor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this turns into should not inline for freq < infinity or freq < 1, or basically "always should not inline". Wouldn't we want to use MIN2 instead of MAX2 for floor?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think what I originally meant it to be is the minimum measurable frequency. In that case MAX2 would work. On the other hand, it's not a scaled value. I think we can try removing it and leaving only freq < MinInlineFrequencyRatio and see which test fails and why. I, unfortunately, don't remember the details.

@dean-long
Copy link
Member

@vnkozlov , I think this is happening because the scaling factor is 0.001, so 600 turns into 0.

@vnkozlov
Copy link
Contributor

vnkozlov commented Aug 16, 2024

@vnkozlov , I think this is happening because the scaling factor is 0.001, so 600 turns into 0.

Yes, I noticed that. But my point is that the flag's definition allows 0 regardless scaling factor:

  product(intx, Tier4MinInvocationThreshold, 600,                           \
          "Minimum invocation to compile at tier 4")                        \
          range(0, max_jint)                                                \

@MBaesken
Copy link
Member Author

MBaesken commented Aug 19, 2024

Doesn't this need to be 1.0 or infinity to preserve the existing behavior?

So I would change the code to min_freq = 1.0; , right ?

@MBaesken
Copy link
Member Author

First, I think Tier4MinInvocationThreshold and Tier3MinInvocationThreshold should be double flags.
I see that they are used only in double type expressions (usually with scaling value which is double). May be another RFE.

I can open a new JBS issue
'Make Tier4MinInvocationThreshold and Tier3MinInvocationThreshold double flags'
is that okay ?

Comment on lines 318 to 324
double min_freq = 0.0;
int cp_min_inv = CompilationPolicy::min_invocations();
if (cp_min_inv == 0) {
min_freq = MinInlineFrequencyRatio;
} else {
min_freq = MAX2(MinInlineFrequencyRatio, 1.0 / cp_min_inv);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double min_freq = 0.0;
int cp_min_inv = CompilationPolicy::min_invocations();
if (cp_min_inv == 0) {
min_freq = MinInlineFrequencyRatio;
} else {
min_freq = MAX2(MinInlineFrequencyRatio, 1.0 / cp_min_inv);
}
int cp_min_inv = MAX2(1, CompilationPolicy::min_invocations());
double min_freq = MAX2(MinInlineFrequencyRatio, 1.0 / cp_min_inv);

@dean-long
Copy link
Member

Doesn't this need to be 1.0 or infinity to preserve the existing behavior?

So I would change the code to min_freq = 1.0; , right ?

I would do it slightly differently. See my suggested change.

@MBaesken
Copy link
Member Author

Thanks Dean, I added the suggestion.

@vnkozlov
Copy link
Contributor

First, I think Tier4MinInvocationThreshold and Tier3MinInvocationThreshold should be double flags.
I see that they are used only in double type expressions (usually with scaling value which is double). May be another RFE.

I can open a new JBS issue 'Make Tier4MinInvocationThreshold and Tier3MinInvocationThreshold double flags' is that okay ?

@veresov, do you agree to this suggestion? Or we should restrict values of these flags to not allow 0?

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a comment here explaining how min_freq is calculated and edge cases.

@veresov
Copy link
Contributor

veresov commented Aug 20, 2024

First, I think Tier4MinInvocationThreshold and Tier3MinInvocationThreshold should be double flags.
I see that they are used only in double type expressions (usually with scaling value which is double). May be another RFE.

I can open a new JBS issue 'Make Tier4MinInvocationThreshold and Tier3MinInvocationThreshold double flags' is that okay ?

@veresov, do you agree to this suggestion? Or we should restrict values of these flags to not allow 0?

I don't quite see a real need to do that but we can. Should we convert all threshold flags to double then?
About 0. It's actually a valid threshold.

@vnkozlov
Copy link
Contributor

I don't quite see a real need to do that but we can. Should we convert all threshold flags to double then?

Yes for all threshold flags. We always use scaling for them. With double flags results of related expressions and conditions will be more predictable.

@MBaesken
Copy link
Member Author

I created https://bugs.openjdk.org/browse/JDK-8339067
Convert Threshold flags (like Tier4MinInvocationThreshold and Tier3MinInvocationThreshold) to double
for the flag conversion issue brought up in this review .

@MBaesken
Copy link
Member Author

Besides adding some comment (?) is there anything else that should be done in this PR ?

@vnkozlov
Copy link
Contributor

Besides adding some comment (?) is there anything else that should be done in this PR ?

Current change looks fine to me. Please, add comment about avoiding devision by 0.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 28, 2024
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Aug 28, 2024
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 28, 2024
@MBaesken
Copy link
Member Author

Thanks for the reviews !

/integrate

@openjdk
Copy link

openjdk bot commented Aug 29, 2024

Going to push as commit f080b4b.
Since your change was applied there have been 119 commits pushed to the master branch:

  • ff59532: 8338678: Erroneous parameterized type represented as
  • 0b4a7d5: 8324859: Improve error recovery
  • 1383fec: 8327381: Refactor type-improving transformations in BoolNode::Ideal to BoolNode::Value
  • eb7ead5: 8336873: BasicSplitPaneDivider:oneTouchExpandableChanged() should mention that implementation depends on SplitPane.supportsOneTouchButtons property
  • 0ddcd70: 8335120: assert(!target->can_be_statically_bound() || target == cha_monomorphic_target) failed
  • 26e3d53: 8338716: Re-visit "interrupt handling" in jdk.internal.loader.Resource
  • 72a4900: 8338888: SystemDictionary::class_name_symbol has incorrect length check
  • a8ac287: 8339126: JNI exception pending in Inflater.c
  • d08b5bd: 8258483: [TESTBUG] gtest CollectorPolicy.young_scaled_initial_ergo_vm fails if heap is too small
  • d03ec7a: 8339030: frame::print_value_on(outputStream* st, JavaThread *thread) doesn't need thread argument
  • ... and 109 more: https://git.openjdk.org/jdk/compare/ddbc0b6a39148cb30a8fda80fa7290e90e2a77d6...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 29, 2024
@openjdk openjdk bot closed this Aug 29, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 29, 2024
@openjdk
Copy link

openjdk bot commented Aug 29, 2024

@MBaesken Pushed as commit f080b4b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants