-
-
Notifications
You must be signed in to change notification settings - Fork 709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-22582 Avoid synchronizing in RuleBasedBreakIterator and ULocale unless strictly necessary #2775
Conversation
…nless strictly necessary See unicode-org#2775
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Can you please sign the CLA so we can look at this? |
Hi Rich, thanks for the reminder, this slipped my mind, I've signed the CLA already, so it should be possible for you to look at the contents of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little uncomfortable giving a ship-it here because I'm not longer familiar enough with either Java or the internals of ULocale to be sure I know what's going on. That said, all of this looks pretty reasonable to me. But it'd be nice to have a second pair of eyes on this (@markusicu ?).
@yumaoka and maybe @macchiati could you please also take a look? |
icu4j/main/core/src/main/java/com/ibm/icu/text/RuleBasedBreakIterator.java
Outdated
Show resolved
Hide resolved
icu4j/main/core/src/main/java/com/ibm/icu/text/RuleBasedBreakIterator.java
Outdated
Show resolved
Hide resolved
icu4j/main/core/src/main/java/com/ibm/icu/text/RuleBasedBreakIterator.java
Outdated
Show resolved
Hide resolved
PS: Consider separate pull requests for the changes in the two classes. |
…nless strictly necessary See unicode-org#2775
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
…nless strictly necessary See unicode-org#2775
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
…nless strictly necessary See unicode-org#2775
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
icu4j/main/core/src/main/java/com/ibm/icu/text/RuleBasedBreakIterator.java
Outdated
Show resolved
Hide resolved
I suggest you don't amend/squash right away, so that it's easier to track comments vs. changes. Ignore the one failing check about multiple commits -- we can squash later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes lgtm tnx
I will start the CI checks
looks like good changes now; plan to approve when CI is happy modulo squashing
…nless strictly necessary See unicode-org#2775
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
I wish had seen that cause I just squashed it :/ |
Adding my results from my own tests, they still pass as expected: Test code looks something like this (In Kotlin):
I can't append the someChromeOsFunction for obvious reasons, although the code looks something like this:
With Optimization: Without Optimization: So it's ~3x as fast, although I used 10x the number of virtual threads compared with last time, still same amount of physical threads, so result should be pretty much the same, most of the comments in the cl shouldn't affect perf, but rather were around edge cases in concurrency, so I expect results to stay very similar. |
https://unicode-org.atlassian.net/browse/ICU-22582
Optimize synchronized code which is causing contention which leads to latency degradation.
RuleBasedBreakIterator: Remove fBreakEngines and instead read from the static gAllBreakEngines, this of course has issues, which we fix by using a ConcurrentLinkedQueue and read as normal, but write within a synchronized context, we will only iterate over the LanguageBreakEngines we haven't iterated over and then add a new one if needed.
ULocale: If the Locale hasn't changed then use the cached value, else go through the code path that modifies it. (Similar solution to what Locale is doing), now both defaultLocale and defaultULocale are volatile to prevent CPU caching from leading to wrong results.
Tests: I performed some manual tests and the code still behaves the same it's just that now in an environment with 1 CPU with ~6 threads it finishes 2x as fast.