Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SO-3125: Resolve validation performance issues #246

Merged
merged 29 commits into from
Jul 4, 2018

Conversation

zoliMolnar
Copy link

https://snowowl.atlassian.net/browse/SO-3125

This pull request resolves validation performances by utilizing the jobs API, to async run multiple validation rules.

@zoliMolnar zoliMolnar self-assigned this Jun 29, 2018
@zoliMolnar zoliMolnar requested a review from cmark June 29, 2018 07:29
issue/SO-3125_resolve_validation_performance_issues

Conflicts:
	snomed/com.b2international.snowowl.validation.snomed/src/main/resources/scripts/ruleSnomedCommon1.groovy
	snomed/com.b2international.snowowl.validation.snomed/src/main/resources/scripts/ruleSnomedCommon2.groovy
@@ -0,0 +1,170 @@
/*
* Copyright 2011-2018 B2i Healthcare Pte Ltd, http:https://b2i.sg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New files should indicate current year as starting year in the license header.

*/
public class ValidationThreadPoolTest {

private static final int MAXIMUM_AMOUNT_OF_RUNNING_EXPENSIVE_JOBS = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provide this as configuration to the ValidationThreadPool class so deployments with more computational power (CPU, RAM, ES cluster nodes) can increase the number of expensive jobs. This would be useful for the other CheckTypes as well.


@Test
public void testConcurrentExpensiveJobs() {
final ValidationThreadPool pool = new ValidationThreadPool(VALIDATION_THREAD_COUNT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test fixture initialization should go into a @Before method, please convert common variables to fields and initialize them in that method.


final List<Promise<Object>> validationPromises = Lists.newArrayList();
for (int i = 0; i < 10; i++) {
validationPromises.add(pool.submit(CheckType.NORMAL, createValidatableRunnable(CheckType.FAST, manager)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ NORMAL vs FAST!

if (CheckType.NORMAL == checkType) {
assertTrue(runningNormalJobs <= MAXIMUM_AMOUNT_OF_RUNNING_NORMAL_JOBS);
}
Thread.sleep(500);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to configure this hardcoded 500 ms timeout value based on the given CheckType, just to simulate the cost more accurately.


@Override
public int hashCode() {
int hash = 17;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Objects.hash(args) correctly, where you pass all values directly to the function and not one-by-one.


@Override
public String toString() {
return uniqueRuleId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add checkType to toString()

job.setRule(schedulingRule);
job.addJobChangeListener(new JobChangeAdapter() {

public void done(IJobChangeEvent event) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing @Override annotation.

final Job job = new ValidationJob(checkType.getName(), runnable);
String uniqueRuleId = UUID.randomUUID().toString();
final ISchedulingRule schedulingRule = new ValidationRuleSchedulingRule(checkType, maxValidationThreadCount, uniqueRuleId);
final Promise<Object> isRuleDonePromise = new Promise<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename isRuleDonePromise to just promise.

@@ -44,6 +41,8 @@
@NotNull
private Severity severity;

private CheckType checkType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add NotNull annotation or default value CheckType.NORMAL.

Zoltan Molnar and others added 11 commits June 29, 2018 18:28
... cancel status
- Add check against null values for family and runnables, change
belongsTo to use Objects.equal

- Add NotNull annotaion to ValidationRuleCreateRequest

- Add configurability to limit the maximum amount of concurrent normal
or expensive jobs to ValidationBootstrap, ValidationThreadPool,
ValidationRuleSchedulingRule

- Fix test class and added logging for test
... fetched concepts in SnomedValidationIssueDetailExtension.class,
added validation rule execution time logging to ValidateRequest.class
...in case of _source = false configuration

This will reduce the memory consumption of aggregations with top hits
somewhat.
@@ -54,13 +67,12 @@ public void testConcurrentExpensiveJobs() {
}

Promise.all(validationPromises).getSync();
System.out.println();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove println calls. If you would like to separate the log output of the test cases then use a JUnit Rule like this class: com.b2international.snowowl.test.commons.TestMethodNameRule.

@@ -78,7 +78,11 @@ public void preRun(SnowOwlConfiguration configuration, Environment env) throws E
// initialize validation thread pool
// TODO make this configurable
int numberOfValidationThreads = Math.max(2, Runtime.getRuntime().availableProcessors() / 2);
env.services().registerService(ValidationThreadPool.class, new ValidationThreadPool(numberOfValidationThreads));
// TODO calculate these based on systems overall strength
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these to actual snowowl_config.yml configuration values.

@@ -48,7 +60,7 @@ protected IStatus run(IProgressMonitor monitor) {

@Override
public boolean belongsTo(Object family) {
return (this.family == null || family == null) ? false : this.family.equals(family);
return Objects.equal(this.family, family);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use java.util.Objects, please.

hash = hash * 31 + Objects.hash(checkType);
hash = hash * 31 + maxConcurrentJobs;
hash = hash * 31 + Objects.hash(uniqueRuleId);
hash = hash * 31 + Objects.hash(checkType, id, maxConcurrentJobs, maxConcurrentExpensiveJobs);

return hash;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just return the value computed by the Objects.hash method, like this:

@Override
public int hashCode() {
    return Objects.hash(checkType, id, maxConcurrentJobs, maxConcurrentExpensiveJobs);
}

}

@Override
public String toString() {
return uniqueRuleId;
return id + " " + checkType.getName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use String.format() method.

Copy link
Member

@cmark cmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@cmark cmark merged commit de60e7f into 6.x Jul 4, 2018
@cmark cmark deleted the issue/SO-3125_resolve_validation_performance_issues branch July 4, 2018 11:48
@zoliMolnar
Copy link
Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants