Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SO-2902: Add multi-language RF2 import support. #194

Merged
merged 31 commits into from
Mar 8, 2018

Conversation

zoliMolnar
Copy link

This pull request enables multi-language RF2 importing.

@zoliMolnar zoliMolnar self-assigned this Jan 25, 2018
@zoliMolnar zoliMolnar changed the title Issue/rf2 multi language import fix Add multi-language RF2 import support. Jan 25, 2018
match what they represent, members ids instead of concept ids
public void importMoreThanOneLanguageCodeDescription() {
final String enDescriptionId = "41320138114";
final String svDescriptionId = "24688171113";
final String conceptIdODescription = "301795004";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo



@Test
public void importMoreThanOneLanguageCodeDescription() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a number to the test method's name like the others above. As you can see there is an annotation at the top: @FixMethodOrder(MethodSorters.NAME_ASCENDING) which enforces method execution order.


@Test
public void importFromMoreThanOneLanguageRefset() {
final String enLanguageRefsetConceptId = "34d07985-48a0-41e7-b6ec-b28e6b00adfc";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should decide to either merge the two test cases together (since the RF2 files are the same and you always import both descriptions and language refset members as well) or split the RF2 files and test language refset and description import separately.



@Test
public void importMoreThanOneLanguageCodeDescription() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

importDescriptionsWithMultipleLanguageCodes or importDescriptionsWithDifferentLanguageCodes

}

@Test
public void importFromMoreThanOneLanguageRefset() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not importing more than one language refset! It should be rather something like:
importLanguageRefsetMembersWithMultipleLanguageCodes or importLanguageRefsetMembersWithDifferentLanguageCodes

@@ -18,13 +18,16 @@
import java.io.File;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the following exception when I open the import wizard in the client (server-client mode):
java.lang.IllegalArgumentException: Could not find property with name descriptionsFile in class class com.b2international.snowowl.snomed.importer.net4j.ImportConfiguration at org.eclipse.core.internal.databinding.beans.BeanPropertyHelper.getPropertyDescriptor(BeanPropertyHelper.java:168) at org.eclipse.core.databinding.beans.PojoProperties.value(PojoProperties.java:119) at org.eclipse.core.databinding.beans.PojoProperties.value(PojoProperties.java:89) at org.eclipse.core.databinding.beans.PojoObservables.observeValue(PojoObservables.java:76) at org.eclipse.core.databinding.beans.PojoObservables.observeValue(PojoObservables.java:56) at com.b2international.snowowl.snomed.importer.ui.wizard.page.ImportDetailsWizardPage.bindTextField(ImportDetailsWizardPage.java:319) at com.b2international.snowowl.snomed.importer.ui.wizard.page.ImportDetailsWizardPage.initDataBinding(ImportDetailsWizardPage.java:236) at com.b2international.snowowl.snomed.importer.ui.wizard.page.AbstractImportWizardPage.createControl(AbstractImportWizardPage.java:67)

monitor.worked(); // 1

for (File descfile : descriptionsFiles) {
writeComponent(out, descfile, monitor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not going to work like this, see how refset URLs or excluded refset ids are sent over the wire.

@@ -92,12 +93,12 @@ protected void indicating(final ExtendedDataInputStream in, final OMMonitor moni
receivedFilesDirectory.deleteOnExit();

readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setConceptsFile(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setDescriptionsFile(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setDescriptionsFiles(ImmutableList.of(f)); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setTextDefinitionFile(f); }}, monitor.fork());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not going to work like this, see how refset URLs or excluded refset ids are sent over the wire.

CsvParser parser = new CsvParser(descriptionReader, getFileName(url), CSV_SETTINGS, descriptionParserCallback, DESCRIPTION_FIELD_COUNT);
parser.parse();
if (!configuration.getDescriptionsFiles().isEmpty()) {
for(File descFile : configuration.getDescriptionsFiles()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing space

importArchive("SnomedCT_Release_INT_20150131_new_concept_for_languageCode.zip");
getComponent(branchPath, SnomedComponentType.CONCEPT, conceptIdODescription).statusCode(200);

importArchive("SnomedCT_Release_INT_20150201_new_description_with2_language_code.zip", importConfiguration);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description in sct2_Description_Delta-sv_INT_20150201 has a language code 'en', please fix that!

Copy link
Member

@nagyo nagyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that multiple Text Definition files could also be present in RF2 archives based on the language code... Please add that feature as well. They work just as the same as description files do.

final String svLanguageRefsetMemberId = "34d07985-48a0-41e7-b6ec-b28e6b00adfb";
final String conceptIdOfDescription = "301795004";


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary extra line


final SnomedRefSetNameCollector provider = new SnomedRefSetNameCollector(config, new NullProgressMonitor(), "");

try {
provider.parse(config.toURL(config.getLanguageRefSetFile()));
for (File langFile : config.getLanguageRefSetFiles()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are already iterating through the available language refset files in the outer for loop, I don't think this is necessary here.

.getAllFileName(zipFiles, ReleaseComponentType.DESCRIPTION, contentSubType)
.stream()
.map(fileName -> new File(fileName))
.collect(Collectors.toSet());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

descriptionReader = new InputStreamReader(url.openStream());
final CsvParser parser = new CsvParser(descriptionReader, getFileName(url), CSV_SETTINGS, descriptionParserCallback, DESCRIPTION_FIELD_COUNT);
parser.parse();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line

}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line

if (isValidReleaseFile(langFile)) {
releaseFileValidators.add(new SnomedLanguageRefSetValidator(configuration, configuration.toURL(langFile), this));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line

@@ -58,10 +61,10 @@
private File archiveFile;
private File rootFile;
private File conceptsFile;
private File descriptionsFile;
private Collection<File> descriptionsFiles = Collections.emptySet();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to this the collection will be initialized still as immutable.

@@ -70,15 +71,27 @@ protected void requesting(final ExtendedDataOutputStream out, final OMMonitor mo

out.writeUTF(importConfiguration.getCodeSystemShortName());

final Collection<File> descriptionsFiles = importConfiguration.getDescriptionsFiles();
final Collection<File> languageRefSetFiles = importConfiguration.getLanguageRefSetFiles();
monitor.worked(); // 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here? Have you checked how the monitor's work is calculated here?

writeComponent(out, importConfiguration.getTextDefinitionFile(), monitor.fork());
writeComponent(out, importConfiguration.getRelationshipsFile(), monitor.fork());
writeComponent(out, importConfiguration.getStatedRelationshipsFile(), monitor.fork());
writeComponent(out, importConfiguration.getDescriptionType(), monitor.fork());
writeComponent(out, importConfiguration.getLanguageRefSetFile(), monitor.fork()); // + 7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment is a clue!

readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setTextDefinitionFile(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setRelationshipsFile(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setStatedRelationshipsFile(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setDescriptionType(f); }}, monitor.fork());
readComponent(in, importConfiguration, receivedFilesDirectory, new FileCallback() { @Override public void setFile(final File f) { importConfiguration.setLanguageRefSetFile(f); }}, monitor.fork());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line

@nagyo nagyo changed the title Add multi-language RF2 import support. SO-2902: Add multi-language RF2 import support. Feb 26, 2018
issue/rf2-multi-language-import-fix

Conflicts:
	snomed/com.b2international.snowowl.snomed.api.rest.tests/src/com/b2international/snowowl/snomed/api/rest/io/SnomedImportApiTest.java
- remove unnecessary fields from import config
- refactored SnomedRefSetNameCollector:
  - do not parse description files for each refset URL, parse them only
once
  - "prepare" resolvable refset labels for the client
- do not send language refset files twice over the wire
Copy link
Member

@apeteri apeteri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More copyright dates to move forward:

  • SnomedImportApiTest
  • ImportRefSetCommand
  • ListLanguageRefSetsCommand
  • ImportUtil
  • RF2ReleaseRefSetFileCollector
  • SnomedRefSetNameCollector
  • AbstractSnomedValidator
  • SnomedConceptValidator
  • SnomedTaxonomyValidator
  • SnomedValidationContext
  • ImportConfiguration
  • ReleaseFileSet
  • ReleaseFileSetSelectors

config.setSourceKind(ImportSourceKind.ARCHIVE);
config.setArchiveFile(archiveFile);

for (final Entry<String, String> label : provider.getRefsetIdToLabelMap().entrySet()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This label map could also be extended with information from the store (from MAIN, most likely), similar to how it's done in the import wizard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I see that the name collector does this already. ✔️

put(ErroneousAustralianReleaseFileNames.ERRONEOUS_AU_20110531_SNOMED_LANGUAGE_REFSET_NAME, Concepts.REFSET_LANGUAGE_TYPE).
put(ErroneousAustralianReleaseFileNames.ERRONEOUS_AU_20120229_SNOMED_LANGUAGE_REFSET_NAME, Concepts.REFSET_LANGUAGE_TYPE).build();
private static final String UNLABELED_REFSET_SUFFIX = " (unresolved)";
private static final String ESCAPED_UNLABELED_REFSET_SUFFIX = CharMatcher.is(')')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pattern#quote(String) will return a String that is treated as a "literal pattern" in regular expressions; suggesting to use that instead of targeted replacements.

refSetTypeIconId = getRefSetTypeIconIdFromHeader(url, record);
} else {
unlabeledRefSetIds.add(record.get(CONCEPT_ID_COLUMN));
InputStreamReader descriptionReader = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

descriptionReader should get the try-with-resources treatment:

final URL url = configuration.toURL(descFile);
try (final InputStreamReader descriptionReader = new InputStreamReader(url.openStream())) {
    ...
}

Copy link
Member

@cmark cmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cmark cmark merged commit b7e0973 into 6.x Mar 8, 2018
@cmark cmark deleted the issue/rf2-multi-language-import-fix branch March 8, 2018 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants