Add lab results mappings for MIMIC-IV v2.0 - SSSOM format #1418

Open
wants to merge 1 commit into
base: main

Conversation

a-chahin
Contributor

@a-chahin a-chahin commented Nov 8, 2022

This pull request adds two CSV mapping files for lab concepts from the d_labitems definitions table in MIMIC-IV v2.0. The first file, d_labitems_to_loinc.csv, contains itemid to LOINC mappings. There will be a conflict when merging this file into main, as it uses the same file name as an older labs mapping file. The new file contains the latest lab concepts, but its format is very different. The second file, d_labitems_to_omop.csv, contains itemid to OMOP mappings.

Both files use the Simple Standard for Sharing Ontological Mappings (SSSOM) format. @sssom

@tompollard
Member

Thanks @a-chahin! Same comments as: #1419 (comment), copied below:

  • I notice that you have left some of the author_id fields empty for the mappings. I assume this is because we inherited those mappings?
  • The documentation at https://mapping-commons.github.io/sssom/author_id/ suggests that author_id is "Recommended to be a (pipe-separated) list of ORCIDs". I think you are currently using commas to separate lists. Should we swap out the commas for |?
  • Minor thing, but both files are missing a newline character at the end. This can lead some tools to ignore the final line. If you are editing the files, we might want to add a newline (just hit return at the end of the final line of each file).
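
For the missing trailing newline, a minimal sketch of an automated fix could look like the following. The helper name `ensure_trailing_newline` is hypothetical and not part of this pull request; it only appends a newline when the last byte of a non-empty file is not already one.

```python
from pathlib import Path


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline if a non-empty file does not end with one.

    Returns True if the file was modified, False otherwise.
    """
    data = path.read_bytes()
    if not data or data.endswith(b"\n"):
        return False
    path.write_bytes(data + b"\n")
    return True
```

It could be called once per mapping file, for example `ensure_trailing_newline(Path("d_labitems_to_omop.csv"))`.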

@matentzn hope you don't mind me tagging you here (again!). This file is a mapping for MIMIC-IV concepts. No pressure, but we would appreciate your thoughts on whether anything more should be done to conform to @sssom format.

@matentzn matentzn left a comment

Hey, some quick feedback, happy to look this over in more detail if you want (see inline comments as well)

@@ -0,0 +1,1624 @@
subject_id,subject_label,predicate_id,object_id,object_label,mapping_justification,author_id,reviewer_id,confidence,comment

mapping_justification uses invalid values; yours should be semapv:ManualMappingCuration, see https://github.com/mapping-commons/sssom/blob/master/src/sssom_schema/schema/sssom_schema.yaml#L211

Contributor Author

Thank you! I will update the tables.
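
One way the table update could be scripted is sketched below. It assumes the only legacy value is `HumanCurated` (the value visible in the rows quoted in this review) and that the input is non-empty CSV text with a `mapping_justification` column; the function name `fix_justification` is hypothetical.

```python
import csv
import io

VALID_JUSTIFICATION = "semapv:ManualMappingCuration"
# Assumption: HumanCurated is the only legacy value that needs replacing.
LEGACY_VALUES = {"HumanCurated"}


def fix_justification(csv_text: str) -> str:
    """Rewrite legacy mapping_justification values to the semapv CURIE."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["mapping_justification"] in LEGACY_VALUES:
            row["mapping_justification"] = VALID_JUSTIFICATION
        writer.writerow(row)
    return out.getvalue()
```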

@@ -0,0 +1,1624 @@
subject_id,subject_label,predicate_id,object_id,object_label,mapping_justification,author_id,reviewer_id,confidence,comment
mimic:51221,Hematocrit,skos:exactMatch,omop:3023314,Hematocrit [Volume Fraction] of Blood by Automated count,HumanCurated,,"orcid:0000-0001-8822-1884,0000-0002-9348-9284",1,

orcids should be pipe separated

Contributor Author

Thank you! This will be corrected in the updated files.
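
A rough sketch of that correction follows. It extracts every ORCID-shaped token from a field and joins the tokens with pipes. Adding the `orcid:` prefix to every identifier is an assumption here (the quoted row prefixes only the first one), and `normalize_orcids` is a hypothetical helper name.

```python
import re

# Four groups of four characters; the last character may be the checksum "X".
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def normalize_orcids(value: str) -> str:
    """Return the ORCIDs found in value as a pipe-separated list of CURIEs."""
    return "|".join(f"orcid:{orcid}" for orcid in ORCID_RE.findall(value))
```

Applied to the reviewer_id value quoted above, this would turn the comma-separated pair into a single pipe-separated string with both identifiers prefixed.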

@@ -0,0 +1,1624 @@
subject_id,subject_label,predicate_id,object_id,object_label,mapping_justification,author_id,reviewer_id,confidence,comment
mimic:51221,Hematocrit,skos:exactMatch,omop:3023314,Hematocrit [Volume Fraction] of Blood by Automated count,HumanCurated,,"orcid:0000-0001-8822-1884,0000-0002-9348-9284",1,
mimic:50912,Creatinine,skos:exactMatch,omop:3016723,Creatinine [Mass/volume] in Serum or Plasma,HumanCurated,,"orcid:0000-0001-8822-1884,0000-0002-9348-9284",,

Is there a reason confidence is only sometimes curated? It's not wrong, just wondering from a user perspective how I would interpret the absence of a value for confidence.

Contributor Author

The mapping was done before we were aware of the SSSOM standard. This is just the initial release for mapping files. In the next step, I will go over the mappings to check for any deprecated codes and I will add a confidence score for the rest of the concepts.
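
To find the rows that still need a confidence score, something like the sketch below might help. It assumes the column layout quoted in this review (`subject_id` and `confidence` columns); the function name is hypothetical.

```python
import csv
import io
from typing import List


def rows_missing_confidence(csv_text: str) -> List[str]:
    """Return the subject_id of every row whose confidence field is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["subject_id"]
        for row in reader
        if not (row.get("confidence") or "").strip()
    ]
```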

52034,Total Calcium,Blood,Blood Gas,,3032503,Calcium [Mass/volume] in Blood,Measurement,LOINC,Lab Test,S,49765-1,0000-0001-8822-1884,0000-0002-9348-9284,MIMIC-IV v2.0,2.71,https://loinc.org/relma/,v4.7,3/1/2022,0.1,,0
52035,Total Calcium,Blood,Blood Gas,,3032503,Calcium [Mass/volume] in Blood,Measurement,LOINC,Lab Test,S,49765-1,0000-0001-8822-1884,0000-0002-9348-9284,MIMIC-IV v2.0,2.71,https://loinc.org/relma/,v4.7,3/1/2022,0.1,Duplicate of itemid 52034,0
52036,Voided Specimen,Blood,Blood Gas,,,,Measurement,LOINC,Lab Test,S,,,,MIMIC-IV v2.0,,,,,,Not a lab test,6397
52037,WB tCO2,Blood,Blood Gas,,3014094,"Carbon dioxide, total [Moles/volume] in Blood",Measurement,LOINC,Lab Test,S,20565-8,0000-0001-882

Invalid record according to SSSOM. Do you intend to say "there is no exact mapping"?

Contributor Author

It depends. For example, itemid 52036 is for the concept "Voided Specimen" which is not a lab test.
Other concepts have ambiguous labels that don't make any sense.
Do you think I should use "there is no exact mapping" for all unmappable concepts regardless of the reason?


This is an interesting question for the SSSOM issue tracker https://github.com/mapping-commons/sssom/issues because what we usually do in this case is not include the subject entity at all in the mapping file. If there is no mapping, there is no mapping. However, I can see how your desire to have an entry for all subject entities in MIMIC makes sense. In this case, we usually define the predicate_id (e.g. skos:exactMatch) and then add a placeholder, sssom:NoTermFound; see mapping-commons/sssom#28 for a discussion on the subject. This will also make it easier to say later that there is no broad/narrow match. However, for your case, this solution is not perfect, because you really want to say something like "out of scope", but that's not currently possible to express, so maybe make an issue?
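
As a sketch of the placeholder approach described here: rows without an object_id keep their subject, get a predicate, and point at sssom:NoTermFound. Defaulting the predicate to skos:exactMatch is an assumption taken from the example in this comment, `fill_unmapped` is a hypothetical helper, and this does not express "out of scope".

```python
from typing import Dict


def fill_unmapped(row: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of row with a placeholder object for unmapped subjects."""
    filled = dict(row)
    if not (filled.get("object_id") or "").strip():
        # Assumption: an unmapped row asserts "no exact match was found".
        filled["predicate_id"] = filled.get("predicate_id") or "skos:exactMatch"
        filled["object_id"] = "sssom:NoTermFound"
    return filled
```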

Member

Thanks @matentzn, I think you're right that sssom:NoTermFound isn't quite what we need. @a-chahin give me a shout if you'd like to discuss this before raising the issue. Something like an OutOfScope tag sounds good to me!

Member

I have just raised an issue at mapping-commons/sssom#245

@tompollard
Member

@matentzn Many thanks for your quick and helpful feedback. I appreciate you spending the time to do this! We will work on addressing the points that you've raised.

@a-chahin
Contributor Author

a-chahin commented Nov 9, 2022

Hey, some quick feedback, happy to look this over in more detail if you want (see inline comments as well)

Thank you so much for the valuable information @matentzn.

  • I will double check all skos:broadMatch under predicate_id to make sure they are correct.
  • I used skos:exactMatch when the label, fluid, and unit of measurement in MIMIC matches LOINC.
  • I will check with Tom to set up a QC system.
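
The exactMatch rule in the second bullet could be written down roughly as below. This is only an illustration: the dictionary keys `label`, `fluid`, and `unit` are hypothetical, and plain case-insensitive string equality stands in for the curator's judgment about whether two values match.

```python
from typing import Dict, Optional

COMPARED_FIELDS = ("label", "fluid", "unit")


def _norm(value: Optional[str]) -> str:
    return (value or "").strip().lower()


def suggest_predicate(mimic_item: Dict[str, str], loinc_term: Dict[str, str]) -> Optional[str]:
    """Suggest skos:exactMatch only if label, fluid, and unit all agree.

    Returns None when any field is missing or differs, meaning the pair
    needs manual review rather than an automatic predicate.
    """
    for field in COMPARED_FIELDS:
        left, right = _norm(mimic_item.get(field)), _norm(loinc_term.get(field))
        if not left or left != right:
            return None
    return "skos:exactMatch"
```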

@tompollard
Member

tompollard commented Nov 9, 2022

I will check with Tom to set up a QC system.

Let's have a chat about this. In a separate pull request, it looks like we'll need to:

  1. Add a "validate_mappings" rule to either a Makefile or to our Pytest code (need to work out where the validation code lives!).
  2. Add a validation rule to our GitHub workflow that references the rule, based on the one at https://github.com/mapping-commons/mh_mapping_initiative/blob/master/.github/workflows/qc.yml
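
If the validation code ends up in the Pytest suite, a first pass might look like the sketch below. It is not the SSSOM toolkit's validator: the function name, the set of required columns, and the individual checks are assumptions drawn from the review comments in this thread (semapv justifications, pipe-separated ORCIDs, trailing newline).

```python
import csv
import io
from typing import List

# Columns this sketch treats as required (an assumption; check the SSSOM schema).
REQUIRED_COLUMNS = ("subject_id", "predicate_id", "object_id", "mapping_justification")


def validate_mappings(csv_text: str) -> List[str]:
    """Return a list of problems; an empty list means no issues were found."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column in REQUIRED_COLUMNS:
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: empty {column}")
        justification = row["mapping_justification"] or ""
        if justification and not justification.startswith("semapv:"):
            problems.append(f"line {line_no}: unexpected mapping_justification {justification!r}")
        for column in ("author_id", "reviewer_id"):
            if "," in (row.get(column) or ""):
                problems.append(f"line {line_no}: {column} should be pipe-separated")
    if not csv_text.endswith("\n"):
        problems.append("file does not end with a newline")
    return problems


def test_validate_mappings():
    good = (
        "subject_id,predicate_id,object_id,mapping_justification,reviewer_id\n"
        "mimic:51221,skos:exactMatch,omop:3023314,"
        "semapv:ManualMappingCuration,orcid:0000-0001-8822-1884\n"
    )
    assert validate_mappings(good) == []
```

A real test would read each mapping file from the repository instead of using an inline string, and the workflow step would simply run Pytest.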

@a-chahin
Contributor Author

@matentzn I wanted to get your opinion on the most appropriate mapping_justification for medication data in MIMIC. We used a script written by @xborrat to automate the mapping of 5000+ medications from NDC to RxNorm utilizing the OMOP data model. NDC codes are mapped to OMOP concept IDs and then, using the non-standard to standard relationship, to RXCUIs for RxNorm. We then spot-checked the mappings to validate the results. Is semapv:MappingChaining the correct mapping justification?
Thank you for your help!

@matentzn

@a-chahin There are no fully matured guidelines here. I would recommend using semapv:MappingChaining only when two already established mappings are combined, that is, when there is a mapping justification and semantic mapping predicate for NDC->OMOP (e.g. semapv:LexicalMatching and skos:exactMatch) and the same for OMOP->RxNorm (which afaik there is not, because OMOP standard to non-standard is "exact or broad", with no way to know which is which, and no formal justifications).

It's tough in your case. Depending on the mapping predicate you use for the NDC->RxNorm mapping, maybe semapv:MappingChaining is ok, but then I would add a comment in the mapping set metadata saying exactly how the chaining was applied (e.g. how you interpreted the OMOP NS-S mappings).
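
To make the chaining idea concrete, here is a small sketch under a conservative assumption: a chained mapping is only emitted when both hops are skos:exactMatch, and everything else is set aside for manual review. That rule, the `chain_mappings` name, and the tuple layout are all illustrative choices, not established SSSOM guidance.

```python
from typing import Dict, List, Tuple

Mapping = Tuple[str, str, str]  # (subject_id, predicate_id, object_id)
EXACT = "skos:exactMatch"


def chain_mappings(
    first: List[Mapping], second: List[Mapping]
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Join two mapping sets where first.object_id equals second.subject_id.

    Returns (chained rows, subject_ids that need manual review).
    """
    by_subject: Dict[str, List[Mapping]] = {}
    for mapping in second:
        by_subject.setdefault(mapping[0], []).append(mapping)

    chained: List[Dict[str, str]] = []
    unresolved: List[str] = []
    for subject, predicate, middle in first:
        hops = by_subject.get(middle, [])
        if not hops:
            unresolved.append(subject)  # no second hop exists
            continue
        for _, next_predicate, target in hops:
            if predicate == EXACT and next_predicate == EXACT:
                chained.append({
                    "subject_id": subject,
                    "predicate_id": EXACT,
                    "object_id": target,
                    "mapping_justification": "semapv:MappingChaining",
                })
            else:
                # Assumption: anything other than exact+exact is not composed.
                unresolved.append(subject)
    return chained, unresolved
```

Whatever composition rule is actually used would belong in the mapping set metadata comment suggested above.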

Thanks for the great question! If you could use the SSSOM issue tracker for some of these, others would be able to chime in as well!

@a-chahin
Contributor Author

a-chahin commented Feb 1, 2023

Thank you @matentzn for your reply! I will create an issue on the SSSOM GitHub page.


3 participants