Implement a system for keeping partonomies in sync (Uberon & CC -> BP) #12658

Open
dosumis opened this issue Sep 13, 2016 · 5 comments

@dosumis
Contributor

dosumis commented Sep 13, 2016

We have many cases where we manually maintain partonomies between Uberon and BP (most notably under the development branch), and between CC and BP (most notably in the organization branch). Ideally we'd have some way to keep these part hierarchies in sync automatically.

Chris has been on record in the past as suggesting a script-based approach. I still like the GCI approach, as it can dynamically adapt to changes in Uberon. I fear a script will run only periodically and will inevitably cause problems when it lags.

Here's an example of the GCI approach (the example comes from a case where the difference in partonomy caused an inconsistency: #12655)

Current pattern:
'circulatory system development'
EquivalentTo: 'anatomical structure development' that 'results in development of' some ('part of' some 'circulatory system')
Proposal - Add GCI: 'anatomical structure development' that 'results in development of' some ('part of' some 'circulatory system') SubClassOf part_of some 'circulatory system development' (*)

This is wordy enough that it would need to be implemented as part of a pattern system, i.e. it could be added as part of the pattern used to add any development term.
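
To illustrate why a pattern system helps here, the following is a minimal sketch of templating both GCI variants from a single anatomical structure label. The function name `development_gci` and the string layout are hypothetical and are not DOSDP syntax; the output is Manchester-like text for reading, not something a parser is guaranteed to accept.

```python
# Hypothetical helper, for illustration only: builds the GCI text for one
# anatomical structure label. Not DOSDP syntax and not an existing tool.
GENUS = "'anatomical structure development'"


def development_gci(structure: str, denormalised: bool = False) -> str:
    """Return the GCI as a Manchester-like string for the given structure label."""
    # Left-hand side: development of anything that is part of the structure.
    lhs = (
        f"{GENUS} that 'results in development of' some "
        f"('part of' some '{structure}')"
    )
    if denormalised:
        # Right-hand side spelled out as a class expression (footnoted variant).
        rhs = (
            f"'part of' some ({GENUS} that "
            f"'results in development of' some '{structure}')"
        )
    else:
        # Right-hand side refers to the named GO class by label.
        rhs = f"'part of' some '{structure} development'"
    return f"{lhs} SubClassOf {rhs}"


if __name__ == "__main__":
    print(development_gci("circulatory system"))
    print(development_gci("circulatory system", denormalised=True))
```

Only the structure label varies between terms, which is why the axiom can be attached to whatever pattern already generates the development term.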

End users will, of course, want to see the part relationships directly, so we'd need a way to materialize these (non-redundantly) in the release files. @cmungall @balhoff - is this possible with existing tools?

CC @cmungall @ukemi @balhoff - comments please.

* or fully denormalised: 'anatomical structure development' that 'results in development of' some ('part of' some 'circulatory system') SubClassOf part_of some 'anatomical structure development' that 'results in development of' some 'circulatory system' ?

@cmungall
Member

My initial dislike of the GCI system was based on

  1. difficulties keeping them in sync with ontology
  2. inferences are not materialized / easily visible in Protege

The first objection can be taken care of by DOSDPs. Currently I have a rogue extension illustrated here: https://github.com/obophenotype/bio-attribute-ontology/blob/ecca8208294ecbb0bcfea5449a91bea975bc21ae/src/ontology/patterns/entity_attribute.yaml#L42

This uses functional syntax, but I think it would be cleaner to have two Manchester expressions, one for the left side and one for the right (I think you suggested something like this).

The second is now taken care of by the obscure --materialize-gci option in owltools, illustrated here: https://github.com/obophenotype/bio-attribute-ontology/blob/ecca8208294ecbb0bcfea5449a91bea975bc21ae/src/ontology/Makefile#L144

It uses a materialized-expression reasoner to find the direct 'R some Y' parents of each class and materializes them. We should be able to add this to robot easily (mexr is distributed on Maven).
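
As a rough picture of what materializing direct 'part of' parents amounts to for the development branch, here is a toy sketch that does not use a reasoner at all. It assumes an acyclic anatomy partonomy given as a plain dictionary and a lookup from structure to development term; the function names and the example data are made up for illustration and are not taken from Uberon or GO.

```python
from typing import Dict, Set, Tuple


def ancestors(node: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """All nodes reachable from `node` by following part_of edges upward."""
    seen: Set[str] = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def direct_part_of_parents(
    anatomy_part_of: Dict[str, Set[str]],
    dev_term: Dict[str, str],
) -> Set[Tuple[str, str]]:
    """Return (process, parent process) pairs, keeping only the nearest parents."""
    result: Set[Tuple[str, str]] = set()
    for structure, process in dev_term.items():
        # Every part_of ancestor that has its own development term is a candidate.
        candidates = {
            a for a in ancestors(structure, anatomy_part_of) if a in dev_term
        }
        # Drop a candidate if another candidate already sits below it.
        direct = {
            c
            for c in candidates
            if not any(
                c in ancestors(other, anatomy_part_of)
                for other in candidates
                if other != c
            )
        }
        result.update((process, dev_term[d]) for d in direct)
    return result


if __name__ == "__main__":
    # Toy data, assumed for the example only.
    anatomy = {
        "heart": {"cardiovascular system"},
        "cardiovascular system": {"circulatory system"},
    }
    terms = {s: f"{s} development" for s in
             ("heart", "cardiovascular system", "circulatory system")}
    for child, parent in sorted(direct_part_of_parents(anatomy, terms)):
        print(f"{child} part_of {parent}")
```

A real pipeline would get these entailments from the GCIs and a reasoner; the sketch only shows why the output follows the anatomy partonomy and stays non-redundant.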

We could run this as part of the per-commit inference pipeline and add the result to what OE imports, so OE users would see their part-ofs. Protege users would not see these, but then partonomy sucks in Protege anyway (the entailments would of course be there, just not obviously visible). Alternatively we could have something like the old pipeline that injects the inferences (tagged) back into the edit file, but I don't think we want to go back to that.

@dosumis
Contributor Author

dosumis commented Sep 13, 2016

Provisional plan

TBD

  • Pattern: nested class expressions on both sides vs a simple named class on the right, e.g.:
    • 'anatomical structure development' that 'results in development of' some ('part of' some 'circulatory system') SubClassOf part_of some 'circulatory system development'
    • VS
    • 'anatomical structure development' that 'results in development of' some ('part of' some 'circulatory system') SubClassOf part_of some 'anatomical structure development' that ('results in development of' some 'circulatory system')

The former has the advantage that the GCI will be visible under the relevant GO class ('circulatory system development') in Protege 5. This may be better for editing. But perhaps this doesn't matter - the latter can never be wrong and could be entirely generated from imported Uberon classes.

  • Details of how to do redundancy stripping in the partonomy for public release
    • This is already something that needs to be done. It is likely to become urgent once we move to asserting inferred partonomy (unless this can be done non-redundantly in the first place).
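
One possible reading of redundancy stripping, sketched below, is a transitive reduction over the part_of edges: an edge is dropped when its parent can still be reached by some other path. This is an assumption about what the release step would do, not a description of an existing tool, and it presumes the partonomy is acyclic. The function name `strip_redundant` and the labels in the example are illustrative.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (child, parent), read as "child part_of parent"


def strip_redundant(edges: Iterable[Edge]) -> Set[Edge]:
    """Remove edges whose parent is reachable through a longer part_of path."""
    edge_set = set(edges)
    parents: Dict[str, Set[str]] = {}
    for child, parent in edge_set:
        parents.setdefault(child, set()).add(parent)

    def reachable_without(start: str, skipped: Edge) -> Set[str]:
        # Walk upward from `start`, ignoring the single edge under test.
        seen: Set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            for nxt in parents.get(current, ()):
                if (current, nxt) == skipped or nxt in seen:
                    continue
                seen.add(nxt)
                stack.append(nxt)
        return seen

    return {
        (child, parent)
        for child, parent in edge_set
        if parent not in reachable_without(child, (child, parent))
    }


if __name__ == "__main__":
    asserted = {
        ("heart development", "cardiovascular system development"),
        ("cardiovascular system development", "circulatory system development"),
        ("heart development", "circulatory system development"),  # redundant
    }
    for child, parent in sorted(strip_redundant(asserted)):
        print(f"{child} part_of {parent}")
```

If the inferred partonomy is generated from direct parents only, this step may be unnecessary for those axioms, which matches the caveat in the bullet above.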

To implement:

  • Add materialized expression reasoning to robot.
  • Change editor's file pipeline to => imported partonomy file. Inferred partonomy should be recomputed on commit as for inferred classification. As long as the OE filter works for relationships as well as is_a assertions, this should be straightforward.
  • Update relevant TG patterns to add GCIs?*
  • For branches to which this is applied: remove existing partonomy. This needs to be done carefully, removing only the relevant axioms and documenting as table for review.

* May not be necessary if we use the second GCI pattern above, as we could simply generate relevant GCIs during import of Uberon terms, following some standard set of patterns.

@pgaudet
Contributor

pgaudet commented Oct 1, 2018

@ukemi @dougli1sqrd @cmungall what's the status on this?

@dosumis
Contributor Author

dosumis commented Oct 30, 2018

AFAIK all the tools are in place to do this.

  • DOSDP & DOSDP-tools support GCIs
  • OWLtools can do materialized expression reasoning (& I think Robot can too)
  • DOSDP-tools can now be used to generate logical axioms only - leaving maintenance of all annotations (textual definitions, synonyms, xrefs etc) in the ontology.
  • The ontology development kit now manages DOSDP-based ontology development - automatically building a file of generated axioms.
  • Given a design pattern that matches current patterns, DOSDP-tools can be used to generate a tsv with fillers.

TODO:

  1. Update the GO repo to follow the latest ODK => support for pattern-based dev.
  2. Write two sets of patterns for each branch that needs fixing - one that matches current usage, and one with additional GCIs.
  3. For each - use DOSDP tools with the first pattern to generate tsvs. Substitute the second pattern to generate with GCIs.
  4. Add a statement to the Makefile that generates a new file with materialised 'part of' relationships + an import statement on the editor's file to pull these axioms in.

Maybe worth having a brief meeting with @matentzn & myself if any questions about how to do this?

@cmungall
Member

cmungall commented Oct 30, 2018

Thanks for the summary David, v useful.

We can try putting this into practice in Geneva. Maybe we can call you and Nico then (will be nearly the same timezone).
