Bioinformatics-related ontologies, intended especially for generating RDF content using BioInterchange.
An ontology for describing genomic features and variants, in particular the contents of GFF3, GTF, GVF and VCF files.
These build instructions are intended for collaborators and enthusiasts who would like to contribute to the BioInterchange ontologies. The ontologies can be edited using Protege, but a few post-processing steps are necessary to remove additional information that Protege inserts on its own.
Saving an ontology with Protege introduces explicit class definitions and individuals for external URIs. These have to be removed so that the ontologies describe only BioInterchange URIs. A script is provided that takes care of this and also increments the patch-level version number of the ontologies.
For example, the following commands can be used to create a new cleaned version of the GFVO ontology:
./scripts/cleanse.rb < gfvo.xml > gfvo.tmp
mv gfvo.tmp gfvo.xml
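The two commands above overwrite gfvo.xml even if the cleaning step fails partway. A more defensive variant could look like the following minimal sketch. The helper name `filter_in_place` is hypothetical and not part of the repository; it only assumes that the filter reads from standard input and writes to standard output, as the cleaning script does in the commands above.

```shell
# Hypothetical helper (not part of the repository): run a filter command over
# a file and replace the file only if the filter exits successfully and
# produces non-empty output. Otherwise the original file is left untouched.
filter_in_place() {
  file=$1
  shift
  tmp="$file.tmp"
  if "$@" < "$file" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage, assuming the repository layout used in the commands above:
#   filter_in_place gfvo.xml ./scripts/cleanse.rb
```

With this helper, a failed run leaves the previous gfvo.xml in place and removes the temporary file.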
Due to technical limitations of BioPortal, the version of GFVO in BioPortal cannot import other ontologies or contain SIO class or property equivalences. If the imports and equivalences are kept, then BioPortal's summary statistics and class browser show thousands of classes that are not part of GFVO itself.
Removal of OWL imports and class- and property-equivalences:
grep -v '<owl:imports ' gfvo.xml | grep -v '<owl:equivalentProperty ' | grep -v '<owl:equivalentClass ' > gfvo_bioportal.xml
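The three filters can also be expressed as a single grep invocation with several `-e` patterns. This is only a sketch of an equivalent form; the function name `strip_for_bioportal` is hypothetical and not part of the repository. Like the pipeline above, it works line by line, so it assumes that each import or equivalence element sits on a line of its own.

```shell
# Hypothetical helper (not part of the repository): drop every line that
# contains an OWL import, an equivalent-property or an equivalent-class
# element, using one grep call instead of three.
strip_for_bioportal() {
  grep -v \
    -e '<owl:imports ' \
    -e '<owl:equivalentProperty ' \
    -e '<owl:equivalentClass '
}

# Usage, assuming gfvo.xml is in the current directory:
#   strip_for_bioportal < gfvo.xml > gfvo_bioportal.xml
#
# Quick check that nothing slipped through (prints 0 when the file is clean):
#   grep -c -e '<owl:imports ' -e '<owl:equivalent' gfvo_bioportal.xml
```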
Summary statistics about classes and properties can be output in human-readable form and as HTML via:
./scripts/stats.rb < gfvo.xml
A regular expression of valid URIs as defined in the Gene Ontology Abbreviation Collection can be automatically generated using the following command:
./scripts/go_xref2xsd_pattern.rb
On Mac OS X, the generated regular expression can be copied to the clipboard for subsequent pasting using the pbcopy command:
./scripts/go_xref2xsd_pattern.rb | pbcopy
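On systems without pbcopy, a similar effect can be had with other clipboard tools. The sketch below is an assumption rather than part of the repository: the helper name `copy_to_clipboard` is hypothetical, and the fallback relies on the separate `xclip` utility being installed on an X11-based system.

```shell
# Hypothetical helper (not part of the repository): copy standard input to
# the clipboard with whichever of the supported tools is available.
copy_to_clipboard() {
  if command -v pbcopy >/dev/null 2>&1; then
    pbcopy                       # Mac OS X
  elif command -v xclip >/dev/null 2>&1; then
    xclip -selection clipboard   # X11-based systems, if xclip is installed
  else
    echo "no clipboard tool found (tried pbcopy, xclip)" >&2
    return 1
  fi
}

# Usage:
#   ./scripts/go_xref2xsd_pattern.rb | copy_to_clipboard
```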
The following ontologies were prototypes that eventually merged into the Genomic Feature and Variation Ontology (GFVO).
An ontology for describing GFF3 file contents.
An ontology for describing GVF file contents.