-
Notifications
You must be signed in to change notification settings - Fork 211
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Adding relationships in the gene ontology notebook.
- Loading branch information
Showing
1 changed file
with
356 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,356 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Relationships in the GO\n", | ||
"_Alex Warwick Vesztrocy, March 2016_" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"For some analyses, it is possible to only use the <code>is_a</code> definitions given in the Gene Ontology. \n", | ||
"\n", | ||
"However, it is important to remember that this isn't always the case. As such, <code>GOATOOLS</code> includes the option to load the relationship definitions also." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Loading GO graph with the relationship tags\n", | ||
"This is possible by using the <code>optional_attrs</code> argument, upon instantiating a <code>GODag</code>. This can be done with either the full or the basic version of the ontology. Here, the __full__ version shall be used. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"load obo file go.obo\n" | ||
] | ||
}, | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"go.obo: format-version(1.2) data-version(releases/2016-03-26)\n" | ||
] | ||
}, | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"46317 nodes imported\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from goatools.obo_parser import GODag\n", | ||
"import wget\n", | ||
"\n", | ||
"go_fn = wget.download('https://geneontology.org/ontology/go.obo')\n", | ||
"go = GODag(go_fn, optional_attrs=['relationship'])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Viewing relationships in the GO graph" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"So now, when looking at an individual term (which has relationships defined in the GO) these are listed in a nested manner. As an example, look at <code>GO:1901990</code> which has a single <code>has_part</code> relationship, as well as a <code>regulates</code> one." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"eg_term = go['GO:1901990']" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"GOTerm('GO:1901990'):\n", | ||
" name:regulation of mitotic cell cycle phase transition\n", | ||
" children: 7 items\n", | ||
" GO:0060564\tlevel-07\tdepth-13\tnegative regulation of APC-Cdc20 complex activity [biological_process] \n", | ||
" GO:2000045\tlevel-07\tdepth-08\tregulation of G1/S transition of mitotic cell cycle [biological_process] \n", | ||
" GO:1901992\tlevel-07\tdepth-08\tpositive regulation of mitotic cell cycle phase transition [biological_process] \n", | ||
" GO:1901991\tlevel-07\tdepth-08\tnegative regulation of mitotic cell cycle phase transition [biological_process] \n", | ||
" GO:0010389\tlevel-07\tdepth-08\tregulation of G2/M transition of mitotic cell cycle [biological_process] \n", | ||
" GO:0030071\tlevel-07\tdepth-10\tregulation of mitotic metaphase/anaphase transition [biological_process] \n", | ||
" GO:0007096\tlevel-07\tdepth-08\tregulation of exit from mitosis [biological_process] \n", | ||
" parents: 2 items\n", | ||
" GO:0007346\tlevel-05\tdepth-05\tregulation of mitotic cell cycle [biological_process] \n", | ||
" GO:1901987\tlevel-06\tdepth-06\tregulation of cell cycle phase transition [biological_process] \n", | ||
" level:6\n", | ||
" depth:7\n", | ||
" alt_ids: 0 items\n", | ||
" _parents: 2 items\n", | ||
" GO:0007346\n", | ||
" GO:1901987\n", | ||
" relationship: 2 items\n", | ||
" has_part: 1 items\n", | ||
" GO:0051437\tlevel-07\tdepth-12\tpositive regulation of ubiquitin-protein ligase activity involved in regulation of mitotic cell cycle transition [biological_process] \n", | ||
" regulates: 1 items\n", | ||
" GO:0044772\tlevel-05\tdepth-05\tmitotic cell cycle phase transition [biological_process] \n", | ||
" id:GO:1901990\n", | ||
" namespace:biological_process\n", | ||
" is_obsolete:False" | ||
] | ||
}, | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"eg_term" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"These different relationship types are stored as a dictionary within the relationship attribute on a GO term." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"dict_keys(['has_part', 'regulates'])\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"print(eg_term.relationship.keys())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"{GOTerm('GO:0044772'):\n", | ||
" name:mitotic cell cycle phase transition\n", | ||
" children: 4 items\n", | ||
" GO:0000086\tlevel-06\tdepth-06\tG2/M transition of mitotic cell cycle [biological_process] \n", | ||
" GO:0007091\tlevel-06\tdepth-10\tmetaphase/anaphase transition of mitotic cell cycle [biological_process] \n", | ||
" GO:0010458\tlevel-06\tdepth-06\texit from mitosis [biological_process] \n", | ||
" GO:0000082\tlevel-06\tdepth-06\tG1/S transition of mitotic cell cycle [biological_process] \n", | ||
" parents: 2 items\n", | ||
" GO:0044770\tlevel-04\tdepth-04\tcell cycle phase transition [biological_process] \n", | ||
" GO:1903047\tlevel-04\tdepth-04\tmitotic cell cycle process [biological_process] \n", | ||
" level:5\n", | ||
" depth:5\n", | ||
" alt_ids: 0 items\n", | ||
" _parents: 2 items\n", | ||
" GO:0044770\n", | ||
" GO:1903047\n", | ||
" relationship: 1 items\n", | ||
" part_of: 1 items\n", | ||
" GO:0000278\tlevel-04\tdepth-04\tmitotic cell cycle [biological_process] \n", | ||
" id:GO:0044772\n", | ||
" namespace:biological_process\n", | ||
" is_obsolete:False}\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"print(eg_term.relationship['regulates'])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Example use case\n", | ||
"One example use case for the relationship terms, would be to look for all functions which regulate pseudohyphal growth (<code>GO:0007124</code>). That is:\n", | ||
"\n", | ||
"> A pattern of cell growth that occurs in conditions of nitrogen limitation and abundant fermentable carbon source. Cells become elongated, switch to a unipolar budding pattern, remain physically attached to each other, and invade the growth substrate. \n", | ||
">\n", | ||
"> _Source: https://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0007124#term=info&info=1_" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"term_of_interest = go['GO:0007124']" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"First, find the relationship types which contain \"regulates\":" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"frozenset({'positively_regulates', 'negatively_regulates', 'regulates'})\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"regulates = frozenset([typedef \n", | ||
" for typedef in go.typedefs.keys() \n", | ||
" if 'regulates' in typedef])\n", | ||
"print(regulates)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now, search through the terms in the tree for those with a relationship in this list and add them to a dictionary dependent on the type of regulation." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from collections import defaultdict\n", | ||
"\n", | ||
"regulating_terms = defaultdict(list)\n", | ||
"\n", | ||
"for t in go.values():\n", | ||
" if hasattr(t, 'relationship'):\n", | ||
" for typedef in regulates.intersection(t.relationship.keys()):\n", | ||
" if term_of_interest in t.relationship[typedef]:\n", | ||
" regulating_terms['{:s}d_by'.format(typedef[:-1])].append(t)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now <code>regulating_terms</code> contains the GO terms which relate to regulating protein localisation to the nucleolus." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"pseudohyphal growth (GO:0007124) is:\n", | ||
"\n", | ||
" - negatively_regulated_by:\n", | ||
" -- GO:1900462 negative regulation of pseudohyphal growth by negative regulation of transcription from RNA polymerase II promoter\n", | ||
" -- GO:2000221 negative regulation of pseudohyphal growth\n", | ||
" -- GO:0100042 negative regulation of pseudohyphal growth by transcription from RNA polymerase II promoter\n", | ||
"\n", | ||
" - positively_regulated_by:\n", | ||
" -- GO:2000222 positive regulation of pseudohyphal growth\n", | ||
" -- GO:0100041 positive regulation of pseudohyphal growth by transcription from RNA polymerase II promoter\n", | ||
" -- GO:1900461 positive regulation of pseudohyphal growth by positive regulation of transcription from RNA polymerase II promoter\n", | ||
"\n", | ||
" - regulated_by:\n", | ||
" -- GO:2000220 regulation of pseudohyphal growth\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"print('{:s} ({:s}) is:'.format(term_of_interest.name, term_of_interest.id))\n", | ||
"for (k, v) in regulating_terms.items():\n", | ||
" print('\\n - {:s}:'.format(k))\n", | ||
" for t in v:\n", | ||
" print(' -- {:s} {:s}'.format(t.id, t.name))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |