Comparison of GO Terms between list #162

Closed
BioinformaNicks opened this issue Apr 20, 2020 · 7 comments

@BioinformaNicks

Dear authors,

I would like to ask for an example of using compare_gos.py to compare occurrences of GO IDs between two different sets/lists.

I have a table like this for example:

Protein \t GO ID \t Function
X \t GO:0005737 \t cytoplasm

And a similar one for another condition. How do I check whether certain GO IDs are enriched in one condition compared to the other?

@dvklopfenstein
Collaborator

dvklopfenstein commented May 6, 2020

Thank you for taking the time to contact us and for the great question. Other users will surely benefit from the answer. I will add a link to this issue in the README.md so they can see how to run the script.

@dvklopfenstein
Collaborator

dvklopfenstein commented May 6, 2020

Here is an example that you can run if you clone the goatools repo:

$ scripts/compare_gos.py data/compare_gos/tat_gos_simple1.tsv data/compare_gos/tat_gos_simple2.tsv

XX GO:0008150 BP 29210 D00  biological_process
XX GO:0065007 BP 12884 D01  biological regulation
XX GO:0050789 BP 11559 D02  regulation of biological process
XX GO:0050794 BP  8746 D03  regulation of cellular process
X. GO:0048519 BP  3641 D03  negative regulation of biological process
.X GO:0048518 BP  3519 D03  positive regulation of biological process
X. GO:0048523 BP  2662 D04  negative regulation of cellular process

# Marker keys:
#     X -> GO is present in tat_gos_simple1
#     X -> GO is present in tat_gos_simple2

@dvklopfenstein
Collaborator

Your file format should work just fine. Here is a sample. GO IDs in i162a.tsv and i162b.tsv will be compared:

Contents of i162a.tsv

Protein \t GO ID \t Function
X \t GO:0005737 \t cytoplasm
X \t GO:0048523 \t cytoplasm

Contents of i162b.tsv

Protein \t GO ID \t Function
X \t GO:0005737 \t cytoplasm
X \t GO:0048518 \t cytoplasm

Run:

$ scripts/compare_gos.py i162a.tsv i162b.tsv
.X GO:0048518 BP  3519 D03  positive regulation of biological process
X. GO:0048523 BP  2662 D04  negative regulation of cellular process
XX GO:0005737 CC  1200 D02  cytoplasm

# Marker keys:
#     X -> GO is present in i162a
#     X -> GO is present in i162b
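
The marker column in the output above appears to hold one character per input file, in the order the files were given: `X` when the GO ID is present in that file and `.` when it is absent. The following is a minimal sketch of that reading using plain Python sets. It is not the compare_gos implementation; the `marker` helper is a hypothetical name introduced only for illustration, and the GO IDs are the ones from the i162a.tsv and i162b.tsv samples.

```python
# GO IDs taken from the i162a.tsv / i162b.tsv samples in this thread.
gos_a = {"GO:0005737", "GO:0048523"}
gos_b = {"GO:0005737", "GO:0048518"}


def marker(go_id, go_sets):
    """Hypothetical helper: one character per set, 'X' if present, '.' if absent."""
    return "".join("X" if go_id in gos else "." for gos in go_sets)


# Build a marker string for every GO ID seen in either file.
markers = {go_id: marker(go_id, [gos_a, gos_b]) for go_id in sorted(gos_a | gos_b)}

for go_id, mark in markers.items():
    print(mark, go_id)
```

Under this reading, `XX` marks GO IDs shared by both files, `X.` marks those found only in the first file, and `.X` marks those found only in the second.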

@dvklopfenstein
Collaborator

The compare_gos script picks up all GO IDs on a line using regex.

Lines beginning with # are considered comments and are ignored, even if they contain GO IDs.
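
To illustrate the two rules just described, here is a rough sketch of how GO IDs could be collected from a file's lines. It is an approximation, not the code in compare_gos.py: the pattern `GO:\d{7}` and the function name `extract_go_ids` are assumptions made for this example, and the script's actual regex may differ.

```python
import re

# Assumed pattern: "GO:" followed by seven digits. The real script may use another regex.
GO_PATTERN = re.compile(r"GO:\d{7}")


def extract_go_ids(lines):
    """Collect every GO ID found on non-comment lines (hypothetical helper)."""
    go_ids = set()
    for line in lines:
        if line.startswith("#"):
            # Comment lines are skipped, even if they contain GO IDs.
            continue
        go_ids.update(GO_PATTERN.findall(line))
    return go_ids


sample = [
    "# GO:0008150 is mentioned here, but this line is a comment",
    "Protein\tGO ID\tFunction",
    "X\tGO:0005737\tcytoplasm",
    "X\tGO:0048523\tcytoplasm",
]
found = extract_go_ids(sample)
print(sorted(found))
```

Because the IDs are matched anywhere on the line, the column layout of the input file does not matter, which is why a three-column table like the one in the question works without changes.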

dvklopfenstein added a commit that referenced this issue May 6, 2020
@Habush

Habush commented Apr 19, 2021

Thanks for creating this script and the library; they are really helpful. Is there a way to save the output of the comparison in a format that another program can import, such as TSV or CSV?

EDIT: I found the solution by reading through the script. I can use the --xlsx option to write the output to an Excel file.

@dvklopfenstein
Collaborator

Thank you so much for your interest in GOATOOLS and for taking the time to write to us. I would like to add a TSV format too. I am putting this on our TODO list.

Please let us know any more information that might be relevant to this issue or your thoughts about any specific features or implementations.
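
Until a TSV option exists, one possible workaround is to convert the plain-text report into TSV after the fact. The sketch below assumes the report lines look exactly like the ones shown earlier in this thread (markers, GO ID, namespace, a count, a `D`-prefixed value, then the term name). The column names, the `ROW` pattern, and the `report_to_tsv` function are all hypothetical; the thread does not label the numeric columns, so "count" and "depth" are guesses at their meaning.

```python
import csv
import io
import re

# Assumed row layout, based only on the example output in this thread:
# markers, GO ID, namespace, count, D-prefixed value, term name.
ROW = re.compile(r"^(\S+)\s+(GO:\d{7})\s+(\S+)\s+(\d+)\s+(D\d+)\s+(.+)$")


def report_to_tsv(report_text):
    """Convert compare_gos-style text output to a TSV string (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Column names are illustrative guesses, not names used by the script.
    writer.writerow(["markers", "go_id", "namespace", "count", "depth", "name"])
    for line in report_text.splitlines():
        match = ROW.match(line.strip())
        if match:  # blank lines and the "# Marker keys" block do not match
            writer.writerow(match.groups())
    return out.getvalue()


report = """\
.X GO:0048518 BP  3519 D03  positive regulation of biological process
X. GO:0048523 BP  2662 D04  negative regulation of cellular process
XX GO:0005737 CC  1200 D02  cytoplasm

# Marker keys:
#     X -> GO is present in i162a
#     X -> GO is present in i162b
"""
tsv = report_to_tsv(report)
print(tsv, end="")
```

The report text could come from redirecting the script's standard output to a file and reading it back. Lines that do not match the assumed layout are silently dropped, so the pattern would need adjusting if the real output contains other kinds of rows.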

@tanghaibao tanghaibao reopened this May 23, 2021
@tanghaibao
Owner

The issue appears to be resolved by the --xlsx option, as reported.
