Check valid taxa when parsing data (GAF column 13; GPI column 7) #2061

Open
pgaudet opened this issue Sep 4, 2023 · 5 comments

@pgaudet
Contributor

pgaudet commented Sep 4, 2023

Taxon=0 has been observed: geneontology/pipeline#239 (comment)

This should not be allowed. Ideally, the set of allowed taxa would be derived from the taxa that are actually loaded, if that is doable.
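
For illustration, a minimal sketch of the kind of check described above. It is not ontobio's API: `LOADED_TAXA`, `TAXON_RE`, and `is_allowed_taxon` are hypothetical names, and the example set of taxa is a stand-in for whatever taxa the pipeline has actually loaded. It assumes the column value is a single CURIE with a `taxon:` or `NCBITaxon:` prefix.

```python
import re

# Hypothetical stand-in: in practice this set would be built from the loaded taxa.
LOADED_TAXA = {"9606", "10090", "559292"}

# Assumed value shape: "taxon:<id>" or "NCBITaxon:<id>", where <id> is a
# positive integer without leading zeros. This rejects "taxon:0" outright.
TAXON_RE = re.compile(r"^(?:taxon|NCBITaxon):([1-9][0-9]*)$")


def is_allowed_taxon(value, allowed=LOADED_TAXA):
    """Return True only if value is well-formed and its ID is in the allowed set."""
    match = TAXON_RE.match(value.strip())
    return match is not None and match.group(1) in allowed
```

With the example set, `is_allowed_taxon("taxon:9606")` is true, while `taxon:0` and any well-formed but unloaded taxon are rejected.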

@kltm
Member

kltm commented Sep 5, 2023

Proposal: if a taxon cannot be extracted for any reason, the annotation should be removed as a GORULE:0000001 violation. Note that, since the most likely cause was probably a GPI/GPAD mismatch, the error could even occur while emitting lines.
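
A rough sketch of what this proposal could look like when applied to GAF lines. This is an illustration under stated assumptions, not the ontobio implementation: `extract_taxa` and `filter_gaf_lines` are hypothetical helpers, GAF column 13 is assumed to hold one or more `taxon:<id>` values separated by `|`, and a zero ID is treated as unextractable.

```python
import re

# Assumed shape of each entry in GAF column 13.
TAXON_PART = re.compile(r"taxon:([0-9]+)")


def extract_taxa(column):
    """Return the numeric taxon IDs in column 13, or None if any entry is unusable."""
    ids = []
    for part in column.split("|"):
        match = TAXON_PART.fullmatch(part.strip())
        # Malformed entries and taxon 0 both count as "cannot be extracted".
        if match is None or int(match.group(1)) == 0:
            return None
        ids.append(match.group(1))
    return ids


def filter_gaf_lines(lines):
    """Split lines into kept lines and (rule ID, line) violations."""
    kept, violations = [], []
    for line in lines:
        if line.startswith("!"):
            # Header and comment lines pass through untouched.
            kept.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        taxon_col = cols[12] if len(cols) > 12 else ""  # column 13, 0-indexed
        if extract_taxa(taxon_col) is None:
            violations.append(("GORULE:0000001", line))
        else:
            kept.append(line)
    return kept, violations
```

The same check could in principle be run at emit time as well, which would cover the GPI/GPAD mismatch case mentioned above.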

kltm removed this from TODO in GORULES (low-hanging fruit) Sep 12, 2023
kltm changed the title from "New gorule to check valid taxa (GAF column 13; GPI column 7; not in GPAD)" to "New gorule to check valid taxa (GAF column 13; GPI column 7)" Sep 12, 2023

@kltm
Member

kltm commented Sep 12, 2023

Note that this cleans up data that is already bad. The preventative aspect is tracked here: #2066

@kltm
Member

kltm commented Sep 12, 2023

Note that we will initially try to land this after the 3.10 requirement in ontobio, which we are aiming to have in place by the end of the week. If we cannot make that, we'll try some kind of "backport" or other hack to get this done sooner rather than later.

@kltm
Member

kltm commented Sep 14, 2023

Now testing on master.

@kltm
Member

kltm commented Sep 22, 2023

@pgaudet Issues should only be in a single project. As this was already in QC and now clearing, I'm pulling this out of GORULES.

@mugitty / @dustine32 Can you confirm that this is correct now on snapshot? If so, we can close out.

kltm moved this from Working to Clearing in Ongoing data QC and pipeline maintenance Sep 22, 2023
kltm removed this from Clearing in GORULES (low-hanging fruit) Sep 22, 2023