Output file interpretation #162
Hi,
I've successfully run the TOGA pipeline, but I'm still unclear on how to interpret the output files.
My objective is to identify missing or inactivated genes. I've noticed a file named loss_summ_data.tsv, in which entries are divided into eight categories. Should I focus primarily on the UL (uncertain loss) and L (clearly lost) categories and disregard the others? There is also an inact_mut_data.txt file for visualization. Could you explain the contents of these two files in more detail, and how I should use them to correctly identify gene loss events and inactivated genes?
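To make the question concrete, here is a minimal sketch of how the classifications could be tallied and the L/UL entries pulled out. It assumes loss_summ_data.tsv has three tab-separated columns (level, identifier, class); that layout, the level name `GENE`, and the sample identifiers are assumptions for illustration and should be checked against the actual file.

```python
import csv
from collections import Counter, defaultdict


def tally_loss_classes(lines):
    """Count class labels per level. Assumes rows of: level, identifier, class."""
    counts = defaultdict(Counter)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        level, label = row[0], row[2]
        counts[level][label] += 1
    return counts


def select_by_class(lines, level="GENE", wanted=("L", "UL")):
    """Return identifiers at the given level whose class is in `wanted`."""
    return [
        row[1]
        for row in csv.reader(lines, delimiter="\t")
        if len(row) >= 3 and row[0] == level and row[2] in wanted
    ]


# Hypothetical rows, not taken from a real run.
sample = [
    "GENE\tGENE_A\tL",
    "GENE\tGENE_B\tI",
    "GENE\tGENE_C\tUL",
    "TRANSCRIPT\tTX_A1\tL",
]

gene_counts = tally_loss_classes(sample)["GENE"]
candidates = select_by_class(sample)

# With a real file (path is an assumption):
# with open("loss_summ_data.tsv") as fh:
#     counts = tally_loss_classes(fh)
```

Whether filtering on L and UL alone is the right criterion is exactly what I'm asking; the sketch only shows the mechanics.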
For context, I aligned the pika genome to hg38 using lastal and converted the MAF output to a chain file (this was faster). After running TOGA, the loss_summ_data.tsv file had only 19 non-redundant results, and the inact_mut_data.txt file had 319 results. However, the assembly quality statistics I obtained afterwards look as shown in the attached figure. Is this reasonable, and what might be the cause?
toga_statsplot.pdf
I'm sorry for bothering you so many times. I really look forward to your reply; it is very important to me.
Best regards!