-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can one read be used to support multiple fusions #203
Comments
Your understanding is correct. While they are listed in the column All read-level filters (including the BTW, not only multimappers may lead to read identifiers being listed for multiple fusions. Discordant mates which are in the vicinity of breakpoints close to each other also often do. |
Thanks for your explanation, @suhrig! Is there any way to know those read identifiers that passed all filters? |
This is not possible, unfortunately. However, I don't think this is meaningful either. If a fusion made it to the main output file of Arriba, you should consider all reads listed in the |
I see, thanks! One more question, I saw there's another issue where Arriba just picks one of the fusions indicated by multi-mapping reads "KANSL1-ARL17A/KANSL1-ARL17B fusion detection #165". I wonder if the difference between the two cases is that in "KANSL1-ARL17A/KANSL1-ARL17B", the alignment corresponding to KANSL1-ARL17A fusion are all discarded, thus no reads support KANSL1-ARL17A (or reads supporting KANSL1-ARL17A are moved by other filters), and while in the case I mentioned above, there exist both reads reduced to alignment supporting ISCA1--SLC30A1 and CTD-2561J22.1--SLC30A1 |
Yes, that's most likely what happened in this case.
After assignment of the multimapping reads to one of these two fusions, both fusions still have a sufficient number of supporting reads for Arriba to consider them real. However, I think Arriba made a mistake here. Compare the supporting reads of the two fusions: ISCA1--SLC30A1 has many split reads mapping to both fusion partners (columns |
Thanks for your reply! After looking into the code, I think the problem might be that after merging adjacent fusions of CTD-2561J22.1--SLC30A1: only the fusion SLC30A1--CTD-2561J22.1--211749630--21564170 is kept, but its Is there any reason not to include reads into the list for other merged fusions? Should we include them so that the supporting read count would be more accurate? |
Hi @suhrig ,
I was running Arriba on this simulation dataset :
https://data.broadinstitute.org/Trinity/CTAT_FUSIONTRANS_BENCHMARKING/on_simulated_data/sim_101 (In the case I'm about to mention it's reads from
sim3
.) I used the default script of Arriba_v2.4.0 and found these two fusions in fusions.tsv:ISCA1 SLC30A1 -/- -/- 9:88886922 1:211749631 CDS/splice-site CDS/splice-site translocation 299 289 152 1732 4155 high in-frame . Iron-sulphur_cluster_biosynthesis(58%)|Cation_efflux_family(52%) . . ENSG00000135070.9 ENSG00000170385.9 ENST00000375991.4 ENST00000367001.4 upstream downstream duplicates(11),low_entropy(3),mismatches(209),multimappers(162)
CTD-2561J22.1 SLC30A1 -/- -/- 19:21564171 1:211749631 exon CDS/splice-site translocation 3 247 1 10 4155 high . . |Cation_efflux_family(52%) . . ENSG00000268731.1 ENSG00000170385.9 ENST00000597319.1 ENST00000367001.4 upstream downstream duplicates(6),mismatches(100),multimappers(253)
Between these two fusions, there are 166 identical read identifiers, for example:
ISCA1|ENSG00000135070.9--SLC30A1|ENSG00000170385.9:10030480_1_76430_4501_261
ISCA1|ENSG00000135070.9--SLC30A1|ENSG00000170385.9:1010524_1_76430_4597_166
ISCA1|ENSG00000135070.9--SLC30A1|ENSG00000170385.9:10424315_1_76430_4580_164
ISCA1|ENSG00000135070.9--SLC30A1|ENSG00000170385.9:10462442_1_76430_4592_195
ISCA1|ENSG00000135070.9--SLC30A1|ENSG00000170385.9:10479657_1_76430_4532_241
According to Arriba's multimappers filter (I assume these reads are multi-mapping reads), "multi-mapping reads are reduced to a single alignment", each read should be mapped to only one fusion. Thus I wonder if my understanding for this filter is correct or if there's something wrong with the output.
Thanks for your time!
The text was updated successfully, but these errors were encountered: