--single-overhang option #230

Open
raybueno opened this issue Oct 17, 2019 · 2 comments


raybueno commented Oct 17, 2019

Hello,

When I try to align reads to the end of a gene, kallisto is unable to pseudoalign them. However, when I use the --single-overhang option, pseudoalignment occurs.

My question is: why can't kallisto pseudoalign sequences that come from the end of a gene, and how does the --single-overhang option work?

@mschilli87

@raybueno:

Without a reproducible example, I'm not sure I understand your issue, but if I do, my guess is as follows: you have a fragment length distribution that makes it (numerically) impossible to ever see the 3' end of a transcript with a read coming from the 5' end of any fragment:

>##...#> transcript
--...- (3' most, shortest possible) fragment
>==...=> (5', longest possible) read from that fragment
xx...x 'unreachable' part of the transcript

>#######################################################################>
                                     ------------------------------------
                                     >=========>
                                                xxxxxxxxxxxxxxxxxxxxxxxxx
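
The geometry in the diagram can be checked numerically. The following is a minimal sketch, not kallisto code: the function name `unreachable_tail` and its parameters are hypothetical, and it assumes forward-strand reads taken from the 5' end of a fragment, with positions given as 0-based half-open intervals.

```python
def unreachable_tail(transcript_len, min_fragment_len, read_len):
    """Return the [start, end) interval of transcript positions that no
    5' read can cover, or None if every position is reachable.

    Hypothetical helper for illustration only; it mirrors the diagram above.
    """
    if min_fragment_len > transcript_len:
        # Even the shortest fragment does not fit on the transcript.
        return (0, transcript_len)
    # The 3'-most fragment ends exactly at the transcript end.
    frag_start = transcript_len - min_fragment_len
    # The read cannot be longer than the fragment it comes from.
    read_end = frag_start + min(read_len, min_fragment_len)
    if read_end >= transcript_len:
        return None
    return (read_end, transcript_len)


# Assumed numbers: 1500 bp transcript, shortest fragment 200 bp, 75 bp reads.
print(unreachable_tail(1500, 200, 75))
```

With these assumed values, the last 125 bases of the transcript fall in the 'unreachable' region marked with `x` in the diagram.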

Also,
Did you read the paper?
Did you read the documentation?
What parts of those relevant for your issue are not clear enough?
Maybe you could help to improve the documentation once you solve your problem?

@raybueno

raybueno commented Oct 18, 2019

@mschilli87:

Thank you for the reply, and I apologize for the lack of clarity. Here is an example:

Transcript:

GeneA1
ATCAGTCTCCGTGTGTGGATTTATGTCTACAGAGAGCATGGACGTTTTATGCTCACGTCACAACCTCCCGTTCTT

Read:
@ReadGeneA1:1:75_1
ATCAGTCTCCGTGTGTGGATTTATGTCTACAGAGAGCATGGACGTTTTATGCTCACGTCACAACCTCCCGTTCTT

In this example, we have a 75 bp transcript and a 75 bp read. As you can see, the read should align to the reference, since the two sequences are identical. However, when running kallisto quant without --single-overhang, the read does not pseudoalign. Interestingly, if you add the --single-overhang option, it does. This happens for reference transcripts of any length: for a 1500 bp transcript, a 75 bp read that starts at position 1425 and ends at position 1500 will not pseudoalign unless the --single-overhang option is used.
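
One possible reading of this behaviour, consistent with the guess earlier in the thread but not confirmed in it, is that in single-end mode a read is dropped when the fragment it is assumed to come from would extend past the end of the transcript, and that --single-overhang disables that check. The sketch below illustrates this assumption only. The names `fragment_fits` and `keeps_read` are hypothetical, the 200 bp fragment length is an assumed value (the thread does not state the -l setting used), and positions are 0-based on the forward strand.

```python
def fragment_fits(read_start, fragment_len, transcript_len):
    """True if a fragment starting at read_start lies fully inside the transcript."""
    return read_start + fragment_len <= transcript_len


def keeps_read(read_start, fragment_len, transcript_len, single_overhang=False):
    """Assumed filter: keep the read if overhang is allowed or the fragment fits."""
    return single_overhang or fragment_fits(read_start, fragment_len, transcript_len)


ASSUMED_FRAGMENT_LEN = 200  # hypothetical; not given in the thread

# 75 bp transcript, read starting at position 0.
print(keeps_read(0, ASSUMED_FRAGMENT_LEN, 75))
print(keeps_read(0, ASSUMED_FRAGMENT_LEN, 75, single_overhang=True))

# 1500 bp transcript, read starting at position 1425.
print(keeps_read(1425, ASSUMED_FRAGMENT_LEN, 1500))
```

Under this assumption, both example reads would be rejected by default and accepted once overhang is allowed, which matches the behaviour described above.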

I have read both the paper and the manual, but I couldn't find any reason why kallisto is unable to pseudoalign reads positioned at the end of a transcript. Thank you for your help.
