questions regarding building index including introns #264

Open
jingzhangorg opened this issue May 13, 2020 · 0 comments
Hello,

I am new to kallisto and am learning to use it to analyze single-nucleus RNA-seq data, in which I found a lot of reads from introns.

I followed the instructions at https://www.kallistobus.tools/velocity_index_tutorial.html
and compared my result with the prebuilt index downloaded from https://www.kallistobus.tools/velocity_tutorial.html

The downloaded cDNA_introns.t2g.txt file looks correct: each intron has a unique ID that maps to the right gene. For example:
ENST00000463325.1056070-I ENSG00000184319 RPL23AP82

The index I built myself looks strange. The introns_tr2g.txt file does not show an intron ID in the first column, only the transcript ID:
ENST00000456328.2 ENSG00000223972.5 DDX11L1

The code in the tutorial also seems to output the transcript ID rather than the intron ID:
"awk 'NR==FNR{a[$1]=$2; b[$1]=$3;next} {$2=a[$1];$3=b[$1]} 1' tr2g.txt introns_transcripts.txt > introns_t2g.txt"
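
To show what I mean, here is a minimal Python sketch of what I understand that awk command to do. The function name join_t2g and the sample rows are only for illustration (the transcript row is the one from my file, and the "-I" suffixed ID is a made-up example). It assumes tr2g.txt has three whitespace-separated columns and introns_transcripts.txt has one ID per line.

```python
def join_t2g(tr2g_lines, intron_lines):
    """Mirror: awk 'NR==FNR{a[$1]=$2; b[$1]=$3;next} {$2=a[$1];$3=b[$1]} 1'"""
    gene_id = {}    # plays the role of awk array a
    gene_name = {}  # plays the role of awk array b

    # First file (NR==FNR): remember gene ID and gene name per transcript ID.
    for line in tr2g_lines:
        fields = line.split()
        gene_id[fields[0]] = fields[1]
        gene_name[fields[0]] = fields[2]

    # Second file: column 1 is copied through unchanged; columns 2 and 3
    # are looked up by column 1 and are empty when the key is not found.
    out = []
    for line in intron_lines:
        key = line.split()[0]
        out.append(f"{key} {gene_id.get(key, '')} {gene_name.get(key, '')}")
    return out


tr2g = ["ENST00000456328.2 ENSG00000223972.5 DDX11L1"]

# A plain transcript ID in the second file gives a plain transcript ID back.
plain = join_t2g(tr2g, ["ENST00000456328.2"])

# A suffixed ID (hypothetical) is kept in column 1, but the lookup finds no gene.
suffixed = join_t2g(tr2g, ["ENST00000456328.2-I"])

print(plain[0])
print(repr(suffixed[0]))
```

So, as far as I can tell, the first column of introns_t2g.txt can only contain whatever IDs are already in introns_transcripts.txt, and the command by itself never adds an intron suffix.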

Could you please advise on this? Thank you very much.
