kallisto bus parsing barcode length to be 0 #250

Open
caleblareau opened this issue Feb 13, 2020 · 0 comments

Hi kallisto team,

I'm trying to quantify gRNAs for a scRNA-seq/Perturb-seq experiment. The barcode, UMI, and gRNA are all on a single-end read.

I've created an index using kite and then tried to run kallisto | bustools, but I'm getting an error in which the parsed barcode length appears to be 0.

Running

kallisto bus -i gRNA_kallisto_index test.fastq.gz -o bus_out -x 0,0,16:0,16,26:0,38,60

seems fine (stdout below):

[index] k-mer length: 21
[index] number of targets: 9,984
[index] number of k-mers: 9,951
[index] number of equivalence classes: 9,997
[quant] will process sample 1: test.fastq.gz
[quant] finding pseudoalignments for the reads ... done
[quant] processed 25,000 reads, 2,630 reads pseudoaligned

However, when I run the bustools correct command:

bustools correct -w data/737K-august-2016.txt -o corrected_bus_out bus_out

I get the following error:

Found 737280 barcodes in the whitelist
Error: barcode length and whitelist length differ, barcodes = 0, whitelist = 16
       check that your whitelist matches the technology used

My understanding is that the -x parameter should have correctly specified a 16 bp barcode. Skipping the bustools correct step leads to a similar error downstream (in bustools sort).
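To make the intended layout explicit, here is a minimal sketch of how the -x string above would slice the first read of the fastq file in this issue. It assumes each field is a `file,start,stop` triplet with a 0-based start and an exclusive stop, and that a stop of 0 means "to the end of the read". The helper names `parse_x` and `slice_read` are hypothetical and are not part of kallisto.

```python
def parse_x(spec):
    """Split 'bc:umi:seq' into named (file, start, stop) triplets."""
    names = ("barcode", "umi", "seq")
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError("expected three colon-separated triplets")
    return {
        name: tuple(int(v) for v in part.split(","))
        for name, part in zip(names, parts)
    }


def slice_read(read, spec):
    """Cut a single read into barcode, UMI, and sequence substrings."""
    fields = {}
    for name, (_file_idx, start, stop) in parse_x(spec).items():
        # Assumption: 0-based start, exclusive stop, stop == 0 means end of read.
        # The file index is ignored because there is only one input file here.
        fields[name] = read[start:stop if stop else None]
    return fields


# First read from test.fastq.gz as shown in this issue.
read = ("GCTCCNAAGTAGATGTCCCAATTCAATTTCTTATATNGGG"
        "TCCGTTATAACTTGAAAAANTGGCACCGGTCGGTAG")
fields = slice_read(read, "0,0,16:0,16,26:0,38,60")
for name, value in fields.items():
    print(name, len(value), value)
```

Under those assumptions the barcode slice is 16 bases long, which is the length the whitelist expects.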

My fastq file for reference:

zcat < test.fastq.gz | head
@NS500466:533:H5HNJBGXF:1:11101:20629:1041 1:N:0:TGGTAACG
GCTCCNAAGTAGATGTCCCAATTCAATTTCTTATATNGGGTCCGTTATAACTTGAAAAANTGGCACCGGTCGGTAG
+
AAAAA#EEEEEEEEEEEEEEEEEAEEEEEEEEEEEE#EEEEEAEEEEEEEEEEEEEEEE#EEA<AEEEEEEEA6AA
@NS500466:533:H5HNJBGXF:1:11101:1592:1041 1:N:0:TGGTAACG
CGCTGNACAGTAACGGACGCCACGGGTTTCTTATATNGGGTTATCAACTTGAAAAAGTGNCACCGAGTCGGTAGAT
+
AAAAA#EEEEEEEE/EEEEEAEEEEAEAEEEEAEEE#EAEEEEEEEEEAAAA//AEEEE#EEEEEE/EE/EE<E</
@NS500466:533:H5HNJBGXF:1:11101:15220:1042 1:N:0:TGGTAACG
AACTCNCTCACTCTTACTCGGTCGTCTTTCTTATATGGGTCAACTTGAAAAAGTGGCACNGAGTCGGTAGATCGGA

Any clues as to what's happening? I've also tried splitting things into multiple files (and updating -x), but I get a similar "barcodes = 0" error.
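One way to narrow this down would be to check which barcode length is actually recorded in the BUS file header. The sketch below assumes the binary header layout from the BUS format description (a 4-byte magic `BUS\0` followed by little-endian 32-bit integers for version, barcode length, UMI length, and text-header length). The function name `read_bus_header` is hypothetical.

```python
import struct


def read_bus_header(path):
    """Return the fixed-size fields at the start of a BUS file."""
    with open(path, "rb") as fh:
        raw = fh.read(20)  # 4-byte magic + four uint32 values
    if len(raw) < 20:
        raise ValueError("file too short to contain a BUS header")
    # Assumed layout: magic, version, barcode length, UMI length, text length.
    magic, version, bclen, umilen, tlen = struct.unpack("<4sIIII", raw)
    if magic != b"BUS\x00":
        raise ValueError("not a BUS file: unexpected magic bytes")
    return {"version": version, "bclen": bclen, "umilen": umilen, "tlen": tlen}
```

Calling `read_bus_header("bus_out/output.bus")` (assuming kallisto bus wrote its output to a file named output.bus inside the -o directory) should report the barcode length that bustools sees.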

I'm using kallisto 0.46.1 and bustools 0.40.0.

Any help would be greatly appreciated. Thank you!
