Kallisto occasionally freezing at BAM indexing #177

Open
ulah opened this issue Jul 19, 2018 · 1 comment

Comments


ulah commented Jul 19, 2018

Hi there,
I'm currently evaluating whether we could use kallisto/pizzly for fusion gene prediction. However, for some samples kallisto freezes at the BAM indexing step (I waited for more than 12 hours). Unfortunately, this behavior is not reproducible: rerunning the same command with the same available resources may finish without problems. Any ideas why this happens?

If it helps, here is my command line:

kallistoIdx="/.../Ensembl_GRCh38_v86/kallistoIdx/v0.44.0/GRCh38_cDNA_all_k29"
genomeGtf="/.../Ensembl_GRCh38_v86/Homo_sapiens.GRCh38.86.gtf"
genomeSizes="/.../Ensembl_GRCh38_v86/Homo_sapiens.GRCh38.dna.primary_assembly.chrSizes.txt"

kallisto quant --threads 12 --genomebam --gtf "$genomeGtf" --chromosomes "$genomeSizes" --index "$kallistoIdx" --fusion --output-dir "$outDirK" "$forReads" "$revReads"


And here is the output from stdout:

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 29
[index] number of targets: 178,136
[index] number of k-mers: 105,160,906
[index] number of equivalence classes: 739,685
Warning: 34964 transcripts were defined in GTF file, but not in the index
[quant] running in paired-end mode
[quant] will process pair 1: /xxx_R1.fastq.gz
                             /xxx_R2.fastq.gz
[quant] finding pseudoalignments for the reads ... done
[quant] processed 124,190,815 reads, 108,875,156 reads pseudoaligned
[quant] estimated average fragment length: 175.248
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 1,535 rounds
[  bam] writing pseudoalignments to BAM format .. done
[  bam] sorting BAM files .. done
[  bam] indexing BAM file .. 
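
Since a rerun of the same command may finish normally, one stopgap is to run the command under a time limit and retry when it hangs. The following is a minimal sketch only and does not address the cause of the freeze; the helper name `run_with_timeout`, the time limit, and the number of attempts are illustrative assumptions, not part of kallisto.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, max_attempts=3):
    """Run cmd (a list of arguments), retrying if it hangs or fails.

    Returns the 1-based number of the attempt that succeeded,
    or None if no attempt finished successfully.
    Hypothetical helper for illustration only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process when the timeout expires
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: no result after {timeout_s}s, retrying",
                  file=sys.stderr)
            continue
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt}: exit code {result.returncode}",
              file=sys.stderr)
    return None
```

To use it, the `kallisto quant` call shown above would be passed as a list of arguments, with a time limit chosen well above the normal runtime of a successful run.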



pabloiturralde commented Feb 10, 2019

Hi, I get this exact same warning:

Warning: 34964 transcripts were defined in GTF file, but not in the index

When I run:
kallisto quant -i ~/kallisto_index/bdgp6.93_kallisto_index.fa -o /volumes/piturral/fastq/learning/kallisto_output/C02plusO -b 100 --genomebam --gtf ~/gtf/Drosophila_melanogaster.BDGP6.93.gtf /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L001_R1_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L001_R2_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L002_R1_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L002_R2_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L003_R1_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L003_R2_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L004_R1_001.fastq.gz /volumes/piturral/fastq/learning/untrimmed/C02plusO_S4_L004_R2_001.fastq.gz

And I also get these results:
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 3,739
[index] number of k-mers: 173,304,639
[index] number of equivalence classes: 16,422
Warning: 34767 transcripts were defined in GTF file, but not in the index
[quant] running in paired-end mode
[quant] will process pair 1: /volumes/piturral/fastq/learning/untrimmed/J02O_S1_L001_R1_001.fastq.gz
/volumes/piturral/fastq/learning/untrimmed/J02O_S1_L001_R2_001.fastq.gz
[quant] will process pair 2: /volumes/piturral/fastq/learning/untrimmed/J02O_S1_L002_R1_001.fastq.gz
/volumes/piturral/fastq/learning/untrimmed/J02O_S1_L002_R2_001.fastq.gz
[quant] will process pair 3: /volumes/piturral/fastq/learning/untrimmed/J02O_S1_L003_R1_001.fastq.gz
/volumes/piturral/fastq/learning/untrimmed/J02O_S1_L003_R2_001.fastq.gz
[quant] will process pair 4: /volumes/piturral/fastq/learning/untrimmed/J02O_S1_L004_R1_001.fastq.gz
/volumes/piturral/fastq/learning/untrimmed/J02O_S1_L004_R2_001.fastq.gz
[quant] finding pseudoalignments for the reads ... done
[quant] processed 44,717,805 reads, 40,364,816 reads pseudoaligned
[quant] estimated average fragment length: 187.16
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 132 rounds
[bstrp] running EM for the bootstrap: 100
[ bam] writing pseudoalignments to BAM format .. done
[ bam] sorting BAM files .. done
[ bam] indexing BAM file .. done

Can someone please explain what the warning means?

Thanks!
P.
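
One way to look into the warning is to list which transcript IDs appear in the GTF but not among the index targets (the log above reports 3,739 targets, while the warning counts 34,767 GTF transcripts as absent). The sketch below is an illustration under stated assumptions: that the index was built from a FASTA file, that target names are the first whitespace-delimited token of each FASTA header, and that version suffixes such as `.2` may or may not need stripping to match the GTF `transcript_id`. The function names are hypothetical, and this is not necessarily how kallisto itself performs the comparison.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(gtf_lines):
    """Collect transcript_id values from 'transcript' features of a GTF."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        match = _TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_target_ids(fasta_lines, strip_version=False):
    """Collect sequence names (first token of each '>' header line)."""
    ids = set()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue
        parts = line[1:].split()
        if not parts:
            continue
        name = parts[0]
        if strip_version:
            # Assumption: a trailing ".N" is a version suffix
            name = name.rsplit(".", 1)[0]
        ids.add(name)
    return ids


def missing_from_index(gtf_lines, fasta_lines, strip_version=False):
    """Transcript IDs present in the GTF but absent from the FASTA names."""
    return gtf_transcript_ids(gtf_lines) - fasta_target_ids(
        fasta_lines, strip_version
    )
```

Both functions accept any iterable of lines, so open file handles (for example from `gzip.open(path, "rt")` for compressed files) can be passed directly. If most IDs are reported missing, it may be worth checking whether the FASTA used for the index contains transcript sequences at all, and whether the naming matches the GTF.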
