finding pseudoalignments for the reads ...Segmentation fault (core dumped) #105

Open
aungthurhahein opened this issue Mar 25, 2016 · 12 comments

Comments

@aungthurhahein

When I try to generate a pseudobam file, kallisto crashes with a core dump.
The machine is an Ubuntu server with x86-64 architecture.

kallisto quant -i Trinity.fasta.kallisto_idx -l 590 -s 160.95 -o out --pseudobam --single ge50.fasta > out.sam
@pmelsted
Contributor

pmelsted commented Apr 5, 2016

What is the error reported? We fixed an issue in v0.42.5 which could be the cause of this. Can you run this on the latest version, 0.42.5?

@aungthurhahein
Author

Kallisto version is 0.42.4 and this is the error message:

finding pseudoalignments for the reads ...Segmentation fault (core dumped) 

I will download v0.42.5 and try again.
I will get back to you with the outcome.

@aungthurhahein
Author

I tried with kallisto ver. 0.42.5 with the following command and the error still persists.

Command:

kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta

Program halted with the following error:

@SQ     SN:c10356_g10356_i1     LN:1195
@PG     ID:kallisto     PN:kallisto     VN:0.42.5
Segmentation fault (core dumped)

@pmelsted
Contributor

In your case you seem to have only a single sequence in your index. Can you confirm that this is what you expected?

Can you run kallisto quant without the pseudobam? I'm just trying to isolate whether the problem is in the pseudobam step or in other parts.

Can you also report what is written to stderr when you run your command as
kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta > lib.sam
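
The redirect above sends the SAM stream on stdout to a file so that only the log on stderr remains to report. A minimal Python sketch of the same separation follows; the helper name `run_and_split_streams` is hypothetical, and the demo uses a stand-in command rather than kallisto so that it runs anywhere.

```python
import subprocess
import sys


def run_and_split_streams(cmd, stdout_path):
    """Run cmd, write its stdout to stdout_path, return (exit_code, stderr_text)."""
    with open(stdout_path, "wb") as out:
        proc = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE)
    # On POSIX a negative exit code means the process was killed by a signal
    # (for example -11 for SIGSEGV).
    return proc.returncode, proc.stderr.decode(errors="replace")


if __name__ == "__main__":
    # Stand-in command for illustration only; replace it with the kallisto
    # argument list from the comment above to capture the real log.
    demo_cmd = [
        sys.executable,
        "-c",
        "import sys; print('sam line'); print('log line', file=sys.stderr)",
    ]
    code, err = run_and_split_streams(demo_cmd, "demo.sam")
    print("exit code:", code)
    print("stderr:", err.strip())
```

Keeping the two streams apart makes it easier to see which log line was printed last before the crash.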

@aungthurhahein
Author

The index file has more than one sequence. I just reported the end of the stdout.
I can run kallisto quant without generating pseudobam successfully.

This is the output of both stdout and stderr:

[quant] fragment length distribution is truncated gaussian with mean = 605, sd = 136
[index] k-mer length: 31
[index] number of targets: 10,357
[index] number of k-mers: 5,947,007
[index] number of equivalence classes: 24,998
[quant] running in single-end mode
[quant] will process file 1: /colossus/home/anuphap/EST/EST_lib_IDs/pm/slect_bytissues_pm_chula/PM82_wTempLibID_04092014.txt.PmTwI.seqID.fasta
[quant] finding pseudoalignments for the reads ...@HD   VN:1.0
@SQ     SN:c0_g0_i1     LN:216
@SQ     SN:c1_g1_i1     LN:374
@SQ     SN:c2_g2_i1     LN:197
...
@SQ     SN:c10354_g10354_i1     LN:594
@SQ     SN:c10355_g10355_i1     LN:682
@SQ     SN:c10356_g10356_i1     LN:1195
@PG     ID:kallisto     PN:kallisto     VN:0.42.5
Segmentation fault (core dumped)

Also, a "core.xxxx" file is written to the working directory.

@pmelsted
Contributor

The file you are aligning has the extension .fasta. Are the sequences truly FASTA entries and not FASTQ? Pseudoalignment outputs SAM files, which are required to have a quality string, so kallisto (probably) fails because the input has no quality string.

I'll have to check for this a bit more carefully when doing pseudoalignment.

kallisto never uses the quality values, so you can supply a dummy value, essentially converting the FASTA file to a FASTQ file.

You can try this by converting just the first few sequences of the input file to FASTQ.
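
The dummy-quality conversion suggested above could look roughly like the following Python sketch. The function name `fasta_to_fastq` and the choice of `I` as the constant quality character are assumptions made for illustration; they are not part of kallisto.

```python
def _fastq_record(name, seq_parts, dummy_qual):
    """Build the four FASTQ lines for one record with a constant quality."""
    sequence = "".join(seq_parts)
    return [f"@{name}", sequence, "+", dummy_qual * len(sequence)]


def fasta_to_fastq(fasta_lines, dummy_qual="I"):
    """Yield FASTQ lines for each FASTA record in fasta_lines.

    Wrapped (multi-line) sequences are joined into a single line.
    """
    name, seq_parts = None, []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield from _fastq_record(name, seq_parts, dummy_qual)
            name, seq_parts = line[1:], []
        else:
            seq_parts.append(line)
    if name is not None:
        yield from _fastq_record(name, seq_parts, dummy_qual)


if __name__ == "__main__":
    # Tiny made-up input, only to show the shape of the output.
    demo = [">read1", "ACGTAC", "GT", ">read2", "TTGA"]
    for fastq_line in fasta_to_fastq(demo):
        print(fastq_line)
```

To test on only the first few sequences, the generator can be truncated with `itertools.islice(fasta_to_fastq(handle), 4 * n)` for the first `n` records before writing the lines to a `.fastq` file.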

@aungthurhahein
Author

Yes, I confirmed that the .fasta file has no quality values.
I didn't mention it before because I didn't expect that it could be the cause of the issue.

I will test with .fastq file format and report the outcome soon.

@maubarsom

maubarsom commented Jul 10, 2018

I also ran into a segfault when generating the pseudobam, but due to a slightly different problem. I was running kallisto using process substitution to deal with an interleaved paired-end file, e.g.:

kallisto quant -t 8 -i kallisto.idx -o my_sample --pseudobam <(seqtk seq -1 interleaved.fq) <(seqtk seq -2 interleaved.fq)

Kallisto runs perfectly fine without the --pseudobam flag, but it crashes if I request the pseudobam.

I figured that generating the pseudobam requires re-reading the FASTQ files, which is not possible with process substitution, so I tried doing the split beforehand, and then the segfault does not happen (it runs fine).

It would be nice to add this to the docs at least :). Support for interleaved paired-end files would also be a nice-to-have :)
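
For reference, splitting an interleaved file into two regular files before running kallisto could be sketched in Python as below. The function name `split_interleaved` is hypothetical, and the sketch assumes plain four-line FASTQ records with mates strictly alternating; `seqtk seq -1` and `seqtk seq -2` redirected to files achieve the same result.

```python
from itertools import islice


def split_interleaved(src, out1, out2):
    """Copy alternating 4-line FASTQ records from src to out1 and out2.

    src is any iterable of lines; out1 and out2 are writable text handles.
    Returns the number of read pairs written.
    """
    lines = iter(src)
    pairs = 0
    while True:
        mate1 = list(islice(lines, 4))
        if not mate1:
            break  # clean end of input
        mate2 = list(islice(lines, 4))
        if len(mate1) != 4 or len(mate2) != 4:
            raise ValueError("input is not a whole number of read pairs")
        out1.writelines(mate1)
        out2.writelines(mate2)
        pairs += 1
    return pairs


if __name__ == "__main__":
    import io

    # Made-up two-pair input, only to demonstrate the call.
    demo = io.StringIO(
        "@a/1\nAC\n+\nII\n@a/2\nGT\n+\nII\n"
        "@b/1\nA\n+\nI\n@b/2\nC\n+\nI\n"
    )
    r1, r2 = io.StringIO(), io.StringIO()
    print(split_interleaved(demo, r1, r2), "pairs written")
```

With real data, `src`, `out1` and `out2` would be file handles opened on the interleaved input and the two output files, which can then be passed to kallisto as ordinary paths.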

@Evi-050

Evi-050 commented Apr 14, 2020

Hello, I am facing this problem: "[ bam] writing pseudoalignments to BAM format .. Segmentation fault"
and I have no idea how to fix it. I have smartseq.2 single reads, dual indexed. This is what the FASTQ reads look like:
@NB551291:160:H55CJBGXF:1:11101:12947:14932 1:N:0:TAAGGCGA+GCGATCTA
GGCGTGTCCCGCGCGTGTGGGGGGAACCTCCGCGTCGGTGTTCCCCCGCCGGGTCCGCCCCCCGGGCCGCGGTTTT
+
AAAA/EAAAEEEA/EEAEEEAEE/E/EEEEEEAEA/EEEEEEEEEEEEEAEEEE/E/EAEEAEEE6AE/</EA///

I ran this command:
[user@vm-129-49 mouse1.fastq_gz]$ kallisto quant -i /ad/vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.transcripts.idx --output-dir /ad/vlachou/scRNAseq.2/kallisto_analysis/kallisto_quant/gencode_indexed/mouse1 --pseudobam --genomebam --gtf /vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.annotation.gtf.gz --single -l 530 -s 150 -t 16 *fastq.gz

This is the output:
[quant] fragment length distribution is truncated gaussian with mean = 530, sd = 150
[index] k-mer length: 31
[index] number of targets: 142,552
[index] number of k-mers: 120,672,054

[quant] finding pseudoalignments for the reads ... done
[quant] processed 482,819,438 reads, 208,880,499 reads pseudoaligned
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 1,273 rounds
[ bam] writing pseudoalignments to BAM format .. Segmentation fault
I tried the same with Ensembl as the reference, but I get the same problem.

If anyone could help me out, it would be great!
Thanks

@kopardev

Any idea if this issue has been resolved yet? I am also getting something very similar:

[  bam] writing pseudoalignments to BAM format .. /spin1/swarm/kopardevn/M0tDGHNewa/cmd.10: line 1: 12564 Segmentation fault      ( kallisto quant -i mm10_M21 -o TreatmentB_S72 --bias --plaintext
--fusion --rf-stranded -t 56 --pseudobam --genomebam --gtf genes.gtf -c mm10.genome trim/TreatmentB_S72.R1.trim.fastq.gz trim/TreatmentB_S72.R2.trim.fastq.gz )

@Evi-050

Evi-050 commented Jul 31, 2020

Personally, I went with STAR since I was not in a hurry, but someone in another post suggested going back to an older version that works. Frankly, I didn't try it. Also, if I remember correctly, when I removed "--pseudobam --genomebam --gtf genes.gtf" and ran, for example, "kallisto quant -i index -o output --single -l 200 -s 20 file1.fastq.gz file2.fastq.gz file3.fastq.gz", it worked.

Good luck!

@redst4r

redst4r commented Aug 8, 2020

This keeps happening to me too in kallisto 0.46.2:

[quant] finding pseudoalignments for the reads ...

[quant] done
[quant] processed 250,960,675 reads, 156,761,018 reads pseudoaligned
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 1,513 rounds
[  bam] writing pseudoalignments to BAM format .. [1]    2673 segmentation fault

It works when I remove the --genomebam flag, but I'd really like to get the BAM file out of this.
