Segfault on the cliveome #9

Open
mp15 opened this issue May 3, 2023 · 6 comments

Comments

@mp15

mp15 commented May 3, 2023

Whilst running modkit on the Cliveome (basecalled with Dorado, mapped with minimap2 and sorted with samtools) I ran into the following segfault. Any chance you could put out a version of your binary compiled with full debug symbols, or should I compile from scratch and try to reproduce?

#0  __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222
#1  0x00005555555dd498 in cram_to_bam.isra ()
#2  0x00005555555e97da in cram_get_bam_seq ()
#3  0x00005555555d4436 in sam_read1 ()
#4  0x00005555555d5548 in sam_readrec ()
#5  0x00005555555c111c in hts_itr_next ()
#6  0x00005555555d9a51 in bam_plp64_auto ()
#7  0x00005555555d9ad1 in bam_plp_auto ()
#8  0x000055555552b03a in <mod_kit::mod_pileup::PileupIter as core::iter::traits::iterator::Iterator>::next ()
#9  0x0000555555528d96 in _ZN7mod_kit8commands12ModBamPileup3run28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$17hb169f6e83b14f0b3E.llvm.16552777208159181741 ()
#10 0x00005555555277fe in rayon::iter::plumbing::Folder::consume_iter ()
#11 0x00005555554f3501 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#12 0x000055555550d9a3 in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#13 0x00005555555118e4 in rayon_core::registry::in_worker ()
#14 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#15 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#16 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#17 0x000055555550da5d in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#18 0x00005555555118e4 in rayon_core::registry::in_worker ()
#19 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#20 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#21 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#22 0x000055555550da5d in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#23 0x00005555555118e4 in rayon_core::registry::in_worker ()
#24 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#25 0x000055555550d9a3 in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#26 0x00005555555118e4 in rayon_core::registry::in_worker ()
#27 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#28 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#29 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#30 0x00005555558ca9a8 in rayon_core::registry::ThreadBuilder::run ()
#31 0x00005555558d12ea in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#32 0x00005555558ce431 in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
#33 0x0000555555943893 in alloc::boxed::{impl#45}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1987
#34 alloc::boxed::{impl#45}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1987
#35 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:108
#36 0x00007ffff7d0bb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#37 0x00007ffff7d9da00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
@mp15
Author

mp15 commented May 3, 2023

I've now managed to replicate this by converting that CRAM to a BAM with samtools proper, so I'm raising it with the samtools team.

@ArtRand
Contributor

ArtRand commented May 3, 2023

Hello @mp15,

Could you give me some more details on what's happening? If I'm understanding correctly the steps you've performed are:

  1. dorado basecall
  2. align with minimap2
  3. sort and index with samtools
  4. attempt pileup with modkit

But you get the above error at (4). Then you were able to get a similar error when using samtools to convert your CRAM file to a BAM file, correct? Do you think this is because the file is corrupted, or because there is an incompatibility with your system? It would be great if modkit could catch this kind of thing instead of segfaulting.

@mp15
Author

mp15 commented May 3, 2023

It's a funny one, and kinda low level so you probably can't catch it.

I got James Bonfield to have a look at the CRAM and he found the problem: I actually broke the file format. I had used --emit-moves in dorado in anticipation of doing duplex calling. When this is combined with minimap2, the large mv tags are replicated to each secondary mapping. Secondary mappings don't have SEQ; they have a * instead. This means they don't count towards the 5 Mbases of sequence we keep in each CRAM container block. So when I merged the 3 PromethION lanes of the Cliveome, a huge number of secondary mappings with their mv tags ended up in one container block, making it larger than 2 GB and overflowing the 32-bit signed int used to store the size.
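As a rough illustration of the mechanism described above, here is a minimal Rust sketch. It is a toy model, not htslib's actual code: the `Record` struct, the `fill_container`, `stored_size` and `checked_size` functions, and the per-record byte counts are all hypothetical, and the only rule it models is "close the container once the summed SEQ length reaches the base limit".

```rust
/// Toy record: SEQ length in bases plus the size of its aux tags in bytes.
/// (Hypothetical type for illustration; not an htslib structure.)
struct Record {
    seq_len: u64,
    aux_bytes: u64,
}

/// Fill one container, closing it only once the summed SEQ length reaches
/// `base_limit`. Returns (records taken, total aux bytes in the container).
fn fill_container(records: &[Record], base_limit: u64) -> (usize, u64) {
    let mut bases = 0u64;
    let mut bytes = 0u64;
    let mut taken = 0usize;
    for r in records {
        if bases >= base_limit {
            break;
        }
        bases += r.seq_len;
        bytes += r.aux_bytes;
        taken += 1;
    }
    (taken, bytes)
}

/// What a 32-bit signed size field would hold: `as i32` keeps only the
/// low 32 bits, so anything above i32::MAX wraps to a wrong value.
fn stored_size(actual_bytes: u64) -> i32 {
    actual_bytes as i32
}

/// Checked alternative: report a size that does not fit instead of wrapping.
fn checked_size(actual_bytes: u64) -> Option<i32> {
    i32::try_from(actual_bytes).ok()
}

fn main() {
    // Assumed numbers: 100 kB of mv tag per record, 10 kb primary reads.
    let primaries: Vec<Record> = (0..1_000)
        .map(|_| Record { seq_len: 10_000, aux_bytes: 100_000 })
        .collect();
    // Secondary mappings carry the same tag but SEQ is "*", i.e. zero bases.
    let secondaries: Vec<Record> = (0..30_000)
        .map(|_| Record { seq_len: 0, aux_bytes: 100_000 })
        .collect();

    let (n, bytes) = fill_container(&primaries, 5_000_000);
    println!("primaries:   {} records, {} bytes, stored as {}", n, bytes, stored_size(bytes));

    let (n, bytes) = fill_container(&secondaries, 5_000_000);
    println!("secondaries: {} records, {} bytes, stored as {}", n, bytes, stored_size(bytes));
    println!("checked: {:?}", checked_size(bytes));
}
```

With these assumed numbers the run of primaries closes its container after 500 records, while the run of secondaries never reaches the base limit, so every record lands in one container whose byte count no longer fits in an `i32`.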

James is going to make a fix to htslib; I'll let you know when it lands. You may wish to upgrade htslib when it does. In the meantime I'm going to tweak my workflow so I don't produce such a silly result.

@ArtRand
Contributor

ArtRand commented May 3, 2023

Interesting, thanks, keep me posted.

@mp15
Author

mp15 commented May 4, 2023

Bug fix for htslib is in: samtools/htslib#1613. I'm going to ask Rob if we can make a release soon.

@ArtRand
Contributor

ArtRand commented May 5, 2023

Great. It'll have to make it into rust-htslib as well, but I can probably push on that once it's in mainline. Thanks for the update.
