Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueError: negative dimensions are not allowed when calling set_refine_signal_mapping() #138

Closed
drivenbyentropy opened this issue Nov 21, 2023 · 3 comments

Comments

@drivenbyentropy
Copy link

Hi,

I am getting an error which I am not sure to approach. For a very small subset of reads, the following code snippet

import pysam
import pod5
from remora import io, refine_signal_map, util, metrics, data_chunks

# prepare the sequence data
can_pod5_fh = pod5.Reader("read.pod5")
can_bam_fh = pysam.AlignmentFile("read.bam")

# instantiate the signal map reginer
sig_map_refiner = refine_signal_map.SigMapRefiner(
    kmer_model_filename="9mer_levels_v1.txt",
    scale_iters=0,
    do_fix_guage=True,
    do_rough_rescale=True
)

# get example read
bam_read = [ read for read in can_bam_fh.fetch() if io.read_is_primary(read)][0]
io_read = io.get_io_reads([bam_read], can_pod5_fh, reverse_signal=False)[0]
io_read.set_refine_signal_mapping(sig_map_refiner, ref_mapping=True)

fails with the error below:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_422116/4186189044.py in <module>
     19 bam_read = [ read for read in can_bam_fh.fetch() if io.read_is_primary(read)][0]
     20 io_read = io.get_io_reads([bam_read], can_pod5_fh, reverse_signal=False)[0]
---> 21 io_read.set_refine_signal_mapping(sig_map_refiner, ref_mapping=True)
     22 

[...]/python3.9/site-packages/remora/io.py in set_refine_signal_mapping(self, sig_map_refiner, ref_mapping)
   2055             return
   2056         remora_read = self.into_remora_read(ref_mapping)
-> 2057         remora_read.refine_signal_mapping(sig_map_refiner)
   2058         if ref_mapping:
   2059             if self.ref_to_signal is None:

[...]/python3.9/site-packages/remora/data_chunks.py in refine_signal_mapping(self, sig_map_refiner, check_read)
    289                     self.shift,
    290                     self.scale,
--> 291                 ) = sig_map_refiner.refine_sig_map(
    292                     self.shift,
    293                     self.scale,

[...]/python3.9/site-packages/remora/refine_signal_map.py in refine_sig_map(self, shift, scale, seq_to_sig_map, int_seq, dacs)
    476         # 0 indicates 1 round of sig map refine w/o rescaling
    477         for _ in range(max(1, self.scale_iters)):
--> 478             seq_to_sig_map, _, _, _, _ = refine_signal_mapping(
    479                 (dacs - shift) / scale,
    480                 seq_to_sig_map,

[...]/python3.9/site-packages/remora/refine_signal_map.py in refine_signal_mapping(signal, seq_to_sig_map, levels, band_half_width, refine_algo, short_dwell_pen, adjust_band_min_step)
    831     temp_levels = np.copy(levels)
    832     temp_levels[np.isnan(levels)] = 0
--> 833     all_scores, path, traceback, base_offsets = seq_banded_dp(
    834         signal.astype(np.float32),
    835         temp_levels.astype(np.float32),

src/remora/refine_signal_map_core.pyx in remora.refine_signal_map_core.seq_banded_dp()

ValueError: negative dimensions are not allowed

I have attached the corresponding pod5 and bam file to reproduce this issue in case you might find it useful. Any pointer as to where I could start debugging this would be greatly appreciated.

Thank you very much in advance.

@marcus1487
Copy link
Collaborator

I've not seen this error before. Thank you for the test data. I am able to reproduce the error and will look into a fix.

@drivenbyentropy
Copy link
Author

Reference genome is dm6.

@marcus1487
Copy link
Collaborator

This should be fixed in the v3.1 release. There is a larger issue here though. The read you've sent contains many 500+ base pair deletions from the reference. This is highly suggestive that the reference is not correct for this read/mapping. The fix put in a check in to ensure that the value to compute the offset does not overflow the array (and use an unsigned integer instead of a signed int), but this is likely far too large a search space for the signal refinement algorithm. I will explore a variable to filter these reads out in a future release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants