-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Options to favor detection of several small deletions instead of single large ones #8
Comments
Minimap2/unimap won't work with a very large -B because internally every score must be within [-128,127]. I should have added a test for that. Nonetheless, the failure of
This is probably violating the SW formulation unimap/minimap2 is using. You need to make sure In your case, you can try |
Unfortunately, these options does not solve the issue, and still the majority of reads show the large deletion + mismatches. From what I understand, it may be possible to try something like Thank you for the hints and feedback, |
The requirement is |
Thank you for the clarifications, and for offering giving it a look. I have been trying different combinations of parameters following the above rule, but with no success. Shall I send you the files to an email address? apparently I cannot attach them here. |
Hello, @elcortegano I would like to ask, how did you obtain such a nice view of the alignment? I guess using some software (?) and fetching it bam/sam format coming out of the unimap? Thank you for your response. |
Hi @ilikvlad , this is with IGV (https://software.broadinstitute.org/software/igv/download). |
unimap
does a great job aligning complex regions (e.g. satellites) even using reads. We are using it to detect some variants at telomeric regions, using PacBio HiFi sequencing data.We have evidence from one of these regions that there should be 2 separate deletions, corresponding to two separate groups of monomers from a satellite, each of them of different size. However,
unimap
finds one large deletion instead, close together to several variants that suggest mismapping. I assume this behavior comes from the default settings, that may favour large deletions instead of many small deletions. The program was run withunimap -a -x asm5 -x hifi --cs
. I attach snapshot below for this region.I have been trying to tune settings to penalize nucleotide mismatchings and favour more than 1 deletion if convenient. However, I find some problems including:
-B 500 -O 1 -E 2
: increased mismatch penalty and lowered cost for gaps. This results in a "Segmentation fault" error.-B 500
: Does not seem to solve the problem, the alignment remains the same.-O 1 -E 2
: This makes some improvements, and for some reads we observe that mismatches disappear in favour of deletions (snapshot below), but not for the majority of reads. Cannot make progress from here however, since using "-B > 6" will cause the segfault, lower values of "-B" won't change anything, and -O and -E are near their minimum values (cannot take values of zero).Wonder if there are other options that are worth exploring here. Any hints on how to deal with this situation? Thank you
The text was updated successfully, but these errors were encountered: