Extremely low mapping efficiency using bowtie2 and reference genome with many scaffolds #639
Comments
Thanks for reaching out. By default, Bismark runs in end-to-end mode in a fairly stringent setting ( If you really don't like trimming at all you could use the local alignments (invoked using the option Very happy to assist you along the way, we can also continue this via email. |
Thank you very much for your quick response! To get results as quickly as possible, I've been testing on a subset of one pair of paired-end files, 10,000 lines each (equivalent to 2,500 reads per file). All the results below come from these 10,000-line FASTQ files. I trimmed my sequences with trim_galore and then aligned them with Bowtie2, changing the parameter to --score_min L,0,-0.6. However, the mapping efficiency remains at 22%. If I use the trimmed sequences and add --very-sensitive-local, it increases to 31%. If I use only one read file, fastq_1 (trimmed and --very-sensitive-local), the mapping efficiency increases to 76%. On the other hand, if I use only fastq_2 (trimmed and --very-sensitive-local), it is 55.9%. When I used the --pbat parameter only on fastq_2, I got 51.6%. Anyway, a 50% mapping efficiency is still low. I don't understand why it drops to 20% when I use both files together. Should I map all my FASTQ files as if they were single-end reads? Can I use the --pbat parameter with paired-end reads? I'm not sure how to proceed. Thanks for your help |
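For readers wondering what relaxing --score_min actually changes: Bowtie 2 reads a value of the form L,a,b as the linear function f(x) = a + b * x, where x is the read length, and in end-to-end mode an alignment is only reported if its score is at least f(x). The sketch below is just arithmetic, not part of Bismark. It compares Bismark's default (L,0,-0.2) with L,0,-0.6 as an illustrative relaxed value, and it assumes Bowtie 2's default maximum mismatch penalty of 6 and no gaps, so the mismatch counts are rough upper bounds only. The function names are made up for this example.

```python
def min_score(intercept: float, slope: float, read_length: int) -> float:
    """Minimum alignment score for a linear function L,<intercept>,<slope>."""
    return intercept + slope * read_length


def max_mismatches(threshold: float, mismatch_penalty: float = 6.0) -> int:
    """Rough upper bound on high-quality mismatches allowed by a threshold.

    Assumes end-to-end mode (best possible score is 0), a penalty of 6 per
    mismatch, and no gaps or Ns. The small epsilon guards against float
    rounding when the threshold is an exact multiple of the penalty.
    """
    return int((-threshold + 1e-9) // mismatch_penalty)


if __name__ == "__main__":
    for label, slope in (("L,0,-0.2", -0.2), ("L,0,-0.6", -0.6)):
        for length in (100, 150):
            threshold = min_score(0, slope, length)
            print(label, length, threshold, max_mismatches(threshold))
```

For a 100 bp read this works out to a threshold of -20 (about 3 mismatches) with L,0,-0.2 and -60 (about 10 mismatches) with L,0,-0.6, which is why the relaxed setting lets more divergent reads through.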
Hi Laura, if possible at all, could you send me a subset of the reads of say 100K reads for some quick tests on my side? |
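A minimal sketch of one way to produce such a subset, assuming standard 4-line FASTQ records (no wrapped sequence lines) and plain or gzipped input. The helper name `subset_fastq` and the file names in the usage note are hypothetical. Taking the same number of records from the top of both R1 and R2 keeps the pairs in sync, provided the original files were in sync to begin with.

```python
import gzip
from itertools import islice


def _open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode."""
    path = str(path)
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def subset_fastq(src, dst, n_reads):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Returns the number of complete records written, which is smaller than
    n_reads if the input file is shorter.
    """
    n_lines = 0
    with _open_text(src) as fin, _open_text(dst, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            n_lines += 1
    return n_lines // 4
```

Usage would look like `subset_fastq("sample_R1.fastq.gz", "subset_R1.fastq.gz", 100_000)`, followed by the same call for the R2 file.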
Hello, are you there? I have run into similar difficulties and I am very confused. Have you solved this problem? I am sincerely looking forward to your reply. |
Hi changchuanjun! |
You can find a few ways of interpreting and mitigating low mapping efficiency issues here: https://felixkrueger.github.io/Bismark/faq/low_mapping/. If this doesn't solve your problems feel free to send me an email with as many details as possible. |
The most likely issue here is that the trimming you performed somehow interfered with the synchronisation between R1 and R2: both files need to contain corresponding sequences and have the exact same number of reads in total. I would recommend using the raw FastQ files once more, and using Trim Galore:
followed by a paired-end mapping command (maybe you could relax the mapping stringency a little, e.g. to |
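The synchronisation requirement described above (same number of reads, corresponding read IDs at each position) can be checked before mapping. The following is a rough sketch rather than anything shipped with Bismark or Trim Galore; the function names are hypothetical, and it assumes 4-line FASTQ records whose IDs match once the leading "@", any trailing "/1" or "/2", and any comment after whitespace are removed.

```python
import gzip
from itertools import zip_longest


def _normalise(header):
    """Reduce a FASTQ header line to a bare read ID."""
    name = header.strip().split()[0]  # drop any comment after whitespace
    if name.startswith("@"):
        name = name[1:]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def read_ids(path):
    """Yield the read ID of every record in a plain or gzipped FASTQ file."""
    path = str(path)
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        for i, line in enumerate(fh):
            # Header lines are every 4th line; skip stray blank lines at EOF.
            if i % 4 == 0 and line.strip():
                yield _normalise(line)


def check_pairs(r1_path, r2_path):
    """Return (ok, message) describing whether R1 and R2 are in sync."""
    n = 0
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for n, (id1, id2) in enumerate(pairs, start=1):
        if id1 is None or id2 is None:
            return False, f"read counts differ: one file ends before record {n}"
        if id1 != id2:
            return False, f"ID mismatch at record {n}: {id1} vs {id2}"
    return True, f"{n} read pairs in sync"
```

If `check_pairs` reports a mismatch on trimmed files but not on the raw ones, the trimming step is the likely culprit, for example because the two files were trimmed separately in single-end mode.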
So can we say this is a success all-around? All the best going forward! |
Yes, exactly. Besides, I also ran the pipeline with BSMAP. It likewise showed only a small difference in the final mapping efficiency whether the raw data were processed with fastp or trim_galore. Interestingly, BSMAP's mapping efficiency was slightly higher than Bismark's when using the same input files. |
Hi!
I'm using Bismark 0.24.2 with my WGBS samples (paired-end reads, not trimmed with TrimGalore because I'm not a fan of trimming sequences).
In genome preparation (bismark_genome_preparation) I'm using an assembly (with many scaffolds in the same fasta file and same genotype as my samples) and Bowtie2 as aligner, but the mapping efficiency is really low, just 20%.
To try to fix this low efficiency I ran Bowtie2 outside of Bismark (with the --very-sensitive parameter I got almost 55%), but that is still really low, isn't it?
Using bwa-meth the mapping rate is almost 95%, but these SAM files are not compatible with Bismark (I'm having trouble resolving the compatibility issue between the bwa-meth SAM files and Bismark).
I'd like to use Bismark for the entire workflow (from mapping to methylation extraction), but I don't know how to increase the mapping efficiency.
Thanks!!