Low unique alignment % #676

Closed
gevro opened this issue Jun 23, 2024 · 8 comments

@gevro

gevro commented Jun 23, 2024

Hi,
2 out of 23 samples, all prepared in the same batch, have a low unique alignment % (38% and 45%) relative to the other samples (~75% on average).

See attached bismark alignment reports for the 2 problematic samples (M3 and M5) relative to a good sample (M6).

M3.pdf
M5.pdf
M6.pdf

Representative FastQC of the samples does not indicate any abnormality in terms of excessive adapter or overrepresented sequences. I don't see any other reason why these two samples would have low unique alignment %. Do you have a suggestion of how to troubleshoot this?
M3.fastqc.pdf
M4.fastqc.pdf
M5.fastqc.pdf

Thanks

@gevro
Author

gevro commented Jun 23, 2024

Note: One possibility I will investigate is perhaps these two samples had a higher than expected spike-in genome %, accounting for the unmapped reads.

@FelixKrueger
Owner

(As a general comment, if you run deduplicate_bismark and bismark_methylation_extractor afterwards, the reports produced by bismark2report are a lot richer. Even better, running MultiQC (https://multiqc.info/) will aggregate everything into a single report. Also, all HTML files produced by Bismark, FastQC or MultiQC should be shareable, and are much nicer to look at than .pdf.)

Now for the problem at hand, I agree that all QC profiles you shared look very similar, and they also look good. Some standard trimming should get rid of the unwanted adapter, so it is not obvious why the samples would behave very differently. I have compiled a few FAQs regarding low mapping efficiency here: https://felixkrueger.github.io/Bismark/faq/low_mapping/

Maybe they can set you on the right path?

@gevro
Author

gevro commented Jun 24, 2024

Thank you. Your FAQ page gave me another idea: these are NEB EM-seq libraries, and I forgot to set the maximum insert size to 1000. So perhaps those two samples have larger insert sizes and lost more reads because of that.
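
For reference, a minimal sketch of how the maximum insert size could be raised for a paired-end Bismark run. `-X/--maxins` is the Bismark option for this (default 500); the genome folder and FastQ file names below are hypothetical placeholders, not the files from this issue. The script only prints the command (a dry run) so it can be checked before it is executed.

```shell
#!/bin/sh
# Hypothetical placeholder paths; substitute your own.
GENOME_DIR="/path/to/bismark_genome"
R1="sample_R1_val_1.fq.gz"
R2="sample_R2_val_2.fq.gz"

# --maxins (-X) raises the maximum insert size for valid paired-end
# alignments from the default of 500 bp to 1000 bp.
CMD="bismark --genome $GENOME_DIR --maxins 1000 -1 $R1 -2 $R2"

# Dry run: print the command. To execute it, run: eval "$CMD"
echo "$CMD"
```

Read pairs whose fragments span more than the `--maxins` value are not reported as valid paired-end alignments, which is why libraries with longer inserts can show a lower mapping efficiency under the default.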

I see also now there is an nf-core pipeline for bismark with an em-seq preset. So I will just switch to that.

gevro closed this as completed Jun 24, 2024
@FelixKrueger
Owner

Good point. Here are some trimming recommendations for EM-seq (https://felixkrueger.github.io/Bismark/bismark/library_types/#em-seq-neb), and there is a preset for the nf-core/methylseq workflow, too (be sure to use the dev revision as 2.6.0 is a little broken...)

@gevro
Author

gevro commented Jun 24, 2024

Thanks. How do I use the dev version exactly?

@FelixKrueger
Owner

On the command line it is -r dev (I believe).
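
A sketch of what a full invocation might look like. `-r` is Nextflow's option for selecting a pipeline revision (branch, tag or commit). The sample sheet name, output directory, genome key and container profile are hypothetical placeholders, and `--em_seq` is assumed to be the name of the EM-seq preset parameter, so check it against the pipeline documentation. The script only prints the command (a dry run).

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own values.
SAMPLESHEET="samplesheet.csv"
OUTDIR="results"
GENOME="GRCh38"

# -r dev selects the dev revision of the pipeline.
# --em_seq is assumed to enable the EM-seq preset (verify in the docs).
CMD="nextflow run nf-core/methylseq -r dev -profile docker --input $SAMPLESHEET --outdir $OUTDIR --genome $GENOME --em_seq"

# Dry run: print the command. To execute it, run: eval "$CMD"
echo "$CMD"
```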

@gevro
Author

gevro commented Jun 25, 2024

Hi, it looks like the dev version is still broken, with at least two major bugs: nf-core/methylseq#406

Any suggestions?

Thanks!

@gevro
Author

gevro commented Jul 1, 2024

Hi, it seems to be working now. I had to add this to the config: process.stageInMode = 'copy'
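
For anyone hitting the same problem, a minimal sketch of one way to apply that setting: write it to a separate config file and pass the file with Nextflow's `-c` option. The file name `custom.config` is an arbitrary choice, and the remaining pipeline parameters are omitted from the printed command.

```shell
#!/bin/sh
# Write the setting to a custom Nextflow config file.
# stageInMode = 'copy' makes Nextflow copy input files into each task's
# work directory instead of symlinking them (the default).
cat > custom.config <<'EOF'
process.stageInMode = 'copy'
EOF

# Dry run: print how the config file would be passed to the pipeline.
echo "nextflow run nf-core/methylseq -r dev -c custom.config"
```

Copying inputs uses more disk space and time than symlinking, so it may be worth scoping the setting to individual processes once the failing step is identified.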
