Difficulties to run modkit with the modbases tutorial #49

Open
marceelrf opened this issue Aug 31, 2023 · 2 comments
Labels
question: Looking for clarification on inputs and/or outputs

Comments

@marceelrf

Hello, we are starting to use ONT sequencing in our laboratory.
I'd like to run modkit on the chr20 data from the GIAB NA24385 sample. I managed to replicate the tutorial using modbam2bed (https://labs.epi2me.io/notebooks/Modified_Base_Tutorial.html), but I would like to use modkit instead, as Nanopore recommends.

However, I am getting the result: “processed 0 rows and skipped ~331448 reads”.

Has anyone managed to run modkit on this data, or does anyone have suggestions as to what might be wrong?

Please let me know if you need more information.

Thanks in advance,

Marcel

@ArtRand
Contributor

ArtRand commented Aug 31, 2023

Hello @marceelrf

The data used in this tutorial is a little old, and as a result the base modification (MM) tags in the BAM don't carry a mode flag, so they implicitly denote the "." mode. The SAM spec has the details. By default, modkit pileup will not use these records unless you specify --force-allow-implicit on the command line. There is a note about this in the Troubleshooting section of the documentation. If you run modkit pileup with the --log-filepath <file> option, you will see lines such as

[src/read_cache.rs::376][2023-08-31 15:05:04][DEBUG] read 8272cf99-c8f3-463b-9e32-5db2efdf364d, Skipped: record 8272cf99-c8f3-463b-9e32-5db2efdf364d has un-allowed mode (ImplicitProbModified), use '--force-allow-implicit' or 'modkit update-tags --mode ambiguous'

indicating what action to take. We can work on updating the tutorial on our side.
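
To confirm that every skipped read was rejected for the same reason, the log can be tallied by the mode named in each "Skipped" line. The following is a minimal sketch, assuming all skip messages follow the format of the line quoted above; the function name summarize_skips is hypothetical and not part of modkit.

```shell
# Tally skipped reads by the un-allowed mode reported in a modkit log.
# Assumption: skip messages look like the DEBUG line quoted above.
# summarize_skips is a hypothetical helper, not a modkit command.
summarize_skips() {
    # $1: path to the file that was passed to --log-filepath
    sed -n 's/.*Skipped: .* has un-allowed mode (\([A-Za-z]*\)).*/\1/p' "$1" \
        | sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'   # print "<mode><TAB><count>"
}
```

Calling it with the log path prints one line per mode with its read count. If the only mode listed is ImplicitProbModified, the missing mode flag accounts for all of the skipped reads.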

I would suggest running modkit update-tags --mode ambiguous on the tutorial data. The ambiguous mode (?) is correct for the CpG models used in that tutorial. Also note that the handling of thresholds has changed slightly from modbam2bed to modkit, so I would expect some slight differences there too.
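
The suggested steps could be chained roughly as follows. This is a sketch under stated assumptions rather than a verified recipe: it assumes modkit and samtools are on the PATH, that update-tags and pileup take their input and output paths as positional arguments (check modkit update-tags --help and modkit pileup --help for your version), and that pileup needs an indexed BAM. The function name retag_and_pileup and the default log file name are hypothetical.

```shell
# Re-tag a BAM as ambiguous ("?") mode, index it, then run pileup with a log.
# Assumptions: modkit and samtools are installed; argument order follows
# "<input> <output>"; retag_and_pileup is a hypothetical helper name.
retag_and_pileup() {
    in_bam=$1        # original BAM with implicit-mode MM tags
    retagged_bam=$2  # output BAM with the mode rewritten to ambiguous
    out_bed=$3       # pileup output (bedMethyl)
    log_file=${4:-modkit_pileup.log}

    modkit update-tags "$in_bam" "$retagged_bam" --mode ambiguous || return 1
    samtools index "$retagged_bam" || return 1
    modkit pileup "$retagged_bam" "$out_bed" --log-filepath "$log_file"
}
```

It would be called with placeholder paths such as: retag_and_pileup tutorial.bam tutorial.ambiguous.bam pileup.bed. Passing --force-allow-implicit to pileup on the original BAM is the other route named in the log message, but re-tagging keeps the mode explicit in the file itself.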

@marceelrf
Author

Thank you @ArtRand !

ArtRand added the "question" label on Dec 15, 2023