Detect methylation data background noise #147
Comments
Hello @Macdot3, This is a difficult question to answer in a very general case. A couple things you could try:
If you tell me the result of (1) above maybe I can give you additional advice. |
Hello Arthur, sorry for the delay in responding, thank you so much for your advice. I'm sending you my summary results separated by native samples and PCR in two files. You'll find the sample correspondence in the names. Only two of them don't have duplicates. From what you're telling me, should I consider the |
Hello @Macdot3, First off, I would use the A couple things stand out to me from your
Happy to answer any additional questions. |
Thank you, @ArtRand, for your invaluable support. Upon conducting a more detailed analysis, it appears that my native Linfo_ samples exhibit remarkably low coverage. Regarding thresholds, I have some inquiries. How does modkit discern, in its automated threshold determination, whether these samples are PCR? While it may seem trivial, I seek clarity to deepen my understanding. Additionally, in setting a summary threshold manually (excluding dorado), could you provide guidance pertinent to the ongoing experiment? Your insights would be greatly appreciated. Thank you once again for your assistance. |
I also have another question: how many CpG sites does Modkit calculate the threshold for and provide extract outputs? To clarify, I know that there are a total of 435 predicted CpG sites on the mtDNA, but the output from Modkit extract still provides many more sites than 435 on the ref_position. I'm not sure if I was clear. I appreciate your advice, thank you very much |
Hello @Macdot3, As far as the number of reads in the As for the threshold calculation, the exact calculation can be found in the documentation. The number of CpG sites used in the calculation isn't logged. I can add this in the next release. The number of reads is logged, however, use If you are using |
Hello Arthur, |
@ArtRand, I've generated my probabilities.tsv and probabilities.txt files. I wanted to ask you for information on how |
Hello @Macdot3, The two options, In general, I would trust the filter threshold that is estimated by |
Certainly, I'll attach my output files separately for the + and - strands of two native and PCR samples that I selected to estimate the threshold, considering the highest coverage. |
Hello @Macdot3, Thanks for sending these results over. The 10% threshold in all of the samples is ~0.9, which is what I would expect. The minimum value you're seeing is the minimum value bin. The file you're looking at is a histogram so what this means is you have at least 1 call with that probability. I'll clarify the documentation on the contents of these files. This is also what I would expect since there are 3 classes, so if the model cannot make a call it will have a probability of ~33% (random guess). The |
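To make the histogram reading above concrete, here is a minimal sketch of how a 10th-percentile threshold and a minimum observed bin can be read off a histogram of call probabilities. The bin edges and counts are invented for illustration, the function name `percentile_threshold` is hypothetical, and this is not modkit's actual implementation or the exact layout of its output files; it only shows the idea that the lowest non-empty bin needs just one call, while the 10% threshold depends on the cumulative count.

```python
from typing import Sequence


def percentile_threshold(
    bin_lowers: Sequence[float],
    counts: Sequence[int],
    percentile: float = 0.10,
) -> float:
    """Return the lower edge of the first bin at which the cumulative
    fraction of calls reaches `percentile` (hypothetical helper)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    cumulative = 0
    # Walk the bins from the lowest probability upwards.
    for lower, count in sorted(zip(bin_lowers, counts)):
        cumulative += count
        if cumulative / total >= percentile:
            return lower
    return max(bin_lowers)


# Invented histogram: lower bin edge -> number of calls in that bin.
bin_lowers = [0.34, 0.50, 0.70, 0.90, 0.95, 0.99]
counts = [1, 2, 3, 10, 30, 54]

# The 10% threshold: 10% of calls fall at or below this bin.
threshold = percentile_threshold(bin_lowers, counts, percentile=0.10)

# The minimum observed bin: at least one call landed here.
min_observed = min(b for b, c in zip(bin_lowers, counts) if c > 0)

# With 3 classes, a call can never have a winning probability below 1/3.
random_guess_floor = 1 / 3
```

In this made-up example a single low-confidence call puts the minimum bin near the three-class floor of about 0.33, yet the 10% threshold still sits at 0.9 because most calls are high confidence.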
Thank you very much, @ArtRand, for your advice. |
Hi everyone,
I have a question regarding the analysis of data produced by modkit. I am working with two data sets (for the moment, only a few samples): one from samples sequenced without PCR amplification on a MinION R10, and another from PCR-amplified samples sequenced with the same technology. Some samples appear in both sets. I was wondering whether there is a method (a statistical test, a pipeline, etc.) that would let me determine whether the methylation values obtained for the PCR samples are background noise, since I do not expect significant methylation in them. Additionally, is there a way to establish a threshold for background noise? I have read a lot about this, but perhaps you can help me.
Thank you all.