Call DMR using files in 10 bp bin format #91

Open
Greacy-x opened this issue May 12, 2024 · 6 comments

@Greacy-x

Hi yupeng,

I've obtained a batch of processed WGBS data from the GEO database. It was generated with the 'methylpy allc-to-bigwig --bin-size 10' command, resulting in files in 10 bp bin format (e.g., GSM4603053_allc_18.CGN.10bp.bw). How can I use these files for DMR calling? Could you provide some guidance or suggestions?

Attached is an example of 10 bp bin format file.

Thank you in advance for your help!
bigwig data in 10 bp bin format

@Greacy-x Greacy-x reopened this May 12, 2024
@yupenghe
Owner

You can use 10bp bin as a test unit. Converting the bigwig file to the allc format where each row corresponds to a 10bp bin will allow methylpy to perform differential methylation analysis.

@Greacy-x
Author

Thank you for your prompt response!

I'm also considering comparing these public data with my own WGBS data. Should I convert my own data into the 10bp bin format and then treat each 10bp bin as a single unit for differential methylation analysis, as you described above?

Thank you once again for your time and expertise.

@yupenghe
Owner

Yes, I think that would be a valid strategy.
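As a rough illustration of the strategy confirmed above, the sketch below collapses per-site allc records into 10bp bins by summing the methylated and total counts within each bin. It is not part of methylpy: the function name `bin_allc_rows`, the bin grid (bins starting at 1-based positions 1, 11, 21, ...), the fixed `+` strand, the `CGN` context label, and the constant `1` in the last column are all assumptions made for illustration.

```python
from collections import defaultdict


def bin_allc_rows(rows, bin_size=10):
    """Collapse per-site records into fixed-size bins.

    rows: iterable of (chrom, pos_1based, mc, cov) tuples taken from an
    allc file. Returns one allc-like tuple per bin:
    (chrom, bin_start_1based, strand, context, mc_sum, cov_sum, methylated).
    """
    totals = defaultdict(lambda: [0, 0])
    for chrom, pos, mc, cov in rows:
        # Assumed bin grid: 1-10, 11-20, ... (1-based, inclusive).
        bin_start = (pos - 1) // bin_size * bin_size + 1
        totals[(chrom, bin_start)][0] += mc
        totals[(chrom, bin_start)][1] += cov
    # Strand "+", context "CGN", and methylated = 1 are placeholders (assumptions).
    return [
        (chrom, start, "+", "CGN", mc, cov, 1)
        for (chrom, start), (mc, cov) in sorted(totals.items())
    ]


sites = [("chr1", 1, 3, 5), ("chr1", 10, 2, 5), ("chr1", 11, 1, 4)]
binned = bin_allc_rows(sites)
```

Both strands of a CpG are merged into the same bin here; whether that matches how the public bigwig files were binned would need to be checked against the data.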

@Greacy-x
Author

Thanks a lot!

@GiveHeartToU

> You can use 10bp bin as a test unit. Converting the bigwig file to the allc format where each row corresponds to a 10bp bin will allow methylpy to perform differential methylation analysis.

Regarding the bw format file mentioned by Minghui: the fourth column is the average methylation level of the 10bp region. The allc format, however, requires seven mandatory columns, including the 1-based position of the cytosine (C). How should this position be chosen? How should the sixth column, 'cov', and the seventh column, 'methylated', be derived? And what specific method do you have in mind for converting the bw file into an allc file?

Looking forward to your reply, thank you.

@yupenghe
Owner

For the position, you can choose the first base of each 10bp bin. For mc and cov, I would recommend setting them to the sum of methylated bases and the sum of total bases, respectively, across all CpGs in the 10bp bin. For the last column, setting all values to 1 will work.

I don't think there are any specific tools you can use for this. A custom script will be needed.
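A minimal sketch of such a custom script is given below, under stated assumptions. It turns one bigwig interval (chromosome, 0-based start, mean methylation level) into a tab-separated allc row with the seven columns chr, pos, strand, context, mc, cov, methylated. A bigwig file stores only the mean level per bin, not the underlying counts, so the sketch substitutes a fixed pseudo-coverage `assumed_cov`; that value, the function name `bin_to_allc_row`, the `+` strand, and the `CGN` context label are all hypothetical choices and not methylpy behavior. If the original per-site counts are available, summing them per bin is preferable.

```python
import math


def bin_to_allc_row(chrom, start, level, assumed_cov=10, strand="+", context="CGN"):
    """Build one allc-style row for a 10bp bin.

    start is the 0-based bin start as stored in a bigwig interval;
    level is the mean methylation level of the bin (0.0 to 1.0).
    Returns None for bins without a usable value.
    """
    if level is None or math.isnan(level):
        return None
    pos = start + 1                      # first base of the bin, 1-based
    mc = round(level * assumed_cov)      # pseudo-count, NOT a real read count
    fields = [chrom, pos, strand, context, mc, assumed_cov, 1]  # last column set to 1
    return "\t".join(str(f) for f in fields)


row = bin_to_allc_row("chr1", 100, 0.8)
```

The intervals themselves would have to be read from the .bw file first, for example with a bigwig reader library. Because every bin receives the same pseudo-coverage, any downstream statistical test treats all bins as equally well covered, which may not reflect the real data.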
