Bug in the script 'fix_megahit_contig_naming.py' #529

Open
quliping opened this issue Dec 20, 2023 · 1 comment

quliping commented Dec 20, 2023

metaWRAP is good software for metagenomic analysis, but I found a bug in the script 'fix_megahit_contig_naming.py', which is used in the 'assembly' module. In one pass of the loop, the script reads and modifies the contig ID and decides from the contig length whether to keep it; in the next pass, it reads the sequence and stores it with its ID in the dictionary. This means the last contig would never be stored inside the loop, which is why you wrote a separate 'dic[name]=tmp_contig' line to store it. However, that line has no length check, so the last contig is always written to the final file, regardless of whether it is long enough.
image
image

For example, the last contig in my megahit assembly is 599 bp long and the threshold is 1000 bp, but this contig is still written to the final output file.
image
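
To make the problem concrete, here is a minimal sketch of the pattern I am describing. It is not the actual code from 'fix_megahit_contig_naming.py'; the function name `filter_contigs`, the `check_last` switch, and the megahit-style header with a trailing `len=` field are my assumptions for illustration only.

```python
def filter_contigs(lines, min_len, check_last=True):
    """Collect contigs of at least min_len bp from FASTA lines.

    check_last=False imitates the reported bug: the final record is
    stored without looking at the length decision made for it.
    """
    kept = {}
    name, seq, keep = None, "", False
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record; store it if it passed.
            if name is not None and keep:
                kept[name] = seq
            # Assumed header layout: ">k141_0 flag=1 multi=2.0000 len=599"
            fields = line[1:].split()
            length = int(fields[-1].split("=")[1])
            name = fields[0]
            keep = length >= min_len
            seq = ""
        else:
            seq += line
    # The last record has no following header, so it must be stored here,
    # and this is where the length check is easy to forget.
    if name is not None and (keep or not check_last):
        kept[name] = seq
    return kept


# Toy input: one long contig followed by a short last contig.
records = [
    ">k141_0 flag=1 multi=2.0000 len=1200",
    "A" * 1200,
    ">k141_1 flag=1 multi=1.0000 len=599",
    "C" * 599,
]

buggy = filter_contigs(records, 1000, check_last=False)
fixed = filter_contigs(records, 1000)
print(sorted(buggy))
print(sorted(fixed))
```

With this toy input, the unchecked version keeps both contigs, while the checked version keeps only the 1200 bp one. In the real script the equivalent change would presumably be to guard the final 'dic[name]=tmp_contig' with the same length condition used inside the loop.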

quliping (Author) commented

By the way, I found another bug in 'assembly.sh'. In the first screenshot, you can see a branch for the case where metawrap assembly uses only metaspades (elif [ "$metaspades_assemble" = true ]; then). However, the default value of the variable $megahit_assemble is 'True', which means only two situations are possible: 1) only megahit is used: '--metaspades' is not specified, whether or not '--megahit' is specified; 2) both megahit and metaspades are used: '--metaspades' is specified, whether or not '--megahit' is specified. I think the default value of $megahit_assemble should be 'False' rather than 'True'. Then a third situation becomes possible: 3) only metaspades is used, when '--metaspades' is specified and '--megahit' is not.
image
image
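
The flag logic can be modelled in a few lines. This is only a sketch of the behaviour described above, written in Python rather than shell; `selected_assemblers` and `megahit_default` are hypothetical names, and the real option parsing in 'assembly.sh' may differ.

```python
def selected_assemblers(flags, megahit_default):
    """Return the set of assemblers that would run for the given flags."""
    megahit = megahit_default  # stands in for the default of $megahit_assemble
    metaspades = False
    for flag in flags:
        if flag == "--megahit":
            megahit = True
        elif flag == "--metaspades":
            metaspades = True
    chosen = set()
    if megahit:
        chosen.add("megahit")
    if metaspades:
        chosen.add("metaspades")
    return chosen


# Current behaviour as I understand it: megahit defaults to on.
current_none = selected_assemblers([], megahit_default=True)
current_spades = selected_assemblers(["--metaspades"], megahit_default=True)

# Proposed behaviour: megahit defaults to off.
proposed_spades = selected_assemblers(["--metaspades"], megahit_default=False)
proposed_none = selected_assemblers([], megahit_default=False)

print(current_none, current_spades, proposed_spades, proposed_none)
```

With the default on, '--metaspades' alone selects both assemblers, so the metaspades-only branch cannot be reached. With the default off, '--metaspades' alone selects only metaspades. In this model, passing no flag at all then selects nothing, so the script would probably also need a fallback to megahit for that case.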
