misassembly correction #7

yongyiyu opened this issue Jan 10, 2019 · 3 comments

@yongyiyu

Hi Fidel,
I created a corrected Hi-C matrix in h5 format with HiCExplorer, but an assertion error now kills the assembly when I run HiCAssembler:
Traceback (most recent call last):
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 312, in <module>
    main(args)
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 308, in main
    chain_file=args.outFolder + "/liftover.chain")
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 218, in save_fasta
    assert(next_contig['start'] - end >= 0)
AssertionError
I'm a little confused after checking the code of assemble. In my data, the misassembly correction may have produced overlapping regions. According to the principle you described, "this means that a contig was split by the misassembly correction but was later joined together", so the overlap case isn't considered, and I think pieces split by this kind of misassembly correction shouldn't be joined.
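
To illustrate the condition that the failing assertion checks, here is a minimal sketch. It is not HiCAssembler's code: the `find_overlaps` helper, the piece names, and the coordinates are hypothetical, and only the `'start'` key and the `next_contig['start'] - end >= 0` condition are taken from the traceback. It assumes the pieces of one contig are sorted by start position.

```python
def find_overlaps(pieces):
    """Return consecutive pairs of pieces whose coordinates overlap.

    `pieces` is a hypothetical list of dicts with 'start' and 'end' keys,
    all from the same original contig and sorted by 'start'.
    """
    overlaps = []
    for prev, next_contig in zip(pieces, pieces[1:]):
        end = prev['end']
        # The assertion in the traceback requires this difference to be >= 0.
        if next_contig['start'] - end < 0:
            overlaps.append((prev, next_contig))
    return overlaps


# Made-up example: the second piece starts before the first one ends.
pieces = [
    {'name': 'contig_1/1', 'start': 0, 'end': 50000},
    {'name': 'contig_1/2', 'start': 45000, 'end': 120000},
    {'name': 'contig_1/3', 'start': 120000, 'end': 200000},
]
bad = find_overlaps(pieces)
for prev, next_contig in bad:
    print(prev['name'], 'overlaps', next_contig['name'])
```

Any pair reported by a check like this would trigger the assertion if the two pieces were joined again.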

Best regards,
yongyi

@fidelram
Contributor

fidelram commented Jan 11, 2019 via email

@yongyiyu
Author

yongyiyu commented Jan 15, 2019

Hi Fidel,
I ran the misassembly correction before using HiCAssembler. When I tested HiCAssembler with the Hi-C matrix that wasn't corrected, it ran successfully. So I think the overlap case may not be handled.
The commands I used are as follows:

hicBuildMatrix --samFiles L3-8_Lib1_Lane1_genome.reads1.bam \
    L3-8_Lib1_Lane1_genome.reads2.bam --binSize 10000 --restrictionSequence GATC --threads 4 \
    --inputBufferSize 100000 --outBam hic.bam -o hic_matrix.h5 --QCfolder ./hicQC

hicCorrectMatrix correct -m hic_matrix.h5 -t -1.2 5 -o hic_corrected_matrix.h5

assemble -f genome.fa -m hic_corrected_matrix.h5 -o ./assembly_output \
    --min_scaffold_length 100000 --bin_size 5000 --misassembly_zscore_threshold -1.0 \
    --num_iterations 3 --num_processors 16

yongyi

@xuxiaoman0212

Hi yongyi,

I got the same error. How did you solve it? @yongyiyu

Thanks a lot!
