WO2024138574A1 - Helicase and use thereof - Google Patents
Helicase and use thereof
- Publication number
- WO2024138574A1 (application PCT/CN2022/143631, CN2022143631W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- phenylalanine
- amino acid
- helicase
- introduces
- protein
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- the helicase used in current commercial nanopore sequencers is the Dda helicase derived from bacteriophage T4, which has poor production yield, stability and salt tolerance.
- high salt inhibits the unwinding activity of the Dda helicase, reducing its unwinding speed and preventing it from fully exerting its unwinding ability, which in turn slows sequencing and reduces sequencing efficiency in nanopore sequencing applications.
- an isolated DNA molecule has (a) a nucleotide sequence encoding the helicase of any one of claims 2 to 3; or (b) a nucleotide sequence that hybridizes with the DNA molecule defined in (a) under stringent conditions; or (c) a nucleotide sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4; or (d) a nucleotide sequence that has more than 70% homology with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function as the helicase.
- nucleic acid control includes controlling the speed of nucleic acid passing through a nanopore, controlling the stability of nucleic acid perforation, or controlling the continuity of nucleic acid perforation; further, the application includes the application of nanosensors and single-molecule nanopore sequencing applications.
- a nanopore sequencing kit comprising a helicase, and the helicase is any of the above-mentioned helicases of the present invention.
- FIG. 12 shows the assay of whether the blocking sequence blocks the unwinding activity of BCH326 protein (high-salt reaction buffer).
- FIG. 14 shows the assay of whether the blocking sequence blocks the unwinding activity of BCH338 protein (high-salt reaction buffer).
- the helicase defined in C) above can improve protein uniformity, thereby improving indicators such as sequencing uniformity, by mutating at least one cysteine on the surface of the helicase to alanine, glutamine, glycine, histidine, isoleucine, leucine, valine, serine, threonine or methionine.
- the helicase defined in C) above includes: a protein in which the C at position 319 of BCH326 is replaced by A, S, T, V, I, L or G; and a protein in which the C at position 326 or 459 of BCH338 is replaced by A, S, T, V, I, L or G.
- the protein is mutated to stably connect the tower domain and the pin domain together, so that the DNA is fixed in the region formed by the two during the sequencing process, thereby improving the stability and sustainability of the sequencing.
- the amino acid mutation at at least one site on the tower domain and/or the pin domain of any of the amino acid sequences in A), B) and C) to cysteine or the introduction of at least one non-natural amino acid includes at least one of the following:
- the amino acid of at least one of E93, I94, R95, P96, D97, I98, N99, E100, F101, G102, E103, R104, I105, F106, V107, P108, K109, L110, R111, D112, M113, and M114 on the pin domain of BCH338 is mutated to cysteine or at least one unnatural amino acid is introduced;
- the above mutations enable chemical linkage between the tower domain and the pin domain, including covalent or non-covalent linkage.
- the protein preferably has 70%, 80%, 90%, 95% or 99% or more homology and the same function as the amino acid sequence of the protein defined in any of A), B) and C).
- the term "homology" used herein has the meaning generally known in the art, and those skilled in the art are also familiar with the rules and standards for determining the homology between different sequences.
- the sequences defined by different degrees of homology in the present invention must also have the helicase function at the same time. Those skilled in the art can obtain such variant sequences under the guidance of the disclosure of this application.
- the non-natural amino acids mentioned include but are not limited to 4-azido-L-phenylalanine, 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselanyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboryl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}prop
- the mutation of the original amino acid to an amino acid with a larger side chain includes at least one of the following: asparagine (N) is replaced by glutamine (Q), histidine (H), arginine (R) or lysine (K); proline (P) is replaced by arginine (R), lysine (K), phenylalanine (F) or leucine (L); histidine (H) is replaced by arginine (R), lysine (K), glutamine (Q), asparagine (N), phenylalanine (F), tyrosine (Y) or tryptophan (W); proline (P) is replaced by arginine (R), lysine (K), glutamine (Q), asparagine (N), phenylalanine (F), tyrosine (Y) or tryptophan (W); Acid (P) is replaced by (i) arginine (R), lysine (K), glutamine (Q), asparagine (N)
- the long side chain amino acids at the binding amino acid sites of the surface and nanopore binding region of this type of helicase can be mutated into short side chain amino acids to reduce the repulsion between the helicase and the nanopore during sequencing.
- Common mutation directions include asparagine (N) replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G); lysine (K) replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G); arginine (R) replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G), etc.
- these amino acid positions include but are not limited to one or more of the following sites: BCH326: M1, E2, S3, K4, I5, N6, L7, T8, E9, D10, Q11, L12, K13, I14, I15, K16, I189, I190, R191, T192, Q193, N194, K195, N196, S197; BCH338: M1, G2, E3, I4, K5, L6, N7, E8, E9, Q10, Q11, K12, K177, I177, L178, R179, T180, K181, N182, L213, I214, D215, H216, F217, H218, V219, Y220, G221, D248, L249, T250, D251, S252, T253, E254, S255.
- an isolated DNA molecule which has (a) a nucleotide sequence encoding any of the above-mentioned helicases; or (b) a nucleotide sequence that hybridizes with the DNA molecule specified in (a) under stringent conditions; or (c) a nucleotide sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4; or (d) a nucleotide sequence that has more than 70% (preferably more than 80%, more preferably more than 85%, further preferably more than 90%, most preferably more than 95%, for example, it can be 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even more than 99.9%) homology with any of the nucleotide sequences specified in (a) to
- the “homology” in the present application refers to the identity between any two nucleotide sequences or amino acid sequences, from the first amino acid to the last amino acid encoded by the corresponding genes.
- isolated in this application means changed “by the hand of man” from its natural state, i.e., if it occurs in nature, it is changed and/or separated from its original environment.
- a polynucleotide or polypeptide naturally present in a living organism is not “isolated”, however, the same polynucleotide or polypeptide separated from its coexisting state in its natural state is “isolated” (as the term is used in this article).
- Deviation from complete complementarity is allowed as long as this deviation does not completely prevent the two molecules from forming a double-stranded structure.
- In order for a DNA molecule to serve as a primer or probe, it only needs sufficient sequence complementarity to form a stable double-stranded structure under the specific solvent and salt concentration used.
- the substantially homologous sequence is a DNA molecule that can specifically hybridize with the complementary strand of another matching DNA molecule under highly stringent conditions.
- Suitable stringent conditions for promoting DNA hybridization, for example treatment with 6.0× sodium chloride/sodium citrate (SSC) at about 45°C followed by washing with 2.0× SSC at 50°C, are well known to those skilled in the art.
- the salt concentration in the washing step can be selected from about 2.0× SSC, 50°C for low stringency conditions to about 0.2× SSC, 50°C for high stringency conditions.
- the temperature in the washing step can be increased from room temperature, about 22°C, for low stringency conditions to about 65°C for high stringency conditions.
- the stringent conditions in the present invention may be specific hybridization with the nucleotide sequence encoding the helicase of the present application in a 6× SSC, 0.5% SDS solution at 65°C, followed by washing the membrane once with 2× SSC, 0.1% SDS and once with 1× SSC, 0.1% SDS.
- a recombinant vector which comprises the above-mentioned DNA molecule, i.e., the helicase expression gene.
- the recombinant vector is selected from a plasmid, a virus or a carrier expression vector, etc.; the recombinant vector comprises a regulatory element for controlling the expression of the above-mentioned DNA molecule; the regulatory element comprises a promoter operably linked to the DNA molecule; preferably, the promoter comprises T7, trc, lac, ara or λL; more preferably, the recombinant vector is selected from plasmid PET.28a(+), PET.21a(+) or PET.32a(+), etc.
- the helicase expression gene is inserted into the recombinant vector, and the helicase expression gene is replicated in large quantities by utilizing the ability of the recombinant vector to replicate itself in large quantities.
- "Recombinant" here refers to genetically engineered DNA prepared by transplanting or splicing a gene from one species into the cells of a host organism of a different species. This DNA becomes part of the host's genetic structure and is replicated.
- a host cell is provided, wherein the host cell is transformed with the above-mentioned DNA molecule or recombinant vector.
- the above recombinant vector is transformed into a host cell, and the host cell is used to replicate, transcribe, and translate the helicase expression gene on the recombinant vector, so that a large amount of helicase can be produced.
- the host cell includes Escherichia coli, which can be BL21 (DE3), BL21Star (DE3) pLysS, Rosetta (DE3), Lemo21 (DE3), etc.
- the helicase of the present invention can be successfully expressed in the Escherichia coli recombinant protein expression system, and the protein is uniform and has high purity.
- this type of helicase shows higher unwinding activity in a high-salt environment than in a low-salt one, binds well to single-stranded DNA, and unwinds double-stranded DNA.
- This type of helicase has strong unwinding activity; the blocking sequence Spacer-18 (Sp18) cannot completely block it.
- This helicase can be used for the control and characterization of nucleic acids and applied to single-molecule nanopore sequencing.
- the full-length DNA sequences of BCH326 and BCH338 were each ligated into the PET.28a(+) plasmid using the NdeI and XhoI restriction sites, so that the expressed BCH326 and BCH338 proteins carry a 6×His tag and a thrombin cleavage site at the N-terminus.
- PET.28a(+)-BCH326 and PET.28a(+)-BCH338 plasmids were transformed into the E. coli expression strain BL21(DE3) or its derivatives.
- Buffer A 20 mM Tris-HCl pH 7.5, 250 mM NaCl, 20 mM imidazole
- Buffer B 20 mM Tris-HCl pH 7.5, 250 mM NaCl, 300 mM imidazole
- Buffer C 20 mM Tris-HCl pH 7.5, 80 mM NaCl
- Buffer D 20 mM Tris-HCl pH 7.5, 1000 mM NaCl
- Buffer E 20 mM Tris-HCl pH 7.5, 200 mM NaCl
- the ssDNA cellulose filler was washed 3-4 times with Buffer C to remove the impurities that were not adsorbed to the ssDNA cellulose filler, and then eluted with buffer D to destroy the specific adsorption of the target protein and the ssDNA filler, and the target protein was eluted into the solution.
- the protein purified by ssDNA cellulose was concentrated in a centrifuge precooled to 4°C using a 30K ultrafiltration concentrator (Merck Millipore); centrifugation at 3000 × g for 10 min was repeated until the final protein volume was 2 mL.
- Detection of the remaining ATP in the reaction: following the manufacturer's instructions, use an ATP detection kit (Beyotime [Biyuntian], S0026B) to determine the remaining ATP concentration in the reaction.
- the experimental results are plotted by calculating the ratio of the measured fluorescence value to the fluorescence value measured by the positive control (due to the sensitivity of the instrument, the negative control group has fluorescence absorption reading). From the experimental results, it can be seen that the negative control group in each experiment remains unchanged during the measurement process, while the fluorescence value of the experimental group gradually increases with the increase of reaction time, indicating that it has the activity of unwinding double-stranded DNA, and the unwinding direction is 5'-3'.
- Dilute protein: dilute BCH326 and BCH338 proteins to 4.8 μM using 1× PBS.
- FIG. 15 shows a schematic diagram of the linker (a: top strand; b: bottom strand).
- FIG. 16 shows a schematic diagram of a sequencing library containing helicase (a: upper strand; b: lower strand; c: double-stranded target fragment; d: helicase; e: cholesterol-labeled double-stranded DNA).
- the sequencing library containing helicase was incubated with single-stranded DNA containing cholesterol at the 5' end (ssDNA-chol, SEQ ID NO.13) at room temperature for 10 minutes.
- the ssDNA-chol sequence is complementary to a part of the bottom strand of the adapter. After cholesterol binds to the phospholipid membrane, it can reduce the amount of library loading and increase the capture rate.
- a Teflon membrane with a micrometer-sized hole in the middle divides the electrolytic cell into two chambers, the cis chamber and the trans chamber; a pair of Ag/AgCl electrodes are placed in each of the cis chamber and the trans chamber; a layer of bimolecular phospholipid membrane is formed at the micropores of the two chambers, and the nanopore protein CsgG-Eco-(Y51A/F56Q/R97W/R192D-StrepII(C)) is added; electrical measurements are obtained after a single nanopore protein is inserted into the phospholipid membrane; the reaction product of step 3 is added, 180 mV is applied, the sequencing library is captured by the nanopore, and the nucleic acid passes through the nanopore under the control of the helicase.
- the buffer used in this experiment was: 0.47M KCl, 25mM
- the sequencing experiment was performed using BCH326, and the sequencing electrical signal is shown in Figure 18.
- the sequencing experiment was performed using BCH338, and the sequencing electrical signal is shown in Figure 19.
- the results show that as the helicase controls the DNA single strand to enter the nanopore, part of the current is blocked and the current becomes smaller. Since different nucleotides have different sizes, the size of the blocked current is also different, so a fluctuating current signal can be seen. And both Figures 18 and 19 have complete connector signals and reply signals, and the signal-to-noise ratio is high, indicating that the stability of the sequencing signal is good.
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Medicinal Chemistry (AREA)
- Physics & Mathematics (AREA)
- Analytical Chemistry (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Plant Pathology (AREA)
- Immobilizing And Processing Of Enzymes And Microorganisms (AREA)
Abstract
Disclosed in the present invention are a helicase and a use thereof. The helicase has two tower domains and a PIN domain, and the two tower domains are located on the same side of a helicase three-dimensional structure. According to the technical solution of the present invention, a brand-new helicase BCH3X having a special helix characteristic domain is provided, and the helicase has good salt tolerance and stability, can have high unwinding activity in the case of a high salt content, can be used for nucleic acid control and characterization, and is applied to nanopore sequencing.
Description
The present invention relates to the field of biotechnology, and in particular to a helicase and use thereof.
As an emerging single-molecule sequencing technology, nanopore sequencing has brought disruptive change to the gene sequencing industry through its distinctive advantages: high throughput, long read length, high speed, in situ detection and label-free operation. The technology does not rely on imaging equipment for detection, so the system can be shrunk to a portable scale to suit different sequencing scenarios. Because it sequences directly without amplification, it places no limit on the length of DNA that can be sequenced, allows real-time base calling, and can also directly sequence RNA, molecules carrying modifications such as methylation, and other single molecules. Nanopore sequencing has broad application value in many fields, including molecular biology, medicine, epidemiology and ecology, for example genome mapping, surveillance of epidemics and other infectious diseases, detection of rare species, identification of hidden intermediates, kinetic monitoring of non-covalent biological interactions, characterization of epigenetic and post-translational modifications, and fast, inexpensive protein sequencing.
Nanopore sequencing is based on electrical signals. A nanopore (protein or solid-state) inserted in a membrane acts as the signal sensor and separates two chambers filled with electrolyte. When a voltage is applied across the two chambers, a stable current flows through the pore. When an analyte molecule enters the nanopore, it obstructs the flow of ions and causes the current signal to fluctuate, and different bases affect the current differently. By detecting the current fluctuations of the nanopore in real time and using machine learning to analyze and decode them, the analyte molecule can be sequenced in real time.
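The idea that an analyte in the pore lowers the ionic current can be illustrated with a toy sketch. It flags stretches of a current trace that fall below a fraction of the open-pore current. The function name `find_blockades`, the 0.8 threshold and the sample trace are all assumptions made for illustration; the application does not describe its signal-processing method, and real base calling relies on trained models rather than a fixed threshold.

```python
from typing import List, Tuple


def find_blockades(current_pa: List[float], open_pore_pa: float,
                   threshold: float = 0.8) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs where the current stays below
    threshold * open_pore_pa. The end index is exclusive.

    Hypothetical helper: the threshold rule is an assumption for this sketch.
    """
    limit = threshold * open_pore_pa
    events: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(current_pa):
        if value < limit and start is None:
            start = i  # current dropped: a blockade begins
        elif value >= limit and start is not None:
            events.append((start, i))  # current recovered: blockade ends
            start = None
    if start is not None:
        # Trace ended while the pore was still blocked.
        events.append((start, len(current_pa)))
    return events


# Made-up trace in picoamperes, not measured data.
trace = [100.0, 101.0, 62.0, 55.0, 70.0, 99.0, 100.0, 48.0, 52.0, 100.0]
print(find_blockades(trace, open_pore_pa=100.0))  # [(2, 5), (7, 9)]
```

Within each flagged stretch, the varying current levels (62, 55, 70 in the made-up trace) stand in for the base-dependent fluctuations that a decoder would interpret.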
During sequencing, nucleic acid molecules pass through the nanopore channel so quickly that polynucleotide sequence information cannot be obtained accurately. Slowing and controlling the translocation of nucleic acid molecules through the pore is therefore a key technical problem for nanopore sequencing. At present, the most common effective approach is to use helicase unwinding to control this translocation and improve detection accuracy. To maintain sequencing speed and uniformity, the helicase must also have good salt tolerance and thermal stability in high-salt electrolyte.
The helicase in current commercial nanopore sequencers is the Dda helicase derived from bacteriophage T4, which has poor yield, stability and salt tolerance. Salt tolerance is the main weakness: high salt inhibits the unwinding activity of the Dda helicase, slowing unwinding and preventing the enzyme from reaching its full unwinding capacity, which in turn lowers sequencing speed and efficiency in nanopore sequencing applications.
Summary of the invention
The present invention aims to provide a helicase and use thereof, so as to solve the technical problem that prior-art helicases have poor salt tolerance.
To achieve the above object, according to one aspect of the present invention, a helicase is provided. The helicase has two tower domains and one pin domain, and the two tower domains are located on the same side of the three-dimensional structure of the helicase.
Further, the helicase includes at least one of the following: A) BCH326, a protein having the amino acid sequence shown in SEQ ID NO: 1; B) BCH338, a protein having the amino acid sequence shown in SEQ ID NO: 3; C) a protein in which at least one cysteine on the surface of the protein defined in A) or B) is mutated to alanine, glutamine, glycine, histidine, isoleucine, leucine, valine, serine, threonine or methionine; D) a protein that has DNA unwinding ability and in which the amino acid at at least one site on the tower domain and/or pin domain of the amino acid sequence of any protein defined in A), B) and C) is mutated to cysteine, or into which at least one non-natural amino acid is introduced; and E) a protein having more than 70% homology with the amino acid sequence of any protein defined in A), B), C) and D) and having the same function.
Further, C) includes: a protein in which the C at position 319 of BCH326 is replaced by A, S, T, V, I, L or G; and a protein in which the C at position 326 or 459 of BCH338 is replaced by A, S, T, V, I, L or G.
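The substitution notation used here (original residue, 1-based position, replacement residue, as in C319A) can be sketched as a small helper. The function `apply_substitution` and the five-residue toy sequence are hypothetical and exist only for this illustration; the sequence is not SEQ ID NO: 1 or SEQ ID NO: 3.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution such as 'C319A' to a protein sequence.

    Positions are 1-based. The residue found at the position must match
    the original residue named in the mutation string.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    original, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert to 0-based index
    if index < 0 or index >= len(sequence):
        raise ValueError(f"position out of range in {mutation!r}")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {index + 1}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy placeholder sequence, not a sequence from the application.
toy = "MKCLA"
print(apply_substitution(toy, "C3A"))  # MKALA
```

Checking the original residue before substituting guards against off-by-one numbering errors, which matter when positions such as 319, 326 or 459 are counted against a full-length sequence.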
Further, in D), the mutation of the amino acid at at least one site on the tower domain and/or the pin domain to cysteine, or the introduction of at least one non-natural amino acid, includes at least one of the following: the amino acid at at least one of the sites S389, R340, K341, S342, N343, K343, S344, I345, V346, I347, D348, K349, D350, G351, K352, A353, K354, E355, F356, L357, R358, K359, F360, L361, N362, F363, A364, K365, I366, Y367, N368, F369, T370, N371, K372, G373, G374, H378, G379, R380, R381, I382, T383, K384, K385, S386, K387, K388, E389, L390 and W391 on the tower domain of BCH326 is mutated to cysteine, or at least one non-natural amino acid is introduced there; the amino acid at at least one of the sites D87, I88, G89, T90, I91, H92, S93, Y94, F95, D96, I97, K98, P99, D100, I101, D102, D103, N104, G105, N106, R107, V108, F109, K110, P111 or S112 on the pin domain of BCH326 is mutated to cysteine, or at least one non-natural amino acid is introduced there; the amino acid at at least one of the sites S405, K406, F407, L408, V409, P410, L411, G412, D413, G414, S415, K416, E417, D418, L419, F420, P421, L422, Y423, K424, E425, A426, V427, F428, D429, I430, A431, K432, T433, M434, N435, N436, Q437, R438, K439, I440, S441, K442, N443, S444, K445, K446, N447, F448 or W449 on the tower domain of BCH338 is mutated to cysteine, or at least one non-natural amino acid is introduced there; the amino acid at at least one of the sites E93, I94, R95, P96, D97, I98, N99, E100, F101, G102, E103, R104, I105, F106, V107, P108, K109, L110, R111, D112, M113 and M114 on the pin domain of BCH338 is mutated to cysteine, or at least one non-natural amino acid is introduced there. Preferably, in E), the protein has 70%, 80%, 90%, 95% or 99% or more homology with, and the same function as, the amino acid sequence of any protein defined in A), B), C) and D).
Further, the non-natural amino acid is selected from at least one of 4-azido-L-phenylalanine, 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselanyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboryl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}propanoic acid, O-methyl-L-tyrosine, 4-amino-L-phenylalanine, 4-cyano-L-phenylalanine, 3-cyano-L-phenylalanine, 4-fluoro-L-phenylalanine, 4-iodo-L-phenylalanine, 4-bromo-L-phenylalanine, O-(trifluoromethyl)tyrosine, 4-nitro-L-phenylalanine, 3-hydroxy-L-tyrosine, 3-amino-L-tyrosine, 3-iodo-L-tyrosine, 4-isopropyl-L-phenylalanine, 3-(2-naphthyl)-L-alanine, 4-phenyl-L-phenylalanine, (2S)-2-amino-3-(naphthalen-2-ylamino)propanoic acid, 6-(methylsulfanyl)norleucine, 6-oxo-L-lysine, D-tyrosine, (2R)-2-hydroxy-3-(4-hydroxyphenyl)propanoic acid, (2R)-2-aminooctanoate, 3-(2,2'-bipyridin-5-yl)-D-alanine, 2-amino-3-(8-hydroxy-3-quinolyl)propanoic acid, 4-benzoyl-L-phenylalanine, S-(2-nitrobenzyl)cysteine, (2R)-2-amino-3-[(2-nitrobenzyl)sulfanyl]propanoic acid, (2S)-2-amino-3-[(2-nitrobenzyl)oxy]propanoic acid, O-(4,5-dimethoxy-2-nitrobenzyl)-L-serine, (2S)-2-amino-6-({[(2-nitrobenzyl)oxy]carbonyl}amino)hexanoic acid, O-(2-nitrobenzyl)-L-tyrosine and 2-nitrophenylalanine. Preferably, the introduction of at least one non-natural amino acid into BCH326 includes at least one of the following: introducing 4-azido-L-phenylalanine at D100, I101, D102, D103, N104, G105, N106 or R107; and introducing 4-acetyl-L-phenylalanine at D103, G105 or N106. Preferably, the introduction of at least one non-natural amino acid into BCH338 includes at least one of the following: introducing 4-azido-L-phenylalanine at A431, K432, T433, M434, N435, S441, K442, N443 or S444.
Further, the helicase has an amino acid mutation at at least one site among the amino acid sites of the DNA binding region and/or the amino acid sites near the ATP catalytic active center, and the mutation includes mutating the original amino acid to an amino acid with a larger side chain. Preferably, mutating the original amino acid to an amino acid with a larger side chain includes at least one of the following: asparagine is replaced by glutamine, histidine, arginine or lysine; proline is replaced by arginine, lysine, phenylalanine or leucine; histidine is replaced by arginine, lysine, glutamine, asparagine, phenylalanine, tyrosine or tryptophan; proline is replaced by arginine, lysine, glutamine, asparagine or histidine; phenylalanine is replaced by arginine, lysine, histidine, tyrosine or tryptophan; isoleucine is replaced by phenylalanine, tryptophan, histidine, lysine or arginine; tyrosine is replaced by arginine, lysine or tryptophan. The amino acid sites of the DNA binding region of BCH326 include L157, V160, L294, G296, N299, L303, A304, I328, F329, T330, N331, G332, G333 and E334, and its amino acid sites near the ATP catalytic active center include K211, E212, E213, N214, Y215, K216, A217, P218, L219, K220, D221, I222, N223 and N224. The amino acid sites of the DNA binding region of BCH338 include H89, S90, Y91, F92, E93, I94, R95 and P96, and its amino acid sites near the ATP catalytic active center include Y152, Q153, L154, P155, P156, V157, F193, L194, I195, K196, E197, Y198, E199, E200 and N201.
Further, the amino acids on the surface of the helicase that interact with the nanopore binding region have a mutation at at least one site, and the mutation includes mutating the original amino acid to an amino acid with a shorter side chain. Preferably, mutating the original amino acid to an amino acid with a shorter side chain includes: asparagine is replaced by isoleucine, valine, leucine, alanine, serine or glycine; lysine is replaced by isoleucine, valine, leucine, alanine, serine or glycine; arginine is replaced by isoleucine, valine, leucine, alanine, serine or glycine. Preferably, the amino acids on the surface of BCH326 that interact with the nanopore binding region include M1, E2, S3, K4, I5, N6, L7, T8, E9, D10, Q11, L12, K13, I14, I15, K16, I189, I190, R191, T192, Q193, N194, K195, N196 and S197; the amino acids on the surface of BCH338 that interact with the nanopore binding region include M1, G2, E3, I4, K5, L6, N7, E8, E9, Q10, Q11, K12, K177, I177, L178, R179, T180, K181, N182, L213, I214, D215, H216, F217, H218, V219, Y220, G221, D248, L249, T250, D251, S252, T253, E254 and S255.
According to another aspect of the present invention, an isolated DNA molecule is provided. The DNA molecule has (a) a nucleotide sequence encoding the helicase of any one of claims 2 to 3; or (b) a nucleotide sequence that hybridizes under stringent conditions with the DNA molecule defined in (a); or (c) the nucleotide sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4; or (d) a nucleotide sequence that has 70% or more homology with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function as the helicase.
Further, the DNA molecule has a nucleotide sequence that has 75% or more, preferably 85% or more, more preferably 95% or more, and further preferably 99% or more homology with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function.
According to a further aspect of the present invention, a recombinant vector is provided. The recombinant vector comprises any one of the above DNA molecules.
Further, the recombinant vector is selected from a plasmid, a virus or a carrier expression vector; further, the recombinant vector includes a regulatory element for controlling expression of the DNA molecule; still further, the regulatory element includes a promoter operably linked to the DNA molecule; preferably, the promoter includes T7, trc, lac, ara or λL; more preferably, the recombinant vector is selected from the plasmids pET-28a(+), pET-21a(+) or pET-32a(+).
According to yet another aspect of the present invention, a host cell is provided. The host cell comprises any one of the above DNA molecules of the present invention, or any one of the above recombinant vectors of the present invention.
Further, the host cell includes Escherichia coli; preferably, the host cell includes BL21(DE3), BL21Star(DE3)pLysS, Rosetta(DE3) or Lemo21(DE3).
According to a further aspect of the present invention, use of the above helicase in nucleic acid control or characterization is provided. Further, nucleic acid control includes controlling the speed at which a nucleic acid passes through a nanopore, controlling the stability of nucleic acid translocation through the pore, or controlling the continuity of nucleic acid translocation through the pore; still further, the use includes nanosensor applications and single-molecule nanopore sequencing applications.
According to yet another aspect of the present invention, a nanopore sequencing kit is provided. The kit comprises a helicase, and the helicase is any one of the above helicases of the present invention.
According to a further aspect of the present invention, a nanopore sequencing method is provided. The method comprises sequencing a nucleic acid molecule to be sequenced under the control of a helicase, and the helicase is any one of the above helicases of the present invention.
The technical solution of the present invention provides a new class of helicase, BCH3X, with a distinctive helical domain. It has good salt tolerance and stability, retains high unwinding activity at high salt concentrations, and can be used for the control and characterization of nucleic acids and applied to nanopore sequencing.
The accompanying drawings, which form a part of the present application, are provided for further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 shows the Superdex 200 size-exclusion purification results for BCH326: (A) Superdex 200 elution profile of BCH326; (B) gel image of the BCH326 size-exclusion elution fractions.
FIG. 2 shows the Superdex 200 size-exclusion purification results for BCH338: (A) Superdex 200 elution profile of BCH338; (B) gel image of the BCH338 size-exclusion elution fractions.
FIG. 3 shows the AlphaFold2-predicted structure of BCH326.
FIG. 4 shows the AlphaFold2-predicted structure of BCH338.
FIG. 5 shows the ATPase activity assay of BCH326 protein.
FIG. 6 shows the ATPase activity assay of BCH338 protein.
FIG. 7 shows the dsDNA unwinding activity assay of BCH326 protein (low-salt reaction buffer).
FIG. 8 shows the dsDNA unwinding activity assay of BCH326 protein (high-salt reaction buffer).
FIG. 9 shows the dsDNA unwinding activity assay of BCH338 protein (low-salt reaction buffer).
FIG. 10 shows the dsDNA unwinding activity assay of BCH338 protein (high-salt reaction buffer).
FIG. 11 shows the assay of stalling-sequence blockage of BCH326 unwinding activity (low-salt reaction buffer).
FIG. 12 shows the assay of stalling-sequence blockage of BCH326 unwinding activity (high-salt reaction buffer).
FIG. 13 shows the assay of stalling-sequence blockage of BCH338 unwinding activity (low-salt reaction buffer).
FIG. 14 shows the assay of stalling-sequence blockage of BCH338 unwinding activity (high-salt reaction buffer).
FIG. 15 shows a schematic diagram of the adapter (a: top strand; b: bottom strand).
FIG. 16 shows a schematic diagram of a sequencing library containing the helicase (a: top strand; b: bottom strand; c: double-stranded target fragment; d: helicase; e: cholesterol-labeled double-stranded DNA).
FIG. 17 shows a schematic diagram of the patch-clamp amplifier.
FIG. 18 shows the sequencing current signal trace for BCH326.
FIG. 19 shows the sequencing current signal trace for BCH338.
FIG. 20 shows the crystal structure of Dda.
It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in combination with the embodiments.
According to a typical embodiment of the present application, a class of helicase is provided. The helicase has two tower domains and one pin domain, and the two tower domains are located on the same side of the three-dimensional structure of the helicase. During nanopore sequencing with a helicase of this structure, the tower domains and the pin domain can be cross-linked to form a DNA-binding region. This facilitates speed-controlled sequencing, increases the continuity and stability of sequencing, and prevents the fluctuations in the sequencing signal that arise when the DNA slips off or fluctuates during sequencing. The helicase also has high salt tolerance, exerts good unwinding ability at high salt concentrations, and improves sequencing efficiency.
According to a typical embodiment of the present application, a helicase is provided whose gene is derived from a deep-sea metagenome and which has high salt tolerance. The helicase includes at least one of the following: A) BCH326, a protein having the amino acid sequence shown in SEQ ID NO: 1; B) BCH338, a protein having the amino acid sequence shown in SEQ ID NO: 3; C) a protein in which at least one cysteine on the surface of the protein defined in A) or B) is mutated to alanine, glutamine, glycine, histidine, isoleucine, leucine, valine, serine, threonine or methionine; D) a protein in which the amino acid at at least one site in the tower domain and/or pin domain of the amino acid sequence of the protein defined in any one of A), B) and C) is mutated to cysteine, or into which at least one non-natural amino acid is introduced, and which has DNA unwinding ability; and E) a protein that has 70% or more homology with the amino acid sequence of the protein defined in any one of A), B), C) and D) and has the same function.
The helicase defined in A) or B) above is derived from a deep-sea metagenome and has high salt tolerance.
For the helicase defined in C) above, mutating at least one cysteine on the surface of the helicase to alanine, glutamine, glycine, histidine, isoleucine, leucine, valine, serine, threonine or methionine can improve the homogeneity of the protein and thereby improve sequencing metrics such as uniformity.
According to a typical embodiment of the present invention, the helicase defined in C) above includes: a protein in which the C at position 319 of BCH326 is replaced by A, S, T, V, I, L or G; and a protein in which the C at position 326 or position 459 of BCH338 is replaced by A, S, T, V, I, L or G.
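The substitutions above follow the usual "original residue, 1-based position, new residue" pattern (for example, C319A for BCH326). The following is a minimal Python sketch of how such a label could be applied to a sequence while checking that the expected original residue is actually present. The function name `apply_substitution` and the five-residue toy sequence are illustrative assumptions; SEQ ID NO: 1 and SEQ ID NO: 3 are not reproduced in this section, so no real helicase sequence is used.

```python
import re

# One-letter codes of the 20 standard amino acids.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution written as <old><1-based position><new>, e.g. 'C319A'.

    Hypothetical helper for illustration only; not part of the patent.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if old not in STANDARD_AA or new not in STANDARD_AA:
        raise ValueError(f"non-standard residue code in {mutation!r}")
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    # Refuse to mutate if the sequence does not carry the stated original residue.
    if sequence[pos - 1] != old:
        raise ValueError(
            f"expected {old} at position {pos}, found {sequence[pos - 1]}"
        )
    return sequence[: pos - 1] + new + sequence[pos:]


# Toy sequence (an assumption, not a real helicase): cysteine at position 3.
toy_sequence = "MKCLE"
print(apply_substitution(toy_sequence, "C3A"))  # MKALE
```

Checking the original residue before substituting is a deliberate choice: it catches numbering offsets (for example, a tag added at the N-terminus) before a construct is designed.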
According to a typical embodiment of the present invention, starting from the sequence defined in A) or B), the protein is mutated so that the tower domain and the pin domain are stably connected. The DNA is then held in the region formed by the two domains during sequencing, which improves the stability and continuity of sequencing. For example, mutating the amino acid at at least one site in the tower domain and/or pin domain of any one of the amino acid sequences in A), B) and C) to cysteine, or introducing at least one non-natural amino acid, includes at least one of the following:
in the tower domain of BCH326, the amino acid at at least one of the sites S389, R340, K341, S342, N343, K343, S344, I345, V346, I347, D348, K349, D350, G351, K352, A353, K354, E355, F356, L357, R358, K359, F360, L361, N362, F363, A364, K365, I366, Y367, N368, F369, T370, N371, K372, G373, G374, H378, G379, R380, R381, I382, T383, K384, K385, S386, K387, K388, E389, L390 and W391 is mutated to cysteine, or at least one non-natural amino acid is introduced;
in the pin domain of BCH326, the amino acid at at least one of the sites D87, I88, G89, T90, I91, H92, S93, Y94, F95, D96, I97, K98, P99, D100, I101, D102, D103, N104, G105, N106, R107, V108, F109, K110, P111 or S11 is mutated to cysteine, or at least one non-natural amino acid is introduced;
in the tower domain of BCH338, the amino acid at at least one of the sites S405, K406, F407, L408, V409, P410, L411, G412, D413, G414, S415, K416, E417, D418, L419, F420, P421, L422, Y423, K424, E425, A426, V427, F428, D429, I430, A431, K432, T433, M434, N435, N436, Q437, R438, K439, I440, S441, K442, N443, S444, K445, K446, N447, F448 or W449 is mutated to cysteine, or at least one non-natural amino acid is introduced;
in the pin domain of BCH338, the amino acid at at least one of the sites E93, I94, R95, P96, D97, I98, N99, E100, F101, G102, E103, R104, I105, F106, V107, P108, K109, L110, R111, D112, M113 and M114 is mutated to cysteine, or at least one non-natural amino acid is introduced.
The above mutations allow a chemical connection, whether covalent or non-covalent, to form between the tower domain and the pin domain.
In addition, in E), the protein preferably has 70%, 80%, 90%, 95% or 99% or more homology with the amino acid sequence of the protein defined in any one of A), B) and C), and has the same function. The term "homology" as used herein has the meaning generally known in the art, and those skilled in the art are familiar with the rules and criteria for determining homology between different sequences. Sequences defined in the present invention by a given degree of homology must also have helicase function. Those skilled in the art can obtain such variant sequences under the teaching of the present disclosure.
According to a typical embodiment of the present invention, the non-natural amino acids include, but are not limited to, 4-azido-L-phenylalanine, 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselanyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboryl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}propanoic acid, O-methyl-L-tyrosine, 4-amino-L-phenylalanine, 4-cyano-L-phenylalanine, 3-cyano-L-phenylalanine, 4-fluoro-L-phenylalanine, 4-iodo-L-phenylalanine, 4-bromo-L-phenylalanine, O-(trifluoromethyl)tyrosine, 4-nitro-L-phenylalanine, 3-hydroxy-L-tyrosine, 3-amino-L-tyrosine, 3-iodo-L-tyrosine, 4-isopropyl-L-phenylalanine, 3-(2-naphthyl)-L-alanine, 4-phenyl-L-phenylalanine, (2S)-2-amino-3-(naphthalen-2-ylamino)propanoic acid, 6-(methylsulfanyl)norleucine, 6-oxo-L-lysine, D-tyrosine, (2R)-2-hydroxy-3-(4-hydroxyphenyl)propanoic acid, (2R)-2-aminooctanoate, 3-(2,2′-bipyridin-5-yl)-D-alanine, 2-amino-3-(8-hydroxy-3-quinolyl)propanoic acid, 4-benzoyl-L-phenylalanine, S-(2-nitrobenzyl)cysteine, (2R)-2-amino-3-[(2-nitrobenzyl)sulfanyl]propanoic acid, (2S)-2-amino-3-[(2-nitrobenzyl)oxy]propanoic acid, O-(4,5-dimethoxy-2-nitrobenzyl)-L-serine, (2S)-2-amino-6-({[(2-nitrobenzyl)oxy]carbonyl}amino)hexanoic acid, O-(2-nitrobenzyl)-L-tyrosine and 2-nitrophenylalanine.
In one embodiment of the present invention, the above positions are mutated to cysteine or have a non-natural amino acid introduced, and the tower domain and the pin domain of the helicase are connected, preferably covalently, through one or more ends of one or more linkers. If only one end is covalently attached, the one or more linkers can transiently connect the two or more cysteines and/or non-natural amino acids. If two or all ends are covalently attached, the one or more linkers permanently connect the two or more cysteines and/or non-natural amino acids. Here, a linker is a medium capable of producing a covalent or non-covalent interaction. It may be a commercial cross-linking agent, a small protein, a polypeptide, a synthetic small molecule, and so on. It is used to connect the tower domain and the pin domain so that, after the DNA binds to the helicase, the two domains are joined and the DNA does not leave this region while moving through the pore, which improves the stability and continuity of nanopore sequencing.
One covalent connection approach is to cross-link the tower domain or pin domain of this class of helicase with a cross-linking agent during sequencing, thereby improving the continuity and stability of sequencing. Suitable chemical cross-linking agents are well known in the art. They include, but are not limited to, cross-linking agents containing the following functional groups: maleimides, active esters, succinimides, azides, alkynes (for example, dibenzocyclooctyne (DIBO or DBCO), difluorocycloalkynes and linear alkynes), phosphines (for example, those used in traceless and non-traceless Staudinger ligation), haloacetyls (for example, iodoacetamide), phosgene-type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazines, disulfides, vinyl sulfones, aziridines and photoreactive reagents (for example, aryl azides and diaziridines).
In addition, the present invention can also mutate the amino acid sites in the DNA-binding region of this class of helicase and the amino acid sites near the ATP catalytic active center. The mutation directions include, but are not limited to: mutation to amino acids with larger side chains, which increases (i) electrostatic interactions, (ii) hydrogen bonding and/or (iii) cation-π interactions between at least one amino acid and one or more nucleotides in the ssDNA; and substitution that adds positively charged amino acids, which reduces the repulsion between the motor protein and the pore.
Here, mutating the original amino acid to an amino acid with a larger side chain includes at least one of the following: asparagine (N) is replaced by glutamine (Q), histidine (H), arginine (R) or lysine (K); proline (P) is replaced by arginine (R), lysine (K), phenylalanine (F) or leucine (L); histidine (H) is replaced by arginine (R), lysine (K), glutamine (Q), asparagine (N), phenylalanine (F), tyrosine (Y) or tryptophan (W); proline (P) is replaced by arginine (R), lysine (K), glutamine (Q), asparagine (N) or histidine (H); phenylalanine (F) is replaced by arginine (R), lysine (K), histidine (H), tyrosine (Y) or tryptophan (W); isoleucine (I) is replaced by phenylalanine (F), tryptophan (W), histidine (H), lysine (K) or arginine (R); tyrosine (Y) is replaced by arginine (R), lysine (K) or tryptophan (W); and so on.
These sites include, but are not limited to, one or more of the following. DNA-binding region: for BCH326, L157, V160, L294, G296, N299, L303, A304, I328, F329, T330, N331, G332, G333 and E334; for BCH338, H89, S90, Y91, F92, E93, I94, R95 and P96. Amino acid sites near the ATP catalytic active center: for BCH326, K211, E212, E213, N214, Y215, K216, A217, P218, L219, K220, D221, I222, N223 and N224; for BCH338, Y152, Q153, L154, P155, P156, V157, F193, L194, I195, K196, E197, Y198, E199, E200 and N201. In the present invention, an amino acid site near the ATP catalytic active center is, as those skilled in the art will understand, an amino acid site that lies around the ATP catalytic active center and influences it.
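The larger-side-chain substitution directions listed above can be read as a lookup table from the original residue to its permitted replacements. The Python sketch below encodes that table and checks whether a proposed substitution is among the listed directions. The names `LARGER_SIDE_CHAIN` and `is_listed_substitution` are illustrative assumptions; the two proline entries in the text are merged into one set, and because the text states that the directions are not exhaustive, a negative result only means "not explicitly listed".

```python
# Substitution directions toward larger side chains, transcribed from the text.
# Keys are original residues; values are the listed replacement residues.
LARGER_SIDE_CHAIN = {
    "N": set("QHRK"),
    "P": set("RKFL") | set("RKQNH"),  # the two proline entries, merged
    "H": set("RKQNFYW"),
    "F": set("RKHYW"),
    "I": set("FWHKR"),
    "Y": set("RKW"),
}


def is_listed_substitution(original: str, replacement: str) -> bool:
    """Return True if original -> replacement is one of the listed directions.

    Hypothetical helper for illustration only; a False result does not mean
    the substitution is excluded, since the listed directions are examples.
    """
    return replacement in LARGER_SIDE_CHAIN.get(original, set())


# N299 lies in the listed DNA-binding region of BCH326; N -> Q is a listed direction.
print(is_listed_substitution("N", "Q"))
# Y -> F is not among the listed directions for tyrosine.
print(is_listed_substitution("Y", "F"))
```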
According to one embodiment of the present invention, the amino acid sites on the surface of this class of helicase that bind to the nanopore binding region can be mutated, with long-side-chain amino acids changed to short-side-chain amino acids, to reduce the repulsion between the helicase and the nanopore during sequencing.
Common mutation directions include: asparagine (N) is replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G); lysine (K) is replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G); arginine (R) is replaced by isoleucine (I), valine (V), leucine (L), alanine (A), serine (S) or glycine (G); and so on. Preferably, these amino acid sites include, but are not limited to, one or more of the following. For BCH326: M1, E2, S3, K4, I5, N6, L7, T8, E9, D10, Q11, L12, K13, I14, I15, K16, I189, I190, R191, T192, Q193, N194, K195, N196 and S197. For BCH338: M1, G2, E3, I4, K5, L6, N7, E8, E9, Q10, Q11, K12, K177, I177, L178, R179, T180, K181, N182, L213, I214, D215, H216, F217, H218, V219, Y220, G221, D248, L249, T250, D251, S252, T253, E254 and S255.
According to a typical embodiment of the present application, an isolated DNA molecule is provided. The DNA molecule has (a) a nucleotide sequence encoding any one of the above helicases; or (b) a nucleotide sequence that hybridizes under stringent conditions with the DNA molecule defined in (a); or (c) the nucleotide sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4; or (d) a nucleotide sequence that has 70% or more homology (preferably 80% or more, more preferably 85% or more, further preferably 90% or more, most preferably 95% or more; for example 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7% or 99.8% or more, or even 99.9% or more) with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function as the above helicase.
It should be noted that "homology" in the present application refers to the identity found when any two nucleotide sequences or amino acid sequences are compared over their full length, from the first amino acid to the last amino acid encoded by the corresponding genes.
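As an illustration of this definition, the following Python sketch computes percent identity over the full length of two sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice to count every alignment column in the denominator are assumptions made for this example; the application does not specify an alignment tool or a denominator convention, and other conventions give somewhat different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the full length of two pre-aligned sequences.

    Hypothetical helper for illustration only. Columns where both sequences
    carry a gap are ignored; a residue aligned to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy sequences (assumptions, not real helicase sequences).
print(percent_identity("MKCLE", "MKALE"))  # 80.0
print(percent_identity("MK-LE", "MKALE"))  # 80.0
```

Under this convention, a variant meeting the 70% threshold must match the reference in at least 70% of the alignment columns from the first to the last residue.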
"Isolated" in the present application means altered "by the hand of man" from the natural state; that is, if the material occurs in nature, it has been changed and/or removed from its original environment. For example, a polynucleotide or polypeptide naturally present in a living organism is not "isolated", whereas the same polynucleotide or polypeptide separated from the materials with which it coexists in its natural state is "isolated" as the term is used herein.
In the present invention, a DNA molecule that hybridizes "under stringent conditions" with the helicase-encoding gene of the present invention refers to conditions under which the presence of the helicase-encoding gene can be identified by nucleic acid hybridization. In the present invention, two DNA molecules are said to be capable of specifically hybridizing with each other if they can form an antiparallel double-stranded nucleic acid structure. If two DNA molecules show complete complementarity, one is said to be the "complement" of the other. Two DNA molecules are said to be "complementary" if they can hybridize with each other with sufficient stability to anneal and remain bound to each other under conventional "high-stringency" conditions. Departures from complete complementarity are permissible as long as they do not completely prevent the two molecules from forming a double-stranded structure. For a DNA molecule to serve as a primer or probe, it need only be sufficiently complementary in sequence to form a stable double-stranded structure under the particular solvent and salt concentration used.
In the present invention, a substantially homologous sequence is a DNA molecule that can specifically hybridize, under high-stringency conditions, with the complementary strand of another matching DNA molecule. Suitable stringent conditions that promote DNA hybridization, for example treatment with 6.0× sodium chloride/sodium citrate (SSC) at about 45°C followed by a wash with 2.0× SSC at 50°C, are well known to those skilled in the art. For example, the salt concentration in the wash step can range from about 2.0× SSC at 50°C (low stringency) to about 0.2× SSC at 50°C (high stringency). In addition, the temperature in the wash step can be raised from room temperature, about 22°C (low stringency), to about 65°C (high stringency). The temperature and salt concentration may both be varied, or one may be held constant while the other is changed. Preferably, the stringent conditions in the present invention may be specific hybridization with the nucleotide sequence encoding the helicase of the present application in a solution of 6× SSC and 0.5% SDS at 65°C, followed by washing the membrane once with 2× SSC, 0.1% SDS and once with 1× SSC, 0.1% SDS.
According to a typical embodiment of the present application, a recombinant vector is provided. The recombinant vector comprises the above DNA molecule, that is, the helicase expression gene. The recombinant vector is selected from a plasmid, a virus, a carrier expression vector and the like; the recombinant vector includes a regulatory element for controlling expression of the above DNA molecule; the regulatory element includes a promoter operably linked to the DNA molecule; preferably, the promoter includes T7, trc, lac, ara or λL; more preferably, the recombinant vector is selected from the plasmids pET-28a(+), pET-21a(+), pET-32a(+) and the like.
The helicase expression gene is inserted into the recombinant vector, and the vector's capacity for extensive self-replication is used to replicate the helicase expression gene in large quantities. "Recombinant" here refers to genetically engineered DNA prepared by transplanting or splicing a gene from one species into the cells of a host organism of a different species. Such DNA becomes part of the host's genetic makeup and is replicated.
According to a typical embodiment of the present application, a host cell is provided. The host cell is transformed with the above DNA molecule or recombinant vector.
The above recombinant vector is transformed into the host cell, and the host cell replicates, transcribes and translates the helicase expression gene on the recombinant vector, so that the helicase can be produced in large quantities. The host cell includes Escherichia coli and may be BL21(DE3), BL21Star(DE3)pLysS, Rosetta(DE3), Lemo21(DE3) and the like. The helicase of the present invention is successfully expressed in the Escherichia coli recombinant protein expression system, and the resulting protein is homogeneous and of high purity.
根据本申请一种典型的实施方式,该类解旋酶在高盐环境下展示出比低盐环境下还优越的解旋活性,能够良好地与单链DNA结合,并解旋双链DNA。该类解旋酶解旋活性很强,限位序列阻滞Spacer-18(Sp18)无法完全阻滞其解旋活性,此解旋酶可以用于核酸的控制和表征,并应用于单分子纳米孔测序。核酸控制包括对核酸穿过纳米孔的速度的控制,对核酸穿孔的稳定性控制或对核酸穿孔的持续性控制;更进一步地,该解旋酶的应用包括纳米传感器的应用及单分子纳米孔测序应用。其中,测序稳定性是指待测DNA以恒定的速度进入纳米孔,测序的信号平稳且完整,信噪比高,随着测序时间增加,测序质量不会明显下降。测序持续性是指测序信号持续输出,直到待测分子测序结束,文库持续捕获,不会突然中断导致测序的覆盖率和准确率低下。According to a typical embodiment of the present application, this type of helicase exhibits superior unwinding activity in a high-salt environment than in a low-salt environment, can bind well to single-stranded DNA, and unwind double-stranded DNA. This type of helicase has strong unwinding activity, and the limiting sequence blocking Spacer-18 (Sp18) cannot completely block its unwinding activity. This helicase can be used for the control and characterization of nucleic acids and applied to single-molecule nanopore sequencing. Nucleic acid control includes the control of the speed of nucleic acid passing through the nanopore, the stability control of nucleic acid perforation, or the continuity control of nucleic acid perforation; further, the application of the helicase includes the application of nanosensors and single-molecule nanopore sequencing applications. Among them, sequencing stability refers to the DNA to be tested entering the nanopore at a constant speed, the sequencing signal is stable and complete, the signal-to-noise ratio is high, and the sequencing quality will not decrease significantly as the sequencing time increases. Sequencing continuity refers to the continuous output of sequencing signals until the sequencing of the molecule to be tested is completed, the library is continuously captured, and there is no sudden interruption resulting in low coverage and accuracy of sequencing.
The beneficial effects of the present application are explained in further detail below with reference to specific examples.
Example 1
Cloning, expression and purification of BCH326 and BCH338
1. Cloning and expression of BCH326 and BCH338
The full-length DNA sequences of BCH326 and BCH338 were each ligated into the PET.28a(+) plasmid using the NdeI and XhoI restriction sites, so that the expressed BCH326 and BCH338 proteins carry an N-terminal 6×His tag and a thrombin cleavage site.
The cloned PET.28a(+)-BCH326 and PET.28a(+)-BCH338 plasmids were transformed into the E. coli expression strain BL21(DE3) or a derivative strain. A single colony was picked, inoculated into 20 mL of LB medium containing kanamycin, and cultured overnight at 37°C with shaking. The culture was then transferred into 2 L of LB medium containing kanamycin and grown at 37°C with shaking to OD600 = 0.6-0.8, cooled to 16°C, and induced overnight with IPTG at a final concentration of 500 μM.
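The induction step specifies only the final IPTG concentration (500 μM in 2 L). As a minimal sketch of the underlying dilution arithmetic (C1·V1 = C2·V2), the Python snippet below computes the stock volume to add. The 1 M stock concentration and the function name `stock_volume_ml` are assumptions for illustration; the source does not state which stock was used.

```python
def stock_volume_ml(final_conc_mm: float, culture_volume_ml: float,
                    stock_conc_mm: float) -> float:
    """Volume of stock (mL) from C1*V1 = C2*V2.

    The small increase in total volume caused by adding the stock is ignored.
    """
    if stock_conc_mm <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_mm * culture_volume_ml / stock_conc_mm


# 500 uM final (0.5 mM) in a 2 L (2000 mL) culture, as in the text.
# ASSUMPTION: a 1 M (1000 mM) IPTG stock; the source does not specify it.
iptg_ml = stock_volume_ml(final_conc_mm=0.5, culture_volume_ml=2000.0,
                          stock_conc_mm=1000.0)
print(f"Add {iptg_ml:.2f} mL of 1 M IPTG stock")
```

With a different stock concentration only the last argument changes.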
2. Purification of BCH326 and BCH338
Buffer A: 20 mM Tris-HCl pH 7.5, 250 mM NaCl, 20 mM imidazole
Buffer B: 20 mM Tris-HCl pH 7.5, 250 mM NaCl, 300 mM imidazole
Buffer C: 20 mM Tris-HCl pH 7.5, 80 mM NaCl
Buffer D: 20 mM Tris-HCl pH 7.5, 1000 mM NaCl
Buffer E: 20 mM Tris-HCl pH 7.5, 200 mM NaCl
Cells expressing BCH326 and BCH338 were harvested, resuspended in Buffer A, and lysed with a cell disruptor, and the lysate was centrifuged to collect the supernatant. The supernatant was mixed with Ni-NTA resin pre-equilibrated with Buffer A and allowed to bind for 1 h. The resin was collected and washed extensively with Buffer A until no contaminating protein was washed out. Buffer B was then added to the resin to elute the protein. The eluted protein was passed through a desalting column (Cytiva, Sephadex G-25) equilibrated with Buffer C to exchange the protein from Buffer B into Buffer C. The desalted protein solution was then added to ssDNA cellulose resin equilibrated with Buffer C together with an appropriate amount of thrombin, which specifically recognizes the thrombin cleavage site amino acid sequence LVPR↓GS encoded by the PET.28a(+) vector and thereby removes the His affinity tag from the protein. This step was carried out at 4°C with overnight incubation on a rotary shaker. The next day the ssDNA cellulose resin was collected; at this point the target protein is specifically adsorbed to the ssDNA resin. The resin was washed 3-4 times with Buffer C to remove contaminating proteins not adsorbed to the ssDNA cellulose, and the target protein was then eluted with Buffer D, which disrupts the specific adsorption between the target protein and the ssDNA resin and releases the protein into solution. The protein purified on ssDNA cellulose was concentrated in a 30 kDa ultrafiltration concentrator (Merck Millipore) in a centrifuge pre-cooled to 4°C at 3000 × g, 10 min per spin, repeated until the final protein volume was 2 mL. Finally, the protein was run on a Superdex 200 size-exclusion column (Cytiva) in Buffer E. The target protein peak was collected, concentrated, and stored frozen.
As shown in Figure 1, purification yields a relatively large amount of BCH326 protein of good purity with a uniform peak shape. On average, 0.28 mg of target protein was purified per 1 L of expression culture, comparable to the yield of helicase Dda (0.23 mg of protein per 1 L of expression culture).
As shown in Figure 2, purification yields a relatively large amount of BCH338 protein of good purity with a uniform peak shape. On average, 0.42 mg of target protein was purified per 1 L of expression culture, higher than the yield of helicase Dda (0.23 mg of protein per 1 L of expression culture).
3. Amino acid sequence of BCH326 (SEQ ID NO.1)
4. DNA sequence of BCH326 (SEQ ID NO.2)
5. Amino acid sequence of BCH338 (SEQ ID NO.3)
6. DNA sequence of BCH338 (SEQ ID NO.4)
7. AlphaFold2 structure prediction of BCH326 and BCH338
The structures of BCH326 and BCH338 were each predicted with AlphaFold2. The root-mean-square deviation (RMSD) between the predicted and true values of the protein backbone structure reached [value missing in source] and [value missing in source], respectively, as shown in Figures 3 and 4. Different secondary structures are drawn with different shapes: helix, sheet, and loop. Compared with Dda-type helicases (shown in Figure 20, PDB entry 3UPU), each of these two helicases has two tower domains, both located on the same side of the protein, whereas the Dda helicase has only one tower domain.
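For reference, the RMSD mentioned above is conventionally defined over N matched atoms as the square root of the mean squared distance between corresponding positions. The Python sketch below illustrates that definition only. It assumes the two coordinate sets are already superposed and matched atom-for-atom (no Kabsch alignment is performed), and the toy coordinates are invented for illustration and are not data from the source.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two matched coordinate sets.

    ASSUMPTION: both sets are already superposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example (hypothetical coordinates): every atom is displaced by 1 unit in z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(predicted, reference)
print(toy_rmsd)
```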
Example 2
ATPase activity assay of BCH326 and BCH338 proteins
1. Preparation of double-stranded DNA (ovDNA) and single-stranded DNA (ssDNA): SEQ ID NO.5 and SEQ ID NO.6 were annealed to form an ovDNA with a 5' overhang of 20 T residues. The annealing program was 95°C for 5 minutes, cooling to 25°C at 0.1°C/s, and a further 30 minutes of incubation; the annealing recipe is given in Table 1. SEQ ID NO.6 at 100 μM was diluted to 10 μM with TE buffer (pH 8) for use as ssDNA.
Table 1. ovDNA annealing recipe
Solution | Volume |
100 μM SEQ ID NO.5 | 5 μL |
100 μM SEQ ID NO.6 | 5 μL |
TE buffer (pH 8) | 40 μL |
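The annealing design in step 1 and Table 1 can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: the duplex-forming part of SEQ ID NO.6 is taken as the sequence following its poly-T run, the overhang is built as 20 T residues as described in the text, and the helper name `reverse_complement` is introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unmodified DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


seq5 = "GCGTCGAAAAGCAGTACTTAGGCATT"                # SEQ ID NO.5
seq6_duplex_region = "AATGCCTAAGTACTGCTTTTCGACGC"  # SEQ ID NO.6 after the poly-T run
overhang = "T" * 20  # ASSUMPTION: 20 T residues, as stated in the text

# The two strands should pair over the full 26-nt region of SEQ ID NO.5.
is_complementary = reverse_complement(seq5) == seq6_duplex_region

# Concentration of each strand in the Table 1 mix: 5 uL of 100 uM in 50 uL total.
total_ul = 5.0 + 5.0 + 40.0
strand_conc_um = 100.0 * 5.0 / total_ul

print(is_complementary, len(seq5), strand_conc_um)
```

The resulting 10 μM per strand matches the "10 μM annealed ovDNA" used later in the reaction mixes.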
2. Prepare the high-salt reaction buffer (2×): 20 mM HEPES (pH 8.0), 4 mM ATP, 4 mM MgCl2, 1.0 M KCl.
3. Dilute the protein: BCH326 and BCH338 proteins were diluted to 10 μM with 1× PBS.
4. Run the ATP hydrolysis reaction: reagents were added according to the reaction setup in Table 2, incubated at 30°C for 30 min, and inactivated at 80°C for 5 min. Reactions ① and ② are the experimental groups, and ③, ④, ⑤ and ⑥ are the corresponding controls; each group was run in triplicate.
Table 2. ATP hydrolysis reaction setup
No. | Reaction buffer (2×) | DNA | Protein | H2O |
① | 10 μL | 1 μL (ovDNA) | 1 μL | 8 μL |
② | 10 μL | 1 μL (ssDNA) | 1 μL | 8 μL |
③ | 10 μL | - | 1 μL | 9 μL |
④ | 10 μL | 1 μL (ovDNA) | - | 9 μL |
⑤ | 10 μL | 1 μL (ssDNA) | - | 9 μL |
⑥ | 10 μL | - | - | 10 μL |
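The final concentrations implied by Table 2 follow from simple dilution of the stocks described in steps 1 to 3. The Python sketch below works them out; the dictionary and function names are illustrative, and the 10 μM DNA and protein stock concentrations are taken from the preceding steps.

```python
# 2x high-salt reaction buffer from step 2 (all values in mM).
BUFFER_2X_MM = {"HEPES": 20.0, "ATP": 4.0, "MgCl2": 4.0, "KCl": 1000.0}
DNA_STOCK_UM = 10.0      # ovDNA or ssDNA stock (step 1)
PROTEIN_STOCK_UM = 10.0  # diluted helicase stock (step 3)

# Volumes in uL for reactions 1-6: (2x buffer, DNA, protein, water).
TABLE_2 = {
    1: (10.0, 1.0, 1.0, 8.0),
    2: (10.0, 1.0, 1.0, 8.0),
    3: (10.0, 0.0, 1.0, 9.0),
    4: (10.0, 1.0, 0.0, 9.0),
    5: (10.0, 1.0, 0.0, 9.0),
    6: (10.0, 0.0, 0.0, 10.0),
}


def final_concentrations(buffer_ul, dna_ul, protein_ul, water_ul):
    """Final concentrations in one reaction, by simple dilution."""
    total = buffer_ul + dna_ul + protein_ul + water_ul
    conc = {f"{name}_mM": c * buffer_ul / total for name, c in BUFFER_2X_MM.items()}
    conc["DNA_uM"] = DNA_STOCK_UM * dna_ul / total
    conc["protein_uM"] = PROTEIN_STOCK_UM * protein_ul / total
    conc["total_uL"] = total
    return conc


results = {n: final_concentrations(*vols) for n, vols in TABLE_2.items()}
for n, r in results.items():
    print(n, r)
```

Every reaction totals 20 μL, so the 2× buffer ends up at 1× (0.5 M KCl, 2 mM ATP, 2 mM MgCl2, 10 mM HEPES), with 0.5 μM DNA and 0.5 μM protein where present.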
5. Measure the ATP remaining in the reaction: the remaining ATP concentration was determined with an ATP assay kit (Beyotime, S0026B) according to the manufacturer's instructions.
6. Results: as shown in Figures 5 and 6, both BCH326 and BCH338 hydrolyze ATP under high-salt conditions.
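One way to express the assay readout is the fraction of ATP consumed relative to a matched no-protein control. The Python sketch below shows that calculation. The function name and the replicate values are hypothetical placeholders, not measurements from the source, and the source does not state that this particular normalization was used.

```python
from statistics import mean
from typing import Sequence


def fraction_hydrolyzed(sample_atp: Sequence[float],
                        control_atp: Sequence[float]) -> float:
    """Fraction of ATP consumed, from remaining-ATP replicates.

    sample_atp:  remaining ATP in reactions containing helicase
    control_atp: remaining ATP in the matched no-protein control
    Both must be in the same units.
    """
    control_mean = mean(control_atp)
    if control_mean <= 0:
        raise ValueError("control ATP must be positive")
    return 1.0 - mean(sample_atp) / control_mean


# HYPOTHETICAL triplicate readings (mM), for illustration only.
with_helicase = [0.5, 0.6, 0.4]
no_protein = [2.0, 2.0, 2.0]
consumed = fraction_hydrolyzed(with_helicase, no_protein)
print(f"{consumed:.0%} of the ATP was hydrolyzed")
```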
Example 3
dsDNA unwinding activity assay of BCH326 and BCH338 proteins
1. Preparation of double-stranded DNA (ovDNA): SEQ ID NO.7 and SEQ ID NO.8 were annealed to form an ovDNA with a 5' overhang of 20 T residues. The annealing program was 95°C for 5 minutes, cooling to 25°C at 0.1°C/s, and 30 minutes of incubation; the annealing recipe is given in Table 3.
Table 3. ovDNA annealing recipe
Solution | Volume |
100 μM SEQ ID NO.7 | 5 μL |
100 μM SEQ ID NO.8 | 5 μL |
TE buffer (pH 8) | 40 μL |
2. Prepare the reaction buffers: low-salt reaction buffer 1 is 100 mM HEPES (pH 8.0), 1 mg/mL BSA, 10 mM MgCl2, 150 mM KCl; high-salt reaction buffer 2 is 100 mM HEPES (pH 8.0), 1 mg/mL BSA, 10 mM MgCl2, 500 mM KCl.
3. Prepare the reaction solutions: 3 μL of 10 μM annealed ovDNA, 6 μL of 100 μM SEQ ID NO.9 (competitor DNA that captures the single strand released by unwinding), and 6 μL of 100 mM ATP were added to 585 μL of low-salt or high-salt reaction buffer to give the experimental reaction solution. 1 μL of 10 μM SEQ ID NO.8, 2 μL of 100 μM SEQ ID NO.9, and 2 μL of 100 mM ATP were added to 195 μL of low-salt or high-salt reaction buffer to give the positive control solution.
4. Dilute the protein: BCH326 and BCH338 proteins were diluted to 4.8 μM with 1× PBS.
5. Set up the unwinding reactions: reactions were divided into experimental group ①, negative control group ②, and positive control group ③, and reagents were added according to Table 4. The change in fluorescence intensity over 30 min was followed kinetically at 30°C on a microplate reader, with three replicates per group.
6. Data analysis: the fluorescence values of the experimental and negative control groups were calculated as percentages of the fluorescence value of the positive control group.
Table 4. Unwinding reaction recipe
No. | Solution 1 | Solution 2 |
① | 58.5 μL experimental reaction solution | 1.5 μL protein |
② | 58.5 μL experimental reaction solution | 1.5 μL reaction buffer |
③ | 58.5 μL positive control solution | 1.5 μL reaction buffer |
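The working concentrations in the unwinding reaction follow from two successive dilutions: first into the 600 μL experimental reaction solution (step 3), then into the 60 μL well (Table 4, reaction ①). The Python sketch below traces those numbers; the `dilute` helper and variable names are illustrative.

```python
def dilute(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """Concentration after diluting stock_ul of stock into total_ul."""
    return stock_conc * stock_ul / total_ul


# Step 3: experimental reaction solution (volumes in uL).
mix_total = 3.0 + 6.0 + 6.0 + 585.0
ovdna_mix_nm = dilute(10_000.0, 3.0, mix_total)        # 10 uM ovDNA, in nM
competitor_mix_nm = dilute(100_000.0, 6.0, mix_total)  # 100 uM SEQ ID NO.9, in nM
atp_mix_mm = dilute(100.0, 6.0, mix_total)             # 100 mM ATP, in mM

# Table 4, reaction 1: 58.5 uL mix + 1.5 uL of 4.8 uM protein.
well_total = 58.5 + 1.5
ovdna_nm = dilute(ovdna_mix_nm, 58.5, well_total)
competitor_nm = dilute(competitor_mix_nm, 58.5, well_total)
atp_mm = dilute(atp_mix_mm, 58.5, well_total)
protein_nm = dilute(4_800.0, 1.5, well_total)

print(f"ovDNA {ovdna_nm:.2f} nM, competitor {competitor_nm:.1f} nM, "
      f"ATP {atp_mm:.3f} mM, helicase {protein_nm:.0f} nM")
print(f"competitor : ovDNA = {competitor_nm / ovdna_nm:.0f} : 1")
```

The competitor strand is in 20-fold excess over the ovDNA, and the helicase ends up at about 120 nM against roughly 49 nM substrate.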
7. Results:
Within the limits of measurement error and instrument fluctuation, the results were plotted as the ratio of each measured fluorescence value to the fluorescence value of the positive control (because of instrument sensitivity, the negative control group also gives a fluorescence reading). In every experiment the negative control stayed constant throughout the measurement, whereas the fluorescence of the experimental group rose gradually with reaction time, indicating that the protein unwinds double-stranded DNA and that unwinding proceeds in the 5'-3' direction.
As shown in Figures 7 and 8, BCH326 unwinds dsDNA under both low-salt (150 mM final KCl) and high-salt (500 mM final KCl) conditions, and its dsDNA unwinding activity increases with salt concentration. As shown in Figures 9 and 10, BCH338 unwinds dsDNA under both low-salt and high-salt conditions, and its dsDNA unwinding activity likewise increases with salt concentration.
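The normalization described in the data-analysis step (fluorescence as a percentage of the positive control) can be sketched in Python as follows. This is an illustrative outline, assuming replicate traces are averaged per time point before normalization; the function names and the example readings are hypothetical and are not data from the source.

```python
from typing import List, Sequence


def average_traces(replicates: Sequence[Sequence[float]]) -> List[float]:
    """Average replicate kinetic traces time point by time point."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def percent_of_positive(sample: Sequence[float],
                        positive: Sequence[float]) -> List[float]:
    """Sample fluorescence as a percentage of the positive control."""
    return [100.0 * s / p for s, p in zip(sample, positive)]


# HYPOTHETICAL raw readings (arbitrary units) at three time points.
experimental = average_traces([[10.0, 20.0, 30.0],
                               [12.0, 22.0, 34.0],
                               [8.0, 18.0, 26.0]])
positive = average_traces([[40.0, 40.0, 40.0]] * 3)

normalized = percent_of_positive(experimental, positive)
print(normalized)
```

A rising percentage over time in the experimental group, with a flat negative control, corresponds to the behavior described in the results.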
Example 4
Assay of limiting-sequence blocking of BCH326 and BCH338 unwinding activity
1. Preparation of double-stranded DNA (ovDNA) containing a limiting sequence: SEQ ID NO.7 and SEQ ID NO.10 were annealed to form an ovDNA (containing the limiting sequence) with a 5' overhang of 20 T residues. The annealing program was 95°C for 5 minutes, cooling to 25°C at 0.1°C/s, and 30 minutes of incubation; the annealing recipe is given in Table 5.
Table 5. Annealing recipe for ovDNA (containing the limiting sequence)
Solution | Volume |
100 μM SEQ ID NO.7 | 5 μL |
100 μM SEQ ID NO.10 | 5 μL |
TE buffer (pH 8) | 40 μL |
2. Prepare the reaction buffers: the low-salt reaction buffer is 100 mM HEPES (pH 8.0), 1 mg/mL BSA, 10 mM MgCl2, 150 mM KCl; the high-salt reaction buffer is 100 mM HEPES (pH 8.0), 1 mg/mL BSA, 10 mM MgCl2, 500 mM KCl.
3. Prepare the reaction solutions: 3 μL of 10 μM annealed ovDNA (containing the limiting sequence), 6 μL of 100 μM SEQ ID NO.9 (20-fold competitor DNA), and 6 μL of 100 mM ATP were added to 585 μL of low-salt or high-salt reaction buffer to give the experimental reaction solution. 1 μL of 10 μM SEQ ID NO.11, 2 μL of 100 μM SEQ ID NO.9 (20-fold competitor DNA), and 2 μL of 100 mM ATP were added to 195 μL of low-salt or high-salt reaction buffer to give the positive control solution.
4. Dilute the protein: BCH326 and BCH338 proteins were diluted to 4.8 μM with 1× PBS.
5. Set up the unwinding reactions: reactions were divided into experimental group ①, negative control group ②, and positive control group ③, and reagents were added according to Table 6. The change in fluorescence intensity over 30 min was followed kinetically at 30°C on a microplate reader, with three replicates per group.
6. Data analysis: the fluorescence values of the experimental and negative control groups were calculated as percentages of the fluorescence value of the positive control group.
Table 6. Unwinding reaction recipe
No. | Solution 1 | Solution 2 |
① | 58.5 μL experimental reaction solution | 1.5 μL protein |
② | 58.5 μL experimental reaction solution | 1.5 μL 1× PBS |
③ | 58.5 μL positive control solution | 1.5 μL 1× PBS |
7. Results:
As shown in Figures 11 and 12, under both low-salt (150 mM final KCl) and high-salt (500 mM final KCl) conditions the limiting sequence weakened the dsDNA unwinding activity of BCH326 but could not block it completely, and under high-salt conditions BCH326 still showed a sustained unwinding trend. As shown in Figure 13, under low-salt conditions the limiting sequence almost completely blocked dsDNA unwinding by BCH338; as shown in Figure 14, under high-salt conditions the limiting sequence weakened the dsDNA unwinding activity of BCH338 but could not block it completely.
Example 5
Nanopore sequencing application of BCH326 and BCH338 proteins
1. Two partially complementary DNA strands (top strand, SEQ ID NO.11, and bottom strand, SEQ ID NO.12) were annealed to form an adapter, which was ligated to the double-stranded target fragment with T4 DNA ligase at room temperature and purified to prepare the sequencing library. Figure 15 shows a schematic of the adapter (a: top strand; b: bottom strand).
2. BCH326 or BCH338 protein was incubated with the sequencing library at 25°C for 1 h (molar ratio 1:8) to form a helicase-loaded sequencing library. Figure 16 shows a schematic of the helicase-loaded sequencing library (a: top strand; b: bottom strand; c: double-stranded target fragment; d: helicase; e: cholesterol-labeled double-stranded DNA).
3. The helicase-loaded sequencing library was incubated with a single-stranded DNA carrying cholesterol at its 5' end (ssDNA-chol, SEQ ID NO.13) at room temperature for 10 min. The ssDNA-chol sequence is complementary to part of the adapter bottom strand; once the cholesterol binds the phospholipid membrane, the amount of library that must be loaded is reduced and the capture rate is increased.
4. Current signals were recorded with a patch-clamp amplifier or another electrical signal amplifier (as shown in Figure 17). A Teflon membrane with a micrometer-scale aperture at its center (50-200 μm in diameter) divides the electrolytic cell into two chambers, the cis chamber and the trans chamber, and a pair of Ag/AgCl electrodes is placed in each of the cis and trans chambers. After a phospholipid bilayer was formed across the aperture between the two chambers, the nanopore protein CsgG-Eco-(Y51A/F56Q/R97W/R192D-StrepII(C)) was added. Electrical measurements were obtained once a single nanopore protein had inserted into the phospholipid membrane. The reaction product of step 3 was then added and 180 mV was applied; the sequencing library was captured by the nanopore and the nucleic acid passed through the nanopore under the control of the helicase. The buffer used in this experiment was 0.47 M KCl, 25 mM HEPES, 1 mM EDTA, 5 mM ATP, 25 mM MgCl2, pH 7.6, and the sequencing temperature was 28°C.
5. The sequencing signal obtained with BCH326 is shown in Figure 18, and the sequencing signal obtained with BCH338 is shown in Figure 19. As the helicase feeds the DNA single strand into the nanopore, part of the current is blocked and the current decreases. Because different nucleotides differ in size, the amount of current they block also differs, which produces the fluctuating current signal. Both Figure 18 and Figure 19 show a complete adapter signal and recovery signal with a high signal-to-noise ratio, indicating that the sequencing signal is stable.
SEQ ID NO.5:5’-GCGTCGAAAAGCAGTACTTAGGCATT-3’SEQ ID NO.5:5’-GCGTCGAAAAGCAGTACTTAGGCATT-3’
SEQ ID NO.6:5’-TTTTTTTTTTTTTTTTTTTTTAATGCCTAAGTACTGCTTTTCGACGC-3’SEQ ID NO.6:5’-TTTTTTTTTTTTTTTTTTTTTTTAATGCCTAAGTACTGCTTTTCGACGC-3’
SEQ ID NO.7:5’-BHQ-1-GCGTCGAAAAGCAGTACTTAGGCATT-3’SEQ ID NO.7:5’-BHQ-1-GCGTCGAAAAGCAGTACTTAGGCATT-3’
SEQ ID NO.8:5’-TTTTTTTTTTTTTTTTTTTTTAATGCCTAAGTACTGCTTTTCGACGC-FAM-3’SEQ ID NO.8:5’-TTTTTTTTTTTTTTTTTTTTTTTAATGCCTAAGTACTGCTTTTCGACGC-FAM-3’
SEQ ID NO.9:5’-AATGCCTAAGTACTGCTTTTCGACGCT-3’SEQ ID NO.9:5’-AATGCCTAAGTACTGCTTTTCGACGCT-3’
SEQ ID NO.10:5’-TTTTTTTTTTTTTTTTTTTTTNNNNAATGCCTAAGTACTGCTTTTCGACGC-FAM-3’(N=iSP18)SEQ ID NO.10: 5’-TTTTTTTTTTTTTTTTTTTTTTTNNNNAATGCCTAAGTACTGCTTTTCGACGC-FAM-3’(N=iSP18)
SEQ ID NO.11:5’-TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTNNNNGGTTGTTTCTGTTGGTGCTGATATTGCT-3’(N=iSP18)SEQ ID NO.11:5’-TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTNNNNGGTTGTTTCTGTTGGTGCTGATATTGCT-3’(N=iSP18)
SEQ ID NO.12:5’-GCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA-3’SEQ ID NO.12:5’-GCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA-3’
SEQ ID NO.13:5’-cholesterol-TTGACCGCTCGCCTC-3’。SEQ ID NO.13:5’-cholesterol-TTGACCGCTCGCCTC-3’.
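The pairing relationships described for the adapter in Example 5 (top strand with bottom strand, and ssDNA-chol with part of the bottom strand) can be checked directly from the listed sequences. The Python sketch below does so using only the portion of SEQ ID NO.11 that follows the iSp18 spacers; the helper and variable names are illustrative, and treating the final 3' T of the top strand as unpaired is an observation from this check rather than a statement in the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unmodified DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO.11, region 3' of the iSp18 spacers (poly-T leader omitted).
top_after_spacer = "GGTTGTTTCTGTTGGTGCTGATATTGCT"
bottom = "GCAATATCAGCACCAACAGAAACAACCTTTGAGGCGAGCGGTCAA"  # SEQ ID NO.12
chol_oligo = "TTGACCGCTCGCCTC"                            # SEQ ID NO.13, without the 5' cholesterol

# Top strand minus its last base pairs with the first 27 nt of the bottom strand.
duplex_len = len(top_after_spacer) - 1
top_pairs_bottom = (
    reverse_complement(top_after_spacer[:duplex_len]) == bottom[:duplex_len]
)

# The cholesterol oligo pairs with the 3' end of the bottom strand.
chol_pairs_bottom = reverse_complement(chol_oligo) == bottom[-len(chol_oligo):]

# Bases of the bottom strand left unpaired between the two duplex regions.
linker = bottom[duplex_len:len(bottom) - len(chol_oligo)]

print(top_pairs_bottom, chol_pairs_bottom, duplex_len, linker)
```

Under these assumptions the bottom strand carries a 27-nt region paired with the top strand, a short TTT linker, and a 15-nt tether site for the cholesterol oligo.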
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (16)
- A helicase, characterized in that the helicase has two tower domains and one pin domain, and the two tower domains are located on the same side of the three-dimensional structure of the helicase.
- The helicase according to claim 1, characterized in that the helicase comprises at least one of the following:
  A) BCH326, wherein BCH326 is a protein having the amino acid sequence shown in SEQ ID NO: 1;
  B) BCH338, wherein BCH338 is a protein having the amino acid sequence shown in SEQ ID NO: 3;
  C) a protein in which at least one cysteine on the surface of the protein defined in A) or B) is mutated to alanine, glutamine, glycine, histidine, isoleucine, leucine, valine, serine, threonine or methionine;
  D) a protein having DNA unwinding ability in which, in the amino acid sequence of the protein defined in any one of A), B) and C), the amino acid at at least one site in the tower domain and/or the pin domain is mutated to cysteine or at least one non-natural amino acid is introduced; and
  E) a protein having 70% or more homology with the amino acid sequence of the protein defined in any one of A), B), C) and D) and having the same function.
- The helicase according to claim 2, characterized in that C) comprises:
  a protein in which the amino acid C at position 319 of BCH326 is replaced with A, S, T, V, I, L or G; and
  a protein in which the amino acid C at position 326 or position 459 of BCH338 is replaced with A, S, T, V, I, L or G.
- The helicase according to claim 2, characterized in that, in D), mutating the amino acid at at least one site in the tower domain and/or the pin domain to cysteine or introducing at least one non-natural amino acid comprises at least one of the following:
  in the tower domain of BCH326, the amino acid at at least one of the sites S389, R340, K341, S342, N343, K343, S344, I345, V346, I347, D348, K349, D350, G351, K352, A353, K354, E355, F356, L357, R358, K359, F360, L361, N362, F363, A364, K365, I366, Y367, N368, F369, T370, N371, K372, G373, G374, H378, G379, R380, R381, I382, T383, K384, K385, S386, K387, K388, E389, L390 and W391 is mutated to cysteine or at least one non-natural amino acid is introduced;
  in the pin domain of BCH326, the amino acid at at least one of the sites D87, I88, G89, T90, I91, H92, S93, Y94, F95, D96, I97, K98, P99, D100, I101, D102, D103, N104, G105, N106, R107, V108, F109, K110, P111 or S112 is mutated to cysteine or at least one non-natural amino acid is introduced;
  in the tower domain of BCH338, the amino acid at at least one of the sites S405, K406, F407, L408, V409, P410, L411, G412, D413, G414, S415, K416, E417, D418, L419, F420, P421, L422, Y423, K424, E425, A426, V427, F428, D429, I430, A431, K432, T433, M434, N435, N436, Q437, R438, K439, I440, S441, K442, N443, S444, K445, K446, N447, F448 or W449 is mutated to cysteine or at least one non-natural amino acid is introduced;
  in the pin domain of BCH338, the amino acid at at least one of the sites E93, I94, R95, P96, D97, I98, N99, E100, F101, G102, E103, R104, I105, F106, V107, P108, K109, L110, R111, D112, M113 and M114 is mutated to cysteine or at least one non-natural amino acid is introduced;
  preferably, in E), the protein has 70%, 80%, 90%, 95% or 99% or more homology with the amino acid sequence of the protein defined in any one of A), B), C) and D) and has the same function.
- The helicase according to any one of claims 2 to 4, characterized in that the non-natural amino acid is selected from at least one of 4-azido-L-phenylalanine, 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselanyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboryl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}propanoic acid, O-methyl-L-tyrosine, 4-amino-L-phenylalanine, 4-cyano-L-phenylalanine, 3-cyano-L-phenylalanine, 4-fluoro-L-phenylalanine, 4-iodo-L-phenylalanine, 4-bromo-L-phenylalanine, O-(trifluoromethyl)tyrosine, 4-nitro-L-phenylalanine, 3-hydroxy-L-tyrosine, 3-amino-L-tyrosine, 3-iodo-L-tyrosine, 4-isopropyl-L-phenylalanine, 3-(2-naphthyl)-L-alanine, 4-phenyl-L-phenylalanine, (2S)-2-amino-3-(naphthalen-2-ylamino)propanoic acid, 6-(methylsulfanyl)norleucine, 6-oxo-L-lysine, D-tyrosine, (2R)-2-hydroxy-3-(4-hydroxyphenyl)propanoic acid, (2R)-2-aminooctanoate 3-(2,2'-bipyridin-5-yl)-D-alanine, 2-amino-3-(8-hydroxy-3-quinolyl)propanoic acid, 4-benzoyl-L-phenylalanine, S-(2-nitrobenzyl)cysteine, (2R)-2-amino-3-[(2-nitrobenzyl)sulfanyl]propanoic acid, (2S)-2-amino-3-[(2-nitrobenzyl)oxy]propanoic acid, O-(4,5-dimethoxy-2-nitrobenzyl)-L-serine, (2S)-2-amino-6-({[(2-nitrobenzyl)oxy]carbonyl}amino)hexanoic acid, and O-(2-nitrobenzyl)-L-tyrosine or 2-nitrophenylalanine;
  preferably, the introduction of at least one non-natural amino acid into BCH326 comprises at least one of the following: 4-azido-L-phenylalanine introduced at D100, 4-azido-L-phenylalanine introduced at I101, 4-azido-L-phenylalanine introduced at D102, 4-azido-L-phenylalanine introduced at D103, 4-azido-L-phenylalanine introduced at N104, 4-azido-L-phenylalanine introduced at G105, 4-azido-L-phenylalanine introduced at N106, 4-azido-L-phenylalanine introduced at R107, 4-acetyl-L-phenylalanine introduced at D103, 4-acetyl-L-phenylalanine introduced at G105, and 4-acetyl-L-phenylalanine introduced at N106;
  preferably, the introduction of at least one non-natural amino acid into BCH338 comprises at least one of the following: 4-azido-L-phenylalanine introduced at A431, 4-azido-L-phenylalanine introduced at K432, 4-azido-L-phenylalanine introduced at T433, 4-azido-L-phenylalanine introduced at M434, 4-azido-L-phenylalanine introduced at N435, 4-azido-L-phenylalanine introduced at S441, 4-azido-L-phenylalanine introduced at K442, 4-azido-L-phenylalanine introduced at N443, and 4-azido-L-phenylalanine introduced at S444.
- The helicase according to any one of claims 1 to 4, characterized in that the helicase has an amino acid mutation at at least one site among the amino acid sites in its DNA-binding region and/or the amino acid sites near its ATP catalytic active center, wherein the mutation comprises replacing the original amino acid with an amino acid having a larger side chain;
  preferably, replacing the original amino acid with an amino acid having a larger side chain comprises at least one of the following: asparagine is replaced by glutamine, histidine, arginine or lysine; proline is replaced by arginine, lysine, phenylalanine or leucine; histidine is replaced by arginine, lysine, glutamine, asparagine, phenylalanine, tyrosine or tryptophan; proline is replaced by arginine, lysine, glutamine, asparagine or histidine; phenylalanine is replaced by arginine, lysine, histidine, tyrosine or tryptophan; isoleucine is replaced by phenylalanine, tryptophan, histidine, lysine or arginine; tyrosine is replaced by arginine, lysine or tryptophan;
  the amino acid sites in the DNA-binding region of BCH326 include: L157, V160, L294, G296, N299, L303, A304, I328, F329, T330, N331, G332, G333 and E334, and the amino acid sites near its ATP catalytic active center include: K211, E212, E213, N214, Y215, K216, A217, P218, L219, K220, D221, I222, N223 and N224;
  the amino acid sites in the DNA-binding region of BCH338 include: H89, S90, Y91, F92, E93, I94, R95 and P96, and the amino acid sites near its ATP catalytic active center include: Y152, Q153, L154, P155, P156, V157, F193, L194, I195, K196, E197, Y198, E199, E200 and N201.
- The helicase according to any one of claims 1 to 4, characterized in that at least one of the amino acids on the surface of the helicase that interact with the nanopore binding region is mutated, wherein the mutation comprises replacing the original amino acid with an amino acid having a shorter side chain;
  preferably, replacing the original amino acid with an amino acid having a shorter side chain comprises: asparagine is replaced by isoleucine, valine, isoleucine, alanine, serine or glycine; lysine is replaced by isoleucine, valine, isoleucine, alanine, serine or glycine; lysine is replaced by isoleucine, valine, isoleucine, alanine, serine or glycine; arginine is replaced by isoleucine, valine, isoleucine, alanine, serine or glycine;
  preferably, the amino acids on the surface of BCH326 that interact with the nanopore binding region include: M1, E2, S3, K4, I5, N6, L7, T8, E9, D10, Q11, L12, K13, I14, I15, K16, I189, I190, R191, T192, Q193, N194, K195, N196 and S197;
  the amino acids on the surface of BCH338 that interact with the nanopore binding region include: M1, G2, E3, I4, K5, L6, N7, E8, E9, Q10, Q11, K12, K177, I177, L178, R179, T180, K181, N182, L213, I214, D215, H216, F217, H218, V219, Y220, G221, D248, L249, T250, D251, S252, T253, E254 and S255.
- An isolated DNA molecule, characterized in that the DNA molecule has:
  (a) a nucleotide sequence encoding the helicase according to any one of claims 2 to 3; or
  (b) a nucleotide sequence that hybridizes under stringent conditions to the DNA molecule defined in (a); or
  (c) the nucleotide sequence shown in SEQ ID NO: 2 or SEQ ID NO: 4; or
  (d) a nucleotide sequence that has at least 70% homology with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function as the helicase.
- The DNA molecule according to claim 8, characterized in that the DNA molecule has a nucleotide sequence that has at least 75%, preferably at least 85%, more preferably at least 95%, and still more preferably at least 99% homology with any one of the nucleotide sequences defined in (a) to (c) and encodes a protein having the same function.
- A recombinant vector, characterized in that the recombinant vector comprises the DNA molecule according to claim 8 or 9.
- The recombinant vector according to claim 10, characterized in that the recombinant vector is selected from a plasmid, a virus or a carrier expression vector;
  further, the recombinant vector comprises a regulatory element for controlling expression of the DNA molecule;
  still further, the regulatory element comprises a promoter operably linked to the DNA molecule;
  preferably, the promoter comprises T7, trc, lac, ara or λL;
  more preferably, the recombinant vector is selected from the plasmids PET.28a(+), PET.21a(+) or PET.32a(+).
- A host cell, characterized in that the host cell contains the DNA molecule according to claim 8 or 9, or the recombinant vector according to claim 10 or 11.
- The host cell according to claim 12, characterized in that the host cell comprises Escherichia coli;
  preferably, the host cell comprises BL21(DE3), BL21 Star(DE3)pLysS, Rossata(DE3) or Lemo21(DE3).
- Use of the helicase according to any one of claims 1 to 7 in nucleic acid control or characterization;
  further, the nucleic acid control comprises controlling the speed at which a nucleic acid passes through a nanopore, controlling the stability of nucleic acid translocation through the pore, or controlling the continuity of nucleic acid translocation through the pore;
  still further, the use comprises use in a nanosensor and/or use in single-molecule nanopore sequencing.
- A nanopore sequencing kit comprising a helicase, characterized in that the helicase is the helicase according to any one of claims 1 to 7.
- A nanopore sequencing method, comprising sequencing a nucleic acid molecule to be sequenced under the control of a helicase, characterized in that the helicase is the helicase according to any one of claims 1 to 7.
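The site lists in the claims above use labels made of a one-letter residue code and a position, such as D100 or K432. As a minimal sketch (not part of the patent), the Python below parses such labels and flags positions that are listed with two different residues. The function names are hypothetical, and `mismatches` assumes the caller supplies a protein sequence, since the SEQ ID sequences are not reproduced in this section.

```python
import re
from collections import defaultdict

# Site labels transcribed from the BCH338 nanopore-interaction list in the claims.
BCH338_PORE_SITES = (
    "M1 G2 E3 I4 K5 L6 N7 E8 E9 Q10 Q11 K12 K177 I177 L178 R179 T180 K181 "
    "N182 L213 I214 D215 H216 F217 H218 V219 Y220 G221 D248 L249 T250 D251 "
    "S252 T253 E254 S255"
).split()

SITE_PATTERN = re.compile(r"^([A-Z])(\d+)$")


def parse_site(label):
    """Split a label such as 'D100' into ('D', 100)."""
    match = SITE_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a residue-position label: {label!r}")
    return match.group(1), int(match.group(2))


def conflicting_positions(labels):
    """Return {position: sorted residues} for positions listed with more than one residue."""
    seen = defaultdict(set)
    for label in labels:
        residue, position = parse_site(label)
        seen[position].add(residue)
    return {pos: sorted(res) for pos, res in seen.items() if len(res) > 1}


def mismatches(labels, sequence):
    """Return the labels whose residue disagrees with a 1-based protein sequence.

    The sequence is supplied by the caller; none is taken from the patent here.
    """
    bad = []
    for label in labels:
        residue, position = parse_site(label)
        if position > len(sequence) or sequence[position - 1] != residue:
            bad.append(label)
    return bad


print(conflicting_positions(BCH338_PORE_SITES))  # → {177: ['I', 'K']}
```

Run on the BCH338 list as printed, the check reports position 177, which the claim lists as both K177 and I177. Which label is intended cannot be determined from this section alone.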
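The preferred "larger side chain" replacements in the claims above can be read as a lookup table from the original residue to its allowed substitutes. The sketch below encodes that table and checks a proposed BCH326 DNA-binding-region mutation against it. Several things here are assumptions rather than content of the patent: the `N299Q` style of mutation label, the function names, the merging of the two proline entries into one set, and the reading of "asparagine" and "phenylalanine" as separate items in the histidine entry.

```python
import re

# Preferred "larger side chain" replacements, transcribed from the claim.
# Proline is listed twice in the claim; the two lists are merged here (assumption).
# The histidine entry assumes asparagine and phenylalanine are separate items.
LARGER_SIDE_CHAIN = {
    "N": set("QHRK"),
    "P": set("RKFL") | set("RKQNH"),
    "H": set("RKQNFYW"),
    "F": set("RKHYW"),
    "I": set("FWHKR"),
    "Y": set("RKW"),
}

# DNA-binding-region sites of BCH326, transcribed from the claim.
BCH326_DNA_BINDING = set(
    "L157 V160 L294 G296 N299 L303 A304 I328 F329 T330 N331 G332 G333 E334".split()
)

MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a hypothetical label such as 'N299Q' into ('N', 299, 'Q')."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def check_mutation(label, sites=BCH326_DNA_BINDING):
    """Return (site_is_listed, replacement_is_in_preferred_table)."""
    original, position, replacement = parse_mutation(label)
    site_is_listed = f"{original}{position}" in sites
    preferred = replacement in LARGER_SIDE_CHAIN.get(original, set())
    return site_is_listed, preferred


print(check_mutation("N299Q"))  # → (True, True)
print(check_mutation("N299A"))  # → (True, False)
```

A `False` in the second position only means the replacement is not among the preferred examples; the broader claim language (any larger side chain) is not modeled here.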
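The DNA-molecule claims above set homology thresholds of 70%, 75%, 85%, 95% and 99% but do not say how homology is computed. The following sketch assumes the simplest reading: percent identity over two sequences that have already been aligned to equal length, with `-` as the gap character and a gap opposite a base counted as a mismatch. Both function names are hypothetical, and a real comparison would normally start from an alignment tool's output.

```python
# Thresholds named in the claims, highest first.
THRESHOLDS = (99.0, 95.0, 85.0, 75.0, 70.0)


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned, equal-length nucleotide strings.

    Columns where both sequences have a gap are skipped; a gap opposite a
    base counts as a mismatch. This convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_threshold_met(identity):
    """Return the highest claimed threshold that `identity` reaches, or None."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, highest_threshold_met(identity))  # → 90.0 85.0
```

The thresholds are treated as inclusive ("at least"), matching the wording of the claims. Note that the claims also require the encoded protein to have the same function as the helicase, which sequence identity alone does not establish.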
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/143631 WO2024138574A1 (en) | 2022-12-29 | 2022-12-29 | Helicase and use thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024138574A1 (en) | 2024-07-04 |
Family
ID=91716076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/143631 WO2024138574A1 (en) | 2022-12-29 | 2022-12-29 | Helicase and use thereof |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024138574A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107109380A (en) * | 2014-10-07 | 2017-08-29 | Oxford Nanopore Technologies | Modified enzymes |
US20170335297A1 (en) * | 2014-11-13 | 2017-11-23 | The Board Of Trustees Of The University Of Illinois | Bio-engineered hyper-functional "super" helicases |
CN114599666A (en) * | 2020-06-19 | 2022-06-07 | Beijing Qitan Technology Co., Ltd. | Pif1-like helicase and application thereof |
Non-Patent Citations (3)
Title |
---|
CHEN ZHIJIE, WANG ZHENQIN, XU YANG, ZHANG XIAOCHUN, TIAN BOXUE, BAI JINGWEI: "Controlled movement of ssDNA conjugated peptide through Mycobacterium smegmatis porin A (MspA) nanopore by a helicase motor for peptide sequencing application", CHEMICAL SCIENCE, ROYAL SOCIETY OF CHEMISTRY, UNITED KINGDOM, vol. 12, no. 47, 8 December 2021 (2021-12-08), pages 15750-15756, XP093185487, ISSN: 2041-6520, DOI: 10.1039/D1SC04342K * |
DATABASE Protein, 27 July 2021 (2021-07-27), ANONYMOUS: "AAA family ATPase [Malaciobacter marinus]", XP093187905, retrieved from NCBI, Database accession no. WP_104412940.1 * |
YANJIE LI, CHEN YULONG, DU YONG: "Study on the expression of helicase of HCV in E.coli", CHINESE JOURNAL OF GASTROENTEROLOGY AND HEPATOLOG, vol. 10, no. 4, 15 December 2001 (2001-12-15), pages 315-317, XP093187903 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22969721; Country of ref document: EP; Kind code of ref document: A1 |