CN106978438B - Method for improving homologous recombination efficiency - Google Patents
Method for improving homologous recombination efficiency
- Publication number
- CN106978438B (application CN201710106331.7A)
- Authority
- CN
- China
- Prior art keywords
- leu
- lys
- sequence
- glu
- asp
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 59
- 230000006801 homologous recombination Effects 0.000 title claims abstract description 56
- 238000002744 homologous recombination Methods 0.000 title claims abstract description 56
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 32
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 32
- 238000010362 genome editing Methods 0.000 claims abstract description 31
- 241000196324 Embryophyta Species 0.000 claims description 46
- 239000002773 nucleotide Substances 0.000 claims description 40
- 125000003729 nucleotide group Chemical group 0.000 claims description 40
- 240000007594 Oryza sativa Species 0.000 claims description 18
- 235000007164 Oryza sativa Nutrition 0.000 claims description 18
- 235000009566 rice Nutrition 0.000 claims description 15
- 102000040430 polynucleotide Human genes 0.000 claims description 14
- 108091033319 polynucleotide Proteins 0.000 claims description 14
- 239000002157 polynucleotide Substances 0.000 claims description 14
- 108091026890 Coding region Proteins 0.000 claims description 8
- 238000010453 CRISPR/Cas method Methods 0.000 claims description 5
- 240000008042 Zea mays Species 0.000 claims description 4
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 claims description 4
- 235000002017 Zea mays subsp mays Nutrition 0.000 claims description 4
- 235000009973 maize Nutrition 0.000 claims description 4
- 241000219194 Arabidopsis Species 0.000 claims description 2
- 244000075850 Avena orientalis Species 0.000 claims description 2
- 235000007319 Avena orientalis Nutrition 0.000 claims description 2
- 235000007558 Avena sp Nutrition 0.000 claims description 2
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 claims description 2
- 235000006008 Brassica napus var napus Nutrition 0.000 claims description 2
- 240000000385 Brassica napus var. napus Species 0.000 claims description 2
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 claims description 2
- 235000004977 Brassica sinapistrum Nutrition 0.000 claims description 2
- 229920000742 Cotton Polymers 0.000 claims description 2
- 229940123611 Genome editing Drugs 0.000 claims description 2
- 244000068988 Glycine max Species 0.000 claims description 2
- 235000010469 Glycine max Nutrition 0.000 claims description 2
- 244000299507 Gossypium hirsutum Species 0.000 claims description 2
- 240000005979 Hordeum vulgare Species 0.000 claims description 2
- 235000007340 Hordeum vulgare Nutrition 0.000 claims description 2
- 240000000111 Saccharum officinarum Species 0.000 claims description 2
- 235000007201 Saccharum officinarum Nutrition 0.000 claims description 2
- 240000006394 Sorghum bicolor Species 0.000 claims description 2
- 235000011684 Sorghum saccharatum Nutrition 0.000 claims description 2
- 244000062793 Sorghum vulgare Species 0.000 claims description 2
- 235000021307 Triticum Nutrition 0.000 claims description 2
- 244000098338 Triticum aestivum Species 0.000 claims description 2
- 235000019713 millet Nutrition 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 3
- 230000009466 transformation Effects 0.000 abstract description 11
- 238000005516 engineering process Methods 0.000 abstract description 2
- 108020004414 DNA Proteins 0.000 description 65
- 108090000623 proteins and genes Proteins 0.000 description 52
- 239000013598 vector Substances 0.000 description 42
- 210000004027 cell Anatomy 0.000 description 39
- 108020005004 Guide RNA Proteins 0.000 description 36
- 150000007523 nucleic acids Chemical class 0.000 description 34
- 150000001413 amino acids Chemical group 0.000 description 29
- 102000004169 proteins and genes Human genes 0.000 description 29
- 102000039446 nucleic acids Human genes 0.000 description 27
- 108020004707 nucleic acids Proteins 0.000 description 27
- 108020004999 messenger RNA Proteins 0.000 description 26
- 235000018102 proteins Nutrition 0.000 description 26
- 108091033409 CRISPR Proteins 0.000 description 21
- 230000014509 gene expression Effects 0.000 description 21
- 206010020649 Hyperkeratosis Diseases 0.000 description 17
- 235000001014 amino acid Nutrition 0.000 description 17
- 229940024606 amino acid Drugs 0.000 description 17
- 238000013518 transcription Methods 0.000 description 17
- 230000035897 transcription Effects 0.000 description 17
- 241000589158 Agrobacterium Species 0.000 description 16
- 239000013604 expression vector Substances 0.000 description 16
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 15
- 239000013612 plasmid Substances 0.000 description 15
- 238000010186 staining Methods 0.000 description 15
- 238000010276 construction Methods 0.000 description 13
- 239000005631 2,4-Dichlorophenoxyacetic acid Substances 0.000 description 12
- 108091028043 Nucleic acid sequence Proteins 0.000 description 12
- 239000002609 medium Substances 0.000 description 12
- 108090000765 processed proteins & peptides Proteins 0.000 description 12
- 230000006870 function Effects 0.000 description 11
- 238000003259 recombinant expression Methods 0.000 description 11
- 239000000243 solution Substances 0.000 description 11
- 238000010354 CRISPR gene editing Methods 0.000 description 10
- 238000003776 cleavage reaction Methods 0.000 description 10
- 230000007017 scission Effects 0.000 description 10
- 102000053602 DNA Human genes 0.000 description 9
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 9
- 239000007788 liquid Substances 0.000 description 9
- 239000000047 product Substances 0.000 description 9
- 230000001105 regulatory effect Effects 0.000 description 9
- 241000701489 Cauliflower mosaic virus Species 0.000 description 8
- 108091028113 Trans-activating crRNA Proteins 0.000 description 8
- 230000001580 bacterial effect Effects 0.000 description 8
- 230000000295 complement effect Effects 0.000 description 8
- 238000005520 cutting process Methods 0.000 description 8
- 230000005782 double-strand break Effects 0.000 description 8
- 230000006798 recombination Effects 0.000 description 8
- 238000005215 recombination Methods 0.000 description 8
- 108091008146 restriction endonucleases Proteins 0.000 description 8
- 241000588724 Escherichia coli Species 0.000 description 7
- 238000000137 annealing Methods 0.000 description 7
- 238000010367 cloning Methods 0.000 description 7
- 238000003752 polymerase chain reaction Methods 0.000 description 7
- 150000003839 salts Chemical class 0.000 description 7
- 239000007787 solid Substances 0.000 description 7
- HXKWSTRRCHTUEC-UHFFFAOYSA-N 2,4-Dichlorophenoxyacetic acid Chemical compound OC(=O)C(Cl)OC1=CC=C(Cl)C=C1 HXKWSTRRCHTUEC-UHFFFAOYSA-N 0.000 description 6
- 108010042407 Endonucleases Proteins 0.000 description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 6
- 230000015572 biosynthetic process Effects 0.000 description 6
- 239000005018 casein Substances 0.000 description 6
- BECPQYXYKAMYBN-UHFFFAOYSA-N casein, tech. Chemical compound NCCCCC(C(O)=O)N=C(O)C(CC(O)=O)N=C(O)C(CCC(O)=N)N=C(O)C(CC(C)C)N=C(O)C(CCC(O)=O)N=C(O)C(CC(O)=O)N=C(O)C(CCC(O)=O)N=C(O)C(C(C)O)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=O)N=C(O)C(CCC(O)=O)N=C(O)C(COP(O)(O)=O)N=C(O)C(CCC(O)=N)N=C(O)C(N)CC1=CC=CC=C1 BECPQYXYKAMYBN-UHFFFAOYSA-N 0.000 description 6
- 235000021240 caseins Nutrition 0.000 description 6
- 238000001976 enzyme digestion Methods 0.000 description 6
- 239000000499 gel Substances 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 230000035772 mutation Effects 0.000 description 6
- 230000006780 non-homologous end joining Effects 0.000 description 6
- 230000037361 pathway Effects 0.000 description 6
- 229920001184 polypeptide Polymers 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 102000004196 processed proteins & peptides Human genes 0.000 description 6
- 230000008439 repair process Effects 0.000 description 6
- 238000012795 verification Methods 0.000 description 6
- 239000011782 vitamin Substances 0.000 description 6
- 235000013343 vitamin Nutrition 0.000 description 6
- 229940088594 vitamin Drugs 0.000 description 6
- 229930003231 vitamin Natural products 0.000 description 6
- JXCKZXHCJOVIAV-UHFFFAOYSA-N 6-[(5-bromo-4-chloro-1h-indol-3-yl)oxy]-3,4,5-trihydroxyoxane-2-carboxylic acid;cyclohexanamine Chemical compound [NH3+]C1CCCCC1.O1C(C([O-])=O)C(O)C(O)C(O)C1OC1=CNC2=CC=C(Br)C(Cl)=C12 JXCKZXHCJOVIAV-UHFFFAOYSA-N 0.000 description 5
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 5
- 102100031780 Endonuclease Human genes 0.000 description 5
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 5
- 229930006000 Sucrose Natural products 0.000 description 5
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 5
- 230000027455 binding Effects 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 5
- 239000012634 fragment Substances 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 108010034529 leucyl-lysine Proteins 0.000 description 5
- 238000011084 recovery Methods 0.000 description 5
- 239000000523 sample Substances 0.000 description 5
- 230000035939 shock Effects 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- 239000005720 sucrose Substances 0.000 description 5
- 238000011144 upstream manufacturing Methods 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 4
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 4
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- 241000206602 Eukaryota Species 0.000 description 4
- 108091092195 Intron Proteins 0.000 description 4
- 108091022912 Mannose-6-Phosphate Isomerase Proteins 0.000 description 4
- 108091005461 Nucleic proteins Proteins 0.000 description 4
- 108091081024 Start codon Proteins 0.000 description 4
- 108700009124 Transcription Initiation Site Proteins 0.000 description 4
- OJOBTAOGJIWAGB-UHFFFAOYSA-N acetosyringone Chemical compound COC1=CC(C(C)=O)=CC(OC)=C1O OJOBTAOGJIWAGB-UHFFFAOYSA-N 0.000 description 4
- 108010047495 alanylglycine Proteins 0.000 description 4
- 229940041514 candida albicans extract Drugs 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 230000037431 insertion Effects 0.000 description 4
- 230000004807 localization Effects 0.000 description 4
- 108010054155 lysyllysine Proteins 0.000 description 4
- 230000001404 mediated effect Effects 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 239000012192 staining solution Substances 0.000 description 4
- 238000006467 substitution reaction Methods 0.000 description 4
- 239000000725 suspension Substances 0.000 description 4
- 238000012546 transfer Methods 0.000 description 4
- 239000012137 tryptone Substances 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- 239000012138 yeast extract Substances 0.000 description 4
- 108091034151 7SK RNA Proteins 0.000 description 3
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 3
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 3
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 3
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 3
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 3
- 108091062157 Cis-regulatory element Proteins 0.000 description 3
- 101710177611 DNA polymerase II large subunit Proteins 0.000 description 3
- 101710184669 DNA polymerase II small subunit Proteins 0.000 description 3
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 3
- 241000710198 Foot-and-mouth disease virus Species 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- 239000004471 Glycine Substances 0.000 description 3
- 102100025022 Mannose-6-phosphate isomerase Human genes 0.000 description 3
- 108700026226 TATA Box Proteins 0.000 description 3
- 108020004566 Transfer RNA Proteins 0.000 description 3
- 108090000848 Ubiquitin Proteins 0.000 description 3
- 229960000723 ampicillin Drugs 0.000 description 3
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 3
- 230000003197 catalytic effect Effects 0.000 description 3
- 210000000349 chromosome Anatomy 0.000 description 3
- 238000012217 deletion Methods 0.000 description 3
- 230000037430 deletion Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 229960001484 edetic acid Drugs 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- 239000008103 glucose Substances 0.000 description 3
- 101150054900 gus gene Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 238000002156 mixing Methods 0.000 description 3
- 238000003199 nucleic acid amplification method Methods 0.000 description 3
- 210000003463 organelle Anatomy 0.000 description 3
- 229920000642 polymer Polymers 0.000 description 3
- 239000002244 precipitate Substances 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 239000006228 supernatant Substances 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 230000014616 translation Effects 0.000 description 3
- 150000003722 vitamin derivatives Chemical class 0.000 description 3
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N 2-amino-4-[hydroxy(methyl)phosphoryl]butanoic acid Chemical compound CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 2
- -1 4-12 amino acids Chemical class 0.000 description 2
- 108020005075 5S Ribosomal RNA Proteins 0.000 description 2
- 229920001817 Agar Polymers 0.000 description 2
- PHHRSPBBQUFULD-UWVGGRQHSA-N Arg-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N PHHRSPBBQUFULD-UWVGGRQHSA-N 0.000 description 2
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 2
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 2
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 2
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 2
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 2
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 2
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 2
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 2
- 108020004635 Complementary DNA Proteins 0.000 description 2
- 230000006820 DNA synthesis Effects 0.000 description 2
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 2
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 2
- 108700024394 Exon Proteins 0.000 description 2
- 230000010337 G2 phase Effects 0.000 description 2
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 2
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 2
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 2
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 2
- 239000005561 Glufosinate Substances 0.000 description 2
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 2
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 2
- 101150062179 II gene Proteins 0.000 description 2
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 2
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 2
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- JCFYLFOCALSNLQ-GUBZILKMSA-N Lys-Ala-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JCFYLFOCALSNLQ-GUBZILKMSA-N 0.000 description 2
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 2
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 2
- VWJFOUBDZIUXGA-AVGNSLFASA-N Lys-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCCN)N VWJFOUBDZIUXGA-AVGNSLFASA-N 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 2
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 2
- 241000589517 Pseudomonas aeruginosa Species 0.000 description 2
- 108010009460 RNA Polymerase II Proteins 0.000 description 2
- 102000009572 RNA Polymerase II Human genes 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 230000018199 S phase Effects 0.000 description 2
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 2
- 241000194017 Streptococcus Species 0.000 description 2
- 241000193996 Streptococcus pyogenes Species 0.000 description 2
- 238000010459 TALEN Methods 0.000 description 2
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 2
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 2
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 2
- 102000044159 Ubiquitin Human genes 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 238000007792 addition Methods 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 210000004436 artificial bacterial chromosome Anatomy 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 238000010804 cDNA synthesis Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000004440 column chromatography Methods 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 238000009396 hybridization Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 108010009298 lysylglutamic acid Proteins 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 230000002438 mitochondrial effect Effects 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 229910052757 nitrogen Inorganic materials 0.000 description 2
- 108091027963 non-coding RNA Proteins 0.000 description 2
- 102000042567 non-coding RNA Human genes 0.000 description 2
- 108010058731 nopaline synthase Proteins 0.000 description 2
- SCVFZCLFOSHCOH-UHFFFAOYSA-M potassium acetate Chemical compound [K+].CC([O-])=O SCVFZCLFOSHCOH-UHFFFAOYSA-M 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 230000008263 repair mechanism Effects 0.000 description 2
- 230000003252 repetitive effect Effects 0.000 description 2
- 230000003362 replicative effect Effects 0.000 description 2
- JQXXHWHPUNPDRT-WLSIYKJHSA-N rifampicin Chemical compound O([C@](C1=O)(C)O/C=C/[C@@H]([C@H]([C@@H](OC(C)=O)[C@H](C)[C@H](O)[C@H](C)[C@@H](O)[C@@H](C)\C=C\C=C(C)/C(=O)NC=2C(O)=C3C([O-])=C4C)C)OC)C4=C1C3=C(O)C=2\C=N\N1CC[NH+](C)CC1 JQXXHWHPUNPDRT-WLSIYKJHSA-N 0.000 description 2
- 229960001225 rifampicin Drugs 0.000 description 2
- 238000007789 sealing Methods 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 230000014621 translational initiation Effects 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 239000013603 viral vector Substances 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- RRBGTUQJDFBWNN-MUGJNUQGSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-2,6-diaminohexanoyl]amino]hexanoyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O RRBGTUQJDFBWNN-MUGJNUQGSA-N 0.000 description 1
- OWEGMIWEEQEYGQ-UHFFFAOYSA-N 100676-05-9 Natural products OC1C(O)C(O)C(CO)OC1OCC1C(O)C(O)C(O)C(OC2C(OC(O)C(O)C2O)CO)O1 OWEGMIWEEQEYGQ-UHFFFAOYSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- OPIFSICVWOWJMJ-AEOCFKNESA-N 5-bromo-4-chloro-3-indolyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CNC2=CC=C(Br)C(Cl)=C12 OPIFSICVWOWJMJ-AEOCFKNESA-N 0.000 description 1
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 1
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 1
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 1
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 1
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 1
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 1
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- QQACQIHVWCVBBR-GVARAGBVSA-N Ala-Ile-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QQACQIHVWCVBBR-GVARAGBVSA-N 0.000 description 1
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 1
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 1
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- 241000239290 Araneae Species 0.000 description 1
- YYOVLDPHIJAOSY-DCAQKATOSA-N Arg-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N YYOVLDPHIJAOSY-DCAQKATOSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- KWTVWJPNHAOREN-IHRRRGAJSA-N Arg-Asn-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KWTVWJPNHAOREN-IHRRRGAJSA-N 0.000 description 1
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 1
- YFBGNGASPGRWEM-DCAQKATOSA-N Arg-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YFBGNGASPGRWEM-DCAQKATOSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 1
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- MSILNNHVVMMTHZ-UWVGGRQHSA-N Arg-His-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 MSILNNHVVMMTHZ-UWVGGRQHSA-N 0.000 description 1
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 1
- UPKMBGAAEZGHOC-RWMBFGLXSA-N Arg-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O UPKMBGAAEZGHOC-RWMBFGLXSA-N 0.000 description 1
- UAOSDDXCTBIPCA-QXEWZRGKSA-N Arg-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UAOSDDXCTBIPCA-QXEWZRGKSA-N 0.000 description 1
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 1
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 1
- IIAXFBUTKIDDIP-ULQDDVLXSA-N Arg-Leu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IIAXFBUTKIDDIP-ULQDDVLXSA-N 0.000 description 1
- RIIVUOJDDQXHRV-SRVKXCTJSA-N Arg-Lys-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O RIIVUOJDDQXHRV-SRVKXCTJSA-N 0.000 description 1
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 1
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 1
- PYZPXCZNQSEHDT-GUBZILKMSA-N Arg-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N PYZPXCZNQSEHDT-GUBZILKMSA-N 0.000 description 1
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 1
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 1
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 1
- VENMDXUVHSKEIN-GUBZILKMSA-N Arg-Ser-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VENMDXUVHSKEIN-GUBZILKMSA-N 0.000 description 1
- VRTWYUYCJGNFES-CIUDSAMLSA-N Arg-Ser-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O VRTWYUYCJGNFES-CIUDSAMLSA-N 0.000 description 1
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 1
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- XMZZGVGKGXRIGJ-JYJNAYRXSA-N Arg-Tyr-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O XMZZGVGKGXRIGJ-JYJNAYRXSA-N 0.000 description 1
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 1
- LLQIAIUAKGNOSE-NHCYSSNCSA-N Arg-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N LLQIAIUAKGNOSE-NHCYSSNCSA-N 0.000 description 1
- FMYQECOAIFGQGU-CYDGBPFRSA-N Arg-Val-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMYQECOAIFGQGU-CYDGBPFRSA-N 0.000 description 1
- 101100272670 Aromatoleum evansii boxB gene Proteins 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 1
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 1
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 1
- JTXVXGXTRXMOFJ-FXQIFTODSA-N Asn-Pro-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O JTXVXGXTRXMOFJ-FXQIFTODSA-N 0.000 description 1
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 1
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 1
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 1
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 1
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 1
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 1
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 1
- QCLHLXDWRKOHRR-GUBZILKMSA-N Asp-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N QCLHLXDWRKOHRR-GUBZILKMSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 1
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 1
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 1
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 1
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 1
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- IVPNEDNYYYFAGI-GARJFASQSA-N Asp-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N IVPNEDNYYYFAGI-GARJFASQSA-N 0.000 description 1
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 1
- QNIACYURSSCLRP-GUBZILKMSA-N Asp-Lys-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O QNIACYURSSCLRP-GUBZILKMSA-N 0.000 description 1
- AHWRSSLYSGLBGD-CIUDSAMLSA-N Asp-Pro-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AHWRSSLYSGLBGD-CIUDSAMLSA-N 0.000 description 1
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 1
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 1
- USENATHVGFXRNO-SRVKXCTJSA-N Asp-Tyr-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 USENATHVGFXRNO-SRVKXCTJSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 108091079001 CRISPR RNA Proteins 0.000 description 1
- 101100228196 Caenorhabditis elegans gly-4 gene Proteins 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108010001857 Cell Surface Receptors Proteins 0.000 description 1
- 108091092236 Chimeric RNA Proteins 0.000 description 1
- 108091060290 Chromatid Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- UGPCUUWZXRMCIJ-KKUMJFAQSA-N Cys-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CS)N UGPCUUWZXRMCIJ-KKUMJFAQSA-N 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 238000012270 DNA recombination Methods 0.000 description 1
- 230000008265 DNA repair mechanism Effects 0.000 description 1
- 230000007018 DNA scission Effects 0.000 description 1
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 101100300807 Drosophila melanogaster spn-A gene Proteins 0.000 description 1
- ZGTMUACCHSMWAC-UHFFFAOYSA-L EDTA disodium salt (anhydrous) Chemical compound [Na+].[Na+].OC(=O)CN(CC([O-])=O)CCN(CC(O)=O)CC([O-])=O ZGTMUACCHSMWAC-UHFFFAOYSA-L 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 241001523858 Felipes Species 0.000 description 1
- 230000010190 G1 phase Effects 0.000 description 1
- REJJNXODKSHOKA-ACZMJKKPSA-N Gln-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N REJJNXODKSHOKA-ACZMJKKPSA-N 0.000 description 1
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 1
- CGVWDTRDPLOMHZ-FXQIFTODSA-N Gln-Glu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CGVWDTRDPLOMHZ-FXQIFTODSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 1
- LKVCNGLNTAPMSZ-JYJNAYRXSA-N Gln-His-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)N)N LKVCNGLNTAPMSZ-JYJNAYRXSA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- KHNJVFYHIKLUPD-SRVKXCTJSA-N Gln-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHNJVFYHIKLUPD-SRVKXCTJSA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 1
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 1
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 1
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 1
- HLRLXVPRJJITSK-IFFSRLJSSA-N Gln-Thr-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HLRLXVPRJJITSK-IFFSRLJSSA-N 0.000 description 1
- WTJIWXMJESRHMM-XDTLVQLUSA-N Gln-Tyr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O WTJIWXMJESRHMM-XDTLVQLUSA-N 0.000 description 1
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 1
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 1
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 1
- LXAUHIRMWXQRKI-XHNCKOQMSA-N Glu-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)C(=O)O LXAUHIRMWXQRKI-XHNCKOQMSA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- CYHBMLHCQXXCCT-AVGNSLFASA-N Glu-Asp-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CYHBMLHCQXXCCT-AVGNSLFASA-N 0.000 description 1
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 1
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 1
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 1
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 1
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 1
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 1
- XNOWYPDMSLSRKP-GUBZILKMSA-N Glu-Met-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(O)=O XNOWYPDMSLSRKP-GUBZILKMSA-N 0.000 description 1
- LKOAAMXDJGEYMS-ZPFDUUQYSA-N Glu-Met-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKOAAMXDJGEYMS-ZPFDUUQYSA-N 0.000 description 1
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 1
- MIIGESVJEBDJMP-FHWLQOOXSA-N Glu-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 MIIGESVJEBDJMP-FHWLQOOXSA-N 0.000 description 1
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 1
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 1
- XAXJIUAWAFVADB-VJBMBRPKSA-N Glu-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XAXJIUAWAFVADB-VJBMBRPKSA-N 0.000 description 1
- HJTSRYLPAYGEEC-SIUGBPQLSA-N Glu-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N HJTSRYLPAYGEEC-SIUGBPQLSA-N 0.000 description 1
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 1
- MLILEEIVMRUYBX-NHCYSSNCSA-N Glu-Val-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O MLILEEIVMRUYBX-NHCYSSNCSA-N 0.000 description 1
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- 102000053187 Glucuronidase Human genes 0.000 description 1
- 108010060309 Glucuronidase Proteins 0.000 description 1
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 1
- JPXNYFOHTHSREU-UWVGGRQHSA-N Gly-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN JPXNYFOHTHSREU-UWVGGRQHSA-N 0.000 description 1
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 1
- LURCIJSJAKFCRO-QWRGUYRKSA-N Gly-Asn-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LURCIJSJAKFCRO-QWRGUYRKSA-N 0.000 description 1
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 1
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 1
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 1
- NTOWAXLMQFKJPT-YUMQZZPRSA-N Gly-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN NTOWAXLMQFKJPT-YUMQZZPRSA-N 0.000 description 1
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 1
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 1
- DENRBIYENOKSEX-PEXQALLHSA-N Gly-Ile-His Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DENRBIYENOKSEX-PEXQALLHSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 1
- WNZOCXUOGVYYBJ-CDMKHQONSA-N Gly-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)CN)O WNZOCXUOGVYYBJ-CDMKHQONSA-N 0.000 description 1
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 1
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 1
- PYFIQROSWQERAS-LBPRGKRZSA-N Gly-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(=O)NCC(O)=O)=CNC2=C1 PYFIQROSWQERAS-LBPRGKRZSA-N 0.000 description 1
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 1
- YJDALMUYJIENAG-QWRGUYRKSA-N Gly-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN)O YJDALMUYJIENAG-QWRGUYRKSA-N 0.000 description 1
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- 241000204988 Haloferax mediterranei Species 0.000 description 1
- SYMSVYVUSPSAAO-IHRRRGAJSA-N His-Arg-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O SYMSVYVUSPSAAO-IHRRRGAJSA-N 0.000 description 1
- MVADCDSCFTXCBT-CIUDSAMLSA-N His-Asp-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MVADCDSCFTXCBT-CIUDSAMLSA-N 0.000 description 1
- ZJSMFRTVYSLKQU-DJFWLOJKSA-N His-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ZJSMFRTVYSLKQU-DJFWLOJKSA-N 0.000 description 1
- UOAVQQRILDGZEN-SRVKXCTJSA-N His-Asp-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UOAVQQRILDGZEN-SRVKXCTJSA-N 0.000 description 1
- KWBISLAEQZUYIC-UWJYBYFXSA-N His-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N KWBISLAEQZUYIC-UWJYBYFXSA-N 0.000 description 1
- FSOXZQBMPBQKGJ-QSFUFRPTSA-N His-Ile-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 FSOXZQBMPBQKGJ-QSFUFRPTSA-N 0.000 description 1
- AIPUZFXMXAHZKY-QWRGUYRKSA-N His-Leu-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AIPUZFXMXAHZKY-QWRGUYRKSA-N 0.000 description 1
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 1
- 235000013717 Houttuynia Nutrition 0.000 description 1
- 240000000691 Houttuynia cordata Species 0.000 description 1
- 101150098499 III gene Proteins 0.000 description 1
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 1
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 1
- ZZHGKECPZXPXJF-PCBIJLKTSA-N Ile-Asn-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZZHGKECPZXPXJF-PCBIJLKTSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 1
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 1
- JLWLMGADIQFKRD-QSFUFRPTSA-N Ile-His-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CN=CN1 JLWLMGADIQFKRD-QSFUFRPTSA-N 0.000 description 1
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 1
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 1
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 1
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 1
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 1
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 1
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 1
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 1
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 1
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 1
- OWSWUWDMSNXTNE-GMOBBJLQSA-N Ile-Pro-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N OWSWUWDMSNXTNE-GMOBBJLQSA-N 0.000 description 1
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 1
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 1
- 108060003951 Immunoglobulin Proteins 0.000 description 1
- 102100034343 Integrase Human genes 0.000 description 1
- 108010061833 Integrases Proteins 0.000 description 1
- 241000222712 Kinetoplastida Species 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 1
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 1
- VIWUBXKCYJGNCL-SRVKXCTJSA-N Leu-Asn-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 VIWUBXKCYJGNCL-SRVKXCTJSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- GLBNEGIOFRVRHO-JYJNAYRXSA-N Leu-Gln-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLBNEGIOFRVRHO-JYJNAYRXSA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 1
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 1
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 1
- BKTXKJMNTSMJDQ-AVGNSLFASA-N Leu-His-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N BKTXKJMNTSMJDQ-AVGNSLFASA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 1
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 1
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- RRVCZCNFXIFGRA-DCAQKATOSA-N Leu-Pro-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RRVCZCNFXIFGRA-DCAQKATOSA-N 0.000 description 1
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 1
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- LINKCQUOMUDLKN-KATARQTJSA-N Leu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N)O LINKCQUOMUDLKN-KATARQTJSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- PNPYKQFJGRFYJE-GUBZILKMSA-N Lys-Ala-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNPYKQFJGRFYJE-GUBZILKMSA-N 0.000 description 1
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- NTEVEUCLFMWSND-SRVKXCTJSA-N Lys-Arg-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O NTEVEUCLFMWSND-SRVKXCTJSA-N 0.000 description 1
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- PXHCFKXNSBJSTQ-KKUMJFAQSA-N Lys-Asn-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)O PXHCFKXNSBJSTQ-KKUMJFAQSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- IBQMEXQYZMVIFU-SRVKXCTJSA-N Lys-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N IBQMEXQYZMVIFU-SRVKXCTJSA-N 0.000 description 1
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 1
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 1
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 1
- KKFVKBWCXXLKIK-AVGNSLFASA-N Lys-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCCN)N KKFVKBWCXXLKIK-AVGNSLFASA-N 0.000 description 1
- WOEDRPCHKPSFDT-MXAVVETBSA-N Lys-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCCN)N WOEDRPCHKPSFDT-MXAVVETBSA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 1
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 1
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 1
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 1
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 1
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- WQDKIVRHTQYJSN-DCAQKATOSA-N Lys-Ser-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N WQDKIVRHTQYJSN-DCAQKATOSA-N 0.000 description 1
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- CUHGAUZONORRIC-HJGDQZAQSA-N Lys-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O CUHGAUZONORRIC-HJGDQZAQSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 1
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 1
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- GUBGYTABKSRVRQ-PICCSMPSSA-N Maltose Natural products O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@@H](CO)OC(O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-PICCSMPSSA-N 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- ULNXMMYXQKGNPG-LPEHRKFASA-N Met-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N ULNXMMYXQKGNPG-LPEHRKFASA-N 0.000 description 1
- OLWAOWXIADGIJG-AVGNSLFASA-N Met-Arg-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(O)=O OLWAOWXIADGIJG-AVGNSLFASA-N 0.000 description 1
- PNDCUTDWYVKBHX-IHRRRGAJSA-N Met-Asp-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PNDCUTDWYVKBHX-IHRRRGAJSA-N 0.000 description 1
- VOOINLQYUZOREH-SRVKXCTJSA-N Met-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N VOOINLQYUZOREH-SRVKXCTJSA-N 0.000 description 1
- DGNZGCQSVGGYJS-BQBZGAKWSA-N Met-Gly-Asp Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O DGNZGCQSVGGYJS-BQBZGAKWSA-N 0.000 description 1
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 1
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 1
- MUDYEFAKNSTFAI-JYJNAYRXSA-N Met-Tyr-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O MUDYEFAKNSTFAI-JYJNAYRXSA-N 0.000 description 1
- 241000187479 Mycobacterium tuberculosis Species 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- 108010047562 NGR peptide Proteins 0.000 description 1
- 102000007999 Nuclear Proteins Human genes 0.000 description 1
- 108010089610 Nuclear Proteins Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- WSXKXSBOJXEZDV-DLOVCJGASA-N Phe-Ala-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@H](C)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 WSXKXSBOJXEZDV-DLOVCJGASA-N 0.000 description 1
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 1
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 1
- UEEVBGHEGJMDDV-AVGNSLFASA-N Phe-Asp-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UEEVBGHEGJMDDV-AVGNSLFASA-N 0.000 description 1
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 1
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 1
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 1
- GPSMLZQVIIYLDK-ULQDDVLXSA-N Phe-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O GPSMLZQVIIYLDK-ULQDDVLXSA-N 0.000 description 1
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- SJRQWEDYTKYHHL-SLFFLAALSA-N Phe-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O SJRQWEDYTKYHHL-SLFFLAALSA-N 0.000 description 1
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 1
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 description 1
- 241000589579 Planomicrobium okeanokoites Species 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 108010076039 Polyproteins Proteins 0.000 description 1
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 1
- VOHFZDSRPZLXLH-IHRRRGAJSA-N Pro-Asn-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VOHFZDSRPZLXLH-IHRRRGAJSA-N 0.000 description 1
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 1
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8213—Targeted insertion of genes into the plant genome by homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Cell Biology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
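The invention discloses a method for improving homologous recombination efficiency, comprising introducing a FokI-dCas9 fusion protein into a host cell. The invention applies the FokI-dCas9 fusion protein to improving homologous recombination efficiency for the first time and provides a new option for achieving efficient homologous-recombination editing with genome editing technology; while improving homologous recombination efficiency, it also reduces the amount of transformation recipient material required.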
Description
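Technical Field
The invention relates to a method for improving the homologous recombination rate, and in particular to a method in which the FokI-dCas9 protein is applied in a gene editing system to improve homologous recombination efficiency.
Background Art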
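As life-science research has entered the genomic era and the genomes of more and more species have been sequenced, interpreting and modifying genome function has become pressing. In recent years, biologists have drawn on findings from research into protein structure and function, fusing protein domains that specifically recognize and bind DNA with endonuclease domains to create sequence-specific nucleases (SSNs) that cleave DNA at chosen sites, thereby achieving targeted modification of specific genomic loci, i.e., genome editing. SSNs mainly include three types: zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN), and the clustered regularly interspaced short palindromic repeats/CRISPR-associated 9 system (CRISPR/Cas9 system). The common feature of these SSNs is that they act as endonucleases to cleave specific DNA sequences and create DNA double-strand breaks (DSBs).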
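In eukaryotes, DSB repair mechanisms are highly conserved and mainly comprise two pathways: non-homologous end joining (NHEJ) and homologous recombination (HR). In NHEJ, the broken chromosome is rejoined, but often imprecisely: a small number of nucleotides are inserted or deleted at the break site, producing gene knockout mutants. In HR, when a homologous sequence is supplied, repair is carried out by synthesis using the homologous sequence as a template, producing precise site-specific replacement or insertion mutants. Of the two pathways, NHEJ is overwhelmingly dominant and can occur in almost all cell types and in different phases of the cell cycle (G1, S and G2); HR, by contrast, occurs at a very low frequency, mainly in S and G2. HR can be divided into two classes according to mechanism: single-strand annealing (SSA) and synthesis-dependent strand annealing (SDSA). After a DSB forms, in both pathways the broken DNA ends undergo 5'-to-3' resection, forming 3' single-stranded ends. The SSA pathway resembles NHEJ: a homologous sequence lies on each side of the DSB, the homologous regions anneal directly to form a complementary duplex, and the DSB is repaired after end processing and ligation; in tandemly repeated regions of the genome, SSA is the main mode of DSB repair. The SDSA pathway is a repair process that depends on DNA synthesis, and homologous recombination in genome editing usually refers to this mode. One 3' single-stranded end produced by 5'-to-3' resection of the DSB invades the homologous donor DNA template and forms a D-loop; DNA synthesis then proceeds using the complementary strand of the homologous donor DNA as template. When extension reaches a position that can base-pair with the other single-stranded end of the DSB, the strand leaves the D-loop and the two single-stranded DSB ends anneal into a duplex, completing repair. The net result of SDSA is the transfer of genetic information from the homologous DNA to the DSB. SDSA occurs at a very low frequency, only 10%-20% of that of SSA under the same conditions. Improving HR efficiency is therefore one of the most important and most urgent tasks in genome editing research.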
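In the CRISPR/Cas9 gene editing system, different sgRNAs are designed to direct the Cas9 endonuclease to cut DNA at defined sites, and different types of modification of the target gene, including gene deletion, addition and replacement, are achieved through the homologous recombination repair mechanism. Clarifying DNA repair mechanisms, and the HR repair process in particular, will therefore help in choosing appropriate methods to improve the efficiency of site-specific insertion or replacement in genome editing.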
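Summary of the Invention
The object of the invention is to provide a method for improving homologous recombination efficiency, showing for the first time that a FokI-dCas9 fusion protein can improve homologous recombination efficiency.
To achieve the above object, the invention provides a method for improving homologous recombination efficiency, comprising introducing a FokI-dCas9 fusion protein into a host cell.
Further, the FokI-dCas9 fusion protein is transiently or stably expressed in the host cell.
Still further, the host cell is a plant cell.
Preferably, the plant is maize, rice, soybean, Arabidopsis, cotton, rapeseed, sorghum, wheat, barley, millet, sugarcane or oat.
On the basis of the above technical solution, the amino acid sequence of the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
Preferably, the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.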
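To achieve the above object, the invention also provides a genome editing system comprising a FokI-dCas9 fusion protein.
Further, the amino acid sequence of the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
Still further, the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.
Optionally, the genome editing system further comprises a polynucleotide sequence encoding a sequence manipulation system.
Preferably, the sequence manipulation system is a CRISPR/Cas system.
To achieve the above object, the invention also provides a method for achieving genome editing, comprising expressing the genome editing system in an organism.
To achieve the above object, the invention also provides a method for producing a genome-edited plant, comprising introducing into the plant genome a nucleotide sequence encoding the genome editing system.
To achieve the above object, the invention also provides a method for producing genome-edited plant seed, comprising selfing the genome-edited plant produced by said method, thereby obtaining genome-edited plant seed.
To achieve the above object, the invention also provides a method for cultivating a genome-edited plant, comprising:
planting at least one genome-edited plant seed produced by said method;
growing the seed into a plant.
To achieve the above object, the invention also provides a use of the genome editing system in improving homologous recombination efficiency and/or improving genome editing efficiency.
To achieve the above object, the invention also provides a use of a FokI-dCas9 fusion protein in improving homologous recombination efficiency.
Further, the amino acid sequence of the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
Still further, the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.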
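FokI as referred to in the invention is a type II restriction endonuclease comprising a DNA recognition domain and a catalytic (endonuclease) domain. The fusion proteins described herein may include all of FokI or only the catalytic endonuclease domain, e.g., amino acids 388-583 or 408-583 of GenBank Accession No. AAA24927.1, e.g., as described in Li et al., Nucleic Acids Res. 39(1):359-372 (2011); Cathomen and Joung, Mol. Ther. 16:1200-1207 (2008), or a mutated form of FokI as described in Miller et al., Nat Biotechnol 25:778-785 (2007); Szczepek et al., Nat Biotechnol 25:786-793 (2007); or Bitinaite et al., Proc. Natl. Acad. Sci. USA 95:10570-10575 (1998).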
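Cas9 as referred to in the invention, an important protein component of the type II CRISPR/Cas system, can be isolated from organisms such as Streptococcus species (Streptococcus sp.), preferably Streptococcus pyogenes. When Cas9 is complexed with two RNAs termed CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), it forms an active endonuclease that cleaves foreign genetic elements in invading phages or plasmids to protect the host cell. The crRNA is transcribed from the CRISPR element in the host genome, which was previously captured from the foreign invader. Studies have shown that a single-chain chimeric RNA produced by fusing the essential portions of crRNA and tracrRNA can replace the two RNAs in the Cas9/RNA complex to form a functional endonuclease. A variant of the Cas9 protein may be a mutant form of Cas9 in which the catalytic aspartate residue is changed to any other amino acid. Preferably, the other amino acid may be alanine.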
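In the FokI-dCas9 fusion protein of the invention, the FokI sequence is fused to dCas9 (preferably to the amino terminus of dCas9, but optionally also to the carboxy terminus), optionally through an intervening linker, e.g., a linker of 2-30 amino acids, e.g., 4-12 amino acids, e.g., Gly4Ser. In some embodiments, the fusion protein includes a linker between the dCas9 and FokI domains. Linkers that can be used in these fusion proteins (or between fusion proteins in a tandem arrangement) may include any sequence that does not interfere with the function of the fusion protein. In preferred embodiments, the linker is short, e.g., 2-20 amino acids, and is typically flexible (i.e., it contains amino acids with a high degree of freedom, such as glycine, alanine and serine). In some embodiments, the linker comprises one or more units consisting of GGGS or GGGGS, e.g., 2, 3, 4 or more repeats of the GGGS or GGGGS unit; other linker sequences can also be used.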
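The repeat-unit linkers described above can be enumerated with a short sketch. The function name `build_linker` and its validation are illustrative assumptions and do not appear in the patent; the 2-30 residue bound is simply the example range quoted in the paragraph.

```python
def build_linker(unit: str = "GGGGS", repeats: int = 1) -> str:
    """Return a flexible linker made of `repeats` copies of `unit`."""
    if unit not in ("GGGS", "GGGGS"):
        raise ValueError("unit must be GGGS or GGGGS")
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    linker = unit * repeats
    # The description quotes 2-30 amino acids as an example linker length.
    if not 2 <= len(linker) <= 30:
        raise ValueError(f"linker length {len(linker)} is outside 2-30 residues")
    return linker


print(build_linker("GGGGS", 1))  # a single Gly4Ser unit
print(build_linker("GGGS", 3))   # three GGGS repeats, 12 residues
```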
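The 2A peptide (T2A) used in the invention is a short "self-cleaving" peptide, first identified in foot-and-mouth disease virus (FMDV), with an average length of 18-22 amino acids. During translation, the 2A peptide is cleaved between its last two C-terminal amino acids by ribosomal skipping (de Felipe et al., 2003). Formation of the peptide bond between glycine and proline at the 2A site is impaired, which triggers ribosomal skipping so that translation resumes from the second codon, allowing the two proteins in one transcription unit to be expressed independently. This 2A-mediated cleavage occurs widely in all eukaryotic animal cells. The high cleavage efficiency of 2A and its ability to promote balanced expression of the upstream and downstream genes can be used to improve the expression efficiency of heteromultimeric proteins (such as cell surface receptors, cytokines and immunoglobulins).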
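The guide RNA (gRNA) referred to in the invention, also called small guide RNA (sgRNA), acts in a post-transcriptional modification process called RNA editing in kinetoplastids, and is also a small non-coding RNA. It can pair with pre-mRNA and insert uridines (U) into it, producing functional mRNA. Guide RNA molecules that edit RNA are about 60-80 nucleotides long and are transcribed from separate genes. At the 5' end of the gRNA there is an anchor region that is complementary to the unedited pre-mRNA sequence through special G-U pairing; the anchor sequence promotes complementary, specific binding of the gRNA to the editing region in the pre-mRNA. In the middle of the gRNA molecule there is an editing region that determines the positions at which U is inserted in the edited pre-mRNA molecule and is exactly complementary to the edited mRNA. At the 3' end of the gRNA molecule there is a post-transcriptionally added, non-encoded poly(U) sequence of about 15 residues, whose function is to link the gRNA to the purine-rich nucleotide sequence 5' upstream of the editing region of the pre-mRNA. During editing, an editosome is formed, the transcript is corrected using the internal sequence of the gRNA as a template, and the edited mRNA is produced.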
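There are three types of CRISPR/Cas system, of which the type II CRISPR/Cas system, involving the Cas9 protein, crRNA and tracrRNA, is representative. Guided by an artificially modified guide RNA, the Cas9 protein can target DNA sequences of the form 5'-N20-NGG-3' (N represents any deoxynucleotide base), where N20 is the 20 bases identical to the 5' sequence of the gRNA and NGG is the PAM (protospacer-adjacent motif). Cas9 cleaves in the region near the PAM. This offers an advantage over zinc finger and transcription activator-like effector DNA-binding proteins, because site specificity in the nucleotide-binding CRISPR-Cas protein is governed by an RNA molecule rather than by a DNA-binding protein.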
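The 5'-N20-NGG-3' target rule stated above can be illustrated with a minimal scanner. This is only a sketch: the function name and the demo sequence are hypothetical, it searches the given strand only (a real design tool would also scan the reverse complement and score off-targets), and it is not the ZiFiT procedure used later in the examples.

```python
import re


def find_cas9_targets(dna: str):
    """Yield (start, protospacer, pam) for every 5'-N20-NGG-3' site on the given strand.

    `start` is the 0-based index of the first protospacer base.
    """
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    for m in pattern.finditer(dna):
        yield m.start(), m.group(1), m.group(2)


# Hypothetical demo sequence: 2 flanking bases, a 20-nt protospacer, an AGG PAM, 2 flanking bases.
demo = "TT" + "ACGT" * 5 + "AGG" + "CC"
for start, protospacer, pam in find_cas9_targets(demo):
    print(start, protospacer, pam)
```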
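Recombinant, as used in the invention with reference to, e.g., a cell, nucleic acid, protein or vector, indicates that the cell, nucleic acid, protein or vector has been modified by the introduction of a heterologous nucleic acid or protein or by the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified.
In the invention, the guide RNA can be delivered into a cell or organism in the form of RNA or of DNA encoding the guide RNA. The guide RNA may be in the form of isolated RNA, RNA incorporated into a viral vector, or encoded in a vector. Preferably, the vector may be a viral vector, a plasmid vector, or an Agrobacterium vector.
The DNA encoding the guide RNA may be a vector comprising a sequence encoding the guide RNA. For example, the guide RNA can be introduced into a cell or organism by transfecting the cell or organism with isolated guide RNA or with plasmid DNA comprising a promoter and a sequence encoding the guide RNA.
Cleavage or cutting, as used in the invention, refers to breakage of the covalent backbone of a nucleotide molecule. A guide RNA can be prepared to be specific for any target to be cleaved, and any target DNA can be cleaved by means of the target-specific portion of the guide RNA.
Non-homologous end joining (NHEJ), as used in the invention, refers to a repair mechanism that requires no template at all: repair proteins bring the broken DNA ends directly together, and the ends are then rejoined with the help of DNA ligase.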
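Homologous recombination (HR), as used in the invention, refers to recombination occurring between non-sister chromatids, or between or within DNA molecules containing homologous sequences on the same chromosome. Homologous recombination requires catalysis by a series of proteins, such as RecA, RecBCD, RecF, RecO and RecR in prokaryotic cells, and Rad51, Mre11-Rad50 and others in eukaryotic cells. The homologous recombination reaction is usually divided into three stages according to the formation and resolution of the crossover molecule, or Holliday structure: the presynaptic stage, synapsis formation, and resolution of the Holliday structure. The reaction depends on homology between DNA molecules. Recombination between DNA molecules with 100% homology is commonly seen between non-sister chromatids and is called Homologous Recombination, whereas recombination between or within DNA molecules with less than 100% homology is called Homeologous Recombination. The latter can be "edited" by proteins responsible for mismatch repair, such as MutS in prokaryotic cells or MSH2-3 in eukaryotic cells. Homologous recombination can exchange DNA in both directions or transfer DNA in one direction; the latter is also called gene conversion.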
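In the invention, the single-strand annealing (SSA) model was proposed by Lin et al. in 1984. In the SSA model, recombination begins at a DNA double-strand break. Through the action of a single-strand-specific exonuclease, single-stranded DNA regions gradually form on both sides of the break, and this continues until complementary single strands are exposed at the two break ends. The complementary single strands anneal, the non-complementary ends are removed, and the single-strand gaps are repaired and ligated, completing recombination. The SSA model does not involve the double-stranded DNA recognition and pairing required by other models and does not form a Holliday structure as a recombination intermediate. Recombination therefore produces an exchange of DNA duplexes and loses the single-stranded DNA sequence of the non-annealed regions.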
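In the invention, "tandem" means arranging two or more guide RNAs (sgRNAs) in a string, with the head of each sgRNA joined to the tail of the preceding sgRNA by a Csy4 cleavage recognition sequence.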
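The tandem arrangement defined above amounts to joining target-plus-scaffold units with a Csy4 recognition sequence between consecutive units. The sketch below shows that ordering only; the function name is hypothetical and the bracketed strings are placeholders, not the sequences of SEQ ID NO: 9-11 or 15.

```python
def build_tandem_array(targets, scaffold, csy4_site):
    """Join target+scaffold units head-to-tail, with a Csy4 site between units."""
    if len(targets) < 2:
        raise ValueError("a tandem array needs two or more sgRNAs")
    units = [target + scaffold for target in targets]
    # The Csy4 site links the tail of one sgRNA to the head of the next.
    return csy4_site.join(units)


# Placeholder labels stand in for real sequences.
array = build_tandem_array(["[target1]", "[target2]"], "[sgRNA]", "[Csy4-R]")
print(array)  # [target1][sgRNA][Csy4-R][target2][sgRNA]
```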
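The genome of a plant, plant tissue or plant cell, as used in the invention, refers to any genetic material within the plant, plant tissue or plant cell, and includes the nuclear, plastid and mitochondrial genomes.
The polynucleotides and/or nucleotides described in the invention form a complete "gene" that encodes a protein or polypeptide in the desired host cell. Those skilled in the art will readily recognize that the polynucleotides and/or nucleotides of the invention can be placed under the control of regulatory sequences in the host of interest.
As used in this application, including the claims, unless the context clearly dictates otherwise, singular terms and singular forms such as "a", "an" and "the" include plural referents. Thus, for example, "plant", "the plant" or "a plant" also refers to a plurality of plants. Depending on context, the term "plant" may also refer to genetically similar or identical progeny of that plant. Similarly, the term "nucleic acid" may refer to many copies of a nucleic acid molecule, and the term "probe" may refer to identical or similar probe molecules.
Numerical ranges include the numbers defining the range and expressly include every integer and non-integer fraction within the defined range. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.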
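In the invention, the terms "nucleic acid", "nucleotide", "nucleotide sequence", "oligonucleotide" and "polynucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length which, depending on context, may be DNA or RNA or analogs thereof. DNA includes, but is not limited to, cDNA, genomic DNA, synthetic DNA (e.g., artificially synthesized) and DNA (or RNA) containing nucleic acid analogs. A polynucleotide may have any three-dimensional structure and may perform any function, known or unknown. A nucleic acid may be double-stranded or single-stranded (i.e., a sense or an antisense single strand). Non-limiting examples of polynucleotides include genes, gene fragments, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers, and nucleic acid analogs.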
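"Wild type" in the invention denotes the typical form of an organism, strain or gene, or a characteristic as it occurs in nature, as distinguished from mutant or variant forms.
"Mutant" or "variant" in the invention refers to an individual that has undergone mutation and has a sequence different from that of the wild type, which may result in a sequence in which at least part of its function has been lost; for example, a sequence change in a promoter or enhancer region will at least partially affect expression of the coding sequence in the organism. The term "mutation" refers to any change in a nucleic acid sequence that may arise from, for example, deletion, addition, substitution or rearrangement. A mutation may also affect one or more steps in which the sequence participates. For example, a change in a DNA sequence may lead to synthesis of an altered mRNA and/or protein that is active, partially active or inactive.
"Non-naturally occurring" in the invention indicates human involvement. When referring to a nucleic acid molecule or polypeptide, it means that the nucleic acid molecule or polypeptide is at least substantially free from at least one other component with which it is associated in nature or as found in nature.
"Expression" in the invention refers to transcription of a sequence of interest to produce the corresponding mRNA and translation of that mRNA to produce the corresponding product, i.e., a peptide, polypeptide or protein. Expression of the sequence of interest is controlled or modulated by regulatory elements, including 5' regulatory elements such as promoters.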
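The terms "polypeptide", "peptide" and "protein" are used interchangeably in the invention and refer to polymers of amino acids of any length. The polymer may be linear or branched, may comprise modified amino acids, and may be interrupted by non-amino acids. The terms also encompass amino acid polymers that have been modified, for example by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation such as conjugation with a labeling component. The term "amino acid" includes natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, as well as amino acid analogs and peptidomimetics.
The term "vector" in the invention refers to a DNA molecule capable of replicating in a host cell. Plasmids and cosmids are exemplary vectors. In addition, the terms "vector" and "vehicle" are used interchangeably to refer to a nucleic acid molecule that transfers a DNA segment from one cell to another, where the cells need not belong to the same organism (e.g., transfer of a DNA segment from an Agrobacterium cell to a plant cell).
The term "expression vector" in the invention refers to a recombinant DNA molecule containing a desired coding sequence and the appropriate nucleic acid sequences required for expression of the operably linked coding sequence in a particular host organism.
The term "recombinant expression vector" in the invention refers to any agent from any source that is capable of genomic integration or autonomous replication, such as a plasmid, cosmid, virus, BAC (bacterial artificial chromosome), autonomously replicating sequence, phage, or linear or circular single- or double-stranded DNA or RNA nucleotide sequence, including DNA molecules in which one or more DNA sequences have been linked in a functionally operable manner using well-known recombinant DNA techniques.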
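In the invention, a "localization domain" may optionally be added as part of the protein moiety; the localization domain can direct the protein moiety, the programmed protein moiety or the assembled complex to a particular cellular or subcellular location in a living cell. A localization domain can be constructed by fusing the amino acid sequence of the protein moiety to amino acids incorporating domains including: a nuclear localization signal (NLS); a mitochondrial leader sequence (MLS); a chloroplast leader sequence; and/or any sequence designed to transport, direct or localize the protein to a nucleic-acid-containing organelle, cellular compartment or any subdivision of the cell. In some embodiments, the organism is a eukaryote and the localization domain includes a nuclear localization domain (NLS) that allows the protein to enter the nucleus and access genomic DNA. The NLS sequence may include any functional NLS with a positively charged sequence. In other embodiments, the localization domain may include a leader sequence that allows the protein moiety or programmed nucleoprotein to enter an organelle, making modification of organellar DNA possible.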
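In the invention, eukaryotes have three classes of RNA polymerase, which transcribe from three different classes of promoter. For rRNA genes, transcribed by RNA polymerase I, the promoter (type I) is relatively simple and consists of two sequence elements near the transcription start site: the first is the core promoter, comprising nucleotides -45 to +20, which is sufficient on its own to initiate transcription; the other comprises positions -170 to -107, is called the upstream control element, and effectively enhances transcription efficiency.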
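RNA polymerase III transcribes 5S rRNA, tRNA and certain small nuclear RNAs (snRNA). Its promoters (type III) are more complex in composition and can be divided into two subclasses: promoters internal to the structural gene and promoters external to the structural gene. Efficient initiation from an internal promoter depends on two discontinuous DNA segments contained within the gene; these segments comprise distinct contiguous DNA sequences, the A, B or C boxes, and the two boxes are separated from each other. According to the combination of boxes, internal promoters fall into two classes: class I comprises the A and C boxes and has so far been found only in 5S rRNA genes; class II comprises the A and B boxes and is found in tRNA genes, the 7SL RNA gene, and the adenovirus VAI and VAII RNAs. The internal A and B, or A and C, DNA sequences are the binding sites at which transcription factors TFIIIA and TFIIIC initiate transcription. In most cases there are additional regulatory or key elements at the 5' end that are required for efficient transcription of the RNA. The presence or absence of these sequences directly affects transcription efficiency. These sequences are highly diverse, but most promoters have a TATA-box-like sequence at -30 to -20 at the 5' end, similar to external promoters. External promoters lack the corresponding internal sequences and have cis-acting elements only at the 5' end, with a termination signal of four or more thymidines at the end of the gene; examples are the vertebrate U6 small nuclear RNA and 7SK RNA promoters. These promoters are highly similar or identical, their positions and base sequences are highly conserved, and their structure also resembles that of pol II promoters to some extent. Their 5' cis-acting elements include several control elements: a TATA-like sequence at about -30, an snRNA PSE (proximal sequence element) near -60, and one or more octamer motifs known as OCT, 5'-ATGCAAAT-3'. The TATA-like sequence is what makes transcription of snRNA genes specific to pol III. The TATA-like element and the PSE together determine the choice of transcription start site and the transcription efficiency. The distance between the TATA-like element and the PSE determines the specificity of RNA polymerase transcription, but the TATA-like element appears to be the more important, because deleting the PSE from U6 RNA and 7SK RNA genes only lowers transcription efficiency. In addition, the PSE may be related to the B box, which can substitute for the PSE to some extent. These sequences are decisive for transcription of the downstream gene, and they lie relatively far from the start site, often more than 150 bp away, whereas for pol III internal promoters the distance is generally within 80 bp. In contrast to pol III external promoters, the cis-acting elements at the 5' end of pol II promoters play the opposite roles: if the TATA-like element is missing, the PSE can take over its function and determine the transcription start site. Upstream of the PSE, external promoters also have a distal control sequence whose structure resembles the OCT framework of pol II enhancers but is more complex. At -223 there is also a CACC sequence adjacent to the OCT framework. The presence of these distal control sequences can greatly increase expression of U6 RNA and 7SK RNA.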
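Type II genes, transcribed by RNA polymerase II, include all protein-coding genes and some snRNA genes; the promoters of the latter resemble the third subclass of type III gene promoters, while the promoters of protein-coding type II genes share conserved structural sequences. The transcription start site shows no extensive sequence homology, but the first base is adenine and it is flanked by pyrimidines. This region is called the initiator (Inr), and its sequence can be written Py2CAPy5. The Inr element lies at -3 to +5. A promoter consisting only of an Inr element is the simplest form of promoter that can be recognized by RNA polymerase II. Most type II promoters have a consensus sequence called the TATA box, usually in the -30 region, at a fairly fixed position relative to the transcription start site. The TATA box is found in all eukaryotes and is a conserved heptamer; some type II promoters lack a TATA box and are called TATA-less promoters.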
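The Inr consensus Py2CAPy5 quoted above (two pyrimidines, C, A, then five pyrimidines) maps directly onto a regular expression. The sketch below is an illustration only: the function name and the demo sequence are hypothetical, and a literal consensus match is a much cruder criterion than real promoter prediction.

```python
import re

# Py = C or T; the consensus is Py Py C A Py Py Py Py Py (9 nt).
INR = re.compile(r"(?=([CT]{2}CA[CT]{5}))")


def find_inr(dna: str):
    """Return (start, motif) for each Py2CAPy5 match on the given strand (0-based start)."""
    return [(m.start(), m.group(1)) for m in INR.finditer(dna.upper())]


print(find_inr("GGTCCATTCTCGG"))
```

In each reported motif the A sits at offset 3, which corresponds to the start-site adenine described in the text.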
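"Operably linked" or "operatively linked" in the invention denotes a joining of nucleic acid sequences such that one sequence can provide a function required by the linked sequence. In the invention, "operably linked" may mean joining a promoter to a sequence of interest so that transcription of the sequence of interest is controlled and regulated by that promoter. When the sequence of interest encodes a protein and expression of that protein is desired, "operably linked" means that the promoter is joined to the sequence in such a way that the resulting transcript is efficiently translated. If the linkage of the promoter to the coding sequence is a transcriptional fusion and expression of the encoded protein is desired, the linkage is made so that the first translation initiation codon in the resulting transcript is the initiation codon of the coding sequence. Alternatively, if the linkage of the promoter to the coding sequence is a translational fusion and expression of the encoded protein is desired, the linkage is made so that the first translation initiation codon contained in the 5' untranslated sequence is joined to the promoter in such a way that the resulting translation product is in frame with the translational open reading frame encoding the desired protein. Nucleic acid sequences that can be "operably linked" include, but are not limited to: sequences providing gene expression functions (i.e., gene expression elements such as promoters, 5' untranslated regions, introns, protein-coding regions, 3' untranslated regions, polyadenylation sites and/or transcription terminators), sequences providing DNA transfer and/or integration functions (i.e., T-DNA border sequences, site-specific recombinase recognition sites, integrase recognition sites), sequences providing selection functions (i.e., antibiotic resistance markers, biosynthetic genes), sequences providing scorable marker functions, sequences that assist sequence manipulation in vitro or in vivo (i.e., polylinker sequences, site-specific recombination sequences) and sequences providing replication functions (i.e., bacterial origins of replication, autonomously replicating sequences, centromere sequences).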
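In the invention, regulatory elements are operably linked to one or more elements of a CRISPR system so as to drive expression of said one or more elements of the CRISPR system. In general, CRISPRs (clustered regularly interspaced short palindromic repeats), also known as SPIDRs (SPacer Interspersed Direct Repeats), constitute a family of DNA loci that are usually specific to a particular bacterial species. The CRISPR locus comprises a distinct class of interspersed short sequence repeats (SSRs) that were recognized in Escherichia coli, together with associated genes. Similar interspersed SSRs have been identified in Haloferax mediterranei, Streptococcus pyogenes, Anabaena and Mycobacterium tuberculosis. CRISPR loci typically differ from other SSRs in the structure of their repeats, which have been termed short regularly spaced repeats (SRSRs). In general, the repeats are short elements that occur in clusters and are regularly spaced by unique intervening sequences of substantially constant length. Although the repeat sequences are highly conserved between strains, the number of interspersed repeats and the sequences of the spacer regions typically differ from strain to strain. CRISPR loci have been identified in more than 40 prokaryotes.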
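In the invention, "target sequence", "target site sequence" or "target polynucleotide" is any desired, predetermined nucleic acid sequence to be acted upon, including but not limited to coding or non-coding sequences, genes, exons or introns, regulatory sequences, intergenic sequences, synthetic sequences and intracellular parasite sequences. In some embodiments, the target sequence is present in a target cell, tissue, organ or organism.
The term "primer" refers to an isolated nucleic acid molecule that anneals to a complementary target DNA strand by nucleic acid hybridization, forming a hybrid between the primer and the target DNA strand, and is then extended along the target DNA strand by a polymerase (e.g., a DNA polymerase). The primer pairs of the invention relate to their use in amplifying target nucleic acid sequences, e.g., by the polymerase chain reaction (PCR) or other conventional nucleic acid amplification methods.
Primers are generally 11 nucleotides or more in length, preferably 18 or more, more preferably 24 or more, and most preferably 30 or more. Such primers hybridize specifically to the target sequence under high-stringency hybridization conditions. Although primers that differ from the target DNA sequence yet retain the ability to hybridize to it can be designed by conventional methods, the primers of the invention preferably have complete DNA sequence identity with a contiguous stretch of the target sequence.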
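The primers of the invention hybridize to the target DNA sequence under stringent conditions. Under certain circumstances a nucleic acid molecule or fragment thereof can hybridize specifically to another nucleic acid molecule. As used in the invention, two nucleic acid molecules are said to be capable of specifically hybridizing to each other if they can form an antiparallel double-stranded nucleic acid structure. One nucleic acid molecule is said to be the "complement" of another if the two exhibit complete complementarity. As used in the invention, two nucleic acid molecules are said to exhibit "complete complementarity" when every nucleotide of one is complementary to the corresponding nucleotide of the other. Two nucleic acid molecules are said to be "minimally complementary" if they can hybridize to each other with sufficient stability to anneal and remain bound to each other under at least conventional "low-stringency" conditions. Similarly, two nucleic acid molecules are said to be "complementary" if they can hybridize to each other with sufficient stability to anneal and remain bound to each other under conventional "high-stringency" conditions. Departures from complete complementarity are permissible, provided they do not completely preclude the two molecules from forming a double-stranded structure. For a nucleic acid molecule to serve as a primer or probe, it need only be sufficiently complementary in sequence to form a stable double-stranded structure under the particular solvent and salt concentrations employed.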
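The "complete complementarity" definition above (every nucleotide pairs with its counterpart in an antiparallel duplex) can be checked mechanically. The following is a minimal sketch for plain A/C/G/T DNA strings; the function names are hypothetical, and it says nothing about stringency, which depends on temperature and salt rather than on sequence alone.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(a: str, b: str) -> bool:
    """True when every base of `a` pairs with the corresponding base of `b` in antiparallel orientation."""
    return len(a) == len(b) and a.upper() == reverse_complement(b).upper()


print(is_fully_complementary("ATGC", "GCAT"))  # both written 5'->3'
print(is_fully_complementary("ATGC", "GCAA"))  # one mismatch
```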
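The term "specific binding (to a target sequence)" means that, under stringent hybridization conditions, the primer hybridizes only to the target sequence in a sample containing the target sequence.
In the invention, a "kit" may include the genome modification system described in the invention together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kit may include instructional material containing directions (e.g., protocols) for practicing the methods described in the invention.
Transformation protocols, and protocols for introducing nucleotide sequences into plants, vary with the type of plant or plant cell targeted for transformation, i.e., monocot or dicot. Suitable methods for introducing a nucleotide sequence into plant cells and subsequently inserting it into the plant genome include, but are not limited to, Agrobacterium-mediated transformation, microprojectile bombardment, direct DNA uptake into protoplasts, electroporation, and silicon-whisker-mediated DNA introduction. Transformed cells can be grown into plants in the conventional manner. These plants are cultivated and pollinated with the same transformed line or a different transformed line, and the resulting hybrids express the desired identified phenotypic trait. Two or more generations may be grown to ensure that expression of the desired phenotypic trait is stably maintained and inherited, and seed is then harvested to ensure that expression of the desired phenotypic trait has been achieved.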
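The invention provides a method for improving homologous recombination efficiency, which has the following advantages:
1. The invention applies the FokI-dCas9 fusion protein to improving homologous recombination efficiency for the first time, using the endonuclease activity of the FokI dimer to raise the probability that the homologous recombination target site is cut and thereby improve homologous recombination efficiency.
2. The invention provides a new option for achieving efficient homologous-recombination editing with genome editing technology; while improving homologous recombination efficiency, it also reduces the amount of transformation recipient material required.
The technical solution of the invention is described in further detail below with reference to the drawings and examples.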
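Brief Description of the Drawings
Figure 1 is a flow chart of the construction of the recombinant cloning scissors vector DBN01-T in the method of the invention for improving homologous recombination efficiency;
Figure 2 is a flow chart of the construction of the recombinant expression vector DBN-GET326 in the method of the invention for improving homologous recombination efficiency;
Figure 3 is a schematic diagram of the structure of the recombinant expression vector DBN-GET344 in the method of the invention for improving homologous recombination efficiency;
Figure 4 is a schematic diagram of the structure of the recombinant expression vector DBN-GET345 in the method of the invention for improving homologous recombination efficiency;
Figure 5 shows the GUS staining standards for resistant rice calli in the method of the invention for improving homologous recombination efficiency.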
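Detailed Description of the Embodiments
The technical solution of the method of the invention for improving homologous recombination efficiency is further illustrated below by specific examples.
Example 1. Construction of the scissors vector
1. Construction of the base vector and the recombinant cloning scissors vector
The pCAMBIA2300 vector (available from CAMBIA) was modified; constructing vectors by conventional restriction digestion methods is well known to those skilled in the art. The BsaI site on pCAMBIA2300 was removed by point mutation and the kanamycin expression cassette was removed at the same time, giving the pDBN backbone vector. A PAT expression cassette was introduced into the pDBN backbone vector to give the expression vector DBN-PAT, which was used in the vector constructions below.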
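The synthesized Csy4-T2A-FokI-dCas9 nucleotide sequence was ligated into the cloning vector pGEM-T (Promega, Madison, USA, CAT: A3600) following the instructions for the Promega pGEM-T vector, giving the recombinant cloning scissors vector DBN01-T. The construction flow is shown in Figure 1 (where Amp denotes the ampicillin resistance gene; f1 denotes the origin of replication of phage f1; LacZ is the LacZ start codon; SP6 is the SP6 RNA polymerase promoter; T7 is the T7 RNA polymerase promoter; Csy4-T2A-FokI-dCas9 is the Csy4-T2A-FokI-dCas9 nucleotide sequence (the Csy4-T2A-FokI-dCas9 nucleotide sequence is shown in SEQ ID NO: 1; the Csy4 amino acid sequence in SEQ ID NO: 2; the T2A polypeptide sequence in SEQ ID NO: 3; the FokI amino acid sequence in SEQ ID NO: 4; and the dCas9 amino acid sequence in SEQ ID NO: 5); MCS is the multiple cloning site).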
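As a consistency check on the components listed above, the nucleotide coordinates of each segment within SEQ ID NO: 1 can be derived from the amino acid lengths in the sequence listing. This is a sketch under stated assumptions: the Csy4, T2A, FokI and dCas9 lengths are the `<211>` values of SEQ ID NO: 2-5, whereas the 8-residue NLS (MPKKKRKV) and 5-residue GGGGS linker entries are this sketch's own reading of SEQ ID NO: 1 and are not named in the text.

```python
# (segment name, length in codons). NLS and linker entries are assumptions
# inferred from reading SEQ ID NO: 1; the other lengths come from SEQ ID NO: 2-5.
SEGMENTS = [
    ("Csy4", 188),
    ("T2A", 18),
    ("NLS (assumed, MPKKKRKV)", 8),
    ("FokI", 198),
    ("linker (assumed, GGGGS)", 5),
    ("dCas9", 1423),
    ("stop codon", 1),
]


def layout(segments):
    """Return (name, start, end) in 1-based inclusive nucleotide coordinates."""
    out, pos = [], 1
    for name, n_codons in segments:
        end = pos + 3 * n_codons - 1
        out.append((name, pos, end))
        pos = end + 1
    return out


for name, start, end in layout(SEGMENTS):
    print(f"{name:26s} {start:5d}-{end:5d}")
```

Under these assumptions FokI begins at nucleotide 643 and the stop codon ends at 5523, which agrees with the "positions 643-5523" span cited for the FokI-dCas9 coding sequence. A 1-based inclusive span `start-end` corresponds to the Python slice `seq[start - 1:end]`.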
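The recombinant cloning scissors vector DBN01-T was then transformed into E. coli T1 competent cells (Transgen, Beijing, China, CAT: CD501) by heat shock under the following conditions: 50 μL of E. coli T1 competent cells and 10 μL of plasmid DNA (recombinant cloning scissors vector DBN01-T) were held in a 42°C water bath for 90 s and then cultured with shaking at 37°C for 1 h (shaker at 100 rpm). The cells were grown overnight on LB solid medium (tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L, agar 15 g/L, adjusted to pH 7.5 with NaOH) containing ampicillin (100 mg/L) and surface-coated with IPTG (isopropyl β-D-thiogalactoside) and X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside). White colonies were picked and cultured overnight at 37°C in LB liquid medium (tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L, ampicillin 100 mg/L, adjusted to pH 7.5 with NaOH). Plasmid was extracted by alkaline lysis: the culture was centrifuged at 12000 rpm for 1 min and the supernatant discarded; the cell pellet was resuspended in 100 μL of ice-cold solution I (25 mM Tris-HCl, 10 mM EDTA (ethylenediaminetetraacetic acid), 50 mM glucose, pH 8.0); 200 μL of freshly prepared solution II (0.2 M NaOH, 1% SDS (sodium dodecyl sulfate)) was added, and the tube was inverted 4 times to mix and placed on ice for 3-5 min; 150 μL of ice-cold solution III (3 M potassium acetate, 5 M acetic acid) was added, mixed thoroughly at once, and left on ice for 5-10 min; the mixture was centrifuged at 4°C and 12000 rpm for 5 min, 2 volumes of absolute ethanol were added to the supernatant, and the mixture was left at room temperature for 5 min after mixing; it was centrifuged at 4°C and 12000 rpm for 5 min, the supernatant was discarded, and the pellet was washed with 70% (v/v) ethanol and air-dried; the pellet was dissolved in 30 μL of TE (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) containing RNase (20 μg/mL); RNA was digested in a 37°C water bath for 30 min; and the plasmid was stored at -20°C until use.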
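After the extracted plasmid had been identified by digestion with SnaBI and SpeI, positive clones were verified by sequencing. The results showed that the nucleotide sequence inserted in the recombinant cloning scissors vector DBN01-T was the nucleotide sequence shown in SEQ ID NO: 1 of the sequence listing, i.e., the Csy4-T2A-FokI-dCas9 nucleotide sequence had been inserted correctly.
2. Construction of the recombinant expression scissors vector
The recombinant cloning scissors vector DBN01-T and the expression vector DBN-PAT were each digested with the restriction endonucleases SnaBI and SpeI, and the excised Csy4-T2A-FokI-dCas9 nucleotide sequence was inserted into the expression vector DBN-PAT; constructing vectors by conventional restriction digestion methods is well known to those skilled in the art. This gave the recombinant expression vector DBN-GET326, whose construction flow is shown in Figure 2 (RB: right border; pr35S: cauliflower mosaic virus 35S promoter (SEQ ID NO: 6); Csy4-T2A-FokI-dCas9: Csy4-T2A-FokI-dCas9 nucleotide sequence (the Csy4-T2A-FokI-dCas9 nucleotide sequence is shown in SEQ ID NO: 1; the Csy4 amino acid sequence in SEQ ID NO: 2; the T2A polypeptide sequence in SEQ ID NO: 3; the FokI amino acid sequence in SEQ ID NO: 4; and the dCas9 amino acid sequence in SEQ ID NO: 5); t35S: cauliflower mosaic virus 35S terminator (SEQ ID NO: 7); PAT: phosphinothricin acetyltransferase gene (SEQ ID NO: 8); LB: left border).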
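The recombinant expression vector DBN-GET326 was transformed into E. coli T1 competent cells by heat shock under the following conditions: 50 μL of E. coli T1 competent cells and 10 μL of plasmid DNA (recombinant expression vector DBN-GET326) were held in a 42°C water bath for 90 s and then cultured with shaking at 37°C for 1 h (shaker at 100 rpm). The cells were then cultured for 12 h at 37°C on LB solid medium (tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L, agar 15 g/L, adjusted to pH 7.5 with NaOH) containing 50 mg/L kanamycin. White colonies were picked and cultured overnight at 37°C in LB liquid medium (tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L, kanamycin 50 mg/L, adjusted to pH 7.5 with NaOH). Plasmid was extracted by alkaline lysis. The extracted plasmid was identified by digestion with the restriction endonucleases SnaBI and SpeI, and positive clones were identified by sequencing. The results showed that the nucleotide sequence between the SnaBI and SpeI sites of the recombinant expression vector DBN-GET326 was the nucleotide sequence shown in SEQ ID NO: 1 of the sequence listing, i.e., the Csy4-T2A-FokI-dCas9 nucleotide sequence.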
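Example 2. Construction of the rice GUUS validation vectors
1. Selection of the GUUS targets
The target sequence information between the GUUS repeats was entered into the ZiFiT website (https://zifit.partners.org/ZiFiT/ChoiceMenu.aspx), and a pair of usable targets was selected, namely the target 1 sequence (shown in SEQ ID NO: 9) and the target 2 sequence (shown in SEQ ID NO: 10).
2. Construction of the rice no-target vector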
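In this example, the no-target vector was designed with the structure prOsU6+sgRNA+t35S. A PMI expression cassette and a GUUS expression cassette were introduced into the pDBN backbone vector described in Example 1; constructing vectors by conventional restriction digestion methods is well known to those skilled in the art. This gave the rice no-target vector DBN-GET344, whose structure is shown schematically in Figure 3 (LB: left border; prOsU6: rice U6 promoter (SEQ ID NO: 14); Csy4-R: Csy4 cleavage recognition sequence (shown in SEQ ID NO: 11); sgRNA: sgRNA sequence (shown in SEQ ID NO: 15); t35S: cauliflower mosaic virus 35S terminator (SEQ ID NO: 7); pr35S: cauliflower mosaic virus 35S promoter (SEQ ID NO: 6); GUUS: GUS gene containing the target 1 and target 2 sequences (shown in SEQ ID NO: 16); tNos: terminator of the nopaline synthase gene (SEQ ID NO: 17); prUbi: maize ubiquitin gene promoter (SEQ ID NO: 18); PMI: phosphomannose isomerase gene (SEQ ID NO: 19); RB: right border).
Following the method in part 2 of Example 1, the no-target vector DBN-GET344 was transformed into E. coli by heat shock. After the plasmid extracted by alkaline lysis had been identified by digestion with AscI and AvrII, positive clones were verified by sequencing, and the results showed that the no-target vector DBN-GET344 had been constructed correctly.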
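3. Construction of the rice target vector
In this example, the target vector was designed with the structure prOsU6+target+sgRNA+t35S. The two target+sgRNA units were joined by a Csy4 cleavage recognition sequence. The fragments were joined seamlessly using the restriction endonuclease BsaI. The Csy4 cleavage recognition sequence is shown in SEQ ID NO: 11. The two targets used in this example were:
target 1 sequence, shown in SEQ ID NO: 9;
target 2 sequence, shown in SEQ ID NO: 10.
The primers used to introduce target 1 and target 2 were as follows: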
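Forward primer: acatcaggtctccaaacggaggcattggtgcttcttggttttagagctagaaata, shown in SEQ ID NO: 12;
Reverse primer: taggatggtctcgaaaacgtcgaggatgcctgggttgcctgcctatacggcagtgaacgcac, shown in SEQ ID NO: 13;
where, at the 5' end of the above primers, the bold lowercase letters are protective bases, the italic lowercase letters are the BsaI restriction site, and the underlined lowercase letters are the sticky end produced by BsaI.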
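The primer annotation above relies on typography (bold, italic, underline) that does not survive in plain text. The sketch below recovers the same three parts from the primer strings by locating the BsaI recognition site and applying BsaI's known cut geometry (GGTCTC, cutting 1 nt downstream on the top strand and 5 nt downstream on the bottom strand, which leaves a 4-nt overhang). The function name and the dictionary keys are hypothetical, and the `remainder` field is left uninterpreted.

```python
BSAI_SITE = "ggtctc"

FORWARD = "acatcaggtctccaaacggaggcattggtgcttcttggttttagagctagaaata"
REVERSE = "taggatggtctcgaaaacgtcgaggatgcctgggttgcctgcctatacggcagtgaacgcac"


def split_bsai_primer(primer: str) -> dict:
    """Split a primer into protective bases, BsaI site, spacer, 4-nt overhang and the rest."""
    p = primer.lower()
    i = p.find(BSAI_SITE)
    if i < 0:
        raise ValueError("no BsaI recognition site found")
    j = i + len(BSAI_SITE)
    return {
        "protective": p[:i],
        "site": p[i:j],
        "spacer": p[j:j + 1],        # BsaI cuts 1 nt past the site on this strand
        "overhang": p[j + 1:j + 5],  # and 5 nt past it on the opposite strand
        "remainder": p[j + 5:],
    }


for label, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    parts = split_bsai_primer(primer)
    print(label, parts["protective"], parts["site"], parts["spacer"], parts["overhang"])
```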
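Using the synthesized sgRNA+Csy4 recognition sequence as template (<250 ng in the amplification reaction), the target 1 and target 2 sequences were introduced into the template by the above forward and reverse primers, and PCR amplification was carried out with Pfu polymerase (NEB). The PCR reaction mixture was as follows:
The PCR conditions were: initial denaturation at 98°C for 30 s, followed by cycles of denaturation at 98°C for 10 s, annealing at 56-60°C for 30 s and extension at 72°C for 30 s/kb, for a total of 30-32 cycles, with a final extension at 72°C for 5-10 min; the product was held at 4°C.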
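The cycling conditions above give the extension step as a rate (30 s/kb), so the actual time depends on amplicon length. The sketch below turns the stated parameters into a per-cycle extension time and a rough programme duration. The 500 bp amplicon is a hypothetical value chosen for illustration (the source does not state the product size), and ramp times between steps are ignored.

```python
def extension_seconds(amplicon_bp: int, rate_s_per_kb: float = 30.0) -> float:
    """Extension time per cycle at the stated 30 s/kb rate."""
    return rate_s_per_kb * amplicon_bp / 1000.0


def program_minutes(amplicon_bp: int, cycles: int = 30,
                    anneal_s: float = 30.0, final_ext_min: float = 5.0) -> float:
    """Hold time of the whole programme in minutes, excluding ramping and the 4 degree hold."""
    initial_denature_s = 30.0  # 98 C for 30 s
    denature_s = 10.0          # 98 C for 10 s per cycle
    per_cycle = denature_s + anneal_s + extension_seconds(amplicon_bp)
    return (initial_denature_s + cycles * per_cycle) / 60.0 + final_ext_min


# Hypothetical 500 bp product, lower bounds of the stated ranges (30 cycles, 5 min final extension).
print(extension_seconds(500))  # 15.0 s per cycle
print(program_minutes(500))    # 33.0 min
```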
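PCR amplification gave a product containing the target site sequences+sgRNA+Csy4 cleavage recognition sequence. The PCR product was purified with a column purification kit (purchased from Beijing TransGen Biotech Co., Ltd.) according to the product instructions. The PCR product and the expression vector DBN-GET344 were digested with BsaI, the corresponding digestion products were recovered from the gel, and the digested DBN-GET344 vector and the PCR product were ligated at a ratio of 1:10 with T4 ligase at 16°C for 30 min; constructing vectors by conventional restriction digestion methods is well known to those skilled in the art. This gave the rice target vector DBN-GET345, whose structure is shown schematically in Figure 4 (LB: left border; prOsU6: rice U6 promoter (SEQ ID NO: 14); Csy4-R: Csy4 cleavage recognition sequence (shown in SEQ ID NO: 11); target 1: target 1 sequence (SEQ ID NO: 9); target 2: target 2 sequence (SEQ ID NO: 10); sgRNA: sgRNA sequence (shown in SEQ ID NO: 15); t35S: cauliflower mosaic virus 35S terminator (SEQ ID NO: 7); pr35S: cauliflower mosaic virus 35S promoter (SEQ ID NO: 6); GUUS: GUS gene containing the target 1 and target 2 sequences (shown in SEQ ID NO: 16); tNos: terminator of the nopaline synthase gene (SEQ ID NO: 17); prUbi: maize ubiquitin gene promoter (SEQ ID NO: 18); PMI: phosphomannose isomerase gene (SEQ ID NO: 19); RB: right border).
Following the method in part 2 of Example 1, the target vector DBN-GET345 was transformed into E. coli by heat shock. After the plasmid extracted by alkaline lysis had been identified by digestion with KpnI and AscI, positive clones were verified by sequencing, and the results showed that the two targets (target 1 and target 2) had been inserted correctly in the target vector DBN-GET345.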
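Example 3. Transformation of Agrobacterium with the scissors vector and the GUUS validation vectors
The correctly constructed recombinant expression vectors DBN-GET326, DBN-GET344 and DBN-GET345 were transformed into Agrobacterium LBA4404 by the liquid nitrogen method under the following conditions: 100 μL of Agrobacterium LBA4404 and 3 μL of plasmid DNA (recombinant expression vector) were placed in liquid nitrogen for 5 min and then in a 37°C water bath for 5 min. The transformed Agrobacterium LBA4404 was inoculated into an LB tube and cultured at 28°C and 200 rpm for 2 h, then spread on LB solid medium containing 50 mg/L rifampicin and 50 mg/L kanamycin until positive single colonies grew. Single colonies were picked and cultured, their plasmids were extracted and verified by restriction digestion, and the results showed that the structures of the recombinant expression vectors DBN-GET326, DBN-GET344 and DBN-GET345 were entirely correct.
The cultures were mixed in equal volumes in the following combinations: DBN-GET326 and DBN-GET345 (target treatment), DBN-GET326 and DBN-GET344 (no-target treatment), and DBN-GET344 alone (control treatment). The mixtures were left to stand at room temperature for 3 h to give the Agrobacterium suspension for each treatment.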
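Example 4. Obtaining stably transformed rice calli
For Agrobacterium-mediated transformation of rice, briefly, rice seeds (Nipponbare, provided by China Agricultural University) were plated on induction medium (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, maltose 30 g/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, plant gel 3 g/L, pH 5.8), and callus was induced from the mature embryos (step 1: callus induction). Preferred calli were then brought into contact with the Agrobacterium suspensions of the three treatments described above, whereby the Agrobacterium can deliver the construct of interest to at least one cell of the callus (step 2: infection). In this step, the callus is preferably immersed in the Agrobacterium suspension (OD660 = 0.3, infection medium (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, sucrose 30 g/L, glucose 10 g/L, acetosyringone (AS) 40 mg/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, pH 5.4)) to initiate inoculation. The callus was co-cultured with the Agrobacterium for a period (3 days) (step 3: co-cultivation). Preferably, after the infection step the callus is cultured on solid medium (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, sucrose 30 g/L, glucose 10 g/L, acetosyringone (AS) 40 mg/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, plant gel 3 g/L, pH 5.8). The co-cultivation stage is followed by a "recovery" step, in which the recovery medium (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, sucrose 30 g/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, plant gel 3 g/L, pH 5.8) contains at least one antibiotic known to inhibit Agrobacterium growth (cephalosporin 150-250 mg/L) and no selective agent for plant transformants (step 4: recovery). Preferably, the callus is cultured on solid medium with antibiotic but without selective agent, to eliminate the Agrobacterium and give the infected cells a recovery period. The inoculated callus was then cultured on medium containing selective agent (mannose and/or glufosinate), and growing transformed callus was selected (step 5: selection). Preferably, the calli of the target treatment and of the no-target treatment were cultured on solid selection medium containing mannose and glufosinate (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, sucrose 5 g/L, mannose 12.5 g/L, glufosinate 4 mg/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, plant gel 3 g/L, pH 5.8), and the calli of the control treatment were cultured on solid selection medium containing mannose (N6 salts 3.1 g/L, N6 vitamins, casein 300 mg/L, sucrose 5 g/L, mannose 12.5 g/L, 2,4-dichlorophenoxyacetic acid (2,4-D) 2 mg/L, plant gel 3 g/L, pH 5.8), resulting in selective growth of the transformed cells. The resistant calli obtained from this selection were analysed by GUS staining.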
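Example 5. GUS staining assay of rice calli
Stably transformed resistant calli from the target treatment, the no-target treatment and the control treatment were taken as samples. Following the method of Jefferson et al. (Jefferson R.A., Burgess S.M., Hirsh D. Beta-glucuronidase from Escherichia coli as a gene-fusion marker. Proc. Natl. Acad. Sci., 1986, 83:8447-8454), the samples were stained in sealed tubes in GUS staining solution at 37°C for 1-2 days and GUS expression was examined histochemically: once GUUS has been converted into the functional GUS enzyme, it cleaves X-gluc in situ to give a blue precipitate, which shows whether FokI-dCas9 helps GUUS to recover GUS staining. Each treatment had 3 replicates of 10 resistant calli each, and the mean was taken. The procedure was as follows: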
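Step 1. 5-Bromo-4-chloro-3-indolyl-β-D-glucuronide (X-gluc) was dissolved in dimethyl sulfoxide (DMSO) at 40 mg/mL, sealed in foil and stored in a -80°C freezer;
Step 2. GUS staining solution was prepared: 100 mM NaH2PO4, 10 mM Na2EDTA, 0.5 mM K4[Fe(CN)6]·3H2O, 0.5 mM K3[Fe(CN)6] and 1% (v/v) polyethylene glycol octylphenyl ether (Triton X-100), adjusted to pH 7.0 with a pH meter and made up to 1 L with water;
Step 3. The sealed X-gluc stock from step 1 was added to the GUS staining solution prepared in step 2 to a final X-gluc concentration of 0.5 mg/mL for GUS staining;
Step 4. Thirty resistant calli were taken from each of the target treatment, the no-target treatment and the control treatment; every 3 resistant calli were placed in one 2 mL centrifuge tube, the GUS staining solution from step 3 was added to cover the samples, and after 24-48 h in a 37°C incubator the staining was assessed by eye.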
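Steps 1 and 3 above fix the X-gluc stock (40 mg/mL) and working (0.5 mg/mL) concentrations, so the stock volume to add follows from C1·V1 = C2·V2. The sketch below is only that arithmetic; the function name is hypothetical and the 1 mL batch size is an illustrative assumption, since the source does not say how much working solution was made per batch.

```python
def stock_volume_ul(final_mg_per_ml: float, stock_mg_per_ml: float,
                    final_volume_ml: float) -> float:
    """Stock volume in microlitres from C1*V1 = C2*V2."""
    return final_mg_per_ml * final_volume_ml * 1000.0 / stock_mg_per_ml


# 40 mg/mL stock diluted to 0.5 mg/mL is a 1:80 dilution.
print(stock_volume_ul(0.5, 40.0, 1.0))   # 12.5 uL of stock per 1 mL of working solution
print(stock_volume_ul(0.5, 40.0, 20.0))  # 250.0 uL for a hypothetical 20 mL batch
```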
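The GUS staining results are shown in Table 1. In the above experiment using GUS staining to assess homologous recombination efficiency, the degree of GUS staining was divided into four grades, +++, ++, + and -, denoting respectively most cells deep blue, fewer than half of the cells blue, a few cells blue, and no blue; the GUS staining standards are shown in Figure 5. About 24% of the resistant calli from the target treatment showed GUS reversion (14.00% with staining grade +), indicating that co-transformation of the targets with the FokI-dCas9 fusion protein can promote homologous recombination. Only 3.00% of the resistant calli from the control treatment had a few cells stained blue by GUS (staining grade +). Notably, in the absence of targets, about 17.20% of the resistant calli from the no-target treatment showed GUS reversion (staining grade +), a 4.7-fold increase in homologous recombination efficiency over the resistant calli from the control treatment. This indicates that, when there is no cleavage (no target), the FokI-dCas9 fusion protein alone can promote homologous recombination, and that its overexpression can significantly improve homologous recombination efficiency.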
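The 4.7-fold figure quoted above is consistent with a relative increase over the control, (17.20 - 3.00) / 3.00, rather than with the plain ratio of the two percentages, which is about 5.7. The sketch below reproduces both numbers from the percentages stated in the text; this reading of the figure is an inference from the arithmetic, and the function names are hypothetical.

```python
def fold_increase(treated_pct: float, control_pct: float) -> float:
    """Relative increase over control: (treated - control) / control."""
    return (treated_pct - control_pct) / control_pct


def ratio(treated_pct: float, control_pct: float) -> float:
    """Plain ratio of treated to control."""
    return treated_pct / control_pct


no_target, control = 17.20, 3.00  # percentages of GUS-positive resistant calli from the text
print(round(fold_increase(no_target, control), 1))  # 4.7
print(round(ratio(no_target, control), 1))          # 5.7
```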
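In summary, the invention shows for the first time that the FokI-dCas9 fusion protein can promote homologous recombination and significantly improve the efficiency of homologous recombination in cells, while also greatly reducing the amount of transformation recipient material needed, providing a new option for efficient homologous-recombination editing.
Finally, it should be noted that the above examples serve only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail with reference to preferred examples, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or replaced by equivalents without departing from its spirit and scope.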
SEQUENCE LISTING
<110> 北京大北农生物技术有限公司
<120> Method for improving homologous recombination efficiency
<130> DBNBC120
<160> 19
<170> PatentIn version 3.3
<210> 1
<211> 5523
<212> DNA
<213> Artificial sequence
<220>
<223> Csy4-T2A-FokI-dCas9 nucleotide sequence
<400> 1
atgggcgacc actacctgga catcaggctg aggccggacc cggagttccc gccggcccag 60
ctgatgagcg tgctgttcgg caagctgcac caggcactgg tggcccaggg cggcgacagg 120
atcggcgtga gcttcccgga cctggacgag agcaggagca ggctgggcga gaggctgaga 180
atccacgcca gcgccgacga cctgagggca ctgctggcca ggccgtggct ggagggcctg 240
agggaccacc tgcaattcgg cgagccggcc gtggtgccgc acccgacccc gtacaggcag 300
gtgagcaggg tgcaggccaa gagcaacccg gagaggctga ggaggaggct gatgaggagg 360
cacgacctga gcgaggagga ggccaggaag agaatcccgg acaccgtggc aagggccctg 420
gacctgccgt tcgtgaccct gaggagccag agcaccggcc agcacttcag gctgttcatc 480
aggcacggcc cgctacaggt gaccgccgag gagggcggct tcacctgcta cggcctgagc 540
aagggcggct tcgtgccgtg gttcgagggc aggggcagcc tgctgacctg cggcgacgtg 600
gaggagaacc cgggcccgat gccgaagaag aagaggaagg tgtcctccca gctcgtgaag 660
tccgagctcg aggagaagaa gtccgagctc cgccacaagc tcaagtacgt gccgcacgag 720
tacatcgagc tcatcgagat cgcccgcaac tccacccagg accgcatcct cgagatgaag 780
gtgatggagt tcttcatgaa ggtgtacggc taccgcggca agcacctcgg cggctcccgc 840
aagccggacg gcgccatcta caccgtgggc tccccgatcg actacggcgt gatcgtggac 900
accaaggcct actccggcgg ctacaacctc ccgatcggcc aggccgacga gatgcagcgc 960
tacgtggagg agaaccagac ccgcaacaag cacatcaacc cgaacgagtg gtggaaggtg 1020
tacccgtcct ccgtgaccga gttcaagttc ctcttcgtgt ccggccactt caagggcaac 1080
tacaaggccc agctcacccg cctcaaccac atcaccaact gcaacggcgc cgtgctctcc 1140
gtggaggagc tcctcatcgg cggcgagatg atcaaggccg gcaccctcac cctcgaggag 1200
gtgcgccgca agttcaacaa cggcgagatc aacttcggcg gcggcggcag catggactac 1260
aaggaccacg acggggatta caaagaccac gacatagact acaaggatga cgatgacaaa 1320
atggcaccga agaaaaaaag gaaggtcgga atccatggcg ttccagctgc cgataagaaa 1380
tattccatcg gactcgccat tggcacgaat agcgtcggat gggctgttat tactgatgag 1440
tacaaagttc cgtctaagaa gttcaaggtg ctgggcaaca cagaccgcca cagcataaag 1500
aaaaatctca tcggtgcact ccttttcgat agtggggaga ctgcagaagc gacaagattg 1560
aaaaggactg cgagaaggcg ctatacacgg cgtaagaata gaatctgcta ccttcaggag 1620
attttctcta acgaaatggc taaggtcgat gacagtttct ttcatagact tgaggaatcg 1680
ttcttggttg aggaggataa gaaacatgag aggcacccga tatttggaaa catcgtggat 1740
gaggtcgcat atcatgaaaa gtaccccaca atctaccacc tgagaaagaa actcgttgat 1800
tccaccgaca aagcggattt gagactcatc tacctcgctc ttgcccatat gataaagttc 1860
cgcggacact ttctgatcga gggcgacctc aaccctgata atagcgacgt cgataagctc 1920
ttcatccagt tggttcaaac ctacaatcag ctctttgagg aaaacccaat taatgctagt 1980
ggagtggatg caaaagcgat actgtcggcc agactctcca agagcagaag gttggagaac 2040
ctgatcgctc aacttcctgg agaaaagaaa aacggtcttt ttgggaattt gattgccttg 2100
tctctgggcc tcacaccaaa cttcaagtca aattttgacc tcgctgagga tgccaaactt 2160
cagttgtcta aggataccta tgatgacgat cttgacaatt tgctggcaca aattggcgac 2220
cagtacgcgg atctgttcct cgcagcgaag aatctgagtg atgctattct cctttcggac 2280
atactcaggg ttaacactga gatcacaaaa gcacctttga gtgcgtcgat gattaagcgc 2340
tatgatgaac atcaccaaga cctcactttg ctgaaggccc ttgtgcggca gcaattgcca 2400
gagaagtaca aagaaatctt ctttgaccaa tctaagaacg gatacgctgg ctatattgat 2460
ggaggagctt ctcaggagga attctataag tttatcaaac ctatacttga gaagatggat 2520
ggtacagagg aactccttgt taaattgaac agagaagatt tgctgcgcaa gcaacggacc 2580
tttgacaacg gatcaattcc gcatcagata cacctcggcg agcttcatgc catccttcgc 2640
cggcaggaag atttctaccc ctttttgaag gacaaccgcg agaagataga aaaaatcctt 2700
acgttccgga ttccttacta tgtgggtcca ttggcaaggg ggaattcccg ctttgcgtgg 2760
atgactcgga aaagcgagga aactatcaca ccgtggaact tcgaggaagt tgtggacaag 2820
ggagcttctg cccaatcatt cattgagagg atgactaact tcgataagaa cctgccgaac 2880
gagaaagttc tccccaagca ctccctcctt tacgagtatt tcaccgtgta taacgaactt 2940
acgaaggtta aatacgtgac tgagggtatg aggaagccag cattcttgag cggggaacaa 3000
aagaaagcga ttgttgattt gctgtttaaa actaatcgca aggtgacagt caagcagctc 3060
aaagaggatt atttcaagaa aattgaatgt ttcgactctg tggagatatc aggagtcgaa 3120
gataggttta acgcttccct tggcacatac catgacctcc ttaagatcat taaggacaaa 3180
gatttcctgg ataacgagga aaatgaggac atcctcgaag atattgttct taccttgacg 3240
ctgtttgagg atcgcgaaat gatcgaggaa cggcttaaga cgtatgctca cttgttcgac 3300
gataaggtta tgaagcagct caagcgtaga aggtacactg gatggggccg tctgtctaga 3360
aagctcatca acggaatacg tgataaacaa agtggcaaga caattttgga ttttctgaag 3420
tcggacggat tcgccaacag aaattttatg cagctgattc atgacgatag tctcaccttc 3480
aaagaggaca tacagaaggc tcaagtgagt ggtcaagggg attcgctgca tgaacacatc 3540
gcaaacctcg cgggttcacc ggccataaag aaaggaatcc ttcaaactgt taaggtcgtt 3600
gatgagttgg ttaaagtgat gggtaggcac aagcccgaaa acatagtgat cgagatggct 3660
cgcgaaaatc agactacaca aaaagggcag aagaactctc gcgagcggat gaaaaggatt 3720
gaggaaggaa tcaaggaact gggctcacag attctcaaag agcatccagt cgaaaacaca 3780
cagctgcaaa atgagaagct ctatctttac tatctccaaa atggccggga catgtatgtt 3840
gatcaggagc ttgacatcaa ccgtttgtcc gactatgatg tggacgccat tgtcccgcaa 3900
tctttcctta aggacgattc aatcgataat aaggtgttga cccggagcga taaaaaccgt 3960
ggaaagtctg acaatgtccc ttcagaggaa gtggttaaga agatgaagaa ctactggaga 4020
caattgctga atgcaaaact gatcacacag agaaagttcg acaacctcac caaagcagag 4080
agaggtgggc tcagtgaact tgataaagcg ggcttcatta agcgtcagct cgttgagact 4140
agacagatca cgaagcatgt cgcgcagatt ttggattcgc ggatgaacac gaagtacgac 4200
gagaatgata aactgatacg tgaagtcaag gttatcactc ttaagtccaa attggtgagc 4260
gatttcagaa aggacttcca attctataag gtcagggaga tcaacaatta tcatcacgct 4320
cacgatgcct accttaatgc tgttgtgggg accgccctta ttaagaaata ccctaaattg 4380
gagtctgaat tcgtttacgg ggattataag gtctacgacg ttaggaaaat gatagctaag 4440
agtgagcagg agatcggtaa agcaactgcg aagtatttct tttactcgaa catcatgaat 4500
ttctttaaga ccgagataac gctggcaaat ggcgaaatta gaaagaggcc tctcatagag 4560
actaacggtg agacagggga aatcgtctgg gataagggta gggactttgc gacagtgcgc 4620
aaggtcctct ctatgccgca agttaatatt gtgaagaaaa ccgaggtgca gacgggaggc 4680
ttctccaagg aaagcatact tcccaaacgg aactctgata agttgatcgc tcgtaagaaa 4740
gattgggacc ctaagaaata tggtgggttc gattccccaa ctgttgctta cagcgtgctg 4800
gtcgttgcca aggtcgagaa gggtaaatcc aagaaactca aaagcgttaa ggaactcctt 4860
gggattacta tcatggagag atcttcattc gaaaagaatc ctatcgactt tcttgaggcc 4920
aaaggatata aggaagttaa gaaagatctg ataatcaaac tcccaaagta ctcattgttt 4980
gagctggaaa acggcaggaa gcgcatgctt gcttccgccg gagagttgca gaaagggaac 5040
gagttggctc tgccttctaa gtatgttaac ttcctctatc ttgcctctca ttacgagaag 5100
ctcaaaggct caccagagga caacgaacag aaacaacttt ttgtcgagca acataagcac 5160
tatttggatg agattataga acagatcagt gaattctcga aaagggttat ccttgcagat 5220
gcgaatcttg acaaggtgtt gtctgcatac aacaaacata gagataagcc gatcagggag 5280
caagcggaaa atatcattca cctcttcact cttacaaact tgggtgctcc cgctgccttc 5340
aagtattttg ataccacgat tgaccggaaa cgttacacct caacgaagga ggtgctggat 5400
gccaccctca tccaccaatc tattaccgga ctctacgaga ctagaatcga tctctcacag 5460
ctcggcgggg ataaaagacc agcagcgacg aaaaaggcag gacaggctaa gaagaagaaa 5520
tag 5523
<210> 2
<211> 188
<212> PRT
<213> Pseudomonas aeruginosa
<400> 2
Met Gly Asp His Tyr Leu Asp Ile Arg Leu Arg Pro Asp Pro Glu Phe
1 5 10 15
Pro Pro Ala Gln Leu Met Ser Val Leu Phe Gly Lys Leu His Gln Ala
20 25 30
Leu Val Ala Gln Gly Gly Asp Arg Ile Gly Val Ser Phe Pro Asp Leu
35 40 45
Asp Glu Ser Arg Ser Arg Leu Gly Glu Arg Leu Arg Ile His Ala Ser
50 55 60
Ala Asp Asp Leu Arg Ala Leu Leu Ala Arg Pro Trp Leu Glu Gly Leu
65 70 75 80
Arg Asp His Leu Gln Phe Gly Glu Pro Ala Val Val Pro His Pro Thr
85 90 95
Pro Tyr Arg Gln Val Ser Arg Val Gln Ala Lys Ser Asn Pro Glu Arg
100 105 110
Leu Arg Arg Arg Leu Met Arg Arg His Asp Leu Ser Glu Glu Glu Ala
115 120 125
Arg Lys Arg Ile Pro Asp Thr Val Ala Arg Ala Leu Asp Leu Pro Phe
130 135 140
Val Thr Leu Arg Ser Gln Ser Thr Gly Gln His Phe Arg Leu Phe Ile
145 150 155 160
Arg His Gly Pro Leu Gln Val Thr Ala Glu Glu Gly Gly Phe Thr Cys
165 170 175
Tyr Gly Leu Ser Lys Gly Gly Phe Val Pro Trp Phe
180 185
<210> 3
<211> 18
<212> PRT
<213> Foot-and-mouth disease virus
<400> 3
Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu Glu Asn Pro
1 5 10 15
Gly Pro
<210> 4
<211> 198
<212> PRT
<213> Flavobacterium okeanokoites
<400> 4
Ser Ser Gln Leu Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu
1 5 10 15
Arg His Lys Leu Lys Tyr Val Pro His Glu Tyr Ile Glu Leu Ile Glu
20 25 30
Ile Ala Arg Asn Ser Thr Gln Asp Arg Ile Leu Glu Met Lys Val Met
35 40 45
Glu Phe Phe Met Lys Val Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly
50 55 60
Ser Arg Lys Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp
65 70 75 80
Tyr Gly Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn Leu
85 90 95
Pro Ile Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu Glu Asn Gln
100 105 110
Thr Arg Asn Lys His Ile Asn Pro Asn Glu Trp Trp Lys Val Tyr Pro
115 120 125
Ser Ser Val Thr Glu Phe Lys Phe Leu Phe Val Ser Gly His Phe Lys
130 135 140
Gly Asn Tyr Lys Ala Gln Leu Thr Arg Leu Asn His Ile Thr Asn Cys
145 150 155 160
Asn Gly Ala Val Leu Ser Val Glu Glu Leu Leu Ile Gly Gly Glu Met
165 170 175
Ile Lys Ala Gly Thr Leu Thr Leu Glu Glu Val Arg Arg Lys Phe Asn
180 185 190
Asn Gly Glu Ile Asn Phe
195
<210> 5
<211> 1423
<212> PRT
<213> Artificial Sequence
<220>
<223> dCas9 amino acid sequence
<400> 5
Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp
1 5 10 15
Tyr Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val
20 25 30
Gly Ile His Gly Val Pro Ala Ala Asp Lys Lys Tyr Ser Ile Gly Leu
35 40 45
Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr
50 55 60
Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His
65 70 75 80
Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu
85 90 95
Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr
100 105 110
Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu
115 120 125
Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe
130 135 140
Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn
145 150 155 160
Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His
165 170 175
Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu
180 185 190
Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu
195 200 205
Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe
210 215 220
Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile
225 230 235 240
Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser
245 250 255
Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys
260 265 270
Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr
275 280 285
Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln
290 295 300
Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln
305 310 315 320
Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser
325 330 335
Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr
340 345 350
Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His
355 360 365
Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu
370 375 380
Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly
385 390 395 400
Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys
405 410 415
Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu
420 425 430
Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
435 440 445
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg
450 455 460
Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu
465 470 475 480
Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg
485 490 495
Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile
500 505 510
Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln
515 520 525
Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu
530 535 540
Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr
545 550 555 560
Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro
565 570 575
Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe
580 585 590
Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe
595 600 605
Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp
610 615 620
Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile
625 630 635 640
Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu
645 650 655
Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu
660 665 670
Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
675 680 685
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys
690 695 700
Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp
705 710 715 720
Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile
725 730 735
His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val
740 745 750
Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly
755 760 765
Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp
770 775 780
Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile
785 790 795 800
Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser
805 810 815
Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
820 825 830
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu
835 840 845
Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
850 855 860
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile
865 870 875 880
Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu
885 890 895
Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu
900 905 910
Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
915 920 925
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg
930 935 940
Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
945 950 955 960
Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser
965 970 975
Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val
980 985 990
Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp
995 1000 1005
Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala
1010 1015 1020
His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys
1025 1030 1035
Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys
1040 1045 1050
Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile
1055 1060 1065
Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn
1070 1075 1080
Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys
1085 1090 1095
Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp
1100 1105 1110
Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1115 1120 1125
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly
1130 1135 1140
Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu
1145 1150 1155
Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe
1160 1165 1170
Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val
1175 1180 1185
Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu
1190 1195 1200
Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile
1205 1210 1215
Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1220 1225 1230
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1235 1240 1245
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn
1250 1255 1260
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala
1265 1270 1275
Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln
1280 1285 1290
Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile
1295 1300 1305
Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp
1310 1315 1320
Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp
1325 1330 1335
Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr
1340 1345 1350
Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1355 1360 1365
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp
1370 1375 1380
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg
1385 1390 1395
Ile Asp Leu Ser Gln Leu Gly Gly Asp Lys Arg Pro Ala Ala Thr
1400 1405 1410
Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1415 1420
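The protein entries in this listing use three-letter residue codes with a position ruler under each row. Most sequence tools expect one-letter codes, so a conversion step is usually needed first. The following is a minimal sketch of that conversion, applied to the last two rows of SEQ ID NO: 5 as printed above; the function name `to_one_letter` and the variable `C_TERMINUS` are illustrative and not part of the patent.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_text: str) -> str:
    """Convert listing rows to a one-letter string, skipping ruler numbers."""
    tokens = listing_text.split()
    return "".join(THREE_TO_ONE[t] for t in tokens if not t.isdigit())


# Last two residue rows of SEQ ID NO: 5, copied from the listing with rulers.
C_TERMINUS = """\
Ile Asp Leu Ser Gln Leu Gly Gly Asp Lys Arg Pro Ala Ala Thr
1400 1405 1410
Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1415 1420
"""

one_letter = to_one_letter(C_TERMINUS)
print(one_letter)       # IDLSQLGGDKRPAATKKAGQAKKKK
print(len(one_letter))  # 25
```

An unknown token raises `KeyError`, which is deliberate here: a silent skip could hide a transcription error in the listing.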
<210> 6
<211> 328
<212> DNA
<213> Cauliflower mosaic virus
<400> 6
ccattgccca gctatctgtc actttattgt gaagatagtg gaaaaggaag gtggctccta 60
caaatgccat cattgcgata aaggaaaggc catcgttgaa gatgcctctg ccgacagtgg 120
tcccaaagat ggacccccac ccacgaggag catcgtggaa aaagaagacg ttccaaccac 180
gtcttcaaag caagtggatt gatgtgatat ctccactgac gtaagggatg acgcacaatc 240
ccactatcct tcgcaagacc cttcctctat ataaggaagt tcatttcatt tggagaggac 300
acgctgacaa gctgactcta gcagatct 328
<210> 7
<211> 195
<212> DNA
<213> Cauliflower mosaic virus
<400> 7
ctgaaatcac cagtctctct ctacaaatct atctctctct ataataatgt gtgagtagtt 60
cccagataag ggaattaggg ttcttatagg gtttcgctca tgtgttgagc atataagaaa 120
cccttagtat gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa 180
accaaaatcc agtgg 195
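Each entry follows the same layout: numbered header fields (`<210>` sequence number, `<211>` length, `<212>` molecule type, `<213>` organism, `<400>` sequence start), then residue rows that end in a running position count. A short parser can check that the declared length matches the residues actually printed. The sketch below does this for SEQ ID NO: 7 as it appears above; `parse_record` is a hypothetical helper, and it assumes every residue row ends with its running count.

```python
import re

# SEQ ID NO: 7 exactly as printed in the listing.
RECORD = """\
<210> 7
<211> 195
<212> DNA
<213> Cauliflower mosaic virus
<400> 7
ctgaaatcac cagtctctct ctacaaatct atctctctct ataataatgt gtgagtagtt 60
cccagataag ggaattaggg ttcttatagg gtttcgctca tgtgttgagc atataagaaa 120
cccttagtat gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa 180
accaaaatcc agtgg 195
"""


def parse_record(text: str):
    """Return (header fields keyed by code, concatenated residues)."""
    fields = {}
    residues = []
    for line in text.splitlines():
        header = re.match(r"<(\d{3})>\s*(.*)", line)
        if header:
            fields[header.group(1)] = header.group(2).strip()
        elif line.strip():
            # Drop the trailing running position count, then the spacing.
            body = re.sub(r"\s+\d+\s*$", "", line)
            residues.append(body.replace(" ", ""))
    return fields, "".join(residues)


fields, sequence = parse_record(RECORD)
print(fields["213"], fields["212"])
print(int(fields["211"]) == len(sequence))  # True
```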
<210> 8
<211> 552
<212> DNA
<213> Streptomyces viridochromogenes
<400> 8
atgagccctg aaagacggcc tgtggagatt agaccagcga cggcagcgga catggcggcg 60
gtgtgcgaca tcgtgaacca ttacatcgaa acttcaacgg tgaacttccg cacagagccc 120
caaacaccac aggagtggat cgacgatctg gagagacttc aagacagata cccgtggctt 180
gttgcagagg tcgagggcgt ggtcgcgggg atcgcgtatg ccggcccgtg gaaggcgagg 240
aacgcctacg attggacagt ggaatccacc gtgtatgtca gccatcgcca ccagaggctg 300
ggcctcggca gcactctcta cacccatctc ctgaagagca tggaggcgca gggcttcaag 360
tccgtggtcg cagtgattgg cctgcctaac gatccatccg tgagactcca tgaggccctc 420
ggctacactg cgcgcggcac tctgcgcgcc gcgggctata agcacggcgg gtggcatgac 480
gtgggcttct ggcagagaga ctttgaactt cccgctcccc caagacctgt cagacccgtt 540
acgcagatct aa 552
<210> 9
<211> 20
<212> DNA
<213> Oryza sativa
<400> 9
ggaggcattg gtgcttcttg 20
<210> 10
<211> 20
<212> DNA
<213> Oryza sativa
<400> 10
gcaacccagg catcctcgac 20
<210> 11
<211> 20
<212> DNA
<213> Pseudomonas aeruginosa
<400> 11
gttcactgcc gtataggcag 20
<210> 12
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Forward primer
<400> 12
acatcaggtc tccaaacgga ggcattggtg cttcttggtt ttagagctag aaata 55
<210> 13
<211> 62
<212> DNA
<213> Artificial Sequence
<220>
<223> Reverse primer
<400> 13
taggatggtc tcgaaaacgt cgaggatgcc tgggttgcct gcctatacgg cagtgaacgc 60
ac 62
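Comparing the entries above suggests how the two primers relate to SEQ ID NOs: 9, 10, 11 and 15. The sketch below is only a consistency check on the listed sequences, not a description of the cloning procedure. It confirms that the forward primer carries SEQ ID NO: 9 followed by the first 18 nt of the sgRNA sequence (SEQ ID NO: 15), and that the reverse primer carries the reverse complements of SEQ ID NO: 10 and SEQ ID NO: 11. Variable names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


SEQ_9 = "ggaggcattggtgcttcttg"           # Oryza sativa, 20 nt
SEQ_10 = "gcaacccaggcatcctcgac"          # Oryza sativa, 20 nt
SEQ_11 = "gttcactgccgtataggcag"          # Pseudomonas aeruginosa, 20 nt
SGRNA_START = "gttttagagctagaaata"       # first 18 nt of SEQ ID NO: 15
FORWARD = "acatcaggtctccaaacggaggcattggtgcttcttggttttagagctagaaata"          # SEQ ID NO: 12
REVERSE = "taggatggtctcgaaaacgtcgaggatgcctgggttgcctgcctatacggcagtgaacgcac"   # SEQ ID NO: 13

# Forward primer: 17 nt 5' extension + SEQ ID NO: 9 + start of SEQ ID NO: 15.
fwd_ok = FORWARD == FORWARD[:17] + SEQ_9 + SGRNA_START

# Reverse primer: 18 nt 5' extension + rc(SEQ 10) + rc(SEQ 11) + 4 nt.
rev_ok = REVERSE == (
    REVERSE[:18] + reverse_complement(SEQ_10) + reverse_complement(SEQ_11) + "gcac"
)

# Both 5' extensions contain ggtctc, the BsaI recognition sequence.
bsai_ok = "ggtctc" in FORWARD[:17] and "ggtctc" in REVERSE[:18]

print(fwd_ok, rev_ok, bsai_ok)  # True True True
```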
<210> 14
<211> 245
<212> DNA
<213> Oryza sativa
<400> 14
ggatcatgaa ccaacggcct ggctgtattt ggtggttgtg tagggagatg gggagaagaa 60
aagcccgatt ctcttcgctg tgatgggctg gatgcatgcg ggggagcggg aggcccaagt 120
acgtgcacgg tgagcggccc acagggcgag tgtgagcgcg agaggcggga ggaacagttt 180
agtaccacat tgcccagcta actcgaacgc gaccaactta taaacccgcg cgctgtcgct 240
tgtgt 245
<210> 15
<211> 76
<212> DNA
<213> Artificial Sequence
<220>
<223> sgRNA sequence
<400> 15
gttttagagc tagaaatagc aagttaaaat aaggctagtc cgttatcaac ttgaaaaagt 60
ggcaccgagt cggtgc 76
<210> 16
<211> 3536
<212> DNA
<213> Artificial Sequence
<220>
<223> GUS gene containing the target 1 sequence and the target 2 sequence
<400> 16
atggtagatc tgagggtaaa tttctagttt ttctccttca ttttcttggt taggaccctt 60
ttctcttttt atttttttga gctttgatct ttctttaaac tgatctattt tttaattgat 120
tggttatggt gtaaatatta catagcttta actgataatc tgattacttt atttcgtgtg 180
tctatgatga tgatgatagt tacagaaccg acgaacttct ctgtacccga tcaacaccga 240
aacccgtggc gtcttcgacc tcaatggcgt ctggaacttc aagctggact acgggaaagg 300
actggaagag aagtggtacg aaagcaagct gaccgacact attagtatgg ccgtcccaag 360
cagttacaat gacattggcg tgaccaagga aatccgcaac catatcggat atgtctggta 420
cgaacgtgag ttcacggtgc cggcctatct gaaggatcag cgtatcgtgc tccgcttcgg 480
ctctgcaact cacaaagcaa ttgtctatgt caatggtgag ctggtcgtgg agcacaaggg 540
cggattcctg ccattcgaag cggaaatcaa caactcgctg cgtgatggca tgaatcgcgt 600
caccgtcgcc gtggacaaca tcctcgacga tagcaccctc ccggtggggc tgtacagcga 660
gcgccacgaa gagggcctcg gaaaagtcat tcgtaacaag ccgaacttcg acttcttcaa 720
ctatgcaggc ctgcaccgtc cggtgaaaat ctacacgacc ccgtttacgt acgtcgagga 780
catctcggtt gtgaccgact tcaatggccc aaccgggact gtgacctata cggtggactt 840
tcaaggcaaa gccgaaaacc tgaactgaac tgaactgaag gttatgacat tccaagcgga 900
tggaagatcc tgccggtgtt agccgcggtg catctggact cgtccctgta cgaggacccc 960
cagcgcttca atccctggag atggaaggtc agtcgcaata ggattatcag tgtctcaagg 1020
cgccattcag ttccccgtgt tccacaagaa gcaccaatgc ctccgcccat ggtctgtccg 1080
tgcaacccag gcatcctcga ccggagcatc aggagcagga aaaggaggag gattgaacaa 1140
tctacaggaa gaggtctaaa aagctgcctg tgcggtggct ggcttcctgc actgcatgca 1200
ggtcgatctc tgcgacgggc gacggcgcgc gtcgaggcgt tggcggcatg cgcggtcatc 1260
gctcacgcgt ccgcggggat ggtggcctgc ggtgaccgcg gagcttgtaa ggataatgag 1320
gtactggctg gaaggcccaa gagcgggcga ggtagaggtg ttcgcgaacc tgccgggctt 1380
ccccgacaac gtgcgctcca acggcagggg ccagttctgg gtggcgatcg actgctgccg 1440
gacgccggcg caggaggtgt tcgccaagag gccgtggctc cggaccctat acttcaagtt 1500
cccgctgtcg ctcaaggtgc tcacttggaa ggccgccagg aggatgcaca cggtgctcgc 1560
gctcctcgac ggcgaagggc gcgtcgtgga ggtgctcgag gaccggggcc acgaggtgat 1620
gaagctggtg agtgaggtgc gggaggtggg cagcaagctg tggatcggaa ccgtggcgca 1680
caaccacatc gccaccatcc cctacccttt agaggactaa ttttacccgt ggcgtcttcg 1740
acctcaatgg cgtctggaac ttcaagctgg actacgggaa aggactggaa gagaagtggt 1800
acgaaagcaa gctgaccgac actattagta tggccgtccc aagcagttac aatgacattg 1860
gcgtgaccaa ggaaatccgc aaccatatcg gatatgtctg gtacgaacgt gagttcacgg 1920
tgccggccta tctgaaggat cagcgtatcg tgctccgctt cggctctgca actcacaaag 1980
caattgtcta tgtcaatggt gagctggtcg tggagcacaa gggcggattc ctgccattcg 2040
aagcggaaat caacaactcg ctgcgtgatg gcatgaatcg cgtcaccgtc gccgtggaca 2100
acatcctcga cgatagcacc ctcccggtgg ggctgtacag cgagcgccac gaagagggcc 2160
tcggaaaagt cattcgtaac aagccgaact tcgacttctt caactatgca ggcctgcacc 2220
gtccggtgaa aatctacacg accccgttta cgtacgtcga ggacatctcg gttgtgaccg 2280
acttcaatgg cccaaccggg actgtgacct atacggtgga ctttcaaggc aaagccgaaa 2340
ccgtgaaagt gtcggtcgtg gatgaggaag gcaaagtggt cgcaagcacc gagggcctga 2400
gcggtaacgt ggagattccg aatgtcatcc tctgggaacc actgaacacg tatctctacc 2460
agatcaaagt ggaactggtg aacgacggac tgaccatcga tgtctatgaa gagccgttcg 2520
gcgtgcggac cgtggaagtc aacgacggca agttcctcat caacaacaaa ccgttctact 2580
tcaagggctt tggcaaacat gaggacactc ctatcaacgg ccgtggcttt aacgaagcga 2640
gcaatgtgat ggatttcaat atcctcaaat ggatcggtgc caacagcttc cggaccgcac 2700
actatccgta ctctgaagag ttgatgcgtc ttgcggatcg cgagggtctg gtcgtgatcg 2760
acgagactcc ggcagttggc gtgcacctca acttcatggc caccacggga ctcggcgaag 2820
gcagcgagcg cgtcagtacc tgggagaaga ttcggacgtt tgagcaccat caagacgttc 2880
tccgtgaact ggtgtctcgt gacaagaacc atccaagcgt cgtgatgtgg agcatcgcca 2940
acgaggcggc gactgaggaa gagggcgcgt acgagtactt caagccgttg gtggagctga 3000
ccaaggaact cgacccacag aagcgtccgg tcacgatcgt gctgtttgtg atggctaccc 3060
cggagacgga caaagtcgcc gaactgattg acgtcatcgc gctcaatcgc tataacggat 3120
ggtacttcga tggcggtgat ctcgaagcgg ccaaagtcca tctccgccag gaatttcacg 3180
cgtggaacaa gcgttgccca ggaaagccga tcatgatcac tgagtacggc gcagacaccg 3240
ttgcgggctt tcacgacatt gatccagtga tgttcaccga ggaatatcaa gtcgagtact 3300
accaggcgaa ccacgtcgtg ttcgatgagt ttgagaactt cgtgggtgag caagcgtgga 3360
acttcgcgga cttcgcgacc tctcagggcg tgatgcgcgt ccaaggaaac aagaagggcg 3420
tgttcactcg tgaccgcaag ccgaagctcg ccgcgcacgt ctttcgcgag cgctggacca 3480
acattccaga tttcggctac aagaacgcta gccatcacca tcaccatcac gtgtga 3536
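SEQ ID NO: 16 is annotated as a GUS gene containing two target sequences. The sketch below searches both strands of a short window of that entry (positions 1021-1140, copied from the two rows ending at 1080 and 1140) for SEQ ID NO: 9 and SEQ ID NO: 10. It assumes these two sequences are the "target 1" and "target 2" of the annotation, which this part of the listing does not state explicitly. `locate` is a hypothetical helper that returns the strand and the 1-based position of the leftmost matching base on the listed strand.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def locate(target: str, region: str, region_start: int):
    """Find target on either strand; return (strand, 1-based start) or None."""
    idx = region.find(target)
    if idx != -1:
        return "+", region_start + idx
    idx = region.find(reverse_complement(target))
    if idx != -1:
        return "-", region_start + idx
    return None


SEQ_9 = "ggaggcattggtgcttcttg"
SEQ_10 = "gcaacccaggcatcctcgac"

# Positions 1021-1140 of SEQ ID NO: 16 as printed in the listing.
WINDOW_START = 1021
WINDOW = (
    "cgccattcagttccccgtgttccacaagaagcaccaatgcctccgcccatggtctgtccg"
    "tgcaacccaggcatcctcgaccggagcatcaggagcaggaaaaggaggaggattgaacaa"
)

print(locate(SEQ_9, WINDOW, WINDOW_START))   # ('-', 1045)
print(locate(SEQ_10, WINDOW, WINDOW_START))  # ('+', 1082)
```

In this window SEQ ID NO: 9 matches only as its reverse complement, while SEQ ID NO: 10 matches the listed strand directly.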
<210> 17
<211> 253
<212> DNA
<213> Agrobacterium tumefaciens
<400> 17
gatcgttcaa acatttggca ataaagtttc ttaagattga atcctgttgc cggtcttgcg 60
atgattatca tataatttct gttgaattac gttaagcatg taataattaa catgtaatgc 120
atgacgttat ttatgagatg ggtttttatg attagagtcc cgcaattata catttaatac 180
gcgatagaaa acaaaatata gcgcgcaaac taggataaat tatcgcgcgc ggtgtcatct 240
atgttactag atc 253
<210> 18
<211> 1992
<212> DNA
<213> Zea mays
<400> 18
ctgcagtgca gcgtgacccg gtcgtgcccc tctctagaga taatgagcat tgcatgtcta 60
agttataaaa aattaccaca tatttttttt gtcacacttg tttgaagtgc agtttatcta 120
tctttataca tatatttaaa ctttactcta cgaataatat aatctatagt actacaataa 180
tatcagtgtt ttagagaatc atataaatga acagttagac atggtctaaa ggacaattga 240
gtattttgac aacaggactc tacagtttta tctttttagt gtgcatgtgt tctccttttt 300
ttttgcaaat agcttcacct atataatact tcatccattt tattagtaca tccatttagg 360
gtttagggtt aatggttttt atagactaat ttttttagta catctatttt attctatttt 420
agcctctaaa ttaagaaaac taaaactcta ttttagtttt tttatttaat aatttagata 480
taaaatagaa taaaataaag tgactaaaaa ttaaacaaat accctttaag aaattaaaaa 540
aactaaggaa acatttttct tgtttcgagt agataatgcc agcctgttaa acgccgtcga 600
cgagtctaac ggacaccaac cagcgaacca gcagcgtcgc gtcgggccaa gcgaagcaga 660
cggcacggca tctctgtcgc tgcctctgga cccctctcga gagttccgct ccaccgttgg 720
acttgctccg ctgtcggcat ccagaaattg cgtggcggag cggcagacgt gagccggcac 780
ggcaggcggc ctcctcctcc tctcacggca cggcagctac gggggattcc tttcccaccg 840
ctccttcgct ttcccttcct cgcccgccgt aataaataga caccccctcc acaccctctt 900
tccccaacct cgtgttgttc ggagcgcaca cacacacaac cagatctccc ccaaatccac 960
ccgtcggcac ctccgcttca aggtacgccg ctcgtcctcc cccccccccc ctctctacct 1020
tctctagatc ggcgttccgg tccatggtta gggcccggta gttctacttc tgttcatgtt 1080
tgtgttagat ccgtgtttgt gttagatccg tgctgctagc gttcgtacac ggatgcgacc 1140
tgtacgtcag acacgttctg attgctaact tgccagtgtt tctctttggg gaatcctggg 1200
atggctctag ccgttccgca gacgggatcg atttcatgat tttttttgtt tcgttgcata 1260
gggtttggtt tgcccttttc ctttatttca atatatgccg tgcacttgtt tgtcgggtca 1320
tcttttcatg cttttttttg tcttggttgt gatgatgtgg tctggttggg cggtcgttct 1380
agatcggagt agaattctgt ttcaaactac ctggtggatt tattaatttt ggatctgtat 1440
gtgtgtgcca tacatattca tagttacgaa ttgaagatga tggatggaaa tatcgatcta 1500
ggataggtat acatgttgat gcgggtttta ctgatgcata tacagagatg ctttttgttc 1560
gcttggttgt gatgatgtgg tgtggttggg cggtcgttca ttcgttctag atcggagtag 1620
aatactgttt caaactacct ggtgtattta ttaattttgg aactgtatgt gtgtgtcata 1680
catcttcata gttacgagtt taagatggat ggaaatatcg atctaggata ggtatacatg 1740
ttgatgtggg ttttactgat gcatatacat gatggcatat gcagcatcta ttcatatgct 1800
ctaaccttga gtacctatct attataataa acaagtatgt tttataatta ttttgatctt 1860
gatatacttg gatgatggca tatgcagcag ctatatgtgg atttttttag ccctgccttc 1920
atacgctatt tatttgcttg gtactgtttc ttttgtcgat gctcaccctg ttgtttggtg 1980
ttacttctgc ag 1992
<210> 19
<211> 1176
<212> DNA
<213> Escherichia coli
<400> 19
atgcaaaaac tcattaactc agtgcaaaac tatgcctggg gcagcaaaac ggcgttgact 60
gaactttatg gtatggaaaa tccgtccagc cagccgatgg ccgagctgtg gatgggcgca 120
catccgaaaa gcagttcacg agtgcagaat gccgccggag atatcgtttc actgcgtgat 180
gtgattgaga gtgataaatc gactctgctc ggagaggccg ttgccaaacg ctttggcgaa 240
ctgcctttcc tgttcaaagt attatgcgca gcacagccac tctccattca ggttcatcca 300
aacaaacaca attctgaaat cggttttgcc aaagaaaatg ccgcaggtat cccgatggat 360
gccgccgagc gtaactataa agatcctaac cacaagccgg agctggtttt tgcgctgacg 420
cctttccttg cgatgaacgc gtttcgtgaa ttttccgaga ttgtctccct actccagccg 480
gtcgcaggtg cacatccggc gattgctcac tttttacaac agcctgatgc cgaacgttta 540
agcgaactgt tcgccagcct gttgaatatg cagggtgaag aaaaatcccg cgcgctggcg 600
attttaaaat cggccctcga tagccagcag ggtgaaccgt ggcaaacgat tcgtttaatt 660
tctgaatttt acccggaaga cagcggtctg ttctccccgc tattgctgaa tgtggtgaaa 720
ttgaaccctg gcgaagcgat gttcctgttc gctgaaacac cgcacgctta cctgcaaggc 780
gtggcgctgg aagtgatggc aaactccgat aacgtgctgc gtgcgggtct gacgcctaaa 840
tacattgata ttccggaact ggttgccaat gtgaaattcg aagccaaacc ggctaaccag 900
ttgttgaccc agccggtgaa acaaggtgca gaactggact tcccgattcc agtggatgat 960
tttgccttct cgctgcatga ccttagtgat aaagaaacca ccattagcca gcagagtgcc 1020
gccattttgt tctgcgtcga aggcgatgca acgttgtgga aaggttctca gcagttacag 1080
cttaaaccgg gtgaatcagc gtttattgcc gccaacgaat caccggtgac tgtcaaaggc 1140
cacggccgtt tagcgcgtgt ttacaacaag ctgtaa 1176
Claims (16)
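1. A method for improving homologous recombination efficiency, characterized by comprising introducing a FokI-dCas9 fusion protein into a host cell, wherein the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
2. The method for improving homologous recombination efficiency according to claim 1, characterized in that the FokI-dCas9 fusion protein is transiently or stably expressed in the host cell.
3. The method for improving homologous recombination efficiency according to claim 1 or 2, characterized in that the host cell is a plant cell.
4. The method for improving homologous recombination efficiency according to claim 3, characterized in that the plant is maize, rice, soybean, Arabidopsis, cotton, rapeseed, sorghum, wheat, barley, millet, sugarcane or oat.
5. The method for improving homologous recombination efficiency according to claim 4, characterized in that the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.
6. A genome editing system, characterized by comprising a FokI-dCas9 fusion protein, wherein the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
7. The genome editing system according to claim 6, characterized in that the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.
8. The genome editing system according to claim 6 or 7, characterized in that the genome editing system further comprises a polynucleotide sequence encoding a sequence manipulation system.
9. The genome editing system according to claim 8, characterized in that the sequence manipulation system is a CRISPR/Cas system.
10. A method for achieving genome editing, characterized by comprising expressing the genome editing system according to any one of claims 6-9 in an organism.
11. A method for producing a genome-edited plant, characterized by comprising introducing into the plant genome a nucleotide sequence encoding the genome editing system according to any one of claims 6-9.
12. A method for producing genome-edited plant seeds, characterized by comprising selfing the genome-edited plant produced by the method according to claim 11, thereby obtaining genome-edited plant seeds.
13. A method for cultivating a genome-edited plant, characterized by comprising:
planting at least one genome-edited plant seed produced by the method according to claim 12;
growing the seed into a plant.
14. Use of the genome editing system according to any one of claims 6-9 for improving homologous recombination efficiency and/or improving genome editing efficiency.
15. Use of a FokI-dCas9 fusion protein for improving homologous recombination efficiency, characterized in that the FokI-dCas9 fusion protein has the amino acid sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.
16. The use according to claim 15, characterized in that the nucleotide sequence of the FokI-dCas9 fusion protein has the nucleotide sequence shown at positions 643-5523 of SEQ ID NO: 1.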
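Claims 5, 7 and 16 identify the coding sequence by 1-based, inclusive positions 643-5523 of SEQ ID NO: 1. The arithmetic below is a small sketch of how such a range maps to a 0-based, end-exclusive Python slice and how long the region is. SEQ ID NO: 1 is not reproduced in this part of the listing, so `placeholder` is a hypothetical stand-in string, and whether the range includes a stop codon is not stated here.

```python
def region_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def slice_1based(seq: str, start: int, end: int) -> str:
    """Extract a 1-based, inclusive range from a 0-based Python string."""
    return seq[start - 1:end]


START, END = 643, 5523  # positions recited in claims 5, 7 and 16

length = region_length(START, END)
codons, remainder = divmod(length, 3)
print(length, codons, remainder)  # 4881 1627 0

# Hypothetical placeholder for SEQ ID NO: 1 (not the real sequence).
placeholder = "n" * 6000
print(len(slice_1based(placeholder, START, END)) == length)  # True
```

The range is a whole number of codons (4881 = 3 x 1627), which is consistent with it being an open reading frame.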
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710106331.7A CN106978438B (zh) | 2017-02-27 | 2017-02-27 | Method for improving homologous recombination efficiency |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710106331.7A CN106978438B (zh) | 2017-02-27 | 2017-02-27 | Method for improving homologous recombination efficiency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106978438A (zh) | 2017-07-25 |
CN106978438B (zh) | 2020-08-28 |
Family ID=59339365
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710106331.7A Active CN106978438B (zh) | Method for improving homologous recombination efficiency | 2017-02-27 | 2017-02-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106978438B (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
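CN110396523B (zh) * | 2018-04-23 | 2023-06-09 | CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences | Repeat-fragment-mediated site-directed recombination method for plants |
CN112204156A (zh) * | 2018-05-25 | 2021-01-08 | Pioneer Hi-Bred International, Inc. | Systems and methods for improving breeding by modulating recombination rates |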
CA3140442A1 (en) * | 2019-07-08 | 2021-01-14 | Inscripta, Inc. | Increased nucleic acid-guided cell editing via a lexa-rad51 fusion protein |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5400034B2 (ja) * | 2007-04-26 | 2014-01-29 | Sangamo BioSciences, Inc. | Targeted integration into the PPP1R12C locus |
LT3138912T (lt) * | 2012-12-06 | 2019-02-25 | Sigma-Aldrich Co. Llc | CRISPR-based genome modification and regulation |
US9737604B2 (en) * | 2013-09-06 | 2017-08-22 | President And Fellows Of Harvard College | Use of cationic lipids to deliver CAS9 |
CN105524897A (zh) * | 2014-09-30 | 2016-04-27 | BGI Shenzhen | Transcription activator-like effector nuclease and use thereof |
- 2017-02-27: CN application CN201710106331.7A, patent CN106978438B (zh), status Active
Also Published As
Publication number | Publication date |
---|---|
CN106978438A (zh) | 2017-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10415046B2 (en) | Precision gene targeting to a particular locus in maize | |
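CN109072207B (zh) | Improved methods for modifying target nucleic acids | |
CN113166744A (zh) | Novel CRISPR-Cas systems for genome editing | |
KR102127418B1 (ko) | Method for obtaining glyphosate-resistant rice through site-specific nucleotide substitution | |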
EP3080275B1 (en) | Method of selection of transformed diatoms using nuclease | |
WO2019207274A1 (en) | Gene replacement in plants | |
EP2796558A1 (en) | Improved gene targeting and nucleic acid carrier molecule, in particular for use in plants | |
US20160201072A1 (en) | Genome modification using guide polynucleotide/cas endonuclease systems and methods of use | |
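CN110527697B (zh) | RNA site-directed editing technology based on CRISPR-Cas13a | |
JP2018531024A (ja) | Methods and compositions for marker-free genome modification | |
JP2018531024A6 (ja) | Methods and compositions for marker-free genome modification | |
CN116391038A (zh) | Engineered Cas endonuclease variants for improved genome editing | |
CN112105738A (zh) | Targeted transcriptional regulation using synthetic transcription factors | |
CN110607320A (zh) | Backbone vector for targeted base editing of plant genomes and use thereof | |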
US20220235363A1 (en) | Enhanced plant regeneration and transformation by using grf1 booster gene | |
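CN111902541A (zh) | Method for increasing the expression level of a nucleic acid molecule of interest in a cell | |
CN106978438B (zh) | Method for improving homologous recombination efficiency | |
CN116286742B (zh) | CasD protein, CRISPR/CasD gene editing system and use thereof in plant gene editing | |
CN114340656A (zh) | Methods and compositions for promoting targeted genome modification using HUH endonucleases | |
WO2023216415A1 (zh) | Base editing system based on bimolecular deaminase complementation and use thereof | |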
CA3112164C (en) | Virus-based replicon for plant genome editing without inserting replicon into plant genome and use thereof | |
CN106676129A (zh) | Method for improving genome editing efficiency | |
US20230272408A1 (en) | Plastid transformation by complementation of plastid mutations | |
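TWI686477B (zh) | Transgenic vector, kit and method for specifically inducing plant chloroplast gene mutation, and transgenic plant cells and Agrobacterium produced using the same | |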
WO2024158857A2 (en) | Mb2cas12a variants with flexible pam spectrum |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||