MXPA98002972A - Long wavelength engineered fluorescent proteins - Google Patents
Long wavelength engineered fluorescent proteins
- Publication number
- MXPA98002972A; MXPA/A/1998/002972A; MX9802972A
- Authority
- MX
- Mexico
- Prior art keywords
- amino acid
- fluorescent protein
- acid sequence
- substitution
- protein
- Prior art date
Links
- 102000034387 fluorescent proteins Human genes 0.000 title claims abstract description 230
- 108091006031 fluorescent proteins Proteins 0.000 title claims abstract description 230
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 90
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 81
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 159
- 229910052739 hydrogen Inorganic materials 0.000 claims description 130
- 238000006467 substitution reaction Methods 0.000 claims description 129
- 108010043121 Green Fluorescent Proteins Proteins 0.000 claims description 112
- 102000004144 Green Fluorescent Proteins Human genes 0.000 claims description 111
- 239000005090 green fluorescent protein Substances 0.000 claims description 111
- 102000004169 proteins and genes Human genes 0.000 claims description 103
- 108090000623 proteins and genes Proteins 0.000 claims description 103
- 229910052727 yttrium Inorganic materials 0.000 claims description 94
- 210000004027 cells Anatomy 0.000 claims description 92
- 229910052731 fluorine Inorganic materials 0.000 claims description 86
- 230000014509 gene expression Effects 0.000 claims description 86
- 241000243290 Aequorea Species 0.000 claims description 76
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 70
- 150000001413 amino acids Chemical class 0.000 claims description 68
- 102200079156 GFRA1 S65G Human genes 0.000 claims description 61
- 102200079196 GFRA1 S72A Human genes 0.000 claims description 56
- 102200079204 GFRA1 T203Y Human genes 0.000 claims description 53
- 230000035772 mutation Effects 0.000 claims description 42
- 239000000126 substance Substances 0.000 claims description 38
- 229920001184 polypeptide Polymers 0.000 claims description 37
- 229910052799 carbon Inorganic materials 0.000 claims description 35
- 229910052757 nitrogen Inorganic materials 0.000 claims description 34
- 229910052700 potassium Inorganic materials 0.000 claims description 33
- 108090001123 antibodies Proteins 0.000 claims description 31
- 102000004965 antibodies Human genes 0.000 claims description 31
- 229910052721 tungsten Inorganic materials 0.000 claims description 30
- 239000000523 sample Substances 0.000 claims description 26
- 108020001507 fusion proteins Proteins 0.000 claims description 25
- 102000037240 fusion proteins Human genes 0.000 claims description 25
- 102200079149 GFRA1 V68L Human genes 0.000 claims description 23
- 238000002866 fluorescence resonance energy transfer Methods 0.000 claims description 21
- 229920000023 polynucleotide Polymers 0.000 claims description 20
- 239000002157 polynucleotide Substances 0.000 claims description 20
- 102200079206 GFRA1 T203F Human genes 0.000 claims description 19
- 230000003993 interaction Effects 0.000 claims description 17
- 108020004705 Codon Proteins 0.000 claims description 16
- 239000004289 sodium hydrogen sulphite Substances 0.000 claims description 16
- -1 aromatic amino acid Chemical class 0.000 claims description 15
- 230000027455 binding Effects 0.000 claims description 14
- 125000004429 atoms Chemical group 0.000 claims description 13
- 102200079261 GFRA1 E222G Human genes 0.000 claims description 12
- 102200079162 GFRA1 F64L Human genes 0.000 claims description 12
- 102200079157 GFRA1 S65A Human genes 0.000 claims description 12
- 102200079154 GFRA1 Y66H Human genes 0.000 claims description 12
- 108020004711 Nucleic Acid Probes Proteins 0.000 claims description 12
- 239000002853 nucleic acid probe Substances 0.000 claims description 12
- 102200087963 MSH6 S65L Human genes 0.000 claims description 11
- 102200059879 PRKCSH S65V Human genes 0.000 claims description 11
- 102220415446 TDRD15 S65C Human genes 0.000 claims description 11
- 239000001257 hydrogen Substances 0.000 claims description 11
- 102200079159 GFRA1 Y66F Human genes 0.000 claims description 10
- 102200066206 IL36G Q69K Human genes 0.000 claims description 10
- 238000000205 computational biomodeling Methods 0.000 claims description 10
- 102220243291 rs1555194026 Human genes 0.000 claims description 10
- 238000004519 manufacturing process Methods 0.000 claims description 9
- 102200079158 GFRA1 Y66W Human genes 0.000 claims description 8
- 239000000203 mixture Substances 0.000 claims description 8
- 229910052717 sulfur Inorganic materials 0.000 claims description 8
- 239000003446 ligand Substances 0.000 claims description 7
- 210000004962 mammalian cells Anatomy 0.000 claims description 7
- 238000003860 storage Methods 0.000 claims description 7
- 230000002797 proteolytic Effects 0.000 claims description 6
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 5
- 230000017854 proteolysis Effects 0.000 claims description 5
- 108050004351 Green fluorescent protein-related Proteins 0.000 claims description 4
- 238000006366 phosphorylation reaction Methods 0.000 claims description 4
- 230000000865 phosphorylative Effects 0.000 claims description 4
- 102200079205 GFRA1 T203I Human genes 0.000 claims description 3
- 241000124008 Mammalia Species 0.000 claims description 3
- 229920000453 Consensus sequence Polymers 0.000 claims description 2
- 239000011159 matrix material Substances 0.000 claims description 2
- 210000001236 prokaryotic cell Anatomy 0.000 claims description 2
- 239000007787 solid Substances 0.000 claims description 2
- 102200079155 GFRA1 S65T Human genes 0.000 claims 22
- 229920001785 Response element Polymers 0.000 claims 2
- 238000007385 chemical modification Methods 0.000 claims 2
- 230000002209 hydrophobic Effects 0.000 claims 1
- 235000018102 proteins Nutrition 0.000 description 83
- 235000001014 amino acid Nutrition 0.000 description 75
- 230000005284 excitation Effects 0.000 description 27
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 22
- 241000700605 Viruses Species 0.000 description 20
- 238000000034 method Methods 0.000 description 19
- 229940014598 TAC Drugs 0.000 description 18
- 239000000370 acceptor Substances 0.000 description 18
- 230000000694 effects Effects 0.000 description 16
- 230000035897 transcription Effects 0.000 description 16
- 229920001405 Coding region Polymers 0.000 description 15
- 102000005962 receptors Human genes 0.000 description 15
- 108020003175 receptors Proteins 0.000 description 15
- 239000000758 substrate Substances 0.000 description 14
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 10
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 10
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 10
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 10
- 239000000463 material Substances 0.000 description 10
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 9
- 238000004166 bioassay Methods 0.000 description 9
- 238000000295 emission spectrum Methods 0.000 description 9
- 230000002068 genetic Effects 0.000 description 9
- 229920002676 Complementary DNA Polymers 0.000 description 8
- 241000196324 Embryophyta Species 0.000 description 8
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 8
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 8
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 8
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 8
- GVRKWABULJAONN-UHFFFAOYSA-N Valyl-Threonine Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(O)=O GVRKWABULJAONN-UHFFFAOYSA-N 0.000 description 8
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 8
- 108010038983 glycyl-histidyl-lysine Proteins 0.000 description 8
- 230000001939 inductive effect Effects 0.000 description 8
- 238000003780 insertion Methods 0.000 description 8
- 108010034529 leucyl-lysine Proteins 0.000 description 8
- 108010057821 leucylproline Proteins 0.000 description 8
- 230000000051 modifying Effects 0.000 description 8
- 108010051110 tyrosyl-lysine Proteins 0.000 description 8
- 241000242764 Aequorea victoria Species 0.000 description 7
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 7
- 102000001253 Protein Kinases Human genes 0.000 description 7
- 239000002299 complementary DNA Substances 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 125000000267 glycino group Chemical group [H]N([*])C([H])([H])C(=O)O[H] 0.000 description 7
- 101700076752 pyp Proteins 0.000 description 7
- 239000002904 solvent Substances 0.000 description 7
- 230000003612 virological Effects 0.000 description 7
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 6
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 6
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 6
- 241000894006 Bacteria Species 0.000 description 6
- 210000001671 Embryonic Stem Cells Anatomy 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 6
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 6
- 241000880493 Leptailurus serval Species 0.000 description 6
- VTJUNIYRYIAIHF-IUCAKERBSA-N Leu-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O VTJUNIYRYIAIHF-IUCAKERBSA-N 0.000 description 6
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 6
- QCZYYEFXOBKCNQ-STQMWFEESA-N Lys-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCZYYEFXOBKCNQ-STQMWFEESA-N 0.000 description 6
- 108020004999 Messenger RNA Proteins 0.000 description 6
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 6
- GKZIWHRNKRBEOH-HOTGVXAUSA-N Phe-Phe Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C=CC=CC=1)C([O-])=O)C1=CC=CC=C1 GKZIWHRNKRBEOH-HOTGVXAUSA-N 0.000 description 6
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 6
- 108020004511 Recombinant DNA Proteins 0.000 description 6
- UBAQSAUDKMIEQZ-QWRGUYRKSA-N Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBAQSAUDKMIEQZ-QWRGUYRKSA-N 0.000 description 6
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 6
- STTYIMSDIYISRG-WDSKDSINSA-N Val-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(O)=O STTYIMSDIYISRG-WDSKDSINSA-N 0.000 description 6
- 125000000539 amino acid group Chemical group 0.000 description 6
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 6
- 230000000295 complement Effects 0.000 description 6
- 238000010494 dissociation reaction Methods 0.000 description 6
- 230000005593 dissociations Effects 0.000 description 6
- 230000005281 excited state Effects 0.000 description 6
- 230000004927 fusion Effects 0.000 description 6
- 238000000338 in vitro Methods 0.000 description 6
- 229920002106 messenger RNA Polymers 0.000 description 6
- 108010051242 phenylalanylserine Proteins 0.000 description 6
- 238000006862 quantum yield reaction Methods 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 108010061238 threonyl-glycine Proteins 0.000 description 6
- IZAYCFBWPSFFJI-UHFFFAOYSA-N 1-methylsulfonylsulfanylethane Chemical compound CCSS(C)(=O)=O IZAYCFBWPSFFJI-UHFFFAOYSA-N 0.000 description 5
- 239000004475 Arginine Substances 0.000 description 5
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 5
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 5
- 241001465754 Metazoa Species 0.000 description 5
- RJFAYQIBOAGBLC-BYPYZUCNSA-N Selenium-L-methionine Chemical compound C[Se]CC[C@H](N)C(O)=O RJFAYQIBOAGBLC-BYPYZUCNSA-N 0.000 description 5
- 125000001931 aliphatic group Chemical group 0.000 description 5
- 230000029918 bioluminescence Effects 0.000 description 5
- 238000005415 bioluminescence Methods 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- 238000010367 cloning Methods 0.000 description 5
- 238000000695 excitation spectrum Methods 0.000 description 5
- 230000001976 improved Effects 0.000 description 5
- 230000004807 localization Effects 0.000 description 5
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 5
- 238000000520 microinjection Methods 0.000 description 5
- 239000002773 nucleotide Substances 0.000 description 5
- 125000003729 nucleotide group Chemical group 0.000 description 5
- 229910052760 oxygen Inorganic materials 0.000 description 5
- 239000001301 oxygen Substances 0.000 description 5
- MYMOFIZGZYHOMD-UHFFFAOYSA-N oxygen Chemical compound O=O MYMOFIZGZYHOMD-UHFFFAOYSA-N 0.000 description 5
- 235000013824 polyphenols Nutrition 0.000 description 5
- VNYDHJARLHNEGA-RYUDHWBXSA-N (2S)-1-[(2S)-2-azaniumyl-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carboxylate Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 4
- MVORZMQFXBLMHM-QWRGUYRKSA-N (2S)-6-amino-2-[[(2S)-2-[(2-aminoacetyl)amino]-3-(1H-imidazol-5-yl)propanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 MVORZMQFXBLMHM-QWRGUYRKSA-N 0.000 description 4
- ULXYQAJWJGLCNR-YUMQZZPRSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-(carboxymethylamino)-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 4
- DGVVWUTYPXICAM-UHFFFAOYSA-N 2-mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 4
- BNODVYXZAAXSHW-UHFFFAOYSA-N Arginyl-Histidine Chemical compound NC(=N)NCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 BNODVYXZAAXSHW-UHFFFAOYSA-N 0.000 description 4
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 4
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 4
- OMSMPWHEGLNQOD-UHFFFAOYSA-N Asparaginyl-Phenylalanine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 OMSMPWHEGLNQOD-UHFFFAOYSA-N 0.000 description 4
- VGRHZPNRCLAHQA-UHFFFAOYSA-N Aspartyl-Asparagine Chemical compound OC(=O)CC(N)C(=O)NC(CC(N)=O)C(O)=O VGRHZPNRCLAHQA-UHFFFAOYSA-N 0.000 description 4
- QIVBCDIJIAJPQS-SECBINFHSA-N D-tryptophane Chemical compound C1=CC=C2C(C[C@@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-SECBINFHSA-N 0.000 description 4
- 210000001161 Embryo, Mammalian Anatomy 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 108060003162 GFP Proteins 0.000 description 4
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 4
- YSWHPLCDIMUKFE-QWRGUYRKSA-N Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YSWHPLCDIMUKFE-QWRGUYRKSA-N 0.000 description 4
- CLSDNFWKGFJIBZ-UHFFFAOYSA-N Glutaminyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CCC(N)=O CLSDNFWKGFJIBZ-UHFFFAOYSA-N 0.000 description 4
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 4
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 4
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 4
- HTOOKGDPMXSJSY-STQMWFEESA-N His-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 HTOOKGDPMXSJSY-STQMWFEESA-N 0.000 description 4
- RAXXELZNTBOGNW-UHFFFAOYSA-N Imidazole Chemical compound C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 4
- 102000018358 Immunoglobulins Human genes 0.000 description 4
- 108060003951 Immunoglobulins Proteins 0.000 description 4
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 4
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 4
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 4
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 4
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 4
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 4
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 4
- IMTUWVJPCQPJEE-IUCAKERBSA-N Met-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN IMTUWVJPCQPJEE-IUCAKERBSA-N 0.000 description 4
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 4
- BJFJQOMZCSHBMY-YUMQZZPRSA-N Met-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O BJFJQOMZCSHBMY-YUMQZZPRSA-N 0.000 description 4
- 101700040790 PH Proteins 0.000 description 4
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 4
- 108060006633 Protein Kinases Proteins 0.000 description 4
- WXVIGTAUZBUDPZ-DTLFHODZSA-N Thr-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 WXVIGTAUZBUDPZ-DTLFHODZSA-N 0.000 description 4
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 4
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 4
- AOLHUMAVONBBEZ-STQMWFEESA-N Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AOLHUMAVONBBEZ-STQMWFEESA-N 0.000 description 4
- 239000000556 agonist Substances 0.000 description 4
- 239000000427 antigen Substances 0.000 description 4
- 102000038129 antigens Human genes 0.000 description 4
- 108091007172 antigens Proteins 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 108010092854 aspartyllysine Proteins 0.000 description 4
- 239000003623 enhancer Substances 0.000 description 4
- 230000002708 enhancing Effects 0.000 description 4
- 238000005755 formation reaction Methods 0.000 description 4
- 210000004602 germ cell Anatomy 0.000 description 4
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 4
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 4
- STKYPAFSDFAEPH-LURJTMIESA-N gly-val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 4
- 108010015792 glycyllysine Proteins 0.000 description 4
- 108010087823 glycyltyrosine Proteins 0.000 description 4
- 108010036413 histidylglycine Proteins 0.000 description 4
- 230000001965 increased Effects 0.000 description 4
- 230000002401 inhibitory effect Effects 0.000 description 4
- 108010064235 lysylglycine Proteins 0.000 description 4
- 108010017391 lysylvaline Proteins 0.000 description 4
- 239000003550 marker Substances 0.000 description 4
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 4
- 230000036961 partial Effects 0.000 description 4
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol group Chemical group C1(=CC=CC=C1)O ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 4
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 4
- 108010070643 prolylglutamic acid Proteins 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- LCTONWCANYUPML-UHFFFAOYSA-N pyruvic acid Chemical compound CC(=O)C(O)=O LCTONWCANYUPML-UHFFFAOYSA-N 0.000 description 4
- 238000005215 recombination Methods 0.000 description 4
- 238000007363 ring formation reaction Methods 0.000 description 4
- 108010048818 seryl-histidine Proteins 0.000 description 4
- 230000003595 spectral Effects 0.000 description 4
- 238000001890 transfection Methods 0.000 description 4
- 108010084932 tryptophyl-proline Proteins 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- 241000701161 unidentified adenovirus Species 0.000 description 4
- 241001430294 unidentified retrovirus Species 0.000 description 4
- DXJZITDUDUPINW-UHFFFAOYSA-N γ-glutamyl-Asparagine Chemical compound NC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O DXJZITDUDUPINW-UHFFFAOYSA-N 0.000 description 4
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 3
- 102100009049 AGBL1 Human genes 0.000 description 3
- 101710003512 AGBL1 Proteins 0.000 description 3
- 210000000349 Chromosomes Anatomy 0.000 description 3
- 241000206602 Eukaryota Species 0.000 description 3
- 102200079210 GFRA1 M153T Human genes 0.000 description 3
- 102200079215 GFRA1 V163A Human genes 0.000 description 3
- 101710014266 Gzmf Proteins 0.000 description 3
- JKMHFZQWWAIEOD-UHFFFAOYSA-N HEPES Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 3
- 239000007995 HEPES buffer Substances 0.000 description 3
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L MgCl2 Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 3
- 108091022025 Ornithine Decarboxylase Proteins 0.000 description 3
- 102000028557 Ornithine Decarboxylase Human genes 0.000 description 3
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N PMSF Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 3
- 108091005771 Peptidases Proteins 0.000 description 3
- 108091000081 Phosphotransferases Proteins 0.000 description 3
- 206010038997 Retroviral infection Diseases 0.000 description 3
- 241000723873 Tobacco mosaic virus Species 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 238000010521 absorption reaction Methods 0.000 description 3
- 239000012491 analyte Substances 0.000 description 3
- 230000003042 antagonistic Effects 0.000 description 3
- 239000005557 antagonist Substances 0.000 description 3
- 230000031018 biological processes and functions Effects 0.000 description 3
- 230000001413 cellular Effects 0.000 description 3
- 108091006028 chimera Proteins 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 230000000875 corresponding Effects 0.000 description 3
- 238000006297 dehydration reaction Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 238000004520 electroporation Methods 0.000 description 3
- 238000009396 hybridization Methods 0.000 description 3
- 239000003112 inhibitor Substances 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 230000001264 neutralization Effects 0.000 description 3
- PXHVJJICTQNCMI-UHFFFAOYSA-N nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 3
- 238000003752 polymerase chain reaction Methods 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 235000019833 protease Nutrition 0.000 description 3
- 238000002708 random mutagenesis Methods 0.000 description 3
- 230000002829 reduced Effects 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 102220081616 rs2298758 Human genes 0.000 description 3
- 239000011669 selenium Substances 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 239000000243 solution Substances 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 230000001131 transforming Effects 0.000 description 3
- IGROJMCBGRFRGI-YTLHQDLWSA-N (2S)-2-[[(2S)-2-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]propanoyl]amino]propanoic acid Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 2
- YXOLAZRVSSWPPT-UHFFFAOYSA-N 3,5,7,2',4'-Pentahydroxyflavonol Chemical compound OC1=CC(O)=CC=C1C1=C(O)C(=O)C2=C(O)C=C(O)C=C2O1 YXOLAZRVSSWPPT-UHFFFAOYSA-N 0.000 description 2
- 101710006746 7.5K Proteins 0.000 description 2
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 2
- 102000006410 Apoproteins Human genes 0.000 description 2
- 108010083590 Apoproteins Proteins 0.000 description 2
- XNSKSTRGQIPTSE-UHFFFAOYSA-N Arginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCCNC(N)=N XNSKSTRGQIPTSE-UHFFFAOYSA-N 0.000 description 2
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 2
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 2
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 2
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 2
- NALWOULWGHTVDA-UWVGGRQHSA-N Asp-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NALWOULWGHTVDA-UWVGGRQHSA-N 0.000 description 2
- UKGGPJNBONZZCM-WDSKDSINSA-N Aspartyl-L-proline Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 2
- NTQDELBZOMWXRS-UHFFFAOYSA-N Aspartyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(O)=O NTQDELBZOMWXRS-UHFFFAOYSA-N 0.000 description 2
- UUQMNUMQCIQDMZ-UHFFFAOYSA-N Betahistine Chemical compound CNCCC1=CC=CC=N1 UUQMNUMQCIQDMZ-UHFFFAOYSA-N 0.000 description 2
- 210000002459 Blastocyst Anatomy 0.000 description 2
- 210000001109 Blastomeres Anatomy 0.000 description 2
- 210000000170 Cell Membrane Anatomy 0.000 description 2
- XZFYRXDAULDNFX-UHFFFAOYSA-N Cysteinyl-Phenylalanine Chemical compound SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 2
- WYVKPHCYMTWUCW-UHFFFAOYSA-N Cysteinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CS WYVKPHCYMTWUCW-UHFFFAOYSA-N 0.000 description 2
- 102000033147 ERVK-25 Human genes 0.000 description 2
- VLCYCQAOQCDTCN-UHFFFAOYSA-N Eflornithine Chemical compound NCCCC(N)(C(F)F)C(O)=O VLCYCQAOQCDTCN-UHFFFAOYSA-N 0.000 description 2
- 102200018617 FHIT Y145F Human genes 0.000 description 2
- 102200079195 GFRA1 K79R Human genes 0.000 description 2
- 102200079194 GFRA1 Q80R Human genes 0.000 description 2
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 2
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 2
- 229960002989 Glutamic Acid Drugs 0.000 description 2
- XIPZDANNDPMZGQ-UHFFFAOYSA-N Glutaminyl-Cysteine Chemical compound NC(=O)CCC(N)C(=O)NC(CS)C(O)=O XIPZDANNDPMZGQ-UHFFFAOYSA-N 0.000 description 2
- ARPVSMCNIDAQBO-UHFFFAOYSA-N Glutaminyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CCC(N)=O ARPVSMCNIDAQBO-UHFFFAOYSA-N 0.000 description 2
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 2
- 208000009889 Herpes Simplex Diseases 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- MDCTVRUPVLZSPG-BQBZGAKWSA-N His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 MDCTVRUPVLZSPG-BQBZGAKWSA-N 0.000 description 2
- WSDOHRLQDGAOGU-UHFFFAOYSA-N Histidinyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WSDOHRLQDGAOGU-UHFFFAOYSA-N 0.000 description 2
- 229920002459 Intron Polymers 0.000 description 2
Abstract
Engineered fluorescent proteins, nucleic acids encoding them and methods of use.
Description
LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS
Background of the Invention
This application claims the benefit of the earlier filing date of United States provisional patent application serial number 60/024,050, filed on August 16, 1996, entitled "Long Wavelength Mutant Fluorescent Proteins", and of patent application serial number 08/706,408, filed on August 30, 1996, entitled "Long Wavelength Engineered Fluorescent Proteins", both incorporated herein by reference. This invention was made in part with United States Government support under grant number MCB 9418479, awarded by the National Science Foundation. The United States Government may have rights in this invention.
Fluorescent molecules are attractive as reporter molecules in many assay systems because of their high sensitivity and ease of quantification. Recently, fluorescent proteins have been the focus of much attention because they can be produced in vivo by biological systems, and can be used to trace intracellular events without the need to be introduced into the cell through microinjection or permeabilization. The green fluorescent protein of Aequorea victoria is particularly interesting as a fluorescent protein. A cDNA for the protein has been cloned (D.C. Prasher et al., "Primary structure of the Aequorea victoria green-fluorescent protein", Gene (1992) 111: 229-33). Not only can the primary amino acid sequence of the protein be expressed from the cDNA, but the expressed protein can fluoresce. This indicates that the protein can undergo the cyclization and oxidation that are believed to be necessary for fluorescence.
The green fluorescent protein ("GFP") of Aequorea victoria is a stable, proteolysis-resistant single chain of 238 residues, and has two absorption maxima at around 395 and 475 nm. The relative amplitudes of these two peaks are sensitive to environmental factors (W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press pp. 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward et al., Photochem. Photobiol. 35: 803-808 (1982)) and to illumination history (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995)), presumably reflecting two or more ground states. Excitation at the primary absorption peak of 395 nm yields an emission maximum at 508 nm with a quantum yield of 0.72-0.85 (O. Shimomura and F.H. Johnson, J. Cell. Comp. Physiol. 59: 223 (1962); J.G. Morin and J.W. Hastings, J. Cell. Physiol. 77: 313 (1971); H. Morise et al., Biochemistry 13: 2656 (1974); W.W. Ward, Photochem. Photobiol. Reviews (K.C. Smith, ed.) 4: 1 (1979); A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995); D.C. Prasher, Trends Genet. 11: 320-323 (1995); M. Chalfie, Photochem. Photobiol. 62: 651-656 (1995); W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press pp. 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward et al., Photochem. Photobiol. 35: 803-808 (1982)). The fluorophore results from the autocatalytic cyclization of the polypeptide backbone between residues Ser65 and Gly67 and the oxidation of the α-β bond of Tyr66 (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995); C.W. Cody et al., Biochemistry 32: 1212-1218 (1993); R. Heim et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994)). Mutation of Ser65 to Thr (S65T) simplifies the excitation spectrum to a single peak at 488 nm of enhanced amplitude (R. Heim et al., Nature 373: 664-665 (1995)), which no longer shows signs of conformational isomers (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995)).
Fluorescent proteins have been used as markers of gene expression, tracers of cell lineage, and fusion tags to monitor protein localization within living cells (M. Chalfie et al., "Green fluorescent protein as a marker for gene expression", Science 263: 802-805; A.B. Cubitt et al., "Understanding, improving and using green fluorescent proteins", TIBS 20, November 1995, pages 448-455; U.S. Patent No. 5,491,084, M. Chalfie and D. Prasher). Furthermore, engineered versions of Aequorea green fluorescent protein have been identified that exhibit altered fluorescence characteristics, including altered excitation and emission maxima, as well as excitation and emission spectra of different shapes (R. Heim et al., "Wavelength mutations and post-translational autoxidation of green fluorescent protein", Proc. Natl. Acad. Sci. USA (1994) 91: 12501-04; R. Heim et al., "Improved green fluorescence", Nature (1995) 373: 663-665). These properties add variety and utility to the arsenal of biologically based fluorescent indicators. There is a need for engineered fluorescent proteins with different fluorescent properties.
Brief Description of the Drawings
Figures 1A-1B. (A) Schematic drawing of the backbone of the green fluorescent protein produced by Molscript
(J.P. Kraulis, J. Appl. Cryst., 24: 946 (1991)). The chromophore is shown as a ball-and-stick model. (B) Schematic drawing of the overall fold of the green fluorescent protein. Approximate residue numbers mark the beginning and the end of the secondary structure elements.
Figures 2A-2C. (A) Stereo drawing of the chromophore and the residues in its immediate vicinity. Carbon atoms are drawn as open circles, oxygen atoms are filled, and nitrogen atoms are shaded. Solvent molecules are shown as isolated filled circles. (B) Portion of the final 2Fo-Fc electron density map, contoured at 1.0 σ, showing the electron density surrounding the chromophore. (C) Schematic diagram showing the first and second coordination spheres of the chromophore. Hydrogen bonds are shown as dashed lines, and have the indicated lengths in Å. Inset: proposed structure of the carbinolamine intermediate that presumably forms during generation of the chromophore.
Figure 3 illustrates the nucleotide sequence (SEQ ID NO: 1) and the deduced amino acid sequence (SEQ ID NO: 2) of an Aequorea green fluorescent protein.
Figure 4 illustrates the nucleotide sequence (SEQ ID NO: 3) and the deduced amino acid sequence (SEQ ID NO: 4) of the engineered Aequorea-related fluorescent protein S65G/S72A/T203Y using preferred mammalian codons and an optimal Kozak sequence.
Figures 5-1 to 5-28 present the coordinates for the crystal structure of the Aequorea-related green fluorescent protein S65T.
Figure 6 shows the fluorescence excitation and emission spectra for the engineered fluorescent proteins 20A and 10C (Table F). The vertical line at 528 nm compares the emission maxima of 10C, to the left of the line, and 20A, to the right of the line.
SUMMARY OF THE INVENTION
This invention provides functional engineered fluorescent proteins with varied fluorescence characteristics that can be readily distinguished from the currently existing green and blue fluorescent proteins.
These engineered fluorescent proteins enable the simultaneous measurement of two or more processes within cells, and can be used as fluorescence energy donors or acceptors when they are used to monitor protein-protein interactions through FRET. Longer-wavelength engineered fluorescent proteins are particularly useful because the photodynamic toxicity and autofluorescence of cells are significantly reduced at longer wavelengths. In particular, the introduction of the substitution T203X, where X is an aromatic amino acid, results in an increase in the excitation and emission wavelength maxima of Aequorea-related fluorescent proteins.
In one aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from Aequorea green fluorescent protein.
In one aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein
(SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least a substitution at T203 and, in particular, T203X, wherein X is an aromatic amino acid selected from H, Y, W, or F, the functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein. In one embodiment, the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I. In another embodiment, the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S65G/V68L/Q69K/S72A/T203Y; S72A/S65G/V68L/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W. In another embodiment, the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F, and Y66W. In another embodiment, the amino acid sequence further comprises a mutation from Table A. In another embodiment, the amino acid sequence further comprises a folding mutation. In another embodiment, the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the replacement of at least one codon with a preferred mammalian codon. In another embodiment, the nucleic acid molecule encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
In another aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, the functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein. In one embodiment, the amino acid substitution is:
L42X, wherein X is selected from C, F, H, W and Y;
V61X, wherein X is selected from F, Y, H and C;
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C;
V68X, wherein X is selected from F, Y and H;
Q69X, wherein X is selected from K, R, E and G;
Q94X, wherein X is selected from D, E, H, K and N;
N121X, wherein X is selected from F, H, W and Y;
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q;
H148X, wherein X is selected from F, Y, N, K, Q and R;
V150X, wherein X is selected from F, Y and H;
F165X, wherein X is selected from H, Q, W and Y;
I167X, wherein X is selected from F, Y and H;
Q183X, wherein X is selected from H, Y, E and K;
N185X, wherein X is selected from D, E, H, K and Q;
L220X, wherein X is selected from H, N, Q and T;
E222X, wherein X is selected from N and Q; or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.
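The substitution shorthand used above (for example S65T, meaning Ser at position 65 replaced by Thr, with multiple substitutions joined by "/") can be handled programmatically. The following is a minimal sketch only, not part of the disclosure: the names `Substitution`, `parse_substitutions` and `apply_substitutions` are hypothetical, and the short test sequence is a toy string, not SEQ ID NO: 2.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    """One point substitution, e.g. T203Y -> ('T', 203, 'Y')."""
    wild_type: str
    position: int  # 1-based residue number, as in the text
    mutant: str


# One-letter residue, residue number, one-letter residue.
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(spec: str) -> List[Substitution]:
    """Parse a string such as 'S65G / S72A / T203Y' into substitutions."""
    result = []
    for token in spec.split("/"):
        token = token.strip()
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        result.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return result


def apply_substitutions(sequence: str, subs: List[Substitution]) -> str:
    """Return a copy of `sequence` with every substitution applied.

    Raises ValueError if a position is out of range or the residue found
    at that position is not the stated wild-type residue.
    """
    residues = list(sequence)
    for sub in subs:
        index = sub.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"expected {sub.wild_type} at {sub.position}, "
                f"found {residues[index]}"
            )
        residues[index] = sub.mutant
    return "".join(residues)
```

Checking the wild-type residue before replacing it guards against numbering that is offset from the reference sequence, a common source of error when mutants are described relative to SEQ ID NO: 2.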
In another aspect, this invention provides an expression vector comprising expression control sequences operably linked to any of the aforementioned nucleic acid molecules. In a further aspect, this invention provides a recombinant host cell comprising the aforementioned expression vector.
In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from Aequorea green fluorescent protein.
In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least an amino acid substitution at T203 and, in particular, T203X, wherein X is an aromatic amino acid selected from H, Y, W, or F, the functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein. In one embodiment, the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I. In another embodiment, the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
In another embodiment, the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F, and Y66W. In another embodiment, the amino acid sequence further comprises a folding mutation. In another embodiment, the engineered fluorescent protein is part of a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222, or V224, the functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein.
In another aspect, this invention provides a fluorescently labeled antibody comprising an antibody coupled to any of the aforementioned functional engineered fluorescent proteins. In one embodiment, the fluorescently labeled antibody is a fusion protein, wherein the fusion protein comprises the antibody fused to the functional engineered fluorescent protein.
In another aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent protein of this invention.
In another aspect, this invention provides a fluorescently labeled nucleic acid probe comprising a nucleic acid probe coupled to a functional engineered fluorescent protein of this invention. The fusion can be through a linker peptide.
In another aspect, this invention provides a method for determining whether a mixture contains a target, comprising contacting the mixture with a fluorescently labeled probe comprising a probe and a functional engineered fluorescent protein of this invention, and determining whether the target has bound to the probe. In one embodiment, the target molecule is captured on a solid matrix.
In another aspect, this invention provides a method for engineering a functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein, comprising substituting an amino acid that is located no more than 0.5 nm from any atom in the chromophore of an Aequorea-related green fluorescent protein with another amino acid, whereby the substitution alters a fluorescent property of the protein. In one embodiment, the amino acid substitution alters the electronic environment of the chromophore.
In another aspect, this invention provides a method for engineering a functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein, comprising substituting amino acids in a loop domain of an Aequorea-related green fluorescent protein with amino acids so as to create a consensus sequence for phosphorylation or for proteolysis.
In another aspect, this invention provides a method for producing fluorescence resonance energy transfer, comprising providing a donor molecule comprising a functional engineered fluorescent protein of this invention; providing an appropriate acceptor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer.
In another aspect, this invention provides a method for producing fluorescence resonance energy transfer, comprising providing an acceptor molecule comprising a functional engineered fluorescent protein of this invention; providing an appropriate donor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer. In one embodiment, the donor molecule is an engineered fluorescent protein whose amino acid sequence comprises the substitution T203I, and the acceptor molecule is an engineered fluorescent protein whose amino acid sequence comprises the substitution T203X, wherein X is an aromatic amino acid selected from H, Y, W, or F, the functional engineered fluorescent protein having a fluorescent property different from Aequorea green fluorescent protein.
In another aspect, this invention provides a crystal of a protein comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2, wherein the crystal diffracts with at least a 2.0 to 3.0 Ångstrom resolution.
In another embodiment, this invention provides a computational method of designing a fluorescent protein, comprising determining, from a three-dimensional model of a crystallized fluorescent protein comprising a fluorescent protein with a bound ligand, at least one interacting amino acid of the fluorescent protein that interacts with at least one first chemical moiety of the ligand, and selecting at least one chemical modification of the first chemical moiety to produce a second chemical moiety with a structure that either decreases or increases an interaction between the interacting amino acid and the second chemical moiety, compared with the interaction between the interacting amino acid and the first chemical moiety.
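The distance criterion used in several of the aspects above (a substitution located no more than about 0.5 nm from any atom of the chromophore) amounts to a simple geometric test on atomic coordinates. The following is a minimal sketch, assuming coordinates are given in Ångstroms as is usual for crystal structures (0.5 nm = 5 Å). The function name, the input layout, and the toy coordinates and residue labels are hypothetical and are not taken from Figures 5-1 to 5-28.

```python
import math
from typing import Iterable, List, Sequence, Set, Tuple

Atom = Tuple[str, Tuple[float, float, float]]  # (residue label, xyz in Å)

CUTOFF_NM = 0.5          # distance criterion stated in the text
ANGSTROM_PER_NM = 10.0   # unit conversion


def residues_near_chromophore(
    atoms: Sequence[Atom],
    chromophore_labels: Iterable[str],
    cutoff_nm: float = CUTOFF_NM,
) -> List[str]:
    """Return labels of residues with any atom within the cutoff of any
    chromophore atom. Chromophore residues themselves are excluded."""
    chromophore: Set[str] = set(chromophore_labels)
    cutoff = cutoff_nm * ANGSTROM_PER_NM
    chromophore_xyz = [xyz for label, xyz in atoms if label in chromophore]
    near: Set[str] = set()
    for label, xyz in atoms:
        if label in chromophore:
            continue
        if any(math.dist(xyz, c) <= cutoff for c in chromophore_xyz):
            near.add(label)
    return sorted(near)


# Toy, made-up coordinates for illustration only.
TOY_ATOMS: List[Atom] = [
    ("CRO66", (0.0, 0.0, 0.0)),
    ("T203", (3.0, 0.0, 0.0)),
    ("H148", (0.0, 3.0, 3.0)),
    ("A1", (20.0, 0.0, 0.0)),
]
```

With real data, each atom record from the coordinate listing would contribute one entry, so a residue is reported as soon as any one of its atoms falls inside the cutoff.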
In another embodiment, this invention provides a computational method for modeling the three-dimensional structure of a fluorescent protein, comprising determining a three-dimensional relationship between at least two atoms listed in the atomic coordinates of Figures 5-1 through 5-28.
In another embodiment, this invention provides a device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 through 5-28. In one embodiment, the storage device is a computer-readable device that stores code that receives the atomic coordinates as input. In another embodiment, the computer-readable device is a floppy disk or a hard disk.
Detailed Description of the Invention
I. DEFINITIONS
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described. For the purposes of the present invention, the following terms are defined below.
"Binding pair" refers to two moieties (for example, chemical or biochemical) that have an affinity for one another. Examples of binding pairs include antigens/antibodies, lectin/avidin, target polynucleotide/probe oligonucleotide, antibody/anti-antibody, receptor/ligand, enzyme/ligand and the like. "One member of a binding pair" refers to one moiety of the pair, such as an antigen or ligand.
"Nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form and, unless otherwise limited, encompasses known analogs of natural nucleotides that can function in a manner similar to naturally occurring nucleotides.
It will be understood that when a nucleic acid molecule is represented by a DNA sequence, this also includes RNA molecules having the corresponding RNA sequence in which "U" replaces "T".
"Recombinant nucleic acid molecule" refers to a nucleic acid molecule that is not naturally occurring, and which comprises two nucleotide sequences that are not naturally joined together. Recombinant nucleic acid molecules are produced by artificial recombination, for example, genetic engineering techniques or chemical synthesis.
Reference to a nucleotide sequence "encoding" a polypeptide means that the sequence, upon transcription and translation of the mRNA, produces the polypeptide. This includes both the coding strand, whose nucleotide sequence is identical to the mRNA and whose sequence is usually provided in the sequence listing, as well as its complementary strand, which is used as the template for transcription. As any person skilled in the art recognizes, this also includes all degenerate nucleotide sequences encoding the same amino acid sequence. Nucleotide sequences encoding a polypeptide include sequences containing introns.
"Expression control sequences" refers to nucleotide sequences that regulate the expression of a nucleotide sequence to which they are operably linked. Expression control sequences are "operably linked" to a nucleotide sequence when the expression control sequences control and regulate the transcription and, as appropriate, translation of the nucleotide sequence. Thus, expression control sequences can include appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signals for introns, maintenance of the correct reading frame of that gene to permit proper translation of the mRNA, and stop codons.
As used herein, "naturally occurring", as applied to an object, refers to the fact that an object can be found in nature.
For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature, and which has not been intentionally modified by man in the laboratory, is naturally occurring.
"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences, such as when the appropriate molecules (e.g., inducers and polymerases) are bound to the control or regulatory sequence(s).
"Control sequence" refers to polynucleotide sequences that are necessary to effect the expression of coding and non-coding sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include a promoter, ribosomal binding site, and transcription termination sequence; in eukaryotes, such control sequences generally include promoters and a transcription termination sequence. The term "control sequences" is intended to include, at a minimum, components whose presence can influence expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.
"Isolated polynucleotide" refers to a polynucleotide of genomic, cDNA, or synthetic origin, or some combination thereof, which by virtue of its origin the "isolated polynucleotide" (1) is not associated with the cell in which the "isolated polynucleotide" is found in nature, or (2) is operably linked to a polynucleotide to which it is not linked in nature.
"Polynucleotide" refers to a polymeric form of nucleotides of at least 10 bases in length, either ribonucleotides or deoxyribonucleotides, or a modified form of either type of nucleotide. The term includes single- and double-stranded forms of DNA.
The term "probe" refers to a substance that specifically binds to another substance (a "target"). Probes include, for example, antibodies, nucleic acids, receptors and their ligands.
"Modulation" refers to the capacity to either enhance or inhibit a functional property of a biological activity or process (e.g., enzyme activity or receptor binding); such enhancement or inhibition may be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway, and/or may be manifest only in particular cell types.
The term "modulator" refers to a chemical (naturally occurring or non-naturally occurring), or an extract made from biological materials such as cells or tissues of bacteria, plants, fungi, or animals (particularly mammals). Modulators can be evaluated for potential activity as inhibitors or activators (directly or indirectly) of a biological process or processes (e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, antineoplastic agents, cytotoxic agents, inhibitors of neoplastic transformation or cell proliferation, cell proliferation-promoting agents, and the like) by inclusion in the screening assays described herein. The activity of a modulator may be known, unknown or partially known.
The term "test chemical" refers to a chemical to be tested by one or more screening methods of the invention as a putative modulator. A test chemical is usually not known to bind to the target of interest. The term "control test chemical" refers to a chemical known to bind to the target (e.g., a known agonist, antagonist, partial agonist, or inverse agonist). Usually, various predetermined concentrations of test chemicals are used for screening, such as 0.01 μM, 0.1 μM, 1.0 μM, and 10.0 μM.
The term "target" refers to a biochemical entity involved in a biological process. Targets are typically proteins that play a useful role in the physiology or biology of an organism. A therapeutic chemical binds to the target to alter or modulate its function. As used herein, targets can include cell surface receptors, G proteins, kinases, ion channels, phospholipases and other proteins mentioned herein.
The term "label" refers to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include 32P, fluorescent dyes, fluorescent proteins, electron-dense reagents, enzymes (e.g., as commonly used in an enzyme-linked immunosorbent assay), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available.
For example, the polypeptides of this invention can be made into detectable labels, for example by incorporating them into a polypeptide, and can be used to label antibodies specifically reactive with the polypeptide. A label often generates a measurable signal, such as radioactivity, fluorescent light or enzyme activity, which can be used to quantify the amount of bound label.
The term "nucleic acid probe" refers to a nucleic acid molecule that binds to a specific sequence or subsequence of another nucleic acid molecule. A probe is preferably a nucleic acid molecule that binds through complementary base pairing to the full sequence or to a subsequence of a target nucleic acid. It will be understood that probes may bind target sequences that lack complete complementarity with the probe sequence, depending upon the stringency of the hybridization conditions. Probes are preferably directly labeled, as with isotopes, chromophores, luminophores, chromogens, or fluorescent proteins, or indirectly labeled, such as with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the selected sequence or subsequence.
A "labeled nucleic acid probe" is a nucleic acid probe that is bound, either covalently, through a linker, or through ionic, van der Waals or hydrogen bonds, to a label, such that the presence of the probe may be detected by detecting the presence of the label bound to the probe.
The terms "polypeptide" and "protein" refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
The term "recombinant protein" refers to a protein that is produced by expression of a nucleotide sequence encoding the amino acid sequence of the protein from a recombinant DNA molecule.
The term "recombinant host cell" refers to a cell that comprises a recombinant nucleic acid molecule. Thus, for example, recombinant host cells can express genes that are not found within the native (non-recombinant) form of the cell.
The terms "isolated", "purified" or "biologically pure" refer to material that is substantially or essentially free from the components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein or nucleic acid molecule that is the predominant protein or nucleic acid species present in a preparation is substantially purified. Generally, an isolated protein or nucleic acid molecule will comprise more than 80 percent of all macromolecular species present in the preparation. Preferably, the protein is purified to represent more than 90 percent of all macromolecular species present. More preferably, the protein is purified to more than 95 percent, and most preferably the protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional techniques.
The term "naturally occurring", as applied to an object, refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature, and which has not been intentionally modified by man in the laboratory, is naturally occurring.
The term "antibody" refers to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof, which specifically binds to and recognizes an analyte (antigen).
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Antibodies exist, for example, as intact immunoglobulins, or as a number of well-characterized fragments produced by digestion with various peptidases. This includes, for example, Fab' and F(ab)'2 fragments. The term "antibody", as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or synthesized de novo using recombinant DNA methodologies.
The term "immunoassay" refers to an assay that uses an antibody to specifically bind an analyte. The immunoassay is characterized by the use of the specific binding properties of a particular antibody to isolate, target, and/or quantify the analyte.
The term "identical", in the context of two nucleic acid or polypeptide sequences, refers to the residues in the two sequences that are the same when aligned for maximum correspondence. When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are replaced by other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage of sequence identity.
Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, for example, according to a known algorithm. See, for example, Meyers and Miller, Computer Applic. Biol. Sci., 4: 11-17 (1988); Smith and Waterman (1981) Adv. Appl. Math. 2: 482; Needleman and Wunsch (1970) J. Mol. Biol. 48: 443; Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85: 2444; Higgins and Sharp (1988) Gene, 73: 237-244 and Higgins and Sharp (1989) CABIOS 5: 151-153; Corpet, et al. (1988) Nucleic Acids Research 16, 10881-90; Huang, et al. (1992) Computer Applications in the Biosciences 8, 155-65, and Pearson, et al. (1994) Methods in Molecular Biology 24, 307-31. Alignment is also often performed by inspection and manual alignment.
"Conservatively modified variations" of a particular nucleic acid sequence refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or, where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For example, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations", which are one species of "conservatively modified variations". Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation. One of skill will recognize that each codon can be modified in a nucleic acid
(except AUG, which is ordinarily the only codon for methionine) to yield a functionally identical molecule by standard techniques. Accordingly, each "silent variation" of a nucleic acid that encodes a polypeptide is implicit in each described sequence. Furthermore, one of skill will recognize that individual substitutions, deletions, or additions that alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5 percent, more typically less than 1 percent) in an encoded sequence are "conservatively modified variations" where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative amino acid substitutions providing functionally similar amino acids are well known in the art. Each of the following six groups contains amino acids that are conservative substitutions for one another:
1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
The term "complementary" means that one nucleic acid molecule has the sequence of the binding partner of another nucleic acid molecule. Thus, the sequence 5'-ATGC-3' is complementary to the sequence 5'-GCAT-3'.
An amino acid sequence or a nucleotide sequence is "substantially identical" or "substantially similar" to a reference sequence if the amino acid sequence or nucleotide sequence has at least 80 percent sequence identity with the reference sequence over a given comparison window. Thus, substantially similar sequences include those having, for example, at least 85 percent sequence identity, at least 90 percent sequence identity, at least 95 percent sequence identity, or at least 99 percent sequence identity.
Of course, two sequences that are identical to one another are also substantially identical.
A subject nucleotide sequence is "substantially complementary" to a reference nucleotide sequence if the complement of the subject nucleotide sequence is substantially identical to the reference nucleotide sequence.
The term "stringent conditions" refers to the temperature and ionic conditions used in nucleic acid hybridization. Stringent conditions are sequence dependent and are different under different environmental parameters. Generally, stringent conditions are selected to be about 5 °C to 20 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50 percent of the target sequence hybridizes to a perfectly matched probe.
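The identity scoring described above (a score of 1 for an identical residue, zero for a non-conservative substitution, and a value between zero and 1 for a conservative substitution) together with the 80 percent threshold for "substantially identical" can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of any of the cited references: the two sequences are assumed to be already aligned and of equal length, the partial score of 0.5 is an arbitrary choice within the stated range, and the function names are hypothetical. The six conservative groups are those listed in the text.

```python
from typing import List, Set

# The six conservative substitution groups listed in the text.
CONSERVATIVE_GROUPS: List[Set[str]] = [
    set("AST"),
    set("DE"),
    set("NQ"),
    set("RK"),
    set("ILMV"),
    set("FYW"),
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b differ but belong to the same group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity(seq1: str, seq2: str, conservative_score: float = 0.0) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    conservative_score is the partial credit (0 to 1) given to a
    conservative substitution; 0.0 gives plain percent identity.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    if not 0.0 <= conservative_score <= 1.0:
        raise ValueError("conservative_score must lie between 0 and 1")
    total = 0.0
    for a, b in zip(seq1, seq2):
        if a == b:
            total += 1.0
        elif is_conservative(a, b):
            total += conservative_score
    return 100.0 * total / len(seq1)


def is_substantially_identical(seq1: str, seq2: str, threshold: float = 80.0) -> bool:
    """Apply the at-least-80-percent identity criterion from the text."""
    return percent_identity(seq1, seq2) >= threshold
```

For sequences that are not already aligned, an alignment step (for example one of the cited alignment algorithms) would have to come first; this sketch deliberately leaves that out.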
The term "allelic variants" refers to polymorphic forms of a gene at a particular genetic locus, as well as to cDNAs derived from mRNA transcripts of the genes and the polypeptides encoded by them.
The term "preferred mammalian codon" refers to the subset of codons, from among the set of codons encoding an amino acid, that are most frequently used in proteins expressed in mammalian cells, as chosen from the following list:
Amino acid: preferred codons for high-level mammalian expression
Gly: GGC, GGG
Glu: GAG
Asp: GAC
Val: GUG, GUC
Ala: GCC, GCU
Ser: AGC, UCC
Lys: AAG
Asn: AAC
Met: AUG
Ile: AUC
Thr: ACC
Trp: UGG
Cys: UGC
Tyr: UAU, UAC
Leu: CUG
Phe: UUC
Arg: CGC, AGG, AGA
Gln: CAG
His: CAC
Pro: CCC
Fluorescent molecules are useful in fluorescence resonance energy transfer ("FRET"). FRET involves a donor molecule and an acceptor molecule. To optimize the efficiency and detectability of FRET between a donor and an acceptor molecule, several factors need to be balanced. The emission spectrum of the donor should overlap as much as possible with the excitation spectrum of the acceptor to maximize the overlap integral. In addition, the quantum yield of the donor moiety and the extinction coefficient of the acceptor should likewise be as high as possible to maximize R0, the distance at which the energy transfer efficiency is 50 percent. However, the excitation spectra of the donor and the acceptor should overlap as little as possible, so that a wavelength region can be found at which the donor can be excited efficiently without directly exciting the acceptor. Fluorescence arising from direct excitation of the acceptor is difficult to distinguish from fluorescence arising from FRET. Similarly, the emission spectra of the donor and the acceptor should overlap as little as possible, so that the two emissions can be clearly distinguished.
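To make the role of R0 concrete, the following Python sketch computes transfer efficiency as a function of donor-acceptor distance. The text defines R0 only as the distance giving 50 percent efficiency; the inverse sixth-power dependence used here is the standard Förster relation, E = 1 / (1 + (r/R0)^6), introduced as an assumption because the specification does not write it out. The function name and the example distances are hypothetical.

```python
# Illustrative sketch only; assumes the standard Forster distance dependence.

def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Transfer efficiency at donor-acceptor distance r, given R0 (same units)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)
```

At r equal to R0 the function returns 0.5, matching the definition of R0 above; efficiency rises steeply toward 1 as the pair moves closer and falls toward 0 as it moves apart.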
A high fluorescence quantum yield of the acceptor moiety is desirable if the emission from the acceptor is to be measured either as the sole readout or as part of an emission ratio. One factor to consider in choosing the donor and acceptor pair is the efficiency of FRET between them. Preferably, the efficiency of FRET between the donor and the acceptor is at least 10 percent, more preferably at least 50 percent, and even more preferably at least 80 percent.
The term "fluorescent property" refers to the molar extinction coefficient at an appropriate excitation wavelength, the fluorescence quantum efficiency, the shape of the excitation spectrum or emission spectrum, the excitation wavelength maximum and emission wavelength maximum, the ratio of excitation amplitudes at two different wavelengths, the ratio of emission amplitudes at two different wavelengths, the excited state lifetime, or the fluorescence anisotropy. A measurable difference in any one of these properties between wild-type Aequorea green fluorescent protein and the mutant form is useful. A measurable difference can be determined by determining the amount of any quantitative fluorescent property, for example, the amount of fluorescence at a particular wavelength, or the integral of fluorescence over the emission spectrum. Determining ratios of excitation amplitude or emission amplitude at two different wavelengths ("excitation amplitude ratioing" and "emission amplitude ratioing", respectively) is particularly advantageous because the ratioing process provides an internal reference and cancels out variations in the brightness of the excitation source, the sensitivity of the detector, and light scattering or quenching by the sample.
II. LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS
A. Fluorescent Proteins
As used herein, the term "fluorescent protein" refers to any protein capable of fluorescence when excited by the appropriate electromagnetic ... This includes fluorescent proteins whose amino acid sequences are either naturally occurring or engineered (i.e., analogs or mutants). Many cnidarians use green fluorescent proteins ("GFPs") as energy-transfer acceptors in bioluminescence. A "green fluorescent protein", as used herein, is a protein that fluoresces green light. Similarly, "blue fluorescent proteins" fluoresce blue light and "red fluorescent proteins" fluoresce red light. Green fluorescent proteins have been isolated from the Pacific Northwest jellyfish, Aequorea victoria, the sea pansy, Renilla reniformis, and Phialidium gregarium. W.W. Ward et al., Photochem. Photobiol., 35: 803-808 (1982); L.D. Levine et al., Comp. Biochem. Physiol., 72B: 77-85 (1982).
A variety of Aequorea-related fluorescent proteins having useful excitation and emission spectra have been engineered by modifying the amino acid sequence of a naturally occurring green fluorescent protein from Aequorea victoria. (D.C. Prasher et al., Gene, 111: 229-233 (1992); R. Heim et al., Proc. Natl. Acad. Sci. USA, 91: 12501-04 (1994); U.S. patent application 08/337,915, filed on November 10, 1994; International application PCT/US95/14692, filed on 10/11/95.) As used herein, a fluorescent protein is an "Aequorea-related fluorescent protein" if any contiguous sequence of 150 amino acids of the fluorescent protein has at least 85 percent sequence identity with an amino acid sequence, either contiguous or non-contiguous, from the 238 amino acid wild-type Aequorea green fluorescent protein of Figure 3 (SEQ ID NO: 2).
More preferably, a fluorescent protein is an Aequorea-related fluorescent protein if any contiguous sequence of 200 amino acids of the fluorescent protein has at least 95 percent sequence identity with an amino acid sequence, either contiguous or non-contiguous, from the Aequorea green fluorescent protein of Figure 3 (SEQ ID NO: 2). Similarly, the fluorescent protein may be related to Renilla or Phialidium wild-type fluorescent proteins using the same standards.
Aequorea-related fluorescent proteins include, for example and without limitation, wild-type (native) Aequorea victoria green fluorescent protein (D.C. Prasher et al., "Primary structure of the Aequorea victoria green fluorescent protein", Gene, (1992) 111: 229-33), whose nucleotide sequence (SEQ ID NO: 1) and deduced amino acid sequence (SEQ ID NO: 2) are presented in Figure 3; allelic variants of this sequence, e.g., Q80R, which has the glutamine residue at position 80 substituted with arginine (M. Chalfie et al., Science, (1994) 263: 802-805); the engineered Aequorea-related fluorescent proteins described herein, e.g., in Table A or Table F; variants that include one or more folding mutations; and fragments of these proteins that are fluorescent, such as Aequorea green fluorescent protein from which the two amino-terminal amino acids have been removed. Many of these contain different aromatic amino acids within the central chromophore and fluoresce at a distinctly shorter wavelength than the wild-type species. For example, engineered proteins P4 and P4-3 contain (in addition to other mutations) the substitution Y66H, whereas W2 and W7 contain (in addition to other mutations) Y66W. Other mutations, both close to the chromophore region of the protein and remote from it in the primary sequence, may affect the spectral properties of green fluorescent protein and are listed in the first part of the following table.
TABLE A
Clone | Mutation(s) | Excitation max. (nm) | Emission max. (nm) | Extinction coefficient (M^-1 cm^-1) | Quantum yield
Wild type | None | 395 (475) | 508 | 21,000 (7,150) | 0.77
P4 | Y66H | 383 | 447 | 13,500 | 0.21
P4-3 | Y66H, Y145F | 381 | 445 | 14,000 | 0.38
W7 | Y66W, N146I, M153T, V163A, N212K | 433 (453) | 475 (501) | 18,000 (17,100) | 0.67
W2 | Y66W, I123V, Y145H, H148R, M153T, V163A, N212K | 432 (453) | 480 | 10,000 (9,600) | 0.72
S65T | S65T | 489 | 511 | 39,200 | 0.68
P4-1 | S65T, M153A, K238E | 504 (396) | 514 | 14,500 (8,600) | 0.53
S65A | S65A | 471 | 504
S65C | S65C | 479 | 507
S65L | S65L | 484 | 510
Y66F | Y66F | 360 | 442
Y66W | Y66W | 458 | 480
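The mutation labels in Table A follow the convention of original residue, position, and replacement residue (for example, Y66H replaces tyrosine 66 with histidine). By way of illustration only, the Python sketch below parses such labels and applies them to a sequence. The function names are hypothetical, and the short sequence in the usage note is a made-up placeholder rather than the sequence of SEQ ID NO: 2.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.
import re

def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'Y66H' into (original, 1-based position, new)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def apply_substitutions(sequence: str, labels: list[str]) -> str:
    """Apply each substitution, checking the original residue first."""
    residues = list(sequence)
    for label in labels:
        original, position, new = parse_substitution(label)
        if position > len(residues) or residues[position - 1] != original:
            raise ValueError(f"{label}: residue {position} is not {original}")
        residues[position - 1] = new
    return "".join(residues)
```

For example, applying `["K3R", "L7A"]` to the placeholder sequence `"MSKGEELF"` yields `"MSRGEEAF"`; a label whose original residue does not match the sequence is rejected rather than silently applied.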
Additional mutations in Aequorea-related fluorescent proteins, referred to as "folding mutations", improve the ability of fluorescent proteins to fold at higher temperatures and to be more fluorescent when expressed in mammalian cells, but have little or no effect on the peak wavelengths of excitation and emission. It should be noted that these may be combined with mutations that influence the spectral properties of green fluorescent protein to produce proteins with altered spectral and folding properties. Folding mutations include: F64L, V68L, S72A, and also T44A, F99S, Y145F, N146I, M153T or A, V163A, I167T, S175G, S205T, and N212K.
As used herein, the term "loop domain" refers to an amino acid sequence of an Aequorea-related fluorescent protein that connects the amino acids involved in the secondary structure of the eleven strands of the β-barrel or the central α-helix (residues 56-72) (see Figures 1A and 1B).
As used herein, the "fluorescent protein moiety" of a fluorescent protein is that portion of the amino acid sequence of a fluorescent protein which, when the amino acid sequence of the fluorescent protein substrate is optimally aligned with the amino acid sequence of a naturally occurring fluorescent protein, lies between the amino-terminal and carboxy-terminal amino acids, inclusive, of the amino acid sequence of the naturally occurring fluorescent protein.
It has been found that fluorescent proteins can be genetically fused to other target proteins and used as markers to identify the location and amount of the target protein produced. Accordingly, this invention provides fusion proteins comprising a fluorescent protein moiety and additional amino acid sequences. Such sequences can be, for example, up to about 15, up to about 50, up to about 150, or up to about 1000 amino acids long. The fusion proteins possess the ability to fluoresce when excited by electromagnetic radiation.
In one embodiment, the fusion protein comprises a polyhistidine tag to aid in purification of the protein.
B. Use of the Crystal Structure of Green Fluorescent Protein to Design Mutants Having Altered Fluorescent Characteristics
Using X-ray crystallography and computer processing, we have created a model of the crystal structure of Aequorea green fluorescent protein showing the relative location of the atoms in the molecule. This information is useful in identifying amino acids whose substitution alters the fluorescent properties of the protein.
The fluorescent characteristics of Aequorea-related fluorescent proteins depend, in part, on the electronic environment of the chromophore. In general, amino acids that are within about 0.5 nm of the chromophore influence the electronic environment of the chromophore. Therefore, substitution of such amino acids can produce fluorescent proteins with altered fluorescent characteristics. In the excited state, electron density tends to shift from the phenolate towards the carbonyl end of the chromophore. Therefore, placing increasing positive charge near the carbonyl end of the chromophore tends to decrease the energy of the excited state and to cause a red shift in the absorbance and emission wavelength maxima of the protein. Decreasing the positive charge near the carbonyl end of the chromophore tends to have the opposite effect, causing a blue shift in the protein's wavelengths.
Amino acids with charged (ionized D, E, K, and R), dipolar (H, N, Q, S, T, and uncharged D, E, and K), and polarizable (e.g., C, F, H, M, W, and Y) side groups are useful for altering the electronic environment of the chromophore, especially when they replace an amino acid with an uncharged, nonpolar, or non-polarizable side chain.
In general, amino acids with polarizable side groups alter the electronic environment less and, consequently, are expected to cause a comparatively smaller change in a fluorescent property. Amino acids with charged side groups alter the environment more and, consequently, are expected to cause a comparatively larger change in a fluorescent property. However, amino acids with charged side groups are more likely to disrupt the structure of the protein and to prevent proper folding if they are buried next to the chromophore without additional solvation or salt bridging. Therefore, charged amino acids are most likely to be tolerated, and to give useful effects, when they replace other charged or highly polar amino acids that are already solvated or involved in salt bridges. In certain cases, where substitution with a polarizable amino acid is chosen, the structure of the protein may make the selection of a larger amino acid, e.g., W, less appropriate. Alternatively, positions occupied by amino acids with charged or polar side groups that are unfavorably oriented may be substituted with amino acids having less charged or less polar side groups. In another alternative, an amino acid whose side group has a dipole oriented in one direction in the protein can be substituted with an amino acid having a dipole oriented in a different direction.
More particularly, Table B lists many amino acids located within about 0.5 nm of the chromophore whose substitution can result in altered fluorescent characteristics. The table indicates, underlined, the preferred amino acid substitutions at the indicated location to alter a fluorescent characteristic of the protein. In order to introduce these substitutions, the table also provides codons for the primers used in site-directed mutagenesis involving amplification.
These primers have been selected to encode the preferred amino acids economically, but they also encode other amino acids, as indicated, or even a stop codon, denoted by Z. When introducing substitutions using these degenerate primers, the most efficient strategy is to screen the collection to identify mutants with the desired properties, and then sequence their DNA to find out which of the possible substitutions is responsible. The codons are shown in double-stranded form, with the sense strand given first (upper strand) and the antisense strand second (lower strand). In nucleic acid sequences, R = (A or G); Y = (C or T); M = (A or C); K = (G or T); S = (G or C); W = (A or T); H = (A, T, or C); B = (G, T, or C); V = (G, A, or C); D = (G, A, or T); N = (A, C, G, or T).
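By way of illustration only, the degenerate-base code just defined can be expanded computationally to see which amino acids a given primer codon encodes. The Python sketch below uses the standard genetic code (an assumption; the specification does not name a translation table), denotes a stop codon by Z as in the text, and derives the antisense strand by complementing each position. All function and variable names are hypothetical.

```python
# Illustrative sketch only; assumes the standard genetic code.
from itertools import product

# Degenerate base symbols as defined in the text.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ATC", "B": "GTC", "V": "GAC", "D": "GAT", "N": "ACGT",
}

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Complement of each symbol (e.g. Y = C/T pairs with R = G/A).
_COMPLEMENT = str.maketrans("ACGTRYMKSWHDBVN", "TGCAYRKMSWDHVBN")

def encoded_residues(codon: str) -> str:
    """Sorted one-letter codes encoded by a degenerate codon; Z marks stop."""
    choices = [DEGENERATE[base] for base in codon.upper()]
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    return "".join(sorted(r.replace("*", "Z") for r in residues))

def antisense(codon: str) -> str:
    """Position-by-position complement, i.e. the lower strand read 3' to 5'."""
    return codon.upper().translate(_COMPLEMENT)
```

For example, `encoded_residues("YDS")` gives `"CFHLQRWYZ"` and `antisense("YDS")` gives `"RHS"`, so a single degenerate primer position covers nine outcomes, one of which is a stop codon; this is why screening followed by sequencing is the practical route to identifying the responsible substitution.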
TABLE B
Original position and presumed role | Change to | Codon (sense / antisense)
L42: aliphatic residue near C=N of the chromophore | CFHLQRWYZ | 5' YDS 3'
V61: aliphatic residue near the central -CH= of the chromophore | FYHCLR | YDC / RHG
T62: almost directly above the center of the chromophore bridge | AVFS | KYC / MRG
T62 (continued) | DEHKNQ | VAS / BTS
T62 (continued) | FYHCLR | YDC / RHG
V68: aliphatic residue near the carbonyl and G67 | FYHL | YWC / RWG
N121: near the C-N site of ring closure between T65 and G67 | CFHLQRWYZ | YDS / RHS
Y145: packs near the tyrosine ring of the chromophore | WCFL | TKS / AMS
Y145 (continued) | DEHNKQ | VAS / BTS
H148: H-bonds to the phenolate oxygen | FYNI | WWC / WWG
H148 (continued) | KQR | MRG / KYC
V150: aliphatic residue near the tyrosine ring of the chromophore | FYHL | YWC / RWG
F165: packs near the tyrosine ring | CHQRWYZ | YRS / RYS
I167: aliphatic residue near the phenolate; I167T has effects | FYHL | YWC / RWG
T203: H-bonds to the phenolic oxygen of the chromophore | FHLQRWYZ | YDS / RHS
E222: protonation regulates the ionization of the chromophore | HKNQ | MAS / KTS
Examples of amino acids with polar side groups that can be substituted with polarizable side groups include, for example, those in Table C.
TABLE C
Original position and presumed role | Change to | Codon (sense / antisense)
Q69: terminates the chain of H-bonding waters | KREG | RRG / YYC
Q94: H-bonds to the carbonyl terminus of the chromophore | DEHKNQ | VAS / BTS
Q183: bridges Arg96 and the center of the chromophore bridge | HY | YAC / RTG
Q183 (continued) | _____ | RAG / YTC
N185: part of the H-bond network near the carbonyl of the chromophore | DEHNKQ | VAS / BTS
In another embodiment, an amino acid that is close to a second amino acid within about 0.5 nm of the chromophore can, upon substitution, alter the electronic properties of the second amino acid, in turn altering the electronic environment of the chromophore. Table D presents two such amino acids. The amino acids L220 and V224 are close to E222 and are oriented in the same direction in the β-pleated sheet.
TABLE D
Original position and presumed role | Change to | Codon (sense / antisense)
L220: packs near Glu222; to make GFP pH sensitive | HKNPQT | MMS / KKS
V224: packs near Glu222; to make GFP pH sensitive | HKNPQT | MMS / KKS
V224 (continued) | CFHLQRWYZ | YDS / RHS
One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at Q69, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein. Preferably, the substitution at Q69 is selected from the group of K, R, E, and G. The Q69 substitution can be combined with other mutations to improve the properties of the protein, such as a functional mutation at S65.
One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at E222, but not including E222G, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein. Preferably, the substitution at E222 is selected from the group of N and Q. The E222 substitution can be combined with other mutations to improve the properties of the protein, such as a functional mutation at F64.
One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at Y145, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein.
Preferably, the substitution at Y145 is selected from the group of W, C, F, L, E, H, K, and Q. The Y145 substitution can be combined with other mutations to improve the properties of the protein, such as one at Y66.
The invention also includes computer-related embodiments, including computational methods of using the crystal coordinates to design new fluorescent protein mutations, and devices for storing the crystal data, including the coordinates. For example, the invention includes a device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 through 5-28. More coordinates can be stored, depending on the complexity of the calculations or the purpose for which the coordinates are used (for example, about 100, 1,000, or more coordinates). For example, larger numbers of coordinates will be desirable for more detailed representations of the structure of the fluorescent protein. Typically, the storage device is a computer-readable device that stores, as code, the coordinates it receives as input, although other storage means known in the art are contemplated. The computer-readable device can be a floppy disk or a hard disk.
C. Production of Long Wavelength Fluorescent Proteins
Recombinant production of a fluorescent protein involves expressing a nucleic acid molecule having sequences that encode the protein.
In one embodiment, the nucleic acid encodes a fusion protein in which a single polypeptide includes the fluorescent protein moiety within a longer polypeptide. The longer polypeptide can include a second functional protein, such as a FRET partner or a protein having a second function (e.g., an enzyme, antibody, or other binding protein). Nucleic acids that encode fluorescent proteins are useful as starting materials.
Fluorescent proteins can be produced as fusion proteins by recombinant DNA technology.
Recombinant production of fluorescent proteins involves expressing nucleic acids having sequences that encode the proteins. Nucleic acids encoding fluorescent proteins can be obtained by methods known in the art. Fluorescent proteins can be made by site-specific mutagenesis of other nucleic acids encoding fluorescent proteins, or by random mutagenesis caused by increasing the error rate of the polymerase chain reaction of the original polynucleotide with 0.1 mM MnCl2 and unbalanced nucleotide concentrations. See, for example, U.S. patent application 08/337,915, filed on November 10, 1994, or International application PCT/US95/14692, filed on 10/11/95. A nucleic acid encoding a green fluorescent protein can be isolated by polymerase chain reaction of cDNA from A. victoria, using primers based on the DNA sequence of A. victoria green fluorescent protein, as presented in Figure 3. Polymerase chain reaction methods are described, for example, in U.S. Patent No. 4,683,195; Mullis et al. (1987) Cold Spring Harbor Symp. Quant. Biol. 51: 263; and Erlich, ed., PCR Technology (Stockton Press, NY, 1989).
The construction of expression vectors and the expression of genes in transfected cells involve the use of molecular cloning techniques that are also well known in the art. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989), and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds. (Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.). The expression vector can be adapted for function in prokaryotes or eukaryotes by including appropriate promoters, replication sequences, markers, and so on.
Nucleic acids used to transfect cells with sequences coding for expression of the polypeptide of interest will generally be in the form of an expression vector that includes expression control sequences operatively linked to a nucleotide sequence coding for expression of the polypeptide. As used herein, the term "nucleotide sequence coding for expression of" a polypeptide refers to a sequence that, upon transcription and translation of the mRNA, produces the polypeptide. This can include sequences that contain, for example, introns. Expression control sequences are operatively linked to a nucleic acid sequence when the expression control sequences control and regulate the transcription and, as appropriate, translation of the nucleic acid sequence. Thus, expression control sequences can include appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signals for introns, maintenance of the correct reading frame of that gene to permit proper translation of the mRNA, and stop codons.
Methods that are well known to those skilled in the art can be used to construct expression vectors containing the fluorescent protein coding sequence and the appropriate transcription/translation control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo recombination/genetic recombination. (See, for example, the techniques described in Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y., 1989.)
Transformation of a host cell with recombinant DNA can be carried out by conventional techniques that are also well known to those skilled in the art. When the host is prokaryotic, such as E. coli, competent cells capable of DNA uptake can be prepared from cells harvested after the exponential growth phase and subsequently treated by the CaCl2 method, using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed after forming a protoplast of the host cell or by electroporation.
When the host is a eukaryote, methods of DNA transfection such as calcium phosphate coprecipitation, conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors can be used. Eukaryotic cells can also be co-transfected with DNA sequences encoding the fusion polypeptide of the invention and a second foreign DNA molecule encoding a selectable phenotype, such as the herpes simplex thymidine kinase gene. Another method is to use a eukaryotic viral vector to transform eukaryotic cells and express the protein. (Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman, et al., 1982.) Preferably, a eukaryotic host is used as the host cell as described herein.
Techniques for the isolation and purification of polypeptides of the invention, whether expressed microbially or in eukaryotic cells, can be any conventional means, such as preparative chromatographic separations and immunological separations, including those involving the use of monoclonal or polyclonal antibodies or antigens.
In one embodiment, recombinant fluorescent proteins can be produced by expression of nucleic acid encoding the protein in E. coli. Aequorea-related fluorescent proteins are best expressed by cells cultured between about 15 °C and 30 °C, but higher temperatures (for example, 37 °C) are possible. After synthesis, these proteins are stable at higher temperatures (e.g., 37 °C) and can be used in assays at those temperatures.
A variety of host/expression vector systems can be used to express the fluorescent protein coding sequence.
These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA, or cosmid DNA expression vectors containing a fluorescent protein coding sequence; yeast transformed with recombinant yeast expression vectors containing the fluorescent protein coding sequence; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing a fluorescent protein coding sequence; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing a fluorescent protein coding sequence; animal cell systems infected with recombinant virus expression vectors (e.g., retroviruses, adenovirus, vaccinia virus) containing a fluorescent protein coding sequence; or transformed animal cell systems engineered for stable expression.
Depending on the host/vector system used, any of a number of suitable transcription and translation elements, including constitutive and inducible promoters, transcription enhancer elements, transcription terminators, etc., may be used in the expression vector (see, for example, Bitter et al., Methods in Enzymology 153: 516-544, 1987). For example, when cloning in bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter), and the like can be used. When cloning in mammalian cell systems, promoters derived from the genome of mammalian cells (for example, the metallothionein promoter) or from mammalian viruses (for example, the retrovirus long terminal repeat; the adenovirus late promoter; the vaccinia virus 7.5K promoter) can be used. Promoters produced by recombinant DNA or synthetic techniques can also be used to provide for transcription of the inserted fluorescent protein coding sequence.
In bacterial systems, a number of expression vectors may be advantageously selected, depending on the intended use of the expressed fluorescent protein. For example, when large quantities of the fluorescent protein are to be produced, vectors that direct the expression of high levels of readily purified fusion protein products may be desirable. Preferred are those that have been engineered to contain a cleavage site to aid in the recovery of the fluorescent protein.
In yeast, a number of vectors containing constitutive or inducible promoters can be used. For a review see Current Protocols in Molecular Biology, volume 2, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Chapter 13, 1988; Grant et al., Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman, Acad. Press, N.Y., volume 153, pages 516-544, 1987; DNA Cloning, volume II, IRL Press, Wash., D.C., chapter 3, 1986; Bitter, Heterologous Gene Expression in Yeast, Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., volume 152, pages 673-684, 1987; and The Molecular Biology of the Yeast Saccharomyces, Eds. Strathern et al., Cold Spring Harbor Press, Volumes I and II, 1982. A constitutive yeast promoter such as ADH or LEU2 or an inducible promoter such as GAL can be used (Cloning in Yeast, Chapter 3, R. Rothstein, in: DNA Cloning, volume II, A Practical Approach, Ed. D.M. Glover, IRL Press, Wash., D.C., 1986). Alternatively, vectors that promote the integration of foreign DNA sequences into the yeast chromosome can be used.
In cases where plant expression vectors are used, the expression of a fluorescent protein coding sequence can be driven by any of a number of promoters. For example, viral promoters such as the 35S RNA and 19S RNA promoters of CaMV (Brisson et al., Nature 310: 511-514, 1984) or the coat protein promoter of TMV (Takamatsu et al., EMBO J. 6: 301-311, 1987) can be used; alternatively, plant promoters such as the small subunit of RUBISCO (Coruzzi et al., 1984, EMBO J. 3: 1671-1680; Broglie et al., Science 224: 838-843, 1984) or heat shock promoters, for example, soybean hsp17.5-E or hsp17.3-B (Gurley et al., Mol. Cell. Biol. 6: 559-565, 1986), can be used. These constructs can be introduced into plant cells using Ti plasmids, Ri plasmids, plant virus vectors, direct DNA transformation, microinjection, electroporation, and so on. For reviews of these techniques see, for example, Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pages 421-463, 1988; and Grierson and Corey, Plant Molecular Biology, 2nd Ed., Blackie, London, chapters 7-9, 1988.
An alternative expression system that can be used to express the fluorescent protein is an insect system. In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The fluorescent protein coding sequence can be cloned into non-essential regions (e.g., the polyhedrin gene) of the virus and placed under the control of an AcNPV promoter (e.g., the polyhedrin promoter). Successful insertion of the fluorescent protein coding sequence will result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the protein coat encoded by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells, in which the inserted gene is expressed; see Smith et al., J. Virol. 46: 584, 1983; Smith, U.S. Patent No. 4,215,051.
Eukaryotic systems, and preferably mammalian expression systems, allow proper post-translational modifications of expressed mammalian proteins to occur.
Eukaryotic cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, phosphorylation, and, advantageously, secretion of the gene product should be used as host cells for the expression of the fluorescent protein. These host cell lines may include, but are not limited to, CHO, VERO, HeLa, COS, MDCK, Jurkat, HEK-293, and WI38. Mammalian cell systems that use recombinant viruses or viral elements to direct expression can be engineered. For example, when adenovirus expression vectors are used, the fluorescent protein coding sequence can be ligated to an adenovirus transcription/translation control complex, for example, the late promoter and tripartite leader sequence. This chimeric gene can then be inserted into the adenovirus genome by in vitro or in vivo recombination. Insertion into a non-essential region of the viral genome (e.g., region E1 or E3) will result in a recombinant virus that can express the fluorescent protein in infected hosts (e.g., see Logan and Shenk, Proc. Natl. Acad. Sci. USA, 81: 3655-3659, 1984). Alternatively, the vaccinia virus 7.5K promoter can be used (for example, see Mackett, et al., Proc. Natl. Acad. Sci. USA, 79: 7415-7419, 1982; Mackett, et al., J. Virol. 49: 857-864, 1984; Panicali, et al., Proc. Natl. Acad. Sci. USA, 79: 4927-4931, 1982). Of particular interest are vectors based on bovine papilloma virus, which have the ability to replicate as extrachromosomal elements (Sarver, et al., Mol. Cell. Biol. 1: 486, 1981). Shortly after entry of this DNA into mouse cells, the plasmid replicates to approximately 100 to 200 copies per cell. Transcription of the inserted cDNA does not require integration of the plasmid into the host chromosome, thereby yielding a high level of expression. These vectors can be used for stable expression by including a selectable marker in the plasmid, such as the neo gene.
Alternatively, the retroviral genome can be modified for use as a vector that can introduce and direct the expression of the fluorescent protein gene in host cells (Cone and Mulligan, Proc. Natl. Acad. Sci. USA, 81: 6349-6353, 1984). A high level of expression can also be achieved by using inducible promoters, including, but not limited to, the metallothionein IIA promoter and heat shock promoters. The invention may also include a localization sequence, such as a nuclear localization sequence, an endoplasmic reticulum localization sequence, a peroxisome localization sequence, a mitochondrial localization sequence, or a localized protein. The localization sequences can be targeting sequences which are described, for example, in "Protein Targeting", chapter 35 of Stryer, L., Biochemistry (4th ed.), W.H. Freeman, 1995. The localization sequence can also be a localized protein. Some important localization sequences include those targeting the nucleus (KKKRK), the mitochondrion (amino-terminal MLRTSSLFTRRVQPSLFRNILRLQST-), the endoplasmic reticulum (KDEL at the C-terminus, assuming a signal sequence is present at the N-terminus), the peroxisome (SKF at the C-terminus), prenylation or insertion into the plasma membrane (CaaX, CC, CXC, or CCXX at the C-terminus), the cytoplasmic side of the plasma membrane (fusion to SNAP-25), or the Golgi apparatus (fusion to furin). For long-term, high-yield production of recombinant proteins, stable expression is preferred. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with the cDNA of the fluorescent protein controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker.
The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci, which in turn can be cloned and expanded into cell lines. For example, following the introduction of the foreign DNA, the engineered cells may be allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective medium. A number of selection systems can be used, including but not limited to, the herpes simplex virus thymidine kinase (Wigler, et al., Cell, 11: 223, 1977), hypoxanthine-guanine phosphoribosyltransferase (Szybalska and Szybalski, Proc. Natl. Acad. Sci. USA, 48: 2026, 1962), and adenine phosphoribosyltransferase (Lowy, et al., Cell, 22: 817, 1980) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler, et al., Proc. Natl. Acad. Sci. USA, 77: 3567, 1980; O'Hare, et al., Proc. Natl. Acad. Sci. USA, 78: 1527, 1981); gpt, which confers resistance to mycophenolic acid (Mulligan and Berg, Proc. Natl. Acad. Sci. USA, 78: 2072, 1981); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., J. Mol. Biol., 150: 1, 1981); and hygro, which confers resistance to hygromycin (Santerre, et al., Gene, 30: 147, 1984). Recently, additional selectable genes have been described, namely trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman and Mulligan, Proc. Natl. Acad. Sci. USA, 85: 8047, 1988); and ODC (ornithine decarboxylase), which confers resistance to the ornithine decarboxylase inhibitor 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue L., in: Current Communications in Molecular Biology, Cold Spring Harbor Laboratory, ed., 1987).
The DNA sequences encoding the fluorescent protein polypeptide of the invention can be expressed in vitro by transferring the DNA into an appropriate host cell. "Host cells" are cells in which a vector can be propagated and its DNA expressed. The term also includes any progeny of the subject host cell. It is understood that not all progeny may be identical to the parental cell, since mutations can occur during replication. However, such progeny are included when the term "host cell" is used. Methods of stable transfer, meaning that the foreign DNA is continuously maintained in the host, are well known in the art. The expression vector can be transfected into a host cell for expression of the recombinant nucleic acid. Host cells can be selected for a high level of expression in order to purify the fluorescent protein fusion protein. E. coli is useful for this purpose. Alternatively, the host cell can be a prokaryotic or eukaryotic cell selected to study the activity of an enzyme produced by the cell. In this case, the linker peptide is selected to include an amino acid sequence recognized by the protease. The cell can be, for example, a cultured cell or a cell in vivo. A primary advantage of fluorescent protein fusion proteins is that they are prepared by normal protein biosynthesis, thus completely avoiding organic synthesis and the requirement for customized unnatural amino acid analogues. Constructs can be expressed in E. coli on a large scale for in vitro assays. Purification from bacteria is simplified when the sequences include polyhistidine tags for one-step purification by nickel chelate chromatography. Alternatively, the substrates can be expressed directly in a host cell for in situ assays. In another embodiment, the invention provides a transgenic non-human animal that expresses a nucleic acid sequence which encodes the fluorescent protein.
The "non-human animals" of the invention comprise any non-human animal having a nucleic acid sequence which encodes a fluorescent protein. These non-human animals include vertebrates such as rodents, non-human primates, sheep, dogs, cows, pigs, amphibians, and reptiles. Preferred non-human animals are selected from the rodent family, which includes the rat and the mouse, most preferably the mouse. The "transgenic non-human animals" of the invention are produced by introducing "transgenes" into the germ line of the non-human animal. Embryonic target cells at various stages of development can be used to introduce transgenes. Different methods are used depending on the stage of development of the embryonic target cell. The zygote is the best target for microinjection. In the mouse, the male pronucleus reaches a size of approximately 20 micrometers in diameter, which allows reproducible injection of 1-2 pl of DNA solution. The use of zygotes as a target for gene transfer has a major advantage in that, in most cases, the injected DNA will be incorporated into the host genome before the first cleavage (Brinster, et al., Proc. Natl. Acad. Sci. USA 82: 4438-4442, 1985). As a consequence, all cells of the transgenic non-human animal will carry the incorporated transgene. In general, this will also be reflected in efficient transmission of the transgene to offspring of the founder, since 50 percent of the germ cells will carry the transgene. Zygote microinjection is the preferred method for incorporating transgenes in practicing the invention. The term "transgenic" is used to describe an animal which includes exogenous genetic material within all of its cells. A "transgenic" animal can be produced by cross-breeding two chimeric animals which include exogenous genetic material within the cells used in reproduction.
Twenty-five percent of the resulting offspring will be transgenic, that is, animals that include the exogenous genetic material within all of their cells in both alleles. Fifty percent of the resulting animals will include the exogenous genetic material within one allele, and 25 percent will not include the exogenous genetic material. Retroviral infection can also be used to introduce the transgene into a non-human animal. The developing non-human embryo can be cultured in vitro to the blastocyst stage. During this time, the blastomeres can be targets for retroviral infection (Jaenisch, R., Proc. Natl.
Acad. Sci. USA 73: 1260-1264). Efficient infection of the blastomeres is obtained by enzymatic treatment to remove the zona pellucida (Hogan, et al. (1986) in Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The viral vector system used to introduce the transgene is typically a replication-defective retrovirus carrying the transgene (Jahner, et al., Proc. Natl. Acad. Sci. USA 82: 6927-6931, 1985; Van der Putten, et al., Proc. Natl. Acad. Sci. USA 82: 6148-6152, 1985). Transfection is obtained simply and efficiently by culturing the blastomeres on a monolayer of virus-producing cells (Van der Putten, supra; Stewart, et al., EMBO J. 6: 383-388, 1987). Alternatively, infection can be performed at a later stage. Viruses or virus-producing cells can be injected into the blastocoel (D. Jahner, et al., Nature 298: 623-628). Most founders will be mosaic for the transgene, since incorporation occurs only in a subset of the cells that formed the transgenic non-human animal. In addition, the founder can contain several retroviral insertions of the transgene at different positions in the genome, which will generally segregate in the offspring. It is also possible to introduce transgenes into the germ line, albeit with low efficiency, by intrauterine retroviral infection of the mid-gestation embryo (D. Jahner, et al., supra). A third type of target cell for introduction of the transgene is the embryonic stem (ES) cell. ES cells are obtained from pre-implantation embryos cultured in vitro and fused with embryos (MJ Evans, et al., Nature 292: 154-156, 1981; MO Bradley, et al., Nature 309: 255-258, 1984; Gossler, et al., Proc. Natl. Acad. Sci. USA 83: 9065-9069, 1986; and Robertson, et al., Nature 322: 445-448, 1986). Transgenes can be efficiently introduced into ES cells by DNA transfection or by retrovirus-mediated transduction.
These transformed ES cells can then be combined with blastocysts from a non-human animal. The ES cells thereafter colonize the embryo and contribute to the germ line of the resulting chimeric animal. (For a review, see Jaenisch, R., Science 240: 1468-1474, 1988.) "Transformed" means a cell into which (or into an ancestor of which) a heterologous nucleic acid molecule has been introduced by means of recombinant nucleic acid techniques. "Heterologous" refers to a nucleic acid sequence that either originates from another species or is modified from either its original form or the form primarily expressed in the cell. "Transgene" means any piece of DNA which is inserted by artifice into a cell, and becomes part of the genome of the organism (that is, either stably integrated or as a stable extrachromosomal element) which develops from that cell. Such a transgene may represent a gene homologous to an endogenous gene of the organism. Included within this definition is a transgene created by providing an RNA sequence which is transcribed into DNA and then incorporated into the genome. The transgenes of the invention include DNA sequences that encode the fluorescent protein, which can be expressed in a transgenic non-human animal. The term "transgenic" as used herein additionally includes any organism whose genome has been altered by in vitro manipulation of the early embryo or fertilized egg, or by any transgenic technology to induce a specific gene knockout. The term "gene knockout" as used herein refers to the targeted disruption of a gene in vivo with complete loss of function, achieved by any transgenic technology familiar to those in the art. In one embodiment, transgenic animals having gene knockouts are those in which the target gene has been rendered non-functional by an insertion targeted to that gene by homologous recombination.
As used herein, the term "transgenic" includes any transgenic technology familiar to those in the art which can produce an organism carrying an introduced transgene or one in which an endogenous gene has been rendered non-functional or "knocked out."
III. USES OF ENGINEERED FLUORESCENT PROTEINS
The proteins of this invention are useful in any methods that employ fluorescent proteins. The engineered fluorescent proteins of this invention are useful as fluorescent labels in the many ways in which fluorescent labels are currently used. This includes, for example, coupling engineered fluorescent proteins to antibodies, nucleic acids or other receptors for use in detection assays, such as immunoassays or hybridization assays. The engineered fluorescent proteins of this invention are useful for tracking the movement of proteins in cells. In this embodiment, a nucleic acid molecule encoding the fluorescent protein is fused to a nucleic acid molecule encoding the protein of interest in an expression vector. After expression within the cell, the protein of interest can be localized based on fluorescence. In another version, two proteins of interest are fused with two engineered fluorescent proteins that have different fluorescent characteristics. The engineered fluorescent proteins of this invention are useful in systems for detecting the induction of transcription. In certain embodiments, a nucleotide sequence encoding the engineered fluorescent protein is fused to expression control sequences of interest and the expression vector is transfected into a cell. Induction of the promoter can be measured by detecting the expression and/or the amount of fluorescence. These constructs can be used to follow signaling pathways from receptor to promoter. The engineered fluorescent proteins of this invention are useful in applications involving fluorescence resonance energy transfer.
These applications can detect events as a function of the movement of fluorescent donors and acceptors toward or away from each other. One or both members of the donor/acceptor pair can be a fluorescent protein. A preferred donor and acceptor pair for assays based on fluorescence resonance energy transfer is a donor with a T203I mutation and an acceptor with the T203X mutation, wherein X is an aromatic amino acid, especially T203Y, T203W, or T203H. In a particularly useful pair, the donor contains the following mutations: S72A, K79R, Y145F, M153A and T203I (with an excitation peak of 395 nm and an emission peak of 511 nm) and the acceptor contains the following mutations: S65G, S72A, K79R, and T203Y. This particular pair provides a wide separation between the excitation and emission peaks and a good overlap between the emission spectrum of the donor and the excitation spectrum of the acceptor.
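The distance dependence that these assays exploit can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The Python sketch below is illustrative only: the function names are hypothetical, and the Förster radius of 5 nm is an assumed placeholder, not a value measured for this donor/acceptor pair. The Stokes shift of the donor is computed from the 395 nm excitation and 511 nm emission peaks stated above.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Foerster transfer efficiency for a donor/acceptor separation r_nm.

    r0_nm is the Foerster radius, i.e. the separation at which
    half of the donor excitations are transferred to the acceptor.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r_nm must be >= 0 and r0_nm must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Difference between emission and excitation maxima, in nm."""
    return emission_nm - excitation_nm


# ASSUMPTION: placeholder Foerster radius, not taken from the source text.
ASSUMED_R0_NM = 5.0

if __name__ == "__main__":
    # Donor peaks quoted in the text: excitation 395 nm, emission 511 nm.
    print("Donor Stokes shift (nm):", stokes_shift_nm(395, 511))
    # Efficiency falls steeply once the separation exceeds R0.
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r} nm -> E = {fret_efficiency(r, ASSUMED_R0_NM):.3f}")
```

Because the efficiency falls with the sixth power of the separation, moving the pair from half to twice the assumed Förster radius takes the transfer from nearly complete to nearly absent, which is why separation of donor and acceptor gives a large signal change.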
Other red-shifted mutants, such as those described hereinabove, can also be used as the acceptor in this pair. In one aspect, fluorescence resonance energy transfer is used to detect the cleavage of a substrate having the donor and the acceptor coupled to the substrate on opposite sides of the cleavage site. Upon cleavage of the substrate, the donor/acceptor pair physically separates, eliminating the fluorescence resonance energy transfer. The assays include contacting the substrate with a sample, and determining a qualitative or quantitative change in the fluorescence resonance energy transfer. In one embodiment, the engineered fluorescent protein is used in a substrate for β-lactamase. Examples of these substrates are described in United States patent application 08/407,544, filed on March 20, 1995, and in international application PCT/US96/04059, filed on March 20, 1996. In another embodiment, an engineered fluorescent protein donor/acceptor pair is part of a fusion protein coupled by a peptide having a proteolytic cleavage site. These tandem fluorescent proteins are described in United States patent application 08/594,575, filed on January 31, 1996. In another aspect, fluorescence resonance energy transfer is used to detect changes in potential across a membrane. A donor and an acceptor are placed on opposite sides of a membrane such that one moves across the membrane in response to a voltage change. This creates a measurable fluorescence resonance energy transfer. This method is described in United States patent application 08/481,977, filed on June 7, 1995, and in international application PCT/US96/09652, filed on June 6, 1996. The engineered proteins of this invention are useful in the creation of fluorescent substrates for protein kinases. These substrates incorporate an amino acid sequence that can be recognized by protein kinases.
Upon phosphorylation, the engineered fluorescent protein undergoes a change in a fluorescent property. These substrates are useful for detecting and measuring protein kinase activity in a sample of a cell, after transfection and expression of the substrate. Preferably, the kinase recognition site is placed within about 20 amino acids of a terminus of the engineered fluorescent protein. The kinase recognition site can also be placed in a loop domain of the protein (see, for example, Figure 1B). Methods for making fluorescent substrates for protein kinases are described in United States patent application 08/680,877, filed July 16, 1996. A protease recognition site can also be introduced into a loop domain. Upon cleavage, a fluorescent property changes in a measurable manner. The invention also includes a method for identifying a test chemical. Typically, the method includes contacting a test chemical with a sample containing a biological entity labeled with a functional, engineered fluorescent protein or with a polynucleotide that encodes this functional, engineered fluorescent protein. By monitoring the fluorescence (i.e., a fluorescent property) of the sample containing the functional, engineered fluorescent protein, it can be determined whether or not the test chemical is active. Controls may be included to ensure the specificity of the signal. These controls include measurements of a fluorescent property in the absence of the test chemical, in the presence of a chemical with an expected activity (e.g., a known modulator), or engineered controls (e.g., absence of the engineered fluorescent protein, absence of the engineered fluorescent protein polynucleotide, or absence of an operable linkage of the engineered fluorescent protein). Fluorescence in the presence of a test chemical may be higher or lower than in the absence of the test chemical.
For example, if the engineered fluorescent protein is used to report the expression of a gene, the test chemical may up-regulate or down-regulate the expression of the gene.
For these types of screening, the polynucleotide encoding the functional, engineered fluorescent protein is operably linked to a genomic polynucleotide or a re. Alternatively, the functional, engineered fluorescent protein is fused to a second functional protein. This embodiment can be used to track the location of the second protein or to track protein-protein interactions using energy transfer.
IV. PROCEDURES
Fluorescence in a sample is measured using a fluorometer. In general, excitation radiation from an excitation source having a first wavelength passes through excitation optics. The excitation optics cause the excitation radiation to excite the sample. In response, fluorescent proteins in the sample emit radiation which has a wavelength that is different from the excitation wavelength. Collection optics then collect the emission from the sample. The device includes a temperature controller to maintain the sample at a specific temperature while it is being scanned. According to one embodiment, a multi-axis translation stage moves a microtiter plate holding a plurality of samples in order to position different wells to be exposed. The multi-axis translation stage, the temperature controller, the auto-focusing feature, and the electronics associated with imaging and data collection can be managed by an appropriately programmed digital computer. The computer can also transform the data collected during the assay into another format for presentation. This process can be miniaturized and automated to allow the screening of many thousands of compounds. Methods for performing assays on fluorescent materials are well known in the art and are described in, for example, Lakowicz, J.R., Principles of Fluorescence Spectroscopy, New York: Plenum Press (1983); Herman, B., Resonance energy transfer microscopy, in: Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in Cell Biology, volume 30, ed.
Taylor, D.L. and Wang, Y.L., San Diego: Academic Press (1989), pages 219-243; Turro, N.J., Modern Molecular Photochemistry, Menlo Park: Benjamin/Cummings Publishing Co., Inc. (1978), pages 296-361. The following examples are provided by way of illustration, and not by way of limitation.
Examples
As a step toward understanding the properties of the green fluorescent protein, and to aid in the preparation of green fluorescent proteins with altered characteristics, we have determined the three-dimensional structure at 1.9 Å resolution of the S65T mutant (R. Heim, et al., Nature 373: 664-665 (1995)) of the green fluorescent protein of Aequorea victoria. This mutant also contains the ubiquitous Q80R substitution, which occurred accidentally in the early distribution of the green fluorescent protein cDNA and which is not known to have any effect on the properties of the protein (M. Chalfie, et al., Science 263: 802-805 (1994)). Histidine-tagged S65T green fluorescent protein (R. Heim, et al., Nature 373: 664-665 (1995)) was overexpressed in JM109/pRSETB in 4x YT broth plus ampicillin at 37°C, 450 rpm and 5 liters/minute airflow. The temperature was reduced to 25°C at A595 = 0.3, followed by induction with 1 mM isopropylthiogalactoside for 5 hours. The cell paste was stored at -80°C overnight, then resuspended in 50 mM HEPES at pH 7.9, 0.3 M NaCl, 5 mM 2-mercaptoethanol, 0.1 mM phenylmethylsulfonyl fluoride (PMSF), passed once through a French press at 10,000 psi, then centrifuged at 20,000 rpm for 45 minutes. The supernatant was applied to a Ni-NTA-agarose column (Qiagen), followed by a wash with 20 mM imidazole, and then eluted with 100 mM imidazole. The green fractions were pooled and subjected to chymotryptic proteolysis (Sigma) (1:50 weight/weight) for 22 hours at RT. After the addition of 0.5 mM phenylmethylsulfonyl fluoride, the digest was reapplied to the Ni column.
N-terminal sequencing verified the presence of the correct N-terminal methionine. After dialysis against 20 mM HEPES at pH 7.5 and concentration to A490 = 20, rod-shaped crystals were obtained at RT within 5 days in hanging drops containing 5 µl of protein and 5 µl of well solution: 22-26% PEG 4000 (Serva), 50 mM HEPES at pH 8.0-8.5, 50 mM MgCl2 and 10 mM 2-mercaptoethanol. The crystals were 0.005 mm across and up to 1.0 mm in length. The space group is P2₁2₁2₁ with a = 51.8, b = 62.8, c = 70.7 Å, Z = 4. M.A. Perozzo, K.B. Ward, R.B. Thompson, and W.W. Ward, J. Biol. Chem. 263, 7713-7716 (1988), have described two crystal forms of wild-type green fluorescent protein, which are unrelated to the present form. The structure of the green fluorescent protein was determined by multiple isomorphous replacement and anomalous scattering (Table E), solvent flattening, phase combination and crystallographic refinement. The most distinctive feature of the green fluorescent protein fold is an 11-stranded β-barrel wrapped around a single central helix (Figures 1A and 1B), where each strand consists of approximately 9-13 residues. The barrel forms a nearly perfect cylinder 42 Å long and 24 Å in diameter. The N-terminal half of the polypeptide comprises three anti-parallel strands, the central helix, and then three more anti-parallel strands, the last of which (residues 118-123) is parallel to the N-terminal strand (residues 11-23). The polypeptide then crosses the "bottom" of the molecule to form the second half of the barrel in a five-strand Greek Key motif. The top end of the cylinder is capped by three short, distorted helical segments, while one short, very distorted helical segment caps the bottom of the cylinder. The main-chain hydrogen bonding that laces the surface of the cylinder is very likely the reason for the unusual stability of the protein toward denaturation and proteolysis.
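As a rough consistency check on the crystal data reported above, the volume of an orthorhombic unit cell is simply a × b × c, and dividing that volume by Z times the molecular mass gives the Matthews coefficient (Å³ per Da). The sketch below is not part of the original analysis. It uses hypothetical function names, assumes one protein molecule per asymmetric unit (so that Z = 4 copies fill the cell), and assumes a molecular mass of about 31,093 Da, the predicted mass of the tagged S65T protein quoted elsewhere in this description.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit cell volume in cubic angstroms for an orthorhombic cell
    (all angles 90 degrees), given edge lengths in angstroms."""
    return a * b * c


def matthews_coefficient(cell_volume: float, z: int, mass_da: float) -> float:
    """Crystal volume per unit of protein mass, in cubic angstroms per Da."""
    if z <= 0 or mass_da <= 0:
        raise ValueError("z and mass_da must be positive")
    return cell_volume / (z * mass_da)


# Cell edges and Z as reported in the text.
A, B, C = 51.8, 62.8, 70.7
Z = 4
# ASSUMPTION: predicted mass of the tagged S65T protein, taken from the
# mass spectrometry discussion; the untagged protein would be lighter.
ASSUMED_MASS_DA = 31093.0

if __name__ == "__main__":
    volume = orthorhombic_cell_volume(A, B, C)
    vm = matthews_coefficient(volume, Z, ASSUMED_MASS_DA)
    print(f"Unit cell volume: {volume:.0f} cubic angstroms")
    print(f"Matthews coefficient: {vm:.2f} cubic angstroms per Da")
```

Under these assumptions the coefficient comes out a little below 2 Å³/Da, which would indicate a fairly tightly packed crystal; the exact figure depends on how much of the tag remains after proteolysis.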
There are no large segments of the polypeptide that could be excised while preserving the integrity of the shell around the chromophore. It would therefore seem difficult to re-engineer the green fluorescent protein to reduce its molecular weight (J. Dopf and T.M. Horiagon, Gene 173: 39-43 (1996)) by a large percentage. The p-hydroxybenzylideneimidazolidinone chromophore (C.W. Cody, et al., Biochemistry 32: 1212-1218 (1993)) is completely protected from bulk solvent and is centrally located in the molecule. The total and presumably rigid encapsulation is probably responsible for the small Stokes shift (i.e., the wavelength difference between excitation and emission maxima), the high quantum yield, the inability of O2 to quench the excited state (B.D. Nageswara Rao, et al., Biophys. J. 32: 630-632 (1980)), and the resistance of the chromophore to titration of the external pH (W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press, pages 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)). It also allows one to rationalize why fluorophore formation must be a spontaneous intramolecular process (R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994)), since it is difficult to imagine how an enzyme could gain access to the substrate. The plane of the chromophore is roughly perpendicular (60°) to the symmetry axis of the surrounding barrel. One side of the chromophore faces a surprisingly large cavity that occupies a volume of approximately 135 Å³ (B. Lee and F.M. Richards, J. Mol. Biol. 55: 379-400 (1971)). The atomic radii were those of Lee and Richards, and the volume was calculated using the MS program with a probe radius of 1.4 Å (M.L. Connolly, Science 221: 709-713 (1983)). The cavity does not open out to bulk solvent.
Water molecules are located in the cavity, forming a chain of hydrogen bonds that links the buried side chains of Glu222 and Gln69. Unless occupied, such a large cavity would be expected to destabilize the protein by several kcal/mol (S.J. Hubbard, et al., Protein Engineering 7: 613-626 (1994); A.E. Eriksson, et al., Science 255: 178-183 (1992)). Part of the volume of the cavity could be the consequence of the compaction resulting from the cyclization and dehydration reactions. The cavity may also transiently accommodate the oxidant, most likely O2 (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995); R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994); S. Inouye and F.I. Tsuji, FEBS Lett. 351: 211-214 (1994)), which dehydrogenates the α-β bond of Tyr66. Figure 2A shows the chromophore, the cavity, and the side chains that contact the chromophore, and Figure 2B shows a portion of the final electron density map in this vicinity. The opposite side of the chromophore is packed against several aromatic and polar side chains. Of particular interest is the intricate network of polar interactions with the chromophore (Figure 2C). His148, Thr203 and Ser205 form hydrogen bonds with the phenolic hydroxyl; Arg96 and Gln94 interact with the carbonyl of the imidazolidinone ring; and Glu222 forms a hydrogen bond with the side chain of Thr65. Additional polar interactions, such as hydrogen bonds to Arg96 from the carbonyl of Thr62 and from the side-chain carbonyl of Gln183, presumably stabilize the buried Arg96 in its protonated form. In turn, this buried charge suggests that a partial negative charge resides on the carbonyl oxygen of the imidazolidinone ring of the deprotonated fluorophore, as previously suggested
(W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press, pages 235-242 (1981); W.W.
Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)). It is likely that Arg96 is essential for the formation of the fluorophore, and may help catalyze the initial ring closure. Finally, Tyr145 shows a typical stabilizing edge-face interaction with the benzyl ring. Trp57, the only tryptophan in the green fluorescent protein, is located 13 Å to 15 Å from the chromophore, and the long axes of the two ring systems are nearly parallel. This indicates that efficient energy transfer to the latter should occur, and explains why no separate tryptophan emission is observable (D.C. Prasher, et al., Gene 111: 229-233 (1992)). The two cysteines in the green fluorescent protein, Cys48 and Cys70, are 24 Å apart, too distant to form a disulfide bridge. Cys70 is buried, but Cys48 should be relatively accessible to sulfhydryl-specific reagents. One such reagent, 5,5'-dithiobis(2-nitrobenzoic acid), is reported to label the green fluorescent protein and quench its fluorescence (S. Inouye and F.I. Tsuji, FEBS Lett. 351: 211-214 (1994)). This effect was attributed to the necessity of a free sulfhydryl, but could also reflect specific quenching by the 5-thio-2-nitrobenzoate moiety that would be attached to Cys48. Although the electron density map is for the most part consistent with the proposed structure of the chromophore (D.C. Prasher, et al., Gene 111: 229-233 (1992); C.W. Cody, et al., Biochemistry 32: 1212-1218 (1993)) in the cis [Z-] configuration, with no evidence of any substantial fraction of the opposite isomer around the chromophore double bond, difference features are found at >4σ in the final (Fo-Fc) electron density map that can be interpreted to represent either the intact, uncyclized polypeptide or a carbinolamine (inset to Figure 2).
This suggests that a significant fraction, perhaps as much as 30 percent of the molecules in the crystal, have not undergone the final dehydration reaction. Confirmation of the incomplete dehydration comes from electrospray mass spectrometry, which consistently shows that the average masses of both the wild-type and the S65T green fluorescent protein (31,086 ± 4 and 31,099.5 ± 4 Da, respectively) are 6-7 Da higher than predicted (31,079 and 31,093 Da, respectively) for the fully matured proteins. This discrepancy could be explained by a 30-35 mole percent fraction of apoprotein or carbinolamine with 18 or 20 Da higher molecular weight. The natural abundance of 13C and 2H and the finite resolution of the Hewlett-Packard 5989B electrospray mass spectrometer used to make these measurements do not allow the individual peaks to be resolved, but instead yield an average mass peak with a full width at half maximum of approximately 15 Da. The molecular weights shown include the His tag, which has the sequence MRGSHHHHHH GMASMTGGQQM GRDLYDDDDK DPPAEF (SEQ ID NO: 5). Mutants of the green fluorescent protein that increase the efficiency of fluorophore maturation might yield somewhat brighter preparations. In a model for the apoprotein, the Thr65-Tyr66 peptide bond is approximately in the α-helical conformation, while the Tyr66-Gly67 peptide appears to be tipped almost perpendicular to the helix axis by its interaction with Arg96. This further supports the speculation that Arg96 is important in generating the conformation required for cyclization, and possibly also in promoting the attack of Gly67 on the carbonyl carbon of Thr65 (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995)). The results of previous random mutagenesis have implicated several amino acid side chains as having substantial effects on the spectra, and the atomic model confirms that these residues are close to the chromophore.
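The mass-spectrometry argument above is simple averaging arithmetic, and a minimal sketch may make it easier to check. The function name `expected_mass_excess_da` is hypothetical and introduced only for illustration; the masses, fractions and mass increments are the ones quoted in the text, and the sketch assumes that the measured peak is the population-weighted average of mature and immature molecules.

```python
def expected_mass_excess_da(immature_fraction: float, extra_mass_da: float) -> float:
    """Shift of the population-average mass (Da) when a given fraction of the
    molecules is heavier than the mature protein by extra_mass_da."""
    return immature_fraction * extra_mass_da


# Observed minus predicted average masses quoted in the text (Da).
observed_excess_da = {
    "wild type": 31086.0 - 31079.0,
    "S65T": 31099.5 - 31093.0,
}

for name, excess in observed_excess_da.items():
    print(f"{name}: observed excess {excess:.1f} Da")

# Fractions (30-35 mole percent) and mass increments (18 or 20 Da) from the text.
for fraction in (0.30, 0.35):
    for extra in (18.0, 20.0):
        shift = expected_mass_excess_da(fraction, extra)
        print(f"{fraction:.0%} immature, +{extra:.0f} Da each: average shift {shift:.1f} Da")
```

Under these assumptions the computed shifts fall in roughly the same few-dalton range as the observed excess, which is the sense in which the text says the discrepancy "could be explained" by incomplete maturation.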
The T203I and E222G mutations have profound but opposite consequences on the absorption spectrum (T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)). T203I (with the wild-type Ser65) lacks the 475 nm absorbance peak that is usually attributed to the anionic chromophore and shows only the 395 nm peak, which is thought to reflect the neutral chromophore (R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994); T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)). Indeed, Thr203 is hydrogen bonded to the phenolic oxygen of the chromophore, so replacement by Ile should hinder ionization of the phenolic oxygen. Mutation of Glu222 to Gly (T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)) has much the same spectroscopic effect as replacing Ser65 by Gly, Ala, Cys, Val, or Thr, namely, suppression of the 395 nm peak in favor of a peak at 470-490 nm (R. Heim, et al., Nature 373: 664-665 (1995); S. Delagrave, et al., Bio / Technology 13: 151-154 (1995)). Indeed, Glu222 and the remnant of Thr65 are hydrogen bonded to each other in the present structure, probably with the uncharged carboxyl of Glu222 acting as a donor to the side chain oxygen of Thr65. The mutations E222G, S65A, and S65V would all suppress this hydrogen bond. To explain why only the wild-type protein has both excitation peaks, Ser65, unlike Thr65, may adopt a conformation in which its hydroxyl donates a hydrogen bond to Glu222 and stabilizes it as an anion, whose charge then inhibits ionization of the chromophore. The structure also explains why some mutations appear neutral. For example, Gln80 is a surface residue far removed from the chromophore, which explains why its accidental and ubiquitous mutation to Arg seems to have no obvious intramolecular spectroscopic effect (M. Chalfie, et al., Science 263: 802-805 (1994)).
The development of green fluorescent protein mutants with red-shifted excitation and emission maxima is an interesting challenge in protein engineering (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995); R. Heim, et al., Nature 373: 664-665 (1995); S. Delagrave, et al., Bio / Technology 13: 151-154 (1995)). Such mutants would also be valuable for avoiding cellular autofluorescence at short wavelengths, for simultaneous multicolor reporting of the activity of two or more cellular processes, and for exploiting fluorescence resonance energy transfer as a signal of protein-protein interaction (R. Heim and R.Y. Tsien, Current Biol. 6: 178-182 (1996)). Extensive attempts using random mutagenesis have shifted the emission maximum by at most 6 nm to longer wavelengths, to 514 nm (R. Heim and R.Y. Tsien, Current Biol. 6: 178-182 (1996)); the "red-shifted" mutants described previously merely suppressed the 395 nm excitation peak in favor of the 475 nm peak without any significant reddening of the 505 nm emission (S. Delagrave, et al., Bio / Technology 13: 151-154 (1995)). Because Thr203 is revealed to be adjacent to the phenolic end of the chromophore, we mutated it to polar aromatic residues such as His, Tyr, and Trp in the hope that the additional polarizability of their π systems would lower the energy of the excited state of the adjacent chromophore. All three substitutions did in fact shift the emission peak to more than 520 nm (Table F). A particularly attractive mutation was T203Y / S65G / V68L / S72A, with excitation and emission peaks at 513 nm and 527 nm, respectively. These wavelengths are sufficiently different from those of previous green fluorescent protein mutants to be readily distinguished with appropriate filter sets on a fluorescence microscope. The extinction coefficient, 36,500 M⁻¹ cm⁻¹, and the quantum yield, 0.63, are almost as high as those of S65T (R. Heim, et al., Nature 373: 664-665 (1995)).
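To put the red shift described above in energy terms, the emission maxima quoted in the text can be converted to photon energies with E = hc/λ. The following is only an illustrative sketch: the helper name `photon_energy_ev` and the dictionary labels are assumptions made here, and the rounded constant hc ≈ 1239.84 eV·nm is standard physics rather than something taken from the patent.

```python
# h*c expressed in eV*nm (rounded physical constant, not from the source text).
PLANCK_TIMES_C_EV_NM = 1239.84


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy in eV of a photon of the given wavelength, E = h*c / lambda."""
    return PLANCK_TIMES_C_EV_NM / wavelength_nm


# Emission maxima (nm) quoted in the paragraph above; labels are illustrative.
emission_maxima_nm = {
    "earlier mutants (no reddening)": 505,
    "best random-mutagenesis result": 514,
    "T203Y / S65G / V68L / S72A": 527,
}

reference_ev = photon_energy_ev(505)
for label, wavelength in emission_maxima_nm.items():
    energy = photon_energy_ev(wavelength)
    lowering_mev = (reference_ev - energy) * 1000.0
    print(f"{label}: {wavelength} nm, {energy:.3f} eV, "
          f"{lowering_mev:.0f} meV below the 505 nm emission")
```

The conversion shows that a shift of about 20 nm in this spectral region corresponds to lowering the emitted photon energy by roughly a tenth of an electron volt, which is the scale of excited-state stabilization the aromatic substitutions at position 203 were hoped to provide.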
Comparisons of the green fluorescent protein of Aequorea with other protein pigments are instructive. Unfortunately, its closest characterized homologue, the green fluorescent protein of the sea pansy Renilla reniformis (O. Shimomura and F.H. Johnson, J. Cell. Comp. Physiol. 59: 223 (1962); J.G. Morin and J.W. Hastings, J. Cell. Physiol. 77: 313 (1971); H. Morise, et al., Biochemistry 13: 2656 (1974); W.W. Ward, Photochem. Photobiol. Reviews (K.C. Smith, ed.) 4: 1 (1979); W.W. Ward, in Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.), Academic Press, pages 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)), has not been sequenced or cloned, although its chromophore is derived from the same FSYG sequence as in the wild-type green fluorescent protein of Aequorea (R.M. San Pietro, et al., Photochem. Photobiol. 51 - 63S (1993)). The closest analogue for which a three-dimensional structure is available is the photoactive yellow protein (PYP; G.E.O.
Borgstahl, et al., Biochemistry 34: 6278-6287 (1995)), a 14-kDa photoreceptor from a halophilic bacterium. In its native dark state, the photoactive yellow protein absorbs maximally at 446 nm and transduces light with a quantum yield of 0.64, closely matching the long wavelength absorbance maximum of the wild-type green fluorescent protein near 475 nm and its fluorescence quantum yield of 0.72-0.85. The fundamental chromophore in both proteins is an anionic p-hydroxycinnamyl group, which is covalently attached to the protein through a thioester bond in the photoactive yellow protein and a heterocyclic iminolactam in the green fluorescent protein. Both proteins stabilize the negative charge on the chromophore with the help of buried cationic arginine and neutral glutamic acid groups, Arg52 and Glu46 in the photoactive yellow protein and Arg96 and Glu222 in the green fluorescent protein, although in the photoactive yellow protein the residues are close to the oxyphenyl ring, whereas in the green fluorescent protein they are closer to the carbonyl end of the chromophore. However, the photoactive yellow protein has an overall α/β fold with the flexibility and signal transduction domains needed to mediate the cellular phototactic response, whereas the green fluorescent protein is a much more regular and rigid β-barrel, which minimizes parasitic dissipation of the excited state energy as thermal or conformational motions. The green fluorescent protein is an elegant example of how a visually appealing and extremely useful function, efficient fluorescence, can be generated spontaneously from a cohesive and economical protein structure.

A. Summary of Green Fluorescent Protein Structure Determination

Data were collected at room temperature in-house using either Molecular Structure Corp. R-axis or San Diego Multiwire Systems (SDMS) detectors (Cu Kα), and later at beamline X4A at the Brookhaven National Laboratory at the selenium absorption edge (λ = 0.979 Å) using image plates. Data were evaluated using the HKL package (Z. Otwinowski, in Proceedings of the CCP4 Study Weekend: Data Collection and Processing, L. Sawyer, N. Isaacs, S. Bailey, eds. (Science and Engineering Research Council (SERC), Daresbury Laboratory, Warrington, UK, (1991)), pages 56-62; W. Minor, XDISPLAYF (Purdue University, West Lafayette, IN, (1993))) or the SDMS software (A.J. Howard, et al., Meth. Enzymol. 114: 452-471 (1985)). Each data set was collected from a single crystal. Heavy atom soaks were 2 mM in the mother liquor for 2 days. Initial maps were based on three heavy atom derivatives using in-house data, which were later replaced with synchrotron data. The EMTS difference Patterson map was solved by inspection and then used to calculate difference Fourier maps of the other derivatives. Lack-of-closure refinement of the heavy atom parameters was performed using the Protein package (W. Steigemann, Ph.D. Thesis (Technical University, Munich, (1974))). The MIR maps were much poorer than the overall figure of merit would suggest, and it was clear that the EMTS isomorphous differences dominated the phasing. The enhanced anomalous occupancy for the synchrotron data provided a partial solution to the problem. Note that the phasing power was reduced for the synchrotron data, but the figure of merit was unchanged. All experimental electron density maps were improved by solvent flattening using the DM program of the CCP4 package (CCP4: A Suite of Programs for Protein Crystallography (SERC Daresbury Laboratory, Warrington WA4 4AD UK, (1979))), assuming a solvent content of 38 percent. Phase combination was performed with PHASC02 of the Protein package using a weight of 1.0 on the atomic model. The heavy atom parameters were subsequently improved by refinement against the combined phases.
Model building was carried out with FRODO and O (T.A. Jones, et al., Acta Crystallogr. Sect. A 47: 110 (1991); T.A. Jones, in Computational Crystallography, D. Sayre, Ed. (Oxford University Press, Oxford, 1982), pages 303-317), and crystallographic refinement was performed with the TNT package (D.E. Tronrud, et al., Acta Cryst. A 43: 489-503 (1987)). Bond lengths and angles for the chromophore were calculated using CHEM3D (Cambridge Scientific Computing). Final refinement and model building were performed against the X4A selenomethionine data set, using (2Fo-Fc) electron density maps. Data beyond 1.9 Å resolution have not been used at this stage. The final model contains residues 2-229, because the terminal residues are not visible in the electron density map, and the side chains of several disordered surface residues have been omitted. The density is weak for residues 156-158, and the coordinates for these residues are unreliable. This disorder is consistent with previous analyses showing that residues 1 and 233-238 are dispensable but that further truncations may prevent fluorescence (J. Dopf and T.M. Horiagon, Gene 173: 39-43 (1996)). The atomic model has been deposited in the Protein Data Bank (access code 1EMA).
TABLE E

Diffraction Data Statistics

Crystal        Resolution (Å)   Total obs   Unique obs   Compl. (%)(a)   Compl. (shell)(b)   Rmerge (%)(c)   Riso (%)(d)
R-axis II
  Native       2.0              51907       13582        80              69                  4.1             5.8
  EMTS(e)      2.6              17727       6787         87              87                  5.7             20.6
  SeMet        2.3              44975       10292        92              88                  10.2            9.3
Multiwire
  HgI4-Se      3.0              15380       4332         84              79                  7.2             28.87
X4a
  SeMet        1.8              126078      19503        80              55                  9.3             9.4
  EMTS         2.3              57812       9204         82              66                  7.2             26.3

Phasing Statistics

Derivative     Resolution (Å)   Number of sites   Phasing power(f)   Phasing power (shell)   FOM(g)   FOM (shell)
In-house
  EMTS         3.0              2                 2.08               2.08                    0.77     .072
  SeMet        3.0              4                 1.66               1.28                    -        -
  HgI4-Se      3.0              9                 1.77               1.90                    -        -
X4a
  EMTS         3.0              2                 1.36               1.26                    0.77     .072
  SeMet        3.0              4                 1.31               1.08                    -        -

Atomic Model Statistics

Protein atoms                     1790
Solvent atoms                     94
Resolution range (Å)              20-1.9
Number of reflections (F > 0)     17676
Completeness                      84
R factor(h)                       0.175
Mean B value (Å²)                 24.1
Deviations from ideality:
  Bond lengths (Å)                0.014
  Bond angles (°)                 1.9
  Restrained B values (Å²)        4.3
Ramachandran outliers             0

Notes:
(a) Completeness is the ratio of observed reflections to those theoretically possible, expressed as a percentage.
(b) Shell indicates the highest resolution shell, typically 0.1-0.4 Å wide.
(c) Rmerge = Σ|I - ⟨I⟩| / ΣI, where ⟨I⟩ is the mean of the individual observations of the intensities I.
(d) Riso = Σ|I_DER - I_NAT| / ΣI_NAT.
(e) Derivatives were EMTS = ethylmercurithiosalicylate (residues Cys48 and Cys70 modified); SeMet = selenomethionine-substituted protein (Met1 and Met233 could not be located); HgI4-SeMet = double derivative, HgI4 on a SeMet background.
(f) Phasing power = ⟨FH⟩ / ⟨E⟩, where ⟨FH⟩ = r.m.s. heavy atom scattering and ⟨E⟩ = lack of closure.
(g) FOM, mean figure of merit.
(h) Standard crystallographic R factor, R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|.

B. Spectral properties of the Thr203 mutants ("T203"), compared to S65T

The F64L, V68L and S72A mutations improved the folding of the green fluorescent protein at 37°C (B.P. Cormack, et al., Gene 173: 33 (1996)), but did not significantly change the emission spectra.
TABLE F

Clone   Mutations                              Excitation max. (nm)   Extinction coefficient (10³ M⁻¹ cm⁻¹)   Emission max. (nm)
S65T    S65T                                   489                    39.2                                     511
5B      T203H / S65T                           512                    19.4                                     524
6C      T203Y / S65T                           513                    14.5                                     525
10B     T203Y / F64L / S65G / S72A             513                    30.8                                     525
10C     T203Y / S65G / V68L / S72A             513                    36.5                                     527
11      T203W / S65G / S72A                    502                    33.0                                     512
12H     T203Y / S65G / S72A                    513                    36.5                                     527
20A     T203Y / S65G / V68L / Q69K / S72A      515                    46.0                                     527
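The spectral data in Table F lend themselves to two quick derived quantities: the Stokes shift of each clone (emission maximum minus excitation maximum) and how far each emission maximum has moved relative to S65T. The sketch below is illustrative only; the dictionary `TABLE_F` simply transcribes the excitation and emission maxima from the table, and the helper names are assumptions introduced here.

```python
# (excitation max, emission max) in nm, transcribed from Table F, keyed by clone.
TABLE_F = {
    "S65T": (489, 511),
    "5B": (512, 524),
    "6C": (513, 525),
    "10B": (513, 525),
    "10C": (513, 527),
    "11": (502, 512),
    "12H": (513, 527),
    "20A": (515, 527),
}


def stokes_shift_nm(clone: str) -> int:
    """Emission maximum minus excitation maximum for one clone, in nm."""
    excitation, emission = TABLE_F[clone]
    return emission - excitation


def emission_shift_nm(clone: str, reference: str = "S65T") -> int:
    """How far the emission maximum of a clone lies to the red of the reference."""
    return TABLE_F[clone][1] - TABLE_F[reference][1]


for clone in TABLE_F:
    print(f"{clone}: Stokes shift {stokes_shift_nm(clone)} nm, "
          f"emission shifted {emission_shift_nm(clone)} nm relative to S65T")
```

Laid out this way, the table shows that the T203 mutants gain their longer emission wavelengths mainly by moving the excitation maximum to the red, so their Stokes shifts are smaller than that of S65T.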
The present invention provides novel long wavelength engineered fluorescent proteins. Although specific examples have been provided, the above description is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The scope of the invention should therefore be determined not with reference to the above description, but with reference to the appended claims along with their full scope of equivalents. All publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes, to the same extent as if each individual publication or patent document were so individually denoted.
SEQUENCE LISTING

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 716 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME / KEY: CDS (B) LOCATION: 1..714
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG AGT AAA GGA GAA GAA CTT TTC ACT GCA GTT GTC CCA ATT CTT GTT 48
Met Ser Lys Gly Glu Glu Leu Phe Thr Ala Val Val Pro Ile Leu Val
1 5 10 15
GAA TTA GAT GAT GTAT AAT GGG CAC AAA TTT TCT GTC AGT GGA GAG 96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30
GGT GAA GGT GAT GTA ACA TAC GGA AAA CTT ACC CTT AAA TTT ATT TGC 144
Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
ACT ACT GGA AAA CTA CCT GTT CCA TGG CCA ACA CTT GTC ACT ACT TTC 192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60
TCT TAT GGT GTT CAA TGC TTT TCA AGA TAC CCA GAT CAT ATG AAA CGG 240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80
CAT GAC TTT TTC AAG AGT GCC ATG CCC GAA GGT TAT GTA CAG CA AGA 288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Gln Arg
85 90 95
ACT ATA TTT TTC AAA GAT GAC GGG AAC TAC AAG ACA CGT GCT GAA GTC 336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110
AAG TTT GAA GGT GAT ACC CTT GTT AAT AGA ATC GAG TTA AAA GGT ATT 384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
GAT TTT AAA GAA GAT GGA AAC ATT CTT GGA CAT AAA TTG GAA TAC AAC 432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
TAT AAC TCA CAC AAT GTA TAC ATC ATG GCA GAC AAA CA AAG AAT GGA 480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
ATC AAA GTT AAC TTC AAA ATT AGA CAC AAC ATT GAA GAT GGA AGC GTT 528
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175
CAA CTA GCA GAC TAT TAT CAA CAA AAT ACT CCA ATT CTC GAT GGC CCT 576
Gln Leu Ala Asp Tyr Tyr Gln Gln Asn Thr Pro Ile Leu Asp Gly Pro
180 185 190
GTC CTT TTA CCA GAC AAC CAT TAC CTG TCC ACA CAA TCT GCC CTT TCG 624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205
AAA GAT CCC AAC GAA AAG AGA GAC CAC ATG GTC CTT CTT GAG TTT GTA 672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
ACA GCT GCT GGG ATT ACA CAT GGC ATG GAT GAA CTA TAC AAA 714
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235
TA 716
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 238 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Lys Gly Glu Glu Leu Phe Thr Ala Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30
Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Gln Arg
85 90 95
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp Tyr Tyr Gln Gln Asn Thr Pro Ile Leu Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 720 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME / KEY: CDS (B) LOCATION: 1..720
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATG GTG AGC AAG GGC GAG GG CTG TTC ACC GGG GTG GTG CCC ATC CTG 48
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
240 245 250
GTC GAG CTG GAC GGC GAC GAC AAC GGC CAC AAG TTC AGC GTG TCC GGC 96
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
255 260 265 270
GAG GGC GAG GGC GAT GCC ACC TAC GGC AAG CTG ACC CTG AAG TTC ATC 144
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
275 280 285
TGC ACC ACC GGC AAG CTG CCC GTG CCC TGG CCC ACC CTC GTG ACC ACC 192
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
290 295 300
TTC GGC TAC GGC GTG CAG TGC TTC GCC CGC TAC CCC GAC CAC ATG AAG 240
Phe Gly Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
305 310 315
CAG CAG GAC TTC TTC AAG TCC GCC ATG CCC GAA GGC TAC GTC CAG GAG 288
Gln Gln Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
320 325 330
CGC ACC ATC TTC TAG AAG GAC GAC GGC AAC TAC AAG ACC CGC GCC GAG 336
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
335 340 345 350
GTG AAG TTC GAG GGC GAC ACC CTG GTG AAC CGC ATC GAG CTG AAG GGC 384
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
355 360 365
ATC GAC TTC AAG GAC GAC GGC AAC ATC CTG GGG CAC AAG CTG GAG TAC 432
Ile Asp Phe Lys Asp Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
370 375 380
AAC TAC AAC AGC CAC AAC GTC TAT ATC ATG GCC GAC AAG CAG AAG AAC 480
Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
385 390 395
GGC ATC AAG GTG AAC TTC AAG ATC CGC CAC AAC ATC GAG GAC GGC AGC 528
Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
400 405 410
GTG CAG CCC GCC GAC CAC TAC CAG CAG AAC ACC CCC ATC GGC GAC GGC 576
Val Gln Pro Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
415 420 425 430
CCC GTG CTG CTG CCC GAC AAC CAC TAC CTG AGC TAC CAG TCC GCC CTG 624
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu
435 440 445
AGC AAA GAC CCC AAC GAG AAG CGC GAT CAC ATG GTC CTG CTG GAG TTC 672
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
450 455 460
GTG ACC GCC GCC GGG ATC ACT CAC GGC ATG GAC GAG CTG TAC AAG TAA 720
Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys *
465 470 475
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 240 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60
Phe Gly Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
65 70 75 80
Gln Gln Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125
Ile Asp Phe Lys Asp Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140
Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160
Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175
Val Gln Pro Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu
195 200 205
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220
Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys *
225 230 235 240
Claims (100)
CLAIMS
- 1. A nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 2. The nucleic acid molecule of claim 1, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 3. The nucleic acid molecule of claim 1, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 4. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 5. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a mutation of Table A.
- 6. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a folding mutation.
- 7. The nucleic acid molecule of any of claims 1 to 3, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the replacement of at least one codon with a preferred mammalian codon.
- 8. The nucleic acid molecule of any of claims 1-3, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
- 9. An expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the amino acid substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 10. The expression vector of claim 9, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 11. The expression vector of claim 9, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 12. The expression vector of claim 10 or 11, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 13. The expression vector of claim 10 or 11, wherein the amino acid sequence comprises a mutation of Table A.
- 14. The expression vector of claim 9 or 10, wherein the amino acid sequence further comprises a folding mutation.
- 15. The expression vector of any of claims 9-11, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the replacement of at least one codon with a preferred mammalian codon.
- 16. The expression vector of any of claims 9 to 11, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
- 17. A recombinant host cell comprising an expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the amino acid substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 18. The recombinant host cell of claim 17, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 19. The recombinant host cell of claim 17, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 20. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 21. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a mutation of Table A.
- 22. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a folding mutation.
- 23. The recombinant host cell of any of claims 17-19, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the replacement of at least one codon with a preferred mammalian codon.
- 24. The recombinant host cell of any of claims 17-19, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
- 25. The recombinant host cell of any of claims 17-19, which is a prokaryotic cell.
- 26. The recombinant host cell of any of claims 17-19, which is a eukaryotic cell.
- 27. A functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 28. The protein of claim 27, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 29. The protein of claim 27, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 30. The protein of claim 27 or 28, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 31. The protein of claim 27 or 28, wherein the amino acid sequence further comprises a folding mutation.
- 32. The protein of any of claims 27-29, which is a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
- 33. A fluorescently labeled antibody comprising an antibody coupled to a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 34. The fluorescently labeled antibody of claim 33, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 35. The fluorescently labeled antibody of claim 33, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 36. The fluorescently labeled antibody of claim 33 or 34, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 37. The fluorescently labeled antibody of any one of claims 33-35, which is a fusion protein, wherein the fusion protein comprises the antibody fused to the functional engineered fluorescent protein.
- 38. A nucleic acid molecule comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of the green fluorescent protein of Aequorea.
- 39. The nucleic acid molecule of claim 38, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 40. The nucleic acid molecule of claim 38, wherein the amino acid sequence differs by no more than the substitutions S65T / T203H; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203.
- 41. The nucleic acid molecule of claim 38 or 39, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66.
- 42. A fluorescently labeled nucleic acid probe, comprising a nucleic acid probe coupled to a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the T203X substitution, where X is an aromatic amino acid selected from H, Y, or F, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 43. The fluorescently labeled nucleic acid probe of claim 42, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 44. The fluorescently labeled nucleic acid probe of claim 42, wherein the amino acid sequence differs by no more than the S65T / T203H substitutions; S65T / T203Y; S72A / F64L / S65G / T203Y; S72A / S65G / V68L / T203Y; S65G / V68L / Q69K / S72A / T203Y; S65G / S72A / T203Y; or S65G / S72A / T203W.
- 45. The fluorescently labeled nucleic acid probe of claim 42 or 43, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
- 46. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 47. The nucleic acid molecule of claim 46, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, and Y; Y145X, where X is selected from, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, and Y.
- 48. An expression vector, comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 49. The expression vector of claim 48, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, and Y.
- 50. A recombinant host cell, comprising an expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 51. The recombinant host cell of claim 50, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, and Y; Y145X, where X is selected from, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
- 52. A functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 53. The functional fluorescent protein, engineered product, of claim 52, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, and Y; Y145X, where X is selected from, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
- 54. A fluorescently labeled antibody, comprising an antibody coupled to a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 55. The antibody of claim 54, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
- 56. A nucleic acid molecule, comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 57. The nucleic acid molecule of claim 56, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, and Y.
- 58. A fluorescently labeled nucleic acid probe, comprising a nucleic acid probe coupled to a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution in L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (E222G), or V224, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 59. The fluorescently labeled nucleic acid probe of claim 58, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, and Y.
- 60. A method for determining whether a mixture contains a target, comprising: contacting the mixture with a fluorescently labeled probe, comprising a probe and a functional fluorescent protein, engineered product, of claim 27 or claim 52; and determining whether the target has bound to the probe.
- 61. The method of claim 60, wherein the target is linked to a solid matrix.
- 62. A method for engineering a functional fluorescent protein, engineered product, that has a fluorescent property different from the green fluorescent protein of Aequorea, which comprises replacing an amino acid that is located no more than 0.5 nm from any atom in the chromophore of an Aequorea-related green fluorescent protein with another amino acid; whereby the substitution alters a fluorescent property of the protein.
- 63. The method of claim 62, wherein the amino acid substitution alters the electronic environment of the chromophore.
- 64. A method for engineering a functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea, which comprises substituting amino acids in a loop domain of an Aequorea-related green fluorescent protein with amino acids so as to create a consensus sequence for phosphorylation or proteolysis.
- 65. A method for producing fluorescence resonance energy transfer, comprising: providing a donor molecule comprising a functional fluorescent protein, engineered product, of claim 27 or claim 52; providing an appropriate acceptor molecule for the fluorescent protein; and placing the donor molecule and the acceptor molecule in sufficiently close contact to allow fluorescence resonance energy transfer.
- 66. A method for producing fluorescence resonance energy transfer, comprising: providing an acceptor molecule comprising a functional fluorescent protein, engineered product, of claim 27 or claim 52; providing a suitable donor molecule for the fluorescent protein; and placing the donor molecule and the acceptor molecule in sufficiently close contact to allow fluorescence resonance energy transfer.
- 67. The method of claim 66, wherein the donor molecule is an engineered fluorescent protein whose amino acid sequence comprises the T203I substitution and the acceptor molecule is a mutant fluorescent protein whose amino acid sequence comprises the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional fluorescent protein, engineered product, having a fluorescent property different from the green fluorescent protein of Aequorea.
- 68. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, where the substitution alters the electronic environment of the chromophore, whereby the functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 69. An expression vector, comprising expression control sequences operably linked to a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, where the substitution alters the electronic environment of the chromophore, whereby the functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 70. A functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, where the substitution alters the electronic environment of the chromophore, whereby the functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 71. A crystal of a protein, comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2, wherein said crystal diffracts with at least a resolution of 2.0 to 3.0 Angstroms.
- 72. The crystal of claim 71, wherein the fluorescent protein has at least 200 amino acids, a completeness value of at least 80%, and has a crystal stability within 0.5% of its unit cell dimensions.
- 73. The crystal of claim 71, wherein the amino acid sequence comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
- 74. The crystal of claim 71, wherein said crystal has the following unit cell dimensions in Angstroms: a = 51.8, b = 62.8 and c = 70.7, with a space group of P 2 2 2 and an angle α of 90.00°, an angle β of 90.00°, and an angle γ of 90.00°, and the crystal has a diffraction limit where 90% or more of the potential reflections can be used to determine the coordinates of the atoms.
- 75. A computational method of designing a fluorescent protein, comprising: determining, from a three-dimensional model of a crystallized protein comprising a fluorescent protein with a bound ligand, at least one interacting amino acid of the fluorescent protein that interacts with at least one first chemical moiety of the ligand; and selecting at least one chemical modification of the first chemical moiety to produce a second chemical moiety with a structure that either reduces or increases an interaction between the interacting amino acid and the second chemical moiety, compared to the interaction between the interacting amino acid and the first chemical moiety.
- 76. The computational method of claim 75, further comprising generating the three-dimensional model of the crystallized protein comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2.
- 77. The computational method of claim 75, wherein the selecting selects a first chemical moiety that interacts with at least one of the amino acids listed in Figures 5-1 through 5-28.
- 78. The computational method of claim 75, wherein the chemical modification improves the hydrogen bonding interaction, the charge interaction, the hydrophobic interaction, the van der Waals interaction or the dipole interaction between the second chemical moiety and the interacting amino acid, as compared to that between the first chemical moiety and the interacting amino acid.
- 79. A computational method of modeling the three-dimensional structure of a fluorescent protein, comprising determining a three-dimensional relationship between at least two atoms listed in the atomic coordinates of Figures 5-1 through 5-28.
- 80. The computational method of claim 79, wherein the determination comprises determining the three-dimensional structure of a fluorescent protein with an amino acid sequence at least 80% identical to SEQ ID NO: 2.
- 81. The computational method of claim 79, wherein the determination comprises determining the three-dimensional structure of a fluorescent protein with an amino acid sequence at least 95% identical to SEQ ID NO: 2.
- 82. The computational method of claim 79, wherein the determination comprises determining the three-dimensional relationship of at least 1,500 atoms listed in Figures 5-1 through 5-28.
- 83. A device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 to 5-28.
- 84. The device of claim 83, wherein the storage device is a device capable of being read by a computer that stores code that receives as input the atomic coordinates.
- 85. The device of claim 84, wherein the device capable of being read by a computer is a floppy disk or a hard disk.
- 86. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one substitution in Q69, where said functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 87. The nucleic acid molecule of claim 86, wherein said substitution in Q69 is selected from the group of K, R, E and G.
- 88. The nucleic acid molecule of claim 86, wherein said amino acid sequence further comprises a functional mutation at S65.
- 89. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one substitution in E222, but not including E222G, where said functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 90. The nucleic acid molecule of claim 89, wherein said substitution at E222 is selected from the group of N and Q.
- 91. The nucleic acid molecule of claim 89, wherein said amino acid sequence further comprises a functional mutation at F64.
- 92. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional fluorescent protein, engineered product, whose amino acid sequence is substantially identical to the amino acid sequence of the green fluorescent protein of Aequorea (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 in at least one substitution at Y145, where said functional fluorescent protein, engineered product, has a fluorescent property different from the green fluorescent protein of Aequorea.
- 93. The nucleic acid molecule of claim 92, wherein said substitution at Y145 is selected from the group of W, C, F, L, E, H, K and Q.
- 94. The nucleic acid molecule of claim 92, wherein said amino acid sequence further comprises a functional mutation at Y66.
- 95. A method of identifying a test chemical, comprising: contacting a test chemical with a sample containing a biological entity labeled with a functional fluorescent protein, engineered product, or a polynucleotide that encodes said functional fluorescent protein, engineered product; and detecting fluorescence of said functional fluorescent protein, engineered product.
- 96. The method of claim 95, wherein said fluorescence in the presence of a test chemical is greater than in the absence of said test chemical.
- 97. The method of claim 96, wherein said polynucleotide encoding said functional fluorescent protein, engineered product, is operably linked to a genomic polynucleotide.
- 98. The method of claim 95, wherein said functional fluorescent protein, engineered product, is fused to a second functional protein.
- 99. The method of claim 96, wherein said polynucleotide encoding said functional fluorescent protein, engineered product, is operably linked to a response element.
- 100. The method of claim 96, wherein said polynucleotide encoding said functionally engineered fluorescent protein is operably linked to a response element in a mammalian cell.
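The distance criterion recited in claims 62 and 68-70 (an amino acid located no more than about 0.5 nm from an atom of the chromophore) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the method of the patent: the function names, the residue labels and the coordinates are hypothetical, and the coordinates are invented rather than taken from Figures 5-1 through 5-28. Coordinates are assumed to be in Angstroms, as is usual for crystallographic models.

```python
import math

# Distance limit recited in claims 62 and 68-70, in nanometres.
CUTOFF_NM = 0.5


def distance_nm(a, b):
    """Distance between two (x, y, z) points given in Angstroms, returned in nm."""
    return math.dist(a, b) / 10.0  # 1 nm = 10 Angstroms


def residues_near_chromophore(atoms, chromophore_atoms, cutoff_nm=CUTOFF_NM):
    """Return sorted labels of residues with any atom within the cutoff.

    atoms: iterable of (residue_label, (x, y, z)), coordinates in Angstroms.
    chromophore_atoms: iterable of (x, y, z), coordinates in Angstroms.
    """
    near = set()
    for label, xyz in atoms:
        if any(distance_nm(xyz, c) <= cutoff_nm for c in chromophore_atoms):
            near.add(label)
    return sorted(near)


# Invented example data (hypothetical labels and coordinates).
chromophore = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
atoms = [
    ("res_A", (3.0, 0.0, 0.0)),   # 1.6 Angstroms from the second chromophore atom
    ("res_B", (0.0, 4.9, 0.0)),   # 4.9 Angstroms from the first chromophore atom
    ("res_C", (0.0, 0.0, 25.0)),  # far from both chromophore atoms
]

print(residues_near_chromophore(atoms, chromophore))
```

With real data, the atom list would be read from an atomic coordinate table such as the one the claims refer to; the cutoff test itself is unchanged.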
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US60/024,050 | 1996-08-16 | ||
US08706408 | 1996-08-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA98002972A (en) | 2000-06-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU727088B2 (en) | Long wavelength engineered fluorescent proteins | |
US8263412B2 (en) | Long wavelength engineered fluorescent proteins | |
US6608189B1 (en) | Fluorescent protein sensors for measuring the pH of a biological sample | |
US6469154B1 (en) | Fluorescent protein indicators | |
US20030212265A1 (en) | Fluorescent protein sensors for measuring the pH of a biological sample | |
WO2002068605A2 (en) | Non-oligomerizing tandem fluorescent proteins | |
WO2000071565A9 (en) | Fluorescent protein indicators | |
US6699687B1 (en) | Circularly permuted fluorescent protein indicators | |
AU767375B2 (en) | Long wavelength engineered fluorescent proteins | |
MXPA98002972A (en) | Long wavelength engineered fluorescent proteins | |
AU2004200425A1 (en) | Long wavelength engineered fluorescent proteins |