Long wavelength engineered fluorescent proteins

Info

Publication number
MXPA98002972A
Authority
MX
Mexico
Prior art keywords
amino acid
fluorescent protein
acid sequence
substitution
protein
Application number
MXPA/A/1998/002972A
Other languages
Spanish (es)
Inventor
Y Tsien Roger
B Cubitt Andrew
Heim Roger
F Horma Mats
S Remington James
Original Assignee
Aurora Biosciences
The Regents Of The University Of California
The University Of Oregon
Application filed by Aurora Biosciences, The Regents Of The University Of California, The University Of Oregon

Abstract

Engineered fluorescent proteins, nucleic acids encoding them and methods of use.

Description

LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS

BACKGROUND OF THE INVENTION

This application claims the benefit of the earlier filing date of United States provisional patent application serial number 60/024,050, filed on August 16, 1996, entitled "Long Wavelength Mutant Fluorescent Proteins", and of patent application serial number 08/706,408, filed on August 30, 1996, entitled "Long Wavelength Engineered Fluorescent Proteins", both incorporated herein by reference.

This invention was made in part with United States government support under grant number MCB 9418479, awarded by the National Science Foundation. The United States government may have rights in this invention.

Fluorescent molecules are attractive as reporter molecules in many assay systems because of their high sensitivity and ease of quantification. Recently, fluorescent proteins have been the focus of much attention because they can be produced in vivo by biological systems and can be used to trace intracellular events without having to be introduced into the cell through microinjection or permeabilization. The green fluorescent protein of Aequorea victoria is particularly interesting as a fluorescent protein. A cDNA for the protein has been cloned (D.C. Prasher et al., "Primary structure of the Aequorea victoria green-fluorescent protein", Gene (1992) 111: 229-33). Not only can the primary amino acid sequence of the protein be expressed from the cDNA, but the expressed protein can fluoresce. This indicates that the protein can undergo the cyclization and oxidation that are believed to be necessary for fluorescence.

The green fluorescent protein ("GFP") of Aequorea victoria is a stable, proteolysis-resistant single chain of 238 residues, and has two absorption maxima at around 395 and 475 nm.
The relative amplitudes of these two peaks are sensitive to environmental factors (W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press pp. 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward et al., Photochem. Photobiol. 35: 803-808 (1982)) and to illumination history (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995)), presumably reflecting two or more ground states. Excitation at the primary absorption peak of 395 nm yields an emission maximum at 508 nm with a quantum yield of 0.72-0.85 (O. Shimomura and F.H. Johnson, J. Cell. Comp. Physiol. 59: 223 (1962); Morin and J.W. Hastings, J. Cell. Physiol. 77: 313 (1971); H. Morise et al., Biochemistry 13: 2656 (1974); W.W. Ward, Photochem. Photobiol. Reviews (K.C. Smith, ed.) 4: 1 (1979); A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995); D.C. Prasher, Trends Genet. 11: 320-323 (1995); M. Chalfie, Photochem. Photobiol. 62: 651-656 (1995); W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press pp. 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward et al., Photochem. Photobiol. 35: 803-808 (1982)). The fluorophore results from the autocatalytic cyclization of the polypeptide backbone between residues Ser65 and Gly67 and the oxidation of the α-β bond of Tyr66 (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995); C.W. Cody et al., Biochemistry 32: 1212-1218 (1993); R. Heim et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994)). Mutation of Ser65 to Thr (S65T) simplifies the excitation spectrum to a single peak at 488 nm of enhanced amplitude (R. Heim et al., Nature 373: 664-665 (1995)), which no longer shows signs of conformational isomers (A.B. Cubitt et al., Trends Biochem. Sci. 20: 448-455 (1995)).

Fluorescent proteins have been used as markers of gene expression, as tracers of cell lineage, and as fusion tags to monitor protein localization within living cells (M. Chalfie et al., "Green fluorescent protein as a marker for gene expression", Science 263: 802-805; A.B. Cubitt et al., "Understanding, improving and using green fluorescent proteins", TIBS 20, November 1995, pages 448-455; U.S. Patent No. 5,491,084, M. Chalfie and D. Prasher). Furthermore, engineered versions of Aequorea green fluorescent protein have been identified that exhibit altered fluorescence characteristics, including altered excitation and emission maxima, as well as excitation and emission spectra of different shapes (R. Heim et al., "Wavelength mutations and posttranslational autoxidation of green fluorescent protein", Proc. Natl. Acad. Sci. USA (1994) 91: 12501-04; R. Heim et al., "Improved green fluorescence", Nature (1995) 373: 663-665). These properties add variety and utility to the arsenal of biologically based fluorescent indicators. There is a need for engineered fluorescent proteins with different fluorescent properties.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A-1B. (A) Schematic drawing of the backbone of the green fluorescent protein produced by Molscript (J.P. Kraulis, J. Appl. Cryst., 24: 946 (1991)). The chromophore is shown as a ball-and-stick model. (B) Schematic drawing of the overall fold of the green fluorescent protein. Approximate residue numbers mark the beginning and the end of the secondary structural elements.

Figures 2A-2C. (A) Stereo drawing of the chromophore and the residues in its immediate vicinity. Carbon atoms are drawn as open circles, oxygen is filled, and nitrogen is shaded. Solvent molecules are shown as isolated filled circles.
(B) Portion of the final 2Fo-Fc electron density map, contoured at 1.0 σ, showing the electron density surrounding the chromophore. (C) Schematic diagram showing the first and second coordination spheres of the chromophore. Hydrogen bonds are shown as dashed lines and have the indicated lengths in Å. Inset: proposed structure of the carbinolamine intermediate that presumably forms during generation of the chromophore.

Figure 3 illustrates the nucleotide sequence (SEQ ID NO: 1) and the deduced amino acid sequence (SEQ ID NO: 2) of an Aequorea green fluorescent protein.

Figure 4 illustrates the nucleotide sequence (SEQ ID NO: 3) and the deduced amino acid sequence (SEQ ID NO: 4) of the engineered Aequorea-related fluorescent protein S65G/S72A/T203Y, using preferred mammalian codons and an optimal Kozak sequence.

Figures 5-1 to 5-28 present the coordinates for the crystal structure of the Aequorea-related green fluorescent protein S65T.

Figure 6 shows the fluorescence excitation and emission spectra for the engineered fluorescent proteins 20A and 10C (Table F). The vertical line at 528 nm compares the emission maxima of 10C, to the left of the line, and 20A, to the right of the line.

SUMMARY OF THE INVENTION

This invention provides functional engineered fluorescent proteins with varied fluorescence characteristics that can be readily distinguished from the green and blue fluorescent proteins that currently exist. These engineered fluorescent proteins allow the simultaneous measurement of two or more processes within cells, and can be used as fluorescence energy donors or acceptors when used to monitor protein-protein interactions through FRET. Longer wavelength engineered fluorescent proteins are particularly useful because photodynamic toxicity and autofluorescence of cells are significantly reduced at longer wavelengths.
In particular, the introduction of the substitution T203X, where X is an aromatic amino acid, results in an increase in the excitation and emission wavelength maxima of Aequorea-related fluorescent proteins.

In one aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.

In one aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least a substitution at T203 and, in particular, T203X, where X is an aromatic amino acid selected from H, Y, W or F, the functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein. In one embodiment, the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I. In another embodiment, the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S65G/V68L/Q69K/S72A/T203Y; S72A/S65G/V68L/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W. In another embodiment, the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W. In another embodiment, the amino acid sequence further comprises a mutation from Table A. In another embodiment, the amino acid sequence further comprises a folding mutation. In another embodiment, the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the replacement of at least one codon with a preferred mammalian codon. In another embodiment, the nucleic acid molecule encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.

In another aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, the functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein. In one embodiment, the amino acid substitution is:
L42X, where X is selected from C, F, H, W and Y;
V61X, where X is selected from F, Y, H and C;
T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C;
V68X, where X is selected from F, Y and H;
Q69X, where X is selected from K, R, E and G;
Q94X, where X is selected from D, E, H, K and N;
N121X, where X is selected from F, H, W and Y;
Y145X, where X is selected from W, C, F, L, E, H, K and Q;
H148X, where X is selected from F, Y, N, K, Q and R;
V150X, where X is selected from F, Y and H;
F165X, where X is selected from H, Q, W and Y;
I167X, where X is selected from F, Y and H;
Q183X, where X is selected from H, Y, E and K;
N185X, where X is selected from D, E, H, K and Q;
L220X, where X is selected from H, N, Q and T;
E222X, where X is selected from N and Q; or
V224X, where X is selected from H, N, Q, T, F, W and Y.
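The substitution notation used above (original residue, position, replacement residue, with multiple substitutions separated by slashes) can be made concrete with a short sketch. The following Python is illustrative only and is not part of the disclosure: the function names and the five-residue toy sequence are assumptions chosen for the example, and the sketch does not use the sequence of SEQ ID NO: 2.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the residue being replaced
    position: int     # 1-based residue number
    replacement: str  # one-letter code of the new residue


# Matches tokens such as "S65T" or "T203Y" (hypothetical helper pattern).
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(text: str) -> list[Substitution]:
    """Parse a slash-separated list such as 'S65G / S72A / T203Y'."""
    parsed = []
    for token in text.split("/"):
        token = token.strip()
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        parsed.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return parsed


def apply_substitutions(sequence: str, subs: list[Substitution]) -> str:
    """Return a copy of the sequence with each substitution applied.

    Raises ValueError if the residue found at a position does not match
    the original residue named in the substitution.
    """
    residues = list(sequence)
    for sub in subs:
        found = residues[sub.position - 1]
        if found != sub.original:
            raise ValueError(
                f"expected {sub.original} at {sub.position}, found {found}"
            )
        residues[sub.position - 1] = sub.replacement
    return "".join(residues)


# Toy five-residue sequence, for illustration only.
mutant = apply_substitutions("MSTKY", parse_substitutions("S2G / T3Y"))
# mutant == "MGYKY"
```

Checking the original residue before replacing it guards against numbering mistakes, which matter when several substitutions are combined in one variant.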
In another aspect, this invention provides an expression vector comprising expression control sequences operably linked to any of the aforementioned nucleic acid molecules. In another aspect, this invention provides a recombinant host cell comprising the aforementioned expression vector.

In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.

In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least an amino acid substitution at T203 and, in particular, T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, the functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein. In one embodiment, the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I. In another embodiment, the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
In another embodiment, the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W. In another embodiment, the amino acid sequence further comprises a folding mutation. In another embodiment, the engineered fluorescent protein is part of a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.

In another aspect, this invention provides a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2), and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222, or V224, the functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.

In another aspect, this invention provides a fluorescently labeled antibody comprising an antibody coupled to any of the functional engineered fluorescent proteins mentioned above. In one embodiment, the fluorescently labeled antibody is a fusion protein, wherein the fusion protein comprises the antibody fused to the functional engineered fluorescent protein.

In another aspect, this invention provides a nucleic acid molecule comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent protein of this invention.

In another aspect, this invention provides a fluorescently labeled nucleic acid probe comprising a nucleic acid probe coupled to a functional engineered fluorescent protein of this invention. The fusion can be through a linker peptide.
In another aspect, this invention provides a method for determining whether a mixture contains a target, comprising contacting the mixture with a fluorescently labeled probe comprising a probe and a functional engineered fluorescent protein of this invention, and determining whether the target has bound to the probe. In one embodiment, the target molecule is captured on a solid matrix.

In another aspect, this invention provides a method for engineering a functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein, comprising substituting an amino acid that is located no more than 0.5 nm from any atom in the chromophore of an Aequorea-related green fluorescent protein with another amino acid, whereby the substitution alters a fluorescent property of the protein. In one embodiment, the amino acid substitution alters the electronic environment of the chromophore.

In another aspect, this invention provides a method for engineering a functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein, comprising substituting amino acids in a loop domain of an Aequorea-related green fluorescent protein with amino acids so as to create a consensus sequence for phosphorylation or for proteolysis.

In another aspect, this invention provides a method for producing fluorescence resonance energy transfer, comprising providing a donor molecule comprising a functional engineered fluorescent protein of this invention; providing an appropriate acceptor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer.
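The requirement that donor and acceptor be in "sufficiently close contact" can be illustrated with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50 percent efficient. This relation is general background and is not stated in this document; the function name and the 5 nm value used for R0 below are assumptions chosen only for illustration.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency from the standard Förster relation.

    distance_nm: donor-acceptor separation r, in nanometers.
    forster_radius_nm: Förster radius R0 for the pair, in nanometers.
    """
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    if forster_radius_nm <= 0:
        raise ValueError("Förster radius must be positive")
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


# Hypothetical R0 of 5 nm; real values depend on the specific
# donor-acceptor pair and must be measured or calculated separately.
R0_NM = 5.0
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, R0_NM):.3f}")
```

Because efficiency falls with the sixth power of distance, transfer is nearly complete at half of R0 and nearly absent at twice R0, which is why FRET is useful as a reporter of molecular proximity.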
In another aspect, this invention provides a method for producing fluorescence resonance energy transfer, comprising providing an acceptor molecule comprising a functional engineered fluorescent protein of this invention; providing an appropriate donor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer. In one embodiment, the donor molecule is an engineered fluorescent protein whose amino acid sequence comprises the substitution T203I, and the acceptor molecule is an engineered fluorescent protein whose amino acid sequence comprises the substitution T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, the functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.

In another aspect, this invention provides a crystal of a protein comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2, wherein the crystal diffracts with at least a 2.0 to 3.0 angstrom resolution.

In another embodiment, this invention provides a computational method of designing a fluorescent protein, comprising determining, from a three-dimensional model of a crystallized fluorescent protein comprising a fluorescent protein with a bound ligand, at least one interacting amino acid of the fluorescent protein that interacts with at least one first chemical moiety, and modifying the first chemical moiety to produce a second chemical moiety with a structure that either decreases or increases an interaction between the interacting amino acid and the second chemical moiety, compared with the interaction between the interacting amino acid and the first chemical moiety.
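The proximity criterion used in this summary (a substitution located no more than about 0.5 nm from any atom of the chromophore) amounts to a distance check over atomic coordinates. The sketch below is a minimal illustration under stated assumptions: coordinates are taken to be in angstroms, as is usual for crystallographic coordinate listings, the function name is hypothetical, and the coordinates are invented for the example rather than taken from Figures 5-1 through 5-28.

```python
import math

CUTOFF_NM = 0.5  # proximity criterion from the text, in nanometers


def residues_near_chromophore(chromophore_atoms, other_atoms,
                              cutoff_nm=CUTOFF_NM):
    """Return sorted labels of residues with any atom within the cutoff.

    chromophore_atoms: list of (x, y, z) tuples in angstroms.
    other_atoms: list of (residue_label, (x, y, z)) pairs in angstroms.
    """
    cutoff_angstrom = cutoff_nm * 10.0  # 1 nm = 10 angstroms
    near = set()
    for label, xyz in other_atoms:
        if any(math.dist(xyz, atom) <= cutoff_angstrom
               for atom in chromophore_atoms):
            near.add(label)
    return sorted(near)


# Invented coordinates, for illustration only.
chromophore = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
protein_atoms = [
    ("T203", (3.0, 0.0, 0.0)),   # 3.0 angstroms from the first atom
    ("T203", (9.0, 0.0, 0.0)),   # farther atom of the same residue
    ("E222", (0.0, 4.9, 0.0)),   # 4.9 angstroms, just inside the cutoff
    ("K238", (20.0, 0.0, 0.0)),  # well outside the cutoff
]
candidates = residues_near_chromophore(chromophore, protein_atoms)
# candidates == ["E222", "T203"]
```

A residue is reported if any one of its atoms falls within the cutoff of any chromophore atom, which matches the "any atom in the chromophore" wording of the design method.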
In another embodiment, this invention provides a computational method for modeling the three-dimensional structure of a fluorescent protein, comprising determining a three-dimensional relationship between at least two atoms listed in the atomic coordinates of Figures 5-1 through 5-28.

In another embodiment, this invention provides a device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 through 5-28. In one embodiment, the storage device is a computer-readable device that stores code that receives the atomic coordinates as input. In another embodiment, the computer-readable device is a floppy disk or a hard disk.

DETAILED DESCRIPTION OF THE INVENTION

I. DEFINITIONS

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described. For the purposes of the present invention, the following terms are defined below.

"Binding pair" refers to two moieties (for example, chemical or biochemical) that have an affinity for one another. Examples of binding pairs include antigens/antibodies, lectin/avidin, target polynucleotide/probe oligonucleotide, antibody/anti-antibody, receptor/ligand, enzyme/ligand and the like. "One member of a binding pair" refers to one moiety of the pair, such as an antigen or ligand.

"Nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form and, unless otherwise limited, encompasses known analogs of natural nucleotides that can function in a manner similar to naturally occurring nucleotides.
It will be understood that when a nucleic acid molecule is represented by a DNA sequence, this also includes RNA molecules having the corresponding RNA sequence in which "U" replaces "T".

"Recombinant nucleic acid molecule" refers to a nucleic acid molecule that does not occur naturally and that comprises two nucleotide sequences that are not naturally joined together. Recombinant nucleic acid molecules are produced by artificial recombination, for example, by genetic engineering techniques or chemical synthesis.

Reference to a nucleotide sequence "encoding" a polypeptide means that the sequence, upon transcription and translation of the mRNA, produces the polypeptide. This includes both the coding strand, whose nucleotide sequence is identical to the mRNA and whose sequence is usually provided in the sequence listing, and its complementary strand, which is used as the template for transcription. As any person skilled in the art recognizes, this also includes all degenerate nucleotide sequences that encode the same amino acid sequence. Nucleotide sequences encoding a polypeptide include sequences containing introns.

"Expression control sequences" refers to nucleotide sequences that regulate the expression of a nucleotide sequence to which they are operably linked. Expression control sequences are "operably linked" to a nucleotide sequence when the expression control sequences control and regulate the transcription and, as appropriate, the translation of the nucleotide sequence. Thus, expression control sequences can include appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signals for introns, maintenance of the correct reading frame of that gene to permit proper translation of the mRNA, and stop codons.

As used herein, "naturally occurring", as applied to an object, refers to the fact that the object can be found in nature.
For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature, and that has not been intentionally modified by man in the laboratory, is naturally occurring.

"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship that permits them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences, such as when the appropriate molecules (e.g., inducers and polymerases) are bound to the control or regulatory sequence(s).

"Control sequence" refers to polynucleotide sequences that are necessary to effect the expression of coding and non-coding sequences to which they are ligated. The nature of these control sequences differs depending on the host organism; in prokaryotes, these control sequences generally include a promoter, a ribosomal binding site, and a transcription termination sequence; in eukaryotes, these control sequences generally include promoters and a transcription termination sequence. The term "control sequences" is intended to include, at a minimum, components whose presence can influence expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

"Isolated polynucleotide" refers to a polynucleotide of genomic, cDNA, or synthetic origin, or some combination thereof, which by virtue of its origin (1) is not associated with the cell in which the "isolated polynucleotide" is found in nature, or (2) is operably linked to a polynucleotide to which it is not linked in nature.

"Polynucleotide" refers to a polymeric form of nucleotides of at least 10 bases in length, either ribonucleotides or deoxyribonucleotides, or a modified form of either type of nucleotide. The term includes single-stranded and double-stranded forms of DNA.

The term "probe" refers to a substance that specifically binds to another substance (a "target"). Probes include, for example, antibodies, nucleic acids, receptors and their ligands.

"Modulation" refers to the capacity to either enhance or inhibit a functional property of a biological activity or process (e.g., enzyme activity or receptor binding). This enhancement or inhibition may be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway, and/or may be manifest only in particular cell types.

The term "modulator" refers to a chemical (naturally occurring or non-naturally occurring), or an extract made from biological materials such as bacterial, plant, fungal, or animal (particularly mammalian) cells or tissues. Modulators can be evaluated for potential activity as inhibitors or activators (directly or indirectly) of a biological process or processes (e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, antineoplastic agents, cytotoxic agents, inhibitors of neoplastic transformation or cell proliferation, cell proliferation-promoting agents, and the like) by inclusion in the screening assays described herein. The activity of a modulator may be known, unknown or partially known.

The term "test chemical" refers to a chemical to be tested by one or more screening methods of the invention as a putative modulator. A test chemical is usually not known to bind to the target of interest. The term "control test chemical" refers to a chemical known to bind to the target (e.g., a known agonist, antagonist, partial agonist or inverse agonist). Usually, various predetermined concentrations of test chemicals are used for screening, such as 0.01 µM, 0.1 µM, 1.0 µM, and 10.0 µM.

The term "target" refers to a biochemical entity involved in a biological process. Targets are typically proteins that play a useful role in the physiology or biology of an organism. A therapeutic chemical binds to the target to alter or modulate its function.
As used herein, targets can include cell surface receptors, G proteins, kinases, ion channels, phospholipases and other proteins mentioned herein.

The term "label" refers to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include 32P, fluorescent dyes, fluorescent proteins, electron-dense reagents, enzymes (e.g., as commonly used in an enzyme-linked immunosorbent assay), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available. For example, the polypeptides of this invention can be made detectable, for example by incorporating a label into the polypeptide, and can be used to label antibodies specifically reactive with the polypeptide. A label often generates a measurable signal, such as radioactivity, fluorescent light or enzyme activity, which can be used to quantify the amount of bound label.

The term "nucleic acid probe" refers to a nucleic acid molecule that binds to a specific sequence or subsequence of another nucleic acid molecule. A probe is preferably a nucleic acid molecule that binds through complementary base pairing to the full sequence or to a subsequence of a target nucleic acid. It will be understood that probes may bind target sequences lacking complete complementarity with the probe sequence, depending on the stringency of the hybridization conditions. Probes are preferably directly labeled, as with isotopes, chromophores, lumiphores, chromogens or fluorescent proteins, or indirectly labeled, such as with biotin, to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the selected sequence or subsequence.

A "labeled nucleic acid probe" is a nucleic acid probe that is bound, either covalently, through a linker, or through ionic, van der Waals or hydrogen bonds, to a label, such that the presence of the probe can be detected by detecting the presence of the label bound to the probe.

The terms "polypeptide" and "protein" refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The term "recombinant protein" refers to a protein that is produced by the expression of a nucleotide sequence encoding the amino acid sequence of the protein from a recombinant DNA molecule.

The term "recombinant host cell" refers to a cell that comprises a recombinant nucleic acid molecule. Thus, for example, recombinant host cells can express genes that are not found within the native (non-recombinant) form of the cell.

The terms "isolated", "purified" or "biologically pure" refer to material that is substantially or essentially free of the components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein or nucleic acid molecule that is the predominant protein or nucleic acid species present in a preparation is substantially purified. Generally, an isolated protein or nucleic acid molecule will comprise more than 80 percent of all macromolecular species present in the preparation. Preferably, the protein is purified to represent more than 90 percent of all macromolecular species present. More preferably, the protein is purified to more than 95 percent, and most preferably the protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional techniques.
The term "naturally occurring", as applied to an object, refers to the fact that the object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature, and that has not been intentionally modified by man in the laboratory, is naturally occurring.

The term "antibody" refers to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof, which specifically binds and recognizes an analyte (antigen). The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Antibodies exist, for example, as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. This includes, for example, Fab' and F(ab)'2 fragments. The term "antibody", as used herein, also includes antibody fragments either produced by the modification of whole antibodies or synthesized de novo using recombinant DNA methodologies.

The term "immunoassay" refers to an assay that uses an antibody to specifically bind an analyte. The immunoassay is characterized by the use of the specific binding properties of a particular antibody to isolate, target, and/or quantify the analyte.

The term "identical", in the context of two nucleic acid or polypeptide sequences, refers to the residues in the two sequences that are the same when aligned for maximum correspondence. When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those skilled in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percent sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, for example, according to a known algorithm. See, for example, Meyers and Miller, Computer Applic. Biol. Sci., 4: 11-17 (1988); Smith and Waterman (1981) Adv. Appl. Math. 2: 482; Needleman and Wunsch (1970) J. Mol. Biol. 48: 443; Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85: 2444; Higgins and Sharp (1988) Gene, 73: 237-244 and Higgins and Sharp (1989) CABIOS 5: 151-153; Corpet, et al. (1988) Nucleic Acids Research 16, 10881-90; Huang, et al. (1992) Computer Applications in the Biosciences 8, 155-65, and Pearson, et al. (1994) Methods in Molecular Biology 24, 307-31. Alignment is also often performed by inspection and manual alignment.

"Conservatively modified variations" of a particular nucleic acid sequence refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or, where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For example, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine.
Therefore, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations", which are one species of "conservatively modified variations". Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine) can be modified by standard techniques to yield a functionally identical molecule. Accordingly, each "silent variation" of a nucleic acid that encodes a polypeptide is implicit in each described sequence. Furthermore, one of skill will recognize that individual substitutions, deletions, or additions that alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5 percent, more typically less than 1 percent) in an encoded sequence are "conservatively modified variations" where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitutions providing functionally similar amino acids are well known in the art. Each of the following six groups contains amino acids that are conservative substitutions for one another: 1) Alanine (A), Serine (S), Threonine (T); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). The term "complementary" means that one nucleic acid molecule has the sequence of the binding partner of another nucleic acid molecule. Thus, the sequence 5'-ATGC-3' is complementary to the sequence 5'-GCAT-3'.
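The scoring scheme and the complementarity example described above can be illustrated with a minimal Python sketch. It assumes two sequences that are already aligned, equal in length and free of gaps, and it assumes a score of 0.5 for a conservative substitution; the text only requires a value between zero and 1, so that number and all function names are illustrative choices, not part of the disclosure.

```python
# The six conservative-substitution groups listed in the text.
CONSERVATIVE_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]

# Assumption: partial score for a conservative substitution (must lie in (0, 1)).
CONSERVATIVE_SCORE = 0.5


def same_group(a: str, b: str) -> bool:
    """Return True if residues a and b belong to the same conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity(seq1: str, seq2: str, adjust: bool = False) -> float:
    """Percent identity of two pre-aligned, gap-free, equal-length sequences.

    With adjust=True, conservative substitutions count as partial matches,
    which raises the percent identity as described in the text.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            total += 1.0
        elif adjust and same_group(a, b):
            total += CONSERVATIVE_SCORE
    return 100.0 * total / len(seq1)


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


if __name__ == "__main__":
    # A/S is conservative (group 1); E/K is not.
    print(percent_identity("ASTDE", "SSTDK"))               # unadjusted
    print(percent_identity("ASTDE", "SSTDK", adjust=True))  # adjusted upward
    print(reverse_complement("ATGC"))
```

Real comparisons would first align the sequences with one of the algorithms cited above; this sketch deliberately leaves alignment out.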
An amino acid sequence or a nucleotide sequence is "substantially identical" or "substantially similar" to a reference sequence if the amino acid sequence or nucleotide sequence has at least 80 percent sequence identity with the reference sequence over a given comparison window. Thus, substantially similar sequences include those having, for example, at least 85 percent sequence identity, at least 90 percent sequence identity, at least 95 percent sequence identity, or at least 99 percent sequence identity. Of course, two sequences that are identical to one another are also substantially identical. A subject nucleotide sequence is "substantially complementary" to a reference nucleotide sequence if the complement of the subject nucleotide sequence is substantially identical to the reference nucleotide sequence. The term "stringent conditions" refers to the temperature and ionic conditions used in nucleic acid hybridization. Stringent conditions are sequence dependent and differ under different environmental parameters. Generally, stringent conditions are selected to be about 5 °C to 20 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50 percent of the target sequence hybridizes to a perfectly matched probe.
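As a small worked illustration of the two numerical criteria above, the following Python sketch computes the stringent hybridization window (5 °C to 20 °C below Tm) and applies the 80 percent identity threshold. The Tm of 65 °C is an arbitrary example value, and the function names are hypothetical.

```python
from typing import Tuple


def stringent_temperature_range(tm_celsius: float) -> Tuple[float, float]:
    """Return (lowest, highest) hybridization temperature in degrees C,
    taken as 20 and 5 degrees below the melting point Tm."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


def is_substantially_identical(identity_percent: float,
                               threshold_percent: float = 80.0) -> bool:
    """Apply the 'at least 80 percent sequence identity' criterion.

    The threshold can be raised to 85, 90, 95 or 99 for the narrower
    ranges mentioned in the text.
    """
    return identity_percent >= threshold_percent


if __name__ == "__main__":
    low, high = stringent_temperature_range(65.0)  # example Tm, assumed
    print(f"hybridize between {low} and {high} degrees C")
    print(is_substantially_identical(86.5))
    print(is_substantially_identical(86.5, threshold_percent=90.0))
```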
The term "allelic variants" refers to polymorphic forms of a gene at a particular genetic locus, as well as to cDNAs derived from mRNA transcripts of the genes and the polypeptides encoded by them. The term "preferred mammalian codon" refers to the subset of codons, from among the set of codons encoding an amino acid, that are most frequently used in proteins expressed in mammalian cells, as chosen from the following list:

Amino acid   Preferred codons for high-level mammalian expression
Gly          GGC, GGG
Glu          GAG
Asp          GAC
Val          GUG, GUC
Ala          GCC, GCU
Ser          AGC, UCC
Lys          AAG
Asn          AAC
Met          AUG
Ile          AUC
Thr          ACC
Trp          UGG
Cys          UGC
Tyr          UAU, UAC
Leu          CUG
Phe          UUC
Arg          CGC, AGG, AGA
Gln          CAG
His          CAC
Pro          CCC

Fluorescent molecules are useful in fluorescence resonance energy transfer ("FRET"). Fluorescence resonance energy transfer involves a donor molecule and an acceptor molecule. To optimize the efficiency and detectability of fluorescence resonance energy transfer between a donor and an acceptor molecule, several factors need to be balanced. The emission spectrum of the donor should overlap as much as possible with the excitation spectrum of the acceptor to maximize the overlap integral. In addition, the quantum yield of the donor moiety and the extinction coefficient of the acceptor should likewise be as high as possible to maximize R0, the distance at which the energy transfer efficiency is 50 percent. However, the excitation spectra of the donor and the acceptor should overlap as little as possible, so that a wavelength region can be found at which the donor can be excited efficiently without directly exciting the acceptor. Fluorescence arising from direct excitation of the acceptor is difficult to distinguish from fluorescence arising from fluorescence resonance energy transfer. Similarly, the emission spectra of the donor and the acceptor should overlap as little as possible, so that the two emissions can be clearly distinguished.
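To make the role of R0 concrete, the sketch below uses the standard Förster relation E = 1 / (1 + (r/R0)^6), which is not written out in the text but is consistent with its definition of R0 as the distance at which transfer efficiency is 50 percent. The R0 of 5 nm is an arbitrary example value, and the function names are hypothetical.

```python
def fret_efficiency(distance_nm: float, r0_nm: float) -> float:
    """Energy-transfer efficiency for a donor-acceptor separation.

    Uses the standard Foerster relation E = 1 / (1 + (r / R0)**6).
    """
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def max_distance_for_efficiency(efficiency: float, r0_nm: float) -> float:
    """Largest separation that still gives at least the requested efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 5.0  # nm, illustrative value only
    for target in (0.10, 0.50, 0.80):
        r = max_distance_for_efficiency(target, r0)
        print(f"E >= {target:.0%} requires r <= {r:.2f} nm")
```

Because efficiency falls off with the sixth power of distance, small changes in separation near R0 produce large changes in signal, which is what makes the donor and acceptor choice discussed above so consequential.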
A high fluorescence quantum yield of the acceptor moiety is desirable if the emission from the acceptor is to be measured either as the sole readout or as part of an emission ratio. One factor to consider when choosing the donor and acceptor pair is the efficiency of fluorescence resonance energy transfer between them. Preferably, the efficiency of fluorescence resonance energy transfer between the donor and the acceptor is at least 10 percent, more preferably at least 50 percent, and even more preferably at least 80 percent. The term "fluorescent property" refers to the molar extinction coefficient at an appropriate excitation wavelength, the fluorescence quantum efficiency, the shape of the excitation spectrum or emission spectrum, the excitation wavelength maximum and emission wavelength maximum, the ratio of excitation amplitudes at two different wavelengths, the ratio of emission amplitudes at two different wavelengths, the excited state lifetime, or the fluorescence anisotropy. A measurable difference in any one of these properties between wild-type Aequorea green fluorescent protein and the mutant form is useful. A measurable difference can be determined by determining the amount of any quantitative fluorescent property, for example, the amount of fluorescence at a particular wavelength, or the integral of fluorescence over the emission spectrum. Determining ratios of excitation amplitude or emission amplitude at two different wavelengths ("excitation amplitude ratioing" and "emission amplitude ratioing", respectively) is particularly advantageous because the ratioing process provides an internal reference and cancels out variations in the absolute brightness of the excitation source, the sensitivity of the detector, and light scattering or quenching by the sample.

II. LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS

A. Fluorescent Proteins

As used herein, the term "fluorescent protein" refers to any protein capable of fluorescence when excited by the appropriate electromagnetic ... This includes fluorescent proteins whose amino acid sequences are either naturally occurring or engineered (i.e., analogs or mutants). Many cnidarians use green fluorescent proteins ("GFPs") as energy-transfer acceptors in bioluminescence. A "green fluorescent protein", as used herein, is a protein that fluoresces green light. Similarly, "blue fluorescent proteins" fluoresce blue light and "red fluorescent proteins" fluoresce red light. Green fluorescent proteins have been isolated from the Pacific Northwest jellyfish, Aequorea victoria, from the sea pansy, Renilla reniformis, and from Phialidium gregarium. W.W. Ward et al., Photochem. Photobiol., 35: 803-808 (1982); L.D. Levine et al., Comp. Biochem. Physiol., 72B: 77-85 (1982). A variety of Aequorea-related fluorescent proteins having useful excitation and emission spectra have been engineered by modifying the amino acid sequence of a naturally occurring green fluorescent protein from Aequorea victoria. (D.C. Prasher et al., Gene, 111: 229-233 (1992); R. Heim et al., Proc. Natl. Acad. Sci., USA, 91: 12501-04 (1994); U.S. patent application 08/337,915, filed on November 10, 1994; International application PCT/US95/14692, filed on November 10, 1995). As used herein, a fluorescent protein is an "Aequorea-related fluorescent protein" if any contiguous sequence of 150 amino acids of the fluorescent protein has at least 85 percent sequence identity with an amino acid sequence, either contiguous or non-contiguous, of the 238 amino acid wild-type Aequorea green fluorescent protein of Figure 3 (SEQ ID NO: 2).
More preferably, a fluorescent protein is an Aequorea-related fluorescent protein if any contiguous sequence of 200 amino acids of the fluorescent protein has at least 95 percent sequence identity with an amino acid sequence, either contiguous or non-contiguous, of the wild-type Aequorea green fluorescent protein of Figure 3 (SEQ ID NO: 2). Similarly, the fluorescent protein can be related to Renilla or Phialidium wild-type fluorescent proteins using the same standards. Aequorea-related fluorescent proteins include, for example and without limitation, wild-type (native) Aequorea victoria green fluorescent protein (D.C. Prasher et al., "Primary structure of the Aequorea victoria green fluorescent protein", Gene, (1992) 111: 229-33), whose nucleotide sequence (SEQ ID NO: 1) and deduced amino acid sequence (SEQ ID NO: 2) are presented in Figure 3; allelic variants of this sequence, for example, Q80R, which has the glutamine residue at position 80 substituted with arginine (M. Chalfie et al., Science, (1994) 263: 802-805); the engineered Aequorea-related fluorescent proteins described herein, for example, in Table A or Table F; variants that include one or more folding mutations; and fragments of these proteins that are fluorescent, such as Aequorea green fluorescent protein from which the two amino-terminal amino acids have been removed. Several of these contain different aromatic amino acids within the central chromophore and fluoresce at a distinctly shorter wavelength than the wild-type species. For example, the engineered proteins P4 and P4-3 contain (in addition to other mutations) the substitution Y66H, whereas W2 and W7 contain (in addition to other mutations) Y66W. Other mutations, both close to the chromophore region of the protein and remote from it in the primary sequence, can affect the spectral properties of green fluorescent protein and are listed in the first part of the following table.
TABLE A

Clone      Mutation(s)                                     Excitation max (nm)  Emission max (nm)  Extinction coeff. (M-1 cm-1)  Quantum yield
Wild type  None                                            395 (475)            508                21,000 (7,150)                0.77
P4         Y66H                                            383                  447                13,500                        0.21
P4-3       Y66H, Y145F                                     381                  445                14,000                        0.38
W7         Y66W, N146I, M153T, V163A, N212K                433 (453)            475 (501)          18,000 (17,100)               0.67
W2         Y66W, I123V, Y145H, H148R, M153T, V163A, N212K  432 (453)            480                10,000 (9,600)                0.72
S65T       S65T                                            489                  511                39,200                        0.68
P4-1       S65T, M153A, K238E                              504 (396)            514                14,500 (8,600)                0.53
S65A       S65A                                            471                  504
S65C       S65C                                            479                  507
S65L       S65L                                            484                  510
Y66F       Y66F                                            360                  442
Y66W       Y66W                                            458                  480

Additional mutations in Aequorea-related fluorescent proteins, referred to as "folding mutations", improve the ability of the fluorescent proteins to fold at higher temperatures and to be more fluorescent when expressed in mammalian cells, but have little or no effect on the peak wavelengths of excitation and emission. It should be noted that these can be combined with mutations that influence the spectral properties of green fluorescent protein to produce proteins with altered spectral and folding properties. Folding mutations include: F64L, V68L, S72A, and also T44A, F99S, Y145F, N146I, M153T or A, V163A, I167T, S175G, S205T, and N212K. As used herein, the term "loop domain" refers to an amino acid sequence of an Aequorea-related fluorescent protein that connects the amino acids involved in the secondary structure of the eleven strands of the β-barrel or the central α-helix (residues 56-72) (see Figures 1A and 1B). As used herein, the "fluorescent protein moiety" of a fluorescent protein is that portion of the amino acid sequence of a fluorescent protein which, when the amino acid sequence of the fluorescent protein substrate is optimally aligned with the amino acid sequence of a naturally occurring fluorescent protein, lies between the amino-terminal and carboxy-terminal amino acids, inclusive, of the amino acid sequence of the naturally occurring fluorescent protein.
It has been found that fluorescent proteins can be genetically fused to other target proteins and used as markers to identify the location and amount of the target protein produced. Accordingly, this invention provides fusion proteins comprising a fluorescent protein moiety and additional amino acid sequences. Such sequences can be, for example, up to about 15, up to about 50, up to about 150 or up to about 1000 amino acids long. The fusion proteins possess the ability to fluoresce when excited by electromagnetic radiation. In one embodiment, the fusion protein comprises a polyhistidine tag to aid in purification of the protein.

B. Use of the Crystal Structure of Green Fluorescent Protein to Design Mutants Having Altered Fluorescent Characteristics

Using X-ray crystallography and computer processing, we have created a model of the crystal structure of Aequorea green fluorescent protein showing the relative location of the atoms in the molecule. This information is useful in identifying amino acids whose substitution alters the fluorescent properties of the protein. The fluorescent characteristics of Aequorea-related fluorescent proteins depend, in part, on the electronic environment of the chromophore. In general, amino acids that are within about 0.5 nm of the chromophore influence the electronic environment of the chromophore. Therefore, substitution of such amino acids can produce fluorescent proteins with altered fluorescent characteristics. In the excited state, electron density tends to shift from the phenolate towards the carbonyl end of the chromophore. Therefore, placing increasing positive charge near the carbonyl end of the chromophore tends to decrease the energy of the excited state and cause a red shift in the absorbance and emission wavelength maxima of the protein.
Decreasing the positive charge near the carbonyl end of the chromophore tends to have the opposite effect, causing a blue shift in the wavelengths of the protein. Amino acids with charged (ionized D, E, K, and R), dipolar (H, N, Q, S, T, and uncharged D, E and K), and polarizable side groups (e.g., C, F, H, M, W and Y) are useful for altering the electronic environment of the chromophore, especially when they substitute for an amino acid with an uncharged, non-polar or non-polarizable side chain. In general, amino acids with polarizable side groups alter the electronic environment least and, consequently, are expected to cause a comparatively smaller change in a fluorescent property. Amino acids with charged side groups alter the environment most and, consequently, are expected to cause a comparatively larger change in a fluorescent property. However, amino acids with charged side groups are more likely to disrupt the structure of the protein and to prevent proper folding if buried next to the chromophore without additional solvation or salt bridging. Therefore, charged amino acids are most likely to be tolerated, and to give useful effects, when they replace other charged or highly polar amino acids that are already solvated or involved in salt bridges. In certain cases, where substitution with a polarizable amino acid is chosen, the structure of the protein may make selection of a larger amino acid, for example W, less appropriate. Alternatively, positions occupied by amino acids with charged or polar side groups that are unfavorably oriented may be substituted with amino acids that have less charged or polar side groups. In another alternative, an amino acid whose side group has a dipole oriented in one direction in the protein can be substituted with an amino acid having a dipole oriented in a different direction.
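The proximity criterion used in this section (residues within about 0.5 nm of the chromophore) can be sketched computationally as follows. This is a minimal illustration, not the modeling procedure used to build the crystal structure: the coordinates are invented toy values in nanometers, the residue labels are used only as dictionary keys, and the function name is hypothetical. Crystal coordinate files normally use ångströms, so a real calculation would need a unit conversion.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]  # (x, y, z) in nanometers (assumed unit)


def residues_near_chromophore(chromophore_atoms: List[Coord],
                              residue_atoms: Dict[str, List[Coord]],
                              cutoff_nm: float = 0.5) -> List[str]:
    """Return residues having at least one atom within cutoff_nm of any
    chromophore atom, sorted by residue label."""
    near = []
    for label, atoms in residue_atoms.items():
        if any(math.dist(atom, chrom) <= cutoff_nm
               for atom in atoms
               for chrom in chromophore_atoms):
            near.append(label)
    return sorted(near)


if __name__ == "__main__":
    # Toy coordinates for illustration only; not taken from Figures 5-1 to 5-28.
    chromophore = [(0.0, 0.0, 0.0), (0.15, 0.0, 0.0)]
    residues = {
        "T203": [(0.40, 0.10, 0.0)],
        "E222": [(0.0, 0.45, 0.0), (0.0, 0.60, 0.0)],
        "K238": [(2.0, 2.0, 2.0)],
    }
    print(residues_near_chromophore(chromophore, residues))
```

A list produced this way would only be a starting set of candidate positions; which substitutions are tolerated still depends on the charge, polarity and size considerations discussed above.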
More particularly, Table B lists several amino acids located within about 0.5 nm of the chromophore whose substitution can result in altered fluorescent characteristics. The table indicates, underlined, the preferred amino acid substitutions at the indicated location to alter a fluorescent characteristic of the protein. In order to introduce such substitutions, the table also provides codons for the primers used in site-directed mutagenesis involving amplification. These primers have been selected to encode the preferred amino acids economically, but they also encode other amino acids, as indicated, or even a stop codon, denoted by Z. When introducing substitutions using such degenerate primers, the most efficient strategy is to screen the collection to identify mutants with the desired properties and then sequence their DNA to find out which of the possible substitutions is responsible. The codons are shown in double-stranded form, with the sense strand given first and the antisense strand second. In nucleic acid sequences, R = (A or G); Y = (C or T); M = (A or C); K = (G or T); S = (G or C); W = (A or T); H = (A, T, or C); B = (G, T, or C); V = (G, A, or C); D = (G, A, or T); N = (A, C, G, or T).
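The relationship between a degenerate codon and the set of amino acids it can encode can be checked with a short Python sketch. It combines the degenerate-base definitions given above with the standard genetic code (DNA alphabet) and, following the text, writes a stop codon as Z. The function and variable names are illustrative only.

```python
from itertools import product

# Degenerate nucleotide codes as defined in the text, plus the four plain bases.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ATC", "B": "GTC", "V": "GAC", "D": "GAT", "N": "ACGT",
}

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is stop.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def encoded_amino_acids(degenerate_codon: str) -> str:
    """Return the sorted one-letter codes of all amino acids that a
    degenerate sense-strand codon can encode, with stop written as 'Z'."""
    codon = degenerate_codon.upper()
    if len(codon) != 3:
        raise ValueError("a codon must have exactly three positions")
    choices = [DEGENERATE[base] for base in codon]
    residues = {
        CODON_TABLE["".join(bases)].replace("*", "Z")
        for bases in product(*choices)
    }
    return "".join(sorted(residues))


if __name__ == "__main__":
    for codon in ("YDS", "VAS", "MAS", "RRG", "YWC"):
        print(codon, "->", encoded_amino_acids(codon))
```

Running the sketch on the sense-strand codons of a table row shows which unintended residues (or a stop codon) will also appear in the mutant collection, which is why screening followed by sequencing is the recommended strategy.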
TABLE B

Original position and presumed role                                      Change to   Codon (sense / antisense)
L42   Aliphatic residue near C=N of the chromophore                      CFHLQRWYZ   5' YDS 3' / 3' RHS 5'
V61   Aliphatic residue near the central -CH= of the chromophore         FYHCLR      YDC / RHG
T62   Almost directly above the center of the chromophore bridge         AVFS        KYC / MRG
                                                                         DEHKNQ      VAS / BTS
                                                                         FYHCLR      YDC / RHG
V68   Aliphatic residue near the carbonyl and G67                        FYHL        YWC / RWG
N121  Near the C-N site of ring closure between T65 and G67              CFHLQRWYZ   YDS / RHS
Y145  Packs near the tyrosine ring of the chromophore                    WCFL        TKS / AMS
                                                                         DEHNKQ      VAS / BTS
H148  H-bonds to the phenolate oxygen                                    FYNI        WWC / WWG
                                                                         KQR         MRG / KYC
V150  Aliphatic residue near the tyrosine ring of the chromophore        FYHL        YWC / RWG
F165  Packs near the tyrosine ring                                       CHQRWYZ     YRS / RYS
I167  Aliphatic residue near the phenolate; I167T has effects            FYHL        YWC / RWG
T203  H-bonds to the phenolic oxygen of the chromophore                  FHLQRWYZ    YDS / RHS
E222  Protonation regulates ionization of the chromophore                HKNQ        MAS / KTS

Examples of amino acids with polar side groups that can be substituted with polarizable side groups include, for example, those in Table C.

TABLE C

Original position and presumed role                                      Change to   Codon (sense / antisense)
Q69   Terminates the chain of H-bonding waters                           KREG        RRG / YYC
Q94   H-bonds to the carbonyl terminus of the chromophore                DEHKNQ      VAS / BTS
Q183  Bridges Arg96 and the center of the chromophore bridge             HY          YAC / RTG
                                                                         _____       RAG / YTC
N185  Part of the H-bond network near the carbonyl of the chromophore    DEHNKQ      VAS / BTS

In another embodiment, an amino acid that is close to a second amino acid within about 0.5 nm of the chromophore can, upon substitution, alter the electronic properties of the second amino acid, in turn altering the electronic environment of the chromophore. Table D presents two such amino acids. The amino acids L220 and V224 are close to E222 and are oriented in the same direction in the β-pleated sheet.
TABLE D

Original position and presumed role                                      Change to   Codon (sense / antisense)
L220  Packs near Glu222; to make GFP pH sensitive                        HKNPQT      MMS / KKS
V224  Packs near Glu222; to make GFP pH sensitive                        HKNPQT      MMS / KKS
                                                                         CFHLQRWYZ   YDS / RHS

One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at Q69, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein. Preferably, the substitution at Q69 is selected from the group of K, R, E and G. The Q69 substitution can be combined with other mutations to improve the properties of the protein, such as a functional mutation at S65. One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at E222, but not including E222G, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein. Preferably, the substitution at E222 is selected from the group of N and Q. The E222 substitution can be combined with other mutations to improve the properties of the protein, such as a functional mutation at F64.
One embodiment of the invention includes a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least a substitution at Y145, wherein the functional engineered fluorescent protein has a different fluorescent property than Aequorea green fluorescent protein. Preferably, the substitution at Y145 is selected from the group of W, C, F, L, E, H, K and Q. The Y145 substitution can be combined with other mutations to improve the properties of the protein, such as one at Y66. The invention also includes computer-related embodiments, including computational methods of using the crystal coordinates to design new fluorescent protein mutations, and devices for storing the crystal data, including the coordinates. For example, the invention includes a device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 through 5-28. More coordinates can be stored, depending on the complexity of the calculations or the purpose for which the coordinates are used (for example, about 100, 1,000 or more coordinates). For example, larger numbers of coordinates will be desirable for more detailed representations of the structure of the fluorescent protein. Typically, the storage device is a computer-readable device that stores code which receives the coordinates as input, although other storage means known in the art are also contemplated. The computer-readable device can be a floppy disk or a hard disk.

C. Production of Long Wavelength Fluorescent Proteins

Recombinant production of a fluorescent protein involves expressing a nucleic acid molecule having sequences that encode the protein.
In one embodiment, the nucleic acid encodes a fusion protein in which a single polypeptide includes the fluorescent protein moiety within a longer polypeptide. The longer polypeptide can include a second functional protein, such as a fluorescence resonance energy transfer partner or a protein having a second function (e.g., an enzyme, antibody, or other binding protein). Nucleic acids that encode fluorescent proteins are useful as starting materials. Fluorescent proteins can be produced as fusion proteins by recombinant DNA technology.
Recombinant production of fluorescent proteins involves expressing nucleic acids having sequences that encode the proteins. Nucleic acids encoding fluorescent proteins can be obtained by methods known in the art. Fluorescent proteins can be made by site-specific mutagenesis of other nucleic acids encoding fluorescent proteins, or by random mutagenesis caused by increasing the error rate of the polymerase chain reaction on the original polynucleotide with 0.1 mM MnCl2 and unbalanced nucleotide concentrations. See, for example, U.S. patent application 08/337,915, filed on November 10, 1994, or International application PCT/US95/14692, filed on November 10, 1995. The nucleic acid encoding a green fluorescent protein can be isolated by the polymerase chain reaction from A. victoria cDNA, using primers based on the DNA sequence of A. victoria green fluorescent protein as presented in Figure 3. Polymerase chain reaction methods are described, for example, in U.S. Patent No. 4,683,195; Mullis, et al. (1987) Cold Spring Harbor Symp. Quant. Biol. 51: 263; and Erlich, ed., PCR Technology, (Stockton Press, NY, 1989). The construction of expression vectors and the expression of genes in transfected cells involve the use of molecular cloning techniques that are also well known in the art. Sambrook, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, (1989) and Current Protocols in Molecular Biology, F.M. Ausubel, et al., Eds., (Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.). The expression vector can be adapted for function in prokaryotes or eukaryotes by including appropriate promoters, replication sequences, markers, and so on.
The nucleic acids used to transfect cells with sequences coding for expression of the polypeptide of interest will generally be in the form of an expression vector that includes expression control sequences operatively linked to a nucleotide sequence coding for expression of the polypeptide. As used herein, the term "nucleotide sequence coding for expression of" a polypeptide refers to a sequence that, upon transcription and translation of mRNA, produces the polypeptide. This can include sequences containing, for example, introns. Expression control sequences are operatively linked to a nucleic acid sequence when the expression control sequences control and regulate the transcription and, as appropriate, translation of the nucleic acid sequence. Thus, expression control sequences can include appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signals for introns, maintenance of the correct reading frame of that gene to permit proper translation of the mRNA, and stop codons. Methods that are well known to those skilled in the art can be used to construct expression vectors containing the fluorescent protein coding sequence and appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. (See, for example, the techniques described in Maniatis, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y., 1989). Transformation of a host cell with recombinant DNA can be carried out by conventional techniques that are also well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, competent cells capable of DNA uptake can be prepared from cells harvested after the exponential growth phase and subsequently treated by the CaCl2 method using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed after forming a protoplast of the host cell, or by electroporation.
When the host is a eukaryote, methods of DNA transfection such as calcium phosphate co-precipitation, conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors can be used. Eukaryotic cells can also be co-transfected with DNA sequences encoding the fusion polypeptide of the invention and a second foreign DNA molecule encoding a selectable phenotype, such as the herpes simplex thymidine kinase gene. Another method is to use a eukaryotic viral vector to transform eukaryotic cells and express the protein. (Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman, et al., 1982). Preferably, a eukaryotic host is used as the host cell as described herein. Isolation and purification of polypeptides of the invention expressed in either microbial or eukaryotic hosts can be carried out by any conventional means such as, for example, preparative chromatographic separations and immunological separations such as those involving the use of monoclonal or polyclonal antibodies or antigen. In one embodiment, recombinant fluorescent proteins can be produced by expression of the nucleic acid encoding the protein in E. coli. Aequorea-related fluorescent proteins are best expressed by cells cultured between about 15 °C and 30 °C, but higher temperatures (e.g., 37 °C) are possible. After synthesis, these enzymes are stable at higher temperatures (e.g., 37 °C) and can be used in assays at those temperatures. A variety of host-expression vector systems can be used to express the fluorescent protein coding sequence.
These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing a fluorescent protein coding sequence; yeast transformed with recombinant yeast expression vectors containing the fluorescent protein coding sequence; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing a fluorescent protein coding sequence; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing a fluorescent protein coding sequence; animal cell systems infected with recombinant virus expression vectors (e.g., retroviruses, adenovirus, vaccinia virus) containing a fluorescent protein coding sequence; or transformed animal cell systems engineered for stable expression. Depending on the host/vector system used, any of a number of suitable transcription and translation elements, including constitutive and inducible promoters, transcription enhancer elements, transcription terminators, etc., may be used in the expression vector (see, for example, Bitter, et al., Methods in Enzymology 153: 516-544, 1987). For example, when cloning in bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like can be used. When cloning in mammalian cell systems, promoters derived from the genome of mammalian cells (e.g., the metallothionein promoter) or from mammalian viruses (e.g., the retrovirus long terminal repeat; the adenovirus late promoter; the vaccinia virus 7.5K promoter) can be used. Promoters produced by recombinant DNA or synthetic techniques can also be used to provide for transcription of the inserted fluorescent protein coding sequence.
In bacterial systems, a number of expression vectors may be advantageously selected, depending on the intended use of the expressed fluorescent protein. For example, when large quantities of the fluorescent protein are to be produced, vectors that direct the expression of high levels of fusion protein products that are readily purified may be desirable. Preferred are those designed to contain a cleavage site to aid in the recovery of the fluorescent protein. In yeast, a number of vectors containing constitutive or inducible promoters can be used. For a review see Current Protocols in Molecular Biology, volume 2, Ed. Ausubel, et al., Greene Publish. Assoc. & Wiley Interscience, Chapter 13, 1988; Grant, et al., Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman, Acad. Press, N.Y., volume 153, pages 516-544, 1987; Glover, DNA Cloning, volume II, IRL Press, Wash., D.C., chapter 3, 1986; Bitter, Heterologous Gene Expression in Yeast, Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., volume 152, pages 673-684, 1987; and The Molecular Biology of the Yeast Saccharomyces, Eds. Strathern, et al., Cold Spring Harbor Press, Volumes I and II, 1982. A constitutive yeast promoter such as ADH or LEU2 or an inducible promoter such as GAL can be used (Cloning in Yeast, Chapter 3, R. Rothstein, in: DNA Cloning, volume II, A Practical Approach, Ed. D.M. Glover, IRL Press, Wash., D.C., 1986). Alternatively, vectors that promote the integration of foreign DNA sequences into the yeast chromosome can be used. In cases where plant expression vectors are used, the expression of a fluorescent protein coding sequence can be driven by any of a number of promoters. For example, viral promoters such as the 35S RNA and 19S RNA promoters of CaMV (Brisson, et al., Nature 310: 511-514, 1984), or the TMV coat protein promoter (Takamatsu, et al., EMBO J.
6: 301-311, 1987); alternatively, plant promoters such as that of the small subunit of RUBISCO (Coruzzi, et al., EMBO J. 3: 1671-1680, 1984; Broglie, et al., Science 224: 838-843, 1984); or heat shock promoters, for example soybean hsp17.5-E or hsp17.3-B (Gurley, et al., Mol. Cell. Biol. 6: 559-565, 1986), can be used. These constructs can be introduced into plant cells using Ti plasmids, Ri plasmids, plant virus vectors, direct DNA transformation, microinjection, electroporation, and so on. For reviews of these techniques see, for example, Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pages 421-463, 1988; and Grierson and Corey, Plant Molecular Biology, 2nd Ed., Blackie, London, chapters 7-9, 1988. An alternative expression system which can be used to express the fluorescent protein is an insect system. In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The coding sequence of the fluorescent protein can be cloned into non-essential regions (e.g., the polyhedrin gene) of the virus and placed under the control of an AcNPV promoter (e.g., the polyhedrin promoter). Successful insertion of the fluorescent protein coding sequence results in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the protein coat encoded by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells, in which the inserted gene is expressed; see Smith, et al., J. Virol. 46: 584, 1983; Smith, U.S. Patent No. 4,215,051. Eukaryotic systems, and preferably mammalian expression systems, allow proper post-translational modifications of expressed mammalian proteins to occur.
Eukaryotic cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, phosphorylation and, advantageously, secretion of the gene product should be used as host cells for the expression of the fluorescent protein. These host cell lines may include, but are not limited to, CHO, VERO, HeLa, COS, MDCK, Jurkat, HEK-293, and WI38. Mammalian cell systems that use recombinant viruses or viral elements to direct expression can be designed. For example, when adenovirus expression vectors are used, the fluorescent protein coding sequence can be ligated to an adenovirus transcription/translation control complex, for example, the late promoter and tripartite leader sequence. This chimeric gene can then be inserted into the adenovirus genome by in vitro or in vivo recombination. Insertion into a non-essential region of the viral genome (e.g., region E1 or E3) results in a recombinant virus that is viable and can express the fluorescent protein in infected hosts (e.g., see Logan and Shenk, Proc. Natl. Acad. Sci. USA, 81: 3655-3659, 1984). Alternatively, the vaccinia virus 7.5K promoter can be used (for example, see Mackett, et al., Proc. Natl. Acad. Sci. USA, 79: 7415-7419, 1982; Mackett, et al., J. Virol. 49: 857-864, 1984; Panicali, et al., Proc. Natl. Acad. Sci. USA, 79: 4927-4931, 1982). Of particular interest are vectors based on bovine papilloma virus, which have the ability to replicate as extrachromosomal elements (Sarver, et al., Mol. Cell. Biol. 1: 486, 1981). Shortly after entry of this DNA into mouse cells, the plasmid replicates to approximately 100 to 200 copies per cell. Transcription of the inserted cDNA does not require integration of the plasmid into the host chromosome, thereby yielding a high level of expression. These vectors can be used for stable expression by including a selectable marker in the plasmid, such as the neo gene.
Alternatively, the retroviral genome can be modified for use as a vector capable of introducing and directing the expression of the fluorescent protein gene in host cells (Cone and Mulligan, Proc. Natl. Acad. Sci. USA, 81: 6349-6353, 1984). A high level of expression can also be achieved by using inducible promoters, including, but not limited to, the metallothionein IIA promoter and heat shock promoters.

The invention may also include a localization sequence, such as a nuclear localization sequence, an endoplasmic reticulum localization sequence, a peroxisome localization sequence, a mitochondrial localization sequence, or a localized protein. The localization sequences can be targeting sequences, which are described, for example, in "Protein Targeting", chapter 35 of Stryer, L., Biochemistry (4th ed.), W.H. Freeman, 1995. The localization sequence can also be a localized protein. Some important localization sequences include those targeting the nucleus (KKKRK), the mitochondria (amino-terminal MLRTSSLFTRRVQPSLFRNILRLQST-), the endoplasmic reticulum (KDEL at the C-terminus, assuming a signal sequence is present at the N-terminus), the peroxisome (SKF at the C-terminus), prenylation or insertion into the plasma membrane (CaaX, CC, CXC, or CCXX at the C-terminus), the cytoplasmic side of the plasma membrane (fusion to SNAP-25), or the Golgi apparatus (fusion to furin).

For long-term, high-yield production of recombinant proteins, stable expression is preferred. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with the fluorescent protein cDNA controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.) and a selectable marker.
The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci, which in turn can be cloned and expanded into cell lines. For example, following the introduction of the foreign DNA, the engineered cells may be allowed to grow for 1-2 days in an enriched medium, and are then switched to a selective medium. A number of selection systems can be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler, et al., Cell, 11: 223, 1977), hypoxanthine-guanine phosphoribosyl transferase (Szybalska and Szybalski, Proc. Natl. Acad. Sci. USA, 48: 2026, 1962), and adenine phosphoribosyl transferase (Lowy, et al., Cell, 22: 817, 1980) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler, et al., Proc. Natl. Acad. Sci. USA, 77: 3567, 1980; O'Hare, et al., Proc. Natl. Acad. Sci. USA, 78: 1527, 1981); gpt, which confers resistance to mycophenolic acid (Mulligan and Berg, Proc. Natl. Acad. Sci. USA, 78: 2072, 1981); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., J. Mol. Biol., 150: 1, 1981); and hygro, which confers resistance to hygromycin (Santerre, et al., Gene, 30: 147, 1984). Recently, additional selectable genes have been described, namely trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman and Mulligan, Proc. Natl. Acad. Sci. USA, 85: 8047, 1988); and ODC (ornithine decarboxylase), which confers resistance to the ornithine decarboxylase inhibitor 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue L., in: Current Communications in Molecular Biology, Cold Spring Harbor Laboratory, ed., 1987).
DNA sequences encoding the fluorescent protein polypeptide of the invention can be expressed in vitro by DNA transfer into a suitable host cell. "Host cells" are cells in which a vector can be propagated and its DNA expressed. The term also includes any progeny of the subject host cell. It is understood that not all progeny may be identical to the parental cell, since mutations may occur during replication. However, such progeny are included when the term "host cell" is used. Methods of stable transfer, in other words in which the foreign DNA is continuously maintained in the host, are well known in the art. The expression vector can be transfected into a host cell for expression of the recombinant nucleic acid. Host cells can be selected for high levels of expression in order to purify the fluorescent protein fusion protein. E. coli is useful for this purpose. Alternatively, the host cell can be a prokaryotic or eukaryotic cell selected to study the activity of an enzyme produced by the cell. In this case, the linker peptide is selected to include an amino acid sequence recognized by the protease. The cell can be, for example, a cultured cell or a cell in vivo. A primary advantage of fluorescent protein fusion proteins is that they are prepared by normal protein biosynthesis, thus completely avoiding organic synthesis and the requirement for customized unnatural amino acid analogs. Constructs can be expressed in E. coli on a large scale for in vitro assays. Purification from bacteria is simplified when the sequences include polyhistidine tags for one-step purification by nickel-chelate chromatography. Alternatively, the substrates can be expressed directly in a desired host cell for assays in situ.

In another embodiment, the invention provides a transgenic non-human animal that expresses a nucleic acid sequence which encodes the fluorescent protein.
The "non-human animals" of the invention comprise any non-human animal having a nucleic acid sequence which encodes a fluorescent protein. These non-human animals include vertebrates such as rodents, non-human primates, sheep, dogs, cows, pigs, amphibians, and reptiles. Preferred non-human animals are selected from the rodent family, including rat and mouse, most preferably mouse. The "transgenic non-human animals" of the invention are produced by introducing "transgenes" into the germ line of the non-human animal. Embryonic target cells at various developmental stages can be used to introduce transgenes. Different methods are used depending on the stage of development of the embryonic target cell. The zygote is the best target for microinjection. In the mouse, the male pronucleus reaches a size of approximately 20 micrometers in diameter, which allows reproducible injection of 1-2 pl of DNA solution. The use of zygotes as a target for gene transfer has a major advantage in that, in most cases, the injected DNA will be incorporated into the host genome before the first cleavage (Brinster, et al., Proc. Natl. Acad. Sci. USA 82: 4438-4442, 1985). As a consequence, all cells of the transgenic non-human animal will carry the incorporated transgene. This will in general also be reflected in the efficient transmission of the transgene to offspring of the founder, since 50 percent of the germ cells will harbor the transgene. Microinjection of zygotes is the preferred method for incorporating transgenes in practicing the invention. The term "transgenic" is used to describe an animal which includes exogenous genetic material within all of its cells. A "transgenic" animal can be produced by cross-breeding two chimeric animals which include exogenous genetic material within the cells used in reproduction.
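The outcome of such a cross can be sketched as a single-locus Mendelian calculation. This is a minimal illustration, not part of the disclosed method: it assumes that each parent transmits the transgene through half of its germ cells (that is, behaves as a heterozygote at one locus), and the allele labels `T` (transgene present) and `o` (transgene absent) and the function name `cross` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Return genotype frequencies for a single-locus cross.

    Each parent is a 2-tuple of alleles; every gamete combination is
    treated as equally likely (simple Mendelian segregation).
    """
    # Sort each allele pair so that ("T", "o") and ("o", "T") count as one genotype.
    offspring = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(offspring.values())
    return {genotype: Fraction(count, total) for genotype, count in offspring.items()}


# Assumption: both parents carry the transgene in one of two alleles.
ratios = cross(("T", "o"), ("T", "o"))
for genotype, share in sorted(ratios.items()):
    print(genotype, f"{float(share):.0%}")
```

Under these assumptions one quarter of the offspring carry the transgene in both alleles, one half carry it in one allele, and one quarter do not carry it.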
Twenty-five percent of the resulting offspring will be transgenic, that is, animals that include the exogenous genetic material within all of their cells in both alleles. Fifty percent of the resulting animals will include the exogenous genetic material within one allele, and 25 percent will include no exogenous genetic material. Retroviral infection can also be used to introduce a transgene into a non-human animal. The developing non-human embryo can be cultured in vitro to the blastocyst stage. During this time, the blastomeres can be targets for retroviral infection (Jaenisch, R., Proc. Natl. Acad. Sci. USA 73: 1260-1264). Efficient infection of the blastomeres is obtained by enzymatic treatment to remove the zona pellucida (Hogan, et al. (1986) in Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The viral vector system used to introduce the transgene is typically a replication-defective retrovirus carrying the transgene (Jahner, et al., Proc. Natl. Acad. Sci. USA 82: 6927-6931, 1985; Van der Putten, et al., Proc. Natl. Acad. Sci. USA 82: 6148-6152, 1985). Transfection is easily and efficiently obtained by culturing the blastomeres on a monolayer of virus-producing cells (Van der Putten, supra; Stewart, et al., EMBO J. 6: 383-388, 1987). Alternatively, infection can be performed at a later stage. Virus or virus-producing cells can be injected into the blastocoel (D. Jahner, et al., Nature 298: 623-628). Most of the founders will be mosaic for the transgene, since incorporation occurs only in a subset of the cells that formed the transgenic non-human animal. Further, the founder may contain several retroviral insertions of the transgene at different positions in the genome, which generally will segregate in the offspring. In addition, it is also possible to introduce transgenes into the germ line, albeit with low efficiency, by intrauterine retroviral infection of the mid-gestation embryo (D. Jahner, et al., supra). A third type of target cell for transgene introduction is the embryonic stem (ES) cell. ES cells are obtained from pre-implantation embryos cultured in vitro and fused with embryos (M.J. Evans, et al., Nature 292: 154-156, 1981; M.O. Bradley, et al., Nature 309: 255-258, 1984; Gossler, et al., Proc. Natl. Acad. Sci. USA 83: 9065-9069, 1986; and Robertson, et al., Nature 322: 445-448, 1986). Transgenes can be efficiently introduced into ES cells by DNA transfection or by retrovirus-mediated transduction.
These transformed ES cells can thereafter be combined with blastocysts from a non-human animal. The ES cells then colonize the embryo and contribute to the germ line of the resulting chimeric animal. (For review, see Jaenisch, R., Science 240: 1468-1474, 1988.) "Transformed" means a cell into which (or into an ancestor of which) a heterologous nucleic acid molecule has been introduced by means of recombinant nucleic acid techniques. "Heterologous" refers to a nucleic acid sequence that either originates from another species or is modified from either its original form or the form primarily expressed in the cell. "Transgene" means any piece of DNA which is inserted by artifice into a cell and becomes part of the genome of the organism (that is, either stably integrated or as a stable extrachromosomal element) which develops from that cell. Such a transgene can represent a gene homologous to an endogenous gene of the organism. Included within this definition is a transgene created by providing an RNA sequence which is transcribed into DNA and then incorporated into the genome. The transgenes of the invention include DNA sequences that encode the fluorescent protein, which can be expressed in a transgenic non-human animal. The term "transgenic" as used herein additionally includes any organism whose genome has been altered by in vitro manipulation of the early embryo or fertilized egg, or by any transgenic technology to induce a specific gene knockout. The term "gene knockout" as used herein refers to the targeted disruption of a gene in vivo, with complete loss of function, achieved by any transgenic technology familiar to those in the art. In one embodiment, transgenic animals having gene knockouts are those in which the target gene has been rendered non-functional by an insertion targeted to the gene to be rendered non-functional by homologous recombination.
As used herein, the term "transgenic" includes any transgenic technology familiar to those in the art which can produce an organism carrying an introduced transgene or one in which an endogenous gene has been rendered non-functional or "knocked out."

III. USES OF DESIGNED FLUORESCENT PROTEINS

The proteins of this invention are useful in any methods that employ fluorescent proteins. The designed fluorescent proteins of this invention are useful as fluorescent labels in the many ways in which fluorescent labels are currently used. This includes, for example, coupling designed fluorescent proteins to antibodies, nucleic acids or other receptors for use in detection assays, such as immunoassays or hybridization assays. The designed fluorescent proteins of this invention are also useful for tracking the movement of proteins in cells. In this embodiment, a nucleic acid molecule encoding the fluorescent protein is fused to a nucleic acid molecule encoding the protein of interest in an expression vector. After expression inside the cell, the protein of interest can be localized based on fluorescence. In another version, two proteins of interest are fused with two designed fluorescent proteins that have different fluorescent characteristics. The designed fluorescent proteins of this invention are useful in systems for detecting the induction of transcription. In certain embodiments, a nucleotide sequence encoding the designed fluorescent protein is fused to expression control sequences of interest, and the expression vector is transfected into a cell. Induction of the promoter can be measured by detecting the expression and/or the amount of fluorescence. These constructs can be used to follow signaling pathways from receptor to promoter. The designed fluorescent proteins of this invention are useful in applications involving fluorescence resonance energy transfer.
These applications can detect events as a function of the movement of fluorescent donors and acceptors toward or away from each other. One or both members of the donor/acceptor pair can be a fluorescent protein. A preferred donor and acceptor pair for assays based on fluorescence resonance energy transfer is a donor with a T203I mutation and an acceptor with the mutation T203X, wherein X is an aromatic amino acid, especially T203Y, T203W, or T203H. In a particularly useful pair, the donor contains the following mutations: S72A, K79R, Y145F, M153A and T203I (with an excitation peak of 395 nm and an emission peak of 511 nm), and the acceptor contains the following mutations: S65G, S72A, K79R, and T203Y. This particular pair provides a wide separation between the excitation and emission peaks and a good overlap between the emission spectrum of the donor and the excitation spectrum of the acceptor.
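Why such assays respond to the movement of the donor and the acceptor (receptor) toward or away from each other can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below is illustrative only: the Förster radius `R0_NM` is a placeholder value assumed for the example and is not taken from this document, and the function name `fret_efficiency` is hypothetical. Only the 395 nm excitation and 511 nm emission peaks of the donor come from the text above.

```python
# Donor peaks quoted in the text above (nm).
DONOR_EXCITATION_NM = 395
DONOR_EMISSION_NM = 511

# Assumption for illustration only: a Forster radius of 5 nm.
# The actual value for this donor/acceptor pair is not given in the text.
R0_NM = 5.0


def fret_efficiency(r_nm, r0_nm=R0_NM):
    """Forster transfer efficiency at donor-acceptor distance r_nm."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Separation between the donor's excitation and emission peaks.
donor_peak_separation_nm = DONOR_EMISSION_NM - DONOR_EXCITATION_NM

for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> efficiency = {fret_efficiency(r):.3f}")
print("donor peak separation:", donor_peak_separation_nm, "nm")
```

Because efficiency falls with the sixth power of distance, doubling the separation around R0 takes the transfer from nearly complete to nearly absent, which is what makes cleavage or conformational events readable as a fluorescence change.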
Other red-shifted mutants, such as those described hereinabove, can also be used as the acceptor in this pair. In one aspect, fluorescence resonance energy transfer is used to detect the cleavage of a substrate having the donor and the acceptor coupled to the substrate on opposite sides of the cleavage site. Upon cleavage of the substrate, the donor/acceptor pair physically separates, eliminating the fluorescence resonance energy transfer. The assays involve contacting the substrate with a sample and determining a qualitative or quantitative change in fluorescence resonance energy transfer. In one embodiment, the designed fluorescent protein is used in a substrate for β-lactamase. Examples of such substrates are described in United States patent application 08/407,544, filed on March 20, 1995, and in international application PCT/US96/04059, filed on March 20, 1996. In another embodiment, a donor/acceptor pair of designed fluorescent proteins is part of a fusion protein coupled by a peptide having a proteolytic cleavage site. Such tandem fluorescent proteins are described in United States patent application 08/594,575, filed on January 31, 1996. In another aspect, fluorescence resonance energy transfer is used to detect changes in potential across a membrane. A donor and an acceptor are placed on opposite sides of a membrane such that one moves across the membrane in response to a voltage change. This creates a measurable fluorescence resonance energy transfer. This method is described in United States patent application 08/481,977, filed on June 7, 1995, and in international application PCT/US96/09652, filed on June 6, 1996. The designed proteins of this invention are useful in the creation of fluorescent substrates for protein kinases. These substrates incorporate an amino acid sequence that can be recognized by protein kinases.
Upon phosphorylation, the designed fluorescent protein undergoes a change in a fluorescent property. These substrates are useful for detecting and measuring protein kinase activity in a sample of a cell, after transfection and expression of the substrate. Preferably, the kinase recognition site is placed within about 20 amino acids of a terminus of the designed fluorescent protein. The kinase recognition site can also be placed in a loop domain of the protein (see, for example, Figure 1B). Methods for making fluorescent substrates for protein kinases are described in United States patent application 08/680,877, filed on July 16, 1996. A protease recognition site can also be introduced into a loop domain. Upon cleavage, the fluorescent property changes in a measurable fashion. The invention also includes a method for identifying a test chemical. Typically, the method includes contacting a test chemical with a sample containing a biological entity labeled with a functional, designed fluorescent protein or with a polynucleotide encoding this functional, designed fluorescent protein. By monitoring the fluorescence (i.e., a fluorescent property) of the sample containing the functional, designed fluorescent protein, it can be determined whether the test chemical is active. Controls can be included to ensure the specificity of the signal. These controls include measurements of a fluorescent property in the absence of the test chemical, in the presence of a chemical with an expected activity (e.g., a known modulator), or designed controls (e.g., absence of the designed fluorescent protein, absence of the designed fluorescent protein polynucleotide, or absence of an operable linkage of the designed fluorescent protein). Fluorescence in the presence of a test chemical may be higher or lower than in the absence of the test chemical.
For example, if the designed fluorescent protein is used as a reporter of gene expression, the test chemical can up-regulate or down-regulate expression of the gene.
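The comparison of fluorescence with and without the test chemical can be sketched as a simple fold-change call. This is an illustration under stated assumptions, not a prescribed analysis: the 1.5-fold threshold, the function name `classify_test_chemical`, and the example readings are all hypothetical, and a real screen would add replicates, background subtraction and the controls listed above.

```python
from statistics import mean


def classify_test_chemical(treated, control, fold_threshold=1.5):
    """Compare mean reporter fluorescence with and without a test chemical.

    `treated` and `control` are lists of fluorescence readings in the same
    arbitrary units. `fold_threshold` is an assumed cut-off, not a value
    taken from the text.
    """
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = mean(treated) / mean(control)
    if ratio >= fold_threshold:
        return "up"
    if ratio <= 1 / fold_threshold:
        return "down"
    return "no effect"


# Hypothetical readings in arbitrary fluorescence units.
print(classify_test_chemical([300, 320], [100, 110]))
print(classify_test_chemical([50, 55], [100, 110]))
print(classify_test_chemical([100, 105], [100, 110]))
```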
For these types of screening, the polynucleotide encoding the functional, designed fluorescent protein is operably linked to a genomic polynucleotide or a re. Alternatively, the functional, designed fluorescent protein is fused to a second functional protein. This embodiment can be used to track the localization of the second protein or to track protein-protein interactions using energy transfer.

IV. PROCEDURES

Fluorescence in a sample is measured using a fluorometer. In general, excitation radiation from an excitation source having a first wavelength passes through excitation optics. The excitation optics cause the excitation radiation to excite the sample. In response, fluorescent proteins in the sample emit radiation which has a wavelength different from the excitation wavelength. Collection optics then collect the emission from the sample. The device can include a temperature controller to maintain the sample at a specific temperature while it is being scanned. According to one embodiment, a multi-axis translation stage moves a microtiter plate holding a plurality of samples in order to position different wells to be exposed. The multi-axis translation stage, temperature controller, auto-focusing feature, and electronics associated with imaging and data collection can be managed by an appropriately programmed digital computer. The computer can also transform the data collected during the assay into another format for presentation. This process can be miniaturized and automated to enable screening of many thousands of compounds. Methods of performing assays on fluorescent materials are well known in the art and are described in, for example, Lakowicz, J.R., Principles of Fluorescence Spectroscopy, New York: Plenum Press (1983); Herman, B., Resonance energy transfer microscopy, in: Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in Cell Biology, volume 30, ed.
Taylor, D.L. and Wang, Y.L., San Diego: Academic Press (1989), pages 219-243; Turro, N.J., Modern Molecular Photochemistry, Menlo Park: Benjamin/Cummings Publishing Co., Inc. (1978), pages 296-361. The following examples are provided by way of illustration, and not by way of limitation.

Examples

As a step toward understanding the properties of the green fluorescent protein, and to aid in the preparation of green fluorescent proteins with altered characteristics, we have determined the three-dimensional structure, at 1.9 Å resolution, of the S65T mutant (R. Heim, et al., Nature 373: 664-665 (1995)) of the green fluorescent protein of A. victoria. This mutant also contains the ubiquitous Q80R substitution, which occurred accidentally in the early distribution of the green fluorescent protein cDNA and is not known to have any effect on the properties of the protein (M. Chalfie, et al., Science 263: 802-805 (1994)). Histidine-tagged S65T green fluorescent protein (R. Heim, et al., Nature 373: 664-665 (1995)) was overexpressed in JM109/pRSETB in 4 liters of YT broth plus ampicillin at 37 °C, 450 rpm and 5 liters/minute air flow. The temperature was reduced to 25 °C at A595 = 0.3, followed by induction with 1 mM isopropylthiogalactoside for 5 hours. The cell paste was stored at -80 °C overnight, then resuspended in 50 mM HEPES at pH 7.9, 0.3 M NaCl, 5 mM 2-mercaptoethanol, 0.1 mM phenylmethylsulfonyl fluoride (PMSF), passed once through a French press at 10,000 psi, then centrifuged at 20,000 rpm for 45 minutes. The supernatant was applied to a Ni-NTA-agarose column (Qiagen), followed by a wash with 20 mM imidazole, then eluted with 100 mM imidazole. The green fractions were pooled and subjected to chymotryptic (Sigma) proteolysis (1:50 weight/weight) for 22 hours at RT. After addition of 0.5 mM PMSF, the digest was reapplied to the Ni column.
N-terminal sequencing verified the presence of the correct N-terminal methionine. After dialysis against 20 mM HEPES at pH 7.5 and concentration to A490 = 20, rod-shaped crystals were obtained at RT in hanging drops containing 5 µl protein and 5 µl well solution (22-26% PEG 4000 (Serva), 50 mM HEPES at pH 8.0-8.5, 50 mM MgCl2 and 10 mM 2-mercaptoethanol) within 5 days. The crystals were 0.005 mm across and up to 1.0 mm long. The space group is P2₁2₁2₁ with a = 51.8, b = 62.8, c = 70.7 Å, Z = 4. Two crystal forms of wild-type green fluorescent protein, unrelated to the present form, have been described by M.A. Perozzo, K.B. Ward, R.B. Thompson, and W.W. Ward, J. Biol. Chem. 263, 7713-7716 (1988).

The structure of the green fluorescent protein was determined by multiple isomorphous replacement and anomalous scattering (Table E), solvent flattening, phase combination and crystallographic refinement. The most distinctive feature of the fold of the green fluorescent protein is an eleven-stranded β-barrel wrapped around a single central helix (Figures 1A and 1B), where each strand consists of approximately 9-13 residues. The barrel forms a nearly perfect cylinder 42 Å long and 24 Å in diameter. The N-terminal half of the polypeptide comprises three anti-parallel strands, the central helix, and then three more anti-parallel strands, the last of which (residues 118-123) is parallel to the N-terminal strand (residues 11-23). The polypeptide then crosses the "bottom" of the molecule to form the second half of the barrel in a five-strand Greek Key motif. The top end of the cylinder is capped by three short, distorted helical segments, while one short, very distorted helical segment caps the bottom of the cylinder. The main-chain hydrogen bonding that laces the surface of the cylinder very likely accounts for the unusual stability of the protein toward denaturation and proteolysis.
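As a rough plausibility check on the unit-cell parameters reported above (a = 51.8, b = 62.8, c = 70.7 Å, Z = 4), the cell volume and the Matthews coefficient can be estimated as sketched below. The calculation assumes an orthorhombic cell (all angles 90°), consistent with only three edge lengths being reported, and assumes a molecular weight of about 27,000 Da for the crystallized protein; that mass is our assumption and is not stated in the text. The solvent fraction uses the common approximation 1 - 1.23/Vm. All function names are illustrative.

```python
# Unit-cell edges (angstroms) and molecules per cell, as reported in the text.
A, B, C = 51.8, 62.8, 70.7
Z = 4

# Assumption: approximate molecular weight of the crystallized protein (Da).
ASSUMED_MW_DA = 27_000.0


def orthorhombic_cell_volume(a, b, c):
    """Volume of a cell with all angles equal to 90 degrees (cubic angstroms)."""
    return a * b * c


def matthews_coefficient(cell_volume, z, mw_da):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return cell_volume / (z * mw_da)


def solvent_fraction(vm):
    """Approximate solvent content from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


volume = orthorhombic_cell_volume(A, B, C)
vm = matthews_coefficient(volume, Z, ASSUMED_MW_DA)
print(f"cell volume      : {volume:,.0f} cubic angstroms")
print(f"Matthews coeff.  : {vm:.2f} cubic angstroms per Da")
print(f"solvent fraction : {solvent_fraction(vm):.0%}")
```

With these inputs the Matthews coefficient comes out a little above 2 Å³/Da, within the range typically seen for protein crystals with one molecule per asymmetric unit, so the reported cell and Z are mutually consistent under the assumed mass.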
There are no large segments of the polypeptide that could be excised while preserving the integrity of the shell around the chromophore. It would therefore seem difficult to re-engineer the green fluorescent protein to reduce its molecular weight (J. Dopf and T.M. Horiagon, Gene 173: 39-43 (1996)) by a large percentage. The p-hydroxybenzylideneimidazolidinone chromophore (C.W. Cody, et al., Biochemistry 32: 1212-1218 (1993)) is completely protected from bulk solvent and is centrally located in the molecule. The total and presumably rigid encapsulation is probably responsible for the small Stokes shift (i.e., the wavelength difference between excitation and emission maxima), the high quantum yield of fluorescence, the inability of O2 to quench the excited state (B.D. Nageswara Rao, et al., Biophys. J. 32: 630-632 (1980)), and the resistance of the chromophore to titration of the external pH (W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press, pages 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)). It also allows one to rationalize why fluorophore formation should be a spontaneous intramolecular process (R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994)), since it is difficult to imagine how an enzyme could gain access to the substrate. The plane of the chromophore is roughly perpendicular (60°) to the symmetry axis of the surrounding barrel. One side of the chromophore faces a surprisingly large cavity that occupies a volume of approximately 135 Å³ (B. Lee and F.M. Richards, J. Mol. Biol. 55: 379-400 (1971)). The atomic radii were those of Lee and Richards, and the volume was calculated using the MS program with a probe radius of 1.4 Å (M.L. Connolly, Science 221: 709-713 (1983)). The cavity does not open out to bulk solvent.
Water molecules are located in the cavity, forming a chain of hydrogen bonds linking the buried side chains of Glu222 and Gln69. Unless it is occupied, such a large cavity would be expected to destabilize the protein by several kcal/mol (S.J. Hubbard, et al., Protein Engineering 7: 613-626 (1994); A.E. Eriksson, et al., Science 255: 178-183 (1992)). Part of the volume of the cavity could be the consequence of the compaction resulting from the cyclization and dehydration reactions. The cavity might also temporarily accommodate the oxidant, most likely O2 (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995); R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994); S. Inouye and F.I. Tsuji, FEBS Lett. 351: 211-214 (1994)), which dehydrogenates the α-β bond of Tyr66. Figure 2A shows the chromophore, the cavity, and the side chains that contact the chromophore, and Figure 2B shows a portion of the final electron density map in this vicinity. The opposite side of the chromophore is packed against several aromatic and polar side chains. Of particular interest is the intricate network of polar interactions with the chromophore (Figure 2C). His148, Thr203 and Ser205 form hydrogen bonds with the phenolic hydroxyl; Arg96 and Gln94 interact with the carbonyl of the imidazolidinone ring; and Glu222 forms a hydrogen bond with the side chain of Thr65. Additional polar interactions, such as hydrogen bonds to Arg96 from the carbonyl of Thr62 and from the side chain carbonyl of Gln183, presumably stabilize the buried Arg96 in its protonated form. In turn, this buried charge suggests that a partial negative charge resides on the carbonyl oxygen of the imidazolidinone ring of the deprotonated fluorophore, as previously suggested (W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press, pages 235-242 (1981); W.W.
Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)). It is likely that Arg96 is essential for the formation of the fluorophore, and it may help catalyze the initial ring closure. Finally, Tyr145 shows a typical stabilizing edge-face interaction with the benzyl ring. Trp57, the only tryptophan of the green fluorescent protein, is located 13 Å to 15 Å from the chromophore, and the long axes of the two ring systems are almost parallel. This indicates that efficient energy transfer to the latter should occur, and explains why no separate tryptophan emission is observable (D.C. Prasher, et al., Gene 111: 229-233 (1992)). The two cysteines in the green fluorescent protein, Cys48 and Cys70, are 24 Å apart, too distant to form a disulfide bridge. Cys70 is buried, but Cys48 should be relatively accessible to sulfhydryl-specific reagents. One such reagent, 5,5'-dithiobis(2-nitrobenzoic acid), is reported to label the green fluorescent protein and quench its fluorescence (S. Inouye and F.I. Tsuji, FEBS Lett. 351: 211-214 (1994)). This effect was attributed to the need for a free sulfhydryl, but could also reflect specific quenching by the 5-thio-2-nitrobenzoate moiety that would be attached to Cys48. Although the electron density map is for the most part consistent with the proposed structure of the chromophore (D.C. Prasher, et al., Gene 111: 229-233 (1992); C.W. Cody, et al., Biochemistry 32: 1212-1218 (1993)) in the cis [Z-] configuration, with no evidence for any substantial fraction of the opposite isomer around the chromophore double bond, difference features are found at >4σ in the final (Fo-Fc) electron density map that can be interpreted to represent either the intact, uncyclized polypeptide or a carbinolamine (inset to Figure 2). 
This suggests that a significant fraction, perhaps as much as 30 percent of the molecules in the crystal, have not gone through the final dehydration reaction. Confirmation of incomplete dehydration comes from electrospray mass spectrometry, which consistently shows that the average masses of both wild-type and S65T green fluorescent protein (31,086 ± 4 and 31,099.5 ± 4 Da, respectively) are 6-7 Da higher than predicted (31,079 and 31,093 Da, respectively) for the fully matured proteins. This discrepancy could be explained by a 30-35 mole percent fraction of apoprotein or carbinolamine with 18 or 20 Da higher molecular weight. The natural abundance of 13C and 2H and the finite resolution of the Hewlett-Packard 5989B electrospray mass spectrometer used to make these measurements do not permit the individual peaks to be resolved, but instead yield an average mass peak with a full width at half maximum of approximately 15 Da. The molecular weights shown include the His tag, which has the sequence MRGSHHHHHH GMASMTGGQQM GRDLYDDDDK DPPAEF (SEQ ID NO: 5). Mutants of the green fluorescent protein that increase the efficiency of fluorophore maturation may yield somewhat brighter preparations. In a model for the apoprotein, the Thr65-Tyr66 peptide bond is approximately in the α-helical conformation, while the Tyr66-Gly67 peptide appears to be tipped almost perpendicular to the helix axis by its interaction with Arg96. This further supports the speculation that Arg96 is important for generating the conformation required for cyclization, and possibly also for promoting the attack of Gly67 on the carbonyl carbon of Thr65 (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995)). Results of previous random mutagenesis have implicated several amino acid side chains as having substantial effects on the spectra, and the atomic model confirms that these residues are close to the chromophore. 
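The mass-spectrometry argument above is a simple two-component average, and can be sketched as follows. This is an illustrative calculation, not part of the original analysis; the function name is hypothetical, and the model assumes only two species (fully matured protein and a single heavier, immature form).

```python
# Sketch: estimate the mole fraction of immature protein from an average mass.
# Assumes a two-species mixture: mature protein plus one heavier species.

def immature_fraction(observed_mass, predicted_mature_mass, mass_excess):
    """Solve observed = predicted + f * mass_excess for the fraction f."""
    return (observed_mass - predicted_mature_mass) / mass_excess

if __name__ == "__main__":
    # Masses in daltons as quoted in the text; 18 and 20 Da are the
    # excess masses the text attributes to carbinolamine or apoprotein.
    for label, observed, predicted in (("wild type", 31086.0, 31079.0),
                                       ("S65T", 31099.5, 31093.0)):
        for excess in (18.0, 20.0):
            f = immature_fraction(observed, predicted, excess)
            print(f"{label}: excess {excess:.0f} Da -> {100 * f:.1f} mol %")
```

Because the measured masses carry an uncertainty of about ± 4 Da, the fractions obtained this way are approximate; they are consistent with, rather than proof of, the 30-35 mole percent figure.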
The T203I and E222G mutations have profound but opposite consequences on the absorption spectrum (T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)). T203I (with the wild-type Ser65) lacks the 475 nm absorbance peak that is usually attributed to the anionic chromophore, and shows only the 395 nm peak, which is thought to reflect the neutral chromophore (R. Heim, et al., Proc. Natl. Acad. Sci. USA 91: 12501-12504 (1994); T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)). Indeed, Thr203 is hydrogen-bonded to the phenolic oxygen of the chromophore, so replacement by Ile should hinder ionization of the phenolic oxygen. Mutation of Glu222 to Gly (T. Ehrig, et al., FEBS Letters 367: 163-166 (1995)) has much the same spectroscopic effect as replacing Ser65 by Gly, Ala, Cys, Val, or Thr, namely to suppress the 395 nm peak in favor of a peak at 470-490 nm (R. Heim, et al., Nature 373: 664-665 (1995); S. Delagrave, et al., Bio/Technology 13: 151-154 (1995)). Indeed, Glu222 and the remnant of Thr65 are hydrogen-bonded to each other in the present structure, probably with the uncharged carboxyl of Glu222 acting as a donor to the side chain oxygen of Thr65. The mutations E222G, S65A, and S65V would all suppress this hydrogen bond. To explain why only the wild-type protein has both excitation peaks, Ser65, unlike Thr65, may adopt a conformation in which its hydroxyl donates a hydrogen bond to Glu222 and stabilizes it as an anion, the charge of which then inhibits ionization of the chromophore. The structure also explains why some mutations appear neutral. For example, Gln80 is a surface residue far removed from the chromophore, which explains why its accidental and ubiquitous mutation to Arg seems to have no obvious intramolecular spectroscopic effect (M. Chalfie, et al., Science 263: 802-805 (1994)). 
The development of green fluorescent protein mutants with red-shifted excitation and emission maxima is an interesting challenge in protein engineering (A.B. Cubitt, et al., Trends Biochem. Sci. 20: 448-455 (1995); R. Heim, et al., Nature 373: 664-665 (1995); S. Delagrave, et al., Bio/Technology 13: 151-154 (1995)). Such mutants would also be valuable for avoiding cellular autofluorescence at short wavelengths, for simultaneous multicolor reporting of the activity of two or more cellular processes, and for exploiting fluorescence resonance energy transfer as a signal of protein-protein interaction (R. Heim and R.Y. Tsien, Current Biol. 6: 178-182 (1996)). Extensive attempts using random mutagenesis have shifted the emission maximum by at most 6 nm to longer wavelengths, to 514 nm (R. Heim and R.Y. Tsien, Current Biol. 6: 178-182 (1996)); the previously described "red-shifted" mutants merely suppressed the 395 nm excitation peak in favor of the 475 nm peak without any significant reddening of the 505 nm emission (S. Delagrave, et al., Bio/Technology 13: 151-154 (1995)). Because Thr203 is revealed to be adjacent to the phenolic end of the chromophore, we mutated it to polar aromatic residues such as His, Tyr, and Trp in the hope that the additional polarizability of their π systems would lower the energy of the excited state of the adjacent chromophore. All three substitutions did in fact shift the emission peak to greater than 520 nm (Table F). A particularly attractive mutation was T203Y/S65G/V68L/S72A, with excitation and emission peaks at 513 nm and 527 nm, respectively. These wavelengths are sufficiently different from those of previous green fluorescent protein mutants to be readily distinguishable by appropriate filter sets on a fluorescence microscope. The extinction coefficient, 36,500 M⁻¹ cm⁻¹, and the quantum yield, 0.63, are almost as high as those of S65T (R. Heim, et al., Nature 373: 664-665 (1995)). 
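The spectral figures quoted for the T203Y/S65G/V68L/S72A mutant can be combined into two commonly used summary quantities: the Stokes shift and a relative brightness (the product of extinction coefficient and quantum yield). The sketch below is illustrative only; the function names are hypothetical, and "brightness" as a product is a conventional figure of merit rather than a quantity defined in this description.

```python
# Sketch: Stokes shift and relative brightness from reported spectral values.
# Function names are illustrative, not taken from the source.

def stokes_shift_nm(excitation_nm, emission_nm):
    """Difference between emission and excitation maxima, in nanometers."""
    return emission_nm - excitation_nm

def stokes_shift_wavenumber(excitation_nm, emission_nm):
    """The same shift expressed in wavenumbers (cm^-1)."""
    return 1.0e7 / excitation_nm - 1.0e7 / emission_nm

def relative_brightness(extinction_coefficient, quantum_yield):
    """Extinction coefficient (M^-1 cm^-1) multiplied by quantum yield."""
    return extinction_coefficient * quantum_yield

if __name__ == "__main__":
    # Values quoted in the text for T203Y/S65G/V68L/S72A.
    excitation, emission = 513.0, 527.0
    print(stokes_shift_nm(excitation, emission), "nm")
    print(round(stokes_shift_wavenumber(excitation, emission)), "cm^-1")
    print(round(relative_brightness(36500.0, 0.63)), "M^-1 cm^-1")
```

Comparing the brightness product between mutants is only meaningful when the extinction coefficients and quantum yields were measured under comparable conditions.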
Comparisons of the green fluorescent protein of Aequorea with other protein pigments are instructive. Unfortunately, its closest characterized homologue, the green fluorescent protein of the sea pansy Renilla reniformis (O. Shimomura and F.H. Johnson, J. Cell. Comp. Physiol. 59: 223 (1962); J.G. Morin and J.W. Hastings, J. Cell. Physiol. 77: 313 (1971); H. Morise, et al., Biochemistry 13: 2656 (1974); W.W. Ward, Photochem. Photobiol. Reviews (K.C. Smith, ed.) 4: 1 (1979); W.W. Ward, Bioluminescence and Chemiluminescence (M.A. DeLuca and W.D. McElroy, eds.) Academic Press, pages 235-242 (1981); W.W. Ward and S.H. Bokman, Biochemistry 21: 4535-4540 (1982); W.W. Ward, et al., Photochem. Photobiol. 35: 803-808 (1982)), has not been sequenced or cloned, although its chromophore is derived from the same FSYG sequence as in the wild-type green fluorescent protein of Aequorea (R.M. San Pietro, et al., Photochem. Photobiol. 51: 63S (1993)). The closest analogue for which a three-dimensional structure is available is the photoactive yellow protein (PYP; G.E.O.
Borgstahl, et al., Biochemistry 34: 6278-6287 (1995)), a 14-kDa photoreceptor from halophilic bacteria. The photoactive yellow protein in its native dark state absorbs maximally at 446 nm and transduces light with a quantum yield of 0.64, rather closely matching the long-wavelength absorbance maximum of the wild-type green fluorescent protein near 475 nm and its fluorescence quantum yield of 0.72-0.85. The fundamental chromophore in both proteins is an anionic p-hydroxycinnamyl group, which is covalently attached to the protein by a thioester linkage in the photoactive yellow protein and by a heterocyclic iminolactam in the green fluorescent protein. Both proteins stabilize the negative charge on the chromophore with the help of buried cationic arginine and neutral glutamic acid groups, Arg52 and Glu46 in the photoactive yellow protein and Arg96 and Glu222 in the green fluorescent protein, although in the photoactive yellow protein the residues are close to the oxyphenyl ring, whereas in the green fluorescent protein they are nearer the carbonyl end of the chromophore. However, the photoactive yellow protein has an overall α/β fold with appropriate flexibility and signal transduction domains to enable it to mediate the cellular phototactic response, whereas the green fluorescent protein is a much more regular and rigid β-barrel, which minimizes parasitic dissipation of the excited state energy as thermal or conformational motions. The green fluorescent protein is an elegant example of how a visually appealing and extremely useful function, efficient fluorescence, can be spontaneously generated from a cohesive and economical protein structure.

A. Summary of Green Fluorescent Protein Structure Determination

Data were collected at room temperature in-house using either Molecular Structure Corp. R-axis or San Diego Multiwire Systems (SDMS) detectors (CuKα), and later at beamline X4A at the Brookhaven National Laboratory at the selenium absorption edge (λ = 0.979 Å) using image plates. Data were evaluated using the HKL package (Z. Otwinowski, in Proceedings of the CCP4 Study Weekend: Data Collection and Processing, L. Sawyer, N. Isaacs, S. Bailey, eds. (Science and Engineering Research Council (SERC), Daresbury Laboratory, Warrington, UK, 1991), pages 56-62; W. Minor, XDISPLAYF (Purdue University, West Lafayette, IN, 1993)), or the SDMS software (A.J. Howard, et al., Meth. Enzymol. 114: 452-471 (1985)). Each data set was collected from a single crystal. Heavy atom soaks were 2 mM in mother liquor for 2 days. Initial electron density maps were based on three heavy atom derivatives using in-house data, and were later replaced with the synchrotron data. The EMTS difference Patterson map was solved by inspection, then used to calculate difference Fourier maps of the other derivatives. Lack-of-closure refinement of the heavy atom parameters was performed using the Protein package (W. Steigemann, Ph.D. Thesis (Technical University, Munich, 1974)). The MIR maps were much poorer than the overall figure of merit would suggest, and it was clear that the EMTS isomorphous differences dominated the phasing. The enhanced anomalous occupancy for the synchrotron data provided a partial solution to the problem. Note that the phasing power for the synchrotron data was reduced, but the figure of merit remained unchanged. All experimental electron density maps were improved by solvent flattening using the program DM of the CCP4 package (CCP4: A Suite of Programs for Protein Crystallography (SERC Daresbury Laboratory, Warrington WA4 4AD, UK, 1979)), assuming a solvent content of 38 percent. Phase combination was performed with PHASCO2 of the Protein package using a weight of 1.0 on the atomic model. The heavy atom parameters were subsequently improved by refinement against the combined phases. 
Model building was carried out with FRODO and O (T.A. Jones, et al., Acta Crystallogr. Sect. A 47: 110 (1991); T.A. Jones, in Computational Crystallography, D. Sayre, ed. (Oxford University Press, Oxford, 1982), pages 303-317), and crystallographic refinement was performed with the TNT package (D.E. Tronrud, et al., Acta Cryst. A 43: 489-503 (1987)). Bond lengths and angles for the chromophore were estimated using CHEM3D (Cambridge Scientific Computing). Final refinement and model building were performed against the X4A selenomethionine data set, using (2Fo-Fc) electron density maps. Data beyond 1.9 Å resolution have not been used at this stage. The final model contains residues 2-229, because the terminal residues are not visible in the electron density map, and the side chains of several disordered surface residues have been omitted. The density is weak for residues 156-158, and the coordinates for these residues are unreliable. This disorder is consistent with previous analyses showing that residues 1 and 233-238 are dispensable but that further truncations may prevent fluorescence (J. Dopf and T.M. Horiagon, Gene 173: 39-43 (1996)). The atomic model has been deposited in the Protein Data Bank (accession code 1EMA).
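Table E reports a merging R-factor for each data set and a crystallographic R-factor for the refined model, both defined in the notes to that table. The sketch below shows how such residuals are computed from lists of measurements. It is a minimal illustration: the function names and the toy numbers are hypothetical and are not data from this work, and skipping reflections measured only once is a convention chosen here, which real data-reduction programs may handle differently.

```python
# Sketch: merging R-factor and crystallographic R-factor on toy data.
# The numbers below are made up for illustration; they are not from Table E.

def r_merge(observations):
    """Sum of |I - <I>| over all repeated measurements, divided by sum of I.

    `observations` maps a reflection index (h, k, l) to its measured
    intensities. Reflections measured only once are skipped (a choice
    made in this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

def r_factor(f_obs, f_calc):
    """Sum of ||Fobs| - |Fcalc|| divided by sum of |Fobs|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

if __name__ == "__main__":
    toy_observations = {
        (1, 0, 0): [100.0, 110.0],
        (0, 1, 0): [50.0, 50.0, 56.0],
        (0, 0, 1): [75.0],  # single measurement, skipped
    }
    print(f"Rmerge = {100 * r_merge(toy_observations):.1f} %")
    print(f"R      = {r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]):.3f}")
```

The isomorphous residual in the same table has the same form as the R-factor function, with derivative and native intensities in place of observed and calculated amplitudes.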
TABLE E

Diffraction Data Statistics

Crystal       Resolution (Å)   Total obs   Unique obs   Compl. (%)(a)   Compl. (shell)(b)   Rmerge (%)(c)   Riso (%)(d)
R-axis II
  Native      2.0               51907       13582       80              69                   4.1             5.8
  EMTS(e)     2.6               17727        6787       87              87                   5.7            20.6
  SeMet       2.3               44975       10292       92              88                  10.2             9.3
Multiwire
  HgI4-Se     3.0               15380        4332       84              79                   7.2            28.87
X4A
  SeMet       1.8              126078       19503       80              55                   9.3             9.4
  EMTS        2.3               57812        9204       82              66                   7.2            26.3

Phasing Statistics

Derivative    Resolution (Å)   Number of sites   Phasing power(f)   Phasing power (shell)   FOM(g)   FOM (shell)
In-house
  EMTS        3.0              2                 2.08               2.08                    0.77     .072
  SeMet       3.0              4                 1.66               1.28                    -        -
  HgI4-Se     3.0              9                 1.77               1.90                    -        -
X4A
  EMTS        3.0              2                 1.36               1.26                    0.77     .072
  SeMet       3.0              4                 1.31               1.08                    -        -

Atomic Model Statistics

Protein atoms                    1790
Solvent atoms                    94
Resolution range (Å)             20-1.9
Number of reflections (F > 0)    17676
Completeness (%)                 84
R factor(h)                      0.175
Mean B value (Å2)                24.1
Deviations from ideality
  Bond lengths (Å)               0.014
  Bond angles (°)                1.9
  Restrained B values (Å2)       4.3
Ramachandran outliers            0

Notes: (a) Completeness is the ratio of observed reflections to those theoretically possible, expressed as a percentage. (b) Shell indicates the highest resolution shell, typically 0.1-0.4 Å wide. (c) $R_{\mathrm{merge}} = \sum |I - \langle I \rangle| / \sum I$, where $\langle I \rangle$ is the mean of the individual observations of the intensities $I$. (d) $R_{\mathrm{iso}} = \sum |I_{\mathrm{DER}} - I_{\mathrm{NAT}}| / \sum I_{\mathrm{NAT}}$. (e) Derivatives were EMTS = ethylmercurithiosalicylate (residues modified: Cys48 and Cys70); SeMet = selenomethionine-substituted protein (Met1 and Met233 could not be located); HgI4-SeMet = double derivative, HgI4 on a SeMet background. (f) Phasing power = $\langle F_H \rangle / \langle E \rangle$, where $\langle F_H \rangle$ = r.m.s. heavy atom scattering and $\langle E \rangle$ = lack of closure. (g) FOM, mean figure of merit. (h) Standard crystallographic R-factor, $R = \sum \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| / \sum |F_{\mathrm{obs}}|$.

B. Spectral properties of Thr203 ("T203") mutants compared to S65T

The F64L, V68L and S72A mutations improved the folding of the green fluorescent protein at 37°C (B.P. Cormack, et al., Gene 173: 33 (1996)), but did not significantly shift the emission spectra.
TABLE F

Clone   Mutations                      Excitation max. (nm)   Extinction coefficient (10³ M⁻¹ cm⁻¹)   Emission max. (nm)
S65T    S65T                           489                    39.2                                    511
5B      T203H/S65T                     512                    19.4                                    524
6C      T203Y/S65T                     513                    14.5                                    525
10B     T203Y/F64L/S65G/S72A           513                    30.8                                    525
10C     T203Y/S65G/V68L/S72A           513                    36.5                                    527
11      T203W/S65G/S72A                502                    33.0                                    512
12H     T203Y/S65G/S72A                513                    36.5                                    527
20A     T203Y/S65G/V68L/Q69K/S72A      515                    46.0                                    527

The present invention provides novel long wavelength engineered fluorescent proteins. Although specific examples have been provided, the above description is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The scope of the invention should, therefore, be determined not with reference to the above description, but with reference to the appended claims along with their full scope of equivalents. All publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes, to the same extent as if each individual publication or patent document were individually so denoted.
LIST OF SEQUENCES (2) INFORMATION FOR S? Q ID NO: l: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 716 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: simple (D) ) TOPOLOGY: linear (ix) CHARACTERISTICS: (A) NAME / KEY: CDS (B) LOCATION: 1..714 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: l: ATG AGT AAA GGA GAA GAA CTT TTC ACT GCA GTT GTC CCA ATT CTT GTT 48 Met Ser Lys Gly Glu Glu Leu Phe Thr Ala Val Val Pro lie Leu Val 1 5 10 15 GAA TTA GAT GAT GTAT AAT GGG CAC AAA TTT TCT GTC AGT GGA GAG 96 Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu 20 25 30 GGT GAA GGT GAT GTA ACA TAC GGA AAA CTT ACC CTT AAA TTT ATT TGC 144 Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe lie Cys 35 40 45 ACT ACT GGA AAA CTA CCT GTT CCA TGG CCA ACA CTT GTC ACT ACT TTC 192 Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe 50 55 60 TCT TAT GGT GTT CAA TGC TTT TCA AGA TAC CCA GAT CAT ATG AAA CGG 240 Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg 65 70 75 80 CAT GAC TTT TTC AAG AGT GCC ATG CCC GAA GGT TAT GTA CAG CA AGA 288 His Asp Phe Phe Lys Ser Wing Met Pro Glu Gly Tyr Val Gln Gln Arg 85 90 95 ACT ATA TTT TTC AAA GAT GAC GGG AAC TAC AAG ACA CGT GCT GAA GTC 336 Thr lie Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Wing Glu Val 100 105 110 AAG TTT GAA GGT GAT ACC CTT GTT AAT AGA ATC GAG TTA AAA GGT ATT 384 Lys Phe Glu Gly Asp Thr Leu Val Asn Arg lie Gl u Leu Lys Gly lie 115 120 125 GAT TTT AAA GAA GAT GGA AAC ATT CTT GGA CAT AAA TTG GAA TAC AAC 432 Asp Phe Lys Glu Asp Gly Asn lie Leu Gly His Lys Leu Glu Tyr Asn 130 135 140 TAT AAC TCA CAC AAT GTA TAC ATC ATG GCA GAC AAA CA AAG AAT GGA 480 Tyr Asn Ser His Asn Val Tyr lie Met Wing Asp Lys Gln Lys Asn Gly 145 150 155 160 ATC AAA GTT AAC TTC AAA ATT AGA CAC AAC ATT GAA GAT GGA AGC GTT 528 lie Lys Val Asn Phe Lys lie Arg His Asn lie Glu Asp Gly Ser Val 165 170 175 CAA CTA GCA GAC TAT TAT CAA CAA AAT ACT CCA ATT CTC GAT GGC CCT 576 Gln Leu Wing Asp Tyr Tyr Gln Gln Asn Thr Pro lie Leu Asp Gly 
Pro 180 185 190 GTC CTT TTA CCA GAC AAC CAT TAC CTG TCC ACA CAA TCT GCC CTT TCG 624 Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Wing Leu Ser 195 200 205 AAA GAT CCC AAC GAA AAG AGA GAC CAC ATG GTC CTT CTT GAG TTT GTA 672 Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val 210 215 220 ACA GCT GCT GGG ATT ACA CAT GGC ATG GAT GAA CTA TAC AAA 714 Thr Ala Ala Gly He Thr His Gly Met Asp Glu Leu Tyr Lys 225 230 235 TA 716 (2) INFORMATION FOR SEQ ID NO: 2: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 238 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 2: Met Ser Lys Gly Glu Glu Leu Phe Thr Wing Val Val Pro He Leu Val 1 5 10 15 Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu 20 25 30 Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe He Cys 35 40 45 Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe 50 55 60 Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg 65 70 75 80 His Asp Phe Phe Lys Ser Wing Met Pro Glu Gly Tyr Val Gln Gln Arg 85 90 95 Thr He Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val 100 105 110 Lys Phe Glu Gly Asp Thr Leu Val Asn Arg He Glu Leu Lys Gly He 115 120 125 Asp Phe Lys Glu Asp Gly Asn He Leu Gly His Lys Leu Glu Tyr Asn 130 135 140 Tyr Asn Ser His Asn Val Tyr He Met Wing Asp Lys Gln Lys Asn Gly 145 150 155 160 He Lys Val Asn Phe Lys He Arg His Asn He Glu Asp Gly Ser Val 165 170 175 Gln Leu Wing Asp Tyr Tyr Gln Gln Asn Thr Pro He Leu Asp Gly Pro 180 185 190 Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Wing Leu Ser 195 200 205 Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val 210 215 220 Thr Ala Ala Gly He Thr His Gly Met Asp Glu Leu Tyr Lys 225 230 235 (2) INFORMATION FOR SEQ ID NO: 3: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 720 base pairs (B) ) TYPE: nucleic acid (C) TYPE OF CHAIN: simple (D) TOPOLOGY: linear (ix) FEATURE: (A) NAME / KEY: CDS (B) LOCATION: 1.720 
(xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 3: ATG GTG AGC AAG GGC GAG GG CTG TTC ACC GGG GTG GTG CCC ATC CTG 48 Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro He Leu 240 245 250 GTC GAG CTG GAC GGC GAC GAC AAC GGC CAC AAG TTC AGC GTG TCC GGC 96 Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly 255 260 265 270 GAG GGC GAG GGC GAT GCC ACC TAC GGC AAG CTG ACC CTG AAG TTC ATC 144 Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe He 275 280 285 TGC ACC ACC GGC AAG CTG CCC GTG CCC TGG CCC ACC CTC GTG ACC ACC 192 Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr 290 295 300 TTC GGC TAC GGC GTG CAG TGC TTC GCC CGC TAC CCC GAC CAC ATG AAG 240 Phe Gly Tyr Gly Val Gln Cys Phe Wing Arg Tyr Pro Asp His Met Lys 305 310 315 CAG CAG GAC TTC TTC AAG TCC GCC ATG CCC GAA GGC TAC GTC CAG GAG 288 Gln Gln Asp Phe Phe Lys Ser Wing Met Pro Glu Gly Tyr Val Gln Glu 320 325 330 CGC ACC ATC TTC TAG AAG GAC GAC GGC AAC TAC AAG ACC CGC GCC GAG 336 Arg Thr He Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Wing Glu 335 340 345 350 GTG AAG TTC GAG GGC GAC ACC CTG GTG AAC CGC ATC GAG CTG AAG GGC 384 Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg He Glu Leu Lys Gly 355 360 365 ATC GAC TTC AAG GAC GAC GGC AAC ATC CTG GGG CAC AAG CTG GAG TAC 432 He Asp Phe Lys Asp Asp Gly Asn He Leu Gly His Lys Leu Glu Tyr 370 375 380 AAC TAC AAC AGC CAC AAC GTC TAT ATC ATG GCC GAC AAG CAG AAG AAC 480 Asn Tyr Asn Ser His Asn Val Tyr He Met Wing Asp Lys Gln Lys Asn 385 390 395 GGC ATC AAG GTG AAC TTC AAG ATC CGC CAC AAC ATC GAG GAC GGC AGC 528 Gly He Lys Val Asn Phe Lys He Arg His Asn He Glu Asp Gly Ser 400 405 410 GTG CAG CCC GCC GAC CAC TAC CAG CAG AAC ACC CCC ATC GGC GAC GGC 576 Val Gln Pro Wing Asp His Tyr Gln Gln Asn Thr Pro He Gly Asp Gly 415 420 425 430 CCC GTG CTG CTG CCC GAC AAC CAC TAC CTG AGC TAC CAG TCC GCC CTG 624 Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Be Wing Leu 435 440 445 AGC AAA GAC CCC AAC GAG AAG CGC GAT CAC ATG GTC CTG CTG GAG TTC 672 Ser Lys 
Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe 450 455 460 GTG ACC GCC GCC GGG ATC ACT CAC GGC ATG GAC GAG CTG TAC AAG TAA 720 Val Thr Wing Wing Gly He Thr His Gly Met Asp Glu Leu Tyr Lys * 465 470 475 (2) INFORMATION FOR SEQ ID NO: 4: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 240 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 4: Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro He Leu 1 5 10 15 Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly 20 25 30 Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe He 35 40 45 Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr 50 55 60 Phe Gly Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys 65 70 75 80 Gln Gln Asp Phe Phe Lys Ser Wing Met Pro Glu Gly Tyr Val Gln Glu 85 90 95 Arg Thr He Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu 100 105 110 Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg He Glu Leu Lys Gly 115 120 125 He Asp Phe Lys Asp Asp Gly Asn He Leu Gly His Lys Leu Glu Tyr 130 135 140 Asn Tyr Asn Ser His Asn Val Tyr He Met Wing Asp Lys Gln Lys Asn 145 150 155 160 Gly He Lys Val Asn Phe Lys He Arg His Asn He Glu Asp Gly Ser 165 170 175 Val Gln Pro Wing Asp His Tyr Gln Gln Asn Thr Pro He Gly Asp Gly 180 185 190 Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu 195 200 205 Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe 210 215 220 Val Thr Wing Wing Gly He Thr His Gly Met Asp Glu Leu Tyr Lys * 225 230 235 240

Claims (100)

  1. A nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  2. The nucleic acid molecule of claim 1, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  3. The nucleic acid molecule of claim 1, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  4. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  5. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a mutation from Table A.
  6. The nucleic acid molecule of claim 1 or 2, wherein the amino acid sequence further comprises a folding mutation.
  7. The nucleic acid molecule of any of claims 1 to 3, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the substitution of at least one codon with a preferred mammalian codon.
  8. The nucleic acid molecule of any of claims 1 to 3, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
  9. An expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the amino acid substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  10. The expression vector of claim 9, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  11. The expression vector of claim 9, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  12. The expression vector of claim 10 or 11, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  13. The expression vector of claim 10 or 11, wherein the amino acid sequence comprises a mutation from Table A.
  14. The expression vector of claim 9 or 10, wherein the amino acid sequence further comprises a folding mutation.
  15. The expression vector of any of claims 9 to 11, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the substitution of at least one codon with a preferred mammalian codon.
  16. The expression vector of any of claims 9 to 11, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
  17. A recombinant host cell comprising an expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the amino acid substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  18. The recombinant host cell of claim 17, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  19. The recombinant host cell of claim 17, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  20. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  21. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a mutation from Table A.
  22. The recombinant host cell of claim 17 or 18, wherein the amino acid sequence further comprises a folding mutation.
  23. The recombinant host cell of any of claims 17 to 19, wherein the nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO: 1 by the substitution of at least one codon with a preferred mammalian codon.
  24. The recombinant host cell of any of claims 17 to 19, which encodes a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
  25. The recombinant host cell of any of claims 17 to 19, which is a prokaryotic cell.
  26. The recombinant host cell of any of claims 17 to 19, which is a eukaryotic cell.
  27. A functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  28. The protein of claim 27, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  29. The protein of claim 27, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  30. The protein of claim 27 or 28, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  31. The protein of claim 27 or 28, wherein the amino acid sequence further comprises a folding mutation.
  32. The protein of any of claims 27-29, which is a fusion protein, wherein the fusion protein comprises a polypeptide of interest and the functional engineered fluorescent protein.
  33. A fluorescently labeled antibody, comprising an antibody coupled to a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  34. The fluorescently labeled antibody of claim 33, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  35. The fluorescently labeled antibody of claim 33, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  36. The fluorescently labeled antibody of claim 33 or 34, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  37. The fluorescently labeled antibody of any one of claims 33-35, which is a fusion protein, wherein the fusion protein comprises the antibody fused to the functional engineered fluorescent protein.
  38. A nucleic acid molecule, comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  39. The nucleic acid molecule of claim 38, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  40. The nucleic acid molecule of claim 38, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  41. The nucleic acid molecule of claim 38 or 39, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  42. A fluorescently labeled nucleic acid probe, comprising a nucleic acid probe coupled to a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least the substitution T203X, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  43. The fluorescently labeled nucleic acid probe of claim 42, wherein the amino acid sequence further comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  44. The fluorescently labeled nucleic acid probe of claim 42, wherein the amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W.
  45. The fluorescently labeled nucleic acid probe of claim 42 or 43, wherein the amino acid sequence further comprises a substitution at Y66, wherein the substitution is selected from Y66H, Y66F and Y66W.
  46. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  47. The nucleic acid molecule of claim 46, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  48. An expression vector, comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  49. The expression vector of claim 48, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  50. A recombinant host cell, comprising an expression vector comprising expression control sequences operably linked to a nucleic acid molecule comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  51. The recombinant host cell of claim 50, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  52. A functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  53. The functional engineered fluorescent protein of claim 52, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  54. A fluorescently labeled antibody, comprising an antibody coupled to a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  55. The antibody of claim 54, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  56. A nucleic acid molecule, comprising a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  57. The nucleic acid molecule of claim 56, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  58. A fluorescently labeled nucleic acid probe, comprising a nucleic acid probe coupled to a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  59. The fluorescently labeled nucleic acid probe of claim 58, wherein the amino acid substitution is: L42X, where X is selected from C, F, H, W and Y; V61X, where X is selected from F, Y, H and C; T62X, where X is selected from A, V, F, S, D, N, Q, Y, H and C; V68X, where X is selected from F, Y and H; Q69X, where X is selected from K, R, E and G; Q94X, where X is selected from D, E, H, K and N; N121X, where X is selected from F, H, W and Y; Y145X, where X is selected from W, C, F, L, E, H, K and Q; H148X, where X is selected from F, Y, N, K, Q and R; V150X, where X is selected from F, Y and H; F165X, where X is selected from H, Q, W and Y; I167X, where X is selected from F, Y and H; Q183X, where X is selected from H, Y, E and K; N185X, where X is selected from D, E, H, K and Q; L220X, where X is selected from H, N, Q and T; E222X, where X is selected from N and Q; or V224X, where X is selected from H, N, Q, T, F, W and Y.
  60. A method for determining whether a mixture contains a target, comprising: contacting the mixture with a fluorescently labeled probe comprising a probe and the functional engineered fluorescent protein of claim 27 or claim 52; and determining whether the target has bound to the probe.
  61. The method of claim 60, wherein the target is bound to a solid matrix.
  62. A method for engineering a functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein, comprising substituting an amino acid that is located no more than 0.5 nm from any atom in the chromophore of an Aequorea-related green fluorescent protein with another amino acid, whereby the substitution alters a fluorescent property of the protein.
  63. The method of claim 62, wherein the amino acid substitution alters the electronic environment of the chromophore.
  64. A method for engineering a functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein, comprising substituting amino acids in a loop domain of an Aequorea-related green fluorescent protein with amino acids so as to create a consensus sequence for phosphorylation or proteolysis.
  65. A method for producing fluorescence resonance energy transfer, comprising: providing a donor molecule comprising the functional engineered fluorescent protein of claim 27 or claim 52; providing an appropriate acceptor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer.
  66. A method for producing fluorescence resonance energy transfer, comprising: providing an acceptor molecule comprising the functional engineered fluorescent protein of claim 27 or claim 52; providing an appropriate donor molecule for the fluorescent protein; and bringing the donor molecule and the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy transfer.
  67. The method of claim 66, wherein the donor molecule is an engineered fluorescent protein whose amino acid sequence comprises the T203I substitution and the acceptor molecule is a mutant fluorescent protein whose amino acid sequence comprises the T203X substitution, where X is an aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent protein having a fluorescent property different from that of Aequorea green fluorescent protein.
  68. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  69. An expression vector, comprising expression control sequences operably linked to a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  70. A functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one amino acid substitution located no more than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the substitution alters the electronic environment of the chromophore, whereby the functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  71. A crystal of a protein, comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2, wherein said crystal diffracts with at least a resolution of 2.0 to 3.0 Angstroms.
  72. The crystal of claim 71, wherein the fluorescent protein has at least 200 amino acids and a completeness value of at least 80%, and has a crystal stability within 0.5% of its unit cell dimensions.
  73. The crystal of claim 71, wherein the amino acid sequence comprises a substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.
  74. The crystal of claim 71, wherein said crystal has the following unit cell dimensions in Angstroms: a = 51.8, b = 62.8 and c = 70.7, with a space group of P 2 2 2 and an angle α of 90.00°, an angle β of 90.00°, and an angle γ of 90.00°, and the crystal has a diffraction limit where 90% or more of the potential reflections can be used to determine the coordinates of the atoms.
  75. A computational method of designing a fluorescent protein, comprising: determining, from a three-dimensional model of a crystallized fluorescent protein comprising a fluorescent protein with a bound ligand, at least one interacting amino acid of the fluorescent protein that interacts with at least a first chemical moiety of the ligand; and selecting at least one chemical modification of the first chemical moiety to produce a second chemical moiety with a structure that either decreases or increases an interaction between the interacting amino acid and the second chemical moiety compared to the interaction between the interacting amino acid and the first chemical moiety.
  76. The computational method of claim 75, further comprising generating the three-dimensional model of the crystallized protein comprising a fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2.
  77. The computational method of claim 75, wherein the selecting selects a first chemical moiety that interacts with at least one of the amino acids listed in Figures 5-1 through 5-28.
  78. The computational method of claim 75, wherein the chemical modification enhances the hydrogen bonding interaction, the charge interaction, the hydrophobic interaction, the Van der Waals interaction or the dipole interaction between the second chemical moiety and the interacting amino acid, as compared to the first chemical moiety and the interacting amino acid.
  79. A computational method of modeling the three-dimensional structure of a fluorescent protein, comprising determining a three-dimensional relationship between at least two atoms listed in the atomic coordinates of Figures 5-1 through 5-28.
  80. The computational method of claim 79, wherein the determining comprises determining the three-dimensional structure of a fluorescent protein with an amino acid sequence at least 80% identical to SEQ ID NO: 2.
  81. The computational method of claim 79, wherein the determining comprises determining the three-dimensional structure of a fluorescent protein with an amino acid sequence at least 95% identical to SEQ ID NO: 2.
  82. The computational method of claim 79, wherein the determining comprises determining the three-dimensional relationship of at least 1,500 atoms listed in Figures 5-1 through 5-28.
  83. A device comprising a storage device and, stored in the device, at least 10 atomic coordinates selected from the atomic coordinates listed in Figures 5-1 through 5-28.
  84. The device of claim 83, wherein the storage device is a computer-readable device that stores code which receives the atomic coordinates as input.
  85. The device of claim 84, wherein the computer-readable device is a floppy disk or a hard disk.
  86. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one substitution at Q69, wherein said functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  87. The nucleic acid molecule of claim 86, wherein said substitution at Q69 is selected from the group of K, R, E and G.
  88. The nucleic acid molecule of claim 86, wherein said amino acid sequence further comprises a functional mutation at S65.
  89. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one substitution at E222, but not including E222G, wherein said functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  90. The nucleic acid molecule of claim 89, wherein said substitution at E222 is selected from the group of N and Q.
  91. The nucleic acid molecule of claim 89, wherein said amino acid sequence further comprises a functional mutation at F64.
  92. A nucleic acid molecule, comprising a nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO: 2) and which differs from SEQ ID NO: 2 by at least one substitution at Y145, wherein said functional engineered fluorescent protein has a fluorescent property different from that of Aequorea green fluorescent protein.
  93. The nucleic acid molecule of claim 92, wherein said substitution at Y145 is selected from the group of W, C, F, L, E, H, K and Q.
  94. The nucleic acid molecule of claim 92, wherein said amino acid sequence further comprises a functional mutation at Y66.
  95. A method of identifying a test chemical, comprising: contacting a test chemical with a sample containing a biological entity labeled with a functional engineered fluorescent protein or a polynucleotide encoding said functional engineered fluorescent protein; and detecting fluorescence of said functional engineered fluorescent protein.
  96. The method of claim 95, wherein said fluorescence in the presence of a test chemical is greater than in the absence of said test chemical.
  97. The method of claim 96, wherein said polynucleotide encoding said functional engineered fluorescent protein is operably linked to a genomic polynucleotide.
  98. The method of claim 95, wherein said functional engineered fluorescent protein is fused to a second functional protein.
  99. The method of claim 96, wherein said polynucleotide encoding said functional engineered fluorescent protein is operably linked to a response element.
  100. The method of claim 96, wherein said polynucleotide encoding said functional engineered fluorescent protein is operably linked to a response element in a mammalian cell.
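
Several of the claims above (for example claims 62 and 68 to 70) turn on whether a substituted amino acid lies within about 0.5 nm of the chromophore. Purely as an illustration, and not as part of the claims, the following minimal Python sketch shows one way such a distance test could be carried out on atomic coordinates. The function name `residues_near_chromophore`, the data layout, and all coordinates are hypothetical; real coordinates would come from a structure such as the one referenced in Figures 5-1 through 5-28. Coordinates are assumed to be in Angstroms, so 0.5 nm corresponds to 5 Angstroms.

```python
import math

CUTOFF_ANGSTROM = 5.0  # 0.5 nm expressed in Angstroms


def residues_near_chromophore(atoms, chromophore_atoms, cutoff=CUTOFF_ANGSTROM):
    """Return sorted residue labels having at least one atom within
    `cutoff` of any chromophore atom.

    atoms: iterable of (residue_label, (x, y, z)) tuples (hypothetical layout).
    chromophore_atoms: iterable of (x, y, z) tuples.
    """
    chromophore_atoms = list(chromophore_atoms)
    near = set()
    for label, xyz in atoms:
        # Euclidean distance from this atom to each chromophore atom.
        if any(math.dist(xyz, c) <= cutoff for c in chromophore_atoms):
            near.add(label)
    return sorted(near)


# Made-up coordinates for illustration only (not taken from the patent figures).
chromophore = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
protein_atoms = [
    ("T203", (3.0, 2.0, 1.0)),
    ("E222", (0.0, 4.5, 0.0)),
    ("K238", (20.0, 15.0, 9.0)),
]
print(residues_near_chromophore(protein_atoms, chromophore))
```

With these made-up inputs, the two residues placed close to the chromophore atoms are reported and the distant one is not. The sketch applies the test per atom, following the "any atom in the chromophore" wording of claim 62.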
MXPA/A/1998/002972A 1996-08-16 1998-04-16 Long wavelength engineered fluorescent proteins MXPA98002972A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60/024,050 1996-08-16
US08706408 1996-08-30

Publications (1)

Publication Number Publication Date
MXPA98002972A (en) 2000-06-05

Similar Documents

Publication Publication Date Title
AU727088B2 (en) Long wavelength engineered fluorescent proteins
US8263412B2 (en) Long wavelength engineered fluorescent proteins
US6608189B1 (en) Fluorescent protein sensors for measuring the pH of a biological sample
US6469154B1 (en) Fluorescent protein indicators
US20030212265A1 (en) Fluorescent protein sensors for measuring the pH of a biological sample
WO2002068605A2 (en) Non-oligomerizing tandem fluorescent proteins
WO2000071565A9 (en) Fluorescent protein indicators
US6699687B1 (en) Circularly permuted fluorescent protein indicators
AU767375B2 (en) Long wavelength engineered fluorescent proteins
MXPA98002972A (en) Long wavelength engineered fluorescent proteins
AU2004200425A1 (en) Long wavelength engineered fluorescent proteins