WO1994028133A1

WO1994028133A1 - Recombinant neu differentiation factors

Info

Publication number: WO1994028133A1
Application number: PCT/US1994/005769
Authority: WO
Inventors: Duanzhi Wen; Raymond Allen Koski; Glenn Francis Pierce; Shaw-Fren Sylvia Hu; Barry J. Sugarman; Naili Liu
Original assignee: Amgen Inc.
Priority date: 1993-05-21
Filing date: 1994-05-23
Publication date: 1994-12-08
Also published as: AU7042994A

Abstract

Non-naturally occurring polypeptides are described which stimulate neu receptor phosphorylation and can be prepared by recombinant DNA methods from nucleotide sequences obtained from human cells and tissues. These polypeptides are useful to treat various human conditions involving cells that express the neu receptor and respond to modulation by the polypeptides. Also described are DNA molecules encoding the polypeptides, or analogs of the polypeptides, methods of recombinant DNA production of the polypeptides or analogs from the DNA molecules, biological materials useful in these methods such as expression vectors and transformed or transfected host cells, and pharmaceutical compositions containing the polypeptides.

Description

RECOMBINANT NEU DIFFERENTIATION FACTORS

Cross-Reference to Related Application

This application is a continuation-in-part of copending application number 07/877,431, filed April 29, 1992, which is incorporated by reference herein in its entirety.

Field Of The Invention

This invention relates to novel polypeptides (herein referred to as neu differentiation factors), produced by recombinant DNA methods, which interact with and stimulate the neu receptor and modulate cellular function. The invention includes analogs and derivatives of such polypeptides, as well as nucleotide sequences encoding the polypeptides, analogs and derivatives, and methods of use for the polypeptides and nucleotide sequences.

Background Of The Invention

Cell growth and differentiation are regulated in part by extracellular signals that are mediated by polypeptide molecules (Aaronson, S.A., Science 254:1146-1152, 1991). The interaction between these factors and specific cell surface receptors initiates a biochemical cascade culminating in nuclear events that regulate gene expression and DNA replication (Ullrich, A. and Schlessinger, J., Cell 61:203-212, 1990). This mechanism of cell regulation has implications for the determination of cell fate during development, and breakdown of this mechanism may lead to oncogenic transformation. The latter can be induced by constitutive production of growth regulatory factors or by altered forms of their cognate receptors (Yarden, Y., and Ullrich, A., Ann. Rev. Biochem. 57:443-448, 1988). The oncogenic receptors include a family of transmembrane glycoproteins that share a common catalytic function in their cytoplasmic domains, namely, a tyrosine-specific protein kinase activity (Hanks, S.K., Cur. Op. Struct. Biol. 1:364-383, 1991).

The neu proto-oncogene (also known as HER-2 or c-erbB-2) encodes a 185 kilodalton (kDa) receptor tyrosine kinase. This p185^neu glycoprotein is known to be present in many epithelial and neural tissues (Maguire, H.C. et al., J. Invest. Dermatol. 92:186-190, 1989; Press, M.F. et al., Oncogene 5:953-962, 1990; Gullick, W.J. et al., Int. J. Cancer 40:246-254, 1987; Quirke, P.A. et al., Br. J. Cancer 60:64-69, 1989; Kokai, Y. et al., Proc. Natl. Acad. Sci. USA 84:8498-8501, 1987; Natali, P.G. et al., Int. J. Cancer 45:457-461, 1990; Cohen, J.A. et al., J. Neurosci. Res. 31:622-634, 1992; Mori, S. et al., Lab. Invest. 61:93-97, 1989). Although adult tissues generally express less p185^neu than corresponding fetal tissues, p185^neu expression levels are frequently elevated in neoplastic adult tissues. Overexpression of p185^neu or neu gene amplification occurs in approximately twenty percent of breast, stomach, bladder and ovarian adenocarcinomas (Lofts, F.J., and Gullick, W.J., Cancer Treat. Res. 61:161-179, 1992), and is associated with poor prognosis for breast and ovarian cancers (Slamon, D.J. et al., Science 244:707-712, 1989). Increased p185^neu expression also occurs in certain non-malignant neoplasias, such as adenomatous polyps (D'Emilia, J.K. et al., Oncogene 4:1233-1239, 1989; Cohen, J.A. et al., Oncogene 4:1233-1239, 1989), Barrett's esophagus (Jankowski, J.G. et al., Gut 33:1033-1038, 1992), and polycystic kidneys (Herrera, G., Kidney Int. 40:509-513, 1991).

Polypeptide molecules that activate receptor tyrosine kinases induce receptor tyrosine phosphorylation, thereby initiating intracellular signaling leading to cellular responses (Cantley, L.C. et al., Cell 64:281-302, 1991). Phosphorylation of the p185^neu receptor on tyrosine residues can be stimulated by proteins purified from several sources (Peles, E. et al., Cell 69:205-216, 1992; Lupu, R. et al., Science 249:1552-1555, 1990; Holmes, W.E. et al., Science 256:1205-1210, 1992; Huang, S.S., and Huang, J.S., J. Biol. Chem. 267:11508-11512, 1992; Dobashi, K. et al., Proc. Natl. Acad. Sci. USA 88:8582-8586, 1991).

Purification of rat and human p185^neu stimulatory proteins led to the isolation of cDNAs encoding novel epidermal growth factor (EGF)-related proteins (Peles, E. et al., Cell, 1992, above; Wen, D. et al., Cell 69:559-572, 1992; Holmes, W.E. et al., Science, 1992, above). The 44 kDa rat factor, named neu differentiation factor (NDF), stimulates p185^neu tyrosine phosphorylation and induces the production of milk components (casein and lipids) in certain breast carcinoma cell lines (Peles, E. et al., Cell, 1992, above). The NDF cDNA sequences were predictive of a transmembrane glycoprotein precursor (proNDF) with an EGF-like domain, an immunoglobulin homology unit, and a 157 amino acid cytoplasmic domain. The recombinant version of this rat NDF was found to interact with p185^neu and stimulate tyrosine phosphorylation of the receptor in human tumor cells of breast, colon and neuronal origin (Peles, E. et al., EMBO J. 12:961-971, 1993). In situ hybridization with an NDF probe identified the central and peripheral nervous systems as prominent sites of NDF expression in mouse embryos (Orr-Urtreger, A. et al., Proc. Natl. Acad. Sci. USA 90:1867-1871, 1993).

As part of the invention now described herein, a number of human NDF cDNA clones and human proNDF polypeptides encoded by them have been discovered.

Various recombinant human NDF polypeptides have been made using bacterial and mammalian expression systems and characterized as to biological activities, as described in the following text. These human polypeptides add to the expanding knowledge of NDF-related molecules, which now also includes glial growth factors (Marchionni, M.A. et al., Nature 362:312-318, 1993), as well as chicken acetylcholine receptor inducing activity, or "ARIA" (Falls, D.L. et al., Cell 72:801-815, 1993). See, also, published PCT application WO 92/18627, which describes Schwann cell-mitogenic factors of 30-36 kDa and 55-63 kDa.

Summary of the Invention

In accordance with this invention, non-naturally occurring polypeptides encoded by human nucleotide sequences are provided which function as stimulators and inducers of neu (or Her-2, or c-erbB-2) receptor activities. These polypeptides include recombinant DNA-derived neu receptor stimulating factors expressed from cDNA clones isolated from human tissues and cell lines. Such polypeptides possess an ability to stimulate human p185^neu tyrosine phosphorylation. These recombinant polypeptides can be termed neu receptor stimulating factors, but are preferably referred to herein as neu differentiation factors (or "NDFs"), based on their ability to induce a differentiated phenotype in certain cell lines. The polypeptides are also characterized by their ability to stimulate or inhibit proliferation of certain cells. These polypeptides are encoded by nucleotide sequences which are also described in detail herein.

The neu differentiation factors of the present invention are also characterized by the inclusion in all instances of either of the following two amino acid sequences in their polypeptide structure:

"Alpha" Form [SEQ ID NO: 1]

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal LysAspLeuSerAsnProSerArgTyrLeuCysLysCysGlnProGlyPheThr GlyAlaArgCys

"Beta" Form [SEQ ID NO: 2]

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal LysAspLeuSerAsnProSerArgTyrLeuCysLysCysProAsnGluPheThr GlyAspArgCys
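The two signature sequences can be compared residue by residue. The following is a minimal illustrative sketch, not part of the disclosed methods: it converts the three-letter sequences exactly as printed above into one-letter code and lists the positions at which the "Alpha" and "Beta" forms differ. The function names `to_one_letter` and `differing_positions` are hypothetical, introduced only for this example.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Sequences copied from the text above (SEQ ID NO: 1 and SEQ ID NO: 2).
ALPHA_3 = ("CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal"
           "LysAspLeuSerAsnProSerArgTyrLeuCysLysCysGlnProGlyPheThr"
           "GlyAlaArgCys")
BETA_3 = ("CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal"
          "LysAspLeuSerAsnProSerArgTyrLeuCysLysCysProAsnGluPheThr"
          "GlyAspArgCys")


def to_one_letter(seq3):
    """Convert a run of three-letter residue codes to one-letter code."""
    codes = re.findall(r"[A-Z][a-z]{2}", seq3)
    return "".join(THREE_TO_ONE[code] for code in codes)


def differing_positions(a, b):
    """Return (1-based position, residue in a, residue in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for this comparison")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


alpha = to_one_letter(ALPHA_3)
beta = to_one_letter(BETA_3)
print(alpha)
print(beta)
print(differing_positions(alpha, beta))
```

As transcribed here, both sequences are 40 residues long, each contains six cysteine residues, and the two forms differ only at positions 32 to 34 and position 38.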

The isolated DNA sequences provided by the present invention are useful in securing expression of the polypeptides in procaryotic and/or eucaryotic host cells. The present invention specifically provides DNA sequences encoding all or part of the unprocessed amino acid sequences (proNDFs) as well as DNA sequences encoding all or part of the processed (mature) forms of the NDFs and novel recombinant analogs thereof. Such DNA sequences include:

(a) the DNA sequences set out in Figures 30, 31, 32, 33, 34, 35, 36, 37 [SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15 and 17, respectively] and their complementary strands;

(b) DNA sequences which hybridize to the DNA sequences defined in (a) or fragments thereof; and

(c) DNA sequences which, but for the degeneracy of the genetic code, would hybridize to the DNA sequences defined in (a) and (b).
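The degeneracy referred to in part (c) can be illustrated with a short sketch. It uses the standard genetic code to show that two different DNA sequences can encode the same peptide, here the first four residues of SEQ ID NO: 1 (CysAlaGluLys). The two DNA strings are hypothetical codon choices made up for this example and are not taken from the Figures; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, reading frame 1, no stop handling."""
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_encodings(peptide):
    """Count the distinct DNA sequences that encode the given peptide."""
    total = 1
    for residue in peptide:
        total *= sum(1 for aa in CODON_TABLE.values() if aa == residue)
    return total


# Hypothetical codon choices for Cys-Ala-Glu-Lys (not from the patent figures).
seq_a = "TGTGCTGAAAAA"
seq_b = "TGCGCCGAGAAG"

print(translate(seq_a), translate(seq_b))
print(count_encodings("CAEK"))
```

Both strings translate to the same four residues even though they differ at the third position of every codon, which is why sequences of this kind are claimed separately from those defined by hybridization alone.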

Specifically included in parts (b) and (c) are complementary DNA (cDNA) and genomic DNA sequences encoding variant forms of human NDF, manufactured DNA sequences (e.g., as by solid phase chemical synthesis) encoding human NDF, fragments of human NDF, and analogs of human NDF. The DNA sequences may incorporate codons facilitating transcription and translation of messenger RNA in host cells.

Also provided are vectors containing such DNA sequences, and host cells transformed or transfected with such vectors. Additionally, the invention includes methods of producing biologically active human NDF polypeptides by recombinant DNA techniques, and methods of treating disorders using recombinant NDFs.

Pharmaceutical compositions containing recombinant human NDF polypeptides and antibodies generated with such polypeptides are also encompassed by this invention.

Brief Description Of The Figures

Figure 1 depicts a high pressure liquid chromatogram of the trypsin digest of "naturally-occurring" neu receptor stimulating factor purified from media conditioned by ras-transformed rat fibroblasts according to the methods of Peles, E. et al., Cell, 1992, above. The elution profile is shown, and the amino acid sequences obtained from the fractions corresponding to each peak are indicated in conventional single letter designation. The amino acid in parentheses represents an asparagine-linked glycosylation site, while the dotted line represents a longer sequence in which the amino acids were not identified. The inset shows re-chromatographic separation of the 27.3-minute peak after dithiothreitol reduction.

Figure 2 shows the structure of the mammalian COS-7 cell expression vector pJT-2/NDF. The rat NDF cDNA insert is the clone 44 cDNA sequence of Figure 5 [SEQ ID NO: 21].

Figure 3 demonstrates stimulation of human neu receptor tyrosine phosphorylation by recombinant rat NDF. Human MDA-MB-453 breast carcinoma cells were incubated with concentrated conditioned media from COS-7 cells transfected with the indicated partially purified cDNA clones (left panel) or with fully purified cDNA clones (right panel). Positive controls comprised 10 ng/ml (left lane) and 100 ng/ml of purified naturally-occurring rat NDF. Negative controls comprised concentrated media conditioned by untransfected COS-7 cells (lane designated "COS"), or by cells transfected with pJT-2 plasmids containing unrelated cDNA (lanes marked as "clone 27" and "clone 29"), and no addition ("NONE"). Lysates were prepared from the stimulated MDA-MB-453 cells and subjected to SDS-PAGE in a 10% gel (Novex). Shown are autoradiograms of anti-phosphotyrosine Western blots, with the locations of molecular weight marker proteins given in kilodaltons. The following volumes of the COS-7 supernatants were used in a total volume of 0.125 ml PBS containing 0.1% BSA: 0.01 ml (left lane) and 0.1 ml (right lane) for clones 4, 19 and 44. Assays of the purified clones were performed in 0.25 ml total volume with 0.2 ml of cell supernatants or as indicated.

Figure 4 shows the nucleotide sequence of rat NDF cDNA [SEQ ID NO: 19] and a deduced amino acid sequence [SEQ ID NO: 20]. Depicted is a combined nucleotide sequence from four rat cDNA clones. The beginning of the clone 44 DNA sequence in particular is indicated by an arrow. Nucleotide numbers and amino acid numbers are given in the left and right columns, respectively. Potential sites of N-linked glycosylation are marked by asterisks, and cysteine residues found in the presumed extracellular domain are encircled. The overlining indicates peptide sequences that were determined directly from purified naturally-occurring NDF. The underlining indicates the potential transmembrane region, whereas the dashed underlining indicates the polyadenylation sites. The portions of the protein sequence that contain an immunoglobulin (Ig) homology unit and an epidermal growth factor (EGF)-like motif are indicated on the right hand side.

Figure 5 shows the nucleotide sequence of rat NDF cDNA clone 44 [SEQ ID NO: 21], separately, and a deduced amino acid sequence [SEQ ID NO: 22]. Nucleotide numbers are given in the right hand column.

Figure 6 depicts the hydropathy profile of the precursor of the rat neu receptor stimulating factor encoded by rat cDNA clone 44. The method of Kyte and Doolittle, J. Mol. Biol. 157:105-132 (1982), was used with a window size of nine amino acid residues. Positive values indicate increasing hydrophobicity. Amino acid numbers are given below the profile.
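A hydropathy profile of the kind described for Figure 6 can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale. The code below is an illustrative assumption about how such a profile is computed with a nine-residue window; it is not the software used to prepare the figure, and the test sequence is a hypothetical peptide, not the clone 44 precursor.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of each full window along the sequence.

    Element i covers residues i .. i+window-1 (0-based), so a sequence of
    length n yields n - window + 1 values. Positive values indicate
    increasing hydrophobicity.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide: charged flanks around a hydrophobic stretch.
example = "KDEKRSDEK" + "LLVIALLVIGAVLL" + "RKKDERSEK"
profile = hydropathy_profile(example, window=9)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}")
```

In a profile like this, a sustained run of strongly positive values is what would suggest a candidate membrane-spanning segment.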

Figure 7 shows the alignment of the amino acid sequence [SEQ ID NO: 23] of the EGF-like domain and the flanking carboxyl terminal sequence of rat NDF from clone 44 with representative members of the EGF family. Alignment and numbering begin at the most amino terminal cysteine residue of the EGF motifs. Amino acid residues are indicated by the single letter code. Dashes indicate gaps that were introduced for maximal alignment. Residues are boxed to indicate their identity with the corresponding amino acids of NDF. The underlines indicate the amino terminal portions of the putative transmembrane domains that flank some of the EGF motifs at their carboxy terminal side. Asterisks show the carboxyl termini. The following proteins are compared to NDF: rat transforming growth factor α [SEQ ID NO: 24] (TGFα; Marquardt, H. et al., Science 223:1079-1082, 1984); human amphiregulin [SEQ ID NO: 25] (AR; Shoyab, M. et al., Science 243:1074-1076, 1989); sheep fibroma virus growth factor [SEQ ID NO: 30] (SFGF; Chang, W. et al., Mol. Cell. Biol. 7:535-540, 1987); myxoma virus growth factor [SEQ ID NO: 31] (MGF; Upton, C. et al., J. Virol. 51:1271-1275, 1987); mouse EGF [SEQ ID NO: 26] (Gray, A. et al., Nature 303:722-725, 1983; Scott, J. et al., Science 221:236-240, 1983); human heparin-binding EGF [SEQ ID NO: 27] (HB-EGF; Higashiyama, S. et al., Science 252:936-939, 1991); rat schwannoma-derived growth factor [SEQ ID NO: 28] (SDGF; Kimura, H. et al., Nature 348:257-260, 1990); and vaccinia virus growth factor [SEQ ID NO: 29] (VGF; Blomquist, M.D. et al., Proc. Natl. Acad. Sci. USA 81:7363-7367, 1984).

Figure 8 shows the alignment of the immunoglobulin (Ig)-like domain of rat NDF [SEQ ID NO: 32] from cDNA clone 44 with the fourth Ig-related sequence of the murine neural cell adhesion molecule [SEQ ID NO: 33] (NCAM; Barthels, D. et al., EMBO J. 6:907-914, 1987), a representative protein of the C2-set of the immunoglobulin superfamily. Residues that are highly conserved in the C2-set are indicated on the lower line, and positions conserved across the superfamily (according to Williams, A., and Barclay, A., Rev. Immunol. 6:381-405, 1988) are overlined. Stretches of amino acids that may be involved in β-strand formation are heavily underlined and labeled B through F, in analogy with the immunoglobulin domains. Boxes indicate identical residues in NDF and NCAM. Dashes (gaps) were introduced for maximal alignment.

Figure 9 is a schematic presentation of the presumed secondary structure and membrane orientation of the precursor of the neu receptor stimulating factor from rat cDNA clone 44. Positions corresponding to the immunoglobulin (Ig)-domain and epidermal growth factor (EGF) motif are shown by thick lines and their cysteine residues are indicated by circles. The disulfide linkage of the Ig domain was directly demonstrated by amino acid sequence analysis (Example 1D, below), and the secondary structure of the EGF-domain is based on the homology with the EGF family. Also shown are the three cysteine residues found in the transmembrane domain. Arrows mark the processing sites of the precursor protein at the amino terminus (based on N-terminal sequencing of rat NDF protein) and a putative proteolytic site close to the plasma membrane. The branched line indicates the location of a proven N-glycosylation site and the short vertical lines represent presumed sites of O-glycosylation.

Figure 10 depicts a receptor competition assay using naturally-occurring neu receptor stimulating factor purified from ras-transformed rat fibroblast conditioned media (Peles, E. et al., Cell, 1992, above) which has been radiolabeled ("¹²⁵I-NDF"). The radiolabeled NDF was incubated with human MDA-MB-453 breast carcinoma cells in the absence ("NONE") or presence of conditioned media from COS-7 cells expressing either recombinant neu receptor stimulating factor from rat cDNA clone 44 ("C-NDF") or recombinant TGFα ("C-TGFα"). The cell-bound radioactivity is shown as average ± S.D. (n=3).

Figure 11 depicts another receptor competition assay. ¹²⁵I-labeled naturally-occurring NDF was incubated with human MDA-MB-453 cells in the absence ("NONE") or presence of unlabeled "cold" naturally-occurring NDF ("NDF"; 200 ng/ml), or media conditioned by COS-7 cells that were transfected either with the NDF expression plasmid of Fig. 2 ("C-NDF"), containing rat cDNA clone 44, or with a TGFα-encoding vector ("C-TGFα"). Following crosslinking with BS3, the cells were lysed and the neu receptor protein was immunoprecipitated by using a monoclonal antibody. Shown is an autoradiogram (3-day exposure) of the polyacrylamide gel-separated immunocomplexes. Molecular weights of marker proteins are indicated in kilodaltons. Mostly the presumed receptor dimer was radiolabeled.

Figure 12 shows autoradiograms obtained from Northern blots of mRNA isolated from cultured cells (panels A and C) or freshly isolated tissue of an adult rat (panel B), using rat NDF cDNA clone 44 as a probe. The autoradiograms were obtained after a three-hour (panel A) or twenty-four hour (panels B and C) exposure. Molecular weight estimation (in kilobases) was performed by using a mixture of marker molecules from GIBCO BRL Life Technologies, Inc. (Gaithersburg, MD).

Figure 13 shows the nucleotide sequence of rat NDF cDNA clone 4 [SEQ ID NO: 34] and a deduced partial rat (proNDF-β3) amino acid sequence [SEQ ID NO: 35]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included. The amino acid numbering corresponds to the amino acid numbering of Figure 23 (composite figure for rat NDFs).

Figure 14 shows the nucleotide sequence of rat NDF cDNA clone 19 [SEQ ID NO: 36] and a deduced rat (proNDF-α2b) amino acid sequence [SEQ ID NO: 37]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Linker sequences added to the 3' ends of the cDNA to facilitate cloning are included.

Figure 15 shows the nucleotide sequence of rat NDF cDNA clone 20 [SEQ ID NO: 38] and a deduced rat (proNDF-α2b) amino acid sequence [SEQ ID NO: 39].

Figure 16 shows the nucleotide sequence of rat NDF cDNA clone 22 [SEQ ID NO: 40] and a deduced rat (proNDF-β2a) amino acid sequence [SEQ ID NO: 41].

Figure 17 shows the nucleotide sequence of rat NDF cDNA clone 38 [SEQ ID NO: 42] and a deduced rat (proNDF-α2a) amino acid sequence [SEQ ID NO: 43].

Figure 18 shows the nucleotide sequence of rat NDF cDNA clone 40 [SEQ ID NO: 44] and a deduced partial rat (proNDF-β2) amino acid sequence [SEQ ID NO: 45]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Linker sequences added to the 3' ends of the cDNA to facilitate cloning are included.

Figure 19 shows the nucleotide sequence of rat NDF cDNA clone 41 [SEQ ID NO: 46] and a deduced rat (proNDF-β2a) amino acid sequence [SEQ ID NO: 47].

Figure 20 shows the nucleotide sequence of rat NDF cDNA clone 42A [SEQ ID NO: 48] and a deduced rat (proNDF-β4a) amino acid sequence [SEQ ID NO: 49].

Figure 21 shows the nucleotide sequence of rat NDF cDNA clone 42B [SEQ ID NO: 50] and a deduced rat (proNDF-α2a) amino acid sequence [SEQ ID NO: 51].

Figure 22 shows rat proNDF (precursor) structures predicted from the rat cDNA sequences of Figures 4, 5, and 13-21. Boxed areas indicate protein coding regions. "Ig", "EGF" and "TM" indicate the immunoglobulin-like, EGF-like and transmembrane domains, respectively. The number of amino acid residues in each predicted precursor sequence is given in the right hand column. Dashed lines represent divergent 3' untranslated DNA sequences.

Figure 23 shows the amino acid sequences encoded by different rat proNDF cDNAs. The complete proNDF-α2a amino acid sequence from Figure 17 is shown [SEQ ID NO: 43]. Divergent sequences in proNDF variants are aligned with the proNDF-α2a sequence. ProNDF-β2a [SEQ ID NO: 126], β3 [SEQ ID NO: 127] and β4a [SEQ ID NO: 52] structures were deduced from Rat-1-EJ cDNAs (Table 2, below, and Figures 16, 13, and 20, respectively). The NDF-β1 sequence [SEQ ID NO: 52] was obtained from cDNA amplified from rat brain and spinal cord (See Example 9 and Figure 28, below). Asterisks, carboxyl-terminal amino acids; dashes, gaps introduced to facilitate sequence alignment; overline, putative transmembrane domain; dots, cysteine residues in predicted extracellular domain (the two N-terminal cysteine residues are part of the immunoglobulin-like domain; the six other marked cysteine residues reside in the EGF-like growth factor domain).

Figure 24 shows autoradiograms of anti-phosphotyrosine Western blots, using the neu receptor tyrosine phosphorylation assay procedure of Example 3, below. The results with recombinant rat NDF proteins expressed in COS-7 cells from the indicated cDNA clones are shown. COS-7(-) is medium from untransfected COS-7 cells.

Figure 25 shows the transient expression of recombinant proNDFs from rat cDNA clones. COS-7 cells were transfected with expression plasmids containing cDNAs encoding the indicated NDF isoforms and were labeled with ³⁵S-methionine/cysteine. Conditioned media and cell lysates were immunoprecipitated with an affinity-purified rabbit anti-NDF antibody and analyzed by electrophoresis. The positions of molecular weight standards are shown on the right.

Figure 26 shows endoglycosidase treatment of recombinant rat proNDFs expressed in COS-7 cells. COS-7 cells were transfected with proNDF-β3 and proNDF-α2c cDNA expression plasmids and labeled for seventeen hours with ³⁵S-methionine/cysteine. Radiolabeled COS-7 cell media and lysates were immunoprecipitated with affinity-purified rabbit antibody raised against recombinant rat NDF-α2_14-241. The immunoprecipitates were treated with endoglycosidases as indicated and analyzed by electrophoresis. The positions of the molecular weight markers are shown on the right.

Figure 27 shows the reverse transcription-polymerase chain reaction (RT-PCR) analysis of NDF mRNAs expressed in various rat tissues. Reverse transcription was used to prepare first-strand cDNA from mRNA samples. NDF cDNAs were amplified in PCR reactions with primers specific for the NDF EGF-like domain. Lanes are as follows: 1, Rat-1-EJ; 2, heart; 3, skin; 4, ovary; 5, lung; 6, stomach; 7, spleen; 8, liver; 9, muscle; 10, kidney; 11, brain; 12, spinal cord; 13, cDNA clone 20 (proNDF-α2b); 14, cDNA clone 40 (proNDF-β2); 15, cDNA clone 42A (proNDF-β4). Positions of DNA size markers (in base pairs) are indicated.

Figure 28 shows the nucleotide sequence [SEQ ID NO: 55] of the PCR products obtained from rat spinal cord and brain, and a deduced partial rat NDF-β1 amino acid sequence [SEQ ID NO: 56]. Nucleotide numbers are given in the left hand column, and amino acid numbers are given in the right hand column. The amino acid numbering corresponds to the amino acid numbering in Figure 23.

Figure 29 shows a Northern blot analysis of several human cell lines screened for the presence of NDF-related mRNA with rat NDF clone 44 cDNA probe.

Figure 30 shows the nucleotide sequence of human NDF cDNA clone P1 [SEQ ID NO: 3] and a deduced partial amino acid sequence for human proNDF-α1a [SEQ ID NO: 4]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38 (composite figure for human NDFs). Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included.

Figure 31 shows the composite nucleotide sequence of human proNDF-α2b cDNA [SEQ ID NO: 5] and a deduced amino acid sequence [SEQ ID NO: 6]. This sequence was derived from the sequences of clone 43 [SEQ ID NO: 7] and clone 17, shown in Figures 32 and 33, respectively. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Linker sequences added to the 3' end of the cDNA to facilitate cloning are included.

Figure 32 shows the nucleotide sequence of human NDF cDNA clone 43 and a deduced human proNDF-α2b amino acid sequence [SEQ ID NO: 8]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Linker sequences added to the 3' end of the cDNA to facilitate cloning are included.

Figure 33 shows the nucleotide sequence of human NDF cDNA clone 17 [SEQ ID NO: 9] and a deduced partial amino acid sequence of human proNDF-α2b [SEQ ID NO: 10]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38. Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included.

Figure 34 shows the nucleotide sequence of human NDF cDNA clone 19 [SEQ ID NO: 11] and a deduced partial amino acid sequence of human proNDF-α3 [SEQ ID NO: 12]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38 (composite figure for human NDFs). Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included.

Figure 35 shows the nucleotide sequence of human NDF cDNA clone P13 [SEQ ID NO: 13] and a deduced partial amino acid sequence for human NDF-β1a [SEQ ID NO: 14]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38 (composite figure for human NDFs). Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included.

Figure 36 shows the nucleotide sequence of human NDF cDNA clone 294-8 [SEQ ID NO: 15] and a deduced partial amino acid sequence for human proNDF-β2 [SEQ ID NO: 16]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38 (composite figure for human NDFs). Flanking PCR primer sequences are included.

Figure 37 shows the nucleotide sequence of human NDF cDNA clone 33 [SEQ ID NO: 17] and a deduced partial amino acid sequence for human proNDF-β3 [SEQ ID NO: 18]. Nucleotide numbers are given in the left hand column and amino acid numbers on the right hand column. Amino acid numbering corresponds to the amino acid numbering of Figure 38 (composite figure for human NDFs). Linker sequences added to the 5' and 3' ends of the cDNA to facilitate cloning are included.

Figure 38 is a composite of the amino acid sequences encoded by different human proNDF cDNAs (similar to Figure 23 for the rat clones). The complete human proNDF-α2b amino acid sequence [SEQ ID NO: 8] from Figure 32 is shown. Divergent sequences in human proNDF isoforms are aligned with the proNDF-α2b sequence.

Figure 39 is a schematic diagram of a mammalian cell vector for expression of a chimeric NDF gene comprised of rat and human sequences.

Figure 40 shows the induction of tyrosine phosphorylation in MDA-MB-453 cells (using the assay procedure of Example 3) with conditioned medium from COS-7 cells which had been transfected with expression plasmids described in Example 11, below, containing various (human, rat, and human-rat chimera) NDF cDNA clones: Lane 1, control, 100 μl 10x concentrated conditioned medium from untransfected COS-7 cells; lanes 2 and 3, hNDF-α2b; lanes 4 and 5, h-rNDF-α2c; lanes 6 and 7, h-rNDF-β1c; lanes 8 and 9, h-rNDF-β2c; lanes 10 and 11, h-rNDF-α1c; lanes 12 and 13, rNDF-α2c; lane 14, positive control of human met-NDF-α2_14-241 purified from E. coli. Even-numbered lanes represent 100 μl conditioned media, and odd-numbered lanes represent 100 μl 10x concentrated media.

Figure 41 shows SDS-PAGE analysis of a recombinant rat NDF-α2 purified from media conditioned by CHO cells with the pDSRα2/rNDF-α2c expression plasmid described in Example 11. The sizes of molecular weight markers in the left hand lane are indicated in kilodaltons. The indicated amounts of purified rat NDF-α2 were electrophoresed in the gel which was stained with Coomassie blue.

Figure 42 shows the stimulation of MDA-MB-453 cell neu receptor tyrosine phosphorylation by various amounts of recombinant rat NDF-α2 purified from media conditioned by transfected CHO cells. The assay procedure of Example 3 was used. PBSA (phosphate buffered saline, or PBS, with 0.1% bovine serum albumin, or BSA) was used as a negative control.

Figure 43 shows the stimulation of MDA-MB-453 neu receptor tyrosine phosphorylation by recombinant human and rat NDF isoforms purified from E. coli. PBSA was used as a negative control.

Figure 44 shows histograms depicting gold-to-red fluorescence ratios for BT-474 cells treated for seven days in culture with recombinant human met-NDF-α2_14-241 purified from E. coli and stained with Nile Red. Treatment regimens included the following concentrations of the recombinant NDF: A, 0 ng/ml (media control); B, 100 ng/ml; C, 20 ng/ml; and D, 4 ng/ml. The results show that recombinant human NDF induces accumulation of neutral lipids.

Figure 45 shows histograms depicting gold-to-red fluorescence ratios for BT-474 cells treated for seven days in culture with recombinant rat NDF-α2 produced by CHO cells and stained with Nile Red. Treatment regimens included the following concentrations of the recombinant NDF: A, 0 ng/ml (media control); B, 100 ng/ml; C, 20 ng/ml; and D, 4 ng/ml. The results show that recombinant rat NDF induces accumulation of neutral lipids.

Figure 46 shows the growth stimulatory effects of recombinant human met-NDF-α2_14-241 on BT-474 cell proliferation in vitro. Cells were treated with different concentrations of the recombinant NDF purified from E. coli and processed on days 5, 6, and 7 using a modified MTT bioassay protocol. Absorbance measurements at 560 nm (minus the reference absorbance at 690 nm) for each treatment are presented as the arithmetic mean ± standard deviation (n=4).
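The data reduction described for Figure 46 (reference-corrected absorbance, then arithmetic mean ± standard deviation over four replicates) can be sketched as follows. The readings are hypothetical values invented for the example, not data from the figure, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def corrected_absorbance(a560, a690):
    """Subtract the 690 nm reference reading from each 560 nm reading."""
    if len(a560) != len(a690):
        raise ValueError("each well needs both a 560 nm and a 690 nm reading")
    return [signal - reference for signal, reference in zip(a560, a690)]


def summarize(a560, a690):
    """Return (mean, sample standard deviation) of the corrected readings."""
    corrected = corrected_absorbance(a560, a690)
    return mean(corrected), stdev(corrected)


# Hypothetical replicate wells (n=4) for one treatment on one day.
a560_readings = [0.50, 0.52, 0.48, 0.50]
a690_readings = [0.05, 0.05, 0.05, 0.05]

avg, sd = summarize(a560_readings, a690_readings)
print(f"{avg:.3f} ± {sd:.3f} (n={len(a560_readings)})")
```

Repeating the same summary for each concentration and each processing day would give the kind of mean ± standard deviation series the caption describes.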

Figure 47 shows the growth inhibitory effects of recombinant human met-NDF-α2_14-241 on MDA-MB-468 cell proliferation in vitro. Cells were treated with

Figure 48 illustrates the effect in vitro of rat met-NDF-α2_14-241 on the adhesion of colon epithelial cells. Intestinal crypts were isolated from mouse colons and plated on collagen type IV coated plates. After twenty-four hours at 37°C with varying concentrations of recombinant NDF, colonies of attached cells were stained and counted.

Figure 49 shows indirect immunofluorescence analysis of LIM 1215 colon carcinoma cells cultured in the absence (panel A), or presence (panel B) of 50 ng/ml recombinant rat NDF-α2 purified from media conditioned by transfected CHO cells. Treated cells were sectioned, then stained with monoclonal anti-CEA antibodies followed with FITC-labeled sheep anti-mouse IgG antibodies.

Figure 50 shows indirect immunofluorescence analysis of LIM 1863 colon carcinoma cells cultured in the absence (panel A), or presence (panel B) of 50 ng/ml recombinant rat NDF-α2 purified from media conditioned by transfected CHO cells. Treated cells were sectioned, then stained with rabbit anti-TIMP-2 antibodies followed with phycoerythrin-labeled goat anti-rabbit IgG antibodies.

Figure 51 shows measurements of new epithelium in rabbit ear wounds treated with different doses of recombinant human met-NDF-α2_14-241,

Figure 52 shows measurements of the area of new epithelium covering rabbit ear wounds treated with different doses of recombinant human met-NDF-α2_14-241,

Figure 53 shows measurements of the number and percentage of proliferating (BrdU-positive) basal and suprabasal keratinocytes in rabbit ear wounds treated with different doses of recombinant human met-NDF-α2_14-241.

Detailed Description Of The Invention

The DNA sequences of this invention are valuable as products useful in effecting the large scale synthesis of human NDFs by a variety of recombinant techniques. Put another way, DNA sequences provided by the invention are useful in generating new and useful viral and circular plasmid DNA vectors, new and useful transformed and transfected procaryotic and eucaryotic host cells (including bacterial and yeast cells and mammalian cells grown in culture), and new and useful methods for cultured growth of such host cells capable of the expression of recombinant NDFs and their related products.

DNA sequences of the invention are also suitable materials for use as probes in isolating additional human cDNAs and genomic DNA encoding NDFs and other genes encoding related proteins. The DNA sequences may also be useful in various alternative methods of protein synthesis (e.g., in insect host cells) or in genetic therapy in humans and other mammals. DNA sequences of the invention are expected to be useful in developing transgenic mammalian species which may serve as eucaryotic "hosts" for production of NDFs and NDF products in quantity. See, generally, Palmiter et al., Science 222, 809-814 (1983).

Diagnostic applications of NDF DNA sequences of this invention are also possible, such as for the detection of alterations of genes and of mRNA structure and expression levels.

The recombinant human NDF polypeptides of this invention are characterized by being the products of procaryotic or eucaryotic host expression (e.g., by bacterial, yeast, higher plant, insect or mammalian cells in culture) of exogenous DNA sequences obtained by genomic or cDNA cloning or by gene synthesis, particularly with use of the exogenous DNA sequence carried on an autonomously replicating DNA plasmid or viral vector in accordance with conventional techniques. The product of expression in typical yeast (e.g., Saccharomyces cerevisiae) or procaryote (e.g., E. coli) host cells is free of association with any mammalian proteins. The product of expression in vertebrate (e.g., non-human mammalian, such as COS or CHO, and avian) cells is free of association with any human proteins. Depending upon the host employed, the recombinant NDFs of the invention may be glycosylated with mammalian or other eucaryotic carbohydrates or may be non-glycosylated. The recombinant NDFs of the invention may also include an additional methionine amino acid residue at the amino terminus.

The present invention also embraces products such as polypeptide analogs of human NDFs encoded by naturally-occurring mRNAs. Such analogs include fragments of NDF or of NDF precursors (proNDFs). Following known procedures, one can readily design and manufacture genes encoding related polypeptides having primary structures which differ from that herein specified in terms of the identity or location of one or more residues (e.g., substitutions, terminal and intermediate additions and deletions). Alternatively, modifications of cDNA and genomic nucleotide sequences can be readily accomplished by well-known site-directed mutagenesis techniques and employed to generate analogs and derivatives of the described NDFs. Such products share one or more of the biological properties of NDF polypeptide products encoded by naturally-occurring mRNAs. As examples, polypeptide products of the invention include those which are foreshortened, for example, by deletions; or those which are more stable to hydrolysis (and, therefore, may have more pronounced or longer-lasting effects); or which have been altered to delete one or more potential sites for O-glycosylation and/or N-glycosylation; or which have one or more cysteine residues deleted or replaced by other residues, e.g., alanine or serine residues, and are potentially more easily isolated in active form from microbial systems; or which have one or more tyrosine residues replaced by phenylalanine and bind more or less readily to target proteins or to receptors on target cells. Also included are polypeptide fragments duplicating only a part of the continuous amino acid sequence or secondary conformations within the recombinant NDFs and NDF precursors described herein, which fragments may possess one property of such NDFs and not others. It is noteworthy that the described activities are not necessary for any one or more of the products of the invention to have therapeutic utility or utility in other contexts, such as in antagonism of naturally-occurring NDFs. Competitive antagonists may be quite useful in, for example, cases of overproduction of NDF.

The present invention also includes that class of polypeptides encoded by portions of the DNA complementary to the protein-coding strand of the human cDNA or genomic DNA sequences of the described NDFs.

Also encompassed by the invention are pharmaceutical compositions useful in treating biological disorders associated with neu receptor expression, comprising therapeutically effective amounts of a polypeptide product or products of the invention together with suitable diluents, preservatives, solubilizers, emulsifiers, adjuvants and/or carriers useful in recombinant NDF therapy. A "therapeutically effective amount" as used herein refers to that amount which provides therapeutic effect for a given condition and administration regimen. Such compositions include diluents of various buffer content (e.g., Tris-HCl, acetate, phosphate), pH and ionic strength; additives such as detergents and solubilizing agents (e.g., Tween 80, Polysorbate 80), anti-oxidants (e.g., ascorbic acid, sodium metabisulfite), preservatives (e.g., Thimerosal, benzyl alcohol) and bulking substances (e.g., lactose, mannitol); covalent attachment of polymers to the polypeptide to prolong in vivo half-life and to enhance potency (for instance, water-soluble polymers such as polyethylene glycol, polypropylene glycol and copolymers of polyethylene glycol and polypropylene glycol, see, e.g., Davis et al., U.S. Patent No. 4,179,337); incorporation of the material into particulate preparations of polymeric compounds such as polylactic acid, polyglycolic acid, etc., or into liposomes. Such compositions will influence the physical state, stability, rate of in vivo release, and rate of in vivo clearance of recombinant NDF.

The recombinant NDF polypeptides of this invention are expected to be useful, alone or in combination with other therapy, in treating diseases and conditions involving cells which express the neu receptor on their surfaces. In particular, the recombinant NDFs of this invention are useful as biological agents for modulating cellular proliferation and differentiation and, accordingly, are applicable for the therapeutic treatments described below.

The neu receptor is expressed by human epithelial cells of the gastrointestinal, respiratory, urinary and reproductive tracts (Press, M. et al., Oncogene, 1990, above; Gullick, W. et al., Int. J. Cancer Res., 1987, above; Quirke, P. et al., Br. J. Cancer 1989, above). For injured tissues that express the neu receptor, recombinant NDFs can be used to promote reepithelialization and restoration of mature, functionally differentiated phenotypes in the new epithelia. For diseased tissues expressing the neu receptor, recombinant NDFs can be used alone or in combination with cytotoxic agents to inhibit abnormal cellular proliferation. Recombinant NDF polypeptides are expected to be particularly useful in the treatment of dermal wounds, cancer, gastrointestinal disorders, kidney diseases, Barrett's esophagus, and diseased or damaged lung.

Dermal wound healing:

Recombinant neu differentiation factors are useful in the repair of diseased and damaged skin through their ability to stimulate tyrosine phosphorylation of neu receptors present in the basal and squamous layers of human skin. Increased expression of the neu receptor is associated with the more differentiated epithelium of the surface and the external root sheath of hair follicles (Maguire, H.C. et al., J. Invest. Dermatol., 1990, above), indicating that recombinant NDFs can be used to induce a more mature and functionally differentiated phenotype in hyperproliferative skin lesions such as psoriasis vulgaris, seborrheic keratoses, acanthosis nigricans, and ichthyosis.

Recombinant NDFs can be used to accelerate healing of dermal wounds, such as acute and chronic dermal wounds, excisional wounds, second and third degree burns (partial and full thickness) and epidermolysis bullosa. Epidermolysis bullosa is a defect in adherence of the epidermis to the underlying dermis, resulting in frequent open, painful blisters, which can cause severe morbidity. Accelerated reepithelialization of these lesions would result in less risk of infection, diminished pain, and less wound care.

Cancer:

Recombinant neu differentiation factors are useful in cancer therapy for inhibiting proliferation of certain cancer cells. In addition to inhibiting tumor cell growth, recombinant NDFs can restore a more mature and functionally differentiated phenotype in certain cancer cells, thereby restoring normal cellular functions. For example, recombinant NDFs can inhibit tumor cell invasion and metastasis by induction of intercellular and extracellular matrix adhesion molecules and tissue inhibitors of metalloproteinases.

Cancer cells that are susceptible to NDF treatment are those that express the neu receptor.

Tumors in which neu receptor expression or neu gene amplification have been detected include cancers of the breast, ovary, endometrium, cervix, salivary gland, esophagus, lung, stomach, pancreas, colon, kidney, prostate, and bladder, as well as squamous cell carcinomas of the head and neck, carcinoid tumors of the gut, glioblastomas, astrocytomas, papillary thyroid carcinomas, sebaceous and sweat gland carcinomas, and hepatocellular carcinomas.

Tumors showing reduced levels of naturally occurring NDF proteins, such as lung and colon cancers (Park, J.W., et al., Proc. Am. Assoc. Cancer Res. 34:521, 1993), will be particularly susceptible to recombinant NDF therapy, since normal NDF levels can be restored with recombinant NDF polypeptides. Likewise, tumors with mutated NDF genes will be responsive to NDF therapy.

Additional inhibition of tumor growth can be achieved through enhancement of chemotherapy by recombinant NDFs. Antibodies and ligands that interact with the neu receptor and with the related EGF receptor enhance tumor cell sensitivity to the chemotherapeutic agent cisplatin (Aboud-Pirak, E. et al., J. Natl. Cancer Inst. 80:1605-1611, 1988; Christen, R.D. et al., J. Clin. Invest. 86:1632-1640, 1990; Hancock, M.C. et al., Cancer Res. 51:4575-4580, 1991; Nishikawa, K. et al., Cancer Res. 52:4758-4765, 1992). Phorbol esters, which activate the signal transduction pathway downstream from receptor tyrosine kinases, also enhance cisplatin cytotoxicity (Isohishi, S., Andrews, P.A., and Howell, S.B., J. Biol. Chem. 5:3623-3627, 1990). Through analogous mechanisms, recombinant NDF can enhance the cytotoxicity of cisplatin and other chemotherapeutic agents toward tumor cells expressing the neu receptor.

Gastrointestinal diseases

The distribution of NDF mRNA and the neu receptor messenger in normal gastrointestinal tissues predicts that recombinant NDFs will have activities on normal, diseased and injured stomach and intestine. Immunohistochemical studies show that the neu receptor localizes to epithelial cells in these tissues. Furthermore, recombinant NDF is a potent inducer of adherence for normal colonic epithelial cells. It is expected that recombinant NDFs can be useful in restoring normal gastrointestinal epithelia following disease or injury.

Gastric ulcers, although treatable by H2 antagonists, show significant morbidity and recurrence, and heal by scar formation of the mucosal lining. The ability to regenerate mucosa more rapidly with recombinant NDFs can offer a significant therapeutic improvement in the treatment of gastric ulcers.

Duodenal ulcers, like gastric ulcers, are treatable, but recombinant NDF therapy to more fully and more rapidly regenerate the mucosal lining of the duodenum would be an important advance.

Inflammatory bowel diseases, such as Crohn's disease (affecting primarily the small intestine) and ulcerative colitis (affecting primarily the large bowel), are chronic diseases of unknown etiology which result in the destruction of the mucosal surface, inflammation, scar and adhesion formation during repair, and significant morbidity to the affected individuals. Therapy at present is designed to quiet the inflammation. Recombinant NDF therapy to stimulate resurfacing of the mucosal surface, resulting in faster healing, can be of benefit in controlling progression of disease.

Gut toxicity is a major limiting factor in radiation and chemotherapy treatment regimes. Pretreatment with recombinant NDFs may have a cytoprotective effect on the small intestinal mucosa, allowing increased dosages of such therapies while reducing the potentially fatal side effects of gut toxicity.

Barrett's esophagus

Barrett's esophagus (a precursor of adenocarcinoma) exhibits esophageal columnar metaplasia with frequent overexpression of the neu receptor (Jankowski, J., et al., Gut, 1992, above). Recombinant NDF therapy is expected to restore the normal differentiated phenotype to the diseased tissue and inhibit abnormal cellular proliferation.

Lung

Smoke inhalation is a significant cause of morbidity and mortality in the week following a burn injury due to necrosis of the bronchiolar epithelium and the alveoli. Recombinant NDFs are expected to stimulate proliferation and differentiation of the bronchiolar epithelial cells, which express the neu receptor, thereby enhancing repair and regeneration of the lung epithelium damaged by smoke inhalation.

Kidney

A variety of kidney diseases with abnormal cell growth and differentiation (e.g., autosomal-dominant polycystic kidney disease, acquired dialysis-associated cystic disease, and non-cystic end stage kidneys) have frequent overexpression of the neu receptor (Herrera, G.A., Kidney Int., 1991, above). Recombinant NDF therapy may restore the normal differentiated phenotype to the diseased tissue and also inhibit abnormal proliferation.

Recombinant NDFs, through stimulation of the neu receptor present in epithelial cells lining kidney tubules, will also be useful in restoring normal kidney epithelium following ischemic acute tubular necrosis.

Liver

Neu receptor expression in the liver can be detected in certain hepatitis B virus and hepatitis C virus infectious lesions of the liver (Brunt, E.M., and Swanson, P.E., Am. J. Clin. Pathol. 97:53-61, 1992). Recombinant NDFs can be used to treat such lesions, either alone or in combination with cytotoxic agents.

Peripheral nerves

Neu receptor expression is found in Schwann cells associated with transected peripheral nerves undergoing Wallerian degeneration (Cohen, J.A. et al., J. Neurosci. Res., 1992, above). Recombinant NDFs can be used to regulate Schwann cell proliferation and maturation in cases of peripheral nerve injury or degeneration.

Altered expression of naturally-occurring NDF in diseased tissues (e.g., malignant or premalignant tissues) can be found with the NDF DNAs, NDF mRNA detection methods, anti-NDF antibodies, and NDF polypeptide detection methods of this invention. Perturbations in the expression of naturally-occurring NDFs are expected to alter normal cell growth and differentiation with pathological consequences such as neoplasia. Other methods can also be used to detect aberrant NDF expression in human tissues. These methods, known to those skilled in the art of analyzing protein and mRNA expression, include immunohistochemical assays, enzyme-linked immunosorbent assays, and polymerase chain reaction assays. The identification of premalignant and malignant cells with abnormal NDF expression is particularly useful for the diagnosis of cancer.

The NDFs of this invention may also be combined with substances such as radiolabeled molecules, toxins, cytokines, and other compounds useful in tumor treatment, in order to increase localization of these substances on human tumors expressing high levels of the neu receptor.

The polypeptides of the invention will be formulated and dosed according to the specific disorder to be treated, the condition of the individual patient, the site of delivery of the factor, the method of administration, and other circumstances known to the skilled practitioner. Thus, for the purposes herein, an effective amount of recombinant NDF or analog or derivative is an amount that is effective to alter cellular proliferation and differentiation, or to prevent, lessen the worsening of, alleviate or cure the condition for which the polypeptide is administered. The activity of the present polypeptide may be enhanced or supplemented with use of one or more additional biologically active or cytotoxic agents which are known to be useful in treating the same condition for which the polypeptide of the invention is being administered, e.g., IL-2 or chemotherapeutic agents for cancer therapy, platelet-derived growth factors, epidermal growth factor, fibroblast growth factors, and the like for wound healing, and so forth.

Description of the Specific Embodiments

The invention is illustrated in the following Examples, which are not intended to be limiting.

Biological materials employed in these Examples were obtained as follows. The monoclonal antibody (Ab-3) to the carboxyl terminus of the neu receptor was obtained from Oncogene Science (Uniondale, NY). A monoclonal antibody to phosphotyrosine, PY20, was obtained from Amersham (Arlington Heights, IL). A mouse monoclonal antibody to human β-casein was obtained through Dr. R.C. Coombes from the Ludwig Institute in London (Earl, H.M., and McIlhinney, R.A.J., Mol. Immunol. 22:981-991, 1985). AU-565 human breast carcinoma cells were obtained from the Cell Culture Laboratory, Naval Supply Center (Oakland, CA). The Rat1-EJ cell line (ATCC CRL 10984) was generated by transfection of the human EJ ras oncogene into Rat1 fibroblasts as described by Peles, E. et al. in Cell, 1992, above, and by Land, H. et al., in Nature 304:596-602, 1983. The following cell lines were obtained from the American Type Culture Collection (Rockville, MD): MDA-MB-231 (ATCC HTB 26), MDA-MB-453 (ATCC HTB 131), Hs 294T (ATCC HTB 140), SK-BR-3 (ATCC HTB 30), HT-1080 (ATCC CCL 121), BALB/c 3T3 (ATCC CRL 6587) and COS-7 (ATCC CRL 1651). COS-7 cells were cultured in Dulbecco's modified Eagle medium (GIBCO, Grand Island, NY) supplemented with 10% fetal bovine serum (Hyclone, Logan, Utah). MDA-MB-453 cells were grown in RPMI Medium 1640 with 15% fetal bovine serum.

EXAMPLE 1

Amino Acid Sequence Analysis of Naturally-Occurring Rat NDF

A. Tryptic Digestion

The purification and amino-terminal sequencing of the approximately 44-kilodalton naturally-occurring NDF glycoprotein from media conditioned by ras-transformed rat fibroblasts (Rat-1-EJ cells) have been described (Peles, E. et al., Cell, 1992, above). In order to obtain more amino acid sequence information for the design of independent oligonucleotide probes, 300 picomoles of the purified rat protein were subjected to partial proteolysis with trypsin, as follows: ten micrograms of the protein were reconstituted in 200 μl of 0.1 M ammonium bicarbonate buffer (pH 7.8); digestion was conducted with L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated trypsin (Serva) at 37°C for eighteen hours, using an enzyme-to-substrate ratio of 1:10.
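The quantities in this digestion can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes the 1:10 enzyme-to-substrate ratio is by weight (the text does not say), and the function names are hypothetical.

```python
# Illustrative arithmetic for the tryptic digestion described above.
# Assumption (not stated in the text): the 1:10 enzyme-to-substrate
# ratio is a weight ratio.

def trypsin_mass_ug(substrate_ug, ratio_enzyme=1.0, ratio_substrate=10.0):
    """Mass of trypsin (µg) needed for a given substrate mass."""
    return substrate_ug * ratio_enzyme / ratio_substrate

def concentration_ug_per_ml(mass_ug, volume_ul):
    """Concentration in µg/ml from a mass in µg and a volume in µl."""
    return mass_ug / volume_ul * 1000.0

def picomoles(mass_ug, mol_weight_da):
    """Amount in pmol: µg divided by g/mol gives µmol; times 1e6 gives pmol."""
    return mass_ug / mol_weight_da * 1e6

def implied_mass_da(mass_ug, amount_pmol):
    """Molecular mass (Da) implied by a mass in µg and an amount in pmol."""
    return mass_ug / amount_pmol * 1e6

if __name__ == "__main__":
    print("trypsin (µg):", trypsin_mass_ug(10.0))
    print("substrate (µg/ml):", concentration_ug_per_ml(10.0, 200.0))
    print("pmol at 44 kDa:", round(picomoles(10.0, 44000.0), 1))
    print("mass implied by 300 pmol (Da):", round(implied_mass_da(10.0, 300.0)))
```

Ten micrograms of a 44-kilodalton species works out to roughly 227 picomoles, whereas 300 picomoles in ten micrograms implies a mass near 33 kilodaltons. The text does not state which mass basis was used for the 300 picomole figure.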

B. Separation of Tryptic Digests by HPLC

The resulting peptide mixture was separated by reversed phase HPLC and monitored at 215 nm using a Vydac C4 micro column (2.1 mm i.d. × 15 cm, 300 Å) and a Hewlett-Packard 1090 liquid chromatographic system equipped with a diode-array detector and a workstation (Figure 1). The column was equilibrated with 0.1% trifluoroacetic acid (mobile phase A) and elution was effected with a linear gradient from 0-55% mobile phase B (90% acetonitrile in 0.1% trifluoroacetic acid) over seventy minutes. The flow rate was 0.2 ml/min and the column temperature was controlled at 25°C. Three absorbance peaks eluted from the column very early (retention time less than five minutes), suggesting that these fractions correspond to short-length peptides. Three other, major fractions were recovered for amino acid sequence determination: a first fraction eluted after thirteen minutes (T13.3), a second after twenty-one minutes (T21.8), and a third after twenty-seven minutes (T27.3).

C. Sequencing of Eluted Peptide Fractions

The amino acid sequences of the peptides corresponding to the three major eluted fractions were determined by automated Edman degradation. Amino acid sequence analyses of the peptides were performed with a Model 477 protein sequencer (Applied Biosystems, Inc., Foster City, CA) equipped with an on-line phenylthiohydantoinyl (PTH) amino acid analyzer and a Model 900 data analysis system (Hunkapiller, M.W. et al., Methods of Protein Microcharacterization, Humana Press, Clifton, NJ, pages 223-247, 1986). The protein was loaded onto a trifluoroacetic acid-treated glass fiber disc precycled with polybrene and NaCl. The PTH-amino acid analysis was performed with a micro liquid chromatography system (Model 120) using dual syringe pumps and reversed phase (C-18) narrow bore columns (Applied Biosystems, 2.1 mm × 250 mm). The first fraction (T13.3) yielded two distinguishable major sequence signals (signal ratio 5:1) with lengths of three [SEQ ID NO: 65] and four amino acids [SEQ ID NO: 66] (the sequences are indicated in Fig. 1). The second fraction (T21.8) gave a single sequence [SEQ ID NO: 67], starting at residue number 9 of the previously determined primary N-terminal amino acid sequence (Peles et al., Cell, above; the eighth residue is arginine). The sequence of peptide T21.8 thus confirmed the information obtained from the N-terminal sequence analysis of the whole protein and also assigned an aspartate to position 17, which was previously reported as an unassigned residue. The third fraction (T27.3) gave three distinct sequencing signals up to cycle 12 of Edman sequencing, and one clear sequence from cycles 13 through 24, suggesting that this third peptide is even longer.

D. Re-Chromatographing and Sequencing of Co-Eluted Peptides

To precisely determine the amino acid sequences of the co-eluting peptides in fraction T27.3, an aliquot of that fraction was treated as follows. A seventy percent aliquot of the peptide fraction was dried in vacuo and reconstituted in 100 μl of 0.2 M ammonium bicarbonate buffer (pH 7.8). Dithiothreitol (final concentration 2 mM) was added to the solution which was then incubated at 37°C for thirty minutes. The reduced peptide mixture was then separated by reversed phase HPLC using a Vydac column (2.1 mm i.d. × 15 cm). Elution conditions and flow rate were identical to those described previously.
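The elution conditions reused here (a linear gradient from 0 to 55% mobile phase B over seventy minutes, with B being 90% acetonitrile) fix the programmed solvent composition at any time point. The sketch below shows that arithmetic. It assumes the fraction names encode retention times in minutes (T13.3 is described as eluting after thirteen minutes) and ignores gradient delay volume and column dead time, so the values are the programmed composition at the pump, not the composition at the column outlet. Function names are hypothetical.

```python
# Programmed composition of a linear HPLC gradient (illustrative sketch).
# Gradient delay and column dead time are deliberately ignored.

def percent_b(t_min, start_b=0.0, end_b=55.0, duration_min=70.0):
    """Percent mobile phase B at time t_min, clamped to the gradient program."""
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min

def percent_acetonitrile(t_min, acn_in_b=90.0):
    """Percent acetonitrile, given that mobile phase B is 90% acetonitrile."""
    return percent_b(t_min) * acn_in_b / 100.0

if __name__ == "__main__":
    # Assumption: the number in each fraction name is its retention time (min).
    fractions = {"T13.3": 13.3, "T21.8": 21.8, "T27.3": 27.3,
                 "T34.4": 34.4, "T40.4": 40.4}
    for name, rt in fractions.items():
        print(f"{name}: {percent_b(rt):.1f}% B, "
              f"{percent_acetonitrile(rt):.1f}% acetonitrile")
```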

Two major peptide peaks were recovered and sequenced by automated Edman degradation as described above, namely T34.4, an 11-residue arginine peptide [SEQ ID NO: 68] and T40.4, a 12-amino acid long lysine peptide [SEQ ID NO: 69], (Fig. 1, inset). Residue 1 of peptide T34.4 and cycle 11 from T40.4 remained unassigned, suggesting that they may be cysteine residues related to the disulfide bond of peptides in fraction T27.3. This possibility was confirmed by electro-spray mass spectrometric analysis using an API-III mass spectrometer (SCIEX, Toronto, Canada). Both peptides T34.4 and T40.4 were analyzed, resulting in average mass figures of 1261.5 and 1274.0, respectively. It was therefore concluded that these two peptides are held together in the naturally-occurring protein by a disulfide linkage. On the basis of these data, the mixed sequence signals of peak T27.3 could be re-examined. When the sequences of peptides T34.4 and T40.4 were subtracted from the mixed signals of fraction T27.3, the sequence of the longer third peptide became apparent also for the first twelve cycles. The deduced 24-amino acid long sequence [SEQ ID NO: 70] of this peptide is indicated in Figure 1. The 9th amino acid in this peptide sequence was assigned as an asparagine but at a much lower yield (about 10% of the sequence signal), suggesting a site for N-linked glycosylation.

In view of the molecular weight of naturally-occurring NDF and the estimation that approximately one-quarter of its molecular mass is contributed by sugar moieties (Peles, E. et al., Cell, 1992, above), it was unexpected that only a few peptides were recovered after partial proteolysis. One possibility is that the naturally-occurring protein contains a protease-resistant core region that remained intact during proteolysis, but underwent denaturation and therefore was not resolved by HPLC.

EXAMPLE 2

Cloning of cDNA Encoding Rat NDF

A. Construction of a cDNA Library

RNA was isolated from Rat-1-EJ cells by standard procedures (Maniatis, T. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982) and poly(A)⁺ mRNA was selected using an "mRNA Separator" kit (Clontech Laboratories, Inc., Palo Alto, CA). cDNA was synthesized with the "Superscript" kit (GIBCO BRL Life Technologies, Inc., Gaithersburg, MD). Column-fractionated double-strand cDNA was ligated into a SalI- and NotI-digested pJT-2 plasmid vector, to yield plasmids containing cDNA inserts such as depicted in Figure 2. The pJT-2 plasmid vector was derived from the V19.8 vector (ATCC 68124). In particular, the HindIII and SacII cloning sites of V19.8 were changed into SalI and NotI sites by using synthetic oligonucleotide linkers to yield the pJT-2 vector. Plasmids containing cDNA inserts were transformed into DH10B E. coli cells by electroporation (Dower, W.J. et al., Nucleic Acids Res. 16:6127-6145, 1988).

B. Preparation of cDNA Probes and Isolation of Clones

Approximately 5 × 10⁵ primary transformants were screened with two oligonucleotide probes that were based on the combined amino acid sequences of the N-terminal region of NDF as described in Example 1C, specifically, residues 5-24 (RGSRGKPGPAEGDPSPALPP) [SEQ ID NO: 71], and residues 7-12 of the T40.4 tryptic peptide (GEYMCK) [SEQ ID NO: 72]. Their respective sequences (shown in antisense) were as follows ("N" indicates all four nucleotides):

(1) 5'-ATA GGG AAG GGC GGG GGA AGG GTC NCC CTC NGC AGG GCC GGG CTT GCC TCT GGA GCC TCT-3' [SEQ ID NO: 73], with the alternative bases A and T at two positions

(2) 5'-TTT ACA CAT ATA TTC NCC-3' [SEQ ID NO: 74], with the alternative bases C, G, G and C at four positions

The synthetic oligonucleotides were end-labeled with γ-³²P-ATP using T4 polynucleotide kinase and used to screen replicate sets of nitrocellulose filters. The hybridization solution contained 6 × SSC, 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 2 × Denhardt's solution, 50 μg/ml salmon sperm DNA and 20% formamide (for probe 1) or no formamide (for probe 2). Hybridization was carried out for fourteen hours at 42°C (for probe 1) or 37°C (for probe 2). The filters were washed either at 50°C with 0.5 × SSC/0.2% SDS/2 mM EDTA (for probe 1) or at 37°C with 2 × SSC/0.2% SDS/2 mM EDTA (for probe 2). Autoradiography of the filters gave ten clones that hybridized with both probes. These clones were purified by re-plating and probe hybridization as previously described.
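Because the probes are given in antisense, a quick way to check a probe against its peptide is to take the reverse complement and translate it. The sketch below does this for probe 2 and the GEYMCK peptide. It is a minimal illustration, not the inventors' procedure: it uses only the top line of probe 2 as printed (the alternative bases are ignored), the codon table is restricted to the codons needed here, and all names are hypothetical.

```python
from itertools import product

# Base pairing; "N" (any nucleotide) pairs with "N".
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

# Subset of the standard genetic code, limited to the residues in GEYMCK.
CODON_TO_AA = {
    "GGA": "G", "GGC": "G", "GGG": "G", "GGT": "G",
    "GAA": "E", "GAG": "E",
    "TAT": "Y", "TAC": "Y",
    "ATG": "M",
    "TGT": "C", "TGC": "C",
    "AAA": "K", "AAG": "K",
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def expand_n(seq):
    """List every concrete sequence obtained by replacing each N with A, C, G or T."""
    pools = ["ACGT" if base == "N" else base for base in seq]
    return ["".join(choice) for choice in product(*pools)]

def translate(seq):
    """Translate a sense-strand sequence codon by codon."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))

# Probe 2 as printed (antisense, 5' to 3'), spaces removed.
PROBE_2_ANTISENSE = "TTTACACATATATTCNCC"

sense = reverse_complement(PROBE_2_ANTISENSE)
peptides = {translate(s) for s in expand_n(sense)}

if __name__ == "__main__":
    print("sense strand:", sense)
    print("encoded peptide(s):", sorted(peptides))
```

Every expansion of the single N position translates to the same six residues, which is what one expects when the degenerate position falls in the third base of a glycine codon.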

EXAMPLE 3

Expression of Recombinant Rat NDF by Transfected COS-7 Cells

In order to verify that the cDNA clones detected in the primary screening procedure described in Example 2B corresponded to NDF-specific transcripts, the clones were transiently expressed in COS-7 cells. For that purpose the cDNA clones were inserted into the pJT-2 eukaryotic expression vector under the control of the SV40 promoter and 3'-flanked with SV40 termination and poly-adenylation signals. The resulting plasmids, including the pJT-2/NDF plasmid containing clone 44 cDNA (Fig. 2), were used to transfect COS-7 cells by electroporation as follows. 6 × 10⁶ cells in 0.8 ml Dulbecco's modified Eagle medium (DMEM) and 10% fetal bovine serum were transferred to a 0.4 cm cuvette and mixed with 20 μg of plasmid DNA in 10 μl of TE solution (10 mM Tris-HCl, pH 8.0, 1 mM EDTA). Electroporation was performed at room temperature, 1600 volts and 25 μF, using a BioRad Gene Pulser apparatus with the pulse controller unit set at 200 ohms. The cells were then diluted into 20 ml of DMEM/10% fetal bovine serum and transferred into a T75 flask (Falcon). After fourteen hours of incubation at 37°C, the medium was replaced with DMEM/1% fetal bovine serum, and the incubation was continued for an additional forty-eight hours.
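The electroporation settings above can be turned into derived quantities with simple arithmetic. The sketch below is an illustration under stated assumptions: the time constant is the nominal product R × C using only the 200 ohm pulse controller setting, which is an upper bound because a conductive medium such as DMEM lowers the effective resistance, and the DNA concentration assumes the 10 μl of TE is simply added to the 0.8 ml of cell suspension. Function names are hypothetical.

```python
# Derived electroporation parameters (illustrative sketch only).

def field_strength_kv_per_cm(voltage_v, gap_cm):
    """Nominal field strength across the cuvette gap, in kV/cm."""
    return voltage_v / gap_cm / 1000.0

def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant: ohm x microfarad = microseconds, then to ms.

    Assumption: only the pulse controller resistance is considered; the
    sample resistance in parallel is ignored, so this is an upper bound.
    """
    return resistance_ohm * capacitance_uf / 1000.0

def dna_concentration_ug_per_ml(dna_ug, cell_volume_ml, dna_volume_ul):
    """Plasmid concentration in the cuvette after mixing."""
    return dna_ug / (cell_volume_ml + dna_volume_ul / 1000.0)

if __name__ == "__main__":
    print("field strength (kV/cm):", field_strength_kv_per_cm(1600.0, 0.4))
    print("nominal time constant (ms):", nominal_time_constant_ms(200.0, 25.0))
    print("DNA (µg/ml):", round(dna_concentration_ug_per_ml(20.0, 0.8, 10.0), 1))
```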

The resulting conditioned media were then evaluated for their ability to stimulate tyrosine phosphorylation of the neu receptor in MDA-MB-453 human breast tumor cells as follows. The conditioned media were filtered through a 0.2-micron sterile filter unit (Costar, Cambridge, MA) and concentrated sixteen-fold by using a Centriprep 10 unit (Amicon, Beverly, MA). The concentrated media were diluted in phosphate-buffered saline (PBS) that contained 0.1% bovine serum albumin and were then added to individual wells of a forty-eight-well dish that contained 3 × 10⁵ MDA-MB-453 cells per well. Following five minutes of incubation at 37°C, the media were aspirated and the cells were processed for Western blotting using a monoclonal antibody to phosphotyrosine (PY20). The protocol used for cell lysis and Western blotting is described by Peles, E. et al., Cell, 1992, above.

The results of the analysis are depicted in Figure 3. Although the medium of untransfected COS-7 cells induced a slight increase in tyrosine phosphorylation, the medium harvested from clone 44 (i.e., pJT-2/NDF plasmid) transfected cells was significantly more active. Therefore, clone 44 cDNA was completely purified by re-plating and by filter hybridization screening. Figure 3 (right panel) compares the activity of the completely purified clone 44 with the activity of control clones 27 and 29 that were randomly selected. Evidently, after transfection into COS-7 cells the completely purified clone 44 cDNA elicited higher activity than the control plasmids or the partially purified clones, including partially purified clone 44. On the basis of its hybridization to two independent oligonucleotide probes (Example 2B) and its ability to direct the synthesis of a biologically active NDF, clone 44 was selected for further analysis by DNA sequencing.

EXAMPLE 4

Sequencing of Rat NDF cDNA

Rat clone 44 cDNA was primarily sequenced using a 373A automated DNA sequencer and "Taq DyeDeoxy™ Terminator" cycle sequencing kits from Applied Biosystems, Inc. (Foster City, CA), in accordance with the manufacturer's instructions. Some of the sequencing was performed using ³⁵S-dATP (Amersham) and "Sequenase™" kits from United States Biochemicals (Cleveland, Ohio), following the manufacturer's instructions. Both strands of the clone 44 cDNA were sequenced using synthetic oligonucleotides as primers. Nucleotide sequence analysis of the cDNA insert of clone 44 yielded a 1894 bp sequence [SEQ ID NO: 21] that contained a 436 amino acid long open reading frame [SEQ ID NO: 22] that extends to the 5' end of the cDNA. Downstream of the 3' end of this reading frame is a 594 base-long untranslated stretch. The latter includes a poly(A) tail preceded by a polyadenylation signal (AAATAAA). Because no stop codon and no recognizable signal peptide sequence were found at the amino terminus of the longest open reading frame, the nucleotide sequences of three other independent positive cDNA clones were analyzed in a similar manner. All three clones included an additional 5' 292 bp sequence with an in-frame stop codon, suggesting that clone 44 is a 5' truncated cDNA.

The combined nucleotide sequence of the rat cDNA clones [SEQ ID NO: 19] is presented in Figure 4, and the clone 44 cDNA sequence is given in Figure 5. The combined sequence spans 2,186 base pairs, including a poly(A) tail, and contains an open reading frame of four hundred and twenty-two residues [SEQ ID NO: 20] if the amino terminal methionine is considered to be the initiator codon. Hydropathy analysis (Figure 6) revealed no prominent hydrophobic sequence at the amino terminus of the protein that could function as a signal peptide for protein secretion (von Heijne, G., J. Mol. Biol. 184:99-105, 1985). However, this analysis showed the presence of a highly hydrophobic stretch of twenty-three amino acids (underlined in Fig. 4) that qualifies as a potential transmembrane domain. This region bisects the presumed precursor of NDF (proNDF) and defines a putative cytoplasmic domain of one hundred and fifty-seven amino acids. The predicted extracellular domain contains all of the peptide sequences that were determined directly from the purified protein (overlined in Fig. 4). They all perfectly matched the sequence deduced from the cDNA except for threonine residue 137, which was assigned as isoleucine in peptide T27.3 (Fig. 1). Five different cDNA clones encoded a threonine residue in the corresponding position, indicating that this change may not be due to an isoform of the protein. Instead, reexamination of the protein sequence data suggested that the isoleucine signal was carried over from the previous Edman cycle, and that the threonine escaped detection due to O-glycosylation. Significantly, all of the sequenced peptides are located in the amino-terminal half of the presumed ectodomain. The other half contains six cysteine residues that comprise an epidermal growth factor (EGF)-like domain (Figure 7), which is known to be resistant to proteolysis due to its very compact structure (reviewed in Massague, J., J. Biol. Chem. 265:21393-21396, 1990). The other two cysteine residues of the ectodomain, which appear to be disulfide-linked in NDF (described in Example 1D), define an immunoglobulin (Ig) homology unit (Figure 8). In addition, the predicted protein sequence includes four potential sites for N-linked glycosylation, all of which reside amino-terminally to the transmembrane region (indicated by asterisks in Fig. 4). A predicted overall structure for the rat NDF precursor is presented in Figure 9.
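By way of illustration, a hydropathy analysis of the kind referred to above can be sketched as a sliding-window average over the protein sequence. The sketch below is an assumption-laden example and is not the procedure used to produce Figure 6. It uses the Kyte-Doolittle hydropathy scale, a 19-residue window and a threshold of 1.6, none of which is specified in the text, and it is applied to a hypothetical toy sequence (two hydrophilic flanks around a 23-residue hydrophobic core), not to the NDF sequence.

```python
# Sliding-window hydropathy scan (illustrative sketch).
# Assumptions not taken from the text: Kyte-Doolittle scale, window of 19,
# threshold of 1.6, and the toy sequence below (not an NDF sequence).

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each window of `window` residues along `seq`."""
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Merge overlapping windows whose mean hydropathy meets the threshold.

    Returns 0-based, half-open (start, end) spans.
    """
    spans = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score >= threshold:
            if spans and i <= spans[-1][1]:
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


# Hypothetical toy sequence: 20 hydrophilic + 23 hydrophobic + 20 hydrophilic.
TOY_CORE = "LIVALLVAGIVLLAVIGLLVFAI"
TOY_SEQUENCE = "KDSEQRNGTS" * 2 + TOY_CORE + "RKKDEQSNRT" * 2

if __name__ == "__main__":
    print(hydrophobic_segments(TOY_SEQUENCE))
```

Because each window is averaged, the reported span may extend a few residues beyond the hydrophobic core itself; a candidate transmembrane region identified this way would ordinarily be refined by inspection of the sequence.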

By virtue of the transmembrane domain depicted in Figure 9, the NDF precursor is expected to accumulate on the surface of cells expressing high levels of NDF mRNAs. Membrane-bound precursor proteins have been found for other members of the EGF growth factor family. For example, Brachmann, R., et al. (Cell 56:691-700, 1989) demonstrated the presence of transforming growth factor α on the surface of cells expressing transforming growth factor α mRNA. Molecules that specifically bind proNDF could selectively direct therapeutic molecules to cells overexpressing NDF. For example, monoclonal antibodies raised against recombinant NDF and conjugated to therapeutic molecules could selectively localize the therapeutic molecule on the surface of cells expressing high levels of membrane-bound proNDF. Such cells would include tumor cells with an activated ras gene. Antibody conjugates could be constructed using chemotherapeutic compounds, cytokines, toxins, lymphokines or radionuclides.

EXAMPLE 5

Functional Analyses of Recombinant Rat NDF Expressed by Transfected COS-7 Cells

It has been previously reported that homogeneously purified, naturally-occurring NDF secreted from ras-transformed rat fibroblasts inhibits the growth of cultured human breast carcinoma cells and induces them to produce milk components indicative of cell differentiation (Peles, E. et al., Cell, 1992, above). In order to directly correlate these activities with the cloned DNA, the effects of a recombinant NDF derived from rat clone 44 on cultured breast carcinoma cells were examined. More specifically, cultures of AU-565 or MDA-MB-453 cells were treated with sixteen-fold concentrated conditioned medium from COS-7 cells that had been transfected with either the clone 44 pJT-2/NDF expression vector (Fig. 2) or a control pJT-2 plasmid that contained an unrelated cDNA insert (clone 27). Both media were used at 1:50 final dilution and incubated with the cells at 37°C for three days. Cell numbers were then determined with a hemocytometer and the nuclear area was measured by computerized image analysis. Staining for lipids and casein was performed as described in Bacus, S., et al., Mol. Carcinog. 3:350-362, 1990. The experiment was repeated three times and yielded qualitatively similar results, which are set forth in Table 1.

TABLE 1

                    Cells with       Cell Number   Nuclear Area   Cells with
                    Lipid Droplets   (10⁴/cm²)     (μ²)           Casein (%)

AU-565 Cells
  Control           10%              4 × 10⁴       106            <2
  Recombinant NDF   74%              1.3 × 10⁴     215.8          >90

MDA-MB-453 Cells
  Control           9%               3.6 × 10⁴     79.8           >1
  Recombinant NDF   56%              1.8 × 10⁴     115.7          78

> means greater than

< means less than

The results show that AU-565 and MDA-MB-453 cultures that were treated with the control-conditioned medium displayed mostly an immature morphology, whereas most of the cells treated with conditioned medium containing the recombinant NDF from rat clone 44 displayed a characteristic mature morphology that included large nuclei and the appearance of lipid droplets in the cytoplasm. Immunohistochemical staining specific for human β-casein indicated that most of these cells also synthesized casein, unlike the control-treated cultures. In conclusion, despite the fact that rat clone 44 lacks part of the 5' end of the NDF cDNA, this cDNA clone directs synthesis of a functionally active mammalian NDF which is produced by the transfected COS-7 cells.
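For convenience, the magnitude of the effects reported in Table 1 can be expressed as ratios of the recombinant NDF value to the control value. The following Python sketch is merely illustrative. The dictionary layout and function name are hypothetical, the cell number entries are taken as cells per cm² (an assumption that does not affect the ratios), and the casein column is omitted because it is reported only as bounds.

```python
# Ratios of recombinant NDF-treated to control values from Table 1.
# Data layout and names are illustrative; the numbers are those of Table 1.

TABLE_1 = {
    "AU-565": {
        "control": {"lipid_pct": 10.0, "cell_number": 4.0e4, "nuclear_area": 106.0},
        "ndf": {"lipid_pct": 74.0, "cell_number": 1.3e4, "nuclear_area": 215.8},
    },
    "MDA-MB-453": {
        "control": {"lipid_pct": 9.0, "cell_number": 3.6e4, "nuclear_area": 79.8},
        "ndf": {"lipid_pct": 56.0, "cell_number": 1.8e4, "nuclear_area": 115.7},
    },
}


def fold_change(cell_line, measure):
    """Return the treated/control ratio for one measure in one cell line."""
    row = TABLE_1[cell_line]
    return row["ndf"][measure] / row["control"][measure]


if __name__ == "__main__":
    for line in TABLE_1:
        for measure in ("lipid_pct", "cell_number", "nuclear_area"):
            print(f"{line:11s} {measure:13s} {fold_change(line, measure):.2f}")
```

A ratio below 1 for cell number reflects growth inhibition, while ratios above 1 for lipid droplets and nuclear area reflect the mature morphology described above.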

This conclusion was further supported by the ability of recombinant NDF to compete with naturally-occurring NDF in ligand displacement analyses. Purified naturally-occurring rat NDF was radiolabeled with 1 mCi of Na¹²⁵I using Iodogen (Pierce, Rockford, IL) according to the manufacturer's instructions. Unreacted iodine was separated from the protein by gel filtration on an Excellulose GF-5 desalting column (Pierce). The specific activity of the radiolabeled naturally-occurring NDF (¹²⁵I-NDF) was 3 × 10⁶ cpm/ng. The radiolabeled NDF (10 picomolar) was incubated for sixty minutes at 4°C with monolayers of MDA-MB-453 cells that were grown in a 24-well dish (Costar). This incubation was also performed in the presence of conditioned media from transfected COS-7 cells. Unbound ¹²⁵I-NDF was removed by three washes with phosphate-buffered saline (PBS), and the cells were solubilized in 0.1 N NaOH solution containing 0.1% SDS. Radioactivity was determined with a γ-counter. As depicted in Figure 10, conditioned medium containing recombinant NDF expressed from the pJT-2/NDF expression vector (Fig. 2) in COS-7 cells reduced the total ¹²⁵I-NDF binding by approximately 50%. For a negative control, COS-7 conditioned medium from cells transfected with a pJT-2 expression vector containing a rat TGFα cDNA (Blasband, A. et al., Mol. Cell Biol. 10:2111-2121, 1990) was employed. In contrast to the recombinant NDF conditioned medium, the TGFα conditioned medium did not inhibit ¹²⁵I-NDF binding. The partial inhibitory effect of the recombinant rat NDF conditioned medium, as compared with the purified, natural, unlabeled rat NDF (Peles, E. et al., Cell, 1992, above), may be attributed either to its relatively low concentration in the binding assay, or to high non-specific ligand binding.
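The quantities used in the ligand displacement analysis above can be related to one another by simple arithmetic. The following Python sketch is illustrative only. It assumes a molecular mass of 44 kDa for the labeled NDF (the text elsewhere refers to 44 kDa isoforms, but no mass is stated for this preparation), and the counts passed to `percent_reduction` are hypothetical values, not measured data.

```python
# Arithmetic for the 125I-NDF binding assay (illustrative sketch).
# Assumption: labeled NDF has a mass of 44 kDa. Function names are hypothetical.

SPECIFIC_ACTIVITY_CPM_PER_NG = 3e6   # from the text
LIGAND_MOLAR = 10e-12                # 10 picomolar, from the text
ASSUMED_MASS_DA = 44_000             # assumption


def ligand_ng_per_ml(molar, mass_da):
    """Convert a molar concentration to ng/ml for a protein of mass `mass_da`."""
    grams_per_liter = molar * mass_da
    return grams_per_liter * 1e9 / 1e3   # g/l -> ng/l -> ng/ml


def ligand_cpm_per_ml(molar, mass_da, cpm_per_ng):
    """Radioactivity added per ml of binding medium."""
    return ligand_ng_per_ml(molar, mass_da) * cpm_per_ng


def percent_reduction(total_cpm, competed_cpm):
    """Percent reduction in total bound counts caused by a competitor."""
    if total_cpm <= 0:
        raise ValueError("total_cpm must be positive")
    return 100.0 * (total_cpm - competed_cpm) / total_cpm


if __name__ == "__main__":
    print(ligand_ng_per_ml(LIGAND_MOLAR, ASSUMED_MASS_DA))
    print(ligand_cpm_per_ml(LIGAND_MOLAR, ASSUMED_MASS_DA,
                            SPECIFIC_ACTIVITY_CPM_PER_NG))
    # Hypothetical counts chosen to mirror the approximately 50% reduction.
    print(percent_reduction(1000.0, 500.0))
```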

To overcome this problem and also demonstrate direct interaction with the neu receptor, a covalent crosslinking assay was employed. This was performed by allowing MDA-MB-453 cells to bind ¹²⁵I-NDF in the presence of transfected COS-7 cell conditioned media, as above. The chemical crosslinking reagent bis(sulfosuccinimidyl) suberate (BS³, Pierce) was added at 1 mM final concentration after one hour of binding at 4°C, followed by a brief wash with PBS. Following forty-five minutes of incubation at 22°C, the monolayers were incubated for ten minutes with quenching buffer (100 mM glycine in PBS, pH 7.4). The cells were then washed twice with ice-cold PBS, lysed in lysis buffer, and the neu protein was immunoprecipitated with monoclonal antibody Ab-3 according to the protocol described in Peles, E. et al., EMBO J. 10:2077-2086, 1991. The extensively washed immunocomplexes were resolved by gel electrophoresis (6% acrylamide) and autoradiography. The results of this analysis (Figure 11) indicated that the recombinant NDF, unlike recombinant TGFα, was able to displace most of the receptor-bound ¹²⁵I-NDF. This result confirmed that rat clone 44 cDNA encodes a functional NDF molecule.

EXAMPLE 6

Northern Blot Analyses of NDF Expression

To determine the size and tissue distribution of the NDF mRNA, Northern blot hybridization experiments were carried out using the cDNA insert of clone 44 as a hybridization probe. Tissues were obtained through surgery of adult female rats, their RNA was extracted and poly(A)⁺ RNA was selected using standard methods (Maniatis, T. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1982). The 1.9 kb-long cDNA insert of rat clone 44 was labeled with α-³²P-dCTP by the random priming method (Feinberg, A.P., and Vogelstein, B., Anal. Biochem. 132:6-13, 1983). The conditions of hybridization were as follows: 6 × SSC, 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 2 × Denhardt's solution, 50 μg/ml salmon sperm DNA and 50% formamide. Hybridization was carried out for fourteen hours at 42°C, followed by washing for thirty minutes at 60°C with 0.2 × SSC, 0.1% SDS and 2 mM EDTA. The filters were exposed to Kodak XAR x-ray film with an intensifier screen at -70°C for the indicated periods of time.
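As a practical illustration, the hybridization solution described above can be assembled from concentrated stocks using the relation final concentration / stock concentration × total volume. The sketch below is not taken from the text: the stock concentrations (20 × SSC, 1 M sodium phosphate, 10% sodium pyrophosphate, 50 × Denhardt's solution, 10 mg/ml salmon sperm DNA and 100% formamide), the 10 ml batch size and all names are assumptions chosen for the example.

```python
# Volumes of stock solutions needed for the hybridization mix (sketch).
# Final concentrations are from the text; stock concentrations are ASSUMED.

# (component, final concentration, assumed stock concentration)
# Units match within each row: x-fold, mM, percent, x-fold, ug/ml, percent.
RECIPE = [
    ("SSC", 6.0, 20.0),
    ("sodium phosphate, pH 6.8", 50.0, 1000.0),
    ("sodium pyrophosphate", 0.1, 10.0),
    ("Denhardt's solution", 2.0, 50.0),
    ("salmon sperm DNA", 50.0, 10000.0),
    ("formamide", 50.0, 100.0),
]


def stock_volume(final_conc, stock_conc, total_volume):
    """Volume of stock needed so that the mix reaches `final_conc`."""
    if not 0 <= final_conc <= stock_conc:
        raise ValueError("final concentration must not exceed the stock")
    return total_volume * final_conc / stock_conc


def hybridization_mix(total_ml):
    """Return a dict of component volumes (ml), with water making up the rest."""
    volumes = {
        name: stock_volume(final, stock, total_ml)
        for name, final, stock in RECIPE
    }
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested recipe")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    for name, ml in hybridization_mix(10.0).items():
        print(f"{name:26s} {ml:6.2f} ml")
```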

Figure 12 (panel A) shows that three bands were visualized in the Northern blots with poly(A)-selected RNA of Rat-1-EJ fibroblasts. Their molecular sizes corresponded to 6.8, 2.6, and 1.7 kilobases. After further autoradiography, two additional species of mRNA became visible (not shown). The relative level of expression of NDF in normal Rat-1 or 3T3 fibroblasts was significantly lower than in the ras-transformed cells, in agreement with the earlier observation that correlated the neu receptor stimulatory activity with transformation by an oncogenic ras gene (Yarden, Y., and Weinberg, R., Proc. Natl. Acad. Sci. USA 86:3179-3188, 1989).

Figure 12, panel B demonstrates tissue-specific regulation of NDF mRNA expression levels in adult rat tissues. The highest NDF mRNA expression was observed in the spinal cord. NDF mRNAs were also detected in brain tissue. In addition to the mRNAs described above, these tissues express a variant mRNA that is 3.4 kb in size. Other positive tissues include lung, ovary and stomach. Relatively low amounts of the middle-size transcript were displayed by the skin, kidney and heart. The liver, spleen and muscle did not contain detectable NDF mRNA. The Northern blot also shows tissue-specific differences in the relative proportion of the different NDF mRNA species. Each variant mRNA could encode variant NDF proteins with different biological properties. Naturally-occurring NDF may regulate cell proliferation and differentiation through tissue-specific variations in NDF mRNA structure and expression.

Figure 12, panel C, shows that human tumor cells can overexpress NDF mRNA. An initial survey of human tumor cell lines found that HT-1080 fibrosarcoma cells, Hs 294T melanoma cells and MDA-MB-231 mammary adenocarcinoma cells express elevated NDF mRNA levels.

EXAMPLE 7

Characterization of cDNA Clones Isolated from a Rat-1-EJ cDNA Library that Encode Additional NDF Isoforms

The isolation of rat NDF cDNA clone 44 and nine additional independent rat NDF cDNA clones from a cDNA library prepared from Rat-1-EJ cell poly(A)⁺ RNA is described in Example 2. These ten independent cDNA clones have been designated as clones 4, 19, 20, 22, 38, 40, 41, 42A, 42B, and 44 (Table 2). Clones 44 and 42B were deposited with the American Type Culture Collection, Rockville, Maryland, under accession numbers 69303 and 69301, respectively. The nucleotide sequences for both strands of the NDF coding region were determined for all ten cDNAs using synthetic oligonucleotide primers with fluorescence-based dideoxy-DNA sequencing as described in Example 4. The cDNAs ranged in size from 1.1 kb to 3.2 kb and, presumably, are derived from the 6.8, 2.6 and 1.7 kb NDF mRNAs found in Rat-1-EJ cells (See Example 6). DNA sequence analyses revealed that these ten cDNAs encode six distinct NDF precursor proteins (proNDFs) (Table 2, and Figures 5 and 13-21). The open reading frames in the ten cDNAs ranged from 0.7 kb (clone 4) to 2.0 kb (clone 42A). Two of the cDNAs were partial cDNA clones. The clone 4 sequence begins at proNDF codon 11. Clone 40 is truncated at codon 36 of the intracellular domain. Clone 44 has a complete open reading frame, but has a truncated 5'-untranslated sequence.

The ten cDNAs encode six homologous NDF precursor proteins composed of 241 to 662 amino acid residues, which are designated proNDF-α2a [SEQ ID NO: 43], proNDF-α2b, proNDF-α2c, proNDF-β2a, proNDF-β3 and proNDF-β4a (Figures 22, 23 and Table 2). The six predicted proNDFs are identical in the first 213 amino acid residues. This identical region includes: a basic N-terminus which is proteolytically processed at amino acid 14 in naturally occurring rat NDF; an immunoglobulin-like domain; a spacer domain with sites for N- and O-linked glycosylations; and the first two disulfide loops of the EGF-like domain (See Example 4 and Figure 9). ProNDF cDNAs do not encode N-terminal hydrophobic signal peptide sequences. However, with the exception of proNDF-β3, a 23-amino acid hydrophobic region is present in the proNDFs. Carboxy-terminal to this putative transmembrane domain is a cytoplasmic domain that is invariant in the first 157 amino acid residues.

Differences among the NDF precursor proteins localize to two regions. The variable region of the proNDF ectodomain begins in the EGF-like domain at amino acid position 213. The α/β variation alters the sequence between the fifth and sixth cysteines of this domain. The homologous regions of EGF and TGF-α are essential for activity and are thought to interact directly with the EGF receptor. Since the neu receptor stimulatory activities reside in the EGF-like domains (e.g., NDF-α2₁₇₇₋₂₄₁, Example 15 and Figure 43), the α/β sequence variation may alter receptor activation in different cellular contexts. Alternatively, the NDF isoforms may differentially bind receptor heterodimers formed between the NDF receptor and related receptor molecules, such as the 170 kDa EGF receptor (Wada, T. et al., Cell 61:1339-1347, 1990; Spivak-Kroizman, T. et al., J. Biol. Chem. 267:8056-8063, 1992), p160^erbB3 (Plowman, G.D. et al., Mol. Cell Biol. 10:1969-1981, 1990) or p180^erbB4 (Plowman, G.D. et al., Proc. Natl. Acad. Sci. USA 90:1746-1750, 1993).

As can be seen from Figure 23, the proNDF-β proteins share a common sequence at positions 213 to 230 that is distinct from the corresponding NDF-α sequence. Variation between members of the proNDF-β subfamily occurs carboxy-terminal to the EGF-like domain, where the β1 and β4a proteins contain additional amino acid residues. ProNDF-β3 cDNA encodes a stop codon in this region, resulting in a smaller protein that lacks the transmembrane and cytoplasmic domains.

The different proNDF sequences may alter proteolytic release of 44 kDa isoforms from the larger NDF precursor molecules. The proNDF proteins differ significantly in the number of charged residues located between the EGF-like domain and the hydrophobic transmembrane domain. For example, the 27-amino acid sequence that is unique to the β4a isoform includes nine basic residues and five acidic residues. This sequence introduces several dibasic residues that are potential proteolytic cleavage sites.

Cytoplasmic domains with 374, 196 and 157 amino acid residues distinguish proNDFs α2a, α2b and α2c, respectively (Figures 5, 14, 15, 17, 21, 22, and 23). The proNDF-α2 a, b and c isoforms are identical in the first four hundred and twenty-two amino acid residues. The proNDF-α2a cytoplasmic domain contains two hundred and seventeen additional amino acid residues. This extended carboxyl-terminus is found in proNDFs β2a and β4a (Figures 16, 19, and 20). ProNDF-α2b cDNA encodes a different 39-amino acid sequence following position 422 that results from a DNA insert of one hundred and forty-six base pairs in the proNDF-α2b cDNA sequence. ProNDF-α2c has the shortest cytoplasmic domain, with the cDNA having a stop codon at codon position 423 followed by a different 3'-untranslated sequence.

TABLE 2. Rat-1-EJ cDNA Clones

cDNA     Size   proNDF                 Bioassay   Figure
Clone    (kb)   Isoform                Resultsᵃ   Number

4ᵇ       1.1    proNDF-β3              -          13
19       2.9    proNDF-α2b             +          14
20       2.5    proNDF-α2b             +          15
22ᶜ      3.4    proNDF-β2a             +          16
38       2.4    proNDF-α2a             +          17
40ᵈ      1.3    truncated proNDF-β2    +          18
41       2.7    proNDF-β2a             +          19
42A      2.4    proNDF-β4a             +          20
42Bᶜ     3.2    proNDF-α2a             +          21
44       1.9    proNDF-α2c             +          5

ᵃ Media conditioned by transfected COS-7 cells were assayed for stimulation of neu receptor tyrosine phosphorylation using the procedure described in Example 3.

ᵇ Truncated at 5' end, starts at codon 11. The 5' end was replaced with the 5' end of clone 44 for expression studies.

ᶜ The original cDNA clones had two inserts. Data on the NDF-related sequences are presented.

ᵈ Truncated at codon 36 of intracellular domain.

EXAMPLE 8

Expression of Recombinant Rat NDF Proteins in COS-7 Cells

Transient COS-7 cell expression of proNDF cDNAs was employed to determine if all ten cDNA clones encode biologically active NDF proteins. Utilizing the methods of Example 3, the proNDF cDNA clones were inserted into the pJT-2 eukaryotic plasmid vector under the control of the SV40 early promoter with 3'-flanking SV40 termination and polyadenylation signals, and transfected into COS-7 cells. Media conditioned by the transfected COS-7 cells were incubated with human MDA-MB-453 cells expressing the neu receptor. The MDA-MB-453 cells were lysed, and the lysates were analyzed in Western blots for tyrosine phosphorylation (Figure 24). Nine of the ten cDNA clones directed the synthesis and secretion of biologically active recombinant NDF into the COS-7 cell conditioned media (Table 2). Clone 4, which encodes proNDF-β3, did not produce a biologically active conditioned medium.
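The summary statements above (nine of ten clones active, six distinct precursor isoforms) can be checked against the entries of Table 2. The following Python sketch simply transcribes Table 2 into a list and tallies it. The `Clone` record and the helper function names are hypothetical conveniences introduced for the example.

```python
# Tally of Table 2 (illustrative sketch; record and function names are hypothetical).
from typing import NamedTuple


class Clone(NamedTuple):
    name: str
    size_kb: float
    isoform: str
    active: bool      # "+" or "-" in the bioassay column
    figure: int


TABLE_2 = [
    Clone("4", 1.1, "proNDF-β3", False, 13),
    Clone("19", 2.9, "proNDF-α2b", True, 14),
    Clone("20", 2.5, "proNDF-α2b", True, 15),
    Clone("22", 3.4, "proNDF-β2a", True, 16),
    Clone("38", 2.4, "proNDF-α2a", True, 17),
    Clone("40", 1.3, "truncated proNDF-β2", True, 18),
    Clone("41", 2.7, "proNDF-β2a", True, 19),
    Clone("42A", 2.4, "proNDF-β4a", True, 20),
    Clone("42B", 3.2, "proNDF-α2a", True, 21),
    Clone("44", 1.9, "proNDF-α2c", True, 5),
]


def active_clones():
    """Names of clones whose conditioned media scored positive."""
    return [c.name for c in TABLE_2 if c.active]


def distinct_isoforms():
    """Isoform names, leaving out the entry listed as truncated."""
    return sorted({c.isoform for c in TABLE_2
                   if not c.isoform.startswith("truncated")})


if __name__ == "__main__":
    print(len(active_clones()), "of", len(TABLE_2), "clones active")
    print(distinct_isoforms())
```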

Synthesis of NDF proteins in transfected COS-7 cells was also analyzed by immunoprecipitation of ³⁵S-labeled proteins. For these experiments, antibodies were raised by immunizing rabbits with recombinant rat met-NDF-α2₁₄₋₂₄₁ purified from E. coli as described in Example 14 (Section I), below. Rabbits were immunized with an initial injection of 200 μg of recombinant rat met-NDF-α2₁₄₋₂₄₁. The antigen was emulsified with complete Freund's adjuvant and injected subcutaneously at multiple sites. At four week intervals the rabbits were boosted with 200 μg of antigen in incomplete Freund's adjuvant injected at multiple subcutaneous sites. Sera were collected by ear vein bleed ten days after each injection and evaluated for reactivity with recombinant rat met-NDF-α2₁₄₋₂₄₁ by an ELISA. Anti-NDF antibodies were purified by affinity chromatography using an Actigel ALD kit as recommended by the manufacturer (Sterogene Bioseparations, Inc., Arcadia, CA). The affinity gel was made by overnight coupling of 2 mg of recombinant rat met-NDF-α2₁₄₋₂₄₁ to 3 ml monoaldehyde-agarose. Unreacted aldehyde groups were deactivated with 0.1 M ethanolamine. Five milliliters of antiserum were passed through a 1 × 10 cm column containing the affinity gel. The column was washed with 50 ml PBS, and eluted with 3 mM MgCl₂, 80 mM HEPES, pH 6.0, 25% ethylene glycol. The eluted antibodies were dialyzed overnight against PBS at 4°C and concentrated with a Centriprep unit (Amicon) to ~1 mg/ml concentration.

The resulting purified anti-NDF antibody was used to immunoprecipitate recombinant ³⁵S-labeled NDF proteins from COS-7 cell cultures expressing the various recombinant precursor proteins. COS-7 cells transfected with proNDF cDNA expression plasmids were grown in 60-mm dishes in DMEM plus 10% FBS at 37°C for forty-eight hours. Cells were washed and placed in 1.5 ml DMEM minus methionine and cysteine with 1% dialyzed FBS. After sixty minutes, 200 μCi ³⁵S-methionine/cysteine (Tran³⁵S-label, ICN) was added and cells were incubated overnight for approximately seventeen hours. Media were collected, made 1 mM in phenylmethyl sulfonylfluoride (PMSF), and clarified of cell debris in a microfuge. The cells were harvested from culture dishes by scraping them into 1 ml PBS. The cells were recovered by centrifugation and lysed by addition of 200 μl of 1% SDS, 1 mM PMSF and heating at 100°C for three minutes. The cell lysates were clarified by centrifugation and diluted 1:4 in immunoprecipitation dilution buffer (1.25% Triton X-100; 190 mM NaCl; 60 mM Tris-HCl, pH 7.4; 6 mM EDTA; 10 units/liter trasylol).

The radiolabeled COS-7 cell conditioned media and diluted cell lysates were pretreated at 4°C for two hours with 20 μl of rabbit normal serum and 20 μl of protein A-Sepharose CL-4B (Sigma) diluted 1:1 in PBS. The protein A-Sepharose was then removed by centrifugation. The pretreated samples were gently agitated at 4°C overnight with affinity-purified anti-NDF antibody (20 μg/ml) and 20 μl protein A-Sepharose. The protein A-Sepharose beads were pelleted and washed four times in a washing solution (0.1% Triton X-100; 0.02% SDS; 150 mM NaCl; 50 mM Tris-HCl, pH 7.5; 5 mM EDTA; 10 units/liter trasylol). The washed beads were heated with 60 μl of SDS gel electrophoresis loading buffer at 100°C for three minutes. The beads were removed by centrifugation and the supernatants analyzed by electrophoresis in 10% SDS-PAGE gels (Novex). Prestained protein molecular weight markers were from Novex.

NDF proteins were undetectable in cell lysates and in conditioned media from COS-7 cells transfected with irrelevant cDNAs, and also when samples were immunoprecipitated in the presence of excess unlabelled recombinant NDF (not shown in Figures). The affinity-purified antibody immunoprecipitated 40 to 45 kDa recombinant NDF proteins from cells transfected with cDNAs encoding proNDF proteins α2a, α2b, α2c, β2a and β4a (Figure 25). The immunoprecipitated proteins correspond in size to the processed form of NDF purified from media conditioned by Rat-1-EJ cells. The 40 to 45 kDa recombinant NDF proteins were found in both cell lysates and conditioned media. Larger NDF precursor proteins were detected as high molecular weight proteins immunoprecipitable from the cell lysates, but not from conditioned media. As predicted from proNDF cDNA structures (Fig. 23), the NDF α2a, β2a, and β4a precursor proteins are significantly larger than the NDF α2b and α2c precursor proteins. These immunoprecipitated proteins are presumably the full-length, glycosylated NDF precursor proteins. NDF-α2c precursor proteins were found at higher levels than the other precursor isoforms, suggesting somewhat less efficient processing for this variant. In contrast to other clones, proNDF-β3 cDNA directed the synthesis of 30- to 35-kDa proteins. These proteins were found in the COS-7 cell lysate, but not in the conditioned medium (Figure 25).

The diffuse nature and the size of the immunoprecipitated recombinant NDF proteins suggest that they are glycosylated. A large percentage of the mass of naturally occurring rat NDF is due to N- and O-linked glycosylations (Peles, E. et al., Cell, 1992, above). Five microliters of immunoprecipitated recombinant NDF-β3 and -α2c proteins in SDS gel loading buffer were diluted into 20 μl H₂O with CHAPS added to a 2% final concentration. The samples were incubated at 37°C for two hours after adding 0.5 unit of N-glycanase, O-glycanase or neuraminidase (Genzyme). The digested proteins were mixed with 5 μl of 4X SDS gel loading buffer, heated at 100°C for three minutes and analyzed by electrophoresis in a 10% SDS-PAGE gel (Novex). Prestained molecular weight markers were from Novex. Enzymatic deglycosylation of the 40-44 kDa and 60-75 kDa NDF-α2c recombinant proteins reduced their size by approximately 10 kDa (Figure 26). The mobility shifts with the glycanases indicate that O-linked glycosylations are more abundant than N-linked glycosylations on both the recombinant NDF-α2c precursor proteins and processed 40-44 kDa NDF-α2c recombinant proteins. Glycosylation sites found in the proNDF "spacer" region are apparently modified prior to proteolytic processing of the proNDF. NDF-β3, however, was resistant to glycanase treatment. This result, and the exclusively intracellular localization of NDF-β3, indicate that NDF-β3 protein does not enter the secretory pathway. This is consistent with the absence of a significant hydrophobic domain in NDF-β3.

Amino terminal signal peptide sequences are notably absent from all of the predicted NDF precursor proteins. However, the different isoforms of proNDF (with the exception of proNDF-β3) are all glycosylated and processed to 40 to 45 kDa biologically active NDF glycoproteins. These proteins are found in the transfected COS-7 cell culture media. The capacity for secretion localizes to the internal 23-amino acid hydrophobic domain. ProNDF-β3, which accumulates as an unglycosylated intracellular protein, has essentially the entire proNDF extracellular domain. In contrast, rat clone 40 cDNA encodes a truncated proNDF-β2 comprised of the extracellular domain, the 23-amino acid hydrophobic domain, and the first thirty-six amino acid residues of the cytoplasmic domain. This truncated cDNA clone directs the expression of biologically active NDF that accumulates in conditioned medium. The intracellular, nonsecreted NDF-β3 protein may be released from cells by a different mechanism. The NDF-β3 isoform may function as a sequestered growth factor that is released when tissues are injured. A similar role has been proposed for IL-1β, which also lacks a signal peptide sequence (Mizutani, H. et al., J. Clin. Invest. 87:1066-1071, 1991). Alternatively, NDF-β3 could have intracellular functions.

EXAMPLE 9

PCR Analyses of Rat ProNDF mRNA and Cloning of an Additional Rat NDF cDNA

PCR was used to analyze expression of proNDF mRNAs that encode the alternative EGF-like domains. Eleven rat tissues were analyzed for tissue-specific expression of mRNAs encoding variant proNDF proteins (Figure 27). NDF cDNA sequences encoding the variable portion of the EGF-like domain (e.g., proNDF-α2a codon positions 200-241) were amplified using the synthetic oligonucleotide primers 5'-GCGTCTAGATGAAGGACCTGTCAAACCC-3' (sense) [SEQ ID NO: 75] and 5'-GCGGGATCCCTTCTGGTAGAGTTCCTCC-3' (antisense) [SEQ ID NO: 76]. The underlined sequences add XbaI or BamHI restriction sites to the 5' ends of the primers. RNA samples (1 μg) from normal adult female Sprague-Dawley rat (Charles River Laboratory, Wilmington, MA) tissues and from the Rat-1 and Rat-1-EJ cell lines were reverse transcribed to generate first strand cDNA with cDNA Cycle™ kits (Invitrogen). PCR reactions employed Perkin Elmer Cetus GeneAmp™ PCR kits. The reactions were in a final volume of 50 μl and contained 10% of the reverse transcription reaction products or approximately 100 ng of cloned cDNAs as positive controls. Twenty-five cycles in a Perkin Elmer-Cetus GeneAmp™ 9600 thermocycler amplified the PCR products. Each cycle included twenty seconds at 94°C, twenty seconds at 55°C, and twenty seconds at 72°C. Ten μl aliquots of each reaction mixture were analyzed by electrophoresis in 2% agarose gels, followed by ethidium bromide staining. DNA molecular weight markers were from Boehringer Mannheim Corp. (Indianapolis, IN). Most of the eleven tissues yielded cDNA that comigrated with products obtained from cloned proNDF-α2 and proNDF-β2 cDNAs. The slightly larger PCR products from brain and spinal cord (lanes 11 and 12) were cloned into the pSPORT vector (GIBCO BRL) and the resulting plasmids transformed into E. coli. Plasmids from colonies that hybridized to the oligonucleotide probe 5'-TGAAGGACCTGTCAAACCC-3' [SEQ ID NO: 77], which corresponds to the sense oligonucleotide PCR primer without an XbaI sequence, were analyzed by DNA sequencing. These cDNAs encoded a new NDF amino acid sequence corresponding to a NDF-β2 sequence with eight additional amino acids following codon position 231 (Figures 23 and 28). This sequence, herein termed NDF-β1, is homologous to human heregulin β1 (Holmes et al., Science, 1992, above).
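The relationship among the two PCR primers, the added restriction sites and the hybridization probe described above can be verified with a short script. The following Python sketch is offered for illustration only. The recognition sequences used for XbaI (TCTAGA) and BamHI (GGATCC) are the standard ones, the sense primer is written so that its 3' portion matches the probe of SEQ ID NO: 77, the nominal cycling time ignores temperature ramping, and all names are hypothetical.

```python
# Consistency check of the PCR primers and probe (illustrative sketch).

RESTRICTION_SITES = {"XbaI": "TCTAGA", "BamHI": "GGATCC"}

SENSE_PRIMER = "GCGTCTAGATGAAGGACCTGTCAAACCC"       # SEQ ID NO: 75
ANTISENSE_PRIMER = "GCGGGATCCCTTCTGGTAGAGTTCCTCC"   # SEQ ID NO: 76
PROBE = "TGAAGGACCTGTCAAACCC"                       # SEQ ID NO: 77


def find_sites(seq):
    """Map each restriction enzyme to the 0-based position of its site in `seq`."""
    return {
        enzyme: seq.find(site)
        for enzyme, site in RESTRICTION_SITES.items()
        if site in seq
    }


def nominal_cycling_minutes(cycles, step_seconds):
    """Total hold time in minutes, ignoring ramping between temperatures."""
    return cycles * sum(step_seconds) / 60.0


if __name__ == "__main__":
    print("sense:", find_sites(SENSE_PRIMER))
    print("antisense:", find_sites(ANTISENSE_PRIMER))
    print("probe is 3' portion of sense primer:", SENSE_PRIMER.endswith(PROBE))
    print("hold time (min):", nominal_cycling_minutes(25, (20, 20, 20)))
```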

Tissue-specific NDF variants were not detected by PCR and DNA sequencing with the exception of preferential β1 isoform expression in brain and spinal cord, which suggests a neurological role for the NDF-β1 isoform. A NDF-β1 related cDNA (designated ARIA-1) was cloned from an embryonic chick brain cDNA library (Falls et al., Cell, 1993, above). Recombinant ARIA-1 increases the number of acetylcholine receptors and Na⁺ channels in muscle cells. In addition, in situ hybridizations with NDF and ARIA probes detect expression in spinal cord motor neurons, intestinal and dorsal root ganglia, and neuroepithelium lining lateral ventricles of the brain (See, also, Orr-Urtreger, et al., Proc. Natl. Acad. Sci. USA, 1993, above).

The conserved 374 amino acid cytoplasmic domain found in proNDFs α2a, β2a and β4a is more than 85% identical to the proHRG (Holmes et al., Science, 1992, above) and proARIA-1 (Falls, D.L. et al., Cell, 1993, above) cytoplasmic domains. All of these cytoplasmic domains have a carboxy-terminal valine residue that may be important for proteolytic processing. ProTGF-α has a carboxy-terminal valine that is critical for regulated release of TGF-α (Bosenberg, M.W. et al., Cell 71:1157-1165, 1992). Leucine or isoleucine can substitute for the terminal TGF-α valine in directing cleavage.

ProNDF-α2b has a 196 amino acid residue cytoplasmic domain with a carboxy-terminal leucine and is processed efficiently to mature NDF-α2. In contrast, the 157 amino acid proNDF-α2c cytoplasmic domain ends with two hydrophilic residues. Accumulation of proNDF-α2c within transfected COS-7 cells suggests that this precursor isoform is less efficiently processed. However, processing to mature NDF-α2 does occur, demonstrating that NDF release from COS-7 cells does not require a terminal hydrophobic residue. In nontransfected cells, alternative carboxy-termini may be more critical for mature NDF release.
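The cytoplasmic domain lengths cited above for the proNDF-α2 isoforms are consistent with the sequence relationships given in Example 7 (an invariant 157-residue cytoplasmic region, identity through residue 422, and carboxy-terminal extensions of 217, 39 or no residues). The following Python sketch restates that arithmetic. The names are hypothetical, and the total precursor lengths it returns are derived from those statements rather than quoted from the text.

```python
# Domain-length arithmetic for the proNDF-alpha2 isoforms (illustrative sketch).
# Inputs are the residue counts stated in Examples 7 and 9; names are hypothetical.

INVARIANT_CYTOPLASMIC = 157   # residues shared by the cytoplasmic domains
SHARED_PRECURSOR = 422        # alpha2 a, b and c are identical through residue 422

# Residues following position 422 in each isoform.
C_TERMINAL_EXTENSION = {
    "proNDF-α2a": 217,
    "proNDF-α2b": 39,
    "proNDF-α2c": 0,   # stop codon at codon position 423
}


def cytoplasmic_length(isoform):
    """Length of the cytoplasmic domain in residues."""
    return INVARIANT_CYTOPLASMIC + C_TERMINAL_EXTENSION[isoform]


def precursor_length(isoform):
    """Total precursor length implied by the stated numbers (derived value)."""
    return SHARED_PRECURSOR + C_TERMINAL_EXTENSION[isoform]


if __name__ == "__main__":
    for name in C_TERMINAL_EXTENSION:
        print(name, cytoplasmic_length(name), precursor_length(name))
```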

EXAMPLE 10

Isolation of Human NDF cDNA Clones

I. Initial cDNA Library Screening

Several human cell lines were screened for the presence of NDF-related mRNAs by Northern analysis with a rat clone 44 NDF cDNA probe. Poly(A)⁺ RNA was isolated from the cell lines indicated in Figure 29 and separated by agarose gel electrophoresis (5 μg mRNA per lane). RNAs were transferred onto nitrocellulose filters and probed with ³²P-labeled rat NDF clone 44 cDNA (see Example 6). Among them, a human kidney adenocarcinoma cell line, A-704 (American Type Culture Collection, ATCC HTB 45), exhibited the highest level of NDF mRNA expression.

Double-strand cDNA was synthesized from poly(A)⁺ RNA extracted from A-704 cells. Column-fractionated cDNA was ligated to SalI and NotI digested plasmid vector pJT-2 and transformed into E. coli strain DH10B (GIBCO BRL) or E. coli strain MC1061 (BioRad) by electroporation, using a cDNA kit and procedure recommended by the manufacturer (GIBCO BRL).

Approximately 700,000 primary transformants were screened with the ³²P-labelled rat clone 44 NDF cDNA probe. Hybridization was at medium stringency (30% formamide, 6x SSC, 2x Denhardt's solution, 100 μg/ml salmon sperm DNA, 1 mM EDTA, 0.2% SDS, 0.1% sodium pyrophosphate, 50 mM NaH₂PO₄, pH 6.8) at 42°C overnight. Filters were washed in a solution of 0.5x SSC, 2 mM EDTA, and 0.2% SDS at 23°C for thirty minutes; then at 42°C for ninety minutes. The filters were exposed overnight to X-ray film at -75°C.

Seven positive clones were identified after several rounds of colony hybridization. Plasmid DNA was purified from each clone and cDNA sequences were determined by the methods of Example 4.

Among the seven positive human cDNA clones were one cDNA encoding a full length NDF-related molecule, designated as human NDF-α2b [SEQ ID NO: 7] (Figure 32, clone 43), and two cDNAs encoding partial human NDF sequences, designated as NDF-α2b [SEQ ID NO: 9] (Figure 33, clone 17) and NDF-α3 [SEQ ID NO: 11] (Figure 34, clone 19).

II. Isolation of Additional Human NDF cDNA Clones

A cDNA library was synthesized with poly(A)⁺ RNA extracted from A-704 cells. Double-strand cDNA was ligated to BstXI linkers and ligated into the BstXI site on plasmid vector pCDNA II (Invitrogen, San Diego, CA). The ligation mixture was used to transform E. coli strain DH10B.

Approximately 260,000 primary transformants were screened with a ³²P-labeled human NDF-α2b cDNA probe (clone 43, Fig. 32) using the procedure described by Martin et al. (Cell, 63:203-211, 1990). Hybridization was at high stringency (50% formamide, 6x SSC, 2x Denhardt's solution, 100 μg/ml salmon sperm DNA, 1 mM EDTA, 0.2% SDS, 0.1% sodium pyrophosphate, 50 mM NaH₂PO₄, pH 6.8) at 42°C overnight. The filters were washed in a solution made of 0.1x SSC, 2 mM EDTA, 0.2% SDS, at 23°C for 30 min; then at 70°C for 60 min. The filters were then exposed to X-ray films at -75°C overnight.
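The medium-stringency and high-stringency screens in this Example differ in formamide concentration and in the final wash. As a rough illustration only, the sodium ion concentration of each wash can be estimated from its SSC strength. The sketch below assumes the conventional 1x SSC composition (0.15 M NaCl, 0.015 M trisodium citrate), which this document does not state, and it ignores sodium contributed by EDTA and SDS. The function and variable names are illustrative.

```python
# Assumption (not stated in this document): 1x SSC = 0.15 M NaCl + 0.015 M
# trisodium citrate. Each trisodium citrate contributes three Na+ ions.
NACL_M_PER_X = 0.15
CITRATE_M_PER_X = 0.015


def sodium_molar(ssc_strength: float) -> float:
    """Approximate Na+ molarity contributed by SSC at a given strength (0.5 means 0.5x)."""
    return ssc_strength * (NACL_M_PER_X + 3 * CITRATE_M_PER_X)


# Final wash conditions taken from the two library screens described above.
washes = {
    "medium stringency (rat clone 44 probe)": {"ssc": 0.5, "temp_c": 42},
    "high stringency (human clone 43 probe)": {"ssc": 0.1, "temp_c": 70},
}

for name, wash in washes.items():
    na_mm = sodium_molar(wash["ssc"]) * 1000
    print(f"{name}: {wash['ssc']}x SSC at {wash['temp_c']} C, about {na_mm:.1f} mM Na+")
```

Lower salt and higher temperature both destabilize mismatched duplexes, which is why the 0.1x SSC, 70°C wash is the more stringent of the two.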

Six positive clones were identified after several rounds of colony hybridization. Plasmid DNA was purified from the clones and the cDNA sequences were determined by the method previously described. Among the positive cDNA clones was a 920 bp cDNA encoding a partial human NDF-β3 [SEQ ID NO: 17] (Figure 37, clone 33).

III. Screening of human pituitary cDNA libraries

Two human pituitary cDNA libraries (Clontech, Palo Alto, CA) were screened with the human NDF-α2b clone 43 cDNA probe. One library was made from normal pituitary gland (Cat.# HL1139a) while the second was made from a gonadotropin producing adenoma (Cat.# HL1096v). The library screening procedures were the same as mentioned above.

Among ten positive cDNA clones identified from the normal pituitary library, DNA sequencing data showed that one partial cDNA clone encoded human NDF-α1a sequences [SEQ ID NO: 3] (Figure 30, clone P1) and another partial cDNA clone encoded human NDF-β3 sequences.

Among four positive cDNA clones identified from the pituitary tumor library, one partial cDNA encoded human NDF-β1a sequences [SEQ ID NO: 13] (Figure 35, clone P13).

IV. PCR Cloning of Additional Human NDF cDNA Sequences

To identify additional independent human NDF cDNA clones, a PCR cloning strategy was used to amplify NDF-specific DNA sequences from tumor cell lines that express NDF mRNAs (see Fig. 29). Five human tumor cell lines were used: HT-1080, Hs 294T, 5637 (ATCC HTB9), MDA-MB-231 and A-704. Poly(A)⁺ RNAs were extracted from the five cell lines and reverse-transcribed to generate first-strand cDNA templates using a cDNA kit from GIBCO BRL. NDF cDNA sequences (~1 kb) corresponding to the region containing amino acids 119 to 410 in human NDF-α2b (Figure 31) were amplified.

This region contains: the spacer domain; the variable EGF-like domain (α and β forms); the variable sequences between the EGF-like domain and the transmembrane domain; the conserved transmembrane domain; and a portion of the cytoplasmic tail. The sense primer was 5'-AGG AAA TGA CAG TGC CTC T-3' [SEQ ID NO: 78]. The antisense primer was 5'-TCT CTG GCA TGC CTG AGG-3' [SEQ ID NO: 79]. PCR was carried out for 40 cycles. The reaction conditions were: 94°C, 1 minute; 50°C, 2 minutes; 72°C, 3 minutes. The PCR amplified DNA fragments were subcloned into the pSPORT plasmid vector (GIBCO BRL) and the DNA sequences were determined. The results are listed in Table 3. In addition to the previously identified human NDF-α2, -β1 and -β3 sequences, DNA encoding an additional NDF isoform (β2) was identified. The sequence of proNDF-β2 (clone 294-8) [SEQ ID NO: 15] is shown in Figure 36.

Table 3. Frequency of Alternative NDF Sequences in NDF cDNA Amplified from Five Different Human Tumor Cell Lines

Human NDF Isoforms

α2 β1 β2 β3

A-704 19 1 3

MDA-MB-231 5

Hs294T 16 2 2

HT1080 2 1

5637 1

The human proNDF cDNA clones are summarized in Table 4, and the composite amino acid sequence for the different human proNDF isoforms is shown in Figure 38.

TABLE 4. Human proNDF cDNA Clones

cDNA Clone    Size (kb)    proNDF Isoform    Figure Number    ATCC Number
P1*           1.1          NDF-α1a           30               69305
43*           1.8          NDF-α2b           32               69307
17            1.6          NDF-α2b           33               -
19*           0.9          NDF-α3            34               69308
P13*          1.8          NDF-β1a           35               69302
294-8*        0.8          NDF-β2            36               69304
33*           0.9          NDF-β3            37               69306

*Deposited with American Type Culture Collection, Rockville, MD.

EXAMPLE 11

Expression of Recombinant NDFs in Mammalian Cells

I. Construction of pDSR-α2 expression plasmids

a. Rat NDF-α2c expression vector

Rat cDNA clone 44 (proNDF-α2c, Figure 5) was subjected to PCR using two primers. The 37-base oligonucleotide sense primer [SEQ ID NO: 80] had the sequence: 5'-CGG TCT AGA AGC TTC CAC CAT GTC TGA GCG CAA AGA A-3'. It included 5' XbaI and HindIII restriction enzyme cloning sites followed by the Kozak consensus sequence CCACC, as well as the initial 18 bases of the rat NDF coding sequences. The 30-base oligonucleotide antisense primer [SEQ ID NO: 81] had the sequence: 5'-GCC GTC GAC CTA TTA CCT TTC GCT ATG AGG-3'. It included a SalI site, two tandem translation stop codons, and 15 bases complementary to the sequences encoding the carboxyl end of rat proNDF-α2c. The PCR product was digested with XbaI and SalI to generate a ~1.4 kb DNA fragment containing sequences encoding the entire rat proNDF-α2c. This fragment was then subcloned into the expression vector pDSRα2 (published European patent application A20398753) that had been cut with XbaI and SalI. The final plasmid was designated as pDSRα2/rNDF-α2c.

b. Rat NDF-β4 expression vector
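The anatomy of the two primers given for the rat NDF-α2c construct (SEQ ID NOS: 80 and 81), the first of which is reused for the rat NDF-β4 construct, can be checked mechanically. The following is a minimal sketch, not part of the original procedure: the primer strings are copied from the text, the restriction site recognition sequences (XbaI TCTAGA, HindIII AAGCTT, SalI GTCGAC) are standard, and all function names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Primer sequences as given in the text, with spaces removed.
SENSE = "CGG TCT AGA AGC TTC CAC CAT GTC TGA GCG CAA AGA A".replace(" ", "")  # SEQ ID NO: 80
ANTISENSE = "GCC GTC GAC CTA TTA CCT TTC GCT ATG AGG".replace(" ", "")        # SEQ ID NO: 81

SITES = {"XbaI": "TCTAGA", "HindIII": "AAGCTT", "SalI": "GTCGAC"}
KOZAK = "CCACC"


def describe_sense(primer: str) -> dict:
    """Locate the cloning sites and return the coding bases that follow the Kozak sequence."""
    kozak_start = primer.index(KOZAK)
    return {
        "length": len(primer),
        "XbaI_at": primer.find(SITES["XbaI"]),
        "HindIII_at": primer.find(SITES["HindIII"]),
        "coding": primer[kozak_start + len(KOZAK):],
    }


def describe_antisense(primer: str) -> dict:
    """Read the six bases after the SalI site as two stop codons on the coding strand."""
    after_site = primer.index(SITES["SalI"]) + len(SITES["SalI"])
    stops = reverse_complement(primer[after_site:after_site + 6])
    return {
        "length": len(primer),
        "stop_codons": [stops[:3], stops[3:]],
        "gene_specific_bases": len(primer) - (after_site + 6),
    }


print(describe_sense(SENSE))
print(describe_antisense(ANTISENSE))
```

Under these assumptions the sense primer yields 18 coding bases beginning with ATG, and the antisense primer yields two tandem stop codons followed by 15 gene-specific bases, which agrees with the description in the text.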

Rat clone 42A cDNA (Figure 20) encodes proNDF-β4a, with a unique XbaI site at amino acid positions 244-245. To express recombinant rat NDF-β4, two pairs of PCR primers were used to amplify different portions of the proNDF-β4a coding region. The first pair included the same 37-base sense primer described in Example 11 [SEQ ID NO: 80] for expression of rat NDF-α2c. This primer contained XbaI and HindIII sites, the Kozak consensus sequence CCACC, and the 18 bases encoding amino-terminal rat proNDF-β4a sequences. The 27-base antisense primer had the following sequence: 5'-GCT CTA GAG GCT TCT CTG TTT CTT GCC-3' [SEQ ID NO: 82] and contained an XbaI site and 19 bases of the upstream sequences adjacent to the XbaI site mentioned above. The rat proNDF-β4a (clone 42A) cDNA was subjected to PCR using this pair of primers. The PCR product was digested with HindIII and XbaI to generate an approximately 730 bp DNA fragment that encoded amino acids 1-244. The second pair of PCR primers included a 33-base sense primer with the following sequence: 5'-GCT CTA GAA AGA AAA TTG GAT CAT AGC CTT GTG-3' [SEQ ID NO: 83], containing an XbaI site and 25 bases of the adjacent sequences downstream from the XbaI site in NDF. The 33-base antisense primer had the sequence 5'-GCC GTC GAC CTA TTA TAC AGC AAT AGG GTC TTG-3' [SEQ ID NO: 84], and contained a SalI site, two tandem translation stop codons and 18 bases complementary to sequences encoding the proNDF-β4a carboxyl terminus. The rat proNDF-β4a cDNA was subjected to PCR using this second pair of primers. The PCR product was digested with XbaI and SalI to generate an approximately 1.3 kb DNA fragment which encoded amino acids 245-662. The approximately 730 bp HindIII-XbaI fragment and this approximately 1.3 kb XbaI-SalI fragment were ligated with the expression vector pDSRα2 which was digested with HindIII and SalI. The resulting plasmid was designated pDSRα2/rNDF-β4a.

c. Human NDF-α2b expression vector

The initial plasmid made to express recombinant human NDF-α2b was constructed following the strategy described above for expression of rat NDF-α2c. Human NDF-α2b clone 43 cDNA (Figure 32) was used as the template DNA for PCR. The two primers used had the following sequences: sense primer (37-mer): 5'-CGG TCT AGA AGC TTC CAC CAT GTC CGA GCG CAA AGA A-3' [SEQ ID NO: 85]; antisense primer (30-mer): 5'-GCC GTC GAC CTA TTA GAG AAT GAA GCC CAA-3' [SEQ ID NO: 86]. The ~1.4 kb PCR product, after cutting with XbaI and SalI, was subcloned into expression vector pDSRα2. This plasmid was designated as pDSRα2/hNDF-α2b.

d. Chimeric human-rat NDF-α2c expression vector

Using methods described in Example 12 below, it was observed that transfected cell clones expressing recombinant rat proNDF-α2c consistently secreted higher levels of NDF protein than the transfected cell clones expressing recombinant human proNDF-α2b. Moreover, some of the human proNDF-α2b expressing cell clones had NDF mRNA levels comparable to that of rat NDF-α2c clones, but produced much lower levels of NDF protein. Because the cytoplasmic portion of the rat proNDF-α2c amino acid sequence appeared to augment recombinant NDF production (for example, by affecting proNDF stability or enhancing the proteolytic cleavage rate), generating more of the mature active recombinant NDF than the "b" isoform of human proNDF, a chimeric gene was constructed. The 197-amino acid cytoplasmic domain of human proNDF-α2b was substituted with the 157-amino acid cytoplasmic domain of rat proNDF-α2c. Figure 39 illustrates the structure of such a chimeric proNDF plasmid. When the human NDF-α2b DNA was engineered in this way, many clones were isolated which produced increased amounts of biologically active recombinant human NDF-α2.

The chimeric human-rat proNDF-α2c expression vector was constructed by the following method. Human NDF-α2b (clone 43) cDNA (Figure 32) was subjected to PCR using two PCR primers. The sense primer (39-mer) had the sequence: 5'-GCC GAA GAC GGT CAT GAA GCT TCT GCC GCT GTT TCT TGG-3' [SEQ ID NO: 87], and included a unique internal XhoI site; the antisense primer (33-mer) had the sequence: 5'-CCT TTC AAA CCC CTC GAG ATA CTT GTG CAA GTG-3' [SEQ ID NO: 88] and included several base changes to create a HindIII site. The resulting PCR product of approximately two hundred and ten base pairs, encoding proNDF-α2b amino acids 206-274 and including the transmembrane domain (amino acids 243-265), was digested with XhoI and HindIII and subcloned into plasmid pGEM7Zf(-) (Promega) between the XhoI and HindIII cloning sites. This DNA fragment of approximately two hundred and ten base pairs was then ligated to two other DNA fragments, one of them being a ~5.1 kb NotI-XhoI fragment retrieved from the plasmid pDSRα2/hNDF-α2b, the other a ~3.6 kb HindIII-NotI fragment derived from the plasmid pDSRα2/rNDF-α2c. The resulting plasmid was a chimeric molecule encoding the human proNDF-α2b extracellular domain, the conserved transmembrane domain (TM), and the rat proNDF-α2c cytoplasmic domain. This plasmid was designated as pDSRα2/h-rNDF-α2c.

e. Chimeric human-rat NDF-α1c, β1c and β2c

The strategy used to express recombinant human NDF-α2 from a chimeric rat-human DNA construct was also used for expressing recombinant human NDF-α1, -β1 and -β2 isoforms. Briefly, cDNAs individually encoding human proNDF-α1, -β1 and -β2 were subjected to PCR using the same primers described above for generating the chimeric human-rat NDF-α2c DNA. The ~210 bp XhoI-HindIII PCR fragment generated from human NDF cDNA clone P1 as the template represents sequences from human NDF-α1; the human NDF cDNA clone P13 template gave the human NDF-β1 sequence; and the human NDF cDNA clone 294-8 template yielded the human NDF-β2 sequence. The XhoI-HindIII fragments were first subcloned into pGEM7Zf(-). They were then ligated to the 5.1 kb NotI-XhoI fragment (from pDSRα2/hNDF-α2b) and the 3.6 kb HindIII-NotI fragment (from pDSRα2/rNDF-α2c) as described above. The resulting expression plasmids were designated as pDSRα2/h-rNDF-α1c, pDSRα2/h-rNDF-β1c and pDSRα2/h-rNDF-β2c, respectively. These DNAs were prepared for transfection to express the recombinant human NDF-α1, -β1 and -β2 isoforms in COS-7 and CHO cells.

II. Transient expression of human and rat NDFs in COS-7 cells

The expression plasmids constructed and described above were tested in a transient expression system to assess their ability to produce biologically active recombinant NDFs. COS-7 cells (ATCC CRL 1651) at 2.5 × 10⁶ cells/ml were electroporated with 12.5 μg/ml DNA from (1) pDSRα2/hNDF-α2b, (2) pDSRα2/h-rNDF-α2c, (3) pDSRα2/h-rNDF-β1c, (4) pDSRα2/h-rNDF-β2c, (5) pDSRα2/h-rNDF-α1c, and (6) pDSRα2/rNDF-α2c, individually at 1600 volts for 0.7 milliseconds.

Transfected COS-7 cells were plated at 2 × 10⁶ cells per 60 mm plate in DMEM with 10% FBS. Conditioned medium (CM) from twenty-four to seventy-two hours post-transfection was collected from the plates and tested in the neu receptor tyrosine phosphorylation assay using MDA-MB-453 carcinoma cells as described in Example 3. As shown in Figure 40, all of the CM samples tested at two concentrations (100 μl unconcentrated CM, or 100 μl of 10-fold concentrated CM) were active in the assay. Conditioned medium from untransfected COS-7 cells was negative (Fig. 40, lane 1).

EXAMPLE 12

Expression of Human and Rat NDFs in CHO Cells

I. Transfection

A Chinese hamster ovary cell line (CHO D-) deficient in dihydrofolate reductase, initially described by Urlaub and Chasin (Urlaub, G. and Chasin, L.A., Proc. Natl. Acad. Sci. USA 77:4216-4220, 1980), was used. The CHO D- cell line requires proline for growth and is routinely grown in high glucose DMEM, supplemented with non-essential amino acids (NEAA), 16 μM thymidine, 100 μM sodium hypoxanthine, 100 U/ml penicillin, 100 μg/ml streptomycin, and 5% fetal bovine serum (FBS). Plasmid DNAs were introduced into the CHO D- cells by calcium phosphate precipitation. Approximately 0.8 × 10⁶ cells in 60 mm plates were exposed to 2 to 3 μg of plasmid DNA together with sufficient mouse spleen DNA to total 10 μg. The medium was changed the next day. After incubation for another twenty-four hours, the cells were split into eight 100-mm plates containing selective medium (high glucose DMEM with NEAA and 5% dialyzed FBS but without nucleosides). After ten to fourteen days, clones were picked and serum-free conditioned media from individual clones were assayed by the neu receptor tyrosine phosphorylation assay of Example 3, or analyzed by SDS/polyacrylamide gel electrophoresis followed by Western blot detection using rabbit antisera specific for human or rat NDF (rabbits were immunized with purified recombinant NDF-α2_14-241 expressed in E. coli; see Examples 8 and 14). Both procedures were used to select the highest NDF-producing clones for generating recombinant NDF-containing conditioned media for subsequent purifications as described below.

II. NDF/CHO cell growth and production of conditioned medium for purification

To generate conditioned medium from recombinant CHO cells producing NDFs, selected recombinant clones were grown in a spinner bottle in a 1:1 mixture of high glucose DMEM and F12 with NEAA, 5% FBS and 2 mM glutamine. Cells were then transferred to 850 cm² roller bottles containing the same medium with 2 × 10⁷ cells per bottle. After three to four days at 37°C, the cell monolayers were washed with PBS and re-fed with 150 to 200 ml fresh medium per bottle (as above but lacking FBS). Conditioned medium was harvested six days later. Second and third harvests of conditioned media were also produced from the cell monolayer. Conditioned media were harvested and centrifuged at 7,000 × g for twenty minutes at 4°C or filtered using 0.45 μm cellulose acetate filters and stored frozen at -80°C until purification.

III. Purification of recombinant mammalian NDF produced in CHO cells

Recombinant rat NDF-α2 expressed by CHO cells transfected with pDSRα2/rNDF-α2c (Example 11) was purified by the following procedure. Pooled serum-free conditioned media from harvests of roller bottles containing the transfected CHO cells expressing recombinant rat NDF-α2c were cleared by filtration through 0.2 μm filters and concentrated with a Pellicon diafiltration system with a 10-kDa molecular size cut-off membrane. Concentrated material was directly loaded onto a column of heparin-Sepharose, pre-equilibrated with 20 mM sodium phosphate buffer (pH 7.2 ± 0.1) containing 25 mM NaCl. The column was then washed with the same buffer containing 0.25 M NaCl until absorbance at 280 nm fell below 0.05. Bound proteins were eluted with a continuous gradient of NaCl from 0.25 M to 1 M. Recombinant NDF in the collected fractions was detected with an NDF-specific antibody (Example 8) and by the neu receptor phosphorylation assay. Active fractions were pooled and dialyzed against 25 mM sodium phosphate buffer containing 20 mM NaCl. Insoluble particulates present in the sample were cleared by centrifugation. The purification yield was about 70% at this point.
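To relate an eluted fraction to the salt concentration at which it left the heparin-Sepharose column, the gradient can be treated as a linear interpolation between its end points. This is a sketch under an assumption: the text says only that the gradient was "continuous", so linearity is assumed here, and the position in the gradient is expressed as a fraction because no gradient volume is given for this column. The function name is illustrative.

```python
def nacl_molar_at(fraction_of_gradient: float,
                  start_m: float = 0.25,
                  end_m: float = 1.0) -> float:
    """NaCl molarity at a given fractional position of an (assumed) linear gradient.

    The defaults are the start and end concentrations stated for the
    heparin-Sepharose elution; linearity is an assumption.
    """
    if not 0.0 <= fraction_of_gradient <= 1.0:
        raise ValueError("fraction_of_gradient must be between 0 and 1")
    return start_m + (end_m - start_m) * fraction_of_gradient


for percent in (0, 25, 50, 75, 100):
    print(f"{percent:3d}% of gradient: {nacl_molar_at(percent / 100):.3f} M NaCl")
```

The same function applies to any other linear salt gradient by passing different `start_m` and `end_m` values.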

The dialyzed, clarified sample was then loaded onto a DEAE Sepharose 6B fast flow column which had been pre-equilibrated with dialysis buffer described above. The column was extensively washed until no absorbance at 280 nm could be detected. The column was then developed with a 200 ml gradient of 0.02 M to 0.5 M NaCl in 20 mM sodium phosphate buffer, pH 7.2 ± 0.1. Fractions were collected and assayed for NDF content. Ammonium sulfate was added to the pooled NDF fraction obtained from DEAE Sepharose chromatography to achieve a concentration of 2 M. The material was loaded on a Phenyl-Sepharose 4B column (Pharmacia) pre-equilibrated with 20 mM sodium phosphate, pH 7.2 ± 0.1, containing 2 M ammonium sulfate. After loading, the column was washed with starting buffer and developed with a gradient of ammonium sulfate (from 2 M to no salt) in 20 mM sodium phosphate, pH 7.1 ± 0.1. The main peak of activity was pooled and extensively dialyzed against PBS buffer and concentrated by centrifugation using a Centriprep 10 cartridge.

IV. Characterization of the purified recombinant rat NDF-α2 from CHO cells

As shown in Figure 41, in SDS-PAGE under reducing conditions, the purified recombinant rat NDF-α2 isoform migrates at a molecular weight of about forty to forty-four kilodaltons, identical to that of the purified naturally occurring rat NDF isolated from Rat-1-EJ cells (described by Peles et al., Cell, 1992, above).

A 5 μg sample was subjected to automated N-terminal sequence analysis. A single, major amino acid sequence was the only signal detected during analysis. The N-terminal sequence was determined to be:

NH₂-Lys-Glu-Gly-Arg-Gly-Lys-Gly-Lys-Gly-Lys-Lys-Lys-Asp-Arg-Gly-Ser-Arg-Gly-... [SEQ ID NO: 89]

This sequence had nine extra amino acids preceding the amino terminal sequence observed for rat NDF from Rat-1-EJ cells (Peles et al., Cell, 1992, above). This result indicates that the CHO cell-derived rat NDF-α2 and the Rat-1-EJ cell-derived NDF are proteolytically processed at different N-terminal sites.

The purified recombinant rat NDF polypeptide of this Example was found to stimulate tyrosine phosphorylation of the neu receptor protein in the activity assay shown in Figure 42; it was active at relatively low concentrations and the stimulatory effect was concentration dependent.

EXAMPLE 13

EXPRESSION OF RECOMBINANT NDF ISOFORMS IN E. COLI

I. Construction of expression plasmids

Expression plasmids, pCFM1656 and pCFM3106, used in the following examples can be derived from plasmid pCFM836 (a detailed description of pCFM836 is contained in U.S. Patent No. 4,710,473, incorporated herein by reference). Plasmid pCFM1656 can be derived from pCFM836 plasmid by destroying the two endogenous NdeI restriction sites by end filling with T4 polymerase enzyme followed by blunt end ligation, by replacing the DNA sequence between the unique AatII and ClaI restriction sites containing the synthetic P_L promoter with a similar fragment obtained from pCFM636 containing the P_L promoter, which has the sequence:

P_L promoter DNA sequence (AatII end at left, ClaI end at right)

5'-CTAATTCCGCTCTCACCTACCAAACAATGCCCCCCTGCAAAAAATAAATTCATAT-
  -AAAAAACATACAGATAACCATCTGCGGTGATAAATTATCTCTGGCGGTGTTGACATAAA-
  -TACCACTGGCGGTGATACTGAGCACAT-3' [SEQ ID NO: 90]

3'-TGCAGATTAAGGCGAGAGTGGATGGTTTGTTACGGGGGGACGTTTTTTATTTAAGTATA-
  -TTTTTTGTATGTCTATTGGTAGACGCCACTATTTAATAGAGACCGCCACAACTGTATTT-
  -ATGGTGACCGCCACTATGACTCGTGTAGC-5' [SEQ ID NO: 91]

and by substituting the small DNA sequence between the unique ClaI and KpnI restriction sites with the following oligonucleotide:

ClaI                                                     KpnI
5'-CGATTTGATTCTAGAAGGAGGAATAACATATGGTTAACGCGTTGGAATTCGGTAC-3' [SEQ ID NO: 92]
3'-  TAAACTAAGATCTTCCTCCTTATTGTATACCAATTGCGCAACCTTAAGC-5'     [SEQ ID NO: 93]
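The two strands of the ClaI-KpnI oligonucleotide (SEQ ID NOS: 92 and 93) should pair with each other and leave single-stranded ends suitable for ligation into the cut vector. A minimal sketch of that check follows; the sequences are copied from the text, and the function and variable names are illustrative.

```python
TOP = "CGATTTGATTCTAGAAGGAGGAATAACATATGGTTAACGCGTTGGAATTCGGTAC"  # SEQ ID NO: 92, written 5'->3'
BOTTOM_3_TO_5 = "TAAACTAAGATCTTCCTCCTTATTGTATACCAATTGCGCAACCTTAAGC"  # SEQ ID NO: 93, written 3'->5'

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_offset(top: str, bottom_3_to_5: str) -> int:
    """Return the position in `top` where the bottom strand pairs perfectly, or -1 if none."""
    # Complementing (without reversing) the top strand gives the ideal
    # bottom strand in 3'->5' orientation, so a substring search suffices.
    return top.translate(COMPLEMENT).find(bottom_3_to_5)


offset = duplex_offset(TOP, BOTTOM_3_TO_5)
left_overhang = TOP[:offset]                         # unpaired 5' end of the top strand
right_overhang = TOP[offset + len(BOTTOM_3_TO_5):]   # unpaired 3' end of the top strand

print("paired from top-strand position", offset)
print("5' overhang:", left_overhang)
print("3' overhang:", right_overhang)
print("ATG start codon inside CATATG at position", TOP.find("CATATG") + 3)
```

With the sequences as printed, the strands pair perfectly and leave a two-base 5' overhang at the ClaI end and a four-base 3' overhang at the KpnI end.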

Plasmid pCFM3106 can be derived from pCFM1656 by making a series of site-directed base changes by PCR overlapping oligonucleotide mutagenesis. These changes can be described by an arbitrary plasmid base pair position reference to a BglII restriction site (plasmid bp # 180) immediately 5' to the plasmid replication promoter P_copB and proceeding toward the plasmid replication genes. The base pair changes are as follows:

plasmid bp #     bp in pCFM1656    bp changed to in pCFM3106
# 428            A/T               G/C
# 509            G/C               A/T
# 617            - -               insert two G/C bp
# 978            T/A               C/G
# 992            G/C               A/T
# 1002           A/T               C/G
# 1005           C/G               T/A
# 1026           A/T               T/A
# 1045           C/G               T/A
# 1176           G/C               T/A
# 1464           G/C               T/A
# 2026           G/C               bp deletion
# 2186           C/G               T/A
# 2479           A/T               T/A
# 2498-2501      AGTG              GTCA
                 TCAC              CAGT
# 2641-2647      TCCGAGC           bps deleted
                 AGGCTCG
# 3441           G/C               A/T
# 3649           A/T               T/A
# 4556           -                 insert bps
                                   5'-GAGCTCACTAGTGTCGACCTGCAG-3' [SEQ ID NO: 94]
                                   3'-CTCGAGTGATCACAGCTGGACGTC-5' [SEQ ID NO: 95]

All recombinant human and rat NDF E. coli expression plasmids were constructed using DNA obtained from PCR amplification of NDF cDNA plasmid clones according to standard procedures. The source of these plasmid templates, the oligonucleotide primers, the synthetic DNA fragments, and the resulting constructs are listed in Table 5. Table 6 shows the PCR primer sequences used to amplify NDF coding sequences for insertion into pCFM1656 and pCFM3106.


a. Recombinant human met-NDF-α2_14-241

For expression of human NDF isoforms in E. coli, a DNA fragment encoding NDF-α2_14-241 was first constructed. This fragment contained the native cDNA sequence corresponding to amino acid positions 14-241 of human NDF-α2. The oligonucleotide primers 409-27 [SEQ ID NO: 97] and 409-28 [SEQ ID NO: 98] were used to amplify the NDF coding sequence from human NDF-α2 cDNA plasmid clone 43 (Figure 32). The 5' forward (sense) primer 409-27 provided an XbaI site, a ribosome-binding sequence, and a methionine start codon followed by the NDF-α2 gene sequence starting from codon 14. The 3' backward (antisense) primer 409-28 contained a BamHI site and the complementary sequence of a TAA stop codon and the NDF-α2 3' coding sequence ending at codon 241. The PCR fragment amplified from cDNA clone 43 was ligated into the XbaI and BamHI sites of plasmid pCFM3106 (described in Section I above), so that transcription of the human met-NDF-α2_14-241 gene was under the control of a P_L promoter and ended at the transcriptional terminator provided by the plasmid vector. The resulting construct was transformed into E. coli strain FM5 (ATCC #53911) which is a derivative of strain K-12 with a temperature sensitive λ repressor gene, CI857 (Sussman et al., C. R. Acad. Sci. 254:1517-1519, 1962) integrated into the chromosome. The transformant was designated Strain 1664.

The met-huNDF-α2 expression level of Strain 1664 was subsequently improved by optimizing the first 12 codons of the protein coding sequence. A fragment with NdeI and SacII cohesive ends was generated by annealing two chemically synthesized oligonucleotides (436-23 and 436-24) [SEQ ID NOS: 100 and 101, respectively] with complementary sequences. This synthesized fragment was then used to replace the NdeI-SacII fragment of plasmid pCFM3106-met-huNDF-α2_14-241. The coding strand sequence was therefore changed from AAG AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG [SEQ ID NO: 115] to AAA AAA AAA GAA CGT GGT TCT GGT AAA AAA CCG GAA [SEQ ID NO: 116]. The entire NDF-α2 gene was subcloned into another plasmid expression vector pCFM1656. When transformed into E. coli strain FM5, this new plasmid with the N-terminal codon-optimized met-huNDF-α2_14-241 gene gave higher expression of NDF-α2_14-241, and was used to construct expression plasmids for additional human NDF isoforms. The FM5 strain containing this plasmid was designated Strain 1679.

b. Recombinant human NDF-α2_1-241
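Both coding-strand changes made in this Example (SEQ ID NO: 115 to 116 for the 14-241 construct, and SEQ ID NO: 117 to 118 for the 1-241 construct) are presented as codon changes, so the encoded amino acids should be unchanged. A minimal sketch of that check, using the standard genetic code and sequences copied from the text, is shown here; the helper names are illustrative and not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(dna: str) -> list:
    """Split a spaced or unspaced coding sequence into complete codons."""
    dna = dna.replace(" ", "").upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna: str) -> str:
    """Translate a coding-strand sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def changed_codons(before: str, after: str) -> int:
    """Count codon positions that differ between two sequences of equal length."""
    return sum(a != b for a, b in zip(codons(before), codons(after)))


PAIRS = {
    "SEQ ID NO: 115 -> 116": ("AAG AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG",
                              "AAA AAA AAA GAA CGT GGT TCT GGT AAA AAA CCG GAA"),
    "SEQ ID NO: 117 -> 118": ("TCC GAG CGC AAA GAA GGC AGA GGC",
                              "TCT GAA CGT AAA GAA GGT CGT GGT"),
}

for label, (native, recoded) in PAIRS.items():
    same = translate(native) == translate(recoded)
    print(f"{label}: {translate(native)} | synonymous: {same} | "
          f"codons changed: {changed_codons(native, recoded)}")
```

A check of this kind only confirms that the protein sequence is preserved; whether a given recoding actually improves expression is an experimental question, as the strain comparisons in the text show.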

Because the native human NDF cDNA coding sequence starts at codon 1 (methionine), and not at codon 14 (lysine), fragment NDF-α2_1-241 was PCR synthesized from human cDNA clone 43. The overlapping forward primers 507-8 [SEQ ID NO: 110] and 507-9 [SEQ ID NO: 111], which introduce a few codon changes, were used with reverse primer 507-10 [SEQ ID NO: 112]. The resulting PCR fragment was cloned into pCFM1656 using the same strategy described above. The coding strand sequence was therefore changed from TCC GAG CGC AAA GAA GGC AGA GGC [SEQ ID NO: 117] to TCT GAA CGT AAA GAA GGT CGT GGT [SEQ ID NO: 118] in the region of codon 2 to codon 9. The FM5 strain with plasmid pCFM1656-huNDF-α2_1-241 containing the extended NDF sequence was designated Strain 1784.

c. Recombinant human met-NDF-α1_14-249

The NDF cDNAs isolated from cDNA libraries that encode different human NDF isoforms differ in the sequences 3' to a unique XhoI site in the EGF-like coding region, which makes the XhoI site a convenient cloning site for the construction of different expression plasmids for the human NDF isoforms.

Oligonucleotide 502-10 [SEQ ID NO: 106] containing sequences 5' to the XhoI site was used as the 5' primer, and oligonucleotide 502-11 [SEQ ID NO: 107], which provided the BamHI site and TAA stop codon sequence, was the 3' primer. The resulting PCR fragment amplified from human NDF-α1 cDNA clone P1 (Figure 30) was used to replace the XhoI-BamHI fragment in plasmid pCFM1656-met-huNDF-α2_14-241, and thus generated the NDF-α1 expression plasmid pCFM1656-met-huNDF-α1_14-249. Strain 1854 was the designation assigned to the FM5 strain containing this plasmid.

d. Recombinant human met-NDF-β1_14-246

Because human isoforms NDF-α1 and NDF-β1 share a common carboxy-terminal amino acid sequence, the primers 502-10 and 502-11 [SEQ ID NOS: 106 and 107, respectively] were also used for PCR synthesis of a NDF-β1 gene. Using the same cloning strategy used for the NDF-α1 gene, the XhoI-BamHI fragment in pCFM1656-met-huNDF-α2_14-241 was replaced by the PCR fragment amplified from pCFM1656-met-huNDF-β1_177-246 (described below in Section e). The FM5 strain harboring the resulting construct pCFM1656-met-huNDF-β1_14-246 was designated Strain 1789.

e. Recombinant human met-NDF-β1_177-246

Expression of smaller recombinant NDF proteins containing the EGF-like domains of isoforms α2 and β1 was also performed in E. coli FM5. A DNA fragment containing the coding sequence for NDF-β1_177-246 and necessary cloning and expression elements (cloning sites, ribosome-binding sequence, methionine start codon, and TAA stop codon) was PCR synthesized from the EGF-like region of huNDF-β1 cDNA clone 294-1 with primers 494-26 [SEQ ID NO: 105] and 487-15 [SEQ ID NO: 104], and cloned into the XbaI and BamHI sites of pCFM1656. Clone 294-1 is a partial cDNA clone for NDF-β1 which encodes amino acids 119 to 410 (Figure 38). It was isolated by PCR as described in Example 10 (Section IV). The FM5 strain bearing pCFM1656-met-huNDF-β1_177-246 was designated Strain 1769.

f. Recombinant human NDF-α2_177-241

The α2 isoform counterpart of pCFM1656-met-huNDF-α2_177-241 was constructed similarly with PCR primers 437-29 [SEQ ID NO: 102] and 443-1 [SEQ ID NO: 103] from human cDNA clone 43. However, the expression level of the resulting strain 1697 with pCFM1656-met-huNDF-α2_177-241 was unsatisfactory. Therefore, secretion was considered as an alternative to produce NDF-α2_177-241. An ompA secretion signal peptide (A) was fused to the cDNA sequence encoding human NDF-α2 amino acids 177-241.

Oligonucleotide 502-24 [SEQ ID NO: 108] containing an XbaI cloning site, a ribosome-binding sequence, a methionine start codon, degenerate ompA codons, and part of the NDF 5' coding sequences beginning at codon 177 was used as the sense primer together with antisense primer 502-25 [SEQ ID NO: 109] containing sequence around the XhoI site. Using these two primers, fragments containing the ompA signal peptide codons and the NDF-α2 coding sequence were PCR synthesized from pCFM1656-met-huNDF-α2_177-241. The fragments were used to substitute the XbaI-XhoI fragment of pCFM1656-met-huNDF-α2_177-241. One such clone that showed good expression and signal peptide processing has the ompA signal sequence A1 (5'-ATG AAG AAG ACC GCG ATT GCA ATT GCG GTA GCG CTG GCG GGT TTT GCG ACC GTT GCG CAG GCG-3') [SEQ ID NO: 119]. The FM5 secretion strain harboring pCFM1656-A1-huNDF-α2_177-241 was designated Strain 1776. The expression level of secretion strain 1776 was greatly improved. However, the ratio of the processed huNDF-α2_177-241 vs. non-processed preprotein was acceptable for cells grown in shake flasks, but not in fermentors.

g. Recombinant human met-NDF-α3_14-247

The nucleotide sequence of human NDF-α3 differs from human NDF-α2 only in the region downstream from the unique XhoI site in the EGF-like domain. The α3 segment was PCR amplified with primers 580-31 and 583-16 [SEQ ID NOS: 113 and 114, respectively] from cDNA clone 19. The resulting fragment was then used to replace the XhoI-BamHI fragment of pCFM1656-met-huNDF-α2_14-241 in which the N-terminal codons of NDF have been optimized. The new construct, pCFM1656-met-huNDF-α3_14-247, was transformed into FM5, and thus generated Strain 1910.

h. Recombinant rat met-NDF-α2_14-241

A cDNA fragment encoding rat NDF-α2_14-241 was PCR amplified from rat cDNA plasmid clone 44 with primers 260-19 [SEQ ID NO: 96] and 425-26 [SEQ ID NO: 99] and cloned into the XbaI and XhoI sites in pCFM1656. The primer design and cloning strategy are similar to the ones used for the cloning of the met-huNDF-α2_14-241 sequence described. The FM5 strain containing the resulting expression plasmid pCFM1656-met-ratNDF-α2_14-241 was designated as Strain 1685.

II. Protein expression

a. Fermentor

E. coli cultures (for all direct expression strains but not the secretion Strain 1776) were

inoculated into a 2 liter Fernbach flask which contained 500 ml Luria broth plus kanamycin at 50 μg/ml. The inoculated flask was then aerated by shaking at 30°C for ten to sixteen hours at 250 rpm. The cells in the flask were aseptically transferred into a 10 liter LSL

Biolafitte fermentor which contained eight liters of batch medium (80 g of yeast extract, 42 g of ammonium sulfate, 28 g of dibasic potassium phosphate, 32 g of monobasic potassium phosphate, 5 g of sodium chloride, 40 g of glucose, 32 ml of 1 M magnesium sulfate, 5 ml of antifoam Dow P2000, 16 ml of trace metal solution and 16 ml of vitamin solution). The culture was allowed to grow to an optical density of 4-5 at 600 nm before starting phase one feeding. The phase one feed medium contained 100 g of Bacto tryptone, 100 g of yeast extract, 900 g of glucose, 70 ml of 1 M magnesium sulfate, 20 ml of trace metal solution and 20 ml of vitamin solution in a total volume of two liters. The phase one feed rate was initiated at 13 ml/hr and then adjusted every two hours according to the cell mass determined by optical density. The medium fed into the fermentor allowed the E. coli cells to grow at an exponential rate under a glucose-limited condition so that the amount of toxic by-products accumulating in the cell culture was minimal. The temperature during the entire growth phase was set at 30°C to ensure the complete suppression of transcription from the lambda P_L promoter and the tight control of plasmid amplification. The culture pH was kept at 7 with phosphoric acid and ammonium

hydroxide. The desired dissolved oxygen level was maintained by adjusting the agitation, air- and oxygen-input rates, and back pressure in the fermentor. As the optical density of the culture reached 30, the

temperature was raised to 42°C to start the production phase. The maximal protein expression level was

achieved at about six hours afterwards. During the production phase, the first feed medium was replaced with a second medium (800 g of Bacto tryptone, 400 g of yeast extract, and 440 g of glucose in a total volume of four liters) at a constant feed rate of 250-300 ml/hr to provide enough glucose as energy and amino acids as precursors for the synthesis of recombinant NDF. At the end of fermentation, the culture was chilled to 10-15°C. The broth was harvested and centrifuged, and the

resulting cell paste was sealed in plastic bags and stored at -70°C.

b. Shake flask

E. coli Strain 1776 was inoculated into a 250 ml flask which contained 100 ml Luria broth plus

kanamycin at 50 μg/ml. The inoculated flask was aerated by shaking at 30°C for 10-16 hours at 250 rpm. Twenty ml of the growth culture was then dispensed into each of ten two-liter Fernbach flasks containing 500 ml Luria broth. The culture continued to grow in the fresh medium at the same temperature and agitation rate. When the cell density reading at OD600 reached 0.6-1.0, the expression of the NDF product was induced by raising the temperature to 42°C. Four hours after induction, the cells were harvested by centrifugation at 4°C. The resulting cell paste was sealed in plastic bags and stored at -70°C.

EXAMPLE 14

Purification of E. coli Expressed Isoforms

I. Purification of recombinant rat met-NDF-α2_14-241

Recombinant rat met-NDF-α2_14-241 was purified from E. coli Strain 1685 by subjecting a clarified cell lysate to anion exchange, cation exchange, hydrophobic interaction, and hydroxyapatite column chromatography.

E. coli cell paste was disrupted in 5 mM EDTA, pH 8.5, by two passages through a Niro-Soavi Homogenizer at 14,000 psi. The cell lysate was adjusted to pH 8.5 and centrifuged for one hour at 4,100 rpm in a J6-B centrifuge (Beckman), using a JS-4.2 rotor. The

clarified supernatant was passed through Q-Sepharose Fast Flow (Pharmacia, Piscataway, NJ) equilibrated with 5 mM Tris-Cl, pH 8.5. The flow-through fraction was loaded onto a S-Sepharose Fast Flow (Pharmacia) column previously equilibrated with 1x Dulbecco's PBS (Gibco BRL, Gaithersburg, MD), adjusted to pH 8.5. The column was washed with PBS, pH 8.5. Bound protein was eluted with a gradient of 0 to 1 M NaCl in 1x PBS, pH 8.5.

Individual fractions were collected.

S-Sepharose fractions containing a prominent recombinant NDF protein of about 34 kDa were identified by SDS-PAGE analysis using a 10% gel (Novex, San Diego) followed by Coomassie blue staining. These fractions were pooled, diluted with an equal volume of water, and adjusted to 1.2 M ammonium sulfate, pH 6.5. The sample was then loaded onto a column containing TSKGEL Butyltoyopearl 650M (TosoHaas, Montgomeryville, PA). This column was previously equilibrated with 1.2 M ammonium sulfate in PBS, pH 6.5. After washing with

equilibration buffer, the column was eluted with a gradient of 1.2 to 0 M ammonium sulfate in 1x PBS, pH 6.5. Individual fractions were collected. Column fractions containing the recombinant NDF protein were identified by SDS-PAGE. The first NDF peak eluted contained fewer aggregates and more monomer. Fractions containing this first peak were pooled and dialyzed overnight at 4°C against two changes of 10 mM sodium phosphate, pH 8.5. The dialyzed sample was loaded onto a S-Sepharose Fast Flow column previously equilibrated with 10 mM sodium phosphate, pH 8.5. The column was washed with 10 mM sodium phosphate, pH 8.5, then eluted with a gradient of 0 to 1 M NaCl in 10 mM sodium

phosphate, pH 8.5. Based on the results of SDS-PAGE, fractions containing primarily the monomer NDF were pooled. This second S-Sepharose pool was dialyzed overnight at 4ºC against two changes of 10 mM sodium phosphate, pH 6.8.

A hydroxyapatite (IBF Biotechnics, Columbia, MD) column was used as a fourth chromatography step. The column was equilibrated with 10 mM sodium phosphate, pH 6.8. After loading the dialyzed S-Sepharose

fraction, the column was washed with 10 mM sodium phosphate, pH 6.8, and eluted with a gradient of 10-500 mM sodium phosphate, pH 6.8. Fractions that contained only a single band on SDS-PAGE corresponding to the monomeric recombinant NDF were pooled and diluted with H₂O to reduce conductivity to the level of 1x PBS.

After adjusting the pH to 8.5, the sample was loaded onto a S-Sepharose column previously equilibrated with 1x PBS, pH 8.5. The column was washed with 1x PBS, pH 8.5, and then eluted stepwise with 600mM NaCl in 1x PBS, pH 8.5. Column fractions containing the monomeric recombinant NDF protein were identified by SDS-PAGE, pooled, dialyzed overnight at 4ºC against two changes of 1x PBS, and passed through a 0.2 micron filter (Nalgene). This final product was designated

recombinant rat met-NDF-α2_14-241.

II. Characterization of recombinant rat met-NDF-α2_14-241

Protein concentrations for recombinant rat met-NDF-α2_14-241 were determined by ultraviolet absorption at 280 nm with a Beckman DU-650 spectrophotometer and an extinction coefficient of 0.54 mg^-1cm^-1ml. Reversed-phase HPLC analysis using a HP1090 HPLC with a Vydac C₄ column (Hewlett-Packard Co., Palo Alto, CA) showed that the purified and refolded NDF preparation had a single peak with a retention time of 47.5 minutes in the absence and presence of 6 M guanidineHCl. The addition of both guanidineHCl and dithiothreitol caused the NDF to become reduced and unfolded, observed by the shift in retention time to 50.2 minutes. N-terminal amino acid sequence analysis of the first twenty residues

(MKKKDRGSRGKPGPAEGDPS) [SEQ ID NO: 120] matched the sequence predicted from the DNA sequence and included the additional N-terminal methionine residue used to initiate translation in E. coli. The sequence analysis also showed that N-terminal processing occurred at amino acid residues 3 and 18, with ratios of 4% and 8%, respectively.
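
The A280 concentration determination described above can be sketched as a Beer-Lambert calculation. The sketch below is illustrative only: the 1 cm path length, the absorbance reading, and the dilution factor are assumptions not given in the text; only the extinction coefficient of 0.54 mg^-1cm^-1ml is taken from the description.

```python
def protein_conc_mg_per_ml(a280, ext_coeff, path_cm=1.0, dilution=1.0):
    """Beer-Lambert estimate: c = A / (epsilon * l), scaled by the dilution factor.

    ext_coeff is in ml per mg per cm, so the result is in mg/ml.
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff * path_cm) * dilution


EXT_COEFF_RAT_NDF = 0.54  # ml mg^-1 cm^-1, value stated in the text
example_a280 = 0.27       # hypothetical reading, chosen only for illustration

conc = protein_conc_mg_per_ml(example_a280, EXT_COEFF_RAT_NDF)
print(f"estimated concentration: {conc:.2f} mg/ml")
```

The same function applies to the other isoforms by substituting the extinction coefficient reported for each.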

III. Purification of recombinant human NDF-α2_1-241

Recombinant human NDF-α2_1-241 was purified from E. coli Strain 1784 by subjecting a clarified cell lysate to anion exchange, cation exchange, and

hydrophobic interaction column chromatography.

E. coli cell paste was disrupted in 5 mM EDTA, pH 7.5 by two passages through a Niro-Soavi Homogenizer at 14,000 psi. The cell lysate was centrifuged one hour at 4,100 rpm in a J6-B centrifuge (Beckman), using a JS-4.2 rotor. The clarified supernatant was adjusted to pH 8.5 and passed through Q-Sepharose Fast Flow

(Pharmacia, Piscataway, NJ) equilibrated with 5 mM

Tris-Cl, pH 8.5. The flow-through fraction from the Q-Sepharose column was loaded onto a S-Sepharose Fast Flow column. Column loading, equilibration and elution conditions were as described for recombinant human met-NDF-α2_14-241.

S-Sepharose fractions containing a prominent monomeric recombinant NDF protein were identified by SDS-PAGE analysis, pooled, and loaded onto a column containing TSKGEL Butyl-toyopearl 650M equilibrated with 1.2 M ammonium sulfate, pH 6.5. The column loading and elution conditions were the same as described for rat met-NDF-α2_14-241. Column fractions containing the monomeric recombinant NDF protein were identified by SDS-PAGE, pooled, and dialyzed against three changes of 1x PBS, pH 8.5 at 4ºC overnight.

The dialyzed sample was loaded onto a S-Sepharose Fast Flow column previously equilibrated with 1x PBS, pH 8.5. The column was washed with 1x PBS, pH 8.5, and then eluted with a gradient of 0 to 1 M NaCl in 1x PBS, pH 8.5. Column fractions primarily

containing the monomer NDF were identified by SDS-PAGE and pooled. The sample was dialyzed against two changes of 1x PBS at 4°C overnight and passed through a 0.2 micron filter (Nalgene). This final product was

designated recombinant human NDF-α2_1-241.

IV. Characterization of recombinant human NDF-α2_1-241

The protein concentration of recombinant human NDF-α2_1-241 was determined by ultraviolet absorption at 280 nm using a Beckman DU-650 spectrophotometer with an extinction coefficient of 0.48 mg^-1cm^-1ml. Reverse phase HPLC analysis showed that the purified and refolded NDF preparation had a single peak with a retention time of 45.8 minutes. Addition of

guanidineHCl and dithiothreitol caused the NDF to become reduced and unfolded, which was observed by a shift in retention time to 48.8 minutes. The N-terminal amino acid sequence analysis for the first twenty-five residues (MSERKEGRGKGKGKKKERGSGKKPE) [SEQ ID NO: 12] matched the putative sequence predicted from the DNA sequence.

V. Purification of recombinant human met-NDF-α2_14-241

Recombinant human met-NDF-α2_14-241 was purified from E. coli Strain 1664 by subjecting a clarified cell lysate to anion exchange, cation exchange, hydrophobic interaction, and hydroxyapatite column chromatography.

E. coli cell paste was disrupted in 5 mM EDTA, pH 8.5, by two passages through a Niro-Soavi

Homogenizer at 14,000 psi. The cell lysate was adjusted to pH 8.5 and centrifuged one hour at 4,100 rpm in a J6-B centrifuge (Beckman), using a JS-4.2 rotor. The clarified supernatant was then passed through

Q-Sepharose Fast Flow (Pharmacia, Piscataway, NJ) equilibrated with 5 mM Tris-Cl, pH 8.5. The flow-through fraction from the Q-Sepharose column was loaded onto a S-Sepharose Fast Flow (Pharmacia) column

previously equilibrated with 1x Dulbecco's PBS (Gibco BRL, Gaithersburg, MD) adjusted to pH 8.5. The column was washed with PBS, pH 8.5, and eluted with a gradient of 0 to 1 M NaCl in 1x PBS pH 8.5. Individual fractions were collected.

S-Sepharose fractions containing a prominent recombinant NDF protein of about 35 kDa were identified by SDS-PAGE analysis using a 10% gel (Novex, San Diego) followed by coomassie blue staining. These fractions were pooled, diluted with an equal volume of H₂O, and adjusted to 1.4 M ammonium sulfate, pH 6.5. The sample was then loaded onto a column containing TSKGEL Butyltoyopearl 650M, equilibrated with 1.4 M ammonium

sulfate, 20 mM citric acid, pH 6.5. The column was washed with equilibration buffer and eluted with a gradient of 1.4 to 0 M ammonium sulfate in 1x PBS, pH 6.5. Column fractions containing the monomeric NDF protein were identified by SDS-PAGE, pooled, and

dialyzed against 10 mM sodium phosphate, pH 6.8.

A hydroxyapatite (IBF Biotechnics, Columbia, MD) column was used as a fourth chromatography step.

The column was equilibrated with 10 mM sodium phosphate, pH 6.8. After loading the dialyzed butyl-toyo fraction, the column was washed with 10 mM sodium phosphate

buffer, pH 6.8, and eluted with a gradient of 10-600 mM sodium phosphate, pH 6.8. Column fractions containing the monomer NDF protein were identified by SDS-PAGE, pooled, and diluted with H₂O to reduce conductivity to the level of 0.15 M NaCl. After adjusting the pH to

8.5, the sample was loaded onto a S-Sepharose Fast Flow column previously equilibrated with 1x PBS, pH 8.5. The column was washed with 1x PBS, pH 8.5, and then eluted with a gradient of 0 to 1 M NaCl in 1x PBS, pH 8.5.

Column fractions containing the monomer NDF protein were identified by SDS-PAGE, pooled, dialyzed against 1x PBS, and passed through a 0.2 micron filter (Nalgene). This final product was designated recombinant human met-NDF-α2_14-241.

VI. Characterization of recombinant human met-NDF-α2_14-241

Protein concentrations for recombinant human met-NDF-α2_14-241 were determined by ultraviolet absorption at 280 nm using a Beckman DU-650 spectrophotometer with an extinction coefficient of 0.54 mg^-1cm^-1ml. For reverse phase HPLC analysis, Vydac C4 or Synchropak RP-4 columns were used with a HP1090 HPLC (Hewlett-Packard, Palo Alto, CA). The columns were equilibrated with 97% buffer A (0.1% trifluoroacetic acid in HPLC water) and 3% buffer B (90% acetonitrile, 0.1% trifluoroacetic acid in HPLC water). Recombinant NDF protein (10-50 μg) was injected in a total volume of 250 μl. For unfolding and reduction, a 40 μl sample was first treated with 200 μl of 0.1 M Tris-6 M guanidineHCl, pH 8, and 10 μl of 0.2 M dithiothreitol for thirty minutes at 23°C. The final concentrations of guanidineHCl and dithiothreitol were 4.8 M and 8 mM, respectively. The columns were eluted with a linear gradient of 3-50% buffer B for the first sixty minutes. The gradient was increased to 95% buffer B over the next ten minutes. Elution was continued for another ten to fifteen minutes with 95% buffer B. The analysis showed that the purified and refolded NDF preparation had a single peak with a retention time of 46.4 minutes in the presence and absence of 6 M

guanidineHCl. The addition of both guanidineHCl and dithiothreitol caused the NDF to become reduced and unfolded, observed by the shift in retention time to 49.6 minutes. The N-terminal amino acid sequence analysis for the first ten residues (MKKKERGSGK) [SEQ ID NO: 122] matched the sequence predicted from DNA and included the N-terminal methionine used to initiate translation in E. coli.

VII. Purification of recombinant human NDF-α2_177-241

Recombinant human NDF-α2_177-241 was purified from E. coli Strain 1776 by renaturing recombinant NDF protein from a cell lysate pellet fraction, followed by cation exchange, hydrophobic and anion exchange column chromatography. E. coli cell paste was disrupted in 5 mM EDTA, pH 6.5, by two passages through a Microfluidizer

(Microfluidics Corp., Newton, MA) at 14,000 psi. The cell lysate was centrifuged for one hour at 4,100 rpm in a J6-B centrifuge (Beckman). The pellet was resuspended in 5 mM EDTA, pH 6.5, and centrifuged again for one hour. The pellet was solubilized by mixing with eight volumes of 8 M urea, 20 mM Tris-HCl, pH 8.6, for thirty minutes at room temperature. Reduced glutathione was added to 10 mM and the sample was stirred at room temperature, pH 8.8, for an additional thirty minutes. The denatured, reduced inclusion body protein was diluted 1:20 into a buffer containing 20 mM Tris-HCl, 1 mM EDTA, 0.2 M L-arginine, 1 mM reduced glutathione, 1 mM oxidized glutathione (pH 8.8). The solution was left for sixty minutes at room temperature, then at 4ºC for sixteen hours without further stirring.

The renatured sample was diluted with two volumes of water and adjusted to pH 4.6 by adding citric acid to 5 mM followed by 5 M HCl. After one hour of centrifugation at 4,100 rpm in a J6-B centrifuge, the supernatant was loaded onto a S-Sepharose Fast Flow column. The column was previously equilibrated with 20 mM citric acid, pH 4.6. After loading, the column was washed first with equilibration buffer, then with 20 mM sodium phosphate, pH 6.8. Bound protein was stepwise eluted with 0.5 M NaCl in 1x PBS, pH 7.0.

S-Sepharose fractions containing a prominent recombinant NDF protein of about 8 kDa were identified by SDS-PAGE analysis using a 4-20% gradient gel (Novex, San Diego) followed by coomassie blue staining. These fractions were pooled, diluted to 1 mg/ml with 10 mM citric acid, pH 5.0, and adjusted to 1.5 M ammonium sulfate. The sample was then loaded onto a column containing TSKGEL Butyl-toyopearl 650M equilibrated with 1.5 M ammonium sulfate, 10 mM citric acid, pH 5. The column was washed with the equilibration buffer and eluted with a gradient of 1.5 to 0 M ammonium sulfate in 10 mM citric acid, pH 5. Column fractions containing the monomeric

recombinant NDF protein were identified by SDS-PAGE and pooled. After concentration in a stirred cell with a YM3 ultrafiltration membrane (Amicon, Danvers, MA), the sample was dialyzed against three changes of 1x PBS overnight at 4ºC.

The dialyzed sample was diluted with two volumes of water, adjusted to pH 6.9, and passed through a Q-Sepharose Fast Flow column previously equilibrated with 20 mM sodium phosphate, pH 6.9. The flow-through fraction was adjusted to pH 4.5 with 3 M citric acid, then loaded onto a S-Sepharose Fast Flow column

previously equilibrated with 10 mM citric acid, pH 4.5. The column was washed with equilibration buffer, then stepwise eluted with 50 mM sodium phosphate pH 6.9.

Column fractions containing the monomer NDF protein were identified by SDS-PAGE, pooled, dialyzed overnight at 4°C against three changes of 1x PBS, pH 6.5, and passed through a 0.2 micron filter (Nalgene). This final product was designated recombinant human NDF-α2_177-241.

VIII. Characterization of recombinant human NDF-α2_177-241

Protein concentrations for recombinant human NDF-α2_177-241 were determined by ultraviolet absorption at 280 nm using a Beckman DU-650 spectrophotometer with an extinction coefficient of 0.389 mg^-1cm^-1ml. Reverse phase HPLC analysis showed that the purified and refolded NDF preparation had a single peak with a retention time of 34.5 minutes. The addition of both guanidineHCl and dithiothreitol caused the NDF to become reduced and unfolded, which was observed by a shift in retention time to 43.5 minutes. The N-terminal amino acid sequence analysis for the first fifteen residues (SHLVKCAEKEKTFCV) [SEQ ID NO: 123] matched the sequence predicted from DNA.

IX. Purification of recombinant human met-NDF-α1_14-249

Recombinant human met-NDF-α1_14-249 was purified from E. coli Strain 1854 by renaturing recombinant NDF protein from a cell lysate pellet fraction, followed by cation exchange and hydrophobic column chromatography.

E. coli cell paste was disrupted in 1 mM EDTA, 10 mM Tris-Cl, pH 7.0 by two passages through a Niro-Soavi Homogenizer at 14,000 psi. The cell lysate was centrifuged for one hour at 4,100 rpm in a J6-B

centrifuge (Beckman), using a JS-4.2 rotor. The pellet was resuspended in 1 mM EDTA, 10 mM Tris-Cl, pH 7.0, and centrifuged again for one hour. The pellet fraction was solubilized by mixing with five volumes of 8 M urea, 2 mM EDTA, pH 8.6, for thirty minutes at room temperature. Reduced glutathione was added to 10 mM and the sample was stirred at room temperature, pH 8.6, for an additional thirty minutes. The denatured, reduced inclusion body protein was dropped into 50 volumes of 20 mM Tris-HCl, 1 mM EDTA, 0.2 M L-arginine, 1 mM reduced glutathione, 1 mM oxidized glutathione (pH 8.8). The stirring was continued for ten minutes at room temperature. The solution was then left at 4°C overnight.

The renatured sample was diluted with one volume of water and adjusted to pH 7.0. After one hour of centrifugation at 4,100 rpm in a J6-B centrifuge, the supernatant was loaded onto a S-Sepharose Fast Flow column, previously equilibrated with 1x PBS, pH 7.0.

After loading, the column was first washed with 1x PBS, pH 7.0, then with 1x PBS, pH 7.0 containing an

additional 0.15 M NaCl. Bound protein was eluted by a gradient of 0.15-0.75 M NaCl in 1x PBS, pH 8.0. S-Sepharose fractions containing a prominent recombinant NDF protein of about 34 kDa were identified by SDS-PAGE analysis. These fractions were pooled and mixed with an equal volume of 2.4 M ammonium sulfate. After

adjusting the pH to 6.5, the sample was loaded onto a column containing TSKGEL Butyl-toyopearl 650M

equilibrated with 1.2 M ammonium sulfate, 10 mM sodium phosphate, pH 6.5. The column was washed with the equilibration buffer and eluted with a gradient of 1.2 to 0 M ammonium sulfate in 1x PBS, pH 6.5. Column fractions containing monomeric recombinant NDF protein were identified by SDS-PAGE, pooled, and dialyzed against three changes of 1x PBS overnight at 4ºC.
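
The adjustment of the pooled S-Sepharose fractions to the 1.2 M ammonium sulfate loading condition follows from mixing them with an equal volume of 2.4 M ammonium sulfate. The sketch below checks that arithmetic. It assumes that volumes are additive (only an approximation for concentrated salt solutions) and that the pooled fractions contain no ammonium sulfate before mixing; the function name is illustrative.

```python
def mixed_molarity(parts):
    """Return the molarity after combining (volume, molarity) pairs.

    Volumes may be in any consistent unit; volumes are assumed to be additive.
    """
    parts = list(parts)
    total_volume = sum(volume for volume, _ in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    total_amount = sum(volume * molarity for volume, molarity in parts)
    return total_amount / total_volume


pooled_fractions = (1.0, 0.0)  # one volume, assumed free of ammonium sulfate
salt_stock = (1.0, 2.4)        # an equal volume of 2.4 M ammonium sulfate

loading_molarity = mixed_molarity([pooled_fractions, salt_stock])
print(f"ammonium sulfate after mixing: {loading_molarity:.2f} M")
```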

The dialyzed sample was loaded onto a S-Sepharose Fast Flow column previously equilibrated with 1x PBS, pH 7.0. Bound protein was eluted by a gradient of 0.15-0.80 M NaCl in 1x PBS, pH 7.0. Column

fractions containing the monomer NDF protein were identified by SDS-PAGE, pooled, dialyzed against two changes of 1x PBS, and passed through a 0.2 micron filter (Nalgene). This final product was designated recombinant human met-NDF-α1_14-249.

X. Characterization of recombinant human met-NDF-α1_14-249

Protein concentrations for recombinant human met-NDF-α1_14-249 were determined by ultraviolet absorption at 280 nm with an extinction coefficient of 0.486 mg^-1cm^-1ml. Reverse phase HPLC analysis showed that the

purified and refolded NDF preparation had a single peak with a retention time of 47.2 minutes. The addition of both guanidineHCl and dithiothreitol caused the NDF to become reduced and unfolded, which was observed by the shift in retention time to 50.4 minutes. The N-terminal amino acid sequence analysis at the first twenty

residues matched the sequence predicted from DNA and included the N-terminal methionine residue added to initiate translation in E. coli. The sequence analysis also showed that amino-terminal processing occurred at amino acid residue 15 with a ratio of 4.8%.
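
The final guanidineHCl and dithiothreitol concentrations quoted in section VI for the unfolding and reduction treatment (4.8 M and 8 mM) can be checked with a short dilution calculation. The sketch below uses only the volumes and stock concentrations stated there; whether the identical treatment was applied to every isoform, including this one, is an assumption, and the helper name is illustrative.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration of a component after dilution into total_volume."""
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume / total_volume


# Volumes from section VI, in microliters.
sample_ul = 40.0
guanidine_ul = 200.0  # 0.1 M Tris-6 M guanidineHCl, pH 8
dtt_ul = 10.0         # 0.2 M dithiothreitol
total_ul = sample_ul + guanidine_ul + dtt_ul

guanidine_final_molar = final_concentration(6.0, guanidine_ul, total_ul)
dtt_final_millimolar = final_concentration(0.2, dtt_ul, total_ul) * 1000.0

print(f"total volume: {total_ul:.0f} ul")
print(f"guanidineHCl: {guanidine_final_molar:.1f} M")
print(f"dithiothreitol: {dtt_final_millimolar:.1f} mM")
```

The 250 μl total also matches the injection volume stated for the HPLC analysis.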

XI. Purification of recombinant human met-NDF-β1_14-246

Recombinant human met-NDF-β1_14-246 was purified from E. coli Strain 1789 by renaturing recombinant NDF protein from a cell lysate pellet fraction, followed by cation exchange and hydrophobic interaction column chromatography.

E. coli cell paste was disrupted in ten volumes of 10 mM Tris-HCl, 5 mM EDTA, pH 8.0. The cell lysate was centrifuged, and the pellet was solubilized in five volumes of 8 M urea, pH 8.6, for thirty minutes at room temperature. Dithiothreitol was added to 5 mM to complete the reduction. The sample was stirred at room temperature, pH 8.6, for an additional thirty minutes. The solubilized, reduced inclusion body protein was added dropwise to fifty volumes of 20 mM Tris-Cl, 1 mM EDTA, 0.2 M L-arginine, 2 M urea, 1 mM reduced glutathione, 1 mM oxidized glutathione (pH 8.6). The stirring was continued for ten minutes at room temperature. The solution was then left at 4°C overnight.

After diluting the renatured sample with two volumes of water and adjusting the pH to 8.0, the material was loaded onto a CM-Sepharose Fast Flow

(Pharmacia) column, previously equilibrated with 20 mM sodium phosphate, pH 8.0. After loading, the column was washed with 20 mM sodium phosphate, pH 8.0, then with 10 mM sodium phosphate, 15 mM glycine, pH 8.5. Bound protein was eluted with 0.5 M NaCl in 10 mM sodium phosphate, 15 mM glycine, pH 8.5. CM-Sepharose fractions containing a prominent recombinant NDF protein of about 34 kDa were identified by SDS-PAGE analysis. The pooled fractions were

adjusted to 10% (weight/volume) ammonium sulfate and pH 6.5 before loading onto a TSKGEL phenyl-toyopearl

650M column, previously equilibrated with 10% (w/v) ammonium sulfate, pH 6.5. The column was washed with the equilibration buffer and eluted with a gradient of 10% to 0% ammonium sulfate in 20mM sodium phosphate at pH 6.5. Column fractions containing the monomer NDF protein were identified by SDS-PAGE, pooled, and

dialyzed against 50mM sodium phosphate, pH 7.0. The dialyzed sample was loaded onto a S-Sepharose Fast Flow column previously equilibrated with 50mM sodium

phosphate, pH 7.0. The column was washed with 50 mM sodium phosphate, pH 8.0, then stepwise eluted with 0.5 M NaCl in 20 mM sodium phosphate, 20 mM glycine, pH 8.8.

Column fractions containing monomeric recombinant NDF protein were identified by SDS-PAGE, pooled, dialyzed against 1x PBS, and passed through a 0.2-micron filter (Nalgene). This final product was designated

recombinant human met-NDF-β1_14-246.

XII. Characterization of recombinant human met-NDF-β1_14-246

The protein concentration of recombinant human met-NDF-β1_14-246 was determined by absorption at

280nm with an extinction coefficient of 0.59 mg^-1cm^-1ml. Reverse phase HPLC analysis showed that the purified and refolded recombinant NDF preparation had a single major peak with a retention time of 49.2 minutes. The

addition of both guanidineHCl and dithiothreitol caused the recombinant NDF to become reduced and unfolded, observed by a shift in retention time to 53.2 minutes. The N-terminal amino acid sequence analysis for the first twenty-five residues (MKKKERGSGKKPESAAGSQSPALPP) [SEQ ID NO: 124] matched the sequence predicted from DNA and included the additional N-terminal methionine used to initiate translation in E. coli. The sequence analysis also revealed that the first two residues were absent from 25% of the material, which has the amino-terminal sequence KKERGSG.
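
The reverse phase gradient described in section VI (3-50% buffer B over the first sixty minutes, a rise to 95% over the next ten minutes, then a hold at 95%) can be written as a piecewise-linear program. The sketch below assumes that the same program was used for this isoform, which the text does not state explicitly, and it ignores system dwell volume and column dead time, so the buffer B percentages at the reported retention times are rough estimates only.

```python
def percent_b(t_min):
    """Programmed % buffer B at time t_min (minutes) for the section VI gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 60.0:
        # Linear ramp from 3% to 50% B over the first sixty minutes.
        return 3.0 + (50.0 - 3.0) * t_min / 60.0
    if t_min <= 70.0:
        # Steeper ramp from 50% to 95% B over the next ten minutes.
        return 50.0 + (95.0 - 50.0) * (t_min - 60.0) / 10.0
    # Hold at 95% B for the remainder of the run.
    return 95.0


# Retention times reported above for met-NDF-β1_14-246.
for label, retention_min in (("refolded", 49.2), ("reduced and unfolded", 53.2)):
    print(f"{label}: {retention_min} min, about {percent_b(retention_min):.1f}% B programmed")
```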

XIII. Purification of recombinant human met-NDF-β1_177-246

Recombinant human met-NDF-β1_177-246 was purified from E. coli Strain 1769 by renaturing recombinant NDF protein from a cell lysate pellet fraction, followed by cation exchange and reverse phase column chromatography.

E. coli cell paste was disrupted in ten volumes of 5 mM EDTA, 10 mM Tris-HCl, pH 8.0. The cell lysate was centrifuged, the pellet was resuspended in 5 mM EDTA, 10 mM Tris-Cl, pH 8.0, and centrifuged again, then solubilized by mixing with five volumes of 8 M urea, pH 8.5, at room temperature for thirty minutes. Reduced glutathione was added to 10 mM, and the sample was stirred at room temperature, pH 8.7, for an

additional thirty minutes. The solubilized, reduced inclusion body protein was added dropwise to fifty volumes of 30 mM Tris-HCl, 1 mM EDTA, 0.2 M

L-arginine, 2 M urea, 1 mM reduced glutathione, 1 mM oxidized glutathione (pH 8.6). The stirring was

continued for ten minutes at room temperature. The sample was then left at 4ºC overnight.

After diluting the renatured sample with two volumes of water and adjusting the pH to 4.5 with 3 M citric acid, the material was centrifuged for one hour. The supernatant was loaded onto a CM-Sepharose Fast Flow (Pharmacia) column previously equilibrated with 10 mM citric acid, pH 4.5. After loading, the column was washed with equilibration buffer. Bound protein was stepwise eluted with 1x PBS, pH 8.5.

CM-Sepharose fractions containing a prominent ~7 kD recombinant NDF protein were identified by SDS-PAGE analysis. Pooled fractions containing the product were diluted with two volumes of water. The pH was adjusted to 4.0, and the material was loaded onto a

Vydac C₄ column, from The Separations Group, Hesperia, CA. The C₄ column was previously packed in 80% ethanol, pH 4.0, and equilibrated with 20mM citric acid, pH 4.0. After loading, the column was washed with 20 mM citric acid, pH 4.0, then with 20% ethanol in 20mM citric acid, pH 4.0. Bound proteins were eluted by a gradient of 20-80% ethanol in 20 mM citric acid, pH 4.0. Column

fractions containing monomeric recombinant NDF protein were identified by SDS-PAGE, pooled, and dialyzed

against two changes of 20 mM citric acid, pH 4.5, at 4ºC overnight.

The dialyzed sample was loaded onto a S-Sepharose Fast Flow (Pharmacia) column previously

equilibrated with 20 mM citric acid, pH 4.5. The column was washed with the equilibration buffer and stepwise eluted with 1x PBS, pH 8.0. Column fractions containing the monomer NDF protein were identified by SDS-PAGE and pooled. To clarify the pooled material, 0.2 volumes of water and 1/20 volumes of 1 M tris-HCl, pH 8.5, were added. The sample was then dialyzed against 1x PBS, pH 8.0, and passed through a 0.2 micron filter (Nalgene).

This final product was designated recombinant human met-NDF-β1_177-246.

XIV. Characterization of recombinant human met-NDF-β1_177-246

The protein concentration of recombinant human met-NDF-β1_177-246 was determined by absorption at 280 nm with an extinction coefficient of 0.66 mg^-1cm^-1ml.

Reverse phase HPLC analysis showed that the purified and refolded recombinant NDF preparation had a single major peak with a retention time of 50.5 minutes. The

addition of both guanidineHCl and dithiothreitol caused the recombinant NDF to become reduced and unfolded, which was observed by a shift in retention time to 53.7 minutes. The N-terminal amino acid sequence analysis of the first fifteen residues revealed a major sequence (MSHLVKCAEKEKTFC) [SEQ ID NO: 125] that matched the sequence predicted from DNA and included the additional N-terminal methionine used to initiate translation in E. coli.

XV. Purification of recombinant human met-NDF-α3_14-247

Recombinant human met-NDF-α3_14-247 was purified from E. coli Strain 1910 by renaturing recombinant NDF protein from a cell lysate pellet fraction, followed by cation exchange, hydrophobic interaction and anion exchange column chromatography. E. coli cell paste was disrupted in ten volumes of 5 mM Tris-HCl, 1 mM EDTA, pH 8.0. The cell lysate was centrifuged, and the pellet was solubilized in ten volumes of 8 M urea, 2 mM EDTA, 5 mM dithioerythritol (2,3-dihydroxybutane-1,4-dithiol), pH 9, for ninety minutes at room temperature. The solubilized, reduced inclusion body protein was added slowly into twenty-five volumes of 20 mM Tris-HCl, 0.4 M L-arginine, 1 mM EDTA, 0.5 mM reduced glutathione, pH 8.7. The stirring was continued for five minutes at room temperature. The sample was then left at room

temperature for one hour; then at 4°C for eighteen hours without further stirring.

The renatured sample was adjusted to pH 4.7 by adding citric acid to 10 mM followed by 5 M HCl. After centrifugation, the supernatant was loaded onto a S-Sepharose Fast Flow column. The column was previously equilibrated with 20 mM citric acid, pH 4.7. After

loading, the column was washed first with equilibration buffer, then with 20mM sodium phosphate, 20mM Glycine, pH 8.5; finally with 0.2M NaCl in 20mM sodium phosphate, 20mM glycine, pH 8.5. Bound protein was stepwise eluted with 0.7 M NaCl in 20mM sodium phosphate and 20mM

glycine, pH 8.5. The eluted fractions from S-Sepharose column containing a prominent recombinant NDF protein of about 34 kDa were identified by SDS-PAGE analysis using a 10% NOVEX gel followed by coomassie blue staining.

These fractions were pooled, diluted by water to

approximately 1 mg/ml, and then adjusted to 1.2 M ammonium sulfate and 50 mM sodium phosphate, pH 6.0. The sample was loaded onto a Butyl-toyopearl 650M column

equilibrated with 1.2 M ammonium sulfate, 50mM sodium phosphate pH 6.0. After loading the column was washed with the equilibration buffer and eluted with a gradient of 1 M to 0 M ammonium sulfate in 50 mM sodium phosphate pH 6.0. Column fractions containing monomeric

recombinant NDF protein were identified by SDS-PAGE and pooled. The pooled sample was dialyzed against three changes of 1 × PBS pH 7.5.

The dialyzed sample was passed through a Q-Sepharose Fast Flow column previously equilibrated with 1× PBS, pH 7.5. The flow-through fraction was collected and adjusted to pH 4.7 with 3 M citric acid. After adding 0.2 M NaCl, the sample was loaded onto an S-Sepharose Fast Flow column at pH 4.7. The column was then washed with 20 mM citric acid, pH 4.7, and with 20 mM sodium phosphate, 20 mM glycine, pH 8.5. The bound protein was eluted with a gradient of 0 to 1 M NaCl in 20 mM sodium phosphate, 20 mM glycine, pH 8.5. The first peak, containing the monomeric recombinant NDF protein, was pooled, dialyzed at 4°C against three changes of 1× PBS, and passed through a 0.2 micron filter. This final product was designated recombinant human met-NDF-α3_14-247.

XVI. Characterization of recombinant human met-NDF-α3_14-247

The protein concentration of recombinant human met-NDF-α3_14-247 was determined by absorption at 280 nm using an extinction coefficient of 0.446 ml mg^-1 cm^-1. SDS-PAGE analysis using a 10% NOVEX gel followed by Coomassie blue staining showed that the purified NDF protein appeared as a single band of approximately 34 kDa (15 μg/lane). In the presence of reducing reagent (30 mM dithiothreitol), the reduced form tended to run slower than the non-reduced form on SDS-PAGE. Reverse phase HPLC analysis showed that the purified and refolded recombinant NDF preparation had a single peak with a retention time of 47.9 minutes. The addition of both guanidine HCl and dithiothreitol caused the NDF protein to become reduced and unfolded, as observed by a shift in retention time to 51.0 minutes. N-terminal amino acid sequence analysis of the first nineteen residues matched the sequence predicted from the DNA and included the additional N-terminal methionine used to initiate translation in E. coli.
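As a worked illustration of the concentration determination described above, the following is a minimal sketch of the Beer-Lambert calculation, c = A280 / (ε × l), using the stated extinction coefficient of 0.446 ml mg^-1 cm^-1. The function name, the 1 cm path length default and the dilution-factor parameter are assumptions introduced for illustration only; they are not part of the described procedure.

```python
def protein_conc_mg_per_ml(a280, ext_coeff=0.446, path_cm=1.0, dilution=1.0):
    """Estimate protein concentration (mg/ml) from absorbance at 280 nm.

    ext_coeff is in ml mg^-1 cm^-1 (0.446 is the value stated in the text).
    path_cm (cuvette path length) and dilution (fold dilution of the measured
    sample) are illustrative assumptions, not values from the text.
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff * path_cm) * dilution


# Hypothetical reading: an undiluted sample with A280 = 0.446 in a 1 cm cuvette.
print(f"{protein_conc_mg_per_ml(0.446):.2f} mg/ml")
```

With these assumptions, an absorbance equal to the extinction coefficient corresponds to 1 mg/ml, which is a convenient sanity check on the units.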

EXAMPLE 15

Induction of Her2/neu Tyrosine Phosphorylation by Recombinant NDFs from E. coli

Subconfluent monolayers of MDA-MB-453 cells were grown in 48-well plates. The cells were then treated with the indicated concentrations of recombinant rat or human NDF at 37°C for five minutes. Cell lysates were prepared and subjected to Western blot analysis with an anti-phosphotyrosine antibody. The results are shown in Figure 43. All of the recombinant rat and human NDF isoforms were active in stimulating tyrosine phosphorylation of the neu receptor in MDA-MB-453 cells at concentrations of 5-20 ng/ml. PBSA (PBS with 0.1% albumin) served as a negative control.

EXAMPLE 16

Accumulation of Neutral Lipids in Mammary Carcinoma Cells Cultured in the Presence of Recombinant Human NDF-α2 and Recombinant Rat NDF-α2

Phenotypic changes usually associated with "functional" differentiation of breast epithelium include the synthesis of milk components and lipids (Topper, Y.J., and Freeman, C.S., Physiol. Rev. 60:1049-1106, 1980; Stampfer, M.R., and Yaswen, P., in Transformation of Human Epithelial Cells: Molecular and Oncogenic Mechanisms, G.E. Milo, B.C. Castro and C.F. Shuler, eds. (CRC Press, Ann Arbor, MI), pp. 117-140, 1992). Specific cultured human breast cell lines can acquire these differentiated characteristics upon treatment with factors such as phorbol 12-myristate 13-acetate (Bacus, S.S. et al., Mol. Carcinog. 3:350-362, 1990). This experiment was designed to determine whether recombinant human met-NDF-α2_14-241 (rHuNDF-α2) or recombinant rat NDF-α2 (rRtNDF-α2) stimulates, retards or has no effect on the in vitro accumulation of neutral lipids in BT-474 cells, a human mammary ductal carcinoma line which overexpresses the neu receptor. The rHuNDF-α2 used herein was purified from E. coli as described in Example 14, whereas rRtNDF-α2 was prepared from CHO cell supernatants as described in Example 12.

I. Materials and Methods

BT-474 cells (ATCC HTB 20) were obtained from the American Type Culture Collection (Rockville, MD). Cells were routinely cultured in "complete" media containing RPMI 1640 (GIBCO BRL, Gaithersburg, MD) supplemented with 10% heat-inactivated fetal bovine serum (Hyclone Laboratories Inc., Logan, UT), 2 mM L-glutamine (GIBCO BRL), and 10 μg/mL bovine insulin (Clonetics Corp., San Diego, CA).

Accumulation of neutral lipids (e.g., cytoplasmic triglycerides) was measured using a modification of the Nile Red bioassay (Smyth, M.J., and Wharton, W., Exp. Cell Res. 199:29-38, 1992). BT-474 cells were seeded in 6-well flat-bottom dishes in 2 mL of "complete" media at a density of 1 × 10⁵ cells per well. After incubation for twenty-four hours at 37°C in a humidified air/5% CO₂ chamber, media were removed and replaced with samples containing either rHuNDF-α2 or rRtNDF-α2. Plates were re-incubated using the conditions described above. Recombinant NDF-α2 dilutions were prepared from stock solutions (rHuNDF-α2_14-241, 370 μg/mL; rRtNDF-α2, 492 μg/mL) in a 1:1 mixture of "complete" media and diluent. Diluent consisted of a 1:1 mixture of DMEM (GIBCO BRL):F-12 (GIBCO BRL) with 100 μM non-essential amino acids (GIBCO BRL) and 2 mM L-glutamine (GIBCO BRL). Cells were refed with fresh media (with or without recombinant NDF-α2) after three and five days of incubation.

On day 7, cells were detached from the substratum using 0.25 mL 0.1% trypsin/1.1 mM EDTA (GIBCO BRL) in Dulbecco's phosphate-buffered saline (D-PBS) (GIBCO BRL) at 37°C. After each sample was resuspended in a final volume of 2 mL D-PBS, 0.8 mL phosphate-buffered 10% formalin (Baxter, Deerfield, IL) was added as a fixative (i.e., the final formaldehyde concentration was approximately 1%). Each cell suspension was filtered through Nitex bolting cloth (Tetko Inc., Briarcliff Manor, NY) and stained with 50 ng/mL of Nile Red (Molecular Probes Inc., Eugene, OR). Samples were analyzed (10⁴ events per sample) with a Fluorescence Activated Cell Sorter (FACS) for gold fluorescence, red fluorescence, side scatter, forward scatter, and a gold-to-red fluorescence ratio. The FACStar^PLUS (Becton Dickinson Immunocytometry Systems, San Jose, CA) was equipped with a 5 W argon laser operated at 488 nm. Gold fluorescence was collected using a narrow band pass filter at 575 ± 13 nm (Becton Dickinson), while red fluorescence was collected beyond 610 nm with a long pass filter (Omega Optical, Brattleboro, VT). Data were acquired and analyzed using the software package Lysis II Version 1.0 (Becton Dickinson).
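The approximate final formaldehyde concentration quoted for the fixation step can be checked with a simple dilution calculation. The sketch below assumes that 10% formalin corresponds to roughly 4% formaldehyde (a general property of formalin solutions, not a figure stated in this Example); the function name is hypothetical.

```python
def final_percent(stock_percent, stock_ml, sample_ml):
    """Concentration (%) after adding stock_ml of stock to sample_ml of sample."""
    total_ml = stock_ml + sample_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return stock_percent * stock_ml / total_ml


# Volumes from the text: 0.8 mL of 10% formalin added to a 2 mL cell suspension.
formalin_pct = final_percent(10.0, 0.8, 2.0)

# Assumption: 10% formalin contains about 4% formaldehyde.
formaldehyde_pct = final_percent(4.0, 0.8, 2.0)

print(f"formalin: {formalin_pct:.2f}%, formaldehyde: {formaldehyde_pct:.2f}%")
```

Under that assumption the result is a little over 1% formaldehyde, in line with the "approximately 1%" given in the text.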

II. Results and Discussion:

Formation of cytoplasmic lipid vesicles, a phenotypic change normally associated with differentiating mammary epithelium, can be detected by staining cells with a fluorescent hydrophobic probe known as Nile Red. Nile Red fluorescence can vary: although not fluorescent in aqueous media, it will fluoresce yellow-gold in a neutral lipid environment (i.e., lipid vesicles) or red in a polar lipid environment (i.e., plasma membrane) (Fowler, S.D., and Greenspan, P., J. Histochem. Cytochem. 33:833-836, 1985). Therefore, as a cell differentiates and intracellular neutral lipid deposits accumulate, total gold fluorescence will increase relative to total red fluorescence. Such changes are most readily quantified when gold-to-red fluorescence ratios are monitored rather than either total gold or total red fluorescence (Smyth, M.J., and Wharton, W., Exp. Cell Res., above).

BT-474 cells were treated with recombinant NDF-α2 samples for seven days prior to staining with Nile Red and analysis by flow cytometry. Histograms depicting the gold-to-red fluorescence ratio measurements are presented in Figures 44 and 45. Recombinant NDF-α2 induced changes in the distribution of cells from the low-ratio peak (i.e., "nondifferentiated" cells with few neutral lipid vesicles) to the high-ratio peak (i.e., "differentiated" cells with prominent neutral lipid vesicles). The data, summarized in Table 7, indicate that both E. coli-produced rHuNDF-α2 and CHO-produced rRtNDF-α2 stimulate accumulation of neutral lipids in BT-474 cells in a dose-dependent manner. This is reflected by an increase in the percentage of cells in the high-ratio peak. This effect was most pronounced at the highest concentration tested (100 ng/mL). Thus, treatment with either preparation of NDF-α2 can induce a "differentiated" phenotype in BT-474 cells.

==============================================================================
TABLE 7. Accumulation of Neutral Lipids in BT-474 Mammary Carcinoma Cells
==============================================================================
                                 LOW-RATIO PEAK^b     HIGH-RATIO PEAK^c
[NDF-α2]      Type of rNDF^a     Percent              Percent
------------------------------------------------------------------------------
0 ng/mL       E. coli/human      60.4                 39.6
              CHO/rat            63.2                 36.8
4 ng/mL       E. coli/human      45.6                 54.5
              CHO/rat            65.6                 34.4
20 ng/mL      E. coli/human      37.3                 62.8
              CHO/rat            30.2                 69.0
100 ng/mL     E. coli/human      20.1                 79.9
              CHO/rat             9.7                 90.3
------------------------------------------------------------------------------

Cultures of BT-474 were treated for seven days with defined amounts of either E. coli-produced rHuNDF-α2 or CHO-produced rRtNDF-α2. Staining for lipids was as described in the Materials and Methods section. Data refer to the fraction of cells (i.e., percent) residing in either the low- or high-ratio peak. The experiments were repeated and yielded similar results.

a "E. coli/human" refers to recombinant human met-NDF-α2_14-241 purified from E. coli; "CHO/rat" refers to recombinant rat NDF-α2 purified from CHO cell supernatants.

b Cells whose mean gold-to-red fluorescence ratio was < 60.

c Cells whose mean gold-to-red fluorescence ratio was > 60 and < 2000.
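The peak assignment defined in footnotes b and c can be sketched as a simple gating step. This is only an illustration under stated assumptions: the function name is hypothetical, the input is taken to be a list of per-event gold-to-red ratios on the same scale as the cutoffs of 60 and 2000, and percentages are assumed to be expressed relative to the events falling in either peak. It is not the Lysis II analysis used in the Example.

```python
def peak_percentages(ratios, low_cutoff=60.0, high_cutoff=2000.0):
    """Return (percent in low-ratio peak, percent in high-ratio peak).

    Low-ratio peak: ratio < low_cutoff (footnote b).
    High-ratio peak: low_cutoff < ratio < high_cutoff (footnote c).
    Events outside both ranges are excluded from the denominator
    (an assumption made for this sketch).
    """
    low = sum(1 for r in ratios if r < low_cutoff)
    high = sum(1 for r in ratios if low_cutoff < r < high_cutoff)
    gated = low + high
    if gated == 0:
        raise ValueError("no events fall in either peak")
    return 100.0 * low / gated, 100.0 * high / gated


# Made-up ratios for illustration only (not data from Table 7).
example_ratios = [10, 20, 59, 61, 100, 1500, 2500]
low_pct, high_pct = peak_percentages(example_ratios)
print(f"low-ratio peak: {low_pct:.1f}%, high-ratio peak: {high_pct:.1f}%")
```

Because both percentages share one denominator, they sum to 100, which matches the layout of Table 7 apart from rounding.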

EXAMPLE 17

Stimulation of BT-474 Proliferation and Inhibition of

MDA-MB-468 Proliferation in the Presence of Recombinant

Human NDF-α2

The in vitro growth of cancer cells can be modified by numerous growth stimulatory and inhibitory factors (Aaronson, S.A., Science 254:1146-1153, 1991). Some of these, such as EGF (Imai, Y. et al., Cancer Res. 42:4394-4398, 1982), bFGF (Karey, K.P., and Sirbasku, D.A., Cancer Res. 48:4083-4092, 1988), IGF-I (Karey, K.P., and Sirbasku, D.A., Cancer Res., above) and TGF-β1 (Knabbe, C. et al., Cell 48:411-428, 1987), can regulate the proliferation of rodent and mammalian breast carcinoma cells in culture. Responses can be dependent upon variables such as the target cell line, specific factor or factors, media and media supplements, cell density, and type of assay. In this experiment, we determined whether recombinant human met-NDF-α2_14-241 (rHuNDF-α2) stimulates, retards or has no effect on the in vitro growth of two human mammary cell lines: BT-474 cells, a ductal carcinoma line that overexpresses the neu/Her-2 receptor, and MDA-MB-468 cells, an adenocarcinoma line which expresses the same receptor. rHuNDF-α2 was purified from E. coli and refolded in an active configuration (Example 14).

I. Materials and Methods:

BT-474 (ATCC HTB 20) and MDA-MB-468 (ATCC HTB 132) cells were obtained from the American Type Culture Collection (Rockville, MD). Cells were routinely cultured in "complete" media containing RPMI 1640 (GIBCO BRL, Gaithersburg, MD) supplemented with 10% heat-inactivated fetal bovine serum (Hyclone Laboratories Inc., Logan, UT), 2 mM L-glutamine (GIBCO BRL), and 10 μg/mL bovine insulin (Clonetics Corp., San Diego, CA). MDA-MB-468 cells were cultured in the same media without L-glutamine and insulin as supplements.

Proliferation was monitored using a modification of the 3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyltetrazolium bromide ("MTT") bioassay (Mosmann, T., J. Immunol. Methods 65:55-63, 1983; Denizot, F., and Lang, R., J. Immunol. Methods 89:271-277, 1986; Hansen, M.B. et al., J. Immunol. Methods 119:203-210, 1989). Cells were seeded in 96-well flat-bottom microtiter dishes in 100 μL of cell-specific "complete" media at low density [i.e., 3,000 (BT-474) or 2,000 (MDA-MB-468) cells per well]. After incubation for twenty-four hours at 37°C in a humidified air/5% CO₂ chamber, 100 μL aliquots of rHuNDF-α2 samples were added, and the plates were incubated further using the conditions described above. rHuNDF-α2 dilutions were prepared from a stock solution (rHuNDF-α2_14-241, 370 μg/mL) in diluent (1:1 mix of DMEM (GIBCO):F-12 (GIBCO) with 100 μM non-essential amino acids (GIBCO) and 2 mM L-glutamine (GIBCO)). All dilutions were tested in quadruplicate. Just prior to analysis, sample media were replaced with 50 μL per well of phenol red-free RPMI 1640 (GIBCO) containing 1 mg/mL MTT (Sigma). Plates were then re-incubated for an additional three hours. The formazan precipitate was solubilized by adding 150 μL per well of pre-heated (65°C) lysis buffer (50% dimethyl formamide (Fluka), 20% sodium dodecyl sulfate (Sigma)). Total absorbance (i.e., the result of subtracting background absorbance at 690 nm from absorbance at 560 nm) was determined for each well using a Vmax microplate reader (Molecular Devices Corp., Menlo Park, CA). Data were analyzed using Excel V3.0 (Microsoft) and plotted with CA-Cricket Graph V1.3.2 (Computer Associates, San Diego, CA).

II. Results and Discussion

Differentiation of mammary epithelial cells involves two sometimes independent processes: the production of lipids and milk proteins (i.e., "functional" differentiation), and cessation of cell growth (i.e., "terminal" differentiation) (Stampfer, M.R., and Yaswen, P., CRC Press, above). Both can be regulated by a myriad of factors including steroid hormones, growth factors, and matrix components (Topper, Y.J., and Freeman, C.S., Physiol. Rev., above). Native rat 44 kDa NDF, an EGF-related protein whose receptor is a member of the EGF receptor family, induces both terminal and functional differentiated phenotypes in AU-565, a human breast carcinoma cell line (Peles, E. et al., Cell, 1992, above).

In these experiments, the effects of rHuNDF-α2 on the in vitro proliferation of two mammary cell lines, BT-474 and MDA-MB-468, were studied. Both lines were treated with various concentrations of rHuNDF-α2 for periods of five, six, and seven days prior to analysis using the modified MTT assay described above in Materials and Methods (Section I of this Example). Bar graphs depicting absorbance measurements (i.e., optical density) are presented in Figures 46 and 47. Since changes in absorbance are linearly proportional to cell number within a predetermined range, these data indicate that rHuNDF-α2 stimulates BT-474 proliferation, yet retards the growth of MDA-MB-468 cells in serum-containing media. Responses of each cell line were clearly both dose- and time-dependent. Increases in BT-474 growth of nearly 40% occurred by day 7 at doses greater than 25 ng/mL. Likewise, the most pronounced effects on MDA-MB-468 cells (i.e., > 80% inhibition) were also on day 7 at doses greater than 25 ng/mL. Thus, as was the case for EGF, it appears that the effects of rHuNDF-α2 on proliferation are pleiotropic; some cells are stimulated to grow while a terminally differentiated (nonproliferating) phenotype is induced in others.
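The absorbance calculation behind these percentages can be sketched as follows. This is a minimal illustration, assuming that growth stimulation or inhibition is expressed as the percent change in mean background-corrected absorbance (A560 minus A690) of treated wells relative to untreated control wells; the function names and the well readings are hypothetical and are not data from this Example.

```python
from statistics import mean


def net_absorbance(a560, a690):
    """Background-corrected absorbance: A560 minus A690."""
    return a560 - a690


def percent_change(treated, control):
    """Percent change in mean net absorbance of treated vs. control wells.

    Each argument is a list of (A560, A690) pairs, one pair per replicate well.
    Positive values indicate stimulation, negative values inhibition.
    """
    treated_mean = mean(net_absorbance(a, b) for a, b in treated)
    control_mean = mean(net_absorbance(a, b) for a, b in control)
    if control_mean <= 0:
        raise ValueError("control mean net absorbance must be positive")
    return 100.0 * (treated_mean - control_mean) / control_mean


# Made-up quadruplicate readings, for illustration only.
control_wells = [(0.55, 0.05), (0.56, 0.06), (0.54, 0.04), (0.55, 0.05)]
treated_wells = [(0.75, 0.05), (0.76, 0.06), (0.74, 0.04), (0.75, 0.05)]
print(f"{percent_change(treated_wells, control_wells):+.1f}% vs. control")
```

Such a comparison is meaningful only within the range in which absorbance is linearly proportional to cell number, as noted above.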

EXAMPLE 18

In Vitro Effects of Recombinant NDF-α2 on Colon

Epithelial Cells

To test the effect of NDF on mucosal epithelial cells, E. coli-expressed rat NDF-α2_14-241 was used in a crypt colony formation assay. Results show that this isoform stimulated the attachment of colon crypts on plates coated with rat collagen type IV.

Mouse colon crypts were prepared as described by Whitehead et al. (In Vitro Cellular & Developmental Biology, Vol. 23, Number 6, Pages 436-442, 1987). Mice were sacrificed with a lethal dose of CO₂, and the large intestines were isolated. The large intestine was cut longitudinally, rinsed with buffer A (1× PBS containing 0.3 mg/ml of L-glutamine, 100 units/ml of penicillin, 100 units/ml of streptomycin), and sliced into 0.5 cm pieces. These pieces were washed again several times with buffer A in a 50 ml conical tube. Clean tissue was washed with extraction buffer (0.5 mM DTT, 2 mM EDTA in buffer A) and incubated with 10 ml of extraction buffer for one hour. The extraction buffer was then aspirated, and the tissue was washed with Solution A.

Crypts were harvested by shaking the tissue in 5 ml of Solution A. Crypts were transferred to a 15-ml tube.

The crypts were distributed in collagen type IV coated 6-well plates (Collaborative Biomedical, Cambridge, MA) at a density of five hundred crypts per well. The medium (RPMI 1640, 0.3 mg/ml L-glutamine, 100 units/ml penicillin, 100 units/ml streptomycin) contained either 1% or 10% fetal bovine serum (FBS). After twenty-four hours of incubation at 37°C, colonies of attached cells were stained with Gram Crystal Violet and counted. Formation of colonies was FBS-dependent, with 3- to 5-fold more colonies in medium containing 10% FBS than in medium containing 1% FBS. The increase in colony formation was not due to a general increase of protein in the medium, since the same effect was not observed when BSA was substituted for FBS.

To evaluate the effect of known protein factors in the assay, IL-1, IL-2, IL-3, IL-6, IL-8, PDGF, SCF, EGF, TGF-α, bFGF, aFGF, KGF, G-CSF, and NDF were tested in comparison at 10 ng/ml in medium containing 1% FBS. Among these factors, only NDF gave results similar to the stimulation obtained with 10% FBS in this assay. The assay was then repeated with various concentrations of NDF. Although the result shown in Figure 48 is from one experiment, similar results were obtained in two other experiments: stimulation started at a concentration of 2 ng/ml, and maximum stimulation was achieved at 100 ng/ml. These results are consistent with the hypothesis that NDF stimulates the attachment of colonic crypts through binding to its receptor, and imply that NDF may play a role in the regulation of colon epithelial cell growth and development.

EXAMPLE 19

Use of Recombinant (Rat) NDF in Treatment of Human Colon Cancer by Modulating CEA Expression in Carcinoma Cells

I. Background

Colorectal carcinoma is among the most common malignancies in Western societies. The carcinogenic target is the colonic epithelium, a highly proliferative tissue that undergoes constant growth, renewal and differentiation. Tumors are thought to arise from undifferentiated progenitor cells which harbor mutations that disrupt normal growth control mechanisms. A tumor begins as a small benign growth, or adenoma, which later acquires more aggressive growth characteristics and eventually metastasizes. Current therapies include surgical resection and stringent chemotherapy in a combined modality prior to metastasis. Unfortunately, tumors that are not cured by surgery respond poorly to follow-up chemotherapy.

The HER-2/neu proto-oncogene encodes a type I transmembrane protein with intrinsic tyrosine kinase activity, which is generally believed to be a growth factor receptor that transmits mitogenic signals. HER-2/neu is expressed in a variety of tissues, including those enriched in epithelial surfaces such as breast, lung and the intestine. The HER-2 gene has previously been shown to be amplified in breast cancer, suggesting a role in tumorigenesis of epithelial cell types (Slamon, D. et al., Science 244:707-713, 1989).

Identification of a ligand for the HER-2/neu receptor, Neu Differentiation Factor (NDF), has made it possible to analyze the biological effects of the receptor-ligand interactions in HER-2/neu expressing cells (Peles et al., Cell, 1992, above). Preliminary evidence suggests that NDF induces maturation of breast carcinoma cell lines in vitro. This suggests utility of recombinant rat NDF in inducing maturation of carcinoma cells which express functional HER-2 receptors.

II. Observations

Previous work has shown that the HER-2 receptor is expressed on the colorectal carcinoma cell line LIM 1215. To determine the effects of recombinant rat NDF on colonocyte growth and differentiation, NDF-treated LIM 1215 cells were analyzed using antibodies to carcinoembryonic antigen (CEA). CEA is a highly glycosylated membrane protein that functions as an intercellular adhesion molecule (Benchimol, S. et al., Cell 57:327-334, 1989). CEA is expressed at elevated levels in fully differentiated cells of the large intestine, suggesting its normal function is restricted to mature cell types. Modulation of CEA expression in undifferentiated colon carcinoma cell lines would indicate the ability to induce differentiation in those cell types. LIM 1215 cells were untreated, or treated with NDF, then stained with antibodies to the extracellular domain of CEA. Recombinant rat NDF induces expression of CEA in LIM 1215 cells. This strongly suggests that recombinant rat NDF modulates the differentiation state of HER-2 positive colorectal carcinoma cells in vitro, and presumably in vivo.

III. Materials and Methods

The LIM 1215 colorectal carcinoma cell line was provided by Robert Whitehead (Ludwig Institute for Cancer Research, Melbourne) and maintained as adherent monolayer cultures in RPMI 1640 growth media containing 5% FBS, 1 μg/ml insulin, 10 μg/ml hydrocortisone and 10 μM α-thioglycerol as previously described (Whitehead, R.H. et al., J. Natl. Cancer Inst. 74:759-765, 1985). Exponentially growing cells were cultured in the presence or absence of 50 ng/ml recombinant rat NDF (CHO-derived NDF-α2) for seventy-two hours in growth media containing 2% FBS. The cell cultures were washed with PBS, removed from the dish by scraping with a rubber policeman, then pelleted by centrifugation. The cell pellet was surrounded by O.C.T. embedding compound (#4583, Miles, Inc.) and flash-frozen in liquid nitrogen. The embedded frozen pellets were sectioned using a cryostat fitted with a glass knife, and the frozen sections were placed on glass slides. The sections were fixed in absolute acetone at -20°C for six minutes, then air dried. The fixed sections were rehydrated in PBS containing 0.3 μg/ml BSA (antibody diluent), then incubated with 1 μg/ml mouse monoclonal antibody to human CEA (MAB 425, Chemicon International, Inc.) in antibody diluent for one hour at 23°C. The slides were washed three times with antibody diluent, then incubated with 1 μg/ml FITC-conjugated sheep anti-mouse IgG antibody (Amersham, Inc.) for approximately thirty minutes at room temperature. The slides were washed, and the area containing the stained section was covered with PBS solution containing 4% n-propyl gallate (Sigma Chemical Co., # P-3130) and 90% glycerol to minimize fluorochrome bleaching. The slides were analyzed under a fluorescence microscope at UV 320 nm and photographed using Kodak Gold ASA 100 color film (Figure 49).

IV. Results and Discussion

In untreated LIM 1215 cells, basal level expression of CEA is detected in less than 5-10% of the cells. Treatment of these same cells with recombinant rat NDF induced a high level of CEA expression in 80-90% of the cells analyzed. The induced CEA molecule appears to be localized to the cell surface, suggesting that NDF treatment results in a functional increase in CEA activity.

CEA is detected at significant levels only at the apical portion of the colonic crypt (Benchimol, S. et al., Cell, 1989, above). Intestinal epithelial cells are known to differentiate as they migrate from the basal to apical crypt compartments (Cheng, H., and Leblond, C.P., Am. J. Anat. 141:537-562, 1974). The in vivo pattern of CEA expression strongly suggests that in normal colon it is present only in fully differentiated or mature epithelium. Since colon carcinoma cells are generally believed to arise from undifferentiated basal crypt progenitor or stem cells, the resulting cell lines isolated from tumors are likely to be clonal, transformed crypt progenitor cells. Modulation of surface CEA protein by exogenous recombinant rat NDF indicates that NDF is capable of regulating cell growth and differentiation control in HER-2 expressing colon carcinoma in vitro. NDF may therefore modulate CEA expression in normal and transformed colonic epithelial cells in vivo. This may prove to be useful in the treatment of colon carcinoma by regulating carcinoma cell-cell and cell-matrix interactions, thereby decreasing the mortality and morbidity associated with metastatic colon carcinoma.

EXAMPLE 20

Use of Recombinant Rat NDF in Treatment of Metastatic Colon Carcinoma by Modulating Expression of TIMP-2

I. Background

Tumor metastasis is a complex, multi-step process that begins when individual tumor cells leave the primary site of transformation by degrading their resident extracellular matrix. Matrix degradation is mediated by several classes of proteolytic enzymes, including members of the Zn²⁺-dependent metalloproteinases, such as type IV collagenase and stromelysin (transin) (Liotta, L.A., et al., Cancer Research 51:5054s-5059s, 1991). The type IV collagenase and stromelysin genes are known to be induced following initiation of the growth response, and to be constitutively secreted by proliferating cancer cells (Angel, P. et al., Mol. Cell. Biol. 7:2256-2266, 1987). Since deregulation of matrix protease expression can enhance metastatic potential, an obvious anti-cancer therapy would be to block protease activity secreted by cancer cells. Naturally occurring inhibitors of metalloproteinases exist which regulate protease activities within tissues and body fluids. These include a family of proteins known as tissue inhibitors of metalloproteinase, or TIMPs. A novel member of this family, known as TIMP-2, has recently been isolated (Boone, T.C. et al., Proc. Natl. Acad. Sci. 87:2800-2804, 1990), and has been shown to inhibit both invasion and metastasis of tumor cells in an animal model system.

II. Observations

Previously, it has been shown that the HER-2 receptor is expressed by the metastatic ileocecal carcinoma cell line LIM 1863. LIM 1863 cells were treated with recombinant rat NDF to determine its effects on TIMP-2 protein expression, and NDF treatment was found to dramatically increase TIMP-2 expression and secretion. Since NDF is believed to play a role in differentiation of epithelial cell types, NDF-induced expression of TIMP-2 may have a role in inducing differentiation of LIM 1863 cells. Because TIMP-2 can inhibit metalloproteinases involved in tumor metastasis (DeClerck, Y.A. et al., Cancer Research 52:701-703, 1991), NDF treatment may alter the metastatic properties of LIM 1863 cells in vitro, and of other human colon cancers in vivo.

III. Materials and Methods

The LIM 1863 ileocecal carcinoma cell line (Whitehead, R.H. et al., Cancer Research 47:2704-2713, 1987) was provided by Robert Whitehead (Ludwig Institute for Cancer Research, Melbourne) and maintained in suspension cultures in RPMI 1640 growth media containing 5% FBS, 1 μg/ml insulin, 10 μg/ml hydrocortisone and 10 μM α-thioglycerol. Confluent cultures were subcultured 1:5, then incubated for seventy-two hours. The culture was then harvested and the cells pelleted at 1 × g for five minutes. The supernatant was discarded, and the cell pellet was washed with ~50 ml of serum-free media (RPMI 1640 containing 0.03% BSA, 1 μg/ml insulin, 10 μg/ml hydrocortisone and 10 μM α-thioglycerol). The cells were re-pelleted, then resuspended in 500 ml of serum-free assay buffer. The cells were incubated for forty-eight hours in a 500 ml spinner culture flask at 50 rpm. The medium was removed at this time and replaced with 500 ml of fresh serum-free media, and the cells were allowed to incubate overnight. The cells were then harvested, counted, diluted to a concentration of about six hundred organoids per ml, then cultured in the presence or absence of 50 ng/ml of recombinant rat NDF (CHO-derived NDF-α2) for one hundred and twenty hours. After treatment, the cells were harvested and collected by centrifugation.

The cell pellet was surrounded by optimal cutting temperature (OCT) embedding compound (#4583, Miles, Inc.) and flash-frozen in liquid nitrogen. The embedded frozen pellets were sectioned using a cryostat fitted with a glass knife, and the frozen sections were placed on glass slides. The sections were fixed in absolute acetone at -20°C for six minutes, then air dried. The fixed sections were rehydrated in PBS containing 0.3 mg/ml BSA (antibody diluent), then incubated with a 1:300 dilution of rabbit polyclonal anti-TIMP-2 in antibody diluent for one hour at room temperature. The TIMP-2 antiserum, provided by Helen Hockmen (Amgen, Inc.), was raised against bacterially expressed TIMP-2 protein (Boone, T.C. et al., Proc. Natl. Acad. Sci., 1990, above). The slides were washed three times with antibody diluent, then incubated with ~1 μg/ml phycoerythrin-conjugated goat anti-rabbit IgG antibody (Cappel, Inc.) for approximately thirty minutes at 23°C. The slides were washed, and the area containing the stained section was covered with PBS solution containing 4% n-propyl gallate and 90% glycerol to minimize fluorochrome bleaching. The slides were analyzed under a fluorescence microscope at UV 320 nm and photographed using Kodak Gold ASA 100 color film.

IV. Results and Discussion

The LIM 1863 cell line spontaneously forms organoids in suspension consisting of a polarized layer of epithelial cells surrounding a central lumen (Whitehead, R.H. et al., Cancer Research, 1987, above). In untreated LIM 1863 cells, a basal level of TIMP-2 expression is detected both as cell-associated (cytoplasmic) protein and as secreted protein contained within the central lumen (Figure 50, panel A). In treated cells, there is a dramatic increase in the level of TIMP-2 protein, suggesting that NDF induces expression of the TIMP-2 gene (Figure 50, panel B). The induced TIMP-2 protein accumulated in the cytoplasmic and luminal compartments. In addition, there were significant amounts of TIMP-2 protein located at the basolateral edges of the organoid clumps and arranged in fiber-like structures (Figure 50, panel B). This result suggests that some of the TIMP-2 protein produced by LIM 1863 cells in response to NDF treatment is secreted and is therefore capable of interacting with tissue metalloproteinases and modulating their activity.

Regulation of TIMP-2 expression in cells expressing HER-2 receptors implies that NDF treatment would result in the downstream modulation of metalloproteinase activity in various compartments of the body. This may be useful in regulating tissue remodeling and repair in injured organs. More significantly, regulating the expression of TIMP-2 in HER-2 receptor positive colon cancer cells by NDF treatment may inhibit their ability to metastasize and subsequently invade neighboring tissues. This effect may be difficult to mimic by the addition of even large amounts of recombinant TIMP-2 protein and may decrease the mortality and morbidity associated with metastatic colon carcinoma.

EXAMPLE 21

NDF-Stimulated Proliferation of Epithelium in an In Vivo

Wound Healing Model

In this Example, a modified rabbit ear partial thickness dermal wound model was used. The rabbit ear dermal ulcer model described by Mustoe, T., et al. (J. Clin. Invest. 87:694-703, 1991) was modified to produce a wound through the cartilage, to the dermis on the back side of the ear, using a 6 mm trephine. As a result, the wound heals by sprouting of epithelial elements beneath the cartilage and reepithelialization from wound borders. Contraction is not a variable during healing, permitting accurate quantitation of new tissues. Recombinant human met-NDF-α2_14-241 from E. coli was utilized as the therapeutic wound healing agent.

A. Materials and Methods

Recombinant NDF-α2 or phosphate buffered saline vehicle alone was applied once on the day of surgery, and the wounds (0.25 cm²) were covered with Tegaderm occlusive dressing (3M Company, St. Paul, MN). Prior to sacrifice five days later, each animal received intravenous injections of BrdU (Aldrich Chemical Company, Milwaukee, WI) in an amount of 50 mg per kg of body weight. Administration of BrdU was employed to better quantify the degree of basal keratinocyte proliferation and the kinetics of basal cell migration toward the stratum spinosum and stratum corneum. After sacrifice, each wound was bisected, with one section being frozen in an OCT medium (Miles Inc., Elkhart, IN), and the second section being fixed in Omnifix (An-Con Genetics, Inc., Melville, NY) and processed according to routine histological methods. Masson Trichrome, Oil Red O, and immunohistochemical (IHC) stains were performed on 3 μm-thick sections for each wound.

B. Measurement of Reepithelialization

Measurements of the total wound gap and epithelial gap were made for each bisected wound using a calibrated stage micrometer. The amount of reepithelialization for each wound was calculated by taking the difference of the total gap and the epithelial gap. Data from two sections per wound were averaged. Differences in the amount of reepithelialization were analyzed by a one-tailed, unpaired Student's t-test.
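The calculation described above can be illustrated with a minimal Python sketch. The gap values below are hypothetical placeholders, not data from this Example, and the function names are assumptions introduced only for illustration. The sketch computes reepithelialization as the total gap minus the epithelial gap, averages the two sections per wound, and forms the pooled-variance (Student's) t statistic for two unpaired groups. Obtaining a one-tailed p-value would additionally require the t distribution, which is not computed here.

```python
from math import sqrt
from statistics import mean, variance


def reepithelialization(total_gap_mm, epithelial_gap_mm):
    """Reepithelialization = total wound gap minus remaining epithelial gap."""
    return total_gap_mm - epithelial_gap_mm


def wound_value(sections):
    """Average reepithelialization over the sections measured for one wound."""
    return mean(reepithelialization(t, e) for t, e in sections)


def unpaired_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# HYPOTHETICAL measurements: (total gap, epithelial gap) in mm,
# two sections per wound, three wounds per group.
treated_wounds = [
    [(6.0, 2.0), (6.0, 2.4)],
    [(6.0, 1.8), (6.0, 2.2)],
    [(6.0, 2.6), (6.0, 2.2)],
]
control_wounds = [
    [(6.0, 3.0), (6.0, 3.4)],
    [(6.0, 3.2), (6.0, 3.6)],
    [(6.0, 2.8), (6.0, 3.2)],
]

treated = [wound_value(w) for w in treated_wounds]
control = [wound_value(w) for w in control_wounds]
t_stat, dof = unpaired_t(treated, control)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

For a one-tailed test of the hypothesis that treatment increases reepithelialization, only a positive t statistic (treated mean above control mean) counts as evidence in the tested direction.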

The area of epithelium produced in treated and untreated wounds was calculated from cross-sectional bisection for each dose group. A one-way ANOVA and Dunnett's t-test were run to compare each dose against the control group.

C. Measurement of Proliferation and Differentiation of Epithelial Cells

Paraffin-embedded 3 μm sections of tissue were stained using anti-BrdU (Dako Corp., Carpinteria, CA), Avidin-Biotin Complex (Vector Laboratories, Inc., Burlingame, CA), and diaminobenzidine substrate (DAB; Sigma Chemical Co., St. Louis, MO). Sections were digested with 0.1% protease solution, followed by treatment with 2N HCl. Endogenous peroxidase was quenched by exposure to 3% hydrogen peroxide solution. Slides were blocked with a 10% solution of normal horse serum in phosphate buffered saline (PBS), then incubated with anti-BrdU diluted 1:400 in 1% bovine serum albumin. Following washing, sections were incubated with a secondary antibody, and subsequently incubated for twenty minutes with peroxidase-linked Avidin-Biotin Complex diluted 1:100 in 1% BSA. Slides were then exposed to DAB substrate (10 mg of DAB, 20 ml of PBS, 20 μl of 30% hydrogen peroxide) for ten minutes. Sections were counterstained with hematoxylin.
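As an arithmetic aid, the following Python sketch works out the approximate final concentrations implied by the DAB substrate recipe above, and the stock volume needed for a stated dilution. The function names are hypothetical, and the sketch assumes that "1:400" means one part stock in 400 parts total volume; the text does not specify which dilution convention was used.

```python
def dab_working_solution(dab_mg=10.0, pbs_ml=20.0,
                         h2o2_stock_ul=20.0, h2o2_stock_pct=30.0):
    """Approximate final DAB and H2O2 concentrations for the stated recipe."""
    h2o2_ml = h2o2_stock_ul / 1000.0
    total_ml = pbs_ml + h2o2_ml
    return {
        "dab_mg_per_ml": dab_mg / total_ml,
        "h2o2_pct": h2o2_stock_pct * h2o2_ml / total_ml,
    }


def stock_volume_ul(final_volume_ul, dilution_factor):
    """Stock volume for a 1:dilution_factor dilution.

    ASSUMPTION: one part stock in dilution_factor parts total volume.
    """
    return final_volume_ul / dilution_factor


solution = dab_working_solution()
print(f"DAB: {solution['dab_mg_per_ml']:.2f} mg/ml")
print(f"H2O2: {solution['h2o2_pct']:.3f} %")
print(f"anti-BrdU stock for 400 ul at 1:400: {stock_volume_ul(400, 400)} ul")
```

Under these assumptions the recipe corresponds to roughly 0.5 mg/ml DAB and roughly 0.03% hydrogen peroxide.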

D. Results and Discussion

An increase in new epithelium covering the wounds was observed for each concentration of recombinant NDF-α2 tested, in comparison with the control, indicating NDF-accelerated epithelial coverage of the wounds (Figure 51). Increased epithelial area was also observed for all concentrations of recombinant NDF-α2 tested, indicating that NDF increased epithelial layer thickness and enhanced the integrity of epithelial coverage of the wounds (Figure 52). At doses of 5 μg per wound, treatment with NDF-α2 resulted in an increase in the number and percentage of proliferating (BrdU positive) basal and suprabasal keratinocytes (Figure 53). These results indicate that NDF can directly stimulate basal and suprabasal keratinocytes to increase epithelial area and coverage of the treated wounds. The results also show that suprabasal cells, which are partially differentiated in nature, respond to NDF, suggesting a role for NDF in the acceleration of differentiation in such cells.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Amgen Inc.

(ii) TITLE OF INVENTION: RECOMBINANT NEU DIFFERENTIATION FACTORS

(iii) NUMBER OF SEQUENCES: 127

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Amgen Inc.

(B) STREET: Amgen Center

1840 Dehavilland Drive

(C) CITY: Thousand Oaks

(D) STATE: California

(E) COUNTRY: USA

(F) ZIP: 91320-1789

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy Disk

(B) COMPUTER: IBM PC Compatible

(C) OPERATING SYSTEM: MS-DOS

(D) SOFTWARE: Macintosh Microsoft Word Version 5.1a/Text only

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/066,384

(B) FILING DATE: 21-MAY-1993

(C) CLASSIFICATION: 435

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 07/877,431

(B) FILING DATE: 29-APR-1992

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys

1 5 10 15

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys

20 25 30

Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys

35 40

(3) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys

1 5 10 15

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys

20 25 30

Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys

35 40
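SEQ ID NO:1 and SEQ ID NO:2 are both 40 residues long and share most of their sequence. The following Python sketch, offered only as a reading aid, lists the positions at which the two listed sequences differ. The residues are transcribed from the listing above; the variable and function names are hypothetical.

```python
SEQ_ID_1 = (
    "Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys "
    "Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys "
    "Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys"
).split()

SEQ_ID_2 = (
    "Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys "
    "Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys "
    "Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys"
).split()


def differences(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch (1-based)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


for pos, a, b in differences(SEQ_ID_1, SEQ_ID_2):
    print(f"position {pos}: {a} (SEQ ID NO:1) vs {b} (SEQ ID NO:2)")
```

As transcribed, the two sequences differ only at positions 32, 33, 34 and 38, and the six cysteine residues occupy the same positions in both.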

(4) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1063 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

G AATTCCGGGG GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC 44

Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro

1 5 10

TCG AGA TAC TTG TGC AAG TGC CAA CCT GGA TTC ACT GGA GCA AGA 89 Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg

15 20 25

TGT ACT GAG AAT GTG CCC ATG AAA GTC CAA AAC CAA GAA AAG CAT 134 Cys Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys His

30 35 40

CTT GGG ATT GAA TTT ATT GAG GCG GAG GAG CTG TAC CAG AAG AGA 179 Leu Gly Ile Glu Phe Ile Glu Ala Glu Glu Leu Tyr Gln Lys Arg

45 50 55

GTG CTG ACC ATA ACC GGC ATC TGC ATC GCC CTC CTT GTG GTC GGC 224 Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly

60 65 70

ATC ATG TGT GTG GTG GCC TAC TGC AAA ACC AAG AAA CAG CGG AAA 269 Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys

75 80 85

AAG CTG CAT GAC CGT CTT CGG CAG AGC CTT CGG TCT GAA CGA AAC 314 Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn

90 95 100

AAT ATG ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC 359 Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro

105 110 115

CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC GTC 404 Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val

120 125 130

ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG ACA TCC TTT 449 Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe

135 140 145

TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT CAC TCC ACT ACT GTC 494 Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val

150 155 160

ACC CAG ACT CCT AGC CAC AGC TGG AGC AAC GGA CAC ACT GAA AGC 539 Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser

165 170 175

ATC CTT TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA 584 Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu

180 185 190

AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT 629 Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu

195 200 205

AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG CAT 674 Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His

210 215 220

GCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT CAT AGT GAA 719 Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu

225 230 235

AGG TAT GTG TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT GTA 764 Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val

240 245 250

GAT TTC CAC ACG CCA AGC TCC CCC AAA TCG CCC CCT TCG GAA ATG 809 Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met

255 260 265

TCT CCA CCC GTG TCC AGC ATG ACG GTG TCC ATG CCT TCC ATG GCG 854 Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met Ala

270 275 280

GTC AGC CCC TTC ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG ACA 899 Val Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr

285 290 295

CCA CCA AGG CTG CGG GAG AAG AAG TTT GAC CAT CAC CCT CAG CAG 944 Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gln Gln

300 305 310

TTC AGC TCC TTC CAC CAC AAC CCC GCG CAT GAC AGT AAC AGC CTC 989 Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn Ser Leu

315 320 325

CCT GCT AGC CCC TTG AGG ATA GTG GAG GAT GAG GAG TAT GAA ACG 1034 Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr

330 335 340

ACC CAA GAG TAC GAG CCA GCC CGGAATTC 1063

Thr Gln Glu Tyr Glu Pro Ala

345
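The number printed at the right of each nucleotide row in the listing is the cumulative count of bases through the end of that row. A minimal Python sketch of that bookkeeping, using the first three rows of SEQ ID NO:3 as transcribed above, is given below; the function name is hypothetical and the check is only a reading aid.

```python
def running_total_mismatches(rows):
    """Return (row_number, computed_total, printed_total) where totals disagree.

    Each row is (bases_with_spaces, printed_cumulative_count).
    """
    total = 0
    mismatches = []
    for number, (bases, printed) in enumerate(rows, start=1):
        # Whitespace only groups the bases; it does not count toward length.
        total += sum(len(group) for group in bases.split())
        if total != printed:
            mismatches.append((number, total, printed))
    return mismatches


# First three rows of SEQ ID NO:3, transcribed from the listing.
SEQ_ID_3_ROWS = [
    ("G AATTCCGGGG GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC", 44),
    ("TCG AGA TAC TTG TGC AAG TGC CAA CCT GGA TTC ACT GGA GCA AGA", 89),
    ("TGT ACT GAG AAT GTG CCC ATG AAA GTC CAA AAC CAA GAA AAG CAT", 134),
]

print(running_total_mismatches(SEQ_ID_3_ROWS))  # an empty list means the counts agree
```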

(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 348 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro

1 5 10

Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg

15 20 25

Cys Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys His

30 35 40

Leu Gly Ile Glu Phe Ile Glu Ala Glu Glu Leu Tyr Gln Lys Arg

45 50 55

Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly

60 65 70

Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys

75 80 85

Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn

90 95 100

Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro

105 110 115

Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val

120 125 130

Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe

135 140 145

Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val

150 155 160

Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser

165 170 175

Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu

180 185 190

Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu

195 200 205

Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His

210 215 220

Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu

225 230 235

Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val

240 245 250

Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met

255 260 265

Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met Ala

270 275 280

Val Ser Pro Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr

285 290 295

Pro Pro Arg Leu Arg Glu Lys Lys Phe Asp His His Pro Gln Gln

300 305 310

Phe Ser Ser Phe His His Asn Pro Ala His Asp Ser Asn Ser Leu

315 320 325

Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr

330 335 340

Thr Gln Glu Tyr Glu Pro Ala

345

(6) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2335 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GCGGCCGCTC GCTCTCCCCA TCGAGGGACA AACTTTTCCC AAACCCGATC 50

CGAGCCCTTG GACCAAACTC GCCTGCGCCG AGAGCCGTCC GCGTAGAGCG 100

CTCCGTCTCC GGCGAG ATG TCC GAG CGC AAA GAA GGC AGA GGC AAA 146

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys

1 5 10

GGG AAG GGC AAG AAG AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG 191 Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu

15 20 25

TCC GCG GCG GGC AGC CAG AGC CCA GCC TTG CCT CCC CGA TTG AAA 236 Ser Ala Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Arg Leu Lys

30 35 40

GAG ATG AAA AGC CAG GAA TCG GCT GCA GGT TCC AAA CTA GTC CTT 281 Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu Val Leu

45 50 55

CGG TGT GAA ACC AGT TCT GAA TAC TCC TCT CTC AGA TTC AAG TGG 326 Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp

60 65 70

TTC AAG AAT GGG AAT GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT 371 Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn

75 80 85

ATC AAG ATA CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC ATT AAC 416 Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn

90 95 100

AAA GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC 461 Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile

105 110 115

AGC AAA TTA GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG 506 Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val

120 125 130

GAA TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA GGA 551 Glu Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu Gly

135 140 145

GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA 596 Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

150 155 160

GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG 641 Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly

165 170 175

ACA AGC CAT CTT GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT 686 Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys

180 185 190

GTG AAT GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC 731 Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro

195 200 205

TCG AGA TAC TTG TGC AAG TGC CAA CCT GGA TTC ACT GGA GCA AGA 776 Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg

210 215 220

TGT ACT GAG AAT GTG CCC ATG AAA GTC CAA AAC CAA GAA AAG GCG 821 Cys Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys Ala

225 230 235

GAG GAG CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC 866 Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys

240 245 250

ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC TGC 911 Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys

255 260 265

AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC CGT CTT CGG CAG 956 Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln

270 275 280

AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG AAC ATT GCC AAT GGG 1001 Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly

285 290 295

CCT CAC CAT CCT AAC CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT 1046 Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn

300 305 310

CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG 1091 Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu

315 320 325

AGA GAA GCA GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA 1136 Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr

330 335 340

GCC CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG 1181 Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp

345 350 355

AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC CAC TCT GTA 1226 Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val

360 365 370

ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG CAC AGC AGC CCA ACT 1271 Ile Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr

375 380 385

GGG GGC CCA AGA GGA CGT CTT AAT GGC ACA GGA GGC CCT CGT GAA 1316 Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu

390 395 400

TGT AAC AGC TTC CTC AGG CAT GCC AGA GAA ACC CCT GAT TCC TAC 1361 Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr

405 410 415

CGA GAC TCT CCT CAT AGT GAA AGA CAT AAC CTT ATA GCT GAG CTA 1406 Arg Asp Ser Pro His Ser Glu Arg His Asn Leu Ile Ala Glu Leu

420 425 430

AGG AGA AAC AAG GCA CAC AGA TCC AAA TGC ATG CAG ATC CAG CTA 1451 Arg Arg Asn Lys Ala His Arg Ser Lys Cys Met Gln Ile Gln Leu

435 440 445

TCA GCA ACT CAT CTT AGA TCT TCT TCC ATT CCC CAT TTG GGC TTC 1496 Ser Ala Thr His Leu Arg Ser Ser Ser Ile Pro His Leu Gly Phe

450 455 460

ATT CTC TAA GACCCCTTGG CCTTTAGGAA GGTATGTGTC AGCCATGACC 1545 Ile Leu

ACCCCGGCTC GTATGTCACC TGTAGATTTC CACACGCCAA GCTCCCCCAA 1595

ATCGCCCCCT TCGGAAATGT CTCCACCCGT GTCCAGCATG ACGGTGTCCA 1645

TGCCTTCCAT GGCGGTCAGC CCCTTCATGG AAGAAGAGAG ACCTCTACTT 1695

CTCGTGACAC CACCAAGGCT GCGGGAGAAG AAGTTTGACC ATCACCCTCA 1745

GCAGTTCAGC TCCTTCCACC ACAACCCCGC GCATGACAGT AACAGCCTCC 1795

CTGCTAGCCC CTTGAGGATA GTGGAGGATG AGGAGTATGA AACGACCCAA 1845

GAGCACGAGC CAGCCCAAGA GCCTGTTAAG AAACTCGCCA ATAGCCGGCG 1895

GGCCAAAAGA ACCAAGCCCA ATGGCCACAT TGCTAACAGA TTGGAAGTGG 1945

ACAGCAACAC AAGCTCCCAG AGCAGTAACT CAGAGAGTGA AACAGAAGAT 1995

GAAAGAGTAG GTGAAGATAC GCCTTTCCTG GGCATACAGA ACCCCCTGGC 2045

AGCCAGTCTT GAGGCAACAC CTGCCTTCCG CCTGGCTGAC AGCAGGACTA 2095

ACCCAGCAGG CCGCTTCTCG ACACAGGAAG AAATCCAGGC CAGGCTGTCT 2145

AGTGTAATTG CTAACCAAGA CCCTATTGCT GTATAAAACC TAAATAAACA 2195

CATAGATTCA CCTGTAAAAC TTTATTTTAT ATAATAAAGT ATTCCACCTT 2245

AAATTAAACA ATTTATTTTA TTTTAGCAGT TCTGCAAAAA AAAAAAAAAA 2295

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AGGGCGGCCG 2335

(7) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 462 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys

1 5 10

Gly Lys Gly Lys Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu

15 20 25

Ser Ala Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro Arg Leu Lys

30 35 40

Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu Val Leu

45 50 55

Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp

60 65 70

Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn

75 80 85

Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn

90 95 100

Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile

105 110 115

Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val

120 125 130

Glu Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu Gly

135 140 145

Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly

165 170 175

Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys

180 185 190

Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro

195 200 205

Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg

210 215 220

Cys Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys Ala

225 230 235

Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys

240 245 250

Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys

255 260 265

Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln

270 275 280

Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly

285 290 295

Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn

300 305 310

Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu

315 320 325

Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr

330 335 340

Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp

345 350 355

Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val

360 365 370

Ile Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr

375 380 385

Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu

390 395 400

Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr

405 410 415

Arg Asp Ser Pro His Ser Glu Arg His Asn Leu Ile Ala Glu Leu

420 425 430

Arg Arg Asn Lys Ala His Arg Ser Lys Cys Met Gln Ile Gln Leu

435 440 445

Ser Ala Thr His Leu Arg Ser Ser Ser Ile Pro His Leu Gly Phe

450 455 460

Ile Leu

(8) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1807 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GCGGCCGCTC GCTCTCCCCA TCGAGGGACA AACTTTTCCC AAACCCGATC 50

CGAGCCCTTG GACCAAACTC GCCTGCGCCG AGAGCCGTCC GCGTAGAGCG 100

CTCCGTCTCC GGCGAG 116

ATG TCC GAG CGC AAA GAA GGC AGA GGC AAA GGG AAG GGC AAG AAG 161 Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys

1 5 10 15

AAG GAG CGA GGC TCC GGC AAG AAG CCG GAG TCC GCG GCG GGC AGC 206 Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser

20 25 30

CAG AGC CCA GCC TTG CCT CCC CGA TTG AAA GAG ATG AAA AGC CAG 251 Gln Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln

35 40 45

GAA TCG GCT GCA GGT TCC AAA CTA GTC CTT CGG TGT GAA ACC AGT 296 Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser

50 55 60

TCT GAA TAC TCC TCT CTC AGA TTC AAG TGG TTC AAG AAT GGG AAT 341 Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn

65 70 75

GAA TTG AAT CGA AAA AAC AAA CCA CAA AAT ATC AAG ATA CAA AAA 386 Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys

80 85 90

AAG CCA GGG AAG TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT 431 Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala

95 100 105

GAT TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA GGA AAT 476 Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn

110 115 120

GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG GAA TCA AAC GAG ATC 521 Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile

125 130 135

ATC ACT GGT ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA 566 Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser

140 145 150

GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT 611 Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr

155 160 165

TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA 656 Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val

170 175 180

AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG 701 Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu

185 190 195

TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG TGC 746 Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys

200 205 210

AAG TGC CAA CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTG 791 Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val

215 220 225

CCC ATG AAA GTC CAA AAC CAA GAA AAG GCG GAG GAG CTG TAC CAG 836 Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr Gln

230 235 240

AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC ATC GCC CTC CTT GTG 881 Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val

245 250 255

GTC GGC ATC ATG TGT GTG GTG GCC TAC TGC AAA ACC AAG AAA CAG 926 Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln

260 265 270

CGG AAA AAG CTG CAT GAC CGT CTT CGG CAG AGC CTT CGG TCT GAA 971 Arg Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu

275 280 285

CGA AAC AAT ATG ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC 1016 Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn

290 295 300

CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA 1061 Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys

305 310 315

AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG ACA 1106 Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr

320 325 330

TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT CAC TCC ACT 1151 Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr

335 340 345

ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG AGC AAC GGA CAC ACT 1196 Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr

350 355 360

GAA AGC ATC CTT TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA TCC 1241 Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser

365 370 375

GTA GAA AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA 1286 Val Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly

380 385 390

CGT CTT AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC 1331 Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu

395 400 405

AGG CAT GCC AGA GAA ACC CCT GAT TCC TAC CGA GAC TCT CCT CAT 1376 Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

410 415 420

AGT GAA AGA CAT AAC CTT ATA GCT GAG CTA AGG AGA AAC AAG GCA 1421 Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala

425 430 435

CAC AGA TCC AAA TGC ATG CAG ATC CAG CTA TCA GCA ACT CAT CTT 1466 His Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His Leu

440 445 450

AGA TCT TCT TCC ATT CCC CAT TTG GGC TTC ATT CTC TAA 1505

Arg Ser Ser Ser Ile Pro His Leu Gly Phe Ile Leu

455 460

GACCCCTTGG CCTTTAGGAA GGTATGTGTC AGCCATGACC ACCCCGGCTC 1555

GTATGTCACC TGTAGATTTC CACACGCCAA GCTCCCCCAA ATCGCCCCCT 1605

TCGGAAATGT CTCCACCCGT GTCCAGCATG ACGGTGTCCA TGCCTTCCAT 1655

GGCGGTCAGC CCCTTCATGG AAGAAGAGAG ACCTCTACTT CTCGTGACAC 1705

CACCAAGGCT GCGGGAGAAG AAGTTTGACC ATCACCCTCA GCAGTTCAGC 1755

TCCTTCCACC ACAACCCCAC GCGCCCACGC GTCCGCGGAC GCGTGGGTCGAC 1807

(9) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 462 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys

1 5 10 15

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser

20 25 30

Gln Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln

35 40 45

Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser

50 55 60

Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn

65 70 75

Glu Leu Asn Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys

80 85 90

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala

95 100 105

Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn

110 115 120

Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile

125 130 135

Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser

140 145 150

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr

155 160 165

Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val

170 175 180

Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu

185 190 195

Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys

200 205 210

Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val

215 220 225

Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr Gln

230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val

245 250 255

Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln

260 265 270

Arg Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu

275 280 285

Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn

290 295 300

Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys

305 310 315

Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr

320 325 330

Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr

335 340 345

Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr

350 355 360

Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser

365 370 375

Val Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly

380 385 390

Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu

395 400 405

Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

410 415 420

Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala

425 430 435

His Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His Leu

440 445 450

Arg Ser Ser Ser Ile Pro His Leu Gly Phe Ile Leu

455 460

(10) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1651 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 20.. 817

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GTCGACCCAC GCGTCCGGC TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA 52

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg

1 5 10

TAC TTG TGC AAG TGC CAA CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG 100 Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu

15 20 25

AAT GTG CCC ATG AAA GTC CAA AAC CAA GAA AAG GCG GAG GAG CTG TAC 148 Asn Val Pro Met Lys Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr

30 35 40

CAG AAG AGA GTG CTG ACC ATA ACC GGC ATC TGC ATC GCC CTC CTT GTG 196 Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val

45 50 55

GTC GGC ATC ATG TGT GTG GTG GCC TAC TGC AAA ACC AAG AAA CAG CGG 244 Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg

60 65 70 75

AAA AAG CTG CAT GAC CGT CTT CGG CAG AGC CTT CGG TCT GAA CGA AAC 292 Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn

80 85 90

AAT ATG ATG AAC ATT GCC AAT GGG CCT CAC CAT CCT AAC CCA CCC CCC 340 Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro

95 100 105

GAG AAT GTC CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC GTC ATC TCC 388 Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser

110 115 120

AGT GAG CAT ATT GTT GAG AGA GAA GCA GAG ACA TCC TTT TCC ACC AGT 436 Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser

125 130 135

CAC TAT ACT TCC ACA GCC CAT CAC TCC ACT ACT GTC ACC CAG ACT CCT 484 His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro

140 145 150 155

AGC CAC AGC TGG AGC AAC GGA CAC ACT GAA AGC ATC CTT TCC GAA AGC 532 Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser

160 165 170

CAC TCT GTA ATC GTG ATG TCA TCC GTA GAA AAC AGT AGG CAC AGC AGC 580 His Ser Val Ile Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser

175 180 185

CCA ACT GGG GGC CCA AGA GGA CGT CTT AAT GGC ACA GGA GGC CCT CGT 628 Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg

190 195 200

GAA TGT AAC AGC TTC CTC AGG CAT GCC AGA GAA ACC CCT GAT TCC TAC 676 Glu Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr

205 210 215

CGA GAC TCT CCT CAT AGT GAA AGA CAT AAC CTT ATA GCT GAG CTA AGG 724 Arg Asp Ser Pro His Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg

220 225 230 235

AGA AAC AAG GCA CAC AGA TCC AAA TGC ATG CAG ATC CAG CTA TCA GCA 772 Arg Asn Lys Ala His Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala

240 245 250

ACT CAT CTT AGA TCT TCT TCC ATT CCC CAT TTG GGC TTC ATT CTC 817 Thr His Leu Arg Ser Ser Ser Ile Pro His Leu Gly Phe Ile Leu

255 260 265

TAAGACCCCT TGGCCTTTAG GAAGGTATGT GTCAGCCATG ACCACCCCGG CTCGTATGTC 877

ACCTGTAGAT TTCCACACGC CAAGCTCCCC CAAATCGCCC CCTTCGGAAA TGTCTCCACC 937

CGTGTCCAGC ATGACGGTGT CCATGCCTTC CATGGCGGTC AGCCCCTTCA TGGAAGAAGA 997

GAGACCTCTA CTTCTCGTGA CACCACCAAG GCTGCGGGAG AAGAAGTTTG ACCATCACCC 1057

TCAGCAGTTC AGCTCCTTCC ACCACAACCC CGCGCATGAC AGTAACAGCC TCCCTGCTAG 1117

CCCCTTGAGG ATAGTGGAGG ATGAGGAGTA TGAAACGACC CAAGAGCACG AGCCAGCCCA 1177

AGAGCCTGTT AAGAAACTCG CCAATAGCCG GCGGGCCAAA AGAACCAAGC CCAATGGCCA 1237

CATTGCTAAC AGATTGGAAG TGGACAGCAA CACAAGCTCC CAGAGCAGTA ACTCAGAGAG 1297

TGAAACAGAA GATGAAAGAG TAGGTGAAGA TACGCCTTTC CTGGGCATAC AGAACCCCCT 1357

GGCAGCCAGT CTTGAGGCAA CACCTGCCTT CCGCCTGGCT GACAGCAGGA CTAACCCAGC 1417

AGGCCGCTTC TCGACACAGG AAGAAATCCA GGCCAGGCTG TCTAGTGTAA TTGCTAACCA 1477

AGACCCTATT GCTGTATAAA ACCTAAATAA ACACATAGAT TCACCTGTAA AACTTTATTT 1537

TATATAATAA AGTATTCCAC CTTAAATTAA ACAATTTATT TTATTTTAGC AGTTCTGCAA 1597

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAGGGCGG CCGC 1651

(11) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 266 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys

1 5 10 15

Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys

20 25 30

Val Gln Asn Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu

35 40 45

Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys

50 55 60

Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp

65 70 75 80

Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile

85 90 95

Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu

100 105 110

Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val

115 120 125

Glu Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr

130 135 140

Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser

145 150 155 160

Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val

165 170 175

Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro

180 185 190

Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe

195 200 205

Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

210 215 220

Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala His

225 230 235 240

Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His Leu Arg Ser

245 250 255

Ser Ser Ile Pro His Leu Gly Phe Ile Leu

260 265
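The CDS feature of SEQ ID NO: 9 (location 20..817) can be cross-checked against the stated 266-residue length of SEQ ID NO: 10, since a coding span of n bases corresponds to n/3 codons. The Python sketch below is illustrative only: it assumes the standard genetic code, uses hypothetical function names, and translates just the first row of SEQ ID NO: 9 as transcribed from the listing.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",  # stop codon
}


def cds_codon_count(start, end):
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate DNA (whitespace ignored) into three-letter residue codes."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return " ".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


# First row of SEQ ID NO: 9 as listed; the CDS starts at base 20.
first_row = "GTCGACCCAC GCGTCCGGC TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA"
flat = "".join(first_row.split())
cds_start = 20

print(cds_codon_count(20, 817))
print(translate(flat[cds_start - 1:]))
```

The same count applied to the CDS of SEQ ID NO: 11 (location 20..394) gives 125 codons, in agreement with the stated length of SEQ ID NO: 12.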

(12) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 946 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 20..394

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTCGACCCAC GCGTCCGGT GCC TCT GCC AAT ATC ACC ATC GTG GAA TCA AAC 52

Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn

1 5 10

GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT 100 Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser

15 20 25

TCA GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT 148 Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr

30 35 40

TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA 196 Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys

45 50 55

TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC TTC 244 Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe

60 65 70 75

ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG TGC AAG TGC CAA 292 Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln

80 85 90

CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTG CCC ATG AAA GTC 340 Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys Val

95 100 105

CAA AAC CAA GAA AGT GCC CAA ATG AGT TTA CTG GTG ATC GCT GCC AAA 388 Gln Asn Gln Glu Ser Ala Gln Met Ser Leu Leu Val Ile Ala Ala Lys

110 115 120

ACT ACG TAATGGCCAG CTTCTACAGT ACGTCCACTC CCTTTCTGTC TCTGCCTGAA 444 Thr Thr

125

TAGGAGCATG CTCAGTTGGT GCTGCTTTCT TGTTGCTGCA TCTCCCCTCA GATTCCACCT 504

AGAGCTAGAT GTGTCTTACC AGATCTAATA TTGACTGCCT CTGCCTGTCG CATGAGAACA 564

TTAACAAAAG CAATTGTATT ACTTCCTCTG TTCGCGACTA GTTGGCTCTG AGATACTAAT 624

AGGTGTGTGA GGCTCCGGAT GTTTCTGGAA TTGATATTGA ATGATGTGAT ACAAATTGAT 684

AGTCAATATC AAGCAGTGAA ATATGATAAT AAAGGCATTT CAAAGTCTCA CTTTTATTGA 744

TAAAATAAAA ATCATTCTAC TGAACAGTCC ATCTTCTTTA TACAATGACC ACATCCTGAA 804

AAGGGTGTTG CTAAGCTGTA ACCGATATGC ACTTGAAATG ATGGTAAGTT AATTTTGATT 864

CAGAATGTGT TATTTGTCAC AAATAAACAT AATAAAAGGA AAAAAAAAAA AAAAAAAAAA 924

AAAAAAAAAA AAGGGCGGCC GC 946

(13) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly

1 5 10 15

Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile

20 25 30

Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser

35 40 45

Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu

50 55 60

Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu

65 70 75 80

Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly

85 90 95

Ala Arg Cys Thr Glu Asn Val Pro Met Lys Val Gln Asn Gln Glu Ser

100 105 110

Ala Gln Met Ser Leu Leu Val Ile Ala Ala Lys Thr Thr

115 120 125

(14) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1873 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 12..1664

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GAATTCAAGC G TCA GAA CTT CGC ATT AAC AAA GCA TCA CTG GCT GAT TCT 50

Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser

1 5 10

GGA GAG TAT ATG TGC AAA GTG ATC AGC AAA TTA GGA AAT GAC AGT GCC 98 Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala

15 20 25

TCT GCC AAT ATC ACC ATC GTG GAA TCA AAC GAG ATC ATC ACT GGT ATG 146 Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met

30 35 40 45

CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT TCA GAG TCT CCC ATT AGA 194 Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg

50 55 60

ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT TCT TCA TCT ACA TCT ACA 242 Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr

65 70 75

TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA TGT GCG GAG AAG GAG AAA 290 Ser Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys

80 85 90

ACT TTC TGT GTG AAT GGA GGG GAG TGC TTC ATG GTG AAA GAC CTT TCA 338 Thr Phe Cys Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser

95 100 105

AAC CCC TCG AGA TAC TTG TGC AAG TGC CCA AAT GAG TTT ACT GGT GAT 386 Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp

110 115 120 125

CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC TAC AAG CAT CTT GGG ATT 434 Arg Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys His Leu Gly Ile

130 135 140

GAA TTT ATG GAG GCG GAG GAG CTG TAC CAG AAG AGA GTG CTG ACC ATA 482 Glu Phe Met Glu Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile

145 150 155

ACC GGC ATC TGC ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG 530 Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val

160 165 170

GCC TAC TGC AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC CGT CTT 578 Ala Tyr Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu

175 180 185

CGG CAG AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG AAC ATT GCC AAT 626 Arg Gln Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn

190 195 200 205

GGG CCT CAC CAT CCT AAC CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT 674 Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn

210 215 220

CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA 722 Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg

225 230 235

GAA GCA GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT 770 Glu Ala Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His

240 245 250

CAC TCC ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG AGC AAC GGA 818 His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly

255 260 265

CAC ACT GAA AGC ATC CTT TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA 866 His Thr Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser

270 275 280 285

TCC GTA GAA AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA 914 Ser Val Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly

290 295 300

CGT CTT AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTC CTC AGG 962 Arg Leu Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg

305 310 315

CAT GCC AGA GAA ACC CCT GAT TTC TAC CGA GAC TCT CCT CAT AGT GAA 1010 His Ala Arg Glu Thr Pro Asp Phe Tyr Arg Asp Ser Pro His Ser Glu

320 325 330

AGG TAT GTG TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT GTA GAT 1058 Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp

335 340 345

TTC CAC ACG CCA AGC TCC CCC AAA TCG CCC CCT TCG GAA ATG TCT CCA 1106 Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro

350 355 360 365

CCC GTG TCC AGC ATG ACG GTG TCC ATG CCT TCC ATG GCG GTC AGC CCC 1154 Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro

370 375 380

TTC ATG GAA GAA GAG AGA CCT CTA CTT CTC GTG ACA CCA CCA AGG CTG 1202 Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu

385 390 395

CGG GAG AAG AAG TTT GAC CAT CAC CCT CAG CAG TTC AGC TCC TTC CAC 1250 Arg Glu Lys Lys Phe Asp His His Pro Gln Gln Phe Ser Ser Phe His

400 405 410

CAC AAC CCC GCG CAT GAC AGT AAC AGC CTC CCT GCT AGC CCC TTG AGG 1298 His Asn Pro Ala His Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg

415 420 425

ATA GTG GAG GAT GAG GAG TAT GAA ACG ACC CAA GAG TAC GAG CCA GCC 1346 Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala

430 435 440 445

CAA GAG CCT GTT AAG AAA CTC GCC AAT AGC CGG CGG GCC AAA AGA ACC 1394 Gln Glu Pro Val Lys Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr

450 455 460

AAG CCC AAT GGC CAC ATT GCT AAC AGA TTG GAA GTG GAC AGC AAC ACA 1442 Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Val Asp Ser Asn Thr

465 470 475

AGC TCC CAG AGC AGT AAC TCA GAG AGT GAA ACA GAA GAT GAA AGA GTA 1490 Ser Ser Gln Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val

480 485 490

GGT GAA GAT ACG CCT TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC AGT 1538 Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser

495 500 505

CTT GAG GCA ACA CCT GCC TTC CGC CTG GCT GAC AGC AGG ACT AAC CCA 1586 Leu Glu Ala Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro

510 515 520 525

GCA GGC CGC TTC TCG ACA CAG GAA GAA ATC CAG GCC AGG CTG TCT AGT 1634 Ala Gly Arg Phe Ser Thr Gln Glu Glu Ile Gln Ala Arg Leu Ser Ser

530 535 540

GTA ATT GCT AAC CAA GAC CCT ATT GCT GTA TAAAACCTAA ATAAACACAT 1684 Val Ile Ala Asn Gln Asp Pro Ile Ala Val

545 550

AGATTCACCT GTAAAACTTT ATTTTATATA ATAAAGTATT CCACCTTAAA TTAAACAATT 1744

TATTTTATTT TAGCAGTTCT GCAAATAGAA AACAGGAAAA AAACTTTTAT AAATTAAATA 1804

TATGTATGTA AAAATGTGTT ATGTGCCATA TGTAGCAATT TTTTACAGTA TTTCAAAAAA 1864

AAAAAGCTT 1873

(15) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 551 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr 1 5 10 15

Met Cys Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn

20 25 30

Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser

35 40 45

Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val 50 55 60

Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr 65 70 75 80

Gly Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys

85 90 95

Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser

100 105 110

Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln

115 120 125

Asn Tyr Val Met Ala Ser Phe Tyr Lys His Leu Gly Ile Glu Phe Met 130 135 140

Glu Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile 145 150 155 160

Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys

165 170 175

Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln Ser

180 185 190

Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro His

195 200 205

His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val 210 215 220

Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu 225 230 235 240

Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr

245 250 255

Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu

260 265 270

Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu

275 280 285

Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn 290 295 300

Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His Ala Arg 305 310 315 320

Glu Thr Pro Asp Phe Tyr Arg Asp Ser Pro His Ser Glu Arg Tyr Val

325 330 335

Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp Phe His Thr

340 345 350

Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser

355 360 365

Ser Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro Phe Met Glu 370 375 380

Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys 385 390 395 400

Lys Phe Asp His His Pro Gln Gln Phe Ser Ser Phe His His Asn Pro

405 410 415

Ala His Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val Glu

420 425 430

Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala Gln Glu Pro

435 440 445

Val Lys Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn 450 455 460

Gly His Ile Ala Asn Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gln 465 470 475 480

Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp

485 490 495

Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Ala

500 505 510

Thr Pro Ala Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly Arg

515 520 525

Phe Ser Thr Gln Glu Glu Ile Gln Ala Arg Leu Ser Ser Val Ile Ala

530 535 540

Asn Gln Asp Pro Ile Ala Val

545 550

(16) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 878 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 25..852

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GTCGAAGGAA ATGACAGTGC CTCT GCC AAT ATC ACC ATC GTG GAA TCA AAC 51

Ala Asn Ile Thr Ile Val Glu Ser Asn

1 5

GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA GGA GCA TAT GTG TCT 99 Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala Tyr Val Ser

10 15 20 25

TCA GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA GCA AAT ACT 147 Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr

30 35 40

TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT CTT GTA AAA 195 Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Val Lys

45 50 55

TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG GAG TGC TTC 243 Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe

60 65 70

ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG TGC AAG TGC CCA 291 Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pro

75 80 85

AAT GAG TTT ACT GGT GAT CGC TGC CAA AAC TAC GTA ATG GCC AGC TTC 339 Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe

90 95 100 105

TAC AAG GCG GAG GAG CTG TAC CAG AAG AGA GTG CTG ACC ATA ACC GGC 387 Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly

110 115 120

ATC TGC ATC GCC CTC CTT GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC 435 Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr

125 130 135

TGC AAA ACC AAG AAA CAG CGG AAA AAG CTG CAT GAC CGT CTT CGG CAG 483 Cys Lys Thr Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln

140 145 150

AGC CTT CGG TCT GAA CGA AAC AAT ATG ATG AAC ATT GCC AAT GGG CCT 531 Ser Leu Arg Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro

155 160 165

CAC CAT CCT AAC CCA CCC CCC GAG AAT GTC CAG CTG GTG AAT CAA TAC 579 His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr

170 175 180 185

GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GCA 627 Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala

190 195 200

GAG ACA TCC TTT TCC ACC AGT CAC TAT ACT TCC ACA GCC CAT CAC TCC 675 Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser

205 210 215

ACT ACT GTC ACC CAG ACT CCT AGC CAC AGC TGG AGC AAC GGA CAC ACT 723 Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr

220 225 230

GAA AGC ATC CTT TCC GAA AGC CAC TCT GTA ATC GTG ATG TCA TCC GTA 771 Glu Ser Ile Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val

235 240 245

GAA AAC AGT AGG CAC AGC AGC CCA ACT GGG GGC CCA AGA GGA CGT CTT 819 Glu Asn Ser Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu

250 255 260 265

AAT GGC ACA GGA GGC CCT CGT GAA TGT AAC AGC TTCCTCAGGC ATGCCAGAGA 872 Asn Gly Thr Gly Gly Pro Arg Glu Cys Asn Ser

270 275

GGCCGC 878

(17) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 276 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met Pro 1 5 10 15

Ala Ser Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile

20 25 30

Ser Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser

35 40 45

Thr Thr Gly Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr 50 55 60

Phe Cys Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn 65 70 75 80

Pro Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg

85 90 95

Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr

100 105 110

Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val

115 120 125

Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg 130 135 140

Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Asn 145 150 155 160

Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro

165 170 175

Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser

180 185 190

Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser Phe Ser Thr Ser

195 200 205

His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro 210 215 220

Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile Leu Ser Glu Ser

225 230 235 240

His Ser Val Ile Val Met Ser Ser Val Glu Asn Ser Arg His Ser Ser

245 250 255

Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr Gly Gly Pro Arg

260 265 270

Glu Cys Asn Ser

275

(18) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 906 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 43..531

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCTAGATGCA TGCTCGAGCG GCCGCCAGTG TGCTCTAAAG AT CGA AAA AAC AAA 54

Arg Lys Asn Lys

1

CCA CAA AAT ATC AAG ATA CAA AAA AAG CCA GGG AAG TCA GAA CTT CGC 102 Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg

5 10 15 20

ATT AAC AAA GCA TCA CTG GCT GAT TCT GGA GAG TAT ATG TGC AAA GTG 150 Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val

25 30 35

ATC AGC AAA TTA GGA AAT GAC AGT GCC TCT GCC AAT ATC ACC ATC GTG 198 Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val

40 45 50

GAA TCA AAC GAG ATC ATC ACT GGT ATG CCA GCC TCA ACT GAA GGA GCA 246 Glu Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser Thr Glu Gly Ala

55 60 65

TAT GTG TCT TCA GAG TCT CCC ATT AGA ATA TCA GTA TCC ACA GAA GGA 294 Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly

70 75 80

GCA AAT ACT TCT TCA TCT ACA TCT ACA TCC ACC ACT GGG ACA AGC CAT 342 Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His

85 90 95 100

CTT GTA AAA TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGA GGG 390 Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly

105 110 115

GAG TGC TTC ATG GTG AAA GAC CTT TCA AAC CCC TCG AGA TAC TTG TGC 438 Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys

120 125 130

AAG TGC CCA AAT GAG TTT ACT GGT GAT CGC TGC CAA AAC TAC GTA ATG 486 Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met

135 140 145

GCC AGC TTC TAC AGT ACG TCC ACT CCC TTT CTG TCT CTG CCT GAA 531

Ala Ser Phe Tyr Ser Thr Ser Thr Pro Phe Leu Ser Leu Pro Glu

150 155 160

TAGGAGCATG CTCAGTTGGT GCTGCTTTCT TGTTGCTGCA TCTCCCCTCA GATTCCACCT 591

AGAGCTAGAT GTGTCTTACC AGATCTAATA TTGACTGCCT CTGCCTGTCG CATGAGAACA 651

TTAACAAAAG CAATTGTATT ACTTCCTCTG TTCGCGACTA GTTGGCTCTG AGATACTAAT 711

AGGTGTGTGA GGCTCCGGAT GTTTCTGGAA TTGATATTGA ATGATGTGAT ACAAATTGAT 771

AGTCAATATC AAGCAGTGAA ATATGATAAT AAAGGCATTT CAAAGTCTCA AAAAAAAAAA 831

AAAAAAAAAA AAAAAAAAAA ACTTTAGAGC ACACTGGCGG CCGTTACTAG TGGATCCGAG 891

CTCGGTACCA AGCTT 906

(19) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Arg Lys Asn Lys Pro Gln Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys

1 5 10 15

Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr

20 25 30

Met Cys Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn

35 40 45

Ile Thr Ile Val Glu Ser Asn Glu Ile Ile Thr Gly Met Pro Ala Ser

50 55 60

Thr Glu Gly Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val

65 70 75 80

Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr

85 90 95

Gly Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys

100 105 110

Val Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser

115 120 125

Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln

130 135 140

Asn Tyr Val Met Ala Ser Phe Tyr Ser Thr Ser Thr Pro Phe Leu Ser

145 150 155 160

Leu Pro Glu

(20) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2186 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

AGCTGCCGGG AGATGCGAGC GCAGACCGGA TTGTGATCAC CTTTCCCTCT TCGGGCTGTA 60

AGAGAGCGAG ACAAGCCACC GAAGCGAGGC CACTCCAGAG CCGGCAGCGG AGGGACCCGG 120

GACACTAGAG CAGCTCCGAG CCACTCCAGA CTGAGCGGAC GCTCCAGGTG ATCGAGTCCA 180

CGCTGCTTCC TGCAGGCGAC AGGCGACGCC TCCCGAGCAG CCCGGCCACT GGCTCTTCCC 240

CTCCTGGGAC AAACTTTTCT GCAAGCCCTT GGACCAAACT TGTCGCGCGT CACCGTCACC 300

CAACCGGGTC CGCGTAGAGC GCTCATCTTC GGCGAG ATG TCT GAG CGC AAA GAA 354

Met Ser Glu Arg Lys Glu

1 5

GGC AGA GGC AAG GGG AAG GGC AAG AAG AAG GAC CGG GGA TCC CGC GGG 402 Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser Arg Gly

10 15 20

AAG CCC GGG CCC GCC GAG GGC GAC CCG AGC CCA GCA CTG CCT CCC AGA 450 Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro Pro Arg

25 30 35

TTG AAA GAA ATG AAG AGC CAG GAG TCA GCT GCA GGC TCC AAG CTA GTG 498 Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu Val

40 45 50

CTC CGG TGC GAA ACC AGC TCC GAG TAC TCC TCA CTC AGA TTC AAA TGG 546 Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp

55 60 65 70

TTC AAG AAT GGG AAC GAG CTG AAC CGC AAA AAT AAA CCA GAA AAC ATC 594 Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu Asn Ile

75 80 85

AAG ATA CAG AAG AAG CCA GGG AAG TCA GAG CTT CGA ATT AAC AAA GCA 642 Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala

90 95 100

TCC CTG GCT GAC TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAG TTA 690 Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu

105 110 115

GGA AAT GAC AGT GCC TCT GCC AAC ATC ACC ATT GTT GAG TCA AAC GAG 738 Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu

120 125 130

TTC ATC ACT GGC ATG CCA GCC TCG ACT GAG ACA GCC TAT GTG TCC TCA 786 Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr Val Ser Ser

135 140 145 150

GAG TCT CCC ATT AGA ATC TCA GTT TCA ACA GAA GGC GCA AAC ACT TCT 834 Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser

155 160 165

TCA TCC ACA TCA ACA TCC ACG ACT GGG ACC AGC CAT CTC ATA AAG TGT 882 Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Ile Lys Cys

170 175 180

GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGG GGC GAG TGC TTC ACG 930 Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe Thr

185 190 195

GTG AAG GAC CTG TCA AAC CCG TCA AGA TAC TTG TGC AAG TGC CAA CCT 978 Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln Pro

200 205 210

GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTA CCC ATG AAA GTC CAA 1026 Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys Val Gln

215 220 225 230

ACC CAA GAA AAA GCG GAG GAA CTC TAC CAG AAG AGG GTG CTG ACA ATT 1074 Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile

235 240 245

ACT GGC ATC TGT ATC GCC CTG CTG GTG GTC GGC ATC ATG TGT GTG GTG 1122 Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val

250 255 260

GCC TAC TGC AAA ACC AAG AAG CAG CGG CAG AAG CTT CAT GAT CGG CTT 1170 Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His Asp Arg Leu

265 270 275

CGG CAG AGT CTT CGG TCA GAA CGG AGC AAC CTG GTG AAC ATA GCG AAT 1218 Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile Ala Asn

280 285 290

GGG CCT CAC CAC CCA AAC CCA CCG CCA GAG AAC GTG CAG CTG GTG AAT 1266 Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn

295 300 305 310

CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA 1314 Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg

315 320 325

GAA GTG GAG ACT TCC TTT TCC ACC AGT CAT TAC ACT TCC ACA GCC CAT 1362 Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His

330 335 340

CAC TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG AGT AAT GGG 1410 His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly

345 350 355

CAC ACG GAG AGC GTC ATT TCA GAA AGC AAC TCC GTA ATC ATG ATG TCT 1458 His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile Met Met Ser

360 365 370

TCG GTA GAG AAC AGC AGG CAC AGC AGT CCC GCC GGG GGC CCA CGA GGA 1506 Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro Arg Gly

375 380 385 390

CGT CTT CAT GGC CTG GGA GGC CCT CGT GAT AAC AGC TTC CTC AGG CAT 1554 Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu Arg His

395 400 405

GCC AGA GAA ACC CCT GAC TCC TAC AGA GAC TCT CCT CAT AGC GAA AGG 1602 Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg

410 415 420

TAAAATGGAA GGGTAAAGCT ATCGTGGAGG AGAACCTCAT TCAGTGAGAG AATCCCATGA 1662

GCACCTGCGG TCTCTCCTAA GGAAACTCAT CCTTTCATAA GGGGCGTCAT CAATTCCCTG 1722

ACGCTCCTAG TTGATGAAGT CAACCTCTTA TTTGTCTGAA CTCCTTTCTC CTGAGCTTCT 1782

CCCCGCGTCC CAGTGGCTGA CAGGCAACCA ACTCCTAAAG AGCAGGAGTA ATGTAATGTG 1842

GAAGGCCCAG CCGACTTGGA GTTCTCAGAC CTGACCCTAA GGTCAGACTC ACTGGGGTCT 1902

TCCTTTCGTC AGGTGCACCA TTTTAAGGAC CCTCATCTAA TCCCGATCAG CCAGTTGTCC 1962

ATTCTCTACC TCGATGGTTG TTCTGCCCTC CTCTCCTCAT GCTCTAAGTA CCCAGCCTCT 2022

AGCCTCAGTT AAATCAAGTC GAGGGCTGGG ACACGGCTGA TGGTCATGGC GGAATGCCTC 2082

CTCGGGTTCC ACCACACTGT ATCCATTTAC GACATCCCTA GAAGTCATGG GTAAATAAAG 2142

TTGTTGTGTG ATGTGAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 2186

(21) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 422 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

Met Ser Glu Arg Lys Glu

1 5

Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser Arg Gly

10 15 20

Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro Pro Arg

25 30 35

Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu Val

40 45 50

Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp

55 60 65 70

Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu Asn Ile

75 80 85

Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala

90 95 100

Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu 105 110 115

Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu 120 125 130

Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr Val Ser Ser 135 140 145 150

Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser

155 160 165

Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Ile Lys Cys

170 175 180

Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe Thr

185 190 195

Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln Pro 200 205 210

Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys Val Gln 215 220 225 230

Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile

235 240 245

Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val

250 255 260

Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His Asp Arg Leu

265 270 275

Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile Ala Asn

280 285 290

Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn 295 300 305 310

Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg

315 320 325

Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His

330 335 340

His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly

345 350 355

His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile Met Met Ser 360 365 370

Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro Arg Gly 375 380 385 390

Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu Arg His

395 400 405

Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg

410 415 420

(22) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

CCGT CACCCAACCG GGTCCGCGTA GAGCGCTCAT CTTCGGCGAG ATG TCT GAG 53

Met Ser Glu

1

CGC AAA GAA GGC AGA GGC AAG GGG AAG GGC AAG AAG AAG GAC CGG GGA 101 Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly

5 10 15

TCC CGC GGG AAG CCC GGG CCC GCC GAG GGC GAC CCG AGC CCA GCA CTG 149 Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu

20 25 30 35

CCT CCC AGA TTG AAA GAA ATG AAG AGC CAG GAG TCA GCT GCA GGC TCC 197 Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser

40 45 50

AAG CTA GTG CTC CGG TGC GAA ACC AGC TCC GAG TAC TCC TCA CTC AGA 245 Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg

55 60 65

TTC AAA TGG TTC AAG AAT GGG AAC GAG CTG AAC CGC AAA AAT AAA CCA 293 Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro

70 75 80

GAA AAC ATC AAG ATA CAG AAG AAG CCA GGG AAG TCA GAG CTT CGA ATT 341 Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile

85 90 95

AAC AAA GCA TCC CTG GCT GAC TCT GGA GAG TAT ATG TGC AAA GTG ATC 389 Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile

100 105 110 115

AGC AAG TTA GGA AAT GAC AGT GCC TCT GCC AAC ATC ACC ATT GTT GAG 437 Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu

120 125 130

TCA AAC GAG TTC ATC ACT GGC ATG CCA GCC TCG ACT GAG ACA GCC TAT 485 Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr

135 140 145

GTG TCC TCA GAG TCT CCC ATT AGA ATC TCA GTT TCA ACA GAA GGC GCA 533 Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala

150 155 160

AAC ACT TCT TCA TCC ACA TCA ACA TCC ACG ACT GGG ACC AGC CAT CTC 581 Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu

165 170 175

ATA AAG TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGG GGC GAG 629 Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu

180 185 190 195

TGC TTC ACG GTG AAG GAC CTG TCA AAC CCG TCA AGA TAC TTG TGC AAG 677 Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys

200 205 210

TGC CAA CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTA CCC ATG 725 Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met

215 220 225

AAA GTC CAA ACC CAA GAA AAA GCG GAG GAA CTC TAC CAG AAG AGG GTG 773 Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val

230 235 240

CTG ACA ATT ACT GGC ATC TGT ATC GCC CTG CTG GTG GTC GGC ATC ATG 821 Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met

245 250 255

TGT GTG GTG GCC TAC TGC AAA ACC AAG AAG CAG CGG CAG AAG CTT CAT 869 Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His

260 265 270 275

GAT CGG CTT CGG CAG AGT CTT CGG TCA GAA CGG AGC AAC CTG GTG AAC 917 Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn

280 285 290

ATA GCG AAT GGG CCT CAC CAC CCA AAC CCA CCG CCA GAG AAC GTG CAG 965 Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln

295 300 305

CTG GTG AAT CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT 1013 Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile

310 315 320

GTT GAG AGA GAA GTG GAG ACT TCC TTT TCC ACC AGT CAT TAC ACT TCC 1061 Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser

325 330 335

ACA GCC CAT CAC TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG 1109 Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp

340 345 350 355

AGT AAT GGG CAC ACG GAG AGC GTC ATT TCA GAA AGC AAC TCC GTA ATC 1157 Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile

360 365 370

ATG ATG TCT TCG GTA GAG AAC AGC AGG CAC AGC AGT CCC GCC GGG GGC 1205 Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly

375 380 385

CCA CGA GGA CGT CTT CAT GGC CTG GGA GGC CCT CGT GAT AAC AGC TTC 1253 Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe

390 395 400

CTC AGG CAT GCC AGA GAA ACC CCT GAC TCC TAC AGA GAC TCT CCT CAT 1301 Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

405 410 415

AGC GAA AGG TAAAATGGAA GGGTAAAGCT ATCGTGGAGG AGAACCTCAT TCAGTGAGAG 1360

Ser Glu Arg

420

AATCCCATGA GCACCTGCGG TCTCTCCTAA GGAAACTCAT CCTTTCATAA GGGGCGTCAT 1420

CAATTCCCTG ACGCTCCTAG TTGATGAAGT CAACCTCTTA TTTGTCTGAA CTCCTTTCTC 1480

CTGAGCTTCT CCCCGCGTCC CAGTGGCTGA CAGGCAACCA ACTCCTAAAG AGCAGGAGTA 1540

ATGTAATGTG GAAGGCCCAG CCGACTTGGA GTTCTCAGAC CTGACCCTAA GGTCAGACTC 1600

ACTGGGGTCT TCCTTTCGTC AGGTGCACCA TTTTAAGGAC CCTCATCTAA TCCCGATCAG 1660

CCAGTTGTCC ATTCTCTACC TCGATGGTTG TTCTGCCCTC CTCTCCTCAT GCTCTAAGTA 1720

CCCAGCCTCT AGCCTCAGTT AAATCAAGTC GAGGGCTGGG ACACGGCTGA TGGTCATGGC 1780

GGAATGCCTC CTCGGGTTCC ACCACACTGT ATCCATTTAC GACATCCCTA GAAGTCATGG 1840

GTAAATAAAG TTGTTGTGTG ATGTGAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 1894

(23) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 422 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Met Ser Glu

1

Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly 5 10 15

Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu 20 25 30 35

Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser

40 45 50

Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg

55 60 65

Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro

70 75 80

Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile 85 90 95

Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile 100 105 110 115

Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu

120 125 130

Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr

135 140 145

Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala

150 155 160

Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu 165 170 175

Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu 180 185 190 195

Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys

200 205 210

Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met 215 220 225

Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val

230 235 240

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met 245 250 255

Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His 260 265 270 275

Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn

280 285 290

Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln

295 300 305

Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile

310 315 320

Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser 325 330 335

Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp 340 345 350 355

Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile

360 365 370

Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly

375 380 385

Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe

390 395 400

Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His 405 410 415

Ser Glu Arg

420

(24) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 64 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe 1 5 10 15

Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Gln

20 25 30

Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys Val

35 40 45

Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr 50 55 60

(25) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

Cys Pro Asp Ser His Thr Gln Tyr Cys Phe His Gly Thr Cys Arg Phe 1 5 10 15

Leu Val Gln Glu Glu Lys Pro Ala Cys Val Cys His Ser Gly Tyr Val

20 25 30

Gly Val Arg Cys Glu His Ala Asp Leu Leu Ala Val Val Ala Ala Ser

35 40 45

Gln Lys Lys Gln Ala Ile Thr Ala Ala Val Val Val

50 55 60

(26) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Cys Asn Ala Glu Phe Gln Asn Phe Cys Ile His Gly Glu Cys Lys Tyr

1 5 10 15

Ile Glu His Leu Glu Ala Val Thr Cys Asn Cys Gln Gln Glu Tyr Phe

20 25 30

Gly Glu Arg Cys Gly Glu Lys Ser Met Lys Thr His Ser Met Ile Asp

35 40 45

Ser Ser Leu Ser Lys Ile Ala Leu Leu Ala Ile Ala

50 55 60

(27) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Cys Pro Ser Ser Tyr Asp Gly Tyr Cys Leu Asn Gly Gly Val Cys Met 1 5 10 15

His Ile Glu Ser Leu Asp Ser Tyr Thr Cys Asn Cys Val Ile Gly Tyr

20 25 30

Ser Gly Asp Arg Cys Gln Thr Arg Asp Leu Arg Trp Trp Glu Leu Arg

35 40 45

His Ala Gly Tyr Gly Gln Lys His Asp Ile Met Val Val

50 55 60

(28) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Cys Leu Arg Lys Tyr Lys Asp Phe Cys Ile His Gly Glu Cys Lys Tyr 1 5 10 15

Val Lys Glu Leu Arg Ala Pro Ser Cys Ile Cys His Pro Gly Tyr His

20 25 30

Gly Glu Arg Cys His Gly Leu Ser Leu Pro Val Glu Asn Arg Val Tyr

35 40 45

Thr Tyr Asp His Thr Thr Ile Leu Ala Val Val Ala

50 55 60

(29) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

Cys Ala Ala Lys Phe Gln Asn Phe Cys Ile His Gly Glu Cys Arg Tyr

1 5 10 15

Ile Glu Asn Leu Glu Val Val Thr Cys His Cys His Gln Asp Tyr Phe

20 25 30

Gly Glu Arg Cys Gly Glu Lys Thr Met Lys Thr Gln Lys Lys Asp Asp

35 40 45

Ser Asp Leu Ser Lys Ile Ala Leu Ala Ala Ile Ile

50 55 60

(30) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Cys Gly Pro Glu Gly Asp Gly Tyr Cys Leu His Gly Asp Cys Ile His 1 5 10 15

Ala Arg Asp Ile Asp Gly Met Tyr Cys Arg Cys Ser His Gly Tyr Thr

20 25 30

Gly Ile Arg Cys Gln His Val Val Leu Val Asp Tyr Gln Arg Ser Glu

35 40 45

Asn Pro Asn Thr Thr Thr Ser Tyr Ile Pro Ser Pro

50 55 60

(31) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Cys Asn His Asp Tyr Glu Asn Tyr Cys Leu Asn Asn Gly Thr Cys Phe 1 5 10 15

Thr Ile Ala Leu Asp Asn Val Ser Ile Thr Pro Phe Cys Val Cys Arg

20 25 30

Ile Asn Tyr Glu Gly Ser Arg Cys Gln Phe Ile Asn Leu Val Thr Tyr

35 40 45

(32) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Cys Asn Asp Asp Tyr Lys Asn Tyr Cys Leu Asn Asn Gly Thr Cys Phe 1 5 10 15

Thr Val Ala Leu Asn Asn Val Ser Leu Asn Pro Phe Cys Ala Cys His

20 25 30

Ile Asn Tyr Val Gly Ser Arg Cys Gln Phe Ile Asn Leu Ile Thr Ile

35 40 45

Lys

(33) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe 1 5 10 15

Arg Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Asn Lys Pro

20 25 30

Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile

35 40 45

Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile

50 55 60

Ser

65

(34) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Val Thr Leu Thr Cys Glu Ala Ser Gly Asp Pro Ile Pro Ser Ile Thr

1 5 10 15

Trp Arg Thr Ser Thr Arg Asn Ile Ser Ser Glu Glu Gln Asp Leu Asp

20 25 30

Gly His Met Val Val Arg Ser His Ala Arg Val Ser Ser Leu Thr Leu

35 40 45

Lys Ser Ile Gln Tyr Arg Asp Ala Gly Glu Tyr Met Cys Thr Ala Ser

50 55 60

Asn

65

(35) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1098 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 17..709

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

GTCGACCCAC GCGTCC GGG AAG GGC AAG AAG AAG GAC CGG GGA TCC CGC 49

Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser Arg

1 5 10

GGG AAG CCC GGG CCC GCC GAG GGC GAC CCG AGC CCA GCA CTG CCT CCC 97

Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro Pro

15 20 25

AGA TTG AAA GAA ATG AAG AGC CAG GAG TCA GCT GCA GGC TCC AAG CTA 145 Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu

30 35 40

GTG CTC CGG TGC GAA ACC AGC TCC GAG TAC TCC TCA CTC AGA TTC AAA 193 Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys

45 50 55

TGG TTC AAG AAT GGG AAC GAG CTG AAC CGC AAA AAT AAA CCA GAA AAC 241 Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu Asn

60 65 70 75

ATC AAG ATA CAG AAG AAG CCA GGG AAG TCA GAG CTT CGA ATT AAC AAA 289 Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys

80 85 90

GCA TCC CTG GCT GAC TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC AAG 337 Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys

95 100 105

TTA GGA AAT GAC AGT GCC TCT GCC AAC ATC ACC ATT GTT GAG TCA AAC 385 Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn

110 115 120

GAG TTC ATC ACT GGC ATG CCA GCC TCG ACT GAG ACA GCC TAT GTG TCC 433 Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr Val Ser

125 130 135

TCA GAG TCT CCC ATT AGA ATC TCA GTT TCA ACA GAA GGC GCA AAC ACT 481 Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr

140 145 150 155

TCT TCA TCC ACA TCA ACA TCC ACG ACT GGG ACC AGC CAT CTC ATA AAG 529 Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Ile Lys

160 165 170

TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGG GGC GAG TGC TTC 577 Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe

175 180 185

ACG GTG AAG GAC CTG TCA AAC CCG TCA AGA TAC TTG TGC AAG TGC CCA 625 Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pro

190 195 200

AAT GAG TTT ACT GGT GAT CGT TGC CAA AAC TAC GTA ATG GCC AGC TTC 673 Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe

205 210 215

TAC AGT ACG TCC ACT CCC TTT CTG TCT CTG CCT GAG TAGGAGCATG 719

Tyr Ser Thr Ser Thr Pro Phe Leu Ser Leu Pro Glu

220 225 230

CTCAGTCGAT GCTGCTTTCT TGTTGCTACA TCTCCCCTCA GAGTCACCTA GAGCCTAGAA 779

CCTTCTACCA GGTCTAATAT TGACTGCCTC TGCCTGTCGC ATGAGAACAT TAACAAGGCC 839

GCTGTGTTAC TTCTCTGTTT GTGACTAGTC GGCTCAGAGT TACTAAATAG GTGTGTGAGG 899

CCCCAGAGGT TTCTGAAAAT GGTCCCGCAG AAGGTGATGC AAATCGATAG TCAACATGCC 959

AAGAGTGAAA TAATAATAAT AATAATAATA ATAATAATAA TAATAATAAA GGCGTTTCAA 1019

ACTCGCACTT GTATTGGTAA AATAAAATCA TTCCAACAAG TAAAAAAAAA AAAAAAAAAA 1079

AAAAAAAAAG GGCGCCCGC 1098
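The CDS feature locations given in this listing can be cross-checked against the stated protein lengths. The sketch below assumes the coordinates are 1-based and inclusive and that the stop codon lies outside the stated range; the helper name `cds_codon_count` is hypothetical and introduced only for illustration.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS with 1-based inclusive coordinates."""
    length = end - start + 1  # inclusive span in nucleotides
    if length % 3 != 0:
        raise ValueError(f"CDS span {start}..{end} is not a whole number of codons")
    return length // 3


# CDS locations and amino acid lengths as stated in the listing.
records = {
    "SEQ ID NO: 34 / 35": ((17, 709), 231),
    "SEQ ID NO: 36 / 37": ((345, 1727), 461),
    "SEQ ID NO: 40 / 41": ((345, 2252), 636),
}

for name, ((start, end), stated_length) in records.items():
    codons = cds_codon_count(start, end)
    status = "matches" if codons == stated_length else "differs from"
    print(f"{name}: {codons} codons, {status} stated length {stated_length}")
```

Under these assumptions each CDS span divides evenly into codons and agrees with the amino acid length of the corresponding protein record.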

(36) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 231 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro

1 5 10 15

Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro Pro Arg Leu Lys Glu Met

20 25 30

Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys Leu Val Leu Arg Cys Glu

35 40 45

Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly

50 55 60

Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys

65 70 75 80

Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp

85 90 95

Ser Gly Glu Tyr Met Cys Lys Val Ile Ser Lys Leu Gly Asn Asp Ser

100 105 110

Ala Ser Ala Asn Ile Thr Ile Val Glu Ser Asn Glu Phe Ile Thr Gly

115 120 125

Met Pro Ala Ser Thr Glu Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile

130 135 140

Arg Ile Ser Val Ser Thr Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser

145 150 155 160

Thr Ser Thr Thr Gly Thr Ser His Leu Ile Lys Cys Ala Glu Lys Glu

165 170 175

Lys Thr Phe Cys Val Asn Gly Gly Glu Cys Phe Thr Val Lys Asp Leu

180 185 190

Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly

195 200 205

Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr Ser Thr Ser Thr

210 215 220

Pro Phe Leu Ser Leu Pro Glu

225 230

(37) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2914 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..1727

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

AAA GAA GGC AGA GGC AAG GGG AAG GGC AAG AAG AAG GAC CGG GGA TCC 404

Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser

5 10 15 20

CGC GGG AAG CCC GGG CCC GCC GAG GGC GAC CCG AGC CCA GCA CTG CCT 452 Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro

25 30 35

CCC AGA TTG AAA GAA ATG AAG AGC CAG GAG TCA GCT GCA GGC TCC AAG 500 Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys

40 45 50

CTA GTG CTC CGG TGC GAA ACC AGC TCC GAG TAC TCC TCA CTC AGA TTC 548 Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg Phe

55 60 65

AAA TGG TTC AAG AAT GGG AAC GAG CTG AAC CGC AAA AAT AAA CCA GAA 596 Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu

70 75 80

AAC ATC AAG ATA CAG AAG AAG CCA GGG AAG TCA GAG CTT CGA ATT AAC 644 Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn

85 90 95 100

AAA GCA TCC CTG GCT GAC TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC 692 Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser

105 110 115

AAG TTA GGA AAT GAC AGT GCC TCT GCC AAC ATC ACC ATT GTT GAG TCA 740 Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser

120 125 130

AAC GAG TTC ATC ACT GGC ATG CCA GCC TCG ACT GAG ACA GCC TAT GTG 788 Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr Val

135 140 145

TCC TCA GAG TCT CCC ATT AGA ATC TCA GTT TCA ACA GAA GGC GCA AAC 836 Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr Glu Gly Ala Asn

150 155 160

ACT TCT TCA TCC ACA TCA ACA TCC ACG ACT GGG ACC AGC CAT CTC ATA 884 Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Ile

165 170 175 180

AAG TGT GCG GAG AAG GAG AAA ACT TTC TGT GTG AAT GGG GGC GAG TGC 932 Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn Gly Gly Glu Cys

185 190 195

TTC ACG GTG AAG GAC CTG TCA AAC CCG TCA AGA TAC TTG TGC AAG TGC 980 Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr Leu Cys Lys Cys

200 205 210

CAA CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTA CCC ATG AAA 1028

Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys

215 220 225

GTC CAA ACC CAA GAA AAA GCG GAG GAA CTC TAC CAG AAG AGG GTG CTG 1076 Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu

230 235 240

ACA ATT ACT GGC ATC TGT ATC GCC CTG CTG GTG GTC GGC ATC ATG TGT 1124 Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys

245 250 255 260

GTG GTG GCC TAC TGC AAA ACC AAG AAG CAG CGG CAG AAG CTT CAT GAT 1172 Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His Asp

265 270 275

CGG CTT CGG CAG AGT CTT CGG TCA GAA CGG AGC AAC CTG GTG AAC ATA 1220 Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile

280 285 290

GCG AAT GGG CCT CAC CAC CCA AAC CCA CCG CCA GAG AAC GTG CAG CTG 1268 Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu

295 300 305

GTG AAT CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT 1316 Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val

310 315 320

GAG AGA GAA GTG GAG ACT TCC TTT TCC ACC AGT CAT TAC ACT TCC ACA 1364 Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr

325 330 335 340

GCC CAT CAC TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG AGT 1412 Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser

345 350 355

AAT GGG CAC ACG GAG AGC GTC ATT TCA GAA AGC AAC TCC GTA ATC ATG 1460 Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile Met

360 365 370

ATG TCT TCG GTA GAG AAC AGC AGG CAC AGC AGT CCC GCC GGG GGC CCA 1508 Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro

375 380 385

CGA GGA CGT CTT CAT GGC CTG GGA GGC CCT CGT GAT AAC AGC TTC CTC 1556 Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu

390 395 400

AGG CAT GCC AGA GAA ACC CCT GAC TCC TAC AGA GAC TCT CCT CAT AGC 1604 Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser

405 410 415 420

GAA AGA CAT AAC CTT ATA GCT GAG CTA AGG AGA AAC AAG GCT TAC AGA 1652

Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala Tyr Arg

425 430 435

TCC AAA TGC ATG CAG ATC CAG CTG TCA GCA ACT CAT CTT AGA CCC TCT 1700 Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His Leu Arg Pro Ser

440 445 450

TCC ATT ACC CAT TTG GGC TTC ATT CTC TAAGACCCCT TGGCCTTTAG 1747 Ser Ile Thr His Leu Gly Phe Ile Leu

455 460

GAAGGTATGT ATCAGCCATG ACCACCCCGG CTCGTATGTC ACCTGTAGAT TTCCACACGC 1807

CAAGCTCCCC TAAATCGCCC CCTTCGGAAA TGTCTCCACC CGTGTCCAGC ATGACGGTGT 1867

CCATGCCCTC TGTGGCAGTC AGCCCCTTTG TGGAAGAAGA GAGGCCTCTG CTGCTTGTGA 1927

CGCCACCAAG GCTACGGGAG AAGAAATATG ATCATCACCC CCAGCAACTC AACTCCTTTC 1987

ATCACAACCC TGCACATCAG AGTACCAGCC TCCCCCCTAG CCCACTGAGG ATAGTGGAGG 2047

ATGAGGAGTA CGAGACGACC CAGGAGTATG AGTCAGTTCA AGAGCCCGTT AAGAAAGTCA 2107

CCAATAGCCG GCGGGCCAAA AGAACCAAGC CCAATGGCCA CATTGCCAAT AGGTTGGAAA 2167

TGGACAGCAA CACAAGTTCT GTGAGCAGTA ACTCAGAAAG TGAGACAGAA GACGAAAGAG 2227

TAGGTGAAGA CACACCATTC CTGGGCATAC AGAACCCCCT GGCAGCCAGC CTTGAGGTGG 2287

CCCCTGCCTT CCGTCTGGCT GAGAGCAGGA CTAACCCAGC AGGCCGCTTC TCCACACAGG 2347

AGGAATTACA GGCCAGGCTG TCTAGTGTAA TCGCTAACCA AGACCCTATT GCTGTATAAA 2407

ACCTAAATAA ACACATAGAT TCACCTGTAA AACTTTATTT TATATAATAA AGTATTTCAC 2467

CTTAAATTAA ACAATTTATT TTATTTTAGC AGTTCTGCAA ATAGAAAACA GGAAGAAAAA 2527

AAAACTTTTA TAAATTAAAT ATATGTATGT AAAAATGTGT TATGTGCCAT ATGTAGCAAT 2587

TTTTTACAGT ATTTCAAAAA CGAGAAAGAT ATCAATGGTG CCTTTATGTT CTGTTATGTC 2647

GAGAGCAAGT TTTATAAAGT TATGGTGATT TCTTTTTCAC AGTATTTCAG CAAAACCTCC 2707

CATATATTCA GTTTCTGCTG GCTTTTTGTG CATTGCATTA TGATGTTGAC TGGATGTATG 2767

GTTTGCAAGG CTAGCAGCTC GCTCGTGTTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTG 2827

TCTCTCTCTC TGTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCGTGATCCT 2887

CCTGCCTCTG CGGACGCGTG GGTCGAC 2914

(38) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 461 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys 1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys 65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu 130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr 145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn 210 215 220

Val Pro Met Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln

225 230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val

245 250 255

Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln

260 265 270

Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn

275 280 285

Leu Val Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu 290 295 300

Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser 305 310 315 320

Glu His Ile Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His

325 330 335

Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser

340 345 350

His Ser Trp Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn

355 360 365

Ser Val Ile Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro 370 375 380

Ala Gly Gly Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp 385 390 395 400

Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp

405 410 415

Ser Pro His Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn

420 425 430

Lys Ala Tyr Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His

435 440 445

Leu Arg Pro Ser Ser Ile Thr His Leu Gly Phe Ile Leu

450 455 460

(39) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2531 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..1727

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

AAA GAA GGC AGA GGC AAG GGG AAG GGC AAG AAG AAG GAC CGG GGA TCC 404 Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly Ser

5 10 15 20

25 30 35

40 45 50

55 60 65

70 75 80

85 90 95 100

105 110 115

120 125 130

AAC GAG TTC ATC ACT GGC ATG CCA GCC TCG ACT GAG ACA GCC TAT GTG 788

Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu Thr Ala Tyr Val

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

CAA CCT GGA TTC ACT GGA GCA AGA TGT ACT GAG AAT GTA CCC ATG AAA 1028 Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn Val Pro Met Lys

215 220 225

230 235 240

245 250 255 260

265 270 275

280 285 290

295 300 305

310 315 320

325 330 335 340

GCC CAT CAC TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG AGT 1412

Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser

345 350 355

360 365 370

375 380 385

390 395 400

405 410 415 420

GAA AGA CAT AAC CTT ATA GCT GAG CTA AGG AGA AAC AAG GCT TAC AGA 1652 Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala Tyr Arg

425 430 435

440 445 450

TCC ATT ACC CAT TTG GGC TTC ATT CTC TAAGACCCCT TGGCCTTTAG 1747

Ser Ile Thr His Leu Gly Phe Ile Leu

455 460

GAAGGTATGT ATCAGCCATG ACCACCCCGG CTCGTATGTC ACCTGTAGAT TTCCACACGC 1807

CAAGCTCCCC TAAATCGCCC CCTTCGGAAA TGTCTCCACC CGTGTCCAGC ATGACGGTGT 1867

CCATGCCCTC TGTGGCAGTC AGCCCCTTTG TGGAAGAAGA GAGGCCTCTG CTGCTTGTGA 1927

CGCCACCAAG GCTACGGGAG AAGAAATATG ATCATCACCC CCAGCAACTC AACTCCTTTC 1987

ATCACAACCC TGCACATCAG AGTACCAGCC TCCCCCCTAG CCCACTGAGG ATAGTGGAGG 2047

ATGAGGAGTA CGAGACGACC CAGGAGTATG AGTCAGTTCA AGAGCCCGTT AAGAAAGTCA 2107

CCAATAGCCG GCGGGCCAAA AGAACCAAGC CCAATGGCCA CATTGCCAAT AGGTTGGAAA 2167

TGGACAGCAA CACAAGTTCT GTGAGCAGTA ACTCAGAAAG TGAGACAGAA GACGAAAGAG 2227

TAGGTGAAGA CACACCATTC CTGGGCATAC AGAACCCCCT GGCAGCCAGC CTTGAGGTGG 2287

CCCCTGCCTT CCGTCTGGCT GAGAGCAGGA CTAACCCAGC AGGCCGCTTC TCCACACAGG 2347

AGGAATTACA GGCCAGGCTG TCTAGTGTAA TCGCTAACCA AGACCCTATT GCTGTATAAA 2407

ACCTAAATAA ACACATAGAT TCACCTGTAA AACTTTATTT TATATAATAA AGTATTTCAC 2467

CTTAAAAAAA AAAAAAAAGG ACGGCCGCTC GCGATCTAGA ACTAGTCCGG ACGCGTGGGT 2527

CGAC 2531

(40) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 461 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn 210 215 220

Val Pro Met Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln 225 230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val

245 250 255

Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln

260 265 270

Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn

275 280 285

Leu Val Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu 290 295 300

Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser 305 310 315 320

Glu His Ile Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His

325 330 335

Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser

340 345 350

His Ser Trp Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn

355 360 365

Ser Val Ile Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro 370 375 380

Ala Gly Gly Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp 385 390 395 400

Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp

405 410 415

Ser Pro His Ser Glu Arg His Asn Leu Ile Ala Glu Leu Arg Arg Asn

420 425 430

Lys Ala Tyr Arg Ser Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His

435 440 445

Leu Arg Pro Ser Ser Ile Thr His Leu Gly Phe Ile Leu

450 455 460

(41) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3344 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..2252

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

25 30 35

40 45 50

55 60 65

AAA TGG TTC AAG AAT GGG AAC GAG CTG AAC CGC AAA AAT AAA CCA GAA 596

Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys Asn Lys Pro Glu

70 75 80

85 90 95 100

105 110 115

120 125 130

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

CCA AAT GAG TTT ACT GGT GAT CGT TGC CAA AAC TAC GTA ATG GCC AGC 1028 Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser

215 220 225

TTC TAC AAA GCG GAG GAA CTC TAC CAG AAG AGG GTG CTG ACA ATT ACT 1076 Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr

230 235 240

GGC ATC TGT ATC GCC CTG CTG GTG GTC GGC ATC ATG TGT GTG GTG GCC 1124 Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala

245 250 255 260

TAC TGC AAA ACC AAG AAG CAG CGG CAG AAG CTT CAT GAT CGG CTT CGG 1172 Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His Asp Arg Leu Arg

265 270 275

CAG AGT CTT CGG TCA GAA CGG AGC AAC CTG GTG AAC ATA GCG AAT GGG 1220

Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile Ala Asn Gly

280 285 290

CCT CAC CAC CCA AAC CCA CCG CCA GAG AAC GTG CAG CTG GTG AAT CAA 1268 Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln

295 300 305

TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA 1316 Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu

310 315 320

GTG GAG ACT TCC TTT TCC ACC AGT CAT TAC ACT TCC ACA GCC CAT CAC 1364 Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His

325 330 335 340

TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG AGT AAT GGG CAC 1412 Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His

345 350 355

ACG GAG AGC GTC ATT TCA GAA AGC AAC TCC GTA ATC ATG ATG TCT TCG 1460 Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile Met Met Ser Ser

360 365 370

GTA GAG AAC AGC AGG CAC AGC AGT CCC GCC GGG GGC CCA CGA GGA CGT 1508 Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro Arg Gly Arg

375 380 385

CTT CAT GGC CTG GGA GGC CCT CGT GAT AAC AGC TTC CTC AGG CAT GCC 1556 Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu Arg His Ala

390 395 400

AGA GAA ACC CCT GAC TCC TAC AGA GAC TCT CCT CAT AGC GAA AGG TAT 1604 Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg Tyr

405 410 415 420

GTA TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT GTA GAT TTC CAC 1652 Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp Phe His

425 430 435

ACG CCA AGC TCC CCT AAA TCG CCC CCT TCG GAA ATG TCT CCA CCC GTG 1700 Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val

440 445 450

TCC AGC ATG ACG GTG TCC ATG CCC TCT GTG GCA GTC AGC CCC TTT GTG 1748 Ser Ser Met Thr Val Ser Met Pro Ser Val Ala Val Ser Pro Phe Val

455 460 465

GAA GAA GAG AGG CCT CTG CTG CTT GTG ACG CCA CCA AGG CTA CGG GAG 1796 Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu

470 475 480

AAG AAA TAT GAT CAT CAC CCC CAG CAA CTC AAC TCC TTT CAT CAC AAC 1844

Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser Phe His His Asn

485 490 495 500

CCT GCA CAT CAG AGT ACC AGC CTC CCC CCT AGC CCA CTG AGG ATA GTG 1892 Pro Ala His Gln Ser Thr Ser Leu Pro Pro Ser Pro Leu Arg Ile Val

505 510 515

GAG GAT GAG GAG TAC GAG ACG ACC CAG GAG TAT GAG TCA GTT CAA GAG 1940 Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Ser Val Gln Glu

520 525 530

CCC GTT AAG AAA GTC ACC AAT AGC CGG CGG GCC AAA AGA ACC AAG CCC 1988 Pro Val Lys Lys Val Thr Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro

535 540 545

AAT GGC CAC ATT GCC AAT AGG TTG GAA ATG GAC AGC AAC ACA AGT TCT 2036 Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser Asn Thr Ser Ser

550 555 560

GTG AGC AGT AAC TCA GAA AGT GAG ACA GAA GAC GAA AGA GTA GGT GAA 2084 Val Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu

565 570 575 580

GAC ACA CCA TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC AGC CTT GAG 2132 Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu

585 590 595

GTG GCC CCT GCC TTC CGT CTG GCT GAG AGC AGG ACT AAC CCA GCA GGC 2180 Val Ala Pro Ala Phe Arg Leu Ala Glu Ser Arg Thr Asn Pro Ala Gly

600 605 610

CGC TTC TCC ACA CAG GAG GAA TTA CAG GCC AGG CTG TCT AGT GTA ATC 2228 Arg Phe Ser Thr Gln Glu Glu Leu Gln Ala Arg Leu Ser Ser Val Ile

615 620 625

GCT AAC CAA GAC CCT ATT GCT GTA TAAAACCTAA ATAAACACAT AGATTCACCT 2282 Ala Asn Gln Asp Pro Ile Ala Val

630 635

GTAAAACTTT ATTTTATATA ATAAAGTATT TCACCTTAAA TTAAACAATT TATTTTATTT 2342

TAGCAGTTCT GCAAATAGAA AACAGGAAGA AAAAAAAACT TTTATAAATT AAATATATGT 2402

ATGTAAAAAT GTGTTATGTG CCATATGTAG CAATTTTTTT ACAGTATTTC AAAAACGAGA 2462

AAGATATCAA TGGTGCCTTT ATGTTCTGTT ATGTCGAGAG CAAGTTTTAT AAAGTTATGG 2522

TGATTTCTTT TTCACAGTAT TTCAGCAAAA CCTCCCATAT ATTCAGTTTC TGCTGGCTTT 2582

TTGTGCATTG CATTATGATG TTGACTGGAT GTATGGTTTG CAAGGCTAGC AGCTCGCTCG 2642

TGTTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTGTCTCT CTCTCTGTCT CTCTCTCTCT 2702

CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTGTCTCTCT CTCTGCTTCC 2762

CGTAGCTCCC AACCAGTACT GTCTTGGACT GGCACATCCA TCCAAATACC TTTCTACTTT 2822

GTATGAAGTT TTCTTTGCTT TCCCAATATG AAATGAGTTC TCTCTACTCT GTCAGCCAAA 2882

GGTTTGCTTC ACTGGACTCT GAGATAATAG TAGACCCAGC AGCATGCTAC TATTACGTAT 2942

AGCAGGAAAC TGCACCAAGT AATGTCCAAT AATAGGAAGA AAGTAATACT GTGATTTAAA 3002

AAAAAAAACA AACTATATTA TTAATCAGAA GACAGCTTGC TCTTGGTAAA AGGAGCTACC 3062

ATTGACTCTA ATTTTGACTT TTTAGTTATT GTTCTTGACA AAGAGTAACA GCTTCAAGTA 3122

CAGCCTAGAA AAAAAAATGG GTTCTGGCCT GCTATCAGGA TAAATCTATC GACGTAGATA 3182

GATTCAACTC AGTTTCACTT TCTGTCTTGG GGGAAATGAT CCAGCCACTC ATATGACGAC 3242

CAACCAACCA CAGGTGCCTC TGCTCCCTGT AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3302

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAGGGCGGCC GC 3344

(42) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 636 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu 130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr 145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr 210 215 220

Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val 225 230 235 240

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met

245 250 255

Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His

260 265 270

Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn

275 280 285

Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln 290 295 300

Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile 305 310 315 320

Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser

325 330 335

Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp

340 345 350

Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile

355 360 365

Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly

370 375 380

Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe 385 390 395 400

Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

405 410 415

Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro

420 425 430

Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met

435 440 445

Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Val Ala Val

450 455 460

Ser Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro

465 470 475 480

Arg Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser

485 490 495

Phe His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro Pro Ser Pro

500 505 510

Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu

515 520 525

Ser Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg Arg Ala Lys

530 535 540

Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser

545 550 555 560

Asn Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu

565 570 575

Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala

580 585 590

Ala Ser Leu Glu Val Ala Pro Ala Phe Arg Leu Ala Glu Ser Arg Thr

595 600 605

Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Leu Gln Ala Arg Leu

610 615 620

Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val

625 630 635

(43) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2356 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..2261

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

25 30 35

40 45 50

55 60 65

70 75 80

85 90 95 100

AAA GCA TCC CTG GCT GAC TCT GGA GAG TAT ATG TGC AAA GTG ATC AGC 692

Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys Val Ile Ser

105 110 115

120 125 130

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

215 220 225

230 235 240

245 250 255 260

265 270 275

280 285 290

295 300 305

GTG AAT CAA TAC GTA TCT AAA AAC GTC ATC TCC AGT GAG CAT ATT GTT 1316

Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile Val

310 315 320

325 330 335 340

345 350 355

360 365 370

375 380 385

390 395 400

405 410 415 420

GAA AGG TAT GTA TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT GTA 1652

Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val

425 430 435

GAT TTC CAC ACG CCA AGC TCC CCT AAA TCG CCC CCT TCG GAA ATG TCT 1700

Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser

440 445 450

CCA CCC GTG TCC AGC ATG ACG GTG TCC ATG CCC TCT GTG GCA GTC AGC 1748

Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Val Ala Val Ser

455 460 465

CCC TTT GTG GAA GAA GAG AGG CCT CTG CTG CTT GTG ACG CCA CCA AGG 1796

Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg

470 475 480

CTA CGG GAG AAG AAA TAT GAT CAT CAC CCC CAG CAA CTC AAC TCC TTT 1844

Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser Phe

485 490 495 500

CAT CAC AAC CCT GCA CAT CAG AGT ACC AGC CTC CCC CCT AGC CCA CTG 1892

His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro Pro Ser Pro Leu

505 510 515

AGG ATA GTG GAG GAT GAG GAG TAC GAG ACG ACC CAG GAG TAT GAG TCA 1940

Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Ser

520 525 530

GTT CAA GAG CCC GTT AAG AAA GTC ACC AAT AGC CGG CGG GCC AAA AGA 1988

Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg Arg Ala Lys Arg

535 540 545

ACC AAG CCC AAT GGC CAC ATT GCC AAT AGG TTG GAA ATG GAC AGC AAC 2036

Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser Asn

550 555 560

ACA AGT TCT GTG AGC AGT AAC TCA GAA AGT GAG ACA GAA GAC GAA AGA 2084

Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg

565 570 575 580

GTA GGT GAA GAC ACA CCA TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC 2132

Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala

585 590 595

AGC CTT GAG GTG GCC CCT GCC TTC CGT CTG GCT GAG AGC AGG ACT AAC 2180

Ser Leu Glu Val Ala Pro Ala Phe Arg Leu Ala Glu Ser Arg Thr Asn

600 605 610

CCA GCA GGC CGC TTC TCC ACA CAG GAG GAA TTA CAG GCC AGG CTG TCT 2228

Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Leu Gln Ala Arg Leu Ser

615 620 625

AGT GTA ATC GCT AAC CAA GAC CCT ATT GCT GTA TAAAACCTAA ATAAACACAT 2281

Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val

630 635

AGATTCACCT GTAAAACTTT ATTTTATATA ATAAAGTATT TCACCTTAAA AAAAAAAAAA 2341

AAAAAAGGGG GCCGC 2356

(44) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 639 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn

210 215 220

Val Pro Met Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln

225 230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val

245 250 255

Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln

260 265 270

Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn

275 280 285

Leu Val Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu

290 295 300

Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser

305 310 315 320

Glu His Ile Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His

325 330 335

Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser

340 345 350

His Ser Trp Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn

355 360 365

Ser Val Ile Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro

370 375 380

Ala Gly Gly Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp

385 390 395 400

Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp

405 410 415

Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg

420 425 430

Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro

435 440 445

Ser Glu Met Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser

450 455 460

Val Ala Val Ser Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val

465 470 475 480

Thr Pro Pro Arg Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln

485 490 495

Leu Asn Ser Phe His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro

500 505 510

Pro Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln

515 520 525

Glu Tyr Glu Ser Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg

530 535 540

Arg Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu

545 550 555 560

Met Asp Ser Asn Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr

565 570 575

Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn

580 585 590

Pro Leu Ala Ala Ser Leu Glu Val Ala Pro Ala Phe Arg Leu Ala Glu

595 600 605

Ser Arg Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Leu Gln

610 615 620

Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val

625 630 635

(45) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..1238

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

25 30 35

CCC AGA TTG AAA GAA ATG AAG AGC CAG GAG TCA GCT GCA GGC TCC AAG 500

Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala Ala Gly Ser Lys

40 45 50

55 60 65

70 75 80

85 90 95 100

105 110 115

120 125 130

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

215 220 225

TTC TAC AAA GCG GAG GAA CTC TAC CAG AAG AGG GTG CTG ACA ATT ACT 1076

Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr

230 235 240

245 250 255 260

265 270 275

CAG AGT CTT CGG TCA GAA CGG AGC AAC CTG GTG AAC ATA GCG AAT GGG 1220

Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile Ala Asn Gly

280 285 290

CCT CAC CAC CCA AAC CCA CGCGTCCGGA CGCGTGGGTC GAC 1261

Pro His His Pro Asn Pro

295

(46) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr

210 215 220

Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val

225 230 235 240

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met

245 250 255

Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His

260 265 270

Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn

275 280 285

Ile Ala Asn Gly Pro His His Pro Asn Pro

290 295

(47) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2743 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..2252

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

25 30 35

40 45 50

55 60 65

70 75 80

85 90 95 100

105 110 115

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

215 220 225

230 235 240

245 250 255 260

265 270 275

280 285 290

295 300 305

310 315 320

325 330 335 340

TCC ACG ACT GTC ACC CAG ACT CCT AGT CAC AGC TGG AGT AAT GGG CAC 1412

Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His

345 350 355

360 365 370

375 380 385

390 395 400

405 410 415 420

425 430 435

ACG CCA AGC TCC CCT AAA TCG CCC GCT TCG GAA ATG TCT CCA CCC GTG 1700

Thr Pro Ser Ser Pro Lys Ser Pro Ala Ser Glu Met Ser Pro Pro Val

440 445 450

455 460 465

470 475 480

AAG AAA TAT GAT CAT CAC CCC CAG CAA CTC AAC TCC TTT CAT CAC AAC 1844

Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser Phe His His Asn

485 490 495 500

505 510 515

520 525 530

535 540 545

AAT GGC CAC ATT GCC AAT AGG TTG GAA ATG GAC AGC AAC ACA AGT TCT 2036

Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser Asn Thr Ser Ser

550 555 560

565 570 575 580

585 590 595

600 605 610

615 620 625

630 635

GTAAAACTTT ATTTTATATA ATAAAGTATT TCACCTTAAA TTAAACAATT TATTTTATTT 2342

TAGCAGTTCT GCAAATAGAA AACAGGAAGA AAAAAAAACT TTTATAAATT AAATATATGT 2402

ATGTAAAAAT GTGTTATGTG CCATATGTAG CAATTTTTTA CAGTATTTCA AAAACGAGAA 2462

AGATATCAAT GGTGCCTTTA TGTTCTGTTA TGTCGAGAGC AAGTTTTATA AAGTTATGGT 2522

GATTTCTTTT TCACAGTATT TCAGCAAAAC CTCCCATATA TTCAGTTTCT GCTGGCTTTT 2582

TGTGCATTGC ATTATGATGT TGACTGGATG TATGGTTTGC AAGGCTAGCA GCTCGCTCGT 2642

GTTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTGTCTCTC TCTCTGTCTC TCTCTCTCTC 2702

TCTCTCTCTC TCTCTCTCTC TCTCCGGACG CGTGGGTCGA C 2743

(48) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 636 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr

210 215 220

Val Met Ala Ser Phe Tyr Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val

225 230 235 240

Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val Gly Ile Met

245 250 255

Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln Lys Leu His

260 265 270

Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn Leu Val Asn

275 280 285

Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu Asn Val Gln

290 295 300

Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser Glu His Ile

305 310 315 320

Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His Tyr Thr Ser

325 330 335

Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser His Ser Trp

340 345 350

Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn Ser Val Ile

355 360 365

Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly

370 375 380

Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe

385 390 395 400

Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His

405 410 415

Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro

420 425 430

Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Ala Ser Glu Met

435 440 445

Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Val Ala Val

450 455 460

Ser Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro

465 470 475 480

Arg Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser

485 490 495

Phe His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro Pro Ser Pro

500 505 510

Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu

515 520 525

Ser Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg Arg Ala Lys

530 535 540

Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser

545 550 555 560

Asn Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu

565 570 575

Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala

580 585 590

Ala Ser Leu Glu Val Ala Pro Ala Phe Arg Leu Ala Glu Ser Arg Thr

595 600 605

Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Leu Gln Ala Arg Leu

610 615 620

Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val

625 630 635

(49) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2430 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..2330

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

40 45 50

55 60 65

70 75 80

85 90 95 100

105 110 115

120 125 130

135 140 145

150 155 160

165 170 175 180

185 190 195

200 205 210

215 220 225

TTC TAC ATG ACT TCT AGG AGG AAA AGG CAA GAA ACA GAG AAG CCT CTA 1076

Phe Tyr Met Thr Ser Arg Arg Lys Arg Gln Glu Thr Glu Lys Pro Leu

230 235 240

GAA AGA AAA TTG GAT CAT AGC CTT GTG AAA GAA TCG AAA GCG GAG GAA 1124

Glu Arg Lys Leu Asp His Ser Leu Val Lys Glu Ser Lys Ala Glu Glu

245 250 255 260

CTC TAC CAG AAG AGG GTG CTG ACA ATT ACT GGC ATC TGT ATC GCC CTG 1172

Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu

265 270 275

CTG GTG GTC GGC ATC ATG TGT GTG GTG GCC TAC TGC AAA ACC AAG AAG 1220

Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys

280 285 290

CAG CGG CAG AAG CTT CAT GAT CGG CTT CGG CAG AGT CTT CGG TCA GAA 1268

Gln Arg Gln Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu

295 300 305

CGG AGC AAC CTG GTG AAC ATA GCG AAT GGG CCT CAC CAC CCA AAC CCA 1316

Arg Ser Asn Leu Val Asn Ile Ala Asn Gly Pro His His Pro Asn Pro

310 315 320

CCG CCA GAG AAC GTG CAG CTG GTG AAT CAA TAC GTA TCT AAA AAC GTC 1364

Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val

325 330 335 340

ATC TCC AGT GAG CAT ATT GTT GAG AGA GAA GTG GAG ACT TCC TTT TCC 1412

Ile Ser Ser Glu His Ile Val Glu Arg Glu Val Glu Thr Ser Phe Ser

345 350 355

ACC AGT CAT TAC ACT TCC ACA GCC CAT CAC TCC ACG ACT GTC ACC CAG 1460

Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln

360 365 370

ACT CCT AGT CAC AGC TGG AGT AAT GGG CAC ACG GAG AGC GTC ATT TCA 1508

Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Val Ile Ser

375 380 385

GAA AGC AAC TCC GTA ATC ATG ATG TCT TCG GTA GAG AAC AGC AGG CAC 1556

Glu Ser Asn Ser Val Ile Met Met Ser Ser Val Glu Asn Ser Arg His

390 395 400

AGC AGT CCC GCC GGG GGC CCA CGA GGA CGT CTT CAT GGC CTG GGA GGC 1604

Ser Ser Pro Ala Gly Gly Pro Arg Gly Arg Leu His Gly Leu Gly Gly

405 410 415 420

CCT CGT GAT AAC AGC TTC CTC AGG CAT GCC AGA GAA ACC CCT GAC TCC 1652

Pro Arg Asp Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser

425 430 435

TAC AGA GAC TCT CCT CAT AGC GAA AGG TAT GTA TCA GCC ATG ACC ACC 1700

Tyr Arg Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr

440 445 450

CCG GCT CGT ATG TCA CCT GTA GAT TTC CAC ACG CCA AGC TCC CCT AAA 1748

Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro Lys

455 460 465

TCG CCC CCT TCG GAA ATG TCT CCA CCC GTG TCC AGC ATG ACG GTG TCC 1796

Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser Met Thr Val Ser

470 475 480

ATG CCC TCT GTG GCA GTC AGC CCC TTT GTG GAA GAA GAG AGG CCT CTG 1844

Met Pro Ser Val Ala Val Ser Pro Phe Val Glu Glu Glu Arg Pro Leu

485 490 495 500

CTG CTT GTG ACG CCA CCA AGG CTA CGG GAG AAG AAA TAT GAT CAT CAC 1892

Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys Lys Tyr Asp His His

505 510 515

CCC CAG CAA CTC AAC TCC TTT CAT CAC AAC CCT GCA CAT CAG AGT ACC 1940

Pro Gln Gln Leu Asn Ser Phe His His Asn Pro Ala His Gln Ser Thr

520 525 530

AGC CTC CCC CCT AGC CCA CTG AGG ATA GTG GAG GAT GAG GAG TAC GAG 1988

Ser Leu Pro Pro Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu

535 540 545

ACG ACC CAG GAG TAT GAG TCA GTT CAA GAG CCC GTT AAG AAA GTC ACC 2036

Thr Thr Gln Glu Tyr Glu Ser Val Gln Glu Pro Val Lys Lys Val Thr

550 555 560

AAT AGC CGG CGG GCC AAA AGA ACC AAG CCC AAT GGC CAC ATT GCC AAT 2084

Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn

565 570 575 580

AGG TTG GAA ATG GAC AGC AAC ACA AGT TCT GTG AGC AGT AAC TCA GAA 2132

Arg Leu Glu Met Asp Ser Asn Thr Ser Ser Val Ser Ser Asn Ser Glu

585 590 595

AGT GAG ACA GAA GAC GAA AGA GTA GGT GAA GAC ACA CCA TTC CTG GGC 2180

Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe Leu Gly

600 605 610

ATA CAG AAC CCC CTG GCA GCC AGC CTT GAG GTG GCC CCT GCC TTC CGT 2228

Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Val Ala Pro Ala Phe Arg

615 620 625

CTG GCT GAG AGC AGG ACT AAC CCA GCA GGC CGC TTC TCC ACA CAG GAG 2276

Leu Ala Glu Ser Arg Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu

630 635 640

GAA TTA CAG GCC AGG CTG TCT AGT GTA ATC GCT AAC CAA GAC CCT ATT 2324

Glu Leu Gln Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro Ile

645 650 655 660

GCT GTA TAAAACCTAA ATAAACACAT AGATTCACCT GTAAAACTTT ATTTTATATA 2380

Ala Val

ATAAAGTATT TCACCTTAAA TTAAAAAAAA AAAAAAAAAA GGACGGCCGC 2430

(50) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 662 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr

210 215 220

Val Met Ala Ser Phe Tyr Met Thr Ser Arg Arg Lys Arg Gln Glu Thr

225 230 235 240

Glu Lys Pro Leu Glu Arg Lys Leu Asp His Ser Leu Val Lys Glu Ser

245 250 255

Lys Ala Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile

260 265 270

Cys Ile Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys

275 280 285

Lys Thr Lys Lys Gln Arg Gln Lys Leu His Asp Arg Leu Arg Gln Ser

290 295 300

Leu Arg Ser Glu Arg Ser Asn Leu Val Asn Ile Ala Asn Gly Pro His

305 310 315 320

His Pro Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val

325 330 335

Ser Lys Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Val Glu

340 345 350

Thr Ser Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr

355 360 365

Thr Val Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu

370 375 380

Ser Val Ile Ser Glu Ser Asn Ser Val Ile Met Met Ser Ser Val Glu

385 390 395 400

Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro Arg Gly Arg Leu His

405 410 415

Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu Arg His Ala Arg Glu

420 425 430

Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg Tyr Val Ser

435 440 445

Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro

450 455 460

Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser

465 470 475 480

Met Thr Val Ser Met Pro Ser Val Ala Val Ser Pro Phe Val Glu Glu

485 490 495

Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys Lys

500 505 510

Tyr Asp His His Pro Gln Gln Leu Asn Ser Phe His His Asn Pro Ala

515 520 525

His Gln Ser Thr Ser Leu Pro Pro Ser Pro Leu Arg Ile Val Glu Asp

530 535 540

Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Ser Val Gln Glu Pro Val

545 550 555 560

Lys Lys Val Thr Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly

565 570 575

His Ile Ala Asn Arg Leu Glu Met Asp Ser Asn Thr Ser Ser Val Ser

580 585 590

Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr

595 600 605

Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Val Ala

610 615 620

Pro Ala Phe Arg Leu Ala Glu Ser Arg Thr Asn Pro Ala Gly Arg Phe

625 630 635 640

Ser Thr Gln Glu Glu Leu Gln Ala Arg Leu Ser Ser Val Ile Ala Asn

645 650 655

Gln Asp Pro Ile Ala Val

660

(51) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3161 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 345..2261

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GCGGCCGCAG CTGCCGGGAG ATGCGAGCGC AGACCGGATT GTGATCACCT TTCCCTCTTC 60

GGGCTGTAAG AGAGCGAGAC AAGCCACCGA AGCGAGGCCA CTCCAGAGCC GGCAGCGGAG 120

GGACCCGGGA CACTAGAGCA GCTCCGAGCC ACTCCAGACT GAGCGGACGC TCCAGGTGAT 180

CGAGTCCACG CTGCTTCCTG CAGGCGACAG GCGACGCCTC CCGAGCAGCC CGGCCACTGG 240

CTCTTCCCCT CCTGGGACAA ACTTTTCTGC AAGCCCTTGG ACCAAACTTG TCGCGCGTCA 300

CCGTCACCCA ACCGGGTCCG CGTAGAGCGC TCATCTTCGG CGAG ATG TCT GAG CGC 356

Met Ser Glu Arg

1

5 10 15 20

25 30 35

40 45 50

55 60 65

70 75 80

AAC ATC AAG ATA CAG AAG AAG CCA GGG AAG TCA GAG CTT CGA ATT AAC 644

Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu Leu Arg Ile Asn

85 90 95 100

105 110 115

120 125 130

135 140 145

150 155 160

ACT TCT TCA TCC ACA TCA ACA TCC ACG ACT GGG ACC AGC CAT CTC ATA 884

Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr Ser His Leu Ile

165 170 175 180

185 190 195

200 205 210

215 220 225

230 235 240

245 250 255 260

265 270 275

280 285 290

295 300 305

310 315 320

325 330 335 340

345 350 355

360 365 370

ATG TCT TCG GTA GAG AAC AGC AGG CAC AGC AGT CCC GCC GGG GGC CCA 1508

Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro Ala Gly Gly Pro

375 380 385

CGA GGA CGT CTT CAT GGC CTG GGA GGC CCT CGT GAT AAC AGC TTC CTC 1556

Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp Asn Ser Phe Leu

390 395 400

AGG CAT GCC AGA GAA ACC CCT GAC TCC TAC AGA GAC TCT CCT CAT AGC 1604

Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp Ser Pro His Ser

405 410 415 420

GAA AGG TAT GTA TCA GCC ATG ACC ACC CCG GCT CGT ATG TCA CCT GTA 1652

Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val

425 430 435

GAT TTC CAC ACG CCA AGC TCC CCT AAA TCG CCC CCT TCG GAA ATG TCT 1700

Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser

440 445 450

CCA CCC GTG TCC AGC ATG ACG GTG TCC ATG CCC TCT GTG GCA GTC AGC 1748

Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Val Ala Val Ser

455 460 465

CCC TTT GTG GAA GAA GAG AGG CCT CTG CTG CTT GTG ACG CCA CCA AGG 1796

Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg

470 475 480

CTA CGG GAG AAG AAA TAT GAT CAT CAC CCC CAG CAA CTC AAC TCC TTT 1844

Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln Leu Asn Ser Phe

485 490 495 500

CAT CAC AAC CCT GCA CAT CAG AGT ACC AGC CTC CCC CCT AGC CCA CTG 1892

His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro Pro Ser Pro Leu

505 510 515

AGG ATA GTG GAG GAT GAG GAG TAC GAG ACG ACC CAG GAG TAT GAG TCA 1940

Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Ser

520 525 530

GTT CAA GAG CCC GTT AAG AAA GTC ACC AAT AGC CGG CGG GCC AAA AGA 1988

Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg Arg Ala Lys Arg

535 540 545

ACC AAG CCC AAT GGC CAC ATT GCC AAT AGG TTG GAA ATG GAC AGC AAC 2036

Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu Met Asp Ser Asn

550 555 560

ACA AGT TCT GTG AGC AGT AAC TCA GAA AGT GAG ACA GAA GAC GAA AGA 2084

Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg

565 570 575 580

GTA GGT GAA GAC ACA CCA TTC CTG GGC ATA CAG AAC CCC CTG GCA GCC 2132

Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala

585 590 595

600 605 610

615 620 625

630 635

AGATTCACCT GTAAAACTTT ATTTTATATA ATAAAGTATT TCACCTTAAA TTAAACAATT 2341

TATTTTATTT TAGCAGTTCT GCAAATAGAA AACAGGAAGA AAAAAAAACT TTTATAAATT 2401

AAATATATGT ATGTAAAAAT GTGTTATGTG CCATATGTAG CAATTTTTTA CAGTATTTCA 2461

AAAACGAGAA AGATATCAAT GGTGCCTTTA TGTTCTGTTA TGTCGAGAGC AAGTTTTATA 2521

AAGTTATGGT GATTTCTTTT TCACAGTATT TCAGCAAAAC CTCCCATATA TTCAGTTTCT 2581

GCTGGCTTTT TGTGCATTGC ATTATGATGT TGACTGGATG TATGGTTTGC AAGGCTAGCA 2641

GCTCGCTCGT GTTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTGTCTCTC TCTCTGTCTC 2701

TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TGTCTCTCTC 2761

TCTGCTTCCC GTAGCTCCCA ACCAGTACTG TCTTGGACTG GCACATCCAT CCAAATACCT 2821

TTCTACTTTG TATGAAGTTT TCTTTGCTTT CCCAATATGA AATGAGTTCT CTCTACTCTG 2881

TCAGCCAAAG GTTTGCTTCA CTGGACTCTG AGATAATAGT AGACCCAGCA GCATGCTACT 2941

ATTACGTATA GCAGGAAACT GCACCAAGTA ATGTCCAATA ATAGGAAGAA AGTAATACTG 3001

TGATTTAAAA AAAAAAACAA ACTATATTAT TAATCAGAAG ACAGCTTGCT CTTGGTAAAA 3061

GGAGCTACCA TTGACTCTAA TTTTGACTTT TTAGTTATTG TTCTTGACAA AGAGTAACAG 3121

CTTCAAGTAC AGCCTAAAAA AAAAAAAAGG GCGGCCGCCC 3161

(52) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 639 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys

1 5 10 15

Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

20 25 30

Pro Ala Leu Pro Pro Arg Leu Lys Glu Met Lys Ser Gln Glu Ser Ala

35 40 45

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser

50 55 60

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys

65 70 75 80

Asn Lys Pro Glu Asn Ile Lys Ile Gln Lys Lys Pro Gly Lys Ser Glu

85 90 95

Leu Arg Ile Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys

100 105 110

Lys Val Ile Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr

115 120 125

Ile Val Glu Ser Asn Glu Phe Ile Thr Gly Met Pro Ala Ser Thr Glu

130 135 140

Thr Ala Tyr Val Ser Ser Glu Ser Pro Ile Arg Ile Ser Val Ser Thr

145 150 155 160

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr

165 170 175

Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn

180 185 190

Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr

195 200 205

Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn

210 215 220

Val Pro Met Lys Val Gln Thr Gln Glu Lys Ala Glu Glu Leu Tyr Gln

225 230 235 240

Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile Ala Leu Leu Val Val

245 250 255

Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr Lys Lys Gln Arg Gln

260 265 270

Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg Ser Glu Arg Ser Asn

275 280 285

Leu Val Asn Ile Ala Asn Gly Pro His His Pro Asn Pro Pro Pro Glu

290 295 300

Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys Asn Val Ile Ser Ser

305 310 315 320

Glu His Ile Val Glu Arg Glu Val Glu Thr Ser Phe Ser Thr Ser His

325 330 335

Tyr Thr Ser Thr Ala His His Ser Thr Thr Val Thr Gln Thr Pro Ser

340 345 350

His Ser Trp Ser Asn Gly His Thr Glu Ser Val Ile Ser Glu Ser Asn

355 360 365

Ser Val Ile Met Met Ser Ser Val Glu Asn Ser Arg His Ser Ser Pro

370 375 380

Ala Gly Gly Pro Arg Gly Arg Leu His Gly Leu Gly Gly Pro Arg Asp

385 390 395 400

Asn Ser Phe Leu Arg His Ala Arg Glu Thr Pro Asp Ser Tyr Arg Asp

405 410 415

Ser Pro His Ser Glu Arg Tyr Val Ser Ala Met Thr Thr Pro Ala Arg

420 425 430

Met Ser Pro Val Asp Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro

435 440 445

Ser Glu Met Ser Pro Pro Val Ser Ser Met Thr Val Ser Met Pro Ser

450 455 460

Val Ala Val Ser Pro Phe Val Glu Glu Glu Arg Pro Leu Leu Leu Val

465 470 475 480

Thr Pro Pro Arg Leu Arg Glu Lys Lys Tyr Asp His His Pro Gln Gln

485 490 495

Leu Asn Ser Phe His His Asn Pro Ala His Gln Ser Thr Ser Leu Pro

500 505 510

Pro Ser Pro Leu Arg Ile Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln

515 520 525

Glu Tyr Glu Ser Val Gln Glu Pro Val Lys Lys Val Thr Asn Ser Arg

530 535 540

Arg Ala Lys Arg Thr Lys Pro Asn Gly His Ile Ala Asn Arg Leu Glu

545 550 555 560

Met Asp Ser Asn Thr Ser Ser Val Ser Ser Asn Ser Glu Ser Glu Thr

565 570 575

Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe Leu Gly Ile Gln Asn

580 585 590

Pro Leu Ala Ala Ser Leu Glu Val Ala Pro Ala Phe Arg Leu Ala Glu

595 600 605

Ser Arg Thr Asn Pro Ala Gly Arg Phe Ser Thr Gln Glu Glu Leu Gln

610 615 620

Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp Pro Ile Ala Val

625 630 635

(53) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala

1 5 10 15

Ser Phe Tyr Met Thr Ser Arg Arg Lys Arg Gln Glu Thr Glu Lys

20 25 30

Pro Leu Glu Arg Lys Leu Asp His Ser Leu Val Lys Glu Ser Lys

35 40 45

(54) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala

1 5 10 15

Ser Phe Tyr Lys His Leu Gly Ile Glu Phe Met Glu

20 25

(55) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

His Asn Leu Ile Ala Glu Leu Arg Arg Asn Lys Ala Tyr Arg Ser

1 5 10 15

Lys Cys Met Gln Ile Gln Leu Ser Ala Thr His Leu Arg Pro Ser

20 25 30

Ser Ile Thr His Leu Gly Phe Ile Leu

35

(56) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 159 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 27..128

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

TCTAGATGAA GGACCTGTCA AACCCG TCA AGA TAC TTG TGC AAG TGC CCA AAT 53

Ser Arg Tyr Leu Cys Lys Cys Pro Asn

1 5

GAG TTT ACT GGT GAT CGT TGC CAA AAC TAC GTA ATG GCC AGC TTC TAC 101

Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser Phe Tyr

10 15 20 25

AAG TAT CTT GGG ATT GAA TTT ATG GAA GCGGAGGAAC TCTACCAGAA 148

Lys Tyr Leu Gly Ile Glu Phe Met Glu

30

GGGATCCCGC G 159
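The CDS feature of SEQ ID NO: 55 (LOCATION: 27..128) can be checked against the amino acid rows printed beneath it. The following is a minimal sketch, assuming the standard genetic code and treating the location as 1-based and inclusive; the names `CODON_TABLE`, `SEQ_55`, and `translate_cds` are illustrative and do not appear in the specification.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO: 55 as listed, with spacing and position counts removed.
SEQ_55 = (
    "TCTAGATGAAGGACCTGTCAAACCCG"
    "TCAAGATACTTGTGCAAGTGCCCAAAT"
    "GAGTTTACTGGTGATCGTTGCCAAAACTACGTAATGGCCAGCTTCTAC"
    "AAGTATCTTGGGATTGAATTTATGGAA"
    "GCGGAGGAACTCTACCAGAA"
    "GGGATCCCGCG"
)


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate the 1-based, inclusive region start..end into one-letter codes."""
    cds = dna[start - 1:end]
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


protein = translate_cds(SEQ_55, 27, 128)
print(len(SEQ_55), len(protein), protein)
```

Under these assumptions the translated region is 34 residues long, which agrees with the length stated for SEQ ID NO: 56.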

(57) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

Ser Arg Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys

1 5 10 15

Gln Asn Tyr Val Met Ala Ser Phe Tyr Lys Tyr Leu Gly Ile Glu Phe

20 25 30

Met Glu

(58) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

His Leu Gly Ile Glu Phe Ile Glu

1 5

(59) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Ser Ala Gln Met Ser Leu Leu Val Ile Ala Ala Lys Thr Thr

1 5 10

(60) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser

1 5 10 15

Phe Tyr Lys His Leu Gly Ile Glu Phe Met Glu

20 25

(61) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser

1 5 10 15

Phe Tyr Lys

(62) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala Ser

1 5 10 15

Phe Tyr Ser Thr Ser Thr Pro Phe Leu Ser Leu Pro Glu

20 25

(63) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 111 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp Phe

1 5 10 15

His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro

20 25 30

Val Ser Ser Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro Phe

35 40 45

Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg

50 55 60

Glu Lys Lys Phe Asp His His Pro Gln Gln Phe Ser Ser Phe His His

65 70 75 80

Asn Pro Ala His Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile

85 90 95

Val Glu Asp Glu Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala

100 105 110

(64) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 70 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Phe Tyr Val Ser Ala Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp

1 5 10 15

Phe His Thr Pro Ser Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro

20 25 30

Pro Val Ser Ser Met Thr Val Ser Met Pro Ser Met Ala Val Ser Pro

35 40 45

Phe Met Glu Glu Glu Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu

50 55 60

Arg Glu Lys Lys Phe Asp

65 70

(65) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

His His Pro Gln Gln Phe Ser Ser Phe His His Asn Pro Ala His Asp

1 5 10 15

Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu Glu

20 25 30

Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala Gln Glu Pro Val Lys Lys

35 40 45

Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His Ile

50 55 60

Ala Asn Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser Ser Asn

65 70 75 80

Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro Phe

85 90 95

Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Ala Thr Pro Ala

100 105 110

Phe Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly Arg Phe Ser Thr

115 120 125

Gln Glu Glu Ile Gln Ala Arg Leu Ser Ser Val Ile Ala Asn Gln Asp

130 135 140

Pro Ile Ala Val

145

(66) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Trp Phe Lys

1

(67) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Leu Val Leu Arg

1

(68) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser Pro Ala Leu Pro

1 5 10 15

Pro Arg

(69) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Cys Glu Thr Ser Ser Glu Tyr Ser Ser Leu Arg

1 5 10

(70) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys Lys

1 5 10

(71) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

Leu Gly Asn Asp Ser Ala Ser Ala Asn Ile Thr Ile Val Glu Ser

1 5 10 15

Asn Glu Phe Ile Ile Gly Met Pro Ala

20

(72) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala Glu Gly Asp Pro Ser

1 5 10 15

Pro Ala Leu Pro Pro

20

(73) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

Gly Glu Tyr Met Cys Lys

1 5

(74) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

ATAGGGAAGG GCGGGGGAAG GRTCNCCYTC NGCAGGGCCG GGCTTGCCTC 50

TGGAGCCTCT 60

(75) INFORMATION FOR SEQ ID NO:74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

YTTRCACATR TAYTCNCC 18
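SEQ ID NO: 74 uses nucleotide ambiguity symbols (Y, R, N), so the entry describes a pool of oligonucleotides rather than a single sequence. The sketch below, assuming the standard IUPAC meanings of those symbols, enumerates the pool and counts its members; `IUPAC`, `SEQ_74`, and `expand` are illustrative names, not terms from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# SEQ ID NO: 74 as listed, with spacing removed.
SEQ_74 = "YTTRCACATRTAYTCNCC"


def expand(oligo: str) -> list[str]:
    """Return every unambiguous sequence covered by a degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo))]


variants = expand(SEQ_74)
degeneracy = len(variants)
print(degeneracy)
```

With two Y positions, two R positions, and one N position, the pool size under these assumptions is 2 x 2 x 2 x 2 x 4 = 64.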

(76) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCGTCTAGAT GAAGGACCTG TCAAACCC 28

(77) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GCGGGATCCC TTCTGGTAGA GTTCCTCC 28

(78) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

TGAAGGACCT GTCAAACCC 19

(79) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

AGGAAATGAC AGTGCCTCT 19

(80) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

TCTCTGGCAT GCCTGAGG 18

(81) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

CGGTCTAGAA GCTTCCACCA TGTCTGAGCG CAAAGAA 37

(82) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

GCCGTCGACC TATTACCTTT CGCTATGAGG 30

(83) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

GCTCTAGAGG CTTCTCTGTT TCTTGCC 27

(84) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

GCTCTAGAAA GAAAATTGGA TCATAGCCTT GTG 33

(85) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GCCGTCGACC TATTATACAG CAATAGGGTC TTG 33

(86) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CGGTCTAGAA GCTTCCACCA TGTCCGAGCG CAAAGAA 37

(87) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

GCCGTCGACC TATTAGAGAA TGAAGCCCAA 30

(88) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

GCCGAAGACG GTCATGAAGC TTCTGCCGCT GTTTCTTGG 39

(89) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CCTTTCAAAC CCCTCGAGAT ACTTGTGCAA GTG 33

(90) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Asp Arg Gly

1 5 10 15

Ser Arg Gly

(91) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 141 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

CTAATTCCGC TCTCACCTAC CAAACAATGC CCCCCTGCAA AAAATAAATT 50

CATATAAAAA ACATACAGAT AACCATCTGC GGTGATAAAT TATCTCTGGC 100

GGTGTTGACA TAAATACCAC TGGCGGTGAT ACTGAGCACA T 141

(92) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 147 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

CGATGTGCTC AGTATCACCG CCAGTGGTAT TTATGTCAAC ACCGCCAGAG 50

ATAATTTATC ACCGCAGATG GTTATCTGTA TGTTTTTTAT ATGAATTTAT 100

TTTTTGCAGG GGGGCATTGT TTGGTAGGTG AGAGCGGAAT TAGACGT 147

(93) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 55 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

CGATTTGATT CTAGAAGGAG GAATAACATA TGGTTAACGC GTTGGAATTC 50

GGTAC 55

(94) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

CGAATTCCAA CGCGTTAACC ATATCTTATT CCTCCTTCTA GAATCAAAT 49

(95) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

GAGCTCACTA GTGTCGACCT GCAG 24
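SEQ ID NO: 94 is listed as double stranded, and the entry that follows it (SEQ ID NO: 95) has the same length. A short check, sketched here with illustrative names (`COMPLEMENT`, `reverse_complement`, `is_pair`) that are not part of the specification, tests whether the two listed strands are exact reverse complements of one another.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


# Sequences as listed, with spacing removed.
SEQ_94 = "GAGCTCACTAGTGTCGACCTGCAG"
SEQ_95 = "CTGCAGGTCGACACTAGTGAGCTC"

is_pair = reverse_complement(SEQ_94) == SEQ_95
print(is_pair)
```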

(96) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

CTGCAGGTCG ACACTAGTGA GCTC 24

(97) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

CACTGGCGGT GATAATGATC 20

(98) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

TTTGATCTAG AAGGAGGAAT AACATATGAA GAAGAAGGAG CGAGGCTCC 49

(99) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

CCGTTAGGAT CCTTACTTCT GGTACAGCTC CTCCGC 36

(100) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

AACACACTCG AGTTACTTCT GGTAGAGTTC CTCCGC 36

(101) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

TATGAAAAAA AAAGAACGTG GTTCTGGTAA AAAACCGGAA TCCGC 45

(102) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

GGATTCCGGT TTTTTACCAG AACCACGTTC TTTTTTTTTC A 41

(103) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

ACCACATCTA GAAGGAGGAA TAACATATGA GCCATCTTGT AAAATGTG 48

(104) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

ACCACAGGAT CCTTACTTCT GGTACAGCTC C 31

(105) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

ACACAAGGAT CCTTACTTCT GGTACAGCTC 30

(106) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

ACACAATCTA GAAGGAGGAA TAACATATGT CTCATCTTGT AAAATGTGCT 50

(107) INFORMATION FOR SEQ ID NO:106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

AGGAAATGAC AGTGCCTCTG C 21

(108) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

TATGGGGATC CATTACTTCT GGTACAGCTC CTCCG 35

(109) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 111 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

TGATTCTAGA AGGAGGAATA ACATATGAAR AARACYGCDA TYGCDGTWGC 50

DCTGGCDCTG GCDGGYTTYG CDACYGTWGC DCAGGCDAGC CATCTTGTAA 100

AATGTGCGGA G 111

(110) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

AGTATCTCGA GGGGTTTGAA AGG 23

(111) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

ACAAACTCTA GAAGGAGGAA TAACATATGT CTGAACGTAA AGAAGG 46

(112) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:

GTCTGAACGT AAAGAAGGTC GTGGTAAAGG GAAGGGCAAG AA 42

(113) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

ACAAACGGAT CCTTACTTCT GGTACAGCTC C 31

(114) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

CAAACCCCTC GAGATACTTG TG 22

(115) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

AGAAGGGATC CATTACGTAG TTTTGGCAGC GATCAC 36

(116) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

AAGAAGAAGG AGCGAGGCTC CGGCAAGAAG CCGGAG 36

(117) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:

AAAAAAAAAG AACGTGGTTC TGGTAAAAAA CCGGAA 36

(118) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

TCCGAGCGCA AAGAAGGCAG AGGC 24

(119) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

TCTGAACGTA AAGAAGGTCG TGGT 24

(120) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

ATGAAGAAGA CCGCGATTGC AATTGCGGTA GCGCTGGCGG GTTTTGCGAC 50

CGTTGCGCAG GCG 63

(121) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Met Lys Lys Lys Asp Arg Gly Ser Arg Gly Lys Pro Gly Pro Ala

1 5 10 15

Glu Gly Asp Pro Ser

20

(122) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys

1 5 10 15

Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu

20 25

(123) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Met Lys Lys Lys Glu Arg Gly Ser Gly Lys

1 5 10

(124) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val

1 5 10 15

(125) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

Met Lys Lys Lys Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala

1 5 10 15

Ala Gly Ser Gln Ser Pro Ala Leu Pro Pro

20 25

(126) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Met Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys

1 5 10 15

(127) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala

1 5 10 15

Ser Phe Tyr Lys

(128) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:

Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr Val Met Ala

1 5 10 15

Ser Phe Tyr Ser Thr Ser Thr Pro Phe Leu Ser Leu Pro Glu

20 25

Claims

WHAT IS CLAIMED:

1. A non-naturally-occurring polypeptide which possesses neu receptor stimulatory activity and comprises an amino acid sequence that includes one of the following two sequences:

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal

LysAspLeuSerAsnProSerArgTyrLeuCysLysCysGlnProGlyPheThr (α)

GlyAlaArgCys [SEQ ID NO: 1]

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal

LysAspLeuSerAsnProSerArgTyrLeuCysLysCysProAsnGluPheThr (β)

GlyAspArgCys [SEQ ID NO: 2]

or an analog thereof.
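The two sequences recited in Claim 1 are the same length and differ at only a few positions. As an illustration only, the sketch below splits each run of three-letter codes into residues and reports where the α and β sequences diverge; the names `ALPHA`, `BETA`, `residues`, and `differences` are hypothetical and carry no meaning in the claims.

```python
# Sequences as recited in Claim 1 (SEQ ID NO: 1 and SEQ ID NO: 2).
ALPHA = (
    "CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal"
    "LysAspLeuSerAsnProSerArgTyrLeuCysLysCysGlnProGlyPheThr"
    "GlyAlaArgCys"
)
BETA = (
    "CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal"
    "LysAspLeuSerAsnProSerArgTyrLeuCysLysCysProAsnGluPheThr"
    "GlyAspArgCys"
)


def residues(seq: str) -> list[str]:
    """Split a run of three-letter amino acid codes into a list of residues."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


alpha, beta = residues(ALPHA), residues(BETA)

# 1-based positions at which the two sequences differ.
differences = [
    (pos, a, b)
    for pos, (a, b) in enumerate(zip(alpha, beta), start=1)
    if a != b
]
print(len(alpha), len(beta), differences)
```

Both sequences contain 40 residues, and the comparison reports differences at positions 32, 33, 34, and 38.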

2. A polypeptide according to Claim 1 which is a product of procaryotic or eucaryotic expression of an exogenous DNA sequence.

3. A polypeptide according to Claim 2 which is a product of non-human mammalian cell expression.

4. A polypeptide according to Claim 3 in which the non-human mammalian cell is a Chinese hamster ovary cell.

5. A polypeptide according to Claim 2 which is a product of bacterial cell expression.

6. A polypeptide according to Claim 5 in which the bacterial cell is an E. coli cell.

7. A polypeptide according to Claim 2 which is a product of yeast cell expression.

8. A polypeptide according to Claim 2 which is a product of an exogenous DNA sequence which is a manufactured DNA sequence.

9. A polypeptide according to Claim 2 which is the expression product of an exogenous DNA sequence which is a complementary DNA sequence.

10. A polypeptide according to Claim 1 which has all or a portion of the amino acid sequence of Figure 31, 32, 34, 35, 36, or 37 [SEQ ID NOS: 6, 8, 10, 12, 14, or 16, respectively].

11. A polypeptide according to Claim 1 which has the amino acid sequence of Figure 32 [SEQ ID NO: 8] from amino acid position 1 to amino acid position 241.

12. A polypeptide according to Claim 1 comprising the amino acid sequence of Figure 32 [SEQ ID NO: 8] from amino acid position 14 to amino acid position 241.

13. A polypeptide according to Claim 12 which further includes a methionine residue at the amino terminus.

14. A polypeptide according to Claim 1 which has the amino acid sequence of proNDF-α1a in Figure 38 from amino acid position 14 to amino acid position 249.

15. A polypeptide according to Claim 1 which has the amino acid sequence of Figure 32 [SEQ ID NO: 8] from amino acid position 177 to amino acid position 241.

16. A polypeptide according to Claim 15 which further includes a methionine residue at the amino terminus.

17. A polypeptide according to Claim 1 which has the amino acid sequence of proNDF-β1a in Figure 38 from amino acid position 14 to amino acid position 246.

18. A polypeptide according to Claim 17 which further includes a methionine residue at the amino terminus.

19. A polypeptide according to Claim 1 which has the amino acid sequence of Figure 35 [SEQ ID NO: 14] from amino acid position 177 to amino acid position 246.

20. A polypeptide according to Claim 19 which further includes a methionine residue at the amino terminus.

21. A polypeptide according to Claim 1 which has the amino acid sequence of proNDF-α3 in Figure 38 from amino acid position 14 to amino acid position 247.

22. A polypeptide according to Claim 21 which further includes a methionine residue at the amino terminus.

23. A polypeptide according to Claim 1 which is the expression product of a DNA sequence encoding, in addition to one of the two amino acid sequences specified in Claim 1, the cytoplasmic domain of a rat neu differentiation factor.

24. An isolated DNA molecule comprising a nucleic acid sequence encoding a polypeptide which possesses neu receptor stimulatory activity and comprises an amino acid sequence that includes one of the following two sequences:

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal

LysAspLeuSerAsnProSerArgTyrLeuCysLysCysGlnProGlyPheThr (α)

GlyAlaArgCys [SEQ ID NO: 1]

CysAlaGluLysGluLysThrPheCysValAsnGlyGlyGluCysPheMetVal

LysAspLeuSerAsnProSerArgTyrLeuCysLysCysProAsnGluPheThr (β)

GlyAspArgCys [SEQ ID NO: 2]

or an analog thereof.

25. A DNA molecule according to Claim 24 which comprises a DNA sequence selected from among the following:

(a) the DNA sequences set out in Figures 31, 32, 34, 35, 36, and 37 [SEQ ID NOS: 5, 7, 11, 13, or 15, respectively], or their complementary strands;

(b) DNA sequences which hybridize to the DNA sequences defined in (a) or fragments thereof; and

26. A recombinant DNA molecule according to Claim 24 which is deposited with the ATCC under accession number 69302.

27. A recombinant DNA molecule according to Claim 24 which is deposited with the ATCC under accession number 69304.

28. A recombinant DNA molecule having the nucleotide sequence shown in Figure 30 [SEQ ID NO: 3].

29. A recombinant DNA molecule according to Claim 28 which is deposited with the ATCC under accession number 69305.

30. A recombinant DNA molecule according to Claim 24 which is deposited with the ATCC under accession number 69306.

31. A recombinant DNA molecule according to Claim 24 which is deposited with the ATCC under accession number 69307.

32. A recombinant DNA molecule according to Claim 24 which is deposited with the ATCC under accession number 69308.

33. A complementary DNA molecule according to Claim 24.

34. A manufactured DNA molecule according to Claim 24.

35. A DNA molecule according to Claim 24 which includes one or more codons preferred for expression in E. coli cells.

36. A DNA molecule according to Claim 24 which also encodes the cytoplasmic domain of a rat neu differentiation factor.

37. A biologically functional plasmid or viral DNA vector which includes a DNA molecule according to Claim 24.

38. A procaryotic or eucaryotic host cell stably transformed or transfected with a DNA vector according to Claim 37 in a manner allowing expression of the polypeptide encoded by the DNA molecule.

39. A host cell according to Claim 38 which is a non-human mammalian cell.

40. A host cell according to Claim 39 which is a Chinese hamster ovary cell.

41. A host cell according to Claim 38 which is an E. coli cell.

42. A host cell according to Claim 38 which is a yeast cell.

43. A polypeptide product of the expression in a procaryotic or eucaryotic host cell of a DNA molecule according to Claim 24.

44. A process for the production of a biologically active, non-naturally occurring polypeptide, comprising:

growing, under suitable nutrient conditions, procaryotic or eucaryotic host cells transformed or transfected with a DNA molecule according to Claim 24, and

isolating the desired polypeptide product of the expression in a biologically active form.

45. An antibody specifically generated by immunization with a polypeptide according to Claim 1.

46. An antibody according to Claim 45 which is a monoclonal antibody.

47. A biologically active composition comprising the polypeptide of Claim 1 covalently attached to a water soluble polymer.

48. A composition according to Claim 47, in which the water soluble polymer is selected from the group consisting of polyethylene glycol, polypropylene glycol, and copolymers of polyethylene glycol and polypropylene glycol.

49. A method of modulating cellular proliferation and differentiation, comprising contacting the cells with an effective amount of the polypeptide of Claim 1.

50. A method of enhancing repair and regeneration of human tissues that express the neu receptor, comprising contacting the tissue with an effective amount of a polypeptide according to Claim 1.

51. A method of treating a dermal wound in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

52. A method of treating tumors derived from epithelial tissue of the breast, stomach, lung, pancreas, colon, kidney, prostate or bladder in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

53. A method according to Claim 52 in which a polypeptide according to Claim 1 is used in conjunction with one or more anti-cancer chemotherapeutic agents.

54. A method of treating gastrointestinal disease in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

55. A method of treating Barrett's esophagus in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

56. A method of promoting reepithelialization in the human gastrointestinal, respiratory, reproductive or urinary tract, comprising administering to a human a therapeutically effective amount of a polypeptide according to Claim 1.

57. A method of treating cystic kidney disease or non-cystic end stage kidney disease in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

58. A method of treating inflammatory bowel disease in a human, comprising administering to the human a therapeutically effective amount of a polypeptide according to Claim 1.

59. A pharmaceutical composition comprising a therapeutically effective amount of a polypeptide according to Claim 1 and a pharmaceutically acceptable diluent, adjuvant or carrier.

60. An assay for detecting the level of naturally-occurring neu differentiation factor in cells or biological fluids from a human subject, comprising contacting an antibody according to Claim 45 with a sample of the cells or biological fluids to be tested under conditions appropriate for binding of the antibody to naturally-occurring neu differentiation factor, and

determining the level of antibody-antigen reaction as indicative of the amount of naturally-occurring neu differentiation factor in the sample.