Methods and compositions for detecting dysplasia
   
Document Number
US Application 20040146907
Publication Date
July 29, 2004
Inventors
Smith, Victoria (Burlingame, CA)
Abstract
Methods and compositions are disclosed for detecting dysplasia in a tissue sample, screening candidate compounds for the ability to inhibit growth of a cancer cell, predicting predisposition to adenocarcinoma and treating cancer based on gene expression profiles.
Correspondence Name and Address
GENENTECH, INC.
1 DNA WAY
SOUTH SAN FRANCISCO
CA
94080
US
Assignee Name and Address
GENENTECH, INC.
Serial No.
712124
Series Code
10
Filed
November 13, 2003
U.S. Current Class
435/6
U.S. Class at Publication
435/006
Intern'l Class
C12Q 001/68
Number of Claims:
45
Claims
What is claimed is:

1. A method of detecting high-grade dysplasia (HGD) in cells of a tissue sample, the method comprising: (a) obtaining a test tissue sample suspected of comprising cells exhibiting HGD; (b) establishing the level of expression in the test tissue sample of at least eight genes selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the tissue is from esophagus or colon; and (c) comparing expression of the at least eight genes to a baseline expression of the genes in normal tissue controls of the same tissue type, wherein an increase of at least 1.5-fold in expression of the genes relative to the baseline expression indicates that cells of the test sample exhibit HGD.

2. The method of claim 1, wherein the tissue is human tissue.

3. A method of identifying an esophageal tissue susceptible to esophageal adenocarcinoma, comprising detecting esophageal HGD in a test tissue sample according to claim 1.

4. A method according to claim 1, wherein an increase of at least 2-fold in expression of genes relative to the baseline is observed.

5. A method according to claim 1, wherein at least one of the at least eight genes is selected from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B (SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

6. A method for determining predisposition of a mammalian tissue to a neoplastic transformation by detecting HGD in cells of the tissue, the method comprising determining in a cell from the tissue expression of a nucleic acid sequence of at least eight genes selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the tissue is from esophagus or colon, and wherein the expression in the test sample is at least 1.5-fold above baseline expression in a normal tissue control of the same tissue type.

7. A method according to claim 6, wherein the tissue is human tissue.

8. A method according to claim 6, wherein at least one of the at least eight genes is selected from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B (SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

9. A method of detecting high-grade dysplasia (HGD) in cells of a mammalian tissue sample, the method comprising: (a) obtaining a test tissue sample suspected of comprising cells exhibiting HGD; (b) establishing the level of expression in the test tissue sample of at least eight polypeptides encoded by genes selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the tissue is from esophagus or colon; and (c) comparing expression of the at least eight polypeptides in the test tissue sample to expression of the at least eight polypeptides in normal tissue controls of the same tissue type, wherein an increase of at least 1.5-fold in expression of the polypeptides in the test tissue sample relative to the normal tissue controls indicates that cells of the test sample exhibit HGD.

10. A method according to claim 9, comprising contacting the test tissue sample with an antibody that specifically binds one of the at least eight polypeptides under conditions that permit the antibody to bind the polypeptide.

11. A method according to claim 9, wherein at least one of the at least eight polypeptides is expressed by a gene selected from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B (SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

12. The method of claim 1, wherein gene expression is determined by nucleic acid microarray analysis.

13. The method of claim 12, wherein analysis comprises contacting nucleic acid from a test tissue sample with a nucleic acid microarray comprising nucleic acid probe sequences, wherein at least eight of the nucleic acid probe sequences separately comprise at least 50 contiguous nucleotides from a gene selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

14. The method of claim 13, wherein the at least eight nucleic acid probe sequences comprise at least 60 contiguous nucleotides from a gene selected from the group.

15. The method of claim 14, wherein the at least eight nucleic acid probe sequences comprise at least 80 contiguous nucleotides from a gene selected from the group.

16. The method of claim 15, wherein the at least eight nucleic acid probe sequences comprise at least 100 contiguous nucleotides from a gene selected from the group.

17. The method of claim 16, wherein the at least eight nucleic acid probe sequences comprise at least 150 contiguous nucleotides from a gene selected from the group.

18. The method of claim 17, wherein the at least eight nucleic acid probe sequences comprise at least 200 contiguous nucleotides from a gene selected from the group.

19. The method of claim 13, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least ten genes selected from the group.

20. The method of claim 19, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least twelve genes selected from the group.

21. The method of claim 20, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least fifteen genes selected from the group.

22. The method of claim 21, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least eighteen genes selected from the group.

23. The method of claim 22, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least twenty genes selected from the group.

24. The method of claim 23, wherein the nucleic acid microarray comprises nucleic acid probe sequences from at least twenty-two genes selected from the group.

25. The method of claim 1, wherein gene expression is determined by nucleic acid hybridization under high stringency conditions of a detectable probe comprising at least 50 contiguous nucleotides from a gene selected from the group to nucleic acid of cells of the test tissue sample relative to cells of the normal tissue control.

26. The method of claim 25, wherein the hybridization is in situ hybridization.

27. The method of claim 26, wherein the hybridization is fluorescent in situ hybridization.

28. The method of claim 1, wherein gene expression is determined by polymerase chain reaction (PCR) analysis.

29. The method of claim 1, wherein gene expression is determined by real-time polymerase chain reaction (RT-PCR) analysis.

30. The method of claim 1, wherein gene expression is determined by TaqMan® polymerase chain reaction analysis.

31. A kit comprising a microarray, the microarray comprising nucleic acid probe sequences, wherein at least eight of the nucleic acid probe sequences each comprise at least 50 contiguous nucleotides from a gene selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43), or variants thereof having at least 80% nucleic acid sequence identity, and a package insert indicating that the microarray is for use in detecting HGD in a test tissue sample, wherein the tissue is from esophagus or colon, and wherein an increase in expression in the test tissue sample of at least 1.5-fold of the at least eight genes relative to a normal tissue control of the same tissue type indicates that cells of the test tissue exhibit HGD.

32. The kit of claim 31, wherein the nucleic acid probe sequences each comprise at least 60 contiguous nucleotides from a gene selected from the group.

33. The kit of claim 32, wherein the nucleic acid probe sequences each comprise at least 80 contiguous nucleotides from a gene selected from the group.

34. The kit of claim 33, wherein the nucleic acid probe sequences each comprise at least 100 contiguous nucleotides from a gene selected from the group.

35. The kit of claim 34, wherein the nucleic acid probe sequences each comprise at least 150 contiguous nucleotides from a gene selected from the group.

36. The kit of claim 35, wherein the nucleic acid probe sequences each comprise at least 200 contiguous nucleotides from a gene selected from the group.

37. A method of detecting cancer in a patient, the method comprising: (a) obtaining a test tissue sample from the patient; (b) establishing the level of expression of a gene selected from the group consisting of CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45), CLDN15 (claudin 15, NM_014343) (SEQ ID NO:47), SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:23), CFTR (chloride channel, NM_000492) (SEQ ID NO:49), H2R (histamine H2 receptor, NM_022304) (SEQ ID NO:51), PRSS8 (serine protease, NM_002773) (SEQ ID NO:7), PA21 (phospholipase A2 group IB, NM_000928) (SEQ ID NO:27), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:3), EGFR (NM_005228) (SEQ ID NO:53), EPHB2 (NM_004442) (SEQ ID NO:55), CRIPTO CR-1 (NM_003212) (SEQ ID NO:57), Ephrin B1 (NM_004429) (SEQ ID NO:59), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61), MMP26 (NM_021801) (SEQ ID NO:63), ADAM10 (NM_001110) (SEQ ID NO:65), ADAM8 (NM_001109) (SEQ ID NO:5), ADAM1 (XM_132370) (SEQ ID NO:67), TIM1 (NM_003254) (SEQ ID NO:69), MUC1 (XM_053256) (SEQ ID NO:71), CEA (NM_004363) (SEQ ID NO:73), NCA (NM_002483) (SEQ ID NO:75), Follistatin (NM_006350) (SEQ ID NO:77), Claudin 1 (NM_021101) (SEQ ID NO:79), Claudin 14 (NM_012130) (SEQ ID NO:81), tenascin-R (NM_003285) (SEQ ID NO:83), CAD3 (NM_001793) (SEQ ID NO:85), AXO1 (NM_005076) (SEQ ID NO:9), CONT (NM_001843) (SEQ ID NO:87), Osteopontin (NM_000582) (SEQ ID NO:89), Galectin 8 (NM_006499) (SEQ ID NO:91), PGS1 (biglycan, NM_001711) (SEQ ID NO:93), Frizzled 2 (NM_001466) (SEQ ID NO:95), ISLR (NM_005545) (SEQ ID NO:97), FLJ23399 (NM_022763) (SEQ ID NO:99), TEM1 (NM_020404) (SEQ ID NO:101), Tie2 ligand2 (NM_001147) (SEQ ID NO:103), STC-2 (NM_003714) (SEQ ID NO:19), VEGFC (NM_005429) (SEQ ID NO:105), tPA (NM_000930) (SEQ ID NO:107), Endothelin 1 (NM_001955) (SEQ ID NO:1), Thrombomodulin (NM_000361) (SEQ ID NO:109), TF (NM_001993) (SEQ ID NO:111), GPR4 (NM_005282) (SEQ ID NO:113), GPR66 (NM_006056) (SEQ ID NO:115), SLC22A2 (NM_003058) (SEQ ID NO:117), MLSN1 (NM_002420) (SEQ ID NO:119), and ATN2 (Na/K transport, NM_000702) (SEQ ID NO:121), or variants thereof having at least 80% nucleic acid sequence identity, wherein the test tissue is from esophagus or colon; and wherein the expression in the test tissue is at a level at least 1.5-fold above expression of the gene in a normal tissue control of the same tissue type.

38. The method of claim 37, wherein inhibition of cell growth is cell death.

39. The method of claim 37, wherein at least two genes selected from the group are expressed at a level at least 1.5-fold above expression of the gene in a normal cell control.

40. The method of claim 39, wherein at least three genes selected from the group are expressed at a level at least 1.5-fold above expression of the gene in a normal cell control.

41. The method of claim 40, wherein at least 5 genes selected from the group are expressed at a level at least 1.5-fold above expression of the gene in a normal cell control.

42. The method of claim 41, wherein at least 8 genes selected from the group are expressed at a level at least 1.5-fold above expression of the gene in a normal cell control.

43. The method of claim 1, wherein the expression p value is less than 0.07.

44. The method of claim 6, wherein the expression p value is less than 0.07.

45. The method of claim 9, wherein the expression p value is less than 0.07.
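
The probe-length limits recited in claims 13 to 18 and 31 to 36 (at least 50, 60, 80, 100, 150 or 200 contiguous nucleotides from a gene of the group) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `has_contiguous_run` is hypothetical, sequences are plain strings on the same strand, matching is exact, and reverse complements and the "at least 80% identity" variants are not handled.

```python
def has_contiguous_run(probe: str, gene: str, min_len: int = 50) -> bool:
    """Return True if the probe contains at least min_len contiguous
    nucleotides that occur, in the same order, in the gene sequence.

    Assumption: exact, same-strand matching on plain strings.
    """
    probe = probe.upper()
    gene = gene.upper()
    if len(probe) < min_len or len(gene) < min_len:
        return False
    # Every window of length min_len found in the gene sequence.
    gene_windows = {gene[i:i + min_len] for i in range(len(gene) - min_len + 1)}
    # The probe qualifies if any of its own windows is one of them.
    return any(
        probe[j:j + min_len] in gene_windows
        for j in range(len(probe) - min_len + 1)
    )
```

Calling the function with `min_len` set to 50, 60, 80, 100, 150 or 200 corresponds to the successive limits in the dependent claims; a probe shorter than `min_len` can never qualify.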

Description
[0001] This application is a non-provisional application filed under 37 CFR 1.53(b), claiming priority under 35 USC 119(e) to provisional application No. 60/425,813 filed Nov. 13, 2002, the contents of which are incorporated herein by reference.

TECHNICAL FIELD

[0002] The present invention relates to nucleic acid sequences, and compositions and uses therefor, which have been shown to be differentially expressed in high-grade dysplasia, which are useful as markers for the detection of high-grade dysplasia in a patient, and which are implicated in the development of adenocarcinoma.

BACKGROUND OF THE INVENTION

[0003] The incidence of esophageal adenocarcinoma is rising in Western countries, replacing squamous cell carcinoma as the most common neoplasm of the esophagus in white males and increasing in other ethnic groups (Devesa et al., Cancer 83:2049-2053 (1998); and Bollschweiler et al., Cancer 92:549-555 (2001)). Barrett's esophagus (BE) is the primary recognized risk factor for esophageal adenocarcinoma. BE results from repeated injury to the esophageal mucosa and develops in a subset of patients with chronic gastrointestinal reflux disease. It is characterized by a metaplastic change of squamous esophageal epithelium to intestinalized columnar mucosa (Csendes et al., Dis. Esoph 13:5-11 (2000); Cameron et al., New Eng. J. Med. 313:857-859 (1985); and Drewitz et al., Amer. J. Gastroenterol 92:212-215 (1997)).

[0004] Barrett's esophagus is found in 6%-16% of patients undergoing upper gastrointestinal endoscopy for gastroesophageal reflux, and it is estimated that a substantial patient population remains undiagnosed (Sarr et al., Amer. J. Surgery 149:187-193 (1985); Winters et al., Gastroenterology 92:118-124 (1985); Cameron et al., Gastroenterology 99:918-922 (1990); and Cameron et al., Gastroenterology 103:1241-1245 (1992)). The risk of developing esophageal carcinoma is 30-150 times greater in patients with BE. The outlook for patients diagnosed with adenocarcinoma is poor, with a 5-year survival rate of 10-15% (Streitz et al., Ann. Surg. 213:122-125 (1991); Menke-Pluymers et al., Gut 33:1454-1458 (1992); and Lerut et al., J. Thorac. Cardiovasc. Surg. 107:1059-1066 (1994)). Patients with BE are placed on surveillance programs, although the absolute risk of developing adenocarcinoma in the context of BE remains relatively low, estimated at approximately 0.5% per patient year (Drewitz et al., Amer. J. Gastroenterol 92:212-215 (1997); O'Connor et al., Am. J. Gastroenterol 94:2037-2042 (1999); Spechler et al., JAMA 285:2331-2338 (2001); and Shaheen et al., Gastroenterology 119:333-338 (2000)). The value and cost-effectiveness of surveillance programs continue to be debated because the natural history of BE is poorly understood, because representative biopsies are difficult to obtain by random sampling given the heterogeneous nature of intestinal metaplasia, and because endoscopic and histopathologic diagnoses vary between observers (Falk, Gastroenterology 122:1569-1591 (2002); Sampliner, Am. J. Gastroenterol. 93:1028-1032 (1998); and Alikhan et al., Gastrointest. Endosc. 50:23-26 (1999)). A metaplasia-dysplasia-carcinoma sequence has been described for BE, and genetic changes involving cell cycle abnormalities, DNA ploidy, mutations, and amplification and expression of oncogenes have been identified (al-Kasspooles et al., Internat. J. Cancer 54:213-219 (1993); Vissers et al., Anticancer Res. 21:3813-3820 (2001); Bani-Hani et al., J. Natl. Cancer Inst. 92:1316-1321 (2000); Walch et al., Am. J. Pathol. 156:555-566 (2000); Wong et al., Cancer Res. 61:8284-8289 (2001); and Romagnoli et al., Laboratory Investigation 81:241-247 (2001)). There is a need for reliable detection of high-grade dysplasia and diagnosis of patients, such as BE patients, likely to develop adenocarcinoma, thereby allowing the disease to be monitored and treated early in its progression.

SUMMARY OF THE INVENTION

[0005] Generally, the present invention is based on the discovery that it is possible to detect high-grade dysplasia in a patient suspected of experiencing dysplasia, such as dysplasia associated with gastrointestinal reflux disease, such as Barrett's esophagus, or colon tissue dysplasia, by determining expression in an esophageal or colon biopsy from the patient wherein at least eight genes selected from a group of genes are expressed at a level of at least 1.5-fold over expression in a control sample. The control sample may comprise an esophageal or colon biopsy from a normal patient (i.e. one not experiencing gastrointestinal reflux disease) or from pooled samples of normal epithelial tissue (such as from normal liver, lung and kidney tissue). The group of high-grade dysplasia (HGD) gene markers, and their encoded polypeptides, comprise ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID NO:43 or 44). HGD marker polypeptides refer to the polypeptides encoded by the HGD gene markers.
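
The decision rule described in this paragraph (at least eight genes of the panel expressed at least 1.5-fold over the control) can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the function names and the dictionary data layout are hypothetical, expression values are assumed to be linear-scale and already normalized, and the gene symbols are copied as spelled in the text.

```python
from typing import Dict, List

# Marker symbols as spelled in the text above (22 genes).
HGD_MARKERS: List[str] = [
    "ET-1", "AGR2", "ADAM8", "PRSS8", "AXO1", "NROB2", "TM7SF1", "DLDH",
    "MAT2B", "STC-2", "PPBI", "SLNAC1", "CAH4", "PA21", "PAR2", "IDE",
    "MYO1A", "CYP2J2", "PHYH", "CYB5", "COXVIb", "TCF4",
]


def upregulated_markers(test: Dict[str, float],
                        baseline: Dict[str, float],
                        min_fold: float = 1.5) -> List[str]:
    """Return panel genes whose test/baseline ratio is at least min_fold."""
    hits = []
    for gene in HGD_MARKERS:
        base = baseline.get(gene)
        value = test.get(gene)
        # Skip genes that were not measured or have a non-positive baseline.
        if base is None or value is None or base <= 0:
            continue
        if value / base >= min_fold:
            hits.append(gene)
    return hits


def indicates_hgd(test: Dict[str, float],
                  baseline: Dict[str, float],
                  min_genes: int = 8,
                  min_fold: float = 1.5) -> bool:
    """Apply the 'at least eight genes, at least 1.5-fold' rule."""
    return len(upregulated_markers(test, baseline, min_fold)) >= min_genes
```

Here `baseline` would hold the control values, for example from a normal biopsy or from pooled normal tissue as described above; how those values are measured and normalized is outside the scope of this sketch.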

[0006] In an aspect, the invention involves a method for the diagnosis of esophageal high-grade dysplasia (HGD) in a patient, comprising establishing increased expression of at least eight genes (listed here with the polypeptide encoded by the gene) selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID NO:43 or 44); and comparing expression of the genes to a baseline expression of the genes in normal tissue controls; wherein an increase of at least 1.5-fold in expression (and/or p value <0.07) of the genes from the group relative to the baseline indicates that the patient is experiencing esophageal high-grade dysplasia. In an embodiment of the invention, the tissue is human tissue.
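
The paragraph above mentions a p value threshold of 0.07 but this passage does not say which statistical test produces it. Purely as an illustration, the following Python sketch uses a one-sided exact permutation test on the difference of group means; the choice of test, the function name `permutation_p_value`, and the sample values are all assumptions, not the method of the application.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def permutation_p_value(test: Sequence[float],
                        control: Sequence[float]) -> float:
    """One-sided exact permutation p value for mean(test) > mean(control).

    Enumerates every way of relabelling the pooled measurements into a
    'test' group of the original size and counts how often the difference
    of means is at least as large as the observed one.
    """
    pooled = list(test) + list(control)
    n_test = len(test)
    observed = mean(test) - mean(control)
    total = 0
    as_extreme = 0
    for idx in combinations(range(len(pooled)), n_test):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical expression values for one gene in three test and three
# control samples; these numbers are invented for the example.
p = permutation_p_value([8.0, 9.0, 10.0], [1.0, 2.0, 3.0])
meets_threshold = p < 0.07
```

With three samples per group there are only 20 possible relabellings, so the smallest attainable one-sided p value is 1/20 = 0.05; exhaustive enumeration is practical only for small sample counts of this kind.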

[0007] In another embodiment, the invention involves a method of identifying a patient susceptible to esophageal adenocarcinoma, comprising diagnosing esophageal high-grade dysplasia in a patient by establishing increased expression of at least eight genes selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43); and comparing expression of the genes to a baseline expression of the genes in normal tissue controls; wherein an increase of at least 1.5-fold in expression of the genes from the group relative to the baseline indicates that the patient is experiencing esophageal high-grade dysplasia. Alternatively, the patient may be susceptible to colon carcinoma, and high-grade dysplasia is diagnosed by similarly determining expression of at least eight genes of the above group in a test colon tissue sample compared to a normal colon tissue sample.

[0008] In still another embodiment, the invention involves a method for determining whether an esophageal tissue is predisposed to a neoplastic transformation, comprising determining whether in a cell from the esophageal tissue at least eight nucleic acid sequences selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43) are expressed at least 1.5-fold above baseline expression in a normal tissue control. In an embodiment, the tissue is human tissue.

[0009] In another aspect, the invention involves a method for the diagnosis of esophageal high-grade dysplasia in a patient, comprising establishing the level of expression of polypeptides encoded by at least eight genes selected from the group consisting of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43); and comparing expression of the at least eight genes from the group to a baseline expression of the genes in normal tissue controls; wherein an increase of at least 1.5-fold in expression of the polypeptides encoded by the genes from the group relative to the baseline indicates that the patient has esophageal dysplasia.
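
The decision rule recited above (at least eight marker genes, each at least 1.5-fold above a normal-tissue baseline) can be illustrated with a minimal Python sketch. The function names, the dictionary input format, and the use of a simple test/baseline ratio are assumptions made for illustration; the text does not prescribe how expression levels are measured or normalized.

```python
from typing import Dict, List

FOLD_THRESHOLD = 1.5  # minimum fold increase over baseline stated in the text
MIN_GENES = 8         # minimum number of elevated marker genes stated in the text


def elevated_markers(test: Dict[str, float], baseline: Dict[str, float],
                     fold: float = FOLD_THRESHOLD) -> List[str]:
    """Return marker genes whose test level is at least `fold` times baseline.

    Hypothetical helper: both inputs map gene symbol -> expression level
    in the same arbitrary units.
    """
    hits = []
    for gene, base in baseline.items():
        if base <= 0 or gene not in test:
            continue  # no usable baseline or no measurement for this gene
        if test[gene] / base >= fold:
            hits.append(gene)
    return sorted(hits)


def meets_panel_criterion(test: Dict[str, float], baseline: Dict[str, float],
                          min_genes: int = MIN_GENES,
                          fold: float = FOLD_THRESHOLD) -> bool:
    """True when at least `min_genes` markers reach the fold threshold."""
    return len(elevated_markers(test, baseline, fold)) >= min_genes
```

Passing `min_genes=2`, `3`, or `5` would give the alternative thresholds mentioned for the cancer marker group elsewhere in the disclosure.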

[0010] In an embodiment, the method involves contacting a HGD cell or a cancer cell with an antibody that binds specifically to a polypeptide, or fragment thereof, encoded by a gene selected from the group of HGD marker genes or cancer marker genes as disclosed herein.

[0011] In an embodiment, the method involves determining expression of at least eight of the genes of the group of HGD marker genes by nucleic acid microarray analysis. In a further embodiment, the microarray comprises nucleic acid sequences of at least 20 nucleotides derived from at least eight of the genes from the following group: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43).

[0012] In another embodiment, the invention involves analysis using a microarray comprising nucleic acid probe sequences comprising at least 20 contiguous nucleotides from at least 8 genes selected from the group of HGD marker genes: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43).
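
As a rough illustration of what "at least 20 contiguous nucleotides" from a marker gene means, the following Python sketch enumerates candidate probe windows from a sequence. The function name, the default step, and the example sequence are hypothetical; practical probe design would also weigh melting temperature, specificity, and secondary structure, none of which is modeled here.

```python
from typing import List


def contiguous_probes(sequence: str, length: int = 20, step: int = 1) -> List[str]:
    """List every window of `length` contiguous nucleotides in `sequence`.

    Hypothetical helper for illustration only. Returns an empty list when
    the sequence is shorter than the requested probe length.
    """
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


# Made-up 25-nucleotide sequence, not taken from any SEQ ID NO
example = "ATGGCTAGCTAGGATCCGATCGTAC"
probes = contiguous_probes(example)
```

For a 25-nucleotide input and 20-nucleotide probes, the sketch yields six overlapping windows.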

[0013] In a further embodiment, the methods of detecting high-grade dysplasia, diagnosing high-grade dysplasia, or prognosing development of cancer from detected high-grade dysplasia involve determining expression of at least eight genes from the group of HGD markers disclosed herein above as determined by an analysis method including, but not limited to, polymerase chain reaction analysis, real-time polymerase chain reaction analysis, TaqMan® polymerase chain reaction analysis, nucleic acid hybridization, fluorescent in situ hybridization and non-fluorescent in situ hybridization (e.g., radioactive, colorimetric, enzymatic or enzyme-linked detection methods for in situ hybridization). Where the method of the invention involves determining increased expression of polypeptides encoded by at least eight HGD marker genes as disclosed herein above, an embodiment of the method involves analysis using an antibody capable of specifically binding to a polypeptide, or a fragment thereof, encoded by a HGD marker gene.

[0014] In an alternative embodiment, the analytical methods of the invention involve probes or targets labelled with radionuclides or enzymatic labels such that expression of a gene or polypeptide is determinable.

[0015] In an embodiment of any of the methods or compositions of the invention, the dysplasia is high-grade dysplasia of esophagus tissue and the cancer is esophageal adenocarcinoma. Alternatively the patient is a human patient.

[0016] In another aspect, the invention involves a method of treating high-grade esophageal dysplasia or inhibiting or preventing cancer in a patient in need of such treatment, the method comprising administering to the patient a compound capable of decreasing expression of a gene selected from the group consisting of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43).

[0017] In still another aspect, the invention involves a method of treating high-grade esophageal dysplasia or inhibiting or preventing cancer in a patient in need of such treatment, the method comprising administering to the patient a compound capable of decreasing expression of a polypeptide encoded by a gene selected from the HGD marker genes.

[0018] In still another aspect, the invention involves a method of treating high-grade esophageal dysplasia or inhibiting or preventing cancer in a patient in need of such treatment, the method comprising administering to the patient a compound capable of inhibiting activity of a polypeptide encoded by a gene which is one of at least eight genes selected from the group of HGD marker genes as disclosed herein. In an embodiment, the compound is an antagonist of the polypeptide. In a further embodiment, the antagonist is an antibody, such as a monoclonal antibody or a humanized monoclonal antibody.

[0019] In a further aspect, the invention involves a method of screening for candidate drugs which inhibit or prevent progression from dysplasia to adenocarcinoma, the method comprising contacting a cell with a candidate drug, and assaying inhibition of progression from high-grade dysplasia to cancer in the cell, wherein the cell, prior to contacting with the candidate drug, expresses at least eight genes at a level at least 1.5-fold increased relative to a normal tissue baseline level, wherein the genes are selected from the group of HGD marker genes as disclosed herein.

[0020] In another aspect, the invention involves a method of inhibiting or preventing progression from high-grade dysplasia to cancer in a patient by administering a drug identified by screening for candidate drugs which inhibit or prevent progression from dysplasia to adenocarcinoma, the screening comprising contacting a cell with a candidate drug, and assaying inhibition of progression from high-grade dysplasia to cancer in the cell, wherein the cell, prior to contacting with the candidate drug, expresses at least eight genes at a level at least 1.5-fold increased relative to a normal tissue baseline level, wherein the genes are selected from the group of HGD marker genes as disclosed herein.

[0021] In another aspect, the invention involves a compound capable of inhibiting or preventing the progression from high-grade dysplasia to cancer in a patient. In an embodiment of the invention the compound is identified by screening for a candidate drug which inhibits or prevents progression from dysplasia to adenocarcinoma, the screening comprising contacting, with a candidate drug, a cell that expresses at least eight genes selected from the group of HGD marker genes as disclosed herein at a level at least 1.5-fold increased relative to a normal tissue baseline level, and assaying inhibition of progression from high-grade dysplasia to cancer in the cell. In an embodiment, the invention involves a pharmaceutical composition comprising a compound capable of inhibiting or preventing the progression from high-grade dysplasia to cancer in a patient, and a pharmaceutically acceptable carrier.

[0022] In still another aspect, the invention involves detecting cancer in a patient by determining that a gene, or the polypeptide it encodes, selected from the group consisting of CAD17 (liver-intestine cadherin, NM.sub.--004063) (SEQ ID NO:45 or 46), CLDN15 (claudin 15, NM.sub.--014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel, NM.sub.--004769) (SEQ ID NO:23 or 24), CFTR (chloride channel, NM.sub.--000492) (SEQ ID NO:49 or 50), H2R (histamine H2 receptor, NM.sub.--022304) (SEQ ID NO:51 or 52), PRSS8 (serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM.sub.--000928) (SEQ ID NO:27 or 28), AGR2 (anterior gradient 2 homolog, (NM.sub.--006408) (SEQ ID NO:3 or 4), EGFR (NM.sub.--005228) (SEQ ID NO:53 or 54), EPHB2 (NM.sub.--004442) (SEQ ID NO:55 or 56), CRIPTO CR-1 (NM.sub.--003212) (SEQ ID NO:57 or 58), Eprin B1 (NM.sub.--004429) (SEQ ID NO:59 or 60), MMP-17/MT4-MMP (NM.sub.--016155) (SEQ ID NO:61 or 62), MMP26 (NM.sub.--021801) (SEQ ID NO:63 or 64), ADAM10 (NM.sub.--001110) (SEQ ID NO:65 or 66), ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6), ADAM1 (XM.sub.--132370) (SEQ ID NO:67 or 68), TIM1 (NM.sub.--003254) (SEQ ID NO:69 or 70), MUC1 (XM.sub.--053256) (SEQ ID NO:71 or 72), CEA (NM.sub.--004363) (SEQ ID NO:73 or 74), NCA (NM.sub.--002483) (SEQ ID NO:75 or 76), Follistatin (NM.sub.--006350) (SEQ ID NO:77 or 78), Claudin 1 (NM.sub.--021101) (SEQ ID NO:79 or 80), Claudin 14 (NM.sub.--012130) (SEQ ID NO:81 or 82), tenascin-R (NM.sub.--003285) (SEQ ID NO:83 or 84), CAD3 (NM.sub.--001793) (SEQ ID NO:85 or 86), AXO1 (NM.sub.--005076) (SEQ ID NO:9 or 10), CONT (NM.sub.--001843) (SEQ ID NO:87 or 88), Osteopontin (NM.sub.--000582) (SEQ ID NO:89 or 90), Galectin 8 (NM.sub.--006499) (SEQ ID NO:91 or 92), PGS1 (bihlycan, NM.sub.--001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM.sub.--001466) (SEQ ID NO:95 or 96), ISLR (NM.sub.--005545) (SEQ ID NO:97 or 98), FLJ23399 (NM.sub.--022763) (SEQ ID NO:99 or 100), TEM1 (NM.sub.--020404) (SEQ ID NO:101 
or 102), Tie2 ligand2 (NM.sub.--001147) (SEQ ID NO:103 or 104), STC-2 (NM.sub.--003714) (SEQ ID NO:19 or 20), VEGFC (NM.sub.--005429) (SEQ ID NO:105 or 106), tPA (NM.sub.--000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM.sub.--001955) (SEQ ID NO:1 or 2), Thrombomodulin (NM.sub.--000361) (SEQ ID NO:109 or 110), TF (NM.sub.--001993) (SEQ ID NO:111 or 112), GPR4 (NM.sub.--005282) (SEQ ID NO:113 or 114), GPR66 (NM.sub.--006056) (SEQ ID NO:115 or 116), SLC22A2 (NM.sub.--003058) (SEQ ID NO:117 or 118), MLSN1 (NM.sub.--002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport, NM.sub.--000702) (SEQ ID NO:121 or 122) is expressed in a test sample at a level about 1.5-fold above the level of expression in a normal tissue sample of the same tissue type. The test sample is generally from a patient suspected of having cancer, including, but not limited to, adenocarcinoma, esophageal adenocarcinoma, or colon cancer. The test sample is generally from the esophagus or colon of the patient. In an embodiment, at least two, alternatively at least three, alternatively at least five, and alternatively at least eight genes selected from the above group are upregulated in cancer tissue 1.5-fold relative to normal tissue. Up-regulation of these genes is detected by, for example, hybridization analysis as standard in the art and disclosed herein, as well as through antibody binding analysis of the level of polypeptides expressed by the up-regulated gene or genes.

[0023] In an embodiment, the invention involves treatment by contacting a cancer cell with a compound that inhibits expression of at least one, optionally at least two, at least three, at least five, or at least eight genes, or the polypeptides encoded by the genes, selected from the group consisting of CAD17 (liver-intestine cadherin, NM.sub.--004063) (SEQ ID NO:45 or 46), CLDN15 (claudin 15, NM.sub.--014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel, NM.sub.--004769) (SEQ ID NO:23 or 24), CFTR (chloride channel, NM.sub.--000492) (SEQ ID NO:49 or 50), H2R (histamine H2 receptor, NM.sub.--022304) (SEQ ID NO:51 or 52), PRSS8 (serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM.sub.--000928) (SEQ ID NO:27 or 28), AGR2 (anterior gradient 2 homolog, (NM.sub.--006408) (SEQ ID NO:3 or 4), EGFR (NM.sub.--005228) (SEQ ID NO:53 or 54), EPHB2 (NM.sub.--004442) (SEQ ID NO:55 or 56), CRIPTO CR-1 (NM.sub.--003212) (SEQ ID NO:57 or 58), Eprin B1 (NM.sub.--004429) (SEQ ID NO:59 or 60), MMP-17/MT4-MMP (NM.sub.--016155) (SEQ ID NO:61 or 62), MMP26 (NM.sub.--021801) (SEQ ID NO:63 or 64), ADAM10 (NM.sub.--001110) (SEQ ID NO:65 or 66), ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6), ADAM1 (XM.sub.--132370) (SEQ ID NO:67 or 68), TIM1 (NM.sub.--003254) (SEQ ID NO:69 or 70), MUC1 (XM.sub.--053256) (SEQ ID NO:71 or 72), CEA (NM.sub.--004363) (SEQ ID NO:73 or 74), NCA (NM.sub.--002483) (SEQ ID NO:75 or 76), Follistatin (NM.sub.--006350) (SEQ ID NO:77 or 78), Claudin 1 (NM.sub.--021101) (SEQ ID NO:79 or 80), Claudin 14 (NM.sub.--012130) (SEQ ID NO:81 or 82), tenascin-R (NM.sub.--003285) (SEQ ID NO:83 or 84), CAD3 (NM.sub.--001793) (SEQ ID NO:85 or 86), AXO1 (NM.sub.--005076) (SEQ ID NO:9 or 10), CONT (NM.sub.--001843) (SEQ ID NO:87 or 88), Osteopontin (NM.sub.--000582) (SEQ ID NO:89 or 90), Galectin 8 (NM.sub.--006499) (SEQ ID NO:91 or 92), PGS1 (bihlycan, NM.sub.--001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM.sub.--001466) (SEQ ID NO:95 or 96), ISLR 
(NM.sub.--005545) (SEQ ID NO:97 or 98), FLJ23399 (NM.sub.--022763) (SEQ ID NO:99 or 100), TEM1 (NM.sub.--020404) (SEQ ID NO:101 or 102), Tie2 ligand2 (NM.sub.--001147) (SEQ ID NO:103 or 104), STC-2 (NM.sub.--003714) (SEQ ID NO:19 or 20), VEGFC (NM.sub.--005429) (SEQ ID NO:105 or 106), tPA (NM.sub.--000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM.sub.--001955) (SEQ ID NO:1 or 2), Thrombomodulin (NM.sub.--000361) (SEQ ID NO:109 or 110), TF (NM.sub.--001993) (SEQ ID NO:111 or 112), GPR4 (NM.sub.--005282) (SEQ ID NO:113 or 114), GPR66 (NM.sub.--006056) (SEQ ID NO:115 or 116), SLC22A2 (NM.sub.--003058) (SEQ ID NO:117 or 118), MLSN1 (NM.sub.--002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport, NM.sub.--000702) (SEQ ID NO:121 or 122). In another embodiment, treatment is by contacting the cancer cell with a compound that inhibits the production or activity of a polypeptide of the above group and/or encoded by a gene of the above group. Where inhibition of a polypeptide is desired, the compound is often an antibody specific for the polypeptide, often a monoclonal antibody such as a humanized antibody.

[0024] In yet another aspect, the invention involves a method of screening a candidate compound for the ability to inhibit cancer cell growth or cause cancer cell death by contacting the candidate compound with a cancer cell expressing a gene or polypeptide selected from the following group: CAD17 (liver-intestine cadherin, NM.sub.--004063) (SEQ ID NO:45 or 46), CLDN15 (claudin 15, NM.sub.--014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel, NM.sub.--004769) (SEQ ID NO:23 or 24), CFTR (chloride channel, NM.sub.--000492) (SEQ ID NO:49 or 50), H2R (histamine H2 receptor, NM.sub.--022304) (SEQ ID NO:51 or 52), PRSS8 (serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM.sub.--000928) (SEQ ID NO:27 or 28), AGR2 (anterior gradient 2 homolog, (NM.sub.--006408) (SEQ ID NO:3 or 4), EGFR (NM.sub.--005228) (SEQ ID NO:53 or 54), EPHB2 (NM.sub.--004442) (SEQ ID NO:55 or 56), CRIPTO CR-1 (NM.sub.--003212) (SEQ ID NO:57 or 58), Eprin B1 (NM.sub.--004429) (SEQ ID NO:59 or 60), MMP-17/MT4-MMP (NM.sub.--016155) (SEQ ID NO:61 or 62), MMP26 (NM.sub.--021801) (SEQ ID NO:63 or 64), ADAM10 (NM.sub.--001110) (SEQ ID NO:65 or 66), ADAM8 (NM.sub.--001 109) (SEQ ID NO:5 or 6), ADAM1 (XM.sub.--132370) (SEQ ID NO:67 or 68), TIM1 (NM.sub.--003254) (SEQ ID NO:69 or 70), MUC1 (XM.sub.--053256) (SEQ ID NO:71 or 72), CEA (NM.sub.--004363) (SEQ ID NO:73 or 74), NCA (NM.sub.--002483) (SEQ ID NO:75 or 76), Follistatin (NM.sub.--006350) (SEQ ID NO:77 or 78), Claudin 1 (NM.sub.--021101) (SEQ ID NO:79 or 80), Claudin 14 (NM.sub.--012130) (SEQ ID NO:81 or 82), tenascin-R (NM.sub.--003285) (SEQ ID NO:83 or 84), CAD3 (NM.sub.--001793) (SEQ ID NO:85 or 86), AXO1 (NM.sub.--005076) (SEQ ID NO:9 or 10), CONT (NM.sub.--001843) (SEQ ID NO:87 or 88), Osteopontin (NM.sub.--000582) (SEQ ID NO:89 or 90), Galectin 8 (NM.sub.--006499) (SEQ ID NO:91 or 92), PGS1 (bihlycan, NM.sub.--001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM.sub.--001466) (SEQ ID NO:95 or 96), ISLR 
(NM.sub.--005545) (SEQ ID NO:97 or 98), FLJ23399 (NM.sub.--022763) (SEQ ID NO:99 or 100), TEM1 (NM.sub.--020404) (SEQ ID NO:101 or 102), Tie2 ligand2 (NM.sub.--001147) (SEQ ID NO:103 or 104), STC-2 (NM.sub.--003714) (SEQ ID NO:19 or 20), VEGFC (NM.sub.--005429) (SEQ ID NO:105 or 106), tPA (NM.sub.--000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM.sub.--001955) (SEQ ID NO:1 or 2), Thrombomodulin (NM.sub.--000361) (SEQ ID NO:109 or 110), TF (NM.sub.--001993) (SEQ ID NO:111 or 112), GPR4 (NM.sub.--005282) (SEQ ID NO:113 or 114), GPR66 (NM.sub.--006056) (SEQ ID NO:115 or 116), SLC22A2 (NM.sub.--003058) (SEQ ID NO:117 or 118), MLSN1 (NM.sub.--002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport, NM.sub.--000702) (SEQ ID NO:121 or 122), wherein at least one, at least two, at least three, at least five, or at least eight genes selected from the group are expressed at a level at least about 1.5-fold above the level in normal control tissue. Where the candidate compound is an antibody, the antibody is alternatively a polyclonal, monoclonal, or humanized antibody, a Fab, a F(ab')2, or a binding fragment of any one of these compounds.

[0025] In an embodiment, the sequences which are used to determine sequence identity or similarity are selected from the sequences described herein. Optionally, sequence variants are naturally occurring allelic variants, sequence variants or splice variants of these sequences. Sequence identity is typically calculated using the BLAST algorithm, described in Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997), with the BLOSUM62 default matrix.
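
To make the 80% sequence identity threshold for variants concrete, the following Python sketch computes percent identity over two sequences that have already been aligned. It is a simplified stand-in, not the BLAST algorithm: the function names, the use of "-" as the gap character, and the choice to count gapped columns as mismatches are all assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues.

    Hypothetical helper: inputs must be pre-aligned, equal-length strings
    with '-' marking gaps. Columns with a gap in either sequence count as
    mismatches; columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when identity meets the threshold (80% per the claims)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

BLAST reports identity over the local alignment it finds, so values from this sketch agree with BLAST output only when the same alignment is supplied.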

[0026] In one embodiment, nucleic acid homology can be determined through hybridisation studies. Nucleic acids which hybridise under stringent conditions to the nucleic acids of the invention are considered high-grade esophageal dysplasia sequences. Under stringent conditions, hybridisation will most preferably occur at 42° C. in 750 mM NaCl, 75 mM trisodium citrate, 2% SDS, 50% formamide, 1× Denhardt's, 10% (w/v) dextran sulphate and 100 pg/ml denatured salmon sperm DNA. Useful variations on these conditions will be readily apparent to those skilled in the art. The washing steps which follow hybridisation most preferably occur at 65° C. in 15 mM NaCl, 1.5 mM trisodium citrate, and 1% SDS. Additional variations on these conditions will be readily apparent to those skilled in the art.
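
The salt concentrations above correspond to standard dilutions of saline-sodium citrate (SSC) buffer, a notation the text itself does not use. Assuming the conventional definition of 1× SSC as 150 mM NaCl and 15 mM trisodium citrate, the short Python sketch below converts the stated concentrations to SSC strength; the function and constant names are hypothetical.

```python
import math

# Conventional 1x SSC composition (assumption; the text gives only mM values)
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0


def ssc_strength(nacl_mm: float, citrate_mm: float) -> float:
    """Return the SSC multiple implied by NaCl and trisodium citrate levels.

    Raises ValueError if the two components imply different strengths,
    meaning the buffer is not a simple SSC dilution.
    """
    by_nacl = nacl_mm / SSC_1X_NACL_MM
    by_citrate = citrate_mm / SSC_1X_CITRATE_MM
    if not math.isclose(by_nacl, by_citrate, rel_tol=1e-9):
        raise ValueError("NaCl and citrate do not match a single SSC strength")
    return by_nacl


hybridisation = ssc_strength(750.0, 75.0)  # hybridisation buffer in the text
wash = ssc_strength(15.0, 1.5)             # wash buffer in the text
```

Under that assumption the hybridisation buffer works out to 5× SSC and the wash to 0.1× SSC, so the lower-salt, higher-temperature wash is the more stringent step.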

[0027] As a result of the degeneracy of the genetic code, a number of polynucleotide sequences encoding polypeptides of the invention, some that may have minimal similarity to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention includes each and every possible variation of polynucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally occurring high-grade esophageal dysplasia sequences, and all such variations are to be considered as being specifically disclosed.
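
The scale of "each and every possible variation" can be estimated by multiplying, residue by residue, the number of codons that encode each amino acid under the standard genetic code. The Python sketch below does this for a peptide in one-letter code; the function name and the example peptide are made up for illustration and are not taken from any disclosed sequence.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons)
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Count distinct nucleotide sequences encoding `peptide`.

    Hypothetical helper: ignores the stop codon and raises KeyError for
    characters outside the 20 standard amino acids.
    """
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Made-up tripeptide Met-Lys-Leu: 1 * 2 * 6 = 12 possible coding sequences
n_variants = count_coding_sequences("MKL")
```

Because the count is a product over residues, it grows roughly exponentially with polypeptide length.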

[0028] The polynucleotides of this invention include RNA, cDNA, genomic DNA, synthetic forms, and mixed polymers, both sense and antisense strands, and may be chemically or biochemically modified, or may contain non-natural or derivatised nucleotide bases as will be appreciated by those skilled in the art. Such modifications include labels, methylation, intercalators, alkylators and modified linkages. In some instances it may be advantageous to produce nucleotide sequences encoding high-grade esophageal dysplasia sequences of the invention, or their derivatives, possessing a substantially different codon usage than that of the naturally occurring gene. For example, codons may be selected to increase the rate of expression of the peptide in a particular prokaryotic or eukaryotic host corresponding with the frequency that particular codons are utilized by the host. Other reasons to alter the nucleotide sequence encoding high-grade esophageal dysplasia sequences of the invention, or their derivatives, without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.

[0029] In some instances, useful nucleic acid sequences up-regulated in high-grade esophageal dysplasia of the invention are fragments of larger genes and may be used to identify and obtain corresponding full-length genes. Full-length sequences of the genes selected from the HGD gene marker group or cancer gene marker group of the invention can be obtained using a partial gene sequence using methods known per se to those skilled in the art. For example, "restriction-site PCR" may be used to retrieve unknown sequence adjacent to a portion of DNA whose sequence is known. In this technique universal primers are used to retrieve unknown sequence. Inverse PCR may also be used, in which primers based on the known sequence are designed to amplify adjacent unknown sequences. These upstream sequences may include promoters and regulatory elements. In addition, various other PCR-based techniques may be used; for example, a kit available from Clontech (Palo Alto, Calif.) allows for a walking PCR technique, and the 5' RACE kit (Gibco-BRL) allows isolation of additional sequence, while additional 3' sequence can be obtained using practised techniques.

[0030] The present invention allows for the preparation of purified high-grade dysplasia polypeptide (i.e. a polypeptide encoded by a gene disclosed herein as up-regulated in high-grade esophageal dysplasia) or protein, from the polynucleotides of the present invention or variants thereof. In order to do this, host cells may be transfected with a nucleic acid molecule as described above. Typically said host cells are transfected with an expression vector comprising a nucleic acid encoding a high-grade esophageal dysplasia protein according to the invention. Cells are cultured under the appropriate conditions to induce or cause expression of the high-grade esophageal dysplasia protein. The conditions appropriate for high-grade esophageal dysplasia protein expression will vary with the choice of the expression vector and the host cell, and will be easily ascertained by one skilled in the art.

[0031] A variety of expression vector/host systems may be utilized to contain and express the high-grade dysplasia sequences of the invention and are well known in the art. These include, but are not limited to, microorganisms such as bacteria transformed with plasmid or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); or mouse or other animal or human tissue cell systems. In a preferred embodiment the high-grade esophageal dysplasia proteins of the invention are expressed in mammalian cells using various expression vectors including plasmid, cosmid and viral systems such as adenoviral, retroviral or vaccinia virus expression systems. The invention is not limited by the host cell employed.

[0032] The polynucleotide sequences, or variants thereof, of the present invention can be stably expressed in cell lines to allow long term production of recombinant proteins in mammalian systems. These sequences can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. The selectable marker confers resistance to a selective agent, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue culture techniques appropriate to the cell type.

[0033] The protein produced by a transformed cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode a protein of the invention may be designed to contain signal sequences which direct secretion of the protein through a prokaryotic or eukaryotic cell membrane.

[0034] In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, glycosylation, phosphorylation, and acylation. Post-translational cleavage of the protein may also be used to specify protein targeting, folding, and/or activity. Different host cells having specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO or HeLa cells) are available from the American Type Culture Collection (ATCC) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0035] When large quantities of protein are needed such as for antibody production, vectors which direct high levels of high-grade esophageal dysplasia gene expression may be used such as those containing the T5 or T7 inducible bacteriophage promoter.

[0036] The present invention also includes the use of the expression systems described above in generating and isolating fusion proteins which contain important functional domains of the protein. These fusion proteins are used for binding, structural and functional studies as well as for the generation of appropriate antibodies.

[0037] In order to express and purify the protein as a fusion protein, the appropriate cDNA sequence is inserted into a vector which contains a nucleotide sequence encoding another peptide (for example, glutathione S-transferase). The fusion protein is expressed and recovered from prokaryotic or eukaryotic cells. The fusion protein can then be purified by affinity chromatography based upon the fusion vector sequence. The relevant protein can subsequently be obtained by enzymatic cleavage of the fusion protein.

[0038] In one embodiment, a fusion protein may be generated by the fusion of a high-grade dysplasia polypeptide with a tag polypeptide which provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is generally placed at the amino- or carboxy-terminus of the high-grade esophageal dysplasia polypeptide. The presence of such epitope-tagged forms of a high-grade esophageal dysplasia polypeptide can be detected using an antibody against the tag polypeptide. Also, provision of the epitope tag enables the high-grade dysplasia polypeptide to be readily purified by affinity purification using an anti-tag antibody or another type of affinity matrix that binds to the epitope tag.

[0039] Various tag polypeptides and their respective antibodies are well known in the art. Examples include poly-histidine or poly-histidine-glycine tags and the c-myc tag and antibodies thereto. Fragments of high-grade dysplasia polypeptide may also be produced by direct peptide synthesis using solid-phase techniques. Automated synthesis may be achieved by using the ABI 433A Peptide Synthesizer (Applied Biosystems, Foster City, Calif.). Various fragments of high-grade dysplasia polypeptide may be synthesized separately and then combined to produce the full-length molecule.

[0040] In a further aspect of the invention there is provided a method of preparing a polypeptide as described above, comprising the steps of: (1) culturing the host cells under conditions effective for production of the polypeptide; and (2) harvesting the polypeptide.

[0041] Substantially purified high-grade dysplasia polypeptide or fragments thereof can then be used in further biochemical analyses to establish secondary and tertiary structure, for example by x-ray crystallography of the protein or by nuclear magnetic resonance (NMR). Determination of structure allows for the rational design of pharmaceuticals to interact with the protein, alter protein charge configuration or charge interaction with other proteins, or to alter its function in the cell.

[0042] With the identification of the high-grade esophageal dysplasia marker gene nucleotide sequences and the polypeptide sequences encoded by them, probes and antibodies raised to the genes can be used in a variety of hybridisation and immunological assays to screen for and detect the presence of either a normal or mutated gene or gene product.

[0043] In addition, the nucleotide and protein sequences of the high-grade dysplasia genes provided in this invention enable therapeutic methods for the treatment of cancer, such as adenocarcinoma associated with one or more of these genes, enable screening of compounds for therapeutic intervention, and also enable methods for the diagnosis or prognosis of cancer associated with these genes. Examples of such cancers include, but are not limited to, esophageal adenocarcinoma.

[0044] Transducing retroviral vectors are often used for producing a cell line expressing a gene above the level of expression in a cell lacking the additional copy of the gene. Such a cell is useful according to the invention for the production of a cell line useful for screening candidate compounds capable of reducing expression of a gene associated with high-grade esophageal dysplasia, reducing expression of a polypeptide encoded by the gene, or inhibiting activity of the polypeptide, such that the cell does not progress from dysplasia to cancer. The full-length high-grade dysplasia gene, or portions thereof, can be cloned into a retroviral vector and expression can be driven from its endogenous promoter or from the retroviral long terminal repeat or from a promoter specific for the target cell type of interest. Other viral vectors can be used and include, as is known in the art, adenoviruses, adeno-associated virus, vaccinia virus, papovaviruses, lentiviruses and retroviruses of avian, murine and human origin.

[0045] The viral vector described herein above is also useful for gene therapy to reduce the activity of the high-grade dysplasia genes of the invention, such as by antisense expression inhibition or RNA interference (see, for example, Paddison, P. J. et al., Genes & Development 16:948-958 (2002) and Brummelkamp, T. R. et al., Science 296:550-553 (2002)). Gene therapy would be carried out according to established methods (Friedman, 1991; Culver, 1996). A vector containing a copy of a high-grade esophageal dysplasia gene linked to expression control elements and capable of replicating inside the cells is prepared. Alternatively the vector may be replication deficient and may require helper cells or helper virus for replication and virus production and use in gene therapy.

[0046] Gene transfer using non-viral methods of infection can also be used. These methods include direct injection of DNA, uptake of naked DNA in the presence of calcium phosphate, electroporation, protoplast fusion or liposome delivery. Gene transfer can also be achieved by delivery as a part of a human artificial chromosome or receptor-mediated gene transfer. This involves linking the DNA to a targeting molecule that will bind to specific cell-surface receptors to induce endocytosis and transfer of the DNA into mammalian cells. One such technique uses poly-L-lysine to link asialoglycoprotein to DNA. An adenovirus is also added to the complex to disrupt the lysosomes and thus allow the DNA to avoid degradation and move to the nucleus. Infusion of these particles intravenously has resulted in gene transfer into hepatocytes.

[0047] Inhibiting the function of high-grade esophageal dysplasia genes or polypeptides that are up-regulated in cancer can be achieved in a variety of ways, as would be appreciated by those skilled in the art. Typically, a vector expressing the complement of a polynucleotide encoding a high-grade dysplasia gene of the invention may be administered to a subject to treat or prevent a disorder associated with increased activity and/or expression of the gene including, but not limited to, those described above.
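As an illustration of what "the complement of a polynucleotide" means at the sequence level, the following minimal Python sketch derives the antisense (reverse-complement) strand of a short DNA string. The function name and the example sequence are hypothetical and are not taken from the sequence listing; the sketch says nothing about vector design or delivery.

```python
# Illustrative sketch only: the helper name and the input sequence are
# assumptions made for this example, not part of the disclosure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base sense fragment.
antisense = reverse_complement("ATGGCC")
```

Applying the function twice returns the original sequence, which is a quick sanity check on any complement routine.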

[0048] Antisense strategies may use a variety of approaches including the use of antisense oligonucleotides, ribozymes, DNAzymes, injection of antisense RNA and transfection of antisense RNA expression vectors. Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved using methods which are well known in the art (see, for example, Goldman, C. K. et al., Nature Biotechnology 15:462-466 (1997)).

[0049] Where purified protein or polypeptide is used to produce antibodies which specifically bind a high-grade dysplasia protein, the antibody(ies) are used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues that express the protein. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric and single chain antibodies as would be understood by the person skilled in the art.

[0050] For the production of antibodies, various hosts including rabbits, rats, goats, mice, humans, and others may be immunized by injection with a protein of the invention or with any fragment or oligopeptide thereof, which has immunogenic properties. Various adjuvants may be used to increase immunological response and include, but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface-active substances such as lysolecithin. Adjuvants used in humans include BCG (bacilli Calmette-Guerin) and Corynebacterium parvum.

[0051] It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to the high-grade dysplasia polypeptides of the invention have an amino acid sequence consisting of at least about 5 amino acids, and, more preferably, of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are identical to a portion of the amino acid sequence of the natural protein and contain the entire amino acid sequence of a small, naturally occurring molecule. Short stretches of amino acids from these proteins may be fused with those of another protein, such as KLH, and antibodies to the chimeric molecule may be produced.

[0052] Monoclonal antibodies to high-grade dysplasia polypeptides or proteins of the invention may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. (For example, see Kohler, G. and Milstein, C., Nature 256:495-497 (1975); Kozbor, D. et al., J. Immunol. Methods 81:31-42 (1985); and Cole, S. P. et al., Mol. Cell Biochem. 62:109-120 (1984)).

[0053] Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature.

[0054] Antibody fragments which contain specific binding sites for the high-grade esophageal dysplasia proteins may also be generated. For example, such fragments include F(ab')2 fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity. (For example, see Huse, W. D. et al., Science 246:1275-1281 (1989)). Various immunoassays well known in the art may be used for screening to identify antibodies having the desired specificity.

[0055] Numerous protocols for competitive binding or immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between a protein and its specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive binding assay may also be employed.

[0056] Candidate pharmaceutical agents or compounds encompass numerous chemical classes, though typically they are organic molecules, preferably small organic compounds having a molecular weight of more than 100 and less than about 2,500 daltons. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, and steroids.
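The preferred molecular weight window can be expressed as a simple pre-filter on a compound list. The sketch below is illustrative only: the compound names and weights are invented, and treating "about 2,500 daltons" as a hard cutoff is an assumption made for the example.

```python
# Illustrative sketch only: compound names and weights are made up, and the
# strict upper bound is an assumption (the text says "about 2,500 daltons").
def in_candidate_range(mw_daltons: float,
                       low: float = 100.0,
                       high: float = 2500.0) -> bool:
    """True if the molecular weight is above `low` and below `high`."""
    return low < mw_daltons < high


# Hypothetical library: name -> molecular weight in daltons.
candidates = {"compound_a": 85.0, "compound_b": 342.3, "compound_c": 3100.0}

selected = [name for name, mw in candidates.items() if in_candidate_range(mw)]
```

In practice the bounds would be chosen per screen; they are exposed as parameters here for that reason.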

[0057] Agent screening techniques include, but are not limited to, utilising eukaryotic or prokaryotic host cells that are stably transformed with recombinant molecules expressing a particular high-grade dysplasia polypeptide of the invention, or fragment thereof, preferably in competitive binding assays. Binding assays will measure the formation of complexes between the high-grade esophageal dysplasia polypeptide, or fragments thereof, and the agent being tested, or will measure the degree to which an agent being tested will interfere with the formation of a complex between the high-grade esophageal dysplasia polypeptide, or fragment thereof, and a known ligand.

[0058] Another technique for drug screening provides high-throughput screening for compounds having suitable binding affinity to a high-grade dysplasia polypeptide. In such a technique, large numbers of small peptide test compounds are synthesised on a solid substrate and can be assayed through high-grade esophageal dysplasia polypeptide binding and washing. Bound high-grade dysplasia polypeptide is then detected by methods well known in the art. In a variation of this technique, purified polypeptides can be coated directly onto plates to identify interacting test compounds.

[0059] An additional method for drug screening involves the use of host eukaryotic cell lines which carry mutations in a particular high-grade dysplasia gene. The host cell lines are also defective at the polypeptide level. Other cell lines may be used where the gene expression of the high-grade esophageal dysplasia gene can be switched off or up-regulated. The host cell lines or cells are grown in the presence of various drug compounds and the rate of growth of the host cells is measured to determine if the compound is capable of regulating the growth of defective cells.

[0060] A high-grade esophageal dysplasia polypeptide encoded by an HGD marker gene may also be used for screening compounds developed as a result of combinatorial library technology. This provides a way to test a large number of different substances for their ability to modulate activity of a polypeptide. The use of peptide libraries is preferred; such libraries and their use are known in the art.

[0061] A substance identified as a modulator of polypeptide function may be peptide or non-peptide in nature. Non-peptide "small molecules" are often preferred for many in vivo pharmaceutical applications. In addition, a mimic or mimetic of the substance may be designed for pharmaceutical use. The design of mimetics based on a known pharmaceutically active compound (i.e., a "lead compound") is a common approach to the development of novel pharmaceuticals. This is often desirable where the original active compound is difficult or expensive to synthesise or where it is unsuitable for a particular method of administration. In the design of a mimetic, particular parts of the original active compound that are important in determining the target property are identified. These parts or residues constituting the active region of the compound are known as its pharmacophore. Once found, the pharmacophore structure is modelled according to its physical properties using data from a range of sources including x-ray diffraction data and NMR. A template molecule is then selected onto which chemical groups which mimic the pharmacophore can be added. The selection can be made such that the mimetic is easy to synthesise, is likely to be pharmacologically acceptable, does not degrade in vivo and retains the biological activity of the lead compound. Further optimisation or modification can be carried out to select one or more final mimetics useful for in vivo or clinical testing.

[0062] It is also possible to isolate a target-specific antibody and then solve its crystal structure. In principle, this approach yields a pharmacophore upon which subsequent drug design can be based as described above. It may be possible to avoid protein crystallography altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, pharmacologically active antibody.

[0063] As a mirror image of a mirror image, the binding site of the anti-ids would be expected to be an analogue of the original binding site. The anti-id could then be used to isolate peptides from chemically or biologically produced peptide banks.

[0064] In further embodiments, any of the genes, proteins, antagonists, antibodies, complementary sequences, or vectors of the invention may be administered in combination with other appropriate therapeutic agents.

[0065] Selection of the appropriate agents may be made by those skilled in the art, according to conventional pharmaceutical principles. The combination of therapeutic agents may act synergistically to effect the treatment or prevention of the various disorders described above. Using this approach, therapeutic efficacy with lower dosages of each agent may be possible, thus reducing the potential for adverse side effects.

[0066] In a further aspect a pharmaceutical composition and a pharmaceutically acceptable carrier may be administered to a patient diagnosed as experiencing high-grade esophageal dysplasia for the inhibition or prevention of progression of the disease to adenocarcinoma.

[0067] The pharmaceutical composition may comprise any one or more of a polypeptide as described above, typically a substantially purified high-grade esophageal dysplasia polypeptide, an antibody to a high-grade esophageal dysplasia polypeptide, a vector capable of expressing a high-grade esophageal dysplasia polypeptide, a compound which increases or decreases expression of a high-grade esophageal dysplasia gene, a candidate drug that restores wild-type activity to a high-grade esophageal dysplasia gene or an antagonist of a high-grade esophageal dysplasia gene.

[0068] The pharmaceutical composition may be administered to a subject to treat or prevent a cancer associated with decreased activity and/or expression of a high-grade esophageal dysplasia gene including, but not limited to, those provided above.

[0069] Pharmaceutical compositions in accordance with the present invention are prepared by mixing a polypeptide of the invention, or active fragments or variants thereof, having the desired degree of purity, with acceptable carriers, excipients, or stabilizers which are well known.

[0070] Acceptable carriers, excipients or stabilizers are nontoxic at the dosages and concentrations employed, and include buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; and/or nonionic surfactants such as Tween, Pluronics or polyethylene glycol (PEG).

[0071] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans.

[0072] Polynucleotide sequences encoding the high-grade esophageal dysplasia genes of the invention may be used for the diagnosis or prognosis of cancers associated with their dysfunction, or a predisposition to such cancers. Examples of such cancers include, but are not limited to, adenocarcinoma, such as in patients having Barrett's esophagus. Diagnosis or prognosis may be used to determine the severity, type or stage of the disease state in order to initiate an appropriate therapeutic intervention.

[0073] In another embodiment of the invention, the polynucleotides that may be used for diagnostic or prognostic purposes include oligonucleotide sequences, genomic DNA and complementary RNA and DNA molecules. The polynucleotides may be used to detect and quantitate gene expression in biopsied tissues in which mutations or abnormal expression of the relevant high-grade esophageal dysplasia gene may be correlated with disease. Genomic DNA used for the diagnosis or prognosis may be obtained from body cells, such as those present in the blood, tissue biopsy, surgical specimen, or autopsy material. The DNA may be isolated and used directly for detection of a specific sequence or may be amplified by the polymerase chain reaction (PCR) prior to analysis. Similarly, RNA or cDNA may also be used, with or without PCR amplification. To detect a specific nucleic acid sequence, direct nucleotide sequencing, reverse transcriptase PCR (RT-PCR), hybridization using specific oligonucleotides, restriction enzyme digest and mapping, PCR mapping, RNAse protection, and various other methods may be employed.

[0074] Oligonucleotides specific to particular sequences can be chemically synthesized and labelled radioactively or non-radioactively and hybridised to individual samples immobilized on membranes or other solid-supports or in solution. The presence, absence or excess expression of a particular high-grade esophageal dysplasia gene may then be visualized using methods such as autoradiography, fluorometry, or colorimetry.

[0075] In a particular aspect, the nucleotide sequences encoding a high-grade esophageal dysplasia gene of the invention may be useful in assays that detect the presence of associated disorders, particularly those mentioned previously. The nucleotide sequences encoding the relevant high-grade esophageal dysplasia gene may be labelled by standard methods and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes.

[0076] After a suitable incubation period, the sample is washed and the signal is quantitated and compared with a standard value. If the amount of signal in the patient sample is significantly altered in comparison to a control sample then the presence of altered levels of nucleotide sequences encoding the high-grade esophageal dysplasia gene in the sample indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual patient.

[0077] In order to provide a basis for the diagnosis or prognosis of a disorder associated with a mutation in a particular high-grade esophageal dysplasia gene of the invention, the nucleotide sequence of the relevant gene can be compared between normal tissue and diseased tissue in order to establish whether the patient expresses a mutant gene.

[0078] In order to provide a basis for the diagnosis or prognosis of a disorder associated with abnormal expression of a particular high-grade esophageal dysplasia gene of the invention, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, encoding the relevant high-grade esophageal dysplasia gene, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of a substantially purified polynucleotide is used.

[0079] Another method to identify a normal or standard profile for expression of a particular high-grade esophageal dysplasia gene is through quantitative RT-PCR studies. RNA isolated from body cells of a normal individual, particularly RNA isolated from tumour cells, is reverse transcribed and real-time PCR using oligonucleotides specific for the relevant high-grade esophageal dysplasia gene is conducted to establish a normal level of expression of the gene.

[0080] Standard values obtained in both these examples may be compared with values obtained from samples from patients who are symptomatic for a disorder. Deviation from standard values is used to establish the presence of a disorder.
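The comparison against standard values can be sketched as follows. The text does not specify a statistical test, so this sketch assumes, purely for illustration, that a patient value is flagged when it lies more than a chosen number of standard deviations from the mean of the normal samples. The function name, the threshold, and all numbers are hypothetical.

```python
from statistics import mean, stdev


def deviates_from_standard(patient_value, normal_values, threshold_sd=2.0):
    """Compare one patient measurement with a set of normal measurements.

    Returns (flagged, z), where z is the patient's distance from the normal
    mean in units of the normal sample standard deviation. The 2-SD default
    threshold is an assumption for this sketch, not a value from the text.
    """
    mu = mean(normal_values)
    sd = stdev(normal_values)  # sample SD; needs at least two normal values
    if sd == 0:
        raise ValueError("normal values show no variation; z is undefined")
    z = (patient_value - mu) / sd
    return abs(z) > threshold_sd, z


# Hypothetical relative expression values for one marker gene.
normal = [1.0, 1.2, 0.9, 1.1, 0.8]
flagged, z = deviates_from_standard(2.5, normal)
```

The same function could be applied gene by gene across a marker panel; how many genes must deviate before a sample is called abnormal is a separate design choice not addressed here.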

[0081] Once the presence of a disorder is established and a treatment protocol is initiated, hybridization assays or quantitative RT-PCR studies may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in the normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0082] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide sequences, including genomic sequences, encoding a particular high-grade esophageal dysplasia gene, or closely related molecules, may be used to identify nucleic acid sequences which encode the gene. The specificity of the probe, whether it is made from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of the hybridization or amplification will determine whether the probe identifies only naturally occurring sequences encoding the high-grade esophageal dysplasia gene, allelic variants, or related sequences.

[0083] Probes may also be used for the detection of related sequences, and should preferably have at least 50% sequence identity to any of the high-grade esophageal dysplasia encoding sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of HGD marker genes disclosed in Table 4 or from genomic sequences including promoters, enhancers, and introns of the genes.
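To make the sequence identity threshold concrete, the following sketch computes percent identity between two sequences that are assumed to be already aligned and of equal length (gaps written as "-"). This is a simplification: the text does not state how identity is calculated, and a real comparison would first align the sequences with a dedicated alignment tool. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.

    Assumes the inputs are pre-aligned and equal in length. Columns that
    contain a gap ("-") never count as matches. Dividing by the full
    alignment length is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical ten-base probe and target, differing at two positions.
probe = "ACGTACGTAC"
target = "ACGTTCGAAC"
identity = percent_identity(probe, target)
meets_threshold = identity >= 50.0
```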

[0084] Means for producing specific hybridization probes for DNAs encoding the high-grade esophageal dysplasia genes of the invention include the cloning of polynucleotide sequences encoding these genes or their derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, and are commercially available. Hybridization probes may be labelled by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, or other methods known in the art.

[0085] According to a further aspect of the invention there is provided the use of a polypeptide as described above in the diagnosis or prognosis of a cancer associated with a high-grade esophageal dysplasia gene of the invention, or a predisposition to such cancers.

[0086] When a diagnostic or prognostic assay is to be based upon a high-grade esophageal dysplasia protein, a variety of approaches are possible. For example, diagnosis or prognosis can be achieved by monitoring differences in the electrophoretic mobility of normal and mutant proteins. Such an approach will be particularly useful in identifying mutants in which charge substitutions are present, or in which insertions, deletions or substitutions have resulted in a significant change in the electrophoretic migration of the resultant protein. Alternatively, diagnosis may be based upon differences in the proteolytic cleavage patterns of normal and mutant proteins, differences in molar ratios of the various amino acid residues, or by functional assays demonstrating altered function of the gene products.

[0087] In another aspect, antibodies that specifically bind a high-grade esophageal dysplasia gene of the invention may be used for the diagnosis or prognosis of cancers characterized by abnormal expression of the gene, or in assays to monitor patients being treated with the gene or agonists, antagonists, or inhibitors of the gene. Antibodies useful for diagnostic purposes may be prepared in the same manner as described above for therapeutics. Diagnostic or prognostic assays include methods that utilize the antibody and a label to detect a high-grade esophageal dysplasia gene of the invention in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and may be labelled by covalent or non-covalent attachment of a reporter molecule.

[0088] A variety of protocols for measuring a high-grade esophageal dysplasia gene of the invention, including ELISA, RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of their expression. Normal or standard values for their expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, preferably human, with antibody to the high-grade esophageal dysplasia protein under conditions suitable for complex formation. The amount of standard complex formation may be quantitated by various methods, preferably by photometric means. Quantities of any of the high-grade esophageal dysplasia genes expressed in subject, control, and disease samples from biopsied tissues are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.

[0089] Once an individual has been diagnosed with a cancer, effective treatments can be initiated. These may include administering a selective agonist to the relevant mutant high-grade esophageal dysplasia gene so as to restore its function to a normal level or introduction of the wild-type gene, particularly through gene therapy approaches as described above. Typically, a vector capable of expressing the appropriate full-length high-grade esophageal dysplasia gene or a fragment or derivative thereof may be administered. In an alternative approach to therapy, a substantially purified high-grade esophageal dysplasia polypeptide and a pharmaceutically acceptable carrier may be administered, as described above, or drugs which can replace the function of or mimic the action of the relevant high-grade esophageal dysplasia gene may be administered.

[0090] In the treatment of cancers associated with increased high-grade esophageal dysplasia gene expression and/or activity, the affected individual may be treated with a selective antagonist such as an antibody to the relevant protein or an antisense (complement) probe to the corresponding gene as described above, or through the use of drugs which may block the action of the relevant high-grade esophageal dysplasia gene.

[0091] In further embodiments, complete cDNAs, oligonucleotides or longer fragments derived from any of the polynucleotide sequences described herein may be used as targets in a microarray. The microarray can be used to monitor the expression level of large numbers of genes simultaneously and to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to detect or prognose a disorder, and to develop and monitor the activities of therapeutic agents. Microarrays may be prepared, used, and analyzed using methods known in the art (for example, see Schena, M. et al. PNAS USA 93:10614-10619 (1996); Heller, R. A. et al., PNAS USA 94:2150-2155 (1997); and Heller, M. J., Annual Review of Biomedical Engineering 4:129-53 (2002)).

[0092] The present invention also provides for the production of genetically modified (knock-out, knock-down, knock-in and transgenic), non-human animal models transformed with the DNA molecules of the invention. These animals are useful for the study of high-grade esophageal dysplasia gene function, to study the mechanisms of cancer as related to the high-grade esophageal dysplasia genes, for the screening of candidate pharmaceutical compounds, for the creation of explanted mammalian cell cultures which express the protein or mutant protein and for the evaluation of potential therapeutic interventions.

[0093] One of the high-grade esophageal dysplasia genes of the invention may have been inactivated by knock-out deletion, and knock-out genetically modified non-human animals are therefore provided.

[0094] Animal species which are suitable for use in the animal models of the present invention include, but are not limited to, rats, mice, hamsters, guinea pigs, rabbits, dogs, cats, goats, sheep, pigs, and non-human primates such as monkeys and chimpanzees. For initial studies, genetically modified mice and rats are highly desirable due to their relative ease of maintenance and shorter life spans. For certain studies, transgenic yeast or invertebrates may be suitable and preferred because they allow for rapid screening and provide for much easier handling. For longer term studies, non-human primates may be desired due to their similarity with humans.

[0095] To create an animal model for a mutated high-grade esophageal dysplasia gene of the invention, several methods can be employed. These include generation of a specific mutation in a homologous animal gene, insertion of a wild type human gene and/or a humanized animal gene by homologous recombination, insertion of a mutant (single or multiple) human gene as genomic or minigene cDNA constructs using wild type or mutant or artificial promoter elements, or insertion of artificially modified fragments of the endogenous gene by homologous recombination. The modifications include insertion of mutant stop codons, the deletion of DNA sequences, or the inclusion of recombination elements (loxP sites) recognized by enzymes such as Cre recombinase.

[0096] To create a transgenic mouse, which is preferred, a mutant version of a particular high-grade esophageal dysplasia gene of the invention can be inserted into a mouse germ line using standard techniques of oocyte microinjection or transfection or microinjection into embryonic stem cells. Alternatively, if it is desired to inactivate or replace the endogenous high-grade esophageal dysplasia gene, homologous recombination using embryonic stem cells may be applied. For oocyte injection, one or more copies of the mutant or wild type high-grade esophageal dysplasia gene can be inserted into the pronucleus of a just-fertilized mouse oocyte. This oocyte is then reimplanted into a pseudo-pregnant foster mother. The liveborn mice can then be screened for integrants using analysis of tail DNA for the presence of human high-grade esophageal dysplasia gene sequences. The transgene can be either a complete genomic sequence injected as a YAC, BAC, PAC or other chromosome DNA fragment, a cDNA with either the natural promoter or a heterologous promoter, or a minigene containing all of the coding region and other elements found to be necessary for optimum expression. The genetically modified non-human animals as described above are useful for the screening of candidate pharmaceutical compounds.

BRIEF DESCRIPTION OF THE DRAWINGS

[0097] FIGS. 1A and 1B are graphs showing a distribution of expression of IL-1H1 (FIG. 1A) and CYP2J2 (FIG. 1B) in the dysplasia-carcinoma sequence in BE. Expression in normal epithelium and in esophageal epithelia from samples of Barrett's esophagus (BE), dysplasia (D), BE adjacent to adenocarcinoma (BE-CA), and adenocarcinoma (CA) is plotted. The vertical line denotes the average Z score in each disease group. Normal refers to the normal esophagus group. Dysplasia includes low- and high-grade dysplasia samples.

[0098] FIGS. 2A and 2B are graphs showing a distribution of expression of AGR2 (FIG. 2A) and NROB2 (FIG. 2B) in the dysplasia-carcinoma sequence in BE. Expression in esophageal epithelia from samples of Barrett's esophagus (BE), dysplasia (D), BE adjacent to adenocarcinoma (BE-CA), and adenocarcinoma (CA) is plotted. The vertical line denotes the average Z score in each disease group. Normal refers to pooled epithelia samples. Dysplasia includes low- and high-grade dysplasia samples.

[0099] FIGS. 3A and 3B are graphs showing a distribution of expression of TCF4 (FIG. 3A) and FLJ23399 (FIG. 3B) in the dysplasia-carcinoma sequence in BE. Expression in esophageal epithelia from samples of Barrett's esophagus (BE), dysplasia (D), BE adjacent to adenocarcinoma (BE-CA), and adenocarcinoma (CA) is plotted. The vertical line denotes the average Z score in each disease group. Normal refers to pooled epithelia samples. Dysplasia includes low- and high-grade dysplasia samples.

[0100] FIGS. 4A and 4B show the nucleic acid sequence (SEQ ID NO:1) and the amino acid sequence (SEQ ID NO:2) of ET-1 (endothelin-1, NM.sub.--001955).

[0101] FIGS. 5A and 5B show the nucleic acid sequence (SEQ ID NO:3) and the amino acid sequence (SEQ ID NO:4) of AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408).

[0102] FIGS. 6A and 6B show the nucleic acid sequence (SEQ ID NO:5) and the amino acid sequence (SEQ ID NO:6) of ADAM8 (NM.sub.--001109).

[0103] FIGS. 7A and 7B show the nucleic acid sequence (SEQ ID NO:7) and the amino acid sequence (SEQ ID NO:8) of PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773).

[0104] FIGS. 8A-8C show the nucleic acid sequence (SEQ ID NO:9) and FIG. 8D shows the amino acid sequence (SEQ ID NO:10) of AXO1 (Axonin-1 precursor, NM.sub.--005076).

[0105] FIGS. 9A and 9B show the nucleic acid sequence (SEQ ID NO:11) and the amino acid sequence (SEQ ID NO:12) of NROB2 (Nuclear hormone receptor, NM.sub.--021969).

[0106] FIGS. 10A and 10B show the nucleic acid sequence (SEQ ID NO:13) and the amino acid sequence (SEQ ID NO:14) of TM7SF1 (NM.sub.--003272).

[0107] FIGS. 11A and 11B show the nucleic acid sequence (SEQ ID NO:15) and the amino acid sequence (SEQ ID NO:16) of DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108).

[0108] FIGS. 12A and 12B show the nucleic acid sequence (SEQ ID NO:17) and the amino acid sequence (SEQ ID NO:18) of MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283).

[0109] FIGS. 13A and 13B show the nucleic acid sequence (SEQ ID NO:19) and the amino acid sequence (SEQ ID NO:20) of STC-2 (stanniocalcin-2, NM.sub.--003714).

[0110] FIGS. 14A and 14B show the nucleic acid sequence (SEQ ID NO:21) and the amino acid sequence (SEQ ID NO:22) of PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631).

[0111] FIGS. 15A and 15B show the nucleic acid sequence (SEQ ID NO:23) and the amino acid sequence (SEQ ID NO:24) of SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769).

[0112] FIGS. 16A and 16B show the nucleic acid sequence (SEQ ID NO:25) and the amino acid sequence (SEQ ID NO:26) of CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717).

[0113] FIGS. 17A and 17B show the nucleic acid sequence (SEQ ID NO:27) and the amino acid sequence (SEQ ID NO:28) of PA21 (phospholipase a2 precursor, NM.sub.--000928).

[0114] FIGS. 18A and 18B show the nucleic acid sequence (SEQ ID NO:29) and the amino acid sequence (SEQ ID NO:30) of PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242).

[0115] FIGS. 19A and 19B show the nucleic acid sequence (SEQ ID NO:31) and the amino acid sequence (SEQ ID NO:32) of IDE (insulin-degrading enzyme, NM.sub.--004969).

[0116] FIGS. 20A-20B show the nucleic acid sequence (SEQ ID NO:33) and FIG. 20C shows the amino acid sequence (SEQ ID NO:34) of MYO1A (myosin-1A, NM.sub.--005379).

[0117] FIGS. 21A and 21B show the nucleic acid sequence (SEQ ID NO:35) and the amino acid sequence (SEQ ID NO:36) of CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775).

[0118] FIGS. 22A and 22B show the nucleic acid sequence (SEQ ID NO:37) and the amino acid sequence (SEQ ID NO:38) of PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214).

[0119] FIGS. 23A and 23B show the nucleic acid sequence (SEQ ID NO:39) and the amino acid sequence (SEQ ID NO:40) of CYB5 (cytochrome b5, 3' end, NM.sub.--001914).

[0120] FIGS. 24A and 24B show the nucleic acid sequence (SEQ ID NO:41) and the amino acid sequence (SEQ ID NO:42) of COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863).

[0121] FIGS. 25A and 25B show the nucleic acid sequence (SEQ ID NO:43) and the amino acid sequence (SEQ ID NO:44) of TCF4 (NM.sub.--030756).

[0122] FIGS. 26A-26B show the nucleic acid sequence (SEQ ID NO:45) and FIG. 26C shows the amino acid sequence (SEQ ID NO:46) of CAD17 (liver-intestine cadherin, NM.sub.--004063).

[0123] FIGS. 27A and 27B show the nucleic acid sequence (SEQ ID NO:47) and the amino acid sequence (SEQ ID NO:48) of CLDN15 (claudin 15, NM.sub.--014343).

[0124] FIGS. 28A-28B show the nucleic acid sequence (SEQ ID NO:49) and FIG. 28C shows the amino acid sequence (SEQ ID NO:50) of CFTR (chloride channel, NM.sub.--000492).

[0125] FIGS. 29A and 29B show the nucleic acid sequence (SEQ ID NO:51) and the amino acid sequence (SEQ ID NO:52) of H2R (histamine H2 receptor, NM.sub.--022304).

[0126] FIGS. 30A-30B show the nucleic acid sequence (SEQ ID NO:53) and FIG. 30C shows the amino acid sequence (SEQ ID NO:54) of EGFR (epidermal growth factor receptor, NM.sub.--005228).

[0127] FIGS. 31A-31B show the nucleic acid sequence (SEQ ID NO:55) and FIG. 31C shows the amino acid sequence (SEQ ID NO:56) of EPHB2 (NM.sub.--004442).

[0128] FIGS. 32A and 32B show the nucleic acid sequence (SEQ ID NO:57) and the amino acid sequence (SEQ ID NO:58) of CRIPTO CR-1 (NM.sub.--003212).

[0129] FIGS. 33A and 33B show the nucleic acid sequence (SEQ ID NO:59) and the amino acid sequence (SEQ ID NO:60) of Ephrin B1 (NM.sub.--004429).

[0130] FIGS. 34A and 34B show the nucleic acid sequence (SEQ ID NO:61) and the amino acid sequence (SEQ ID NO:62) of MMP-17/MT4-MMP (matrix metalloproteinase 17, NM.sub.--016155).

[0131] FIGS. 35A and 35B show the nucleic acid sequence (SEQ ID NO:63) and the amino acid sequence (SEQ ID NO:64) of MMP26 (matrix metalloproteinase 26, NM.sub.--021801).

[0132] FIGS. 36A and 36B show the nucleic acid sequence (SEQ ID NO:65) and the amino acid sequence (SEQ ID NO:66) of ADAM10 (NM.sub.--001110).

[0133] FIGS. 37A and 37B show the nucleic acid sequence (SEQ ID NO:67) and the amino acid sequence (SEQ ID NO:68) of ADAM1 (XM.sub.--132370).

[0134] FIGS. 38A and 38B show the nucleic acid sequence (SEQ ID NO:69) and the amino acid sequence (SEQ ID NO:70) of TIM1 (NM.sub.--003254).

[0135] FIGS. 39A and 39B show the nucleic acid sequence (SEQ ID NO:71) and the amino acid sequence (SEQ ID NO:72) of MUC1 (XM.sub.--053256).

[0136] FIGS. 40A and 40B show the nucleic acid sequence (SEQ ID NO:73) and the amino acid sequence (SEQ ID NO:74) of CEA (NM.sub.--004363).

[0137] FIGS. 41A and 41B show the nucleic acid sequence (SEQ ID NO:75) and the amino acid sequence (SEQ ID NO:76) of NCA (NM.sub.--002483).

[0138] FIGS. 42A and 42B show the nucleic acid sequence (SEQ ID NO:77) and the amino acid sequence (SEQ ID NO:78) of Follistatin (NM.sub.--006350).

[0139] FIGS. 43A and 43B show the nucleic acid sequence (SEQ ID NO:79) and the amino acid sequence (SEQ ID NO:80) of Claudin 1 (NM.sub.--021101).

[0140] FIGS. 44A and 44B show the nucleic acid sequence (SEQ ID NO:81) and the amino acid sequence (SEQ ID NO:82) of Claudin 14 (NM.sub.--012130).

[0141] FIGS. 45A-45B show the nucleic acid sequence (SEQ ID NO:83) and FIG. 45C shows the amino acid sequence (SEQ ID NO:84) of Tenascin-R (NM.sub.--003285).

[0142] FIGS. 46A and 46B show the nucleic acid sequence (SEQ ID NO:85) and the amino acid sequence (SEQ ID NO:86) of CAD3 (NM.sub.--001793).

[0143] FIGS. 47A and 47B show the nucleic acid sequence (SEQ ID NO:87) and the amino acid sequence (SEQ ID NO:88) of CONT (NM.sub.--001843).

[0144] FIGS. 48A and 48B show the nucleic acid sequence (SEQ ID NO:89) and the amino acid sequence (SEQ ID NO:90) of Osteopontin (NM.sub.--000582).

[0145] FIGS. 49A and 49B show the nucleic acid sequence (SEQ ID NO:91) and the amino acid sequence (SEQ ID NO:92) of Galectin 8 (NM.sub.--006499).

[0146] FIGS. 50A and 50B show the nucleic acid sequence (SEQ ID NO:93) and the amino acid sequence (SEQ ID NO:94) of GS1 (biglycan, NM.sub.--001711).

[0147] FIGS. 51A and 51B show the nucleic acid sequence (SEQ ID NO:95) and the amino acid sequence (SEQ ID NO:96) of Frizzled 2 (NM.sub.--001466).

[0148] FIGS. 52A and 52B show the nucleic acid sequence (SEQ ID NO:97) and the amino acid sequence (SEQ ID NO:98) of ISLR (NM.sub.--005545).

[0149] FIGS. 53A-53B show the nucleic acid sequence (SEQ ID NO:) and FIG. 53C shows the amino acid sequence (SEQ ID NO:2) of

[0150] FIGS. 54A and 54B show the nucleic acid sequence (SEQ ID NO:1) and the amino acid sequence (SEQ ID NO:2) of

[0151] FIGS. 55A and 55B show the nucleic acid sequence (SEQ ID NO:103) and the amino acid sequence (SEQ ID NO:104) of Tie2 ligand2 (NM.sub.--001147).

[0152] FIGS. 56A and 56B show the nucleic acid sequence (SEQ ID NO:105) and the amino acid sequence (SEQ ID NO:106) of VEGFC (NM.sub.--005429).

[0153] FIGS. 57A and 57B show the nucleic acid sequence (SEQ ID NO:107) and the amino acid sequence (SEQ ID NO:108) of tPA (NM.sub.--000930).

[0154] FIGS. 58A-58B show the nucleic acid sequence (SEQ ID NO:109) and FIG. 58C shows the amino acid sequence (SEQ ID NO:110) of thrombomodulin (NM.sub.--000361).

[0155] FIGS. 59A and 59B show the nucleic acid sequence (SEQ ID NO:111) and the amino acid sequence (SEQ ID NO:112) of TF (coagulation factor III, thromboplastin, tissue factor, NM.sub.--001993).

[0156] FIGS. 60A and 60B show the nucleic acid sequence (SEQ ID NO:113) and the amino acid sequence (SEQ ID NO:114) of GPR4 (G-coupled protein receptor-4, NM.sub.--005282).

[0157] FIGS. 61A and 61B show the nucleic acid sequence (SEQ ID NO:115) and the amino acid sequence (SEQ ID NO:116) of GPR66 (G-coupled protein receptor 66).

[0158] FIGS. 62A and 62B show the nucleic acid sequence (SEQ ID NO:117) and the amino acid sequence (SEQ ID NO:118) of SLC22A2 (NM.sub.--003058).

[0159] FIGS. 63A-63B show the nucleic acid sequence (SEQ ID NO:119) and FIG. 63C shows the amino acid sequence (SEQ ID NO:120) of MLSN1 (NM.sub.--002420).

[0160] FIGS. 64A-64B show the nucleic acid sequence (SEQ ID NO:121) and FIG. 64C shows the amino acid sequence (SEQ ID NO:122) of ATN2 (Na/K transport, NM.sub.--000702).

DESCRIPTION OF THE INVENTION

[0161] Barrett's esophagus, a complication of gastrointestinal reflux disease, is the primary risk factor for esophageal adenocarcinoma. Biopsy specimens representing disease progression through Barrett's esophagus, dysplasia and adenocarcinoma were collected and analyzed using cDNA microarrays to identify genes expressed in the different disease stages. It was discovered that the expression of particular genes increased with the progression of the disease through dysplasia, especially high-grade dysplasia, suggestive of a differentiated small intestinal enterocyte lineage. The present invention defines a collection of markers that assist in identifying patients at highest risk of developing cancer, especially esophageal adenocarcinoma.

[0162] The progression of Barrett's esophagus through dysplasia to adenocarcinoma was examined, identifying specific genes associated with increasing risk of carcinogenesis. These data provide insight into the potential role of progressive intestinal metaplasia in generating the colon tumor-like expression profiles disclosed herein for esophageal adenocarcinoma. Genes that define early stages of this process, progression of BE to dysplasia, serve as markers to permit targeting of surveillance to those patients at most risk of developing esophageal carcinoma.

[0163] DNA microarray technology has been used to characterize and cluster Barrett's metaplasia from normal mucosa, and esophageal adenocarcinoma and squamous cell carcinoma (Barrett et al., Neoplasia 4:121-128 (2002); and Selaru et al., Oncogene 21:475-478 (2002)). The authors do not, however, describe HGD markers or dysplasia markers of any kind useful for predicting patients likely to develop adenocarcinoma.

[0164] The present invention provides nucleic acid and protein sequences that are differentially expressed in high-grade esophageal dysplasia when compared to normal tissue controls, herein termed "high-grade dysplasia genes," "high-grade dysplasia nucleic acid sequences," "HGD marker genes" and the like. As outlined below, high-grade esophageal dysplasia sequences that are differentially expressed include those that are up-regulated in high-grade esophageal dysplasia. The differential expression of these sequences in high-grade esophageal dysplasia, combined with the fact that they have been identified in patients likely to develop cancer, such as adenocarcinoma, suggests that they are contributory factors in cancer. The high-grade esophageal dysplasia nucleic acid sequences, or the polypeptides encoded by the nucleic acids, of the invention are disclosed in Table 4 as HGD marker genes, or polypeptides, as follows: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); and TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44).

Definitions

[0165] The phrases "gene amplification" and "gene duplication" are used interchangeably and refer to a process by which multiple copies of a gene or gene fragment are formed in a particular cell or cell line. The duplicated region (a stretch of amplified DNA) is often referred to as an "amplicon." Usually, the amount of the messenger RNA (mRNA) produced, i.e., the level of gene expression, also increases in proportion to the number of copies made of the particular gene expressed.

[0166] "Tumor", as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.

[0167] The terms "cancer" and "cancerous" refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. Examples of cancer include, but are not limited to, carcinoma, adenocarcinoma, lymphoma, blastoma, sarcoma, and leukemia. More particular examples of such cancers include esophageal cancer, breast cancer, prostate cancer, colon cancer, squamous cell cancer, small-cell lung cancer, non-small cell lung cancer, gastrointestinal cancer, pancreatic cancer, glioblastoma, cervical cancer, ovarian cancer, liver cancer, bladder cancer, hepatoma, colorectal cancer, endometrial carcinoma, salivary gland carcinoma, kidney cancer, liver cancer, vulval cancer, thyroid cancer, hepatic carcinoma and various types of head and neck cancer.

[0168] The term "diagnosis" or "diagnosing" as used herein shall refer to the determination of the nature of a case of a disease, such as by determining a gene expression profile or polypeptide expression profile unique to the disease or a stage of the disease.

[0169] A "normal" tissue sample refers to tissue or cells that are not diseased as defined herein, such as tissue from a mammal that is not experiencing a particular disease of interest. The term "normal cell" or "normal tissue" as used herein refers to a state of a cell or tissue in which the cell or tissue is apparently free of an adverse biological condition when compared to a diseased cell or tissue having that adverse biological condition. The normal cell or normal tissue may be from any prokaryotic or eukaryotic organism including, but not limited to, bacteria, yeast, insect, bird, reptile, and any mammal including human. Where the normal tissue or cell is used as a normal control sample, it is generally from the same species as the test sample. Where the cell or tissue is mammalian, the cell or tissue is any cell or tissue including, but not limited to blood, muscle, nerve, brain, breast, heart, lung, liver, pancreas, spleen, thymus, esophagus, stomach, intestine, kidney, testis, ovary, uterus, hair follicle, skin, bone, bladder, and spinal cord.

[0170] "Treatment" is an intervention performed with the intention of preventing the development or altering the pathology of a disorder. Accordingly, "treatment" refers to both therapeutic treatment and prophylactic or preventative measures. Those in need of treatment include those already with the disorder as well as those in which the disorder is to be prevented. In tumor (e.g., cancer) treatment, a therapeutic agent may directly decrease the pathology of tumor cells, or render the tumor cells more susceptible to treatment by other therapeutic agents, e.g., radiation and/or chemotherapy.

[0171] A "pharmaceutical composition" as used herein refers to a composition comprising a chemotherapeutic agent for treatment of a disease combined with physiologically acceptable materials such as carriers, excipients, stabilizers, buffers, salts, antioxidants, hydrophilic polymers, amino acids, carbohydrates, ionic or nonionic surfactants, and/or polyethylene or propylene glycol. The pharmaceutical composition may be in the form of an aqueous solution, tablet, capsule, microcapsules, liposomes, transdermal patches, and the like.

[0172] The "pathology" of cancer includes all phenomena that compromise the well-being of the patient. This includes, without limitation, abnormal or uncontrollable cell growth, metastasis, interference with the normal functioning of neighboring cells, release of cytokines or other secretory products at abnormal levels, suppression or aggravation of inflammatory or immunological response, etc.

[0173] "Mammal" for purposes of treatment refers to any animal classified as a mammal, including humans, domestic and farm animals, and zoo, sports, or pet animals, such as dogs, horses, cats, cattle, pigs, sheep, etc. Preferably, the mammal is human.

[0174] "Carriers" as used herein include pharmaceutically acceptable carriers, excipients, or stabilizers which are nontoxic to the cell or mammal being exposed thereto at the dosages and concentrations employed. Often the physiologically acceptable carrier is an aqueous pH buffered solution. Examples of physiologically acceptable carriers include buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; and/or nonionic surfactants such as TWEEN.TM., polyethylene glycol (PEG), and PLURONICS.TM..

[0175] Administration "in combination with" one or more further therapeutic agents includes simultaneous (concurrent) and consecutive administration in any order.

[0176] The term "cytotoxic agent" as used herein refers to a substance that inhibits or prevents the function of cells and/or causes destruction of cells. The term is intended to include radioactive isotopes (e.g., I.sup.131, I.sup.125, Y.sup.90 and Re.sup.186), chemotherapeutic agents, and toxins such as enzymatically active toxins of bacterial, fungal, plant or animal origin, or fragments thereof.

[0177] A "chemotherapeutic agent" is a chemical compound useful in the treatment of cancer. Examples of chemotherapeutic agents include adriamycin, doxorubicin, epirubicin, 5-fluorouracil, cytosine arabinoside ("Ara-C"), cyclophosphamide, thiotepa, busulfan, cytoxin, taxoids, e.g., paclitaxel (Taxol, Bristol-Myers Squibb Oncology, Princeton, N.J.), and docetaxel (Taxotere, Rhône-Poulenc Rorer, Antony, France), taxotere, methotrexate, cisplatin, melphalan, vinblastine, bleomycin, etoposide, ifosfamide, mitomycin C, mitoxantrone, vincristine, vinorelbine, carboplatin, teniposide, daunomycin, carminomycin, aminopterin, dactinomycin, mitomycins, esperamicins (see U.S. Pat. No. 4,675,187), 5-FU, 6-thioguanine, 6-mercaptopurine, actinomycin D, VP-16, chlorambucil, melphalan, and other related nitrogen mustards. Also included in this definition are hormonal agents that act to regulate or inhibit hormone action on tumors such as tamoxifen and onapristone. In an embodiment, the chemotherapeutic agent of the invention is a chemical compound useful in the treatment of HGD or adenocarcinoma, or for inhibiting or preventing progression from HGD to adenocarcinoma in a patient.

[0178] A "growth inhibitory agent" when used herein refers to a compound or composition which inhibits growth of a cell, especially a cancer cell overexpressing any of the genes identified herein, either in vitro or in vivo. Thus, the growth inhibitory agent is one which significantly reduces the percentage of cells overexpressing such genes in S phase. Examples of growth inhibitory agents include agents that block cell cycle progression (at a place other than S phase), such as agents that induce G1 arrest and M-phase arrest. Classical M-phase blockers include the vincas (vincristine and vinblastine), taxol, and topo II inhibitors such as doxorubicin, epirubicin, daunorubicin, etoposide, and bleomycin. Those agents that arrest G1 also spill over into S-phase arrest, for example, DNA alkylating agents such as tamoxifen, prednisone, dacarbazine, mechlorethamine, cisplatin, methotrexate, 5-fluorouracil, and ara-C. Further information can be found in The Molecular Basis of Cancer, Mendelsohn and Israel, eds., Chapter 1, entitled "Cell cycle regulation, oncogenes, and antineoplastic drugs" by Murakami et al., (W B Saunders: Philadelphia, 1995), especially p. 13.

[0179] "Doxorubicin" is an anthracycline antibiotic. The full chemical name of doxorubicin is (8S-cis)-10-[(3-amino-2,3,6-trideoxy-.alpha.-L-lyxo-hexopyranosyl)oxy]-7,8,9,10-tetrahydro-6,8,11-trihydroxy-8-(hydroxyacetyl)-1-methoxy-5,12-naphthacenedione.

[0180] The term "cytokine" is a generic term for proteins released by one cell population which act on another cell as intercellular mediators. Examples of such cytokines are lymphokines, monokines, and traditional polypeptide hormones. Included among the cytokines are growth hormones such as human growth hormone, N-methionyl human growth hormone, and bovine growth hormone; parathyroid hormone; thyroxine; insulin; proinsulin; relaxin; prorelaxin; glycoprotein hormones such as follicle stimulating hormone (FSH), thyroid stimulating hormone (TSH), and luteinizing hormone (LH); hepatic growth factor; fibroblast growth factor; prolactin; placental lactogen; tumor necrosis factor-.alpha. and -.beta.; mullerian-inhibiting substance; mouse gonadotropin-associated peptide; inhibin; activin; vascular endothelial growth factor; integrin; thrombopoietin (TPO); nerve growth factors such as NGF-.beta.; platelet-growth factor; transforming growth factors (TGFs) such as TGF-.alpha. and TGF-.beta.; insulin-like growth factor-I and -II; erythropoietin (EPO); osteoinductive factors; interferons such as interferon-.alpha., -.beta., and -.gamma.; colony stimulating factors (CSFs) such as macrophage-CSF (M-CSF); granulocyte-macrophage-CSF (GM-CSF); and granulocyte-CSF (G-CSF); interleukins (ILs) such as IL-1, IL-1a, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-11, IL-12; a tumor necrosis factor such as TNF-.alpha. or TNF-.beta.; and other polypeptide factors including LIF and kit ligand (KL). As used herein, the term cytokine includes proteins from natural sources or from recombinant cell culture and biologically active equivalents of the native sequence cytokines.

[0181] The term "prodrug" as used in this application refers to a precursor or derivative form of a pharmaceutically active substance that is less cytotoxic to tumor cells compared to the parent drug and is capable of being enzymatically activated or converted into the more active parent form. See, e.g., Wilman, "Prodrugs in Cancer Chemotherapy", Biochemical Society Transactions, 14:375-382, 615th Meeting, Belfast (1986), and Stella et al., "Prodrugs: A Chemical Approach to Targeted Drug Delivery", Directed Drug Delivery, Borchardt et al., (ed.), pp. 147-267, Humana Press (1985). The prodrugs of this invention include, but are not limited to, phosphate-containing prodrugs, thiophosphate-containing prodrugs, sulfate-containing prodrugs, peptide-containing prodrugs, D-amino acid-modified prodrugs, glycosylated prodrugs, .beta.-lactam-containing prodrugs, optionally substituted phenoxyacetamide-containing prodrugs or optionally substituted phenylacetamide-containing prodrugs, 5-fluorocytosine and other 5-fluorouridine prodrugs which can be converted into the more active cytotoxic free drug. Examples of cytotoxic drugs that can be derivatized into a prodrug form for use in this invention include, but are not limited to, those chemotherapeutic agents described above.

[0182] An "effective amount" or "therapeutically effective amount" of a polypeptide disclosed herein or an antagonist thereof, in reference to inhibition of neoplastic cell growth, tumor growth or cancer cell growth, is an amount capable of inhibiting, to some extent, the growth of target cells. The term includes an amount capable of invoking a growth inhibitory, cytostatic and/or cytotoxic effect and/or apoptosis of the target cells. An "effective amount" of an antagonist of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); and TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44) gene or polypeptide, for purposes of inhibiting neoplastic cell growth, tumor growth or cancer cell growth, may be determined empirically and in a routine manner. The terms further refer to an amount capable of invoking one or more of the following effects: (1) inhibition, to some extent, of tumor growth, including slowing down and complete growth arrest; (2) reduction in the number of tumor cells; (3) reduction in tumor size; (4) inhibition (i.e., reduction, slowing down or complete stopping) of tumor cell infiltration into peripheral organs; (5) inhibition (i.e., reduction, slowing down or complete stopping) of metastasis; (6) enhancement of anti-tumor immune response, which may, but does not have to, result in the regression or rejection of the tumor; and/or (7) relief, to some extent, of one or more symptoms associated with the disorder. A "therapeutically effective amount" of an antagonist of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); or TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44) gene or polypeptide for purposes of treatment of a tumor may be determined empirically and in a routine manner.

[0183] A "growth inhibitory amount" of a compound that inhibits growth of a cell expressing genes, or polypeptides, from the following group: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); and TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44) is an amount of the compound capable of inhibiting the growth of a cell, especially a tumor cell, e.g., a cancer cell, either in vitro or in vivo. Optionally, the compound is an antagonist of the gene or polypeptide, such as an antagonist antibody or antagonist small organic molecule. A "growth inhibitory amount" of such a compound, for purposes of inhibiting neoplastic cell growth, may be determined empirically and in a routine manner.

[0184] A "cytotoxic amount" of an ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide antagonist is an amount capable of causing the destruction of a cell, especially a tumor cell, e.g., a cancer cell, either in vitro or in vivo. A "cytotoxic amount" of such a polypeptide antagonist for purposes of inhibiting neoplastic cell growth may be determined empirically and in a routine manner.

[0185] The terms ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); and TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide or protein when used herein encompass native sequence ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide 
dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); and TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide variants (which are further defined herein). 
The ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide may be isolated from a variety of sources, such as from human tissue types or from another source, or prepared by recombinant and/or synthetic methods.

[0186] A "native sequence polypeptide" of each HGD marker polypeptide has the same amino acid sequence or is a polypeptide variant having at least about 80% amino acid sequence identity, preferably at least about 81% amino acid sequence identity, more preferably at least about 82% amino acid sequence identity, more preferably at least about 83% amino acid sequence identity, more preferably at least about 84% amino acid sequence identity, more preferably at least about 85% amino acid sequence identity, more preferably at least about 86% amino acid sequence identity, more preferably at least about 87% amino acid sequence identity, more preferably at least about 88% amino acid sequence identity, more preferably at least about 89% amino acid sequence identity, more preferably at least about 90% amino acid sequence identity, more preferably at least about 91% amino acid sequence identity, more preferably at least about 92% amino acid sequence identity, more preferably at least about 93% amino acid sequence identity, more preferably at least about 94% amino acid sequence identity, more preferably at least about 95% amino acid sequence identity, more preferably at least about 96% amino acid sequence identity, more preferably at least about 97% amino acid sequence identity, more preferably at least about 98% amino acid sequence identity and most preferably at least about 99% amino acid sequence identity with a full-length native sequence polypeptide sequence, lacking the signal peptide as disclosed herein, as the ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, 
NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide as derived from nature. Such native sequence polypeptide can be isolated from nature or can be produced by recombinant and/or synthetic means. The term "native sequence polypeptide" specifically encompasses naturally-occurring truncated or secreted forms (e.g., an extracellular domain sequence), naturally-occurring variant forms (e.g., alternatively spliced forms) and naturally-occurring allelic variants of the polypeptides encoded by a HGD marker gene as disclosed herein. In one embodiment of the invention, the native sequence HGD marker polypeptide is a mature or full-length native sequence HGD marker polypeptide as encoded by the nucleic acid sequences of the GenBank accession numbers listed in Table 4A for the respective polypeptide. 
Also, while the HGD marker polypeptides encoded by the nucleic acid sequences disclosed in the respective GenBank accession numbers listed in Table 4A are shown to begin with the methionine residue designated therein as amino acid position 1, it is conceivable and possible that another methionine residue located either upstream or downstream from amino acid position 1 may be employed as the starting amino acid residue for the HGD marker polypeptide.

[0187] The "extracellular domain" or "ECD" of a polypeptide disclosed herein refers to a form of the polypeptide which is essentially free of the transmembrane and cytoplasmic domains. Ordinarily, a polypeptide ECD will have less than about 1% of such transmembrane and/or cytoplasmic domains and preferably, will have less than about 0.5% of such domains. It will be understood that any transmembrane domain(s) identified for the polypeptides of the present invention are identified pursuant to criteria routinely employed in the art for identifying that type of hydrophobic domain. The exact boundaries of a transmembrane domain may vary but most likely by no more than about 5 amino acids at either end of the domain as initially identified and as shown in the appended figures. As such, in one embodiment of the present invention, the extracellular domain of a polypeptide of the present invention comprises amino acids 1 to X of the mature amino acid sequence, wherein X is any amino acid within 5 amino acids on either side of the extracellular domain/transmembrane domain boundary.
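
The boundary rule in the preceding paragraph can be illustrated with a minimal sketch. It assumes a 1-based position `boundary` marking the last extracellular residue as initially identified; the function name, the parameters, and the example numbers are hypothetical and are not taken from the specification or its figures.

```python
def candidate_ecd_ends(boundary: int, mature_length: int, window: int = 5) -> list:
    """Return every allowed value of X for an ECD spanning residues 1..X.

    boundary      -- 1-based position of the ECD/transmembrane boundary as
                     initially identified (hypothetical input)
    mature_length -- number of residues in the mature amino acid sequence
    window        -- tolerance on either side of the boundary (5 in the text)
    """
    if not 1 <= boundary <= mature_length:
        raise ValueError("boundary must lie within the mature sequence")
    lowest = max(1, boundary - window)
    highest = min(mature_length, boundary + window)
    return list(range(lowest, highest + 1))


# Hypothetical example: boundary at residue 212 of a 400-residue mature sequence.
ends = candidate_ecd_ends(212, 400)
print(ends[0], ends[-1], len(ends))
```

The range is clipped to the ends of the mature sequence, so a boundary close to either terminus yields fewer than eleven candidate positions.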

[0188] The approximate locations of the "signal peptides" of the various PRO polypeptides disclosed herein are shown in the accompanying figures. It is noted, however, that the C-terminal boundary of a signal peptide may vary, but most likely by no more than about 5 amino acids on either side of the signal peptide C-terminal boundary as initially identified herein, wherein the C-terminal boundary of the signal peptide may be identified pursuant to criteria routinely employed in the art for identifying that type of amino acid sequence element (e.g., Nielsen et al., Prot. Eng., 10:1-6 (1997) and von Heijne et al., Nucl. Acids Res., 14:4683-4690 (1986)). Moreover, it is also recognized that, in some cases, cleavage of a signal sequence from a secreted polypeptide is not entirely uniform, resulting in more than one secreted species. These mature polypeptides, where the signal peptide is cleaved within no more than about 5 amino acids on either side of the C-terminal boundary of the signal peptide as identified herein, and the polynucleotides encoding them, are contemplated by the present invention.

[0189] A "polypeptide variant" of any one of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide as defined above or below having at least about 80% amino acid sequence identity with a full-length native sequence polypeptide, with or without the signal peptide, as disclosed herein or any other fragment of a full-length HGD marker polypeptides wherein one or more amino acid residues are added, or deleted, at the N- or C-terminus of the full-length native amino acid sequence. 
Ordinarily, a HGD marker polypeptide variant will have at least about 80% amino acid sequence identity, preferably at least about 81% amino acid sequence identity, more preferably at least about 82% amino acid sequence identity, more preferably at least about 83% amino acid sequence identity, more preferably at least about 84% amino acid sequence identity, more preferably at least about 85% amino acid sequence identity, more preferably at least about 86% amino acid sequence identity, more preferably at least about 87% amino acid sequence identity, more preferably at least about 88% amino acid sequence identity, more preferably at least about 89% amino acid sequence identity, more preferably at least about 90% amino acid sequence identity, more preferably at least about 91% amino acid sequence identity, more preferably at least about 92% amino acid sequence identity, more preferably at least about 93% amino acid sequence identity, more preferably at least about 94% amino acid sequence identity, more preferably at least about 95% amino acid sequence identity, more preferably at least about 96% amino acid sequence identity, more preferably at least about 97% amino acid sequence identity, more preferably at least about 98% amino acid sequence identity and most preferably at least about 99% amino acid sequence identity with a full-length native sequence polypeptide sequence lacking the signal peptide as disclosed herein, an extracellular domain of a HGD marker polypeptide, with or without the signal peptide, as disclosed herein or any other fragment of a full-length HGD marker polypeptide sequence as disclosed herein. 
Ordinarily, a HGD marker polypeptide variant is at least about 10 amino acids in length, often at least about 20 amino acids in length, more often at least about 30 amino acids in length, more often at least about 40 amino acids in length, more often at least about 50 amino acids in length, more often at least about 60 amino acids in length, more often at least about 70 amino acids in length, more often at least about 80 amino acids in length, more often at least about 90 amino acids in length, more often at least about 100 amino acids in length, more often at least about 150 amino acids in length, more often at least about 200 amino acids in length, more often at least about 300 amino acids in length, or more.

[0190] "Percent (%) amino acid sequence identity" with respect to the amino acid sequence of any of the HGD marker polypeptides identified herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in an ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipoamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. 
Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared. For purposes herein, however, % amino acid sequence identity values are obtained as described below by using the sequence comparison computer program ALIGN-2, wherein the complete source code for the ALIGN-2 program is provided in Table 5. The ALIGN-2 sequence comparison computer program was authored by Genentech, Inc., and the source code shown in Table 5 has been filed with user documentation in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S. Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available through Genentech, Inc., South San Francisco, Calif. or may be compiled from the source code provided in Table 5. The ALIGN-2 program should be compiled for use on a UNIX operating system, preferably digital UNIX V4.0D. All sequence comparison parameters are set by the ALIGN-2 program and do not vary.

[0191] For purposes herein, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows:

100 times the fraction X/Y

[0192] where X is the number of amino acid residues scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A. As examples of % amino acid sequence identity calculations, Tables 2A-2B demonstrate how to calculate the % amino acid sequence identity of the amino acid sequence designated "Comparison Protein" to the amino acid sequence designated "PRO".
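
The calculation above, including its asymmetry, can be sketched in a few lines. The sketch assumes the two sequences have already been aligned (gaps written as "-") by some alignment program; it does not reproduce ALIGN-2 or its scoring parameters, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * X / Y for sequence A compared to sequence B.

    X = number of aligned columns holding identical residues (a gap never matches)
    Y = total number of residues in B, i.e. B with gap characters removed
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same number of columns")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical pre-aligned pair: A has 8 residues, B has 10 residues.
aligned_a = "MKT-LLVA-G"
aligned_b = "MKTALLVAAG"

a_to_b = percent_identity(aligned_a, aligned_b)  # denominator is the length of B
b_to_a = percent_identity(aligned_b, aligned_a)  # denominator is the length of A
print(a_to_b, b_to_a)
```

Because Y is always the length of the second sequence, the same eight identical matches give a different percentage in each direction whenever the two lengths differ.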

[0193] Unless specifically stated otherwise, all % amino acid sequence identity values used herein are obtained as described above using the ALIGN-2 sequence comparison computer program. However, % amino acid sequence identity may also be determined using the sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from http://www.ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask=yes, strand=all, expected occurrences=10, minimum low complexity length=15/5, multi-pass e-value=0.01, constant for multi-pass=25, dropoff for final gapped alignment=25 and scoring matrix=BLOSUM62.

[0194] In situations where NCBI-BLAST2 is employed for amino acid sequence comparisons, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows:

100 times the fraction X/Y

[0195] where X is the number of amino acid residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A.

[0196] In addition, % amino acid sequence identity may also be determined using the WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology, 266:460-480 (1996)). Most of the WU-BLAST-2 search parameters are set to the default values. Those not set to default values, i.e., the adjustable parameters, are set with the following values: overlap span=1, overlap fraction=0.125, word threshold (T)=11, and scoring matrix=BLOSUM62. For purposes herein, a % amino acid sequence identity value is determined by dividing (a) the number of matching identical amino acids residues between the amino acid sequence of the PRO polypeptide of interest having a sequence derived from the native PRO polypeptide and the comparison amino acid sequence of interest (i.e., the sequence against which the PRO polypeptide of interest is being compared which may be a PRO variant polypeptide) as determined by WU-BLAST-2 by (b) the total number of amino acid residues of the PRO polypeptide of interest. For example, in the statement "a polypeptide comprising an amino acid sequence A which has or having at least 80% amino acid sequence identity to the amino acid sequence B", the amino acid sequence A is the comparison amino acid sequence of interest and the amino acid sequence B is the amino acid sequence of the PRO polypeptide of interest.

[0197] As used herein, a "HGD marker" or "cancer marker gene or polypeptide," or "anti-[HGD marker]" or "anti-[cancer marker]" refers to any one of the genes, polypeptides encoded by the genes, or antibodies specific for the polypeptides described herein as diagnostic for HGD or cancer. Thus, for example, "TCF4" refers to the gene marker or its encoded polypeptide, whereas "anti-TCF4" refers to an antibody to the TCF4-encoded polypeptide.

[0198] A "gene variant polynucleotide" as used herein refers to a nucleic acid sequence that varies from the native sequence of its respective HGD marker gene NCBI accession sequence as disclosed in Table 4A, and further refers to a nucleic acid molecule which encodes a biologically active polypeptide and which nucleic acid molecule has at least about 80% nucleic acid sequence identity with a nucleic acid sequence selected from the group of marker genes: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43), which genes encode, respectively, the full-length native polypeptides of the 
group: ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); and TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide sequence as disclosed herein, a full-length native sequence HGD marker polypeptide sequence lacking the signal peptide as disclosed herein, an extracellular domain of a HGD marker polypeptide, with or without the signal peptide, as disclosed herein or any other fragment of a full-length HGD marker polypeptide sequence as disclosed herein. 
Ordinarily, a HGD marker variant polynucleotide will have at least about 80% nucleic acid sequence identity, more preferably at least about 81% nucleic acid sequence identity, more preferably at least about 82% nucleic acid sequence identity, more preferably at least about 83% nucleic acid sequence identity, more preferably at least about 84% nucleic acid sequence identity, more preferably at least about 85% nucleic acid sequence identity, more preferably at least about 86% nucleic acid sequence identity, more preferably at least about 87% nucleic acid sequence identity, more preferably at least about 88% nucleic acid sequence identity, more preferably at least about 89% nucleic acid sequence identity, more preferably at least about 90% nucleic acid sequence identity, more preferably at least about 91% nucleic acid sequence identity, more preferably at least about 92% nucleic acid sequence identity, more preferably at least about 93% nucleic acid sequence identity, more preferably at least about 94% nucleic acid sequence identity, more preferably at least about 95% nucleic acid sequence identity, more preferably at least about 96% nucleic acid sequence identity, more preferably at least about 97% nucleic acid sequence identity, more preferably at least about 98% nucleic acid sequence identity and yet more preferably at least about 99% nucleic acid sequence identity with the nucleic acid sequence encoding a full-length native sequence HGD marker polypeptide sequence as disclosed herein, a full-length native sequence HGD marker polypeptide sequence lacking the signal peptide as disclosed herein, an extracellular domain of a HGD marker polypeptide, with or without the signal sequence, as disclosed herein or any other fragment of a full-length HGD marker polypeptide sequence as disclosed herein. Variants do not encompass the native nucleotide sequence.

[0199] Ordinarily, HGD marker gene variant polynucleotides are at least about 20 nucleotides in length, frequently at least about 30 nucleotides in length, often at least about 60 nucleotides in length, more often at least about 90 nucleotides in length, more often at least about 120 nucleotides in length, more often at least about 150 nucleotides in length, more often at least about 180 nucleotides in length, more often at least about 210 nucleotides in length, more often at least about 240 nucleotides in length, more often at least about 270 nucleotides in length, more often at least about 300 nucleotides in length, more often at least about 450 nucleotides in length, more often at least about 600 nucleotides in length, more often at least about 900 nucleotides in length, or more.

[0200] "Percent (%) nucleic acid sequence identity" with respect to variant polypeptides of each of the HGD marker polypeptide-encoding nucleic acid sequences identified herein is defined as the percentage of nucleotides in a candidate sequence that are identical with the nucleotides in a HGD marker polypeptide-encoding nucleic acid sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent nucleic acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared. For purposes herein, however, % nucleic acid sequence identity values are obtained as described below by using the sequence comparison computer program ALIGN-2, wherein the complete source code for the ALIGN-2 program is provided in Table 5. The ALIGN-2 sequence comparison computer program was authored by Genentech, Inc., and the source code shown in Table 5 has been filed with user documentation in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S. Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available through Genentech, Inc., South San Francisco, Calif. or may be compiled from the source code provided in Table 5. The ALIGN-2 program should be compiled for use on a UNIX operating system, preferably digital UNIX V4.0D. All sequence comparison parameters are set by the ALIGN-2 program and do not vary.

[0201] For purposes herein, the % nucleic acid sequence identity of a given nucleic acid sequence C to, with, or against a given nucleic acid sequence D (which can alternatively be phrased as a given nucleic acid sequence C that has or comprises a certain % nucleic acid sequence identity to, with, or against a given nucleic acid sequence D) is calculated as follows:

100 times the fraction W/Z

[0202] where W is the number of nucleotides scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of C and D, and where Z is the total number of nucleotides in D. It will be appreciated that where the length of nucleic acid sequence C is not equal to the length of nucleic acid sequence D, the % nucleic acid sequence identity of C to D will not equal the % nucleic acid sequence identity of D to C. As examples of % nucleic acid sequence identity calculations, Tables 2C-2D demonstrate how to calculate the % nucleic acid sequence identity of the nucleic acid sequence designated "Comparison DNA" to the nucleic acid sequence designated "PRO-DNA".
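As an illustration of the W/Z calculation, the following is a minimal Python sketch. It assumes the alignment of C and D has already been produced by some alignment program and is supplied as two equal-length strings with "-" marking gaps; it does not reproduce ALIGN-2 itself. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_c: str, aligned_d: str, gap: str = "-") -> float:
    """Return 100 * W / Z for a pairwise alignment of C against D.

    aligned_c, aligned_d: the two rows of an existing alignment
    (assumed input; producing the alignment is outside this sketch).
    """
    if len(aligned_c) != len(aligned_d):
        raise ValueError("aligned sequences must have equal length")
    # W: alignment columns in which both rows carry the same nucleotide.
    w = sum(
        1
        for c, d in zip(aligned_c.upper(), aligned_d.upper())
        if c == d and c != gap
    )
    # Z: total number of nucleotides in D (gap characters excluded).
    z = sum(1 for d in aligned_d if d != gap)
    if z == 0:
        raise ValueError("sequence D contains no nucleotides")
    return 100.0 * w / z


# Hypothetical example: C has 8 nucleotides, D has 10.
aligned_c = "ACGTAC--GT"
aligned_d = "ACGTTCAAGT"

print(percent_identity(aligned_c, aligned_d))  # identity of C to D
print(percent_identity(aligned_d, aligned_c))  # identity of D to C
```

Because the denominator is the length of the second sequence, swapping the arguments changes the result whenever the two sequences differ in length, which is the asymmetry noted above.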

[0203] Unless specifically stated otherwise, all % nucleic acid sequence identity values used herein are obtained as described above using the ALIGN-2 sequence comparison computer program. However, % nucleic acid sequence identity may also be determined using the sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from http://www.ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask=yes, strand=all, expected occurrences=10, minimum low complexity length=15/5, multi-pass e-value=0.01, constant for multi-pass=25, dropoff for final gapped alignment=25 and scoring matrix=BLOSUM62.

[0204] In situations where NCBI-BLAST2 is employed for sequence comparisons, the % nucleic acid sequence identity of a given nucleic acid sequence C to, with, or against a given nucleic acid sequence D (which can alternatively be phrased as a given nucleic acid sequence C that has or comprises a certain % nucleic acid sequence identity to, with, or against a given nucleic acid sequence D) is calculated as follows:

100 times the fraction W/Z

[0205] where W is the number of nucleotides scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program's alignment of C and D, and where Z is the total number of nucleotides in D. It will be appreciated that where the length of nucleic acid sequence C is not equal to the length of nucleic acid sequence D, the % nucleic acid sequence identity of C to D will not equal the % nucleic acid sequence identity of D to C.

[0206] In addition, % nucleic acid sequence identity values may also be generated using the WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology, 266:460-480 (1996)). Most of the WU-BLAST-2 search parameters are set to the default values. Those not set to default values, i.e., the adjustable parameters, are set with the following values: overlap span=1, overlap fraction=0.125, word threshold (T)=11, and scoring matrix=BLOSUM62. For purposes herein, a % nucleic acid sequence identity value is determined by dividing (a) the number of matching identical nucleotides between the nucleic acid sequence of the PRO polypeptide-encoding nucleic acid molecule of interest having a sequence derived from the native sequence PRO polypeptide-encoding nucleic acid and the comparison nucleic acid molecule of interest (i.e., the sequence against which the PRO polypeptide-encoding nucleic acid molecule of interest is being compared which may be a variant PRO polynucleotide) as determined by WU-BLAST-2 by (b) the total number of nucleotides of the PRO polypeptide-encoding nucleic acid molecule of interest. For example, in the statement "an isolated nucleic acid molecule comprising a nucleic acid sequence A which has or having at least 80% nucleic acid sequence identity to the nucleic acid sequence B", the nucleic acid sequence A is the comparison nucleic acid molecule of interest and the nucleic acid sequence B is the nucleic acid sequence of the PRO polypeptide-encoding nucleic acid molecule of interest.

[0207] In other embodiments, variants of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); or TCF4 (NM.sub.--030756) (SEQ ID NO:43) HGD marker genes encode an active HGD marker polypeptide, and nucleic acid sequences useful for identifying the marker genes by, for example, nucleic acid hybridization assays or PCR assays are capable of hybridizing, preferably under stringent hybridization and wash conditions, to nucleotide sequences encoding the full-length ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3); ADAM8 (NM.sub.--001109) (SEQ 
ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41); and TCF4 (NM.sub.--030756) (SEQ ID NO:43) gene or hybridizable fragments thereof, which nucleotide sequences are found in the NCBI accession numbers listed in Table 4A for the respective polypeptides. HGD variant polypeptides may be those that are encoded by a HGD marker gene variant polynucleotide.

[0208] The term "positives", in the context of the amino acid sequence identity comparisons performed as described above, includes amino acid residues in the sequences compared that are not only identical, but also those that have similar properties. Amino acid residues that score a positive value to an amino acid residue of interest are those that are either identical to the amino acid residue of interest or are a preferred substitution (as defined in Table 4A below) of the amino acid residue of interest.

[0209] For purposes herein, the % value of positives of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % positives to, with, or against a given amino acid sequence B) is calculated as follows:

100 times the fraction X/Y

[0210] where X is the number of amino acid residues scoring a positive value as defined above by the sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % positives of A to B will not equal the % positives of B to A.
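The X/Y calculation can be sketched in Python as follows. This is an illustration only: the alignment is assumed to be given as two equal-length strings, and the small set of substitution pairs below is a hypothetical stand-in for the preferred-substitution table referenced in the text, treated here as symmetric for simplicity.

```python
# Hypothetical preferred-substitution pairs (illustrative only; this is
# NOT the table referenced in the text and is treated as symmetric).
HYPOTHETICAL_PREFERRED = {
    frozenset("IL"),
    frozenset("KR"),
    frozenset("DE"),
    frozenset("ST"),
    frozenset("FY"),
}


def is_positive(a: str, b: str) -> bool:
    """A pair scores positive if identical or a listed substitution."""
    return a == b or frozenset((a, b)) in HYPOTHETICAL_PREFERRED


def percent_positives(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * X / Y for an existing alignment of A against B."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # X: columns without gaps whose residue pair scores a positive value.
    x = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if gap not in (a, b) and is_positive(a, b)
    )
    # Y: total number of amino acid residues in B.
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical example: A has 7 residues, B has 8.
aligned_a = "MKLSDE-F"
aligned_b = "MRISEEAY"

print(percent_positives(aligned_a, aligned_b))  # % positives of A to B
print(percent_positives(aligned_b, aligned_a))  # % positives of B to A
```

As with percent identity, the two directions differ when A and B have different lengths, because Y is always the residue count of the second sequence.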

[0211] "Isolated," when used to describe the various polypeptides disclosed herein, means polypeptide that has been identified and separated and/or recovered from a component of its natural environment. Preferably, the isolated polypeptide is free of association with all components with which it is naturally associated. Contaminant components of its natural environment are materials that would typically interfere with diagnostic or therapeutic uses for the polypeptide, and may include enzymes, hormones, and other proteinaceous or non-proteinaceous solutes. In preferred embodiments, the polypeptide will be purified (1) to a degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid sequence by use of a spinning cup sequenator, or (2) to homogeneity by SDS-PAGE under non-reducing or reducing conditions using Coomassie blue or, preferably, silver stain. Isolated polypeptide includes polypeptide in situ within recombinant cells, since at least one component of the ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID 
NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide's natural environment will not be present. Ordinarily, however, isolated polypeptide will be prepared by at least one purification step.

[0212] An "isolated" nucleic acid molecule encoding an ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptide or an "isolated" nucleic acid encoding an anti-[HGD marker polypeptide] antibody, is a nucleic acid molecule that is identified and separated from at least one contaminant nucleic acid molecule with which it is ordinarily associated in the natural source of the HGD marker genes or the anti-[HGD marker polypeptide]-encoding nucleic acid. Preferably, the isolated nucleic acid is free of association with all components with which it is naturally associated. 
An isolated polypeptide or nucleic acid sequence is other than in the form or setting in which it is found in nature. Isolated nucleic acid molecules therefore are distinguished from the nucleic acid molecule as it exists in natural cells. However, an isolated nucleic acid molecule encoding a HGD marker polypeptide or an anti-[HGD marker polypeptide] antibody includes HGD marker gene nucleic acid molecules and anti-[HGD marker polypeptide]-encoding nucleic acid molecules contained in cells that ordinarily express HGD marker polypeptides or express anti-[HGD marker polypeptide] antibodies where, for example, the nucleic acid molecule is in a chromosomal location different from that of natural cells.

[0213] The term "control sequences" refers to DNA sequences necessary for the expression of an operably linked coding sequence in a particular host organism. The control sequences that are suitable for prokaryotes, for example, include a promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers.

[0214] Nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, "operably linked" means that the DNA sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.

[0215] The term "antibody" is used in the broadest sense and specifically covers, for example, single anti-[HGD marker polypeptide] monoclonal antibodies (including antagonist and neutralizing antibodies), anti-[HGD marker polypeptide] antibody compositions with polyepitopic specificity, single chain anti-[HGD marker polypeptide] antibodies, and fragments thereof (see below). The term "monoclonal antibody" as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical except for possible naturally-occurring mutations that may be present in minor amounts.

[0216] "Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).

[0217] "Stringent conditions" or "high stringency conditions", as defined herein, may be identified by those that: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2×SSC (sodium chloride/sodium citrate) and 50% formamide at 55°C, followed by a high-stringency wash consisting of 0.1×SSC containing EDTA at 55°C.
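The SSC concentrations quoted above scale linearly with the fold value. The short Python sketch below derives 1x SSC from the 5x SSC composition given in condition (3) and computes the salt molarities of the wash buffers; the function and constant names are hypothetical, and the sketch covers only sodium chloride and sodium citrate, not the other buffer components.

```python
# 1x SSC derived from the text: 5x SSC = 0.75 M NaCl, 0.075 M sodium citrate.
SSC_1X_MOLAR = {
    "NaCl": 0.75 / 5,
    "sodium citrate": 0.075 / 5,
}


def ssc_molarity(fold: float) -> dict:
    """Return salt molarities (mol/L) for an SSC buffer at the given fold."""
    if fold < 0:
        raise ValueError("fold concentration must be non-negative")
    # Linear scaling; rounding only suppresses floating-point noise.
    return {solute: round(conc * fold, 6) for solute, conc in SSC_1X_MOLAR.items()}


for fold in (5, 0.2, 0.1):
    print(f"{fold}x SSC:", ssc_molarity(fold))
```

Under this scaling, 0.1x SSC works out to 0.015 M sodium chloride and 0.0015 M sodium citrate, the same salt concentrations as the low-ionic-strength wash of condition (1).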

[0218] "Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide, 5×SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1×SSC at about 35°C-50°C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.

[0219] The term "epitope tagged" when used herein refers to a chimeric polypeptide comprising a HGD marker polypeptide fused to a "tag polypeptide". The tag polypeptide has enough residues to provide an epitope against which an antibody can be made, yet is short enough such that it does not interfere with activity of the polypeptide to which it is fused. The tag polypeptide preferably also is fairly unique so that the antibody does not substantially cross-react with other epitopes. Suitable tag polypeptides generally have at least six amino acid residues and usually between about 8 and 50 amino acid residues (preferably, between about 10 and 20 amino acid residues).

[0220] "Active" or "activity" for the purposes herein refers to form(s) of ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:4); ADAM8 (NM.sub.--001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:42); or TCF4 (NM.sub.--030756) (SEQ ID NO:44) polypeptides which retain a biological and/or an immunological activity/property of a native or naturally-occurring HGD marker polypeptide, wherein "biological" activity refers to a function (either inhibitory or stimulatory) caused by a native or naturally-occurring HGD marker polypeptide other than the ability to induce the production of an antibody against an antigenic epitope possessed by a native or naturally-occurring HGD marker polypeptide and an 
"immunological" activity refers to the ability to induce the production of an antibody against an antigenic epitope possessed by a native or naturally-occurring HGD marker polypeptide.

[0221] "Biological activity" in the context of an antibody or another antagonist molecule, or therapeutic compound that can be identified by the screening assays disclosed herein (e.g., an organic or inorganic small molecule, peptide, etc.) is used to refer to the ability of such molecules to bind or complex with the polypeptides encoded by the amplified genes identified herein, or otherwise interfere with the interaction of the encoded polypeptides with other cellular proteins or otherwise interfere with the transcription or translation of a HGD marker polypeptide. "Biological activity" in the context of an agonist molecule that enhances the activity of, for example, native anti-angiogenic molecules refers to the ability of such molecules to bind or complex with the polypeptides encoded by the amplified genes identified herein or otherwise modify the interaction of the encoded polypeptides with other cellular proteins or otherwise enhance the transcription or translation of a TIMP1 or thrombospondin 2 polypeptide. A preferred biological activity is growth inhibition of a target tumor cell. Another preferred biological activity is cytotoxic activity resulting in the death of the target tumor cell.

[0222] The term "biological activity" in the context of a HGD marker polypeptide means the typical activity of the HGD marker polypeptide in the cell.

[0223] The phrase "immunological activity" means immunological cross-reactivity with at least one epitope of a HGD marker polypeptide.

[0224] "Immunological cross-reactivity" as used herein means that the candidate polypeptide is capable of competitively inhibiting the qualitative biological activity of a HGD marker polypeptide having this activity with polyclonal antisera raised against the known active HGD marker polypeptide. Such antisera are prepared in conventional fashion by injecting goats or rabbits, for example, subcutaneously with the known active analogue in complete Freund's adjuvant, followed by booster intraperitoneal or subcutaneous injection in incomplete Freund's adjuvant. The immunological cross-reactivity preferably is "specific", which means that the binding affinity of the immunologically cross-reactive molecule (e.g., antibody) identified, to the corresponding HGD marker polypeptide is significantly higher (preferably at least about 2-times, more preferably at least about 4-times, even more preferably at least about 8-times, most preferably at least about 10-times higher) than the binding affinity of that molecule to any other known native polypeptide.
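The fold thresholds for "specific" cross-reactivity can be expressed as a simple ratio check. The Python sketch below is illustrative only: it assumes affinities are supplied on a scale where a larger number means tighter binding (for example an association constant), treats the "about" thresholds as hard cutoffs, and uses hypothetical names and labels throughout.

```python
# Fold thresholds from the text, highest first, with hypothetical labels.
TIERS = [
    (10.0, "most preferred"),
    (8.0, "even more preferred"),
    (4.0, "more preferred"),
    (2.0, "preferred"),
]


def specificity_tier(affinity_target: float, affinity_other_max: float) -> str:
    """Classify the fold difference in binding affinity.

    affinity_target: affinity for the corresponding HGD marker polypeptide.
    affinity_other_max: highest affinity measured for any other known
    native polypeptide (assumption: larger value = tighter binding).
    """
    if affinity_target <= 0 or affinity_other_max <= 0:
        raise ValueError("affinities must be positive")
    ratio = affinity_target / affinity_other_max
    for threshold, label in TIERS:
        if ratio >= threshold:
            return label
    return "below the preferred thresholds"


print(specificity_tier(5e8, 1e8))    # 5-fold difference
print(specificity_tier(1e9, 1e8))    # 10-fold difference
print(specificity_tier(1.5e8, 1e8))  # 1.5-fold difference
```

If affinities are instead reported as dissociation constants, where smaller means tighter, the ratio would need to be inverted before applying the same thresholds.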

[0225] The term "antagonist" is used in the broadest sense, and includes any molecule that partially or fully blocks, inhibits, or neutralizes a biological activity of a native HGD marker polypeptide disclosed herein or the transcription or translation thereof, particularly when the HGD marker polypeptide is expressed about 1.5-fold above the level of expression in normal tissue controls. Suitable antagonist molecules specifically include antagonist antibodies or antibody fragments, binding fragments, peptides, small organic molecules, anti-sense nucleic acids, etc. Included are methods for identifying antagonists of an ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS: 15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID 
NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); and TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44) gene or polypeptide with a candidate antagonist molecule and measuring a detectable change in one or more biological activities normally associated with the ET-1 (endothelin-1, NM.sub.--001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenepus laevis) homolog, NM.sub.--006408) (SEQ ID NO:3 or 4); ADAM8 (NM.sub.--001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM.sub.--002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM.sub.--005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM.sub.--021969) (SEQ ID NO:11 or 12); TM7SF1 (NM.sub.--003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipamide dehydrogenase, NM.sub.--000108) (SEQ ID NOS:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM.sub.--013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM.sub.--003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal precursor, NM.sub.--001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor SLNAC1, NM.sub.--004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor, NM.sub.--000717) (SEQ ID NO:25 or 26); PA21 (phopholipase a2 precursor, NM.sub.--000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM.sub.--005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM.sub.--004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM.sub.--005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450 monooxygenase, NM.sub.--000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM.sub.--006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end, NM.sub.--001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking sequence, NM.sub.--001863) (SEQ ID NO:41 or 42); and TCF4 (NM.sub.--030756) (SEQ ID NO:43 or 44) gene or polypeptide.

[0226] A "small molecule" is defined herein to have a molecular weight below about 500 Daltons.

[0227] "Antibodies" (Abs) and "immunoglobulins" (Igs) are glycoproteins having the same structural characteristics. While antibodies exhibit binding specificity to a specific antigen, immunoglobulins include both antibodies and other antibody-like molecules which lack antigen specificity. Polypeptides of the latter kind are, for example, produced at low levels by the lymph system and at increased levels by myelomas. The term "antibody" is used in the broadest sense and specifically covers, without limitation, intact monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies) formed from at least two intact antibodies, and antibody fragments so long as they exhibit the desired biological activity.

[0228] "Native antibodies" and "native immunoglobulins" are usually heterotetrameric glycoproteins of about 150,000 daltons, composed of two identical light (L) chains and two identical heavy (H) chains. Each light chain is linked to a heavy chain by one covalent disulfide bond, while the number of disulfide linkages varies among the heavy chains of different immunoglobulin isotypes. Each heavy and light chain also has regularly spaced intrachain disulfide bridges. Each heavy chain has at one end a variable domain (VH) followed by a number of constant domains. Each light chain has a variable domain at one end (VL) and a constant domain at its other end; the constant domain of the light chain is aligned with the first constant domain of the heavy chain, and the light-chain variable domain is aligned with the variable domain of the heavy chain. Particular amino acid residues are believed to form an interface between the light- and heavy-chain variable domains.

[0229] The term "variable" refers to the fact that certain portions of the variable domains differ extensively in sequence among antibodies and are used in the binding and specificity of each particular antibody for its particular antigen. However, the variability is not evenly distributed throughout the variable domains of antibodies. It is concentrated in three segments called complementarity-determining regions (CDRs) or hypervariable regions both in the light-chain and the heavy-chain variable domains. The more highly conserved portions of variable domains are called the framework (FR) regions. The variable domains of native heavy and light chains each comprise four FR regions, largely adopting a β-sheet configuration, connected by three CDRs, which form loops connecting, and in some cases forming part of, the β-sheet structure. The CDRs in each chain are held together in close proximity by the FR regions and, with the CDRs from the other chain, contribute to the formation of the antigen-binding site of antibodies (see Kabat et al., NIH Publ. No. 91-3242, Vol. I, pages 647-669 (1991)). The constant domains are not involved directly in binding an antibody to an antigen, but exhibit various effector functions, such as participation of the antibody in antibody-dependent cellular toxicity.

[0230] The term "hypervariable region" when used herein refers to the amino acid residues of an antibody which are responsible for antigen-binding. The hypervariable region comprises amino acid residues from a "complementarity determining region" or "CDR" (i.e., residues 24-34 (L1), 50-56 (L2) and 89-97 (L3) in the light chain variable domain and 31-35 (H1), 50-65 (H2) and 95-102 (H3) in the heavy chain variable domain; Kabat et al., Sequences of Proteins of Immunological Interest, 5th Ed. Public Health Service, National Institutes of Health, Bethesda, Md. [1991]) and/or those residues from a "hypervariable loop" (i.e., residues 26-32 (L1), 50-52 (L2) and 91-96 (L3) in the light chain variable domain and 26-32 (H1), 53-55 (H2) and 96-101 (H3) in the heavy chain variable domain; Chothia and Lesk, J. Mol. Biol., 196:901-917 [1987]). "Framework" or "FR" residues are those variable domain residues other than the hypervariable region residues as herein defined.
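The two sets of residue ranges given above can be captured in a small lookup table. The Python sketch below is illustrative: it encodes the CDR and hypervariable-loop ranges exactly as listed in the paragraph, classifies a variable-domain position as hypervariable or framework, and makes the simplifying assumption that positions are plain integers (insertion letters used in some numbering schemes are not handled). All names are hypothetical.

```python
# Residue ranges as listed in the text (inclusive), keyed by chain.
CDR_RANGES = {
    "L": {"L1": (24, 34), "L2": (50, 56), "L3": (89, 97)},
    "H": {"H1": (31, 35), "H2": (50, 65), "H3": (95, 102)},
}
LOOP_RANGES = {
    "L": {"L1": (26, 32), "L2": (50, 52), "L3": (91, 96)},
    "H": {"H1": (26, 32), "H2": (53, 55), "H3": (96, 101)},
}


def classify(chain: str, position: int) -> list:
    """Return the hypervariable definitions covering a position, or ["FR"].

    chain: "L" (light) or "H" (heavy); position: integer residue number.
    """
    if chain not in CDR_RANGES:
        raise ValueError("chain must be 'L' or 'H'")
    hits = []
    for name, ranges in (("CDR", CDR_RANGES), ("loop", LOOP_RANGES)):
        for region, (start, end) in ranges[chain].items():
            if start <= position <= end:
                hits.append(f"{name} {region}")
    # Framework residues are those outside every hypervariable range.
    return hits or ["FR"]


print(classify("H", 28))  # inside the H1 loop range only
print(classify("H", 31))  # inside both H1 definitions
print(classify("L", 40))  # framework
```

Heavy-chain position 28 shows why the text uses "and/or": it falls in the hypervariable loop range for H1 but outside the CDR range for H1, so it is a hypervariable region residue under one definition only.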

[0231] "Antibody fragments" comprise a portion of an intact antibody, preferably the antigen binding or variable region of the intact antibody. Examples of antibody fragments include Fab, Fab', F(ab').sub.2, and Fv fragments; diabodies; linear antibodies (Zapata et al., Protein Eng., 8(10):1057-1062 [1995]); single-chain antibody molecules; and multispecific antibodies formed from antibody fragments.

[0232] Papain digestion of antibodies produces two identical antigen-binding fragments, called "Fab" fragments, each with a single antigen-binding site, and a residual "Fc" fragment, whose name reflects its ability to crystallize readily. Pepsin treatment yields an F(ab').sub.2 fragment that has two antigen-combining sites and is still capable of cross-linking antigen.

[0233] "Fv" is the minimum antibody fragment which contains a complete antigen-recognition and -binding site. This region consists of a dimer of one heavy- and one light-chain variable domain in tight, non-covalent association. It is in this configuration that the three CDRs of each variable domain interact to define an antigen-binding site on the surface of the V.sub.H-V.sub.L dimer. Collectively, the six CDRs confer antigen-binding specificity to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than the entire binding site.

[0234] The Fab fragment also contains the constant domain of the light chain and the first constant domain (CH1) of the heavy chain. Fab' fragments differ from Fab fragments by the addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region. Fab'-SH is the designation herein for Fab' in which the cysteine residue(s) of the constant domains bear a free thiol group. F(ab').sub.2 antibody fragments originally were produced as pairs of Fab' fragments which have hinge cysteines between them. Other chemical couplings of antibody fragments are also known.

[0235] The "light chains" of antibodies (immunoglobulins) from any vertebrate species can be assigned to one of two clearly distinct types, called kappa (.kappa.) and lambda (.lambda.), based on the amino acid sequences of their constant domains.

[0236] Depending on the amino acid sequence of the constant domain of their heavy chains, immunoglobulins can be assigned to different classes. There are five major classes of immunoglobulins: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy-chain constant domains that correspond to the different classes of immunoglobulins are called .alpha., .delta., .epsilon., .gamma., and .mu., respectively. The subunit structures and three-dimensional configurations of different classes of immunoglobulins are well known.

[0237] The term "monoclonal antibody" as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical except for possible naturally occurring mutations that may be present in minor amounts. Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in contrast to conventional (polyclonal) antibody preparations which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody is directed against a single determinant on the antigen. In addition to their specificity, the monoclonal antibodies are advantageous in that they are synthesized by the hybridoma culture, uncontaminated by other immunoglobulins. The modifier "monoclonal" indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, the monoclonal antibodies to be used in accordance with the present invention may be made by the hybridoma method first described by Kohler et al., Nature, 256:495 [1975], or may be made by recombinant DNA methods (see, e.g., U.S. Pat. No. 4,816,567). The "monoclonal antibodies" may also be isolated from phage antibody libraries using the techniques described in Clackson et al., Nature, 352:624-628 [1991] and Marks et al., J. Mol. Biol., 222:581-597 (1991), for example.

[0238] The monoclonal antibodies herein specifically include "chimeric" antibodies (immunoglobulins) in which a portion of the heavy and/or light chain is identical with or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain(s) is identical with or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass, as well as fragments of such antibodies, so long as they exhibit the desired biological activity (U.S. Pat. No. 4,816,567; Morrison et al., Proc. Natl. Acad. Sci. USA, 81:6851-6855 [1984]).

[0239] "Humanized" forms of non-human (e.g., murine) antibodies are chimeric immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab').sub.2 or other antigen-binding subsequences of antibodies) which contain minimal sequence derived from non-human immunoglobulin. For the most part, humanized antibodies are human immunoglobulins (recipient antibody) in which residues from a CDR of the recipient are replaced by residues from a CDR of a non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity, and capacity. In some instances, Fv FR residues of the human immunoglobulin are replaced by corresponding non-human residues. Furthermore, humanized antibodies may comprise residues which are found neither in the recipient antibody nor in the imported CDR or framework sequences. These modifications are made to further refine and maximize antibody performance. In general, the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are those of a human immunoglobulin sequence. The humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin. For further details, see Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-329 [1988]; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992). The humanized antibody includes a PRIMATIZED.TM. antibody wherein the antigen-binding region of the antibody is derived from an antibody produced by immunizing macaque monkeys with the antigen of interest.

[0240] "Single-chain Fv" or "sFv" antibody fragments comprise the V.sub.H and V.sub.L domains of antibody, wherein these domains are present in a single polypeptide chain. Preferably, the Fv polypeptide further comprises a polypeptide linker between the V.sub.H and V.sub.L domains which enables the sFv to form the desired structure for antigen binding. For a review of sFv see Pluckthun in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenburg and Moore eds., Springer-Verlag, New York, pp. 269-315 (1994).

[0241] The term "diabodies" refers to small antibody fragments with two antigen-binding sites, which fragments comprise a heavy-chain variable domain (V.sub.H) connected to a light-chain variable domain (V.sub.L) in the same polypeptide chain (V.sub.H-V.sub.L). By using a linker that is too short to allow pairing between the two domains on the same chain, the domains are forced to pair with the complementary domains of another chain and create two antigen-binding sites. Diabodies are described more fully in, for example, EP 404,097; WO 93/11161; and Holliger et al., Proc. Natl. Acad. Sci. USA, 90:6444-6448 (1993).

[0242] An "isolated" antibody is one which has been identified and separated and/or recovered from a component of its natural environment. Contaminant components of its natural environment are materials which would interfere with diagnostic or therapeutic uses for the antibody, and may include enzymes, hormones, and other proteinaceous or nonproteinaceous solutes. In preferred embodiments, the antibody will be purified (1) to greater than 95% by weight of antibody as determined by the Lowry method, and most preferably more than 99% by weight, (2) to a degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid sequence by use of a spinning cup sequenator, or (3) to homogeneity by SDS-PAGE under reducing or nonreducing conditions using Coomassie blue or, preferably, silver stain. Isolated antibody includes the antibody in situ within recombinant cells since at least one component of the antibody's natural environment will not be present. Ordinarily, however, isolated antibody will be prepared by at least one purification step.

[0243] The word "label" when used herein refers to a detectable compound or composition which is conjugated directly or indirectly to the antibody so as to generate a "labeled" antibody. The label may be detectable by itself (e.g., radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable. Radionuclides that can serve as detectable labels include, for example, I-131, I-123, I-125, Y-90, Re-188, Re-186, At-211, Cu-67, Bi-212, and Pd-109. The label may also be a non-detectable entity such as a toxin.

[0244] A "liposome" is a small vesicle composed of various types of lipids, phospholipids and/or surfactant which is useful for delivery of a drug (such as a CXCR4; Laminin alpha 4; TIMP1; Type IV collagen alpha 1; Laminin alpha 3; Adrenomedullin; Thrombospondin 2; Type I collagen alpha 2; Type VI collagen alpha 2; Type VI collagen alpha 3; Latent TGFbeta binding protein 2 (LTBP2); Serine or cysteine protease inhibitor heat shock protein (HSP47); Procollagen-lysine, 2-oxoglutarate 5-dioxygenase; connexin 43; Type IV collagen alpha 2; Connexin 37; Ephrin A1; Laminin beta 2; Integrin alpha 1; Stanniocalcin 1; Thrombospondin 4; or CD36 polypeptide or antibody thereto and, optionally, a chemotherapeutic agent) to a mammal. The components of the liposome are commonly arranged in a bilayer formation, similar to the lipid arrangement of biological membranes.

[0245] As used herein, the term "immunoadhesin" designates antibody-like molecules which combine the binding specificity of a heterologous protein (an "adhesin") with the effector functions of immunoglobulin constant domains. Structurally, the immunoadhesins comprise a fusion of an amino acid sequence with the desired binding specificity which is other than the antigen recognition and binding site of an antibody (i.e., is "heterologous"), and an immunoglobulin constant domain sequence. The adhesin part of an immunoadhesin molecule typically is a contiguous amino acid sequence comprising at least the binding site of a receptor or a ligand. The immunoglobulin constant domain sequence in the immunoadhesin may be obtained from any immunoglobulin, such as IgG-1, IgG-2, IgG-3, or IgG-4 subtypes, IgA (including IgA-1 and IgA-2), IgE, IgD or IgM.

[0246] "Up-regulation," "increased expression," and "overexpression" are used interchangeably and, as used herein, mean at least about a 1.5-fold increase in expression, alternatively at least about a 2-fold increase in expression, alternatively with at least about a 2.5-fold or higher increase in expression of a gene measured as an increase in its DNA (amplification), its mRNA (increased transcription), or in the level of polypeptide encoded by the gene. Alternatively, up-regulation or increased expression is determined using a Z score as a p value <0.07 relative to a normal tissue control.
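By way of illustration only, the fold-change alternatives in the foregoing definition may be sketched in Python as follows. The function name `is_upregulated` and its parameters are hypothetical and form no part of the disclosure; the sketch assumes that the test and baseline levels are measured in the same units, and it does not cover the Z-score alternative.

```python
def is_upregulated(test_level, baseline_level, min_fold=1.5):
    """Return True if test_level is at least min_fold times baseline_level.

    Hypothetical helper: the levels may be DNA copy number, mRNA, or
    polypeptide measurements, provided both use the same units.
    """
    if baseline_level <= 0:
        raise ValueError("baseline_level must be positive")
    return test_level / baseline_level >= min_fold


# Apply the three alternative thresholds named in the definition
# to one illustrative (made-up) pair of measurements.
for threshold in (1.5, 2.0, 2.5):
    print(threshold, is_upregulated(4.4, 2.0, min_fold=threshold))
```

The threshold is kept as a parameter because the definition recites several alternative cut-offs rather than a single value.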

[0247] The term "package insert" is used to refer to instructions customarily included in commercial packages of therapeutic products, that contain information about the indications, usage, dosage, administration, contraindications and/or warnings concerning the use of such therapeutic products.

[0248] It will be clearly understood that, although a number of art publications are referred to herein, this reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art, in Australia or in any other country.

[0249] Throughout this specification and the claims, the terms "comprise," "comprises," and "comprising" are used in a non-exclusive sense, except where the context requires otherwise.

EXAMPLES

[0250] The following examples are offered by way of illustration and not by way of limitation. The examples are provided so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the compounds, compositions, and methods of the invention and are not intended to limit the scope of what the inventors regard as their invention. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperature, etc.), but some experimental errors and deviation should be accounted for. Unless indicated otherwise, parts are in parts by weight, temperature is in degrees C, and pressure is at or near atmospheric. The disclosures of all citations in the specification are expressly incorporated herein by reference.

Example 1

Patients and Tissue Collection

[0251] Esophageal mucosal biopsies were obtained from patients undergoing surveillance endoscopy at the Western General Hospital and Royal Infirmary, Edinburgh during 2000-1. The study was approved by the Lothian Research and Ethics Committee and written, informed consent was obtained from all patients. All procedures were performed by one of two experienced endoscopists with expertise in Barrett's esophagus in a standard manner according to a local protocol for Barrett's surveillance. BE was defined as tongues or circumferential salmon pink mucosa extending for at least 3 cm above the gastro-esophageal junction. At endoscopy, careful note was made of the length of the CE segment, severity of any esophagitis if present and the presence of macroscopically visible abnormalities within the BE. Data on smoking history, use of acid-suppressing drugs and Helicobacter pylori status were also recorded.

[0252] Paired biopsies were taken. One sample was fixed in formalin for histology and the other stored fresh-frozen (-70.degree. C.) for microarray analysis. Two gastrointestinal pathologists reviewed all specimens, which were categorized as: normal squamous esophagus, BE (columnar lined esophagus with intestinal metaplasia and the presence of goblet cells and alcian blue positive mucin), BE with changes indeterminate for dysplasia, BE with low-grade dysplasia (LGD), BE with high-grade dysplasia (HGD) or BE with adenocarcinoma (CA). For some patients, 2 separate biopsy specimens for the same disease state were available for array analysis. Additional matched samples were also analyzed (e.g., biopsies of BE adjacent to carcinoma in BE from the same patient). Analyzed samples included 10 normal esophagus, 28 samples of BE from 20 patients, 6 samples of LGD from 3 patients, 3 samples indeterminate for dysplasia from 2 patients, 6 samples of HGD from 3 patients, 10 samples of BE adjacent to CA (BE-CA) from 7 patients, and 16 samples of CA from 10 patients.

[0253] Microarrays containing 9031 genes were generated by printing PCR products derived from cDNA clones (Invitrogen, California and Genentech, Inc.) on glass slides coated with 3-aminopropyltriethoxysilane (Aldrich, Milwaukee Wis.) and 1,4-phenylenediisothiocyanate (Aldrich, Milwaukee Wis.) using a robotic arrayer (Norgren Systems, Mountain View, Calif.). RNA isolation was accomplished by CsCl step gradient (Kingston, Current Protocols in Molecular Biology 1:4.2.5-4.2.6 (1998)); typically 0.1-2 .mu.g of total RNA was obtained. Probes for array analysis were generated by conservative amplification and subsequent labelling as follows: double-stranded DNA generated from 0.1 .mu.g of total RNA (Invitrogen, Carlsbad, Calif.) was amplified using a single round of a modified in vitro transcription protocol (MEGAscript T7 from Ambion, Austin, Tex.; Gelder et al., Proc. Natl. Acad. Sci. USA 87:1663-1667 (1990)). The resulting cRNA was used as a template to generate a sense DNA probe using random primers (9mers, 0.15 mg/ml), Alexa 488 dUTP or Alexa 546 dUTP (40 .mu.M and 6 .mu.M, respectively, Molecular Probes, Eugene, Oreg.), and MMLV-derived reverse transcriptase (Invitrogen, Carlsbad, Calif.). A reference probe to reflect general epithelial cell expression was generated from 0.1 .mu.g of total RNA from a pool of liver, lung and kidney (Clontech, Palo Alto, Calif.). Probes were hybridized to arrays overnight in 50% formamide/5.times.SSC at 37.degree. C. and washed the next day in 2.times.SSC, 0.2% SDS followed by 0.2.times.SSC, 0.2% SDS. Array images were collected using a CCD-camera based imaging system (Norgren Systems, Mountain View, Calif.) equipped with a Xenon light source and optical filters appropriate for each dye. Full dynamic-range images were collected (Autograb, Genentech Inc) and intensities and ratios extracted using automated gridding and data extraction software (gImage, Genentech Inc) built on a Matlab (the MathWorks, Natick, Mass.) platform.

Example 3

Data Analysis

[0254] Data were sorted to identify genes expressed above background (N intensity of >12 where background values range from 0-8) in the test sample such that only meaningful ratios were included. Ratio values were further normalized for experimental scatter at different intensity values within each experiment by plotting log ratio versus N intensity and by fitting a normal distribution at each intensity level. A measure of standard deviation (Z score) around a mean of zero was derived for each gene in each experiment and this value was used in data mining. Specifically, for each microarray, data were normalized by computing Z-scores, which were obtained from a scatterplot of the logarithm of the ratio of the test and reference data versus the logarithm of the minimum of the test and reference data. The median of the ratio as a function of intensity was estimated by applying the loess algorithm to the scatterplot. The standard error was estimated by applying loess to the square root of the absolute residuals, and squaring the result to obtain the median absolute deviation (MAD), and making a multiplicative correction to convert from MAD to a standard error. The Z scores were determined for each ratio by dividing its vertical distance from the median loess curve by the standard error at that intensity.

[0255] A computational process useful for computing Z-scores may be written in a standard high-level statistical language, S-Plus, as follows:

[0256] pos.test <- test[test>0 & ref>0]

[0257] pos.ref <- ref[test>0 & ref>0]

[0258] minorder <- order(pmin(pos.test,pos.ref))

[0259] y <- log(pos.test[minorder]+10)-log(pos.ref[minorder]+10)

[0260] x <- log(pmin(pos.test[minorder],pos.ref[minorder]))

[0261] residuals <- loess(y~x)$residuals

[0262] sqresiduals <- sqrt(abs(residuals))

[0263] sqrt.mad <- loess(sqresiduals~x)$fitted

[0264] sigma <- sqrt.mad*sqrt.mad/0.6745

[0265] zscore <- ifelse(sigma>0,residuals/sigma,0)

[0266] This code may be executed in a commercially available S-Plus program (see, for example, http://www.insightful.com) or in a freely available substitute program, R (http://www.r-project.org).
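For readers without access to S-Plus or R, the same sequence of steps may be approximated in Python using only the standard library. The following is a minimal sketch under stated assumptions and is not the listing above: the loess fit is replaced by a running median over spots ordered by intensity, which is a simpler smoother and will give somewhat different values. The names `running_median` and `z_scores` and the `window` parameter are hypothetical.

```python
import math
from statistics import median


def running_median(values, window=51):
    """Median over a centered window, truncated at both ends of the list."""
    half = window // 2
    return [median(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]


def z_scores(test, ref, window=51, offset=10.0):
    """Return (spot_index, z) pairs ordered by increasing min(test, ref).

    Assumption: a running median stands in for the loess smoother.
    """
    # Keep only spots where both channels are positive.
    spots = [i for i, (t, r) in enumerate(zip(test, ref)) if t > 0 and r > 0]
    # Order spots by the smaller of the two intensities.
    spots.sort(key=lambda i: min(test[i], ref[i]))
    # Log ratio, with the same +10 offset as the S-Plus listing.
    y = [math.log(test[i] + offset) - math.log(ref[i] + offset) for i in spots]
    # Residuals about the intensity-dependent median trend.
    trend = running_median(y, window)
    residuals = [yi - ti for yi, ti in zip(y, trend)]
    # Smooth sqrt(|residual|), then square to estimate the local MAD.
    sqrt_mad = running_median([math.sqrt(abs(e)) for e in residuals], window)
    # Convert MAD to a standard error using the normal-distribution factor.
    sigma = [s * s / 0.6745 for s in sqrt_mad]
    return [(i, e / s if s > 0 else 0.0)
            for i, e, s in zip(spots, residuals, sigma)]
```

Because the running median operates on rank order, the log-intensity variable x of the listing is used here only implicitly, through the sort. A call such as `dict(z_scores(test, ref))` maps each retained spot index to its Z score.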

Example 4

Differential Expression in Barrett's Esophagus-to-Adenocarcinoma Disease Stages

[0267] Samples and Data Mining:

[0268] High-quality data were obtained from >90% of biopsy specimens, including those of poor RNA quality and very limited RNA quantity (e.g., less than 200 ng total RNA). A data mining strategy was applied to identify genes specifically associated with the different stages of disease progression. Experiments were grouped into disease categories based on pathologic diagnosis, and these groups were compared to identify genes with significantly elevated expression for at least 25% of the samples within a disease group with respect to both the epithelial pool reference and the normal esophagus group. Typically, genes with elevated expression were identified as those with Z scores of >1.7 (p<0.05) in the disease group, corresponding to ratio values of 2-20 in most cases. A total of 460 genes satisfied these criteria across the disease groups BE, dysplasia, and carcinoma (some genes are associated with more than one disease group). Selected genes (117) are listed (Tables 1, 2, 3). All dysplasia samples (high-, low-grade and indeterminate) were combined into a single group to improve data analysis, and the genes identified were then further inspected to determine if they were more prevalent in low- or high-grade dysplasia. HGD sample data were independently analyzed to determine gene expression profiles diagnostic for high-grade dysplasia (Table 4A).
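The prevalence criterion described above may be illustrated by the following Python sketch. It is a simplified reading and not the procedure actually used: it applies only the Z-score cut-off of 1.7 and the sample-fraction thresholds (at least 25% for "*", and 15-25% for "(*)" as in the table legends), and it omits the separate comparison against the normal esophagus group and the H/L dysplasia annotations. The function names are hypothetical.

```python
def fraction_elevated(z_scores, z_cutoff=1.7):
    """Fraction of samples in a disease group whose Z score exceeds z_cutoff."""
    if not z_scores:
        raise ValueError("at least one sample is required")
    return sum(1 for z in z_scores if z > z_cutoff) / len(z_scores)


def association_symbol(z_scores, z_cutoff=1.7):
    """Map one gene's Z scores in one disease group to a table symbol."""
    fraction = fraction_elevated(z_scores, z_cutoff)
    if fraction >= 0.25:
        return "*"      # elevated in at least 25% of samples
    if fraction >= 0.15:
        return "(*)"    # elevated in 15-25% of samples
    return ""           # not reported for this group


# Made-up Z scores for one gene in a group of eight samples.
print(association_symbol([2.3, 0.4, 1.9, -0.2, 0.8, 1.1, 0.0, 0.6]))
```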

[0269] Inflammation:

[0270] Significant expression of proinflammatory, costimulatory and inducible cytokines and receptors was observed in BE, dysplasia and carcinoma, and the most prevalent genes are listed (Table 1). Some binding partners were detected, such as putative inflammatory cytokine IL-17 family member IL-17E and its receptor IL-17BR, and SCYA20/LARC and receptor CCR6 (Lee et al., J. Biol. Chem. 276:1660-1664 (2001); and Baba et al., J. Biol. Chem. 272:14893-14898 (1997)). SCYA20 is expressed in the epithelium of the small intestine and is chemotactic for lymphocytes and dendritic cells (Tanaka et al., Eur. J. Immunol. 29:644-642 (1999)). Activin A is a TGF beta superfamily member that can act as a potent mediator of cell growth and differentiation and may be involved in response to injury (Munz et al., EMBO J. 18:5205-5215 (1999)). It was co-expressed particularly in carcinoma in Barrett's samples with its serine-threonine kinase receptor AVRII (the type I receptor was also detected but less well correlated). Chemokine receptors CXCR4 and CCR7 have been detected on a variety of inflammatory cell types, but have also been described as highly expressed in breast tumor cells, with possible involvement in lymph node metastasis (Muller et al., Nature 410:50-56 (2001)). In this study, CXCR4 in particular was associated with high-grade dysplasia and detected in some samples of adenocarcinoma.

TABLE 1A Cytokines and chemokines up-regulated in BE-to-Adenocarcinoma
NCBI RefSeq   Gene   BE   D   BE-CA   CA
NM_000594 TNF-a * * *
NM_002546 Osteoprotegerin * *
NM_002993 GCP-2 (*) *H (*) *
NM_025240 B7-H3 *L (*) *
NM_002995 Lymphotactin (*) * (*)
NM_005746 PBEF * (*)
NM_004591 SCYA20 (*) *
NM_004843 WSX1 *
NM_019618 IL1-H1 (*) * *
NM_000418 IL-4R *
NM_022789 IL-17E (*) * * *
NM_018725 IL-17BR *H (*)
NM_014432 IL-20Ra *L (*)
NM_021798 IL-21R (*) * *
NM_002192 Activin A (*) (*) *
NM_001616 AVR2, type II activin receptor * *
NM_001105 Activin A type I Receptor (*)
NM_031409 CCR6 (*) * *
NM_003467 CXCR4 *H (*)
NM_001838 CKR7 (*) (*) *

[0271]

TABLE 1B Prostaglandin synthesis-related genes up-regulated in BE-to-Adenocarcinoma
NCBI RefSeq   Gene   BE   D   BE-CA   CA
NM_000963 COX-2, prostaglandin synthase 2 (*) *H *
NM_000962 COX-1, prostaglandin synthase 1 *
NM_007366 PLA2R phospholipase A2 R1 * (*) *
NM_000953 PD2R prostaglandin D2 R (*) (*) *
NM_000959 PF2AR prostaglandin F2.alpha. R * (*) (*)
NM_000957 PER3 prostaglandin E R 2 (*) *
NM_000960 Prostaglandin IP (I2) R * * (*)
Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma (BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene expression changes associated with 15-25% of samples.

[0272] An otherwise rare IL-1 homolog, IL1-H1, was highly expressed in carcinoma in Barrett's, and also in the matched adjacent BE tissue from the same patients (FIG. 1). A previous study of the murine Il-1H1 ortholog detected constitutive expression only in esophageal squamous mucosa. In addition, human IL1-H1 mRNA could be induced in TNF.alpha.- and IFN.gamma.-treated keratinocytes and squamous epithelial tumor cell line A431 (Kumar et al., J. Biol. Chem. 275:10308-10314 (2000)). This gene is one marker of a specific esophageal squamous cell type exhibiting a striking induction of expression in both adenocarcinoma and patient-matched BE, amidst primarily intestinal and tumor markers observed in this study (Tables 2 and 3). The high expression in BE matched with adenocarcinoma in addition to adenocarcinoma suggests a possible epigenetic association.

[0273] Cyclooxygenase isoform 2 (COX-2), which catalyzes a rate-limiting step in conversion of arachidonate to inflammatory prostaglandins, has been implicated in Barrett's metaplasia and other cancers (Morris et al., Am. J. Gastroenterol. 96:990-996 (2001); Heasley et al., J. Biol. Chem. 272:14501-14504 (1997); and Tsujii et al., Cell 93:705-716 (1998)). Consistent with previous reports, a significant increase was observed in COX-2 gene expression with increasing dysplasia (high-grade dysplasia) and in adenocarcinoma (Table 1B). Smaller changes were also observed in COX-1 and several prostaglandin receptors. Arachidonic acid is released from the membrane by the action of phospholipases. Phospholipase A2 expression associated with increasing malignancy was also observed (Table 2) along with the M-type receptor (PLA2R, Table 1B), consistent with studies suggesting that COX-2, PA2 and PLA2R are coordinately expressed (Rys-Sikora et al., Am. J. Physiol. Cell Physiol. 278:822-833 (2000)).

[0274] Elevated expression was detected for another enzyme that generates a different class of biologically active eicosanoids from arachidonic acid, the epoxygenase CYP2J2 (FIG. 1B, Table 2). This cytochrome P450 enzyme is expressed in a variety of cell types in the small intestine, including epithelial cells, and may play a role in electrolyte transport, intestinal motility, and other processes (Wu et al., J. Biol. Chem. 271:3460-3468 (1996); Zeldin et al., Mol. Pharm. 51:931-943 (1997); and Node et al., Science 285:1276-1279 (1999)). Similar to COX-2, elevated expression is most apparent in samples of adenocarcinoma and dysplasia (both low-grade and high-grade dysplasia). The expression profile for CYP2J2 also reflects the progressive intestinal metaplasia observed in this study (Table 2).

[0275] Intestinal Metaplasia:

[0276] Analysis for gene expression changes associated with dysplasia revealed a large group of genes whose normal expression is primarily associated with the small intestine, and to a lesser extent, colon (Table 2). The previously described marker villin was detected (Peterson and Mooseker, J. Cell Sci. 102:581-600 (1992)), along with a diverse set of genes including cell surface cadherins and claudins, ion channels and transporters, and enzymes, many of which are normally associated with structural and absorptive functions of small intestinal villi. Increased expression of many of these genes was associated with dysplasia and a significant subset of carcinoma samples, with differential expression also detected in a smaller subset of BE samples. Furthermore, expression of the majority of genes was less prevalent in matched BE samples taken from the carcinoma patients, even when expression was apparent in the tumor sample (FIGS. 2A, 2B, 3A; Table 2). This suggests that these gene expression changes are more specifically associated with the foci of dysplasia and developing carcinoma within the larger region of BE.

TABLE 2 Genes up-regulated in intestinal metaplasia
NCBI RefSeq   SEQ ID NOS (na and aa)   Gene   Gene Description   BE   D   BE-CA   CA   Normal Tissues
NM_007127 Villin 1 actin binding protein * * * * SI, C
NM_003379 Villin 2 actin binding protein * SI, St, C, O
NM_000775 35 and 36 CYP2J2 arachidonic acid epoxygenase * (*) * SI, L, H
NM_005379 33 and 34 MYO1A myosin 1A *H * SI (C)
NM_004063 45 and 46 CAD17 liver-intestine cadherin (*) (*H) (*) * SI, C
NM_017717 MUCDHL mucin and cadherin like * SI (C, K)
NM_014343 47 and 48 CLDN15 claudin 15 (*) *L (*) * SI
NM_012132 CLDN8 claudin 8 * (*) C, K
NM_005567 IR-95 lectin-binding (*) * C, SI, St, O
NM_000021 Presenilin-1 beta-catenin binding *H (*) SI, C
NM_003039 GLUT5 glucose transporter * (*) (*) SI
NM_001081 CUBN transport (HDL, vit.B12, etc) *L K, SI
NM_004769 23 and 24 SLNAC1 sodium channel *H * * CNS, SI, O
NM_000492 49 and 50 CFTR chloride channel * (*H) * P, SI, C
NM_003272 13 and 14 TM7SF1 novel GPCR (*) *H K, C, SI, O
NM_005242 29 and 30 PAR2/F2RL1 GPCR, proteinase-activated *H SI, C
NM_022304 51 and 52 H2R histamine H2 receptor (*) * * * St-par
NM_004624 VIPR1 intestinal peptide GPCR * L, SI, C, CNS
NM_002773 7 and 8 PRSS8 serine protease * * SI, C, St
NM_058186 RPLA320 novel *L (*) SI (St, C, P)
NM_003561 SPLA2 phospholipase A2 group X * (*) (*) C, St, SI
NM_000928 27 and 28 PA21 phospholipase A2 group IB * (*) * P, SI, C
NM_001631 21 and 22 PPBI intestinal alkaline phosphatase (*) * SI
NM_000717 25 and 26 CAH4 carbonic anhydrase IV *H (*) C, SI
NM_005763 LKR/SDH lysine catabolism (*) *H * SI, C, O
NM_004969 31 and 32 IDE insulin degrading enzyme (*) * * * SI-ent., O
NM_001914 39 and 40 CYB5 cytochrome B5 (*) *H (*) L, SI, K
NM_001863 41 and 42 COX6B cytochrome C oxidase subunit (*) *H * H, M, SI, C, St
NM_000108 15 and 16 DLDH dihydrolipamide dehydrogenase (*) * H, M, K; SI, C
NM_006214 37 and 38 PHYH phytanoyl-CoA hydroxylase *H L, K, M; SI, C
NM_013283 17 and 18 MAT2B methionine adenosyltransferase *H (*) (*) SI, C, O
NM_000414 BHSD hydroxysteroid dehydrogenase (*) * L, SI, O
NM_005038 cyclophilin-40 peptidyl prolyl isomerase *L * SI, C, L, M
NM_138393 DP1 membrane trafficking (*) * * L, SI
NM_006408 3 and 4 AGR2 anterior gradient 2 homolog *H * St, SI, C
NM_021969 11 and 12 NROB2 nuclear hormone receptor * *H * SI, L, St
NM_005524 Hes1 transcriptional regulator * *H * * SI-ent., O
NM_002054 GCG proglucagon (*) * P, SI, C
Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma (BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene expression changes associated with 15-25% of samples. Normal Tissues: highest normal tissue expression is listed. SI (small intestine); C (colon); St (stomach); K (kidney); P (pancreas); L (liver); M (muscle); H (heart); CNS (central nervous system); SI-ent (intestinal enterocytes); St-par (parietal cells); O (other tissues). In the dysplasia column, H or L denotes expression associated with high-grade or low-grade dysplasia, respectively. GPCR (G protein coupled receptor). "na" and "aa" refer to the nucleic acid and amino acid SEQ ID NO, respectively, for the associated markers.

[0277] Examples include MYO1A, an unconventional myosin that is differentially expressed along the crypt-villus axis, exhibiting low level cytosolic expression in immature crypts and high expression in villus cells with localization at the brush border (Skowron et al., Cell Motil Cytoskel. 41:308-324 (1998); and MacLennan et al., Molec. Carcinogen. 24:137-143 (1999)). Unlike villin, another marker of the brush border that was detected across all disease states, MYO1A was most associated with high-grade dysplasia and carcinoma. The novel secreted factor AGR2 gives one of the most striking profiles as a marker for high-grade dysplasia (FIG. 2A). AGR2 is a human homolog of the X. laevis cement gland gene XAG-2, which is implicated in ectodermal patterning (Aberger et al., Mech. Dev. 72:115-130 (1998)). Elevated expression of this gene is also associated with hormonally-responsive breast cancer cell lines (Thompson and Weigel, Biochem. Biophys. Res. Commun. 251:111-116 (1998)).

[0278] Expression of nuclear hormone receptor NROB2 is induced by bile acids, and NROB2 in turn participates in transcriptional repression of the rate-limiting enzyme (CYP7A1) in bile synthesis (Lu et al., Mol. Cell 6:507-515 (2000)). In this study, overexpression of NROB2 is detected particularly in high-grade dysplasia, in addition to some carcinomas and a subset of BE samples (FIG. 2B). In addition to supporting the general pattern of intestinal metaplasia, expression of NROB2 may further reflect the response to the unnatural exposure of esophageal cells to bile, which is considered to be a contributing factor in Barrett's metaplasia (Bremner et al., Surgery 68:209-216 (1970); and Gillen et al., Br. J. Surg. 75:1352-1355 (1988)). Bile acids have also been shown to activate transcription of COX-2 (Zhang et al., J. Biol. Chem. 273:2424-2428 (1998)).

[0279] While these gene expression profiles are consistent with the observations of an increased columnar cell type in BE, the most consistent changes are associated with dysplasia, especially high-grade dysplasia (Table 2). These genes could serve as markers for progression in a clinical setting. For example, the number of genes which meet the described criteria for elevated expression in individual samples progressively increases through BE and dysplasia. The average number of markers detected per sample is 7.6 for BE, 11.7 for low-grade dysplasia, and 16.4 for high-grade dysplasia. Within the BE group, 3 samples have unusually high scores of 12, 12, and 14 markers detected. The two samples with 12 markers are different biopsies from the same patient: while the overall expression profiles vary between the 2 biopsies, they score identically in the marker analysis. Marker selection could be further refined to a subset associated with particular disease stages. This type of quantitative analysis may be of utility in identifying BE patients with greater risk of progression, and may be less sensitive to sampling and observer-related effects. Some of the secreted and processed factors listed (Table 1A, 2, 3) may even be detectable in the blood, which could further simplify screening.

[0280] Adenocarcinoma:

[0281] Many of the genes differentially expressed in adenocarcinoma in Barrett's, similar to other solid tumors, reflect the changes occurring as the cells acquire a more proliferative and invasive phenotype (Table 3). Included are genes involved with growth, cell adhesion, matrix invasion, vascularization, and intracellular remodeling. The majority of genes are most prevalent in adenocarcinoma, but some are also detected at earlier stages. For example, genes likely to be involved in tumor angiogenesis showed significant upregulation in samples with dysplasia (e.g., tumor endothelial marker 1 (TEM1), Tie2 ligand 2, VEGFC, endothelin 1).

TABLE 3 Genes up-regulated in esophageal adenocarcinoma
NCBI RefSeq Gene families/genes BE D BE-CA CA
Growth factors/receptors
NM_005228 EGFR (*H) *
NM_004442 EPHB2 *
NM_003212 CRIPTO CR-1 (*) * *
NM_004429 Ephrin B1 *$
Metalloproteinases - related
NM_016155 MMP-17/MT4-MMP *
NM_021801 MMP26 (*) (*) (*) *$
NM_001110 ADAM10 * *
NM_001109 ADAM8 *H (*)
XM_132370# ADAM1 * (*)
NM_003254 TIM1 * * * *
Intracellular cytoskeletal
NM_001665 rho G (*) * *
NM_006113 VAV3 * *
NM_002086 GRB2 * * (*)
NM_001666 C1 *H
NM_007124 Utrophin *
Transcription/nuclear
NM_030756 Tcf4, DNA269446 (*) * *
NM_005252 c-Fos * * *
NM_002592 PCNA * *
NM_004060 cyclin G *
NM_053056 Cyclin D1 * (*) $
NM_003401 XRCC4 *
NM_007149 Zinc finger protein *
Cell surface adhesion/matrix
XM_053256 MUC1 * * * *
NM_004363 CEA (*) *
NM_002483 NCA *
NM_006350 Follistatin *H (*) *$
NM_021101 Claudin 1 *$
NM_012130 Claudin 14 *
NM_003285 tenascin-R (*) * *
NM_001793 CAD3 (*) * *
NM_005076 AXO1 *H
NM_001843 CONT *H
NM_000582 Osteopontin (*) * *
NM_006499 Galectin 8 (*) *
NM_001711 PGS1 (biglycan) * *L
NM_001466 Frizzled 2 *$
NM_005545 ISLR *$
NM_022763 FLJ23399 (*) * *
Vascularization
NM_020404 TEM1 *H (*)
NM_001147 Tie2 ligand2 * * *
NM_003714 STC-2 *H (*)
NM_005429 VEGFC * (*)
NM_000930 tPA * *
NM_001955 Endothelin 1 *H (*)
NM_000361 Thrombomodulin (*) *
NM_001993 TF (*) * *
Channel/transmembrane
NM_005282 GPR4 * *
NM_006056 GPR66 *
NM_003058 SLC22A2 (*) (*H) * *
NM_002420 MLSN1 *
NM_000702 ATN2, Na/K transport *
Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma (BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene expression changes associated with 15-25% of samples. $ indicates a target of the Wnt signalling pathway.

[0282] The gene expression profiles in Barrett's adenocarcinoma share many similarities with colon tumors. For example, epidermal growth factor receptor (EGFR; previously described in carcinoma in BE) (al-Kasspooles et al., Internat. J. Cancer 54:213-219 (1993)), along with other growth factor-related or cell-surface proteins such as Cripto CR1, EPHB2, MUC1, NCA/CEACAM6, and CEA (Table 3), are often highly expressed in colon cancer (Ciardiello et al., Proc. Natl. Acad. Sci. USA 88:7792-7796 (1991); Liu et al., Cancer 94:934-939 (2002); Zimmerman et al., Proc. Natl. Acad. Sci. USA 84:2960-2964 (1987); Medina et al., Cancer Res. 59:1061-1070 (1999); and Ilantzis et al., Neoplasia 4:151-163 (2002)). The chloride channel associated with cystic fibrosis, CFTR, was upregulated in adenocarcinoma and can be detected in some cases of high-grade dysplasia (Table 2). This gene is also overexpressed in colon tumors. Furthermore, there is evidence that several genes listed are targets of Wnt signalling pathways (Table 3) (Tetsu and McCormick, Nature 398:422-426 (1999); Miwa et al., Oncol. Res. 12:469-476 (2000); Marchenko et al., Biochem. J. 363:253-262 (2002); Sagara et al., Biochem. and Biophys. Res. Comm. 252:117-122 (1998); Lescher et al., Dev. Dyn. 213:440-451 (1998); Willert et al., BMC Dev. Biol. 2:1-6 (2002); and Tice et al., J. Biol. Chem. 277:14329-14335 (2002)), and it is possible that COX-2, which is implicated in colon cancer as well as adenocarcinoma in Barrett's, is a Wnt pathway target (Howe et al., Cancer Res. 59:1572-1577 (1999)). An additional synergistic link is suggested by the recent finding that EGFR is activated by prostaglandin E2, a product of COX-2 (Tsujii et al., Cell 93:705-716 (1998); Tsujii et al., Proc. Natl. Acad. Sci. USA 94:3336-3340 (1997); and Pai et al., Nature Med. 8:289-293 (2002)).

[0283] More support for Wnt/beta catenin-like induction comes from the strong induction of transcription factor TCF4 (TCF7L2) in several dysplasia and adenocarcinoma samples (FIG. 3A). Knockout studies in mice indicate that TCF4 is necessary for the maintenance of proliferative crypts in the small intestine, and constitutive activity of TCF4 in APC-deficient human epithelial cells may contribute to their malignant transformation (Korinek et al., Nature Gen. 19:379-383 (1998)). Given its role in colon carcinogenesis, TCF4 provides another key link between intestinal metaplasia and carcinoma in BE.

[0284] Most genes listed represent known genes, but the novel gene FLJ23399 was one of the genes most consistently observed in adenocarcinoma and patient-matched adjacent BE samples (FIG. 3B). Expression in BE adjacent to carcinoma suggests the induction may be epigenetic, or may possibly reflect small foci of adenocarcinoma that cannot be identified histologically. Increased expression of this gene was also discovered herein to be associated with colon tumors, and with metastatic prostate tumors (increased expression with metastasis as compared to primary tumors). Its function is unknown, but the presence of 4 type III fibronectin domains in the putative extracellular region suggests a possible role in cell adhesion and/or cell-matrix interactions.

[0285] Barrett's Esophagus-to-Adenocarcinoma Disease Progression:

[0286] Despite the difficulties associated with sampling and interpretation, the presence and degree of dysplasia is still the most predictive factor for risk of progression to adenocarcinoma (Miros et al., Gut 32:1441-1446 (1991)). Foci of carcinoma typically appear adjacent to dysplasia, and esophageal resections of high-grade dysplasia frequently contain previously unrecognized adenocarcinoma (Falk et al., Gastrointest. Endosc. 49:170-176 (1999); and Cameron and Carpenter, Am. J. Gastroenterol. 92:586-591 (1997)). In this study, by the time dysplasia was apparent, there was evidence of progressive development toward a gene expression profile similar to a differentiated small intestinal enterocyte (along with a small group of genes representative of other intestinal cell types). A possible key contributing factor is the increased expression of TCF4 with advancing disease. Homozygous disruption of TCF4 in mice results in death shortly after birth, and the neonatal epithelium is composed only of non-dividing villus cells (Korinek, V. et al., Nature Gen. 19:379-383 (1998)). This suggests that the genetic program controlled by TCF4 maintains, and possibly establishes, the crypt stem cells of the small intestine. In humans, TCF4 is expressed strongly in the crypts in early fetal development, with increasing expression on the villi up to week 22 as the small intestine develops (Barker et al., Am. J. Pathol. 154:29-35 (1999)). TCF4 is also expressed along the crypt-villus axis of adult small intestine and along the epithelial lining of the crypts of adult colon. The TCF4 profile observed in dysplasia and carcinoma in BE may reflect the inappropriate activation of a developmental pathway with a possible underlying dynamic and differentiating stem cell-like population, or acquisition of some of these characteristics. The delicate cells of the small intestine, with their specialized absorptive and digestive functions and rapid turnover, would seem highly susceptible to damage in the context of the esophagus and gastroesophageal reflux disease.

[0287] The developing intestinal phenotype apparent by progression to dysplasia, associated with increased expression of TCF4, suggests some tantalizing links to the development of carcinoma and the similarities in gene expression between adenocarcinoma of the esophagus and colon. In the context of loss of APC function, association of beta catenin with TCF4 results in constitutive transcription of Tcf target genes, a proposed crucial event in the early transformation of colonic epithelia in colon cancer (Korinek et al., Science 275:1784-1787 (1997)). While there is not strong evidence of truncating mutations in APC or oncogenic beta catenin in esophageal adenocarcinoma, there is evidence of hypermethylation of the APC promoter (in 48/52 of adenocarcinoma patients and 17/43 patients with BE metaplasia) (Kawakami et al., J. Natl. Cancer Inst. 92:1805-1811 (2000)). APC hypermethylation has also been implicated in progression in colon cancer (Hiltunen et al., Int. J. Cancer 70:644-648 (1997)). In this context, it is interesting to note that elevated c-Fos expression was apparent in our study in both dysplasia and carcinoma (Table 3). This could perhaps be related to the presence of bile acids from reflux, overexpression of proglucagon-derived peptide GLP2 (Table 2), or of TNFa (Table 1), all of which have been shown to induce c-Fos expression (Bakin and Curran, Science 283:387-390 (1999); Di Toro et al., Eur. J. Pharm. Sci. 11:291-298 (2000); and Bjerknes and Cheng, Proc. Natl. Acad. Sci. USA 98:12497-12502 (2001)). One proposal for oncogenic transformation by c-Fos is hypermethylation resulting from induction of DNA 5-methylcytosine transferase (Goetze et al., Atherosclerosis 159:93-101 (2001)). These factors may contribute to a potential increased availability of beta catenin to combine with TCF4 and activate transcriptional pathways that contribute to carcinogenesis. 
c-Fos may play an earlier role in intestinal metaplasia as well: studies of intestinal development in mice indicate that GLP2-mediated induction of c-Fos in enteric neurons signals growth of columnar epithelial cell progenitors and stem cells (Di Toro et al., Eur. J. Pharm. Sci. 11:291-298 (2000)).

[0288] Gene expression profiling of esophageal biopsies has revealed several intriguing associations for the progression of malignancy in the context of Barrett's esophagus. Many of the genes may be involved in potentiating regulatory cycles, and there is potential synergy for the development of adenocarcinoma between exposure to damaging agents (e.g., bile), inflammatory response and prostaglandin synthesis, intestinal metaplasia and TCF4 induction, along with induction of growth factors such as EGFR and oncogenes such as c-Fos. Subsets of the genes identified may also eventually serve as markers to identify patients at higher risk for adenocarcinoma. This could permit streamlining of expensive and time-consuming surveillance programs, along with earlier detection and associated improved survival chances for high-risk patients.

[0289] Diagnosis of High-grade Esophageal Dysplasia and Prognosis of Esophageal Adenocarcinoma:

[0290] Several HGD gene markers were discovered as being up-regulated at least 1.5-fold in many high-grade dysplasia samples but are up-regulated in relatively few Barrett's esophagus samples (see Table 4A compared to Table 4B). According to the invention, where at least eight of the twenty-two HGD gene markers are detected to be up-regulated at least 1.5-fold in an esophageal tissue sample, cells of the tissue sample are said to exhibit HGD. In addition, the patient from whom the sample was taken may be diagnosed as experiencing high-grade esophageal dysplasia. Further, the prognosis for the patient includes the likely development of adenocarcinoma. Based on the detection of HGD, diagnosis and prognosis, the patient may be treated accordingly and at an earlier stage in the BE-to-cancer progression than would otherwise have occurred prior to disclosure of the instant invention. Alternatively, in a test esophageal tissue sample, where at least one of the at least eight up-regulated HGD marker genes is AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B (SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), or TCF4 (SEQ ID NO:43), cells of the tissue sample exhibit HGD and the patient is said to be diagnosed as experiencing dysplasia, particularly high-grade dysplasia, and is likely to develop adenocarcinoma.
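The counting rule described in this paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names, the dictionary input, and the sample scores are hypothetical, and each per-gene value is assumed to be a score that is compared directly against the 1.5 cut-off.

```python
# Hypothetical sketch of the "at least eight of twenty-two markers" rule.
# Names and sample values are illustrative assumptions, not from the source.

# The twenty-two HGD marker genes named in the disclosure.
HGD_MARKERS = (
    "ET-1", "AGR2", "ADAM8", "PRSS8", "AXO1", "NROB2", "TM7SF1", "DLDH",
    "MAT2B", "STC-2", "PPBI", "SLNAC1", "CAH4", "PA21", "PAR2", "IDE",
    "MYO1A", "CYP2J2", "PHYH", "CYB5", "COXVIb", "TCF4",
)

# Markers singled out by the alternative criterion in the paragraph above.
KEY_MARKERS = frozenset({"AGR2", "TM7SF1", "MAT2B", "SLNAC1", "TCF4"})


def upregulated(scores, cutoff=1.5):
    """Return the HGD markers whose score meets or exceeds the cut-off.

    Genes that are missing or recorded as None (not determined) are
    treated as not up-regulated.
    """
    hits = []
    for gene in HGD_MARKERS:
        value = scores.get(gene)
        if value is not None and value >= cutoff:
            hits.append(gene)
    return hits


def exhibits_hgd(scores, cutoff=1.5, min_markers=8, require_key_marker=False):
    """Apply the marker-count rule to one sample's scores."""
    hits = upregulated(scores, cutoff)
    if len(hits) < min_markers:
        return False
    if require_key_marker:
        return any(gene in KEY_MARKERS for gene in hits)
    return True


# Illustrative scores for a single hypothetical sample.
sample = {
    "ET-1": 2.9, "AGR2": 3.1, "ADAM8": 3.6, "PRSS8": 2.5, "NROB2": 4.9,
    "TM7SF1": 1.5, "DLDH": 2.1, "TCF4": 3.6, "IDE": 1.2, "PPBI": None,
}
print(exhibits_hgd(sample))
print(exhibits_hgd(sample, require_key_marker=True))
```

Setting `require_key_marker=True` applies the alternative criterion, in which at least one of the up-regulated markers must be AGR2, TM7SF1, MAT2B, SLNAC1, or TCF4.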

TABLE 4A High-grade Dysplasia Markers
Columns: NCBI #, SEQ ID NO: (na and aa), Gene name, then Z score* for Sample ID # 2493, 2955, 2491, 2958, 3128, 2493, 3130
NM_001955 1 and 2 Endothelin 1 ET-1 2.9 1.9 2.7 2.2
NM_006408 3 and 4 anterior gradient 2 (Xenopus laevis) homolog AGR2 3.1 2.7 2.6 2.7 3.4 2. 2.9
NM_001109 5 and 6 ADAM8 ADAM8 3.6 1.8 2.3
NM_002773 7 and 8 Prostasin precursor, serine protease PRSS8 2.5 1.8 2.7 3.1 2.3
NM_005076 9 and 10 Axonin-1 precursor AXO1 2. 1.6 2. 1.5
NM_021969 11 and 12 Nuclear hormone receptor NROB2 4.9 2.1 2.8 3.6 2.6 2.7
NM_003272 13 and 14 TM7SF1 TM7SF1 1.5 3.6 2.3 1.7 3. 2.2 1.7
NM_000108 15 and 16 dihydrolipoamide dehydrogenase DLDH 2.1 3.2 1.9 1.7
NM_013283 17 and 18 methionine adenosyltransferase II, beta MAT2B 2.5 1.8 2.2 3. 2.7
NM_003714 19 and 20 stanniocalcin-2 STC-2 2.3 1.7 1.9 1.6 1.9
NM_001631 21 and 22 Alkaline phosphatase, intestinal precursor PPBI 2.3 1.6 2. 2.4 ND
NM_004769 23 and 24 Sodium channel receptor SLNAC1 SLNAC1 2.9 1.8 3.6 3. 2.9 ND 2.5
NM_000717 25 and 26 Carbonic anhydrase iv precursor CAH4 1.7 1.8 1.8
NM_000928 27 and 28 Phospholipase a2 precursor PA21 2. 2.4 2.4
NM_005242 29 and 30 Proteinase activated receptor 2 precursor PAR2 2.9 2.7
NM_004969 31 and 32 Insulin-degrading enzyme IDE 1.6 2.5 4.4 1.8 1.9 1.8
NM_005379 33 and 34 Myosin IA (MYO1A) MYO1A 1.8 2.3 1.5
NM_000775 35 and 36 Cytochrome P450 monooxygenase CYP2J2 CYP2J2 2.4 4.3 2.3
NM_006214 37 and 38 Phytanoyl-CoA hydroxylase (Refsum disease) PHYH 2.9 2.4 1.9
NM_001914 39 and 40 "Cytochrome b5, 3' end" CYB5 3. 2.4
NM_001863 41 and 42 "CoxVIb gene, last exon and flanking sequence" coxVIb 1.9 2.2 2. 1.9 1.6
NM_030756 43 and 44 TCF4 TCF4 3.6 2.6 6.8 3.5 4.1
total number 15 10 17 18 16 12 8
Z score cut-off was 1.5 or above (p < 0.07). "na" and "aa" refer to the nucleic acid and amino acid SEQ ID NO, respectively, for the associated markers.

[0291]

TABLE 4B Low Prevalence of HGD Markers
Columns: NCBI #, SEQ ID NO: (na and aa), Gene name, then Z score* for Sample ID # B-15, B-17, B-18, B, 3091, 3131, 3132
NM_001955 1 and 2 ET-1
NM_006408 3 and 4 AGR2 2.5
NM_001109 5 and 6 ADAM8 2.2
NM_002773 7 and 8 PRSS8 3.4 1.5
NM_005076 9 and 10 AXO1
NM_021969 11 and 12 NROB2 3.2 2.4 2.4
NM_003272 13 and 14 TM7SF1 3.1
NM_000108 15 and 16 DLDH 2.
NM_013283 17 and 18 MAT2B 2.4
NM_003714 19 and 20 STC-2
NM_001631 21 and 22 PPBI 2.
NM_004769 23 and 24 SLNAC1 2.8
NM_000717 25 and 26 CAH4 1.8 1.5
NM_000928 27 and 28 PA21
NM_005242 29 and 30 PAR2
NM_004969 31 and 32 IDE 1.5
NM_005379 33 and 34 MYO1A 1.5 1.6
NM_000775 35 and 36 CYP2J2
NM_006214 37 and 38 PHYH
NM_001914 39 and 40 CYB5 5.3 1.8
NM_001863 41 and 42 coxVIb 1.8 1.9
NM_030756 43 and 44 TCF4 2.4
Total # 5 4 5 4 0 2 2
Columns: NCBI #, then Z score* for Sample ID # 3142, 3143, 3088, 2296, 2554, 2555, 3134, 3135, 3140, 3181, 3141
NM_001955
NM_006408 1.5
NM_001109
NM_002773
NM_005076
NM_021969 2.2 1.7 1.7 2.6 1.5
NM_003272
NM_000108
NM_013283
NM_003714
NM_001631
NM_004769
NM_000717 1.5
NM_000928 4.2 4.7 2.6 4.3 7.4
NM_005242
NM_004969 2.6 2.8 4.9
NM_005379 1.6 1.7 1.6
NM_000775 5.7
NM_006214 1.6 3.2
NM_001914
NM_001863 1.7 2.1
NM_030756 2.3
Total # 2 1 4 1 2 3 3 0 2 4 2
Z score cut-off was 1.5 or above (p < 0.07). "na" and "aa" refer to the nucleic acid and amino acid SEQ ID NO, respectively, for the associated markers.
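As a quick check of how the eight-marker threshold separates the two groups, the per-sample marker totals printed in Tables 4A and 4B can be compared in a few lines of Python. This is only a sketch: it assumes the integers at the end of each table are the per-sample totals (seven high-grade dysplasia samples and eighteen Barrett's esophagus samples), and the variable and function names are hypothetical.

```python
# Per-sample marker totals as printed in Table 4A (high-grade dysplasia).
HGD_TOTALS = [15, 10, 17, 18, 16, 12, 8]

# Per-sample marker totals as printed in the two parts of Table 4B (BE).
# Assumption: the trailing integers of each part are the sample totals.
BE_TOTALS = [5, 4, 5, 4, 0, 2, 2] + [2, 1, 4, 1, 2, 3, 3, 0, 2, 4, 2]

MIN_MARKERS = 8  # threshold stated in the description


def count_calls(totals, min_markers=MIN_MARKERS):
    """Count the samples whose marker total reaches the threshold."""
    return sum(1 for total in totals if total >= min_markers)


print(f"HGD samples at or above threshold: "
      f"{count_calls(HGD_TOTALS)} of {len(HGD_TOTALS)}")
print(f"BE samples at or above threshold: "
      f"{count_calls(BE_TOTALS)} of {len(BE_TOTALS)}")
print(f"Highest BE total: {max(BE_TOTALS)}")
```

With the totals as printed, every sample in Table 4A reaches the threshold and no sample in Table 4B does.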

[0292] In addition to detecting and diagnosing HGD and developing a prognosis of esophageal adenocarcinoma, treatment of cancer, including, but not limited to, adenocarcinoma, esophageal adenocarcinoma, and colon cancer, is also possible by administering to a patient a therapeutically effective amount of an antagonist of one or more of the following adenocarcinoma marker polypeptides: CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:46), CLDN15 (claudin 15, NM_014343) (SEQ ID NO:48), SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:24), CFTR (chloride channel, NM_000492) (SEQ ID NO:50), H2R (histamine H2 receptor, NM_022304) (SEQ ID NO:52), PRSS8 (serine protease, NM_002773) (SEQ ID NO:8), PA21 (phospholipase A2 group IB, NM_000928) (SEQ ID NO:28), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:4), EGFR (NM_005228) (SEQ ID NO:54), EPHB2 (NM_004442) (SEQ ID NO:56), CRIPTO CR-1 (NM_003212) (SEQ ID NO:58), Ephrin B1 (NM_004429) (SEQ ID NO:60), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:62), MMP26 (NM_021801) (SEQ ID NO:64), ADAM10 (NM_001110) (SEQ ID NO:66), ADAM8 (NM_001109) (SEQ ID NO:6), ADAM1 (XM_132370) (SEQ ID NO:68), TIM1 (NM_003254) (SEQ ID NO:70), MUC1 (XM_053256) (SEQ ID NO:72), CEA (NM_004363) (SEQ ID NO:74), NCA (NM_002483) (SEQ ID NO:76), Follistatin (NM_006350) (SEQ ID NO:78), Claudin 1 (NM_021101) (SEQ ID NO:80), Claudin 14 (NM_012130) (SEQ ID NO:82), tenascin-R (NM_003285) (SEQ ID NO:84), CAD3 (NM_001793) (SEQ ID NO:86), AXO1 (NM_005076) (SEQ ID NO:10), CONT (NM_001843) (SEQ ID NO:88), Osteopontin (NM_000582) (SEQ ID NO:90), Galectin 8 (NM_006499) (SEQ ID NO:92), PGS1 (biglycan, NM_001711) (SEQ ID NO:94), Frizzled 2 (NM_001466) (SEQ ID NO:96), ISLR (NM_005545) (SEQ ID NO:98), FLJ23399 (NM_022763) (SEQ ID NO:100), TEM1 (NM_020404) (SEQ ID NO:102), Tie2 ligand2 (NM_001147) (SEQ ID NO:104), STC-2 (NM_003714) (SEQ ID NO:20), VEGFC (NM_005429) (SEQ ID NO:106), tPA (NM_000930) (SEQ ID NO:108), Endothelin 1 (NM_001955) (SEQ ID NO:2), Thrombomodulin (NM_000361) (SEQ ID NO:110), TF (NM_001993) (SEQ ID NO:112), GPR4 (NM_005282) (SEQ ID NO:114), GPR66 (NM_006056) (SEQ ID NO:116), SLC22A2 (NM_003058) (SEQ ID NO:118), MLSN1 (NM_002420) (SEQ ID NO:120), or ATN2 (Na/K transport, NM_000702) (SEQ ID NO:122). The antagonist is a small molecule that binds and inactivates the polypeptide; binds and inactivates a precursor of the polypeptide; prevents translation of the polypeptide; prevents its transcription; or the like. Alternatively, the antagonist is an antibody that specifically binds the polypeptide and inhibits or prevents its activity. Where the antagonist is an antibody, the antibody is optionally a monoclonal antibody, a humanized antibody, or a binding fragment thereof. The treatment involves contacting a cancer cell with an antagonist of at least one of the polypeptides encoded by the adenocarcinoma marker genes listed above, alternatively with an antagonist of at least three, alternatively with at least five, and alternatively with at least eight of the polypeptides encoded by the adenocarcinoma marker genes listed above.

[0293] Further, a method of screening for a compound that inhibits cancer cell growth or causes the death of a cancer cell, particularly an adenocarcinoma cell, an esophageal adenocarcinoma cell, or a colon cancer cell, is an aspect of the invention. Accordingly, the screening method involves contacting a cancer cell with a candidate compound, the cancer cell being, for example, one expressing at least one, three, five, eight or more of the adenocarcinoma gene markers selected from the group consisting of CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45), CLDN15 (claudin 15, NM_014343) (SEQ ID NO:47), SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:23), CFTR (chloride channel, NM_000492) (SEQ ID NO:49), H2R (histamine H2 receptor, NM_022304) (SEQ ID NO:51), PRSS8 (serine protease, NM_002773) (SEQ ID NO:7), PA21 (phospholipase A2 group IB, NM_000928) (SEQ ID NO:27), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:3), EGFR (NM_005228) (SEQ ID NO:53), EPHB2 (NM_004442) (SEQ ID NO:55), CRIPTO CR-1 (NM_003212) (SEQ ID NO:57), Ephrin B1 (NM_004429) (SEQ ID NO:59), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61), MMP26 (NM_021801) (SEQ ID NO:63), ADAM10 (NM_001110) (SEQ ID NO:65), ADAM8 (NM_001109) (SEQ ID NO:5), ADAM1 (XM_132370) (SEQ ID NO:67), TIM1 (NM_003254) (SEQ ID NO:69), MUC1 (XM_053256) (SEQ ID NO:71), CEA (NM_004363) (SEQ ID NO:73), NCA (NM_002483) (SEQ ID NO:75), Follistatin (NM_006350) (SEQ ID NO:77), Claudin 1 (NM_021101) (SEQ ID NO:79), Claudin 14 (NM_012130) (SEQ ID NO:81), tenascin-R (NM_003285) (SEQ ID NO:83), CAD3 (NM_001793) (SEQ ID NO:85), AXO1 (NM_005076) (SEQ ID NO:9), CONT (NM_001843) (SEQ ID NO:87), Osteopontin (NM_000582) (SEQ ID NO:89), Galectin 8 (NM_006499) (SEQ ID NO:91), PGS1 (biglycan, NM_001711) (SEQ ID NO:93), Frizzled 2 (NM_001466) (SEQ ID NO:95), ISLR (NM_005545) (SEQ ID NO:97), FLJ23399 (NM_022763) (SEQ ID NO:99), TEM1 (NM_020404) (SEQ ID NO:101), Tie2 ligand2 (NM_001147) (SEQ ID NO:103), STC-2 (NM_003714) (SEQ ID NO:19), VEGFC (NM_005429) (SEQ ID NO:105), tPA (NM_000930) (SEQ ID NO:107), Endothelin 1 (NM_001955) (SEQ ID NO:1), Thrombomodulin (NM_000361) (SEQ ID NO:109), TF (NM_001993) (SEQ ID NO:111), GPR4 (NM_005282) (SEQ ID NO:113), GPR66 (NM_006056) (SEQ ID NO:115), SLC22A2 (NM_003058) (SEQ ID NO:117), MLSN1 (NM_002420) (SEQ ID NO:119), and ATN2 (Na/K transport, NM_000702) (SEQ ID NO:121), followed by determining cancer cell growth inhibition or cancer cell death.

Example 5

Nucleic Acid and Amino Acid Sequence Identity Determinations

[0294] As shown below, Table 5 provides the complete source code for the ALIGN-2 sequence comparison computer program. This source code may be routinely compiled for use on a UNIX operating system to provide the ALIGN-2 sequence comparison computer program.

[0295] In addition, disclosed herein are hypothetical exemplifications for using the below described method to determine % amino acid sequence identity and % nucleic acid sequence identity using the ALIGN-2 sequence comparison computer program, wherein "PRO" represents the amino acid sequence of a hypothetical HGD marker polypeptide of interest, "Comparison Protein" represents the amino acid sequence of a polypeptide against which the "PRO" polypeptide of interest is being compared, "PRO-DNA" represents a hypothetical HGD marker polypeptide-encoding nucleic acid sequence of interest, "Comparison DNA" represents the nucleotide sequence of a nucleic acid molecule against which the "PRO-DNA" nucleic acid molecule of interest is being compared, "X", "Y", and "Z" each represent different hypothetical amino acid residues and "N", "L" and "V" each represent different hypothetical nucleotides.

[0296] Example calculations for determining % amino acid sequence identity and % nucleic acid sequence identity:

1.

PRO XXXXXXXXXXXXXXX (Length = 15 amino acids)
Comparison Protein XXXXXYYYYYYY (Length = 12 amino acids)

[0297] % amino acid sequence identity=

[0298] (the number of identically matching amino acid residues between the two polypeptide sequences as determined by ALIGN-2) divided by (the total number of amino acid residues of the PRO polypeptide)=

[0299] 5 divided by 15=33.3%

2.

PRO XXXXXXXXXX (Length = 10 amino acids)
Comparison Protein XXXXXYYYYYYZZYZ (Length = 15 amino acids)

[0300] % amino acid sequence identity=

[0301] (the number of identically matching amino acid residues between the two polypeptide sequences as determined by ALIGN-2) divided by (the total number of amino acid residues of the PRO polypeptide)=

[0302] 5 divided by 10=50%

3.

PRO-DNA NNNNNNNNNNNNNN (Length = 14 nucleotides)
Comparison DNA NNNNNNLLLLLLLLLL (Length = 16 nucleotides)

[0303] % nucleic acid sequence identity=

[0304] (the number of identically matching nucleotides between the two nucleic acid sequences as determined by ALIGN-2) divided by (the total number of nucleotides of the PRO-DNA nucleic acid sequence)=

[0305] 6 divided by 14=42.9%

4.

PRO-DNA NNNNNNNNNNNN (Length = 12 nucleotides)
Comparison DNA NNNNLLLVV (Length = 9 nucleotides)

[0306] % nucleic acid sequence identity=

[0307] (the number of identically matching nucleotides between the two nucleic acid sequences as determined by ALIGN-2) divided by (the total number of nucleotides of the PRO-DNA nucleic acid sequence)=

[0308] 4 divided by 12=33.3%
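The four example calculations above can be reproduced with a short Python sketch. It is not ALIGN-2: in the description the number of identical matches is determined by the ALIGN-2 alignment, whereas this sketch swaps in a naive, ungapped, left-anchored position-by-position comparison, which happens to give the same match counts for these toy sequences. The function name `percent_identity` is hypothetical.

```python
def percent_identity(pro, comparison):
    """Percent identity relative to the length of the sequence of interest.

    Assumption: matches are counted position by position from the left
    with no gaps. This stands in for the ALIGN-2 alignment named in the
    text and is only adequate for the simple examples shown here.
    """
    matches = sum(1 for a, b in zip(pro, comparison) if a == b)
    return 100.0 * matches / len(pro)


# (sequence of interest, comparison sequence) for examples 1 to 4.
examples = [
    ("X" * 15, "XXXXXYYYYYYY"),      # 1. PRO vs Comparison Protein
    ("X" * 10, "XXXXXYYYYYYZZYZ"),   # 2. PRO vs Comparison Protein
    ("N" * 14, "NNNNNNLLLLLLLLLL"),  # 3. PRO-DNA vs Comparison DNA
    ("N" * 12, "NNNNLLLVV"),         # 4. PRO-DNA vs Comparison DNA
]

for number, (pro, comparison) in enumerate(examples, start=1):
    print(f"Example {number}: {percent_identity(pro, comparison):.1f}%")
```

Note that the denominator is always the length of the PRO or PRO-DNA sequence, not the comparison sequence, which is why examples 1 and 2 give different percentages from the same five matching residues.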

[0309] Although the foregoing refers to particular embodiments, it will be understood that the present invention is not so limited. It will occur to those of ordinary skill in the art that various modifications may be made to the disclosed embodiments without diverting from the overall concept of the invention. All such modifications are intended to be within the scope of the present invention.

Sequence CWU 1

123 1 1334 DNA Homo sapien 1 cgccgcgtgc gcctgcagac gctccgctcg ctgccttctc tcctggcagg 50 cgctgccttt tctccccgtt aaagggcact tgggctgaag gatcgctttg 100 agatctgagg aacccgcagc gctttgaggg acctgaagct gtttttcttc 150 gttttccttt gggttcagtt tgaacgggag gtttttgatc cctttttttc 200 agaatggatt atttgctcat gattttctct ctgctgtttg tggcttgcca 250 aggagctcca gaaacagcag tcttaggcgc tgagctcagc gcggtgggtg 300 agaacggcgg ggagaaaccc actcccagtc caccctggcg gctccgccgg 350 tccaagcgct gctcctgctc gtccctgatg gataaagagt gtgtctactt 400 ctgccacctg gacatcattt gggtcaacac tcccgagcac gttgttccgt 450 atggacttgg aagccctagg tccaagagag ccttggagaa tttacttccc 500 acaaaggcaa cagaccgtga gaatagatgc caatgtgcta gccaaaaaga 550 caagaagtgc tggaattttt gccaagcagg aaaagaactc agggctgaag 600 acattatgga gaaagactgg aataatcata agaaaggaaa agactgttcc 650 aagcttggga aaaagtgtat ttatcagcag ttagtgagag gaagaaaaat 700 cagaagaagt tcagaggaac acctaagaca aaccaggtcg gagaccatga 750 gaaacagcgt caaatcatct tttcatgatc ccaagctgaa aggcaatccc 800 tccagagagc gttatgtgac ccacaaccga gcacattggt gacagacctt 850 cggggcctgt ctgaagccat agcctccacg gagagccctg tggccgactc 900 tgcactctcc accctggctg ggatcagagc aggagcatcc tctgctggtt 950 cctgactggc aaaggaccag cgtcctcgtt caaaacattc caagaaaggt 1000 taaggagttc ccccaaccat cttcactggc ttccatcagt ggtaactgct 1050 ttggtctctt ctttcatctg gggatgacaa tggacctctc agcagaaaca 1100 cacagtcaca ttcgaattcg ggtggcatcc tccggagaga gagagaggaa 1150 ggagattcca cacaggggtg gagtttctga cgaaggtcct aagggagtgt 1200 ttgtgtctga ctcaggcgcc tggcacattt cagggagaaa ctccaaagtc 1250 cacacaaaga ttttctaagg aatgcacaaa ttgaaaacac actcaaaaga 1300 caaacatgca agtaaagaaa aaaaaaaaaa aaaa 1334 2 212 PRT Homo sapien 2 Met Asp Tyr Leu Leu Met Ile Phe Ser Leu Leu Phe Val Ala Cys 1 5 10 15 Gln Gly Ala Pro Glu Thr Ala Val Leu Gly Ala Glu Leu Ser Ala 20 25 30 Val Gly Glu Asn Gly Gly Glu Lys Pro Thr Pro Ser Pro Pro Trp 35 40 45 Arg Leu Arg Arg Ser Lys Arg Cys Ser Cys Ser Ser Leu Met Asp 50 55 60 Lys Glu Cys Val Tyr Phe Cys His Leu Asp Ile Ile Trp Val Asn 65 70 75 Thr Pro Glu His 
Val Val Pro Tyr Gly Leu Gly Ser Pro Arg Ser 80 85 90 Lys Arg Ala Leu Glu Asn Leu Leu Pro Thr Lys Ala Thr Asp Arg 95 100 105 Glu Asn Arg Cys Gln Cys Ala Ser Gln Lys Asp Lys Lys Cys Trp 110 115 120 Asn Phe Cys Gln Ala Gly Lys Glu Leu Arg Ala Glu Asp Ile Met 125 130 135 Glu Lys Asp Trp Asn Asn His Lys Lys Gly Lys Asp Cys Ser Lys 140 145 150 Leu Gly Lys Lys Cys Ile Tyr Gln Gln Leu Val Arg Gly Arg Lys 155 160 165 Ile Arg Arg Ser Ser Glu Glu His Leu Arg Gln Thr Arg Ser Glu 170 175 180 Thr Met Arg Asn Ser Val Lys Ser Ser Phe His Asp Pro Lys Leu 185 190 195 Lys Gly Asn Pro Ser Arg Glu Arg Tyr Val Thr His Asn Arg Ala 200 205 210 His Trp 3 1701 DNA Homo sapien 3 ccgcatccta gccgccgact cacacaaggc aggtgggtga ggaaatccag 50 agttgccatg gagaaaattc cagtgtcagc attcttgctc cttgtggccc 100 tctcctacac tctggccaga gataccacag tcaaacctgg agccaaaaag 150 gacacaaagg actctcgacc caaactgccc cagaccctct ccagaggttg 200 gggtgaccaa ctcatctgga ctcagacata tgaagaagct ctatataaat 250 ccaagacaag caacaaaccc ttgatgatta ttcatcactt ggatgagtgc 300 ccacacagtc aagctttaaa gaaagtgttt gctgaaaata aagaaatcca 350 gaaattggca gagcagtttg tcctcctcaa tctggtttat gaaacaactg 400 acaaacacct ttctcctgat ggccagtatg tccccaggat tatgtttgtt 450 gacccatctc tgacagttag agccgatatc actggaagat attcaaatcg 500 tctctatgct tacgaacctg cagatacagc tctgttgctt gacaacatga 550 agaaagctct caagttgctg aagactgaat tgtaaagaaa aaaaatctcc 600 aagcccttct gtctgtcagg ccttgagact tgaaaccaga agaagtgtga 650 gaagactggc tagtgtggaa gcatagtgaa cacactgatt aggttatggt 700 ttaatgttac aacaactatt ttttaagaaa aacaagtttt agaaatttgg 750 tttcaagtgt acatgtgtga aaacaatatt gtatactacc atagtgagcc 800 atgattttct aaaaaaaaaa ataaatgttt tgggggtgtt ctgttttctc 850 caacttggtc tttcacagtg gttcgtttac caaataggat taaacacaca 900 caaaatgctc aaggaaggga caagacaaaa ccaaaactag ttcaaatgat 950 gaagaccaaa gaccaagtta tcatctcacc acaccacagg ttctcactag 1000 atgactgtaa gtagacacga gcttaatcaa cagaagtatc aagccatgtg 1050 ctttagcata aaagaatatt tagaaaaaca tcccaagaaa atcacatcac 1100 tacctagagt caactctggc caggaactct 
aaggtacaca ctttcattta 1150 gtaattaaat tttagtcaga ttttgcccaa cctaatgctc tcagggaaag 1200 cctctggcaa gtagctttct ccttcagagg tctaatttag tagaaaggtc 1250 atccaaagaa catctgcact cctgaacaca ccctgaagaa atcctgggaa 1300 ttgaccttgt aatcgatttg tctgtcaagg tcctaaagta ctggagtgaa 1350 ataaattcag ccaacatgtg actaattgga agaagagcaa agggtggtga 1400 cgtgttgatg aggcagatgg agatcagagg ttactagggt ttaggaaacg 1450 tgaaaggctg tggcatcagg gtaggggagc attctgccta acagaaatta 1500 gaattgtgtg ttaatgtctt cactctatac ttaatctcac attcattaat 1550 atatggaatt cctctactgc ccagcccctc ctgatttctt tggcccctgg 1600 actatggtgc tgtatataat gctttgcagt atctgttgct tgtcttgatt 1650 aacttttttg gataaaacct tttttgaaca gaaaaaaaaa aaaaaaaaaa 1700 a 1701 4 175 PRT Homo sapien 4 Met Glu Lys Ile Pro Val Ser Ala Phe Leu Leu Leu Val Ala Leu 1 5 10 15 Ser Tyr Thr Leu Ala Arg Asp Thr Thr Val Lys Pro Gly Ala Lys 20 25 30 Lys Asp Thr Lys Asp Ser Arg Pro Lys Leu Pro Gln Thr Leu Ser 35 40 45 Arg Gly Trp Gly Asp Gln Leu Ile Trp Thr Gln Thr Tyr Glu Glu 50 55 60 Ala Leu Tyr Lys Ser Lys Thr Ser Asn Lys Pro Leu Met Ile Ile 65 70 75 His His Leu Asp Glu Cys Pro His Ser Gln Ala Leu Lys Lys Val 80 85 90 Phe Ala Glu Asn Lys Glu Ile Gln Lys Leu Ala Glu Gln Phe Val 95 100 105 Leu Leu Asn Leu Val Tyr Glu Thr Thr Asp Lys His Leu Ser Pro 110 115 120 Asp Gly Gln Tyr Val Pro Arg Ile Met Phe Val Asp Pro Ser Leu 125 130 135 Thr Val Arg Ala Asp Ile Thr Gly Arg Tyr Ser Asn Arg Leu Tyr 140 145 150 Ala Tyr Glu Pro Ala Asp Thr Ala Leu Leu Leu Asp Asn Met Lys 155 160 165 Lys Ala Leu Lys Leu Leu Lys Thr Glu Leu 170 175 5 3236 DNA Homo sapien 5 gacccggcca tgcgcggcct cgggctctgg ctgctgggcg cgatgatgct 50 gcctgcgatt gcccccagcc ggccctgggc cctcatggag cagtatgagg 100 tcgtgttgcc gcggcgtctg ccaggccccc gagtccgccg agctctgccc 150 tcccacttgg gcctgcaccc agagagggtg agctacgtcc ttggggccac 200 agggcacaac ttcaccctcc acctgcggaa gaacagggac ctgctgggtt 250 ccggctacac agagacctat acggctgcca atggctccga ggtgacggag 300 cagcctcgcg ggcaggacca ctgcttatac cagggccacg tagaggggta 350 cccggactca gccgccagcc 
tcagcacctg tgccggcctc aggggtttct 400 tccaggtggg gtcagacctg cacctgatcg agcccctgga tgaaggtggc 450 gagggcggac ggcacgccgt gtaccaggct gagcacctgc tgcagacggc 500 cgggacctgc ggggtcagcg acgacagcct gggcagcctc ctgggacccc 550 ggacggcagc cgtcttcagg cctcggcccg gggactctct gccatcccga 600 gagacccgct acgtggagct gtatgtggtc gtggacaatg cagagttcca 650 gatgctgggg agcgaagcag ccgtgcgtca tcgggtgctg gaggtggtga 700 atcacgtgga caagctatat cagaaactca acttccgtgt ggtcctggtg 750 ggcctggaga tttggaatag tcaggacagg ttccacgtca gccccgaccc 800 cagtgtcaca ctggagaacc tcctgacctg gcaggcacgg caacggacac 850 ggcggcacct gcatgacaac gtacagctca tcacgggtgt cgacttcacc 900 gggactactg tggggtttgc cagggtgtcc gccatgtgct cccacagctc 950 aggggctgtg aaccaggacc acagcaagaa ccccgtgggc gtggcctgca 1000 ccatggccca tgagatgggc cacaacctgg gcatggacca tgatgagaac 1050 gtccagggct gccgctgcca ggaacgcttc gaggccggcc gctgcatcat 1100 ggcaggcagc attggctcca gtttccccag gatgttcagt gactgcagcc 1150 aggcctacct ggagagcttt ttggagcggc cgcagtcggt gtgcctcgcc 1200 aacgcccctg acctcagcca cctggtgggc ggccccgtgt gtgggaacct 1250 gtttgtggag cgtggggagc agtgcgactg cggccccccc gaggactgcc 1300 ggaaccgctg ctgcaactct accacctgcc agctggctga gggggcccag 1350 tgtgcgcacg gtacctgctg ccaggagtgc aaggtgaagc cggctggtga 1400 gctgtgccgt cccaagaagg acatgtgtga cctcgaggag ttctgtgacg 1450 gccggcaccc tgagtgcccg gaagacgcct tccaggagaa cggcacgccc 1500 tgctccgggg gctactgcta caacggggcc tgtcccacac tggcccagca 1550 gtgccaggcc ttctgggggc caggtgggca ggctgccgag gagtcctgct 1600 tctcctatga catcctacca ggctgcaagg ccagccggta cagggctgac 1650 atgtgtggcg ttctgcagtg caagggtggg cagcagcccc tggggcgtgc 1700 catctgcatc gtggatgtgt gccacgcgct caccacagag gatggcactg 1750 cgtatgaacc agtgcccgag ggcacccggt gtggaccaga gaaggtttgc 1800 tggaaaggac gttgccagga cttacacgtt tacagatcca gcaactgctc 1850 tgcccagtgc cacaaccatg gggtgtgcaa ccacaagcag gagtgccact 1900 gccacgcggg ctgggccccg ccccactgcg cgaagctgct gactgaggtg 1950 cacgcagcgt ccgggagcct ccccgtcctc gtggtggtgg ttctggtgct 2000 cctggcagtt gtgctggtca ccctggcagg catcatcgtc 
taccgcaaag 2050 cccggagccg catcctgagc aggaacgtgg ctcccaagac cacaatgggg 2100 cgctccaacc ccctgttcca ccaggctgcc agccgcgtgc cggccaaggg 2150 cggggctcca gccccatcca ggggccccca agagctggtc cccaccaccc 2200 acccgggcca gcccgcccga cacccggcct cctcggtggc tctgaagagg 2250 ccgccccctg ctcctccggt cactgtgtcc agcccaccct tcccagttcc 2300 tgtctacacc cggcaggcac caaagcaggt catcaagcca acgttcgcac 2350 ccccagtgcc cccagtcaaa cccggggctg gtgcggccaa ccctggtcca 2400 gctgagggtg ctgttggccc aaaggttgcc ctgaagcccc ccatccagag 2450 gaagcaagga gccggagctc ccacagcacc ctaggggggc acctgcgcct 2500 gtgtggaaat ttggagaagt tgcggcagag aagccatgcg ttccagcctt 2550 ccacggtcca gctagtgccg ctcagcccta gaccctgact ttgcaggctc 2600 agctgctgtt ctaacctcag taatgcatct acctgagagg ctcctgctgt 2650 ccacgccctc agccaattcc ttctccccgc cttggccacg tgtagcccca 2700 gctgtctgca ggcaccaggc tgggatgagc tgtgtgcttg cgggtgcgtg 2750 tgtgtgtacg tgtctccagg tggccgctgg tctcccgctg tgttcaggag 2800 gccacatata cagcccctcc cagccacacc tgcccctgct ctggggcctg 2850 ctgagccggc tgccctgggc acccggttcc aggcagcaca gacgtggggc 2900 atccccagaa agactccatc ccaggaccag gttcccctcc gtgctcttcg 2950 agagggtgtc agtgagcaga ctgcacccca agctcccgac tccaggtccc 3000 ctgatcttgg gcctgtttcc catgggattc aagagggaca gccccagctt 3050 tgtgtgtgtt taagcttagg aatgcccttt atggaaaggg ctatgtggga 3100 gagtcagcta tcttgtctgg ttttcttgag acctcagatg tgtgttcagc 3150 agggctgaaa gcttttattc tttaataatg agaaatgtat attttactaa 3200 taaattattg accgagttct gtagattctt gttaga 3236 6 824 PRT Homo sapien 6 Met Arg Gly Leu Gly Leu Trp Leu Leu Gly Ala Met Met Leu Pro 1 5 10 15 Ala Ile Ala Pro Ser Arg Pro Trp Ala Leu Met Glu Gln Tyr Glu 20 25 30 Val Val Leu Pro Arg Arg Leu Pro Gly Pro Arg Val Arg Arg Ala 35 40 45 Leu Pro Ser His Leu Gly Leu His Pro Glu Arg Val Ser Tyr Val 50 55 60 Leu Gly Ala Thr Gly His Asn Phe Thr Leu His Leu Arg Lys Asn 65 70 75 Arg Asp Leu Leu Gly Ser Gly Tyr Thr Glu Thr Tyr Thr Ala Ala 80 85 90 Asn Gly Ser Glu Val Thr Glu Gln Pro Arg Gly Gln Asp His Cys 95 100 105 Leu Tyr Gln Gly His Val Glu Gly Tyr Pro Asp Ser 
Ala Ala Ser 110 115 120 Leu Ser Thr Cys Ala Gly Leu Arg Gly Phe Phe Gln Val Gly Ser 125 130 135 Asp Leu His Leu Ile Glu Pro Leu Asp Glu Gly Gly Glu Gly Gly 140 145 150 Arg His Ala Val Tyr Gln Ala Glu His Leu Leu Gln Thr Ala Gly 155 160 165 Thr Cys Gly Val Ser Asp Asp Ser Leu Gly Ser Leu Leu Gly Pro 170 175 180 Arg Thr Ala Ala Val Phe Arg Pro Arg Pro Gly Asp Ser Leu Pro 185 190 195 Ser Arg Glu Thr Arg Tyr Val Glu Leu Tyr Val Val Val Asp Asn 200 205 210 Ala Glu Phe Gln Met Leu Gly Ser Glu Ala Ala Val Arg His Arg 215 220 225 Val Leu Glu Val Val Asn His Val Asp Lys Leu Tyr Gln Lys Leu 230 235 240 Asn Phe Arg Val Val Leu Val Gly Leu Glu Ile Trp Asn Ser Gln 245 250 255 Asp Arg Phe His Val Ser Pro Asp Pro Ser Val Thr Leu Glu Asn 260 265 270 Leu Leu Thr Trp Gln Ala Arg Gln Arg Thr Arg Arg His Leu His 275 280 285 Asp Asn Val Gln Leu Ile Thr Gly Val Asp Phe Thr Gly Thr Thr 290 295 300 Val Gly Phe Ala Arg Val Ser Ala Met Cys Ser His Ser Ser Gly 305 310 315 Ala Val Asn Gln Asp His Ser Lys Asn Pro Val Gly Val Ala Cys 320 325 330 Thr Met Ala His Glu Met Gly His Asn Leu Gly Met Asp His Asp 335 340 345 Glu Asn Val Gln Gly Cys Arg Cys Gln Glu Arg Phe Glu Ala Gly 350 355 360 Arg Cys Ile Met Ala Gly Ser Ile Gly Ser Ser Phe Pro Arg Met 365 370 375 Phe Ser Asp Cys Ser Gln Ala Tyr Leu Glu Ser Phe Leu Glu Arg 380 385 390 Pro Gln Ser Val Cys Leu Ala Asn Ala Pro Asp Leu Ser His Leu 395 400 405 Val Gly Gly Pro Val Cys Gly Asn Leu Phe Val Glu Arg Gly Glu 410 415 420 Gln Cys Asp Cys Gly Pro Pro Glu Asp Cys Arg Asn Arg Cys Cys 425 430 435 Asn Ser Thr Thr Cys Gln Leu Ala Glu Gly Ala Gln Cys Ala His 440 445 450 Gly Thr Cys Cys Gln Glu Cys Lys Val Lys Pro Ala Gly Glu Leu 455 460 465 Cys Arg Pro Lys Lys Asp Met Cys Asp Leu Glu Glu Phe Cys Asp 470 475 480 Gly Arg His Pro Glu Cys Pro Glu Asp Ala Phe Gln Glu Asn Gly 485 490 495 Thr Pro Cys Ser Gly Gly Tyr Cys Tyr Asn Gly Ala Cys Pro Thr 500 505 510 Leu Ala Gln Gln Cys Gln Ala Phe Trp Gly Pro Gly Gly Gln Ala 515 520 525 Ala Glu Glu Ser Cys Phe Ser Tyr 
Asp Ile Leu Pro Gly Cys Lys 530 535 540 Ala Ser Arg Tyr Arg Ala Asp Met Cys Gly Val Leu Gln Cys Lys 545 550 555 Gly Gly Gln Gln Pro Leu Gly Arg Ala Ile Cys Ile Val Asp Val 560 565 570 Cys His Ala Leu Thr Thr Glu Asp Gly Thr Ala Tyr Glu Pro Val 575 580 585 Pro Glu Gly Thr Arg Cys Gly Pro Glu Lys Val Cys Trp Lys Gly 590 595 600 Arg Cys Gln Asp Leu His Val Tyr Arg Ser Ser Asn Cys Ser Ala 605 610 615 Gln Cys His Asn His Gly Val Cys Asn His Lys Gln Glu Cys His 620 625 630 Cys His Ala Gly Trp Ala Pro Pro His Cys Ala Lys Leu Leu Thr 635 640 645 Glu Val His Ala Ala Ser Gly Ser Leu Pro Val Leu Val Val Val 650 655 660 Val Leu Val Leu Leu Ala Val Val Leu Val Thr Leu Ala Gly Ile 665 670 675 Ile Val Tyr Arg Lys Ala Arg Ser Arg Ile Leu Ser Arg Asn Val 680 685 690 Ala Pro Lys Thr Thr Met Gly Arg Ser Asn Pro Leu Phe His Gln 695 700 705 Ala Ala Ser Arg Val Pro Ala Lys Gly Gly Ala Pro Ala Pro Ser 710 715 720 Arg Gly Pro Gln Glu Leu Val Pro Thr Thr His Pro Gly Gln Pro 725
730 735 Ala Arg His Pro Ala Ser Ser Val Ala Leu Lys Arg Pro Pro Pro 740 745 750 Ala Pro Pro Val Thr Val Ser Ser Pro Pro Phe Pro Val Pro Val 755 760 765 Tyr Thr Arg Gln Ala Pro Lys Gln Val Ile Lys Pro Thr Phe Ala 770 775 780 Pro Pro Val Pro Pro Val Lys Pro Gly Ala Gly Ala Ala Asn Pro 785 790 795 Gly Pro Ala Glu Gly Ala Val Gly Pro Lys Val Ala Leu Lys Pro 800 805 810 Pro Ile Gln Arg Lys Gln Gly Ala Gly Ala Pro Thr Ala Pro 815 820 7 1938 DNA Homo sapien 7 gactttggtg gcaagaggag ctggcggagc ccagccagtg ggcggggcca 50 ggggaggggc gggcaggtag gtgcagccac tcctgggagg accctgcgtg 100 gccagacggt gctggtgact cgtccacact gctcgcttcg gatactccag 150 gcgtctcccg ttgcggccgc tccctgcctt agaggccagc cttggacact 200 tgctgcccct ttccagcccg gattctggga tccttccctc tgagccaaca 250 tctgggtcct gccttcgaca ccaccccaag gcttcctacc ttgcgtgcct 300 ggagtctgcc ccaggggccc ttgtcctggg ccatggccca gaagggggtc 350 ctggggcctg ggcagctggg ggctgtggcc attctgctct atcttggatt 400 actccggtcg gggacaggag cggaaggggc agaagctccc tgcggtgtgg 450 ccccccaagc acgcatcaca ggtggcagca gtgcagtcgc cggtcagtgg 500 ccctggcagg tcagcatcac ctatgaaggc gtccatgtgt gtggtggctc 550 tctcgtgtct gagcagtggg tgctgtcagc tgctcactgc ttccccagcg 600 agcaccacaa ggaagcctat gaggtcaagc tgggggccca ccagctagac 650 tcctactccg aggacgccaa ggtcagcacc ctgaaggaca tcatccccca 700 ccccagctac ctccaggagg gctcccaggg cgacattgca ctcctccaac 750 tcagcagacc catcaccttc tcccgctaca tccggcccat ctgcctccct 800 gcagccaacg cctccttccc caacggcctc cactgcactg tcactggctg 850 gggtcatgtg gccccctcag tgagcctcct gacgcccaag ccactgcagc 900 aactcgaggt gcctctgatc agtcgtgaga cgtgtaactg cctgtacaac 950 atcgacgcca agcctgagga gccgcacttt gtccaagagg acatggtgtg 1000 tgctggctat gtggaggggg gcaaggacgc ctgccagggt gactctgggg 1050 gcccactctc ctgccctgtg gagggtctct ggtacctgac gggcattgtg 1100 agctggggag atgcctgtgg ggcccgcaac aggcctggtg tgtacactct 1150 ggcctccagc tatgcctcct ggatccaaag caaggtgaca gaactccagc 1200 ctcgtgtggt gccccaaacc caggagtccc agcccgacag caacctctgt 1250 ggcagccacc tggccttcag ctctgcccca gcccagggct tgctgaggcc 1300 
catccttttc ctgcctctgg gcctggctct gggcctcctc tccccatggc 1350 tcagcgagca ctgagctggc cctacttcca ggatggatgc atcacactca 1400 aggacaggag cctggtcctt ccctgatggc ctttggaccc agggcctgac 1450 ttgagccact ccttccttca ggactctgcg ggaggctggg gccccatctt 1500 gatctttgag cccattcttc tgggtgtgct ttttgggacc atcactgaga 1550 gtcaggagtt ttactgcctg tagcaatggc cagagcctct ggcccctcac 1600 ccaccatgga ccagcccatt ggccgagctc ctggggagct cctgggaccc 1650 ttggctatga aaatgagccc tggctcccac ctgtttctgg aagactgctc 1700 ccggcccgcc tgcccagact gatgagcaca tctctctgcc ctctccctgt 1750 gttctgggct ggggccacct ttgtgcagct tcgaggacag gaaaggcccc 1800 aatcttgccc actggccgct gagcgccccc gagccctgac tcctggactc 1850 cggaggactg agcccccacc ggaactgggc tggcgcttgg atctggggtg 1900 ggagtaacag ggcagaaatg attaaaatgt ttgagcac 1938 8 343 PRT Homo sapien 8 Met Ala Gln Lys Gly Val Leu Gly Pro Gly Gln Leu Gly Ala Val 1 5 10 15 Ala Ile Leu Leu Tyr Leu Gly Leu Leu Arg Ser Gly Thr Gly Ala 20 25 30 Glu Gly Ala Glu Ala Pro Cys Gly Val Ala Pro Gln Ala Arg Ile 35 40 45 Thr Gly Gly Ser Ser Ala Val Ala Gly Gln Trp Pro Trp Gln Val 50 55 60 Ser Ile Thr Tyr Glu Gly Val His Val Cys Gly Gly Ser Leu Val 65 70 75 Ser Glu Gln Trp Val Leu Ser Ala Ala His Cys Phe Pro Ser Glu 80 85 90 His His Lys Glu Ala Tyr Glu Val Lys Leu Gly Ala His Gln Leu 95 100 105 Asp Ser Tyr Ser Glu Asp Ala Lys Val Ser Thr Leu Lys Asp Ile 110 115 120 Ile Pro His Pro Ser Tyr Leu Gln Glu Gly Ser Gln Gly Asp Ile 125 130 135 Ala Leu Leu Gln Leu Ser Arg Pro Ile Thr Phe Ser Arg Tyr Ile 140 145 150 Arg Pro Ile Cys Leu Pro Ala Ala Asn Ala Ser Phe Pro Asn Gly 155 160 165 Leu His Cys Thr Val Thr Gly Trp Gly His Val Ala Pro Ser Val 170 175 180 Ser Leu Leu Thr Pro Lys Pro Leu Gln Gln Leu Glu Val Pro Leu 185 190 195 Ile Ser Arg Glu Thr Cys Asn Cys Leu Tyr Asn Ile Asp Ala Lys 200 205 210 Pro Glu Glu Pro His Phe Val Gln Glu Asp Met Val Cys Ala Gly 215 220 225 Tyr Val Glu Gly Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly 230 235 240 Pro Leu Ser Cys Pro Val Glu Gly Leu Trp Tyr Leu Thr Gly Ile 245 250 255 
Val Ser Trp Gly Asp Ala Cys Gly Ala Arg Asn Arg Pro Gly Val 260 265 270 Tyr Thr Leu Ala Ser Ser Tyr Ala Ser Trp Ile Gln Ser Lys Val 275 280 285 Thr Glu Leu Gln Pro Arg Val Val Pro Gln Thr Gln Glu Ser Gln 290 295 300 Pro Asp Ser Asn Leu Cys Gly Ser His Leu Ala Phe Ser Ser Ala 305 310 315 Pro Ala Gln Gly Leu Leu Arg Pro Ile Leu Phe Leu Pro Leu Gly 320 325 330 Leu Ala Leu Gly Leu Leu Ser Pro Trp Leu Ser Glu His 335 340 9 7650 DNA Homo sapien 9 acacacacgc gccctcaccc gccaccgccg ccgcggccgc cgccgcaccc 50 ggacagcgag cggctgaggc cgccagggcc caaaggacag cggcccagac 100 aggggctggc ggcccggccg gccccggctc accgactcgg gcagcatcca 150 cctgccccag ccaacaccct tctctcgccc caggtccttt ctcagcctcc 200 agctgggctg tccccaagct gagctgaggc tcttctcctc cgatccccac 250 ctctgcccgg acatccacca tggggacagc caccaggagg aagccacacc 300 tgctgctggt agctgctgtg gcccttgtct cctcttcagc ttggagttca 350 gccctgggat cccaaaccac cttcgggcct gtctttgaag accagcccct 400 cagtgtgcta ttcccagagg agtccacgga ggagcaggtg ttgctggcat 450 gccgcgcccg ggccagccct ccagccacct atcggtggaa gatgaatggt 500 accgagatga agctggagcc aggttcccgt caccagctgg tggggggcaa 550 cctggtcatc atgaacccca ccaaggcaca ggatgccggg gtctaccagt 600 gcctggcctc caacccagtg ggcaccgttg tcagcaggga ggccatcctc 650 cgcttcggct ttctgcagga attctccaag gaggagcgag acccagtgaa 700 agctcatgaa ggctgggggg tgatgttgcc ctgtaaccca cctgcccact 750 acccaggctt gtcctaccgc tggctcctca acgagttccc caacttcatc 800 ccgacggacg ggcgtcactt cgtgtcccag accacaggga acctgtacat 850 tgcccgaacc aatgcctcag acctgggcaa ctactcctgt ttggccacca 900 gccacatgga cttctccacc aagagcgtct tcagcaagtt tgctcagctc 950 aacctggctg ctgaagatac ccggctcttt gcacccagca tcaaggcccg 1000 gttcccagca gagacctatg cactggtggg gcagcaggtc accctggagt 1050 gcttcgcctt tgggaaccct gtcccccgga tcaagtggcg caaagtggac 1100 ggctccctgt ccccgcagtg gaccacagct gagcccaccc tgcagatccc 1150 cagcgtcagc tttgaggatg agggcaccta cgagtgtgag gcggagaact 1200 ccaagggccg agacaccgtg cagggccgca tcatcgtgca ggctcagcct 1250 gagtggctaa aagtgatctc ggacacagag gctgacattg gctccaacct 1300 gcgttggggc 
tgtgcagccg ccggcaagcc ccggcctaca gtgcgctggc 1350 tgcggaacgg ggagcctctg gcctcccaga accgggtgga ggtgttggct 1400 ggggacctgc ggttctccaa gctgagcctg gaagactcgg gcatgtacca 1450 gtgtgtggca gagaataagc acggtaccat ctacgccagc gccgagctag 1500 ccgtgcaagc actcgcccct gacttcaggc tgaatcccgt gaggcgtctg 1550 atccccgcgg cccgcggggg agagatcctt atcccctgcc agccccgggc 1600 agctccaaag gccgtggtgc tctggagcaa aggcacggag attttggtca 1650 acagcagcag agtgactgta actccagatg gcaccttgat cataagaaac 1700 atcagccggt cagatgaagg caaatacacc tgctttgctg agaacttcat 1750 gggcaaagcc aacagcactg gaatcctatc tgtgcgagat gcaaccaaaa 1800 tcactctagc cccctcaagt gccgacatca acttgggtga caacctgacc 1850 ctacagtgcc atgcctccca cgaccccacc atggacctca ccttcacctg 1900 gaccctggac gacttcccca tcgactttga taagcctgga gggcactacc 1950 ggagaactaa tgtgaaggag accattgggg atctgaccat cctgaacgcc 2000 cagctgcgcc atggggggaa gtacacgtgc atggcccaga cggtggtgga 2050 cagcgcgtcc aaggaggcca cagtcctggt ccgaggtccg ccaggtcccc 2100 caggaggtgt ggtggtgagg gacattggcg acaccaccat ccagctcagc 2150 tggagccgtg gcttcgacaa ccacagcccc atcgctaagt acaccctgca 2200 agctcgcact ccacctgcag ggaagtggaa gcaggttcgg accaatcctg 2250 caaacatcga gggcaatgcc gagactgcac aggtgctggg cctcaccccc 2300 tggatggact atgagttccg ggtcatagcc agcaacattc tgggcactgg 2350 ggagcctagt gggccctcca gcaaaatccg gaccagggaa gcagccccct 2400 cggtggcacc ctcaggactc agcggaggag gtggagcccc cggagagctc 2450 atcgtcaact ggacgcccat gtcacgggag taccagaacg gagacggctt 2500 cggctacctg ctgtccttcc gcaggcaggg cagcactcac tggcagaccg 2550 cccgggtgcc tggcgccgat gcccagtact ttgtctacag caacgagagc 2600 gtccggccct acacgccctt tgaggtcaag atccgcagct acaaccgccg 2650 cggggatggg cccgagagcc tcactgcact cgtgtactca gctgaggaag 2700 agcccagggt ggcccctacc aaggtgtggg ccaaaggggt ctcatcctca 2750 gagatgaacg tgacctggga acccgtgcag caggacatga atggtatcct 2800 cctggggtat gagatccgct actggaaagc tggggacaaa gaagcagctg 2850 cggaccgagt gaggacagca gggctggaca ccagtgcccg agtcagcggc 2900 ctgcatccca acaccaagta ccatgtgacc gtgagggcct acaaccgggc 2950 tggcactggg cctgccagcc 
cttctgccaa cgccacgacc atgaagcccc 3000 ctccgcggcg acctcctggc aacatctcct ggactttctc aagctctagt 3050 cttagcatta agtgggaccc tgtggtccct ttccgaaatg agtctgcagt 3100 caccggctat aagatgctgt accagaatga cttacacctg actcccacgc 3150 tccacctcac cggcaagaac tggatagaaa tcccagtgcc tgaagacatt 3200 ggccatgccc tggtacaaat tcggaccaca gggcccggag gggatgggat 3250 ccctgcagaa gtccacatcg tgaggaatgg aggcacaagc atgatggtgg 3300 agaacatggc agtccgccca gcaccacacc ctggcaccgt catttcccac 3350 tccgtggcga tgctgatcct cataggctcc ctggagctct gatcctggaa 3400 cccctccctc tgcgccgcag ctggacgcca cctccgacgg acacagccag 3450 ccccttcctg ctgccaaggt ggcctgacac tgtgccagag agtggctggt 3500 tttaaatacc tactttaaac agtgcccttt ttgtaggagg taggatattt 3550 tatattctgc cgcaggatag aacccacgca aggattttct ttaaattgag 3600 aggcaccagg cagtaacttc catgatgaca ctgacgccta tacctgagct 3650 ctaggctgcc tggagggaag gaacaggccc atgggaagaa gggggtttta 3700 aaaacatgtc ttcaactcag cagagatggc cctctgggac cctatacgga 3750 ctccgccact tgagagcagt cctaggcccg gcaggaacac cagacatgaa 3800 caggttgaag aactggagcg aagtgcacac ctcaccatcc ttcagtctaa 3850 ggaagaaggg caagccctgg gaccaagagc tctcccgcct tctccctcga 3900 gcagcagcaa ggaccctgac gctgtccccg ataactccct aggggctcct 3950 gcctgcccaa gcggctgaga accagcgccc cgatgcctga ggctgggagc 4000 ctgagcccct tcagctttga ggggggtgat actccaggct gtttggggtg 4050 ggagccaaaa agagttgaga ggccagggcc cttggtggaa aggggcacca 4100 gccttggtct gagatagtca caacccaggt gacgatgccc tctcagccaa 4150 cactgccaac ctgaccctgt catcccgatt gacagcgcca cttcaggtgg 4200 ctgggtgact aaagggcttg tcttggtggg gtctcccacc cctccaagac 4250 ccattctgca cagtccctcc agggtttggg caggagatgg ccaatcatgc 4300 gcccacctct ccagtgctgc ctgcagtcag ctcggcctcc ccgacctgca 4350 gccccagact ctgctctccc agcactgact cactcctgcc tgggagggga 4400 atgcagcatt catgctgtgt gtcctggtat tgggaggttt ctgggaaggg 4450 cagaggataa atgtggccct gcctgctccc aggtatacct aggaccacct 4500 ggccagatcc gctcccagac ggccttggac tgcttgcatt tccccggaga 4550 aaaaggggtt aataaatggg ccatcctttc ctgagctctg ggtatactac 4600 cagtcacaga acgtcagagc tggaagaagc 
cttagagctc aacttcttca 4650 agcccctcac tttacagatg aggaaatgga ggtggtccag agagggtctg 4700 ggattcccaa ggtcacacag cccagaagag atggggctgg gttaagaact 4750 cgagtcttcc acctttctgt tcaaggctgt ttgtctaccc agaggaagga 4800 ggcactgctg aatggctatg gcctggctaa gaaggtgatt agtcagtagg 4850 gtgtgaaaat tctacttcaa ggggttcgga ttggtgatca tggggattgg 4900 catggctggg ttcccgtcca aggtgtgggc agagcttcta ccaaacttca 4950 acatggaggg ctgacttgaa gctccctgtc cccctcactc ttgccccaag 5000 aaaagaggcc aaagcaagag cagattccct aggcaagagc agcagcacaa 5050 ctaggaaacc ccaaagccca tgctccgaca ggtggccctt cacagggggc 5100 agcgggacag gcatcttgaa gggcatatgt cctcggaagc tccgagcctg 5150 ttttctgtag tttatagtta gagctctatt ttgttatggt tttttaaact 5200 tttaagtcct gctctatttt cctgggcagg tttatgttga tgtttaccca 5250 ctacaatttt ttaaaaatat aagctcacat gccttttccc tgccacagcc 5300 aaacccccac tgcaccctac ccacccaccc ctagcccagg tcagctttcc 5350 tggagctggc taatgaaagc ctcctcacct cttcccaacc cttacaagca 5400 agggtgctag gggctcagct atacgaccat tctccctgac agggagtcca 5450 aacttggcct agcatccctc ctggcccccc tctggccacg acttggcctg 5500 tgcctggttc tctatcagaa aggggatgct gaacaaaacc tccttccaag 5550 ttttatccaa ttcgttcctc attgcctcgg gctgcgtcag gggaagcagg 5600 ggacaggtgt ccagttgctg ggccgaggga ggagctggtt tggcatagga 5650 cctaaccagt gaagctagag gctacagcca ctaaacttgc ttcaggccaa 5700 cgatagttac tcacaagtaa gtaccttaat gctaatgagg tccactaaaa 5750 aggggaggaa ggcagacctc ctgggagacc cacgaagggt ttttagccag 5800 ggaaaactga gccccaggaa aacctaacca ctgggcaggc agaatttgtt 5850 tgagggatag aacgacaaca aaataaatgt tcctgcagcc tgagatttca 5900 ggtagagtac tgactaaggt ttaataagac aataggtgac ctgaggacat 5950 gcaagcttgt aaaatgcaac agcctcctgc tagagtgact tgtacatgag 6000 cttgcttgca gaagactaga ttagatgttt ctcaggatcc cctcctgcgc 6050 aggggttctc tgattttcgt gttctctgcc cagatgggct gggggagttg 6100 agagtgtgct tattttcact gcgatcatga gaccacagtt ctgggttatc 6150 tcctctcata catcaagccc cagaggaggc ggcaagagga acagccacaa 6200 acaagtactt taccccacag cttagtggcc agtaaacacc ctggggacta 6250 ggaaaaggaa ccaactgtag gcacctctcc agggcctagg 
gagacaagtg 6300 tcctctcttc tgcatacatt tgggctcccc ttacagagcc ctttgccctg 6350 gctctctggt ccttgttgct ctaacagtcc agatgtacac ccagcctcag 6400 ggggaaggca gctctctcca gacagagtct cagggcccag caaggtcagg 6450 ttatctgctt tcattcaggg caacaaatga tacaaatggt gccagggagt 6500 ggcaaggcca tgggggtagg tgggggtgtc tttttctttt cataaagtaa 6550 caacagacga gactgaggtt aaacatcaga aaaaaacctc tggaatgacc 6600 ttcctcattc caggaggccc tggaataagg aagaggcttc tttctgaggg 6650 agctttgagg aattttgaca gctgttgaca tgggatttgg gaaaggtgaa 6700 gctgtgactg gaggggcagg agatggtcca agtgtccatc cagagatgag 6750 actcttagaa tcaaagtgtt cagcccagga agtcttggag atcccacctt 6800 ctgtggccct gcaccttatg ggaagccatt aagggggctc atctaggaat 6850 tctggttaca gcccagtgct catcccagcg tatgctgcct ctttagggca 6900 gccccaaggg ccagccagcc tgtactctgg gcaagagccc aaaatggcta 6950 ggaatgtttg actcccttaa tctcttcccc agctacagag gaatcttttc 7000 tctgcctggt ctcagaatgg gactgccaac tggctcattg gtgggagaca 7050 cagtatcctc aaacctgtgg ccactggcat gacagtggtg ctctgtctcc 7100 ctgggtgaca cccaccctag gcttcctcct ggatgtgatg gggattgcca 7150 gagaggctct tagcataaaa ggcattaggt gggcattttt ctgtgtgccc 7200 ccaaaaagct ccatggaaac aggcacctgg tagctgcgga acacccgtgg 7250 acttgtgtat atggtcatag gctttgggaa gacaggacgt aaaggaaaat 7300 gagagaaaca aaatgggtca gatagctttg gccacagccc caggcagcct 7350 ttggggccta tgacacttag tgcccttaga tgggatacat cttgcctcgg 7400 ccccaagact cctccaactt acccgtccca tccagggcct gcacagctta 7450 gagaggctca cagcttggca aatgctaggg cttcatcaga ccactgactt 7500 gactcagtgt ttgttaaaat ggaaccactc ccgttggcct actgtttctc 7550 tcctgtactt cttgtaatga tagttattta ttgactctgg tagcaggcag 7600 ttcttaaata aagatggttt ctcaacctgt tggggaaaaa aaaaaaaaaa 7650 10 1040 PRT Homo sapien 10 Met Gly Thr Ala Thr Arg Arg Lys Pro His Leu Leu Leu Val Ala 1 5 10 15 Ala Val Ala Leu Val Ser Ser Ser Ala Trp Ser Ser Ala Leu Gly 20 25 30 Ser Gln Thr Thr Phe Gly Pro Val Phe Glu Asp Gln Pro Leu Ser 35 40 45 Val Leu Phe Pro Glu Glu Ser Thr Glu Glu Gln Val Leu Leu Ala 50 55 60 Cys Arg Ala Arg Ala Ser Pro Pro Ala Thr Tyr Arg Trp Lys Met 
65 70 75 Asn Gly Thr Glu Met Lys Leu Glu Pro Gly Ser Arg His Gln Leu 80 85 90 Val Gly Gly Asn Leu Val Ile
Met Asn Pro Thr Lys Ala Gln Asp 95 100 105 Ala Gly Val Tyr Gln Cys Leu Ala Ser Asn Pro Val Gly Thr Val 110 115 120 Val Ser Arg Glu Ala Ile Leu Arg Phe Gly Phe Leu Gln Glu Phe 125 130 135 Ser Lys Glu Glu Arg Asp Pro Val Lys Ala His Glu Gly Trp Gly 140 145 150 Val Met Leu Pro Cys Asn Pro Pro Ala His Tyr Pro Gly Leu Ser 155 160 165 Tyr Arg Trp Leu Leu Asn Glu Phe Pro Asn Phe Ile Pro Thr Asp 170 175 180 Gly Arg His Phe Val Ser Gln Thr Thr Gly Asn Leu Tyr Ile Ala 185 190 195 Arg Thr Asn Ala Ser Asp Leu Gly Asn Tyr Ser Cys Leu Ala Thr 200 205 210 Ser His Met Asp Phe Ser Thr Lys Ser Val Phe Ser Lys Phe Ala 215 220 225 Gln Leu Asn Leu Ala Ala Glu Asp Thr Arg Leu Phe Ala Pro Ser 230 235 240 Ile Lys Ala Arg Phe Pro Ala Glu Thr Tyr Ala Leu Val Gly Gln 245 250 255 Gln Val Thr Leu Glu Cys Phe Ala Phe Gly Asn Pro Val Pro Arg 260 265 270 Ile Lys Trp Arg Lys Val Asp Gly Ser Leu Ser Pro Gln Trp Thr 275 280 285 Thr Ala Glu Pro Thr Leu Gln Ile Pro Ser Val Ser Phe Glu Asp 290 295 300 Glu Gly Thr Tyr Glu Cys Glu Ala Glu Asn Ser Lys Gly Arg Asp 305 310 315 Thr Val Gln Gly Arg Ile Ile Val Gln Ala Gln Pro Glu Trp Leu 320 325 330 Lys Val Ile Ser Asp Thr Glu Ala Asp Ile Gly Ser Asn Leu Arg 335 340 345 Trp Gly Cys Ala Ala Ala Gly Lys Pro Arg Pro Thr Val Arg Trp 350 355 360 Leu Arg Asn Gly Glu Pro Leu Ala Ser Gln Asn Arg Val Glu Val 365 370 375 Leu Ala Gly Asp Leu Arg Phe Ser Lys Leu Ser Leu Glu Asp Ser 380 385 390 Gly Met Tyr Gln Cys Val Ala Glu Asn Lys His Gly Thr Ile Tyr 395 400 405 Ala Ser Ala Glu Leu Ala Val Gln Ala Leu Ala Pro Asp Phe Arg 410 415 420 Leu Asn Pro Val Arg Arg Leu Ile Pro Ala Ala Arg Gly Gly Glu 425 430 435 Ile Leu Ile Pro Cys Gln Pro Arg Ala Ala Pro Lys Ala Val Val 440 445 450 Leu Trp Ser Lys Gly Thr Glu Ile Leu Val Asn Ser Ser Arg Val 455 460 465 Thr Val Thr Pro Asp Gly Thr Leu Ile Ile Arg Asn Ile Ser Arg 470 475 480 Ser Asp Glu Gly Lys Tyr Thr Cys Phe Ala Glu Asn Phe Met Gly 485 490 495 Lys Ala Asn Ser Thr Gly Ile Leu Ser Val Arg Asp Ala Thr Lys 500 505 510 Ile Thr Leu 
Ala Pro Ser Ser Ala Asp Ile Asn Leu Gly Asp Asn 515 520 525 Leu Thr Leu Gln Cys His Ala Ser His Asp Pro Thr Met Asp Leu 530 535 540 Thr Phe Thr Trp Thr Leu Asp Asp Phe Pro Ile Asp Phe Asp Lys 545 550 555 Pro Gly Gly His Tyr Arg Arg Thr Asn Val Lys Glu Thr Ile Gly 560 565 570 Asp Leu Thr Ile Leu Asn Ala Gln Leu Arg His Gly Gly Lys Tyr 575 580 585 Thr Cys Met Ala Gln Thr Val Val Asp Ser Ala Ser Lys Glu Ala 590 595 600 Thr Val Leu Val Arg Gly Pro Pro Gly Pro Pro Gly Gly Val Val 605 610 615 Val Arg Asp Ile Gly Asp Thr Thr Ile Gln Leu Ser Trp Ser Arg 620 625 630 Gly Phe Asp Asn His Ser Pro Ile Ala Lys Tyr Thr Leu Gln Ala 635 640 645 Arg Thr Pro Pro Ala Gly Lys Trp Lys Gln Val Arg Thr Asn Pro 650 655 660 Ala Asn Ile Glu Gly Asn Ala Glu Thr Ala Gln Val Leu Gly Leu 665 670 675 Thr Pro Trp Met Asp Tyr Glu Phe Arg Val Ile Ala Ser Asn Ile 680 685 690 Leu Gly Thr Gly Glu Pro Ser Gly Pro Ser Ser Lys Ile Arg Thr 695 700 705 Arg Glu Ala Ala Pro Ser Val Ala Pro Ser Gly Leu Ser Gly Gly 710 715 720 Gly Gly Ala Pro Gly Glu Leu Ile Val Asn Trp Thr Pro Met Ser 725 730 735 Arg Glu Tyr Gln Asn Gly Asp Gly Phe Gly Tyr Leu Leu Ser Phe 740 745 750 Arg Arg Gln Gly Ser Thr His Trp Gln Thr Ala Arg Val Pro Gly 755 760 765 Ala Asp Ala Gln Tyr Phe Val Tyr Ser Asn Glu Ser Val Arg Pro 770 775 780 Tyr Thr Pro Phe Glu Val Lys Ile Arg Ser Tyr Asn Arg Arg Gly 785 790 795 Asp Gly Pro Glu Ser Leu Thr Ala Leu Val Tyr Ser Ala Glu Glu 800 805 810 Glu Pro Arg Val Ala Pro Thr Lys Val Trp Ala Lys Gly Val Ser 815 820 825 Ser Ser Glu Met Asn Val Thr Trp Glu Pro Val Gln Gln Asp Met 830 835 840 Asn Gly Ile Leu Leu Gly Tyr Glu Ile Arg Tyr Trp Lys Ala Gly 845 850 855 Asp Lys Glu Ala Ala Ala Asp Arg Val Arg Thr Ala Gly Leu Asp 860 865 870 Thr Ser Ala Arg Val Ser Gly Leu His Pro Asn Thr Lys Tyr His 875 880 885 Val Thr Val Arg Ala Tyr Asn Arg Ala Gly Thr Gly Pro Ala Ser 890 895 900 Pro Ser Ala Asn Ala Thr Thr Met Lys Pro Pro Pro Arg Arg Pro 905 910 915 Pro Gly Asn Ile Ser Trp Thr Phe Ser Ser Ser Ser Leu Ser Ile 920 925 
930 Lys Trp Asp Pro Val Val Pro Phe Arg Asn Glu Ser Ala Val Thr 935 940 945 Gly Tyr Lys Met Leu Tyr Gln Asn Asp Leu His Leu Thr Pro Thr 950 955 960 Leu His Leu Thr Gly Lys Asn Trp Ile Glu Ile Pro Val Pro Glu 965 970 975 Asp Ile Gly His Ala Leu Val Gln Ile Arg Thr Thr Gly Pro Gly 980 985 990 Gly Asp Gly Ile Pro Ala Glu Val His Ile Val Arg Asn Gly Gly 995 1000 1005 Thr Ser Met Met Val Glu Asn Met Ala Val Arg Pro Ala Pro His 1010 1015 1020 Pro Gly Thr Val Ile Ser His Ser Val Ala Met Leu Ile Leu Ile 1025 1030 1035 Gly Ser Leu Glu Leu 1040 11 1168 DNA Homo sapien 11 gagctggaag tgagagcaga tccctaacca tgagcaccag ccaaccaggg 50 gcctgcccat gccagggagc tgcaagccgc cccgccattc tctacgcact 100 tctgagctcc agcctcaagg ctgtcccccg accccgtagc cgctgcctat 150 gtaggcagca ccggcccgtc cagctatgtg cacctcatcg cacctgccgg 200 gaggccttgg atgttctggc caagacagtg gccttcctca ggaacctgcc 250 atccttctgg cagctgcctc cccaggacca gcggcggctg ctgcagggtt 300 gctggggccc cctcttcctg cttgggttgg cccaagatgc tgtgaccttt 350 gaggtggctg aggccccggt gcccagcata ctcaagaaga ttctgctgga 400 ggagcccagc agcagtggag gcagtggcca actgccagac agaccccagc 450 cctccctggc tgcggtgcag tggcttcaat gctgtctgga gtccttctgg 500 agcctggagc ttagccccaa ggaatatgcc tgcctgaaag ggaccatcct 550 cttcaacccc gatgtgccag gcctccaagc cgcctcccac attgggcacc 600 tgcagcagga ggctcactgg gtgctgtgtg aagtcctgga accctggtgc 650 ccagcagccc aaggccgcct gacccgtgtc ctcctcacgg cctccaccct 700 caagtccatt ccgaccagcc tgcttgggga cctcttcttt cgccctatca 750 ttggagatgt tgacatcgct ggccttcttg gggacatgct tttgctcagg 800 tgacctgttc cagcccaggc agagatcagg tgggcagagg ctggcagtgc 850 tgattcagcc tggccatccc cagaggtgac ccaatgctcc tggaggggca 900 agcctgtata gacagcactt ggctccttag gaacagctct tcactcagcc 950 acaccccaca ttggacttcc ttggtttgga cacagtgctc cagctgcctg 1000 ggaggctttt ggtggtcccc acagcctctg ggccaagact cctgtccctt 1050 cttgggatga gaatgaaagc ttaggctgct tattggacca gaagtcctat 1100 cgactttata cagaactgaa ttaagttatt gatttttgta ataaaaggta 1150 tgaaacacta aaaaaaaa 1168 12 257 PRT Homo sapien 12 Met Ser Thr Ser Gln 
Pro Gly Ala Cys Pro Cys Gln Gly Ala Ala 1 5 10 15 Ser Arg Pro Ala Ile Leu Tyr Ala Leu Leu Ser Ser Ser Leu Lys 20 25 30 Ala Val Pro Arg Pro Arg Ser Arg Cys Leu Cys Arg Gln His Arg 35 40 45 Pro Val Gln Leu Cys Ala Pro His Arg Thr Cys Arg Glu Ala Leu 50 55 60 Asp Val Leu Ala Lys Thr Val Ala Phe Leu Arg Asn Leu Pro Ser 65 70 75 Phe Trp Gln Leu Pro Pro Gln Asp Gln Arg Arg Leu Leu Gln Gly 80 85 90 Cys Trp Gly Pro Leu Phe Leu Leu Gly Leu Ala Gln Asp Ala Val 95 100 105 Thr Phe Glu Val Ala Glu Ala Pro Val Pro Ser Ile Leu Lys Lys 110 115 120 Ile Leu Leu Glu Glu Pro Ser Ser Ser Gly Gly Ser Gly Gln Leu 125 130 135 Pro Asp Arg Pro Gln Pro Ser Leu Ala Ala Val Gln Trp Leu Gln 140 145 150 Cys Cys Leu Glu Ser Phe Trp Ser Leu Glu Leu Ser Pro Lys Glu 155 160 165 Tyr Ala Cys Leu Lys Gly Thr Ile Leu Phe Asn Pro Asp Val Pro 170 175 180 Gly Leu Gln Ala Ala Ser His Ile Gly His Leu Gln Gln Glu Ala 185 190 195 His Trp Val Leu Cys Glu Val Leu Glu Pro Trp Cys Pro Ala Ala 200 205 210 Gln Gly Arg Leu Thr Arg Val Leu Leu Thr Ala Ser Thr Leu Lys 215 220 225 Ser Ile Pro Thr Ser Leu Leu Gly Asp Leu Phe Phe Arg Pro Ile 230 235 240 Ile Gly Asp Val Asp Ile Ala Gly Leu Leu Gly Asp Met Leu Leu 245 250 255 Leu Arg 13 1998 DNA Homo sapien 13 cggcgcgatg cgcggagacc cccgcggggg cggcggcggc cgtgagcccc 50 gatgaggccc gagcgtcccc ggccgcgcgg cagcgccccc ggcccgatgg 100 agaccccgcc gtgggaccca gcccgcaacg actcgctgcc gcccacgctg 150 accccggccg tgccccccta cgtgaagctt ggcctcaccg tcgtctacac 200 cgtgttctac gcgctgctct tcgtgttcat ctacgtgcag ctctggctgg 250 tgctgcgtta ccgccacaag cggctcagct accagagcgt cttcctcttt 300 ctctgcctct tctgggcctc cctgcggacc gtcctcttct ccttctactt 350 caaagacttc gtggcggcca attcgctcag ccccttcgtc ttctggctgc 400 tctactgctt ccctgtgtgc ctgcagtttt tcaccctcac gctgatgaac 450 ttgtacttca cgcaggtgat tttcaaagcc aagtcaaaat attctccaga 500 attactcaaa taccggttgc ccctctacct ggcctccctc ttcatcagcc 550 ttgttttcct gttggtgaat ttaacctgtg ctgtgctggt aaagacggga 600 aattgggaga ggaaggttat cgtctctgtg cgagtggcca ttaatgacac 650 gctcttcgtg 
ctgtgtgccg tctctctctc catctgtctc tacaaaatct 700 ctaagatgtc cttagccaac atttacttgg agtccaaggg ctcctccgtg 750 tgtcaagtga ctgccatcgg tgtcaccgtg atactgcttt acacctctcg 800 ggcctgctac aacctgttca tcctgtcatt ttctcagaac aagagcgtcc 850 attcctttga ttatgactgg tacaatgtat cagaccaggc agatttgaag 900 aatcagctgg gagatgctgg atacgtatta tttggagtgg tgttatttgt 950 ttgggaactc ttacctacca ccttagtcgt ttatttcttc cgagttagaa 1000 atcctacaaa ggaccttacc aaccctggaa tggtccccag ccatggattc 1050 agtcccagat cttatttctt tgacaaccct cgaagatatg acagtgatga 1100 tgaccttgcc tggaacattg cccctcaggg acttcaggga ggttttgctc 1150 cagattacta tgattgggga caacaaacta acagcttcct ggcacaagca 1200 ggaactttgc aagactcaac tttggatcct gacaaaccaa gccttgggta 1250 gcatcagtta acagttttat ggacgattcc tcagatgaaa agcttcagaa 1300 aagcatagtg acagctgaat ttttagggca cttttcctta agaaatagaa 1350 cttgattttt atttgttaca ggtttccaat ggccccatag gaataagcaa 1400 taatgtagac tgataaaccc ttattttagt actaaagagg gagccttgct 1450 atttcagtgg gtataattta aactttttaa agaaaatctg tacttttata 1500 aagatgtatt ttgtataact taaataataa tgctaaagta tactagggtt 1550 tttttttctt gagaatgtta ctgcaatcat gttgtagttt gcacagactt 1600 ttatgcataa ttcactttaa aaatatagaa tatatggtct aatagttttt 1650 taaagctttt ggactaaagt attccacaaa tcttacctct ttaggtcact 1700 gatggtcact ccgattctga gtgccacatt ggtagactcc taaaatacag 1750 ttgacaactt agccaattgc aactccagtg ttgataatta aaatgaaatg 1800 gtaaagcagc agactgtaag gtctttagag attttttttt aaggttcagg 1850 ccgtaggttc ctcaaggaat ctcttaagtt ttgcccaaag actggtactt 1900 cctttcagta gggcgctaat gtatacacat taatgataag ttgataacat 1950 taaaaatgta gctgacttat cctattaaac ctcctctgct atgttcac 1998 14 399 PRT Homo sapien 14 Met Arg Pro Glu Arg Pro Arg Pro Arg Gly Ser Ala Pro Gly Pro 1 5 10 15 Met Glu Thr Pro Pro Trp Asp Pro Ala Arg Asn Asp Ser Leu Pro 20 25 30 Pro Thr Leu Thr Pro Ala Val Pro Pro Tyr Val Lys Leu Gly Leu 35 40 45 Thr Val Val Tyr Thr Val Phe Tyr Ala Leu Leu Phe Val Phe Ile 50 55 60 Tyr Val Gln Leu Trp Leu Val Leu Arg Tyr Arg His Lys Arg Leu 65 70 75 Ser Tyr Gln Ser Val Phe 
Leu Phe Leu Cys Leu Phe Trp Ala Ser 80 85 90 Leu Arg Thr Val Leu Phe Ser Phe Tyr Phe Lys Asp Phe Val Ala 95 100 105 Ala Asn Ser Leu Ser Pro Phe Val Phe Trp Leu Leu Tyr Cys Phe 110 115 120 Pro Val Cys Leu Gln Phe Phe Thr Leu Thr Leu Met Asn Leu Tyr 125 130 135 Phe Thr Gln Val Ile Phe Lys Ala Lys Ser Lys Tyr Ser Pro Glu 140 145 150 Leu Leu Lys Tyr Arg Leu Pro Leu Tyr Leu Ala Ser Leu Phe Ile 155 160 165 Ser Leu Val Phe Leu Leu Val Asn Leu Thr Cys Ala Val Leu Val 170 175 180 Lys Thr Gly Asn Trp Glu Arg Lys Val Ile Val Ser Val Arg Val 185 190 195 Ala Ile Asn Asp Thr Leu Phe Val Leu Cys Ala Val Ser Leu Ser 200 205 210 Ile Cys Leu Tyr Lys Ile Ser Lys Met Ser Leu Ala Asn Ile Tyr 215 220 225 Leu Glu Ser Lys Gly Ser Ser Val Cys Gln Val Thr Ala Ile Gly 230 235 240 Val Thr Val Ile Leu Leu Tyr Thr Ser Arg Ala Cys Tyr Asn Leu 245 250 255 Phe Ile Leu Ser Phe Ser Gln Asn Lys Ser Val His Ser Phe Asp 260 265 270 Tyr Asp Trp Tyr Asn Val Ser Asp Gln Ala Asp Leu Lys Asn Gln 275 280 285 Leu Gly Asp Ala Gly Tyr Val Leu Phe Gly Val Val Leu Phe Val 290 295 300 Trp Glu Leu Leu Pro Thr Thr Leu Val Val Tyr Phe Phe Arg Val 305 310 315 Arg Asn Pro Thr Lys Asp Leu Thr Asn Pro Gly Met Val Pro Ser 320 325 330 His Gly Phe Ser Pro Arg Ser Tyr Phe Phe Asp Asn Pro Arg Arg 335 340 345 Tyr Asp Ser Asp Asp Asp Leu Ala Trp Asn Ile Ala Pro Gln Gly 350 355 360 Leu Gln Gly Gly Phe Ala Pro Asp Tyr Tyr Asp Trp Gly Gln Gln 365 370 375 Thr Asn Ser Phe Leu Ala Gln Ala Gly Thr Leu Gln Asp Ser Thr 380 385 390 Leu Asp Pro Asp Lys Pro Ser Leu Gly 395 15 2320 DNA Homo sapien 15 gcgcagggag gggagacctt ggcggacggc ggagccccag cggaggtgaa 50 agtattggcg gaaaggaaaa tacagcggaa aaatgcagag ctggagtcgt 100 gtgtactgct ccttggccaa gagaggccat ttcaatcgaa tatctcatgg 150 cctacaggga ctttctgcag tgcctctgag aacttacgca gatcagccga 200 ttgatgctga tgtaacagtt ataggttctg gtcctggagg atatgttgct 250 gctattaaag ctgcccagtt aggcttcaag acagtctgca ttgagaaaaa 300 tgaaacactt ggtggaacat gcttgaatgt tggttgtatt ccttctaagg 350 ctttattgaa caactctcat tattaccata 
tggcccatgg aacagatttt 400 gcatctagag gaattgaaat gtccgaagtt cgcttgaatt tagacaagat 450 gatggagcag aagagtactg cagtaaaagc tttaacaggt ggaattgccc 500 acttattcaa acagaataag gttgttcatg
tcaatggata tggaaagata 550 actggcaaaa atcaagtcac tgctacgaaa gctgatggcg gcactcaggt 600 tattgataca aagaacattc ttatagccac gggttcagaa gttactcctt 650 ttcctggaat cacgatagat gaagatacaa tagtgtcatc tacaggtgct 700 ttatctttaa aaaaagttcc agaaaagatg gttgttattg gtgcaggagt 750 aataggtgta gaattgggtt cagtttggca aagacttggt gcagatgtga 800 cagcagttga atttttaggt catgtaggtg gagttggaat tgatatggag 850 atatctaaaa actttcaacg catccttcaa aaacaggggt ttaaatttaa 900 attgaataca aaggttactg gtgctaccaa gaagtcagat ggaaaaattg 950 atgtttctat tgaagctgct tctggtggta aagctgaagt tatcacttgt 1000 gatgtactct tggtttgcat tggccgacga ccctttacta agaatttggg 1050 actagaagag ctgggaattg aactagatcc tagaggtaga attccagtca 1100 ataccagatt tcaaactaaa attccaaata tctatgccat tggtgatgta 1150 gttgctggtc caatgctggc tcacaaagca gaggatgaag gcattatctg 1200 tgttgaagga atggctggtg gtgctgtgca cattgactac aattgtgtgc 1250 catcagtgat ttacacacac cctgaagttg cttgggttgg caaatcagaa 1300 gagcagttga aagaagaggg tattgagtac aaagttggga aattcccatt 1350 tgctgctaac agcagagcta agacaaatgc tgacacagat ggcatggtga 1400 agatccttgg gcagaaatcg acagacagag tactgggagc acatattctt 1450 ggaccaggtg ctggagaaat ggtaaatgaa gctgctcttg ctttggaata 1500 tggagcatcc tgtgaagata tagctagagt ctgtcatgca catccgacct 1550 tatcagaagc ttttagagaa gcaaatcttg ctgcgtcatt tggcaaatca 1600 atcaactttt gaattagaag attatatatt tttttttctg aaatttcctg 1650 ggagcttttg tagaagtcac attcctgaac aggatattct cacagctcca 1700 agaatttcta ggactgaatt atgaaacttt tggaaggtat ttaataggtt 1750 tggacaaaat ggaatactct tatatctata ttttacataa atttagtatt 1800 ttgtttcagt gcactaatat gtaagacaaa aaggactact tattgtagtc 1850 atcctggaat atctccgtca actcatattt tcatgctgtt catgaaagat 1900 tcaatgcccc tgaatttaaa tagctctttt ctctgataca gaaaagttga 1950 attttacatg gctggagcta gaatttgata tgtgaacagt tgtgtttgaa 2000 gcacagtgat caagttattt ttaatttggt tttcacattg gaaacaagtc 2050 agtcattcag atatgattca aatgtctata aaccaaactg atgtaagtaa 2100 atggtctctc acttgtttta tttaacctct aaattctttc attttagggg 2150 tagcatttgt gttgaagagg ttttaaagct tccattgttg tctgcaactc 2200 
tgaagggtaa ttatatagtt acccaaatta agagagtcta tttacggaac 2250 tcaaatacgt gggcattcaa atgtattaca gtggggaatg aagatactga 2300 aataaacgtc ttaaatattc 2320 16 509 PRT Homo sapien 16 Met Gln Ser Trp Ser Arg Val Tyr Cys Ser Leu Ala Lys Arg Gly 1 5 10 15 His Phe Asn Arg Ile Ser His Gly Leu Gln Gly Leu Ser Ala Val 20 25 30 Pro Leu Arg Thr Tyr Ala Asp Gln Pro Ile Asp Ala Asp Val Thr 35 40 45 Val Ile Gly Ser Gly Pro Gly Gly Tyr Val Ala Ala Ile Lys Ala 50 55 60 Ala Gln Leu Gly Phe Lys Thr Val Cys Ile Glu Lys Asn Glu Thr 65 70 75 Leu Gly Gly Thr Cys Leu Asn Val Gly Cys Ile Pro Ser Lys Ala 80 85 90 Leu Leu Asn Asn Ser His Tyr Tyr His Met Ala His Gly Thr Asp 95 100 105 Phe Ala Ser Arg Gly Ile Glu Met Ser Glu Val Arg Leu Asn Leu 110 115 120 Asp Lys Met Met Glu Gln Lys Ser Thr Ala Val Lys Ala Leu Thr 125 130 135 Gly Gly Ile Ala His Leu Phe Lys Gln Asn Lys Val Val His Val 140 145 150 Asn Gly Tyr Gly Lys Ile Thr Gly Lys Asn Gln Val Thr Ala Thr 155 160 165 Lys Ala Asp Gly Gly Thr Gln Val Ile Asp Thr Lys Asn Ile Leu 170 175 180 Ile Ala Thr Gly Ser Glu Val Thr Pro Phe Pro Gly Ile Thr Ile 185 190 195 Asp Glu Asp Thr Ile Val Ser Ser Thr Gly Ala Leu Ser Leu Lys 200 205 210 Lys Val Pro Glu Lys Met Val Val Ile Gly Ala Gly Val Ile Gly 215 220 225 Val Glu Leu Gly Ser Val Trp Gln Arg Leu Gly Ala Asp Val Thr 230 235 240 Ala Val Glu Phe Leu Gly His Val Gly Gly Val Gly Ile Asp Met 245 250 255 Glu Ile Ser Lys Asn Phe Gln Arg Ile Leu Gln Lys Gln Gly Phe 260 265 270 Lys Phe Lys Leu Asn Thr Lys Val Thr Gly Ala Thr Lys Lys Ser 275 280 285 Asp Gly Lys Ile Asp Val Ser Ile Glu Ala Ala Ser Gly Gly Lys 290 295 300 Ala Glu Val Ile Thr Cys Asp Val Leu Leu Val Cys Ile Gly Arg 305 310 315 Arg Pro Phe Thr Lys Asn Leu Gly Leu Glu Glu Leu Gly Ile Glu 320 325 330 Leu Asp Pro Arg Gly Arg Ile Pro Val Asn Thr Arg Phe Gln Thr 335 340 345 Lys Ile Pro Asn Ile Tyr Ala Ile Gly Asp Val Val Ala Gly Pro 350 355 360 Met Leu Ala His Lys Ala Glu Asp Glu Gly Ile Ile Cys Val Glu 365 370 375 Gly Met Ala Gly Gly Ala Val His Ile Asp Tyr 
Asn Cys Val Pro 380 385 390 Ser Val Ile Tyr Thr His Pro Glu Val Ala Trp Val Gly Lys Ser 395 400 405 Glu Glu Gln Leu Lys Glu Glu Gly Ile Glu Tyr Lys Val Gly Lys 410 415 420 Phe Pro Phe Ala Ala Asn Ser Arg Ala Lys Thr Asn Ala Asp Thr 425 430 435 Asp Gly Met Val Lys Ile Leu Gly Gln Lys Ser Thr Asp Arg Val 440 445 450 Leu Gly Ala His Ile Leu Gly Pro Gly Ala Gly Glu Met Val Asn 455 460 465 Glu Ala Ala Leu Ala Leu Glu Tyr Gly Ala Ser Cys Glu Asp Ile 470 475 480 Ala Arg Val Cys His Ala His Pro Thr Leu Ser Glu Ala Phe Arg 485 490 495 Glu Ala Asn Leu Ala Ala Ser Phe Gly Lys Ser Ile Asn Phe 500 505 17 2090 DNA Homo sapien 17 gttctgggcc taggggaggc gggccgaggg cgtctgagct gaggcccgcg 50 tcgatcctgg gttggaggag gtggcggccg ctgaggctgc ggcgtgaaga 100 cggcgggcat ggtggggcgg gagaaagagc tctctataca ctttgttccc 150 gggagctgtc ggctggtgga ggaggaagtt aacatcccta ataggagggt 200 tctggttact ggtgccactg ggcttcttgg cagagctgta cacaaagaat 250 ttcagcagaa taattggcat gcagttggct gtggtttcag aagagcaaga 300 ccaaaatttg aacaggttaa tctgttggat tctaatgcag ttcatcacat 350 cattcatgat tttcagcccc atgttatagt acattgtgca gcagagagaa 400 gaccagatgt tgtagaaaat cagccagatg ctgcctctca acttaatgtg 450 gatgcttctg ggaatttagc aaaggaagca gctgctgttg gagcatttct 500 catctacatt agctcagatt atgtatttga tggaacaaat ccaccttaca 550 gagaggaaga cataccagct cccctaaatt tgtatggcaa aacaaaatta 600 gatggagaaa aggctgtcct ggagaacaat ctaggagctg ctgttttgag 650 gattcctatt ctgtatgggg aagttgaaaa gctcgaagaa agtgctgtga 700 ctgttatgtt tgataaagtg cagttcagca acaagtcagc aaacatggat 750 cactggcagc agaggttccc cacacatgtc aaagatgtgg ccactgtgtg 800 ccggcagcta gcagagaaga gaatgctgga tccatcaatt aagggaacct 850 ttcactggtc tggcaatgaa cagatgacta agtatgaaat ggcatgtgca 900 attgcagatg ccttcaacct ccccagcagt cacttaagac ctattactga 950 cagccctgtc ctaggagcac aacgtccgag aaatgctcag cttgactgct 1000 ccaaattgga gaccttgggc attggccaac gaacaccatt tcgaattgga 1050 atcaaagaat cactttggcc tttcctcatt gacaagagat ggagacaaac 1100 ggtctttcat tagtttattt gtgttgggtt cttttttttt tttaaatgaa 1150 aagtatagta 
tgtggcactt tttaaagaac aaaggaaata gttttgtatg 1200 agtactttaa ttgtgactct taggatcttt caggtaaatg atgctcttgc 1250 actagtgaaa ttgtctaaag aaactaaagg gcagtcatgc cctgtttgca 1300 gtaatttttc tttttatcat tttgtttgtc ctggctaaac ttggagtttg 1350 agtatagtaa attatgatcc ttaaatattt gagagtcagg atgaagcaga 1400 tctgctgtag acttttcaga tgaaattgtt cattctcgta acctccatat 1450 tttcaggatt tttgaagctg ttgacctttt catgttgatt attttaaatt 1500 gtgtgaaata gtataaaaat cattggtgtt cattatttgc tttgcctgag 1550 ctcagatcaa aatgtttgaa gaaaggaact ttatttttgc aagttacgta 1600 cagtttttat gcttgagata tttcaacatg ttatgtatat tggaacttct 1650 acagcttgat gcctcctgct tttatagcag tttatgggga gcacttgaaa 1700 gagcgtgtgt acatgtattt tttttctagg caaacattga atgcaaacgt 1750 gtattttttt aatataaata tataactgtc cttttcatcc catgttgccg 1800 ctaagtgata tttcatatgt gtggttatac tcataataat gggccttgta 1850 agtcttttca ccattcatga ataataataa atatgtactg ctggcatgta 1900 atgcttagtt ttcttgtatt tacttctttt tttaaatgta aggaccaaac 1950 ttctaaacta attgttcttt tgttgcttta atttttaaaa attacattct 2000 tctgatgtaa catgtgatac atacaaaaga atatagttta atatgtattg 2050 aaataaaaca caataaaatt aaaaaaaaaa aaaaaaaaaa 2090 18 334 PRT Homo sapien 18 Met Val Gly Arg Glu Lys Glu Leu Ser Ile His Phe Val Pro Gly 1 5 10 15 Ser Cys Arg Leu Val Glu Glu Glu Val Asn Ile Pro Asn Arg Arg 20 25 30 Val Leu Val Thr Gly Ala Thr Gly Leu Leu Gly Arg Ala Val His 35 40 45 Lys Glu Phe Gln Gln Asn Asn Trp His Ala Val Gly Cys Gly Phe 50 55 60 Arg Arg Ala Arg Pro Lys Phe Glu Gln Val Asn Leu Leu Asp Ser 65 70 75 Asn Ala Val His His Ile Ile His Asp Phe Gln Pro His Val Ile 80 85 90 Val His Cys Ala Ala Glu Arg Arg Pro Asp Val Val Glu Asn Gln 95 100 105 Pro Asp Ala Ala Ser Gln Leu Asn Val Asp Ala Ser Gly Asn Leu 110 115 120 Ala Lys Glu Ala Ala Ala Val Gly Ala Phe Leu Ile Tyr Ile Ser 125 130 135 Ser Asp Tyr Val Phe Asp Gly Thr Asn Pro Pro Tyr Arg Glu Glu 140 145 150 Asp Ile Pro Ala Pro Leu Asn Leu Tyr Gly Lys Thr Lys Leu Asp 155 160 165 Gly Glu Lys Ala Val Leu Glu Asn Asn Leu Gly Ala Ala Val Leu 170 175 180 Arg Ile 
Pro Ile Leu Tyr Gly Glu Val Glu Lys Leu Glu Glu Ser 185 190 195 Ala Val Thr Val Met Phe Asp Lys Val Gln Phe Ser Asn Lys Ser 200 205 210 Ala Asn Met Asp His Trp Gln Gln Arg Phe Pro Thr His Val Lys 215 220 225 Asp Val Ala Thr Val Cys Arg Gln Leu Ala Glu Lys Arg Met Leu 230 235 240 Asp Pro Ser Ile Lys Gly Thr Phe His Trp Ser Gly Asn Glu Gln 245 250 255 Met Thr Lys Tyr Glu Met Ala Cys Ala Ile Ala Asp Ala Phe Asn 260 265 270 Leu Pro Ser Ser His Leu Arg Pro Ile Thr Asp Ser Pro Val Leu 275 280 285 Gly Ala Gln Arg Pro Arg Asn Ala Gln Leu Asp Cys Ser Lys Leu 290 295 300 Glu Thr Leu Gly Ile Gly Gln Arg Thr Pro Phe Arg Ile Gly Ile 305 310 315 Lys Glu Ser Leu Trp Pro Phe Leu Ile Asp Lys Arg Trp Arg Gln 320 325 330 Thr Val Phe His 19 2380 DNA Homo sapien 19 gaggaggagg gaaaaggcga gcaaaaagga agagtgggag gaggagggga 50 agcggcgaag gaggaagagg aggaggagga agaggggagc acaaaggatc 100 caggtctccc gacgggaggt taataccaag aaccatgtgt gccgagcggc 150 tgggccagtt catgaccctg gctttggtgt tggccacctt tgacccggcg 200 cgggggaccg acgccaccaa cccacccgag ggtccccaag acaggagctc 250 ccagcagaaa ggccgcctgt ccctgcagaa tacagcggag atccagcact 300 gtttggtcaa cgctggcgat gtggggtgtg gcgtgtttga atgtttcgag 350 aacaactctt gtgagattcg gggcttacat gggatttgca tgacttttct 400 gcacaacgct ggaaaatttg atgcccaggg caagtcattc atcaaagacg 450 ccttgaaatg taaggcccac gctctgcggc acaggttcgg ctgcataagc 500 cggaagtgcc cggccatcag ggaaatggtg tcccagttgc agcgggaatg 550 ctacctcaag cacgacctgt gcgcggctgc ccaggagaac acccgggtga 600 tagtggagat gatccatttc aaggacttgc tgctgcacga accctacgtg 650 gacctcgtga acttgctgct gacctgtggg gaggaggtga aggaggccat 700 cacccacagc gtgcaggttc agtgtgagca gaactgggga agcctgtgct 750 ccatcttgag cttctgcacc tcggccatcc agaagcctcc cacggcgccc 800 cccgagcgcc agccccaggt ggacagaacc aagctctcca gggcccacca 850 cggggaagca ggacatcacc tcccagagcc cagcagtagg gagactggcc 900 gaggtgccaa gggtgagcga ggtagcaaga gccacccaaa cgcccatgcc 950 cgaggcagag tcgggggcct tggggctcag ggaccttccg gaagcagcga 1000 gtgggaagac gaacagtctg agtattctga tatccggagg tgaaatgaaa 1050 
ggcctggcca cgaaatcttt cctccacgcc gtccattttc ttatctatgg 1100 acattccaaa acatttacca ttagagaggg gggatgtcac acgcaggatt 1150 ctgtggggac tgtggacttc atcgaggtgt gtgttcgcgg aacggacagg 1200 tgagatggag acccctgggg ccgtggggtc tcaggggtgc ctggtgaatt 1250 ctgcacttac acgtactcaa gggagcgcgc ccgcgttatc ctcgtacctt 1300 tgtcttcttt ccatctgtgg agtcagtggg tgtcggccgc tctgttgtgg 1350 gggaggtgaa ccagggaggg gcagggcaag gcagggcccc cagagctggg 1400 ccacacagtg ggtgctgggc ctcgccccga agcttctggt gcagcagcct 1450 ctggtgctgt ctccgcggaa gtcagggcgg ctggattcca ggacaggagt 1500 gaatgtaaaa ataaatatcg cttagaatgc aggagaaggg tggagaggag 1550 gcaggggccg agggggtgct tggtgccaaa ctgaaattca gtttcttgtg 1600 tggggccttg cggttcagag ctcttggcga gggtggaggg aggagtgtca 1650 tttctatgtg taatttctga gccattgtac tgtctgggct gggggggaca 1700 ctgtccaagg gagtggcccc tatgagttta tattttaacc actgcttcaa 1750 atctcgattt cacttttttt atttatccag ttatatctac atatctgtca 1800 tctaaataaa tggctttcaa acaaagcaac tgggtcatta aaaccagctc 1850 aaagggggtt taaaaaaaaa aaaaccagcc catcctttga ggctgatttt 1900 tctttttttt aagttctatt ttaaaagcta tcaaacagcg acatagccat 1950 acatctgact gcctgacatg gactcctgcc cacttggggg aaaccttata 2000 cccagaggaa aatacacacc tggggagtac atttgacaaa tttcccttag 2050 gatttcgtta tctcaccttg accctcagcc aagattggta aagctgcgtc 2100 ctggcgattc caggagaccc agctggaaac ctggcttctc catgtgaggg 2150 gatgggaaag gaaagaagag aatgaagact acttagtaat tcccatcagg 2200 aaatgctgac cttttacata aaatcaagga gactgctgaa aatctctaag 2250 ggacaggatt ttccagatcc taattggaaa tttagcaata aggagaggag 2300 tccaagggga caaataaagg cagagagaga gagagagaga gggagaggaa 2350 gaaaagagag agagaaaaga gcctcgtgcc 2380 20 302 PRT Homo sapien 20 Met Cys Ala Glu Arg Leu Gly Gln Phe Met Thr Leu Ala Leu Val 1 5 10 15 Leu Ala Thr Phe Asp Pro Ala Arg Gly Thr Asp Ala Thr Asn Pro 20 25 30 Pro Glu Gly Pro Gln Asp Arg Ser Ser Gln Gln Lys Gly Arg Leu 35 40 45 Ser Leu Gln Asn Thr Ala Glu Ile Gln His Cys Leu Val Asn Ala 50 55 60 Gly Asp Val Gly Cys Gly Val Phe Glu Cys Phe Glu Asn Asn Ser 65 70 75 Cys Glu Ile Arg Gly Leu His 
Gly Ile Cys Met Thr Phe Leu His 80 85 90 Asn Ala Gly Lys Phe Asp Ala Gln Gly Lys Ser Phe Ile Lys Asp 95 100 105 Ala Leu Lys Cys Lys Ala His Ala Leu Arg His Arg Phe Gly Cys 110 115 120 Ile Ser Arg Lys Cys Pro Ala Ile Arg Glu Met Val Ser Gln Leu 125 130 135 Gln Arg Glu Cys Tyr Leu Lys His Asp Leu Cys Ala Ala Ala Gln 140 145 150 Glu Asn Thr Arg Val Ile Val Glu Met Ile His Phe Lys Asp Leu 155 160 165 Leu Leu His Glu Pro Tyr Val Asp Leu Val Asn Leu Leu Leu Thr 170 175 180 Cys Gly Glu Glu Val Lys Glu Ala Ile Thr His Ser Val Gln Val 185 190 195 Gln Cys Glu Gln Asn Trp Gly Ser Leu Cys Ser Ile Leu Ser Phe 200 205 210 Cys Thr Ser Ala Ile Gln Lys Pro Pro Thr Ala Pro Pro Glu Arg 215 220 225 Gln Pro Gln Val Asp Arg Thr Lys Leu Ser Arg Ala His His Gly 230 235 240 Glu Ala Gly His His Leu Pro Glu Pro Ser Ser Arg Glu Thr Gly 245 250 255 Arg Gly Ala Lys Gly Glu Arg Gly Ser Lys Ser His Pro Asn Ala 260 265 270 His Ala Arg Gly Arg Val Gly Gly Leu Gly Ala Gln Gly Pro Ser 275 280
285 Gly Ser Ser Glu Trp Glu Asp Glu Gln Ser Glu Tyr Ser Asp Ile 290 295 300 Arg Arg 21 2516 DNA Homo sapien 21 gttcctggtg tccccacttc gcctccctcc tgctgccccc aagacatgca 50 ggggccctgg gtgctgctgc tgctgggcct gaggctacag ctctccctgg 100 gcgtcatccc agctgaggag gagaacccgg ccttctggaa ccgccaggca 150 gctgaggccc tggatgctgc caagaagctg cagcccatcc agaaggtcgc 200 caagaacctc atcctcttcc tgggcgatgg gttgggggtg cccacggtga 250 cagccaccag gatcctaaag gggcagaaga atggcaaact ggggcctgag 300 acgcccctgg ccatggaccg cttcccatac ctggctctgt ccaagacata 350 caatgtggac agacaggtgc cagacagcgc agccacagcc acggcctacc 400 tgtgcggggt caaggccaac ttccagacca tcggcttgag tgcagccgcc 450 cgctttaacc agtgcaacac gacacgcggc aatgaggtca tctccgtgat 500 gaaccgggcc aagcaagcag gaaagtcagt aggagtggtg accaccacac 550 gggtgcagca cgcctcgcca gccggcacct acgcacacac agtgaaccgc 600 aactggtact cagatgctga catgcctgcc tcagcccgcc aggaggggtg 650 ccaggacatc gccactcagc tcatctccaa catggacatt gacgtgatcc 700 ttggcggagg ccgcaagtac atgtttccca tggggacccc agaccctgag 750 tacccagctg atgccagcca gaatggaatc aggctggacg ggaagaacct 800 ggtgcaggaa tggctggcaa agcaccaggg tgcctggtat gtgtggaacc 850 gcactgagct catgcaggcg tccctggacc agtctgtgac ccatctcatg 900 ggcctctttg agcccggaga cacgaaatat gagatcctcc gagaccccac 950 actggacccc tccctgatgg agatgacaga ggctgccctg cgcctgctga 1000 gcaggaaccc ccgcggcttc tacctctttg tggagggcgg ccgcatcgac 1050 catggtcatc atgagggtgt ggcttaccag gcagtcactg aggcggtcat 1100 gttcgacgac gccattgaga gggcgggcca gctcaccagc gaggaggaca 1150 cgctgaccct cgtcaccgct gaccactccc atgtcttctc ctttggtggc 1200 tacaccttgc gagggagctc catcttcggg ttggccccca gcaaggctca 1250 ggacagcaaa gcctacacgt ccatcctgta cggcaatggc ccgggctacg 1300 tgttcaactc aggcgtgcga ccagacgtga atgagagcga gagcgggagc 1350 cccgattacc agcagcaggc ggcggtgccc ctgtcgtccg agacccacgg 1400 aggcgaagac gtggcggtgt ttgcgcgcgg cccgcaggcg cacctggtgc 1450 atggtgtgca ggagcagagc ttcgtagcgc atgtcatggc cttcgctgcc 1500 tgtctggagc cctacacggc ctgcgacctg gcgctccccg cctgcaccac 1550 cgacgccgcg cacccagttg ccgcgtcgct gccactgctg 
gccgggaccc 1600 tgctgctgct gggggcgtcc gctgctccct gagtgcccca ctccggagtt 1650 atcctgctcc ccacctccgg gcgtcctgcc ctgttccccg tcctgagccg 1700 ccacttccag cgaacacaca caggtgtcct gccgttggac cttcacctcc 1750 tagagataaa ccagcctcag ctggcgcagc ggggcccttc ttccctccgc 1800 atccccttca gggagcagga gcccagggcg ccctgggagc tgagcctggg 1850 acttccagga cctcccctca ggttgttctc tgattcttcc tcccaacccc 1900 agagactgca gatttgtgcc atgcggctgc ctgcacccca gacaataaag 1950 ggaccaaaac cacccaaccc ccaccctgcc tctatcctaa ggaagaccaa 2000 gcaggcctgg acccagagac gtcccccatc gtgggacacg acacacccag 2050 accgcgtgcc ccaccgtctt agcttcaatc ctggcagcac ctggtagacc 2100 caaggacttg ggtggatcag gacacctgaa gaagagaagc ttccggcaac 2150 cctgcaaccc acccaaggag gctactggat cggggattcc caggggggct 2200 ttgacacagt cctctgctgt ctccccacta ggatcattcc acacccctgc 2250 acctgaccaa gggaccaatg aggcagaggc ttgccccaag tcacagccac 2300 tcagatgctt cctgcccccc agtgcccatt ccaggtcacc agatccaagg 2350 agcgcttgag gagctctggg tacagggcag caacccagag cccatgggcc 2400 ctcccgggac atctggatgc tgggcataga tttctcaaca aggaagactc 2450 ccctgcctcc tcaaggtctc cattctccta ggagacaaag caataataaa 2500 aggtgttaga caatgt 2516 22 528 PRT Homo sapien 22 Met Gln Gly Pro Trp Val Leu Leu Leu Leu Gly Leu Arg Leu Gln 1 5 10 15 Leu Ser Leu Gly Val Ile Pro Ala Glu Glu Glu Asn Pro Ala Phe 20 25 30 Trp Asn Arg Gln Ala Ala Glu Ala Leu Asp Ala Ala Lys Lys Leu 35 40 45 Gln Pro Ile Gln Lys Val Ala Lys Asn Leu Ile Leu Phe Leu Gly 50 55 60 Asp Gly Leu Gly Val Pro Thr Val Thr Ala Thr Arg Ile Leu Lys 65 70 75 Gly Gln Lys Asn Gly Lys Leu Gly Pro Glu Thr Pro Leu Ala Met 80 85 90 Asp Arg Phe Pro Tyr Leu Ala Leu Ser Lys Thr Tyr Asn Val Asp 95 100 105 Arg Gln Val Pro Asp Ser Ala Ala Thr Ala Thr Ala Tyr Leu Cys 110 115 120 Gly Val Lys Ala Asn Phe Gln Thr Ile Gly Leu Ser Ala Ala Ala 125 130 135 Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu Val Ile Ser 140 145 150 Val Met Asn Arg Ala Lys Gln Ala Gly Lys Ser Val Gly Val Val 155 160 165 Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr Ala 170 175 180 His Thr 
Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Met Pro Ala 185 190 195 Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile 200 205 210 Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr 215 220 225 Met Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Ala Asp Ala 230 235 240 Ser Gln Asn Gly Ile Arg Leu Asp Gly Lys Asn Leu Val Gln Glu 245 250 255 Trp Leu Ala Lys His Gln Gly Ala Trp Tyr Val Trp Asn Arg Thr 260 265 270 Glu Leu Met Gln Ala Ser Leu Asp Gln Ser Val Thr His Leu Met 275 280 285 Gly Leu Phe Glu Pro Gly Asp Thr Lys Tyr Glu Ile Leu Arg Asp 290 295 300 Pro Thr Leu Asp Pro Ser Leu Met Glu Met Thr Glu Ala Ala Leu 305 310 315 Arg Leu Leu Ser Arg Asn Pro Arg Gly Phe Tyr Leu Phe Val Glu 320 325 330 Gly Gly Arg Ile Asp His Gly His His Glu Gly Val Ala Tyr Gln 335 340 345 Ala Val Thr Glu Ala Val Met Phe Asp Asp Ala Ile Glu Arg Ala 350 355 360 Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Thr Leu Val Thr Ala 365 370 375 Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Thr Leu Arg Gly 380 385 390 Ser Ser Ile Phe Gly Leu Ala Pro Ser Lys Ala Gln Asp Ser Lys 395 400 405 Ala Tyr Thr Ser Ile Leu Tyr Gly Asn Gly Pro Gly Tyr Val Phe 410 415 420 Asn Ser Gly Val Arg Pro Asp Val Asn Glu Ser Glu Ser Gly Ser 425 430 435 Pro Asp Tyr Gln Gln Gln Ala Ala Val Pro Leu Ser Ser Glu Thr 440 445 450 His Gly Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala 455 460 465 His Leu Val His Gly Val Gln Glu Gln Ser Phe Val Ala His Val 470 475 480 Met Ala Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu 485 490 495 Ala Leu Pro Ala Cys Thr Thr Asp Ala Ala His Pro Val Ala Ala 500 505 510 Ser Leu Pro Leu Leu Ala Gly Thr Leu Leu Leu Leu Gly Ala Ser 515 520 525 Ala Ala Pro 23 1746 DNA Homo sapien 23 agaattcggc acgacggggt tctggccatg aagcccacct caggcccaga 50 ggaggcccgg cggccagcct cggacatccg cgtgttcgcc agcaactgct 100 cgatgcacgg gctgggccac gtcttcgggc caggcagcct gagcctgcgc 150 cgggggatgt gggcagcggc cgtggtcctg tcagtggcca ccttcctcta 200 ccaggtggct gagagggtgc gctactacag ggagttccac caccagactg 250 ccctggatga 
gcgagaaagc caccggctca tcttcccggc tgtcaccctg 300 tgcaacatca acccactgcg ccgctcgcgc ctaacgccca acgacctgca 350 ctgggctggg tctgcgctgc tgggcctgga tcccgcagag cacgccgcct 400 tcctgcgcgc cctgggccgg ccccctgcac cgcccggctt catgcccagt 450 cccacctttg acatggcgca actctatgcc cgtgctgggc actccctgga 500 tgacatgctg ctggactgtc gcttccgtgg ccaaccttgt gggcctgaga 550 acttcaccac gatcttcacc cggatgggaa agtgctacac atttaactct 600 ggcgctgatg gggcagagct gctcaccact actaggggtg gcatgggcaa 650 tgggctggac atcatgctgg acgtgcagca ggaggaatat ctacctgtgt 700 ggagggacaa tgaggagacc ccgtttgagg tggggatccg agtgcagatc 750 cacagccagg aggagccgcc catcatcgat cagctgggct tgggggtgtc 800 cccgggctac cagacctttg tttcttgcca gcagcagcag ctgagcttcc 850 tgccaccgcc ctggggcgat tgcagttcag catctctgaa ccccaactat 900 gagccagagc cctctgatcc cctaggctcc cccagcccca gccccagccc 950 tccctatacc cttatggggt gtcgcctggc ctgcgaaacc cgctacgtgg 1000 ctcggaagtg cggctgccga atggtgtaca tgccaggcga cgtgccagtg 1050 tgcagccccc agcagtacaa gaactgtgcc cacccggcca tagatgccat 1100 gcttcgcaag gactcgtgcg cctgccccaa cccgtgcgcc agcacgcgct 1150 acgccaagga gctctccatg gtgcggatcc cgagccgcgc cgccgcgcgc 1200 ttcctggccc ggaagctcaa ccgcagcgag gcctacatcg cggagaacgt 1250 gctggccctg gacatcttct ttgaggccct caactatgag accgtggagc 1300 agaagaaggc ctatgagatg tcagagctgc ttggtgacat tgggggccag 1350 atggggctgt tcatcggggc cagcctgctc accatcctcg agatcctaga 1400 ctacctctgt gaggtgttcc gagacaaggt cctgggatat ttctggaacc 1450 gacagcactc ccaaaggcac tccagcacca atctgcttca ggaagggctg 1500 ggcagccatc gaacccaagt tccccacctc agcctgggcc ccagacctcc 1550 cacccctccc tgtgccgtca ccaagactct ctccgcctcc caccgcacct 1600 gctaccttgt cacacagctc tagacctgct gtctgtgtcc tcggagcccc 1650 gccctgacat cctggacatg cctagcctgc acgtagcttt tccgtcttca 1700 ccccaaataa agtcctaatg catcaaaaaa aaaaaaaaaa aaaaaa 1746 24 531 PRT Homo sapien 24 Met Lys Pro Thr Ser Gly Pro Glu Glu Ala Arg Arg Pro Ala Ser 1 5 10 15 Asp Ile Arg Val Phe Ala Ser Asn Cys Ser Met His Gly Leu Gly 20 25 30 His Val Phe Gly Pro Gly Ser Leu Ser Leu Arg Arg Gly Met Trp 35 
40 45 Ala Ala Ala Val Val Leu Ser Val Ala Thr Phe Leu Tyr Gln Val 50 55 60 Ala Glu Arg Val Arg Tyr Tyr Arg Glu Phe His His Gln Thr Ala 65 70 75 Leu Asp Glu Arg Glu Ser His Arg Leu Ile Phe Pro Ala Val Thr 80 85 90 Leu Cys Asn Ile Asn Pro Leu Arg Arg Ser Arg Leu Thr Pro Asn 95 100 105 Asp Leu His Trp Ala Gly Ser Ala Leu Leu Gly Leu Asp Pro Ala 110 115 120 Glu His Ala Ala Phe Leu Arg Ala Leu Gly Arg Pro Pro Ala Pro 125 130 135 Pro Gly Phe Met Pro Ser Pro Thr Phe Asp Met Ala Gln Leu Tyr 140 145 150 Ala Arg Ala Gly His Ser Leu Asp Asp Met Leu Leu Asp Cys Arg 155 160 165 Phe Arg Gly Gln Pro Cys Gly Pro Glu Asn Phe Thr Thr Ile Phe 170 175 180 Thr Arg Met Gly Lys Cys Tyr Thr Phe Asn Ser Gly Ala Asp Gly 185 190 195 Ala Glu Leu Leu Thr Thr Thr Arg Gly Gly Met Gly Asn Gly Leu 200 205 210 Asp Ile Met Leu Asp Val Gln Gln Glu Glu Tyr Leu Pro Val Trp 215 220 225 Arg Asp Asn Glu Glu Thr Pro Phe Glu Val Gly Ile Arg Val Gln 230 235 240 Ile His Ser Gln Glu Glu Pro Pro Ile Ile Asp Gln Leu Gly Leu 245 250 255 Gly Val Ser Pro Gly Tyr Gln Thr Phe Val Ser Cys Gln Gln Gln 260 265 270 Gln Leu Ser Phe Leu Pro Pro Pro Trp Gly Asp Cys Ser Ser Ala 275 280 285 Ser Leu Asn Pro Asn Tyr Glu Pro Glu Pro Ser Asp Pro Leu Gly 290 295 300 Ser Pro Ser Pro Ser Pro Ser Pro Pro Tyr Thr Leu Met Gly Cys 305 310 315 Arg Leu Ala Cys Glu Thr Arg Tyr Val Ala Arg Lys Cys Gly Cys 320 325 330 Arg Met Val Tyr Met Pro Gly Asp Val Pro Val Cys Ser Pro Gln 335 340 345 Gln Tyr Lys Asn Cys Ala His Pro Ala Ile Asp Ala Met Leu Arg 350 355 360 Lys Asp Ser Cys Ala Cys Pro Asn Pro Cys Ala Ser Thr Arg Tyr 365 370 375 Ala Lys Glu Leu Ser Met Val Arg Ile Pro Ser Arg Ala Ala Ala 380 385 390 Arg Phe Leu Ala Arg Lys Leu Asn Arg Ser Glu Ala Tyr Ile Ala 395 400 405 Glu Asn Val Leu Ala Leu Asp Ile Phe Phe Glu Ala Leu Asn Tyr 410 415 420 Glu Thr Val Glu Gln Lys Lys Ala Tyr Glu Met Ser Glu Leu Leu 425 430 435 Gly Asp Ile Gly Gly Gln Met Gly Leu Phe Ile Gly Ala Ser Leu 440 445 450 Leu Thr Ile Leu Glu Ile Leu Asp Tyr Leu Cys Glu Val Phe Arg 
455 460 465 Asp Lys Val Leu Gly Tyr Phe Trp Asn Arg Gln His Ser Gln Arg 470 475 480 His Ser Ser Thr Asn Leu Leu Gln Glu Gly Leu Gly Ser His Arg 485 490 495 Thr Gln Val Pro His Leu Ser Leu Gly Pro Arg Pro Pro Thr Pro 500 505 510 Pro Cys Ala Val Thr Lys Thr Leu Ser Ala Ser His Arg Thr Cys 515 520 525 Tyr Leu Val Thr Gln Leu 530 25 1104 DNA Homo sapien 25 ctcggtgcgc gaccccggct cagaggactc tttgctgtcc cgcaagatgc 50 ggatgctgct ggcgctcctg gccctctccg cggcgcggcc atcggccagt 100 gcagagtcac actggtgcta cgaggttcaa gccgagtcct ccaactaccc 150 ctgcttggtg ccagtcaagt ggggtggaaa ctgccagaag gaccgccagt 200 cccccatcaa catcgtcacc accaaggcaa aggtggacaa aaaactggga 250 cgcttcttct tctctggcta cgataagaag caaacgtgga ctgtccaaaa 300 taacgggcac tcagtgatga tgttgctgga gaacaaggcc agcatttctg 350 gaggaggact gcctgcccca taccaggcca aacagttgca cctgcactgg 400 tccgacttgc catataaggg ctcggagcac agcctcgatg gggagcactt 450 tgccatggag atgcacatag tacatgagaa agagaagggg acatcgagga 500 atgtgaaaga ggcccaggac cctgaagacg aaattgcggt gctggccttt 550 ctggtggagg ctggaaccca ggtgaacgag ggcttccagc cactggtgga 600 ggcactgtct aatatcccca aacctgagat gagcactacg atggcagaga 650 gcagcctgtt ggacctgctc cccaaggagg agaaactgag gcactacttc 700 cgctacctgg gctcactcac cacaccgacc tgcgatgaga aggtcgtctg 750 gactgtgttc cgggagccca ttcagcttca cagagaacag atcctggcat 800 tctctcagaa gctgtactac gacaaggaac agacagtgag catgaaggac 850 aatgtcaggc ccctgcagca gctggggcag cgcacggtga taaagtccgg 900 ggccccgggt cggccgctgc cctgggccct gcctgccctg ctgggcccca 950 tgctggcctg cctgctggcc ggcttcctgc gatgatggct cacttctgca 1000 cgcagcctct ctgttgcctc agctctccaa gttccaggct tccggtcctt 1050 agccttccca ggtgggactt taggcatgat taaaatatgg acatattttt 1100 ggag 1104 26 312 PRT Homo sapien 26 Met Arg Met Leu Leu Ala Leu Leu Ala Leu Ser Ala Ala Arg Pro 1 5 10 15 Ser Ala Ser Ala Glu Ser His Trp Cys Tyr Glu Val Gln Ala Glu 20 25 30 Ser Ser Asn Tyr Pro Cys Leu Val Pro Val Lys Trp Gly Gly Asn 35 40 45 Cys Gln Lys Asp Arg Gln Ser Pro Ile Asn Ile Val Thr Thr Lys 50 55 60 Ala Lys Val Asp Lys Lys Leu Gly 
Arg Phe Phe Phe Ser Gly Tyr 65 70 75 Asp Lys Lys Gln Thr Trp Thr Val Gln Asn Asn Gly His Ser Val 80 85 90 Met Met Leu Leu Glu Asn Lys Ala Ser Ile Ser Gly Gly Gly Leu 95 100 105 Pro Ala Pro Tyr Gln Ala Lys Gln Leu His Leu His Trp Ser Asp 110 115 120 Leu Pro Tyr Lys Gly Ser Glu His Ser Leu Asp Gly Glu His Phe 125 130 135 Ala Met Glu Met His Ile Val His Glu Lys Glu Lys Gly Thr Ser 140 145 150 Arg Asn Val Lys Glu Ala Gln Asp Pro Glu Asp Glu Ile Ala Val 155 160 165 Leu Ala Phe Leu Val Glu Ala Gly Thr Gln Val Asn Glu Gly Phe 170 175 180 Gln Pro Leu Val Glu Ala Leu Ser Asn Ile Pro Lys Pro Glu Met 185 190 195 Ser Thr Thr Met Ala Glu Ser Ser Leu Leu Asp Leu Leu Pro Lys
200 205 210 Glu Glu Lys Leu Arg His Tyr Phe Arg Tyr Leu Gly Ser Leu Thr 215 220 225 Thr Pro Thr Cys Asp Glu Lys Val Val Trp Thr Val Phe Arg Glu 230 235 240 Pro Ile Gln Leu His Arg Glu Gln Ile Leu Ala Phe Ser Gln Lys 245 250 255 Leu Tyr Tyr Asp Lys Glu Gln Thr Val Ser Met Lys Asp Asn Val 260 265 270 Arg Pro Leu Gln Gln Leu Gly Gln Arg Thr Val Ile Lys Ser Gly 275 280 285 Ala Pro Gly Arg Pro Leu Pro Trp Ala Leu Pro Ala Leu Leu Gly 290 295 300 Pro Met Leu Ala Cys Leu Leu Ala Gly Phe Leu Arg 305 310 27 585 DNA Homo sapien 27 tggtcatctc agttcttttc tcaccttgac tgcaagatga aactccttgt 50 gctagctgtg ctgctcacag tggccgccgc cgacagcggc atcagccctc 100 gggccgtgtg gcagttccgc aaaatgatca agtgcgtgat cccggggagt 150 gaccccttct tggaatacaa caactacggc tgctactgtg gcttgggggg 200 ctcaggcacc cccgtggatg aactggacaa gtgctgccag acacatgaca 250 actgctatga ccaggccaag aagctggaca gctgtaaatt tctgctggac 300 aacccgtaca cccacaccta ttcatactcg tgctctggct cggcaatcac 350 ctgtagcagc aaaaacaaag agtgtgaggc cttcatttgc aactgcgacc 400 gcaacgctgc catctgcttt tcaaaagctc catataacaa ggcacacaag 450 aacctggaca ccaagaagta ttgtcagagt tgaatatcac ctctcaaaag 500 catcacctct atctgcctca tctcacactg tactctccaa taaagcacct 550 tgttgaaaga cctcaaaaaa aaaaaaaaaa aaaaa 585 28 148 PRT Homo sapien 28 Met Lys Leu Leu Val Leu Ala Val Leu Leu Thr Val Ala Ala Ala 1 5 10 15 Asp Ser Gly Ile Ser Pro Arg Ala Val Trp Gln Phe Arg Lys Met 20 25 30 Ile Lys Cys Val Ile Pro Gly Ser Asp Pro Phe Leu Glu Tyr Asn 35 40 45 Asn Tyr Gly Cys Tyr Cys Gly Leu Gly Gly Ser Gly Thr Pro Val 50 55 60 Asp Glu Leu Asp Lys Cys Cys Gln Thr His Asp Asn Cys Tyr Asp 65 70 75 Gln Ala Lys Lys Leu Asp Ser Cys Lys Phe Leu Leu Asp Asn Pro 80 85 90 Tyr Thr His Thr Tyr Ser Tyr Ser Cys Ser Gly Ser Ala Ile Thr 95 100 105 Cys Ser Ser Lys Asn Lys Glu Cys Glu Ala Phe Ile Cys Asn Cys 110 115 120 Asp Arg Asn Ala Ala Ile Cys Phe Ser Lys Ala Pro Tyr Asn Lys 125 130 135 Ala His Lys Asn Leu Asp Thr Lys Lys Tyr Cys Gln Ser 140 145 29 2876 DNA Homo sapien 29 tgaaacctaa cccgccctgg ggaggcgcgc 
agcagaggct ccgattcggg 50 gcaggtgaga ggctgacttt ctctcggtgc gtccagtgga gctctgagtt 100 tcgaatcggc ggcggcggat tccccgcgcg cccggcgtcg gggcttccag 150 gaggatgcgg agccccagcg cggcgtggct gctgggggcc gccatcctgc 200 tagcagcctc tctctcctgc agtggcacca tccaaggaac caatagatcc 250 tctaaaggaa gaagccttat tggtaaggtt gatggcacat cccacgtcac 300 tggaaaagga gttacagttg aaacagtctt ttctgtggat gagttttctg 350 catctgtcct cactggaaaa ctgaccactg tcttccttcc aattgtctac 400 acaattgtgt ttgtggtggg tttgccaagt aacggcatgg ccctgtgggt 450 ctttcttttc cgaactaaga agaagcaccc tgctgtgatt tacatggcca 500 atctggcctt ggctgacctc ctctctgtca tctggttccc cttgaagatt 550 gcctatcaca tacatggcaa caactggatt tatggggaag ctctttgtaa 600 tgtgcttatt ggctttttct atggcaacat gtactgttcc attctcttca 650 tgacctgcct cagtgtgcag aggtattggg tcatcgtgaa ccccatgggg 700 cactccagga agaaggcaaa cattgccatt ggcatctccc tggcaatatg 750 gctgctgatt ctgctggtca ccatcccttt gtatgtcgtg aagcagacca 800 tcttcattcc tgccctgaac atcacgacct gtcatgatgt tttgcctgag 850 cagctcttgg tgggagacat gttcaattac ttcctctctc tggccattgg 900 ggtctttctg ttcccagcct tcctcacagc ctctgcctat gtgctgatga 950 tcagaatgct gcgatcttct gccatggatg aaaactcaga gaagaaaagg 1000 aagagggcca tcaaactcat tgtcactgtc ctggccatgt acctgatctg 1050 cttcactcct agtaaccttc tgcttgtggt gcattatttt ctgattaaga 1100 gccagggcca gagccatgtc tatgccctgt acattgtagc cctctgcctc 1150 tctaccctta acagctgcat cgaccccttt gtctattact ttgtttcaca 1200 tgatttcagg gatcatgcaa agaacgctct cctttgccga agtgtccgca 1250 ctgtaaagca gatgcaagta tccctcacct caaagaaaca ctccaggaaa 1300 tccagctctt actcttcaag ttcaaccact gttaagacct cctattgagt 1350 tttccaggtc ctcagatggg aattgcacag taggatgtgg aacctgttta 1400 atgttatgag gacgtgtctg ttatttccta atcaaaaagg tctcaccaca 1450 taccatgtgg atgcagcacc tctcaggatt gctaggagct cccctgtttg 1500 catgagaaaa gtagtccccc aaattaacat cagtgtctgt ttcagaatct 1550 ctctactcag atgaccccag aaactgaacc aacagaagca gacttttcag 1600 aagatggtga agacagaaac ccagtaactt gcaaaaagta gacttggtgt 1650 gaagactcac ttctcagctg aaattatata tatacacata tatatatttt 1700 acatctggga 
tcatgataga cttgttaggg cttcaaggcc ctcagagatg 1750 atcagtccaa ctgaacgacc ttacaaatga ggaaaccaag ataaatgagc 1800 tgccagaatc aggtttccaa tcaacagcag tgagttggga ttggacagta 1850 gaatttcaat gtccagtgag tgaggttctt gtaccacttc atcaaaatca 1900 tggatcttgg ctgggtgcgg tgcctcatgc ctgtaatcct agcactttgg 1950 gaggctgagg caggcaatca cttgaggtca ggagttcgag accagcctgg 2000 ccatcatggc gaaacctcat ctctactaaa aatacaaaag ttaaccaggt 2050 gtgtggtgca cgtttgtaat cccagttact caggaggctg aggcacaaga 2100 attgagtatc actttaactc aggaggcaga ggttgcagtg agccgagatt 2150 gcaccactgc actccagctt gggtgataaa ataaaataaa atagtcgtga 2200 atcttgttca aaatgcagat tcctcagatt caataatgag agctcagact 2250 gggaacaggg cccaggaatc tgtgtggtac aaacctgcat ggtgtttatg 2300 cacacagaga tttgagaacc attgttctga atgctgcttc catttgacaa 2350 agtgccgtga taatttttga aaagagaagc aaacaatggt gtctctttta 2400 tgttcagctt ataatgaaat ctgtttgttg acttattagg actttgaatt 2450 atttctttat taaccctctg agtttttgta tgtattatta ttaaagaaaa 2500 atgcaatcag gattttaaac atgtaaatac aaattttgta taacttttga 2550 tgacttcagt gaaattttca ggtagtctga gtaatagatt gttttgccac 2600 ttagaatagc atttgccact tagtatttta aaaaataatt gttggagtat 2650 ttattgtcag ttttgttcac ttgttatcta atacaaaatt ataaagcctt 2700 cagagggttt ggaccacatc tctttggaaa atagtttgca acatatttaa 2750 gagatacttg atgccaaaat gactttatac aacgattgta tttgtgactt 2800 ttaaaaataa ttattttatt gtgtaattga tttataaata acaaaatttt 2850 ttttacaact taaaaaaaaa aaaaaa 2876 30 397 PRT Homo sapien 30 Met Arg Ser Pro Ser Ala Ala Trp Leu Leu Gly Ala Ala Ile Leu 1 5 10 15 Leu Ala Ala Ser Leu Ser Cys Ser Gly Thr Ile Gln Gly Thr Asn 20 25 30 Arg Ser Ser Lys Gly Arg Ser Leu Ile Gly Lys Val Asp Gly Thr 35 40 45 Ser His Val Thr Gly Lys Gly Val Thr Val Glu Thr Val Phe Ser 50 55 60 Val Asp Glu Phe Ser Ala Ser Val Leu Thr Gly Lys Leu Thr Thr 65 70 75 Val Phe Leu Pro Ile Val Tyr Thr Ile Val Phe Val Val Gly Leu 80 85 90 Pro Ser Asn Gly Met Ala Leu Trp Val Phe Leu Phe Arg Thr Lys 95 100 105 Lys Lys His Pro Ala Val Ile Tyr Met Ala Asn Leu Ala Leu Ala 110 115 120 Asp Leu Leu 
Ser Val Ile Trp Phe Pro Leu Lys Ile Ala Tyr His 125 130 135 Ile His Gly Asn Asn Trp Ile Tyr Gly Glu Ala Leu Cys Asn Val 140 145 150 Leu Ile Gly Phe Phe Tyr Gly Asn Met Tyr Cys Ser Ile Leu Phe 155 160 165 Met Thr Cys Leu Ser Val Gln Arg Tyr Trp Val Ile Val Asn Pro 170 175 180 Met Gly His Ser Arg Lys Lys Ala Asn Ile Ala Ile Gly Ile Ser 185 190 195 Leu Ala Ile Trp Leu Leu Ile Leu Leu Val Thr Ile Pro Leu Tyr 200 205 210 Val Val Lys Gln Thr Ile Phe Ile Pro Ala Leu Asn Ile Thr Thr 215 220 225 Cys His Asp Val Leu Pro Glu Gln Leu Leu Val Gly Asp Met Phe 230 235 240 Asn Tyr Phe Leu Ser Leu Ala Ile Gly Val Phe Leu Phe Pro Ala 245 250 255 Phe Leu Thr Ala Ser Ala Tyr Val Leu Met Ile Arg Met Leu Arg 260 265 270 Ser Ser Ala Met Asp Glu Asn Ser Glu Lys Lys Arg Lys Arg Ala 275 280 285 Ile Lys Leu Ile Val Thr Val Leu Ala Met Tyr Leu Ile Cys Phe 290 295 300 Thr Pro Ser Asn Leu Leu Leu Val Val His Tyr Phe Leu Ile Lys 305 310 315 Ser Gln Gly Gln Ser His Val Tyr Ala Leu Tyr Ile Val Ala Leu 320 325 330 Cys Leu Ser Thr Leu Asn Ser Cys Ile Asp Pro Phe Val Tyr Tyr 335 340 345 Phe Val Ser His Asp Phe Arg Asp His Ala Lys Asn Ala Leu Leu 350 355 360 Cys Arg Ser Val Arg Thr Val Lys Gln Met Gln Val Ser Leu Thr 365 370 375 Ser Lys Lys His Ser Arg Lys Ser Ser Ser Tyr Ser Ser Ser Ser 380 385 390 Thr Thr Val Lys Thr Ser Tyr 395 31 3279 DNA Homo sapien 31 ccggctcgaa gcgcaacgag gaagcgtttg cggtgatccc ggcgactgcg 50 ctggctaatg cggtaccggc tagcgtggct tctgcacccc gcactgccca 100 gcaccttccg ctcagtcctc ggcgcccgcc tgccgcctcc ggagcgcctg 150 tgtggtttcc aaaaaaagac ttacagcaaa atgaataatc cagccatcaa 200 gagaatagga aatcacatta ccaagtctcc tgaagacaag cgagaatatc 250 gagggctaga gctggccaat ggtatcaaag tacttcttat gagtgatccc 300 accacggata agtcatcagc agcacttgat gtgcacatag gttcattgtc 350 ggatcctcca aatattgctg gcttaagtca tttttgtgaa catatgcttt 400 ttttgggaac aaagaaatac cctaaagaaa atgaatacag ccagtttctc 450 agtgagcatg caggaagttc aaatgccttt actagtggag agcataccaa 500 ttactatttt gatgtttctc atgaacacct agaaggtgcc ctagacaggt 550 
ttgcacagtt ttttctgtgc cccttgttcg atgaaagttg caaagacaga 600 gaggtgaatg cagttgattc agaacatgag aagaatgtga tgaatgatgc 650 ctggagactc tttcaattgg aaaaagctac agggaatcct aaacacccct 700 tcagtaaatt tgggacaggt aacaaatata ctctggagac tagaccaaac 750 caagaaggca ttgatgtaag acaagagcta ctgaaattcc attctgctta 800 ctattcatcc aacttaatgg ctgtttgtgt tttaggtcga gaatctttag 850 atgacttgac taatctggtg gtaaagttat tttctgaagt agagaacaaa 900 aatgttccat tgccagaatt tcctgaacac cctttccaag aagaacatct 950 taaacaactt tacaaaatag tacccattaa agatattagg aatctctatg 1000 tgacatttcc catacctgac cttcagaaat actacaaatc aaatcctggt 1050 cattatcttg gtcatctcat tgggcatgaa ggtcctggaa gtctgttatc 1100 agaacttaag tcaaagggct gggttaatac tcttgttggt gggcagaagg 1150 aaggagcccg aggttttatg ttttttatca ttaatgtgga cttgaccgag 1200 gaaggattat tacatgttga agatataatt ttgcacatgt ttcaatacat 1250 tcagaagtta cgtgcagaag gacctcaaga atgggttttc caagagtgca 1300 aggacttgaa tgctgttgct tttaggttta aagacaaaga gaggccacgg 1350 ggctatacat ctaagattgc aggaatattg cattattatc ccctagaaga 1400 ggtgctcaca gcggaatatt tactggaaga atttagacct gacttaatag 1450 agatggttct cgataaactc agaccagaaa atgtccgggt tgccatagtt 1500 tctaaatctt ttgaaggaaa aactgatcgc acagaagagt ggtatggaac 1550 ccagtacaaa caagaagcta taccggatga agtcatcaag aaatggcaaa 1600 atgctgacct gaatgggaaa tttaaacttc ctacaaagaa tgaatttatt 1650 cctacgaatt ttgagatttt accgttagaa aaagaggcga caccataccc 1700 tgctcttatt aaggatacag tcatgagcaa actttggttc aaacaagatg 1750 ataagaaaaa aaagccgaag gcttgtctca actttgaatt tttcagccca 1800 tttgcttatg tggacccctt gcactgtaac atggcctatt tgtaccttga 1850 gctcctcaaa gactcactca acgagtatgc atatgcagca gagctagcag 1900 gcttgagcta tgatctccaa aataccatct atgggatgta tctttcagtg 1950 aaaggttaca atgacaagca gccaatttta ctaaagaaga ttattgagaa 2000 aatggctacc tttgagattg atgaaaaaag atttgaaatt atcaaagaag 2050 catatatgcg atctcttaac aatttccggg ctgaacagcc tcaccagcat 2100 gccatgtact acctccgctt gctgatgact gaagtggcct ggactaaaga 2150 tgagttaaaa gaagctctgg atgatgtaac ccttcctcgc cttaaggcct 2200 tcatacctca gctcctgtca 
cggctgcaca ttgaagccct tctccatgga 2250 aacataacaa agcaggctgc attaggaatt atgcagatgg ttgaagacac 2300 cctcattgaa catgctcata ccaaacctct ccttccaagt cagctggttc 2350 ggtatagaga agttcagctc cctgacagag gatggtttgt ttatcagcag 2400 agaaatgaag ttcacaataa ctgtggcatc gagatatact accaaacaga 2450 catgcaaagc acctcagaga atatgtttct ggagctcttc tgtcagatta 2500 tctcggaacc ttgcttcaac accctgcgca ccaaggagca gttgggctat 2550 atcgtcttca gcgggccacg tcgagctaat ggcatacaga gcttgagatt 2600 catcatccag tcagaaaagc cacctcacta cctagaaagc agagtggaag 2650 ctttcttaat taccatggaa aagtccatag aggacatgac agaagaggcc 2700 ttccaaaaac acattcaggc attagcaatt cgtcgactag acaaaccaaa 2750 gaagctatct gctgagtgtg ctaaatactg gggagaaatc atctcccagc 2800 aatataattt tgacagagat aacactgagg ttgcatattt aaagacactt 2850 accaaggaag atatcatcaa attctacaag gaaatgttgg cagtagatgc 2900 tccaaggaga cataaggtat ccgtccatgt tcttgccagg gaaatggatt 2950 cttgtcctgt tgttggagag ttcccatgtc aaaatgacat aaatttgtca 3000 caagcaccag ccttgccaca acctgaagtg attcagaaca tgaccgaatt 3050 caagcgtggt ctgccactgt ttccccttgt gaaaccacat attaacttca 3100 tggctgcaaa actctgaaga ttccccatgc atgggaaagt gcaagtggat 3150 gcattcctga gtcttccaga gcctaagaaa atcatcttgg ccactttaat 3200 agtttctgat tcactattag agaaacaaac aaaaaattgt caaatgtcat 3250 tatgtagaaa tattataaat ccaaagtaa 3279 32 1019 PRT Homo sapien 32 Met Arg Tyr Arg Leu Ala Trp Leu Leu His Pro Ala Leu Pro Ser 1 5 10 15 Thr Phe Arg Ser Val Leu Gly Ala Arg Leu Pro Pro Pro Glu Arg 20 25 30 Leu Cys Gly Phe Gln Lys Lys Thr Tyr Ser Lys Met Asn Asn Pro 35 40 45 Ala Ile Lys Arg Ile Gly Asn His Ile Thr Lys Ser Pro Glu Asp 50 55 60 Lys Arg Glu Tyr Arg Gly Leu Glu Leu Ala Asn Gly Ile Lys Val 65 70 75 Leu Leu Met Ser Asp Pro Thr Thr Asp Lys Ser Ser Ala Ala Leu 80 85 90 Asp Val His Ile Gly Ser Leu Ser Asp Pro Pro Asn Ile Ala Gly 95 100 105 Leu Ser His Phe Cys Glu His Met Leu Phe Leu Gly Thr Lys Lys 110 115 120 Tyr Pro Lys Glu Asn Glu Tyr Ser Gln Phe Leu Ser Glu His Ala 125 130 135 Gly Ser Ser Asn Ala Phe Thr Ser Gly Glu His Thr Asn Tyr Tyr 140 145 
150 Phe Asp Val Ser His Glu His Leu Glu Gly Ala Leu Asp Arg Phe 155 160 165 Ala Gln Phe Phe Leu Cys Pro Leu Phe Asp Glu Ser Cys Lys Asp 170 175 180 Arg Glu Val Asn Ala Val Asp Ser Glu His Glu Lys Asn Val Met 185 190 195 Asn Asp Ala Trp Arg Leu Phe Gln Leu Glu Lys Ala Thr Gly Asn 200 205 210 Pro Lys His Pro Phe Ser Lys Phe Gly Thr Gly Asn Lys Tyr Thr 215 220 225 Leu Glu Thr Arg Pro Asn Gln Glu Gly Ile Asp Val Arg Gln Glu 230 235 240 Leu Leu Lys Phe His Ser Ala Tyr Tyr Ser Ser Asn Leu Met Ala 245 250 255 Val Cys Val Leu Gly Arg Glu Ser Leu Asp Asp Leu Thr Asn Leu 260 265 270 Val Val Lys Leu Phe Ser Glu Val Glu Asn Lys Asn Val Pro Leu 275 280 285 Pro Glu Phe Pro Glu His Pro Phe Gln Glu Glu His Leu Lys Gln 290 295 300 Leu Tyr Lys Ile Val Pro Ile Lys Asp Ile Arg Asn Leu Tyr Val 305 310 315 Thr Phe Pro Ile Pro Asp Leu Gln Lys Tyr Tyr Lys Ser Asn Pro 320 325 330 Gly His Tyr Leu Gly His Leu Ile Gly His Glu Gly Pro Gly Ser 335 340 345 Leu Leu Ser Glu Leu Lys Ser Lys Gly Trp Val Asn Thr Leu Val 350 355 360 Gly Gly Gln Lys Glu Gly Ala Arg Gly Phe Met Phe Phe Ile Ile 365 370 375 Asn Val Asp Leu Thr Glu Glu Gly Leu Leu His Val Glu Asp Ile
380 385 390 Ile Leu His Met Phe Gln Tyr Ile Gln Lys Leu Arg Ala Glu Gly 395 400 405 Pro Gln Glu Trp Val Phe Gln Glu Cys Lys Asp Leu Asn Ala Val 410 415 420 Ala Phe Arg Phe Lys Asp Lys Glu Arg Pro Arg Gly Tyr Thr Ser 425 430 435 Lys Ile Ala Gly Ile Leu His Tyr Tyr Pro Leu Glu Glu Val Leu 440 445 450 Thr Ala Glu Tyr Leu Leu Glu Glu Phe Arg Pro Asp Leu Ile Glu 455 460 465 Met Val Leu Asp Lys Leu Arg Pro Glu Asn Val Arg Val Ala Ile 470 475 480 Val Ser Lys Ser Phe Glu Gly Lys Thr Asp Arg Thr Glu Glu Trp 485 490 495 Tyr Gly Thr Gln Tyr Lys Gln Glu Ala Ile Pro Asp Glu Val Ile 500 505 510 Lys Lys Trp Gln Asn Ala Asp Leu Asn Gly Lys Phe Lys Leu Pro 515 520 525 Thr Lys Asn Glu Phe Ile Pro Thr Asn Phe Glu Ile Leu Pro Leu 530 535 540 Glu Lys Glu Ala Thr Pro Tyr Pro Ala Leu Ile Lys Asp Thr Val 545 550 555 Met Ser Lys Leu Trp Phe Lys Gln Asp Asp Lys Lys Lys Lys Pro 560 565 570 Lys Ala Cys Leu Asn Phe Glu Phe Phe Ser Pro Phe Ala Tyr Val 575 580 585 Asp Pro Leu His Cys Asn Met Ala Tyr Leu Tyr Leu Glu Leu Leu 590 595 600 Lys Asp Ser Leu Asn Glu Tyr Ala Tyr Ala Ala Glu Leu Ala Gly 605 610 615 Leu Ser Tyr Asp Leu Gln Asn Thr Ile Tyr Gly Met Tyr Leu Ser 620 625 630 Val Lys Gly Tyr Asn Asp Lys Gln Pro Ile Leu Leu Lys Lys Ile 635 640 645 Ile Glu Lys Met Ala Thr Phe Glu Ile Asp Glu Lys Arg Phe Glu 650 655 660 Ile Ile Lys Glu Ala Tyr Met Arg Ser Leu Asn Asn Phe Arg Ala 665 670 675 Glu Gln Pro His Gln His Ala Met Tyr Tyr Leu Arg Leu Leu Met 680 685 690 Thr Glu Val Ala Trp Thr Lys Asp Glu Leu Lys Glu Ala Leu Asp 695 700 705 Asp Val Thr Leu Pro Arg Leu Lys Ala Phe Ile Pro Gln Leu Leu 710 715 720 Ser Arg Leu His Ile Glu Ala Leu Leu His Gly Asn Ile Thr Lys 725 730 735 Gln Ala Ala Leu Gly Ile Met Gln Met Val Glu Asp Thr Leu Ile 740 745 750 Glu His Ala His Thr Lys Pro Leu Leu Pro Ser Gln Leu Val Arg 755 760 765 Tyr Arg Glu Val Gln Leu Pro Asp Arg Gly Trp Phe Val Tyr Gln 770 775 780 Gln Arg Asn Glu Val His Asn Asn Cys Gly Ile Glu Ile Tyr Tyr 785 790 795 Gln Thr Asp Met Gln Ser Thr Ser Glu Asn Met 
Phe Leu Glu Leu 800 805 810 Phe Cys Gln Ile Ile Ser Glu Pro Cys Phe Asn Thr Leu Arg Thr 815 820 825 Lys Glu Gln Leu Gly Tyr Ile Val Phe Ser Gly Pro Arg Arg Ala 830 835 840 Asn Gly Ile Gln Ser Leu Arg Phe Ile Ile Gln Ser Glu Lys Pro 845 850 855 Pro His Tyr Leu Glu Ser Arg Val Glu Ala Phe Leu Ile Thr Met 860 865 870 Glu Lys Ser Ile Glu Asp Met Thr Glu Glu Ala Phe Gln Lys His 875 880 885 Ile Gln Ala Leu Ala Ile Arg Arg Leu Asp Lys Pro Lys Lys Leu 890 895 900 Ser Ala Glu Cys Ala Lys Tyr Trp Gly Glu Ile Ile Ser Gln Gln 905 910 915 Tyr Asn Phe Asp Arg Asp Asn Thr Glu Val Ala Tyr Leu Lys Thr 920 925 930 Leu Thr Lys Glu Asp Ile Ile Lys Phe Tyr Lys Glu Met Leu Ala 935 940 945 Val Asp Ala Pro Arg Arg His Lys Val Ser Val His Val Leu Ala 950 955 960 Arg Glu Met Asp Ser Cys Pro Val Val Gly Glu Phe Pro Cys Gln 965 970 975 Asn Asp Ile Asn Leu Ser Gln Ala Pro Ala Leu Pro Gln Pro Glu 980 985 990 Val Ile Gln Asn Met Thr Glu Phe Lys Arg Gly Leu Pro Leu Phe 995 1000 1005 Pro Leu Val Lys Pro His Ile Asn Phe Met Ala Ala Lys Leu 1010 1015 33 3624 DNA Homo sapien 33 cagggagcct gggctggaag aggcagcaaa agggaaaatc agaagagtgg 50 acactggcaa gaggagggca gcctttttcc cagcttcctt gcaccatgga 100 cagctcccat taagccacct ctccatcctg gggccaggac tcttatgccc 150 cattcctgtc aaattgagat ttcatccacc attctccaag gacagtgaag 200 ttatacccta gttccagtgt tgggatcagt ggcccctctg gacatgcctc 250 tcctggaagg ttctgtgggg gtggaggatc ttgtcctcct ggaacccttg 300 gtggaggagt cactgctcaa gaatcttcag cttcgctatg aaaacaagga 350 gatttatacc tacattggga atgtggtgat ctcagtgaat ccctatcaac 400 agcttcccat ctatgggcca gagttcattg ccaaatatca agactatact 450 ttctatgagc tgaagcccca tatctacgca ttggcaaatg tggcgtacca 500 gtcactgagg gacagggacc gagaccagtg tatcctcatc acaggcgaga 550 gtggatcagg gaagactgag gccagcaagc tggtgatgtc ttatgtggct 600 gccgtctgtg ggaaaggaga gcaggtgaac tctgtgaagg agcagctgct 650 acagtctaac ccagtgctgg aggcttttgg caatgccaag accattcgca 700 acaacaattc ctcccgattt ggaaaataca tggatattga atttgacttc 750 aagggatccc ccctcggtgg tgtcatcaca aactatctgc ttgagaaatc 
800 ccgattagtg aagcagctca aaggagaaag gaacttccac atcttctatc 850 agctgctggc tggagcagat gaacagctgc tgaaggccct gaagcttgag 900 cgggatacaa ctggctatgc ctatctgaat catgaagtat ccagagtgga 950 tggcatggac gacgcctcca gcttcagggc tgtacagagt gcaatggcag 1000 tgattgggtt ctcggaggag gagattcgac aagtgctaga ggtgacatcc 1050 atggtgctaa agctggggaa cgtgttggtg gctgatgagt tccaggccag 1100 tgggatacca gcaagtggca tccgtgatgg gagaggtgtt cgggagattg 1150 gggagatggt gggcttgaat tcagaagaag tagagagagc tttgtgctcg 1200 aggaccatgg aaacagccaa ggaaaaggtg gtcactgcac tgaatgttat 1250 gcaggctcag tatgctcggg acgccctggc taagaacatc tacagccgcc 1300 tctttgactg gatagtgaat cgaatcaatg agagcatcaa ggtgggcatc 1350 ggggaaaaga agaaggtaat gggagtcctt gatatctacg gttttgagat 1400 attagaggat aatagctttg agcaatttgt gatcaactac tgcaatgaga 1450 agctgcagca ggtgttcata gagatgaccc tgaaagaaga gcaagaggaa 1500 tataagagag aaggcatacc gtggacaaag gtggactact ttgataatgg 1550 catcatttgt aagctcattg agcataatca gcgaggtatc ctggccatgt 1600 tggatgagga gtgcctgcgg cctggggtgg tcagtgactc cactttccta 1650 gcaaagctga accagctctt ctccaagcat ggccactacg agagcaaagt 1700 cacccagaat gcccagcgtc agtatgacca caccatgggc ctcagctgct 1750 tccgcatctg ccactatgcg ggcaaggtga catacaacgt gaccagcttt 1800 attgacaaga ataatgacct actcttccga gacctgttgc aggccatgtg 1850 gaaggcccag caccccctcc ttcggtcctt gtttcctgag ggcaatccta 1900 agcaggcatc tctcaaacgc cccccgactg ctggggccca gttcaagagt 1950 tctgtggcca tcctcatgaa gaatctgtat tccaagagcc ccaactacat 2000 caggtgcata aagcccaatg agcatcagca gcgaggtcag ttctcttcag 2050 acctggtggc aacccaggct cggtacctgg gactgctgga gaacgtacgg 2100 gtgcgacggg caggctatgc ccaccgccag ggttatgggc ccttcctgga 2150 aaggtaccga ttgctgagcc ggagcacctg gcctcactgg aatgggggag 2200 accgggaagg tgttgagaag gtcctggggg agctgagcat gtcctcgggg 2250 gagctggcct ttggcaagac aaagatcttc attagaagcc ccaagactct 2300 tttctacctc gaagaacaga ggcgcctgag actccagcag ctggccacac 2350 tcatacagaa gatttaccga ggctggcgct gccgcaccca ctaccaactg 2400 atgcgaaaga gtcagatcct catctcctct tggtttcggg gaaacatgca 2450 aaagaaatgc 
tatgggaaga taaaggcatc cgtgttattg atccaggctt 2500 ttgtgagagg gtggaaggcc cgaaagaatt atcgcaaata tttccggtca 2550 gaggctgccc tcaccttggc agatttcatc tacaagagca tggtacagaa 2600 attcctactg gggctgaaga acaatttgcc atccacaaac gtcttagaca 2650 agacatggcc agccgccccc tacaagtgcc tcagcacagc aaatcaggag 2700 ctgcagcagc tcttctacca gtggaagtgc aagaggttcc gggatcagct 2750 gtccccgaag caggtagaga tcctgaggga aaagctctgt gccagtgaac 2800 tgttcaaggg caagaaggct tcatatcccc agagtgtccc cattccattc 2850 tgtggtgact acattgggct gcaagggaac cccaagctgc agaagctgaa 2900 aggcggggag gaggggcctg ttctgatggc agaggccgtg aagaaggtca 2950 atcgtggcaa tggcaagact tcttctcgga ttctcctcct gaccaagggc 3000 catgtgattc tcacagacac caagaagtcc caggccaaaa ttgtcattgg 3050 gctagacaat gtggctgggg tgtcagtcac cagcctcaag gatgggctct 3100 ttagcttgca tctgagtgag atgtcatcgg tgggctccaa gggggacttc 3150 ctgctggtca gcgagcatgt gattgaactg ctgaccaaaa tgtaccgggc 3200 tgtgctggat gccacgcaga ggcagcttac agtcaccgtg actgagaagt 3250 tctcagtgag gttcaaggag aacagtgtgg ctgtcaaggt cgtccagggc 3300 cctgcaggtg gtgacaacag caagctacgc tacaaaaaaa aggggagtca 3350 ttgcttggag gtgactgtgc agtgaggagg gggcaccatg cagagatggc 3400 agttgcttcc tcctgaacca gcactaatcc ccctctgccc tcctgtgtgg 3450 gaggatctct aacccctctg atcgtggcgc atggcttggg gattaaacta 3500 cccttgaaga ggacccttgt cccaaaccct tcttgttctc tcctccaaaa 3550 gtagcttcct ccaacccgca gcctctctgc acactaataa aacatgtggc 3600 ttggaaaggt tcaaaaaaaa aaaa 3624 34 1043 PRT Homo sapien 34 Met Pro Leu Leu Glu Gly Ser Val Gly Val Glu Asp Leu Val Leu 1 5 10 15 Leu Glu Pro Leu Val Glu Glu Ser Leu Leu Lys Asn Leu Gln Leu 20 25 30 Arg Tyr Glu Asn Lys Glu Ile Tyr Thr Tyr Ile Gly Asn Val Val 35 40 45 Ile Ser Val Asn Pro Tyr Gln Gln Leu Pro Ile Tyr Gly Pro Glu 50 55 60 Phe Ile Ala Lys Tyr Gln Asp Tyr Thr Phe Tyr Glu Leu Lys Pro 65 70 75 His Ile Tyr Ala Leu Ala Asn Val Ala Tyr Gln Ser Leu Arg Asp 80 85 90 Arg Asp Arg Asp Gln Cys Ile Leu Ile Thr Gly Glu Ser Gly Ser 95 100 105 Gly Lys Thr Glu Ala Ser Lys Leu Val Met Ser Tyr Val Ala Ala 110 115 120 Val Cys Gly 
Lys Gly Glu Gln Val Asn Ser Val Lys Glu Gln Leu 125 130 135 Leu Gln Ser Asn Pro Val Leu Glu Ala Phe Gly Asn Ala Lys Thr 140 145 150 Ile Arg Asn Asn Asn Ser Ser Arg Phe Gly Lys Tyr Met Asp Ile 155 160 165 Glu Phe Asp Phe Lys Gly Ser Pro Leu Gly Gly Val Ile Thr Asn 170 175 180 Tyr Leu Leu Glu Lys Ser Arg Leu Val Lys Gln Leu Lys Gly Glu 185 190 195 Arg Asn Phe His Ile Phe Tyr Gln Leu Leu Ala Gly Ala Asp Glu 200 205 210 Gln Leu Leu Lys Ala Leu Lys Leu Glu Arg Asp Thr Thr Gly Tyr 215 220 225 Ala Tyr Leu Asn His Glu Val Ser Arg Val Asp Gly Met Asp Asp 230 235 240 Ala Ser Ser Phe Arg Ala Val Gln Ser Ala Met Ala Val Ile Gly 245 250 255 Phe Ser Glu Glu Glu Ile Arg Gln Val Leu Glu Val Thr Ser Met 260 265 270 Val Leu Lys Leu Gly Asn Val Leu Val Ala Asp Glu Phe Gln Ala 275 280 285 Ser Gly Ile Pro Ala Ser Gly Ile Arg Asp Gly Arg Gly Val Arg 290 295 300 Glu Ile Gly Glu Met Val Gly Leu Asn Ser Glu Glu Val Glu Arg 305 310 315 Ala Leu Cys Ser Arg Thr Met Glu Thr Ala Lys Glu Lys Val Val 320 325 330 Thr Ala Leu Asn Val Met Gln Ala Gln Tyr Ala Arg Asp Ala Leu 335 340 345 Ala Lys Asn Ile Tyr Ser Arg Leu Phe Asp Trp Ile Val Asn Arg 350 355 360 Ile Asn Glu Ser Ile Lys Val Gly Ile Gly Glu Lys Lys Lys Val 365 370 375 Met Gly Val Leu Asp Ile Tyr Gly Phe Glu Ile Leu Glu Asp Asn 380 385 390 Ser Phe Glu Gln Phe Val Ile Asn Tyr Cys Asn Glu Lys Leu Gln 395 400 405 Gln Val Phe Ile Glu Met Thr Leu Lys Glu Glu Gln Glu Glu Tyr 410 415 420 Lys Arg Glu Gly Ile Pro Trp Thr Lys Val Asp Tyr Phe Asp Asn 425 430 435 Gly Ile Ile Cys Lys Leu Ile Glu His Asn Gln Arg Gly Ile Leu 440 445 450 Ala Met Leu Asp Glu Glu Cys Leu Arg Pro Gly Val Val Ser Asp 455 460 465 Ser Thr Phe Leu Ala Lys Leu Asn Gln Leu Phe Ser Lys His Gly 470 475 480 His Tyr Glu Ser Lys Val Thr Gln Asn Ala Gln Arg Gln Tyr Asp 485 490 495 His Thr Met Gly Leu Ser Cys Phe Arg Ile Cys His Tyr Ala Gly 500 505 510 Lys Val Thr Tyr Asn Val Thr Ser Phe Ile Asp Lys Asn Asn Asp 515 520 525 Leu Leu Phe Arg Asp Leu Leu Gln Ala Met Trp Lys Ala Gln His 530 535 
540 Pro Leu Leu Arg Ser Leu Phe Pro Glu Gly Asn Pro Lys Gln Ala 545 550 555 Ser Leu Lys Arg Pro Pro Thr Ala Gly Ala Gln Phe Lys Ser Ser 560 565 570 Val Ala Ile Leu Met Lys Asn Leu Tyr Ser Lys Ser Pro Asn Tyr 575 580 585 Ile Arg Cys Ile Lys Pro Asn Glu His Gln Gln Arg Gly Gln Phe 590 595 600 Ser Ser Asp Leu Val Ala Thr Gln Ala Arg Tyr Leu Gly Leu Leu 605 610 615 Glu Asn Val Arg Val Arg Arg Ala Gly Tyr Ala His Arg Gln Gly 620 625 630 Tyr Gly Pro Phe Leu Glu Arg Tyr Arg Leu Leu Ser Arg Ser Thr 635 640 645 Trp Pro His Trp Asn Gly Gly Asp Arg Glu Gly Val Glu Lys Val 650 655 660 Leu Gly Glu Leu Ser Met Ser Ser Gly Glu Leu Ala Phe Gly Lys 665 670 675 Thr Lys Ile Phe Ile Arg Ser Pro Lys Thr Leu Phe Tyr Leu Glu 680 685 690 Glu Gln Arg Arg Leu Arg Leu Gln Gln Leu Ala Thr Leu Ile Gln 695 700 705 Lys Ile Tyr Arg Gly Trp Arg Cys Arg Thr His Tyr Gln Leu Met 710 715 720 Arg Lys Ser Gln Ile Leu Ile Ser Ser Trp Phe Arg Gly Asn Met 725 730 735 Gln Lys Lys Cys Tyr Gly Lys Ile Lys Ala Ser Val Leu Leu Ile 740 745 750 Gln Ala Phe Val Arg Gly Trp Lys Ala Arg Lys Asn Tyr Arg Lys 755 760 765 Tyr Phe Arg Ser Glu Ala Ala Leu Thr Leu Ala Asp Phe Ile Tyr 770 775 780 Lys Ser Met Val Gln Lys Phe Leu Leu Gly Leu Lys Asn Asn Leu 785 790 795 Pro Ser Thr Asn Val Leu Asp Lys Thr Trp Pro Ala Ala Pro Tyr 800 805 810 Lys Cys Leu Ser Thr Ala Asn Gln Glu Leu Gln Gln Leu Phe Tyr 815 820 825 Gln Trp Lys Cys Lys Arg Phe Arg Asp Gln Leu Ser Pro Lys Gln 830 835 840 Val Glu Ile Leu Arg Glu Lys Leu Cys Ala Ser Glu Leu Phe Lys 845 850 855 Gly Lys Lys Ala Ser Tyr Pro Gln Ser Val Pro Ile Pro Phe Cys 860 865 870 Gly Asp Tyr Ile Gly Leu Gln Gly Asn Pro Lys Leu Gln Lys Leu 875 880 885 Lys Gly Gly Glu Glu Gly Pro Val Leu Met Ala Glu Ala Val Lys 890 895 900 Lys Val Asn Arg Gly Asn Gly Lys Thr Ser Ser Arg Ile Leu Leu 905 910 915 Leu Thr Lys Gly His Val Ile Leu Thr Asp Thr Lys Lys Ser Gln 920 925 930 Ala Lys Ile Val Ile Gly Leu Asp Asn Val Ala Gly Val Ser Val 935 940 945 Thr Ser Leu Lys Asp Gly Leu Phe Ser Leu His Leu Ser 
Glu Met 950 955 960 Ser Ser Val Gly Ser Lys Gly Asp Phe Leu Leu Val Ser Glu His 965 970 975 Val Ile Glu Leu Leu Thr Lys Met Tyr Arg Ala Val Leu Asp Ala 980 985
990 Thr Gln Arg Gln Leu Thr Val Thr Val Thr Glu Lys Phe Ser Val 995 1000 1005 Arg Phe Lys Glu Asn Ser Val Ala Val Lys Val Val Gln Gly Pro 1010 1015 1020 Ala Gly Gly Asp Asn Ser Lys Leu Arg Tyr Lys Lys Lys Gly Ser 1025 1030 1035 His Cys Leu Glu Val Thr Val Gln 1040 35 1876 DNA Homo sapien 35 gagccatgct cgcggcgatg ggctctctgg cggctgccct ctgggcagtg 50 gtccatcctc ggactctcct actgggcact gtcgcctttc tgctcgctgc 100 tgactttctc aaaagacggc gcccaaagaa ctacccgccg gggccctggc 150 gcctgccctt ccttggcaac ttcttccttg tggacttcga gcagtcgcac 200 ctggaggttc agctgtttgt gaagaaatat gggaaccttt ttagcttgga 250 gcttggtgac atatctgcag ttcttattac tggcttgccc ttaatcaaag 300 aagcccttat ccacatggac caaaactttg ggaaccgccc cgtgacccct 350 atgcgagaac atatctttaa gaaaaatgga ttgattatgt caagtggcca 400 ggcatggaag gagcaaagaa ggttcactct gacagcacta aggaactttg 450 gtttaggaaa gaagagctta gaggaacgca ttcaggagga ggcccaacac 500 ctcactgaag caataaaaga ggagaacgga cagccttttg accctcattt 550 caagatcaac aatgcagttt ccaatatcat ttgctccatc accttcggag 600 aacgctttga gtaccaggat agttggtttc agcagctgct gaagttacta 650 gatgaagtca catacttgga ggcttcaaag acatgccagc tctacaatgt 700 ctttccatgg ataatgaaat tcctgcctgg accccaccaa actctcttca 750 gcaactggaa aaaactgaaa ttgtttgttt ctcatatgat tgacaaacac 800 agaaaggatt ggaatcctgc agaaacaaga gactttattg atgcttacct 850 taaagaaatg tcaaagcaca caggcaatcc tacttcaagt ttccatgaag 900 aaaacctcat ctgcagcacc ctggacctct tctttgccgg aaccgagaca 950 acttccacaa ctctgcgatg ggctctgctt tatatggccc tctacccaga 1000 aatccaagaa aaagtacaag ctgagattga cagagtgatt ggccaggggc 1050 agcagccgag cacagccgcc cgggagtcca tgccctacac caatgctgtc 1100 atccatgagg tgcagagaat gggcaacatc atccccctga acgttcccag 1150 ggaagtgaca gttgatacca ctttggctgg gtaccacctg cccaagggta 1200 ccatgatcct gaccaatttg acggcgctgc acagggaccc cacagagtgg 1250 gccacccctg acacattcaa tccggaccat tttctggaga atggacagtt 1300 taagaaaagg gaagccttta tgcctttctc aataggaaag cgggcatgcc 1350 tcggagaaca gttggccagg actgagctgt ttattttctt cacttccctt 1400 atgcaaaaat ttaccttcag gcccccaaac aatgagaagc 
tgagcctgaa 1450 gtttagaatg ggtatcacca tttccccagt cagtcaccgc ctctgcgctg 1500 ttcctcaggt gtaatattgt taagaaagaa aggggcaagg aaagtaagaa 1550 gacatggcac gtgttctgaa accactggtg tctgctcaga tgtgttggga 1600 caaaatgaaa gtgactttca agaaagatca gaggaatttg actcagagaa 1650 aactagatcc aaatcccagc tctactgtct cgtccgaatt agccttggga 1700 aaatcattta tatgctaaat aatttacctt tttatctagg agatgaaaag 1750 aggataatgt ttccttccat aaagaaagtt cttgtaagaa tcaaaagaaa 1800 tggtgagctt taagtggttt gtaaaccata aaacacatca taaaagttct 1850 atctataaaa aaaaaaaaaa aaaaaa 1876 36 502 PRT Homo sapien 36 Met Leu Ala Ala Met Gly Ser Leu Ala Ala Ala Leu Trp Ala Val 1 5 10 15 Val His Pro Arg Thr Leu Leu Leu Gly Thr Val Ala Phe Leu Leu 20 25 30 Ala Ala Asp Phe Leu Lys Arg Arg Arg Pro Lys Asn Tyr Pro Pro 35 40 45 Gly Pro Trp Arg Leu Pro Phe Leu Gly Asn Phe Phe Leu Val Asp 50 55 60 Phe Glu Gln Ser His Leu Glu Val Gln Leu Phe Val Lys Lys Tyr 65 70 75 Gly Asn Leu Phe Ser Leu Glu Leu Gly Asp Ile Ser Ala Val Leu 80 85 90 Ile Thr Gly Leu Pro Leu Ile Lys Glu Ala Leu Ile His Met Asp 95 100 105 Gln Asn Phe Gly Asn Arg Pro Val Thr Pro Met Arg Glu His Ile 110 115 120 Phe Lys Lys Asn Gly Leu Ile Met Ser Ser Gly Gln Ala Trp Lys 125 130 135 Glu Gln Arg Arg Phe Thr Leu Thr Ala Leu Arg Asn Phe Gly Leu 140 145 150 Gly Lys Lys Ser Leu Glu Glu Arg Ile Gln Glu Glu Ala Gln His 155 160 165 Leu Thr Glu Ala Ile Lys Glu Glu Asn Gly Gln Pro Phe Asp Pro 170 175 180 His Phe Lys Ile Asn Asn Ala Val Ser Asn Ile Ile Cys Ser Ile 185 190 195 Thr Phe Gly Glu Arg Phe Glu Tyr Gln Asp Ser Trp Phe Gln Gln 200 205 210 Leu Leu Lys Leu Leu Asp Glu Val Thr Tyr Leu Glu Ala Ser Lys 215 220 225 Thr Cys Gln Leu Tyr Asn Val Phe Pro Trp Ile Met Lys Phe Leu 230 235 240 Pro Gly Pro His Gln Thr Leu Phe Ser Asn Trp Lys Lys Leu Lys 245 250 255 Leu Phe Val Ser His Met Ile Asp Lys His Arg Lys Asp Trp Asn 260 265 270 Pro Ala Glu Thr Arg Asp Phe Ile Asp Ala Tyr Leu Lys Glu Met 275 280 285 Ser Lys His Thr Gly Asn Pro Thr Ser Ser Phe His Glu Glu Asn 290 295 300 Leu Ile Cys Ser Thr 
Leu Asp Leu Phe Phe Ala Gly Thr Glu Thr 305 310 315 Thr Ser Thr Thr Leu Arg Trp Ala Leu Leu Tyr Met Ala Leu Tyr 320 325 330 Pro Glu Ile Gln Glu Lys Val Gln Ala Glu Ile Asp Arg Val Ile 335 340 345 Gly Gln Gly Gln Gln Pro Ser Thr Ala Ala Arg Glu Ser Met Pro 350 355 360 Tyr Thr Asn Ala Val Ile His Glu Val Gln Arg Met Gly Asn Ile 365 370 375 Ile Pro Leu Asn Val Pro Arg Glu Val Thr Val Asp Thr Thr Leu 380 385 390 Ala Gly Tyr His Leu Pro Lys Gly Thr Met Ile Leu Thr Asn Leu 395 400 405 Thr Ala Leu His Arg Asp Pro Thr Glu Trp Ala Thr Pro Asp Thr 410 415 420 Phe Asn Pro Asp His Phe Leu Glu Asn Gly Gln Phe Lys Lys Arg 425 430 435 Glu Ala Phe Met Pro Phe Ser Ile Gly Lys Arg Ala Cys Leu Gly 440 445 450 Glu Gln Leu Ala Arg Thr Glu Leu Phe Ile Phe Phe Thr Ser Leu 455 460 465 Met Gln Lys Phe Thr Phe Arg Pro Pro Asn Asn Glu Lys Leu Ser 470 475 480 Leu Lys Phe Arg Met Gly Ile Thr Ile Ser Pro Val Ser His Arg 485 490 495 Leu Cys Ala Val Pro Gln Val 500 37 1577 DNA Homo sapien 37 gcccgctgcg gtaaatgggg cagaggccgg gaggggtggg ggttccccgc 50 gccgcagcca tggagcagct tcgcgccgcc gcccgtctgc agattgttct 100 gggccacctc ggccgcccct cggccggggc tgtcgtagct catcccactt 150 cagggactat ttcctctgcc agtttccatc ctcaacaatt ccagtatact 200 ctggataata atgttctaac cctggaacag agaaaatttt atgaagaaaa 250 tgggtttcta gtaatcaaaa atcttgtacc tgatgccgat attcaacgct 300 ttcggaatga gtttgaaaaa atctgcagaa aggaggtgaa accattagga 350 ttaacagtaa tgagagatgt gaccatttcg aaatccgaat atgctccaag 400 tgagaagatg atcacgaagg tccaggattt ccaggaagat aaggagctct 450 tcagatactg cactctcccc gagattctga aatatgtgga gtgcttcact 500 ggacctaata ttatggccat gcacacaatg ttgataaaca aacctccaga 550 ttctggcaag aagacgtccc gtcaccccct gcaccaggac ctgcactatt 600 tccccttcag gcccagcgat ctcatcgttt gcgcctggac ggcgatggag 650 cacatcagcc ggaacaacgg ctgtctggtt gtgctcccag gcacacacaa 700 gggctccctg aagccccacg attaccccaa gtgggagggg ggagttaaca 750 aaatgttcca cgggatccag gactacgagg aaaacaaggc ccgggtgcac 800 ctggtgatgg agaagggcga cactgttttc ttccatcctt tgctcatcca 850 cggatctggt cagaataaaa 
cccagggatt ccggaaggca atttcctgcc 900 atttcgccag tgccgattgc cactacattg acgtgaaggg caccagtcaa 950 gaaaacatcg agaaggaagt tgtaggaata gcacataaat tctttggagc 1000 tgaaaatagc gtgaacttga aggatatttg gatgtttcga gctcgacttg 1050 tgaaaggaga aagaaccaat ctttgaaata gccatctgct ataactcttt 1100 caacagaaaa ccaaaaccaa acgaaatgtc taaggaaaat gttttcttaa 1150 tgagatgatg taaccttttc tatcacttgt taaaagcaga aaacatgtat 1200 caggtactta attgcataga gttagttttg cagcacaatg gtgttgcttt 1250 aatggaaaaa aaaaacagta aaagtgaaat attactgttt taaggaaaac 1300 taatttaggg tggcagccaa taaaggtggt tggtgtctaa tttaagtgtt 1350 aaatcaattt ctttcattca gttagctctt tacccaagaa gaagtgaatg 1400 atttggagct tagggtatgt tttgtatccc ctttctgata aacccattcc 1450 ctaccaattt tatgtcataa gagatttttt tcccccaaat ctagaacaat 1500 gtataataca ttcacatcta gtcaagggca taggaacggt gtcatggagt 1550 ccaaataaag tggatattcc tgctcgg 1577 38 338 PRT Homo sapien 38 Met Glu Gln Leu Arg Ala Ala Ala Arg Leu Gln Ile Val Leu Gly 1 5 10 15 His Leu Gly Arg Pro Ser Ala Gly Ala Val Val Ala His Pro Thr 20 25 30 Ser Gly Thr Ile Ser Ser Ala Ser Phe His Pro Gln Gln Phe Gln 35 40 45 Tyr Thr Leu Asp Asn Asn Val Leu Thr Leu Glu Gln Arg Lys Phe 50 55 60 Tyr Glu Glu Asn Gly Phe Leu Val Ile Lys Asn Leu Val Pro Asp 65 70 75 Ala Asp Ile Gln Arg Phe Arg Asn Glu Phe Glu Lys Ile Cys Arg 80 85 90 Lys Glu Val Lys Pro Leu Gly Leu Thr Val Met Arg Asp Val Thr 95 100 105 Ile Ser Lys Ser Glu Tyr Ala Pro Ser Glu Lys Met Ile Thr Lys 110 115 120 Val Gln Asp Phe Gln Glu Asp Lys Glu Leu Phe Arg Tyr Cys Thr 125 130 135 Leu Pro Glu Ile Leu Lys Tyr Val Glu Cys Phe Thr Gly Pro Asn 140 145 150 Ile Met Ala Met His Thr Met Leu Ile Asn Lys Pro Pro Asp Ser 155 160 165 Gly Lys Lys Thr Ser Arg His Pro Leu His Gln Asp Leu His Tyr 170 175 180 Phe Pro Phe Arg Pro Ser Asp Leu Ile Val Cys Ala Trp Thr Ala 185 190 195 Met Glu His Ile Ser Arg Asn Asn Gly Cys Leu Val Val Leu Pro 200 205 210 Gly Thr His Lys Gly Ser Leu Lys Pro His Asp Tyr Pro Lys Trp 215 220 225 Glu Gly Gly Val Asn Lys Met Phe His Gly Ile Gln Asp Tyr Glu 
230 235 240 Glu Asn Lys Ala Arg Val His Leu Val Met Glu Lys Gly Asp Thr 245 250 255 Val Phe Phe His Pro Leu Leu Ile His Gly Ser Gly Gln Asn Lys 260 265 270 Thr Gln Gly Phe Arg Lys Ala Ile Ser Cys His Phe Ala Ser Ala 275 280 285 Asp Cys His Tyr Ile Asp Val Lys Gly Thr Ser Gln Glu Asn Ile 290 295 300 Glu Lys Glu Val Val Gly Ile Ala His Lys Phe Phe Gly Ala Glu 305 310 315 Asn Ser Val Asn Leu Lys Asp Ile Trp Met Phe Arg Ala Arg Leu 320 325 330 Val Lys Gly Glu Arg Thr Asn Leu 335 39 716 DNA Homo sapien 39 atggcagagc agtcggacga ggccgtgaag tactacaccc tagaggagat 50 tcagaagcac aaccacagca agagcacctg gctgatcctg caccacaagg 100 tgtacgattt gaccaaattt ctggaagagc atcctggtgg ggaagaagtt 150 ttaagggaac aagctggagg tgacgctact gagaactttg aggatgtcgg 200 gcactctaca gatgccaggg aaatgtccaa aacattcatc attggggagc 250 tccatccaga tgacagacca aagttaaaca agcctccaga accttaaagg 300 cggtgtttca aggaaactct tatcactact attgattcta gttccagttg 350 gtggaccaac tgggtgatcc ctgccatctc tgcagtggcc gtcgccttga 400 tgtatcgcct atacatggca gaggactgaa cacctcctca gaagtcagcg 450 caggaagagc ctgctttgga cacgggagaa aagaagccat tgctaactac 500 ttcaactgac agaaaccttc acttgaaaac aatgatttta atatatctct 550 ttctttttct tccgacatta gaaacaaaac aaaaagaact gtcctttctg 600 cgctcaaatt tttcgagtgt gcctttttat tcatctactt tattttgatg 650 tttccttaat gtgtaattta cttattataa gcatgatctt ttaaaaatat 700 atttggcttt taaagt 716 40 98 PRT Homo sapien 40 Met Ala Glu Gln Ser Asp Glu Ala Val Lys Tyr Tyr Thr Leu Glu 1 5 10 15 Glu Ile Gln Lys His Asn His Ser Lys Ser Thr Trp Leu Ile Leu 20 25 30 His His Lys Val Tyr Asp Leu Thr Lys Phe Leu Glu Glu His Pro 35 40 45 Gly Gly Glu Glu Val Leu Arg Glu Gln Ala Gly Gly Asp Ala Thr 50 55 60 Glu Asn Phe Glu Asp Val Gly His Ser Thr Asp Ala Arg Glu Met 65 70 75 Ser Lys Thr Phe Ile Ile Gly Glu Leu His Pro Asp Asp Arg Pro 80 85 90 Lys Leu Asn Lys Pro Pro Glu Pro 95 41 578 DNA Homo sapien 41 cctcctggga gggagctgaa gccgctcgca agactcccgt agtccccacc 50 tctctcagct tccggctggt agtagttccg cttcctgtcc gactgtggtg 100 tctttgctga gggtcacatt 
gagctgcagg ttgaatccgg ggtgccttta 150 ggattcagca ccatggcgga agacatggag accaaaatca agaactacaa 200 gaccgcccct tttgacagcc gcttccccaa ccagaaccag actagaaact 250 gctggcagaa ctacctggac ttccaccgct gtcagaaggc aatgaccgct 300 aaaggaggcg atatctctgt gtgcgaatgg taccagcgtg tgtaccagtc 350 cctctgcccc acatcctggg tcacagactg ggatgagcaa cgggctgaag 400 gcacgtttcc cgggaagatc tgaactggct gcatctccct ttcctctgtc 450 ctccatcctt ctcccaggat ggtgaagggg gacctggtac ccagtgatcc 500 ccaccccagg atcctaaatc atgacttacc tgctaataaa aactcattgg 550 aaaagtgaaa aaaaaaaaaa aaaaaaaa 578 42 86 PRT Homo sapien 42 Met Ala Glu Asp Met Glu Thr Lys Ile Lys Asn Tyr Lys Thr Ala 1 5 10 15 Pro Phe Asp Ser Arg Phe Pro Asn Gln Asn Gln Thr Arg Asn Cys 20 25 30 Trp Gln Asn Tyr Leu Asp Phe His Arg Cys Gln Lys Ala Met Thr 35 40 45 Ala Lys Gly Gly Asp Ile Ser Val Cys Glu Trp Tyr Gln Arg Val 50 55 60 Tyr Gln Ser Leu Cys Pro Thr Ser Trp Val Thr Asp Trp Asp Glu 65 70 75 Gln Arg Ala Glu Gly Thr Phe Pro Gly Lys Ile 80 85 43 2444 DNA Homo sapien 43 ggtttttttt ttttaccccc cttttttatt tattattttt ttgcacattg 50 agcggatcct tgggaacgag agaaaaaaga aacccaaact cacgcgtgca 100 gaagatctcc ccccccttcc cctcccctcc tccctctttt cccctcccca 150 ggagaaaaag acccccaagc agaaaaaagt tcaccttgga ctcgtctttt 200 tcttgcaata ttttttgggg gggcaaaact ttgagggggt gatttttttt 250 ggcttttctt cctccttcat ttttcttcca aaattgctgc tggtgggtga 300 aaaaaaaatg ccgcagctga acggcggtgg aggggatgac ctaggcgcca 350 acgacgaact gatttccttc aaagacgagg gcgaacagga ggagaagagc 400 tccgaaaact cctcggcaga gagggattta gctgatgtca aatcgtctct 450 agtcaatgaa tcagaaacga atcaaaacag ctcctccgat tccgaggcgg 500 aaagacggcc tccgcctcgc tccgaaagtt tccgagacaa atcccgggaa 550 agtttggaag aagcggccaa gaggcaagat ggagggctct ttaaggggcc 600 accgtatccc ggctacccct tcatcatgat ccccgacctg acgagcccct 650 acctccccaa cggatcgctc tcgcccaccg cccgaaccta tctccagatg 700 aaatggccac tgcttgatgt ccaggcaggg agcctccaga gtagacaagc 750 cctcaaggat gcccggtccc catcaccggc acacattgtc tctaacaaag 800 tgccagtggt gcagcaccct caccatgtcc accccctcac gcctcttatc 850 
acgtacagca atgaacactt cacgccggga aacccacctc cacacttacc 900 agccgacgta gaccccaaaa caggaatccc acggcctccg caccctccag 950 atatatcccc gtattaccca ctatcgcctg gcaccgtagg acaaatcccc 1000 catccgctag gatggttagt accacagcaa ggtcaaccag tgtacccaat 1050 cacgacagga ggattcagac acccctaccc cacagctctg accgtcaatg 1100 cttccgtgtc caggttccct ccccatatgg tcccaccaca tcatacgcta 1150 cacacgacgg gcattccgca tccggccata gtcacaccaa cagtcaaaca 1200 ggaatcgtcc cagagtgatg tcggctcact ccatagttca aagcatcagg 1250 actccaaaaa ggaagaagaa aagaagaagc cccacataaa gaaacctctt 1300 aatgcattca tgttgtatat gaaggaaatg agagcaaagg tcgtagctga 1350 gtgcacgttg aaagaaagcg cggccatcaa ccagatcctt gggcggaggt 1400 ggcatgcact gtccagagaa gagcaagcga aatactacga gctggcccgg 1450 aaggagcgac agcttcatat gcaactgtac cccggctggt ccgcgcggga 1500 taactatgga aagaagaaga agaggaaaag ggacaagcag ccgggagaga 1550 ccaatgaaca cagcgaatgt ttcctaaatc cttgcctttc acttcctccg 1600 attacagacc tcagcgctcc taagaaatgc cgagcgcgct ttggccttga 1650 tcaacagaat aactggtgcg gcccttgcag gagaaaaaaa aagtgcgttc
1700 gctacataca aggtgaaggc agctgcctca gcccaccctc ttcagatgga 1750 agcttactag attcgcctcc cccctccccg aacctgctag gctcccctcc 1800 ccgagacgcc aagtcacaga ctgagcagac ccagcctctg tcgctgtccc 1850 tgaagcccga ccccctggcc cacctgtcca tgatgcctcc gccacccgcc 1900 ctcctgctcg ctgaggccac ccacaaggcc tccgccctct gtcccaacgg 1950 ggccctggac ctgcccccag ccgctttgca gcctgccgcc ccctcctcat 2000 caattgcaca gccgtcgact tcttggttac attcccacag ctccctggcc 2050 gggacccagc cccagccgct gtcgctcgtc accaagtctt tagaatagct 2100 ttagcgtcgt gaaccccgct gctttgttta tggttttgtt tcacttttct 2150 taatttgccc cccaccccca ccttgaaagg ttttgttttg tactctctta 2200 attttgtgcc atgtggctac attagttgat gtttatcgag ttcattggtc 2250 aatatttgac ccattcttat ttcaatttct ccttttaaat atgtagatga 2300 gagaagaacc tcatgattgg taccaaaatt tttatcaaca gctgtttaaa 2350 gtctttgtag cgtttaaaaa atatatatat atacataact gttatgtagt 2400 tcggatagct tagttttaaa agactgatta aaaaacaaaa aaaa 2444 44 596 PRT Homo sapien 44 Met Pro Gln Leu Asn Gly Gly Gly Gly Asp Asp Leu Gly Ala Asn 1 5 10 15 Asp Glu Leu Ile Ser Phe Lys Asp Glu Gly Glu Gln Glu Glu Lys 20 25 30 Ser Ser Glu Asn Ser Ser Ala Glu Arg Asp Leu Ala Asp Val Lys 35 40 45 Ser Ser Leu Val Asn Glu Ser Glu Thr Asn Gln Asn Ser Ser Ser 50 55 60 Asp Ser Glu Ala Glu Arg Arg Pro Pro Pro Arg Ser Glu Ser Phe 65 70 75 Arg Asp Lys Ser Arg Glu Ser Leu Glu Glu Ala Ala Lys Arg Gln 80 85 90 Asp Gly Gly Leu Phe Lys Gly Pro Pro Tyr Pro Gly Tyr Pro Phe 95 100 105 Ile Met Ile Pro Asp Leu Thr Ser Pro Tyr Leu Pro Asn Gly Ser 110 115 120 Leu Ser Pro Thr Ala Arg Thr Tyr Leu Gln Met Lys Trp Pro Leu 125 130 135 Leu Asp Val Gln Ala Gly Ser Leu Gln Ser Arg Gln Ala Leu Lys 140 145 150 Asp Ala Arg Ser Pro Ser Pro Ala His Ile Val Ser Asn Lys Val 155 160 165 Pro Val Val Gln His Pro His His Val His Pro Leu Thr Pro Leu 170 175 180 Ile Thr Tyr Ser Asn Glu His Phe Thr Pro Gly Asn Pro Pro Pro 185 190 195 His Leu Pro Ala Asp Val Asp Pro Lys Thr Gly Ile Pro Arg Pro 200 205 210 Pro His Pro Pro Asp Ile Ser Pro Tyr Tyr Pro Leu Ser Pro Gly 215 220 225 Thr Val Gly 
Gln Ile Pro His Pro Leu Gly Trp Leu Val Pro Gln 230 235 240 Gln Gly Gln Pro Val Tyr Pro Ile Thr Thr Gly Gly Phe Arg His 245 250 255 Pro Tyr Pro Thr Ala Leu Thr Val Asn Ala Ser Val Ser Arg Phe 260 265 270 Pro Pro His Met Val Pro Pro His His Thr Leu His Thr Thr Gly 275 280 285 Ile Pro His Pro Ala Ile Val Thr Pro Thr Val Lys Gln Glu Ser 290 295 300 Ser Gln Ser Asp Val Gly Ser Leu His Ser Ser Lys His Gln Asp 305 310 315 Ser Lys Lys Glu Glu Glu Lys Lys Lys Pro His Ile Lys Lys Pro 320 325 330 Leu Asn Ala Phe Met Leu Tyr Met Lys Glu Met Arg Ala Lys Val 335 340 345 Val Ala Glu Cys Thr Leu Lys Glu Ser Ala Ala Ile Asn Gln Ile 350 355 360 Leu Gly Arg Arg Trp His Ala Leu Ser Arg Glu Glu Gln Ala Lys 365 370 375 Tyr Tyr Glu Leu Ala Arg Lys Glu Arg Gln Leu His Met Gln Leu 380 385 390 Tyr Pro Gly Trp Ser Ala Arg Asp Asn Tyr Gly Lys Lys Lys Lys 395 400 405 Arg Lys Arg Asp Lys Gln Pro Gly Glu Thr Asn Glu His Ser Glu 410 415 420 Cys Phe Leu Asn Pro Cys Leu Ser Leu Pro Pro Ile Thr Asp Leu 425 430 435 Ser Ala Pro Lys Lys Cys Arg Ala Arg Phe Gly Leu Asp Gln Gln 440 445 450 Asn Asn Trp Cys Gly Pro Cys Arg Arg Lys Lys Lys Cys Val Arg 455 460 465 Tyr Ile Gln Gly Glu Gly Ser Cys Leu Ser Pro Pro Ser Ser Asp 470 475 480 Gly Ser Leu Leu Asp Ser Pro Pro Pro Ser Pro Asn Leu Leu Gly 485 490 495 Ser Pro Pro Arg Asp Ala Lys Ser Gln Thr Glu Gln Thr Gln Pro 500 505 510 Leu Ser Leu Ser Leu Lys Pro Asp Pro Leu Ala His Leu Ser Met 515 520 525 Met Pro Pro Pro Pro Ala Leu Leu Leu Ala Glu Ala Thr His Lys 530 535 540 Ala Ser Ala Leu Cys Pro Asn Gly Ala Leu Asp Leu Pro Pro Ala 545 550 555 Ala Leu Gln Pro Ala Ala Pro Ser Ser Ser Ile Ala Gln Pro Ser 560 565 570 Thr Ser Trp Leu His Ser His Ser Ser Leu Ala Gly Thr Gln Pro 575 580 585 Gln Pro Leu Ser Leu Val Thr Lys Ser Leu Glu 590 595 45 3697 DNA Homo sapien 45 agggagtgtt cccgggggag atactccagt cgtagcaaga gtctcgacca 50 ctgaatggaa gaaaaggact tttaaccacc attttgtgac ttacagaaag 100 gaatttgaat aaagaaaact atgatacttc aggcccatct tcactccctg 150 tgtcttctta tgctttattt 
ggcaactgga tatggccaag aggggaagtt 200 tagtggaccc ctgaaaccca tgacattttc tatttatgaa ggccaagaac 250 cgagtcaaat tatattccag tttaaggcca atcctcctgc tgtgactttt 300 gaactaactg gggagacaga caacatattt gtgatagaac gggagggact 350 tctgtattac aacagagcct tggacaggga aacaagatct actcacaatc 400 tccaggttgc agccctggac gctaatggaa ttatagtgga gggtccagtc 450 cctatcacca tagaagtgaa ggacatcaac gacaatcgac ccacgtttct 500 ccagtcaaag tacgaaggct cagtaaggca gaactctcgc ccaggaaagc 550 ccttcttgta tgtcaatgcc acagacctgg atgatccggc cactcccaat 600 ggccagcttt attaccagat tgtcatccag cttcccatga tcaacaatgt 650 catgtacttt cagatcaaca acaaaacggg agccatctct cttacccgag 700 agggatctca ggaattgaat cctgctaaga atccttccta taatctggtg 750 atctcagtga aggacatggg aggccagagt gagaattcct tcagtgatac 800 cacatctgtg gatatcatag tgacagagaa tatttggaaa gcaccaaaac 850 ctgtggagat ggtggaaaac tcaactgatc ctcaccccat caaaatcact 900 caggtgcggt ggaatgatcc cggtgcacaa tattccttag ttgacaaaga 950 gaagctgcca agattcccat tttcaattga ccaggaagga gatatttacg 1000 tgactcagcc cttggaccga gaagaaaagg atgcatatgt tttttatgca 1050 gttgcaaagg atgagtacgg aaaaccactt tcatatccgc tggaaattca 1100 tgtaaaagtt aaagatatta atgataatcc acctacatgt ccgtcaccag 1150 taaccgtatt tgaggtccag gagaatgaac gactgggtaa cagtatcggg 1200 acccttactg cacatgacag ggatgaagaa aatactgcca acagttttct 1250 aaactacagg attgtggagc aaactcccaa acttcccatg gatggactct 1300 tcctaatcca aacctatgct ggaatgttac agttagctaa acagtccttg 1350 aagaagcaag atactcctca gtacaactta acgatagagg tgtctgacaa 1400 agatttcaag accctttgtt ttgtgcaaat caacgttatt gatatcaatg 1450 atcagatccc catctttgaa aaatcagatt atggaaacct gactcttgct 1500 gaagacacaa acattgggtc caccatctta accatccagg ccactgatgc 1550 tgatgagcca tttactggga gttctaaaat tctgtatcat atcataaagg 1600 gagacagtga gggacgcctg ggggttgaca cagatcccca taccaacacc 1650 ggatatgtca taattaaaaa gcctcttgat tttgaaacag cagctgtttc 1700 caacattgtg ttcaaagcag aaaatcctga gcctctagtg tttggtgtga 1750 agtacaatgc aagttctttt gccaagttca cgcttattgt gacagatgtg 1800 aatgaagcac ctcaattttc ccaacacgta ttccaagcga aagtcagtga 
1850 ggatgtagct ataggcacta aagtgggcaa tgtgactgcc aaggatccag 1900 aaggtctgga cataagctat tcactgaggg gagacacaag aggttggctt 1950 aaaattgacc acgtgactgg tgagatcttt agtgtggctc cattggacag 2000 agaagccgga agtccatatc gggtacaagt ggtggccaca gaagtagggg 2050 ggtcttcctt gagctctgtg tcagagttcc acctgatcct tatggatgtg 2100 aatgacaacc ctcccaggct agccaaggac tacacgggct tgttcttctg 2150 ccatcccctc agtgcacctg gaagtctcat tttcgaggct actgatgatg 2200 atcagcactt atttcggggt ccccatttta cattttccct cggcagtgga 2250 agcttacaaa acgactggga agtttccaaa atcaatggta ctcatgcccg 2300 actgtctacc aggcacacag agtttgagga gagggagtat gtcgtcttga 2350 tccgcatcaa tgatgggggt cggccaccct tggaaggcat tgtttcttta 2400 ccagttacat tctgcagttg tgtggaagga agttgtttcc ggccagcagg 2450 tcaccagact gggataccca ctgtgggcat ggcagttggt atactgctga 2500 ccacccttct ggtgattggt ataattttag cagttgtgtt tatccgcata 2550 aagaaggata aaggcaaaga taatgttgaa agtgctcaag catctgaagt 2600 caaacctctg agaagctgaa tttgaaaagg aatgtttgaa tttatatagc 2650 aagtgctatt tcagcaacaa ccatctcatc ctattacttt tcatctaacg 2700 tgcattataa ttttttaaac agatattccc tcttgtcctt taatatttgc 2750 taaatatttc ttttttgagg tggagtcttg ctctgtcgcc caggctggag 2800 tacagtggtg tgatcccagc tcactgcaac ctccgcctcc tgggttcaca 2850 tgattctcct gcctcagctt cctaagtagc tgggtttaca ggcacccacc 2900 accatgccca gctaattttt gtatttttaa tagagacggg gtttcgccat 2950 ttggccaggc tggtcttgaa ctcctgacgt caagtgatct gcctgccttg 3000 gtctcccaat acaggcatga accactgcac ccacctactt agatatttca 3050 tgtgctatag acattagaga gatttttcat ttttccatga catttttcct 3100 ctctgcaaat ggcttagcta cttgtgtttt tcccttttgg ggcaagacag 3150 actcattaaa tattctgtac attttttctt tatcaaggag atatatcagt 3200 gttgtctcat agaactgcct ggattccatt tatgtttttt ctgattccat 3250 cctgtgtccc cttcatcctt gactcctttg gtatttcact gaatttcaaa 3300 catttgtcag agaagaaaaa cgtgaggact caggaaaaat aaataaataa 3350 aagaacagcc ttttccctta gtattaacag aaatgtttct gtgtcattaa 3400 ccatctttaa tcaatgtgac atgttgctct ttggctgaaa ttcttcaact 3450 tggaaatgac acagacccac agaaggtgtt caaacacaac ctactctgca 3500 aaccttggta 
aaggaaccag tcagctggcc agatttcctc actacctgcc 3550 atgcatacat gctgcgcatg ttttcttcat tcgtatgtta gtaaagtttt 3600 ggttattata tatttaacat gtggaagaaa acaagacatg aaaagagtgg 3650 tgacaaatca agaataaaca ctggttgtag tcagttttgt ttgttaa 3697 46 832 PRT Homo sapien 46 Met Ile Leu Gln Ala His Leu His Ser Leu Cys Leu Leu Met Leu 1 5 10 15 Tyr Leu Ala Thr Gly Tyr Gly Gln Glu Gly Lys Phe Ser Gly Pro 20 25 30 Leu Lys Pro Met Thr Phe Ser Ile Tyr Glu Gly Gln Glu Pro Ser 35 40 45 Gln Ile Ile Phe Gln Phe Lys Ala Asn Pro Pro Ala Val Thr Phe 50 55 60 Glu Leu Thr Gly Glu Thr Asp Asn Ile Phe Val Ile Glu Arg Glu 65 70 75 Gly Leu Leu Tyr Tyr Asn Arg Ala Leu Asp Arg Glu Thr Arg Ser 80 85 90 Thr His Asn Leu Gln Val Ala Ala Leu Asp Ala Asn Gly Ile Ile 95 100 105 Val Glu Gly Pro Val Pro Ile Thr Ile Glu Val Lys Asp Ile Asn 110 115 120 Asp Asn Arg Pro Thr Phe Leu Gln Ser Lys Tyr Glu Gly Ser Val 125 130 135 Arg Gln Asn Ser Arg Pro Gly Lys Pro Phe Leu Tyr Val Asn Ala 140 145 150 Thr Asp Leu Asp Asp Pro Ala Thr Pro Asn Gly Gln Leu Tyr Tyr 155 160 165 Gln Ile Val Ile Gln Leu Pro Met Ile Asn Asn Val Met Tyr Phe 170 175 180 Gln Ile Asn Asn Lys Thr Gly Ala Ile Ser Leu Thr Arg Glu Gly 185 190 195 Ser Gln Glu Leu Asn Pro Ala Lys Asn Pro Ser Tyr Asn Leu Val 200 205 210 Ile Ser Val Lys Asp Met Gly Gly Gln Ser Glu Asn Ser Phe Ser 215 220 225 Asp Thr Thr Ser Val Asp Ile Ile Val Thr Glu Asn Ile Trp Lys 230 235 240 Ala Pro Lys Pro Val Glu Met Val Glu Asn Ser Thr Asp Pro His 245 250 255 Pro Ile Lys Ile Thr Gln Val Arg Trp Asn Asp Pro Gly Ala Gln 260 265 270 Tyr Ser Leu Val Asp Lys Glu Lys Leu Pro Arg Phe Pro Phe Ser 275 280 285 Ile Asp Gln Glu Gly Asp Ile Tyr Val Thr Gln Pro Leu Asp Arg 290 295 300 Glu Glu Lys Asp Ala Tyr Val Phe Tyr Ala Val Ala Lys Asp Glu 305 310 315 Tyr Gly Lys Pro Leu Ser Tyr Pro Leu Glu Ile His Val Lys Val 320 325 330 Lys Asp Ile Asn Asp Asn Pro Pro Thr Cys Pro Ser Pro Val Thr 335 340 345 Val Phe Glu Val Gln Glu Asn Glu Arg Leu Gly Asn Ser Ile Gly 350 355 360 Thr Leu Thr Ala His Asp Arg Asp Glu 
Glu Asn Thr Ala Asn Ser 365 370 375 Phe Leu Asn Tyr Arg Ile Val Glu Gln Thr Pro Lys Leu Pro Met 380 385 390 Asp Gly Leu Phe Leu Ile Gln Thr Tyr Ala Gly Met Leu Gln Leu 395 400 405 Ala Lys Gln Ser Leu Lys Lys Gln Asp Thr Pro Gln Tyr Asn Leu 410 415 420 Thr Ile Glu Val Ser Asp Lys Asp Phe Lys Thr Leu Cys Phe Val 425 430 435 Gln Ile Asn Val Ile Asp Ile Asn Asp Gln Ile Pro Ile Phe Glu 440 445 450 Lys Ser Asp Tyr Gly Asn Leu Thr Leu Ala Glu Asp Thr Asn Ile 455 460 465 Gly Ser Thr Ile Leu Thr Ile Gln Ala Thr Asp Ala Asp Glu Pro 470 475 480 Phe Thr Gly Ser Ser Lys Ile Leu Tyr His Ile Ile Lys Gly Asp 485 490 495 Ser Glu Gly Arg Leu Gly Val Asp Thr Asp Pro His Thr Asn Thr 500 505 510 Gly Tyr Val Ile Ile Lys Lys Pro Leu Asp Phe Glu Thr Ala Ala 515 520 525 Val Ser Asn Ile Val Phe Lys Ala Glu Asn Pro Glu Pro Leu Val 530 535 540 Phe Gly Val Lys Tyr Asn Ala Ser Ser Phe Ala Lys Phe Thr Leu 545 550 555 Ile Val Thr Asp Val Asn Glu Ala Pro Gln Phe Ser Gln His Val 560 565 570 Phe Gln Ala Lys Val Ser Glu Asp Val Ala Ile Gly Thr Lys Val 575 580 585 Gly Asn Val Thr Ala Lys Asp Pro Glu Gly Leu Asp Ile Ser Tyr 590 595 600 Ser Leu Arg Gly Asp Thr Arg Gly Trp Leu Lys Ile Asp His Val 605 610 615 Thr Gly Glu Ile Phe Ser Val Ala Pro Leu Asp Arg Glu Ala Gly 620 625 630 Ser Pro Tyr Arg Val Gln Val Val Ala Thr Glu Val Gly Gly Ser 635 640 645 Ser Leu Ser Ser Val Ser Glu Phe His Leu Ile Leu Met Asp Val 650 655 660 Asn Asp Asn Pro Pro Arg Leu Ala Lys Asp Tyr Thr Gly Leu Phe 665 670 675 Phe Cys His Pro Leu Ser Ala Pro Gly Ser Leu Ile Phe Glu Ala 680 685 690 Thr Asp Asp Asp Gln His Leu Phe Arg Gly Pro His Phe Thr Phe 695 700 705 Ser Leu Gly Ser Gly Ser Leu Gln Asn Asp Trp Glu Val Ser Lys 710 715 720 Ile Asn Gly Thr His Ala Arg Leu Ser Thr Arg His Thr Glu Phe 725 730 735 Glu Glu Arg Glu Tyr Val Val Leu Ile Arg Ile Asn Asp Gly Gly 740 745 750 Arg Pro Pro Leu Glu Gly Ile Val Ser Leu Pro Val Thr Phe Cys 755 760 765 Ser Cys Val Glu Gly Ser Cys Phe Arg Pro Ala Gly His Gln Thr 770 775 780 Gly Ile Pro Thr Val 
Gly Met Ala Val Gly Ile Leu Leu Thr Thr 785 790 795 Leu Leu Val Ile Gly Ile Ile Leu Ala Val Val Phe Ile Arg Ile 800 805 810 Lys Lys Asp Lys Gly Lys Asp Asn Val Glu Ser Ala Gln Ala Ser 815 820 825 Glu Val Lys Pro Leu Arg Ser 830 47 1258 DNA Homo sapien 47 ctcgtcaaca gctgccgcgc gcaggcttag ctcattcctc tgacctgcca 50 ggaagcagag agacccacag agcaggaggg aggcagaaag tggagacgga 100 cctgagcccg aggaagaggc aggcagaggc tgaggctgat tccaccccag 150 cctgcctgga caaccctcct tagccgcagc cccttccagt tccctagggg 200 ttctgcccct ccccctctct ggggcaccag ccccccaggg
tcctgcatcc 250 caccatgtcg atggctgtgg aaacctttgg cttcttcatg gcaactgtgg 300 ggctgctgat gctgggggtg actctgccaa acagctactg gcgagtgtcc 350 actgtgcacg ggaacgtcat caccaccaac accatcttcg agaacctctg 400 gtttagctgt gccaccgact ccctgggcgt ctacaactgc tgggagttcc 450 cgtccatgct ggccctctct gggtatattc aggcctgccg ggcactcatg 500 atcaccgcca tcctcctggg cttcctcggc ctcttgctag gcatagcggg 550 cctgcgctgc accaacattg ggggcctgga gctctccagg aaagccaagc 600 tggcggccac cgcaggggcc ctccacattc tggccggtat ctgcgggatg 650 gtggccatct cctggtacgc cttcaacatc acccgggact tcttcgaccc 700 cttgtacccc ggaaccaagt acgagctggg ccccgccctc tacctggggt 750 ggagcgcctc actgatctcc atcctgggtg gcctctgcct ctgctccgcc 800 tgctgctgcg gctctgacga ggacccagcc gccagcgccc ggcggcccta 850 ccaggctccc gtgtccgtga tgcccgtcgc cacctcggac caagaaggcg 900 acagcagctt tggcaaatac ggcagaaacg cctacgtgta gcagctctgg 950 cccgtgggcc ccgctgtctt cccactgccc caaggagagg ggacctggcc 1000 ggggcccatt cccctatagt aacctcaggg gccggccacg ccccgctccc 1050 gtagccccgc cccggccacg gccccgtgtc ttgcactctc atggcccctc 1100 caggccaaga actgctcttg ggaagtcgca tatctcccct ctgaggctgg 1150 atccctcatc ttctgaccct gggttctggg ctgtgaaggg gacggtgtcc 1200 ccgcacgttt gtattgtgta taaatacatt cattaataaa tgcatattgt 1250 gaccgttc 1258 48 1258 PRT Homo sapien 48 Cys Thr Cys Gly Thr Cys Ala Ala Cys Ala Gly Cys Thr Gly Cys 1 5 10 15 Cys Gly Cys Gly Cys Gly Cys Ala Gly Gly Cys Thr Thr Ala Gly 20 25 30 Cys Thr Cys Ala Thr Thr Cys Cys Thr Cys Thr Gly Ala Cys Cys 35 40 45 Thr Gly Cys Cys Ala Gly Gly Ala Ala Gly Cys Ala Gly Ala Gly 50 55 60 Ala Gly Ala Cys Cys Cys Ala Cys Ala Gly Ala Gly Cys Ala Gly 65 70 75 Gly Ala Gly Gly Gly Ala Gly Gly Cys Ala Gly Ala Ala Ala Gly 80 85 90 Thr Gly Gly Ala Gly Ala Cys Gly Gly Ala Cys Cys Thr Gly Ala 95 100 105 Gly Cys Cys Cys Gly Ala Gly Gly Ala Ala Gly Ala Gly Gly Cys 110 115 120 Ala Gly Gly Cys Ala Gly Ala Gly Gly Cys Thr Gly Ala Gly Gly 125 130 135 Cys Thr Gly Ala Thr Thr Cys Cys Ala Cys Cys Cys Cys Ala Gly 140 145 150 Cys Cys Thr Gly Cys Cys Thr Gly Gly Ala Cys Ala Ala Cys 
Cys 155 160 165 Cys Thr Cys Cys Thr Thr Ala Gly Cys Cys Gly Cys Ala Gly Cys 170 175 180 Cys Cys Cys Thr Thr Cys Cys Ala Gly Thr Thr Cys Cys Cys Thr 185 190 195 Ala Gly Gly Gly Gly Thr Thr Cys Thr Gly Cys Cys Cys Cys Thr 200 205 210 Cys Cys Cys Cys Cys Thr Cys Thr Cys Thr Gly Gly Gly Gly Cys 215 220 225 Ala Cys Cys Ala Gly Cys Cys Cys Cys Cys Cys Ala Gly Gly Gly 230 235 240 Thr Cys Cys Thr Gly Cys Ala Thr Cys Cys Cys Ala Cys Cys Ala 245 250 255 Thr Gly Thr Cys Gly Ala Thr Gly Gly Cys Thr Gly Thr Gly Gly 260 265 270 Ala Ala Ala Cys Cys Thr Thr Thr Gly Gly Cys Thr Thr Cys Thr 275 280 285 Thr Cys Ala Thr Gly Gly Cys Ala Ala Cys Thr Gly Thr Gly Gly 290 295 300 Gly Gly Cys Thr Gly Cys Thr Gly Ala Thr Gly Cys Thr Gly Gly 305 310 315 Gly Gly Gly Thr Gly Ala Cys Thr Cys Thr Gly Cys Cys Ala Ala 320 325 330 Ala Cys Ala Gly Cys Thr Ala Cys Thr Gly Gly Cys Gly Ala Gly 335 340 345 Thr Gly Thr Cys Cys Ala Cys Thr Gly Thr Gly Cys Ala Cys Gly 350 355 360 Gly Gly Ala Ala Cys Gly Thr Cys Ala Thr Cys Ala Cys Cys Ala 365 370 375 Cys Cys Ala Ala Cys Ala Cys Cys Ala Thr Cys Thr Thr Cys Gly 380 385 390 Ala Gly Ala Ala Cys Cys Thr Cys Thr Gly Gly Thr Thr Thr Ala 395 400 405 Gly Cys Thr Gly Thr Gly Cys Cys Ala Cys Cys Gly Ala Cys Thr 410 415 420 Cys Cys Cys Thr Gly Gly Gly Cys Gly Thr Cys Thr Ala Cys Ala 425 430 435 Ala Cys Thr Gly Cys Thr Gly Gly Gly Ala Gly Thr Thr Cys Cys 440 445 450 Cys Gly Thr Cys Cys Ala Thr Gly Cys Thr Gly Gly Cys Cys Cys 455 460 465 Thr Cys Thr Cys Thr Gly Gly Gly Thr Ala Thr Ala Thr Thr Cys 470 475 480 Ala Gly Gly Cys Cys Thr Gly Cys Cys Gly Gly Gly Cys Ala Cys 485 490 495 Thr Cys Ala Thr Gly Ala Thr Cys Ala Cys Cys Gly Cys Cys Ala 500 505 510 Thr Cys Cys Thr Cys Cys Thr Gly Gly Gly Cys Thr Thr Cys Cys 515 520 525 Thr Cys Gly Gly Cys Cys Thr Cys Thr Thr Gly Cys Thr Ala Gly 530 535 540 Gly Cys Ala Thr Ala Gly Cys Gly Gly Gly Cys Cys Thr Gly Cys 545 550 555 Gly Cys Thr Gly Cys Ala Cys Cys Ala Ala Cys Ala Thr Thr Gly 560 565 570 Gly Gly Gly Gly Cys Cys Thr Gly Gly Ala 
Gly Cys Thr Cys Thr 575 580 585 Cys Cys Ala Gly Gly Ala Ala Ala Gly Cys Cys Ala Ala Gly Cys 590 595 600 Thr Gly Gly Cys Gly Gly Cys Cys Ala Cys Cys Gly Cys Ala Gly 605 610 615 Gly Gly Gly Cys Cys Cys Thr Cys Cys Ala Cys Ala Thr Thr Cys 620 625 630 Thr Gly Gly Cys Cys Gly Gly Thr Ala Thr Cys Thr Gly Cys Gly 635 640 645 Gly Gly Ala Thr Gly Gly Thr Gly Gly Cys Cys Ala Thr Cys Thr 650 655 660 Cys Cys Thr Gly Gly Thr Ala Cys Gly Cys Cys Thr Thr Cys Ala 665 670 675 Ala Cys Ala Thr Cys Ala Cys Cys Cys Gly Gly Gly Ala Cys Thr 680 685 690 Thr Cys Thr Thr Cys Gly Ala Cys Cys Cys Cys Thr Thr Gly Thr 695 700 705 Ala Cys Cys Cys Cys Gly Gly Ala Ala Cys Cys Ala Ala Gly Thr 710 715 720 Ala Cys Gly Ala Gly Cys Thr Gly Gly Gly Cys Cys Cys Cys Gly 725 730 735 Cys Cys Cys Thr Cys Thr Ala Cys Cys Thr Gly Gly Gly Gly Thr 740 745 750 Gly Gly Ala Gly Cys Gly Cys Cys Thr Cys Ala Cys Thr Gly Ala 755 760 765 Thr Cys Thr Cys Cys Ala Thr Cys Cys Thr Gly Gly Gly Thr Gly 770 775 780 Gly Cys Cys Thr Cys Thr Gly Cys Cys Thr Cys Thr Gly Cys Thr 785 790 795 Cys Cys Gly Cys Cys Thr Gly Cys Thr Gly Cys Thr Gly Cys Gly 800 805 810 Gly Cys Thr Cys Thr Gly Ala Cys Gly Ala Gly Gly Ala Cys Cys 815 820 825 Cys Ala Gly Cys Cys Gly Cys Cys Ala Gly Cys Gly Cys Cys Cys 830 835 840 Gly Gly Cys Gly Gly Cys Cys Cys Thr Ala Cys Cys Ala Gly Gly 845 850 855 Cys Thr Cys Cys Cys Gly Thr Gly Thr Cys Cys Gly Thr Gly Ala 860 865 870 Thr Gly Cys Cys Cys Gly Thr Cys Gly Cys Cys Ala Cys Cys Thr 875 880 885 Cys Gly Gly Ala Cys Cys Ala Ala Gly Ala Ala Gly Gly Cys Gly 890 895 900 Ala Cys Ala Gly Cys Ala Gly Cys Thr Thr Thr Gly Gly Cys Ala 905 910 915 Ala Ala Thr Ala Cys Gly Gly Cys Ala Gly Ala Ala Ala Cys Gly 920 925 930 Cys Cys Thr Ala Cys Gly Thr Gly Thr Ala Gly Cys Ala Gly Cys 935 940 945 Thr Cys Thr Gly Gly Cys Cys Cys Gly Thr Gly Gly Gly Cys Cys 950 955 960 Cys Cys Gly Cys Thr Gly Thr Cys Thr Thr Cys Cys Cys Ala Cys 965 970 975 Thr Gly Cys Cys Cys Cys Ala Ala Gly Gly Ala Gly Ala Gly Gly 980 985 990 Gly Gly Ala Cys Cys Thr 
Gly Gly Cys Cys Gly Gly Gly Gly Cys 995 1000 1005 Cys Cys Ala Thr Thr Cys Cys Cys Cys Thr Ala Thr Ala Gly Thr 1010 1015 1020 Ala Ala Cys Cys Thr Cys Ala Gly Gly Gly Gly Cys Cys Gly Gly 1025 1030 1035 Cys Cys Ala Cys Gly Cys Cys Cys Cys Gly Cys Thr Cys Cys Cys 1040 1045 1050 Gly Thr Ala Gly Cys Cys Cys Cys Gly Cys Cys Cys Cys Gly Gly 1055 1060 1065 Cys Cys Ala Cys Gly Gly Cys Cys Cys Cys Gly Thr Gly Thr Cys 1070 1075 1080 Thr Thr Gly Cys Ala Cys Thr Cys Thr Cys Ala Thr Gly Gly Cys 1085 1090 1095 Cys Cys Cys Thr Cys Cys Ala Gly Gly Cys Cys Ala Ala Gly Ala 1100 1105 1110 Ala Cys Thr Gly Cys Thr Cys Thr Thr Gly Gly Gly Ala Ala Gly 1115 1120 1125 Thr Cys Gly Cys Ala Thr Ala Thr Cys Thr Cys Cys Cys Cys Thr 1130 1135 1140 Cys Thr Gly Ala Gly Gly Cys Thr Gly Gly Ala Thr Cys Cys Cys 1145 1150 1155 Thr Cys Ala Thr Cys Thr Thr Cys Thr Gly Ala Cys Cys Cys Thr 1160 1165 1170 Gly Gly Gly Thr Thr Cys Thr Gly Gly Gly Cys Thr Gly Thr Gly 1175 1180 1185 Ala Ala Gly Gly Gly Gly Ala Cys Gly Gly Thr Gly Thr Cys Cys 1190 1195 1200 Cys Cys Gly Cys Ala Cys Gly Thr Thr Thr Gly Thr Ala Thr Thr 1205 1210 1215 Gly Thr Gly Thr Ala Thr Ala Ala Ala Thr Ala Cys Ala Thr Thr 1220 1225 1230 Cys Ala Thr Thr Ala Ala Thr Ala Ala Ala Thr Gly Cys Ala Thr 1235 1240 1245 Ala Thr Thr Gly Thr Gly Ala Cys Cys Gly Thr Thr Cys 1250 1255 49 6129 DNA Homo sapien 49 aattggaagc aaatgacatc acagcaggtc agagaaaaag ggttgagcgg 50 caggcaccca gagtagtagg tctttggcat taggagcttg agcccagacg 100 gccctagcag ggaccccagc gcccgagaga ccatgcagag gtcgcctctg 150 gaaaaggcca gcgttgtctc caaacttttt ttcagctgga ccagaccaat 200 tttgaggaaa ggatacagac agcgcctgga attgtcagac atataccaaa 250 tcccttctgt tgattctgct gacaatctat ctgaaaaatt ggaaagagaa 300 tgggatagag agctggcttc aaagaaaaat cctaaactca ttaatgccct 350 tcggcgatgt tttttctgga gatttatgtt ctatggaatc tttttatatt 400 taggggaagt caccaaagca gtacagcctc tcttactggg aagaatcata 450 gcttcctatg acccggataa caaggaggaa cgctctatcg cgatttatct 500 aggcataggc ttatgccttc tctttattgt gaggacactg ctcctacacc 550 cagccatttt 
tggccttcat cacattggaa tgcagatgag aatagctatg 600 tttagtttga tttataagaa gactttaaag ctgtcaagcc gtgttctaga 650 taaaataagt attggacaac ttgttagtct cctttccaac aacctgaaca 700 aatttgatga aggacttgca ttggcacatt tcgtgtggat cgctcctttg 750 caagtggcac tcctcatggg gctaatctgg gagttgttac aggcgtctgc 800 cttctgtgga cttggtttcc tgatagtcct tgcccttttt caggctgggc 850 tagggagaat gatgatgaag tacagagatc agagagctgg gaagatcagt 900 gaaagacttg tgattacctc agaaatgatt gaaaatatcc aatctgttaa 950 ggcatactgc tgggaagaag caatggaaaa aatgattgaa aacttaagac 1000 aaacagaact gaaactgact cggaaggcag cctatgtgag atacttcaat 1050 agctcagcct tcttcttctc agggttcttt gtggtgtttt tatctgtgct 1100 tccctatgca ctaatcaaag gaatcatcct ccggaaaata ttcaccacca 1150 tctcattctg cattgttctg cgcatggcgg tcactcggca atttccctgg 1200 gctgtacaaa catggtatga ctctcttgga gcaataaaca aaatacagga 1250 tttcttacaa aagcaagaat ataagacatt ggaatataac ttaacgacta 1300 cagaagtagt gatggagaat gtaacagcct tctgggagga gggatttggg 1350 gaattatttg agaaagcaaa acaaaacaat aacaatagaa aaacttctaa 1400 tggtgatgac agcctcttct tcagtaattt ctcacttctt ggtactcctg 1450 tcctgaaaga tattaatttc aagatagaaa gaggacagtt gttggcggtt 1500 gctggatcca ctggagcagg caagacttca cttctaatga tgattatggg 1550 agaactggag ccttcagagg gtaaaattaa gcacagtgga agaatttcat 1600 tctgttctca gttttcctgg attatgcctg gcaccattaa agaaaatatc 1650 atctttggtg tttcctatga tgaatataga tacagaagcg tcatcaaagc 1700 atgccaacta gaagaggaca tctccaagtt tgcagagaaa gacaatatag 1750 ttcttggaga aggtggaatc acactgagtg gaggtcaacg agcaagaatt 1800 tctttagcaa gagcagtata caaagatgct gatttgtatt tattagactc 1850 tccttttgga tacctagatg ttttaacaga aaaagaaata tttgaaagct 1900 gtgtctgtaa actgatggct aacaaaacta ggattttggt cacttctaaa 1950 atggaacatt taaagaaagc tgacaaaata ttaattttga atgaaggtag 2000 cagctatttt tatgggacat tttcagaact ccaaaatcta cagccagact 2050 ttagctcaaa actcatggga tgtgattctt tcgaccaatt tagtgcagaa 2100 agaagaaatt caatcctaac tgagacctta caccgtttct cattagaagg 2150 agatgctcct gtctcctgga cagaaacaaa aaaacaatct tttaaacaga 2200 ctggagagtt tggggaaaaa aggaagaatt 
ctattctcaa tccaatcaac 2250 tctatacgaa aattttccat tgtgcaaaag actcccttac aaatgaatgg 2300 catcgaagag gattctgatg agcctttaga gagaaggctg tccttagtac 2350 cagattctga gcagggagag gcgatactgc ctcgcatcag cgtgatcagc 2400 actggcccca cgcttcaggc acgaaggagg cagtctgtcc tgaacctgat 2450 gacacactca gttaaccaag gtcagaacat tcaccgaaag acaacagcat 2500 ccacacgaaa agtgtcactg gcccctcagg caaacttgac tgaactggat 2550 atatattcaa gaaggttatc tcaagaaact ggcttggaaa taagtgaaga 2600 aattaacgaa gaagacttaa aggagtgcct ttttgatgat atggagagca 2650 taccagcagt gactacatgg aacacatacc ttcgatatat tactgtccac 2700 aagagcttaa tttttgtgct aatttggtgc ttagtaattt ttctggcaga 2750 ggtggctgct tctttggttg tgctgtggct ccttggaaac actcctcttc 2800 aagacaaagg gaatagtact catagtagaa ataacagcta tgcagtgatt 2850 atcaccagca ccagttcgta ttatgtgttt tacatttacg tgggagtagc 2900 cgacactttg cttgctatgg gattcttcag aggtctacca ctggtgcata 2950 ctctaatcac agtgtcgaaa attttacacc acaaaatgtt acattctgtt 3000 cttcaagcac ctatgtcaac cctcaacacg ttgaaagcag gtgggattct 3050 taatagattc tccaaagata tagcaatttt ggatgacctt ctgcctctta 3100 ccatatttga cttcatccag ttgttattaa ttgtgattgg agctatagca 3150 gttgtcgcag ttttacaacc ctacatcttt gttgcaacag tgccagtgat 3200 agtggctttt attatgttga gagcatattt cctccaaacc tcacagcaac 3250 tcaaacaact ggaatctgaa ggcaggagtc caattttcac tcatcttgtt 3300 acaagcttaa aaggactatg gacacttcgt gccttcggac ggcagcctta 3350 ctttgaaact ctgttccaca aagctctgaa tttacatact gccaactggt 3400 tcttgtacct gtcaacactg cgctggttcc aaatgagaat agaaatgatt 3450 tttgtcatct tcttcattgc tgttaccttc atttccattt taacaacagg 3500 agaaggagaa ggaagagttg gtattatcct gactttagcc atgaatatca 3550 tgagtacatt gcagtgggct gtaaactcca gcatagatgt ggatagcttg 3600 atgcgatctg tgagccgagt ctttaagttc attgacatgc caacagaagg 3650 taaacctacc aagtcaacca aaccatacaa gaatggccaa ctctcgaaag 3700 ttatgattat tgagaattca cacgtgaaga aagatgacat ctggccctca 3750 gggggccaaa tgactgtcaa agatctcaca gcaaaataca cagaaggtgg 3800 aaatgccata ttagagaaca tttccttctc aataagtcct ggccagaggg 3850 tgggcctctt gggaagaact ggatcaggga agagtacttt 
gttatcagct 3900 tttttgagac tactgaacac tgaaggagaa atccagatcg atggtgtgtc 3950 ttgggattca ataactttgc aacagtggag gaaagccttt ggagtgatac 4000 cacagaaagt atttattttt tctggaacat ttagaaaaaa cttggatccc 4050 tatgaacagt ggagtgatca agaaatatgg aaagttgcag atgaggttgg 4100 gctcagatct gtgatagaac agtttcctgg gaagcttgac tttgtccttg 4150 tggatggggg ctgtgtccta agccatggcc acaagcagtt gatgtgcttg 4200 gctagatctg ttctcagtaa ggcgaagatc ttgctgcttg atgaacccag 4250 tgctcatttg gatccagtaa cataccaaat aattagaaga actctaaaac 4300 aagcatttgc tgattgcaca gtaattctct gtgaacacag gatagaagca 4350 atgctggaat gccaacaatt tttggtcata gaagagaaca aagtgcggca 4400 gtacgattcc atccagaaac tgctgaacga gaggagcctc ttccggcaag 4450 ccatcagccc ctccgacagg gtgaagctct ttccccaccg gaactcaagc 4500 aagtgcaagt ctaagcccca gattgctgct ctgaaagagg agacagaaga 4550 agaggtgcaa gatacaaggc tttagagagc agcataaatg
ttgacatggg 4600 acatttgctc atggaattgg agctcgtggg acagtcacct catggaattg 4650 gagctcgtgg aacagttacc tctgcctcag aaaacaagga tgaattaagt 4700 ttttttttaa aaaagaaaca tttggtaagg ggaattgagg acactgatat 4750 gggtcttgat aaatggcttc ctggcaatag tcaaattgtg tgaaaggtac 4800 ttcaaatcct tgaagattta ccacttgtgt tttgcaagcc agattttcct 4850 gaaaaccctt gccatgtgct agtaattgga aaggcagctc taaatgtcaa 4900 tcagcctagt tgatcagctt attgtctagt gaaactcgtt aatttgtagt 4950 gttggagaag aactgaaatc atacttctta gggttatgat taagtaatga 5000 taactggaaa cttcagcggt ttatataagc ttgtattcct ttttctctcc 5050 tctccccatg atgtttagaa acacaactat attgtttgct aagcattcca 5100 actatctcat ttccaagcaa gtattagaat accacaggaa ccacaagact 5150 gcacatcaaa atatgcccca ttcaacatct agtgagcagt caggaaagag 5200 aacttccaga tcctggaaat cagggttagt attgtccagg tctaccaaaa 5250 atctcaatat ttcagataat cacaatacat cccttacctg ggaaagggct 5300 gttataatct ttcacagggg acaggatggt tcccttgatg aagaagttga 5350 tatgcctttt cccaactcca gaaagtgaca agctcacaga cctttgaact 5400 agagtttagc tggaaaagta tgttagtgca aattgtcaca ggacagccct 5450 tctttccaca gaagctccag gtagagggtg tgtaagtaga taggccatgg 5500 gcactgtggg tagacacaca tgaagtccaa gcatttagat gtataggttg 5550 atggtggtat gttttcaggc tagatgtatg tacttcatgc tgtctacact 5600 aagagagaat gagagacaca ctgaagaagc accaatcatg aattagtttt 5650 atatgcttct gttttataat tttgtgaagc aaaatttttt ctctaggaaa 5700 tatttatttt aataatgttt caaacatata ttacaatgct gtattttaaa 5750 agaatgatta tgaattacat ttgtataaaa taatttttat atttgaaata 5800 ttgacttttt atggcactag tatttttatg aaatattatg ttaaaactgg 5850 gacaggggag aacctagggt gatattaacc aggggccatg aatcaccttt 5900 tggtctggag ggaagccttg gggctgatcg agttgttgcc cacagctgta 5950 tgattcccag ccagacacag cctcttagat gcagttctga agaagatggt 6000 accaccagtc tgactgtttc catcaagggt acactgcctt ctcaactcca 6050 aactgactct taagaagact gcattatatt tattactgta agaaaatatc 6100 acttgtcaat aaaatccata catttgtgt 6129 50 1480 PRT Homo sapien 50 Met Gln Arg Ser Pro Leu Glu Lys Ala Ser Val Val Ser Lys Leu 1 5 10 15 Phe Phe Ser Trp Thr Arg Pro Ile Leu Arg Lys Gly 
Tyr Arg Gln 20 25 30 Arg Leu Glu Leu Ser Asp Ile Tyr Gln Ile Pro Ser Val Asp Ser 35 40 45 Ala Asp Asn Leu Ser Glu Lys Leu Glu Arg Glu Trp Asp Arg Glu 50 55 60 Leu Ala Ser Lys Lys Asn Pro Lys Leu Ile Asn Ala Leu Arg Arg 65 70 75 Cys Phe Phe Trp Arg Phe Met Phe Tyr Gly Ile Phe Leu Tyr Leu 80 85 90 Gly Glu Val Thr Lys Ala Val Gln Pro Leu Leu Leu Gly Arg Ile 95 100 105 Ile Ala Ser Tyr Asp Pro Asp Asn Lys Glu Glu Arg Ser Ile Ala 110 115 120 Ile Tyr Leu Gly Ile Gly Leu Cys Leu Leu Phe Ile Val Arg Thr 125 130 135 Leu Leu Leu His Pro Ala Ile Phe Gly Leu His His Ile Gly Met 140 145 150 Gln Met Arg Ile Ala Met Phe Ser Leu Ile Tyr Lys Lys Thr Leu 155 160 165 Lys Leu Ser Ser Arg Val Leu Asp Lys Ile Ser Ile Gly Gln Leu 170 175 180 Val Ser Leu Leu Ser Asn Asn Leu Asn Lys Phe Asp Glu Gly Leu 185 190 195 Ala Leu Ala His Phe Val Trp Ile Ala Pro Leu Gln Val Ala Leu 200 205 210 Leu Met Gly Leu Ile Trp Glu Leu Leu Gln Ala Ser Ala Phe Cys 215 220 225 Gly Leu Gly Phe Leu Ile Val Leu Ala Leu Phe Gln Ala Gly Leu 230 235 240 Gly Arg Met Met Met Lys Tyr Arg Asp Gln Arg Ala Gly Lys Ile 245 250 255 Ser Glu Arg Leu Val Ile Thr Ser Glu Met Ile Glu Asn Ile Gln 260 265 270 Ser Val Lys Ala Tyr Cys Trp Glu Glu Ala Met Glu Lys Met Ile 275 280 285 Glu Asn Leu Arg Gln Thr Glu Leu Lys Leu Thr Arg Lys Ala Ala 290 295 300 Tyr Val Arg Tyr Phe Asn Ser Ser Ala Phe Phe Phe Ser Gly Phe 305 310 315 Phe Val Val Phe Leu Ser Val Leu Pro Tyr Ala Leu Ile Lys Gly 320 325 330 Ile Ile Leu Arg Lys Ile Phe Thr Thr Ile Ser Phe Cys Ile Val 335 340 345 Leu Arg Met Ala Val Thr Arg Gln Phe Pro Trp Ala Val Gln Thr 350 355 360 Trp Tyr Asp Ser Leu Gly Ala Ile Asn Lys Ile Gln Asp Phe Leu 365 370 375 Gln Lys Gln Glu Tyr Lys Thr Leu Glu Tyr Asn Leu Thr Thr Thr 380 385 390 Glu Val Val Met Glu Asn Val Thr Ala Phe Trp Glu Glu Gly Phe 395 400 405 Gly Glu Leu Phe Glu Lys Ala Lys Gln Asn Asn Asn Asn Arg Lys 410 415 420 Thr Ser Asn Gly Asp Asp Ser Leu Phe Phe Ser Asn Phe Ser Leu 425 430 435 Leu Gly Thr Pro Val Leu Lys Asp Ile Asn Phe Lys 
Ile Glu Arg 440 445 450 Gly Gln Leu Leu Ala Val Ala Gly Ser Thr Gly Ala Gly Lys Thr 455 460 465 Ser Leu Leu Met Met Ile Met Gly Glu Leu Glu Pro Ser Glu Gly 470 475 480 Lys Ile Lys His Ser Gly Arg Ile Ser Phe Cys Ser Gln Phe Ser 485 490 495 Trp Ile Met Pro Gly Thr Ile Lys Glu Asn Ile Ile Phe Gly Val 500 505 510 Ser Tyr Asp Glu Tyr Arg Tyr Arg Ser Val Ile Lys Ala Cys Gln 515 520 525 Leu Glu Glu Asp Ile Ser Lys Phe Ala Glu Lys Asp Asn Ile Val 530 535 540 Leu Gly Glu Gly Gly Ile Thr Leu Ser Gly Gly Gln Arg Ala Arg 545 550 555 Ile Ser Leu Ala Arg Ala Val Tyr Lys Asp Ala Asp Leu Tyr Leu 560 565 570 Leu Asp Ser Pro Phe Gly Tyr Leu Asp Val Leu Thr Glu Lys Glu 575 580 585 Ile Phe Glu Ser Cys Val Cys Lys Leu Met Ala Asn Lys Thr Arg 590 595 600 Ile Leu Val Thr Ser Lys Met Glu His Leu Lys Lys Ala Asp Lys 605 610 615 Ile Leu Ile Leu Asn Glu Gly Ser Ser Tyr Phe Tyr Gly Thr Phe 620 625 630 Ser Glu Leu Gln Asn Leu Gln Pro Asp Phe Ser Ser Lys Leu Met 635 640 645 Gly Cys Asp Ser Phe Asp Gln Phe Ser Ala Glu Arg Arg Asn Ser 650 655 660 Ile Leu Thr Glu Thr Leu His Arg Phe Ser Leu Glu Gly Asp Ala 665 670 675 Pro Val Ser Trp Thr Glu Thr Lys Lys Gln Ser Phe Lys Gln Thr 680 685 690 Gly Glu Phe Gly Glu Lys Arg Lys Asn Ser Ile Leu Asn Pro Ile 695 700 705 Asn Ser Ile Arg Lys Phe Ser Ile Val Gln Lys Thr Pro Leu Gln 710 715 720 Met Asn Gly Ile Glu Glu Asp Ser Asp Glu Pro Leu Glu Arg Arg 725 730 735 Leu Ser Leu Val Pro Asp Ser Glu Gln Gly Glu Ala Ile Leu Pro 740 745 750 Arg Ile Ser Val Ile Ser Thr Gly Pro Thr Leu Gln Ala Arg Arg 755 760 765 Arg Gln Ser Val Leu Asn Leu Met Thr His Ser Val Asn Gln Gly 770 775 780 Gln Asn Ile His Arg Lys Thr Thr Ala Ser Thr Arg Lys Val Ser 785 790 795 Leu Ala Pro Gln Ala Asn Leu Thr Glu Leu Asp Ile Tyr Ser Arg 800 805 810 Arg Leu Ser Gln Glu Thr Gly Leu Glu Ile Ser Glu Glu Ile Asn 815 820 825 Glu Glu Asp Leu Lys Glu Cys Leu Phe Asp Asp Met Glu Ser Ile 830 835 840 Pro Ala Val Thr Thr Trp Asn Thr Tyr Leu Arg Tyr Ile Thr Val 845 850 855 His Lys Ser Leu Ile Phe Val Leu 
Ile Trp Cys Leu Val Ile Phe 860 865 870 Leu Ala Glu Val Ala Ala Ser Leu Val Val Leu Trp Leu Leu Gly 875 880 885 Asn Thr Pro Leu Gln Asp Lys Gly Asn Ser Thr His Ser Arg Asn 890 895 900 Asn Ser Tyr Ala Val Ile Ile Thr Ser Thr Ser Ser Tyr Tyr Val 905 910 915 Phe Tyr Ile Tyr Val Gly Val Ala Asp Thr Leu Leu Ala Met Gly 920 925 930 Phe Phe Arg Gly Leu Pro Leu Val His Thr Leu Ile Thr Val Ser 935 940 945 Lys Ile Leu His His Lys Met Leu His Ser Val Leu Gln Ala Pro 950 955 960 Met Ser Thr Leu Asn Thr Leu Lys Ala Gly Gly Ile Leu Asn Arg 965 970 975 Phe Ser Lys Asp Ile Ala Ile Leu Asp Asp Leu Leu Pro Leu Thr 980 985 990 Ile Phe Asp Phe Ile Gln Leu Leu Leu Ile Val Ile Gly Ala Ile 995 1000 1005 Ala Val Val Ala Val Leu Gln Pro Tyr Ile Phe Val Ala Thr Val 1010 1015 1020 Pro Val Ile Val Ala Phe Ile Met Leu Arg Ala Tyr Phe Leu Gln 1025 1030 1035 Thr Ser Gln Gln Leu Lys Gln Leu Glu Ser Glu Gly Arg Ser Pro 1040 1045 1050 Ile Phe Thr His Leu Val Thr Ser Leu Lys Gly Leu Trp Thr Leu 1055 1060 1065 Arg Ala Phe Gly Arg Gln Pro Tyr Phe Glu Thr Leu Phe His Lys 1070 1075 1080 Ala Leu Asn Leu His Thr Ala Asn Trp Phe Leu Tyr Leu Ser Thr 1085 1090 1095 Leu Arg Trp Phe Gln Met Arg Ile Glu Met Ile Phe Val Ile Phe 1100 1105 1110 Phe Ile Ala Val Thr Phe Ile Ser Ile Leu Thr Thr Gly Glu Gly 1115 1120 1125 Glu Gly Arg Val Gly Ile Ile Leu Thr Leu Ala Met Asn Ile Met 1130 1135 1140 Ser Thr Leu Gln Trp Ala Val Asn Ser Ser Ile Asp Val Asp Ser 1145 1150 1155 Leu Met Arg Ser Val Ser Arg Val Phe Lys Phe Ile Asp Met Pro 1160 1165 1170 Thr Glu Gly Lys Pro Thr Lys Ser Thr Lys Pro Tyr Lys Asn Gly 1175 1180 1185 Gln Leu Ser Lys Val Met Ile Ile Glu Asn Ser His Val Lys Lys 1190 1195 1200 Asp Asp Ile Trp Pro Ser Gly Gly Gln Met Thr Val Lys Asp Leu 1205 1210 1215 Thr Ala Lys Tyr Thr Glu Gly Gly Asn Ala Ile Leu Glu Asn Ile 1220 1225 1230 Ser Phe Ser Ile Ser Pro Gly Gln Arg Val Gly Leu Leu Gly Arg 1235 1240 1245 Thr Gly Ser Gly Lys Ser Thr Leu Leu Ser Ala Phe Leu Arg Leu 1250 1255 1260 Leu Asn Thr Glu Gly Glu Ile Gln 
Ile Asp Gly Val Ser Trp Asp 1265 1270 1275 Ser Ile Thr Leu Gln Gln Trp Arg Lys Ala Phe Gly Val Ile Pro 1280 1285 1290 Gln Lys Val Phe Ile Phe Ser Gly Thr Phe Arg Lys Asn Leu Asp 1295 1300 1305 Pro Tyr Glu Gln Trp Ser Asp Gln Glu Ile Trp Lys Val Ala Asp 1310 1315 1320 Glu Val Gly Leu Arg Ser Val Ile Glu Gln Phe Pro Gly Lys Leu 1325 1330 1335 Asp Phe Val Leu Val Asp Gly Gly Cys Val Leu Ser His Gly His 1340 1345 1350 Lys Gln Leu Met Cys Leu Ala Arg Ser Val Leu Ser Lys Ala Lys 1355 1360 1365 Ile Leu Leu Leu Asp Glu Pro Ser Ala His Leu Asp Pro Val Thr 1370 1375 1380 Tyr Gln Ile Ile Arg Arg Thr Leu Lys Gln Ala Phe Ala Asp Cys 1385 1390 1395 Thr Val Ile Leu Cys Glu His Arg Ile Glu Ala Met Leu Glu Cys 1400 1405 1410 Gln Gln Phe Leu Val Ile Glu Glu Asn Lys Val Arg Gln Tyr Asp 1415 1420 1425 Ser Ile Gln Lys Leu Leu Asn Glu Arg Ser Leu Phe Arg Gln Ala 1430 1435 1440 Ile Ser Pro Ser Asp Arg Val Lys Leu Phe Pro His Arg Asn Ser 1445 1450 1455 Ser Lys Cys Lys Ser Lys Pro Gln Ile Ala Ala Leu Lys Glu Glu 1460 1465 1470 Thr Glu Glu Glu Val Gln Asp Thr Arg Leu 1475 1480 51 1847 DNA Homo sapien 51 ctcctgccct ccactgactc cagagaggga gatccccagt acttgactcc 50 atcacgcaga tgggagcagg caccagctat ggagagggat acagctgcgt 100 ctccacatga cccatcctgc atgacaccaa agccaccgcc agacagtgcc 150 tcggattcta tgcaaaacct gggaagcgga gacctacccc agccccggga 200 ggaagctagc tcttcagggg accgtctgag gactggagtt tgatccatga 250 acctggcttc gaggccttgc ttttctctct tcttcattca tattcattcc 300 caacacctta gaaggtgttg cttaatttat ttctagaaaa gcagcccaga 350 gtcagtcatt gaagccttcc ccaccccctg gccaaaaaaa aaaaaaaaaa 400 aaaactggac acattttgga tctgttggga gcttggagtc cagtggttgg 450 catagttgtc acattgggag cagagaagaa gcaaccaggg gccctgatca 500 ggggactgag ccgtagagtc ccaggatggc acccaatggc acagcctctt 550 ccttttgcct ggactctacc gcatgcaaga tcaccatcac cgtggtcctt 600 gcggtcctca tcctcatcac cgttgctggc aatgtggtcg tctgtctggc 650 cgtgggcttg aaccgccggc tccgcaacct gaccaattgt ttcatcgtgt 700 ccttggctat cactgacctg ctcctcggcc tcctggtgct gcccttctct 750 gccatctacc 
agctgtcctg caagtggagc tttggcaagg tcttctgcaa 800 tatctacacc agcctggatg tgatgctctg cacagcctcc attcttaacc 850 tcttcatgat cagcctcgac cggtactgcg ctgtcatgga cccactgcgg 900 taccctgtgc tggtcacccc agttcgggtc gccatctctc tggtcttaat 950 ttgggtcatc tccattaccc tgtcctttct gtctatccac ctggggtgga 1000 acagcaggaa cgagaccagc aagggcaatc ataccacctc taagtgcaaa 1050 gtccaggtca atgaagtgta cgggctggtg gatgggctgg tcaccttcta 1100 cctcccgcta ctgatcatgt gcatcaccta ctaccgcatc ttcaaggtcg 1150 cccgggatca ggccaagagg atcaatcaca ttagctcctg gaaggcagcc 1200 accatcaggg agcacaaagc cacagtgaca ctggccgccg tcatgggggc 1250 cttcatcatc tgctggtttc cctacttcac cgcgtttgtg taccgtgggc 1300 tgagagggga tgatgccatc aatgaggtgt tagaagccat cgttctgtgg 1350 ctgggctatg ccaactcagc cctgaacccc atcctgtatg ctgcgctgaa 1400 cagagacttc cgcaccgggt accaacagct cttctgctgc aggctggcca 1450 accgcaactc ccacaaaact tctctgaggt ccaacgcctc tcagctgtcc 1500 aggacccaaa gccgagaacc caggcaacag gaagagaaac ccctgaagct 1550 ccaggtgtgg agtgggacag aagtcacggc cccccaggga gccacagaca 1600 ggtaatagcc ctagccattg gtgcacagga tgggggcaat gggaggggat 1650 gctactgatg ggaatgatta agggagctgc tgtttaggtg gtgctggttt 1700 atgttctagg aactcttcat gagcactttg taaacaccct cttgcttaat 1750 cctcccaacg gcccccaaag gtagaactta gctccctttt aaaaggagca 1800 cattaaaatt ctcagaggac ttggcaaggg ccgcacagct ggggcat 1847 52 359 PRT Homo sapien 52 Met Ala Pro Asn Gly Thr Ala Ser Ser Phe Cys Leu Asp Ser Thr 1 5 10 15 Ala Cys Lys Ile Thr Ile Thr Val Val Leu Ala Val Leu Ile Leu 20 25 30 Ile Thr Val Ala Gly Asn Val Val Val Cys Leu Ala Val Gly Leu 35 40 45 Asn Arg Arg Leu Arg Asn Leu Thr Asn Cys Phe Ile Val Ser Leu 50 55 60 Ala Ile Thr Asp Leu Leu Leu Gly Leu Leu Val Leu Pro Phe Ser 65 70 75 Ala Ile Tyr Gln Leu Ser Cys Lys Trp Ser Phe Gly Lys Val Phe 80 85 90 Cys Asn Ile Tyr Thr Ser Leu Asp Val Met Leu Cys Thr Ala Ser 95 100 105 Ile Leu Asn Leu Phe Met Ile Ser Leu Asp Arg Tyr Cys Ala Val 110 115 120 Met Asp Pro Leu Arg Tyr Pro Val Leu Val Thr Pro Val Arg Val 125 130 135 Ala Ile Ser Leu Val Leu Ile Trp Val Ile 
Ser Ile Thr Leu Ser 140 145 150 Phe Leu Ser Ile His Leu Gly Trp Asn Ser Arg Asn Glu Thr Ser 155 160 165 Lys Gly Asn His Thr Thr Ser Lys Cys Lys Val Gln Val Asn Glu 170 175 180 Val Tyr Gly Leu Val Asp
Gly Leu Val Thr Phe Tyr Leu Pro Leu 185 190 195 Leu Ile Met Cys Ile Thr Tyr Tyr Arg Ile Phe Lys Val Ala Arg 200 205 210 Asp Gln Ala Lys Arg Ile Asn His Ile Ser Ser Trp Lys Ala Ala 215 220 225 Thr Ile Arg Glu His Lys Ala Thr Val Thr Leu Ala Ala Val Met 230 235 240 Gly Ala Phe Ile Ile Cys Trp Phe Pro Tyr Phe Thr Ala Phe Val 245 250 255 Tyr Arg Gly Leu Arg Gly Asp Asp Ala Ile Asn Glu Val Leu Glu 260 265 270 Ala Ile Val Leu Trp Leu Gly Tyr Ala Asn Ser Ala Leu Asn Pro 275 280 285 Ile Leu Tyr Ala Ala Leu Asn Arg Asp Phe Arg Thr Gly Tyr Gln 290 295 300 Gln Leu Phe Cys Cys Arg Leu Ala Asn Arg Asn Ser His Lys Thr 305 310 315 Ser Leu Arg Ser Asn Ala Ser Gln Leu Ser Arg Thr Gln Ser Arg 320 325 330 Glu Pro Arg Gln Gln Glu Glu Lys Pro Leu Lys Leu Gln Val Trp 335 340 345 Ser Gly Thr Glu Val Thr Ala Pro Gln Gly Ala Thr Asp Arg 350 355 53 5512 DNA Homo sapien 53 gagctagccc cggcggccgc cgccgcccag accggacgac aggccacctc 50 gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc 100 gcacggcccc ctgactccgt ccagtattga tcgggagagc cggagcgagc 150 tcttcgggga gcagcgatgc gaccctccgg gacggccggg gcagcgctcc 200 tggcgctgct ggctgcgctc tgcccggcga gtcgggctct ggaggaaaag 250 aaagtttgcc aaggcacgag taacaagctc acgcagttgg gcacttttga 300 agatcatttt ctcagcctcc agaggatgtt caataactgt gaggtggtcc 350 ttgggaattt ggaaattacc tatgtgcaga ggaattatga tctttccttc 400 ttaaagacca tccaggaggt ggctggttat gtcctcattg ccctcaacac 450 agtggagcga attcctttgg aaaacctgca gatcatcaga ggaaatatgt 500 actacgaaaa ttcctatgcc ttagcagtct tatctaacta tgatgcaaat 550 aaaaccggac tgaaggagct gcccatgaga aatttacagg aaatcctgca 600 tggcgccgtg cggttcagca acaaccctgc cctgtgcaac gtggagagca 650 tccagtggcg ggacatagtc agcagtgact ttctcagcaa catgtcgatg 700 gacttccaga accacctggg cagctgccaa aagtgtgatc caagctgtcc 750 caatgggagc tgctggggtg caggagagga gaactgccag aaactgacca 800 aaatcatctg tgcccagcag tgctccgggc gctgccgtgg caagtccccc 850 agtgactgct gccacaacca gtgtgctgca ggctgcacag gcccccggga 900 gagcgactgc ctggtctgcc gcaaattccg agacgaagcc acgtgcaagg 950 acacctgccc 
cccactcatg ctctacaacc ccaccacgta ccagatggat 1000 gtgaaccccg agggcaaata cagctttggt gccacctgcg tgaagaagtg 1050 tccccgtaat tatgtggtga cagatcacgg ctcgtgcgtc cgagcctgtg 1100 gggccgacag ctatgagatg gaggaagacg gcgtccgcaa gtgtaagaag 1150 tgcgaagggc cttgccgcaa agtgtgtaac ggaataggta ttggtgaatt 1200 taaagactca ctctccataa atgctacgaa tattaaacac ttcaaaaact 1250 gcacctccat cagtggcgat ctccacatcc tgccggtggc atttaggggt 1300 gactccttca cacatactcc tcctctggat ccacaggaac tggatattct 1350 gaaaaccgta aaggaaatca cagggttttt gctgattcag gcttggcctg 1400 aaaacaggac ggacctccat gcctttgaga acctagaaat catacgcggc 1450 aggaccaagc aacatggtca gttttctctt gcagtcgtca gcctgaacat 1500 aacatccttg ggattacgct ccctcaagga gataagtgat ggagatgtga 1550 taatttcagg aaacaaaaat ttgtgctatg caaatacaat aaactggaaa 1600 aaactgtttg ggacctccgg tcagaaaacc aaaattataa gcaacagagg 1650 tgaaaacagc tgcaaggcca caggccaggt ctgccatgcc ttgtgctccc 1700 ccgagggctg ctggggcccg gagcccaggg actgcgtctc ttgccggaat 1750 gtcagccgag gcagggaatg cgtggacaag tgcaaccttc tggagggtga 1800 gccaagggag tttgtggaga actctgagtg catacagtgc cacccagagt 1850 gcctgcctca ggccatgaac atcacctgca caggacgggg accagacaac 1900 tgtatccagt gtgcccacta cattgacggc ccccactgcg tcaagacctg 1950 cccggcagga gtcatgggag aaaacaacac cctggtctgg aagtacgcag 2000 acgccggcca tgtgtgccac ctgtgccatc caaactgcac ctacggatgc 2050 actgggccag gtcttgaagg ctgtccaacg aatgggccta agatcccgtc 2100 catcgccact gggatggtgg gggccctcct cttgctgctg gtggtggccc 2150 tggggatcgg cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg 2200 ctgcggaggc tgctgcagga gagggagctt gtggagcctc ttacacccag 2250 tggagaagct cccaaccaag ctctcttgag gatcttgaag gaaactgaat 2300 tcaaaaagat caaagtgctg ggctccggtg cgttcggcac ggtgtataag 2350 ggactctgga tcccagaagg tgagaaagtt aaaattcccg tcgctatcaa 2400 ggaattaaga gaagcaacat ctccgaaagc caacaaggaa atcctcgatg 2450 aagcctacgt gatggccagc gtggacaacc cccacgtgtg ccgcctgctg 2500 ggcatctgcc tcacctccac cgtgcagctc atcacgcagc tcatgccctt 2550 cggctgcctc ctggactatg tccgggaaca caaagacaat attggctccc 2600 agtacctgct caactggtgt 
gtgcagatcg caaagggcat gaactacttg 2650 gaggaccgtc gcttggtgca ccgcgacctg gcagccagga acgtactggt 2700 gaaaacaccg cagcatgtca agatcacaga ttttgggctg gccaaactgc 2750 tgggtgcgga agagaaagaa taccatgcag aaggaggcaa agtgcctatc 2800 aagtggatgg cattggaatc aattttacac agaatctata cccaccagag 2850 tgatgtctgg agctacgggg tgaccgtttg ggagttgatg acctttggat 2900 ccaagccata tgacggaatc cctgccagcg agatctcctc catcctggag 2950 aaaggagaac gcctccctca gccacccata tgtaccatcg atgtctacat 3000 gatcatggtc aagtgctgga tgatagacgc agatagtcgc ccaaagttcc 3050 gtgagttgat catcgaattc tccaaaatgg cccgagaccc ccagcgctac 3100 cttgtcattc agggggatga aagaatgcat ttgccaagtc ctacagactc 3150 caacttctac cgtgccctga tggatgaaga agacatggac gacgtggtgg 3200 atgccgacga gtacctcatc ccacagcagg gcttcttcag cagcccctcc 3250 acgtcacgga ctcccctcct gagctctctg agtgcaacca gcaacaattc 3300 caccgtggct tgcattgata gaaatgggct gcaaagctgt cccatcaagg 3350 aagacagctt cttgcagcga tacagctcag accccacagg cgccttgact 3400 gaggacagca tagacgacac cttcctccca gtgcctgaat acataaacca 3450 gtccgttccc aaaaggcccg ctggctctgt gcagaatcct gtctatcaca 3500 atcagcctct gaaccccgcg cccagcagag acccacacta ccaggacccc 3550 cacagcactg cagtgggcaa ccccgagtat ctcaacactg tccagcccac 3600 ctgtgtcaac agcacattcg acagccctgc ccactgggcc cagaaaggca 3650 gccaccaaat tagcctggac aaccctgact accagcagga cttctttccc 3700 aaggaagcca agccaaatgg catctttaag ggctccacag ctgaaaatgc 3750 agaataccta agggtcgcgc cacaaagcag tgaatttatt ggagcatgac 3800 cacggaggat agtatgagcc ctaaaaatcc agactctttc gatacccagg 3850 accaagccac agcaggtcct ccatcccaac agccatgccc gcattagctc 3900 ttagacccac agactggttt tgcaacgttt acaccgacta gccaggaagt 3950 acttccacct cgggcacatt ttgggaagtt gcattccttt gtcttcaaac 4000 tgtgaagcat ttacagaaac gcatccagca agaatattgt ccctttgagc 4050 agaaatttat ctttcaaaga ggtatatttg aaaaaaaaaa aaaaagtata 4100 tgtgaggatt tttattgatt ggggatcttg gagtttttca ttgtcgctat 4150 tgatttttac ttcaatgggc tcttccaaca aggaagaagc ttgctggtag 4200 cacttgctac cctgagttca tccaggccca actgtgagca aggagcacaa 4250 gccacaagtc ttccagagga tgcttgattc 
cagtggttct gcttcaaggc 4300 ttccactgca aaacactaaa gatccaagaa ggccttcatg gccccagcag 4350 gccggatcgg tactgtatca agtcatggca ggtacagtag gataagccac 4400 tctgtccctt cctgggcaaa gaagaaacgg aggggatgaa ttcttcctta 4450 gacttacttt tgtaaaaatg tccccacggt acttactccc cactgatgga 4500 ccagtggttt ccagtcatga gcgttagact gacttgtttg tcttccattc 4550 cattgttttg aaactcagta tgccgcccct gtcttgctgt catgaaatca 4600 gcaagagagg atgacacatc aaataataac tcggattcca gcccacattg 4650 gattcatcag catttggacc aatagcccac agctgagaat gtggaatacc 4700 taaggataac accgcttttg ttctcgcaaa aacgtatctc ctaatttgag 4750 gctcagatga aatgcatcag gtcctttggg gcatagatca gaagactaca 4800 aaaatgaagc tgctctgaaa tctcctttag ccatcacccc aaccccccaa 4850 aattagtttg tgttacttat ggaagatagt tttctccttt tacttcactt 4900 caaaagcttt ttactcaaag agtatatgtt ccctccaggt cagctgcccc 4950 caaaccccct ccttacgctt tgtcacacaa aaagtgtctc tgccttgagt 5000 catctattca agcacttaca gctctggcca caacagggca ttttacaggt 5050 gcgaatgaca gtagcattat gagtagtgtg aattcaggta gtaaatatga 5100 aactagggtt tgaaattgat aatgctttca caacatttgc agatgtttta 5150 gaaggaaaaa agttccttcc taaaataatt tctctacaat tggaagattg 5200 gaagattcag ctagttagga gcccattttt tcctaatctg tgtgtgccct 5250 gtaacctgac tggttaacag cagtcctttg taaacagtgt tttaaactct 5300 cctagtcaat atccacccca tccaatttat caaggaagaa atggttcaga 5350 aaatattttc agcctacagt tatgttcagt cacacacaca tacaaaatgt 5400 tccttttgct tttaaagtaa tttttgactc ccagatcagt cagagcccct 5450 acagcattgt taagaaagta tttgattttt gtctcaatga aaataaaact 5500 atattcattt cc 5512 54 1210 PRT Homo sapien 54 Met Arg Pro Ser Gly Thr Ala Gly Ala Ala Leu Leu Ala Leu Leu 1 5 10 15 Ala Ala Leu Cys Pro Ala Ser Arg Ala Leu Glu Glu Lys Lys Val 20 25 30 Cys Gln Gly Thr Ser Asn Lys Leu Thr Gln Leu Gly Thr Phe Glu 35 40 45 Asp His Phe Leu Ser Leu Gln Arg Met Phe Asn Asn Cys Glu Val 50 55 60 Val Leu Gly Asn Leu Glu Ile Thr Tyr Val Gln Arg Asn Tyr Asp 65 70 75 Leu Ser Phe Leu Lys Thr Ile Gln Glu Val Ala Gly Tyr Val Leu 80 85 90 Ile Ala Leu Asn Thr Val Glu Arg Ile Pro Leu Glu Asn Leu Gln 95 100 105 
Ile Ile Arg Gly Asn Met Tyr Tyr Glu Asn Ser Tyr Ala Leu Ala 110 115 120 Val Leu Ser Asn Tyr Asp Ala Asn Lys Thr Gly Leu Lys Glu Leu 125 130 135 Pro Met Arg Asn Leu Gln Glu Ile Leu His Gly Ala Val Arg Phe 140 145 150 Ser Asn Asn Pro Ala Leu Cys Asn Val Glu Ser Ile Gln Trp Arg 155 160 165 Asp Ile Val Ser Ser Asp Phe Leu Ser Asn Met Ser Met Asp Phe 170 175 180 Gln Asn His Leu Gly Ser Cys Gln Lys Cys Asp Pro Ser Cys Pro 185 190 195 Asn Gly Ser Cys Trp Gly Ala Gly Glu Glu Asn Cys Gln Lys Leu 200 205 210 Thr Lys Ile Ile Cys Ala Gln Gln Cys Ser Gly Arg Cys Arg Gly 215 220 225 Lys Ser Pro Ser Asp Cys Cys His Asn Gln Cys Ala Ala Gly Cys 230 235 240 Thr Gly Pro Arg Glu Ser Asp Cys Leu Val Cys Arg Lys Phe Arg 245 250 255 Asp Glu Ala Thr Cys Lys Asp Thr Cys Pro Pro Leu Met Leu Tyr 260 265 270 Asn Pro Thr Thr Tyr Gln Met Asp Val Asn Pro Glu Gly Lys Tyr 275 280 285 Ser Phe Gly Ala Thr Cys Val Lys Lys Cys Pro Arg Asn Tyr Val 290 295 300 Val Thr Asp His Gly Ser Cys Val Arg Ala Cys Gly Ala Asp Ser 305 310 315 Tyr Glu Met Glu Glu Asp Gly Val Arg Lys Cys Lys Lys Cys Glu 320 325 330 Gly Pro Cys Arg Lys Val Cys Asn Gly Ile Gly Ile Gly Glu Phe 335 340 345 Lys Asp Ser Leu Ser Ile Asn Ala Thr Asn Ile Lys His Phe Lys 350 355 360 Asn Cys Thr Ser Ile Ser Gly Asp Leu His Ile Leu Pro Val Ala 365 370 375 Phe Arg Gly Asp Ser Phe Thr His Thr Pro Pro Leu Asp Pro Gln 380 385 390 Glu Leu Asp Ile Leu Lys Thr Val Lys Glu Ile Thr Gly Phe Leu 395 400 405 Leu Ile Gln Ala Trp Pro Glu Asn Arg Thr Asp Leu His Ala Phe 410 415 420 Glu Asn Leu Glu Ile Ile Arg Gly Arg Thr Lys Gln His Gly Gln 425 430 435 Phe Ser Leu Ala Val Val Ser Leu Asn Ile Thr Ser Leu Gly Leu 440 445 450 Arg Ser Leu Lys Glu Ile Ser Asp Gly Asp Val Ile Ile Ser Gly 455 460 465 Asn Lys Asn Leu Cys Tyr Ala Asn Thr Ile Asn Trp Lys Lys Leu 470 475 480 Phe Gly Thr Ser Gly Gln Lys Thr Lys Ile Ile Ser Asn Arg Gly 485 490 495 Glu Asn Ser Cys Lys Ala Thr Gly Gln Val Cys His Ala Leu Cys 500 505 510 Ser Pro Glu Gly Cys Trp Gly Pro Glu Pro Arg Asp Cys Val 
Ser 515 520 525 Cys Arg Asn Val Ser Arg Gly Arg Glu Cys Val Asp Lys Cys Asn 530 535 540 Leu Leu Glu Gly Glu Pro Arg Glu Phe Val Glu Asn Ser Glu Cys 545 550 555 Ile Gln Cys His Pro Glu Cys Leu Pro Gln Ala Met Asn Ile Thr 560 565 570 Cys Thr Gly Arg Gly Pro Asp Asn Cys Ile Gln Cys Ala His Tyr 575 580 585 Ile Asp Gly Pro His Cys Val Lys Thr Cys Pro Ala Gly Val Met 590 595 600 Gly Glu Asn Asn Thr Leu Val Trp Lys Tyr Ala Asp Ala Gly His 605 610 615 Val Cys His Leu Cys His Pro Asn Cys Thr Tyr Gly Cys Thr Gly 620 625 630 Pro Gly Leu Glu Gly Cys Pro Thr Asn Gly Pro Lys Ile Pro Ser 635 640 645 Ile Ala Thr Gly Met Val Gly Ala Leu Leu Leu Leu Leu Val Val 650 655 660 Ala Leu Gly Ile Gly Leu Phe Met Arg Arg Arg His Ile Val Arg 665 670 675 Lys Arg Thr Leu Arg Arg Leu Leu Gln Glu Arg Glu Leu Val Glu 680 685 690 Pro Leu Thr Pro Ser Gly Glu Ala Pro Asn Gln Ala Leu Leu Arg 695 700 705 Ile Leu Lys Glu Thr Glu Phe Lys Lys Ile Lys Val Leu Gly Ser 710 715 720 Gly Ala Phe Gly Thr Val Tyr Lys Gly Leu Trp Ile Pro Glu Gly 725 730 735 Glu Lys Val Lys Ile Pro Val Ala Ile Lys Glu Leu Arg Glu Ala 740 745 750 Thr Ser Pro Lys Ala Asn Lys Glu Ile Leu Asp Glu Ala Tyr Val 755 760 765 Met Ala Ser Val Asp Asn Pro His Val Cys Arg Leu Leu Gly Ile 770 775 780 Cys Leu Thr Ser Thr Val Gln Leu Ile Thr Gln Leu Met Pro Phe 785 790 795 Gly Cys Leu Leu Asp Tyr Val Arg Glu His Lys Asp Asn Ile Gly 800 805 810 Ser Gln Tyr Leu Leu Asn Trp Cys Val Gln Ile Ala Lys Gly Met 815 820 825 Asn Tyr Leu Glu Asp Arg Arg Leu Val His Arg Asp Leu Ala Ala 830 835 840 Arg Asn Val Leu Val Lys Thr Pro Gln His Val Lys Ile Thr Asp 845 850 855 Phe Gly Leu Ala Lys Leu Leu Gly Ala Glu Glu Lys Glu Tyr His 860 865 870 Ala Glu Gly Gly Lys Val Pro Ile Lys Trp Met Ala Leu Glu Ser 875 880 885 Ile Leu His Arg Ile Tyr Thr His Gln Ser Asp Val Trp Ser Tyr 890 895 900 Gly Val Thr Val Trp Glu Leu Met Thr Phe Gly Ser Lys Pro Tyr 905 910 915 Asp Gly Ile Pro Ala Ser Glu Ile Ser Ser Ile Leu Glu Lys Gly 920 925 930 Glu Arg Leu Pro Gln Pro Pro Ile Cys Thr 
Ile Asp Val Tyr Met 935 940 945 Ile Met Val Lys Cys Trp Met Ile Asp Ala Asp Ser Arg Pro Lys 950 955 960 Phe Arg Glu Leu Ile Ile Glu Phe Ser Lys Met Ala Arg Asp Pro 965 970 975 Gln Arg Tyr Leu Val Ile Gln Gly Asp Glu Arg Met His Leu Pro 980 985 990 Ser Pro Thr Asp Ser Asn Phe Tyr Arg Ala Leu Met Asp Glu Glu 995 1000 1005 Asp Met Asp Asp Val Val Asp Ala Asp Glu Tyr Leu Ile Pro Gln 1010 1015 1020 Gln Gly Phe Phe Ser Ser Pro Ser Thr Ser Arg Thr Pro Leu Leu 1025 1030 1035 Ser Ser Leu Ser Ala Thr Ser Asn Asn Ser Thr Val Ala Cys Ile 1040 1045 1050 Asp Arg Asn Gly Leu Gln Ser Cys Pro Ile Lys Glu Asp Ser Phe 1055 1060 1065 Leu Gln Arg Tyr Ser Ser Asp Pro Thr Gly Ala Leu Thr Glu Asp 1070 1075 1080 Ser Ile Asp Asp Thr Phe Leu Pro Val Pro Glu Tyr Ile Asn Gln 1085 1090 1095 Ser Val Pro Lys Arg Pro Ala Gly Ser Val Gln Asn Pro Val
Tyr 1100 1105 1110 His Asn Gln Pro Leu Asn Pro Ala Pro Ser Arg Asp Pro His Tyr 1115 1120 1125 Gln Asp Pro His Ser Thr Ala Val Gly Asn Pro Glu Tyr Leu Asn 1130 1135 1140 Thr Val Gln Pro Thr Cys Val Asn Ser Thr Phe Asp Ser Pro Ala 1145 1150 1155 His Trp Ala Gln Lys Gly Ser His Gln Ile Ser Leu Asp Asn Pro 1160 1165 1170 Asp Tyr Gln Gln Asp Phe Phe Pro Lys Glu Ala Lys Pro Asn Gly 1175 1180 1185 Ile Phe Lys Gly Ser Thr Ala Glu Asn Ala Glu Tyr Leu Arg Val 1190 1195 1200 Ala Pro Gln Ser Ser Glu Phe Ile Gly Ala 1205 1210 55 4711 DNA Homo sapien 55 gccccgggaa gcgcagccat ggctctgcgg aggctggggg ccgcgctgct 50 gctgctgccg ctgctcgccg ccgtggaaga aacgctaatg gactccacta 100 cagcgactgc tgagctgggc tggatggtgc atcctccatc agggtgggaa 150 gaggtgagtg gctacgatga gaacatgaac acgatccgca cgtaccaggt 200 gtgcaacgtg tttgagtcaa gccagaacaa ctggctacgg accaagttta 250 tccggcgccg tggcgcccac cgcatccacg tggagatgaa gttttcggtg 300 cgtgactgca gcagcatccc cagcgtgcct ggctcctgca aggagacctt 350 caacctctat tactatgagg ctgactttga ctcggccacc aagaccttcc 400 ccaactggat ggagaatcca tgggtgaagg tggataccat tgcagccgac 450 gagagcttct cccaggtgga cctgggtggc cgcgtcatga aaatcaacac 500 cgaggtgcgg agcttcggac ctgtgtcccg cagcggcttc tacctggcct 550 tccaggacta tggcggctgc atgtccctca tcgccgtgcg tgtcttctac 600 cgcaagtgcc cccgcatcat ccagaatggc gccatcttcc aggaaaccct 650 gtcgggggct gagagcacat cgctggtggc tgcccggggc agctgcatcg 700 ccaatgcgga agaggtggat gtacccatca agctctactg taacggggac 750 ggcgagtggc tggtgcccat cgggcgctgc atgtgcaaag caggcttcga 800 ggccgttgag aatggcaccg tctgccgagg ttgtccatct gggactttca 850 aggccaacca aggggatgag gcctgtaccc actgtcccat caacagccgg 900 accacttctg aaggggccac caactgtgtc tgccgcaatg gctactacag 950 agcagacctg gaccccctgg acatgccctg cacaaccatc ccctccgcgc 1000 cccaggctgt gatttccagt gtcaatgaga cctccctcat gctggagtgg 1050 acccctcccc gcgactccgg aggccgagag gacctcgtct acaacatcat 1100 ctgcaagagc tgtggctcgg gccggggtgc ctgcacccgc tgcggggaca 1150 atgtacagta cgcaccacgc cagctaggcc tgaccgagcc acgcatttac 1200 atcagtgacc tgctggccca cacccagtac 
accttcgaga tccaggctgt 1250 gaacggcgtt actgaccaga gccccttctc gcctcagttc gcctctgtga 1300 acatcaccac caaccaggca gctccatcgg cagtgtccat catgcatcag 1350 gtgagccgca ccgtggacag cattaccctg tcgtggtccc agccggacca 1400 gcccaatggc gtgatcctgg actatgagct gcagtactat gagaaggagc 1450 tcagtgagta caacgccaca gccataaaaa gccccaccaa cacggtcacc 1500 gtgcagggcc tcaaagccgg cgccatctat gtcttccagg tgcgggcacg 1550 caccgtggca ggctacgggc gctacagcgg caagatgtac ttccagacca 1600 tgacagaagc cgagtaccag acaagcatcc aggagaagtt gccactcatc 1650 atcggctcct cggccgctgg cctggtcttc ctcattgctg tggttgtcat 1700 cgccatcgtg tgtaacagaa gacgggggtt tgagcgtgct gactcggagt 1750 acacggacaa gctgcaacac tacaccagtg gccacatgac cccaggcatg 1800 aagatctaca tcgatccttt cacctacgag gaccccaacg aggcagtgcg 1850 ggagtttgcc aaggaaattg acatctcctg tgtcaaaatt gagcaggtga 1900 tcggagcagg ggagtttggc gaggtctgca gtggccacct gaagctgcca 1950 ggcaagagag agatctttgt ggccatcaag acgctcaagt cgggctacac 2000 ggagaagcag cgccgggact tcctgagcga agcctccatc atgggccagt 2050 tcgaccatcc caacgtcatc cacctggagg gtgtcgtgac caagagcaca 2100 cctgtgatga tcatcaccga gttcatggag aatggctccc tggactcctt 2150 tctccggcaa aacgatgggc agttcacagt catccagctg gtgggcatgc 2200 ttcggggcat cgcagctggc atgaagtacc tggcagacat gaactatgtt 2250 caccgtgacc tggctgcccg caacatcctc gtcaacagca acctggtctg 2300 caaggtgtcg gactttgggc tctcacgctt tctagaggac gatacctcag 2350 accccaccta caccagtgcc ctgggcggaa agatccccat ccgctggaca 2400 gccccggaag ccatccagta ccggaagttc acctcggcca gtgatgtgtg 2450 gagctacggc attgtcatgt gggaggtgat gtcctatggg gagcggccct 2500 actgggacat gaccaaccag gatgtaatca atgccattga gcaggactat 2550 cggctgccac cgcccatgga ctgcccgagc gccctgcacc aactcatgct 2600 ggactgttgg cagaaggacc gcaaccaccg gcccaagttc ggccaaattg 2650 tcaacacgct agacaagatg atccgcaatc ccaacagcct caaagccatg 2700 gcgcccctct cctctggcat caacctgccg ctgctggacc gcacgatccc 2750 cgactacacc agctttaaca cggtggacga gtggctggag gccatcaaga 2800 tggggcagta caaggagagc ttcgccaatg ccggcttcac ctcctttgac 2850 gtcgtgtctc agatgatgat ggaggacatt ctccgggttg 
gggtcacttt 2900 ggctggccac cagaaaaaaa tcctgaacag tatccaggtg atgcgggcgc 2950 agatgaacca gattcagtct gtggaggttt gacattcacc tgcctcggct 3000 cacctcttcc tccaagcccc gccccctctg ccccacgtgc cggccctcct 3050 ggtgctctat ccactgcagg gccagccact cgccaggagg ccacgggcca 3100 cgggaagaac caagcggtgc cagccacgag acgtcaccaa gaaaacatgc 3150 aactcaaacg acggaaaaaa aaagggaatg ggaaaaaaga aaacagatcc 3200 tgggaggggg cgggaaatac aaggaatatt ttttaaagag gattctcata 3250 aggaaagcaa tgactgttct tgcgggggat aaaaaagggc ttgggagatt 3300 catgcgatgt gtccaatcgg agacaaaagc agtttctctc caactccctc 3350 tgggaaggtg acctggccag agccaagaaa cactttcaga aaaacaaatg 3400 tgaaggggag agacaggggc cgcccttggc tcctgtccct gctgctcctc 3450 taggcctcac tcaacaacca agcgcctgga ggacgggaca gatggacaga 3500 cagccaccct gagaacccct ctgggaaaat ctattcctgc caccactggg 3550 caaacagaag aatttttctg tctttggaga gtattttaga aactccaatg 3600 aaagacactg tttctcctgt tggctcacag ggctgaaagg ggcttttgtc 3650 ctcctgggtc agggagaacg cggggacccc agaaaggtca gccttcctga 3700 ggatgggcaa cccccaggtc tgcagctcca ggtacatatc acgcgcacag 3750 cctggcagcc tggccctcct ggtgcccact cccgccagcc cctgcctcga 3800 ggactgatac tgcagtgact gccgtcagct ccgactgccg ctgagaaggg 3850 ttgatcctgc atctgggttt gtttacagca attcctggac tcgggggtat 3900 tttggtcaca gggtggtttt ggtttagggg gtttgtttgt tgggttgttt 3950 tttgtttttt ggtttttttt aatgacaatg aagtgacact ttgacatttc 4000 ctaccttttg aggacttgat ccttctccag gaagaaggtg ctttctgctt 4050 actgacttag gcaatacacc aagggcgaga ttttatatgc acatttctgg 4100 atttttttat acggttttca ttgacactct tccctcctcc cacctgccac 4150 caggcctcac caaagcccac tgccatgggg ccatctgggc cattcagaga 4200 ctggagtgag atttgggtgt ggagggggag gcgccaaggt ggaggagctt 4250 cccactccag gactgttgat gaaagggaca gattgaggag gaagtgggct 4300 ctgaggctgc agggctggaa gtccttgccc acttcccact ctcctgcccc 4350 aatctatcta gtacttccca ggcaaatagg cccctttgag gctcctgagt 4400 gccctcagat ggtcaaaacc cagttttccc tctgggagcc taaaccaggc 4450 tgcatcggag gccaggaccc ggatcattca ctgtgatacc ctgccctcca 4500 gagggtgcgc tcagagacac gggcaagcat gcctcttccc ttccctggag 4550 
agaaagtgtg tgatttctct cccacctcct tccccccacc agacctttgc 4600 tgggcctaaa ggtcttggcc atggggacgc cctcagtcta gggatctggc 4650 cacagactcc ctcctgtgaa ccaacacaga cacccaagca gagcaatcag 4700 ttagtgaatt g 4711 56 987 PRT Homo sapien 56 Met Ala Leu Arg Arg Leu Gly Ala Ala Leu Leu Leu Leu Pro Leu 1 5 10 15 Leu Ala Ala Val Glu Glu Thr Leu Met Asp Ser Thr Thr Ala Thr 20 25 30 Ala Glu Leu Gly Trp Met Val His Pro Pro Ser Gly Trp Glu Glu 35 40 45 Val Ser Gly Tyr Asp Glu Asn Met Asn Thr Ile Arg Thr Tyr Gln 50 55 60 Val Cys Asn Val Phe Glu Ser Ser Gln Asn Asn Trp Leu Arg Thr 65 70 75 Lys Phe Ile Arg Arg Arg Gly Ala His Arg Ile His Val Glu Met 80 85 90 Lys Phe Ser Val Arg Asp Cys Ser Ser Ile Pro Ser Val Pro Gly 95 100 105 Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr Glu Ala Asp Phe 110 115 120 Asp Ser Ala Thr Lys Thr Phe Pro Asn Trp Met Glu Asn Pro Trp 125 130 135 Val Lys Val Asp Thr Ile Ala Ala Asp Glu Ser Phe Ser Gln Val 140 145 150 Asp Leu Gly Gly Arg Val Met Lys Ile Asn Thr Glu Val Arg Ser 155 160 165 Phe Gly Pro Val Ser Arg Ser Gly Phe Tyr Leu Ala Phe Gln Asp 170 175 180 Tyr Gly Gly Cys Met Ser Leu Ile Ala Val Arg Val Phe Tyr Arg 185 190 195 Lys Cys Pro Arg Ile Ile Gln Asn Gly Ala Ile Phe Gln Glu Thr 200 205 210 Leu Ser Gly Ala Glu Ser Thr Ser Leu Val Ala Ala Arg Gly Ser 215 220 225 Cys Ile Ala Asn Ala Glu Glu Val Asp Val Pro Ile Lys Leu Tyr 230 235 240 Cys Asn Gly Asp Gly Glu Trp Leu Val Pro Ile Gly Arg Cys Met 245 250 255 Cys Lys Ala Gly Phe Glu Ala Val Glu Asn Gly Thr Val Cys Arg 260 265 270 Gly Cys Pro Ser Gly Thr Phe Lys Ala Asn Gln Gly Asp Glu Ala 275 280 285 Cys Thr His Cys Pro Ile Asn Ser Arg Thr Thr Ser Glu Gly Ala 290 295 300 Thr Asn Cys Val Cys Arg Asn Gly Tyr Tyr Arg Ala Asp Leu Asp 305 310 315 Pro Leu Asp Met Pro Cys Thr Thr Ile Pro Ser Ala Pro Gln Ala 320 325 330 Val Ile Ser Ser Val Asn Glu Thr Ser Leu Met Leu Glu Trp Thr 335 340 345 Pro Pro Arg Asp Ser Gly Gly Arg Glu Asp Leu Val Tyr Asn Ile 350 355 360 Ile Cys Lys Ser Cys Gly Ser Gly Arg Gly Ala Cys Thr Arg Cys 365 
370 375 Gly Asp Asn Val Gln Tyr Ala Pro Arg Gln Leu Gly Leu Thr Glu 380 385 390 Pro Arg Ile Tyr Ile Ser Asp Leu Leu Ala His Thr Gln Tyr Thr 395 400 405 Phe Glu Ile Gln Ala Val Asn Gly Val Thr Asp Gln Ser Pro Phe 410 415 420 Ser Pro Gln Phe Ala Ser Val Asn Ile Thr Thr Asn Gln Ala Ala 425 430 435 Pro Ser Ala Val Ser Ile Met His Gln Val Ser Arg Thr Val Asp 440 445 450 Ser Ile Thr Leu Ser Trp Ser Gln Pro Asp Gln Pro Asn Gly Val 455 460 465 Ile Leu Asp Tyr Glu Leu Gln Tyr Tyr Glu Lys Glu Leu Ser Glu 470 475 480 Tyr Asn Ala Thr Ala Ile Lys Ser Pro Thr Asn Thr Val Thr Val 485 490 495 Gln Gly Leu Lys Ala Gly Ala Ile Tyr Val Phe Gln Val Arg Ala 500 505 510 Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser Gly Lys Met Tyr Phe 515 520 525 Gln Thr Met Thr Glu Ala Glu Tyr Gln Thr Ser Ile Gln Glu Lys 530 535 540 Leu Pro Leu Ile Ile Gly Ser Ser Ala Ala Gly Leu Val Phe Leu 545 550 555 Ile Ala Val Val Val Ile Ala Ile Val Cys Asn Arg Arg Arg Gly 560 565 570 Phe Glu Arg Ala Asp Ser Glu Tyr Thr Asp Lys Leu Gln His Tyr 575 580 585 Thr Ser Gly His Met Thr Pro Gly Met Lys Ile Tyr Ile Asp Pro 590 595 600 Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu Phe Ala Lys 605 610 615 Glu Ile Asp Ile Ser Cys Val Lys Ile Glu Gln Val Ile Gly Ala 620 625 630 Gly Glu Phe Gly Glu Val Cys Ser Gly His Leu Lys Leu Pro Gly 635 640 645 Lys Arg Glu Ile Phe Val Ala Ile Lys Thr Leu Lys Ser Gly Tyr 650 655 660 Thr Glu Lys Gln Arg Arg Asp Phe Leu Ser Glu Ala Ser Ile Met 665 670 675 Gly Gln Phe Asp His Pro Asn Val Ile His Leu Glu Gly Val Val 680 685 690 Thr Lys Ser Thr Pro Val Met Ile Ile Thr Glu Phe Met Glu Asn 695 700 705 Gly Ser Leu Asp Ser Phe Leu Arg Gln Asn Asp Gly Gln Phe Thr 710 715 720 Val Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ala Gly Met 725 730 735 Lys Tyr Leu Ala Asp Met Asn Tyr Val His Arg Asp Leu Ala Ala 740 745 750 Arg Asn Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp 755 760 765 Phe Gly Leu Ser Arg Phe Leu Glu Asp Asp Thr Ser Asp Pro Thr 770 775 780 Tyr Thr Ser Ala Leu Gly Gly Lys Ile Pro Ile Arg 
Trp Thr Ala 785 790 795 Pro Glu Ala Ile Gln Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val 800 805 810 Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met Ser Tyr Gly Glu 815 820 825 Arg Pro Tyr Trp Asp Met Thr Asn Gln Asp Val Ile Asn Ala Ile 830 835 840 Glu Gln Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro Ser Ala 845 850 855 Leu His Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn His 860 865 870 Arg Pro Lys Phe Gly Gln Ile Val Asn Thr Leu Asp Lys Met Ile 875 880 885 Arg Asn Pro Asn Ser Leu Lys Ala Met Ala Pro Leu Ser Ser Gly 890 895 900 Ile Asn Leu Pro Leu Leu Asp Arg Thr Ile Pro Asp Tyr Thr Ser 905 910 915 Phe Asn Thr Val Asp Glu Trp Leu Glu Ala Ile Lys Met Gly Gln 920 925 930 Tyr Lys Glu Ser Phe Ala Asn Ala Gly Phe Thr Ser Phe Asp Val 935 940 945 Val Ser Gln Met Met Met Glu Asp Ile Leu Arg Val Gly Val Thr 950 955 960 Leu Ala Gly His Gln Lys Lys Ile Leu Asn Ser Ile Gln Val Met 965 970 975 Arg Ala Gln Met Asn Gln Ile Gln Ser Val Glu Val 980 985 57 2033 DNA Homo sapien 57 ggagaatccc cggaaaggct gagtctccag ctcaaggtca aaacgtccaa 50 ggccgaaagc cctccagttt cccctggacg ccttgctcct gcttctgcta 100 cgaccttctg gggaaaacga atttctcatt ttcttcttaa attgccattt 150 tcgctttagg agatgaatgt tttcctttgg ctgttttggc aatgactctg 200 aattaaagcg atgctaacgc ctcttttccc cctaattgtt aaaagctatg 250 gactgcagga agatggcccg cttctcttac agtgtgattt ggatcatggc 300 catttctaaa gtctttgaac tgggattagt tgccgggctg ggccatcagg 350 aatttgctcg tccatctcgg ggatacctgg ccttcagaga tgacagcatt 400 tggccccagg aggagcctgc aattcggcct cggtcttccc agcgtgtgcc 450 gcccatgggg atacagcaca gtaaggagct aaacagaacc tgctgcctga 500 atgggggaac ctgcatgctg gggtcctttt gtgcctgccc tccctccttc 550 tacggacgga actgtgagca cgatgtgcgc aaagagaact gtgggtctgt 600 gccccatgac acctggctgc ccaagaagtg ttccctgtgt aaatgctggc 650 acggtcagct ccgctgcttt cctcaggcat ttctacccgg ctgtgatggc 700 cttgtgatgg atgagcacct cgtggcttcc aggactccag aactaccacc 750 gtctgcacgt actaccactt ttatgctagt tggcatctgc ctttctatac 800 aaagctacta ttaatcgaca ttgacctatt tccagaaata caattttaga 850 tatcatgcaa atttcatgac 
cagtaaaggc tgctgctaca atgtcctaac 900 tgaaagatga tcatttgtag ttgccttaaa ataatgaata caatttccaa 950 aatggtctct aacatttcct tacagaacta cttcttactt ctttgccctg 1000 ccctctccca aaaaactact tcttttttca aaagaaagtc agccatatct 1050 ccattgtgcc taagtccagt gtttcttttt tttttttttt ttgagacgga 1100 gtctcactct gtcacccagg ctggactgca atgacgcgat cttggttcac 1150 tgcaacctcc gcatccgggg ttcaagccat tctcctgcct aagcctccca 1200 agtaactggg attacaggca tgtgtcacca tgcccagcta atttttttgt 1250 attttagtag agatgggggt ttcaccatat tggccagtct ggtctcgaac 1300 tctgaccttg tgatccatcg atcagcctct cgagtgctga gattacacac 1350 gtgagcaact gtgcaaggcc tggtgtttct tgatacatgt aattctacca 1400 aggtcttctt aatatgttct tttaaatgat tgaattatat gttcagatta 1450 ttggagacta attctaatgt ggaccttaga atacagtttt gagtagagtt 1500 gatcaaaatc aattaaaata gtctctttaa aaggaaagaa aacatcttta 1550 aggggaggaa ccagagtgct gaaggaatgg aagtccatct gcgtgtgtgc 1600 agggagactg ggtaggaaag aggaagcaaa tagaagagag aggttgaaaa 1650 acaaaatggg ttacttgatt ggtgattagg tggtggtaga gaagcaagta 1700 aaaaggctaa atggaagggc aagtttccat catctataga aagctatata 1750 agacaagaac tccccttttt ttcccaaagg cattataaaa
agaatgaagc 1800 ctccttagaa aaaaaattat acctcaatgt ccccaacaag attgcttaat 1850 aaattgtgtt tcctccaagc tattcaattc ttttaactgt tgtagaagac 1900 aaaatgttca caatatattt agttgtaaac caagtgatca aactacatat 1950 tgtaaagccc atttttaaaa tacattgtat atatgtgtat gcacagtaaa 2000 aatggaaact atattgacct aaaaaaaaaa aaa 2033 58 188 PRT Homo sapien 58 Met Asp Cys Arg Lys Met Ala Arg Phe Ser Tyr Ser Val Ile Trp 1 5 10 15 Ile Met Ala Ile Ser Lys Val Phe Glu Leu Gly Leu Val Ala Gly 20 25 30 Leu Gly His Gln Glu Phe Ala Arg Pro Ser Arg Gly Tyr Leu Ala 35 40 45 Phe Arg Asp Asp Ser Ile Trp Pro Gln Glu Glu Pro Ala Ile Arg 50 55 60 Pro Arg Ser Ser Gln Arg Val Pro Pro Met Gly Ile Gln His Ser 65 70 75 Lys Glu Leu Asn Arg Thr Cys Cys Leu Asn Gly Gly Thr Cys Met 80 85 90 Leu Gly Ser Phe Cys Ala Cys Pro Pro Ser Phe Tyr Gly Arg Asn 95 100 105 Cys Glu His Asp Val Arg Lys Glu Asn Cys Gly Ser Val Pro His 110 115 120 Asp Thr Trp Leu Pro Lys Lys Cys Ser Leu Cys Lys Cys Trp His 125 130 135 Gly Gln Leu Arg Cys Phe Pro Gln Ala Phe Leu Pro Gly Cys Asp 140 145 150 Gly Leu Val Met Asp Glu His Leu Val Ala Ser Arg Thr Pro Glu 155 160 165 Leu Pro Pro Ser Ala Arg Thr Thr Thr Phe Met Leu Val Gly Ile 170 175 180 Cys Leu Ser Ile Gln Ser Tyr Tyr 185 59 3346 DNA Homo sapien 59 gagtagacag cacagcggca gcggagggag tctatgcgag ctggacagca 50 gtgggaggtt tgtgaggctc gcactggccg cagaccctcg ggctcgatcg 100 cccgggagcc aggactcggc gacgcgaggc tgccgggcta cccggccgag 150 gcttcggggg cgcaaactaa tgggactggc tcgctcggca gcatctcccc 200 gctcttctaa gtacactgag cagggcccgc gctgaagtag aagctgtccg 250 ggggcgcgta gcccggagtc ccagtgtggc ccggaggaac ggagcccgtg 300 ccagggcggc ccagtcggga gcccggggac cgagcttgtg ctgtggggaa 350 acccccactt cttccaaggg acagcgatcc cgggacggtc gaggcgtcgg 400 ggcggtcacc gagacctctg cgggaagacc ccgtcgggga gagggcgcgc 450 agccccgaag cgtctcggga agtcgagcgg aatcgggcgg gatcacccgg 500 gggcgcagag cccccgtcgc gcctcgtgcg gcagcggaga gcccaggaga 550 acgagccctc gggggccgaa gcccatgccc gggttggggg cggctgccca 600 gtgagtcctc ctggccggcc gggcggagaa gagcgacacc gaagccggcg 650 
ggaggggagc acttcaaggc cggcggctgc ggaggatggg cgcctgagcg 700 gctccgagcg cagcgcggca gaggaaggcg aggcgagctt tggtgaggag 750 gcgccaaggg atcccgaagt gcagtctgcc cccgggaaga tggctcggcc 800 tgggcagcgt tggctcggca agtggcttgt ggcgatggtc gtgtgggcgc 850 tgtgccggct cgccacaccg ctggccaaga acctggagcc cgtatcctgg 900 agctccctca accccaagtt cctgagtggg aagggcttgg tgatctatcc 950 gaaaattgga gacaagctgg acatcatctg cccccgagca gaagcagggc 1000 ggccctatga gtactacaag ctgtacctgg tgcggcctga gcaggcagct 1050 gcctgtagca cagttctcga ccccaacgtg ttggtcacct gcaataggcc 1100 agagcaggaa atacgcttta ccatcaagtt ccaggagttc agccccaact 1150 acatgggcct ggagttcaag aagcaccatg attactacat tacctcaaca 1200 tccaatggaa gcctggaggg gctggaaaac cgggagggcg gtgtgtgccg 1250 cacacgcacc atgaagatca tcatgaaggt tgggcaagat cccaatgctg 1300 tgacgcctga gcagctgact accagcaggc ccagcaagga ggcagacaac 1350 actgtcaaga tggccacaca ggcccctggt agtcggggct ccctgggtga 1400 ctctgatggc aagcatgaga ctgtgaacca ggaagagaag agtggcccag 1450 gtgcaagtgg gggcagcagc ggggaccctg atggcttctt caactccaag 1500 gtggcattgt tcgcggctgt cggtgccggt tgcgtcatct tcctgctcat 1550 catcatcttc ctgacggtcc tactactgaa gctacgcaag cggcaccgca 1600 agcacacaca gcagcgggcg gctgccctct cgctcagtac cctggccagt 1650 cccaaggggg gcagtggcac agcgggcacc gagcccagcg acatcatcat 1700 tcccttacgg actacagaga acaactactg cccccactat gagaaggtga 1750 gtggggacta cgggcaccct gtctacatcg tccaagagat gccgccccag 1800 agcccggcga acatctacta caaggtctga gtgcccggca cggcctcagg 1850 cccccgaggg acagtcggcc tggaccggac ctctcctttc gcccccacac 1900 cccctcccct tgccagctgt gcccaccttt gtatttagtt ttgtagtttc 1950 ttggctttta taatccccct ttttccctgc cccctgggct tcggaggggg 2000 gtgcttgtgc ccctaacccc catgctcttg tgccttcccc ctctggccag 2050 gcctctgggc tccgtggggg cgccccttct tggaaggcag ggctggacac 2100 tgatggacag caggcaggga gacagtcccc tggccctgcc cctccctcgc 2150 cccccttgcc accttcccag gactgcttgt ccgctatcat cactgttttt 2200 aatgcttttg tgttcatttt ttagctgtca actcattttc atctgttttt 2250 tgaagaaaaa tggaaaaatg taaaaggcag cccctcccca ggctttgtga 2300 gcctggccca agccagtaca 
agagggcctg gggcacgatg tggtcagcca 2350 ggaagcatag gatgccattt cttttataga ttccttggta tttctggtgg 2400 ggtaaggggc aggccagggc tgttcacgcc catgagggaa gaggaaagtg 2450 ccactgggca aggtgtccca ccctcccctc ctgaccctcc tacgaggctt 2500 atcctggcaa tggggtagtc actgccaccc ttccacacac acacacacac 2550 acacacacac aaaaaaaaat cccttccttg tgggattctt gggcatctcc 2600 tgcctccctc actctcacgg taattaatgt cttaattggc tgttgcctgg 2650 ggaacaggag agctgctgca ggcagatgac ctcatggggg gtggagggag 2700 gtgaggtgcc caggtggcta tttgccctgc agagctggga gtttcacccc 2750 caccccccac cctgttctct ccttaccttt ggcatccttt ggcctggtgg 2800 ggaaacagag gcccagggtg gagacctaag cgggtataag accaggtggc 2850 ctgctccttt tctgggccct agcacaggtg ggtaaccccc acccaaccca 2900 gctcctgctg ctgtcccagt cttgggctgg ggcctggaaa gaggaagagg 2950 ctgcctgggg ctgggccagc ccgctgtgca ctttgacccc agttccttgc 3000 cagcacggct gctaacagac tgccacttga gtgcgccttg caggcactcc 3050 cagagcagcc atggaaggag ctggccctca caccatccac ctccacactg 3100 cctcctggcc agctgcccac cccagtgcca ggtgggagag ggagcagaac 3150 agccagcccc ttccaggtgg cagtcggaag ggtttttgtt tttgtttctg 3200 ttgccatttg tgtaaatact agtctttttg gaaaaaaaat aatgtaaaga 3250 tgttttgtat aaactctgaa ttattttctt gttgcttttt tcttagaaaa 3300 aaatgagaac taaaaaaaaa aaattaacca catggaaaaa aaaaaa 3346 60 346 PRT Homo sapien 60 Met Ala Arg Pro Gly Gln Arg Trp Leu Gly Lys Trp Leu Val Ala 1 5 10 15 Met Val Val Trp Ala Leu Cys Arg Leu Ala Thr Pro Leu Ala Lys 20 25 30 Asn Leu Glu Pro Val Ser Trp Ser Ser Leu Asn Pro Lys Phe Leu 35 40 45 Ser Gly Lys Gly Leu Val Ile Tyr Pro Lys Ile Gly Asp Lys Leu 50 55 60 Asp Ile Ile Cys Pro Arg Ala Glu Ala Gly Arg Pro Tyr Glu Tyr 65 70 75 Tyr Lys Leu Tyr Leu Val Arg Pro Glu Gln Ala Ala Ala Cys Ser 80 85 90 Thr Val Leu Asp Pro Asn Val Leu Val Thr Cys Asn Arg Pro Glu 95 100 105 Gln Glu Ile Arg Phe Thr Ile Lys Phe Gln Glu Phe Ser Pro Asn 110 115 120 Tyr Met Gly Leu Glu Phe Lys Lys His His Asp Tyr Tyr Ile Thr 125 130 135 Ser Thr Ser Asn Gly Ser Leu Glu Gly Leu Glu Asn Arg Glu Gly 140 145 150 Gly Val Cys Arg Thr Arg Thr Met Lys 
Ile Ile Met Lys Val Gly 155 160 165 Gln Asp Pro Asn Ala Val Thr Pro Glu Gln Leu Thr Thr Ser Arg 170 175 180 Pro Ser Lys Glu Ala Asp Asn Thr Val Lys Met Ala Thr Gln Ala 185 190 195 Pro Gly Ser Arg Gly Ser Leu Gly Asp Ser Asp Gly Lys His Glu 200 205 210 Thr Val Asn Gln Glu Glu Lys Ser Gly Pro Gly Ala Ser Gly Gly 215 220 225 Ser Ser Gly Asp Pro Asp Gly Phe Phe Asn Ser Lys Val Ala Leu 230 235 240 Phe Ala Ala Val Gly Ala Gly Cys Val Ile Phe Leu Leu Ile Ile 245 250 255 Ile Phe Leu Thr Val Leu Leu Leu Lys Leu Arg Lys Arg His Arg 260 265 270 Lys His Thr Gln Gln Arg Ala Ala Ala Leu Ser Leu Ser Thr Leu 275 280 285 Ala Ser Pro Lys Gly Gly Ser Gly Thr Ala Gly Thr Glu Pro Ser 290 295 300 Asp Ile Ile Ile Pro Leu Arg Thr Thr Glu Asn Asn Tyr Cys Pro 305 310 315 His Tyr Glu Lys Val Ser Gly Asp Tyr Gly His Pro Val Tyr Ile 320 325 330 Val Gln Glu Met Pro Pro Gln Ser Pro Ala Asn Ile Tyr Tyr Lys 335 340 345 Val 61 2438 DNA Homo sapien 61 ccggcggggg cgccgcggag agcggagggc gccgggctgc ggaacgcgaa 50 gcggagggcg cgggaccctg cacgccgccc gcgggcccat gtgagcgcca 100 tgcggcgccg cgcagcccgg ggacccggcc cgccgccccc agggcccgga 150 ctctcgcggt tgccgctgct gccgctgccg ctgctgctgc tgctggcgct 200 ggggacccgc gggggctgcg ccgcgcccgc acccgcgccg cgcgccgagg 250 acctcagcct gggagtggag tggctaagca ggttcggtta cctgcccccg 300 gctgacccca caacagggca gctgcagacg caagaggagc tgtctaaggc 350 catcacagcc atgcagcagt ttggtggcct ggaggccacc ggcatcctgg 400 acgaggccac cctggccctg atgaaaaccc cacgctgctc cctgccagac 450 ctccctgtcc tgacccaggc tcgcaggaga cgccaggctc cagcccccac 500 caagtggaac aagaggaacc tgtcgtggag ggtccggacg ttcccacggg 550 actcaccact ggggcacgac acggtgcgtg cactcatgta ctacgccctc 600 aaggtctgga gcgacattgc gcccctgaac ttccacgagg tggcgggcag 650 caccgccgac atccagatcg acttctccaa ggccgaccat aacgacggct 700 accccttcga cggccccggc ggcaccgtgg cccacgcctt cttccccggc 750 caccaccaca ccgccgggga cacccacttt gacgatgacg aggcctggac 800 cttccgctcc tcggatgccc acgggatgga cctgtttgca gtggctgtcc 850 acgagtttgg ccacgccatt gggttaagcc atgtggccgc tgcacactcc 900 
atcatgcggc cgtactacca gggcccggtg ggtgacccgc tgcgctacgg 950 gctcccctac gaggacaagg tgcgcgtctg gcagctgtac ggtgtgcggg 1000 agtctgtgtc tcccacggcg cagcccgagg agcctcccct gctgccggag 1050 cccccagaca accggtccag cgccccgccc aggaaggacg tgccccacag 1100 atgcagcact cactttgacg cggtggccca gatccgcggt gaagctttct 1150 tcttcaaagg caagtacttc tggcggctga cgcgggaccg gcacctggtg 1200 tccctgcagc cggcacagat gcaccgcttc tggcggggcc tgccgctgca 1250 cctggacagc gtggacgccg tgtacgagcg caccagcgac cacaagatcg 1300 tcttctttaa aggagacagg tactgggtgt tcaaggacaa taacgtagag 1350 gaaggatacc cgcgccccgt ctccgacttc agcctcccgc ctggcggcat 1400 cgacgctgcc ttctcctggg cccacaatga caggacttat ttctttaagg 1450 accagctgta ctggcgctac gatgaccaca cgaggcacat ggaccccggc 1500 taccccgccc agagccccct gtggaggggt gtccccagca cgctggacga 1550 cgccatgcgc tggtccgacg gtgcctccta cttcttccgt ggccaggagt 1600 actggaaagt gctggatggc gagctggagg tggcacccgg gtacccacag 1650 tccacggccc gggactggct ggtgtgtgga gactcacagg ccgatggatc 1700 tgtggctgcg ggcgtggacg cggcagaggg gccccgcgcc cctccaggac 1750 aacatgacca gagccgctcg gaggacggtt acgaggtctg ctcatgcacc 1800 tctggggcat cctctccccc gggggcccca ggcccactgg tggctgccac 1850 catgctgctg ctgctgccgc cactgtcacc aggcgccctg tggacagcgg 1900 cccaggccct gacgctatga cacacagcgc gagcccatga gaggacagag 1950 gcggtgggac agcctggcca cagagggcaa ggactgtgcc ggagtccctg 2000 ggggaggtgc tggcgcggga tgaggacggg ccaccctggc accggaaggc 2050 cagcagaggg cacggcccgc cagggctggg caggctcagg tggcaaggac 2100 ggagctgtcc cctagtgagg gactgtgttg actgacgagc cgaggggtgg 2150 ccgctccaga agggtgccca gtcaggccgc accgccgcca gcctcctccg 2200 gccctggagg gagcatctcg ggctgggggc ccacccctct ctgtgccggc 2250 gccaccaacc ccacccacac tgctgcctgg tgctcccgcc ggcccacagg 2300 gcctccgtcc ccaggtcccc agtggggcag ccctccccac agacgagccc 2350 cccacatggt gccgcggcac gtcccccctg tgacgcgttc cagaccaaca 2400 tgacctctcc ctgctttgta aaaaaaaaaa aaaaaaaa 2438 62 606 PRT Homo sapien 62 Met Arg Arg Arg Ala Ala Arg Gly Pro Gly Pro Pro Pro Pro Gly 1 5 10 15 Pro Gly Leu Ser Arg Leu Pro Leu Leu Pro Leu Pro Leu Leu 
Leu 20 25 30 Leu Leu Ala Leu Gly Thr Arg Gly Gly Cys Ala Ala Pro Ala Pro 35 40 45 Ala Pro Arg Ala Glu Asp Leu Ser Leu Gly Val Glu Trp Leu Ser 50 55 60 Arg Phe Gly Tyr Leu Pro Pro Ala Asp Pro Thr Thr Gly Gln Leu 65 70 75 Gln Thr Gln Glu Glu Leu Ser Lys Ala Ile Thr Ala Met Gln Gln 80 85 90 Phe Gly Gly Leu Glu Ala Thr Gly Ile Leu Asp Glu Ala Thr Leu 95 100 105 Ala Leu Met Lys Thr Pro Arg Cys Ser Leu Pro Asp Leu Pro Val 110 115 120 Leu Thr Gln Ala Arg Arg Arg Arg Gln Ala Pro Ala Pro Thr Lys 125 130 135 Trp Asn Lys Arg Asn Leu Ser Trp Arg Val Arg Thr Phe Pro Arg 140 145 150 Asp Ser Pro Leu Gly His Asp Thr Val Arg Ala Leu Met Tyr Tyr 155 160 165 Ala Leu Lys Val Trp Ser Asp Ile Ala Pro Leu Asn Phe His Glu 170 175 180 Val Ala Gly Ser Thr Ala Asp Ile Gln Ile Asp Phe Ser Lys Ala 185 190 195 Asp His Asn Asp Gly Tyr Pro Phe Asp Gly Pro Gly Gly Thr Val 200 205 210 Ala His Ala Phe Phe Pro Gly His His His Thr Ala Gly Asp Thr 215 220 225 His Phe Asp Asp Asp Glu Ala Trp Thr Phe Arg Ser Ser Asp Ala 230 235 240 His Gly Met Asp Leu Phe Ala Val Ala Val His Glu Phe Gly His 245 250 255 Ala Ile Gly Leu Ser His Val Ala Ala Ala His Ser Ile Met Arg 260 265 270 Pro Tyr Tyr Gln Gly Pro Val Gly Asp Pro Leu Arg Tyr Gly Leu 275 280 285 Pro Tyr Glu Asp Lys Val Arg Val Trp Gln Leu Tyr Gly Val Arg 290 295 300 Glu Ser Val Ser Pro Thr Ala Gln Pro Glu Glu Pro Pro Leu Leu 305 310 315 Pro Glu Pro Pro Asp Asn Arg Ser Ser Ala Pro Pro Arg Lys Asp 320 325 330 Val Pro His Arg Cys Ser Thr His Phe Asp Ala Val Ala Gln Ile 335 340 345 Arg Gly Glu Ala Phe Phe Phe Lys Gly Lys Tyr Phe Trp Arg Leu 350 355 360 Thr Arg Asp Arg His Leu Val Ser Leu Gln Pro Ala Gln Met His 365 370 375 Arg Phe Trp Arg Gly Leu Pro Leu His Leu Asp Ser Val Asp Ala 380 385 390 Val Tyr Glu Arg Thr Ser Asp His Lys Ile Val Phe Phe Lys Gly 395 400 405 Asp Arg Tyr Trp Val Phe Lys Asp Asn Asn Val Glu Glu Gly Tyr 410 415 420 Pro Arg Pro Val Ser Asp Phe Ser Leu Pro Pro Gly Gly Ile Asp 425 430 435 Ala Ala Phe Ser Trp Ala His Asn Asp Arg Thr Tyr Phe Phe 
Lys 440 445 450 Asp Gln Leu Tyr Trp Arg Tyr Asp Asp His Thr Arg His Met Asp 455 460 465 Pro Gly Tyr Pro Ala Gln Ser Pro Leu Trp Arg Gly Val Pro Ser 470 475 480 Thr Leu Asp Asp Ala Met Arg Trp Ser Asp Gly Ala Ser Tyr Phe 485 490 495 Phe Arg Gly Gln Glu Tyr Trp Lys Val Leu Asp Gly Glu Leu Glu 500 505 510 Val Ala Pro Gly Tyr Pro Gln Ser Thr Ala Arg Asp Trp Leu Val 515 520 525 Cys Gly Asp Ser Gln Ala Asp Gly Ser Val Ala Ala Gly Val Asp 530 535 540 Ala Ala Glu Gly Pro Arg Ala Pro Pro Gly Gln His Asp Gln Ser 545 550 555 Arg Ser Glu Asp Gly Tyr Glu Val Cys Ser Cys Thr Ser Gly Ala 560 565 570 Ser Ser Pro Pro Gly Ala Pro Gly Pro Leu Val Ala Ala Thr Met 575 580 585 Leu Leu Leu Leu Pro Pro Leu Ser Pro Gly Ala Leu Trp Thr Ala 590 595 600 Ala Gln Ala Leu Thr Leu 605 63 1009 DNA Homo sapien 63 gacaaatgag ggtttggcat gcagctcgtc atcttaagag ttactatctt 50 cttgccctgg tgtttcgccg ttccagtgcc ccctgctgca gaccataaag 100 gatgggactt tgttgagggc tatttccatc aatttttcct gaccgagaag 150
gagtcgccac tccttaccca ggagacacaa acacagctcc tgcaacaatt 200 ccatcggaat gggacagacc tacttgacat gcagatgcat gctctgctac 250 accagcccca ctgtggggtg cctgatgggt ccgacacctc catctcgcca 300 ggaagatgca agtggaataa gcacactcta acttacagga ttatcaatta 350 cccacatgat atgaagccat ccgcagtgaa agacagtata tataatgcag 400 tttccatctg gagcaatgtg acccctttga tattccagca agtgcagaat 450 ggagatgcag acatcaaggt ttctttctgg cagtgggccc atgaagatgg 500 ttggcccttt gatgggccag gtggtatctt aggccatgcc tttttaccaa 550 attctggaaa tcctggagtt gtccattttg acaagaatga acactggtca 600 gcttcagaca ctggatataa tctgttcctg gttgcaactc atgagattgg 650 gcattctttg ggcctgcagc actctgggaa tcagagctcc ataatgtacc 700 ccacttactg gtatcacgac cctagaacct tccagctcag tgccgatgat 750 atccaaagga tccagcattt gtatggagaa aaatgttcat ctgacatacc 800 ttaatgttag cacagaggac ttattcaacc tgtcctttca gggagtttat 850 tggaggatca aagaactgaa agcactagag cagccttggg gactgctagg 900 atgaagccct aaagaatgca acctagtcag gttagctgaa ccgacactca 950 aaacgctact gagtcacaat aaagattgtt ttaaagagta aaaaaaaaaa 1000 aaaaaaaaa 1009 64 261 PRT Homo sapien 64 Met Gln Leu Val Ile Leu Arg Val Thr Ile Phe Leu Pro Trp Cys 1 5 10 15 Phe Ala Val Pro Val Pro Pro Ala Ala Asp His Lys Gly Trp Asp 20 25 30 Phe Val Glu Gly Tyr Phe His Gln Phe Phe Leu Thr Glu Lys Glu 35 40 45 Ser Pro Leu Leu Thr Gln Glu Thr Gln Thr Gln Leu Leu Gln Gln 50 55 60 Phe His Arg Asn Gly Thr Asp Leu Leu Asp Met Gln Met His Ala 65 70 75 Leu Leu His Gln Pro His Cys Gly Val Pro Asp Gly Ser Asp Thr 80 85 90 Ser Ile Ser Pro Gly Arg Cys Lys Trp Asn Lys His Thr Leu Thr 95 100 105 Tyr Arg Ile Ile Asn Tyr Pro His Asp Met Lys Pro Ser Ala Val 110 115 120 Lys Asp Ser Ile Tyr Asn Ala Val Ser Ile Trp Ser Asn Val Thr 125 130 135 Pro Leu Ile Phe Gln Gln Val Gln Asn Gly Asp Ala Asp Ile Lys 140 145 150 Val Ser Phe Trp Gln Trp Ala His Glu Asp Gly Trp Pro Phe Asp 155 160 165 Gly Pro Gly Gly Ile Leu Gly His Ala Phe Leu Pro Asn Ser Gly 170 175 180 Asn Pro Gly Val Val His Phe Asp Lys Asn Glu His Trp Ser Ala 185 190 195 Ser Asp Thr Gly Tyr Asn Leu Phe Leu 
Val Ala Thr His Glu Ile 200 205 210 Gly His Ser Leu Gly Leu Gln His Ser Gly Asn Gln Ser Ser Ile 215 220 225 Met Tyr Pro Thr Tyr Trp Tyr His Asp Pro Arg Thr Phe Gln Leu 230 235 240 Ser Ala Asp Asp Ile Gln Arg Ile Gln His Leu Tyr Gly Glu Lys 245 250 255 Cys Ser Ser Asp Ile Pro 260 65 3410 DNA Homo sapien 65 gaattcgagg atccgggtac catgggcggc ggcaggccta gcagcacggg 50 aaccgtcccc cgcgcgcatg cgcgcgcccc tgaagcgcct gggggacggg 100 tatgggcggg aggtaggggc gcggctccgc gtgccagttg ggtgcccgcg 150 cgtcacgtgg tgaggaagga ggcggaggtc tgagtttcga gggagggggg 200 gagagaagag ggaacgagca agggaaggaa agcggggaaa ggaggaagga 250 aacgaacgag ggggagggag gtccctgttt tggaggagct aggagcgttg 300 ccggcccctg aagtggagcg agagggaggt gcttcgccgt ttctcctgcc 350 aggggaggtc ccggcttccc gtggaggctc cggaccaagc cccttcagct 400 tctccctccg gatcgatgtg ctgctgttaa cccgtgagga ggcggcggcg 450 gcggcagcgg cagcggaaga tggtgttgct gagagtgtta attctgctcc 500 tctcctgggc ggcggggatg ggaggtcagt atgggaatcc tttaaataaa 550 tatatcagac attatgaagg attatcttac aatgtggatt cattacacca 600 aaaacaccag cgtgccaaaa gagcagtctc acatgaagac caatttttac 650 gtctagattt ccatgcccat ggaagacatt tcaacctacg aatgaagagg 700 gacacttccc ttttcagtga tgaatttaaa gtagaaacat caaataaagt 750 acttgattat gatacctctc atatttacac tggacatatt tatggtgaag 800 aaggaagttt tagccatggg tctgttattg atggaagatt tgaaggattc 850 atccagactc gtggtggcac attttatgtt gagccagcag agagatatat 900 taaagaccga actctgccat ttcactctgt catttatcat gaagatgata 950 ttaactatcc ccataaatac ggtcctcagg ggggctgtgc agatcattca 1000 gtatttgaaa gaatgaggaa ataccagatg actggtgtag aggaagtaac 1050 acagatacct caagaagaac atgctgctaa tggtccagaa cttctgagga 1100 aaaaacgtac aacttcagct gaaaaaaata cttgtcagct ttatattcag 1150 actgatcatt tgttctttaa atattacgga acacgagaag ctgtgattgc 1200 ccagatatcc agtcatgtta aagcgattga tacaatttac cagaccacag 1250 acttctccgg aatccgtaac atcagtttca tggtgaaacg cataagaatc 1300 aatacaactg ctgatgagaa ggaccctaca aatcctttcc gtttcccaaa 1350 tattggtgtg gagaagtttc tggaattgaa ttctgagcag aatcatgatg 1400 actactgttt ggcctatgtc ttcacagacc 
gagattttga tgatggcgta 1450 cttggtctgg cttgggttgg agcaccttca ggaagctctg gaggaatatg 1500 tgaaaaaagt aaactctatt cagatggtaa gaagaagtcc ttaaacactg 1550 gaattattac tgttcagaac tatgggtctc atgtacctcc caaagtctct 1600 cacattactt ttgctcacga agttggacat aactttggat ccccacatga 1650 ttctggaaca gagtgcacac caggagaatc taagaatttg ggtcaaaaag 1700 aaaatggcaa ttacatcatg tatgcaagag caacatctgg ggacaaactt 1750 aacaacaata aattctcact ctgtagtatt agaaatataa gccaagttct 1800 tgagaagaag agaaacaact gttttgttga atctggccaa cctatttgtg 1850 gaaatggaat ggtagaacaa ggtgaagaat gtgattgtgg ctatagtgac 1900 cagtgtaaag atgaatgctg cttcgatgca aatcaaccag agggaagaaa 1950 atgcaaactg aaacctggga aacagtgcag tccaagtcaa ggtccttgtt 2000 gtacagcaca gtgtgcattc aagtcaaagt ctgagaagtg tcgggatgat 2050 tcagactgtg caagggaagg aatatgtaat ggcttcacag ctctctgccc 2100 agcatctgac cctaaaccaa acttcacaga ctgtaatagg catacacaag 2150 tgtgcattaa tgggcaatgt gcaggttcta tctgtgagaa atatggctta 2200 gaggagtgta cgtgtgccag ttctgatggc aaagatgata aagaattatg 2250 ccatgtatgc tgtatgaaga aaatggaccc atcaacttgt gccagtacag 2300 ggtctgtgca gtggagtagg cacttcagtg gtcgaaccat caccctgcaa 2350 cctggatccc cttgcaacga ttttagaggt tactgtgatg ttttcatgcg 2400 gtgcagatta gtagatgctg atggtcctct agctaggctt aaaaaagcaa 2450 tttttagtcc agagctctat gaaaacattg ctgaatggat tgtggctcat 2500 tggtgggcag tattacttat gggaattgct ctgatcatgc taatggctgg 2550 atttattaag atatgcagtg ttcatactcc aagtagtaat ccaaagttgc 2600 ctcctcctaa accacttcca ggcactttaa agaggaggag acctccacag 2650 cccattcagc aaccccagcg tcagcggccc cgagagagtt atcaaatggg 2700 acacatgaga cgctaactgc agcttttgcc ttggttcttc ctagtgccta 2750 caatgggaaa acttcactcc aaagagaaac ctattaagtc atcatctcca 2800 aactaaaccc tcacaagtaa cagttgaaga aaaaatggca agagatcata 2850 tcctcagacc aggtggaatt acttaaattt taaagcctga aaattccaat 2900 ttgggggtgg gaggtggaaa aggaacccaa ttttcttatg aacagatatt 2950 tttaacttaa tggcacaaag tcttagaata ttattatgtg ccccgtgttc 3000 cctgttcttc gttgctgcat tttcttcact tgcaggcaaa cttggctctc 3050 aataaacttt taccacaaat tgaaataaat atattttttt 
caactgccaa 3100 tcaaggctag gaggctcgac cacctcaaca ttggagacat cacttgccaa 3150 tgtacatacc ttgttatatg cagacatgta tttcttacgt acactgtact 3200 tctgtgtgca attgtaaaca gaaattgcaa tatggatgtt tctttgtatt 3250 ataaaatttt tccgctctta attaaaaatt actgtttaat tgacatactc 3300 aggataacag agaatggtgg tattcagtgg tccaggattc tgtaatgctt 3350 tacacaggca gttttgaaat gaaaatcaat ttaccccatg gtacccggat 3400 cctcgaattc 3410 66 748 PRT Homo sapien 66 Met Val Leu Leu Arg Val Leu Ile Leu Leu Leu Ser Trp Ala Ala 1 5 10 15 Gly Met Gly Gly Gln Tyr Gly Asn Pro Leu Asn Lys Tyr Ile Arg 20 25 30 His Tyr Glu Gly Leu Ser Tyr Asn Val Asp Ser Leu His Gln Lys 35 40 45 His Gln Arg Ala Lys Arg Ala Val Ser His Glu Asp Gln Phe Leu 50 55 60 Arg Leu Asp Phe His Ala His Gly Arg His Phe Asn Leu Arg Met 65 70 75 Lys Arg Asp Thr Ser Leu Phe Ser Asp Glu Phe Lys Val Glu Thr 80 85 90 Ser Asn Lys Val Leu Asp Tyr Asp Thr Ser His Ile Tyr Thr Gly 95 100 105 His Ile Tyr Gly Glu Glu Gly Ser Phe Ser His Gly Ser Val Ile 110 115 120 Asp Gly Arg Phe Glu Gly Phe Ile Gln Thr Arg Gly Gly Thr Phe 125 130 135 Tyr Val Glu Pro Ala Glu Arg Tyr Ile Lys Asp Arg Thr Leu Pro 140 145 150 Phe His Ser Val Ile Tyr His Glu Asp Asp Ile Asn Tyr Pro His 155 160 165 Lys Tyr Gly Pro Gln Gly Gly Cys Ala Asp His Ser Val Phe Glu 170 175 180 Arg Met Arg Lys Tyr Gln Met Thr Gly Val Glu Glu Val Thr Gln 185 190 195 Ile Pro Gln Glu Glu His Ala Ala Asn Gly Pro Glu Leu Leu Arg 200 205 210 Lys Lys Arg Thr Thr Ser Ala Glu Lys Asn Thr Cys Gln Leu Tyr 215 220 225 Ile Gln Thr Asp His Leu Phe Phe Lys Tyr Tyr Gly Thr Arg Glu 230 235 240 Ala Val Ile Ala Gln Ile Ser Ser His Val Lys Ala Ile Asp Thr 245 250 255 Ile Tyr Gln Thr Thr Asp Phe Ser Gly Ile Arg Asn Ile Ser Phe 260 265 270 Met Val Lys Arg Ile Arg Ile Asn Thr Thr Ala Asp Glu Lys Asp 275 280 285 Pro Thr Asn Pro Phe Arg Phe Pro Asn Ile Gly Val Glu Lys Phe 290 295 300 Leu Glu Leu Asn Ser Glu Gln Asn His Asp Asp Tyr Cys Leu Ala 305 310 315 Tyr Val Phe Thr Asp Arg Asp Phe Asp Asp Gly Val Leu Gly Leu 320 325 330 Ala Trp Val Gly 
Ala Pro Ser Gly Ser Ser Gly Gly Ile Cys Glu 335 340 345 Lys Ser Lys Leu Tyr Ser Asp Gly Lys Lys Lys Ser Leu Asn Thr 350 355 360 Gly Ile Ile Thr Val Gln Asn Tyr Gly Ser His Val Pro Pro Lys 365 370 375 Val Ser His Ile Thr Phe Ala His Glu Val Gly His Asn Phe Gly 380 385 390 Ser Pro His Asp Ser Gly Thr Glu Cys Thr Pro Gly Glu Ser Lys 395 400 405 Asn Leu Gly Gln Lys Glu Asn Gly Asn Tyr Ile Met Tyr Ala Arg 410 415 420 Ala Thr Ser Gly Asp Lys Leu Asn Asn Asn Lys Phe Ser Leu Cys 425 430 435 Ser Ile Arg Asn Ile Ser Gln Val Leu Glu Lys Lys Arg Asn Asn 440 445 450 Cys Phe Val Glu Ser Gly Gln Pro Ile Cys Gly Asn Gly Met Val 455 460 465 Glu Gln Gly Glu Glu Cys Asp Cys Gly Tyr Ser Asp Gln Cys Lys 470 475 480 Asp Glu Cys Cys Phe Asp Ala Asn Gln Pro Glu Gly Arg Lys Cys 485 490 495 Lys Leu Lys Pro Gly Lys Gln Cys Ser Pro Ser Gln Gly Pro Cys 500 505 510 Cys Thr Ala Gln Cys Ala Phe Lys Ser Lys Ser Glu Lys Cys Arg 515 520 525 Asp Asp Ser Asp Cys Ala Arg Glu Gly Ile Cys Asn Gly Phe Thr 530 535 540 Ala Leu Cys Pro Ala Ser Asp Pro Lys Pro Asn Phe Thr Asp Cys 545 550 555 Asn Arg His Thr Gln Val Cys Ile Asn Gly Gln Cys Ala Gly Ser 560 565 570 Ile Cys Glu Lys Tyr Gly Leu Glu Glu Cys Thr Cys Ala Ser Ser 575 580 585 Asp Gly Lys Asp Asp Lys Glu Leu Cys His Val Cys Cys Met Lys 590 595 600 Lys Met Asp Pro Ser Thr Cys Ala Ser Thr Gly Ser Val Gln Trp 605 610 615 Ser Arg His Phe Ser Gly Arg Thr Ile Thr Leu Gln Pro Gly Ser 620 625 630 Pro Cys Asn Asp Phe Arg Gly Tyr Cys Asp Val Phe Met Arg Cys 635 640 645 Arg Leu Val Asp Ala Asp Gly Pro Leu Ala Arg Leu Lys Lys Ala 650 655 660 Ile Phe Ser Pro Glu Leu Tyr Glu Asn Ile Ala Glu Trp Ile Val 665 670 675 Ala His Trp Trp Ala Val Leu Leu Met Gly Ile Ala Leu Ile Met 680 685 690 Leu Met Ala Gly Phe Ile Lys Ile Cys Ser Val His Thr Pro Ser 695 700 705 Ser Asn Pro Lys Leu Pro Pro Pro Lys Pro Leu Pro Gly Thr Leu 710 715 720 Lys Arg Arg Arg Pro Pro Gln Pro Ile Gln Gln Pro Gln Arg Gln 725 730 735 Arg Pro Arg Glu Ser Tyr Gln Met Gly His Met Arg Arg 740 745 67 13497 
DNA Homo sapien 67 atgctgatgg ctgttggcga tcggatctac gatctcgtta tgcaaattgg 50 cctgcccgcc gatggcttcc gcctcttccg gcgtggagcg gcctggcaaa 100 cgctcttcct gctctgcgct ctcgcctatt gcataaatga agccagcagc 150 gagggtcgcg tggtctgcta ctatacgaac tggagtgtct atcggccggg 200 cacagccaaa ttcaatccgc agaacataaa tccatacctg tgtacacatt 250 tggtgtacgc gttcggtgga ttcaccaagg ataaccaaat gaagcccttt 300 gacaagtacc aggacatcga acagggtggc tatgccaagt tcactggact 350 caaaacgtac aacaaacagc tgaagaccat gattgctatt ggcggatgga 400 acgaggcgag ttccagattc tctccattgg tggctagtaa cgagcgtcgg 450 cagcagttca tcaagaacat cttgaaattc ctgcggcaga accattttga 500 tggcatcgac ctggattggg aatatccagc ccatcgagag ggcggcaagt 550 cccgggatcg tgataactat gcccagtttg tccaagagct tagggccgaa 600 ttcgaaaggg aagcggagaa gaccggacgc actcgtctcc tattgaccat 650 ggcagttcct gctggcatcg agtacattga caagggttac gatgtgccca 700 agttgaacaa gtatttggat tggttcaatg tgctgaccta tgatttccat 750 tcttcccacg agccatcggt taaccatcat gctccgcttt attcgctgga 800 ggaagactcc gagtacaatt acgatgccga gttaaatatt gactactcca 850 tcaaatacta tctgaaagcg ggtgcagatc gtgacaagct tgttctgggc 900 atacccacct atggccgatc ctatactttg atcaacgagg agagcaccga 950 actgggagca ccggcagagg gtccgggaga acagggtgat gccaccaggg 1000 aaaagggata cctggcttac tatgagatct gccagaccct caaggacgac 1050 cccgagtgga ctgtggtgca gccgaatgcc aatgttatgg gtccctatgc 1100 ctacagacgc aatcagtggg tgggctacga tgatgaggct atcgtgcgca 1150 agaaggccga atatgtggtg gcccagggac tgggcggcat tatgttctgg 1200 gccatcgaca acgatgattt ccgcggcacc tgcaacggaa agccatatcc 1250 tcttatcgag gctgccaagg aagccatggt ggaggcattg ggactgggca 1300 tcaacgaggt ggccaagcca agtggaccac agaagccatc aagatctcgc 1350 agtcgcgata atgccagcaa taggaaccgt ctaaatggca agaccgaagc 1400 tccactcagc tccaggagac ccagtgccac aaggagacca gctgtgagct 1450 ctactcaggc accaccgcca agcaccacct tcaaactgac cgaagccgag 1500 ggatcatctc tttacattgg aggcagggca tccaccacgc cgccaccacc 1550 aacgactccc gatccaggtt cggacttcaa gtgcgaggag gagggattct 1600 tccagcatcc aagagattgc aaaaagtact actggtgcct ggatagtgga 1650 ccgtctggtc 
taggaattgt ggctcacatg ttcacttgtc catcgggact 1700 ttactttaat ccggctgccg attcctgcga ctttgctcgc aatgtgccct 1750 gcaagaccaa aaaatctaca acggcggcac cagtgacatc cacaaccccg 1800 gcaaccacca cagtgcgttc caatcgagta acagctgctc ccacgtctcg 1850 tccagtatat ccacgcacca ccaccacaac aagcactacg acaaccacta 1900 cgacaactcc atccactgtt gatgaagact tggaatatga agaagatact 1950 gacgagctgt cgccaagcaa atccacggat gctgaggagg atccacaggt 2000 gattaaggaa ctaatcgatt tgattcgcaa agtgggcggc gtggagcaac 2050 tggagaagca tcttttgcgt aacaaggatg gatcgattac attgaaggaa 2100 aattctgcca ctggtgcagc cactactccg tcgactataa gcaaatctct 2150 gtatgatcgt gtgctaagtc gacctggaac attgaattcc ttcagccgca 2200 atcgtttcaa gatcagcgaa gcttcagaga caagcacaga acccactact 2250 tcgagcagtt cgtcaagggg ttcatcgact ctcacctcca acaccaactc 2300 caaatattcc tcagttctga ggggcaacag tcgccaaggt ccacagaacg 2350 agggcattga aaaattggct gagttcgatg gtttcctgaa ggaacgcaaa 2400 caatatgtga ccatcaatcg ccatcgatct gcgagtcagg gagatgagga 2450 ggagcacgca gatcagcagg aagaggagga gaacctggcc gaagttgaga 2500 caaccacccg tcgtcctttg agctctatca ctccgtctta cacaagtctg 2550 agacgctcta ggcccaccac ggtggcaccg ccagcagaag aatcccacga 2600 agaggcagag caacagaccc agacacaggt gaaatcctat
gccaccttga 2650 gccgtactcg tggacgcacc acgtcatctc cagaagtcac ggaagcggcg 2700 cccagttcga caaccaatcg ttacaaatac ttcgagcgta cacgacccac 2750 taagagtgcc actgccgaag attccgaaga tccaacagaa gatgaagagg 2800 aagagtacga ggacgagcaa aaggacattg tcacggtaca gagcaaacaa 2850 tccacaaaca cacgtaaata tgcgagcatc ggccgcagaa caaccaccac 2900 cacaacagca acaccagaaa ctacaactac aacaacaaca acaaccgccg 2950 gcactgagac tgccaaggcc agcaccacca ccaacaacaa caacaacaac 3000 aacagccact acaacagcag caataacaac aacaatgtga aactgaataa 3050 ccaattacca acagaagaga acatcacaac aacacccagc accaccgccc 3100 aatccgaaac cacaaccaca actaacgaaa ccactgaacc aaatgaaagc 3150 acctcaacca ccaccacatc cataactaat aatctgcata ccaccaccac 3200 cacacccaca ccgattgtag catccactgt cccaacaaca accgctaatg 3250 gcattagctc agactccctg ctagccaccg agttgagtga agcctcaccc 3300 acccaccttt ctcccagtcc cgactcagaa acatcaacac cgacaacaac 3350 aagcacaaca acaacggaac aacccgaact ggacaccaca acaacgacac 3400 cgaaaaccac aacaacaacc accacgggca acaacgaact gaatgacgtt 3450 aacaacgtgg acgaggacag tgaggtaacg aaaaccaaga ctcaatacaa 3500 gtatgcaacc accaaccgtc gacgcataac cacaacaaca acaacagcca 3550 ccaagaacag caacaacaac aacaatgcag aagctgcgaa cgatgctagt 3600 ccaacaacaa acggtttgag tagtttaaat agcattagaa ccaacccggg 3650 cagaaggcaa ccccaaccag agcagaccca aacaacaacc tccgaaccaa 3700 acctttctag tcccagaccc tttggttatc cccgacgtcg cacacgaccc 3750 acagttagca ccacaaccac cactatttcc caaaccgata atgataacaa 3800 taccgataat aacgataacg aaaccgatgc agttgctcaa gtagtgaaga 3850 agacacgact atctccaggg gataggccca aggtgagtgc cagtcttcca 3900 acagcaaccg caatcaacac acgaaccaat acctcctcac tgcaccacca 3950 agaatcccaa gtagaagtag ctggtaacgg tggcaatgat agcttaaggc 4000 atgatgtagt tagctctagt ttaagtcaat cccaatccaa caaaatcgat 4050 accgatgacc tcagtaccac acagcagcat accaagtaca cttggcgtgc 4100 ggtgcgacgt cctgcatcgc agcgcacagt ggttcccaat tctttagcgg 4150 gcgacgacaa ggactcccga cgctttgccg gcaagcagtt gaacaccgaa 4200 tcaattgtgg atgacgagct tcaaaccacc accaagttcc gcagtcgccg 4250 cctaaattcc gccgaagatg agtccgaagt ggctctcgag gtggcaaccg 4300 
cgaccccaac gcacggttcg cgcagctacc agagcatcca acgttctgct 4350 agcaaagcga gcttagatga ctctcaaatc cattacaaag ccatcattcg 4400 cgattccgag ggcggtgctc atctaaccgc cggacgcagt tccagctttg 4450 tgaggaattt cggcgatgca gccaaaccaa ctccaccgca ccaacctatt 4500 agcagaggtg gacagattgt ggagagcacc accgaagatg aaaatgtggc 4550 tgctgaaata atcgatgatg agaagagggg tgagacgaag gcgcctgctg 4600 gaagcgagaa tactgatgac agcaacacgg cgaccgaaca ggaatctccc 4650 gaaattgtaa ctgaagctgc acaaccacag ctcgagatta ccactttgcc 4700 atcagaaact tcagatgtca gcagttctac ggaacaatct gtaagcagca 4750 ccactgaaga aagctccagc agcaccgcag acttggatat cgtcgctgaa 4800 gaagccagtc taggcgctga aaccgacaaa aagtctactt ctgagaatga 4850 taatggagaa tcttcaacgg aaatcagctc ttctgaagcc ccaattagca 4900 gcacaactgg acaatctgaa gatgtaagca gcaccactga aactaattca 4950 gaagccattg aaaaagaaat tgcctctgat tctaacgatg gatccagtga 5000 tgatcctgcc agcagtacgg aatttataga aatcactaac acaactagca 5050 gcccagtcag tctccaggaa gattcttcga caacgacaga aaaactcaca 5100 cggcgagcat ttaaccgctt tagctcaact actcctgcag ttgtgccaga 5150 agatgaaacc acatccaccg taaaccaaag gcgtcgagtt attgtgcgta 5200 acagaatcag caccactgaa gcagagagtg aggctcagac aaccacagaa 5250 gaaccaaaac gacgtagttt ttacagaacc agcaccacag ctgaaccaag 5300 cagctcaact gaagcagaca gtgatgccca gatatctact gaaactacaa 5350 cacgtcgttc tttctttaga actaggacca ctgaagcagc tagcagcacc 5400 actgaagaac ctagcagccc cacagaacca gagattgaag tcgagacaac 5450 caccgagggt ccaacgcgcc gttcgttctt cagacgtagc accactgtgg 5500 ctcctagtag caccactgaa gaaatcagta gcagctccgt agatgatgat 5550 gctgaagcga atattatcac cactcgtcgt tcgctattca ctacgccagc 5600 gccaagttcg acagaagcca caaccacagc aactgcagaa gattccgaag 5650 ttagtagctc caccaggcgt agtttcttcc gtaccagcac taccacagaa 5700 ggcacaacaa gtactacaga agaggcaaag gatattgaac atgagagtga 5750 gaccaccgct gctttgccta aacgtcgagt tatagtcaga ggcaatttta 5800 ggccccgcaa ggagggagat ctctcctctc ttctagcggc agatgccaac 5850 aagcgagtga gaaacaacca cagcaccaca agcactgaaa cacctgctaa 5900 tagtcaatca actacatcca atgaagaaga tacggtggca caaccaccac 5950 aagctgaggt 
gaaggcaacc actggtcggg tttccttaaa tgctgttcga 6000 aaccgcacca ccacaaaaac cgaaagtcta ggcaatggaa tcacccgcac 6050 tcgcaccacc tacgtccgca ctctcgacgc tggtcaaaag attgtgaaac 6100 gcattcacac taaaaccata gaagagaaac ccgcggagta tgagtacatt 6150 atcgatgaag tgacccaccc accagcggcg agcaccacgc ctcgcacagt 6200 gacacgcaat cgtggatccg tgcgattcca gagtaatgat ctttcttcgc 6250 tgctggcctt ggactttgcc tcacgtagca cccgcaagaa acaggcccaa 6300 acagagacaa cagttaccaa aacgcgaaga cgcctactta agaaacccaa 6350 ggaaaccatc gaacatgaag aggtggaaga atacgagtat gaagcaggtc 6400 aggaagctgg gaacgaagtt gaggaggcgc cacgtgtgag cacgacagcc 6450 aggactataa tccgaagaac tagaccgaca acaatcagaa caacaactac 6500 ggaaacacct caaaatattg aggctagcac ccgcagggct agcttcgctt 6550 tcaagcgtcc ctccaaagtt agcacaacca ctgaggaacc aactacaagc 6600 tctacggaac ccacaatatc tgcagaagct accaccagaa gagttctcaa 6650 cttcagaaga ccagtgagca cgacaagcac cccagcttca gatgagtcca 6700 ctgaggaggc aactgctgcc cccattgagg ccaccaccag aagagttctt 6750 gcgttcaaaa gacctgtcag cacaacaacc accccagctc cagtagacga 6800 agagtccaca gaagaatcca cccctacttc tatagagggc aacaccagaa 6850 ggattctcgc ttatagaagg cccgtgagca caacaactac cacaccagtt 6900 cctgttgaag atgaatcttc aacagatcag ttggcagcgg ccaagcaaaa 6950 gtttataaat cgtctcaaat ctagtaccac aacgacaaca tcaattcctg 7000 aaacaacaac caccgaagaa gacttaagcg atcttaaggt acagcttagc 7050 aatgccataa accgtttgca aacggaaaat aaactagaag tccagacaat 7100 caccaagggt agtgaagccg cagaagatga aggcgatgat aagctgtctt 7150 tgccaattta ccacagaaga aagtactatc agtatgtaaa ggattcacca 7200 attacctata tcgacaagtc ccccgcacca ccggatatcg aaagtgtgac 7250 tgtaaatatt aagcaacaaa ttcatgatgt gttcaatgtg agtgaaaatg 7300 agacgccaca taattctctg ggcgatgacg aagagaccga aggacatcgc 7350 gtggctatgg cgcaggccaa ggaaatcaat gccgagctcg aagaaaagga 7400 aagaggcgaa gatgaagcca gggctttacg cacctacacc agattaaacc 7450 gcacccgctt gaccctgtcc accaggttac aggaaaagac ccaaagtgaa 7500 ccgctagaca ccacaaccag gaggagctac agcgtccctc agcgtttccg 7550 tattcgctca acgacgccga taccatctaa aatagaaaac agcgaagagg 7600 acgatgaaga aactaaagac 
aatgaaggac catcgccgag cacaacaacc 7650 gtgacaccac catccataaa actacctacc agacgacttt ttacacccag 7700 aaggcccgtt aatgctgtag aggactcaga ttcttctgat atacgtaaag 7750 acaatgaaga agagttaaag gtggaatcaa ccacaaaacg cctatatgcc 7800 ggcttaaata gattgagagg caggggaagc accaccacaa ccaccgaaga 7850 agcaactgat tcaaccaccg agacagcaac cacaacagcc aaaagtacaa 7900 gacaacctta tgtgggtatt agtcgcagag taacaactac aacgaccact 7950 gagaagtcag ccgaaagtag tactgaatat aatggaaatg aagatgaaga 8000 aacggagtcc accacagtaa ctcctgaaca ggagataagc gatgatgctg 8050 aagaaaataa ggttgcaatc aaagaaatag atgaccaagt tagcaagaaa 8100 gctcccgaag aggctgaaga caccagcact gaggagcccg aactggaagc 8150 ctttatagac gatgacaatg aaatccccct cgaagaaagc ggacctaaaa 8200 ctgaaaccac atcaacaaca acaacaacaa catcaacgac caccactaca 8250 ccagcaagca ccacatcacg tcgtcagttg gtaattcgtc gccgttttaa 8300 tggcaccata acaacaacca ctacggttgc ccctgtcgct gatgagaatc 8350 tcgaaaatga gatcgatcca agtgatacgg agagttcaac accaaaagcg 8400 gcaactacaa cttcaccaag acggcaactt ttaatacgcc gacgtttcaa 8450 tgcgaccagc agcgggtcaa caacaacgac aactgcaaat cctagtgccg 8500 acaatgaaat agatcagggc gagacaaaaa gaacaacgcg tcgaccaatt 8550 ctgtcacgtc ggcgttttaa tgcgacttcc atcacagcaa ccacgacggg 8600 atcaacgaac ggcgacgaga taagcacacg acgtccgtat gcggccttga 8650 accgctcacg taatcgcttt acaacaccgc aaacaacaac caccgacggt 8700 ggggcaaacg gcgatgatga tgattacgac ggagaggaag aggagcagct 8750 agcaccgccg cgagctgtct tccttcaaac aaaccgccat cgcgctctga 8800 aacccactcc tgaagacgag gaggagggtg cagctgcggt gccaggacgt 8850 aggccactca attttgccgc acgtcgcacc accgctgcac cactgagagt 8900 tagctctagt acgcgacgaa atctggtggc aatcaataga aatctttacc 8950 accgaccaga ggaagataac gaggaagaac ccgaagagga gtacgacgaa 9000 aacgaggatg gggatgatga tcaagaagaa tctgtagatc ctcaagtcac 9050 aagcactaca acacgttcac gactaaatca attgctagcc aacaggcaaa 9100 gacaaccact caggacaacc actgaaaaac agacagagac agatagcaac 9150 gatacagaaa ctgactcgga caacggagat gaaaacgatg atgatgaaga 9200 taacgatagc tccgtcgagg tatcgaacag taaccacaca ctaaaacaca 9250 gcactatatt cggggttggc accactaact 
ttaacaatct tactaaccgt 9300 agcactgcac tgaatgtagc tagtcaacgt agtaacagca ccgtagccaa 9350 ttatattaat cgattcaagt caaacagtta cacaaacaaa aacaaaccag 9400 tcacagtcac agctaatatt aaggcagata gtacagatga taaagataat 9450 gataatgatg aagatgatga tgatgatgat aacgatgatg ataattatgc 9500 tagtctagag aatgagggga aggagaaaac gagtggggct ggtttgaatg 9550 ccttaggtaa tgatgttaat tccacacgac gctttcagaa tcgttatcaa 9600 ttgagccgca ccagaggcag caccaccacc aacaccaacc caacaacaac 9650 acaacaaccc caaacaacaa gcaccgccag acgattggcg tttgggggcc 9700 gtcagcgcgc ccaggtaact aaactaaccc tagtcgatga gcagactgaa 9750 gagaccgaga ctaagggtga ttcccgcgaa gaggagaagg aggaggagga 9800 ggaagaagac tccaacgcaa ccacaacaac aacaaccaca accaccagca 9850 gaccaacacc aaagcgcata cgtgtgctta aattccgcag acccttaaat 9900 agcaatagca atagtaccat caatgtagat agcaccacta atagcgccac 9950 tgacactaac ccagacacca ccacagccac accaacaaca gcaggtcaaa 10000 gcaccaccag caacagcaac aacaacaaca acaacaccac aagcaccacc 10050 ggcaataagc gtttccgcaa gattgttcgc aaattgaggc ctgtggactc 10100 aagtactgcg gctagtgtgg ataattccga tgagaccacc cgtaaacctt 10150 tcgttcccag ccacaccaga ttcgctgatc aggataatga ccttgttaat 10200 ctccgccagc gaatcaagga gcagcaagca cgaggtgagc cgcaggatgg 10250 agttattagc aatcgattta agaccctggg ccaaaaggac gatcaggatg 10300 tgtcggagtt gcaaaagttg agggacaaag taaaggcaga gcaggcgagg 10350 ggagaaggag aacagggcgt gatcaacgat cgtctgaaga agcttctggc 10400 cgagaagggc tcgagcatct caagccaaag agaagaatct tcaacagatg 10450 atgactcttc atcggtttcc tcggccagac ccttctttaa acgaaaactg 10500 gtggcacgta gaccctacac accaccctcg gccagtggtg gcaccacaaa 10550 ggcaccactg acctttagca cctcccgacc tacggctaaa ttcgtgcgca 10600 ggaagaacgg aagatttgat cccttcaact cgagtgttcg caatcgtgga 10650 gagggctttg tgagaagtga tccgagagga tccagattgc ctggaaccga 10700 tcgcttcaag tctcagggaa atagcgaaga cgatgacgaa gtggaagagc 10750 gccatgagca accactgcag aatcagtttg caaccacatt gcgaagacct 10800 tttgtgccga agacccgacc agttttggat aagtccaagc ctgagcagga 10850 agatggtgca gaagagagtg aggaggaaga cgaggaggag gattccgagg 10900 aagaaggtga cgaagaagag gatgaagaag 
aagaggacgt gaagccaggt 10950 ggagaagagg acaatgcaca agaggataac aaaccgaagt ttaattcccc 11000 ttacaaaccc aaagataatc gagctccacc gggtagtcgt cccacatttg 11050 gaaccaccgg aagtggatca ccacctaccg catccggaaa tgtaccatac 11100 aatcctcgca atcgcccatc aaactcggca aatggaaata gcacaccatc 11150 caaccgtttt ggcaccacaa agcgacctcg tgtggttaac cgaccaccgg 11200 gagtagcctc accaaatttg accttgaaac ccgtggccag tgattatgaa 11250 cgcaccacac cgctgacacc actgaaaccg gcgccattta tacccagcaa 11300 taacaggagt tacgagagga agtactcggg tccatccaca gaggccgcgg 11350 agacggccag cgagaattcc ctgatagaag atttgaatat cgatgcactg 11400 aatgcgagaa acaagaagat ctttgacaaa cacagtaaga aacatccggc 11450 tctgaagcca aaggtggtta aggtggaatc cgagacggga ttagaggtgg 11500 aagccggtac agaggtggcg gtggaagatg aaaccaccga ggagcagcag 11550 caggagcagg gctttgtgac caccacaccc agtacaccac catcaccagc 11600 accacccagc acacagtcag acacagccac caccacagac acgccaccag 11650 aaactgaaac tgaaactgaa actgaaactg aaacagaaac tgaaactgaa 11700 aatgtaaccg aaattgaaac agcgaccaat gctaatgaag ccacttccat 11750 caattcacag gatcaaacca tctcgagcac tactcaagca ccaccgccgg 11800 ccacaacctt gttgcatgtg ttcacccttc tcgagggtga gggtcaagaa 11850 gaggagccca cgaccagaaa gccaacggta cgcctctatc ccaccataca 11900 aacggaagtg gtgcccaagc acaagttaat cgagattaat cgcattgtcg 11950 agattaattc caaacaggcc aaggctgctc agaggaagtc caaggcgaat 12000 catgacttta gcaccttaat ggtggagtcc ctgccccacg tggagcagct 12050 gggcgagatc agtgtggtga aatatgtcca cttggtggat ggcagtgata 12100 ttcagatcaa cgatggtcac agcacagtgg cagattacac gcccaccgaa 12150 cccacatcag cggctgaacg tcctgtctcg ctgcccgtta ggaattctct 12200 gccagaaacc gagggcgctg acaccgatcg cagtggcaag tccttggtgc 12250 ctgaagttct aactgccgca ttggagacct ctaccatttc gctggaggga 12300 ctatttgatt cggccagaaa gggcaagcaa ctgagcagta atacaatcat 12350 tggcgaaacg gaggagagca caaccattgg cagtagcagt agtttggcca 12400 gcgaaaccgg agagaccacc acaccggcac ccacttacgt aagacccatt 12450 gtacccctcc taagacccga gtccaatgag tcctcaccac tggtcatttc 12500 gatagccaac ctcgatcagg tgatactgag caaggtgcaa aagtcgctgg 12550 ccgagaatag 
ccaaaccaca gtggcccccg aagctgccag cgactcgaat 12600 tccgccttct ctgtgcgcca accgctggtg gtgcaggcgc ccattagcaa 12650 tggtgctcag gaaatcgatc aggatacact taacacacaa gatcagacaa 12700 tcaatggtgc tattagtgtg aaaactaatc caattataca aacgacaacc 12750 aacaggccaa atgacgatca ggtggctgaa gaaaccacaa tctttagcat 12800 tgaaacggcc actgagccag agcttaacac acaaacaaca attcccaaaa 12850 cagaagccaa tagcgaaaca gtaacagcca tgccaattgg tgctgttatt 12900 atgggtcagt ttggtctaaa cacacaatca acaaccgctg ttgacaatga 12950 caaccaacta aatgctcaaa caacaagcac aattagcagt ggcgctgtta 13000 gttccgtagc tattggtggt aacacacaaa cagcaaacgc ggacaatgct 13050 cgccaagaaa atacacaatc aacaggcaca ataacaagtg agattagcag 13100 tggcgctatt agttccgaca accacaatca cataggcaca caaacaacag 13150 ccaccatcga tagcagttca gaaacaacac ccacccaaat atcaaccaca 13200 attagcagtg gtgcaattag tggtcatatt gatggtagca tcaatctgaa 13250 cacacaaaca aacaccacca tcagcactaa caacaccacc accagcacca 13300 ccgatgtggg cagcaaagtt agcgaggcag ttagctttag ttccgagacg 13350 catgtggtgc atcgcaaaaa gatgggtcgc aaggggcgtg gccgacgcct 13400 gcgcaatcgc aaaacgacaa caacgacaac aaccacagaa acgccaacca 13450 cgacggaggc gacatttgat gacaccacca cagtggtgcc agaatag 13497 68 4498 PRT Homo sapien 68 Met Leu Met Ala Val Gly Asp Arg Ile Tyr Asp Leu Val Met Gln 1 5 10 15 Ile Gly Leu Pro Ala Asp Gly Phe Arg Leu Phe Arg Arg Gly Ala 20 25 30 Ala Trp Gln Thr Leu Phe Leu Leu Cys Ala Leu Ala Tyr Cys Ile 35 40 45 Asn Glu Ala Ser Ser Glu Gly Arg Val Val Cys Tyr Tyr Thr Asn 50 55 60 Trp Ser Val Tyr Arg Pro Gly Thr Ala Lys Phe Asn Pro Gln Asn 65 70 75 Ile Asn Pro Tyr Leu Cys Thr His Leu Val Tyr Ala Phe Gly Gly 80 85 90 Phe Thr Lys Asp Asn Gln Met Lys Pro Phe Asp Lys Tyr Gln Asp 95 100 105 Ile Glu Gln Gly Gly Tyr Ala Lys Phe Thr Gly Leu Lys Thr Tyr 110 115 120 Asn Lys Gln Leu Lys Thr Met Ile Ala Ile Gly Gly Trp Asn Glu 125 130 135 Ala Ser Ser Arg Phe Ser Pro Leu Val Ala Ser Asn Glu Arg Arg 140 145 150 Gln Gln Phe Ile Lys Asn Ile Leu Lys Phe Leu Arg Gln Asn His 155 160 165 Phe Asp Gly Ile Asp Leu Asp Trp Glu Tyr Pro Ala His 
Arg Glu 170 175 180 Gly Gly Lys Ser Arg Asp Arg Asp Asn Tyr Ala Gln Phe Val Gln 185 190 195 Glu Leu Arg Ala Glu Phe Glu Arg Glu Ala Glu Lys Thr Gly Arg 200 205 210 Thr Arg Leu Leu Leu Thr Met Ala Val Pro Ala Gly Ile Glu Tyr 215 220 225 Ile Asp Lys Gly Tyr Asp Val Pro Lys Leu Asn Lys Tyr Leu Asp 230 235 240 Trp Phe Asn Val Leu Thr Tyr Asp Phe His Ser Ser His Glu Pro 245 250 255 Ser Val Asn His His Ala Pro Leu Tyr Ser Leu Glu Glu Asp Ser 260 265 270 Glu Tyr Asn Tyr Asp Ala Glu Leu Asn Ile Asp Tyr Ser Ile Lys 275 280 285 Tyr Tyr Leu Lys Ala Gly Ala Asp Arg Asp Lys Leu Val Leu Gly 290 295 300 Ile Pro Thr Tyr Gly Arg Ser Tyr Thr
Leu Ile Asn Glu Glu Ser 305 310 315 Thr Glu Leu Gly Ala Pro Ala Glu Gly Pro Gly Glu Gln Gly Asp 320 325 330 Ala Thr Arg Glu Lys Gly Tyr Leu Ala Tyr Tyr Glu Ile Cys Gln 335 340 345 Thr Leu Lys Asp Asp Pro Glu Trp Thr Val Val Gln Pro Asn Ala 350 355 360 Asn Val Met Gly Pro Tyr Ala Tyr Arg Arg Asn Gln Trp Val Gly 365 370 375 Tyr Asp Asp Glu Ala Ile Val Arg Lys Lys Ala Glu Tyr Val Val 380 385 390 Ala Gln Gly Leu Gly Gly Ile Met Phe Trp Ala Ile Asp Asn Asp 395 400 405 Asp Phe Arg Gly Thr Cys Asn Gly Lys Pro Tyr Pro Leu Ile Glu 410 415 420 Ala Ala Lys Glu Ala Met Val Glu Ala Leu Gly Leu Gly Ile Asn 425 430 435 Glu Val Ala Lys Pro Ser Gly Pro Gln Lys Pro Ser Arg Ser Arg 440 445 450 Ser Arg Asp Asn Ala Ser Asn Arg Asn Arg Leu Asn Gly Lys Thr 455 460 465 Glu Ala Pro Leu Ser Ser Arg Arg Pro Ser Ala Thr Arg Arg Pro 470 475 480 Ala Val Ser Ser Thr Gln Ala Pro Pro Pro Ser Thr Thr Phe Lys 485 490 495 Leu Thr Glu Ala Glu Gly Ser Ser Leu Tyr Ile Gly Gly Arg Ala 500 505 510 Ser Thr Thr Pro Pro Pro Pro Thr Thr Pro Asp Pro Gly Ser Asp 515 520 525 Phe Lys Cys Glu Glu Glu Gly Phe Phe Gln His Pro Arg Asp Cys 530 535 540 Lys Lys Tyr Tyr Trp Cys Leu Asp Ser Gly Pro Ser Gly Leu Gly 545 550 555 Ile Val Ala His Met Phe Thr Cys Pro Ser Gly Leu Tyr Phe Asn 560 565 570 Pro Ala Ala Asp Ser Cys Asp Phe Ala Arg Asn Val Pro Cys Lys 575 580 585 Thr Lys Lys Ser Thr Thr Ala Ala Pro Val Thr Ser Thr Thr Pro 590 595 600 Ala Thr Thr Thr Val Arg Ser Asn Arg Val Thr Ala Ala Pro Thr 605 610 615 Ser Arg Pro Val Tyr Pro Arg Thr Thr Thr Thr Thr Ser Thr Thr 620 625 630 Thr Thr Thr Thr Thr Thr Pro Ser Thr Val Asp Glu Asp Leu Glu 635 640 645 Tyr Glu Glu Asp Thr Asp Glu Leu Ser Pro Ser Lys Ser Thr Asp 650 655 660 Ala Glu Glu Asp Pro Gln Val Ile Lys Glu Leu Ile Asp Leu Ile 665 670 675 Arg Lys Val Gly Gly Val Glu Gln Leu Glu Lys His Leu Leu Arg 680 685 690 Asn Lys Asp Gly Ser Ile Thr Leu Lys Glu Asn Ser Ala Thr Gly 695 700 705 Ala Ala Thr Thr Pro Ser Thr Ile Ser Lys Ser Leu Tyr Asp Arg 710 715 720 Val Leu Ser Arg Pro 
Gly Thr Leu Asn Ser Phe Ser Arg Asn Arg 725 730 735 Phe Lys Ile Ser Glu Ala Ser Glu Thr Ser Thr Glu Pro Thr Thr 740 745 750 Ser Ser Ser Ser Ser Arg Gly Ser Ser Thr Leu Thr Ser Asn Thr 755 760 765 Asn Ser Lys Tyr Ser Ser Val Leu Arg Gly Asn Ser Arg Gln Gly 770 775 780 Pro Gln Asn Glu Gly Ile Glu Lys Leu Ala Glu Phe Asp Gly Phe 785 790 795 Leu Lys Glu Arg Lys Gln Tyr Val Thr Ile Asn Arg His Arg Ser 800 805 810 Ala Ser Gln Gly Asp Glu Glu Glu His Ala Asp Gln Gln Glu Glu 815 820 825 Glu Glu Asn Leu Ala Glu Val Glu Thr Thr Thr Arg Arg Pro Leu 830 835 840 Ser Ser Ile Thr Pro Ser Tyr Thr Ser Leu Arg Arg Ser Arg Pro 845 850 855 Thr Thr Val Ala Pro Pro Ala Glu Glu Ser His Glu Glu Ala Glu 860 865 870 Gln Gln Thr Gln Thr Gln Val Lys Ser Tyr Ala Thr Leu Ser Arg 875 880 885 Thr Arg Gly Arg Thr Thr Ser Ser Pro Glu Val Thr Glu Ala Ala 890 895 900 Pro Ser Ser Thr Thr Asn Arg Tyr Lys Tyr Phe Glu Arg Thr Arg 905 910 915 Pro Thr Lys Ser Ala Thr Ala Glu Asp Ser Glu Asp Pro Thr Glu 920 925 930 Asp Glu Glu Glu Glu Tyr Glu Asp Glu Gln Lys Asp Ile Val Thr 935 940 945 Val Gln Ser Lys Gln Ser Thr Asn Thr Arg Lys Tyr Ala Ser Ile 950 955 960 Gly Arg Arg Thr Thr Thr Thr Thr Thr Ala Thr Pro Glu Thr Thr 965 970 975 Thr Thr Thr Thr Thr Thr Thr Ala Gly Thr Glu Thr Ala Lys Ala 980 985 990 Ser Thr Thr Thr Asn Asn Asn Asn Asn Asn Asn Ser His Tyr Asn 995 1000 1005 Ser Ser Asn Asn Asn Asn Asn Val Lys Leu Asn Asn Gln Leu Pro 1010 1015 1020 Thr Glu Glu Asn Ile Thr Thr Thr Pro Ser Thr Thr Ala Gln Ser 1025 1030 1035 Glu Thr Thr Thr Thr Thr Asn Glu Thr Thr Glu Pro Asn Glu Ser 1040 1045 1050 Thr Ser Thr Thr Thr Thr Ser Ile Thr Asn Asn Leu His Thr Thr 1055 1060 1065 Thr Thr Thr Pro Thr Pro Ile Val Ala Ser Thr Val Pro Thr Thr 1070 1075 1080 Thr Ala Asn Gly Ile Ser Ser Asp Ser Leu Leu Ala Thr Glu Leu 1085 1090 1095 Ser Glu Ala Ser Pro Thr His Leu Ser Pro Ser Pro Asp Ser Glu 1100 1105 1110 Thr Ser Thr Pro Thr Thr Thr Ser Thr Thr Thr Thr Glu Gln Pro 1115 1120 1125 Glu Leu Asp Thr Thr Thr Thr Thr Pro Lys Thr Thr 
Thr Thr Thr 1130 1135 1140 Thr Thr Gly Asn Asn Glu Leu Asn Asp Val Asn Asn Val Asp Glu 1145 1150 1155 Asp Ser Glu Val Thr Lys Thr Lys Thr Gln Tyr Lys Tyr Ala Thr 1160 1165 1170 Thr Asn Arg Arg Arg Ile Thr Thr Thr Thr Thr Thr Ala Thr Lys 1175 1180 1185 Asn Ser Asn Asn Asn Asn Asn Ala Glu Ala Ala Asn Asp Ala Ser 1190 1195 1200 Pro Thr Thr Asn Gly Leu Ser Ser Leu Asn Ser Ile Arg Thr Asn 1205 1210 1215 Pro Gly Arg Arg Gln Pro Gln Pro Glu Gln Thr Gln Thr Thr Thr 1220 1225 1230 Ser Glu Pro Asn Leu Ser Ser Pro Arg Pro Phe Gly Tyr Pro Arg 1235 1240 1245 Arg Arg Thr Arg Pro Thr Val Ser Thr Thr Thr Thr Thr Ile Ser 1250 1255 1260 Gln Thr Asp Asn Asp Asn Asn Thr Asp Asn Asn Asp Asn Glu Thr 1265 1270 1275 Asp Ala Val Ala Gln Val Val Lys Lys Thr Arg Leu Ser Pro Gly 1280 1285 1290 Asp Arg Pro Lys Val Ser Ala Ser Leu Pro Thr Ala Thr Ala Ile 1295 1300 1305 Asn Thr Arg Thr Asn Thr Ser Ser Leu His His Gln Glu Ser Gln 1310 1315 1320 Val Glu Val Ala Gly Asn Gly Gly Asn Asp Ser Leu Arg His Asp 1325 1330 1335 Val Val Ser Ser Ser Leu Ser Gln Ser Gln Ser Asn Lys Ile Asp 1340 1345 1350 Thr Asp Asp Leu Ser Thr Thr Gln Gln His Thr Lys Tyr Thr Trp 1355 1360 1365 Arg Ala Val Arg Arg Pro Ala Ser Gln Arg Thr Val Val Pro Asn 1370 1375 1380 Ser Leu Ala Gly Asp Asp Lys Asp Ser Arg Arg Phe Ala Gly Lys 1385 1390 1395 Gln Leu Asn Thr Glu Ser Ile Val Asp Asp Glu Leu Gln Thr Thr 1400 1405 1410 Thr Lys Phe Arg Ser Arg Arg Leu Asn Ser Ala Glu Asp Glu Ser 1415 1420 1425 Glu Val Ala Leu Glu Val Ala Thr Ala Thr Pro Thr His Gly Ser 1430 1435 1440 Arg Ser Tyr Gln Ser Ile Gln Arg Ser Ala Ser Lys Ala Ser Leu 1445 1450 1455 Asp Asp Ser Gln Ile His Tyr Lys Ala Ile Ile Arg Asp Ser Glu 1460 1465 1470 Gly Gly Ala His Leu Thr Ala Gly Arg Ser Ser Ser Phe Val Arg 1475 1480 1485 Asn Phe Gly Asp Ala Ala Lys Pro Thr Pro Pro His Gln Pro Ile 1490 1495 1500 Ser Arg Gly Gly Gln Ile Val Glu Ser Thr Thr Glu Asp Glu Asn 1505 1510 1515 Val Ala Ala Glu Ile Ile Asp Asp Glu Lys Arg Gly Glu Thr Lys 1520 1525 1530 Ala Pro Ala Gly Ser 
Glu Asn Thr Asp Asp Ser Asn Thr Ala Thr 1535 1540 1545 Glu Gln Glu Ser Pro Glu Ile Val Thr Glu Ala Ala Gln Pro Gln 1550 1555 1560 Leu Glu Ile Thr Thr Leu Pro Ser Glu Thr Ser Asp Val Ser Ser 1565 1570 1575 Ser Thr Glu Gln Ser Val Ser Ser Thr Thr Glu Glu Ser Ser Ser 1580 1585 1590 Ser Thr Ala Asp Leu Asp Ile Val Ala Glu Glu Ala Ser Leu Gly 1595 1600 1605 Ala Glu Thr Asp Lys Lys Ser Thr Ser Glu Asn Asp Asn Gly Glu 1610 1615 1620 Ser Ser Thr Glu Ile Ser Ser Ser Glu Ala Pro Ile Ser Ser Thr 1625 1630 1635 Thr Gly Gln Ser Glu Asp Val Ser Ser Thr Thr Glu Thr Asn Ser 1640 1645 1650 Glu Ala Ile Glu Lys Glu Ile Ala Ser Asp Ser Asn Asp Gly Ser 1655 1660 1665 Ser Asp Asp Pro Ala Ser Ser Thr Glu Phe Ile Glu Ile Thr Asn 1670 1675 1680 Thr Thr Ser Ser Pro Val Ser Leu Gln Glu Asp Ser Ser Thr Thr 1685 1690 1695 Thr Glu Lys Leu Thr Arg Arg Ala Phe Asn Arg Phe Ser Ser Thr 1700 1705 1710 Thr Pro Ala Val Val Pro Glu Asp Glu Thr Thr Ser Thr Val Asn 1715 1720 1725 Gln Arg Arg Arg Val Ile Val Arg Asn Arg Ile Ser Thr Thr Glu 1730 1735 1740 Ala Glu Ser Glu Ala Gln Thr Thr Thr Glu Glu Pro Lys Arg Arg 1745 1750 1755 Ser Phe Tyr Arg Thr Ser Thr Thr Ala Glu Pro Ser Ser Ser Thr 1760 1765 1770 Glu Ala Asp Ser Asp Ala Gln Ile Ser Thr Glu Thr Thr Thr Arg 1775 1780 1785 Arg Ser Phe Phe Arg Thr Arg Thr Thr Glu Ala Ala Ser Ser Thr 1790 1795 1800 Thr Glu Glu Pro Ser Ser Pro Thr Glu Pro Glu Ile Glu Val Glu 1805 1810 1815 Thr Thr Thr Glu Gly Pro Thr Arg Arg Ser Phe Phe Arg Arg Ser 1820 1825 1830 Thr Thr Val Ala Pro Ser Ser Thr Thr Glu Glu Ile Ser Ser Ser 1835 1840 1845 Ser Val Asp Asp Asp Ala Glu Ala Asn Ile Ile Thr Thr Arg Arg 1850 1855 1860 Ser Leu Phe Thr Thr Pro Ala Pro Ser Ser Thr Glu Ala Thr Thr 1865 1870 1875 Thr Ala Thr Ala Glu Asp Ser Glu Val Ser Ser Ser Thr Arg Arg 1880 1885 1890 Ser Phe Phe Arg Thr Ser Thr Thr Thr Glu Gly Thr Thr Ser Thr 1895 1900 1905 Thr Glu Glu Ala Lys Asp Ile Glu His Glu Ser Glu Thr Thr Ala 1910 1915 1920 Ala Leu Pro Lys Arg Arg Val Ile Val Arg Gly Asn Phe Arg Pro 1925 1930 
1935 Arg Lys Glu Gly Asp Leu Ser Ser Leu Leu Ala Ala Asp Ala Asn 1940 1945 1950 Lys Arg Val Arg Asn Asn His Ser Thr Thr Ser Thr Glu Thr Pro 1955 1960 1965 Ala Asn Ser Gln Ser Thr Thr Ser Asn Glu Glu Asp Thr Val Ala 1970 1975 1980 Gln Pro Pro Gln Ala Glu Val Lys Ala Thr Thr Gly Arg Val Ser 1985 1990 1995 Leu Asn Ala Val Arg Asn Arg Thr Thr Thr Lys Thr Glu Ser Leu 2000 2005 2010 Gly Asn Gly Ile Thr Arg Thr Arg Thr Thr Tyr Val Arg Thr Leu 2015 2020 2025 Asp Ala Gly Gln Lys Ile Val Lys Arg Ile His Thr Lys Thr Ile 2030 2035 2040 Glu Glu Lys Pro Ala Glu Tyr Glu Tyr Ile Ile Asp Glu Val Thr 2045 2050 2055 His Pro Pro Ala Ala Ser Thr Thr Pro Arg Thr Val Thr Arg Asn 2060 2065 2070 Arg Gly Ser Val Arg Phe Gln Ser Asn Asp Leu Ser Ser Leu Leu 2075 2080 2085 Ala Leu Asp Phe Ala Ser Arg Ser Thr Arg Lys Lys Gln Ala Gln 2090 2095 2100 Thr Glu Thr Thr Val Thr Lys Thr Arg Arg Arg Leu Leu Lys Lys 2105 2110 2115 Pro Lys Glu Thr Ile Glu His Glu Glu Val Glu Glu Tyr Glu Tyr 2120 2125 2130 Glu Ala Gly Gln Glu Ala Gly Asn Glu Val Glu Glu Ala Pro Arg 2135 2140 2145 Val Ser Thr Thr Ala Arg Thr Ile Ile Arg Arg Thr Arg Pro Thr 2150 2155 2160 Thr Ile Arg Thr Thr Thr Thr Glu Thr Pro Gln Asn Ile Glu Ala 2165 2170 2175 Ser Thr Arg Arg Ala Ser Phe Ala Phe Lys Arg Pro Ser Lys Val 2180 2185 2190 Ser Thr Thr Thr Glu Glu Pro Thr Thr Ser Ser Thr Glu Pro Thr 2195 2200 2205 Ile Ser Ala Glu Ala Thr Thr Arg Arg Val Leu Asn Phe Arg Arg 2210 2215 2220 Pro Val Ser Thr Thr Ser Thr Pro Ala Ser Asp Glu Ser Thr Glu 2225 2230 2235 Glu Ala Thr Ala Ala Pro Ile Glu Ala Thr Thr Arg Arg Val Leu 2240 2245 2250 Ala Phe Lys Arg Pro Val Ser Thr Thr Thr Thr Pro Ala Pro Val 2255 2260 2265 Asp Glu Glu Ser Thr Glu Glu Ser Thr Pro Thr Ser Ile Glu Gly 2270 2275 2280 Asn Thr Arg Arg Ile Leu Ala Tyr Arg Arg Pro Val Ser Thr Thr 2285 2290 2295 Thr Thr Thr Pro Val Pro Val Glu Asp Glu Ser Ser Thr Asp Gln 2300 2305 2310 Leu Ala Ala Ala Lys Gln Lys Phe Ile Asn Arg Leu Lys Ser Ser 2315 2320 2325 Thr Thr Thr Thr Thr Ser Ile Pro Glu Thr Thr 
Thr Thr Glu Glu 2330 2335 2340 Asp Leu Ser Asp Leu Lys Val Gln Leu Ser Asn Ala Ile Asn Arg 2345 2350 2355 Leu Gln Thr Glu Asn Lys Leu Glu Val Gln Thr Ile Thr Lys Gly 2360 2365 2370 Ser Glu Ala Ala Glu Asp Glu Gly Asp Asp Lys Leu Ser Leu Pro 2375 2380 2385 Ile Tyr His Arg Arg Lys Tyr Tyr Gln Tyr Val Lys Asp Ser Pro 2390 2395 2400 Ile Thr Tyr Ile Asp Lys Ser Pro Ala Pro Pro Asp Ile Glu Ser 2405 2410 2415 Val Thr Val Asn Ile Lys Gln Gln Ile His Asp Val Phe Asn Val 2420 2425 2430 Ser Glu Asn Glu Thr Pro His Asn Ser Leu Gly Asp Asp Glu Glu 2435 2440 2445 Thr Glu Gly His Arg Val Ala Met Ala Gln Ala Lys Glu Ile Asn 2450 2455 2460 Ala Glu Leu Glu Glu Lys Glu Arg Gly Glu Asp Glu Ala Arg Ala 2465 2470 2475 Leu Arg Thr Tyr Thr Arg Leu Asn Arg Thr Arg Leu Thr Leu Ser 2480 2485 2490 Thr Arg Leu Gln Glu Lys Thr Gln Ser Glu Pro Leu Asp Thr Thr 2495 2500 2505 Thr Arg Arg Ser Tyr Ser Val Pro Gln Arg Phe Arg Ile Arg Ser 2510 2515 2520 Thr Thr Pro Ile Pro Ser Lys Ile Glu Asn Ser Glu Glu Asp Asp 2525 2530 2535 Glu Glu Thr Lys Asp Asn Glu Gly Pro Ser Pro Ser Thr Thr Thr 2540 2545 2550 Val Thr Pro Pro Ser Ile Lys Leu Pro Thr Arg Arg Leu Phe Thr 2555 2560 2565 Pro Arg Arg Pro Val Asn Ala Val Glu Asp Ser Asp Ser Ser Asp 2570 2575 2580 Ile Arg Lys Asp Asn Glu Glu Glu Leu Lys Val Glu Ser Thr Thr 2585 2590 2595 Lys Arg Leu Tyr Ala Gly Leu Asn Arg Leu Arg Gly Arg Gly Ser 2600
2605 2610 Thr Thr Thr Thr Thr Glu Glu Ala Thr Asp Ser Thr Thr Glu Thr 2615 2620 2625 Ala Thr Thr Thr Ala Lys Ser Thr Arg Gln Pro Tyr Val Gly Ile 2630 2635 2640 Ser Arg Arg Val Thr Thr Thr Thr Thr Thr Glu Lys Ser Ala Glu 2645 2650 2655 Ser Ser Thr Glu Tyr Asn Gly Asn Glu Asp Glu Glu Thr Glu Ser 2660 2665 2670 Thr Thr Val Thr Pro Glu Gln Glu Ile Ser Asp Asp Ala Glu Glu 2675 2680 2685 Asn Lys Val Ala Ile Lys Glu Ile Asp Asp Gln Val Ser Lys Lys 2690 2695 2700 Ala Pro Glu Glu Ala Glu Asp Thr Ser Thr Glu Glu Pro Glu Leu 2705 2710 2715 Glu Ala Phe Ile Asp Asp Asp Asn Glu Ile Pro Leu Glu Glu Ser 2720 2725 2730 Gly Pro Lys Thr Glu Thr Thr Ser Thr Thr Thr Thr Thr Thr Ser 2735 2740 2745 Thr Thr Thr Thr Thr Pro Ala Ser Thr Thr Ser Arg Arg Gln Leu 2750 2755 2760 Val Ile Arg Arg Arg Phe Asn Gly Thr Ile Thr Thr Thr Thr Thr 2765 2770 2775 Val Ala Pro Val Ala Asp Glu Asn Leu Glu Asn Glu Ile Asp Pro 2780 2785 2790 Ser Asp Thr Glu Ser Ser Thr Pro Lys Ala Ala Thr Thr Thr Ser 2795 2800 2805 Pro Arg Arg Gln Leu Leu Ile Arg Arg Arg Phe Asn Ala Thr Ser 2810 2815 2820 Ser Gly Ser Thr Thr Thr Thr Thr Ala Asn Pro Ser Ala Asp Asn 2825 2830 2835 Glu Ile Asp Gln Gly Glu Thr Lys Arg Thr Thr Arg Arg Pro Ile 2840 2845 2850 Leu Ser Arg Arg Arg Phe Asn Ala Thr Ser Ile Thr Ala Thr Thr 2855 2860 2865 Thr Gly Ser Thr Asn Gly Asp Glu Ile Ser Thr Arg Arg Pro Tyr 2870 2875 2880 Ala Ala Leu Asn Arg Ser Arg Asn Arg Phe Thr Thr Pro Gln Thr 2885 2890 2895 Thr Thr Thr Asp Gly Gly Ala Asn Gly Asp Asp Asp Asp Tyr Asp 2900 2905 2910 Gly Glu Glu Glu Glu Gln Leu Ala Pro Pro Arg Ala Val Phe Leu 2915 2920 2925 Gln Thr Asn Arg His Arg Ala Leu Lys Pro Thr Pro Glu Asp Glu 2930 2935 2940 Glu Glu Gly Ala Ala Ala Val Pro Gly Arg Arg Pro Leu Asn Phe 2945 2950 2955 Ala Ala Arg Arg Thr Thr Ala Ala Pro Leu Arg Val Ser Ser Ser 2960 2965 2970 Thr Arg Arg Asn Leu Val Ala Ile Asn Arg Asn Leu Tyr His Arg 2975 2980 2985 Pro Glu Glu Asp Asn Glu Glu Glu Pro Glu Glu Glu Tyr Asp Glu 2990 2995 3000 Asn Glu Asp Gly Asp Asp Asp Gln Glu Glu 
Ser Val Asp Pro Gln 3005 3010 3015 Val Thr Ser Thr Thr Thr Arg Ser Arg Leu Asn Gln Leu Leu Ala 3020 3025 3030 Asn Arg Gln Arg Gln Pro Leu Arg Thr Thr Thr Glu Lys Gln Thr 3035 3040 3045 Glu Thr Asp Ser Asn Asp Thr Glu Thr Asp Ser Asp Asn Gly Asp 3050 3055 3060 Glu Asn Asp Asp Asp Glu Asp Asn Asp Ser Ser Val Glu Val Ser 3065 3070 3075 Asn Ser Asn His Thr Leu Lys His Ser Thr Ile Phe Gly Val Gly 3080 3085 3090 Thr Thr Asn Phe Asn Asn Leu Thr Asn Arg Ser Thr Ala Leu Asn 3095 3100 3105 Val Ala Ser Gln Arg Ser Asn Ser Thr Val Ala Asn Tyr Ile Asn 3110 3115 3120 Arg Phe Lys Ser Asn Ser Tyr Thr Asn Lys Asn Lys Pro Val Thr 3125 3130 3135 Val Thr Ala Asn Ile Lys Ala Asp Ser Thr Asp Asp Lys Asp Asn 3140 3145 3150 Asp Asn Asp Glu Asp Asp Asp Asp Asp Asp Asn Asp Asp Asp Asn 3155 3160 3165 Tyr Ala Ser Leu Glu Asn Glu Gly Lys Glu Lys Thr Ser Gly Ala 3170 3175 3180 Gly Leu Asn Ala Leu Gly Asn Asp Val Asn Ser Thr Arg Arg Phe 3185 3190 3195 Gln Asn Arg Tyr Gln Leu Ser Arg Thr Arg Gly Ser Thr Thr Thr 3200 3205 3210 Asn Thr Asn Pro Thr Thr Thr Gln Gln Pro Gln Thr Thr Ser Thr 3215 3220 3225 Ala Arg Arg Leu Ala Phe Gly Gly Arg Gln Arg Ala Gln Val Thr 3230 3235 3240 Lys Leu Thr Leu Val Asp Glu Gln Thr Glu Glu Thr Glu Thr Lys 3245 3250 3255 Gly Asp Ser Arg Glu Glu Glu Lys Glu Glu Glu Glu Glu Glu Asp 3260 3265 3270 Ser Asn Ala Thr Thr Thr Thr Thr Thr Thr Thr Thr Ser Arg Pro 3275 3280 3285 Thr Pro Lys Arg Ile Arg Val Leu Lys Phe Arg Arg Pro Leu Asn 3290 3295 3300 Ser Asn Ser Asn Ser Thr Ile Asn Val Asp Ser Thr Thr Asn Ser 3305 3310 3315 Ala Thr Asp Thr Asn Pro Asp Thr Thr Thr Ala Thr Pro Thr Thr 3320 3325 3330 Ala Gly Gln Ser Thr Thr Ser Asn Ser Asn Asn Asn Asn Asn Asn 3335 3340 3345 Thr Thr Ser Thr Thr Gly Asn Lys Arg Phe Arg Lys Ile Val Arg 3350 3355 3360 Lys Leu Arg Pro Val Asp Ser Ser Thr Ala Ala Ser Val Asp Asn 3365 3370 3375 Ser Asp Glu Thr Thr Arg Lys Pro Phe Val Pro Ser His Thr Arg 3380 3385 3390 Phe Ala Asp Gln Asp Asn Asp Leu Val Asn Leu Arg Gln Arg Ile 3395 3400 3405 Lys Glu Gln 
Gln Ala Arg Gly Glu Pro Gln Asp Gly Val Ile Ser 3410 3415 3420 Asn Arg Phe Lys Thr Leu Gly Gln Lys Asp Asp Gln Asp Val Ser 3425 3430 3435 Glu Leu Gln Lys Leu Arg Asp Lys Val Lys Ala Glu Gln Ala Arg 3440 3445 3450 Gly Glu Gly Glu Gln Gly Val Ile Asn Asp Arg Leu Lys Lys Leu 3455 3460 3465 Leu Ala Glu Lys Gly Ser Ser Ile Ser Ser Gln Arg Glu Glu Ser 3470 3475 3480 Ser Thr Asp Asp Asp Ser Ser Ser Val Ser Ser Ala Arg Pro Phe 3485 3490 3495 Phe Lys Arg Lys Leu Val Ala Arg Arg Pro Tyr Thr Pro Pro Ser 3500 3505 3510 Ala Ser Gly Gly Thr Thr Lys Ala Pro Leu Thr Phe Ser Thr Ser 3515 3520 3525 Arg Pro Thr Ala Lys Phe Val Arg Arg Lys Asn Gly Arg Phe Asp 3530 3535 3540 Pro Phe Asn Ser Ser Val Arg Asn Arg Gly Glu Gly Phe Val Arg 3545 3550 3555 Ser Asp Pro Arg Gly Ser Arg Leu Pro Gly Thr Asp Arg Phe Lys 3560 3565 3570 Ser Gln Gly Asn Ser Glu Asp Asp Asp Glu Val Glu Glu Arg His 3575 3580 3585 Glu Gln Pro Leu Gln Asn Gln Phe Ala Thr Thr Leu Arg Arg Pro 3590 3595 3600 Phe Val Pro Lys Thr Arg Pro Val Leu Asp Lys Ser Lys Pro Glu 3605 3610 3615 Gln Glu Asp Gly Ala Glu Glu Ser Glu Glu Glu Asp Glu Glu Glu 3620 3625 3630 Asp Ser Glu Glu Glu Gly Asp Glu Glu Glu Asp Glu Glu Glu Glu 3635 3640 3645 Asp Val Lys Pro Gly Gly Glu Glu Asp Asn Ala Gln Glu Asp Asn 3650 3655 3660 Lys Pro Lys Phe Asn Ser Pro Tyr Lys Pro Lys Asp Asn Arg Ala 3665 3670 3675 Pro Pro Gly Ser Arg Pro Thr Phe Gly Thr Thr Gly Ser Gly Ser 3680 3685 3690 Pro Pro Thr Ala Ser Gly Asn Val Pro Tyr Asn Pro Arg Asn Arg 3695 3700 3705 Pro Ser Asn Ser Ala Asn Gly Asn Ser Thr Pro Ser Asn Arg Phe 3710 3715 3720 Gly Thr Thr Lys Arg Pro Arg Val Val Asn Arg Pro Pro Gly Val 3725 3730 3735 Ala Ser Pro Asn Leu Thr Leu Lys Pro Val Ala Ser Asp Tyr Glu 3740 3745 3750 Arg Thr Thr Pro Leu Thr Pro Leu Lys Pro Ala Pro Phe Ile Pro 3755 3760 3765 Ser Asn Asn Arg Ser Tyr Glu Arg Lys Tyr Ser Gly Pro Ser Thr 3770 3775 3780 Glu Ala Ala Glu Thr Ala Ser Glu Asn Ser Leu Ile Glu Asp Leu 3785 3790 3795 Asn Ile Asp Ala Leu Asn Ala Arg Asn Lys Lys Ile Phe Asp Lys 
3800 3805 3810 His Ser Lys Lys His Pro Ala Leu Lys Pro Lys Val Val Lys Val 3815 3820 3825 Glu Ser Glu Thr Gly Leu Glu Val Glu Ala Gly Thr Glu Val Ala 3830 3835 3840 Val Glu Asp Glu Thr Thr Glu Glu Gln Gln Gln Glu Gln Gly Phe 3845 3850 3855 Val Thr Thr Thr Pro Ser Thr Pro Pro Ser Pro Ala Pro Pro Ser 3860 3865 3870 Thr Gln Ser Asp Thr Ala Thr Thr Thr Asp Thr Pro Pro Glu Thr 3875 3880 3885 Glu Thr Glu Thr Glu Thr Glu Thr Glu Thr Glu Thr Glu Thr Glu 3890 3895 3900 Asn Val Thr Glu Ile Glu Thr Ala Thr Asn Ala Asn Glu Ala Thr 3905 3910 3915 Ser Ile Asn Ser Gln Asp Gln Thr Ile Ser Ser Thr Thr Gln Ala 3920 3925 3930 Pro Pro Pro Ala Thr Thr Leu Leu His Val Phe Thr Leu Leu Glu 3935 3940 3945 Gly Glu Gly Gln Glu Glu Glu Pro Thr Thr Arg Lys Pro Thr Val 3950 3955 3960 Arg Leu Tyr Pro Thr Ile Gln Thr Glu Val Val Pro Lys His Lys 3965 3970 3975 Leu Ile Glu Ile Asn Arg Ile Val Glu Ile Asn Ser Lys Gln Ala 3980 3985 3990 Lys Ala Ala Gln Arg Lys Ser Lys Ala Asn His Asp Phe Ser Thr 3995 4000 4005 Leu Met Val Glu Ser Leu Pro His Val Glu Gln Leu Gly Glu Ile 4010 4015 4020 Ser Val Val Lys Tyr Val His Leu Val Asp Gly Ser Asp Ile Gln 4025 4030 4035 Ile Asn Asp Gly His Ser Thr Val Ala Asp Tyr Thr Pro Thr Glu 4040 4045 4050 Pro Thr Ser Ala Ala Glu Arg Pro Val Ser Leu Pro Val Arg Asn 4055 4060 4065 Ser Leu Pro Glu Thr Glu Gly Ala Asp Thr Asp Arg Ser Gly Lys 4070 4075 4080 Ser Leu Val Pro Glu Val Leu Thr Ala Ala Leu Glu Thr Ser Thr 4085 4090 4095 Ile Ser Leu Glu Gly Leu Phe Asp Ser Ala Arg Lys Gly Lys Gln 4100 4105 4110 Leu Ser Ser Asn Thr Ile Ile Gly Glu Thr Glu Glu Ser Thr Thr 4115 4120 4125 Ile Gly Ser Ser Ser Ser Leu Ala Ser Glu Thr Gly Glu Thr Thr 4130 4135 4140 Thr Pro Ala Pro Thr Tyr Val Arg Pro Ile Val Pro Leu Leu Arg 4145 4150 4155 Pro Glu Ser Asn Glu Ser Ser Pro Leu Val Ile Ser Ile Ala Asn 4160 4165 4170 Leu Asp Gln Val Ile Leu Ser Lys Val Gln Lys Ser Leu Ala Glu 4175 4180 4185 Asn Ser Gln Thr Thr Val Ala Pro Glu Ala Ala Ser Asp Ser Asn 4190 4195 4200 Ser Ala Phe Ser Val Arg Gln Pro 
Leu Val Val Gln Ala Pro Ile 4205 4210 4215 Ser Asn Gly Ala Gln Glu Ile Asp Gln Asp Thr Leu Asn Thr Gln 4220 4225 4230 Asp Gln Thr Ile Asn Gly Ala Ile Ser Val Lys Thr Asn Pro Ile 4235 4240 4245 Ile Gln Thr Thr Thr Asn Arg Pro Asn Asp Asp Gln Val Ala Glu 4250 4255 4260 Glu Thr Thr Ile Phe Ser Ile Glu Thr Ala Thr Glu Pro Glu Leu 4265 4270 4275 Asn Thr Gln Thr Thr Ile Pro Lys Thr Glu Ala Asn Ser Glu Thr 4280 4285 4290 Val Thr Ala Met Pro Ile Gly Ala Val Ile Met Gly Gln Phe Gly 4295 4300 4305 Leu Asn Thr Gln Ser Thr Thr Ala Val Asp Asn Asp Asn Gln Leu 4310 4315 4320 Asn Ala Gln Thr Thr Ser Thr Ile Ser Ser Gly Ala Val Ser Ser 4325 4330 4335 Val Ala Ile Gly Gly Asn Thr Gln Thr Ala Asn Ala Asp Asn Ala 4340 4345 4350 Arg Gln Glu Asn Thr Gln Ser Thr Gly Thr Ile Thr Ser Glu Ile 4355 4360 4365 Ser Ser Gly Ala Ile Ser Ser Asp Asn His Asn His Ile Gly Thr 4370 4375 4380 Gln Thr Thr Ala Thr Ile Asp Ser Ser Ser Glu Thr Thr Pro Thr 4385 4390 4395 Gln Ile Ser Thr Thr Ile Ser Ser Gly Ala Ile Ser Gly His Ile 4400 4405 4410 Asp Gly Ser Ile Asn Leu Asn Thr Gln Thr Asn Thr Thr Ile Ser 4415 4420 4425 Thr Asn Asn Thr Thr Thr Ser Thr Thr Asp Val Gly Ser Lys Val 4430 4435 4440 Ser Glu Ala Val Ser Phe Ser Ser Glu Thr His Val Val His Arg 4445 4450 4455 Lys Lys Met Gly Arg Lys Gly Arg Gly Arg Arg Leu Arg Asn Arg 4460 4465 4470 Lys Thr Thr Thr Thr Thr Thr Thr Thr Glu Thr Pro Thr Thr Thr 4475 4480 4485 Glu Ala Thr Phe Asp Asp Thr Thr Thr Val Val Pro Glu 4490 4495 69 782 DNA Homo sapien 69 aggggcctta gcgtgccgca tcgccgagat ccagcgccca gagagacacc 50 agagaaccca ccatggcccc ctttgagccc ctggcttctg gcatcctgtt 100 gttgctgtgg ctgatagccc ccagcagggc ctgcacctgt gtcccacccc 150 acccacagac ggccttctgc aattccgacc tcgtcatcag ggccaagttc 200 gtggggacac cagaagtcaa ccagaccacc ttataccagc gttatgagat 250 caagatgacc aagatgtata aagggttcca agccttaggg gatgccgctg 300 acatccggtt cgtctacacc cccgccatgg agagtgtctg cggatacttc 350 cacaggtccc acaaccgcag cgaggagttt ctcattgctg gaaaactgca 400 ggatggactc ttgcacatca ctacctgcag tttcgtggct 
ccctggaaca 450 gcctgagctt agctcagcgc cggggcttca ccaagaccta cactgttggc 500 tgtgaggaat gcacagtgtt tccctgttta tccatcccct gcaaactgca 550 gagtggcact cattgcttgt ggacggacca gctcctccaa ggctctgaaa 600 agggcttcca gtcccgtcac cttgcctgcc tgcctcggga gccagggctg 650 tgcacctggc agtccctgcg gtcccagata gcctgaatcc tgcccggagt 700 ggaactgaag cctgcacagt gtccaccctg ttcccactcc catctttctt 750 ccggacaatg aaataaagag ttaccaccca gc 782 70 207 PRT Homo sapien 70 Met Ala Pro Phe Glu Pro Leu Ala Ser Gly Ile Leu Leu Leu Leu 1 5 10 15 Trp Leu Ile Ala Pro Ser Arg Ala Cys Thr Cys Val Pro Pro His 20 25 30 Pro Gln Thr Ala Phe Cys Asn Ser Asp Leu Val Ile Arg Ala Lys 35 40 45 Phe Val Gly Thr Pro Glu Val Asn Gln Thr Thr Leu Tyr Gln Arg 50 55 60 Tyr Glu Ile Lys Met Thr Lys Met Tyr Lys Gly Phe Gln Ala Leu 65 70 75 Gly Asp Ala Ala Asp Ile Arg Phe Val Tyr Thr Pro Ala Met Glu 80 85 90 Ser Val Cys Gly Tyr Phe His Arg Ser His Asn Arg Ser Glu Glu 95 100 105 Phe Leu Ile Ala Gly Lys Leu Gln Asp Gly Leu Leu His Ile Thr 110 115 120 Thr Cys Ser Phe Val Ala Pro Trp Asn Ser Leu Ser Leu Ala Gln 125 130 135 Arg Arg Gly Phe Thr Lys Thr Tyr Thr Val Gly Cys Glu Glu Cys 140 145 150 Thr Val Phe Pro Cys Leu Ser Ile Pro Cys Lys Leu Gln Ser Gly 155 160 165 Thr His Cys Leu Trp Thr Asp Gln Leu Leu Gln Gly Ser Glu Lys 170 175 180 Gly Phe Gln Ser Arg His Leu Ala Cys Leu Pro Arg Glu Pro Gly 185 190 195 Leu Cys Thr Trp Gln Ser Leu Arg Ser Gln Ile Ala 200 205 71 481 DNA Homo sapien 71 ccactgcacg gtagggggtc ctgtaggagg ctggtggcag ggttggattg 50 tgggccctag gcttctgggc gggatgatga cattgagatt ctggcccctg 100 tatccacagg tgatggagac ctgccagatg tccaggagcc cccgagagcg 150 gctgttgctg cttttgctgc tgctactgct tgtgccctgg ggcactggcc 200 ctgcctcagg tgttgccctg cccctcgctg gtgtgttcag cctccgcgcc 250
ccgggtcgtg cctgggcggg cttgggtagc cccctgtctc ggcgcagcct 300 ggcgctagct gacgacgcgg cctttcggga gcgcgcgcgc ctgctggccg 350 ccctggagcg ccgccgctgg ctggactctt acatgcagaa gctgttgcta 400 ctggacgcgc cctgagccta ataaagagcc tgtcgcactg cgactgcgcc 450 tctttgctgc gccactctct tgtgggtgtg t 481 72 100 PRT Homo sapien 72 Met Glu Thr Cys Gln Met Ser Arg Ser Pro Arg Glu Arg Leu Leu 1 5 10 15 Leu Leu Leu Leu Leu Leu Leu Leu Val Pro Trp Gly Thr Gly Pro 20 25 30 Ala Ser Gly Val Ala Leu Pro Leu Ala Gly Val Phe Ser Leu Arg 35 40 45 Ala Pro Gly Arg Ala Trp Ala Gly Leu Gly Ser Pro Leu Ser Arg 50 55 60 Arg Ser Leu Ala Leu Ala Asp Asp Ala Ala Phe Arg Glu Arg Ala 65 70 75 Arg Leu Leu Ala Ala Leu Glu Arg Arg Arg Trp Leu Asp Ser Tyr 80 85 90 Met Gln Lys Leu Leu Leu Leu Asp Ala Pro 95 100 73 2974 DNA Homo sapien 73 ctcagggcag agggaggaag gacagcagac cagacagtca cagcagcctt 50 gacaaaacgt tcctggaact caagctcttc tccacagagg aggacagagc 100 agacagcaga gaccatggag tctccctcgg cccctcccca cagatggtgc 150 atcccctggc agaggctcct gctcacagcc tcacttctaa ccttctggaa 200 cccgcccacc actgccaagc tcactattga atccacgccg ttcaatgtcg 250 cagaggggaa ggaggtgctt ctacttgtcc acaatctgcc ccagcatctt 300 tttggctaca gctggtacaa aggtgaaaga gtggatggca accgtcaaat 350 tataggatat gtaataggaa ctcaacaagc taccccaggg cccgcataca 400 gtggtcgaga gataatatac cccaatgcat ccctgctgat ccagaacatc 450 atccagaatg acacaggatt ctacacccta cacgtcataa agtcagatct 500 tgtgaatgaa gaagcaactg gccagttccg ggtatacccg gagctgccca 550 agccctccat ctccagcaac aactccaaac ccgtggagga caaggatgct 600 gtggccttca cctgtgaacc tgagactcag gacgcaacct acctgtggtg 650 ggtaaacaat cagagcctcc cggtcagtcc caggctgcag ctgtccaatg 700 gcaacaggac cctcactcta ttcaatgtca caagaaatga cacagcaagc 750 tacaaatgtg aaacccagaa cccagtgagt gccaggcgca gtgattcagt 800 catcctgaat gtcctctatg gcccggatgc ccccaccatt tcccctctaa 850 acacatctta cagatcaggg gaaaatctga acctctcctg ccacgcagcc 900 tctaacccac ctgcacagta ctcttggttt gtcaatggga ctttccagca 950 atccacccaa gagctcttta tccccaacat cactgtgaat aatagtggat 1000 cctatacgtg ccaagcccat 
aactcagaca ctggcctcaa taggaccaca 1050 gtcacgacga tcacagtcta tgcagagcca cccaaaccct tcatcaccag 1100 caacaactcc aaccccgtgg aggatgagga tgctgtagcc ttaacctgtg 1150 aacctgagat tcagaacaca acctacctgt ggtgggtaaa taatcagagc 1200 ctcccggtca gtcccaggct gcagctgtcc aatgacaaca ggaccctcac 1250 tctactcagt gtcacaagga atgatgtagg accctatgag tgtggaatcc 1300 agaacgaatt aagtgttgac cacagcgacc cagtcatcct gaatgtcctc 1350 tatggcccag acgaccccac catttccccc tcatacacct attaccgtcc 1400 aggggtgaac ctcagcctct cctgccatgc agcctctaac ccacctgcac 1450 agtattcttg gctgattgat gggaacatcc agcaacacac acaagagctc 1500 tttatctcca acatcactga gaagaacagc ggactctata cctgccaggc 1550 caataactca gccagtggcc acagcaggac tacagtcaag acaatcacag 1600 tctctgcgga gctgcccaag ccctccatct ccagcaacaa ctccaaaccc 1650 gtggaggaca aggatgctgt ggccttcacc tgtgaacctg aggctcagaa 1700 cacaacctac ctgtggtggg taaatggtca gagcctccca gtcagtccca 1750 ggctgcagct gtccaatggc aacaggaccc tcactctatt caatgtcaca 1800 agaaatgacg caagagccta tgtatgtgga atcc