Title: Genome-scale Identification of Secreted and Membrane-associated Tumor Markers
Abstract: Purpose/Objective: The subcellular localization of proteins is critical to their biological roles. Moreover, whether a protein is membrane-bound, secreted, or intracellular greatly affects both its usefulness as a diagnostic marker or therapeutic target and the strategies for using it. DNA microarrays allow the rapid assaying of gene expression in tumor and normal samples. We sought to use the same microarrays used for tumor expression profiling to identify novel membrane-associated or secreted markers that could be candidates for serum diagnostic tests or antibody-based therapies. Materials/Methods: We employed a rapid and efficient experimental approach to classify thousands of human gene products as either "membrane-associated/secreted" (MS) or "cytosolic/nuclear" (CN). Using subcellular fractionation methods that take advantage of the compartmentalization of mRNA translation, we separated cancer cell line mRNAs associated with membranes from those associated with the soluble cytosolic fraction and analyzed these two pools by competitive hybridization to DNA microarrays. By analyzing how mRNAs encoding proteins of known subcellular localization were distributed between the two fractions, we were able to predict the subcellular localization of the products of mRNAs representing uncharacterized genes. The eleven cell lines analyzed included representatives from lymphoid, myeloid, breast, ovarian, hepatic, colon, and prostate tissues in order to identify a broad range of tissue- and tumor-specific markers. We also compared our subcellular localization predictions with sequence-based in silico predictions. Finally, we applied algorithms to identify tumor-specific markers that were not highly expressed in normal human tissues and could therefore serve as ideal therapeutic targets and diagnostic markers. Results: Our analyses identified more than 5,000 previously uncharacterized MS and more than 6,400 putative CN UniGene clusters at high confidence levels.
The experimentally determined localizations correlated well with in silico predictions of signal peptides and transmembrane domains, but also significantly extended the number of previously uncharacterized human genes that could be cataloged as MS or CN gene products. Notably, ∼40 percent of all known MS proteins and of our empirically identified MS proteins do not contain predicted signal peptides or transmembrane domains, underscoring the importance of experimental approaches to subcellular localization prediction. Using gene expression data from more than 700 primary human malignancies and normal tissues, we found that the membrane compartments of tumors are similar to, yet distinct from, those of their normal counterparts. This allowed us to identify dozens of candidate membrane-associated markers that are highly specific for tumors compared to all normal tissues analyzed. In general, these markers reflect the tissue of origin of a given malignancy but are significantly over-expressed in tumors compared to their normal counterparts. Conclusions: By combining subcellular fractionation methods with DNA microarray analysis, we identified thousands of novel membrane-associated and secreted markers. The localization of many of these proteins could not have been predicted using current in silico methods. Our large-scale annotations should aid in the rational design of diagnostic tests and molecular therapies for a variety of malignancies and are particularly relevant to identifying future targets for immuno- and radioimmunotherapies.
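
Illustrative sketch: The classification idea described under Materials/Methods (using genes of known localization to calibrate a membrane-versus-cytosol enrichment cutoff, then applying it to uncharacterized genes) might be sketched as follows. This is a minimal illustration and not the analysis used in this study: the abstract does not state the statistical procedure, so the balanced-accuracy cutoff, the `margin` parameter, the function names, and the signal values are all hypothetical.

```python
from math import log2


def log_ratio(membrane_signal, cytosol_signal):
    """log2 enrichment of an mRNA in the membrane fraction over the cytosolic fraction."""
    return log2(membrane_signal / cytosol_signal)


def pick_threshold(known):
    """Choose a log-ratio cutoff from genes of known localization.

    `known` is a list of (log_ratio, label) pairs, label being "MS" or "CN".
    Assumption: the cutoff is the midpoint between adjacent observed values
    that maximizes balanced accuracy on the known genes.
    """
    n_ms = sum(1 for _, label in known if label == "MS")
    n_cn = len(known) - n_ms
    if n_ms == 0 or n_cn == 0:
        raise ValueError("need at least one known MS gene and one known CN gene")

    values = sorted({ratio for ratio, _ in known})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    best_cut, best_score = None, -1.0
    for cut in candidates:
        true_ms = sum(1 for r, label in known if label == "MS" and r > cut)
        true_cn = sum(1 for r, label in known if label == "CN" and r <= cut)
        score = 0.5 * (true_ms / n_ms + true_cn / n_cn)
        if score > best_score:
            best_cut, best_score = cut, score
    return best_cut


def classify(ratio, cut, margin=1.0):
    """Call a gene MS or CN only when it lies at least `margin` log2 units from the cutoff."""
    if ratio >= cut + margin:
        return "MS"
    if ratio <= cut - margin:
        return "CN"
    return "ambiguous"


# Hypothetical training set: log ratios for genes of known localization.
known_genes = [
    (2.1, "MS"), (1.8, "MS"), (2.5, "MS"),
    (-1.9, "CN"), (-2.2, "CN"), (-1.5, "CN"),
]
cutoff = pick_threshold(known_genes)

# Hypothetical uncharacterized genes: (membrane signal, cytosol signal).
for name, (mem, cyt) in {"geneA": (800, 100), "geneB": (100, 800), "geneC": (150, 100)}.items():
    print(name, classify(log_ratio(mem, cyt), cutoff))
```

The `margin` band reflects the "high confidence" framing of the results: genes whose enrichment falls close to the cutoff are left unassigned rather than forced into either class.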