Introduction

Gene expression of a tumor is a complex intermediate phenotype that is determined by many different factors: (1) the host genotype, (2) eventual somatic mutations and genomic alterations, (3) the cell from which the tumor originated, and (4) the specific microenvironment in terms of stroma cells, infiltrating cells and cell signaling molecules. It is reasonable to assume that any difference between two tumors in their patho-physiological state should be represented by some difference in gene expression. This is true despite the fact that while genes and the messenger RNAs derived from them are not the executers of biological function, they can nonetheless be used as markers of it. Even events that are completely independent of gene transcription such as protein modifications, as a part of the cellular program, elicit a transcriptional response that can be detected by expression profiling.

However, tumors are heterogeneous and the different cell populations contained in different areas of the same tumor might well differ for their potential to progress and metastasize. The multistep carcinogenesis model predicts that tumors evolve through the consecutive acquisition of genetic and epigenetic alterations towards a more and more aggressive phenotype inasmuch as they acquire the characteristics needed for progression [proliferation, resistance to apoptosis, scattering, migration and invasion, stimulation of vessel growth, survival in the blood stream (anti-anoikis), adhesion to the vessel wall, extravasation, survival and growth in the target tissue] through random mutation and epigenetic changes followed by selection of the cells that carry a growth advantage in the specific environment [15]. One of the predictions of the multistep carcinogenesis model is that the “metastatic phenotype” might be a characteristic of only a very small subpopulation of the cells within the primary tumor and that the metastasizing cells could acquire additional mutations that are not present in the primary tumor [2, 6]. These features would escape from detection by expression profiling (or any other genomic technique such as comparative genome hybridization on arrays) unless one analyzes each potentially unique cell population within a tumor, an impossible enterprise. Another recent theory predicts that metastasis arises from cancer “stem cells”, better defined as “cancer initiating cells” that might stay latent in the primary tumor and escape to metastasis when local conditions, for instance in hypoxia, induce their scattering. In retrospect, the success of metastasis prediction by expression profiling of the primary tumor was not so obvious and it is still not very clear how this can be reconciled with the multistep carcinogenesis model [710].

The current view of carcinogenesis considers the tumor cells themselves as main actors in carcinogenesis. Mutations of oncogenes and tumor suppressor genes initiate and promote the development of the tumor, hence these mutations determine its fate. From this point of view, the analysis of the actual tumor cells instead of the whole tumor that also contains many host fibroblasts, endothelial cells, and blood cells, would be expected to yield the best information on the molecular status of the tumor. The presence of host cells was expected to dilute the signal derived from the mutated tumor cell population [11].

However, the importance of the tumor stroma and of its interactions with the neoplastic cells are increasingly recognized as important, if not determining, factors in the evolution of a tumor [1217]. This is also supported by evidence that gene expression profiling of the entire tumor yields reliable signatures. We have recently shown that many of the genes contained in the metastasis signature developed by comparing many tumors of different tissues and metastases of unmatched tumors [18] are expressed by stromal fibroblast or endothelial cells. Genes encoding proteins of the extracellular space, which are produced at least in part by stromal cells, can be used to construct a prognostic signature [19]. Stroma cells, long believed to be passive by-standers, can undergo mutations that drive them to contribute to tumor progression: p53 mutations, for instance, have been detected in stromal fibroblasts in 40% of the tumors analyzed, prevalent even in the absence of p53 mutations in the tumor epithelial cells, and these mutations predicted lymphnode metastasis [20]. For the purpose of prognosis, it appears therefore useful or even necessary to screen all the tumor compartments for potential markers and not only the tumor cell itself.

For breast cancer, prognostic signatures are expected to reduce the tendency to apply what has been seen as over-treatment. Following the existing guidelines, adjuvant chemotherapy is offered to most breast cancer patients. Many of these patients have tumors that have a low risk of developing metastases or that do not respond to existing chemotherapies. A more precise prognosis of the risk of relapse and a response prediction are needed.

Technical issues

Microarray technology has developed to considerable maturity. State of the art platforms and protocols guarantee high reproducibility and a low intrinsic variability. It has been shown that most of the variability in microarray experiments derives from the biological sample itself, even when conducted under apparently identical conditions. The second most important source of variability is introduced by the operator whereas the intrinsic oscillation of present microarrays is negligible [21]. Also, in our experience, the correlation between gene expression data generated by microarray and those obtained by more classical approaches such as real time PCR is excellent. Most differences can eventually be attributed to differences in the region of selection of probes and primers (U. Pfeffer, unpublished observation) since alternative splicing appears to be more the rule than the exception. The same applies for cross-platform variation of microarray data [22]. The latest addition to the technological battery are arrays that contain several probes for each single exon of each single gene of the human genome and are expected to yield additional information on alternative splicing and variant expression as well as more robust data [23]. So far, no large breast cancer studies using these arrays have published.

The sensitivity of microarrays has grown continuously and the amount of tissue needed to perform the analysis has been greatly reduced. Latest generation microarray analyses start with 100 ng total RNA that can be isolated from very small amounts of tissue. Analysis of material obtained from fine needle biopsies is therefore possible, potentially allowing for pre-operative diagnostic approaches [2426]. However, the smaller the sample, the higher the risk for sampling non-representative areas of a typically heterogeneous tumor. The ratio between tumor epithelial cells and stroma cells sampled also varies in fine needle aspirates. For this reason, whenever possible, larger pieces that have accurately been selected by the pathologist should be used. Sample conservation remains critical for the success of expression profiling applications. Yet the laborious and logistically demanding procedure of liquid nitrogen freezing of intra-operative samples can be substituted by conservation in RNA inhibiting pre-fixation solution such as RNAlater [27] or HOPE [28] that also facilitate long-term storage thereby reducing the costs and eventually increasing stability of biological molecules. Data on the involvement of microRNAs in breast cancer are growing and therefore sample conservation and RNA extraction should also consider the need to isolate small RNAs, since not all miRNA screening procedures accept total RNA as a starting material.

A major concern has been the virtually absent overlap of signatures produced using different platforms. Unfortunately, not a single study to date has hybridized a large number of samples to two different platforms. It is therefore difficult to conclude to which degree the different platforms actually reveal the same gene expression events. The lack of overlap can probably be attributed mainly to different patient populations analyzed, which greatly determines the genes selected for the signatures, as discussed below.

Breast cancer subtypes

Microarray gene expression profiling reliably yields the complex intermediate phenotype of gene expression of a sample. The gene expression phenotype of a tumor reflects its past history in terms of cell origin, its present in terms of exposure to oxygen and nutrients, growth factors and hormones, inflammation and tumor infiltrate which in turn are believed to determine its future potential to develop metastases. The first indication of the power of gene expression profiling with microarray came from the analysis of tumors sampled before and after chemotherapy and, in two cases, of matched lymphnode metastases [29]. This work showed the existence of molecularly defined subgroups. Gene expression patterns in two tumors from the same individual were almost always more similar to each other than to any other sample. Additional studies have confirmed that primary tumors and their metastases share similar expression profiles [30, 31] although several genes that are differently expressed have been identified [32, 33].

Sorlie et al. [34] showed that breast cancer cases fall into two large molecular classes that correspond almost perfectly to the status of estrogen receptor expression as analyzed by ligand binding assays or immunohistochemistry. A bioinformatic analysis of the two classes revealed that they correspond to profoundly different tumors. Even after the stepwise removal of several hundreds of the genes that best discriminate between the two classes, almost all cases were still correctly assigned to the estrogen receptor α positive or negative classes of tumors [35]. The difference between these two classes is consistent with the hypothesis that the two different tumor types derive from different progenitor cells. For this reason, the two classes are also defined as luminal and basal type. However, gene expression profiling reveals additional subtypes that are reproducibly formed after hierarchical clustering of breast cancer datasets containing more than hundred cases (hence containing a sufficient number of cases for each subtype class) [34, 36]. The estrogen receptor α positive type is subdivided into luminal A, luminal B and eventually C clusters and the estrogen receptor α negative type contains the basal-like, her2-like, and a “normal like” cluster. A similar classification has independently been obtained by Sotiriou et al. [36]. A more recent elaboration of breast cancer microarray expression profiles has confirmed the subtypes identified before [37]. The main determinant of subtypes remain HER2 and estrogen receptor status [38, 39]. The analysis by Hu et al. [37] tentatively names one of the estrogen receptor α negative subtypes “IFN-like” since STAT1 and several interferon induced genes are among the master discriminators of this cluster. Analyses of this type point to the identification of genes or pathways whose differential expression characterizes single clusters. This is particularly evident for the HER2 subtype. Tumors that cluster in this subtype show over-expression of HER2/neu/ERBB2, the EGF receptor family member whose amplifications determine responsiveness to the humanized antibody trastuzumab, which is directed against the HER2 encoded protein [40, 41]. Similarly, more defined subclusters have been identified as being formed by tumors harboring BRCA1 and BRCA2 mutations, and a closely related subcluster shows alterations downstream of the two hereditary breast cancer genes without having the genes themselves mutated [42, 43]. It is most likely that, step by step, information on the molecular nature of the tumors clustering together will be obtained for most of the subclusters. Intriguingly, not all tumors belonging to a subcluster must necessarily express the master discriminator gene(s) nor is expression of these genes sufficient for assigning a single case to a specific subcluster. In fact, HER2 expressing tumors are found among nearly all the subtypes [34]. This indicates that tumors of any subtype can over-express HER2 without loosing their identity, but in this case, HER2 over-expression most probably has only a minor role in the etiology of the specific tumor and hence a lower potential as a response marker.

Molecular subtypes have a clinical relevance inasmuch as they have different propensities to metastasize. The genes that characterize the different subtypes form the intrinsic subtype signature that can prognosticate clinical behavior [34, 36, 37, 44]. Its development as a real time PCR based test has been reported [45].

The major classes of estrogen receptor α positive and negative breast cancers most probably are derived from different tumor progenitor cells. Weinberg et al. [46] have recently shown that the cell of origin is a major determinant of the propensity to metastasize. It cannot be excluded, at present, that the molecular subtypes identified correspond to different progenitor cells downstream to ERα+ (luminal) or ERα− (basal) progenitor cells. More likely, subtypes correspond to tumors sharing the same driving mutation or at least the same affected pathway, as it is evident for the HER2 and BRCA1 and 2 cases. Further studies on molecular characteristics will eventually identify pathway mutations and lead to a more profound understanding of etiological events in breast cancer carcinogenesis.

Signatures

The identification of molecular subtypes with clinical relevance indicated that it should be possible to develop prognostic signatures with an elevated accuracy in classifying tumors into risk groups that deserve differential treatment or eventually no treatment at all. This endeavor has been undertaken by several groups. The general approach is a retrospective analysis of deep frozen breast cancer specimens with known follow-up, selection of discriminator genes by a variety of bioinformatics and statistical methods, and the construction of a predictive multigene classifier on the base of the follow-up information available. The statistical aspects of this approach have been the argument of a survey of these kinds of studies where many flaws in the experimental design were revealed [47]. Essentially, two datasets are needed: one to develop the classifier (training set) and another to test the classifier (test or validation set). These two sets can be obtained by a random split if the original dataset is large enough. In this case, a validation on a third, completely independent set is desirable. In alternative, signatures can be developed by the cross-validation method where single cases are left out from the training set in an iterative manner. The cases left out are then classified in order to validate the method.

A major problem with existing sample collections is that most of the women with breast cancer have received adjuvant chemotherapy. Therefore, a tumor that later developed a metastasis is of certain metastatic potential, whereas a tumor that did not metastazie could be a tumor without metastatic potential, a tumor that has not yet yielded a metastasis but will eventually do so, or a potentially metastatic tumor that has responded to adjuvant chemotherapy. A further complication is that breast cancers can give metastases even numerous years after removal of the primary tumor. This is best taken into account by working with cases with a very long follow-up. The metastasis-free cases that represent response to chemotherapy are most probably limited to a relatively small group of cases. However, these cases will participate in determining the signature. In the theoretical case of a cohort entirely composed of tumors with metastatic potential but with half of these being responsive to chemotherapy, these latter would be classified as “low risk”. This aspect could be dealt with by using cases that have not received adjuvant chemotherapy. However, with current standards of care such cases are rare and are becoming progressively rarer.

A variety of signatures have been developed and have been recently exhaustively reviewed [4855]. Two of these signatures have already been developed for commercialization as centralized laboratory tests and are presently being tested in large prospective trials. The Oncotype DX® recurrence score has been developed as real time PCR assay that can be performed on formalin fixed paraffin embedded (FFPE) tissues [56]. The selection of genes that form the classifier was based on three preliminary studies using real time PCR on FFPE samples. The most robust and informative genes were incorporated into a 16 gene classifier and analyzed in comparison to five housekeeping genes. The test is directed towards estrogen receptor α positive lymphnode negative cancers and yields a continuous prognostic score and predicts benefit from tamoxifen (in the low and intermediate risk group) and from adjuvant chemotherapy (in the high risk group). The test has recently been included in the recommendations of the American Society of Clinical Oncology. A prospective trial (Trial Assigning Individualized Options for Treatment—TAILORx) with a planned accrual of at least 10,000 patients intends to answer the question of whether intermediate risk patients benefit from chemotherapy [57]. The genes tested mainly monitor ER and HER2 status and proliferation.

Laura van’t Veer et al. [58, 59] from the Netherlands Cancer Institute in Amsterdam developed a 70-gene classifier (commercialized as MammaPrint®) based on two-color microarray technology (initially using Rosetta Inpharmatics inkjet microarray technology and subsequently Agilent’s microarray platform). The signature was built using the cross validation leave-one-out procedure on 78 informative lymph node negative cases and was subsequently validated on a partially overlapping cohort of 295 lymph node negative and positive cases. The classifier assigns cases to poor and good prognosis groups. The comparison of the 70 gene classifier with the NIH and St. Gallen criteria shows that the genomic approach clearly outperforms both. One study claimed, however, that this classifier works comparably to the Nottingham Prognostic Index (NPI) [60]. An additional validation study revealed a more than 90% chance of being free of disease after 5 years for patients classified in the low risk group. This signature is being prospectively validated in the Microarray in Node-Negative Disease May Avoid Chemotherapy (MINDACT) trial where the enrollment of 6,000 patients is planned [61]. Patients discordantly scored by the classical Adjuvant! Online (www.adjuvantonline.com) and the genomic MammaPrint® assay will be randomized to receive or not receive chemotherapy. This trial will yield microarray data from a very large cohort that will also contain many cases that, according to the genomic classifier, did not receive chemotherapy. The use of these data will certainly allow major improvements of existing molecular signatures.

Other genomic prognostic classifiers such as the 76-gene classifier (Rotterdam signature) [62], the wound healing signature [63] and the invasive gene signature [64] have been developed. In a more straightforward approach the group of Sotiriou and colleagues has given the long sought answer to the question whether grade 2 cancers are an independent entity or whether they incorporate features of grade 1 and 3 cancers. A 97-gene (128 probe sets) discriminator was built on Affymetrix HU133A arrays using cancers of pathological grade 1 and 3. This discriminator was then applied to grade 2 cancers that showed either a grade 1 or 3 profile rather than an intermediate expression phenotype [65]. These signatures are also being developed towards commercialization [66].

The overlap of genes contained in these signatures is very low. However, the main determinants of all the signatures are proliferation, ER-status, HER2-status and, less prominently, angiogenesis, invasiveness and apoptosis. Most probably these signatures detect the same biological processes and pathways involved in metastasis. Depending on the actual patient cohort, the assay platform and the analytic algorithms used, different genes are selected for the signature. This is also supported by a bioinformatics analysis that shows the possibility to select other equally informative 70 gene classifier sets from the data set used by Van’t Veer and Bernards [67]. This most probably indicates that many genes, probably several hundred or even more than one thousand, are actually related to disease free survival, and that the genes selected for any one classifier heavily depend on the patient cohort. In an additional study, the same authors calculated that for the development of a stable, cohort-independent classifier, several thousands of patients are needed [68].

A validation of several signatures (intrinsic subtype, 70-gene classifier, wound healing signature, recurrence score and two gene classifier) revealed that with the exception of the latter, all signatures have a similar discriminating power showing 77 to 80% agreement in outcome scoring. Importantly, the combination of the signatures did not perform better, again indicating that the different signatures actually identify the same molecular characteristics by using different marker genes [44].

This, however, also means that a considerable part of cases is misclassified by any and all of the signatures. This raises the question of whether the existing signatures can be improved to yield a more reliable prediction of outcome. This is of particular importance if one keeps in mind that the classifier is to be used to withhold patients from chemotherapy. In this context, misclassification means not to treat a woman who might have benefited from chemotherapy. This is also important since this aspect determines the major skepticism among the oncologists who must decide whether or not to base the treatment decision they offer to the patients on molecular signatures.

Limits of molecular signatures

In order to explore the possibility of compiling the ultimate signature we have performed a simple simulation based on a publicly available breast cancer dataset (Gene Expression Omnibus, GEO1456) of 159 consecutive breast cancer cases from Stockholm with follow-up information of at least 8 years [69]. Forty of the 159 cases developed distant metastases. We have clustered these 159 cases using several gene lists that were obtained by selecting genes contained in specific Gene Ontology categories [70] that were associated with the parameter “relapse” with a P value below 0.001. The Gene Ontology terms “cell cell signaling”, “cell death”, “cell growth”, “cell proliferation”, “kinase”, “metabolism”, “signal transduction”, “transcription regulator activity” were selected since they stand for different functional aspects that are involved in the process of metastasis. We then used the genes selected from each category separately for hierarchical clustering of the samples. This procedure necessarily produces highly discriminating gene lists since the classifier is developed on the same dataset to which it is applied. For the development of a molecular signature, these classifiers must be tested on an independent test or validation set. Here we ask instead, how well the classifiers developed can distinguish between cases with and without relapse in the optimal situation. In all cases, two main clusters are formed, one is enriched in cases with metastases and the other contains only very few of the metastatic cancer samples (Fig. 1). The bar under each clustergram indicates the follow-up information for each case (pink = relapse, brown = no relapse). As expected, the classifiers thus work on the dataset on which they have been developed and all eight gene lists yield a similar discrimination. Yet for each classifier, several cases with relapse cluster together with most of the cases without relapse. This could be simply due to a limitation of the single classifier with each misclassifying a different subset of cases. This would be the case if, for instance, a group of metastatic cancers are similar to the non-metastatic samples as far as proliferation is concerned but not when metabolism is considered. However, when we looked at the samples actually misclassified by the different classifier gene lists we noticed that several cases were misclassified by many or even by all of the eight classifiers (Table 1). Three of forty cases with metastases clustered together with the more benign samples for all of the eight lists and five cases were misclassified by seven out of eight lists. We also tested combinations of the lists using either all of the genes contained in the eight lists (combined lists), only those genes of the eight lists that are associated with relapse with a P < 10−6 or using the five most strongly associated genes of each list. Again, a similar picture is obtained (Fig. 2): two clusters with a clearly different content of metastatic cases are obtained and the “good prognosis” cluster contains six misclassified cases, among which the five samples that are most frequently misclassified by the single lists. Misclassification could have been expected more easily for cases that showed metastases at very long times after diagnosis, but this is not the case (Table 1).

Fig. 1
figure 1

Raw data from 159 consecutive breast cancer cases were obtained from Gene Expression Omnibus (GSE1954). Data were preprocessed by the GCRMA algorithm implemented in Bioconductor. Genes that were significantly associated with the parameter “relapse” (P < 0.001) were selected among the genes listed in the Gene Ontology categories indicated. Hierarchical cluster analysis (Pearson correlation, average linkage) was performed using the genes selected. Cases with relapse are indicated in pink, cases without relapse in brown in the bar beneath each clustergram. The eight annotation categories yield a distinction in two major clusters. Most of the cases with metastases cluster together whereas the other cluster contains only few metastatic cases (the yellow bar indicates the separation between the two major clusters)

Table 1 Misclassification of breast cancer cases by Gene Ontology based functional clustering
Fig. 2
figure 2

Hierarchical clustering of 159 breast cancer samples using a gene list containing genes selected from the combination of all eight gene lists used in Fig. 1 containing 1,085 genes. Genes that are associated with the parameter “relapse” with P value below 10−6 were selected. The combination of all gene lists equally misclassifies a number of cases, among which those misclassified by all the single gene lists

We then compiled a list of genes that are significantly differently expressed between correctly classified cases with metastases and the five most frequently misclassified cases using a permutation test (Significance Analysis of Microarrays [71]). The genes selected are expressed at similar levels in misclassified cases and in cases without metastases. When we tried to identify genes that are expressed at significantly different levels in misclassified cases with metastases and cases without metastasis no gene passed the threshold. The misclassified cases are evidently very similar to tumors that do not metastasize, hence molecular classifiers fail to classify them correctly.

The problem of misclassification

Intrinsic misclassification could come about through various reasons: (1) annotation errors (erroneously annotated metastases), (2) metastases derived from different clinically non overt primary breast cancers (second primaries), (3) acquisition of the metastatic phenotype by only a very small subpopulation of cells in the primary tumor which would not leave a trace on the expression profile, (4) metastasis of non-metastatic cancers. Although annotation errors can never be totally excluded we believe that they would not occur in as many as 15% of the cases. Similarly, while non overt second primaries cannot be completely excluded they are expected to account for less than the proportion detected. Acquisition of the metastatic phenotype by only a small subpopulation of the primary tumor cells, or even after the cells have left the primary tumor, is exactly what the multistep carcinogenesis model predicts. The fact that gene expression profiling of the bulk of the primary tumor allows to prognosticate metastasis is a violation of the multistep carcinogenesis theory as Bernards and Weinberg [8] pointed out. These authors proposed that the fate of the tumor is determined right from the initiating event, the transforming mutation, and it might also depend on the cell of origin [46]. These two features, cellular origin and driving mutations, are what microarray profiling is best at detecting. Hence most if not all of the prognostic power of the approach might reside in the correct and exhaustive identification of these two crucial aspects of tumor biology. Yet this does not exclude that, at least in some case, the metastatic phenotype is acquired by an additional mutation by a single cell in the primary tumor or by a cell that, although devoid of metastatic growth characteristics, acquires this potential through mutations. A potential scenario could involve an invasive, anoikis resistant and extravasation competent cell that is able to survive but not to proliferate in a target tissue until a mutation in a receptor gene confers growth factor independent growth. Such a mutation would be rare, since it must occur in the limited number of cells that have formed micrometastases, thus as a consequence the primary tumor would be classified as a “non metastatic” one.

The multistep carcinogenesis model therefore meets a model of probabilistic metastasis that predicts that any invasive cancer can yield metastases but it does so with varying probabilities (being in no case equal to zero). This probability relies on the cell of origin and the driving mutation as well as on the genetic background of the host and the tumor microenvironment, including the host’s immune response, inflammation and tumor infiltrate [19]. The acquisition of additional mutations that confer an increased metastatic propensity accounts for the residual metastasis risk that is not derived from the above features and cannot not be predicted. Microarray expression profiling measures, to different extents, all of the above risk modulating factors, with cellular origin and driving mutation probably being the master discriminators that correspond to the main clusters formed by unsupervised hierarchical clustering. Differences in the tumor microenvironment, inflammation, host factors and the like, if they have an influence on the tumor, will leave a trace in the expression profile. The mutation in a subpopulation or, worse, in an already disseminated cell, escapes expression profiling and therefore yields a subfraction of tumors that are “misclassified”. This suggests that a relatively acute multistep carcinogenesis model occurs in a set percentage of cases, in the cohort used here approximately 10%, that are likely responsible for metastasis from a primary tumor with a “non-metastatic” phenotype. This would also suggest that the same misclassification would be made by more traditional grading approaches, as the cells with true metastatic potential are quite rare within the primary tumor, thus the histopathological grade would also missclassifly these same tumors.

Improvement of molecular signatures

This interpretation postulates a theoretical limit to any prognostic procedure that relies on the primary tumor. With a relatively high degree of accurate relapse prognostication, present signatures might already be close to this theoretical limit. But how can we drive them further to actually reach the limit?

In our opinion, three approaches will permit to improve the existing signatures:

  • Large sets of samples on which to build and test signatures

  • Subtype specific signatures

  • Re-introduction of functional information into the signatures

The need for large datasets has been highlighted by the analyses of Ein-Dor and colleagues: in very large datasets the effect of the actual composition of the sample types will be diluted and any possible tumor type will actually be present. Hence, the signatures are expected to become more robust and more generally applicable [67, 68].

Subtype specific signatures are expected to have an improved prognostic power since tumors of different subtypes are profoundly different and therefore most likely follow different routes to metastasis. Such signatures become possible if very large datasets are analyzed.

The re-introduction of functional information is expected to yield more robust signatures inasmuch as the genes contained in the signature are mechanistically linked to the process of metastasis. The matrix metalloproteinase 1 (MMP1), for example, has been shown to be functionally linked to metastasis and is contained in the general metastasis signature developed by Ramaswamy et al. [18] and in the metastasis signature obtained from highly metastatic MDA-MB231 mammary breast cancer cells [72]. MMP1 expression shows a very clear expression difference in breast cancer samples [19]. If over-expression of this proteinase could be shown to be necessary for metastasis in a specific subtype of breast cancers, it could become a particularly robust, causally involved marker. The wound healing signature [63] and the invasive gene signature [64] have exploited this functionally oriented approach and were able to show that it is possible to derive valuable prognostic signatures from it. Figure 1 also illustrates that the simple use of functional information already contained in Gene Ontology allows for compiling molecular classifiers. Similarly, Achyrya and colleagues used pathway information to refine the estimation of relapse-free survival and sensitivity to chemotherapy [73]. In addition to an improved prognostication, information on response to chemotherapy can be obtained.

Most probably, a complex functionally defined signature would consist in a meta-signature that has been built by taking into account several functional aspects of metastasis. The use of annotations like Gene Ontology is, however, not sufficient for this approach because the functional information for genes is still relatively limited and mainly depends on the context in which the genes have been analyzed. Experimental evidence of gene function should therefore be collected in a context specific manner before building signatures.

An additional improvement of signatures could be expected from the application of latest generation microarray platforms with more exhaustive (exon rather than transcript coverage) and more robust sets of gene specific probes [23].

MicroRNAs

Concurrently, new actors in the cellular regulation plot are being investigated. Small non coding RNAs, such as 21–23-mer microRNAs that regulate mRNA stability and translatability show important functions in tumor initiation and progression [74, 75]. Apparently, their expression can be taken as a surrogate of breast cancer subtypes [76] although differences in miRNA expression within the single subtypes cannot be excluded. Integration of miRNA with mRNA expression profiles could eventually further improve molecular signatures. New technologies reveal many more transcripts than previously estimated [77], hence new entries of transcripts or even transcript classes with functional and/or prognostic significance may soon be detected.

Clinical application

Oncologists are debating over the need or the possibility to integrate information obtained from genomic signatures into the process of treatment decisions. This is already useful in situations where a reasonable doubt on the best treatment persists after classical prognostication, as it could be the case for small, ER-positive, HER2-negative, lymphnode negative, low proliferation index (KI67) cases of grade 2 cancers. Since the application of molecular signatures aims at reducing over-treatment, thereby invariably increasing the risk of under-treatment (in the case of misclassification), much will depend on the individual orientation of the patient. Molecular profiling can be offered as an additional source of information if the patient would prefer to avoid chemotherapy, if possible.

Molecular profiling, as a relatively recent technology, still has elevated costs, partly determined by the fact that existing protocols are offered as centralized services. It can be foreseen that with the growing introduction of genomic approaches into the clinical routine, de-centralized devices will become available and lead to cost reductions. Genomics also has the potential to substitute several classical analyses since the correlation between gene expression data and classical analytical methods for prognostic breast cancer markers (estrogen and progesterone receptor and KI67 expression) is excellent (U. Pfeffer, unpublished observations). Hence, it is conceivable that future pathologists will obtain most of their information on a breast cancer sample from a single complex highly standardized genomic assay.

It is important that the screening of breast cancer samples using whole genome platforms continues in order to design better prognostic signatures. This should not be compromised by the application of existing classifiers. Data obtained from validation studies of molecular classifiers (if publicly funded and published on peer reviewed journals) must be made available to the scientific community.