Main

In recent years, substantial progress has been made in the detection and diagnosis of early stage cancers. This is mostly due to improved imaging technologies and new biomarkers in histological and hematologic testing. However, there still is a dearth of molecular indicators that distinguish highly aggressive tumors from moderately aggressive and non-aggressive ones. Specifically, few markers that predict invasiveness have been firmly established. Better molecular prognostics are needed to accurately assess disease. One candidate marker for the progression of various malignant tumors has been Osteopontin. In cancer, this molecule can support cell invasion and anchorage independence, thus enhancing tumor progression and metastasis formation (Weber, 2008). Despite a large literature on Osteopontin as a cancer marker, it is not in routine diagnostic use. One reason may be the diversity of source materials and cancer-associated readouts that have been investigated in correlation to Osteopontin levels. Therefore, it is important to analyze the comprehensive published evidence to discern which aspects of cancer pathophysiology are consistently associated with elevated Osteopontin levels, thus validating this molecule as a candidate marker.

The scientific literature on biomarkers has grown far more rapidly than the application of promising markers in clinical practice. Among the reasons for the delay are high barriers in the regulatory process and limited available resources for the recruitment and analysis of sufficiently large patient populations. Meta-analysis is a suitable approach to enhancing knowledge about the diagnostic potential of individual biomarkers within these confines. Yet, conventional regression algorithms have had limited capability to combine distinct data sets and have therefore often fallen short of improving confidence. This is a particular problem for immunohistochemistry, where variable staining protocols combined with the semi-quantitative nature of the examination generate substantial study-to-study fluctuations. Categorical data analysis can limit such heterogeneity. The evaluation of within-study ranks results in a self-normalization of variable data sets. When applied to the meta-analysis of biomarkers, categorical data analysis has a dramatically higher sensitivity than conventional regression algorithms for detecting trends in data sets from disparate sources.

Methods

Data extraction

A PubMed search with the keywords ‘osteopontin AND cancer’ through December 2008 resulted in 800 hits. Titles and abstracts were screened for studies involving human subjects, yielding 271 papers for initial analysis. Thirty-six articles (including reviews, commentaries, experiments only on cell lines, no results on cancer, etc.) did not contain new data on Osteopontin in human cancer. Four articles could not be obtained, even after request through interlibrary loan. Three further papers were excluded: one had been retracted, one pooled diverse primary tumors without separating them by tumor type, and one applied scientifically questionable methodology (bidigital O-ring test). This left 228 publications to be used for data extraction (Table 1). For articles not written in English, only the abstracts (not the full texts) were drawn on for obtaining data. For data extraction, numbers from the article text were used directly; data presented as graphs were measured and converted to the relevant units. Data from Kaplan-Meier survival curves were digitized using the software DataThief.

Table 1 Source references for data extraction

The cancers covered by the original publications include: breast cancer (34), ovarian cancer (25), liver cancer (21), lung cancer (20), head and neck cancer (15), colorectal cancer (14), gastric cancer (14), prostate cancer (13), bone cancer (9), oral cancer (9), melanoma (9), pancreatic cancer (8), renal cancer (8), esophageal cancer (7), glioma (7), mesothelioma (7), thyroid cancer (7), endometrial cancer (6), myeloma (6), cervical cancer (4), gestational trophoblastic tumor (4), leukemia/lymphoma (3), granular cell tumor (2), non-melanoma skin cancer (2), ampullary cancer (2), bladder cancer (2), medulloblastoma (2), soft tissue tumors (2), teratoid tumor (2), adrenocortical cancer (1), neuroblastoma (1), pilomatricoma (1), renal pelvis cancer (1), von Hippel-Lindau disease (1). The numbers in parentheses indicate the number of publications for each type of cancer. Note that several papers contain data on more than one type of cancer and are counted here for each. Therefore the sum is larger than the 228 original publications used for the data extraction.

Data analysis

A significance threshold of P<0.05 (95% confidence level) was applied to all studies. The correlation between Osteopontin expression levels and the clinical variables of interest was examined with a categorical approach (using ranked values). Within a study, the clinical variables were ranked from low to high and then normalized by the number of examples in the study. Studies that combined a range of grades were assigned the mean grade. Also within a study, the Osteopontin scores were ranked from low to high. In the case of immunohistochemistry scores that reported graded results on a 0–3+ scale, a composite score for the study was computed by weighting each score by the fraction of patients reported for that score. For studies using an expanded scoring system, the scores were grouped at low, medium, and high levels and treated in the same way as the 0–3+ results. For studies that only reported mean or median results, the raw values were simply ranked. Ranking accomplishes a self-normalization within each study (Hong et al, 2006; Hong and Breitling, 2008) and permits the simultaneous analysis of both the summary results (mean, median only) and various graded results. In the case of immunohistochemistry, this reduces the effects of different pathologists scoring the samples. In other assay types, such as ELISA or quantitative RT-PCR, this eliminates the need for a normal standard under the assumption that all samples within a study are compared against the same standard.
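The ranking steps described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the implementation used for this study: the function names, the patient counts in the example, and the handling of ties (tied values share their mean rank) are all hypothetical choices made for the sketch.

```python
from typing import Dict, List


def composite_ihc_score(counts_by_score: Dict[int, int]) -> float:
    """Weight each staining score (e.g. 0 to 3) by the fraction of
    patients reported with that score, giving one value per patient group."""
    total = sum(counts_by_score.values())
    if total == 0:
        raise ValueError("no patients reported")
    return sum(score * n / total for score, n in counts_by_score.items())


def normalized_ranks(values: List[float]) -> List[float]:
    """Rank values from low to high within one study, then divide by the
    number of groups. Assumption: tied values share their mean rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based mean rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return [r / n for r in ranks]


# Hypothetical study: patients per staining score in three tumor grade groups.
study = {
    "grade 1": {0: 10, 1: 5, 2: 3, 3: 2},
    "grade 2": {0: 4, 1: 6, 2: 6, 3: 4},
    "grade 3": {0: 2, 1: 3, 2: 5, 3: 10},
}
scores = [composite_ihc_score(counts) for counts in study.values()]
osteopontin_ranks = normalized_ranks(scores)
```

Because only the within-study order is retained, a study reporting means from an ELISA and a study reporting graded staining results end up on the same rank scale, which is the self-normalization the text refers to.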

We utilized the Pearson χ² test (Agresti, 2007) for independence to assess whether the Osteopontin ranks are independent of the clinical variable ranks. This test was carried out by constructing contingency tables using the ranks for each variable, and populating each cell with the total number of patients reporting that combination of ranks. Separate tables were constructed for sets of studies with 2, 3, or more ranks to avoid structural zeros. The Mantel-Haenszel χ² test (Agresti, 2007) was used to assess the hypothesis that the ranking of a particular clinical variable within a study is linearly related to the Osteopontin level. We then tested for a non-linear trend by examining the residuals between the observed values and a linear model of the data.
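Both tests can be illustrated on a small contingency table of patient counts. The sketch below is an assumption-laden illustration rather than the code used here: the function names and the example table are hypothetical, equally spaced integer scores are assumed for the ranks, and the linear trend statistic is computed in the textbook form M² = (n − 1)r² with one degree of freedom, where r is the correlation between row and column scores.

```python
import math
from typing import List, Optional, Sequence, Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper tail probability of a chi-squared variable with integer df,
    built from the recurrence Q(a+1, y) = Q(a, y) + y^a e^-y / Gamma(a+1)."""
    y = x / 2.0
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(y)) if df % 2 else math.exp(-y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pearson_chi2(table: Sequence[Sequence[float]]) -> Tuple[float, int, float]:
    """Pearson test of independence: returns (statistic, dof, P value)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, dof, chi2_sf(stat, dof)


def linear_trend_chi2(
    table: Sequence[Sequence[float]],
    row_scores: Optional[List[float]] = None,
    col_scores: Optional[List[float]] = None,
) -> Tuple[float, float, float]:
    """Linear trend test: returns (M^2, r, P value) with M^2 = (n - 1) r^2.
    Assumption: ranks 1, 2, 3, ... are used as scores unless given."""
    u = row_scores if row_scores is not None else list(range(1, len(table) + 1))
    v = col_scores if col_scores is not None else list(range(1, len(table[0]) + 1))
    cells = [(u[i], v[j], c) for i, row in enumerate(table) for j, c in enumerate(row)]
    n = sum(c for _, _, c in cells)
    mean_u = sum(a * c for a, _, c in cells) / n
    mean_v = sum(b * c for _, b, c in cells) / n
    cov = sum((a - mean_u) * (b - mean_v) * c for a, b, c in cells)
    var_u = sum((a - mean_u) ** 2 * c for a, _, c in cells)
    var_v = sum((b - mean_v) ** 2 * c for _, b, c in cells)
    if var_u == 0 or var_v == 0:
        raise ValueError("scores must vary in both rows and columns")
    r = cov / math.sqrt(var_u * var_v)
    stat = (n - 1) * r * r
    return stat, r, chi2_sf(stat, 1)


# Hypothetical table: rows = Osteopontin rank (low, high),
# columns = clinical variable rank (low, high), cells = patient counts.
table = [[30, 10], [10, 30]]
independence = pearson_chi2(table)   # statistic is 20.0 with 1 degree of freedom
trend = linear_trend_chi2(table)     # r is 0.5, so M^2 = 79 * 0.25 = 19.75
```

The trend test spends a single degree of freedom on the ordered alternative, so for tables with more than two ranks it is generally more sensitive to a monotone relationship than the general test of independence.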

Receiver operating characteristic (ROC) curves are commonly used to assess diagnostic performance of a particular measurable quantity. The most common feature used to quantify this characteristic is the area under the curve, which can be interpreted here as the probability that for two randomly chosen samples, the one with the higher Osteopontin rank will also have a higher rank for the clinical variable in question (Rice and Harris, 2005). In the case of the ranked data in this study, that probability can be calculated for each clinical study. Each pair of patient groups in the study was examined, and the fraction of those where a group with higher clinical variable rank also had a higher Osteopontin level rank is reported here. The statistical significance of this fraction was tested by carrying out a Monte Carlo simulation to estimate the distribution of fractions expected for random ranks.
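The pairwise comparison and its Monte Carlo test can be sketched as follows. This is a minimal illustration under assumptions, not the procedure as implemented for this study: the function names and example ranks are hypothetical, pairs tied on either variable are skipped, fractions are pooled across studies by summing pair counts, and the null distribution is generated by shuffling the Osteopontin ranks within each study.

```python
import itertools
import random
from typing import List, Sequence, Tuple

Study = Tuple[Sequence[float], Sequence[float]]  # (Osteopontin ranks, clinical ranks)


def concordant_pairs(opn: Sequence[float], clin: Sequence[float]) -> Tuple[int, int]:
    """Count group pairs in which the higher clinical rank goes with the
    higher Osteopontin rank. Returns (concordant, comparable)."""
    concordant = comparable = 0
    for (o1, c1), (o2, c2) in itertools.combinations(zip(opn, clin), 2):
        if o1 == o2 or c1 == c2:
            continue  # assumption: tied pairs are not counted
        comparable += 1
        if (o1 - o2) * (c1 - c2) > 0:
            concordant += 1
    return concordant, comparable


def pooled_fraction(studies: List[Study]) -> float:
    """Fraction of concordant pairs, pooled over all studies."""
    concordant = comparable = 0
    for opn, clin in studies:
        c, n = concordant_pairs(opn, clin)
        concordant += c
        comparable += n
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def monte_carlo_p(studies: List[Study], n_sim: int = 10000, seed: int = 0) -> Tuple[float, float]:
    """Estimate how often random within-study ranks give a pooled fraction
    at least as large as the observed one. Returns (observed, P value)."""
    rng = random.Random(seed)
    observed = pooled_fraction(studies)
    hits = 0
    for _ in range(n_sim):
        shuffled = [(rng.sample(list(opn), len(opn)), clin) for opn, clin in studies]
        if pooled_fraction(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_sim + 1)


# Hypothetical ranks for three studies with three, two, and three patient groups.
studies = [
    ([1, 2, 3], [1, 2, 3]),
    ([1, 2], [1, 2]),
    ([1, 2, 3], [1, 3, 2]),
]
observed, p_value = monte_carlo_p(studies, n_sim=2000)  # observed is 6/7
```

Shuffling within each study preserves the number of groups per study, so the simulated fractions reflect the same mix of small and large studies as the observed data.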

Reporting standards

The data applied to this study were not skewed by publication bias according to a funnel plot analysis. The present study has been conducted according to the standards of the PRISMA Statement (Moher et al, 2009).

Results

Osteopontin in patient survival

We applied categorical meta-analysis to the evaluation of Osteopontin as a prognostic marker. The distribution of ranks for published overall and disease-free/relapse-free survival versus measured Osteopontin levels displayed an aggregation along the diagonal in bar graphs, indicating a good correspondence for higher Osteopontin rank to lower mean survival time (Figure 1A and B). To further quantify these results, we determined the probability that for two patient groups selected at random from a study, the one that had the higher Osteopontin score would also have a shorter mean survival time. This resulted in a probability of 90.8%, P<1 × 10⁻⁵ for overall survival and a probability of 92.9%, P=1 × 10⁻⁴ for disease-free/relapse-free survival, where the significance was estimated using a permutation test. These results indicate that Osteopontin rank is a good predictor of survival outcome rank within a study. The actual probability calculated from the meta-analysis of the data was significant when compared to the estimated probability distribution under the null hypothesis that Osteopontin and mean survival time are independent (Figure 1C and D). When broken down to individual cancers, the association between Osteopontin levels and overall survival was significant for lung cancer, breast cancer, prostate cancer, head and neck cancer, and liver cancer (Table 2). Similar results were obtained using the meta-analysis function in Oncomine (Supplement 1). For several cancer types, only one published study was available. Those cases were excluded from the meta-analysis.

Figure 1

Osteopontin in overall survival and in disease-free/relapse-free survival. (A) Overall survival and Osteopontin ranks for all cancers combined. (B) Disease-free survival and Osteopontin ranks for all cancers combined. (C) Probability Distribution Function for independent Osteopontin and overall survival ranks. The measured value for Osteopontin data is shown as a vertical line. (D) Probability Density Function for independent Osteopontin and disease-free survival ranks. The measured value for Osteopontin data is shown as a vertical line.

Table 2 Osteopontin and survival in individual cancers

In clinical practice, the detection of Osteopontin is particularly important in two settings. In serum or plasma, Osteopontin may serve as a prognostic marker associated with a minimally invasive procedure. After a biopsy, Osteopontin may serve as a prognostic marker directly linked to the tumor. Therefore, we separately analyzed the patient survival data for Osteopontin in these distinct types of specimens. For all cancers combined, the levels of Osteopontin in plasma, in serum, and in tumors significantly identified subpopulations with shorter mean survival (Table 3). In tumors, the highest Osteopontin groups had a mean survival 850 days shorter than the lowest Osteopontin groups. For plasma, the highest Osteopontin groups had a mean survival 560 days shorter than the lowest Osteopontin groups. The concordance between Osteopontin ranks and risk for reduced survival was confirmed for several individual cancers (Table 3). However, the sample sizes for several individual cancers were not sufficiently large to obtain 95% significance for the rank statistic used here (in plasma for gastric, cervical, liver, teratoid, esophageal, and renal cancers; in serum for breast cancer, head and neck cancer, and mesothelioma; in tumors for colorectal, ovarian, and prostate cancers, mesothelioma, and glioma). In tumors, discordance (i.e. higher Osteopontin groups had longer mean survival times) was observed for one study each on bone cancer, endometrial cancer, and melanoma.

Table 3 Osteopontin and survival in distinct clinical specimens

Osteopontin in tumor grade, stage and early progression

Osteopontin immunohistochemistry score ranks and tumor grade and stage ranks were dependent (P<0.001) for all cancers combined (Figure 2A), as well as for 12 individual cancers for grade, 13 individual cancers for stage T, 8 individual cancers for stage N, and 9 individual cancers for stage M (Table 4). Graphical representation of the group ranks suggested a strong positive relationship, reflected in a high density of data points along the diagonal in bar graphs (Figure 2B). To further quantify these ranked data, we determined the probability that for two patient groups, the one that had the higher Osteopontin rank would also have a higher grade or stage rank. In 66.3% of these comparisons, the group with higher Osteopontin rank was also the group with a higher tumor grade, which Monte Carlo analysis revealed to be significant (P=0.004). Positive comparisons were also seen in 81.3% of cases for tumor stage N (node involvement, P=0.01), 54.5% of cases for tumor stage T (primary tumor, P=0.28), and 70% of cases with higher tumor stage M (metastasis, P=0.18).

Figure 2

By categorical meta-analysis, Osteopontin levels correlate with stage and grade of cancers. (A) The Pearson χ² test of ranked Osteopontin immunohistochemistry scores with tumor grade and stage shows a significant dependence between Osteopontin rank and clinical variable. (B) The bar graphs of Osteopontin rank versus rank of grade or stage display an aggregation of data along the diagonal, indicating a positive relationship between Osteopontin levels and clinical variables. The associations are statistically significant for grade and node positivity, but not for stage T and M. (C) Expanded analysis of grade and stage ranks to all published measures. In five studies with duplicate data sets only the immunohistochemistry results were used. We computed a measure analogous to that represented by the area under a ROC curve (see Methods). For all grade and stage measures, Osteopontin is a significant positive indicator.

Table 4 Categorical meta-analysis of tumor grade and stage

For stage T and M, the positive relationship identified in the comparison of ranks was not statistically significant, possibly due to insufficient sample size. Advantageously, the categorical analysis can be applied to heterogeneous data sets. By combining immunohistochemistry with the other published tests, we identified a statistically significant relationship between Osteopontin levels and all grade and stage measures, including T and M, thus demonstrating the benefit of incorporating all of the available data within one analysis (Figure 2C). The categorical analysis had higher sensitivity than a conventional meta-analysis approach (Supplement 2).

In the early stages of transformation, tumor progression can be described as the transition from normal tissue to precancerous lesions (dysplasia, metaplasia), preinvasive cancer, and cancer. According to categorical meta-analysis, Osteopontin expression levels were significantly associated with the progression of eight cancers, independent in one, and inversely correlated in two (skin cancer and gestational trophoblastic tumor) (Table 5). Of note, while Osteopontin appears to be a cancer biomarker for 31 individual malignancies, its levels were significantly reduced below normal in non-melanoma skin cancer and gestational trophoblastic tumor. This suggests a unique role for Osteopontin in these two malignancies.

Table 5 Categorical meta-analysis of tumor progression

Discussion

High levels of Osteopontin in several cancers are indicative of a poor prognosis. Overall and disease-free survival are inversely related to Osteopontin levels in several cancers. There is strong correspondence between high Osteopontin and lower mean survival time in tumor (82%) and plasma (100%) measurements, with large mean differences in survival times, indicating a useful role for Osteopontin in patient stratification. Patient survival is largely determined by tumor aggressiveness. Hence, it is not unexpected that Osteopontin, a prognostic measure for survival, is also a marker for grade, stage, and early progression. It is likely that patients with elevated Osteopontin at the time of diagnosis warrant more forceful treatment regimens than are suitable for patients with low levels of Osteopontin.

Although tumor grade, tumor stage, and early tumor progression are distinct measures for the clinical presentation of a cancer, they are not independent of one another. A dedifferentiated, high grade tumor is more aggressive, and consequently more likely to disseminate and become high stage, than a low grade tumor. The molecular mechanisms driving progression, grade, and stage overlap. Osteopontin is associated with all of them. In patient care, the diagnosis and assessment of cancer is typically made on the basis of clinical and histo-morphologic criteria. However, molecular markers are more quantifiable and may be more reflective of underlying disease mechanisms. The incomplete convergence between clinical and molecular descriptors may require a reevaluation of how we assess cancer (Weber, 2010).

In this analysis, the concordance between Osteopontin expression rank and stage or grade rank was 67–84% over all types of cancer. This is comparable to the accuracy commonly estimated for existing tumor markers, including CEA, CA 15-3, CA 19-9 and PSA (Ebert et al, 1996; Koopmann et al, 2006; Ulmert et al, 2009). When applied to select cancers, the accuracy of Osteopontin increases. Future research needs to assess whether the combination of Osteopontin with other markers can further improve its diagnostic value (Reinholz et al, 2002; O’Neill et al, 2005; Alonso et al, 2007; Ribeiro-Silva and Oliveira da Costa, 2008).

Meta-analysis has been a valuable tool in biomarker validation. One of its major limitations is the detection of true signals over the noise of heterogeneous input data. Categorical data analysis has a self-normalizing effect on study-to-study variations and may therefore be superior to conventional meta-regression algorithms. For the evaluation of Osteopontin as a biomarker for cancer, we have found conventional and categorical meta-analysis to be in agreement. This was not the case for the correlation of Osteopontin levels with tumor grade and stage (Figure 2 and Supplementary Figure S1). Here, the improved sensitivity of the categorical analysis is required to detect the existing trends in the published data sets.