Abstract
Background/Aim: Despite great effort to elucidate the process of acquired gefitinib resistance (AGR) in order to develop successful chemotherapy, the precise mechanisms and genetic factors of such resistance have yet to be elucidated. Materials and Methods: We performed a cross-platform meta-analysis of three publically available microarray datasets related to cancer with AGR. For the top 100 differentially expressed genes (DEGs), we clustered functional modules of hub genes in a gene co-expression network and a protein-protein interaction network. We conducted a weighted correlation network analysis of total DEGs in microarray dataset GSE 34228. The identified DEGs were functionally enriched by Gene Ontology (GO) function and KEGG pathway. Results: We identified a total of 1,033 DEGs (510 up-regulated, 523 down-regulated, and 109 novel genes). Among the top 100 up- or down-regulated DEGs, many genes were found in different types of cancers and tumors. Through integrative analysis of two systemic networks, we selected six hub DEGs (Pre-B-cell leukemia homeobox1, Transient receptor potential cation channel subfamily C member 1, AXL receptor tyrosine kinase, S100 calcium binding protein A9, S100 calcium binding protein A8, and Nucleotide-binding oligomerization domain containing 2) associated with calcium homeostasis and signaling, apoptosis, transcriptional regulation, or chemoresistance. We confirmed a correlation of expression of these genes in the microarray dataset. Conclusion: Our study may lead to comprehensive insights into the complex mechanism of AGR and to novel gene expression signatures useful for further clinical studies.
Although chemotherapy is the most common treatment for various cancer types, the development of acquired resistance to anticancer drugs remains a serious problem, leading to a majority of patients who initially responded to anticancer drugs suffering the recurrence or metastasis of their cancer (1-3). Acquired drug resistance (ADR) is known to have a multi-factorial etiology, with specific associations that include ethnicity, genetic alterations, epigenetic imbalances, and environmental variations.
Gefitinib, an orally-active synthetic anilinoquinazoline, is a first-generation epidermal growth factor receptor (EGFR)-tyrosine kinase inhibitor that was approved in 2003 by the US Food and Drug Administration for the second-line treatment of advanced non-small cell lung cancer (NSCLC) that harbors EGFR-activating mutations, such as an in-frame deletion in exon 19 or an L858R substitution in exon 21 (4, 5). To antagonize the tyrosine kinase activity of EGFR, gefitinib reversibly binds to the adenosine triphosphate binding pocket of the intracellular tyrosine kinase domain, blocking its autophosphorylation and resulting in the inhibition of subsequent downstream signal transduction (6). For example, inhibition of the downstream signaling pathway (such as phosphatidylinositol-3-kinase (PI3K)-Akt kinase (AKT)-mammalian target of rapamycin (mTOR), mitogen-activated protein kinase (MAPK)-extracellular signal-regulated kinase (ERK), and Janus kinase (JAK)-signal transducer and activator of transcription (STAT) pathways prevent cell proliferation, inhibition of programmed cell death (apoptosis), angiogenesis, invasion, or metastasis in many types of epithelial cancers. Despite an early positive clinical response, most patients with EGFR-mutant NSCLC eventually acquire resistance to gefitinib, resulting in the chemotherapeutic failures. Several possible mechanisms for acquired gefitinib resistance (AGR) in lung cancer have been proposed, including the following: (i) alterations in the target oncogene, such as alternate expression of tyrosine kinase isoforms and second-site mutations in the tyrosine kinase domain (e.g. a T790, secondary EGFR mutation), altered EGFR trafficking, erb-b2 receptor tyrosine kinase 3 (ERBB3) activation, and expression of the ATP-binding cassette subfamily G member 2 (ABCG2) drug-efflux transporter; (ii) by passing of drug inhibition by oncogene addiction, such as the compensatory activation of downstream signaling pathways and redundant activation of other survival pathways (e.g. mesenchymal-epithelial transition factor amplification, activation of vascular endothelial growth factor, insulin-like growth factor-1, or integrin B1 pathways, hepatocyte growth factor overexpression, and loss of the phosphatase and tensin homolog, and (iii) histological transformation (e.g. epithelial-mesenchymal transformation and small-cell transformation). However, none of these mechanisms has provided a full explanation for AGR, and no definitive genetic factors have been reported for AGR in epithelial cancer, including NSCLC (7-10).
Because cancer research needs insightful observations to determine complex etiologies, researchers introduced high-throughput microarrays to investigate the expression of multiple genes under specific conditions (11, 12). In three published reports of microarray studies on AGR cancer, many differentially expressed genes (DEGs) were suggested as candidate biomarkers of AGR but these individual studies were limited by small sample sizes, low sample quality, and differences in laboratory protocols (13, 14). In order to minimize the uncertainty in these findings, we identified the DEGs that were consistently detected in paired samples from all microarray datasets, using a cross-platform meta-analysis. To organize the results, we also approached available data topologically, performing an integrative analysis of the DEGs at the gene or protein level.
To our knowledge, this is the first report of a cross-platform meta-analysis of multiple gene-expression profiles from microarray datasets associated with AGR.
Materials and Methods
Selection of microarray datasets qualified for meta-analysis. We thoroughly evaluated the suitability of microarray datasets retrieved on Gene Expression Omnibus (GEO) database of National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/geo/) and ArrayExpress database of the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) (http://www.ebi.ac.uk/arrayexpress/), according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines published in 2009 (15).
Meta-analysis of microarray datasets with different platforms. We performed meta-analysis of multiple gene-expression profiles in microarray datasets obtained using different platforms by means of rank product algorithm (RankProd package in R, http://www.rproject.org/) implemented in the INMEX online program (http://inmex.ca/INMEX/) (16, 17). Considering AGR cells derived from the same parental cell line, we arbitrarily fixed the sample pair. Before the datasets were analyzed, all probe IDs from each dataset were annotated as EntrezIDs for data consistency, and intensity values for gene expression were log2-transformed and processed by quantile normalization (limma package in R). A list of DEGs (up- or down-regulated) were identified based on p-value (where threshold was p<0.05) and foldchange (FC) level in a given number of replicates multiplied across different microarray datasets under the nonparametric algorithm, which is statistically rigorous but biologically intuitive.
Gene Ontology (GO) hierarchy and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. To discern the implication of DEGs in cancer cell lines with AGR, functional enrichment analysis of GO hierarchy (biological process, molecular function, and cellular component) and KEGG pathways were carried out using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) web-accessible bioinformatics program (http://david.abcc.ncifcrf.gov/) under a significance threshold of p<0.05.
Gene co-expression network analysis. To construct gene co-expression networks for each of the top 100 up- or down-regulated DEGs, we imported the DEG lists into an extensive database of previously discovered networks and screened for significant gene-gene interactions using the online GeneMANIA program (http://www.genemania.org/) (18, 19). The correlation coefficient between a DEG and additional genes in its network was determined by a GO term (biological process)-based weighting and then was filtered to include only those gene co-expressions above a significance threshold of 0.05.
From the 10 up-regulated or down-regulated genes with the most connected edges in the co-expression networks, the distinct functional modules of the hub DEGs and additional related genes were identified by the fast-greedy HE (G) algorithm of the Community Clusters GLay plugin (http://cytoscape.wodaklab.org/wiki/CommunityStructureLayout) using Cytoscape software (http://www.cytoscape.org/) (20).
Protein-protein interaction (PPI) network analysis. To construct a PPI network for each list of proteins encoded by the top 100 up- or down-regulated DEGs, we imported the lists into the extensive database of already-known networks and screened significant PPIs under the Biological General Repository for Interaction Datasets (BioGRID) (http://thebiogrid.org/). In the top 10 up- or down-regulated protein list with the most connected edges in the PPI network, subnetwork between the hub DEGs-encoding proteins and additional related proteins were analyzed by using the Cytoscape plugin, ClusterONE (http://apps.cytoscape.org/apps/clusterone) (21, 22).
Weighted gene correlation network analysis (WGCNA). To construct a co-expression network of the identified total DEGs in cancer cell lines with AGR, we evaluated Pearson’s correlation coefficient values across 1692 microarray probes in the normalized datasets of GSE 34228, which was analyzed using the most samples, by WGCNA (23). During the WGCNA procedure, briefly, Pearson’s correlation matrices for all genes were calculated and transformed by raising all values to a power β (soft thresholding as biological networks are small-world and scale-free). Finally, gene co-expression modules were identified from the hierarchical cluster tree by using a dynamic tree cut procedure (24).
Results
Selection of microarray datasets for meta-analysis related to AGR. Three microarray datasets containing 62 GEO samples were extracted from the GEO database of The NCBI, which met our criteria for meta-analysis (Figure 1A). All three GEO series (GSEs) were microarray expression profiles of only the cancer cell lines that acquire drug resistance by stepwise exposure to increasing doses of gefitinib (Table I). The microarray results of three GSEs were achieved by using two cancer cell lines such as lung cancer (GSE34228 and 38302) and epidermoid carcinoma (GSE10696).
Identification of up- and down-regulated DEGs by meta-analysis. From cross-platform microarray meta-analysis, we identified total 1033 DEGs including 510 up- and 523 down-regulated genes across the three microarray datasets under the significance threshold of p<0.05. While 109 “gained” genes were uniquely identified as DEGs in the meta-analysis, 680 “lost” genes were identified as DEGs in any individual analysis but not in the meta-analysis (Figure 1B). The “gained” genes show relatively weak, but consistent expression profiles across all three datasets, having more reliability to be declared as novel genes. But, the “lost” genes either imply inconsistent changes in expression profiles across different datasets, or large variations by different platforms or experimental errors. The 100 most significantly up- and down-regulated DEGs with p<1.0E-5 are listed in Tables II and III, respectively. In the up-regulated DEGs, genes with the largest mean log2FC were family with sequence similarity 9, member B (FAM9B), followed by patatin-like phospholipase domain containing 4 (PNPLA4) and ankyrin repeat domain 30B pseudogene 2 (ANKRD30BP2). The down-regulated genes with the largest mean log2FC were by descending order: GPC6, S100A8, and SYNPO2L.
A subset of top 25 dysregulated DEGs across the three microarray datasets was also visualized by heat maps showing differential expression of individual datasets (Figure 2).
GO hierarchy and KEGG pathway enrichment analysis of total DEGs. Considering all 1,033 DEGs obtained by the meta-analysis, the most over-represented GO terms in biological processes were enriched in the following descending order: “inflammatory response”, “epidermis development”, and “response to inorganic substance” (Table IV). The most enriched GO terms in molecular function and cellular component were “calcium ion binding” and “plasma membrane”, respectively. The most enriched KEGG pathway terms were as follows (in descending order): “calcium signaling pathway”, “phosphatidylinositol signaling system”, “cytokine-cytokine receptor interaction”, and “pathways in cancer”.
Gene co-expression network analysis of DEGs. To interpret the biological meaning of the identified DEGs at the gene level, we constructed a co-expression network for the top 100 up- and down-regulated DEGs with significant interaction relation composed of 144 nodes/446 edges and 124 nodes/550 edges, respectively. From the co-expression network of up-regulated DEGs, the list of top 10 hub genes was determined in order of number of the interacting edges as follows (in order): peripheral myelin protein 22 (PMP22), echinoderm microtubule associated protein like 1 (EML1), secretogranin II (SCG2), sparc/osteonectin, cwcv and kazal-like domains proteoglycan 1 (SPOCK1), pre-B-cell leukemia homeobox 1 (PBX1), transient receptor potential cation channel, subfamily C, member 1 (TRPC1), growth arrest-specific 6 (GAS6), phosphodiesterase 2A, cGMP-stimulated (PDE2A), AXL receptor tyrosine kinase (AXL), and prostaglandin I2 synthase (PTGIS). In the network of down-regulated DEGs, the top 10 hub genes with the most connected edges were peptidase inhibitor 3, skin-derived (PI3), S100 calcium binding protein A9 (S100A9), S100A8, chemokine (C-X-C motif) ligand 1 (CXCL1), cystatin A (CSTA), PDZK1 interacting protein 1 (PDZK1IP1), serpin peptidase inhibitor, clade B (ovalbumin), member 2 (SERPINB2), nucleotide-binding oligomerization domain containing 2 (NOD2), serum amyloid A4, constitutive (SAA4), and chitinase 3-like 2 (CHI3L2).
The distinct modules of the hub DEGs and their interacting genes were further identified by fast-greedy HE (G) algorithm of GLay Cytoscape plugin (Figure 3). Among the modules, “Module 1”, which has the largest size formed by 41 nodes, was significantly enriched by biological process terms such as “defense response”, “apoptosis”, and “chemical homeostasis” of GO hierarchy.
PPI network analysis of DEGs. In order to interpret the biological meaning of the identified DEGs at the protein level, we constructed a PPI network for the proteins encoded by the top 100 up- or down-regulated DEGs, which were made by significant interaction including 196 nodes/237 edges and 438 nodes/500 edges respectively.
From the PPI network of proteins encoded by up-regulated DEGs, the list of the top 10 hub proteins was determined in order of number of the interacting edges as follows (in order): ribosomal protein S6 kinase, polypeptide 6 (RPS6KA6), PBX1, inhibitor of DNA binding 1 (ID1), TRPC1, small nuclear ribonucleoprotein polypeptide N (SNRPN), AXL, cyclin-dependent kinase inhibitor 1C (CDKN1C), trefoil factor 1 (TFF1), membrane protein, palmitoylated 1 (MPP1), and de-etiolated homolog 1 (DET1) (Figure 4).
In the same manner, the top 10 hub proteins encoded by down-regulated DEGs were tumor protein p63 (TP63), crystallin, alpha B (CRYAB), S100A9, activating transcription factor 3 (ATF3), disabled homolog 2 (DAB2), histone cluster 1, H1a (HIST1H1A), S100A8, keratin 14 (KRT14), NOD2, and caspase 1 (CASP1).
The 15 module clusters including the hub proteins were further identified by density of nodes and p-value of ClusterONE Cytoscape plugin and six of these clusters shared common genes with the top 10 hub gene lists of the gene co-expression network (Figure 5).
WGCNA of the identified total DEGs in cancer cell lines with AGR. To investigate the functional module eigen (ME) of genes that were highly correlated in microarray dataset GSE34228 with the most samples, we conducted the WGCNA of total DEGs by R package WGCNA under a scale-free topology of the power adjacency function parameter of 11.
The closely-interacting genes within the network were sub-divided into eight distinct MEs using the dynamic hybrid tree-cutting algorithm and were identified by functional enrichment of GO hierarchy and KEGG pathway (Figure 6). Among them, “MEgrey” was the largest ME of 405 nodes and was significantly enriched by GO terms “regulation of cell proliferation” and KEGG terms “cytokine-cytokine receptor interaction”.
Discussion
Cancer cells that develop resistance to gefitinib are a major challenge to successful treatment and long-term survival. To date, no clear mechanism for AGR has been demonstrated. A multifarious approach to analyzing the expression patterns of multiple genes in AGR cancer cells could allow us to characterize the overall response of resistant cells and to understand the complex mechanisms underlying gefitinib resistance.
In the gene-expression patterns obtained by the meta-analysis, primary study for p-value and log2FC of DEGs showed that not few of the top 100 up- or down-regulated DEGs were found in developmental processes of many types of tumor. In the case of top 20 DEGs, in particular, previous studies reported that most of the genes were involved in carcinogenesis of various types of cancer such as of the breast, prostate, stomach, ovary, and lung. In addition, some of these DEGs have been identified in cancer cells with anticancer drug resistance other than AGR, including resistance to cisplatin, trastuzumab, fluorouracil, taxane, and erlotinib. DEGs implicated include membrane protein, palmitoylated 1 (MPP1), AXL, PMP22, smoothened, frizzled class receptor (SMO), S100A8, Hes family bHLH transcription factor 5 (HES5), ATP-binding cassette, sub-family A, member 12 (ABCA12), haptoglobin (HP), and PI3. Interestingly, we identified 109 “gained” DEGs in this meta-analysis which were not discovered in the prior individual analyses. This large number of “gained” DEGs implies that our approach has a higher likelihood of identifying novel biomarkers for AGR. For the identified DEGs overall, a functional enrichment analysis revealed that a large proportion of these genes were classified as having functions related to cellular processes that occur during oncogenesis, including homeostasis, development, apoptosis, immune response, angiogenesis, and signal transduction. In our gene coexpression network, four distinct modules composed of hub DEGs were classified into typical cellular processes of AGR, including indefinite cell proliferation and anti-apoptosis (module 1), malfunctioning membrane transport of small molecules (modules 1, 3, and 4), response to chemical compounds (modules 1 and 3), abnormal cell development (modules 2 and 4), angiogenesis (module 2), and dysregulated transcription (module 2). In parallel with the gene co-expression network, 15 module clusters surrounding hub proteins were identified and enriched in the PPI network. By comparing these networks, we recognized six genes in common, including three up-regulated genes (AXL, PBX1, and TRPC1) and three down-regulated genes (S100A9, S100A8, and NOD2). It is notable that TRPC1 and another three down-regulated genes were affiliated with “module 1” in the gene co-expression network. The DEGs in “module 1” were enriched with GO (biological process) terms relevant to calcium homeostasis and signaling or apoptosis, as observed in the hub protein clusters from the PPI network. Moreover, “calcium ion binding” and “calcium signaling pathway” ranked first in the top 15 terms of the GO hierarchy and KEGG pathway enrichment of DEGs overall. Many studies have suggested that dysregulated calcium homeostasis and altered calcium signaling play important roles in anti-apoptosis, tumor vascularization, invasion, and metastasis in the carcinogenesis of many tumorigenic cells (25, 26). Joshua et al. reported that sustained potentiation of purinergic intercellular calcium signaling was observed following transient exposure to EGF, and that this was not blocked by gefitinib or erlotinib in human glioma cells, suggesting new therapeutic approaches to gliomas and other tumors (27). The TRP ion channel family has been implicated in the regulation of cancer progression and aggressiveness by modulating calcium influx and downstream signaling in many types of cancers (28). In NSCLC cells, EGF-induced calcium entry through TRPC1 promoted cancer cell proliferation by activating EGFR and escaping G0/G1 cell-cycle arrest (29). S100A8 and S100A9 (a multigene family of non-ubiquitous cytoplasmic calcium-binding proteins of the EF-hand type) are involved in tumor development or progression and have been recently revealed to be associated with the progression of carcinoma cells via the Wnt/β-catenin pathway or in MMP2 expression (30-32). NOD2 is known to induce chronic inflammation by mediating anti-apoptosis and autophagy pathways in cancer development (33). AXL and PBX1, both found in “module 2” of the gene co-expression network, have also been reported to be involved in carcinogenesis and tumor development. For example, the AXL receptor tyrosine kinase participates in cell proliferation, invasion and metastasis, and chemoresistance in many types of solid cancers, and causes acquired resistance to EGFR-tyrosine kinase inhibitors by mediating the EMT pathway in NSCLC cells (34). PBX1 forms heterodimeric transcription complexes with other homeodomain-containing nuclear proteins, such as HOX and HEIS, and regulates expression of important genes in organogenesis, hematopoiesis, and tumorigenesis (35). To determine the correlation of the six candidate DEGs from gene expression profiles of the microarray dataset, we performed a WGCNA of all identified DEGs in GSE34228 and identified eight functional MEs in which genes with highly correlated co-expression were grouped by biological terms of carcinogenesis through functional enrichment. Through our analyses, we confirmed that five DEGs clustered within MEgrey, with NOD2 included in MEred, are contiguous with each other in a hierarchical clustering tree of MEs.
In conclusion, the present study provided comprehensive insights into the complex nature and mechanisms of AGR and novel gene-expression signatures that may be useful for clinical chemotherapy studies.
Acknowledgments
This work was supported by Konkuk University.
Footnotes
Conflicts of Interest
None.
- Received March 30, 2015.
- Revision received April 20, 2015.
- Accepted April 23, 2015.
- Copyright© 2015, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved