Abstract
Background/Aim: Colon cancer is one of the most common cancer types and the second leading cause of death due to cancer. Considerable effort has been devoted to investigating molecular alterations during colon cancer progression. However, the identification of stage-specific molecular markers remains a challenge. The aim of this study was to develop a novel computational methodology for the analysis of alterations in differential gene expression and pathway deregulation across colon cancer stages in order to reveal stage-specific biomarkers and reinforce drug repurposing investigation. Materials and Methods: Transcriptomic datasets of colon cancer were used to identify (a) differentially expressed genes with monotonicity in their fold changes (MEGs) and (b) perturbed pathways with ascending monotonic enrichment (MEPs) related to the number of the participating differentially expressed genes (DEGs), across the four colon cancer stages. Through an in silico drug repurposing pipeline we identified drugs that regulate the expression of MEGs and also target the resulting MEPs. Results: Our methodology highlighted 15 MEGs and 32 candidate repurposed drugs that affect their expression. We also found 51 MEPs divided into two groups according to their rate of DEG content alteration across colon cancer stages. Focusing on the target MEPs of the highlighted repurposed drugs, we found that one of them, the neuroactive ligand-receptor interaction, was targeted by the majority of the candidate drugs. Moreover, we observed that two of the drugs (PIK-75 and troglitazone) target the majority of the resulting MEPs. Conclusion: These findings highlight significant genes and pathways that can be used as stage-specific biomarkers and facilitate the discovery of new potential repurposed drugs for colon cancer. We expect that the computational methodology presented can be applied in a similar way to the analysis of any progressive disease.
- Monotonically expressed genes
- monotonically enriched pathways
- drug repurposing
- colon cancer progression
- colon cancer staging
Colon cancer is one of the most common cancer types and the second leading cause of cancer-related deaths in the United States (1). The most significant factor for patient survival, prognosis and treatment is tumor staging (2, 3). As a rule, the earlier the stage of colon cancer, the smaller its metastatic potential. It is well known that cancer metastases are the major cause of colon cancer-related deaths and there are no available drugs to confine the spread to other organs (4).
Despite the efforts to identify significant features that could serve as indicators for a better understanding of colon cancer progression (5–9), there is an increasing trend in the incidence and death rates of colon cancer. Furthermore, the molecular characterization of colon cancer stages and the identification of predictive biomarkers remain a challenge (10). Few studies in the literature have focused on the comparison and identification of genes that may be associated with clinical stages of colon cancer (11–13) and, to our knowledge, there are no computational methodologies to facilitate the investigation of the deregulation of molecular mechanisms during colon cancer progression. In the present study we developed a novel computational methodology for the analysis of stage-specific alterations in differential gene expression and pathway deregulation of colon cancer. Specifically, we used publicly available transcriptomic datasets of colon cancer and normal samples, accompanied by their clinical information, in order to identify monotonically expressed genes (MEGs), i.e., differentially expressed genes (DEGs) with an ascending or descending trend in their differential expression (log2 fold changes, log2FC) during colon cancer progression. Through an in silico drug repurposing pipeline, we identified candidate repurposed drugs which have been experimentally found to significantly change the expression of the majority of these MEGs in colon cancer cell lines. We further explored alterations in the revealed molecular mechanisms that participate in cancer progression by investigating the monotonically enriched pathways (MEPs), i.e., the perturbed pathways with ascending enrichment related to the participating DEGs across the four colon cancer stages.
Our computational approach yielded 15 MEGs and 32 candidate repurposed drugs that affect their expression. Furthermore, we found 51 MEPs that were divided into two groups according to their perturbation rate across stages, i.e., the rate of DEG content alteration across colon cancer stages. We also investigated which of the candidate repurposed drugs also target the resulting MEPs and we found two drugs (PIK-75 and troglitazone) that target most of them. On the other hand, one MEP, namely the neuroactive ligand-receptor interaction, is targeted by the majority of the candidate repurposed drugs from our analysis.
Overall, our findings highlight genes and pathways monotonically deregulated across colon cancer stages that can be used as stage-specific biomarkers and as a stepping stone in the direction of drug repurposing against colon cancer. This methodology provides insights for further experimentation and can be applied in a similar way to the analysis of any progressive disease.
Materials and Methods
Data selection and preprocessing. We collected transcriptomic datasets that contain colon cancer samples from all stages (Stage I-IV) and normal samples from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) (14) and from The Cancer Genome Atlas (TCGA) database (https://www.cancer.gov/tcga). We used the GEO datasets for our analysis (as reference sets) and the TCGA datasets as validation sets. In GEO, when looking for transcriptomic datasets with samples from all four colon cancer stages together with healthy ones, we found only five independent microarray datasets, whose accession numbers and sample numbers are presented in Table I. We used the Linear Models for Microarray Data (LIMMA) R package (15) to perform the differential expression analysis in each dataset, comparing samples from each colon cancer stage with normal ones. Finally, we selected the DEGs by applying selection thresholds of adjusted p-value <0.05 and absolute log2FC≥1. The number of DEGs from each comparison is presented in Table II.
Furthermore, we retrieved RNA-seq data from TCGA using the TCGAbiolinks R package (16) in order to use them as a validation set. Specifically, we downloaded raw counts from TCGA Colon Adenocarcinoma (TCGA-COAD) and Rectum Adenocarcinoma (TCGA-READ). Concerning staging information, we separated the samples based on their clinical records. Finally, each subset was statistically analyzed in order to find the differentially expressed genes between samples from each colon cancer stage and normal ones. For the differential expression analysis we used the DESeq2 R package (17) and set a threshold of 10 total counts per gene to filter out genes with very low expression. Moreover, the same selection criteria of absolute log2FC≥1 and adjusted p-value <0.05 were applied. The description of the TCGA datasets and the total number of DEGs from each analysis are presented in Table I and Table II, respectively.
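The DEG selection thresholds applied to both the GEO and the TCGA comparisons can be illustrated with a minimal Python sketch. It is not the code used in the study, which relied on the LIMMA and DESeq2 R packages. The dictionary layout, the field names `log2fc` and `adj_p`, and the gene labels are hypothetical and chosen only for illustration.

```python
def select_degs(results, padj_max=0.05, lfc_min=1.0):
    """Keep genes with adjusted p-value < padj_max and |log2FC| >= lfc_min."""
    return {
        gene: stats
        for gene, stats in results.items()
        if stats["adj_p"] < padj_max and abs(stats["log2fc"]) >= lfc_min
    }


# Hypothetical results of one stage-vs-normal comparison (made-up values).
stage_vs_normal = {
    "GENE_A": {"log2fc": 1.8, "adj_p": 0.001},   # passes both thresholds
    "GENE_B": {"log2fc": -0.4, "adj_p": 0.010},  # fold change too small
    "GENE_C": {"log2fc": -2.1, "adj_p": 0.200},  # not significant
    "GENE_D": {"log2fc": -1.0, "adj_p": 0.049},  # passes (|log2FC| equals 1)
}
degs = select_degs(stage_vs_normal)
print(sorted(degs))
```

In this toy input only GENE_A and GENE_D satisfy both criteria.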
Finding and scoring the monotonically expressed genes. For each dataset, we were interested in the DEGs with a monotonic trend (constantly increasing or decreasing trend) in their log 2-fold changes across colon cancer progression, noting them as MEGs. Based on this concept, we examined the monotonically over-expressed genes with an ascending trend in their log2FC and the monotonically under-expressed genes with a descending trend in their log2FC across colon cancer stages.
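The monotonicity check described above can be sketched in Python as follows. This is an illustrative sketch rather than the study's implementation. It assumes that a strictly increasing or decreasing trend is required and that an over-expressed (under-expressed) MEG has positive (negative) log2FC in all four stages; both conditions are assumptions, and the function name is hypothetical.

```python
def classify_trend(lfc):
    """Classify stage-ordered log2FC values (Stage I..IV).

    Returns 'ascending' for over-expressed genes whose log2FC strictly
    increases, 'descending' for under-expressed genes whose log2FC
    strictly decreases, and None otherwise.
    """
    pairs = list(zip(lfc, lfc[1:]))
    if all(v > 0 for v in lfc) and all(a < b for a, b in pairs):
        return "ascending"
    if all(v < 0 for v in lfc) and all(a > b for a, b in pairs):
        return "descending"
    return None


# Made-up log2FC profiles for three hypothetical genes.
print(classify_trend([1.1, 1.6, 2.0, 2.7]))      # ascending
print(classify_trend([-1.2, -1.5, -2.2, -3.0]))  # descending
print(classify_trend([1.1, 2.0, 1.6, 2.7]))      # None (not monotonic)
```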
To further evaluate these MEGs, we investigated their linear correlation in terms of log2FC across the four colon cancer stages. To measure this correlation, we used linear regression to fit a line to the log2 fold changes of each MEG and subsequently calculated that line’s slope and coefficient of determination (R-squared), i.e., how close the log2FC of each MEG are to the fitted regression line. Moreover, we calculated the distance between the extreme log2FC of Stage IV and Stage I for each MEG and, by combining those measures, we scored and ranked the MEGs of each dataset using a weighted sum according to the following equation [1]:
[1]
In equation [1], the score of each MEG is a weighted sum of two components: the absolute distance of the log2FC between Stage IV and Stage I, and a trend component built from the absolute value of the slope and the coefficient of determination. This weighted sum, with weights w1 and w2, combines the actual total increase or decrease of the log2FC across stages with the trend parameters calculated through the linear regression. In the present study we chose to give precedence to the actual total change in the log2FC by setting w1 and w2 to 0.6 and 0.4, respectively.
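As an illustration of this scoring step, the following Python sketch fits a least-squares line to the four stage-wise log2FC values and combines the Stage IV to Stage I distance with the regression parameters. The way the slope and R-squared are merged into a single trend term (here, their product) is an assumption made for this sketch and may differ from equation [1]. The function names are hypothetical.

```python
def fit_line(y):
    """Least-squares line through (1, y1), ..., (n, yn).

    Returns the slope and the coefficient of determination (R-squared).
    """
    n = len(y)
    x = range(1, n + 1)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, r_squared


def meg_score(lfc, w1=0.6, w2=0.4):
    """Weighted sum of total change and trend strength.

    ASSUMPTION: the trend term is |slope| * R-squared; the source only
    states that slope and R-squared enter the w2-weighted component.
    """
    slope, r_squared = fit_line(lfc)
    distance = abs(lfc[-1] - lfc[0])  # |log2FC(Stage IV) - log2FC(Stage I)|
    return w1 * distance + w2 * abs(slope) * r_squared


# Perfectly linear made-up profile: slope 1, R-squared 1, distance 3.
score = meg_score([1.0, 2.0, 3.0, 4.0])
print(round(score, 3))
```

With the weights 0.6 and 0.4 used in the study, a gene with a large overall change but a noisy trend can still outrank a gene with a small but perfectly linear change.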
According to these scores, each MEG in each dataset acquired a ranking. The rank of each MEG was normalized according to equation [2]:
[2]
where gi,j is the rank of the ith gene in jth dataset and Nj the total number of MEGs in jth dataset.
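A rank normalization of this kind can be sketched in Python. The specific formula used below, (Nj - gi,j + 1) / Nj, is an assumption chosen so that the best-ranked MEG of a dataset receives 1 and the worst receives 1/Nj; equation [2] may define the normalization differently. The gene labels and scores are made up.

```python
def normalized_ranks(scores):
    """Map each gene to a normalized rank in (0, 1]; best score -> 1.0.

    ASSUMPTION: normalized rank = (N - rank + 1) / N, with rank 1 for the
    highest score and N the number of MEGs in the dataset.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {gene: (n - rank + 1) / n for rank, gene in enumerate(ordered, start=1)}


# Made-up scores of four hypothetical MEGs in one dataset.
dataset_scores = {"GENE_A": 2.2, "GENE_B": 0.9, "GENE_C": 1.5, "GENE_D": 0.3}
ranks = normalized_ranks(dataset_scores)
print(ranks)
```

Normalizing by the number of MEGs makes ranks comparable between datasets that contain very different numbers of MEGs.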
Finally, we combined the normalized ranks from the five datasets and calculated a final score as a weighted sum, for the monotonically over- and under-expressed genes separately, using the following equation [3]:
[3]
where fi is the number of appearances of MEG i in the datasets. In our study, the weights w1 and w2 were set to 0.4 and 0.6, respectively, giving precedence in the score to MEGs found in more datasets.
In silico drug repurposing. The selected MEGs were used separately as input in an in silico drug repurposing tool called Drug Gene Budger (DGB) (https://maayanlab.cloud/DGB/) (18). DGB is a web-based tool which returns small molecules that are predicted to maximally affect the expression of the genes of interest. Users can query the genes whose expression they want to reverse, and DGB returns a ranked list of small molecules which have been experimentally found to produce the desired expression effect. The experimental data of DGB have been extracted from the LINCS L1000 dataset (19), the Connectivity Map (CMap) dataset (20), and the GEO database (14). In this study we selected the results from LINCS L1000 and, more specifically, the significant small molecules with q-value <0.05 and absolute log2FC≥1 that reverse the expression of MEGs in the three colon cancer cell lines (HT29, SW620 and SW948) included in the L1000 data.
Monotonically enriched pathways (MEPs). To further investigate the alterations in molecular mechanisms related to the colon cancer progression, we explored the perturbed pathways with ascending monotonic enrichment related to the number of the involved DEGs across the four colon cancer stages. For this reason, we constructed a consensus gene signature for each colon cancer stage by examining the common over- and under-expressed genes from the differential expression analysis of the five GEO datasets. Then we found the molecular mechanisms in which the common DEGs of each stage are involved. Specifically, by parsing the biological pathways from Kyoto Encyclopedia of Genes and Genomes – KEGG (21), we found all genes that are involved in each pathway. Finally, we investigated the pathways with an ascending monotonic enrichment in the number of the participating DEGs from Stage I to Stage IV.
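The MEP search can be sketched in Python as set operations. The sketch assumes that "ascending monotonic enrichment" means the number of pathway DEGs never decreases from Stage I to Stage IV and is higher in Stage IV than in Stage I; this reading, the function names and the toy gene sets are assumptions. For brevity the sketch uses a single signature per stage, whereas the study examined over- and under-expressed genes.

```python
def consensus_signature(deg_sets):
    """Genes reported as DEGs in every dataset for one stage."""
    return set.intersection(*map(set, deg_sets))


def deg_counts(pathway_genes, stage_signatures):
    """Number of consensus DEGs that fall in the pathway, per stage."""
    return [len(pathway_genes & signature) for signature in stage_signatures]


def is_monotonically_enriched(counts):
    """ASSUMPTION: counts never decrease and end higher than they start."""
    non_decreasing = all(a <= b for a, b in zip(counts, counts[1:]))
    return non_decreasing and counts[-1] > counts[0]


# Toy data: Stage I consensus from two datasets, Stages II-IV given directly.
stage1 = consensus_signature([{"g1", "g2", "g9"}, {"g1", "g2"}])
signatures = [
    stage1,
    {"g1", "g2", "g3"},
    {"g1", "g2", "g3", "g7"},
    {"g1", "g2", "g3", "g4", "g7"},
]
pathway = {"g1", "g3", "g4", "g5"}  # hypothetical pathway gene set
counts = deg_counts(pathway, signatures)
print(counts, is_monotonically_enriched(counts))
```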
MEPs network construction. MEPs from all clusters were used as input to PathIN, an integrated web tool that provides an easy and flexible way for rapidly creating pathway-based networks at several functional biological levels: genes, compounds and reactions (http://bioinformatics.cing.ac.cy/PathIN). We used PathIN to map the MEPs on the reference pathway-to-pathway network parsed from the KEGG database.
Results
MEGs found in colon cancer progression. We found the following numbers of monotonically over-/under-expressed genes per dataset: GSE39582 (146/151), GSE21815 (668/51), GSE21510 (73/71), GSE71187 (419/133) and GSE35279 (899/45), which are presented in the box plots of Figure 1A and B. Moreover, as described in the Materials and Methods section, we investigated the linear relationships between their log2FC across the four colon cancer stages by calculating the regression coefficient (Slope) and the coefficient of determination (R-squared) in a linear regression model. The corresponding metrics for each MEG and for each dataset are presented in a polar coordinate system in Figure 1C.
For each MEG, we calculated a final weighted score FSi that combines the normalized ranks and the frequency of occurrence of each MEG across the five GEO datasets. After sorting the MEGs according to their FSi score, we highlighted those that met the following criteria: 1) their FSi is above 0.5 and 2) they are found as MEGs with the same trend in at least one TCGA validation set. Following these filtering criteria, we kept 12 MEGs with ascending monotonicity (HOXB8, SNTB1, ATAD2, KLRG2, ITGBL1, RDH12, LEMD1, TACSTD2, F2, RELL2, PMEPA1 and HSPH1) and 3 MEGs with descending monotonicity (PPARGC1A, SLC26A2 and CLCA1).
Candidate repurposed drugs for the highlighted MEGs. The 15 selected MEGs were separately used as input to the DGB drug repurposing tool. We found candidate repurposed drugs that regulate 7 out of 12 over-expressed MEGs and all 3 under-expressed MEGs. From these drugs we kept only those that were experimentally found to regulate the resulting MEGs in colon cancer cell lines. Then we used q-value <0.05 and absolute log2FC≥1 as filtering criteria and selected for each MEG at most the top 5 drugs, based on the log2FC that quantifies the observed change in the expression of each gene. With this procedure, we ended up with 32 unique drugs, presented in the bipartite network of Figure 2.
MEPs found in colon cancer progression. We investigated the common over- and under-expressed genes across stages and datasets. For Stage I, 151 common over-expressed genes and 51 under-expressed genes were found across the 5 datasets. Following the same procedure, we found 367 common over- and 141 common under-expressed genes for Stage II, 360 and 127 for Stage III and 488 and 139 for Stage IV, respectively. Finally, we ended up with 51 MEPs, shown in Figure 3. For each MEP, we normalized the number of DEGs in each stage by the total number of genes that are involved in this pathway. We then investigated the groups of pathways with similar deregulation across stages. Specifically, the 51 MEPs were grouped into 4 clusters according to their monotonic enrichment in the four colon cancer stages (Figure 3). The four clusters were identified based on the normalized number of DEGs. The color scale in the presented heat map represents a transformation of the normalized number of DEGs to the scale of [0, 1].
As presented in Figure 3, cluster 1 (C1) consists of 20 pathways with a smooth alteration rate between Stage II – Stage III and/or Stage III – Stage IV. Cluster 2 (C2) also includes 20 pathways, with the lowest deregulation across all colon cancer stages and slight or no alterations between them. For cluster 3 (C3), we observed nine pathways with high total deregulation across colon cancer progression and a high alteration rate between Stage I and Stage II. Finally, cluster 4 (C4) consists of two pathways with the highest perturbation rate across stages and a quite high rate of alteration between Stage I and Stage II and/or Stage III and Stage IV. Observing the four clusters, we separated the MEPs into two groups: the first group includes the MEPs with low perturbation rate (LPR) across stages, i.e., low rate of DEG content alteration. The second group contains MEPs with high perturbation rate (HPR) across the four colon cancer stages, i.e., high rate of DEG content alteration (Figure 3).
Network connectivity between the MEPs. The LPR and HPR MEPs from the two groups were mapped on the KEGG pathway-to-pathway reference network with the help of PathIN. In this network, the nodes represent the MEPs, the edges represent the connectivity found in KEGG between each pair of pathways, and the edge weight represents the number of common genes between the two pathways. As shown in Figure 4, MEPs within and across groups are linked, as they share biochemical connections. Specifically, the majority of the LPR MEPs from the first group are strongly associated. On the other hand, subgroups of the HPR MEPs are biochemically connected, also sharing a number of common genes. According to the network statistics, the MEPs with the highest degree centralities are apoptosis and mapk signaling pathway, and they belong to the LPR MEPs of the first group.
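The drug selection for a single MEG can be sketched in Python as below. The record layout (`drug`, `cell_line`, `log2fc`, `q_value`) is hypothetical and does not reflect the actual output format of DGB; the drug and cell line entries in the toy input are made up, apart from the three colon cancer cell line names taken from the text.

```python
COLON_CELL_LINES = {"HT29", "SW620", "SW948"}


def top_drugs(hits, reverse_direction, k=5, q_max=0.05, lfc_min=1.0):
    """Return up to k drugs that reverse a MEG in colon cancer cell lines.

    reverse_direction is 'down' for an over-expressed MEG (the drug should
    lower its expression) and 'up' for an under-expressed MEG.
    """
    sign = -1.0 if reverse_direction == "down" else 1.0
    kept = [
        h for h in hits
        if h["cell_line"] in COLON_CELL_LINES
        and h["q_value"] < q_max
        and sign * h["log2fc"] >= lfc_min
    ]
    kept.sort(key=lambda h: abs(h["log2fc"]), reverse=True)
    drugs = []
    for h in kept:  # keep each drug once, at its strongest effect
        if h["drug"] not in drugs:
            drugs.append(h["drug"])
    return drugs[:k]


# Made-up perturbation records for one over-expressed MEG.
hits = [
    {"drug": "drug_1", "cell_line": "HT29", "log2fc": -2.4, "q_value": 0.001},
    {"drug": "drug_2", "cell_line": "SW620", "log2fc": -1.2, "q_value": 0.020},
    {"drug": "drug_3", "cell_line": "MCF7", "log2fc": -3.0, "q_value": 0.001},   # not a colon line
    {"drug": "drug_4", "cell_line": "SW948", "log2fc": 1.9, "q_value": 0.001},   # wrong direction
    {"drug": "drug_1", "cell_line": "SW620", "log2fc": -1.5, "q_value": 0.010},  # duplicate drug
]
selected = top_drugs(hits, "down")
print(selected)
```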
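The two network quantities mentioned here, the edge weight as the number of shared genes and the degree centrality of each pathway, can be sketched in Python. The sketch uses the standard definition of degree centrality (degree divided by the number of other nodes); whether PathIN reports it in exactly this form is an assumption. The pathway names, gene sets and links are toy values.

```python
def weighted_edges(pathway_genes, links):
    """Weight each linked pathway pair by the number of genes they share."""
    return {(a, b): len(pathway_genes[a] & pathway_genes[b]) for a, b in links}


def degree_centrality(nodes, links):
    """Degree of each node divided by the number of other nodes."""
    degree = {node: 0 for node in nodes}
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(degree) - 1, 1)
    return {node: d / denom for node, d in degree.items()}


# Toy pathway gene sets and pathway-to-pathway links.
pathway_genes = {
    "P1": {"g1", "g2", "g3"},
    "P2": {"g2", "g3", "g4"},
    "P3": {"g3", "g5"},
    "P4": {"g6"},
}
links = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]
edges = weighted_edges(pathway_genes, links)
centrality = degree_centrality(pathway_genes, links)
print(edges)
print(max(centrality, key=centrality.get))
```

Note that two pathways can be linked in the reference network while sharing no genes, as for P1 and P4 in the toy data.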
MEPs and repurposed drugs. To further enrich our analysis, we investigated the MEPs that are targets of the highlighted repurposed drugs. To this end, we queried the corresponding gene targets of each drug in the CLUE Drug Repurposing Hub (https://clue.io/repurposing#download-data), the DrugBank database (https://www.drugbank.com/) (22) and PubChem (https://pubchem.ncbi.nlm.nih.gov/) (23). Then we identified the pathways in which these target genes are involved. Finally, we compared these pathways with the resulting MEPs in order to find which MEPs are also targets of the resulting repurposed drugs. We applied this procedure to each group to find drugs that affect MEPs with a similar tendency of deregulation during colon cancer progression. As shown in Figure 5, 18 out of 32 repurposed drugs (PIK-75, CAY-10594, everolimus, doxorubicin, triptolide, BRD-K05402890, troglitazone, tanespimycin, tyrphostin-AG-1478, trichostatin-a, BMS-536924, CD-1530, vemurafenib, emetine, pidorubicine, vorinostat, mepacrine and selumetinib) target 33 out of 51 MEPs (ampk signaling pathway, axon guidance, camp signaling pathway, cellular senescence, central carbon metabolism in cancer, Epstein Barr virus infection, hematopoietic cell lineage, human papillomavirus infection, microRNAs in cancer, mtor signaling pathway, neuroactive ligand-receptor interaction, platinum drug resistance, proteoglycans in cancer, purine metabolism, steroid hormone biosynthesis, apoptosis, choline metabolism in cancer, endocytosis, fc gamma r-mediated phagocytosis, hepatitis b, herpes simplex virus 1 infection, human immunodeficiency virus 1 infection, influenza a, Kaposi sarcoma-associated herpesvirus infection, mapk signaling pathway, measles, necroptosis, pathogenic Escherichia coli infection, regulation of actin cytoskeleton, tuberculosis, yersinia infection, antifolate resistance and progesterone-mediated oocyte maturation).
Of these MEPs, two belong to the HPR MEPs, specifically antifolate resistance and progesterone-mediated oocyte maturation. We prioritized the candidate drugs based on the number of MEPs that they target and found that the most MEPs are targeted by PIK-75 (23 target MEPs) and troglitazone (19 target MEPs), followed by tanespimycin (7 target MEPs) and CAY-10594 (4 target MEPs). The remaining drugs target 1–3 MEPs. All the candidate drugs and the number of MEPs they target are presented in Table III. It is worth noting that the first two candidate repurposed drugs, PIK-75 and troglitazone, target both LPR and HPR MEPs. Specifically, PIK-75 targets 22 LPR MEPs (ampk signaling pathway, axon guidance, camp signaling pathway, cellular senescence, central carbon metabolism in cancer, Epstein Barr virus infection, human papillomavirus infection, microRNAs in cancer, mtor signaling pathway, platinum drug resistance, proteoglycans in cancer, apoptosis, choline metabolism in cancer, fc gamma r-mediated phagocytosis, hepatitis b, herpes simplex virus 1 infection, human immunodeficiency virus 1 infection, influenza a, Kaposi sarcoma-associated herpesvirus infection, measles, regulation of actin cytoskeleton and yersinia infection) and one HPR MEP (progesterone-mediated oocyte maturation). On the other hand, 18 LPR MEPs and one HPR MEP are targeted by troglitazone (LPR MEPs: cellular senescence, Epstein Barr virus infection, hematopoietic cell lineage, human papillomavirus infection, mtor signaling pathway, proteoglycans in cancer, apoptosis, hepatitis b, herpes simplex virus 1 infection, human immunodeficiency virus 1 infection, influenza a, Kaposi sarcoma-associated herpesvirus infection, mapk signaling pathway, measles, necroptosis, pathogenic Escherichia coli infection, tuberculosis and yersinia infection; HPR MEP: antifolate resistance).
The MEPs targeted by the largest number of drugs are: (i) the neuroactive ligand-receptor interaction MEP (targeted by 7 repurposed drugs: BMS-536924, CD-1530, doxorubicin, emetine, everolimus, triptolide and vemurafenib) and (ii) the camp signaling MEP (targeted by 6 repurposed drugs: BRD-K05402890, CAY-10594, doxorubicin, everolimus, PIK-75 and triptolide).
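The drug-to-MEP mapping and the two rankings reported in this subsection can be sketched in Python. The sketch assumes that a drug targets a MEP when at least one of its target genes belongs to the pathway; this criterion, the function name and all drug, gene and pathway labels are assumptions made for illustration.

```python
from collections import Counter


def targeted_meps(drug_targets, pathway_genes, meps):
    """For each drug, the MEPs that contain at least one of its target genes."""
    return {
        drug: {p for p in meps if pathway_genes[p] & targets}
        for drug, targets in drug_targets.items()
    }


# Toy inputs: pathway gene sets, the subset that are MEPs, and drug targets.
pathway_genes = {"P1": {"g1", "g2"}, "P2": {"g2", "g3"}, "P3": {"g4"}, "P4": {"g5"}}
meps = ["P1", "P2", "P3"]
drug_targets = {"drug_1": {"g2", "g4"}, "drug_2": {"g3"}, "drug_3": {"g5"}}

hits = targeted_meps(drug_targets, pathway_genes, meps)

# Rank drugs by the number of MEPs they target.
drug_ranking = sorted(hits, key=lambda d: len(hits[d]), reverse=True)

# Count how many drugs target each MEP.
mep_counts = Counter(p for paths in hits.values() for p in paths)

print(drug_ranking)
print(mep_counts.most_common(1))
```

In the toy data, drug_3 only hits a pathway that is not a MEP, so it targets no MEPs and would be dropped from the final list.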
Discussion
Despite the enormous amount of omics data and the huge efforts to identify important molecular indicators that could be used for a better understanding of molecular pathology in multistage diseases, the understanding of the molecular alterations during colon cancer progression remains a challenge. Simple differential expression analysis across cancer stages is not enough to highlight the genes and molecular mechanisms that mark the phases and the progression of colon cancer.
In this study, we analyzed five transcriptomic datasets from the Gene Expression Omnibus database (GEO), with colon cancer samples from different stages and normal samples, in order to find DEGs with monotonicity in their log2FC during colon cancer progression. We found 15 MEGs, 12 over-expressed with ascending monotonicity and 3 under-expressed with descending monotonicity. We searched the literature for the biological relevance of the resulting 15 MEGs and made some significant observations. For the 12 over-expressed MEGs, it has been reported that HOXB8 and ATAD2 are important factors for the development and progression of colon cancer and may be used as possible drug targets for colon cancer therapy (24, 25). A recent publication has also proposed a significant role of syntrophin beta 1 (SNTB1) as a prognostic marker for colon cancer metastasis (26). Moreover, it has been found that integrin beta-like 1 (ITGBL1) is involved in colon cancer development and metastasis (27), while TACSTD2 and LEMD1 are among the significantly up-regulated genes of colon cancer (28, 29). For the under-expressed MEGs found in our analysis, it has been indicated that lower levels of PPARGC1A and SLC26A2 could increase colon cancer risk, proliferation and propagation, and they have been proposed as possible candidate targets for cancer therapy (30, 31). Finally, low expression of CLCA1 has been related to unfavorable prognosis in colon cancer and it has been indicated that it could be a possible prognostic marker (32, 33). High CLCA1 expression has also been shown to suppress colon cancer aggressiveness (34).
Through a computational drug repurposing pipeline, we also found drugs and small molecules that affect the expression levels of the highlighted MEGs. Among the 32 resulting candidate repurposed drugs, some were found to affect the expression of more than one MEG. Specifically, trichostatin-a, an organic compound used as an antifungal antibiotic, regulates the expression of three monotonically over-expressed genes (SNTB1, TACSTD2 and HSPH1) and two under-expressed genes (SLC26A2 and CLCA1). It has been reported that trichostatin-a has anticancer effects on cell proliferation and apoptosis and it has been suggested as a possible therapy for colon cancer (35). Moreover, vorinostat, a member of the family of compounds that inhibit HDAC that is also used in the management of cutaneous T cell lymphoma, was found to influence the expression of two monotonically over-expressed genes (SNTB1 and TACSTD2) and one under-expressed gene (SLC26A2). It is also listed in 6 clinical trials for colon cancer on ClinicalTrials.gov, with National Clinical Trial (NCT) numbers NCT00336141, NCT02316340, NCT00942266, NCT00138177, NCT00126451 and NCT01023737. SN-38, the active metabolite of irinotecan (a chemotherapeutic drug for metastatic colorectal cancer) (36), was also found as a candidate repurposed drug in our analysis, regulating the expression of one over- and one under-monotonically expressed gene (SNTB1 and PPARGC1A). Doxorubicin, another antineoplastic drug in our list that also affects one over- and one under-monotonically expressed gene (HSPH1 and SLC26A2), is used to treat various types of cancer including colon cancer (37).
We further investigated the perturbed pathways with an ascending monotonic enrichment during colon cancer progression, i.e., pathways with ascending monotonic enrichment in terms of their participating DEGs across the four colon cancer stages. We ended up with 51 MEPs that were mapped into four clusters forming two groups: (i) LPR MEPs with low perturbation rate (i.e., low rate of DEG content alteration) across colon cancer stages and (ii) HPR MEPs with high perturbation rate. We noticed that MEPs within and across the two groups are strongly associated in a pathway-to-pathway network (Figure 4). The Hippo signaling pathway and the tight junction pathway are strongly connected, since they share 27 common genes. The Hippo pathway is involved in tumorigenesis in several cancer types, and it has been reported that it is also connected with cancer growth and progression (38, 39). Moreover, it has become increasingly evident that tight junctions are involved in cancer metastasis, and changes in their protein expression are found in several colon cancer cases (40). It is worth noting that one HPR MEP, namely the p53 signaling pathway, is associated with 9 LPR MEPs (measles, human papillomavirus infection, platinum drug resistance, apoptosis, Epstein Barr virus infection, hepatitis b, cellular senescence, mapk signaling pathway and Kaposi sarcoma-associated herpesvirus infection). Of these, the Epstein Barr virus infection MEP shares the highest number of common genes with the p53 signaling pathway. It is well known that the p53 signaling pathway is responsible for alterations of critical regulators of the cell cycle, angiogenesis, DNA replication and apoptosis (41). Furthermore, two LPR MEPs, endocytosis and platinum drug resistance, were also found to be connected with two HPR MEPs, mismatch repair and antifolate resistance, respectively.
It is already known that alterations of endocytosis are involved in tumorigenesis, as they affect proliferation and migration (42). According to the pathway network statistics, two LPR MEPs have the highest degree centralities: apoptosis and mapk signaling pathway. Apoptosis is one of the most deregulated pathways in colon cancer and mapk signaling has an important role in tumor growth and progression (43).
Based on the ranking of the highlighted drugs (Table III), the top two drugs with the highest number of targeted MEPs are PIK-75 and troglitazone. These two drugs target both LPR and HPR MEPs. PIK-75 is a PI 3-kinase p110alpha inhibitor and it is known to have anti-cancer activity and anti-inflammatory properties (44). Troglitazone is an antidiabetic and anti-inflammatory drug which has potential anti-cancer properties in several cancer types including prostate cancer (45). PIK-75 targets 22 LPR MEPs and one HPR MEP, and troglitazone targets 18 LPR MEPs and one HPR MEP (Figure 5). Overall, 13 LPR MEPs are targeted by both drugs: cellular senescence, Epstein Barr virus infection, human papillomavirus infection, mtor signaling pathway, proteoglycans in cancer, apoptosis, hepatitis b, herpes simplex virus 1 infection, human immunodeficiency virus 1 infection, influenza a, Kaposi sarcoma-associated herpesvirus infection, measles and yersinia infection. We can observe that 6 of these common targeted MEPs belong to the group of viral infectious diseases. It is well known that the most common oncoviruses, human papillomaviruses (HPVs) and the Epstein Barr virus (EBV), contribute to many cancer types including colorectal cancer (46, 47). Moreover, it has been proposed that Epstein Barr virus infection is involved in the etiopathogenesis of inflammatory bowel diseases (48). However, the association of viral infections with colon cancer is still under consideration (49). Recent studies have shown that senescent cells are involved in various pathologic conditions and in the progression of several diseases (50). It has also been reported that cellular senescence could be a promising strategy for cancer therapy (51). The mtor pathway plays a pivotal role in cell proliferation and it is deregulated in many cancer types (52). As mentioned above, PIK-75 and troglitazone target one HPR MEP each, namely progesterone-mediated oocyte maturation and antifolate resistance, respectively.
It has been reported that progesterone-mediated oocyte maturation may be used as a diagnostic marker and possible target for colon cancer treatment (53). Mechanisms of antifolate resistance are a pivotal cause of difficulty in cancer chemotherapy (54). Finally, we observe that the most targeted MEP is the neuroactive ligand-receptor interaction signaling pathway (Figure 5). This pathway has been associated with many cancer types and, more specifically, with the progression of bladder, prostate and renal cancer (55).
In summary, in our study we highlighted the importance of 15 MEGs as prognostic markers for colon cancer progression. Four of them (SNTB1, ATAD2, HOXB8 and ITGBL1) have been indicated to be associated with the development of colon cancer (62–65). We further identified 51 MEPs that were mapped into two groups according to their perturbation rate across colon cancer stages. By examining their network connectivity, we highlighted the hippo signaling pathway and the tight junction pathway. It has been reported that hippo pathway components were associated with tumor differentiation, colon cancer stages and metastasis (66). On the other hand, the role of tight junction in human colon cancer development is still unknown. Through a computational drug repurposing pipeline, we also found repurposed drugs that influence the expression levels of the 15 highlighted MEGs. Among the 32 resulting candidate repurposed drugs, 18 also target the resulting MEPs. The two drugs with the highest number of MEPs that they target are the p110α inhibitor PIK-75 and the antidiabetic drug troglitazone. Finally, the MEP that is targeted by the majority of the candidate repurposed drugs from our analysis is the neuroactive ligand-receptor interaction. This pathway may be a promising therapeutic target for future investigation and treatment of colon cancer patients.
This study provides new computational evidence that the monotonicity of differential gene expression and pathway deregulation is associated with colon cancer progression. This observation offers alternative views in drug repurposing efforts as well. Despite the positive findings of this study, there are some possible limitations related to the database biases that may exist (a common problem in bioinformatics research), the inherent heterogeneity in biological material sampling and annotation across the different datasets used, and the possible differences in drug selectivity on pathway targeting. Undoubtedly, this is a large-scale computational methodology and the need for further experimental validation of the findings is obvious. Nevertheless, many of our findings have already been associated with colon cancer staging and progression, and this testifies to a preliminary validity of our approach. Based on this implicit validation, the association of the rest of the findings can be further investigated. The presented findings highlight significant genes and pathways that act as novel potential stage-specific biomarkers and facilitate stage-specific drug repurposing against colon cancer. Furthermore, this work describes a computational methodology that can be applied in a similar way to the analysis of any progressive disease.
Acknowledgements
M.M.B. is funded by the State Scholarships Foundation (IKY) scholarship; This research is co-financed by Greece and the European Union (European Social Fund-ESF) through the Operational Programme “Human Resources Development, Education and Lifelong Learning” in the context of the project ‘Reinforcement of Post-doctoral Researchers – 2nd Cycle’ [MIS-5033021], implemented by the State Scholarships Foundation (IKY).
Footnotes
Authors’ Contributions
Supervision of the study: G.K. and G.M.S. Conception and design of the study: M.M.B., G.K. and G.M.S. Collection, analysis and interpretation of data: M.M.B., G.K. and G.M.S. Drafting the article and revising it critically: M.M.B., G.K. and G.M.S.
This article is freely accessible online.
Data Availability
For extra information and analysis please visit: https://github.com/kmarlen1988/Colon-Cancer-Progression/
Conflicts of Interest
The Authors declare no conflicts of interest.
- Received July 16, 2021.
- Revision received September 3, 2021.
- Accepted September 16, 2021.
- Copyright© 2021, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved