Abstract
A fusion gene is the physical juxtaposition of two different genes resulting in a structure consisting of the head of one gene and the tail of the other. Gene fusion is often a primary neoplasia-inducing event in leukemias, lymphomas, solid malignancies as well as benign tumors. Knowledge about fusion genes is crucial not only for our understanding of tumorigenesis, but also for the diagnosis, prognostication, and treatment of cancer. Balanced chromosomal rearrangements, in particular translocations and inversions, are the most frequent genetic events leading to the generation of fusion genes. In the present review, we summarize the existing knowledge on chromosome deletions as a mechanism for fusion gene formation. Such deletions are mostly submicroscopic and, hence, not detected by cytogenetic analyses but by array comparative genome hybridization (aCGH) and/or high throughput sequencing (HTS). They are found across the genome in a variety of neoplasias. As tumors are increasingly analyzed using aCGH and HTS, it is likely that more interstitial deletions giving rise to fusion genes will be found, significantly impacting our understanding and treatment of cancer.
A fusion gene is defined as the physical juxtaposition of two different genes resulting in a chimeric structure consisting of the head of one gene and the tail of the other. Fusion genes are an important class of mutations in both benign and malignant neoplasms, where they often constitute the primary tumorigenic event (1-5). Clinically, fusion gene detection may play a key role in the accurate diagnosis and sub-classification of cancers, may have prognostic significance, and the novel genes may even be the target of molecular therapy (6-9). Thus, they are key to an increased understanding of neoplastic processes and may serve as the ultimate biomarker. As such, they have attracted much attention.
Fusion genes have been detected in hematologic neoplasms as well as in both benign and malignant mesenchymal, epithelial, and other solid tumors (10, 11). During 1982-1988, 10 fusion genes were identified, followed by 162 during the next decade (1990-99). In the last update (January 15, 2021) of the “Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer”, the number of fusion genes had risen to 32,618 (12). The list is certainly going to become longer as more tumor samples are investigated using high throughput sequencing methodologies (10). However, many of the fusion genes detected by these techniques alone, i.e. without subsequent, meticulous verification by other methods, are likely to represent stochastic events without any pathogenetic significance (13).
Chromosomal translocations, and to a lesser extent inversions, have traditionally been viewed as the most common genetic mechanisms whereby fusion genes are generated. The existence of such events has been known since the 1980s and the field has been repeatedly and extensively reviewed (1-5, 7, 14-17).
In contrast, unbalanced genomic rearrangements leading to loss of material, in particular terminal and interstitial chromosomal deletions, have mostly been pathogenetically associated with loss of tumor suppressor genes (11, 18-20). In the 1970s, the detection of a constitutional interstitial deletion of chromosome band 13q14 in some patients with retinoblastoma was key to Knudson’s two-hit model of suppressor gene-mediated tumorigenesis and crucial for the subsequent discovery of RB1, the classical tumor suppressor gene (21-27). Another example was the interstitial deletion of chromosome band 9p21 detected in many types of cancer, but particularly in acute lymphoblastic leukemia, which results in loss of the cyclin dependent kinase inhibitor 2A and 2B genes (CDKN2A and CDKN2B) (28-31). Chromosome deletions resulting in the loss of an important allele and, consequently, reduced levels of protein in the cells lacking that allele (haploinsufficiency) may also contribute to cancer development, even in the absence of subsequent loss of the second allele (20, 32-34).
A less well-known consequence of interstitial chromosomal deletions is the formation of fusion genes. In the present review, we discuss this genetic mechanism, the fusion genes that develop through it, and the neoplastic diseases in which it appears to be preferred.
Genes at the Rims of Interstitial Deletions May Fuse to Form Chimeric Genes/Transcripts
The principle for the formation of a fusion gene by interstitial deletion is the same as that for a translocation-generated fusion. The deletion starts within the 5’-end of one gene and finishes within the 3’-end of another, its fusion partner. Both genes are transcribed in the same orientation, i.e. from telomere to centromere or from centromere to telomere. Thus, juxtaposition of the two genes by removal of the chromosome segment between them results in a chimeric structure consisting of the head of one gene and the tail of the other (Figure 1A). Depending on the size of the deletion, loss of gene loci between the fusion partners may or may not accompany the fusion gene formation.
It is important to note that a fusion gene generated by a deletion could also be formed by a translocation between homologous chromosomes if the breaks and recombinations are the same as those for the deletion (Figure 1B). For example, deletion in chromosome bands 1q22-23 breaks the genes lamin A/C (LMNA in 1q22) and neurotrophic receptor tyrosine kinase 1 (NTRK1 in 1q23.1), both of which are transcribed from centromere to telomere, to generate the LMNA-NTRK1 fusion gene in many malignancies (see below). The same LMNA-NTRK1 fusion gene can also be formed by a t(1;1)(q22;q23) chromosome translocation.
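The orientation and order requirements described above can be written as a simple check. The following Python sketch is illustrative only: the `Gene` record, the function `deletion_can_fuse`, and all coordinates are hypothetical and are not taken from any database or from the studies cited here. It assumes genomic coordinates that increase from the short-arm telomere to the long-arm telomere, so that "+" denotes transcription toward the long-arm telomere and "-" transcription toward the short-arm telomere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not a real database schema)."""
    symbol: str
    chrom: str
    start: int   # lowest genomic coordinate of the gene
    end: int     # highest genomic coordinate of the gene
    strand: str  # "+" or "-"


def deletion_can_fuse(five_prime: Gene, three_prime: Gene) -> bool:
    """Return True if an interstitial deletion could join the head of
    `five_prime` to the tail of `three_prime`.

    Conditions: same chromosome, same transcriptional orientation, and the
    5' partner lying upstream of the 3' partner along that orientation.
    Overlapping genes are treated as incompatible for simplicity.
    """
    for gene in (five_prime, three_prime):
        if gene.strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-'")
    if five_prime.chrom != three_prime.chrom:
        return False
    if five_prime.strand != three_prime.strand:
        return False
    if five_prime.strand == "+":
        # Transcription runs toward higher coordinates.
        return five_prime.end < three_prime.start
    # Transcription runs toward lower coordinates.
    return three_prime.end < five_prime.start


# Made-up coordinates, for illustration only.
gene_a = Gene("GENE_A", "chr1", 1_000_000, 1_050_000, "+")
gene_b = Gene("GENE_B", "chr1", 1_800_000, 1_900_000, "+")
gene_c = Gene("GENE_C", "chr1", 2_500_000, 2_600_000, "-")

print(deletion_can_fuse(gene_a, gene_b))  # same orientation, A upstream of B
print(deletion_can_fuse(gene_b, gene_a))  # wrong order for a simple deletion
print(deletion_can_fuse(gene_a, gene_c))  # opposite orientation
```

A pair that passes this check is only compatible with a deletion-generated fusion. The check says nothing about whether a deletion actually occurred, which still requires confirmation at the genomic level.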
Most fusion genes have been detected using high throughput sequencing technologies. In fact, most were found as fusion transcripts in RNA sequencing analyses and were subsequently reported as fusion genes (35-39). For the majority of cases, no chromosome banding or other cytogenetic analysis, no fluorescence in situ hybridization (FISH), array comparative genome hybridization (aCGH), single nucleotide polymorphism (SNP) array, Southern blot or other methodologies were used to support this conclusion. As a consequence, no actual genome-level confirmation exists that fusion gene formation has taken place in these situations, i.e. no structural DNA rearrangement leading to the junction of two different genes has been proven. In order to fill this “gap” between fusion transcripts and fusion genes, Prof. Mitelman decided in his database that “chromosome abnormalities giving rise to gene fusions identified through RNA sequencing are by default designated as translocations (t), unless shown to arise by other types of chromosome rearrangements (del, dup, ins, inv)” (12, 40, 41). By way of example, the “Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer” lists the transcript emanating from a fusion between the transcriptional repressor GATA binding 1 (TRPS1) gene from 8q23.3 and the pleomorphic adenoma gene 1 (PLAG1) from 8q12.1 (TRPS1-PLAG1 chimera), found by RNA sequencing in a uterine myxoid leiomyosarcoma and a soft tissue myoepithelial tumor, as being generated by a t(8;8)(q12;q23) (12, 40, 41). However, no direct evidence for the presence of such a translocation is provided in the articles describing the genetic analyses of the above-mentioned tumors (40, 41). By contrast, we recently examined a chondroid syringoma carrying a del(8)(q12q23) as the only cytogenetic aberration (42). Using aCGH, FISH, reverse transcription polymerase chain reaction (RT-PCR), and Sanger sequencing methodologies, we showed that a TRPS1-PLAG1 chimeric gene was generated by the deletion (42) (Figure 2).
Chimeric transcripts may also be formed at the transcription level. In that case, two independently transcribed, neighboring genes with the same orientation give rise to a single chimeric RNA which may code for a chimeric protein (43-46). Various names have been given to these chimeric transcripts, such as readthrough transcripts, transcription induced chimeras, tandem RNA chimeras etc. (47). They have been found in many mammals (48). Whether they should be viewed as genuine chimeric transcripts is still under discussion (43-50). An example involves the genes solute carrier family 45 member 3 (SLC45A3) and ETS Like 4 transcription factor (ELK4) which are both transcribed from telomere to centromere and map on 1q32 with a distance of 25 kbp between them. The chimeric SLC45A3-ELK4 transcript, detected in prostate cancer, was found to be generated by cis-splicing between the two neighboring genes SLC45A3 and ELK4 without any actual rearrangement of DNA (35, 51-53). That chimera is designated as resulting from a t(1;1)(q32;q32) in Mitelman’s database (12).
With all these difficulties, caveats, and provisos in mind, we provide a chromosome-by-chromosome list of the unambiguous deletion-generated neoplasia-associated fusion genes that we have been able to ascertain from the relevant literature (Table I).
Chromosome 1
The SCL/TAL1 interrupting locus (STIL, also known as SIL) maps on 1p33, is transcribed from centromere to telomere, and codes for a protein which is part of the pericentriolar material surrounding the parental centrioles and which is essential for centriole duplication during the cell cycle (54). The T-cell acute leukemia 1 (TAL1, also known as SCL or tal-1) gene maps just 18 kbp distal to STIL, is transcribed from centromere to telomere, and codes for a transcription factor that harbors the basic helix-loop-helix (bHLH) domain, a protein dimerization and DNA-binding motif common to many eukaryotic transcription factors (55).
In 1990, two independent research groups working on T-lineage acute lymphoblastic leukemias detected an approximately 90 kbp interstitial deletion in 1p33 which caused the 5’-untranslated part of STIL to fuse with the coding part of TAL1 (56, 57). The deletion placed the expression of TAL1 under the control of the STIL promoter, causing aberrant overexpression of the TAL1 protein (56, 57). To the best of our knowledge, this was the first description of a fusion gene resulting from an interstitial, submicroscopic deletion.
The STIL-TAL1 fusion gene has been reported in 15-25 % of pediatric and young adult T-lineage acute lymphoblastic leukemia (T-ALL) but much less frequently in older T-ALL patients (58-60). Compared to T-ALL patients without STIL-TAL1 fusions, those with the chimera have a higher white blood cell count at diagnosis, express CD2 on their leukemic cells and show a poor response to the steroid drug prednisone (59, 61, 62). The prognosis of STIL-TAL1 fusion-positive leukemias has been reported as better than, poorer than, or about equal to that of other T-ALL groups (59, 61-64). In murine models, abnormal expression of TAL1 has been reported to result in the development of T-cell malignancies (65, 66).
Fusion of the gene coding for lamin A and C (LMNA) with the gene coding for neurotrophic receptor tyrosine kinase 1 (NTRK1) was reported to occur as the result of a 750 kbp interstitial deletion in chromosome bands 1q22-23 in a spitzoid melanoma (67). Both genes are transcribed from centromere to telomere. Subsequently, LMNA-NTRK1 fusion was also described in other neoplasias such as colon cancer, thyroid cancer, breast cancer, cholangiocarcinoma, soft tissue sarcoma, and uterine sarcoma (68-80). The LMNA-NTRK1 fusion gene codes for a chimeric tyrosine kinase. Patients with this fusion can be treated with kinase inhibitors such as crizotinib, entrectinib, and larotrectinib with significant clinical response (71, 72, 79, 81-85).
In the 1q21-23 chromosomal region, 15 fusion genes involving NTRK1 have been reported. Based on the orientation of the transcription (from centromere to telomere), interstitial deletions come across as the probable cause of fusions between NTRK1 (3’-fusion partner) and zinc finger and BTB domain containing 7B (ZBTB7B), brevican (BCAN), chromatin target of protein arginine methyltransferase 1 (CHTOP), cingulin (CGN), platelet endothelial aggregation receptor 1 (PEAR1), or phosphatidylinositol-4-phosphate 5-kinase type 1 alpha (PIP5K1A). The fusions have been found in various tumors of the brain, breast, bladder, and neuroendocrine cells (75, 80, 86-89). Most fusions have been detected using high throughput sequencing methodologies. Cytogenetic, FISH, aCGH, or any other data confirming the said deletions at the genomic level are lacking.
Using CRISPR-Cas9, Cook et al. (90) generated a microdeletion leading to a Bcan-Ntrk1 fusion gene in mice. The mice developed high-grade gliomas which responded to the Ntrk1 inhibitor entrectinib. In general, patients whose cancers carry NTRK1 fusion genes have responded satisfactorily to treatment with tyrosine kinase inhibitors (85, 91-95).
Chromosome 2
The anaplastic lymphoma kinase (ALK) gene maps to 2p23 (position chr2:29,192,774-29,921,586) and is transcribed from centromere to telomere. More than 20 ALK-chimeras have been reported in which the ALK 5’-fusion partner comes from another gene which also resides on the short arm of chromosome 2. In 10 of these ALK-chimeras, the 5’-fusion partner maps proximal to ALK (i.e. closer to chromosome 2 centromere) and is also transcribed from the centromere towards the telomere (Table I). Thus, an interstitial deletion could be the genomic mechanism behind the generation of these chimeras.
Fusion of the coiled-coil domain-containing protein 88A (CCDC88A) gene with ALK, giving a CCDC88A-ALK chimera, was found in an anaplastic ependymoma of an 8-month-old girl. An interstitial deletion del(2)(p16p23) was seen by G-banding examination of the tumor cells and confirmed by FISH. Genomic PCR showed that the deletion started within intron 12 of CCDC88A (2p16.1) and ended within intron 19 of ALK (2p23.2) (96).
ALK-fusions were detected with the genes dynactin 1 (DCTN1) in uterine inflammatory myofibroblastic tumor and pancreatic ductal adenocarcinoma (DCTN1-ALK chimera) (97, 98), glutamine:fructose-6-phosphate amidotransferase 1 (GFPT1) in medullary thyroid cancer (GFPT1-ALK chimera) (99), WD repeat-containing planar cell polarity effector (WDPCP) in lung adenocarcinoma (WDPCP-ALK chimera) (100), BAF chromatin remodeling complex subunit BCL11A in lung adenocarcinoma (BCL11A-ALK chimera) (101, 102), S1 RNA binding domain 1 (SRBD1) in lung adenocarcinoma (SRBD1-ALK chimera) (103, 104), and striatin (STRN) in lung adenocarcinoma, malignant peritoneal mesothelioma, and thyroid carcinoma (STRN-ALK chimera) (105-108). These were true chimeric genes resulting from DNA rearrangements, possibly deletions between the 5’-fusion partner and ALK (the 3’-fusion partner). In all the above-mentioned fusion genes, the genomic breakpoint in ALK was within the 1932 bp long intron 19 of the gene.
Irrespective of the 5’-fusion partner gene, all ALK-chimeras seem to code for chimeric protein tyrosine kinases (109). Patients whose tumors carry ALK-chimeras respond well to treatment with ALK inhibitors (110-115). More specifically, patients whose tumors carry the fusions DCTN1-ALK, BCL11A-ALK, SRBD1-ALK, STPG4-ALK, and STRN-ALK have reportedly shown excellent response to ALK inhibitors such as crizotinib, ceritinib, and alectinib (98, 101-103, 105-107, 116).
Chromosome 4
The factor interacting with PAPOLA and CPSF1 (FIP1L1) gene and the platelet derived growth factor receptor alpha (PDGFRA) gene both map to chromosome band 4q12 and are transcribed from centromere to telomere. The distance between them is 800 kbp. FIP1L1 codes for a subunit of the cleavage and polyadenylation specificity factor complex that polyadenylates the 3’ end of mRNA precursors (117). PDGFRA codes for a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family (118-120). PDGFRA together with its paralog gene platelet derived growth factor receptor beta (PDGFRB), and the genes colony stimulating factor 1 receptor (CSF1R), KIT proto-oncogene receptor tyrosine kinase (KIT), and fms related receptor tyrosine kinase 3 (FLT3) code for the class III family of receptor tyrosine kinases that have important roles in leukemo- and tumorigenesis (120-124).
In 2003, the FIP1L1-PDGFRA fusion gene, resulting from an 800 kbp interstitial chromosomal deletion in 4q12, was detected in nine out of 16 patients with hypereosinophilic syndrome (125). FIP1L1-PDGFRA codes for a chimeric, constitutively active tyrosine kinase which consists of the first 233 amino acids of FIP1L1 and the last 523 amino acids of PDGFRA. Imatinib inhibits tyrosine phosphorylation by the FIP1L1-PDGFRA fusion protein (125). Nowadays, the World Health Organization’s “Classification of tumours of haematopoietic and lymphoid tissues” lists, under the category “Myeloid/lymphoid neoplasms with eosinophilia and rearrangement of PDGFRA, PDGFRB, or FGFR1, or with PCM1-JAK2”, the new subgroup “Myeloid/lymphoid neoplasms with PDGFRA rearrangement” in which FIP1L1-PDGFRA is the most commonly detected gene fusion (126-128). Patients with this disease usually respond well to imatinib (129, 130).
Chromosome 5
The PDGFRB gene maps to 5q32 and is transcribed from telomere to centromere. It encodes, similarly to its homologous PDGFRA gene, a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family (119, 120, 123, 124). In 1994, Golub et al. reported that the t(5;12)(q33;p13) chromosome translocation sometimes seen in chronic myelomonocytic leukemia results in fusion of the ETS variant transcription factor 6 gene (ETV6, also known as TEL) from 12p13 with PDGFRB (131). According to the January 15, 2021 version of the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer, 49 PDGFRB chimeras have been reported, most of them in hematologic malignancies (12). The consequence of the PDGFRB fusions is constitutive activation of the PDGFRB tyrosine kinase (120, 123). Patients with hematologic malignancies bearing PDGFRB chimeras can be successfully treated with imatinib (132-140).
The genes EBF transcription factor 1 (EBF1 on 5q33.3), CD74 molecule (CD74 on 5q33.1), secreted protein acidic and cysteine rich (SPARC on 5q33.1), and TNFAIP3 interacting protein 1 (TNIP1 on 5q33.1) are transcribed from telomere to centromere and have been found to fuse as 5’-end partner genes with PDGFRB (Table I). The EBF1-PDGFRB chimera, which is found in B-lineage acute lymphoblastic leukemia, in the majority of cases results from an 8.6 Mbp interstitial deletion, del(5)(q32q33.3), with breakpoints within the EBF1 and PDGFRB genes (132, 133, 141-143). In very few cases, a chromosome translocation has instead been shown to generate the EBF1-PDGFRB chimera (142). The TNIP1-PDGFRB chimera results from a 900 kbp interstitial deletion with breakpoints located within TNIP1 and PDGFRB (6, 144-146).
No information exists about DNA rearrangements behind the formation of the CD74-PDGFRB and SPARC-PDGFRB chimeras found in a patient with B-ALL and a case of lipofibromatosis, respectively (147, 148). CD74 and SPARC are located 240 kbp and 1.5 Mbp distal to PDGFRB, respectively. Since both genes are transcribed from telomere to centromere, as is PDGFRB, and since both are distal to PDGFRB, we see it as probable that both fusions are the product of interstitial deletions.
Chromosome 6
The ROS proto-oncogene 1, receptor tyrosine kinase (ROS1) gene maps on 6q22.1, is transcribed from telomere to centromere, and codes for a tyrosine kinase receptor with similarities to the Drosophila sevenless tyrosine kinase receptor (149-155). Neither the expression nor the cellular function of ROS1 has been well studied but the gene seems to be widely expressed. Examining the expression of ROS1 in 45 different human cell lines, the majority from various neoplasias, Birchmeier et al. (152, 155-158) found high-level expression in glioblastoma-derived cell lines but no to very low expression in the remainder. Further studies have shown ectopic expression of ROS1 also in other brain tumors (152, 155-158). ROS1 chimeras have been reported in various types of cancer, more and more as tumors are increasingly being screened for fusion genes/transcripts. In 2016, a review of ROS1 fusions in cancer reported 26 genes as having been found to fuse with ROS1 (159) whereas a similar recent review reported the number of ROS1 fusion partners to be 54 (160). In 2020, the year before the review by Drilon et al. (160) was published, another 14 novel ROS1 fusion partner genes were added to the list (161-165), raising the total currently known number of ROS1 chimeras to 68. Regardless of their large number and variability, ROS1 chimeras encode chimeric ROS1 proteins which are constitutively active kinases and which, consequently, may be targets for treatment with kinase inhibitors (13, 160, 166-171).
In 2003, Charest et al. (172) showed that cells from a glioblastoma contained a 250 kbp submicroscopic interstitial deletion causing fusion of the golgi associated PDZ and coiled-coil motif containing (GOPC, also known as FIG) gene with ROS1. The GOPC-ROS1 transcript, which consists of the first seven exons of GOPC and the last nine exons of ROS1, was in-frame and coded for a constitutively active GOPC-ROS1 chimeric protein that seems to be oncogenic (172, 173). At present, the GOPC-ROS1 chimera is considered to be a rare, but recurrent, fusion found in glioma, lung adenocarcinoma, cholangiocarcinoma, and high-grade serous ovarian carcinoma (166, 168, 172, 174-177). The chimeric GOPC-ROS1 protein may be the target for kinase inhibitors (160, 166-169, 177).
In 2013, a 41.5 Mbp interstitial deletion, del(6)(q22q25), was reported to fuse the first 10 exons of the ezrin (EZR) gene, which is transcribed from telomere to centromere and maps on 6q25.3, with exons 34-43 of the ROS1 gene in lung adenocarcinomas from four female patients, three of whom had never been smokers (178). The EZR-ROS1 gene coded for a chimeric protein with oncogenic activity. It contained the FERM domain of the EZR protein joined to the transmembrane and kinase domains of ROS1 (178). Additional studies confirmed the recurrence of EZR-ROS1 in lung cancer and also that the finding was clinically important: the chimeric protein could be the target of kinase inhibitors with very good results (160, 161, 167, 169, 179-184).
A chimera with the centrosomal protein 85 like (CEP85L) gene, which maps on 6q22.31, 1.0 Mbp distal to ROS1, and is transcribed from telomere to centromere, as the 5’-end partner gene and ROS1 as the 3’-end partner gene has been reported in an angiosarcoma as well as a few glioblastomas (87, 166, 185-188). The chimeric CEP85L-ROS1 transcript was accompanied by deletion of the 5’-end of ROS1 suggesting that an interstitial 1.1 Mbp submicroscopic deletion within band 6q22 caused the CEP85L-ROS1 chimera (166, 185). The CEP85L-ROS1 transcript codes for a chimeric protein with oncogenic activity. The protein can be targeted with kinase inhibitors (87, 166, 185-188).
Recently, three novel in-frame ROS1 chimeric transcripts were detected (161, 164) using high throughput technology, probably corresponding to microdeletions between ROS1 (as the 3’-end partner gene) and 5’-end partner genes (161, 164). In the first chimeric transcript, found in a melanoma of the skin, the SFT2 domain containing 1 (SFT2D1) gene was fused to ROS1 (161). In the second transcript, found in a serous carcinoma of the ovary, an invasive ductal breast carcinoma, and in a carcinoma of unknown origin, the protein tyrosine phosphatase receptor type K (PTPRK) gene was fused to ROS1 (161). In the third transcript, found in a leiomyosarcoma, the mannosidase alpha class 1A member 1 (MAN1A1) gene was fused with ROS1 (164). In vitro assays showed that the MAN1A1-ROS1 protein had strong transformation potential and that the kinase inhibitor crizotinib inhibited growth of MAN1A1-ROS1 transformed cells in a dose-dependent manner (164).
The SFT2D1, PTPRK, and MAN1A1 genes are distal to ROS1 and map on 6q27, 6q22.33, and 6q22.31, respectively. They are transcribed from telomere to centromere. Thus, a 49 Mbp deletion is predicted to have caused the SFT2D1-ROS1, an 11 Mbp deletion the PTPRK-ROS1, whereas a 2 Mbp deletion probably resulted in the MAN1A1-ROS1 chimera.
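The arithmetic behind such size predictions is simply the distance between the two breakpoints on the same chromosome. The short Python sketch below is an illustration only: the function names `predicted_deletion_size` and `format_size` are hypothetical, and the breakpoint positions in the example are round placeholder values, not actual coordinates of ROS1 or of any of its partner genes.

```python
def predicted_deletion_size(five_prime_break: int, three_prime_break: int) -> int:
    """Approximate number of base pairs lost between two breakpoints
    located on the same chromosome (order of arguments does not matter)."""
    return abs(five_prime_break - three_prime_break)


def format_size(size_bp: int) -> str:
    """Express a size in bp, kbp, or Mbp, matching the units used in the text."""
    if size_bp >= 1_000_000:
        return f"{size_bp / 1_000_000:.1f} Mbp"
    if size_bp >= 1_000:
        return f"{size_bp / 1_000:.0f} kbp"
    return f"{size_bp} bp"


# Placeholder breakpoints (hypothetical values, not real gene positions).
break_in_3p_partner = 10_000_000
break_in_5p_partner = 12_000_000

size = predicted_deletion_size(break_in_5p_partner, break_in_3p_partner)
print(format_size(size))
```

Because the exact breakpoints within each gene are usually unknown when only a fusion transcript has been sequenced, such estimates are approximate and depend on the genome build used.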
The MYB proto-oncogene (MYB, also known as c-MYB) codes for a transcription regulator with three helix-turn-helix (HTH) DNA-binding domains, maps on 6q23.3, and is transcribed from centromere to telomere (189, 190). The gene and its paralogues MYBL1 (also known as A-MYB, on 8q13.1) and MYBL2 (also known as B-MYB, on 20q13.12) compose the MYB family of transcription factors which play important roles in cell growth, differentiation, and apoptosis (191-193). MYB regulates hematopoiesis, is crucial for colon development in murine animals, and is required for the proliferation of neural progenitor cells and maintenance of the neural stem cell niche (189, 193-196). Because MYB is involved in many malignancies such as leukemias and solid cancers of breast, colon, and brain, it has been considered an attractive target for anti-tumor therapy (193, 197, 198).
The QKI, KH domain containing RNA binding (QKI) gene, which codes for a protein that regulates pre-mRNA splicing, export of mRNAs from the nucleus, protein translation, and mRNA stability, maps on 6q26 and is transcribed from centromere to telomere (199-201). In 2014, Roth et al. (202) used high-resolution SNP array methodology to detect, in a pediatric ganglioglioma, a 30 Mbp deletion in 6q23.3-26 with the proximal breakpoint in the last intron of MYB and the distal one within the QKI gene. They proposed that the result of this deletion would be a MYB-QKI fusion gene, a chimera that had previously been reported in a pediatric low-grade glioma (203). The MYB-QKI fusion gene was subsequently found to characterize angiocentric gliomas (204-207). Interstitial deletion as a mechanism for the generation of the MYB-QKI fusion was reported in two of the studies (204-207).
Chromosome 7
The B-Raf proto-oncogene, serine/threonine kinase (BRAF) gene maps to 7q34 and is transcribed from telomere to centromere (208-210). It codes for a member of the RAF family of serine/threonine protein kinases which is involved in regulating the MAP kinase/ERK signaling pathway and affects cell division, differentiation, and secretion (211-214).
Mutations in BRAF, most commonly the V600E mutation, have been found in many malignancies such as melanoma, colorectal cancer, thyroid carcinoma, non-small cell lung carcinoma, hairy cell leukemia, non-Hodgkin lymphoma, and adenocarcinoma of lung (214-216). The mutations play a fundamental role in cancer development. They constitutively activate BRAF resulting in an over-performing RAF-MEK-ERK signaling cascade, promotion of cell proliferation and survival, and inhibition of apoptosis (214-216). The identification and characterization of pathogenic BRAF mutations have led to the development of BRAF kinase inhibitors used to treat patients whose cancers carry this particular genetic abnormality (214, 215, 217, 218).
BRAF chimeras have also been reported (12). In Mitelman’s Database of Chromosome Aberrations and Gene Fusions in Cancer (updated October 15, 2020), 95 BRAF chimeras were registered with 30 of them involving a partner gene in 7q.
Using aCGH, Cin et al. (219) found in three pilocytic astrocytomas a 2.5 Mbp interstitial deletion in chromosome band 7q34. The deletion led to in-frame fusion of the currently uncharacterized gene with the name “family with sequence similarity 131 member B” (FAM131B) with BRAF. The chimeric FAM131B-BRAF protein was a constitutively active kinase with MEK phosphorylation potential and transforming activity in vitro (219). Subsequent studies confirmed the existence of a submicroscopic interstitial deletion in 7q34 and the recurrent generation of a FAM131B-BRAF chimeric gene in pilocytic astrocytomas (202, 206, 220, 221).
Chromosome 8
The gene with the name “hes related family bHLH transcription factor with YRPW motif 1” (HEY1) maps on 8q21.13, is transcribed from telomere to centromere, and codes for a nuclear protein belonging to the hairy and enhancer of split-related (HESR) family of basic helix-loop-helix (bHLH)-type transcriptional repressors (222-225). The nuclear receptor coactivator 2 (NCOA2) gene maps on 8q13.3, is also transcribed from telomere to centromere, and codes for a transcriptional coactivator of nuclear hormone receptors (226-229). A HEY1-NCOA2 fusion gene has been reported to be pathognomonic for mesenchymal chondrosarcoma (230-236). SNP array analyses of a few such chondrosarcomas indicated an interstitial deletion as the cause of the HEY1-NCOA2 chimeric gene (221, 231).
The pleomorphic adenoma gene 1 (PLAG1) maps to 8q12.1, is transcribed from telomere to centromere, and codes for a zinc finger transcription factor (237-240). PLAG1 spans 50 kbp and contains 5 exons, the first 3 of which are untranslated (NCBI reference: NM_002655.3) (237, 241). PLAG1 was initially found to be rearranged in pleomorphic adenomas carrying a t(3;8)(p22;q12) translocation which led to its fusion as a 3’-end partner with the catenin beta 1 (CTNNB1) gene from 3p22.1 (237). Subsequently, various PLAG1-fusion genes were found in pleomorphic adenomas of the salivary glands, lipoblastomas, as well as other tumors (40-42, 242-246). In PLAG1 chimeras, the two fusion partner genes exchange their promoters and the 5’-end untranslated exons. Consequently, the expression of PLAG1 is controlled and regulated by the fusion partner gene promoter. The PLAG1 gene is either overexpressed or activated, which results in deregulation of its target genes and thus leads to tumor development (240, 247-251).
The hyaluronan synthase 2 (HAS2) gene maps on 8q24.13, is transcribed from telomere to centromere, and codes for the isoform 2 of hyaluronan synthase (252-256). HAS2 spans 29 kbp and has 4 exons, the first of which is untranslated (NCBI reference: NM_005328.3). In 2000, a recurrent HAS2-PLAG1 fusion gene was detected in three lipoblastomas, two of which had del(8)(q12q24) and the third a ring chromosome 8 (244). The genomic breakpoints were in introns 1 of both HAS2 and PLAG1, and in the chimeric HAS2-PLAG1 transcripts, the untranslated exon 1 of HAS2 fused to either exon 2 or exon 3 of PLAG1 (244). Thus, the HAS2-PLAG1 fusion gene was the result of a 65.5 Mbp interstitial del(8)(q12q24) deletion. Subsequent reports on lipoblastomas confirmed that the HAS2-PLAG1 fusion resulted from a del(8)(q12q24) (257-259).
The transcriptional repressor GATA binding 1 (TRPS1) gene maps on 8q23.3, is transcribed from telomere to centromere, and codes for a transcription factor that represses GATA-regulated genes and binds to a dynein light-chain protein (260). TRPS1 spans 260 kbp and has seven exons, the first of which is untranslated (NCBI reference: NM_014112.5). Chimeric TRPS1-PLAG1 transcripts in which exon 1 of TRPS1 was fused to exon 2 or exon 3 of PLAG1 were reported in soft tissue myoepithelial tumor, uterine myxoid leiomyosarcoma, and chondroid syringoma (40-42). G-banding analysis of the chondroid syringoma revealed an interstitial deletion, del(8)(q12q23) (Figure 2A). aCGH examination confirmed the deletion and showed that it started in intron 1 of PLAG1 and ended in exon 1 of TRPS1 (Figure 2B). RT-PCR (Figure 2C) and Sanger sequencing (Figure 2D) confirmed the presence of the TRPS1-PLAG1 fusion transcripts. FISH analysis on metaphase spreads showed that the TRPS1-PLAG1 fusion gene was on the del(8)(q12q23) chromosome (Figure 2E). Thus, both the aCGH and karyotyping data indicated that a TRPS1-PLAG1 fusion gene had been formed as the result of a deletion (42).
The N-myc downstream regulated 1 gene (NDRG1) maps to 8q24.22, is transcribed from telomere to centromere and codes for a cytoplasmic protein involved in stress and hormonal responses, cell growth, and differentiation (261-264). The NDRG1 gene spans 60 kbp and has sixteen exons of which the first is untranslated (NCBI reference: NM_006096.4). A chimeric NDRG1-PLAG1 transcript in which exon 1 of NDRG1 was fused to exon 3 of PLAG1 was found in a chondroid syringoma (42). FISH analysis showed that the NDRG1-PLAG1 chimeric gene was on a ring chromosome 8. No reciprocal PLAG1-NDRG1 chimeric gene was seen. The data indicated that an interstitial deletion had caused the NDRG1-PLAG1 chimera (42).
Chromosome 9
A fusion of the SET nuclear proto-oncogene (SET) with the nucleoporin 214 (NUP214) gene, also known as CAN, was discovered by von Lindern et al. in an acute undifferentiated leukemia with normal karyotype. The discovery was made while they were looking for the DEK-NUP214 (alias DEK-CAN) fusion gene generated by t(6;9)(p22;q34) in acute myeloid leukemias (265-267).
SET and NUP214 map on 9q34.11 and 9q34.13, respectively, and are both transcribed from centromere to telomere. SET codes for a nuclear protein which inhibits both histone acetyltransferase and demethylation of DNA (268, 269), whereas NUP214 codes for a nuclear envelope protein which is a subunit of the nuclear pore complex (270). The SET-NUP214 protein is found within the nucleus. It causes disturbed intracellular localization of the chromosomal maintenance 1 (CRM1) protein that facilitates transport of RNA and protein across the nuclear membrane into the cytoplasm (271). As a consequence, disruption of the nuclear export system occurs. Recruitment of the SET-NUP214 protein onto HOX gene clusters leads to aberrant expression of HOX genes in leukemic cells (271, 272). Expression of SET-NUP214 in transgenic mice was shown to block hematopoietic differentiation (273).
In 2006, Rosati et al. reported that a 2.5 Mbp deletion generated a SET-NUP214 fusion in an AML patient (274). Subsequent studies confirmed that the submicroscopic deletion did indeed lead to SET-NUP214 (alias SET-CAN) chimeras in leukemias (274-281).
The SET-NUP214 chimera has been detected in AML as well as in undifferentiated acute leukemia (AUL) and B- and T-differentiated lymphoblastic leukemias (B-ALL and T-ALL). Its overall frequency in T-ALL is 3-8 % (275, 277, 282). SET-NUP214 is rare in pediatric T-ALL but was found in as many as 13 % of adult T-ALLs (60, 282). In a recent study of 24 patients whose leukemic cells carried a SET-NUP214 and who had undergone allogeneic hematopoietic stem cell transplantation, those who expressed SET-NUP214 after transplantation fared badly (283).
Chromosome 10
Vesicle transport through interaction with t-SNAREs 1A (VTI1A) and transcription factor 7 like 2 (TCF7L2) are neighboring genes in 10q25.2-25.3, separated by 130 kbp. Both are transcribed from centromere to telomere (284). The VTI1A gene codes for a soluble N-ethylmaleimide-sensitive fusion protein-attachment protein receptor that is active in intracellular trafficking (285, 286). The TCF7L2 gene codes for a high mobility group (HMG) box-containing transcription factor that plays a key role in the Wnt signaling pathway (287, 288). Although several TCF7L2 tissue specific splice variants have been found, all of them code for a protein which has an N-terminal beta-catenin (CTNNB1)-binding domain and a HMG-box region (287-289).
Genomic sequencing of colorectal adenocarcinomas identified a 540 kbp deletion starting in intron 2 of the VTI1A gene and ending in intron 3 of the TCF7L2 gene, thus generating a VTI1A-TCF7L2 chimera which is transcribed in-frame and translated to a chimeric protein lacking the CTNNB1-binding domain of TCF7L2 (284). In the first study, the chimeric VTI1A-TCF7L2 gene was present in 3 % of the examined colorectal carcinomas (284). Later, Nome et al. (290) detected the VTI1A-TCF7L2 fusion transcript in 42 % of colorectal cancers but also in 28 % of normal colonic mucosa samples as well as in 25 % of normal tissue samples taken from various other anatomical sites. They also detected seven different splice variants of the VTI1A-TCF7L2 transcript (290). These data indicate that VTI1A-TCF7L2 is specific neither for cancer nor for cells emanating from the large bowel. Nevertheless, functional studies of the VTI1A-TCF7L2 chimeric protein (TCF7L2 is also known as TCF4) have shown that it acts as a dominant negative regulator of the Wnt signaling pathway, and that its transcription is activated by CDX2 (291). It is possible that it plays a pathogenetic role in cancer in spite of its lack of specificity.
Chromosome 11
The histone-lysine N-methyltransferase 2A gene (KMT2A, also known as MLL) maps to 11q23 and is transcribed from centromere to telomere. It encodes a transcriptional coactivator with multiple functional motifs and domains, among them a menin-binding motif at the amino-terminus, DNA-binding AT hooks, a cysteine-rich CXXC domain, plant homeodomain finger motifs, a bromodomain, a transactivation domain, and a SET domain at the carboxyl-terminus responsible for histone H3 lysine 4 (H3K4) methyltransferase activity (292-297). KMT2A is known to recombine with more than 100 different partners in hematologic malignancies and solid tumors, with most of the fusions coding for chimeric proteins (12). All KMT2A-chimeric proteins retain the menin-binding motif, the DNA-binding AT hooks, and the CXXC domain, indicating that these elements are essential for the transformation potential of the fusion proteins (282, 298-300).
Fusions of KMT2A with three genes - Rho guanine nucleotide exchange factor 12 (ARHGEF12), Casitas B-lineage lymphoma proto-oncogene (CBL), and decapping enzyme scavenger (DCPS) - were found to result from interstitial deletions in various hematologic malignancies (Table I) (147, 282, 298-305).
KMT2A-ARHGEF12 fusion is brought about by a 2 Mbp deletion stretching from the major breakpoint cluster region of KMT2A, which spans from exon 7 to exon 13, to intron 11 or 13 of ARHGEF12 (Figure 3) (299-301, 304, 305). The result is an in-frame KMT2A-ARHGEF12 chimeric transcript that gives rise to a protein composed of the KMT2A amino-terminus and the ARHGEF12 carboxyl-terminus (299-301, 304, 305). So far, seven cases with KMT2A-ARHGEF12 fusion have been reported: three AMLs, three B-ALLs, and one high-grade B-cell lymphoma (147, 299-305). Figure 3 presents, in brief, our results on identification of a KMT2A-ARHGEF12 fusion gene generated by a therapy-induced interstitial deletion in subband 11q23.3 in a child treated for acute myeloid leukemia (301). aCGH detects a deletion which starts in the KMT2A gene and ends in the ARHGEF12 gene (Figure 3A). The deletion is also confirmed by FISH (Figure 3B). Finally, molecular methodologies (genomic PCR and Sanger sequencing of the PCR-amplified fragments) show that an intronic sequence of KMT2A fuses to an intronic sequence of ARHGEF12, generating a chimeric KMT2A-ARHGEF12 gene (Figure 3C).
A KMT2A-CBL fusion is generated by an 800 kbp deletion starting within KMT2A and ending in the CBL gene (282, 298-300). It gives rise to an in-frame KMT2A-CBL chimeric transcript that translates into a chimeric protein. Up to now, KMT2A-CBL fusion has been described in two AMLs and one T-lineage ALL (282, 298-300).
Mayer et al. (306) described an AML patient with a del(11)(q23) in the diagnostic karyotype. Detailed investigation showed that the leukemic cells carried a 7.8 Mbp interstitial deletion which fused a genomic sequence from intron 8 of KMT2A with an intergenic sequence 7.2 kbp upstream of the DCPS gene. DCPS maps on 11q24.2, 10 kbp distal to TIRAP, and is transcribed, as is KMT2A, from centromere to telomere (306). At the transcription level, the deletion results in in-frame fusion of exon 8 of KMT2A with exon 2 of the DCPS gene (306).
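What "in-frame" means for the KMT2A fusions described above can be illustrated with a minimal sketch. It assumes that a junction preserves the reading frame of the 3' partner when the number of coding nucleotides retained from the 5' partner is congruent, modulo 3, to the number of coding nucleotides that precede the retained exon in the native 3' transcript. The function name and all exon lengths are hypothetical and chosen only for illustration; they are not the real exon sizes of KMT2A, ARHGEF12, CBL, or DCPS.

```python
from typing import Sequence


def is_in_frame(five_prime_coding_exons: Sequence[int],
                three_prime_skipped_coding_exons: Sequence[int]) -> bool:
    """Check whether an exon-exon junction keeps the 3' partner in frame.

    five_prime_coding_exons: coding lengths (nt) of the exons retained
        from the 5' partner, up to the junction.
    three_prime_skipped_coding_exons: coding lengths (nt) of the exons of
        the 3' partner that are lost, i.e. that precede the first retained
        exon in its native transcript.
    """
    retained = sum(five_prime_coding_exons)
    replaced = sum(three_prime_skipped_coding_exons)
    # The downstream codons are read as in the native transcript only if
    # the upstream coding length is unchanged modulo 3.
    return (retained - replaced) % 3 == 0


# Hypothetical exon lengths, for illustration only.
head = [120, 95, 203]               # 418 nt retained from the 5' partner
in_frame_tail = [88, 150]           # 238 nt lost from the 3' partner
out_of_frame_tail = [88, 150, 61]   # 299 nt lost from the 3' partner

print(is_in_frame(head, in_frame_tail))
print(is_in_frame(head, out_of_frame_tail))
```

The sketch covers only the coding-to-coding case. Fusions in which an untranslated first exon is joined to the complete coding region of the partner gene act through promoter exchange, and the frame condition does not apply to them.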
The forkhead box R1 (FOXR1) gene maps to 11q23.3 (chr11:118,971,761-119,018,638), is transcribed from centromere to telomere, and codes for a member of the forkhead box (FOX) family of transcription factors which are expressed in the testis, predominantly in spermatogonia and meiotic spermatocytes (307, 308). Santo et al. (309) identified interstitial microdeletions activating the FOXR1 gene in three neuroblastomas. In two of them, a 500 kbp deletion between intron 1 of KMT2A and a region upstream of the FOXR1 gene resulted in a KMT2A-FOXR1 chimeric transcript in which the entire coding region of FOXR1 was fused to exon 1 of KMT2A. In the third neuroblastoma, a 1.9 Mbp deletion within 11q23.3, starting within the platelet activating factor acetylhydrolase 1b catalytic subunit 2 (PAFAH1B2) gene and ending just upstream of FOXR1, resulted in two PAFAH1B2-FOXR1 chimeric transcripts in which the entire coding region of FOXR1 was fused to exon 2 of PAFAH1B2. Thus, both KMT2A-FOXR1 and PAFAH1B2-FOXR1 resulted in FOXR1 expression (309).
Chromosome 19
In 2014, a 400 kbp submicroscopic deletion in 19p13.12 was found to fuse the DnaJ heat shock protein family (Hsp40) member B1 (DNAJB1) gene with the protein kinase cAMP-activated catalytic subunit alpha (PRKACA) gene in all fifteen examined cases of fibrolamellar hepatocellular carcinoma, a rare liver cancer (310). Both DNAJB1 and PRKACA are transcribed from centromere to telomere. Although the breakpoints were different in the examined cases, each deletion started either in intron 1 or exon 2 of DNAJB1 and ended in intron 1 of PRKACA. The resulting DNAJB1-PRKACA chimeric transcript thus comprised the first exon of DNAJB1 and exons 2-10 of PRKACA (310). The correlation between DNAJB1-PRKACA fusion gene formation and fibrolamellar hepatocellular carcinoma was quickly confirmed by other groups (311-316). Recently, the same fusion gene was reported to be recurrent also in intraductal oncocytic papillary neoplasms of the pancreas and bile ducts, cystic precursors to invasive carcinoma (317, 318).
The DNAJB1 gene codes for a member of the heat shock protein 40 (HSP40) family, whose members interact with HSP70s and are involved in numerous cellular processes such as refolding, interaction, and transport of proteins (319, 320). PRKACA codes for one of the catalytic subunits of protein kinase A (321, 322). The DNAJB1-PRKACA gene codes for a chimeric protein kinase with oncogenic potential (310, 315, 323, 324). Both the first exon of DNAJB1 and the kinase domain of PRKACA were required for tumorigenesis (324).
Chromosome 21
In 2005, Tomlins et al. (325) reported a recurrent fusion transcript of transmembrane serine protease 2 (TMPRSS2) with the E26 transformation-specific (ETS) related gene (ERG), resulting in strong overexpression of ERG, in prostate cancer. The TMPRSS2-ERG fusion transcript was quickly confirmed by other groups and was found to be present in at least 40 % of prostate cancers (see below) and 20 % of high-grade prostatic intraepithelial neoplasia (326-331). The TMPRSS2 gene maps on 21q22.3, is transcribed from telomere to centromere, and codes for a type II transmembrane serine protease (332-334) which in prostate cancer is regulated by androgen (335, 336). The ERG gene maps on 21q22.2, 3.1 Mbp centromeric (proximal) to TMPRSS2. It is transcribed from telomere to centromere and codes for a member of the ETS family of transcription factors (337-339).
FISH and aCGH analyses show that the TMPRSS2-ERG fusion gene is generated by an approximately 3.0 Mbp interstitial deletion which starts in ERG and ends in TMPRSS2, by translocation between the two chromosomes 21, or by microdeletion and concurrent translocation (326, 327, 329-331, 340-344). Roughly 40 % to 60 % of TMPRSS2-ERG fusion genes in patients with prostate cancer are generated by deletions (345, 346). Furthermore, prostate cancer patients whose tumor cells have a TMPRSS2-ERG fusion stemming from deletion seem to have a worse prognosis than those with a fusion resulting from translocation (346, 347). The 3 Mbp region between ERG and TMPRSS2 contains many genes which are involved in cancer and may function as tumor suppressor genes. The fact that the interstitial deletion which generates the TMPRSS2-ERG fusion gene simultaneously results in haploinsufficiency for these genes may explain the clinical difference. In a murine model, Linn et al. (348) showed that only mice lacking the interstitial region developed prostate adenocarcinoma marked by poor differentiation and epithelial-to-mesenchymal transition.
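The gene pairs reviewed here share a simple geometry: both partners lie on the same chromosome arm and are transcribed in the same direction, so that an interstitial deletion with one breakpoint in each gene joins the head of the transcriptionally upstream gene to the tail of the downstream one. The following minimal sketch expresses that reasoning. The class and function names are hypothetical, and the coordinates are placeholders that reflect only the relative order and approximate separation stated in the text, not real genome positions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    start: float  # placeholder position (Mbp), increasing from pter to qter
    end: float
    strand: str   # "+" or "-" relative to that coordinate axis


def strand_from_direction(arm: str, direction: str) -> str:
    """Convert 'cen->tel' or 'tel->cen' on arm 'p' or 'q' into a strand."""
    if arm not in ("p", "q") or direction not in ("cen->tel", "tel->cen"):
        raise ValueError("arm must be 'p' or 'q'; direction 'cen->tel' or 'tel->cen'")
    # On the q arm, centromere-to-telomere runs towards qter; on the p arm
    # it runs towards pter, i.e. against the coordinate axis.
    runs_toward_qter = (arm == "q") == (direction == "cen->tel")
    return "+" if runs_toward_qter else "-"


def fusion_from_deletion(a: Gene, b: Gene) -> Optional[Tuple[str, str]]:
    """Return (5' partner, 3' partner) for a deletion with one breakpoint
    in each gene, or None if a simple deletion cannot give a head-to-tail
    fusion (opposite strands or overlapping genes)."""
    if a.strand != b.strand:
        return None
    low, high = sorted((a, b), key=lambda g: g.start)
    if low.end >= high.start:
        return None
    # The transcriptionally upstream gene keeps its 5' end.
    return (low.name, high.name) if a.strand == "+" else (high.name, low.name)


# Placeholder coordinates; only order and orientation are taken from the text.
minus_q = strand_from_direction("q", "tel->cen")
erg = Gene("ERG", 0.0, 0.3, minus_q)
tmprss2 = Gene("TMPRSS2", 3.1, 3.2, minus_q)
chr21_fusion = fusion_from_deletion(erg, tmprss2)

plus_q = strand_from_direction("q", "cen->tel")
set_gene = Gene("SET", 0.0, 0.1, plus_q)
nup214 = Gene("NUP214", 2.5, 2.6, plus_q)
chr9_fusion = fusion_from_deletion(set_gene, nup214)
```

Under these assumptions, the sketch returns TMPRSS2 as the 5' partner of ERG and SET as the 5' partner of NUP214, in agreement with the fusions described in the text. It does not model translocations, inversions, or more complex rearrangements.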
Chromosome X/Y
The genes cytokine receptor like factor 2 (CRLF2, also known as TSLPR) and P2Y receptor family member 8 (P2RY8) map to the pseudoautosomal regions Xp22.33 and Yp11.2, are transcribed from centromere to telomere, and are separated by a 250 kbp genomic region (349-353). CRLF2 codes for a receptor for thymic stromal lymphopoietin (TSLP) (349-352). CRLF2 together with interleukin 7 receptor (IL7R) and TSLP form the TSLPR complex which is capable of activating multiple signal transduction pathways, among them the JAK/STAT pathway and the PI-3 kinase pathway (354-356).
P2RY8 codes for a member of the family of G-protein coupled receptors (357-359). The P2RY8 protein together with its ligand, S-geranylgeranyl-L-glutathione, and the enzyme gamma-glutamyltransferase-5, which metabolizes S-geranylgeranyl-L-glutathione to a form that does not activate the P2RY8 receptor, promote confinement of B-cells in germinal centers (357-359).
In 2009, two groups reported that in B-progenitor ALL a 300 kbp interstitial deletion within the pseudoautosomal region Xp22.33/Yp11.2 juxtaposed the first, noncoding exon of P2RY8 with the coding region of CRLF2 resulting in overexpression of CRLF2 (360, 361). The P2RY8-CRLF2 fusion was found in 5-7 % of patients with B-progenitor ALL but in more than 50 % of B-ALL patients with Down syndrome (360-365). The P2RY8-CRLF2 fusion could be both an early and a clearly secondary genomic event in B-ALL development, making its role in leukemogenesis all the more intriguing (363, 366, 367).
Conclusion
Although it may seem more likely that fusion genes or activated oncogenes are mainly caused by balanced genomic rearrangements, and although the early history of fusion gene detection in cancer apparently corroborated this view, we show here that interstitial chromosomal deletions are not an uncommon mechanism for the formation of similar fusion genes. Most of these deletions are below the detection level of chromosome banding methodologies and, hence, were detected using other techniques, including aCGH and high throughput sequencing. The detected interstitial deletions/fusion genes are not restricted to one or only a few chromosomes or a single type of cancer; instead, they have been found across almost the entire genome and in various neoplasias. Their detection has significantly improved our understanding of tumorigenesis and leukemogenesis, and they are increasingly used for diagnosis and classification of neoplasms, prognostication, and as targets for molecular therapy. As more neoplasms are analyzed, especially as high throughput sequencing is increasingly relied on in laboratory diagnostic routines, even more such interstitial deletions/fusion genes are likely to be found, which will have a significant impact both clinically and scientifically. The challenge in this context, however, is to apply proper verification/falsification measures to all new discoveries so that the field does not become swamped by data of questionable significance.
Acknowledgements
This work was supported by grants from Radiumhospitalets Legater.
Footnotes
Authors’ Contributions
Both authors (IP and SH) wrote the manuscript.
This article is freely accessible online.
Conflicts of Interest
The Authors declare that they have no potential conflicts of interest with regards to this study.
- Received February 26, 2021.
- Revision received March 15, 2021.
- Accepted March 16, 2021.
- Copyright© 2021, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved