Abstract
Chromosomal translocations in cancer as well as benign neoplasias typically lead to the formation of fusion genes. Such genes may encode chimeric proteins when two protein-coding regions fuse in-frame, or they may result in deregulation of genes via promoter swapping or translocation of the gene into the vicinity of a highly active regulatory element. A less studied consequence of chromosomal translocations is the fusion of two breakpoint genes resulting in an out-of-frame chimera. The breaks then occur in one or both protein-coding regions forming a stop codon in the chimeric transcript shortly after the fusion point. Though the latter genetic events and mechanisms at first awoke little research interest, careful investigations have established them as neither rare nor inconsequential. In the present work, we review and discuss the truncation of genes in neoplastic cells resulting from chromosomal rearrangements, especially from seemingly balanced translocations.
Translocations are genetic rearrangements in which chromosomes exchange material. Translocations may be balanced (no net loss or gain of genetic material occurs) or unbalanced. They sometimes involve more than two chromosomes in chains of exchanges or are complex in even more intricate ways, and they may coexist with additional numerical as well as structural genetic aberrations such as inversions, insertions, and deletions.
Acquired chromosome translocations are common in neoplasias, benign as well as malignant, testifying to their pathogenetic importance in such processes (1, 2). They may be found as sole cytogenetic aberrations or as a part of a complex karyotype, in which case they may be secondary cytogenetic events. In the last update (June 6, 2022) of the “Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer”, 72,718 karyotypically abnormal cases were listed, of which 40,981 (56%) had one or more translocations (2).
Chromosomal translocations in cancer may lead to fusion genes when the breakpoints are in gene loci; in fact, translocations are the most common mechanism whereby such hybrids are generated (1, 3). The swapping of chromosome segments results in joining of material from the loci in such a manner that a chimera is formed consisting of the head (5′-end) of one gene and the tail (3′-end) of the other. The new chimera acts as a single unit affecting various cellular processes, and it may be tumorigenic (1). The chimeric gene may encode a protein when two protein-coding regions are fused in-frame. The resulting protein has functional domains derived from each of the two original proteins and may be of importance early in cancer development (3-6). A chimera may also demonstrate promoter swapping when the translocation breakpoints occur in the 5′, non-coding end of the genes. This leads to deregulation of the involved genes, again possibly resulting in oncogenesis (7-15). In B- and T-lineage lymphatic leukemias and lymphomas, one partner of the fusion gene typically codes for an immunoglobulin or T-cell receptor; the other partner gene becomes quantitatively deregulated when it comes under the control of elements whose normal function it is to maintain steady transcription of the immunoglobulin and/or T-cell receptor genes (1). The role of chromosomal translocations in generating the above-mentioned types of fusion gene was unraveled in the 1980s and has been repeatedly and extensively reviewed (1-6, 16-22).
Clinically, the detection of chromosome translocations leading to the generation of fusion genes may play a key role in the diagnosis and sub-classification of cancers, may have prognostic significance, and the generated fusion genes (or their chimeric proteins) may even be the target for molecular therapies (1). t(9;22)(q34;q11) is the classic example: The translocation is pathognomonic for chronic myeloid leukemia (and Ph-positive acute leukemias), results in fusion of BCR activator of RhoGEF and GTPase (BCR on 22q11) with the ABL proto-oncogene 1, non-receptor tyrosine kinase (ABL1 on 9q34), and codes for a chimeric BCR::ABL1 tyrosine kinase which can be blocked by the tyrosine kinase inhibitor imatinib mesylate (23, 24).
Apart from translocations, other structural chromosome rearrangements such as deletions (25), inversions (26, 27), insertions (28-31), and tandem duplications (32) may also generate highly consequential fusion genes in the oncological context. For simplicity, all of these are included under the term “chromosomal translocations” in the present text.
A much less studied consequence of chromosomal translocations in cancer is truncation of genes and the expression of their aberrant truncated proteins. The breakpoints occur in protein-coding regions but a stop codon is formed after the fusion point in the resulting transcripts. Thus, a translocation may join the 5′-end coding part of a gene with an intergenic genomic sequence, it may fuse the coding part of a 5′-end gene to the antisense strand (most often intronic) of the 3′-partner gene, or it may generate out-of-frame fusion transcripts if the reading frame of the 3′-partner gene is shifted and a stop codon formed after the fusion point in the chimeric transcript. These genetic events originally came across as only marginally important, but careful investigations have shown that gene abrogation or the generation of out-of-frame chimeras for genes such as colony stimulating factor 1 (CSF1), ETS variant transcription factor 6 (ETV6), Fos proto-oncogene, AP-1 transcription factor subunit (FOS), high mobility group AT-hook 2 (HMGA2), and RUNX family transcription factor 1 (RUNX1) is not rare (33-59). In addition, functional studies have shown that truncated genes can act oncogenically (60-66). To the best of our knowledge, such events have not been extensively reviewed, which is why we here summarize existing knowledge and thinking about the role that gene truncations resulting from balanced chromosome rearrangements play in tumorigenesis.
Methodological considerations. Up until and through the 1990s, detection of rearranged/fused genes in cancer almost always began with a discovery made by chromosome banding analysis of neoplastic cells. A structural chromosome aberration would be found with breakpoints mapping to distinguishable bands on the rearranged chromosome(s). The second step towards finding a pathogenetically important gene fusion brought about by cytogenetic changes mostly involved investigation of abnormal metaphase plates by means of fluorescence in situ hybridization (FISH) techniques using probes such as yeast (YAC), bacterial (BAC), and P1 (PAC) artificial chromosomes, fosmids or cosmids, looking for the smallest possible probe that spanned the chromosomal break. A third step would then follow in which molecular cloning techniques were used, typically Southern blot and various types of polymerase chain reaction (PCR) amplification, that helped localize the breakpoint more precisely and identified the rearranged genes whose fusion was the result of the chromosomal rearrangement. The above-mentioned sequential procedure was laborious, but robust and reliable, and resulted in the identification of many “canonical” as well as “non-canonical” cancer-associated fusions (1). The final molecular product of a “canonical” fusion gene was a chimeric protein or the detection of promoter swapping between two genes (see above). In “non-canonical” fusion genes, the chimeric transcript did not produce any chimeric protein but rather a truncated peptide encoded by exons from the 5′-end partner gene that had been fused out-of-frame with a coding sequence, alternatively an intronic or antisense sequence, from the 3′-end partner gene. Shortly after the fusion point, a stop codon had been introduced. The sequence of the 3′-end partner gene might code for a novel amino acid string not present in the normal protein produced by the 3′-end partner gene.
By way of illustration, and using the above-mentioned step-by-step procedure on a case of pediatric pre-B acute lymphoblastic leukemia, a cryptic recombination of chromosome bands 12p13 and 12q13 was found resulting in fusion of the first two exons of ETV6 with a sequence from intron 1 of the bromodomain adjacent to zinc finger domain 2A gene (BAZ2A) (40). Likewise, examining an atypical chronic myeloid leukemia carrying a t(3;21)(q21;q22), Micci and co-workers (67) found that the coding sequence of the receptor-like tyrosine kinase (RYK) gene had been fused to sequences from chromosome 21 that included the ATP synthase peripheral stalk subunit OSCP (ATP5O) gene coding for a mitochondrial ATP synthase; the translocation led to truncation of the RYK gene.
However, sometimes genes are recombined through a chromosomal rearrangement without the generation of a functional gene fusion. The breakpoint on one of the chromosomes may occur within a gene locus whereas the other breakpoint lies in an intergenic region. An example of this phenomenon was the t(19;21)(q13;q22) chromosome translocation detected by Hromas and coworkers (68, 69) in a radiation-associated leukemia. The translocation resulted in recombination of exon 6 of RUNX1 (also known as AML1) in 21q22 with an intergenic sequence in 19q13 they called “AMP19” (GenBank sequence AY004251.2). The aberrant RUNX1::AMP19 fusion transcript produced a truncated RUNX1 protein containing only the DNA binding domain which, consequently, inhibited the function of normal RUNX1 (69).
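The effect of a frame shift at the fusion point, as described above, can be illustrated with a minimal sketch. The nucleotide sequences below are invented toy strings, not real gene sequences, and the function names are hypothetical; the code merely translates a chimeric transcript in the reading frame of the 5′-partner and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from position 0 until the first stop codon or the end of seq."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented 5'-partner coding segment (four complete codons: M A E K).
five_prime = "ATGGCTGAAAAG"
# Invented 3'-partner segment, written in its own native reading frame.
three_prime = "GGTCTGACCGATTAA"

# In-frame fusion: the 3'-partner is read in its native frame.
in_frame = translate(five_prime + three_prime)
# Out-of-frame fusion: the junction drops one nucleotide of the 3'-partner,
# so the 3'-partner is read in a shifted frame and a stop codon soon appears.
out_of_frame = translate(five_prime + three_prime[1:])

print(in_frame)      # chimeric peptide with residues from both partners
print(out_of_frame)  # 5'-partner residues plus a short novel tail
```

In this toy case the shifted frame contributes a single novel residue before a stop codon is met, mirroring the truncated peptides with short novel tails described for “non-canonical” fusions.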
The introduction of next generation sequencing (NGS, also called high throughput sequencing) has changed the preferred approach to the detection of fusion genes in neoplastic cells (70-74). Isolated RNA (and less frequently DNA) from tumor samples is massively parallel sequenced whereupon the data are analyzed for fusion transcripts/genes using specific algorithms resulting in the detection of many fusion transcripts/genes (74-76). But the NGS approach also has its issues, of which the considerable cost of performing extensive analyses is perhaps the least important. In a study detecting the recurrent ZC3H7B::BCOR fusion gene in endometrial stromal sarcomas, analysis of RNA sequencing data using the FusionMap algorithm revealed more than 1000 possible fusion transcripts (77). In other studies, the analysis of RNA sequencing data using specific programs “detected” a large number of fusion gene/transcript candidates but failed to find the pathogenetically essential CIC::DUX4 and KAT6A::CREBBP fusion genes in a small round cell tumor and an acute myeloid leukemia, respectively (78, 79). The above examples illustrate a profound problem when searching for pathogenetically interesting fusion transcripts by RNA sequencing: some, often many, fusion genes or transcripts detected by the available algorithms are false positives (80), whereas on other occasions false negatives are the result, i.e., the search programs fail to detect the fusion genes of interest. Admittedly, some of the detected fusion transcripts could reflect intra-tumor heterogeneity and clonal evolution shown at both the cytogenetic and molecular level (81, 82), but others are simply noise. One way to identify which of the suggested or detected fusion transcripts or rearranged genes are truly important would involve the examination (by RNA sequencing) of many samples from the same type of cancer, especially if this can be done together with examination of matched normal tissue samples (34, 35, 44, 83).
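The first strategy, examining many samples of the same tumor type together with matched normal tissue, amounts to a simple filter on the list of candidate fusions. The following is a minimal sketch under stated assumptions: the input format, the function name, the threshold of two tumor samples, and the fusion labels are all hypothetical and are not taken from any of the cited studies or programs.

```python
from collections import defaultdict


def recurrent_tumor_specific(calls, min_samples=2):
    """Keep fusions called in at least `min_samples` tumors and in no normal sample.

    `calls` is an iterable of (sample_id, tissue, fusion) tuples, where
    tissue is either "tumor" or "normal" (assumed input format).
    """
    tumor_samples = defaultdict(set)
    seen_in_normal = set()
    for sample_id, tissue, fusion in calls:
        if tissue == "normal":
            seen_in_normal.add(fusion)
        else:
            tumor_samples[fusion].add(sample_id)
    return sorted(
        fusion
        for fusion, samples in tumor_samples.items()
        if len(samples) >= min_samples and fusion not in seen_in_normal
    )


# Invented candidate calls; gene names are placeholders.
calls = [
    ("T1", "tumor", "GENE_A::GENE_B"),
    ("T2", "tumor", "GENE_A::GENE_B"),
    ("T1", "tumor", "GENE_C::GENE_D"),   # seen in one tumor only
    ("T2", "tumor", "GENE_E::GENE_F"),
    ("T3", "tumor", "GENE_E::GENE_F"),
    ("N1", "normal", "GENE_E::GENE_F"),  # also present in normal tissue
]

kept = recurrent_tumor_specific(calls)
print(kept)
```

Such a filter reduces noise but would also discard a genuinely pathogenetic fusion the first time it is seen, which is one reason why the combined cytogenetic approach is useful.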
A second method is preferred by research groups competent in cancer cytogenetics and combines the use of banding cytogenetic analysis of tumor cells with RNA sequencing (74). The rationale behind the latter, two-pronged approach is that a chromosomal rearrangement giving rise to a fusion gene is sometimes seen as the only cytogenetic aberration in which case it is assumed to represent a primary oncogenic event. Thus, karyotypic information is used to focus on fusion genes that map to chromosomal breakpoints. The reality of these gene fusions is then confirmed (or ruled out) using yet other methods such as FISH, PCR, and Sanger sequencing analyses (33, 45, 77, 84-88). This approach also has the advantage of not overlooking “non-canonical” fusion genes or rearranged genes that lead to truncated proteins (42, 51, 58, 89-94).
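The karyotype-guided step can be sketched as follows. This is an illustration only: the function names and data layout are hypothetical, the breakpoint bands 11q13 and 20q13 are assumed for the purpose of the example, and GENE_X and GENE_Y are invented placeholders. Only the map positions of ZMYND8 (20q13.12) and RELA (11q13.1) come from the text.

```python
def band_matches(gene_band, breakpoint_band):
    """True if the gene maps to the breakpoint band or to one of its subbands."""
    return gene_band == breakpoint_band or gene_band.startswith(breakpoint_band + ".")


def near_breakpoints(candidates, breakpoints, gene_bands):
    """Keep candidate fusions whose two partner genes both map to a breakpoint band."""
    kept = []
    for five, three in candidates:
        bands = (gene_bands.get(five), gene_bands.get(three))
        if all(
            band is not None and any(band_matches(band, bp) for bp in breakpoints)
            for band in bands
        ):
            kept.append((five, three))
    return kept


# Assumed breakpoint bands read from the karyotype (illustrative).
breakpoints = ["11q13", "20q13"]

# Chromosomal map positions of the candidate partner genes.
gene_bands = {
    "ZMYND8": "20q13.12",
    "RELA": "11q13.1",
    "GENE_X": "1p36.1",   # invented placeholder
    "GENE_Y": "20q13.2",  # invented placeholder
}

# Candidate fusions as (5'-partner, 3'-partner) pairs from RNA sequencing.
candidates = [("ZMYND8", "RELA"), ("GENE_X", "RELA"), ("GENE_X", "GENE_Y")]

focused = near_breakpoints(candidates, breakpoints, gene_bands)
print(focused)
```

The matching is deliberately crude: it works only when the karyotypic breakpoint is given at the same or a coarser resolution than the gene's map position, and any surviving candidate still needs confirmation by FISH, PCR, or Sanger sequencing.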
Most fusion genes known today were detected using high throughput sequencing technologies. In fact, most were found as fusion transcripts by RNA sequencing analysis whereupon they, by inference, were reported as fusion genes (72, 95-97). More often than not, no chromosome banding or other cytogenetic analysis, no FISH, array comparative genome hybridization (aCGH), single nucleotide polymorphism (SNP) array, Southern blot or other methodology was used to falsify or support this conclusion. As a consequence, no genome-level confirmation is at hand that fusion gene formation actually existed in these situations, i.e., there is no fool-proof evidence of structural DNA rearrangements leading to the fusion of two different genes. Prof. Mitelman states in his database that “chromosome abnormalities giving rise to gene fusions identified through RNA sequencing are by default designated as translocations (t), unless shown to arise by other types of chromosome rearrangements (del, dup, ins, inv)” (2). Thus, for neighboring genes transcribed with the same orientation - centromere to telomere or telomere to centromere - and seemingly generating chimeras, one cannot distinguish between a true fusion resulting from an interstitial deletion and a read-through chimeric transcript resulting from cis-splicing of the two neighboring genes. The necessity of using “old-fashioned” techniques in addition to NGS-type analyses in order to avoid being swamped by false positives can be gleaned from a recap of fairly well-known examples from leukemogenesis: A 90 kbp interstitial deletion in 1p33 was shown to cause fusion of the 5′-untranslated part of the STIL centriolar assembly protein (STIL) gene with the coding part of TAL bHLH transcription factor 1, erythroid differentiation factor (TAL1) (98, 99). In the STIL::TAL1 fusion gene, the expression of TAL1 is controlled by the STIL promoter causing excessive production of the TAL1 protein (98, 99). 
In this case there is no doubt that the transcriptional abnormality corresponds to a DNA-level change, something that was demonstrated long before massively parallel sequencing entered the scene.
The genes solute carrier family 45 member 3 (SLC45A3) and ETS Like 4 transcription factor (ELK4) are both transcribed from telomere to centromere and map 25 kbp apart on 1q32. A chimeric SLC45A3-ELK4 transcript in prostate cancer was found to be generated by cis-splicing between the two neighboring genes although no DNA-level recombination between them existed (95, 100-102). Similarly, the transcriptome-level fusion of the two neighboring genes ATP5L (also known as ATP5MG, official full name: ATP synthase, H+ transporting, mitochondrial Fo complex subunit G) with KMT2A (or MLL, official full name: lysine methyltransferase 2A), both of which map to chromosome band 11q23, could not be demonstrated to be the result of any detectable deletion between them. The ATP5L-KMT2A fusion transcript therefore most likely represents a read-through artefact (meaning that no DNA-level corollary exists to the transcriptome fusion transcript) of the two neighboring genes, which are both transcribed from centromere to telomere (103, 104). In the Mitelman database, both SLC45A3-ELK4 and ATP5L-KMT2A are nevertheless listed as fusion genes caused by the inferred default translocations t(1;1)(q32;q32) and t(11;11)(q23;q23), respectively (2). In these two situations and according to present-day knowledge, the altered transcriptional features of the tumor cells do not correspond to actual genomic changes.
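The ambiguity described above can be made explicit with a small heuristic that flags candidate chimeras between neighboring genes transcribed in the same orientation. This is a sketch under stated assumptions: the class and function names are hypothetical, the 100 kbp distance threshold is arbitrary, and the coordinates are invented so that the two genes lie 25 kbp apart as stated in the text (the minus strand is assumed to represent telomere-to-centromere transcription on 1q).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate


def possible_read_through(five, three, max_gap=100_000):
    """Flag a chimera whose partners are close neighbors on the same strand,
    with the 5'-partner lying upstream of the 3'-partner."""
    if five.chrom != three.chrom or five.strand != three.strand:
        return False
    if five.strand == "+":
        gap = three.start - five.end
    else:
        gap = five.start - three.end
    return 0 < gap <= max_gap


# Invented coordinates; only the 25 kbp spacing reflects the text.
slc45a3 = Gene("SLC45A3", "1", "-", 1_125_000, 1_150_000)
elk4 = Gene("ELK4", "1", "-", 1_000_000, 1_100_000)
other = Gene("GENE_X", "9", "-", 5_000_000, 5_050_000)  # invented placeholder

print(possible_read_through(slc45a3, elk4))   # neighbors, same orientation
print(possible_read_through(slc45a3, other))  # different chromosomes
```

A positive flag does not settle the question. An interstitial deletion such as the one behind STIL::TAL1 would be flagged in the same way, so DNA-level analysis is still needed to tell a true fusion from cis-splicing.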
Are gene truncations recurrent? Yoshihara et al. (105) studied 4366 primary tumor samples from 13 tumor types and identified 8695 tumor-specific fusion transcripts, out of which 2991 were in-frame and 2852 out-of-frame. Shlien et al. (106) examined 23 breast tumors and found a) 205 gene-to-gene rearrangements in the same transcriptional orientation resulting in 133 in-frame and 368 out-of-frame fusion transcripts, b) 171 gene-to-gene fusions in opposite transcriptional orientation of which 50 generated transcripts with the sense portion of the 5′-end gene fusing to the antisense strand of the 3′-partner gene, and c) 473 genomic rearrangements that joined the 3′-part of a gene to an intergenic region. They summarized their findings as “productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes drive more transcriptional disruption than previously suspected”. In a large study of 9000 primary tumors from 33 different types of cancer, Vellichirammal et al. (107) reported, in addition to many canonical fusion transcripts in which both partner genes were transcribed in the sense orientation, non-canonical fusion transcripts in which at least one partner gene was transcribed in antisense orientation. Non-canonical fusion transcripts were found in all tumor groups except glioblastomas, which carried only canonical fusion transcripts.
Johansson et al. (108) reviewed thousands of published fusion genes/transcripts extracted from the “Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer” concluding that most fusion genes detected by next generation sequencing reflected stochastic events of no pathogenetic significance. We see this as indicating that some of the database’s “fusion genes” actually represent artefacts attributable to bugs in the interpretation algorithms or even the technique itself. Johansson et al. further opined that there are fundamental differences between NGS-detected and “conventionally identified” fusion genes, i.e., chimeras identified or confirmed by cytogenetic methods (108).
As is apparent from the above summary, the actual status of newly reported fusion genes can be hard to establish. Some “fusions” may be noise, but at the same time it is also the case that what initially seemed to be a unique event, based on scrutiny of the relevant literature at the time of reporting, in some cases turns out to be but the first of a series of similar reports. Indeed, when a fusion gene’s actual existence at the DNA level has been confirmed cytogenetically, experience shows that a seemingly unique chromosome aberration targeting a neoplasia-relevant gene will also soon be seen in other neoplasms. The publication interval between the first and second case, and then the third and so on, could be a measure of the frequency/rarity of the aberration, although pure chance certainly plays a role, too.
By way of example, we recently described a case of therapy-related myeloid leukemia carrying a t(8;19)(p11;q13) chromosome translocation resulting in fusion of the lysine acetyltransferase 6A gene (KAT6A) with the leucine twenty homeobox gene (LEUTX) (109). The first leukemia with t(8;19) had been reported in 1988 (110). The translocation remained a fluke event until 1995 when a second case was reported carrying the same translocation (111). The third leukemia case with a seemingly identical t(8;19) was reported in 2008, 20 years after the first one and 13 years after the second (112). In 2014, six years later, the molecular consequences of the generation of the KAT6A::LEUTX fusion gene were reported (113). At present (after a time-span of 33 years), only seven leukemias carrying t(8;19)(p11;q13)/KAT6A::LEUTX can be found in the relevant cytogenetic literature (109, 114, 115). Thus, the t(8;19)(p11;q13)/KAT6A::LEUTX fusion gene comes across as a very rare, but nevertheless unquestionably recurrent, genetic aberration in leukemia, reported at a rate of only about 0.21 cases per year (7 cases in 33 years).
In 2013, we combined cytogenetics with RNA sequencing methodology to identify a fusion of the zinc finger MYND-type containing 8 (ZMYND8) gene, mapping on chromosome subband 20q13.12, with the RELA proto-oncogene, NF-kB subunit (RELA) gene that maps to 11q13.1, in a pediatric acute erythroid leukemia (116). In 2019, Iacobucci and coworkers, examining 159 acute erythroid leukemias using next generation sequencing methodologies, found the second case of pediatric acute erythroid leukemia with the same ZMYND8::RELA chimera, indicating that ZMYND8::RELA is a rare but recurrent fusion in pediatric acute erythroid leukemias (117).
Many chromosome translocations resulting in truncated or fused genes leading to out-of-frame transcripts have been reported only once. These events might turn out to be recurrent if and when the second case, and then the third et cetera, is reported, as for the leukemias with t(8;19)/KAT6A::LEUTX described between 1988 and 1995 and t(11;20)/ZMYND8::RELA between 2013 and 2019 (33, 37, 38, 40, 53, 67, 89-94, 118-121). In 1985, De Braekeleer and coworkers reported an acute lymphoblastic leukemia carrying a t(10;19)(q26;q13) as the sole cytogenetic change (122). Thirty-two years later, we described the second case of acute lymphoblastic leukemia with t(10;19)(q26;q13) (89). Molecular analysis demonstrated that the coding sequence of the FAM53B gene (family with sequence similarity 53 member B) from 10q26 was fused to a genomic sequence from 19q13 that mapped upstream of the solute carrier family 7 member 10 (SLC7A10) gene. Because no molecular analyses had been performed on the case from 1985 (122), we do not know whether the two reported translocations were identical at the molecular level. However, both patients had t(10;19)(q26;q13) as the only cytogenetic change, both were diagnosed with acute lymphoblastic leukemia, and both achieved complete hematologic and cytogenetic remission under antileukemic treatment (89, 122).
In 2019, our group reported the first genetically analyzed myoid hamartoma of the breast (123). The tumor had a t(5;12)(p13;q14) translocation as the sole karyotypic aberration leading to fusion of the HMGA2 gene from 12q14 with a sequence from chromosome subband 5p13.2. The data indicated that myoid hamartoma is a true neoplasm growing from a mutated mesenchymal stem cell capable of differentiating into smooth muscle cells (123). In 2022, we reported the genetic characterization of a mammary leiomyomatous tumor without signs of malignancy (124). Using cytogenetics, fluorescence in situ hybridization, RNA sequencing, reverse transcription-polymerase chain reaction, and Sanger sequencing methodologies, we showed that the tumor cells had an identical genetic profile to that found in the previously examined myoid breast hamartoma. We conclude that a chromosome translocation t(5;12)(p13;q14) resulting in fusion of HMGA2 with a sequence from chromosome subband 5p13.2 is recurrent in benign myoid neoplasms of the breast (124).
The recurrent t(2;12)(q35~37;q14~q15) chromosome translocation in lipomas results in fusion of HMGA2 with the atypical chemokine receptor 3 (ACKR3) gene (also known as RDC1) from 2q37.3. The functional impact of the fusion was assumed to be truncation of HMGA2 since ACKR3 contributes a stop codon one amino acid downstream of the fusion point (47, 125). Similarly, the recurrent chromosome translocation t(9;12)(p22;q14~q15) found in lipoma, pleomorphic adenoma, and myolipoma results in fusion of HMGA2 with the nuclear factor I B (NFIB) gene from 9p23-p22.3 (lipomas and pleomorphic adenomas) or with chromosome 9 open reading frame 92 (C9orf92) from 9p22, leading to HMGA2 truncation (50, 51, 126, 127). On the other hand, the recurrent t(4;12)(q27~28;q14~15) chromosomal rearrangement in lipoma results in fusion of HMGA2 not with a gene but with various intergenic sequences from chromosome subband 4q28.1, again leading to truncation of HMGA2 (46). The above examples illustrate how chromosome translocations resulting in truncation of a gene may be viewed as recurrent variations on the same pathogenetic theme. Indeed, the variety of chromosome aberrations, most of them translocations, targeting the HMGA2 gene locus in many types of neoplasia are the best-known examples of the fascinating genetic heterogeneity that may lie behind gene truncation as a tumorigenic mechanism (Table I) (128-130). Chromosomal rearrangements leading to truncations also target the CSF1 gene which maps to 1p13.3 in tenosynovial giant cell tumors (33-35), the collagen type II alpha 1 chain gene (COL2A1) which maps to 12q13.11 in chondrosarcomas (131, 132), FOS which maps to 14q24.3 in epithelioid hemangioma of the bone and osteoblastoma (42-45, 133), the trio Rho guanine nucleotide exchange factor (TRIO) gene which maps to 5p15.2 in liposarcomas and undifferentiated sarcomas (59, 134, 135), as well as several other loci.
Table I. Genomic aberrations giving rise to HMGA2 (on 12q14) truncation or in-frame fusions (denoted with *).
In hematologic neoplasias, chromosome rearrangements generating truncated or out-of-frame fusion transcripts have been frequently described to target RUNX1 in 21q22.12 (53, 55-58, 68, 69, 120, 136-142). In most cases, RUNX1 was truncated after exon 6 or 7 (Table II). Translocations generating truncations have also been frequently reported for ETV6 in 12p13.2 (Table III) (36-41, 143-145). In all examined cases, the truncation of ETV6 occurred after exon 1 or 2 of the gene (Table III).
Table II. Genomic aberrations giving rise to truncated RUNX1 (on 21q22).
Table III. Genomic aberrations giving rise to truncated ETV6 (on 12p13).
The lysine methyltransferase 2A (KMT2A, formerly known as MLL) gene on 11q23.3 also sometimes fuses with intergenic sequences of various chromosomes, giving rise to out-of-frame KMT2A fusion transcripts (146-149). Based on the MLL recombinome data, six chromosome rearrangements exist that result in fusion of the 5′-part of KMT2A with genomic sequences from 1p13, 6q27, 9p13, 11q23, 11q24, and 21q22 (148, 149) whereas ten chromosome translocations result in out-of-frame KMT2A fusion genes (146-149).
Are gene truncations pathogenetic events (primary or secondary) or genetic noise? In cancer cytogenetics, the terms primary and secondary chromosome aberrations are frequently used to emphasize the sequential order in which neoplastic cells acquire genomic changes (150-152). Primary chromosome abnormalities are believed to be important in the transformation of susceptible somatic target cells and, hence, essential in the very establishment of a neoplasm. They may be, in fact often are, observed as sole cytogenetic changes and are not rarely phenotype-specific (150-152). Many gene truncations caused by primary chromosome aberrations, mainly translocations, are found in both solid tumors and leukemias; perhaps the most typical and yet variable among them involves the HMGA2 gene (47, 128-130, 153-161). Other examples of primary aberrations leading to gene truncations in solid tumor cytogenetics target CSF1 in 1p13.3 (33-35) in tenosynovial giant cell tumors and FOS in 14q24.3 in epithelioid hemangiomas of bone and osteoblastomas (42-45, 133).
The genotype-phenotype or cytogenetic-diagnostic specificity is far from absolute, however, inasmuch as the same primary chromosome abnormality is sometimes found in more than one tumor type indicating shared pathogenetic pathways. For example, the t(16;21)(p11;q22) translocation leading to fusion of the FUS RNA binding protein gene (FUS) from 16p11 with the ETS transcription factor ERG gene (ERG) from 21q22 to form a FUS::ERG hybrid gene on the der(21)t(16;21)(p11;q22) has been reported in acute leukemias (162-166) as well as Ewing sarcomas (167, 168). The chromosome translocation t(3;12)(q28;q14) which fuses HMGA2 with the LPP gene (in 3q28; the gene’s full official name is LIM domain containing preferred translocation partner in lipoma) has been reported in lipoma, chondroid hamartoma, and chondroma (154, 156, 169-171). Likewise, truncation of the TYRO3 protein tyrosine kinase (TYRO3) gene on 15q15.1 has been found in pediatric acute myeloid leukemia, where it resulted from a t(10;15)(p11;q15) (93), as well as in anaplastic ependymoma grade 3 (172). In the leukemia, an exonic sequence of TYRO3 was fused to an intergenic sequence from 10p11 (93). In the ependymoma, an exonic sequence of TYRO3 was fused out-of-frame with the kelch like family member 18 (KLHL18) gene from 3p21.31 (172). In both neoplasias, the putative truncated TYRO3 protein would contain the extracellular part, with its two immunoglobulin domains and two fibronectin type III domains, and the transmembrane domain. The protein lacks the catalytic domain of TYRO3, its autophosphorylation sites, and the carboxyl-terminal part, all of which are required to maintain TYRO3 stability (93, 172).
That a cytogenetic aberration is solitary does not necessarily mean that it was the initial genetic event in neoplastic transformation since submicroscopic genetic aberrations may also be present. For example, molecular and fluorescence in situ hybridization analyses of a pediatric acute myeloid leukemia (AML) carrying trisomy 4 as the only cytogenetically visible aberration revealed the additional presence of both an FLT3-ITD mutation and a cryptic t(6;21)(q25;q22) translocation resulting in fusion of exon 7 of RUNX1 with an intergenic sequence from 6q25 (58). Which of these three changes was the initial genetic event, let alone the most pathogenetically important one, remains unknown (58). On the other hand, in a patient with myeloproliferative bone marrow disease, 90% of the cytogenetically analyzed metaphases carried a t(12;22)(q14.3;q13.2) as the sole cytogenetic aberration. The aberration generated an HMGA2::EFCAB6 fusion gene (EFCAB6 is from 22q13; the official name is EF-hand calcium binding domain 6) resulting in HMGA2 truncation (173). An additional JAK2V617F mutation was also found but only in half of the cells examined, indicating that the t(12;22)(q14.3;q13.2) leading to HMGA2::EFCAB6 was indeed the primary genetic event in the neoplastic process (173).
The secondary chromosome aberrations referred to above are acquired after the primary ones and are probably of importance in tumor progression (150-152). They, too, come across as nonrandom genetic events that may be clinically important both diagnostically and prognostically (152, 174-185). By way of example, a patient with chronic myeloproliferative disorder carried a t(8;22)(p11;q11) leading to fusion of the BCR activator of RhoGEF and GTPase (BCR) and fibroblast growth factor receptor-1 (FGFR1) genes (55, 186). At the time of transformation of the myeloproliferative disorder to AML, the cells had acquired an additional t(9;21)(q34;q22) [the karyotype was now 46,XY,t(8;22)(p11;q11),t(9;21)(q34;q22)] (55). The t(9;21)(q34;q22) resulted in fusion of exon 4 of RUNX1 with repetitive sequences from chromosome 9 generating a truncated RUNX1 gene (55). In an acute leukemia characterized by the pathognomonic t(9;22)(q34;q11.2)/BCR::ABL1 aberration, an additional t(15;21)(q26.1;q22) was also detected that had resulted in fusion of RUNX1 with the antisense strand of the SV2B gene, leading to formation of out-of-frame RUNX1 fusion transcripts and the production of truncated RUNX1 isoforms (57). Finally, an AML patient carrying a t(5;18)(q35;q21) as the primary cytogenetic abnormality developed a t(3;12)(p23;p13) during disease progression resulting in ETV6 rearrangement (187).
The term cytogenetic noise is used for the background level of nonconsequential aberrations, seemingly distributed in a random manner throughout the genome, that is seen in some malignancies (150, 151). In contrast to primary and secondary cytogenetic aberrations that confer an evolutionary edge on the cells, and are detected in at least two cells (the minimum prerequisite for regarding them as clonal), noise aberrations are typically nonclonal. Only limited investigations on the oncogenic importance of such aberrations (if any) have been reported (188-192). Extensive cytogenetic instability has been described in elastofibromas (193, 194) whereas nonclonal telomeric associations are common in giant cell tumors of bone (195-197). No studies have focused on possible associations between nonclonal structural chromosome aberrations in cancer and fusion genes that may have been formed by them. However, some of the multiple fusion transcripts detected when analyzing RNA sequencing data using fusion transcript detection programs (FusionCatcher, TopHat, JAFFAL, et cetera) might be attributable to nonclonal structural chromosome aberrations. Single cell RNA/DNA sequencing methodology might be a useful tool to assess a possible relationship between nonclonal structural chromosome aberrations and fusion genes in cancer (198-204). Because, at least in theory, sequencing may detect fusion genes at the level of single cells, this approach should be useful in studies of intratumor heterogeneity, enabling the detection of fusion genes and possible alternative fusion transcript variants in even the smallest of clones (198-204).
Are gene truncations drivers?
In the era of next generation sequencing, acquired genetic abnormalities detected by this methodology are often referred to as either drivers or passengers. Driver mutations cause cancer (or at least neoplasia) inasmuch as they play a fundamental role in the transformation of a targeted somatic cell, whereas passenger mutations are randomly accumulated events that seemingly do not influence tumorigenesis one way or another (205-207). A third type of mutation, called “mini drivers”, has relatively weak tumor-promoting effects but is nevertheless clinically important at least to some degree (208, 209). By and large, driver mutations have the same status as what we prefer to call primary genomic aberrations.
Against this background, it is not surprising that evidence is mounting that some gene truncations are driver mutations in cancer. The best-known examples are the rearrangements leading to truncation of the HMGA2 gene, events referred to extensively above. HMGA2 consists of five exons and maps to chromosome subband 12q14.3 (chr12:65,824,483-65,966,291 in GRCh38/hg38 assembly) (153-156). The first three exons code for three domains which specifically bind to the minor groove of adenine-thymine (AT) rich DNA, hence the name AT-hook DNA-binding domains (155, 210-212). Exon 5 contains a long 3′ untranslated region (3′-UTR) that has seven microRNA let-7 sites and many other regulatory sequences and controls the expression of HMGA2 (Figure 1) (213-216).
The high mobility group AT-hook 2 (HMGA2) gene. The chromosome band location (top) and exon 5 of the gene (bottom) are shown. On exon 5, the Let7 positions are underlined. Additional microRNA target sites predicted by TargetScanHuman 7.2 are also shown. The colors indicate different classes of target sites: Purple is for 8mer, red for 7mer-m8 and blue for 7mer-A1.
Recombination of 12q14.3 with bands from almost every other chromosome, but resulting in truncation of HMGA2, has repeatedly been reported in various neoplasms (2, 129) (Table III). The molecular consequence is always the physical separation of the first three exons from the 3′-UTR of HMGA2. In vitro experiments showed that expression of truncated HMGA2 protein carrying the three AT-hook DNA-binding domains transforms mouse embryonic NIH3T3 fibroblasts (62), induces leiomyoma-like lesions in human myometrial cells (63), and promotes the growth of chondrocytes (64, 217). Transgenic mice producing a truncated form of HMGA2 develop benign mesenchymal tumors (65, 218).
In later years, other genes were also deemed to have driver potential when truncated through genomic rearrangements; they, too, have been at the forefront of pathogenetic interest for those who study the genomic or chromosomal abnormalities in neoplasia. In cells from tenosynovial giant cell tumors, various chromosome rearrangements, mainly translocations, often target and disrupt the CSF1 locus in 1p13.3 (chr1:109,910,849-109,930,992 in GRCh38/hg38 assembly) (219-227). CSF1 has four alternative transcripts (reference sequences NM_172210.3, NM_172212.3, NM_172211.4, and NM_000757.6) which differ only in exons 6 and 9 (Figure 2) (228, 229). The translation initiation codon is in exon 1 whereas the stop codon is in exon 8. CSF1 transcripts with reference sequences NM_172211.4 and NM_000757.6 have the same untranslated exon 9, while transcript NM_172212.3 has a different exon 9 and transcript NM_172210.3 lacks the untranslated exon 9 (Figure 2). No information is available about exon 9 of the CSF1 transcript with reference sequence NM_172212.3, whereas a lot is known about exon 9 of the two CSF1 transcripts with reference sequences NM_172211.4 and NM_000757.6. Exon 9 contains at least 14 microRNA (miRNA) target sites that regulate the expression of CSF1 and its protein, a noncanonical G-quadruplex which is involved in post-transcriptional regulation of CSF1, as well as AU-rich elements (AREs) which regulate the stability and decay of CSF1 mRNA (230-232). Woo et al. (232) showed that disruption of the miRNA target region, G-quadruplex, and AREs together dramatically increased reporter RNA levels, suggesting important roles for these cis-acting regulatory elements in the down-regulation of CSF1 mRNA.
The colony stimulating factor 1 (CSF1) gene. Chromosome band location (top), the four transcript variants (middle), and the last non-translated exon 9 of the variants with accession number NM_172211.4 and NM_000757.6 (bottom) are shown. On exon 9, the G-quadruplex, the AU-rich (ARE) element, and microRNA target sites predicted by TargetScanHuman 7.2 are also shown. The colors indicate different classes of target sites: Purple is for 8mer, red for 7mer-m8 and blue for 7mer-A1.
Studying three tenosynovial giant cell tumors, Panagopoulos and coworkers (33) found that replacement of the 3′-UTR (exon 9) of CSF1 with new sequences was common to all three tumors resulting in overexpression of the protein-coding part of CSF1. That this is indeed a common pathogenetic theme in tenosynovial giant cell tumors was confirmed by two subsequent studies of larger series of tumors (34, 35). Abnormal expression of CSF1 by macrophages and monocytes led to tumor development (225, 226). The possible role of CSF1R tyrosine kinase inhibitor as a medical treatment for patients with tenosynovial giant cell tumors is under investigation (233-236).
Epithelioid hemangiomas of the bone and osteoblastomas are genetically characterized by rearrangements leading to fusions of the FOS gene, which maps to chromosome subband 14q24.3 (chr14:75,278,828-75,282,230 in GRCh38/hg38 assembly) (42, 44, 45, 133). FOS has four exons and codes for a leucine zipper protein which dimerizes with members of the JUN family of transcription factors to form the transcription factor complex AP-1 (237, 238). Exon 4, the last exon of FOS, consists of a coding part and a 3′-UTR (Figure 3). The C-terminal region of the translated part of exon 4 (last 21 amino acids: KGSSSNEPSSDSLSSPTLLAL) is essential for the degradation of FOS protein (239, 240). The 3′-UTR contains an ARE and many microRNA target sites, which regulate the expression of FOS and maintain the stability of FOS mRNA, and a 145 bp region which enables the targeting of FOS mRNA towards the perinuclear cytoskeleton (241-253).
The Fos proto-oncogene, AP-1 transcription factor subunit (FOS) gene. On the translated part of exon 4, the C-terminal region involved in FOS degradation is shown. In the untranslated part of the exon, the AU-rich (ARE) element and microRNA target sites predicted by TargetScanHuman 7.2 are shown. The colors indicate different classes of target sites: Purple is for 8mer, red for 7mer-m8 and blue for 7mer-A1. In all FOS chimeras, the FOS degradation region and the untranslated region are not present.
In all examined epithelioid hemangiomas of the bone and osteoblastomas with rearrangements of FOS, the breakpoint occurred in the coding part of exon 4 where the fusion event introduced a stop codon (42-45, 61). As a consequence, the C-terminal part of FOS, essential for its degradation, and the 3′-UTR of FOS mRNA are removed (Figure 3). The resulting, truncated FOS protein contains the N-terminal transactivation domain which plays a crucial role in transformation, and the basic leucine zipper domain (bZIP) that makes it more stable (61, 242). In vitro experiments have shown that the truncated FOS protein is resistant to degradation and has a longer half-life than wild-type FOS (1-2 hrs) (61).
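How a breakpoint inside a coding exon can place a stop codon shortly after the fusion point may be illustrated with a minimal sketch. The nucleotide strings below are invented toy sequences, not FOS or partner sequence, and the function name first_stop_after_fusion is hypothetical. The sketch assumes that the 5′ fragment begins at the start codon, so that the reading frame of the 5′ gene is frame 0.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_after_fusion(head, tail):
    """Return the 0-based position of the first in-frame stop codon at or
    after the fusion point, or None if the joined sequence has none.

    head: 5' coding fragment, assumed to start at the ATG (frame 0).
    tail: sequence joined downstream of the breakpoint.
    """
    transcript = head + tail
    # Start at the codon that spans (or directly follows) the fusion point.
    start = (len(head) // 3) * 3
    for i in range(start, len(transcript) - 2, 3):
        if transcript[i:i + 3] in STOP_CODONS:
            return i
    return None

# Toy example: the break falls after the first nucleotide of codon 4.
toy_head = "ATGGCTGAAC"   # ATG GCT GAA C
toy_tail = "GTTGAGGC"     # read in the head's frame: (C)GT TGA GGC
stop_at = first_stop_after_fusion(toy_head, toy_tail)
print(stop_at)
```

In this toy case the joined sequence reads ATG GCT GAA CGT TGA, so a single chimeric codon is followed by a stop codon, which is the pattern described for truncating fusions.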
The mitogen-activated protein kinase kinase kinase 8 (MAP3K8) gene, which maps to chromosome subband 10p11.23 (chr10:30,434,021-30,461,833 in GRCh38/hg38 assembly), has nine exons (NCBI reference sequence: NM_005204.4) (Figure 4) and codes for a cytoplasmic protein which is a member of the serine/threonine protein kinase family and activates both the MAP kinase and JNK kinase pathways (254-256). In 1991, Miyoshi and coworkers cloned MAP3K8 (257) and named it cot (cancer Osaka thyroid) oncogene. They found two cDNA clones: one corresponding to cDNA of the normal MAP3K8, and a second, truncated clone in which part of the sequence from the eighth exon onwards was replaced by another sequence. They wrote that “…DNA sequence downstream from the 3’ rearranged point was totally different from the cot proto-oncogene sequence and the following coding sequence of the cot oncogene was abruptly terminated by a stop codon TAG. These results suggest that the cot oncogene might be activated by a C’-terminal truncation during the process of DNA transfer, although we have no idea about the C’ terminus of the normal Cot kinase because we have no information about the normal transcript of the cot proto-oncogene” (257). In 1993, Aoki and coworkers (258) showed that the truncated gene had much higher transformation activity than the normal one and suggested that the N-terminal part of MAP3K8 protein is essential for transformation whereas the C-terminal moiety of normal MAP3K8 protein (i.e., the part missing in the truncated gene product) may negatively regulate the transformation potential of the MAP3K8 protein.
The mitogen-activated protein kinase kinase kinase 8 (MAP3K8) gene. The chromosome band location (top) and exon 9 of the gene (bottom) are shown. On the translated part of exon 9, the C-terminal region containing the kinase repression domain is shown. The underlined sequence is part of the PEST domain of MAP3K8 protein. The bold sequence is the degradation signal (degron).
For the purpose of the present review, we aligned the sequence reported in that article (CCAAGAGCCGCAGACCTACTAAAACATGAGGCCCTGAACCCGCCCAGAGAGGATCAGCCACGCGGGCACCAAGTCATTCATGAAGGATCCTCCACCAATGACCCAAACAACTCCTGCTAGGCCCCACCT) with the Human GRCh38/hg38 assembly using the BLAT algorithm (259). It turned out that, in the truncated MAP3K8 cDNA clone, exon 8 of the gene (the fusion point was at nucleotide 1640 of the NCBI reference sequence: NM_005204.4) was fused with an intergenic sequence mapping to chromosome subband 5q13.3 (chr5:75,004,416-75,004,477).
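The genomic intervals quoted in this section can be compared in size with a small helper. This is only a sketch: the function name span_length is hypothetical, and it assumes that the coordinates are 1-based and inclusive, which the text does not state explicitly.

```python
import re

def span_length(region):
    """Length in bp of a 'chrN:start-end' interval.

    Assumes 1-based, inclusive coordinates (an assumption, not stated
    in the text) and tolerates thousands separators.
    """
    match = re.fullmatch(r"chr(\w+):([\d,]+)-([\d,]+)", region)
    if match is None:
        raise ValueError(f"unrecognized region: {region!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    return end - start + 1

# Intervals as quoted in the text (GRCh38/hg38).
REGIONS = {
    "HMGA2": "chr12:65,824,483-65,966,291",
    "CSF1": "chr1:109,910,849-109,930,992",
    "FOS": "chr14:75,278,828-75,282,230",
    "MAP3K8": "chr10:30,434,021-30,461,833",
    "5q13.3 intergenic segment": "chr5:75,004,416-75,004,477",
}

for name, region in REGIONS.items():
    print(f"{name}: {span_length(region):,} bp")
```

Under that assumption, the aligned intergenic segment from 5q13.3 spans only 62 bp, whereas the gene loci range from a few kilobases (FOS) to more than 140 kb (HMGA2).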
In 2004, Clark and coworkers (260) reported a novel MAP3K8 truncation in lung cancer. The breakpoint occurred in exon 8 of the gene (the fusion point was at nucleotide 1711 of the NCBI reference sequence: NM_005204.4) and fused MAP3K8 with an intergenic sequence from chromosome band 9p23 (260). Truncated MAP3K8 had much higher transforming activity than wild-type MAP3K8 (260). The common theme of those two truncated MAP3K8s was that they lacked exon 9 which codes for the last 43 amino acids of the MAP3K8 protein (position 425-467 in reference sequence NP_005195.2: DSSCTGSTEESEMLKRQRSLYIDLGALAGYFNLVRGPPTLEYG) (Figure 4). This part of the protein is a kinase repression domain, has a degradation signal (degron) between positions 435 and 457 (SEMLKRQRSLYIDLGALAGYFNL), and a portion of the PEST sequence (DSSCTGSTEESEML) which acts as a signal for MAP3K8 degradation (261). Thus, the absence of the last 43 amino acids in the truncated MAP3K8 results in stabilization of the protein, higher kinase activity, and higher oncogenic capacity (261, 262).
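The positions given for the exon 9-encoded part of MAP3K8 can be checked for internal consistency with a short sketch. The peptide is the one quoted in the text, written without the internal space (assumed to be a typesetting artifact); the helper name region is hypothetical, and positions are taken to be 1-based and inclusive.

```python
# Last 43 residues of MAP3K8 (positions 425-467 of NP_005195.2), as quoted in the text.
EXON9_PEPTIDE = "DSSCTGSTEESEMLKRQRSLYIDLGALAGYFNLVRGPPTLEYG"
EXON9_START = 425  # protein position of the first residue of the peptide

def region(first, last):
    """Return residues first..last (1-based, inclusive protein positions)."""
    return EXON9_PEPTIDE[first - EXON9_START:last - EXON9_START + 1]

degron = region(435, 457)          # degradation signal quoted in the text
pest_part = EXON9_PEPTIDE[:14]     # portion of the PEST sequence in exon 9

print(len(EXON9_PEPTIDE))
print(degron)
print(pest_part)
```

The peptide has 43 residues, and the slice for positions 435-457 reproduces the quoted degron, so the coordinates and sequences in the text agree with each other. The quoted PEST portion corresponds to the first 14 residues and overlaps the start of the degron.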
In 2019, Newman and coworkers found recurrent rearrangements of MAP3K8 in spitzoid melanomas (263). Working from an index case (an 11-year-old boy) and studying altogether 50 spitzoid melanomas, they found MAP3K8 fusion genes in 12 tumors and truncation of MAP3K8 in another five. The common theme in all 17 tumors was the absence of exon 9 of MAP3K8 (263). When truncated MAP3K8, which lacked exon 9, was expressed in NIH 3T3 cells, signs of neoplastic transformation were seen in a colony-formation assay (263). Truncations and fusions of the MAP3K8 gene have subsequently been detected in spitzoid melanomas by other researchers (237-239) and similar changes have been seen also in other melanoma subtypes (264-267). In all these settings, lack of exon 9 was the common pathogenetic theme.
In hematologic malignancies, most reported RUNX1 truncations are the consequence of chromosome translocations hitting the gene after exon 6 or 7 (Table II). These truncated genes code for a RUNX1 protein containing the Runt homology domain (RHD) but lacking the transcriptional activation domain. The truncated RUNX1 proteins may heterodimerize with CBFB and bind to DNA but do not have any transactivation ability (53, 56, 58, 69, 120, 136, 137, 139-142, 268). Thus, they are functionally similar to the isoform AML1a of the RUNX1 protein (NCBI reference sequences NM_001122607.2 and NP_001116079.1) (58, 269-271). AML1a binds to the DNA sequence of target genes and inhibits the transcriptional activity of RUNX1 (271). In 32Dcl3 murine myeloid cells treated with colony stimulating factor 3 (CSF3), overexpression of AML1a suppressed granulocytic differentiation but stimulated cell proliferation (271). Overexpression of AML1a has been found in both ALL and AML-M2 patients (272). Mice have developed lymphoblastic leukemia after transplantation of murine bone marrow mononuclear cells transduced with AML1a (272).
Truncated RUNX1 proteins resulting from chromosome translocations were found to function as inhibitors of normal RUNX1 protein. They increase proliferation and disrupt the differentiation program by interfering with AML1b (53, 69, 120, 136-138). In a recent study, truncated RUNX1 protein induced expression of CSF3 receptor on 32D myeloid leukemia cells (142).
Studying a RUNX1 point mutation in a mouse model, Watanabe-Okochi and coworkers found that truncated RUNX1 protein (mutant S291fsX300) induced pancytopenia with erythroid dysplasia, followed by progression to MDS-RAEB or MDS/AML (273). In human CD34+ stem/progenitor cells as well as human induced pluripotent stem cells, the same mutation impaired myeloid commitment, enhanced self-renewal, and prevented granulocytic differentiation (274).
In another mouse model study, Dowdy and coworkers introduced a premature translational stop codon after amino acid 307 (Runx1Q307X), mimicking the RUNX1 mutations found in MDS/AML and CMML patients (275). The truncated RUNX1 protein bound to suitable DNA sequences but failed to activate target gene promoters and was unable to associate with the nuclear matrix. Furthermore, Runx1Q307X homozygous mice exhibited embryonic lethality at E12.5 due to central nervous system hemorrhage and a complete lack of hematopoietic stem cell function (275). Taking these and the other above-mentioned data into consideration, it seems safe to conclude that truncated RUNX1 protein is at least a contributing factor in leukemogenesis.
Recently, Kogure et al. reported that 13% of adult T-cell leukemias/lymphomas and 12% of diffuse large B-cell lymphomas carry a truncation of the 3′-end of the REL proto-oncogene, NF-kB subunit (REL) gene which maps to chromosome subband 2p16.1 (276). The consequence is upregulation of REL expression and generation of a REL protein with gain-of-function (276).
Also, functional studies of out-of-frame fusion transcripts suggest that they have the properties of drivers in neoplasia. In ovarian cancer SKOV3 cells, a fusion transcript between the genes LAMC2 from 1q25.3 (official full name: laminin subunit gamma 2) and NR6A1 (nuclear receptor subfamily 6 group A member 1) from 9q33.3 resulted in a truncated LAMC2 protein that lacked the essential functional domains I and II, but instead had tumor promoting properties (277). Overexpression and suppression of LAMC2::NR6A1 fusion transcripts significantly increased or decreased the tumorigenic growth of neoplastic cells in mouse models, respectively (277).
In a Ewing sarcoma carrying the pathognomonic EWSR1::FLI1 fusion gene, Dupain and coworkers (172, 278) detected two variants of an LMO3::BORCS5 out-of-frame fusion transcript leading to loss of the functional domain LIM2 from the LMO3 gene (official full name: LIM domain only 3) and disruption of BORCS5 (official full name: BLOC-1 related complex subunit 5) (172, 278). In vitro studies showed that LMO3::BORCS5 fusion transcripts increased proliferation, decreased expression of apoptosis-related genes, and deregulated genes involved in differentiation. Expression of LMO3::BORCS5 in NIH-3T3 cells and subcutaneous injection into the flank of mice induced tumors in the animals (172, 278).
In a patient with myelodysplasia, the cytogenetically detected t(12;17)(p13;q11) chromosome translocation was shown to result in fusion of exon 2 of ETV6 with an antisense sequence from intron 19 of TAOK1 (from 17q11). The chimeric ETV6::TAOK1 transcript had an RNA-interfering effect on the expression of normal TAOK1 resulting in less production of TAOK1 protein. In addition, it encoded a C-terminal truncated ETV6 protein (145). In the zebrafish model, truncated forms of ETV6 had a dominant-negative effect on the function of normal ETV6 protein disrupting both primitive and definitive hematopoiesis (60).
Many of the reciprocal KMT2A fusion genes such as IKZF1::KMT2A, PBX1::KMT2A, and JAK1::KMT2A are out-of-frame (149). However, these out-of-frame fusions can express the 3′-terminal part of KMT2A through an internal KMT2A promoter (149, 279). The N-terminal truncated version of KMT2A protein may acetylate H4K16 through the association with MOF protein or methylate H3K4 by the SET domain complex (149, 279).
Conclusion
Although gene truncations or out-of-frame fusion transcripts brought about by chromosome translocations and other cytogenetic rearrangements have received relatively little emphasis in studies of neoplastic transformation, data are accumulating that such changes may play important roles in tumorigenesis and leukemogenesis. It is worth emphasizing that next generation sequencing technologies used alone - without cytogenetic backing and/or the use of other molecular genetic techniques for falsification/verification purposes - are not sufficient when it comes to detecting such genetic aberrations; many false positives are detected while at the same time pathogenetically important changes may be overlooked. Gene truncations/out-of-frame fusion transcripts may remove transcriptional and translational regulatory elements affecting gene expression and/or the stability and properties of the encoded protein. On other occasions, they may introduce a premature stop codon whose effect may be similar to that of a nonsense pathogenetic polymorphism. Apart from the highly prevalent truncation of HMGA2 in mostly benign connective tissue tumors, whole genome or whole transcriptome sequencing of large series of a variety of other tumors has detected gene truncations/out-of-frame fusion transcripts that at present come across as rare, but that later may be shown to be recurrent, perhaps even common. Examples are truncations of CSF1 in tenosynovial giant cell tumors, of FOS in epithelioid hemangiomas of the bone and osteoblastomas, of MAP3K8 in spitzoid melanomas, and of REL in T-cell leukemias/lymphomas and diffuse large B-cell lymphomas. The search for other neoplasia-inducing acquired genetic alterations of this type should be intensified, both by cytogenetic and molecular genetic means.
Acknowledgements
The research work of the authors was supported by grants from Radiumhospitalets Legater.
Footnotes
Authors’ Contributions
Both Authors (IP and SH) wrote the manuscript.
Conflicts of Interest
The Authors declare that they have no potential conflicts of interest with regards to this study.
- Received July 6, 2022.
- Revision received August 19, 2022.
- Accepted August 23, 2022.
- Copyright © 2022, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY-NC-ND) 4.0 international license (https://creativecommons.org/licenses/by-nc-nd/4.0).