Original article
Genes that contribute to cancer fusion genes are large and evolutionarily conserved

https://doi.org/10.1016/j.cancergencyto.2009.02.004

Abstract

Numerous cancer fusion genes have been identified and studied, and in some cases, therapy or diagnostic techniques have been designed that are specific to the fusion protein encoded by the fusion gene. There has been little progress, however, in understanding the general features of cancer fusion genes in a way that could provide the foundation for an algorithm for predicting the occurrence of a fusion gene once the chromosomal translocation points have been identified by karyotype analyses. In this study, we used publicly available data sets to characterize 59 cancer fusion genes. The results indicate that all but 17% of the genes involved in fusion events are either relatively large, compared to neighboring genes, or are highly conserved in evolution. These results support a basis for designing algorithms that could have a high degree of predictive value in identifying fusion genes once conventional microscopic analyses have identified the chromosomal breakpoints.

Introduction

The study of cancer fusion genes, arising from chromosomal translocations during cancer cell development, has led to a much more sophisticated understanding of the basis of cancer and to designer drugs specifically targeted to certain cancers. For example, translocations that have fused the MYC gene with the IgH gene have led to the understanding that part of the development of Burkitt's lymphoma is caused by an abnormal and apparently perennial activation of the c-myc gene, which in turn stimulates cell proliferation [1], [2], [3]. In addition, the understanding of the structure of the bcr-abl protein, resulting from the fusion of the BCR and ABL genes [4], led to the discovery and use of Gleevec [5], [6], which efficiently retards the progress of chronic myelocytic leukaemia without the side effects of less specific, anti-proliferative drugs. Finally, there is the hope of using fusion proteins to generate cancer-specific immune responses [7].

Most, if not all, fusion genes have been discovered as a result of pursuing the specific goal of isolating a fusion gene that is expected to be associated with a chromosomal translocation, with one recent exception representing a more general approach [8]. A case-by-case approach to identifying fusion genes is inefficient, and with more than 50,000 reported disease-associated chromosomal rearrangements in the Mitelman database, a more systematic approach toward identifying fusion genes is needed. Thus, we have selected 59 fusion genes from the published literature and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (http://www.atlasgeneticsoncology.org/index.html) to determine whether these genes have features in common that would distinguish them from neighboring genes. This type of information offers the prospect of being able to predict which genes, within the large regions defined by karyotype analyses of chromosomal translocations, are most likely to be involved in a fusion event. Our results indicate that, with a high degree of probability and statistical significance, the majority of fusion genes are either very large, compared to their neighbors, or have an unusually high degree of evolutionary conservation. The possible connection between these characteristics and the occurrence of the translocation, as well as the potential value of identifying a very large number of DNA segments involved in translocations, are discussed below.

Selection of cancer fusion genes

We selected 36 genes involved in fusion events using the PubMed database for a preliminary study. We then selected an additional 23 genes involved in fusion events from the Atlas of Genetics and Cytogenetics in Oncology and Haematology (http://www.atlasgeneticsoncology.org/index.html). We began the selection of the latter 23 genes with chromosome 22 and worked toward larger chromosomes until 23 genes that did not involve either the immunoglobulin or T-cell receptor loci were selected.

Basic method of data acquisition

To compare

Results

We determined the size rank of each of the 59 genes involved in fusion events (Table 1) among all genes located within one million base pairs on either side of that gene. Genes involved in fusion events that were not among the top five in size were labeled as being below 5 (b5) and not ordered further. The size range was selected because karyotype analysis in general has the potential of identifying a region of DNA represented by a chromosomal translocation with a lower limit of about 2
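The ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Gene` record, its coordinate fields, the `size_rank` helper, and the rule that a neighbor counts if it overlaps the window at all are assumptions made here for illustration, and the coordinates in the usage example are invented.

```python
from __future__ import annotations

from dataclasses import dataclass

WINDOW_BP = 1_000_000  # one million base pairs on either side, as stated in the text
TOP_N = 5              # ranks beyond the top five are reported as "b5"


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; field names are illustrative only."""
    name: str
    start: int  # start coordinate on the chromosome, in base pairs
    end: int    # end coordinate on the chromosome, in base pairs

    @property
    def size(self) -> int:
        return self.end - self.start


def size_rank(target: Gene, genes_on_chromosome: list[Gene]) -> str:
    """Return the size rank of `target` among its neighbors, or "b5"."""
    lo = target.start - WINDOW_BP
    hi = target.end + WINDOW_BP
    # Assumption: a gene is a neighbor if any part of it overlaps the window.
    neighbors = [
        g for g in genes_on_chromosome
        if g != target and g.end >= lo and g.start <= hi
    ]
    # Rank 1 is the largest gene in the window, the target included.
    rank = 1 + sum(1 for g in neighbors if g.size > target.size)
    return str(rank) if rank <= TOP_N else "b5"


# Usage with invented coordinates: one larger neighbor, one smaller
# neighbor, and one large gene that lies outside the window.
target = Gene("TARGET", 5_000_000, 5_150_000)
genes = [
    target,
    Gene("A", 4_200_000, 4_500_000),
    Gene("B", 5_400_000, 5_420_000),
    Gene("FAR", 9_000_000, 9_900_000),
]
print(size_rank(target, genes))
```

Counting only strictly larger neighbors gives tied genes the same rank; a different tie-breaking rule would be equally defensible, since the text does not specify one.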

Discussion

The above analyses represent previously identified genes involved in fusion events in cancer. Thus, the analyses do not take into consideration biases that could exist in the process of discovery of these genes. For example, it is possible that larger genes were discovered more frequently because their size made them more accessible to any or all of the technologies used in the process of identifying the fusion gene. Likewise, evolutionarily conserved genes may be more fundamental to cellular

References (12)

  • R. Taub et al. Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. Proc Natl Acad Sci USA (1982)
  • R. Dalla-Favera et al. Human c-myc onc gene is located on the region of chromosome 8 that is translocated in Burkitt lymphoma cells. Proc Natl Acad Sci USA (1982)
  • Neel BG, Jhanwar SC, Chaganti RS, Hayward WS. Two human c-onc genes are located on the long arm of chromosome 8. Proc...
  • N. Heisterkamp et al. Structural organization of the bcr gene and its role in the Ph' translocation. Nature (1985)
  • E. Buchdunger et al. Inhibition of the Abl protein-tyrosine kinase in vitro and in vivo by a 2-phenylaminopyrimidine derivative. Cancer Res (1996)
  • B.J. Druker et al. Effects of a selective inhibitor of the Abl tyrosine kinase on the growth of Bcr-Abl positive cells. Nat Med (1996)
There are more references available in the full text version of this article.

Cited by (14)

  • Germline cytoskeletal and extra-cellular matrix-related single nucleotide variations associated with distinct cancer survival rates

    2018, Gene
    Citation Excerpt :

    For example, most cancer fusion proteins are derived from genes that have unusually large introns, making the generation of a fusion gene, based on randomly positioned DNA breakage sites, more likely, than in the case of genes with small introns. And when the origin of the fusion gene is small, the cancer is very rare, such as in Ewings sarcoma (Narsing et al., 2009; Pava et al., 2012). Likewise, genes that contribute to cancer development, and have mutations late in the process of development of a fully metastatic cancer, are relatively small (Long et al., 2011), befitting the reduced likelihood of these small genes incurring a mutation.

  • Stratifying melanoma and breast cancer TCGA datasets on the basis of the CNV of transcription factor binding sites common to proliferation- and apoptosis-effector genes

    2017, Gene
    Citation Excerpt :

    The BRCA Set A comparison in particular (Table 9; Figs.6C, D) indicated an almost even distribution of barcodes, between those with a greater number of proliferation effector gene TFBS and those with a greater number of apoptosis TFBS. Again, keeping in mind the relatively good outcome for breast cancer, this process is likely detecting what is essentially the randomness of genetic damage, clearly evidenced by other forms of genetic damage, such as chromosomal translocations, the formation of fusion genes, and mutations (Long et al., 2011; Narsing et al., 2009; Parry et al., 2015; Pava et al., 2012). Presumably, this randomness is detectable in breast cancer due to the relatively early detection of breast cancers and the fact that these tumors represent genetic damage that has been subject to less selective pressure in terms of becoming a fatal cancer.

  • Big genes are big mutagen targets: A connection to cancerous, spherical cells?

    2015, Cancer Letters
    Citation Excerpt :

    Overall, the distribution mirrored the distribution of the overall collection of mutated genes and the distribution of metastasis and suppressor genes (Table 4; Fig. 1; SOM). Thus, most of the genes in both the high and low frequency groups have very large coding regions, which strongly indicates that mutation sensitivity is, at least in part, a function of coding region size, regardless of the frequency of mutations, reminiscent of the role of gene size in the formation of cancer fusion genes [4,5]. The above conclusion regarding the association of coding region size with mutation susceptibility invites re-visiting the oncoprotein data of Table 2.
