There are two major groups of TEs, differentiated by the intermediate they use for transposition: Class I and Class II elements (Cosby et al. 2019). These classes can be further divided into element families, or ‘subclasses’, based on the genetic traits and functions that distinguish them from their ancestral line (Bourque et al. 2018). Class I elements are retrotransposons, which use an RNA intermediate to mobilize within the genome (Cosby et al. 2019). These elements proliferate via a ‘copy-and-paste’ method, in which the intermediate is reverse-transcribed into DNA that can be integrated into the genome at another location (Bourque et al. 2018). Class II elements are DNA transposons, which instead use a DNA intermediate to mobilize (Cosby et al. 2019). These elements proliferate via a ‘cut-and-paste’ method, in which a DNA sequence is excised from the genome and integrated at a new location.
Class I and II elements exhibit a predilection for insertion into specific portions of the genome, a property known as ‘site-selection’. At selected sites, an insertion favors the propagation of the element while causing minimal damage to the host. However, not all insertions prove to be harmless. Active elements contribute to genome expansion by mutagenizing DNA sequences and breaking chromosomes, which facilitates gene duplications. Through these mechanisms, TEs can give rise to novel functions in existing genes and increase genetic diversity between organisms (Bourque et al. 2018).
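The difference between the two modes of transposition can be illustrated with a toy model. The sketch below is only an illustration under strong simplifying assumptions: the genome is represented as a Python list of labels, the function names are hypothetical, and the biochemistry (transcription, reverse transcription, transposase activity) is ignored entirely. It shows only the bookkeeping consequence: copy-and-paste adds a copy, while cut-and-paste relocates the existing one.

```python
import random

def copy_and_paste(genome, element, rng):
    """Class I (toy): the original copy stays put and a new copy is inserted."""
    new_genome = list(genome)
    new_genome.insert(rng.randrange(len(new_genome) + 1), element)
    return new_genome

def cut_and_paste(genome, element, rng):
    """Class II (toy): the element is excised, then reinserted at a random site."""
    new_genome = list(genome)
    new_genome.remove(element)  # excise the first occurrence
    new_genome.insert(rng.randrange(len(new_genome) + 1), element)
    return new_genome

rng = random.Random(0)  # fixed seed so the run is repeatable
genome = ["geneA", "TE", "geneB", "geneC"]

retro = copy_and_paste(genome, "TE", rng)
dna = cut_and_paste(genome, "TE", rng)

# Copy number after one transposition event under each mode
print(retro.count("TE"), dna.count("TE"))
```

In this simplified picture, one round of copy-and-paste raises the copy number from one to two, whereas one round of cut-and-paste leaves it at one.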
The Genomic Evolution of Transposon Activity
Class I retrotransposons account for a large share of the active transposable elements in the human system. The two sub-classes of retroelements are LTR (long terminal repeat) and non-LTR retrotransposons. LTR retroelements are believed to have first been inserted into the genome over twenty-five million years ago as endogenous retroviruses. These retrotransposons are considered to be responsible for rapid evolutionary changes in the size and diversity of the multicellular eukaryotic genome. The adaptive radiation of eukaryotic genomes is exemplified independently by maize and rice, both of which have phylogenetic data consistent with whole genome duplication events resulting from retrotransposon amplification (Cordaux and Batzer 2009).
Compared to DNA transposons and LTR retrotransposons, non-LTR elements are rather active in the eukaryotic system. Several families of non-LTR elements have been studied for their evolution in the human system, including SINEs (short interspersed nuclear elements) and LINEs (long interspersed nuclear elements), of which L-1 elements are the predominant family. Alu and SVA elements belong to the SINE family, and both are active elements derived from the primate lineages of transposons (Platt et al. 2018). Alu and SVA elements have spread and evolved within primate DNA over the past eighty million years (Cordaux and Batzer 2009). These elements are found in abundance in highly expressed genes, while L-1 elements are commonly localized to regions with low levels of gene expression. L-1 retroelements promote genetic instability by generating chromosomal insertions and deletions and by modifying gene expression at regulatory regions of the genome. Non-LTR elements continue to propagate across the genomic landscape of the human system through their mobility and their maintenance of high copy numbers (Huda and Jordan 2009).
Class II DNA transposons are thought to have once contributed largely to the expansion of the primate genome about thirty-seven million years ago, but have since remained inactive in the human system (Cordaux and Batzer 2009). Evidence for DNA transposon activity is exhibited by studies performed on the sexually-transmitted parasite Trichomonas vaginalis, whose genetic content is roughly two-thirds repetitive sequences of DNA (Feschotte and Pritham 2007). This repetition is particularly interesting to scientists studying the activity of this single-celled eukaryote due to the implication that DNA transposons are normally only transferred horizontally through sexual reproduction. The DNA-derived TEs present in the T. vaginalis genome comprise forty percent of all genetic material, and the most active elements appear to be members of the Tc1/mariner family. At the mariner insertion site, nucleotide polymorphisms and reduced expression of local genes present damaging effects on the T. vaginalis genome. It is hypothesized that this parasite maintains a stable genome amid continuous transposon activity via gene recombination and sexual reproduction (Bradic and Carlton 2018).
Selfish Genetic Elements and the Defensive Eukaryotic Genome
It is theorized that the adoption of TEs as fundamental genetic material in eukaryotic organisms was an exaptive event which produced an increase in the genetic divergence of closely related organisms. The extensive interactions of the eukaryotic genome and transposable elements have largely contributed to the genomic evolution of many species. The co-option of TEs as integral genetic material has provided both evolutionary benefits and detriments to eukaryotic organisms.
Many eukaryotic organisms have developed mechanisms to inhibit transposon activity and defend the integrity and stability of their genome. The parasitic mechanism of transposon proliferation has resulted in biological conflict deemed to be an ‘evolutionary arms race’. The host genome must evolve rapidly to counteract the deleterious impacts of TE activity via positive selective pressures. However, it is thought that TEs have evolved strategies to continue proliferation throughout the host system while mitigating the most damaging effects of their propagation (Cosby et al. 2019).
The genome defense model suggests that the host genome must terminate the activity of transposable elements in order to protect the integrity and longevity of the germline. This model posits that epigenetic regulatory processes are products of the first line of genomic defense against transposition. Following this theory, it is possible that these initial host defenses against TE activity have evolved into modern mechanisms of gene regulation including spatial patterns of expression and the modification of genome structure (Huda and Jordan 2009). Through the co-option of TEs as functional genetic material, epigenetic modifications of the eukaryotic genome have been shown to influence gene function and regulation (Cosby et al. 2019).
TEs have maintained their presence in various hominoid genomes through their ability to lie dormant for long periods of time. During these periods of dormancy, TEs exhibit minimal transposition activity. Alu elements have demonstrated this low-activity mechanism in non-human primate systems for periods of time as long as 20 million years. Through this method, low-activity TEs referred to as ‘stealth drivers’ remain protected from host defenses while periodically producing more active elements deemed ‘secondary TEs’. Secondary TEs that exhibit highly active transposition are called ‘master elements’, which contribute deleterious effects to the host genome. Master elements are subject to negative selection by the host system while their stealth driver relatives remain undetected. The decline in mobile activity allows the stealth driver element lineage to persist for millions of years, while master elements are selected against and eventually eradicated (Cordaux and Batzer 2009).
Genetic Disease Mediated by TE Activity
Although transposable elements have been characterized as major speciation-driving factors of the eukaryotic genome, their activity is also linked to the development of human disease. As already mentioned, transposons modify gene expression and impact regulatory networks through their invasive insertions into functioning genes. TEs cause gene mutations, disrupt gene function, produce chromosomal breakage, and increase genetic instability. These genome-remodeling mechanisms generate a plethora of diseases with varying levels of severity.
The most common cause of element-induced human disease is insertional mutagenesis. Insertional mutagenesis involves the insertion of repetitive DNA sequences into open reading frames or splice sites, interrupting the function of nearby genes. Highly active insertional mutagens include L-1, Alu, and SVA elements. According to evidence cited by Belancio et al. (2009), mutation-causing elements inserted into the germline provoke diseases including, but not limited to, “neurofibromatosis, hemophilia, choroideremia, cholinesterase deficiency, Apert syndrome, Dent’s disease, B-thalassemia, and Walker-Warburg syndrome.” While these diseases are treated as rare, recessive disorders, TE insertions also play a role in the development and progression of cancers, which are far more widely distributed in the human population (Belancio et al. 2009).
Transposon Influence in the Development and Progression of Human Cancer
Transposons are thought to cause genetic diseases in humans through non-allelic homologous recombination (NAHR), which promotes chromosomal misalignment and irregular exchange of genetic material between elements. Alu elements are the most commonly studied TEs with regard to the development and progression of genetic diseases resulting from NAHR events. Genes that are linked to specific hereditary diseases have been shown to be fragile sites of DNA modification. These sites have been identified as mutational hotspots for NAHR between Alu elements.
Genes that are involved in the promotion of cancer development, such as MLL-1 and BRCA1, have high concentrations of Alu elements within their non-coding regions. The BRCA1 gene sequence contains 137 Alu elements, which together constitute more than forty percent of its sequence. In a study of 23 participants with the BRCA1 mutation, 44 of the 137 Alu elements were responsible for NAHR events at this locus (Belancio et al. 2009). While this information is intriguing, it is important to view studies with such small sample sizes with some skepticism. The results of this study may be confounded by sampling error, and the elements shown to participate in NAHR make up only thirty-two percent of the Alu elements at the locus. Although the significance of these results may be questionable, other evidence suggests stronger ties to TE-mediated cancer development.
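The thirty-two percent figure and the concern about sampling error can be made concrete with a short calculation. The sketch below recomputes the share from the counts given above and attaches a Wilson score interval to it. Treating each of the 137 Alu elements as an independent trial is an assumption made purely for illustration; it is not how the cited study was analyzed, and it does not capture the uncertainty that arises from having only 23 participants.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

alu_total = 137    # Alu elements in BRCA1, as reported in the text
alu_in_nahr = 44   # Alu elements implicated in NAHR events

share = alu_in_nahr / alu_total
low, high = wilson_interval(alu_in_nahr, alu_total)

print(f"share = {share:.1%}, approx. 95% interval = {low:.1%} to {high:.1%}")
```

Under this assumption the interval spans well over ten percentage points, which illustrates why a single small study gives only a rough estimate of how many Alu elements at the locus are recombinogenic.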
In a study performed on 54 families with Von Hippel-Lindau disease (VHL), an autosomal dominant non-cancerous tumor development disorder, about thirty percent of patients exhibited partial or complete deletions of the VHL tumor suppressor gene on the third chromosome. In the intron-dominated regions of the VHL tumor suppressor gene, 66 chromosomal breakpoints were identified as deletion sites among 33 patients. Of these 66 breakpoints, ninety percent fell in regions with a high concentration of Alu elements. The Alu elements responsible for the NAHR activity that promotes germline deletions of the VHL tumor suppressor gene in tested individuals vary in evolutionary age, and the youngest subfamily was determined to be the most actively recombinogenic in the human system. This youngest element, AluYa5, was found to be responsible for seven of the 33 patient cases of Alu-mediated sequence deletions (Franke et al. 2009). These data present rather substantial implications of Alu-mediated NAHR effects on tumor development in humans. However, this case cannot provide a link to the development and progression of cancerous tumors in patients with VHL. Therefore, an association between Alu element activity in the human genome and cancer development cannot be explicitly determined by this study. Nonetheless, it is evident that transposable elements play a role in the origination and continuation of non-cancerous genetic diseases in individual patients through NAHR activity.
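The counts reported for the VHL study can be restated as simple ratios. The snippet below is only a worked-arithmetic sketch using the numbers quoted above; the variable names are illustrative, and rounding ninety percent of 66 to a whole number of breakpoints is an assumption, since the text gives the percentage rather than the count.

```python
patients = 33             # patients with mapped deletions
breakpoints = 66          # chromosomal breakpoints identified
alu_rich_fraction = 0.90  # fraction of breakpoints in Alu-rich regions
aluya5_cases = 7          # patient cases attributed to AluYa5

# Each deletion is bounded by two breakpoints, which matches 66 / 33
breakpoints_per_patient = breakpoints / patients

# Approximate number of breakpoints lying in Alu-rich regions
alu_rich_breakpoints = round(alu_rich_fraction * breakpoints)

# Share of patient cases attributed to the youngest subfamily
aluya5_share = aluya5_cases / patients

print(breakpoints_per_patient, alu_rich_breakpoints, f"{aluya5_share:.0%}")
```

Read this way, roughly one in five of the Alu-mediated deletions is attributed to AluYa5 alone.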
TE Interactions with Cellular Pathways
It is important not only to discuss and evaluate the genetic implications of TE prevalence, but also to investigate the ways in which elements interact with the surrounding intracellular and extracellular environments to cause disease. The transposon amplification system is controlled by several factors of the host’s intracellular environment. L-1 elements in particular are regulated by cellular pathways involving RNA interference, regulatory elements of the transcriptome, and DNA methylation. Cellular proteins also play a role in the TE amplification system, including the DNA repair proteins ATM (ataxia telangiectasia mutated kinase) and the ERCC1/XPF endonuclease dimer (XPF: xeroderma pigmentosum complementation group F), which aid the integration of L-1 and Alu elements into genetic material. To test the role of DNA methylation, researchers knocked out Dnmt3L (DNA-methyltransferase-3-like protein), a gene that controls the methylation of DNA segments in the germline, in the mouse genome. With complete gene knockout, expression of L-1 and LTR elements was upregulated, in association with cell death promoted by mitotic stress during the production of spermatozoa (Belancio et al. 2009).
These intracellular modulators of TE activity are important factors to consider when testing the effects of transposon activity on loci in the human genome. The extracellular environment also influences TE activity in the human system. Belancio et al. (2009) state that external indicators of TE upregulation by environmental mutagens include “ionizing radiation, heavy metals as implemented in cigarette smoke and workplace exposures, anti-cancer therapies, air pollutants, and DNA demethylation agents.” These mutagenic agents are theorized to increase endogenous TE activity by modifying the cellular enzyme machinery required to down-regulate it. Researchers predict that these uncontrolled agents have the potential to produce health issues such as novel, unpredictable cancers and the amplification of invasive cancer phenotypes.
It is important to note that the proliferation of the most active transposon family in the human system, L-1 elements, is closely tied to the regulatory processes described above. It is therefore theorized that individuals whose genetic backgrounds carry mutations in the L-1 cellular defense pathways might be especially susceptible to TE-induced diseases (Belancio et al. 2009). When implementing future treatments of TE-mediated diseases, it will be necessary to consider the array of human genotypes and their corresponding vulnerabilities to both TE activity and the environmental mutagens that modify cellular functions.
What was once dismissed as ‘junk DNA’ is now regarded as some of the most crucial and deeply ingrained evidence for genomic evolution in the eukaryotic system. While practical applications remain modest, there is much potential in the study of transposable elements and their influence on human diseases. Continued research into the functions of transposable elements may lead to a deeper understanding of past and present genetic diseases and may open the way to future medical applications.
Using Domesticated TEs to Study Cancer
The molecular domestication of transposable elements in the human system has allowed for the progression of genomic evolution and the development of complex cellular functions. Domesticated TE proteins are proposed to serve as exapted genomic material, as exemplified by the enzyme telomerase reverse transcriptase (TERT). The reverse transcriptase enzymes of non-LTR elements such as L-1 share sequence similarity with the reverse transcriptase domain of the telomerase enzyme, which has led researchers to infer degrees of relatedness between transposable elements and telomerase. Telomeres are formed through the combined action of an RNA and telomerase reverse transcriptase that are produced at different loci. Alu elements operate in a similar way, relying on RNA and reverse transcriptase produced at discrete regions of the genome. Both the upregulation of transposable element activity and telomerase expression are observed in cancer cells, which require unlimited division and expansion.
An area of further research is the mechanism of retrotransposon elongation of telomeres in Drosophila cells (Jangam et al. 2017). The relationship between TE activity and telomerase function is unique and significant in the realm of genomic evolution in eukaryotes, and is largely applicable to the study of cancer evolution. To quantify TE-mediated cancer development in the human body, next generation sequencing (NGS) may be used to measure the mutated DNA in tumor cells that results from transposition. NGS methods can potentially be used to target portions of the genome with high mutational propensity and cell susceptibility to tumor development, particularly at sites of high TE concentration (Belancio et al. 2010). However, these approaches would require refinement, along with complementary methods to locate the epigenetic modifications and recombination sites through which TE activity causes carcinogenesis. If these factors can be targeted using genomic sequencing methods and agents that regulate epigenetic activity, such as bromodomain inhibitors and demethylase, viable treatments for TE-mediated cancers can be developed. These agents were in preclinical and clinical trials as of 2017 for use in solid tumors (Anwar et al. 2017).
Using TEs to Treat Systemic Human Diseases
Although TE activity is detrimental in the human system because it causes genetic diseases through NAHR activity, deletions, segmental duplication, and regulatory modifications, novel methods of gene therapy may be able to employ these properties to benefit human health. Progress has been made in recent years in the design of non-viral vector systems that use domesticated elements to treat genetic diseases. One such transposon vector system, referred to as ‘Sleeping Beauty’ (SB), is in clinical trials as a system for delivering genetic material to stem cells for the treatment of cancers such as lymphoma as well as congenital diseases such as hemophilia and mucopolysaccharidosis (Aronovich et al. 2011).
SB is a synthetic transposon and a member of the Tc1/mariner family of DNA TEs. These elements contain evolutionarily conserved recombinase domains and are composed of inverted repeat sequences, which carry the transposase binding sites, on either side of the transposase gene. SB was constructed from ancestral Class II element sequences found primarily in a variety of fish species that lived more than ten million years ago. SB ancestor elements are widespread in modern fish genomes and in the genomes of several amphibian species, but are not found in the mammalian genome. The implementation of SB elements in the human genome requires the development of a relationship between the host system and the TE via host-element protein interactions. Using its host factors, SB can recognize and respond to host cell signaling cascades and interact with the host system’s gene regulation mechanisms (Narayanavari et al. 2017). This method is discussed in detail by Narayanavari et al. (2017), who explain that to achieve this interaction, SB is combined with a “gene-expression cassette” and the transposase enzyme to form the Sleeping Beauty transposon system (SBTS). The SBTS delivers the gene-expression cassette on a plasmid, from which the cassette can be transposed into the genome and expressed as a protein product.
The effectiveness of the transgenic SBTS approach has been studied in mouse and rat model systems for hemophilia, sickle cell anemia, B-cell lymphoma, and other diseases. These tests have produced promising results for the use of the SBTS in personalized medicine. However, these methods are far from perfect. The efficacy of the SBTS requires improvement because of low levels of transposition in some host systems. Cell culture methods have been altered to improve plasmid delivery and gene transfer in light of the apparent site-selection of the SBTS. More clinical trials are necessary to establish the effectiveness and safety of the system. Further studies of the SBTS and its interaction with the mammalian genome can contribute to the understanding of host interactions with transgenic systems and host immune responses to transgene products (Aronovich et al. 2011). Continued improvement of the SBTS gene therapy mechanism may bring major advances in personalized medicine for treating inherited cancers and other germline diseases.
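The cut-and-paste delivery step can be pictured with a small string-based sketch. It relies on one fact not stated above, namely that Tc1/mariner elements integrate at TA dinucleotides and leave the inserted element flanked by a duplicated TA. Everything else is hypothetical and for illustration only: the function names, the toy target sequence, and the placeholder strings standing in for the inverted repeats and the gene-expression cassette are not real SB sequences.

```python
def find_ta_sites(target):
    """Return the start index of every TA dinucleotide in the target DNA."""
    return [i for i in range(len(target) - 1) if target[i:i + 2] == "TA"]

def integrate(target, site, left_ir, cargo, right_ir):
    """Insert IR-cargo-IR at a TA site, duplicating the TA on both flanks."""
    if target[site:site + 2] != "TA":
        raise ValueError("integration site must be a TA dinucleotide")
    transposon = left_ir + cargo + right_ir
    return target[:site] + "TA" + transposon + "TA" + target[site + 2:]

# Placeholder labels, not real sequences (assumptions for illustration)
LEFT_IR = "<IR-L>"
RIGHT_IR = "<IR-R>"
CASSETTE = "<cassette>"

target = "GGCTAGGC"  # toy host sequence with a single TA site
sites = find_ta_sites(target)
integrated = integrate(target, sites[0], LEFT_IR, CASSETTE, RIGHT_IR)

print(sites)
print(integrated)
```

The sketch leaves out the transposase itself, the donor plasmid, and any preference among TA sites; it shows only how a cassette flanked by inverted repeats ends up embedded in the host sequence.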