Retroelements with long terminal repeats (LTRs) represent one of the four major groups of retrotranscribing mobile genetic elements. This system includes the broad range of LTR retrotransposons and retroviruses present in animals, plants and fungi (for more information on this topic, see Retroelements), as well as non-autonomous retroelements that evolved from LTR retroelements. We additionally consider the caulimoviruses of plants within the system of autonomous LTR retroelements because the gag-pol region indicates a common evolutionary relationship with them.
LTR retroelements are composed of an internal coding region bounded by LTRs of variable size. The LTRs usually end in short inverted repeats; they do not encode any known protein but contain promoters and other features associated with transcription of the element (Kumar and Bennetzen 1999). LTR retroelements have a Primer Binding Site (PBS) downstream of the 5′ LTR and a Polypurine Tract (PPT) upstream of the 3′ LTR (for more information, see LTRs). The internal region contains the characteristic gag and pol ORFs (and env in the case of retroviruses), which encode the different protein products required for the replication cycle and transposition of LTR retroelements (Eickbush and Jamburuthugoda 2008). The replication cycle of LTR retrotransposons includes both nuclear and cytoplasmic stages. Inserted LTR retroelements are transcribed into mRNAs, which are exported to the cytoplasm, where they form virus-like particles (VLPs) and are reverse-transcribed into cDNA copies that return to the nucleus for integration into the host's DNA genome. Based on sequence similarity and other features, LTR retroelements can be classified into four major families: Ty1/Copia, Ty3/Gypsy, Bel/Pao and Retroviridae.
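The layout described above (5′ LTR, PBS, internal region, PPT, 3′ LTR) can be sketched as a simple ordering check. This is only an illustrative sketch: the `Feature` class, the feature labels, the function name and the coordinates are all hypothetical, and treating the PBS and PPT as separate from the internal region is a simplification made here for clarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str   # hypothetical label, e.g. "5LTR" or "PBS"
    start: int  # 0-based start, inclusive
    end: int    # 0-based end, exclusive


# Expected 5'->3' order, following the description in the text.
EXPECTED_ORDER = ["5LTR", "PBS", "internal", "PPT", "3LTR"]


def has_canonical_layout(features):
    """Return True if the features occur in the expected order without overlapping."""
    ordered = sorted(features, key=lambda f: f.start)
    if [f.name for f in ordered] != EXPECTED_ORDER:
        return False
    # Each feature must end at or before the start of the next one.
    return all(a.end <= b.start for a, b in zip(ordered, ordered[1:]))


# Made-up coordinates for illustration only.
example = [
    Feature("5LTR", 0, 300),
    Feature("PBS", 302, 320),
    Feature("internal", 320, 5000),
    Feature("PPT", 5000, 5015),
    Feature("3LTR", 5015, 5315),
]
print(has_canonical_layout(example))
```

A check of this kind only confirms that an annotation is arranged as the text describes; it says nothing about whether the element is intact or functional.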
Ty1/Copia elements are abundant in species ranging from single-cell algae to higher plants and animals. The Ty1/Copia internal region encodes the typical gag and pol polyproteins but differs from that of other LTR retrotransposons in the ORF position of integrase (INT), which maps after protease (PR). The classification of this family is still in progress. According to the ICTV, the Ty1/Copia family (taxonomically the Pseudoviridae) was originally divided into three genera: Pseudovirus, Hemivirus and Sirevirus (for more details, see Boeke et al. 2005). However, a recent study (Llorens et al. 2009) shows that the diversity of Ty1/Copia LTR retroelements is greater than this original classification suggests. In addition, genomic studies have revealed the presence of an env-like ORF in some Ty1/Copia species belonging to the genus Sirevirus (Laten et al. 1998; Havecker et al. 2005). Also, some Ty1/Copia elements described in diatoms (called CoDi-A) encode integrases (INTs) carrying a putative chromodomain similar to that found in the INTs encoded by Ty3/Gypsy chromoviruses (for more information about this family, see Ty1/Copia family).
Ty3/Gypsy LTR retroelements are also widespread in animals, plants and fungi. In most (but not all) Ty3/Gypsy LTR retroelements, INT maps after ribonuclease H (RH) in the pol polyprotein (Eickbush and Jamburuthugoda 2008). The exception is the Ty3/Gypsy lineage called Gmr1 (Goodwin and Poulter 2002), which shows an ORF organization identical to that of Ty1/Copia elements. The classification of Ty3/Gypsy LTR retroelements is also still in progress. The ICTV originally classified this family into two major genera, Metavirus and Errantivirus, based on the presence or absence of a third ORF, env (Metaviridae group, Eickbush et al. 2005). However, increasing evidence has revealed that these two genera include both retroviruses and retrotransposons (for more details, see Ty3/Gypsy family).
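The difference in INT position between the two families can be illustrated with a small heuristic. This is a minimal sketch under stated assumptions: the function name, the domain labels and the returned strings are hypothetical, and the input is assumed to be a list of pol domains already annotated in 5′ to 3′ order. Because of the Gmr1 exception, the result describes only the domain order and cannot by itself assign an element to a family.

```python
def pol_order_type(domains):
    """Describe the INT position in a 5'->3' list of pol domain labels.

    `domains` is a hypothetical annotation such as ["PR", "INT", "RT", "RH"].
    """
    index = {name: i for i, name in enumerate(domains)}
    if "INT" not in index:
        return "unknown"
    # INT directly after PR: the order described for Ty1/Copia (and Gmr1).
    if "PR" in index and index["INT"] == index["PR"] + 1:
        return "Ty1/Copia-like order"
    # INT directly after RH: the order described for most Ty3/Gypsy elements.
    if "RH" in index and index["INT"] == index["RH"] + 1:
        return "Ty3/Gypsy-like order"
    return "unknown"


print(pol_order_type(["PR", "INT", "RT", "RH"]))
print(pol_order_type(["PR", "RT", "RH", "INT"]))
```

A Gmr1 element would be reported here as having a Ty1/Copia-like order even though it belongs to the Ty3/Gypsy family, which is why sequence similarity is needed for the actual classification.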
Bel/Pao LTR retroelements have only been described in metazoan genomes. This family includes both LTR retrotransposons and retroviruses, as it is now known that some Bel/Pao species encode putative env-like genes (Bowen and McDonald 1999; Havecker et al. 2004; Eickbush and Jamburuthugoda 2008). These LTR retroelements show the same genomic ORF organization as the Ty3/Gypsy family and are taxonomically known as semotiviruses (according to the ICTV, Eickbush et al. 2005). For more information, see Bel/Pao family.
Retroviridae constitute a family of retroviruses that specifically inhabit or circulate in the genomes of vertebrates. The gag-pol-env genomic ORF organization of the Retroviridae is similar to that of Ty3/Gypsy and Bel/Pao simple retroviruses. Most Retroviridae species, however, are complex retroviruses whose genomes carry additional accessory genes that are necessary for the replication cycle and for transmission from one host to another. According to the ICTV (Fauquet et al. 2005), the Retroviridae can be divided into seven genera: Alpharetrovirus, Betaretrovirus, Gammaretrovirus, Deltaretrovirus, Epsilonretrovirus, Spumavirus and Lentivirus (for further information, see Retroviridae family).
Plant caulimoviruses (Caulimoviridae) are double-stranded DNA, non-enveloped pararetroviruses that do not regularly integrate into the host genome for replication. Caulimoviral particles can be either bacilliform or isometric. According to the ICTV, this family comprises six genera: Caulimovirus, Soymovirus, Cavemovirus, Tungrovirus, Badnavirus and Petuvirus (Hull et al. 2005). Caulimoviruses replicate in plants via an RNA intermediate and are thought to have evolved from LTR retroelements (Bousalem et al. 2008). Although caulimoviruses are not LTR retroelements and have no LTRs, they show a gag-PR-RT-RH genomic architecture similar to that of LTR retroelements. Additionally, caulimoviruses encode other proteins required for their replication cycle and transmission (see the figure below, which represents the genome structure of the Cauliflower mosaic virus (CaMV) (Franck et al. 1980; Stavolone et al. 2005)). That is, the different caulimovirus species show particular ORFs encoding both structural and non-structural proteins (movement protein, proteases, reverse transcriptase, RNase H and transactivator protein), as well as some proteins whose function is still unknown.
Non-autonomous LTR retroelements are a family of retrotransposons showing the LTR, PBS and PPT sequences characteristic of autonomous LTR retrotransposons but usually lacking coding capacity. Two classes of non-autonomous LTR retrotransposons have been described: the LArge Retrotransposon Derivatives (LARDs) and the Terminal-repeat Retrotransposons In Miniature (TRIMs). These elements probably depend on mobility-related proteins encoded by other functional (autonomous) retrotransposons. To date, non-autonomous LTR retroelements have only been described in plants. As no intermediate sequences between TRIMs and LARDs have been described, the two classes appear to follow different replication or life-cycle strategies and probably have distinct evolutionary histories (Kalendar et al. 2004).
LARDs range from 5.5 kb to 8.5 kb and have large, conserved, non-coding internal domains of 3.5 kb flanked by long LTRs (4.5 kb). LARD elements have been found throughout the Triticeae tribe (they were primarily described in barley and related grasses, where they are called Sukkula-like elements), but they have also been detected in grass species outside the Triticeae and in rice (Jiang et al. 2002a; 2002b). It has been suggested that LARDs evolved from Ty3/Gypsy LTR retroelements, as they apparently use, for replication and integration, distinct protein products encoded by their putative Ty3/Gypsy LTR retroelement partners (Kalendar et al. 2004).
TRIM elements are less than 540 bp in size. They have a small non-coding central domain of 100–300 bp, which contains the PBS and PPT motifs and is flanked by small LTRs of 100–250 bp. The complete lack of coding capacity makes it impossible to classify TRIM elements conventionally into the Ty1/Copia or Ty3/Gypsy families (Witte et al. 2001). These elements have been described in both monocotyledonous and dicotyledonous plant species and seem to be actively involved in the restructuring of plant genomes (Witte et al. 2001).
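As a rough illustration, the overall size ranges quoted above for TRIMs (less than 540 bp) and LARDs (5.5 kb to 8.5 kb) can be turned into a simple size filter. This is a sketch only: the constant and function names are hypothetical, the thresholds are taken directly from the figures given in the text, and the input is assumed to be the total length of an element already known to lack coding capacity. Size alone is not a classification criterion.

```python
# Thresholds taken from the size ranges quoted in the text (in base pairs).
TRIM_MAX_BP = 540
LARD_MIN_BP = 5500
LARD_MAX_BP = 8500


def size_class(length_bp):
    """Return a coarse size label for a non-coding LTR element of the given length."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    if length_bp < TRIM_MAX_BP:
        return "TRIM-sized"
    if LARD_MIN_BP <= length_bp <= LARD_MAX_BP:
        return "LARD-sized"
    # Anything between or beyond the two quoted ranges is left unlabeled.
    return "unclassified"


print(size_class(350))
print(size_class(7000))
print(size_class(2000))
```

The gap between the two ranges reflects the observation that no intermediate sequences between TRIMs and LARDs have been described.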