|
Fig. 1. hAT-10 DNA transposon-derived fragments in dm-W intron 3 and exon 4 (Ex4). (A) Distribution of transposon-derived DNA fragments in and around Ex4 of Xenopus laevis (Xl) dm-W and comparison with X. tropicalis (Xt) hAT-10 DNA transposons. Noncoding and coding portions of dm-W exons are indicated with white and gray boxes, respectively (upper). Colored boxes in X. laevis dm-W represent the TE distribution identified by CENSOR, as indicated (middle). A box in the X. tropicalis hAT-10 DNA transposon shows the regions homologous to the hAT-10-derived regions of dm-W (lower). (B) Schematic comparison of dm-W Ex4 and its corresponding regions among Xt hAT-10 and Xl hAT-10-derived sequences. The hAT-10-derived sequences were classified into three regions (named A, B, and C) based on sequence similarity. Regions A (133 bp of the third intron and 16 bp of Ex4 in dm-W) and C (108 bp of Ex4 and 207 bp downstream) share high sequence identity among the sequences, whereas region B (219 bp of Ex4) has lower sequence identity between Xt hAT-10 and each Xl hAT-10 transposon-derived sequence. Nucleotide sequence identity (%) with dm-W is shown for each region. Noncoding and coding exons are represented by white and gray boxes, respectively. The green triangle shows a terminal inverted repeat (TIR). (C) A partial comparison of nucleotide sequences within and adjacent to the Ex4-CDS among the five hAT-10 transposon-derived sequences from X. laevis in (B). The splice acceptor site AG, a deletion, and a stop codon sequence are shaded.
|
|
Fig. 2. Molecular evolution of the hAT superfamily, the hAT-10 family, and hAT-10-derived Ex4 in Xenopus frogs. (A) Copy numbers of Xt hAT-10 transposase (Tp)-like sequences, hAT-10-derived Ex4-like sequences (A–C in fig. 1B), and hAT-10-derived Ex4-CDS-like sequences (B in fig. 1B) in 14 vertebrate species, including anuran amphibians (Xenopus laevis, X. borealis, X. tropicalis, Nanorana parkeri, Rana catesbeiana, and Rhinella marina), a urodele amphibian (Ambystoma mexicanum), a caecilian amphibian (Geotrypetes seraphini), a bird (Gallus gallus), mammals (Homo sapiens and Mus musculus), teleost fish (Oryzias latipes and Tetraodon nigroviridis), and a cartilaginous fish (Callorhinchus milii). GenBank assembly accessions of the 11 species other than the three Xenopus species are shown in supplementary table 3, Supplementary Material online. (B) A repeat landscape of the hAT superfamily, consisting of five families, as inferred in the X. laevis genome using RepeatMasker (upper). The y-axis and x-axis show the percentage of the genome occupied by each family and the Jukes-Cantor-corrected divergence, respectively. The estimated divergence time of the hAT-10-derived regions in and around dm-W Ex4 from the hAT-10 consensus sequence is shown by a triangle on the landscape. After the ancestors of X. laevis and X. tropicalis diverged at 48 Ma, speciation and hybridization of the predicted L and S species occurred at 34 and 17–18 Ma, respectively (Session et al. 2016).
[[Supplementary table 3. GenBank assembly accessions of the 11 vertebrate species used in fig. 2.
Species                 Assembly name                  GenBank assembly accession
Rana catesbeiana        RCv2.1                         GCA_002284835.2
Nanorana parkeri        ASM93562v1                     GCA_000935625.1
Rhinella marina         RM170330                       GCA_900303285.1
Ambystoma mexicanum     AmbMex60DD                     GCA_002915635.3
Geotrypetes seraphini   aGeoSer1.2                     GCA_902459505.2
Homo sapiens            GRCh38.p13                     GCA_000001405.28
Mus musculus            GRCm39                         GCA_000001635.9
Gallus gallus           bGalGal1.mat.broiler.GRCg7b    GCA_016699485.1
Oryzias latipes         ASM223467v1                    GCA_002234675.1
Tetraodon nigroviridis  ASM18073v1                     GCA_000180735.1
Callorhinchus milii     Callorhinchus_milii-6.1.3      GCA_000165045.2]]
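The x-axis of the repeat landscape in fig. 2B is given as Jukes-Cantor-corrected divergence. As a minimal sketch of what that correction does, assuming the standard JC69 formula d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites, the function below converts an observed proportion into a corrected distance. The function name is hypothetical and this is not the RepeatMasker implementation.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Return the JC69-corrected distance for an observed mismatch proportion p.

    Illustrative helper (hypothetical name); p is the fraction of aligned
    sites that differ between a repeat copy and its consensus sequence.
    """
    # The correction is only defined while observed divergence is below 75%.
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

The corrected value is always at least as large as the observed proportion, because it accounts for multiple substitutions at the same site.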
|
|
Fig. 3. The nucleotide and deduced amino acid sequence alignments of the Ex4-CDS of dm-W among four allopolyploid Xenopus species. (A) A multiple nucleotide sequence alignment within and adjacent to Ex4-CDS of X. laevis, X. largeni (MCZ-A cryogenic 333), X. itombwensis (MCZ-A A136197), and X. petersii. Red font highlights the positions of TAG and TAA stop codons. (B) A multiple alignment of the Ex4-encoded amino acid sequences from the four DNA sequences in (A) (upper) and the ML phylogenetic tree of the Ex4 sequences and/or their corresponding hAT-10-derived sequences from X. laevis, X. largeni, X. itombwensis, and X. petersii (lower). Numbers at each node denote the ML/NJ bootstrap percentages of 1000 replicates.
|
|
Fig. 4. Effects of the transposon-derived Ex4 coding region on the in vitro DNA-binding and transrepression activities of DM-W. (A) In vitro DNA binding of DM-W (full length) and its C-terminal truncated protein, DM-W (Δ124–194), to the DMRT1-binding sequence. Flag-DM-W(full) and Flag-DM-W(Δ124–194) were produced using an in vitro transcription–translation system and analyzed by Western blotting with an anti-FLAG antibody followed by an HRP-conjugated antimouse antibody (left). The relative intensity values are shown below. EMSA was performed using in vitro synthesized Flag-DM-W(full) (0.5, 1.0, 2.0, or 3.0 μl) or Flag-DM-W(Δ124–194) (2.0, 4.0, or 5.5 μl) and 32P-labeled double-stranded oligonucleotides containing the DMRT1-binding sequence. The relative protein amounts, binding strength, and ratio of binding strength to protein amount are shown below. (B) Luciferase reporter assay for the transrepression activity of DM-W on DMRT1-driven transcription. The DMRT1-driven luciferase reporter and expression plasmids for DMRT1.S and DM-W(full) or DM-W(Δ124–194) were transfected into HEK293T cells, and luciferase activity was measured 24 h after transfection. Letters (a–d) indicate significant differences based on a one-way ANOVA followed by the Tukey–Kramer HSD test (P < 0.01).
|
|
Supplementary fig. 1. Xenopus frog diversification and the emergence of dm-W after allotetraploidization
The figure was constructed based on findings from five studies (Evans et al. 2011, 2015, 2019; Session et al. 2016; Mawaribuchi et al. 2017b).
|
|
Supplementary fig. 2. Patches of homology among four Xenopus dmrt1 homologous genes, X. laevis (Xl) dm-W, Xl dmrt1.L, Xl dmrt1.S, and X. tropicalis (Xt) dmrt1, identified using the VISTA tool
VISTA plots of the four genomic regions (100 kb of Scaffold78 on Xl chr2L, 96 kb on Xl chr1L, 75 kb on Xl chr1S, and 73 kb on Xt chr1) were constructed using Xl dm-W (upper panel) and Xl dmrt1.S (lower panel) as references. Dark blue and light blue indicate homologous coding and noncoding exons of genes, respectively. Pink shows conserved non-CDSs.
|
|
Supplementary fig. 3. A phylogenetic tree of the regions containing the DM domain-coding sequences among X. laevis (Xl) dm-W, dmrt1.S, dmrt1.L, and X. tropicalis (Xt) dmrt1
The tree was constructed using the MEGAX program, based on approximately 1,700 nucleotides spanning the upstream region of Ex2 to the downstream region of Ex3 of dm-W and the corresponding regions of Xl dmrt1.L and Xl dmrt1.S, with Xt dmrt1 as an outgroup. In the NJ and ML analyses, the best-fit model of nucleotide substitution was selected by model selection using a likelihood ratio test. An identical topology was obtained in both analyses. The ML tree is shown as a representative example. Numbers at each node denote the NJ/ML bootstrap percentage values based on 1000 replicates.
|
|
Supplementary fig. 4. Density of transposable elements (TEs) in X. laevis dm-W, dmrt1.L, and dmrt1.S and flanking regions
TE density was determined using the CENSOR program and is shown as a color gradient from blue (low) to red (high). Noncoding exons and coding portions of exons are represented by white and gray boxes, respectively. Each intron is shown as a horizontal black line, and the number above it indicates the proportion (%) of that intron composed of TE sequence.
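The per-intron TE proportion described above can be illustrated with a short sketch. It assumes TE annotations are available as half-open (start, end) coordinate pairs on the same scale as the intron, merges overlapping annotations so that no base is counted twice, and reports the covered percentage. The function name and input layout are hypothetical and do not reflect CENSOR's own output format.

```python
def te_percent(intron_start, intron_end, te_intervals):
    """Percentage of the intron [intron_start, intron_end) covered by TEs.

    te_intervals: iterable of (start, end) half-open TE annotations
    (hypothetical input layout, for illustration only).
    """
    length = intron_end - intron_start
    if length <= 0:
        raise ValueError("intron must have positive length")

    # Clip every annotation to the intron, then sort by start position.
    clipped = sorted(
        (max(s, intron_start), min(e, intron_end)) for s, e in te_intervals
    )

    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if s >= e:
            continue  # annotation lies entirely outside the intron
        if cur_end is None or s > cur_end:
            # Close the previous merged block and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent annotation: extend the current block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start

    return 100.0 * covered / length
```

Merging before summing matters because TE annotation tools can report overlapping hits for the same stretch of sequence.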
|
|
Supplementary fig. 6. Nucleotide sequence alignments among five hAT-10-derived sequences from X. laevis, including dm-W Ex4, and the corresponding Xt hAT-10 sequence
(A) Sequence alignments of region A and a part of region B (see Fig. 1) among the five X. laevis hAT-10-derived sequences on dm-W-containing scaffold 78, chromosome 2L, chromosome 7L, scaffold 19, and scaffold 30, and X. tropicalis hAT-10. Nucleotides identical among all five sequences and among four sequences are shown by white letters on a black background and black letters on a gray background, respectively. The numbers on the alignments indicate positions in the Xt hAT-10 sequence. Pale blue, pink, and gray horizontal lines indicate the region homologous to the Xt hAT-10 sequence, the Xl dm-W Ex4 CDS, and its non-CDS, respectively.
|
|
Supplementary fig. 6 (continued)
(B) Sequence alignments of regions A and B and a part of region C (see Fig. 1) among the three hAT-10-derived sequences on dm-W-containing scaffold 78, scaffold 30, and Xt hAT-10. Pale blue, pink, and gray lines are as in (A). Nucleotide sequences with pale blue-violet shading indicate the region recognized as Xt hAT-10 sequence by the CENSOR program. Asterisks represent nucleotides identical among the three sequences.
|
|
Supplementary fig. 7. Sequence comparison of Xenopus hAT-10-derived A-C regions
(A) Schematic comparison among Xenopus (X.) tropicalis and X. laevis hAT-10-derived A-C consensus sequences and their surrounding regions. The numbers under the A, B, and C regions show nucleotide identity (%) in pairwise comparisons between the sequences depicted above and below the values. (B) Sequence comparisons of the eight X. laevis and six X. borealis hAT-10-derived A-C sequences in Figure 2A and supplementary table 2, respectively, visualized with Jalview (https://www.jalview.org/). Nucleotides A, T, G, and C are shown in light green, blue, red, and orange, respectively.
|
|
Supplementary Table 1. Nucleotide sequence identity (%) of the three regions A-C for comparisons between dm-W and X. laevis hAT-10-like sequences
A. Sequence identity of dm-W Ex4-containing sequences to Xl and Xt hAT-10-like sequences
Note: Sequence identities were determined using “Search Global Homology” in GENETYX-MAC or the MUSCLE alignment program. Identities of more than 70% between two sequences are shown in red.
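As an illustration of how a pairwise identity value such as those in this table can be derived from an alignment, the sketch below counts matching positions between two pre-aligned sequences. It assumes gaps are written as "-" and that gap-containing columns are excluded from the denominator. Both are assumptions made for this example, and GENETYX-MAC or MUSCLE-based workflows may treat gaps differently. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ("-") are skipped
    (an assumption of this sketch, not necessarily the tools' behavior).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gap-containing columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For example, `percent_identity("ACGT", "ACGA")` compares four columns, three of which match, giving 75.0.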
|
|
Supplementary Table 1 (continued)
B. Sequence identity of Xt hAT-10 to Xl hAT-10-like sequences
Note: Sequence identities were determined using “Search Global Homology” in GENETYX-MAC or the MUSCLE alignment program. Identities of more than 70% between two sequences are shown in red.
|
|
Supplementary Table 2. Nucleotide sequence identity (%) of the three regions A-C for comparisons between dm-W and X. borealis hAT-10-like sequences
|