|
Graphical abstract
|
|
Figure 1. Using self-organizing maps (SOMs) to discover ME GRN
(A) Genome browser view of TF binding during X. tropicalis development. Shown are maternally expressed (Foxh1, Otx1, Sox7, Vegt, Ctnnb1, Smad1, and Smad2/3) and zygotically expressed (Foxa4, Gsc, Eomes, Tbxt, and Vegt) TF binding in the gsc gene locus. Shaded are the well-characterized proximal, distal, and upstream CRMs, associated with TF binding. Further upstream are binding sites in possibly unexplored CRMs.
(B) Datasets used in this analysis, targeting several wild-type and MO-injected embryos at developmental stages important for ME development.
(C) The X. tropicalis genome is partitioned (grey shadings in bottom track) using ChIP-seq and ATAC-seq peak locations. Each partition is assigned a ChIP-seq or ATAC-seq signal, quantified as reads per kilobase per million mapped reads (RPKM), for every chromatin dataset.
(D) The RNA-seq and ChIP-seq/ATAC-seq datasets were each converted into training matrices and clustered by SOM metaclustering in SOMatic. These clusters were then linked using the SOM Linking tool within SOMatic. The pairwise linked metaclusters (LMs) and spatial SOM data were mined for regulatory connections, which were built into networks.
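The RPKM quantification named in (C) can be illustrated with a minimal sketch. It uses the standard RPKM definition; the function name, argument names, and the example numbers are hypothetical and are not taken from the SOMatic pipeline.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads (standard definition)."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    kilobases = region_length_bp / 1_000          # region length in kb
    millions = total_mapped_reads / 1_000_000     # library size in millions of reads
    return read_count / kilobases / millions

# Hypothetical partition: 100 reads in a 2 kb partition, library of 10 million mapped reads.
example = rpkm(100, 2_000, 10_000_000)
```

Normalizing by both partition length and library size is what makes signals comparable across partitions of different widths and datasets of different sequencing depths.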
|
|
Figure 2. RNA-seq SOM metaclustering reveals developmental gene modules that contain similarly regulated genes
(A) SOM slices showing the wild-type gene expression signal at stage 10.5 and the fold change between Foxh1 MO and control experiments at stage 10.5. Creation of the SOM visualization is described in STAR Methods. Metaclusters containing genes from the core ME network show unique temporal dynamics during development. nodal, nodal2, and sia are grouped on the left, and gsc, nodal1, lhx, and osr2 are grouped on the right (top). Overlaid metacluster boundaries show the genes that are up- and downregulated upon Foxh1 MO KD (bottom).
(B) Each metacluster is filled with genes with a similar expression profile (labeled “Eigen-Profile”); for example, a heatmap of the genes in metacluster 11 is shown.
(C) Heatmap of average temporal expression profiles of genes belonging to 13 RNA metaclusters. Parentheses after RNA metaclusters indicate number of genes in each RNA metacluster.
(D) Two-tailed Wilcoxon test applied to gene metaclusters. Each metacluster responded to each MO experiment differently at different time points.
(E) GO term enrichments for genes within three example RNA SOM metaclusters. Each metacluster had unique functional enrichments, supporting the coherence of these clusters.
|
|
Figure 3. SOM-based clustering shows Foxh1 co-binding and functional gene modules during gastrulation
(A) Heatmap of Foxh1 ChIP-enriched metaclusters that visualizes the different patterns of co-regulation present in Foxh1-bound CRMs. Signal is first expressed as TPMs and then maximum normalized. Blue and red represent regions with low and high signals, respectively.
(B) Experiment hierarchy of ATAC/ChIP-seq data after metacluster correction. The developmental stages of each experiment are indicated by the same color coding as (A).
(C) GO term enrichments for genes near genomic regions within three example ATAC/ChIP SOM metaclusters.
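The maximum normalization mentioned in (A) can be sketched as follows. This assumes that each metacluster profile (one heatmap row) is divided by its own maximum; the legend does not state the axis of normalization, so the per-row choice, the function name, and the example values are assumptions for illustration only.

```python
def max_normalize(profile):
    """Scale a list of non-negative signals so that the largest value becomes 1.0."""
    peak = max(profile)
    if peak == 0:
        # An all-zero profile carries no signal; return zeros rather than divide by zero.
        return [0.0 for _ in profile]
    return [value / peak for value in profile]

# Hypothetical TPM values for one metacluster across three experiments.
normalized = max_normalize([2.0, 4.0, 1.0])
```

After this scaling every row spans the same 0 to 1 range, so the color scale reflects the relative pattern across experiments rather than absolute signal strength.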
|
|
Figure 4. RNA metaclusters can be further segregated by spatial RNA SOM
(A) SOM slices from the spatial RNA SOM analysis corresponding to RNAs from the animal, dorsal, and vegetal explants with overlaid spatial RNA metacluster (sR) boundaries. Some important sR locations are noted.
(B) Heatmap of the fold change of genes within sRs over whole-embryo signal, indicating enrichment and reduction of genes in particular RNA metaclusters.
(C) Heatmap of statistical difference between gene expression in each tissue and the whole embryo. Six sRs showed statistically significant differences in ectoderm/mesoderm or in endoderm.
(D) Joint membership of genes in sRs and RNA metaclusters from the full RNA dataset. Rows and columns are hierarchically clustered.
(E) Temporal (from wild type) and spatial gene expression profiles for genes in sR9, sR6, sR15, and sR1 and R38.
(F) Average temporal and spatial gene expression profiles for genes in R23, R16, R11, R10, or R1, based on sRs.
|
|
Figure 5. sR assists in identifying candidate TFs for Xenopus ME differentiation
(A and B) Temporal and spatial gene expression profiles of TFs with motifs found near endodermally (A) or ectodermally (B) enriched genes. Asterisks indicate TFs that show distinct spatial expression.
(C) Temporal and spatial gene expression profiles for spatially differential TFs (bold) matched with the average gene expression profile of their predicted targets. Correlations were calculated by comparing their spatial gene expression profiles.
(D) The temporal and spatial gene expression profiles of genes important in Xenopus ME development, separated by RNA metacluster.
|
|
Figure 6. GRN centered on the activity of Tcf7l1, Sox17, Vegt, Smad2/3, and Foxh1
(A) Our predicted developmental GRN. The active CRMs were identified based on the enrichment of their respective TFs, enrichment of Ep300 signal, and the presence of DNA binding motifs. Shown are literature-identified targets (“prior direct targets”) and potential new connections (“new potential targets”). Note that only a subset of targets is shown, and the network is focused only on TF and signaling molecule targets.
(B) Fold change of relative luciferase units in log scale of putative CRMs comparing Foxh1 binding site mutations over wild type. Each of these shows that enhancer activity depends on Foxh1 binding sites. Two biologically independent experiments were performed.
(C) Fold change of relative luciferase units of putative CRMs comparing Sox17 binding site mutations over wild type. Each shows that enhancer activity depends on Sox17 binding sites. Two biologically independent experiments were performed.
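The fold changes plotted in (B) and (C) compare mutant to wild-type reporter activity. A minimal sketch of that calculation is given below; the legend does not specify the base of the logarithm, so base 2 is assumed here, and the function name and the example relative luciferase unit (RLU) values are hypothetical.

```python
import math

def log2_fold_change(mutant_rlu, wild_type_rlu):
    """Log2 ratio of mutant over wild-type relative luciferase units."""
    if mutant_rlu <= 0 or wild_type_rlu <= 0:
        raise ValueError("RLU values must be positive to take a log ratio")
    return math.log2(mutant_rlu / wild_type_rlu)

# Hypothetical reporter: the binding-site mutant retains a quarter of wild-type activity.
change = log2_fold_change(25.0, 100.0)
```

A negative value indicates that mutating the binding sites reduced reporter activity, which is the pattern that would be read as dependence on those sites.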
|
|
Figure 7. New and known core ME TF targets
List of targets in the core ME network for the TFs: Foxh1, Sox17, Tcf7l1, Vegt, and Smad2/3. Bolded entries are new to this analysis. Underlined entries were successfully validated.
|
|
Figure S1. SOM slices reveal overall structure of the input data. Related to Figures 2 and 3.
(A) An RNA SOM slice corresponding to wild-type (stage 10.5). The SOM unit locations of various important genes in mesendodermal development are noted.
(B) RNA SOM difference slices corresponding to the fold change between MO and control experiments. The RNA metacluster divisions are overlaid on the slices.
(C) DNA SOM slices corresponding to Foxh1, Vegt, and Otx1 ChIPs at stage 8 and one corresponding to their average. The DNA metacluster divisions are overlaid on the slices.
(D) DNA SOM slices corresponding to Foxh1 and Ep300 ChIPs at stage 9 and their average, followed by DNA SOM slices corresponding to Smad2/3, Foxh1, and Gsc ChIPs at stage 10.5. The DNA metacluster divisions are overlaid on the slices.
(E) DNA SOM slices corresponding to Sox17 ChIP at stage 10.5 and Tbxt ChIP at stage 12. The DNA metacluster divisions are overlaid on the slices.
(F) DNA SOM slices corresponding to Ctnnb1, Foxh1, and Smad1 ChIPs at stage 10.5. The DNA metacluster divisions are overlaid on the slices.
|
|
Figure S2. Distribution of TF ChIP peak summits in final partitioning. Related to Figure 2. The number of unique TF peak summits within each partition. 46% of partitions contain at least one TF peak summit.
|
|
Figure S3. Full DNA metacluster heatmap captures known co-regulatory interactions. Related to Figures 2 and 3. The full set of eigenprofiles revealed that several experiments had very similar results on the collected genomic region clusters. Some of these are known co-regulatory interactions in Xenopus or vertebrates in general.
|
|
Figure S4. Motif analysis on linked spatial RNA and DNA SOM metaclusters finds TFs with spatially specific regulation. Related to Figure 4.
(A) Venn diagrams showing the motif overlap between spatial metaclusters with similar temporal gene expression profiles.
(B) The overlaps between motifs found uniquely on each side of the embryo in (A).
|
|
Figure S5. Gene expression profiles of genes from the core mesendoderm network and network filtering strategy. Related to Figure 4.
(A) Heatmap of gene expression across all experiments for each gene from the core mesendoderm network. Columns were sorted by the developmental stage of each experiment. Rows were clustered by SOM metacluster and ordered by the starting stage of gene expression for that metacluster. Signal for each experiment was normalized (vertically).
(B) Motif filtering strategy of the network analysis. After using FIMO to find motifs and Motif Enrichment to filter out motifs that are dispersed evenly across the linked metaclusters, the network connections were further filtered by a joint Ep300 and TF ChIP signal, separately for each of the TFs tested. Then, we selected RNA metaclusters that contained TFs known to be important for mesendodermal development. Finally, we connected TFs to genes from the core mesendodermal network and known TFs to make the final network.
|
|
Figure S6. Final network luciferase validation experiments and importance correlations. Related to Figure 6.
(A) A genome browser view of the predicted and tested Foxh1 target element near gata6 with Foxh1, Sox17, Ctnnb1, and Ep300 ChIP-seq signals for stages 8–12.
(B and C) Fold change of relative luciferase units in log scale after foxh1 or sox17 MO injection. The reporter gene was injected into the vegetal (mesendoderm) region of embryos with or without co-injected MO. Each microinjected CRM reporter showed expression in the mesendoderm (Table S9; Figures S6B and S6C), which suggested that each probed region is a functional enhancer. All reporter gene activities were downregulated in the absence of Foxh1 or Sox17, except for a nodal enhancer reporter. At present, it is unclear whether the genes belonging to R82 are negatively regulated by Sox17 or whether this is due to an isolated action of this nodal CRM.
(D) Tcf7l1 targets and their overlap with a Ctnnb1 ChIP holdout dataset.
(E) Heatmap of the correlation between each importance score and the effect on each of the 12 validation experiments. H3K4me1 scored the best.
|