Sci Rep
2016 Jan 22;6:30330. doi: 10.1038/srep30330.
BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment.
Boel A, Steyaert W, De Rocker N, Menten B, Callewaert B, De Paepe A, Coucke P, Willaert A.
Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and can analyze data from any organism with a sequenced genome.
Figure 1. Implementation of BATCH-GE. Multiple singleplex PCR products (S1, S2, …, Sn) (upper panel, left) that correspond to different genomic sequences in one specific or in different genomes are pooled in equimolar amounts. Subsequently, the pools are used as DNA input for NGS library preparation using the Nextera XT library preparation kit, which simultaneously fragments and tags input DNA (upper panel, middle). The tagging involves the addition of unique adapter sequences in order to provide sequencing indices on both sides of the amplicons (depicted by yellow, grey, light and dark blue bars). In a final step, all molecules are pooled in a single tube prior to NGS sequencing (upper panel, right). BATCH-GE analyses the data sample-by-sample in an automated batchwise manner. The experimental specifications needed to run BATCH-GE are supplied via two input files (middle panel, E (Experiment.csv) and C (Cutsites.bed) icons). In a first step, raw sequencing data is converted into the SAM file format. Secondly, BATCH-GE screens the reads in the SAM file for their coverage of the region(s) of interest, which are user-defined regions, encompassing the theoretical CRISPR/Cas9 cut site, 3 base pairs upstream of the PAM sequence (middle panel, grey sequence). Thirdly, reads that do not fully cover the region of interest are discarded from the analysis, since they lack information about the presence or absence of indels in this region (middle panel, indicated by a mark/cross). Subsequently, the remaining reads (indicated by a tick) are screened for insertions and deletions initiated within the same user-defined region of interest (middle panel, grey dash-lined box). The detected indel variants, along with information about their position, type, length and their frequency are written to a “Variants” text file. Reads that do not contain any indel, are screened for the presence of intended base pair alterations. Frequencies of partial and full repairs are listed in the “RepairReport” file. Additionally, general indel and repair rates are indicated in the “Efficiencies” file. Lastly, URLs (“URL” file) enable read visualization in the freeware UCSC Genome Browser database (ref. 22).
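The read-screening steps in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not BATCH-GE's own code: the function names, the `(position, CIGAR)` read representation and the example coordinates are all assumptions made for illustration. It only shows one plausible way to discard reads that do not fully cover a region of interest and to count reads carrying an indel that starts inside that region, using the standard SAM CIGAR string.

```python
import re

# CIGAR operations as defined in the SAM format: length followed by an op code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '50M3D47M' into (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def reference_span(pos, cigar):
    """1-based inclusive reference interval covered by an alignment."""
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
    return pos, pos + ref_len - 1


def covers_region(pos, cigar, roi_start, roi_end):
    """True if the alignment spans the whole region of interest."""
    start, end = reference_span(pos, cigar)
    return start <= roi_start and end >= roi_end


def indels_in_region(pos, cigar, roi_start, roi_end):
    """Indels that start inside the region, as (type, ref_position, length)."""
    found = []
    ref = pos  # reference coordinate of the next aligned base
    for n, op in parse_cigar(cigar):
        if op == "I":
            if roi_start <= ref <= roi_end:
                found.append(("insertion", ref, n))
        elif op == "D":
            if roi_start <= ref <= roi_end:
                found.append(("deletion", ref, n))
            ref += n
        elif op in "MN=X":
            ref += n
        # S, H and P do not consume reference bases
    return found


def indel_rate(reads, roi_start, roi_end):
    """Return (retained_reads, fraction_of_retained_reads_with_an_indel)."""
    retained = [r for r in reads if covers_region(r[0], r[1], roi_start, roi_end)]
    mutated = [r for r in retained if indels_in_region(r[0], r[1], roi_start, roi_end)]
    rate = len(mutated) / len(retained) if retained else 0.0
    return len(retained), rate


# Hypothetical reads: (leftmost mapping position, CIGAR string)
reads = [(100, "50M3D47M"), (100, "30M"), (100, "20M2I78M"), (100, "100M")]
retained, rate = indel_rate(reads, roi_start=140, roi_end=160)
```

In this toy input the second read is discarded because it ends before the region, and the insertion in the third read is ignored because it starts outside the region, so only the first read counts as mutated.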
Figure 2. BATCH-GE output files for a specific genome editing experiment targeting the tprkb gene. (a) The “Variants” text file lists chromosome, chromosomal location of the variant, type of the variant, length, the reference sequence surrounding the indel (10 bp upstream and 10 bp downstream of the indel) with [] marking the inserted sequence or with [deleted base pairs] marking the deleted sequence, and absolute and relative frequency of the variants. (b) In case of HDR analysis, the reads which do not contain any indel, are screened for the presence of the intended base pair alterations. BATCH-GE can distinguish between full and partial repair, in case multiple base pair alterations are intended to be introduced in the region of interest. If partial repair is encountered, the specific sequence of the partial repair is listed. (c) General indel and repair rates are shown in the “Efficiencies” file. (d) URLs are generated (“URL” file) which allow visualization of the reads in the freeware UCSC Genome Browser database (ref. 26). However, if the number of total reads (also the reads that are discarded by the tool) exceeds 1000, visualization via UCSC is no longer possible. As an alternative, raw NGS result files (fastQ) can be uploaded into the Integrative Genomics Viewer (IGV) (refs. 19, 20).
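The distinction between full and partial repair in panel (b) can be sketched as follows. This is an illustrative assumption about how such a check might look, not the tool's actual implementation: `classify_repair`, the dictionary of intended alterations and the coordinates are hypothetical, and the read is assumed to be indel-free and to cover every intended position, so that read offsets map directly onto reference positions.

```python
def classify_repair(read_seq, read_start, intended):
    """Classify an indel-free read as 'full', 'partial' or 'none'.

    read_seq   -- aligned read bases (no indels assumed)
    read_start -- 1-based reference position of the first read base
    intended   -- dict mapping reference position to the intended base
    """
    hits = sum(
        1
        for pos, base in intended.items()
        if read_seq[pos - read_start] == base
    )
    if hits == len(intended):
        return "full"
    if hits > 0:
        return "partial"
    return "none"


# Hypothetical read covering reference positions 100-109
read = "ACGTACGTAC"
full = classify_repair(read, 100, {102: "G", 105: "C"})
partial = classify_repair(read, 100, {102: "G", 105: "T"})
none = classify_repair(read, 100, {102: "T", 105: "T"})
```

Counting the reads in each class and dividing by the number of retained reads would then give repair frequencies of the kind reported in the output files.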
Figure 3. Indel rates and read number, as a function of the size of the region of interest used in BATCH-GE. The raw sequencing data derived from CRISPR/Cas9 assays (slc2a10, pls3, tapt1a, myt1la, tprkb) injected with 25 pg sgRNA and 250 pg Cas9 and analysed at 1 dpf were reanalysed while varying the size of the region of interest from 20 to 100 bp. The blue bars represent the number of reads retained by BATCH-GE when screened for coverage of the user-defined region of interest. The red line represents the indel rate as a function of the size of the region of interest.
Auer,
CRISPR/Cas9 and TALEN-mediated knock-in approaches in zebrafish.
2014,
Pubmed
Babon,
The use of resolvases T4 endonuclease VII and T7 endonuclease I in mutation detection.
2003,
Pubmed
Bedell,
In vivo genome editing using a high-efficiency TALEN system.
2012,
Pubmed
Bétermier,
Is non-homologous end-joining really an inherently error-prone process?
2014,
Pubmed
Boel,
Publisher Correction: BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment.
2018,
Pubmed
Cermak,
Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting.
2011,
Pubmed
Cong,
Multiplex genome engineering using CRISPR/Cas systems.
2013,
Pubmed
De Leeneer,
Flexible, scalable, and efficient targeted resequencing on a benchtop sequencer for variant detection in clinical practice.
2015,
Pubmed
Fu,
High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells.
2013,
Pubmed
Güell,
Genome editing assessment using CRISPR Genome Analyzer (CRISPR-GA).
2014,
Pubmed
Horvath,
CRISPR/Cas, the immune system of bacteria and archaea.
2010,
Pubmed
Hruscha,
Efficient CRISPR/Cas9 genome editing with low off-target effects in zebrafish.
2013,
Pubmed
Hsu,
DNA targeting specificity of RNA-guided Cas9 nucleases.
2013,
Pubmed
Huang,
A simple, high sensitivity mutation screening using Ampligase mediated T7 endonuclease I and Surveyor nuclease with microfluidic capillary electrophoresis.
2012,
Pubmed
Hwang,
Efficient genome editing in zebrafish using a CRISPR-Cas system.
2013,
Pubmed
Hwang,
Heritable and precise zebrafish genome editing using a CRISPR-Cas system.
2013,
Pubmed
Iliakis,
Mechanisms of DNA double strand break repair and chromosome aberration formation.
2004,
Pubmed
Irion,
Precise and efficient genome editing in zebrafish using the CRISPR/Cas9 system.
2014,
Pubmed
Jinek,
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity.
2012,
Pubmed
Kent,
The human genome browser at UCSC.
2002,
Pubmed
Kim,
Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins.
2014,
Pubmed
Li,
Efficient and heritable gene targeting in tilapia by CRISPR/Cas9.
2014,
Pubmed
Mali,
RNA-guided human genome engineering via Cas9.
2013,
Pubmed
Naito,
CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites.
2015,
Pubmed
Qiu,
Mutation detection using Surveyor nuclease.
2004,
Pubmed
Robinson,
Integrative genomics viewer.
2011,
Pubmed
Sung,
Highly efficient gene knockout in mice and zebrafish with RNA-guided endonucleases.
2014,
Pubmed
Terns,
CRISPR-based adaptive immune systems.
2011,
Pubmed
Thorvaldsdóttir,
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.
2013,
Pubmed
Tsuji,
Development of a simple and highly sensitive mutation screening system by enzyme mismatch cleavage with optimized conditions for standard laboratories.
2008,
Pubmed
Urnov,
Genome editing with engineered zinc finger nucleases.
2010,
Pubmed
Urnov,
Highly efficient endogenous human gene correction using designed zinc-finger nucleases.
2005,
Pubmed
van Dijk,
Ten years of next-generation sequencing technology.
2014,
Pubmed
Varshney,
High-throughput gene targeting and phenotyping in zebrafish using CRISPR/Cas9.
2015,
Pubmed
Vouillot,
Comparison of T7E1 and surveyor mismatch cleavage assays to detect mutations triggered by engineered nucleases.
2015,
Pubmed
Wiedenheft,
RNA-guided genetic silencing systems in bacteria and archaea.
2012,
Pubmed