Citation: Chung YS, Jun T, Kim C. 2017.
Digestion efficiency differences of restriction
enzymes frequently used for genotype-by-
sequencing technology. Korean Journal of
Agricultural Science 44:318-324.
DOI: https://doi.org/10.7744/kjoas.20170042
Editor: Bo-Keun Ha, Chonnam National
University, Korea
Received: July 26, 2017
Revised: August 22, 2017
Accepted: August 22, 2017
Copyright: © 2017 Korean Journal of
Agricultural Science.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
ISSN (Print) : 2466-2402
ISSN (Online) : 2466-2410
OPEN ACCESS
Korean Journal of Agricultural Science 44(3) September 2017 318
PLANT & FOREST
Digestion efficiency differences of restriction
enzymes frequently used for
genotype-by-sequencing technology
Yong Suk Chung1, Taehwan Jun2, Changsoo Kim1*
1Department of Crop Science, College of Life Sciences, Chungnam National University, Daejeon 34134,
Korea
2Department of Crop Science, Pusan National University, Miryang 50463, Korea
*Corresponding author: [email protected]
Abstract
With the development of next-generation sequencing (NGS), a cutting-edge technology,
genotype-by-sequencing (GBS) became available at a low cost per sample. GBS makes it
possible to customize the process of library preparation to obtain high-quality single
nucleotide polymorphisms (SNPs) in the most efficient way. However, a GBS library is
hard to construct because the concentration of each reagent and the set-up must be fine-tuned.
The major reason for this is the presence of undigested genomic DNA (gDNA): the efficiency
of a given restriction enzyme differs among species for unknown reasons. Therefore, this
proof-of-concept study demonstrates the unpredictable patterns of enzyme digestion in
various plants in order to alert readers to the caution needed when choosing
restriction enzymes for their GBS library preparations. Indeed, no pattern was found for the
digestibility of gDNA samples and restriction enzymes in the current study. We suggest
that more data should be accumulated on this matter to help researchers who want to apply
GBS technologies in a variety of genetic approaches.
Keywords: dicot, monocot, next-generation sequencing, plant genomic DNA, restriction
digestion
Introduction
With the advent of next-generation sequencing (NGS), a cutting-edge technology, genotype-
by-sequencing (GBS), has emerged for the sequencing of multiplexed samples (Elshire et al.,
2011). It can perform molecular marker discovery and genotyping at the same time (Poland and
Rife, 2012; He et al., 2014; Kim et al., 2016). Because of its cost and effectiveness, it has been
applied to deal with the large quantities of samples generated by various genetic or breeding
populations such as conventional biparental populations, advanced backcross populations, nested
association mapping populations, and diversity panels from many plant species. Although many
commercial kits are currently available on the market, the cost per sample is still high, particularly when
handling multiple samples from those large-scale populations. GBS makes it possible to customize the process of
library preparation at much less cost, dramatically reducing the cost per sample (data point). It is by far the most
efficient way to get high-quality single nucleotide polymorphisms (SNPs) (Gore et al., 2007; Gore et al., 2009).
Accordingly, many genetics laboratories have used publicly available protocols or set up their own protocols to reduce
genotyping costs. However, constructing GBS libraries involves multiple steps that combine different basic techniques of
molecular biology, including restriction digestion, ligation, purification, and polymerase chain reaction (PCR).
Consequently, troubleshooting is another issue because library preparation tends to be error-prone. In other words, the
quality control of GBS libraries has to be seriously considered in order not to waste resources and labor. In our
multi-year experience with the GBS procedure, we have encountered different issues leading to the failure of the entire
experiment. Those issues can occur at any step, such as restriction digestion, ligation, purification, and PCR. In our
latest experience, our library preparation failed multiple times for unknown reasons, and we tried to exclude
potential issues step by step. With the help of fragment analysis, the main source of failure was found in the restriction
digestion, which was unexpected. One of the reasons why restriction enzymes (REs) are used in library preparations is
to control genomic representations indirectly. GBS uses a low coverage of genomic data; however, if the coverage for
each locus is less than 2, it generates a plethora of missing or false genotype data. Since the genome sizes of plants
vary, researchers use different combinations of REs (e.g. four-, five-, or six-base cutters) according to their digestion
probabilities, resulting in an increase of coverage in each locus. Some may want to use methylation-sensitive REs to
focus on the euchromatic regions of genomes, enriching coding regions in a GBS library. Sometimes, REs do not work
well due to star activity, in which REs cleave sequences that are similar but not identical to their recognition
sites. This can be overcome by using high-fidelity REs provided by major suppliers. However, the most important
point is that the efficiency of restriction digestion varies among plants, which means that prior knowledge of RE
cutting profiles is needed for as many plant species as possible. A basic way to visualize those profiles is gel
electrophoresis, but it does not offer enough resolution to show whether the fragments are well formed, and it
requires quite large amounts of digested gDNA. The efficiency
of restriction digestion will determine that of downstream steps in GBS library preparation.
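The digestion probabilities mentioned above can be illustrated with a rough calculation. The following Python sketch assumes a genome of independent, equally frequent bases, which real plant genomes do not satisfy (GC content and CpG depletion shift the numbers), so it gives only an order-of-magnitude expectation. The function names and the 400 Mb genome size are hypothetical and are not taken from this study.

```python
def expected_fragment_length(site: str) -> float:
    """Mean spacing (bp) between recognition sites in a random sequence.

    Assumes independent bases with probability 1/4 each; an 'N' in the
    site matches any base and therefore does not lower the probability.
    """
    specified = sum(1 for base in site.upper() if base in "ACGT")
    return 4.0 ** specified


def expected_site_count(site: str, genome_size_bp: float) -> float:
    """Expected number of recognition sites in a genome of the given size."""
    return genome_size_bp / expected_fragment_length(site)


if __name__ == "__main__":
    genome_size_bp = 400e6  # hypothetical genome size, for illustration only
    for name, site in [("MseI", "TTAA"), ("DdeI", "CTNAG"), ("PstI", "CTGCAG")]:
        spacing = expected_fragment_length(site)
        count = expected_site_count(site, genome_size_bp)
        print(f"{name}: one site every ~{spacing:.0f} bp, ~{count:.0f} sites")
```

Under this assumption a five-base site containing one N, such as CTNAG, is expected as often as a four-base site, because only four positions are specified.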
To sum up, GBS is low in cost, requires less sample handling and fewer PCR and purification steps, needs no size
fractionation, is not limited by reference sequences, offers efficient barcoding, and is easy to scale up (Davey et al.,
2011). These advantageous features make GBS a very powerful tool for many kinds of plant genetic studies, including
genetic mapping, association mapping, genome-wide association studies (GWAS), genomic selection, polyploidy, and
genetic-diversity studies. This is possible not only because the features of GBS make it highly reproducible but also
because of the extreme specificity of enzyme digestion sites.
As briefly stated above, there are sites in genomic DNA (gDNA) that cannot be cleaved with methylation-sensitive
REs (Susan et al., 1994). Thus, gDNA digestion protocols should take into consideration the probability of having
target sites over a genomic size of hundreds of mega-bases. Nevertheless, for unknown reasons, some gDNA samples
were not digested in this study. This poses a serious problem in constructing GBS libraries. Accordingly, the
objective of this proof-of-concept study is to profile the unpredictable patterns of enzyme digestion of various genomic
DNA samples so that readers will be cautious when choosing REs for their GBS library construction.
Materials and Methods
Plant materials
Six diverse plant species were randomly selected: three monocots, zoysiagrass (Zoysia japonica Steud.,
Chloridoideae subfamily), rice (Oryza sativa L., Oryzoideae subfamily), and sorghum (Sorghum bicolor L.,
Panicoideae subfamily), and three dicots, lettuce (Lactuca sativa, Cichorioideae subfamily), perilla (Perilla
frutescens, Lamiaceae family), and tomato (Solanum lycopersicum, Solanoideae subfamily). These plants cover
quite a number of subfamilies across monocots and dicots. The gDNA of each plant, grown in the
greenhouse at Chungnam National University in Daejeon, Korea, was extracted using a CTAB method (Doyle and
Doyle, 1987) and diluted to 40 ng/μL. To obtain intact DNA, each sample was frozen in liquid nitrogen and
homogenized with a mortar and pestle or a mechanical homogenizer (Honeycutt et al., 1992; Guillemaut and
Maréchal-Drouard, 1992), followed by phenol-chloroform-isoamyl alcohol extraction (Zhu et al., 1993) to obtain clean
DNA samples so that the REs were not blocked by contaminating proteins during digestion.
REs and digestion conditions
Ten REs that are frequently used for GBS library preparation were selected (Table 1). Two of them were
methylation-sensitive and the others were methylation-insensitive. Recognition sites ranged from 4 to 6 bases. The REs
were obtained from two companies (New England Biolabs, MA, U.S.A. and Enzynomics, Daejeon, Korea); their
quality was comparable, without noticeable variation (unpublished data). Eight units of each RE and 2.0 μL of the
corresponding enzyme buffer were incubated with 3.0 μL of gDNA (120 ng/μL) in a total volume of 20 μL for 2 hours
at 37℃, followed by further incubation for 20 minutes at 65℃. Because the stock concentrations of the REs differed,
the volume of RE added varied, and the volume of H2O was adjusted accordingly.
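The volume adjustment described above can be sketched as a small calculation. This is a minimal Python sketch, assuming only the fixed quantities stated in the text (8 units of RE, 2.0 μL of buffer, 3.0 μL of gDNA, 20 μL total); the function name and the stock concentration of 10 units/μL in the usage example are hypothetical.

```python
def reaction_volumes(enzyme_units_per_ul: float,
                     units_needed: float = 8.0,
                     buffer_ul: float = 2.0,
                     gdna_ul: float = 3.0,
                     total_ul: float = 20.0) -> dict:
    """Return the volume (in microliters) of each component of one digestion.

    The enzyme volume depends on the stock concentration; water fills the
    reaction up to the total volume.
    """
    if enzyme_units_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    enzyme_ul = units_needed / enzyme_units_per_ul
    water_ul = total_ul - buffer_ul - gdna_ul - enzyme_ul
    if water_ul < 0:
        raise ValueError("enzyme stock is too dilute for this reaction volume")
    return {
        "enzyme_ul": enzyme_ul,
        "buffer_ul": buffer_ul,
        "gdna_ul": gdna_ul,
        "water_ul": water_ul,
    }


if __name__ == "__main__":
    # Hypothetical stock concentration, for illustration only.
    for component, volume in reaction_volumes(10.0).items():
        print(f"{component}: {volume:.1f}")
```

Raising an error when the water volume would be negative makes it obvious when a dilute enzyme stock cannot fit into the 20 μL reaction.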
Table 1. List of restriction enzymes used in the current study.
Enzyme   Company   Recognition site   Site length   Methylation sensitivity   Temp (℃)
BamHI    EN (y)    G↓GATCC            6             No                        37
DdeI     NEB (z)   C↓TNAG             5             No                        37
HpaII    EN        C↓CGG              4             CpG                       37
KpnI     NEB       GGTAC↓C            6             No                        37
MseI     NEB       T↓TAA              4             No                        37
MspI     NEB       CC↓GG              4             No                        37
NsiI     EN        C↓TCGAG            6             CpG                       37
PstI     NEB       CTGCA↓G            6             No                        37
SacI     NEB       GAGCT↓C            6             No                        37
SphI     NEB       GCATG↓C            6             No                        37
(y) Enzynomics. (z) New England BioLabs.
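To make the notation in Table 1 concrete, the following Python sketch performs an in silico digest of a short sequence. It is an illustration only and not part of the study's method: the function names and the toy sequences are hypothetical, the cut position "↓" is written as "^", only the top strand of a linear molecule is searched (sufficient for palindromic sites), and methylation is ignored.

```python
import re

# Regex equivalents of the IUPAC codes that occur in Table 1.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def parse_site(site_with_cut: str):
    """Split a site such as 'G^GATCC' into a compiled regex and a cut offset."""
    cut_offset = site_with_cut.index("^")
    bases = site_with_cut.replace("^", "")
    pattern = "".join(IUPAC[base] for base in bases)
    # A lookahead lets overlapping occurrences of the site be found.
    return re.compile(f"(?=({pattern}))"), cut_offset


def digest(sequence: str, site_with_cut: str) -> list:
    """Return the fragments produced by cutting a linear sequence at every site."""
    regex, cut_offset = parse_site(site_with_cut)
    seq = sequence.upper()
    cuts = [match.start() + cut_offset for match in regex.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "ACTTAGTTAACGGATCCA"  # hypothetical sequence
    for name, site in [("BamHI", "G^GATCC"), ("DdeI", "C^TNAG"), ("MseI", "T^TAA")]:
        print(name, digest(toy, site))
```

A sequence with no recognition site is returned as a single fragment, which corresponds to an "N" outcome in a digestibility table only in the trivial sense that the site is absent; it does not model the unexplained digestion failures reported here.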
Fragment analysis
DNA fragments generated by REs were detected and visualized using Q-Analyzer (Wind Hill Technologies Co.,
Ltd, Shanghai, China). The program method was M-4-10-06-300, with sample injection at 3 kV for 10 seconds and
separation at 6 kV for 300 seconds, covering a range of 15 - 1,000 bp at 2 - 4 bp accuracy. The cartridge type was S1
(high-resolution cartridge) and the alignment marker was MA-1.
Results and Discussion
Notably, the gDNA cutting rates of the ten REs on the six plant species did not exceed 50 percent (data not
shown), which is very low. The gDNAs of zoysiagrass, rice, and perilla were each cleaved by five of the 10 REs
(Table 2, Fig. 1). This pattern was not dependent on the kind of RE, the number of bases it recognizes, or its methylation sensitivity.
Likewise, the gDNAs of sorghum and tomato were cleaved by four REs and that of lettuce was cut by three REs (Table 2) in an
irregular pattern. No particular pattern was observed when those samples were grouped into monocots and dicots. In
addition, the lengths of recognition sites did not affect the profiles of digestion patterns. Methylation sensitivity also
does not seem to be an important factor in the cleavage of gDNA, based on the results for the methylation-sensitive
enzymes HpaII and NsiI (Comb and Goodman, 1990). The purpose of using methylation-sensitive enzymes in the GBS process
is to increase coverage at each genetic locus. Indeed, methylated DNA is concentrated in
heterochromatic regions, which contain many repeated sequences such as transposable elements. However,
many whole-genome studies have shown that a large portion of genomic DNA is also methylated in euchromatic
regions, which are gene-rich areas (Arabidopsis Genome Initiative, 2000; Paterson et al., 2009; Schnable et al., 2009).
Therefore, one needs to be very cautious when using methylation-sensitive REs in GBS because useful genomic
information can be missed. Furthermore, the recognition sites do not seem to influence the distribution of DNA
fragment sizes. The distribution of DNA fragment sizes is related not only to the number of bases in the recognition
site but also to the RE dosage and incubation time. This should be investigated further because many different factors
can contribute to the efficiency of REs. However, the patterns found in this study may not cover all the patterns of
other species. Thus, it would be necessary to increase plant diversity as well as the kinds of REs.
Because this was a preliminary study to test whether all REs could cut any gDNA, the amount of gDNA used was the
minimum detectable by the fragment analyzer. Therefore, the peaks are not visually obvious. Hence, it is
recommended that more gDNA be used in order to quantify the gDNA fragments of different sizes.
Table 2. The digestibility of each restriction enzyme for six species.
Enzyme   Site length   Zoysiagrass   Rice   Sorghum   Lettuce   Perilla   Tomato
BamHI    6             Y (y)         Y      Y         Y         Y         N (z)
DdeI     5             N             N      N         N         Y         N
HpaII    4             Y             Y      N         N         Y         Y
KpnI     6             Y             Y      Y         Y         Y         Y
MseI     4             Y             Y      Y         N         Y         Y
MspI     4             N             N      Y         N         N         N
NsiI     6             N             Y      N         Y         N         Y
PstI     6             N             N      N         N         N         N
SacI     6             N             N      N         N         N         N
SphI     6             Y             N      N         N         N         N
(y) Y indicates gDNA was digested. (z) N indicates gDNA was not digested.
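The Y/N calls in Table 2 can be tallied per species and per enzyme with a few lines of Python. The sketch below simply transcribes the table; the variable and function names are hypothetical, and the tally adds no information beyond what the table already contains.

```python
SPECIES = ["Zoysiagrass", "Rice", "Sorghum", "Lettuce", "Perilla", "Tomato"]

# One string per enzyme, one character per species, in the order of SPECIES.
TABLE2 = {
    "BamHI": "YYYYYN",
    "DdeI":  "NNNNYN",
    "HpaII": "YYNNYY",
    "KpnI":  "YYYYYY",
    "MseI":  "YYYNYY",
    "MspI":  "NNYNNN",
    "NsiI":  "NYNYNY",
    "PstI":  "NNNNNN",
    "SacI":  "NNNNNN",
    "SphI":  "YNNNNN",
}


def enzymes_cutting_each_species(table: dict) -> dict:
    """Count, for each species, how many enzymes digested its gDNA."""
    return {
        species: sum(1 for calls in table.values() if calls[i] == "Y")
        for i, species in enumerate(SPECIES)
    }


def species_cut_by_each_enzyme(table: dict) -> dict:
    """Count, for each enzyme, how many species it digested."""
    return {enzyme: calls.count("Y") for enzyme, calls in table.items()}


if __name__ == "__main__":
    print(enzymes_cutting_each_species(TABLE2))
    print(species_cut_by_each_enzyme(TABLE2))
```

Keeping the table as one string per enzyme makes it easy to check the transcription against the printed rows before counting.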
Fig. 1. Digestibility of plant genomic DNA samples by 10 different restriction enzymes. X-axis and Y-axis of each graph
represent relative migration times (minutes) and fluorescence units (RFU), respectively. Peaks with red circles in the
graph indicate the profiles of DNA fragments generated by restriction-digestions.
Nevertheless, the fragmentation by REs could be profiled and predicted according to their sizes using the fragment
analyzer. One thing to mention is that the undigested gDNA is not shown because the size of intact gDNA is too large
(around 20 kb) to be seen in our analysable range.
In summary, no patterns were found for digestibility based on either the REs or the plant species, which is unexpected
and very interesting. This phenomenon is likely to perplex GBS users because their gDNA samples may not
be cut by the REs of their choice. Thus, it is very important to let them know that not all REs can cut every gDNA.
GBS uses a combination of low-frequency and high-frequency cutters to digest gDNA; a barcoded adapter is
ligated to one restriction site and a common adapter to the other (Poland and Rife, 2012; He et al., 2014). Therefore,
the selection of REs for GBS approaches is crucial, especially because, as demonstrated in the current study,
enzyme cleavage does not always work properly for unknown reasons. Further study of why this phenomenon occurs
would be another interesting topic. Meanwhile, it would be very valuable to accumulate data on the digestibility of
many other plant species with more REs. Such data would help researchers avoid wasting time and money on failed
gDNA digestion by guiding a cautious choice of REs during GBS experiment set-up, before they test the chosen REs on
their own gDNA samples.
Acknowledgements
This work was supported by the National Agricultural Genome Program (#PJ0122762017, Rural Development
Administration). We also thank Ji Won Kang for assisting with this experiment.
References
Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana.
Nature 408:796-815.
Comb M, Goodman HM. 1990. CpG methylation inhibits proenkephalin gene expression and binding of the
transcription factor AP-2. Nucleic Acids Research 18:3975-3982.
Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. 2011. Genome-wide genetic marker
discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12:499-510.
Doyle J, Doyle JL. 1987. Genomic plant DNA preparation from fresh tissue-CTAB method. Phytochemical Bulletin
19:11-15.
Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE. 2011. A robust, simple
genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e19379.
Gore M, Bradbury P, Hogers R, Kirst M, Verstege E, van Oeveren J, Peleman J, Buckler E, van Eijk M. 2007. Evaluation
of target preparation methods for single-feature polymorphism detection in large complex plant genomes. Crop
Science 47:S-135-S-148.
Gore MA, Chia J-M, Elshire RJ, Sun Q, Ersoz ES, Hurwitz BL, Peiffer JA, McMullen MD, Grills GS, Ross-Ibarra J. 2009.
A first-generation haplotype map of maize. Science 326:1115-1117.
Guillemaut P, Maréchal-Drouard L. 1992. Isolation of plant DNA: A fast, inexpensive, and reliable method. Plant
Molecular Biology Reporter 10:60-65.
He J, Zhao X, Laroche A, Lu ZX, Liu H, Li Z. 2014. Genotyping-by-sequencing (GBS), an ultimate marker-assisted
selection (MAS) tool to accelerate plant breeding. Frontiers in Plant Science 5:484.
Honeycutt RJ, Sobral BW, Keim P, Irvine JE. 1992. A rapid DNA extraction method for sugarcane and its relatives.
Plant Molecular Biology Reporter 10:66-72.
Kim C, Guo H, Kong W, Chandnani R, Shuang L-S, Paterson AH. 2016. Application of genotyping by sequencing
technology to a variety of crop breeding programs. Plant Science 242:14-22.
Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T,
Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U,
Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L,
Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG,
Rahman M, Ware D, Westhoff P, Mayers KF, Messing J, Rokhsar DS. 2009. The Sorghum bicolor genome and the
diversification of grasses. Nature 457:551-556.
Poland JA, Rife TW. 2012. Genotyping-by-sequencing for plant breeding and genetics. Plant Genome 5:92-102.
Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD,
Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F,
Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L,
Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A,
Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J,
Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar
S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser
W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento
L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ,
McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP, Barbazuk
WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y,
Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J,
Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H,
Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK,
Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson
RK. 2009. The B73 maize genome: Complexity, diversity, and dynamics. Science 326:1112-1115.
Susan JC, Harrison J, Paul CL, Frommer M. 1994. High sensitivity mapping of methylated cytosines. Nucleic Acids
Research 22:2990-2997.
Zhu H, Qu F, Zhu LH. 1993. Isolation of genomic DNAs from plants, fungi and bacteria using benzyl chloride. Nucleic
Acids Research 21:5279.