a multi-purpose toolkit to enable advanced genome ... · 1 large-scale biology article a...
TRANSCRIPT
-
1
LARGE-SCALEBIOLOGYARTICLE
AMulti-purposeToolkittoEnableAdvancedGenomeEngineeringinPlants
TomášČermák1,ShaunJ.Curtin2,3,4,JavierGil-Humanes1,5,RadimČegan6,ThomasJ.Y.Kono3,EvaKonečná1,JosephJ.Belanto1,ColbyG.Starker1,JadeW.Mathre1,RebeccaL.Greenstein1,DanielF.Voytas1
1DepartmentofGenetics,CellBiology&DevelopmentandCenterforGenomeEngineering,UniversityofMinnesota,Minneapolis,MN554552DepartmentofPlantPathology,UniversityofMinnesota,StPaul,MN551083DepartmentofAgronomy&PlantGenetics,UniversityofMinnesota,St.Paul,MN551084Presentaddress:UnitedStatesDepartmentofAgriculture,AgriculturalResearchService,CerealDiseaseLaboratory,StPaul,MN551085Presentaddress:CalyxtInc.,NewBrighton,MN551126DepartmentofPlantDevelopmentalGenetics,InstituteofBiophysicsoftheCAS,v.v.i.,CZ-61265,Brno,CzechRepublicCorrespondingauthor,DanielF.Voytas,[email protected]
Shorttitle:Toolkitforplantgenomeengineering
One-sentencesummary:Thisintegratedreagenttoolkitandstreamlinedprotocolsworkacrossdiverseplantspeciestoenablesophisticatedgenomeedits.
TheauthorresponsiblefordistributionofmaterialsintegraltothefindingspresentedinthisarticleinaccordancewiththepolicydescribedintheInstructionsforAuthors(www.plantcell.org)is:DanielF.Voytas([email protected]).
ABSTRACT
We report a comprehensive toolkit that enables targeted, specificmodificationofmonocot anddicotgenomes using a variety of genome engineering approaches. Our reagents, based on TranscriptionActivator-Like Effector Nucleases TALENs and the Clustered Regularly Interspaced Short PalindromicRepeats (CRISPR)/Cas9 system, are systematized for fast, modular cloning and accommodate diverseregulatory sequences to drive reagent expression. Vectors are optimized to create either single ormultiplegeneknockoutsandlargechromosomaldeletions.Moreover,integrationofgeminivirus-basedvectorsenablesprecisegeneeditingthroughhomologousrecombination.Regulationoftranscriptionisalso possible. Aweb-based tool streamlines vector selection and construction.One advantage of ourplatformistheuseoftheCsy-type(CRISPRsystemyersinia)ribonuclease4Csy4 andtRNAprocessingenzymes to simultaneously express multiple guide RNAs (gRNAs). For example, we demonstratetargeteddeletionsinuptosixgenesbyexpressingtwelvegRNAsfromasingletranscript.Csy4andtRNAexpression systems are almost twice as effective in inducing mutations as gRNAs expressed fromindividual RNA polymerase III promoters. Mutagenesis can be further enhanced 2.5-fold byincorporatingtheTrex2exonuclease.Finally,wedemonstratethatCas9nickasesinducegenetargetingat frequencies comparable to native Cas9 when they are delivered on geminivirus replicons. Thereagents have been successfully validated in tomato (Solanum lycopersicum), tobacco (Nicotianatabacum),Medicagotruncatula,wheat(Triticumaestivum),andbarley(Hordeumvulgare).
Plant Cell Advance Publication. Published on May 18, 2017, doi:10.1105/tpc.16.00922
©2017 American Society of Plant Biologists. All Rights Reserved
-
2
INTRODUCTION
Genome engineering is a rapidly emerging discipline that seeks to develop strategies and
methods for the efficient, targeted modification of DNA in living cells. Most genome
engineering approaches use sequence-specific nucleases (SSNs), such as TALENs and
CRISPR/Cas9reagents,tocreateatargetedDNAdouble-strandbreak(DSB)inagenome.DNA
repair then achieves the desired DNA sequence modification. DSBs are most frequently
repaired by non-homologous end joining (NHEJ), which can create insertion/deletion (indel)
mutations at the break site that knockout gene function. TALEN- or CRISPR/Cas9-mediated
geneknockoutshavebeenmadeindiverseplantspeciesamenabletotransformation(Brooks
etal.,2014;Fauseretal.,2014;Forneretal.,2015;Gaoetal.,2014;Christianetal.,2013;Liang
etal.,2014;Shanetal.,2013;Suganoetal.,2014;Xuetal.,2015;Zhangetal.,2014).Traitsof
commercialvaluehavealsobeencreatedbytargetedknockoutofselectedgenes(Wangetal.,
2014;Clasenetal.,2015;Haunetal.,2014;Lietal.,2015a,2012).Alternatively,DSBscanbe
repairedbyhomologydependentrepair (HDR),whichcopies information froma ‘donor’DNA
template. HDR typically occurs at lower frequencies thanNHEJ in somatic cells, and ismore
challengingtoachievebecause it requires introducing intotheplantcellboththedonorDNA
molecule and the SSNexpression cassette. As describedbelow, however, novel strategies to
deliver these reagents have recently realized some improvements in efficiency (Baltes et al.,
2014;Čermáketal.,2015;Fauseretal.,2012;Schimletal.,2014).
Since the development of TALENs and CRISPR/Cas9 reagents, improvements in the
technologyhaveoccurredatarapidpace.Forexample,indelfrequencieshavebeenincreased
by coupling SSNs with exonucleases (Certo et al., 2012; Kwon et al., 2012), altering gRNA
-
3
architecture (Chenetal.,2013)orusingspecializedpromoters (Wangetal.,2015;Yanetal.,
2015).IncreasedspecificityhasbeenachievedusingpairedTALENorCRISPR/Cas9nickases(Ran
etal.,2013),truncatedgRNAs(Fuetal.,2014)ordimericCRISPR/Cas9nucleases(Guilingeret
al., 2014; Tsai et al., 2014). Other diverse applications have emerged, including targeted
regulation of gene expression, targeted epigenetic modification or site-specific base editing
through deamination (Russa and Qi, 2015; Kungulovski and Jeltsch, 2016; Zong et al., 2017;
Shimatanietal.,2017).
CRISPR/Cas9 is particularly useful for multiplexed gene editing, because target
specificityisdeterminedbyshortgRNAsratherthanproteinsthatmustbeengineeredforeach
newtarget(BaltesandVoytas,2015).Severalsystemsforsimultaneousexpressionofmultiple
gRNAshavebeendeveloped,andthesemostofteninvolvetheassemblyofmultiple,individual
gRNA expression cassettes, each transcribed from a separate RNA Polymerase III (Pol III)
promoter(Xingetal.,2014;Maetal.,2015;Lowderetal.,2015).MultiplegRNAscanalsobe
expressed from a single transcript. Such polycistronic mRNAs are processed post-
transcriptionally into individual gRNAs by RNA-cleaving enzymes. These enzymes include the
CRISPR-associated RNA endoribonuclease Csy4 from Pseudomonas aeruginosa (Tsai et al.,
2014), the tRNAprocessingenzymesnaturallypresent in thehost cells (Xie et al., 2015) and
ribozymes(GaoandZhao,2014).
One significant remaining challenge in plant genome engineering is achieving high
frequencygeneeditingbyHDR.Asmentionedabove,HDRrequiresintroducingintotheplant
cellbothadonorDNAmoleculeandaSSNexpressioncassette.DNAsequencemodifications
incorporated into the donor template are then copied into the broken chromosome. The in
-
4
planta gene targeting (GT) approach stably integrates the donor template into the plant
genome.Thisensuresthateachcellhasatleastonecopyofthedonor,whichisreleasedfrom
thechromosomebyaSSN,makingitavailableforHDR(Schimletal.,2014;Fauseretal.,2012).
Anotherapproachinvolvesincreasingdonortemplateavailabilitybyreplicatingittohighcopy
numberusinggeminivirusreplicons(GVR)(Baltesetal.2014).GVRsmakeitpossibletocreate
precise modifications without the need to stably integrate editing reagents into the plant
genome(Cermaketal.2015).
Notallof themethodsandapproachesdescribedabovehavebeen implementedand
optimized for use in plants. Although there are several toolsets available for plant genome
engineering(Maetal.,2015;Xingetal.,2014;Lowderetal.,2015),theyarelimitedinthatthey
useasingletypeofDSB-inducingreagent(e.g.CRISPR/Cas9),asingletypeofexpressionsystem
(e.g.PolIIIpromotersforexpressionofgRNAs)ortheywerespecializedforasingleapplication
(e.g.geneknockouts).Here,weintroduceacomprehensive,modularsystemforplantgenome
engineering that makes it possible to accomplish gene knockouts, replacements, altered
transcriptionalregulationormultiplexedmodifications. Inaddition,theplatformcanbeeasily
upgraded and adjusted to accommodate novel reagents and approaches as they arise. Our
toolkitusesGoldenGatecloningforfastandeasyassemblyofgenomeengineeringconstructs
and flexibility in combining different functional modules for different applications. The
functional modules include TALE-based reagents (TALENs and TALE activators), CRISPR/Cas9
reagents (nucleases, nickases, catalytically inactive enzymes and activators), various gRNA
expression approaches (single and multi-gRNA expression systems using Pol III or Pol II
promoters), donor template plasmids for gene targeting, and a variety of other expression
-
5
cassettes (Trex2, Csy4 and GFP). Further, a large set of promoters, terminators, codon
optimizedgenesandselectionmarkersfacilitateengineeringdicotormonocotgenomesusing
T-DNA or non-T-DNA vectors for Agrobacterium-mediated, biolistic or protoplast
transformation.Wealsoprovideauser-friendlywebportalwithprotocols,DNAsequencesof
all reagentsandweb-basedtoolstoassist inconstructdesign. Finally,wehavevalidatedour
reagentsinsixdifferentplantspeciesandreportthefirstuseofgenomeengineeringtocreate
modifiedMedicagotruncatulaplants.
RESULTS
Directandmodularassemblyofgenomeengineeringconstructs
Our plant genome engineering toolkit provides two sets of vectors, designated ‘direct’ and
‘modular’ (Figure1).Bothvectorsetscandeliver toplantcellseitherTALENsorCRISPR/Cas9
reagents for creating targeted DNA sequence modifications. The TALEN vectors are fully
compatiblewithourGoldenGateTALENandTALEffectorKit2.0(Cermaketal.,2011),andthey
use the highly active Δ152/+63 TALEN architecture (Miller et al., 2011). The CRISPR/Cas9
vectorsincludeeitherwheatorArabidopsisthalianacodon-optimizedversionsofCas9foruse
inmonocotsanddicots,respectively.BothversionsofCas9containasingleC-terminalnuclear
localizationsignal(NLS).
The direct cloning vectors (numbering 31) enable rapid construction of reagents for
making targeted gene knock-outs. They consist of vector backbones (T-DNA or non-T-DNA),
selectable marker expression cassettes and nucleases (i.e. Cas9 or the TALEN backbone).
However, they lack determinants for DNA targeting (i.e. gRNAs or TALE repeats). Single or
-
6
multiplegRNAscanbeaddedinasinglestepusingaGoldenGateprotocol(Engleretal.,2008,
2009) (Figure 1A). Addition of TALENs uses a three-step Golden Gate protocol (see
SupplementalMethodsforassemblyprotocols).
ThemodularvectorsetusesGoldenGatecloningtocreatevectorsforadiverserangeof
geneeditingapplications(Figure1B,SupplementalMethods–Protocol5).Therearethreesets
ofmodularcloningvectors.ModulesetAcontainsready-to-useplasmids(currentlynumbering
61)withCas9orGFPexpression cassettes.ModuleAplasmidsarealsoavailable that canbe
used when assembling TALENs with our Golden Gate kit (Cermak et al., 2011). Module B
plasmids(currentlynumbering22)areusedeithertoaddsingleormultiplegRNAsinavariety
of expression formats or to add a second TALEN monomer. Module C plasmids (currently
numbering 22) may be used to add donor templates for gene targeting, additional gRNA
cassettesorother expression cassettes suchasGFP, Trex2orCsy4. Eachexpression cassette
hasa similararchitecture (SupplementalFigure1): a commonsetof restrictionenzymesites
are used that allow for easy swapping of regulatory elements (promoters, terminator
sequences)orcodingsequences.Ifareagentfromagivensetisnotdesired,thereareempty
modules to serve as placeholders and still allow for Golden Gate assembly. Finally, a
transformationbackbone canbe selected froma set (currently numbering 33) of T-DNAand
non-T-DNA vectors. Standard or geminivirus replicon (GVR) vectors are available with or
without one of the commonly used antibiotic selectable markers. For most applications,
creatingatransformationvectorusingthemodularvectorsetisatwo-stepcloningprocessthat
canbecompletedinfivedays;assemblyofTALEN-basedvectorsusuallyrequiresanadditional
cloningstep.
-
7
Tofacilitateuseofourtoolkit,wedevelopedonlineresourcestoaidinvectorselection
and construct design (http://cfans-pmorrell.oit.umn.edu/CRISPR_Multiplex/). Our website
provides a complete list of both direct andmodular cloning vectorswith descriptions of key
features,DNAsequence filesand relevantprotocols fordownloading (seealsoSupplemental
DataSet1forthelistofvectors). Alsoavailableonlineisa‘vectorselection’toolthatusesa
dropdownmenutohelpidentifysuitableintermediaryplasmidsneeded(e.g.moduleA,BorC)
or to select a transformationbackbone.Thevector selection tool thenprovidesa link to the
DNA sequence file of the selectedplasmid that canbedownloaded inGenBank format. For
multiplexedgeneediting,our‘primerdesignandmapconstruction’tooltakesasinputaFASTA
filewiththegRNAtargetsequences.Alististhenpreparedofalltheprimersequencesneeded
tocreateavectorthatexpressesmultiplegRNAsusingourtRNA,Csy4orribozymeexpression
formats,andaGenBankfileiscreatedwiththeannotatedsequenceofthespecifiedgRNAarray
intheselectedplasmidbackbone.
TargetedmutagenesisofgenesinMedicagotruncatulausingdirectvectors
To test the functionality of the direct vectors, we designed a TALEN pair to target two
paralogous genes in Medicago (Medtr4g020620 and Medtr2g013650) (Figure 2A). The
Arabidopsis ortholog is known to play a role in phosphate regulation, and mutants hyper-
accumulatephosphate(Parketal.,2014).AT-DNAvectorwiththeTALENswasassembledusing
ourGoldenGateassemblyprotocol (SupplementalMethods–Protocol1B)and transformed
intoMedicago.ElevenT0plantswererecoveredandscreenedfortargetedmutationsatboth
Medtr4g020620andMedtr2g013650usingaPCRdigestionassay.TwoT0plants(WPT52-4and
WPT52-5)producedPCRampliconsthatwereresistanttorestrictionenzymedigestion(atboth
-
8
targets inWPT52-4 and at one target inWPT52-5). PlantWPT52-4was self-fertilized, and T1
progenywerescreenedusingthesamePCRdigestionassay.Mutationswereconfirmedatboth
loci (Figure 2B).Mutations inMedtr4g020620were likely somatic, as only twoof the tested
plants produced digestion-resistant amplicons of varying intensity. However, mutations in
Medtr2g013650segregatedasexpectedforatraittransmittedthroughthegermline,andone
outofeighttestedplantswasahomozygousmutant.ThiswasconfirmedbyDNAsequencing,
whichrevealedfourdifferentMedtr4g020620deletionallelesinasingleplant,andanidentical
63bpdeletioninMedtr2g013650inthreeofthetestedT1plants(Figure2C).FromtheeightT1
plants,weidentifiedasingleplantwithmutationsinbothgenes.Thus,TALENsworkefficiently
inourdirectvectorsandinduceheritablemutations.
ExpressingmultiplegRNAsbyCsy4,tRNA-processingenzymesandribozymes
To harness one of the biggest advantages of the CRISPR/Cas9 system – the ability to target
multiple sites simultaneously – ourmodular cloning systemprovides vectors formultiplexed
genome editing. Typically in multiplexed editing, independent Pol III promoters are used to
drive expression of each gRNA. One limitation is that Pol III expression requires a specific
nucleotide in the first position of the transcript. Further, the size of the final array can be
cumbersome due to multiple, repeating promoter sequences. Processing polycistronic
transcriptscontainingmultiplegRNAsprovidesanalternativetotheuseofindependentPolIII
promoters.ThepolycistronicmessagecanbeproducedfromaPolIIpromoterthatisprocessed
eitherbytheCsy4protein,tRNAprocessingenzymesorribozymes(Nissimetal.,2014;Aebyet
al.,2010;Tangetal.,2016).PolIIpromotersprovideflexibilityintermsofspatialandtemporal
controlofexpressioninvivo,andtheyaremorelikelytoproducelongtranscripts,sincePolIIis
-
9
not hampered by the presence of short internal termination sites, as is the case for Pol III
(Arimbasserietal.,2013).Wesoughttocomparetheefficiencyoftargetedmutagenesisusing
five different gRNA expression systems, including three different methods for processing
polycistronic gRNA transcripts (Figure 3A). All vectors were assembled using our modular
cloningsystemandusedtwogRNAstotargettwositesinthetomatoAUXINRESPONSEFACTOR
8AARF8Agene,apositiveregulatorofflowerdevelopmentandfruitset(Liuetal.,2014).
We created three vectors that use individual Pol III promoters to express each gRNA:
two AtU6 promoters, an AtU6 and At7SL promoter or AtU6 and At7SL promoters combined
with structurally optimized gRNA scaffolds (Chen et al., 2013). The remaining three vectors
produceasingle transcriptwithgRNAcassettesseparatedby20bpCsy4hairpins (Tsaietal.,
2014),77bptRNAGlygenes(Xieetal.,2015)or15bpribozymecleavagesites(inadditiontoa
synthetic ribozyme at the 3’ end of the transcript) (Haseloff and Gerlach, 1988; Tang et al.,
2016).ThankstotheadvantagesoverPolIIIpromotersmentionedabove,wechosetouseaPol
IIpromotertodrivetheexpressionofthepolycistronicgRNAs.SinceCas9wasexpressedfrom
a 35S promoter, we chose Cestrum Yellow Leaf Curling Virus promoter (CmYLCV) for gRNA
expressiontopreventuseofduplicatepromoters.TheCmYLCVpromoterdrivescomparableor
higher levelsofexpression than the35Sormaizeubiquitin (ZmUbi)promoters inbothdicots
andmonocots(Stavoloneetal.,2003),andtheintensityofGFPexpressedfromthispromoter
was similar to 35S-driven GFP in tomato protoplasts (Supplemental Figure 2). Final vectors
were co-transformed into tomato protoplasts along with a yellow fluorescent protein (YFP)
expressionplasmid. Transformationefficiencyapproximated60%, asdeterminedby counting
YFPpositivecells.
-
10
Toassessmutationfrequencies, thetargetsitewasPCR-amplifiedfromDNAprepared
from the transformed protoplasts and subjected to Illumina DNA sequencing (Supplemental
Table1).AlloftheexpectedtypesofmutationsintheARF8Alocuswererevealedinallsamples
except for thenon-transformedcontrol (Figure3B,3C;SupplementalTable2; Supplemental
Figures 3-7). Interestingly, the overall frequency ofmutagenesiswas approximately two fold
higher with the Csy4 and tRNA expression systems compared to constructs with two Pol III
promoters.AlthoughtheoverallfrequencyofmutagenesisintheCsy4samplewasconsistently
higher than in the tRNA sample across three replicates, the difference was not statistically
significant(Figure3C,SupplementalTable3).Theribozymeshowedlowerfrequenciesofmost
mutation types, particularly at the 3’-most gRNA target site. We did not observe any
enhancement in overall mutation frequencies with the previously reported improved gRNA
architecture(Chenetal.,2013).Itisimportanttonotethatthemutationfrequencieswerenot
adjusted to account for the ~60% transformation efficiency, and therefore they are
underestimates.
DeletionbetweenthetwoCRISPR/Cas9cleavagesiteswasthemostcommonmutation
inallsamples(SupplementalFigure3).Shortinsertions/deletions(indels)ateitheroneorboth
cleavagesitesweremorethan10-foldlessabundant(SupplementalFigures3-6).Inversionsof
theinterveningsequenceanddeletionscombinedwithinsertionsweremorethan100-foldless
frequent(SupplementalFigure3).Mostinsertionsmappedeithertothetomatogenomeorto
thevector,coveringalmosteveryregionofthevectorsequence(SupplementalFigure7).
Cas9mostfrequentlycreatesablunt-endedDSBthreenucleotidesupstreamofthePAM
sequence(Jineketal.,2012).Inourexperiments,preciseligationofCas9-cleavedDNAwithout
-
11
theinterveningsequencewouldresultinadeletionof85bp.Gainorlossofnucleotidesdueto
staggeredCas9cleavage(Jineketal.,2012;Lietal.,2015b)and/orexonucleaseactivitycould
result in shorteror longerdeletions.Strikingly,weobservedaveryhigh frequencyofprecise
ligation: on average, 83.5% of all reads with long deletions were deletions of 85 bp
(SupplementalFigure8).TheseresultssuggestthatifasinglegRNAwasused,mostinstances
ofNHEJrepairwouldnotresult inamutation.Thereforecreatingdeletionswithtwoormore
gRNAsshouldbeamoreefficientapproachtoachievetargetedmutagenesis.
Wefurthertestedtheeffectivenessofthepolycistronicexpressionsystemsusingeight
gRNAstargetingfourgenesfordeletionintomatoprotoplasts.Wealsomonitoredtheeffectof
thepositionofthegRNAinthearray.gRNAs#49-56(SupplementalTable4)wereassembled
directly intonon-T-DNAprotoplastexpressionvectors in twodifferentorders (Figure4A). In
onearray,thegRNAswereorderedfrom49to56,andtheorderwasreversed inthesecond
array. In addition, a vector expressing each gRNA from a separate Pol III promoter was
constructed. Each pair of gRNAs was designed to create a ~3 kb deletion to prevent
amplificationofthenon-deletedlociinsubsequentPCRanalysis(Figure4B).Thesevenvectors
were transformed into tomato protoplasts, and the frequencies of deletions induced by the
first,lastandoneoftheinternalgRNApairsineacharrayweremeasuredbyquantitativePCR
(Figure4C). Consistentwith thepreviousexperiment, theCsy4expression systemperformed
bestregardlessofthepositionofthegRNAinthearray.ThetRNAsystemwasthenextmost
effective,followedbytheuseofindividualPolIIIpromoters;theribozymesystemwastheleast
effective.gRNAsneartheendoftheCsy4andtRNAarrayswere lessefficient inmutagenesis
relative to gRNAs in earlier positions for themajority of tested loci, suggesting that position
-
12
doeshaveaneffectongRNAactivity.Nevertheless, regardlessofpositioneffects,whenCsy4
was used, the frequencyof deletionswas as highor higher than theuseof individual Pol III
promoters,confirmingtheCsy4isthemostefficientsystemformultiplexedgeneediting.
Multiplexedmutagenesisintomato,wheat,andbarleyprotoplasts
WefurtherexploredtheeffectivenessoftheCsy4systemformultiplexingbytargetingsixgenes
in tomato (Solyc06g074350, Solyc02g085500, Solyc02g090730, Solyc11g071810,
Solyc06g074240andSolyc02g077390),threegenesinwheat(ubiquitin,5-Enolpyruvylshikimate-
3-phosphatesynthase-TaEPSPS,andMildewlocusO-TaMLO),andonegeneinbarley(Mildew
locusO-HvMLO).EachgenewastargetedfordeletionusingtwogRNAs.Toprovideexpression
in cereals, we created vectors with the maize ubiquitin (ZmUbi), switchgrass (Panicum
virgatum) ubiquitin 1 (PvUbi1) and switchgrass ubiquitin 2 (PvUbi2) promoters (Mann et al.,
2011).ThesequencesofPvUbi1andPvUbi2weremodifiedtoremoveAarI,SapI,andBsaIsites,
and themodifiedpromoterswere confirmed todrivehigh levelsofGFPexpression inwheat
scutella(SupplementalFigure9).Next,wedesignedCsy4arraysoftwelvegRNAstargetingthe
sixgenesintomato,sixgRNAstargetingthethreegenesinwheat,andtwogRNAstargetingthe
singlegeneinbarley.Intomato,the35SpromoterwasusedtoexpressCas9,andtheCmYLCV
promoter drove expression of the gRNAs. In wheat and barley, ZmUbi was used to drive
expression of Cas9, and the gRNA array was under control of PvUbi1. To test whether the
PvUbi1 promoter could be substituted with the significantly shorter viral CmYLCV for gRNA
expressioninmonocots,wealsodesignedavectorinwhichthetwogRNAstargetingthebarley
MLOgenewerecontrolledbytheCmYLCVpromoter.ThefinalvectorsexpressingCas9andthe
gRNA array were transformed into tomato, wheat and barley protoplasts, and PCR was
-
13
conducted two days later to detect targeted deletions. Deletions of the expected size were
foundineachtargetedgeneandverifiedbyDNAsequencing(Figures5-7).BothCmYLCVand
PvUbi1-driven gRNA arrays induced targeted deletion of the barleyMLO gene with similar
efficiencies (Figure 7B). Thus, we confirmed that our system is efficient in simultaneously
creating multiplexed, targeted gene deletions in both dicots and monocots using up to 12
gRNAs.
Heritable,multiplexedmutagenesisinMedicagotruncatula
To demonstrate the efficiency of our Csy4multiplexing vectors inwhole plants,we used six
gRNAstotargetfourMedicagogenesthatencodenodule-specificcysteine-rich(NCR)peptides
(SupplementalFigure10).Morethan500genesmakeuptheNCRfamily,andtheirsmallsize,
sequence identity and overall number pose challenges to traditional knock-out/knock-down
platformssuchas transposableelements (Tnt1),RNA interferenceandchemicalmutagenesis.
Three gRNA pairs were designed to each delete one of three loci: NCR53, NCR54
(Medtr4g026818) andNCR55 (Medtr3g014705). TheNCR53 locus contains two homologous
NCR genes differing by several SNPs (Supplemental Data Set 2) and separated by
approximately 4 kb. The vector carrying the gRNA array, Cas9 expression cassette and bar
resistance gene was transformed into Medicago, and 46 phosphinothricin resistant T0
transformantswererecoveredandscreenedbyPCRanddirectsequencing.Forty-threeplants
hadmutations in inNCR53,eight inNCR54and12 inNCR55 (Table1,SupplementalTable5,
SupplementalDataSet2).Themutations includeddeletionsof theNCR53andNCR55genes,
resultingfromsimultaneouscleavageat twosites,aswellasshort indels inall three lociata
giventargetsite.Threeplantshadmutationsinallthreegenes,13weredoublemutantsand27
-
14
hadmutationsinasinglegene.Despiteitsrepetitiveness,mutationsattheNCR53 locuswere
exclusively long deletions or inversions between the two cleavage sites in seven out of 46
plants.Nomutationswere recoveredat thegRNA3 site, likelydue to twoSNPs foundat the
targetsite.gRNA1,whichtargetsasiteatNCR53,hastwomismatchesintheseedregiontoa
siteatNCR54andaPAM-disruptingmutation.Similarly,gRNA2hastwomismatchestoasiteat
NCR54, one in eachof the 5’ and3’ endsof the seed region. Nooff-targetmutationswere
observedfromgRNA1andgRNA2at theNCR54 locus in the46plants, further indicatingthat
twomismatchesinthetargetsitearesufficienttopreventoff-targetactivity.
To confirm transmission of the mutations to the next generation, plant #16 (with
mutations inNCR53 andNCR54), plant#17 (withmutations inNCR53 andNCR55), plant#27
(with mutations in all three targets) and plant #40 (with mutations in NCR53) were self-
fertilized.SevenT1seedlingsfromeachT0parentwerescreenedbyPCRandDNAsequencing
formutations in the respective genes and for presence of thebar transgene (Supplemental
Figure 11A, B). Mutations identical to those observed in the T0 parents were found in the
NCR53 locusintheprogenyofallfourtestedplants(SupplementalFigure11A).Intwoofthe
fourplants(plant#17andplant#40),all testedprogenyonlyshowedthemutatedversionof
theNCR53 locus. Both thedoublemutant (plant #16) and the triplemutant (plant #27) also
transmittedtheexpectedmutationsinNCR54totheirprogeny,whereasonlythetriplemutant
(plant #27) and not the double mutant (plant #17) transmitted the mutations in NCR55.
Althoughwedidnotdetectthewild-typealleleofNCR53inT0plant#16orthewild-typeallele
ofNCR55intheT0plant#27,wild-typealleleswererecoveredintheT1progeny,suggestingthe
T0 plants might have been chimeric. Nevertheless, the T-DNA insertion segregated away in
-
15
severalT1progenyofT0plants#16,#27(includingatriplemutant27-6)and#40(Supplemental
Figure 11B), confirming that the mutations in these plants were transmitted through the
germline rather than created de novo in the T1. Thus, we have been able to show that
mutationsineachofthegenesareheritable.Inaddition,weshowedthatmutationsinallthree
genescanbetransmittedsimultaneouslyfromatriplymutantparent(plant#27)toitsprogeny.
These results show the feasibility of using the Csy4 system to effectively recover heritable
mutationsinmultiplegenes.
Targeteddeletionof58kbinMedicagousingthetRNAandCsy4multiplexingsystems
To further evaluate the Csy4 and tRNA systems for creating chromosomal deletions in
Medicago,wedesignedanothersetofsixgRNAs(MPC1toMPC6)thattargeta58kbregionon
chromosome2(Figure8A).Theregioncontains fivecandidategenes identifiedbyagenome-
wideassociation study tobeassociatedwith symbioticnitrogen fixation (Curtinetal., 2017).
ThegRNAswereassembled(intheorderfrom1to6)intothedirectcloningT-DNAvectorsthat
useeitherCsy4ortRNAprocessingenzymes(seeSupplementalMethods–Protocols3Aand
3B),andaftertransformation,DNAfromrandomlyselectedcalliwastestedbyPCRforpotential
chromosomaldeletions. Primersweredesigned200 to300bpupstreamanddownstreamof
varioustargetsitesthroughoutthelocus(Figure8A,SupplementalTable6).Resultsfromthis
preliminaryassay revealedall fourpossibledeletions (detectableusing this setofprimers) in
the Csy4-transformed calli, whereas only the 27 kb deletion between gRNA sitesMPC1 and
MPC4wasdetected in thecalli transformedwiththetRNAmultiplexingreagents (Figure8B).
Amplicons generated from putative 58 kb deletions were sequence confirmed (Figure 8C).
While still in tissueculture,wepre-screened46T0plantlets fromeachof theCsy4and tRNA
-
16
transformations for the 27 kb and 58 kb deletions. The putatively positive plantlets were
transferredtosoil,grownintomatureplantsandscreenedagainbyPCR.ThreeCsy4T0plants
wereidentifiedwiththe58kbdeletionandtwoplantswiththe27kbdeletion.ForthetRNAT0
plants,oneplanthadthe27kbdeletion(Figure8B).Thewild-typesequencewithinthe58kb
deletedregioncouldstillbeamplifiedwithadifferentsetofprimers,suggestingthatallplants
wereheterozygousorchimeric.Oneplant(WPT228-5)appearedtobeaT0homozygousmutant
for the 27 kb deletion, since we failed to recover a PCR amplicon from the wild-type allele
(Figure8B).
To test for heritable transmission of the deletions, T0 plants WPT228-28 (Csy4,
heterozygote or chimeric for the 58 kb deletion) and WPT227-28 (tRNA, heterozygote or
chimeric for the 27 kb deletion) were self-fertilized, and T1 plants were screened for the
segregating chromosomal deletions. Eight out of 12 T1 seedlings produced bands of the
expectedsizeforthe58kbdeletioninWPT228-28progeny,andsevenoutofsevenT1seedlings
showed bands expected for the 27 kb deletion inWPT227-28 progeny, confirming germline
transmissionintheselines(Figure8B).Inaddition,thewild-typeallelewasnotdetectedintwo
out of the eight WPT228-28 T1 seedlings, suggesting the two seedlings are homozygous
mutants. The bar transgene was not detected in three heterozygous WPT228-28 seedlings,
indicatingthe lossofT-DNAinsertionandconfirmingthatthemutationswerenotcreatedde
novo intheT1(Figure8B).Todeterminewhetherwecouldrecoverhomozygousmutants ina
transgene-freebackground,T1 plantWPT228-28-1–heterozygous for the58kbdeletionand
lackingtheT-DNAinsertion–wasself-fertilized,andsevenT2seedlingswerescreenedforthe
-
17
presence of the deletion. PCR and sequencing identified two homozygous and three
heterozygousnon-transgenicplants(Figure8B).
TheloweroverallmutationfrequenciescomparedwiththoseobservedintheNCRgeneediting
experimentare likelyduetothesignificantly largersizeofdeletionsbeingsought.An inverse
relationshipbetweenCas9-mediateddeletionfrequencyandsizehasbeenobservedpreviously
inbothmammalianandplantcells(Canveretal.,2014;Xieetal.,2015;Ordonetal.,2016).
EnhancingtargetedmutagenesiswithTrex2
To increase the frequency of NHEJ-induced mutations, sequence-specific nucleases can be
coupledwithexonucleasestopromoteendresectionandpreventprecise ligation.Previously,
co-expressionofthehuman3’repairexonuclease2(Trex2)withameganucleaseresultedina
25-fold enhancement in gene disruption frequencies in human cells (Certo et al., 2012). In
plants, overexpression of exonuclease 1 (Exo1) significantly enhanced the frequency of DSB
resectioninricecalli(Kwonetal.,2012),andTrex2increasedthemutagenesisfrequencywhen
co-deliveredtoplantcellswithameganuclease(Luoetal.,2015).Basedonthisdataandour
observation that the majority of Cas9-induced breaks are rejoined precisely, we tested the
effectonmutagenesisofexpressingTrex2withCas9andasinglegRNA.WeusedgRNAsthat
target two different sites in the tomato ANT1 gene and one site on the barleyMLO gene.
IndividualgRNAswereclonedunderthecontroloftheAtU6orTaU6promoters,combinedwith
a dicot or monocot optimized Trex2-P2A-Cas9 fusion, and tested in tomato and barley
protoplasts. We saw a modest 1.5-2.5-fold increase in indel frequencies in the presence of
Trex2comparedtothesampleswithoutTrex2inbarley(asmeasuredbytheT7EIassay)(Figure
-
18
9A).Similarresultswereobtainedintomato(asmeasuredbybothT7EIandDNAsequencing)
(Figure9A,B).Sequencingrevealeda trendtowards longerdeletions in theTrex2samples in
both tomato and barley (Figure 9B, C).Whereas no deletions >10 bp were detected in the
samples lacking Trex2, they represented 77% (tomato) and 100% (barley) of all mutations
recoveredinsamplesexpressingTrex2.Insertionsofupto42bpwerefoundinsampleswithout
Trex2 (Supplemental Figure 12) but not in the samples expressing Trex2, suggesting that
deletionswereinducedpreferentiallybythepresenceoftheexonuclease.
GenetargetingwithCas9nickases
Severalstrategieshavebeendevelopedtodecreasethefrequencyofoff-targetmutagenesisby
CRISPR/Cas9. Among these are the use of paired Cas9 nickases, which create a DSB by
introducingnicksonoppositeDNAstrands.Pairednickaseshaveon-targetcleavageefficiency
comparabletowildtypeCas9inhumancellsandplants(i.e.Arabidopsisandrice),andreduce
off-targetactivityupto1500-fold(Ranetal.,2013;Schimletal.,2014;Mikamietal.,2016).We
createdCas9nickasevariantsbymutatingthecatalyticresiduesintheconservedRuvC(D10A)
andHNH (H840A)nucleasedomains.ModuleA vectorswere then created thatexpressCas9
variantswithindividualorcombinedmutations(thelatterresultinginacatalyticallydeadCas9
enzyme).
Wewere interested inexploringwhetherpairedCas9nickasescanalsopromotegene
targeting(GT)inplantsusingapreviouslydescribedtobaccotransientGTassay.Inthisassay,a
defectivegus:nptII transgene is restoredby cleavage andhomologous recombination (Figure
10A) (Wright et al., 2005; Baltes et al., 2014). We designed a pair of gRNAs in PAM-out
orientationthatoverlappedby5bp(-5bpoffset)andshouldcreatea29bpoverhang(Figure
-
19
10A).TostimulateGT,thereagentsweredeliveredtotheplanttissueontheBeanyellowdwarf
virus (BeYDV) replicons (Baltesetal.,2014;Čermáketal.,2015)available inour toolkitasT-
DNAtransformationbackbones.Thereadoutwasthenumberofblue(GUSpositive)spotsand
intensityofstainingintobaccoleaves(SupplementalFigure13A).
WefirstcomparedtheGTfrequencies inducedbyasinglenickorDSB.Thenumberof
GTeventsinducedbysinglenickswasaround70%oftheGTeventsinducedbyaDSB(Figure
10B).NosignificantdifferencesinGTfrequencieswereobservedbetweentheD10AandH840A
nickases.Then,wecomparedtheGTefficienciesusingpairednickasesthatinducedoublenicks
resulting in 29 bp 5’ overhangs (AtD10A) and 3’ overhangs (AtH840A). We observed 2-fold
increaseinGTwith5’overhangscomparedto3’overhangs.ThenumberofGTeventsinduced
bydoublenickscreating5’overhangswasevenhigherthanthenumberofGTeventsinduced
bythewild-typeCas9nuclease.Thisis inagreementwithobservationsinhumancells(Ranet
al.,2013).
ToconfirmthenatureoftheGTeventsonthemolecularlevel,weextractedDNAfrom
infiltrated tobacco leaves and PCR amplified both junctions of the inserted sequence
(Supplemental Figure 13B). All samples except the one that was infiltrated with the vector
expressingnon-cuttinggRNAs,yieldedproductsofexpectedsizeforbothjunctions.Sixtoten
bacterial clones for each junction from each sample were analyzed by sequencing, which
revealed precise integration of the donor template in all clones (Figure 10C). These data
suggestthattheoverallfrequencyandprecisionofGTinducedbynickasesiscomparabletothe
nuclease-basedapproach,whichwehavepreviouslyshowncanbeusedtoefficientlyrecover
plantswithheritablemodificationscreatedbyGT(Baltesetal.,2014;Čermáketal.,2015).
-
20
To see whether the nickase approach is applicable for GT in a monocot species, we
targetedanin-frameinsertionofaT2A-GFPcassetteintothethirdexonoftheubiquitingenein
wheat scutella, usingWheat dwarf virus (WDV) replicons as described in Gil-Humanes et al.
(2017). We observed GFP expression consistent with insertion into the ubiquitin locus
(Supplemental Figure 14A-C), suggesting the feasibility of this approach across different
species.
DISCUSSION
Genome engineering promises to advance both basic and applied plant research, and we
developed a comprehensive reagent toolkit to make the latest genome engineering
technologiesavailable to theplant sciencecommunity.BothTALENsandCRISPR/Cas9canbe
easilycombinedwithavarietyofregulatorysequences,transformationvectorsandotherDNA
modifyingenzymes,making itpossibletorapidlyassemblereagentsnecessarytomakesingle
andmulti-geneknockoutsaswellasgenereplacements.
Usingour flexible,modularsystemweevaluated fourdifferentmulti-gRNAexpression
strategiesintomatoprotoplaststoachievemultiplexedediting.ApolycistronicgRNAtranscript,
when processed by the bacterial Csy4 ribonuclease, resulted in almost two times higher
mutation frequencies compared with the commonly used strategy in which each gRNA is
expressedfroman individualPol IIIpromoter.Further, thepositionofthegRNAs inthearray
hasonlyamodesteffectonactivity,andthelastgRNAsinanarrayofeightperformcomparably
togRNAsexpressedindependentlyfromPollIIIpromoters.Althoughthefrequenciesobserved
with Csy4 were comparable to the previously described tRNA processing system (Xie et al.,
-
21
2015),wefavortheCsy4systembecauseitconsistentlygaveushigherfrequenciesofdeletions
intomatoprotoplasts,andthemulti-gRNAtranscriptsareshorterandlessrepetitive(i.e.Csy4
recognitionsitesare20bpvs.77bpfortRNAs).AlthoughtheCsy4systemrequiresexpression
oftheCsy4ribonucleaseintheplantcell,andnoextrafactorsarerequiredfortRNAprocessing,
theCsy4geneisrelativelyshort(561bp=187aa),andwetypicallyexpressitasP2Afusionto
Cas9withoutanyobservablephenotypicconsequencestotransgenicplants.
Although we have previously shown that gRNA transcripts processed by ribozymes
efficiently induce mutagenesis in rice, tobacco, and Arabidopsis (Tang et al., 2016), of the
polycistronicRNAprocessingmechanismstestedinthepresentstudy,ribozymesresultedinthe
lowestmutation frequencies. One difference between the two studies is that we expressed
Cas9 and thegRNAarray fromseparatepromoters,whereasTangetal. (2016)useda single
transcription unit to express both the Cas9 and the gRNAs. Furthermore, the frequency of
deletionscreatedbytwogRNAswasnotdirectlymeasuredinthepreviousstudy.Itispossible
that coordinate expression ofCas9 and gRNAs from a single transcription unit, which is not
possible in the current version of our cloning systemdue to itsmodular structure,might be
essentialtoachievehighergeneeditingfrequenciesusingribozymeprocessing.Whetherthisis
true remains to be elucidated. Nevertheless, we demonstrate Csy4 and tRNA expression
systemsefficientlyprocessmulti-gRNAtranscriptsandenablehighefficiencymultiplexedgene
editing.
AnalysisofmutationsresultingfromcleavageofasinglegenewithtwogRNAsrevealed
that approximately 85% of DSBs were precisely ligated. This is consistent with the high
frequencyofpreciserepairobservedinmouseandhumancells(Lietal.,2015b;Geisingeretal.,
-
22
2016).SinceDSBsinducedbyasinglegRNAarealsolikelyaccuratelyrepaired,wefavortheuse
of two gRNAs to induce loss-of-function mutations. Precise repair can be used to your
advantagetomakethemutationofchoice,andtherearenoconcernsaboutcreatingfunctional
geneisoformsthroughnon-frameshiftindelsoralternativesplicing.Furthermore,thetargeted
deletionscanbedetectedbyPCR,makingthescreeningformutantssignificantlyeasier.Thisis
particularly valuable, if not essential, when simultaneously mutagenizing several genes, as
screening for loss of restriction enzyme sites, T7 assays or direct DNA sequencing is time
consuming and costly. Targeted deletions might also be valuable for editing non-coding
sequences, including promoters and regulatory sequences,where short indelmutationsmay
not be effective. Moreover, engineering complex traits will likely require an efficient
multiplexededitingplatform.Hereweshowthatcombinationsofdifferentgenedeletionscan
be induced in a single experiment using a single vector, highlighting the potential of this
approachtospeedupcandidategeneevaluation.UsingourCsy4-basedsystem,wesuccessfully
deleteduptosixgenes intomato,wheat,andbarleyprotoplastsand inMedicagotruncatula
plants,demonstratingtheversatilityofthissystemacrossavarietyofplantspecies.Finally,the
high level of precision inNHEJ repair also opens up an opportunity for targeted and precise
geneinsertion/replacement,independentofHDR(Geisingeretal.,2016).
As an alternative to knocking out geneswith twoormore gRNAs,we also show that
Trex2, in combination with Cas9, increases the length and frequency (2.5 fold) of NHEJ-
mediateddeletions inbothtomatoandbarley.AsinglestudythatcombinedTrex2withCas9
forgeneeditinginhumancellsreportedasimilar2.5-foldincreaseinmutagenesis(Charietal.,
2015). When Trex2 was used in combination with TALENs, a 3-4-fold enhancement was
-
23
observed(Certoetal.,2012;Luoetal.,2015),whereasa25-foldincreasewasachievedwhena
meganucleasewasusedastheDSB-inducingagent(Certoetal.,2012).Thedifferencesinfold-
enhancementbetweenthethreetypesofnucleasesmaybecausedbydifferentlevelsofTrex2
activity on different types of DNA ends induced by Cas9 (blunt), TALENs (5’ overhang) or
meganucleases(3’overhang,thepreferredsubstrateofTrex2).Interestingly,Certoetal.(2012)
detectedabias towards smalldeletionswhen they combinedTrex2withmeganucleasesand
TALENs,butthestudyofCharietal.(2015)aswellasourdataindicatethatthedeletionlength
increaseswhenTrex2isusedwiththeCRISPR/Cas9system.Nevertheless,deletionsseemtobe
induced preferentially in the presence of Trex2, regardless of the type of nuclease used to
createtheDSBsincethelowerfrequencyof insertionscomparedwiththenoTrex2control is
consistentinallstudies.
Cas9nickasesarefrequentlyusedbecausetheyreducetheriskofoff-targetmutations
(Ranetal.,2013;Choetal.,2014;Shenetal.,2014;Mikamietal.,2016).Singlenickasesdonot
inducedetectablelevelsofon-targetNHEJ-mediatedmutagenesis(Congetal.,2013;Ranetal.,
2013; Fauser et al., 2014), whereas paired nickases, which create two adjacent nicks in the
targetsequence,inducemutationsatefficienciescomparabletonativeCas9(Ranetal.,2013;
Schimletal.,2014;Mikamietal.,2016).Pairednickasesalsoefficientlyinducegenetargetingin
humancells(Malietal.,2013;Ranetal.,2013),andherewedemonstratethatpairednickases
constructed using our modular system induce gene targeting in tobacco and wheat with
efficienciescomparabletothenuclease.Consistentwiththedatafromhumancells(Malietal.,
2013;Ranetal.,2013),pairednickasesthatcreate5’overhangs inducedGTmorefrequently
thanthosecreating3’overhangs.Inaseparatestudy(Fauseretal.,2014),singlenickaseswere
-
24
foundcapableofpromotingGTinArabidopsisatfrequenciesexceedingthenuclease;however
weobservedthatGTfrequencieswithsinglenickaseswereabovebackgroundinbothtobacco
andwheat,buttheywerelowerthantheGTfrequencyachievedwiththenucleaseandpaired
nickases.ThehighfrequencyofGTobservedbyFauseretal. (2014)withasinglenickmay in
partbeduetheproximityofthedonortemplate,whichwasintegratedintothechromosome
adjacent to thenick site. Indeed, significantly lowerGT frequencieswereobserved inhuman
cells when single nickases were used with extrachromosomal donors (Ran et al., 2013),
although other factors such as donor architecture and cell type could contribute to the
observeddifferencesbetweenstudies.Nevertheless,weshowthatpairednickases,whichhave
beenshowntohaveminimaloff-targeteffects inplants(Mikamietal.2016),canbeusedfor
efficientGTinbothmonocotsanddicots.
Anumberofdifferenttoolkitsforplantgenomeengineeringhavebeenpublishedwithin
thepasttwoyears(Xingetal.,2014;Xieetal.,2015;Maetal.,2015;Lowderetal.,2015;Zhang
etal.,2015;Liangetal.,2016;Vazquez-Vilaretal.,2016;Kimetal.2016;Ordonetal.,2016);
however,allhavelimitations:mostdonotofferflexibilityinthegeneeditingplatform(TALENs
or CRISPR/Cas9); some reagents are optimized for a non target organism (e.g. humans);
typicallythereislittlechoiceinpromotersforexpressingtargetingreagents;mostsystemsare
limited to a single vector type (e.g. T-DNA); applications are limited to NHEJ-mediated gene
knockouts;mostreagentsarevalidatedinonlyoneortwomodelspecies.Althoughtwostudies
havemadetheirapproachescompatiblewithawidervarietyofvectorsandreagents,suchas
theD10ACas9nickaseandCas9-basedtranscriptionalactivatorsandrepressors(Lowderetal.,
2015;Vazquez-Vilaretal.,2016),theyremainlimitedinotheraspects.Forexample,theLowder
-
25
et al. 2015 toolkit limits gRNA expression to Pol III promoters, requires an 8-day assembly
process,andlackscross-compatibleandadaptableexpressionvectors;theVazquiez-Vilaretal.
2016 toolkit requires a time-consuming binary cloning approach and lacks substantial
evaluation of the multiplexed gene editing vectors. In comparison, our plant genome
engineeringplatformprovides theuserwithanoption to choose thedesired reagent froma
comprehensivesetofexpressioncassettesincludingTALENs,TALE-VP64,TALE-VPR,Cas9,D10A
Cas9 nickase, H840A Cas9 nickase, dCas9, dCas9-VP64 and dCas9-VPR, GFP, Trex2 and Csy4
withversionscodonoptimizedforuseindicotsormonocots.ForCas9-basedgenomeediting,
the toolkit enables assembly of vectorswith all gRNA expression systems described to date,
including two dicot and four monocot Pol III promoters. Furthermore, multiplexed systems
basedonCsy4,tRNAandribozymesareavailablethatcanbecombinedwithPolIIpromoters.
All reagents can be assembled into T-DNAor non-T-DNA vectorswith orwithout one of the
threemost common plant selectablemarkers driven by dicot ormonocot promoters. Faster
direct or flexiblemodular cloning protocols are available.While previous studies focused on
evaluating reagents in the model species Arabidopsis, rice and tobacco, here we provide
evidence that inaddition to tobacco,our system is functionalandeffective in two important
dicotandtwomonocotcropspecies.
Tothebestofourknowledge,ourtoolkitisthefirstdesignedtofacilitategenetargeting
through homologous recombination. We include GVR-based vectors, which were first
demonstrated to work in proof of concept experiments in tobacco (Baltes et al., 2014) and
morerecentlytoeffectivelyintroduceheritabletargetedinsertionsintothetomatogenomeat
frequencies3-9-foldhigherthanstandardT-DNA(Čermáketal.,2015).WehavealsousedWDV
-
26
repliconstotargetinsertionofaselectablemarkerintothewheatubiquitinlocus(Gil-Humanes
etal.,2017),andhereweshowGVRsworkincombinationwithpairednickases.Moreover,our
modularsystemisalsocompatiblewiththeinplantagenetargetingapproachdescribedearlier
(Fauser et al., 2012). An integral element in a gene targeting strategy is the DNA donor
template,whichcarriescustommodificationsbeingintroducedinthetargetgenome.Allofthe
module C plasmids in our toolkit include a unique restriction site for insertion of the donor
template,andweprovideprotocolsforassemblyofthedonortemplateplasmidsforboththe
repliconand inplantagenetargetingmethods (seeSupplementalMethods–Protocol4).All
together, our toolkit enables the most efficient plant gene targeting methods currently
available.
In addition to the commonly used 35S and maize ubiquitin promoters, the current
versionofourcloningsystemincludessevenotherdicotandthreemonocotpromoters.These
includeconstitutiveviral,bacterial(CmYLCV,M24,FMV,nos)orplant(AtUbi10,OsAct1,PvUbi1,
PvUbi2)promoters,whichprovidearangeofexpressionstrengthsinavarietyofplantspecies
(Stavoloneetal.,2003;Sahooetal.,2014;Maitietal.,1997;Norrisetal.,1993;McElroyetal.,
1990; Mann et al., 2012). We have also included the Arabidopsis thaliana Ec1.2 and YAO
promoters, which are egg cell and embryo sac/pollen specific, respectively. Although
Arabidopsiswasnotourtargetspeciesinthisstudy,thesetwopromotershavebeenshownto
enhancemutagenesis inArabidopsiswhenused todriveCas9expression (Wanget al., 2015;
Yan et al., 2015). In our toolkit, these promoters can drive either Cas9 expression or the
polycistronic gRNA arrays. Due to the universal structure of the modular vectors, new
combinations of promoters and coding sequences canbe easily added. The versatility of the
-
27
toolkit is further increased by its optimization for Golden Gate assembly, one of the most
popular DNA assembly methods. We have made the T-DNA backbones derived from the
commonly used pCAMBIA vectors free of most relevant type IIs restriction sites, AarI, BsaI,
Esp3I and SapI. AarI, SapI and BsaI sites were also removed from the CmYLCV, PvUbi1 and
PvUbi2promoters,andtheactivityofthemodifiedpromoterswasverifiedinplantstoensure
these modifications do not affect expression levels. Since the Golden Gate modules can be
easily modified to contain other genes, the system can be adopted for applications beyond
genomeengineering,suchassyntheticbiology.
Finally, theability toalter transcription levelsofendogenousgenes isoftendesirable.
TranscriptionalactivationdomainshavebeenfusedtobothTALEsandCas9toenabletargeted
gene activation (Lowder et al., 2015; Vazquez-Vilar et al., 2016). Although these custom
transcriptionfactorshaveproveneffective (Liuetal.,2013;Piateketal.,2015;Lowderetal.,
2015; Vazquez-Vilar et al., 2016), they havemostly been testedon reporter genes drivenby
syntheticorminimalviralpromoters.Therefore,theincreaseinexpressionlevelsmightnotbe
sufficient to inducephenotypicchangeswhenendogenousgenesare targeted (Lowderetal.,
2015).Toovercomethemodestlevelsofgeneactivation,Chavezetal.(2015)recentlyshowed
thatfusionsofdCas9orTALEproteinstoastrongtranscriptionalactivatorVPRmediateupto
320-fold higher activation of endogenous targets compared with fusions to VP64 alone.
Althoughfurtheroptimizationofartificialplanttranscriptionalregulatorsisneeded,atpresent,
weprovidemoduleswithfusionsofbothVP64andVPRactivatordomainstoTALEsanddCas9
fortranscriptionalactivation,aswellasmoduleswithdCas9only,whichhasbeenshowntobe
-
28
equallyeffectiveintranscriptionalrepressionasdCas9fusionstorepressordomains(Piateket
al.,2015;Vazquez-Vilaretal.,2016).
Inconclusion,wewouldliketonotethatduetoourtoolkit’smodulardesign,itcanbe
easily updatedwith new functions, andwe intend to incorporate new tools as they become
available.Giventheeaseandspeedofreagentconstructionandtheavailabilityofuser-friendly
protocols andonline tools,wehopeourplatformwill accelerateprogress inplant functional
genomicsandcropimprovement.
Methods
Directandmodularvectorlibraryconstruction
Direct TALEN cloning vectors pDIRECT_37-39J were first constructed by Gibson assembly
(Gibsonetal.,2009)ofPCRproductscontainingpartsofpCLEAN-G126backbone(Tholeetal.,
2007,GenBankacc.EU186082), selectionmarker cassettes, theOsADH 5’UTRamplified from
riceOryza sativa cv.Nipponbare genome (Sugio et al., 2008), 35S promoter,δ152/+63 TALE
scaffoldsincluding3xFLAGepitopeandSV40NLS,Esp3IflankedccdBcassette,BsaIflankedLacZ
cassette, ELD andKKRheterodimeric FokI variants (Doyonet al., 2011) andArabidopsis heat
shock protein 18.2 (HSP) terminator amplified from the A. thaliana genome (Nagaya et al.,
2010). In theprocess,Esp3I recognitionsiteswere removed fromthenptII geneand theELD
FokICDS,BsaIfromtheccdBgeneandtheBglIIsitefromtheHSPterminator.Uniquerestriction
siteswereinsertedtoflankthepromoterandterminatorinthedualTALENexpressioncassette.
TALE sequences frompTALplasmids (Cermaketal., 2011)wereusedand19bpof sequence
immediatelyupstreamofthefirstEsp3IsiteinTALEN1anddownstreamofthesecondBsaIsite
-
29
inTALEN2werere-codedtoincludeuniqueprimerbindingsitesforcolonyPCRandsequencing
of both TALE repeat arrays. pCAMBIA-based vectors pDIRECT_21-23J were constructed
accordingly,excepttheywerefirstmadewithanAscIandSacIflankedtheccdBcassette,later
replaced with the AscI/SacI fragment from the pCLEAN-based vectors containing the dual
TALEN expression cassette; and unique restriction sites were inserted to flank the selection
marker gene. For expression in monocots, the ZmUbi1 promoter was PCR amplified from
pAHC25(ChristensenandQuail,1996)andclonedbetweenAscIandSbfIsitesofpDIRECT_21J
andpDIRECT_23Jtoreplacethe35SpromoterandcreatepDIRECT_25KandpDIRECT_26K.To
createthefirstmoduleplasmids,theBsaILacZcassetteinpTC102(aderivativeofpDIRECT_37J
with translation initiation sequence optimized for dicots) was replaced with a second Esp3I
ccdBcassttetocreatepTC163.ThefirstandsecondTALENcassetteswerePCRamplifiedfrom
pTC163(splittingtheP2AsequencebetweenthetwocassettesandremovingtheAarIsitefrom
it)andaBaeIsitecontainingfragmentwasamplifiedfrompTC130(Čermáketal.,2015),using
threepairsofprimerscontainingAarIsiteswithdistinct4bpoverhangs.Thefirstthreemodule
plasmidspMOD_A1001,pMOD_B2000andpMOD_C0000were constructedby insertingeach
ofthesethreePCRfragmentsintoBsaXIandPmlI-linearizedpTAL3(Cermaketal.,2011)using
Gibsonassembly.TogeneratethefirstCRISPR/Cas9modulespMOD_A0101andpMOD_B2515,
PCRfragmentscontaining35Spromoter,AtCas9genecodon-optimizedforA.thaliana(Fauser
et al., 2014) and the HSP terminator; or the AtU6 gRNA cassette were inserted into AarI
linearized pMOD_A1001 and pMOD_B2000 by Gibson assembly. Similarly, an At7SL gRNA
cassette was inserted into NaeI and XhoI cut pMOD_C0000 to create pMOD_C2516. All
remaining module plasmids were derived from these initial plasmids through replacing the
-
30
genecassettesandpromotersusingtheuniquerestrictionsitesAscI,SbfI,XhoIandSacI,orin
case ofmulti-gRNAplasmids, throughGoldenGate assembly of PCR products containing the
specific elements. TaCas9 was codon optimized for wheat and Gibson assembled from 9
gBlocks (Integrated DNA Technologies, Coralville, USA). Cas9 nickases and catalytically dead
variantsweremadebyGibsonassemblyofPCRproductscontainingthespecificmutationsinto
the Cas9 nuclease modules. Cas9 and TALE transcriptional activators were constructed by
Gibsonassemblyof fragments intoTALENanddCas9containingmodules.TheVPRactivators
werebuiltfirst,usingthetripartiteVPRtranscriptionactivationdomaindescribedinChavezet
al.(2015).TheVP64activatorswerederivedfromtheVPRmodulesbydeletingthep65andRTA
domains by restriction cloning. The Csy4 genes, codon-optimized for dicots (tomato + A.
thaliana) and monocots (wheat and rice), were synthesized as gBlocks along with the P2A
ribosomal skippingsequenceandGibsonassembled intoSbfIandSalIdigestedpMOD_A0101
and SbfI andBstBI digestedpMOD_A1510, respectively.Trex2was cloned similarly, except it
wasPCRamplifiedfrompExodusCMV.Trex2plasmid(Certoetal.,2012)containingthemouse
versionofthegenewhichwepreviouslyobservedhadrobustactivityinplantcells.TheCmYLCV
(withouttheAarIsite)andFMVpromotersweresynthesizedasgBlocks.ThePvUbi1andPvUbi2
promoterswereamplifiedfrompANIC10A(Mannetal.,2012)eachasasetofoverlappingPCR
fragments to enable removal of AarI, BsaI and SapI sites throughGibson assembly. TheNos
promoter, rbcsE9 and 35S terminators were amplified from pFZ19 (Zhang et al., 2010) and
pCAMBIAvectors.TheremainingpromoterfragmentswereobtainedbyPCRamplificationfrom
therespectivegenomicDNAs.ThetRNAGlysequencesusedtocreatethemultigRNAconstructs
were amplified from tomato cv. MicroTom genomic DNA, while the Csy4 and ribozyme
-
31
sequencesweresynthesizedaspartsofprimersforGoldenGateassembly.Finally,theempty
modulespMOD_A0000andpMOD_B0000werecreatedby ligatingannealedoligonucleotides
carryinguniquerestrictionsitesintotheAscIandSacIsitesinpMOD_A0101andAscIandAgeI
sitesinpMOD_B2103.TheT-DNAtransformationbackboneswerederivedfrompCAMBIA1300,
2300and3300vectorsbyGibsonassemblyoftheCmR/ccdBselectioncassetteamplifiedfrom
pMDC32 (Curtis and Grossniklaus, 2003) using primers carrying AarI sites specifying 4 bp
overhangsdesignedtoacceptinsertsfromthethreemodules,intoHindIIIandBamHIdigested
pCAMBIA backbones. In addition, pTRANS_210d-250d aremodified versions of the resulting
vectors with AarI, BsaI, Esp3I and SapI sites removed from the vector backbone by Gibson
assembly of backbone fragments overlapping at the mutated sites. pTRANS_250d and
pTRANS_260dcontainPvUbi1 andPvUbi2promoters thatweredirectlyassembled into these
vectors as described above. All non-T-DNA transformation backbones are derived from
pTRANS_100.ThisvectorwascreateddenovobyGibsonassemblyof twoPCRproducts,one
containingthespectinomycinresistancegeneandbacterialoriginofreplicationamplifiedfrom
pTC14 (Cermak et al., 2011) and the other containing the AarI flanked CmR/ccdB selection
cassette from the T-DNA transformation vector pTRANS_210. The selectionmarker cassettes
were then inserted into this vector by restriction cloning from the T-DNA transformation
backbones. The GVR transformation vectors were built by Gibson assembly from modified
pCAMBIAbackbone(withoutBsaIandEsp3Isites)andgeminiviruselementsdescribedinBaltes
etal.(2014),Čermáketal.(2015)andGil-Humanesetal.(2017)AllCRISPR/Cas9directcloning
andCas9onlyvectorswereassembledbyGoldenGateassemblyfrommodulesA,BandC.To
-
32
createtheCas9onlyvectors,thetransformationbackbonesweremodifiedtoaccepttheinsert
fromasinglemodule.
GeneeditinginMedicagotruncatula
FortargetedmutagenesisoftheMedtr4g020620andMedtr2g013650Agenes,apairofTALENs
wasassembledintopDIRECT_39JusingProtocol1B(SupplementalMethods)resultinginpSC1
and transformed into an Agrobacterium tumefaciens strain EHA105 for whole-plant
transformation of theMedicago truncatula accession HM340 using an established protocol
(Cossonet al., 2006). The resultingplantswere screened for putativemutationsusing a PCR
digestionassaywithprimerslistedinSupplementalTable6andtheHaeIIIrestrictionenzyme.
Thedigestion-resistantproductswereclonedandseveralclonesweresequencedtoidentifythe
mutations. For combinatorial deletion of NCR genes, the six gRNAs were assembled into
pDIRECT_23C along with the CmYLCV promoter using Protocol 3A (SupplementalMethods)
resulting in pSC2 and transformed into Agrobacterium tumefaciens EHA105. To detect the
targetedgenedeletionsandshort indelsateachgRNAsite,thethreelociwerePCRamplified
fromT0andT1plantsusingprimerslistedinSupplementalTable6.ThePCRproductswerefirst
directly sequenced. PCR products from samples containing both the long deletion and non-
deletionalleleswere gel purifiedbefore sequencing and sequenced separately. Samples that
showedoverlappingsequencetracesstartingateithergRNAsiteindicativeof indelmutations
wereclonedandseveralclonesweresequencedtoidentifythetypeofmutations.Fortargeted
deletionofthe58kbregiononchromosome2,thesixgRNAswereassembledasCsy4ortRNA
arrays driven by the CmYLCV promoter into pDIRECT_23C using Protocol 3A and 3B
(SupplementalMethods)yieldingpSC3andpSC4,respectively,andtransformedintoMedicago
-
33
truncatulaaccessionHM340asabove.Targeteddeletions intheresultingT0,T1andT2plants
weredetectedusingprimerslistedinSupplementalTable6,clonedandsequenced.
Determiningtheefficiencyofmulti-gRNAexpressionsystemsintomatoprotoplastsbydeep
sequencing
To build vector pTC303, expressing the two gRNAs from AtU6 and At7SL promoters; and
pTC363expressingeachgRNAfromanindividualAtU6promoter,thegRNAspacerswerefirst
clonedintopMOD_B2515(moduleBwithAtU6promoter,usedforbothpTC303andpTC363),
pMOD_C2515(moduleCwithAtU6promoter,usedforpTC363)andpMOD_C2516(moduleC
with At7SL promoter, used for pTC303) using Protocol 2A (Supplemental Methods). The
resulting plasmids were assembled together with pMOD_A0101 into the pTRANS_100
protoplast vector using Protocol 5 (SupplementalMethods). pAH750, expressing the gRNAs
with structurally optimized scaffolds from AtU6 and At7SL promoters was constructed
accordingly,exceptthegRNAspacerswereclonedintopMOD_B2515bandpMOD_C2516b,the
respectivemoduleBandCplasmidsthatcarrytheoptimizedgRNArepeats.TheCsy4,tRNAand
ribozymeexpressioncassetteswerecreateddirectlyinprotoplastvectorspDIRECT_10C(Csy4)
andpDIRECT_10E(tRNAandribozyme)usingProtocol3A,3Band3C(SupplementalMethods),
resulting in pTC364 (Csy4), pTC365 (tRNA) and pTC366 (ribozyme). Next, protoplasts were
isolated from tomato cv. MicroTom as described in Zhang et al. (2012), with minor
modifications.Leavesfrom1-2sterilegrown(26°Cwith16hlight)~4weeksoldplantletswere
slicedintothinstripswithasharprazor,transferredtotheenzymesolution(1.0%cellulaseR10,
-
34
0.25%macerozymeR10, 0.45Mmannitol, 20mMMES, 1xMS salts and0.1%bovine serum
albumin)andincubated15hat25°Cand40rpminthedark.Thedigestedproductwasfiltered
througha40-µmcellstrainer(Falcon)andpipetteddirectlyontopof0.55Msucrosesolution.
TherestoftheprotocolfollowedZhangetal.(2012),exceptW5solution(2mMMES,154mM
NaCl,125mMCaCl2and5mMKCl)wasusedaswashingbuffer.TheCRISPR/Cas9constructs
were co-transformed into ~200,000 cells along with the YFP expression plasmid pZHY162
(Zhangetal.,2012),theprotoplastswereresuspendedinW5solutionandkeptat25°Cinthe
dark.All samples2days later, the transformationefficiencywasdeterminedby counting the
YFP-positivecellsusing imageanalysis,genomicDNAwas isolatedandusedasatemplatefor
thedeepsequencinglibrarypreparation,asdescribedin(Čermáketal.,2015).Briefly,40ngof
protoplastgenomicDNAwasusedastemplatetoamplifya354-bpregionoftheARF8A locus
encompassing the gRNA9 and gRNA10 target sites with primers TC324F and TC324R
(Supplemental Table 6), which have overhangs complementary to Nextera XT indices. PCR
productswerepurifiedwithAgencourtAMPureXPBeads(BeckmanCoulter,Brea,USA)and5
μlofthepurifiedPCRproductwasusedastemplateforthesecondPCRtoattachdualindices
and Illumina sequencing adapters. PCR products were purified with Agencourt AMPure XP
Beads, quantified and pooled. The pooled libraries were enriched for fragment sizes in the
range250 -700bpusingBluePippin (SageScience,Beverly,USA)and sequencedon Illumina
MiSeq platform (ACGT, Inc., Wheeling, USA). An average of 140,000 reads per sample was
generated (Supplemental Table 1, http://www.ebi.ac.uk/ena/data/view/PRJEB13819).
Sequencedataanalysiswasdoneasdescribedpreviously(Čermáketal.,2015),exceptdifferent
sequence variants were manually sorted into specific groups. All data are presented as the
-
35
mean of three biological replicates (three separate pools of protoplasts) plus/minus the
standarderror.Totestforthesignificanceofthedifferencebetweenthesamples,Tukeytest
formultiplecomparisonofmeansinR(version3.2.4RC)wasused(SupplementalTable4).
Testingmulti-gRNAexpressionsystemsintomatoprotoplastsbyqPCR
The Csy4, tRNA and ribozyme arrays of eight gRNAs (Supplemental Table 4) were created
directlyinprotoplastvectorspDIRECT_10C(Csy4)andpDIRECT_10E(tRNAandribozyme)using
Protocol3A, 3B and3C (SupplementalMethods), resulting inpTC634 (Csy4, gRNAorder49-
56),pTC635(Csy4,gRNAorder56-49),pTC636(tRNA,gRNAorder49-56),pTC637(tRNA,gRNA
order56-49),pTC638(ribozyme,gRNAorder49-56)andpTC639(ribozyme,gRNAorder56-49).
To construct the vector with eight gRNAs each expressed from an individual AtU6 (Pol III)
promoter, four AtU6:gRNA cassettes were first assembled into each pMOD_B2203 and
pMOD_C2200(resultinginpTC643andpTC644)usingProtocol3,exceptthefirst,lastandeach
reverse primer were re-designed to bind the AtU6 sequence. The AtU6 cassettes were
amplifiedfromBanI-digestedpTC363(firstcassette),aHindIIIandSgrAIfragmentfrompTC363
(intermediarycassettes)andaBsaHIfragmentfrompTC363(lastcassette).The35Sterminator
wasremovedfrompTC644bySnaBI/Eco53kI-mediatedreleaseoftheterminatorfragmentand
religationofthebackbone,resultinginpTC645.pTC643andpTC645wereassembledalongwith
pMOD_A0101 into thenon-T-DNAvectorpTRANS_100usingProtocol5, resulting in the final
vectorpTC646.
The CRISPR/Cas9 constructs were transformed into tomato protoplasts as described
above.PurifiedprotoplastDNAwasextracted48hourslaterandusedwithLunaUniversalqPCR
-
36
Master Mix (NEB) following the manufacturer’s protocol. All samples were run in technical
triplicateoftwobiologicalreplicates(twoseparatepoolsofprotoplasts)ina384-wellformaton
a LightCycler 480 Instrument (Roche). All qPCR primers are listed in Supplemental Table 6.
Primers for the genomic locus SGN-U346908 control were previously described (Expósito-
Rodríguezetal.,2008).Alldataarepresentedasthemeanplus/minusthestandarderrorwith
eachsamplerelativetothelevelsoftotalDNAasmeasuredbythegenomicslocuscontrol.
Geneeditingintomato,wheatandbarleyprotoplasts
TobuildthetomatoconstructpTC362withtwelvegRNAstargetingsixgenes,twoarraysofsix
gRNAs were cloned into pMOD_B2203 (with CmYLCV promoter) and pMOD_C2200 using
Protocol 3S1 and the two arrays were assembled together with the Csy4-Cas9 module
pMOD_A0501 into the non-T-DNA transformation backbone pTRANS_100 using Protocol 5
(SupplementalMethods).ThewheatvectorpJG612wasbuiltaccordinglyusingtheversionof
Cas9optimizedformonocotsinpMOD_A1510,asinglearrayofsixgRNAsdrivenbythePvUbi1
promoter in pMOD_B2112 and an empty module pMOD_C0000. Similarly, pMOD_A1510,
pMOD_B 2112 (PvUbi1) or pMOD_B2103 (CmYLCV), and the empty pMOD_C0000 modules
along with the transformation backbone pTRANS_100 were used to construct the barley
vectors expressing two gRNAs from the PvUbi1 (pTC441) or CmYLCV (pTC437) promoters,
targeting for deletion theMLO gene. Tomato protoplasts were isolated and transformed as
described above. Wheat protoplasts were isolated from ~10 day old plantlets of the bread
wheatcv.Bobwhite208.Seedsweresurfacesterilizedbyrinsingin70%ethanolfor10minand
soakingfor30minina1%sodiumhypochloritesolution,andasepticallygrowninMSmedium
-
37
at24°Cundera16-hlightcycle.Approximately20plantletswereharvestedineachexperiment,
cutinto~1-mmstripswitharazorblade,anddigestedwithanenzymesolution(1.5%cellulase
R10, 0.75%macerozyme R10, 0.6Mmannitol, 10mMMES, 10mMCaCl2, and 0.1% bovine
serumalbumin,pH5.7). The remaining stepswere identical as in theprotocol forprotoplast
isolation from tomato described above. Barley protoplasts were isolated from cv. Golden
Promiseusingthesameprotocol,exceptleavesfrom9-to14-day-oldplantlets(22°Cwith16h
light cycle) were used. The CRISPR/Cas9 constructs were co-transformed into ~200,000 cells
alongwiththeGFPexpressionplasmidpMOD_A3010theprotoplastswereresuspendedinW5
solutionandkeptat25°Cinthedark.GenomicDNAwasisolated2daysaftertransformation,
targeted deletions were detected using primers listed in Supplemental Table 6, cloned and
sequenced.
GeneeditingwithTrex2
Individual gRNAs were cloned into pMOD_B2515 (AtU6 promoter, for tomato) and
pMOD_B2518 (TaU6promoter, for barley) usingProtocol 2A and assembledwith 35S:Trex2-
P2A-AtCas9 in pMOD_A0901 (for tomato) or ZmUbi:Trex2-P2A-TaCas9 in pMOD_A1910 (for
barley) and the empty module pMOD_C0000 into the protoplast transformation backbone
pTRANS_100 to createpTC391 (gRNA23, tomatoANT1),pTC393 (gRNA24, tomatoANT1) and
pTC436 (gRNA38, barleyMLO) using PROTOCOL 5. The control plasmids pTC392 (gRNA23,
tomato ANT1), pTC394 (gRNA24, tomato ANT1) and pTC440 (gRNA38, barleyMLO) without
Trex2wereconstructedaccordingly,except35S:AtCas9 inpMOD_A0101andZmUbi:TaCas9 in
pMOD_A1110 were used instead of the Trex2 containing modules. The constructs were
-
38
transformed into tomato and barley protoplasts as described above and genomic DNA was
extracted2dayslater.TheANT1andMLOlociwereamplifiedbyPCR(seeSupplementalTable
6forprimers)andthegeneeditingratesweredeterminedusingT7endonucleaseIaccordingto
the manufacturer’s protocol (NEB, Ipswich, USA). Gel images were analyzed using ImageJ
(Schneider et al., 2012). In addition, theANT1 PCR productswere cloned and sequenced to
determine both frequency and type of mutations. For barley, the PCR products were first
digestedwith NcoI to enrich formutations and only the digestion-resistant product was gel
purified,clonedandsequenced.
GenetargetingwithCas9nickasesintobaccoandwheat
IndividualgRNAspacerswereclonedinmoduleB(pMOD_B2515fortobaccoorpMOD_B2518
forwheat)orC(pMOD_C2516fortobaccoorpMOD_C2518forwheat),downstreamofanAtU6
andAt7SLor twoTaU6promoters.TheGTdonorswere inserted into the resultingmoduleC
plasmids usingProtocol 4 (SupplementalMethods). A promoter-less T2A-gfp-nos terminator
cassette with 747 and 773 bp long flanking homology arms, and a 2530 bp long functional
gus:nptII gene fusion were used as donor templates for wheat and tobacco, respectively.
Finally,theBandCmoduleswereassembledwithselectedmoduleACas9variants(nuclease,
D10Anickase,H840nickaseordeadCas9expressedfromeither35SorZmUbipromoterintoa
BeYDV (pTRANS_201) or WDV (pTRANS_203) replicon vectors and used for A. tumefaciens
infiltrationoftobaccoleaves(pJG363,pJG365,pJG367,pJG369,pJG370,pJG380andpJG382)or
biolistic bombardment ofwheat scutella (pJG455, pJG456, pJG480, pJG481, pJG484, pJG486,
pJG488 and pJG493). Note that plasmids pJG455-pJG493 above used the octopine synthase
-
39
(OCS)terminatorsequenceinsteadofthemoreefficientHSPterminatorusedinthepublished
modules. The TaCas9 coding sequence and the rest of the plasmidswere identical with the
newermodulesincludedinthecurrentvectorset.Fourdifferenttransgeniclines(Wrightetal.,
2005)wereused todeterminegene targeting frequencies in tobaccoaspreviouslydescribed
(Baltesetal.,2014),withminormodifications.Leavesfrom1to6weeksoldplantswereused
forAgrobacteriumleafinfiltration.Foreachleaf,onehalfwasinfiltratedwithacontrolplasmid
pLSLZ.D.R(Baltesetal.,2014)thatservedasareferencetocontrolfortransformationefficiency
andtheotherhalfwithoneoftheCRISPR/Cas9constructs.Fourtosix leaveswere infiltrated
witheachconstructineachexperiment.Fivedaysafterinfiltration,theleaftissuewasstained
in the X-Gluc solution. Whole leaves were scanned and the stained area was quantified by
imageanalysisusingImageJ(Schneideretal.,2012).TheGTfrequencieswerenormalizedtothe
pLSLZ.D.RcontrolanddisplayedrelativetotheCas9nucleaseconstruct.Formolecularanalysis,
genomicDNAwasextracted fromtwo infiltrated leavespersample (three leafpuncheswere
takenfromeachleafandpooled)andusedasatemplateforPCRtoamplifytheleftandright
recombinationjunctionsusingprimerslistedinSupplementalTable6.PCRproductsfromboth
leaf sampleswere cloned and sequenced. To determineGT frequencies inwheat, immature
wheatscutella(0.5-1.5mm)wereisolatedfromprimarytillersharvested16daysafteranthesis
and used for biolistic transformation. Scutella isolation and culture conditions were as
describedinGil-Humanesetal.(2011).BiolisticbombardmentwascarriedoutusingaPDS-1000
gene gun. Equimolar amounts (1 pmol DNA mg-1 of gold) of plasmid were used for each
experimentwith60μgofgoldparticles(0.6μmdiameter)pershot.GFPimagesoftransformed
-
40
tissue were taken 7 days after transformation and GFP positive cells were counted. The GT
frequencieswerenormalizedtotheCas9nucleaseconstruct.
Accessionnumbers
ThedeepsequencingdataisavailableundertheEuropeanNucleotideArchive(ENA)accession
PRJEB13819(http://www.ebi.ac.uk/ena/data/view/PRJEB13819).
AllvectorsarepubliclyavailablefromAddgene(plasmid#90997–91225).AnnotatedvectorsequencesareavailablefromAddgeneandfromhttp://cfans-pmorrell.oit.umn.edu/CRISPR_Multiplex/
SUPPLEMENTALDATA
SupplementalFigure1.Universalarchitectureofexpressioncassetteswithuniquerestriction
sitesshown.
Supplemental Figure 2. GFP expression mediated by different promoter/terminator
combinationsintomatoprotoplasts.
SupplementalFigure3.FrequenciesofmutationsinducedusingdifferentsystemsforgRNA
expressionintomatoprotoplasts.
SupplementalFigure4.ShortindelsequencevariantsdetectedatgRNAsite9.
SupplementalFigure5.ShortindelsequencevariantsdetectedatgRNAsite10.
SupplementalFigure6.ShortindelsequencevariantsdetectedatbothgRNAsites9and10.
SupplementalFigure7.InsertionsfromtheARF8AlocusmappedtothevectorpTC364.
SupplementalFigure8.LongdeletionsequencevariantsdetectedbetweengRNAsites9and
10.
SupplementalFigure9.GFPexpressionmediatedbydifferentpromotersinwheatscutella.
-
41
SupplementalFigure10.MultiplexedtargetedmutagenesisinMedicagotruncatulausingCas9
andCsy4.
SupplementalFigure11.AnalysisofT1progenyfromfourMedicagotruncatulaT0plantswith
mutationsinNCR53,NCR54andNCR55.
SupplementalFigure12.SequencesofinsertionsdetectedattheANT1site#2intomato
protoplastsampleswithoutTrex2expression.
SupplementalFigure13.Detectionofgenetargetingintobaccoleaves.
SupplementalFigure14.GenetargetingwithCas9nickasesandgeminivirusreplicons(GVRs)in
wheat.
SupplementalTable1.Sequencingandreadfilteringdata.
SupplementalTable2.FrequenciesofindividualmutationtypesamongallARF8Asequencing
reads.
SupplementalTable3.MultipleTukeytestresults.
SupplementalTable4.SequencesofthegRNAsitesusedintheqPCRexperiment.
SupplementalTable5.Genotypesof46MedicagotruncatulahygromycinresistantT0plants
transformedwithaCas9Csy4constructexpressingsixgRNAstargetingthreegenes.
SupplementalTable6.Listofprimersusedinthisstudy.
SupplementalMethods.Protocolsforvectorassembly.
SupplementalDataSet1.Listofvectorsincludedinthetoolset.
SupplementalDataSet2.SequencesofNCR53,NCR54andNCR55locitargetedfordeletionin
46MedicagotruncatulaT0plants.
-
42
ACKNOWLEDGEMENTS
WethankmembersoftheVoytaslabforhelpfuldiscussionsandinsights.Weparticularlythank
LynnHufortechnicalassistance,LázaroPeres(UniversidadedeSãoPaulo,Brazil)andAgustin
Zsögön (Universidade Federal de Viçosa, Brazil) for input on target genes in tomato, Joseph
Guhlin (College of Biological Sciences, University of Minnesota) for input onMedicago NCR
target genes, Aaron Hummel for providing the plasmids with structurally optimized gRNA
scaffolds, Evan Ellison for the GmUbi promoter reagents, and Kit Leffler for help with the
figures. FundingwasprovidedbygrantsfromtheNationalScienceFoundation (IOS-1339209
and IOS-1444511) and the Department of Energy (DE-SC0008769).
AUTHORS’CONTRIBUTIONS
TCdesignedthevectorsystemandworkplan.TC,JG,CGSandRLGconstructedthevectors.TC
performedtheexperimentsintomatoprotoplasts.SJCperformedtheexperimentsinMedicago
truncatula.JGperformedtheexperimentsinwheatprotoplastsandgenetargetingexperiments
intobaccoandwheat.EKperformedtheexperimentsinbarleyprotoplastsandcarriedoutthe
molecularanalysisof thegenetargetingevents in tobacco.RCanalyzedthedeepsequencing
data.TJYKdesignedandbuiltthesetofonlinetools.JJBperformedtheqPCRanalysistoassess
deletionfrequencies.JWMperformedtheanalysisofGUSstainingintobaccoleaves.TC,JG,SC
andDVwrotethemanuscript.
-
57
NumberofT0plants(outof46)
Targetgene
DeletionbetweengRNAsites
InversionbetweengRNAsites
IndelatgRNAsite#1only
IndelatgRNAsite#2only
IndelsatbothgRNAsites
Totalmutant
NCR53(multicopy) 17 4 11 1 23 43NCR54 0 0 0 8 0 8NCR55 7 0 5 0 1 12
Table1.MultiplextargetedmutagenesisinMedicagotruncatulausingCas9andCsy4.
-
Golden Gate assembly using Type IIs enzymes
CCDB
ORClone gRNA(SINGLE STEP)pF U
S_A
pF U
S_B
pL
R
Assemble a TALEN(THREE STEPS)
Transforma
tion Backbone (TB)
A BGolden Gate assembly using Aarl
CCDB
Transforma
tion Backbone (TB)
A
B
C
CCTG
CTAG
GATC
GGCCGGAC
CCGG TCAC
AGTG
A B C
Aarl AarlAarl Aarl Aarl Aarl Aarl Aarl
DIRECT CLONING VECTORS MODULAR ASSEMBLY VECTORS
Figure 1. Two sets of vectors for direct cloning or modular assembly of genome engineering reagents. A) Direct cloning vectors were designed to speed-up the cloning process. Specificity determining elements (gRNAs or TAL repeats) are cloned directly into the transformation backbone (e.g. T-DNA). B) The modular assembly vectors enable combination of different functional elements. Specificity determining elements (gRNAs, TAL repeatsand donor templates for gene targeting) are first cloned into intermediate module vectors. Custom-selected modules are then assembled together intothe transformation backbone (e.g. T-DNA).
-
A
C
TGAATGTCATAAACACTCATGAATTTTGGCCTCATCAGTTTGTGATGGAAAA
TGAATGTCATAAACACTCATGAAT-----------------GTGATGGAAAA: C1TGAATGTCATAAACACTCATGAATTT--GCCTCATCAGTTTGTGATGGAAAA: C2TGAATGTCATAAACACTCATGAAT----GCCTCATCAGTTTGTGATGGAAAA: C3TGAATGTCATAAACACTCATGAATTTTG-CCTCATCAGTTTGTGATGGAAAA: C4
WPT52-4-4 - Medtr4g020620
ATTTAGTTCCTGTGAATGTCATAAACACTCATGAATTTTGGCCTCATCAGTTTGTGATGGAAAAAG
A---------------------------------------------------------------AG
WPT52-4-4 - Medtr2g013650
4g020620
2g013650
bar
ACTIN
WPT
52-4
-1W
PT52
-4-2
WPT
52-4
-3W
PT52
-4-4
WPT
52-4
-5W
PT52
-4-6
WPT
52-4
-7W
PT52
-4-8
HM34
0 (D
)HM
340
(UD)
B
TGTCATAAACACTCATGAATTTTGGCCTCATCAGTTTGTGATGGA
Medtr4g020620
4g020620 F
4g020620 R
2g013650 R
2g013650 F
Medtr2g013650
Figure 2. TALEN-mediated mutagenesis in Medicago truncatula using direct cloning vectors. A) Maps of two Medicago genes targeted for mutagenesis using a single TALEN pair. PCR primers used for screening in panel B are shown as green arrowheads. The sequence of the TALEN target site is shown with TALEN binding sites underlined. B) HaeIII PCR digestion screening of eight T1 progeny of plant WPT52-4. The amplified locus is indicated on the right. bar and ACTIN are controls that amplify the transgene and a native gene, respectively. HM340 (D), WT product digested with HaeIII; HM340 (UD), undigested WT product. C) DNA sequences of undigested amplicons from plant WPT52-4-4. The reference sequence of the unmodified locus is shown on the top. TALEN binding sites are underlined. HaeIII restriction site used for the screening is in bold red text.
-
50
40
30
20
10
0
AtU6+
At7SL
Impro
ved g
RNA
AtU6+
AtU6
Csy4
tRNA
Riboz
yme
Non-t
ransfo
rmed
ALL MUTATIONSC
B
A
CCCTCAAAGATCAGGCTGGCAGC...62 bp...CCTAATTAACCAGAAAACTTAC
Unmodified
Long deletion
Short indels at gRNA9 site
Short indels at gRNA10 site
Short indels at both sites
Inversion
Long deletion plus insertion
gRNA9 gRNA1085bp
5’ 3’
Figure 3. Comparison of different systems for expressing multiple gRNAs. A) Structure of six constructs for expressing gRNA9 and gRNA10, which both target the tomato ARF8A gene. CmYLCV, cestrum yellow leaf curling virus promoter; Csy4, 20 bp Csy4 hairpin; tRNA, 77 bp pre-tRNAGly gene; rb, 15 bp ribozyme cleavage site; ribozyme, 58 bp synthetic ribozyme; AtU6, Arabidopsis U6 promoter; At7SL Arabidopsis 7SL promoter. B) Possible editing outcomes due to expression of both gRNAs. Sequences targeted by gRNA9 and gRNA10 are shown on the top Cleavage sites are indicated by arrows. C) Overall mutation frequencies as determined by deep sequencing. Error bars represent SE of three replicates (pools of protoplasts). Statistical significance was determined by Tukey’s test. *P < 0.05 and ***P < 0.001.
*Improved gRNA scaffold
Csy4 Csy4 Csy4CmYLCV gRNA9 gRNA10 35S polyA
CmYLCV gRNA9 gRNA10 35S polyAtRNA tRNAtRNA
rb ribozymeCmYLCV gRNA9 rb gRNA10 35S polyA
AtU6 gRNA9 AtU6 gRNA10 TTTTTTTTTTTT
gRNA9 At7SL gRNA10 TTTTTTTTTTTTAtU6
gRNA9 At7SL gRNA10 TTTTTTTTTTTTAtU6
Ove
rall
Mut
atio
n Fr
eque
ncy
-
C
Figure 4. Evaluation of eight gRNAs in various polycistronic expression systems and the effect of position in the array on gRNA activity. A) Structure of seven constructs for expressing gRNAs 49 -56 targeting four tomato genes for deletion. B) Detection of deletions between gRNA sites using qPCR. Each gene is targeted for deletion with two gRNAs designed to create a 3 kb deletion to prevent amplification of the non-deleted, WT template. Deletion frequency wsa quantified by qPCR using primers shown as red and blue arrowheads. C) Deletion frequencies in tomato protoplasts as measured by qPCR. Positions of respective gRNAs in the array are specified for each construct. Error bars represent SE of two replicates.
ACm
YLCV
gRNA
49
gRNA
50
35S p
olyA
gRNA
51
gRNA
52
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
gRNA
53
gRNA
54
gRNA
55
gRNA
56
CmYL
CV
gRNA
56
gRNA
55
35S p
olyA
gRNA
54
gRNA
53
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
Csy4
gRNA
52
gRNA
51
gRNA
50
gRNA
49
CmYL
CV
35S p
olyA
tRNA
Gly
gRNA
49
gRNA
50
gRNA
51
gRNA
52
gRNA
53
gRNA
54
gRNA
55
gRNA
56
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
gRNA
56
gRNA
55
gRNA
54
gRNA
53
gRNA
52
gRNA
51
gRNA
50
gRNA
49
CmYL
CV
35S p
olyA
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
tRNA
Gly
rb rb rb rb rb rb rb rbCmYL
CV
gRNA
49
gRNA
50
35S p
olyA
gRNA
51
gRNA
52
riboz
yme
gRNA
53
gRNA
54
gRNA
55
gRNA
56
CmYL
CV
gRNA
56
gRNA
55
35S p
olyA
gRNA
54
gRNA
53
rb rb rb rb rb rb rb rbgRNA
52
gRNA
51
gRNA
50
gRNA
49
riboz
yme
gRNA
49
gRNA
50
gRNA
51
gRNA
52
gRNA
53
gRNA
54
gRNA
55
gRNA
56
AtU6
TTTT
TT
TTTT
TT
TTTT
TT
TTTT
TT
TTTT
TT
TTTT
TT
TTTT
TT
TTTT
TT
AtU6
AtU6
AtU6
AtU6
AtU6
AtU6
AtU6
BWT locus
3 kb - no amplification
Deleted locus
100-200 bp - amplifies in qPCR
Del
etio
n Fr
eque
ncy
Solyc02g077390 - gRNAs 49/50
Csy4
- 1/2
Csy4
- 7/8
tRNA
- 1/2
tRNA
- 7/8
Ribo -
1/2
Ribo -
7/8 U6
0
0.2
0.4
0.6
0.8
1.0
1.2Solyc06g074240 - gRNAs 55/56
Del
etio
n Fr
eque
ncy
Csy4
- 7/8
Csy4
- 1/2
tRNA
- 7/8
tRNA
- 1/2
Ribo -
7/8
Ribo -
1/2 U6
0
0.2
0.4
0.6
0.8
1.0
1.2Solyc02g090730 - gRNAs 53/54
Del
etio
n Fr
eque
ncy
Csy4
- 5/6
Csy4
- 3/4
tRNA
- 5/6
tRNA
- 3/4
Ribo -
5/6
Ribo -
3/4 U6
0
0.2
0.4
0.6
0.8
1.0
1.2
-
C
A1930 bp (1306 bp)Solyc06g074350
gRNA11 gRNA12
1510 bp (409 bp)Solyc02g085500
gRNA13 gRNA14
670 bp (388 bp)Solyc02g090730
gRNA15 gRNA16
1497 bp (684 bp)Solyc06g074240
gRNA17 gRNA18
1933 bp (1536 bp)Solyc02g077390
gRNA19 gRNA20
1551 bp (313 bp)Solyc11g071810
gRNA21 gRNA22
Solyc06g074350AGATCATGATAGATCCAGATGTTCCTGGTCCTAGTGATCCATATCT // AGAGGATTTGACCATCTTTACAGGATTGTCACAGACATTCCAGGCACTAGATCATGATAGATCCAG---------------------------- // -------------------------------------------GCACTAGATCATGATAGATCCAG---------------------------- // -------------------------------------------GCACTAGATCATGATAGATCCAG---------------------------- // -------------------------------------------GCACT
gRNA11 gRNA12
Solyc02g085500TAGTTCCTTCGTGTCTGAAGAAGAAGAATGTGAAGAGGAGATCAATTT // AGTGAAGAAATCTCAGGACCCGTACTAAGATTTCAAGAGATCGATGTAGTTCCTTCG------------------------------------- // -------------------------GAAGATTTCAAGAGATCGATGTAGTTCCTTCG------------------------------------- // -----------------------------ATTTCAAGAGATCGATG
gRNA13 gRNA14
Solyc02g090730ATGTTCCTCCTCACTATGTATCTGCCCCCGGCACCACCACGGCGCGGT // TTAGCATGTGGGAGTAGAGGTGCATTATATTGTTTGCTGGGATTGAATGTTCCTCCT------------------------------------- // -----------------------------------GCTGGGATTGAATGTTCCTCCT------------------------------------- // -----------------------------------GCTGGGATTGAATGTTCCTCCTC------------------------------------ // ----------------------------------TGCTGGGATTGA
gRNA15 gRNA16
Solyc11g071810AAATCCCTTCGATTCGTCGTAAGTACTCTCTCTATCTCTCTTAATAAA // GAGAAAAGACAACGTGTTCCTTCTGCGTACAACCGATTCATCAAGTAAATCCCTTCGA------------------------------------ // ------------------------GCGTACAACCGATTCATCAAGTAAATCCCTTCGA------------------------------------ // ----------------------------ACAACCGATTCATCAAGTAAATCCCTTCG------------------------------------- // -------------------------CGTACAACCGATTCATCAAGT
gRNA21 gRNA22
Solyc06g074240GATCATTATCGGAGCTGGCCCTGCTGGGCTCAGGCTAGCTGAACAAGT // CTATTGGTGGGAATTCAGGGATAGTTCATCCATCAACAGGGTA