CHANGE!! MGL Users Group meetings
will now be on the 1st Monday of each month
3:00-4:00, Room 3-2550
Note the change of time and room
Notes on ChIP-seq library preparation and initial handling
MGL Users Group
1/21/15
Library Preparation
What is the desired resolution of the experiment? Do we need to identify a specific recognition motif? Do we want to resolve binding to base-pair resolution? Or do we just need general areas?
Target resolution will influence both sample preparation and initial library construction.
It may also influence the decision to use paired-end or single-end sequencing. Paired-end sequencing generally gives more reliable mapping, because a proximal mate read helps disambiguate highly similar sequences. It can also potentially resolve the ‘ends’ and sizes of protected fragments.
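As a rough illustration of how a read pair can resolve fragment ends and size, the sketch below infers a fragment from two mates. The `Read` record and `fragment_from_pair` helper are hypothetical names used only for this example, and the coordinates are assumed to be 0-based with an exclusive end; real aligner output would need to be converted to that form first.

```python
from typing import NamedTuple, Optional, Tuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not tied to any aligner)."""
    start: int      # 0-based leftmost aligned position
    end: int        # 0-based exclusive end of the alignment
    reverse: bool   # True if the read aligned to the reverse strand


def fragment_from_pair(r1: Read, r2: Read) -> Optional[Tuple[int, int, int]]:
    """Return (start, end, size) of the fragment implied by an inward-facing pair.

    Returns None when the mates are on the same strand or do not face each
    other, since no fragment can be inferred from such a pair.
    """
    # Pick out the forward-strand and reverse-strand mates.
    fwd, rev = (r1, r2) if not r1.reverse else (r2, r1)
    if fwd.reverse or not rev.reverse:
        return None  # both mates on the same strand
    start, end = fwd.start, rev.end
    if end <= start:
        return None  # mates point away from each other
    return start, end, end - start


if __name__ == "__main__":
    # Two 50 nt mates bracketing a mono-nucleosome-sized fragment.
    pair = (Read(100, 150, False), Read(201, 251, True))
    print(fragment_from_pair(*pair))
```

With single-end data only one end of each fragment is observed, so the fragment size has to be estimated rather than read off directly.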
HiSeq / SOLiD reads are short
Unless the ChIP targets have a small footprint, even paired-end reads may not span a full pulldown target in some cases. Only the fragment ends are read; the intervening sequence is ‘invisible’ to overall read density.
In the case of digested nucleosome fragments, for example, it may be possible to have more than one tandem ligand/target on a single resulting fragment.
Again, if resolution is key, it’s often desirable to ensure efficient cleavage/digestion of DNA prior to pulldown. Additional steps may be added to size-select for fragments matching most closely in size to the expected single-ligand protected fragment.
Always run an Input control
ChIP-seq identifies regions of enrichment in the IP pool, and apparent enrichment can be affected by the relative abundance of those regions in the input.
To control for artificial enrichment, the input control serves as a normalization baseline for apparent peaks of enriched signal in the IP.
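The idea of normalizing IP signal against the input can be sketched as a simple depth-scaled fold enrichment. This is only an illustration under stated assumptions: `fold_enrichment` and its parameters are hypothetical names, the pseudocount is an arbitrary choice, and actual peak callers use more elaborate statistical models than a plain ratio.

```python
def fold_enrichment(ip_count: float, input_count: float,
                    ip_total: float, input_total: float,
                    pseudocount: float = 1.0) -> float:
    """Fold enrichment of IP over input for one region.

    The input count is first scaled to the IP sequencing depth so that
    libraries of different sizes are comparable. The pseudocount (an
    assumption of this sketch) avoids division by zero in regions with
    no input reads.
    """
    if ip_total <= 0 or input_total <= 0:
        raise ValueError("library totals must be positive")
    scaled_input = input_count * (ip_total / input_total)
    return (ip_count + pseudocount) / (scaled_input + pseudocount)


if __name__ == "__main__":
    # Region with 100 IP reads and 20 input reads; the input library
    # was sequenced twice as deeply as the IP library.
    fe = fold_enrichment(100, 20, ip_total=1_000_000, input_total=2_000_000)
    print(f"fold enrichment: {fe:.2f}")
```

A region that is simply over-represented in the input (for example an amplified locus) scores close to 1 here, even though its raw IP read count looks like a peak.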
Additional treatments pre-library construction
Large pulldown fragments were subjected to additional Covaris shearing prior to library construction.
Input / Immunoprecipitated
Mono-nucleosomes: ~151 nt
Poly-nucleosomes: ~1,000-8,000 nt
Fragments >1 kb would otherwise necessitate mate-pair library construction.
Final library sizes were similar
After Covaris shearing, construction of standard fragment libraries proceeded as normal.
Input library: ~285 nt
Immunoprecipitated library: ~285 nt
The common workflow for ChIP-seq following sequencing
Bardet et al., Nature Protocols (2012)
Data Alignment
Alignment strategies may differ, particularly if redundant or highly similar sequences are expected.
Alignment modes:
Unique alignment: only uniquely best alignments are used
Random assignment: if equal best alignments are found, randomly assign the read to one of them
Multiple assignment: if equal best alignments are found, assign the same read to all of them
Depending on the aligner used, a randomly assigned or multiply assigned read is often reported with a mapping quality of 0. Visualization tools may mask reads whose mapping quality falls below a certain threshold.
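The three alignment modes can be sketched as a choice among equally best candidate alignments. This is a toy model, not how any particular aligner is implemented: `assign_read`, the `Candidate` tuple, and the mode names are hypothetical, and a higher score is assumed to mean a better alignment.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical candidate alignment: (chromosome, position, alignment score).
Candidate = Tuple[str, int, int]


def assign_read(candidates: List[Candidate], mode: str,
                rng: Optional[random.Random] = None) -> List[Tuple[str, int]]:
    """Return the location(s) a read is assigned to under the given mode."""
    if not candidates:
        return []
    best = max(score for _, _, score in candidates)
    # All locations tied for the best score.
    ties = [(chrom, pos) for chrom, pos, score in candidates if score == best]
    if mode == "unique":
        # Keep the read only if a single best alignment exists.
        return ties if len(ties) == 1 else []
    if mode == "random":
        # Pick one of the equally best locations at random.
        rng = rng or random.Random()
        return [rng.choice(ties)]
    if mode == "multiple":
        # Report the read at every equally best location.
        return ties
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    hits = [("chr1", 1000, 50), ("chr7", 42000, 50), ("chr2", 500, 30)]
    for mode in ("unique", "random", "multiple"):
        print(mode, assign_read(hits, mode, random.Random(0)))
```

In repeat-rich regions the choice of mode changes the apparent read density considerably: unique mode leaves such regions empty, while multiple mode can inflate them.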
Peak identification & Mining Software
Peak calling/comparisons: HOMER, MACS, ODIN, R (DBChIP, others), MAnorm
Peak annotation: HOMER, Bedtools, GREAT, Cistrome, R (ChIPpeakAnno)
Visualization: IGV, UCSC Genome Browser, NGS-Plot, R (base & other packages)
Motif identification: Interpro (known), Pfam (known), MEME (discovery/known)
Enrichment analysis: DAVID, gProfiler, BiNGO, AmiGO, R (GOstats, others)