Post on 03-Jan-2016


Page 1: Project of CZ5225 Zhang Jingxian: g0800791@nus.edu.sg

Project of CZ5225

Zhang Jingxian: g0800791@nus.edu.sg

Page 2:

Identifying biomarkers of drug response for cancer patients

Aims:
- To develop predictors of response to drugs
- To learn how to get public microarray data
- To learn how to preprocess raw microarray data
- To annotate the genes of interest

Page 3:

Requirements

Each group investigates:
- ONE kind of cancer patient drug response
- Needs TWO datasets from different studies
- Download the raw data
- Use Bioconductor in R to preprocess the raw data
- Identify a certain number of genes
- Annotate the identified genes in your report
- Each group needs only ONE report

Page 4:

Requirements

- All kinds of Affymetrix expression datasets related to drug response of cancer patients are available
- Each dataset needs to contain at least 20 samples
- Each dataset needs two comparable outcome groups: response vs. non-response, resistance vs. non-resistance, etc.
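The two dataset requirements above (at least 20 samples, exactly two comparable outcome groups) can be checked on a sample annotation table before any preprocessing. The following is only a minimal base R sketch; the function name check_dataset, the column name response and the toy table are assumptions made for illustration and are not part of the assignment.

```r
# Hypothetical helper: check the sample count and the number of outcome groups.
check_dataset <- function(pheno, group_col, min_samples = 20) {
  groups <- table(pheno[[group_col]])
  list(
    enough_samples = nrow(pheno) >= min_samples,  # at least 20 samples
    two_groups     = length(groups) == 2,         # exactly two outcome groups
    group_sizes    = groups
  )
}

# Toy annotation table (made-up samples): 10 pCR and 14 RD patients.
pheno <- data.frame(
  sample   = paste0("S", 1:24),
  response = rep(c("pCR", "RD"), times = c(10, 14))
)

res <- check_dataset(pheno, "response")
print(res)
```

Very unbalanced group sizes would also be worth noting in the report, since they weaken the later two-group comparison.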

Page 5:

Bioconductor & R

http://www.bioconductor.org

Page 6:

Advantages

- Cross-platform
  - Linux, Windows and MacOS
- Comprehensive and centralized
  - Analyzes both Affymetrix and two-color spotted microarrays, and covers various stages of data analysis in a single environment
- Cutting-edge analysis methods
  - New methods/functions can easily be incorporated and implemented
- Quality check of data analysis methods
  - Algorithms and methods have undergone evaluation by statisticians and computer scientists before launch, and in many cases there are also literature references
- Good documentation
  - Comprehensive manuals, documentation, course materials, course notes and a discussion group are available
- A good chance to learn statistics and programming

Page 7:

Installing R & Bioconductor

- Install R from: http://cran.stat.nus.edu.sg/
- Open R, then execute:
  > source("http://bioconductor.org/biocLite.R")
  > biocLite()
- Check the installed libraries by executing:
  > library()
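On current Bioconductor releases the biocLite.R script has been replaced by the BiocManager package from CRAN. The sketch below shows one way to check which packages are still missing and install only those. The helper names missing_packages and install_bioc are assumptions for illustration, and whether each package used in these slides is still distributed depends on the Bioconductor release.

```r
# Return the entries of `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install the missing packages through BiocManager (needs network access).
install_bioc <- function(pkgs) {
  todo <- missing_packages(pkgs)
  if (length(todo) == 0) {
    return(invisible(character(0)))
  }
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(todo)
  invisible(todo)
}

# Example call (not run here):
# install_bioc(c("simpleaffy", "GGtools", "GGdata"))
```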

Page 8:

Case study

Dataset source (GSE19697): http://www.ncbi.nlm.nih.gov/geo

Page 9:

- Extract the raw data into: D://gse19697
- Create title.txt:
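The content of title.txt was shown as an image on the slide and is not in the transcript. As a rough guide, simpleaffy's read.affy expects a whitespace-delimited description file whose first column lists the CEL file names and whose remaining columns have headers naming the covariates. The later steps refer to a column called title with the values pCR and RD. The sketch below writes such a file from R. The CEL file names are made up, and the file is written to a temporary directory rather than D://gse19697.

```r
# Hypothetical CEL file names; replace with the files actually extracted from GEO.
cel_files <- c("GSM000001.CEL", "GSM000002.CEL", "GSM000003.CEL", "GSM000004.CEL")

# One covariate column named "title" holding the outcome group of each sample.
pheno <- data.frame(
  title     = c("pCR", "pCR", "RD", "RD"),
  row.names = cel_files
)

# Written to a temporary directory here; in the case study it would sit next to the CEL files.
out_file <- file.path(tempdir(), "title.txt")
write.table(pheno, file = out_file, quote = FALSE, sep = "\t")

# Read the file back to confirm the layout: file names as row names, one "title" column.
check <- read.table(out_file, header = TRUE)
print(check)
```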

Page 10:

- Open R
- Set the working directory by executing:
  > setwd('d://gse19697')
- Load the simpleaffy package by executing:
  > library(simpleaffy)
- Load the data by:
  > eset <- read.affy('title.txt')

Page 11:

- Calculate expression values by:
  > eset.rma <- call.exprs(eset, 'rma')
- Compare the two groups by:
  > pc.result <- pairwise.comparison(eset.rma, "title", c("pCR", "RD"), eset)

Page 12:

- Filter the markers that changed significantly between the two groups by:
  > significant <- pairwise.filter(pc.result, fc=log2(1.5), tt=0.001)
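To make the two thresholds concrete: fc=log2(1.5) asks for at least a 1.5-fold change on the log2 scale, and tt=0.001 asks for a t-test p-value below 0.001. The base R sketch below applies the same two criteria to simulated log2 expression values. It only illustrates the idea and is not simpleaffy's implementation; all data and probe names are made up.

```r
set.seed(1)

# Simulated log2 expression: 200 probes, 10 pCR and 10 RD samples (made-up data).
n_genes <- 200
group <- factor(rep(c("pCR", "RD"), each = 10))
expr <- matrix(
  rnorm(n_genes * 20, mean = 8, sd = 0.3),
  nrow = n_genes,
  dimnames = list(paste0("probe", seq_len(n_genes)), NULL)
)

# Plant a strong difference in the first five probes (a 4-fold change on the raw scale).
expr[1:5, group == "RD"] <- expr[1:5, group == "RD"] + 2

# Log2 fold change: difference of the group means on the log2 scale.
log2_fc <- rowMeans(expr[, group == "RD"]) - rowMeans(expr[, group == "pCR"])

# Per-probe two-sample t-test p-value.
p_val <- apply(expr, 1, function(x) {
  t.test(x[group == "RD"], x[group == "pCR"])$p.value
})

# Keep probes that pass both thresholds used on the slide.
keep <- abs(log2_fc) >= log2(1.5) & p_val < 0.001
selected <- rownames(expr)[keep]
print(selected)
```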

Page 13:

- Plot the significantly changed markers:
  > plot(significant)
- Annotate the selected markers:
  > significant

Page 14:
Page 15:

Annotate selected markers:

Page 16:

Heatmap:

Page 17:

> significant <- pairwise.filter(pc.result, fc=log2(1), tt=0.001)
> pid <- rownames(significant@means)
> eset.hm <- eset.rma[pid,]
> install.packages("RColorBrewer")
> library(RColorBrewer)
> hmcol <- colorRampPalette(brewer.pal(10, "RdBu"))(256)
> spcol <- ifelse(eset.hm$title == "pCR", "goldenrod", "skyblue")
> heatmap(exprs(eset.hm), col = hmcol, ColSideColors = spcol)

Page 18:

Assignment 2

- Genetics of gene expression (eQTL)
- Aim: to identify potential genetic variants that cause differential expression
- Deadline of report: two weeks before the final examination

Page 19:

Genetics of gene expression

Page 20:

SNP

Page 21:
Page 22:
Page 23:
Page 24:

expression Quantitative Trait Locus (eQTL)

- eQTL mapping tries to find genomic variation that explains expression traits.
- One difference between eQTL mapping and traditional QTL mapping is that a traditional mapping study focuses on one or a few traits, while in most eQTL studies thousands of expression traits are analyzed and thousands of QTLs are declared.
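For a single expression trait and a single SNP, the association test behind an eQTL scan can be pictured as a regression of expression on genotype coded as the number of minor alleles (0, 1 or 2), optionally adjusted for a covariate such as sex. The sketch below uses simulated data in base R. It illustrates the general idea under these assumptions and is not the method of any particular package.

```r
set.seed(42)

# Simulated data for 90 individuals (made up for illustration).
n <- 90
genotype <- sample(0:2, n, replace = TRUE, prob = c(0.25, 0.5, 0.25))  # minor-allele count
male <- rbinom(n, 1, 0.5)                                               # covariate

# Expression rises by 0.5 log2 units per minor allele, plus noise.
expression <- 7 + 0.5 * genotype + rnorm(n, sd = 0.4)

# Additive model: expression ~ genotype, adjusted for sex.
fit <- lm(expression ~ genotype + male)
snp_p <- summary(fit)$coefficients["genotype", "Pr(>|t|)"]
print(snp_p)
```

A genome-wide eQTL study repeats this kind of test for every SNP and every expression trait, which is why the number of tests, and of declared QTLs, grows so quickly.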

Page 25:

GGdata: all 90 HapMap CEU samples, 47K expression, 4mm SNP

Page 26:

Chromosome 17

Page 27:

> biocLite("GGtools")
> biocLite("GGdata")
> library(GGtools)
> library(GGdata)
> c17 = getSS("GGdata", "17")
> # get("CSDA", revmap(illuminaHumanv1SYMBOL))
> t1 = gwSnpTests(genesym("CSDA") ~ male, c17, chrnum("17"))
> # t1 = gwSnpTests(probeId("GI_21359983-S") ~ male, c17, chrnum("17"))
> topSnps(t1)
> plot_EvG(genesym("CSDA"), rsid("rs7212116"), c17)
> # c_full = getSS("GGdata", as.character(1:22))

Page 28:

Requirements for assignment 2

- Identify the genetic cause (eQTL) of the genes selected in assignment 1
- Get SNPs with significant association (<10e-4) from each chromosome
- Paste the plot image for each association
- Annotate the SNPs in dbSNP
- Submit one report for each group
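Once the per-chromosome test results have been collected into one table, picking out the significant SNPs is a simple filter. The sketch below is base R on a made-up table; the SNP identifiers, p-values and column names are placeholders. The threshold is copied as written in the requirement: in R, 10e-4 evaluates to 0.001, so use 1e-4 instead if a cutoff of 10^-4 was intended.

```r
# Made-up association results; rsid values are placeholders, not real dbSNP identifiers.
results <- data.frame(
  chr  = c("1", "1", "2", "2", "17"),
  rsid = c("rsA", "rsB", "rsC", "rsD", "rsE"),
  p    = c(2e-6, 0.03, 5e-4, 0.2, 8e-9),
  stringsAsFactors = FALSE
)

# Threshold as written in the requirement (10e-4 is 0.001 in R).
threshold <- 10e-4

# Keep the significant SNPs and group them by chromosome for the report.
hits <- results[results$p < threshold, ]
hits_by_chr <- split(hits$rsid, hits$chr)
print(hits_by_chr)
```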