Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
opm – an R package to analyse OmniLog Phenotype MicroArray data
Dr. Johannes Sikorski, Dr. Lea Vaas, Dr. Markus Göker
Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
www.dsmz.de
An introduction to opm
You have numerous OmniLog Phenotype MicroArray data sets
• of closely related organisms or cell lines
• of numerous well-defined mutants
• obtained under diverse physiological test conditions
and you want to explore them thoroughly and quantitatively within diverse analysis frameworks.
www.biolog.com
opm: Tools for analysing OmniLog(R) Phenotype Microarray data
opm enables you:
• to organize your PM data, curve parameters and metadata
• to subset and query your data
• to display raw kinetics or aggregated curve parameters graphically
• to exploit the full statistics implemented in R
• to export to third-party software using YAML
Software requirements
• R (http://www.r-project.org/): a free software environment for statistical computing and graphics.
• RStudio™ (http://www.rstudio.org/): a free and open source integrated development environment (IDE) for R.
• Add-on package opm (http://cran.r-project.org/web/packages/opm/index.html): Tools for analysing OmniLog(R) Phenotype Microarray data
R code of this presentation
The R code of this presentation is available on request from
Dr. Johannes Sikorski [email protected]
Dr. Lea Vaas [email protected]
Dr. Markus Göker [email protected]
Feel free to contact us in case of any questions regarding usage of opm.
opm organizes your PM data in OPMS objects
Example: a set of 9 PM plates of the same plate type (Plate 1 to Plate 9)
An OPMS object stores:
• raw kinetic data (per well)
• aggregated curve parameters (per well), with confidence intervals from bootstrapping
• metadata (per plate): any metadata of interest to the user
The size of the OPMS object is limited only by the amount of available RAM.

[Table residue: raw kinetic data for one example well ("lysin"), intensity against time in hours]

Example metadata for Plate 3:
Taxonomy           Bacillus subtilis
habitat            soil
sampling place     GPS coord.
sampling date      2011-06-15
sampling season    summer
habitat [°C]       27
sporulation        yes
PCR (gene xyz)     positive
... as much and what you wish ...

Example aggregated curve parameters for one well ("lysin"):
mu                 15.559078
lambda             5.798210
A                  305.989319
AUC                23308.269348
mu CI95 low        3.803466
lambda CI95 low    1.080333
A CI95 low         305.642353
AUC CI95 low       23125.092442
mu CI95 high       140.841704
lambda CI95 high   11.819251
A CI95 high        306.986123
AUC CI95 high      23411.648024

Lag = lambda, Slope = mu, Max = A, Area Under the Curve = AUC
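To make the four curve parameters concrete, here is a minimal base R sketch that derives rough estimates from a made-up kinetic curve. It is not the algorithm used by opm. The toy data, the helper name `rough_params`, and the simple estimators (largest value for A, trapezoidal rule for AUC, steepest finite difference for mu, tangent intercept with the baseline for lambda) are all assumptions chosen for illustration.

```r
# Hypothetical toy curve: intensity measured every 0.25 h (values are made up)
hours <- seq(0, 10, by = 0.25)
intensity <- 300 / (1 + exp(-(hours - 5)))  # logistic-shaped curve

rough_params <- function(t, y) {
  # Maximum height (A): largest observed value
  A <- max(y)
  # Area under the curve (AUC): trapezoidal rule
  AUC <- sum(diff(t) * (head(y, -1) + tail(y, -1)) / 2)
  # Slope (mu): steepest increase between neighbouring time points
  slopes <- diff(y) / diff(t)
  i <- which.max(slopes)
  mu <- slopes[i]
  # Lag (lambda): time at which the steepest tangent crosses the baseline y[1]
  t_mid <- (t[i] + t[i + 1]) / 2
  y_mid <- (y[i] + y[i + 1]) / 2
  lambda <- t_mid - (y_mid - y[1]) / mu
  c(mu = mu, lambda = lambda, A = A, AUC = AUC)
}

params <- rough_params(hours, intensity)
print(round(params, 3))
```

Real kinetics are noisy, so finite differences like these would be unstable in practice; this is one reason a fitted curve is normally used instead.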
read_opm()
Load demo files that come with the opm package
# Use the built-in function opm_files() to retrieve the paths where the
# example files in your R installation are located:
(files <- opm_files("testdata"))

# Read in the files, which are compressed, using the include argument to
# select specific plates of interest. This loads three files into the
# object "example.opm":
example.opm <- read_opm(files, include = "*Example_?.csv.xz")

Usage:
read_opm(names, convert = c("try", "no", "yes", "sep", "grp"), gen.iii = FALSE, include = list(), ..., demo = FALSE)
# Read in all CSV raw data files in your working directory
PM1 <- read_opm(".")

# Read in all CSV raw data files in your working directory and convert
# the plate type to GenIII plates
GenIII <- read_opm(".", gen.iii = TRUE)
# Let us check some information on the files in this OPMS object
plates(example.opm)
summary(example.opm)
show(example.opm)
dim(example.opm)
hours(example.opm)
length(example.opm)
max(example.opm)
plate_type(example.opm)
seq(example.opm)
setup_time(example.opm)
measurements(example.opm)  # returns the raw kinetic data
wells(example.opm)
wells(example.opm, full = TRUE)
do_aggr()
# Aggregate only "A" and "AUC" using a fast algorithm
x <- do_aggr(example.opm, program = "opm-fast")

# Aggregate all 4 parameters using a spline fit algorithm (grofit package)
x <- do_aggr(example.opm)

# Include 100 bootstrap replicates
x <- do_aggr(example.opm, program = "opm-fast", boot = 100)

# Note: time consuming
x <- do_aggr(example.opm, boot = 100)
# Check aggregated data
aggregated(example.opm)

                         A01         A02           A03          A04          A07
mu                  4.242159    5.769109    0.02138581    0.2827407    0.2383062
lambda             -2.340620   12.799329 -465.46803431   20.0749555  -14.4573092
A                  47.923185   62.738943   11.51078807   19.4617762   18.2811191
AUC              3914.852139 4154.830048 1070.20657323 1250.9426009 1396.9447154
mu CI95 low         2.733574    3.045267   -1.10076311   -2.2050686   -4.8515830
lambda CI95 low   -38.403543  -10.300782   56.14216650   42.4248855   24.8184260
A CI95 low         47.197513   58.940763   11.17285004   19.1992801   16.9627344
AUC CI95 low     3875.243148 4093.577722 1056.62986435 1230.3571787 1352.9702303
mu CI95 high       14.170557   13.689212    6.15737265    9.3063345   21.5309783
lambda CI95 high   79.044830   50.248293   87.70587107  106.1197708  107.3697670
A CI95 high        52.484756   67.456369   15.37628753   23.6590936   30.0717055
AUC CI95 high    3941.361758 4183.239559 1077.02925382 1262.9208049 1432.5071603
You need to provide the metadata separately
One problem arises: Imagine you have numerous plates, each with numerous metadata. How can we make sure that the metadata are matched CORRECTLY to the specific raw kinetic data?
Solution: We need an identifier that perfectly matches metadata to raw kinetic data.
We use the Setup Time and the Position of the plate in the reader as the identifier.
Good news: opm can export this information as a starting point for the metadata file using the function collect_template().
collect_template()
# collect_template() returns a data frame to which further metadata columns can be added
metadata <- collect_template(files, include = "*Example_?.csv.xz")
Setup Time and Position form the unique identifier used to merge metadata and raw kinetic data.
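The idea of matching metadata to plates through a composite identifier can be sketched in base R with `merge()`. This is not how opm does it internally. The column names `Setup.Time`, `Position`, `File` and `Species` and all values below are hypothetical and serve only to show why both key columns are needed.

```r
# Hypothetical identifiers as they might appear in a template (values are made up)
plates <- data.frame(
  Setup.Time = c("2011-06-15 10:00", "2011-06-15 10:00", "2011-06-16 09:30"),
  Position   = c("1-A", "1-B", "1-A"),
  File       = c("Example_1.csv", "Example_2.csv", "Example_3.csv"),
  stringsAsFactors = FALSE
)

# Metadata typed in by the user, deliberately in a different row order
meta <- data.frame(
  Setup.Time = c("2011-06-16 09:30", "2011-06-15 10:00", "2011-06-15 10:00"),
  Position   = c("1-A", "1-A", "1-B"),
  Species    = c("sp. C", "sp. A", "sp. B"),
  stringsAsFactors = FALSE
)

# Match on both key columns; neither column alone is unique in this toy data
matched <- merge(plates, meta, by = c("Setup.Time", "Position"))
print(matched)
```

Because the match is made on the key columns rather than on row order, the metadata can be edited and re-sorted freely without being assigned to the wrong plate.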
collect_template() can also write the template to a CSV file (or *.txt, *.dat), so that further metadata columns can be added in a spreadsheet application:
collect_template(files, include = "*Example_?.csv.xz", outfile = "template.csv")
Note the FORMAT: columns are tab separated, fields protected by quotation marks.
After adding further metadata columns, save the file tab separated and use quotation marks as field protector.
Load the file into the R environment using to_metadata():
metadata.example <- to_metadata("template.csv")
metadata.example <- to_metadata("template.csv", strip.white = FALSE)
# for a comma-separated file:
metadata.example <- to_metadata("template.csv", sep = ",")
Note: the further metadata columns added here are mock metadata for demonstration purposes.
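The file format mentioned above (tab-separated columns, fields protected by quotation marks) can be reproduced with base R alone, which may help when checking what a spreadsheet application actually saved. The sketch below uses `write.table()` and `read.table()` rather than opm functions, and the data frame contents are made up.

```r
# Hypothetical template with one user-added column (values are made up)
template <- data.frame(
  Position = c("1-A", "1-B"),
  Species  = c("Bacillus subtilis", "Escherichia coli"),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".csv")

# Tab as column separator, quotation marks around character fields
write.table(template, path, sep = "\t", quote = TRUE, row.names = FALSE)

# Show the raw lines to inspect the format
writeLines(readLines(path))

# Read the file back with matching settings
restored <- read.table(path, sep = "\t", quote = "\"", header = TRUE,
                       stringsAsFactors = FALSE)
print(restored)
```

The quotation marks matter here because the species names contain spaces, and metadata fields may well contain other characters that a reader could otherwise misinterpret.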
include_metadata()
example.opm: OPMS object with kinetic raw data
metadata: data frame with metadata
example.opm.metadata <- include_metadata(example.opm, md = metadata)
# Draw kinetic data
xy_plot(example.opm)

# Set line colours and line width
xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2)

# Place the legend on the right
xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2, legend.fmt = list(space = "right"))

# Include selected metadata
xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2, legend.fmt = list(space = "right"), include = c("Species", "strain"))

Modify panel strip, strip text, and legend by using arguments from lattice.
What about drawing only parts?
It is possible to plot (1) specific plates, (2) time points, or (3) wells by indexing OPMS objects using square brackets.
xy_plot(example.opm[plates, time points, wells])

xy_plot(example.opm[ , , ])
xy_plot(example.opm[ 3, , ])
xy_plot(example.opm[ 3, 1:100 , ])
xy_plot(example.opm[ 3, 1:100 , c("A01", "A02", "E05", "G08", "H10")])
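The order of the three indices (plates, time points, wells) can be rehearsed without any plate data using a plain base R array. This is only an analogy: an OPMS object is not an array, and the random values, dimensions and plate names below are made up.

```r
# Analogy only: a plain 3-D array laid out as plates x time points x wells.
# An OPMS object is not an array; this just mirrors the order of the indices.
set.seed(1)
toy <- array(
  runif(3 * 200 * 5),
  dim = c(3, 200, 5),
  dimnames = list(
    paste("Plate", 1:3),
    NULL,
    c("A01", "A02", "E05", "G08", "H10")
  )
)

# Empty index positions mean "keep everything along this dimension"
all_data    <- toy[, , , drop = FALSE]
third_plate <- toy[3, , , drop = FALSE]
first_100   <- toy[3, 1:100, , drop = FALSE]
two_wells   <- toy[3, 1:100, c("A01", "E05"), drop = FALSE]

print(dim(all_data))
print(dim(third_plate))
print(dim(first_100))
print(dim(two_wells))
```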
Heatmaps compare plates on the basis of aggregated curve parameters
The generation of heatmaps includes two steps:
(1) Extract the curve parameter values using extract()
(2) Create the heatmap using heat_map()
First step:
AUC <- extract(example.opm, dataframe = TRUE, as.labels = list("country", "Species", "strain", "town"), subset = "AUC")
Here as.labels names the metadata of interest, and subset names the parameter whose values, obtained from aggregating the curves, are extracted.
Second step:
heat_map(AUC, as.labels = c("Species", "town"), as.groups = "town", cexRow = 1.2, use.fun = "gplots", main = "nice heatmap", col = topo.colors(120))
Confidence interval plot
Do curves differ significantly in aggregated curve parameters?
We make use of the 95% confidence intervals calculated from 100 bootstrap replicates.
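One simple way to read such intervals is to check whether they overlap; non-overlapping 95% intervals are a conservative hint that two values differ. The base R sketch below applies this to the confidence limits of parameter A for wells A01 and A02 from the aggregated example output shown earlier. The helper `ci_overlap` is hypothetical and not part of opm.

```r
# 95% confidence limits for parameter A, copied from the aggregated
# example output shown earlier (wells A01 and A02)
ci <- data.frame(
  well = c("A01", "A02"),
  low  = c(47.197513, 58.940763),
  high = c(52.484756, 67.456369)
)

# Hypothetical helper: TRUE if two intervals share at least one value
ci_overlap <- function(low1, high1, low2, high2) {
  low1 <= high2 && low2 <= high1
}

overlap <- ci_overlap(ci$low[1], ci$high[1], ci$low[2], ci$high[2])
print(overlap)
```

Overlapping intervals do not prove that two values are equal, so this check should be treated as a first screening step rather than a formal test.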
xy_plot(example.opm)
In which aggregated curve parameters do these curves differ significantly?
xy_plot(example.opm[ , ,"D10"], include = list("Species","town"), neg.ctrl = FALSE)
ci_plot(example.opm[ , , c("D10")], as.labels = list("Species","town"), subset = "A")
ci_plot(example.opm[ , , c("D10")], as.labels = list("Species","town"), subset = "AUC")
ci_plot(example.opm[ , , c("D10")], as.labels = list("Species","town"), subset = "lambda")
xy_plot(example.opm)
Do these curves differ in their lag phase? Try it yourself.
radial_plot(example.opm[, , 5:17], sep = " ", as.labels = c("Species", "town"), draw.legend = FALSE, subset = "AUC")
data(vaas_et_al)
• 114 GenIII plates (run for 96 hours)
• numerous replicates of two strains each of Escherichia coli and Pseudomonas aeruginosa
• including aggregated bootstrapped curve parameters and metadata