

Conclusions

An integrated online toolkit for analysis of ENM toxicity data (ToxNano) was developed for:

1) Rapid analysis of high-throughput screening data of ENM toxicity (e.g., data preprocessing and normalization, hit identification, similarity analysis, association rules)

2) Identifying the ENM properties that significantly correlate with toxicity

3) Evaluating the body of evidence based on literature data mining

4) Rapid development of toxicity quantitative structure-activity relationships (QSARs)

ToxNano facilitated the development of toxicity QSARs for a wide range of ENMs, including metals, metal oxides, and quantum dots (QDs), as well as various surface-modified ENMs.

ToxNano: An Online Toolkit for Toxicity Data Analysis of Nanomaterials

Muhammad Bilal (a), Rong Liu (a), Haven Liu (b), Dennis Bacsafra (a), Michelle Romero (a), Eunkeu Oh (c), Andre Nel (a), Igor Medintz (c), and Yoram Cohen (a, b)

(a) Center for Environmental Implications of Nanotechnology (CEIN)

(b) Chemical and Bio-molecular Engineering Department

(c) Center for Bio/Molecular Science and Engineering, US Naval Research Laboratory

Evaluation of the Body of Evidence: QDs Toxicity (310 publications)

Overview

Understanding the relationships between the physicochemical properties of engineered nanomaterials (ENMs) and their toxicity is critical for environmental and health risk analyses. However, this task is confounded by wide material diversity, heterogeneity of published data, and limited sampling within individual studies. Efforts to arrive at predictive ENM toxicology via data-driven models have typically been based on datasets from limited studies rather than the collective body of published evidence, while at the same time there has been increasing effort to mine toxicity data from published studies. In this regard, the challenge is to evaluate the body of evidence in order to: (i) identify the ENM parameters and experimental conditions that are relevant to ENM toxicity, and (ii) apply advanced machine learning/data mining techniques to correlate toxicity with the identified parameters. Accordingly, an integrated toolkit for toxicity data analysis of ENMs (ToxNano) was developed that includes a set of advanced models and computational tools for:

• Knowledge Discovery and QSAR development for high content bioactivity data

• Tiered approaches to correlate toxicity metrics with qualitative and quantitative information

• Identification of the parameters that can be used for predictive toxicology

• Evaluation of the body of evidence w.r.t. ENM bioactivity

• High Throughput Screening (HTS) data integration with CEIN Data Management System and advanced techniques for HTS data analysis

[Diagram: ToxNano: Toxicity Data Analysis of Nanomaterials. Components: Knowledge Extraction, Predictive Toxicology, Parameter Significance, Evaluation of Body of Evidence, Data Visualization]

Acknowledgements

This material is based upon work supported by the National Science Foundation and the Environmental Protection Agency under Cooperative Agreement Number DBI 1266377. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Environmental Protection Agency. This work has not been subjected to EPA review and no official endorsement should be inferred.

Random Forest (RF) Toxicity Model

[Figure: Attribute Significance Determined via Exhaustive Search. Axis: R² (E632). Outcomes shown: IC50 (mg/L) and Cell Viability (%)]
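One way to read "attribute significance determined via exhaustive search" is to score every attribute subset up to a given size and rank the subsets by model performance. The Python sketch below illustrates that reading only. It makes several assumptions that are not from the poster: a simple group-mean predictor stands in for the random forest, the score is a resubstitution R² rather than the E632 estimate, and all function names and data values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against observations."""
    m = mean(y_true)
    ss_tot = sum((y - m) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


def group_mean_predict(rows, y, attrs):
    """Stand-in model (NOT a random forest): predict the mean outcome of
    all rows that share the same values for the chosen attributes."""
    groups = {}
    for row, target in zip(rows, y):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(target)
    return [mean(groups[tuple(row[a] for a in attrs)]) for row in rows]


def exhaustive_search(rows, y, attributes, max_size):
    """Score every attribute subset of size 1..max_size, best first."""
    results = []
    for k in range(1, max_size + 1):
        for subset in combinations(attributes, k):
            score = r_squared(y, group_mean_predict(rows, y, subset))
            results.append((score, subset))
    return sorted(results, reverse=True)


# Hypothetical toy records: SC = surface charge, AT = assay type.
rows = [
    {"SC": "pos", "AT": "MTT"},
    {"SC": "pos", "AT": "LDH"},
    {"SC": "neg", "AT": "MTT"},
    {"SC": "neg", "AT": "LDH"},
]
viability = [20.0, 22.0, 80.0, 82.0]  # made-up cell viability (%)

for score, subset in exhaustive_search(rows, viability, ["SC", "AT"], 1):
    print(subset, round(score, 3))
```

Because resubstitution R² never decreases as attributes are added, a real analysis would need a validated score (such as a bootstrap estimate) to compare subsets of different sizes fairly.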

Case Study

• 310 Publications

• Cell viability data (%) of 1,741 Quantum Dots (QDs) from ~310 publications

• IC50 values (nM and mg/L) for 514 QDs

• 25 quantitative/qualitative attributes to describe QD properties and experimental information

Experimental conditions:

• Exposure time and concentrations

• Cell type and source

• Multiple toxicity assays

• ENM delivery approach

Bipartite graph

• Visual demonstration of exposure to high doses of ZnO & TiO2 NPs leading to significant compositional changes in soil bacterial communities

• Rapid identification of the interrelations between exposure to NPs and response of bacteria taxa, with family level being most suitable for NP impact assessment.

[Figure: Susceptibility of Soil Bacteria to ENM. Two-dimensional plots (axes Dim1 and Dim2) for ZnO and TiO2 at L, M, H doses, with Ctrl, 0d, 15d, and 60d time points. One plot: Dim1 (56.8%), Dim2 (25.6%); taxa labeled Sphingobacteriales, Burkholderiales, Rhizobiales, Bacillales, Solirubrobacterales, Actinomycetales. Another plot: Dim1 (58.7%); taxa labeled Proteobacteria, Gemmatimonadetes, Firmicutes, Bacteroidetes, Actinobacteria, Acidobacteria]

QSAR for Cell Association of Au NPs

[Figure: Decrease in R² (E632) (%) by descriptor, for Linear Regression and ε-SVR. Descriptor labels: APOB, A1AT, ZP, Syn, IGLL5, HRG, FA12, APOE, APOB, ANT3, PLMN, ITIH3, A1AT, IGHG4, KLKB1, TTHY, FA11, APOF, KNG1]

[Figure: Predicted Cell Association vs. Observed Cell Association, two panels, Linear QSAR (Non-linear QSAR). First panel: R² resub = 0.971, R² boot = 0.851, R² E632 = 0.895. Second panel: R² resub = 0.887, R² boot = 0.829, R² E632 = 0.850]

Au (15, 30, 60 nm)

85 Au NPs with anionic/cationic ligands

Endpoint: cell association of NPs

Descriptors: 129 protein corona fingerprints and 39 NP physicochemical properties

QSAR analysis identified key serum proteins and zeta potential as the attributes most relevant to NP cell association.
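The poster does not state how R² E632 is computed. The reported values are, however, consistent with the standard .632 bootstrap estimator, which weights the resubstitution estimate by 0.368 and the bootstrap estimate by 0.632. The short Python check below assumes that formula; the function name is illustrative.

```python
def r2_e632(r2_resub, r2_boot):
    """Standard .632 bootstrap combination (assumed, not stated on the poster)."""
    return 0.368 * r2_resub + 0.632 * r2_boot


# (R2 resub, R2 boot, R2 E632) as reported for the two QSAR panels.
reported = [(0.971, 0.851, 0.895), (0.887, 0.829, 0.850)]

for resub, boot, e632 in reported:
    estimate = r2_e632(resub, boot)
    print(f"resub={resub}, boot={boot}: computed {estimate:.3f}, reported {e632:.3f}")
```

Both computed values agree with the reported R² E632 to three decimal places, which supports (but does not prove) the assumption about the estimator.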

Descriptor Significance

$$f(x) = \sum_{i \in SV} \alpha_i \exp\left[-(x_{i,1} - x_1)^2 - (x_{i,2} - x_2)^2\right] + b$$

The probability of an NP being classified as toxic is given by $P(T|x) = 1/(1 + e^{-f(x)})$; $x = [x_1, x_2]$ represents the NP identified by its normalized ($\in [0,1]$) descriptor vector $[\Delta H_{hyd}, E_C]$; SVM: Support Vector Machine.
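The decision function and the logistic mapping to P(T|x) can be evaluated directly once the support vectors, coefficients, and bias are known. The Python sketch below follows the formula as written; the support vectors, the alpha values, and b are placeholders chosen for illustration and are not the fitted nano-SAR parameters.

```python
import math


def svm_decision(x, support_vectors, alphas, b):
    """f(x) = sum_i alpha_i * exp(-(x_i1 - x1)^2 - (x_i2 - x2)^2) + b."""
    total = 0.0
    for (s1, s2), alpha in zip(support_vectors, alphas):
        total += alpha * math.exp(-((s1 - x[0]) ** 2) - (s2 - x[1]) ** 2)
    return total + b


def prob_toxic(x, support_vectors, alphas, b):
    """P(T|x) = 1 / (1 + exp(-f(x)))."""
    return 1.0 / (1.0 + math.exp(-svm_decision(x, support_vectors, alphas, b)))


# Hypothetical parameters; descriptors are normalized to [0, 1].
support_vectors = [(0.2, 0.8), (0.7, 0.3)]
alphas = [1.5, -1.0]
b = 0.1

x = (0.25, 0.75)  # hypothetical normalized [dH_hyd, E_C] for one NP
print(prob_toxic(x, support_vectors, alphas, b))
```

A point at f(x) = 0 maps to P(T|x) = 0.5, so the sign of f(x) corresponds to the equal-penalty classification.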

Nano-SAR for metal oxides (24)

BEAS-2B and RAW264.7 cell lines; 7 assays. Descriptors: conduction band energy and metal ion hydration enthalpy.

• Penalty of classifying NP x as:

- toxic: $P(N|x) L_{FP}$

- nontoxic: $P(T|x) L_{FN}$

Decision Boundary (DB): $P(T|x) L_{FN} - P(N|x) L_{FP} = 0$

• NP is of concern if $P(T|x) L_{FN} - P(N|x) L_{FP} > 0$

[Figure: DBs for $L_{FN} : L_{FP}$ ratios of 1.0 : 2.7, 1.0 : 1.0, and 2.7 : 1.0, showing the penalty of acceptance of false negatives relative to false positive predictions. Region label: False Negative]
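The penalty-weighted rule can be written as a probability threshold. Assuming the two classes are exhaustive, so that P(N|x) = 1 − P(T|x), the condition P(T|x)·L_FN − P(N|x)·L_FP > 0 is equivalent to P(T|x) > L_FP / (L_FN + L_FP). The Python sketch below applies both forms to the three penalty ratios listed; the function names are illustrative.

```python
def is_of_concern(p_toxic, loss_fn, loss_fp):
    """True if P(T|x)*L_FN - P(N|x)*L_FP > 0, assuming P(N|x) = 1 - P(T|x)."""
    p_nontoxic = 1.0 - p_toxic
    return p_toxic * loss_fn - p_nontoxic * loss_fp > 0


def probability_threshold(loss_fn, loss_fp):
    """P(T|x) value at the decision boundary for a given penalty pair."""
    return loss_fp / (loss_fn + loss_fp)


# The three L_FN : L_FP ratios listed on the poster.
for loss_fn, loss_fp in [(1.0, 2.7), (1.0, 1.0), (2.7, 1.0)]:
    threshold = probability_threshold(loss_fn, loss_fp)
    print(f"L_FN:L_FP = {loss_fn}:{loss_fp} -> concern if P(T|x) > {threshold:.3f}")
```

Raising the false-negative penalty relative to the false-positive penalty lowers the threshold, so more NPs are flagged as being of concern.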

Attributes List

• Surface Ligand – SL

• Shell

• Diameter

• Assay Type – AT

• Exposure Time – ET

• Surface Modification – SM

• Cell Anatomical Type – CAT

• Core

• Surface Charge – SC

• Cell Source Species – CSS

• Delivery Type – DT

• Cell Origin – CO

Attribute Significance based on Sensitivity Analysis

[Figure: % Variance Reduction in IC50 (mg/L) by attribute. Labels: SC, SL, SM, DT, AT, CAT, CO, Core, Shell, Diameter, ET, QD_Source, IC50, CSS]

Bayesian Network (BN) Toxicity Model

Reduction in the variance of the target outcome when information about an attribute is provided
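The variance-reduction measure can be illustrated on raw data: compare the overall variance of the outcome with the average variance that remains within groups that share an attribute value. The Python sketch below uses this empirical grouping as a stand-in; a Bayesian network would compute the conditional distributions from the fitted network instead. The function name and the data values are hypothetical.

```python
from statistics import pvariance


def variance_reduction_pct(attr_values, outcomes):
    """Percent reduction in outcome variance once the attribute value is known."""
    total = pvariance(outcomes)
    groups = {}
    for value, outcome in zip(attr_values, outcomes):
        groups.setdefault(value, []).append(outcome)
    n = len(outcomes)
    # Expected within-group variance, weighted by group size.
    within = sum(len(g) / n * pvariance(g) for g in groups.values())
    return 100.0 * (total - within) / total


# Hypothetical IC50 values (mg/L) and two made-up attributes.
ic50 = [20.0, 22.0, 80.0, 82.0]
surface_charge = ["pos", "pos", "neg", "neg"]
assay_type = ["MTT", "LDH", "MTT", "LDH"]

print(round(variance_reduction_pct(surface_charge, ic50), 2))
print(round(variance_reduction_pct(assay_type, ic50), 2))
```

In this toy data the surface charge attribute accounts for almost all of the variance, while the assay type accounts for almost none.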

Evidence Distribution (IC50)