1/21

Introduction | Describing inorganic complexes | Similarity and model uncertainty

ML for inorganic molecular design: descriptors and similarity in transition metal chemical space

Jon Paul Janet¹, Heather Kulik¹

¹Department of Chemical Engineering, Massachusetts Institute of Technology

255th ACS National Meeting, New Orleans

03.19.18

2/21

Data-driven molecular design

Gomez-Bombarelli, R. et al. Nat. Mater., 15(10):1120-1127, 2016.

OLED chemical space: NN $\sim 10^6$, DFT $\sim 10^5$, Exp. $\sim 10^1$

Ma, X. et al. J. Phys. Chem. Lett., (18):3528-3533, 2015.

Machine learning is transforming how we design new materials...

[Figure: octahedral complex, metal M with six ligands L. Bignozzi, C. et al. Coord. Chem. Rev., 257(9), 2013.]

[Figure: Pt complex with N-donor ligands and two Cl ligands. Periana, R. A. et al. Science, 280(5363), 1998.]

...what about inorganic molecular complexes?

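3/21

Transition metal complexes

[Figure: d-orbital energy diagrams ($t_{2g}$ and $e_g$ levels) comparing low-spin and high-spin occupations, with cases $\Delta E_{\text{H-L}} < 0$, $\Delta E_{\text{H-L}} > 0$, and $\Delta E_{\text{H-L}} \sim 0$; in the last case a perturbation such as $\Delta T$ is indicated]

[Figure: $\mathrm{M}^{2+}$ and $\mathrm{M}^{3+}$ related by one electron $e$, with energy difference $\Delta E_{\text{III-II}}$]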
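The sign cases in the diagram can be read as a simple rule. The sketch below assumes the convention $\Delta E_{\text{H-L}} = E(\text{high spin}) - E(\text{low spin})$, which the slide does not state explicitly. The function name and the tolerance `tol` are hypothetical, chosen only for illustration.

```python
def favored_spin_state(delta_e_hl, tol=1.0):
    """Classify the ground state from the spin-splitting energy (kcal/mol).

    Assumes delta_e_hl = E(high spin) - E(low spin). `tol` is an
    illustrative threshold for "near zero", not a value from the talk.
    """
    if abs(delta_e_hl) <= tol:
        return "near-degenerate"
    # A positive splitting means the high-spin state lies higher in energy.
    return "low spin" if delta_e_hl > 0 else "high spin"


for value in (-12.0, 25.0, 0.3):
    print(value, favored_spin_state(value))
```

Under this assumed convention, the near-degenerate case is the one where a small perturbation could change which spin state is favored.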

4/21

How to estimate properties?

[Figure: routes from features to property, with typical time scales]

• experiment: weeks, months
• density functional theory (DFT), $H\Psi = E\Psi$: days
• model: seconds

5/21

Input space design

What would be the ideal feature space?

[Figure: complexes $c_i$, $c_j$ in chemical space $C_f$ mapped to points $x_i$, $x_j$ in descriptor space $X \subset \mathbb{R}^d$, separated by distance $d(x_i, x_j)$]

Good descriptors:
• cheap
• small as possible
• preserve similarity
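To make $d(x_i, x_j)$ concrete, here is a minimal sketch that assumes a plain Euclidean metric; the slide does not say which metric is intended. The function name and the three-component descriptor vectors are hypothetical.

```python
import math


def descriptor_distance(x_i, x_j):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(x_i) != len(x_j):
        raise ValueError("descriptor vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


# Hypothetical descriptors for three complexes (values are made up).
x_a = [26.0, 2.0, 0.89]
x_b = [26.0, 3.0, 0.89]
x_c = [27.0, 2.0, 0.40]

print(descriptor_distance(x_a, x_b))
print(descriptor_distance(x_a, x_c))
```

In practice features on very different scales would usually be standardized first, so that no single component dominates the distance.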

6/21

Data for spin splitting

Data for octahedral complexes¹:

[Figure: octahedral complex with metal M, two axial ligands $L_{ax}$, and four equatorial ligands $L_{eq}$]

• 1345 (194) complexes
• 7 HF values
• B3LYP-like DFT
• HF exchange in 0-30%
• gas phase optimization
• LANL2DZ/6-31G*
• high- and low-spin
• M(II)/(III)

Coulomb matrix eigenspectrum (CM-ES) descriptor & kernel ridge regression (KRR)

$\Delta E_{\text{H-L}}$ RMSE, CM-ES: 19.2 kcal/mol

Why?

¹ Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.
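For readers unfamiliar with the CM-ES descriptor, the sketch below builds a Coulomb matrix and takes its eigenvalues. It assumes the widely used definition (diagonal $0.5\,Z_i^{2.4}$, off-diagonal $Z_i Z_j / |R_i - R_j|$); the exact variant and units used in the talk are not given, and the function names and the three-atom geometry are hypothetical.

```python
import numpy as np


def coulomb_matrix(Z, R):
    """Coulomb matrix from nuclear charges Z and Cartesian coordinates R."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    n = len(Z)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                M[i, j] = 0.5 * Z[i] ** 2.4
            else:
                M[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return M


def cm_eigenspectrum(Z, R, size=None):
    """Eigenvalues sorted by decreasing magnitude, zero-padded to `size`."""
    eig = np.linalg.eigvalsh(coulomb_matrix(Z, R))
    eig = eig[np.argsort(-np.abs(eig))]
    if size is not None:
        if size < len(eig):
            raise ValueError("size must be at least the number of atoms")
        padded = np.zeros(size)
        padded[: len(eig)] = eig
        eig = padded
    return eig


# Hypothetical three-atom geometry (arbitrary length units).
Z = [8, 1, 1]
R = [[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.45, 1.75, 0.0]]
print(cm_eigenspectrum(Z, R, size=5))
```

The eigenspectrum does not depend on atom ordering, and zero-padding gives molecules of different sizes vectors of equal length.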

7/21

A tale of two complexes

[Figure: two PC 1 vs. PC 2 scatter plots, labeled $\Delta E_{\text{H-L}}$ and size]

$\mathrm{Fe[pisc]}_6^{3+}$: $\Delta E_{\text{H-L}} = 40.7$ kcal/mol

$\mathrm{Fe[misc]}_6^{3+}$: $\Delta E_{\text{H-L}} = 37.7$ kcal/mol
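The PC 1 and PC 2 axes are presumably principal components of a descriptor matrix. Below is a minimal sketch of such a projection, assuming plain PCA by singular value decomposition of mean-centered features; the talk does not specify the preprocessing. `pca_project` and the random matrix are hypothetical stand-ins for real descriptors.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project rows of X onto the leading principal components.

    Returns the scores and the fraction of variance each component explains.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:n_components]


# Stand-in data: 50 "complexes" with 6 descriptor components each.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
scores, ratio = pca_project(X)
print(scores[:3])
print(ratio)
```

Two complexes with similar properties that land far apart in such a projection would suggest that the descriptor does not preserve the similarity that matters for the target.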

8/21

MCDL-25

mixed continuous discrete local (MCDL)

• metal properties: identity, oxidation state (example shown: Fe(II))
• local ligand properties: max $\Delta\chi$ (example shown: $\chi = 3.44$, $\chi = 2.55$)
• global ligand properties: Kier index

[Figure: bar chart of test RMSE (kcal/mol, axis 0 to 20) by method, CM-ES vs. MCDL]

Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.
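As an illustration of a local ligand property, the sketch below computes a max $\Delta\chi$ value. It assumes this means the largest Pauling electronegativity difference between the metal-coordinating atom and the atoms bonded to it; the slide does not give the precise definition. The function name and the small lookup table are hypothetical.

```python
# Pauling electronegativities for a few common ligand atoms.
PAULING_CHI = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}


def max_delta_chi(connecting_atom, neighbors):
    """Largest |chi(neighbor) - chi(connecting atom)| over bonded neighbors.

    Assumed definition, for illustration only.
    """
    chi_c = PAULING_CHI[connecting_atom]
    return max(abs(PAULING_CHI[n] - chi_c) for n in neighbors)


# A carbon-bound ligand whose connecting C is bonded to one O
# (chi = 2.55 and chi = 3.44, the two values shown on the slide).
print(round(max_delta_chi("C", ["O"]), 2))  # 0.89

# A nitrogen-bound ligand whose connecting N is bonded to three H.
print(round(max_delta_chi("N", ["H", "H", "H"]), 2))  # 0.84
```

A feature like this is cheap to evaluate because it needs only the ligand's connectivity, not a geometry or an electronic structure calculation.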

9/21

Extensible, continuous descriptors - RACs

Based on autocorrelations²

[Figure: example ligand fragment containing O and C atoms bound to a metal M]

$d_1$: $\sum_{O,C} Z_O Z_C = 48$

$d_1$: $48 + \sum_{C,O} Z_O Z_C = 144 + 48$

$d_1$: $\sum_i \sum_j Z_i Z_j \, \delta(d_{ij}, 1)$

$d_x$: $\sum_i \sum_j Z_i Z_j \, \delta(d_{ij}, x)$

[Figure: train and test MUE (kcal/mol, axis 8 to 18) vs. maximum AC depth (0 to 6)]

How to adapt to TM complexes? Restrict the scope to focus on near-metal atoms:

$d_1$: $\sum_{M,O} Z_M Z_O$

$d_2$: $\sum_{M,C} Z_M Z_C$

$d_3$: $\sum_{M,O} Z_M Z_O$

Difference form: $(Z_i - Z_j)$

properties: $T, \chi, Z, I, S$

$\sim 160$ features in total

² Broto, P., Moreau, G. and Vandycke, C. Eur. J. Med. Chem., 19(1):71-78, 1984.
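The autocorrelation sums above can be sketched on a molecular graph, with $d_{ij}$ taken as the number of bonds on the shortest path between atoms $i$ and $j$. Two assumptions are made here: each unordered pair is counted once, so that a single C-O bond contributes $8 \times 6 = 48$ as in the worked number on the slide, and the seven-atom fragment is a hypothetical example, not the one in the figure. All function names are illustrative.

```python
from collections import deque


def bond_distances(n_atoms, bonds, start):
    """Bond-path distance from `start` to each reachable atom (BFS)."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def full_autocorrelation(Z, bonds, depth):
    """Sum of Z_i * Z_j over unordered atom pairs separated by `depth` bonds."""
    total = 0
    for i in range(len(Z)):
        dist = bond_distances(len(Z), bonds, i)
        for j in range(i, len(Z)):
            if dist.get(j) == depth:
                total += Z[i] * Z[j]
    return total


def metal_centered_autocorrelation(Z, bonds, metal, depth):
    """Sum of Z_M * Z_j over atoms j that are `depth` bonds from the metal."""
    dist = bond_distances(len(Z), bonds, metal)
    return sum(Z[metal] * Z[j] for j, d in dist.items() if d == depth)


# Hypothetical fragment: Fe bound to two O, each O to a C, the two C bonded
# to each other, and each C carrying one terminal O.
Z = [26, 8, 8, 6, 6, 8, 8]
bonds = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 4), (3, 5), (4, 6)]

print(full_autocorrelation(Z, bonds, 1))  # 644
print([metal_centered_autocorrelation(Z, bonds, 0, d) for d in (1, 2, 3)])  # [416, 312, 416]
```

In this hypothetical fragment the four C-O bonds contribute $4 \times 48 = 192$ to the depth-1 sum. The metal-centered version keeps only terms that start at the metal, which is one way to restrict the scope to near-metal atoms.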

10/21

Feature selection

[Figure: RMSE (kcal/mol, axis 1.5 to 4.0) vs. feature-space dimension (50 to 150) for the feature sets MCDL, RAC-155, UV-86, RFE-43, LS-28, and rF-41]

Janet, J.P., and Kulik, H.J. J. Phys. Chem. A, 2017, 121, 46, 8939-8954.
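The trade-off between dimension and RMSE can be illustrated with a small experiment. The slide does not spell out the selection methods, so this sketch uses a simple univariate correlation filter followed by Gaussian-kernel ridge regression purely as an example. The data are synthetic, and the function names, `gamma`, and `lam` are hypothetical choices.

```python
import numpy as np


def univariate_select(X, y, k):
    """Indices of the k features with the largest |Pearson r| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(r))[:k])


def gaussian_kernel(A, B, gamma):
    """Matrix of exp(-gamma * ||a - b||^2) for rows a of A and b of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit_predict(X_train, y_train, X_test, gamma=0.1, lam=1e-3):
    """Kernel ridge regression: solve (K + lam*I) alpha = y, then predict."""
    K = gaussian_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return gaussian_kernel(X_test, X_train, gamma) @ alpha


# Synthetic data: only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)
train, test = slice(0, 150), slice(150, 200)

selected = univariate_select(X[train], y[train], k=2)
pred = krr_fit_predict(X[train][:, selected], y[train], X[test][:, selected])
rmse = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
print(selected, rmse)
```

Selection is done on the training split only, so the test RMSE is not biased by features chosen with knowledge of the test targets.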


11/21

A tale of two complexes, II

[Figure: four PCA projections (PC 1 vs. PC 2), labeled $\Delta E_{H-L}$ and size]


12/21

Do features depend on properties?

[Diagram: molecular graph of a complex, with atoms labeled metal, N, C and H]

[Figure: panels of randF-selected features for spin splitting, bond lengths and redox, annotated more ‘electronic’ and more ‘topological’]


13/21

Mapping TM complex space

[Figure: PCA projection (PC 1 vs. PC 2) of the random forest selected features for redox, colored by $E_0$ (eV) on a scale from 3 to 11, with a query point marked "?"]


14/21

Mapping TM complex space

[Diagram: the complexes below related by "+", "2" and "= ?"]

Cr(II) [H2O]5 [misc]: ∆G = 5.3 eV

Co(II) [CO]5 [pyr]: ∆G = 8.1 eV

Fe(II) [CO]4 [pyr][water]: ∆G = 7.8 eV


15/21

Model transferability

Test-set performance is not necessarily a good metric for general transferability²:

[Figure: Fe(III) complexes, $\Delta E_{H-L}$ (kcal/mol) from ANN and B3LYP for the ligand combinations pisc−pisc, pisc−NCS, pisc−H2O, pisc−Cl, H2O−H2O, Cl−Cl and NCS−NCS]

[Figure: Fe(III)[pisc]6, $\Delta E_{H-L}$ (kcal/mol) vs. HFX (0.0 to 0.3), ANN and DFT]

[Figure: abs. error (kcal/mol) for train and test, annotated 3.13 and 2.97]

[Figure: abs. error (kcal/mol) for train, test and CSD]

² Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.


16/21

Model transferability

Uncertainty estimates are essential for our surrogate model to explore chemical space:

[Figure: two parity plots, DFT splitting (kcal/mol) vs. surrogate splitting (kcal/mol)]

Uncertainty from mc-dropout¹: the ANN model approximates variational inference with a GP under some conditions:

$\mathrm{var}(y^* \mid x^*) \approx \frac{1}{J} \sum_j y_j^T y_j + \tau^{-1}$

[Figure: abs. error (kcal/mol) vs. distance (0.5 to 2.0)]

¹ Gal, Y. and Ghahramani, Z., 2016. ICML, 1050-1059.
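A minimal sketch of the mc-dropout estimate follows, under stated assumptions: the one-hidden-layer network, its weights, the keep probability and the value of $\tau$ are all made up for illustration and are not the ANN from the talk. The sketch also subtracts the squared predictive mean from the averaged $y_j^T y_j$ term, as in Gal and Ghahramani's estimator; the slide formula matches this if the $y_j$ are read as mean-centered.

```python
import math
import random


def dropout_forward(x, w_hidden, w_out, keep_prob, rng):
    """One stochastic pass through a one-hidden-layer tanh network."""
    output = 0.0
    for weights, w_o in zip(w_hidden, w_out):
        activation = math.tanh(sum(w * xi for w, xi in zip(weights, x)))
        # Dropout stays active at prediction time: each hidden unit is kept
        # with probability keep_prob and rescaled (inverted dropout).
        if rng.random() < keep_prob:
            output += w_o * activation / keep_prob
    return output


def mc_dropout_predict(x, w_hidden, w_out, keep_prob=0.9, passes=200,
                       tau=10.0, seed=0):
    """Predictive mean and variance from `passes` stochastic forward passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, w_hidden, w_out, keep_prob, rng)
               for _ in range(passes)]
    mean = sum(samples) / passes
    second_moment = sum(y * y for y in samples) / passes
    # Spread of the stochastic passes plus the noise term 1/tau.
    variance = second_moment - mean * mean + 1.0 / tau
    return mean, variance


# Hypothetical weights and input, chosen only to make the example runnable.
w_hidden = [[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]]
w_out = [1.0, -2.0, 0.5]
x = [1.0, 2.0]

mean, variance = mc_dropout_predict(x, w_hidden, w_out)
print("mean:", mean, "variance:", variance)
```

With `keep_prob=1.0` every pass is identical, so the estimate collapses to the constant $\tau^{-1}$; any additional variance comes from disagreement between the dropout passes.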


17/21

Demonstration

Can we use the ANN model to find new spin-crossover materials, i.e. $\Delta E_{H-L} = 0$?

Define a space of 32 ligands and 5 metals, with ∼ 5600 possible elements under forced axial/equatorial symmetry³:

³ Janet, J.P., Chan, L. and Kulik, H.J. J. Phys. Chem. Lett., 2018, 9, 5, 1064-1071.
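The screening idea can be sketched as enumerate, predict, filter. Everything concrete here is a placeholder: the two metals, three ligands, the additive `toy_predictor` and the 5 kcal/mol tolerance are hypothetical, and the plain product below does not reproduce the ∼ 5600 count, which depends on constraints not stated on the slide.

```python
from itertools import product


def enumerate_design_space(metals, ligands):
    """All (metal, equatorial ligand, axial ligand) combinations."""
    return list(product(metals, ligands, ligands))


def select_leads(candidates, predict_splitting, tolerance):
    """Keep candidates whose predicted splitting is within +/- tolerance of 0."""
    return [c for c in candidates if abs(predict_splitting(c)) <= tolerance]


# Hypothetical placeholders, not the metals, ligands or model from the talk.
metals = ["Fe", "Co"]
ligands = ["A", "B", "C"]
ligand_strength = {"A": -10.0, "B": 0.0, "C": 10.0}


def toy_predictor(candidate):
    """Stand-in for the ANN: a made-up additive score in kcal/mol."""
    metal, equatorial, axial = candidate
    offset = 2.0 if metal == "Fe" else -2.0
    # Four equatorial and two axial sites in an octahedral complex.
    field = (4 * ligand_strength[equatorial] + 2 * ligand_strength[axial]) / 6
    return offset + field


candidates = enumerate_design_space(metals, ligands)
leads = select_leads(candidates, toy_predictor, tolerance=5.0)
print(len(candidates), "candidates,", len(leads), "leads")
```

Using one ligand type for all equatorial sites and one for both axial sites is how the sketch reads "forced axial/equatorial symmetry"; that reading is an assumption.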


18/21

Demonstration

The ANN is trained on 14 of these ligands, covering only 2% of the design space.

We can visualize the design space using t-SNE⁴:

[Figure: t-SNE map of the design space, axes from −40 to 40, color scale from 0.0 to 2.0]

⁴ Maaten, L., & Hinton, G., 2008. J. Mach. Learn. Res. 2579-2605.


19/21

How accurate are we?

Test 51 leads from ANN with DFT⁵:

[Figure: histogram of errors (kcal/mol, −20 to 10) vs. count, with bin counts 1, 2, 2, 1, 2, 3, 3, 3, 2, 1, 3, 7, 7, 4, 5, 3, 1, 1]

[Figure: $\Delta E_{H-L}^{ANN} - \Delta E_{H-L}^{GO}$ (kcal/mol, 0 to 15) vs. distance to train (0.00 to 0.75); legend entries 2, 3, Cr, Mn, Fe, Co; annotation "sub. isocyanides"]

⁵ Janet, J.P., Chan, L. and Kulik, H.J. J. Phys. Chem. Lett., 2018, 9, 5, 1064-1071.
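The "distance to train" axis can be illustrated with a nearest-neighbor distance in feature space. This is a sketch under assumptions: the Euclidean metric, the cutoff value and the two-dimensional toy features are chosen for the example, and the slide does not state which metric or cutoff was used.

```python
import math


def distance_to_train(query, training_features):
    """Euclidean distance from `query` to its nearest training point."""
    return min(math.dist(query, point) for point in training_features)


def flag_by_distance(queries, training_features, cutoff):
    """Pair each query's distance with whether it falls within the cutoff."""
    results = []
    for query in queries:
        d = distance_to_train(query, training_features)
        results.append((d, d <= cutoff))
    return results


# Hypothetical 2-D feature vectors; real RAC vectors have many more dimensions.
training_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
queries = [(0.1, 0.1), (3.0, 4.0)]

for distance, trusted in flag_by_distance(queries, training_features, cutoff=1.0):
    print(round(distance, 3), "trusted" if trusted else "far from training data")
```

A query far from every training point is one where the surrogate is extrapolating, so its prediction is a candidate for DFT verification before being accepted as a lead.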


20/21

Conclusions

choice of molecular representation is important

different properties depend unequally on the features

feature-space geometry can provide insight into model reliability

imbuing descriptor construction with ‘chemical intuition’ can drastically improve learning

conversely, feature selection can contribute to understanding systems

21/21


Acknowledgments

Thanks to the Kulik group and funding partners:
