Date post: 26-Jun-2020
Recommender system for materials discovery

Isao Tanaka 1,2,3,4

1 Department of Materials Science and Engineering, Kyoto University, JAPAN
2 Elements Strategy Initiative for Structural Materials, Kyoto University, JAPAN
3 Center for Materials Research by Information Integration, NIMS, JAPAN
4 Nanostructure Research Laboratory, Japan Fine Ceramics Center, JAPAN

Big Data Summer, Platja d’Aro, Spain, September 9 - 13, 2019
Transcript

Page 2: Inorganic Crystal Structure Database (ICSD)

World's largest database for known inorganic crystals.

187,000 crystal structures; 82,000 structures excluding duplicates, incompletes, etc.

Number of chemical elements    Number of chemical combinations (only for simple composition ratios)
1                              ~100
2                              ~100,000
3                              ~10,000,000
4                              ~1,000,000,000 (1 billion)

Many systems are as yet unexplored!

Page 3: Vast chemistry space to explore

Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B

ICSD (experimental database for crystal structures): ~82k

Page 4: Vast chemistry space to explore

Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B

ICSD (experimental database for crystal structures): ~82k

[Figure labels: thermodynamically unstable compounds; thermodynamically (meta)stable compounds.]

Page 5: Discovery of a novel Sn(II)-based oxide for daylight-driven photocatalyst

DFT calcs + Experiments

Hiroyuki Hayashi, Shota Katayama, Takahiro Komura, Yoyo Hinuma, Tomoyasu Yokoyama, Kou Mibu, Fumiyasu Oba and IT

Advanced Science 9, (2016) 1600246

Page 6: Target compounds of interest: Sn(II)-M oxides

4A – 6A transition metal oxides: widely used for photocatalysts (e.g. TiO2, WO3, NaTaO3, TaON, …)
Wide band gaps

Sn(II) oxides: narrow band gaps
Reported high visible-light photocatalytic activity

Sn(II)-M-O: SnO-MOq/2 pseudobinary

q    M             Known compounds
4    Ti, Zr, Hf    SnTiO3, Sn2TiO4
5    V, Nb, Ta     SnNb2O6, Sn2Nb2O7, SnTa2O6, Sn2Ta2O7, SnTa4O11
6    Cr, Mo, W     SnWO4, Sn2WO5, Sn3WO6

Only 10 compounds are known.
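The SnO-MOq/2 pseudobinary notation fixes the oxygen content once the numbers of Sn(II) and M(q+) cations are chosen. The following is a minimal sketch of that bookkeeping, not code from the talk; the function name `sn_m_o_formula` and its arguments are hypothetical.

```python
from fractions import Fraction


def sn_m_o_formula(x, y, q, metal="M"):
    """Formula of x SnO . y MO_(q/2), i.e. x Sn(II) and y M(q+) cations.

    Oxygen content follows from charge neutrality: 2*x + q*y = 2*n_O.
    """
    n_oxygen = Fraction(x) + Fraction(y * q, 2)
    if n_oxygen.denominator != 1:
        raise ValueError("non-integer oxygen count; use a different x, y")

    def part(symbol, n):
        # Omit the subscript when it equals 1, as in "SnWO4".
        return symbol + (str(n) if n != 1 else "")

    return part("Sn", x) + part(metal, y) + part("O", n_oxygen.numerator)


# Reproduce a few of the known compounds listed on the slide.
print(sn_m_o_formula(1, 1, 6, "W"))    # SnWO4
print(sn_m_o_formula(1, 2, 5, "Nb"))   # SnNb2O6
print(sn_m_o_formula(1, 4, 5, "Ta"))   # SnTa4O11
print(sn_m_o_formula(2, 1, 4, "Ti"))   # Sn2TiO4
```

Odd q with odd y (for example one NbO5/2 unit) gives a half-integer oxygen count, which is why the q = 5 compounds on the slide all contain an even number of M cations.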

Page 7: Inorganic Crystal Structure Database (ICSD)

World's largest database for known inorganic crystals.

177,000 crystal structures; 82,000 structures excluding duplicates, incompletes, etc.; 9,100 structure prototypes (e.g. rock-salt, perovskite, ...)

Number of chemical elements    Number of structure prototypes in ICSD
1                              120
2                              1,700
3                              4,700
4                              4,300

Page 8: Hypothetical compounds with prototype structures

ICSD prototype: NdYbS3 type
Hypothetical compounds: NdYbS3-type SnTiO3 and NdYbS3-type TiSnO3

# hypothetical compounds (rows and columns: formal ionic charge)

      1     2     3     4     5     6
1   154   122   359   209   438   251
2         454   258   663   220   409
3               500   184   297   109
4                     444    52   149
5                            72    45
6                                  78

Page 9: SnO-WO3 pseudo-binary system

Formation energy by DFT calcs

[Figure: formation energy between SnO and WO3 with the convex hull; compounds included in ICSD are marked.]
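A convex hull for a pseudo-binary system can be built from (composition, formation energy) pairs. The sketch below uses a standard monotone-chain lower hull; it is an illustration, not the workflow used in the talk, and the energies are made-up placeholder values rather than DFT results. The functions `lower_hull` and `energy_above_hull` are hypothetical names.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points, x = fraction of the second end member."""
    # Keep only the lowest energy at each composition.
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(x, energy, hull):
    """Distance of a point above the hull by linear interpolation between hull vertices."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return energy - (y1 + (y2 - y1) * (x - x1) / (x2 - x1))
    raise ValueError("composition lies outside the hull range")


# Placeholder data (NOT from the talk): end members at zero formation energy.
candidates = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.30), (0.75, -0.05), (1.0, 0.0)]
hull = lower_hull(candidates)
print(hull)
for x, e in candidates:
    print(x, round(energy_above_hull(x, e, hull), 3))
```

Points with zero energy above the hull are the thermodynamically stable ones in this picture; the remaining candidates lie above a mixture of their hull neighbours.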

Page 10: SnO-MoO3 pseudo-binary system

Formation energy by DFT calcs

[Figure: formation energy between SnO and MoO3 with the convex hull; label: as-yet-unknown.]

Page 11: Convex hull of SnO-MOq/2 pseudo-binary systems

Formation energy by DFT calcs

Reported oxides in ICSD (red characters) are located on the convex hull.

Band gap screening

Page 12: Band gap

Band gap of actual photocatalysts ≥ 2 eV (GGA)

[Figure legend, band gap: 0 ~ 1 eV, 1 ~ 2 eV, 2 ~ 3 eV, over 3 eV.]

• SnO‐Ta2O5
• SnO‐WO3
• SnO‐MoO3

Page 13: Synthesis of SnMoO4

Experimental results

Mixture of SnCl2 and K2MoO4 powders
1 hour annealing in Ar gas
Washed and dried

Page 14: Crystal structure of SnMoO4

Newly discovered compound

498 K-synthesized sample
Space group type: P213 (cubic)
Lattice constant: a = 7.26 Å

Trigonal prism, which is characteristic of Sn(II)

[Figure: crystal structure showing Sn, Mo and O atoms with the a, b, c axes.]

Page 15: Photocatalytic activity of SnMoO4

Degradation of methylene blue under simulated daylight

Newly discovered SnMoO4 powder exhibits clear photocatalytic activity.

Page 16: Vast chemistry space to explore

Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B

ICSD (experimental database for crystal structures): ~82k

[Figure labels: thermodynamically unstable compounds; thermodynamically (meta)stable compounds.]

Page 17: Recommender system for discovery of CRC (Chemically Relevant Composition) using ICSD database

A. Seko, H. Hayashi, H. Kashima and IT

A. Seko, H. Hayashi, H. Kashima, and I. Tanaka, Phys. Rev. Mater. 2, 013805 (2018).
A. Seko, H. Hayashi, and I. Tanaka, J. Chem. Phys. 148, 241719 (2018).

Page 18: “Recommender system” in E-commerce

Examples: Amazon.com, Netflix.com

A system that can suggest items to customers (= recommendation), which is sometimes useful.

Page 19: CRC (Chemically Relevant Composition)

A2X-BX pseudo-binary (A1+, B2+, X2-)

[Figure: formation energy versus composition between A2X and BX, with the convex hull.]

CRC:
7A2X·1BX (A14B1X8)
3A2X·1BX (A6B1X4)
1A2X·1BX (A2B1X2)

non-CRC:
5A2X·3BX (A10B3X8)
3A2X·5BX (A6B5X8)

Page 20: Rating matrix used for recommender system

[Figure: rating matrix of customers (A to J) and items (1 to 7), shown again after reordering to customers A, C, H, B, F, J, D, G, E, I and items 1, 4, 3, 5, 7, 2, 6.]

Underlying assumption: a low-rank structure of rating matrix.

⇒ Application to discover new Chemically Relevant Composition (CRC)
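The low-rank assumption can be illustrated with a toy rating matrix: if customers fall into a few groups with similar tastes, a low-rank approximation assigns a noticeable score to an entry that was never observed. The sketch below uses NumPy's SVD on made-up data; the matrix and the choice of rank 2 are assumptions for illustration only.

```python
import numpy as np

# Toy rating matrix (rows: customers, columns: items); 1 = purchased, 0 = no record.
# Customers 0-1 share one taste, customers 2-3 another. The entry (1, 1) is
# deliberately left at 0 although customer 1 resembles customer 0.
R = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

rank = 2  # assumed rank of the underlying structure

# Truncated SVD: keep only the `rank` largest singular values.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
predicted = (U[:, :rank] * s[:rank]) @ Vt[:rank]

print(np.round(predicted, 2))
print("score of the missing entry (1, 1):", round(float(predicted[1, 1]), 3))
print("score of an unrelated entry (1, 2):", round(float(predicted[1, 2]), 3))
```

The unobserved entry inside the customer's own taste group receives a clearly positive score, while entries in the other group stay at zero, which is the behaviour a recommender relies on.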

Page 21: Candidate chemical compositions

Ternary: AaBbXx, max(a, b, x) = 8, N = 7.4 x 10^6
Quaternary: AaBbCcXx, max(a, b, c, x) = 20, N = 1.2 x 10^9
Quinary: AaBbCcDdXx, max(a, b, c, d, x) = 20, N = 2.3 x 10^10

Page 22: Number of entry compounds in three databases

[Figure: number of entry compounds in ICSD, ICDD and SpringerMaterials for ternary, quaternary and quinary systems; legend: Training, Test.]

Page 23: Matrix factorization

Non-negative Matrix Factorization (r: given rank)
Singular Value Decomposition (r: given rank)

SCIKIT-LEARN
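Both factorizations named on this slide are available in scikit-learn. The sketch below applies `NMF` and `TruncatedSVD` with a given rank r to a small made-up rating matrix and ranks the unlisted entries by predicted rating. The data, the rank, and solver settings such as `init="nndsvda"` are assumptions for illustration, not the settings used in the talk.

```python
import numpy as np
from sklearn.decomposition import NMF, TruncatedSVD

# Hypothetical rating matrix: 1 = entry listed in the training database, 0 = not listed.
X = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)
r = 2  # given rank

# Non-negative matrix factorization: X ~ W @ H with W >= 0 and H >= 0.
nmf = NMF(n_components=r, init="nndsvda", max_iter=2000, random_state=0)
W = nmf.fit_transform(X)
H = nmf.components_
X_nmf = W @ H

# Truncated singular value decomposition: best rank-r approximation in the
# least-squares sense.
svd = TruncatedSVD(n_components=r, algorithm="arpack", random_state=0)
X_svd = svd.inverse_transform(svd.fit_transform(X))

# Rank the entries that are 0 in X by their predicted rating.
unknown = [tuple(int(v) for v in ij) for ij in np.argwhere(X == 0)]
best = max(unknown, key=lambda ij: X_svd[ij])
print("highest-rated unlisted entry (SVD):", best)
print("NMF error:", np.linalg.norm(X - X_nmf))
print("SVD error:", np.linalg.norm(X - X_svd))
```

In practice the candidates with the highest predicted rating among the unlisted entries are the ones proposed for further checking.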

Page 24: Matrix representation of ternary composition

Type 1, Type 2, Type 3

Example of Rating Matrix (Type 1)

Page 25: Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition

Ternary # Elements: 7,405,200

[Figure: number of correct answers included in ICDD & SpMat.]

TOP 100 compositions with high predicted rating: discovery rate > 45% !!
TOP 3,000 compositions with high predicted rating: discovery rate > 21% !!

Dependence on rank is weak. SVD performs slightly better than NMF. Type 2 representation works best.
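The discovery rate quoted here can be read as the fraction of the top-k predicted compositions that turn up in the validation databases. A minimal sketch under that reading follows; the function name `discovery_rate`, the composition labels, and the ratings are all hypothetical.

```python
def discovery_rate(predicted_rating, validation_set, k):
    """Fraction of the k highest-rated candidates found in the validation set.

    predicted_rating: dict mapping candidate composition -> predicted rating
                      (candidates are compositions absent from the training data).
    validation_set:   compositions listed in the independent databases.
    """
    if not 0 < k <= len(predicted_rating):
        raise ValueError("k must be between 1 and the number of candidates")
    top_k = sorted(predicted_rating, key=predicted_rating.get, reverse=True)[:k]
    hits = sum(1 for composition in top_k if composition in validation_set)
    return hits / k


# Made-up example values, for illustration only.
ratings = {"A2BX2": 1.01, "A6BX4": 0.64, "A10B3X8": 0.10, "A6B5X8": 0.05}
found_elsewhere = {"A2BX2", "A10B3X8"}

print(discovery_rate(ratings, found_elsewhere, k=1))  # 1.0
print(discovery_rate(ratings, found_elsewhere, k=2))  # 0.5
```

A candidate missing from the validation databases is not necessarily wrong, so a rate computed this way is better read as a lower bound on how many proposals are real compounds.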

Page 26: Tensor representation of binary composition

Binary: # Elements: 66 x 10 x 170 = 112,200
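As a data structure, such a rating tensor is simply a three-way array filled with 1 where an entry is known. The sketch below only reproduces the array size given on the slide; what each axis stands for is not stated in this transcript, so the axis names and the example entries are placeholders.

```python
import numpy as np

# Axis sizes from the slide; the meaning of each axis is an assumption here.
N_AXIS_1, N_AXIS_2, N_AXIS_3 = 66, 10, 170

rating = np.zeros((N_AXIS_1, N_AXIS_2, N_AXIS_3), dtype=np.int8)

# Hypothetical known entries given as index triples (not real database entries).
known_entries = [(0, 0, 0), (3, 1, 42), (65, 9, 169)]
for i, j, k in known_entries:
    rating[i, j, k] = 1

print(rating.size)        # 112200 elements, matching 66 x 10 x 170
print(int(rating.sum()))  # 3 known entries
```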

Page 27: Tensor factorization

CP decomposition (canonical polyadic): F. L. Hitchcock, Stud. Appl. Math. 6, 164 (1927).
Tucker decomposition (higher order singular value decomposition, HO-SVD): L. R. Tucker, Psychometrika 31, 279 (1966).

SCIKIT-TENSOR
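A truncated higher-order SVD can be written in a few lines of NumPy: take the leading left singular vectors of each mode unfolding as factor matrices, then project the tensor onto them to obtain the core. The sketch below is a generic illustration with hypothetical helper names (`unfold`, `mode_dot`, `hosvd`, `reconstruct`); it is not the scikit-tensor code used in the talk, and the test tensor is random.

```python
import numpy as np


def unfold(T, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def mode_dot(T, M, mode):
    """Multiply tensor T by matrix M (shape J x I_mode) along axis `mode`."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis is replaced by J, placed first
    return np.moveaxis(out, 0, mode)


def hosvd(T, ranks):
    """Truncated HOSVD: returns the core tensor and one factor matrix per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_dot(core, U.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the (approximate) tensor from core and factors."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_dot(T, U, mode)
    return T


# Random tensor with multilinear rank (2, 2, 2), used only as a self-check.
rng = np.random.default_rng(0)
G = rng.normal(size=(2, 2, 2))
A, B, C = rng.normal(size=(6, 2)), rng.normal(size=(5, 2)), rng.normal(size=(4, 2))
T = reconstruct(G, [A, B, C])

core, factors = hosvd(T, ranks=(2, 2, 2))
T_hat = reconstruct(core, factors)
print(core.shape, np.linalg.norm(T - T_hat))  # error should be near zero here
```

For a rating tensor, the entries of the reconstructed tensor play the role of predicted ratings, in the same way as the reconstructed matrix does for matrix factorization.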

Page 28: Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition

Tensor factorization

Ternary # Elements: 7,405,200

[Figure: number of correct answers included in ICDD & SpMat.]

Page 29: Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition

Ternary # Elements: 7,405,200

[Figure: number of correct answers included in ICDD & SpMat.]

TOP 100 compositions with high predicted rating: discovery rate > 59% !!
TOP 3,000 compositions with high predicted rating: discovery rate > 25% !!

Page 30: Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition

Ternary # Elements: 7,405,200

[Figure: number of correct answers included in ICDD & SpMat; tick label: 3000.]

Page 31: Results for quaternary/quinary systems

[Figure: number of correct answers included in ICDD & SpMat for TOP100 and TOP3000; labels: 59%, 52%, 15%.]

Discovery rate > 15% even for quinary systems with TOP100 high predicted rating.

Page 32: Further validation by first principles calculations for pseudo-binary compounds with high predicted rating

Rb3InO3: predicted rating 0.64
RbInO2: predicted rating 1.01

Page 33: Further validation by first principles calculations for TOP 27 pseudo-binary compounds with high predicted rating

23 among 27 compositions (85%) are thermodynamically stable by DFT!

Page 34: Systematic discovery of as-yet-unknown CRC (chemically relevant composition)

Use of tensor-based recommender system ONLY with inorganic crystal database, ICSD.

Rating prediction with neither descriptors nor DFT results.

Validation by two other databases, ICDD-PDF & SpringerMaterials. Discovery rate is 59/52/15% for TOP 100 ternary/quaternary/quinary CRC.

Validation by DFT calculations. Among TOP 27 ternary (pseudo-binary oxides), 85% are thermodynamically stable.

