
Contents...–Limited the total number rings between 3 and 5. –Exclude non-planar type (5-21) and...

Date post: 12-Mar-2020

Contents

Trend in Computer-Aided Materials Discovery

High-Throughput Computational Screening & Exhaustive Enumeration

Deep-Learning-based Evolutionary Design

Deep-Learning-based Inverse Design

Efficacy of Computer-Aided Materials Discovery

Trend in Computer-Aided Materials Discovery

For accelerated materials discovery

[Conventional] Trial-and-error (high cost): iterative experiments

[1st Gen.] Simulation (low throughput): pre-validation

[2nd Gen.] Virtual screening (low hit-rate): high throughput

[3rd Gen.] Targeted design (high hit-rate): right solutions with minimum effort

Enabling technologies: machine learning, first-principles quantum chemistry, high-performance computing

Keywords: intelligence, efficiency, rationalization

Trend in Computer-Aided Materials Discovery

Prediction of materials property based on machine learning

– Build-up of Materials vs. Property DB → Materials Informatics

Timeline

• QSAR* (’62, Hansch & Fujita)

• ANN** in chemistry (’71)

• SMILES*** (’87, Weininger)

• Graph kernels (’05 @ UC Irvine)

• Bayesian modeling (’09 @ MIT)

• (’16 @ Stanford)

• (’18 @ Harvard)

Stages: cheminformatics → introduction stage of machine learning → development stage (kernel methods, Bayesian approaches, deep learning)

Process of Machine Learning @ Materials Research: Descriptor → Training → Analysis

• SMILES: CC(C)NCC(O)COC1=CC(CC2=CC=CC=C2)=C(CC(N)=O)C=C1

• Fingerprint: 011100011111101010010100100000101010001001010…

• Other descriptors: descriptor vector, graphs, images

* QSAR: Quantitative Structure-Activity Relationship

** ANN: Artificial Neural Network

*** SMILES: Simplified Molecular-Input Line-Entry System

Trend in Computer-Aided Materials Discovery

Materials design based on machine learning

– Inverse QSAR → Inverse Design

Timeline

• Inverse QSAR (late ’80s~)

• Genetic algorithms (’92 @ Purdue)

• Exhaustive generation (’12 @ Tokyo)

• SMILES autoencoder (’16 @ Harvard)

• Inverse design (’16 @ SAIT)

• GAN* for molecules (’17 @ Harvard)

Approaches: combinatorial, evolutionary, autoencoder (deep learning / generative models)

Focus on autonomous molecular generation

* GAN: Generative Adversarial Network

Trend in Computer-Aided Materials Discovery

In-silico technologies for materials discovery

Elemental technologies: automated simulation, DB, machine learning

Materials discovery methodologies: materials informatics, HTCS (High-Throughput Computational Screening), molecular enumeration, evolutionary design, inverse design

[In] Targets → [Out] Molecules (target molecules)

High-Throughput Computational Screening & Exhaustive Enumeration

“Landscape of phosphorescent light-emitting energies of homoleptic Ir(III)-complexes predicted by a graph-based enumeration and deep learning”, GI01.02.02, 2018 MRS fall meeting

High-Throughput Computational Screening

Exhaustive enumeration based on graph-theory

– “Graphs”

• Mathematical structures used to model pairwise relations between objects.

• Made up of nodes and edges.

• In chemistry, a graph is used to model a molecule, where nodes represent atoms and edges represent bonds.

※ Exhaustive enumeration: systematic enumeration of all possible molecules in order to search for the optimal solution
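
The graph view of a molecule described above can be sketched in a few lines of Python. This is only an illustration: the example molecule (the heavy atoms of ethanol) and the helper name `degree` are assumptions chosen for the sketch and do not come from the slides.

```python
# Heavy-atom graph of ethanol (C-C-O); an illustrative choice, not from the slides.
atoms = {0: "C", 1: "C", 2: "O"}   # nodes: atom index -> element symbol
bonds = [(0, 1), (1, 2)]           # edges: pairs of bonded atom indices


def degree(node, edges):
    """Return the number of edges (bonds) that touch the given node (atom)."""
    return sum(node in edge for edge in edges)


# Number of bonds at each atom, analogous to "No. of edges at each node".
degrees = {index: degree(index, bonds) for index in atoms}
print(degrees)
```

The same node-and-edge bookkeeping carries over when the nodes stand for rings instead of atoms.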

High-Throughput Computational Screening

Complete list of non-isomorphic graphs

http://www.cadaeic.net/graphpics.htm

Table columns: ID, No. of edges, No. of edges at each node

High-Throughput Computational Screening

Landscape of phosphorescent light-emitting energies of homoleptic Ir(III)-complex core structures

– Ir(III)-complexes

• Widely used as phosphorescent OLED dopants.

• Figuring out the full landscape of emission color is important for discovering high-performing molecules in target color regions.

New J. Chem., 39, 246 (2015)

ACS Appl. Mater. Interfaces, 10, 1888–1896 (2018)

Organic Electronics, 63, 244–249 (2018)

High-Throughput Computational Screening

Approach

– Consider the nodes of a graph as rings and the edges as ring connections.

– Limit the total number of rings to between 3 and 5.

– Exclude the non-planar type (5-21) and structures that are invalid as dopants.

→ Only 11 graphs are valid among the total of 29 graphs.
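
The count of 29 can be reproduced with a brute-force sketch, under the assumption that the 29 graphs are the connected non-isomorphic graphs with 3 to 5 nodes (the count matches, but the slides do not state this explicitly). The function names are illustrative, and the canonical form used here is a simple minimum over all relabelings, which is only practical for very small graphs.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from node 0; True if every node is reached."""
    adjacency = {i: set() for i in range(n)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            stack.append(neighbor)
    return len(seen) == n


def canonical_form(n, edges):
    """Smallest sorted edge list over all node relabelings (isomorphism key)."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(tuple(sorted((perm[a], perm[b]))) for a, b in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


def connected_graphs(n):
    """All connected graphs on n nodes, one representative per isomorphism class."""
    pairs = list(combinations(range(n), 2))
    classes = set()
    for k in range(n - 1, len(pairs) + 1):   # a connected graph needs >= n-1 edges
        for edges in combinations(pairs, k):
            if is_connected(n, edges):
                classes.add(canonical_form(n, edges))
    return sorted(classes, key=lambda e: (len(e), e))


counts = {n: len(connected_graphs(n)) for n in (3, 4, 5)}
print(counts, sum(counts.values()))
```

Among graphs with at most five nodes, the only non-planar one is the complete graph on five nodes (10 edges), which is consistent with a single non-planar type being excluded.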

High-Throughput Computational Screening

Enumeration steps: 1. Graphs → 2. Skeletons → 3. Set iridium positions → 4. Substitute some carbon atoms with nitrogen atoms

Enumeration

– For 5- and 6-membered rings.

– Substitute some carbons of each molecule with nitrogen atoms (max. five).

→ Total 9,919,469 (~10M) core structures

total 405 EA
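
The nitrogen-substitution step can be illustrated with a small counting sketch. It assumes a hypothetical skeleton with 10 substitutable carbon sites, counts the unsubstituted case as well, and ignores molecular symmetry and chemical validity, so it gives an upper bound rather than the procedure actually used for the slides.

```python
from itertools import combinations
from math import comb


def substitution_patterns(n_sites, max_n=5):
    """Yield every set of site indices to change from C to N (0 to max_n sites)."""
    for k in range(min(max_n, n_sites) + 1):
        yield from combinations(range(n_sites), k)


def count_patterns(n_sites, max_n=5):
    """Closed-form count: sum of binomial coefficients C(n_sites, k)."""
    return sum(comb(n_sites, k) for k in range(min(max_n, n_sites) + 1))


n_sites = 10  # hypothetical number of substitutable carbons in one skeleton
total = count_patterns(n_sites)
enumerated = sum(1 for _ in substitution_patterns(n_sites))
print(total, enumerated)
```

Because the count grows combinatorially with the number of sites, a few hundred skeletons can plausibly expand into millions of core structures.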

High-Throughput Computational Screening

Property prediction

– Trained a deep-neural-network model with simulated T1 data

• Input: ECFP(Extended Connectivity FingerPrints) of molecular structures

• Outputs: T1 energy (phosphorescent light-emitting wavelength)

[Figure: mean absolute error of T1 predicted by the DNN (eV, 0 to 0.2) versus size of the training dataset (10K to 80K)]

With 80k training data, the average prediction error was less than 0.1 eV.

80k / 10M = 0.8%

By simulating the properties of only 0.8% of the molecules, we can fully scan the chemical space of 10M!

High-Throughput Computational Screening

Results

– Distribution of T1 values

– Blue-color emitting materials are rare compared with red and green

[Figure: histogram of predicted T1 (eV, 0.05 to 2.95) versus number of molecules (×100,000); share by emission color: Blue (0.4%), Green (4.3%), Red (18.4%)]

Conclusions

In materials discovery, deep-learning-based HTCS is a good alternative to conventional trial-and-error approaches.

Moreover, exhaustive enumeration makes it possible to systematically explore the whole chemical space.

With the proposed exhaustive enumeration method based on graph theory and deep learning, the whole landscape of 10M phosphorescent Ir-dopants could be scanned at just 0.8% of the computational cost of a purely simulation-based approach.

Deep-Learning-based Evolutionary Design

“Evolutionary design of organic molecules based on deep learning and genetic algorithm”, COMP, ACS fall 2018 National Meeting

Evolutionary Design

A generic population-based metaheuristic optimization technique

Uses bio-inspired operators to reach near-optimal solutions: mutation, crossover, and selection in the case of a genetic algorithm

[Figure: fitness landscape (https://en.wikipedia.org/wiki/Fitness_landscape) and average fitness versus generation]

Flow: Initial population → Calculate fitness → Selection → Mutation + Crossover → New population → Satisfy constraints? (Yes: done; No: next iteration)
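
The loop above can be sketched as a minimal genetic algorithm on bit strings. Everything specific here is an assumption made for illustration: the toy fitness (the number of 1-bits), truncation selection, one-point crossover, the elitism step, and all parameter values. None of them is taken from the slides.

```python
import random


def evolve(fitness, n_bits=32, pop_size=50, generations=100, p_mut=0.02, seed=0):
    """Toy genetic algorithm: selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []                                  # best fitness per generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]     # selection: keep the fitter half
        children = [population[0][:]]             # elitism: carry over the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)        # crossover point
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(fitness=sum)
print(history[0], history[-1])
```

Because the best individual is copied unchanged into each new population, the best fitness never decreases from one generation to the next.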

Deep-Learning-Based Evolutionary Design

Proposed approach (Conventional → Proposed)

• Molecular descriptor: graph or ASCII string → bit string (ECFP)

• Molecular evolution: heuristic → random

• Fitness evaluation: simple assessment → DNN

• Decoding of evolved descriptors: RNN

Expectations

• Prevent heuristic bias

• Secure chemical validity

• Versatile evaluation is possible

Workflow: Seed molecule (ECFP) → Mutation (n=50) → Decoding to SMILES (RNN) → Inspection of chemical validity → Fitness evaluation (DNN) → Selection → Evolution (Crossover → Mutation) → Decoding to SMILES (RNN) → Inspection of chemical validity → Fitness evaluation (DNN) → iteration → Best-fit molecule (stored in DB)

[Figure: bit-string illustration of parents, crossover, and mutation]

* ECFP (Extended Connectivity FingerPrint)

DNN (Deep Neural Network), RNN (Recurrent Neural Network)

SMILES (Simplified Molecular-Input Line-Entry System)

Deep Learning-Based Evolutionary Design

Deep learning models

• [DNN] 3 hidden layers, 500 hidden units in each layer; input: ECFP*, output: properties

• [RNN] 3 hidden layers, 500 long short-term memory units; input: ECFP*, output: SMILES

RNN decoding: starting from <start> with the ECFP* input at t=1, the model emits y1=‘CCC’, y2=‘CCC’, y3=‘CC(’, …, yT=‘)=O’ at successive steps and <end> at t=T+1.

y = (‘CCC’,‘CCC’,‘CC(’,…, ‘)=O’) → ‘CCCC(N)=O’

*ECFP (dimension=5,000, neighbor size=6)
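
The decoding example suggests that the RNN output tokens are overlapping three-character windows of the SMILES string; this reading is inferred from the example on the slide and is not stated outright. Under that assumption, splitting and reassembling a string can be sketched as follows (the function names are illustrative).

```python
def to_trigrams(smiles):
    """Split a SMILES string into overlapping three-character windows."""
    return [smiles[i:i + 3] for i in range(len(smiles) - 2)]


def from_trigrams(grams):
    """Rebuild the string: first window, then the last character of each next one."""
    if not grams:
        return ""
    return grams[0] + "".join(gram[-1] for gram in grams[1:])


tokens = to_trigrams("CCCC(N)=O")
print(tokens)
print(from_trigrams(tokens))
```

For the slide's example this yields seven tokens that start with 'CCC', 'CCC', 'CC(' and end with ')=O', and the round trip restores 'CCCC(N)=O'.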

Deep Learning-Based Evolutionary Design

Validation test

• Design target: change the S1 (light-absorbing wavelength) of seed molecules

• Training data: M.W. 200~600 g/mol from PubChem (10,000~50,000 molecules)

[Figure: parity plots of DNN-predicted versus DFT-calculated values (eV): HOMO (R=0.945), LUMO (R=0.955), S1 (R=0.973)]

Prediction accuracy of DNN※1 (R, MAE) and success rate of decoding※2 (RNN)

No. of training data    S1              HOMO            LUMO            Decoding success rate
① 50,000                0.973, 0.198    0.945, 0.172    0.955, 0.209    86.7%
② 30,000                0.930, 0.228    0.934, 0.191    0.945, 0.224    85.3%
③ 10,000                0.913, 0.278    0.885, 0.244    0.917, 0.287    83.2%

※1. No. of test data = No. of training data/10

※2. Chemical validity is evaluated with RDKit; No. of test data = 5,000
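
The two accuracy figures reported per property, the correlation coefficient R and the mean absolute error (MAE), can be computed as in the sketch below. The four DFT and DNN values are hypothetical numbers chosen only to exercise the functions; they are not data from the slides.

```python
from math import sqrt


def mean_absolute_error(y_true, y_pred):
    """Average of |prediction - reference| over all samples."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    spread_t = sqrt(sum((t - mean_t) ** 2 for t in y_true))
    spread_p = sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return covariance / (spread_t * spread_p)


dft = [3.9, 4.1, 4.5, 5.0]   # hypothetical reference S1 values (eV)
dnn = [4.0, 4.0, 4.6, 4.8]   # hypothetical predicted S1 values (eV)
mae = mean_absolute_error(dft, dnn)
r = pearson_r(dft, dnn)
print(round(mae, 3), round(r, 3))
```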

Deep Learning-Based Evolutionary Design

Evolution toward the increase and decrease of S1 (eV)

• Seed: randomly selected 50 molecules (3.8<S1<4.2)

• Number of training data = 10k, 30k, 50k

[Figure: average rate of S1 change (%, -60 to 60) versus generation (0 to 500), evolving toward the increase and toward the decrease of S1, for 50,000, 30,000, and 10,000 training data]

[Figure: S1 distribution in the training data (50k): number of molecules versus S1 (eV), with 4.0 eV marked]

Deep Learning-Based Evolutionary Design

Evolution under the constraint of HOMO and LUMO (eV)

• Seed: randomly selected 50 molecules (3.8<S1<4.2)

• Number of training data = 50k

• Constraint: -7.0<HOMO<-5.0, LUMO<0.0

[Figure: average rate of S1 change (%, -60 to 60) versus generation (0 to 500), with and without the constraint, toward the increase and toward the decrease of S1]

[Figure: HOMO and LUMO distributions in the training data (50k): number of molecules versus HOMO (eV) with the -7 eV and -5 eV bounds marked, and number of molecules versus LUMO (eV) with the 0 eV bound marked]

Deep Learning-Based Evolutionary Design

Examples of evolved molecules (No. of training data = 50k)

Constraint (eV)

• -7.0<HOMO<-5.0

• LUMO<0.0

Conclusions

A fully data-driven evolutionary molecular design method based on deep-learning models (DNN & RNN) was proposed; it automatically evolved seed molecules toward the target without any pre-defined chemical rules.

Unlike HTCS, the closed-loop evolutionary workflow guided by deep learning automatically derived target molecules and found rational design paths by elucidating the relationship between structural features and their effect on the molecular properties.

Deep-Learning-based Inverse Design

npj Comput. Mater., 4, 67, 2018

Deep-Learning-Based Inverse Design

Implementation of inverse-design model

a. Molecular design flow: molecular database (molecular structures, properties) → deep learning → extraction of design knowledge → generation of new molecules [target properties]

b. Model: encoder-decoder (input → encoder → hidden factor, a fixed-length vector → decoder → output)

• Input: molecular descriptor (x; ECFP format)

• Encoding: z=e(x)

• DNN: f(z) → molecular property (t)

• RNN: d(z) → molecular structure identifier (y; SMILES format)

e(·) : encoding function

f(·) : property prediction function

d(·) : decoding function

z: encoded vector of molecular descriptor

Deep-Learning-Based Inverse Design

Inverse design of light-absorbing organic molecules (1/2)

• Training DB

‒ 50k molecules sampled from PubChem (M.W. 200~600)

‒ DFT calculations for S1

Distribution of λmax of the inverse-designed molecules

[Figure: percentage of generated molecules (0% to 50%) versus λmax (100 to 600 nm), panels a to c, one per target range]

Target                Hit rate
λmax=200–300 nm       82.6%
λmax=300–400 nm       64.8%
λmax=400–500 nm       45.6%

*Simulation values for the 500 molecules in each target

※ About 10% of the designed molecules were found in PubChem even though those were not included in the randomly selected training library.
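
A hit rate of this kind is the fraction of designed molecules whose simulated value falls inside the target range. The sketch below shows that calculation; the five λmax values are hypothetical, and treating both range bounds as inclusive is an assumption, since the slides do not say how boundary values were counted.

```python
def hit_rate(values, low, high):
    """Fraction of values inside the target range (bounds assumed inclusive)."""
    hits = sum(low <= value <= high for value in values)
    return hits / len(values)


simulated_lambda_max = [210.0, 250.5, 305.0, 280.0, 199.0]  # hypothetical, in nm
rate = hit_rate(simulated_lambda_max, 200, 300)
print(f"{rate:.1%}")
```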

Deep-Learning-Based Inverse Design

Inverse design of light-absorbing organic molecules (2/2)

Examples of inverse-designed molecules that share moieties with well-known dye materials

a. Anthraquinone derivative (λmax=433.4 nm)

b. Azobenzene derivative (λmax=527.5 nm)

c. Isoindoline derivative (λmax=434.4 nm)

d. Squaraine derivative (λmax=503.5 nm)

Deep-Learning-Based Inverse Design

Inverse design of hosts for blue phosphorescent OLED (1/3)

• Target: T1 ≥ 3.00 eV

• Training DB

‒ In-house library of 6,000 molecules built by combinatorial enumeration (nine linker (L) and fifty-seven terminal (R) fragments that are frequently employed in OLED hosts; symmetric R-L-R and R-R type enumeration).

‒ Property labeling with DFT calculations.

[Figure: fraction of molecules (0% to 60%) versus simulated T1 (2.4 to 3.6 eV) for a. training library, b. targeted inverse design (T1 ≥ 3.00 eV), c. untargeted inverse design]

The distribution of simulated T1 (eV) energy levels for the generated 3,205 molecules

a. mean=2.94, std=0.15

b. mean=3.02, std=0.10

c. mean=2.92, std=0.13

The fractions of the hosts that satisfied the target (T1 ≥ 3.00 eV)

36.2% for a

58.7% for b

26.9% for c (3,497 molecules)

Deep-Learning-Based Inverse Design

Inverse design of hosts for blue phosphorescent OLED (2/3)

Examples of inverse-designed host materials

a. Asymmetric molecules with the given fragments in the training library (a1, a2, a3)

b. Symmetric molecules where the new fragments were introduced (b1, b2, b3)

c. Asymmetric molecules where the new fragments were introduced (c1, c2, c3)

[Figure: structures of the nine molecules, each labeled with its T1 from ML and from DFT. ML / DFT pairs shown (eV): 3.13 / 3.16, 3.11 / 3.12, 3.12 / 3.12, 3.05 / 3.06, 3.18 / 3.20, 3.08 / 3.12, 3.03 / 3.12, 3.13 / 3.22, 3.08 / 3.12]

Experiment (eV)

      HOMO (eV)   LUMO (eV)   S1 (eV)   T1 (eV)   ΔEST (eV)
a1    -5.98       -2.43       3.56      3.06      0.55
b1    -5.96       -2.14       3.64      2.93      1.01
c1    -6.07       -2.65       3.38      2.97      0.46

Deep-Learning-Based Inverse Design

Inverse design of hosts for blue phosphorescent OLED (3/3)

The connection rules of the inverse-designed molecules

Total host molecules (3,205)
    R-L-R (3,010)
        R-Lsym-R (1,931)
            R1-Lsym-R1 (403)
            R1-Lsym-R2 (1,528)
        R-Lasym-R (1,079)
            R1-Lasym-R1 (636)
            R1-Lasym-R2 (443)
    R-R (190)
        R1-R1 (16)
        R1-R2 (174)
    L-(R1,R2,R3) ★ (4)
    R (1)

L: Linker fragment

R: Terminal fragment

Lsym: Symmetric linker

Lasym: Asymmetric linker

★ One linker connected to three terminals (Terminal1, Terminal2, Terminal3)

Conclusions

A fully data-driven inverse design method successfully extracted the latent materials design rules and proposed target molecular structures without any external intervention.

The inverse design model successfully proposed new candidates by modifying the assembly rules and creating new fragments.

Efficacy of Computer-Aided Materials Discovery

Simulation-based screening (full search; HTCS for a pre-defined chemical space)

1st trial: 1M candidates; QC simulations take 1.5 years; fails to find the target structure

2nd trial: 1M candidates; QC simulations take 1.5 years; fails to find the target structure

3rd trial: 1M candidates; QC simulations take 1.5 years; succeeds in finding the right structure

Total TAT took 4.5 years

Inverse design

[Step 1] Building the training dataset: needs QC simulations for only 50k molecules (27 days)

[Step 2] Deep learning model training with GPU (3 days)

[Step 3] QC simulations for the proposed molecules (1 day)

Total TAT takes 1 month

More than 50X speed-up (4.5 years vs. 1 month)

[Figure labels: 50K, 5M]

* QC simulation tool: TURBOMOLE

Total computational resources = 10,000 CPUs

With 10 CPUs per molecule, the simulation of one molecule requires about 13 hrs.

“The inverse design learns by itself the molecular design rules inherent in the libraries and can reduce the effort of researchers and total time to reach the goal”
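
The turnaround times quoted above follow from the stated resources, as the arithmetic sketch below shows. It assumes that molecules are simulated in full batches across all CPUs with no scheduling overhead, and that the three screening trials run one after another; the constant and function names are illustrative.

```python
TOTAL_CPUS = 10_000        # total computational resources
CPUS_PER_MOLECULE = 10     # CPUs used per molecule
HOURS_PER_MOLECULE = 13    # approximate QC simulation time per molecule


def simulation_days(n_molecules):
    """Wall-clock days to simulate n molecules in full parallel batches."""
    concurrent = TOTAL_CPUS // CPUS_PER_MOLECULE      # molecules running at once
    batches = -(-n_molecules // concurrent)           # ceiling division
    return batches * HOURS_PER_MOLECULE / 24


screening_days = 3 * simulation_days(1_000_000)       # three 1M-candidate trials
screening_years = screening_days / 365
inverse_design_days = simulation_days(50_000) + 3 + 1  # Step 1 + Step 2 + Step 3
speedup = screening_days / inverse_design_days

print(round(screening_years, 2), round(inverse_design_days, 1), round(speedup, 1))
```

Under these assumptions one 1M-candidate trial takes about 1.5 years, the 50k training set takes about 27 days, and the overall ratio comes out slightly above 50.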

Prospects for AI-based Materials Development

Artificial intelligence for materials design: target properties + database → propose target materials

Research cycle: design, synthesis, analytical chemistry, simulation

Simulation: DFT simulation (<100 atoms) → training → neural network (MD potential; energy, electronic properties) → meso-scale simulation (~10^4 atoms)

ChemOS: design, synthesis, analysis; robot, characterization, ML algorithm, DB, queuing system