
Mat informatics opportunties fisherbarton 2015 12-07 1.1

Date post: 23-Jan-2018
Page 1: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Opportunities in Materials Informatics

Dane Morgan

University of Wisconsin, Madison

[email protected], W: 608-265-5879, C: 608-234-2906

Fisher Barton Technology Center

Watertown, WI

December 7, 2015

Page 2: Mat informatics opportunties fisherbarton 2015 12-07 1.1

What is Materials Informatics?

Materials informatics is a field of study that applies the tools and principles of information extraction from data (informatics) to materials science and engineering to better understand the use, selection, development, and discovery of materials.

– Mining for materials information in large data sets

– Applying new information technologies to enable new materials science


Page 3: Mat informatics opportunties fisherbarton 2015 12-07 1.1

What Are Materials Informatics Applications?

Related buzzwords: Data science, data analytics, data mining, knowledge discovery, machine learning, artificial intelligence, deep learning, big data …

• Interpolation/Extrapolation/Correlation of Data – determine controlling factors, fill in what is missing, optimize

• Design of Experiments – Perform experiments in optimal order to achieve your goal

• Clustering (Feature Extraction) – group like things together, either supervised (classification) or unsupervised (clustering)

• Image Recognition – identify things in pictures and analyze them

• Optimization – find the optimal solution in complex spaces

• Text Mining – Extract data from published documents, web


Associated Infrastructure: Cloud computing, high-performance computing clusters, high-throughput/combinatorial experiment+computation, …

Page 4: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Materials Informatics Has a Strong History

Mendeleev 1871

Ashby map

Page 5: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Turning Point for Materials Informatics

Data Availability, Data Production, Informatics Tools

[Figure: Predicted log k* (cm/s), range -6 to -3, and E above hull (meV/atom), range 0 to 40, for LaBO3, YBO3, PrBO3, and (Sr,Ba)BO3.]

Page 6: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Informatics Tools Explosion


Prediction API

Page 7: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Real Time Translation with Deep Learning from Microsoft

https://www.youtube.com/watch?v=Nu-nlQqFCKg

Time: 6:30s


Page 8: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Google Image Captioning

http://www.nytimes.com/2014/11/18/science/researchers-announce-breakthrough-in-content-recognition-software.html?_r=0


Page 9: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Focus Area: Informatics for Knowledge Discovery in Large Data Sets

Use machine learning techniques to

• Organize your data by putting all relevant, cleaned input and output into one place

• Understand your data by finding the most important factors controlling output values

• Expand your data by interpolating and extrapolating

• Optimize your data by finding correlations between input and output data to optimize desired output


Page 10: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Example

I know impurities impact my device lifetime, so …

• Organize: Build a database of all the relevant factors (impurity concentrations, processing conditions, testing conditions, …) and output performance.

• Understand: Determine which impurities matter most and how large impurity effects are compared with other contributions.

• Expand: Interpolate/extrapolate to other impurity concentrations to assess performance under conditions we have not yet explored.

• Optimize: Determine impurity concentrations that lead to optimal performance.
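One simple way to approach the "Understand" step is to rank the input factors by how strongly each correlates with the output. The sketch below is only an illustration in plain Python: the factor names and all numbers are made up for the example and are not data from this talk, and a strong correlation alone does not establish that a factor controls the output.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical "organized" data: one column per factor, plus one output column.
factors = {
    "impurity_A_ppm": [1.0, 2.0, 3.0, 4.0, 5.0],
    "impurity_B_ppm": [5.0, 1.0, 4.0, 2.0, 3.0],
    "anneal_temp_C": [400.0, 420.0, 410.0, 430.0, 415.0],
}
lifetime_hours = [98.0, 81.0, 59.0, 42.0, 20.0]

# Rank factors by |correlation| with the output, strongest first.
ranking = sorted(
    ((name, pearson(values, lifetime_hours)) for name, values in factors.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

In this made-up data set the first impurity dominates. With real data, a regression model fitted to all factors at once would usually be a better guide than one-at-a-time correlations.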

Page 11: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Undergraduate “Materials Informatics Skunkworks”

Benjamin Anderson, Liam Witteman

Team support

Henry Wu, Aren Lorenson, Haotian Wu

Zachary Jensen

Jason Maldonis, Josh Perry, Tom Vandenberg, Robert Darlington

Page 12: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Example: Predicting Impurity Diffusion in FCC Alloys


UNPUBLISHED DATA – CONFIDENTIAL – DO NOT DISSEMINATE

Calculated activation energies with ab initio methods

[Figure: six panels of diffusion barrier (eV) versus impurity element, with impurities grouped by periodic-table column (Sc/Y/La, Ti/Zr/Hf, V/Nb/Ta, Cr/Mo/W, Mn/Tc/Re, Fe/Ru/Os, Co/Rh/Ir, Ni/Pd/Pt, Cu/Ag/Au, Zn/Cd/Hg, and in some panels Ga/In/Tl, Ge/Sn/Pb, As/Sb/Bi, Ca/Sr/Ba, K/Rb/Cs). Panel labels: Mg, Al, Cu, Ni, Pd, Pt. Barrier axes span 1.0 to 3.0 eV or 2.0 to 4.0 eV.]

Page 13: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Example: Predicting Impurity Diffusion in FCC Alloys

• 15 FCC hosts × 100 impurities = 1500 systems, ~15 million core-hours (~$500k and ~2 years to produce).

• We have computed values for ~10% of these systems.

• How can we quickly (and cheaply) get to ~100% coverage?
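The scale of the problem can be made concrete with a little arithmetic. The sketch below uses only the approximate figures quoted above and assumes, as one reading of the slide, that the cost and core-hour totals refer to all 1500 systems and are spread evenly across them.

```python
# Approximate figures quoted on the slide.
hosts, impurities = 15, 100
total_core_hours = 15e6
total_cost_usd = 500e3
fraction_done = 0.10

systems = hosts * impurities                         # host-impurity pairs
core_hours_per_system = total_core_hours / systems   # average compute per system
cost_per_system = total_cost_usd / systems           # average cost per system (USD)
remaining = round(systems * (1 - fraction_done))     # systems still to compute
remaining_cost = remaining * cost_per_system         # cost to finish by brute force

print(f"{systems} systems, {core_hours_per_system:.0f} core-hours each")
print(f"{remaining} systems remaining, about ${remaining_cost:,.0f} to compute directly")
```

Under these assumptions each system costs on the order of 10,000 core-hours, which is why a fast surrogate model for the remaining ~90% is attractive.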

[Table: coverage grid of host elements M (Al, Ca, Ni, Cu, Sr, Rh, Pd, Ag, Yb, Ir, Pt, Au, Pb, Ac, Th; Z = 13 to 90) against impurity elements X (H through Pu, Z = 1 to 94). Cell contents are not recoverable from the transcript; some entries in the Y, Mo, Tc, Ru, Rh, Pd, and La rows are marked N/A.]

UNPUBLISHED DATA – CONFIDENTIAL – DO NOT DISSEMINATE

Page 14: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Materials Informatics Approach – Regression and Prediction

• Assume activation energy = F(elemental properties)

• Elemental properties = melting temperature, bulk modulus, electronegativity, …

• F is determined using one of many possible methods: linear regression, neural network, decision tree, kernel ridge regression, …

• Fit F with calculated data, test it with cross-validation, then predict new data.
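The fit, cross-validate, predict loop in the last bullet can be sketched in a few lines. The example below is a minimal illustration that takes F to be a linear model and uses leave-one-out cross-validation; the descriptors and target values are synthetic stand-ins rather than the elemental properties or activation energies from this work, and the helper names (`fit_linear`, `loocv_rmse`) are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_linear(X, y):
    """Least-squares fit of y ~ w0 + w . x via the normal equations."""
    Xb = [[1.0] + list(row) for row in X]            # prepend an intercept column
    p = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(w, x):
    """Evaluate the fitted linear model at descriptor vector x."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def loocv_rmse(X, y):
    """Leave-one-out CV: refit without each point, then predict that point."""
    sq_errors = []
    for k in range(len(y)):
        w = fit_linear(X[:k] + X[k + 1:], y[:k] + y[k + 1:])
        sq_errors.append((predict(w, X[k]) - y[k]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Synthetic stand-in data: two made-up descriptors per system, made-up target.
X = [[1.0, 0.5], [2.0, 1.5], [3.0, 0.7], [4.0, 2.2], [5.0, 1.1], [6.0, 3.0]]
y = [1.0 + 0.3 * a - 0.2 * b for a, b in X]          # exactly linear by construction

w = fit_linear(X, y)
print("coefficients:", [round(v, 3) for v in w])
print("LOOCV RMSE:", loocv_rmse(X, y))
print("prediction for a new system:", predict(w, [7.0, 2.0]))
```

Because the synthetic target is exactly linear, the cross-validation error here is essentially zero. On real data the cross-validated RMS error is the number to compare across candidate forms of F.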

[Figure: the M-X coverage grid (host elements M against impurity elements X, Z = 1 to 94) shown twice, linked by the label "Train F(properties)".]

Y. Zeng and K. Bai, Journal of Alloys and Compounds 624, p. 201-209 (2015).

Page 15: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Model Predictive Ability

• Leave-one-out cross-validation

• Predictive RMS error = 0.14 eV (vs. 0.24 eV for a linear fit), which predicts the diffusion coefficient of a new impurity to within a factor of 10 at 1000 K

• Time to predict a new system: < 1 s!
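The link between an error in activation energy and an error in diffusivity can be checked with a short calculation. The sketch below assumes an Arrhenius form, D = D0 exp(-E / (k_B T)), with the prefactor D0 known exactly, so that an error ΔE in E multiplies D by exp(ΔE / (k_B T)). That assumption is an illustration and is not stated on the slide.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def diffusivity_error_factor(delta_e_ev, temperature_k):
    """Multiplicative error in D = D0 * exp(-E / (k_B T)) from an error delta_e in E."""
    return math.exp(delta_e_ev / (K_B_EV_PER_K * temperature_k))

for label, rms_ev in [("model, 0.14 eV", 0.14), ("linear fit, 0.24 eV", 0.24)]:
    factor = diffusivity_error_factor(rms_ev, 1000.0)
    print(f"{label}: D off by a factor of about {factor:.1f}")
```

Under these assumptions the 0.14 eV error corresponds to roughly a factor of 5 in D at 1000 K, while 0.24 eV corresponds to roughly a factor of 16, which is consistent with the "within 10x" statement for the better model.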

[Figure: Leave One Out Cross Validation. Predicted activation energy (eV) versus DFT activation energy (eV), both axes 0 to 6, for hosts Al, Cu, Ni, Pd, Pt, Au, Ca, Ir, Pb. Fit line: y = 0.9909x, R² = 0.9312.]

UNPUBLISHED DATA – CONFIDENTIAL – DO NOT DISSEMINATE


Page 16: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Al-X Recrystallization Temperature (Tx)

• Data on Tx for 82 Al-X alloys with 11 alloying elements

• What controls Tx and how can we optimize it?

[Figure: recrystallization temperature Tx (°C, axis 0 to 400) and mole fraction of alloying element (axis 0 to 12) versus alloy number (1 to 82), for alloying elements Fe, Y, Ni, La, Ti, Co, Cu, Sn, Ga, B, Ce.]

Courtesy of Izabela Szlufarska, John Perepezko, Zach Jensen

Page 17: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Materials Informatics Approach – Regression and Prediction

• Assume Tx = F(elemental composition)

• Elemental composition = mole fraction of Fe, Cu, Y, …

• F is determined using one of many possible methods: linear regression, neural network, decision tree, kernel ridge regression, …

• Fit F with calculated data, test it with cross-validation, then predict new data.

Train F(properties)

[Figure: the Al-X data set (recrystallization temperature Tx in °C and mole fraction of alloying element versus alloy number, for Fe, Y, Ni, La, Ti, Co, Cu, Sn, Ga, B, Ce), shown twice.]

Page 18: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Linear Regression Prediction of Tx

1000 leave-out-20% cross-validation tests

Max RMS: 91°C

Min RMS: 13°C

Avg RMS: 28°C +/- 10.3°C

Original Data Std Dev: 48°C

[Figure: training and testing plots for the worst-case and best-case cross-validation splits.]

Courtesy of Izabela Szlufarska, John Perepezko, Zach Jensen
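The "1000 leave out 20%" test on this slide amounts to repeated random hold-out validation: hold out a random 20% of the alloys, fit on the remaining 80%, score the RMS error on the held-out part, and repeat. The sketch below illustrates the procedure with a one-variable linear fit on synthetic data. The data, the single composition variable, and the function names are assumptions made for the example, not the Al-X data set or the model used in the talk.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def rms_error(a, b, xs, ys):
    """RMS error of the line y = a + b * x on the given points."""
    return math.sqrt(statistics.fmean([(a + b * x - y) ** 2 for x, y in zip(xs, ys)]))

def repeated_holdout(xs, ys, n_splits=1000, test_fraction=0.2, seed=0):
    """Repeat: hold out a random fraction, fit on the rest, score on the held-out part."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(xs)))
    scores = []
    for _ in range(n_splits):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rms_error(a, b, [xs[i] for i in test], [ys[i] for i in test]))
    return scores

# Synthetic stand-in: 82 "alloys", one made-up composition variable, noisy linear Tx.
data_rng = random.Random(1)
x = [data_rng.uniform(0.0, 12.0) for _ in range(82)]
tx = [200.0 + 10.0 * xi + data_rng.gauss(0.0, 20.0) for xi in x]

scores = repeated_holdout(x, tx)
print(f"min {min(scores):.1f}  max {max(scores):.1f}  "
      f"avg {statistics.fmean(scores):.1f} +/- {statistics.stdev(scores):.1f}")
print(f"std dev of the data: {statistics.stdev(tx):.1f}")
```

Comparing the average held-out RMS error with the standard deviation of the raw data, as the slide does, shows how much better the model is than simply predicting the mean.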

Page 19: Mat informatics opportunties fisherbarton 2015 12-07 1.1

The Undergraduate “Materials Informatics Skunkworks”

We are establishing a team of ~10-20 undergraduates working together to provide materials informatics research for companies. The team will:

• Help researchers in academia and industry develop and utilize this new field

• Provide training in the rapidly growing field of informatics to undergraduates to enhance employment opportunities and key workforce development

• Be supported financially/academically through credits, internships, senior design/capstone projects, funded projects from industry

• Be supported intellectually through a group culture of teamwork and knowledge continuity (more senior members train more junior members), with limited faculty involvement for advanced issues

Page 20: Mat informatics opportunties fisherbarton 2015 12-07 1.1

What the Informatics Skunkworks Might Provide You

WORKFORCE: A team of talented students who are ready to work quickly with your company to get the most out of your data

DATA ANALYTICS: Technical skills to help you organize, understand, and expand data sets and utilize data to optimize materials development

Page 21: Mat informatics opportunties fisherbarton 2015 12-07 1.1

What You Might Provide the Informatics Skunkworks

FINANCIAL/COURSE CREDIT SUPPORT: Internships, co-ops, senior design/capstone projects, research projects, research funding, or course credits

SHARED DATA: Data sets of materials-related performance and property data that are large (> ~50), can be shared (ideally published), and are worth mining

Page 22: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Thank You for Your Attention


Page 23: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Backup


Page 24: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Present Best Approach: Gaussian Kernel Ridge Regression

• We have systems M-X labeled with i, and descriptors labeled with j for each M-X system. Assume y_i are the outputs and x_{i,j} are the input descriptors.

• Regression: Find {a_j} that minimize

$$\sum_i \Big( y_i - \sum_j a_j x_{i,j} \Big)^2$$

• Ridge Regression: Find {a_j} that minimize

$$\sum_i \Big( y_i - \sum_j a_j x_{i,j} \Big)^2 + \lambda \sum_j a_j^2$$

• Kernel Ridge Regression: Find {a_i} that minimize

$$\sum_i \Big( y_i - \sum_{i'} a_{i'} K(x_{i'}, x_i) \Big)^2 + \lambda \sum_{i,i'} a_i a_{i'} K(x_{i'}, x_i)$$

The kernel is

$$K(x_{i'}, x_i) = \exp\!\left( -\frac{\lVert x_{i'} - x_i \rVert^2}{2\sigma^2} \right)$$

New values are given by

$$y^* = \sum_i a_i K(x_i, x^*)$$

Must fit σ and λ.

G. Montavon, et al., NJOP '13.

A. Gretton, Introduction to RKHS, and some simple kernel algorithms, 1/27/15 (lecture notes)
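A minimal sketch of Gaussian kernel ridge regression follows. It relies on the standard closed-form result that, for an invertible kernel matrix K, the kernel ridge objective is minimized by the coefficients solving (K + λI) a = y, and it predicts with y* = Σ_i a_i K(x_i, x*). The training data are a synthetic sine curve, the values of σ and λ are arbitrary illustrative choices (in practice both would be tuned, for example by cross-validation), and the function names are hypothetical.

```python
import math

def gaussian_kernel(u, v, sigma):
    """K(u, v) = exp(-||u - v||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def krr_fit(X, y, sigma, lam):
    """Return coefficients a_i solving (K + lam * I) a = y."""
    n = len(X)
    K_reg = [[gaussian_kernel(X[i], X[j], sigma) + (lam if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return solve(K_reg, y)

def krr_predict(X, a, sigma, x_new):
    """y* = sum_i a_i K(x_i, x*)."""
    return sum(ai * gaussian_kernel(xi, x_new, sigma) for ai, xi in zip(a, X))

# Synthetic 1-D stand-in data: y = sin(x) sampled at x = 0.0, 0.5, ..., 6.0.
X = [[0.5 * k] for k in range(13)]
y = [math.sin(x[0]) for x in X]

sigma, lam = 1.0, 1e-3   # illustrative values only; tune on real data
a = krr_fit(X, y, sigma, lam)
x_star = [1.75]          # a point between training samples
print("predicted:", krr_predict(X, a, sigma, x_star), "exact:", math.sin(x_star[0]))
```

Each prediction only needs one kernel evaluation per training point, which is why predicting a new system is fast once the coefficients are fitted.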

Page 25: Mat informatics opportunties fisherbarton 2015 12-07 1.1

Gaussian Kernel Ridge Regression

Introduction to RKHS, and some simple kernel Algorithms, Arthur Gretton, January 27, 2015

