
Data and Learning Hub for Science

Date post: 07-Apr-2020
Page 1: Data and Learning Hub for Science · Funding: 2018 Argonne Advanced Computing LDRD Ben Blaiszik (bblaiszik@anl.gov), Ryan Chard, Logan Ward, Kyle Chard, Zhuozhao Li, Anna Woodard,

Funding: 2018 Argonne Advanced Computing LDRD

Ben Blaiszik (bblaiszik@anl.gov), Ryan Chard, Logan Ward, Kyle Chard, Zhuozhao Li, Anna Woodard, Yadu Babuji, Steve Tuecke, Mike Franklin, Ian Foster

Data and Learning Hub for Science


Data and Learning Hub for Science (DLHub)

• Collect, publish, categorize models from many disciplines (materials science, physics, chemistry, genomics, etc.)

• Serve models via API to simplify sharing, consumption, and access

• Enable new science through reuse, real-time integration, and synthesis of existing models

[Figure: Collect, Train, Serve stages; timeline labels FY18, FY18, FY19]


Select DLHub Use Cases

Model-driven Experimentation and Data Tagging

• XRD beamline image tagging (Yager, BNL)

Community Model Benchmarking

• Crystal structure (Ward, ANL/UC)

• NIST PFHub (Wheeler, Warren, Heinonen; NIST/UC/Argonne/NU)

Automated Model Retraining with New Data

• Metallic glass discovery [active learning] (Ward, ANL/UC)

• Models linked to dynamic data sources

(Center for Hierarchical Materials Design, NIST/UC/Argonne/NU)


DLHub Servables and Pipelines

[Figure: a servable, packaged in a Singularity or Docker container, exposes methods such as .run() and .test(). A pipeline chains Preprocess 1 .run(), Preprocess 2 .run(), and Model predict .run(). Workflow stages: Collect, Train, Serve.]
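The servable and pipeline idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the DLHub implementation: the `Servable` and `Pipeline` classes, their constructor arguments, and the three toy steps are all hypothetical names chosen for the example. It only shows how objects exposing `.run()` and `.test()` can be chained so that each step's output feeds the next.

```python
from typing import Any, Callable, List


class Servable:
    """Hypothetical stand-in for a servable: a named function with run() and test()."""

    def __init__(self, name: str, func: Callable[[Any], Any],
                 test_input: Any, test_output: Any):
        self.name = name
        self._func = func
        self._test_input = test_input
        self._test_output = test_output

    def run(self, data: Any) -> Any:
        # Invoke the wrapped function on the input data.
        return self._func(data)

    def test(self) -> bool:
        # Check the servable against a stored input/output pair.
        return self.run(self._test_input) == self._test_output


class Pipeline:
    """Chains servables so each step's output becomes the next step's input."""

    def __init__(self, steps: List[Servable]):
        self.steps = steps

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step.run(data)
        return data


# Toy steps standing in for "Preprocess 1", "Preprocess 2", and "Model predict".
strip = Servable("preprocess_1", lambda xs: [x.strip() for x in xs], [" a "], ["a"])
lengths = Servable("preprocess_2", lambda xs: [len(x) for x in xs], ["ab"], [2])
predict = Servable("model_predict", lambda ns: sum(ns), [1, 2], 3)

pipeline = Pipeline([strip, lengths, predict])
result = pipeline.run([" Zr ", "Co"])
```

In DLHub each step would run inside its own container; here the steps are plain in-process functions so the chaining logic stays visible.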


DLHub Architecture

• REST API with Python SDK (available) / CLI (delivery in Nov. 2018)

– Support model markup, data staging, registration, and invocation

• Model Repository

– Container registry

– Advanced search functions

– Identifier minting capabilities

[Figure: DLHub architecture diagram. Labels: CLI, SDK, REST, DLHub Management Service, Model Repository, zmq, Task Manager, Executor, TF Serving, SageMaker, Parsl, Model Serving, Servable, Node, Key]

https://github.com/DLHub-Argonne


DLHub Architecture

• Task Managers (TM) to support execution on various compute resources

• Executors chosen by TM to invoke a given servable

• Caching at TM

• Data staging with Globus

• Batch submissions

• Scalability through deployment of model replicas
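The Task Manager behaviour listed above (executor selection, caching, batch submission) can be illustrated with a small sketch. Everything here is an assumption made for the example: the `TaskManager` class, its `routing` table, and `toy_executor` are hypothetical and do not reflect DLHub's actual code or its zmq transport.

```python
from typing import Any, Callable, Dict, Hashable, List, Tuple


class TaskManager:
    """Hypothetical task manager: picks an executor per servable and caches results."""

    def __init__(self, executors: Dict[str, Callable[[str, Any], Any]],
                 routing: Dict[str, str]):
        self.executors = executors  # executor name -> callable(servable, data)
        self.routing = routing      # servable name -> executor name
        self.cache: Dict[Tuple[str, Hashable], Any] = {}
        self.executor_calls = 0

    def invoke(self, servable: str, data: Hashable) -> Any:
        key = (servable, data)
        if key in self.cache:
            # Cache hit: skip the executor entirely.
            return self.cache[key]
        executor = self.executors[self.routing[servable]]
        self.executor_calls += 1
        result = executor(servable, data)
        self.cache[key] = result
        return result

    def invoke_batch(self, servable: str, batch: List[Hashable]) -> List[Any]:
        # Items are served one by one here; a real system could forward the
        # whole batch to the executor in a single request.
        return [self.invoke(servable, item) for item in batch]


def toy_executor(servable: str, data: Any) -> Any:
    # Placeholder for an executor such as Parsl or TF Serving.
    return f"{servable}({data})"


tm = TaskManager({"parsl": toy_executor}, {"featurize": "parsl"})
outputs = tm.invoke_batch("featurize", ["NaCl", "SiO2", "NaCl"])
```

The repeated `"NaCl"` request is answered from the cache, so the executor runs only twice for three inputs.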


DLHub Model Registration and Publication

[Figure: user workflow with DLHub: Collect Data, Train Model, Register Model, Send to DLHub]

• Register model metadata, weights, and files to improve discoverability and reusability

• Containerize model to enhance interoperability

• Identify model with a permanent identifier (e.g., DOI, minid)

• Version model and data pre/post processing steps

• Deploy model with simplified interfaces for users

• Control access to model metadata and usage

• (future) Automate retraining and testing when new data are available


Marking up a Model – Python SDK

Existing Model → User Mark Up with SDK → Send to DLHub (via Globus or HTTPS) → DLHub Containerization → Populate Search Index / Mint Identifiers

SDK Extracts Metadata for Known Model Types
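One way to picture automatic metadata extraction for a known model type is introspection of a plain Python function. The sketch below uses only the standard `inspect` module; the helper `extract_function_metadata`, the field names in the returned dictionary, and the toy `predict_band_gap` function are illustrative assumptions, not the DLHub SDK's schema or API.

```python
import inspect
from typing import Any, Callable, Dict


def extract_function_metadata(func: Callable[..., Any]) -> Dict[str, Any]:
    """Collect basic descriptive metadata from a plain Python function."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputs": list(signature.parameters),
        "module": func.__module__,
    }


def predict_band_gap(composition: str, temperature: float = 300.0) -> float:
    """Toy model: returns a constant, standing in for a trained predictor."""
    return 1.0


metadata = extract_function_metadata(predict_band_gap)
```

A real SDK would add type-specific details (for example, layer shapes for a neural network), which the user could then supplement by hand during markup.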


Python SDK – Automated Metadata Generation

Citation Metadata, DLHub Metadata, Servable Metadata

Access Control

• Public

• Globus users

• Globus groups
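As a rough illustration of how the three metadata blocks and the access-control options might fit together, the sketch below defines a record and a visibility check. The `ModelRecord` class, the `visible_to` field, and the `user:` / `group:` prefixes are conventions assumed for this example only; they are not taken from the DLHub schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ModelRecord:
    """Hypothetical record with the three metadata blocks named on the slide."""
    citation: Dict[str, object]
    dlhub: Dict[str, object]
    servable: Dict[str, object]
    visible_to: List[str] = field(default_factory=lambda: ["public"])


def can_access(record: ModelRecord, user_id: str, user_groups: Set[str]) -> bool:
    # Entries are "public", "user:<id>", or "group:<id>" (an assumed convention).
    for entry in record.visible_to:
        if entry == "public":
            return True
        if entry == f"user:{user_id}":
            return True
        if entry.startswith("group:") and entry[len("group:"):] in user_groups:
            return True
    return False


record = ModelRecord(
    citation={"title": "Toy model", "authors": ["A. Researcher"]},
    dlhub={"name": "toy_model", "version": "0.1"},
    servable={"type": "Python function", "methods": ["run"]},
    visible_to=["user:alice", "group:materials"],
)
```

With this record, `alice` is admitted by user entry, any member of the `materials` group is admitted by group entry, and everyone else is refused.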


DLHub Model Discovery and Usage

• Find curated and tested models

• Use models through simple interfaces

[Figure: user workflow with DLHub: Find Model, Gather Data, Send Data, Call DLHub, Model Output]


Predicting Glass-forming Ability

▪ Where are the model and trained weights?

▪ How do I run the model on my data?

▪ How can I retrain the model on new data?

▪ How can I build on this work?

[Figure: Material compositions → DLHub (model / transform containers) → Predicted glass-forming ability]

DOI: 10.1126/sciadv.aaq1566


Predicting Glass-forming Ability

[Figure: ["Zr", "Co", "V"] → DLHub → Predicted glass-forming ability]

DOI: 10.1126/sciadv.aaq1566
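To make the input on this slide concrete, the sketch below shows one plausible way a list of three elements could be expanded into candidate compositions and scored. The grid expansion, the function names, and especially `toy_glass_score` are assumptions for illustration; the scoring function is a placeholder and has no relation to the published model.

```python
from itertools import product
from typing import Dict, List


def ternary_grid(elements: List[str], steps: int = 10) -> List[Dict[str, float]]:
    """Enumerate compositions of three elements on a regular grid of atomic fractions."""
    if len(elements) != 3:
        raise ValueError("expected exactly three elements")
    grid = []
    for i, j in product(range(steps + 1), repeat=2):
        k = steps - i - j
        if k < 0:
            continue  # fractions must sum to one
        fractions = (i / steps, j / steps, k / steps)
        grid.append(dict(zip(elements, fractions)))
    return grid


def toy_glass_score(composition: Dict[str, float]) -> float:
    # Placeholder for the trained model: rewards evenly mixed compositions.
    return 1.0 - max(composition.values())


compositions = ternary_grid(["Zr", "Co", "V"])
best = max(compositions, key=toy_glass_score)
```

In a served pipeline, the grid generation, featurization, and prediction would each be a separate step, and the caller would only send the element list.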


Analyzing Beamline Images

[Figure: DLHub returns image tags for beamline images]

• Stage data into containers via Globus HTTPS

• Pass valid token and data location
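Passing a token and a data location rather than the data itself might look like the request below. This is a sketch under stated assumptions: the service URL, the servable name, and the JSON field names are placeholders invented for the example, and the request is only constructed, never sent.

```python
import json
import urllib.request


def build_invocation_request(service_url: str, servable: str,
                             token: str, data_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that points the service at staged data."""
    body = json.dumps({"servable": servable, "data_location": data_url}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_invocation_request(
    "https://dlhub.example.org/run",                   # placeholder URL
    "beamline_image_tagger",                           # placeholder servable name
    "REPLACE_WITH_TOKEN",                              # placeholder access token
    "https://data.example.org/images/scan_0001.tiff",  # placeholder data location
)
payload = json.loads(request.data)
```

Sending only a location keeps large images out of the request body; the serving side can then fetch the file over HTTPS using the supplied token.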


Data and Learning Hub (DLHub): Pipelines

[Figure: pipeline of Preprocess 1, Preprocess 2, and Model predict. Workflow stages: Collect, Train, Serve.]


Pipelines: Predicting Formation Enthalpy


Predicting Formation Enthalpy


DLHub Performance


DLHub Performance

Scale Testing: The time required for the Inception, CIFAR-10, and Matminer-featurize models to process 5000 inferences with varying numbers of replicas.

Batching: Servable invocation time, with and without batching.
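The effect of batching can be seen by counting how many invocations are needed for the same 5000 inputs. The sketch below is illustrative only: `toy_model`, the helper names, and the batch size of 100 are assumptions, and no timing figures from the slides are reproduced.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[List[float]], List[float]]


def run_unbatched(model: Model, inputs: Sequence[float]) -> Tuple[List[float], int]:
    # One invocation per input: per-call overhead is paid every time.
    outputs: List[float] = []
    calls = 0
    for x in inputs:
        outputs.extend(model([x]))
        calls += 1
    return outputs, calls


def run_batched(model: Model, inputs: Sequence[float],
                batch_size: int) -> Tuple[List[float], int]:
    # One invocation per chunk: overhead is shared across the batch.
    outputs: List[float] = []
    calls = 0
    for start in range(0, len(inputs), batch_size):
        outputs.extend(model(list(inputs[start:start + batch_size])))
        calls += 1
    return outputs, calls


def toy_model(batch: List[float]) -> List[float]:
    # Placeholder for a served model.
    return [2.0 * x for x in batch]


inputs = [float(i) for i in range(5000)]
single_outputs, single_calls = run_unbatched(toy_model, inputs)
batched_outputs, batched_calls = run_batched(toy_model, inputs, batch_size=100)
```

Both paths return identical outputs; the batched path simply makes far fewer calls, which is where the savings in per-invocation overhead come from.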


DLHub Performance

Serving General Models: Performance of different serving systems on the Inception and CIFAR-10 problems.

Caching: Performance impact of caching in DLHub. Bars and error bars show median and 5th/95th percentiles.
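The summary statistics named in the caching caption (median with 5th and 95th percentiles) can be computed with Python's standard library. The latency values below are synthetic numbers made up for the example, not measurements from the slides, and `summarize` is a hypothetical helper.

```python
import statistics
from typing import Dict, Sequence


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the median and the 5th/95th percentiles of a latency sample."""
    # n=20 yields 19 cut points at 5% steps: the first is the 5th percentile
    # and the last is the 95th percentile.
    cuts = statistics.quantiles(latencies_ms, n=20)
    return {
        "p5": cuts[0],
        "median": statistics.median(latencies_ms),
        "p95": cuts[-1],
    }


# Synthetic sample: 1 ms, 2 ms, ..., 100 ms.
sample = [float(v) for v in range(1, 101)]
summary = summarize(sample)
```

Reporting percentiles instead of a mean and standard deviation keeps a few slow requests from dominating the error bars.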


DLHub Summary

Model deposit and discovery

- Developed a model schema to promote discovery

- Implemented advanced search and filtering

- Built ingest flow: models are dynamically staged, packaged, dockerized, published, and indexed

Model serving

- Deployed capabilities for users to run inference with SDK and CLI

- Automated testing of containers

- Implemented caching and batching

Support for multiple execution sites

- PetrelKube: Parsl, TF Serving, SageMaker

- Other: AWS, OSG

Authentication

- Protected model metadata and inference with Globus Auth

- Secured data staging

Monitoring and statistics

- Request, invocation, data staging

Future work

- Dynamic scaling by load

- Build Web UI to create pipelines and invoke models

- Cache at the servable level within pipelines

- Couple DLHub to data sources (MDF, etc.)

- Integrate with ML frontend tools (DeepForge), optimization tools (DeepHyper), and more

- Create interface for training and retraining of models


Thanks to our sponsors!

U.S. Department of Energy, ALCF DF, Parsl, Globus, IMaD, DLHub, Argonne LDRD

