Astronomy, Petabytes, and MySQL

MySQL Conference
Santa Clara, CA
April 16, 2008

Kian-Tat Lim
Stanford Linear Accelerator Center

Outline

LSST
LSST Database
LSST Database + MySQL

LSST

What Is It?
Why Build It?

Telescope

Proposed telescope to be built in Chile

Large

3.2 gigapixel camera

8.4 meter diameter mirror

Synoptic Survey

Wide

Deep

Fast

Dark Matter and Energy

Photo: J. A. Tyson, W. Colley, E. L. Turner, and NASA

Variable Objects

Transient Objects

Moving Objects

Photo: D. Roddy, Lunar and Planetary Institute

LSST Database

What’s In It?
How Big?
How Often?
What Queries?
Unusual Needs

Database: Components

Image Metadata
Moving Objects Catalog
Object Catalog
Source Catalog
Difference Image Source Catalog
Provenance
Statistics
Summaries
Calibration
Engineering and Facility Database

Astronomical Objects

Sources

Changes

Image Metadata

Calibration and Facility

Sagans of Rows

49 billion objects

2.8 trillion sources

Lots of Columns

308 columns for objects

56 columns for sources

(for now)

Database Size

Grows to >14 PB

Frequency

Nightly updates

Semi-annual data releases

Queries

• All about an object
• All objects meeting criteria
• All objects near objects meeting criteria
• All objects with interesting time series
• All pairs of objects with similar time series
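
To make the third query type concrete, here is a minimal Python sketch, not LSST code. It assumes that "near" means angular separation on the sky, that each object is a dict with hypothetical `id`, `ra`, `dec`, and `mag` fields (positions in degrees), and that a brute-force comparison is acceptable for a handful of rows.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def objects_near(objects, meets_criteria, radius_deg):
    """Return objects within radius_deg of some *other* object meeting the criteria."""
    targets = [o for o in objects if meets_criteria(o)]
    return [
        o for o in objects
        if any(
            o is not t
            and angular_separation_deg(o["ra"], o["dec"], t["ra"], t["dec"]) <= radius_deg
            for t in targets
        )
    ]

# Hypothetical rows, invented for illustration only.
sample_objects = [
    {"id": "A", "ra": 10.00, "dec": 0.0, "mag": 15.0},
    {"id": "B", "ra": 10.01, "dec": 0.0, "mag": 22.0},
    {"id": "C", "ra": 50.00, "dec": 20.0, "mag": 23.0},
]
neighbors = objects_near(sample_objects, lambda o: o["mag"] < 16, radius_deg=0.05)
```

The nested loop compares every object with every target, which is only workable at toy scale. With billions of rows, a query of this kind would need some way to restrict the comparison to nearby regions of the sky.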

Unusual Needs

Flexibility

Provenance

LSST Database + MySQL

Why MySQL?
Scalability?
Performance?

MySQL

Relational database management system

Open Source

Vibrant community

Strong company support

Hardware

Runs on commodity hardware

In-Memory Tables

Needed for near-real-time processing

“MySQL Grid”

Partitioning

Large tables partitioned spatially
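
One way to picture spatial partitioning is a fixed grid over the sky, with each row assigned to a partition by its position. The Python sketch below is an assumption for illustration, not the scheme LSST chose: the function name `partition_id`, the grid of declination stripes and right-ascension chunks, and the default grid sizes are all hypothetical.

```python
def partition_id(ra_deg: float, dec_deg: float,
                 num_stripes: int = 18, chunks_per_stripe: int = 36) -> int:
    """Map a sky position (degrees) to a partition number on a fixed grid.

    Declination is cut into equal-height stripes and each stripe into
    equal-width right-ascension chunks. Both counts are illustrative.
    """
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be within [-90, 90] degrees")
    ra = ra_deg % 360.0                      # wrap right ascension into [0, 360)
    stripe_height = 180.0 / num_stripes
    chunk_width = 360.0 / chunks_per_stripe
    # min() keeps the upper edges (dec = +90, rounding at ra = 360) in range.
    stripe = min(int((dec_deg + 90.0) / stripe_height), num_stripes - 1)
    chunk = min(int(ra / chunk_width), chunks_per_stripe - 1)
    return stripe * chunks_per_stripe + chunk
```

Rows that are close on the sky usually land in the same partition, so a query restricted to a small region only needs to touch a few partitions. Equal-width chunks cover less sky area near the poles, which a real scheme would probably want to correct for.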

Replication

Dimension tables likely replicated

Needs: Distributor/Combiner

LSST will build prototype
Need long-term support
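
The distributor/combiner idea can be sketched as scatter-gather: send the same query to every partition, then merge the partial answers. The Python below is a toy under stated assumptions, not the LSST prototype. Partitions are plain in-memory lists, the row field `flux` and the function names are hypothetical, and threads stand in for separate database servers.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_partition(rows, predicate):
    """Worker side: evaluate the query on one partition and return a partial result."""
    matches = [r for r in rows if predicate(r)]
    return {"count": len(matches), "flux_sum": sum(r["flux"] for r in matches)}

def distribute_and_combine(partitions, predicate):
    """Distributor: fan the query out. Combiner: merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda rows: run_on_partition(rows, predicate),
                                 partitions))
    count = sum(p["count"] for p in partials)
    flux_sum = sum(p["flux_sum"] for p in partials)
    # A mean cannot be merged by averaging per-partition means, so each
    # worker returns a sum and a count and the division happens once here.
    return {"count": count, "mean_flux": flux_sum / count if count else None}

# Hypothetical data: three partitions, one of them empty.
partitions = [[{"flux": 1.0}, {"flux": 3.0}], [{"flux": 5.0}], []]
result = distribute_and_combine(partitions, lambda r: r["flux"] >= 3.0)
```

The point of the sketch is the combiner step: each aggregate needs a mergeable partial form, which is part of what makes a general-purpose combiner more than a simple concatenation of result sets.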

Per-Column Indexing

2X data size

Needs: Optimizer

Efficient use of multiple (20-30) indexes

Needs: Indexes

Bitmap/compressed indexes
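
For readers unfamiliar with bitmap indexes, the following Python sketch shows the basic idea using integers as bit sets. It is an illustration only and does not use any MySQL API: the column names and values are invented, and compression is not shown.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Build one bitmap per distinct value; bit i is set when row i holds that value."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)

def matching_rows(bitmap):
    """Expand a bitmap back into a sorted list of row ids."""
    rows, row_id = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows

# Hypothetical low-cardinality columns, five rows each.
filter_col = ["g", "r", "g", "i", "r"]
class_col = ["star", "galaxy", "galaxy", "star", "galaxy"]

filter_index = build_bitmap_index(filter_col)
class_index = build_bitmap_index(class_col)

# WHERE filter = 'r' AND class = 'galaxy' becomes a single bitwise AND.
combined = filter_index.get("r", 0) & class_index.get("galaxy", 0)
hits = matching_rows(combined)
```

Because predicates on different columns combine with bitwise AND and OR, many single-column indexes can be used together in one query without walking each index separately.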

Needs: Storage Engine

“Shared scan” for long-running full-table queries
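
A shared scan reads the table once and evaluates every pending full-table query against each row as it goes by, instead of reading the table once per query. The Python sketch below is a simplified illustration, not a storage engine: the names are hypothetical, all queries are assumed to be registered before the scan starts, and queries joining a scan already in progress are not handled.

```python
def shared_scan(rows, queries):
    """Run several filter queries over a table in a single pass.

    rows    -- any iterable of row dicts (stands in for a sequential table read)
    queries -- mapping of query name to a predicate function
    Returns the per-query results and the number of rows actually read.
    """
    results = {name: [] for name in queries}
    rows_read = 0
    for row in rows:
        rows_read += 1                      # one read, shared by every query
        for name, predicate in queries.items():
            if predicate(row):
                results[name].append(row)
    return results, rows_read

# Hypothetical table and queries.
table = [{"mag": 18.0}, {"mag": 22.5}, {"mag": 24.5}]
queries = {
    "bright": lambda r: r["mag"] < 20.0,
    "faint": lambda r: r["mag"] >= 24.0,
}
results, rows_read = shared_scan(table, queries)
```

The number of rows read stays equal to the table size no matter how many queries are attached, which is the appeal when the table is too large to scan once per query.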

Summary

Building a petabyte DB

MySQL can be a core component

