Tom Barclay Jim Gray, Don Slutz, Greg Smith, many others Microsoft Research SPIN-2.

Date post: 27-Mar-2015
Page 1

Tom Barclay

Jim Gray, Don Slutz, Greg Smith, many others

Microsoft Research

SPIN-2

Page 2

Scaleup - Big Database

Build a 1 TB SQL Server database
Data must be
– 1 TB
– Unencumbered
– Interesting to everyone everywhere
– And not offensive to anyone anywhere
Loaded
– 1.1 M place names from Encarta World Atlas
– 1 M Sq Km from USGS (1 meter resolution)
– 2 M Sq Km from Russian Space agency (2 m)
Will be on web (world’s largest atlas)
Sell images with commerce server.

Page 3

What’s a Terabyte?

1 Terabyte
– 1,000,000,000 business letters: 150 miles of book shelf
– 100,000,000 book pages: 15 miles of book shelf
– 50,000,000 FAX images: 7 miles of book shelf
– 10,000,000 TV pictures (mpeg): 10 days of video
– 4,000 LandSat images: 16 earth images (100m)

Library of Congress (in ASCII) is 25 TB
1980: 200 M$ of disc (10,000 discs), 5 M$ of tape silo (10,000 tapes)
1998: 100 k$ of magnetic disc (60 discs), 50 K$ nearline tape (30 tapes)

Terror Byte !!

Page 4

Some Other Terror-Byte Databases

Scale: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta

TerraServer
Sloan Digital Sky Survey:
– 40 TB raw, 2 TB cooked
EOS/DIS (picture of planet each week)
– 15 PB by 2007
Federal Reserve Clearing house: images of checks
– 15 PB by 2006 (7 year history)
Nuclear Stockpile Stewardship Program
– 10 Exabytes (???!!)

Page 5

TerraServer is:

An on-line demo and sales tool directed at IT customers and ISVs
A test of the Sphinx VLDB features:
– Load performance
– Online Backup/Restore
– Query Performance
A “cool 90s app”
– Image and Text data
– Web-lication
– Electronic Commerce
“A shameless advertisement of WNT and SQL Server Scalability”

Page 6

Application Requirements

BIG —1 TB of data.

PUBLIC — available on the world wide web.

INTERESTING — to a wide audience

ACCESSIBLE — using standard browsers (IE, Netscape)

REAL — a real application (users can buy imagery)

FREE —cannot require NDA or money to access

FAST — impress customers for BackOffice, StorageWorks

EASY — Inexpensive to develop, deploy, and maintain

Page 7

Project Partners: Motivation

– Distribute DOQs to a wider audience; lower cost of distribution
– Demo scope & quality of Spin-2 imagery; open new markets for imagery sales (SPIN-2)
– Demo DEC Alpha & StorageWorks™ scalability; recognized as superior h/w vendor
– Demo scalability of NT & SQL Server

Page 8

Database & App UI

Coverage: Range from 70ºN to 70ºS
– 35% U.S., 1% outside U.S.
Source Imagery:
– 3.5 TB 1 sq meter/pixel Aerial (USGS - 60,000 46 MB B&W - 151 MB Color IR files)
– 700 GB 1.56 meter/pixel Satellite (Spin-2 - 2400 300 MB B&W)
Display Imagery: 80 m 225 x 150 pixel images, 1.6 m x 3 sub-sampled views
Nav Tools:
– 1.5 m place names
– “Click-on” Coverage map
– Expedia & Virtual Globe map

[Figure labels]
– 1.8x1.2km 32m “city view”
– 1.8x1.2km 16m thumbnail
– 1.8x1.2km 8m browse
– 225x150m tile

Concept: User navigates an ‘almost seamless’ image of earth

Page 9

TerraServer Demo

Intranet Beta Sites:
– http://terraweb1
– http://terraweb2
Internal Beta Schedule
– Mon April 27 - June 23

Page 10

What Microsoft & DEC Contribute

Microsoft’s contribution:
– Build an “internet UI”
– Design the app and the database
– Slice & Dice & Load the data.
– Build “electronic stores” for USGS and for Aerial Images to operate, to sell & distribute images
– Run a “robust” web site for 18 months
Digital contribution:
– Provide high-performance processors
– Provide high capacity, reliable storage.
– Provide technical advice

Page 11

World’s Largest PC!

– 324 disks (2.4 TB)

– 8 x 440 MHz Alpha CPU

– 10 GB RAM

Page 12

Site Configuration

[Diagram labels]
– Alpha 8400 (8 x 440 MHz), 10 GB RAM
– Enterprise Storage Array: 9 HSZ70 Ultra-SCSI dual redundant controllers, 324 9.1 GB Seagate disks
– StorageTek: 6 Quantum DLT7000 drives, FWD SCSI
– Web servers: two Compaq 5500 (4 x 200 MHz)
– To the Web

Page 13

Software

[Architecture diagram labels]
– Web Client: browser, HTML, Java Viewer
– The Internet
– Terra-Server Web Site: Internet Information Server 4.0, Active Server Pages, MTS, Image Server, Terra-Server Stored Procedures, Sphinx (SQL Server), Terra-Server DB
– Automap Server: Microsoft Automap ActiveX Server, Internet Info Server 4.0
– Image Provider Site(s): Image Delivery Application, SQL Server 7, Microsoft Site Server EE, Internet Information Server 4.0

Page 14

How We Did It

“Chopped” big images into small “tiles”
– Sub-sampled tiles to create zoom levels
– Tile sizes map to Lat/Lon system
– Unique ID assigned to each Tile location (Z-transform of lat/long or UTM)
– Unique ID clusters adjacent tiles onto the same database & index pages
Wrote Load Management program
– Runs image cutting job
– Loads meta and image data into SQL
– Multiple Loaders can run in parallel
– Web Active Server Page controls load process
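
The slide does not spell out the "Z-transform" used for tile IDs. A minimal sketch, assuming it refers to Z-order (Morton) bit interleaving of integer grid coordinates, shows why such an ID tends to place neighbouring tiles near each other in key order. The function name `interleave_bits` and the 16-bit coordinate width are hypothetical, not taken from TerraServer.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits land on odd positions
    return key


# The four tiles of a 2x2 block aligned on even coordinates
# receive four consecutive keys, so they sort next to each other.
block = sorted(interleave_bits(x, y) for x in (2, 3) for y in (4, 5))
print(block)
```

With a clustered index on such a key, tiles that are adjacent on the ground are likely to share database and index pages, which is the effect the slide describes.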

Page 15

USGS Editing Process

[Diagram: a grid spanning 1 Degree Latitude by 1 Degree Longitude, with quadrangle cells numbered 1 to 64]
– 1 Quadrangle (7.5’ x 7.5’)
– 1 “QUAD” DOQ Photo (3.75’ x 3.75’)
– DOQQ Origin Point
– Quad Cut 3x6 into Jump, Thumbnail & Browse Images (cells numbered 1 to 18)
– DOQ Tiles

Page 16

Spin-2 Image Editing Process

– 48 x 96 cells per sq degree
– Image aligned to left corner of grid system
– Non-image squares (all white) are discarded
– Cut Images are extracted
– SubSample into Jump, Browse, and Thumb images (resolutions shown: 8m, 16m, 32m)
– Tiles are cut 5x5, scrambled, output Jpeg

Page 17

Spin-2 Meta Data

– File name (of image)
– City [1]
– State [1]
– Country
– Number of Rows
– Number of Columns
– Shooting Height
– Height of Sun
– Date of survey (mm/dd/yyyy)
– Time of survey (GMT) (hr:mn:ss)
– Upper Left Latitude
– Upper Left Longitude
– Lower Right Latitude
– Lower Right Longitude
– Camera System [1]
– Pixel size [1]
– Copyright [1]

[1] Field is not required; if not present, then a blank field is present

Semi-colon delimited fields, ASCII encoding
1 record per line
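
The record layout above (semicolon-delimited ASCII, one record per line, optional fields left blank) can be read with a few lines of code. This is a sketch under stated assumptions: the field order is taken to be the order listed on the slide, the Python key names are hypothetical, and the sample record is invented for illustration only.

```python
# Field order assumed to match the slide's listing; key names are hypothetical.
FIELDS = [
    "file_name", "city", "state", "country", "rows", "columns",
    "shooting_height", "sun_height", "survey_date", "survey_time",
    "ul_lat", "ul_lon", "lr_lat", "lr_lon",
    "camera_system", "pixel_size", "copyright",
]
# Fields the slide marks as "not required" (blank when absent).
OPTIONAL = {"city", "state", "camera_system", "pixel_size", "copyright"}


def parse_record(line: str) -> dict:
    """Split one semicolon-delimited record and check the required fields."""
    values = [v.strip() for v in line.rstrip("\r\n").split(";")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name, value in record.items():
        if value == "" and name not in OPTIONAL:
            raise ValueError(f"required field {name!r} is blank")
    return record


# Invented sample record: city, state, camera system and copyright are blank.
sample = ("img0001.tif;;;Exampleland;8000;8000;220;35;"
          "06/15/1992;08:30:00;55.9;37.3;55.6;37.9;;2;")
record = parse_record(sample)
print(record["country"], repr(record["city"]), record["pixel_size"])
```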

Page 18

Database Design and Load

Build a 1 TB (2**40 B) SQL Server Database
Database includes
– Gazetteer data for searching
– Image data pyramid and metadata
Load the Database
– Chop the big images into tiles
– BCP data and metadata in
– Allow for restart and undo of loads
– Create indexes
– Check consistency of the data
Keep it Simple, no Tricks, Test the Scaling

Page 19

Jump image: 1 pixel = 32x32 m²

USGS Tile image (DOQ of Washington Monument): 1 pixel = 1 sq meter

Dithered Thumb image: 1 pixel = 8x8 m²

Dithered Browse image: 1 pixel = 16x16 m²

64:11:11:1

The Image Pyramid

Zooming in on the Washington Monument
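
The slides do not say how the coarser pyramid levels were computed. A minimal sketch, assuming simple block averaging of a 1 m/pixel grayscale tile (the dithering mentioned above is not shown), illustrates how 8 m, 16 m, and 32 m levels could be derived. The function names are hypothetical.

```python
def subsample(image, factor):
    """Shrink a grayscale image (list of rows) by averaging factor x factor blocks."""
    height = len(image) // factor
    width = len(image[0]) // factor
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            block = [image[r * factor + dr][c * factor + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) // len(block))  # integer mean of the block
        out.append(row)
    return out


def build_pyramid(tile, factors=(8, 16, 32)):
    """Return one sub-sampled image per factor (metres per pixel for 1 m input)."""
    return {f: subsample(tile, f) for f in factors}


# Synthetic 64x64 test image standing in for a 1 m/pixel source.
tile = [[(r + c) % 256 for c in range(64)] for r in range(64)]
pyramid = build_pyramid(tile)
sizes = {f: (len(img), len(img[0])) for f, img in pyramid.items()}
print(sizes)
```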

Page 20

‘Logical’ Schema

Gazetteer: Country, State, Place, PlaceType, FeatureType

Star schema. Index on
• image, place, type
• image, state, type
• image, state, country, type
• image, place, state, type
• image, place, country, type
All lookups are fast.

Image Data & Meta Data: ImgMeta, TileMeta, Jump Img, Browse Img, Thumb Img, Tile Img, Theme Meta Information, TileLog

Lookup by UGrid or ZGrid ID plus resolution (Lat/Long mapped to U/ZGridId)
Lookups are fast.
Indices are in DRAM (auto-magically by SQL)
SQL manages all the tiles and indices
Images are brought in on demand

Page 21

Gazetteer Design

Place (1,089,897 rows): PlaceID, Name, AlternateName, CountryID, StateID, TypeID, GazSourcID, Latitude, Longitude, UGridID, ZGridID, DOQdate, SPIN2date, ImageFlag
FeatureType (13 rows): TypeID, Description
GazetteerSource (1 row): GazSrcID, Description
Country (264 rows): CountryID, CountryName, UNcode
State (1083 rows): StateID, CountryID, StateName
CountrySearch (1148 rows): AlternateName, CountryID, GazSrcID
StateSearch (3776 rows): AlternateName, CountryID, StateID, FeatureID, GazSrcID
PlaceGrid (50,000,000 rows): ZGridID, BestPlaceName, XDistance, YDistance

Classic Snowflake Schema
Top 10 Hint to RE for Cursor Select

Page 22

1

Image Data Design

ImgSource (2 rows): SrcID, SrcName, SrcTblName, SrcDescription, GridSysID, ImgTypeID
ImgType (4 rows): ImgTypeID, ImgFileDesc, ImgFileExt, MimeStr
Pick (10 rows): Name, Description, Link, PickDate
OriginalMetaData (650 k SPIN2, 2 M USGS): OrigMetaID, SrcID, ImageSourceAgency, SourcePhotoID, SourcePhotoDate, SourceDEMDate, MetaDataDate, ProductionSystem, ProductionDate, DataFileSize, CompressionHeaderBytes, … 80 other fields
ImageMeta (650 k SPIN2, 2 M USGS): ImgMetaID, OrigMetaID, ImgStatus, ImgDate, ImgTypeID, JumpPixHeight, JumpPixWidth, BrowsePixHeight, BrowsePixWidth, ThumbPixWidth, ThumbPixHeight, CutCol, CutRow, MidLat, MidLong, NELat, NELong, NWLat, NWLong, SELat, SELong, SWLat, SWLong, UGridID, UTMZone, XUtmID, YUtmID, XGridID, YGridID, ZGridID
TileMeta (16 M SPIN2, 96 M USGS): ImgMetaID, OrigMetaID, SrcID, ImgStatus, ImgDate, ImgTypeID, TilePixHeight, TilePixWidth, CutCol, CutRow, MidLat, MidLong, NELat, NELong, NWLat, NWLong, SELat, SELong, SWLat, SWLong, UGridID, UTMZone, XUtmID, YUtmID, XGridID, YGridID, ZGridID
Jump (.65 M SPIN2, 1.5 M USGS): UGridID, ZGridID, ZTileGridID, ImgData, ImgDate, ImgTypeID, ImgMetaID, SrcID, EncryptKey, File Name
Browse (.65 M SPIN2, 1.5 M USGS): UGridID, ZGridID, ZTileGridID, ImgData, ImgDate, ImgTypeID, ImgMetaID, SrcID, EncryptKey, File Name
Thumb (.65 M SPIN2, 1.5 M USGS): UGridID, ZGridID, ZTileGridID, ImgData, ImgDate, ImgTypeID, ImgMetaID, SrcID, EncryptKey, File Name
Tile (16 M SPIN2, 96 M USGS): UGridID, ZGridID, ZTileGridID, ImgData, ImgDate, ImgTypeID, ImgMetaID, SrcID, EncryptKey, File Name
UGridHits (xxx rows): URL, UGridID, ZTileGridID, count
Log (xxx rows): URL, Time, <extensive list of action parameters>

Image pyramid stored in DBMS (250 M recs)

Page 23

TerraServer File Group Design

Make 28 RAID5 sets from 324 disks
– Each raid set has 11 disks (16 spare drives)
Make 4 595 GB NT volumes
– Each striped over 7 Raid sets on 7 controllers
Create 26 20,000 MB files on F:, 27 on G:
– DB is File Group of 53 files (1.011 TB)

[Diagram: seven HSZ70 controller pairs (A and B) serving NT volumes F:, G:, H:, I:]
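
The numbers on this slide can be cross-checked with a little arithmetic. The sketch below uses only figures from the slide; the one assumption is that "TB" here means 2**20 MB, which is what makes 53 files of 20,000 MB come out at 1.011 TB.

```python
TOTAL_DISKS = 324
RAID_SETS = 28
DISKS_PER_SET = 11
VOLUMES = 4
SETS_PER_VOLUME = 7

disks_in_sets = RAID_SETS * DISKS_PER_SET   # disks consumed by the RAID5 sets
spares = TOTAL_DISKS - disks_in_sets        # what is left over as spares
sets_used = VOLUMES * SETS_PER_VOLUME       # RAID sets consumed by the NT volumes

FILE_MB = 20_000
files = 26 + 27                             # files on F: plus files on G:
file_group_tb = files * FILE_MB / 1024**2   # MB to TB, assuming 1 TB = 2**20 MB

print(spares, sets_used, files, round(file_group_tb, 3))
```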

Page 24

Physical Database

53 Files. 20,000 MB each
16,960,000 extents
135,680,000 pages
Separate tables for DOQ, Spin ‘Themes’
Each image stored in column of type ‘image’
All tile images in one (big) table
A number of indexes too

Page 25

TerraServer Tables

USGS DOQ Data
– 48,000 DOQQ images (45-55 MB / image)
– Creates 864,000 Jump, Thumb, & Browse images (3.5 m rows)
– Creates 55.3 m Tile images (110.6 m rows)
SPIN-2 Data
– 3200 278 MB images (approximate size)
– Creates 620,800 Jump, Thumb, & Browse images (2.5 m rows)
– Creates 15.5 m Tile images (31 m rows)
Gazetteer Data
– 1.1 m named places (Encarta World Atlas)
– 45 m cell names

Total Rows = 193.7 M
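
The row total can be reproduced from the per-source figures. The cut ratios in the second half are inferences from other slides rather than statements on this one: 18 comes from the "3x6" quad cut, 64 from a 1.8x1.2 km browse image divided into 225x150 m tiles, and 25 from the 5x5 Spin-2 tile cut.

```python
rows_millions = {
    "USGS Jump/Thumb/Browse": 3.5,
    "USGS Tile": 110.6,
    "SPIN-2 Jump/Thumb/Browse": 2.5,
    "SPIN-2 Tile": 31.0,
    "Gazetteer named places": 1.1,
    "Gazetteer cell names": 45.0,
}
total_millions = round(sum(rows_millions.values()), 1)

# Inferred cut ratios (assumptions drawn from other slides, see lead-in).
images_per_doqq = 864_000 // 48_000   # 3 x 6 cut per DOQQ
usgs_tiles = 864_000 * 64             # 8 x 8 tiles per browse image
spin_tiles = 620_800 * 25             # 5 x 5 tiles per cut image

print(total_millions, images_per_doqq, f"{usgs_tiles:,}", f"{spin_tiles:,}")
```

Under these assumptions the tile counts land on the slide's rounded 55.3 m and 15.5 m figures.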

Page 26

The Loading Process

Includes Cutting Images, building BCP files, BCP meta data, BCP image data
First Load 1/97-5/97 for Scalability Day
– 190 GB actual image data, 800 GB duplicates
– Pre-beta Sphinx
Second Load 12/97-4/98 for Web Server
– 750 GB actual image data, all images recut

Page 27

Image Preparation and Load

[Diagram labels]
– DLT Tape “tar”, \Drop’N’, ImgCutter, \Images
– LoadMgr: DoJob, Wait 4 Load
– 100 Mbit Ether Switch
– AlphaServer 8400, Enterprise Storage Array (three groups of 108 9.1 GB drives), DB
– AlphaServer 4100 (two shown), ESA, 60 4.3 GB drives
– STC DLT Tape Library, DLT Tape, NT Backup

LoadMgr steps:
10: ImgCutter
20: Partition
30: ThumbImg
40: BrowseImg
45: JumpImg
50: TileImg
55: Meta Data
60: Tile Meta
70: Img Meta
80: Update Place
...

Page 28

Meta & Image Load Process

[Diagram: flow from NT Backup (*.IMD & *.JPG files) through the steps below into the tables]

Pre-Process Data: Read *.IMD files, Generate Ids, Generate ZLatLong, Sort by ZLatLong (produces Image Meta and Tile Meta)
Load Thumb Img: Read Image Meta, Read Image Data, BCP into ImgTbl
Load Browse Img: Read Image Meta, Read Image Data, BCP into ImgTbl
Load Tile Img: Read Tile Meta, Read Tile Data, BCP into TileTbl
Load Tile Meta: Read Image Meta, BCP into TileMeta
Load Img Meta: Read Image Meta, BCP into TileMeta

ImgMeta: ImgMetaId int, OrigMetaId int, SrcId int, ImgTypeId int, XGridId int, YGridId int, ImgDate Date, Hemisphere smallint, Continent smallint, xxLat smallint, xxLong smallint, ZLatLong int, MetaStr vchar(255)
TileMeta: TileMetaId int, ImgMetaId int, OrigMetaId int, SrcId int, ImgTypeId int, XGridId int, YGridId int, Hemisphere smallint, Continent smallint, xxLat smallint, xxLong smallint, ZLatLong int
“SRC”ThumbImg: ThumbImgId int, ImgMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob
“SRC”BrowseImg: BrowseImgId int, ImgMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob
“SRC”TileImg: TileImgId int, TileMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob

Page 29

The Load Manager

A Workflow System.
Manages Job ‘Steps’.
Built as an SQL Database App.
Collects Stats.
Would use Data Transformation Services today

Page 30

Load Statistics

601 DOQ Jobs, 818 Spin Jobs
– Each job does 3 meta BCP, 4 Image BCP steps
5676 Image BCP Steps
– 106 million total images loaded
– 546 GB total. 5.4 KB avg image size
For Tile Images (96% of the database)
– avg 68,000 images/step. max 757,000
– avg 33 minutes/step. max 596
– total time 796 hours (33 days)
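
These statistics are internally consistent, as a quick calculation shows. The only assumption is that GB and KB are binary units (2**30 and 2**10 bytes), which is what yields the slide's 5.4 KB average.

```python
doq_jobs, spin_jobs = 601, 818
image_bcp_steps = (doq_jobs + spin_jobs) * 4   # 4 image BCP steps per job

total_bytes = 546 * 2**30                      # 546 GB, binary units assumed
avg_image_kb = total_bytes / 106e6 / 1024      # average size over 106 million images

total_days = 796 / 24                          # 796 hours of tile loading

print(image_bcp_steps, round(avg_image_kb, 1), round(total_days))
```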

Page 31

System Maintenance: Backup & Recovery

Industrial Strength
– High Performance
– Online Backups
– Simple, Error Free Media Handling
– Minimal Recovery Time

Page 32

Project Phases & Characteristics

Load Phase
– Ongoing Massive Data Loads
– Updates to Fix Errors in Meta-Data
– Backups at Key Milestones
Deployed
– 7 x 24
– Some Updates to Existing Data
– Small Loads as More Data Arrives
– Infrequent Large Loads

Page 33

SQL Server 7.0 Backup/Restore Features

Fast
Online Backup Under Load
– Minimal Impact
Just the Data
Backup Part of the Database
Minimize Recovery Time
– Differential Backups, Log Backups
– Restore Only Damaged Files

Page 34

Backup ISVs Address Limitations

– Legato NetWorker™
– Computer Associates ArcServe™
– Seagate Backup Exec™
– Others…

These Products support SQL Server 6.5

None Support SQL Server 7.0 yet.

Page 35

Deployed 6/98...

ISV Supports SQL Server 7.0 High Performance Backup API

ISV Supports Full Range of SQL Server 7.0 Backup/Restore Features

Backup Software

Backup API

SQL Server

Tape Library

Page 36

Backup API Performance

[Chart: Avg CPU Usage in percent (axis 0 to 60) for Backup API (no write), PIPE (no write), and NUL]

[Chart: Throughput in MB/sec (axis 0 to 120) for Backup API (no write), PIPE (no write), and NUL]

Page 37

Verifying Backup/Restore

Minimal Risk
Restore to a Separate System at DECWest
– Early Problems with Unreadable Tapes

[Diagram labels: TerraServer, Test System]

Another Terabyte of Disk!

Page 38

TerraServer Backup/Restore Factoids

Backup/Restore Rate: 200 GB/Hr (57 MB/sec)
Time Required for Full Database Backup: 5 Hours
Number of DLT Tape Cartridges: 36
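
The three factoids relate to each other by simple arithmetic. The sketch below assumes 1 GB = 1024 MB, and the per-cartridge figure is only an implied average derived from the slide's numbers, not a stated tape capacity.

```python
rate_gb_per_hr = 200
backup_hours = 5
cartridges = 36

mb_per_sec = rate_gb_per_hr * 1024 / 3600      # GB/hr to MB/sec
data_moved_gb = rate_gb_per_hr * backup_hours  # about one terabyte in five hours
avg_gb_per_cartridge = data_moved_gb / cartridges

print(round(mb_per_sec), data_moved_gb, round(avg_gb_per_cartridge, 1))
```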

Page 39

Other Details

Active Server pages
– faster and easier than DB stored procedures.
Commerce Server is interesting
– Images the Inventory (no SKU, millions of them)
– USGS built their own; they are very smart, but it is easy to masquerade as a credit-card reader.
The earth is a geoid, and every geographer has a coordinate system (or two).
Tapes are still a nightmare.
Everyone is a UI expert.

Page 40

Thank You!

SPIN-2

Microsoft

BackOffice

