
Semantic Grid + Data Federation US National Virtual Observatory Roy Williams California Institute of...

Date post: 16-Jan-2016
Transcript
Page 1: Semantic Grid + Data Federation US National Virtual Observatory Roy Williams California Institute of Technology NVO co-director.

Semantic Grid + Data Federation

US National Virtual Observatory

Roy Williams
California Institute of Technology

NVO co-director

Page 2:

What is NVO?

– Standard protocols, standard data types
• XML transfer protocol (VOTable)
• Resource description (VOResource etc.)
• Publish/discover to federated registry (OAI)
• Semantic types (UCD)
• Services: Cone Search, Simple Image Access

– Computing with big data on the Grid
• Database crossmatch
• Image federation: atlases

Page 3:

First NVO Discovery

Page 4:

Database Fuzzy Join

2MASS versus SDSS cross-identification, with
– j_m as the 2MASS magnitude
– I_mtotn as the SDSS magnitude

2MASS: j_m <= 15
SDSS: I_mtotn <= 18

Billion Source Cross-Identification: A Computational Challenge

SDSS unmatched

2MASS matched

SDSS matched

2MASS unmatched
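The fuzzy join on this slide can be illustrated with a small positional crossmatch. This is only a sketch under stated assumptions: the record layout (`id`, `ra`, `dec`), the 2 arcsecond match radius, and the brute-force loop are all hypothetical, and the 2MASS cut is read as `j_m <= 15`. A billion-source match would need a spatial index rather than a double loop.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def fuzzy_join(twomass, sdss, radius_arcsec=2.0):
    """Pair each 2MASS source with its nearest SDSS source within the
    radius, after applying the magnitude cuts from the slide."""
    radius_deg = radius_arcsec / 3600.0
    left = [s for s in twomass if s["j_m"] <= 15]       # 2MASS cut (assumed reading)
    right = [s for s in sdss if s["I_mtotn"] <= 18]     # SDSS cut
    matched, unmatched = [], []
    for a in left:
        best = None
        for b in right:
            sep = angular_sep_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (b, sep)
        if best:
            matched.append((a["id"], best[0]["id"]))
        else:
            unmatched.append(a["id"])
    return matched, unmatched

# Made-up sources, for illustration only.
twomass = [
    {"id": "T1", "ra": 180.0000, "dec": 0.0000, "j_m": 14.2},
    {"id": "T2", "ra": 180.1000, "dec": 0.0500, "j_m": 13.0},
    {"id": "T3", "ra": 180.2000, "dec": 0.0000, "j_m": 15.8},  # fails the cut
]
sdss = [
    {"id": "S1", "ra": 180.0002, "dec": 0.0001, "I_mtotn": 17.1},
    {"id": "S2", "ra": 181.0000, "dec": 1.0000, "I_mtotn": 16.0},
]
matched, unmatched = fuzzy_join(twomass, sdss)
print(matched, unmatched)
```

The join is "fuzzy" because positions from two surveys never agree exactly, so equality is replaced by a distance threshold.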

Page 5:

Crossmatch Services

[Diagram labels: SDSS database; 2MASS database; crossmatch service; query (three arrows); NVO protocols; scientific knowledge!]

Page 6:

First NVO Discovery

Database crossmatch of two massive databases creates new science

“The sum is greater than the parts”

Page 7:

Semantic Grid

Page 8:

Cone Search

• First VO standard service
• Input: RA, DEC, SR must be present
– decimal degrees, J2000
• Output: VOTable of sky-located data records
– must have columns with UCDs: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN

Request: RA=300 DEC=25 SR=0.1

Response: ID RA DEC x y z
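A cone search request and response can be sketched as follows. The base URL and the sample VOTable are made up for illustration, and the parser handles only the simple, namespace-free TABLEDATA layout shown here; a real service response may differ.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def cone_search_url(base_url, ra, dec, sr):
    """Append the three required parameters (decimal degrees, J2000)."""
    return base_url + "?" + urlencode({"RA": ra, "DEC": dec, "SR": sr})

def parse_votable(xml_text):
    """Return each table row as a dict keyed by the column's UCD."""
    root = ET.fromstring(xml_text)
    ucds = [field.get("ucd") for field in root.iter("FIELD")]
    rows = []
    for tr in root.iter("TR"):
        cells = [td.text for td in tr.findall("TD")]
        rows.append(dict(zip(ucds, cells)))
    return rows

# Hypothetical response, trimmed to the three required UCD columns.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
<FIELD name="id" ucd="ID_MAIN" datatype="char" arraysize="*"/>
<FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
<FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
<DATA><TABLEDATA>
<TR><TD>src-1</TD><TD>300.02</TD><TD>25.01</TD></TR>
</TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""

url = cone_search_url("http://example.org/cone", 300, 25, 0.1)
rows = parse_votable(SAMPLE)
print(url)
print(rows)
```

Keying the rows by UCD rather than by column name is the point of the semantic types: a client can find the position columns without knowing what each archive chose to call them.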

Page 9:

Cone Search Registry

A collection of services that have the same shape

Request: HTTP GET of shape: URLbase RA=200&DEC=20&SR=2

Response: VOTable of shape: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID

Page 10:

Cone Search + Density Probe

[Diagram labels: Cone Search; Density Probe; baseURL; spacing; search radius]

Interoperating NVO-compliant services!

Federation of Multiple Services

Page 11:

NVO Image Protocol (SIAP)

• Specify box by position and size
• SIAP server returns relevant images
• Footprint
• Logical name
• URL

Can choose:

standard URL: http://.......

SRB URL: srb://nvo.npaci.edu/…..

Page 12:

Simple Image Access Service

• Query is sky region
• May query on image type, image geometry

• Response is VOTable of images
• Each has WCS (geometry) parameters
• Plus a URL to fetch the image

• Designed for
• Set of pointed observations (e.g. Hubble)
• Wide-area survey (e.g. Sloan)
• Image service
– Mosaicking
– Reprojection
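A query for a sky region might be built as below. The slides do not name the query parameters; `POS` and `SIZE` here follow the IVOA Simple Image Access 1.0 specification, and the base URL is hypothetical. Treat this as a sketch rather than the interface of any particular server.

```python
from urllib.parse import urlencode

def siap_query_url(base_url, ra, dec, size_deg):
    """Build an image query for a box centred on (ra, dec).

    POS is "ra,dec" and SIZE is the box width, both in decimal degrees.
    The comma is kept literal instead of being percent-encoded.
    """
    params = {"POS": f"{ra},{dec}", "SIZE": str(size_deg)}
    return base_url + "?" + urlencode(params, safe=",")

url = siap_query_url("http://example.org/siap", 180.0, 2.5, 0.2)
print(url)
```

The response would be a VOTable with one row per image, so a cone-search style table parser can be reused to pull out the WCS parameters and the image URL.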

Page 13:

Data Inventory Service

• What data covers a position in the sky?

[Diagram labels: Registry; OAI Publish; OAI Query; DIS; steps 1, 2, 3, 4; sites: Caltech, NCSA, JHU/StSci, Goddard]

Page 14:

Data Inventory Service

Request is a cone on the sky

Page 15:

Data Inventory Service

Relevant Images and Catalogs

NVSS Image

ROSAT catalog

Page 16:

Image Federation

Page 17:

VO Registry

[Diagram labels: VO Registry; Schemas & Service Types; VOResource; ID ivo://me.com/file123; Query service; Publish service; Publishing; OAI; R, R; Portals, Tools & Services; Databases, Grid, Virtual Data; md server for ivo://; VOView: fill-in forms, visualization, reports; Aladin; OASIS; DIS]

Page 18:

What is in the Registry?

• Answer: “Entities”
• It has a global identifier ivo://…….
– Must be resolved by authority
• It has “VOViews”
– Queries return these
• …..and that’s all!

Page 19:

3 Views of an Entity

Zoo-keeper metadata:
<diet>carrots</diet>
<excrement>yes</excrement>
<fencing>strong</fencing>

Transportation metadata:
<weight>4000 kg</weight>
<poisonous>no</poisonous>
<claws>no</claws>
<food>carrots</food>
<waste-mgmt>heavy</waste-mgmt>

Zoo-manager metadata:
<popularity>9</popularity>
<visitors>2500 per day</visitors>
<feeding>carrots</feeding>

“entity”

Page 20:

VOResource

A mandatory form plus other supporting forms

Page 21:

Schemas and Service Types

• VOResource
– Entity description form
• Organization, project, data collection, service
• Has ivo:// identifier

• VORegion
– Sky coverage form (α/δ/λ)

• VOTable
– Star catalog, image list, other tables

• OAI
– Registry harvesting
– Distributed virtual registry

• CONE
– Request-response for catalog

• SIAP
– Request-response for images

When can I publish my own schema to VO?

Page 22:

Dublin Core Metadata

Title: A name given to the resource.
Creator: An entity primarily responsible for making the content of the resource.
Subject: A topic of the content of the resource.
Description: An account of the content of the resource.
Publisher: An entity responsible for making the resource available.
Contributor: An entity responsible for making contributions to the content of the resource.
Date: A date of an event in the lifecycle of the resource.
Type: The nature or genre of the content of the resource.
Format: The physical or digital manifestation of the resource.
Identifier: An unambiguous reference to the resource within a given context.
Source: A reference to a resource from which the present resource is derived.
Language: A language of the intellectual content of the resource.
Relation: A reference to a related resource.
Coverage: The extent or scope of the content of the resource.
Rights: Information about rights held in and over the resource.

Curation data for “any human creation”
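As an illustration of the fifteen elements listed on this slide, a Dublin Core record can be serialized with the standard `dc` namespace. The wrapper element name `record` and the sample values are hypothetical; only the element names and the namespace URI come from the Dublin Core standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen Dublin Core elements, in lower case as used in XML.
DC_TERMS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def dublin_core_record(fields):
    """Serialize a dict of Dublin Core terms to an XML string."""
    root = ET.Element("record")  # hypothetical wrapper element
    for term, value in fields.items():
        if term not in DC_TERMS:
            raise ValueError(f"not a Dublin Core element: {term}")
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, term))
        child.text = value
    return ET.tostring(root, encoding="unicode")

xml_text = dublin_core_record({
    "title": "Example Sky Survey",        # made-up values
    "creator": "Example Observatory",
    "identifier": "ivo://me.com/file123",
})
print(xml_text)
```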

Page 23:

Dublin Core

Dublin Core is how the VO will interoperate with libraries of the world

A global metadata standard

Page 24:

Prototype Registry

Organization, Data Collection, Project, Service, SIA service

Page 25:

VOViews

VOResource view

Dublin Core view

Page 26:

OAI: Open Archives Initiative Harvesting Protocol

OAI is popular
– Ask your university librarian

Distributed comprehensive registry
– Harvesting

Different views for different purposes
– Six blind men and the elephant

Page 27:

OAI Harvesting Protocol

6 magic verbs of OAI
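The slide does not list the six verbs; in the OAI-PMH 2.0 specification they are Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord. A harvesting request is an HTTP GET with a `verb` parameter, which can be sketched as below. The base URL is hypothetical.

```python
from urllib.parse import urlencode

# The six request types defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
)

def oai_request_url(base_url, verb, **args):
    """Build an OAI-PMH request URL for one of the six verbs."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI verb: {verb}")
    return base_url + "?" + urlencode({"verb": verb, **args})

# Harvest all records in the Dublin Core metadata format.
url = oai_request_url("http://example.org/oai", "ListRecords",
                      metadataPrefix="oai_dc")
print(url)
```

A harvesting registry would issue ListRecords against each publishing registry in turn, which is how the distributed registries stay comprehensive.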

Page 28:

VO Identifiers

ivo://mydomain.com / mySkySurvey # file00037.fits

• URI form
• Still in flux

Authority ID (mydomain.com)
• Registered with IVOA
• Must correspond to a registry

Resource ID (mySkySurvey)
• Created by authority
• Resolved by registry

Record ID (file00037.fits)
• Not known to registry

Delimiters: "/" between authority ID and resource ID, "#" between resource ID and record ID
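Splitting an identifier into its three parts can be sketched as follows, assuming the form shown on this slide ("/" after the authority ID, "#" before the record ID). The slide notes that the form was still in flux, so this reflects the slide's example only, and the function name is hypothetical.

```python
def parse_ivo_identifier(identifier):
    """Split ivo://authority/resource#record into its three parts."""
    text = identifier.replace(" ", "")  # the slide spaces the parts for display
    prefix = "ivo://"
    if not text.startswith(prefix):
        raise ValueError("identifier must start with ivo://")
    rest = text[len(prefix):]
    rest, _, record_id = rest.partition("#")         # record ID is optional
    authority_id, _, resource_id = rest.partition("/")
    return {
        "authority": authority_id,    # registered with IVOA
        "resource": resource_id,      # resolved by the registry
        "record": record_id or None,  # not known to the registry
    }

parts = parse_ivo_identifier("ivo://mydomain.com / mySkySurvey # file00037.fits")
print(parts)
```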

Page 29:

Image Federation

Page 30:

Multispectral Imagery

Crab Nebula. 3 channels: X-ray in blue, optical in green, and radio in red.

Moffett Field, California. 224 channels from 400 nm to 2500 nm.

Page 31:

Image Federation

detection

Stacking allows detection of faint sources. A 1-sigma detection in each of many bands becomes a 3-sigma detection.

Images of the same galaxy taken several days apart are automatically subtracted from one another, and remaining bright spots may be supernova candidates. (NEAT project)

Image subtraction allows detection of narrow-line features that are not also wide-band (e.g. Hα but not R-band).
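The stacking claim can be checked with the usual square-root rule. This assumes independent Gaussian noise of equal size σ in every band and the same signal s in each, which the slide does not state explicitly. Summing N bands multiplies the signal by N and the noise by √N:

```latex
\mathrm{SNR}_{\text{stack}}
  = \frac{N\,s}{\sqrt{N}\,\sigma}
  = \sqrt{N}\,\frac{s}{\sigma},
\qquad
\frac{s}{\sigma} = 1,\; N = 9
\;\Rightarrow\;
\mathrm{SNR}_{\text{stack}} = 3 .
```

Under these assumptions, "many bands" means nine: nine 1-sigma detections stack to one 3-sigma detection.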

Page 32:

Principal Components

SDSS (5 channel) SDSS+2MASS (8 channel)

Page 33:

Mosaicking and Federation

Every astronomical image has a different projection
• different pointing of the telescope

• We want to mosaic different images
• We want to federate different information

Compute intensive: flux in each pixel is carefully distributed into a new pixel grid

Mosaicking

Federation

Infrared map

X-ray map today

X-ray map last year

Page 34:

Atlasmaker
Uses Montage, YourSky

[Diagram labels: Data; Project; Project; Estimate & correct background; Co-add; Chart]

[Image: David Hockney, Pearblossom Highway, 1986]

Page 35:

Images and Charts

Image
• Big data

Chart
• Map: sphere → plane
• FITS-WCS header
• Small data

An atlas is a collection of charts.
Hyperatlas is an attempt to standardize atlases.

Page 36:

Hyperatlas
Standard naming for atlases and vcharts

TM-5-SIN-20
Vchart TM-5-SIN-20-1589

Standard scales: scale s means 2^(20-s) arcseconds per pixel

Standard projections: SIN projection, TAN projection

Standard layouts: TM-5 layout, HV-4 layout
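The naming scheme can be unpacked with a few lines of code. This sketch rests on two assumptions that the extracted slide text does not fully confirm: that the garbled scale rule reads 2^(20-s) arcseconds per pixel, and that a name is made of layout, projection, scale, and an optional trailing page number. The function names are hypothetical.

```python
def parse_hyperatlas_name(name):
    """Split a name such as TM-5-SIN-20 or TM-5-SIN-20-1589 into parts."""
    parts = name.split("-")
    if len(parts) not in (4, 5):
        raise ValueError(f"unexpected Hyperatlas name: {name}")
    return {
        "layout": "-".join(parts[:2]),   # e.g. TM-5 or HV-4
        "projection": parts[2],          # e.g. SIN or TAN
        "scale": int(parts[3]),
        "page": int(parts[4]) if len(parts) == 5 else None,
    }

def arcsec_per_pixel(scale):
    """Pixel size under the assumed rule 2^(20 - s) arcseconds per pixel."""
    return 2.0 ** (20 - scale)

info = parse_hyperatlas_name("TM-5-SIN-20-1589")
print(info, arcsec_per_pixel(info["scale"]))
```

Under this reading, scale 20 gives one-arcsecond pixels and each step down in scale doubles the pixel size.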

Page 37:

Parallel Atlasmaker

MPI parallelism
• ~2% serial work (Amdahl)
• Projection is parallel
• All nodes share filespace

Making a single image; making an atlas of 1736 images

TeraGrid distributed
• Federated scheduling wanted
• SRB as Virtual Data Catalog
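The "~2% serial work (Amdahl)" bullet puts a ceiling on the achievable speedup. A quick calculation with Amdahl's law, taking the serial fraction as exactly 0.02 and picking worker counts purely for illustration:

```python
def amdahl_speedup(serial_fraction, n_workers):
    """Amdahl's law: speedup = 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)

for n in (1, 16, 64, 1024):
    print(n, round(amdahl_speedup(0.02, n), 1))

# With 2% serial work the speedup can never exceed 1 / 0.02 = 50.
limit = 1.0 / 0.02
print(limit)
```

With 64 workers the speedup is already below 30, which is one reason to parallelize across the 1736 images of an atlas rather than only within a single image.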

Page 38:

Atlasmaker Architecture

[Diagram labels: NVO/IVO; NED; Sloan; DPOSS; FIRST; [2MASS]; NVO Protocol; SIAP services; making atlas pages: scale, reproject, compress; sky index; Virtual Data System; Hyperatlas service; VIEW Bus; YourSky; VirtualSky; Oasis; federation; data mining]

Page 39:

Atlasmaker Virtual Data System

Metadata repositories: federated by OAI
Data repositories: federated by SRB
Compute resources: federated by TG/IPG

User request

Request manager

1. Mosaicked data is on file
2a. Mosaicked data is not on file
2b. Get raw data from NVO resources
2c. Compute on TG/IPG
2d. Store result & return result
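The request flow on this slide is a compute-on-miss cache. A minimal sketch, in which the class name, the in-memory store, and the two callables are all hypothetical stand-ins for SRB storage, NVO data access, and TG/IPG computation:

```python
class RequestManager:
    """Return a stored mosaic if present; otherwise build, store, and return it."""

    def __init__(self, fetch_raw, compute):
        self.store = {}             # stand-in for mosaicked data on file
        self.fetch_raw = fetch_raw  # stand-in for NVO data access
        self.compute = compute      # stand-in for a grid compute job

    def request(self, key):
        if key in self.store:           # mosaicked data is on file
            return self.store[key]
        raw = self.fetch_raw(key)       # 2b: get raw data
        result = self.compute(raw)      # 2c: compute
        self.store[key] = result        # 2d: store result ...
        return result                   # ... and return it

calls = []

def fake_fetch(key):
    calls.append(key)
    return [1, 2, 3]

manager = RequestManager(fake_fetch, compute=sum)
first = manager.request("TM-5-SIN-20-1589")
second = manager.request("TM-5-SIN-20-1589")  # served from the store
print(first, second, calls)
```

The user cannot tell whether a page was computed on demand or read from storage, which is what makes the data "virtual".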

Page 40:

Atlasmaker stack

Mosaicking (executables)

Atlasmaker (script)

Hyperatlas (service)

NVO Image Access (service)

SRB (service)

web

Montage, YourSky

Virtual Data System: Chimera?

Page 41:

Charts and Pages

Chart – a frame for specific data

Page – an organization for data

The virtual disk is 400,000 pixels wide

SIN projection

Page 42:

Background Correction

Uncorrected Corrected

Page 43:

Montage Background Correction

Project pixels to output chart

Fit ramps on overlap regions

Fit ramps on projected images

Subtract from Pixel values

