Page 1: Virtual Observatory: A Quick Overview, and Some Lessons Learned S. George Djorgovski Caltech ESIP Workshop, UCSB, July 2009.

Virtual Observatory: A Quick Overview, and Some Lessons Learned

S. George Djorgovski, Caltech

ESIP Workshop, UCSB, July 2009

Page 2:

Astronomy Has Become Very Data-Rich

• A typical digital sky survey now generates ~ 10 - 100 TB, plus a comparable amount of derived data products
– PB-scale data sets are on the horizon
• Astronomy today has ~ 1 - 2 PB of archived data, and generates a few TB/day
– Both data volumes and data rates grow exponentially, with a doubling time ~ 1.5 years
– Even more important is the growth of data complexity
• For comparison:
Human memory ~ a few hundred MB
Human Genome < 1 GB
1 TB ~ 2 million books
Library of Congress (print only) ~ 30 TB
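The doubling-time figure quoted above can be turned into rough arithmetic. The sketch below is an illustration added here, not part of the talk: it assumes steady exponential growth with the quoted ~ 1.5 year doubling time, and the function names `growth_factor` and `projected_volume_pb` are hypothetical. The 1.5 PB starting value is simply the midpoint of the quoted ~ 1 - 2 PB range.

```python
DOUBLING_TIME_YEARS = 1.5  # doubling time quoted on the slide


def growth_factor(years, doubling_time=DOUBLING_TIME_YEARS):
    """Multiplicative growth after `years`, assuming steady exponential doubling."""
    return 2.0 ** (years / doubling_time)


def projected_volume_pb(start_pb, years, doubling_time=DOUBLING_TIME_YEARS):
    """Projected archive volume in PB, starting from `start_pb` (an assumed value)."""
    return start_pb * growth_factor(years, doubling_time)


if __name__ == "__main__":
    start_pb = 1.5  # assumption: midpoint of the quoted ~1 - 2 PB
    for years in (1.5, 3.0, 6.0, 10.0):
        factor = growth_factor(years)
        volume = projected_volume_pb(start_pb, years)
        print(f"after {years:4.1f} yr: x{factor:6.1f}  ->  {volume:7.1f} PB")
```

Under these assumptions the archived volume grows by a factor of about 1.6 per year and roughly a hundredfold per decade.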

Page 3:

Exponential Growth in Data Volumes and Complexity

[Figure panel labels: Visible + X-ray; Crab; Star forming complex; Radio + IR]

Understanding of complex phenomena requires complex data!

Multi-wavelength data fusion leads to a more complete, less biased picture (also: multi-scale, multi-epoch, …)

Numerical simulations are also producing many TB’s of very complex “data”

Data + Theory = Understanding

[Plot annotation: doubling t ≈ 1.5 yrs]

TB’s to PB’s of data, 10^8 - 10^9 sources, 10^2 - 10^3 param./source
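To get a feel for the source and parameter counts quoted above, here is a back-of-the-envelope sketch of the size of the resulting catalog table. It is an added illustration, not from the talk: the 4 bytes per parameter value is an assumption (one 32-bit number each), decimal units are used (1 TB = 10^12 bytes), and the names `catalog_size_bytes` and `human_readable` are hypothetical.

```python
BYTES_PER_VALUE = 4  # assumption: each parameter stored as one 32-bit number


def catalog_size_bytes(n_sources, n_params, bytes_per_value=BYTES_PER_VALUE):
    """Size of a dense table with one row per source and one column per parameter."""
    return n_sources * n_params * bytes_per_value


def human_readable(n_bytes):
    """Format a byte count using decimal units (1 kB = 1000 B)."""
    for unit in ("B", "kB", "MB", "GB", "TB", "PB"):
        if n_bytes < 1000 or unit == "PB":
            return f"{n_bytes:.0f} {unit}"
        n_bytes /= 1000


if __name__ == "__main__":
    low = catalog_size_bytes(10**8, 10**2)   # lower end of the quoted ranges
    high = catalog_size_bytes(10**9, 10**3)  # upper end of the quoted ranges
    print("smallest catalog:", human_readable(low))
    print("largest catalog: ", human_readable(high))
```

With these assumptions the catalog table alone ranges from tens of GB to a few TB.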

Page 4:

The Archive Archipelago

• As the data sets kept increasing, a number of archives, data depositories, and digital library services were created
• All of them are mission-, domain-, or observatory-specific, distinct and independent scientifically, technologically, institutionally, heterogeneous in look-feel, usage, etc.
– There was a considerable replication of effort
– There was some functional redundancy
– There was almost no interoperability
– Some standards have been generally adopted (e.g., FITS)
• All of them were primarily designed for single-object (or single-pointing) queries - and thus inherently unsuitable for the science enabled by the massive and complex data sets
• The next step was clearly to connect them in a functional manner, and develop interoperability standards, formats, etc.

Page 5:

The Virtual Observatory Concept

• A complete, dynamical, distributed, open research environment for the new astronomy with massive and complex data sets
– Provide and federate content (data, metadata) services, standards, and analysis/compute services
– Develop and provide data exploration and discovery tools
– Not just the archives!
– A part of a broader Cyber-Infrastructure and e-Science movement

Page 6:

From Traditional to Survey to VO Science

[Diagram, Traditional: Telescope; Data Analysis; Results]

[Diagram, Survey-Based: Survey Telescope; Archive; Data Mining; Target Selection; Follow-Up Telescopes; Results; Another Survey/Archive?]

Highly successful, but inherently limited by the information content of individual sky surveys … What comes next, beyond survey science, is the VO science

Page 7:

A Systemic View of the VO-Based Science

[Diagram labels: Primary Data Providers (Surveys, Observatories, Missions); Secondary Data Providers (Survey and Mission Archives); Digital libraries; VO (Data Services, Data Mining and Analysis, Target Selection); Follow-Up Telescopes and Missions; Results]

VO connects the whole system of astronomical research

Page 8:

A Brief History of the VO Concept

• Early (pre-web!) ideas already in the “Astrophysics Data System” (only the digital library part survives)
• Concept developed through 1990’s, mainly from large digital sky surveys (DPOSS, SDSS…), discussions at conferences and workshops in the late 1990’s
• Top recommendation in the “small projects” category in the NAS Decadal Astronomy & Astrophysics survey (the McKee-Taylor report), 2001
• The first major VO conference at Caltech in 2000; the NVO White paper
• National Virtual Observatory Science Definition Team, 2001 - 2002
• ESO conferences, 2001 - 2002
• Vigorous international efforts, coordinated via International VO Alliance (IVOA)

Page 9:

VO Development and Status

• NSF-funded framework development project (2001-2008): the U.S. National Virtual Observatory (NVO)
• Now into a facility regime: Virtual Astro. Obs. (VAO)
• Joint funding by the NSF and NASA
• Work largely done in the existing data archives, and thus very data-centric
• Vigorous international efforts (IVOA)

http://us-vo.org
http://ivoa.net

Page 10:

Scientific Roles and Benefits of a VO

• Facilitate science with massive data sets (observations and theory/simulations): an efficiency amplifier
• Provide an added value from federated data sets (e.g., multi-wavelength, multi-scale, multi-epoch …)
– Discover the knowledge which is present in the data, but can be uncovered only through data fusion
• Enable and stimulate some qualitatively new science with massive data sets (not just old-but-bigger)
• Optimize the use of expensive resources (e.g., space missions, large ground-based telescopes, computing …)
• Provide R&D drivers, application testbeds, and stimulus to the partnering disciplines (CS/IT, statistics …)

Page 11:

VO Represents a New Type of a Scientific Organization for the era of information abundance

• It is not yet another data center, archive, mission, or a traditional project. It does not fit into any of the usual organizational structures
– It is inherently distributed, and web-centric
– It is fundamentally based on a rapidly developing technology (IT/CS)
– It transcends the traditional boundaries between different wavelength regimes, agency domains
– It has an unusually broad range of constituents and interfaces
– It is inherently multidisciplinary

Page 12:

Broader and Societal Benefits of a VO

• Professional Empowerment: Scientists and students anywhere with an internet connection would be able to do first-rate science. A broadening of the talent pool in astronomy, democratization of the field
• Interdisciplinary Exchanges:
– The challenges facing the VO are common to most sciences and other fields of the modern human endeavor
– Intellectual cross-fertilization, feedback to IT/CS
• Education and Public Outreach:
– Unprecedented opportunities in terms of the content, broad geographical and societal range, at all levels
– Astronomy as a magnet for the CS/IT education

“Weapons of Mass Instruction”

Page 13:

VO Education and Public Outreach

Microsoft’s World Wide Telescope, and Google Sky: use DSS, SDSS, HST data, etc., for easy sky browsing

Page 14:

VO Functionality Today

What we did so far:
• Lots of progress on interoperability, standards, etc.
• An incipient data grid of astronomy
• Some useful web services
• Community training, EPO

What we did not do (yet):
• Significant data exploration and mining tools

That is where the science will come from!

Thus, little VO-enabled science so far

Thus, a slow community buy-in

Development of powerful, usable knowledge discovery tools should be a key priority

Page 15:

An Evolving Sociology

• We have transitioned from the data poverty regime into an era of exponential data abundance
– Most astronomers do not seem to fully realize this
– Proprietary periods should be re-thought; there are other modes of data access rights currencies, different scenarios?
– Data are cheap, but the expertise is expensive (and creativity is priceless)
• Telescopes are just the hardware needed to generate the data; and data are just incidental to our real mission, which is knowledge creation
– When the data and the exploration tools are on the web, the value of large facilities ownership should be rethought
– Computers are (relatively) cheap, but software is expensive - especially if you are not approaching it in a smart way

Page 16:

Information Technology → New Science

• The information volume grows exponentially
– Most data will never be seen by humans!
– The need for data storage, network, database-related technologies, standards, etc.
• Information complexity is also increasing greatly
– Most data (and data constructs) cannot be comprehended by humans directly!
– The need for data mining, KDD, data understanding technologies, hyperdimensional visualization, AI/Machine-assisted discovery …
• We need to create a new scientific methodology on the basis of applied CS and IT
• VO is the framework to effect this for astronomy

Page 17:

Some Readings:

• A quick summary:
– “Virtual Observatory: From Concept to Implementation”, Djorgovski, S.G., & Williams, R. 2005, A.S.P. Conf. Ser. 345, 517, available as http://arXiv.org/abs/astro-ph/0504006
• The original VO White Paper:
– “Toward a National Virtual Observatory: Science Goals, Technical Challenges, and Implementation Plan”, in Virtual Observatories of the Future, A.S.P. Conf. Ser. 225, 353, available as http://arXiv.org/abs/astro-ph/0108115
• The NVO SDT report, from http://www.us-vo.org/sdt
• Many other good documents available at http://us-vo.org (especially the summer school presentations)
• Technical documents at http://www.ivoa.net

