Virtual Observatory: A Quick Overview, and Some Lessons Learned
S. George Djorgovski, Caltech
ESIP Workshop, UCSB, July 2009
Astronomy Has Become Very Data-Rich
• Typical digital sky survey now generates ~ 10 - 100 TB, plus a comparable amount of derived data products
– PB-scale data sets are on the horizon
• Astronomy today has ~ 1 - 2 PB of archived data, and generates a few TB/day
– Both data volumes and data rates grow exponentially, with a doubling time ~ 1.5 years
– Even more important is the growth of data complexity
• For comparison:
– Human memory ~ a few hundred MB
– Human Genome < 1 GB
– 1 TB ~ 2 million books
– Library of Congress (print only) ~ 30 TB
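As a rough illustration of what a doubling time of ~ 1.5 years implies, the following Python sketch computes the growth factor over a given interval and the time needed to grow from one volume to another. The function names and the example intervals are illustrative assumptions; only the doubling time comes from the slide, and it is treated as exact here.

```python
import math

# Doubling time quoted on the slide (~1.5 years); treated as exact for illustration.
DOUBLING_TIME_YEARS = 1.5


def growth_factor(years: float, doubling_time: float = DOUBLING_TIME_YEARS) -> float:
    """Factor by which a quantity grows over `years` under steady exponential growth."""
    return 2.0 ** (years / doubling_time)


def years_to_grow(start: float, target: float,
                  doubling_time: float = DOUBLING_TIME_YEARS) -> float:
    """Years needed to grow from `start` to `target` (both in the same units)."""
    return doubling_time * math.log2(target / start)


if __name__ == "__main__":
    # Hypothetical examples: growth over a decade, and the time to go from 1 PB to 1000 PB.
    print(f"Growth over 10 years: x{growth_factor(10):.0f}")
    print(f"1 PB -> 1000 PB takes about {years_to_grow(1, 1000):.1f} years")
```

Under this assumption, a decade corresponds to roughly two orders of magnitude of growth in data volume.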
Exponential Growth in Data Volumes and Complexity
[Figure: growth of data volumes, doubling t ≈ 1.5 yrs; TB’s to PB’s of data, 10^8 - 10^9 sources, 10^2 - 10^3 param./source]
[Figure panels: Crab; Star forming complex. Panel labels: Visible + X-ray; Radio + IR]
Understanding of complex phenomena requires complex data!
Multi-wavelength data fusion leads to a more complete, less biased picture (also: multi-scale, multi-epoch, …)
Numerical simulations are also producing many TB’s of very complex “data”
Data + Theory = Understanding
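The source and parameter counts quoted above (10^8 to 10^9 sources, 10^2 to 10^3 parameters per source) can be turned into a back-of-the-envelope catalog size. The sketch below assumes a dense table with one 4-byte value per parameter; that storage width, and the function names, are assumptions made for illustration, and real catalogs mix data types and carry indexing overhead.

```python
def catalog_size_bytes(n_sources: int, n_params: int, bytes_per_value: int = 4) -> int:
    """Size of a dense table with one fixed-width value per (source, parameter) pair.

    bytes_per_value=4 is an assumed storage width (e.g. a single-precision float).
    """
    return n_sources * n_params * bytes_per_value


def human_readable(n_bytes: int) -> str:
    """Format a byte count using decimal (SI) units."""
    units = ["B", "kB", "MB", "GB", "TB", "PB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1000, or at the largest unit.
        if value < 1000 or unit == units[-1]:
            return f"{value:.0f} {unit}"
        value /= 1000


if __name__ == "__main__":
    low = catalog_size_bytes(10**8, 10**2)   # lower end of the quoted ranges
    high = catalog_size_bytes(10**9, 10**3)  # upper end of the quoted ranges
    print(f"Catalog size range: {human_readable(low)} to {human_readable(high)}")
```

Under these assumptions the catalog values alone span tens of GB to a few TB, so they are only one part of the TB-to-PB totals quoted.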
The Archive Archipelago
• As the data sets kept increasing, a number of archives, data depositories, and digital library services were created
• All of them are mission-, domain-, or observatory-specific; distinct and independent scientifically, technologically, and institutionally; and heterogeneous in look and feel, usage, etc.
– There was a considerable replication of effort
– There was some functional redundancy
– There was almost no interoperability
– Some standards have been generally adopted (e.g., FITS)
• All of them were primarily designed for single-object (or single-pointing) queries, and are thus inherently unsuitable for the science enabled by the massive and complex data sets
• The next step was clearly to connect them in a functional manner, and develop interoperability standards, formats, etc.
The Virtual Observatory Concept
• A complete, dynamical, distributed, open research environment for the new astronomy with massive and complex data sets
– Provide and federate content (data, metadata) services, standards, and analysis/compute services
– Develop and provide data exploration and discovery tools
– Not just the archives!
– A part of a broader Cyber-Infrastructure and e-Science movement
From Traditional to Survey to VO Science
[Diagram, Traditional: Telescope; Data Analysis; Results]
[Diagram, Survey-Based: Survey Telescope; Archive; Target Selection, Data Mining; Follow-Up Telescopes; Results; Another Survey/Archive?]
Highly successful, but inherently limited by the information content of individual sky surveys … What comes next, beyond survey science, is the VO science
A Systemic View of the VO-Based Science
[Diagram labels: Primary Data Providers; Surveys, Observatories, Missions; Survey and Mission Archives; Secondary Data Providers; Digital libraries; VO; Data Services; Data Mining and Analysis, Target Selection; Follow-Up Telescopes and Missions; Results]
VO connects the whole system of astronomical research
A Brief History of the VO Concept
• Early (pre-web!) ideas already in the “Astrophysics Data System” (only the digital library part survives)
• Concept developed through the 1990’s, mainly from large digital sky surveys (DPOSS, SDSS…) and discussions at conferences and workshops in the late 1990’s
• Top recommendation in the “small projects” category in the NAS Decadal Astronomy & Astrophysics survey (the McKee-Taylor report), 2001
• The first major VO conference at Caltech in 2000; the NVO White paper
• National Virtual Observatory Science Definition Team, 2001 - 2002
• ESO conferences, 2001 - 2002
• Vigorous international efforts, coordinated via International VO Alliance (IVOA)
VO Development and Status
• NSF-funded framework development project (2001-2008): the U.S. National Virtual Observatory (NVO)
• Now into a facility regime: Virtual Astronomical Observatory (VAO)
• Joint funding by the NSF and NASA
• Work largely done in the existing data archives, and thus very data-centric
• Vigorous international efforts (IVOA)
http://us-vo.org http://ivoa.net
Scientific Roles and Benefits of a VO
• Facilitate science with massive data sets (observations and theory/simulations) → efficiency amplifier
• Provide an added value from federated data sets (e.g., multi-wavelength, multi-scale, multi-epoch …)
– Discover the knowledge which is present in the data, but can be uncovered only through data fusion
• Enable and stimulate some qualitatively new science with massive data sets (not just old-but-bigger)
• Optimize the use of expensive resources (e.g., space missions, large ground-based telescopes, computing …)
• Provide R&D drivers, application testbeds, and stimulus to the partnering disciplines (CS/IT, statistics …)
VO Represents a New Type of a Scientific Organization
for the era of information abundance
• It is not yet another data center, archive, mission, or a traditional project → It does not fit into any of the usual organizational structures
– It is inherently distributed, and web-centric
– It is fundamentally based on a rapidly developing technology (IT/CS)
– It transcends the traditional boundaries between different wavelength regimes, agency domains
– It has an unusually broad range of constituents and interfaces
– It is inherently multidisciplinary
Broader and Societal Benefits of a VO
• Professional Empowerment: Scientists and students anywhere with an internet connection would be able to do first-rate science → A broadening of the talent pool in astronomy, democratization of the field
• Interdisciplinary Exchanges:
– The challenges facing the VO are common to most sciences and other fields of the modern human endeavor
– Intellectual cross-fertilization, feedback to IT/CS
• Education and Public Outreach:
– Unprecedented opportunities in terms of the content, broad geographical and societal range, at all levels
– Astronomy as a magnet for the CS/IT education
“Weapons of Mass Instruction”
VO Education and Public Outreach
Microsoft’s World Wide Telescope and Google Sky use DSS, SDSS, HST data, etc., for easy sky browsing
VO Functionality Today
What we did so far:
• Lots of progress on interoperability, standards, etc.
• An incipient data grid of astronomy
• Some useful web services
• Community training, EPO
What we did not do (yet):
• Significant data exploration and mining tools
That is where the science will come from!
Thus, little VO-enabled science so far
Thus, a slow community buy-in
Development of powerful, usable knowledge discovery tools should be a key priority
An Evolving Sociology
• We have transitioned from the data poverty regime into an era of exponential data abundance
– Most astronomers do not seem to fully realize this
– Proprietary periods should be re-thought; there are other modes of data access rights currencies, different scenarios?
– Data are cheap, but the expertise is expensive (and creativity is priceless)
• Telescopes are just the hardware needed to generate the data; and data are just incidental to our real mission, which is knowledge creation
– When the data and the exploration tools are on the web, the value of large facilities ownership should be rethought
– Computers are (relatively) cheap, but software is expensive, especially if you are not approaching it in a smart way
Information Technology → New Science
• The information volume grows exponentially
→ Most data will never be seen by humans!
→ The need for data storage, network, database-related technologies, standards, etc.
• Information complexity is also increasing greatly
→ Most data (and data constructs) cannot be comprehended by humans directly!
→ The need for data mining, KDD, data understanding technologies, hyperdimensional visualization, AI/Machine-assisted discovery …
• We need to create a new scientific methodology on the basis of applied CS and IT
• VO is the framework to effect this for astronomy
Some Readings:
• A quick summary:
– “Virtual Observatory: From Concept to Implementation”, Djorgovski, S.G., & Williams, R. 2005, A.S.P. Conf. Ser. 345, 517, available as http://arXiv.org/abs/astro-ph/0504006
• The original VO White Paper:
– “Toward a National Virtual Observatory: Science Goals, Technical Challenges, and Implementation Plan”, in Virtual Observatories of the Future, A.S.P. Conf. Ser. 225, 353, available as http://arXiv.org/abs/astro-ph/0108115
• The NVO SDT report, from http://www.us-vo.org/sdt
• Many other good documents available at http://us-vo.org (especially the summer school presentations)
• Technical documents at http://www.ivoa.net