Date post: | 27-Mar-2015 |
Category: |
Documents |
Upload: | isaiah-reilly |
View: | 215 times |
Download: | 1 times |
9 September 2005NVO Summer School 2006 - Aspen
Astronomical Dataset Query Language (ADQL)
Ray Plante
THE US NATIONAL VIRTUAL OBSERVATORY
9 September 2005NVO Summer School 2006 - Aspen
Intersections and Goals
• Users– Will primarily use portal interfaces to query catalogs in pursuit of
scienceGoals:
• Create original ADQL queries using the OpenSkyPortal• Cross-correlate two catalogs using the XMatch function
• “Scripters” and Data Providers– Will use client libraries to query catalogs
• In pursuit of science• requires more queries and/or repetition than is practical to do by hand.
– Will use a toolkit to deploy a service that uses ADQL (e.g. SkyNode)Goals:
• Understand the role of ADQL/x in supporting query web services• See how ADQL will be used with other services beyond SkyNodes
• Developers– Will develop new services and applications that use ADQL– Will extend existing code that uses ADQLGoals:
• Use library to convert between ADQL/s & ADQL/x• Understand how ADQL can be adapted to new local query languages
9 September 2005NVO Summer School 2006 - Aspen
What is ADQL?
• ADQL = subset of SQL92 + astronomy extensions– Originally developed as part of SkyNode interface– Focuses on read-access via SELECT– Later versions allow creating tables to hold
intermediate results– astronomy: regions and cross-matching
• ADQL has two forms:– ADQL/s – string form
• Looks like SQL• Intended as user’s view of ADQL
– ADQL/x – XML form• SQL “parse tree”• Format used in messages passed to search services• Easier to parse and transform into local query
9 September 2005NVO Summer School 2006 - Aspen
Caution: Construction ahead
• ADQL is undergoing rapid change as a standard– OpenSkyPortal and existing SkyNodes use ADQL v0.7.4
• This talk will focus on this version
– IVOA is currently considering v1.05
– Discussion is underway to simplify ADQL in the context of a Table Access Protocol
• Bringing a core syntax in compliance with SQL92
• De-emphasizing the role of ADQL/x by supporting ADQL/s directly on the wire
• Sharpening the definitions of cross-match and region functions
9 September 2005NVO Summer School 2006 - Aspen
Why ADQL/x?
• If ADQL is standard SQL (plus) and we are querying SQL databases, why do we need an XML form?– Query Transformation is commonly necessary
• Few databases are 100% compliant with the SQL standard.– Transform to local SQL dialect
• Semantic filtering possible (transforming metadata).• Easier to adapt to non-relational databases
– E.g. XML database
– A pre-parsed form on the wire makes transformations easier to implement
• ADQL/x on the wire– Shifts parsing problem to the client – ACCESS BARRIER!
• Experience – Minor transformations can often be handled via SQL string
manipulation– More careful adherence to SQL92 would eliminate most common
difference between native SQLs (TOP, functions)– The emergence of parser/conversion tools make choice of wire
format less important
9 September 2005NVO Summer School 2006 - Aspen
Why ADQL/x?
If ADQL is standard SQL (plus) and we are querying SQL databases, why do we need an
XML form?Query Transformation is commonly necessary
– Few databases are 100% compliant with the SQL standard.
• Transform to local SQL dialect
– Semantic filtering possible (transforming metadata).
– Easier to adapt to non-relational databases• E.g. XML database
9 September 2005NVO Summer School 2006 - Aspen
What is ADQL?
SELECT o.objId, o.ra, o.r, o.type, t.objId
FROM SDSS:PhotoPrimary o, TWOMASS:PhotoPrimary t
WHERE XMATCH(o,t,3.5,1)
AND Region('Circle J2000 181.3 -0.76 6.5')
AND o.type=3
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Users
Restrictions on SQL92
• Only SELECT statements allowed– v1.0: SELECT INTO for saving results in remote
storage
• All tables in FROM must define an aliasFROM PhotoPrimary o
• JOIN keywords not supported– Traditional inner joins supported– Outer joins not supported
• IN keyword not supportedp.id IN (23872871, 2309823, 1512342)
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Users
Astronomical Extension: Regions
• Does a position fall within a region of sky? • Uses Space-time Coordinate metadata schema
– Also an emerging IVOA standard– Has string and XML formats
• Allows for a several region shapes, coordinate systems– Current SkyNode implementations only support
CIRCLE J2000
WHERE REGION('CIRCLE J2000 ra dec radius')
– ra dec = right ascension, declination in degrees– radius = angular radius of cone, in degrees
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Users
Joining Tables
• Implied Inner Joins– WHERE constraint where key in one table is set to key
in other tablep.objId = s.objId
– This is only practical for joining tables from the same catalog/node
• ADQL/SkyQuery was motivated by the need to cross-correlate distributed tables by position– XMatch: extension for joining based on matching
positions.• Current practice: a statistical analysis of the likelihood
that two positions are the same
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Users
Astronomical Extension: XMatch
• Catalog/Node designation:– Tables in FROM class are preceeded by node name
FROM SDSS:PhotoPrimary s, TWOMASS:PhotoPrimary t
• XMatch function– Arguments: tables to cross-match
• Referred to by the table aliases– Returns: chi-squared confidence that two objects are
the same XMATCH(s, t) < 3.5 • given sources in each table are the same with a
confidence of better than 3.5 sigmas– Can include negative detection
XMATCH(s, t, !o) < 3.5 • Detected in tables s and t but not in o
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Users/Data Providers
The XMatch Algorithm
• ADQL does not set cross-match algorithm– Requires positional information and positional
errors
• SkyNode recommends chi-square minimization
– x, y, z is the unit vector position of the most likely position of an object
– xn, yn, zn is the measured position in a survey
– αn is a weighting based on the positional error
9 September 2005NVO Summer School 2006 - Aspen
• Limiting the cross-matched candidates– In principle, analysis can be carried out on every possible
combination of detections– Techniques for limiting cross-matches
• User provides a limited region constraint• User provides other filtering constraints (e.g. color)
• Implementation can apply internal X2 cut-off based on given overall limit
• Invoking XMatch causes additional info to be returned– Most likely position, the original positions– Χ2, Minimized Chi-squared– SkyNodes: intermediate values in statistical calculations
• A single detection’s position can appear in multiple cross-matched records
ADQL for Users/Data Providers
Using XMatch
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Scripters
Accessing ADQL Services
• OpenSkyPortal web service takes ADQL/s – querying SkyNodes is straight-forward
through portal service
• Querying an individual SkyNode service– Requires use of ADQL/x– Client-side support is not great but improving– As a user, would like to give ADQL/s
• Use convertADQL to convert ADQL/s to ADQL/x on the fly
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Scripters
Other uses of ADQL
• Searching Registries– Registry Interface v1.0 for this year– Specify only WHERE clause– Use simple XPaths for column names
Ex: WHERE content/description LIKE '%quasars%'
• Future generations of SIAP, SSAP– Replace simple keyword=value query – More fine-grained control over search criteria
9 September 2005NVO Summer School 2006 - Aspen
ADQL for Developers
ADQLlib
• Parses and converts between string & XML versions
• Can be incorporated into client code to provide support for ADQL/s– Java clients can use Java API directly– Other clients can use the convertADQL command-line
tool
• ADQL/x -> ADQL/s– Multiple transformation implementations, can be
supported– Default implementation uses XSLT– Can be configured to convert from ADQL/x directly to
a RDBMS’ local variant of SQL