GeoViQua: the quality challenges for GEOSS
YANG Xiaoyu, BLOWER Jon, CORNFORD Dan, LUSH Victoria, MASO Joan, ZABALA Alaitz, Nüst Daniel
Center of Research in Ecology and Forestry Applications (CREAF) [email protected]
QUAlity aware VIsualisation for the Global Earth Observation system of systems
www.geoviqua.org
The problem
• Is there quality information in the GCI?
  – There is some, in the form of ISO 19115 DQ elements and lineage
  – Not enough
• The GEOSS Common Infrastructure does not follow a global model for quality
• The GEOPortal search and results
  – are not ranked by quality
  – do not show quality indicators
• Common data viewers do not generally include quality information in parallel with the data
The aim
GeoViQua will provide a set of scientifically developed software components and services that facilitate the creation, search and visualization of quality information on EO data, integrated and validated in the GEOSS Common Infrastructure.
[Figure labels: GEO S&T Label; Community building; Pilot case studies]
Time table
[Figure: project Gantt chart covering months 1 to 36. The Requirements and Data Model phase is finished. Activities shown: Metadata extraction; Best practices quality encoding; Direct extraction from continuous variables; Extraction from categorical variables; Quality elicitation; User feedback; Start prototypes; Validation; Mobile solutions; Search & Visualization; Data ready; Quality recommendations; Testing solutions; Pilot cases; User & technical requirements to CoP; User & technical solutions to CoP; Workshops; Proposals evaluation; Final document; GeoLabel]
Community Views on Data Quality
• Many researchers refer to the ‘famous five’ as the common criteria for evaluating spatial data quality
  – lineage; completeness; consistency; positional accuracy; and attribute accuracy.
• Broad scientific acceptance of the common spatial quality elements does not apply to all cases of “fitness-for-use” evaluation
  – user requirements can go far beyond the widely accepted ‘famous five’.
• We used semi-structured telephone and face-to-face interviews with a variety of geospatial data users and experts from a number of countries and application domains.
What users want?
• Users are exceedingly interested in good quality metadata records
  – And in information that can help to assess fitness-for-use of the data
• Users find metadata records typically incomplete, with essential data omitted
  – The process of dataset discovery and selection is more difficult
• Users are also interested in ‘soft’ knowledge about data quality
  – Data providers’ comments on the overall quality of a dataset, known data errors, potential data usage
  – Peers’ reviews and recommendations (they contact their peers to obtain suggestions)
  – Dataset provenance, citation and licensing information
    • Citation is incomplete (lack of valid producer contact details), and licensing is often missing
    • Citation: users rely on data from producers with a good reputation
• Currently, some of these cannot be recorded in standard metadata
• Need to compare metadata records easily and systematically
  – Side-by-side visualisation of all metadata elements would allow geospatial datasets to be compared more effectively,
    • especially when datasets are very similar and differences are hard to distinguish
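The side-by-side comparison of metadata elements requested by users can be sketched in a few lines. This is only an illustration: the flat `dict` records, the element names and the `side_by_side` helper are assumptions made for the example, not part of any GeoViQua component, and real ISO 19115 records are nested XML.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, simplified metadata records: element name -> value.
Record = Dict[str, str]
Row = Tuple[str, Optional[str], Optional[str], bool]


def side_by_side(a: Record, b: Record) -> List[Row]:
    """Return one row per element: (name, value in a, value in b, differs)."""
    rows: List[Row] = []
    for name in sorted(set(a) | set(b)):
        left, right = a.get(name), b.get(name)
        rows.append((name, left, right, left != right))
    return rows


# Two very similar (invented) records that differ in only a few elements.
dataset_a = {
    "title": "DEM 30 m",
    "positional_accuracy": "2.5 m",
    "licence": "CC-BY",
}
dataset_b = {
    "title": "DEM 30 m",
    "positional_accuracy": "1.5 m",
    # licence is missing here; it shows up as None so the gap is visible
}

rows = side_by_side(dataset_a, dataset_b)
for name, left, right, differs in rows:
    marker = "*" if differs else " "
    print(f"{marker} {name:<22} {str(left):<10} {str(right):<10}")
```

Marking the differing rows is the point of the exercise: when two datasets are nearly identical, the reader only needs to look at the starred elements.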
Producer’s-consumer’s quality
• Producer’s quality metadata
  – In the producer’s metadata records
  – Encoded in the classical ISO 19115/19139
  – Some extensions required
  – Stored in the current catalogues (GEOSS Clearinghouse, etc.)
• Consumer’s quality metadata
  – In independent metadata repositories
  – Linked to producer’s metadata by id
  – Future component of the GCI?
  – Contains comments, “like it”, star ratings, etc.
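One possible shape for consumer feedback kept in an independent repository and linked to the producer's record by identifier is sketched below. The class names, field names and the in-memory store are assumptions for illustration; the slides do not define an encoding.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Feedback:
    """One consumer feedback item (all field names are illustrative)."""
    metadata_id: str              # identifier of the producer's metadata record
    comment: str = ""
    like: bool = False
    stars: Optional[int] = None   # 1..5, or None when no rating was given


class FeedbackRepository:
    """In-memory stand-in for an independent consumer-feedback store."""

    def __init__(self) -> None:
        self._items: Dict[str, List[Feedback]] = defaultdict(list)

    def add(self, item: Feedback) -> None:
        if item.stars is not None and not 1 <= item.stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self._items[item.metadata_id].append(item)

    def summary(self, metadata_id: str) -> dict:
        """Aggregate all feedback attached to one producer record."""
        items = self._items.get(metadata_id, [])
        ratings = [i.stars for i in items if i.stars is not None]
        return {
            "comments": [i.comment for i in items if i.comment],
            "likes": sum(i.like for i in items),
            "mean_stars": mean(ratings) if ratings else None,
        }


repo = FeedbackRepository()
repo.add(Feedback("dem-001", comment="Vertical bias near the coast", stars=3))
repo.add(Feedback("dem-001", like=True, stars=5))
print(repo.summary("dem-001"))
```

Because the only coupling is the `metadata_id` key, the producer's catalogue record never has to be modified when feedback arrives.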
The ISO classical view
[Figure labels: Quality indicators; Provenance/Lineage; Usage]
Add ‘soft’ knowledge to producer’s metadata
[Figure: UML diagram of the producer quality model. Labels shown:
 Scope: Metadata; Dataset series; Dataset; Subset of data; Feature Type; Universe of Discourse; Quality Scope
 Packages: Metadata Packages; Data Quality (0..*)
 Non-quantitative Quality Information: User Feedback; Publication; Lineage; Discovered Issues
 Quality Element / Quality Parameter (ISO 19157): Positional Accuracy; Temporal Accuracy; Thematic Accuracy; Completeness; Logical Consistency; Usability; Metaquality
 Quality Indicator (ISO 19157): Omission; Commission; Quantitative attribute accuracy; Non-quantitative attribute correctness; Classification correctness
 Quality Measure (ISO 19157, UncertML): Missing Items; Number of Missing Items; Misclassification rate; Misclassification matrix]
Quality model is much more than positional accuracy
• There are many quantifiable aspects that can be recorded
  – Consistency, completeness, positional, thematic and temporal accuracy…
• There are many qualitative aspects that are needed
  – Lineage (traceability), scientific papers, user feedback, data usage…
GeoViQua Data model: statistical uncertainties

Single value:

<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value><gco:Record>3.6</gco:Record></gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>

With an UncertML distribution:

<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueType>
        <gco:RecordType xlink:href="http://www.uncertml.org/distributions/normal">
          Value of the vertical DEM accuracy
        </gco:RecordType>
      </gmd:valueType>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value>
        <gco:Record>
          <un:NormalDistribution>
            <un:mean>1.2</un:mean>
            <un:variance>3.6</un:variance>
          </un:NormalDistribution>
        </gco:Record>
      </gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>

Explicit recognition that errors acceptably fit a Normal distribution with mean 1.2
• An overall positive bias was observed
• A difficult feature to convey by traditional means
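A client could read the distribution parameters from such a result and derive figures that a single accuracy value cannot express. The sketch below is an assumption-laden illustration: the `gmd` and `gco` namespace URIs are the usual ISO 19139 ones, the `un` namespace is the UncertML 2.0 URI quoted elsewhere in these slides, and the fragment is cut down to the quantitative result only.

```python
import math
import xml.etree.ElementTree as ET

# Namespace prefixes used in the fragment (gmd/gco assumed to be ISO 19139).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "un": "http://www.uncertml.org/2.0",
}

SNIPPET = """
<gmd:DQ_QuantitativeResult
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:un="http://www.uncertml.org/2.0">
  <gmd:valueUnit>m</gmd:valueUnit>
  <gmd:value>
    <gco:Record>
      <un:NormalDistribution>
        <un:mean>1.2</un:mean>
        <un:variance>3.6</un:variance>
      </un:NormalDistribution>
    </gco:Record>
  </gmd:value>
</gmd:DQ_QuantitativeResult>
"""


def read_normal(xml_text):
    """Return (mean, variance, unit) from a result holding a NormalDistribution."""
    root = ET.fromstring(xml_text)
    dist = root.find(".//un:NormalDistribution", NS)
    if dist is None:
        raise ValueError("no un:NormalDistribution element found")
    mean = float(dist.findtext("un:mean", namespaces=NS))
    variance = float(dist.findtext("un:variance", namespaces=NS))
    unit = root.findtext("gmd:valueUnit", default="", namespaces=NS)
    return mean, variance, unit


mean, variance, unit = read_normal(SNIPPET)
std = math.sqrt(variance)
# Central 95% interval of a Normal distribution: mean +/- 1.96 standard deviations.
lower, upper = mean - 1.96 * std, mean + 1.96 * std
print(f"bias {mean} {unit}, std {std:.2f} {unit}, 95% of errors in [{lower:.2f}, {upper:.2f}] {unit}")
```

The non-zero mean is what carries the "overall positive bias" message; a lone accuracy value of 3.6 would hide it.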
The need for a measure dictionary
• Current quality measure names in the GCI
  – Nothing to do with the ISO 19138 list of possible measures
  – Not well defined

Measure name as recorded in the GCI              Count
Absolute external positional accuracy                2
Anweisung Straßeninformationsbank (Bundes…           1
Codelist omission                                    2
completeness                                       198
Feature represented as a single object               2
horizontal                                        3146
Horizontal Positional Accuracy                    3265
Lagegenauigkeit                                      3
Latitude Resolution                               3437
Longitude Resolution                              3350
Mean value of positional uncertainties (2D)          3
Overlapping polygon                                  2
Quantitative Attribute Accuracy Assessment         255
Rate of missing items                               87
Sach- und Geodatenüberprüfung                        7
Temporal Resolution                               2870
Überprüfung der Toplogie                             2
Valid code Test                                      2
Vertical Positional Accuracy                      1826
Vertical Resolution                                812
vertikal                                           348
Vollständigkeit                                      4
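To illustrate why free-text measure names are a problem, the sketch below folds a few of the names listed on this slide into canonical measures and sums their counts. The alias table is entirely hypothetical (it is not the GeoViQua dictionary); only the counts are taken from the slide.

```python
from collections import Counter

# Counts copied from the slide (subset).
observed = {
    "completeness": 198,
    "Vollständigkeit": 4,
    "horizontal": 3146,
    "Horizontal Positional Accuracy": 3265,
    "vertikal": 348,
    "Vertical Positional Accuracy": 1826,
    "Rate of missing items": 87,
}

# Hypothetical alias table: lower-cased free-text name -> canonical measure.
ALIASES = {
    "completeness": "completeness",
    "vollständigkeit": "completeness",
    "horizontal": "horizontal positional accuracy",
    "horizontal positional accuracy": "horizontal positional accuracy",
    "vertikal": "vertical positional accuracy",
    "vertical positional accuracy": "vertical positional accuracy",
}


def harmonise(counts):
    """Sum counts per canonical measure; unknown names are kept but flagged."""
    totals = Counter()
    for name, n in counts.items():
        key = ALIASES.get(name.strip().lower(), "unmapped: " + name)
        totals[key] += n
    return totals


totals = harmonise(observed)
for name, n in totals.most_common():
    print(f"{n:>6}  {name}")
```

Names that fall through as "unmapped" are exactly the ones a shared dictionary with IDs and aliases would need to absorb.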
Data Quality Measure Dictionary
• Some quality indicators are used, but the name and description of the measure used to derive the indicator are rarely well described.
• Problems can occur due to the lack of semantic definitions of quality measures.
  – “uncertainty at 90% significance level” ??
• A Quality Measure Dictionary is proposed that includes:
  – vocabularies for quality measures
  – associated semantic annotations
  – integration of UncertML concepts and vocabularies.
• Composed of quality measures provided by
  – ISO 19138, ISO 19157
  – UncertML.
• Each measure has a unique ID
  – quality element, value type, quality basic measure, description, example use, etc.
• “uncertainty at 90% significance level” can be annotated using the UncertML vocabulary “ConfidenceInterval” (URI: http://www.uncertml.org/statistics/confidence-interval)

[Figure: structure of a dictionary entry. Quality Measure ID (ID=“”, Name=“”, Alias=“”) with: Description; Definition; Quality element; Basic measure; Value type; Value structure; Parameter; Example use; UncertML representation; Source reference; URI (URI=“”); linked to the UncertML Dictionary]

<un:ConfidenceInterval xmlns:un="http://www.uncertml.org/2.0">
  <un:lower level="0.05">
    <un:values>3.14</un:values>
  </un:lower>
  <un:upper level="0.95">
    <un:values>6.28</un:values>
  </un:upper>
</un:ConfidenceInterval>
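The annotation above can be produced mechanically once the measure is known to be an interval. The following sketch builds the same element structure with the standard library; it copies the element and attribute names from the slide rather than from the UncertML schema, and it assumes that "90%" means a central two-sided interval (5% in each tail). The function name `confidence_interval` is illustrative.

```python
import xml.etree.ElementTree as ET

UN = "http://www.uncertml.org/2.0"   # namespace as shown on the slide
ET.register_namespace("un", UN)


def confidence_interval(lower, upper, coverage=0.90):
    """Build a ConfidenceInterval element for a central interval (assumption)."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    tail = (1.0 - coverage) / 2.0
    root = ET.Element(f"{{{UN}}}ConfidenceInterval")
    for tag, level, value in (("lower", tail, lower), ("upper", 1.0 - tail, upper)):
        # .4g hides floating-point noise such as 0.04999999999999999
        bound = ET.SubElement(root, f"{{{UN}}}{tag}", {"level": f"{level:.4g}"})
        ET.SubElement(bound, f"{{{UN}}}values").text = str(value)
    return root


ci = confidence_interval(3.14, 6.28)
print(ET.tostring(ci, encoding="unicode"))
```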
Quality Metadata Levels
Level: Multiseries
  Positional accuracy: 2.5 m
  Content date: 2009-2010
Level: theme=contour line (Series)
  Overwrite positional accuracy: 1.5 m
Level: sheet=73-30 (Sheet or Scene)
  Overwrite content date: October 2009
Level: dataset (theme=contour line, sheet=73-30) (Dataset: raster or feature instance)
  Positional accuracy: 1.5 m
  Content date: October 2009
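The override behaviour across levels can be sketched as a lookup chain in which the most specific level wins. The values are those from the slide; the key names and the `resolve` helper are assumptions for the example, and since the theme and sheet levels override different elements their relative order does not matter here.

```python
from collections import ChainMap

# Quality metadata declared at each level (values from the slide).
multiseries = {"positional_accuracy_m": 2.5, "content_date": "2009-2010"}
theme_contour_line = {"positional_accuracy_m": 1.5}   # overrides accuracy
sheet_73_30 = {"content_date": "October 2009"}        # overrides content date


def resolve(*levels_most_specific_first):
    """Effective metadata: the first level that defines a key wins."""
    return dict(ChainMap(*levels_most_specific_first))


dataset = resolve(sheet_73_30, theme_contour_line, multiseries)
print(dataset)
```

Storing only the overrides at each level keeps the records small, at the cost of requiring clients to resolve the chain before they can compare two datasets.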
GEOSS common infrastructure
[Figure: GEOSS architecture diagram. Labels shown:
 Client Tier: GEO Web Portals; Main GEO Web Site; Community Portals; Client Applications
 Business Process Tier, GEOSS Common Infrastructure: Components & Services, Standards and Interoperability, and User Requirements registries; Clearinghouse; Best Practices Wiki
 Business Process Tier, Registered Community Resources: Community Catalogues; Alert Servers; Workflow Management; Processing Servers
 Access Tier: GEONETCast; Product Access Servers; Sensor Web Servers; Model Access Servers]
Before GEOSS
[Figure: architecture before GEOSS. Labels shown:
 User; Capacity Resource
 SBA: Disasters; Health; Energy; Climate; Water; Weather; Ecosystems; Agriculture; Biodiversity
 Business Process Tier: Capacity Catalogues
 Access Tier: Product Access Servers; Sensor Web Servers; Model Access Servers; GEONETCast]
How GEOSS worked yesterday
[Figure: GEOSS architecture as it worked until now. Labels shown:
 User; Capacity Resource
 SBA: Disasters; Health; Energy; Climate; Water; Weather; Ecosystems; Agriculture; Biodiversity
 Business Process Tier: Capacity Catalogues
 Access Tier: Product Access Servers; Sensor Web Servers; Model Access Servers; GEONETCast
 GEOSS Common Infrastructure: Components & Services Registry; GEOSS Clearinghouse Catalogue DB; GEO Web Portal]
How GEOSS is going to work
[Figure: planned GEOSS architecture. Labels shown:
 User; Capacity Resource
 SBA: Disasters; Health; Energy; Climate; Water; Weather; Ecosystems; Agriculture; Biodiversity
 Business Process Tier: Community Catalogues; Capacity Catalogue
 Access Tier: Product Access Servers; Sensor Web Servers; Model Access Servers; GEONETCast
 GEOSS Common Infrastructure: Components & Services Registry; GEOSS Clearinghouse Catalogue DB; EuroGEOSS Broker; GEO Web Portal]
GeoViQua quality model
[Figure: UML diagram relating the EuroGEOSS Broker model to the GeoViQua Model. Labels shown:
 Scope: Metadata; Dataset series; Dataset; Subset of data; Feature types; Quality Scope
 Specification: Product Specification; Rules; Quality requirements; Universe of Discourse (i.e. Reality)
 Packages: Metadata Packages; Data Quality (0..*)
 Non-quantitative Quality Information: Comments/Peer Review; Discovered Issues; Publication; Lineage
 Quality Element / Quality Parameter (ISO 19113): Positional Accuracy; Temporal Accuracy; Thematic Accuracy; Completeness; Logical Consistency; Usability; Metaquality
 Quality Indicator (ISO 19113): Omission; Commission; Quantitative attribute accuracy; Non-quantitative attribute correctness; Classification correctness
 Quality measure (ISO 19114/ISO 19138, UncertML): Missing Items; Number of Missing Items; Misclassification rate; Misclassification matrix]
Quality in GEOSS
[Figure: planned GEOSS architecture with enhanced geo-search tools added on the user side. Labels shown:
 User; Capacity Resource; Enhanced geo-search tools
 SBA: Disasters; Health; Energy; Climate; Water; Weather; Ecosystems; Agriculture; Biodiversity
 Business Process Tier: Community Catalogues; Capacity Catalogues
 Access Tier: Product Access Servers; Sensor Web Servers; Model Access Servers; GEONETCast
 GEOSS Common Infrastructure: Components & Services Registry; GEOSS Clearinghouse Catalogue DB; EuroGEOSS Broker; GEO Web Portal]
Including data quality in search
• Enhanced geo-search tools could accept quality constraints, conceptually:
    SELECT * FROM GEOSS_GCI
    WHERE positional_accuracy < 20 AND classification_correctness > 90   (percent)

Devillers R, Bédard Y, Jeansoulin R (2005) Multidimensional Management of Geospatial Data Quality Information for its Dynamic Use Within GIS
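The conceptual query can be tried against a toy catalogue. In the sketch below the table `geoss_gci`, its columns, the units (metres and percent) and the four records are all invented for illustration; the GCI does not expose such a table. Results are also ordered by positional accuracy to show ranking by quality.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE geoss_gci (
           id TEXT PRIMARY KEY,
           positional_accuracy REAL,        -- assumed metres; lower is better
           classification_correctness REAL  -- percent; higher is better
       )"""
)
conn.executemany(
    "INSERT INTO geoss_gci VALUES (?, ?, ?)",
    [
        ("landcover-a", 15.0, 93.0),
        ("landcover-b", 30.0, 95.0),   # fails the positional constraint
        ("landcover-c", 10.0, 85.0),   # fails the classification constraint
        ("landcover-d", 5.0, 97.5),
    ],
)


def search(connection, max_positional, min_classification):
    """Return ids meeting both quality constraints, best positional accuracy first."""
    cursor = connection.execute(
        "SELECT id FROM geoss_gci "
        "WHERE positional_accuracy < ? AND classification_correctness > ? "
        "ORDER BY positional_accuracy",
        (max_positional, min_classification),
    )
    return [row[0] for row in cursor]


hits = search(conn, 20, 90)
print(hits)
```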
Consumer’s data quality
• More informal
• Based on social network patterns
  – Comments
  – Linked data
  – Like it
  – Star ratings
• More dynamic
• Need for an encoding
• Need for an independent repository
GEOSSBack
http://www.ogc.uab.cat/GEOSSBack
• Just a prototype to play with and demonstrate a concept.
Producer’s + consumer’s GeoViQua Broker
[Figure: component diagram “GeoViQua Components Agreed So Far”. Components shown: GeoViQua Broker; EuroGEOSS Discover broker Q; CSW Clearinghouse; Capacity Catalogues; CSW; CSW-Q; WMS; SOS-Q + SensorML; Sensor Registry Q; Metadata Import tool (+ HDF, + netCDF, + others...); WAF; unknown; FeedBack Server]
Quality Metadata comparison
Conclusions
• After user interviews
• Producer’s quality model
  – GeoViQua quality model is based on ISO
  – With extensions for ‘soft’ knowledge
  – Inclusion of UncertML
  – Quality measure dictionary
• Consumer’s quality model
  – Based on social network patterns
  – Encoded independently (from producers)
• Linked by the GeoViQua broker (extension/complement of the EuroGEOSS broker)
GEOLabel
• What is it?
  – The GEO Label is intended to “assist the user to assess the scientific relevance, quality, acceptance and societal needs of the components” (ST-09-02 Task Team, 2010).
• Purposes?
  – be a quality indicator for GEOSS geospatial data and datasets
    • Problem: Usability depends on data application; there is no defined threshold.
  – improve user recognition and trust in validated datasets.
    • Problem: who is going to certify this?
  – assist in searching by providing users with visual cues of dataset quality and relevance.
  – provide accreditation, provenance, monitoring
  – increase visibility of EO data
  – emphasize open access and easy availability
• Possible shape?
  – Certification label
  – A formal way to present
    • quality indicators
    • provenance
    • attribution
Task performed in collaboration with the EGIDA FP7 project and the GEO task ST-09-02
GEOLabel
• Until the end of this week
• Publicly available on the web
• We encourage you to participate!
Please participate in the questionnaire:
http://geolabel.questionpro.com
just a couple of days left!!
Thanks
Joan Maso [email protected] (CREAF)
How GEOSS is going to work
[Figure: planned GEOSS architecture with quality aware visualisation tools on the user side and a Quality Access Broker in the Access Tier. Labels shown:
 User; Capacity Resource; Quality aware visualisation tools
 SBA: Disasters; Health; Energy; Climate; Water; Weather; Ecosystems; Agriculture; Biodiversity
 Business Process Tier: Community Catalogues; Capacity Catalogues
 Access Tier: Product Access Servers; Sensor Web Servers; Model Access Servers; GEONETCast; Quality Access Broker
 GEOSS Common Infrastructure: Components & Services Registry; GEOSS Clearinghouse Catalogue DB; EuroGEOSS Broker; GEO Web Portal]
Quality map visualization
Quality aware visualisation tools: express data quality using maps
• Dark color represents poor quality and light color good quality

Blackmond Laskey K, Wright EJ, da Costa PCG (2009) Envisioning uncertainty in geospatial information
Devillers R, Bédard Y, Jeansoulin R (2005) Multidimensional Management of Geospatial Data Quality Information for its Dynamic Use Within GIS
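The dark-is-poor, light-is-good convention amounts to a linear mapping from a quality score to a grey level. The sketch below is one way to write it; the function name, the 0 to 255 grey range and the clamping are assumptions, and swapping `worst` and `best` handles measures where a lower value is better (for example a positional error in metres).

```python
def quality_to_gray(quality, worst=0.0, best=1.0):
    """Map a quality value to a grey level: 0 (black, poor) to 255 (white, good)."""
    if best == worst:
        raise ValueError("worst and best must differ")
    t = (quality - worst) / (best - worst)   # 0 at worst, 1 at best
    t = min(1.0, max(0.0, t))                # clamp out-of-range values
    return round(255 * t)


# A score in [0, 1] where higher is better.
print(quality_to_gray(0.0), quality_to_gray(1.0))

# A positional error in metres where 20 m is the worst case and 0 m the best.
print(quality_to_gray(5, worst=20, best=0))
```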
Quality map visualization
Quality aware visualisation tools
• 3D representations
  – representation of estimated water balance surplus/deficit and their uncertainty (using bars above and below the surface).
• Map representations have some problems
  – They make visualization more complicated and difficult to understand
  – They attract attention to the more uncertain objects!!

MacEachren AM, Robinson A, Hopper S, Gardner S, Murray R, Gahegan M, Hetzler E (2005) Visualizing Geospatial Information Uncertainty: What We Know and What We Need to Know
Pang A (2001) Visualizing Uncertainty in Geo-spatial Data
Pilot Case scenarios
• Agriculture
• Global Carbon
• Air Quality
Based on many user stories among GEOSS SBA