“Semantics” for Innovation in Visualization and Multimedia: Smarter Information Science

ICSTI Workshop

February 8, 2011, Redmond WA
Peter Fox (RPI) pfox@cs.rpi.edu
Tetherless World Constellation http://tw.rpi.edu

Please buckle your seatbelt

• Working premise and the burden
• Opportunity – new means
– Linked open data (LOD)
– Open-source software (Field)
• Science conduct
• Semiotics of portrayal
– Includes semantics
• Representing e.g.
– Uncertainty, quality, bias
• Speculation


Working premise

Scientists – actually ANYONE – should be able to access and use a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available

But… data and information are obtained by multiple means (instruments, models, analysis) using various (often opaque) protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. They may be inconsistent, incomplete, evolving, and distributed AND created in a form that facilitates generation, not use (except by accident)

And … significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…

Uh-oh

Changing the equation

• “Changing the Equation for Scientific Data Visualization” – Fox and Hendler (Feb 11, 2011) Science (Perspectives), in press (embargoed, sorry)

• Three important points
– Unlocked data (and it’s big, really, really…)
– Visualization for the masses throughout the ‘life-cycle’ but scale-free (!)
– Smarter data, smarter visualization


.. Data has Lots of Audiences

From “Why EPO?”, a NASA internal report on science education, 2005

More Strategic

Less Strategic

Science too!

Fox Informatics and Semantics, © 2008


Shift the Burden from the User to the Provider – for Viz. too!

Too many diagrams

Visualizing Linked Open Data (logd.tw.rpi.edu)

Linked open data

• Simply put: data is in RDF and has a URI, and/or it sits behind a queryable ‘triple-store’ interface

‘convert’ ‘load’ ‘query’ ‘render’
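The four stages named above can be walked through end to end in a few lines. What follows is a minimal sketch only: the CSV content, the `http://example.org/` namespace, and the function names are all assumptions made for illustration, and the "store" is a plain Python set standing in for a real triple store.

```python
import csv
import io

# Hypothetical CSV input; the column names and values are illustrative only.
RAW_CSV = """station,parameter,value
A,temperature,12
B,temperature,7
C,temperature,3
"""

BASE = "http://example.org/"  # placeholder namespace, not a real dataset


def convert(csv_text):
    """'convert': turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "station/" + row["station"]
        triples.append((subject, BASE + "parameter", row["parameter"]))
        triples.append((subject, BASE + "value", int(row["value"])))
    return triples


def load(triples):
    """'load': put the triples into a store (here just a Python set)."""
    return set(triples)


def query(store, predicate):
    """'query': return sorted (subject, object) pairs for one predicate."""
    return sorted((s, o) for (s, p, o) in store if p == predicate)


def render(pairs):
    """'render': draw one text bar per result row."""
    lines = []
    for subject, value in pairs:
        label = subject.rsplit("/", 1)[-1]
        lines.append(f"{label} {'#' * value}")
    return "\n".join(lines)


store = load(convert(RAW_CSV))
chart = render(query(store, BASE + "value"))
print(chart)
```

Each stage takes only the previous stage's output, so any one of them can be swapped (a real triple store for the set, a graphics toolkit for the text bars) without touching the others.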

New means – artists to the rescue

• Digital artists needed good creative visual tools (art at the speed of creative thought, feeling, intuition and mental representation), and they love programming

• And, RPI has EMPAC – Experimental Media and Performing Arts Center

From flat screen to black box - EMPAC

Field – rapid visualizing

What we are doing

• Field meets Linux!
• Linked data meets Field!
– Feed the current LOD graphics into Field for manipulation
– Then….
• Unscrew the Google graphics
• Unscrew the JSON feed
• Query / consume raw RDF
• Visualizing at the speed of thought/typing..
• From the laptop to scale
• So this is where the semantics re-enter, especially for portrayal

Linked open data

• Field consumes JSON (webify it)

‘convert’ ‘load’ ‘query’ ‘render’

Linked open visualization

• Field queries ‘triple stores’ (semantic webify it)

‘convert’ ‘load’ ‘query’/‘render’
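Querying a triple store directly usually means sending a SPARQL query over HTTP and reading back a JSON results document. The sketch below shows both halves without touching the network. It is not Field's API: the endpoint URL, the query, and the canned response are assumptions for illustration, and only the JSON layout (`head.vars`, `results.bindings`) follows the standard SPARQL results format.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint URL; substitute a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 2"


def build_request_url(endpoint, sparql):
    """Encode a query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": sparql})


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are simply absent from a binding.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Canned response in the standard JSON results layout, so no network is needed.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/station/A"},
         "p": {"type": "uri", "value": "http://example.org/value"},
         "o": {"type": "literal", "value": "12"}},
    ]},
})

url = build_request_url(ENDPOINT, QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)
print(url)
print(rows)
```

Once the rows are plain dictionaries, the ‘render’ step no longer needs to know that the data came from RDF at all.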

Linked open visualization

• Field queries ‘triple stores’ (semantic webify it)

‘dynamic’ ‘load’ ‘query’/‘render’ ‘access’

Science - Means of conduct


So what about abduction?

• No, not the criminal meaning…
• Is a method of logical inference introduced by Peirce which comes prior to induction and deduction, for which the colloquial name is to have a "hunch".
• Abductive reasoning
– starts when an inquirer considers a set of seemingly unrelated facts,
– armed with an intuition that they are somehow connected
– oh, wait, good job for visualization!!!
• Leverage open world, semantics, too… on the web


• Semiotics, also called semiotic studies or semiology, is the study of sign processes (semiosis), or signification and communication, signs and symbols. It is divided into three branches:
– Syntactics: Relation of signs to each other in formal structures
– Semantics: Relation between signs and the things to which they refer; their denotata
– Pragmatics: Relation of signs to their impacts on those who use them

Information theory

Semiotic model


Semiotics of portrayal

• But we are talking about a digital world increasingly more than an analog one

• Beyond the separation of content from presentation
– We have means for content (context and structure) semantics but pragmatics?
• Portrayal (not just ‘maps’ or ‘graphs’)
– How – representation of content, context and structure, capture visualization provenance
– Graphs, points, lines, polygons, titles, axes, color, shade, dimensions, … and their relation to each other!
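One way to make portrayal explicit is to record the visual elements, their relations to each other, and the provenance of the picture as data rather than leaving them implicit in drawing code. The sketch below is an assumption-laden illustration, not an established portrayal vocabulary: the class names, field names, and the `plotted_against` relation are all hypothetical.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Element:
    """One visual element: a point set, line, polygon, title, axis, ..."""
    kind: str
    encodes: str          # what content the element stands for
    color: str = "black"


@dataclass
class Portrayal:
    """A portrayal: elements, their relations, and how the picture was made."""
    elements: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)   # (name, predicate, name)
    provenance: list = field(default_factory=list)  # ordered processing steps

    def add(self, name, element):
        self.elements[name] = element

    def relate(self, source, predicate, target):
        # Refuse relations that mention elements the portrayal does not contain.
        for name in (source, target):
            if name not in self.elements:
                raise KeyError(name)
        self.relations.append((source, predicate, target))


p = Portrayal()
p.add("x_axis", Element("axis", "time"))
p.add("curve", Element("line", "cloud top pressure", color="blue"))
p.relate("curve", "plotted_against", "x_axis")
p.provenance.append("spatial subset")
p.provenance.append("time average")

record = asdict(p)  # plain dicts and lists, ready to serialize or query
print(record)
```

Because the record is plain data, the same description could be published alongside the image and queried like any other linked data.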

For science viz.

• We leave the untold things untold – big (really, really big) problem(s) – like:
– Uncertainty
– Quality
– Bias
– Need evidence

• An example?


MODIS Terra & Aqua vs. AIRS Cloud Top Pressure

AIRS vs. MODIS Aqua

AIRS vs. MODIS Terra

MODIS Aqua vs. MODIS Terra

Correlation maps for Jan 1 – 16, 2008

Impact: Throw your hands up in the air and just walk away, silently…

Attribute | Parameter A | Parameter B | Difference alert
Parameter Name | Aerosol Optical Depth at 550 nm | Aerosol Optical Depth at 550 nm |
Dataset | MYD08_D3.005 | MOD08_D3.005 | Diff
Data-Day definition | UTC (00:00-24:00Z) | UTC (00:00-24:00Z) | The same but….
Temporal resolution | Daily | Daily |
Spatial resolution | 1x1 degree | 1x1 degree |
Sensor | MODIS | MODIS |
Platform | Aqua | Terra | Diff
EQCT | 13:30 | 10:30 | Diff
Day Time Node | Ascending | Descending | Diff
Pre-Giovanni Processes | ATBD-MOD-30 | ATBD-MOD-30 |
Giovanni Processes | Spatial subset, Time average | Spatial subset, Time average |

MODIS Terra vs. MODIS Aqua AOD Correlation

Included Overpass time Difference

Known Issues: The difference in EQCT and Day Time Node, modulated by the data-day definition, causes the included overpass time difference, which produces the artifact difference. See sample images:

BUT WHY ARE WE SAYING THIS IN WORDS?
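The difference alerts in the comparison table can be derived mechanically once each dataset's metadata is machine-readable. A minimal sketch, assuming the metadata is available as key/value pairs: the values are transcribed from the table, while the dictionary keys and the `difference_alerts` helper are hypothetical names chosen for illustration.

```python
# Values transcribed from the comparison table; the dictionary keys are
# illustrative names, not an established metadata schema.
PARAM_A = {
    "parameter": "Aerosol Optical Depth at 550 nm",
    "dataset": "MYD08_D3.005",
    "data_day": "UTC (00:00-24:00Z)",
    "temporal_resolution": "Daily",
    "spatial_resolution": "1x1 degree",
    "sensor": "MODIS",
    "platform": "Aqua",
    "eqct": "13:30",
    "day_time_node": "Ascending",
}

# Parameter B shares most attributes with A; only the listed ones differ.
PARAM_B = dict(
    PARAM_A,
    dataset="MOD08_D3.005",
    platform="Terra",
    eqct="10:30",
    day_time_node="Descending",
)


def difference_alerts(a, b):
    """Return (attribute, value_a, value_b) for every attribute that differs."""
    keys = list(a) + [k for k in b if k not in a]
    return [(k, a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)]


alerts = difference_alerts(PARAM_A, PARAM_B)
for name, left, right in alerts:
    print(f"Diff  {name}: {left} vs. {right}")
```

Plain equality flags the four "Diff" rows but cannot express the "The same but…." case for the data-day definition, where identical values still interact with the other differences. Capturing that needs semantics about what the attributes mean, not just string comparison.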

Abductive Information System?

• What would this look like in application tools? How to explore ‘hunches’ (hints)?

• If you accept that induction is fundamentally part of how an information system is developed, then how can we allow for abduction before induction?

• Open world, integrative

• Design factors? Architecture factors? Library factors? Cognitive factors?

Speculation

• But back to big data and the need to turn the visualization ‘walls’ into exhibits, 4-dimensions
– installations – i.e. not immersion but experience
– Synesthesia – why only one sense?
– Rapid

Speculation

• At scale – why?

• Stereo – why?

• Linked to the live data – minimal curation!

• Goal: restore abductive reasoning to the conduct of science for specialists and non-specialists

• And this has to be informatics-based not some ad-hoc techies making stuff up…

• Collaboration – wanna play?

So long and …

• pfox@cs.rpi.edu

• http://tw.rpi.edu

• http://openendedgroup.com

• http://logd.tw.rpi.edu

• http://empac.rpi.edu

Back shed


Curation stages

Need to be here


Mind the Gap!

• There is/was still a gap between science and the underlying infrastructure and technology that is available

• Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.

Informatics - information science includes the science of (data and) information, the practice of information processing, and the engineering of information systems. Informatics studies the structure, behavior, and interactions of natural and artificial systems that store, process and communicate (data and) information. It also develops its own conceptual and theoretical foundations. Since computers, individuals and organizations all process information, informatics has computational, cognitive and social aspects, including study of the social impact of information technologies. Wikipedia.

Modern informatics enables a new scale-free** framework approach

• Use cases
– requirements
• Stakeholders
• Distributed authority
• Access control
• Ontologies
• Maintaining Identity

Multi-tiered interoperability

used by

Tetherless World Constellation tw.rpi.edu

Themes

Future Web
• Web Science
• Policy
• Social

Xinformatics
• Data Science
• Semantic eScience
• Data Frameworks

Semantic Foundations
• Knowledge Provenance
• Ontology Engineering Environments
• Inference, Trust

Hendler

Fox

McGuinness

Multiple depts/schools/programs ~ 35 (Post-doc, Staff, Grad, Ugrad)