Post on 11-Sep-2014
Data Visualisation in the Digital Arts and Humanities
Tools, Methods & Techniques to Put Your Research Data to Work
Shawn Day
Queen’s University Library
Objective‣ To appreciate the rich variety of techniques and tools
available to digital humanities scholars for data visualisation and analysis. This workshop introduces the varied uses of data visualisation in the humanities through examples and case studies, and will hopefully inspire you to some hands-on fun.
The beginning of a conversation …
Upcoming Seminars and Workshops‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data
(Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation
for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for
Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis
‣ More to come: http://qubdh.co.uk
Agenda‣ Introduction ‣ What is Data Visualisation ‣ Why Visualise Data? ‣ Case Studies ‣ Things to Visualise ‣ Ways to Visualise ‣ Tools for Visualisation
Breakpoint‣ One of the keys to good visualization is understanding
what your immediate (and longer-term) goals are.
‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others?
‣ You - Visualisation for Data Analysis
‣ Share with Others - Visualisation for Presentation
Why Visualise? The Basics‣ Open Up Large Datasets ‣ Increase Density of Observable Data ‣ Reduce Complexity ‣ Aestheticise Data ‣ Illustrate an Interpretation ‣ Make an Argument
Why Visualise? The Psychology and Physiology‣ Bypass language centres to tap directly into the visual
cortex; ‣ Leverage ability to recognise patterns - what they call
visual sense-making; ‣ Powerful graphics engines now allow for live data
processing and sophisticated animations and interactive research environments.
Why Visualise? From a Data Perspective‣ Can link different formats ‣ Can share more easily with others ‣ Can see new meanings and connections ‣ Sort and re-organize in automated fashion ‣ Manage larger amounts of information ‣ Visualise your results
Why Visualise? For Humanities Research ‣ Work with new data to create new knowledge
‣ Explore data to discover things that used to be unknown, unknowable or impractical to know
‣ Take a new perspective on the familiar to reveal previously hidden insights
Data Visualisation has definitely hit the big-time‣ Guardian Awards ‣ New York Times ‣ Why?
Tourists vs Locals, Eric Fischer, 2010 - Flickr
Visualise New Information
Red - Tourists, Blue - Locals, Yellow - NA
Areas of Interest
Crowdsourcing
Visualising New Information
Flickr Flow, Martin Wattenberg and
The Familiar through New Eyes
The London Times Atlas
How Could You Use Data Analysis?‣ “In the Lab” - for your own analysis ‣ Online as part of collaborative groups ‣ Through dissemination, to extend your own work -
crowdsourcing ‣ Others?
Case Study: The Time Strip
Visualisation Objective‣ Exploring the ‘ordinary’ lives of rural pioneers/farmers in
nineteenth-century Ontario
Canada
Ontario
South Western Ontario
Farm Journal Raw Materials‣ 100s of pages ‣ Varying hands ‣ Varying quality ‣ Columns ‣ No Context
William Sunter Farm Diary, 1858
Medical Diary by BlueChillies
Example: Medical Diary
History flow by Martin Wattenberg and Fernanda Viegas
Example: History Flow
Mechanics of the Process‣ Generate word frequency (Voyant, TAPoR) ‣ Isolate known farm activities (NLP - LanguageWare) ‣ Collocate to link activity references to time, duration, and
resources (Voyant)
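The word-frequency and collocation steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Voyant, TAPoR or LanguageWare work internally. The diary entries, the activity list, and the function names (tokenize, word_frequencies, collocates) are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical diary entries, invented for illustration only.
ENTRIES = [
    "June 3 haying all day with two men",
    "June 4 haying in the morning, ploughing after noon",
    "June 5 ploughing the back field",
]

# Assumed list of known farm activities (the workshop isolated these with NLP tooling).
ACTIVITIES = {"haying", "ploughing"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_frequencies(entries):
    """Count how often each token occurs across all entries."""
    counts = Counter()
    for entry in entries:
        counts.update(tokenize(entry))
    return counts


def collocates(entries, target, window=2):
    """Count tokens that appear within `window` words of `target`."""
    counts = Counter()
    for entry in entries:
        tokens = tokenize(entry)
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), i + window + 1
                # Count every neighbour in the window except the target itself.
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


frequencies = word_frequencies(ENTRIES)
activity_counts = {activity: frequencies[activity] for activity in ACTIVITIES}
haying_neighbours = collocates(ENTRIES, "haying")
```

Here haying_neighbours shows which words sit next to an activity. Dates and durations found this way are what let an activity be tied to time and resources.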
The Result/ New Patterns
The Result/ New Patterns‣ Less time haying ‣ The impact of technology ‣ More tasks faster
How Else Could this be done?
What is the Value of this Visualisation?‣ Easier to compare over intervals ‣ Multiple vectors with greater granularity in a compressed
space ‣ The challenge is to find rich enough source materials to
yield substantive datasets
Case Study: The Tree Map
Example: Panopticon
Ben Shneiderman and Hard Drive Space
Example: Bachelor’s Degrees 2011
Ben Schmidt, 2013 http://benschmidt.org/Degrees/2011Overview/
Case Study: Occupations of Politicians‣ What are we studying? • Self-declared occupations of politicians
‣ Why? • What bias might they bring to their job?
‣ How? • Visualising past occupation and mapping to political
platform of party affiliated with
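For a sense of how a tree map turns counts into areas, here is a minimal single-level "slice" layout in Python. It is a simplified sketch, not the layout used by any of the tools shown here: real tree maps nest rectangles and usually balance aspect ratios. The occupation counts and the function name slice_layout are invented for illustration.

```python
# Hypothetical occupation counts, invented for illustration only.
OCCUPATIONS = {"Lawyer": 50, "Farmer": 30, "Merchant": 20}


def slice_layout(items, x, y, width, height):
    """Split one rectangle into vertical strips whose areas are
    proportional to each item's value.

    Returns {label: (x, y, width, height)}.
    """
    total = sum(items.values())
    rects = {}
    offset = x
    for label, value in items.items():
        strip_width = width * value / total
        rects[label] = (offset, y, strip_width, height)
        offset += strip_width
    return rects


rects = slice_layout(OCCUPATIONS, x=0, y=0, width=100, height=40)
```

Each strip's area is its share of the whole, so comparing two parliaments means comparing two sets of rectangles drawn to the same scale.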
Occupations of MPs in the 2nd Canadian Parliament
Occupations of MPs in the 37th Canadian Parliament
The Result/ New Patterns‣ The emergence of the professional politician with no
private sector experience ‣ Occupational continuity across changes in governing
party ‣ http://dev.dho.ie/~sday/dail/index.html
How Else Could this be Done?
The Value of Data Vis for Analysis‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
People demanding more…‣ Interactivity ‣ Involvement ‣ Action ‣ Participation ‣ Web 2.0 … 3.0 ….
General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
Types of Data to Visualise‣ Audio Data ‣ Categorical Data ‣ Cartographic Data ‣ Collections ‣ Image Data • Still • Moving ‣ Metadata ‣ Multimedia Data
‣ Network Data • Social • Other ‣ Numerical Data ‣ Temporal Data ‣ Textual Data • Narrative • Qualitative ‣ ????
Audio Data‣ Spectrogram ‣ Wave forms ‣ Notes ‣ Frequency ‣ Beats
Visualisation of "Canada is Really Big" by The Arrogant Worms http://www.sonicvisualiser.org/
Audio Data‣ What does sound look like?
Audio Data: The Shape of Song‣ http://www.turbulence.org/Works/song/index.html ‣ Measuring Musical Patterns using Translucent Arcs ‣ Repetition
Philip Glass, Candyman 2
Madonna, Like a Prayer
Audio Data: IBM ‘Glass Engine’
http://www.philipglass.com/glassengine/
Categorical Data‣ Data is grouped into categories based on a qualitative
trait. ‣ The resulting data represents the labels of these groups. ‣ Nominal, Ordinal
and/or Binary
Cartographic Data‣ Communicate spatial information
Cartographic Visualisation
Cartographic Visualisation
http://maps.stamen.com/watercolor/#13/53.3355/-6.2181
Digital Collections‣ Collections of data, images, movies, sound … etc • Visualise the
object in context as part of a collection
• Represent the structure of the collection
Google Art Project: Visualising Museum Collections
Digital Collection Visualisation
Digital Still Image Data‣ Colour ‣ Texture ‣ Shape ‣ Content ‣ Format ‣ Metadata ‣ Luminosity/Hue/Saturation/Range
Digital Moving Image Data‣ Adding Data on: • Narrative • Length • Frame rate • Sound/Image • Key Frames • Storyboard
Metadata
Numerical/Quantitative Data‣ Does anyone really need me to tell them about this? • Analysed using statistical methods • displayed using tables, charts, histograms and graphs…
Social Network Data‣ Nodes and Edges ‣ Representing relations between objects, and quantifying and qualifying
those relations
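Nodes and edges can be held in a very small data structure. The Python sketch below is an illustration only and is not how Gephi or any other tool mentioned in this workshop stores networks. The correspondence data and the function names (build_graph, degree, strength) are invented.

```python
from collections import defaultdict

# Hypothetical correspondence network: (person, person, number of letters).
EDGES = [
    ("Anna", "Brigid", 3),
    ("Anna", "Conor", 1),
    ("Brigid", "Conor", 2),
    ("Conor", "Deirdre", 5),
]


def build_graph(edges):
    """Build an undirected weighted graph as {node: {neighbour: weight}}."""
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = weight
        graph[b][a] = weight
    return graph


def degree(graph, node):
    """Number of distinct neighbours: how many relations the node has."""
    return len(graph[node])


def strength(graph, node):
    """Sum of edge weights: how strong those relations are in total."""
    return sum(graph[node].values())


graph = build_graph(EDGES)
most_connected = max(graph, key=lambda node: degree(graph, node))
```

Degree quantifies how many relations a node has, while the edge weight qualifies each one. These are the kinds of values a network visualisation maps to node size or line thickness.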
Temporal Data‣ Show changes over time ‣ Show temporal clusters
Xerox PARC, Stuart K. Card, George G. Robertson, Jock D. Mackinlay
http://www.itc.nl/personal/kraak/
Different Ways of Seeing Time
Quantitative Textual Visualisation
Textual - Qualitative‣ Textual attributes graphically represented • Frequency • Collocation • Adjacency
Textual - Narrative
Time, Space, Narrative: MythEngine
http://www.bbc.co.uk/blogs/
Time, Space, Narrative: MythEngine
http://www.bbc.co.uk/blogs/researchanddevelopment/2010/03/the-mythology-engine-represent.shtml
General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
Step 1 Discovery / Acquisition
An Iterative Process
ACQUIRE → PARSE → FILTER → MINE → REPRESENT → REFINE → INTERACT
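The first five stages can be sketched as a chain of small Python functions. This is a toy illustration under stated assumptions: the CSV data is invented, the function names are hypothetical, and "represent" is reduced to a text bar chart. Refining and interacting would mean revisiting these steps with a real visualisation tool.

```python
import csv
import io
from collections import Counter

# Invented sample data standing in for an acquired file or download.
RAW = """year,activity
1858,haying
1858,ploughing
1858,haying
1859,
1859,threshing
"""


def acquire():
    """ACQUIRE: obtain the raw data (here, just a string constant)."""
    return RAW


def parse(text):
    """PARSE: give the raw text structure, one dict per CSV row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows):
    """FILTER: drop rows that have no recorded activity."""
    return [row for row in rows if row["activity"]]


def mine(rows):
    """MINE: summarise the data, here by counting each activity."""
    return Counter(row["activity"] for row in rows)


def represent(counts):
    """REPRESENT: draw a basic text bar chart, most frequent first."""
    return "\n".join(
        f"{name:<10} {'#' * n}" for name, n in counts.most_common()
    )


rows = filter_rows(parse(acquire()))
counts = mine(rows)
chart = represent(counts)
print(chart)
```

Because each stage is its own function, any one of them can be changed and the pipeline rerun, which is what makes the process iterative.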
Visualizing What?‣ Basic types of content that we are used to dealing with: • Text • Numbers • Image • Video
‣ Other, more “complex” stuff: • Relations, connections, links - a genealogy • Time and space coords - the path of migratory birds • Animations – a piece of courseware • 3D models – the plan of your house
Acquisition: Junar‣ http://www.junar.com
http://goo.gl/oexnB
Acquisition: Public Data Sources‣ CSO: Data Formats ‣ The Data Hub: Linked Data
Acquisition: Public Data Sources
Cleaning / Munging (Normalisation, Format Conversion)‣ Tools: • Data Wrangler • Google Refine • Mr. Data Converter
‣ Data Wrangler • Performs simple split, clear and fold/unfold transforms on data • See example --> Data and Script
‣ Google Refine • Works with larger datasets
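To make the split and fold/unfold transforms concrete, here is a minimal Python sketch. It does not reproduce Data Wrangler's or Google Refine's actual implementation. The table values and the function names (split, fold, unfold) are assumptions made for illustration.

```python
# Hypothetical "wide" table: one column per year (values invented).
WIDE = [
    {"farm": "Sunter", "1858": 12, "1859": 9},
    {"farm": "Miller", "1858": 7, "1859": 11},
]


def split(text, sep=","):
    """Split one delimited cell into a list of trimmed parts."""
    return [part.strip() for part in text.split(sep)]


def fold(rows, key, columns, name="year", value="days"):
    """Fold the named columns into rows: one (key, name, value) row each."""
    long_rows = []
    for row in rows:
        for column in columns:
            long_rows.append({key: row[key], name: column, value: row[column]})
    return long_rows


def unfold(rows, key, name="year", value="days"):
    """Reverse of fold: spread (name, value) pairs back out into columns."""
    wide = {}
    for row in rows:
        wide.setdefault(row[key], {key: row[key]})[row[name]] = row[value]
    return list(wide.values())


long_rows = fold(WIDE, key="farm", columns=["1858", "1859"])
round_trip = unfold(long_rows, key="farm")
parts = split("haying, ploughing ,threshing")
```

Many charting tools expect the folded "long" form, with one observation per row, so folding is often the last cleaning step before visualisation.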
Open Data/Linked Data
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Munging Tool: Data Wrangler‣ http://vis.stanford.edu/wrangler/app/
Cleaning Exercise
Alternate: Google Refine‣ http://code.google.com/p/google-refine/
Alternate: Mr Data Converter‣ http://shancarter.com/data_converter/
Now You’ve Got Data ... ‣ What’s Next? ‣ Data Visualisation in the Analysis Process ‣ Data Visualisation for Presentation
General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
Breakpoint‣ Are you visualizing data to understand what’s in it, or are
you trying to communicate meaning to others? ‣ You - Visualisation for Data Analysis ‣ Share with Others - Visualisation for Presentation
Google NGram Viewer‣ Examine word frequency in digitised books ‣ Currently about 4% of books ever published ‣ In English, Chinese, French, German, Hebrew, Russian, and
Spanish ‣ Changes in word usage ‣ Trends
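The idea behind a word-usage trend line is a relative frequency per year: occurrences of a word divided by all words published that year. The Python sketch below shows the calculation on a tiny invented corpus. It is not the Google NGram Viewer's code or API, and the name relative_frequency is hypothetical.

```python
from collections import Counter

# Tiny invented corpus: (publication year, text).
CORPUS = [
    (1850, "the plough and the horse"),
    (1850, "the horse and cart"),
    (1900, "the tractor and the plough"),
]


def relative_frequency(corpus, word):
    """Return {year: share of all tokens in that year equal to `word`}."""
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


horse_trend = relative_frequency(CORPUS, "horse")
```

Dividing by the yearly total matters because far more text survives from later years, so raw counts alone would suggest that almost every word becomes more popular over time.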
Google NGram Viewer
http://books.google.com/ngrams/graph
The Value of Data Vis for Analysis‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
Data Analysis Principles 1. Process is a Way of Thinking, not a Substitute for
Thinking 2. Data needs to be considered and reported in Context 3. Look Before you Leap - Get to Know Your Data 4. Question Everything - Collection Process, Bias, etc. 5. Do a Gut Check 6. Coincidence is Not the Same as Causality 7. Just Because Data Exists Doesn’t Mean it’s Relevant
Fern Halper - Seven Guiding Principles
Analysis / Exploratory Visualisation
Cleaning & Structuring: Google Fusion Tables
Gephi: Analysis and Discovery of Networks
Where to Keep up with the Community‣ Highbrow: http://osc.hul.harvard.edu/highbrow
‣ http://chronicle.com/blogs/profhacker ‣ Flowing Data: http://flowingdata.com ‣ Perceptual Edge: http://www.perceptualedge.com ‣ Info is Beautiful: http://www.informationisbeautiful.net ‣ Visualising Data: http://www.visualisingdata.com ‣ Infosthetics: http://infosthetics.com ‣ Datavisualisation.ch: http://datavisualization.ch ‣ Dig Hum Specialist: https://dhs.stanford.edu/the-digital-humanities-as
New Perspectives on Old Data
Presenting Your Data Visually
Objectives‣ Consider best practices in sharing research findings
using visualisation tools; ‣ Identify and judge between publicly available tools to
create and deploy humanities visualisation research products;
‣ Consider data visualisation as part of a larger research discussion.
General Steps in Data Vis for DH‣ Discovery / Acquisition ‣ Cleaning / ‘Munging’ ‣ Analysis / Exploratory Vis ‣ Presentation
Academic Visualisation? There are lots of published papers out there
http://www.autodeskresearch.com/projects/citeology
The Life of an Idea through Citations
Data Visualisation Lessons from Tufte‣ Show the Data ‣ Provoke Thought about the Subject at Hand ‣ Avoid Distorting the Data ‣ Present Many Numbers in a Small Space ‣ Make Large Datasets Coherent ‣ Encourage Eyes to Compare Data ‣ Reveal Data at Several Levels of Detail ‣ Serve a Reasonably Clear Purpose ‣ Be Closely Integrated with Statistical and Verbal
Descriptions of the Dataset
What Visual Techniques Exist?‣ Connecting your data with the right visualisation ‣ What is your message? ‣ How do we know what we might use? ‣ Start with your Exploratory/Research/Analytical
Environment ‣ How do visuals fit into your narrative?
Connecting your data with the right visualisation
What Visual Techniques Exist?
Visual Everything
Connecting your data with the right visualisation
Structured Data Presentation Tools (a tiny subset)‣ Webservices • Temporal: TimeFlow • Google Fusion Tables • Textual, Spatial and Numeric: Many Eyes • Temporal: Dipity • Infographics: Visual.ly
‣ Frameworks • GraphViz • Gephi • Prefuse • D3 • Processing • Exhibit (Exercise)
TimeFlow‣ Journalism ‣ Getting the flow of events and facts straight
‣ http://flowingmedia.com/timeflow.html ‣ Great for historians
Google Fusion Tables‣ Initially Exploratory and useful for ‘Munging’ ‣ Allows for Embedding ‣ And for User Interaction ‣ Transparency ‣ Experimental (Good)
‣ http://www.google.com/fusiontables/Home/
Many Eyes‣ http://www-958.ibm.com ‣ Rich, Varied and Accessible ‣ Free Rapid Prototyping
Visual.ly
Visual.ly‣ Well crafted Infographics gaining credibility ‣ The new poster presentation ‣ Data-driven narrative in words and pictures ‣ Visual.ly currently driven by social media
Dipity
Frameworks and Languages‣ GraphViz ‣ R Programming Language ‣ JIT (JavaScript Infovis
Toolkit) ‣ Protovis ‣ D3 ‣ Processing ‣ Tableau ‣ Prefuse ‣ Gephi
‣ WEAVE (http://www.oicweave.org/)
‣ Exhibit (Exercise)
Graphviz‣ An Open Source Framework ‣ Mature (1988) ‣ AT&T Labs ‣ Used as a basis for subsequent ‣ A great prototyping and starting point
‣ http://www.graphviz.org/
R Programming Language‣ Geared towards statistical analysis ‣ More recently has had some powerful graphics
frameworks added ‣ Open Source ‣ Typically Command Line but a variety of GUI editors
available ‣ > Jeff Rydberg-Cox: R for the Digital Humanities
JavaScript InfoVis Toolkit (JIT)‣ JIT Demos (http://thejit.org/demos/) ‣ The JavaScript InfoVis Toolkit is a complete set of tools to
create Interactive Data Visualizations for the Web. It includes JSON loading, animation, 2D point and graph classes and some predefined tree visualization methods.
‣ Smaller datasets in a clean form ‣ Related and Aggregated/Categorised Data
JavaScript InfoVis Toolkit (JIT)
ProtoVis‣ Protovis is a visualization toolkit for JavaScript using SVG.
It takes a graphical approach to data visualization, composing custom views of data with simple graphical primitives like bars and dots. These primitives are called marks, and each mark encodes data visually through dynamic properties such as color and position.
‣ Jerome Cukier: ProtoVis Tutorial ‣ Development shifted to D3 ‣ ProtoVis still very accessible and usable
http://mbostock.github.com/protovis/ex/crimea-rose.html
ProtoVis
http://mbostock.github.com/protovis/ex/napoleon.html
ProtoVis
D3‣ D3 allows you to bind arbitrary data to a Document
Object Model (DOM), and then apply data-driven transformations to the document. As a trivial example, you can use D3 to generate a basic HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction.
‣ Open Source
Processing‣ Now we are getting serious... ‣ Ben Fry ‣ Like R, has a serious statistical bent ‣ Has a client and development environment, but deploys
easily to the web using processing.js ‣ Large and very large datasets ‣ Good with related data ‣ Serious support for aesthetics ‣ Modelling Environment ‣ http://processing.org/ ‣ http://www.openprocessing.org/
Processing.js
http://nytlabs.com/projects/cascade.html
Processing.JS
Tableau‣ Commercial ‣ Offers a Free Public Application ‣ Encourages sharing and focusses on building a narrative
around visualisation of your research data ‣ Education and Non-Commercial Licenses available ‣ Mature and evolving rapidly to demonstrate the newest
and most exciting visualisation types
http://www.tableausoftware.com/public
Tableau
Prefuse‣ flare.prefuse ‣ Flash-based ‣ Great transitions and very approachable ‣ Beware of Datalocking ‣ http://flare.prefuse.org/demo
Gephi‣ Open Source ‣ Mapping and Visualising Relationships and Networks ‣ An outstanding Visual Development Environment ‣ Multiplatform ‣ Extensible!! ‣ https://gephi.org/
Gephi
Gephi
Where to go further‣ DIRT (Digital Research Toolkit) ‣ Timeline Tools ‣ Visualisation in Education ‣ Visual Complexity ‣ DataVis.ca ‣ R: A Tiny Handbook of R - Springer ‣ Using R in DH ‣ MONK ‣ http://datajournalism.stanford.edu/
Upcoming Workshops‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data
(Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation
for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for
Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis
‣ More to come: http://qubdh.co.uk
Thank You
Shawn Day - s.day@qub.co.uk - @iridium
The Library/Institute for Collaborative Research in the Humanities
18 University Square
Ground Floor http://qubdh.co.uk