Visualization of linguistic information

Chris Culy, Verena Lyding
christopher.culy@eurac.edu; verena.lyding@eurac.edu

Institute for Specialised Communication and Multilingualism

European Academy of Bolzano/Bozen

Culy/Lyding 07.12.2009 2

Who are we?

Chris Culy

• Ph.D. in linguistics (Stanford), ex-professor at The University of Iowa
  Syntax, morpho-syntax, typology, African languages (fieldwork)
• Career in Silicon Valley
  Computational linguistics, machine translation, AI
• EURAC: Senior researcher, Language Technologies Technical Officer
  Visualization, tools of various sorts

Who are we?

Verena Lyding

• Studies in computational linguistics (University of Potsdam), MSc in speech and language processing (University of Edinburgh)
• Ongoing Ph.D. work at University of Osnabrück
  User-centered evaluation of visualizations for exploratory corpus analysis
• EURAC: Researcher in the language technologies team since 2004
  Visualization, corpus linguistics

Institute for Specialised Communication and Multilingualism at EURAC

• Founded in 1993, oldest department of the European Academy of Bolzano/Bozen – South Tyrol’s institute for applied research and further education (www.eurac.edu)
• More than 13 ongoing projects about:
  • Specialised Communication
  • Bi- and Multilingualism
  • Language Technologies
• Team of 20 researchers:
  • Terminologists / translators
  • Linguists / computational linguists
  • Sociologists / psychologists

Your background and a first demo

• What is your background?
  • Linguists, computational linguists, …
• Do you use visualizations already?
  • What visualizations are you using?
  • How are you using them?
• Many Eyes Word Tree demo

Information visualization

Definition:

“The use of computer-supported, interactive, visual representations of abstract data to amplify cognition.” (Card et al., 1999)

LInfoVis = Linguistic Information Visualization

• The application of information visualization principles to display any kind of information concerning language and its use.

Aims of LInfoVis

• To convey information about language data
• To provide a way to interact with language data (user interface)
• To be an aid to discovery, decision making and explanation of information about language

Aims of the workshop

• Introduce basic principles of information visualization
• Provide an overview of software and toolkits for creating visualizations
• Present projects that are concerned with the visualization of linguistic/language data
• Brainstorm about how different linguistic research contexts could benefit from visualizations
• Discuss where to go from here

Outline

• A closer look at how visualizations work
  • Visual variables
  • Visualization principles
  • Examples of visualizations
• Projects in the area of linguistic information visualization
  • What data can be used for linguistic information visualization?
  • What visualizations are out there? (including our projects)
• Discussion
• Creating visualizations
  • Programs and toolkits for the creation of visualizations
  • Research directions
• Discussion

Understanding Visualizations

“Visualization has to be more than pretty pictures. It has to inform. It has to challenge. It has to further our understanding.

Visualizing data is not about pretty pictures.” (Robert Kosara on www.eagereyes.org)

How do visualizations work?

Sources for this section:

• Bertin, J. (1982): Graphische Darstellungen. Graphische Verarbeitung von Informationen. Berlin/New York: de Gruyter.

• Card, S. K. / Mackinlay, J. D. / Shneiderman, B. (1999): Information Visualization: Using Vision to Think. San Francisco: Morgan Kaufmann Publishers

• Collins, C., Penn, G. and Carpendale, S. (2008). Interactive visualization for computational linguistics. ACL-08: HLT Tutorials. Retrieved from: http://www.cs.utoronto.ca/~ccollins/acl2008-vis.pdf. Access date: December 3, 2009.

• Hearst, M. (2009): Search User Interfaces. Cambridge: Cambridge University Press.

• Tufte, E. (1999): Envisioning Information. Cheshire, Connecticut: Graphics Press LLC.

• Tufte, E. (2006): Beautiful Evidence. Cheshire, Connecticut: Graphics Press LLC.

How do visualizations work?

• Visualizations use graphics to organize, highlight, and compare information, and to reveal patterns/trends/outliers in the data (Hearst, 2009)
• Information is transformed into 2D graphics
  • Also 3D, but 3D “has been found to be inferior, or at best equivalent to 2D or textual interfaces” (Hearst, 2009)
• Meaningful visualizations are constructed by:
  • Choosing graphical representations that fit the data
  • Adhering to visualization principles (e.g. derived from insights into human visual perception)

Data types: quantitative vs. categorical

For the purposes of visualization, it is useful to classify data (Hearst, 2009):

• Quantitative data: numbers, etc. that can be processed arithmetically
• Categorical data: everything else
  • Interval: ordered data with measurable distances (e.g. months)
  • Ordinal: ordered data without measurable distances (e.g. hot-warm-cold)
  • Nominal: data without organization (e.g. weather types, a collection of names)
  • Hierarchical: data without order, arranged into subsuming groups (e.g. {{ mammals, { bear …}, { cat { lynx…},…},…},…}, etc.)

Quantitative, interval, and ordinal data are easier to convey visually than nominal data.

Where do textual elements fit in?

• We are interested in the information/properties of textual elements, e.g. word frequency, syntactic structure, emotion content, etc.
• This information is (usually) either quantitative (frequencies) or structured (trees, emotion scales)
• However, in many cases, the actual textual items are important for understanding the information, so they must be indicated in the visualization.

“The categorical nature of text, and its very high dimensionality, make it very challenging to display graphically.” (Hearst, 2009)

Why is textual information hard to visualize?

The problem isn’t the information about textual elements, but the textual elements themselves: they take up space.

• Textual items are not mappable
  • We (usually) cannot effectively represent them by something else meaningful (e.g. shape, color, position, etc.)
  • Textual items are too variable and too complex to be reduced to a more compact representation, even a label
  • The details of the textual items are often crucial to understanding the data (e.g. context in a concordance)

This is a huge challenge for LInfoVis!

• Not a solved problem
• Interactive visualizations will be the key to the solution(s)
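The concordance case mentioned above can be made concrete with a minimal keyword-in-context (KWIC) sketch. The function name `kwic`, the tokenization by `\w+`, and the sample sentence are illustrative assumptions, not part of the workshop material. The point is that every hit carries its surrounding words, which is exactly the text that takes up display space.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    `width` is the number of tokens kept on each side (an assumed default).
    """
    tokens = re.findall(r"\w+", text.lower())  # naive tokenization
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Made-up sample text for illustration only.
sample = "The cat sat on the mat. The dog saw the cat on the mat."
for left, kw, right in kwic(sample, "cat"):
    print(f"{left:>20} [{kw}] {right}")
```

Even with three words of context per side, each hit needs a full line, so a frequent keyword quickly fills the screen.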

Visual data transcription: visual variables

[Table of visual variables; note: value = brightness]

Taken from: M. Carpendale, "Considering visual variables as a basis for information visualisation", Dept. of Computer Science, University of Calgary, Canada, Tech. Rep. 2001-693-16, 2003, Table 1.

Visual variables: characteristics (1)

5 key characteristics

• Selectivity: Different values are easily seen as different
  • “Is A different from B?”
  • Worst case: visual properties of all objects need to be looked at one by one
• Associativity: Similar values can easily be grouped together
  • “Is A similar to B?”

From full to no selectivity/associativity:
Positioning > {size, brightness} > {color, orientation (for points)} > texture > shape

Visual variables: characteristics (2)

• Order: Different values are perceived as ordered
  • “Is A more/greater/bigger than B?”
  • Size and brightness are ordered
  • Orientation, shape, texture are not ordered
  • Hue is somewhat ordered
• Quantity: A number can be deduced from differences
  • “How much is the difference between A and B?”
  • Position is quantitative, size is somewhat quantitative
  • The other variables are not quantitative

Visual variables: characteristics (3)

• Length: The number of distinctions possible using the variable
  • “How many different things can we represent with this variable?”
  • Shape, texture: infinite, but …
  • Brightness, hue: 7 (Association) – 10 (Distinction)
  • Size: 5 (Association) – 20 (Distinction)
  • Orientation: 4
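The order, quantity, and length characteristics can be read as a small lookup table. The sketch below encodes only the figures from these slides and filters variables by what a data set needs. The names (`candidates`, `LENGTH`, and so on) are hypothetical, and two points are assumptions rather than statements from the slides: position is treated as ordered, and variables without a stated length are treated as unbounded.

```python
import math

VARIABLES = ["position", "size", "brightness", "hue",
             "orientation", "texture", "shape"]

# From the slides: size and brightness are ordered (hue only "somewhat").
# Assumption: position is ordered too, since it is quantitative.
ORDERED = {"position", "size", "brightness"}

# From the slides: position is quantitative, size "somewhat" quantitative.
QUANTITATIVE = {"position", "size"}

# "Distinction" lengths from the slides; missing entries are treated
# as unbounded (an assumption; the slides say "infinite, but ...").
LENGTH = {"brightness": 10, "hue": 10, "size": 20, "orientation": 4}

def candidates(needs_order=False, needs_quantity=False, n_values=1):
    """List visual variables that could carry data with the given needs."""
    result = []
    for var in VARIABLES:
        if needs_order and var not in ORDERED:
            continue
        if needs_quantity and var not in QUANTITATIVE:
            continue
        if LENGTH.get(var, math.inf) < n_values:
            continue
        result.append(var)
    return result

print(candidates(needs_order=True, n_values=7))
print(candidates(needs_quantity=True))
print(candidates(n_values=12))
```

A table like this is only a first filter; the final choice still depends on the selectivity and associativity ranking given earlier.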

Principles: visual variables (1)

“Sameness of a visual element implies sameness of what the visual element represents” (Tufte, 2006)

• The characteristics of visual variables determine what they can represent
  • e.g. Ordered values have to be represented by ordered visual variables
• Consider gestalt psychology principles of perception
  • General notions about how people organize what they see, e.g.
    • Proximity: objects located near each other are perceived as belonging to the same group
    • Similarity: objects with common visual attributes are perceived as being part of the same group
• Be consistent concerning relations of similarity, proportion and configuration

Principles: visual variables (2)

• Adhere to conventional uses of visual variables
  • E.g. in cartography use blue color for water
  • Watch out for “effects without causes” (Tufte)
• Scales should be made up of visually equidistant values of a variable
• The full range of a visual variable should be used
• The number of visual variables of a visualization should correspond to the dimensionality of the represented information
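As a rough illustration of the two scale principles, the sketch below maps data values linearly onto the full range of a visual variable, here font size in points. The function name `linear_scale`, the frequencies, and the 10–40 pt range are made up for the example. Note the assumption that numerically equal steps are also visually equal, which holds only approximately for size and may not hold for variables such as brightness.

```python
def linear_scale(values, out_min, out_max):
    """Map values linearly so min(values) -> out_min and max(values) -> out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: use the middle of the output range.
        return [(out_min + out_max) / 2 for _ in values]
    span = hi - lo
    return [out_min + (v - lo) * (out_max - out_min) / span for v in values]

# Hypothetical word frequencies mapped to font sizes between 10 and 40 pt.
freqs = [2, 5, 11, 20]
sizes = linear_scale(freqs, 10, 40)
print(sizes)
```

Because the smallest and largest data values are pinned to the ends of the output range, the whole range of the variable is used regardless of the absolute frequencies.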

Perception: preattentiveness

“Preattentiveness of visual properties”

• Perceived in a single moment, with no cognitive effort
• Combinations of visual properties are usually not preattentive

Principles: visual clarity

“Clutter and confusion are failures of design, not attributes of information.” (Tufte, 1999)

• Avoid “optical clutter”, e.g. elements positioned too closely, too strong colors (Tufte)
• “let the same ink serve more than one informational purpose” (Tufte)
• Remove irrelevant information
• Use unobtrusive grid lines and background colors
• Highlight important information

Good visualizations reduce the cognitive effort (compared to other presentation formats) for understanding complex information

Principles: content transparency

• Don’t hide information without indicating what is left out
  • Can missing information be reconstructed?
  • Can transformations/simplifications/abstractions be tracked?
• Present information in context
  • Use rulers/scales
  • Add labels and legends
  • Choose visual elements in a way that what they represent is easily memorized

Principles: data arrangement

“Overview first, zoom and filter, then details-on-demand” (Shneiderman, 1996)

• Utilize the (often restricted) display space to give most room to the subject of the user’s interest (Card et al., 1999)
• User needs both overview (context) and detail (focus) simultaneously (Card et al., 1999)
  • Can be combined within a single display, like in human vision
• Layering and separation: visually stratifying various aspects of the data (Tufte, 1999)

Principles: interactivity

“Rapid interaction fundamentally changes the process of understanding data.” (Card et al., 1999)

• Panning across a view of the data
• Brushing-and-linking technique: simultaneous update of different views on the data (Hearst, 1999)
• Animated transitions can improve perceptions of changes between different graphical representations.

Visualizations: “classics”

Some examples:

• Written text
• Tables, charts
• Diagrams
• Networks, graphs
• Database layout
• Maps

Visualizations: some modern examples

• Sparklines: data-intense, design-simple, word-sized graphics (Tufte, 2006)
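To give a feel for "word-sized", here is a minimal text-only approximation of a sparkline built from Unicode block characters. This is a sketch, not Tufte's design: the function name `sparkline`, the eight-level resolution, and the sample counts are assumptions for illustration.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight Unicode block elements, lowest to highest

def sparkline(values):
    """Render a sequence of numbers as a one-line string of bars."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BARS[0] * len(values)  # flat series: all bars at the lowest level
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)

# Hypothetical counts of a word across eight text sections.
print("frequency per section:", sparkline([0, 1, 2, 3, 4, 5, 6, 7]))
```

The result fits inline next to a word in running text or in a table cell, which is the property the definition above emphasizes.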

Visualizations: some modern examples

• TreeMaps: space-constrained visualization of hierarchies (Johnson / Shneiderman, 1991)
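How a hierarchy can fill a fixed rectangle may be easier to see in code. The sketch below uses the simple slice-and-dice strategy, which alternates between horizontal and vertical splits at each level of the tree. It is one possible layout and is not claimed to match any particular tool. The tree encoding, the function names, and the leaf sizes are hypothetical, and positive leaf sizes are assumed.

```python
def weight(node):
    """Total size of a node: a leaf's own size, or the sum of its children."""
    label, content = node
    if isinstance(content, list):
        return sum(weight(child) for child in content)
    return content

def slice_and_dice(node, x, y, w, h, horizontal=True, out=None):
    """Assign each leaf a rectangle (label, x, y, width, height) inside (x, y, w, h)."""
    if out is None:
        out = []
    label, content = node
    if not isinstance(content, list):  # leaf: it gets the whole rectangle
        out.append((label, x, y, w, h))
        return out
    total = weight(node)
    offset = 0.0  # fraction of the rectangle already handed out
    for child in content:
        share = weight(child) / total
        if horizontal:  # split along x, then flip direction for the next level
            slice_and_dice(child, x + offset * w, y, w * share, h, False, out)
        else:           # split along y
            slice_and_dice(child, x, y + offset * h, w, h * share, True, out)
        offset += share
    return out

# Toy hierarchy with made-up sizes (e.g. token counts per concept).
tree = ("mammals", [("bear", 2), ("cat", [("lynx", 1), ("tiger", 1)])])
layout = slice_and_dice(tree, 0, 0, 100, 50)
for rect in layout:
    print(rect)
```

Each leaf's area is proportional to its size, so large subtrees stay visible without the display growing.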

Visualizations: some modern examples

• TileBars: visualization of the relationship between query terms and retrieval results (Hearst, 1995)

Visualizations: some modern examples

• Galaxies/Starfields: clustering of nominal data and layout on 2D/3D plane

Visualizations: some modern examples

• Small multiples: presentation of a number of comparable objects in parallel within eye span (Tufte, 1999)

Visualizations: some modern examples

• Fisheye view: presentation that gives focus normal size while surrounding information is miniaturized and distorted (Furnas, 1981)

Visualizations of linguistic information

“Nobody wants to look at a table of data, even if it's their own.” (Robert Kosara on www.eagereyes.org)

Visualizing language-related information

What characteristics of language data to visualize?

• Anything that we are interested in!

Examples:

• Text structure, discourse structure

• Frequencies of linguistic units (e.g. words, POS, phrases, etc.)

• Characteristics of conversation (e.g. turn taking, participants, etc.)

• Learner language

• Speech related information (e.g. intonation, accent, etc.)

• Interlinguistic comparisons (e.g. effects of language contact)

Visualization projects for language data

Up to now visualizations of language have focused mainly on:

• Content and similarity of document collections, keywords/collocations/co-occurrences, concept hierarchies/thesauri

Some examples:

• WordTree
• Leximancer
• Visuwords
• DocuBurst
• Corpus Clouds
• Comparison Arcs

Visualization projects for language data

• WordTree (Wattenberg/Viégas, 2008)
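The data structure behind a word tree can be sketched as a prefix tree of the words that follow a chosen root word, with a count at each node. This is a rough illustration of the idea only, not the Wattenberg/Viégas implementation. The function name `word_tree`, the `depth` parameter, and the sample text are assumptions, and the naive tokenization lets continuations run across sentence boundaries.

```python
import re

def word_tree(text, root, depth=2):
    """Build a nested dict of continuations after each occurrence of `root`.

    Each node has the form {"count": n, "next": {...}}; a renderer could
    map the counts to font size.
    """
    tokens = re.findall(r"\w+", text.lower())  # naive tokenization
    tree = {}
    for i, tok in enumerate(tokens):
        if tok != root:
            continue
        node = tree
        for nxt in tokens[i + 1:i + 1 + depth]:
            entry = node.setdefault(nxt, {"count": 0, "next": {}})
            entry["count"] += 1
            node = entry["next"]
    return tree

# Made-up sample text for illustration only.
tree = word_tree("The cat sat. The cat ran. The dog sat.", "the")
for word, entry in tree.items():
    print(word, entry["count"], sorted(entry["next"]))
```

Shared prefixes are merged, so frequent continuations appear once with a high count instead of once per occurrence.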

Visualization projects for language data

• Leximancer

Visualization projects for language data

• Visuwords (www.visuwords.com)

Visualization projects for language data

• DocuBurst

Visualization projects for language data

• Corpus Clouds (Culy/Lyding, 2009)

Visualization projects for language data

• Comparison Arcs (Culy/Lyding, 2009)

Discussion

• Were some of the presented visualizations relevant to your work?
• Have you gotten ideas for how to extend/adapt the tools to your needs?
• If the visualizations did not seem relevant, why is this?

Exercise:

• Think about 3 things you like about {WordTrees, Corpus Clouds, …} and 3 things you do not like. Discuss and collect ideas on how to improve the tool.

Creating Visualizations

Tools of the Trade

From data to visualizations

A reference model of visualization (Culy/Lyding 2009, based on Card et al., 1999)

Raw data → (data transformations) → Structured data → (visual mappings) → Visual structures → (view transformations) → Visual view

1. Raw data, e.g. texts
2. Data transformations, e.g. counting, sorting, tagging
3. Structured data, e.g. document vectors, word/lemma/POS lists
4. Visual mappings = the type of visualization
   • e.g. POS color, scatter plot, tree
5. Visual structures = the general visual form, e.g. chart, tree, text
6. View transformations = (interactive) modification and establishment of graphical parameters
7. Visual view = the visual appearance, e.g. color, shape, size, position
   • Also includes filtering of data: which data is visible
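The stages of the reference model can be traced in a few lines of code. The sketch below is a toy walk-through under stated assumptions: the raw data is a made-up sentence, the data transformation is word counting, the visual mapping is frequency to font size, and the last step filters which words are visible. The function names and the point sizes are hypothetical and not taken from any tool mentioned here.

```python
import re
from collections import Counter

def transform(raw_text):
    """Steps 1-3: raw text -> tokens -> structured data (word counts)."""
    return Counter(re.findall(r"\w+", raw_text.lower()))

def map_to_visual(counts, min_pt=10, max_pt=30):
    """Steps 4-5: map each count to a font size (the visual structure)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {w: min_pt + (c - lo) * (max_pt - min_pt) / span
            for w, c in counts.items()}

def view(counts, sizes, min_count=1):
    """Steps 6-7: filter which words are visible and fix their display order."""
    words = sorted((w for w in counts if counts[w] >= min_count),
                   key=lambda w: (-counts[w], w))
    return [(w, sizes[w]) for w in words]

raw = "the cat saw the dog and the dog saw the cat"  # made-up raw data
counts = transform(raw)
sizes = map_to_visual(counts)
visible = view(counts, sizes, min_count=2)
for word, size in visible:
    print(f"{word}: {size:.1f}pt")
```

Keeping the stages as separate functions mirrors the model: an interactive tool would re-run only `view` when the user changes the filter, without recounting the text.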

Four levels of visualization tools

1. Existing programs using a common/generic/simple data format
   • e.g. spreadsheet programs, statistical packages, etc.
   • Formats include data separated by tabs, commas, etc.
2. Existing programs using a complex/calculated format
   • Specialized linguistics programs
     • e.g. corpus query tools, annotation tools, etc.
   • Relevant non-linguistic programs
3. A new/custom program developed using a visualization toolkit
4. A new/custom program developed (partly) without a toolkit
   • e.g. Corpus Clouds

Visualization toolkit: a collection of programming components (e.g. for particular types of visualizations) that can be used to construct other programs.

Visualization tools: existing programs (levels 1+2)

Don’t underestimate the power (and convenience!) of existing programs

• Spreadsheets and statistical packages have a range of visualizations for charts, graphs, and networks (e.g. in the statistical package R)
• Programs designed for other types of analysis can be used, with some imagination and effort
  • e.g. relational graphs are used in social networks and biology
    • Social networks (cf. Pajek)
    • Biology (cf. Cytoscape)
• Language related, but non-linguistic tools (e.g. ManyEyes)

Don’t forget linguistic programs! e.g. EXMARaLDA, etc.
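Getting linguistic data into a level 1 program usually only requires writing a simple delimited file. As a minimal sketch, the following writes word frequencies as tab-separated values that a spreadsheet or statistics package can typically import. The function name `frequencies_to_tsv`, the column headers, and the token list are illustrative assumptions.

```python
import csv
import io
from collections import Counter

def frequencies_to_tsv(tokens, out):
    """Write a word/frequency table to `out` as tab-separated values."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["word", "frequency"])
    # Most frequent first; ties are broken alphabetically for stable output.
    for word, freq in sorted(Counter(tokens).items(),
                             key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, freq])

# Demo with an in-memory buffer; a real script would open a file instead.
buf = io.StringIO()
frequencies_to_tsv("a b a c a b".split(), buf)
print(buf.getvalue())
```

From there, the charting features of the existing program do the visualization work, with no custom drawing code.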

Visualization tools: writing new programs (levels 3+4)

Before you start:

1. Who will be using the program? What level of knowledge, experience, etc.?
2. What tasks will the user do?
3. What kinds of visualization would help with those tasks?
4. Is there already a program that does what you want?
   • Don’t reinvent the wheel!
5. What tools are available?
   • Some visualizations are in toolkits in one programming language but not others

Visualization tools: writing new programs

There are many toolkits to create sophisticated visualizations

Often, but not always, they are complex and designed for experienced programmers

• (see handout)
• A quick look at JSVIZ

An interesting combination is the NLTK toolkit in Python

• NLP tools with some visualization possibilities

A toolkit is not always necessary

• Does it provide the desired functionality?
• Is the time to learn how to use the toolkit worth the benefit?

LInfoVis – summing up

Research Questions (1)

1. What kinds of linguistic information are best suited for visualization?

• Visualization is good for showing relationships

i. What kinds of linguistic relationships?

o Distributional, classifying, association, etc.

o Monolingual, cross-lingual

ii. Concerning what types of elements?

o Morphosyntactic: words, phrases, sentences, documents, etc.

o Semantic/pragmatic: lexical, sentential, topics, emotion, etc.

Research Questions (2)

2. Which visualization techniques are best suited for linguistic data?

• i.e. What are the best visual structures and visual mappings?

o Charts, network graphs, area maps, color-based, etc.

Research Questions (3)

3. What usage scenarios can most benefit from visualization?

a) Data understanding:

• Tools for language analysis

• Mainly targeted to language professionals (e.g. linguists, terminologists, lexicographers, etc.)

b) Retrieval of information on language structure and use

• Interfaces for applications, language resources

• Also for non-professionals, e.g. in CALL (computer-assisted language learning), translation, etc.

Research Questions (4)

4. How effective are visualization techniques?

a) Little existing evaluation

b) Are they more than “pretty pictures”?

Research Questions (5)

5. How can we incorporate visualization into the processing pipeline and into applications?

• Make use of the Reference Model of visualization as a framework

• Define specifications for

• The kinds of input to visualizations >> linguists, language users

• The kinds of visual mappings >> computational linguists

• The kinds of view transformations >> users and developers

• How can visualizations be the input to other processes?

e.g. the TIGERSearch tool (Voormann, 2002), where the user “draws” a partial graph as a query
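To make the three kinds of specification concrete, here is a minimal Python sketch of the Reference Model stages (raw data to data table, data table to visual structure, visual structure to view). It is only an illustration under stated assumptions: the function names, the word-frequency example and the font-size mapping are hypothetical and are not taken from any tool discussed in this talk.

```python
from collections import Counter


def data_transformation(tokens):
    """Raw data -> data table: count how often each word form occurs."""
    counts = Counter(token.lower() for token in tokens)
    # Most frequent first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def visual_mapping(table, min_size=10, max_size=40):
    """Data table -> visual structure: map frequency to a font size.

    Assumes a non-empty table (hypothetical simplification).
    """
    top = max(freq for _, freq in table)
    return [
        {"word": word,
         "font_size": min_size + (max_size - min_size) * freq / top}
        for word, freq in table
    ]


def view_transformation(marks, limit):
    """Visual structure -> view: keep only the first `limit` marks."""
    return marks[:limit]


tokens = "the cat saw the dog and the dog saw the cat".split()
table = data_transformation(tokens)      # kind of input
marks = visual_mapping(table)            # kind of visual mapping
view = view_transformation(marks, 3)     # kind of view transformation

for mark in view:
    print(f"{mark['word']}: {mark['font_size']:.1f}")
```

Keeping each stage as a separate function mirrors the idea that different people specify different stages: the input, the visual mapping and the view transformation can each be replaced without touching the others.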

Discussion 2

Any updates on the questions from the previous discussion?

• What problems and issues do you see with visualizations?

• how to create visualizations

• how to make use of visualizations

• Which aspects of visualizations seem most relevant?

• data presentation, data analysis, interfaces

• Where to go from here?

References

Bertin, J. (1982): Graphische Darstellungen. Graphische Verarbeitung von Informationen. Berlin/New York: de Gruyter.

Card, S. K. / Mackinlay, J. D. / Shneiderman, B. (1999): Readings in Information Visualization: Using Vision to Think. San Francisco: Morgan Kaufmann Publishers.

Carpendale, M. (2003): ‘Considering visual variables as a basis for information visualisation’, Dept. of Computer Science, University of Calgary, Canada, Tech. Rep. 2001-693-16.

Collins, C. (2007): ‘Docuburst: Radial space-filling visualization of document content’, Knowledge Media Design Institute, University of Toronto, Technical Report KMDI-TR-2007-1.

Collins, C. / Penn, G. / Carpendale, S. (2008): Interactive visualization for computational linguistics. ACL-08: HLT Tutorials. Retrieved from: http://www.cs.utoronto.ca/~ccollins/acl2008-vis.pdf. Access date: December 3, 2009.

Culy, C. / Lyding, V. (2009a): ‘Visualization as Part of the Linguistic Processing Pipeline’ presented at the Linguistic Processing Pipelines Workshop, Gesellschaft für Sprachtechnologie und Computerlinguistik. Potsdam. October 2009.

Culy, C. / Lyding, V. (2009b): ‘Corpus Clouds - facilitating text analysis by means of visualizations’, In: Proceedings of the 4th Language & Technology Conference, LTC’09. Poznan.

Furnas, G. W. (1999): ‘The FISHEYE view: a new look at structured files’, In: Readings in Information Visualization: Using Vision to Think, San Francisco: Morgan Kaufmann Publishers.

Hearst, M. A. (1995): ‘Tilebars: Visualization of term distribution information in full text information access’, In: Proc. CHI’95, Denver, Colorado, pp. 56-66.

Hearst, M. (2009): Search User Interfaces. Cambridge: Cambridge University Press.

Johnson, B. / Shneiderman, B. (1991): ‘Tree-Maps: a space-filling approach to the visualization of hierarchical information structures’, Proceedings of the 2nd conference on Visualization '91, October 22-25, 1991, San Diego, California.

Shneiderman, B. (1996): ‘The eyes have it: A task by data type taxonomy for information visualizations’, In: Proc. of the IEEE Symposium on Visual Languages, Washington: IEEE Computer Society Press, pp. 336-343.

Tufte, E. (1999): Envisioning Information. Cheshire, Connecticut: Graphics Press LLC.

Tufte, E. (2006): Beautiful Evidence. Cheshire, Connecticut: Graphics Press LLC.

Todorovic, D. (2008): ‘Gestalt principles’. Scholarpedia, 3(12):5345, Retrieved from: http://www.scholarpedia.org/article/Gestalt_principles. Access date: December 4, 2009.

Voormann, H. (2002): TIGERin - Grafische Eingabe von Suchanfragen in TIGERSearch. Diploma thesis (in German). Fakultät Informatik, Universität Stuttgart.

Wattenberg, M. / Viégas, F. B. (2008): The word tree, an interactive visual concordance. In: IEEE Trans. on Visualization and Computer Graphics, vol. 14(6), pp. 1221-1228, Nov.-Dec. 2008.

