Post on 22-Dec-2015
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Multi-modal and Multi-functional Aspects of Information
and their Effects on Findability, Information-Hiding,
and Implicit Interaction
Andreas Rauber
Vienna University of Technology
http://www.ifs.tuwien.ac.at/~andi
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Introduction
Where and what is information, and what is it used for? Three core areas of focus
– Multimodality: what aspects of a piece of information are there?
– Multifunctionality: why was a piece of information created and why is it being searched for?
– Information: what is it?
Addressed in the context of 3 thematic areas
– Music IR
– Web Archiving
– Digital Preservation
Going top-down from different high-level incarnations of information in different modalities via different functions to the actual building blocks – and back again
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Outline
(1) Music retrieval:
– Audio and what else? Multimodality issues
– (What is the function of a particular musical fragment?)
– (What is the intention of the user searching or finding it?)
(2) Web Archive retrieval:
– (Obviously multimodal)
– Information functions, privacy and the need for information hiding?
– Search and the searcher's intention
(3) Digital Preservation:
– Significant properties & atomic information
Nothing new, but: is there a conceptual model rather than ad-hoc experimentation?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
What is „Music“? Music, of course!
– Audio: wav, au, mp3, ...
– Symbolic: MIDI, mod, ...
– Scores: Scan, MusicXML
www.samplesmith.com
www.westminster.gov.uk
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
(Bar chart, vertical axis 0% to 70%; data table below)
       genre    filtered genre   album    artist
TP     60,8%    58,9%            36,6%    41,3%
EP     60,6%    60,7%            30,5%    34,7%
LR     57,0%    56,7%            32,2%    27,7%
KWT    53,2%    54,1%            24,7%    20,7%
KWL    47,8%    49,2%            19,4%    15,9%
Feature Extraction:
– Frequency spectra analysis
– Psycho-acoustic models
– www.ifs.tuwien.ac.at/mir/audiofeatureextraction.html
PlaySOM & PocketSOMPlayer
MIREX
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
Community data
– Playlists
– Market basket
– Band evolution
Text
– Song lyrics
– Artist Biographies
– Websites: Fanpages, Album Reviews, Genre descriptions
Video/Images
– Album covers
– Music videos
What is „Music“? Music, of course!
– Audio: wav, au, mp3, ...
– Symbolic: MIDI, mod, ...
– Scores: Scan, MusicXML
www.samplesmith.com
www.westminster.gov.uk
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
Text: Song lyrics
– Convey a lot of musical information
– Some genres are strongly related to their texts
– Semantics of music: love songs, Christmas songs, ...
– Standard Text-IR: content analysis
– Genre-Analysis: style, rhymes, stop-words, ...
– Lyric portals
– 2 SOMs: Music, Text
– Analysis of cluster structure
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Christmas songs
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Speech Reggae
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Hip-Hop Pop
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Lyrics-based audio classification
BOW, tf×idf
Text Genre Features:
- ExclamationMark, colon, singleQuote, comma, questionMark, full-stop, hyphen, semicolon
- Counts of digits d0-d9
- CharsPerWord
- WordsPerLine, UniqueWordsPerLine, UniqueWordsRatio
- WordsPerMinute
PartOfSpeech: nouns, verbs, pronouns, prepositions, adverbs, articles, modals, adjectives
Rhyme Features: phoneme transcription + rhyme schemes
(Figure labels: words per minute; rhymes AABB)
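The text genre features listed on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the tokenisation rule, the feature key names, and the `duration_minutes` parameter are hypothetical, and the exact definitions used in the cited work are not given here. With eight punctuation counts, ten digit counts and five ratios, the sketch yields 23 values, which happens to match the dimensionality listed for the text statistics feature set.

```python
import re
import string

# Punctuation marks counted as style features (key names follow the slide).
PUNCTUATION = {
    "exclamationMark": "!", "colon": ":", "singleQuote": "'", "comma": ",",
    "questionMark": "?", "fullStop": ".", "hyphen": "-", "semicolon": ";",
}

# Assumed tokenisation: runs of letters, digits and apostrophes.
WORD_PATTERN = re.compile(r"[A-Za-z0-9']+")


def text_statistics(lyrics: str, duration_minutes: float) -> dict:
    """Compute simple style features for the lyrics of one song."""
    lines = [line for line in lyrics.splitlines() if line.strip()]
    words = WORD_PATTERN.findall(lyrics.lower())
    n_words = len(words)

    # Raw counts of punctuation marks and of the digits 0-9.
    features = {name: lyrics.count(mark) for name, mark in PUNCTUATION.items()}
    for digit in string.digits:
        features["d" + digit] = lyrics.count(digit)

    # Ratios; each falls back to 0.0 when its denominator would be zero.
    unique_per_line = [len(set(WORD_PATTERN.findall(line.lower()))) for line in lines]
    features["charsPerWord"] = sum(len(w) for w in words) / n_words if n_words else 0.0
    features["wordsPerLine"] = n_words / len(lines) if lines else 0.0
    features["uniqueWordsPerLine"] = sum(unique_per_line) / len(lines) if lines else 0.0
    features["uniqueWordsRatio"] = len(set(words)) / n_words if n_words else 0.0
    features["wordsPerMinute"] = n_words / duration_minutes if duration_minutes > 0 else 0.0
    return features


# Hypothetical two-line lyric for a song lasting two minutes.
example = text_statistics("Hello, hello!\nIs it 2 late?", duration_minutes=2.0)
```

Note that WordsPerMinute is the one feature here that needs information from the audio side (the track length), so it already mixes the two modalities.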
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
25 combinations of feature sets (RP, RH, SSD, BOW, Rhyme, Part-of-Speech, Text genre statistic)
Different classifiers: k-NN, Naive Bayes, Decision Trees, Support Vector Machines
Similar trends with all classifiers
Assuming SSD as the best audio-only classifier to be the baseline
Statistical significance tests against that baseline
10-fold cross-validation
(Rudolf Mayer, Robert Neumayer, and Andreas Rauber. Combination of Audio and Lyrics Features for Genre Classification in Digital Audio Collections. ACM Multimedia 2008.)
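The evaluation protocol named on this slide, 10-fold cross-validation, can be sketched as follows. This is a minimal illustration, not the setup of the cited paper: a 1-nearest-neighbour classifier stands in for the classifiers listed above, the folds are not stratified, the significance test is omitted, and the toy data and genre labels are invented.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Shuffle the item indices and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_neighbour_predict(train_x, train_y, query):
    """1-NN: return the label of the closest training vector."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i], query))
    return train_y[best]


def cross_validate(features, labels, k=10, seed=0):
    """Mean accuracy over k folds; assumes at least k items."""
    accuracies = []
    for test_idx in k_fold_indices(len(features), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(features)) if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            nearest_neighbour_predict(train_x, train_y, features[i]) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Invented toy data: two well-separated classes of 20 two-dimensional vectors.
toy_x = [[float(i), 0.0] for i in range(20)] + [[float(i), 10.0] for i in range(20)]
toy_y = ["pop"] * 20 + ["reggae"] * 20
mean_accuracy = cross_validate(toy_x, toy_y, k=10)
```

Each song is used for testing exactly once, so the reported accuracy never comes from a song the classifier was trained on.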
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Feature Combination (3010 songs)   Dimensionality   SVM (accuracy)
SSD                                168              66,32
RH                                 60               35,01
RP                                 1440             55,37
Textstatistics                     23               28,72
POS                                9                12,66
Rhyme                              6                15,83
Textstat + POS                     32               28,72
BOW + SSD                          9434             66,44
BOW+SSD+textstat+POS+Rhyme         9472             67,06
SSD+textstat                       191              68,72
SSD+textstat+POS                   200              68,72
SSD+textstat+Rhyme                 197              68,16
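The dimensionalities in the table above add up exactly (168 + 23 = 191, 191 + 9 = 200, and so on), which suggests that feature sets are combined by simple vector concatenation. The sketch below assumes that reading; the slide does not state it. The BOW size is not listed on its own and is inferred here from the "BOW + SSD" row, and the all-zero vectors are placeholders rather than real features.

```python
# Dimensionalities as listed in the table above.
DIMENSIONS = {"SSD": 168, "RH": 60, "RP": 1440, "textstat": 23, "POS": 9, "Rhyme": 6}
# Assumption: BOW size inferred from the "BOW + SSD" row (9434 - 168).
DIMENSIONS["BOW"] = 9434 - DIMENSIONS["SSD"]


def combine(feature_sets: dict, names: list) -> list:
    """Concatenate the named feature vectors of one song into a single vector."""
    combined = []
    for name in names:
        combined.extend(feature_sets[name])
    return combined


# Hypothetical song with placeholder vectors of the right lengths.
song = {name: [0.0] * size for name, size in DIMENSIONS.items()}

ssd_textstat = combine(song, ["SSD", "textstat"])
all_text_plus_ssd = combine(song, ["BOW", "SSD", "textstat", "POS", "Rhyme"])
```

Under this reading, the best rows of the table use about 200 dimensions, while the bag-of-words combinations use more than 9000.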
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
Community data
– Playlists
– Market basket
– Band evolution
Text
– Song lyrics
– Artist Biographies
– Websites: Fanpages, Album Reviews, Genre descriptions
Video/Images
– Album covers
– Music videos
What is „Music“? Music, of course!
– Audio: wav, au, mp3, ...
– Symbolic: MIDI, mod, ...
– Scores: Scan, MusicXML
www.samplesmith.com
www.westminster.gov.uk
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Music IR – Music?
There is more to music than sound and text Which genre is this album?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Outline
(1) Music retrieval:
– Audio and what else? Multimodality issues
– (What is the function of a particular musical fragment?)
– (What is the intention of the user searching or finding it?)
– „Modalities“ in other domains: Text (formatting, layout, references)
– General concept of perspectives of information instead of ad-hoc?
(2) Web Archive retrieval:
– (Obviously multimodal)
– Privacy functions and the need for information hiding?
– Search and the searcher's intention
(3) Digital Preservation:
– Significant properties & atomic information
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Web Archiving & Ethics
Web archiving initiatives crawl and archive web data
Essential activity to ensure valuable content is being preserved
But: Currently most archives are closed to the public
– Mostly due to legal reasons
– Need a legal solution
Is this all?
– Ethical implications? Privacy?
– Can we analyze data & searches to guarantee acceptable usage?
(Andreas Rauber, Max Kaiser, Bernhard Wachter. Ethical Issues in Web Archive Creation and Usage: Towards a Research Agenda. Proceedings International Workshop on Web Archiving and Digital Preservation (IWAW 2008).)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Web Archiving & Ethics
Web is both publication and communication platform
– Can we identify which pages are “published” and which are “posted”?
– Can we distinguish public data vs. private information?
Web Archive as eternal memory
– Can we identify who posted something and when?
– Can we tell children's/teenagers' postings?
– Can we identify potentially sensitive (snippets of) information?
– Can we model “forgetting” or fuzziness?
Web Archives as sources of valuable information
– Can we identify what somebody is doing in a Web Archive? (HR check-up vs. family history research vs. information look-up)
– What is acceptable usage? Acceptable queries?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Web Archiving & Ethics
Web Archives and IR
– If we can do all of the above: do we know what to do with it?
– Can we design an IR system that serves information in an ethically acceptable manner?
– What to do with it: limiting/blocking access or excluding from archive, excluding from index, etc.?
– Information hiding in retrieval? Censorship?
Will become more critical as power of multimedia search increases
Goal: establish the context of information & usage
– who created it
– for what reason was it created
– what kind of information does it contain
– what is it being used for?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Web Archiving: Classification Case Study
Analyze function of a piece of information
Identifying potentially private segments
Approach:
– Take text documents, identify which ones are potentially private
• Pages
• Paragraphs
– Train a classifier (SVM, Bayesian Networks, …)
– Need to integrate more fine-grained analysis (POS, snippets)
Similar to Genre Classification
Works, but more open questions than solutions
Combine with query analysis, domain analysis, usage…
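The approach on this slide (label paragraphs, train a classifier, flag potentially private ones) can be illustrated with a small sketch. A multinomial Naive Bayes classifier is used here as a simple stand-in for the SVM or Bayesian network mentioned above; it is not the classifier of the case study. The training paragraphs, the two labels, and the tokenisation are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word in self.vocabulary:  # words never seen in training are ignored
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training paragraphs; a real study would need a labelled corpus.
training = [
    ("my phone number is on my profile and my address too", "private"),
    ("i told my friends about my birthday party at home", "private"),
    ("the ministry published its annual report on trade", "public"),
    ("the university announced a new research programme", "public"),
]
classifier = NaiveBayesTextClassifier().fit(
    [text for text, _ in training], [label for _, label in training]
)
flag = classifier.predict("my home address and phone")
```

A word-level model like this only looks at vocabulary; the slide's call for finer-grained analysis (POS, snippets) and for combining with query and usage analysis is not covered by the sketch.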
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Outline
(1) Music retrieval:
– Audio and what else? Multimodality issues
– (What is the function of a particular musical fragment?)
– (What is the intention of the user searching or finding it?)
(2) Web Archive retrieval:
– (Obviously multimodal)
– Privacy functions and the need for information hiding?
– Search and the searcher's intention
– Functions in music: emotions, ringtones, audio track for illustration or text/presentation
– What is the intention of the user searching for it?
– Analyze and match functions and users / use cases / needs
(3) Digital Preservation:
– Significant properties & atomic information
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Digital Preservation
Ensure that digital objects remain accessible in the future
– Bit-level preservation: storage
– Logical preservation: Objects -> Software -> OS -> HW
Approaches: Migration, Emulation
-> some aspects lost in the process
Question: What to preserve? -> Preservation Planning
Significant properties:
– Technical: format characteristics, functionalities, ...
– Intellectual: content, meaning, usage, ...
Authenticity
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Digital Preservation
Digital Preservation raises a lot of IR questions from a different perspective
IR Research activities in DP (in our group)
– Establishing context of information objects
– Identifying significant properties
– Measuring how well certain significant properties are preserved, e.g. after migration or during emulation
Core questions
– What is (an atomic piece of) information? (text snippet + formatting + position + semantic + action)
– What does it evolve to given its environment (groups of objects, usage, different aspects/views of information, ...)?
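Measuring how well significant properties survive a migration can be pictured as comparing two property records, one extracted before and one after. The sketch below assumes such records already exist as plain dictionaries; the property names and values are hypothetical, and extracting them from real files is the hard part that is not shown.

```python
def compare_properties(original: dict, migrated: dict) -> dict:
    """Sort each property of the original into preserved, changed or lost."""
    report = {"preserved": [], "changed": [], "lost": []}
    for name, value in original.items():
        if name not in migrated:
            report["lost"].append(name)
        elif migrated[name] == value:
            report["preserved"].append(name)
        else:
            report["changed"].append(name)
    return report


def preservation_ratio(report: dict) -> float:
    """Share of the original properties that are unchanged after migration."""
    total = sum(len(names) for names in report.values())
    return len(report["preserved"]) / total if total else 1.0


# Hypothetical property records for one document before and after migration.
before = {"text": "Annual report", "font": "Times", "page_count": 12, "hyperlinks": 4}
after = {"text": "Annual report", "font": "Arial", "page_count": 12}

report = compare_properties(before, after)
ratio = preservation_ratio(report)
```

Exact equality is the simplest possible comparison; whether a changed font counts as a loss depends on which properties a preservation plan declares significant.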
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Core Questions and Challenges
(1) Definition of information
– What is a piece of information? Smallest building block?
– What can it evolve to if combined?
– What is the context of information?
– Does the concept of Memes apply?
– What are the significant properties of information objects?
(2) Functions of information
– What different functions does a piece of information have?
– Who created it for which purpose?
– How can they be modeled? Matched with user needs?
(3) Multi-modality
– Which modalities are there?
– How are they represented?
– Which features can describe them?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Core Questions and Challenges
(4) Retrieval consequences:
– Can information be found? (How? By which aspect of information?)
– Is it designed to be found? When should it be found?
– Where / in which modality shall we look for it?
(5) How can we establish a match between
– The function of (a piece or a collection of) information
– The functional needs of a user
– The modalities and representations to use for searching it
(6) How to test / evaluate?
– Use cases? Tasks? (clearly defined? generic?)
– Benchmark collections?
From the building blocks of information, via the functions it exhibits, to the modalities and representations to combine to retrieve and present it – in a single model?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Thank You!
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Western popular music: 10 genres
- Country, Folk, Grunge, Hip-Hop, Metal, Pop, Punk Rock, R&B, Reggae, Slow Rock
'Small' Collection: 600 songs
- 159 artists
- Classes of equal size (60 songs per class)
- Lyrics manually cleansed
'Large' Collection: 3010 songs
- 188 artists
- Unbalanced, 180-380 songs per class
- Lyrics automatically fetched, no manual cleansing
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Text and Audio
Feature Combination (600 songs)    Dimensionality   SVM (accuracy)
SSD                                168              59,17
RH                                 60               35,37
RP                                 1440             48,37
Textstatistics                     23               29,83
POS                                9                19,21
Rhyme                              6                14,46
Textstat + POS                     32               31,29
BOW + SSD                          9434             53,46
BOW+SSD+textstat+POS+Rhyme         9472             54,21
SSD+textstat                       191              64,33
SSD+textstat+POS                   200              64,50
SSD+textstat+Rhyme                 197              63,71