Date posted: 15-Nov-2014
Category: Education
Uploaded by: loretta-auvil
the DISCUS project & SEASR
Xavier Llorà¹˒², David E. Goldberg¹ & Michael Welge²
¹ Illinois Genetic Algorithms Lab, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign
² Data-Intensive Technologies and Applications, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
The Vision
• Computers have become mediators of collaborations
– Email, chat rooms, blogs, wikis…
– A flood of available information
– Different modes of communication
• Let’s take advantage of such information
– Logs of conversations
– Archive of documents (email attachments, blogs, personal web pages…)
– Human-computer interactions
– Social aspect of the communication and collaboration
– Needs to work for multiple languages
the DISCUS project (May 2007), Xavier Llorà
The Project
• DISCUS started in 2003 as an IlliGAL & NCSA collaboration
• Supports innovation and creativity:
DISCUS: Distributed Innovation and Scalable Collaboration in Uncertain Settings
• Basic research components
– Competent genetic algorithms (HBGA, iGA)
– Advance chance discovery components
– Adapt and expand the analysis of social interaction
– Efficient data mining techniques for conversations
– Develop a social network analysis for creativity and innovation processes
The Project
• Technology development
– Infrastructure to support creativity and innovation processes
– Reusable repositories of analytic components
– Standardize heterogeneous data storage to boost interoperability
– Create hooks for non-intrusive usage and deployment
– Rapid adaptation cycle to new technologies
Research and Commercial Partners
• Some research partners along the way
– University of Illinois (IlliGAL, NCSA & CEE)
– University of Osaka
– University of Tokyo (School of Management, School of Engineering)
– University of Kyushu
• Commercial partners
– Hakuhodo Inc and HOW
– Mazda
– Toyota
The Research Picture
Content
Social aspects
Analysis
Social networks
Data mining
Knowledge management
DISCUS in Action
Online Communities
Content Analysis
Social Network Analysis
Topic Overlap
Topic Dynamics
CSPAN
• CSPAN digital library
– Videos
– Transcripts
– Annotations
• Example of real-time analysis
• Crawling and results
Some Facts
• Number of documents: 110,234
• Number of persons: 78,915
• Total number of sentences: 252,132
• Total number of words: 2,034,209
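To put these totals in perspective, the short sketch below derives per-document and per-sentence averages from the four counts listed above. The variable names are illustrative only, and the averages are plain ratios of the reported totals; the slides do not state how documents, sentences, or words were counted, so the results are rough indications rather than figures from the project itself.

```python
# Corpus totals as reported on the "Some Facts" slide.
documents = 110_234
persons = 78_915
sentences = 252_132
words = 2_034_209

# Simple ratios of the reported totals (illustrative, not from the slides).
words_per_document = words / documents
sentences_per_document = sentences / documents
words_per_sentence = words / sentences
documents_per_person = documents / persons

print(f"Words per document:     {words_per_document:.2f}")
print(f"Sentences per document: {sentences_per_document:.2f}")
print(f"Words per sentence:     {words_per_sentence:.2f}")
print(f"Documents per person:   {documents_per_person:.2f}")
```

The ratios suggest that a typical document in the crawl is short, on the order of a couple of sentences, which would be consistent with brief descriptive records rather than full transcripts; that reading is an inference from the totals, not a claim made in the slides.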
Documents per Year
[Figure: number of documents (y-axis) by year (x-axis), 1940 to 2000]
Words per Year
[Figure: number of words (y-axis, log scale from 1e+01 to 1e+05) by year (x-axis), 1940 to 2000]