Date posted: 17-Nov-2015
NEURAL NETWORK BASED
VISUALIZATION OF COLLABORATIONS
IN A CITIZEN SCIENCE PROJECT
Alessandra M. M. Morais, Rafael D. C. Santos,
M. Jordan Raddick
Motivation: Citizen Science
Citizen Science
Use the power of volunteers to gather or process data.
Using idle computer time.
Collecting data.
Using human intelligence.
Not a new concept, but the web made several
interesting projects possible.
Citizen Science: Galaxy Zoo
Volunteers classify images of galaxies.
www.galaxyzoo.org
Part of the Zooniverse www.zooniverse.org
Citizen Science: Galaxy Zoo
150,000 volunteers.
More than 80,000,000 classifications.
60% of the volunteers classified
Citizen Science
One important issue: data quality.
More collaborators → more data → better quality?
Better collaborators → better quality?
How to identify different types of collaborators:
Non-intrusively.
Without positive or negative reinforcement.
Through log analysis.
How to identify and motivate certain categories of users?
Previous Results
Morais, A. M. M.; Raddick, J.; Santos, R. D. C. Visualization and characterization of users in a citizen science project. SPIE Defense, Security, and Sensing, 2013.
The Self-Organizing Map and
Visualization
Kohonen's SOM
Neural network for unsupervised learning.
Projection of multidimensional data onto a lower-dimensional lattice.
Quantization: one neuron will be associated with several data vectors.
Projection: data vectors close in the original
multidimensional space will be close in the lattice.
The Basic Algorithm
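The slide content for the algorithm did not survive extraction, so the following is only a minimal sketch of a standard SOM training loop, not necessarily the exact variant used by the authors. The function names (`best_matching_unit`, `train_som`), the linear decay schedules, the Gaussian neighbourhood and all default parameter values are assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the neuron whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a rectangular SOM; returns a rows x cols grid of weight vectors.

    Assumed (hypothetical) choices: random initial weights in [0, 1],
    linearly decaying learning rate and radius, Gaussian neighbourhood.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = radius0 or max(rows, cols) / 2.0
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                    # decaying learning rate
            radius = max(radius0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2  # squared lattice distance to the winner
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the neuron towards x, more strongly near the winner.
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

Calling `train_som(data, 3, 3)` on normalized data vectors gives a 3 x 3 lattice of prototypes; `best_matching_unit` then tells which neuron a given data vector is quantized to.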
SOM and Visualization
We can use the lattice to visualize a large amount of
multidimensional data.
Must choose a proper representation for the neurons.
Must take advantage of quantization and projection.
Icons, Features and Results
Icons
Parallel Coordinates will be used to visualize the users.
Simple, uncluttered icons with few dimensions (few attributes).
Each icon represents a prototype vector and the set of data
vectors assignable to that prototype vector.
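As a rough illustration of what one icon could contain, the sketch below assigns each data vector to its nearest prototype and computes a per-attribute (min, max) band that could be drawn behind the prototype's parallel-coordinates polyline. The names `assign_to_prototypes` and `icon_envelope`, and the choice of a min/max band, are assumptions for illustration and not taken from the slides.

```python
def assign_to_prototypes(prototypes, data):
    """Map each prototype index to the list of data vectors closest to it."""
    groups = {i: [] for i in range(len(prototypes))}
    for x in data:
        nearest = min(
            range(len(prototypes)),
            key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i], x)),
        )
        groups[nearest].append(x)
    return groups


def icon_envelope(vectors):
    """Per-attribute (min, max) band for the vectors behind one icon."""
    return [(min(col), max(col)) for col in zip(*vectors)]


# Hypothetical example: two prototypes, three normalized data vectors.
prototypes = [[0.1, 0.2, 0.1], [0.9, 0.8, 0.7]]
data = [[0.0, 0.3, 0.1], [0.2, 0.1, 0.2], [1.0, 0.7, 0.6]]
groups = assign_to_prototypes(prototypes, data)
band = icon_envelope(groups[0])  # band for the first prototype's icon
```

Here the first two data vectors fall under the first prototype, so its icon would show the prototype polyline plus a band covering those two vectors on each axis.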
Features
Main features:
Participation range p: number of days between the first and last recorded interaction.
Participation count d: number of days with activity.
Maximum number of classifications in a single day, max.
Total number of classifications, total.
Average number of classifications per user, average.
Only the first 600 days of participation were considered.
Features
Features:
a1 = p/600 (close to 1: long term)
a2 = d/p (close to 1: frequent during participation)
a3 = d/600 (close to 1: frequent during the project)
a4 = max/total (close to 1: all classifications in a single day)
a5 = total/average (close to 1: close to the average user's classifications)
a6 = d (visual complement)
a7 = log10(total) (how many classifications)
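The seven attributes above could be computed from a per-day activity log roughly as follows. This is a sketch under stated assumptions: the input format (a dict from calendar day to classification count), the function name `activity_features`, and counting the participation range inclusively (so a one-day volunteer gets p = 1 rather than 0, which avoids dividing by zero in d/p) are all choices made here, not details confirmed by the slides.

```python
import math
from datetime import date


def activity_features(daily_counts, average_total, horizon=600):
    """Compute a1..a7 for one volunteer.

    daily_counts: {date: number of classifications on that day} (assumed format).
    average_total: average number of classifications per user, project-wide.
    """
    active = {day: n for day, n in daily_counts.items() if n > 0}
    first = min(active)
    # Keep only activity within the first `horizon` days of participation.
    kept = {day: n for day, n in active.items() if (day - first).days < horizon}
    p = (max(kept) - first).days + 1  # inclusive range (assumption, avoids p = 0)
    d = len(kept)                     # number of active days
    total = sum(kept.values())
    peak = max(kept.values())         # "max" on the slide
    return {
        "a1": p / horizon,
        "a2": d / p,
        "a3": d / horizon,
        "a4": peak / total,
        "a5": total / average_total,
        "a6": d,
        "a7": math.log10(total),
    }


# Hypothetical volunteer: active on two days, three calendar days apart inclusive.
example = activity_features(
    {date(2010, 1, 1): 10, date(2010, 1, 3): 30}, average_total=80
)
```

For this made-up volunteer, a4 is 0.75 (most of the work was done in a single day) and a5 is 0.5 (half the classifications of the average user).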
Visualizing Volunteers' Activity
Activities: General View, Seven Attributes
Curious: very short activity interval, very active in this interval, did not contribute much.
Potentials: contributed sporadically but significantly.
Dedicated: contributed frequently, contributed a lot.
Visualizing Volunteers' Activity
25% or fewer correct classifications
75% or more correct classifications
Sessions and Accuracy
Another visualization example:
a1: number of sessions
a2: average session length in seconds
a3: average number of classifications per session
a4: percentage of correct classifications
A session is delimited by periods of inactivity (180 seconds).
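The session attributes can be sketched as below. The event format (a list of `(timestamp_seconds, is_correct)` pairs), the function names, and the decision to start a new session only when the gap is strictly greater than 180 seconds are assumptions made for illustration.

```python
def split_sessions(timestamps, gap=180):
    """Group timestamps (in seconds) into sessions split by more than `gap` s of inactivity."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)  # still within the current session
        else:
            sessions.append([t])    # inactivity gap exceeded: new session
    return sessions


def session_features(events, gap=180):
    """events: list of (timestamp_seconds, is_correct) pairs for one volunteer."""
    sessions = split_sessions([t for t, _ in events], gap)
    n = len(sessions)
    return {
        "a1": n,                                           # number of sessions
        "a2": sum(s[-1] - s[0] for s in sessions) / n,     # average session length (s)
        "a3": len(events) / n,                             # classifications per session
        "a4": 100.0 * sum(1 for _, ok in events if ok) / len(events),  # % correct
    }


# Hypothetical log: three classifications, a long pause, then two more.
events = [(0, True), (60, True), (120, False), (1000, True), (1100, False)]
features = session_features(events)
```

With this made-up log the volunteer has two sessions, an average session length of 110 seconds, 2.5 classifications per session, and 60% correct classifications.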
Visualizing Volunteers' Accuracy
Session data and correct classifications
Conclusions
Conclusions
Visualization can give insight into data, but:
Many methods, many parameters.
It is very hard to find an "Aha!" solution.
Guided visualization is very useful for exploratory analysis.
Kohonen's Self-Organizing Map is able to do visual, almost-fuzzy clustering of multidimensional data.