Graz University of Technology
1
Christoph Trattner 2012
On the Navigability of Social Tagging
Systems
Christoph Trattner
Knowledge Management Institute and
Institute for Information Systems and Computer Media
Graz University of Technology, Austria
web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph
In collaboration with:
D.Helic, M.Strohmaier, K. Andrews, Ch. Körner
What is a tagging system and what are tags?
What is a tagging system?
A system that lets users apply tags to resources.
What are tags?
- lightweight keywords (free form vocabulary)
- generated by users
- for users
Popular examples of tagging systems
are…
Tags
Why do system designers like tags?
- Tags add metadata to resources for which
typically only sparse metadata exists
(such as pictures, movies, etc.)
- Through tags, system designers are able to provide the
user with simple navigational tools that improve the
system's information retrieval properties
- Tags are cheap!
Why do users like tags?
- Through tags, users are able to categorize or describe
resources
- Can find information faster through personal tags
- Can find related content faster through related tags
Navigation with Tags
Tagging systems typically provide the user with the following information retrieval interfaces for navigating their content:
1. Tag clouds (widely used)
2. Tag hierarchies (new; hardly any implementations yet)
Gupta et al. 2010
What does tag (cloud) based navigation look like?
Are Tag Clouds useful for navigation?
Questions???
Modelling a tag dataset as a graph (1/2)
- A tagging dataset is typically modeled as a tripartite
hypergraph
- V = R ∪ U ∪ T (resources, users, tags)
- An annotation is a hyperedge (r, t, u)
- A tripartite hypergraph can be mapped onto three
bipartite graphs connecting users and resources,
users and tags, and tags and resources.
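The mapping from hyperedges to bipartite graphs can be sketched as follows. This is a minimal illustration, assuming annotations are stored as (resource, tag, user) triples; the function name `project` and the sample data are hypothetical and not taken from the slides.

```python
def project(annotations):
    """Map (resource, tag, user) hyperedges onto three bipartite edge sets."""
    user_resource, user_tag, tag_resource = set(), set(), set()
    for resource, tag, user in annotations:
        user_resource.add((user, resource))
        user_tag.add((user, tag))
        tag_resource.add((tag, resource))
    return user_resource, user_tag, tag_resource


# Hypothetical sample annotations, for illustration only.
annotations = [
    ("r1", "music", "alice"),
    ("r1", "brahms", "bob"),
    ("r2", "music", "alice"),
]
user_resource, user_tag, tag_resource = project(annotations)
print(sorted(tag_resource))
```

The tag-resource graph is the projection that tag cloud navigation operates on: a user moves from a resource to one of its tags, and from that tag to another resource.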
Defining Navigability
A network is navigable iff:
There is a short path between all or almost all pairs of nodes in the network.
Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science
Technical Report 99-1776 (October 1999)
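The two conditions can be checked on a small undirected graph as sketched below. The sketch makes assumptions that the slides do not state: the effective diameter is taken as the 90th percentile of shortest-path lengths over connected pairs (a common convention, computed here without interpolation), and a component counts as "giant" if it covers at least 90% of the nodes. All function names are hypothetical.

```python
import math
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def largest_component_fraction(graph):
    """Fraction of nodes in the largest connected component (undirected graph)."""
    seen, best = set(), 0
    for node in graph:
        if node not in seen:
            component = bfs_distances(graph, node)
            seen.update(component)
            best = max(best, len(component))
    return best / len(graph)


def effective_diameter(graph, quantile=0.9):
    """Smallest d such that `quantile` of all connected pairs are within d hops."""
    lengths = sorted(
        d for s in graph for d in bfs_distances(graph, s).values() if d > 0
    )
    if not lengths:
        return 0
    return lengths[math.ceil(quantile * len(lengths)) - 1]


def is_navigable(graph, min_giant=0.9):
    """Assumed thresholds: giant component >= min_giant, eff. diameter <= log2(n)."""
    return (largest_component_fraction(graph) >= min_giant
            and effective_diameter(graph) <= math.log2(len(graph)))


# A chain of 8 nodes and a star with 10 nodes (illustrative graphs only).
path8 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
star10 = {0: list(range(1, 10)), **{i: [0] for i in range(1, 10)}}
print(is_navigable(path8))   # False: the chain is connected but too "long"
print(is_navigable(star10))  # True: every pair is at most 2 hops apart
```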
Navigability: Examples
Example 1:
Not navigable: No giant component
Example 2:
Not navigable: there is a giant component, BUT the effective diameter is 7 > log2(8) = 3
Navigability: Examples
Example 3:
Navigable: giant component AND effective diameter 2 < log2(10) ≈ 3.32
Is this efficiently navigable?
There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?
Efficiently navigable
A network is efficiently navigable iff:
There is an algorithm that can find a short path with only local knowledge, and the delivery time of the algorithm is bounded by a polynomial in log(n), i.e. O(log^k(n)) for some constant k.
Example 4:
Efficiently navigable, if the algorithm knows it needs to go through A, B, C
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science
Technical Report 99-1776 (October 1999)
Navigability of Social Tagging Systems (1/2)
In general tags form networks which are navigable
from a network-theoretic perspective
Navigability of Social Tagging Systems (2/2)
Tagging networks are navigable power-law networks. For power-law
networks, efficient sub-linear decentralised navigation algorithms exist.
[Figure label: “Hub” tags]
But how about User Interface constraints?
- Tag cloud size n: topN resources (topN most common algorithm)
- Pagination of resources per tag: k resources shown per page (reverse chronological ordering)
How UI constraints affect Navigability
Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does
not influence navigability (this is not very surprising).
BUT: Limiting the out-degree k of high-frequency tags (e.g. through pagination
with resources sorted in reverse-chronological order) leaves the network
vulnerable to fragmentation. This destroys the navigability of prevalent approaches
to tag clouds.
[Figure panels: Pagination; Tag Cloud Size]
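The fragmentation effect of pagination can be reproduced on a toy dataset. The sketch below assumes that each tag's resource list is stored newest-first and that the user only ever sees the first page of k resources; the data and all function names are hypothetical.

```python
def first_page(tag_resources, k):
    """Keep only the k newest resources per tag (lists are newest-first)."""
    return {tag: resources[:k] for tag, resources in tag_resources.items()}


def invert(tag_resources):
    """Build the resource -> tags mapping from the tag -> resources mapping."""
    resource_tags = {}
    for tag, resources in tag_resources.items():
        for resource in resources:
            resource_tags.setdefault(resource, []).append(tag)
    return resource_tags


def reachable(start, resource_tags, page):
    """Resources reachable by repeatedly clicking a tag and then a listed resource."""
    seen, stack = {start}, [start]
    while stack:
        resource = stack.pop()
        for tag in resource_tags[resource]:
            for nxt in page[tag]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


# Hypothetical data: "austria" is a high-frequency ("hub") tag.
tag_resources = {
    "austria": ["r6", "r5", "r4", "r3", "r2", "r1"],
    "music": ["r2", "r1"],
    "alps": ["r4", "r3"],
}
resource_tags = invert(tag_resources)

print(sorted(reachable("r1", resource_tags, first_page(tag_resources, 6))))
print(sorted(reachable("r1", resource_tags, first_page(tag_resources, 2))))
```

With k = 6 the hub tag links all six resources; with k = 2 the hub tag always leads to the same two newest resources, so r3 and r4 can no longer be reached from r1.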
How can we recover the navigability of social tagging
systems?
Answer: Through resource-specific resource list
construction!
Questions???
What is a resource-specific resource list?
• A resource-specific resource list is a resource list
that is specific not only to a particular tag but
also to a particular resource in the tagging
system
• Typically, resource lists are calculated as follows:
Res(t) = {r_1(t), …, r_n(t)}
• Resource-specific resource lists are calculated
as:
Res(t, r) = {r_1(t, r), …, r_n(t, r)}
Approach: Random Ordering
- Instead of reverse-chronological ordering of resources, we apply a random ordering.
- On each click on a particular tag, a different resource list is generated.
- Problem: the network is not efficiently navigable.
Better algorithms can easily be envisioned.
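A minimal sketch of the difference between a static (reverse-chronological) resource list and a randomly ordered one, assuming each tag's resources are stored newest-first. The names `res_static` and `res_random` and the sample data are hypothetical.

```python
import random


def res_static(tag_resources, tag, k):
    """Static list: always the k newest resources of the tag."""
    return tag_resources[tag][:k]


def res_random(tag_resources, tag, k, rng=random):
    """Random list: a fresh sample of up to k resources on every click."""
    pool = tag_resources[tag]
    return rng.sample(pool, min(k, len(pool)))


# Hypothetical data, newest resource first.
tag_resources = {"austria": ["r6", "r5", "r4", "r3", "r2", "r1"]}

print(res_static(tag_resources, "austria", 2))
print(res_random(tag_resources, "austria", 2))
```

Because every click draws a new sample, each resource of a tag is shown eventually, which keeps the network connected. The ordering carries no information about where the target lies, which is why a searcher with local knowledge cannot exploit it.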
Approach: Hierarchical Ordering
J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press,
2001, p. 2001.
• Instead of random ordering, we use hierarchical
background knowledge for ranking paginated
resources [Kleinberg 2001].
• Kleinberg showed that if the nodes of a network
can be organized into a hierarchy, then such a
hierarchy provides a probability distribution for
connecting the nodes in the network.
• For such a network, a hierarchical decentralized
searcher exists that is able to navigate the
network in O(log n) steps => the network is efficiently
navigable
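One way to turn a hierarchy into a probability distribution over resource lists is sketched below. It assumes that every resource has a path in the hierarchy (for example derived from a structured URL), that the distance between two resources is the height of their lowest common ancestor, and that a resource at distance h is drawn with weight b^(-h). This is an illustrative simplification in the spirit of the hierarchical model, not the algorithm used in the presented work; all names and data are hypothetical.

```python
import random


def hierarchy_distance(path_a, path_b):
    """Height of the lowest common ancestor, assuming leaves at similar depth."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return max(len(path_a), len(path_b)) - common


def res_hierarchical(tag_resources, paths, tag, current, k, b=2, rng=random):
    """Draw up to k resources of `tag`, favouring those close to `current`."""
    candidates = [r for r in tag_resources[tag] if r != current]
    chosen = []
    while candidates and len(chosen) < k:
        weights = [b ** -hierarchy_distance(paths[current], paths[r])
                   for r in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


# Hypothetical hierarchy paths and tag assignments.
paths = {
    "brahms": ("culture", "music", "brahms"),
    "beethoven": ("culture", "music", "beethoven"),
    "alps": ("nature", "mountains", "alps"),
}
tag_resources = {"austria": ["brahms", "beethoven", "alps"]}

print(res_hierarchical(tag_resources, paths, "austria", "brahms", k=1))
```

Nearby resources are listed most often, while distant ones still appear occasionally and act as long-range links.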
Problem: Semantic Penalty
• Hierarchy was more or less randomly
constructed
• Does not take semantic similarity between
resources into account
• Hence, two new approaches were developed
• First idea, constructing efficiently navigable tag clouds
from structured web content [Trattner 2011]
• Second idea, develop an algorithm that is able to
construct semantically sound resource hierarchies
from tagging data [Trattner 2011a]
C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS,
Volume 17, Issue 4, 565-582, 2011.
C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails,” in CIT, 2011.
On the construction of efficiently navigable tag
clouds from structured web content
• Content on the Web is not always flat
• There are websites that provide a hierarchical
structure
• Example: Austria-Forum
Austria-Forum
Community AEIOU Wissenssammlungen
- Wiki-based online encyclopedia system
- Provides over 200,000 information items about Austria
- Unlike Wikipedia, articles in Austria-Forum are published, edited, checked and certified by people who are accepted as experts in a particular field
- Articles are organized hierarchically into categories
- Categories are addressable via structured URLs (cf. Open Directory DMOZ)
Austria-Forum
Tags
Resource
Approach (1/2)
1. Hierarchical Tag Cloud Construction
Approach (2/2)
2. Hierarchical Resource List Construction
Evaluation
To evaluate the presented algorithm, a network theoretical framework [Trattner 2011b] based on the Stanford SNAP Library (http://snap.stanford.edu/) was developed:
Network-theoretic module: Calculates network properties
such as the size of the Largest Strongly Connected Component
(LSCC) or the Effective Diameter (ED) of the tag cloud network
Searcher module: Implements a hierarchical decentralized
searcher to simulate “efficient” tag cloud driven navigation
C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on
Web Engineering (ICWE 2011), Springer, 2011.
Hierarchical Decentralized Search
A tag network: [figure with a START and a TARGET node]
Background knowledge: (e.g. a folksonomy)
Goal: Navigate from START to TARGET using local background knowledge only
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
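A greedy variant of such a searcher can be sketched as follows. At every step it looks only at the neighbours of the current node and moves to the one that is closest to the target in the background hierarchy. The sketch assumes the hierarchy is given as a path per node; the tie-breaking, the hop limit, and the sample network are assumptions, and all names are hypothetical.

```python
def tree_distance(path_a, path_b):
    """Number of hierarchy edges between two nodes given their root paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) + len(path_b) - 2 * common


def decentralized_search(graph, paths, start, target, max_hops=20):
    """Greedy search using only local neighbours and the hierarchy."""
    current, route, visited = start, [start], {start}
    while current != target and len(route) <= max_hops:
        options = [n for n in graph[current] if n not in visited]
        if not options:
            return None  # dead end: no unvisited neighbour left
        current = min(options,
                      key=lambda n: tree_distance(paths[n], paths[target]))
        visited.add(current)
        route.append(current)
    return route if current == target else None


# Hypothetical tag network and background hierarchy.
graph = {
    "ski": ["austria", "alps"],
    "alps": ["ski", "austria"],
    "austria": ["ski", "alps", "music"],
    "music": ["austria", "brahms"],
    "brahms": ["music"],
    "island": [],
}
paths = {
    "austria": ("austria",),
    "alps": ("austria", "nature", "alps"),
    "ski": ("austria", "sport", "ski"),
    "music": ("austria", "culture", "music"),
    "brahms": ("austria", "culture", "music", "brahms"),
    "island": ("island",),
}

print(decentralized_search(graph, paths, "ski", "brahms"))
# ['ski', 'austria', 'music', 'brahms']
```

The searcher returns `None` when it runs into a dead end or exceeds the hop limit, which corresponds to an unsuccessful navigation attempt when measuring success rates.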
Results: Navigability
Approaches calculating resource lists in a
random manner form navigable tag cloud
networks
Results: Searcher
• Best Results are obtained with
hierarchically constructed tag
clouds/resource lists (=HH)
• Naive (=TopN + chron. sorted resource
list) approach performs worst (=N)
• However, HR performs better than a
pure random approach (=R)
User Study
To measure the performance of the approach, a between-group test design was used.
For that purpose, we randomly split our test users into two groups:
Group A: assigned to navigate Austria-Forum with hierarchically constructed resource lists
Group B (baseline): assigned to navigate Austria-Forum with reverse-chronologically sorted resource lists
User Study
During the study, the users were asked to complete 10 tasks.
In particular, the users were asked to navigate from 10 given start resources to 10 given target resources as fast as possible.
To get valid results, the start and target resources were selected uniformly at random (the same for all users).
As a navigation tool, users were allowed to use only tag clouds.
User Study
To ensure that the users would have to navigate,
we selected the paths in such a way that the users had to visit between 0 and 4 intermediate resources to find the target resources.
Each user was given a maximum of 3 minutes for each task.
Example: Tag cloud based navigation
[Figure: start resource (Brahms), target resource (Beethoven), resource list]
User Study
Since we observed during our pilot test that
users had problems finding resources that they did not know, the tags of the target resource were also presented to the users.
The variable measured in the experiment was the success rate, i.e. we measured whether or not the user could find the target resources.
Results: User Study
All in all, 24 test users participated in the experiment
16 male and 8 female
median age = 33 years, ranging from 22 to 56
All participants were experienced computer users (on average 46 hours per week)
12 of them were experienced with the Austria-Forum test system
To control for this bias, we assigned those users randomly to groups A and B
Results: User Study
Regarding the mean success rate, we observed that users of group A found
on average 55% of their designated target resources.
In comparison, users of group B found only 23% of their designated
target resources.
In other words, using hierarchically constructed resource lists improved
the success rate in the Austria-Forum tagging system by 32 percentage
points on average (55% vs. 23%).
These results confirm the theoretical assumptions made in previous work
in this area [Helic et al. 2011]
Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic
Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011.
Results: User Study
The experiment showed that the hierarchically constructed
tag network is significantly better navigable than the one
produced by the naïve approach.
Problem: Predefined Resource Hierarchy
- A predefined resource hierarchy is not always available
- Hence, the presented approach is not completely generic
- Other problem: the success rate drops drastically if the provided resource hierarchy is neither balanced nor complete
Question?
How can we automatically construct balanced resource hierarchies
with a fixed branching factor from tagging data?
Algorithm: Resource Hierarchy Generation
Algorithm: Resource Hierarchy Labeling
Results: Semantic Evaluation
- Taxonomic F-Measure and Taxonomic Overlap measure the quality of a given taxonomy against a gold standard via common concepts.
- Comparison to four popular tag hierarchy induction algorithms
- As the gold standard for the experiment, the GermaNet ontology was used (the Austria-Forum tag dataset contains only German tags)
Results: Empirical Analysis
- 9 test participants (all of them experienced in the evaluation
of concept hierarchies)
- resource taxonomy with b=10
- Evaluation via online test
- Users had to classify tag trails
Results: Empirical Evaluation
Compared to a tag taxonomy comprising only tags, the concept
relations of a tag-resource taxonomy with branching factor b = 10
are only 5% less hierarchically arranged than the tag concepts of
the tag taxonomy approach that is, in theory, semantically best:
the so-called Deg/Cooc tag taxonomy induction algorithm.
Results: Tag Cloud Navigability
In order to determine the navigability of the approach several
tag networks with different resource list lengths were
generated.
Branching factors used in the experiment: b = 2, 5, and 10.
Resource list length was varied from k = 10 to 50.
- To determine navigability, the size of the LSCC and the ED were measured.
- To determine efficiency, a hierarchical decentralized searcher was
implemented that uses the resource hierarchy as background knowledge to
search the tag networks.
Results: Network Properties
Simulations show the navigability of the hierarchically
constructed tag networks.
Results: Searcher
Simulations show very high success rates ( > 90%)
even for “short” resource lists (k=10).
Conclusions
- From a network-theoretical perspective (and only looking at tags), tagging systems are navigable
- However, if we consider simple user-interface constraints, they are NOT!
- Problem: Current tag cloud algorithms calculate resource lists in a static manner
- Pagination fragments the tag network into isolated clusters
- However, with hierarchically constructed resource lists, navigability can be recovered
- Such tag networks are also efficiently navigable if the resources of the tagging system can be arranged into a resource taxonomy with a fixed branching factor
End of Presentation
Thank you!
Christoph Trattner
Graz University of Technology, Austria
References and Further Readings
Trattner, C., Lin, Y., Parra, D., Yue, Z., Brusilovsky, P.: Evaluating Tag-Based Information
Access in Image Collections, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational efficiency of broad
vs. narrow folksonomies, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Trattner, C., Singer, P., Helic, D. and Strohmaier, M.: Exploring the Differences and Similarities
of Hierarchical Decentralized Search and Human Navigation in Information-networks In
Proceedings of the 11th International Conference on Knowledge Management and
Knowledge Technologies, ACM, New York, NY, USA, 2012.
Trattner, C.: Linking Related Content in Web Encyclopedias with Search Query Tag Clouds, IADIS
International Journal on WWW/Internet, Volume 9(2), 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists and Tag Trails, Journal of Computing and Information Technology, Volume
19(3), 155-167, 2011.
Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag
Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer
Science, Volume 17(4), 565-582, 2011.
Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of
Folksonomies, In Proceedings of the 20th international conference on World wide web,
ACM, New York, NY, USA, 417-426, 2011.
Trattner, C., Körner, C., Helic, D.: Enhancing the Navigability of Social Tagging Systems with
Tag Taxonomies, In Proceedings of the 11th International Conference on Knowledge
Management and Knowledge Technologies, ACM, 7–9 September 2011, Messe Congress
Graz, Austria, 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists: A Comparative Study, In Proceedings of the 33rd International Conference
on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011.
Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging
Systems, In Proceedings of the Second IEEE International Conference on Social Computing,
Minnesota, USA, 2010.