Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Date post: 21-Jan-2018
Page 1: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

The University of Innsbruck was founded in 1669 and is one of Austria’s oldest universities. Today, with over 28,000 students and 4,500 staff, it is western Austria’s largest institution of higher education and research. For further information visit: www.uibk.ac.at.

#Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Eva Zangerle, Georg Schmidhammer, Günther Specht

Page 5: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Research Questions

RQ1: How popular are the various Wikipedias on Twitter, and in which language contexts are they referenced?

RQ2: Which features do Wikipedia articles that are popular on Twitter exhibit or share?

RQ3: Does the number of tweets about a certain article correlate with a recent edit and hence an update of the page?

Page 6: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Dataset

• Crawl of Twitter using keyword "wikipedia"

• 2014/10/20 – 2015/03/10

• Total of 4.5 million tweets

• Cleaning of dataset

• Tweets with Wikipedia URL

• Normalization of URLs (also mobile URLs)

• Retweets remain within the set

22% of all Wikipedia article URLs are mobile URLs
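The normalization of mobile URLs mentioned above could look roughly like the following sketch. The slides do not say how the URLs were normalized, so the function name, the unification of the scheme to https and the dropping of the fragment are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_wikipedia_url(url: str) -> str:
    """Map a mobile Wikipedia URL onto its desktop form.

    URLs that do not point to wikipedia.org are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    labels = host.split(".")
    # Mobile hosts look like "<lang>.m.wikipedia.org"; drop the "m" label.
    if len(labels) == 4 and labels[1] == "m" and labels[2:] == ["wikipedia", "org"]:
        host = ".".join([labels[0]] + labels[2:])
    if host != "wikipedia.org" and not host.endswith(".wikipedia.org"):
        return url
    # Assumption: one scheme and no fragment, so equal articles compare equal.
    return urlunsplit(("https", host, parts.path, parts.query, ""))


print(normalize_wikipedia_url("http://en.m.wikipedia.org/wiki/Cod_Wars"))
```

After this step, mobile and desktop links to the same article count as one distinct URL.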

Page 7: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Dataset

Characteristic       Raw          Cleaned
Tweets               4,530,967    2,468,055
Retweets             1,440,122    659,641
Distinct Users       1,730,984    844,975
Mentions             3,334,848    1,880,687
Distinct Hashtags    159,231      118,912
Hashtag Usages       1,528,458    778,737
Distinct URLs        1,447,124    1,121,825
URL Usages           3,393,846    2,793,900

63.24% of all tweets contain 1 URL (maximum: 6 URLs)

77.72% of all URLs point to a Wikipedia page

Page 8: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Tweets per Day

Page 9: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

General Observations: Users

• Long-tailed distribution

• Average number of tweets per user: 2.92

• However: maximum number of tweets per user: 64,521

• 19 of 20 most popular users are bots (404 users in total; 264k tweets)

E. Zangerle, G. Schmidhammer, G. Specht: Analysing the Usage of Wikipedia on Twitter: Understanding Inter-Language Links (accepted at HICSS 2016)

Page 10: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

RQ1

Language Analyses

Page 11: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Language Distribution

• Analysis of tweeted Wikipedia articles with regard to language

• Extract Wikipedia edition (language) from URL

Missing: context, underlying data.

Language           Total        Share
English (en)       1,349,623    52.81%
Japanese (ja)      579,157      22.66%
Spanish (es)       140,396      5.49%
Turkish (tr)       78,235       3.06%
French (fr)        64,139       2.51%
German (de)        52,256       2.04%
Russian (ru)       44,347       1.74%
Arabic (ar)        38,757       1.52%
Korean (ko)        27,261       1.07%
Portuguese (pt)    26,442       1.03%
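Extracting the Wikipedia edition from a URL and tabulating the shares might be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, and it assumes that the first label of the host name (for example "en" in en.wikipedia.org) identifies the language edition.

```python
from collections import Counter
from typing import Dict, Iterable, Optional
from urllib.parse import urlsplit


def wikipedia_edition(url: str) -> Optional[str]:
    """Return the language code of a Wikipedia URL, or None if there is none."""
    labels = urlsplit(url).netloc.lower().split(".")
    # Expect "<lang>.wikipedia.org" or "<lang>.m.wikipedia.org".
    if len(labels) >= 3 and labels[-2:] == ["wikipedia", "org"] and labels[0] != "www":
        return labels[0]
    return None


def edition_shares(urls: Iterable[str]) -> Dict[str, float]:
    """Percentage of references per language edition, most frequent first."""
    counts = Counter(e for e in map(wikipedia_edition, urls) if e is not None)
    total = sum(counts.values())
    return {lang: 100.0 * n / total for lang, n in counts.most_common()}


# Made-up sample URLs, not taken from the dataset.
sample = [
    "https://en.wikipedia.org/wiki/A",
    "https://en.m.wikipedia.org/wiki/B",
    "https://ja.wikipedia.org/wiki/C",
    "https://de.wikipedia.org/wiki/D",
]
print(edition_shares(sample))
```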

Page 12: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Correlation of Language and Wikipedia Size Measures

Measure                     Spearman’s ρ
Total number of articles    .76*
Edits                       .65*
Users                       .46*
Admins                      .42*
Active users                .39*
Images                      .39*
Depth¹                      .35*

* Significant at the 0.001 level

¹ Depth = (Edits / Articles) × (Non-Articles / Articles) × (1 − Stub-ratio)
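For readers unfamiliar with the two quantities on this slide, the following sketch shows how Spearman's ρ (the Pearson correlation of the ranks) and the depth measure can be computed. It is a plain-Python illustration under stated assumptions: the helper names are hypothetical, ties receive average ranks, and the numbers in the example call are made up rather than taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def depth(edits, articles, non_articles, stub_ratio):
    """Depth as defined in the footnote above."""
    return (edits / articles) * (non_articles / articles) * (1 - stub_ratio)


# Made-up values: tweets per edition vs. articles per edition.
print(spearman_rho([120, 45, 30, 8], [5000, 900, 1200, 300]))
```

Because ρ depends only on ranks, it captures monotone relationships even when the raw counts differ by orders of magnitude, as they do between Wikipedia editions.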

Page 13: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Tweet Languages

Tweets (language of the tweet)

Language      Share
English       42.90%
Japanese      21.92%
Spanish       5.77%
Arabic        2.56%
French        2.37%
Turkish       2.24%
German        1.75%
Indonesian    1.56%
Russian       1.35%

Wikipedias referenced (language edition of the linked article)

Language         Share
English (en)     52.81%
Japanese (ja)    22.66%
Spanish (es)     5.49%
Turkish (tr)     3.06%
French (fr)      2.51%
German (de)      2.04%
Russian (ru)     1.74%
Arabic (ar)      1.52%
Korean (ko)      1.07%

Page 14: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Inter-language links

Rows: Twitter language. Columns: Wikipedia language.

      en       ja       es       ar       fr       tr       de       id       ru       pt
en    97.33%   0.19%    0.42%    0.03%    0.33%    0.05%    0.35%    0.12%    0.10%    0.05%
ja    5.48%    93.56%   0.04%    0.01%    0.11%    0.03%    0.20%    0.01%    0.05%    0.01%
es    19.65%   0.28%    77.48%   0.01%    0.62%    0.03%    0.32%    0.07%    0.03%    0.51%
ar    26.58%   0.02%    0.12%    72.79%   0.17%    0.02%    0.02%    0.00%    0.00%    0.00%
fr    20.21%   0.19%    1.11%    1.92%    74.73%   0.03%    0.73%    0.02%    0.05%    0.17%
tr    20.78%   0.01%    0.17%    0.00%    0.18%    77.62%   0.83%    0.04%    0.10%    0.02%
de    21.15%   0.59%    1.41%    0.06%    0.44%    0.13%    74.94%   0.04%    0.04%    0.06%
id    49.83%   1.20%    1.77%    0.16%    0.60%    0.40%    0.91%    42.84%   0.06%    0.26%
ru    17.74%   0.10%    0.05%    0.00%    0.14%    0.03%    0.32%    0.00%    78.38%   0.01%
pt    28.90%   0.73%    6.91%    0.01%    0.75%    0.05%    0.46%    0.09%    0.03%    60.87%

20% of all tweets link to a Wikipedia in another language.

85% of all inter-language links do not have a counterpart in the original language.
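A matrix like the one on this slide is a row-normalized cross-tabulation of tweet language against Wikipedia language. A minimal sketch, assuming each tweet has already been reduced to a (tweet language, Wikipedia language) pair; the function names and the sample pairs are hypothetical:

```python
from collections import Counter, defaultdict


def interlanguage_matrix(pairs):
    """Row-normalized percentages: tweet language -> Wikipedia language."""
    counts = defaultdict(Counter)
    for tweet_lang, wiki_lang in pairs:
        counts[tweet_lang][wiki_lang] += 1
    matrix = {}
    for tweet_lang, row in counts.items():
        total = sum(row.values())
        matrix[tweet_lang] = {w: 100.0 * n / total for w, n in row.items()}
    return matrix


def cross_language_share(pairs):
    """Percentage of tweets linking to a Wikipedia in another language."""
    pairs = list(pairs)
    other = sum(1 for tweet_lang, wiki_lang in pairs if tweet_lang != wiki_lang)
    return 100.0 * other / len(pairs)


# Made-up pairs, not taken from the dataset.
pairs = [("de", "de"), ("de", "de"), ("de", "en"), ("de", "de"), ("en", "en")]
print(interlanguage_matrix(pairs))
print(cross_language_share(pairs))
```

Each row sums to 100% over all Wikipedia languages, which is why the rows in the table above, restricted to ten editions, add up to slightly less.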

Page 15: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Inter-Language Links

• 85% of all links leading to a Wikipedia in a language different from the tweet’s language do not have a counterpart in the user’s language

• Remaining 15%: the Wikipedia actually used is of significantly better quality than the Wikipedia in the tweet’s language

E. Zangerle, G. Schmidhammer, G. Specht: Analysing the Usage of Wikipedia on Twitter: Understanding Inter-Language Links (accepted at HICSS 2016)

Page 16: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

RQ2

Top Articles and Categories

Page 17: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Methods

• Tweets about English Wikipedia

• 52.81% of all tweets

• Total of 724,974 references to Wikipedia

• Total of 336,605 distinct English Wikipedia articles

• Extract article titles and categories from DBpedia

• Resolve extended URLs (e.g., diff pages, access to old revisions, etc.)
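Resolving extended URLs could be sketched as below. The slides do not describe the procedure; this illustration assumes the standard MediaWiki URL shapes (/wiki/<title> and /w/index.php with title, diff and oldid query parameters), and the function name and the returned labels are hypothetical.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def resolve_article(url):
    """Return (title, kind) for a Wikipedia URL.

    kind is one of "article", "diff", "old_revision" or "other";
    title is None when the URL does not name a page.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # values are already percent-decoded
    if parts.path.startswith("/wiki/"):
        title = unquote(parts.path[len("/wiki/"):])
    else:
        title = query.get("title", [None])[0]
    if "diff" in query:
        kind = "diff"
    elif "oldid" in query:
        kind = "old_revision"
    elif title is not None:
        kind = "article"
    else:
        kind = "other"
    if title is not None:
        title = title.replace(" ", "_")
    return title, kind


# The revision ids below are made up.
print(resolve_article("https://en.wikipedia.org/w/index.php?title=Gamergate&diff=123&oldid=122"))
```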

Page 18: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Distribution: Tweets per Articles

64% of all articles were tweeted only once
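The share of articles tweeted only once follows directly from the per-article tweet counts. A small sketch, assuming one resolved article title per reference; the function name and the sample titles are made up:

```python
from collections import Counter


def tweeted_once_share(article_refs):
    """Percentage of distinct articles that are referenced exactly once."""
    counts = Counter(article_refs)
    if not counts:
        return 0.0
    once = sum(1 for n in counts.values() if n == 1)
    return 100.0 * once / len(counts)


# Four distinct articles, two of them referenced once.
refs = ["a", "a", "b", "c", "d", "d", "d"]
print(tweeted_once_share(refs))
```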

Page 19: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Top Articles

Article                           No. of Tweets    Share
diff                              54,432           7.51%
cod_wars                          6,868            0.95%
user:Giraffedata/comprised_of     4,541            0.63%
matthew_ziff                      2,100            0.29%
kidz_bop                          2,015            0.28%
gamergate                         1,703            0.23%
old_revision                      1,517            0.21%
search                            1,383            0.19%
the_little_mermaid_(1989_film)    1,370            0.19%

No article stands out in particular.

Page 20: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Top Categories

Category                                               No. of Tweets    Share
Living people                                          105,895          14.61%
English-language films                                 18,331           2.53%
American films                                         9,605            1.32%
Wars involving the United Kingdom                      7,487            1.03%
American male television actors                        7,255            1.00%
20th-century conflicts                                 7,158            0.99%
American male film actors                              6,981            0.96%
20th-century military history of the United Kingdom    6,968            0.96%
Law of the sea                                         6,953            0.96%
Wars involving Iceland                                 6,928            0.96%

Page 21: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

RQ3

Edits and Tweets

Page 22: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Methods

• Crawled via MediaWiki API

• Tweets about English Wikipedia articles (724,974 references to 336,605 distinct articles)

• Observation period: +/- 24 hours of a tweet

• 543,788 edits in total

• 91,577 edits marked as minor

• 312,160 tweets link to an article edited within +/- 24 hours of tweet

• 233,962 tweets: edit occurred before tweet

• 215,192 tweets: edit occurred after tweet

• No correlation between number of edits and number of tweets: Pearson’s r = 0.06 (at the 0.001 significance level)

• Exception: events
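The ±24-hour observation window can be illustrated with the sketch below. It assumes that edit timestamps for the tweeted article are already available as datetime objects (for example from the MediaWiki API crawl); the function name and the boundary handling, where an edit at exactly the tweet time counts as neither before nor after, are assumptions.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)


def classify_edits(tweet_time, edit_times, window=WINDOW):
    """Return (edited_before, edited_after) within the window around a tweet."""
    before = any(tweet_time - window <= e < tweet_time for e in edit_times)
    after = any(tweet_time < e <= tweet_time + window for e in edit_times)
    return before, after


# Made-up timestamps for one tweet and three edits of the linked article.
tweet = datetime(2015, 1, 1, 12, 0)
edits = [datetime(2015, 1, 1, 6, 0), datetime(2015, 1, 2, 11, 0), datetime(2015, 1, 5)]
print(classify_edits(tweet, edits))
```

A single tweet can have edits on both sides of its timestamp, so the "before" and "after" counts need not add up to the number of tweets with an edit inside the window.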

Page 23: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Conclusion

RQ1: 20% of all tweets link to a Wikipedia in another language.

RQ2: No particular categories or articles are significantly more popular on Twitter. Long-tail distribution for articles (64% of all English articles were tweeted only once).

RQ3: No correlation between the number of edits and the popularity of an article on Twitter can be detected.

Page 24: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Future Work

• Look into inter-language links

• Tweets as quality measure

• Look qualitatively into tweets about Wikipedia that do not mention a particular article

• Interested in joining forces?

Page 25: Wikipedia on Twitter: Analyzing Tweets about Wikipedia

#questions? http://en.wikipedia.org/wiki/Q&A #wikipedia

@eva_zangerle

[email protected]

http://www.evazangerle.at

@dbisibk

http://dbis-informatik.uibk.ac.at

https://www.facebook.com/dbisibk
