
Coursera & Khan Academy on the Social Web

Date post: 18-Dec-2014
Upload: jakub-ruzicka
Description:
Take a “social web” look back at Coursera & Khan Academy. How does it co-create both brands? What does it reveal about both communities? And how can social web data facilitate – both producers’ & consumers’ – informed decision-making in adjusting their “education mix”?

JAKUB RŮŽIČKA [email protected] cz.linkedin.com/in/littlerose

summer 2014 | working paper

@

COURSERA & KHAN ACADEMY ON THE SOCIAL WEB

the social web co-creating brands, revealing communities

& facilitating - both producers’ & consumers’ - informed

decision-making in adjusting their “education mix”

outline

aggregated general background
Coursera & Khan Academy
social web presence quantitatively
web domains, web traffic, keyword performance, business insights, social media statistics: facebook, twitter, google+, youtube & linkedin; competitive analysis, wikipedia insights

original social web data
Coursera & Khan Academy
social web presence qualitatively
facebook, twitter & google+ pages, groups, comment networks, communities, posts, content analysis, fans, demographics, traffic sources, keywords; personal network & interest profiles, search results, news articles, text mining, inbound links, reddit, youtube

a week of tweets
Coursera & Khan Academy
detailed insights provided by a small fragment of big data
general statistics, influential tweeters, demographics, keywords, content analysis, natural language processing, text mining, sentiment analysis

conclusions
Coursera & Khan Academy
brand essence, swot, positioning & more
…and what does the social web say about education?

link to datasets
tools used & DIY resources for self-driven education

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose August 2014

This research paper* is concerned with social web data in the context of education. Studying the social web presence of two institutions, Coursera & Khan Academy – both arguably labeled “revolutionary” (reflections or even leaders of the transformation of education) – illustrates what kind of information the social web provides to learners, shaping their “education mix”, and to educators, personalizing the education(al tools) they provide & reinforcing their positioning; it also shows what the general function & “state of things” of social media in education is. Even simple analyses of the social web “dataset” (quite big, indeed) might provide us with information complementing – in some cases even substituting (/filling the gaps of) – the (internal/private) behavioral data from the usage of an online educational tool. Rather than making an effort to define “the one and only perfect centralized general education system” (if such a thing exists), & even though both services might exhibit many common features (e.g. the “whatever, whenever, wherever, however education” notion), we’ll attempt to define both 21st Century education tools, Coursera & Khan Academy, as “brands” co-created by their social web presence, offering personalized/personalizable (“made to measure”) education suitable for a specific group, individual human being, moment, occasion, need etc., and so taking a specific part in the “education mix” of an individual.

The data were collected during January-June 2014 – for particular analyses/datasets always see the source note in each slide’s footer – & analyzed in summer 2014. We should therefore keep in mind that the picture of both Coursera’s & Khan Academy’s brands is biased towards the first half of 2014 – and even more so in the “a week of tweets” part (yet, as you will see, once we free the data of the “ungeneralizable piece”, we can gain quite detailed insights from a very small fragment of the “big Twitter data”). Although “the ideal” of any web and/or social media analysis would be a longitudinal & automated evaluation/reporting process (which first requires exploratory research), you might be surprised by the richness & high informative value of such a “single captured state”.

* This research paper is part of the work-in-progress text “How to Create Self-Driven Education”. It was posted online to (potentially) educate professionals, academics & the general public about possibilities of mining the social web.

…also to get some feedback and (above all) spread the word about self-driven education & contribute to the vision. =)


RUZICKA, Jakub (2014). COURSERA & KHAN ACADEMY ON THE SOCIAL WEB: THE SOCIAL WEB CO-CREATING BRANDS, REVEALING COMMUNITIES & FACILITATING – BOTH PRODUCERS’ & CONSUMERS’ – INFORMED DECISION-MAKING IN ADJUSTING THEIR “EDUCATION MIX”. [working paper] Charles University in Prague, Faculty of Social Sciences, Institute of Sociological Studies.


aggregated general background

Coursera & Khan Academy
social web presence quantitatively
web domains, web traffic, keyword performance, business insights, social media statistics: facebook, twitter, google+, youtube & linkedin; competitive analysis, wikipedia insights

The first section of this text is (mainly) based on data provided by third parties. Therefore, it is suggestive rather than conclusive, gives incomplete information & serves only as a general introduction to the topic. Its significance will be reinforced in the three following sections (i.e. including the conclusions), where it complements the original data collected & provides a framework for the overall picture of Coursera’s & Khan Academy’s social web presence.

the internet
about 3 billion users
about 1 billion websites
the top 500 sites on the web

blogs
more than 6.7 million bloggers
about 80% of internet users read blogs

facebook
about 1.3 billion monthly active users
about 80% of daily active users outside the US & Canada
more than 50 million facebook pages

twitter
255 million monthly active users
about 77% of accounts outside the US
about 500 million tweets per day

google+
540 million monthly active users
about 5.5 million pages*
* a simple estimate based on google’s statement that more than 1 million pages were created in the first 6 months (g+ launched in November 2011)

youtube
more than 1 billion users
80% of youtube traffic outside the US
100 hours of video uploaded every minute

linkedin
186 million monthly active users
more than 3 million company pages
over 39 million students & recent college graduates

reddit
about 115 million monthly unique visitors
largest demographic group: 18-29 year old males

wikipedia
over 500 million monthly unique visitors
over 4.5 million English articles
over 10 edits/sec of wikipedia & its sister projects
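The google+ page estimate in the footnote above is a plain linear extrapolation. A minimal sketch of that arithmetic follows, assuming a constant page-creation rate. The 33 elapsed months are an assumption, chosen because it reproduces the 5.5 million figure; the slide itself does not state the month count.

```python
def extrapolate_pages(pages_in_window: float, window_months: float,
                      elapsed_months: float) -> float:
    """Linearly extrapolate a page count, assuming a constant creation rate."""
    return pages_in_window * elapsed_months / window_months


# "More than 1 million pages in the first 6 months" comes from the footnote.
# The 33 elapsed months are an assumed value, not a figure from the source.
estimate = extrapolate_pages(1_000_000, 6, 33)
print(f"{estimate / 1e6:.1f} million pages")
```

Since the quoted statement says "more than" 1 million and real growth is unlikely to be constant, the result is best read as a rough order-of-magnitude benchmark.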

This introductory slide sets some general benchmarks for estimating the bias of our (prospective) definition of Coursera’s & Khan Academy’s brands, which is based on the social web analyses that make up this report.*

* I’ve also attempted to establish generalization principles for social media research (disclaimer: way more theoretical =)), particularly focused on social media algorithms and offline & online political participation, here.

stats

source: internetlivestats.com, newsroom.fb.com/company-info, about.twitter.com/company, google.com/+/brands, youtube.com/yt/press/statistics.html, press.linkedin.com/about, bit.ly/1jZqHBX (techcrunch.com), en.wikipedia.org/wiki/Wikipedia:Statistics, blog.digitalinsights.in, expandedramblings.com, comscore.com | June 2014

result label: Education | refcount: 7048
description: Education in its broadest, general sense is the means through which the aims and habits of a group of people lives on from one generation to the next. Generally, it occurs through any experience that has a formative effect on the way one thinks, feels, or acts. In its narrow, technical sense, education is the formal process by which society deliberately transmits its accumulated knowledge, skills, customs and values from one generation to another, e.g., instruction in schools.
categories (label, URI):
Philosophy of education, http://dbpedia.org/resource/Category:Philosophy_of_education
Knowledge sharing, http://dbpedia.org/resource/Category:Knowledge_sharing
Education, http://dbpedia.org/resource/Category:Education

result label: State school | refcount: 6194
description: State schools, also known as public schools or government schools, generally refer to primary or secondary schools mandated for or offered to all children by the government, whether national, regional, or local, provided by an institution of civil government, and paid for, in whole or in part, by public funding from taxation. The term may also refer to institutions of post-secondary education funded, in whole or in part, and overseen by government.
categories (label, URI):
State schools, http://dbpedia.org/resource/Category:State_schools

result label: Secondary school | refcount: 4579
description: Secondary school (the term "high school" is most often associated with English-speaking countries, though the two are far from synonymous) is a term used to describe an educational institution where the final stage of schooling, known as secondary education and usually compulsory up to a specified age, takes place. It follows elementary or primary education, and may be followed by university (tertiary) education.
categories (label, URI):
High schools and secondary schools, http://dbpedia.org/resource/Category:High_schools_and_secondary_schools
School types, http://dbpedia.org/resource/Category:School_types
Educational stages, http://dbpedia.org/resource/Category:Educational_stages
School terminology, http://dbpedia.org/resource/Category:School_terminology

result label: Primary school | refcount: 3422
description: A primary school (from French école primaire) is an institution in which children receive the first stage of compulsory education known as primary or elementary education. Primary school is the preferred term in the United Kingdom and many Commonwealth Nations, and in most publications of the United Nations Educational, Scientific, and Cultural Organization. In some countries, and especially in North America, the term elementary school is preferred.
categories (label, URI):
Elementary and primary schools, http://dbpedia.org/resource/Category:Elementary_and_primary_schools
School types, http://dbpedia.org/resource/Category:School_types
Educational stages, http://dbpedia.org/resource/Category:Educational_stages
School terminology, http://dbpedia.org/resource/Category:School_terminology

result label: Mixed-sex education | refcount: 3089
description: Mixed-sex education, also known as coeducation, is the integrated education of male and female students in the same institution. It is the opposite of single-sex education. Most older institutions of higher education were reserved for the male sex and since then have changed their policies to become coeducative.
categories (label, URI):
Gender, http://dbpedia.org/resource/Category:Gender
School types, http://dbpedia.org/resource/Category:School_types
Educational environment, http://dbpedia.org/resource/Category:Educational_environment
Mixed-sex education, http://dbpedia.org/resource/Category:Mixed-sex_education

the “education” keyword

Before we begin interpreting & discussing the gathered data about the two educational institutions, Coursera & Khan Academy, let’s consider the “education” concept itself. For the definition, we’ve asked – unsurprisingly – one of the products of the social web, Wikipedia, the free open online encyclopedia. Properly speaking, the answer was given by DBpedia, “a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web”. [self] Its keyword search API provided us with the “education” query results ranked by the number of inlinks pointing from other Wikipedia pages at the particular result page (the refcount column).*

At first sight, we can observe that the “education” concept is closely related to “school” – primarily in its “traditional K-12 school education” meaning, which dominates the results. Yet, as we can learn right from the very first description of education’s “deep” (philosophical) meaning, such a context is “narrow” & “technical”. In recent years – arguably due especially to the mass expansion of ICT, the Internet, and (therefore) open educational resources facilitating education personalization (and mass-scale support & analytics) – we have become more aware of the (intrinsically motivated) self-driven & informal education opportunities. These include Wikipedia, blog articles & YouTube tutorials by professionals or amateur enthusiasts, specialized educational portals, community forums, online professional communities, open-source communities; blended learning, MOOCs, MIT OpenCourseWare, Codecademy, TED, GitHub, and a myriad of other non-profit & for-profit online educational projects & new concepts in education.**

Professional/academic specialization is increasing; however, rather than specializing within one discipline, learners “pick” suitable skills across various disciplines (interdisciplinarity) to achieve a freely chosen (whatever, whenever, wherever, however), intrinsically motivated and, to some extent, unique mastery. Based on this premise, we might argue that “school” in the “education - school” association could be replaced by “education mix”***, where particular educational tools & entities act as “brands” with a clear positioning of their products, meeting the needs of particular target groups (for more insight, see the long tail) & customizing their mixes according to the data they have. The archetypes of an “educator” (producer) & a “learner” (consumer) will often blend into a “prosumer” (for more insight, see Wikinomics). On the subject of “education mix” customization, we will see that the social web – even standing alone, not combined with other data – can give us a detailed answer to our research question “What are the brands of Coursera & Khan Academy?”, with the aforementioned particular interest in “What kind of information does the social web provide to learners, shaping their “education mix”; educators, personalizing education(al tools) they provide & reinforcing their positioning; and what is the general function & “state of things” of social media in education?”

* Defining the “21st Century education concept” based on a single Wikipedia query only could be seen as “sloppy”, possibly empowering the digital education revolution sceptics with some solid & sound arguments about not enough critical thinking. On that account, I should mention that I’ll save a more elaborate theoretical background / model / framework for the aforementioned work-in-progress text “How to Create Self-Driven Education”.

** I am – by no means – trying to provide you with a curated list of all online educational resources & currently discussed concepts. It might not even be possible, since – depending on the scope of our definition of an “educational resource” – it might be concluded that we actually learn from everything we interact with. Try typing the “(free) online education” query, and/or any specific topic you want to learn, into your favourite search engine, and you’ll still see only a few planets in the giant digital universe of available educational resources.

*** Coining a term is fun, and even more entertaining if the term can be somewhat useful. =) Since meanings are rather created in everyday life, based on the necessity of having them, let’s not give any “hard” definition to the “education mix” concept. We can just say it’s kind of analogous to the “marketing mix” concept. Under the “education mix” paradigm, schools, study programmes, job positions etc. serve as “mere” archetypes/recipes, helping us to assemble our own qualification, positioning & “brand”, employing various education providers & ways of devoting our time efficiently to improve our lives. Neither learners nor educators search for a “perfect educational institution” but rather for a “perfect (yet flexible) education mix” tailored for specific needs & goals.
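The ranking described in the text (results ordered by refcount, the number of inlinks from other Wikipedia pages) can be reproduced offline from the five results listed earlier. The sketch below is illustrative only: `LookupResult` and `rank_by_refcount` are hypothetical names, not part of the DBpedia Lookup API, and the labels and refcounts are copied from the June 2014 results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LookupResult:
    """Hypothetical container for one keyword-search result."""
    label: str
    refcount: int  # number of inlinks from other Wikipedia pages


def rank_by_refcount(results):
    """Return the results sorted by refcount, highest first."""
    return sorted(results, key=lambda r: r.refcount, reverse=True)


# Labels and refcounts as reported for the "education" query (June 2014),
# deliberately shuffled so that the sort has something to do.
results = [
    LookupResult("Primary school", 3422),
    LookupResult("Education", 7048),
    LookupResult("Mixed-sex education", 3089),
    LookupResult("State school", 6194),
    LookupResult("Secondary school", 4579),
]

ranked = rank_by_refcount(results)
for rank, result in enumerate(ranked, start=1):
    print(rank, result.label, result.refcount)

# Share of result labels that mention "school" at all.
school_share = sum("school" in r.label.lower() for r in results) / len(results)
print(f"labels mentioning 'school': {school_share:.0%}")
```

Three of the five labels mention “school”, which is the pattern the paragraph above refers to when it says that the “school” meaning dominates the results.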

third-party data (source: wiki.dbpedia.org/Lookup) | June 2014

google books

Word frequency in Google’s English book corpus tells a similar story. The “education” keyword seems to be strongly correlated with “school”. We can see that one of the means of obtaining education has (almost always) been discussed more than the ultimate goal (& process) itself.

We are now ready to begin the Coursera & Khan Academy comparison.

original data (source: books.google.com/ngrams) | May 2014

With respect to end users, both Coursera & Khan Academy are websites offering free educational content; Coursera is a for-profit and Khan Academy a non-profit organization. Both organizations operate in the US. Khan Academy’s domain is about six years older, which gave it more time to reinforce its “transforming education” authority / brand label, as we’ll see in the following parts of the text.

domain name | COURSERA.ORG | KHANACADEMY.ORG
domain id | D164365199-LROR | D118495620-LROR
creation date | 2012-01-12T01 | 2006-03-14T22
updated date | 2013-07-02T20 | 2014-04-29T00
registry expiry date | 2023-01-12T01 | 2019-03-14T22
sponsoring registrar | GoDaddy.com, LLC (R91-LROR) | GoDaddy.com, LLC (R91-LROR)
sponsoring registrar IANA id | 146 | 146
registrant name | Andrew Ng | Shantanu Sinha
registrant organization | Dkandu, Inc. | Khan Academy
registrant street | 1975 El Camino Real | PO Box 1630
registrant city | Mountain View | Mountain View
registrant state/province | California | California
registrant postal code | 94040 | 94042
registrant country | US | US
registrant phone | 1.415377 | 1.650337
registrant email | [email protected] | [email protected]

web domains

third-party data (source: who.is) | May 2014
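The “six years older” remark can be checked directly against the two WHOIS creation dates. The sketch below uses domain registration as a proxy for the age of each service, which is an assumption, and it assumes 1 May 2014 as the reference date because the records are dated only “May 2014”.

```python
from datetime import date

# Creation dates from the WHOIS records (time-of-day part omitted).
CREATED = {
    "coursera.org": date(2012, 1, 12),
    "khanacademy.org": date(2006, 3, 14),
}


def years_between(start: date, end: date) -> float:
    """Approximate number of years between two dates (365.25-day years)."""
    return (end - start).days / 365.25


# How much earlier khanacademy.org was registered than coursera.org.
head_start = years_between(CREATED["khanacademy.org"], CREATED["coursera.org"])

# Assumed reference date: the exact retrieval day in May 2014 is not given.
as_of = date(2014, 5, 1)
age_ratio = (years_between(CREATED["khanacademy.org"], as_of)
             / years_between(CREATED["coursera.org"], as_of))

print(f"head start: {head_start:.1f} years, age ratio: {age_ratio:.1f}")
```

Domain registration dates are only a rough stand-in for when each service actually started operating, so ratios derived from them should be treated as approximate.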

keyword (in Coursera course names) | courses | share
introduction | 76 | 11%
science | 32 | 5%
part | 31 | 5%
health | 29 | 4%
learning | 25 | 4%
global | 21 | 3%
data | 19 | 3%

educational resources

math: Early math, Differential calculus, Arithmetic, Integral calculus, Pre-algebra, Multivariable calculus, Algebra I, Differential equations, Geometry, Linear algebra, Algebra II, Applied math, Trigonometry, Recreational math, Probability and statistics, Math contests, Precalculus

science: Biology, Cosmology and astronomy, Physics, Health and medicine, Chemistry, Discoveries and projects, Organic chemistry

economics & finance: Microeconomics, Finance and capital markets, Macroeconomics, Entrepreneurship

arts and humanities: History, Music, Art history, Philosophy, American civics

computing: Computer programming, Cryptography & information theory

test prep: SAT, CAHSEE, MCAT, IIT JEE, NCLEX-RN, AP Art History, GMAT

partner content: The Museum of Modern Art, Crash Course, The J. Paul Getty Museum, Stanford School of Medicine, California Academy of Sciences, MIT+K12, Exploratorium, LeBron asks, Asian Art Museum, The Brookings Institution, All-Star Orchestra, The Aspen Institute, Silicon Schools Fund and Clayton Christensen Institute, NASA

original data (source: api-explorer.khanacademy.org, tech.coursera.org/app-platform/catalog/) | May 2014

categories: Arts, Biology & Life Sciences, Business & Management, Chemistry, Computer Science: Artificial Intelligence, Computer Science: Software Engineering, Computer Science: Systems & Security, Computer Science: Theory, Economics & Finance, Education, Energy & Earth Sciences, Engineering, Food and Nutrition, Health & Society, Humanities, Information, Tech & Design, Law, Mathematics, Medicine, Music, Film, and Audio, Physical & Earth Sciences, Physics, Social Sciences, Statistics and Data Analysis, Teacher Professional Development

languages: English, Chinese, Spanish, French, Russian, Portuguese, Turkish, Ukrainian, German, Hebrew, Japanese, Arabic, Greek, Italian

The official APIs of both services allow us to obtain lists of all open educational resources that can be found on Coursera’s & Khan Academy’s websites.

Coursera offers (as of May 2014) 664 courses organized into 25 categories. The courses consist of (video) lectures, additional materials, a community forum & assignments, which are suitable (not only) for those who want to gain a (paid) signature track certificate and/or complete a whole specialization. They are taught mainly by University/College Professors and manifestly aimed at University/College students, (prospective or current) professionals and/or life-long learners.

We might want to complement Coursera’s built-in categorization of courses (listed above), since – despite being appropriate for Coursera’s users – it might be too general for our understanding of its content, given the rather niche specialization of higher education courses. The common features of the educational content available on Coursera can also be understood through a simple frequency analysis* of keywords in the courses’ names. The most frequent ones tell us that “at least” 11% of the courses – based only on the detected “introduction” keyword – are introductory-level, that the courses are generally rather “science” courses, and that they are sometimes serialized (having more “parts”). Slightly favoured topics are “health”, “learning”, “global” & “data”, followed by (not included in the table, about 0.5-1 times less frequent) “teaching”, “analysis”, “programming”, “systems”, “history”, “world”, “management”, “engineering”, “chemistry” & “society”. Note that Coursera can also be defined by the Universities & other partners with which it cooperates on the preparation/distribution of courses and/or to which it provides a MOOC platform. We will talk about them in the following parts of the text.

Khan Academy’s 56 categories of open educational resources – consisting of thousands of small lectures, practice problems, mini-projects, points & badges (achievements) to reinforce extrinsic motivation, discussion forums, and a learning management environment for learners, educators, or parents – are structured quite clearly, reflecting their “US common core primary/high school education” nature. Khan Academy’s primary & high school curriculum – as we will also see later on – is frequently used to “fix” one’s general education as well. The content is – almost exclusively – created by the founder of Khan Academy, Salman Khan.

Both platforms offer open educational materials in languages other than English. To be more specific, there are courses taught directly in a language other than English (Coursera) and courses/lectures with translated transcripts (in-video subtitles) created by volunteer communities (Khan Academy, recently also Coursera).

* Such an analysis serves as an introductory one only. As we will hopefully see by the end of this report, the social web and social & online media mentions give us a much more accurate overall picture.
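The keyword frequency analysis of course names mentioned above can be sketched in a few lines. This is a minimal illustration, not the exact procedure used for the report: the sample course names and the stopword list are hypothetical, and counting each keyword at most once per course name is an assumption (the report does not say whether it counted per course or per occurrence).

```python
import re
from collections import Counter


def keyword_shares(course_names, stopwords=frozenset()):
    """Count how many course names contain each keyword, plus its share of all courses."""
    counts = Counter()
    for name in course_names:
        # A set ensures that a keyword is counted at most once per course name.
        words = set(re.findall(r"[a-z]+", name.lower())) - stopwords
        counts.update(words)
    total = len(course_names)
    return [(word, n, n / total) for word, n in counts.most_common()]


# Hypothetical course names for illustration only; the real input was the
# list of 664 course names obtained from Coursera's catalog API.
sample = [
    "Introduction to Data Science",
    "Introduction to Public Health",
    "Global Health: An Introduction",
    "Machine Learning",
]
STOPWORDS = frozenset({"to", "an", "the", "of", "and"})

for word, n, share in keyword_shares(sample, STOPWORDS)[:2]:
    print(f"{word}: {n} ({share:.0%})")
```

With the reported totals, the same share calculation gives 76 / 664, or roughly 11%, for “introduction”, which agrees with the keyword table.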

web traffic

Although Coursera generally appears to reach a larger audience – according to both pages’ traffic estimates* – the distinction seems to be blurred in the US, possibly because Khan Academy offers educational resources tailored to the US K-12 education system. The traffic of the “newer” of the two institutions, Coursera, gives the impression of growing steadily, while Khan Academy’s most recent traffic seems to be rather stable, apart from its regular fluctuations (I bet you won’t find summer holidays in the KA’s line chart =)).

Other metrics/estimates might also be biased by the composition of Alexa’s global traffic panel*. However, for potential future purposes (e.g. hypothesis testing), we might temporarily (until we have other data) assume that Coursera is visited more by men, whereas Khan Academy’s users are more often women. Khan Academy might also be frequently visited by students/professionals who possibly want to fill the gaps in their general education.

* Alexa's traffic estimates are based on data from its global traffic panel, which is a sample of millions of Internet users using one of over 25,000 different browser extensions.

third-party data (source: alexa.com) | May 2014

web traffic

third-party data (source: alexa.com) | May 2014

Both services appear to be most popular – no surprise here – in the US. However, we might argue that while Coursera seems to have a wider reach around the globe, in the non-(natively-)English-speaking world, Khan Academy slightly leads in Canada & the UK. Besides the importance of the names/brands of both institutions, the top search-engine keywords provided by Alexa do not say much. Yet they already hint at the discussion arising from the social media content analysis in the following parts of the text: Coursera’s brand being more “diverse”, based on the particular courses one takes, Professors & partner institutions; and Khan Academy’s brand being more “centralized” & dependent on the person of Salman Khan – caused, among other things, by the difference in educational content production on both websites. The sharp rise of search traffic in mid 2013 is observable in both graphs; it was not examined in more detail, since it is not the subject of this research and might be caused by a modification of Alexa’s methodology and/or increased publicity of both thanks to an influential publisher/medium.

web traffic

third-party data (source: alexa.com) | May 2014

The inbound link analysis suggests that about three times more sites link to Khan Academy than to Coursera – on the other hand, as we already know, Khan Academy has existed about four times longer than Coursera. In general, the analysis shows the importance of social media (Facebook & YouTube in particular) for both institutions’ traffic. Khan Academy’s subdomains show the (web traffic) importance of SmartHistory, KA’s resource for art history, and of two of its volunteer translator communities.

You might also notice the Turkish newspaper Hurriyet (Coursera) – we will explain the connection between Coursera & Turkey later – and “the” Chinese online medium & SNS (social networking service), Sina. Unsurprisingly, given the large Indian population, there’s a significant interest in Coursera in India (google.co.in). We will talk about both services’ geography & demography in general later on.

web traffic

estimated 8,022,800 monthly visits

estimated 4,152,930 monthly visitors

estimated 4,055,700 monthly visits

estimated 3,149,610 monthly visitors

top pages (Coursera.org): Course: Machine Learning; Course: Human-Computer Interactio; Course: Cryptography I; Course: Computer Science 101

top pages (KhanAcademy.org): About | Khan Academy; Knowledge Map | Khan Academy; Computer programming | Khan Academy

third-party data (source: site-seo-analysis.com, opensiteexplorer.org, trafficestimate.com) | May 2014

Once again, although the website traffic estimates are only approximate, even this page supports the upcoming discussion of Khan Academy having a stronger community, and of Coursera having a higher reach built around diverse target groups.

Perhaps because the transformation of education is “powered by” ICT development, Coursera’s most popular courses seem to be the techn(olog)ical ones.

keyword performance

third-party data (source: semrush.com) | May 2014

Organic keywords from an alternative source seem to support the previous discussion. Moreover, they emphasize Coursera’s & Khan Academy’s partner institutions.

The analysis of competitors based on organic search shows that Khan Academy is looked up as a source of open educational resources for mathematics. Coursera appears (organically) to be a general online class platform, a MOOC platform in particular, since it competes with websites offering University/College and/or professional online courses.

business insight

number of followers: 14,399 | number of followers: 33,435

original data (source: linkedin.com) | May 2014

We’ll start our quick business insight into Coursera & Khan Academy by examining their profiles on the business-oriented social network LinkedIn.

The for-profit vs. non-profit difference between the two institutions is reflected in the fact that Coursera, unlike Khan Academy, updates its company page on a regular basis, focusing its news & stories on the college/professional population & on recruiting new employees. Coursera is (expectedly) also larger in terms of number of employees (/company size).

business insight


While the previous organic search (/keyword) competition was based on search terms – i.e., a slightly more text-based categorization – LinkedIn’s recommendation system gives the “related search recommendations” using clicks, term overlap & length bias [self] – i.e., supposedly, even in our case of “people also viewed”, these should be slightly more human behavior-based recommendations.

We can see that both portals are perceived to be among the leaders in the realm of online/MOOC education (edX, Udacity, Udemy). Grockit – providing US standardized exam preparation – reflects KA’s recent educational content & partnerships (discussed in the following parts of the text). Finally, we should also mention Knewton, an adaptive learning (/educational content personalization) platform. Its presence among other educational tools complements our conception of current trends in education.

original data (source: linkedin.com) | May 2014

business insight

Another method of finding competitors, this time crowd-sourced, brings some new players into the overall picture of the online course market.

third-party data (source: crunchbase.com) | May 2014

business insight

third-party data (source: crunchbase.com) | May 2014

Information about funding – from our viewpoint – seems to be important only for identifying the influences on Coursera’s operation, i.e. its investors (similarly to its partner institutions, discussed later). Nevertheless, it might be misleading to assume that Khan Academy just “lives a life of its own”. Since this topic will show up many times in the following two parts of the text, it will be enough to mention its recent partnerships with College Board, Bank of America, NASA, and the White House, and its financial backing from the Bill & Melinda Gates Foundation.

business insight

third-party data (source: crunchbase.com) | May 2014

Institutions are – arguably, to a great extent – shaped by

people who work in them (& therefore determine their

development). Besides the founders of both institutions,

Salman Khan (Khan Academy), and Daphne Koller &

Andrew Ng (Coursera), we can see other members of

the board, employees and/or partners, about whom we

can learn more simply by looking up their names in our

favourite web search engine.

Coursera Inc.

1975 El Camino Real West, Mountain View, CA 94040, United States

www.coursera.org

Industry Internet Educational Services

Employees 45

SIC Schools & Educational Services, Nec (8290)

NAICS Educational Support Services (611710)

People

Vincent Price Member of Advisory Board

Margaret Sheil Member of Advisory Board

Rafael Bras Member of Advisory Board

John Etchemendy Member of Advisory Board

Peter Lange Member of Advisory Board

Phyllis Wise Member of Advisory Board

Andrew Ng Co-Founder

Philip Hanlon Member of Advisory Board

Christopher Eisgruber Member of Advisory Board

Patrick Aebischer Member of Advisory Board

John Doerr Director

Scott Sandell Director

Vice President(s)

Jessica Neal Vice President - Talent

Chief Executive Officer

Daphne Koller co-founder and co-CEO

Richard Levin Chief Executive Officer

President(s)

Lila Ibrahim President

business insight

Khan Academy Inc.

PO Box 1630, Mountain View, CA 94042, United States

www.khanacademy.org

Industry Educational Services

Employees N/A

SIC Services-Educational Services (8200)

NAICS Administration of Education Programs (923110)

Director(s)

Salman Khan Founder & Executive Director

Other

Jennifer Overholt Volunteer and Math Content Creator

The same query using an alternative source should be

enough* to roughly illustrate the difference in organizational

structure between a for-profit & a non-profit organization, which

will complement our future conclusions about Coursera’s &

Khan Academy’s communities.

* If we were to create an as-much-as-possible complete list of both

companies’ team members, we should certainly also use web search

engines; search blogs, news articles, different social media APIs etc.

A good place to start would be here for Coursera

& here for Khan Academy.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: insideview.com) | May 2014

(chart: Google Trends search interest over time for “coursera” & “khan academy”, with news headline markers A to K)

google search

While the traffic estimates appeared to work in

favour of Coursera, Google search of the

“coursera” & “khan academy” keywords, by

contrast, shows overall higher demand for Khan

Academy. As we will see later on, it is so,

arguably, because Coursera is often found via

particular courses (e.g. ‘machine learning’),

whereas Khan Academy is a rather centralized &

compact tool closely associated with its founder,

Salman Khan, & his stances towards education

mediated by online publishers.

News headlines found by Google Trends – red

about Coursera & blue about Khan Academy –

also allow us to start shaping the overall picture

of Coursera’s & Khan Academy’s online

mentions (which will be further developed later,

especially thanks to social media mentions &

inbound links).

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: google.com/trends) | May 2014

google search

The world search interest maps (related to the previous keyword search interest over time) can be seen as showing the current markets of both educational tools. While Khan Academy is looked up in countries like Ghana, Singapore & Greece, the worldwide

popularity of Coursera (beyond US & Canada) seems to depend on larger university cities (possibly with a better educated population). Once more (& unsurprisingly again) we can see high demand for both portals in India. Regarding Coursera, Bangladesh

might surprise us by being among the top.*

* Numbers represent search volume relative to the highest point on the map, which is always 100.
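To make the relative scale of the map numbers tangible, here is a minimal sketch of such a rescaling. The raw volumes are made-up placeholders (Google Trends does not publish absolute numbers), the regions are anonymous, and `normalize_to_100` is a hypothetical helper name, not part of any Google tool.

```python
def normalize_to_100(raw_volumes):
    """Rescale raw values so that the largest one becomes 100."""
    peak = max(raw_volumes.values())
    return {region: round(100 * value / peak) for region, value in raw_volumes.items()}

# Hypothetical raw search volumes, for illustration only.
raw = {"region A": 5000, "region B": 4000, "region C": 1250}
scaled = normalize_to_100(raw)
print(scaled)
```

Because every map is rescaled to its own peak, a value of 100 on the Coursera map and a value of 100 on the Khan Academy map do not have to stand for the same absolute search volume.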

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: google.com/trends/) | May 2014

“society” statistics

profile following followers

Barack Obama 650,026 43,714,421

Dalai Lama 0 9,030,143

NASA 218 6,813,595

Khan Academy 64 301,000

Coursera 300 182,000

profile fans engagement rate (average)

Barack Obama 41,280,379 0.32%

Narendra Modi 18,538,829 1.50%

Mitt Romney 11,336,358 0.13%

Khan Academy 727,524 N/A

Coursera 547,273 N/A


tags conference, csr, education, governmental, ngo,

politics, professional association, science

Slowly shifting to social media, we can start with a popularity rank from SocialBakers’ proprietary database, which gathers social media data on a regular basis and therefore allows us to make some

general and/or longitudinal comparisons. Let’s start with fan pages.* The “society” category includes the “conference”, “csr”, “education”, “governmental”, “ngo”, “politics”, “professional association” &

“science” tags. It shows that Khan Academy is overall more popular than Coursera on both Facebook & Twitter – as measured by the number of fans/followers. Broadly speaking, it is also obvious that

political & spiritual leaders are more popular than educational leaders. To complement the left Facebook table, we can add that the TED conference was among the top 10. On the subject of other

educational institutions, there was, for example, Harvard University in the top 40. Likewise, using our “educational lens” when looking at the right Twitter table, the “third place” of NASA – sharing content

which can usually be labeled as “educational” – should make us happy.

* And their analogues on other (non-Facebook) SNSs.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | May 2014

“society” statistics


profile views followers

Barack Obama 84,520,413 4,327,072

Jamie Oliver 31,686,664 2,389,056

Narendra Modi 114,569,200 1,493,332

Coursera 22,833,545 1,067,922

Khan Academy 6,376,165 343,686

profile views subscribers

Super Simple Songs 1,508,583,613 1,089,874

Howcast 1,362,150,063 2,117,582

Khan Academy 114,569,200 2,078,502

Coursera 977,490 32,128



Dealing with the YouTube leaderboard in the “society” category, we can see that educational content wins. Both Super Simple Songs & Howcast offer educational content –

yet targeting very different users. In total, Khan Academy is the third most viewed in its category*, which demonstrates the popularity of Salman Khan’s videos, which

are uploaded directly to YouTube (and displayed on Khan Academy’s webpage as embedded videos). Since Coursera’s course videos are uploaded to its own (online)

learning environment, made accessible after enrolling in a particular course, Coursera does not have that much video content to share – and that’s perhaps why it is

not so popular on YouTube. In the top 15 of the “society” category of YouTube videos, we might find (again) the popular TED conference, Stanford University, TED-Ed,

MIT OpenCourseWare & several other educational institutions.

What about Google+? In the top 50 of the “society” category, we can find several educational institutions, mainly

university Google+ pages (e.g. Stanford). Even if you have not yet been exposed to a decent amount of social media research, given that the “first

place” is occupied by the same person on Twitter, Facebook & Google+, you should at least suspect who “the” politician very good with them is.

To sum up, we can say that YouTube is the most suitable platform for education among the four SNSs we were talking about. Moreover, Facebook, Twitter &

Google+ users are keener to follow politicians than educators.**

* Also note Khan Academy’s disproportionate number of views to number of subscribers, as compared to Super Simple Songs & Howcast. We can argue that Khan Academy requires the

most (learning) concentration of all three, since nearly all of its videos are Salman Khan’s micro lectures, typically on STEM (science, technology, engineering, and mathematics) topics.

** Jamie Oliver then might go beyond our categorization. My personal preference would be an “entertainment” category. However, since Jamie categorized his Facebook page as “public

figure”, we might argue that there’s some educational/political/societal/cultural value instilled in recipes when discussing properties of food (nevertheless, with that logic, we might also

justify the educational value of the well-known extensive collection of YouTube makeup tutorials =)).
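The views-to-subscribers comparison from the first footnote can be reproduced with a few lines. This is only a sketch: the figures are copied as printed in the YouTube table above, and `views_per_subscriber` is a hypothetical helper name introduced for illustration.

```python
# (views, subscribers) as printed in the YouTube "society" table above.
channels = {
    "Super Simple Songs": (1_508_583_613, 1_089_874),
    "Howcast": (1_362_150_063, 2_117_582),
    "Khan Academy": (114_569_200, 2_078_502),
}

def views_per_subscriber(views, subscribers):
    """Average number of views per subscriber of a channel."""
    return views / subscribers

ratios = {name: views_per_subscriber(v, s) for name, (v, s) in channels.items()}
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:,.0f} views per subscriber")
```

A single ratio like this says nothing about why the channels differ; it only puts a number on the disproportion the footnote points at.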

number of fans: 529,115 (Coursera) number of fans: 714,626 (Khan Academy)

Before we get to the (rather qualitative) content analysis in the second

& third part of this report, let’s focus on quantitative metrics which can

provide us with a general overview of Coursera’s (orange) & Khan

Academy’s (green) reach.

As the chart shows, the numbers of Coursera & Khan Academy fans rose

steadily over the years.* The overall Facebook fan base of Khan Academy

is higher than the fan base of Coursera.

* Not paying attention to the sudden jump in the number of Khan Academy fans

around March 2014 which might be related to a significant event but might as well

be a “bug” in Wildfire’s data and/or web reporting.

competitive analysis

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014


competitive analysis

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014

Looking at data from an alternative third-party source makes it possible to

verify our previous conclusions. Since we – once again – can see the

sudden jump in the number of Khan Academy fans at the end of March

2014 & we possibly want to explain it, we need to “borrow” some findings

from the following two parts of this text in order to argue that the

March/April 2014 KA’s number of fans increase might be caused by KA’s

new partnerships, especially its co-operation with College Board (SAT*

preparation).

* SAT is a widely used US college admissions standardized test.

United States 90,976 17.2 %

India 69,061 13.1 %

Brazil 37,235 7.0 %

Egypt 18,603 3.5 %

Mexico 14,394 2.7 %

United Kingdom 13,580 2.6 %

Spain 12,542 2.4 %

Canada 11,138 2.1 %

Greece 10,807 2.0 %

United States 322,341 45.1 %

India 54,447 7.6 %

Canada 26,592 3.7 %

Bangladesh 25,471 3.6 %

Pakistan 24,128 3.4 %

Brazil 21,998 3.1 %

United Kingdom 17,618 2.5 %

Egypt 15,148 2.1 %

Australia 13,587 1.9 %


competitive analysis

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014

With regard to the nationality of Coursera’s & Khan Academy’s Facebook fans, we can

support – in some cases also complement (especially towards this research’s

overall conclusions) – the conclusions drawn from Google search query statistics

(compare with page 21). Furthermore, taking a closer look at the left (Coursera’s)

table, we can see that Coursera seems to be popular in Latin America (Brazil, Mexico).
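The percentages in both country tables can be checked against the total fan counts. The sketch below assumes that each percentage is simply the country’s fans divided by the page’s total number of fans, and takes the United States figure for Khan Academy as 322,341; `share_of_fans` is a hypothetical helper name.

```python
def share_of_fans(country_fans, total_fans):
    """Percentage of a page's fans coming from one country, to one decimal."""
    return round(100 * country_fans / total_fans, 1)

coursera_total = 529_115
khan_total = 714_626

us_coursera = share_of_fans(90_976, coursera_total)
india_coursera = share_of_fans(69_061, coursera_total)
us_khan = share_of_fans(322_341, khan_total)
print(us_coursera, india_coursera, us_khan)
```

Under that assumption the recomputed shares agree with the percentages printed in the tables, which is a cheap sanity check worth running on any third-party figures.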

number of followers: 176,619 number of followers: 295,146

competitive analysis

Coursera’s & Khan Academy’s potential reach of Twitter users, in

comparison with Facebook, shows a larger gap between the two,

(again) in favor of “the older” Khan Academy, which has roughly 1.7 times

as many Twitter followers as Coursera.


Coursera: following 298, tweets 1,712, twitter age 2 years 9 months

Khan Academy: following 63, tweets 1,072, twitter age 5 years 7 months


competitive analysis

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014

Reports from an alternative third-party source (SocialBakers) tell us the same story. Yet it should be stressed once more that Coursera is a bit younger (not only in terms of its Twitter

account age) and has therefore had less time for fan acquisition. Just for the record, we might add that Coursera’s Twitter account is more active than Khan Academy’s account. Analysis of

followers is the subject of the second part of this report.

followers: 1,067,922 followers: 343,686

competitive analysis

By contrast, as for Google+, Coursera looks like the absolute winner – if

the “number of followers” metric is used – although KA’s fanbase might be,

according to Wildfire’s data, growing much faster over the last 3 months

(as of June 2014).

Don’t be misled by the date on which Wildfire began monitoring both pages.

Querying the Google+ API, we find that the first ever post of Khan

Academy on Google+ was published in December 2011. As for Coursera,

it was April 2012. Given such a small “starting time” difference

between the two, & given that each one’s Facebook & Google+ content

strategy does not differ significantly* (see the following section), this

illustrates how different the g+ user base is. Even without collecting user

socio-demographic data, we might argue that – in comparison with the other

studied SNSs – the g+ population is generally older and/or has a specific

interest profile.

* Just to make the statement absolutely clear: Coursera’s content strategy is

different from Khan Academy’s content strategy. Coursera’s Facebook & Google+

content strategies are very much alike. Khan Academy’s Facebook & Google+

content strategies are very much alike.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014


competitive analysis

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014

Unfortunately, Coursera was not monitored by our alternative source of g+ followers time

series. At least, we can confirm that the number of Khan Academy’s Google+ followers has

been growing very fast since the start of 2014.*

* Possibly even further back in time but we don’t have enough data available to verify that.

videos 319

total views 951,021

videos 4,201

total views 406,597,063

subscribers: 31,535 subscribers: 1,781,982

subscribers subscribers

competitive analysis


On the subject of YouTube subscribers, we definitely could look for its relationship to the

number of videos uploaded and/or total views. However, the only highlight here is Khan

Academy’s total views. As we mentioned before, Coursera falls short in this regard. While

a Coursera video is, on average, viewed about 3,000 times, an average Khan

Academy video is viewed almost 100,000 times.
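The two averages follow directly from the channel statistics listed above (total views divided by number of videos). A minimal sketch, with `average_views` as a hypothetical helper name:

```python
def average_views(total_views, videos):
    """Mean number of views per uploaded video."""
    return total_views / videos

coursera_avg = average_views(951_021, 319)
khan_avg = average_views(406_597_063, 4_201)

print(f"Coursera: {coursera_avg:,.0f} views per video")
print(f"Khan Academy: {khan_avg:,.0f} views per video")
```

Keep in mind that a mean hides the distribution: a handful of very popular videos can carry most of a channel’s views.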


While Khan Academy is based on sharing YouTube videos, and therefore we can observe a long-term

steady increase in uploaded video views reflecting its (rather) constant content strategy, Coursera is still

learning to work with YouTube – and there’s a noticeable improvement over the last few months.

The very last interesting insight we’ll have before moving on to the

(rather) qualitative section of this text based on our own (primary) data,

is revealing the MOOC education tools universe/landscape. We can

easily do that thanks to a powerful Wikipedia article graph exploration

interface, Wikinsights, based on content, category & links (both,

inbound & outbound) similarity, and links complementarity. What binds

Coursera & Khan Academy together – based on the aforementioned

criteria – are Open educational resources, CK-12 Foundation, ALISON

(company), OpenCourseWare, Massive open online course, Udacity,

MIT OpenCourseWare & LearnStreet.

Regarding Wikipedia pages related to Coursera only, we can see

Wikiversity, Charles Severance, Massive online open research,

Creative Live, TechChange, Udemy, Ben Bederson, Edsby, Iversity,

Academic Earth, Lynda.com, Eliademy, EduKart, Daniel S. Welt,

Open-source curriculum, edX & Open Learning.

As for Wikipedia pages related to Khan Academy only, there are

Technology integration, Interactive Learning, Two Circles, OER

Commons, Free High School Science Texts, Educational technology,

Open textbook, American Friends of Arts et Métiers ParisTech, Curriki,

Virtual university, LearnThat Foundation, MITx, PhET Interactive

Simulations, Teaching Channel, Computers in the classroom, E-

learning, INeedAPencil, Saylor Foundation, CollectSPACE, Lecture

recording, Open Source Learning, East Bay Children's Book Project.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: http://wikinsights.org/) | June 2014

wikipedia insights

Coursera & Khan Academy

social web presence qualitatively

facebook, twitter & google+ pages, groups, social networks, communities, posts, fans,

demographics, traffic sources, keywords; personal network & interest profiles, search

results, news articles, text mining, inbound links, reddit, youtube

(pp. 34-95)

original social web data

mission

We are on a mission to change the world by offering classes from top

universities and professors online, and for free. We envision people throughout

the world, in both developed and developing countries, to learn without limits by

using our platform to connect to great education that has so far been available

only to a select few. We hope to empower people with education that will

improve their lives, the lives of their families, and the communities they live in.

founded 2012

category Education

description

Learn from renowned professors, watch high quality lectures online, achieve

mastery via interactive assignments, and collaborate with a global community of

students.

Please visit help.coursera.org to get help or to drop your suggestions!

How has Coursera changed your life? Tell us your story!

http://blog.coursera.org/student/stories

awards 2012 TechCrunch Crunchies - Best New Startup

products Coursera Education Platform

number of fans: 529,115 number of fans: 714,626

company overview

Start learning now at Khan Academy. All our resources are completely free, forever.

Too many people around the globe don’t have access to high quality educational materials, or are

forced to learn through a system that doesn't allow them to learn at their own pace.

We think the technology exists today to fundamentally change this, and our 501(c)3 non-profit is

working to build the tools and resources every learner deserves.

mission We are a not-for-profit organization with the mission of providing a free world class education for

anyone, anywhere.

founded 2007

category App page

products

We offer tutorials on everything from basic arithmetic to calculus, chemistry, physics, history, art,

medicine, economics, and finance. Khan Academy also offers infinite problems for practice

(currently in Math).

We are translating content into the world's most spoken languages (click on the subtitles option on

a video or visit the International option in the bottom right corner of www.khanacademy.org).

To get started using Khan Academy, check out: http://www.khanacademy.org/about/getting-started

If you'd like to contribute to our efforts, please visit http://www.khanacademy.org/contribute

To share your story about the impact Khan Academy has had on you (videos are much

appreciated!), please visit www.khanacademy.org/stories

0.39% active (Jan-Jul 2014) 0.68% active (Jan-Jul 2014)

about

Querying the Facebook API for both pages (their IDs), we get the basic (public) information that both organizations share about themselves on Facebook and also the

common call to action (here “call to engagement”) to share stories about impacts of education on one’s life – as we’ll see many times later, this is arguably also one of the

core features of “education as a brand”. While Coursera puts itself in the general “education” category, Khan Academy defines itself as being an (educational)

“app(lication)”.

Another query gives us a list of all Coursera’s/KA’s active fans who engaged with one or the other page from January to May 2014*. We can see that Khan Academy’s

Facebook community is, in total, larger & more active.

* This period is valid for the following analyses as well, unless stated otherwise.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: developers.facebook.com/docs/graph-api/) | January-May 2014

pages Coursera liked

profile                                                                 category

Working Together towards Health for All through Primary Health Care    Community

Archaeology's Dirty Little Secrets                                      Community

INDEX: Design to Improve Life                                           Non-profit organization

The Wall Street Journal                                                 Media/news/publishing

NPR Classical                                                           Arts & Entertainment

Lifehacker                                                              Computers/internet website

pages KA liked

profile                                                                 category

KA Lite                                                                 Product/service

page likes

Before we dive into the content Coursera & Khan Academy share on their social

media profiles & the typologies of their fans built around comment networks, let’s

ask the question of how Coursera & Khan Academy position themselves based on

the pages they like (Facebook) or profiles they follow & mention (on Twitter), and

also the “grassroots” social media communities around both educational tools.

Due to the design of the social network & its “culture” (common practice), there’s

not much to see on Facebook with regard to pages giving likes to other pages.

While Coursera expressed support for some non-profit communities/organizations

– one of them related to an archeology course on Coursera, and another related to

Johns Hopkins University, one of Coursera’s key partners (see later on) – and three

publishers (news/tech/entertainment), Khan Academy liked only its offline version

(Khan Academy Lite).

On the next pages, you will find the somewhat more interesting Twitter “self-positioning”

profiles of Coursera & Khan Academy.


following: 63

following/friends network

Twitter following/friends relationship from the perspective of an

institution – who does it follow, not necessarily the other way

around – can be seen as how the institution wants to position itself,

(therefore) who it wants to publicly support. This is oftentimes true

for “human users” as well, however, human users also might want

to follow another user (incl. an institution) in order to “subscribe” to

receive their content – e.g. in order to benefit from it. While the

second statement can also be true for an institution – e.g. for its

social media manager who shares third-party content as part of an

institution’s communication strategy; to exit this nested loop, let’s

assume the act of Khan Academy or Coursera following another

user reflects (offline to online relationship) and/or co-creates

(attempt at online to offline relationship, incl. support) its brand.

Dealing first with the network* of tweeters Khan Academy is

following, the most important users – as measured by number of

followers (size of a node is proportional to number of followers) –

are: Bill Gates (less active regarding his 1,286 tweets since he

joined Twitter in 2009) & NASA (310,042 tweets since it joined

Twitter in 2007).

On the topic of the most active users in KA’s friends network – who

arguably tweet for their own (but possibly overlapping with KA’s)

communities – right after NASA & Bill Gates, there’s “jack” &

“pamelafox”, the former being Jack Dorsey, Twitter co-founder, and

the latter being Pamela Fox, working at Khan Academy on the

Computer Science curriculum. Also note that besides the financial

backing of Khan Academy from the Bill & Melinda Gates

Foundation we’ve already mentioned in the first section of this text,

Bill Gates has also brought a lot of media attention to Khan

Academy. If you want more elaborate proof than just the search

results after querying “bill gates khan academy” in your favourite

search engine, there's a named entity recognition analysis of

online news articles about Coursera & Khan Academy near the

end of this extensive chapter (p. 87).

Our previous paragraph highlights the fact that even when a

relationship network like that already provides us with a lot of

information, it’s always worth combining it with other data (third-

party, insider knowledge etc.) to gain a detailed insight. One

source of information simply might not be enough.

* The primary tool for analyzing & visualizing social networks in this text was

NodeXL, a free, open-source template for Microsoft Excel. I’ve also adopted

several definitions of social network metrics from its built-in documentation

and/or from the documentation’s links to Wikipedia articles describing them.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_IsFollowing.csv, Twitter_Khan_IsFollowing_MetaData.csv) | May 2014

following: 63


Centrality measures – in this picture “degree” (the

number of connections a node has) – show that,

with respect to the users KA is following, people rather

than institutions are interconnected. Why is

that? Khan Academy is mainly following the “core

team” of its employees and/or volunteers (ICT

development and/or content creation) & KA’s

branches (e.g. Smarthistory). The bottom curve is

formed by several institutions. While many of them

perfectly fit the “education for anyone” part of KA’s

brand (e.g. UncommonSchools, TeachForAmerica

etc.), we’ll see in later analyses that the most

distinctive ones are NASA & CollegeBoard

(“CollegeBoard” & “OfficialSAT” twitter accounts).
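As a reminder of what “degree” measures here, the sketch below counts the connections of each account in a tiny, made-up follow network. The account names are hypothetical and the follow links are treated as undirected, which is a simplifying assumption (NodeXL can also distinguish in-degree from out-degree); this is not the NodeXL implementation.

```python
from collections import defaultdict

def degrees(edges):
    """Count connections per node, treating follow links as undirected."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(linked) for node, linked in neighbours.items()}

# Hypothetical follow links among accounts an institution follows.
edges = [
    ("employee_1", "employee_2"),
    ("employee_1", "employee_3"),
    ("employee_2", "employee_3"),
    ("employee_3", "partner_org"),
]
degree = degrees(edges)
print(degree)
```

In this toy network the employees, who follow each other, end up with higher degrees than the partner institution, which mirrors the pattern described above.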

following/friends network

following: 300

In comparison with the “small Khan Academy

family” – employees, key partner institutions

& a few “support the right thing” expressions

– Coursera clearly makes an effort to define

itself by the users it follows. That’s why its

friends network is slightly more diverse.

There’s no Bill Gates (yet, the Bill & Melinda

Gates Foundation is present), but Bill Clinton;

there’s Obama, Oprah and other public

figures and/or famous & influential people

(incl. entertainment & education/science

popularization, such as Vsauce).* Apart from

“celebrities”, Coursera also follows many

online tech & news publishers (& also some

education-specialized) and, above all, many

educational institutions and/or educators,

usually those with whom Coursera co-

operates (see the next page).

With regard to the most active tweeters in the

network, “the leader” is The New York Times

with its 136,887 tweets since it joined Twitter

in March 2007. Even though it’s a bit shy,

hiding behind Barack Obama**, NY Times is

also the third most influential (as measured

by number of followers) user in the network

(after Barack & Oprah).

* The size of nodes/users in the social network

graph represents the number of their followers.

** My mistake, sorry for that layout. =)

following/friends network

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_IsFollowing.csv, Twitter_Coursera_IsFollowing_MetaData.csv) | May 2014

following: 300


Switching from “number of followers” to “degree” does not, by itself,

tell us anything meaningful about Coursera’s brand.

Let’s rather try to find groups – “natural” clusters of users

based on interconnectedness of the vertices (the larger the

vertex is, the bigger its clustering coefficient). This way, we

can see the first traces of a generalization – very rough but

sufficient for our purposes – that the two general (&

overlapping) groups of users Coursera is following are:

1) those Coursera wants to be co-defined by, incl. media

partners; 2) those featuring their courses on the Coursera

platform – institutions, mainly Colleges/Universities, individual

lecturers/Professors and/or particular classes/courses.

However, don’t forget that our picture does not show the

typology of Coursera’s “friends” (those whom Coursera follows)

but the 1.5 level network, i.e. including the links between

Coursera’s friends, on which the clusters are based and

which helped us to see the structure of the network more

clearly (compared to the “blob” on the previous page) & make

the aforementioned generalization, further supported on the

following pages of this text.
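For readers unfamiliar with the two notions used on this page, here is a minimal sketch of a 1.5 level network and of the (local) clustering coefficient. The accounts and follow links are made up, links are treated as undirected, and the ego account itself is left out of the result because it is connected to every friend by definition; all function names are hypothetical, and this is not the NodeXL implementation.

```python
from itertools import combinations

def one_point_five_network(ego, follows):
    """Keep the ego's friends plus the links among those friends."""
    friends = follows.get(ego, set())
    neighbours = {friend: set() for friend in friends}
    for friend in friends:
        for other in follows.get(friend, set()):
            if other in friends and other != friend:
                neighbours[friend].add(other)
                neighbours[other].add(friend)
    return neighbours

def clustering_coefficient(node, neighbours):
    """Share of a node's neighbour pairs that are themselves connected."""
    linked = neighbours.get(node, set())
    if len(linked) < 2:
        return 0.0
    pairs = list(combinations(sorted(linked), 2))
    closed = sum(1 for a, b in pairs if b in neighbours.get(a, set()))
    return closed / len(pairs)

# Hypothetical "who follows whom" data.
follows = {
    "ego": {"a", "b", "c", "d"},
    "a": {"b", "c"},
    "b": {"c"},
    "c": {"d"},
    "d": set(),
}
network = one_point_five_network("ego", follows)
for account in sorted(network):
    print(account, round(clustering_coefficient(account, network), 2))
```

Accounts whose own contacts also follow each other get a coefficient close to 1, which is what a large vertex in the clustered picture stands for.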

If you find our first clustering – not the generalization built

around it but the clustering itself – a bit difficult to interpret (I do!), since

there do not seem to be any obvious “deeper” (than our

generalization) relationships at our level of analysis, you

should know that there will be much clearer typologies

regarding Facebook & YouTube comment networks later on.

Finally a fact especially important for fans of soap operas:

while Coursera follows Khan Academy, Khan Academy does

not follow it back. =)

following/friends network

followers: 176,619

top URLs

https://www.coursera.org/about/translate

https://www.coursera.org/course/teachingcharacter

top domains

coursera.org

nytimes.com

youtube.com

huffingtonpost.com

google.com

ryanseacrest.com

bbc.com

entrepreneur.com

charlierose.com

kplu.org

top hashtags

coursera14

edtech

gamification

coursera

top words

coursera

course

learn

education

new

online

courses

learning

top word pairs

co,founder

coursera,app

rick,levin

global,translator

translator,community

find,out

starts,today

online,courses

top mentioned

coursera

andrewyng

daphnekoller

relaygse

pennopencourses

oprah

women2

charlierose

juanpagalavis

top tweeters

huffingtonpost

businessinsider

felipebhz

nytimes

petchary

nytimesworld

_nastycat

slate

brainpicker

ws

timezone

Eastern Time (US & Canada) 203

Pacific Time (US & Canada) 152

Central Time (US & Canada) 59

London 34

Quito 34

Athens 30

Chennai 27

Amsterdam 25

Atlantic Time (Canada) 24

Brasilia 18

Greenland 18

Arizona 16

Hawaii 14

Mexico City 14

Tehran 14

Madrid 13

Alaska 12

Istanbul 12

Rome 11

Beijing 10

Caracas 10

recent tweets

On Twitter, Coursera has (as of June 2014) around 177 thousand followers. Since we’re still in the realm of how the institution positions itself, let’s take a look at the content of its 200 recent (June 2014)

tweets. This might give us some insight not only into the content Coursera (or Khan Academy on the following pages) shares but also into what & who Coursera mentions (and/or replies to) in its tweets.

Top domains show us whose educational content (and/or content about education & society – see the dataset, a zip file link enclosed on page 131) Coursera shares. Besides its own courses & major

media publishers – most of them here already considered before within our overall “mediasphere” picture of Coursera (see the conclusions) – we can (newly) find some public figures from entertainment

industry (Ryan Seacrest, Charlie Rose).

Top hashtags include “coursera”, “coursera14” (conference), the popular hashtag “edtech” (education technology; used not only by online media publishers) & “gamification” – a technique used to reinforce

user/learner engagement which, as we will see later, is rather associated with Khan Academy. It occurs in relation to Coursera because of its very first course taught in Chinese, Probability (機率) by Professor

Ping-Cheng Yeh, National Taiwan University, who created a “MOOC-based multi-student social game platform” named PaGamO for his course.

Top words & the three bottom rows of the “top word pairs” table illustrate the fact that Twitter is (unsurprisingly) used by Coursera to announce new courses in order to invite prospective learners to

enroll. In relation to our previous “educational resources” analysis pointing out that many of Coursera’s courses are introductory-level (and also supported by some following analyses studying other social

media), we can conclude that Coursera’s communication strategy – in accordance with its proclaimed mission we’ve seen before – aims to open up (& facilitate) higher education for a mass audience,

within the “everyone can learn anything” notion.

The other three tables (excluding the “timezone” one) provide instances of shared educational content – e.g. Coursera courses by University of Pennsylvania (“pennopencourses”); interacting with

public(ly known) figures; and promotion of current events around both Coursera’s co-founders, Daphne Koller & Andrew Ng, and Coursera’s CEO, Rick Levin. Above all, we should emphasize

Coursera’s recent effort to recruit volunteers for translating (/subtitling) courses into other languages.

Just out of curiosity, you might take a look at the time zones of over 1500 most recent Coursera followers, which indicate where the most recent interest in Coursera comes from (we’ll also take those into

account in our final conclusions). There is also the “top tweeters” table, which describes the top tweeters in the whole network and therefore combines mentions, recent followers & “friends” (“is following”

relationship). We’ll take a look at some “grassroots” influencers & “brand ambassadors” in the third section of this text, using a slightly different technique to capture them: monitoring mentions

in tweets across the entire Twitter rather than focusing on a single (Coursera’s / Khan Academy’s) profile. Here, you might (again) be interested only in the major publishers that Coursera follows and/or

mentions (usually when they publish an article discussing Coursera).
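For readers who want to reproduce tables like “top words”, “top word pairs” & “top hashtags”, here is a minimal sketch of how such counts could be obtained. It is not the tool used for this paper: the sample tweets, the `tokenize` rule and the `top_terms` helper are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical sample tweets (assumption); the real tweets are in the enclosed dataset.
tweets = [
    "New course starts today! Learn online with #coursera #edtech",
    "Find out how our co-founder sees online courses #coursera14",
    "Online courses: a new course starts today #edtech",
]

def tokenize(text):
    """Lowercase a tweet and split it into word tokens (punctuation and '#' are dropped)."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def top_terms(tweets, n=3):
    """Count single words, adjacent word pairs and hashtags across all tweets."""
    words, pairs, hashtags = Counter(), Counter(), Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        words.update(tokens)
        # A "word pair" here means two tokens that directly follow each other.
        pairs.update(zip(tokens, tokens[1:]))
        hashtags.update(tag.lower() for tag in re.findall(r"#(\w+)", tweet))
    return words.most_common(n), pairs.most_common(n), hashtags.most_common(n)

top_words, top_pairs, top_hashtags = top_terms(tweets)
print(top_words)
print(top_pairs)
print(top_hashtags)
```

Note that this simple tokenizer splits “co-founder” into “co” & “founder”, which would be one plausible explanation of a pair like “co,founder”; stop-word removal is deliberately left out of the sketch.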

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_AnyRelation_Edges.csv, Twitter_Coursera_AnyRelation_Vertices.csv) | June 2014

followers: 295,146

top domains

khanacademy.org

tumblr.com

youtube.com

google.com

nasa.gov

edsurge.com

paniit-bayarea.org

techcrunch.com

top hashtags

hourofcode

khanacademy

newsat

kabrainteaser

edtech

kamathchallenge

teacherappreciationweek

kacollegehero

stem

whsciencefair

top words

khanacademy

khan

new

academy

students

hourofcode

out

more

top word pairs

khan,academy

check,out

khanacademy,hourofcode

computer,science

math,mondays

find,out

sal,khan

top mentioned

khanacademy

lifeatka

pamelafox

collegeboard

britcruise

nasa

salkhanacademy

officialsat

calacademy

drszucker

top tweeters

danceeatrepeat

hntweets

imdrw

nytimes

washingtonpost

wsj

fastcompany

techcrunch

cnetnews

forbes

timezone

Eastern Time (US & Canada) 138

Pacific Time (US & Canada) 89

Central Time (US & Canada) 73

Athens 29

Atlantic Time (Canada) 25

Arizona 22

London 21

Amsterdam 17

Chennai 15

Brasilia 11

Hawaii 11

Alaska 10

Quito 10

Bangkok 9

Greenland 9

Istanbul 9

Mumbai 8

Brisbane 7

Mountain Time (US & Canada) 7

Rome 7

Sydney 7

top URLs

https://www.khanacademy.org/hour-of-code/hour-of-code-tutorial/v/welcome-hour-of-code

https://www.khanacademy.org/donate

http://www.nasa.gov/content/nasa-khan-academy-collaborate-to-bring-stem-opportunities-to-online-learners/#.U4T468biI9V

https://www.khanacademy.org/sat

https://www.khanacademy.org/partner-content/CAS-biodiversity

http://www.paniit-bayarea.org/edtech/

http://cs-blog.khanacademy.org/2014/03/what-does-computing-professional-look.html

https://docs.google.com/document/d/1QCen5ijdfEFiG_a_RGSGgHjBYI7Vc1lakHpcLQF4mkU/edit?usp=sharing

https://www.khanacademy.org/cs/second-avatar-naming-contest/2601896243

https://www.khanacademy.org/hour-of-code/hour-of-code-tutorial

recent tweets

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_AnyRelation_Edges.csv, Twitter_Khan_AnyRelation_Vertices.csv) | June 2014

As for Khan Academy’s recent tweets, publicizing the new Computer Science lectures is evident. The most distinctive is the “Hour of Code”, a trademark of Code.org, “non-profit dedicated to expanding participation in computer science by making it available in more

schools, and increasing participation by women and underrepresented students of color” [self]. Dealing with the top URLs, we can see the aforementioned HourOfCode, some other CS & education technology related content, CS curriculum translation into

Portuguese, some efforts to obtain donations, NASA & CollegeBoard partnerships – we will see those two many times later on; and California Academy of Sciences partner content on Khan Academy. Top hashtags & mentions show that Khan Academy shares a lot

of educational resources for traditional school education (collegeboard, officialsat, newsat, stem), challenges/brainteasers (kabrainteaser, kamathchallenge), background stories & new content/features development (“Life at KA” blog, pamelafox, britcruise,

salkhanacademy) and also stories related to achievements in traditional school education (educators: teacherappreciationweek, learners: kacollegehero). Compared to Coursera, once again, we can see: much more coherent universe of topics – to a certain extent,

given by smaller overall number of topics around KA; (on social media) very active & publicized “core” small team of KA’s content creators and/or ICT developers – e.g. John Resig (“jeresig”) & Ben Alpert (“soprano”) we’ve already seen in KA’s friends Twitter

networks & we’ll also see later on; better work with hashtags; and successful communication of core partners. Although we’ve already come across this topic & it will be supported by further data, let’s point out that we’re starting to see Khan Academy as the more

“centralized” one of both institutions, with Salman Khan in the centre & a solid community around him, together “flipping” the traditional education system (the flipped classroom model), customizing it & being part of it at the same time.

Again, on the right you can see the time zones of about 1100 most recent Khan Academy followers & the “top tweeters” table.

Coursera pages & people KA pages & people

pages & people search

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: https://developers.google.com/+/api/) | May 2014

Pages & people API search allows us to discover communities around both educational tools on Google+. If

we omit founders & employers/partners – we can see some new faces, e.g. Khan Academy’s John Resig,

creator of the jQuery JavaScript library – and some irrelevant results (e.g. Shah Rukh Khan & Khanacademy

Pesh), the snippets of the discovered pages show us instances (the first page of results only) of the finding

that Coursera’s Google+ communities are mainly study groups, while Khan Academy’s represent especially

KA’s content translations and fans & volunteer pages/communities. Dealing with those, we should not forget

our limitations by “khan academy” or “coursera” keywords that also favour English language & Latin script –

e.g. possibly not covering Arabic languages and/or communities & pages related to Coursera/KA but not

including our keywords. Nevertheless, since, throughout this text, we complement this information, such

limitations should not be an issue. Finally, we can add that, as we know from our Twitter data, Coursera

also currently pursues its goal of establishing a volunteer translator community.

For a similar – this time rather quantitative – analysis on Facebook, see the text box on the right.

As for Coursera, over 90 fan pages & public(ly visible) groups were detected – the same

search limitations as we’ve mentioned with Google+ – consisting mainly of study/course

groups. Therefore we know that self-organization of students outside the “official”

Coursera’s learning environment exists.

Khan Academy has about 200 public(ly visible) fan pages & groups. These are mainly

language translations of (the “original”, English) Khan Academy (rather Facebook pages)

and/or serve as a project management tool for translators (rather Facebook groups).

pages & groups search

followers: 1,067,922

+1s top 3

Next on our agenda are content

analyses of Coursera’s & Khan

Academy’s top Google+ & Facebook

posts (January-May 2014) according to

various kinds of post engagement: +1s &

likes, replies & comments, reshares &

shares (to stick with both services’

terminologies). Despite our “first half of

2014” bias and perceived suitability of

long-term analysis & studying posts

based on their type, content, keywords,

time & other features for the purpose of

developing an “archetypal post” (and/or

“ideal” most engaging post) – e.g. using

conjoint analysis with engagement

metrics used to indicate the perceived

value; we are able to draw reasonable

conclusions about the content Coursera

& Khan Academy posts. Above all, we

are supposed to see which content

Facebook/Google+ users generally

associate most with Coursera’s/KA’s

brand, since most engaging posts are

usually also the most visible on both

SNSs.

Coursera Google+ page’s posts most

appreciated by the community – posts

that received the most +1’s (descending

order from left to right) – share a

common attribute of story & inspiration

(unlike Khan Academy’s “playfulness”).

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014

+1s top 3

followers: 343,686

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014

Dealing with KA’s posts from right to left (reverse order, “third place” first), Khan Academy adheres to “playfulness” – especially see the top posts based on other metrics on

the following pages. Despite the fact that any wishes, celebrations, holiday posts etc. are generally popular on social media, Khan Academy’s New Year’s resolution is

reinforced by a badge. We also see that the community endorses KA’s partnerships – but rather in the sense that it brings free educational resources they can use at

school than regarding a particular institution.* Yet the most successful post is a “storytelling” one, generally Coursera’s strong point. We’ve already seen the “Teacher

Appreciation Week” within the Twitter top hashtags table (KA’s most recent tweets analysis).

* Once again, we can see that since Khan Academy is rather aimed at K-12 education, while Coursera meets demand for higher academic/professional education, KA is more prone to “blend” with the traditional

education system. Also note – for later reflection & discussion within our overall conclusions – that even though both Coursera & Khan Academy are given the “revolutionist” brand label, what happens is rather an adjustment of current

educational practices and enjoying the technological benefits (meeting higher demand, analytics, space-time freedom etc.) than “fighting against” them and/or against the current education system.

followers: 1,067,922

replies top 3

There’s one newcomer post in Coursera’s posts that were

most commented on. It’s Coursera’s Android App Beta testing.

Asking for feedback and/or for expressing an opinion is a

technique nine out of ten social media marketers recommend.

=) “Authorization” of users in this way is a mutually beneficial

act which increases interaction & strengthens the community.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014

400 replies

87 replies

72 replies

replies top 3

followers: 343,686

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014

Finally, driving social media engagement in a purely educational way!

The high popularity of responding to KA’s brainteasers & challenges

shows in which way gamification of education can be important:

facilitating beginning of & reinforcing motivation in the learning process

(it catches my interest); supporting interaction & healthy competition

(responding to show to the others that I know the answer); and therefore

also collaboration (responding to answers of the others, discussion).

217 replies

191 replies

181 replies

followers: 1,067,922

reshares top 3

Reshares on Coursera don’t provide us with any (previously) unseen posts. The

lesson here could be that Coursera g+ fans rather share inspiring & storytelling

content than educational content (e.g. links to new courses), which is different from

Khan Academy (see on the next page). In several following analyses, we’ll support

the conclusion that Coursera excels in sharing stories about students, while Khan

Academy rather stands out in gamification & sharing background content about itself.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014

reshares top 3

followers: 343,686

Khan Academy Google+ post

reshares added a new brain teaser

to our most engaging posts “hall of

fame”. We can say that KA’s

Google+ community shares rather

educational content: content which

their social network might need

(College admissions resources)

and/or engaging learning content

(challenges & brainteasers).

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014

followers: 1,067,922 (Coursera)

type of post    number of posts    number of +1    +1/count coefficient
article         68                 4,073           60
photo           25                 5,321           213
video           3                  295             98

followers: 343,686 (Khan Academy)

type of post    number of posts    number of +1    +1/count coefficient
article         13                 1,261           97
photo           33                 6,138           186
video           53                 2,039           38

content shared

Before moving to Facebook, let’s support the generally known & accepted idea of visual

content (photo/pictures) being the most engaging on social media. Employing as simple

analysis as: 1) calculating the numbers of particular content types – article (text only),

photo & video – Coursera/KA shared 2) adding up the total +1s it received; and 3)

calculating the number of +1s a particular content type received on average; we find that

sharing a photo/picture pays off for Coursera & Khan Academy roughly two times more

than sharing any other kind of content.* Also when popularizing education & making it

accessible, we should keep in mind that “a picture is worth a thousand words”.

* I know, I know… Not controlling for a third variable. Anyone who’ll find a confounding variable

and/or a spurious relationship, don’t hesitate to start the discussion below this text. =)
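The three steps above can be sketched in a few lines. The post counts & +1 totals are taken from the two tables on this page; the function names (`plus_ones_per_post`, `photo_advantage`) are only illustrative assumptions, not part of the original analysis.

```python
# (number of posts, total +1s) per content type, as reported in the tables on this page
coursera = {"article": (68, 4073), "photo": (25, 5321), "video": (3, 295)}
khan = {"article": (13, 1261), "photo": (33, 6138), "video": (53, 2039)}

def plus_ones_per_post(table):
    """Step 3: average number of +1s a post of each content type received."""
    return {kind: total / posts for kind, (posts, total) in table.items()}

def photo_advantage(table):
    """Ratio of the photo average to the best average among the other content types."""
    averages = plus_ones_per_post(table)
    best_other = max(value for kind, value in averages.items() if kind != "photo")
    return averages["photo"] / best_other

for name, table in (("Coursera", coursera), ("Khan Academy", khan)):
    averages = {kind: round(value) for kind, value in plus_ones_per_post(table).items()}
    print(name, averages, "photo advantage:", round(photo_advantage(table), 2))
```

With these inputs, the photo advantage over the next best content type comes out at roughly two for both pages (slightly above two for Coursera, slightly below for Khan Academy). The small number of Coursera videos (3) is one reason to read the ratio with caution.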

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv, Gplus_Khan_Posts.csv) | January-May 2014

number of fans: 529,115

likes top 3

The most liked stories posted by Coursera on Facebook are different from those on Google+ – (from my subjective point of view) “surprisingly”

inclined towards educational content information instead of storytelling – which denotes there are some differences between both communities.*

As measured by the number of likes, Coursera’s Facebook community acknowledged: firstly, Coursera’s Specializations announcement, sequence of

courses finished by a capstone project & certificate; secondly, a New Year’s post introducing new partner Universities & (therefore) new courses; and

thirdly, a “gag” showing that entertainment in general (and/or memes in particular – yet our particular post is not an Internet meme, you can learn more

about memes here) is still popular on Facebook**, even in communities around education.

* Since Facebook is the “mainstream(est)” social medium, we’ll make an attempt to derive both Coursera & Khan Academy

fan typology based on their Facebook co-comment networks.

** The rank of “memes & gags” content will rather decline in Facebook users’ personal news feeds, since Facebook’s current goal seems to be

becoming a “personalized newspaper” (or here for information about Paper), and because of that it attempts to promote “high quality content”.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014

number of fans: 714,626

likes top 3

As for Khan Academy, Facebook users endorsed by their likes the SAT materials, an

outcome of The College Board & Khan Academy partnership. From the remaining two

posts in the top 3, we can see that many thumbs up are generally given to the story of

Salman Khan (nurtured by online publishers*), his ideas/visions & leadership pathway

towards transforming education.

* We’ve already seen some publishers publicizing Salman Khan before. There is The New York

Times & Harvard Business Review at this page. Nevertheless, the main support for the “nurtured by

online publishers” statement comes with the online news articles analysis later on.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014

number of fans: 529,115

comments top 3

Being “the” social medium,

Facebook is also known as a tool

for establishing online (grassroot)

“movements”; protest/support

groups etc. Knowing that, it

should not be surprising that

Coursera’s Facebook community

expressed their opinion (mainly

disagreement) with restricted

Coursera access in some

countries due to US sanctions,

making the “Update on Course

Accessibility for Students in Cuba,

Iran, Sudan, and Syria”

Coursera’s post that was most

commented on. The disproportion

of likes against comments can be

explained by the fact that while

expression of support on social

media is generally easily done

(there’s usually a button for it),

assuming SNSs where there are

no design features like

downvotes, thumbs down or

anything similar, the only way of

how to disagree is to explain

yourself using a comment.

We are already familiar with the

other most frequently commented

posts.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014

137 comments

95 comments

92 comments

number of fans: 714,626

comments top 3

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014

As well as Khan Academy’s

Google+ community, the

Facebook one also likes to

show off by interacting with

gamified educational content

using comments. Apart from

the (rather relaxing)

brainteasers (probability &

logic) we already know, there’s

also a new mathematical

challenge, which belongs to

the “this week’s challenge”

KA’s series.

824 comments

737 comments

549 comments

number of fans: 529,115

shares top 3

The most successful posts according to the number of times they

were shared do not bring any new content. Nevertheless, the

shuffled order suggests that many Facebook users are eager to

share entertaining content.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014

number of fans: 714,626

shares top 3

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014

Khan Academy’s Facebook community mainly wanted to

share the SAT materials with others – note that on Facebook,

a teenager probably has more (active) classmates than on

Google+ – and also wanted to tease their friends with the “two

guards & two doors” brain teaser.

number of fans: 529,115

other popular posts

To complement the overall picture & comparison with the most popular Google+ posts, let’s also

add another three posts immediately following the top3 Coursera’s stories (Khan Academy’s on

the next page) – as measured by the number of likes they received. While there are obvious

similarities with Google+, we can also see what will be supported in the co-comment Facebook

network analysis, that on Facebook, Coursera reaches more active fans from Latin America.*

* Also note that currently, we are monitoring active fans only (those who like, comment or share). We are not

monitoring posts about Coursera (/Khan Academy) beyond its Facebook page. The supposed differences

between both communities – reflected in the content they publicly acknowledge – might also be (co-)created

by different recommender algorithms Facebook & Google+ employ. However, such discussion is far beyond

the scope of this research.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014

number of fans: 714,626

other popular posts

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014

Similarly, you might take a look at #3-6 of KA’s Facebook posts

with the top like engagement rate. As we’ve mentioned earlier, the

SAT standardized test caused a great deal of the buzz around

Khan Academy on Facebook.

number of fans: 529,115 (Coursera)

total posts since 2014: 176

unique fans engaged since 2014: 3,580 (0.68% of total fans)

most active likers (surnames removed if applicable): Chris; Lars; Eurusd Trader; Mahmoud; Crear una tienda online; Concursos de fotos; Ronnie; Sorteos Facebook. Aplicaciones concursos, gratis

most active commenters (surnames removed if applicable): TichiLivi; Sami; Tarlei; Askia; Ramaswamy; Hassan; Sameer; Steffial

number of fans: 714,626 (Khan Academy)

total posts since 2014: 92

unique fans engaged since 2014: 2,582 (0.36% of total fans)

most active likers (surnames removed if applicable): Syed; Janie; Mya; Tsveta; Daniel; James; Tom; Shayma; Margareth

most active commenters (surnames removed if applicable): David; Ahsanul; Study Australia; Maghnia; Steve; Julie; Keith; Mark; Sue

active fans

We could obtain the demographic profile of the active Facebook fan population from those who share such information publicly. Let’s do that later as a pre-screening of our fans typology based on co-

comment network. This slide simply should highlight the fact that on Facebook (& other SNSs as well), we actually are able to reach detailed information on a single human being (here, last names were

removed). For example, we can create an archetype of a very active Coursera fan based on the demographic information & posts the most active C’s likers & commenters publicly share. The “customer

persona” of a highly engaged Coursera Facebook fan could be described as a man in his 20s, studying in US (native or foreign student) or recently employed, who wants to complement his professional

qualification; possibly (e.g. fans from Brazil, India or Syria) such courses are not available at his home University and/or he might be limited by financial constraints.*

Talking about Coursera & knowing that such practice was not detected within the most active Facebook users on KA’s posts, note that using likes, several companies make an attempt to draw some of the

Coursera’s attention to their business.

Another important piece of information is that Coursera shares almost twice as many posts as Khan Academy does. Khan Academy Facebook fans are larger in number but have a smaller active “core” of users.

As you’ll see on the three following pages, this actually perfectly fits our developing interpretation of smaller but stronger community around Salman Khan, since those active users produce an enormous

number of likes, comments & shares.

* Interest profile could also be created based on the pages the users (publicly) liked.
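The “% of total fans” figures & the “almost twice as many posts” remark can be checked with simple arithmetic. The sketch below uses only the numbers reported on this slide; the dictionary layout & the `engaged_share` helper are assumptions made for illustration.

```python
# Figures reported on this slide (Facebook, since the beginning of 2014)
pages = {
    "Coursera": {"fans": 529_115, "posts": 176, "engaged": 3_580},
    "Khan Academy": {"fans": 714_626, "posts": 92, "engaged": 2_582},
}

def engaged_share(page):
    """Unique engaged fans as a percentage of all fans of the page."""
    return 100 * page["engaged"] / page["fans"]

shares = {name: round(engaged_share(page), 2) for name, page in pages.items()}
post_ratio = pages["Coursera"]["posts"] / pages["Khan Academy"]["posts"]
engaged_ratio = pages["Coursera"]["engaged"] / pages["Khan Academy"]["engaged"]

print(shares)                    # engaged share in % per page
print(round(post_ratio, 2))      # Coursera posts per Khan Academy post
print(round(engaged_ratio, 2))   # Coursera engaged fans per Khan Academy engaged fan
```

Keep in mind that “engaged” here covers only fans who liked, commented on or shared a post of the page itself, so the shares say nothing about passive readers.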

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: developers.facebook.com/docs/graph-api/) | May 2014

number of fans: 529,115, total posts: 176 (Coursera)

number of fans: 714,626, total posts: 92 (Khan Academy)

statistic             Coursera    Khan Academy
mean                  253         765
standard error        22          79
median                159         502
standard deviation    290         755
kurtosis              19          9
skewness              4           3
range                 2,280       4,616
minimum               46          34
maximum               2,326       4,650
sum                   44,543      70,369

likes

Although this & the following two pages

could as well precede the “top posts”

content analysis, now I feel the need to

justify why we did actually study (positive)

outliers rather than the average posts.

We’ve already mentioned the reason

related to the primary objective of this

research, discovering Coursera’s & Khan

Academy’s social web brands, where the

outlying posts are those that actually

reach the largest population and because

of that arguably impact the overall image

of an institution most significantly.

The second reason is that we’ve seen

there’s a minority of active fans which

interacts with a minority of posts. Since

the primary way of spreading content on

social media is engagement & interaction,

we rather might want to study the common

features of the most successful content.

From our “education” perspective ideally

those that are: 1) educational; 2) popular,

so that they drive engagement to

education.
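For those who would like to recompute descriptive statistics of this kind from the enclosed datasets, here is a minimal sketch. The `describe` helper & the sample like counts are hypothetical; the skewness below is the plain moment-based coefficient, whereas spreadsheet tools typically report a sample-adjusted variant, so small differences from the tables are to be expected. The last lines only check that the reported standard errors agree with standard deviation divided by the square root of the number of posts.

```python
import math
import statistics

def describe(values):
    """Basic descriptive statistics for a list of per-post engagement counts."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    # Second and third central moments for a simple (unadjusted) skewness coefficient.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "mean": mean,
        "standard error": sd / math.sqrt(n),
        "median": statistics.median(values),
        "standard deviation": sd,
        "skewness": m3 / m2 ** 1.5,
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
    }

def standard_error(sd, n):
    """Standard error of the mean from a standard deviation and a sample size."""
    return sd / math.sqrt(n)

sample_likes = [1, 2, 3, 4, 10]  # hypothetical like counts, one value per post
summary = describe(sample_likes)
print(summary)

# Consistency check against the reported likes tables (sd and number of posts per page)
coursera_se = standard_error(290, 176)
khan_se = standard_error(755, 92)
print(round(coursera_se), round(khan_se))
```

A positive skewness, as in the sample, is what a few very successful posts (the positive outliers studied here) produce: the mean is pulled above the median.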

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014

statistic             Coursera    Khan Academy
mean                  13          80
standard error        1           15
median                9           27
standard deviation    18          144
kurtosis              19          13
skewness              4           3
range                 137         822
minimum               0           2
maximum               137         824
sum                   2,366       7,379

comments

The previous slide showed

comparison of descriptive

statistics between the likes

Coursera & Khan Academy

received. Among other things, it

demonstrated that Khan

Academy – also thanks to its

brain teasers & challenges –

manages to drive likes more

successfully. Here, comparing

comments statistics, we can

clearly see that Coursera is

missing higher interaction of fans

with one another that Khan

Academy masters thanks to its

aforementioned gamified

content.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014

number of fans: 529,115, total posts: 176 (Coursera)

number of fans: 714,626, total posts: 92 (Khan Academy)

statistic             Coursera    Khan Academy
mean                  41          212
standard error        6           22
median                24          142
standard deviation    82          218
kurtosis              67          5
skewness              7           2
range                 871         1,183
minimum               0           0
maximum               871         1,183
sum                   7,178       19,520

shares

Similarly, in relation to shares, take a

look at mean, median & sum.

Coursera’s Facebook community

receives almost twice as many posts &

has about 1.4 times as many active users,

but it’s rather Khan Academy’s active

core userbase that spreads educational

content on social media.

To be fair, or more precisely, to point

finger at the “offender”, such state of

things is related to what we’ll discover

soon thanks to YouTube network

analysis. While Khan Academy shares

all of its educational content, by default,

using publicly accessible third-party

tools like YouTube, Coursera’s own

“after-you-enroll-accessible-only”

learning environment with all

educational resources makes it more

difficult to establish a solid social media

content strategy based on open

educational resources which, obviously,

drive a lot of engagement.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014

number of fans: 529,115, total posts: 176 (Coursera)

number of fans: 714,626, total posts: 92 (Khan Academy)

correlation matrix

Slowly finishing up our playing with the engagement data, here comes a correlation matrix heat map allowing us to examine likes, comments & shares in their mutual relations. Regarding Khan

Academy, what was liked was also shared, while this was not always true for Coursera. We already know that likes are the most common type of engagement, since giving a thumbs up is literally as

easy as clicking a button. We also know that shares are crucial for information spreading; and, as we saw, also much less frequent – perhaps because a user sharing content not only expresses her/his

deeper interest in a particular topic (compared to a like), but is additionally asked to (optionally) add a comment to the reshared content. Such an act requires a lot of involvement, doesn’t it? =) This

appears to be even “worse” in case of comments, which are the rarest of all (see the “sum” rows on previous pages). Since we tried to justify the study of outliers, even though the correlation between

shares & comments is weak for both Coursera & Khan Academy, and knowing that we’ve already discussed the most successful posts, a comments/shares scatter plot visualization should emphasize

the reason why we want to know about the common features of the most successful posts. Moreover, we’ll clearly see what a difference a single strong element in content strategy can make.
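A correlation matrix of this kind can be computed directly from per-post counts. The following is a minimal sketch, assuming Pearson correlation (the slide does not state which coefficient was used) & using made-up counts for five posts; `pearson` & `correlation_matrix` are illustrative helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise correlations between all columns, returned as a nested dict."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical per-post counts (assumption); real values are in the Facebook posts datasets.
posts = {
    "likes": [120, 340, 90, 800, 150],
    "comments": [5, 60, 2, 12, 9],
    "shares": [10, 30, 8, 95, 14],
}

matrix = correlation_matrix(posts)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

In this invented sample, likes & shares move together while comments & shares barely do, which mirrors the pattern described above; a single outlying post can change such coefficients considerably, which is one more reason to look at the scatter plots.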

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014

total posts: 176

shares & comments outliers

Scatter plot of Coursera’s Facebook posts (x-axis: comments, 0 to 160; y-axis: shares, 0 to 1,000). Labelled outliers, given as (comments, shares):

(95, 871) On occasion there's a worthwhile online course Coursera doesn't host. This is one of them... [Hogwarts School of Witchcraft & Wizardry]

(137, 89) Due to U.S. export sanctions, we recently had to restrict access to students in Iran, Cuba, Sudan and, temporarily, Syria. (...)

(92, 533) Today we’re excited to announce Coursera Specializations (...), a new type of program that allows students to develop mastery in a specific subject through taking a sequence of courses with a capstone project. (...)

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014

Interactive version of the chart: http://bit.ly/1lHvtoy

total posts: 92

shares & comments outliers

Scatter plot of Khan Academy’s Facebook posts (x-axis: comments, 0 to 900; y-axis: shares, 0 to 1,400). Labelled outliers, given as (comments, shares):

(737, 993) Can you solve this week’s brain teaser? There are two doors, each with a guard. Behind one door is treasure. (...)

(824, 652) Let’s make a deal! Can you solve this week’s brain teaser? (...) [Suppose you're on a game show, and you're given the choice of three doors: (...)]

(138, 1183) It’s time to level the playing field! We're partnering with The College Board, the creators of the SAT (...)

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014

Interactive version of the chart: http://bit.ly/1tAWsos

58% 42%

pt_BR 8%

fr_FR 3.6%

es_ES 3.2%

...

number of fans: 529,115 number of fans: 714,626

70% 30%

77.9% 12.9%

60.8% 14.0%

es_LA 1.1%

pt_BR 1%

sv_SE 0.9%

comment network gender & locale

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014

While our previous Facebook analyses in this section were content-oriented, now we’ll try to create user

typologies based on co-comment networks, which were established around Coursera’s & Khan Academy’s

Facebook posts.

This page offers locale & gender description of commenting users. Same as before, we find ourselves in

the realm of active (commenting) users only (& Facebook only), where we see all public (only) comments –

their final version if edited; we don’t see (theoretically existing) deleted comments. We’ve also already

mentioned that our dataset covers less than six months in the first half of 2014, which might cause a bias

(should/could be contrasted with a more recent dataset). Despite that, the co-comment Facebook network

analysis will provide us with a valuable insight into what kind of content engages different fan subgroups.

Just for my own “peace of mind”, I need to emphasize the obvious: in no way does this page describe the general population of Coursera’s & Khan Academy’s users & supporters.* If we wanted to do that (probably by combining data from many other sources, employing web scraping etc.), it would be suitable to standardize our ratios with respect to the total population of particular countries (to safely identify large communities in smaller countries as well). The locale** & gender here therefore give evidence about the comment networks on the following pages only.

It’s apparent that among the active Facebook commenters of Coursera & Khan Academy, their largest

“customer” demographic segment, US, predominates, followed by another country whose majority

population undoubtedly enjoys open educational resources in English. Yet, we can’t be so sure, especially

about the second statement, since many English-speaking users from non-natively-English-speaking

countries most likely use Facebook with English set as their locale (similarly other world languages).**

As for Coursera, we should point out the relatively large proportion of Brazilian commenters.

Dealing with Khan Academy, gender probably catches our attention. Since Alexa’s network traffic estimates suggested the exact opposite, we can’t take this discussion any further. Yet, while the question of whether the existing “not afraid to speak up” inequalities from the offline world are reflected in the online world as well should be answered by someone with a “gender studies” qualification, we should mention that both Coursera & Khan Academy support (& contribute to) the current trend of highlighting women’s achievements & facilitating women’s emancipation in traditionally “male fields”, such as ICT & science.

* If, however, you wanted to do an estimate of Coursera’s/KA’s world’s “coverage” (rather than particular countries

“proportions” estimate), take a look at the enclosed dataset,

where you can find the locales smaller in numbers as well.

** “Locale” is not “location” but simply a user’s language settings.

n=285
number of fans: 529,115

group | vertices | edges
random 'emotional post' sympathizers | 52 | 1326
Coursera story listeners & story tellers | 37 | 186
Coursera news subscribers: technology & educational transformation | 32 | 229
Coursera news subscribers: new courses | 25 | 87
quiz solvers | 23 | 162

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_CommentNetwork_Edges, Facebook_Coursera_CommentNetwork_Vertices) | January-May 2014

The 2.0 levels of adjacent vertices of Coursera’s Facebook comments network, where the size of a node represents its degree, show us “natural” clusters of active fans (found by the Clauset-Newman-Moore algorithm according to how the vertices are connected to one another), from which we can derive an (active) fan typology based on which kind of posts a particular group interacted with.
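
A minimal sketch of how such a co-comment network could be assembled, assuming (the paper does not spell this out) that two users are linked whenever they commented on the same post. The sample data and the name `build_co_comment_edges` are hypothetical. Under this assumption, a single post with 52 commenters yields 52·51/2 = 1326 links, which is exactly the vertices/edges pair reported for the largest group.

```python
from collections import Counter
from itertools import combinations


def build_co_comment_edges(comments):
    """Map each undirected user pair to the number of posts both commented on.

    `comments` is an iterable of (post_id, user) pairs; this input format
    is an assumption made for the sketch.
    """
    commenters_by_post = {}
    for post_id, user in comments:
        commenters_by_post.setdefault(post_id, set()).add(user)

    edges = Counter()
    for users in commenters_by_post.values():
        # Every pair of commenters on the same post gets one link; sorting
        # makes "A,B" and "B,A" the same undirected edge.
        for pair in combinations(sorted(users), 2):
            edges[pair] += 1
    return edges


# Hypothetical sample data, not taken from the paper's dataset.
comments = [
    ("post1", "ann"), ("post1", "bob"), ("post1", "cy"),
    ("post2", "ann"), ("post2", "bob"),
    ("post3", "dee"),
]
edges = build_co_comment_edges(comments)
print(edges)
```

The counts stored per pair can double as a relationship “weight” when the same two users meet under several posts.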

The largest group was labeled “random 'emotional post' sympathizers”, since it mainly consists of users interacting just with the “Dan, an autistic student” post. The second largest group of co-commenters, “Coursera story listeners & story tellers”, consists of users who commented on stories of people from or around Coursera or on personal stories shared by other fans, and/or shared their own educational experience (in a comment on Coursera’s post asking for a story and/or in a stand-alone post on Coursera’s wall). Group #3 is labeled “Coursera news subscribers: technology & educational transformation”. Those users keep/kept an eye on the way Coursera re-defines higher education & on techn(olog)ical news – e.g. Coursera’s new Android app. The fourth type of commenters, “Coursera news subscribers: new courses”, simply watch Coursera’s Facebook page to be notified about new courses (& comment on them). The commenters in the last larger group that is “entitled” =) to have its own label are “Quiz solvers”. Quiz solvers comment on posts with a clear call to action (questions & quizzes).

The second, third & fourth groups are connected to the first group, indicating that the sentimental post bonded the community together. Storytelling is a powerful technique, indeed. Quiz solvers are partially mixed with both groups interested in Coursera news. Similarly, “story listeners & tellers” and “new courses subscribers” overlap somewhat with “quiz solvers”.

comment network


Let’s now take a look at the most significant users in Coursera’s co-comment network according to some commonly used social network analysis centrality metrics.* As per degree – i.e. the overall connectedness as measured by the number of connections – the user Tarlei is the most significant one (blue profile picture in the middle of the Coursera story listeners & tellers group). We then examined three further metrics: betweenness centrality – the number of shortest paths from all vertices to all others that pass through a node (i.e. to which extent a node acts as the “connector” of the network); eigenvector centrality – the importance of a node based on its connections, where more important nodes are given more weight (i.e. influence based on who you are connected to); and PageRank – a link analysis algorithm similar to eigenvector centrality, used by Google Search to rank websites in its search engine results. Once again we find that Tarlei is the central person regarding comments on Coursera’s posts.**

* Again, in the picture, the size of a node represents its degree.

** Other widely used social network analysis metrics, which we have (intentionally) omitted, are closeness centrality – based on the sum of distances to all other nodes (how easy it is to reach them); & clustering coefficient – how close the vertex and its neighbors are to being a clique (a complete graph).
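
To make two of the centrality metrics above concrete, here is a small sketch that computes degree and unnormalised betweenness centrality (via Brandes' algorithm) for an undirected, unweighted graph stored as a dictionary of neighbour sets. The function names and the toy graph are hypothetical, and the tool used for the paper may scale or normalise its values differently.

```python
from collections import deque


def degree_centrality(adj):
    """Degree = number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
degree = degree_centrality(adj)
betweenness = betweenness_centrality(adj)
print(degree)
print(betweenness)
```

In the toy graph, node "c" has both the highest degree and the highest betweenness, while "d" has a modest degree but still sits on every shortest path to "e", which is the “connector” role described above.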


Unless, of course, it’s not that simple… Taking a look at a few randomly chosen comments by Tarlei like “I got certificate!”, “Coursera is awesome!” or “Downloaded!”, we probably don’t find them expanding the overall discussion on Coursera’s Facebook page. On the other hand, there are other comments like “Please abolish Peer Assessment evaluation. This method is not fair. Keep only quizzes. Students are complaining a lot! (Please refer to the discussion forums.)”, or “We are waiting for a course preparatory to TOEFL examination.”, which actually speak in favor of awarding Tarlei the “central person in Coursera’s comment network” badge. I simply wanted to have some insight into that matter so that we avoid adopting too “mechanistic” a stance towards social network analysis without any qualitative verification of our conclusions.

There are some discrepancies between Tarlei’s Facebook & LinkedIn profiles, and, above all, we are conducting analyses in the online world, therefore we are always (at least) a bit “suspicious” about the authenticity of nearly anything. =) However, Tarlei acts as a big fan of Coursera (& is a frequent user/learner) who surely follows his own (or another institution’s) business interests. Nevertheless, whether his intentions are content marketing, or educating his followers & online neighbourhood, or both, it’s nice to know that the central person in an educational institution’s (Coursera’s) Facebook comments is a user who regularly shares open educational content – here, on the topic of “English for Lawyers”.


Tarlei’s network


Even though, at this moment, we find ourselves discussing details not entirely necessary for our original intention of creating a user typology*, it is worth showing ways of finding influential users within a single Facebook page (similar, for example, to research on a particular Internet forum), as opposed to our third section, “a week of Tweets”, where we’ll search for influential users across the entire tweets universe (similar, for example, to looking for influential bloggers who use the WordPress content management system).**

By removing*** Tarlei, the main character of Coursera’s comments network,

we can learn about who would possibly replace him in his role if he stopped

being an active Coursera commenter. What if Coursera starts a viral

Facebook campaign in accordance with its mission “empower people with

education that will improve their lives, the lives of their families, and the

communities they live in” (see the data from Facebook API we’ve queried at

the beginning of this section), then starts to recruit “role models” among the

active & influential Facebook commenters, but Tarlei says “Sorry, not

interested.”? Time for plan B. Prospective candidates for “replacing” Tarlei

then would be** (highlighted in red) Daniel: degree, & eigenvector centrality;

Stephanie: betweenness centrality; & NanChi: PageRank.

* From the perspective of the “pull model” of education (as opposed to the “push

model”), this is a “sneaky” way of teaching the reader something new without her or

him realizing it (therefore not resisting it =)).

** Please, don’t forget that the objective of this paper, describing Coursera’s & Khan

Academy’s social web brand, is far from discovering concrete influencers (but rather

patterns & trends). If we wanted to do so, our quantitative pre-screening of

prospective influentials should be followed by a detailed excursion into quality &

relevancy of the content they share and also with respect to our goals/intentions &

target groups.

*** This analysis was inspired by a similar one in HANSEN, Derek, Ben SHNEIDERMAN and Marc SMITH. ANALYZING SOCIAL MEDIA NETWORKS WITH NODEXL: INSIGHTS FROM A CONNECTED WORLD. Burlington, MA: Morgan Kaufmann, 2011. ISBN 01-238-2229-7. Great book, by the way! For both practical usage & the general context of social network analysis. Yet, an updated (second) edition would be great.
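
The “plan B” idea of removing the central commenter and looking at who ranks first afterwards can be sketched as follows. This is only an illustration on a hypothetical toy network that ranks by degree alone; the paper compares several metrics, and the names `remove_node` and `rank_by_degree` are made up for this sketch.

```python
def remove_node(adj, node):
    """Return a copy of the adjacency mapping without `node` and its edges."""
    return {v: nbrs - {node} for v, nbrs in adj.items() if v != node}


def rank_by_degree(adj):
    """Nodes sorted from most to least connected (ties broken alphabetically)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


# Hypothetical toy network; "hub" plays the role of the current central commenter.
adj = {
    "hub": {"u1", "u2", "u3", "u4"},
    "u1": {"hub", "u2", "u3"},
    "u2": {"hub", "u1"},
    "u3": {"hub", "u1"},
    "u4": {"hub"},
}
before = rank_by_degree(adj)
after = rank_by_degree(remove_node(adj, "hub"))
print(before[0], "->", after[0])
```

Because `remove_node` builds a new mapping, the original network stays untouched and several “what if this user left” scenarios can be compared side by side.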

n=285

comment network

number of fans: 714,626

group | vertices | edges
random brain teaser solvers | 761 | 289180
brain teasers & challenges enthusiasts | 696 | 229974
SAT (US college admissions standardized test) | 506 | 102513
'Mathematics' brain teasers & challenges enthusiasts | 345 | 26633

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_CommentNetwork_Edges, Facebook_Khan_CommentNetwork_Vertices) | January-May 2014

n=2434

To make the picture clearer, as Khan Academy’s co-commenter community is larger than Coursera’s, the edges were removed (only nodes & their clusters are visible). Once again, the groups were labeled based on the types of posts a particular cluster of users interacted with. Since the additional details were already explained within Coursera’s network, we can now focus on the primary goal only: to derive KA’s Facebook user typology.

Similarly to Coursera, the largest group is made up of

generally less active commenters on Khan Academy’s

posts. The “Random brain teaser solvers” were solving

one of the most popular brain teasers featured by Khan

Academy: three doors, two goats & one car (see on the

previous pages). In general, “Logic brain teasers &

challenges enthusiasts” comment/commented on any

logical tasks posted on KA’s Facebook timeline. The

third largest group, “SAT candidates” are (naturally)

interested in any content related to the SAT

examination. Group #4, “Mathematics brain teasers &

challenges enthusiasts”, regularly share their

calculations in the comments below any mathematical

exercise/challenge.

comment network

metric | Coursera | Khan Academy
number of fans | 529,115 | 714,626
graph type | undirected | undirected
vertices | 285 | 2434
unique edges | 2747 | 743330
edges with duplicates | 8 | 4069
total edges | 2755 | 747399
connected components | 30 | 16
single-vertex connected components | 10 | 6
maximum vertices in a connected component | 190 | 2397
maximum edges in a connected component | 2545 | 747342
maximum geodesic distance (diameter) | 6 | 3
average geodesic distance | 2.733113 | 1.752121
graph density | 0.067976279 | 0.251713211
modularity | 0.501497 | 0.537402

comment network comparison

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera/Khan_CommentNetwork_Edges, Facebook_Coursera/Khan_CommentNetwork_Vertices) | January-May 2014

The somewhat more qualitative insight into both communities deserves to be complemented by several (rather quantitative) metrics allowing us to “objectively” compare both networks.

Comparing the number of nodes (/vertices) & edges (/links) is easy to interpret. So, let’s start with the “edges with duplicates”* row, which tells us that Khan Academy’s fans more often comment repeatedly on the same post (e.g. a reply to another user). Relatively speaking, in proportion to total edges, KA’s commenter is almost twice as prone to comment on a post repeatedly (0.54% of total edges were duplicates) as Coursera’s commenter (0.29% of total edges were duplicates).
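
The two percentages above follow directly from the “edges with duplicates” and “total edges” rows of the comparison table. A short check of the arithmetic (the helper name `duplicate_share` is hypothetical):

```python
def duplicate_share(edges_with_duplicates, total_edges):
    """Percentage of all edges that are flagged as duplicates."""
    return 100 * edges_with_duplicates / total_edges


# Figures taken from the comparison table.
coursera = duplicate_share(8, 2755)
khan = duplicate_share(4069, 747399)
ratio = khan / coursera
print(f"Coursera: {coursera:.2f}%  Khan Academy: {khan:.2f}%  ratio: {ratio:.2f}")
```

The ratio of the two shares comes out a little under two, which is where “almost twice” comes from.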

The previous conclusion that the community of Khan Academy gives the feeling of being more “tied together” can finally be supported by an appropriate metric, “connected components”. Although Khan Academy’s network is larger, it has about half as many connected components as Coursera’s (16 vs. 30), i.e. sets of vertices that are connected to each other but not to the rest of the graph. We know that on Facebook, Coursera shares almost twice as many posts as Khan Academy does. Nevertheless, the discussion is simply not happening, supposedly because educational content is missing directly on Facebook (posts just redirect to courses requiring enrollment).** Despite the fact, seen on the previous pages, that the proportion of KA’s unique fans engaged since 2014 is smaller than Coursera’s, the “hardcore” Facebook epicenter of KA’s fan base seems to be very strong & engages with almost anything Khan Academy shares.
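
For readers who want to see what the “connected components” rows count, here is a minimal sketch on a hypothetical toy graph (a dictionary of neighbour sets). It reports the number of components, the single-vertex components, and the largest component; the function name is made up for this example.

```python
def connected_components(adj):
    """Return a list of vertex sets, one per connected component."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from `start`.
        component = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        components.append(component)
    return components


# Hypothetical toy graph: a chain a-b-c, a pair d-e, and an isolated commenter f.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}, "f": set()}
components = connected_components(adj)
single_vertex = [c for c in components if len(c) == 1]
largest = max(components, key=len)
print(len(components), len(single_vertex), sorted(largest))
```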

We can elaborate upon the discussion in the paragraph above by taking a look at the “single-vertex connected components” row, which gives the number of connected components that have only one vertex (6 for KA vs. 10 for Coursera). Furthermore, there are the average & maximum geodesic distance (the geodesic distance being the number of edges in a shortest path connecting two nodes), which tell us that a KA commenter reaches another commenter in a maximum of three steps (if she or he belongs to the same connected component), compared with six steps for Coursera. And finally, we can top our argument using graph density, the ratio of the number of edges in the graph to the maximum number of edges the graph would have if all the vertices were connected to each other (about 0.25 for KA, but only about 0.07 for Coursera).
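
A small sketch of the two metrics just described, assuming an undirected, unweighted graph. The helper names are hypothetical, and the averaging convention of the tool used for the paper is not stated, so `geodesic_summary` simply averages over connected pairs of distinct nodes. As a side note, the reported Coursera density is reproduced when the 8 duplicated edges are assumed to collapse into 4 distinct ones (2747 + 4 = 2751 edges); that reading is an inference, not something the paper states.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: edges on a shortest path from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def geodesic_summary(adj):
    """Maximum and average shortest-path length over connected pairs of distinct nodes."""
    lengths = [d for s in adj for t, d in distances_from(adj, s).items() if t != s]
    return max(lengths), sum(lengths) / len(lengths)


def density(n_vertices, n_edges):
    """Edges present divided by the n*(n-1)/2 edges possible in an undirected graph."""
    return 2 * n_edges / (n_vertices * (n_vertices - 1))


# Hypothetical toy graph: a simple chain a-b-c-d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
diameter, average = geodesic_summary(adj)
print(diameter, round(average, 3), density(4, 3))

# Assumption: 2751 distinct Coursera edges (see the note above).
print(round(density(285, 2751), 6))
```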

Maximum vertices in a connected component & maximum edges in a connected component tell us: 1) vertices: how large the largest (notional) group of commenters was; 2) edges: how many relations/links were in the (notional) group of commenters with the most edges. Again, it shows that KA’s community is more connected, also thanks to many “intermediaries” with high betweenness centrality (such users are crucial for viral spreading of any content).

The very last metric, modularity, “quality of the grouping”, is almost the same for both. Graphs

with high modularity have dense connections among the vertices within the same group but

sparse connections among vertices in different groups. Knowing the modularity range of [−0.5,1),

we can say that our value being around 0.5 means quite clearly defined groups.
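
To illustrate what modularity measures, the sketch below evaluates the standard Newman formula Q = Σ_c [ L_c/m − (d_c/2m)² ] for a given grouping, where m is the number of edges, L_c the number of edges inside group c, and d_c the summed degree of its nodes. The toy graph and the function name are hypothetical, and the sketch only scores a grouping; it does not search for the best one.

```python
def modularity(edges, group_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    `edges` is a list of (u, v) pairs; `group_of` maps each node to a group label.
    """
    m = len(edges)
    internal = {}    # number of edges with both ends inside the group
    degree_sum = {}  # summed degree of the group's nodes
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        degree_sum[gu] = degree_sum.get(gu, 0) + 1
        degree_sum[gv] = degree_sum.get(gv, 0) + 1
        if gu == gv:
            internal[gu] = internal.get(gu, 0) + 1
    return sum(
        internal.get(g, 0) / m - (d / (2 * m)) ** 2
        for g, d in degree_sum.items()
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_group = {node: 1 for node in two_groups}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the toy graph into its two triangles scores clearly above zero, whereas lumping everything into one group scores zero, which matches the reading that higher values mean more clearly separated groups.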

* As for our undirected graph, "A,B" & "B,A“ relationships are considered duplicates. Such duplicates can be

also used as “weight” of a relationship. We’ll do that in one of the following analyses.

** My immediate idea for improvement would be, for example, sharing (in the form of pictures) sample

exercises and/or quiz questions from Coursera courses, which would drive engagement (especially

comments).

[Histograms: frequency (y-axis) of degree & of betweenness centrality (x-axis) in each network. Summary statistics:]

metric | Coursera (number of fans: 529,115) | Khan Academy (number of fans: 714,626)
minimum degree | 0 | 0
maximum degree | 95 | 2296
average degree | 19.305 | 612.418
median degree | 11.000 | 709.000
minimum betweenness centrality | 0.000 | 0.000
maximum betweenness centrality | 5361.795 | 252579.402
average betweenness centrality | 111.800 | 888.236
median betweenness centrality | 0.000 | 0.000

comment network comparison


In order to make reasonable relative (not absolute) comparisons, this detailed comparison of metrics invites us to look at the graphs & the distribution of their bars (also those that are very small & therefore less visible), rather than to focus on the numbers. The “x” axis represents the value of a metric (from 0 to its maximum, from left to right), while the “y” axis shows the frequency of that value in the network.

Both degree & betweenness centrality seem to verify what was concluded before. Khan Academy is more tied together thanks to its solid core of users, while the discussion on Coursera’s Facebook page happens (more frequently, in comparison) thanks to influential “connectors” with high betweenness centrality.

[Histograms: frequency (y-axis) of closeness centrality & of eigenvector centrality (x-axis) in each network. Summary statistics:]

metric | Coursera (number of fans: 529,115) | Khan Academy (number of fans: 714,626)
minimum closeness centrality | 0.000 | 0.000
maximum closeness centrality | 1.000 | 1.000
average closeness centrality | 0.098 | 0.006
median closeness centrality | 0.002 | 0.000
minimum eigenvector centrality | 0.000 | 0.000
maximum eigenvector centrality | 0.017 | 0.001
average eigenvector centrality | 0.004 | 0.000
median eigenvector centrality | 0.000 | 0.000

comment network comparison


Dealing with the other compared metrics, we have to bear in mind that both networks include disconnected components, which push their values beyond interpretability, meaningfulness & practical applicability. It would be more useful – especially in the case of eigenvector centrality, PageRank & clustering coefficient – to focus (separately) on individual users & groups, to find the most influential users (as we did a bit with Coursera). Since such an analysis would not contribute much to the overall objective of this research, this & the following page simply illustrate how many influential users, & how influential ones, we could find in both networks according to the various metrics we’ve briefly defined before.

[Histograms: frequency (y-axis) of PageRank & of clustering coefficient (x-axis) in each network. Summary statistics:]

metric | Coursera (number of fans: 529,115) | Khan Academy (number of fans: 714,626)
minimum pagerank | 0.000 | 0.000
maximum pagerank | 3.511 | 5.613
average pagerank | 0.965 | 0.998
median pagerank | 1.000 | 1.021
minimum clustering coefficient | 0.000 | 0.000
maximum clustering coefficient | 1.000 | 1.000
average clustering coefficient | 0.891 | 0.970
median clustering coefficient | 1.000 | 1.000

comment network comparison


interest profiles

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: N/A) | January-May 2014

Bing search results analysis, online news articles analysis, YouTube co-comment network of videos, and analysis of top Reddit posts are yet to come. But before that, for the following 6+1

pages including this one, let’s yet stick for a while to Facebook & the level of detail we are able to obtain about a single Facebook user, assuming that we have the necessary permissions,

obtained, for example, within the Facebook (Google/LinkedIn/… for another SNS data) login permission dialog. We might be interested in the socio-demographic data about users & mutual

relationships among them (friends & family, age, location, education, occupation etc.), their interest profiles based on Facebook pages they liked, and so on. All of that can be easily

accessed. However, assuming “fair play”, we are able to obtain such data (via Facebook Graph API) only if the studied population is part of our friends network, or if Facebook users grant us

permission to access such data (via the aforementioned Facebook permission dialog being part of your application)*, and/or collecting the publicly available user data in the way we did in our

previous analyses, beginning with an active user (a user who publicly liked, commented, or shared something of our interest) & obtaining her or his id which we then use to collect other data

she or he set as public (does not restrict their visibility with privacy settings). Nevertheless, the third way of collecting data actually (unfortunately) does not provide us with everything a user

set as public. Even though we are technically able to see someone’s public data in our web browser, it doesn’t mean the Facebook Graph API allows us to query it. Sure, there are some very

simple workarounds like screen scraping (& automating it), but such practice violates Facebook’s Automated Data Collection Terms for humans, as well as robots.txt for machines. So, even

though it would be awesome if this page provided a comparison of most frequent pages in different categories that Coursera’s & Khan Academy’s (active) fans liked – i.e. comparison of their

interest profiles – we will play by the rules and instead of that take a look at my personal friends network. Protecting confidentiality & anonymity is not a simple task on the social web.

Therefore I hope that providing aggregated data from my – otherwise almost completely public – personal profile about my connections, deprived of identifying information on an individual

human being, will not damage any of my friends – not even in the name of science! =)

The purpose of this part of the text is to illustrate the simple collection of data from social media user profiles, for example, in order to enhance an educational tool’s recommender algorithm

personalizing education (and/or complement behavioral data). Though, as you will see on the following pages, Facebook is not a social network inclined towards educational content, I won’t

be a chicken cowardly fleeing to LinkedIn – e.g., recommending interdisciplinary educational resources according to one’s professional experience & professional experience of one’s

network; Twitter – generally more rich SNS with respect to news & educational content; or YouTube – as we have already seen, if we omit all funny videos of cats etc., an ideal medium for

education; but I’ll take the “mainstreamest” social medium that is – exactly for the reason of being “mainstreamest” which reflects its very high population reach – important to study in relation

to education as well.

* However, you can take advantage of (“/abuse” =)) data about all 1.3 billion Facebook users whenever you want – provided that you have enough money –

through Facebook’s targeted advertising, which every single Facebook user agreed to the moment she/he started using the service (Terms of Service).


personal network

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: www.wolframalpha.com/facebook/) | June 2014

Not to “reinvent the wheel”, I am happy to aggregate the available demographic data of my 375 Facebook friends using Wolfram Alpha’s tool.* Since this is only a demonstration of obtainable data, not an analysis leading to answering our research question, there’s no need to interpret that data (in fact we might need to do that later in connection with the analysis on the next page).

* …aaand I’ve just violated my obligation of protecting my friends’ data by providing it to a third party. I told you it’s super simple (and you possibly do that on a daily basis using Facebook, Google or other services, and/or using your mobile device full of your friends’ contacts, photos etc.). To digress, if you are looking for an alternative search engine, don’t hesitate to give the aforementioned Wolfram Alpha a try; instead of returning pages based on keyword analysis, it computes search results using curated data. And what about the Wolfram (programming) language! Speaking of alternative search engines, I have one more recommendation for those who do not support the current trend of personalized search results: try DuckDuckGo.

personal network

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: developers.facebook.com/docs/graph-api) | June 2014

Another from our miniseries of analyses “just for the sake of showing what is possible to obtain” (& we’ll also take that into account in our overall conclusions) shows – in an incredibly complicated graph =) – my personal 2.0 network of my & my friends’ Facebook timelines, with my recent interactions highlighted in red (the lines coming out of me, the ego of the network). The size of a node depends on its betweenness centrality, which accentuates connectors/intermediaries in our network. The thicker & darker the line, the stronger the relationship.* The less transparent a node is, the more recent the user’s profile update time (i.e. a more recently active user). Only the largest & “sufficiently” anonymous groups were given a name.

Even though the largest clusters are reflections of “limited private offline” activities (predecessor of “massive open online” stuff =)), schools & summer holidays, the picture also nicely illustrates influential users around which independent clusters formed. The potential reach of my content shared on Facebook & the flow of information is therefore significantly influenced, for example, by a friend of mine who works as an instructor in a dance school (a community on whose very periphery I stay =)); international students from my social media marketing course at the Charles University in Prague; people whom I met totally by chance (e.g. on a trip or within an online environment); distant relatives (as measured by relationship strength) or distant friends (as measured by geographical distance); and so on. The “band fans” group also quite clearly shows that there was a band member whose connections made up 90% of the people who turned up for concerts of our (currently not performing) amateur band. On the other hand, on the subject of planning a potential viral educational campaign, it’s evident that my Facebook network is primarily about informal relationships. Professional and/or academic contacts (possibly mineable from services like LinkedIn, SlideShare, Academia.edu etc.) are simply missing.

Just to have some fun: from the keyword statistics of my (private) Facebook network’s posts, it’s very easy to find out what technique we use once a year to compensate for the lack of interaction with our tons of Facebook friends (& also to see the influence of Facebook’s design features, notifications in particular), since the collocation “happy birthday” (& its variations) occupies all the top ranks of the charts.

A sociologist’s heart will then be pleased by the fact that the person I share the most connections with is my sister.

* Derived from the number of mutual public interactions

– an author of a post, a user tagged, a comment, a like.

[Figure panels: my subgraph; a dance school instructor’s subgraph]

personal network


I would also like to add that interpreting such a large network using visualization
(rather than computation) might not always be the most appropriate solution.
Depending on our intentions, we might also want to “cut off” what we are not
interested in, cluster/aggregate nodes into summarizing “supernodes” to emphasize
the important relationships, etc. I'm not about to try it here, since it’s not
directly related to our research. However, it’s not difficult to imagine using
such data for targeting educational content and/or planning its spread (and/or
any other campaign planning), since we are able to assemble quite a clear graph
of how to reach an individual and/or a group. The reality definitely won’t be as
perfect as the one in our picture, yet I want to highlight the fact that the data
is out there & it depends solely on us whether we’ll use it to increase our “brand
new yogurt” sales, or to personalize education & help its transformation via
social media. Employing the influence of peers & personal social networks might
help us get closer to the “ideal”, where social media users interact with
educational content rather than with “funny pictures of cats” (or, at least, with
both =)), which would let the Facebook News Feed algorithm (or any other
recommender algorithm) take care of the rest. Education transformation, although
not complete, would take a major (& inexpensive) step forward if people were
exposed to educational content on a daily basis (once again, even if it means we
need to “lace” such content with funny cats =)).
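Although the aggregation into “supernodes” is not carried out in this text, the idea can be sketched in a few lines. The sketch assumes the friendship graph is available as a list of name pairs and that each person has already been assigned to a group; the names, the groups other than “band fans”, and the function `aggregate_to_supernodes` are hypothetical.

```python
from collections import Counter

def aggregate_to_supernodes(edges, group_of):
    """Collapse person-to-person edges into weighted group-to-group edges."""
    weights = Counter()
    for a, b in edges:
        ga, gb = group_of[a], group_of[b]
        if ga != gb:  # edges inside one group disappear into the supernode
            weights[tuple(sorted((ga, gb)))] += 1
    return dict(weights)

# Hypothetical toy network, invented for illustration only.
edges = [("ann", "bob"), ("ann", "cid"), ("bob", "dan"),
         ("cid", "dan"), ("dan", "eve")]
group_of = {"ann": "band fans", "bob": "band fans",
            "cid": "school", "dan": "school", "eve": "family"}

print(aggregate_to_supernodes(edges, group_of))
```

The resulting weights say how strongly two groups are connected, which is often easier to read than the full node-level picture.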

interest profiles

top 10 common page like categories in my personal network

community 1064

musician/band 580

tv show 227

local business 210

movie 207

website 198

athlete 193

public figure 184

food/beverages 177

non-profit organization 173

top 10 common page likes in my personal network (English annotation)

Nejlepší zábava (Best Entertainment, community entertainment page)

Jaromír Jágr (Jaromir Jagr, Czech professional ice hockey player official page)

Viral Vines (Viral Vines, community entertainment page)

You.bo (You.bo, proprietary entertainment page)

Český olympijský tým (Czech Team, Czech Olympic games team official page)

15+ (15+, community entertainment page)

Užívám si života naplno (I Live My Life to the Max, community entertainment page)

Žiješ jen jednou (You Only Live Once, community entertainment page)

House (House, US TV show official page)

Partička (Crew, Czech TV show official page)


Let’s continue our excursion by analyzing the most common likes among my friends (not including me).
The “top 10” table clearly illustrates the bias of the most popular content towards entertainment. This seems to be a
mainstream trend on Facebook; however, it’s also influenced by the fact that there are many teenagers in my Facebook
network from the time I worked as a teacher (ICT & English) and outdoor & summer camp instructor. Still, the
statistics regarding education are quite disappointing… Trying to find new hope in at least several common page likes that could be
labeled “educational” – since I’m, apparently, an atypical node (& outlier) in my personal social network – I finally find MIT
OpenCourseWare. That is a good starting point for influencing my Facebook surroundings, as we know – from social network
theory – that innovation tends to spread from the network periphery. Or we might want to deal with it “the other way around”:
“stick” educational content to something that is already popular. Since Khan Academy would probably be more suitable for
my, generally younger, network & since the most popular Facebook page in my network is “Nejlepší zábava” (“Best
Entertainment”), we might want to customize the educational content so that it “fits” the culture (/content strategy) of the
entertainment page. To do so, I’ll borrow a popular tweet from the fourth section of this text, “a week of tweets” (see below).
Isn’t that a nice example of “pull marketing/education” (as opposed to “push marketing/education”)?
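Counting the most common page likes across friends can be sketched as below. This is a minimal illustration, assuming the likes have already been collected into a mapping from friend to liked pages; the friend identifiers, their likes, and the function `common_likes` are hypothetical (only the page names are borrowed from the text).

```python
from collections import Counter

def common_likes(likes_by_friend, top=10):
    """Rank pages by the number of friends who like them."""
    counts = Counter()
    for pages in likes_by_friend.values():
        counts.update(set(pages))  # count each page at most once per friend
    return counts.most_common(top)

# Hypothetical sample, invented for illustration only.
likes_by_friend = {
    "friend_a": ["Nejlepší zábava", "House"],
    "friend_b": ["Nejlepší zábava", "MIT OpenCourseWare"],
    "friend_c": ["Nejlepší zábava", "House", "House"],
}

print(common_likes(likes_by_friend, top=2))
```

The same counter, fed with page categories instead of page names, would give the category ranking.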

n = 375 friends

[context tweet of the picture]

“When you get 4 in a row on khan academy then miss the last one”

[a hint for those who simply “don’t get it” as they are probably not Khan Academy users – but

beware of the fact that “Explaining a joke is like dissecting a frog. You understand it better but

the frog dies in the process.” (E.B. White)]

hint: in order to complete a Khan Academy exercise, you need 5 correct answers in a row

[since this is a meme, a commonly accepted form of content among teenagers, “nerds” aware of Khan
Academy might become “stars” by explaining it & possibly also reinforce their social status,
slowly turning into “role models” for their peers, spreading educational content in their social
environment …ok, ok, I’m coming down to earth =)]


Similarly to Khan Academy’s “LeBron Asks”* miniseries, one might want to use such social media data – e.g.
obtained by a Facebook/Twitter/Google+/LinkedIn login app – to popularize education, as a door-opening
moment and/or before a youngster’s intrinsic motivation fully develops; also to personalize education, suggest
educational resources etc. We might also want to take it further: not just LeBron asking while indirectly suggesting
“you can do math, I’ll play basketball” (similarly Jaromir Jagr & learning about the physics of ice hockey, or
Dr. House & medical diagnosis education), but actually making education a natural part of one’s life rather than
something that’s separated from it. “If you can't beat them, join them”.** We are aware of the influence of athletes,
musicians & actors. And since there are too many of them, deriving (personal) interest profiles from the social
web can serve as a powerful “filter”.***

I’ve recently liked the “Open Source for You” & “ProgrammableWeb” Facebook pages. Combined with my

previous educational activity, what can a smart recommender algorithm make of that?

* LeBron James is a US professional basketball player.

** Believe me that as a person who loves self-driven education, but has little taste for traditional media, popular music & film

production, commercialization of sports, and other products of mainstream popular culture that surrounds us every day & often

distracts us from more meaningful activities, it’s not easy for me to justify employment of celebrities in education (if only they

were scientific celebs =)). As well as with quantitative metrics in social media research, working with celebrities in education

and/or popularizing education in the way the YouTube “independent scene” does – see, for example, learning just got awesome

– might be a great starting point but a terrible finish line.

*** It serves like that just within the scope of our discussion. We can probably imagine several other ways to use the
enormous universe of social web data.

interest profiles

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: www.last.fm/api) | June 2014

To conclude our little (& far from being comprehensive) insight

into Facebook data, let’s actually leave Facebook, since the

power of “big data” also resides in combining information from

different sources.* The most common likes among my friends

(below the “top 10”) also include many music artists & bands –

which is not surprising as many people co-define their

personalities identifying themselves with the music they listen

to. Jared Leto is the most common one in my personal

network. Even if we won’t create educational content on how a

microphone works (electronics), how sound waves work

(physics), how vocal cords work (biology), structure of lyrics

based on music genre (linguistics), music subcultures

(sociology), how celebrities are made (media studies), etc.; we

might want to look for alternative artists for our hypothetical
educational campaign, just in case Jared is too busy (as if
that would be our biggest problem =)). Last.fm can give us a

clue.

* There are also several issues related to that: particularly those in

two general & related categories of “privacy & security” and “ethics”.

One doesn’t need to be an expert hacker to obtain geodata, file

metadata, exploit unencrypted communication etc. And one doesn’t

even have to take the trouble of hacking, since publicly available

data about individuals allow us to produce almost anything

“to measure” (/tailored). Which one are we talking about here,

phishing, or marketing? I always get these two confused. =)

query: coursera

coursera
coursera - google+
stanford online
coursera, the other stanford mooc startup, officially ...
coursera - wikipedia, la enciclopedia libre
coursera - android apps on google play
coursera plans to announce university partners for online ...
coursera | mimo školu – vzdělávejte se po svém
free massive online education provider, coursera, begins ...
moocs on the move: how coursera is disrupting the ...
coursera - wikipedia, the free encyclopedia
coursera on the app store on itunes
completely free online classes? coursera.org now offering ...
stanford professors launch online university coursera ...
investors put $43 million more into mooc provider coursera ...
coursera blog
coursera - we cover the revolution taking place in ...
welcome to coursera - youtube
coursera credentials today, full coursera-powered degrees ...
duke to offer free courses on internet | duke today
coursera (coursera) on twitter
how coursera, a free online education service, will school ...
coursera adds 29 schools, 90 courses and 3 new languages ...
coursera | linkedin
coursera | linkedin
coursera - youtube
coursera help
daphne koller: what we're learning from online education ...
coursera | 50 best websites 2012 | time.com
online education startup coursera comes of age, announces ...
coursera
coursera meetups everywhere - meetup - find your people ...
coursera - quora - your best source for knowledge
home | stanford startup engineering | cme/cs184 | winter 2013
coursera - fortune management & career blog
coursera - mountain view, california - startup, computer ...
is coursera the beginning of the end for traditional ...
most audacious companies: coursera | inc.com
coursera - mountain view, california - startup, computer ...
home - andrew ng - stanford computer science
consortium of colleges takes online education to new level ...
coursera | crunchbase
coursera's fee-based course option @insidehighered
more moolah for moocs -- coursera raises another $20m ...
coursera $43 million series b round - business insider
coursera - mountain view, california - startup, computer ...
coursera, udacity, edx: will free online ivy league ...
online college course company coursera partners with 12 ...
coursera for android app now available for download
coursera help | mobile faq

query: khan academy

khan academy
khan academy
life at ka - khan academy
khan academy | windows phone apps+games store (united states)
khan academy blends its youtube approach with classrooms ...
knowledge map | khan academy
smarthistory: a multimedia web-book about art and art history
salman khan (educator) - wikipedia, the free encyclopedia
khan academy: the hype and the reality - the answer sheet ...
khan academy - google+
khan academy - wikipedia, the free encyclopedia
khan academy
khan academy - mountain view, ca - education | facebook
how khan academy is changing the rules of education ...
khan academy - mountain view, ca - education | facebook
khan academy
khan academy
khan academy: the future of education? - 60 minutes videos ...
khan academy
viewer for khan academy - android apps on google play
khan academy - youtube
khan academy – wikipedie
khanova škola
khan academy en français | cours de maths gratuits !
khan academy : the future of education? - cbs news
the kahn academy
one man, one computer, 10 million students: how khan ...
khan academy a „převrácená“ třída
khan academy | crunchbase
khan academy - practical money skills
khan academy chemistry - youtube
khan academy review & rating | pcmag.com
sal khan: bill gates' favorite teacher - aug. 24, 2010
programación de computadoras | khan academy
khan academy app for windows in the windows store
khan academy | where do i begin? how should i get star...
the trouble with khan academy - casting out nines - the ...
khan academy: the man who wants to teach the world - telegraph
what is khan academy? - definition from whatis.com
khan academy on the app store on itunes
khan academy | portal - desk.com
khanapp - mobile app for khan academy
khan academy launches the future of computer science ...
khan academy: a name you need to know in 2011 - forbes
khan academy online store
salman khan: let's use video to reinvent education | talk ...
khan academy (khanacademy) on twitter
khan academy gets rare partnership to close wealth gap in ...
khan academy avec khanacademy.fr

bing top 50 search results

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Bing_Khan_SearchResults.csv, Bing_Coursera_SearchResults.csv) | June 2014

Before we take a quick look at Coursera’s & Khan Academy’s inbound links, the top 3 reddit posts ever about them, and their YouTube video comment networks – and thereby conclude this chapter, which is followed by the “a week of tweets” analyses – we’ll step
back for a moment from social media to the point where we started in the first section of this text, the overall online presence of Coursera & Khan Academy. Microsoft Bing’s API provides us with search results for the “coursera” & “khan academy” queries
(on this page), followed by the online news articles it captured, which we can use for content analysis – in particular frequency analysis, named entity recognition & automatic text abstraction – to estimate the general perception of both
institutions in the minds of English-speaking Internet users, as defined by online media.

What will be one of the first contacts of a person with Coursera and/or Khan Academy if she or he does not know much about them? The answer is “the search results of a web search engine”. The top 50 Bing search results, ordered by columns
(so the first column is the most important, given that it is the very first page of the search results), show us that – and we are probably not surprised by the fact – there are Wikipedia articles, social media profiles, and news articles among
the top pages. While Coursera’s blog occupies the fourth search rank, KA’s blog is a bit “sunken”* – not such a big deal though, since, as we’ve already seen, Coursera’s & Khan Academy’s blog articles get traffic from shared links on their
social media profiles (where fans who follow the institution subscribe to all of its social media content, which possibly appears on their personal SNS homepages). Regarding Coursera, there’s also its new Android app, whereas Khan Academy is
associated with its Windows Store & iTunes Store apps (both frequently discussed topics, as you’ll see in the online news article analysis). You might want to examine the other search results as well, but since a user would need to “click through” to get
to them, these are only important for a cross-search-engine comparison.**

Even though it is very likely a product of search engine personalization – in a different geographical location, you are likely to get different search results (usually based on your IP address & other criteria) – my nationality won’t let me skip a
Czech website with a curated list of high-quality online resources for self-driven education, the “mimo školu” search result; and, this time in the right table, “khanova škola”, a Czech volunteer translator community curating a website with
Khan Academy’s micro lectures subtitled in Czech.

* Verified on Google, probably caused by the choice of keywords: missing the word “blog” & direct connection to Khan Academy, as e.g. in “khan academy’s official blog”.

** For example, on Google, each institution’s Google+ page is – no way! =) – among the top search results and/or, in the shape of a huge Google+ snippet, in the right column (at the top of the screen).

bing news articles

Coursera: total news articles: 28 (2012-2014) | Khan Academy: total news articles: 104 (2009-2014)

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Bing_Khan_News.txt, Bing_Coursera_News.txt) | June 2014

It’s time to explore the promised online news articles. In June 2014, the
Bing API returned 28 news articles (2012-2014) about Coursera & 104 articles
(2009-2014) about Khan Academy (the “older” of the two institutions).*
Although making generalizations built around an n<30 sample might outrage a
stern statistician, we can justify it given that Bing’s news articles API
tends to return results from larger publishers, as those are easily
recognized as “news articles” by a search engine. On the other
hand, there are no blog articles** (meaning “grassroots bloggers”,
not an institution’s blog) in our sample. On that account, let's
conclude that our news article analysis represents the overall
picture of Coursera & Khan Academy produced by larger online
publishers.

The line chart on the left shows us that our results will be generally
skewed towards the present, which is not only due to the growing
popularity of both sites, but also because providing recent search
results is what web search engines generally do, and therefore what
they are optimized for. Anyway, this is completely satisfactory for
answering our research question.

* Which is obviously a tiny fraction of what was published about Coursera

& Khan Academy. Yet, it’s enough for our forthcoming analyses’ points.

** For obtaining blog articles, using your own / a third party web crawler,

analyzing the Common Crawl Corpus, or using a service like spinn3r

might be necessary.

bing news articles: most active publishers

Coursera: The Wall Street Journal, Forbes, New York Times

Khan Academy: Education Week, New York Times


If you still remember the Twitter “is following” (/friends) & recent tweets analyses, here are our old friends,
The New York Times, The Wall Street Journal & Forbes (this time with reference to Coursera). Apart from
verifying our previous conclusions, Education Week is the top publisher of the detected online news articles
about Khan Academy, which underlines the fact that – in comparison with Coursera –
more education-specialized sites are publishing about Khan Academy.*

* Also supported by the data of the following inbound links analysis & the “a week of tweets” chapter (see the next pages).

news article titles


Even a very simple* frequency analysis of the news article titles illustrates well the (media co-created) brand associations with Coursera & Khan Academy, which we’ll further develop on the following page by employing named entity recognition. There is no
need to repeat the keywords a reader of this text can see in the word clouds. Nevertheless, in relation to Coursera, I’d like to point out “online”, “education”, “courses”, “massive”, “future”, “learning” & “provider”; also “restores” & “iran”, which we’ve
already seen on Facebook (US sanctions); and names of famous universities or excerpts from names of particular courses featured on Coursera.** As for Khan Academy, even though there are also many familiar topics (partnerships, SAT, and so
on), the visualization allows us to emphasize what we’ll see more clearly on the next page: how much the name of the founder of Khan Academy, Salman Khan, is favoured (within the content of the articles) in connection with his “revolutionary”
pathway towards transforming education employing video & the flipped classroom model. It’s not really surprising that Khan Academy is built around Salman Khan – he founded it & he created the majority of the micro lectures available on KA, and his
face (or rather voice =)) would be recognized in many US schools (& his reach spreads around the world). Nevertheless, given the popularity of Khan Academy, its large & growing volunteer community etc., it is fascinating to realize how much the
whole project depends on “the” one man – not that we can’t imagine how it might work in the future (and there are several models, also those that haven’t been invented yet). Although Khan Academy is far from being a “one-man show” anymore,
after some hints in our previous analyses, there’s finally a solid argument for labeling Khan Academy as the “more centralized” of the two institutions & another reason why KA’s community appears to be more tied together.

* Just lowercase conversion, but not dealing with plural forms, verbs in third person, verb tenses, etc. (however, this was taken care of in the more detailed tweets text mining in the fourth chapter of this text).

** Just for those who wonder what the “baidu” keyword in Coursera’s cloud means: Baidu is a Chinese web services company, whose Chief Scientist (since May 2014) is Coursera's co-founder, Andrew Ng. Baidu.com is the leading Chinese language search engine.
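The “very simple” title frequency analysis (lowercase conversion only, no handling of plural forms or verb tenses) might look roughly like the sketch below. The function name `title_word_frequencies` is hypothetical, and the sample strings are borrowed from the Bing search results listed earlier, standing in for the news article titles, which are not reproduced in this text.

```python
from collections import Counter

PUNCTUATION = ".,:;!?()[]|\"'-"

def title_word_frequencies(titles):
    """Count lowercase words in titles; no stemming, so plurals stay separate."""
    counts = Counter()
    for title in titles:
        for token in title.lower().split():
            word = token.strip(PUNCTUATION)
            if word:  # skip tokens that were pure punctuation, e.g. "..." or "|"
                counts[word] += 1
    return counts

# Stand-in titles taken from the search result listing, not the news dataset.
titles = [
    "coursera adds 29 schools, 90 courses and 3 new languages ...",
    "online college course company coursera partners with 12 ...",
    "duke to offer free courses on internet | duke today",
]

freq = title_word_frequencies(titles)
print(freq.most_common(5))
```

Note that “course” and “courses” are counted separately here, which is exactly the limitation the footnote admits.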

named entity recognition

Coursera: MOOC(s), University, Andrew Ng, edX, Harvard, Stanford, Yale, iTunes, Android, iOS, Daphne Koller, technology, California, Iran

Khan Academy: Salman Khan, YouTube, Bill Gates, College Board, NASA


After (automated) scraping of the text of all (132 in total) articles, normalizing the data & using NLTK 3.0 (Natural Language Toolkit, a Python library) to analyze the content of the articles, we are able to run named entity recognition
in order to find the most discussed entities (“organization” or “person” in our case) regarding both institutions. After removing the generic terms “Coursera” & “Khan Academy”, we can create a web/radar chart displaying the online media sphere brand
overview of both institutions, with regard to what & who is discussed (the charts show the most frequent entities).

The authors of online news articles about Coursera, above all, discuss “MOOC(s)” & their relation to the “Universities” offering them on Coursera, on other platforms – such as MIT’s & Harvard’s “edX” (fourth most frequent entity)
– and/or “on their own” (using their own platform); especially “Harvard”, “Stanford” & “Yale”. Both Coursera founders, “Andrew Ng” & “Daphne Koller”, are also frequently discussed.* Other frequently mentioned entities in news articles about Coursera
are the headquarters of the company in “California”, “technology” shaping current education, “Iran” & US sanctions (which we’ve already discussed), and platforms offering Coursera’s new app (“Android”, “iOS”, “iTunes”).

Dealing with Khan Academy, after the discussion on the previous page, we can sum it all up in brief. Online articles about Khan Academy are mainly articles about Salman Khan. Much less frequently – but still most frequently among the remaining
discovered entities – there are the topics of “YouTube” (where Khan Academy’s lectures are stored), Khan Academy’s partners (in particular, “NASA” & “College Board”), and (besides his “other roles” =)) a supporter of Khan Academy & its donor, Bill
Gates, whom we’ve already met on Twitter.

* Just a footnote “inside joke” link for those obsessed with humanities, who examined the one-third difference between Andrew’s & Daphne’s frequencies in the web chart. =)
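The step after the entity recognition itself (dropping the generic terms and ranking what remains) can be sketched as follows. This is not the original script: it assumes the tagger has already produced (text, label) pairs, and the label names, the alias table, the sample pairs and the function `tally_entities` are all assumptions made for illustration.

```python
from collections import Counter

GENERIC_TERMS = {"coursera", "khan academy"}   # removed before charting
ALIASES = {"sal khan": "Salman Khan"}          # hypothetical alias table

def tally_entities(tagged, keep_labels=("PERSON", "ORGANIZATION"), top=5):
    """Rank recognized entities, ignoring generic terms and unwanted labels."""
    counts = Counter()
    for text, label in tagged:
        name = text.strip()
        key = name.lower()
        if label not in keep_labels or key in GENERIC_TERMS:
            continue
        counts[ALIASES.get(key, name)] += 1
    return counts.most_common(top)

# Hypothetical tagger output, invented for illustration only.
tagged = [
    ("Khan Academy", "ORGANIZATION"), ("Salman Khan", "PERSON"),
    ("Sal Khan", "PERSON"), ("YouTube", "ORGANIZATION"),
    ("Bill Gates", "PERSON"), ("Salman Khan", "PERSON"),
    ("California", "GPE"), ("YouTube", "ORGANIZATION"),
]

print(tally_entities(tagged))
```

The resulting (entity, count) pairs are what a web/radar chart would then display.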


articles summary

total news articles: 104 (2009-2014)

“Here’s the News Hour’s review of the Khan Academy. In case you’re
wondering about the breadth of the topics the academy covers, here’s an
overview narrated by Khan himself. And here’s an example lesson on the
mathematical concept of limits. The Khan Academy is a not-for-profit
business but it has started to experiment with generating some revenue so
that Khan can expand the topics he covers and the detail in which he
covers them. Listening to Sal Khan, founder of the Khan Academy, speak
on stage to several hundred attendees at the 5th Anniversary Gala last
week for Innosight Institute — the non-profit that I co-founded — I thought
about how Clayton Christensen and I have speculated for some time that
the long-term future of much of educational content will be in the business
model of a facilitated network, a platform in which users essentially
exchange modular pieces of educational content with each other.”


To “have a quick break” from facts & figures: besides the entities discussed
within the online news articles, we might also be interested in a summary of
their content/text (/overall abstract). Although there are several libraries
available for this purpose in Python, to stick with the spirit of writing a
report on education, I’ve made some educational effort myself. In order to learn
more about automatic text summarization, which I – so far – was able to
describe in theory only*, I’ve written a very simple text summarizer based on
the 50 lines of code of Miki Tebeka’s Simple Text Summarizer**, and I ran the
script on the corpus of all 104 online news articles about Khan Academy***
scraped from the web (see on the right).

Within the (more academic) debate about the “abstraction” vs. “extraction”
approaches to automatic text summarization, the result on the right is the
(somewhat simpler) “extraction”, which is based on the premise that important
sentences are those that contain important words – i.e. the most frequent words
after removing stop words.

* Or I often “cheated” given the awesomeness of Python libraries

(this time an inside joke for programmers =)).

** This is the true meaning of 21st Century Education prosumers, being both educators

& learners, exchanging (under various conditions) educational content with each other.

*** The Coursera summary was not satisfactory, not because of the smaller sample but
rather due to the diversity of topics discussed within articles about Coursera
(as opposed to Khan Academy’s coherent “story”).
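The “extraction” premise (important sentences are those containing the most frequent non-stop words) can be illustrated with the minimal sketch below. It is neither the script used for this report nor Miki Tebeka’s code; the tiny stop word list, the naive sentence splitting, the sample text and the function names are assumptions made for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stop word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "on", "in", "to", "is", "was", "it"}

def content_words(text):
    """Lowercase words of a text with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))
    scores = [sum(freq[w] for w in content_words(s)) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore the original sentence order
    return " ".join(sentences[i] for i in keep)

# Hypothetical text, invented for illustration only.
sample_text = (
    "Khan Academy offers free video lessons. "
    "The weather was nice yesterday. "
    "Video lessons on Khan Academy cover math and science."
)

print(summarize(sample_text, max_sentences=2))
```

Because the score is a plain sum, longer sentences are favoured; normalizing by sentence length would be one possible refinement.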

inbound links

Coursera: total links 1,242

top domains (n=6-12): slashdot, wordpress, cnn, gamezone, tumblr, typepad, patch

other frequent: cnet, msn, stanford, duke, dzone, feedsportal, nytimes, upenn, usnews, wsj

n=2: abc, ala, allaboutjazz, bloomberg, bostonmagazine, cbsnews, cdc, cisco, clucerf, crooksandliars, edublogs, edweek, fooyoh, forbes, huffingtonpost, illinois, inc, ing, iowapublicradio, kqed, marketplace, mashable, mit, nbcnews, opb, openforum, payscale, publicradio, reghardware, rice, ripr, smithsonianmag, theguardian, typesafe, uw, washington, wbur, wfu, wfubmc, wlrn, worldbank, wpr, wrvo, yahoo

suffix count

com 701

org 258

edu 90

net 47

ca 14

de 11

gov 11

co.uk 9

com.au 9

com.br 6

es 6

blogspot.com 5

edu.au 5

fr 5

ac.uk 4

dk 4

fm 4

info 4

ch 3

me 3

se 3

blogspot.ca 2

co 2

hu 2

it 2

net.au 2

nl 2

org.uk 2

tv 2

Khan Academy: total links 528

top domains (n=3-8): slashdot, patch, feedsportal, kqed, nytimes, stackexchange, uwstout, wordpress

n=2: abcnews, cbsnews, ck12, cmu, cnn, edweek, gawker, go, google, hawaii, hpu, huffingtonpost, ljworld, mashable, metafilter, niu, rosettastone, sc, smithsonianmag, ted, tulsalibrary, utexas, uvm, waldorf, wtol, yahoo

suffix count

com 261

edu 114

org 101

net 12

gov 6

ca 5

co.uk 4

info 2

it 2

tv 2

ak.us 1

cc.ca.us 1

cc.ms.us 1

cc.or.us 1

cz 1

hu 1

int 1

is 1

lib.al.us 1

lib.fl.us 1

lib.ks.us 1

lib.ky.us 1

lib.mo.us 1

lib.nc.us 1

lib.va.us 1

lib.wa.us 1

pa.us 1

Back to the “serious” things. Before our extensive finish, “a week of tweets”, we should complete* our “Coursera & Khan Academy social web presence qualitatively” section with an inbound links analysis (on this page); posts about Coursera & Khan
Academy in the birthplace of viral content, reddit; and co-comment networks on YouTube, the largest video-sharing site with the 3rd largest Internet traffic in the world (according to Alexa.com), which, as we’ve already seen, offers great conditions for
sharing educational content.

For collection & analysis of online network data regarding Coursera’s & Khan Academy’s websites, the VOSON** web crawler was used. Inbound links are links on other websites that point back to the original website – here Coursera on the left & Khan
Academy on the right. Inbound links – also known as backlinks – can (besides other things) tell you who pays attention to a website & therefore who increases the website’s traffic.

Finally, we can clearly see why the current web can be called the “social” web. In terms of frequency of linking to Coursera or Khan Academy, blogs, internet forums, specific content gathering sites & other user-submitted content and/or social
media-like portals (slashdot, wordpress, tumblr, typepad, stackexchange etc.) are very important. As for larger media, CNN is the “newbie” we haven’t discovered within our previous analyses yet. Did “gamezone” among Coursera’s “top domains”
catch your attention? All the buzz was around the “Online Games: Literature, New Media, and Narrative” course, focused on Tolkien & The Lord of the Rings Online. The original topic of this university-level English literature class is “what happens to
stories, paintings, and films when they become the basis of massively multiplayer online games”.

While the fact that many universities publish about Coursera is expected (domain names like “stanford”, “duke”, “upenn”, “illinois”, “mit”, “rice” etc.), other universities, by contrast, link back only to Khan Academy (“uwstout”, “cmu”, “utexas”,
“hpu” etc.). Generally speaking, Coursera seems to get more attention from larger (general) media, while Khan Academy “leads” in the category of education-focused institutions/publishers – “ck12”, “edweek” etc.

If we omit – not only with regard to the previous analysis – the very common and/or frequent suffixes “com”, “org”, “edu”, “gov”, “ca” & “uk”, we’ll find out that several sites from Germany, Spain & France link to Coursera.*** As we’ve also discovered
before – even taking into account that fewer than half as many inbound links were found for Khan Academy*** – KA’s suffixes, again, seem to be more centered on English-speaking countries.

* Or rather “complement”. “Complete” is an ill-chosen word since it indicates a closed-world assumption.

** VOSON inbound links are found via a query to the Yahoo API.

*** Do not be misled by the “it” suffix. Despite the fact “it” is the Internet country code top-level domain for Italy, it is also frequently used as a so-called “domain hack”. In our inbound links dataset, there’s for example the California-based news publishing website “scoop.it”.
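A suffix count like the one in the tables can be derived from the list of linking URLs roughly as sketched below. This is an assumption about the procedure, not the actual VOSON processing: the short set of multi-label suffixes, the example URLs (apart from the “scoop.it” domain mentioned above) and the function names are hypothetical, and a thorough version would rely on the full Public Suffix List.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, incomplete set of multi-label suffixes seen in the tables.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "com.br", "edu.au",
                        "ac.uk", "net.au", "org.uk"}

def domain_suffix(url):
    """Return the domain suffix of a URL, e.g. 'com' or 'co.uk'."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    last_two = ".".join(labels[-2:])
    if len(labels) >= 3 and last_two in MULTI_LABEL_SUFFIXES:
        return last_two
    return labels[-1]

def suffix_counts(urls):
    """Count how many linking URLs fall under each suffix."""
    return Counter(domain_suffix(u) for u in urls)

# Placeholder URLs, invented for illustration only.
urls = [
    "http://www.example.com/a",
    "http://blog.example.com/b",
    "http://www.example.edu/c",
    "http://www.example.co.uk/d",
    "http://www.scoop.it/t/e",
]

print(suffix_counts(urls).most_common())
```

As the “it” footnote warns, a suffix alone says little about the country of a site, so the counts are best read as a rough indicator.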

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Inbound_Coursera.csv, Inbound_Khan.csv) | June 2014

URL | score | title | author | domain | created | subreddit | permalink

http://www.reddit.com/r/science/comments/23o5w4/science_ama_series_hi_im_peggy_mason_i_study/ | 2,440 | Science AMA Series: Hi, I’m Peggy Mason, I Study Empathy in Rats, AMA. | PeggyMason | self.science | 1,4E+09 (Tue, 13 May 2014 16:53:20 GMT) | science | http://www.reddit.com/r/science/comments/23o5w4/science_ama_series_hi_im_peggy_mason_i_study/

http://www.thesimplelogic.com/2012/09/24/you-say-you-want-an-education/ | 2,181 | You Say You Want An Education? A 4-year university computer science curriculum entirely on Coursera | adamwfletcher | thesimplelogic.com | 1,35E+09 (Fri, 12 Oct 2012 00:00:00 GMT) | programming | http://www.reddit.com/r/programming/comments/10i5x0/you_say_you_want_an_education_a_4year_university/

http://hummusforthought.com/2014/01/29/us-bans-students-from-blacklisted-countries-from-getting-a-free-education/ | 1,788 | US bans students from Syria, Iran, Sudan and Cuba from accessing Coursera, the non-profit organization offering free Massive Open Online Courses | hummusforthought | hummusforthought.com | 1,39E+09 (Fri, 17 Jan 2014 23:06:40 GMT) | worldnews | http://www.reddit.com/r/worldnews/comments/1wen2m/us_bans_students_from_syria_iran_sudan_and_cuba/

top 3 reddit posts (all-time)
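The “created” column holds Unix timestamps (seconds since 1970-01-01 UTC). The following is a minimal sketch of the conversion, assuming that cells such as “1,4E+09” are spreadsheet-rounded values written with a decimal comma; the helper names `parse_created` and `to_gmt` are hypothetical. Converting the rounded values reproduces exactly the dates printed in the table, which suggests those dates reflect the rounded value rather than the precise posting time.

```python
from datetime import datetime, timezone

def parse_created(raw: str) -> int:
    """Parse a `created` cell such as '1,4E+09' (decimal comma) or '1395340698'."""
    return int(float(raw.replace(",", ".")))

def to_gmt(ts: int) -> str:
    """Format a Unix timestamp (seconds since 1970-01-01 UTC) as a GMT string."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.strftime("%a, %d %b %Y %H:%M:%S GMT")

# The three rounded values from the table above.
for raw in ("1,4E+09", "1,35E+09", "1,39E+09"):
    print(raw, "->", to_gmt(parse_created(raw)))
```

An unrounded export of the same column would keep all ten digits and therefore the actual posting time.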

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Reddit_Coursera.csv) | June 2014

The top 3 reddit posts about Coursera – those with the highest score (the highest positive difference between upvotes & downvotes)* – come from three different subreddits (science, programming & worldnews): 1) an AMA (ask me anything) with Peggy Mason, Professor in the Department of Neurobiology at the University of Chicago, who, among other things, discussed her “Understanding the Brain: the Neurobiology of Life” MOOC on Coursera; 2) Adam Fletcher, an ICT master’s student, discussing how competitive MOOCs are against traditional universities – including, in his opinion, the insufficient credentialing of MOOCs (as of 2012); 3) the discussion we are already familiar with, the “US bans students from Syria, Iran, Sudan and Cuba from accessing Coursera” issue.

So, what was popular on reddit? Close connection & discussion with a Professor from one of the world’s top Universities; a discussion of possibilities & shortcomings of an emerging model

of higher education; and last but not least, a debate laced with politics around Coursera courses & restrictions on US relations with other countries.

* Registered reddit users vote posts up or down to determine their position. The other reddit posts (obtained from the reddit API) containing the “coursera” or “khan academy” keywords can be found in the enclosed dataset – see the lower-right corner. If you haven’t noticed the footer on every page until now, that is where to look for the dataset (related to a particular analysis) you want to play with.

All data files are enclosed in a zip archive. The link can be found on one of the last slides of this research paper.

URL | score | title | author | domain | created | subreddit | permalink

http://techcrunch.com/2014/03/05/khan-academy-gets-major-partnership-to-close-rich-advantage-in-college-test-prep/ | 4,413 | Khan Academy Gets Rare Partnership To Close Wealth Gap In College Test Prep - "bring free [SAT] test prep software to the masses" - "prepare for the SAT at their own pace, at no cost" | bboyjkang | techcrunch.com | 1395340698 (Thu, 20 Mar 2014 18:38:18 GMT) | technology | http://www.reddit.com/r/technology/comments/20w5eu/khan_academy_gets_rare_partnership_to_close/

http://www.livememe.com/dgigev5.jpg | 3,626 | Studying all weekend made me realize this. Good Guy Khan Academy | Bearowolf | livememe.com | 1379992768 (Tue, 24 Sep 2013 03:19:28 GMT) | AdviceAnimals | http://www.reddit.com/r/AdviceAnimals/comments/1n05ss/studying_all_weekend_made_me_realize_this_good/

http://www.reddit.com/r/IAmA/comments/ntsco/i_am_salman_khan_founder_of_khan_academyama/ | 3,176 | I am Salman Khan founder of Khan Academy-AMA | salman_khan_academy | self.IAmA | 1325094536 (Wed, 28 Dec 2011 17:48:56 GMT) | IAmA | http://www.reddit.com/r/IAmA/comments/ntsco/i_am_salman_khan_founder_of_khan_academyama/

top 3 reddit posts (all-time)

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Reddit_Khan.csv) | June 2014

The top 3 reddit posts concerning Khan Academy discuss: 1) the already well-worn topic of the SAT partnership; 2) a meme expressing support for Salman Khan (his close association with the Khan Academy “brand” & the transformation of education is also something found in the intersection of several previous datasets); 3) another AMA (just like in the case of Coursera), this time dating back to 2011, with the founder of Khan Academy, “you-know-who”, “he-who-must-not-be-named”; otherwise somebody will conduct a content analysis of this text and draw the very same conclusions I did on the subject of online news articles about Khan Academy (because every time one mentions Khan Academy, she or he also mentions Sal Khan). =)

Reddit posts about both Coursera & Khan Academy therefore again demonstrate their common brand elements of “opening up education” & “opportunity”. Coursera conveys the impression of “an extremely nice doorman” who lets people into a huge all-star virtual open-air educational party with an enormous number of stages featuring various new genres (sometimes complicated for a total newbie listener), where everyone can sit in the first row & interact with the performer or with the others. Khan Academy, on the other hand, has the appearance of “an extremely nice doorman” who lets people into a huge virtual party at his place, where Sal Khan (the frontman) performs education on request from his band’s production, which includes thousands of fresh “oldies” covers.

subscribers: 31,535

video comments network

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: YouTube_Coursera_Edges.csv, YouTube_Coursera_Vertices.csv) | June 2014

By searching for comments below YouTube videos whose title, keywords, description, or author’s username contain “coursera” (or “khan academy” on the next page), we are able to create a network based on pairs of videos commented on by the same user. The nodes represent videos, and their size is proportional to the number of views. From this network, once again, we might be able to derive a user typology (/clusters) built around the educational content Coursera’s (/Khan Academy’s) learners are interested in.
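The network construction described above can be sketched as follows. This is a minimal illustration on hypothetical (commenter, video) pairs, not the tooling used for the figures; the clustering method behind the figures is not stated here, so plain connected components are used as a simple stand-in, and all function names are assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (commenter, video) pairs; real data would come from the
# comment listings of the matching YouTube videos.
comments = [
    ("ana", "v1"), ("ana", "v2"),
    ("ben", "v2"), ("ben", "v3"),
    ("cy", "v4"),
    ("dee", "v5"), ("dee", "v6"),
]

def build_edges(comments):
    """Link two videos whenever the same user commented on both; weight = shared users."""
    videos_by_user = defaultdict(set)
    for user, video in comments:
        videos_by_user[user].add(video)
    edges = defaultdict(int)
    for videos in videos_by_user.values():
        for a, b in combinations(sorted(videos), 2):
            edges[(a, b)] += 1
    return dict(edges)

def clusters(comments, edges):
    """Connected components of the video network (isolated videos form singletons)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for video in sorted({v for _, v in comments}):
        if video in seen:
            continue
        stack, component = [video], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(sorted(component))
    return result

edges = build_edges(comments)
groups = clusters(comments, edges)
print(edges)
print(groups)
```

Videos that share no commenter with any other video end up as singletons, which corresponds to the “disconnected” group of nodes in the figures.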

That said, it might be quite tricky to begin with Coursera, which does not provide the kind of “comprehensive education” Khan Academy does (rather “differentiation” in Coursera’s case) and, more importantly, does not upload its lectures to YouTube, since it makes use of its own learning environment. But let’s deal with that.

Coursera might also represent a single community and/or some common

ways of thinking about education, but it is rather a platform which shields a

large number of – more or less independent – smaller communities around

specific educational interests.

The big black disconnected group – yes, the majority of YouTube content

uploaded by Coursera does not have a common commenters community –

contains introductory “lectures” (/invitation videos) of courses featured on

Coursera, fan videos, assignments/homework related to particular classes,

feedback to Coursera, and course reviews. The biggest black dot

represents a YouTube upload of a video from the TED conference,

“Daphne Koller: What we're learning from online education” (211,233

views).

Some “minicommunities” were also discovered. We can use them to

illustrate some stepping-stones to Coursera’s social media content

strategy development we’ve already discussed before.

The first group from the left clusters videos commented on by users interested in (& interacting with) live sessions of courses in Spanish. The names of the other groups are also quite self-explanatory. There are two communities discussing Coursera news: one commenting on videos giving information about Coursera’s background, the other on more festive videos. The “COURSERAful” YouTube channel attracted commenters on Stanford University’s Computer Networking course. The remaining two groups are “Coursera interviews & about” and a group of videos commented on by users dealing with the assignments of the “Introduction to Music Production” course.


To complete the “top 3” most viewed videos about and/or by

Coursera, we should mention that the two larger blue nodes

at the bottom left are “Katy Perry - Birthday (Official

Coursera Parody) Music Video” and the “Welcome To

Coursera” video. Although the former’s membership in the

“news: Coursera background information” cluster might be

questionable, a new category of “entertainment”, in the eyes

of self-driven learners, would need to include all of the

content. =)

Broadly speaking, Coursera does not use YouTube as a platform for publishing educational content*. The only (long-term) “strategic” content on YouTube is course invitations. These drive traffic to Coursera’s website (one, but possibly not the only, goal of an institution’s SNS presence), but they do not drive the kind of engagement that might help spread Coursera’s content & strengthen learning communities, and/or maintain their interest in the channel & educate them (some added value beyond the content provided on Coursera’s website). Also with regard to the other self-formed communities related to its brand, Coursera still seems to be looking for its YouTube positioning & content strategy.

* On the other hand, the fact that all lectures are “enclosed” in Coursera’s own learning environment (KA also has an elaborate one) allows it to collect a lot of behavioral data, which might help us in the future to better understand education (& better shape it as a result). Nevertheless, “purely” educational content obviously drives a lot of engagement on social media, so there seems to be room for improvement – my recommendation within the Facebook network discussion is definitely not the only possible solution.

subscribers: 1,781,982

video comments network

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: YouTube_Khan_Edges.csv, YouTube_Khan_Vertices.csv) | June 2014

Creating a (rough) user typology for Khan Academy, derived from the network of YouTube commenters connecting the videos they interacted with, is – despite the diversity of KA’s microlecture topics – much easier than in the case of Coursera. This time, we’ll start with the tiny & smaller groups, which are far from representing “user archetypes” – unlike (possibly) the larger groups.

There are a lot of videos watched by random curious users at the bottom of the picture. Some users were co-interacting with videos on fractions, others were interested in Salman Khan, computing, or mathematics & physics. The “MCAT candidates” watched & commented on several videos on the Medical College Admission Test – therefore they are possibly prospective medical students from the US & Canada. The “learning natural sciences” group interacted with natural sciences videos in Portuguese (the “Khan Academy em Português” YouTube channel). We’re slowly getting to the larger groups…

Videos that belong to the “Learning medicine” cluster were co-commented on by users discussing Khan Academy videos featured mainly on the “khanacademymedicine” YouTube channel. The most popular video there – as measured by total number of views (the biggest light green dot) – was “The Kidney and Nephron” (878,288 views).

The “learning natural sciences” group, interested in natural sciences content

uploaded by the English (largest) “Khan Academy” YouTube channel, includes

another “top” video, “Balancing Chemical Equations” (1,345,371 views).

The second largest group (in the upper right corner) includes videos watched by

users who keep/kept an eye on all mathematical educational content

categorized on Khan Academy’s web page in the “Math” section. They also

seem to follow any general core knowledge – in terms of searching for an

explanation of a generally used concept. Therefore they are close to the largest

detected commenter group.*

The largest group discussed math, economics & finance, and videos about Khan Academy as an institution (news, background & milestones). The most watched videos there are: “Basic Addition” (Khan Academy, 2,322,876 views) & “Khan Academy runs on Google Cloud Platform” (Google Enterprise, 5,093,851 views).

* A certain overlap is also apparent due to the fact that the “mathematics & general knowledge” group also includes a news video (which should therefore be in the largest cluster), “Salman Khan talk at TED 2011” (3,624,780 views, the largest light blue dot). Given that we haven’t yet highlighted the importance of the TED conference in our previous analyses dealing with the online media sphere (nor with respect to Coursera), we should highlight TED now. It also points to the fact that this text is built around an overview of both Coursera’s & Khan Academy’s brands, and therefore does not work with the influence & reach of particular media. A suggestion for follow-up research? =)


The very last group we haven’t discussed yet is the box of disconnected black nodes. Just like with Coursera, these are videos that did not fit in any of the clusters (could it possibly be because of missing links? =)). We find there some third-party content: interviews, reports on the use of Khan Academy in schools, fan videos, how-tos; and also a mixture of Khan Academy videos covering the whole spectrum of KA’s content (sorted by frequency, in descending order): mathematics, economics, natural sciences & medical videos, computer science.

To sum it up, the largest group of Khan Academy users arguably – based on the video comments networks around KA’s educational content shared on YouTube* – complement, supplement, or fix their knowledge & skills in mathematics (above all), economics, and general/core education. Obviously, new communities are also emerging: in particular those built around the subjects of KA’s currently expanding portfolio (natural sciences & medicine).

Quite surprisingly, we haven’t detected an increased interest in the relatively new (& generally highly supported by media) section of “Computing”. This can easily be explained by the fact that the section on Khan Academy devoted to computers & computer programming consists of many hands-on practical interactive exercises & supportive textual material, rather than of the interactive whiteboard-like videos commonly associated with Khan Academy.

* Naturally (& valid for many other analyses in this text), it would be

great to compare our conclusions to the behavioral data

right from Khan Academy’s (/Coursera’s) website.

Coursera & Khan Academy

detailed insights provided by a small fragment of big data

general statistics, influential tweeters, demographics, keywords, content analysis,

natural language processing, text mining, sentiment analysis

(pp. 96-121)

a week of tweets

Coursera
number of followers: 176,619
number of tweets: 8,538 (4,323 links, 3,286 retweets, 13% threads)
archive started: 05/29/2014 17:27:36
last tweet: 06/06/2014 17:21:59
tweets/min rate: 0.74

Khan Academy
number of followers: 295,146
number of tweets: 4,976 (3,441 links, 1,386 retweets, 6% threads)
archive started: 05/24/2014 17:05:21
last tweet: 06/06/2014 17:40:56
tweets/min rate: 0.27

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_TweetsWeek.csv, Twitter_Khan_TweetsWeek.csv) | June 2014

We already analyzed some Twitter data at the beginning of the second chapter. Previously, we focused only on Coursera’s & Khan Academy’s friends (those whom Coursera or Khan Academy follow) and on selected followers of Coursera & Khan Academy, thus studying the positioning of both institutions. Here comes the analysis (text analysis above all) of all tweets that included the “coursera” or “khan academy” keywords and that were published between May 29 (or May 24)* and June 6, 2014. It is worth pointing out again that automation of our analysis – e.g. in a software-as-a-service fashion – is crucial for any long-term generalization of its outcomes. Nevertheless, we should see that even such a “cut-out” of the (big) Twitter data, a week of tweets, might provide us with meaningful insight into who Coursera’s & Khan Academy’s users are, what they think & what Coursera / Khan Academy can possibly do about it.

In total, 8,538 tweets (3,286 retweets) about Coursera & 4,976 tweets (1,386 retweets) about Khan Academy were collected.** 13% of Coursera’s tweets were threads, i.e. about 1,110 of the total 8,538 tweets were detected as being conversations. Regarding Khan Academy, 6% of the total tweets were detected as threads, i.e. about 299 tweets in total. On that account, we can see that although Khan Academy has more followers, Twitter users – that is, not only online self-driven learners but also online publishers etc. – were talking more about Coursera. On average, every 15 minutes, Coursera was mentioned more than 11 times, compared to fewer than 4 tweets mentioning Khan Academy within the same time span.
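The rates quoted above follow directly from the archive statistics. A minimal sketch of the arithmetic, assuming the archive timestamps are in month/day/year order; the function name is hypothetical:

```python
from datetime import datetime

FMT = "%m/%d/%Y %H:%M:%S"  # assumed format: archive timestamps are month/day/year

def tweets_per_minute(count, started, last):
    """Average number of tweets per minute between the first and last archived tweet."""
    span = datetime.strptime(last, FMT) - datetime.strptime(started, FMT)
    return count / (span.total_seconds() / 60)

coursera = tweets_per_minute(8538, "05/29/2014 17:27:36", "06/06/2014 17:21:59")
khan = tweets_per_minute(4976, "05/24/2014 17:05:21", "06/06/2014 17:40:56")

print(f"Coursera: {coursera:.2f} tweets/min, {coursera * 15:.2f} per 15 min")
print(f"Khan Academy: {khan:.2f} tweets/min, {khan * 15:.2f} per 15 min")

# Thread shares converted to approximate absolute counts (the shares are rounded).
coursera_threads = round(0.13 * 8538)
khan_threads = round(0.06 * 4976)
print(coursera_threads, khan_threads)
```

Because the two archives cover different time spans (about 8 and 13 days), the per-minute rate is the fairer basis for comparison than the raw tweet totals.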

* Unfortunately, part of Coursera’s dataset was corrupted. That’s why we have tweets mentioning Khan Academy since the evening of May 24, while Coursera’s dataset includes tweets since the evening of May 29. So it isn’t exactly “a week of tweets” in either case – a fair objection! =) – “a week of tweets” is an analogy for “a small amount of the big data” which we can use to create hypotheses and/or (where possible) make generalizations. As we monitor specific events only in order to estimate our level of bias, what’s left & what we are interested in is the text content of tweets & metadata, providing us with deeper insights than just the “customer experience” of Coursera’s & Khan Academy’s users.

** In spite of the data corruption – which was caused by my careless Google Drive setting – I would like to point out the Twitter Archiving Google Spreadsheet (TAGS) by Martin Hawksey, which will save you the pain of setting up & running a web server for a small Twitter research project like this one.

[figure: tweet volume over time – Coursera (8,538 tweets, peak on 05/06/14) & Khan Academy (4,976 tweets, peak on 31/05/14)]

The top tweets/min rate within our time range was reached by Coursera on June 5. This was the day The New York Times published a review of Coursera’s new mobile app. Coursera learners (or, more traditionally, “students”) were also frequently tweeting about their enrollments in Coursera’s new courses. Finally, they were, with increased frequency, announcing their final scores in different courses – especially in “Machine Learning”, a periodically running course by one of Coursera’s founders, Andrew Ng.

The enormous growth in the number of tweets about Khan Academy on May 31 seems to have been caused by KA’s NASA partnership announcement.

top tweeters (Coursera) | tweets | @'s | % rt
drchuck | 71 | 257 | 80%
JayAndrewStarr | 52 | N/A | N/A
MOOCList | 43 | 39 | 2%

top tweeters (Khan Academy) | tweets | @'s | % rt
iturank | 271 | 32 | N/A
nishi19 | 36 | 61 | 100%
EdTechRetweet | 13 | N/A | 100%

Before we dive into the promised text mining, let’s see if there’s anything else we can gain from the Twitter data. We’ll briefly talk about two different ways of finding influencers, and then about tweet metadata.

The Twitter user tweeting about Coursera most frequently was Charles Severance (a.k.a. Dr. Chuck), Associate Professor, School of Information, University of Michigan, who taught two courses on Coursera: “Internet History, Technology, and Security” & “Programming for Everybody (Python)”. Most of his 71 tweets were retweeted or replied to (/mentioned) by other Twitter users, especially his students.

“JayAndrewStarr” – in “second place” on our imaginary leaderboard of Coursera loyalists – is generally a very active tweeter. Jay was reporting on almost any event regarding Coursera …and then you take a closer look at his account, tweets, the links in them, followers & other personal profiles on his website and realize it’s a scam. “MOOCList” (Portugal), which concludes our “top 3” list, describes itself as an “aggregator (directory) of Massive Open Online Courses (MOOCs) from different providers” and therefore included Coursera many times in its lists of new online courses.*

* You can get more information about other active Coursera loyalists by searching for “twitter” and “frugalmaniac”, “IDCourserians”, “scottedwards200”, “IskiieHacker”, “rdpeng”, “TopFreeClasses” & “Turkcell Akademi” with your favorite web search engine. Even though you will find some other “aggregator” sites there (led by commercial intentions), there’s more. I personally like the “IDCourserians” Indonesian community (“Komunitas Courserians Indonesia”) & the “rdpeng” user profile that belongs to Roger D. Peng, Associate Professor, Biostatistics, Johns Hopkins University, who teaches the data science program on Coursera.


Let’s also briefly talk about Khan Academy’s top tweeters. The most active user, “iturank” (a bot: automatic sharing, non-human), publishes about iTunes U (an Apple app) courses. The second most active user, Japanese blogger nishi19, also belongs to the top 3 with respect to retweets. Her most successful tweet in this regard was a link to her review of the Japanese translation of Salman Khan’s book “The One World Schoolhouse: Education Reimagined” (世界はひとつの教室 「学び×テクノロジー」が起こすイノベーション). Other active Twitter users mentioning Khan Academy confirm the previously supported statement that Khan Academy is more popular than Coursera among education-focused publishers.*

On a general note, let’s highlight that a prospective “brand ambassadors/influencers hunt” should consider not only quantity – possibly helpful for pre-screening prospective candidates – but, more importantly, the quality & relevance of the content they share, and should also respect our goals/intentions & target groups. Quantitative metrics in social media research can be a great starting point but a terrible finish line. On that account, and finally as for this particular analysis, we should add that the most active tweeters discovered here are generally rather atypical in relation to “the common others” (humans, not bots or organizations) discussed later on, who tweet mainly about their educational experience from a learner’s or educator’s perspective (like “drchuck”, who meets our conditions).

* Other active KA loyalists: “ClasesEnWeb”, “EdTechRetweet”, “QLFInc”, “languageed”, “imdhan_khan”, “technologychag”, “EurekaStartups”, “RizKhanMua”, “Shyam17”. The “SocialMediaResearchBasedOnKeywordsSceptics” movement – founded in June 2014 by Jakub Ruzicka =) – will probably point out that among those alleged loyalists, we can also find a fan of Salman Khan, the Indian film actor, and one makeup artist, Riz Khan, who runs Riz Khan Training Academy. Data collection based on keywords has some limitations.


most influential users tweeting about Coursera (top 3), as counted by # of followers
WorldBank | 749,959
Turkcell | 489,473
Ujjwal_krishna | 460,237

most influential users tweeting about Khan Academy (top 3), as counted by # of followers
NASA | 6,635,843
mashable | 4,052,392
the_hindu | 699,432

Since we’ve already conducted a similar analysis in the second part of this text, we can, without further delay, jump into the analysis of the most influential users – as measured by number of followers – with at least one tweet about Coursera or Khan Academy within our time range.*

Let’s start with Coursera, where the influentials are/were: WorldBank, a Coursera partner that currently hosts two courses on Coursera; Turkcell, another partner (a mobile phone operator), sharing Coursera’s content – centered around business, math and technology – via Turkcell Akademi; and Ujjwal_krishna, a very active & influential Twitter user with the motto “I'm what ever you can make out of me :)”, who mentioned Coursera in his “websites that offer free courses” tweet.

The top 3 most influential users tweeting about Khan Academy are/were: NASA, the already well-known partner of KA in increasing student interest in science, technology, engineering and mathematics (STEM); Mashable, a news & technology website we’ve already seen in some previous analyses, which was mainly interested in KA’s partnership with NASA; and finally The Hindu, an Indian daily newspaper in English.

To sum up, in addition to the previously emphasized ones, we can add WorldBank & Turkcell (Coursera) and Mashable (Khan Academy) to our overall universe of online media & partners. The Hindu also reminds us of the previously found high demand for both educational tools in the second-most populous country in the world, India.
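The two ways of finding influencers used above (most active tweeters vs. most followed tweeters) can be sketched side by side. This is a minimal illustration on hypothetical rows; the user names and follower counts are invented, and a real archive would supply one (author, follower count) pair per collected tweet.

```python
from collections import Counter

# Hypothetical rows: (screen_name, follower_count) for each collected tweet.
tweets = [
    ("prof_a", 12_000), ("prof_a", 12_000), ("prof_a", 12_000),
    ("aggregator_b", 3_500), ("aggregator_b", 3_500),
    ("big_org_c", 750_000),
    ("learner_d", 150),
]

def most_active(tweets, n=3):
    """Rank users by the number of tweets they posted in the archive."""
    return Counter(name for name, _ in tweets).most_common(n)

def most_followed(tweets, n=3):
    """Rank users with at least one tweet in the archive by follower count."""
    followers = {}
    for name, count in tweets:
        followers[name] = max(count, followers.get(name, 0))
    return sorted(followers.items(), key=lambda kv: kv[1], reverse=True)[:n]

print(most_active(tweets))
print(most_followed(tweets))
```

The two rankings rarely coincide: a single tweet from a large account outranks a prolific small one by reach, which is why both lists are worth reading together.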

* A week of tweets is definitely biased towards the most recent interactions. However, a week of tweets should not omit too many Twitter profiles that tweet about one institution or the other frequently. Moreover & more importantly, the conclusions, where not backed up by a sufficient amount of data, are rather general, with particular institutions used as examples.

We should also mention that conducting an analysis of all Coursera’s (176,619) & KA’s (295,146) followers would certainly be possible. However, due to Twitter’s 15-minute rate limit windows, the collection of the data – and naturally also its processing, analysis & interpretation – would take a lot of time. Considering that this research paper is already quite long, I might save such an analysis for another text.

tweets language (Google Translate API detection)
Coursera: en 77.8%, es 6.5%; 2.5%-0.1% (descending order): tr, ru, ar, ja, ht, mn, fa, fr, vi, pt, id, it, zh-CN, nl, gl, sr, ko, ca, de, az, th, ga, tl, lv, no, el, hu, lt, sv, cs, et, jw, bg, da, fi, hr, ms, mt, sl, ta, uk, bs, sq, sw
Khan Academy: en 78.6%, es 10.1%; 1.9%-0.1% (descending order): ja, fr, ht, pt, tr, id, gl, ms, ar, nl, sv, sw, ca, cs, cy, mn, sr, th, ur, zu, af, az, bg, eo, eu, ha, he, hi, hu, ig, it, jw, mt, no, sk, sl, tl, vi

user language (Twitter user preferences)
Coursera: en 73.9%, es 11.0%; ru, tr, fr, ja, pt, it
Khan Academy: en 71.4%, es 12.8%, ja 8.7%; fr, pt

We will start our look at tweet metadata with the frequencies of tweet languages & Twitter user language preferences. As expected, most of the tweets mentioning Coursera or Khan Academy were in English. We already know that users from English-speaking countries dominate both websites’ user bases – among other things, because most of the educational resources are in English by default (possibly with foreign-language subtitles). Still, if we compare tweet languages to user language preferences, we can see that users from non-English-speaking countries usually tweet in English as the common language (yet there are some exceptions, such as the Spanish-speaking population). Moreover, we should bear in mind that some users might have set their account to English even if their mother tongue is not English. The next most common language, Spanish, reflects both tools’ numerous fan bases in Latin America. As for Khan Academy, we can also identify quite a big community of users from Japan. Although there definitely are fans of Khan Academy in Japan, our results are skewed by the very active “iturank” Twitter bot, which has Japanese as its language preference.

Other Coursera communities can be found in Russia, Turkey, France, Japan, Arab countries*, Italy, Brazil & possibly Portugal (the Portuguese language preference). As for Khan Academy, it is worth mentioning France, Brazil, possibly Portugal (once again, the Portuguese language preference) & interestingly the Republic of Haiti (the Haitian Creole language, “ht”). In comparison with the Facebook comments network demographics, we can add Turkey, Japan & Italy to our “target markets” estimates.

* Arguably – also with respect to our previous analyses – Coursera’s user base in Arab countries is much more numerous than KA’s.

Just for the record, the “ar” (Arabic) language code includes: Saudi Arabia, Iraq, Egypt, Libya, Algeria, Morocco, Tunisia, Oman, Yemen, Syria, Jordan, Lebanon, Kuwait, United Arab Emirates, Bahrain & Qatar.
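The comparison between tweet language and user language preference can be sketched as a pair of frequency tables. This is a minimal illustration on hypothetical values (one detected tweet language and one profile language per tweet); the numbers are invented and do not reproduce the shares reported above.

```python
from collections import Counter

def language_shares(codes):
    """Return {language code: percentage of rows}, sorted by descending share."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {lang: round(100 * n / total, 1) for lang, n in counts.most_common()}

# Hypothetical per-tweet values: detected tweet language vs. the author's profile language.
tweet_lang = ["en", "en", "en", "es", "en", "tr", "en", "en", "es", "en"]
user_lang = ["en", "es", "en", "es", "en", "tr", "ja", "en", "es", "en"]

tweet_shares = language_shares(tweet_lang)
user_shares = language_shares(user_lang)
print(tweet_shares)
print(user_shares)

# Authors whose profile language is not English but who tweeted in English.
switched = sum(1 for t, u in zip(tweet_lang, user_lang) if t == "en" and u != "en")
print(switched)
```

A higher English share among tweets than among profile languages is the pattern the paragraph above describes: some non-English profiles tweet in English.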


Given that in our data archive there were only 168 Coursera & 122 Khan Academy tweeters who publicly shared their location (the geotag metadata) with their tweets, we should not make any significant generalizations based on those two heatmaps. However, this way, we are able to support some of our previous conclusions. Moreover, it’s good to know that such information is available, especially for any long-term monitoring/research.

The long-awaited text mining of tweets is coming, we are almost there. =) Just let me quickly deal with what we can learn about Coursera & Khan Academy from four other types of Twitter metadata: hashtags, mentions & URLs in tweets, and media attached to them.

a week of tweets

Coursera

top hashtags      count    % of tweets
#coursera         528      11.2%
#MOOC             434      9.2%
#WDRrisk          218      4.6%
#Iran             165      3.5%

top mentions      count (n=9899)    % of total mentions
Coursera          3902              39.4%
drchuck           257               2.6%
World Bank        190               1.9%
TURKCELL          175               1.8%
Johns Hopkins     112               1.1%

Khan Academy

top hashtags      count    % of tweets
#STEM             428      11.2%
#Apple            303      7.9%
#iPhone           303      7.9%
#iTunes           303      7.9%
#iTunesU          303      7.9%
#Mac              303      7.9%
#WHScienceFair    151      3.9%

top mentions      count (n=3288)    % of total mentions
NASA              604               18.4%
Khan Academy      580               17.6%
Mashable          306               9.3%
WWWhatsnew        48                1.5%


In dealing with hashtags, mentions & URLs in tweets, we already have one foot in the text content of tweets.

The moment you take a look at the most frequent hashtags, you might have a hunch that nothing there could surprise you. Neither #coursera nor #MOOC needs further explanation. #WDRrisk is a hashtag of the World Bank. And opening the #Iran hashtag discussion would also be quite repetitive. As for Khan Academy, we can, once more, see the #STEM topic, the list of Apple products (by the “iturank” bot*), and last but not least the #WHScienceFair (The White House Science Fair) hashtag, related to the previously discussed “college admissions resources” partnership.

Mentions also (mainly) consist of “familiar faces”. As a reminder, Johns Hopkins (University) hosts the Data Science Specialization on Coursera. WWWhatsnew (Brazil) tweets about technology, design & business news in Spanish.

* In relation to our overall conclusions: even though our hashtag analysis is biased by the “iturank” bot (an automated account), we found iTunes & KA’s iPad app in previous analyses as well, and can therefore justify our conclusions about KA’s reach among iTunes users.
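For readers who want to reproduce this kind of metadata count on the enclosed datasets, here is a minimal sketch using only Python's standard library. It is not the code behind the tables in this paper: the regular expressions, the helper names `extract` and `top`, and the sample tweets are assumptions made for illustration. The percentage it returns is the share of tweets containing an item, which is only one of the possible denominators (the mention tables here use the total number of mentions instead).

```python
import re
from collections import Counter

# Simplified patterns (assumptions); Twitter's own entity rules are stricter.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")


def extract(text):
    """Return the sets of hashtags, mentions and URLs found in one tweet."""
    urls = URL.findall(text)
    rest = URL.sub(" ", text)  # avoid matching '#' or '@' inside links
    return {
        "hashtags": {h.lower() for h in HASHTAG.findall(rest)},
        "mentions": {m.lower() for m in MENTION.findall(rest)},
        "urls": set(urls),
    }


def top(tweets, kind, n=5):
    """Return (item, count, % of tweets containing it) for the n most frequent items."""
    counts = Counter()
    for text in tweets:
        counts.update(extract(text)[kind])  # each item counted once per tweet
    return [(item, c, round(100 * c / len(tweets), 1))
            for item, c in counts.most_common(n)]


# Made-up sample tweets, for illustration only.
sample = [
    "Just enrolled in a new #MOOC on #coursera via @coursera https://www.coursera.org/",
    "Another #mooc worth taking, thanks @drchuck",
    "Restoring access http://blog.coursera.org/ #Iran",
]
hashtag_rows = top(sample, "hashtags")
mention_rows = top(sample, "mentions")
url_rows = top(sample, "urls")
```

Hashtags and mentions are lowercased before counting so that #MOOC and #mooc fall into one row; URLs are left untouched because their paths are case-sensitive.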

a week of tweets

Coursera: top URLs (n=6239), count & title

90  Coursera works with Turkey’s largest mobile provider, Turkcell, to build resources for Turkish Learners
    http://blog.coursera.org/post/87629164467/coursera-works-with-turkeys-largest-mobile-provider

82  Restoring Course Access in Iran
    http://blog.coursera.org/post/87742596882/restoring-course-access-in-iran

71  Risk and Opportunity: Managing Risk for Development
    https://www.coursera.org/course/managerisk

61  Coursera
    https://www.coursera.org/

Khan Academy: top URLs (n=3483), count & title

238  NASA, Khan Academy Collaborate to Bring STEM Opportunities to Online Learners
     http://www.nasa.gov/content/nasa-khan-academy-collaborate-to-bring-stem-opportunities-to-online-learners

157  NASA, Khan Academy Team Up for STEM Education
     http://mashable.com/2014/05/29/khan-academy-nasa-stem


Top shared/tweeted URLs (as part of the tweets) can, above all, reassure us that we are slowly reaching data saturation (within our limited research universe).

Out of the 6,239 URLs in tweets mentioning Coursera, 90 link to Coursera’s blog article about its cooperation with Turkcell, 82 link to Coursera’s blog article about restoring course access in Iran, and 71 link to Coursera’s course on the subject of the World Bank’s “World Development Report 2014”. Coursera’s own homepage was also linked 61 times.

Out of the 3,483 URLs in tweets mentioning Khan Academy, 238 link to NASA’s article about the STEM collaboration & 157 link to Mashable’s article on the same topic.

If we were to draw some conclusions here, we should, once again, stress the importance of blog articles that share background information & (therefore) deepen one’s relationship with an institution.

a week of tweets

top media (context tweet & photo)

Coursera
1. Programming for Everybody session 2 just became my highest-enrollment @Coursera class ever
2. [automatic translation, Arabic] @research_tools sites who offer free courses on the net credit of the best universities in the world and in different languages

Khan Academy
1. Khan Academy logged-in homepage: As of today, everything in red is rendered by @reactjs
2. Page load times on Khan Academy just dropped in ~half today thanks to a new, faster A/B test system
3. When you get 4 in a row on khan academy then miss the last one


The very last metadata we will mine before the text analysis are “media”, a 2011 addition to Twitter, which used to be a text-only microblogging platform. The most popular (i.e. most frequent) photos attached to tweets can complement the overall picture of both institutions’ social web presence with a visual insight.

The “Programming for Everybody” picture (first from the left) shows enrollments in the courses of our “old friend”, Charles Severance. The next picture, showing some of Coursera’s partner institutions, was attached to the (several times retweeted) tweet of Sa3ad, a foreign student in the US who specializes in psychology & brain science.

As for Khan Academy’s photos, the first & second pictures belong to the user “soprano”, i.e. Ben Alpert, a young Khan Academy engineer, who shared some information about Khan Academy development. The “top 3” is completed by the already seen meme making fun of the fact that in order to complete a Khan Academy exercise, you need 5 correct answers in a row.

What’s the lesson here? Once again, we can see that sharing an institution’s background information on social media works in the realm of education as well & is able to connect its community.

a week of tweets

Coursera

top keywords     Porter stemmer (top)    Wordnet lemmatizer (top)
coursera         coursera                coursera
course(s)        cours                   course
free             sign                    free
online           mooc                    online
mooc             free                    signed
learn(ing)       learn                   mooc
signed           onlin                   learning
programming      start                   class
data             educ                    programming
education        take                    data
earn(ed)         class                   learn
university       univers

part-of-speech tags
top verbs: sign(ed), earn(ed), take, learn
top adjectives: free, learn, free, new, more, first, great, good, mobile, english

Khan Academy

top keywords     Porter stemmer (top)    Wordnet lemmatizer (top)
khan             khan                    khan
academy          academi                 academy
nasa             nasa                    nasa
education        educ                    education
stem             team                    stem
team             onlin                   team
online           learn                   online
khanacademy      khanacademi             khanacademy
space            space                   space
learn            collabor                learn
opportunities    opportun                opportunity
tutorials        video                   video
astronomy        tutori                  tutorial
youtube          astronomi               astronomy
                 youtub                  youtube
                 explor                  exploration
                 launch                  mashable
                 mashabl                 sensation
                 itun
                 sensat

part-of-speech tags
top verbs: expand, launch, learn, bring, unveil, know
top adjectives: free, english, new, more, good, great, educational, smarter


Regarding the text analysis* on the next pages, it should be mentioned that the tweets went through preprocessing procedures including normalization, depunctuation, lowercase conversion, stop-word removal, spelling correction, handling of negative words (replacing negations with antonyms), stripping off links, retweets & replies, stemming, lemmatization etc. The tweets were also translated to a common language, English (machine translation, Google Translate API).
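To make a few of the preprocessing steps listed above concrete, here is a rough standard-library sketch. It covers only lowercase conversion, stripping links and the retweet marker, depunctuation and stop-word removal; spelling correction, negation handling, stemming, lemmatization and machine translation are left out. The function name `preprocess`, the tiny stop-word list and the sample tweets are assumptions for illustration, not the pipeline actually used for this paper.

```python
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would use a full one.
STOP_WORDS = {"a", "an", "the", "is", "on", "in", "of", "to", "and", "for", "my", "it"}


def preprocess(tweet):
    """Lowercase a tweet, strip links, the retweet marker and punctuation, drop stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)   # strip links
    text = re.sub(r"^rt\s+@\w+:?", " ", text)   # strip a leading retweet marker
    text = text.translate(str.maketrans("", "", string.punctuation))  # depunctuation (ASCII only)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


# Made-up sample tweets, for illustration only.
sample = [
    "RT @someone: Completed an excellent course in Machine Learning on @Coursera! https://t.co/abc",
    "Machine Learning on Coursera is awesome",
]
tokens = [preprocess(t) for t in sample]

# Keyword frequencies over the whole (tiny) corpus.
keyword_counts = Counter(tok for toks in tokens for tok in toks)
```

A stemmer or lemmatizer would be applied to the tokens before counting, which is what produces forms such as “cours” or “onlin” in the keyword tables.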

Just like any of the previous analyses, text mining can also tell us something about the aura/brand “floating” above both educational tools. If you speak “Academic”, textual and/or content analysis can yield insights into how human beings make sense of the world. In our case, those meanings are reflected in the text & context of the natural everyday Twitter interactions surrounding both institutions, which co-define them and, at the same time, are their reflections.

The most frequent keywords (in descending order), controlled for stems, lemmas & part-of-speech tags (verbs & adjectives in particular), provide us with insight into the prevailing/mainstream universe of meanings around both institutions.** Without further delay, let’s start with Coursera.

* Python’s NLTK 3.0, Pattern (CLiPS), TextBlob & Pandas libraries were used to analyze the textual content of tweets.

** Even though some of the interpretations might seem “too simple” or even “naive”, bear in mind the following two points: 1) you enter this discussion as a reader very familiar with the topic, since by now you’ve probably read about 100 pages on Coursera’s & Khan Academy’s social web presence (it’s coming to an end, I promise =)); 2) ever since ethnomethodology was established by Harold Garfinkel, we can argue on a scientific basis that exactly such common everyday reasoning & day-to-day experiences allow us to understand the social orders around Coursera & Khan Academy.


a week of tweets


Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_TweetsWeek.csv) | June 2014

Coursera represents a “free”, “online”, “mooc”, “education(al)” platform offering “new” courses from the world’s top “universities”, mainly in “english”. People usually talk about what you can “learn” there, which courses (“class(es)”) “start” soon, which courses you can “sign” up for, which “course” they (have) “take(n)” (or “signed” up for), whether it is their “first” Coursera experience, or where you can find “more” information or “more” educational resources from “coursera”. People like to tell others what percentage of points they “earn(ed)”, and whether they received a certificate from the “sign(ature)” track. In particular, people talk about courses concerning “programming” & “data”. The course experience & the “free” “education” “coursera” offers is “good” or even “great”. So is the fact that the “education” provided by “coursera” is now also “mobile”.*

* In order to provide a coherent interpretation, we simplify & make generalizations. “good”, for example, is also often part of the collocation “good luck”; the same holds for some other words. The sentences are structured to fit the majority of statements/tweets containing the keywords used (likewise in the case of Khan Academy on the next page).

To those who are interested in doing their own research on the enclosed datasets (see the last pages of this research paper): don’t forget that my overall frequencies also include all non-English text, therefore you might need to use an automatic translation API, such as Google Translate, Bing/Microsoft Translator, Yandex Translate and/or many others (also note that the BabelFish API built into Python’s NLTK was discontinued).

number of tweets: 8,538

a week of tweets


Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014

Just like Coursera, Khan Academy represents a “free”, “online”, “education(al)” platform (however, not labeled “mooc” as often as Coursera), also with educational resources mainly in “english”. You can “learn” almost anything there from “youtube” “video(s)”, which make you “smarter”. You can find those on “itunes” as well. People think that “khanacademy” is offering “great” resources that help you do “good” (not only) on your exam; you should “know” it exists if you need such help or if you need “more” educational resources. Some believe you can “learn” “more” on “khan academy” than in a traditional class. Some also think that “khan academy” is an “educ(ational)” “sensat(ion)”.

Rumor has it (and not only “mashable” helps spread it) that “khan” “academy” “team(ed)” up with “nasa” to “bring” and/or “expand” “stem” “education” “opportunities”. Building on this “collabor(ation)”, “new” open “educational” resources were “launch(ed)”, namely “tutorials” on “astronomy” and “space” “explor(ation)”, which they recently “unveil(ed)”.

number of tweets: 4,976

a week of tweets

top collocations: Coursera

machine learning; attend lessons; starts june; avert damages
free online; web application; uncertain world; olympic games
online courses; research methods; scientists toolbox; world prestigious
data science; application architectures; world preparation; online course
earned 100; starts today; course access; turkcell academy
creativity innovation; bad times; human-computer interaction; interactive programming
iran find; understanding research; prestigious university

top collocations: Khan Academy

khan academy; education sensation; academy debuted
stem education; khanacademy collaboration; technology causes
academy team; youtube education (/education youtube); causes salman
apple mac, iphone apple, itunes iphone, itunesu itunes; launched online; learn more
space exploration, nasa khan; online learners; bring stem
stem opportunities, announced today, opportunities announced, expand stem; unveiled space; academy tutorials
whsciencefair learn; book review; australia collections


The most frequent collocations in both corpora support our previous conclusions. Creating sentences from them would rather miss the point in this case, so it’s up to you, the reader, to take a look at both tables. As for Coursera, I would just like to point out (because the data to draw such a conclusion were missing in our previous analysis) that the partner universities of Coursera are usually given attributes like “prestigious” (see the table) or “world-class”. As for Khan Academy, “technology causes” the transformation & innovation in education (see the aforementioned nishi19’s “book review”). As for both, once again, this time derived from natural language, we can see the diversity of Coursera’s meanings and, by contrast, the uniformity of the key topics around Khan Academy.
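As a rough illustration of how collocation candidates like those in the tables can be found, the sketch below simply counts adjacent word pairs (bigrams) with Python's standard library. This is a simplification and an assumption on my part: collocation finders in NLP toolkits usually rank pairs by an association measure rather than by raw frequency, and the paper does not state which measure was used. The function name `top_bigrams` and the sample token lists are hypothetical.

```python
from collections import Counter


def top_bigrams(token_lists, n=3, min_count=2):
    """Return the n most frequent adjacent word pairs seen at least min_count times."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(zip(tokens, tokens[1:]))  # adjacent pairs within one tweet
    return [(" ".join(pair), c) for pair, c in counts.most_common(n) if c >= min_count]


# Made-up, already preprocessed sample tweets, for illustration only.
corpus = [
    ["machine", "learning", "starts", "today"],
    ["recommend", "machine", "learning", "coursera"],
    ["free", "online", "courses"],
    ["free", "online", "course", "starts", "today"],
]
bigrams = top_bigrams(corpus)
```

Pairs are counted within a single tweet only, so the last word of one tweet never forms a pair with the first word of the next.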

a week of tweets

“khan academy” concordance sample

- evaluating the khan academy edutech
- wbedutech evaluating the khan academy
- i love khan academy
- education wbedutech evaluating the khan academy edtech ict
- i stayed up all night watching khan academy videos double speed and now feel slightly crazed education
- spending my first official day of summer cramming for the sat on khan academy awesome
- all sat materials on khan academys page are free

“coursera” concordance sample

- coursera course evolution a course for educators
- the fiction of relationship from brownuniversity on coursera
- the athlete within from unimelb on coursera
- oilproject from 9 june on the new coursera mooc of unibocconi which already has 13000 students enrolled here are the details
- humanitarianism from admissionsuom on coursera
- coursera has an app didnt know abt it
- i am now inspired to combine this coursera with classic teaching synergy is huge and potential fantastic
- i earned 801 in an introduction to interactive programming in python on coursera
- programming for everybody session 2 just became my highestenrollment coursera class ever
- coursera with drchuck learning to change the world 1 program at a time
- how should one put a course from coursera on the resume
- only 615 of coursera signature fee goes to the university how small goes to the profs
- find free online classes at coursera futurelearn and more
- organizations from vanderbiltu on coursera
- unethical decision making in organizations from unil on coursera
- refreshing i recommend machine learning on coursera
- a scientists toolbox online course coursera websignage scoopit
- 4 of 12 courses on the coursera homepage are penn open courses today more to choose from
- representations of sexualities on coursera
- yay penn mt pennopencourses 412 courses on coursera homepage are penn courses today

sharing context of “khan”: nasa education brandnew delivers educational delivers educational career curriculum my

sharing context of “coursera”: course mooc python education online june new first free courses programming


Our shift to a slightly more personal note, which will be reinforced immediately afterwards by sentiment analysis, can begin with “coursera” & “khan academy” concordance (context of a word) samples. Besides several tweets about one’s personal experience (see the following sentiment analysis for more), we can see that Khan Academy was reviewed by “EduTech”, a World Bank blog on ICT use in education, which released an article “Evaluating the Khan Academy”. The “coursera” concordance again shows the diversity of its courses & topics, and also Coursera’s “social dimension”. Moreover, the extracts include the first instances of a “want to learn” notion on Coursera & a “need to learn” notion on Khan Academy (see the sentiment analysis).

The tables of words sharing the context of “coursera” & “khan” support our previous conclusions. It is worth mentioning that “careers” are related to STEM education, and therefore to the Khan Academy & NASA cooperation.
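A concordance of this kind can be sketched as a simple keyword-in-context lookup. The version below is a minimal standard-library illustration, not the routine used for the samples in this paper; the function name `concordance`, the window width of three words and the sample token lists are assumptions.

```python
def concordance(token_lists, keyword, width=3):
    """Return each occurrence of keyword with up to `width` words of context on both sides."""
    lines = []
    for tokens in token_lists:
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = tokens[max(0, i - width):i]
                right = tokens[i + 1:i + 1 + width]
                lines.append(" ".join(left + [tok] + right))
    return lines


# Made-up, already preprocessed sample tweets, for illustration only.
corpus = [
    ["i", "recommend", "machine", "learning", "on", "coursera"],
    ["coursera", "has", "an", "app", "didnt", "know"],
    ["i", "love", "khan", "academy"],
]
coursera_lines = concordance(corpus, "coursera")
khan_lines = concordance(corpus, "khan")
```

Because the context is a fixed window of words, the extracts are fragments rather than whole tweets, which is also why concordance samples tend to read as cut-off phrases.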

a week of tweets

Coursera: average sentiment 0.14; average sentiment, neutral (0) removed: 0.30


A simple* sentiment analysis allows us to examine the sentiment polarity score of tweets on the scale [-1.0, 1.0], where ‘-1’ stands for a negative attitude & ‘+1’ stands for a positive attitude. A tweet is treated as a bag-of-words (words are weighted by tf-idf). The sentiment is based on the adjectives it contains, using Python’s NLTK classifier trained on a movie reviews corpus by Pang & Lee. Leaving aside the (rather) marketing uses of sentiment analysis (customer feedback monitoring and/or responding to it in order to improve the product and/or an institution’s overall image, both online & offline), which are not so useful for our purposes, the Twitter testimonials** of Coursera or Khan Academy users allow us to study their natural & first-hand positive, neutral or negative experiences. Looking at both institutions’ brands with regard to user experience, we should be able to conclude, though certainly not complete/finish (see the text box of “other social web analyses” at the end of this text’s conclusions), the overall social web image of Coursera & Khan Academy.

As for the information on this page, the average sentiment is an informative, yet very rough, metric telling us that both institutions are discussed in rather positive connotations (how surprising! =)). No “our-Twitter-corpus-specific” machine learning techniques were used (see the first footnote), so we encountered the typical issues: some testimonials labeled as negative were not actually negative but rather sarcastic, and some tweets labeled “negative” in fact described a negative experience which Coursera/Khan Academy solved. Taking that into account, statements like “Khan Academy is, in general, less popular than Coursera”, or “general attitudes towards Khan Academy are close to neutral” would make a mockery of statistical significance (and it’s not wise to fool with the foundations of statistics =)). The point here is to quickly classify & arrange our dataset in order to find out what people like & dislike about both educational tools.
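The two averages reported for each institution (all tweets vs. tweets with neutral scores removed) can be illustrated with a toy example. Note that this sketch swaps the classifier described above for a tiny, hypothetical adjective lexicon, so it demonstrates only how the two averages relate, not the actual scoring method. The names `POLARITY`, `tweet_polarity` and `average`, as well as the sample tweets, are assumptions.

```python
# Hypothetical mini-lexicon of adjective polarities in [-1.0, 1.0].
POLARITY = {"awesome": 1.0, "excellent": 1.0, "best": 1.0,
            "good": 0.7, "bad": -0.7, "worst": -1.0}


def tweet_polarity(tokens):
    """Mean polarity of the known adjectives in a tweet; 0.0 (neutral) if none occur."""
    scores = [POLARITY[t] for t in tokens if t in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0


def average(values):
    """Arithmetic mean of an iterable of numbers; 0.0 for an empty one."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Made-up, already preprocessed sample tweets, for illustration only.
tweets = [
    ["coursera", "is", "awesome"],
    ["signed", "up", "for", "a", "course"],
    ["good", "course"],
    ["course", "starts", "today"],
    ["bad", "timing"],
]
scores = [tweet_polarity(t) for t in tweets]
mean_all = average(scores)                                   # neutral tweets included
mean_non_neutral = average(s for s in scores if s != 0.0)    # neutral (0) removed
```

Since tweets without any scored adjective pull the overall mean toward zero, the average with neutral tweets removed has a larger magnitude than the overall average, which matches the pattern in the figures reported for both institutions.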

* “Simple” is sufficient with respect to our needs. We are not striving for a (methodologically) as-accurate-as-possible positive/neutral/negative classification of the tweets, but rather for a (practical) quick & dirty, efficient way of using adjectives to order our dataset by sentiment estimates, so that we can show the most distinctive instances & also quickly draw some conclusions about the rest. Even though we can’t possibly cover all the tricks & gadgets of natural language processing, here are some potential improvements of our sentiment classification. First, we might want to improve our machine translation and/or our sentiment classifier. Both can be achieved by machine learning & (therefore) some manual user input of pre-classified tweets as training data, and/or by using an existing tagger optimized for Twitter, e.g. Twitter NLP and Part-of-Speech Tagging (Noah's ARK, Carnegie Mellon University). Using training & test sets, the commonly used Naive Bayes classifier could be employed to measure our accuracy. Furthermore, we could try bigrams or trigrams instead of unigrams so that we can better identify sarcasm, negations & other related sentiment issues. By detecting pronouns, we could also distinguish tweets based on their subjectivity. “Share those amazing online courses on Coursera/Khan Academy: (…)”, e.g. posted by an online publisher, is different from “I want to share this amazing online course I took on Coursera/Khan Academy: (…)!”, e.g. posted by an enthusiastic Coursera/Khan Academy user. Based on our methodology, without accounting for subjectivity, the second tweet might have a less positive overall sentiment (because there are more words) even though we, humans, might argue it is the other way around.

** In order to further increase our generalizability, we might also want to collect “the remaining” 99.9% of other online testimonials about Coursera & Khan Academy. =) Indeed, even for decision-making regarding Twitter only, we might also want to employ long-term & automated data collection. As for the Twitter testimonials that “made it to this paper”, i.e. were labeled as very negative or very positive, the names of users that do not represent an institution, brand and/or Coursera were removed. Thus, on the following pages, you will find only the mention/reply symbol (“@”) without the following username.

Khan Academy: average sentiment 0.05; average sentiment, neutral (0) removed: 0.21

a week of tweets


Each entry: sentiment (detected language), original tweet, analyzed tweet [automatic translation, processed]

1 (en)
original: Here's a start on @coursera etc @ Summer of Learning! @ pulls together best online courses - http://t.co/0Fq07j2ZwM
analyzed: heres a start on coursera etc timf5050 summer of learning lifehacker pulls together best online courses

1 (en)
original: @ are you teaching any MOOCs this summer? Your Coursera course was the best yet!
analyzed: are you teaching any moocs this summer your coursera course was the best yet

1 (en)
original: Professor you've made my transition into bschool a cake walk. Best course I've ever taken.Huge fan of your novel approach
analyzed: professor youve made my transition into bschool a cake walk best course ive ever takenhuge fan of your novel approach

1 (nl)
original: En ook MIT heeft een online university. Doei vrije tijd: http://t.co/dwgViu9uN3 & http://t.co/WSy1eHIBM1 #awesome
analyzed: and mit has an online university bye leisure & awesome

1 (en)
original: @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @
analyzed: should study khan academy coursera and udacity as role models of moocs adopt best practices

1 (en)
original: that class that you mention from coursera economics of money and banking is excellent thank you
analyzed: that class that you mention from coursera economics of money and banking is excellent thank you

1 (ja)
original: #Coursera #posa-002 Week 1のResultsは満点で無事に完了! 結構長かった。 #3good
analyzed: results of coursera posa002 week 1 was longer quite done safely in a perfect score 3good

1 (ar)
original: إليكم قائمة بأفضل المواقع العالمية التي يمنكم من
analyzed: heres a list of the best sites you may feel that the world of

1 (en)
original: Your beliefs draw your behaviors and your behaviors determine outcomes!! Just completed Week 2 of awesome learning @coursera!
analyzed: your beliefs draw your behaviors and your behaviors determine outcomes just completed week 2 of awesome learning coursera

1 (ru)
original: По теории Графов лучше всего подходят курсы стенфордского университета на Coursera #yacm2014
analyzed: on graph theory are best courses at stanford university on coursera yacm2014

1 (en)
original: Completed an excellent course in Machine Learning by @andrewng on @Coursera! https://t.co/OfawjBUITj
analyzed: completed an excellent course in machine learning by andrewng on coursera

1 (en)
original: coursera is awesome
analyzed: coursera is awesome

1 (en)
original: @drchuck reading your book for the coursera MOOC. It's awesome!!!
analyzed: reading your book for the coursera mooc its awesome

1 (en)
original: Listening to Rick Levin, CEO of Coursera. Very impressive!
analyzed: listening to rick levin ceo of coursera very impressive

1 (en)
original: @coursera @BerkleeCollege Oh my gosh, that was AWESOME!
analyzed: coursera berkleecollege oh my gosh that was awesome

1 (en)
original: I found the perfect class for me. I'm geeking out in anticipation. https://t.co/VaebXhoqLV
analyzed: i found the perfect class for me im geeking out in anticipation

To sum up the positive tweets about Coursera: users with a good user experience recommend Coursera’s courses, give information about their progress in a particular course, mention their achievements and/or positive learning experience, and thank a particular educator (/Professor). To illustrate that (you can find the other tweets in the dataset), take a look at the tweets with the highest (/most positive) detected sentiment on this & the two following pages. As for the importance of such (“natural”) personal word-of-mouth testimonials, the “opinion leadership” concept & the influence of our social networks on us: these are the small elements that in aggregate shape, to a greater or lesser extent for different people, the source of (high/low) authority & credibility of any institution.


a week of tweets

Each entry: sentiment (detected language), original tweet, analyzed tweet [automatic translation, processed]

1 (en)
original: Excellent suite of basics on Data Analysis https://t.co/LOeoGRh4N9 #MOOC
analyzed: excellent suite of basics on data analysis mooc

1 (en)
original: awesome course! https://t.co/wdenaGtRSb
analyzed: awesome course

1 (tr)
original: Dünyanın en iyi üniversitelerinden dersler, Türkçe altyazı ile @TurkcellAkademi'de.. http://t.co/pglIqoAjvF
analyzed: lessons from the best universities in the world with turkish subtitles in turkcellakademi

1 (en)
original: @ I need to sign up to even look at the course details. I've heard of coursera though. Best of luck, I can't take up any course now.
analyzed: i need to sign up to even look at the course details ive heard of coursera though best of luck i cant take up any course now

1 (en)
original: My online course is called Paradoxes of War. This is going to be awesome. #Coursera #princeton
analyzed: my online course is called paradoxes of war this is going to be awesome coursera princeton

1 (en)
original: Excellent News http://t.co/wdpxXRIz7X
analyzed: excellent news

1 (en)
original: This site is awesome https://t.co/diuZtu5Jms dunno about the classes yet though but signed up for one to start this week :-)
analyzed: this site is awesome dunno about the classes yet though but signed up for one to start this week

1 (en)
original: The Best MOOC Provider: A Review of Coursera, Udacity and Edx - http://t.co/ENDu5MJfFK http://t.co/3ht3jcANcA
analyzed: the best mooc provider a review of coursera udacity and edx

1 (en)
original: "Aggregation is 'link to the rest', where Curation is 'link to the best'": Understand Google, Northwestern University coursera lec :)
analyzed: aggregation is link to the rest where curation is link to the best understand google northwestern university coursera lec

1 (en)
original: @drchuck reading your book for the coursera MOOC. It's awesome!!!
analyzed: drchuck reading your book for the coursera mooc its awesome

1 (en)
original: The Best MOOC Provider: A Review of Coursera, Udacity and Edx - http://t.co/ENDu5MJfFK http://t.co/3ht3jcANcA via @skilledup
analyzed: the best mooc provider a review of coursera udacity and edx via skilledup

1 (es)
original: Las mejores ... - http://t.co/CBr8RavZAu #Coursera #Cursos #Duolingo #Nasa #Udemy http://t.co/bK6FvkCNcQ
analyzed: best coursera courses nasa udemy duolingo

1 (es)
original: Las mejores #Aplicaciones ... --http://t.co/0gH3naXrCE #RecetasNaturale #Coursera #Cursos #Duolingo #NASA #UDEMY
analyzed: best applications recetasnaturale tco0gh3naxrce courses duolingo coursera nasa udemy

1 (es)
original: Las mejores aplicaciones para no dejar de aprender aún siendo adultos #ANDROIDE #Tecnologia #Udemy #Coursera #NA... http://t.co/TxyaG5Vk0T
analyzed: the best applications for non stop learning even as adults android technology udemy coursera na

1 (es)
original: Las mejores aplicaciones para no dejar de aprender aún siendo adultos: A medida que nos hacemos... http://t.co/bNC1BSpQyx #Udemy #coursera
analyzed: the best applications for non stop learning even as adults as we grow udemy coursera

1 (es)
original: RT @: Después de un 10 en Semana 1, estoy listísima para Semana 2 <3 #TCGO @coursera @UniLeiden
analyzed: after a 10 on week 1 im listsima for week 2 < 3 tcgo unileiden coursera

1 (en)
original: @ Best of luck on your @Coursera journey. Follow @DukeU for updates on Duke news, research and campus life.
analyzed: best of luck on your coursera journey follow dukeu for updates on duke news research and campus life

1 (tr)
original: @Turkcell @coursera mükemmel bir calışma teşekkürler
analyzed: turkcell courser calma the perfect thank you

1 (ru)
original: Друзья ЛЕГЕНДАРНАЯ новость, для желающих познать финансы. На Coursera вышел …
analyzed: friends legendary news for those wishing to learn finance coursera went on

1 (es)
original: Una vez más empiezo esta maravilla de curso en Coursera, y una vez más me veo abandonándolo por falta de tiempo :_( https://t.co/FpnrJM5aIL
analyzed: again start this wonderful coursera course and once again i see myself abandoning it for lack of time

1 (en)
original: @: One if the best for #datacomputing Heard of @Coursera courses? We have them at the @JohnsHopkinsSPH. Explore #gohop http…
analyzed: one if the best for datacomputing heard of coursera courses we have them at the johnshopkinssph explore gohop


sentiment (detected language) original tweet
analyzed tweet [automatic translation, processed]

1 (tr) Turkcell Akademi ve Coursera işbirliği ile dünyanın bilgisi Türkçe: ABD'nin en iyi üniversitelerinden dünyaca ... http://t.co/9lJjiXfQrr
turkcell academy in collaboration with the worlds information and courser turkish usa s of the best universities in the world

1 (tr) Turkcell Akademi ve Coursera işbirliği ile dünyanın bilgisi Türkçe: ABD'nin en iyi üniversitelerinden dünyaca ... http://t.co/qH3ttSNN64
turkcell academy in collaboration with the worlds information and courser turkish usa the world s best universities

1 (it) @ @ C'e' un eccellente @coursera MOOC https://t.co/VciPCU4qmz esplora il peso/ % di compiti in un corso e consequenze.
there s an excellent coursera mooc explores the weight of tasks in a course and consequences

1 (en) @: @ Just finished the R Programming course on @coursera. Excellent use of time.” I am also almost done.
just finished the r programming course on coursera excellent use of time i am also almost done

1 (gl) Suscribite a este curso para aprender a programar Programming for Everybody | Coursera https://t.co/xwJl4pRxuJ via @delicious
suscribite to this course to learn how to program programming for everybody course via delicious

1 (en) Awesome @geranyl: Stanford's Machine Learning on @Coursera starts soon! https://t.co/f0gVkoLpbz
awesome stanfords machine learning on coursera starts soon

1 (en) @ Just finished the R Programming course on @coursera. Excellent use of time.
just finished the r programming course on coursera excellent use of time

1 (en) Weekends appear to be the perfect time to catch up on @coursera lectures and assignments before Sunday deadlines.#MOOC
weekends appear to be the perfect time to catch up on coursera lectures and assignments before sunday deadlinesmooc

1 (en) What is the best iOS 6 compatible app for Coursera?
what is the best ios 6 compatible app for coursera

1 (en) Coursera's Internet History, Technology, and Security course starts in two days. Looks awesome! (And it requires no programming.) #IHTS
courseras internet history technology and security course starts in two days looks awesome and it requires no programming ihts

1 (en) The Awesome moment when you signup for the Scala course in @Coursera and finds out the instructor is the creator of Scala
the awesome moment when you signup for the scala course in coursera and finds out the instructor is the creator of scala

1 (en) @: Awesome chat with Pat Bosshart about next-gen #SDN chipsets. http://t.co/FF3K986nVf I learned a ton. Week 5 of @coursera cove…
awesome chat with pat bosshart about nextgen sdn chipsets i learned a ton week 5 of coursera cove

1 (en) @ Best of luck on your @Coursera journey. Follow @DukeU for updates on Duke news, research and campus life.
best of luck on your coursera journey follow dukeu for updates on duke news research and campus life

1 (en) Hey @coursera @open2study We wanted to let you know we featured your awesome courses in our Courses of the Weekend! http://t…
hey coursera open2study we wanted to let you know we featured your awesome courses in our courses of the weekend

1 (en) Best platform is Coursera...I've done loads! :)
best platform is courseraive done loads

1 (en) 2014 Internet Trends http://t.co/JLTa3pAjM1 - Impressive growth in online learning resources like @khanacademy @coursera and @duolingo.
2014 internet trends impressive growth in online learning resources like khanacademy coursera and duolingo

1 (en) BBC News - Trinidad pioneers online 'knowledge network' http://t.co/nowzNvkfKW. Impressive to see @coursera working on knowledge.tt.
bbc news trinidad pioneers online knowledge network impressive to see coursera working on knowledgett
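
The “analyzed tweet” entries above look like the output of a simple text normalization. The following is a minimal sketch of such a step, not the paper's actual pipeline: it assumes that URLs are dropped, the text is lowercased, and everything except ASCII letters, digits and spaces is removed. The function name `normalize_tweet` is hypothetical, and the automatic translation step is not covered.

```python
import re

URL_RE = re.compile(r"https?://\S+")        # links such as http://t.co/...
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")   # anything but ASCII letters, digits, whitespace


def normalize_tweet(text: str) -> str:
    """Hypothetical normalization: drop URLs, lowercase, strip punctuation."""
    text = URL_RE.sub(" ", text)
    text = text.lower()
    text = NON_ALNUM_RE.sub("", text)
    return " ".join(text.split())  # collapse repeated whitespace


if __name__ == "__main__":
    print(normalize_tweet("What is the best iOS 6 compatible app for Coursera?"))
```

Under these assumptions `97.8%` becomes `978`, which agrees with some rows of the processed tweets; other rows (for example `I've` rendered as `i have`) suggest that the real processing did more than this sketch.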

a week of tweets

average sentiment: 0.14

average sentiment, neutral (0) removed: 0.30
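
The two averages differ only in whether tweets scored exactly 0 are counted. A minimal sketch of both calculations follows, assuming scores lie in [-1, 1]; the function name is hypothetical and the score list is made up for illustration, not taken from the dataset.

```python
from statistics import mean


def average_sentiment(scores, drop_neutral=False):
    """Mean sentiment; optionally ignore tweets scored exactly 0 (neutral)."""
    kept = [s for s in scores if not (drop_neutral and s == 0)]
    return mean(kept) if kept else 0.0


# Made-up scores for illustration only.
scores = [1.0, 0.0, 0.0, -0.5, 0.0, 0.5, 0.0, 0.0]

if __name__ == "__main__":
    print(average_sentiment(scores))                     # all tweets
    print(average_sentiment(scores, drop_neutral=True))  # neutral (0) removed
```

If both reported figures are means over the same tweets, their ratio estimates the share of non-neutral tweets: roughly 0.14 / 0.30 ≈ 0.47 for Coursera, subject to the rounding of the reported values.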

sentiment (detected language) original tweet
analyzed tweet [automatic translation, processed]

-1 (fa) ایول، # نمی‌کنهچیزروایراندیگهکورسرا ! #coursera
evil kvrsra irans nothing else does coursera

-1 (fr) Dégouté d'avoir codé l'assignment4 de #coursera #progfun #scala sans avoir vu les weeks 5 et 6 et la fin de la 4 ... j'aurai gagné mon temps
disgusted to have coded the assignment4 of coursera scala progfun without seeing the weeks 5 and 6 and the end of the 4 i have earned my time

-1 (en) The physics equations will prolly drive me insane, but enrolling here for the sake of my #writing #workinprogress https://t.co/8Zx661m5AS
the physics equations will prolly drive me insane but enrolling here for the sake of my writing workinprogress

-0.8 (en) I earned 97.8% in Microeconomics Principles on @Coursera! I hate deadlines! https://t.co/upfc4VNY42
i earned 978 in microeconomics principles on coursera i hate deadlines

-0.75 (en) Manchester Uni MOOC on water and sewage. Boring? Wrong!!!!
manchester uni mooc on water and sewage boring wrong

-0.7 (en) Coursera class on resilience of children in disasters & war--too bad I will be in school! http://t.co/O5WJe4wDur
coursera class on resilience of children in disasters & wartoo bad i will be in school

-0.7 (en) @coursera @Turkcell I wish coursera would just stay with koç uni. Turkcell is just in pursuit of his brand's promotion, Too bad for coursera
coursera turkcell i wish coursera would just stay with ko uni turkcell is just in pursuit of his brands promotion too bad for coursera

-0.65 (es) Las clases en línea del Tec son muy complicadas, deberían de aprender de Coursera y edX.
online classes are very complicated tec should learn from coursera and edx

-0.6 (pt) a merda do coursera fica me mandando emails e eu já me desinscrevi um milhão de vezes
the fucking coursera keeps sending me emails and ive desinscrevi me a million times

-0.6 (en) This @coursera website is dangerous. I WANT TO LEARN ALL THE THINGS! Dang finite lifespan and brain capacity!
this coursera website is dangerous i want to learn all the things dang finite lifespan and brain capacity

-0.6 (en) Desperately trying to work on Coursera platform from my mobile device.
desperately trying to work on coursera platform from my mobile device

As we’ve already mentioned, without any training, the detection of negative tweets performed rather poorly.* In fact, only a minority of tweets about Coursera were truly negative & they shared very few common topics of negative experience. We might argue this is because Coursera’s target group is, in general, an older and, at the same time, niche population of people who choose courses built around their professional/academic interests – and who are therefore unlikely to be pushed by the traditional education system to learn a particular curriculum (as we’ll see in connection with Khan Academy).

If we were to send Coursera some customer feedback – in our case possibly biased by our data collection period – in the form of crowd-sourced suggestions for improvement (although it is not the primary subject of this analysis), it would be: improving courses’ notification settings (making them clear & easy to adjust); considering the “brand risks” of entering partnerships with non-educational & non-research institutions such as Turkcell, a Turkish mobile phone operator; & offering personally adjustable assignment deadlines (Yaaay! =)). From other negative tweets not displayed on this page but available in the enclosed dataset, it would be, for example, reconsidering the repeated invitations to buy the signature track (yet for-profit organizations surely need to make a profit =)) and/or increasing the commission (/financial reward) that goes to educators/Professors as opposed to their home University/College. Finally, improving the support for students who lag behind and/or lack sufficient educational background – e.g. suggesting where to get it, and doing so in advance, so that they are able to meet such prerequisites before a particular course of their interest starts.

* However, neither here nor in relation to Khan Academy should we say we’re not significantly better off. Just the fact that we have a – more or less – ordered list of 8,538 (Coursera) + 4,976 (Khan Academy) tweets, sorted by sentiment, at our disposal within tens of seconds is something that would cost time & money using manpower, and it greatly facilitates our orientation in the dataset. By the way, prospective machine learners (/data trainers) should know about services like Amazon Mechanical Turk.
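
The “ordered list” mentioned in the footnote can be illustrated with a short sketch. It assumes each tweet is available as a (sentiment, text) pair; the three rows are copied from the tables in this paper, while the function names and the in-memory layout are hypothetical (the column layout of the enclosed CSV files is not given here).

```python
import heapq

# (sentiment, analyzed tweet) pairs copied from the tables in this paper.
ROWS = [
    (1.0, "best platform is courseraive done loads"),
    (-0.75, "manchester uni mooc on water and sewage boring wrong"),
    (-0.6, "desperately trying to work on coursera platform from my mobile device"),
]


def most_negative(rows, n):
    """Return the n rows with the lowest sentiment, most negative first."""
    return heapq.nsmallest(n, rows, key=lambda row: row[0])


def most_positive(rows, n):
    """Return the n rows with the highest sentiment, most positive first."""
    return heapq.nlargest(n, rows, key=lambda row: row[0])


if __name__ == "__main__":
    for score, text in most_negative(ROWS, 2):
        print(score, text)
```

Such a ranking only orders the tweets; as noted above, the rows at either end still need a human read to spot misclassifications.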

number of tweets: 4,976

average sentiment: 0.05

average sentiment, neutral (0) removed: 0.21

sentiment (detected language) original tweet
analyzed tweet [automatic translation, processed]

1 (en) Khan Academy is a wonderful site.
khan academy is a wonderful site

1 (en) Besides Khan Academy, is Schaums the best to learn math for quant research and purposes? http://t.co/19VJBBZlNj
besides khan academy is schaums the best to learn math for quant research and purposes

1 (en) @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @…
should study khan academy coursera and udacity as role models of moocs adopt best practices

1 (es) Khan Academy: maravilloso! https://t.co/dQzEPaXgTF
khan academy wonderful

1 (en) “@: All nighter with Khan Academy teaching me Physics, you guys are so awesome.” @khanacademy
all nighter with khan academy teaching me physics you guys are so awesome

1 (tr) Khan academy diye müthiş birşey var ben daha keşfedeli 2-3 ay oluyor
khan academy he has something awesome going on than i explore in 23 months

1 (en) @ @ is waqt moin khan academy best hai or rashid Latif the best two
is waqt moin khan academy best hai or rashid latif the best two

1 (en) Khan Academy is the greatest study tool I've discovered. 🙌
khan academy is the greatest study tool i have discovered

1 (en) @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @
should study khan academy coursera and udacity as role models of moocs adopt best practices

1 (en) @ I haven’t read what you are talking about BUT KA Lite is an awesome offline Khan Academy
palendae i have not read what you are talking about but ka lite is an awesome offline khan academy

1 (en) Almost Khan Academy (aka the best Calc corner ever) http://t.co/tQ6uZLxwkF
almost khan academy aka the best calc corner ever

1 (en) Excellent STEM tools for anyone who works with the kiddos:) http://t.co/V7dclsjj7F
excellent stem tools for anyone who works with the kiddos

1 (en) How to use Khan Academy in Math! #awesome #KIPP #Khan http://t.co/GxCLa8cAlC
how to use khan academy in math awesome kipp khan

1 (en) I wish i found #khan #academy sooner. Simply awesome! #maths
i wish i found khan academy sooner simply awesome maths

1 (en) khan academy actually is awesome
khan academy actually is awesome

0.91 (pt) @hugsmeade O site de matemática é o Khan Academy. Muito bom!
the site math is the khan academy very good

Even though there are fewer Twitter testimonials about Khan Academy than about Coursera – the learning process & the target group, or rather the “target occasion” on which a learner needs & looks up Khan Academy, seem to be very different, and such a user might not be motivated to create public testimonials – the positive tweets often mention a good experience with educational material (STEM subjects, unsurprisingly, above all), including tweets of thanks, in connection with the necessity of studying a particular topic (homework and/or exam). The “top” (1.0-0.9) positive tweets detected can be found in the accompanying table.

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014

sentiment (detected language) original tweet
analyzed tweet [automatic translation, processed]

-1 (en) I'm watching a khan academy video in my miserable attempt to pass mannings quiz tomorrow
I am watching a khan academy video in my miserable attempt to pass mannings quiz tomorrow

-1 (es) Khan academy, tareas, examenes mensuales y auditoria TODO junto es HORRIBLE!!!!!!!!!😡😡😡😤
khan academy assignments exams and monthly audit all together is horrible

-0.8 (en) @ hate khan academy! ugh hate khan academy

-0.8 (en) @ and those stupid pendragon essay and khan academy
and those stupid pendragon essay and khan academy

-0.8 (en) @ I hate khan academy
i hate khan academy

-0.8 (en) hate khan academy
hate khan academy

-0.8 (en) @: I've been watching Khan Academy videos all day bc I can't get to the stupid review on mypisd
I have been watching khan academy videos all day bc i cant get to the stupid review on mypisd

-0.8 (gl) Estupido khan academy👊👊
stupid khan academy

-0.8 (en) Uggggh I hate khan academy 🔫
uggggh i hate khan academy

-0.8 (en) I freaking hate khan academy>:(
i freaking hate khan academy&gt

-0.8 (en) @: khan academy is so fucking stupid and it pisses me off to the max
khan academy is so fucking stupid and it pisses me off to the max

-0.8 (en) I really hate Khan Academy
i really hate khan academy

-0.8 (gl) ODIO CON TODO MI CORAZON KHAN ACADEMY!!!!
hate with all mi heart khan academy

-0.8 (en) it has gotten to the point where I put on khan academy to listen to on the way to school i literally hate myself
its gotten to the point where i put on khan academy to listen to on the way to school i literally hate myself

-0.8 (en) I fucking hate Khan Academy
i fucking hate khan academy

-0.8 (en) I hate khan academy someone should show me how to do this
i hate khan academy someone should show me how to do this

-0.8 (pt) essa porra desse khan academy é chato demais
this fucking khan academy is too boring

-0.8 (en) I will always hate math & Khan Academy
i will always hate math khan academy

-0.8 (en) khan academy: saving my geometry grade one annoying ass video at a time
khan academy saving my geometry grade one annoying ass video at a time

Just as we encountered with Coursera, the detection of the most negative tweets about Khan Academy produced several misclassifications; for instance, the very first tweet was misclassified because of the word “miserable”. As we can see, KA’s (generally) younger userbase also has no problem with openly expressing their uncensored & sincere frustrations. =) Negative tweets are often related to a particular “offline” class (traditional K-12 education), as a response to the necessity of studying a particular curriculum (homework, exam etc.).

Finally, we should point out that the “a week of tweets” Khan Academy tweet corpus provides enough evidence for the previously supported statement that KA’s open educational resources are also often used by University/College (any higher education) students or young professionals. While Coursera’s users – regarding their “archetype”, since they surely can blend in reality – rather search for classes (directly) increasing their professional/academic qualification, Khan Academy is often used rather as a supplement – which is naturally also related to the process of creation & distribution of study resources on both portals.
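
The “miserable” misclassification noted above is typical of untrained, word-list-based scoring. The sketch below illustrates that failure mode only; it assumes a tiny hypothetical lexicon whose values merely mirror scores seen in the tables, and the tool actually used for this paper may work differently.

```python
# Hypothetical word polarities; the values only mirror scores seen in the tables.
LEXICON = {
    "miserable": -1.0,
    "horrible": -1.0,
    "hate": -0.8,
    "stupid": -0.8,
    "awesome": 1.0,
    "wonderful": 1.0,
}


def lexicon_score(text: str) -> float:
    """Average polarity of known words; 0.0 (neutral) when none is found."""
    hits = [LEXICON[word] for word in text.lower().split() if word in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


if __name__ == "__main__":
    tweet = ("I am watching a khan academy video in my miserable attempt "
             "to pass mannings quiz tomorrow")
    print(lexicon_score(tweet))
```

The single word “miserable” decides the score although it describes the student's own attempt rather than Khan Academy, which is the kind of context a word list cannot see.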

It's been quite a journey, right? =) Finally, here comes the very last analysis of this exploratory research paper before the conclusions. Even though we won’t draw any “new” conclusions but rather reinforce some previous findings, it deserves something special, interesting & eye-catching to “say goodbye”. Imagination has no limits …or, at least, so it is said. Since we have 8,538 (Coursera) + 4,976 (Khan Academy) tweets at our disposal, what about asking our database a slightly more sophisticated question? As far back as I can remember, we have been attempting to discover Coursera’s & Khan Academy’s brands, therefore we want to know “what/how Coursera & Khan Academy are”. Sixty-nine tweets (Coursera) plus one hundred & twelve tweets (Khan Academy) that contain “coursera is” or “khan academy is” will review* some of our knowledge about both entities in the way closest to the main point of social media.

* the D3 Word Tree visualization by Jason Davies was used
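
The selection behind such a word tree can be sketched in a few lines. This is an illustration only, not the D3 visualization itself: it keeps tweets containing the phrase and nests the words that follow it into a prefix tree. The function names are hypothetical and the sample tweets are taken from the tables in this paper.

```python
import re


def continuations(tweets, phrase):
    """Word sequences that follow `phrase` in each tweet containing it."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b(.*)")
    result = []
    for tweet in tweets:
        match = pattern.search(tweet.lower())
        if match:
            result.append(match.group(1).split())
    return result


def build_tree(sequences):
    """Nest the word sequences into a prefix tree of dictionaries."""
    tree = {}
    for words in sequences:
        node = tree
        for word in words:
            node = node.setdefault(word, {})
    return tree


TWEETS = [
    "khan academy is a wonderful site",
    "khan academy is the greatest study tool i have discovered",
    "khan academy actually is awesome",  # does not contain the exact phrase
]

tree = build_tree(continuations(TWEETS, "khan academy is"))

if __name__ == "__main__":
    print(sorted(tree))
```

Branches shared by many tweets are what the visualization draws larger; counting how often each node is visited would be the natural next step.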

[word tree of tweets containing “coursera is”; n = 69] original data (dataset: Twitter_Coursera_TweetsWeek.csv) | June 2014

[word tree of tweets containing “khan academy is”; n = 112 (part 1/2)] original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014

[word tree of tweets containing “khan academy is”; n = 112 (part 2/2)] original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014

Coursera & Khan Academy

…and what does the social web say about education?

brand essence, swot, positioning & more

conclusions

Since we want to “fit” the collected social web data – that is, to conclude based on them only, although a potential business analysis would naturally take other sources of information into consideration as well (an appropriate research bias compensation, by the way) – the following simplified & slightly modified common marketing analyses/tools are not necessarily used in the “correct” way & might be incomplete. However, employing them is very convenient for our need to summarize the overall social web presence of both institutions.

functions

providing (prospective/current) University/College students, professionals & self-directed learners with hundreds of niche-specialized diverse advanced (college-like & online by default) courses for lifelong education under one platform

meeting the demand for introductory-level courses teaching easily transferable skills (ICT & data science skills above all) to other professional/academic areas

anytime (however with fixed dates & deadlines) educational content for (in the long term) improvement of one’s own labor/academic market status

personality

social mission – transformation of education – focused on transforming society

opening up higher academic/professional education to mass audience (“anyone can join”)

decentralization & diversity

storytelling (self-driven education & life improvements) and inspiration as an education facilitator

close (semi-formal) relationship/friendship with a Professor/instructor/educator

performance

providing opportunity to improve one’s own qualification via self-driven education in order to improve one’s own life

positioning itself via connecting to, establishing partnerships with and/or sharing content of (many) Universities (labeled “world-class”, “prestigious” etc.), University Professors & other educational institutions, also news & tech publishers, famous people & celebrities (politics & entertainment)

active support/promotion of new courses, new partnerships, users-storytellers, technology in education enthusiasts & opening up education enthusiasts

source of authorities

[also influencing Coursera and/or its brand] College/University & other partnerships (recently WorldBank), University Professors, Andrew Ng & Daphne Koller, “the rest” of the board & investors; activity of all aforementioned, generally reinforcing Coursera’s academic-business nature & the communication element of shaping/transforming society through education & research

active, enthusiastic & (often) influential social media users recommending particular courses, posting about their progress, celebrating their achievements and/or positive learning experience, and giving thanks to educators; active, enthusiastic & (usually) influential educators & institutions they represent

major news & tech publishers positioning Coursera as the leading common platform for communities looking for online higher education and/or keen to learn & excited about online education & its techn(olog)ical transformation

signature track certificates

brand essence

I want to learn to make a move

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose August 2014

conclusions

brand

functions

fixing/establishing one’s common core education via targeted education mix, assembled from thousands of small lectures (mathematics & STEM education in the first place)

just-in-time educational content helping one succeed in the traditional education system

access to coherent primary/high school common core for self-directed learning under one platform

personality

social mission – transformation of education – focused on transforming education system

flipping the traditional institutional education system & opening it up

centralization & integrity

gamification & storytelling (self-driven education & school performance) as an education facilitator

close (semi-formal, even informal) relationship/friendship with Salman Khan

performance

providing opportunity to deal with required necessities within the traditional education/qualification system (& facilitating it) and/or serve as its supplement using “catchy” lectures, personalization & gamification

Khan Academy as a big, open & informal (with a lot of “background” content) family led by “father Sal”, developed & enabled by a small team (ICT development and/or content creation), co-developed & spread thanks to the effort of its volunteer community

publicizing the new "science" courses & establishing partnerships with (a limited number of) key/influential institutions in order to popularize & facilitate STEM education

source of authorities

Salman Khan positioned by major news, tech & also education-focused publishers as education transformation leader (via video & technology) & successfully communicating core topics of his vision

strong & growing user & volunteer community, “transparent” & well-delivered background processes & development activities

[also influencing KA and/or its brand] College Board (SAT college admission exam), the White House, NASA, and Bank of America partnerships; Bill Gates

social web testimonials mentioning positive experience with KA’s educational content - STEM subjects above all (unsurprisingly) – including “thanks” posts regarding necessity of studying a particular topic (e.g. last-minute homework and/or exam preparation)

shaping education since 2006

brand essence

I need to learn to move on

conclusions

brand

positioning

conclusions

USP & competitive advantage

University-like (but made online by default) MOOC courses under the patronage of various world-class Universities, taught by their Professors & possibly concluded with obtaining a signature track certificate

keywords

coursera, free, courses, learn, online, education, massive, future, provider, online, mooc, new, universities, english, machine, learning, class, start, sign, take, first, more, earn, programming, data, good, great, mobile,

wider associational universe

open educational resources, ck-12 foundation, alison, opencourseware, massive open online course, udacity, mit opencourseware, learnstreet, wikiversity, charles severance, massive online open research, creative live, techchange, udemy, ben benderson, edsby, iversity, academic earth, lynda.com, eliademy, edukart, daniel s.welt, open-source curriculum, edx, open learning, knewton,

traffic sources

social media, major online news & tech publishers, C’s own & third-party blogs, influential & active users/community (educators, learners, Universities, well-known celebrities & public figures, online media & other institutions), traditional Colleges and Universities, ...

facebook, google+, twitter, nytimes, wsj, huffingtonpost, npr news, techcrunch, wired, fastcompany, ted, mashable, wikipedia, hurriyet, times of india, anadolu ajansi, atlanta journal constitution, montreal gazette, calgary herald, cournytimes, huffingtonpost, ryanseacrest, bbc, entrepreneur, charlierose, kplu, mooclist, worldbank, turkcell, hurriyet, johns hopkins, reddit, youtube, linkedin, slashdot, wordpress, cnn, gamezone, tumblr, typepad, patch, cnet, msn, stanford, duke, dzone, feedsportal, upenn, usnews, abc, ala, allaboutjazz, bloomberg, bostonmagazine, cbsnews, cdc, cisco, clucerf, crooksandliars, edublogs, edweek, fooyoh, forbes, illinois, inc, ing, iowapublicradio, kqed, marketplace, mit, nbcnews, opb, openforum, payscale, publicradio, reghardware, rice, ripr, smithsonianmag, theguardian, typesafe, uw, washington, wbur, wfu, wfubmc, wlrn, worldbank, lifehacker, wpr, wrvo, yahoo,

competition

edx, udacity, udemy, class-central, tareasplus, iversity, the teaching company, traditional Colleges and Universities,

KPI

professional / academic qualification obtained at Coursera

customer persona

a user/sympathizer of Coursera is a (US/foreign) student, recent employee and/or entrepreneur, who wants to obtain/supplement her/his professional/academic qualification

a user/sympathizer of Coursera supports techn(olog)ical transformation of education; she or he is enthusiastic about the opportunity of studying world-class courses (possibly also obtaining a certificate) and/or enjoys exploring (from her/his perspective) new topics which she/he “did not have courage” to study on her/his own, did not know much about or did not know where to start (e.g. programming or data science)

a Coursera user is a part of communities around particular Courses; she/he possibly joins them because of social web testimonials of learners & educators; she/he generally responds to stories about improving one’s life/skills, success & overcoming challenges via education (and Coursera making it accessible & facilitating it)

target markets

Coursera seems to expand rather via larger (University) cities; due to general global interest in professional and/or academic qualification of students, employees & entrepreneurs, Coursera grows in a rather decentralized fashion

besides US, Canada, Brazil, Mexico & Latin America, we found increased interest in Coursera in India, Bangladesh, Brazil, Turkey, Arab countries, Russia, Egypt, Mexico, UK, Spain, Ghana, Singapore, Greece, Hong Kong, Kenya, Nigeria, Pakistan, Trinidad and Tobago, Jamaica, Cambodia, France, Italy, Australia, China, Netherlands, Spain & Portugal

positioning

conclusions

target markets

in accordance with the primary/high school education system it exists within, Khan Academy is slightly more centered on English-speaking countries (US, Canada, UK, Australia); nevertheless, thanks to its volunteer translator community & development of localized (/translated) portals, it increases its reach via these “scattered but focused epicentres” which help its (rather centralized) growth

for example, communities were found in Brazil & Latin America, Egypt, Sweden, Japan, India, Bangladesh, Pakistan, Trinidad and Tobago, Jamaica, Cambodia, Ghana, Singapore, Hong Kong, Kenya, Nigeria, Czech Republic, France, the Republic of Haiti, Spain & Portugal

traffic sources

social media, major online news & tech publishers, KA’s own & third-party blogs, education-focused publishers, KA websites in other languages, traditional K-12 education, …

youtube, facebook, google+, twitter, nytimes, techcrunch, wsj, fast company, forbes, cnetnews, washingtonpost, ted, mashable, education week, wikipedia, sina, es.khanacademy, pt.khanacademy, market watch, cbs, telegraph, el economist, cnnmoney, nasa, bill gates, tumblr, edsurge, paniit-bayarea, techcrunch, iturank, the hindu, linkedin, slashdot, patch, feedsportal, kqed, stackexchange, uwstout, wordpress, abcnews, ck12, cmu, cnn, gawker, go, hawaii, hpu, huffingtonpost, ljworld, metafilter, niu, rosettastone, sc, nbcnews, smithsonianmag, tulsalibrary, utexas, uvm, waldorf, wwhatsnew, wtol, yahoo, google, hbr,

wider associational universe

open educational resources, ck-12 foundation, alison, opencourseware, massive open online course, udacity, mit opencourseware, learnstreet, technology integration, interactive learning, two circles, oer commons, free high school science texts, educational technology, open textbook, american friends of arts et métiers paris tech, curriki, virtual university, learnthat foundation, mitx, phet interactive simulations, teaching channel, computers in the classroom, e-learning, ineedapencil, saylor foundation, collectspace, lecture recording, open source learning, east bay children's book project, knewton,

competition

mathisfun, purplemath, grockit, gradeslam, showme, virtualnerd, regentsprep, mathwarehouse,

USP & competitive advantage

extensive ‘menu’ of micro lectures covering basics of STEM education (& more), enabling flexible assembling of individualized curriculum plan under a single platform

keywords

salman, khan, nasa, youtube, videos, college board, stem, bill gates, free, online, education, english, learn, itunes, great, good, know, more, sensation, bring, expand, opportunities, collaboration, new, launch, tutorial,

KPI

achievement / qualification beyond Khan Academy

customer persona

a sympathizer of Khan Academy is a part of the huge community around Salman Khan & his path towards education transformation

a user of Khan Academy is a primary/high school student, prepares for a standardized school/admission examination – e.g. math or SAT, also economics, science, or medicine – and/or is an older (than high school age) female/male supplementing (filling gaps in) her/his education (mathematics above all)

a user/sympathizer of Khan Academy is inclined towards gamification of education – collecting points & badges, accepting challenges & solving brain teasers (possibly serving as a door-opening moment) and/or enjoys exploring new topics (from her/his perspective, e.g. computing)

swot

conclusions

STRENGTHS

•providing academic/professional education under the patronage of various world-class Universities

•highly (intrinsically) motivated participants of the educational process & supporting such motivation via academic/professional “improve your life” incentives & stories

•Google+ & Facebook: storytelling & inspiration; Twitter: influencers’ testimonials; LinkedIn: connection with the economically active (/professional) population, including students with professional experience

•mobile device users reach: particularly the Android App*

•stable growth of reach & fan base, entering new partnerships with Universities & other educational institutions

* iTunes version of Coursera’s app is also available

WEAKNESSES

•YouTube: missing educational content & unclear content strategy

•Facebook (& possibly other social media profiles): diminished ability to drive engagement* – could be improved, for example, by sharing exercises and/or quiz questions from its courses

•volunteer translator community in its infancy

* not just driving traffic to its website but also employing social media as an educational tool helping spreading of content & strengthening learning communities (and/or patronage over such communities & engagement techniques around particular courses)

OPPORTUNITIES

•(US/foreign) students & younger professionals (global reach, Internet population), who don’t have the opportunity to study particular courses in their home country, at their home University (or not included in their study pathways/programs); or those who are not able to meet the financial/time requirements of higher education; or simply those who want to “give a try” to a particular area

•(closely related) influence on the global young (possibly ”in the near future”) economically active population

•self-organized study/course groups on social media, outside Coursera’s online learning environment

•influential social media users – both learners & educators – mentioning positive experience with Coursera (word-of-mouth, genuine)

•possibility of becoming a supplement and/or competitor to traditional higher education by providing full study pathways & its own (flexible) system of qualification, which might bind together the current niche-specialized courses

THREATS

•eventuality of weakening the “academic neutrality” (brand) label due to commercial partnerships (on the other hand, it can be utilized as communication of linking both realms)

•not reaching those who – in spite of needing qualification & seeking higher education – for instance don’t know “the” foreign language (English, for the majority of the courses) or do not have the required (e.g. mathematical) background for a non-superficial understanding of more complex topics; therefore deepening inequalities by increasing the expertise of those “easily qualifiable” while not reaching the learners – possibly forming a much bigger market – who would need more entry background & skills

Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose August 2014

swot

conclusions

STRENGTHS

•traditional primary & high school education and supplementing it

•resources tailored for US education system

•strong and connected user & fan community

•growing volunteer community, translations

•YouTube: KA’s micro lectures among the most popular YouTube educational content; Facebook & Google+: brain teasers & challenges; Twitter: background information via employees & interns

•stable growth of reach & fan base; faster growth whenever a significant partnership is entered

•iTunes users reach, including Apple mobile devices (iPad above all)*

* Windows Store version of the Khan Academy’s app is also available

WEAKNESSES

•(compared to Coursera) fewer personal (word-of-mouth, not by online publishers) social media testimonials regarding positive experience with particular educational content; (general) social media mentions come from online publishers rather than from “influential” (as measured by potential reach) pupils/students (also determined by age) & teachers

•the question “why learn something” is often answered by necessity (external motivation, e.g. school/exam) rather than by “pursuing one’s goals/interests” (intrinsic motivation, e.g. a concept searched for & explained within a specific practical application rather than by itself)

•inactive LinkedIn company page

OPPORTUNITIES

•older (than high school) population filling the gaps in their education

•an opportunity to influence & shape the US K-12 education system as well as systems in other (not only English-speaking) countries

•subtitle translations & volunteer communities

•an opportunity to easily communicate/deliver key topics to its whole user base & beyond (centralization, Salman Khan’s wide publicity & key influential partnerships)

•“breaking out of math” & becoming a platform providing a general introduction to all “traditional” subjects / disciplines, i.e. complex primary & high school education

THREATS

•a very tight relationship between the “Salman Khan” brand & “Khan Academy” brand (in case of an unfortunate event causing damage to his name)

•eventuality of losing the “revolutionary” part of the brand due to closer connections to the traditional education system (its adjustment & blending into it)

conclusions


why

how [social media]

what

social media are vital for both Coursera & Khan Academy, and so possibly also vital for spreading the “transforming education” principles & facilitating the transformation

in total, on the social web, politics is more popular than education*

* http://bit.ly/UQKfgo

(similarly to content marketing) creating a highly engaging educational post seems to require not only creative skills but also some objective knowledge about the features of such content tailored for a particular target group; therefore studying those features deserves attention

we seem to have enough tools & necessary (key or supplementary) social web & social media data to find educational influencers, plan viral spreading of educational content, link formal to informal education, target individuals or groups, create educational recommender systems & more

the leaders, the visionaries & the “doers” in establishing the 21st Century Education seem to be the for-profit & non-profit sectors (How will governments respond?)

institutions, which are both labeled as “revolutionary” in the realm of education, offer different products & learning processes; 21st Century education therefore does not seem to be about developing one perfect centralized general education system – rather about protection, patronage & support of the market, if we consider government regulation and/or an education record (similar to a medical record) – but about a diversity of providers (self-directed learning, state, for-profit, non-profit, one-on-one etc.; free, trial, paid, creative commons etc.) of clearly positioned educational products, allowing individuals to assemble their own “education mix”, which is recognized, together with their collaborative & individual projects/accomplishments, as their qualification, where ways of obtaining qualification are, by default, a choice

even though we haven’t opened that topic thoroughly within the discussion in this research paper, the data suggest that Coursera’s & Khan Academy’s “revolutionary nature” seems to reside mainly in technology that makes education accessible, mediates some of the best educators in particular areas & for a given target group, satisfies mass demand & provides data-driven education; but not as much in the usual transmission of the curriculum & assessment: presentation-based lectures, not collaborative, a single human being (/learner) & her/his individual outcomes, etc.

nevertheless, we can see how huge a difference both tools provide regarding self-driven education

What’s the point of such reflection? Well, the best is yet to come! =)

education on the social web

•YouTube (& video in general) as an optimal platform for social web educational content spreading

•sharing stories about education improving lives*

•visual content usage to drive traffic to education**

•driving (above all) comments to educate efficiently

•storytelling

•closer connection of learners & educators (semi-formal, or even informal)

•“good practice examples” & peer pressure

•“covering” ourselves in educational content (e.g. via subscribing to it) & recommender systems as foundations for creation / adjustment of our educational mix

•(intelligently) entertaining content attracting attention to education & facilitating it

•social media/web influentials & their testimonials

•transformation, provision & diffusion of education because of, thanks to & via ICT

•educational content derived rather from what I look for / need & what my level of knowledge in a particular area is, than based on who I am according to demography (e.g. age & location)

•introductory-level content employing clarity & simplicity in order to cause a door-opening moment towards educational content perceived to be more complicated

•blog articles / social media posts sharing an institution’s background information connecting community

* arguably a common/general element of “education” as a brand

** well-known in the realm of social media marketing; we are able to confirm it & highlight it

We definitely haven’t covered all areas of social web research (how could we, when the topics of the “semantic web” and/or the “Internet of things” were not even mentioned). However, even if we stay in our somewhat narrower universe of the (public or private, on request) data provided on the social web, we should at least point out plenty of other SNSs (& their APIs), discussion forums / comment sections, blog posts, GeoIP & GPS, mobile applications, click-through rate monitoring, cookies & sessions, web scraping, image recognition, microdata & other APIs around HTML5, and also digital forensics & ethical hacking (e.g. file metadata). Apart from the (already mentioned) suitability of longitudinal research verifying the conclusions of this paper, testing the conclusions on a recent dataset, and/or taking account of the influence & reach of particular media, we might also want to employ other analyses; some suggestions are listed below.

clustering (and/or MDS) of different educational websites based on their keywords’ tf-idf scores
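A minimal sketch of how the tf-idf-based clustering suggested above could be started, in plain Python. The site names and keyword lists are made-up placeholders (not data from this paper), the `threshold` value is an arbitrary assumption, and the grouping step is a naive single-linkage pass rather than a tuned clustering method or MDS.

```python
import math
from collections import Counter

# Hypothetical keyword lists per website (placeholders, not data from the paper).
sites = {
    "site_a": ["math", "algebra", "video", "exercise"],
    "site_b": ["math", "calculus", "video", "quiz"],
    "site_c": ["university", "course", "certificate", "professor"],
    "site_d": ["university", "course", "degree", "lecture"],
}

def tfidf_vectors(docs):
    """Turn {site: [keywords]} into {site: {keyword: tf-idf weight}}."""
    n = len(docs)
    # Document frequency: in how many sites each keyword appears.
    df = Counter(term for words in docs.values() for term in set(words))
    vectors = {}
    for site, words in docs.items():
        counts = Counter(words)
        total = len(words) or 1
        vectors[site] = {
            term: (count / total) * math.log(n / df[term])
            for term, count in counts.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_by_similarity(vectors, threshold=0.1):
    """Naive single-linkage grouping: a site joins (and merges) every
    cluster that contains at least one sufficiently similar member."""
    clusters = []
    for site in sorted(vectors):
        matching = [
            c for c in clusters
            if any(cosine(vectors[site], vectors[m]) >= threshold for m in c)
        ]
        merged = [site]
        for c in matching:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)

vectors = tfidf_vectors(sites)
clusters = group_by_similarity(vectors)
```

With real data, the keyword lists would come from each website's keyword report, and the threshold would need to be chosen by inspecting the similarity matrix.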

(in case of obtaining fans’ interest profiles) clustering of educational websites based on the pages their fan base liked and/or deriving customer segmentation

conjoint analysis to help us assemble an education mix for different groups according to their segment/typology

(similarly) discriminant analysis & ROC curves to estimate our chances of targeting a particular user group with particular educational content
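As a rough illustration of the ROC part of this suggestion, the sketch below computes ROC points and the area under the curve in plain Python. It assumes that some classifier (for example a discriminant analysis) has already produced a score per user; the `labels` and `scores` values are invented for the example and do not come from this paper's datasets.

```python
def roc_points(labels, scores):
    """Return ROC points (false positive rate, true positive rate),
    sweeping the decision threshold from the highest score down."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only when the next score differs (ties share a threshold).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

# Hypothetical data: 1 = user engaged with the content, 0 = did not.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
```

An area close to 1 would suggest the scores separate the target group well, while a value near 0.5 would be no better than chance.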

“how an educational website positions itself” vs “how it is positioned by the social web” comparison (e.g. estimating the “brand hijacking” share)

detailed analysis of motivations & goals of both online learners & educators

a simple recommender algorithm based on the social web/media data (see an awesome step-by-step online book/tutorial, A Programmer’s Guide to Data Mining)
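One possible shape of such a simple recommender is user-based collaborative filtering with cosine similarity, sketched below in plain Python. This is an assumption about what "simple" could mean here, not the method of the book mentioned above; the user names, course topics and ratings are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two rating dicts ({item: rating})."""
    shared = set(a) & set(b)
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    ratings of all other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Hypothetical ratings (e.g. derived from likes or course reviews).
ratings = {
    "ann": {"algebra": 5, "calculus": 4},
    "bob": {"algebra": 5, "calculus": 5, "statistics": 4},
    "cid": {"poetry": 5, "history": 4, "algebra": 1},
}

suggestions = recommend("ann", ratings)
```

Here "ann" is most similar to "bob", so his additional topic is ranked first; with social web data, the ratings could be replaced by binary likes or follows.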

employing machine learning (e.g. the famous naive Bayes classifier) in order to improve the precision of sentiment analysis or other text mining analyses
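To make the naive Bayes idea concrete, here is a small multinomial naive Bayes text classifier with add-one smoothing, written in plain Python. It is only a sketch: the training sentences and the two labels are invented, tokenization is a bare whitespace split, and a real sentiment study would need a much larger labeled sample of posts or tweets.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def train(self, labeled_texts):
        for text, label in labeled_texts:
            words = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label in sorted(self.class_counts):
            # Log prior plus summed log likelihoods avoids numeric underflow.
            score = math.log(self.class_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one (Laplace) smoothing for unseen words.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training examples (not taken from the paper's tweet sample).
training = [
    ("great course loved the lectures", "positive"),
    ("really helpful and clear videos", "positive"),
    ("boring lectures and confusing quizzes", "negative"),
    ("terrible videos waste of time", "negative"),
]

classifier = NaiveBayes()
classifier.train(training)
```

Calling `classifier.predict("loved the clear lectures")` then returns the label whose word distribution fits the text best.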

focus on an individual human being (/small group): how she/he is affected within her/his social network, monitoring changes & drawing conclusions about efficient educational content spreading & its features

monitoring features of successful/viral social media educational content in general

mining the social web for the core topics of transforming education & attitudes towards them


other social web analyses

tools used & DIY resources for self-driven education on the next page

(pp. 131-132)

link to all datasets (zip): http://bit.ly/1pB9ZF4

diy

Unfortunately, it does not seem to be common practice to include tools & how-tos in research papers. We are still getting used to the fact that the “proprietary” things we possess are our skills (& time), used to customize information and so derive a product, rather than the “raw” information, which is supposed to be shared – arguably because we could not move forward if our ancestors & contemporaries did not do the same. Such a state of things might give us the “wrong” feeling that conducting research is feasible only for academic/professional “masterminds”. Since education is a process we naturally enjoy – though it needs a more convenient environment in its institutional form – I hope the pictorial list (as we know, pics should attract you to click on them =)) of (mainly free & open source) tools & literature, which I used and/or which you might like, will inspire you & get you started in the realm of data science and/or social web mining, whichever pathway in those extensive disciplines you want to follow. Do not forget to google other online/paperback resources to further expand your research toolkit & share* those you know in the comments below (since we already know which kind of engagement is crucial for education =)).

* Enthusiasts transform education & don't wait for the traditional education system to take the plunge. This research is – secretly, in the footnote, to avoid interfering with scientific objectivity =) – dedicated to all heroes who create & distribute freely available educational content and/or proprietary educational content suitable for self-driven education. The 20% of the “learning time” I've spent educating myself, outside the traditional education system, allowed me to learn 80% of what I know & can do.

[start here] books & online resources in the ‘Social Web: (Big) Data Mining’ syllabus

Python; awesome Python frameworks, libraries & software; IPython Notebook

NodeXL (& Microsoft Office), Gephi, Pajek, Wandora

NetLogo, R, MYSTAT, Knime, RapidMiner, GATE, Weka, KEEL

import.io, IFTTT, Zapier, plotly, HighCharts, D3, Processing, tableau public

OpenRefine, Twitter Archiving Google Spreadsheet TAGS (& Google Drive)

A Programmer’s Guide to Data Mining

Linux Ubuntu, Libre Office, Notepad++, Sublime Text, NetBeans, Eclipse, VirtualBox, MySQL, mongoDB, Apache Hue, Raspberry Pi, Arduino



