Date post: | 18-Dec-2014 |
Category: | Data & Analytics |
Upload: | jakub-ruzicka |
JAKUB RŮŽIČKA [email protected] cz.linkedin.com/in/littlerose
summer 2014 | working paper
COURSERA & KHAN ACADEMY ON THE SOCIAL WEB
the social web co-creating brands, revealing communities
& facilitating - both producers’ & consumers’ - informed
decision-making in adjusting their “education mix”
aggregated general background
Coursera & Khan Academy social web presence quantitatively
web domains, web traffic, keyword performance, business insights, social media statistics: facebook, twitter, google+, youtube & linkedin; competitive analysis, wikipedia insights

original social web data
Coursera & Khan Academy social web presence qualitatively
facebook, twitter & google+ pages, groups, comment networks, communities, posts, content analysis, fans, demographics, traffic sources, keywords; personal network & interest profiles, search results, news articles, text mining, inbound links, reddit, youtube

a week of tweets
Coursera & Khan Academy detailed insights provided by a small fragment of big data
general statistics, influential tweeters, demographics, keywords, content analysis, natural language processing, text mining, sentiment analysis

conclusions
brand essence, swot, positioning & more
Coursera & Khan Academy …and what does the social web say about education?

outline
link to datasets
tools used & DIY resources for self-driven education
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose August 2014
This research paper* is concerned with social web data in the context of education. Studying the social web presence of two institutions, Coursera & Khan Academy – both arguably labeled “revolutionary” (reflections or even
leaders of the transformation of education) – illustrates what kind of information the social web provides to learners, shaping their “education mix”; to educators, personalizing the education(al tools) they provide & reinforcing their
positioning; and what the general function & “state of things” of social media in education is. Even simple analyses of the social web “dataset” (quite big, indeed) might provide us with information complementing – in some
cases even substituting (/filling the gaps of) – the (internal/private) behavioral data from the usage of an online educational tool. Rather than making an effort to define “the one and only perfect centralized general education
system” (if such a thing exists), & even though both services might exhibit many common features (e.g. the “whatever, whenever, wherever, however education” notion), we’ll attempt to define both 21st Century
education tools, Coursera & Khan Academy, as “brands” co-created by their social web presence, offering personalized/personalizable (“made to measure”) education suitable for a specific group, individual human being,
moment, occasion, need etc.; and so taking a specific part in the “education mix” of an individual.
The data were collected during January-June 2014 – for particular analyses/datasets always see the footer on the right – & analyzed in summer 2014. Due to that, we should keep in mind that the state of both Coursera’s &
Khan Academy’s brands is biased towards the first half of 2014 – and even more so regarding the “a week of tweets” part (yet, as you will see, freeing the data of the “ungeneralizable piece”, we can gain quite detailed insights
provided by a very small fragment of the “big Twitter data”). Although “the ideal” of any web and/or social media analysis would be a longitudinal & automated evaluation/reporting process (yet, before that, exploratory
research is needed), you might be surprised by the richness & high informative value of such a “single captured state”.
* This research paper is part of the work-in-progress text “How to Create Self-Driven Education”. It was posted online to (potentially) educate professionals, academics & the general public about possibilities of mining the social web.
…also to get some feedback and (above all) spread the word about self-driven education & contribute to the vision. =)
RUZICKA, Jakub (2014). COURSERA & KHAN ACADEMY ON THE SOCIAL WEB: THE SOCIAL WEB CO-CREATING BRANDS, REVEALING COMMUNITIES & FACILITATING – BOTH PRODUCERS’
& CONSUMERS’ – INFORMED DECISION-MAKING IN ADJUSTING THEIR “EDUCATION MIX”. [working paper] Charles University in Prague, Faculty of Social Sciences, Institute of Sociological Studies.
aggregated general background
Coursera & Khan Academy
social web presence quantitatively
web domains, web traffic, keyword performance, business insights, social media statistics: facebook, twitter, google+, youtube & linkedin; competitive analysis, wikipedia insights
The first section of this text is (mainly) based on data provided by third
parties. Therefore, it is suggestive rather than conclusive, gives
incomplete information & serves only as a general introduction to the
topic. Its significance will be reinforced in the three following sections
(i.e. including the conclusions), where it complements the original data
collected & provides a framework for the overall picture of Coursera’s
& Khan Academy’s social web presence.
the internet
about 3 billion users
about 1 billion websites
the top 500 sites on the web
blogs
more than 6.7 million bloggers
about 80% of internet users read blogs
about 1.3 billion monthly active users
about 80% of daily active users outside the US & Canada
more than 50 million facebook pages
255 million monthly active users
about 77% of accounts outside the US
about 500 million tweets per day
google+
540 million monthly active users
about 5.5 million pages*
* a simple estimate based on google’s statement that more than 1 million pages were created in the first 6 months (g+ launched in November 2011)
youtube
more than 1 billion users
80% of youtube traffic outside the US
100 hours of video uploaded every minute
186 million monthly active users
more than 3 million company pages
over 39 million students & recent college graduates
about 115 million monthly unique visitors
largest demographic group of 18-29 year old males
wikipedia
over 500 million monthly unique visitors
over 4.5 million English articles
over 10 edits/sec of wikipedia & its sister projects
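The footnoted google+ estimate can be reproduced with a simple linear extrapolation. The sketch below assumes a constant rate of 1 million pages per 6 months and an end month of August 2014 (the date of this paper). Both the constant rate and the end month are assumptions, chosen because they reproduce the 5.5 million figure; the helper name is illustrative only.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Google's statement: more than 1 million pages in the first 6 months.
PAGES_PER_PERIOD = 1_000_000
PERIOD_MONTHS = 6

launch = date(2011, 11, 1)  # launch month as given in the footnote
as_of = date(2014, 8, 1)    # assumption: end month of the estimate

months = months_between(launch, as_of)
# Linear extrapolation: the same creation rate is assumed throughout.
estimated_pages = PAGES_PER_PERIOD * months / PERIOD_MONTHS
print(months, estimated_pages)
```

A different end month changes the result proportionally, so the figure is best read as an order-of-magnitude benchmark.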
This slide is an introductory one, setting some general benchmarks to estimate the bias of our (prospective) definition of Coursera’s & Khan Academy’s brands based on the social web analyses that make up this report.*
* I’ve also attempted to establish generalization principles for social media research (disclaimer: way more theoretical =)), particularly focused on social media algorithms and offline & online political participation, here.
stats sources (June 2014):
internetlivestats.com
newsroom.fb.com/company-info
about.twitter.com/company
google.com/+/brands
youtube.com/yt/press/statistics.html
press.linkedin.com/about
bit.ly/1jZqHBX (techcrunch.com)
en.wikipedia.org/wiki/Wikipedia:Statistics
blog.digitalinsights.in
expandedramblings.com
comscore.com
result label: Education | result refcount: 7048
result description: Education in its broadest, general sense is the means through which the aims and habits of a group of people lives on from one generation to the next. Generally, it occurs through any experience that has a formative effect on the way one thinks, feels, or acts. In its narrow, technical sense, education is the formal process by which society deliberately transmits its accumulated knowledge, skills, customs and values from one generation to another, e.g., instruction in schools.
category label | category URI:
Philosophy of education | http://dbpedia.org/resource/Category:Philosophy_of_education
Knowledge sharing | http://dbpedia.org/resource/Category:Knowledge_sharing
Education | http://dbpedia.org/resource/Category:Education

result label: State school | result refcount: 6194
result description: State schools, also known as public schools or government schools, generally refer to primary or secondary schools mandated for or offered to all children by the government, whether national, regional, or local, provided by an institution of civil government, and paid for, in whole or in part, by public funding from taxation. The term may also refer to institutions of post-secondary education funded, in whole or in part, and overseen by government.
category label | category URI:
State schools | http://dbpedia.org/resource/Category:State_schools

result label: Secondary school | result refcount: 4579
result description: Secondary school (the term "high school" is most often associated with English-speaking countries, though the two are far from synonymous) is a term used to describe an educational institution where the final stage of schooling, known as secondary education and usually compulsory up to a specified age, takes place. It follows elementary or primary education, and may be followed by university (tertiary) education.
category label | category URI:
High schools and secondary schools | http://dbpedia.org/resource/Category:High_schools_and_secondary_schools
School types | http://dbpedia.org/resource/Category:School_types
Educational stages | http://dbpedia.org/resource/Category:Educational_stages
School terminology | http://dbpedia.org/resource/Category:School_terminology

result label: Primary school | result refcount: 3422
result description: A primary school (from French école primaire) is an institution in which children receive the first stage of compulsory education known as primary or elementary education. Primary school is the preferred term in the United Kingdom and many Commonwealth Nations, and in most publications of the United Nations Educational, Scientific, and Cultural Organization. In some countries, and especially in North America, the term elementary school is preferred.
category label | category URI:
Elementary and primary schools | http://dbpedia.org/resource/Category:Elementary_and_primary_schools
School types | http://dbpedia.org/resource/Category:School_types
Educational stages | http://dbpedia.org/resource/Category:Educational_stages
School terminology | http://dbpedia.org/resource/Category:School_terminology

result label: Mixed-sex education | result refcount: 3089
result description: Mixed-sex education, also known as coeducation, is the integrated education of male and female students in the same institution. It is the opposite of single-sex education. Most older institutions of higher education were reserved for the male sex and since then have changed their policies to become coeducative.
category label | category URI:
Gender | http://dbpedia.org/resource/Category:Gender
School types | http://dbpedia.org/resource/Category:School_types
Educational environment | http://dbpedia.org/resource/Category:Educational_environment
Mixed-sex education | http://dbpedia.org/resource/Category:Mixed-sex_education
the “education” keyword
Before we begin interpretation & discussion of the gathered data about two educational institutions, Coursera & Khan Academy, let’s consider the “education” concept itself. For the definition, we’ve asked – unsurprisingly – one of the products of the social web, Wikipedia, the free
open online encyclopedia. Properly speaking, the answer was given by DBpedia, “a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web”. [self] Its keyword search API provided us with the “education”
query results ranked by the number of inlinks pointing from other Wikipedia pages at the particular result page (the refcount column).*
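The ranking by refcount described above can be sketched in a few lines. The records reuse the result labels and refcounts reported for the “education” query; the `LookupResult` type and the `rank_by_refcount` helper are hypothetical names for illustration, not part of the DBpedia Lookup API, and no network request is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LookupResult:
    label: str
    refcount: int  # number of inlinks from other Wikipedia pages


def rank_by_refcount(results):
    """Return the results ordered from most to least linked-to."""
    return sorted(results, key=lambda r: r.refcount, reverse=True)


# Labels and refcounts as reported for the "education" query (June 2014),
# deliberately listed out of order here.
results = [
    LookupResult("Primary school", 3422),
    LookupResult("Education", 7048),
    LookupResult("Mixed-sex education", 3089),
    LookupResult("State school", 6194),
    LookupResult("Secondary school", 4579),
]

ranked = rank_by_refcount(results)
for r in ranked:
    print(f"{r.refcount:>5}  {r.label}")
```

Since refcount only counts inlinks, it measures how central a page is within Wikipedia rather than how well it defines the concept.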
At first sight, we can observe that the “education” concept is closely related to “school” – primarily in its “traditional K-12 school education” meaning, which dominates the results. Yet, as we can learn right from the very first description of education’s “deep” (philosophical) meaning,
such a context is “narrow” & “technical”. In recent years – arguably especially due to the mass expansion of ICT, the Internet, and (therefore) open educational resources facilitating the personalization of education (and mass-scale support & analytics) – we have been becoming more aware of the
(intrinsically motivated) self-driven & informal education opportunities: Wikipedia, blog articles & YouTube tutorials by professionals or amateur enthusiasts, specialized educational portals, community forums, online professional communities, open-source communities; blended
learning, MOOCs, MIT OpenCourseWare, Codecademy, TED, GitHub, and a myriad of other non-profit & for-profit online educational projects & new concepts in education.** Based on the premise of increasing professional/academic specialization – however, rather than within
one discipline, “picking” suitable skills across various disciplines (interdisciplinarity) – to achieve a freely chosen (whatever, whenever, wherever, however), intrinsically motivated and, to some extent, unique mastery, we might argue that “school” in the “education - school”
association could be replaced by “education mix”***, where particular educational tools & entities act as “brands” with a clear positioning of their products, meeting the needs of particular target groups (for more insight, see the long tail) & customizing their mixes according to the data
they have. The archetypes of an “educator” (producer) & a “learner” (consumer) will often blend into a “prosumer” (for more insight, see Wikinomics). On the subject of the “education mix” customization, we will see that the social web – even standing alone, not combined with other
data – can give us a detailed answer to our research question “What are the brands of Coursera & Khan Academy?”, with the aforementioned particular interest in “What kind of information does the social web provide to learners, shaping their “education mix”; to educators,
personalizing the education(al tools) they provide & reinforcing their positioning; and what is the general function & “state of things” of social media in education?”
* Defining the “21st Century education concept” based on a single Wikipedia query only could be seen as “sloppy”, possibly empowering the digital education revolution sceptics with some solid & sound arguments about not enough critical thinking.
On that account, I should mention that I’ll save some more elaborated theoretical background / model / framework for the aforementioned work-in-progress text “How to Create Self-Driven Education”.
** I am – by no means – trying to provide you with a curated list of all online educational resources & currently discussed concepts. It might not even be possible, since – based on the scope of our definition of an “educational resource” – it might be concluded that we actually learn from
everything we interact with. Try to type the “(free) online education” query, and/or any specific topic you want to learn, in your favourite search engine, and you’ll still see only a few planets in the giant digital universe of available educational resources.
*** Coining a term is fun, and even more entertaining if the term can be somewhat useful. =) Since meanings are rather created in everyday life, based on a necessity of having them, let’s not give any “hard” definition to the “education mix” concept. We can just say, it’s kind of analogous to
the “marketing mix” concept. Under the “education mix” paradigm, schools, study programmes, job positions etc. serve as “mere” archetypes/recipes, helping us to assemble our own qualification, positioning & “brand”; employing various education providers & ways of how to devote our time
efficiently to improve our lives. Neither learners nor educators search for a “perfect educational institution” but rather for a “perfect (yet flexible) education mix” tailored for specific needs & goals.
third-party data (source: wiki.dbpedia.org/Lookup) | June 2014
google books
Word frequency in Google’s English book corpus tells a similar story. The “education” keyword
seems to be strongly correlated with “school”. We can see that one of “the” means of obtaining
education has (almost always) been discussed more than the ultimate goal (& process) itself.
We are now ready to begin the Coursera & Khan Academy comparison.
original data (source: books.google.com/ngrams) | May 2014
With respect to the end users, both Coursera & Khan
Academy are websites offering free educational content – as a for-profit & a
non-profit organization, respectively. Both organizations
operate in the US. Khan Academy is about six years older, which
gave it more time to reinforce its “transforming education”
authority / brand label, as we’ll see in the following parts of the
text.
domain name COURSERA.ORG
domain id D164365199-LROR
creation date 2012-01-12T01
updated date 2013-07-02T20
registry expiry date 2023-01-12T01
sponsoring registrar GoDaddy.com, LLC (R91-LROR)
sponsoring registrar IANA id 146
registrant name Andrew Ng
registrant organization Dkandu, Inc.
registrant street 1975 El Camino Real
registrant city Mountain View
registrant state/province California
registrant postal code 94040
registrant country US
registrant phone 1.415377
registrant email [email protected]
domain name KHANACADEMY.ORG
domain id D118495620-LROR
creation date 2006-03-14T22
updated date 2014-04-29T00
registry expiry date 2019-03-14T22
sponsoring registrar GoDaddy.com, LLC (R91-LROR)
sponsoring registrar IANA id 146
registrant name Shantanu Sinha
registrant organization Khan Academy
registrant street PO Box 1630
registrant city Mountain View
registrant state/province California
registrant postal code 94042
registrant country US
registrant phone 1.650337
registrant email [email protected]
web domains
third-party data (source: who.is) | May 2014
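The age comparison can be checked directly from the WHOIS creation dates of the two domains. This is a minimal sketch: the reference date of 1 May 2014 is an assumption (the records are only dated “May 2014”), and the variable names are illustrative.

```python
from datetime import date

# Creation dates taken from the WHOIS records of the two domains.
coursera_created = date(2012, 1, 12)
khan_created = date(2006, 3, 14)

# Assumption: the records are only dated "May 2014", so 1 May 2014 is used.
as_of = date(2014, 5, 1)

DAYS_PER_YEAR = 365.25  # average year length, leap days included

gap_years = (coursera_created - khan_created).days / DAYS_PER_YEAR
coursera_age = (as_of - coursera_created).days / DAYS_PER_YEAR
khan_age = (as_of - khan_created).days / DAYS_PER_YEAR
age_ratio = khan_age / coursera_age

print(f"gap between registrations: {gap_years:.1f} years")
print(f"Khan Academy / Coursera domain age ratio: {age_ratio:.1f}")
```

Under these assumptions the gap comes out a little under six years and the age ratio at roughly 3.5. Note that a domain's registration date is only a proxy for how long the service itself has existed.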
keyword | frequency | share
introduction | 76 | 11%
science | 32 | 5%
part | 31 | 5%
health | 29 | 4%
learning | 25 | 4%
global | 21 | 3%
data | 19 | 3%
educational resources
math: Early math, Differential calculus, Arithmetic, Integral calculus, Pre-algebra, Multivariable calculus, Algebra I, Differential equations, Geometry, Linear algebra, Algebra II, Applied math, Trigonometry, Recreational math, Probability and statistics, Math contests, Precalculus
science: Biology, Cosmology and astronomy, Physics, Health and medicine, Chemistry, Discoveries and projects, Organic chemistry
economics & finance: Microeconomics, Finance and capital markets, Macroeconomics, Entrepreneurship
arts and humanities: History, Music, Art history, Philosophy, American civics
computing: Computer programming, Cryptography & information theory
test prep: SAT, CAHSEE, MCAT, IIT JEE, NCLEX-RN, AP Art History, GMAT
partner content: The Museum of Modern Art, Crash Course, The J. Paul Getty Museum, Stanford School of Medicine, California Academy of Sciences, MIT+K12, Exploratorium, LeBron asks, Asian Art Museum, The Brookings Institution, All-Star Orchestra, The Aspen Institute, Silicon Schools Fund and Clayton Christensen Institute, NASA
original data (source: api-explorer.khanacademy.org, tech.coursera.org/app-platform/catalog/) | May 2014
categories
Arts
Biology & Life Sciences
Business & Management
Chemistry
Computer Science: Artificial Intelligence
Computer Science: Software Engineering
Computer Science: Systems & Security
Computer Science: Theory
Economics & Finance
Education
Energy & Earth Sciences
Engineering
Food and Nutrition
Health & Society
Humanities
Information, Tech & Design
Law
Mathematics
Medicine
Music, Film, and Audio
Physical & Earth Sciences
Physics
Social Sciences
Statistics and Data Analysis
Teacher Professional Development
languages
English
Chinese
Spanish
French
Russian
Portuguese
Turkish
Ukrainian
German
Hebrew
Japanese
Arabic
Greek
Italian
The official APIs of both services allow us to obtain lists of all open educational resources that
can be found on Coursera’s & Khan Academy’s websites.
Coursera offers (as of May 2014) 664 courses – consisting of (video) lectures, additional
materials, a community forum & assignments, suitable (not only) in case one wants to gain a (paid)
signature track certificate and/or complete a whole specialization – organized into 25
categories, taught mainly by University/College Professors, and manifestly aimed at
University/College students, (prospective or current) professionals and/or life-long learners.
We might want to complement Coursera’s built-in categorization of courses (on the left), since
– despite being appropriate for Coursera’s users – it might be too general for our understanding of
its content, given the rather niche specialization of higher education courses. The
common features of the educational content available on Coursera can also be understood through
a simple frequency analysis* of keywords in the courses’ names. The most frequent ones tell us
there are “at least” – based only on the detected “introduction” keyword – 11% introductory-level
courses, generally rather “science” courses, sometimes serialized (having more “parts”).
Slightly favoured topics are “health”, “learning”, “global” & “data”, followed by (not included in
the table, about 0.5-1 times less frequent) “teaching”, “analysis”, “programming”, “systems”,
“history”, “world”, “management”, “engineering”, “chemistry” & “society”. Note that Coursera
can also be defined by the Universities & other partners with which it cooperates on the
preparation/distribution of courses and/or to which it provides a MOOC platform. We will talk
about them in the following parts of the text.
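The frequency analysis of course names can be sketched as follows. The course titles are made-up placeholders (not the actual Coursera catalog), and the tokenization rule and the `keyword_course_counts` helper are assumptions for illustration. The last lines only re-derive the 11% share from the reported 76 “introduction” hits among 664 courses.

```python
import re
from collections import Counter


def keyword_course_counts(titles):
    """Count, for each keyword, how many titles contain it at least once."""
    counts = Counter()
    for title in titles:
        # Lowercase, keep runs of letters, de-duplicate within one title.
        words = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(words)
    return counts


# Hypothetical titles, used only to demonstrate the counting.
titles = [
    "Introduction to Data Science",
    "Global Health: An Introduction",
    "Machine Learning, Part 1",
    "Machine Learning, Part 2",
]

counts = keyword_course_counts(titles)
for word in ("introduction", "part", "data"):
    share = 100 * counts[word] / len(titles)
    print(f"{word}: {counts[word]} ({share:.0f}%)")

# Re-deriving the reported share: 76 "introduction" hits out of 664 courses.
reported_share = round(100 * 76 / 664)
print(reported_share)
```

Counting each keyword once per title keeps the share interpretable as a proportion of courses. A real run would also need a stop-word list so that words such as “to” or “an” do not dominate the ranking.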
Khan Academy’s 56 categories of open educational resources – consisting of thousands of
small lectures, practice problems, mini-projects, points & badges (achievements) to reinforce
extrinsic motivation, discussion forums, and a learning management environment for learners,
educators, or parents – are structured quite clearly, in line with its “US common core
primary/high school education” nature. Khan Academy’s primary & high school curriculum – as
we will also see later on – is frequently used to “fix” one’s general education as well. The
content is – almost exclusively – created by the founder of Khan Academy, Salman Khan.
Both platforms offer open educational materials in languages other than English. To be more
specific, there are courses taught directly in a language other than English (Coursera) and
courses/lectures with available translated transcripts (in-video subtitles) created by volunteer
communities (Khan Academy, recently also Coursera).
* Such analysis serves as an introductory one only. As we will hopefully see by the end of this report,
social web and social & online media mentions give us a much more accurate overall picture.
web traffic
In spite of the fact that Coursera generally appears to reach a larger audience – according to both pages’ traffic estimates* – the distinction seems to be blurred in the US – possibly because Khan
Academy offers educational resources tailored for the US K-12 education system. The traffic of the “newer” of the two institutions, Coursera, gives the impression of growing steadily, while Khan
Academy’s most recent traffic seems to be rather steady, allowing for its regular fluctuations (I bet you won’t find summer holidays in KA’s line chart =)).
Other metrics/estimates might also be biased by the composition of Alexa’s global traffic panel*. However, for potential future purposes (e.g. hypothesis testing), we might temporarily (until we have
other data) assume that Coursera is visited more by men, whereas Khan Academy’s users are more often women. Khan Academy might also be frequently visited by students/professionals who
possibly want to fill the gaps in their general education.
* Alexa's traffic estimates are based on data from its global traffic panel, which is a sample of millions of Internet users using one of over 25,000 different browser extensions.
third-party data (source: alexa.com) | May 2014
Both services appear to be most popular – no surprise here – in the US. However, we might argue that while Coursera seems to have a wider reach around the globe, in the non-(natively-)English-speaking world, Khan Academy slightly
leads in Canada & the UK. Besides the importance of the names/brands of both institutions, the top search engine keywords provided by Alexa do not say much. Yet, they already hint at the discussion, arising from the social media
content analysis in the following parts of the text, about Coursera’s brand being more “diverse”, based on the particular courses one takes, Professors & partner institutions; and Khan Academy’s brand being more “centralized” &
dependent on the person of Salman Khan – besides other things caused by the difference in educational content production on both websites. The sharp rise of search traffic in mid-2013, observable in both graphs, was
not examined in more detail, since it is not the subject of this research and might be caused by a modification of Alexa’s methodology and/or increased publicity of both thanks to an influential publisher/medium.
The inbound link analysis suggests that about three times more sites link to Khan Academy than to Coursera – on the other hand, as we already know,
Khan Academy has existed about four times longer than Coursera. In general, the importance of social media (Facebook & YouTube in particular) for both
institutions’ traffic is shown. Khan Academy’s subdomains show the (web traffic) importance of SmartHistory, KA’s resource for art history, and two of its
volunteer translator communities.
You might also notice the Turkish newspaper Hurriyet (Coursera) – we will explain the connection between Coursera & Turkey later – and “the” Chinese
online medium & SNS (social networking service), Sina. Unsurprisingly, given the large Indian population, there’s a significant interest in Coursera in
India (google.co.in). We will talk about both services’ geography & demography in general later on.
estimated 8,022,800 monthly visits
estimated 4,152,930 monthly visitors
estimated 4,055,700 monthly visits
estimated 3,149,610 monthly visitors
Coursera.org top pages
Course: Machine Learning
Course: Human-Computer Interactio
Course: Cryptography I
Course: Computer Science 101
KhanAcademy.org top pages
About | Khan Academy
Knowledge Map | Khan Academy
Computer programming | Khan Academy
third-party data (source: site-seo-analysis.com, opensiteexplorer.org, trafficestimate.com) | May 2014
Once again, although the website traffic estimates are only approximate, even this page
supports the upcoming discussion of Khan Academy having a stronger community, and of
Coursera having a higher reach built around diverse target groups.
Perhaps because the transformation of education is “powered by” ICT development,
Coursera’s most popular courses seem to be the techn(olog)ical ones.
keyword performance
third-party data (source: semrush.com) | May 2014
Organic keywords from an alternative source seem to support the previous discussion. Moreover, they emphasize Coursera’s & Khan Academy’s partner institutions.
The analysis of competitors based on organic search shows that Khan Academy is looked up as a source of open educational resources for mathematics. Coursera appears to (organically) be a general online class platform, a MOOC platform in particular, since it competes with websites offering University/College and/or professional online courses.
business insight
number of followers: 14,399 | number of followers: 33,435
original data (source: linkedin.com) | May 2014
We’ll start our quick business insight into Coursera & Khan Academy by examining
their profiles on the business-oriented social network LinkedIn.
The for-profit vs. non-profit difference between the two institutions is reflected in the
fact that Coursera, unlike Khan Academy, updates its company page on a regular
basis, focusing its news & stories on the college/professional population & on recruiting
new employees. Coursera is (expectedly) also larger in terms of the number of
employees (/company size).
While the previous organic search (/keyword) competition was based on search
terms – i.e., slightly more text-based categorization – LinkedIn’s recommendation
system gives the “related search recommendations” using clicks, term overlap &
length bias [self] – i.e., supposedly, even in our case of “people also viewed”, those
should be slightly more human behavior-based recommendations.
We can see that both portals are perceived to be among the leaders in the realm of
online/MOOC education (edX, Udacity, Udemy). Grockit – providing US
standardized exam preparation – reflects KA’s recent educational content &
partnerships (discussed in the following parts of the text). Finally, we should also
mention Knewton, an adaptive learning (/educational content personalization)
platform. Its presence among other educational tools complements our conception
of current trends in education.
Another methodology of finding competitors, this time
crowd-sourced, brings some new players to the overall
picture of online courses market.
third-party data (source: crunchbase.com) | May 2014
Information about funding – from our viewpoint – seems to be
important only for identifying the influences on Coursera’s operation,
i.e. its investors (similarly to its partner institutions, discussed later).
Nevertheless, it might be misleading to assume that Khan
Academy just “lives a life of its own”. Since this topic will show up
many times in the following two parts of the text, it will be enough
to mention its recent partnerships with College Board, Bank of
America, NASA, and the White House; and its financial backing
from the Bill & Melinda Gates Foundation.
Institutions are – arguably, to a great extent – shaped by
people who work in them (& therefore determine their
development). Besides the founders of both institutions,
Salman Khan (Khan Academy), and Daphne Koller &
Andrew Ng (Coursera), we can see other members of
the board, employees and/or partners, about whom we
can learn more simply by looking up their names in our
favourite web search engine.
Coursera Inc.
1975 El Camino Real West, Mountain View, CA 94040, United States
www.coursera.org
Industry Internet Educational Services
Employees 45
SIC Schools & Educational Services, Nec (8290)
NAICS Educational Support Services (611710)
People
Vincent Price Member of Advisory Board
Margaret Sheil Member of Advisory Board
Rafael Bras Member of Advisory Board
John Etchemendy Member of Advisory Board
Peter Lange Member of Advisory Board
Phyllis Wise Member of Advisory Board
Andrew Ng Co-Founder
Philip Hanlon Member of Advisory Board
Christopher Eisgruber Member of Advisory Board
Patrick Aebischer Member of Advisory Board
John Doerr Director
Scott Sandell Director
Vice President(s)
Jessica Neal Vice President - Talent
Chief Executive Officer
Daphne Koller co-founder and co-CEO
Richard Levin Chief Executive Officer
President(s)
Lila Ibrahim President
business insight
Khan Academy Inc.
PO Box 1630, Mountain View, CA 94042, United States
www.khanacademy.org
Industry: Educational Services
Employees: N/A
SIC: Services-Educational Services (8200)
NAICS: Administration of Education Programs (923110)
Director(s)
Salman Khan, Founder & Executive Director
Other
Jennifer Overholt, Volunteer and Math Content Creator
The same query using an alternative source should be
enough* to roughly illustrate the difference in organizational
structure between a for-profit & a non-profit organization, which
will complement our future conclusions about Coursera’s &
Khan Academy’s communities.
* If we were to create a list of both companies’ team members
that is as complete as possible, we should certainly also use web
search engines, and search blogs, news articles, different social media APIs etc.
A good place to start would be here for Coursera
& here for Khan Academy.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: insideview.com) | May 2014
[Chart: Google Trends search interest over time for the “coursera” & “khan academy” keywords, with news headline markers A to K]
google search
While the traffic estimates appeared to work in
favour of Coursera, Google searches for the
“coursera” & “khan academy” keywords, by
contrast, show overall higher demand for Khan
Academy. As we will see later on, this is
arguably because Coursera is often found via
particular courses (e.g. ‘machine learning’),
whereas Khan Academy is a rather centralized &
compact tool closely associated with its founder,
Salman Khan, & his stances towards education,
as mediated by online publishers.
News headlines found by Google Trends – red
about Coursera & blue about Khan Academy –
also allow us to start shaping the overall picture
of Coursera’s & Khan Academy’s online
mentions (which will be further developed later,
especially thanks to social media mentions &
inbound links).
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: google.com/trends) | May 2014
google search
The world search interest maps, which relate to the keyword search interest over time shown previously, can be seen as showing the current markets of both educational tools. While Khan Academy is looked up in countries like Ghana, Singapore & Greece, the worldwide
popularity of Coursera (beyond the US & Canada) seems to depend on larger University cities (possibly with a better educated population). Once more (& again unsurprisingly) we can see high demand for both portals in India. Regarding Coursera, Bangladesh
might surprise us by being among the top.*
* Numbers represent search volume relative to the highest point on the map, which is always 100.
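The relative scale mentioned in the note on the map numbers can be sketched in a few lines. This is only an illustration of "rescale so that the highest point equals 100", not Google's actual procedure; the raw values are made-up placeholders and `to_relative_index` is a hypothetical helper name.

```python
def to_relative_index(raw_values):
    """Rescale values so that the largest one becomes 100."""
    peak = max(raw_values.values())
    return {region: round(100 * value / peak) for region, value in raw_values.items()}


# Placeholder numbers for illustration only (not real search volumes).
raw = {"region A": 800, "region B": 400, "region C": 200}
index = to_relative_index(raw)
print(index)
```

Because each map is rescaled to its own maximum, a value of 100 on the Coursera map and a value of 100 on the Khan Academy map do not imply equal absolute demand.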
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: google.com/trends/) | May 2014
“society” statistics
Twitter
rank  profile         following   followers
1.    Barack Obama    650,026     43,714,421
2.    Dalai Lama      0           9,030,143
3.    NASA            218         6,813,595
…
      Khan Academy    64          301,000
…
      Coursera        300         182,000
Facebook
rank  profile         fans         engagement rate (average)
1.    Barack Obama    41,280,379   0.32%
2.    Narendra Modi   18,538,829   1.50%
3.    Mitt Romney     11,336,358   0.13%
…
      Khan Academy    727,524      N/A
…
      Coursera        547,273      N/A
tags: conference, csr, education, governmental, ngo, politics, professional association, science
Slowly shifting to social media, we can start with a popularity ranking from SocialBakers’ proprietary database, which gathers social media data on a regular basis and therefore allows us to make some
general and/or longitudinal comparisons. Let’s start with fan pages.* The “society” category includes the “conference”, “csr”, “education”, “governmental”, “ngo”, “politics”, “professional association” &
“science” tags. It shows that Khan Academy is overall more popular than Coursera on both Facebook & Twitter, as measured by the number of fans/followers. Broadly speaking, it is also obvious that
political & spiritual leaders are more popular than educational leaders. To complement the left (Facebook) table, we can add that the TED conference was among the top 10. As for other
educational institutions, Harvard University, for example, was in the top 40. Likewise, using our “educational lens” when looking at the right (Twitter) table, the “third place” of NASA, whose shared content
can usually be labeled as “educational”, should make us happy.
* And their analogues on other (non-Facebook) SNSs.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | May 2014
“society” statistics
tags: conference, csr, education, governmental, ngo, politics, professional association, science
Google+
rank  profile         views         followers
1.    Barack Obama    84,520,413    4,327,072
2.    Jamie Oliver    31,686,664    2,389,056
3.    Narendra Modi   114,569,200   1,493,332
…
      Coursera        22,833,545    1,067,922
…
      Khan Academy    6,376,165     343,686
YouTube
rank  profile              views           subscribers
1.    Super Simple Songs   1,508,583,613   1,089,874
2.    Howcast              1,362,150,063   2,117,582
3.    Khan Academy         114,569,200     2,078,502
…
      Coursera             977,490         32,128
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | May 2014
Turning to the YouTube leaderboard in the “society” category, we can see that educational content wins. Both Super Simple Songs & Howcast offer educational content,
yet they target very different users. Khan Academy is the third most viewed channel in its category*, which demonstrates the popularity of Salman Khan’s videos; these
are uploaded directly to YouTube (and displayed on Khan Academy’s webpage as embedded videos). Since Coursera’s course videos are uploaded to its own (online)
learning environment, accessible only after enrolling in a particular course, Coursera does not have that much video content to share, and that’s perhaps why it is
not so popular on YouTube. In the top 15 of the “society” category on YouTube, we can (again) find the popular TED conference, Stanford University, TED-Ed,
MIT OpenCourseWare & several other educational institutions.
What about Google+? The top 50 of the “society” category includes several educational institutions, mainly University Google+ pages (e.g. Stanford).
Even if you have not yet been exposed to much social media research, the fact that the “first place” is occupied by the same person on Twitter, Facebook & Google+
should at least make you suspect who “the” politician very good with them is.
To sum up, we can say that, of the four SNSs we have discussed, YouTube is the most suitable platform for education. Moreover, Facebook, Twitter &
Google+ users are keener to follow politicians than educators.**
* Also note Khan Academy’s disproportionate ratio of views to subscribers, as compared to Super Simple Songs & Howcast. We can argue that Khan Academy requires the
most (learning) concentration of all three, since nearly all of its videos are Salman Khan’s micro lectures, typically on STEM (science, technology, engineering, and mathematics) topics.
** Jamie Oliver might then go beyond our categorization. My personal preference would be an “entertainment” category. However, since Jamie categorized his Facebook page as “public
figure”, we might argue that there’s some educational/political/societal/cultural value instilled in recipes when discussing properties of food (nevertheless, with that logic, we might also
justify the educational value of the well-known extensive collection of YouTube makeup tutorials =)).
number of fans: 529,115 number of fans: 714,626
Before we get to the (rather qualitative) content analysis in the second
& third part of this report, let’s focus on quantitative metrics which can
provide us with a general overview of Coursera’s (orange) & Khan
Academy’s (green) reach.
As the chart shows, the numbers of Coursera & Khan Academy fans rose
steadily over the years.* The overall Facebook fan base of Khan Academy
is larger than the fan base of Coursera.
* Leaving aside the sudden jump in the number of Khan Academy fans
around March 2014, which might be related to a significant event but might as well
be a “bug” in Wildfire’s data and/or web reporting.
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014
Looking at data from an alternative third-party source makes it possible to
verify our previous conclusions. Since we can, once again, see the
sudden jump in the number of Khan Academy fans at the end of March
2014 & would like to explain it, we need to “borrow” some findings
from the following two parts of this text: the March/April 2014 increase
in KA’s number of fans might be caused by KA’s
new partnerships, especially its co-operation with College Board (SAT*
preparation).
* The SAT is a widely used standardized test for US college admissions.
Coursera (number of fans: 529,115)
country          fans      share
United States    90,976    17.2 %
India            69,061    13.1 %
Brazil           37,235    7.0 %
Egypt            18,603    3.5 %
Mexico           14,394    2.7 %
United Kingdom   13,580    2.6 %
Spain            12,542    2.4 %
Canada           11,138    2.1 %
Greece           10,807    2.0 %
Khan Academy (number of fans: 714,626)
country          fans      share
United States    322,341   45.1 %
India            54,447    7.6 %
Canada           26,592    3.7 %
Bangladesh       25,471    3.6 %
Pakistan         24,128    3.4 %
Brazil           21,998    3.1 %
United Kingdom   17,618    2.5 %
Egypt            15,148    2.1 %
Australia        13,587    1.9 %
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014
With regard to the nationality of Coursera’s & Khan Academy’s Facebook fans, we can
support, and in some cases also complement (especially towards this research’s
overall conclusions), the conclusions drawn from the Google search query statistics
(compare with page 21). Furthermore, taking a closer look at the left (Coursera’s)
table, we can see that Coursera seems to be popular in Latin America.
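The percentage column of the country tables can be reproduced from the fan counts and the page totals. A minimal sketch follows, using only the top three rows of each table; the Khan Academy United States count is read here as 322,341 (an assumption about the misplaced comma in the extracted figure), and `country_share` is a hypothetical helper name.

```python
# Page totals and top-three country counts taken from the tables above.
total_fans = {"Coursera": 529_115, "Khan Academy": 714_626}
fans_by_country = {
    "Coursera": {"United States": 90_976, "India": 69_061, "Brazil": 37_235},
    # 322_341 is an assumed reading of the garbled "32,2341" in the source.
    "Khan Academy": {"United States": 322_341, "India": 54_447, "Canada": 26_592},
}


def country_share(page, country):
    """Percentage of a page's fans located in one country, rounded to 0.1."""
    return round(100 * fans_by_country[page][country] / total_fans[page], 1)


for page, countries in fans_by_country.items():
    for country in countries:
        print(page, country, country_share(page, country))
```

The assumed reading of the Khan Academy figure is consistent with the 45.1 % share printed next to it.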
number of followers: 176,619 number of followers: 295,146
competitive analysis
In comparison with Facebook, the potential reach of Coursera & Khan
Academy among Twitter users shows a larger gap between the two,
(again) in favor of “the older” Khan Academy, which has roughly 1.7 times
as many Twitter followers as Coursera.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014
              Coursera           Khan Academy
followers     176,619            295,146
following     298                63
tweets        1,712              1,072
twitter age   2 years 9 months   5 years 7 months
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014
Reports from an alternative third-party source (SocialBakers) tell us the same story. Yet it should be stressed once more that Coursera is a bit younger (and not only in terms of its Twitter
account age) and has therefore had less time for fan acquisition. Just for the record, we might add that Coursera’s Twitter account is more active than Khan Academy’s. Analysis of
followers is the subject of the second part of this report.
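The "more active" remark and the size of the follower gap can be checked with simple arithmetic on the June 2014 account statistics quoted above. The sketch below treats "tweets per month of account age" as the activity measure, which is my own simplification rather than a metric reported by either data source.

```python
# June 2014 account statistics as quoted in this section.
accounts = {
    "Coursera": {"tweets": 1_712, "age_months": 2 * 12 + 9, "followers": 176_619},
    "Khan Academy": {"tweets": 1_072, "age_months": 5 * 12 + 7, "followers": 295_146},
}


def tweets_per_month(name):
    """Average number of tweets per month since the account was created."""
    account = accounts[name]
    return account["tweets"] / account["age_months"]


follower_ratio = accounts["Khan Academy"]["followers"] / accounts["Coursera"]["followers"]

for name in accounts:
    print(name, round(tweets_per_month(name), 1))
print("Khan Academy / Coursera followers:", round(follower_ratio, 2))
```

On these numbers Coursera tweets roughly three times as often per month as Khan Academy, while Khan Academy has about 1.7 times as many followers.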
followers: 1,067,922 followers: 343,686
competitive analysis
By contrast, on Google+ Coursera looks like the absolute winner if
the “number of followers” metric is used, although KA’s fan base might be,
according to Wildfire’s data, growing much faster over the last 3 months
(as of June 2014).
Don’t be misled by the date on which Wildfire began monitoring both pages.
Querying the Google+ API, we find that Khan Academy’s first ever post
on Google+ was published in December 2011; Coursera’s was published
in April 2012. Given such a small difference in “starting time”,
and given that each organization’s Facebook & Google+ content
strategies do not differ significantly* (see the following section), the gap
illustrates how different the g+ user base is. Even without collecting users’
socio-demographic data, we might argue that, in comparison with the other
studied SNSs, the g+ population is generally older and/or has a specific
interest profile.
* Just to make the statement absolutely clear: Coursera’s content strategy is
different from Khan Academy’s content strategy. Coursera’s Facebook & Google+
content strategies are very much alike. Khan Academy’s Facebook & Google+
content strategies are very much alike.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: wildfireapp.com) | June 2014
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014
Unfortunately, Coursera was not monitored by our alternative source of g+ followers time
series. At least, we can confirm that the number of Khan Academy’s Google+ followers has
been growing very fast since 2014.*
* Possibly even further back in time but we don’t have enough data available to verify that.
              Coursera   Khan Academy
videos        319        4,201
total views   951,021    406,597,063
subscribers   31,535     1,781,982
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014
On the subject of YouTube subscribers, we could certainly look at their relationship to the
number of videos uploaded and/or total views. However, the only highlight here is Khan
Academy’s total views. As we mentioned before, Coursera falls short in this regard. While
a Coursera video is, on average, viewed about 3,000 times, an average Khan
Academy video gets almost 100,000 views.
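The two averages quoted above follow directly from the channel statistics (total views divided by number of videos). A minimal sketch of that arithmetic, with `average_views` as a hypothetical helper name:

```python
# Channel statistics as quoted in this section (June 2014).
channels = {
    "Coursera": {"videos": 319, "total_views": 951_021},
    "Khan Academy": {"videos": 4_201, "total_views": 406_597_063},
}


def average_views(name):
    """Mean number of views per uploaded video."""
    channel = channels[name]
    return channel["total_views"] / channel["videos"]


for name in channels:
    print(name, round(average_views(name)))
```

Note that a mean hides the distribution: a handful of very popular videos can pull the average far above what a typical video receives.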
competitive analysis
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: socialbakers.com) | June 2014
Khan Academy is based on sharing YouTube videos, and we can therefore observe a long-term
steady increase in uploaded video views, reflecting its (rather) constant content strategy. Coursera, by contrast, is still
learning to work with YouTube, and there’s a noticeable improvement over the last few months.
The very last interesting insight we’ll have before moving on to the
(rather) qualitative section of this text based on our own (primary) data,
is revealing the MOOC education tools universe/landscape. We can
easily do that thanks to a powerful Wikipedia article graph exploration
interface, Wikinsights, based on content, category & links (both,
inbound & outbound) similarity, and links complementarity. What binds
Coursera & Khan Academy together – based on the aforementioned
criteria – are Open educational resources, CK-12 Foundation, ALISON
(company), OpenCourseWare, Massive open online course, Udacity,
MIT OpenCourseWare & LearnStreet.
Regarding Wikipedia pages related to Coursera only, we can see
Wikiversity, Charles Severance, Massive online open research,
Creative Live, TechChange, Udemy, Ben Bederson, Edsby, Iversity,
Academic Earth, Lynda.com, Eliademy, EduKart, Daniel S. Weld,
Open-source curriculum, edX & Open Learning.
As for Wikipedia pages related to Khan Academy only, there are
Technology integration, Interactive Learning, Two Circles, OER
Commons, Free High School Science Texts, Educational technology,
Open textbook, American Friends of Arts et Métiers ParisTech, Curriki,
Virtual university, LearnThat Foundation, MITx, PhET Interactive
Simulations, Teaching Channel, Computers in the classroom,
E-learning, INeedAPencil, Saylor Foundation, CollectSPACE, Lecture
recording, Open Source Learning, East Bay Children's Book Project.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose third-party data (source: http://wikinsights.org/) | June 2014
wikipedia insights
Coursera & Khan Academy
social web presence qualitatively
facebook, twitter & google+ pages, groups, social networks, communities, posts, fans,
demographics, traffic sources, keywords; personal network & interest profiles, search
results, news articles, text mining, inbound links, reddit, youtube
(pp. 34-95)
original social web data
mission
We are on a mission to change the world by offering classes from top
universities and professors online, and for free. We envision people throughout
the world, in both developed and developing countries, to learn without limits by
using our platform to connect to great education that has so far been available
only to a select few. We hope to empower people with education that will
improve their lives, the lives of their families, and the communities they live in.
founded 2012
category Education
description
Learn from renowned professors, watch high quality lectures online, achieve
mastery via interactive assignments, and collaborate with a global community of
students.
Please visit help.coursera.org to get help or to drop your suggestions!
How has Coursera changed your life? Tell us your story!
http://blog.coursera.org/student/stories
awards 2012 TechCrunch Crunchies - Best New Startup
products Coursera Education Platform
company overview
Start learning now at Khan Academy. All our resources are completely free, forever.
Too many people around the globe don't have access to high quality educational materials, or are
forced to learn through a system that doesn't allow them to learn at their own pace.
We think the technology exists today to fundamentally change this, and our 501(c)3 non-profit is
working to build the tools and resources every learner deserves.
mission
We are a not-for-profit organization with the mission of providing a free world class education for
anyone, anywhere.
founded 2007
category App page
products
We offer tutorials on everything from basic arithmetic to calculus, chemistry, physics, history, art,
medicine, economics, and finance. Khan Academy also offers infinite problems for practice
(currently in Math).
We are translating content into the world's most spoken languages (click on the subtitles option on
a video or visit the International option in the bottom right corner of www.khanacademy.org).
To get started using Khan Academy, check out: http://www.khanacademy.org/about/getting-started
If you'd like to contribute to our efforts, please visit http://www.khanacademy.org/contribute
To share your story about the impact Khan Academy has had on you (videos are much
appreciated!), please visit www.khanacademy.org/stories
0.39% active (Jan-Jul 2014) 0.68% active (Jan-Jul 2014)
about
Querying the Facebook API for both pages (by their IDs), we get the basic (public) information that both organizations share about themselves on Facebook, and also a
common call to action (here a “call to engagement”) to share stories about the impact of education on one’s life. As we’ll see many times later, this is arguably also one of the
core features of “education as a brand”. While Coursera puts itself in the general “education” category, Khan Academy defines itself as an (educational)
“app(lication)”.
Another query gives us a list of all Coursera’s/KA’s active fans who engaged with one or the other page from January to May 2014*. We can see that Khan Academy’s
Facebook community is, in total, larger & more active.
* This period is valid for the following analyses as well, unless stated otherwise.
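For readers who want to repeat the "about" query, the request can be sketched as below. This only builds the request URL; it is not the script used for this report. The page identifiers ("coursera", "khanacademy") and the token placeholder are assumptions, `page_info_url` is a hypothetical helper, and which fields the Graph API actually returns depends on the API version and the permissions of the access token.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"
# Page fields matching the "about" information shown above.
PAGE_FIELDS = ["about", "mission", "founded", "category",
               "description", "awards", "products"]


def page_info_url(page_id, access_token, fields=PAGE_FIELDS):
    """Build a Graph API URL requesting public profile fields of a page."""
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"{GRAPH_BASE}/{page_id}?{query}"


# Assumed page identifiers; replace the token placeholder with a real one.
for page_id in ("coursera", "khanacademy"):
    print(page_info_url(page_id, "YOUR_ACCESS_TOKEN"))
```

Fetching the printed URL with any HTTP client returns a JSON object whose keys correspond to the requested fields, provided the token is valid.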
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: developers.facebook.com/docs/graph-api/) | January-May 2014
pages Coursera liked
profile                                                                category
Working Together towards Health for All through Primary Health Care    Community
Archaeology's Dirty Little Secrets                                     Community
INDEX: Design to Improve Life                                          Non-profit organization
The Wall Street Journal                                                Media/news/publishing
NPR Classical                                                          Arts & Entertainment
Lifehacker                                                             Computers/internet website
pages KA liked
profile                                                                category
KA Lite                                                                Product/service
page likes
Before we dive into the content Coursera & Khan Academy share on their social
media profiles & the typologies of their fans built around comment networks, let’s
ask how Coursera & Khan Academy position themselves based on
the pages they like (on Facebook) or the profiles they follow & mention (on Twitter), and
also look at the “grassroots” social media communities around both educational tools.
Due to the design of the social network & its “culture” (common practice), there’s
not much to see on Facebook with regard to pages giving likes to other pages.
While Coursera expressed support for some non-profit communities/organizations
(one of them related to an archaeology course on Coursera, and another related to
Johns Hopkins University, one of Coursera’s key partners; see later on) and three
publishers (news/tech/entertainment), Khan Academy liked only its offline version
(Khan Academy Lite).
On the next pages, you will find the somewhat more interesting Twitter “self-positioning”
profiles of Coursera & Khan Academy.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: developers.facebook.com/docs/graph-api/) | January-May 2014
following: 63
following/friends network
A Twitter following/friends relationship seen from the perspective of an
institution (whom it follows, not necessarily the other way
around) can be read as how the institution wants to position itself,
and therefore whom it wants to publicly support. This is often true
for “human users” as well; however, human users might also
follow another user (incl. an institution) simply to “subscribe” to
their content, e.g. in order to benefit from it. The
second statement can also be true for an institution, e.g. for its
social media manager who shares third-party content as part of the
institution’s communication strategy. To exit this nested loop, let’s
assume that the act of Khan Academy or Coursera following another
user reflects (offline to online relationship) and/or co-creates
(attempt at online to offline relationship, incl. support) its brand.
Dealing first with the network* of tweeters Khan Academy is
following, the most important users, as measured by number of
followers (the size of a node is proportional to its number of followers),
are Bill Gates (less active, with 1,286 tweets since he
joined Twitter in 2009) & NASA (310,042 tweets since it joined
Twitter in 2007).
On the topic of the most active users in KA’s friends network, who
arguably tweet for their own (but possibly overlapping with KA’s)
communities, right after NASA & Bill Gates there’s “jack” &
“pamelafox”, the former being Jack Dorsey, Twitter co-founder, and
the latter being Pamela Fox, who works at Khan Academy on the
Computer Science curriculum. Also note that besides the financial
backing of Khan Academy from the Bill & Melinda Gates
Foundation, already mentioned in the first section of this text,
Bill Gates has also brought a lot of media attention to Khan
Academy. If you want more elaborate evidence than just the search
results after querying “bill gates khan academy” in your favourite
search engine, there's a named entity recognition analysis of
online news articles about Coursera & Khan Academy near the
end of this extensive chapter (p. 87).
The previous paragraph highlights the fact that even when a
relationship network like this already provides us with a lot of
information, it’s always worth combining it with other data (third-
party, insider knowledge etc.) to gain a detailed insight. One
source of information simply might not be enough.
* The primary tool for analyzing & visualizing social networks in this text was
NodeXL, a free, open-source template for Microsoft Excel. I’ve also adopted
several definitions of social network metrics from its built-in documentation
and/or from the documentation’s links to Wikipedia articles describing them.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_IsFollowing.csv, Twitter_Khan_IsFollowing_MetaData.csv) | May 2014
following: 63
Centrality measures, in this picture “degree” (the
number of connections a node has), show that,
among the users KA is following, it is people
rather than institutions who are interconnected. Why is
that? Khan Academy is mainly following the “core
team” of its employees and/or volunteers (ICT
development and/or content creation) & KA’s
branches (e.g. Smarthistory). The bottom curve is
formed by several institutions. While many of them
perfectly fit the “education for anyone” part of KA’s
brand (e.g. UncommonSchools, TeachForAmerica
etc.), we’ll see in later analyses that the most
distinctive ones are NASA & CollegeBoard
(the “CollegeBoard” & “OfficialSAT” Twitter accounts).
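To make the “degree” measure concrete, here is a minimal sketch that counts the connections of each node in a small edge list. The network is a made-up toy example (the account names other than the ego are hypothetical), follow relationships are treated as undirected for simplicity, and this is not how NodeXL itself was run for the pictures in this report.

```python
from collections import defaultdict


def degrees(edges):
    """Number of distinct neighbours per node, ignoring edge direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # skip self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


# Hypothetical toy network: the ego follows three accounts,
# and two of those accounts also follow each other.
edges = [
    ("khanacademy", "person_a"),
    ("khanacademy", "person_b"),
    ("khanacademy", "institution_x"),
    ("person_a", "person_b"),
]
deg = degrees(edges)
print(deg)
```

In this toy case the two people end up with a higher degree than the institution, which mirrors the pattern described above: team members who follow each other are more interconnected than the institutions KA follows.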
following/friends network
following: 300
In comparison with the “small Khan Academy
family” – employees, key partner institutions
& a few “support the right thing” expressions
– Coursera clearly makes an effort to define
itself by the users it follows. That’s why its
friends network is slightly more diverse.
There’s no Bill Gates (yet, the Bill & Melinda
Gates Foundation is present), but Bill Clinton;
there’s Obama, Oprah and other public
figures and/or famous & influential people
(incl. entertainment & education/science
popularization, such as Vsauce).* Apart from
“celebrities”, Coursera also follows many
online tech & news publishers (& also some
education-specialized) and, above all, many
educational institutions and/or educators,
usually those with whom Coursera co-
operates (see the next page).
With regard to the most active tweeters in the
network, “the leader” is The New York Times
with its 136,887 tweets since it joined Twitter
in March 2007. Even though it’s a bit shy,
hiding behind Barack Obama**, NY Times is
also the third most influential (as measured
by number of followers) user in the network
(after Barack & Oprah).
* The size of nodes/users in the social network
graph represents the number of their followers.
** My mistake, sorry for that layout. =)
following/friends network
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_IsFollowing.csv, Twitter_Coursera_IsFollowing_MetaData.csv) | May 2014
following: 300
Switching from “number of followers” to “degree” does not, by itself,
tell us anything meaningful about Coursera’s brand.
Let’s rather try to find groups: “natural” clusters of users
based on the interconnectedness of the vertices (the larger the
vertex, the higher its clustering coefficient). This way, we
can see the first traces of a generalization (very rough but
sufficient for our purposes): the two general (&
overlapping) groups of users Coursera is following are
1) those Coursera wants to be co-defined by, incl. media
partners; 2) those featuring their courses on the Coursera
platform: institutions, mainly Colleges/Universities, individual
lecturers/Professors and/or particular classes/courses.
However, don’t forget that our picture does not show a
typology of Coursera’s “friends” (those whom Coursera follows)
but the 1.5 level network, i.e. one that includes the links between
Coursera’s friends. The clusters are based on these links,
which helped us to see the structure of the network more
clearly (compared to the “blob” on the previous page) & to make
the aforementioned generalization, further supported on the
following pages of this text.
If you find our first clustering (not the generalization built
around it, but the clustering itself) a bit difficult to interpret (I do!), since
there do not seem to be any obvious relationships “deeper” than our
generalization at our level of analysis, you
should know that there will be much clearer typologies
regarding Facebook & YouTube comment networks later on.
Finally, a fact especially important for fans of soap operas:
while Coursera follows Khan Academy, Khan Academy does
not follow it back. =)
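The two notions used above, the 1.5 level network and the clustering coefficient, can be illustrated on a toy edge list. This is a sketch under simplifying assumptions: edges are treated as undirected, all account names except the ego are hypothetical, and NodeXL's own implementation may differ in details such as edge direction.

```python
from itertools import combinations


def ego_network_1_5(edges, ego):
    """Keep the ego's ties plus the ties among the ego's direct neighbours."""
    friends = {b for a, b in edges if a == ego} | {a for a, b in edges if b == ego}
    keep = friends | {ego}
    return [(a, b) for a, b in edges if a in keep and b in keep]


def clustering_coefficient(edges, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    links = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    neighbours = {n for link in links if node in link for n in link if n != node}
    if len(neighbours) < 2:
        return 0.0
    pairs = list(combinations(sorted(neighbours), 2))
    closed = sum(1 for a, b in pairs if frozenset((a, b)) in links)
    return closed / len(pairs)


# Hypothetical toy data: two universities that follow each other,
# a news outlet, and accounts outside the ego's neighbourhood.
edges = [
    ("coursera", "uni_a"), ("coursera", "uni_b"), ("coursera", "news_x"),
    ("uni_a", "uni_b"), ("news_x", "outsider"), ("outsider", "other"),
]
ego_edges = ego_network_1_5(edges, "coursera")
print(ego_edges)
print(clustering_coefficient(ego_edges, "uni_a"))   # neighbours all connected
print(clustering_coefficient(ego_edges, "news_x"))  # only one neighbour left
```

Within the 1.5 level network, accounts whose neighbours also follow each other (here the two universities) get a high coefficient, which is what makes groups such as "partner institutions" visually stand out.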
following/friends network
followers: 176,619
top URLs
https://www.coursera.org/about/translate
https://www.coursera.org/course/teachingcharacter
top domains
coursera.org
nytimes.com
youtube.com
huffingtonpost.com
google.com
ryanseacrest.com
bbc.com
entrepreneur.com
charlierose.com
kplu.org
top hashtags
coursera14
edtech
gamification
coursera
top words
coursera
course
learn
education
new
online
courses
learning
top word pairs
co,founder
coursera,app
rick,levin
global,translator
translator,community
find,out
starts,today
online,courses
top mentioned
coursera
andrewyng
daphnekoller
relaygse
pennopencourses
oprah
women2
charlierose
juanpagalavis
top tweeters
huffingtonpost
businessinsider
felipebhz
nytimes
petchary
nytimesworld
_nastycat
slate
brainpicker
ws
timezone
Eastern Time (US & Canada) 203
Pacific Time (US & Canada) 152
Central Time (US & Canada) 59
London 34
Quito 34
Athens 30
Chennai 27
Amsterdam 25
Atlantic Time (Canada) 24
Brasilia 18
Greenland 18
Arizona 16
Hawaii 14
Mexico City 14
Tehran 14
Madrid 13
Alaska 12
Istanbul 12
Rome 11
Beijing 10
Caracas 10
recent tweets
On Twitter, Coursera has (as of June 2014) around 177 thousand followers. Since we’re still in realm of how the institution position itself, let’s take a look at the content of its 200 recent (June 2014)
tweets. This might give us some insight not only into the content Coursera (or Khan Academy on the following pages) shares but also into what & who Coursera mentions (and/or replies to) in its tweets.
Top domains show us whose educational content (and/or content about education & society – see the dataset, a zip file link enclosed on page 131) Coursera shares. Besides its own courses & major
media publishers – most of them here already considered before within our overall “mediasphere” picture of Coursera (see the conclusions) – we can (newly) find some public figures from entertainment
industry (Ryan Seacrest, Charlie Rose).
Top hashtags include “coursera”, “coursera14” (conference), (not only by online media publishers used) popular hashtag “edtech” (education technology) & gamification – a technique used to reinforce
user/learner engagement, as we will see later, rather associated with Khan Academy, occurring in relation to Coursera because of its very first course taught in Chinese, Probability (機率) by Professor
Ping-Cheng Yeh, National Taiwan University, who created “MOOC-based multi-student social game platform” for his course named PaGamO.
Top words & the three bottom rows of the “top word pairs” table illustrate the fact that Twitter is (unsurprisingly) used by Coursera to announce new courses in order to invite prospective learners to
enroll. In relation to our previous “educational resources” analysis pointing out that many Coursera’s courses are introductory level (and also supported by some following analyses studying other social
media), we can conclude that Coursera’s communication strategy – in accordance with its proclaimed mission we’ve seen before – aims to open up (& facilitate) higher education for mass audience,
within the “everyone can learn anything” notion.
The other three tables (excluding the “timezone” one) provide instances of shared educational content – e.g. Coursera courses by University of Pennsylvania (“pennopencourses”); interacting with
public(ly known) figures; and promotion of current events around both Coursera’s co-founders, Daphne Koller & Andrew Ng, and Coursera’s CEO, Rick Levin. Above all, we should emphasize
Coursera’s recent effort to recruit volunteers for translating (/subtitling) courses into other languages.
Just out of curiosity, you might take a look at the time zones of over 1500 of the most recent Coursera followers, which indicate where the most recent interest in Coursera is (we’ll also take those into account in our final conclusions). There is also the “top tweeters” table, which describes the top tweeters in the whole network and therefore combines mentions, recent followers & “friends” (the “is following” relationship). We’ll take a look at some “grassroot” influencers & “brand ambassadors” in the third section of this text, using a slightly different technique to capture them: monitoring mentions in tweets across the entire Twitter rather than focusing on a single (Coursera’s / Khan Academy’s) profile. Here, you might therefore (again) only be interested in the major publishers that Coursera follows and/or mentions (usually when they publish an article discussing Coursera).
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_AnyRelation_Edges.csv, Twitter_Coursera_AnyRelation_Vertices.csv) | June 2014
followers: 295,146
top domains
khanacademy.org
tumblr.com
youtube.com
google.com
nasa.gov
edsurge.com
paniit-bayarea.org
techcrunch.com
top hashtags
hourofcode
khanacademy
newsat
kabrainteaser
edtech
kamathchallenge
teacherappreciationweek
kacollegehero
stem
whsciencefair
top words
khanacademy
khan
new
academy
students
hourofcode
out
more
top word pairs
khan,academy
check,out
khanacademy,hourofcode
computer,science
math,mondays
find,out
sal,khan
top mentioned
khanacademy
lifeatka
pamelafox
collegeboard
britcruise
nasa
salkhanacademy
officialsat
calacademy
drszucker
top tweeters
danceeatrepeat
hntweets
imdrw
nytimes
washingtonpost
wsj
fastcompany
techcrunch
cnetnews
forbes
timezone
Eastern Time (US & Canada) 138
Pacific Time (US & Canada) 89
Central Time (US & Canada) 73
Athens 29
Atlantic Time (Canada) 25
Arizona 22
London 21
Amsterdam 17
Chennai 15
Brasilia 11
Hawaii 11
Alaska 10
Quito 10
Bangkok 9
Greenland 9
Istanbul 9
Mumbai 8
Brisbane 7
Mountain Time (US & Canada) 7
Rome 7
Sydney 7
top URLs
https://www.khanacademy.org/hour-of-code/hour-of-code-tutorial/v/welcome-hour-of-code
https://www.khanacademy.org/donate
http://www.nasa.gov/content/nasa-khan-academy-collaborate-to-bring-stem-opportunities-to-online-learners/#.U4T468biI9V
https://www.khanacademy.org/sat
https://www.khanacademy.org/partner-content/CAS-biodiversity
http://www.paniit-bayarea.org/edtech/
http://cs-blog.khanacademy.org/2014/03/what-does-computing-professional-look.html
https://docs.google.com/document/d/1QCen5ijdfEFiG_a_RGSGgHjBYI7Vc1lakHpcLQF4mkU/edit?usp=sharing
https://www.khanacademy.org/cs/second-avatar-naming-contest/2601896243
https://www.khanacademy.org/hour-of-code/hour-of-code-tutorial
recent tweets
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_AnyRelation_Edges.csv, Twitter_Khan_AnyRelation_Vertices.csv) | June 2014
As for Khan Academy’s recent tweets, the publicizing of the new Computer Science lectures is evident. The most distinctive is the “Hour of Code”, a trademark of Code.org, a “non-profit dedicated to expanding participation in computer science by making it available in more schools, and increasing participation by women and underrepresented students of color” [self]. Dealing with the top URLs, we can see the aforementioned HourOfCode, some other CS & education technology related content, a CS curriculum translation into Portuguese, some efforts to obtain donations, the NASA & CollegeBoard partnerships – we will see those two many times later on – and California Academy of Sciences partner content on Khan Academy. Top hashtags & mentions show that Khan Academy shares a lot of educational resources for traditional school education (collegeboard, officialsat, newsat, stem), challenges/brainteasers (kabrainteaser, kamathchallenge), background stories & new content/features development (“Life at KA” blog, pamelafox, britcruise, salkhanacademy) and also stories related to achievements in traditional school education (educators: teacherappreciationweek, learners: kacollegehero). Compared to Coursera, once again, we can see: a much more coherent universe of topics – to a certain extent given by the smaller overall number of topics around KA; a very active & publicized (on social media) small “core” team of KA’s content creators and/or ICT developers – e.g. John Resig (“jeresig”) & Ben Alpert (“soprano”), whom we’ve already seen in KA’s friends Twitter networks & will also see later on; better work with hashtags; and successful communication of core partners. Although we’ve already come across this topic & it will be supported by further data, let’s point out that we’re starting to see Khan Academy as the more “centralized” of the two institutions, with Salman Khan in the centre & a solid community around him, together “flipping” the traditional education system (the flipped classroom model), customizing it & being part of it at the same time.
Again, on the right you can see the time zones of about 1100 most recent Khan Academy followers & the “top tweeters” table.
Coursera pages & people KA pages & people
pages & people search
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: https://developers.google.com/+/api/) | May 2014
Pages & people API search allows us to discover communities around both educational tools on Google+. If we omit founders & employers/partners, we can see some new faces (e.g. Khan Academy’s John Resig, creator of the jQuery JavaScript library) and some irrelevant results (e.g. Shah Rukh Khan & Khanacademy Pesh). The snippets of the discovered pages (the first page of results only) illustrate the finding that Coursera’s Google+ communities are mainly study groups, while Khan Academy’s are mostly translations of KA’s content and fan & volunteer pages/communities. Dealing with those, we should not forget that we are limited by the “khan academy” and “coursera” keywords, which also favour the English language & Latin script – e.g. possibly not covering Arabic languages and/or communities & pages related to Coursera/KA but not including our keywords. Nevertheless, since we complement this information throughout this text, such limitations should not be an issue. Finally, we can add that, as we know from our Twitter data, Coursera also currently pursues its goal of establishing a volunteer translator community.
For a similar – this time rather quantitative – analysis on Facebook, see the text box on the right.
As for Coursera, over 90 fan pages & public(ly visible) groups were detected – the same
search limitations as we’ve mentioned with Google+ – consisting mainly of study/course
groups. Therefore we know that self-organization of students outside the “official”
Coursera’s learning environment exists.
Khan Academy has about 200 public(ly visible) fan pages & groups. These are mainly
language translations of (the “original”, English) Khan Academy (rather Facebook pages)
and/or serve as a project management tool for translators (rather Facebook groups).
pages & groups search
followers: 1,067,922
+1s top 3
Next on our agenda are content analyses of Coursera’s & Khan Academy’s top Google+ & Facebook posts (January-May 2014) according to various kinds of post engagement: +1s & likes, replies & comments, reshares & shares (to stick with both services’ terminologies). Our data are biased towards the first half of 2014, and a long-term analysis studying posts based on their type, content, keywords, time & other features would be more suitable for developing an “archetypal post” (and/or an “ideal”, most engaging post) – e.g. using conjoint analysis with engagement metrics indicating the perceived value. Despite that, we are able to draw reasonable conclusions about the content Coursera & Khan Academy post. Above all, we should see which content Facebook/Google+ users generally associate most with Coursera’s/KA’s brand, since the most engaging posts are usually also the most visible on both SNSs.
The posts on Coursera’s Google+ page most appreciated by the community – the posts that received the most +1s (descending order from left to right) – share a common attribute of story & inspiration (unlike Khan Academy’s “playfulness”).
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014
+1s top 3
followers: 343,686
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014
Dealing with KA’s posts from right to left (reverse order, “third place” first), Khan Academy adheres to “playfulness” – especially see the top posts based on other metrics on
the following pages. Despite the fact that any wishes, celebrations, holiday posts etc. are generally popular on social media, Khan Academy’s New Year’s resolution is
reinforced by a badge. We also see that the community endorses KA’s partnerships – but rather in the sense that it brings free educational resources they can use at
school than regarding a particular institution.* Yet the most successful post is a “storytelling” one, generally Coursera’s strong point. We’ve already seen the “Teacher
Appreciation Week” within the Twitter top hashtags table (KA’s most recent tweets analysis).
* Once again, we can see that since Khan Academy is rather aimed at K-12 education, while Coursera meets demand for higher academic/professional education, KA is more prone to “blend” with the traditional
education system. Also note – for later reflection & discussion within our overall conclusions – that even though both Coursera & Khan Academy are given the “revolutionist” brand label, what happens is rather an adjustment of current educational practices and enjoyment of the technological benefits (meeting higher demand, analytics, space-time freedom etc.) than “fighting against” them and/or against the current education system.
followers: 1,067,922
replies top 3
There’s one newcomer post in Coursera’s posts that were
most commented on. It’s Coursera’s Android App Beta testing.
Asking for feedback and/or for expressing an opinion is a
technique nine out of ten social media marketers recommend.
=) “Authorization” of users in this way is a mutually beneficial
act which increases interaction & strengthens the community.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014
400 replies
87 replies
72 replies
replies top 3
followers: 343,686
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014
Finally, driving social media engagement in a purely educational way!
The high popularity of responding to KA’s brainteasers & challenges
shows in which way gamification of education can be important:
facilitating beginning of & reinforcing motivation in the learning process
(it catches my interest); supporting interaction & healthy competition
(responding to show to the others that I know the answer); and therefore
also collaboration (responding to answers of the others, discussion).
217 replies
191 replies
181 replies
followers: 1,067,922
reshares top 3
Reshares on Coursera don’t provide us with any (previously) unseen posts. The
lesson here could be that Coursera g+ fans rather share inspiring & storytelling
content than educational content (e.g. links to new courses), which is different from
Khan Academy (see on the next page). In several following analyses, we’ll support
the conclusion that Coursera excels in sharing stories about students, while Khan
Academy rather stands out in gamification & sharing background content about itself.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv) | January-May 2014
reshares top 3
followers: 343,686
Khan Academy Google+ post
reshares added a new brain teaser
to our most engaging posts “hall of
fame”. We can say that KA’s
Google+ community shares rather
educational content: content which
their social network might need
(College admissions resources)
and/or engaging learning content
(challenges & brainteasers).
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Khan_Posts.csv) | January-May 2014
Coursera (followers: 1,067,922)
type of post | number of posts | number of +1 | +1/count coefficient
article | 68 | 4,073 | 60
photo | 25 | 5,321 | 213
video | 3 | 295 | 98
Khan Academy (followers: 343,686)
type of post | number of posts | number of +1 | +1/count coefficient
article | 13 | 1,261 | 97
photo | 33 | 6,138 | 186
video | 53 | 2,039 | 38
content shared
Before moving to Facebook, let’s support the generally known & accepted idea of visual content (photos/pictures) being the most engaging on social media. Employing an analysis as simple as: 1) counting the posts of each content type – article (text only), photo & video – that Coursera/KA shared; 2) adding up the total +1s each type received; and 3) calculating the number of +1s a post of each type received on average; we find that sharing a photo/picture pays off for Coursera & Khan Academy roughly twice as much as sharing any other kind of content.* Also when popularizing education & making it accessible, we should keep in mind that “a picture is worth a thousand words”.
* I know, I know… Not controlling for a third variable. If you find a confounding variable
and/or a spurious relationship, don’t hesitate to start the discussion below this text. =)
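The three steps can be sketched in a few lines of Python. The counts are copied from the “type of post” tables on this page; the function names are illustrative only.

```python
# (number of posts, total +1s) per content type, copied from the tables on this page.
coursera = {"article": (68, 4073), "photo": (25, 5321), "video": (3, 295)}
khan = {"article": (13, 1261), "photo": (33, 6138), "video": (53, 2039)}

def plus_ones_per_post(table):
    """Step 3: average number of +1s a post of each type received."""
    return {kind: total / count for kind, (count, total) in table.items()}

def photo_advantage(table):
    """Ratio of the photo average to the best non-photo average."""
    avg = plus_ones_per_post(table)
    best_other = max(value for kind, value in avg.items() if kind != "photo")
    return avg["photo"] / best_other

print(round(photo_advantage(coursera), 2), round(photo_advantage(khan), 2))
```

Comparing photos with the best-performing other type (video for Coursera, article for Khan Academy) gives ratios of roughly 2.2 and 1.9, so “about twice” is a fair summary; against the weakest type the gap is larger.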
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Gplus_Coursera_Posts.csv, Gplus_Khan_Posts.csv) | January-May 2014
number of fans: 529,115
likes top 3
The most liked stories posted by Coursera on Facebook are different from those on Google+ – (from my subjective point of view) “surprisingly”
inclined towards educational content information instead of storytelling – which denotes there are some differences between both communities.*
As measured by the number of likes, Coursera’s Facebook community acknowledged: firstly, Coursera’s Specializations announcement, sequence of
courses finished by a capstone project & certificate; secondly, a New Year’s post introducing new partner Universities & (therefore) new courses; and
thirdly, a “gag” showing that entertainment in general (and/or memes in particular – yet our particular post is not an Internet meme, you can learn more
about memes here) is still popular on Facebook**, even in communities around education.
* Since Facebook is the “mainstream(est)” social medium, we’ll make an attempt to derive both Coursera & Khan Academy
fan typology based on their Facebook co-comment networks.
** The rank of “memes & gags” content will rather decline in Facebook users’ personal news feeds, since Facebook’s current goal seems to be
becoming a “personalized newspaper” (or here for information about Paper), and because of that it attempts to promote “high quality content”.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014
number of fans: 714,626
likes top 3
As for Khan Academy, Facebook users endorsed by their likes the SAT materials, an
outcome of The College Board & Khan Academy partnership. From the remaining two
posts in the top 3, we can see that many thumbs up are generally given to the story of
Salman Khan (nurtured by online publishers*), his ideas/visions & leadership pathway
towards transforming education.
* We’ve already seen some publishers publicizing Salman Khan before. There is The New York
Times & Harvard Business Review at this page. Nevertheless, the main support for the “nurtured by
online publishers” statement comes with the online news articles analysis later on.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014
number of fans: 529,115
comments top 3
Being “the” social medium, Facebook is also known as a tool for establishing online (grassroot) “movements”, protest/support groups etc. Knowing that, it should not be surprising that Coursera’s Facebook community expressed their opinion (mainly disagreement) on the restricted Coursera access in some countries due to US sanctions, making the “Update on Course Accessibility for Students in Cuba, Iran, Sudan, and Syria” Coursera’s most commented-on post. The disproportion of likes against comments can be explained by the fact that expressing support on social media is generally easy (there’s usually a button for it), while on SNSs without design features like downvotes, thumbs down or anything similar, the only way to disagree is to explain yourself in a comment.
We are already familiar with the other most frequently commented posts.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014
137 comments
95 comments
92 comments
number of fans: 714,626
comments top 3
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014
As well as Khan Academy’s
Google+ community, the
Facebook one also likes to
show off by interacting with
gamified educational content
using comments. Apart from
the (rather relaxing)
brainteasers (probability &
logic) we already know, there’s
also a new mathematical
challenge, which belongs to
the “this week’s challenge”
KA’s series.
824 comments
737 comments
549 comments
number of fans: 529,115
shares top 3
The most successful posts according to the number of times they
were shared do not bring any new content. Nevertheless, the
shuffled order suggests that many Facebook users are eager to
share entertaining content.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014
number of fans: 714,626
shares top 3
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014
Khan Academy’s Facebook community mainly wanted to
share the SAT materials with others – note that on Facebook,
a teenager probably has more (active) classmates than on
Google+ – and also wanted to tease their friends with the “two
guards & two doors” brain teaser.
number of fans: 529,115
other popular posts
To complement the overall picture & comparison with the most popular Google+ posts, let’s also
add another three posts immediately following Coursera’s top 3 stories (Khan Academy’s on
the next page) – as measured by the number of likes they received. While there are obvious
similarities with Google+, we can also see what will be supported in the co-comment Facebook
network analysis, that on Facebook, Coursera reaches more active fans from Latin America.*
* Also note that currently, we are monitoring active fans only (those who like, comment or share). We are not
monitoring posts about Coursera (/Khan Academy) beyond its Facebook page. The supposed differences
between both communities – reflected in the content they publicly acknowledge – might also be (co-)created
by different recommender algorithms Facebook & Google+ employ. However, such discussion is far beyond
the scope of this research.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014
number of fans: 714,626
other popular posts
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014
Similarly, you might take a look at #3-6 of KA’s Facebook posts
with the top like engagement rate. As we’ve mentioned earlier, the
SAT standardized test caused a great deal of the buzz around
Khan Academy on Facebook.
Coursera (number of fans: 529,115)
total posts since 2014: 176
unique fans engaged since 2014: 3,580 (0.68% of total fans)
most active likers (surnames removed if applicable): Chris; Lars; Eurusd Trader; Mahmoud; Crear una tienda online; Concursos de fotos; Ronnie; Sorteos Facebook. Aplicaciones concursos, gratis
most active commenters (surnames removed if applicable): TichiLivi; Sami; Tarlei; Askia; Ramaswamy; Hassan; Sameer; Steffial
Khan Academy (number of fans: 714,626)
total posts since 2014: 92
unique fans engaged since 2014: 2,582 (0.36% of total fans)
most active likers (surnames removed if applicable): Syed; Janie; Mya; Tsveta; Daniel; James; Tom; Shayma; Margareth
most active commenters (surnames removed if applicable): David; Ahsanul; Study Australia; Maghnia; Steve; Julie; Keith; Mark; Sue
active fans
We could obtain the demographic profile of the active Facebook fan population from those who share such information publicly. Let’s do that later as a pre-screening of our fan typology based on the co-comment network. This slide should simply highlight the fact that on Facebook (& other SNSs as well), we are actually able to reach detailed information on a single human being (here, last names were removed). For example, we can create an archetype of a very active Coursera fan based on the demographic information & posts that Coursera’s most active likers & commenters publicly share. The “customer persona” of a highly engaged Coursera Facebook fan could be described as a man in his 20s, studying in the US (native or foreign student) or recently employed, who wants to complement his professional qualification; possibly (e.g. fans from Brazil, India or Syria) such courses are not available at his home University and/or he might be limited by financial constraints.*
Talking about Coursera, note that several companies attempt to draw some of Coursera’s attention to their business by using likes; no such practice was detected among the most active Facebook users on KA’s posts.
Another important piece of information is that Coursera shares almost twice as many posts as Khan Academy does. Khan Academy’s Facebook fans are larger in number but have a smaller active “core” of users. As you’ll see on the three following pages, this actually fits our developing interpretation of a smaller but stronger community around Salman Khan perfectly, since those active users produce an enormous number of likes, comments & shares.
* Interest profile could also be created based on the pages the users (publicly) liked.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (data obtained via: developers.facebook.com/docs/graph-api/) | May 2014
statistic | Coursera | Khan Academy
number of fans | 529,115 | 714,626
total posts | 176 | 92
mean | 253 | 765
standard error | 22 | 79
median | 159 | 502
standard deviation | 290 | 755
kurtosis | 19 | 9
skewness | 4 | 3
range | 2,280 | 4,616
minimum | 46 | 34
maximum | 2,326 | 4,650
sum | 44,543 | 70,369
likes
Although this & the following two pages
could as well precede the “top posts”
content analysis, now I feel the need to
justify why we did actually study (positive)
outliers rather than the average posts.
We’ve already mentioned the reason
related to the primary objective of this
research, discovering Coursera’s & Khan
Academy’s social web brands, where the
outlying posts are those that actually
reach the largest population and because
of that arguably impact the overall image
of an institution most significantly.
The second reason is that we’ve seen
there’s a minority of active fans which
interacts with a minority of posts. Since
the primary way of spreading content on
social media is engagement & interaction,
we rather might want to study the common
features of the most successful content.
From our “education” perspective ideally
those that are: 1) educational; 2) popular,
so that they drive engagement to
education.
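A quick consistency check of the likes statistics above, sketched in Python. It only uses the reported numbers (n, sum, standard deviation, median); the per-post values themselves are in the enclosed dataset. The mean-to-median ratio is my own illustrative addition showing how strongly the outliers pull the average up.

```python
import math

# Reported likes statistics (n = total posts, January-May 2014).
likes = {
    "Coursera": {"n": 176, "sum": 44543, "sd": 290, "median": 159},
    "Khan Academy": {"n": 92, "sum": 70369, "sd": 755, "median": 502},
}

def summarize(stats):
    mean = stats["sum"] / stats["n"]
    # Standard error of the mean: standard deviation / sqrt(n).
    std_error = stats["sd"] / math.sqrt(stats["n"])
    return {
        "mean": mean,
        "std_error": std_error,
        # A ratio well above 1 means a few very popular posts inflate the mean.
        "mean_over_median": mean / stats["median"],
    }

summary = {name: summarize(stats) for name, stats in likes.items()}
for name, values in summary.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

For both pages the mean is about 1.5 to 1.6 times the median, in line with the positive skewness reported above: the “average post” says little about the posts that actually reach most people.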
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014
statistic | Coursera | Khan Academy
mean | 13 | 80
standard error | 1 | 15
median | 9 | 27
standard deviation | 18 | 144
kurtosis | 19 | 13
skewness | 4 | 3
range | 137 | 822
minimum | 0 | 2
maximum | 137 | 824
sum | 2,366 | 7,379
comments
The previous slide showed
comparison of descriptive
statistics between the likes
Coursera & Khan Academy
received. Among other things, it
demonstrated that Khan
Academy – also thanks to its
brain teasers & challenges –
manages to drive likes more
successfully. Here, comparing
comments statistics, we can
clearly see that Coursera is
missing higher interaction of fans
with one another that Khan
Academy masters thanks to its
aforementioned gamified
content.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014
statistic | Coursera | Khan Academy
number of fans | 529,115 | 714,626
total posts | 176 | 92
mean | 41 | 212
standard error | 6 | 22
median | 24 | 142
standard deviation | 82 | 218
kurtosis | 67 | 5
skewness | 7 | 2
range | 871 | 1,183
minimum | 0 | 0
maximum | 871 | 1,183
sum | 7,178 | 19,520
shares
Similarly, in relation to shares, take a
look at mean, median & sum.
Coursera’s Facebook community
receives almost twice as many posts &
has almost twice as many active users,
but it’s rather Khan Academy’s active
core userbase that spreads educational
content on social media.
To be fair, or more precisely, to point
the finger at the “offender”, such a state of
things is related to what we’ll discover
soon thanks to YouTube network
analysis. While Khan Academy shares
all of its educational content, by default,
using publicly accessible third-party
tools like YouTube, Coursera’s own
“after-you-enroll-accessible-only”
learning environment with all
educational resources makes it more
difficult to establish a solid social media
content strategy based on open
educational resources which, obviously,
drive a lot of engagement.
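To put a rough number on the “smaller but stronger core” reading, here is a back-of-the-envelope Python sketch. It divides the reported “sum” rows for likes, comments & shares by the “unique fans engaged” counts; treating all interactions as coming from those engaged fans is my simplifying assumption.

```python
# Totals copied from the "sum" rows and the "unique fans engaged" figures (since 2014).
pages = {
    "Coursera": {"likes": 44543, "comments": 2366, "shares": 7178, "active_fans": 3580},
    "Khan Academy": {"likes": 70369, "comments": 7379, "shares": 19520, "active_fans": 2582},
}

def interactions_per_active_fan(page):
    """All likes, comments & shares divided by the number of engaged fans."""
    total = page["likes"] + page["comments"] + page["shares"]
    return total / page["active_fans"]

per_fan = {name: interactions_per_active_fan(page) for name, page in pages.items()}
for name, value in per_fan.items():
    print(name, round(value, 1))
```

Under this assumption an active Khan Academy fan accounts for about 38 interactions against about 15 for Coursera, i.e. roughly 2.5 times as many, even though Coursera published almost twice as many posts.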
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014
total posts: 176 total posts: 92
correlation matrix
Slowly finishing up our playing with the engagement data, here comes a correlation matrix heat map allowing us to examine likes, comments & shares in their mutual relations. Regarding Khan
Academy, what was liked was also shared, while this was not always true for Coursera. We already know that likes are the most common type of engagement, since giving a thumb up is literally as
easy as clicking a button. We also know that shares are crucial for information spreading; and, as we saw, also much less frequent – perhaps because a user sharing content not only expresses her/his
deeper interest in a particular topic (compared to a like), but is additionally asked to (optionally) add a comment to the reshared content. Such act requires a lot of involvement, doesn’t it? =) This
appears to be even “worse” in case of comments, which are the rarest of all (see the “sum” rows on previous pages). Since we tried to justify the study of outliers, even though the correlation between
shares & comments is weak for both Coursera & Khan Academy, and knowing that we’ve already discussed the most successful posts, a comments/shares scatter plot visualization should emphasize
the reason why we want to know about the common features of the most successful posts. Moreover, we’ll clearly see what a difference a single strong element in content strategy can make.
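A minimal sketch of how such a correlation matrix can be computed, assuming Pearson correlation (the text does not name the coefficient). The per-post counts below are hypothetical placeholders; the real values are in Facebook_Coursera_Posts.csv & Facebook_Khan_Posts.csv.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical per-post counts, one entry per post (illustration only).
posts = {
    "likes": [120, 300, 80, 950, 200],
    "comments": [5, 40, 2, 30, 9],
    "shares": [10, 35, 4, 210, 20],
}

names = list(posts)
matrix = {a: {b: pearson(posts[a], posts[b]) for b in names} for a in names}
for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

Keep in mind that with distributions as skewed as ours, a handful of outlying posts can dominate a Pearson coefficient, which is one more reason to look at the outliers directly.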
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014
total posts: 176
[Scatter plot: shares (y-axis, 0-1000) against comments (x-axis, 0-160) for Coursera’s Facebook posts; labelled outliers given as (comments, shares)]
(95, 871) On occasion there's a worthwhile online course Coursera doesn't host. This is one of them... [Hogwarts School of Witchcraft & Wizardry]
(137, 89) Due to U.S. export sanctions, we recently had to restrict access to students in Iran, Cuba, Sudan and, temporarily, Syria. (...)
(92, 533) Today we’re excited to announce Coursera Specializations (...), a new type of program that allows students to develop mastery in a specific subject through taking a sequence of courses with a capstone project. (...)
shares & comments outliers
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv) | January-May 2014
Interactive version of the chart: http://bit.ly/1lHvtoy
total posts: 92
[Scatter plot: shares (y-axis, 0-1400) against comments (x-axis, 0-900) for Khan Academy’s Facebook posts; labelled outliers given as (comments, shares)]
(737, 993) Can you solve this week’s brain teaser? There are two doors, each with a guard. Behind one door is treasure. (...)
(824, 652) Let’s make a deal! Can you solve this week’s brain teaser? (...) [Suppose you're on a game show, and you're given the choice of three doors: (...)]
(138, 1183) It’s time to level the playing field! We're partnering with The College Board, the creators of the SAT (...)
shares & comments outliers
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_Posts.csv) | January-May 2014
Interactive version of the chart: http://bit.ly/1tAWsos
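comment network gender & locale
Coursera (number of fans: 529,115) | gender split: 58% / 42% | two largest locales: 60.8%, 14.0% | other locales: pt_BR 8%, fr_FR 3.6%, es_ES 3.2%, ...
Khan Academy (number of fans: 714,626) | gender split: 70% / 30% | two largest locales: 77.9%, 12.9% | other locales: es_LA 1.1%, pt_BR 1%, sv_SE 0.9%, …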
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_Posts.csv, Facebook_Khan_Posts.csv) | January-May 2014
While our previous Facebook analyses in this section were content-oriented, now we’ll try to create user
typologies based on co-comment networks, which were established around Coursera’s & Khan Academy’s
Facebook posts.
This page offers locale & gender description of commenting users. Same as before, we find ourselves in
the realm of active (commenting) users only (& Facebook only), where we see all public (only) comments –
their final version if edited; we don’t see (theoretically existing) deleted comments. We’ve also already
mentioned that our dataset covers less than six months in the first half of 2014, which might cause a bias
(should/could be contrasted with a more recent dataset). Despite that, the co-comment Facebook network
analysis will provide us with a valuable insight into what kind of content engages different fan subgroups.
Just for my “peace of mind”, I need to emphasize the evident: in no way does this page describe the general population of Coursera’s & Khan Academy’s users & supporters.* If we wanted to do that, probably combining data from many other sources, employing web scraping etc., it would be suitable to standardize our ratios with respect to the total population of particular countries (to safely identify large communities in smaller countries as well). The locale** & gender here therefore give evidence about the comment networks on the following pages only.
It’s apparent that among the active Facebook commenters of Coursera & Khan Academy, their largest
“customer” demographic segment, US, predominates, followed by another country whose majority
population undoubtedly enjoys open educational resources in English. Yet, we can’t be so sure, especially
about the second statement, since many English-speaking users from non-natively-English-speaking
countries most likely use Facebook with English set as their locale (similarly other world languages).**
As for Coursera, we should point out the relatively large proportion of Brazilian commenters.
Dealing with Khan Academy, gender probably catches our attention. Since Alexa’s network traffic estimates suggested the exact opposite, we can’t take this discussion any further. Yet, while the question of whether the existing “not afraid to speak up” inequalities from the offline world are reflected in the online world as well should be answered by someone with a “gender studies” qualification, we should mention that both Coursera & Khan Academy support (& contribute to) the current trend of highlighting women’s achievements & facilitating women’s emancipation in traditionally “male fields”, such as ICT & science.
* If, however, you wanted to do an estimate of Coursera’s/KA’s world’s “coverage” (rather than particular countries
“proportions” estimate), take a look at the enclosed dataset,
where you can find the locales smaller in numbers as well.
** “Locale” is not “location” but simply a user’s language settings.
n=285
number of fans: 529,115
group | vertices | edges
random 'emotional post' sympathizers | 52 | 1326
Coursera story listeners & story tellers | 37 | 186
Coursera news subscribers: technology & educational transformation | 32 | 229
Coursera news subscribers: new courses | 25 | 87
quiz solvers | 23 | 162
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera_CommentNetwork_Edges, Facebook_Coursera_CommentNetwork_Vertices) | January-May 2014
Coursera’s Facebook comments network at 2.0 levels of adjacent
vertices, where the size of a node represents its degree, shows us
“natural” clusters of active fans, found by the Clauset-Newman-Moore
algorithm according to how the vertices are connected to one
another. From these clusters we can derive an (active) fan typology
based on which kind of posts a particular group interacted with.
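For readers who want to rebuild such a network themselves, here is a minimal sketch of how a co-comment edge list could be assembled. It assumes that two users are linked whenever they commented on the same post; the (user, post) records are hypothetical, and the actual column layout of the enclosed datasets is not assumed here.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (user, post) comment records; real ones would have to be
# extracted from the enclosed comment network datasets.
comments = [
    ("ann", "post1"), ("bob", "post1"), ("cid", "post1"),
    ("ann", "post2"), ("bob", "post2"),
    ("dee", "post3"),
]

def co_comment_edges(records):
    """Count, for every unordered user pair, the posts both commented on."""
    users_by_post = defaultdict(set)
    for user, post in records:
        users_by_post[post].add(user)
    weights = Counter()
    for users in users_by_post.values():
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return weights

edges = co_comment_edges(comments)

# Degree = number of distinct co-commenters of each user.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1
print(edges, degree)
```

The counter value of each pair can serve as an edge weight; a user who only commented on a post nobody else commented on (here "dee") ends up as an isolated vertex.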
The largest group was labeled “random 'emotional post' sympathizers”,
since it mainly consists of users interacting just with the “Dan, an
autistic student” post. The second largest group of co-commenters,
“Coursera story listeners & story tellers” consists of users who
commented on stories of people from or around Coursera, personal
stories shared by other fans, and/or shared their own educational
experience (comment on Coursera’s post asking for a story and/or a
stand-alone post on Coursera’s wall). Group #3 is labeled “Coursera
news subscribers: technology & educational transformation”. Those
users keep/kept an eye on the way Coursera re-defines higher
education & techn(olog)ical news – e.g. Coursera’s new Android app.
The fourth type of commenters, “Coursera news subscribers: new
courses”, simply watch Coursera’s Facebook page to be
notified about new courses (& comment on them). The commenters in
the last larger group that is “entitled” =) to have its own label are “Quiz
solvers”. Quiz solvers comment on posts with a clear call to
action (questions & quizzes).
The second, third & fourth groups are connected to the first group,
indicating that the sentimental post bonded the community together.
Storytelling is a powerful technique, indeed. Quiz solvers are partially
mixed with both groups interested in Coursera news. Similarly, there
are some overlaps of “story listeners & tellers” and “new courses
subscribers” into “quiz solvers”.
Let’s now take a look at the most significant users in
Coursera’s co-comment network according to some
commonly used social network analysis centrality
metrics.* As per degree – i.e. the overall connectedness
as measured by the number of connections – the user Tarlei
is the most significant one (blue profile picture in the
middle of the Coursera story listeners & tellers group). We
then examined betweenness centrality – the number of shortest
paths from all vertices to all others that pass through a
node (i.e. the extent to which a node acts as a “connector”
of the network); eigenvector centrality – the importance of a
node based on its connections, where connections to more
important nodes are given more weight (i.e. influence based on
who you are connected to); and PageRank – a link analysis
algorithm similar to eigenvector centrality, used by
Google Search to rank websites in its search engine
results. Once again we find that Tarlei is the central
person regarding comments on Coursera’s posts.**
* Again, in the picture, the size of a node represents its degree.
** Other widely used social network analysis metrics, which
we have (intentionally) omitted, are closeness
centrality – based on the sum of distances to all other nodes (how
easy it is to reach them); & clustering coefficient – how close the vertex
and its neighbors are to being a clique (a complete graph).
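To make the degree & betweenness definitions above more tangible, here is a minimal sketch in plain Python. It uses Brandes’ algorithm for unweighted, undirected graphs and returns unnormalized values; it is not necessarily how the figures in this paper were produced, and the toy graph & node names are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours. Each unordered pair of
    endpoints is counted once (hence the final division by two).
    """
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = dict.fromkeys(adj, -1)   # -1 means "not reached yet"
        dist[s] = 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy network: "hub" links a, b & c; a & b also know each other.
adj = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}
degree = {v: len(nbrs) for v, nbrs in adj.items()}
between = betweenness(adj)
print(degree, between)
```

In the toy network only "hub" lies on shortest paths between other users (a to c, b to c), so it is the only node with non-zero betweenness.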
Unless, of course, it’s not that simple… Taking a look at a few of Tarlei’s
posts picked at random, like “I got certificate!”, “Coursera is awesome!” or “Downloaded!”, we
probably don’t find them expanding the overall discussion on Coursera’s
Facebook page. On the other hand, there are other comments like “Please
abolish Peer Assessment evaluation. This method is not fair. Keep only
quizzes. Students are complaining a lot! (Please refer to the discussion
forums.)”, or “We are waiting for a course preparatory to TOEFL
examination.”, which actually speak in favor of awarding Tarlei the “central
person in Coursera’s comment network” badge. I simply wanted to have
some insight into that matter so that we avoid adopting too “mechanistic”
stances towards social network analysis without any qualitative verification
of our conclusions.
There are some discrepancies between Tarlei’s Facebook & LinkedIn
profiles, and, above all, we are conducting analyses in the online world,
therefore we always are (at least) a bit “suspicious” about the authenticity of
nearly anything. =) However, Tarlei acts as a big fan of Coursera (& is a
frequent user/learner) who surely follows his own (or another institution’s)
business interests. Nevertheless, whether his intentions are content
marketing, or educating his followers & online neighbourhood, or both, it’s
nice to know that the central person in an educational institution’s
(Coursera’s) Facebook comments is a user who regularly shares open
educational content – here, on the topic of “English for Lawyers”.
Tarlei’s network
Even though, at this moment, we find ourselves discussing details not
entirely necessary for our original intention of creating a user typology*, it is
worth showing ways of finding influential users within a single Facebook
page (similar, for example, to research on a particular Internet forum), as
opposed to our third section, “a week of tweets”, where we’ll search for
influential users across the entire universe of tweets (similar, for example, to
looking for influential bloggers who use the WordPress content management
system).**
By removing*** Tarlei, the main character of Coursera’s comments network,
we can learn about who would possibly replace him in his role if he stopped
being an active Coursera commenter. What if Coursera starts a viral
Facebook campaign in accordance with its mission “empower people with
education that will improve their lives, the lives of their families, and the
communities they live in” (see the data from Facebook API we’ve queried at
the beginning of this section), then starts to recruit “role models” among the
active & influential Facebook commenters, but Tarlei says “Sorry, not
interested.”? Time for plan B. Prospective candidates for “replacing” Tarlei
would then be** (highlighted in red) Daniel: degree & eigenvector centrality;
Stephanie: betweenness centrality; & NanChi: PageRank.
* From the perspective of the “pull model” of education (as opposed to the “push
model”), this is a “sneaky” way of teaching the reader something new without her or
him realizing it (therefore not resisting it =)).
** Please, don’t forget that the objective of this paper, describing Coursera’s & Khan
Academy’s social web brand, is far from discovering concrete influencers (but rather
patterns & trends). If we wanted to do so, our quantitative pre-screening of
prospective influentials should be followed by a detailed excursion into quality &
relevancy of the content they share and also with respect to our goals/intentions &
target groups.
*** This analysis was inspired by a similar one in HANSEN, Derek, Ben
SHNEIDERMAN and Marc SMITH. ANALYZING SOCIAL MEDIA NETWORKS
WITH NODEXL: INSIGHTS FROM A CONNECTED WORLD. Burlington, MA:
Morgan Kaufmann, 2011. ISBN 01-238-2229-7. Great book, by the way! For both
practical usage & the general context of social network analysis.
Yet, an updated (second) edition would be great.
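The “plan B” exercise described above (remove the central user, then see who ranks first) can be sketched as follows. This is a minimal illustration using degree only; the network & the user labels are hypothetical placeholders, not users from the dataset.

```python
def degree_ranking(adj):
    """Return nodes sorted by degree, highest first (ties broken by name)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))

def without(adj, removed):
    """Copy of the graph with one node and all its edges deleted."""
    return {v: nbrs - {removed} for v, nbrs in adj.items() if v != removed}

# Hypothetical toy network; the names are placeholders.
adj = {
    "u1": {"u2", "u3", "u4", "u5"},
    "u2": {"u1", "u3", "u4"},
    "u3": {"u1", "u2"},
    "u4": {"u1", "u2"},
    "u5": {"u1"},
}

top = degree_ranking(adj)[0]
runner_up = degree_ranking(without(adj, top))[0]
print(top, runner_up)
```

The same pattern works with any other centrality metric: recompute it on the reduced graph rather than reading the runner-up off the original ranking, because removing a central node changes everybody else’s values too.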
number of fans: 714,626
group | vertices | edges
random brain teaser solvers | 761 | 289180
brain teasers & challenges enthusiasts | 696 | 229974
SAT (US college admissions standardized test) | 506 | 102513
'Mathematics' brain teasers & challenges enthusiasts | 345 | 26633
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Khan_CommentNetwork_Edges, Facebook_Khan_CommentNetwork_Vertices) | January-May 2014
n=2434
To make the picture clearer, as Khan Academy’s
co-commenter community is larger than Coursera’s,
the edges were removed (only nodes & their clusters
are visible). Once again, the groups were labeled
based on the types of posts a particular cluster of users
interacted with. Since the additional details were
already explained for Coursera’s network, we can
now focus on the primary goal only: to derive KA’s
Facebook user typology.
Similarly to Coursera, the largest group is made up of
generally less active commenters on Khan Academy’s
posts. The “Random brain teaser solvers” were solving
one of the most popular brain teasers featured by Khan
Academy: three doors, two goats & one car (see on the
previous pages). In general, “Logic brain teasers &
challenges enthusiasts” comment/commented on any
logical tasks posted on KA’s Facebook timeline. The
third largest group, “SAT candidates” are (naturally)
interested in any content related to the SAT
examination. Group #4, “Mathematics brain teasers &
challenges enthusiasts”, regularly share their
calculations in the comments below any mathematical
exercise/challenge.
comment network comparison
metric | Coursera | Khan Academy
number of fans | 529,115 | 714,626
graph type | undirected | undirected
vertices | 285 | 2434
unique edges | 2747 | 743330
edges with duplicates | 8 | 4069
total edges | 2755 | 747399
connected components | 30 | 16
single-vertex connected components | 10 | 6
maximum vertices in a connected component | 190 | 2397
maximum edges in a connected component | 2545 | 747342
maximum geodesic distance (diameter) | 6 | 3
average geodesic distance | 2.733113 | 1.752121
graph density | 0.067976279 | 0.251713211
modularity | 0.501497 | 0.537402
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Facebook_Coursera/Khan_CommentNetwork_Edges, Facebook_Coursera/Khan_CommentNetwork_Vertices) | January-May 2014
The somewhat more qualitative insight into both communities deserves to be complemented
using several (rather quantitative) metrics allowing us to “objectively” compare both networks.
Comparing the number of nodes (/vertices) & edges (/links) appears to be easily interpretable.
So, let’s start with exploring the “edges with duplicates”* row, which tells us that Khan Academy’s
fans more often comment repeatedly on the same post (e.g. a reply to another user). Relatively
speaking – in proportion to total edges – KA’s commenter is almost twice as prone to
comment on a post repeatedly (0.54% of total edges were duplicates) as Coursera’s
commenter (0.29% of total edges were duplicates).
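The two percentages above can be re-derived from the “edges with duplicates” & “total edges” rows of the comparison table; a short check:

```python
# Figures copied from the comparison table in this section.
networks = {
    "Coursera": {"duplicates": 8, "total": 2755},
    "Khan Academy": {"duplicates": 4069, "total": 747399},
}

# Share of duplicate edges among all edges, in percent.
share = {name: 100 * n["duplicates"] / n["total"] for name, n in networks.items()}
ratio = share["Khan Academy"] / share["Coursera"]

for name, pct in share.items():
    print(f"{name}: {pct:.2f}% of total edges are duplicates")
print(f"Khan Academy / Coursera: {ratio:.2f}")
```

The ratio of the two shares comes out a little under 1.9, hence “almost twice”.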
The previous conclusion that the community of Khan Academy gives the feeling of being more
“tied together” can finally be supported by an appropriate metric, “connected components”.
Although Khan Academy’s network is larger in size, in comparison with Coursera, it has about
half the number of sets of vertices that are connected to each other but not to the rest of the
graph. We know that on Facebook, Coursera shares almost twice as many posts as Khan Academy
does. Nevertheless, the discussion is simply not happening, presumably because of missing
educational content directly on Facebook (not just redirecting to courses requiring enrollment).**
Despite the fact, seen on the previous pages, that KA’s unique fans engaged since 2014
make up a proportionally smaller share than Coursera’s, the “hardcore” Facebook epicenter of KA’s fan
base seems to be very strong & to engage with almost anything Khan Academy shares.
We can elaborate upon the discussion in the paragraph above by taking a look at the “single-vertex
connected components” row, which gives the number of connected components that have only
one vertex. Furthermore, there are the average geodesic distance & maximum geodesic distance –
a geodesic distance being the number of edges in a shortest path connecting two nodes – which tell us that a KA
commenter reaches another commenter in a maximum of three steps (if she or he belongs to the same
connected component). And finally, we can top our argument using graph density – the ratio that
compares the number of edges in the graph with the maximum number of edges the graph would
have if all the vertices were connected to each other (it is much smaller for Coursera).
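As a rough check of the density definition above, the ratio can be computed directly from the vertex & unique-edge counts in the comparison table. The results land close to, though not exactly on, the reported densities (about 0.068 & 0.252); my guess, and it is only a guess, is that the tool treats duplicate edges slightly differently before counting.

```python
def density(vertices, edges):
    """Share of all possible undirected vertex pairs that are actually linked."""
    return 2 * edges / (vertices * (vertices - 1))

# Vertex and unique-edge counts copied from the comparison table.
coursera = density(285, 2747)
khan = density(2434, 743330)

print(f"Coursera: {coursera:.4f}")
print(f"Khan Academy: {khan:.4f}")
print(f"Khan Academy is about {khan / coursera:.1f} times denser")
```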
Maximum vertices in a connected component & maximum edges in a connected component tell
us: 1) vertices: how large was the largest (notional) group of commenters; 2) edges: how many
relations/links were in the (notional) group of commenters with the most edges. Again, it shows
that KA’s community is more connected, also thanks to many “intermediaries” with high
betweenness centrality (such users are crucial for viral spreading of any content).
The very last metric, modularity, “quality of the grouping”, is almost the same for both. Graphs
with high modularity have dense connections among the vertices within the same group but
sparse connections among vertices in different groups. Knowing the modularity range of [−0.5,1),
we can say that our value being around 0.5 means quite clearly defined groups.
* As for our undirected graph, “A,B” & “B,A” relationships are considered duplicates. Such duplicates can be
also used as “weight” of a relationship. We’ll do that in one of the following analyses.
** My immediate idea for improvement would be, for example, sharing (in the form of pictures) sample
exercises and/or quiz questions from Coursera courses, which would drive engagement (especially
comments).
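For readers curious about what the modularity value discussed above actually measures, here is a minimal sketch of Newman’s modularity for a given grouping of an undirected, unweighted graph: for each group, the share of edges that fall inside it minus the share expected by chance. The toy graph (two triangles joined by a single edge) & the group labels are hypothetical.

```python
def modularity(edges, group_of):
    """Newman's modularity Q for an undirected, unweighted edge list."""
    m = len(edges)
    inside = {}      # number of edges with both ends in the same group
    degree_sum = {}  # total degree of each group's members
    for a, b in edges:
        for v in (a, b):
            g = group_of[v]
            degree_sum[g] = degree_sum.get(g, 0) + 1
        if group_of[a] == group_of[b]:
            inside[group_of[a]] = inside.get(group_of[a], 0) + 1
    return sum(inside.get(g, 0) / m - (d / (2 * m)) ** 2
               for g, d in degree_sum.items())

# Hypothetical toy graph: two triangles connected by the edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "left", 2: "left", 3: "left",
              4: "right", 5: "right", 6: "right"}
one_group = dict.fromkeys(range(1, 7), "all")

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Putting every vertex into one group always gives zero, which is why a value around 0.5 speaks for reasonably well separated groups.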
comment network comparison: degree & betweenness centrality
(histograms; x axis: value of the metric, y axis: frequency)
metric | Coursera | Khan Academy
number of fans | 529,115 | 714,626
minimum degree | 0 | 0
maximum degree | 95 | 2296
average degree | 19.305 | 612.418
median degree | 11.000 | 709.000
minimum betweenness centrality | 0.000 | 0.000
maximum betweenness centrality | 5361.795 | 252579.402
average betweenness centrality | 111.800 | 888.236
median betweenness centrality | 0.000 | 0.000
In order to make reasonable relative (not
absolute) comparisons, this detailed
comparison of metrics invites us to take a
look at the graphs, the distribution of their
bars (also those that are very small &
therefore less visible), rather than
focusing on the numbers. The “x” axis
represents value of a metric – from 0 to its
maximum, from left to right – while the “y”
axis shows frequency of the value in the
network.
Both degree & betweenness centrality
seem to verify what was concluded
before. Khan Academy is more tied
together thanks to its solid core of users,
while the discussion on Coursera’s
Facebook page happens (more
frequently, in comparison) thanks to
influential “connectors” with high
betweenness centrality.
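One quick, admittedly crude, way to read the degree distributions without the charts is to compare the average with the median. The sketch below applies that heuristic to the reported degree figures & to a hypothetical toy degree list; the wording of the "hints" is my own rough interpretation, not an established test.

```python
from statistics import mean, median

def skew_hint(average, middle):
    """Say which side of the median the mean falls on."""
    if average > middle:
        return "mean above median: a few highly connected users stand out"
    if average < middle:
        return "mean below median: most users are highly connected"
    return "mean equals median"

# Average and median degree as reported for the two comment networks.
reported = {"Coursera": (19.305, 11.0), "Khan Academy": (612.418, 709.0)}
hints = {name: skew_hint(avg, med) for name, (avg, med) in reported.items()}

# The same check on a hypothetical toy degree list (one hub, four leaves).
toy_degrees = [4, 1, 1, 1, 1]
toy_hint = skew_hint(mean(toy_degrees), median(toy_degrees))

for name, hint in hints.items():
    print(f"{name}: {hint}")
```

Read this way, the figures are at least consistent with the interpretation above: a few prominent “connectors” for Coursera, a broad and densely connected core for Khan Academy.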
comment network comparison: closeness centrality & eigenvector centrality
(histograms; x axis: value of the metric, y axis: frequency)
metric | Coursera | Khan Academy
number of fans | 529,115 | 714,626
minimum closeness centrality | 0.000 | 0.000
maximum closeness centrality | 1.000 | 1.000
average closeness centrality | 0.098 | 0.006
median closeness centrality | 0.002 | 0.000
minimum eigenvector centrality | 0.000 | 0.000
maximum eigenvector centrality | 0.017 | 0.001
average eigenvector centrality | 0.004 | 0.000
median eigenvector centrality | 0.000 | 0.000
Dealing with other compared metrics, we have
to bear in mind that both networks include
disconnected components, which move their
values beyond interpretability, meaningfulness
& practical applicability. It would be more useful
– especially in case of eigenvector centrality,
PageRank & clustering coefficient – to focus
(separately) on individual users & groups, to
find the most influential users (as we did a bit
with Coursera). Since such an analysis would not
contribute much to the overall objective of this
research, this & the following page simply
illustrate how many influential users, & how
influential – according to the various metrics we’ve
briefly defined before – we could find in both
networks.
comment network comparison: PageRank & clustering coefficient
(histograms; x axis: value of the metric, y axis: frequency)
metric | Coursera | Khan Academy
number of fans | 529,115 | 714,626
minimum pagerank | 0.000 | 0.000
maximum pagerank | 3.511 | 5.613
average pagerank | 0.965 | 0.998
median pagerank | 1.000 | 1.021
minimum clustering coefficient | 0.000 | 0.000
maximum clustering coefficient | 1.000 | 1.000
average clustering coefficient | 0.891 | 0.970
median clustering coefficient | 1.000 | 1.000
interest profiles
Bing search results analysis, online news articles analysis, YouTube co-comment network of videos, and analysis of top Reddit posts are yet to come. But before that, for the following 6+1
pages including this one, let’s yet stick for a while to Facebook & the level of detail we are able to obtain about a single Facebook user, assuming that we have the necessary permissions,
obtained, for example, within the Facebook (Google/LinkedIn/… for another SNS data) login permission dialog. We might be interested in the socio-demographic data about users & mutual
relationships among them (friends & family, age, location, education, occupation etc.), their interest profiles based on Facebook pages they liked, and so on. All of that can be easily
accessed. However, assuming “fair play”, we are able to obtain such data (via Facebook Graph API) only if the studied population is part of our friends network, or if Facebook users grant us
permission to access such data (via the aforementioned Facebook permission dialog being part of your application)*, and/or collecting the publicly available user data in the way we did in our
previous analyses, beginning with an active user (a user who publicly liked, commented, or shared something of our interest) & obtaining her or his id which we then use to collect other data
she or he set as public (does not restrict their visibility with privacy settings). Nevertheless, the third way of collecting data actually (unfortunately) does not provide us with everything a user
set as public. Even though we are technically able to see someone’s public data in our web browser, it doesn’t mean the Facebook Graph API allows us to query it. Sure, there are some very
simple workarounds like screen scraping (& automating it), but such practice violates Facebook’s Automated Data Collection Terms for humans, as well as robots.txt for machines. So, even
though it would be awesome if this page provided a comparison of most frequent pages in different categories that Coursera’s & Khan Academy’s (active) fans liked – i.e. comparison of their
interest profiles – we will play by the rules and instead of that take a look at my personal friends network. Protecting confidentiality & anonymity is not a simple task on the social web.
Therefore I hope that providing aggregated data from my – otherwise almost completely public – personal profile about my connections, deprived of identifying information on an individual
human being, will not damage any of my friends – not even in the name of science! =)
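As for “playing by the rules” on the machine side, checking a site’s robots.txt before collecting anything automatically is straightforward with Python’s standard library. The sketch below is only an illustration: the robots.txt content, the URLs & the user agent name are hypothetical (this is NOT Facebook’s actual file), and robots.txt says nothing about a service’s terms for humans, which have to be read separately.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, NOT Facebook's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profiles/
"""

def allowed(url, user_agent="my-research-bot"):
    """Return True if the parsed rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

print(allowed("https://example.com/profiles/12345"))
print(allowed("https://example.com/about"))
```

For a live site one would point the parser at the real file with `set_url(...)` & `read()` instead of parsing a string.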
The purpose of this part of the text is to illustrate simple collection of data from social media user profiles, for example, in order to enhance an educational tool’s recommender algorithm
personalizing education (and/or complement behavioral data). Although, as you will see on the following pages, Facebook is not a social network inclined towards educational content, I won’t
be a chicken cowardly fleeing to LinkedIn – e.g., recommending interdisciplinary educational resources according to one’s professional experience & the professional experience of one’s
network; Twitter – a generally richer SNS with respect to news & educational content; or YouTube – as we have already seen, if we omit all the funny videos of cats etc., an ideal medium for
education. Instead, I’ll take the “mainstreamest” social medium, which – exactly because being “mainstreamest” reflects its very high population reach – is important to study in relation
to education as well.
* However, you can take advantage of (“/abuse” =)) data about all 1.3 billion Facebook users whenever you want – provided that you have enough money –
through Facebook’s targeted advertising, which every single Facebook user agreed to the moment she/he started using the service (Terms of Service).
personal network
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: www.wolframalpha.com/facebook/) | June 2014
Not to “reinvent the wheel”, I am happy to aggregate the available demographic data of my 375 Facebook friends using Wolfram Alpha’s tool.* Since this is only a demonstration of
obtainable data, not an analysis leading to answering our research question, there’s no need to interpret that data (in fact we might need to do that later in connection with the
analysis on the next page).
* …aaand I’ve just violated my obligation to protect my friends’ data by providing it to a third party. I told you it’s super simple (and you possibly do that on a daily basis using Facebook, Google or
other services, and/or using your mobile device full of your friends’ contacts, photos etc.). To digress, if you are looking for an alternative search engine, don’t hesitate to give the aforementioned
Wolfram Alpha a try; instead of returning pages based on keyword analysis, it computes search results using curated data. And what about the Wolfram (programming) language!
Speaking of alternative search engines, I have one more recommendation for those who do not support the current trend of personalized search results: try DuckDuckGo.
personal network
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: developers.facebook.com/docs/graph-api) | June 2014
Another of our miniseries of analyses “just for the sake of showing
what is possible to obtain” (& we’ll also take that into account in our
overall conclusions) shows – in an incredibly complicated graph =) –
my personal 2.0 network of my & my friends’ Facebook timelines with
my recent interactions highlighted in red (the lines are coming out of
me, the ego of the network). The size of nodes depends on
betweenness centrality, which accentuates connectors/intermediaries in
our network. The thicker & darker the line, the stronger the relationship
is.* The less transparent a node is, the more recent the user’s profile
update time is (i.e. a more recently active user). Only the largest &
“sufficiently” anonymous groups were given a name.
Even though the largest clusters are reflections of “limited private
offline” activities (predecessor of “massive open online” stuff =)),
schools & summer holidays, the picture also nicely illustrates influential
users around whom independent clusters formed. The potential reach
of my content shared on Facebook & the flow of information are therefore
significantly influenced, for example, by a friend of mine who works as
an instructor in a dance school – a community on whose very
periphery I stay =); international students from my social media
marketing course at the Charles University in Prague; people whom I
met totally by chance (e.g. on a trip or within an online environment);
distant relatives (as measured by relationship strength) or distant
friends (as measured by geographical distance); and so on. Also the
“band fans” group quite clearly shows there was a band member whose
connections made up 90% of the people that turned up for concerts of
our (currently not performing) amateur band. On the other hand, on the
subject of potential viral educational campaign planning, it’s evident
that my Facebook network is primarily about informal relationships.
Professional and/or academic contacts – possibly mineable from
services like LinkedIn, SlideShare, Academia.edu etc. – are simply
missing.
Just to have some fun: from the keyword statistics of posts in my (private)
Facebook network, it’s very easy to find out what technique we use
once a year to compensate for the lack of interaction with our tons of
Facebook friends – and also the influence of Facebook’s design
features (notifications in particular) – since the collocation “happy
birthday” (& its variations) occupies all the top ranks of the charts.
A sociologist’s heart will then be pleased by the fact that the person
I share the most connections with is my sister.
* Derived from the number of mutual public interactions
– an author of a post, a user tagged, a comment, a like.
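The collocation statistics mentioned above boil down to counting pairs of neighbouring words. A minimal sketch, with hypothetical sample posts (the real posts are private & are not reproduced here) and a deliberately naive tokenizer:

```python
import re
from collections import Counter

def top_bigrams(posts, n=3):
    """Return the n most frequent pairs of neighbouring words."""
    counts = Counter()
    for post in posts:
        # Naive tokenizer: lowercase ASCII words only.
        words = re.findall(r"[a-z']+", post.lower())
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)

# Hypothetical sample posts.
posts = [
    "Happy birthday, Jane!",
    "happy birthday :)",
    "Have a happy birthday",
    "See you at the concert",
]

print(top_bigrams(posts, 1))
```

For a multilingual network like mine, the tokenizer would need to handle non-ASCII letters as well (e.g. Czech diacritics).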
my subgraph | a dance school instructor’s subgraph
personal network
I would also like to add that interpreting such a large
network using visualization (rather than computation)
might not always be the most appropriate solution.
Depending on our intentions we also might want to “cut
off” what we are not interested in, cluster/aggregate
nodes into summarizing “supernodes” to emphasize the
important relationships etc. I'm not about to try it here,
since it’s not directly related to our research. However, it’s
not difficult to imagine utilization of such data for
educational content targeting and/or its spreading
planning (and/or any other campaign planning), since we
are able to assemble a quite clear graph of how to reach
an individual and/or a group. The reality definitely won’t
be as perfect as the one in our picture, yet, I want to
highlight the fact that the data is out there & it depends
solely on us whether we’ll use it to increase our “brand
new yogurt” sales, or to personalize education & help its
transformation via social media. Employing the influence
of peers & personal social networks might help us to get
closer to the “ideal”, where social media users interact
with educational content rather than with “funny pictures
of cats” (or, at least, with both =)), which would let
Facebook’s News Feed algorithm (or any other
recommender algorithm) take care of the rest.
Education transformation, although not complete, would
take a major (& inexpensive) step forward if people were
exposed to educational content on a daily basis (once
again, even if it means we need to “lace” such content
with funny cats =)).
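The “supernodes” idea mentioned above (aggregating clusters into single nodes so that only the relationships between groups remain) is simple to prototype. A minimal sketch, assuming every user has already been assigned to a group; the edge list & group labels are hypothetical:

```python
from collections import Counter

def collapse(edges, group_of):
    """Aggregate an edge list into weighted links between groups ("supernodes")."""
    links = Counter()
    for a, b in edges:
        ga, gb = group_of[a], group_of[b]
        if ga != gb:  # edges inside a group disappear into the supernode
            links[tuple(sorted((ga, gb)))] += 1
    return links

# Hypothetical personal network: the ego, two school friends, two dancers.
edges = [("me", "f1"), ("me", "f2"), ("f1", "f2"),
         ("me", "d1"), ("d1", "d2"), ("f2", "d2")]
group_of = {"me": "ego", "f1": "school", "f2": "school",
            "d1": "dance", "d2": "dance"}

supergraph = collapse(edges, group_of)
print(supergraph)
```

The resulting weights show at a glance how strongly two groups are tied together, which is the kind of summary one would want when planning how content could travel from one community to another.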
interest profiles
top 10 common page like categories in my personal network
community 1064
musician/band 580
tv show 227
local business 210
movie 207
website 198
athlete 193
public figure 184
food/beverages 177
non-profit organization 173
top 10 common page likes in my personal network (English annotation)
Nejlepší zábava (Best Entertainment, community entertainment page)
Jaromír Jágr (Jaromir Jagr, Czech professional ice hockey player official page)
Viral Vines (Viral Vines, community entertainment page)
You.bo (You.bo, proprietary entertainment page)
Český olympijský tým (Czech Team, Czech Olympic games team official page)
15+ (15+, community entertainment page)
Užívám si života naplno (I Live My Life to the Max, community entertainment page)
Žiješ jen jednou (You Only Live Once, community entertainment page)
House (House, US TV show official page)
Partička (Crew, Czech TV show official page)
Let’s continue our excursion into analyzing the most common likes among my friends (not including me).
The “top 10” table clearly illustrates the bias of the most popular content towards entertainment. This seems to be a
mainstream trend on Facebook; however, it’s also influenced by the fact that there are many teenagers in my Facebook
network from the time I worked as a teacher (ICT & English) and outdoor & summer camps instructor. Yet, quite
disappointing statistics regarding education… Trying to find new hope in at least several common page likes that could be
labeled “educational” – since I’m, apparently, an atypical node (& outlier) in my personal social network – I finally find MIT
OpenCourseWare. A good starting point for influencing my Facebook surroundings, as we know – from social network
theory – that innovation tends to spread from the network periphery. Or we might want to deal with it “the other way around”:
“stick” educational content to something that already is popular. Since Khan Academy would probably be more suitable for
my, generally younger, network & since the most popular Facebook page in my network is “Nejlepší zábava” (“Best
Entertainment”), we might want to customize the educational content so that it “fits” the culture (/content strategy) of the
entertainment page. To do so, I’ll borrow a popular tweet from the third section of this text, “a week of tweets” (see below).
Isn’t that a nice example of “pull marketing/education” (as opposed to “push marketing/education”)?
n = 375 friends
[context tweet of the picture]
“When you get 4 in a row on khan academy then miss the last one”
[a hint for those who simply “don’t get it” as they are probably not Khan Academy users – but
beware of the fact that “Explaining a joke is like dissecting a frog. You understand it better but
the frog dies in the process.” (E.B. White)]
hint: in order to complete a Khan Academy’s exercise, you need 5 correct answers in a row
[since this is a meme, a commonly accepted type of content among teenagers, “nerds” aware of Khan
Academy might become “stars” by explaining it & possibly also reinforce their social status,
slowly turning into “role models” for their peers, spreading educational content in their social
environment …ok, ok, I’m coming down to earth =)]
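For readers who want to reproduce the “top 10” tables on their own network, the counting step can be sketched as follows. This is only a minimal sketch: it assumes the likes were already downloaded (e.g. via the Graph API) into a dictionary mapping each friend to a list of (page, category) pairs; the function name, the data layout & the three sample friends are hypothetical, not taken from the original dataset.

```python
from collections import Counter

def top_page_likes(friend_likes, n=10):
    """Return the n most liked pages and the n most liked page categories."""
    page_counts = Counter()
    category_counts = Counter()
    for likes in friend_likes.values():
        # set() guards against the same like being listed twice for one friend
        for page, category in set(likes):
            page_counts[page] += 1
            category_counts[category] += 1
    return page_counts.most_common(n), category_counts.most_common(n)

# Hypothetical sample data (not from the original dataset)
friend_likes = {
    "friend_a": [("Nejlepší zábava", "community"), ("House", "tv show")],
    "friend_b": [("Nejlepší zábava", "community"), ("Jaromír Jágr", "athlete")],
    "friend_c": [("Nejlepší zábava", "community"), ("House", "tv show"),
                 ("MIT OpenCourseWare", "education")],
}
top_pages, top_categories = top_page_likes(friend_likes, n=2)
print(top_pages)
print(top_categories)
```

Counting per friend (rather than per raw record) keeps a page from being inflated by duplicated rows in the download.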
Similarly to Khan Academy’s “LeBron Asks”* miniseries, one might want to use such social media data – e.g.
obtained by a Facebook/Twitter/Google+/LinkedIn login app – to popularize education: as a door-opening
moment and/or before a youngster’s intrinsic motivation fully develops; and also to personalize education, suggest
educational resources etc. We might even want to take it further than LeBron asking & indirectly suggesting
“you can do math, I’ll play basketball” (similarly Jaromir Jagr & learning about the physics of ice hockey, or
Dr. House & medical diagnosis education), and actually make education a natural part of one’s life rather than
something that’s separated from it. “If you can't beat them, join them”.** We are aware of the influence of athletes,
musicians & actors. And since there are too many of them, deriving (personal) interest profiles from the social
web can serve as a powerful “filter”.***
I’ve recently liked the “Open Source for You” & “ProgrammableWeb” Facebook pages. Combined with my
previous educational activity, what can a smart recommender algorithm make of that?
* LeBron James is a US professional basketball player.
** Believe me that as a person who loves self-driven education, but has little taste for traditional media, popular music & film
production, commercialization of sports, and other products of mainstream popular culture that surround us every day & often
distract us from more meaningful activities, it’s not easy for me to justify employing celebrities in education (if only they
were scientific celebs =)). As with quantitative metrics in social media research, working with celebrities in education
and/or popularizing education in the way the YouTube “independent scene” does – see, for example, learning just got awesome
– might be a great starting point but a terrible finish line.
*** It serves as a filter just within the scope of our discussion. We can probably imagine several other ways to use the
enormous universe of social web data.
interest profiles
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (source: www.last.fm/api) | June 2014
To conclude our little (& far from comprehensive) insight
into Facebook data, let’s actually leave Facebook, since the
power of “big data” also resides in combining information from
different sources.* The most common likes among my friends
(below the “top 10”) also include many music artists & bands –
which is not surprising, as many people co-define their
personalities by identifying with the music they listen
to. Jared Leto is the most common one in my personal
network. Even if we don’t create educational content on how a
microphone works (electronics), how sound waves work
(physics), how vocal cords work (biology), structure of lyrics
based on music genre (linguistics), music subcultures
(sociology), how celebrities are made (media studies), etc., we
might want to look for an alternative artist for our hypothetical
educational campaign, just in case Jared is too busy (as if
that would be our biggest problem =)). Last.fm can give us a
clue.
* There are also several issues related to that, particularly in
two general & related categories: “privacy & security” and “ethics”.
One doesn’t need to be an expert hacker to obtain geodata or file
metadata, exploit unencrypted communication etc. And one doesn’t
even have to take the trouble of hacking, since publicly available
data about individuals allow us to produce almost anything
“to measure” (/tailored). Which one are we talking about here,
phishing, or marketing? I always get these two confused. =)
bing top 50 search results
query: “coursera”
coursera
coursera - google+
stanford online
coursera, the other stanford mooc startup, officially ...
coursera - wikipedia, la enciclopedia libre
coursera - android apps on google play
coursera plans to announce university partners for online ...
coursera | mimo školu – vzdělávejte se po svém
free massive online education provider, coursera, begins ...
moocs on the move: how coursera is disrupting the ...
coursera - wikipedia, the free encyclopedia
coursera on the app store on itunes
completely free online classes? coursera.org now offering ...
stanford professors launch online university coursera ...
investors put $43 million more into mooc provider coursera ...
coursera blog
coursera - we cover the revolution taking place in ...
welcome to coursera - youtube
coursera credentials today, full coursera-powered degrees ...
duke to offer free courses on internet | duke today
coursera (coursera) on
how coursera, a free online education service, will school ...
coursera adds 29 schools, 90 courses and 3 new languages ...
coursera | linkedin
coursera | linkedin
coursera - youtube
coursera help
daphne koller: what we're learning from online education ...
coursera | 50 best websites 2012 | time.com
online education startup coursera comes of age, announces ...
coursera
coursera meetups everywhere - meetup - find your people ...
coursera - quora - your best source for knowledge
home | stanford startup engineering | cme/cs184 | winter 2013
coursera - fortune management & career blog
coursera - mountain view, california - startup, computer ...
is coursera the beginning of the end for traditional ...
most audacious companies: coursera | inc.com
coursera - mountain view, california - startup, computer ...
home - andrew ng - stanford computer science
consortium of colleges takes online education to new level ...
coursera | crunchbase
coursera's fee-based course option @insidehighered
more moolah for moocs -- coursera raises another $20m ...
coursera $43 million series b round - business insider
coursera - mountain view, california - startup, computer ...
coursera, udacity, edx: will free online ivy league ...
online college course company coursera partners with 12 ...
coursera for android app now available for download
coursera help | mobile faq
query: “khan academy”
khan academy
khan academy
life at ka - khan academy
khan academy | windows phone apps+games store (united states)
khan academy blends its youtube approach with classrooms ...
knowledge map | khan academy
smarthistory: a multimedia web-book about art and art history
salman khan (educator) - wikipedia, the free encyclopedia
khan academy: the hype and the reality - the answer sheet ...
khan academy - google+
khan academy - wikipedia, the free encyclopedia
khan academy
khan academy - mountain view, ca - education |
how khan academy is changing the rules of education ...
khan academy - mountain view, ca - education |
khan academy
khan academy
khan academy: the future of education? - 60 minutes videos ...
khan academy
viewer for khan academy - android apps on google play
khan academy - youtube
khan academy – wikipedie
khanova škola
khan academy en français | cours de maths gratuits !
khan academy : the future of education? - cbs news
the kahn academy
one man, one computer, 10 million students: how khan ...
khan academy a „převrácená“ třída
khan academy | crunchbase
khan academy - practical money skills
khan academy chemistry - youtube
khan academy review & rating | pcmag.com
sal khan: bill gates' favorite teacher - aug. 24, 2010
programación de computadoras | khan academy
khan academy app for windows in the windows store
khan academy | where do i begin? how should i get star...
the trouble with khan academy - casting out nines - the ...
khan academy: the man who wants to teach the world - telegraph
what is khan academy? - definition from whatis.com
khan academy on the app store on itunes
khan academy | portal - desk.com
khanapp - mobile app for khan academy
khan academy launches the future of computer science ...
khan academy: a name you need to know in 2011 - forbes
khan academy online store
salman khan: let's use video to reinvent education | talk ...
khan academy (khanacademy) on twitter
khan academy gets rare partnership to close wealth gap in ...
khan academy avec khanacademy.fr
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Bing_Khan_SearchResults.csv, Bing_Coursera_SearchResults.csv) | June 2014
Before we take a quick look at Coursera’s & Khan Academy’s inbound links, the top 3 reddit posts ever about them, and their YouTube video comment networks – and thereby conclude this chapter, which is followed by the “a week of tweets” analyses – we’ll step
back for a moment from social media to the point where we started in the first section of this text, the overall online presence of Coursera & Khan Academy. Microsoft Bing’s API provides us with search results for the “coursera” & “khan academy” queries
(on this page), followed by the online news articles it captured, which we can use for content analysis – in particular: frequency analysis, named entity recognition & automatic text summarization – to estimate the general perception of both
institutions in the minds of English-speaking Internet users, as defined by online media.
What will be one of the first contacts of a person with Coursera and/or Khan Academy if she or he does not know much about them? The answer is “the search results of a web search engine”. The top 50 Bing search results are ordered by columns
(so the first column is the most important, given that it is the very first page of the search results). They show us that – probably not surprisingly – there are Wikipedia articles, social media profiles, and news articles among
the top pages. While Coursera’s blog occupies the fourth search rank, KA’s blog is a bit “sunken”* – not such a big deal though, since, as we’ve already seen, Coursera’s & Khan Academy’s blog articles get traffic from links shared on their
social media profiles (fans who follow an institution subscribe to all of its social media content, which may then appear on their personal SNS homepages). Regarding Coursera, there’s also its new Android app, whereas Khan Academy is
associated with its Windows Store & iTunes Store apps (both frequently discussed topics, as you’ll see in the online news article analysis). You might want to examine the other search results as well, but since a user would need to “click through” to get
to them, these are only important for a cross-search-engine comparison.**
Even though it is very likely a product of search engine personalization – in a different geographical location, you are likely to get different search results (usually based on your IP address & other criteria) – my nationality won’t let me skip two
Czech websites: “mimo školu”, a curated list of high-quality online resources for self-driven education; and, this time in the right table, “khanova škola”, a Czech volunteer translator community curating a website with
Khan Academy’s micro lectures subtitled in Czech.
* Verified on Google; probably caused by the choice of keywords: the query is missing the word “blog” & a direct connection to Khan Academy, as e.g. in “khan academy’s official blog”.
** For example, on Google, both institutions’ Google+ pages are – no way! =) – among the top search results and/or shown as a huge Google+ snippet in the right column (at the top of the screen).
bing news articles
total news articles: Coursera 28 (2012-2014), Khan Academy 104 (2009-2014)
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Bing_Khan_News.txt, Bing_Coursera_News.txt) | June 2014
It’s time to explore the promised online news articles. There were
28 news articles (2012-2014) about Coursera & 104 (2009-2014)
articles about Khan Academy (the “older” of the two institutions) that
the Bing API returned in June 2014.* Although making
generalizations built around an n<30 sample might outrage a stern
statistician, we can justify it given that Bing’s news articles API
tends to return results from larger publishers, as those are easily
recognized as “news articles” by a search engine. On the other
hand, there are no blog articles** (meaning “grassroots bloggers”,
not an institution’s blog) in our sample. On that account, let's
conclude that our news articles analysis represents the overall
picture of Coursera & Khan Academy produced by larger online
publishers.
The line chart on the left shows us that our results will be generally
skewed towards the present, which is not only due to the growing
popularity of both sites, but also because providing
recent search results is what web search engines generally do
and are therefore optimized for. Anyway, this is completely
satisfactory for answering our research question.
* Which is obviously a tiny fraction of what was published about Coursera
& Khan Academy. Yet, it’s enough for the purposes of our forthcoming analyses.
** To obtain blog articles, using your own / a third-party web crawler,
analyzing the Common Crawl Corpus, or using a service like spinn3r
might be necessary.
most active publishers
Coursera: The Wall Street Journal, Forbes, New York Times
Khan Academy: Education Week, New York Times
If you still remember the Twitter “is following” (/friends) & recent tweets analyses, here are our old friends,
The New York Times, The Wall Street Journal & Forbes (this time with reference to Coursera). Apart from
verifying our previous conclusions, there’s also Education Week as the top publisher of the detected
online news articles about Khan Academy, which underlines the fact that – in comparison with Coursera –
more education-specialized sites publish about Khan Academy.*
* Also supported by the data of the following inbound links analysis & the “a week of tweets” chapter (see the next pages).
news article titles
Even a very simple* frequency analysis of the news article titles illustrates well the (media co-created) brand associations with Coursera & Khan Academy, which we’ll further develop on the following page by employing named entity recognition. There is no
need to rewrite the keywords a reader of this text can look at in the word clouds. Nevertheless, in relation to Coursera, I’d like to point out “online”, “education”, “courses”, “massive”, “future”, “learning” & “provider”; also “restores” & “iran”, which we’ve
already seen on Facebook (US sanctions); and names of famous Universities or excerpts from names of particular courses featured on Coursera.** As for Khan Academy, even though there are also many familiar topics (partnerships, SAT, and so
on), the visualization allows us to emphasize what we’ll see more clearly on the next page: how much the name of the founder of Khan Academy, Salman Khan, is favoured (within the content of the articles) in connection with his “revolutionary”
pathway towards transforming education by employing video & the flipped classroom model. It’s not really surprising that Khan Academy is built around Salman Khan – he founded it & he created the majority of the micro lectures available on KA; his
face (or rather voice =)) would be recognized in many US schools (& his reach spreads around the world). Nevertheless, given the popularity of Khan Academy, its large & growing volunteer community etc., it is fascinating to realize how much the
whole project depends on “the” one man – not that we can’t imagine how it might work in the future (and there are several models, including those that haven’t been invented yet). Although Khan Academy is far from being a “one-man show” anymore,
after some hints in our previous analyses, there’s finally a solid argument for labeling Khan Academy as the “more centralized” of the two institutions & another reason why KA’s community appears to be more tightly knit.
* Just lowercase conversion, without dealing with plural forms, verbs in third person, verb tenses, etc. (however, this was taken care of in the more detailed tweets text mining in the fourth chapter of this text).
** Just for those who wonder what the “baidu” keyword in Coursera’s cloud means: Baidu is a Chinese web services company, whose Chief Scientist (since May 2014) is Coursera's co-founder, Andrew Ng. Baidu.com is the leading Chinese language search engine.
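The “very simple” title frequency analysis described above can be sketched in a few lines of Python. This is a minimal sketch rather than the script actually used for the word clouds: the tiny stop-word list is an assumption (the text mentions lowercase conversion only), and the sample titles are borrowed from the Bing search results earlier in this chapter purely for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; not stated in the paper)
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "is", "with"}

def title_word_frequencies(titles):
    """Lowercase every title, split it into words and count them."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Illustrative titles only (taken from the search results list, not the news dataset)
sample_titles = [
    "coursera adds 29 schools, 90 courses and 3 new languages",
    "investors put $43 million more into mooc provider coursera",
    "stanford professors launch online university coursera",
    "coursera's fee-based course option",
]
freqs = title_word_frequencies(sample_titles)
print(freqs.most_common(5))
```

Because no stemming is applied, “course” & “courses” (and “coursera” & “coursera's”) are counted separately, which is exactly the limitation mentioned in the first footnote.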
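named entity recognition
[radar charts: most frequent named entities in the online news articles]
Coursera: MOOC(s), University, Andrew Ng, edX, Harvard, Stanford, Yale, iTunes, Android, iOS, Daphne Koller, technology, California, Iran
Khan Academy: Salman Khan, YouTube, Bill Gates, College Board, NASA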
After (automated) scraping of the text of all (in total 132) articles, normalizing the data & using NLTK 3.0 (Natural Language Toolkit, a library for the Python programming language) to analyze the content of the articles, we are able to run named entity recognition
in order to find the most discussed entities (“organization” or “person” in our case) regarding both institutions. After removing the generic terms “Coursera” & “Khan Academy”, we can create a web/radar chart displaying the online media sphere brand
overview of both institutions, with regard to what & who is discussed (the charts show the most frequent entities).
The authors of online news articles about Coursera, above all, discuss “MOOC(s)” & their relation to the “Universities” that offer them on Coursera, on other platforms – such as MIT’s & Harvard’s “edX” (fourth most frequent entity)
– and/or “on their own” (using their own platform); especially “Harvard”, “Stanford” & “Yale”. Both Coursera founders, “Andrew Ng” & “Daphne Koller”, are also frequently discussed.* Other frequently mentioned entities in news articles about Coursera
are the headquarters of the company in “California”, “technology” shaping current education, “Iran” & US sanctions (we’ve already discussed that before), and platforms offering Coursera’s new app (“Android”, “iOS”, “iTunes”).
As for Khan Academy, after the discussion on the previous page, we can sum it all up in brief. Online articles about Khan Academy are mainly articles about Salman Khan. Much less frequently – but still most frequently among the
discovered entities – there are the topics of “YouTube” (where Khan Academy’s lectures are stored), Khan Academy’s partners (in particular, “NASA” & “College Board”), and (besides his “other roles” =)) a supporter of Khan Academy & its donor, Bill
Gates, whom we’ve already met on Twitter.
* Just a footnote “inside joke” link for those obsessed with humanities, who examined the one-third difference between Andrew’s & Daphne’s frequencies in the web chart. =)
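The entity counting behind such a chart might look like the sketch below. It is not the original script: `extract_entities` shows one common way of chaining NLTK’s `word_tokenize`, `pos_tag` & `ne_chunk` (it needs NLTK plus its tokenizer, tagger & chunker data installed), while `most_discussed`, the label set & the sample entity list are hypothetical names and values chosen for illustration.

```python
from collections import Counter

GENERIC_TERMS = {"coursera", "khan academy"}   # removed before charting
KEPT_LABELS = {"PERSON", "ORGANIZATION"}       # entity types kept in this sketch

def extract_entities(text):
    """Tag one article with NLTK and return (entity, label) pairs."""
    import nltk  # imported lazily so the counting below runs without NLTK
    tree = nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(text)))
    return [(" ".join(token for token, _ in subtree.leaves()), subtree.label())
            for subtree in tree.subtrees() if subtree.label() != "S"]

def most_discussed(entities, n=10):
    """Count persons & organizations, skipping the generic brand names."""
    counts = Counter(
        name for name, label in entities
        if label in KEPT_LABELS and name.lower() not in GENERIC_TERMS
    )
    return counts.most_common(n)

# Hypothetical, already tagged entities (as extract_entities would return them)
sample_entities = [
    ("Salman Khan", "PERSON"), ("Khan Academy", "ORGANIZATION"),
    ("Salman Khan", "PERSON"), ("NASA", "ORGANIZATION"), ("California", "GPE"),
]
print(most_discussed(sample_entities))
```

Keeping the tagging & the counting in separate functions makes it easy to swap the tagger or to widen `KEPT_LABELS` (e.g. to places) without touching the rest.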
articles summary
total news articles: 104 (2009-2014)
“Here’s the News Hour’s review of the Khan Academy. In case you’re
wondering about the breadth of the topics the academy covers, here’s an
overview narrated by Khan himself. And here’s an example lesson on the
mathematical concept of limits. The Khan Academy is a not-for-profit
business but it has started to experiment with generating some revenue so
that Khan can expand the topics he covers and the detail in which he
covers them. Listening to Sal Khan, founder of the Khan Academy, speak
on stage to several hundred attendees at the 5th Anniversary Gala last
week for Innosight Institute — the non-profit that I co-founded — I thought
about how Clayton Christensen and I have speculated for some time that
the long-term future of much of educational content will be in the business
model of a facilitated network, a platform in which users essentially
exchange modular pieces of educational content with each other.”
To “have a quick break” from facts & figures: besides the entities discussed within
the online news articles, we might also be interested in a summary of their
content/text (/overall abstract). Although there are several libraries available
for this purpose in Python, to stick with the spirit of writing a report on
education, I’ve made some educational effort myself. In order to learn
more about automatic text summarization, which I – so far – was able to
describe in theory only*, I’ve written a very simple text summarizer based on
the 50 lines of code of Miki Tebeka’s Simple Text Summarizer**, and I ran the
script on the corpus of all 104 online news articles about Khan Academy***
scraped from the web (see on the right).
Within the (more academic) debate about the “abstraction” vs. “extraction”
approach to automatic text summarization, the result on the right is
the (somewhat simpler) “extraction”, which is based on the premise that important
sentences are those that contain important words – i.e. the most frequent words
after removing stop words.
* Or I often “cheated” given the awesomeness of Python libraries
(this time an inside joke for programmers =)).
** This is the true meaning of 21st Century Education prosumers, being both educators
& learners, exchanging (under various conditions) educational content with each other.
*** The Coursera summary was not satisfactory, not because of the smaller sample but
rather due to the diversity of topics discussed within articles about Coursera
(as opposed to Khan Academy’s coherent “story”).
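The “extraction” premise can be illustrated with a short sketch. It is neither Miki Tebeka’s code nor the script used for the summary above, only a minimal example of the same idea under simple assumptions: sentences are split on end punctuation, the stop-word list is a tiny hypothetical one, and a sentence’s score is the sum of the corpus frequencies of its words.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch)
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "was", "it", "on", "for"}

def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)

    def score(sentence):
        # Stop words are absent from freq, so they contribute 0
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)

sample_text = ("Khan Academy offers free video lessons. The weather was nice. "
               "Khan Academy lessons cover math.")
summary = summarize(sample_text, max_sentences=2)
print(summary)
```

Summing frequencies favours long sentences; dividing the score by the sentence length is a common variation if shorter sentences should get a fair chance.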
inbound links
Coursera
total links: 1,242
top domains (n=6-12): slashdot, wordpress, cnn, gamezone, tumblr, typepad, patch
other frequent domains: cnet, msn, stanford, duke, dzone, feedsportal, nytimes, upenn,
usnews, wsj
domains with n=2: abc, ala, allaboutjazz, bloomberg, bostonmagazine, cbsnews,
cdc, cisco, clucerf, crooksandliars, edublogs, edweek, fooyoh,
forbes, huffingtonpost, illinois, inc, ing, iowapublicradio, kqed,
marketplace, mashable, mit, nbcnews, opb, openforum,
payscale, publicradio, reghardware, rice, ripr, smithsonianmag,
theguardian, typesafe, uw, washington, wbur, wfu, wfubmc,
wlrn, worldbank, wpr, wrvo, yahoo
suffix count
com 701
org 258
edu 90
net 47
ca 14
de 11
gov 11
co.uk 9
com.au 9
com.br 6
es 6
blogspot.com 5
edu.au 5
fr 5
ac.uk 4
dk 4
fm 4
info 4
ch 3
me 3
se 3
blogspot.ca 2
co 2
hu 2
it 2
net.au 2
nl 2
org.uk 2
tv 2
Khan Academy
total links: 528
top domains (n=3-8): slashdot, patch, feedsportal, kqed, nytimes,
stackexchange, uwstout, wordpress
domains with n=2: abcnews, cbsnews, ck12, cmu, cnn, edweek,
gawker, go, google, hawaii, hpu, huffingtonpost,
ljworld, mashable, metafilter, niu, rosettastone,
sc, smithsonianmag, ted, tulsalibrary, utexas,
uvm, waldorf, wtol, yahoo
suffix count
com 261
edu 114
org 101
net 12
gov 6
ca 5
co.uk 4
info 2
it 2
tv 2
ak.us 1
cc.ca.us 1
cc.ms.us 1
cc.or.us 1
cz 1
hu 1
int 1
is 1
lib.al.us 1
lib.fl.us 1
lib.ks.us 1
lib.ky.us 1
lib.mo.us 1
lib.nc.us 1
lib.va.us 1
lib.wa.us 1
pa.us 1
Back to the “serious” things. Before our extensive finish, “a week of tweets”, we should complete* our “Coursera & Khan Academy social web presence qualitatively” section with an inbound links analysis (on this page); posts about Coursera & Khan
Academy in the birthplace of viral content, reddit; and co-comment networks on YouTube, the largest video-sharing site with the 3rd largest Internet traffic in the world (according to Alexa.com), which, as we’ve already seen, offers great conditions for
sharing educational content.
For the collection & analysis of online network data regarding Coursera’s & Khan Academy’s websites, the VOSON** web crawler was used. Inbound links are links from other websites that point back to the original website – here Coursera on the left & Khan
Academy on the right. Inbound links – also known as backlinks – can (besides other things) tell you who pays attention to a website & therefore who increases the website’s traffic.
Finally, we can clearly see why the current web can be called the “social” web. In terms of frequency of linking to Coursera or Khan Academy, blogs, internet forums, specific content gathering sites & other user-submitted content and/or social
media-like portals (slashdot, wordpress, tumblr, typepad, stackexchange etc.) are very important. As for larger media, CNN is the “newbie” we haven’t discovered within our previous analyses yet. Did “gamezone” among Coursera’s “top domains”
catch your attention? All the buzz was around the “Online Games: Literature, New Media, and Narrative” course, focused on Tolkien & The Lord of the Rings Online. The original topic of this University-level English literature class is “what happens to
stories, paintings, and films when they become the basis of massively multiplayer online games”.
While the fact that many Universities publish about Coursera is expected (domain names like “stanford”, “duke”, “upenn”, “illinois”, “mit”, “rice” etc.), other Universities, by contrast, link back only to Khan Academy – “uwstout”, “cmu”, “utexas”,
“hpu” etc. Generally speaking, Coursera seems to get more attention from larger (general) media, while Khan Academy “leads” in the category of education-focused institutions/publishers – “ck12”, “edweek” etc.
If we omit – not only with regard to the previous analysis – the very common and/or frequent suffixes “com”, “org”, “edu”, “gov”, “ca” & “uk”, we’ll find out that several sites from Germany, Spain & France link to Coursera***. As we’ve also discovered
before – even allowing for the fact that less than half as many inbound links were found for Khan Academy*** – KA’s suffixes, again, seem to be more centered on English-speaking countries.
* Or rather “complement”. “Complete” is an ill-chosen word since it indicates a closed-world assumption.
** VOSON inbound links are found via a query to the Yahoo API.
*** Do not be misled by the “it” suffix. Despite the fact that “it” is the Internet country code top-level domain for Italy, it is also frequently used as a so-called “domain hack”. In our inbound links dataset, there’s for example the California-based news publishing website “scoop.it”.
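A suffix count like the ones in the tables above can be approximated with the standard library. The sketch below is an assumption-laden illustration, not the VOSON workflow: the set of multi-label suffixes is hand-picked from the tables (a thorough analysis would rather use the full Public Suffix List), and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Hand-picked multi-label suffixes (assumption; not a complete list)
MULTI_LABEL_SUFFIXES = {"co.uk", "ac.uk", "org.uk", "com.au", "edu.au",
                        "net.au", "com.br"}

def domain_suffix(url):
    """Return the suffix of the URL's host, e.g. 'edu' or 'co.uk'."""
    host = urlparse(url).hostname or ""
    labels = host.lower().split(".")
    last_two = ".".join(labels[-2:])
    if len(labels) >= 3 and last_two in MULTI_LABEL_SUFFIXES:
        return last_two
    return labels[-1]

def suffix_counts(urls):
    """Count how many inbound links come from each suffix."""
    return Counter(domain_suffix(u) for u in urls)

# Hypothetical inbound link URLs, for illustration only
sample_urls = [
    "http://news.stanford.edu/some-article",
    "http://www.example.co.uk/blog/moocs",
    "http://www.scoop.it/t/online-education",
    "http://blog.example.edu/khan",
]
counts = suffix_counts(sample_urls)
print(counts.most_common())
```

Note that a suffix alone says little about geography: “scoop.it” is counted under “it” here, which is precisely the “domain hack” caveat from the footnote.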
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Inbound_Coursera.csv, Inbound_Khan.csv) | June 2014
top 3 reddit posts (all-time)
post 1
URL: http://www.reddit.com/r/science/comments/23o5w4/science_ama_series_hi_im_peggy_mason_i_study/
score: 2,440
title: Science AMA Series: Hi, I’m Peggy Mason, I Study Empathy in Rats, AMA.
author: PeggyMason
domain: self.science
created: 1,4E+09 (Tue, 13 May 2014 16:53:20 GMT)
subreddit: science
permalink: http://www.reddit.com/r/science/comments/23o5w4/science_ama_series_hi_im_peggy_mason_i_study/
post 2
URL: http://www.thesimplelogic.com/2012/09/24/you-say-you-want-an-education/
score: 2,181
title: You Say You Want An Education? A 4-year university computer science curriculum entirely on Coursera
author: adamwfletcher
domain: thesimplelogic.com
created: 1,35E+09 (Fri, 12 Oct 2012 00:00:00 GMT)
subreddit: programming
permalink: http://www.reddit.com/r/programming/comments/10i5x0/you_say_you_want_an_education_a_4year_university/
post 3
URL: http://hummusforthought.com/2014/01/29/us-bans-students-from-blacklisted-countries-from-getting-a-free-education/
score: 1,788
title: US bans students from Syria, Iran, Sudan and Cuba from accessing Coursera, the non-profit organization offering free Massive Open Online Courses
author: hummusforthought
domain: hummusforthought.com
created: 1,39E+09 (Fri, 17 Jan 2014 23:06:40 GMT)
subreddit: worldnews
permalink: http://www.reddit.com/r/worldnews/comments/1wen2m/us_bans_students_from_syria_iran_sudan_and_cuba/
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Reddit_Coursera.csv) | June 2014
The top 3 reddit posts about Coursera – those with the highest score (the highest positive difference between upvotes & downvotes)* – come from three different subreddits (science,
programming & worldnews): 1) an AMA (ask me anything) with Peggy Mason, Professor in the Department of Neurobiology at the University of Chicago, who, among other things, discussed on
reddit her “Understanding the Brain: the Neurobiology of Life” MOOC on Coursera; 2) a post by Adam Fletcher, an ICT master’s student, discussing the ability of MOOCs to compete with
traditional Universities – including, in his opinion, insufficient credentialing of MOOCs (as of 2012); 3) the discussion we are already familiar with, the “US bans students from Syria, Iran,
Sudan and Cuba from accessing Coursera” issue.
So, what was popular on reddit? Close connection & discussion with a Professor from one of the world’s top Universities; a discussion of the possibilities & shortcomings of an emerging model
of higher education; and last but not least, a debate laced with politics around Coursera courses & restrictions on US relations with other countries.
* Registered reddit users vote posts up or down to determine their position. The other reddit posts (obtained from the reddit API) including the “coursera” or “khan academy” keywords can be found in the enclosed
dataset – see the lower-right corner. If you haven’t noticed the footer on every page until now, look there for the particular dataset (related to a particular analysis) you want to play with.
All data files are enclosed in a zip archive. The link can be found on one of the last slides of this research paper.
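If you play with the reddit datasets, the “created” column holds Unix epoch seconds, which can be converted into a readable GMT date as sketched below. The helper names are hypothetical; only the Python standard library is used. The sketch also shows why a value that a spreadsheet has rounded to scientific notation (such as 1,4E+09) is of limited use: it converts to exactly Tue, 13 May 2014 16:53:20 GMT, whatever the real posting time was.

```python
from datetime import datetime, timezone

def parse_epoch(value):
    """Accept plain integers or spreadsheet-style strings such as '1,4E+09'."""
    if isinstance(value, str):
        value = value.replace(",", ".")  # decimal comma -> decimal point
    return float(value)

def epoch_to_gmt(value):
    """Format epoch seconds as a GMT date string."""
    moment = datetime.fromtimestamp(parse_epoch(value), tz=timezone.utc)
    return moment.strftime("%a, %d %b %Y %H:%M:%S GMT")

# A full-precision value from the Khan Academy posts
precise = epoch_to_gmt(1395340698)
# A value rounded by a spreadsheet: the precision is gone
rounded = epoch_to_gmt("1,4E+09")
print(precise)
print(rounded)
```

To recover the real posting time of a post with a rounded value, the timestamp has to be re-read from the source (e.g. the reddit API) rather than from the rounded cell.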
URL | score | title | author | domain | created | subreddit | permalink
http://techcrunch.com/2014/03/05/khan-academy-gets-major-partnership-to-close-rich-advantage-in-college-test-prep/ | 4,413 | Khan Academy Gets Rare Partnership To Close Wealth Gap In College Test Prep - "bring free [SAT] test prep software to the masses" - "prepare for the SAT at their own pace, at no cost" | bboyjkang | techcrunch.com | 1395340698 (Thu, 20 Mar 2014 18:38:18 GMT) | technology | http://www.reddit.com/r/technology/comments/20w5eu/khan_academy_gets_rare_partnership_to_close/
http://www.livememe.com/dgigev5.jpg | 3,626 | Studying all weekend made me realize this. Good Guy Khan Academy | Bearowolf | livememe.com | 1379992768 (Tue, 24 Sep 2013 03:19:28 GMT) | AdviceAnimals | http://www.reddit.com/r/AdviceAnimals/comments/1n05ss/studying_all_weekend_made_me_realize_this_good/
http://www.reddit.com/r/IAmA/comments/ntsco/i_am_salman_khan_founder_of_khan_academyama/ | 3,176 | I am Salman Khan founder of Khan Academy-AMA | salman_khan_academy | self.IAmA | 1325094536 (Wed, 28 Dec 2011 17:48:56 GMT) | IAmA | http://www.reddit.com/r/IAmA/comments/ntsco/i_am_salman_khan_founder_of_khan_academyama/
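The "created" column above stores a Unix timestamp (seconds since 1970) next to its human-readable GMT form. A minimal sketch of that conversion, using the three values from the Khan Academy rows above; the function name is my own and not part of the dataset:

```python
from datetime import datetime, timezone

def epoch_to_gmt(created: int) -> str:
    """Format a Unix timestamp (in seconds) the way the table prints it."""
    moment = datetime.fromtimestamp(created, tz=timezone.utc)
    return moment.strftime("%a, %d %b %Y %H:%M:%S GMT")

# "created" values copied from the three rows of the table above
for created in (1395340698, 1379992768, 1325094536):
    print(created, "->", epoch_to_gmt(created))
```

This is handy when sorting or filtering the enclosed CSV files by date, since the raw integers compare correctly while the formatted strings do not.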
top 3 reddit posts (all-time)
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Reddit_Khan.csv) | June 2014
The top 3 reddit posts concerning Khan Academy discuss: 1) the already well-worn topic of the SAT partnership; 2) a meme expressing support for Salman Khan; his close association with the
Khan Academy “brand” & the transformation of education is also something found at the intersection of several previous datasets; 3) another AMA (just as in the case of Coursera), this time dating
back to 2011, with the founder of Khan Academy, “you-know-who”, “he-who-must-not-be-named”, otherwise somebody will conduct a content analysis of this text & reach the very same
conclusions I did on the subject of online news articles about Khan Academy (because every time one mentions Khan Academy, she or he also mentions Sal Khan). =)
Reddit posts about both Coursera & Khan Academy therefore again demonstrate their common brand elements of “opening up education” & “opportunity”. Coursera conveys the
impression of “an extremely nice doorman”, who allows people into a huge all-star virtual open-air educational party with an enormous number of stages of various new genres (sometimes
complicated for a total newbie listener), where everyone can sit in the first row & interact with the performer or with the others. Khan Academy, on the other hand, has the appearance of
“an extremely nice doorman”, who allows people into a huge virtual party at his place, where Sal Khan (frontman) performs education on request from his band’s production, which includes
thousands of fresh “oldies” covers.
subscribers: 31,535
video comments network
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: YouTube_Coursera_Edges.csv, YouTube_Coursera_Vertices.csv) | June 2014
Searching for comments below YouTube videos whose title, keywords,
description, or author’s username contain (& are limited to) “coursera” (or
“khan academy” on the next page), we are able to create a network based
on pairs of videos commented on by the same user. The nodes represent
videos, and their size is proportional to the number of views. From this
network, once again, we might be able to derive a user typology
(/clusters) built around the educational content Coursera’s (/Khan
Academy’s) learners are interested in.
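A minimal sketch of how such a network can be assembled: two videos get an edge whenever the same user commented on both, and the edge weight counts the shared commenters. The sample comments and names below are hypothetical, and this is not necessarily how the tool used for the paper builds its edge list.

```python
from collections import defaultdict
from itertools import combinations

def build_video_network(comments):
    """comments: iterable of (user, video) pairs.
    Returns {(video_a, video_b): number of users who commented on both}."""
    videos_by_user = defaultdict(set)
    for user, video in comments:
        videos_by_user[user].add(video)

    edges = defaultdict(int)
    for videos in videos_by_user.values():
        # every pair of videos one user commented on becomes (or strengthens) an edge
        for a, b in combinations(sorted(videos), 2):
            edges[(a, b)] += 1
    return dict(edges)

# hypothetical sample data, not taken from the datasets
comments = [
    ("ana", "intro_ml"), ("ana", "welcome"),
    ("ben", "intro_ml"), ("ben", "welcome"), ("ben", "ted_talk"),
    ("cy", "fan_video"),
]
network = build_video_network(comments)
print(network)
```

A video whose commenters commented on nothing else (here "fan_video") ends up with no edges, which is what the disconnected black nodes in the picture correspond to.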
That said, it might be quite tricky to begin with Coursera, which does not
provide the kind of “comprehensive education” (rather “differentiation” in
Coursera’s case) Khan Academy does, and, more importantly, does not
upload its lectures on YouTube, since it makes use of its own learning
environment. But let’s deal with that.
Coursera might also represent a single community and/or some common
ways of thinking about education, but it is rather a platform which shields a
large number of – more or less independent – smaller communities around
specific educational interests.
The big black disconnected group – yes, the majority of YouTube content
uploaded by Coursera does not have a common commenters community –
contains introductory “lectures” (/invitation videos) of courses featured on
Coursera, fan videos, assignments/homework related to particular classes,
feedback to Coursera, and course reviews. The biggest black dot
represents a YouTube upload of a video from the TED conference,
“Daphne Koller: What we're learning from online education” (211,233
views).
Some “minicommunities” were also discovered. We can use them to
illustrate some stepping-stones to Coursera’s social media content
strategy development we’ve already discussed before.
The leftmost group clusters videos commented on by users interested
in (& interacting with) live sessions of courses in Spanish. The names of
the other groups are also quite self-explanatory. There are two
communities discussing Coursera news: one commenting on videos giving
information about Coursera’s background, the other on more festive
videos. The “COURSERAful” YouTube channel attracted commenters on
Stanford University’s “Computer Networking” course. The remaining two
groups are “Coursera interviews & about”, and a group of videos
commented on by users dealing with the assignments of the “Introduction
to Music Production” course.
To complete the “top 3” most viewed videos about and/or by
Coursera, we should mention that the two larger blue nodes
at the bottom left are “Katy Perry - Birthday (Official
Coursera Parody) Music Video” and the “Welcome To
Coursera” video. Although the former’s membership in the
“news: Coursera background information” cluster might be
questionable, a new category of “entertainment”, in the eyes
of self-driven learners, would need to include all of the
content. =)
Broadly speaking, Coursera does not use YouTube as a
platform for publishing educational content*. The only (long-term)
“strategic” content on YouTube is course invitations. These
drive traffic to Coursera’s website (one, but possibly not the
only, goal of an institution’s SNS presence), but they do not
drive the kind of engagement that might help spread
Coursera’s content & strengthen learning communities, or
maintain their interest in the channel & educate them (some
added value beyond the content provided on Coursera’s
website). Also with regard to the other self-formed
communities related to its brand, Coursera still seems to be
searching for its YouTube positioning & content strategy.
* On the other hand, the fact that all lectures are “enclosed” in
Coursera’s own learning environment (KA also has an elaborate
one) allows it to collect a lot of behavioral data which might help us
in the future to better understand education (& better shape it as a
result). Nevertheless, “purely” educational content obviously drives
a lot of engagement on social media, therefore there seems to be
room for improvement - my recommendation within the
Facebook network discussion is definitely not the only possible
solution.
subscribers: 178,1982
video comments network
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: YouTube_Khan_Edges.csv, YouTube_Khan_Vertices.csv) | June 2014
Creating Khan Academy’s (rough) user typology, derived from the network of
YouTube commenters connecting the videos they interacted with, is – despite the
diversity of KA’s microlectures’ topics – much easier than in the case of Coursera.
This time, we’ll start with the tiny & smaller groups, which are far from
representing “user archetypes” – unlike (possibly) the larger groups.
There are a lot of videos watched by random curious users at the bottom of the
picture. Some users were co-interacting with videos on fractions, others were
interested in Salman Khan, computing, or mathematics & physics. The “MCAT
candidates” watched & commented on several videos on the Medical College
Admission Test – therefore they are possibly prospective medical students from
the US & Canada. The “learning natural sciences” group interacted with natural
sciences videos in Portuguese ('Khan Academy em Português' YouTube
channel). We’re slowly getting to the larger groups…
Videos that belong to the “Learning medicine” cluster were co-commented on by
users discussing Khan Academy videos mainly featured on the
“khanacademymedicine” YouTube channel. The most popular video there – as
measured by total number of views (the biggest light green dot) – was “The
Kidney and Nephron“ (878,288 views).
The “learning natural sciences” group, interested in natural sciences content
uploaded by the English (largest) “Khan Academy” YouTube channel, includes
another “top” video, “Balancing Chemical Equations” (1,345,371 views).
The second largest group (in the upper right corner) includes videos watched by
users who keep/kept an eye on all mathematical educational content
categorized on Khan Academy’s web page in the “Math” section. They also
seem to follow any general core knowledge – in terms of searching for an
explanation of a generally used concept. Therefore they are close to the largest
detected commenter group.*
The largest group discussed math, economics & finance, and Khan Academy as
an institution videos (news, background & milestones). The most watched
videos there are: “Basic Addition” (Khan Academy, 2,322,876 views) & “Khan
Academy runs on Google Cloud Platform“ (Google Enterprise, 5,093,851 views).
* A certain overlap is also apparent due to the fact that the “mathematics &
general knowledge” group also includes a news video (which should therefore
be in the largest cluster), “Salman Khan talk at TED 2011” (3,624,780 views, the
largest light blue dot). Given that we haven’t yet highlighted the importance of
the TED conference in our previous analyses dealing with the online media
sphere (nor with respect to Coursera), we should highlight TED now. It also
points out the fact that this text is built around the overview of both Coursera’s
& Khan Academy’s brands, and therefore does not work with the influence & reach
of particular media. A suggestion for follow-up research? =)
The very last group we haven’t discussed yet is the disconnected
black nodes box. Just like with Coursera, these are videos that
did not fit in any of the clusters (could it possibly be because of
missing links? =)). There we find some third-party content:
interviews, reports on the use of Khan Academy in schools, fan
videos, how-tos; and also a mixture of Khan Academy videos
covering the whole spectrum of KA’s content (sorted by
frequency, in descending order): mathematics, economics,
natural sciences & medical videos, computer science.
To sum it up, the largest group of Khan Academy users arguably
– based on the video comments networks around KA’s
educational content shared on YouTube* – complement,
supplement, or fix their knowledge & skills in mathematics
(above all), economics, and general/core education. Obviously,
new communities are also emerging: in particular those built
around the subjects of KA’s currently expanding portfolio (natural
sciences & medicine).
Quite surprisingly, we haven’t detected an increased interest in the
relatively new (& generally highly supported by media) section of
“Computing”. It can easily be explained by the fact that the
section on Khan Academy devoted to computers & computer
programming consists of many hands-on practical interactive
exercises & supportive textual material, rather than of the
whiteboard-like videos which are commonly associated with
Khan Academy.
* Naturally (& valid for many other analyses in this text), it would be
great to compare our conclusions to the behavioral data
right from Khan Academy’s (/Coursera’s) website.
Coursera & Khan Academy
detailed insights provided by a small fragment of big data
general statistics, influential tweeters, demographics, keywords, content analysis,
natural language processing, text mining, sentiment analysis
(pp. 96-121)
a week of tweets
Coursera | number of followers: 176,619 | number of tweets: 8,538 (4,323 links, 3,286 retweets, 13% threads) | archive started: 05/29/2014 17:27:36 | last tweet: 06/06/2014 17:21:59 | tweets/min rate: 0.74
Khan Academy | number of followers: 295,146 | number of tweets: 4,976 (3,441 links, 1,386 retweets, 6% threads) | archive started: 05/24/2014 17:05:21 | last tweet: 06/06/2014 17:40:56 | tweets/min rate: 0.27
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_TweetsWeek.csv, Twitter_Khan_TweetsWeek.csv) | June 2014
We already analyzed some Twitter data at the beginning of the second chapter. Previously, we focused only on Coursera’s & Khan Academy’s friends (those whom Coursera or Khan Academy
follow) and on selected followers of Coursera & Khan Academy, thus studying the positioning of both institutions. Here comes the analysis (text analysis above all) of all tweets that included the “coursera”
or “khan academy” keywords & were published between May 29 (or May 24)* and June 6, 2014. It is worth pointing out again that automation of our analysis – e.g. in the software-as-a-service
fashion – is crucial for any long-term generalization of its outcomes. Nevertheless, we should see that even such a “cut-out” of the (big) Twitter data, a week of tweets, might provide us with a meaningful insight
into who Coursera’s & Khan Academy’s users are, what they think & what Coursera / Khan Academy can possibly do about it.
In total, 8,538 tweets (3,286 retweets) about Coursera & 4,976 tweets (1,386 retweets) about Khan Academy were collected.** 13% of the Coursera tweets were threads, i.e. about 1,110 of the
8,538 tweets were detected as being conversations. For Khan Academy, 6% of the tweets were detected as threads, i.e. about 299 tweets in total. We can therefore see that although Khan
Academy has more followers, Twitter users – not only online self-driven learners but also online publishers etc. – were talking more about Coursera. On average, Coursera was
mentioned more than 11 times every 15 minutes, compared to fewer than 4 tweets mentioning Khan Academy within the same time span.
* Unfortunately, part of Coursera’s dataset was corrupted. That’s why we have tweets mentioning Khan Academy since the evening of May 24, while Coursera’s dataset includes tweets since the evening of May 29. So it
isn’t exactly “a week of tweets” in either case – a fair objection! =) – “a week of tweets” is an analogy for “a small amount of the big data” which we can use to create hypotheses and/or (where possible) make
generalizations. As we monitor specific events only in order to estimate our level of bias, what’s left & what we are interested in is the text content of tweets & metadata, providing us with deeper insights than just the “customer
experience” of Coursera’s & Khan Academy’s users.
** In spite of the data corruption – which was caused by my careless Google Drive settings – I would like to point out the Twitter Archiving Google Spreadsheet (TAGS)
by Martin Hawksey, which will save you the pain of setting up & running a web server for a small Twitter research project like this one.
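The tweets/min rates and the "per 15 minutes" figures quoted above follow directly from the tweet counts and the archive timestamps. A minimal sketch of that arithmetic, assuming the timestamps are in MM/DD/YYYY format (the function name is mine):

```python
from datetime import datetime

def tweets_per_minute(n_tweets, archive_started, last_tweet,
                      fmt="%m/%d/%Y %H:%M:%S"):
    """Average number of tweets per minute between two archive timestamps."""
    start = datetime.strptime(archive_started, fmt)
    end = datetime.strptime(last_tweet, fmt)
    minutes = (end - start).total_seconds() / 60
    return n_tweets / minutes

# counts and timestamps as reported on the statistics slide
coursera = tweets_per_minute(8538, "05/29/2014 17:27:36", "06/06/2014 17:21:59")
khan = tweets_per_minute(4976, "05/24/2014 17:05:21", "06/06/2014 17:40:56")

print(round(coursera, 2), round(coursera * 15, 1))  # rate per minute, per 15 minutes
print(round(khan, 2), round(khan * 15, 1))
```

Because the two archives cover different time spans, comparing rates rather than raw tweet counts is what makes the Coursera and Khan Academy numbers comparable.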
a week of tweets
tweet volume over time [charts: Coursera, 8,538 tweets, peak labelled 05/06/14; Khan Academy, 4,976 tweets, peak labelled 31/05/14]
The top tweets/min rate – within our time range – was reached by Coursera on June 05. This was the day The New York Times published
a review of Coursera’s new mobile app. Coursera learners (or, more traditionally, “students”) were also frequently tweeting about their
enrollments in Coursera’s new courses. Finally, they were, with increased frequency, announcing their final scores in different courses –
especially in “Machine Learning”, a periodically running course by one of Coursera’s founders, Andrew Ng.
The enormous growth in the number of tweets about Khan Academy on May 31 seems to have been caused by KA’s NASA partnership announcement.
a week of tweets
top tweeters (Coursera) | tweets | @'s | % rt
drchuck | 71 | 257 | 80%
JayAndrewStarr | 52 | N/A | N/A
MOOCList | 43 | 39 | 2%
top tweeters (Khan Academy) | tweets | @'s | % rt
iturank | 271 | 32 | N/A
nishi19 | 36 | 61 | 100%
EdTechRetweet | 13 | N/A | 100%
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_TweetsWeek.csv, Twitter_Khan_TweetsWeek.csv) | June 2014
Before we dive into the promised text mining, let’s see if there’s anything else we can gain from the Twitter data. We’ll briefly talk about two different ways of finding influencers, and then about tweet metadata.
The Twitter user tweeting about Coursera most frequently was Charles Severance (a.k.a. Dr. Chuck), Associate Professor, School of Information, University of Michigan, who taught two courses on Coursera:
“Internet History, Technology, and Security” & “Programming for Everybody (Python)”. Most of his 71 tweets were retweeted or replied to (/mentioned) by other Twitter users, especially his students.
“JayAndrewStarr” – in “second place” on our imaginary leaderboard of Coursera loyalists – is generally a very active tweeter. Jay was reporting on almost any event regarding Coursera …and then you take a
closer look at his account, tweets, the links in them, followers & other personal profiles on his website and realize it’s a scam. “MOOCList” (Portugal), which describes itself as an “aggregator (directory) of Massive Open
Online Courses (MOOCs) from different providers” and therefore included Coursera many times in its lists of new online courses, concludes our “top 3” list.*
* You can get more information about other active Coursera loyalists by searching for “twitter” and “frugalmaniac”, “IDCourserians”, “scottedwards200”, “IskiieHacker”, “rdpeng”, “TopFreeClasses” & “Turkcell Akademi” with your
favorite web search engine. Even though you will find some other “aggregator” sites there (led by commercial intentions), there’s more. I personally like the “IDCourserians” Indonesian community (“Komunitas Courserians
Indonesia”) & the “rdpeng” user profile that belongs to Roger D. Peng, Associate Professor, Biostatistics, Johns Hopkins University, who teaches the data science program on Coursera.
Let’s also briefly talk about Khan Academy’s top tweeters. The most active user, “iturank” (a bot: automatic sharing, non-human), posts about courses on iTunes U (an Apple app). The second most active user, the Japanese blogger nishi19,
belongs to the top 3 also with respect to retweets. Her most successful tweet in this regard was a link to her review of the Japanese translation of Salman Khan’s book “The One World Schoolhouse: Education Reimagined” (世界はひとつの教室 「学び×テクノロジー」が起こすイノベーション). Other active Twitter users mentioning Khan Academy confirm the previously supported statement that Khan Academy is more popular than Coursera among education-focused publishers.*
On a general note, let’s highlight that a prospective “brand ambassadors/influencers hunt” should concern not only quantity – possibly helpful as pre-screening of prospective candidates – but, more importantly, quality & relevancy of their
content shared, and also respect our goals/intentions & target groups. Quantitative metrics in social media research can be a great starting point but a terrible finish line. On that account, and finally as for this particular analysis, we should add
that these most active tweeters discovered are generally rather atypical in relation to “the common others” (as for humans, not bots or organizations) discussed later on, who tweet mainly about their educational experience, from learner’s or
educator’s perspective (like “drchuck”, who meets our conditions).
* Other active KA loyalists: “ClasesEnWeb”, “EdTechRetweet”, “QLFInc”, “languageed”, “imdhan_khan”, “technologychag”, “EurekaStartups”, “RizKhanMua”, “Shyam17”. The “SocialMediaResearchBasedOnKeywordsSceptics” movement – founded in June 2014 by Jakub
Ruzicka =) – will probably point out that among those alleged loyalists, we can also find a fan of Salman Khan, the Indian film actor, and one makeup artist, Riz Khan, who runs Riz Khan Training Academy. Data collection based on keywords has some limitations.
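The limitation mentioned in the footnote can be illustrated with a small sketch. It assumes, purely for illustration, that the collection matched any tweet containing both words "khan" and "academy" somewhere in the text; how the archiving tool actually matched keywords is not stated here. The sample tweets are hypothetical.

```python
import re

def contains_all_keywords(text, keywords=("khan", "academy")):
    """Loose match: every keyword appears somewhere in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return all(keyword in words for keyword in keywords)

def contains_phrase(text, phrase="khan academy"):
    """Strict match: the keywords appear as one adjacent phrase."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text.lower()) is not None

# hypothetical tweets, written for this example
tweets = [
    "Loving the new Khan Academy calculus videos",
    "Enroll now at Riz Khan Training Academy for makeup classes",
]
for tweet in tweets:
    print(contains_all_keywords(tweet), contains_phrase(tweet), tweet)
```

The second tweet passes the loose filter but not the phrase filter. A stricter filter removes some false positives, yet it cannot tell a fan of the actor from a fan of the educator, so manual inspection of the most active accounts remains necessary.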
most influential users tweeting
about Coursera (top 3)
…, as counted by
# of followers
WorldBank 749,959
Turkcell 489,473
Ujjwal_krishna 460,237
most influential users tweeting
about Khan Academy (top 3)
…, as counted by
# of followers
NASA 6,635,843
mashable 4,052,392
the_hindu 699,432
Since we’ve already conducted a similar analysis in the second part of this text, we can, without further delay, jump into the
analysis of most influential users – as measured by number of followers – with at least one tweet about Coursera & Khan
Academy within our time range.*
Let’s start with Coursera, where the influentials are/were: WorldBank, a Coursera partner that currently hosts two courses on
Coursera; Turkcell, another partner (a mobile phone operator), sharing Coursera’s content – centered around business, math and
technology – via Turkcell Akademi; and Ujjwal_krishna, a very active & influential Twitter user with the motto “I'm what ever you can
make out of me :)”, who mentioned Coursera in his “websites that offer free courses” tweet.
The top 3 most influential users tweeting about Khan Academy are/were: NASA, an already well-known partner of KA in
increasing student interest in science, technology, engineering and mathematics (STEM); Mashable, a news & technology
website we’ve already seen in some previous analyses, which was mainly interested in KA’s partnership with NASA; and finally
The Hindu, an Indian daily newspaper in English.
To sum up, we can add to our overall online media-sphere & partners universe – in addition to the previously emphasized ones:
WorldBank & Turkcell (Coursera) and Mashable (Khan Academy). The Hindu also reminds us of the previously found high
demand for both educational tools in the second-most populous country in the world, India.
* A week of tweets is definitely biased towards the most recent interactions. However, a week of tweets should not omit too many Twitter profiles
that tweet about one or the other institution frequently. Moreover & more importantly, the conclusions, where not backed up by a sufficient
amount of data, are rather general, with particular institutions used as examples.
We should also mention that conducting an analysis of all Coursera’s (176,619) & KA’s (295,146) followers would certainly be possible. However,
due to Twitter’s 15-minute rate limit windows, the collection of the data – and naturally also its processing, analysis & interpretation
– would take a lot of time. Considering the fact this research paper is already quite long, I might save such an analysis for another text.
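To get a feeling for how long such a collection would take, here is a back-of-the-envelope sketch. Only the follower counts and the 15-minute window come from the text above; the page size of 200 profiles per request and the budget of 15 requests per window are illustrative assumptions, so check the current API documentation before relying on them.

```python
import math

def collection_estimate(n_followers, per_request, requests_per_window,
                        window_minutes=15):
    """Return (requests needed, upper bound on minutes) for paging through followers."""
    requests = math.ceil(n_followers / per_request)
    windows = math.ceil(requests / requests_per_window)
    return requests, windows * window_minutes

# follower counts from the text; the two limits below are assumptions
PER_REQUEST = 200          # assumed profiles returned per request
REQUESTS_PER_WINDOW = 15   # assumed requests allowed per 15-minute window

coursera = collection_estimate(176_619, PER_REQUEST, REQUESTS_PER_WINDOW)
khan = collection_estimate(295_146, PER_REQUEST, REQUESTS_PER_WINDOW)
print(coursera, khan)
```

Under these assumptions the download alone runs for many hours per institution, before any processing, analysis or interpretation.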
a week of tweets
tweets language (Google Translate API detection)
Coursera: en 77.8%; es 6.5%; 2.5%-0.1% (descending order): tr, ru, ar, ja, ht, mn, fa, fr, vi, pt, id, it, zh-CN, nl, gl, sr, ko, ca, de, az, th, ga, tl, lv, no, el, hu, lt, sv, cs, et, jw, bg, da, fi, hr, ms, mt, sl, ta, uk, bs, sq, sw
Khan Academy: en 78.6%; es 10.1%; 1.9%-0.1% (descending order), then <0.1%: ja, fr, ht, pt, tr, id, gl, ms, ar, nl, sv, sw, ca, cs, cy, mn, sr, th, ur, zu, af, az, bg, eo, eu, ha, he, hi, hu, ig, it, jw, mt, no, sk, sl, tl, vi
user language (Twitter user preferences)
Coursera: en 73.9%; es 11.0%; ru, tr, fr, ja, pt, it
Khan Academy: en 71.4%; es 12.8%; ja 8.7%; fr, pt
We will start our look at tweet metadata with the frequencies of tweet languages & Twitter user language preferences. As expected, most of the tweets mentioning Coursera or Khan Academy were in English. We already know that users
from English-speaking countries dominate both websites’ userbase – among other things, because most of the educational resources are in English by default (possibly with foreign-language subtitles). Still, if we compare tweet languages
to user language preferences, we can see that users from non-English-speaking countries usually tweet in English as the common language (with some exceptions, such as the Spanish-speaking population). Moreover, we should bear
in mind that some users might have set their account to English even if their mother tongue is not English. The next most common language, Spanish, reflects both tools’ large fan base in Latin America. As for Khan Academy, we
can also identify quite a big community of users from Japan. Although there definitely are fans of Khan Academy in Japan, our results are skewed by the very active “iturank” Twitter bot, which has Japanese as its language preference.
Other Coursera communities can be found in Russia, Turkey, France, Japan, Arab countries*, Italy, Brazil & possibly Portugal (the Portuguese language preference). As for Khan Academy, it is worth mentioning France, Brazil,
possibly Portugal (once again, the Portuguese language preference) & interestingly the Republic of Haiti (the Haitian Creole language, “ht”). In comparison with the Facebook comments network demographics, we can add Turkey, Japan &
Italy to our “target markets” estimates.
* Arguably – also with respect to our previous analyses – Coursera’s userbase in Arab countries is much more numerous than KA’s.
Just for the record, the “ar” (Arabic) language code includes: Saudi Arabia, Iraq, Egypt, Libya, Algeria, Morocco, Tunisia, Oman, Yemen, Syria, Jordan, Lebanon, Kuwait, United Arab Emirates, Bahrain & Qatar.
a week of tweets
Given that in our data archive there were only 168 Coursera & 122 Khan Academy tweeters who publicly shared their location (the geotag
metadata) with their tweets, we should not make any significant generalizations based on those two heatmaps. However, this way, we are able to
support some of our previous conclusions. Moreover, it’s good to know that such information is available, especially for any long-term
monitoring/research.
The long-awaited tweets text mining is coming, almost there. =) Just let me quickly deal with what we can learn about Coursera & Khan Academy
from Twitter’s four other types of metadata: hashtags, mentions & URLs in tweets, and media attached to them.
a week of tweets
top hashtags (Coursera) | count | % of tweets
#coursera | 528 | 11.2%
#MOOC | 434 | 9.2%
#WDRrisk | 218 | 4.6%
#Iran | 165 | 3.5%
top mentions (Coursera) | count (n=9899) | % of total mentions
Coursera | 3902 | 39.4%
drchuck | 257 | 2.6%
World Bank | 190 | 1.9%
TURKCELL | 175 | 1.8%
Johns Hopkins | 112 | 1.1%
top hashtags (Khan Academy) | count | % of tweets
#STEM | 428 | 11.2%
#Apple | 303 | 7.9%
#iPhone | 303 | 7.9%
#iTunes | 303 | 7.9%
#iTunesU | 303 | 7.9%
#Mac | 303 | 7.9%
#WHScienceFair | 151 | 3.9%
top mentions (Khan Academy) | count (n=3288) | % of total mentions
NASA | 604 | 18.4%
Khan Academy | 580 | 17.6%
Mashable | 306 | 9.3%
WWWhatsnew | 48 | 1.5%
Dealing with hashtags, mentions & URLs in tweets, we already have one foot in the text content of tweets.
The moment you take a look at the most frequent hashtags, you might have a hunch that there’s nothing that could surprise you. Neither #coursera
nor #MOOC needs further explanation. #WDRrisk is a hashtag of the World Bank. And opening the #Iran hashtag discussion would also be quite
repetitive. As for Khan Academy, we can, once more, see: the #STEM topic, the list of Apple products (by the “iturank” bot*), and last but not least the
#WHScienceFair (The White House Science Fair) hashtag, related to the previously discussed “college admissions resources” partnership.
Mentions also (mainly) consist of “familiar faces”. Let us just recall that Johns Hopkins (University) hosts the Data Science Specialization
on Coursera. WWWhatsnew (Brazil) tweets about technology, design & business news in Spanish.
* In relation to our overall conclusions, even though our hashtag analysis is biased because of the “iturank” bot (automated account), we have found iTunes
& KA’s iPad app in previous analyses as well, and can therefore justify our conclusions about KA’s reach among iTunes users.
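Frequency tables of hashtags and mentions like the ones discussed above can be produced with a few lines of code. The sketch below uses simple regular expressions on hypothetical tweet texts; it is an illustration only, not the method of the tool used for the paper, and the patterns would also pick up the "@" in an e-mail address.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")

def top_entities(tweets, pattern, n=3):
    """Return [(entity, count, % of all matched entities)] for the n most frequent."""
    counts = Counter(match.lower() for tweet in tweets
                     for match in pattern.findall(tweet))
    total = sum(counts.values())
    return [(name, count, round(100 * count / total, 1))
            for name, count in counts.most_common(n)]

# hypothetical tweets, written for this example
tweets = [
    "Just enrolled via @coursera #coursera #MOOC",
    "Thanks @drchuck for a great class on @coursera #coursera",
    "RT @coursera: new courses this week #coursera",
]
print(top_entities(tweets, HASHTAG))
print(top_entities(tweets, MENTION))
```

Lower-casing before counting merges variants such as #MOOC and #mooc; whether that is desirable depends on the research question.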
a week of tweets
top URLs (Coursera) | count (n=6239) | title
http://blog.coursera.org/post/87629164467/coursera-works-with-turkeys-largest-mobile-provider | 90 | Coursera works with Turkey’s largest mobile provider, Turkcell, to build resources for Turkish Learners
http://blog.coursera.org/post/87742596882/restoring-course-access-in-iran | 82 | Restoring Course Access in Iran
https://www.coursera.org/course/managerisk | 71 | Risk and Opportunity: Managing Risk for Development
https://www.coursera.org/ | 61 | Coursera
top URLs (Khan Academy) | count (n=3483) | title
http://www.nasa.gov/content/nasa-khan-academy-collaborate-to-bring-stem-opportunities-to-online-learners | 238 | NASA, Khan Academy Collaborate to Bring STEM Opportunities to Online Learners
http://mashable.com/2014/05/29/khan-academy-nasa-stem | 157 | NASA, Khan Academy Team Up for STEM Education
Top shared/tweeted URLs (as part of the tweets) can – above all – reassure us that we are slowly reaching data saturation (within our limited
research universe).
Out of the 6,239 URLs in tweets mentioning Coursera, 90 link to Coursera’s blog article about cooperation with Turkcell, 82 link to
Coursera’s blog article about restoring course access in Iran, and finally, 71 link to Coursera’s course on the subject of the World Bank’s “World
Development Report 2014”. Coursera was also introduced several times as itself (a link to its homepage).
Out of the 3,483 URLs in tweets mentioning Khan Academy, 238 of them link to NASA’s article about STEM collaboration & 157 of them link to
Mashable’s article on the same topic.
If we were to draw some conclusions here, we should, once again, stress the importance of blog articles sharing background information &
(therefore) deepening one’s relationship with an institution.
a week of tweets
top media (context tweet & photo), Coursera:
1) Programming for Everybody session 2 just became my highest-enrollment @Coursera class ever
2) [automatic translation, Arabic] @research_tools sites who offer free courses on the net credit of the best universities in the world and in different languages
top media (context tweet & photo), Khan Academy:
1) Khan Academy logged-in homepage: As of today, everything in red is rendered by @reactjs
2) Page load times on Khan Academy just dropped in ~half today thanks to a new, faster A/B test system
3) When you get 4 in a row on khan academy then miss the last one
The very last metadata we will mine before the text analysis are “media”, a 2011 addition to Twitter, which used to be a text-only microblogging platform. The
most popular (i.e. most frequent) photos attached to tweets can complement the overall picture of both institutions’ social web presence with a visual
insight.
The first picture from the left, “Programming for Everybody”, shows enrollments in the courses of our “old friend”, Charles Severance. The next picture, showing
some of Coursera’s partner institutions, was attached to the (several times retweeted) tweet of Sa3ad, a foreign student in the US who specializes in
psychology & brain science.
As for Khan Academy’s photos, the first & second pictures belong to user “soprano”, i.e. Ben Alpert, a young Khan Academy engineer, who shared
some information about Khan Academy’s development. The “top 3” is completed by the already seen meme making fun of the fact that, in order to complete
a Khan Academy exercise, you need 5 correct answers in a row.
What’s the lesson here? Once again we can see that sharing an institution’s background information on social media works in the realm of education as well
& is able to connect its community.
Regarding the text analysis* on the next pages, it should be mentioned that, apart from
preprocessing procedures (normalization, depunctuation, lowercase conversion,
removing stop words, spelling correction, dealing with negative words by replacing negations with
antonyms, stripping off links, retweets & replies, stemming, lemmatization etc.), the tweets
were also translated to a common language, English (machine translation, Google Translate
API).
Just like any of the previous analyses, text mining can also tell us something about the
aura/brand “floating” above both educational tools. If you speak “Academic”, textual and/or
content analysis can provide insights into how human beings make sense of the world. In our
case, those meanings are reflected in the text & context of the natural everyday Twitter
interactions surrounding both institutions, which co-define them and, at the same time, are their
reflections.
The most frequent keywords (in descending order), controlled for stems, lemmas &
part-of-speech tags (verbs & adjectives in particular), provide us with insight into the
prevailing/mainstream universe of meanings around both institutions.** Without further delay,
let’s start with Coursera.
* Python’s NLTK 3.0, Pattern (CLiPS), TextBlob & Pandas libraries were used to analyze
the textual content of tweets.
** Even though some of the interpretations might seem “too simple” or even “naive”, bear in mind the
following two points: 1) you enter this discussion as a reader very familiar with the topic, since by now you’ve
probably read about 100 pages on Coursera’s & Khan Academy’s social web presence (it’s
coming to an end, I promise =)); 2) ever since ethnomethodology was established by Harold Garfinkel, we
can argue on a scientific basis that exactly such common everyday reasoning & day-to-day experiences
allow us to understand the social orders around Coursera & Khan Academy.
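To make the preprocessing steps more tangible, here is a minimal Python sketch of three of them (lowercase conversion, stripping off links, depunctuation) followed by a stop-word-filtered keyword count. It is not the pipeline used for this paper: it relies on the standard library only, the stop-word list is a tiny made-up stand-in for the one shipped with NLTK, and spelling correction, negation handling, translation, stemming & lemmatization are left out. The function names `normalize` & `top_keywords` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); NLTK ships a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "of", "to", "for", "and", "by", "i", "my"}

def normalize(tweet):
    """Lowercase a tweet, strip links, and remove punctuation."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip links
    text = re.sub(r"[^\w\s]", "", text)        # depunctuation, also drops @ and #
    return " ".join(text.split())              # collapse repeated whitespace

def top_keywords(tweets, n=10):
    """Return the n most frequent non-stop-word tokens across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tok for tok in normalize(tweet).split() if tok not in STOP_WORDS)
    return counts.most_common(n)

sample = [
    "Completed an excellent course in Machine Learning by @andrewng on @Coursera! https://t.co/OfawjBUITj",
    "coursera is awesome",
]
print(normalize(sample[0]))
print(top_keywords(sample, 3))
```

On English tweets, `normalize` should come close to the “analyzed tweet” column shown with the sentiment analysis later on; stemming or lemmatization (e.g. with NLTK) would then be applied to the resulting tokens before counting.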
top keywords | Porter stemmer (top) | Wordnet lemmatizer (top)
coursera | coursera | coursera
course(s) | cours | course
free | sign | free
online | mooc | online
mooc | free | signed
learn(ing) | learn | mooc
signed | onlin | learning
programming | start | class
data | educ | programming
education | take | data
earn(ed) | class | learn
university | univers |
part-of-speech tags
top verbs: sign(ed), earn(ed), take, learn
top adjectives: free, learn, free, new, more, first, great, good, mobile, english
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Coursera_TweetsWeek.csv) | June 2014
Coursera represents a “free”, “online”, “mooc”, “education(al)” platform
offering “new” courses from the world’s top “universities”, mainly in “english”.
People usually talk about what you can “learn” there, which courses
(“class(es)”) “start” soon, which courses you can “sign” up for, which “course”
they (have) “take(n)” (or “signed” up for), whether it is their “first” Coursera
experience, or where you can find “more” information or “more”
educational resources from “coursera”. People like to tell others what
percentage of points they “earn(ed)”, and whether they received a
certificate from the “sign(ature)” track. In particular, people talk about
courses concerning “programming” & “data”. The course experience & the
“free” “education” “coursera” offers is “good” or even “great”. So is the fact
that the “education” provided by “coursera” is now also “mobile”.*
* In order to provide a coherent interpretation, we simplify & generalize. “good”,
for example, is also often part of the collocation “good luck”; the same holds for some other words.
The sentences are structured to fit the majority of statements/tweets
containing the keywords used (likewise in the case of Khan Academy on the next page).
To those who are interested in doing their own research on the enclosed datasets (see the
last pages of this research paper): don’t forget that my overall frequencies also include
all non-English text, so you might need to use an automatic translation API, such as
Google Translate, Bing/Microsoft Translator, Yandex Translate and/or many others
(also note that the BabelFish API built into Python’s NLTK was discontinued).
top keywords | Porter stemmer (top) | Wordnet lemmatizer (top)
khan | khan | khan
academy | academi | academy
nasa | nasa | nasa
education | educ | education
stem | team | stem
team | onlin | team
online | learn | online
khanacademy' | khanacademi | khanacademy
space | space | space
learn | collabor | learn
opportunities | opportun | opportunity
tutorials | video | video
astronomy | tutori | tutorial
youtube | astronomi | astronomy
 | youtub | youtube
 | explor | exploration
 | launch | mashable
 | mashabl | sensation
 | itun |
 | sensat |
part-of-speech tags
top verbs: expand, launch, learn, bring, unveil, know
top adjectives: free, english, new, more, good, great, educational, smarter
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014
Just like Coursera, Khan Academy represents a “free”, “online”,
“education(al)” platform (however, not labeled “mooc” as often as
Coursera), also with educational resources mainly in “english”. You can
“learn” almost anything there from “youtube” “video(s)”, which make you
“smarter”. You can find those on “itunes” as well. People think that
“khanacademy” is offering “great” resources that help you do “good” (not
only) on your exam; you should “know” it exists if you need such help
or if you need “more” educational resources. Some believe you can
“learn” “more” on “khan academy” than in a traditional class. Some also
think that “khan academy” is an “educ(ational)” “sensat(ion)”.
Rumor has it (and not only “mashable” helps spread it) that
“khan” “academy” “team(ed)” up with “nasa” to “bring” and/or “expand”
“stem” “education” “opportunities”. Building on this “collabor(ation)”,
“new” open “educational” resources were “launch(ed)”, namely
“tutorials” on “astronomy” and “space” “explor(ation)”, which they
recently “unveil(ed)”.
top collocations (Coursera)
machine learning; attend lessons; starts june; avert damages
free online; web application; uncertain world; olympic games
online courses; research methods; scientists toolbox; world prestigious
data science; application architectures; world preparation; online course
earned 100; starts today; course access; turkcell academy
creativity innovation; bad times; human-computer interaction; interactive programming
iran find; understanding research; prestigious university
top collocations (Khan Academy)
khan academy; education sensation; academy debuted
stem education; khanacademy collaboration; technology causes
academy team; youtube education (/education youtube); causes salman
apple mac, iphone apple, itunes iphone, itunesu itunes; launched online; learn more
space exploration, nasa khan; online learners; bring stem
stem opportunities, announced today, opportunities announced, expand stem; unveiled space; academy tutorials
whsciencefair learn; book review; australia collections
The most frequent collocations in both corpora support our previous conclusions. Creating sentences from them
would rather miss the point in this case, so it’s up to you, the reader, to take a look at both tables. As for
Coursera, I would just like to point out (because the data to draw such a conclusion were missing in our
previous analysis) that the partner universities of Coursera are usually given attributes like “prestigious”
(see the table) or “world-class”. As for Khan Academy, “technology causes” the transformation &
innovation in education (see the aforementioned nishi19’s “book review”). As for both, once again, this time
derived from natural language, we can see the diversity of Coursera’s meanings and, by contrast, the
uniformity of the key topics around Khan Academy.
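A frequency-based collocation table like the ones above can be approximated by counting adjacent word pairs. The following Python sketch is only an illustration under stated assumptions: the tweets are already normalized & tokenized, the stop-word list is a made-up miniature, and stop words are removed before pairing (so words separated only by a stop word count as neighbours). NLTK also offers collocation finders with association measures, which may rank pairs differently than raw frequency. The name `top_collocations` is hypothetical.

```python
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "in", "on", "by", "i", "of", "to"}

def top_collocations(token_lists, n=10):
    """Return the n most frequent adjacent word pairs after stop-word removal."""
    counts = Counter()
    for tokens in token_lists:
        content = [tok for tok in tokens if tok not in STOP_WORDS]
        counts.update(zip(content, content[1:]))  # consecutive pairs (bigrams)
    return counts.most_common(n)

# Three processed tweets taken from the tables in this paper.
sample = [
    "completed an excellent course in machine learning by andrewng on coursera".split(),
    "refreshing i recommend machine learning on coursera".split(),
    "awesome stanfords machine learning on coursera starts soon".split(),
]
for pair, count in top_collocations(sample, 3):
    print(count, " ".join(pair))
```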
“khan academy” concordance sample
evaluating the khan academy edutech
wbedutech evaluating the khan academy
i love khan academy
education wbedutech evaluating the khan academy edtech ict
i stayed up all night watching khan academy videos double speed and now feel slightly crazed
education
spending my first official day of summer cramming for the sat on khan academy awesome
all sat materials on khan academys page are free
“coursera” concordance sample
coursera course evolution a course for educators
the fiction of relationship from brownuniversity on coursera
the athlete within from unimelb on coursera
oilproject from 9 june on the new coursera mooc of unibocconi which already has 13000 students enrolled here are the details
humanitarianism from admissionsuom on coursera
coursera has an app didnt know abt it
i am now inspired to combine this coursera with classic teaching synergy is huge and potential fantastic
i earned 801 in an introduction to interactive programming in python on coursera
programming for everybody session 2 just became my highestenrollment coursera class ever
coursera with drchuck learning to change the world 1 program at a time
how should one put a course from coursera on the resume
only 615 of coursera signature fee goes to the university how small goes to the profs
find free online classes at coursera futurelearn and more
organizations from vanderbiltu on coursera
unethical decision making in organizations from unil on coursera
refreshing i recommend machine learning on coursera
a scientists toolbox online course coursera websignage scoopit
4 of 12 courses on the coursera homepage are penn open courses today more to choose from
representations of sexualities on coursera
yay penn mt pennopencourses 412 courses on coursera homepage are penn courses today
sharing context of “khan”: nasa, education, brandnew, delivers, educational, delivers, educational, career, curriculum, my
sharing context of “coursera”: course, mooc, python, education, online, june, new, first, free, courses, programming
Our shift to a slightly more personal note, which will be reinforced immediately afterwards by the
sentiment analysis, can begin with “coursera” & “khan academy” concordance (context of a
word) samples. Besides several tweets about one’s personal experience (see the following
sentiment analysis for more), we can see that Khan Academy was reviewed by “EduTech”, a
World Bank blog on ICT use in education, which released the article “Evaluating the Khan
Academy”. The “coursera” concordance shows again the diversity of its courses & topics, and
also Coursera’s “social dimension”. Moreover, the extracts include the first instances of a “want
to learn” notion on Coursera & a “need to learn” notion on Khan Academy (see the sentiment analysis).
The tables of words sharing the context of “coursera” & “khan” support our previous
conclusions. It is worth mentioning that “careers” are related to STEM education, and therefore to
the cooperation between Khan Academy & NASA.
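Both views can be sketched in a few lines of Python. This is a simplified stand-in, assuming already tokenized tweets: NLTK's `Text` class offers `concordance()` & `similar()` for the same purpose, but the two helpers below (hypothetical names `concordance` & `shared_context_words`) use a plain token window & exact (previous word, next word) matching, so their output will not be identical to NLTK's.

```python
from collections import Counter, defaultdict

def concordance(token_lists, keyword, width=4):
    """Return each occurrence of keyword with up to `width` tokens of context per side."""
    lines = []
    for tokens in token_lists:
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = tokens[max(0, i - width):i]
                right = tokens[i + 1:i + 1 + width]
                lines.append(" ".join(left + [tok] + right))
    return lines

def shared_context_words(token_lists, keyword, n=5):
    """Rank words by how many (previous word, next word) contexts they share with keyword."""
    contexts = defaultdict(set)  # word -> set of (previous, next) pairs
    for tokens in token_lists:
        padded = ["<s>"] + list(tokens) + ["</s>"]  # mark tweet boundaries
        for prev, word, nxt in zip(padded, padded[1:], padded[2:]):
            contexts[word].add((prev, nxt))
    target = contexts.get(keyword, set())
    scores = Counter()
    for word, ctx in contexts.items():
        overlap = len(ctx & target)
        if word != keyword and overlap:
            scores[word] = overlap
    return scores.most_common(n)

# Made-up toy corpus, for illustration only.
corpus = [
    ["free", "course", "on", "coursera"],
    ["free", "class", "on", "coursera"],
    ["i", "recommend", "machine", "learning", "on", "coursera"],
]
print(concordance(corpus, "coursera", width=2))
print(shared_context_words(corpus, "course"))
```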
average sentiment (Coursera): 0.14; with neutral (0) tweets removed: 0.30
A simple* sentiment analysis allows us to examine the sentiment polarity score of tweets on the interval [-1.0, 1.0], where ‘-1’ stands for a negative attitude & ‘+1’ stands for a positive attitude. A tweet is treated as
a bag-of-words (words are weighted by tf-idf). The sentiment is based on the adjectives it contains, using Python’s NLTK classifier trained on a movie reviews corpus by Pang & Lee. Leaving aside
the (rather) marketing uses of sentiment analysis (monitoring customer feedback and/or responding to it in order to improve the product and/or an institution’s
overall image, both online & offline), which are not so useful for our purposes, the Twitter testimonials** of Coursera or Khan Academy users allow us to study their natural & first-hand positive,
neutral or negative experiences. Looking at both institutions’ brands with regard to user experience, we should be able to conclude, though certainly not complete (see the text box of
“other social web analyses” at the end of this text’s conclusions), the overall social web image of Coursera & Khan Academy.
As for the information on this page, the average sentiment is an informative, yet very rough, metric telling us that both institutions are discussed in rather positive connotations (how
surprising! =)). No “our-Twitter-corpus-specific” machine learning techniques were used (see the first footnote), so we encountered the typical issues:
some testimonials labeled as negative were not actually negative but rather sarcastic, and some tweets labeled “negative” in fact described a negative experience which Coursera/Khan
Academy solved. Taking that into account, statements like “Khan Academy is, in general, less popular than Coursera”, or “general attitudes towards Khan Academy are close to neutral” would ridicule statistical
significance (and it’s not wise to fool with the foundations of statistics =)). The point here is to quickly classify & arrange our dataset in order to find out what people like & dislike about both educational
tools.
* “Simple” is sufficient with respect to our needs. We do not strive for a (methodologically) as accurate as possible positive/neutral/negative classification of the tweets, but rather for a (practical) quick & dirty, efficient way of using
adjectives to order our dataset by sentiment estimates, so that we can show the most distinctive instances & also be able to quickly draw some conclusions about the rest. Even though we can’t possibly cover all tricks &
gadgets of natural language processing, just to illustrate some potential improvements of our sentiment classification, let’s start with the suggestion that we might want to improve our machine translation and/or our sentiment
classifier. Both can be achieved by machine learning & (therefore) some manual user input of pre-classified tweets to train our classifier and/or by using an existing tagger optimized for Twitter, e.g. Twitter NLP and Part-of-Speech
Tagging (Noah's ARK, Carnegie Mellon University). Using training & test sets, the commonly used Naive Bayes classifier could be employed & its accuracy measured. Furthermore, we could try bigrams or trigrams instead
of unigrams so that we can better identify sarcasm, negations & other related sentiment issues. By detecting pronouns, we could also distinguish tweets based on their subjectivity. “Share those amazing online courses on
Coursera/Khan Academy: (…)”, e.g. posted by an online publisher, is different from “I want to share this amazing online course I took on Coursera/Khan Academy: (…)!”, e.g. posted by an enthusiastic Coursera/Khan Academy user.
Based on our methodology, without accounting for subjectivity, the second tweet might have a less positive overall sentiment (because it contains more words) even though we, humans, might argue it is the other way around.
** In order to further increase our generalizability, we might also want to collect “the remaining” 99.9% of other online testimonials about Coursera & Khan Academy. =) Indeed, even for decision-making regarding Twitter only,
we might also want to employ long-term & automated data collection. As for the Twitter testimonials that “made it to this paper”, i.e. were labeled as very negative or very positive, the names of users that do not
represent an institution, brand and/or Coursera were removed. Thus, on the following pages, you will find only the mention/reply symbol (“@”) without the username that would normally follow it.
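The two summary figures reported per institution (average sentiment with & without neutral tweets) are easy to recompute once every tweet has a polarity score. The Python sketch below is only an illustration: instead of the trained classifier described above it uses a made-up mini-lexicon of adjective polarities (`LEXICON`), and the helper names `polarity` & `summarize` are hypothetical.

```python
# Hypothetical mini-lexicon of adjective polarities; the paper relied on a
# trained classifier, not on this hand-made dictionary.
LEXICON = {"awesome": 1.0, "excellent": 1.0, "best": 1.0, "good": 0.7, "bad": -0.7, "boring": -1.0}

def polarity(text):
    """Average the polarities of the lexicon words found in a processed tweet (0.0 if none)."""
    hits = [LEXICON[word] for word in text.lower().split() if word in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def summarize(scores):
    """Return (mean over all scores, mean over non-neutral scores only)."""
    non_neutral = [s for s in scores if s != 0]
    mean_all = sum(scores) / len(scores)
    mean_non_neutral = sum(non_neutral) / len(non_neutral) if non_neutral else 0.0
    return mean_all, mean_non_neutral

# Processed tweets taken from the tables in this paper.
tweets = [
    "coursera is awesome",
    "awesome course",
    "desperately trying to work on coursera platform from my mobile device",
    "how should one put a course from coursera on the resume",
]
scores = [polarity(t) for t in tweets]
print(summarize(scores))
# Sorting by score surfaces the most distinctive tweets at either end.
print(sorted(zip(scores, tweets), reverse=True)[0])
```

Because neutral tweets contribute 0 to the sum, the two averages are linked: the average over all tweets equals the average over non-neutral tweets multiplied by the share of non-neutral tweets. The reported values would therefore imply that roughly 0.14 / 0.30 ≈ 47% of the Coursera tweets & roughly 0.05 / 0.21 ≈ 24% of the Khan Academy tweets received a non-zero score (approximate, since the averages are rounded).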
average sentiment (Khan Academy): 0.05; with neutral (0) tweets removed: 0.21
sentiment (detected language) | original tweet | analyzed tweet [automatic translation, processed]
1 (en) | Here's a start on @coursera etc @ Summer of Learning! @ pulls together best online courses - http://t.co/0Fq07j2ZwM | heres a start on coursera etc timf5050 summer of learning lifehacker pulls together best online courses
1 (en) | @ are you teaching any MOOCs this summer? Your Coursera course was the best yet! | are you teaching any moocs this summer your coursera course was the best yet
1 (en) | Professor you've made my transition into bschool a cake walk. Best course I've ever taken.Huge fan of your novel approach | professor youve made my transition into bschool a cake walk best course ive ever takenhuge fan of your novel approach
1 (nl) | En ook MIT heeft een online university. Doei vrije tijd: http://t.co/dwgViu9uN3 & http://t.co/WSy1eHIBM1 #awesome | and mit has an online university bye leisure & awesome
1 (en) | @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @ | should study khan academy coursera and udacity as role models of moocs adopt best practices
1 (en) | that class that you mention from coursera economics of money and banking is excellent thank you | that class that you mention from coursera economics of money and banking is excellent thank you
1 (ja) | #Coursera #posa-002 Week 1のResultsは満点で無事に完了! 結構長かった。 #3good | results of coursera posa002 week 1 was longer quite done safely in a perfect score 3good
1 (ar) | إليكم قائمة بأفضل المواقع العالمية التي يمنكم من | heres a list of the best sites you may feel that the world of
1 (en) | Your beliefs draw your behaviors and your behaviors determine outcomes!! Just completed Week 2 of awesome learning @coursera! | your beliefs draw your behaviors and your behaviors determine outcomes just completed week 2 of awesome learning coursera
1 (ru) | По теории Графов лучше всего подходят курсы стенфордского университета на Coursera #yacm2014 | on graph theory are best courses at stanford university on coursera yacm2014
1 (en) | Completed an excellent course in Machine Learning by @andrewng on @Coursera! https://t.co/OfawjBUITj | completed an excellent course in machine learning by andrewng on coursera
1 (en) | coursera is awesome | coursera is awesome
1 (en) | @drchuck reading your book for the coursera MOOC. It's awesome!!! | reading your book for the coursera mooc its awesome
1 (en) | Listening to Rick Levin, CEO of Coursera. Very impressive! | listening to rick levin ceo of coursera very impressive
1 (en) | @coursera @BerkleeCollege Oh my gosh, that was AWESOME! | coursera berkleecollege oh my gosh that was awesome
1 (en) | I found the perfect class for me. I'm geeking out in anticipation. https://t.co/VaebXhoqLV | i found the perfect class for me im geeking out in anticipation
To sum up the positive tweets about Coursera: users with a good user experience recommend Coursera’s courses, report their
progress in a particular course, mention their achievements and/or positive learning experience, and thank a particular educator (/professor).
To illustrate that (you can find the other tweets in the dataset), take a look at the tweets with the highest (/most positive) detected sentiment on this
& the two following pages. As for the importance of such (“natural”) personal word-of-mouth testimonials, the “opinion leadership” concept & the
influence of our social networks on us: these are the small elements that in aggregate shape, to a greater or lesser extent for different people, the
source of (high/low) authority & credibility of any institution.
sentiment (detected language) | original tweet | analyzed tweet [automatic translation, processed]
1 (en) | Excellent suite of basics on Data Analysis https://t.co/LOeoGRh4N9 #MOOC | excellent suite of basics on data analysis mooc
1 (en) | awesome course! https://t.co/wdenaGtRSb | awesome course
1 (tr) | Dünyanın en iyi üniversitelerinden dersler, Türkçe altyazı ile @TurkcellAkademi'de.. http://t.co/pglIqoAjvF | lessons from the best universities in the world with turkish subtitles in turkcellakademi
1 (en) | @ I need to sign up to even look at the course details. I've heard of coursera though. Best of luck, I can't take up any course now. | i need to sign up to even look at the course details ive heard of coursera though best of luck i cant take up any course now
1 (en) | My online course is called Paradoxes of War. This is going to be awesome. #Coursera #princeton | my online course is called paradoxes of war this is going to be awesome coursera princeton
1 (en) | Excellent News http://t.co/wdpxXRIz7X | excellent news
1 (en) | This site is awesome https://t.co/diuZtu5Jms dunno about the classes yet though but signed up for one to start this week :-) | this site is awesome dunno about the classes yet though but signed up for one to start this week
1 (en) | The Best MOOC Provider: A Review of Coursera, Udacity and Edx - http://t.co/ENDu5MJfFK http://t.co/3ht3jcANcA | the best mooc provider a review of coursera udacity and edx
1 (en) | "Aggregation is 'link to the rest', where Curation is 'link to the best'": Understand Google, Northwestern University coursera lec :) | aggregation is link to the rest where curation is link to the best understand google northwestern university coursera lec
1 (en) | @drchuck reading your book for the coursera MOOC. It's awesome!!! | drchuck reading your book for the coursera mooc its awesome
1 (en) | The Best MOOC Provider: A Review of Coursera, Udacity and Edx - http://t.co/ENDu5MJfFK http://t.co/3ht3jcANcA via @skilledup | the best mooc provider a review of coursera udacity and edx via skilledup
1 (es) | Las mejores ... - http://t.co/CBr8RavZAu #Coursera #Cursos #Duolingo #Nasa #Udemy http://t.co/bK6FvkCNcQ | best coursera courses nasa udemy duolingo
1 (es) | Las mejores #Aplicaciones ... --http://t.co/0gH3naXrCE #RecetasNaturale #Coursera #Cursos #Duolingo #NASA #UDEMY | best applications recetasnaturale tco0gh3naxrce courses duolingo coursera nasa udemy
1 (es) | Las mejores aplicaciones para no dejar de aprender aún siendo adultos #ANDROIDE #Tecnologia #Udemy #Coursera #NA... http://t.co/TxyaG5Vk0T | the best applications for non stop learning even as adults android technology udemy coursera na
1 (es) | Las mejores aplicaciones para no dejar de aprender aún siendo adultos: A medida que nos hacemos... http://t.co/bNC1BSpQyx #Udemy #coursera | the best applications for non stop learning even as adults as we grow udemy coursera
1 (es) | RT @: Después de un 10 en Semana 1, estoy listísima para Semana 2 <3 #TCGO @coursera @UniLeiden | after a 10 on week 1 im listsima for week 2 < 3 tcgo unileiden coursera
1 (en) | @ Best of luck on your @Coursera journey. Follow @DukeU for updates on Duke news, research and campus life. | best of luck on your coursera journey follow dukeu for updates on duke news research and campus life
1 (tr) | @Turkcell @coursera mükemmel bir calışma teşekkürler | turkcell courser calma the perfect thank you
1 (ru) | Друзья ЛЕГЕНДАРНАЯ новость, для желающих познать финансы. На Coursera вышел … | friends legendary news for those wishing to learn finance coursera went on
1 (es) | Una vez más empiezo esta maravilla de curso en Coursera, y una vez más me veo abandonándolo por falta de tiempo :_( https://t.co/FpnrJM5aIL | again start this wonderful coursera course and once again i see myself abandoning it for lack of time
1 (en) | @: One if the best for #datacomputing Heard of @Coursera courses? We have them at the @JohnsHopkinsSPH. Explore #gohop http… | one if the best for datacomputing heard of coursera courses we have them at the johnshopkinssph explore gohop
sentiment (detected language) | original tweet | analyzed tweet [automatic translation, processed]
1 (tr) | Turkcell Akademi ve Coursera işbirliği ile dünyanın bilgisi Türkçe: ABD’nin en iyi üniversitelerinden dünyaca ... http://t.co/9lJjiXfQrr | turkcell academy in collaboration with the worlds information and courser turkish usa s of the best universities in the world
1 (tr) | Turkcell Akademi ve Coursera işbirliği ile dünyanın bilgisi Türkçe: ABD’nin en iyi üniversitelerinden dünyaca ... http://t.co/qH3ttSNN64 | turkcell academy in collaboration with the worlds information and courser turkish usa the world s best universities
1 (it) | @ @ C'e' un eccellente @coursera MOOC https://t.co/VciPCU4qmz esplora il peso/ % di compiti in un corso e consequenze. | there s an excellent coursera mooc explores the weight of tasks in a course and consequences
1 (en) | @: @ Just finished the R Programming course on @coursera. Excellent use of time.” I am also almost done. | just finished the r programming course on coursera excellent use of time i am also almost done
1 (gl) | Suscribite a este curso para aprender a programar Programming for Everybody | Coursera https://t.co/xwJl4pRxuJ via @delicious | suscribite to this course to learn how to program programming for everybody course via delicious
1 (en) | Awesome @geranyl: Stanford's Machine Learning on @Coursera starts soon! https://t.co/f0gVkoLpbz | awesome stanfords machine learning on coursera starts soon
1 (en) | @ Just finished the R Programming course on @coursera. Excellent use of time. | just finished the r programming course on coursera excellent use of time
1 (en) | Weekends appear to be the perfect time to catch up on @coursera lectures and assignments before Sunday deadlines.#MOOC | weekends appear to be the perfect time to catch up on coursera lectures and assignments before sunday deadlinesmooc
1 (en) | What is the best iOS 6 compatible app for Coursera? | what is the best ios 6 compatible app for coursera
1 (en) | Coursera's Internet History, Technology, and Security course starts in two days. Looks awesome! (And it requires no programming.) #IHTS | courseras internet history technology and security course starts in two days looks awesome and it requires no programming ihts
1 (en) | The Awesome moment when you signup for the Scala course in @Coursera and finds out the instructor is the creator of Scala | the awesome moment when you signup for the scala course in coursera and finds out the instructor is the creator of scala
1 (en) | @: Awesome chat with Pat Bosshart about next-gen #SDN chipsets. http://t.co/FF3K986nVf I learned a ton. Week 5 of @coursera cove… | awesome chat with pat bosshart about nextgen sdn chipsets i learned a ton week 5 of coursera cove
1 (en) | @ Best of luck on your @Coursera journey. Follow @DukeU for updates on Duke news, research and campus life. | best of luck on your coursera journey follow dukeu for updates on duke news research and campus life
1 (en) | Hey @coursera @open2study We wanted to let you know we featured your awesome courses in our Courses of the Weekend! http://t… | hey coursera open2study we wanted to let you know we featured your awesome courses in our courses of the weekend
1 (en) | Best platform is Coursera...I've done loads! :) | best platform is courseraive done loads
1 (en) | 2014 Internet Trends http://t.co/JLTa3pAjM1 - Impressive growth in online learning resources like @khanacademy @coursera and @duolingo. | 2014 internet trends impressive growth in online learning resources like khanacademy coursera and duolingo
1 (en) | BBC News - Trinidad pioneers online 'knowledge network' http://t.co/nowzNvkfKW. Impressive to see @coursera working on knowledge.tt. | bbc news trinidad pioneers online knowledge network impressive to see coursera working on knowledgett
sentiment (detected language) | original tweet | analyzed tweet [automatic translation, processed]
-1 (fa) | ایول، # نمیکنهچیزروایراندیگهکورسرا ! #coursera | evil kvrsra irans nothing else does coursera
-1 (fr) | Dégouté d'avoir codé l'assignment4 de #coursera #progfun #scala sans avoir vu les weeks 5 et 6 et la fin de la 4 ... j'aurai gagné mon temps | disgusted to have coded the assignment4 of coursera scala progfun without seeing the weeks 5 and 6 and the end of the 4 i have earned my time
-1 (en) | The physics equations will prolly drive me insane, but enrolling here for the sake of my #writing #workinprogress https://t.co/8Zx661m5AS | the physics equations will prolly drive me insane but enrolling here for the sake of my writing workinprogress
-0.8 (en) | I earned 97.8% in Microeconomics Principles on @Coursera! I hate deadlines! https://t.co/upfc4VNY42 | i earned 978 in microeconomics principles on coursera i hate deadlines
-0.75 (en) | Manchester Uni MOOC on water and sewage. Boring? Wrong!!!! | manchester uni mooc on water and sewage boring wrong
-0.7 (en) | Coursera class on resilience of children in disasters & war--too bad I will be in school! http://t.co/O5WJe4wDur | coursera class on resilience of children in disasters & wartoo bad i will be in school
-0.7 (en) | @coursera @Turkcell I wish coursera would just stay with koç uni. Turkcell is just in pursuit of his brand's promotion, Too bad for coursera | coursera turkcell i wish coursera would just stay with ko uni turkcell is just in pursuit of his brands promotion too bad for coursera
-0.65 (es) | Las clases en línea del Tec son muy complicadas, deberían de aprender de Coursera y edX. | online classes are very complicated tec should learn from coursera and edx
-0.6 (pt) | a merda do coursera fica me mandando emails e eu já me desinscrevi um milhão de vezes | the fucking coursera keeps sending me emails and ive desinscrevi me a million times
-0.6 (en) | This @coursera website is dangerous. I WANT TO LEARN ALL THE THINGS! Dang finite lifespan and brain capacity! | this coursera website is dangerous i want to learn all the things dang finite lifespan and brain capacity
-0.6 (en) | Desperately trying to work on Coursera platform from my mobile device. | desperately trying to work on coursera platform from my mobile device
As already mentioned, without any training the detection of negative tweets performed rather poorly.* Only a minority of the tweets about Coursera were truly negative, & the negative experiences shared very few common topics. We might argue this is because Coursera’s target group is, in general, an older and, at the same time, niche population of people who choose courses built around their professional/academic interests; they are therefore unlikely to be pushed by the traditional education system to learn a particular curriculum (as we’ll see in connection with Khan Academy).
If we were to send Coursera some customer feedback (possibly biased by our data collection period) in the form of crowd-sourced suggestions for improvement (although this is not the primary subject of this analysis), it would be: improving courses’ notification settings (making them clear & easy to adjust); considering the “brand risks” of entering partnerships with non-educational & non-research institutions such as Turkcell, a Turkish mobile phone operator; & offering personally adjustable assignment deadlines (Yaaay! =)). From other negative tweets not displayed on this page but available in the enclosed dataset, further suggestions would be, for example, reconsidering the repeated invitations to buy the signature track (yet for-profit organizations surely need to make a profit =)) and/or increasing the commission (/financial reward) that goes to educators/Professors as opposed to their home University/College. Finally, Coursera could improve the support for students who lag behind and/or have an insufficient entry educational background, e.g. by suggesting in advance where to get it, so that they are able to meet such prerequisites before a course of their interest starts.
* However, neither here nor in relation to Khan Academy should we conclude that we are not significantly better off. The mere fact that we have a – more or less – ordered list of 8,538 (Coursera) + 4,976 (Khan Academy) tweets, ranked by sentiment, at our disposal within tens of seconds is something that would take considerable time & money using manpower, and it greatly facilitates our orientation in the dataset. By the way, prospective machine learners (/data trainers) should know about services like Amazon Mechanical Turk.
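The ordering by sentiment and the two averages reported on these slides (with and without neutral tweets) can be illustrated with a minimal sketch. It assumes that each tweet already carries a score between -1 and 1 and that both reported figures are plain arithmetic means; the sample data and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical (score, tweet) pairs; scores lie in [-1, 1], 0 means neutral.
scored_tweets = [
    (1.0, "khan academy is a wonderful site"),
    (0.0, "new course starts on monday"),
    (-0.8, "i hate deadlines"),
    (0.0, "enrolled in a statistics class"),
    (0.6, "great lecture today"),
]


def rank_by_sentiment(tweets):
    """Return the tweets ordered from most negative to most positive."""
    return sorted(tweets, key=lambda pair: pair[0])


def average_sentiment(tweets, drop_neutral=False):
    """Mean score, optionally ignoring neutral (0) tweets."""
    scores = [s for s, _ in tweets if not (drop_neutral and s == 0)]
    return mean(scores) if scores else 0.0


ranked = rank_by_sentiment(scored_tweets)
overall = average_sentiment(scored_tweets)
non_neutral = average_sentiment(scored_tweets, drop_neutral=True)
```

Neutral tweets add nothing to the sum of scores, so removing them divides the same sum by fewer tweets and pushes the average away from zero. If both reported Coursera figures are plain means, the ratio 0.14/0.30 suggests that a bit under half of the tweets received a non-zero score.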
number of tweets: 4,976
average sentiment: 0.05
average sentiment, neutral (0) removed: 0.21
sentiment | detected language | original tweet | analyzed tweet [automatic translation, processed]
1 | en | Khan Academy is a wonderful site. | khan academy is a wonderful site
1 | en | Besides Khan Academy, is Schaums the best to learn math for quant research and purposes? http://t.co/19VJBBZlNj | besides khan academy is schaums the best to learn math for quant research and purposes
1 | en | @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @… | should study khan academy coursera and udacity as role models of moocs adopt best practices
1 | es | Khan Academy: maravilloso! https://t.co/dQzEPaXgTF | khan academy wonderful
1 | en | “@: All nighter with Khan Academy teaching me Physics, you guys are so awesome.” @khanacademy | all nighter with khan academy teaching me physics you guys are so awesome
1 | tr | Khan academy diye müthiş birşey var ben daha keşfedeli 2-3 ay oluyor | khan academy he has something awesome going on than i explore in 23 months
1 | en | @ @ is waqt moin khan academy best hai or rashid Latif the best two | is waqt moin khan academy best hai or rashid latif the best two
1 | en | Khan Academy is the greatest study tool I've discovered. 🙌 | khan academy is the greatest study tool i have discovered
1 | en | @ Should study Khan Academy, Coursera and Udacity as role models of MOOCs, adopt best practices @ @ | should study khan academy coursera and udacity as role models of moocs adopt best practices
1 | en | @ I haven’t read what you are talking about BUT KA Lite is an awesome offline Khan Academy | palendae i have not read what you are talking about but ka lite is an awesome offline khan academy
1 | en | Almost Khan Academy (aka the best Calc corner ever) http://t.co/tQ6uZLxwkF | almost khan academy aka the best calc corner ever
1 | en | Excellent STEM tools for anyone who works with the kiddos:) http://t.co/V7dclsjj7F | excellent stem tools for anyone who works with the kiddos
1 | en | How to use Khan Academy in Math! #awesome #KIPP #Khan http://t.co/GxCLa8cAlC | how to use khan academy in math awesome kipp khan
1 | en | I wish i found #khan #academy sooner. Simply awesome! #maths | i wish i found khan academy sooner simply awesome maths
1 | en | khan academy actually is awesome | khan academy actually is awesome
0.91 | pt | @hugsmeade O site de matemática é o Khan Academy. Muito bom! | the site math is the khan academy very good
There are fewer Twitter testimonials about Khan Academy than about Coursera: the learning process & the target group, or rather the “target occasion” on which a learner needs & looks up Khan Academy, seem to be very different, and such a user might not be motivated to create public testimonials. Even so, the positive tweets often mention a good experience with a piece of educational material (STEM subjects above all, unsurprisingly), including tweets of thanks related to the necessity of studying a particular topic (homework and/or an exam). The “top” (1.0-0.9) positive tweets detected are listed in the accompanying table.
Jakub Ruzicka | [email protected] | www.linkedin.com/in/littlerose original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014
sentiment | detected language | original tweet | analyzed tweet [automatic translation, processed]
-1 | en | I'm watching a khan academy video in my miserable attempt to pass mannings quiz tomorrow | I am watching a khan academy video in my miserable attempt to pass mannings quiz tomorrow
-1 | es | Khan academy, tareas, examenes mensuales y auditoria TODO junto es HORRIBLE!!!!!!!!!😡😡😡 😤 | khan academy assignments exams and monthly audit all together is horrible
-0.8 | en | @ hate khan academy! | ugh hate khan academy
-0.8 | en | @ and those stupid pendragon essay and khan academy | and those stupid pendragon essay and khan academy
-0.8 | en | @ I hate khan academy | i hate khan academy
-0.8 | en | hate khan academy | hate khan academy
-0.8 | en | @: I've been watching Khan Academy videos all day bc I can't get to the stupid review on mypisd | I have been watching khan academy videos all day bc i cant get to the stupid review on mypisd
-0.8 | gl | Estupido khan academy👊👊 | stupid khan academy
-0.8 | en | Uggggh I hate khan academy 🔫 | uggggh i hate khan academy
-0.8 | en | I freaking hate khan academy>:( | i freaking hate khan academy>
-0.8 | en | @: khan academy is so fucking stupid and it pisses me off to the max | khan academy is so fucking stupid and it pisses me off to the max
-0.8 | en | I really hate Khan Academy | i really hate khan academy
-0.8 | gl | ODIO CON TODO MI CORAZON KHAN ACADEMY!!!! | hate with all mi heart khan academy
-0.8 | en | it has gotten to the point where I put on khan academy to listen to on the way to school i literally hate myself | its gotten to the point where i put on khan academy to listen to on the way to school i literally hate myself
-0.8 | en | I fucking hate Khan Academy | i fucking hate khan academy
-0.8 | en | I hate khan academy someone should show me how to do this | i hate khan academy someone should show me how to do this
-0.8 | pt | essa porra desse khan academy é chato demais | this fucking khan academy is too boring
-0.8 | en | I will always hate math & Khan Academy | i will always hate math khan academy
-0.8 | en | khan academy: saving my geometry grade one annoying ass video at a time | khan academy saving my geometry grade one annoying ass video at a time
Just as with Coursera, the detection of the most negative tweets about Khan Academy produced several misclassifications; for instance, the very first tweet was misclassified because of the word “miserable”. As we can see, KA’s (generally) younger user base also has no problem openly expressing its uncensored & sincere frustrations. =) Negative tweets are often related to a particular “offline” class (traditional K-12 education), as a response to the necessity of studying a particular curriculum (homework, exams etc.).
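The paper does not document how the sentiment tool arrives at its scores. One plausible explanation of the “miserable” misclassification, assuming an untrained word-list (lexicon) approach, is sketched below: a single strongly negative word decides the score regardless of what the tweet is actually about. The lexicon, its weights and the function name are hypothetical and purely illustrative.

```python
# Hypothetical mini-lexicon; the words and weights are illustrative only.
LEXICON = {
    "miserable": -1.0,
    "hate": -0.8,
    "stupid": -0.8,
    "awesome": 1.0,
    "wonderful": 1.0,
}


def lexicon_score(text, lexicon=LEXICON):
    """Average the weights of the lexicon words found in the text (0.0 if none)."""
    hits = [lexicon[word] for word in text.lower().split() if word in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


tweet = ("i am watching a khan academy video in my miserable attempt "
         "to pass mannings quiz tomorrow")
print(lexicon_score(tweet))  # -1.0
```

Under this assumption the tweet is scored as maximally negative even though “miserable” describes the author's own attempt rather than Khan Academy, which is the kind of error that training on labeled examples is meant to reduce.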
Finally, we should point out that the “a week of tweets” Khan Academy corpus provides enough evidence for the previously supported statement that KA’s open educational resources are also often used by University/College (any higher education) students or young professionals. While Coursera’s users – in terms of their “archetype”, since the two groups surely blend in reality – tend to search for classes that (directly) increase their professional/academic qualification, Khan Academy is often used rather as a supplement, which is naturally also related to how study resources are created & distributed on both portals.
It's been quite a journey, right? =) Finally, here comes the very last analysis of this exploratory research paper before the conclusions. Even though we won’t draw any “new” conclusions but rather reinforce some previous findings, it deserves something special, interesting & eye-catching to “say goodbye”. Imagination has no limits …or, at least, so it is said. Since we have 8,538 (Coursera) + 4,976 (Khan Academy) tweets at our disposal, what about asking our database a slightly more sophisticated question? For as long as I can remember, we have been attempting to discover Coursera’s & Khan Academy’s brands; we therefore want to know “what/how Coursera & Khan Academy are”. Sixty-nine tweets (Coursera) plus one hundred & twelve tweets (Khan Academy) that contain “coursera is” or “khan academy is” will review* some of our knowledge about both entities in the way closest to the main point of social media.
* the D3 Word Tree visualization by Jason Davies was used
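The selection of tweets containing “coursera is” or “khan academy is”, and the branching that a word tree displays, can be approximated with a minimal sketch. It is not the D3 Word Tree implementation; it only counts which words follow the root phrase. The function name `continuations` is hypothetical, the first three sample tweets come from the tables in this section and the last one is made up for illustration.

```python
from collections import Counter


def continuations(tweets, phrase, depth=2):
    """Count the `depth` words that follow `phrase` in each matching tweet."""
    root = phrase.lower().split()
    counts = Counter()
    for tweet in tweets:
        words = tweet.lower().split()
        for i in range(len(words) - len(root) + 1):
            if words[i:i + len(root)] == root:
                following = words[i + len(root): i + len(root) + depth]
                if following:
                    counts[" ".join(following)] += 1
                break  # count each tweet only once
    return counts


sample = [
    "khan academy is a wonderful site",
    "Khan Academy is the greatest study tool i have discovered",
    "khan academy actually is awesome",      # no match: the phrase is interrupted
    "i think khan academy is a lifesaver",   # hypothetical tweet
]
branches = continuations(sample, "khan academy is", depth=1)
print(branches.most_common())  # [('a', 2), ('the', 1)]
```

Each counted continuation corresponds to one branch leaving the root of the tree; increasing `depth` follows the branches further.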
[Figure: word tree of tweets containing “coursera is”; n = 69; original data (dataset: Twitter_Coursera_TweetsWeek.csv) | June 2014]
[Figure: word tree of tweets containing “khan academy is”, parts 1/2 & 2/2; n = 112; original data (dataset: Twitter_Khan_TweetsWeek.csv) | June 2014]
conclusions
Coursera & Khan Academy: brand essence, swot, positioning & more
…and what does the social web say about education?
Since we want to “fit” the collected social web data, i.e. to draw conclusions based on them only (although a potential business analysis would naturally take other sources of information into consideration as well, which is, by the way, an appropriate compensation for research bias), the following simplified & slightly modified common marketing analyses/tools are not necessarily used in the “correct” way & might be incomplete. However, employing them is very convenient for our need to summarize the overall social web presence of both institutions.
brand: Coursera
functions
providing (prospective/current) University/College students, professionals & self-directed learners with hundreds of niche-specialized diverse advanced (college-like & online by default) courses for lifelong education under one platform
meeting the demand for introductory-level courses teaching easily transferable skills (ICT & data science skills above all) to other professional/academic areas
anytime (however with fixed dates & deadlines) educational content for (in the long term) improvement of one’s own labor/academic market status
personality
social mission – transformation of education – focused on transforming society
opening up higher academic/professional education to mass audience (“anyone can join”)
decentralization & diversity
storytelling (self-driven education & life improvements) and inspiration as an education facilitator
close (semi-formal) relationship/friendship with a Professor/instructor/educator
performance
providing opportunity to improve one’s own qualification via self-driven education in order to improve one’s own life
positioning itself via connecting to, establishing partnerships with and/or sharing content of (many) Universities (labeled “world-class”, “prestigious” etc.), University Professors & other educational institutions, also news & tech publishers, famous people & celebrities (politics & entertainment)
active support/promotion of new courses, new partnerships, users-storytellers, technology in education enthusiasts & opening up education enthusiasts
source of authorities
[also influencing Coursera and/or its brand] College/University & other partnerships (recently WorldBank), University Professors, Andrew Ng & Daphne Koller, “the rest” of the board & investors; activity of all aforementioned, generally reinforcing Coursera’s academic-business nature & the communication element of shaping/transforming society through education & research
active, enthusiast & (often) influential social media users recommending particular courses, posting about their progress, celebrating their achievements and/or positive learning experience, and giving thanks to educators;
active, enthusiast & (usually) influential educators & institutions they represent
major news & tech publishers positioning Coursera as the leading common platform for communities looking for online higher education and/or keen to learn & excited about online education & its techn(olog)ical transformation
signature track certificates
brand essence
I want to learn to make a move
brand: Khan Academy
functions
fixing/establishing one’s common core education via targeted education mix, assembled from thousands of small lectures (mathematics & STEM education in the first place)
just-in-time educational content helping one succeed in the traditional education system
access to coherent primary/high school common core for self-directed learning under one platform
personality
social mission – transformation of education – focused on transforming education system
flipping the traditional institutional education system & opening it up
centralization & integrity
gamification & storytelling (self-driven education & school performance) as an education facilitator
close (semi-formal, even informal) relationship/friendship with Salman Khan
performance
providing opportunity to deal with required necessities within the traditional education/qualification system (& facilitating it) and/or serve as its supplement using “catchy” lectures, personalization & gamification
Khan Academy as a big, open & informal (with a lot of “background” content) family led by “father Sal”, developed & enabled by a small team (ICT development and/or content creation), co-developed & spread thanks to the effort of its volunteer community
publicizing the new "science" courses & establishing partnerships with (a limited number of) key/influential institutions in order to popularize & facilitate STEM education
source of authorities
Salman Khan positioned by major news, tech & also education-focused publishers as education transformation leader (via video & technology) & successfully communicating core topics of his vision
strong & growing user & volunteer community, “transparent” & well-delivered background processes & development activities
[also influencing KA and/or its brand] College Board (SAT college admission exam), the White House, NASA, and Bank of America partnerships; Bill Gates
social web testimonials mentioning positive experience with KA’s educational content - STEM subjects above all (unsurprisingly) – including “thanks” posts regarding necessity of studying a particular topic (e.g. last-minute homework and/or exam preparation)
shaping education since 2006
brand essence
I need to learn to move on
positioning: Coursera
USP & competitive advantage
University-like (but made online by default) MOOC courses under the patronage of various world-class Universities, taught by their Professors & possibly concluded with obtaining a signature track certificate
keywords
coursera, free, courses, learn, online, education, massive, future, provider, online, mooc, new, universities, english, machine, learning, class, start, sign, take, first, more, earn, programming, data, good, great, mobile, …
wider associational universe
open educational resources, ck-12 foundation, alison, opencourseware, massive open online course, udacity, mit opencourseware, learnstreet, wikiversity, charles severance, massive online open research, creative live, techchange, udemy, ben benderson, edsby, iversity, academic earth, lynda.com, eliademy, edukart, daniel s.welt, open-source curriculum, edx, open learning, knewton, …
traffic sources
social media, major online news & tech publishers, Coursera’s own & third-party blogs, influential & active users/community (educators, learners, Universities, well-known celebrities & public figures, online media & other institutions), traditional Colleges and Universities, ...
facebook, google+, twitter, nytimes, wsj, huffingtonpost, npr news, techcrunch, wired, fastcompany, ted, mashable, wikipedia, hurriyet, times of india, anadolu ajansi, atlanta journal constitution, montreal gazette, calgary herald, cournytimes, huffingtonpost, ryanseacrest, bbc, entrepreneur, charlierose, kplu, mooclist, worldbank, turkcell, hurriyet, johns hopkins, reddit, youtube, linkedin, slashdot, wordpress, cnn, gamezone, tumblr, typepad, patch, cnet, msn, stanford, duke, dzone, feedsportal, upenn, usnews, abc, ala, allaboutjazz, bloomberg, bostonmagazine, cbsnews, cdc, cisco, clucerf, crooksandliars, edublogs, edweek, fooyoh, forbes, illinois, inc, ing, iowapublicradio, kqed, marketplace, mit, nbcnews, opb, openforum, payscale, publicradio, reghardware, rice, ripr, smithsonianmag, theguardian, typesafe, uw, washington, wbur, wfu, wfubmc, wlrn, worldbank, lifehacker, wpr, wrvo, yahoo, …
competition
edx, udacity, udemy, class-central, tareasplus, iversity, the teaching company, traditional Colleges and Universities, …
KPI
professional / academic qualification obtained at Coursera
customer persona
a user/sympathizer of Coursera is a (US/foreign) student, recent employee and/or entrepreneur, who wants to obtain/supplement her/his professional/academic qualification
a user/sympathizer of Coursera supports techn(olog)ical transformation of education; she or he is enthusiastic about the opportunity of studying world-class courses (possibly also obtaining a certificate) and/or enjoys exploring (from her/his perspective) new topics which she/he “did not have courage” to study on her/his own, did not know much about or did not know where to start (e.g. programming or data science)
a Coursera user is a part of communities around particular Courses; she/he possibly joins them because of social web testimonials of learners & educators; she/he generally responds to stories about improving one’s life/skills, success & overcoming challenges via education (and about it being made accessible & facilitated by Coursera)
target markets
Coursera seems to expand rather via larger (University) cities; due to general global interest in professional and/or academic qualification of students, employees & entrepreneurs, Coursera grows in a rather decentralized fashion
besides US, Canada, Brazil, Mexico & Latin America, we found increased interest in Coursera in India, Bangladesh, Brazil, Turkey, Arab countries, Russia, Egypt, Mexico, UK, Ghana, Singapore, Greece, Hong-Kong, Kenya, Nigeria, Pakistan, Trinidad and Tobago, Jamaica, Cambodia, France, Italy, Australia, China, Netherlands, Spain & Portugal
positioning: Khan Academy
target markets
in accordance with the primary/high school education system it exists within, Khan Academy is slightly more centered on English-speaking countries (US, Canada, UK, Australia); nevertheless, thanks to its volunteer translator community & development of localized (/translated) portals, it increases its reach via these “scattered but focused epicentres” which help its (rather centralized) growth
for example, communities were found in Brazil & Latin America, Egypt, Sweden, Japan, India, Bangladesh, Pakistan, Trinidad and Tobago, Jamaica, Cambodia, Ghana, Singapore, Hong-Kong, Kenya, Nigeria, Czech Republic, France, the Republic of Haiti, Spain & Portugal
traffic sources
social media, major online news & tech publishers, KA’s own & third-party blogs, education-focused publishers, KA websites in other languages, traditional K-12 education, …
youtube, facebook, google+, twitter, nytimes, techcrunch, wsj, fast company, forbes, cnetnews, washingtonpost, ted, mashable, education week, wikipedia, sina, es.khanacademy, pt.khanacademy, market watch, cbs, telegraph, el economist, cnnmoney, nasa, bill gates, tumblr, edsurge, paniit-bayarea, techcrunch, iturank, the hindu, linkedin, slashdot, patch, feedsportal, kqed, stackexchange, uwstout, wordpress, abcnews, ck12, cmu, cnn, gawker, go, hawaii, hpu, huffingtonpost, ljworld, metafilter, niu, rosettastone, sc, nbcnews, smithsonianmag, tulsalibrary, utexas, uvm, waldorf, wwhatsnew, wtol, yahoo, google, hbr, …
wider associational universe
open educational resources, ck-12 foundation, alison, opencourseware, massive open online course, udacity, mit opencourseware, learnstreet, technology integration, interactive learning, two circles, oer commons, free high school science texts, educational technology, open textbook, american friends of arts et métiers paris tech, curriki, virtual university, learnthat foundation, mitx, phet interactive simulations, teaching channel, computers in the classroom, e-learning, ineedapencil, saylor foundation, collectspace, lecture recording, open source learning, east bay children's book project, knewton, …
competition
mathisfun, purplemath, grockit, gradeslam, showme, virtualnerd, regentsprep, mathwarehouse, …
USP & competitive advantage
extensive ‘menu’ of micro lectures covering the basics of STEM education (& more), enabling flexible assembly of an individualized curriculum plan under a single platform
keywords
salman, khan, nasa, youtube, videos, college board, stem, bill gates, free, online, education, english, learn, itunes, great, good, know, more, sensation, bring, expand, opportunities, collaboration, new, launch, tutorial, …
KPI
achievement / qualification beyond Khan Academy
customer persona
a sympathizer of Khan Academy is part of the huge community around Salman Khan & his path towards education transformation
a user of Khan Academy is a primary/high school student, prepares herself/himself for a standardized school/admission examination – e.g. math or SAT, also economics, science, or medicine – and/or is an older (than high school age) female/male supplementing (filling gaps in) her/his education (mathematics above all)
a user/sympathizer of Khan Academy is inclined towards gamification of education – collecting points & badges, accepting challenges & solving brain teasers (possibly serving as a door-opening moment) and/or enjoys exploring new topics (from her/his perspective, e.g. computing)
swot: Coursera
STRENGTHS
•providing academic/professional education under the patronage of various world-class Universities
•highly (intrinsically) motivated participants of the educational process & supporting such motivation via academic/professional “improve your life” incentives & stories
•Google+ & Facebook: storytelling & inspiration; Twitter: influencers’ testimonials; LinkedIn: connection with the economically active (/professional) population, including students with professional experience
•mobile device users reach: particularly the Android App*
•stable growth of reach & fan base, entering new partnerships with Universities & other educational institutions
* iTunes version of Coursera’s app is also available
WEAKNESSES
•YouTube: missing educational content & unclear content strategy
•Facebook (& possibly other social media profiles): diminished ability to drive engagement* – could be improved, for example, by sharing exercises and/or quiz questions from its courses
•volunteer translator community in its infancy
* not just driving traffic to its website but also employing social media as an educational tool that helps spread content & strengthen learning communities (and/or patronage over such communities & engagement techniques around particular courses)
OPPORTUNITIES
•(US/foreign) students & younger professionals (global reach, Internet population), who don’t have the opportunity to study particular courses in their home country, at their home University (or not included in their study pathways/programs); or those who are not able to meet the financial/time requirements of higher education; or simply those who want to “give a try” to a particular area
•(closely related) influence on the global young (possibly ”in the near future”) economically active population
•self-organized study/course groups on social media, outside Coursera’s online learning environment
•influential social media users – both learners & educators – mentioning positive experience with Coursera (word-of-mouth, genuine)
•possibility of becoming a supplement to and/or a competitor of traditional higher education by providing full study pathways & its own (flexible) system of qualification, which might bind together the current niche-specialized courses
THREATS
•eventuality of weakening the “academic neutrality” (brand) label due to commercial partnerships (on the other hand, it can be utilized as communication of linking both realms)
•not reaching those who – in spite of needing qualification & seeking higher education – for instance, don’t know “the” foreign language (English, for the majority of the courses) or do not have the required (e.g. mathematical) background for a non-superficial understanding of more complex topics; thereby deepening inequalities by increasing the expertise of those “easily qualifiable” while not reaching the learners – possibly forming a much bigger market – who would need more entry background & skills
swot: Khan Academy
STRENGTHS
•traditional primary & high school education and supplementing it
•resources tailored for US education system
•strong and connected user & fans community
•growing volunteer community, translations
•YouTube: KA’s micro lectures among the most popular YouTube educational content; Facebook & Google+: brain teasers & challenges; Twitter: background information via employees & interns
•stable growth of reach & fan base; faster growth whenever a significant partnership is entered
•iTunes users reach, including Apple mobile devices (iPad above all)*
* Windows Store version of the Khan Academy’s app is also available
WEAKNESSES
•(compared to Coursera) fewer personal (word-of-mouth, not by online publishers) social media testimonials regarding positive experience with particular educational content; (general) social media mentions come from online publishers rather than from “influential” – as measured by potential reach – pupils/students (also determined by age) & teachers
•the question “why to learn something” is often answered by necessity – external motivation, e.g. school/exam – rather than by “pursuing one’s goals/interests” – intrinsic motivation, e.g. a concept searched for & explained within a specific practical application rather than in isolation
•inactive LinkedIn company page
OPPORTUNITIES
•older (than high school) population filling the gaps in their education
•an opportunity to influence & shape the US K-12 education system as well as systems in other (not only English-speaking) countries
•subtitle translations & volunteer communities
•an opportunity to easily communicate/deliver key topics to its whole user base & beyond (centralization, Salman Khan’s wide publicity & key influential partnerships)
•“breaking out of math” & becoming a platform providing a general introduction to all “traditional” subjects / disciplines, i.e. complex primary & high school education
THREATS
•a very tight relationship between the “Salman Khan” brand & “Khan Academy” brand (in case of an unfortunate event causing damage to his name)
•eventuality of losing the “revolutionary” part of the brand due to closer connections to the traditional education system (its adjustment & blending into it)
why
how[social media]
whatsocial media are vital for both
Coursera & Khan Academy, and so possibly also vital for spreading of
the “transforming education” principles & facilitating the
transformation
in total, on the social web,politics is more popular than
education*
* http://bit.ly/UQKfgo
(similarly to content marketing) creating highly engaging
educational post seems to require not only creative skills but also
some objective knowledge about features of such content tailored for a particular target group; therefore studying those features deserves
attention
we seem to have enough tools & necessary (key or supplementary) social web & social media data to find educational influencers, plan
viral spreading of educational content, linking formal to informal education, targeting individuals or
groups, creating educational recommender systems & more
the leaders, the visionaries & the “doers” in establishing the 21st
Century Education seem to be thefor-profit & non-profit sectors(How governments respond?)
institutions that are both labeled “revolutionary” in the realm of education offer different products & learning processes; 21st Century education therefore does not seem to be about developing one perfect, centralized, general education system – rather about protection, patronage & support of the market, if we consider government regulation and/or an education record (similar to a medical record) – but about a diversity of providers (self-directed learning, state, for-profit, non-profit, one-on-one etc.; free, trial, paid, creative commons etc.) of clearly positioned educational products, allowing individuals to assemble their own “education mix”, which is recognized, together with their collaborative & individual projects/accomplishments, as their qualification, where the way of obtaining a qualification is, by default, a choice
even though we have not explored the topic thoroughly in this research paper, the data suggest that Coursera’s & Khan Academy’s “revolutionary nature” resides mainly in technology that makes education accessible, brings some of the best educators in particular areas to a given target group, satisfies mass demand & provides data-driven education; but not as much in the usual transmission of the curriculum & assessment, which remains based on presentation-style lectures, is not collaborative, focuses on a single human being (/learner) & her/his individual outcomes, etc.
nevertheless, we can see how huge a difference both tools make for self-driven education
What’s the point of such reflection? Well, the best is yet to come! =)
education on the social web
•YouTube (& video in general) as an optimal platform for social web educational content spreading
•sharing stories about education improving lives*
•visual content usage to drive traffic to education**
•driving (above all) comments to educate efficiently
•storytelling
•closer connection of learners & educators (semi-formal, or even informal)
•“good practice examples” & peer pressure
•“covering” ourselves in educational content (e.g. via subscribing to it) & recommender systems as foundations for creation / adjustment of our educational mix
•(intelligently) entertaining content attracting attention to education & facilitating it
•social media/web influentials & their testimonials
•transformation, provision & diffusion of education because of, thanks to & via ICT
•educational content derived rather from what I look for / need & what my level of knowledge in a particular area is, than based on who I am according to demography (e.g. age & location)
•introductory-level content employing clarity & simplicity in order to cause a door-opening moment towards educational content perceived to be more complicated
•blog articles / social media posts sharing an institution’s background information
•connecting community
* arguably a common/general element of “education” as a brand
** well-known in the realm of social media marketing; we are able to confirm it & highlight it
We definitely haven’t covered all areas of social web research (how could we, when the topics of the “semantic web” and/or the “Internet of things” were not even mentioned). Even if we stay in our somewhat narrower universe of the (public or private, on request) data provided on the social web, we should at least point out plenty of other SNSs (& their APIs), discussion forums / comment sections, blog posts, GeoIP & GPS, mobile applications, click-through rate monitoring, cookies & sessions, web scraping, image recognition, microdata & other APIs around HTML5, and also digital forensics & ethical hacking (e.g. file metadata).
Leaving aside the (already mentioned) suitability of longitudinal research verifying the conclusions of this paper, testing the conclusions on a recent dataset, and/or taking account of the influence & reach of particular media, we might also want to employ other analyses; some suggestions are listed below.
•clustering (and/or MDS) of different educational websites based on their keywords’ tf-idf scores
•(if fans’ interest profiles are obtained) clustering of educational websites based on the pages their fanbase liked and/or deriving a customer segmentation
•conjoint analysis to help us assemble an education mix for different groups according to their segment/typology
•(similarly) discriminant analysis & ROC curves to estimate our chances of targeting a particular user group with particular educational content
•“how an educational website positions itself” vs “how it is positioned by the social web” comparison (e.g. estimating the “brand hijacking” share)
•detailed analysis of the motivations & goals of both online learners & educators
•a simple recommender algorithm based on social web/media data (see an awesome step-by-step online book/tutorial, A Programmer’s Guide to Data Mining)
•machine learning (e.g. the famous naive Bayes classifier) employed to improve the precision of sentiment analysis or other text mining analyses
•focus on an individual human being (/small group): how she/he is affected within her/his social network, monitoring changes & drawing conclusions about efficient educational content spreading & its features
•monitoring features of successful/viral social media educational content in general
•mining the social web for the core topics of transforming education & attitudes towards them
…
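The first suggestion above (comparing educational websites by the tf-idf of their keywords) could start from something like the sketch below. It is only an illustration under stated assumptions: the site names and keyword lists are hypothetical placeholders rather than data from this paper, and tf-idf here is the plain variant (term frequency times the logarithm of the inverse document frequency). The resulting pairwise similarities could then be passed to a clustering or MDS routine.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical keyword lists per website (placeholders, not data from this paper).
sites = {
    "site_a": ["math", "video", "exercise", "math", "algebra"],
    "site_b": ["math", "algebra", "video", "calculus"],
    "site_c": ["course", "university", "certificate", "video"],
}


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for words in docs.values() for term in set(words))
    vectors = {}
    for name, words in docs.items():
        tf = Counter(words)
        vectors[name] = {
            term: (count / len(words)) * log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = tfidf_vectors(sites)
for left in sites:
    for right in sites:
        if left < right:
            print(left, right, round(cosine(vectors[left], vectors[right]), 3))
```

A keyword shared by every site (here "video") gets a weight of zero, so it does not make sites look similar; only distinctive shared keywords do.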
other social web analyses
tools used & DIY resources for self-driven education on the next page
(pp. 131-132)
link to all datasets (zip): http://bit.ly/1pB9ZF4
diy
Unfortunately, it does not seem to be common practice to include tools & how-tos in research papers. We are still getting used to the fact that the “proprietary” things we possess are our skills (& time), used to customize information and so derive a product, rather than the “raw” information, which is supposed to be shared, arguably because we could not move forward if our ancestors & contemporaries had not done the same. This state of things might give us the “wrong” feeling that conducting research is feasible only for academic/professional “masterminds”. Since education is a process we naturally enjoy (although it needs a more convenient environment in its institutional form), I hope the pictorial list of (mainly free & open source) tools & literature, which I used and/or which you might like, will inspire you & get you started in the realm of data science and/or social web mining, whichever pathway in those extensive disciplines you want to follow (as we know, pics should attract you to click on them =)).
Do not forget to google other online/paperback resources to further expand your research toolkit & share* those you know in the comments below (since we already know which kind of engagement is crucial for education =)).
* Enthusiasts transform education & don't wait for the traditional education system to take the plunge. This research is (secretly, in the footnote, to avoid interfering with scientific objectivity =)) dedicated to all heroes who create & distribute freely available educational content and/or proprietary educational content suitable for self-driven education. The 20% of the “learning time” I've spent educating myself outside the traditional education system allowed me to learn 80% of what I know & can do.
[start here] books & online resources in the ‘Social Web: (Big) Data Mining’ syllabus
Python; awesome Python frameworks, libraries & software; IPython Notebook
NodeXL (& Microsoft Office), Gephi, Pajek, Wandora
NetLogo, R, MYSTAT, Knime, RapidMiner, GATE, Weka, KEEL
import.io, IFTTT, Zapier, plotly, HighCharts, D3, Processing, tableau public
OpenRefine, Twitter Archiving Google Spreadsheet TAGS (& Google Drive)
A Programmer’s Guide to Data Mining
Linux Ubuntu, Libre Office, Notepad++, Sublime Text, NetBeans, Eclipse, VirtualBox, MySQL, mongoDB, Apache Hue, Raspberry Pi, Arduino
…