
Multimedia Tools and Applications manuscript No. (will be inserted by the editor)

Socialising Around Media
Improving The Second Screen Experience through Semantic Analysis, Context Awareness and Dynamic Communities

David Tomás · Yoan Gutiérrez · Atta Badii · Marco Tiemann · Fotis Aisopos

Received: date / Accepted: date

Abstract SAM is a social media platform that enhances the experience of watching video content in a conventional living room setting with a service that lets the viewer use a second screen (such as a smart phone) to interact with content, context and communities related to the main video content. This article describes three key functionalities used in the SAM platform in order to create an advanced interactive and social second screen experience for users: semantic analysis, context awareness and dynamic communities. Both dataset-based and end user evaluations of system functionalities are reported in order to determine the effectiveness and efficiency of the components directly involved and the platform as a whole.

Keywords Social TV · Second Screen · Semantic Analysis · Entity Linking · Sentiment Analysis · Context Awareness · Community Detection · Dynamic Communities

D. Tomás & Yoan Gutiérrez
University of Alicante, Department of Software and Computing Systems, Carretera San Vicente del Raspeig s/n, 03690, Alicante, Spain
E-mail: [email protected]

A. Badii & M. Tiemann
University of Reading, Department of Computer Science, Reading, RG6 6AH, UK
E-mail: [email protected]

F. Aisopos
National Technical University of Athens, Distributed Knowledge and Media Systems Group, Zografou Campus, Athens, 15773, Greece
E-mail: [email protected]

1 Introduction

The introduction of consumer-centric Internet devices, in particular of smart phone devices, has changed the way users interact with media: from having previously been largely passive and unidirectional, the relationship has now become proactive and interactive. Users can select what to watch when, and they can comment on or rate a TV show and search for related information regarding characters, facts or personalities. The latter type of behaviour is served through second screen use: the use of a computing device (commonly a mobile device, such as a tablet or smartphone) to provide an enhanced viewing experience for content on another device (such as a television). This experience involves providing interactive features during broadcast content, including media postings on social networks and external content discovery related to the assets consumed by the user.

In today’s second screen environment, there are no commonly accepted standards, protocols or frameworks through which users can discover and access information related to consumed contents. Into this scenario emerges SAM¹ (Socialising Around Media), an EU-co-funded research project focused on developing an advanced digital media delivery platform for second screen and content syndication in the domain of social TV. SAM provides open and standardised means of characterising, discovering and syndicating digital media assets (e.g. films, songs, and books).

The potential customers of SAM are both business stakeholders (such as media broadcasters, content asset providers, software companies and digital marketing agencies) and private users. For the former, the platform provides a number of benefits, including dynamic social and media content syndication, the ability to manage online reputation, to better understand customers, to track real-time statistics and to monitor media-related social content through second screen. For the latter, SAM offers a complete solution for interactively consuming media and TV programs. The platform integrates context-aware information and complex social functionalities that provide contextual information relevant to their current interests. These features provide the end users with an augmented experience in which they can discover new information about assets and share their experience with other users that are also interested in the same topics.

This paper focuses on describing three key functionalities of the SAM platform: semantic analysis, context awareness and dynamic communities. These back-end functionalities are responsible for providing an enhanced experience to the end user when interacting with their second screen.

Semantic analysis involves a set of features to import, analyse, enrich and exploit content based on Natural Language Processing (NLP) technologies (Cambria and White, 2014). These features include sentiment analysis and entity linking: the former provides information about the sentiments expressed in the comments that users send via the platform, which allows similar users to be clustered better, based not just on their demographics and consumption habits but also on their opinions; the latter provides an ontology-based data integration mechanism to enrich assets before they are sent to the users by creating links to other assets in the platform and to external data sources (such as Wikipedia).

1 http://www.socialisingaroundmedia.com.


The context awareness functionality manages contextual information and aggregates all user interactions within the system, providing users with personalisation and smart recommendations. Finally, the dynamic communities functionality analyses user data and communication, generates candidate user communities and manages user membership in these communities dynamically.

This study includes the results of the intrinsic evaluation carried out for each of the three functionalities mentioned above, together with an extrinsic evaluation performed on the whole platform, involving around sixty participants who were asked to test it and provide their feedback through a set of questionnaires about different aspects of their experience interacting within SAM.

The remainder of this article is organised as follows: the next section provides information about the architecture of the SAM platform; Section 3 highlights the role in the enhancement process of the three key functionalities described in this paper; Section 4 summarises the intrinsic and extrinsic evaluation carried out on the main functionalities and the whole platform; Section 5 presents related work in the social TV and second screen ecosystem; finally, conclusions and future work are presented in Section 6.

2 The SAM Platform

The SAM platform is the main technological outcome of the SAM project. The platform offers a complete end-to-end system both for business users who create, manage and evaluate second screen experiences and for end users who consume and interact with their second screen through SAM. The platform has been developed as a modular system whose individual modules provide the necessary functionalities or enable them through shared services.

Figure 1 depicts the overall modular architecture of SAM. The components of the platform are distributed in four different layers. Data Management components are responsible for importing and storing all the assets and metadata in SAM. The Control layer includes, among others, the core functionalities for enhancing the second screen experience described in this paper: Semantic Services (provides sentiment analysis and entity linking), Context Manager (responsible for the context awareness functionality) and Community Manager (controlling dynamic communities). The Communication layer includes the Interconnection Bus that coordinates and facilitates the communication between the different SAM components, providing built-in facilities such as message routing, format transformation, message queuing, security, and access control. Finally, Interaction gathers all the front-end components for business and end users to communicate with the platform, such as the Dashboard that displays video elements on the first screen (usually a smart TV) and additional information on the second screen (a tablet, smartphone, etc.).

The remainder of this section summarises the services provided by the platform for business users and end users of SAM.


Fig. 1: The SAM Platform architecture

2.1 Business Use Cases: Experience Creation and Management

The goal of the SAM platform for business users and their use cases is to improve the efficiency, quality and integration of the second screen content creation process into workflows and content provider ecosystems, so that second screen experiences can be provided for many programs instead of just for prime time entertainment content.

Second screen experiences can be created and managed through a dedicated Marketplace component offering a user interface for business users. Here, content managers can import and access media items (referred to as “media assets”) in order to compose a second screen experience. The system developed during the course of the project supports a number of different content types and formats such as Wikipedia entries consisting mostly of textual content, and video clips that complement first screen video content.

Using a Linker component, content editors can search for, receive suggestions for (using the entity linking functionality described in Section 3.1.1) and then compose available assets using a timeline-based view of the first screen media asset for which a second screen experience is to be created. In addition to adding media assets to a second screen experience, editors can also configure and control content access restrictions through the Brand and Consumer Protection component (e.g. to protect minors) and configure the integration of social media channels into the second screen experience for enabling SAM dynamic community functionalities (described in Section 3.3) or integrating external social media services such as Facebook or Twitter. Once an editor has completed a second screen experience, it can be reviewed and published through the SAM platform so that it is ready for use by end users.

After a second screen experience has been published and used by end users, SAM gives business users access to data analytics tools (Analytics component) that let them analyse and visualise user activities and user interactions, including analyses of their activity, the contents of their interactions on social media and the sentiments expressed (through the sentiment analysis functionality described in Section 3.1.2).

2.2 End User Use Cases: Consumption and Interaction

For end users, the aim of the SAM platform is to provide them with an easy to use, simple and seamless user experience for consuming and interacting with second screen content while audiovisual content is playing on a first screen device.

Second screen experiences created for end users using the SAM platform infrastructure are launched and synchronised automatically with the first screen through paired SAM applications that run both on a television set that serves as first screen and on a hand-held device that displays the second screen. Both the content synchronisation and the delivery of second screen content are handled by dedicated content selection and delivery components that are part of the SAM platform (the Syndicator and Dashboard components in Figure 1).

While a first screen video is playing on a SAM-enabled TV, connected SAM applications display related content assets that have been added to the second screen experience for this video by content editors, leveraging the suggestions provided by the entity linking functionality as described in the previous section.

Users can interact with the content in different ways (e.g. play video clips or expand teaser text for longer text assets). This constitutes the first set of functionalities offered to end users, which focuses on augmenting first screen content with additional second screen content.

Users of the SAM second screen application can furthermore interact with each other through social media messages while a first screen video is being played. The SAM application includes both external social media services (Twitter and Facebook) and a social media service developed for the SAM platform, managed by the Context Control component. User interactions through the SAM social media service resemble interactions over Twitter. Unlike Twitter, the SAM social media service is based on the concept of groups to which messages are posted: only members of a group receive a message sent to that group. Users of SAM can be automatically invited to groups using user profiles that are maintained for each user of the platform. Figure 2 illustrates the user interface for SAM dynamic community interactions. In this example, a user is invited to join a Tim Burton film community. The way these communities are dynamically created is further described in Section 3.3.


Fig. 2: The SAM Application Community User Interface

The remainder of this article focuses on the methods used in order to enhance the second screen experience for end users, with particular focus on text content analysis and representation for the purposes of recommending content assets and potentially interesting social communities to end users.

3 Improving the Second Screen Experience

This section describes the three functionalities of SAM that are crucial for providing advanced features in the interaction of the end users with the platform through their second screens. Other functionalities provided by the platform, such as the asset marketplace, content syndication and responsive user interfaces, are out of the scope of this paper.

The first functionality is semantic analysis, which is carried out by the Semantic Services component. This component provides different NLP services, such as sentiment analysis and text summarisation, and an editor interface (the Asset Profiler) to enrich the assets stored in the platform through ontology-based exploitation and entity linking technologies. As mentioned in previous sections, this paper focuses on describing the entity linking and sentiment analysis capabilities. The second functionality is context awareness, a task that takes place in the Context Manager. This component enables the context awareness of the platform by analysing contextual information and personalising the SAM second screen experience. The third and final one is dynamic communities, provided by the Community Manager component, which is concerned with identifying potential communities of end users, and with proposing and managing community membership and messaging in communities that have been created. Both Context Manager and Community Manager are part of the Context Control (as shown in Figure 1), which also creates, maintains and provides an API to access the model that describes the context in which the end user of SAM interacts with the system.

The following sections provide information on the architecture, subcomponents and scientific background underlying these functionalities, highlighting their role in achieving the enhancement of the end user experience while interacting with their second screens in SAM.

3.1 Semantic Analysis

The Semantic Services component of SAM is in charge of providing semantic analysis functionalities to the platform. These functionalities include sentiment analysis, entity linking, ontology mapping and exploitation, text summarisation, and asset editing. These features are supported by an ontology representing digital media assets and their relationships. Assets in the platform are stored as instances of this ontology (SAM ontology), created in order to support semantic representation and querying of the data.² The SAM ontology reuses concepts from the Europeana Data Model³ (EDM) and Schema.org among others, defining new concepts and relationships when no suitable elements were available in the previously mentioned schemas.

Since the sources of media information to be distributed through SAM are heterogeneous, and data imported into the platform follow different formats and schemas, SAM includes a mechanism for ontology mapping that for each concept in the input schema provides a list of suggestions of related concepts in the SAM ontology. In this way, content providers can manually select the best match between their data and the SAM ontology and run a batch importing process.

In order to facilitate the interaction with other SAM components, all the functionalities provided by the Semantic Services are exposed as RESTful interfaces. The following paragraphs highlight the parts of this component employed in the task of enhancing the end user experience while interacting with their second screen: entity linking and sentiment analysis.

3.1.1 Entity Linking

Entity linking is the task of matching a textual entity mention to an entry in a knowledge base, such as a Wikipedia page, that is the canonical entry for that entity (Rao et al, 2013). For instance, given a mention in a text to “Al Pacino”, the goal of this task is to determine that it refers to the entity described in this specific entry in Wikipedia: http://es.wikipedia.org/wiki/Al_Pacino. This task is more challenging than traditional named entity recognition (NER), where the goal is to determine the occurrences of names in text and their classification. In the previous example, a NER system would determine that “Al Pacino” is a person or that “Los Angeles” is a location (Nadeau and Sekine, 2007). Entity linking requires a NER system, but this process must be complemented by a subsequent disambiguation phase where this person or location is linked to an unambiguous entity stored in a knowledge base.

2 The ontology definition and additional information, including a use case example, are available at https://github.com/perma-id/w3id.org/tree/master/media/dma.

3 http://labs.europeana.eu/api/linked-open-data-data-structure.

In the context of SAM, entity linking makes it possible to analyse text and identify entity mentions in two different knowledge bases: Wikipedia and the SAM knowledge base, which stores the assets imported and created inside the platform. In this way, the data existing in the platform can be analysed for entities, and these entities can be linked to Wikipedia pages and other assets in SAM (e.g. books, songs, films, actors), creating a linked data ecosystem.

Two different approaches to the entity linking task have been defined depending on the target knowledge base. Although the task at hand is the same, the tools and resources employed are different. On the one hand, the approach to entity linking on Wikipedia is based on OpenNLP,⁴ DBpedia⁵ and DBpedia Lookup.⁶ It takes into account the page structure (i.e. number of incoming links to a DBpedia entry) and the context of the target entity (surrounding words) to solve ambiguities. On the other hand, the approach to entity linking on the SAM knowledge base combines techniques from the information retrieval and word sense disambiguation research fields, using OpenNLP and Lucene⁷ as its core tools. OpenNLP is used to identify noun phrases, whereas Lucene is employed to index and retrieve instances of the SAM knowledge base related to a specific query (taking the form of a string of keywords). See Tomás et al (2015) for further information on both approaches.
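The disambiguation idea behind the Wikipedia approach (a popularity signal from incoming links combined with the words surrounding the mention) can be illustrated with a minimal sketch. The scoring formula, the mixing weight `alpha`, the `Candidate` structure and the link counts below are assumptions made for illustration only; they are not the method or the figures of Tomás et al (2015).

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical knowledge-base entry competing for one mention."""
    uri: str
    incoming_links: int  # popularity signal: links pointing to the entry
    description: str     # short text describing the entry


def context_overlap(context_words, description):
    """Fraction of context words that also appear in the entry description."""
    context = {w.lower() for w in context_words}
    described = {w.lower() for w in description.split()}
    if not context:
        return 0.0
    return len(context & described) / len(context)


def rank_candidates(candidates, context_words, alpha=0.5):
    """Order candidates by a weighted mix of popularity and context fit.

    alpha is an illustrative mixing weight, not a value from the paper.
    """
    max_links = max(c.incoming_links for c in candidates) or 1

    def score(c):
        popularity = math.log1p(c.incoming_links) / math.log1p(max_links)
        fit = context_overlap(context_words, c.description)
        return alpha * popularity + (1 - alpha) * fit

    return sorted(candidates, key=score, reverse=True)


# Invented figures for the mention "Casino Royale"; not real DBpedia statistics.
candidates = [
    Candidate("kb:novel", 900, "1953 spy novel by Ian Fleming"),
    Candidate("kb:film", 800, "2006 spy film starring Daniel Craig"),
]
best = rank_candidates(candidates, ["film", "Daniel", "Craig"])[0]
```

With these invented numbers the surrounding words tip the ranking towards the film entry, whereas an empty context falls back on popularity alone.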

Regarding the benefits for end users in their second screens, entity linking in SAM provides an augmented experience in which they can discover new information about an asset, creating richer experiences around the original contents. For instance, a user watching the film “Casino Royale” in the SAM platform, taking advantage of the entity linking functionality, would get additional information related to actors “Daniel Craig” and “Mads Mikkelsen” from Wikipedia, and also to other related media assets in the platform based on the linking to the SAM knowledge base, such as books created by “Ian Fleming”, the writer of the series of spy novels, or references to its original soundtrack.

3.1.2 Sentiment Analysis

Sentiment analysis, also known as opinion mining, is an area of NLP focused on identifying and extracting subjective information from human language (Pang and Lee, 2008). Sentiment analysis systems try to identify the attitude of the author of the analysed source of information with respect to some topic, entity or overall contextual polarity (positive, negative or neutral) of the text.

The use of sentiment analysis technologies can benefit companies and users in their decision-making processes, since it makes it possible to determine user preferences, opinions and feelings. These technologies can be applied at different levels, depending on the focus of the analysis: global polarity and aspect-based polarity. Global polarity allows inferring the sentiment polarity expressed in the whole text. Aspect-based polarity aims to classify the sentiment with respect to the specific aspects of an entity, since an opinion holder can give different views for different aspects of the same entity (e.g. “The film is not very good, but the main actor is superb, I’m happy to watch him acting again”).

4 https://opennlp.apache.org/.
5 http://wiki.dbpedia.org/.
6 https://github.com/dbpedia/lookup/.
7 https://lucene.apache.org/.

The approach followed in SAM starts with a normalisation of user comments, translating from informal to a more formal language. Then these texts are tokenized to extract their terms, which are combined to create lexical patterns of n-grams and skip-grams. Finally, these patterns are employed as features of a Support Vector Machines (SVM) algorithm, together with other parameters, as described in Fernández et al (2015).
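The lexical patterns mentioned above can be sketched in a few lines. The snippet below is a minimal illustration of n-gram and skip-gram extraction only, assuming whitespace tokenisation as a stand-in for the normalisation and tokenisation steps; the function names are hypothetical, and neither the SVM training nor the exact pattern settings of Fernández et al (2015) are reproduced.

```python
from itertools import combinations


def tokenize(text):
    """Lower-case and split on whitespace (a stand-in for real tokenisation)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Contiguous sequences of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n, k):
    """Ordered n-token subsequences that skip at most k tokens in total.

    Contiguous n-grams (zero skips) are included in the result.
    """
    result = []
    for i in range(len(tokens)):
        # The remaining n - 1 tokens come from the next n - 1 + k positions.
        window = tokens[i + 1:i + n + k]
        for rest in combinations(range(len(window)), n - 1):
            result.append((tokens[i],) + tuple(window[j] for j in rest))
    return result


tokens = tokenize("The film is not good")
bigrams = ngrams(tokens, 2)
skip_bigrams = skipgrams(tokens, 2, 1)
```

Skip-grams such as ("is", "good") capture pairs that a plain bigram model misses when a word like "not" intervenes, which is one reason they are commonly used as sentiment features.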

In SAM, the application of these techniques to user comments allows extracting valuable information about the consumption of products and services. The benefit provided to the end users by sentiment analysis in SAM is the discovery of user preferences, feelings and attitudes while using the platform, which serve as input to improve context awareness (Section 3.2) and dynamic communities (Section 3.3). Moreover, this information is used by the Analytics component to provide content creators with advanced reports for business intelligence purposes, although this application of sentiment analysis is out of the scope of this paper.

3.2 Context Awareness

Context Awareness is a crucial characteristic of SAM, capturing and managing end users’ contextual data for exploitation. To personalise user experience, the platform needs to collect and analyse all context-related information, as will be discussed in the following paragraphs. In what follows, we provide an extension and concrete validation of the work presented in Aisopos et al (2016).

3.2.1 Context Representation and Management

The Context Manager component deals with the aggregation and efficient storage of SAM users, as well as their contextual information and interaction history with the various first and second screen elements. In both first and second screen SAM applications, specific action listeners were integrated which capture and send user interactions with the various elements and buttons to the Context Manager for further analysis. The list of user interactions that were collected from the SAM first and second screen (Root Assets/videos and other Assets respectively) is given in Table 1.

Root Assets     2nd screen Assets
consume         like
comment         dislike
full screen     dismiss
                show more

Table 1: List of collected user interactions.

As noted in relevant literature (Jaiswal and Agrawal, 2013), NoSQL graph databases such as Neo4j⁸ can be applied efficiently to user-entity relationship models such as the one in SAM. Moreover, the flexibility of such databases (Batra and Tyagi, 2012) is another characteristic that makes Neo4j suitable for context representation. Thus, Neo4j was used in the Context Manager backend, where registered SAM users, as well as the existing videos and second screen assets, were represented as nodes. User interactions, on the other hand, were modelled as graph edges connecting those nodes. A visual example of the context graph illustrating various button-pressed actions collected in the Neo4j database can be seen in Figure 3. In this figure, various SAM end users can be observed (in blue bubbles) interacting (consuming, commenting, pressing the “Like” button, etc.) with SAM assets (in green bubbles).

Fig. 3: Neo4j graph - users/relationships plain example.

8 https://neo4j.com/.
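The node-and-edge model described above can be pictured with a small in-memory sketch. This is not the Neo4j schema used in SAM: the class, the node labels and the relation names `CONSUME` and `LIKE` are hypothetical, and the direction chosen for `IS_ROOT_OF` (from the video to its asset) is an assumption.

```python
from collections import defaultdict


class ContextGraph:
    """Tiny in-memory stand-in for the property graph described above."""

    def __init__(self):
        self.nodes = {}                 # node id -> label, e.g. "User"
        self.edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, label):
        self.nodes[node_id] = label

    def add_edge(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[source].append((relation, target))

    def interactions_of(self, user_id):
        """All (relation, target) pairs recorded for one user."""
        return list(self.edges.get(user_id, []))


graph = ContextGraph()
graph.add_node("user:1", "User")
graph.add_node("video:42", "RootAsset")
graph.add_node("asset:7", "Asset")
graph.add_edge("video:42", "IS_ROOT_OF", "asset:7")  # assumed direction
graph.add_edge("user:1", "CONSUME", "video:42")      # hypothetical relation name
graph.add_edge("user:1", "LIKE", "asset:7")          # hypothetical relation name
```

Storing each interaction as a typed edge keeps the history of a user one lookup away, which is the property that makes a graph store convenient for this kind of model.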

3.2.2 Context Analysis

The exploitation of the aforementioned context information and aggregated user interactions relies on the analysis of the relationship between every user and asset. To this end, the system provides a relevance score for each user with respect to SAM assets in order to personalise the user’s second screen experience by providing on-screen recommendations of assets of interest.

The calculation of an asset’s relevance for a specific user depends on all the interactions collected, as well as the user’s contextual connection with asset-related keywords, e.g. via a relation with the keyword’s parent node. For this purpose, asset keywords are also imported into the Neo4j graph as separate nodes, with SAM sentiment analysis providing user sentiment scores on each keyword by analysing their comment history (see Section 3.1.2). These scores are normalised to the [−1, 1] interval, with −1 representing the negative extreme and 1 the positive one.

Regarding the user interactions, each one is also weighted within [−1, 1] (the weighting scale chosen for the Context Manager) so that it can be included in the analysis. Interactions that explicitly declare relevance or irrelevance (“like” and “dislike” respectively) are weighted with the absolute scores 1 and −1. For the rest of the interactions (“non-explicit” interactions), the approach used dictates that the sum of their weights must not overshadow an “explicit” interaction (so their sum will, in the worst case, be less than 1 or greater than −1). The relevance weights applied to all user interactions are shown in Table 2.

Interactions | Root Asset | 2nd screen Asset | Keyword

Explicit weights w_e
  Comment: [−1, 1], [−1, 1]
  Like: 1, 1
  Dislike: −1, −1

Non-explicit weights w_ne
  Full screen: 1/3
  Consume: 1/3
  Dismiss: −1/3
  Show More: 1/3

Table 2: Relevance weights of the various user interactions with Root or Second screen Assets. Comments including Asset keywords are regarded as comments on each specific keyword as well.
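The constraint that non-explicit interactions must not overshadow an explicit one can be checked with a short calculation. The sketch below takes the non-explicit weights from Table 2 and groups them by asset type as listed in Table 1; it assumes, for illustration, that each interaction type is counted at most once per user-asset pair, and the helper name is hypothetical.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

# Non-explicit weights from Table 2, grouped by asset type as in Table 1.
NON_EXPLICIT = {
    "root_asset": {"full_screen": THIRD, "consume": THIRD},
    "second_screen_asset": {"dismiss": -THIRD, "show_more": THIRD},
}


def non_explicit_bounds(weights):
    """Most negative and most positive totals if each interaction occurs once."""
    positive = sum((w for w in weights.values() if w > 0), Fraction(0))
    negative = sum((w for w in weights.values() if w < 0), Fraction(0))
    return negative, positive


root_bounds = non_explicit_bounds(NON_EXPLICIT["root_asset"])
second_bounds = non_explicit_bounds(NON_EXPLICIT["second_screen_asset"])
```

Under that assumption the totals stay within 0 to 2/3 for root assets and within −1/3 to 1/3 for second screen assets, so a single “like” (1) or “dislike” (−1) always outweighs them.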

Given the aforementioned ratings, a user-asset relevance score can be considered as the sum of the “explicit” and “non-explicit” weights:

\[ \mathrm{REL}_1 = \sum W_e + \sum W_{ne} \tag{1} \]

However, apart from a direct relation to an asset, a user interaction with neighbouring assets is also an indication of relevance/irrelevance to it. For example, in Figure 4, case (a) shows a user disliking an asset of a video. Every such relationship over this video’s related assets implies potential irrelevance of the video itself to this user. Case (b) shows a user consuming in full screen mode and also commenting on a video (supposedly in a positive way). All these strong relevance weights to this video can also be considered as indications of relevance to the video’s related assets (those assets are connected to it in the graph via the IS_ROOT_OF relationship).

Fig. 4: Indirect relation to an asset. (a) User interacts with an asset linked to a video. (b) User interacts with the root video of an asset.

Once more, in the current analysis it was decided that the relevance value of those indirect relationships must not overshadow the direct or explicit interactions. Thus, the total relevance score resulting from those is set as the sum of their weights (now denoted $W'_e$ and $W'_{ne}$) divided by a distance factor (the number of hops in the graph between the user and the asset, following the path of those edges/relations):

\[ REL_2 = \frac{REL_{neighbour}}{distance} = \frac{\sum W'_e + \sum W'_{ne}}{2} \tag{2} \]

Therefore, the overall relevance score becomes:

\[ REL_{total} = REL_1 + REL_2 = \sum W_e + \sum W_{ne} + \frac{\sum W'_e + \sum W'_{ne}}{distance} \tag{3} \]
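As an illustration, Equations 1 to 3 can be sketched in a few lines of Python. This is a minimal sketch rather than the Context Manager implementation: the fixed weights are taken from Table 2, while the function names, the dictionary keys and the use of a single distance value for all indirect interactions are assumptions made for the example.

```python
from fractions import Fraction

# Fixed weights from Table 2; the dictionary keys are illustrative names.
FIXED_WEIGHTS = {
    "like": Fraction(1),
    "dislike": Fraction(-1),
    "full_screen": Fraction(1, 3),
    "consume": Fraction(1, 3),
    "show_more": Fraction(1, 3),
    "dismiss": Fraction(-1, 3),
}


def interaction_weight(kind, sentiment=None):
    """Weight of one interaction; comments carry a sentiment score in [-1, 1]."""
    if kind == "comment":
        if sentiment is None or not -1 <= sentiment <= 1:
            raise ValueError("a comment needs a sentiment score in [-1, 1]")
        return sentiment
    return FIXED_WEIGHTS[kind]


def total_relevance(direct, indirect, distance):
    """REL_total = REL_1 + REL_2 (Equations 1 to 3).

    direct:   weights of interactions with the asset itself
    indirect: weights of interactions with neighbouring assets
    distance: number of hops between the user and the asset (>= 1)
    """
    if distance < 1:
        raise ValueError("distance must be at least one hop")
    rel_1 = sum(direct)
    rel_2 = sum(indirect) / distance
    return rel_1 + rel_2


# A user liked an asset and watched it in full screen, but disliked one
# neighbouring asset and consumed another, both two hops away.
direct = [interaction_weight("like"), interaction_weight("full_screen")]
indirect = [interaction_weight("dislike"), interaction_weight("consume")]
score = total_relevance(direct, indirect, distance=2)
print(score)  # 4/3 + (-2/3) / 2
```

Dividing the indirect sum by the distance is what keeps neighbouring interactions from outweighing direct ones in this example.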

As mentioned above, user-asset relevance estimation is intended for personalising the user environment via recommendations. The resulting recommendations can personalise the SAM environment on the two screens:

– Prioritise the video carousel items presented to users on the first screen based on their scores and highlight the most relevant recommendations.

– Suggest content on the second screen by showing at the top of the Dashboard only the widgets with the most relevant assets for the user.

3.3 Dynamic Communities

The Community Manager component of SAM manages the social media functionalities provided by the SAM platform, including the exchange of messages and the identification and recommendation of communities to end users. Hence it both operates the technical infrastructure that is required in order to transmit messages between users of a community and handles mechanisms for managing user membership in such groups. As a core feature it implements algorithms that identify which communities might be interesting to an end user given their user profile data and invites the user to join a relevant community.


3.3.1 Technical Implementation

The dynamic communities backend functionalities of the system, which handle users and messages throughout the dynamic communities that are created, have been implemented using a MySQL database system, a RESTful Web Services API implemented in Java and a draft version of the W3C ActivityStreams social messaging format⁹.

In the current implementation of the SAM platform, a process for running algorithms in order to compute community invitations for a user is triggered by the Context Manager component when it has processed a profile update for a user. Such a profile update can for instance occur when a user has started playing back a new video, or when a user has made a comment on social media that has been analysed using sentiment analysis. For the technical implementation of the dynamic community suggestion algorithms, a modular infrastructure has been implemented so that different types of algorithm implementations can be swapped out with little effort. Two groups of algorithms were considered: deterministic approaches that can be configured by platform administrators, and clustering approaches for detecting communities given available data.
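One way to picture such a modular infrastructure is a small registry of interchangeable suggestion algorithms behind a common call signature. The sketch below is purely illustrative: the registry, the function names, the profile fields and the dictionary-based profile are hypothetical and do not reflect the actual Java implementation of the Community Manager.

```python
from typing import Callable, Dict, List

# Hypothetical signature: a suggestion algorithm maps a user profile
# (a plain dict here) to the names of communities the user is invited to.
SuggestionAlgorithm = Callable[[dict], List[str]]

_REGISTRY: Dict[str, SuggestionAlgorithm] = {}


def register(name: str):
    """Decorator that makes an algorithm available under a configurable name."""
    def decorator(func: SuggestionAlgorithm) -> SuggestionAlgorithm:
        _REGISTRY[name] = func
        return func
    return decorator


@register("deterministic")
def rule_based(profile: dict) -> List[str]:
    # Stand-in for an administrator-defined rule.
    if 15 < profile.get("age", 0) < 21:
        return ["Young Adults who like " + t for t in profile.get("consumed", [])]
    return []


@register("clustering")
def cluster_based(profile: dict) -> List[str]:
    # Stand-in for a lookup of communities found by a clustering run.
    return list(profile.get("detected_communities", []))


def on_profile_update(profile: dict, algorithm: str) -> List[str]:
    """Called after a profile update; dispatches to the configured algorithm."""
    return _REGISTRY[algorithm](profile)


profile = {"age": 18, "consumed": ["Film A"], "detected_communities": ["c1"]}
print(on_profile_update(profile, "deterministic"))
print(on_profile_update(profile, "clustering"))
```

Swapping the algorithm then only requires changing the configured name, which mirrors the two groups of approaches considered here.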

3.3.2 Deterministic Community Creation

The goal of deterministic community creation is to give administrators running SAM the opportunity to define when an end user should receive an invitation to a particular community. A rule-based approach has been selected in order to achieve this goal. Based on the Drools Rule Engine library¹⁰, an event-condition-action rule processing module allows the explicit specification of rules and the execution of community management actions given the outcome of rule processing activities. In addition to basic RETE rule processing (Forgy, 1982), Drools also supports data aggregation and temporal restrictions on data to consider (e.g. sliding windows over timestamped data).

Listing 1 depicts a typical rule that may be defined in order to target a specific user demographic (in this instance, young male adults who like a particular content asset).

    rule "Young Adult Males"
    when
        // Match a context notification about a male user consuming an asset
        m : ContextNotification( $consumedAsset : consumedAsset,
                                 person.getGender() == "male", $user : person )
        // Restrict the rule to users aged 16 to 20
        eval( m.getPerson().getAge() < 21 )
        eval( m.getPerson().getAge() > 15 )
        // Skip users who already belong to the target community
        eval( !$user.isMemberOf("Young Adults who like "
                                + $consumedAsset.getTitle()) )
        eval( m.getCreatedGroupName() == null )
    then
        // Request creation of the community named after the consumed asset
        m.setCreateGroup( true );
        m.setCreatedGroupName("Young Adults who like "
                              + $consumedAsset.getTitle());
    end

9 https://www.w3.org/TR/2016/CR-activitystreams-vocabulary-20160906/.
10 https://www.drools.org/.


Listing 1: Example rule “Young Adult Males”

3.3.3 Clustering for Community Detection

Using a clustering method in order to identify communities within groups of users does not require manual effort for dynamic community generation and may identify communities that would not have been explicitly defined if the explicit community definitions described in the previous section had been used. Especially since social networks and social media channels have established themselves as important communication channels for large user communities, the automatic detection of user communities in a large group of users has attracted research interest (Paliouras et al, 2015).

Many different clustering algorithms can be applied to the problem of community discovery in large user communities. For the SAM platform, a selection of hierarchical divisive clustering, k-means clustering and standard graph-based clustering techniques was initially examined. Graph-based clustering approaches were found to be the most flexible in implementation and were also considered both well suited to community discovery problems and an active field of research of interest within the project.

After initial investigations of the “classic” Girvan-Newman algorithm for community discovery (Girvan and Newman, 2002), the BigCLAM algorithm was selected as a simple, fast and scalable algorithm for graph-based community detection.

The BigCLAM algorithm, proposed by Yang and Leskovec (2013), has been designed for operating on big social or similar networks. The algorithm searches for the most likely affiliation factor matrix that maps a number of communities to an undirected and unlabeled network. The algorithm can detect overlapping and non-overlapping clusters of users. The number of communities to fit can be defined as a parameter, or it can be estimated from the network under consideration. Figure 5 shows a representation where circles represent communities, squares represent users and the edges between them represent node affiliation (e.g. membership of a community); edges are weighted to represent the degree of affiliation.

The underlying optimisation problem of finding an affiliation factor matrix is considered a variation of Non-negative Matrix Factorisation (NMF) for finding an approximation matrix. In the implementation used for SAM, taken from the Noesis framework for network data mining (Martínez et al, 2015), a block gradient ascent algorithm is used to solve this optimisation problem.
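To make the optimisation problem more concrete, the following sketch computes the BigCLAM log-likelihood as formulated by Yang and Leskovec (2013), where an edge between users u and v is modelled with probability 1 − exp(−F_u · F_v), and performs one projected gradient ascent step on a single row of the affiliation matrix F. It is a simplified, pure-Python illustration and not the Noesis implementation; the function names, the toy network and the step size are assumptions.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_likelihood(F, adjacency):
    """Sum over edges of log(1 - exp(-F_u.F_v)) minus sum over non-edges of F_u.F_v."""
    total = 0.0
    n = len(F)
    for u in range(n):
        for v in range(u + 1, n):
            d = dot(F[u], F[v])
            if v in adjacency[u]:
                total += math.log(max(1.0 - math.exp(-d), EPS))
            else:
                total -= d
    return total


def updated_row(F, adjacency, u, step):
    """One projected gradient ascent step on row F[u], all other rows held fixed."""
    k = len(F[u])
    column_sums = [sum(row[c] for row in F) for c in range(k)]
    neighbour_sums = [0.0] * k
    gradient = [0.0] * k
    for v in adjacency[u]:
        d = dot(F[u], F[v])
        scale = math.exp(-d) / max(1.0 - math.exp(-d), EPS)
        for c in range(k):
            gradient[c] += F[v][c] * scale
            neighbour_sums[c] += F[v][c]
    for c in range(k):
        # Non-neighbours: every row except u itself and u's neighbours.
        gradient[c] -= column_sums[c] - F[u][c] - neighbour_sums[c]
    # Projection keeps affiliation strengths non-negative.
    return [max(0.0, F[u][c] + step * gradient[c]) for c in range(k)]


# Toy network: users 0 and 1 are connected, user 2 is isolated; one community.
adjacency = {0: {1}, 1: {0}, 2: set()}
F = [[1.0], [1.0], [0.5]]
before = log_likelihood(F, adjacency)
F[2] = updated_row(F, adjacency, 2, step=0.05)
after = log_likelihood(F, adjacency)
print(before, after)
```

In the toy network the isolated user's affiliation strength is pushed towards zero, which increases the likelihood because that user shares no edge with the community members.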

Fig. 5: Illustration of affiliation graph structure (Yang and Leskovec, 2013).

In our investigations for the SAM platform, the focus was placed on the identification of concepts and user attitudes towards those concepts. In our model, these concepts are used in order to create user profiles consisting of bags of words with concept representations identified using sentiment analysis of social media messages (e.g. “loves-football”), which are used in order to create the basic user graph that is used by the BigCLAM algorithm (Leskovec et al, 2014). In the created graph, users are represented by feature vectors of concepts (such as “loves-football”) and weighted edges between user nodes are created by determining the similarity of user feature vectors (edges are omitted when no features are shared between feature vectors). This approach to user representation exclusively uses data extracted from social media contributions via sentiment analysis. It can be replaced or augmented by incorporating additional factors, in particular information on which first/second screen content has been viewed by users represented in the graph, if sufficient data for this is gathered; implementing this was outside the scope of the SAM project. Irrespective of how the described graph has been created, the BigCLAM algorithm is then applied in order to identify communities given the created graph representation.
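The graph construction described above can be sketched as follows. The similarity measure is not specified in this article, so Jaccard similarity over sets of concepts is used here purely as an assumption; the function names and the example profiles are likewise illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of concepts two users have in common (assumed similarity measure)."""
    return len(a & b) / len(a | b)


def build_user_graph(profiles):
    """Build weighted edges between users from their concept profiles.

    profiles: dict mapping a user id to a set of concepts such as "loves-football"
    returns:  dict mapping (user_a, user_b) to an edge weight in (0, 1]
    """
    edges = {}
    for u, v in combinations(sorted(profiles), 2):
        if not profiles[u] & profiles[v]:
            continue  # no shared features: the edge is omitted
        edges[(u, v)] = jaccard(profiles[u], profiles[v])
    return edges


profiles = {
    "ann": {"loves-football", "likes-jazz"},
    "bob": {"loves-football"},
    "cat": {"hates-football"},
}
graph = build_user_graph(profiles)
print(graph)
```

Note that “loves-football” and “hates-football” are distinct concepts, so users with opposing attitudes towards the same topic are not connected in this sketch.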

4 Evaluation

This section presents two types of evaluation of the system described in this article. First, it describes the data-driven evaluation of the individual functionalities and algorithms described in the article in order to provide information about their performance on standard datasets. Second, it reports on a large-scale end user evaluation carried out as part of the SAM project, in which the overall approach of the SAM platform and its appeal to end users have been evaluated.

4.1 Core Functionalities

This section reports the evaluation results for the three technical topics covered in Section 3 and provides references to relevant publications with additional details on the individual evaluations.


4.1.1 Semantic Analysis

The intrinsic evaluation carried out for this functionality involved the entity linking (in both SAM assets and Wikipedia pages) and sentiment analysis functionalities. Results are summarised in Table 3.

Experiment                   Precision   Recall

Entity Linking
    Wikipedia                0.87        0.89
    SAM knowledge base       0.90        0.89

Sentiment Analysis
    Global polarity          0.75        0.68
    Aspect-based polarity    0.61        0.55

Table 3: Performance of the semantic analysis functionalities.

In the case of entity linking, the goal of the experiments was to evaluate the performance of the system in identifying mentions of different named entities (person, fictional character, book, video game, organisation, album, and song) in plain text (a set of paragraphs) and linking them to their corresponding Wikipedia page or SAM asset.

In the absence of a suitable dataset in the digital media domain, a corpus was developed in order to evaluate this task. To this end, a list of the 500 top-rated films on IMDB¹¹ was retrieved. A crawler processed this list to retrieve the description of each film from Wikipedia (i.e., the set of paragraphs occurring before the table of contents). The resulting corpus contained more than 4,500 entities, with person as the most common type of entity and video game as the least common. The descriptions obtained for each film contain on average 27.21 words and 2.22 entities.

In the case of entity linking on Wikipedia, the approach developed for this task included the disambiguation process described in Tomás et al (2015). For each possible entity, an average of 1.91 candidates were found. As shown in Table 3, the system obtained 87% precision and 89% recall on the corpus described above.

For entity linking on the SAM knowledge base, the approach followed to disambiguate and link entities is reported in Tomás et al (2015). Since the SAM knowledge base is not yet extensively populated at the current stage, the Wikipedia corpus described above was also employed in this experiment. To this end, every film description was imported into the platform as an individual asset, populating the SAM knowledge base.

Different confidence thresholds were applied in the experiments to determine the existence of an entity in the film descriptions of the gathered corpus. The system obtained a precision of 90% and a recall of 89%.

11 http://www.imdb.com/.


The sentiment analysis system was evaluated for both global polarity and aspect-based polarity detection. The global polarity algorithm was trained with the Semeval Task 2 dataset¹² (10,709 user comments) and the Metacritic dataset¹³ (5,888 user reviews regarding movies). In the case of aspect-based polarity, the training dataset was Semeval 2014 - Task 4 - Restaurants¹⁴ (3,009 user comments).

As shown in Table 3, the performance achieved by global polarity was 75% precision and 68% recall. The performance achieved by aspect-based polarity was 61% precision and 55% recall. A detailed explanation of the algorithms employed can be found in Gutiérrez et al (2015) and Fernández et al (2015).

In order to test the system in the SAM context, an evaluation campaign was carried out with real users interacting with the SAM platform (see Section 4.2 for further details on the experimental setup). This experiment yielded 143 interactions (messages sent to the platform) by the users of SAM while consuming different assets. This dataset was manually annotated by three reviewers. Each message was classified as positive, negative or neutral. The inter-annotator agreement was computed using the Fleiss’ Kappa metric (Fleiss, 1971). The overall Kappa for the three annotators was 0.79, which is considered substantial according to the scale proposed by Landis and Koch (1977). The results obtained for global and aspect-based polarity in this experiment were 75% precision, recall and F1. In this case, all the user comments included only one aspect per sentence.
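For readers unfamiliar with the agreement measure, Fleiss' Kappa can be computed as in the sketch below. The function name and the four example messages are invented for illustration; they are not taken from the 143 annotated interactions.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for items that were each labelled by the same number of raters.

    ratings: list of per-item label lists, e.g. [["pos", "pos", "neg"], ...]
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = set().union(*counts)

    # Share of all assignments that went to each category.
    p_cat = {c: sum(cnt[c] for cnt in counts) / (n_items * n_raters) for c in categories}
    # Observed agreement per item, averaged over items.
    p_items = [
        (sum(v * v for v in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(p_items) / n_items
    p_expected = sum(p * p for p in p_cat.values())
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Three annotators, four hypothetical messages; they disagree on the last one.
example = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["neu", "neu", "neu"],
    ["pos", "pos", "neg"],
]
kappa = fleiss_kappa(example)
print(round(kappa, 4))
```

The measure corrects the observed agreement for the agreement expected by chance given how often each label was used overall.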

4.1.2 Context Awareness

The purpose of this evaluation was to validate that users get accurate recommendations from the Context Manager component, as well as to confirm the overall service efficiency.

In the final version of the SAM platform a recommendation tab was added to the second screen in order to suggest content of interest to users. To obtain a meaningful dataset of user interactions, we imported into the SAM graph a large dataset available online (Zhang, 2005), which recorded user interactions with articles. This includes the number of clicks, scrolls and likes and the time spent on each article (and preliminary dismisses), as well as explicit relevance ratings as a ground truth.

Using the above values, the Weka toolkit¹⁵ was employed to train various machine learning algorithms and compare their effectiveness with the SAM relevance predictions (which do not require training). The data was split into training and test sets with a ratio of 70% to 30%, respectively.
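The evaluation protocol, a 70/30 split followed by a comparison of prediction errors, can be illustrated with the following sketch. It does not use Weka; the function names, the random seed and the example values are assumptions, and the two error measures are the ones reported in Table 4.

```python
import math
import random


def split_train_test(rows, train_ratio=0.7, seed=42):
    """Shuffle the rows reproducibly and split them into training and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


train, test = split_train_test(range(10))

# Hypothetical ground-truth relevance ratings and predictions in [-1, 1].
actual = [1.0, 0.0, -1.0]
predicted = [0.5, 0.0, 0.0]
print(mean_absolute_error(actual, predicted))
print(root_mean_square_error(actual, predicted))
```

Since the SAM relevance predictions require no training, only the test portion of the split would be needed to score them, whereas the baseline classifiers are fitted on the training portion first.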

Aggregated results can be found in Table 4. As can be seen in this table, the SAM Algorithm approaches the optimal classification accuracy; however, as mentioned, in contrast to the baseline batch classifiers it requires no training.

12 https://www.cs.york.ac.uk/semeval-2013/task2/index.php%3Fid=data.html.
13 http://www.metacritic.com/about-metacritic.
14 http://alt.qcri.org/semeval2014/task4/index.php?id=data-and-tools.
15 http://www.cs.waikato.ac.nz/ml/weka/.


Fig. 6: Comparing the efficiency of SAM Recommendation Web Service with kNN and Pearson Correlation.

Also, due to the graph analysis described above, the SAM Algorithm localises its analysis to the neighbourhood of the user and asset examined each time, implying low resource consumption and optimising the recommendation efficiency.

Classifier                    Mean Absolute Error   Root Mean Square Error

MLP                           0.3311                0.4456
Linear Regression             0.2423                0.3398
Random Forest                 0.241                 0.3588
Neural Network 100 neurons    0.2663                0.3472
epsilon-SVR                   0.2285                0.3339
nu-SVR                        0.2441                0.3373
m5rules                       0.2241                0.3261
REPTree                       0.2273                0.3355
DecisionTable                 0.2198                0.3246
SAM Algorithm                 0.2307                0.3886

Table 4: Mean absolute error and root mean square error of the baseline classifiers and the SAM Algorithm.

In order to examine the latter, we used JMeter¹⁶ to generate 500 HTTP calls, comparing the performance of SAM recommendation as a Web Service (WS) with two baseline Collaborative Filtering recommendation techniques: k-NN clustering (kNN) and Pearson Correlation (CF). The SAM Web Service was far more efficient than the two aforementioned approaches (Figure 6), with an average response time of 147 ms, confirming the performance expectations.

16 http://jmeter.apache.org/.


Dataset     |V|       |E|         BigCLAM   METIS    MLR-MCL   Graclus   Louvain

Twitter     81306     1342303     0.2761    0.1625   0.1146    0.2147    0.1086
Facebook    4039      88234       0.3505    0.2356   0.2701    0.3026    0.3868
Google+     107614    12238285    0.1799    0.1663   0.0100    0.1789    0.0549

Table 5: Graph clustering performance F1 scores.

4.1.3 Dynamic Communities

In order to validate whether the BigCLAM algorithm is a well-suited candidate for use in the SAM platform implementation, an evaluation of the graph clustering performance was carried out. To this end, the evaluation reported in Paliouras et al (2015) was repeated with a number of additional candidate algorithms in order to compare results with those reported by the aforementioned authors.

Three datasets generated using social media data by third-party researchers were used in order to carry out the evaluation. The data was generated by McAuley and Leskovec (2014) with data from the Twitter, Facebook and Google+ social networks.

For evaluation purposes, we followed the procedure reported in McAuley and Leskovec (2014) for three candidate algorithms: BigCLAM, METIS and MLR-MCL. METIS is a fast graph partitioning algorithm described in Karypis and Kumar (1998). MLR-MCL is a multi-level flow-based clustering algorithm covered in Satuluri and Parthasarathy (2009). We also report results for the Graclus and Louvain algorithms that were obtained in McAuley and Leskovec (2014) using identical methods and datasets for comparison purposes.

Table 5 provides the F1 scores of the different algorithms on the three datasets. It also reports the number of vertices (|V|) and edges (|E|) in the datasets. As can be seen, BigCLAM achieves the highest F1 score on the Twitter and Google+ datasets and is outperformed only by Louvain on the Facebook dataset.
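As a reminder of how an F1 score can be obtained for detected communities, the sketch below scores each community against its best-matching counterpart and averages in both directions. This symmetric best-match averaging is one common choice and is an assumption here; the exact matching procedure of McAuley and Leskovec (2014) may differ, and the function names and example communities are illustrative.

```python
def f1(detected, truth):
    """F1 score between one detected community and one ground-truth community."""
    overlap = len(detected & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(detected)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


def average_best_match_f1(detected, truth):
    """Average of best-match F1 scores, taken in both directions (assumed scheme)."""
    def one_way(source, target):
        return sum(max(f1(s, t) for t in target) for s in source) / len(source)
    return 0.5 * (one_way(detected, truth) + one_way(truth, detected))


detected = [{1, 2, 3}, {4, 5}]
truth = [{1, 2, 3}, {4, 5, 6}]
score = average_best_match_f1(detected, truth)
print(round(score, 3))
```

The first community is recovered exactly, while the second misses one member, which lowers its recall and therefore the overall score.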

4.2 Overall User Evaluation

An extensive end user evaluation has been carried out as part of the SAM project. The purpose of the evaluation was to determine the acceptance and the enjoyability of the end user experience that is provided through second screen experiences created with and delivered through the SAM platform.

4.2.1 Evaluation Setup

The end user evaluation was carried out within the premises of two schools affiliated with the SAM project. Within each school, pupils aged 13 to 17 participated in the user evaluations. The evaluation setup was created in a classroom environment.

A single television (first screen) was used for the evaluation, and participants were asked to install the SAM application on their mobile phones (or use a web-based alternative if their device was not compatible with the Android application created in the project).

Participants were then asked to complete four evaluation rounds. Each evaluation round followed an identical procedure:

1. Brief verbal introduction to the evaluation round
2. Written introduction to the evaluation round
3. Presentation of a short form video with the respective second screen content
4. Free interaction after completion of playback
5. Completion of a paper questionnaire

In each of the four evaluation rounds, different properties of the SAM application and the created second screen experiences were evaluated: (1) a basic augmented content version, (2) using basic SAM community features, (3) using advanced SAM community features and (4) all functionalities of the overall system. Each of the evaluation rounds took between 20 and 30 minutes to complete.

For each of the rounds, a questionnaire based on the Technology Acceptance Model (TAM) questionnaire was developed (Davis, 1986). In the questionnaire, participants were asked to evaluate a number of statements on a 5-point Likert scale. Furthermore, they were encouraged to provide written responses to open questions at the end of each of the evaluation questionnaires.

Two evaluations using this approach were carried out as part of the project, one for piloting the evaluation setup and for gathering formative feedback at midterm, and one for gathering summative evaluation feedback for the overall project at the end. In the remainder of this subsection, we focus on the results of the summative user evaluation.

4.2.2 Participants

A total of 90 participants completed trial questionnaires as part of the summative SAM end user evaluation. The vast majority of participants were in the 15-16-year age bracket (81.1%), 15.6% were in the 13-14-year age bracket and 3.3% were in the 17-18-year age bracket. Gender balance was quite even with 47.8% female and 48.9% male participants (3.3% of participants preferred not to answer this question).

4.2.3 Results and Discussion

Due to space constraints, only a small subset of individual responses can be presented in this article. Overall, participants in the end user evaluations responded positively to the overall concept of the SAM platform.


Fig. 7: Participant responses concerning content augmentation.

Fig. 8: Participant responses concerning dynamic communities.

Since augmenting first screen content with additional second screen content is an important goal of the SAM project, it is also important to determine whether the content is considered useful. Figure 7 shows a subset of responses concerning this topic. A majority of participants found that the provided SAM application makes it easy to find out more about topics in a video and that using the SAM application improves viewing a video on the TV.

The majority of participants similarly found using the SAM dynamic communities enjoyable (Figure 8). It should be noted here that a specially configured version of the dynamic community functionality was used in user trials; a more appropriate longitudinal evaluation was not carried out due to resource and copyright restrictions concerning providing large amounts of video content to participants over a long period of time.

Generally, participants also responded that they would like to use the SAM application in school environments (65.91% agree or strongly agree) as well as at home (70.46% agree or strongly agree) to complement classroom materials. Given that the SAM platform and related application are still prototype systems at the end of the project, this could be interpreted as pointing towards significant interest in such a system from the end user population that participated in these trials.

5 Related Work

This section presents existing work related to SAM as a social television and second screen platform, also considering the individual functionalities described in this paper: entity linking, sentiment analysis, context awareness and dynamic communities.

5.1 Social Television and Second Screen

The rise of digital technology and social media gave a boost to social television, a field where new opportunities for marketers arise, as multi-screen interactivity results in increased levels of viewers’ engagement correlated with increased commercial effectiveness (Pynta et al, 2014). Social interaction is apparent, especially in the case of collocated viewers using second screen applications designed for specific TV programs (Vanattenhoven and Geerts, 2017).

Courtois and D’heer (2012) presented a study on the experience of multi-screening, investigating how users incorporate multiple media (e.g. Facebook and Twitter) in their television viewing experience from a connected second screen (e.g. tablet). The authors analysed the behavior and responses of an extended group of participants, monitoring their engagement with various second screen applications.

Holmes et al (2012) specifically examined visual attention to television programs while interacting with synchronised second screen applications. The second screen garnered considerable visual attention during experimental viewing sessions, especially interactive content and the social media feed from Twitter.

Compared with previous initiatives in this area, the strength of SAM is the inclusion of the semantic analysis, context awareness, and dynamic communities technologies described above to improve the end user experience while interacting with their second screen device.

5.2 Entity Linking

Entity linking has been used in different domains to enrich content by linking it to existing knowledge bases such as Wikipedia, CYC¹⁷ and Freebase¹⁸. Odijk et al (2013) presented an approach that automatically generates links to background information (Wikipedia articles) in real-time, in order to be shown on the second screen. This paper reflected the work performed in the context of the LiMoSINe EU Research Project (LiM, 2013) concerning language-based technology search and employing semantic linking based on subtitles. Specifically, media-related subtitles were used as a textual stream to generate links, and the second screen context was efficiently modeled using a graph-based approach.

17 http://www.cyc.com/.
18 https://developers.google.com/freebase/.

The NoTube EU project (NoT, 2012) also worked towards TV experience personalisation by enhancing online social and semantic data. The project focused on the task of enriching TV metadata and creating a linked data cloud, as well as generating user models and profiles via their social web activities. This results in a personalised recommendation system that uses existing web services and shared background knowledge to collect, enrich and recommend TV data without an intrusive user profiling process (Aroyo et al, 2011).

Similarly to these projects, SAM exploits entity linking in order to enrich the assets’ content by automatically identifying which assets in the platform are interrelated, connecting them, and also providing external links to related Wikipedia pages, building a linked-data ecosystem.

5.3 Sentiment Analysis

Sentiment analysis has become one of the most active research areas in computer science. Many companies and research groups are developing sentiment analysis solutions that try to leverage the huge amount of subjective information available from social media in order to monitor the reputation of products and services.

Work in Fernández et al (2015) presented a sentiment analysis approach for the social context in a second screen scenario, addressing Task 1 (sentiment analysis at global level) of the TASS 2015 competition (Villena-Román et al, 2015). In this model, a sentiment lexicon was created using the individual words, n-grams and skip-grams of tweet datasets, with each term being statistically scored according to its appearance within each polarity. This lexicon was incorporated into a classifier built with a Support Vector Machine (SVM) algorithm.

Zhao et al (2011) attempted to extract TV watchers’ sentimental reactions to major events in live broadcast sports games in real-time. They followed a lexicon-based approach to streaming data from Twitter, using WordNet as a lexical database and analysing the evolution of sentiment polarities over time. The authors also introduced a social TV system that enables the audience to better select interesting programs in real-time and to produce personalised program summaries.

Unlike other projects in the context of second screen, in SAM sentiment analysis is employed in three different ways. First, it provides information about end users’ opinions on specific assets consumed in the platform for business intelligence purposes, providing reports to the content creators on the polarity and intensity of the comments posted while consuming their assets. Secondly, it is employed to build dynamic groups of users inside the platform based on their similar opinions. Finally, sentiments are also considered as part of the context analysis of the user for asset recommendation.

5.4 Context Awareness and Social TV Personalisation

Context awareness in a Social TV environment can be related to public user information (e.g. location and gender) or user interactions and social behavior (friends and comments). Hu et al (2014) presented a multi-screen Social TV architecture, using geo-location data and user social features to enrich the TV viewing experience. Later work by the same authors (Hu et al, 2015) provides a unified big data platform for social TV analytics, mining social responses associated with TV programs, obtained from the Sina Weibo¹⁹ social platform.

A major result of this analysis targets the personalization of mobile digital TV applications, by predicting and recommending to television viewers content that matches their interests (Chorianopoulos, 2008). Mitchell et al (2010) used social networks as a mechanism for providing social awareness to users of an IPTV system, resulting in a user-user recommendation and rating system.

Work in Cesar et al (2008) investigated the usages of the second screen in an interactive television environment, aiming at controlling, enriching, sharing, and transferring TV content. Geerts et al (2014) analysed a second screen companion application stimulating social interaction to offer more insight into how viewers are experiencing such applications. Finally, Giglietto and Selva (2014) performed a content analysis of a large dataset of second screen tweets throughout a TV season, clarifying the relationship between politics-related shows and social media content.

To the best of our knowledge, no previous work has taken into account both the first screen (video full screen, “likes”, comments) and second screen (show-more, dismiss) interactions in Social TVs to personalise the content delivered. Also, our Context Management approach includes an innovative graph analysis method that aims at providing a 2nd screen user profiling and recommendation service that is competitive in terms of efficiency.

5.5 Dynamic Social Communities

The creation of dynamic user communities through social media is a phenomenon that has become very popular, along with the emergence of social media themselves. Kaplan and Haenlein (2010) describe how several companies are already using social networking sites to support the creation of brand communities. Much work has focused on mining and detecting communities in social media (Papadopoulos et al, 2012; Greene et al, 2010; Tang and Liu, 2010), usually employing graph-based algorithms and placing emphasis on their performance for enabling scaling community detection to huge real-world social networks.

19 https://www.weibo.com/.

The approach taken in SAM is to use a generically applicable scalable community identification algorithm (BigCLAM), and to create the graph structure to which the algorithm is applied from data gathered through the SAM Platform itself: specifically users’ social media interactions over the platform as well as their interactions with the platform (e.g. selecting a specific video file or second screen element) at a later stage of development.
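As an illustration of the graph construction step only, the following minimal sketch turns a log of user interactions into a weighted user-user graph. The event format, the identifiers and the rule that two users are linked once for every item both of them interacted with are assumptions made for this example; the article does not specify how SAM derives its graph, and BigCLAM itself is not implemented here.

```python
from collections import defaultdict
from itertools import combinations


def build_user_graph(events):
    """Return weighted undirected edges {(u, v): weight} with u < v.

    Assumption for this sketch: two users are linked once for every
    item (video, second screen element, ...) both interacted with.
    """
    users_by_item = defaultdict(set)
    for user, item in events:
        users_by_item[item].add(user)

    edges = defaultdict(int)
    for users in users_by_item.values():
        # Sorting gives each undirected edge a single canonical key.
        for u, v in combinations(sorted(users), 2):
            edges[(u, v)] += 1
    return dict(edges)


# Hypothetical interaction log: (user, item interacted with).
events = [
    ("ana", "video:1"), ("ben", "video:1"), ("cat", "video:1"),
    ("ana", "widget:7"), ("ben", "widget:7"),
    ("dan", "video:2"), ("eve", "video:2"),
]
graph = build_user_graph(events)
```

An edge list of this kind is the sort of input a community detection algorithm such as BigCLAM would then consume; how the edge weights are used, if at all, depends on the algorithm chosen.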

6 Conclusion

This article has introduced the SAM platform as a system for easily and quickly creating second screen experiences around video content. As a practically oriented project, SAM has focused on real-world implementation and evaluation activities, which have been presented in this and related publications concerning the SAM platform. Although the platform was developed as a research prototype, its commercial use has been a major concern throughout the project, resulting in a practically usable and readily commercially exploitable system.

The article has described three key processes of the workflow involved in enabling better socialising around media through the use of semantic analysis, user and context modeling, and dynamic community discovery and representation mechanisms. This workflow illustrates how a real-world pipeline for implementing such functionality can be provided, and the user evaluations presented indicate user interest in such functionalities as part of multiscreen viewing experiences.

Future research in this area should focus on the creation of publicly available datasets that incorporate multiple types of data sources, so that such complex workflows can be investigated better in experimental environments. To further investigate the appeal and potential acceptance of systems such as the SAM platform, longitudinal studies and evaluations with more heterogeneous first and second screen video content could be carried out to identify the most promising areas for further development and research.

Acknowledgements This work has been partially funded by the European Commission under the Seventh (FP7 2007-2013) Framework Programme for Research and Technological Development through the SAM (FP7-611312) project, by the Spanish Government under project REDES (TIN2015-65136-C2-2-R) and by the University of Alicante under project GRE06-01.

References

(2012) Notube: bringing web and tv closer together. URL http://www.notube.tv/

(2013) Limosine project integrates the studies of leading researchers over diverse topics with a view to enable new kinds of language-based technology search. URL http://limosine-project.eu/

Aisopos F, Valsamis A, Psychas A, Menychtas A, Varvarigou T (2016) Efficient context management and personalized user recommendations in a smart social tv environment. In: GECON - Conference on the Economics of Grids, Clouds, Systems, and Services (GECON)

Aroyo L, Nixon L, Miller L (2011) Notube: the television experience enhanced by online social and semantic data. In: Consumer Electronics-Berlin (ICCE-Berlin), 2011 IEEE International Conference on, IEEE, pp 269–273

Batra S, Tyagi C (2012) Comparative analysis of relational and graph databases. International Journal of Soft Computing and Engineering (IJSCE) 2(2):509–512

Cambria E, White B (2014) Jumping nlp curves: A review of natural language processing research. Comp Intell Mag 9(2):48–57

Cesar P, Bulterman DC, Jansen A (2008) Usages of the secondary screen in an interactive television environment: Control, enrich, share, and transfer television content. In: European Conference on Interactive Television, Springer, pp 168–177

Chorianopoulos K (2008) Personalized and mobile digital tv applications. Multimedia Tools and Applications 36(1-2):1–10

Courtois C, D’heer E (2012) Second screen applications and tablet users: constellation, awareness, experience, and interest. In: Proceedings of the 10th European conference on Interactive tv and video, ACM, pp 153–156

Davis F (1986) A technology acceptance model for empirically testing new end-user information systems: Theory and results. Thesis, URL http://dspace.mit.edu/handle/1721.1/15192

Fernández J, Gutiérrez Y, Tomás D, Gómez JM, Martínez-Barco P (2015) Evaluating a sentiment analysis approach from a business point of view. In: TASS at SEPLN, pp 93–98

Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychological bulletin 76(5):378

Forgy CL (1982) Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial intelligence 19(1):17–37

Geerts D, Leenheer R, De Grooff D, Negenman J, Heijstraten S (2014) In front of and behind the second screen: viewer and producer perspectives on a companion app. In: Proceedings of the 2014 ACM international conference on Interactive experiences for TV and online video, ACM, pp 95–102

Giglietto F, Selva D (2014) Second screen and participation: A content analysis on a full season dataset of tweets. Journal of Communication 64(2):260–277

Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99(12):7821–7826

Greene D, Doyle D, Cunningham P (2010) Tracking the evolution of communities in dynamic social networks. In: Advances in social networks analysis and mining (ASONAM), 2010 international conference on, IEEE, pp 176–183

Gutiérrez Y, Tomás D, Fernández J (2015) Benefits of using ranking skip-gram techniques for opinion mining approaches. In: eChallenges e-2015 Conference, 2015, IEEE, pp 1–10

Holmes ME, Josephson S, Carney RE (2012) Visual attention to television programs with a second-screen application. In: Proceedings of the symposium on eye tracking research and applications, ACM, pp 397–400

Hu H, Wen Y, Luan H, Chua TS, Li X (2014) Toward multiscreen social tv with geolocation-aware social sense. IEEE MultiMedia 21(3):10–19

Hu H, Wen Y, Gao Y, Chua TS, Li X (2015) Toward an sdn-enabled big data platform for social tv analytics. IEEE Network 29(5):43–49

Jaiswal G, Agrawal AP (2013) Comparative analysis of relational and graph databases. IOSR Journal of Engineering (IOSRJEN)

Kaplan AM, Haenlein M (2010) Users of the world, unite! the challenges and opportunities of social media. Business horizons 53(1):59–68

Karypis G, Kumar V (1998) Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing 48(1):96–129

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics pp 159–174

Leskovec J, Rajaraman A, Ullman J (2014) Mining of Massive Datasets. 2nd Edition. Cambridge University Press

Martínez V, Berzal F, Cubero JC (2015) The noesis open source framework for network data mining. In: 2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), vol 01, pp 316–321

McAuley J, Leskovec J (2014) Discovering social circles in ego networks. ACM Transaction on Knowledge Discovery from Data 8(1)

Mitchell K, Jones A, Ishmael J, Race NJ (2010) Social tv: toward content navigation using social awareness. In: Proceedings of the 8th international interactive conference on Interactive TV&Video, ACM, pp 283–292

Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguisticae Investigationes 30(1):3–26

Odijk D, Meij E, De Rijke M (2013) Feeding the second screen: Semantic linking based on subtitles. In: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, Le Centre De Hautes Etudes Internationales d’Informatique Documentaire, pp 9–16

Paliouras G, Papadopoulos S, Vogiatzis D, Kompatsiaris Y (2015) User Community Discovery. Springer

Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1-2):1–135

Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P (2012) Community detection in social media. Data Mining and Knowledge Discovery 24(3):515–554

Pynta P, Seixas SA, Nield GE, Hier J, Millward E, Silberstein RB (2014) The power of social television: Can social media build viewer engagement? Journal of Advertising Research 54(1):71–80

Rao D, McNamee P, Dredze M (2013) Entity Linking: Finding Extracted Entities in a Knowledge Base, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 93–115

Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows: Applications to community discovery. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, KDD ’09, pp 737–746, DOI 10.1145/1557019.1557101, URL http://doi.acm.org/10.1145/1557019.1557101

Tang L, Liu H (2010) Community detection and mining in social media. Synthesis Lectures on Data Mining and Knowledge Discovery 2(1):1–137

Tomás D, Gutiérrez Y, Agulló F (2015) Entity linking in media content and user comments: Connecting data to wikipedia and other knowledge bases. In: eChallenges e-2015 Conference, pp 1–10

Vanattenhoven J, Geerts D (2017) Social experiences within the home using second screen tv applications. Multimedia Tools and Applications 76(4):5661–5689

Villena-Román J, García-Morera J, Cumbreras MÁG, Martínez-Cámara E, Martín-Valdivia MT, López LAU (2015) Overview of tass 2015. In: TASS at SEPLN, pp 13–21

Yang J, Leskovec J (2013) Overlapping community detection at scale: a nonnegative matrix factorization approach. In: ACM International Conference of Web Search and Data Mining

Zhang Y (2005) Bayesian graphical models for adaptive information filtering. Thesis, URL https://users.soe.ucsc.edu/~yiz/papers/data/YOWStudy

Zhao S, Zhong L, Wickramasuriya J, Vasudevan V (2011) Analyzing twitter for social tv: Sentiment extraction for sports. In: Proceedings of the 2nd International Workshop on Future of Television, vol 2, pp 11–18

