Date post: | 11-May-2015 |
Category: |
Technology |
Upload: | selver-softic |
@twitter Mining #Microblogs Using #SemanticTechnologies
Selver Softic, Martin Ebner, Herbert Mühlburger, Thomas Altmann, Behnam Taraghi
Web 2.0 - a well-known story
Web 2.0 technologies brought users closer to the Web …
– Wikis, Blogs, Forums …
– Podcasts, RSS, XML …
… then users started to generate content …
Source: http://mediabistro.com
From Web to Social Web
• Result = a vast amount of information
– Text, Pictures, Audio, Videos …
• Communication, networking, exchange of data
• The Web became more personal
• Cultural, geographical and social borders disappeared
Source: http://www.ignitesocialmedia.com
Social Media Boom!
Social sites are data silos
source: www.pidgintech.com
But still disconnected?
source: www.pidgintech.com
Data is still captured in a walled garden!
Statements
• The Social Web relies on users and the communication among them
• While communicating, users produce or consume content
• Social sites are data silos rich in a variety of information
• This information could be interesting for:
– monitoring of trends, advertising, statistics, reputation, news broadcasting, tagging …
• This data is captured in a walled garden!
Questions
• How can this data be used to gain more useful insights?
• What are the advantages of online (offline) search on such data, and how can it be reached in a uniform way?
• Is it possible to structure, connect and expose the data so that humans and machines can use it more efficiently?
• What would an architecture for this look like?
Social Web Trends
• Microblogging
• Social Bookmarking
• Social Networking
• Social Marketing
• Sharing Photos, Videos …
Source: http://socialwebresearch.com
Microblogs
• Microblogs
– Used for communication, publishing and information exchange
– Simple to process
– Information generated by many different users
– Social user relations
– Tripartite communication structure
– Variety of information
– No boundaries of culture, location or technology (mobile users)
• Twitter
– Most popular
– Large amount of data
– But limited
According to http://an.kaist.ac.kr/traces/WWW2010.html: 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets
Semantic aspects and Twitter
• Twitter
– User relations
– Tweets as short information artefacts
– Communication with tripartite pattern
– Time-related information
• Vocabularies
– SIOC, FOAF, Dublin Core
Linked Data and Twitter
• Twitter contains information on:
– People, Organisations, Locations, Trends …
• LOD Cloud contains
– Billions of triples about:
• Geolocations, data about science, government, common knowledge, persons, news …
• Vocabularies
– MOAT, CommonTag
Architecture model
Acquisition - Grabeeter
Grabeeter
• Search in your Tweets
• Filter your Tweets by date
• Search in your Tweets offline using the Grabeeter Client
• Filter your Tweets offline using the Grabeeter Client
• Grabeeter provides an API
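The search and date-filter features listed for Grabeeter can be illustrated with a minimal Python sketch. It works on an in-memory list and does not call the Grabeeter API or client; the `Tweet` class, the `filter_tweets` function and the second sample tweet are hypothetical names and data introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Tweet:
    # Hypothetical minimal record: only the fields needed for search and date filter.
    text: str
    created: date


def filter_tweets(tweets, query=None, start=None, end=None):
    """Return tweets whose text contains `query` (case-insensitive)
    and whose creation date lies within [start, end] (both inclusive)."""
    result = []
    for tweet in tweets:
        if query is not None and query.lower() not in tweet.text.lower():
            continue
        if start is not None and tweet.created < start:
            continue
        if end is not None and tweet.created > end:
            continue
        result.append(tweet)
    return result


# A tiny local archive; the second entry is invented sample data.
archive = [
    Tweet("Sitting in Prater #vienna, launch party. Nice", date(2010, 8, 19)),
    Tweet("Hypothetical second tweet about #graz", date(2010, 9, 2)),
]

hits = filter_tweets(archive, query="#vienna",
                     start=date(2010, 8, 1), end=date(2010, 8, 31))
```

Because the filter only needs a local list, the same function would apply to an offline copy of the tweets.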
Triplification Module
• Author
• Date
• Content
• Receiver
<tweet url="http://grabeeter.tugraz.at/tweet/199272" text="Sitting in Prater #vienna, launch party. Nice" screen_name="selvers" created="2010-08-19" twitterUrl="http://twitter.com/selvers/status/21606926237"/>
Triplifier
RDF Store
Triplification Module
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix sioct: <http://rdfs.org/sioc/types#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
<http://twitter.com/selvers/status/21606926237> rdf:type sioct:MicroblogPost ;
    sioc:content "Sitting in Prater #vienna, launch party. Nice" ;
    sioc:has_creator <http://twitter.com/selvers/> ;
    foaf:maker <http://grabeeter.tugraz.at/foaf/selvers/> ;
    dcterms:created "2010-08-19" ;
    owl:sameAs <http://grabeeter.tugraz.at/tweet/199272> .
<http://twitter.com/selvers/> rdf:type foaf:Person ;
    foaf:name "Selver Softic" ;
    foaf:depiction <http://a0.twimg.com/profile_images/905118560/f9e4b6eba.13070201_3_normal.jpg> ;
    foaf:knows <http://twitter.com/hmuehlburger/> ;
    foaf:knows <http://twitter.com/mhausenblas/> ;
    foaf:knows <http://twitter.com/mebner/> .
…
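The mapping from a Grabeeter XML record to triples for the post can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the actual Triplifier: the `triplify` and `escape_literal` functions are hypothetical, the creator URI is assumed to be built from `screen_name`, the link back to the Grabeeter URL is written with `owl:sameAs` (the OWL identity property), and the `@prefix` declarations are assumed to be emitted separately.

```python
import xml.etree.ElementTree as ET

# The sample record from the Triplification Module slide.
RECORD = ('<tweet url="http://grabeeter.tugraz.at/tweet/199272" '
          'text="Sitting in Prater #vienna, launch party. Nice" '
          'screen_name="selvers" created="2010-08-19" '
          'twitterUrl="http://twitter.com/selvers/status/21606926237"/>')


def escape_literal(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def triplify(xml_record):
    """Turn one <tweet .../> element into a Turtle description of the post.

    Assumption: the creator URI follows the pattern http://twitter.com/<screen_name>/.
    """
    tweet = ET.fromstring(xml_record)
    post = tweet.get("twitterUrl")
    creator = "http://twitter.com/%s/" % tweet.get("screen_name")
    lines = [
        "<%s> rdf:type sioct:MicroblogPost ;" % post,
        '    sioc:content "%s" ;' % escape_literal(tweet.get("text")),
        "    sioc:has_creator <%s> ;" % creator,
        '    dcterms:created "%s" ;' % tweet.get("created"),
        "    owl:sameAs <%s> ." % tweet.get("url"),
    ]
    return "\n".join(lines)


turtle = triplify(RECORD)
print(turtle)
```

Building the Turtle by string formatting keeps the sketch dependency-free; a production triplifier would more likely use an RDF library to handle serialisation and escaping.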
Interlinking Module
• Hashtags (People, Organisations, Locations)
• MOAT, CommonTag
• Later: NLP-processed content, SILK Framework
SELECT ?post ?content ?maker ?name
WHERE {
  ?post rdf:type sioct:MicroblogPost ;
        foaf:maker ?maker ;
        sioc:content ?content .
  ?maker foaf:name ?name .
  FILTER(regex(?content, "#vienna"))
}
tag:tagName "vienna" ;
moat:tagMeaning <http://dbpedia.org/resource/Vienna> ;
tag:taggedResource <http://twitter.com/selvers/status/2160692623>
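The hashtag-based interlinking step can be sketched in Python as follows. This is a simplified illustration, not the module itself: the `MEANINGS` table is a hypothetical stand-in for a lookup against the LOD Cloud (for example DBpedia), and `extract_hashtags` and `interlink` are names introduced here.

```python
import re

# Hypothetical lookup table; a real system would resolve tags against
# a Linked Data source instead of a hard-coded dictionary.
MEANINGS = {"vienna": "http://dbpedia.org/resource/Vienna"}

# A hashtag is "#" followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(content):
    """Return the hashtags in a tweet, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(content)]


def interlink(post_uri, content, meanings=MEANINGS):
    """Return (tag name, meaning URI, tagged resource) for every known hashtag."""
    links = []
    for tag in extract_hashtags(content):
        uri = meanings.get(tag)
        if uri is not None:  # skip hashtags without a known meaning
            links.append((tag, uri, post_uri))
    return links


links = interlink("http://twitter.com/selvers/status/21606926237",
                  "Sitting in Prater #vienna, launch party. Nice")
```

Each resulting tuple carries the three pieces of information the tag description needs: the tag name, its meaning as a resource URI, and the tagged post.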
Classifier
Analysis
Conclusions & Outlook
• Current state-of-the-art technologies suffice to realise the proposed architecture paradigm
• Interlinking with LOD Cloud (Tweet-O-Sphere)
• Involving NLP methods
• Sentiment classification
• (Re)Tagging of Tweets
• Providing SPARQL Endpoint + Lookup Service as research interface
• Social Semantic Web Apps
Questions?