A Social Network is not a Graph
Y.C. Tay, National University of Singapore
in collaboration with: Zhifeng Bao, Yong Zeng, Jingbo Zhou
(fmsasg.com)
Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media
papers
CS104 Information and Information Systems
Social Networks and Graph Theory
courses
Exponential Random Graph Models for Social Networks
books
but a social network is not a graph
a social network is not a graph because (1) a social network is dynamic but a graph is static
Facebook: TAO social graph
(Bronson et al., USENIX ATC 2013)
graph is not up-to-date: updates are pulled from the master database
a social network is not a graph because (2) a social network is multi-dimensional whereas a graph is one-dimensional
figure: two users, Aisha and Bala
node attributes: job, hobby, family, education
edge attributes: Facebook friends, Twitter follower, comment, tag
Link Prediction Problem (e.g. "People You May Know")
graph properties (one dimension, graph algorithms):
Prob(link) = f(node degree, path length, ...)
e.g. [Lichtenwalter et al., KDD 2010], [Liben-Nowell & Kleinberg, CIKM 2003]
much better [Bao et al., ASONAM 2013] (multi-dimension, principal component analysis):
Prob(link) = f(coauthor, citation, affiliation, ...) for an academic community
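To make the graph-only form Prob(link) = f(node degree, path length, ...) concrete, here is a minimal sketch that scores non-adjacent user pairs by two standard neighbourhood measures, common neighbours and the Jaccard coefficient. The toy edge list and the function names are hypothetical, and this is not the method of any of the cited papers; it only illustrates that such a predictor sees nothing but the edges.

```python
from itertools import combinations

# Toy undirected friendship graph (hypothetical data).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]


def build_adjacency(edge_list):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def link_scores(adj):
    """Score every pair that is not yet linked, using graph properties only."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        common = adj[u] & adj[v]
        union = adj[u] | adj[v]
        scores[(u, v)] = {
            "common_neighbours": len(common),
            "jaccard": len(common) / len(union) if union else 0.0,
        }
    return scores


adj = build_adjacency(edges)
scores = link_scores(adj)
# The pair with the highest Jaccard coefficient is the top suggestion.
best_pair = max(scores, key=lambda pair: scores[pair]["jaccard"])
print(best_pair, scores[best_pair])
```

Nothing in these scores reflects who coauthored with whom, who cites whom, or who shares an affiliation, which is the information the multi-dimensional approach draws on.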
Cluster Discovery
syntactic graph properties:
algorithm(conductance, betweenness, ...)
e.g. [Leskovec et al., WWW 2008], [Mishra et al., Internet Math 2008]
much better [Bao et al., ER 2013], using semantics of relationship:
algorithm(number and frequency of interactions) for an academic community
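As an illustration of a syntactic graph property, the sketch below computes the conductance of a candidate cluster using the usual definition: the number of edges leaving the cluster divided by the smaller of the two edge volumes (sums of degrees). The toy graph and the function name are assumptions for illustration, not taken from the cited papers.

```python
# Toy graph (hypothetical): two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def conductance(adj, cluster):
    """cut(S, V \\ S) / min(vol(S), vol(V \\ S)) for an undirected graph."""
    cluster = set(cluster)
    # Edges with exactly one endpoint inside the cluster.
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    # Volume = sum of degrees on each side of the cut.
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj if u not in cluster)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


tight = conductance(adj, {"a", "b", "c"})   # one triangle: a good cluster
loose = conductance(adj, {"a", "b"})        # splits a triangle: a worse cluster
print(tight, loose)
```

A lower value means a better-separated cluster. The measure depends only on which edges exist, not on how often or in what way two users interact.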
a social network is not a graph because (3) a social network contains many graphs
e.g. [Zhou & Lin, KDD 2013]
data model: social graph + interaction graph + influence graph
e.g. social network for photographs:
bird watchers, gourmet cooks, photo journalists, Bollywood fans, ...
e.g. Facebook's TAO graph: thousands of edge types
type = gender: female graph, male graph
a social network is not a graph because (4) social network analysis is often not expressible as graph navigation
e.g. How do coauthor communities evolve over time?
sample SQL query to find #coauthors for papers in SIGMOD conferences between 1995 and 2000:
select count(*)
from coauthor, proceedings p, conference c
where coauthor.paper_id = p.paper_id
and p.proceeding_id = c.proceeding_id
and year(c.publication_date) > 1995
and year(c.publication_date) <= 2000
and c.proc_profile like '%SIGMOD'
expressible as graph traversal?
requires aggregation, joins, selection, non-key attributes.
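The sample query can be tried on toy data with Python's built-in sqlite3 module. This is only a sketch: the table layouts, the author_id column and all rows are assumptions made up for the example, and because SQLite has no year() function, strftime('%Y', ...) is swapped in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema: only the columns the query touches, plus author_id.
conn.executescript("""
CREATE TABLE conference (proceeding_id INTEGER PRIMARY KEY,
                         publication_date TEXT, proc_profile TEXT);
CREATE TABLE proceedings (paper_id INTEGER PRIMARY KEY, proceeding_id INTEGER);
CREATE TABLE coauthor (paper_id INTEGER, author_id INTEGER);
""")
conn.executemany("INSERT INTO conference VALUES (?, ?, ?)", [
    (1, "1996-06-04", "ACM SIGMOD"),   # in range, SIGMOD
    (2, "1995-05-22", "ACM SIGMOD"),   # excluded: year is not > 1995
    (3, "1998-08-24", "VLDB"),         # excluded: not SIGMOD
])
conn.executemany("INSERT INTO proceedings VALUES (?, ?)",
                 [(10, 1), (11, 2), (12, 3), (13, 1)])
conn.executemany("INSERT INTO coauthor VALUES (?, ?)", [
    (10, 100), (10, 101),              # paper 10: two coauthor rows
    (11, 100),
    (12, 102),
    (13, 103), (13, 104), (13, 105),   # paper 13: three coauthor rows
])

query = """
SELECT count(*)
FROM coauthor, proceedings p, conference c
WHERE coauthor.paper_id = p.paper_id
  AND p.proceeding_id = c.proceeding_id
  AND CAST(strftime('%Y', c.publication_date) AS INTEGER) > 1995
  AND CAST(strftime('%Y', c.publication_date) AS INTEGER) <= 2000
  AND c.proc_profile LIKE '%SIGMOD'
"""
count = conn.execute(query).fetchone()[0]
print(count)
```

The query joins three tables, selects on non-key attributes (date and profile) and aggregates with count(*), none of which is a step along an edge.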
a social network is not a graph because (5) hard to express/impose data integrity constraints on a graph model
foreign keys, e.g. tagging a face in a photo: tag.photo_id must be a photo.photo_id
functional dependencies, e.g. user_id uniquely determines name
etc.
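Both constraints are routine to declare in a relational system. The sketch below uses Python's sqlite3 module; the table layouts, extra columns and sample rows are assumptions for illustration. The foreign key enforces that tag.photo_id must be a photo.photo_id, and making user_id the primary key enforces that user_id uniquely determines name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE photo (photo_id INTEGER PRIMARY KEY,
                    owner_id INTEGER NOT NULL REFERENCES user(user_id));
CREATE TABLE tag (tag_id INTEGER PRIMARY KEY,
                  photo_id INTEGER NOT NULL REFERENCES photo(photo_id),
                  tagged_user_id INTEGER NOT NULL REFERENCES user(user_id));
""")


def try_insert(sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert("INSERT INTO user VALUES (?, ?)", (1, "Aisha")),
    try_insert("INSERT INTO user VALUES (?, ?)", (2, "Bala")),
    try_insert("INSERT INTO photo VALUES (?, ?)", (10, 1)),
    try_insert("INSERT INTO tag VALUES (?, ?, ?)", (1, 10, 2)),   # valid tag
    try_insert("INSERT INTO tag VALUES (?, ?, ?)", (2, 99, 2)),   # photo 99 does not exist
    try_insert("INSERT INTO user VALUES (?, ?)", (1, "Someone else")),  # second name for user 1
]
for line in results:
    print(line)
```

The last two inserts are refused by the database itself, without any application code checking them.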
a social network is not a graph because (6) there are no industrial-strength graph data management systems
concurrency control
crash recovery
query optimization
integrity constraints
data warehousing
triggers
index structures
buffer management
system catalog
data normalization
access control
data sharding/replication
decision support
view materialization
data mining
data dictionary language
stored procedures
if not a graph, then what?
We want a data model for social networks that
(I) is supported by commercial database management systems
e.g. DB2, SQL Server, Oracle
(II) is supported by database management systems that are affordable for social network start-ups
e.g. MySQL, PostgreSQL
(III) facilitates database schema design for social networks
(IV) facilitates database system engineering for scalability
our proposal: sonSchema, a relational database model of restricted form
(I), (II), (III), (IV)
starting point: what is a social network?
a social network is a group of users who interact through social products
sonSchema: a relational database model of restricted form
sonSchema
entities (kinds: user, product):
user
group
private_message
social_product
post
relationships (kinds: user-user, user-product, product-product):
friendship
membership
product_relationship
product_activity
response2post
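To show what a relational rendering of these entities and relationships might look like, here is a hypothetical sketch using Python's sqlite3 module. Only the ten table names come from the slide (with product_activity spelled out); every column, key and reference is an assumption made up for illustration, not the actual sonSchema definition.

```python
import sqlite3

# Hypothetical DDL: table names follow the slide, all columns are assumptions.
# "group" is quoted because GROUP is an SQL keyword.
DDL = """
-- entities
CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE "group" (group_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE social_product (product_id INTEGER PRIMARY KEY,
    creator_id INTEGER NOT NULL REFERENCES user(user_id), kind TEXT);
CREATE TABLE post (post_id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES user(user_id), body TEXT);
CREATE TABLE private_message (message_id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES user(user_id),
    receiver_id INTEGER NOT NULL REFERENCES user(user_id), body TEXT);

-- relationships
CREATE TABLE friendship (
    user_id INTEGER NOT NULL REFERENCES user(user_id),
    friend_id INTEGER NOT NULL REFERENCES user(user_id),
    PRIMARY KEY (user_id, friend_id));
CREATE TABLE membership (
    user_id INTEGER NOT NULL REFERENCES user(user_id),
    group_id INTEGER NOT NULL REFERENCES "group"(group_id),
    PRIMARY KEY (user_id, group_id));
CREATE TABLE product_relationship (
    product_id INTEGER NOT NULL REFERENCES social_product(product_id),
    related_product_id INTEGER NOT NULL REFERENCES social_product(product_id),
    PRIMARY KEY (product_id, related_product_id));
CREATE TABLE product_activity (
    activity_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(user_id),
    product_id INTEGER NOT NULL REFERENCES social_product(product_id),
    activity TEXT NOT NULL);
CREATE TABLE response2post (
    response_id INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES post(post_id),
    responder_id INTEGER NOT NULL REFERENCES user(user_id), body TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = {row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
print(sorted(tables))
```

A concrete service would rename and specialise these tables, for example a follower table in place of friendship.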
conceptual schema
logical schema
example instantiations:
individual, advertiser
cricket_club, Beatles_fans
photo, blog
email, announcement
coupon, poll, event
contact_list, follower
comment, retweet
vote-election, coupon-event
share_video, tag_photo
like_comment
sonSchema conceptual schema (figure legend: primary key, secondary key)
sonSchema example instantiation: academic community
group
post
response2post
user
friendship
We want a data model for social networks that (III) facilitates database schema design for social networks
architecture to automatically translate social network design into sonSchema instantiation
We want a data model for social networks that (IV) facilitates database system engineering for scalability
leverage sonSchema's restricted form to design a scalable protocol for strong consistency
leverage sonSchema's restricted form to efficiently find the best query plan
result: sonSQL
our ambition is for sonSQL to replace MySQL as the default database system adopted by new social network services
http://sonsql.comp.nus.edu.sg