Page 1: Modeling Human Flows in Social Activities from Individual ...wan.poly.edu/KDD2012/forms/workshop/SNAKDD2012/doc/a10-miao.pdf · Modeling Human Flows in Social Activities from Individual

Modeling Human Flows in Social Activities from Individual to Collective Behavior

Zhongchen Miao Rong Xie Wenjun Zhang

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University Shanghai Key Laboratory of Digital Media Processing and Transmissions


ABSTRACT

A human flow management model is important to the success of any large-scale social activity, in which each individual makes choices by compromising between subjective willingness and objective restrictions. However, modeling an individual's activities within a large population of participants is a great challenge in itself, owing to the variety of activities, constraints and preference types involved, and to the interactions between the crowd and the environment. In this paper, we propose a model of human flows from individual to collective behavior that describes how participants dynamically choose their visiting targets. The model takes into account subjective willingness, such as the target attraction index and individual preferences, as well as objective restrictions, such as waiting time and visiting distance. On this basis we build a framework that represents the attributes of both user behavior and the environment. Using the proposed model and framework, we carry out a case study on the World Expo 2010 Shanghai China to validate the model. Finally, we analyze the characteristics of collective behavior in social activities through the model.

Categories and Subject Descriptors J.4 [Computer Applications]: Social and Behavioral Sciences – sociology; I.6.0 [Computing Methodologies]: Simulation and Modeling – general; H.2.8 [Database Management]: Database Applications – data mining; G.2.2 [Discrete Mathematics]: Graph Theory – network problems

General Terms Algorithms, Experimentation, Human Factors

Keywords Human flows, Social network modeling, Collective behavior

1. INTRODUCTION

In social activities, each person compromises between the actions he is willing to take and objective restrictions such as the environment and the actions taken by others. Collective behavior thus emerges as the result of interactions among a large group of people.

In many social activities with limited resources and time, such as visiting a theme park, every participant makes dynamic decisions when choosing targets so as to achieve the best overall balance between time cost and satisfaction gained. The human flows therefore reflect the collective behavior of a large number of participants making these dynamic choices. A model that describes a user's behavior in a social activity and the interactions with other users is needed, because it helps to optimize the individual's experience and to understand the social mechanics.

Previous research has produced studies and algorithms addressing user choice-making behavior in environment-restricted social activities. Location Based Service (LBS) is a fundamental technology for dealing with the environment and user behavior. When people move through a physical environment, LBS can provide routes to a neighborhood using the user's current GPS location[10], or show information such as comments and experiences shared by others[6]. Classic generic decision theory can then be used to deal with the choice-making behavior of a small group of users[1]. As the difference between generic decision making and environment-restricted decision making is still uncertain[8], a framework for geographic information-supported participatory decision making has been proposed[3]. Tourist coordination algorithms such as the Greedy algorithm, the Congestion-Avoidance (CA) algorithm and the Stochastic CA algorithm have been studied[7]. Kawamura et al. develop a multi-agent model to describe the theme park problem[5], which provides a basic spatial model for our research; the same paper also proposes a coordination algorithm, PCC. Li et al. improve the PCC algorithm and develop the PCD coordination method[4]. Research on time geography[9] and space-time accessibility[11] in individual choice-making behavior has also been carried out. With the fast development of LBS, work on decision-making analysis for individuals[10] and small groups using LBS information is emerging[2].

However, the challenge remains. It is very hard to determine the criteria that govern collective behavior among a large number of people, because each person's actions are closely related to the environmental restrictions, while the interaction between the users and the environment is uncertain. In the theme park scenario mentioned above, people may flood into the amusements with different distributions, and this unbalanced crowd distribution will affect users' visiting choices.

In this paper, we develop a framework to model an individual's behavior in dynamic target choosing in a restricted environment, and to explain the collective behavior of human flows across a large number of people. The model is validated with visiting data from the World Expo 2010 Shanghai China.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. The 6th SNA-KDD Workshop '12 (SNA-KDD'12), August 12, 2012, Beijing, China. Copyright 2012 ACM 978-1-4503-1544-9...$15.00.


In Section 2, we first build a spatial model of the physical environment, which converts the restrictions in the environment into attributes of nodes and edges in a topological network. We then define a temporal model to capture the attributes that vary because of interactions between user behavior and the environment. After that, we propose an algorithm that mathematically defines user behavior in the dynamic procedure of determining a visiting target. The algorithm, denoted STU, takes into account spatial, temporal and user behavior information when choosing a target. In Section 3, we study our framework in the scenario of the World Expo 2010 Shanghai China, and validate the STU algorithm with the real visiting dataset from the Expo Park. In Section 4, we analyze the tourists' collective behavior in the Expo Park over a long period of time according to the STU algorithm. Abnormalities in tourist visits, the evaluation of touring experience and some characteristics of our model are also discussed. Finally, Section 5 draws the conclusions.

2. FRAMEWORK DEFINITIONS

2.1 Framework Structure

Inside a bounded region such as a tourist attraction, the collective behavior of a group of people is closely related to the environment, as their social actions are limited by physical constraints. We therefore establish a framework that mathematically defines the users' behavior, the environmental constraints and the interactions between the two. The framework shown in Figure 1 addresses the following three aspects.

1) The spatial model is the basis of the framework. It gathers geographic information and attributes of the entities (points of interest) within a region, and converts this real environment information into attributes of nodes and edges in a topological network graph.

2) The temporal model deals with the attributes of entities that vary with both users' actions and time. As time goes by, the users' behavior and the choices they make change the entities' attributes, and these changes in turn affect the behavior and choices of other users.

3) The user behavior model analyzes the choice-making process of individuals, and then summarizes a common principle that guides collective behavior in a large population. The core of this model is the STU algorithm, which explains patterns of user behavior in dynamic visiting target determination by taking into account the above attributes of the framework.

2.2 Spatial Model

A bounded tourist attraction area can be regarded as a combination of irregularly shaped blocks. On each block lies an entity with one of various functions, such as an entrance, exit, exhibition building, square or accessory facility. If we treat each entity abstractly as a node, a tourist attraction area can be turned into a topological network graph with N nodes. The roads and traffic lines that connect the physical blocks can be seen as edges between nodes in that network. Geographic information such as the function, location, scale and distance of different blocks can be obtained using a Geographic Information System (GIS) and Location Based Service (LBS).

As the road or traffic line between two physical blocks is a two-way path, an undirected weighted graph can be generated by linking the N nodes. The weight $D(i_1, i_2)$ of an edge between node $i_1$ and node $i_2$ is the shortest path distance available between block $i_1$ and block $i_2$ in the physical environment.

Figure 1. Block diagram of our framework

Figure 2. (a) shows the map of Zone D and Zone E of the Expo Park of World Expo 2010 Shanghai China¹. (b) shows the network graph with nodes and edges obtained from the spatial model.

Attributes of the nodes in the network graph represent the physical characteristics and functions of the entities. The attribute $C_i$ is the capacity of node i, the maximum number of people that can be served in node i at the same time. $ST_i$ is the service time of node i, the period of time that a person stays in that node. The service time is fixed in some places, such as a cinema, and flexible in others, such as a supermarket. $Q_i$ is a parameter called the attraction index, which shows the popularity of node i with the public. Generally speaking, the more people enter or are willing to enter node i, the higher its $Q_i$ should be.

¹ The Expo Tourist Guidance Map is located at the Expo 2010 official website: http://www.expo2010.cn/hqfw/dlt.htm

Figure 2 shows an example of modeling a geographic environment with different functions in a region as an undirected weighted topological network graph with various attributes. This operation provides a fundamental model for our collective behavior research, since the places where users act lie within the spatial model.
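As a rough illustration of the spatial model, the edge weight $D(i_1, i_2)$ can be obtained by running a shortest-path search over the road segments. The sketch below uses Dijkstra's algorithm, which is our choice for illustration and is not named in the paper; the node names and segment lengths are hypothetical.

```python
import heapq

def shortest_distances(edges, source):
    """Dijkstra's algorithm on an undirected weighted graph.

    edges: iterable of (node_u, node_v, length) road segments.
    Returns a dict mapping each reachable node to its distance from source.
    """
    adjacency = {}
    for u, v, length in edges:
        # Roads are two-way paths, so store each segment in both directions.
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))

    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, length in adjacency.get(u, []):
            candidate = d + length
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

# Hypothetical road segments (lengths in metres), for illustration only.
roads = [
    ("gate", "pavilion_a", 300.0),
    ("pavilion_a", "pavilion_b", 200.0),
    ("gate", "pavilion_b", 600.0),
]
D_from_gate = shortest_distances(roads, "gate")
```

Here the route from the gate to pavilion_b via pavilion_a (500 m) is shorter than the direct 600 m segment, so the former is used as the edge weight.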

2.3 Temporal Model

The temporal model deals with the varying attributes of entities in our framework, which arise from interactions between user behavior and the environment. Queuing is a classic interaction: a place with a long queue may attract many people while at the same time losing potential visitors who dislike queuing. As time goes by and people move around the physical environment, the status of the entities in that area changes. A temporal model is therefore applied in our framework to capture the time-varying attributes of nodes.

Suppose the current time is $t$. The number of tourists inside attraction spot i is $CI_i(t)$, which is never higher than its capacity $C_i$. When $CI_i(t)$ reaches $C_i$ and more people want to visit node i, a waiting queue of length $QL_i(t)$ forms. We regard the waiting queue of an attraction spot as a single pipeline of visiting tourists. If the number of people waiting to enter node i in front of tourist j is n times $C_i$, he has to wait at least n times $ST_i$ to enter, because the node can serve only $C_i$ people during one $ST_i$ period. Therefore, the waiting time $W_{ij}(t)$ of a newly enqueued tourist j for entering node i can be calculated by:

$$W_{ij}(t) = \frac{QL_i(t) + 1}{C_i} \cdot ST_i$$
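A small numeric sketch of the waiting-time formula above may help; the function name and the sample figures are hypothetical.

```python
def waiting_time(queue_length, capacity, service_time):
    """Waiting time W_ij(t) for a tourist who has just joined the queue.

    queue_length: QL_i(t), people already queuing for node i
    capacity:     C_i, people served at the same time
    service_time: ST_i, duration of one service period
    The result is in the same time unit as service_time.
    """
    return (queue_length + 1) / capacity * service_time

# Hypothetical pavilion: 299 people queuing, 100 admitted per 20-minute show.
wait_minutes = waiting_time(queue_length=299, capacity=100, service_time=20)
```

With these numbers the newcomer is the 300th person in line, which corresponds to three full service periods, that is 60 minutes.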

$CV_i(t)$ is the cumulative count of tourists who have visited node i up to time t. Another time-varying attribute in the temporal model, $q_{ij}(t)$, describes the preference coefficient of person j for node i. $q_{ij}(t)$ may be a constant 1 if no one has any preference bias toward any node at any time of the day.

Table 1. The attributes of the spatial and temporal models

| Attribute | Description |
|---|---|
| $C_i$ | Capacity: the maximum number of people that can be served inside node i |
| $ST_i$ | Service time: the time a person spends inside node i |
| $Q_i$ | Attraction index: the popularity and public recognition of node i |
| $D(i_1, i_2)$ | The shortest route distance from node $i_1$ to node $i_2$ |
| $QL_i(t)$ | The number of people in the waiting queue to enter node i at time $t$ |
| $W_{ij}(t)$ | The waiting time for person j, who newly joins the queue, to enter node i at time $t$ |
| $CI_i(t)$ | The current number of visitors inside node i at time $t$ |
| $CV_i(t)$ | The cumulative number of visitors who have entered node i up to time $t$ |
| $q_{ij}(t)$ | The preference coefficient of person j for node i at time $t$. The coefficient stays constant at 1 if there is no bias in user preferences |

Taken together, the spatial and temporal attributes of nodes and edges distinguish one block from all the other blocks in a region. For example, a square has a much higher capacity than an attraction spot, but a shorter service time, as people do not stay in a square for long. The preference for an exit is lower than that for an entrance in the morning, and it rises late in the day when people are leaving for home. That is to say, in a large landscape, a user's action can be inferred from the place he is in.

2.4 User Behavior Model

An action made by a person can be denoted as a triple $\langle A, P, T \rangle$, meaning that his action $A$ happens at a place $P$ during a period of time $T$. A person's chain of behavior can therefore be represented as a sequence of $\langle action, place, time \rangle$ triples obtained by observing that person over a continuous time period.

Let $\boldsymbol{S_m}$ be the set of sequences produced by a group of m people. An element $S_j$ of $\boldsymbol{S_m}$ is the sequence of person j's $c_j$ actions, together with the place and time of each action during the observation:

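$$S_j = \left\{ \langle a_j^{(1)}, p_j^{(1)}, t_j^{(1)} \rangle, \ldots, \langle a_j^{(k)}, p_j^{(k)}, t_j^{(k)} \rangle, \ldots, \langle a_j^{(c_j)}, p_j^{(c_j)}, t_j^{(c_j)} \rangle \right\}$$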

The triple $\langle a_j^{(k)}, p_j^{(k)}, t_j^{(k)} \rangle$ states that person j has spent time $t_j^{(k)}$ doing his kth action $a_j^{(k)}$ at place $p_j^{(k)}$.

In social activities such as visiting, choosing the next target is the most frequent and the leading action a user takes, as all other actions, such as moving, queuing, visiting and stopping, follow from the chosen target. Since a person always trades off time cost against reward to achieve the best overall satisfaction, he has to make dynamic choices such as where to visit, whether to queue and when to exit while touring a tourist attraction.

We propose the STU algorithm to describe the principle behind a user's dynamic target choosing procedure. The algorithm takes into account subjective intentions such as user preferences, and objective constraints such as distance, service time and waiting time, among other factors.

Suppose people are visiting an environment that has been modeled as in Sections 2.2 and 2.3. At time $\tau$, when person j has just finished his (k-1)th action of visiting place $p_j^{(k-1)}$, he calculates the score of every node in his unvisited node set $\boldsymbol{L_j}$ and chooses the node with the maximum score as his next visiting target. He then starts his kth action $a_j^{(k)}$ of moving towards $p_j^{(k)}$.

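The score calculation for the kth choice at time $\tau$ is shown below:

$$score_i^{(k)} = \begin{cases} \dfrac{Q_i \cdot q_{ij}(\tau)}{W_{ij}(\tau) \cdot D\left(i, p_j^{(k-1)}\right)}, & i \in \boldsymbol{L_j} \\ 0, & \boldsymbol{L_j} = \emptyset \end{cases}$$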
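The scoring rule can be sketched in a few lines. This is only an illustration under our own assumptions: the function names, the dictionary layout and the sample numbers are hypothetical, and the stopping threshold is left out for brevity.

```python
def stu_score(attraction, preference, wait, distance):
    """Score of one candidate node: Q_i * q_ij / (W_ij * D(i, current))."""
    return attraction * preference / (wait * distance)

def choose_next_target(unvisited, attraction, preference, wait, distance):
    """Return the unvisited node with the highest score, or None if none is left.

    attraction, preference, wait and distance are dicts keyed by node id;
    distance[i] is the shortest distance from the current place to node i.
    """
    if not unvisited:
        return None  # L_j is empty: nothing left to visit
    return max(
        unvisited,
        key=lambda i: stu_score(attraction[i], preference[i], wait[i], distance[i]),
    )

# Hypothetical snapshot at time tau for a tourist with no preference bias.
attraction = {"A": 80.0, "B": 50.0}
preference = {"A": 1.0, "B": 1.0}
wait = {"A": 60.0, "B": 10.0}          # minutes
distance = {"A": 200.0, "B": 400.0}    # metres
next_target = choose_next_target({"A", "B"}, attraction, preference, wait, distance)
```

In this snapshot node A is more attractive and closer, but its long queue lowers its score (about 0.0067) below that of node B (0.0125), so B is chosen.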

Once a person has decided on his next destination, the subsequent chain of actions is known in advance: moving to the target, queuing, visiting and choosing the next target.

When person j has finished visiting a node, we calculate and update his overall satisfaction gained up to time $\tau$ as shown below. It is a sum over his visited node set $\boldsymbol{V_j}$ of the time ratio of each node, weighted by attraction index and preference.


$$satisfaction_j(\tau) = \sum_{i \in \boldsymbol{V_j}} \frac{Q_i \cdot q_{ij}(T_{ij}) \cdot ST_i}{total\_time_{ij}}$$

$T_{ij}$ is the time point at which person j visits node i, and $total\_time_{ij}$ is the total time he spends on node i, including the time spent waiting, visiting and moving to node i.

The individual satisfaction index is the average of the satisfaction values of all users up to time $\tau$, calculated over the total population of m:

$$individual\_satisfaction(\tau) = \left( \sum_{j=1}^{m} satisfaction_j(\tau) \right) / m$$
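The two satisfaction formulas can be checked with a short sketch. The record layout (one tuple per visited node) and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def satisfaction(visits):
    """Satisfaction of one person over the visited node set V_j.

    visits: list of (Q_i, q_ij, ST_i, total_time_ij) tuples, one per visited
    node, where total_time_ij covers moving, waiting and visiting.
    """
    return sum(q_index * pref * service / total
               for q_index, pref, service, total in visits)

def individual_satisfaction(all_visits):
    """Average satisfaction over the whole population of m people."""
    return sum(satisfaction(v) for v in all_visits) / len(all_visits)

# Hypothetical population of two tourists (times in minutes).
person_1 = [(80.0, 1.0, 20.0, 40.0)]                           # 80 * 20 / 40 = 40
person_2 = [(50.0, 1.0, 10.0, 20.0), (80.0, 1.0, 20.0, 80.0)]  # 25 + 20 = 45
average = individual_satisfaction([person_1, person_2])
```

The ratio $ST_i / total\_time_{ij}$ rewards visits in which most of the time is spent inside the node rather than moving or queuing; here the average over the two tourists is 42.5.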

If person j has finished visiting all the nodes ($\boldsymbol{L_j}$ becomes empty), or the maximum of $score_i$ is lower than his threshold $TH\_S_j$, he stops visiting, as he is no longer interested in this region.

3. A CASE STUDY ON EXPO 2010 SHANGHAI CHINA

In this section, we apply the method described in Section 2 and build a combined user behavior and environment model using the real dataset of Expo 2010 Shanghai China. We explain how the framework is applied to the Expo 2010 dataset, and then use the STU algorithm to simulate the tourists' visiting process with the Expo 2010 data.

3.1 Dataset of Expo 2010 Shanghai China

In 2010, a World Exposition was held in Shanghai, China. During the 184 days from May 1st to October 31st, the Expo 2010 Shanghai China attracted more than 73 million tourists². The Expo Park is divided into 5 zones, denoted Zone A, B, C, D and E. Zones D and E are on the west bank of the Huangpu River, while the other three zones are on the east bank. Subway, buses and ferryboats cross the river and link the two banks. With an area of 5.28 square kilometers and an average daily visiting population of almost 0.4 million, the Expo Park can be regarded as a small town with pavilions, restaurants, squares, roads, a subway and a river. To a great extent, the visitors' touring experience in the Expo Park resembles the way people behave in a city or a small town.

In general, the Expo Park and pavilions start service at 9:00 in the morning. The pavilions stop service at 22:30 and the whole Park at 24:00. For model building and attribute setup, we refer to the Expo 2010 official website and the dataset provided by the Bureau of Shanghai World Expo Coordination, our cooperative partner. These data consist of entrance, traffic and tourist flow information, as well as hourly data such as $CV(t)$, $QL(t)$ and $CI(t)$ for every major pavilion in Zones D and E.

Since Zones D and E have higher data availability and are physically quite independent of the other zones in the Expo Park, we concentrate on the 20 major pavilions in Zones D and E.

3.2 Attributes Calculation

As described in Section 2, we build the nodes, edges and their spatial attributes in a topological graph using geographic information from GIS and LBS as well as historical data of Expo 2010.

² The statistics can be found at the Expo 2010 official website: http://en.expo2010.cn

Figure 3. The hourly stats of EXPO tourists number in a day by averaging the data through July. The hourly count of tourists coming into Expo Park through 4 gates (the bar graph on primary y-axis) shows that most of the tourists check in before 12:00. The number of tourists staying at Zone D and Zone E (the line graph on secondary y-axis) remains almost steady between 12:00 and 18:00.

Pavilion Capacity) If $CI(t)$ stays at a nearly constant value over a long time, it has most likely reached the capacity of the pavilion. In rare cases, however, the number of visitors inside may be slightly higher than that value. These cases can be treated as exceptions, since a special event may be taking place in the pavilion. The capacity $C_i$ of pavilion i can therefore be calculated by selecting the most frequent value of MAX($CI(t)$) for that pavilion.

Pavilion Attraction Index) Once a tourist enters a pavilion, he is in effect voting for that pavilion and showing his interest. When these votes are counted over a large population and a long time period, they show the public recognition of a pavilion. The attraction index of a pavilion can therefore be calculated from historical $CV(t)$ over a period of time. We count the visiting tourists for each pavilion over the 31 days of July 2010. The attraction indexes are the percentages of these counts, normalized so that they sum to 1000.
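The capacity and attraction-index estimates can be sketched as follows. This reflects one possible reading of the text, namely that the capacity is the most frequent daily peak of $CI(t)$; the function names and all sample numbers are hypothetical.

```python
from collections import Counter

def estimate_capacity(daily_occupancy):
    """Capacity C_i as the most frequent daily maximum of CI(t).

    daily_occupancy: list of days, each a list of hourly CI(t) readings.
    Occasional higher peaks (special events) are outvoted by the usual peak.
    """
    daily_peaks = [max(day) for day in daily_occupancy]
    return Counter(daily_peaks).most_common(1)[0][0]

def attraction_indexes(visit_counts, total=1000.0):
    """Attraction index Q_i as each node's share of visits, scaled to sum to total."""
    all_visits = sum(visit_counts.values())
    return {node: total * count / all_visits
            for node, count in visit_counts.items()}

# Hypothetical hourly occupancy of one pavilion over four days.
readings = [
    [120, 480, 500, 500],
    [90, 500, 500, 470],
    [200, 500, 530, 500],   # one exceptional peak of 530
    [150, 500, 500, 490],
]
capacity = estimate_capacity(readings)

# Hypothetical monthly visit counts for three pavilions.
indexes = attraction_indexes({"A": 300000, "B": 100000, "C": 100000})
```

The single day with a peak of 530 does not change the estimate of 500, and the three indexes come out as 600, 200 and 200.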

Tourists number) There are 4 entrances and exits in Zones D and E of the Expo Park. Three of them are land entrances and the other is the metro entrance. When the Expo Park starts service at 09:00, very large visitor flows come through the 4 entrances. Compared with the number of tourists crossing the river from the other three zones inside the Expo Park, these 4 entrances are the major source of tourist inflow. As shown in Figure 3, the hourly number of tourists checking in drops sharply after 12:00. We therefore use the real-time tourist counts from these 4 entrances before 12:00 in our simulation, sampled every five minutes.

Pavilion Service Time) Service time is the average time a user stays in a pavilion. Our investigation of the Expo visiting data shows that the period between 10:00 and 18:00 is the busiest for every pavilion, as $CI(t)$ stays at nearly full capacity during that period. The number of tourists staying in Zones D and E in Figure 3 shows the same result. With a time resolution of one hour, the service time of pavilion i can be calculated as follows, where t=10 means ten o'clock.


$$ST_i = \left( \sum_{t=10}^{17} \frac{C_i}{CV_i(t+1) - CV_i(t)} \right) / 8$$
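A minimal sketch of the service-time estimate, assuming hourly cumulative counts keyed by the hour of day; the function name and the sample pavilion are hypothetical.

```python
def service_time(capacity, cumulative_visitors, first_hour=10, last_hour=17):
    """Average service time ST_i in hours over the busy period.

    capacity:            C_i of the pavilion
    cumulative_visitors: dict mapping hour t to CV_i(t)
    Each hourly term is C_i divided by the number of people admitted during
    that hour. An hour with no admissions would divide by zero, so the data
    must be complete for the busy period.
    """
    ratios = [
        capacity / (cumulative_visitors[t + 1] - cumulative_visitors[t])
        for t in range(first_hour, last_hour + 1)
    ]
    return sum(ratios) / len(ratios)

# Hypothetical pavilion: capacity 500, admitting 1000 people every hour.
cv = {t: 1000 * (t - 9) for t in range(9, 19)}
st_hours = service_time(500, cv)
```

With a full pavilion of 500 people and 1000 admissions per hour, the content of the pavilion turns over twice an hour, so the estimated service time is 0.5 hours.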

3.3 Evaluation

The visitors' cumulative count $CV(t)$ of a pavilion is the number of people who have entered the pavilion. It represents the overall choices made by the tourists as well as the visitor flows of the different pavilions. The validity of the framework and the STU algorithm is therefore judged by comparing simulation results with the real $CV(t)$ at different time points. Regarding the reliability of the Expo data, three pavilions have regular data faults in $CV(t)$, so we remove them from our evaluation. On rare occasions a pavilion is closed for a whole day, and we remove that pavilion from the evaluation for that day.

The overall error rate over $N$ pavilions is the average of the absolute values of the individual pavilions' error rates. At time $\tau$, the error rate of the whole simulation process is calculated by the formula below:

$$Error\ Rate(\tau) = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{real\_CV_i(\tau) - simu\_CV_i(\tau)}{real\_CV_i(\tau)} \right|$$
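The error-rate formula is a mean absolute relative error over pavilions. A brief sketch, with a hypothetical function name and made-up counts:

```python
def error_rate(real_cv, simu_cv):
    """Mean absolute relative error between real and simulated CV_i(tau).

    real_cv, simu_cv: dicts mapping pavilion id to cumulative visitor count.
    Only pavilions present in real_cv are evaluated, so faulty or closed
    pavilions can be excluded simply by leaving them out of real_cv.
    """
    errors = [abs((real_cv[i] - simu_cv[i]) / real_cv[i]) for i in real_cv]
    return sum(errors) / len(errors)

# Made-up counts for two pavilions at some time tau.
real = {"A": 1000, "B": 2000}
simulated = {"A": 900, "B": 2500}
rate = error_rate(real, simulated)
```

Pavilion A is underestimated by 10% and pavilion B overestimated by 25%; because absolute values are averaged, the two errors do not cancel and the overall rate is 17.5%.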

3.4 Simulation Results

With the number of tourists and the simulation environment set up as explained above, we study the visitor flows and user behavior in dynamic visiting target choosing in the Expo.

We simulate the tourists' entire behavior in the Expo Park within a single day using the spatial and temporal framework built above. The actions a tourist may take include waiting in a queue, visiting a pavilion, moving to a destination, choosing a target and stopping. For the dynamic visiting target choice, we use our STU algorithm and, for comparison, the congestion-avoidance model (CA)[7], the greedy model (GM)[7], the PCC model[5] and the PCD model[4] mentioned in Section 1, as well as a random selection strategy (RS). As the ground truth data arrive hourly, the error rate of each algorithm's simulation result is calculated every hour from 09:00 to 22:00.

The above simulation procedure is repeated 46 times using the datasets of 46 different days between August 1st 2010 and September 30th 2010 (of the 61 days in this period, 15 days have hourly data for some pavilions that is too incomplete to evaluate the hourly error rate). Figure 4 shows the average hourly error rate over the 46 days.

In Figure 4, the error rate of STU is the lowest among all the algorithms except at 10:00, the first hour of visiting. The error rate of STU drops as time passes and reaches its lowest value at 22:00, the end of our simulation. This behavior arises because in this experiment we set all tourists' preferences for all pavilions at all times to be equal, which is of course not the case in real life. It shows that when user preference bias is averaged out over a long time span, our model succeeds overall in modeling collective behavior.

Furthermore, the individual satisfaction index of the simulation is also calculated. In Figure 5, the STU algorithm achieves a higher individual satisfaction than any other algorithm from the beginning to the end of the day. This indicates that the STU algorithm is effective both in an individual's dynamic target choosing for a better visiting experience and in coordinating a large number of participants.

Figure 4. The average of hourly error rate of different algorithms through 46 days.

Figure 5. The average of hourly individual satisfaction index of different algorithms simulations through 46 days.

4. DISCUSSIONS

4.1 Abnormal Human Flows Detection

As our model is effective in describing users' dynamic target determination and the resulting human flows within a single day among a large group of people, we want to extend it to analyze human behavior, from individual to collective, over a longer time period.

We therefore continue our study in the Expo Park to see whether the STU algorithm still matches the behavior of tourists over a longer period. After removing 8 days with incomplete data from the period August 1st 2010 to October 30th 2010, we simulate our model on the datasets of the remaining 83 days and calculate the evaluation metrics at 22:00 every day.

Figure 6 shows the error rate and the individual satisfaction index for each day. The error rate is between 15% and 20% in most cases in August and September, and rises to 25%-30% in October. However, the error rate rises sharply to more than 40% on September 1st and October 16th, which is much higher than on the other days.

Why does this happen? Our model captures the common behavior of users in dynamic target choosing, and thus the human flows in collective behavior under regular conditions. If a large error rate appears, something uncommon must have happened that made a large number of tourists behave abnormally. The severe weather on September 1st and the overwhelming number of visitors on October 16th led to stronger constraints in the environment and unpredictable interactions between users, which caused the abnormality. Historical records confirm that the tourists' collective behavior on these two days was indeed clearly abnormal compared with usual conditions. This abnormality is reflected in the large error rate in visitor flows and the irregular satisfaction index for individuals.
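The idea of treating an unusually large end-of-day error rate as a signal of abnormal collective behavior can be sketched as a simple threshold test. This is only an illustration: the function name `flag_abnormal_days`, the default cut-off of 0.40 (chosen to mirror the "more than 40%" observation above) and the sample values are assumptions, not part of the model.

```python
def flag_abnormal_days(daily_error, threshold=0.40):
    """Return the days whose end-of-day error rate exceeds the threshold.

    daily_error maps a day label to the error rate at 22:00 (as a fraction).
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    return [day for day, err in daily_error.items() if err > threshold]


# Placeholder values for illustration only; not measurements from the paper.
example = {"day_a": 0.17, "day_b": 0.45, "day_c": 0.22}
print(flag_abnormal_days(example))  # ['day_b']
```

A fixed cut-off is the simplest choice; a threshold relative to the typical error level of each month would be an alternative, since the regular error level differs between months.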

Severe Weather) On September 1st 2010, Typhoon Kompasu hit the Expo Park in Shanghai as well as much of East Asia. The weather forecast on the day before the typhoon arrived predicted a peak wind speed of 185 km/h. Therefore, on August 31st, the Shanghai Meteorological Observatory issued a blue typhoon warning signal, and the Shanghai Municipal Government announced that schools would be closed on that day (see footnote 3). The behavior of visitors in the Expo Park was also greatly affected by the typhoon, as some pavilions stopped service due to the heavy rain. The number of visitors dropped to about 181,700 on that day, a level otherwise seen only during the first ten days of Expo 2010.

Overwhelming Crowd) On October 16th, more than 1,032,800 tourists visited the Expo Park in a single day, which is 2.5 times the average daily number of tourists over the whole Expo. This count of over one million visitors also broke the record set by the Osaka World Expo 1970. The overwhelming number of tourists made the Expo Park and the pavilions very crowded and unpleasant to visit. Popular pavilions such as the Oil Pavilion in Zone D even had a waiting time of 12 hours (see footnote 4). Under such circumstances, the behavior of tourists is certainly different from that on ordinary days.

4.2 Analysis on Visitor's Satisfaction Index

The individual satisfaction of a visitor (defined in Section 2.4) is the overall benchmark of his whole day's visiting experience. Analyzing the satisfaction index allows us to evaluate the happiness of an individual tourist as well as that of the crowd as a whole.

In the experiment described in Section 4.1, the individual satisfaction at 22:00 is calculated every day. The dashed line on the secondary y-axis in Figure 6 shows this individual satisfaction index over the 83 days. Comparing it with the total number of tourists who check into the Expo Park (the dotted line on the secondary y-axis in Figure 6, counted in thousands), we find that the total number of tourists and the individual satisfaction move in opposite directions. When the number of visitors increases, every tourist feels less comfortable, because his available options are narrowed by the other tourists and he spends more time waiting in a queue or lingering in a square rather than visiting an attraction. Conversely, if the constraints in the environment are weaker, every tourist feels less pressure and has a more pleasant visit.
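One way to quantify this opposing trend is a Pearson correlation coefficient between the daily visitor count and the daily satisfaction index; a value close to -1 would correspond to the relationship described above. The sketch below is an assumption about how such a check could be done, not an analysis performed in the paper, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

Calling `pearson(visitor_counts, satisfaction_values)` on the 83 daily pairs would return a single number in the range [-1, 1].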

3 For more details of Typhoon Kompasu, please see http://en.wikipedia.org/wiki/Typhoon_Kompasu_(2010)

4 More stories and pictures showing the crowd are on the website http://www.chinasmack.com/2010/pictures/shanghai-world-expo-sees-1-million-visitors-in-a-single-day.html

Figure 6. The solid blue line on primary y-axis shows the simulation error rate till 22:00 over 83 days. The dotted green line on the secondary y-axis shows the total number of tourists to check into the Expo Park during that day (count in thousands). The dashed red line on secondary y-axis is the individual satisfaction index till 22:00.

Figure 7. The surveillance camera video taken near Malaysia Pavilion on June 26th, September 1st and October 16th.

The phenomenon that individual satisfaction drops as the total number of tourists rises is also supported by the surveillance video from the Expo Park shown in Figure 7. The video recorded on June 26th shows a normal situation in which a pavilion has some people queuing at 12:00 and the queue becomes shorter later in the day at 18:00. In contrast, September 1st sees a very low number of tourists due to the typhoon alert explained in Section 4.1. The Expo site appears deserted and visitors barely have to wait on that day. In that case, visitors can visit more pavilions than usual and gain much higher satisfaction. However, on October 16th, the only day in the six months of the Expo on which more than 1 million tourists visited the Expo Park, the park is full of tourists both in the square and in the waiting queues, even at 18:00. In that case, the satisfaction index of everyone is dramatically low and almost no one feels pleased with his experience.


4.3 Analysis on Model Characteristics

4.3.1 Threshold TH_S for quitting visit

The parameter 𝑇𝐻_𝑆 in our model (see Section 2.4) is the threshold that defines the user stopping criterion. Since every tourist is free to visit all five zones in the Expo Park, some tourists may stop visiting Zone D and Zone E and cross the river to visit the other side. This is a subjective action: a tourist no longer wants to visit any target here once he has assessed the 𝑠𝑐𝑜𝑟𝑒 of all his unvisited nodes and found that none is higher than 𝑇𝐻_𝑆. We run the simulation described in Section 4.1 with different values of 𝑇𝐻_𝑆 to show the effect of this stopping criterion, with the error rate calculated at 22:00 every day. As illustrated in Figure 8, the error rate rises as the threshold 𝑇𝐻_𝑆 increases from 0 to 10. This indicates that the majority of the tourists in Zone D and Zone E do not have a strong intention to leave the area early, because there are still many interesting spots that keep them visiting here.
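The stopping criterion described above can be written as a one-line check. This is a minimal sketch under the assumption that each unvisited node already has a numeric score (computed as in Section 2.4, which is not reproduced here); the function name `should_stop` and the dictionary layout are hypothetical.

```python
def should_stop(unvisited_scores, th_s):
    """Return True when no unvisited node scores higher than th_s.

    unvisited_scores maps a node name to its current score for this
    tourist. An empty mapping (nothing left to visit) also means stop.
    """
    return all(score <= th_s for score in unvisited_scores.values())
```

With a larger `th_s`, more tourists satisfy the condition earlier in the day and leave the area, which is the behavior that the rising error rate in Figure 8 argues against.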

4.3.2 Threshold TH_D for queue drop rate

In practice, queue drop is common because the length of a waiting queue is sometimes limited. The usual reasons for this limitation are security concerns about a very long queue, or a designed waiting area that is not large enough to hold a very long queuing crowd. These unexpected and changeable restrictions prevent a tourist from visiting his most preferred target, so he has to choose his second most preferred one as his destination.

Considering this, we modify our model and add a queue drop feature to represent physical constraints of that kind. As a more interesting pavilion attracts more people and holds a much longer queue, our queue drop rate is simply defined as:

$$DR_i = \frac{Q_i}{W_{ij}(t)}$$

As node i's queue length 𝑄𝐿𝑖(𝑡) increases, the rising waiting time makes 𝐷𝑅𝑖 smaller. If 𝐷𝑅𝑖 of pavilion i is smaller than a threshold 𝑇𝐻_𝐷, the queue length limitation policy is enabled and newcomers are not allowed to enter the queue. We then run simulations with different values of 𝑇𝐻_𝐷 to show the impact of the policies that govern the tourists' behavior. The experiment is carried out using the 83 days of data described in Section 4.1.
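The queue drop rule can be sketched as follows. This is an illustration under stated assumptions: the numerator `q_i` is passed in as given (its definition comes from an earlier section), the tuple layout of `candidates` and the function names `drop_rate` and `choose_target` are hypothetical, and the fallback to the next best open pavilion follows the "second most preferred" behavior described in the text.

```python
def drop_rate(q_i, waiting_time):
    """Queue drop rate DR_i = Q_i / W(t). A non-positive waiting time is
    treated as 'no queue', so the rate is unbounded and never triggers a drop."""
    if waiting_time <= 0:
        return float("inf")
    return q_i / waiting_time


def choose_target(candidates, th_d):
    """Pick the highest-scoring pavilion whose queue is still open.

    candidates: list of (name, score, q_i, waiting_time) tuples.
    A queue is closed to newcomers when its drop rate is below th_d.
    Returns None when every queue is closed.
    """
    open_ones = [c for c in candidates if drop_rate(c[2], c[3]) >= th_d]
    if not open_ones:
        return None
    return max(open_ones, key=lambda c: c[1])[0]
```

Raising `th_d` closes long queues sooner, so more tourists are redirected to their next choice; this is the congestion-avoidance effect discussed with Figures 9 and 10.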

In Figure 9, the simulations with larger 𝑇𝐻_𝐷 have lower error rates than the ones with smaller queue drop rate thresholds. This suggests that relatively strong queuing limitations were practiced in Expo 2010. The limitation effectively functions as congestion avoidance at very popular pavilions, and it gives tourists a second opportunity to choose a visiting target with a shorter queue. Although the queue drop feature may dampen some tourists' enthusiasm, it helps to coordinate and balance the distribution of visitor flows among pavilions. Figure 10 shows that a carefully chosen threshold 𝑇𝐻_𝐷 on the queue drop rate can increase an individual's overall satisfaction.

Figure 8. Error rate till 22:00 over 83 days using different stop visiting thresholds TH_S. The higher threshold leads to bigger error rate, which shows the tourists are not likely to leave the area early.

Figure 9. Error rate till 22:00 over 83 days using different queue drop threshold TH_D. The higher threshold of drop rate achieves better result.

Figure 10. The individual satisfaction index till 22:00 over 83 days using different queue drop threshold TH_D. The high threshold on queue drop rate plays a role of congestion avoidance and thus improves the individual’s satisfaction.


5. CONCLUSIONS

A model of human flows is badly needed for social activities that involve large numbers of people in a restricted environment. In this paper, we make three major contributions to solving the problem:

1) We propose a framework to mathematically define the problem of human flows by modeling the spatial and temporal attributes of the environment as a topological network graph.

2) We develop a user behavior model and the STU algorithm to describe the user behavior of dynamic target determination. Within the above framework, the STU algorithm makes full use of attributes including subjective willingness, such as attraction index and individual preferences, as well as objective restrictions, such as waiting time and visiting distance.

3) We study our model and framework on the case of World Expo 2010 Shanghai China, a large-scale social activity that lasted 184 days and involved more than 73 million tourists. We explain some phenomena of the collective behavior of visitor flows in the Expo by analyzing the characteristics of our model.

In conclusion, our model captures the common principle of user behavior in dynamic target choosing, both for individuals and among a massive number of people.

6. ACKNOWLEDGMENTS

This work was partially supported by the National Basic Research Program of China (2010CB731400 and 2010CB731406), the Program of National Natural Science Foundation of China (No. 60902020), and STCSM (12DZ2272600).

7. REFERENCES

[1] Amedeo, D., Golledge, R., and Stimson, R. 2009. Person-environment-behavior research: investigating activities and experiences in spaces and environments. New York: Guilford Press.

[2] Espeter, M. and Raubal, M. 2009. Location-based decision support for user groups. Journal of Location Based Services. 3, 3 (Sep. 2009), 165-187.

[3] Jankowski, P. 2003. Toward a framework for research on geographic information-supported participatory decision-making. URISA Journal. (2003), 9-17.

[4] Jin, L. et al. 2010. Visitors’ Coordination for World Expo 2010 Shanghai. Systems Engineering. 6, (2010), 50-56.

[5] Kawamura, H. and Kurumatani, K. 2004. Modeling of Theme Park problem with multiagent for mass user support. Agent for Mass User Support. (2004), 48-69.

[6] Lindqvist, J. et al. 2011. I’m the mayor of my house: examining why people use foursquare - a social-driven location sharing application. Proceedings of the 2011 annual conference on Human factors in computing systems (New York, NY, USA, 2011), 2409-2418.

[7] Ohtani, Y. et al. 2010. Study on congestion reducing in the theme park. 2010 8th Asia-Pacific Symposium on Information and Telecommunication Technologies (APSITT) (2010), 1-6.

[8] Raper, J. et al. 2007. A critical evaluation of location based services and their potential. Journal of Location Based Services. 1, 1 (Mar. 2007), 5-45.

[9] Raubal, M. et al. 2004. User-Centred Time Geography for Location-Based Services. Geografiska Annaler, Series B: Human Geography. 86, 4 (Dec. 2004), 245-265.

[10] Rinner, C. 2008. Mobile Maps and More-Extending Location-Based Services with Multi-Criteria Decision Analysis. GEOGRAPHY PUBLICATIONS AND RESEARCH. (2008), 335-352.

[11] Wu, Y. 2002. Computational tools for measuring space-time accessibility within transportation networks with dynamic flow. Journal of Transportation and Statistics. 4, (2001), 1-14.

