www.mobilab.lu
Assessing partial observability of link flow
inference problems at large scale networks
Francesco VITI & Marco Rinaldi - University of Luxembourg
ERC SCALE 2018 – 10-11 September 2018, Grenoble
Outline
Introduction
Network Sensor Location Problems
Full and partial observability concepts
A new metric for assessing partial observability
NSP metric
Ranking partial observability solutions
Examples
The impact of route set generation in large scale applications
Hypergraph
Exact and approximate solutions
Examples
Conclusions & future research directions
Introduction
Information from traffic sensors crucial to various applications, e.g.
State estimation
OD flow estimation
Model calibration
Traffic management
…
• Problem: obtaining full information is not realistic in real-sized networks
→ need to strategically position the (limited) number of sensors
The quality of traffic information depends on the number and type of
sensors and on where they are positioned
→ network sensor location problem (NSLP)
Network Sensor Location
Problem (NSLP)
Formal definition:
“Find an optimal number and location of sensors to
maximize the available information on a network”
Analogy with a sudoku game: different initial states (positions
of numbers) determine difficulty (estimation reliability) and
uniqueness (solution determinacy)
Need rules to solve them
Link-Route-OD relationships
Local consistency (e.g., flows at nodes)
The NSLP formulation
Determining link/route information based on simple algebraic relations
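Fundamental relationships used, linking OD flows f, route flows h and link flows v:
$$\sum_{r \in \mathcal{R}} \delta_{ar} h_r = v_a, \quad \forall a \in \mathcal{L}$$
$$\sum_{r \in \mathcal{R}} \rho_{wr} h_r = f_w, \quad \forall w \in \mathcal{W}$$
$$\begin{pmatrix} \boldsymbol{v} \\ \boldsymbol{f} \end{pmatrix} = \begin{pmatrix} \boldsymbol{\delta} \\ \boldsymbol{\rho} \end{pmatrix} \cdot \boldsymbol{h}$$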
Location rules
Mostly based on widely accepted statements, i.e.
Select those variables that can tell something about unknown variables
(information coverage)
Try and get as much flow information as possible (information capturing)
Don’t waste sensors to get information you have already (information
independence)
Location rules
Heuristic rules in the literature (Yang and Zhou, 1998; Larsson et al., 2010; Cipriani et al.,
2006; Yang et al., 2006; Gentili and Mirchandani, 2012; Castillo et al., 2014)
1. OD/route/link coverage
2. Maximum (link/route/OD) flow fraction
3. Maximum intercepting/net (link/route/OD) flow
4. Link/route independence
No general methodology exists that applies all rules at once
Two families of NSLP approaches (Gentili & Mirchandani, 2012)
Observability problems – exploiting topological supply relations
Flow estimation problems – exploiting demand flow relations
Observability vs. Flow
estimation
Observability problems
Need only relations between link,
route and OD variables (topological
characteristics)
Specifies coverage requirements
Set the positions of the known
numbers, in relation to the unknown
ones
Assess solution determinacy
Flow-estimation problems
Optimal solutions related to estimation
model adopted
Prior information necessary
Different rules to determine quality of
information a priori
Set the values of numbers in relation to
those unknown
Assess solution reliability
Full observability problem
Determine the minimum number/combination of sensors to be placed
such that all flow information is known
Obvious solution: equip all links with sensors
Better solution: exploit topological relationships to reduce the number of variables
that must be measured
[Figure: example network; link-flow relations among v1, v2, v3 and among v3, v4, v5]
Full observability problem
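Full observability: find a set of linearly independent variables such that
$$\boldsymbol{x}_d = \mathbf{P}\,\boldsymbol{x}_i$$
Different solution methods:
Pivoting (Castillo et al., 2008);
Gaussian Elimination (Hu et al., 2009);
Node-based (Ng, 2012);
Topological tree (He, 2013);
Non-planar Holes (Castillo et al., 2014).
Derive new relations $\mathbf{P}$ using matrix transformations
$$\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{pmatrix}$$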
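The link-route relation of this example can be checked numerically. The following is a minimal sketch in Python/NumPy, assuming the 5x4 incidence matrix shown on the slide; the route flow values in `h` are purely illustrative and do not come from the presentation.

```python
import numpy as np

# Link-route incidence matrix of the example: rows v1..v5, columns h1..h4.
A = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
])

# Hypothetical route flows (illustrative values, not from the slides).
h = np.array([10, 20, 30, 40])

v = A @ h                        # link flows implied by the route flows
rank = np.linalg.matrix_rank(A)  # number of linearly independent link flows

print(v)
print(rank)
```

In this sketch the rank is 3, so three suitably chosen link counts (for instance v1, v2 and v4) are enough to reconstruct all five link flows, since v3 = v1 + v2 and v5 = v3 - v4.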
Full observability: Pivoting
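Starting system:
$$\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{pmatrix}$$
Exchange $v_1$ with $h_1$:
$$\begin{pmatrix} h_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & -1 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ h_2 \\ h_3 \\ h_4 \end{pmatrix}$$
Exchange $v_2$ with $h_2$:
$$\begin{pmatrix} h_1 \\ h_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & -1 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ h_3 \\ h_4 \end{pmatrix}$$
Exchange $v_4$ with $h_3$:
$$\begin{pmatrix} h_1 \\ h_2 \\ v_3 \\ h_3 \\ v_5 \end{pmatrix} = \begin{pmatrix} 0 & -1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_4 \\ h_4 \end{pmatrix}$$
Problem 1: Full observability solutions are only theoretical in real-sized networks
(~60% of links need to be measured)
Problem 2: Exact solutions are not unique; they depend on the pivoting order (permutation-dependent)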
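The pivoting steps of this slide can be reproduced with a generic variable-exchange routine. This is a minimal sketch, not the authors' implementation: the function name `pivot` and the choice of pivot positions are assumptions made here to match the three exchanges shown (v1 with h1, v2 with h2, v4 with h3).

```python
import numpy as np

def pivot(M, i, j):
    """Exchange the i-th left-hand variable with the j-th right-hand variable.

    For a system y = M x, returns N such that y' = N x', where y' is y with
    y[i] replaced by x[j] and x' is x with x[j] replaced by y[i].
    """
    M = np.asarray(M, dtype=float)
    p = M[i, j]
    if np.isclose(p, 0.0):
        raise ValueError("pivot element is zero; variables cannot be exchanged")
    N = M - np.outer(M[:, j], M[i, :]) / p  # update of all non-pivot entries
    N[i, :] = -M[i, :] / p                  # pivot row
    N[:, j] = M[:, j] / p                   # pivot column
    N[i, j] = 1.0 / p                       # pivot element
    return N

A = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
])

step1 = pivot(A, 0, 0)      # v1 <-> h1
step2 = pivot(step1, 1, 1)  # v2 <-> h2
step3 = pivot(step2, 3, 2)  # v4 <-> h3
print(step3)
```

Choosing a different sequence of pivot positions leads to a different, equally valid final matrix, which is the permutation dependence noted in Problem 2.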
Research questions
Research questions:
Is there a way of assessing partial observability solutions?
What is the added value of placing or removing an extra sensor?
What if only a limited number of sensors is available for budget reasons?
What would be the # of sensors needed to guarantee an acceptable level of
under-determinedness?
Are full observability solutions all the same in terms of potential partial
observability solutions?
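$$(h_1, h_2, v_3, h_3, v_5)^T = \begin{pmatrix} 0 & -1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & 0 \end{pmatrix} (v_1, v_2, v_4, h_4)^T$$
$$(v_1, v_2, v_3, v_4, v_5)^T = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} (h_1, h_2, h_3, h_4)^T$$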
Partial observability
Goal: find an optimal* set of locations within the full observability
solution(s), in terms of partial observability
* minimizing the magnitude of missing information, while
* respecting given budget constraints (e.g. maximum number of sensors)
Why start from the full observability solution?
1. We can ensure that links with sensors are linearly independent
2. Solutions provide a smallest set of linearly independent variables
3. The search space of solutions is reduced significantly
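$$(h_1, h_2, v_3, h_3, v_5)^T = \begin{pmatrix} 0 & -1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & 0 \end{pmatrix} (v_1, v_2, v_4, h_4)^T$$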
Towards a new metric
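Rewrite $\boldsymbol{x}_d = \mathbf{P}\boldsymbol{x}_i$ into
$$\begin{pmatrix} \boldsymbol{x}_{indep} \\ \boldsymbol{x}_{dep} \end{pmatrix} = \begin{pmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{P} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_{indep} \\ \boldsymbol{x}_{dep} \end{pmatrix}$$
Subdivide the variables into those observed and those unobserved:
$$\begin{pmatrix} \boldsymbol{x}_{indep\,\&\,obs} \\ \boldsymbol{x}_{indep\,\&\,unobs} \\ \boldsymbol{x}_{dep} \end{pmatrix} = \begin{pmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{P} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_{indep} \\ \boldsymbol{x}_{dep} \end{pmatrix} = \begin{pmatrix} \mathbf{I}' & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{I}'' & \mathbf{0} \\ \mathbf{P}' & \mathbf{P}'' & \mathbf{0} \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_{indep\,\&\,obs} \\ \boldsymbol{x}_{indep\,\&\,unobs} \\ \boldsymbol{x}_{dep} \end{pmatrix}$$
Focus on the ‘size’ of the solution space of the reduced pivoted matrix and on its basic
dimensions, knowing that
The reduced matrix P’ has infinitely many possible solutions due to under-
determinedness
The basis B’ provides the ‘basic’ vectors characterizing the infinite solutions, i.e. any
combination 𝜶𝐁′ belongs to the Null space
We now want a metric that
gives a scalar value to the degree of under-determinedness
is normalized, i.e. it can be read as the ‘fraction of full information lost’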
Null-Space Metric of P, NSP
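• Characterize the Null Space, i.e. the set of vectors mapping to zero in the given
(sub)space → measures the degrees of freedom in the solution space
• Compare the extent of the basis of the Null space of the reduced matrix with respect to
the basis of the full matrix → normalization of the solution spaces
New metric for assessing partial observability solutions, based on the trace
function:
$$NSP = \frac{\left\| \begin{pmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{P} & \mathbf{0} \end{pmatrix}^{T} \mathbf{B}' \right\|_F}{\left\| \begin{pmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{P} & \mathbf{0} \end{pmatrix} \right\|_F}$$
where $\mathbf{B}'$ is the basis of the partial observability matrix, the transposed block matrix
projects onto the full set of independent variables, and the denominator is the extent of the full matrix
Frobenius norm ($\|\cdot\|_F$):
related to the trace of the matrix, i.e. the maximum error extent in the null-space
related to the singular values of the reduced matrix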
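The two properties of the Frobenius norm mentioned here, and the notion of a null-space basis, can be illustrated numerically. This is a small sketch on a hypothetical 2x3 matrix `M` (not taken from the slides), and it does not compute the NSP metric itself; `null_space_basis` is a helper name introduced only for this illustration.

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis of the null space of M, obtained from the SVD."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T  # columns span {x : M x = 0}

# Hypothetical under-determined system (2 equations, 3 unknowns).
M = np.array([[-1.0, 1.0, 1.0],
              [ 0.0, 1.0, 1.0]])

fro = np.linalg.norm(M, "fro")
via_trace = np.sqrt(np.trace(M.T @ M))                             # trace form
via_svd = np.sqrt(np.sum(np.linalg.svd(M, compute_uv=False) ** 2))  # singular values

B = null_space_basis(M)
print(fro, via_trace, via_svd)
print(B.shape)
```

All three norm computations agree, and every column of `B` is mapped to zero by `M`, which is the sense in which the basis describes the remaining degrees of freedom.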
Geometric interpretation
NSP metric related to the
maximum size of the prism
derived from the basis
[Figure: network with nodes A, B, C, D and links v1 to v5]
Greedy algorithm(s)
Questions
1. How to optimally identify sensor placements?
2. What is the added value of placing or removing an extra sensor?
3. Given n sensors, where should these be installed to get maximum
information?
Two local search algorithms introduced:
Add
starts from an empty set of observed variables and, at each step, adds the variable
whose observation yields the largest decrease in error
Remove
starts from the set observing all variables and, at each step, removes the single
variable whose removal causes the smallest loss of information
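The "Add" search can be sketched as a plain greedy loop. The code below is an illustration under stated assumptions: `missing_information` is a stand-in score invented for this example (it is NOT the NSP metric), and the matrix `P` encodes the relations l2 = l4 + l5 - l1 and l3 = l4 + l5 for independent variables (l1, l4, l5).

```python
import numpy as np

def missing_information(P, observed):
    """Stand-in score (NOT the NSP metric): share of the Frobenius norm of
    [I; P] carried by the columns of the still-unobserved independent variables."""
    n = P.shape[1]
    full = np.vstack([np.eye(n), P])
    unobserved = [k for k in range(n) if k not in observed]
    if not unobserved:
        return 0.0
    return float(np.linalg.norm(full[:, unobserved], "fro") / np.linalg.norm(full, "fro"))

def greedy_add(P, budget, score=missing_information):
    """'Add' local search: start with no sensors and repeatedly observe the
    independent variable that gives the lowest score."""
    observed, history = [], []
    for _ in range(min(budget, P.shape[1])):
        candidates = [k for k in range(P.shape[1]) if k not in observed]
        best = min(candidates, key=lambda k: score(P, observed + [k]))
        observed.append(best)
        history.append(score(P, observed))
    return observed, history

# Independent variables (l1, l4, l5); dependent rows: l2 and l3.
P = np.array([[-1.0, 1.0, 1.0],
              [ 0.0, 1.0, 1.0]])

observed, history = greedy_add(P, budget=3)
print(observed, history)
```

Any other scalar metric, including an NSP implementation, can be passed through the `score` argument; the "Remove" variant would run the same loop in reverse, starting from the full set.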
Proof of concept (1)
Test on simplified ‘candy’ network
Easier to illustrate properties
Has only two ‘pivot’ families to evaluate
‘Center’ family, i.e. l3 is observed
‘Star’ family, where the solution does not contain l3
[Figure: ‘candy’ network with nodes A, B, C, D and links v1 to v5]
Proof of concept (2)
Star
Max. estimation error, two sensors
     | Observed: l1 & l4  | Observed: l1 & l5  | Observed: l4 & l5
l1   | 0                  | 0                  | [−∞,∞]
l2   | 0 + [−∞,∞] − 0     | [−∞,∞] + 0 − 0     | 0 + 0 − [−∞,∞]
l3   | 0 + [−∞,∞]         | [−∞,∞] + 0         | 0
l4   | 0                  | [−∞,∞]             | 0
l5   | [−∞,∞]             | 0                  | 0
NSP  | 0.44               | 0.44               | 0.44
Max. estimation error, one sensor
     | Observed: l1         | Observed: l4         | Observed: l5
l1   | 0                    | [−∞,∞]               | [−∞,∞]
l2   | [−∞,∞] + [−∞,∞] − 0  | [−∞,∞] + 0 − [−∞,∞]  | [−∞,∞] + 0 − [−∞,∞]
l3   | [−∞,∞] + [−∞,∞]      | [−∞,∞] + 0           | 0 + [−∞,∞]
l4   | [−∞,∞]               | 0                    | [−∞,∞]
l5   | [−∞,∞]               | [−∞,∞]               | 0
NSP  | 0.79                 | 0.64                 | 0.64
Proof of concept (3)
Center
Max. estimation error, two sensors
     | Observed: l1 & l5  | Observed: l1 & l3  | Observed: l3 & l5
l1   | 0                  | 0                  | [−∞,∞]
l2   | 0 − [−∞,∞]         | 0                  | 0 − [−∞,∞]
l3   | [−∞,∞]             | 0                  | 0
l4   | 0 − [−∞,∞]         | 0 − [−∞,∞]         | 0
l5   | 0                  | [−∞,∞]             | 0
NSP  | 0.53               | 0.47               | 0.47
Max. estimation error, one sensor
     | Observed: l1       | Observed: l3       | Observed: l5
l1   | 0                  | [−∞,∞]             | [−∞,∞]
l2   | [−∞,∞] − 0         | 0 − [−∞,∞]         | [−∞,∞] − [−∞,∞]
l3   | [−∞,∞]             | 0                  | [−∞,∞]
l4   | [−∞,∞] − [−∞,∞]    | 0 − [−∞,∞]         | [−∞,∞] − 0
l5   | [−∞,∞]             | [−∞,∞]             | 0
NSP  | 0.8                | 0.69               | 0.8
Pivot “Families”: Ranking
Question: are pivots all the same to find optimal locations?
Answer: NO
And our NSP is able to provide different results for the same number of sensors
Singular Value Decomposition can be employed to rank pivots a priori
$\sum_i \sum_j s_{ij}$ as a measure of a pivot’s information content
• “Star” Family: 2.7979
• “Center” Family: 2.7321
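The two family scores can be reproduced with a short NumPy sketch. Two assumptions are made here and are not stated on the slide: the score is read as the sum of the singular values of the dependency matrix P, and the P matrices are derived from the relations v3 = v1 + v2 = v4 + v5 of the ‘candy’ network. The function name `information_content` is hypothetical.

```python
import numpy as np

# Star family (assumed): l1, l4, l5 independent; l2 = l4 + l5 - l1, l3 = l4 + l5.
P_star = np.array([[-1.0, 1.0, 1.0],
                   [ 0.0, 1.0, 1.0]])

# Center family (assumed): l1, l3, l5 independent; l2 = l3 - l1, l4 = l3 - l5.
P_center = np.array([[-1.0, 1.0,  0.0],
                     [ 0.0, 1.0, -1.0]])

def information_content(P):
    """Sum of the singular values of P (assumed reading of the ranking score)."""
    return float(np.sum(np.linalg.svd(P, compute_uv=False)))

star_score = information_content(P_star)
center_score = information_content(P_center)
print(round(star_score, 4))    # 2.7979
print(round(center_score, 4))  # 2.7321
```

Under these assumptions the computed values match the two scores quoted for the “Star” and “Center” families, and the ranking needs only P, not an explicit sensor placement.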
Pivot “Families”: different information
[Figure: comparison of the two pivot families (Rank: 2.7321 and Rank: 2.7979); annotations: “More informative”, “Faster error decrease”]
Observations
Tested different network sizes; it seems that
Ranking is consistently independent of the number of sensors, i.e. better ranked
families remain better ranked up to the full observability solution
Best families can already be identified from the first sensors placed!
[Figure: ‘Parallel’ network (14 links; 4 ODs)]
Test on large networks:
Rotterdam
Test on real-sized networks shows that
Most informative links are near nodes and centroids (good news!)
Nice spatial distribution (even if no such rule was imposed)
Not all full observability solutions contain the minimum number of sensors
(consistent with recent findings of Castillo, 2013)
[Figure: two sensor location solutions on the Rotterdam network, Rank: 266.27 and Rank: 342.74 (!!!)]
Rotterdam network: 476 links, 243 nodes, 1890 ODs
Conclusions (intermezzo)
A new definition of partial observability problems that accounts for
full and partially observed variables was introduced
A novel metric able to assess information quantity based on
analysis of the extent of the Null-Space (NSP)
Local search algorithms developed to explore the solution space,
to determine optimal sensor locations
The methodology is generic, as it requires only the fundamental
topological relations between links, routes and, possibly, OD flows
→ different applications and extensions possible
Solutions on large networks strongly depend on the route set
Node vs. Route based
approaches
Node-based relations
• Flow conservation at nodes
• Local, simple laws
• No route enumeration necessary
• Misses complete information on link-path-OD relations
→ solutions systematically require more sensors
Route-based relations
• Flow conservation on routes
• Connects links across whole network
• Needs some kind of route enumeration
Observability and route
information (1)
Different route set compositions yield different full observability
solutions
Routes may be selected because of behavioral/flow capturing rules
(e.g. shortest paths)
$A \in \{0,1\}^{|L| \times |R|}$ (link-route incidence matrix) built from the route set $R = \{r_{od}\}$; it determines $|dep|$ and $|indep|$
Hypothesis: more efficient full (and partial) observability solutions can
be found with efficient route set generation.
Observability and route
information (2)
Three different sorts of information
Non Redundant (linearly independent)
Redundant + Informative (linearly dependent, but useful to derive
independence relationships)
Pure Redundant (linearly dependent)
Strongly dependent on the chosen route set
$A_{|L| \times |R|} = [\, NR \mid RI \mid PR \,]$ (binary entries)
Example: parallel highway
network
«Parallel highway» network
Best full obs. route set: 4 routes, fully diverse
Best* partial obs. route set: 9 routes, highly overlapping
* Non-unique solution
Route set generation
Main idea:
Enumerate routes so as to obtain the maximum linearly independent route set
Evaluate resulting Partial Observability Solutions
Challenges:
High dimensionality, combinatorial problem
Non-uniqueness of solutions
Non-convex condition (linear independence)
Hypergraph approach (1)
Hypergraph
Express combinations of routes as vertices
Edges capture linear independence through bitwise logical operations
Example:
A = {v2, v3, v4} → (0,1,1,1,0)
B = {v1, v3, v4} → (1,0,1,1,0)
C = {v2, v3, v5} → (0,1,1,0,1)
Route combination A + B is independent iff:
A ∨ B ≠ A, A ∨ B ≠ B, with A ∨ B = (1,1,1,1,0)
[Figure: hypergraph with vertices A, B, C]
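The bitwise encoding of this example can be sketched in a few lines of Python. The OR reading of the combination is consistent with the vector (1,1,1,1,0) given for A and B; the eligibility test in `adds_information` (the combination must differ from both parents) is an assumption made for illustration, and both function names are hypothetical.

```python
# Link-incidence vectors over links v1..v5, as in the example.
A = (0, 1, 1, 1, 0)  # route A uses links v2, v3, v4
B = (1, 0, 1, 1, 0)  # route B uses links v1, v3, v4
C = (0, 1, 1, 0, 1)  # route C uses links v2, v3, v5

def combine(*routes):
    """Bitwise OR of link-incidence vectors: links covered by any route."""
    return tuple(int(any(bits)) for bits in zip(*routes))

def adds_information(r1, r2):
    """Assumed eligibility test: the combination differs from both parents."""
    union = combine(r1, r2)
    return union != r1 and union != r2

print(combine(A, B))           # (1, 1, 1, 1, 0)
print(adds_information(A, B))  # True
print(combine(A, B, C))        # (1, 1, 1, 1, 1)
```

Storing each vertex as a bit pattern keeps the pairwise checks cheap, which matters because the number of candidate combinations grows exponentially with the number of routes.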
Hypergraph approach (2)
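Independence condition:
[Equation and figure: independence condition for the triple A, B, C, expressed through the pairwise combinations A + B, B + C, A + C and the triple A + B + C; hypergraph with the corresponding vertices]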
3 equivalent solutions
[Figure: three equivalent solutions in the hypergraph over A, B, C, built from the pairwise combinations A + B, B + C, A + C]
Finding exact solutions (1)
Problem to be solved: constrained maximum clique in the hypergraph
Hypergraph with vertex set $V_h$ and binary adjacency matrix $A_g$ of size $|V_h| \times |V_h|$
Binary quadratic program over $x \in \{0,1\}^{|V_h|}$: minimize an objective built from $x^T Q_{HG}\, x$ and $f^T x$, subject to linear constraints in $P_{HG}$
Quadratic component $Q_{HG}$: captures adjacency
Linear component $f$: captures importance
Parenthood constraints $P_{HG}$
[Equations: definitions of $Q_{HG}$, $f$ and $P_{HG}$]
Finding exact solutions (2)
Exact solutions based on solving the max clique problem:
computationally intractable even for small networks!
Real challenge: building the hypergraph
$2^{R}$ nodes to be generated
$2^{R}(2^{R}-1)/2$ edges to be checked for eligibility
Quickly infeasible with realistic network sizes, akin to brute
forcing.
Full ‘candy’ example (1)
Exact solutions based on solving the max clique problem:
computationally intractable even for small networks!
$R = 8$, $|V_h| = 256$, $|L_h| = 32385$
Full ‘candy’ example (2)
Hypergraph generation:
heuristics
Vertex culling rules:
(Cul-1): Remove vertices if no new information is introduced w.r.t. parents
(Cul-2): Remove vertices if expected information is lower than best bound
Metaheuristics used to find efficient solutions (e.g. GA)
[Figure: hypergraph culling example with vertices A, B, C, D and their combinations]
Impact of culling rules
Cul-1 only: solution « BCH »
Cul-1 + Cul-2: solution « BC »
Parallel Highways example
Hypergraph statistics
# Vertices: 3583
# Arcs: 5991142
Solution statistics
# Vertices in max clique: 405
Final route set size [ind, full coverage]: 6, 6
Tot. memory usage (MB) [RAM + Nodefiles]: 217.77 + 278.7
Comp. time (s): 3600+
% Gap: 4.11
Test on large scale networks:
Route set generation methods
Yen’s K-Shortest path
Free parameter: k, how many paths per O/D
Intuition: higher k -> higher information content
Intuition 2: an upper bound to k must exist, after which adding routes to the
route set will only bring redundant information*
Enumerate according to Yen’s K-Shortest paths
Extra check: if the newly added path is NOT independent, discard it
Stop when k is met or no more independent paths can be found
Independence check: performed through matrix rank (can be computationally
cumbersome)
Castillo’s (2015) Independent paths
Enumerates routes such that all routes (and combinations thereof) are
independent from one another
Higher degree of information wrt. randomly chosen k-shortest paths
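The rank-based independence check described for the k-shortest-path enumeration can be sketched as follows. This is a minimal illustration, not the authors' code: the path enumeration itself is left out, `keep_independent` is a hypothetical helper, and the candidate routes are the four columns of the small 5-link example network used earlier in the talk.

```python
import numpy as np

def keep_independent(candidate_routes):
    """Scan routes in order (e.g. as produced by a k-shortest-path enumeration)
    and keep a route only if it raises the rank of the link-route matrix."""
    kept, rank = [], 0
    for idx, route in enumerate(candidate_routes):
        columns = [candidate_routes[k] for k in kept] + [route]
        trial = np.array(columns, dtype=float).T  # links x routes
        new_rank = np.linalg.matrix_rank(trial)
        if new_rank > rank:
            kept.append(idx)
            rank = new_rank
    return kept

# Link-incidence vectors over links v1..v5 for routes h1..h4.
routes = [
    (1, 0, 1, 1, 0),
    (0, 1, 1, 1, 0),
    (1, 0, 1, 0, 1),
    (0, 1, 1, 0, 1),
]

kept = keep_independent(routes)
print(kept)  # indices of the routes retained
```

Here the fourth route is discarded because it equals h2 + h3 - h1. Recomputing the rank for every candidate is what makes this check costly on large networks; an incremental factorization would be one way to reduce that cost.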
Sioux Falls example
Generated solution statistics | Solution statistics
Columns: # Vertices | Tot. GA Generations | Final route set size | Tot. memory usage (MB) [RAM] | Comp. time (s)
Values: 37 165 - 37 0.72 281.9
Rotterdam example
Generated solution statistics | Solution statistics
Columns: # Vertices | Tot. GA Generations | Final route set size | Tot. memory usage (MB) [RAM] | Comp. time (s)
Values: 144 541 - 150 98.26 12259.1
Rotterdam network: 476 links, 243 nodes, 1890 ODs
Vulnerability analysis
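Network topology $G(N, L)$
→ route set generation policy (KSP, KISP, C1, MI) → link-route matrix $A \in \mathbb{R}^{|L| \times |R|}$, one per policy
→ Pivoting → $P \in \mathbb{R}^{|indep| \times |dep|}$, one per policy
→ Random sensor failure → $P' \in \mathbb{R}^{(|indep| - |fail|) \times |dep|}$, one per policy
→ Loss of information, measured by
$$NSP = \frac{\| \mathbf{P}^{T} \mathbf{B}' \|_F}{\| \mathbf{P} \|_F}, \qquad \mathbf{B}' = \mathrm{null}(\mathbf{P}')$$
Hypothesis: different full (and partial) observability solutions bear
different levels of resilience to sensor failure.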
Test Results (1)
Information level for different route set generation policies at 50 sensors
[Figure: $NSP(P'_{KISP})$, $NSP(P'_{KSP})$, $NSP(P'_{C1})$ and $NSP(P'_{MI})$]
Test Results (2)
Loss of information upon 10% sensor failure (100 draws, 5x ~U(0,50))
[Figure: results expressed in terms of NSP]
Conclusions
Different full observability solutions indeed bear different resilience to
sensor failure
Max. Independent Route Set policy
Achieves highest information content and density
Interestingly, also yields most resilient solution upon sensor failure
Future research topics:
Mixing different types of sensors (scanned links, FCDs,…)
Topology to data: impact on flow estimation techniques, comparison
between topology prediction and data validation
Potential for combination with flow capturing methods to be explored,
especially for state estimation
From observability to controllability…
References
Viti F., Rinaldi M., Corman F., Tampére C.M.J. (2014). Assessing Partial
Observability in Network Sensor Location Problems. Transportation
Research Part B: Methodological, Vol. 70, pp. 65-89.
Rinaldi M., Viti F. (2017). Exact and Approximate Route Set Generation for
Resilient Partial Observability in Sensor Location Problems. Transportation
Research Part B: Methodological, Vol. 105, pp. 86-119.
Thank you!
{francesco.viti,marco.rinaldi}@uni.lu