WORKING DRAFT
Approximation Algorithm for Soft-Capacitated Connected Facility Location Problems
7th Israeli Network Seminar 2012
Prof. Danny Raz and Assaf Rappaport
17/05/2012
Data Centers Placement
Contents
▪ Data Centers
▪ Facility Location Problem
▪ Steiner Tree
▪ Connected Facility Location
▪ Google Case Study
Data centers are becoming the hosting platform for a wide spectrum of composite applications
▪ Data centers are used to run applications that handle the core business and operational data of organizations:
– SaaS – Software as a Service
– HaaS – Hardware as a Service
– PaaS – Platform as a Service
Examples of data center applications
1 Email services
2 Database services
3 File Servers
4 Collaboration tools
5 CRM (Customer Relationship Management)
6 ERP (Enterprise Resource Planning)
7 E-Commerce
In recent years, large investments have been made in massive data centers supporting cloud services
A list of companies that are running at least 50,000 servers
SOURCE: Data Center Knowledge (DCK)
With an increasing trend towards communication-intensive applications, the bandwidth usage within and between data centers is rapidly growing
Data centers placement presents challenging optimization problems
Decisions:
1 Number of facilities
2 Location
3 Assignment
Given:
1 Graph with costs on edges
2 Set of locations where facilities may be placed
3 Set of demand nodes that must be assigned to an open facility
The goal is to optimally place the applications and their related data over the available infrastructure
Consider the following scenario:
▪ An email application in the cloud depends on an authentication service
▪ We consider the problem of placing replicas of the authentication servers at multiple locations in the data center
Replica placement deals with the actual number and network location of the replicas
▪ Having more replicas is more expensive, so we need to model the cost
▪ We would like to minimize the network distance between an application server and its closest replica, so having more replicas helps
▪ A replica must be synchronized with the original content server in order to supply reliable service
▪ The synchronization traffic across the network depends on the number of replicas deployed in the network, the topology of the distributed update and the rate of updates in the content of the server
The general uncapacitated facility location problem (1/2)
Input
▪ Set D of clients
▪ Set F of potential facility locations
▪ A distance function (between clients and facilities)
▪ A cost function (opening cost of each facility)
Output
▪ A subset of facilities to open and an assignment of each client to an open facility
Description
▪ Set of potential facility sites where a facility can be opened
▪ Set of demand points D that must be serviced
▪ We want the facilities to be as efficient as possible, so we want to minimize the distance from each client to its closest facility
▪ There can be a cost associated with creating each facility that must also be minimized; otherwise all points would be facilities
▪ Minimize the sum of distances plus the sum of opening costs of the facilities
The general uncapacitated facility location problem (2/2)
▪ Customers D
▪ Facilities F
▪ d_ij: distance between customer i and facility j
▪ f_j: opening cost of facility j
Facility Location (FL) Problem: Open a subset of facilities and connect each customer to one open facility at minimal cost
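The problem statement above can be made concrete with a minimal Python sketch. It is a brute-force enumeration over all facility subsets, intended only for tiny instances and not the approximation algorithm discussed in these slides. The function name `solve_ufl_brute_force` and the example numbers are assumptions made for illustration.

```python
from itertools import combinations

def solve_ufl_brute_force(open_cost, dist):
    """Exhaustive search for a tiny uncapacitated facility location instance.

    open_cost[j] is the opening cost f_j of facility j.
    dist[i][j] is the distance d_ij from customer i to facility j.
    Returns (total cost, opened facilities, facility chosen per customer).
    """
    n_fac = len(open_cost)
    best_cost, best_open = float("inf"), None
    for k in range(1, n_fac + 1):
        for subset in combinations(range(n_fac), k):
            opening = sum(open_cost[j] for j in subset)
            # Each customer connects to its closest open facility.
            service = sum(min(row[j] for j in subset) for row in dist)
            if opening + service < best_cost:
                best_cost, best_open = opening + service, subset
    assignment = [min(best_open, key=lambda j: row[j]) for row in dist]
    return best_cost, best_open, assignment

# Hypothetical instance: 3 candidate facilities, 4 customers.
open_cost = [4, 3, 10]
dist = [
    [1, 5, 2],
    [2, 4, 2],
    [6, 1, 2],
    [7, 2, 1],
]
print(solve_ufl_brute_force(open_cost, dist))  # (13, (0, 1), [0, 0, 1, 1])
```

In this instance, opening facilities 0 and 1 costs 7 and the connections cost 6, which beats both opening a single facility and opening all three.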
Uncapacitated facility location problem - History
17th century: The Fermat-Weber Problem
▪ The point minimizing the sum of distances to the sample points
▪ Given a set of m points $a_1, \dots, a_m$ and positive multipliers $w_1, \dots, w_m$, find a point $x$ that minimizes $\sum_{i=1}^{m} w_i \lVert x - a_i \rVert$
1960s: Plant location problem or warehouse location problem
▪ Stollsteimer - 1963
▪ Balinski and Wolfe - 1963
▪ Kuehn and Hamburger - 1963
▪ Manne - 1964
1997: Constant-factor approximation algorithm
▪ Shmoys, Tardos and Aardal gave the first polynomial-time algorithm that finds a solution within a factor of 3.16 of the optimum
Steiner Tree Problem
Input
▪ Given:
– An undirected weighted graph G(V,E)
– A set of nodes S (subset of V)
Output
▪ Find the minimum cost tree that spans the nodes in S
▪ Which is the Steiner tree for the green nodes?
▪ Shortest path tree doesn’t equal Steiner tree
[Figure: weighted example graph with the nodes of S in green, contrasting the shortest path tree with the Steiner tree]
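To illustrate the remark that a shortest path tree need not be a Steiner tree, here is a small Python sketch. The graph, its weights, and all function names are assumptions made for this example and are not the graph drawn on the slide. The exact Steiner cost is found by brute force (try every set of extra nodes and take a minimum spanning tree), which is only practical for tiny graphs.

```python
import heapq
from itertools import combinations

# Hypothetical graph: direct links from root "a" cost 3, links via hub "s" cost 2.
EDGES = {
    ("a", "b"): 3, ("a", "c"): 3, ("a", "d"): 3,
    ("s", "a"): 2, ("s", "b"): 2, ("s", "c"): 2, ("s", "d"): 2,
}
TERMINALS = {"a", "b", "c", "d"}  # the set S that must be spanned

def shortest_path_tree_cost(edges, root, terminals):
    """Cost of the union of shortest paths from root to every terminal (Dijkstra)."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, parent, heap = {root: 0}, {}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    used = set()
    for t in terminals:
        while t != root:
            used.add(frozenset((t, parent[t])))
            t = parent[t]
    return sum(w for (u, v), w in edges.items() if frozenset((u, v)) in used)

def mst_cost(edges, nodes):
    """Kruskal on the subgraph induced by nodes; None if that subgraph is disconnected."""
    comp = {v: v for v in nodes}
    def find(v):
        while comp[v] != v:
            v = comp[v]
        return v
    cost, joined = 0, 0
    for (u, v), w in sorted(edges.items(), key=lambda e: e[1]):
        if u in comp and v in comp and find(u) != find(v):
            comp[find(u)] = find(v)
            cost, joined = cost + w, joined + 1
    return cost if joined == len(nodes) - 1 else None

def steiner_tree_cost(edges, terminals):
    """Exact Steiner tree cost by trying every subset of non-terminal nodes."""
    others = sorted({v for e in edges for v in e} - terminals)
    best = float("inf")
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            cost = mst_cost(edges, terminals | set(extra))
            if cost is not None:
                best = min(best, cost)
    return best

print(shortest_path_tree_cost(EDGES, "a", TERMINALS))  # 9
print(steiner_tree_cost(EDGES, TERMINALS))             # 8
```

Every terminal is closer to the root over its direct link (3 versus 4 through the hub), so the shortest path tree costs 9, while routing everything through the extra node `s` gives a tree of cost 8.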
Connected Facility Location
Input
▪ Given:
– Graph G=(V,E), costs {c_e} on edges and a parameter M ≥ 1
– F : set of facilities
– D : set of clients (demands)
– Facility i has facility cost f_i
– c_ij : distance between i and j in V
We want to:
▪ Pick a set A of facilities to open
▪ Assign each demand j to an open facility i(j)
▪ Connect all open facilities by a Steiner tree T
Cost = $\sum_{i \in A} f_i + \sum_{j \in D} c_{i(j)j} + M \sum_{e \in T} c_e$
= facility opening cost + client assignment cost + cost of connecting facilities
[Figure legend: client, facility, open facility, node, Steiner tree]
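As a worked illustration of the connected facility location objective (facility opening cost, plus client assignment cost, plus M times the cost of the tree connecting the open facilities), the following Python sketch evaluates a given solution. The function `confl_cost`, the data layout, and the numbers are hypothetical. The sketch only adds up the three cost terms and does not verify that the supplied edges really form a tree spanning the open facilities.

```python
def confl_cost(facility_cost, dist, edge_cost, M, open_set, assignment, tree_edges):
    """Evaluate the ConFL objective for one candidate solution.

    facility_cost[i]: opening cost f_i
    dist[i][j]:       distance c_ij from facility i to client j
    edge_cost[e]:     cost c_e of edge e
    open_set:         set A of open facilities
    assignment[j]:    open facility i(j) serving client j
    tree_edges:       edges of the tree T connecting the open facilities
    """
    for j, i in assignment.items():
        if i not in open_set:
            raise ValueError(f"client {j} is assigned to closed facility {i}")
    opening = sum(facility_cost[i] for i in open_set)
    service = sum(dist[i][j] for j, i in assignment.items())
    connection = M * sum(edge_cost[e] for e in tree_edges)
    return opening + service + connection

# Hypothetical instance: two facilities, three clients, one connecting edge.
facility_cost = {"f1": 5, "f2": 4}
dist = {
    "f1": {"c1": 1, "c2": 2, "c3": 6},
    "f2": {"c1": 5, "c2": 3, "c3": 1},
}
edge_cost = {("f1", "f2"): 3}
M = 2

both = confl_cost(facility_cost, dist, edge_cost, M, {"f1", "f2"},
                  {"c1": "f1", "c2": "f1", "c3": "f2"}, [("f1", "f2")])
single = confl_cost(facility_cost, dist, edge_cost, M, {"f1"},
                    {"c1": "f1", "c2": "f1", "c3": "f1"}, [])
print(both, single)  # 19 14
```

With M = 2, opening both facilities costs 9 + 4 + 6 = 19, while serving everyone from `f1` costs 5 + 9 + 0 = 14: the connection term can make fewer facilities preferable even when assignment distances grow.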
Soft-ConFL algorithm – the first deterministic constant-factor approximation algorithm for the soft-capacitated connected facility location problem
Deterministic constant-factor approximation algorithm
▪ ρ-approximation algorithm for the Uncapacitated Facility Location Problem
▪ μ-approximation algorithm for the minimum Steiner Tree Problem
▪ f_j: add a cost λ_i to each facility. This cost is defined as twice the minimum cost of satisfying M units of demand from facility i.
▪ d_ij: modify the distance function by adding [term lost in extraction]
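The extra facility cost λ_i is described as twice the minimum cost of satisfying M units of demand from facility i. One possible reading, sketched below in Python, serves the M cheapest units of demand greedily by distance. This is an interpretation rather than the authors' definition: the function name `lambda_cost`, the per-client demand list, and the assumption that a client's demand may be served partially are all introduced here for illustration.

```python
def lambda_cost(distances, demands, M):
    """Twice the cheapest cost of serving M units of demand from one facility.

    distances[j]: distance from the facility to client j
    demands[j]:   units of demand at client j
    Assumption: demand may be split, so the nearest clients are served first
    until exactly M units are covered.
    """
    remaining, cost = M, 0.0
    for d, units in sorted(zip(distances, demands)):
        served = min(units, remaining)
        cost += served * d
        remaining -= served
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("total demand is below M")
    return 2 * cost

# Unit demands: the three nearest clients are at distances 1, 2 and 3.
print(lambda_cost([4, 1, 3, 2], [1, 1, 1, 1], M=3))  # 12.0
```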
Proof of lemma 1
▪ Convert into a binary tree
[Figure: tree with parts labeled <M, <M, <M and 3M>]
Google data centers
[Figure panels: Google data centers worldwide; Google data centers in the USA; Google data centers in Europe]
▪ Google operates data centers worldwide:
– 19 in the US
– 12 in Europe
– one in Russia
– one in South America
– 3 in Asia
▪ Not all of the locations are dedicated Google data centers
Google data centers – Case example
▪ 36 Google data centers
▪ How many replicas? Locations?
▪ Unified demand
▪ Unified cost
▪ Geographic distance
Google data centers: Greedy vs. CoFL
▪ Facility cost: 5,000-10,000
▪ Min SPT: 22,000
▪ Total demand: 36
[Figure panels: Greedy; CoFL]
Google data centers: Greedy vs. UFL vs. CoFL
▪ Facility cost: 5,000
▪ Min SPT: 22,000
▪ Total demand: 36
[Figure panels: Greedy; UFL; CoFL]
Google data centers: Greedy vs. UFL vs. CoFL
▪ Facility cost: 3,000
▪ Min SPT: 22,000
▪ Total demand: 36
[Figure panels: Greedy; UFL; CoFL]
Google data centers: Greedy vs. UFL vs. CoFL
▪ Facility cost: 1,000
▪ Min SPT: 22,000
▪ Total demand: 36
[Figure panels: Greedy; UFL; CoFL]
CoFL
[Figure: CoFL result chart; scale marks 2.8%, 5.6%, 8.3%, 11.1%, 13.9%; legend: Mountain View, Calif.; Beijing; Portland, Oregon; Lenoir, North Carolina; Frankfurt, Germany; Pryor, Oklahoma; Mons, Belgium; Moscow, Russia; Sao Paulo, Brazil; Tokyo; Hong Kong; Atlanta, Ga. (two sites); Ashburn, Va.; Groningen, Netherlands; Other 22 Facilities]
Data center locations listed on the chart:
Mountain View, Calif.
Pleasanton, Calif.
San Jose, Calif.
Los Angeles, Calif.
Palo Alto, Calif.
Seattle
Portland, Oregon
The Dalles, Oregon
Chicago
Atlanta, Ga. (two sites)
Reston, Virginia
Ashburn, Va.
Virginia Beach, Virginia
Houston, Texas
Miami, Fla.
Lenoir, North Carolina
Goose Creek, South Carolina
Pryor, Oklahoma
Council Bluffs, Iowa
Toronto, Canada
Berlin, Germany
Frankfurt, Germany
Munich, Germany
Zurich, Switzerland
Groningen, Netherlands
Mons, Belgium
Eemshaven, Netherlands
Paris
London
Dublin, Ireland
Milan, Italy
Moscow, Russia
Sao Paulo, Brazil
Tokyo
Hong Kong
Beijing
The Steiner tree problem is NP-hard
Reduction
We will show that a known NP-hard problem can be solved in polynomial time if the Steiner tree decision problem can be solved in polynomial time
Exact cover by 3-sets is NP-hard
$X = \{x_1, x_2, \dots, x_{3p}\}$
$C = \{C_1, C_2, \dots, C_q\}$
$C_i \subseteq X$, $|C_i| = 3$, $i = 1, \dots, q$
Is it possible to select mutually disjoint subsets from C such that their union is X?
[Figure: reduction graph with a node v, set nodes C1 to C4 and element nodes x1 to x10]
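The reduction can be sketched in Python as follows. The layered graph (a node v joined to every set node, and every set node joined to its three elements) follows the structure suggested by the figure labels. The unit edge costs and the budget of 4p are the textbook form of this reduction and are an assumption here, since the slide text does not state them. The brute-force `has_exact_cover` is included only to check tiny instances, and all names and the example instance are hypothetical.

```python
from itertools import combinations

def has_exact_cover(X, C):
    """Brute force: can p = |X|/3 triples from C cover X exactly?"""
    if len(X) % 3 != 0:
        return False
    p = len(X) // 3
    for family in combinations(C, p):
        # p triples whose union has all 3p elements must be pairwise disjoint.
        if set().union(*family) == set(X):
            return True
    return False

def steiner_instance(X, C):
    """Build the Steiner tree decision instance for an exact-cover-by-3-sets input.

    Edges (all of cost 1, by assumption): v - C_j for every triple,
    and C_j - x for every element x of C_j.
    Terminals: v and every element of X.
    Budget: 4p (p edges from v to chosen triples plus 3p edges to elements).
    """
    edges = {}
    for j, triple in enumerate(C, start=1):
        edges[("v", f"C{j}")] = 1
        for x in triple:
            edges[(f"C{j}", x)] = 1
    terminals = {"v"} | set(X)
    budget = 4 * (len(X) // 3)
    return edges, terminals, budget

# Hypothetical instance with p = 2 and q = 4.
X = ["x1", "x2", "x3", "x4", "x5", "x6"]
C = [("x1", "x2", "x3"), ("x2", "x3", "x4"), ("x4", "x5", "x6"), ("x1", "x5", "x6")]

edges, terminals, budget = steiner_instance(X, C)
print(has_exact_cover(X, C), len(edges), budget)  # True 16 8
```

Under these assumptions, a tree spanning the terminals needs at least one edge per element (3p) and at least p set nodes attached to v, so a tree of cost 4p exists exactly when p disjoint triples cover X. Here the first and third triples do.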