WORKING DRAFT

Approximation Algorithm for Soft-Capacitated Connected Facility Location Problems

7th Israeli Network Seminar 2012

Prof. Danny Raz and Assaf Rappaport

17/05/2012

Data Centers Placement


Contents

▪ Data Centers

▪ Facility Location Problem

▪ Steiner Tree

▪ Connected Facility Location

▪ Google Case Study

Data centers are becoming the hosting platform for a wide spectrum of composite applications

1 Email services

2 Database services

3 File Servers

4 Collaboration tools

5 CRM (Customer Relationship Management)

6 ERP (Enterprise Resource Planning)

7 E-Commerce


▪ Data centers are used to run applications that handle the core business and operational data of organizations:

– SaaS – Software as a Service

– HaaS – Hardware as a Service

– PaaS – Platform as a Service

Examples of data center applications

In recent years, large investments have been made in massive data centers supporting cloud services


A list of companies that are running at least 50,000 servers

SOURCE: Data Center Knowledge (DCK)

With an increasing trend towards communication intensive applications, the bandwidth usage within and between data centers is rapidly growing


Data centers placement presents challenging optimization problems

1 Number of facilities

2 Location

3 Assignment

1 Graph with costs on edges

2 Set of locations where facilities may be placed

3 Set of demand nodes that must be assigned to an open facility

The goal is to optimally place the applications and their related data over the available infrastructure

Consider the following scenario:

▪ An email application in the cloud depends on an authentication service


▪ We consider the problem of placing replicas of the authentication servers at multiple locations in the data center


Replica placement deals with the actual number and network location of the replicas


▪ Having more replicas is more expensive, so we need to model the cost

▪ We would like to minimize the network distance between an application server and its closest replica, so having more replicas helps

▪ A replica must be synchronized with the original content server in order to supply reliable service

▪ The synchronization traffic across the network depends on the number of replicas deployed in the network, the topology of the update distribution, and the rate at which the content of the server is updated


The general uncapacitated facility location problem (1/2)

Input

▪ Set D of clients

▪ Set F of potential facility locations

▪ A distance function $d_{ij}$

▪ A cost function $f_j$

Description

▪ Set of potential facility sites where a facility can be opened

▪ Set of demand points D that must be serviced

▪ We want the facilities to be as efficient as possible, so we minimize the distance from each client to its closest facility

▪ Opening each facility can have a cost that must also be minimized; otherwise every point would be a facility

▪ Minimize the sum of distances, plus the sum of opening costs of the facilities

The general uncapacitated facility location problem (2/2)

Facility Location (FL) Problem: Open a subset of facilities and connect each customer to one facility at minimal cost

[Figure: customers D, facilities F, connection cost $d_{ij}$, facility opening cost $f_j$]
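The objective on this slide can be sketched in a few lines of Python. This is only an illustration: the function names `ufl_cost` and `ufl_brute_force` and the toy numbers are hypothetical, and the exhaustive search stands in for a real algorithm only on tiny instances.

```python
from itertools import combinations

def ufl_cost(open_facilities, opening_cost, dist):
    """Opening costs f_j plus each client's distance d_ij to its nearest open facility."""
    facility_part = sum(opening_cost[j] for j in open_facilities)
    assignment_part = sum(min(row[j] for j in open_facilities) for row in dist)
    return facility_part + assignment_part

def ufl_brute_force(opening_cost, dist):
    """Try every non-empty subset of facilities (exponential, tiny instances only)."""
    facilities = range(len(opening_cost))
    best = None
    for k in range(1, len(opening_cost) + 1):
        for subset in combinations(facilities, k):
            cost = ufl_cost(subset, opening_cost, dist)
            if best is None or cost < best[0]:
                best = (cost, subset)
    return best

# Hypothetical toy instance: 3 candidate facilities, 4 clients.
opening_cost = [4, 3, 6]          # f_j
dist = [                          # dist[i][j] = d_ij for client i, facility j
    [1, 5, 9],
    [2, 4, 8],
    [7, 1, 3],
    [8, 2, 1],
]
best_cost, best_set = ufl_brute_force(opening_cost, dist)
print(best_cost, best_set)
```

Opening every facility drives the assignment part down but pays all opening costs, which is the trade-off the description bullets refer to.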

Uncapacitated facility location problem - History

17th century: The Fermat-Weber Problem

The point minimizing the sum of distances to the sample points: given a set of m points $a_1, \dots, a_m$ and positive multipliers $w_1, \dots, w_m$, find a point $x$ that minimizes $\sum_{i=1}^{m} w_i \lVert x - a_i \rVert$

1960s: Plant location problem or warehouse location problem

Stollsteimer - 1963

Balinski and Wolfe - 1963

Kuehn and Hamburger - 1963

Manne - 1964

1997: Constant-factor approximation algorithm

Shmoys, Tardos and Aardal give a first polynomial-time algorithm that finds a solution within a factor of 3.16 of the optimal


Steiner Tree Problem

Input

▪ Given:

– An undirected weighted graph G(V,E)

– A set of nodes S (subset of V)

Output

▪ Find the minimum cost tree that spans the nodes in S

▪ Which is the Steiner tree for the green nodes?

▪ Shortest path tree doesn’t equal Steiner tree

[Figure: an example weighted graph with the nodes of S in green, shown three times with candidate trees; the edge weights are not recoverable from the transcript]
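As a rough illustration of a constant-factor Steiner tree approximation, the sketch below uses a classical approach: build the shortest-path metric over the graph and take a minimum spanning tree over the terminals only. Its cost is at most twice the optimal Steiner tree cost. The function names and the star-shaped example are hypothetical and not taken from the slides.

```python
import math

def metric_closure(n, edges):
    """All-pairs shortest path distances (Floyd-Warshall) for nodes 0..n-1."""
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def closure_mst_cost(dist, terminals):
    """Prim's algorithm on the terminals only, using shortest-path distances.

    Assumes distinct terminals in a connected graph.
    """
    terminals = list(terminals)
    in_tree = {terminals[0]}
    total = 0
    while len(in_tree) < len(terminals):
        w, t = min((dist[a][b], b)
                   for a in in_tree for b in terminals if b not in in_tree)
        in_tree.add(t)
        total += w
    return total

# Hypothetical example: node 0 is a non-terminal hub joined to terminals 1, 2, 3.
example_edges = [(0, 1, 1), (0, 2, 1), (0, 3, 1)]
closure = metric_closure(4, example_edges)
approx_cost = closure_mst_cost(closure, [1, 2, 3])
print(approx_cost)
```

In this example the optimal Steiner tree uses the hub and costs 3, while the terminal-only spanning tree pays 2 per terminal pair, so the bound it returns is looser than the optimum but within the factor of 2.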


Connected Facility Location

Input

▪ Given:

Graph $G=(V,E)$, costs $\{c_e\}$ on edges and a parameter $M \ge 1$

F : set of facilities

D : set of clients (demands)

Facility i has facility cost $f_i$

$c_{ij}$ : distance between i and j in V

We want to:

Pick a set A of facilities to open

Assign each demand j to an open facility i(j)

Connect all open facilities by a Steiner tree T

Cost = $\sum_{i \in A} f_i + \sum_{j \in D} c_{i(j)j} + M \sum_{e \in T} c_e$

= facility opening cost + client assignment cost + cost of connecting facilities

[Figure legend: client, facility, open facility, node, Steiner tree]
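To make the three cost terms concrete, here is a minimal sketch that evaluates the cost of a given solution. The function name `confl_cost`, the dictionary layout and the numbers are assumptions made for illustration; the code does not verify that the tree edges really connect the open facilities.

```python
def confl_cost(open_facilities, assignment, tree_edges,
               facility_cost, dist, edge_cost, M):
    """Facility opening cost + client assignment cost + M times the tree cost."""
    if any(i not in open_facilities for i in assignment.values()):
        raise ValueError("every client must be assigned to an open facility")
    opening = sum(facility_cost[i] for i in open_facilities)
    assigning = sum(dist[i][j] for j, i in assignment.items())
    connecting = M * sum(edge_cost[e] for e in tree_edges)
    return opening + assigning + connecting

# Hypothetical instance: two facilities, two clients, one tree edge.
facility_cost = {"a": 5, "b": 7}                      # f_i
dist = {"a": {"x": 1, "y": 4}, "b": {"x": 3, "y": 2}}  # dist[i][j] = c_ij
edge_cost = {("a", "b"): 6}                           # c_e
total = confl_cost(
    open_facilities={"a", "b"},
    assignment={"x": "a", "y": "b"},   # client j -> facility i(j)
    tree_edges=[("a", "b")],
    facility_cost=facility_cost,
    dist=dist,
    edge_cost=edge_cost,
    M=2,
)
print(total)
```

With M = 2 the connection term (12) matches the opening term (12) here, which shows how a larger M pushes solutions toward fewer, closer facilities.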

Soft-ConFL algorithm – the first deterministic constant approximation algorithm for the soft capacitated connected facility location problem

▪ ρ-approximation algorithm for the Uncapacitated Facility Location Problem

▪ μ-approximation algorithm for the minimum Steiner Tree Problem

▪ Facility costs ($f_j$): Add a cost λi to each facility. This cost is defined as twice the minimum cost of satisfying M units of demand from facility i.

▪ Distances ($d_{ij}$): Modify the distance function by adding: [term not preserved in the transcript]
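The added facility cost λi can be illustrated under one simplifying assumption that is not stated on this slide: every client has unit demand, so the cheapest way to satisfy M units from facility i is to serve its M nearest clients. The helper name `lambda_cost` is hypothetical.

```python
def lambda_cost(dist_from_facility, M):
    """Twice the cost of serving the M nearest clients (unit demands assumed)."""
    if M > len(dist_from_facility):
        raise ValueError("fewer than M clients available")
    nearest = sorted(dist_from_facility)[:M]
    return 2 * sum(nearest)

# Hypothetical distances from one facility to four unit-demand clients.
example_lambda = lambda_cost([4, 1, 3, 2], M=2)
print(example_lambda)
```

With non-uniform demands the M units would have to be filled greedily by distance, possibly splitting the last client, which this sketch does not handle.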

Deterministic constant approximation algorithm

Proof of lemma 1

[Figures for the proof are not preserved. Surviving labels: "Convert into a binary tree", "<M", "<M", "3M>"]


Google data centers

[Figures: Google data centers world wide; Google data centers in the USA; Google data centers in Europe]

▪ Google operates data centers in:

– 19 in the US

– 12 in Europe

– one in Russia

– one in South America

– 3 in Asia

▪ Not all of the locations are dedicated Google data centers

Google data centers – Case example

▪ 36 Google data centers

▪ How many replicas? Locations?

▪ Unified demand

▪ Unified cost

▪ Geographic distance
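One common reading of "geographic distance" is the great-circle distance between two sites. The slides do not say which formula was used, so the haversine sketch below is an assumption, as are the function name and the mean Earth radius constant.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Sanity check on a quarter of the equator rather than on real site coordinates.
quarter_equator = great_circle_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_equator, 1))
```

Evaluating this function for every pair of the 36 locations would give a symmetric distance matrix that can serve as the edge costs of the case-study graph.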

Google data centers: Greedy vs. CoFL

▪ Facility cost: 5,000-10,000

▪ Min SPT: 22,000

▪ Total demand: 36

[Figure panels: Greedy, CoFL]

Google data centers: Greedy vs. UFL vs. CoFL

▪ Facility cost: 5,000

▪ Min SPT: 22,000

▪ Total demand: 36

[Figure panels: Greedy, UFL, CoFL]

Google data centers: Greedy vs. UFL vs. CoFL

▪ Facility cost: 3,000

▪ Min SPT: 22,000

▪ Total demand: 36

[Figure panels: Greedy, UFL, CoFL; panel labels "4" and "5"]

Google data centers: Greedy vs. UFL vs. CoFL

▪ Facility cost: 3,000

▪ Min SPT: 22,000

▪ Total demand: 36

▪ Facility cost: 1,000

▪ Min SPT: 22,000

▪ Total demand: 36

[Figure panels: Greedy, UFL, CoFL]

CoFL

▪ Facility cost: 1,000

▪ Min SPT: 22,000

▪ Total demand: 36

[Chart: scale 2.80%, 5.60%, 8.30%, 11.10%, 13.90%. Legend: Mountain View, Calif.; Beijing; Portland, Oregon; Lenoir, North Carolina; Frankfurt, Germany; Pryor, Oklahoma; Mons, Belgium; Moscow, Russia; Sao Paulo, Brazil; Tokyo; Hong Kong; Atlanta, Ga. (two sites); Ashburn, Va.; Groningen, Netherlands; Other 22 Facilities]

[Table: one row per data center location, columns 2.8%, 5.6%, 8.3%, 11.1%, 13.9%; cell contents not preserved]

Mountain View, Calif.          

Pleasanton, Calif.          

San Jose, Calif.          

Los Angeles, Calif.          

Palo Alto, Calif.          

Seattle          

Portland, Oregon          

The Dalles, Oregon          

Chicago          

Atlanta, Ga. (two sites)          

Reston, Virginia          

Ashburn, Va.          

Virginia Beach, Virginia          

Houston, Texas          

Miami, Fla.          

Lenoir, North Carolina          

Goose Creek, South Carolina          

Pryor, Oklahoma          

Council Bluffs, Iowa          

Toronto, Canada          

Berlin, Germany          

Frankfurt, Germany          

Munich, Germany          

Zurich, Switzerland          

Groningen, Netherlands          

Mons, Belgium          

Eemshaven, Netherlands          

Paris          

London          

Dublin, Ireland          

Milan, Italy          

Moscow, Russia          

Sao Paulo, Brazil          

Tokyo          

Hong Kong          

Beijing          

CoFL

[Figure panels: Greedy, UFL, CoFL]

The Steiner tree problem is NP-hard

Reduction

We will show that a known NP-hard problem can be solved in polynomial time if the Steiner tree decision problem can be solved in polynomial time

Exact cover by 3-sets is NP-hard

$X = \{x_1, x_2, \dots, x_{3p}\}$

$C = \{C_1, C_2, \dots, C_q\}$

$C_i \subseteq X,\ |C_i| = 3,\ i = 1, \dots, q$

Is it possible to select mutually disjoint subsets such that their union is X?

[Figure: reduction graph with a node v, set nodes C1 to C4, and element nodes x1 to x10]
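The graph in the figure suggests the textbook reduction, sketched here under that assumption: join v to every set node, join each set node to its three elements, give all edges unit weight, and take v together with all elements as terminals. An exact cover then exists exactly when the cheapest Steiner tree has 4p edges. All names are hypothetical, and both checks are brute force for tiny instances only.

```python
from itertools import combinations

def build_reduction(num_elements, triples):
    """Unit-weight graph: v - C_k for every set, C_k - x_e for every e in C_k."""
    adj = {"v": set()}
    for e in range(num_elements):
        adj[("x", e)] = set()
    for k, triple in enumerate(triples):
        c = ("C", k)
        adj[c] = {"v"}
        adj["v"].add(c)
        for e in triple:
            adj[c].add(("x", e))
            adj[("x", e)].add(c)
    terminals = {"v"} | {("x", e) for e in range(num_elements)}
    return adj, terminals

def is_connected(adj, nodes):
    """Depth-first search restricted to the given node set."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def min_steiner_edges(adj, terminals):
    """Smallest connected node set containing the terminals; a spanning tree has len - 1 edges."""
    optional = [u for u in adj if u not in terminals]
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            nodes = terminals | set(extra)
            if is_connected(adj, nodes):
                return len(nodes) - 1
    return None

def has_exact_cover(num_elements, triples):
    """Direct check: some p triples list every element exactly once."""
    p = num_elements // 3
    target = list(range(num_elements))
    return any(sorted(e for t in chosen for e in t) == target
               for chosen in combinations(triples, p))

# Hypothetical instances with 6 elements (p = 2), so the threshold is 4p = 8 edges.
yes_triples = [(0, 1, 2), (3, 4, 5), (1, 2, 3)]
no_triples = [(0, 1, 2), (2, 3, 4), (1, 4, 5), (0, 3, 5)]
yes_edges = min_steiner_edges(*build_reduction(6, yes_triples))
no_edges = min_steiner_edges(*build_reduction(6, no_triples))
print(yes_edges, no_edges)
```

Every element needs its own edge to some set node (3p edges) and every used set node needs one edge to v, so a tree with 4p edges uses exactly p sets, which must then be disjoint.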