Page 1

Network-Aware Query Processing for Stream-based Applications

Yanif Ahmad, Ugur Cetintemel - Brown University

VLDB 2004

Page 2

One-line Comment
This paper addresses the operator placement problem in distributed query processing by using network latency information.

Page 3

Contents
Motivation
Problem
Solution Approach
Central Version of Algorithm
  Edge
  Edge+
  In-Network
  Latency-Constrained
Distributed Version of Algorithm
Experiment
Critique

Page 4

Motivation
Small-scale query processing systems are not scalable
  Many data streams and query requests
Widely distributed query processing

Page 5

Problem
Operator placement problem
  The operators of a query processing tree should be dispersed into the network

[Figure: a processing tree (query plan) with operators O00, O10-O11, and O20-O26, and the same operators placed on nodes of an IP network; legend: operator node, application node]

Page 6

Problem: Formalized Version
Operator placement problem
  For efficient operator placement
  Cost: bandwidth

Given a processing tree $T = (O, A)$ and a network $G = (V, E)$, find a mapping $\mu : O \to V$ of operators to nodes that achieves

$$\min_{\mu} \sum_{a \in A} c(a)$$

O: operators
A: their connected inputs and outputs
V: nodes
E: their links
c(): link cost (bandwidth)

• $c(a) = 0$ if $\mu(m) = \mu(n)$ for $a = (m, n)$
• Source (operator) locations are determined in advance

[Figure: an arc a = (m, n) between operators m and n placed at nodes $\mu(m)$ and $\mu(n)$, with cost c(a)]
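The objective on this slide (sum the link cost over all arcs, counting zero for an arc whose two operators share a node) can be sketched in Python. This is an illustrative sketch rather than code from the paper: `placement_cost`, `edges`, `placement`, and `rates` are hypothetical names, and the three-operator tree and its numbers are made up for the example.

```python
from typing import Callable, Dict, Iterable, Tuple

Arc = Tuple[str, str]  # (m, n): an arc between two operators in the tree


def placement_cost(
    edges: Iterable[Arc],
    placement: Dict[str, str],
    link_cost: Callable[[str, str], float],
) -> float:
    """Sum c(a) over all arcs; an arc costs 0 when both ends share a node."""
    total = 0.0
    for m, n in edges:
        if placement[m] != placement[n]:
            total += link_cost(m, n)
    return total


# Hypothetical tree: O00 has two children, O10 and O11.
edges = [("O00", "O10"), ("O00", "O11")]
rates = {("O00", "O10"): 30.0, ("O00", "O11"): 50.0}  # assumed per-arc costs
placement = {"O00": "N1", "O10": "N1", "O11": "N2"}   # assumed node names

cost = placement_cost(edges, placement, lambda m, n: rates[(m, n)])
```

Only the arc to O11 crosses the network here, so only its cost is counted. The link-cost function is passed in as a parameter because its definition differs between the algorithm variants.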

Page 7

Solution Approach
Network-aware operator placement algorithms
  Edge: considers only the sources and the proxy location
  Edge+: Edge with pairwise server communication latencies
  In-Network: sources, proxy, and a subset of all locations
  Latency-bound algorithm

Page 8

Contents
Motivation
Problem
Solution Approach
Central Version of Algorithm
Distributed Version of Algorithm
Experiment
Critique

Page 9

Algorithm Design Principle
Naïve algorithm for operator placement
  Evaluate every possible combination of operator-to-location mappings => too complex
Greedy algorithm
  Evaluate only the locations that are likely to be good placements
  Place operators in post-order
  When an operator is placed at a location, its children can be moved as well

[Figure: processing tree with operators O00, O10-O11, and O20-O26, and an IP network with sources S0 and S1; legend: operator node, application node]

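Page 10

Mapping Function

$$\mu(o) = \arg\min_{v \in \Phi(o)} \delta(v, o)$$

$$\delta(v, o) = \sum_{\{c_i : \mu(c_i) = v\}} \delta(v, c_i) + \sum_{\{c_i : \mu(c_i) \neq v\}} \min\{\delta(\mu(c_i), c_i) + c(o, c_i),\ \delta(v, c_i)\}$$

Here $\mu(o)$ is the location assigned to operator $o$, $\Phi(o)$ is its candidate location set, $c_i$ are the children of $o$, and $\delta(v, o)$ is the cost of the subtree rooted at $o$ when $o$ is placed at $v$.

[Figure: processing tree with root O, children O10-O12, and leaves O20-O29]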
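The greedy post-order recursion can be sketched in Python as follows. This is one reading of the mapping function, not the authors' code. For each child, the operator either leaves it where it is and pays the link cost, or co-locates it and pays nothing for the link. All names (`greedy_place`, `delta`, `commit`, `rate`, `candidates`) are hypothetical. Leaves are assumed to be pinned to their source nodes, the link cost of an arc is assumed to be a fixed per-arc value, and the small tree is made up for the example.

```python
import math
from typing import Dict, List, Tuple


def greedy_place(
    root: str,
    children: Dict[str, List[str]],
    placement: Dict[str, str],            # pre-filled with leaf locations
    rate: Dict[Tuple[str, str], float],   # assumed link cost per (parent, child)
    candidates: List[str],
) -> float:
    """Place operators bottom-up; mutates `placement`, returns the total cost."""

    def delta(v: str, o: str) -> float:
        kids = children.get(o, [])
        if not kids:
            # Assumption: a leaf cannot leave its source node.
            return 0.0 if placement[o] == v else math.inf
        total = 0.0
        for c in kids:
            moved = delta(v, c)  # cost if the child sits at v with o
            if placement[c] == v:
                total += moved
            else:
                stay = delta(placement[c], c) + rate[(o, c)]
                total += min(stay, moved)
        return total

    def commit(v: str, o: str) -> None:
        placement[o] = v
        for c in children.get(o, []):
            if placement[c] != v and delta(v, c) < delta(placement[c], c) + rate[(o, c)]:
                commit(v, c)  # moving the child to v is cheaper

    def visit(o: str) -> None:
        kids = children.get(o, [])
        for c in kids:  # post-order: children first
            visit(c)
        if kids:
            commit(min(candidates, key=lambda v: delta(v, o)), o)

    visit(root)
    return delta(placement[root], root)


children = {"O00": ["O10", "O11"], "O10": ["O20", "O21", "O22"], "O11": ["O23", "O24"]}
placement = {"O20": "S0", "O21": "S0", "O22": "S1", "O23": "S1", "O24": "S1"}
rate = {("O10", "O20"): 30, ("O10", "O21"): 50, ("O10", "O22"): 20,
        ("O11", "O23"): 40, ("O11", "O24"): 10,
        ("O00", "O10"): 10, ("O00", "O11"): 5}
cost = greedy_place("O00", children, placement, rate, ["S0", "S1"])
```

In this made-up instance O10 settles at S0, O11 at S1, and the root joins O10 at S0, because leaving O11 in place costs less than pulling it away from its sources.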

Page 11

Edge
Location candidates: sources, proxy
Candidates with high possibility
  (1) One of the children's locations
  (2) A common location
  (3) The proxy's location

Link cost

$$c(m, n) = \begin{cases} 0 & \text{if } \mu(m) = \mu(n) \\ t(m, n) & \text{otherwise} \end{cases}$$

$t(m, n)$: tree cost

Page 12

Edge (1) One of the children's locations
  A location that maximizes the total tree cost between the operator and all of its children

$$\mu'(o) = \arg\max_{v} \sum_{\{c_i : \mu(c_i) = v,\ 1 \le i \le n\}} t(o, c_i)$$

[Figure: processing tree with root O00, children O10-O12, and leaves O20-O29 at locations S0 S0 S1 S0 S1 S1 S2 S0 S1 S1; detail of O10 with children O20, O21, O22, tree costs 30, 50, 20, and locations S0, S1]
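A minimal Python sketch of this selection rule: group the children by their current location, add up the tree cost toward each location, and pick the location with the largest total. The function name is hypothetical, and the assignment of the tree costs 30, 50, 20 to O20, O21, O22 is an assumption, since the figure does not survive in the transcript.

```python
from collections import defaultdict
from typing import Dict


def best_child_location(child_location: Dict[str, str],
                        tree_cost: Dict[str, float]) -> str:
    """Return the child location that carries the largest total tree cost."""
    totals: Dict[str, float] = defaultdict(float)
    for child, loc in child_location.items():
        totals[loc] += tree_cost[child]
    return max(totals, key=totals.get)


# Children of O10 and their locations, as listed on the slide.
child_location = {"O20": "S0", "O21": "S0", "O22": "S1"}
# Assumed pairing of the tree costs with the children.
tree_cost = {"O20": 30, "O21": 50, "O22": 20}

best = best_child_location(child_location, tree_cost)
```

Under this pairing S0 accumulates 80 against 20 for S1, so O10 would favor S0: placing it there removes the two most expensive arcs from the network.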

Page 13

Edge (2) A common location
Idea
  Placing an operator and its children at a common location -> zero overlay cost between the operator and its children
Common location (cl)
  A good place for all of its children -> the intersection of each child's dl (the set of descendant leaf locations)

$$dl(o) = \bigcup_{l \in leaves(o)} \mu(l)$$

$$cl(o) = \bigcap_{1 \le i \le n} dl(c_i)$$

Example: dl(O11) = {S0, S1, S2}, cl(O00) = {S0, S1}

[Figure: processing tree with root O00, children O10-O12, and leaves O20-O29 at locations S0 S0 S1 S0 S1 S1 S2 S0 S1 S1]
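The two sets can be computed with a short recursion. The sketch below uses the leaf locations listed on the slide. The split of the ten leaves among O10, O11, and O12 (three, four, three) is an assumption chosen to agree with the example values, and the function and variable names are hypothetical.

```python
from typing import Dict, List, Set


def dl(o: str, children: Dict[str, List[str]], leaf_loc: Dict[str, str]) -> Set[str]:
    """Descendant leaf locations: union of the locations of all leaves under o."""
    kids = children.get(o, [])
    if not kids:
        return {leaf_loc[o]}
    out: Set[str] = set()
    for c in kids:
        out |= dl(c, children, leaf_loc)
    return out


def cl(o: str, children: Dict[str, List[str]], leaf_loc: Dict[str, str]) -> Set[str]:
    """Common locations: intersection of dl over the children of o."""
    kids = children.get(o, [])
    if not kids:
        return set()
    return set.intersection(*(dl(c, children, leaf_loc) for c in kids))


leaves = [f"O2{i}" for i in range(10)]  # O20 .. O29
locations = ["S0", "S0", "S1", "S0", "S1", "S1", "S2", "S0", "S1", "S1"]
leaf_loc = dict(zip(leaves, locations))
children = {
    "O00": ["O10", "O11", "O12"],
    "O10": leaves[0:3],   # assumed grouping of leaves under each operator
    "O11": leaves[3:7],
    "O12": leaves[7:10],
}

dl_o11 = dl("O11", children, leaf_loc)
cl_o00 = cl("O00", children, leaf_loc)
```

With this grouping the sketch reproduces the slide's example: S2 appears only under O11, so it drops out of the intersection for O00.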

Page 14

Edge (3) The proxy's location
Idea
  If tree costs are higher near the root -> place the operator at the proxy location, r

[Figure: processing tree with root O00, children O10-O12, and leaves O20-O29 at locations S0 S0 S1 S0 S1 S1 S2 S0 S1 S1]

Page 15

Edge: Summary

$$\mu(o) = \begin{cases} DHT(o) & \text{if } o \in leaves(T) \\ \arg\min_{v \in \{\mu'(o),\, r\} \cup cl(o)} \delta(v, o) & \text{otherwise} \end{cases}$$

$$\delta(v, o) = \sum_{\{c_i : \mu(c_i) = v\}} \delta(v, c_i) + \sum_{\{c_i : \mu(c_i) \neq v\}} \min\{\delta(\mu(c_i), c_i) + c(o, c_i),\ \delta(v, c_i)\}$$

Page 16

Edge+
Location candidates: sources, proxy
Edge with network latency (d) between two locations

Link cost

$$c(m, n) = \begin{cases} 0 & \text{if } \mu(m) = \mu(n) \\ t(m, n)\, d(\mu(m), \mu(n)) & \text{otherwise} \end{cases}$$

Mapping function

$$\mu(o) = \begin{cases} DHT(o) & \text{if } o \in leaves(T) \\ \arg\min_{v \in \{\mu'(o),\, r\} \cup cl(o)} \delta(v, o) & \text{otherwise} \end{cases}$$

$$\delta(v, o) = \sum_{\{c_i : \mu(c_i) = v\}} \delta(v, c_i) + \sum_{\{c_i : \mu(c_i) \neq v\}} \min\{\delta(\mu(c_i), c_i) + c(o, c_i),\ \delta(v, c_i)\}$$

Page 17

In-Network Placement
Location candidates: arbitrary locations (including sources and proxy)
The overlay cost and mapping function are the same as in Edge+

$$c(m, n) = \begin{cases} 0 & \text{if } \mu(m) = \mu(n) \\ t(m, n)\, d(\mu(m), \mu(n)) & \text{otherwise} \end{cases}$$

$$\delta(v, o) = \sum_{\{c_i : \mu(c_i) = v\}} \delta(v, c_i) + \sum_{\{c_i : \mu(c_i) \neq v\}} \min\{\delta(\mu(c_i), c_i) + c(o, c_i),\ \delta(v, c_i)\}$$

Problem: reducing the candidate location set $\Phi(o)$

$$\mu(o) = \begin{cases} DHT(o) & \text{if } o \in leaves(T) \\ \arg\min_{v \in \Phi(o)} \delta(v, o) & \text{otherwise} \end{cases}$$

Page 18

In-Network Placement
Approach
  Remove a location unless its distance to all current child placements is less than all pairwise distances between child placements

$$\Phi(o) = \{ v \in V : \forall c_i, c_j \in C,\ i \neq j:\ d(v, \mu(c_i)) < d(\mu(c_i), \mu(c_j)) \ \wedge\ d(v, \mu(c_j)) < d(\mu(c_i), \mu(c_j)) \}$$

[Figure: operator O00 with children O10, O11, O12 and network nodes N2, N4, N7, N8; edge labels 40, 30, 20, 50, 60, 30]
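The pruning rule can be sketched in Python as a filter over candidate nodes. The sketch takes the slide's wording literally and uses a strict comparison; note that a non-strict comparison would additionally keep the child placements themselves. The function name, the node names, and the one-dimensional distance function are hypothetical and serve only to make the example runnable.

```python
from itertools import combinations
from typing import Callable, Iterable, List


def prune_candidates(
    nodes: Iterable[str],
    child_nodes: Iterable[str],
    d: Callable[[str, str], float],
) -> List[str]:
    """Keep v only if it is closer to each child placement than the
    child placements are to each other (checked for every pair)."""
    pairs = list(combinations(sorted(set(child_nodes)), 2))
    kept = []
    for v in nodes:
        if all(d(v, a) < d(a, b) and d(v, b) < d(a, b) for a, b in pairs):
            kept.append(v)
    return kept


# Hypothetical nodes on a line; distance is the absolute difference.
position = {"A": 0, "B": 10, "M": 4, "F": 25}


def line_distance(u: str, v: str) -> float:
    return abs(position[u] - position[v])


candidates = prune_candidates(["A", "B", "M", "F"], ["A", "B"], line_distance)
```

Here only M, which lies between the two child placements, survives; the far node F is discarded without ever evaluating the cost function for it.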

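Page 19

Latency-Constrained Placement
Find a configuration that satisfies the latency constraint

Latency constraint (P: the set of leaf-to-root paths, l: the latency bound)

$$\forall p \in P:\ \sum_{(a, b) \in p} d(\mu(a), \mu(b)) \le l$$

Candidate locations for operator o with children $c_i$

$$L(o) = \{ v \in \Phi(o) : \lambda(c_i) + d(v, \mu(c_i)) \le l \}$$

$$\lambda(o) = \max_{p \in paths(subtree(o))} \sum_{(a, b) \in p} d(\mu(a), \mu(b))$$

Mapping function

$$\mu(o) = \begin{cases} DHT(o) & \text{if } o \in leaves(T) \\ \arg\min_{v \in L(o)} \delta(v, o) & \text{otherwise} \end{cases}$$

[Figure: operator o with children $c_i$ (O20, O21, O22) over leaf locations S0, S1, S2; example with l = 75 showing O20, O21, O22 and nodes N4, N5, N7, S0, S1 with latencies 50, 30, 30]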
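The latency filter can be sketched as follows: a candidate is kept only if, for every child, the latency already accumulated below that child plus the hop from the child's node to the candidate stays within the bound. The bound of 75 comes from the slide's example; the function name, node assignments, and distance values are hypothetical, since the figure's layout does not survive in the transcript.

```python
from typing import Callable, Dict, Iterable, List


def latency_feasible(
    candidates: Iterable[str],
    child_node: Dict[str, str],        # child operator -> node it runs on
    child_latency: Dict[str, float],   # longest leaf-to-child latency
    d: Callable[[str, str], float],
    bound: float,
) -> List[str]:
    """Keep the candidates that satisfy the bound for every child."""
    return [
        v for v in candidates
        if all(child_latency[c] + d(v, node) <= bound
               for c, node in child_node.items())
    ]


# Hypothetical symmetric distance table (ms).
table = {
    frozenset(("N4", "N5")): 50, frozenset(("N7", "N5")): 30,
    frozenset(("N4", "S1")): 30, frozenset(("N7", "S1")): 30,
}


def dist(u: str, v: str) -> float:
    return 0 if u == v else table[frozenset((u, v))]


child_node = {"O20": "N5", "O21": "S1"}   # assumed placements
child_latency = {"O20": 30, "O21": 0}     # assumed accumulated latencies
feasible = latency_feasible(["N4", "N7"], child_node, child_latency, dist, bound=75)
```

In this made-up case N4 is rejected because the path through O20 would reach 80 ms, while N7 keeps both children within 75 ms.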

Page 20

Contents
Motivation
Problem
Solution Approach
Central Version of Algorithm
Distributed Version of Algorithm
Experiment
Critique

Page 21

Distributed Query Placement
Reason
  The centralized approach is not scalable
    Substantial network state
    Algorithm complexity

Page 22

Distributed Query Placement
Application proxy
  Partitions the processing tree into subtrees (zones)
  Assigns each zone to a coordinator node

[Figure: processing tree with operators O1-O4 and coordinator nodes C1-C4]

Page 23

Distributed Query Placement

[Figure: tree overlay of coordinator nodes C1-C4]

Page 24

Experiment
Experimental Setup
  Processing tree
    Binary tree
    Depth: 3 to 5
  Network topology
    Maximum pairwise path delay: 500 ms
  Server and proxy location
    Uniform: APD = ASD
    Star: APD = 0.5*ASD
    Cluster: APD = 2*ASD
  APD: Average Proxy Distance
  ASD: Average Server Distance

[Figure: server and proxy placements; legend: Server; Proxy, Uniform; Proxy, Cluster; Proxy, Star]

Page 25

Experiment
Latency constraints
  120 ms (0.9nd, tight delay) vs. 300 ms (2.2nd, loose delay)
Direct comparison
  Baseline case: all operators are located at the proxy
Result

[Figures: bandwidth consumption; latency stretch]

Page 26

Critique
Pros
  Addresses the operator placement problem
  Focuses on network-related costs (bandwidth, latency) rather than processing cost
Cons
  High-complexity algorithm: is it practical to apply?
    Heavy processing
    The placement takes too long to complete
      Latency information for many locations is needed
      Sequential convergence in a bottom-up manner
    => Impractical for complex query plans and topologies
    => A simpler algorithm would be more appropriate
  Dynamics? Not resilient to dynamic topology changes
    For example, node departures or latency changes

