  • Scheduling on a Channel with Failures and Retransmissions

    Predrag R. Jelenković and Evangelia D. Skiani

    Department of Electrical Engineering, Columbia University, NY 10027, USA

    {predrag,valia}@ee.columbia.edu

    October 6, 2013

    *Supported by NSF grant 0915784

    P.R.Jelenković & E.D.Skiani Scheduling on a Channel with Failures and Retransmissions October 6, 2013 1 / 18

  • Outline

    1 Introduction
        Definitions & Notation

    2 Main Results
        First Come First Served
        Processor Sharing

    3 Simulation
        Example 1: FCFS
        Example 2: PS

    4 Conclusions

  • Introduction

    Failures & Retransmissions (Restarts)

    High variability ⇒ frequent failures
    Possible solution: restart the system

    Applications:
      networking, e.g. ARQ, HTTP
      computing

    Restarts cause power law delays & possibly zero throughput, even for superexponential files [ALSF’05-, JT’06-]:

    \[ P[N > n] \sim \frac{\Gamma(\alpha+1)}{n^{\alpha}} \qquad (1) \]

    What is the best job scheduling policy?
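
    The power law in (1) can be checked numerically. The sketch below rests on an assumption that is not on the slide: A and B are both exponential, with rates mu and alpha * mu, so that log P(B > x) = alpha * log P(A > x) holds exactly. In that special case P[N > n] has the closed form Gamma(alpha+1) Gamma(n+1) / Gamma(n+alpha+1), and the two hypothetical helpers compare it with the asymptote in (1).

```python
import math


def retransmission_tail_exact(n: int, alpha: float) -> float:
    """Exact P[N > n] when A ~ Exp(mu) and B ~ Exp(alpha * mu) (illustrative assumption).

    Given B, each attempt fails with probability 1 - exp(-mu * B), hence
    P[N > n] = E[(1 - exp(-mu * B))^n]
             = Gamma(alpha + 1) * Gamma(n + 1) / Gamma(n + alpha + 1).
    """
    log_p = math.lgamma(alpha + 1) + math.lgamma(n + 1) - math.lgamma(n + alpha + 1)
    return math.exp(log_p)


def retransmission_tail_asymptote(n: int, alpha: float) -> float:
    """Power law Gamma(alpha + 1) / n^alpha, the right-hand side of (1)."""
    return math.gamma(alpha + 1) / n ** alpha


if __name__ == "__main__":
    alpha = 2.0  # hypothetical value chosen for illustration
    for n in (10, 100, 1000, 10000):
        exact = retransmission_tail_exact(n, alpha)
        approx = retransmission_tail_asymptote(n, alpha)
        print(f"n={n:>6}  exact={exact:.3e}  asymptote={approx:.3e}  ratio={exact / approx:.4f}")
```

    The ratio of the two columns approaches 1 as n grows, which is what the "∼" in (1) asserts for this particular pair of distributions.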


  • Introduction Motivation

    Scheduling & Retransmissions

    No known policies optimize the sojourn time tail across BOTH light- and heavy-tailed job size distributions.

    Optimality

    Subexponential jobs: PS, shortest remaining processing time [ANA’99]

    Superexponential jobs: First come first served [RS’01]

    We study two scheduling policies:

    1 First Come First Served (FCFS)

    2 Processor Sharing (PS)

    Question:

    How do these policies work under retransmissions?


  • Introduction Motivation

    Model of Channel

    Available periods $\{A_n\}_{n \ge 1}$: i.i.d.

    Unit capacity

    Figure: A failure-prone system. [Timeline with periods labeled A1, U1, A2, U2.]

    Retransmission Model

    Generic job $B \in (0, \infty)$: if $B \le A_n$, success; else, retransmit at period $A_{n+1}$

    Figure: Jobs over a system with failures. [Flow diagram: job B enters the system with failures; test $A_n \ge B$; if no, restart.]

  • Introduction Definitions & Notation

    Definitions & Notation

    Definition 1 (Service Time)

    The service time is the total time until a job is successfully served and is denoted as

    \[ S := \sum_{i=1}^{N-1} A_i + B, \]

    where N is the number of attempts until the successful completion of the job.

    Denote the tail distributions of job sizes B and availability periods A as

    \[ \bar F(x) = P(B > x) \quad \text{and} \quad \bar G(x) = P(A > x). \]
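
    Definition 1 translates directly into a few lines of code. This is a minimal sketch, assuming the availability periods are supplied as a plain sequence and that time spent outside availability periods is not counted (as in the formula for S); the function name `service_time` is hypothetical.

```python
from typing import Iterable, Tuple


def service_time(job_size: float, periods: Iterable[float]) -> Tuple[int, float]:
    """Return (N, S) for one job served alone.

    The job succeeds in the first availability period A_n with B <= A_n.
    Every earlier period is consumed entirely by a failed attempt, so
    S = A_1 + ... + A_{N-1} + B.
    """
    elapsed = 0.0
    for attempt, period in enumerate(periods, start=1):
        if job_size <= period:
            return attempt, elapsed + job_size
        elapsed += period  # failed attempt: the whole period is wasted
    raise ValueError("ran out of availability periods before the job completed")


if __name__ == "__main__":
    # Hypothetical numbers: a job of size 3 facing periods 1.0, 2.5, 4.0.
    print(service_time(3.0, [1.0, 2.5, 4.0]))  # (3, 6.5)
```

    In the usage example the first two periods are too short, so N = 3 and S = 1.0 + 2.5 + 3.0.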

  • Introduction Definitions & Notation

    A Simple Scenario

    There are m jobs of size $B_i$, $i = 1, \dots, m$
    Each job requires $S_i$ time units
    No future arrivals

    Job Scheduling: FCFS vs. PS

    Figure: [Diagram: a queue of jobs B3, B2, B1 served under FCFS and under PS.]

  • Introduction Definitions & Notation

    Definitions & Notation

    Definition 2 (Total Completion Time)

    The total completion time is defined as the total time until all the jobs in the queue are successfully served and is denoted as

    \[ \Theta_m := \sum_{i=1}^{m} S_i, \]

    where m is the total number of jobs in the system and the $S_i$'s are the service times for each job.

    Note: Without retransmissions, the total completion time is trivial: it always equals $\sum_{i=1}^{m} B_i$.
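
    The total completion time under FCFS can be sketched as follows. Two assumptions here are mine rather than the slides': after a job completes, the next job starts immediately in what is left of the current availability period, and unavailable time between periods is not counted. The name `fcfs_completion_time` is hypothetical.

```python
from typing import Iterable, Sequence


def fcfs_completion_time(job_sizes: Sequence[float], periods: Iterable[float]) -> float:
    """Total completion time of jobs served one at a time, in the given order."""
    period_iter = iter(periods)
    remaining = next(period_iter)  # time left in the current availability period
    elapsed = 0.0
    for size in job_sizes:
        while size > remaining:
            # Failure before the job finishes: the attempt is lost and the
            # job restarts from scratch at the start of the next period.
            elapsed += remaining
            remaining = next(period_iter)
        elapsed += size
        remaining -= size  # the next job starts in what is left of this period
    return elapsed


if __name__ == "__main__":
    # Hypothetical data: two jobs, four availability periods.
    print(fcfs_completion_time([2.0, 3.0], [1.0, 4.0, 2.0, 5.0]))  # 10.0
    # With one long period there are no retransmissions: the result is the sum of sizes.
    print(fcfs_completion_time([2.0, 3.0], [100.0]))  # 5.0
```

    The second call reproduces the note above: with no failures the total completion time is just the sum of the job sizes.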

  • Main Results First Come First Served

    First Come First Served (FCFS)

    Theorem 1

    If $\log \bar F(x) \approx \alpha \log \bar G(x)$ for all $x \ge 0$ and $\alpha > 0$, and $E[A^{1+\theta}] < \infty$ for some $\theta > 0$, then

    \[ \lim_{t \to \infty} \frac{\log P[\Theta_m > t]}{\log t} = -\alpha. \]

    Proof [of Theorem 1].

    Under the conditions of the Theorem, the result in [JT’06-] yields

    \[ \lim_{t \to \infty} \frac{\log P[S > t]}{\log t} = -\alpha, \qquad (\star) \]

    where S is the service time of one job if served alone.


  • Main Results First Come First Served

    FCFS

    Proof [of Theorem 1].

    The total completion time is lower bounded by a single job service time:

    \[ P[\Theta_m > t] \ge P[S_1 > t] \;\overset{(\star)}{\Longrightarrow}\; -\frac{\log P[\Theta_m > t]}{\log t} \lesssim \alpha. \]

    Let $\bar S_i$ be the service time of job i when we idle the server after job completion until the next failure. Then, the upper bound is

    \[ P[\Theta_m > t] \le P\left[\sum_{i=1}^{m} \bar S_i > t\right] \le m\, P\left[\bar S_1 > \frac{t}{m}\right] \;\overset{(\star)}{\Longrightarrow}\; -\frac{\log P[\Theta_m > t]}{\log t} \gtrsim \alpha. \]
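
    The last step of the upper bound is compressed on the slide. A possible expansion, assuming the idled service times are identically distributed (as the use of the first one alone suggests) and that the single-job limit from [JT’06-] applies to them, is a union bound followed by a change of scale inside the logarithm:

```latex
\begin{align*}
\Big\{ \sum_{i=1}^{m} \bar S_i > t \Big\}
  &\subseteq \bigcup_{i=1}^{m} \Big\{ \bar S_i > \tfrac{t}{m} \Big\}
  \quad\Longrightarrow\quad
  P\Big[ \sum_{i=1}^{m} \bar S_i > t \Big]
  \le \sum_{i=1}^{m} P\Big[ \bar S_i > \tfrac{t}{m} \Big]
  = m \, P\Big[ \bar S_1 > \tfrac{t}{m} \Big], \\
\frac{\log \big( m \, P[\bar S_1 > t/m] \big)}{\log t}
  &= \frac{\log m}{\log t}
   + \frac{\log P[\bar S_1 > t/m]}{\log (t/m)} \cdot \frac{\log (t/m)}{\log t}
  \;\xrightarrow[t \to \infty]{}\; 0 + (-\alpha) \cdot 1 = -\alpha .
\end{align*}
```

    Since m is fixed, neither the factor m nor the rescaling t/m changes the power law index, which is why FCFS keeps the single-job index.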


  • Main Results Processor Sharing

    Processor Sharing (PS)

    Theorem 2

    If the hazard function $-\log \bar F(x)$ is regularly varying with index $\gamma \ge 0$, then, under the conditions of Theorem 1,

    i) if $\gamma \le 1$, i.e. B is subexponential or exponential, then

    \[ \lim_{t \to \infty} -\frac{\log P[\Theta_m > t]}{\log t} = \alpha, \]

    ii) if $\gamma > 1$, i.e. B is superexponential, then

    \[ \lim_{t \to \infty} -\frac{\log P[\Theta_m > t]}{\log t} = \frac{\alpha}{m^{\gamma-1}} < \alpha. \]
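
    To make the PS mechanism concrete, here is a minimal sketch of the total completion time under processor sharing. It assumes, as my reading of the model rather than something stated on the slides, that all pending jobs share the unit capacity equally, that a failure wipes out the progress of every unfinished job, and that unavailable time is not counted. The name `ps_completion_time` is hypothetical.

```python
from typing import Iterable, Sequence


def ps_completion_time(job_sizes: Sequence[float], periods: Iterable[float]) -> float:
    """Total completion time when all pending jobs share unit capacity equally."""
    pending = sorted(job_sizes)  # unfinished jobs, smallest first
    elapsed = 0.0
    for period in periods:
        left = period    # time left in this availability period
        attained = 0.0   # work each pending job has received since the last failure
        while pending:
            # With k jobs sharing the channel, each one is served at rate 1/k.
            k = len(pending)
            time_to_next_departure = (pending[0] - attained) * k
            if time_to_next_departure > left:
                break  # the failure comes first: all pending jobs lose their progress
            left -= time_to_next_departure
            elapsed += time_to_next_departure
            attained = pending.pop(0)
        if not pending:
            return elapsed
        elapsed += left
    raise ValueError("ran out of availability periods before all jobs completed")


if __name__ == "__main__":
    # Hypothetical data: three unit-size jobs need one period of length >= 3 under PS.
    print(ps_completion_time([1.0, 1.0, 1.0], [2.5, 3.0]))  # 5.5
```

    In the usage example the first period of length 2.5 is wasted entirely, because with three jobs sharing the channel even the smallest one needs a period of length 3. Shrinking the effective period seen by each job in this way is the mechanism that changes the power law index when several jobs share the channel.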


  • Main Results Processor Sharing

    Idea of the proof (I)

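    The upper bound is

    \[ P[\Theta_m > t] \le P\left[\sum_{i=1}^{m} \bar S_i > t\right] \le (1+\epsilon) \sum_{i=1}^{m} P[\bar S_i > t]. \]

    1 If $\hat B_1$ is the smallest job, then

    \[ P[N_1 > n] = E\left[P\left[\hat B_1 > \tfrac{A}{m}\right]^n\right] = E\left[\left(1 - \bar G(m \hat B_1)\right)^n\right] = E\left[\left(1 - \bar F_1(m \hat B_1)^{1/\alpha_1}\right)^n\right]. \]

    2 What is the relationship between $\bar F_1(x)$ and $\bar G(x)$?

    \[ \log \bar F_1(x) = \log P[m \hat B_1 > x] = \log \left(\bar F(x/m)\right)^m \approx m^{1-\gamma} \log \bar F(x). \]

    3 Recalling $(\star)$,

    \[ -\frac{\log P[\bar S_1 > t]}{\log t} \xrightarrow[t \to \infty]{} \frac{\alpha}{m^{\gamma-1}}. \qquad (\star\star) \]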
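
    Step 2 is where regular variation of the hazard function enters. The lines below spell it out as a sketch. The Weibull-type tail in the last line is my own illustrative choice, for which the approximation is an exact identity.

```latex
% Regular variation of the hazard function with index \gamma means, for fixed m,
%   -\log \bar F(x/m) \approx m^{-\gamma} \, ( -\log \bar F(x) )   as x \to \infty.
\begin{align*}
\log \bar F_1(x) &= m \log \bar F(x/m)
   \approx m \cdot m^{-\gamma} \log \bar F(x) = m^{1-\gamma} \log \bar F(x), \\
\log \bar F_1(x) &\approx m^{1-\gamma} \, \alpha \log \bar G(x)
   \quad\Longrightarrow\quad
   \alpha_1 = \alpha \, m^{1-\gamma} = \frac{\alpha}{m^{\gamma-1}}, \\
\text{e.g. } \bar F(x) = e^{-x^{\gamma}}: \quad
\bar F(x/m)^m &= \exp\!\big( -m \, (x/m)^{\gamma} \big)
   = \exp\!\big( -m^{1-\gamma} x^{\gamma} \big) = \bar F(x)^{\, m^{1-\gamma}} .
\end{align*}
```

    The second line uses the condition of Theorem 1 that relates the tails of B and A, so the smallest job behaves like a single job whose index is scaled by the factor m^(1−γ).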


  • Main Results Processor Sharing

    Idea of the proof (II)

    4 Similarly, for the 2nd smallest job $\sim 1/t^{\alpha (m-1)^{1-\gamma}}$

    5 . . . and the last one $\sim 1/t^{\alpha}$

    If $\gamma > 1$ (superexponential), then the lower bound is determined by the minimum power law index ($\alpha m^{1-\gamma} < \dots < \alpha$):

    \[ -\frac{\log P[\Theta_m > t]}{\log t} \gtrsim \frac{\alpha}{m^{\gamma-1}}. \qquad (1) \]

    Similarly, if $\gamma \le 1$ ((sub)exponential), then

    \[ -\frac{\log P[\Theta_m > t]}{\log t} \gtrsim \alpha. \qquad (2) \]
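
    The per-job indices listed in steps 4 and 5 are easy to tabulate. The sketch below is a hypothetical helper and not part of the talk: it lists the index for each stage of PS service, from all m jobs sharing the channel down to the last job alone, and picks the smallest one, which governs the tail of the total completion time. The value 2 used for the hazard index is an assumption for illustration, while the index 4 and m = 2, 5 echo the second simulation example.

```python
from typing import List


def ps_power_law_indices(alpha: float, gamma: float, m: int) -> List[float]:
    """Indices alpha * k^(1 - gamma) for k = m, m - 1, ..., 1.

    k = m corresponds to the smallest job (served while all m jobs share the
    channel) and k = 1 to the last job, which is served alone.
    """
    return [alpha * k ** (1.0 - gamma) for k in range(m, 0, -1)]


def dominant_index(alpha: float, gamma: float, m: int) -> float:
    """The smallest index, i.e. the heaviest tail among the m stages."""
    return min(ps_power_law_indices(alpha, gamma, m))


if __name__ == "__main__":
    for m in (2, 5):
        print(m, ps_power_law_indices(4.0, 2.0, m), dominant_index(4.0, 2.0, m))
    # For a hazard index below 1 the minimum is attained by the last job.
    print(dominant_index(4.0, 0.5, 5))
```

    With a hazard index above 1 the minimum sits at the smallest job and decreases with m. With a hazard index of at most 1 it stays at the single-job value, in line with the two cases of Theorem 2.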


  • Simulation Example 1: FCFS

    Simulations

    Example 1. FCFS: All job types generate the same power law asymptotics

    Service time $S \sim 1/t^2$
    # jobs: m = 10

    Figure: Logarithmic asymptotics for $\alpha = 2$ under FCFS. [Log-log plot of P[T > t] versus t, with t from 10^0 to 10^4 and P[T > t] from 10^-4 to 10^0; curves: γ < 1, Exponential, γ > 1, Asymptote.]

  • Simulation Example 2: PS

    Simulations

    Example 2. PS: The effect of the number of (superexponential) jobs

    B ∼ superexponential ($\gamma > 1$)
    # jobs: m = 2 and m = 5, service time with $\alpha = 4$

    Figure: Logarithmic asymptotics for $\alpha = 4$ under PS and FCFS discipline. [Log-log plot of P[T > t] versus t, with t from 10^0 to 10^4 and P[T > t] from 10^-4 to 10^0; curves: PS: m = 5, PS: m = 2, FCFS, Asymptote.]

    Figure: Logarithmic asymptotics for $\alpha = 4$ under FCFS, PS with $\gamma > 1$ and $\gamma < 1$ discipline for m = 5.

  • Conclusions

    Queueing: PS could be always unstable

    Theorem 3

    If jobs are superexponential ($\gamma > 1$), then for any arrival rate $\lambda > 0$ and any $\alpha > 0$, the PS queue is unstable.

    Queueing with retransmissions & scheduling is hard

    More to come in our forthcoming paper. . .


  • Conclusions

    Conclusions

    FCFS: power law of same index for both super/subexponential

    PS: new phenomenon - dramatic difference between super/subexponential jobs

    Queueing: for superexponential jobs, sharing induces instabilities → zero throughput

    Sharing is not always good


  • Conclusions

    Thank you

    Questions?
