Lectures on Randomised Algorithms 1
Lectures on Randomised Algorithms
COMP 523: Advanced Algorithmic Techniques
Lecturer: Dariusz Kowalski
Overview
Previous lectures:
• NP-hard problems
• Approximation algorithms
These lectures:
• Basic theory: – probability, random variable, expected value
• Randomised algorithms
Probabilistic theory
Consider flipping two fair (symmetric) coins with sides 1 and 0
• Event: a situation that depends on the random generator
– Event: the sum of the results on the two flipped coins is 1
• Random variable: a function that attaches a real value to each elementary event (outcome)
– X = sum of the results on the two flipped coins
• Probability of an event: the proportion of the event within the set of all elementary events (sometimes weighted)
– Pr[X=1] = 2/4 = 1/2, since X = 1 is the event containing two elementary events:
• 0 on the first coin and 1 on the second coin
• 1 on the first coin and 0 on the second coin
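The probability above can be checked by brute-force enumeration of the four elementary events; the following is a small illustrative sketch (not part of the lecture; the variable names are mine):

```python
from itertools import product
from fractions import Fraction

# Elementary events: all outcomes of flipping two fair coins with sides 0 and 1.
outcomes = list(product([0, 1], repeat=2))   # (first coin, second coin)

# The event "X = 1" consists of the outcomes whose sum of results is 1.
event = [o for o in outcomes if sum(o) == 1]

# Probability = proportion of the event within all elementary events.
pr_x_equals_1 = Fraction(len(event), len(outcomes))
print(pr_x_equals_1)  # 1/2
```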
Probabilistic theory cont.
Consider flipping two fair (symmetric) coins with sides 1 and 0
• Expected value (of a random variable): the sum of all possible values of the random variable, weighted by the probabilities of these values
– E[X] = 0·1/4 + 1·1/2 + 2·1/4 = 1
• Independence: two events are independent if the probability of their intersection equals the product of their probabilities
– Event 1: 1 on the first coin; Event 2: 0 on the second coin;
Pr[Event1 ∩ Event2] = 1/4 = Pr[Event1]·Pr[Event2] = 1/2 · 1/2
– Event 3: the sum on the two coins is 2;
Pr[Event1 ∩ Event3] = 1/4 ≠ Pr[Event1]·Pr[Event3] = 1/2 · 1/4 = 1/8
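Both the expected value and the two independence checks can be verified by the same enumeration; a small sketch (the helper `pr` is my own, not from the lecture):

```python
from itertools import product
from fractions import Fraction

outcomes = list(product([0, 1], repeat=2))   # four equally likely outcomes
p = Fraction(1, len(outcomes))               # each has probability 1/4

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum(p for o in outcomes if event(o))

# Expected value of X = sum of the two coins: 0*1/4 + 1*1/2 + 2*1/4 = 1
e_x = sum(p * sum(o) for o in outcomes)

ev1 = lambda o: o[0] == 1      # Event 1: 1 on the first coin
ev2 = lambda o: o[1] == 0      # Event 2: 0 on the second coin
ev3 = lambda o: sum(o) == 2    # Event 3: sum on the two coins is 2

independent_12 = pr(lambda o: ev1(o) and ev2(o)) == pr(ev1) * pr(ev2)
independent_13 = pr(lambda o: ev1(o) and ev3(o)) == pr(ev1) * pr(ev3)
print(e_x, independent_12, independent_13)   # 1 True False
```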
Randomised algorithms
• Any kind of algorithm using a (pseudo-)random generator
• Main kinds of algorithms:
– Monte Carlo: the algorithm computes a proper solution with high probability (in practice: at least constant)
• A Monte Carlo algorithm always stops
– Las Vegas: the algorithm always computes a proper solution
• Sometimes the algorithm can run very long, but only with very small probability
Quick sort - algorithmic scheme
Generic Quick Sort:
• Select one element x from the input
• Partition the input into the part containing the elements not greater than x and the part containing all bigger elements
• Sort each part separately
• Concatenate these sorted parts
Problem: how to choose the element x so as to balance the sizes of these two parts? (to obtain recursive equations similar to those for MergeSort)
Why should the parts be balanced?
Suppose we do not balance, but always choose the last element:
T(n) ≤ T(n-1) + T(1) + c·n
T(1) ≤ c
Solution: T(n) ≤ d·n², for some constant d ≥ c
Proof: by induction.
– For n = 1 it is straightforward, since T(1) ≤ c ≤ d
– Suppose T(n-1) ≤ d·(n-1)²; then
T(n) ≤ T(n-1) + c + c·n ≤ d·(n-1)² + c·(n+1) ≤ d·(n-1)² + 2d·n - d = d·n²
(the last inequality holds since c·(n+1) ≤ 2d·n - d for d ≥ c and n ≥ 2)
Randomised approach
Randomised Quick Sort:
• Select the element x uniformly at random
• Expected time: O(n log n)
• Additional memory: O(n)
Uniform selection: each element has the same probability of being selected
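The randomised scheme above can be sketched as follows (a minimal illustrative implementation, not the lecturer's code; the three-way partition keeps duplicates of the pivot out of the recursion):

```python
import random

def randomised_quicksort(items):
    """Generic Quick Sort with the pivot x selected uniformly at random."""
    if len(items) <= 1:
        return list(items)
    x = random.choice(items)                 # uniform selection of the pivot
    smaller = [e for e in items if e < x]    # elements smaller than x
    equal   = [e for e in items if e == x]   # the pivot and its duplicates
    bigger  = [e for e in items if e > x]    # all bigger elements
    # Sort each part separately, then concatenate the sorted parts.
    return randomised_quicksort(smaller) + equal + randomised_quicksort(bigger)

print(randomised_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```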
Randomized approach - analysis
Let T(n) denote the expected time: the sum of all possible values of the time, weighted by the probabilities of these values
T(n) ≤ (1/n)·([T(n-1)+T(1)] + [T(n-2)+T(2)] + … + [T(0)+T(n)]) + c·n
T(0) = T(1) = 1, T(2) ≤ c
Solution: T(n) ≤ d·n·log n, for some constant d ≥ 8c
Proof: by induction.
– For n = 2 it is straightforward
– Suppose T(m) ≤ d·m·log m for every m < n; then
(1 - 1/n)·T(n) ≤ (2/n)·(T(0) + … + T(n-1)) + c·n
≤ (2d/n)·(0·log 0 + … + (n-1)·log(n-1)) + c·n ≤ d·n·log n - d·(n/4) + c·n ≤ d·n·log n - d·n/8
T(n) ≤ (n/(n-1))·(d·n·log n - d·n/8) ≤ d·n·log n
Tree structure of random execution
[Figure: recursion tree of one random execution on the input 1–8: the root corresponds to the first pivot, internal nodes to recursive pivot choices, and the 8 elements end up at 8 leaves; height = 5 in this example]
Minimum Cut in a graph
Minimum cut in an undirected multi-graph G (there may be many edges between a pair of nodes):
– a partition of the nodes with the minimum number of crossing edges
• Deterministic approach:
– transform the graph into an s-t network, for every pair of nodes s, t
– replace each undirected edge by two directed edges in opposite directions, each of capacity 1
– replace all multiple directed edges by one edge with capacity equal to the multiplicity of this edge
– run Ford-Fulkerson (or another network-flow algorithm) to compute the max-flow, which is equal to the min-cut
Minimum Cut in a graph
Randomised approach:
• Select a random edge:
– contract its end nodes into one supernode,
– remove all edges between these two nodes,
– keep the other edges adjacent to the obtained supernode
• Repeat the above procedure until two supernodes remain
• Count the number of edges between the two remaining supernodes and return the result
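The contraction procedure can be sketched as follows (an illustrative implementation using an explicit leader array, not the lecture's code; an efficient version would use a union-find structure):

```python
import random

def contract_min_cut(n, edges):
    """One run of randomised contraction on a multigraph; returns a cut size.

    n: number of nodes, labelled 0..n-1; edges: list of undirected (u, v)
    pairs, with parallel edges simply repeated in the list.
    """
    leader = list(range(n))      # maps each node to its current supernode
    edges = list(edges)
    supernodes = n
    while supernodes > 2:
        u, v = random.choice(edges)                       # select a random edge
        lu, lv = leader[u], leader[v]
        leader = [lu if x == lv else x for x in leader]   # contract lv into lu
        # Remove edges inside the new supernode (self-loops); keep the other
        # edges adjacent to the obtained supernode.
        edges = [(a, b) for (a, b) in edges if leader[a] != leader[b]]
        supernodes -= 1
    return len(edges)            # edges between the two remaining supernodes

def iterated_min_cut(n, edges, repetitions):
    """Iterate independently, recording the minimum output obtained so far."""
    return min(contract_min_cut(n, edges) for _ in range(repetitions))

# Two triangles joined by a single bridge: the minimum cut has size 1.
g = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(iterated_min_cut(6, g, 6 * 5 // 2))
```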
Minimum Cut - Analysis
Let K be a smallest cut (set of edges) and let k be its size.
• Compute the probability that in step j an edge in K is selected, provided no edge from K has been selected before:
– each supernode has at least k adjacent edges (otherwise the cut between a supernode with fewer adjacent edges and the remaining supernodes would be smaller than K)
– the total number of remaining supernodes at the beginning of step j is n - j + 1
– the total number of edges at the beginning of step j is thus at least k·(n - j + 1)/2 (each edge contributes to the degree of two supernodes)
– the probability of selecting (and so contracting) an edge in K in step j is at most k/[k·(n - j + 1)/2] = 2/(n - j + 1)
Minimum Cut - Analysis cont.
• Event B_j: in step j of the algorithm an edge not in K is selected
• Conditional probability (of event A under the condition of event B):
Pr[A|B] = Pr[A ∩ B]/Pr[B]
• From the previous slide:
Pr[B_j | B_{j-1} ∩ … ∩ B_1] ≥ 1 - 2/(n - j + 1)
• The following holds:
Pr[B_j ∩ B_{j-1} ∩ … ∩ B_1] = Pr[B_1] · Pr[B_2|B_1] · Pr[B_3|B_2 ∩ B_1] · … · Pr[B_j|B_{j-1} ∩ … ∩ B_1]
• The probability of the sought event B_{n-2} ∩ B_{n-3} ∩ … ∩ B_1 (i.e., that in all n-2 steps of the algorithm edges not in K are selected) is at least
[1 - 2/n]·[1 - 2/(n-1)]·…·[1 - 2/3] =
[(n-2)/n]·[(n-3)/(n-1)]·[(n-4)/(n-2)]·…·[2/4]·[1/3] =
2/[n·(n-1)]
Minimum Cut - Analysis cont.
• If we iterate this algorithm independently n(n-1)/2 times, always recording the minimum output obtained so far, then the probability of success (i.e., of finding a min-cut) is at least
1 - (1 - 2/[n(n-1)])^{n(n-1)/2} ≥ 1 - 1/e
• To obtain a bigger probability we have to iterate this process more times
• The total time is O(n³) contractions
• Question: how to implement a contraction efficiently?
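The bound 1 - (1 - 2/[n(n-1)])^{n(n-1)/2} ≥ 1 - 1/e can be checked numerically; a small illustrative sketch:

```python
import math

def success_probability(n, repetitions):
    """Lower bound on the success probability of the iterated contraction:
    each independent run finds a min-cut with probability at least 2/(n(n-1))."""
    single = 2.0 / (n * (n - 1))
    return 1.0 - (1.0 - single) ** repetitions

for n in (10, 100, 1000):
    reps = n * (n - 1) // 2           # the number of iterations used above
    # Since (1 - x)^(1/x) < 1/e, the result exceeds 1 - 1/e for every n.
    print(n, round(success_probability(n, reps), 4))
```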
Conclusions
• Probabilistic theory
– Events, random variables, expected values
• Basic algorithms
– LV: Randomised Quick Sort (randomised recurrence)
– MC: Minimum Cut (iterating to get a bigger probability)
Textbook and Exercises
READING:
• Chapter 13, Sections 13.2, 13.3, 13.5 and 13.12
EXERCISE:
• How many iterations of the randomised min-cut algorithm should we perform to obtain a probability of success of at least 1 - 1/n?
For volunteers:
• Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find a sample min-cut?
Overview
Previous lectures:
• Randomised algorithms
• Basic theory: probability, random variable, expected value
• Algorithms: LV (sorting) and MC (min-cut)
This lecture:
• Basic random processes
Expected number of successes
Sequence (possibly infinite) of independent random trials, each with probability p of success
• The expected number of successes in m trials is m·p:
– the probability of success in one trial is p, so let X_j be such that Pr[X_j=1] = p and Pr[X_j=0] = 1 - p, for 0 < j ≤ m
– E[Σ_{0<j≤m} X_j] = Σ_{0<j≤m} E[X_j] = m·p
• Memoryless guessing: n cards, you guess one, turn over one card, check if you succeeded, shuffle the cards and repeat; how many trials do you need to expect one proper guess?
– Pr[X_j=1] = 1/n and Pr[X_j=0] = 1 - 1/n
– E[Σ_{0<j≤n} X_j] = Σ_{0<j≤n} E[X_j] = n · 1/n = 1
Guessing with memory
• n cards, you guess one, turn over one card, remove the card, shuffle the rest of them and repeat; how many successful guesses can you expect?
– Pr[X_j=1] = 1/(n-j+1) and Pr[X_j=0] = 1 - 1/(n-j+1)
– E[Σ_{0<j≤n} X_j] = Σ_{0<j≤n} E[X_j] = Σ_{0<j≤n} 1/(n-j+1) = Σ_{0<j≤n} 1/j = H_n = ln n + const.
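Both guessing games can be evaluated exactly via linearity of expectation; an illustrative sketch (the function names are mine, not from the lecture):

```python
from fractions import Fraction

def expected_guesses_memoryless(n):
    """n trials, each succeeding with probability 1/n: E = n * 1/n = 1."""
    return sum(Fraction(1, n) for _ in range(n))

def expected_guesses_with_memory(n):
    """In trial j one of the n-j+1 remaining cards is guessed: E = H_n."""
    return sum(Fraction(1, n - j + 1) for j in range(1, n + 1))

print(expected_guesses_memoryless(52))           # 1
print(float(expected_guesses_with_memory(52)))   # H_52, close to ln 52 + 0.577
```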
Waiting for the first success
Sequence (possibly infinite) of independent random trials, each with probability p of success
• The expected waiting time for the first success is
Σ_{j>0} j·(1-p)^{j-1}·p = p·Σ_{j≥1} j·(1-p)^{j-1} =
p·(Σ_{j≥1} x^j)′ =
p·(x/(1-x))′ =
p·1/(1-x)² = p·1/p² = 1/p
where the derivative is taken with respect to x and evaluated at x = 1 - p
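The 1/p formula can be sanity-checked numerically by truncating the series (an illustrative sketch; the truncation length is my own choice):

```python
def expected_waiting_time(p, terms=10_000):
    """Truncated sum of j * (1-p)^(j-1) * p, which converges to 1/p."""
    return sum(j * (1 - p) ** (j - 1) * p for j in range(1, terms + 1))

for p in (0.5, 0.1, 0.01):
    print(p, expected_waiting_time(p))   # close to 2, 10 and 100 respectively
```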
Collecting coupons
• n types of coupons are hidden randomly in a large number of boxes; each box contains one coupon. You choose a box and take the coupon from it. How many boxes can you expect to open in order to collect all kinds of coupons?
• Stage j: the time between selecting the (j-1)th different coupon and the jth different coupon
– independent trials Y_i for each step i of stage j, satisfying Pr[Y_i=0] = (j-1)/n and Pr[Y_i=1] = (n-j+1)/n
– let X_j be the length of stage j; by the expected waiting time for the first success, E[X_j] = 1/Pr[Y_i=1] = n/(n-j+1)
• Finally, E[Σ_{0<j≤n} X_j] = Σ_{0<j≤n} E[X_j] = Σ_{0<j≤n} n/(n-j+1) = n·Σ_{0<j≤n} 1/j = n·H_n = n·ln n + O(n)
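The n·H_n formula can be compared against a simulation; a hedged sketch (function names and trial counts are my own choices):

```python
import random
from fractions import Fraction

def expected_boxes(n):
    """Exact expectation: sum over stages of n/(n-j+1) = n * H_n."""
    return sum(Fraction(n, n - j + 1) for j in range(1, n + 1))

def simulate_boxes(n, rng):
    """Open random boxes until all n coupon types have been collected."""
    seen, opened = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))   # a uniformly random coupon type
        opened += 1
    return opened

n = 10
print(float(expected_boxes(n)))      # 10 * H_10, roughly 29.29
rng = random.Random(0)
trials = [simulate_boxes(n, rng) for _ in range(2000)]
print(sum(trials) / len(trials))     # empirical average, close to the above
```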
Conclusions
• Probabilistic theory
– Events, random variables, expected values, etc.
• Basic random processes
– Number of successes
– Guessing with or without memory
– Waiting for the first success
– Collecting coupons
Textbook and Exercises
READING:
• Section 13.3
EXERCISES:
• How many iterations of the randomised min-cut algorithm should we perform to obtain a probability of success of at least 1 - 1/n?
• More general question: suppose that an algorithm MC answers correctly with probability 1/2. How can we modify it to answer correctly with probability at least 1 - 1/n?
For volunteers:
• Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find a precise min-cut?
Overview
Previous lectures:
• Randomized algorithms
• Basic theory: probability, random variable, expected value
• Algorithms: LV sorting, MC min-cut
• Basic random processes
This lecture:
• Randomised caching
Randomised algorithms
• Any kind of algorithm using a (pseudo-)random generator
• Main kinds of algorithms:
– Monte Carlo: the algorithm computes the proper solution with large probability (at least constant)
• A Monte Carlo algorithm always stops
• We want a high probability of success
– Las Vegas: the algorithm always computes the proper solution
• Sometimes the algorithm can run very long, but only with very small probability
• We want a small expected running time (or other complexity measure)
On-line vs. off-line
Dynamic data:
• Arrives during the execution
Algorithms:
• On-line: does not know the future, makes decisions on-line
• Off-line: knows the future, makes decisions off-line
Complexity measure: the competitive ratio:
• The maximum ratio, taken over all data, between the performance of the given on-line algorithm and the optimum off-line solution for the same data
Analyzing the caching process
Two kinds of memory:
• Fast memory: a cache of size k
• Slow memory: a disc of size n
Examples:
• hard disc versus processor cache
• network resources versus local memory
Problem:
• In each step a request for a value arrives;
• If the value is in the cache then answering costs nothing; otherwise it costs one unit (an access to the slow memory)
Performance measure:
• Count the number of accesses to the slow memory
• Compute the competitive ratio
Marking algorithm(s)
• The algorithm proceeds in phases
• Each item in the cache is either marked or unmarked
• At the beginning of each phase all items are unmarked
• Upon a request for item s:
– If s is in the cache then mark s (if it is not already marked)
– Else
• If all items in the cache are marked then finish the current phase and start a new one: unmark all items in the cache
• Remove a randomly selected unmarked item from the cache and put s in its place; mark s
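The steps above can be sketched as follows (a minimal illustrative implementation, not the lecture's code; the `rng` parameter is my addition so runs can be reproduced):

```python
import random

def randomised_marking(requests, k, rng=random):
    """Number of slow-memory accesses (misses) of the Randomised Marking
    algorithm with cache size k on the given request stream."""
    cache, marked, misses = set(), set(), 0
    for s in requests:
        if s in cache:
            marked.add(s)              # mark s (if not already marked)
            continue
        misses += 1                    # an access to the slow memory
        if len(cache) < k:
            cache.add(s)               # cache not yet full: just insert
        else:
            if marked == cache:        # all items marked:
                marked = set()         # finish the phase, unmark everything
            victim = rng.choice(sorted(cache - marked))  # random unmarked item
            cache.discard(victim)
            cache.add(s)
        marked.add(s)
    return misses

print(randomised_marking([1, 2, 3, 1, 2, 3], 3))   # 3 misses (warm-up only)
```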
Example of processing by marking
Stream: 1, 2, 3, 4, 1, 2, 3, 4
Cache (for k = 3 items):
• Phase 1: 1 → [1]; 2 → [1,2]; 3 → [1,2,3]
• Phase 2: starts from [1,2,3]; 4 → [1,3,4]; 1 → [1,3,4]; 2 → [1,2,4]
• Phase 3: starts from [1,2,4]; 3 → [1,2,3]; 4 → [2,3,4]
• Number of accesses to the slow memory: 7
• Optimal algorithm: 5
(In the original slides colour distinguished marked elements, e.g. 1, from newly marked elements, e.g. 4.)
Analysis
• Let r denote the number of phases of the algorithm
• An item can be
– Marked
– Unmarked:
• Fresh - it was not marked during the previous phase
• Stale - it was marked during the previous phase
• Let σ denote the stream of requests,
– cost(σ) denote the number of accesses to the slow memory by the algorithm,
– opt(σ) denote the minimum possible cost on stream σ,
– opt_j(σ) be the number of "misses" in phase j.
• Let c_j denote the number of requests in the data stream to fresh items in phase j
Analysis: fresh items in the optimal solution
(*) After one phase, only items which have been requested in that phase can be stored in the cache
Properties:
• opt_j(σ) + opt_{j+1}(σ) ≥ c_{j+1}
Indeed, in phases j and j+1 there are at least c_{j+1} "misses" in the optimal algorithm, since, by (*), fresh items requested in phase j+1 were not requested in phase j and so could not be present in the cache.
• 2·opt(σ) ≥ Σ_{0≤j<r} [opt_j(σ) + opt_{j+1}(σ)] ≥ Σ_{0≤j<r} c_{j+1}
• opt(σ) ≥ 0.5·Σ_{0<j≤r} c_j
Analysis - stale items
Let X_j be the number of misses by the marking algorithm in phase j
• No misses on marked items - they remain in the cache
• c_j misses on fresh items in phase j
• At the beginning of phase j all items in the cache are stale - they were marked by requests in the previous phase and are now unmarked
• Consider the ith request to an unmarked stale item, say item s:
– each of the remaining k-i+1 stale items is equally likely to be no longer in the cache; at most c_j items were replaced by fresh items, and so s is not in the cache with probability at most c_j/(k-i+1)
• E[X_j] ≤ c_j + Σ_{0<i≤k} c_j/(k-i+1) ≤ c_j·(1 + Σ_{0<i≤k} 1/(k-i+1)) = c_j·(1 + H_k)
Analysis - conclusions
• Let X_j be the number of misses by the marking algorithm in phase j
• cost(σ) denotes the number of accesses to external memory by the algorithm - a random variable
• opt(σ) denotes the minimum possible cost on stream σ - a deterministic value
E[cost(σ)] ≤ Σ_{0<j≤r} E[X_j] ≤ (1 + H_k)·Σ_{0<j≤r} c_j ≤ (2·H_k + 2)·opt(σ)
Conclusions
• Randomised algorithm for caching: O(ln k)-competitive
• Lower bound of k on the competitiveness of any deterministic caching algorithm: for every deterministic algorithm there is a stream of requests that it processes at least k times slower than the optimal processing
Textbook and Exercises
READING:
• Section 13.8
EXERCISES (for volunteers):
• Modify the Randomized Marking algorithm to obtain a k-competitive deterministic algorithm.
• Prove that the Randomized Marking algorithm is at least H_k-competitive.
• Prove that every deterministic caching algorithm is at least k-competitive.
Overview
Previous lectures:
• Randomized algorithms
• Basic theory: probability, random variable, expected value
• Algorithms: LV sorting, MC min-cut
• Basic random processes
• Randomised caching
This lecture:
• Multi-access channel protocols
Ethernet
"Dominant" LAN technology:
• cheap: $20 for 1000 Mbps!
• first widely used LAN technology
• simpler and cheaper than token LANs and ATM
• kept up with the speed race: 10, 100, 1000 Mbps
[Figure: Metcalfe's original Ethernet sketch]
Ethernet Frame Structure
The sending adapter encapsulates an IP datagram (or another network-layer protocol packet) in an Ethernet frame
Preamble:
• 7 bytes with the pattern 10101010 followed by one byte with the pattern 10101011
• used to synchronize the receiver and sender clock rates
Ethernet Frame Structure (more)
• Addresses: 6 bytes
– if the adapter receives a frame with a matching destination address, or with the broadcast address (e.g. an ARP packet), it passes the data in the frame to the network-layer protocol
– otherwise, the adapter discards the frame
• Type: indicates the higher-layer protocol (mostly IP, but others may be supported, such as Novell IPX and AppleTalk)
• CRC: checked at the receiver; if an error is detected, the frame is simply dropped
Unreliable, connectionless service
• Connectionless: no handshaking between the sending and receiving adapters.
• Unreliable: the receiving adapter doesn't send acks or nacks to the sending adapter
– the stream of datagrams passed to the network layer can have gaps
– the gaps will be filled if the application uses TCP
– otherwise, the application will see the gaps
Random Access Protocols
• When a node has a packet to send, it
– transmits at the full channel data rate R
– with no a priori coordination among nodes
• Multiple-access channel:
– one transmitting node at a time → successful access/transmission
– two or more transmitting nodes at a time → collision (no success)
• A random access MAC protocol specifies:
– how to detect collisions
– how to recover from collisions (e.g., via delayed retransmissions)
• Examples of random access MAC protocols:
– ALOHA (slotted, unslotted)
– CSMA (CSMA/CD, CSMA/CA)
Slotted ALOHA
Assumptions:
• all frames have the same size
• time is divided into equal-size slots (the time to transmit one frame)
• nodes start to transmit frames only at the beginning of slots
• nodes are synchronized
• if 2 or more nodes transmit in a slot, all nodes detect the collision
Operation:
• when a node obtains a fresh frame, it transmits it in the next slot
• if there is no collision, the node can send a new frame in the next slot
• if there is a collision, the node retransmits the frame in each subsequent slot with probability p until success
Slotted ALOHA
Pros:
• a single active node can continuously transmit at the full rate of the channel
• highly decentralized: only the slots in the nodes need to be in sync
• simple
Cons:
• collisions, wasting slots
• idle slots
• nodes may be able to detect a collision in less time than it takes to transmit a packet
Slotted Aloha: analysis
Suppose that k stations want to transmit in the same slot. The probability that exactly one station transmits in the next slot is k·p·(1-p)^{k-1}
• If k ≤ 1/(2p) then k·p·(1-p)^{k-1} = Θ(k·p), which is less than 1/2, and applying an analysis similar to the coupon collector problem, the average number of slots until all stations transmit successfully is
Θ(1/p + 1/(2p) + … + 1/(k·p)) = Θ((1/p)·H_k) = Θ((1/p)·ln k)
• If k > 1/(2p) then k·p·(1-p)^{k-1} = Θ(k·p/e^{k·p}), hence the expected time even for the first successful transmission is Θ((1/p)·e^{k·p}/k)
Conclusion: the choice of the probability matters!
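The trade-off can be seen by evaluating the success probability k·p·(1-p)^{k-1} for a few choices of p (an illustrative sketch; the sample values of k and p are mine):

```python
def success_probability(k, p):
    """Probability that exactly one of k backlogged stations transmits."""
    return k * p * (1 - p) ** (k - 1)

k = 100
for p in (0.5, 0.1, 1 / k):
    print(p, success_probability(k, p))
# p = 1/k maximises the probability, which approaches 1/e for large k
```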
CSMA (Carrier Sense Multiple Access)
CSMA: listen before transmitting:
• If the channel is sensed idle: transmit the entire frame
• If the channel is sensed busy: defer the transmission
• Human analogy: don't interrupt others!
CSMA/CD (Collision Detection)
CSMA/CD: carrier sensing and deferral as in CSMA
– collisions are detected within a short time
– colliding transmissions are aborted, reducing channel wastage
• collision detection:
– easy in wired LANs: measure signal strengths, compare transmitted and received signals
– difficult in wireless LANs: the receiver is shut off while transmitting
• human analogy: the polite conversationalist
Ethernet uses CSMA/CD
• No slots
• an adapter doesn't transmit if it senses that some other adapter is transmitting, that is, carrier sense
• a transmitting adapter aborts when it senses that another adapter is transmitting, that is, collision detection
• before attempting a retransmission, an adapter waits a random time, that is, random access
Ethernet CSMA/CD algorithm
1. The adapter gets a datagram and creates a frame
2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
3. If the adapter transmits the entire frame without detecting another transmission, it is done with the frame!
4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
5. After aborting, the adapter enters exponential backoff: after the m-th collision, if m < M, the adapter chooses K at random from {0,1,2,…,2^m - 1}, waits K·512 bit times, and returns to Step 2
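Under this rule, the wait after the m-th collision can be sketched as follows (illustrative only; the cap at ten collisions and the 512-bit-time unit follow the slides, while the function name and the `rng` parameter are mine):

```python
import random

def backoff_wait_bit_times(m, rng=random):
    """Wait time (in bit times) after the m-th collision: K is chosen
    uniformly from {0, 1, ..., 2^m - 1}, with the window capped at 1024."""
    window = 2 ** min(m, 10)     # after ten collisions the window stays at 1024
    k = rng.randrange(window)    # K in {0, ..., window - 1}
    return k * 512

# Example: the expected wait grows with the number of collisions (heavier load).
rng = random.Random(0)
print([backoff_wait_bit_times(m, rng) for m in (1, 2, 10)])
```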
Ethernet's CSMA/CD (more)
Jam signal: makes sure all other transmitters are aware of the collision; 48 bits
Bit time: 0.1 microseconds for 10 Mbps Ethernet; for K = 1023, the wait time is about 50 msec
Exponential backoff:
• Goal: adapt retransmission attempts to the estimated current load
– heavy load: the random wait will be longer
• first collision: choose K from {0,1}; the delay is K × 512 bit transmission times
• after the second collision: choose K from {0,1,2,3}, …
• after ten collisions, choose K from {0,1,2,3,4,…,1023}
See/interact with the Java applet on the AWL Web site: highly recommended!
Ethernet CSMA/CD modified algorithm
1. The adapter gets a datagram and creates a frame; K := 0
2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
3. If the adapter transmits the entire frame without detecting another transmission, it is done with the frame!
4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
5. After aborting, the adapter enters modified exponential backoff: after the m-th collision, if m < M, the adapter
• waits (2^{m-1} - K)·512 bit times
• chooses a new K at random from {0,1,2,…,2^m - 1}, waits K·512 bit times, and returns to Step 2
Modified Exponential Backoff: analysis
Suppose some k stations start the protocol at the same time.
The time for a given packet out of the k packets to be successfully transmitted is O(k) with probability at least 1/4:
• Consider the value of window such that 0.5·window ≤ k < window; the time required to reach this size of window is O(k)
• The probability that the given packet is transmitted successfully during the run of the loop for this value of window is at least
(1 - 1/window)^{k-1} > (1 - 1/k)^k > 1/4
Modified Exponential Backoff: analysis cont.
Suppose some k stations start the protocol at the same time.
The time for all k packets to be successfully transmitted is O(k²) with probability at least 1/2:
• Consider the value of window such that 0.5·window ≤ k² < window; the time required to reach this size of window is O(k²)
• The probability that there is any collision during the run of the loop for this value x = window is at most
k'·(k'-1)/2 · 1/x² · x = k'·(k'-1)/2 · 1/x < k'·(k'-1)/2 · 1/k² < 1/2
where k' ≤ k is the number of packets that have not been successfully transmitted before; there are k'·(k'-1)/2 pairs of stations that may collide, each pair collides in a given slot with probability 1/x², and there are x slots available for collisions
Ethernet Technologies: 10Base2
• 10: 10 Mbps; 2: under 200 metres maximum cable length
• thin coaxial cable in a bus topology
• repeaters used to connect multiple segments
– each segment: up to 30 nodes, up to 185 metres long
– a maximum of 5 segments
• a repeater repeats the bits it hears on one interface to its other interfaces: a physical-layer device only!
• has become a legacy technology
10BaseT and 100BaseT
• 10/100 Mbps rate; the latter is called "fast ethernet"
• T stands for Twisted Pair
• Nodes connect to a hub: "star topology"; 100 m maximum distance between a node and the hub
• Hubs are essentially physical-layer repeaters:
– bits coming in on one link go out on all other links
– no frame buffering
– adapters detect collisions
– hubs provide network-management functionality
• e.g. disconnection of malfunctioning adapters/hosts
[Figure: a hub connected to several nodes in a star topology]
Gbit Ethernet
• uses the standard Ethernet frame format
• allows for point-to-point links and shared broadcast channels
• in shared mode, CSMA/CD is used; short distances between nodes are required for it to be efficient
• uses hubs, here called "Buffered Distributors"
• Full-Duplex at 1 Gbps for point-to-point links
• 10 Gbps now!
Textbook and Exercises
READING:
• Section 13.1
EXERCISES (for volunteers):
• Exercise 3 from Chapter 13