Skip to Content
CSE347Analysis of Algorithms (Lecture 10)

CSE347 Analysis of Algorithms (Lecture 10)

Online Algorithms

Example 1: Elevator

Problem: You’ve entered the lobby of a tall building, and want to go to the top floor as quickly as possible. There is an elevator which takes EE time to get to the top once it arrives. You can also take the stairs which takes SS time to climb (once you start) with S>ES>E. However, you do not know when the elevator will arrive.

Offline (Clairvoyant) vs. Online

Offline: If you know that the elevator is arriving in TT time, the what will you do?

  • Easy. I will computer E+TE+T with SS and take the smaller one.

Online: You do not know when the elevator will arrive.

  • You can either wait for the elevator or take the stairs.

Strategies

Always take the stairs.

Your cost SS,

Optimal Cost: EE.

Your cost / Optimal cost = SE\frac{S}{E}.

SS would be arbitrary large. For example, the Empire State Building has 103103 floors.

Wait for the elevator

Your cost T+ET+E

Optimal Cost: SS (if TT is large)

Your cost / Optimal cost = T+ES\frac{T+E}{S}.

TT could be arbitrary large. For out of service elevator, TT could be infinite.

Online Algorithms

Definition: An online algorithm must take decisions without full information about the problem instance [in this case TT] and/or it does not know the future [e.g. makes decision immediately as jobs come in without knowing the future jobs].

An offline algorithm has the full information about the problem instance.

Competitive Ratio

Quality of online algorithm is quantified by the competitive ratio (Idea is similar to the approximation ratio in optimization).

Consider a problem LL (minimization) and let ll be an instance of this problem.

C(l)C^*(l) is the cost of the optimal offline solution with full information and unlimited computational power.

AA is the online algorithm for LL.

CA(l)C_A(l) is the value of AA‘s solution on ll.

An online algorithm AA is α\alpha-competitive if

CA(l)C(l)α\frac{C_A(l)}{C^*(l)}\leq \alpha

for all instances ll of the problem.

In other words, α=maxlCA(l)C(l)\alpha=\max_l\frac{C_A(l)}{C^*(l)}.

For maximization problems, we want to minimize the comparative ratio.

Back to the Elevator Problem

Strategy 1: Always take the stairs. Ratio is SE\frac{S}{E}. can be arbitrarily large.

Strategy 2: Wait for the elevator. Ratio is T+ES\frac{T+E}{S}. can be arbitrarily large.

Strategy 3: We do not make a decision immediately. Let’s wait for RR times and then takes stairs if elevator does not arrive.

Question: What is the value of RR? (how long to wait?)

Let’s try R=SR=S.

Claim: The comparative ratio is 22.

Proof

Case 1: The optimal offline solution takes the elevator, then T+EST+E\leq S.

We also take the elevator.

Competitive ratio = T+ET+E=1\frac{T+E}{T+E}=1.

Case 2: The optimal offline solution takes the stairs, immediately.

We wait for RR times and then take the stairs. In worst case, we wait for RR times and then take the stairs for RR.

Competitive ratio = 2RR=2\frac{2R}{R}=2.

Let’s try R=SER=S-E instead.

Claim: The comparative ratio is max{1,2ES}max\{1,2-\frac{E}{S}\}.

Proof

Case 1: The optimal offline solution takes the elevator, then T+EST+E\leq S.

We also take the elevator.

Competitive ratio = T+ET+E=1\frac{T+E}{T+E}=1.

Case 2: The optimal offline solution takes the stairs, immediately.

We wait for R=SER=S-E times and then take the stairs.

Competitive ratio = SE+SS=2ES\frac{S-E+S}{S}=2-\frac{E}{S}.

What if we wait less time? Let’s try R=SEϵR=S-E-\epsilon for some ϵ>0\epsilon>0

In the worst case, we take the stairs for SEϵS-E-\epsilon times and then take the stairs for SS.

Competitive ratio = (SEϵ)+SSEϵ+E=2SEϵ2SE>2ES\frac{(S-E-\epsilon)+S}{S-E-\epsilon+E}=\frac{2S-E-\epsilon}{2S-E}>2-\frac{E}{S}.

So the optimal competitive ratio is max{1,2ES}max\{1,2-\frac{E}{S}\} when we wait for SES-E time.

Example 2: Cache Replacement

Cache: Data in a cache is organized in blocks (also called pages or cache lines).

If CPU accesses data that is already in the cache, it is called cache hit, then access is fast.

If CPU accesses data that is not in the cache, it is called cache miss, This block if brought to cache from main memory. If the cache already has kk blocks (full), then another block need to be kicked out (eviction).

Global: Minimize the number of cache misses.

Clairvoyant policy: Knows that will be accessed in the future and the sequence of access.

FIF: Farthest in the future

Example: k=3k=3, cache has 33 blocks.

Sequence: ABCDCABA B C D C A B

Cache: ABCA B C, the evict BB for DD. then 3 warm up and 1 miss.

Online Algorithm: Least recently used (LRU)

LRU: least recently used.

Example: ABCDCABA B C D C A B

Cache: ABCA B C, the evict AA for DD. then 3 warm up and 1 miss.

Cache: DBCD B C, the evict BB for AA. 1 miss.

Cache: DACD A C, the evict DD for BB. 1 miss.

Competitive Ratio for LRU

Claim: LRU is k+1k+1-competitive.

Proof

Split the sequence into subsequences such that each subsequence contains k+1k+1 distinct blocks.

For example, suppose k=3k=3, sequence ABCDCEFGEAABCDCEFGEA, subsequences are ABCDCABCDC and EFGEAEFGEA.

LRU Cache: In each subsequence, it has at most k+1k+1 misses.

The optimal offline solution: In each subsequence, must have at least 11 miss.

So the competitive ratio is at most k+1k+1.

Using similar analysis, we can show that LRU is kk competitive.

Hint for the proof:

Split the sequence into subsequences such that each subsequence LRU has kk misses.

Argue that OPT has at least 11 miss in each subsequence.

Many sensible algorithms are kk-competitive

Lower Bound: No deterministic online algorithm is better than kk-competitive.

Resource augmentation: Offline algorithm (which knows the future) has kk cache lines in its cache and the online algorithm has ckck cache lines with c>1c>1.

Lemma: Competitive Ratio is cc1\sim \frac{c}{c-1}

Say c=2c=2. LRU cache has twice as much as cache. LRU is 22-competitive.

Proof

LRU has cache of size 2k2k.

Divide the sequence into subsequences such that you have ckck distinct pages.

In each subsequence, LRU has at most ckck misses.

The OPT has at least (c1)k(c-1)k misses.

So competitive ratio is at most ck(c1)k=cc1\frac{ck}{(c-1)k}=\frac{c}{c-1}.

Actual competitive ratio is cc1+1k\sim \frac{c}{c-1+\frac{1}{k}}.

Conclusion

  • Definition: some information unknown
  • Clairvoyant vs. Online
  • Competitive Ratio
  • Example:
    • Elevator
    • Cache Replacement

Example 3: Pessimal cache problem

Maximize number of cache misses.

Maximization problem: competitive ratio is max{cost of the optimal online algorithmcost of our algorithm}max\{\frac{\text{cost of the optimal online algorithm}}{\text{cost of our algorithm}}\}.

Or get min{cost of our algorithmcost of the optimal online algorithm}min\{\frac{\text{cost of our algorithm}}{\text{cost of the optimal online algorithm}}\}.

The size of the cache is kk.

So if OPT has XX cache misses, we want Xα\geq \frac{X}{\alpha}. cache misses where α\alpha is the competitive ratio.

Claim: The OPT could always miss (note quite) except when the page is accessed twice in a row.

Claim: No deterministic online algorithm has a bounded competitive ratio. (that is independent of the length of the sequence)

Proof:

Start with an empty cache. (size of cache is kk)

Miss the first kk unique pages.

P1,P2,,PkPk+1,Pk+2,,P2kP_1,P_2,\cdots,P_k|P_{k+1},P_{k+2},\cdots,P_{2k}

Say your deterministic online algorithm choose to evict PiP_i for i{1,2,,k}i\in\{1,2,\cdots,k\}.

We want to choose PiP_i for i{1,2,,k}i\in\{1,2,\cdots,k\} such that the number of misses is maximized.

The optimal offline solution: evict the page that will be accessed furthest in the future. Let’s call it σ\sigma.

The online algorithm: evict PiP_i for i{1,2,,k}i\in\{1,2,\cdots,k\}. Will have k+1k+1 misses in the worst case.

So the competitive ratio is at most σk+1\frac{\sigma}{k+1}, which is unbounded.

Randomized most recently used (RAND, MRU)

MRU without randomization is a deterministic algorithm, and thus, the competitive ration is bounded.

First kk unique accesses brings all pages to cache.

On the k+1k+1th access, pick a random page from the cache and evict it.

After that evict the MRU no a miss.

Claim: RAND is kk-competitive.

Lemma: After the first k+1k+1 unique accesses at all times

  1. 1 page is in the cache with probability 1 (the MRU one)
  2. There exists kk pages each of which is in the cache with probability 11k1-\frac{1}{k}
  3. All other pages are in the cache with probability 00.

Proof

By induction.

Base case: right after the first k+1k+1 unique accesses and before k+2k+2th access.

  1. Pk+1P_{k+1} is in the cache with probability 11.
  2. When we brought Pk+1P_{k+1} to the cache, we evicted one page uniformly at random. (i.e. PiP_i is evicted with probability 1k\frac{1}{k}, PiP_i is still in the cache with probability 11k1-\frac{1}{k})
  3. All other rr pages are definitely not in the cache because we did not see them yet.

Inductive cases:

Let PP be a page that is in the cache with probability 00

Cache miss and RAND MRU evict PP' for another page with probability in this cache with probability 00.

  1. PP is in the cache with probability 11.
  2. By induction, there exists a set of kk pages each of which is in the cache with probability 11k1-\frac{1}{k}.
  3. All other pages are in the cache with probability 00.

Let PP be a page in the cache with probability 11k1-\frac{1}{k}.

With probability 1k\frac{1}{k}, PP is not in the cache and RAND evicts PP' in the cache and brings PP to the cache.

MRU is kk-competitive.

Proof

Case 1: Access MRU page.

Both OPT and our algorithm don’t miss.

Case 2: Access some other 1 page

OPT definitely misses.

RAND MRU misses with probability 1k\geq \frac{1}{k}.

Let’s define the random variable XX as the number of misses of RAND MRU.

E[X]1+1kE[X]\leq 1+\frac{1}{k}.

Last updated on