CSE347 Analysis of Algorithms (Lecture 10)
Online Algorithms
Example 1: Elevator
Problem: You’ve entered the lobby of a tall building, and want to go to the top floor as quickly as possible. There is an elevator which takes time to get to the top once it arrives. You can also take the stairs which takes time to climb (once you start) with . However, you do not know when the elevator will arrive.
Offline (Clairvoyant) vs. Online
Offline: If you know that the elevator is arriving in time, the what will you do?
- Easy. I will computer with and take the smaller one.
Online: You do not know when the elevator will arrive.
- You can either wait for the elevator or take the stairs.
Strategies
Always take the stairs.
Your cost ,
Optimal Cost: .
Your cost / Optimal cost = .
would be arbitrary large. For example, the Empire State Building has floors.
Wait for the elevator
Your cost
Optimal Cost: (if is large)
Your cost / Optimal cost = .
could be arbitrary large. For out of service elevator, could be infinite.
Online Algorithms
Definition: An online algorithm must take decisions without full information about the problem instance [in this case ] and/or it does not know the future [e.g. makes decision immediately as jobs come in without knowing the future jobs].
An offline algorithm has the full information about the problem instance.
Competitive Ratio
Quality of online algorithm is quantified by the competitive ratio (Idea is similar to the approximation ratio in optimization).
Consider a problem (minimization) and let be an instance of this problem.
is the cost of the optimal offline solution with full information and unlimited computational power.
is the online algorithm for .
is the value of ‘s solution on .
An online algorithm is -competitive if
for all instances of the problem.
In other words, .
For maximization problems, we want to minimize the comparative ratio.
Back to the Elevator Problem
Strategy 1: Always take the stairs. Ratio is . can be arbitrarily large.
Strategy 2: Wait for the elevator. Ratio is . can be arbitrarily large.
Strategy 3: We do not make a decision immediately. Let’s wait for times and then takes stairs if elevator does not arrive.
Question: What is the value of ? (how long to wait?)
Let’s try .
Claim: The comparative ratio is .
Proof
Case 1: The optimal offline solution takes the elevator, then .
We also take the elevator.
Competitive ratio = .
Case 2: The optimal offline solution takes the stairs, immediately.
We wait for times and then take the stairs. In worst case, we wait for times and then take the stairs for .
Competitive ratio = .
Let’s try instead.
Claim: The comparative ratio is .
Proof
Case 1: The optimal offline solution takes the elevator, then .
We also take the elevator.
Competitive ratio = .
Case 2: The optimal offline solution takes the stairs, immediately.
We wait for times and then take the stairs.
Competitive ratio = .
What if we wait less time? Let’s try for some
In the worst case, we take the stairs for times and then take the stairs for .
Competitive ratio = .
So the optimal competitive ratio is when we wait for time.
Example 2: Cache Replacement
Cache: Data in a cache is organized in blocks (also called pages or cache lines).
If CPU accesses data that is already in the cache, it is called cache hit, then access is fast.
If CPU accesses data that is not in the cache, it is called cache miss, This block if brought to cache from main memory. If the cache already has blocks (full), then another block need to be kicked out (eviction).
Global: Minimize the number of cache misses.
Clairvoyant policy: Knows that will be accessed in the future and the sequence of access.
FIF: Farthest in the future
Example: , cache has blocks.
Sequence:
Cache: , the evict for . then 3 warm up and 1 miss.
Online Algorithm: Least recently used (LRU)
LRU: least recently used.
Example:
Cache: , the evict for . then 3 warm up and 1 miss.
Cache: , the evict for . 1 miss.
Cache: , the evict for . 1 miss.
Competitive Ratio for LRU
Claim: LRU is -competitive.
Proof
Split the sequence into subsequences such that each subsequence contains distinct blocks.
For example, suppose , sequence , subsequences are and .
LRU Cache: In each subsequence, it has at most misses.
The optimal offline solution: In each subsequence, must have at least miss.
So the competitive ratio is at most .
Using similar analysis, we can show that LRU is competitive.
Hint for the proof:
Split the sequence into subsequences such that each subsequence LRU has misses.
Argue that OPT has at least miss in each subsequence.
Many sensible algorithms are -competitive
Lower Bound: No deterministic online algorithm is better than -competitive.
Resource augmentation: Offline algorithm (which knows the future) has cache lines in its cache and the online algorithm has cache lines with .
Lemma: Competitive Ratio is
Say . LRU cache has twice as much as cache. LRU is -competitive.
Proof
LRU has cache of size .
Divide the sequence into subsequences such that you have distinct pages.
In each subsequence, LRU has at most misses.
The OPT has at least misses.
So competitive ratio is at most .
Actual competitive ratio is .
Conclusion
- Definition: some information unknown
- Clairvoyant vs. Online
- Competitive Ratio
- Example:
- Elevator
- Cache Replacement
Example 3: Pessimal cache problem
Maximize number of cache misses.
Maximization problem: competitive ratio is .
Or get .
The size of the cache is .
So if OPT has cache misses, we want . cache misses where is the competitive ratio.
Claim: The OPT could always miss (note quite) except when the page is accessed twice in a row.
Claim: No deterministic online algorithm has a bounded competitive ratio. (that is independent of the length of the sequence)
Proof:
Start with an empty cache. (size of cache is )
Miss the first unique pages.
Say your deterministic online algorithm choose to evict for .
We want to choose for such that the number of misses is maximized.
The optimal offline solution: evict the page that will be accessed furthest in the future. Let’s call it .
The online algorithm: evict for . Will have misses in the worst case.
So the competitive ratio is at most , which is unbounded.
Randomized most recently used (RAND, MRU)
MRU without randomization is a deterministic algorithm, and thus, the competitive ration is bounded.
First unique accesses brings all pages to cache.
On the th access, pick a random page from the cache and evict it.
After that evict the MRU no a miss.
Claim: RAND is -competitive.
Lemma: After the first unique accesses at all times
- 1 page is in the cache with probability 1 (the MRU one)
- There exists pages each of which is in the cache with probability
- All other pages are in the cache with probability .
Proof
By induction.
Base case: right after the first unique accesses and before th access.
- is in the cache with probability .
- When we brought to the cache, we evicted one page uniformly at random. (i.e. is evicted with probability , is still in the cache with probability )
- All other pages are definitely not in the cache because we did not see them yet.
Inductive cases:
Let be a page that is in the cache with probability
Cache miss and RAND MRU evict for another page with probability in this cache with probability .
- is in the cache with probability .
- By induction, there exists a set of pages each of which is in the cache with probability .
- All other pages are in the cache with probability .
Let be a page in the cache with probability .
With probability , is not in the cache and RAND evicts in the cache and brings to the cache.
MRU is -competitive.
Proof
Case 1: Access MRU page.
Both OPT and our algorithm don’t miss.
Case 2: Access some other 1 page
OPT definitely misses.
RAND MRU misses with probability .
Let’s define the random variable as the number of misses of RAND MRU.
.