CSE347 Analysis of Algorithms (Lecture 11)
More randomized algorithms
Caching problem: You have a cache with blocks and a sequence of accesses, called . The cost of a randomized caching algorithm is the expected number of cache misses on .
Randomized Marking Algorithm
A phase has new pages.
Our goal is to optimize where is the number of new pages in phase .
Marking algorithm:
- at a cache miss, evict an unmarked page uniformly at random
- at the beginning of the algorithm, all the entries are unmarked
- after unique accesses and one miss, all the entries are unmarked
- old pages: pages in cache at the end of the previous phase
- new pages: pages accessed in this phase that are not old.
- new pages always cause a miss.
- old pages can cause a miss if a new page was accessed and replaced that old page and then the old page was accessed again. This can also be caused by old pages replacing other old pages and creating this cascading effect.
Reminder: Competitive ratio for our randomized algorithm is
def randomized_marking_algorithm(sigma, k):
cache = set()
marked = set()
misses = 0
for page in sigma:
if page not in cache:
# once all the blocks are marked, unmark all the blocks
if len(marked) == k:
marked.clear()
# if the cache is full, randomly remove a page that is not marked
if len(cache) == k:
for page in cache:
if page not in marked:
cache.remove(page)
misses += 1
# add the new page to the cache and mark it
cache.add(page)
marked.add(page)
return missesExample:
A cache on phase has blocks and miss on page :
[ new pages] [ old pages] [] []
Proof:
Warning: the first few line of the equation might be wrong.
Let be the probability that page causes a miss when the cache has new pages, old pages, and unmarked blocks.
Using , we have
Fix a phase , let be an indicator random variable
Let be the total number of phases.
So the expected total number of misses is
So the competitive ratio is
Probabilistic boosting for decision problems
Assume that you have a randomized algorithm that gives you the correct answer with probability . for some .
I want to boost the probability of the correct decision to be .
What we can do is to run the algorithm times independently with probability and take the majority vote.
The probability of the wrong decision is
I want to choose such that this is .
So
We use this to solve for .