
Exam 2 Review

Reductions

We say that a problem $A$ reduces to a problem $B$ if there is a polynomial-time reduction function $f$ such that for all $x$, $x \in A \iff f(x) \in B$.

To prove a reduction, we need to show that the reduction function $f$:

  1. runs in polynomial time
  2. satisfies $x \in A \iff f(x) \in B$.

Useful results from reductions

  1. $B$ is at least as hard as $A$ if $A \leq B$.
  2. If we can solve $B$ in polynomial time, then we can solve $A$ in polynomial time.
  3. If we want to solve problem $A$ and we already know an efficient algorithm for $B$, then we can use the reduction $A \leq B$ to solve $A$ efficiently.
  4. If we want to show that $B$ is NP-hard, we can do this by showing that $A \leq B$ for some known NP-hard problem $A$ (see the sketch below).
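
As a concrete example of a reduction, here is a minimal sketch in Python of Independent Set $\leq$ Vertex Cover (the graph encoding is an assumption of this sketch). It relies on the classic fact that $S$ is an independent set of $G$ if and only if $V \setminus S$ is a vertex cover of $G$.

```python
# A minimal sketch of the reduction Independent Set <= Vertex Cover.
# Assumed (hypothetical) input convention: a graph is a pair
# (num_vertices, edge_list).

def reduce_is_to_vc(graph, k):
    """Map the Independent Set instance (graph, k) to the Vertex Cover
    instance (graph, n - k). Constant work beyond reading the input,
    so the reduction clearly runs in polynomial time."""
    num_vertices, edges = graph
    return (graph, num_vertices - k)

# Correctness: S is independent iff no edge has both endpoints in S,
# iff every edge has an endpoint in V \ S, iff V \ S is a vertex cover.
# So G has an independent set of size k iff G has a vertex cover of
# size n - k, i.e., (G, k) is in IS iff f(G, k) is in VC.
```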

$P$ is the class of problems that can be solved in polynomial time. $NP$ is the class of problems whose solutions can be verified in polynomial time.

We know that $P \subseteq NP$.

NP-complete problems

A problem is NP-complete if it is in $NP$ and it is also NP-hard.

NP

A problem is in $NP$ if

  1. there is a polynomial-size certificate for the problem, and
  2. there is a polynomial-time verifier for the problem that takes the certificate and checks whether it is a valid solution (see the sketch below).
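
For example, here is a minimal sketch in Python of a polynomial-time verifier for Independent Set (defined in the list of NP-hard problems below); the certificate is the proposed vertex set itself, which has polynomial size. The edge-list input format is an assumption of this sketch.

```python
# A minimal sketch of a polynomial-time verifier for Independent Set.

def verify_independent_set(edges, k, certificate):
    """Accept iff `certificate` is a set of at least k vertices with
    no two of them adjacent. One pass over the edges, so the verifier
    runs in time polynomial in the input size."""
    s = set(certificate)
    if len(s) < k:
        return False
    # Reject if any edge has both endpoints inside the certificate.
    return all(not (u in s and v in s) for (u, v) in edges)

# A triangle has an independent set of size 1 but not of size 2.
assert verify_independent_set([(0, 1), (1, 2), (0, 2)], 1, [0])
assert not verify_independent_set([(0, 1), (1, 2), (0, 2)], 2, [0, 2])
```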

NP-hard

A problem is NP-hard if every problem in $NP$ can be reduced to it in polynomial time.

List of known NP-hard problems:

  1. 3-SAT (or SAT):
    • Statement: Given a Boolean formula in CNF with at most 3 literals per clause, is there an assignment of truth values to the variables that makes the formula true?
  2. Independent Set:
    • Statement: Given a graph $G$ and an integer $k$, does $G$ contain a set of $k$ vertices such that no two vertices in the set are adjacent?
  3. Vertex Cover:
    • Statement: Given a graph $G$ and an integer $k$, does $G$ contain a set of $k$ vertices such that every edge in $G$ is incident to at least one vertex in the set?
  4. 3-coloring:
    • Statement: Given a graph $G$, can each vertex be assigned one of 3 colors such that no two adjacent vertices have the same color?
  5. Hamiltonian Cycle:
    • Statement: Given a graph $G$, does $G$ contain a cycle that visits every vertex exactly once?
  6. Hamiltonian Path:
    • Statement: Given a graph $G$, does $G$ contain a path that visits every vertex exactly once?

Approximation Algorithms

  • Consider optimization problems whose decision problem variant is NP-hard. Unless $P = NP$, finding an optimal solution to these problems cannot be done in polynomial time.
  • In approximation algorithms, we make a trade-off: we're willing to accept sub-optimal solutions in exchange for polynomial runtime.
  • The Approximation Ratio of our algorithm is the worst-case ratio of our solution to the optimal solution.
  • For minimization problems, this ratio is $\max_{l\in L}\left(\frac{c_A(l)}{c_{OPT}(l)}\right)$, since our solution will be larger than OPT.
  • For maximization problems, this ratio is $\max_{l\in L}\left(\frac{c_{OPT}(l)}{c_A(l)}\right)$, since our solution will be smaller than OPT.
  • If you are given an algorithm and need to show it achieves some desired approximation ratio, there are a few approaches.
  • In recitation, we saw Max-Subset Sum. We found upper bounds on the optimal solution and showed that the given algorithm would always give a solution with value at least half of the upper bound, giving our approximation ratio of 2.
  • In lecture, you saw the Vertex Cover 2-approximation (sketched below). Here, you select any uncovered edge $(u, v)$ and add both $u$ and $v$ to the cover. We argued that at least one of $u$ or $v$ must be in the optimal cover, as the edge must be covered, so at every step we added at least one vertex from an optimal solution, and potentially one extra. So, the size of our cover can be no larger than twice the optimal.
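
A minimal sketch of this 2-approximation in Python (the edge-list input format is an assumption of this sketch):

```python
# Vertex Cover 2-approximation: repeatedly take any uncovered edge
# and add both of its endpoints to the cover.

def vertex_cover_2_approx(edges):
    cover = set()
    for (u, v) in edges:
        # If this edge is still uncovered, take both endpoints.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# The selected edges share no endpoints, and any optimal cover must
# contain at least one endpoint of each of them, so
# |cover| = 2 * (number of selected edges) <= 2 * OPT.
print(vertex_cover_2_approx([(0, 1), (1, 2), (2, 3)]))  # {0, 1, 2, 3}
```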

Randomized Algorithms

Sometimes, we can get better expected performance from an algorithm by introducing randomness.

We trade the guaranteed runtime and solution quality of a deterministic algorithm for the expected runtime and expected solution quality of a randomized algorithm.

We can use various bounds and tricks to calculate and amplify the probability of success.

Chernoff Bound

Statement:

$$\Pr[X < (1-\delta)E[X]] \leq e^{-\frac{\delta^2 E[X]}{2}}$$

Requirements:

  • $X$ is the sum of $n$ independent random variables
  • For example, in class you used the Chernoff bound to bound the probability of getting fewer than $d$ good partitions, since the partitions are good or bad independently – the quality of one partition does not affect the quality of the next.
  • Usage: If you have some probability $\Pr[X < \text{something}]$ that you want to bound, find $E[X]$, then find a value of $\delta$ such that $(1-\delta)E[X] = \text{something}$. You can then plug $\delta$ and $E[X]$ into the Chernoff bound (worked example below).
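
For instance (a worked example with numbers chosen for illustration): let $X$ be the number of heads in $n$ independent fair coin flips, so $E[X] = \frac{n}{2}$. To bound $\Pr[X < \frac{n}{4}]$, solve $(1-\delta)\frac{n}{2} = \frac{n}{4}$ to get $\delta = \frac{1}{2}$, giving

$$\Pr\left[X < \frac{n}{4}\right] \leq e^{-\frac{(1/2)^2 \cdot (n/2)}{2}} = e^{-\frac{n}{16}}$$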

Markov’s Inequality

Statement:

$$\Pr[X \geq a] \leq \frac{E[X]}{a}$$

Requirements:

  • $X$ is a non-negative random variable
  • No assumptions about independence
  • Usage: If you have some probability $\Pr[X \geq \text{something}]$ that you want to bound, find $E[X]$ and set $a = \text{something}$. You can then plug $a$ and $E[X]$ into Markov's inequality (worked example below).
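
For instance (numbers chosen for illustration): if an algorithm's runtime $X$ is non-negative with $E[X] = 10$ steps, then taking $a = 100$ gives

$$\Pr[X \geq 100] \leq \frac{10}{100} = \frac{1}{10}$$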

Union Bound

Statement:

$$\Pr\left[\bigcup_{i=1}^n e_i\right] \leq \sum_{i=1}^n \Pr[e_i]$$
  • Conceptually, it says that the probability that at least one event out of a collection occurs is no more than the sum of the probabilities of the individual events.
  • Usage: To bound the probability of some bad event $e$ that occurs only when one of the events $e_i$ occurs, we can use the union bound to sum up the probabilities of the bad events $e_i$ and use that sum to bound $\Pr[e]$ (worked example below).
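
For instance (numbers chosen for illustration): if an algorithm makes $n$ random choices and each choice is bad with probability at most $\frac{1}{n^2}$, then the probability that any choice is bad is at most $n \cdot \frac{1}{n^2} = \frac{1}{n}$ – no independence assumptions needed.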

Probabilistic Boosting via Repeated Trials

  • If we want to reduce the probability of some bad event $e$ to some value $p$, we can run the algorithm repeatedly and take a majority vote over the outputs.
  • Assume we run the algorithm $k$ times, and the probability of success on each run is $\frac{1}{2} + \epsilon$.
  • The probability that all $k$ trials fail is $\left(\frac{1}{2}-\epsilon\right)^k$.
  • The probability that the majority vote of $k$ runs is wrong is the same as the probability that at least $\frac{k}{2}+1$ trials fail.
  • So, the probability is

$$\begin{aligned} \Pr[\text{majority fails}] &= \sum_{i=\frac{k}{2}+1}^{k}\binom{k}{i}\left(\frac{1}{2}-\epsilon\right)^i\left(\frac{1}{2}+\epsilon\right)^{k-i} \\ &\leq \binom{k}{\frac{k}{2}+1}\left(\frac{1}{2}-\epsilon\right)^{\frac{k}{2}+1} \end{aligned}$$
  • If we want this probability to be at most $p$, we solve for $k$ in the inequality $\binom{k}{\frac{k}{2}+1}\left(\frac{1}{2}-\epsilon\right)^{\frac{k}{2}+1} \leq p$ (see the sketch below).
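
A minimal sketch of majority-vote boosting in Python (`run_once` is a hypothetical stand-in for the randomized decision algorithm being boosted):

```python
import random
from collections import Counter

def run_once(x):
    # Stand-in: the correct answer is True, returned with probability
    # 0.6, i.e., 1/2 + epsilon with epsilon = 0.1.
    return random.random() < 0.6

def majority_vote(x, k):
    """Run the algorithm k times and return the majority answer.
    Choosing k large enough (per the inequality above) drives the
    failure probability below any desired p."""
    votes = Counter(run_once(x) for _ in range(k))
    return votes.most_common(1)[0][0]

print(majority_vote(None, 101))  # True with high probability
```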

Online Algorithms

  • We make decisions on the fly, without knowing the future.
  • The offline optimum is the optimal solution that knows the future.
  • The competitive ratio of an online algorithm is the worst-case ratio of the cost of the online algorithm to the cost of the offline optimum. (When the offline problem is NP-complete, an online algorithm for it is also an approximation algorithm.)

$$\text{Competitive Ratio} = \frac{C_{online}}{C_{offline}}$$
  • We do case-by-case analysis to show that the competitive ratio is at most some value, just like in approximation ratio proofs.