Approximation Algorithms
(for Database Researchers)
Optimization and computation problems
Let f be a function f : X → R (often, but not always, a function that counts something). We focus in this talk on:
Optimization problems.
Given a set of objects S, find some subset X ⊆ S such that f(X) is minimal (or maximal) among all X satisfying some conditions.
Given a database D, find some set of tuples X such that f(X) is minimal among all X satisfying some conditions.
Computation problems.
Given a set of objects S, compute the value of f(S).
Given a database D, compute the value of f(D).
Examples of optimization problems
Examples
Maximum Matching. Given a set of tasks and a set of workers with preferences of workers on tasks, find an assignment for all tasks that maximizes satisfaction.
Set Cover. Given a set of people, each with fluency in various languages, find a group of people of minimum size who can speak all the languages.
Vertex Cover. In a city, find a minimum-size set of road intersections at which to place street cameras so that all roads are covered.
Inconsistent Data Repair. Given a database inconsistent w.r.t. fixed integrity constraints, find the minimum number of tuples to add or remove to make it consistent.
Influence Maximization. Given a social network with influence probabilities on edges, find the set of nodes to target to maximize the impact of a marketing campaign.
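
To make the "find X such that f(X) is minimal" pattern concrete, here is a minimal Python sketch of an exact, brute-force solver for the Vertex Cover example. The road network `ROADS`, the intersection names, and the function name are hypothetical, chosen only for illustration. The solver tries subsets by increasing size, so its running time is exponential in the number of intersections.

```python
from itertools import combinations

def min_vertex_cover(vertices, edges):
    """Exact minimum vertex cover by exhaustive search (exponential time)."""
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            # A cover must contain at least one endpoint of every edge.
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen

# Hypothetical road network: intersections a..e, one pair per road.
INTERSECTIONS = ["a", "b", "c", "d", "e"]
ROADS = [("a", "b"), ("a", "d"), ("b", "c"), ("c", "d"), ("c", "e")]

best = min_vertex_cover(INTERSECTIONS, ROADS)
print(sorted(best))
```

Here f(X) is simply the size of X, and the condition is that every road has a camera at one of its two ends.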
Examples of Computation Problems
Examples
Coloring Counting. Compute the number of ways to color a graph with 3 colors.
SQL Match Counting. Compute the number of distinct matches to a fixed SQL query:
    SELECT COUNT(DISTINCT *)
    FROM R NATURAL JOIN S NATURAL JOIN T
Navigational XPath Counting. Compute the number of matches of a fixed simple XPath expression (no functions, no equality):
    count(//a[b/c]/d[e/f])
Probabilistic Query Evaluation. Compute the probability of a fixed SQL query over a database whose tuples are annotated with probabilities.
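
As an illustration of a computation problem, the following Python sketch counts 3-colorings by exhaustive enumeration. It assumes that "coloring" means a proper coloring (adjacent vertices get different colors), which the slide does not state explicitly; the function name and the two small graphs are hypothetical. The loop visits all 3^n assignments, so it is exponential in the number of vertices.

```python
from itertools import product

def count_3_colorings(vertices, edges):
    """Count proper 3-colorings by trying all 3**n color assignments."""
    count = 0
    for colors in product(range(3), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        # Proper coloring (assumption): the endpoints of every edge differ.
        if all(coloring[u] != coloring[v] for u, v in edges):
            count += 1
    return count

triangle = count_3_colorings(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
path = count_3_colorings(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(triangle, path)  # 6 12
```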
Intractability
Most of these problems are intractable: unless P=NP, there is no polynomial-time algorithm to solve them, only algorithms exponential in the size of the data!
Two classes of intractability are discussed here: NP-hardness for optimization problems and #P-hardness for computation problems (the latter implies the former; see further on).
Polynomial-time        NP-hard                   #P-hard
Max. Matching          Set Cover                 Coloring Counting
Nav. XPath Counting    Vertex Cover              SQL Match Counting
                       Incons. Data Repair       Prob. Query Evaluation
                       Influence Maximization
Is this the end of it?
Many real-world tasks require solving hard problems
Be persistent, do not stop because you have encountered an intractable problem!
Different strategies:
Find tractable subcases
Find heuristic algorithms that are good enough in practice, though without any guarantee
Find deterministic algorithms that provide a guaranteed approximation
Find randomized algorithms that provide a guaranteed approximation with high probability
Outline
Introduction
Intractable Classes
Deterministic Approximations
Randomized Approximations
Conclusion
NP-hardness
A decision (i.e., yes/no) problem is in NP if there exists a nondeterministic polynomial-time algorithm (i.e., the algorithm is allowed to make a guess) that solves it
A problem X is NP-hard if it is at least as hard as any problem in NP: being able to solve X means you can solve any problem in NP with deterministic polynomial-time overhead
NP-complete: a decision problem that is both in NP and NP-hard
To prove NP-hardness of X, you show a polynomial-time reduction to X from a problem Y already known to be NP-hard: you take an instance of Y and show that if you know how to solve X, then you can solve Y with deterministic polynomial-time overhead
Technical note: most definitions of NP-hardness for decision problems are a bit stricter because they are based on Karp many-one reductions. This matters if you want to distinguish, e.g., NP-hardness from coNP-hardness. For optimization/computation problems it is irrelevant, so I use simpler Turing reductions.
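
The "guess" in the definition of NP can be read as a certificate that is checked in polynomial time. The Python sketch below illustrates this on the decision version of Vertex Cover ("is there a cover of size at most k?"). The graph, the function names, and the choice of problem are assumptions made for illustration. `verify_cover` is the polynomial-time check of one guess; `has_cover` simulates the nondeterministic algorithm deterministically by trying every guess, which takes exponential time.

```python
from itertools import combinations

# Hypothetical instance, for illustration only.
VERTICES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("a", "d"), ("b", "c"), ("c", "d"), ("c", "e")]

def verify_cover(edges, k, guess):
    """Polynomial-time check of a single guess (certificate)."""
    return len(guess) <= k and all(u in guess or v in guess for u, v in edges)

def has_cover(vertices, edges, k):
    """Deterministic simulation: answer yes iff some guess passes the check."""
    return any(
        verify_cover(edges, k, set(guess))
        for size in range(k + 1)
        for guess in combinations(vertices, size)
    )

print(verify_cover(EDGES, 2, {"a", "c"}))  # True: this guess is accepted
print(has_cover(VERTICES, EDGES, 1))       # False: no guess of size <= 1 works
```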
#P-hardness
A counting problem is in #P if it can be solved by counting the number of ways a nondeterministic polynomial-time algorithm (i.e., the algorithm is allowed to make a guess, and you count the various ways to guess) can return “yes”
A problem X is #P-hard if it is at least as hard as any problem in #P: being able to solve X means you can solve any problem in #P with deterministic polynomial-time overhead
To prove #P-hardness of X, you show a polynomial-time reduction to X from a problem Y already known to be #P-hard: you take an instance of Y and show that if you know how to solve X, then you can solve Y with deterministic polynomial-time overhead
Technical note: most definitions of #P-hardness for counting problems are a bit stricter because they are based on Karp many-one reductions. I use simpler Turing reductions; there is not much practical difference.
#P-hardness implies NP-hardness
Proof.
Take a #P-hard problem X. Let us show it is NP-hard as well.
Take an arbitrary NP-complete problem Y. There is a nondeterministic polynomial-time algorithm A that solves Y.
Consider the problem Z that counts the number of ways A can return “yes”. This is a #P problem.
Thus Z reduces to X, because X is #P-hard.
But Y reduces to Z: use Z to count, and return “yes” iff the count is > 0.
Therefore Y reduces to X, and thus X is NP-hard.
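
The step "Y reduces to Z" of the proof can be sketched in Python. This is an illustration under several assumptions: guesses are encoded as bit strings, `check` stands for the polynomial-time test that A performs on one guess, and brute-force enumeration stands in for the solver of Z (which the proof obtains from X). The graph and all names are hypothetical.

```python
from itertools import product

def count_accepting_guesses(check, n_bits):
    """Problem Z: number of guesses on which the checker answers yes."""
    return sum(1 for guess in product((0, 1), repeat=n_bits) if check(guess))

def decide(check, n_bits):
    """Problem Y, solved with a single call to a solver for Z."""
    return count_accepting_guesses(check, n_bits) > 0

# Hypothetical instance: does this graph have a vertex cover of size <= k?
VERTICES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("a", "d"), ("b", "c"), ("c", "d"), ("c", "e")]

def cover_checker(k):
    def check(guess):
        chosen = {v for v, bit in zip(VERTICES, guess) if bit}
        return len(chosen) <= k and all(u in chosen or w in chosen for u, w in EDGES)
    return check

print(count_accepting_guesses(cover_checker(2), len(VERTICES)))  # 1
print(decide(cover_checker(1), len(VERTICES)))                   # False
```

The count carries strictly more information than the yes/no answer, which is why the implication goes from #P-hardness to NP-hardness.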
Outline
Introduction
Intractable Classes
Deterministic Approximations
    Approximation Algorithms
    FPTAS
Randomized Approximations
Conclusion
Additive (absolute) approximation
Let ϕ : R → R+.
Definition
An optimization algorithm A provides an additive ϕ-approximation for a problem P with optimal solution X∗ if the solution X returned by A satisfies the conditions of P and is such that
    |f(X) − f(X∗)| ≤ ϕ(f(X∗))
Definition
A computation algorithm A provides an additive ϕ-approximation for a problem P with actual solution v∗ if the value v returned by A is such that
    |v − v∗| ≤ ϕ(v∗)
Multiplicative (relative) approximation
Let ϕ : R → R+.
Definition
An optimization algorithm A provides a multiplicative ϕ-approximation for a problem P with optimal solution X∗ if the solution X returned by A satisfies the conditions of P and is such that
    |f(X)| ≤ ϕ(f(X∗)) |f(X∗)|   if P is a minimization problem
    |f(X)| ≥ ϕ(f(X∗)) |f(X∗)|   if P is a maximization problem
Definition (attention, inconsistent notation!)
A computation algorithm A provides a multiplicative ϕ-approximation for a problem P with actual solution v∗ if the value v returned by A is such that
    (1 − ϕ(v∗)) |v∗| ≤ |v| ≤ (1 + ϕ(v∗)) |v∗|
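
The three multiplicative conditions can be written as small Python predicates, which also makes the inconsistency in notation visible: for optimization problems ϕ is a ratio (at least 1 for minimization, at most 1 for maximization), whereas for computation problems ϕ is a relative error. This is a sketch only; the function names are hypothetical and ϕ is assumed to be a constant number `phi` rather than a function of the optimum.

```python
def is_mult_approx_min(value, optimum, phi):
    """Minimization: |f(X)| <= phi * |f(X*)|, with phi >= 1."""
    return abs(value) <= phi * abs(optimum)

def is_mult_approx_max(value, optimum, phi):
    """Maximization: |f(X)| >= phi * |f(X*)|, with phi <= 1."""
    return abs(value) >= phi * abs(optimum)

def is_mult_approx_value(value, exact, phi):
    """Computation: (1 - phi) * |v*| <= |v| <= (1 + phi) * |v*|."""
    return (1 - phi) * abs(exact) <= abs(value) <= (1 + phi) * abs(exact)

# A cover of size 4 against an optimum of size 2 is a 2-approximation.
print(is_mult_approx_min(4, 2, 2))            # True
# A count of 120 against an exact count of 100 is within 25% relative error.
print(is_mult_approx_value(120, 100, 0.25))   # True
```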
APX
We want the approximation algorithms to be polynomial-time
Ideally, ϕ is constant
For a constant ϕ, multiplicative approximation is better than additive approximation
Additive approximation is thus rarely used; “approximation algorithm” usually means multiplicative
APX: class of optimization problems that have a polynomial-time multiplicative approximation algorithm with constant ϕ
Vertex Cover is in APX
[Figure: example graph on the vertices a, b, c, d, e]
Optimal: {a, c}, size 2
Approximated: {a, b, c, e}, size 4
Approximation algorithm:
Choose an arbitrary edge that is not yet covered
Add both endpoints to the cover
Repeat until all edges are covered
Multiplicative 2-approximation! (The approximated cover has twice as many nodes as edges chosen; the chosen edges share no endpoint, so any cover, including the optimal one, needs a distinct node for each of them.)
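
The three steps of the approximation algorithm translate almost directly into Python. In the sketch below, the "arbitrary" choice is simply the order of the edge list. The edge list itself is hypothetical: the edges of the slide's figure are not recoverable, so `EDGES` was chosen only to be consistent with the two covers reported on the slide.

```python
def vertex_cover_2approx(edges):
    """Return a vertex cover at most twice the size of a minimum one."""
    cover = set()
    for u, v in edges:
        # An edge is uncovered iff neither endpoint is in the cover yet.
        if u not in cover and v not in cover:
            cover.update((u, v))  # add both endpoints
    return cover

# Hypothetical edges, consistent with the covers shown on the slide.
EDGES = [("a", "b"), ("a", "d"), ("c", "e"), ("b", "c"), ("c", "d")]

approx = vertex_cover_2approx(EDGES)
print(sorted(approx))  # ['a', 'b', 'c', 'e']
```

On this input the algorithm picks the edges (a, b) and (c, e), giving a cover of size 4, while {a, c} is a cover of size 2. A different edge order may give a different cover, but never more than twice the optimum.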
Other examples
Set Cover has a (ln n + O(1))-approximation algorithm [Chv79] but is not in APX [LY94]
Inconsistent Data Repair is in APX (but the constant depends on the dependencies) [KL09]
Influence Maximization is in APX; it has a (1 − 1/e − ε)-approximation algorithm for any ε > 0 (slightly better than 63%) [KKT03]
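
For Set Cover, the classic greedy rule (repeatedly pick the set that covers the most still-uncovered elements) is known to achieve a logarithmic approximation factor; reading the slide's [Chv79] bound as referring to this rule is an assumption. A minimal Python sketch, using the people-and-languages reading of Set Cover from earlier in the talk, with hypothetical names and data:

```python
def greedy_set_cover(universe, subsets):
    """Greedy Set Cover: pick the set covering the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        name = max(subsets, key=lambda s: len(subsets[s] & uncovered))
        if not subsets[name] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(name)
        uncovered -= subsets[name]
    return chosen

# Hypothetical data: which languages each person speaks.
SPEAKERS = {
    "Ana": {"en", "fr"},
    "Bo": {"fr", "de", "es"},
    "Chen": {"zh", "en"},
    "Dee": {"es"},
}
LANGUAGES = {"en", "fr", "de", "es", "zh"}

group = greedy_set_cover(LANGUAGES, SPEAKERS)
print(group)  # ['Bo', 'Chen']
```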
Inapproximability results
It is also possible to show that some problem is not ϕ-approximable (we assume P ≠ NP in this slide)
Vertex Cover is not 1.3606-approximable [DS05] (!)
Set Cover is not (ln n − o(ln n))-approximable [DS14]
Results of this kind are usually much more difficult to obtain than approximation algorithms
How to find an approximation algorithm?
From scratch, by exploiting the structure of the problem (as we did with Vertex Cover)
By exploiting approximation-preserving reductions between a problem and an approximable problem (in both directions); there are various notions of approximation-preserving reduction, and arbitrary reductions do not work
Outline
Introduction
Intractable Classes
Deterministic Approximations
  Approximation Algorithms
  FPTAS
Randomized Approximations
Conclusion
From APX to FPTAS
APX: polynomial-time c-approximation for some fixed constant c
Useful, but it would be better if we could have c arbitrarily close to 1
PTAS (Polynomial-Time Approximation Scheme): for any ε > 0, there exists a polynomial-time (1 + ε)-approximation (for a minimization problem); (1 − ε)-approximation for a maximization problem; ε-approximation for a computation problem
Great, but the running time of these approximations may grow very quickly as ε nears 0 (for instance, exponentially in 1/ε)
FPTAS (Fully Polynomial-Time Approximation Scheme): PTAS whose overall complexity depends polynomially on 1/ε
Problems with an FPTAS?
Fairly rare!
Neither Vertex Cover, nor Inconsistent Data Repair [KL09], nor Influence Maximization [KKT03] has an FPTAS (unless P = NP)
There are still some problems that admit an FPTAS:
Example
Knapsack. Given a collection of items, each with a weight and a volume, find a subset of items of maximum volume whose total weight does not exceed some fixed limit
Knapsack is an NP-hard problem, but there exists an FPTAS
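To make the Knapsack claim concrete, here is a minimal sketch of the textbook volume-scaling scheme: volumes are rounded down to multiples of ε · (largest volume) / (number of items), and a dynamic program then finds, for every rounded total volume, the lightest subset reaching it. The slides do not say which FPTAS they have in mind, so this choice, the function name `knapsack_fptas` and its parameters are all assumptions.

```python
from math import floor
from typing import Dict, List, Sequence, Tuple


def knapsack_fptas(volumes: Sequence[float], weights: Sequence[float],
                   limit: float, eps: float) -> List[int]:
    """Return item indices whose total volume is at least (1 - eps) * optimum."""
    if not 0 < eps < 1:
        raise ValueError("eps must be in (0, 1)")
    # Items that do not fit on their own, or bring no volume, can be ignored.
    items = [i for i in range(len(volumes))
             if weights[i] <= limit and volumes[i] > 0]
    if not items:
        return []
    # Rounding unit: at most eps * optimum is lost over the whole solution.
    scale = eps * max(volumes[i] for i in items) / len(items)
    scaled = {i: floor(volumes[i] / scale) for i in items}
    # lightest[s] = (smallest total weight, items used) for rounded volume s.
    lightest: Dict[int, Tuple[float, Tuple[int, ...]]] = {0: (0.0, ())}
    for i in items:
        # Iterate over a snapshot so that each item is used at most once.
        for s, (w, used) in list(lightest.items()):
            new_s, new_w = s + scaled[i], w + weights[i]
            if new_w <= limit and (new_s not in lightest
                                   or new_w < lightest[new_s][0]):
                lightest[new_s] = (new_w, used + (i,))
    # Largest rounded volume that fits within the weight limit.
    return list(lightest[max(lightest)][1])
```

The table has at most about n²/ε entries for n items, so the running time is polynomial in both n and 1/ε, which is what distinguishes an FPTAS from a PTAS.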
Outline
Introduction
Intractable Classes
Deterministic Approximations
Randomized Approximations
  Generalities
  Monte-Carlo Sampling
  FPRAS
Conclusion
Randomized approximations
To simplify, we only talk about computation problems. ϕ: R → R+, δ > 0.
Definition
A computation algorithm A provides a randomized additive (ϕ, δ)-approximation for a problem P with actual solution v∗ if the value v returned by A is such that
|v − v∗| ≤ ϕ(v∗) with probability ≥ 1 − δ
Definition
A computation algorithm A provides a randomized multiplicative (ϕ, δ)-approximation for a problem P with actual solution v∗ if the value v returned by A is such that
(1 − ϕ(v∗)) |v∗| ≤ |v| ≤ (1 + ϕ(v∗)) |v∗| with probability ≥ 1 − δ
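Hoeffding’s Inequality
Let $X_1, \dots, X_n$ be $n$ independent random variables, each within the interval $[a, b]$, and $\bar{X} = \frac{1}{n} \sum_i X_i$ the empirical mean.
We have [Hoe63]:
\[ \Pr\left( \left| \bar{X} - E[\bar{X}] \right| \ge \varepsilon \right) \le 2 \exp\left( -\frac{2 n \varepsilon^2}{(b - a)^2} \right) \]
In other words, requiring the right-hand side to be at most $\delta$ and solving for $n$, we know that $\Pr\left( \left| \bar{X} - E[\bar{X}] \right| \ge \varepsilon \right) \le \delta$ as long as:
\[ n \ge \frac{(b - a)^2}{2 \varepsilon^2} \ln \frac{2}{\delta} \]
Often too conservative!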
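The sample size can be computed mechanically. The sketch below solves the two-sided bound 2 exp(−2nε²/(b − a)²) ≤ δ for n; the function names `hoeffding_bound` and `hoeffding_sample_size` are hypothetical.

```python
from math import ceil, exp, log


def hoeffding_bound(n: int, eps: float, a: float = 0.0, b: float = 1.0) -> float:
    """Two-sided Hoeffding bound on Pr(|mean - E[mean]| >= eps) for n samples."""
    return 2 * exp(-2 * n * eps ** 2 / (b - a) ** 2)


def hoeffding_sample_size(eps: float, delta: float,
                          a: float = 0.0, b: float = 1.0) -> int:
    """Smallest n for which the two-sided Hoeffding bound is at most delta."""
    if eps <= 0 or not 0 < delta < 1 or b <= a:
        raise ValueError("need eps > 0, 0 < delta < 1 and a < b")
    return ceil((b - a) ** 2 / (2 * eps ** 2) * log(2 / delta))
```

The number of samples grows only logarithmically in 1/δ but quadratically in 1/ε: halving the tolerated error multiplies the required sample size by four.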
Application to Polling
Poll of $n$ persons in a country of $m$ inhabitants
Every person $i$ is asked if they prefer politician A or politician B. We let $X_i = 0$ if they prefer A, 1 otherwise
We are interested in predicting the result of an election between A and B; $E[\bar{X}]$ is the expected proportion of votes for B
We want a margin of error of $\varepsilon = 2\%$, and a probabilistic guarantee of $1 - \delta = 95\%$
So, by Hoeffding’s inequality with $[a, b] = [0, 1]$, requiring $2 \exp(-2 n \varepsilon^2) \le \delta$, we just need:
\[ n \ge \frac{1}{2 \varepsilon^2} \ln \frac{2}{\delta} \approx 4611.1, \quad \text{that is, } n \ge 4612 \]
This is completely independent of $m$!
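The independence from m can be checked empirically. The following sketch simulates polls in hypothetical countries of very different sizes and counts how often the poll misses the true share by more than ε. It assumes respondents are drawn uniformly and independently (with replacement); the function names, the 52% share and the sample size of 5000 are illustrative choices only.

```python
import random


def run_poll(m: int, n: int, votes_for_b: int, rng: random.Random) -> float:
    """Poll n inhabitants drawn uniformly with replacement; return B's share."""
    # Inhabitants 0 .. votes_for_b - 1 prefer B (X_i = 1), the others prefer A.
    answers = (1 if rng.randrange(m) < votes_for_b else 0 for _ in range(n))
    return sum(answers) / n


def failure_rate(m: int, n: int, share_b: float, eps: float,
                 polls: int, seed: int = 0) -> float:
    """Fraction of simulated polls whose error exceeds eps."""
    rng = random.Random(seed)
    votes_for_b = round(share_b * m)
    true_share = votes_for_b / m
    failures = sum(abs(run_poll(m, n, votes_for_b, rng) - true_share) > eps
                   for _ in range(polls))
    return failures / polls


if __name__ == "__main__":
    for m in (10_000, 1_000_000_000):
        print(m, failure_rate(m, n=5000, share_b=0.52, eps=0.02, polls=40))
```

With the same n, the observed failure rate should be similar for a country of ten thousand inhabitants and for one of a billion.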
Monte-Carlo Sampling
Assumptions:
We can sample in polynomial time from a population
Given a sample, we can evaluate a certain (bounded) quantity in polynomial time
Then we can compute the mean of that quantity with a polynomial-time randomized additive (ε, δ)-approximation algorithm for arbitrary ε > 0, δ > 0
This is a direct application of Hoeffding’s inequality, which can be used to obtain the required number of samples
Examples
Polling
Computation of π
Probabilistic Query Evaluation
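As a sketch of the second example, π can be estimated by sampling points uniformly in the unit square: the indicator "the point falls inside the quarter disc" lies in [0, 1] and has mean π/4. The number of samples below comes from solving the two-sided Hoeffding bound for an additive error of ε/4 on that mean; the function name `estimate_pi` is hypothetical.

```python
import random
from math import ceil, log


def estimate_pi(eps: float, delta: float, rng: random.Random) -> float:
    """Additive (eps, delta)-approximation of pi by Monte-Carlo sampling."""
    # An error of eps on pi corresponds to an error of eps / 4 on the mean.
    mean_eps = eps / 4
    samples = ceil(log(2 / delta) / (2 * mean_eps ** 2))
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        # Indicator of the quarter disc of radius 1, whose area is pi / 4.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / samples


if __name__ == "__main__":
    print(estimate_pi(eps=0.05, delta=0.01, rng=random.Random(0)))
```

Both assumptions are met here: drawing a point and testing whether it lies in the disc each take constant time.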
FPRAS
PRAS (Polynomial-time Randomized Approximation Scheme): there exists a polynomial-time randomized (ε, 1/3)-approximation for any ε > 0
The constant 1/3 is irrelevant here; from that, we can obtain an (ε, δ)-approximation for the same ε and arbitrary δ by simply repeating the algorithm and taking the median of the results
Great, but the running time of these approximations may grow very quickly as ε nears 0
FPRAS (Fully Polynomial-time Randomized Approximation Scheme): PRAS whose overall complexity depends polynomially on 1/ε
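The repetition argument can be sketched as follows. The median of k independent runs is wrong only if at least half of the runs are wrong; since each run fails with probability at most 1/3, Hoeffding's inequality bounds this by exp(−k/18), so k ≥ 18 ln(1/δ) runs suffice. The function name `amplify` and the toy estimator are hypothetical.

```python
import random
from math import ceil, log
from statistics import median
from typing import Callable


def amplify(estimator: Callable[[], float], delta: float) -> float:
    """Median of k independent runs of an estimator failing w.p. at most 1/3."""
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    # exp(-k / 18) <= delta as soon as k >= 18 * ln(1 / delta).
    k = max(1, ceil(18 * log(1 / delta)))
    if k % 2 == 0:
        k += 1  # an odd number of runs makes the median one of the results
    return median([estimator() for _ in range(k)])


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy estimator: right (10.0) with probability 0.7, far off otherwise.
    print(amplify(lambda: 10.0 if rng.random() < 0.7 else 1000.0, delta=1e-6))
```

The number of repetitions depends only on δ, and only logarithmically, which is why the choice of 1/3 in the definition does not matter.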
FPRAS for disjunctions [KLM89, KKS09]
$E_1, \dots, E_m$: sequence of events in a probability space
Assumptions: For each $E_i$, we can efficiently (in polynomial time):
Compute $\Pr(E_i)$
Test whether $E_i$ is true in a given random sample
Sample from the subspace conditioned on $E_i$
Then there exists an FPRAS to compute $\Pr\left( \bigvee_{i=1}^{m} E_i \right)$
See http://webcourse.cs.technion.ac.il/236605/Spring2015/ho/WCFiles/L9%20-%20QA%20in%20PDBs.pdf for detailed explanations
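A minimal sketch of a Karp-Luby-style estimator may help to see how the three assumptions are used. It is written for one concrete setting that is an assumption of this example, not something the slides specify: each event is a conjunction of independent Boolean variables (as in the lineage of a query over a tuple-independent probabilistic database). The function name `karp_luby_dnf` and the sample count, taken from a standard Chernoff-style bound, are also assumptions.

```python
import random
from math import ceil, log, prod
from typing import Dict, FrozenSet, List


def karp_luby_dnf(clauses: List[FrozenSet[str]], prob: Dict[str, float],
                  eps: float, delta: float, rng: random.Random) -> float:
    """Estimate Pr(C_1 or ... or C_m) for conjunctions of independent variables."""
    m = len(clauses)
    # Assumption 1: Pr(E_i) is a product, since the variables are independent.
    weights = [prod(prob[x] for x in clause) for clause in clauses]
    total = sum(weights)
    if total == 0:
        return 0.0
    # Assumed sample count: each sample lies in [1/m, 1], so a Chernoff-style
    # bound gives a number of runs polynomial in m, 1/eps and log(1/delta).
    runs = ceil(3 * m * log(2 / delta) / eps ** 2)
    acc = 0.0
    for _ in range(runs):
        # Pick an event with probability proportional to Pr(E_i).
        i = rng.choices(range(m), weights=weights)[0]
        # Assumption 3: sample a world conditioned on E_i by forcing its
        # variables to true and drawing the others independently.
        world = {x: (x in clauses[i]) or (rng.random() < p)
                 for x, p in prob.items()}
        # Assumption 2: count the events that hold in this world (>= 1).
        satisfied = sum(all(world[x] for x in clause) for clause in clauses)
        acc += 1 / satisfied
    # The average of 1 / satisfied has expectation Pr(union) / total.
    return total * acc / runs
```

Sampling worlds directly and testing the disjunction would need very many samples when the disjunction is unlikely; conditioning on a chosen event keeps every sample informative, which is what makes the guarantee multiplicative.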