• Aucun résultat trouvé

A Fistful of Dollars: Formalizing Asymptotic Complexity Claims via Deductive Program Verification

N/A
N/A
Protected

Academic year: 2022

Partager "A Fistful of Dollars: Formalizing Asymptotic Complexity Claims via Deductive Program Verification"

Copied!
59
0
0

Texte intégral

(1)

A Fistful of Dollars:

Formalizing Asymptotic Complexity Claims via Deductive Program Verification

Armaël Guéneau, Arthur Charguéraud, François Pottier Inria

(2)

Motivational example: binary search

Claim: “binary search finds an element in time𝑂(log𝑛)

Goal: formalize this claim in Coq for a concrete implementation

2/21

(3)

Functional correctness

let rec bsearch (a: int array) v i j = if j <= i then -1 else

let k = i + (j - i) / 2 in if v = a.(k) then k

else if v < a.(k) then bsearch a v i k else bsearch a v (i+1) j

We can test this program

We can prove functional correctness (Why3, CFML, ...)

(4)

Functional correctness

let rec bsearch (a: int array) v i j = if j <= i then -1 else

let k = i + (j - i) / 2 in if v = a.(k) then k

else if v < a.(k) then bsearch a v i k else bsearch a v (i+1) j

We can test this program

We can prove functional correctness (Why3, CFML, ...)

3/21

(5)

Yet, there is a bug

(* search for v in the range [i, j) *) let rec bsearch (a: int array) v i j =

if j <= i then -1 else let k = i + (j - i) / 2 in if v = a.(k) then k

else if v < a.(k) then bsearch a v i k else bsearch a v (i+1) j

Can you spot the bug?

(6)

Yet, there is a bug

(* search for v in the range [i, j) *) let rec bsearch (a: int array) v i j =

if j <= i then -1 else let k = i + (j - i) / 2 in if v = a.(k) then k

else if v < a.(k) then bsearch a v i k else bsearch a v (i+1) j

Can you spot thecomplexitybug?

4/21

(7)

Yet, there is a bug

(* search for v in the range [i, j) *) let rec bsearch (a: int array) v i j =

if j <= i then -1 else let k = i + (j - i) / 2 in if v = a.(k) then k

else if v < a.(k) then bsearch a v i k else bsearch a v (i+1) j

buggy, should bek+1

Can you spot thecomplexitybug?

(8)

In this talk

Goal: prove OCaml programs, including their asymptotic complexity expressed with𝑂()bounds

State of the art:

Automatic inference for polynomial bounds

Interactive proofs usingtime credits, e.g. “bsearchcosts3log𝑛 + 4

Issue: conciseness, and modularity of specifications

5/21

(9)

In this talk (2)

Solution: introduce the𝑂()notation for conciseness and modularity

Challenges:

How to write specifications?

What is the meaning of𝑂()in the multivariate case?

How to do proofs (paper proofs are too informal)?

How to automate the cost analysis?

(10)

Separation Logic with Time Credits

(11)

Time Credits: resources in separation logic

Eachfunction call(or loop iteration) consumes$1

$𝑛asserts the ownership of𝑛time credits

$(𝑛 + 𝑚) = $𝑛 ∗ $𝑚

Credits are not duplicable:$1 ⟹/ $1 ∗ $1

Enables amortized analysis References:

Atkey (2011): time credits in Separation Logic

Charguéraud & Pottier (2015): practical verification framework (CFML), applied to Union-Find

(12)

Example of using time credits

A specificationof the complexityofbsearch:

∀𝑖 𝑗 𝑎 𝑣.

{

$(3log(𝑗 − 𝑖) + 4) ∗ …

}

(bsearch a v i j)

{

}

Conciseness issue: even non dominant terms must appear

Modularity issue: changing (even slightly)bsearch

requires updating the specification, andall proofs that depend on it.

Tempting:

{

$𝑂(𝑙𝑜𝑔(𝑗 − 𝑖))∗ …

}

(bsearch a v i j)

{

}

8/21

(13)

Example of using time credits

A specificationof the complexityofbsearch:

∀𝑖 𝑗 𝑎 𝑣.

{

$(3log(𝑗 − 𝑖) + 4) ∗ …

}

(bsearch a v i j)

{

}

Conciseness issue: even non dominant terms must appear

Modularity issue: changing (even slightly)bsearch

requires updating the specification, andall proofs that depend on it.

Tempting:

{

$𝑂(𝑙𝑜𝑔(𝑗 − 𝑖))∗ …

}

(bsearch a v i j)

{

}

(14)

Example of using time credits

A specificationof the complexityofbsearch:

∀𝑖 𝑗 𝑎 𝑣.

{

$(3log(𝑗 − 𝑖) + 4) ∗ …

}

(bsearch a v i j)

{

}

Conciseness issue: even non dominant terms must appear

Modularity issue: changing (even slightly)bsearch

requires updating the specification, andall proofs that depend on it.

Tempting:

{

$𝑂(𝑙𝑜𝑔(𝑗 − 𝑖))∗ …

}

(bsearch a v i j)

{

}

8/21

(15)

Example of using time credits

A specificationof the complexityofbsearch:

∀𝑖 𝑗 𝑎 𝑣.

{

$(3log(𝑗 − 𝑖) + 4) ∗ …

}

(bsearch a v i j)

{

}

Conciseness issue: even non dominant terms must appear

Modularity issue: changing (even slightly)bsearch

requires updating the specification, andall proofs that depend on it.

Tempting:

{

$𝑂(𝑙𝑜𝑔(𝑗 − 𝑖))∗ …

}

(bsearch a v i j)

{

}

(16)

Challenges in reasoning with 𝑂

(17)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

(18)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

9/21

(19)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

(20)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

9/21

(21)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

(22)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

Where is the mistake?

…but which statement are we proving?

9/21

(23)

Informal reasoning principles can be abused

1 let rec bsearch a v i j =

2 if j <= i then -1 else

3 let k = i + (j - i) / 2 in

4 if v = a.(k) then k

5 else if v < a.(k) then

6 bsearch a v i k

7 else

8 bsearch a v (k+1) j

“Claim”:

bsearch a v i j costs 𝑂(1).

Proof:

By induction on𝑗 − 𝑖:

𝑗 − 𝑖 ≤ 0:𝑂(1). OK!

𝑗 − 𝑖 > 0:𝑂(1) + 𝑂(1) + 𝑂(1) = 𝑂(1). OK!

…but which statement are we proving?

(24)

Meaning of

𝑂(1)

What we just proved:

∀𝑖 𝑗, ∃ 𝑐, bsearch a v i jruns in𝑐steps

What “𝑂(1)” means:

∃ 𝑐, ∀𝑖 𝑗, bsearch a v i jruns in𝑐steps

10/21

(25)

Meaning of

𝑂(1)

What we just proved:

∀𝑖 𝑗, ∃ 𝑐, bsearch a v i jruns in𝑐steps

What “𝑂(1)” means:

∃ 𝑐, ∀𝑖 𝑗, bsearch a v i jruns in𝑐steps

(26)

Meaning of

𝑂(log𝑛)

bsearch a v i j runs in 𝑂(log(𝑗 − 𝑖)) steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

bsearch a v i jruns in𝑓 (𝑗 − 𝑖)steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

”.

Meaning of “𝑓 ∈ 𝑂(𝑔)”?

How to provide a witness for𝑓?

11/21

(27)

Meaning of

𝑂(log𝑛)

bsearch a v i j runs in 𝑂(log(𝑗 − 𝑖)) steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

bsearch a v i jruns in𝑓 (𝑗 − 𝑖)steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

”.

Meaning of “𝑓 ∈ 𝑂(𝑔)”?

How to provide a witness for𝑓?

(28)

Meaning of

𝑂(log𝑛)

bsearch a v i j runs in 𝑂(log(𝑗 − 𝑖)) steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

bsearch a v i jruns in𝑓 (𝑗 − 𝑖)steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

”.

Meaning of “𝑓 ∈ 𝑂(𝑔)”?

How to provide a witness for𝑓?

11/21

(29)

Meaning of

𝑂(log𝑛)

bsearch a v i j runs in 𝑂(log(𝑗 − 𝑖)) steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

bsearch a v i jruns in𝑓 (𝑗 − 𝑖)steps.”

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

”.

Meaning of “𝑓 ∈ 𝑂(𝑔)”?

How to provide a witness for𝑓?

(30)

A generic definition of 𝑂

(31)

Definition of

𝑂

Single variable case:

𝑓 ∈ 𝑂(𝑔) ∃𝑐, ∃𝑛0, ∀𝑛 ≥ 𝑛0, |𝑓 (𝑛)| ≤ 𝑐 |𝑔(𝑛)|

with𝑓 of typeℕ → ℤ

Multivariate case:𝑓 of type𝑘 → ℤ

In our library:𝑓of type𝐴 → ℤ, with afilteron type𝐴

(32)

𝑂

as a relation between functions

We define𝑂as adominationpre-order between functions of𝐴 to:

𝑓 ⪯𝐴 𝑔 ∃𝑐. 𝕌𝐴𝑥. |𝑓 (𝑥)| ≤ 𝑐 |𝑔(𝑥)|

𝐴must be equipped with afilter𝕌𝐴

𝕌𝐴𝑥.𝑃”: “ultimately P” / “P holds of everysufficiently large𝑥

Can be thought of as a quantifier

A standard notion in math (see e.g. Bourbaki)

We prove in our library many properties of𝐴for an arbitrary filtered type𝐴

13/21

(33)

𝑂

as a relation between functions

We define𝑂as adominationpre-order between functions of𝐴 to:

𝑓 ⪯𝐴 𝑔 ∃𝑐. 𝕌𝐴𝑥. |𝑓 (𝑥)| ≤ 𝑐 |𝑔(𝑥)|

𝐴must be equipped with afilter𝕌𝐴

𝕌𝐴𝑥.𝑃”: “ultimately P” / “P holds of everysufficiently large𝑥

Can be thought of as a quantifier

A standard notion in math (see e.g. Bourbaki)

We prove in our library many properties of𝐴for an arbitrary filtered type𝐴

(34)

𝑂

as a relation between functions

We define𝑂as adominationpre-order between functions of𝐴 to:

𝑓 ⪯𝐴 𝑔 ∃𝑐. 𝕌𝐴𝑥. |𝑓 (𝑥)| ≤ 𝑐 |𝑔(𝑥)|

𝐴must be equipped with afilter𝕌𝐴

𝕌𝐴𝑥.𝑃”: “ultimately P” / “P holds of everysufficiently large𝑥

Can be thought of as a quantifier

A standard notion in math (see e.g. Bourbaki)

We prove in our library many properties of𝐴for an arbitrary filtered type𝐴

13/21

(35)

Proving specifications: automatic

(guided) cost synthesis

(36)

Providing the cost function

“there existsa cost function 𝑓 ∈ 𝑂(log𝑛) such that, for everya,v,i,j,

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

”.

becomes

∃𝑓 ∶ ℤ → ℤ.

{

{

𝑓 ⪯ 𝜆𝑛.log𝑛

∀𝑖 𝑗 𝑎 𝑣.

{

$𝑓 (𝑗 − 𝑖) ∗ …

}

(bsearch a v i j)

{

}

First step of the proof: exhibit a concrete cost function.

Guess “𝜆𝑛. 3log 𝑛 + 4” from the start?

It seems desirable to(semi) automatically constructthe witness as the proof progresses.

14/21

(37)

Our approach to this problem

Convince Coq to postpone the moment where the concrete cost function is provided

Progressively synthesize the cost function while applying the reasoning rules from separation logic

The synthesized function has the same structure as the code

Afterwards, prove a𝑂()bound for the cost function

(38)

Our approach to this problem

Convince Coq to postpone the moment where the concrete cost function is provided

Progressively synthesize the cost function while applying the reasoning rules from separation logic

The synthesized function has the same structure as the code

Afterwards, prove a𝑂()bound for the cost function

15/21

(39)

Our approach to this problem

Convince Coq to postpone the moment where the concrete cost function is provided

Progressively synthesize the cost function while applying the reasoning rules from separation logic

The synthesized function has the same structure as the code

Afterwards, prove a𝑂()bound for the cost function

(40)

Our approach to this problem

Convince Coq to postpone the moment where the concrete cost function is provided

Progressively synthesize the cost function while applying the reasoning rules from separation logic

The synthesized function has the same structure as the code

Afterwards, prove a𝑂()bound for the cost function

15/21

(41)

let rec bsearch a v i j = if j <= i then -1 else

let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f n := 1 + (

if n <= 0 then 0 else 0 + 1 + max 0 (

1 + max (f (n/2))

(f (n - n/2 - 1)) )

)

(42)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + …

a hole (“”) is implemented as an evar in Coq

17/21

(43)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (if j <= i then else …)

(44)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (if j <= i then else …)

17/21

(45)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (if (j-i) <= 0 then else …)

(46)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (if (j-i) <= 0 then 0 else …)

17/21

(47)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + …

)

(48)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + …

)

17/21

(49)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max … …

)

(50)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max 0 …

)

17/21

(51)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max 0 (1 + …) )

(52)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max 0 (1 + max … …) )

17/21

(53)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max 0 (

1 + max (f ((j-i)/2)) … )

)

(54)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f (j-i) := 1 + (

if (j-i) <= 0 then 0 else 0 + 1 + max 0 (

1 + max (f ((j-i)/2))

(f ((j-i) - (j-i)/2 - 1)) )

)

17/21

(55)

if j <= i then -1 else let k = i + (j - i) / 2 in if v = Array.get a k then k else if v < Array.get a k then

bsearch a v i k else

bsearch a v (k+1) j

f n := 1 + (

if n <= 0 then 0 else 0 + 1 + max 0 (

1 + max (f (n/2))

(f (n - n/2 - 1)) )

)

(56)

Our cost synthesis achieves the following objectives:

The user inspects the code only once

The user can guide the synthesis of the cost function

18/21

(57)

Summary

Separation Logic with Time Credits

This work

Modular specifications

using O() and filters

Better automation for proofs with time credits: cost function synthesis

(58)

Closely related work

Howell(2008), in his book, studies properties and difficulties of𝑂()with multiple variables.

In Isabelle/HOL:Zhan & Haslbeck(2018) implement the same formal framework, with strong focus on automation but no “cost function synthesis”. They build onEberl’s (2017) impressive formalization of the Akra-Bazzi theorem.

Hoffmann et al.(2010-2017): automated amortized resource analysis for OCaml. Implemented by Carbonneaux, Hoffmann & Shao(2015) with proof certificates checked by Coq.

20/21

(59)

More in the paper:

Details about side-conditions for cost functions:

monotonic and non-negative

Clear up some confusion about multivariate𝑂()

Variable substitution in multivariate specifications

Other case studies: selection sort, Bellman-Ford, Union-Find

http://gallium.inria.fr/~agueneau/bigO

Challenging case studies in the works!

Références

Documents relatifs

Translating the above bound, the complexity of the Montes approach is at best in O(D e 5 ) but only for computing a local integral basis at one singularity, while the algorithm

Logic Tensor Networks (LTNs) are a deep learning model that comes from the neural-symbolic field: it integrates both logic and data in a neural network to provide support for

kernel estimate based on Epanechnikov kernel and m)n-1/5) is asymptotically minimax in F against the class of kernel estimates fn with deterministic bandwidths,

Using representation (1.1), we shall see that Xa (x) measures the difference between the arça of certain paral- lelograms and the number of lattice points

The purpose of this study has been to estimate a variable cost function using panel data for a sample of 32 Italian water companies, in order to analyze the economies of scale

It follows from our analysis that the values of D-finite functions (i.e., functions described as solutions of linear differential equations with polynomial coefficients) may be

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

In Section 4 we improve the known lower bound on the OBDD size of HWB a little bit. Then we consider the problem of finding a good variable ordering for HWB. All intuitive ideas