• Aucun résultat trouvé

A Graduate Course in Applied Cryptography

N/A
N/A
Protected

Academic year: 2022

Partager "A Graduate Course in Applied Cryptography"

Copied!
818
0
0

Texte intégral

(1)

A Graduate Course in Applied Cryptography

Dan Boneh and Victor Shoup

Version 0.4, September 2017

(2)

Preface

Cryptography is an indispensable tool used to protect information in computing systems. It is used everywhere and by billions of people worldwide on a daily basis. It is used to protect data at rest and data in motion. Cryptographic systems are an integral part of standard protocols, most notably the Transport Layer Security (TLS) protocol, making it relatively easy to incorporate strong encryption into a wide range of applications.

While extremely useful, cryptography is also highly brittle. The most secure cryptographic system can be rendered completely insecure by a single specification or programming error. No amount of unit testing will uncover a security vulnerability in a cryptosystem.

Instead, to argue that a cryptosystem is secure, we rely on mathematical modeling and proofs to show that a particular system satisfies the security properties attributed to it. We often need to introduce certain plausible assumptions to push our security arguments through.

This book is about exactly that: constructing practical cryptosystems for which we can argue security under plausible assumptions. The book covers many constructions for di↵erent tasks in cryptography. For each task we define a precise security goal that we aim to achieve and then present constructions that achieve the required goal. To analyze the constructions, we develop a unified framework for doing cryptographic proofs. A reader who masters this framework will be capable of applying it to new constructions that may not be covered in the book.

Throughout the book we present many case studies to survey how deployed systems operate.

We describe common mistakes to avoid as well as attacks on real-world systems that illustrate the importance of rigor in cryptography. We end every chapter with a fun application that applies the ideas in the chapter in some unexpected way.

Intended audience and how to use this book

The book is intended to be self contained. Some supplementary material covering basic facts from probability theory and algebra is provided in the appendices. The book is divided into three parts.

• Part I develops symmetric encryption which explains how two parties, Alice and Bob, can securely exchange information when they have a shared key unknown to the attacker. We discuss data confidentiality, data integrity, and the important concept of authenticated en- cryption.

• Part II develops the concepts of public-key encryption and digital signatures, which allow Alice and Bob to communicate securely, without having a pre-shared secret key.

• Part III is about cryptographic protocols, such as protocols for user identification, key ex- change, zero knowledge, and secure computation.

(3)

A beginning reader can read though the book to learn how cryptographic systems work and why they are secure. Every security theorem in the book is followed by a proof idea that explains at a high level why the scheme is secure. On a first read one can skip over the detailed proofs without losing continuity. A beginning reader may also skip over the mathematical details sections that explore nuances of certain definitions.

An advanced reader may enjoy reading the detailed proofs to learn how to do proofs in cryptog- raphy. At the end of every chapter you will find many exercises that explore additional aspects of the material covered in the chapter. Some exercises rehearse what was learned, but many exercises expand on the material and discuss topics not covered in the chapter.

Status of the book

The current draft contains part I and most of parts II and III. The remaining four chapters are forthcoming. We hope you enjoy this write-up. Please send us comments and let us know if you find typos or mistakes.

Citations: While the current draft is mostly complete, we still do not include citations and references to the many works on which this book is based. Those will be coming soon and will be presented in the Notes section at the end of every chapter.

Dan Boneh and Victor Shoup September, 2017

(4)

Contents

1 Introduction 1

1.1 Historic ciphers . . . 1

1.2 Terminology used throughout the book . . . 1

I Secret key cryptography 3 2 Encryption 4 2.1 Introduction . . . 4

2.2 Shannon ciphers and perfect security . . . 5

2.2.1 Definition of a Shannon cipher . . . 5

2.2.2 Perfect security . . . 7

2.2.3 The bad news . . . 13

2.3 Computational ciphers and semantic security . . . 13

2.3.1 Definition of a computational cipher . . . 14

2.3.2 Definition of semantic security . . . 15

2.3.3 Connections to weaker notions of security . . . 18

2.3.4 Consequences of semantic security . . . 22

2.3.5 Bit guessing: an alternative characterization of semantic security . . . 25

2.4 Mathematical details . . . 27

2.4.1 Negligible, super-poly, and poly-bounded functions . . . 28

2.4.2 Computational ciphers: the formalities . . . 29

2.4.3 Efficient adversaries and attack games . . . 32

2.4.4 Semantic security: the formalities . . . 34

2.5 A fun application: anonymous routing . . . 35

2.6 Notes . . . 38

2.7 Exercises . . . 38

3 Stream ciphers 45 3.1 Pseudo-random generators . . . 45

3.1.1 Definition of a pseudo-random generator . . . 46

3.1.2 Mathematical details . . . 48

3.2 Stream ciphers: encryption with a PRG . . . 48

3.3 Stream cipher limitations: attacks on the one time pad . . . 52

3.3.1 The two-time pad is insecure . . . 53

(5)

3.3.2 The one-time pad is malleable . . . 53

3.4 Composing PRGs . . . 54

3.4.1 A parallel construction . . . 54

3.4.2 A sequential construction: the Blum-Micali method . . . 59

3.4.3 Mathematical details . . . 61

3.5 The next bit test . . . 64

3.6 Case study: the Salsa and ChaCha PRGs . . . 67

3.7 Case study: linear generators . . . 70

3.7.1 An example cryptanalysis: linear congruential generators . . . 70

3.7.2 The subset sum generator . . . 73

3.8 Case study: cryptanalysis of the DVD encryption system . . . 74

3.9 Case study: cryptanalysis of the RC4 stream cipher . . . 76

3.9.1 Security of RC4 . . . 78

3.10 Generating random bits in practice . . . 80

3.11 A broader perspective: computational indistinguishability . . . 81

3.11.1 Mathematical details . . . 86

3.12 A fun application: coin flipping and commitments . . . 87

3.13 Notes . . . 88

3.14 Exercises . . . 88

4 Block ciphers 94 4.1 Block ciphers: basic definitions and properties . . . 94

4.1.1 Some implications of security . . . 96

4.1.2 Efficient implementation of random permutations . . . 99

4.1.3 Strongly secure block ciphers . . . 99

4.1.4 Using a block cipher directly for encryption . . . 100

4.1.5 Mathematical details . . . 104

4.2 Constructing block ciphers in practice . . . 105

4.2.1 Case study: DES . . . 107

4.2.2 Exhaustive search on DES: the DES challenges . . . 111

4.2.3 Strengthening ciphers against exhaustive search: the 3E construction . . . . 113

4.2.4 Case study: AES . . . 115

4.3 Sophisticated attacks on block ciphers . . . 120

4.3.1 Algorithmic attacks . . . 121

4.3.2 Side-channel attacks . . . 124

4.3.3 Fault-injection attacks on AES . . . 128

4.3.4 Quantum exhaustive search attacks . . . 129

4.4 Pseudo-random functions: basic definitions and properties . . . 130

4.4.1 Definitions . . . 130

4.4.2 Efficient implementation of random functions . . . 131

4.4.3 When is a secure block cipher a secure PRF? . . . 132

4.4.4 Constructing PRGs from PRFs . . . 136

4.4.5 Mathematical details . . . 137

4.5 Constructing block ciphers from PRFs . . . 139

4.6 The tree construction: from PRGs to PRFs . . . 145

4.6.1 Variable length tree construction . . . 149

(6)

4.7 The ideal cipher model . . . 152

4.7.1 Formal definitions . . . 152

4.7.2 Exhaustive search in the ideal cipher model . . . 153

4.7.3 The Even-Mansour block cipher and theEXconstruction . . . 156

4.7.4 Proof of the Even-Mansour and EX theorems . . . 157

4.8 Fun application: comparing information without revealing it . . . 163

4.9 Notes . . . 165

4.10 Exercises . . . 165

5 Chosen Plaintext Attack 174 5.1 Introduction . . . 174

5.2 Security against multi-key attacks . . . 176

5.3 Semantic security against chosen plaintext attack . . . 178

5.4 Building CPA secure ciphers . . . 180

5.4.1 A generic hybrid construction . . . 180

5.4.2 Randomized counter mode . . . 186

5.4.3 CBC mode . . . 191

5.4.4 Case study: CBC padding in TLS 1.0 . . . 196

5.4.5 Concrete parameters and a comparison of counter and CBC modes . . . 196

5.5 Nonce-based encryption . . . 198

5.5.1 Nonce-based generic hybrid encryption . . . 200

5.5.2 Nonce-based Counter mode . . . 200

5.5.3 Nonce-based CBC mode . . . 201

5.6 A fun application: revocable broadcast encryption . . . 202

5.7 Notes . . . 205

5.8 Exercises . . . 205

6 Message integrity 212 6.1 Definition of a message authentication code . . . 214

6.1.1 Mathematical details . . . 217

6.2 MAC verification queries do not help the attacker . . . 217

6.3 Constructing MACs from PRFs . . . 220

6.4 Prefix-free PRFs for long messages . . . 222

6.4.1 The CBC prefix-free secure PRF . . . 223

6.4.2 The cascade prefix-free secure PRF . . . 226

6.4.3 Extension attacks: CBC and cascade are insecure MACs . . . 227

6.5 From prefix-free secure PRF to fully secure PRF (method 1): encrypted PRF . . . 228

6.5.1 ECBC and NMAC: MACs for variable length inputs . . . 229

6.6 From prefix-free secure PRF to fully secure PRF (method 2): prefix-free encodings . 232 6.6.1 Prefix free encodings . . . 232

6.7 From prefix-free secure PRF to fully secure PRF (method 3): CMAC . . . 233

6.8 Converting a block-wise PRF to bit-wise PRF . . . 236

6.9 Case study: ANSI CBC-MAC . . . 237

6.10 Case study: CMAC . . . 238

6.11 PMAC: a parallel MAC . . . 240

6.12 A fun application: searching on encrypted data . . . 242

(7)

6.13 Notes . . . 242

6.14 Exercises . . . 243

7 Message integrity from universal hashing 248 7.1 Universal hash functions (UHFs) . . . 248

7.1.1 Multi-query UHFs . . . 250

7.1.2 Mathematical details . . . 251

7.2 Constructing UHFs . . . 251

7.2.1 Construction 1: UHFs using polynomials . . . 251

7.2.2 Construction 2: CBC and cascade are computational UHFs . . . 254

7.2.3 Construction 3: a parallel UHF from a small PRF . . . 256

7.3 PRF(UHF) composition: constructing MACs using UHFs . . . 258

7.3.1 Using PRF(UHF) composition: ECBC and NMAC security . . . 261

7.3.2 Using PRF(UHF) composition with polynomial UHFs . . . 261

7.3.3 Using PRF(UHF) composition: PMAC0 security . . . 262

7.4 The Carter-Wegman MAC . . . 262

7.4.1 Using Carter-Wegman with polynomial UHFs . . . 269

7.5 Nonce-based MACs . . . 269

7.5.1 Secure nonce-based MACs . . . 269

7.6 Unconditionally secure one-time MACs . . . 270

7.6.1 Pairwise unpredictable functions . . . 271

7.6.2 Building unpredictable functions . . . 271

7.6.3 From PUFs to unconditionally secure one-time MACs . . . 272

7.7 A fun application: timing attacks . . . 272

7.8 Notes . . . 272

7.9 Exercises . . . 273

8 Message integrity from collision resistant hashing 283 8.1 Definition of collision resistant hashing . . . 286

8.1.1 Mathematical details . . . 286

8.2 Building a MAC for large messages . . . 287

8.3 Birthday attacks on collision resistant hash functions . . . 289

8.4 The Merkle-Damg˚ard paradigm . . . 291

8.4.1 Joux’s attack . . . 294

8.5 Building Compression Functions . . . 294

8.5.1 A simple but inefficient compression function . . . 295

8.5.2 Davies-Meyer compression functions . . . 295

8.5.3 Collision resistance of Davies-Meyer . . . 297

8.6 Case study: SHA256 . . . 298

8.6.1 Other Merkle-Damg˚ard hash functions . . . 300

8.7 Case study: HMAC . . . 302

8.7.1 Security of two-key nest . . . 303

8.7.2 The HMAC standard . . . 305

8.7.3 Davies-Meyer is a secure PRF in the ideal cipher model . . . 306

8.8 The Sponge Construction and SHA3 . . . 309

8.8.1 The sponge construction . . . 309

(8)

8.8.2 Case study: SHA3, SHAKE256, and SHAKE512 . . . 314

8.9 Merkle trees: proving properties of a hashed sequence . . . 315

8.9.1 Authenticated data structures . . . 318

8.10 Key derivation and the random oracle model . . . 320

8.10.1 The key derivation problem . . . 320

8.10.2 Random oracles: a useful heuristic . . . 322

8.10.3 Random oracles: safe modes of operation . . . 327

8.10.4 The leftover hash lemma . . . 329

8.10.5 Case study: HKDF . . . 330

8.11 Security without collision resistance . . . 331

8.11.1 Second preimage resistance . . . 331

8.11.2 Randomized hash functions: target collision resistance . . . 333

8.11.3 TCR from 2nd-preimage resistance . . . 333

8.11.4 Using target collision resistance . . . 336

8.12 A fun application: an efficient commitment scheme . . . 339

8.13 Another fun application: proofs of work . . . 339

8.14 Notes . . . 339

8.15 Exercises . . . 339

9 Authenticated Encryption 347 9.1 Authenticated encryption: definitions . . . 348

9.1.1 One-time authenticated encryption . . . 349

9.2 Implications of authenticated encryption . . . 350

9.2.1 Chosen ciphertext attacks: a motivating example . . . 350

9.2.2 Chosen ciphertext attacks: definition . . . 352

9.2.3 Authenticated encryption implies chosen ciphertext security . . . 353

9.3 Encryption as an abstract interface . . . 355

9.4 Authenticated encryption ciphers from generic composition . . . 357

9.4.1 Encrypt-then-MAC . . . 357

9.4.2 MAC-then-encrypt is not generally secure: padding oracle attacks on SSL . 359 9.4.3 More padding oracle attacks. . . 362

9.4.4 Secure instances of MAC-then-encrypt . . . 363

9.4.5 Encrypt-then-MAC or MAC-then-encrypt? . . . 367

9.5 Nonce-based authenticated encryption with associated data . . . 367

9.6 One more variation: CCA-secure ciphers with associated data . . . 370

9.7 Case study: Galois counter mode (GCM) . . . 371

9.8 Case study: the TLS 1.3 record protocol . . . 373

9.9 Case study: an attack on non-atomic decryption in SSH . . . 376

9.10 Case study: 802.11b WEP, a badly broken system . . . 379

9.11 Case study: IPsec . . . 381

9.12 A fun application: private information retrieval . . . 386

9.13 Notes . . . 386

9.14 Exercises . . . 386

(9)

II Public key cryptography 393

10 Public key tools 394

10.1 A toy problem: anonymous key exchange . . . 394

10.2 One-way trapdoor functions . . . 395

10.2.1 Key exchange using a one-way trapdoor function scheme . . . 396

10.2.2 Mathematical details . . . 397

10.3 A trapdoor permutation scheme based on RSA . . . 398

10.3.1 Key exchange based on the RSA assumption . . . 400

10.3.2 Mathematical details . . . 400

10.4 Diffie-Hellman key exchange . . . 401

10.4.1 The key exchange protocol . . . 402

10.4.2 Security of Diffie-Hellman key exchange . . . 402

10.5 Discrete logarithm and related assumptions . . . 403

10.5.1 Random self-reducibility . . . 406

10.5.2 Mathematical details . . . 407

10.6 Collision resistant hash functions from number-theoretic primitives . . . 409

10.6.1 Collision resistance based on DL . . . 409

10.6.2 Collision resistance based on RSA . . . 410

10.7 Attacks on the anonymous Diffie-Hellman protocol . . . 412

10.8 Merkle puzzles: a partial solution to key exchange using block ciphers . . . 413

10.9 Fun application: Pedersen commitments . . . 415

10.10 Notes . . . 415

10.11 Exercises . . . 415

11 Public key encryption 424 11.1 Two further example applications . . . 425

11.1.1 Sharing encrypted files . . . 425

11.1.2 Key escrow . . . 425

11.2 Basic definitions . . . 426

11.2.1 Mathematical details . . . 427

11.3 Implications of semantic security . . . 428

11.3.1 The need for randomized encryption . . . 428

11.3.2 Semantic security against chosen plaintext attack . . . 429

11.4 Encryption based on a trapdoor function scheme . . . 431

11.4.1 Instantiating ETDF with RSA . . . 434

11.5 ElGamal encryption . . . 435

11.5.1 Semantic security of ElGamal in the random oracle model . . . 436

11.5.2 Semantic security of ElGamal without random oracles . . . 438

11.6 Threshold decryption . . . 441

11.6.1 Shamir’s secret sharing scheme . . . 443

11.6.2 ElGamal threshold decryption . . . 445

11.7 Fun application: oblivious transfer from DDH . . . 448

11.8 Notes . . . 449

11.9 Exercises . . . 449

(10)

12 Chosen ciphertext secure public key encryption 456

12.1 Basic definitions . . . 456

12.2 Understanding CCA security . . . 458

12.2.1 CCA security and ciphertext malleability . . . 458

12.2.2 CCA security vs authentication . . . 459

12.2.3 CCA security and key escrow . . . 460

12.2.4 Encryption as an abstract interface . . . 461

12.3 CCA-secure encryption from trapdoor function schemes . . . 463

12.3.1 Instantiating ETDF0 with RSA . . . 468

12.4 CCA-secure ElGamal encryption . . . 468

12.5 CCA security from DDH without random oracles . . . 474

12.5.1 Universal projective hash functions . . . 474

12.5.2 Universal2 projective hash functions . . . 476

12.5.3 The ECS scheme . . . 478

12.6 CCA security via a generic transformation . . . 483

12.6.1 A generic instantiation . . . 488

12.6.2 A concrete instantiation with ElGamal . . . 489

12.7 CCA-secure public-key encryption with associated data . . . 490

12.8 Case study: PKCS1, OAEP, OAEP+, and SAEP . . . 492

12.8.1 Padding schemes . . . 492

12.8.2 PKCS1 padding . . . 493

12.8.3 Bleichenbacher’s attack on the RSA-PKCS1 encryption scheme . . . 494

12.8.4 Optimal Asymmetric Encryption Padding (OAEP) . . . 496

12.8.5 OAEP+ and SAEP+ . . . 499

12.9 Fun application: sealed bid auctions . . . 499

12.10 Notes . . . 500

12.11 Exercises . . . 500

13 Digital signatures 513 13.1 Definition of a digital signature . . . 515

13.1.1 Secure signatures . . . 516

13.1.2 Mathematical details . . . 519

13.2 Extending the message space with collision resistant hashing . . . 519

13.2.1 Extending the message space using TCR functions . . . 520

13.3 Signatures from trapdoor permutations: the full domain hash . . . 521

13.3.1 Signatures based on the RSA trapdoor permutation . . . 522

13.4 Security analysis of full domain hash . . . 525

13.4.1 Repeated one-way functions: a useful lemma . . . 525

13.4.2 Proofs of Theorems 13.3 and 13.4 . . . 529

13.5 An RSA-based signature scheme with tighter security proof . . . 530

13.6 Case study: PKCS1 signatures . . . 532

13.6.1 Bleichenbacher’s attack on PKCS1 signatures . . . 534

13.7 Signcryption: combining signatures and encryption . . . 535

13.7.1 Secure signcryption . . . 537

13.7.2 Signcryption as an abstract interface . . . 539

13.7.3 Constructions: encrypt-then-sign and sign-then-encrypt . . . 542

(11)

13.7.4 A construction based on Diffie-Hellman key exchange . . . 546

13.7.5 Additional desirable properties: forward secrecy and non-repudiation . . . . 548

13.8 Certificates and the public-key infrastructure . . . 552

13.8.1 Coping with malicious or negligent certificate authorities . . . 554

13.8.2 Certificate revocation . . . 557

13.9 Case study: legal aspects of digital signatures . . . 559

13.10 A fun application: private information retrieval . . . 560

13.11 Notes . . . 560

13.12 Exercises . . . 561

14 Fast hash-based signatures 568 14.1 Basic Lamport signatures . . . 569

14.1.1 Shrinking the signature using an enhanced TCR . . . 570

14.2 A general Lamport framework . . . 571

14.2.1 An explicit containment free function . . . 573

14.3 Winternitz one-time signatures . . . 574

14.3.1 A domination free function for Winternitz signatures . . . 577

14.4 HORS: short Lamport signatures . . . 578

14.4.1 Shrinking the public-key using a Merkle tree . . . 579

14.5 Applications of one-time signatures . . . 580

14.5.1 Online/o✏ine signatures from one-time signatures . . . 580

14.5.2 Authenticating streamed data with one-time signatures . . . 581

14.6 From one-time signatures to many-time signatures . . . 581

14.6.1 Indexed signatures . . . 582

14.6.2 A many-time signature scheme from an indexed signature . . . 583

14.6.3 The complete Merkle stateless signature system . . . 585

14.6.4 Nonce-based Merkle signatures . . . 587

14.7 A fun application . . . 588

14.8 Notes . . . 588

14.9 Exercises . . . 589

15 Elliptic curve cryptography and pairings 595 15.1 The group of points of an elliptic curve . . . 596

15.2 Elliptic curves over finite fields . . . 597

15.2.1 Montgomery and Edwards curves . . . 599

15.3 Elliptic curve cryptography . . . 599

15.3.1 The curve P256 . . . 600

15.3.2 The curve 25519 . . . 600

15.4 Pairings . . . 602

15.5 Signature schemes from pairings . . . 603

15.5.1 BLS signatures . . . 603

15.5.2 Group signatures . . . 603

15.6 Advanced encryption schemes from pairings . . . 603

15.6.1 Identity based encryption . . . 603

15.6.2 Threshold decryption with ciphertext security . . . 603

15.6.3 Broadcast encryption . . . 603

(12)

15.6.4 Homomorphic encryption . . . 603

15.7 Multilinear maps . . . 603

15.8 A fun application: secret handshakes . . . 604

15.9 Notes . . . 604

15.10 Exercises . . . 604

16 Lattice based cryptography 606 16.1 Integer lattices . . . 606

16.2 Hard problems on lattices . . . 606

16.2.1 The SIS problem . . . 606

16.2.2 The learning with errors (LWE) problem . . . 606

16.2.3 The ring LWE problem . . . 606

16.3 Trapdoor sampling from a lattice . . . 606

16.4 Signatures from lattice problems . . . 606

16.5 Public-key encryption from lattices . . . 606

16.6 Fully homomorphic encryption . . . 606

16.7 A fun application: factoring integers using lattices . . . 606

16.8 Exercises . . . 606

17 Analysis of number theoretic assumptions 607 17.1 How reasonable are the factoring and RSA assumptions? . . . 607

17.1.1 Quadratic resudousity assumption . . . 607

17.2 How reasonable are the DL and CDH assumptions? . . . 607

17.2.1 Brute-force search . . . 608

17.2.2 The baby-step/giant-step method . . . 608

17.2.3 Groups of orderqe . . . 609

17.2.4 The Pohlig-Hellman algorithm . . . 611

17.2.5 The Pohlig-Hellman algorithm . . . 611

17.2.6 Information leakage . . . 614

17.3 Discrete log inZp . . . 614

17.3.1 The number field sieve . . . 614

17.3.2 Discrete-log records in Zp . . . 615

17.4 How reasonable is decision Diffie-Hellman? . . . 616

17.5 Quantum attacks on number theoretic problems . . . 616

17.6 Side channel and fault attacks . . . 616

17.7 Notes . . . 616

17.8 Chapter summary . . . 616

17.9 Exercises . . . 616

III Protocols 618 18 Protocols for identification and login 619 18.1 Interactive protocols: general notions . . . 621

18.1.1 Mathematical details . . . 622

18.2 ID protocols: definitions . . . 622

(13)

18.3 Password protocols: security against direct attacks . . . 623

18.3.1 Password cracking using a dictionary attack . . . 624

18.4 Making dictionary attacks harder . . . 628

18.4.1 Public salts . . . 628

18.4.2 Secret salts . . . 630

18.4.3 Slow hash functions . . . 630

18.4.4 Slow memory-hard hash functions . . . 632

18.4.5 More password management issues . . . 636

18.5 One time passwords: security against eavesdropping . . . 637

18.5.1 PRF-based one-time passwords: HOTP and TOTP . . . 639

18.5.2 The S/key system . . . 641

18.6 Challenge-response: security against active attacks . . . 642

18.6.1 Challenge-response protocols . . . 644

18.7 A fun application: rainbow tables . . . 646

18.8 Another fun application: hardening password storage . . . 650

18.9 Notes . . . 650

18.10 Exercises . . . 651

19 Identification and signatures from sigma protocols 657 19.1 Schnorr’s identification protocol . . . 657

19.1.1 Honest verifier zero knowledge and security against eavesdropping . . . 662

19.2 From identification protocols to signatures . . . 664

19.2.1 A useful abstraction: repeated impersonation attacks . . . 665

19.2.2 Security analysis of Schnorr signatures . . . 666

19.2.3 A concrete implementation and an optimization . . . 671

19.3 Case study: ECDSA signatures . . . 672

19.4 Sigma protocols: basic definitions . . . 673

19.4.1 Knowledge soundness . . . 675

19.4.2 Special honest verifier zero knowledge . . . 676

19.5 Sigma protocols: examples . . . 676

19.5.1 Okamoto’s protocol for representations . . . 677

19.5.2 The Chaum-Pedersen protocol for DH-triples . . . 679

19.5.3 A Sigma protocol for arbitrary linear relations . . . 680

19.5.4 A Sigma protocol for RSA . . . 682

19.6 Identification and signatures from Sigma protocols . . . 683

19.6.1 The Fiat-Shamir heuristic for signatures . . . 685

19.7 Combining Sigma protocols: AND and OR proofs . . . 688

19.7.1 The AND-proof construction . . . 688

19.7.2 The OR-proof construction . . . 689

19.8 Witness independence and applications . . . 690

19.8.1 Definition of witness independence . . . 691

19.8.2 Special HVZK implies witness independence . . . 692

19.8.3 Actively secure identification protocols . . . 693

19.8.4 Okamoto’s identification protocol . . . 695

19.9 A fun application: a two round witness independent protocol . . . 696

19.10 Notes . . . 697

(14)

19.11 Exercises . . . 697

20 Proving properties in zero-knowledge 707 20.1 Languages and existential soundness . . . 707

20.2 Proving properties on encrypted data . . . 708

20.2.1 A generic protocol for non-linear relations . . . 713

20.3 Non-interactive proof systems . . . 715

20.3.1 Example: a voting protocol . . . 715

20.3.2 Non-interactive proofs: basic syntax . . . 717

20.3.3 The Fiat-Shamir transform . . . 717

20.3.4 Non-interactive existential soundness . . . 718

20.3.5 Non-interactive zero knowledge . . . 718

20.4 Computational zero-knowledge and applications . . . 720

20.4.1 Example: range proofs . . . 721

20.4.2 Special computational HVZK . . . 722

20.4.3 An unconstrained generic protocol for non-linear relations . . . 723

20.5 Efficient multi-round protocols . . . 724

20.6 Succinct non-interactive zero-knowledge proofs (SNARKs) . . . 724

20.7 A fun application: everything that can be proved, can be proved in zero knowledge 724 20.8 Notes . . . 724

20.9 Exercises . . . 725

21 Authenticated Key Exchange 735 21.1 Identification and AKE . . . 737

21.2 An encryption-based protocol . . . 738

21.2.1 Insecure variations . . . 741

21.2.2 Summary . . . 746

21.3 Perfect forward secrecy and a protocol based on ephemeral encryption . . . 747

21.3.1 Assuming only semantically secure encryption . . . 749

21.4 HSM security . . . 749

21.4.1 A technical requirement: strongly unpredictable ciphertexts . . . 752

21.4.2 Insecure variations . . . 752

21.5 Identity protection . . . 756

21.6 One-sided authenticated key exchange . . . 758

21.6.1 A one-sided authenticated variant of AKE4 . . . 759

21.7 Deniability . . . 760

21.7.1 Deniability without identity protection . . . 761

21.7.2 Deniability with identity protection . . . 762

21.8 Channel bindings . . . 764

21.9 Formal definitions . . . 765

21.9.1 Understanding the definition . . . 769

21.9.2 Security of protocol AKE1 . . . 770

21.9.3 Modeling perfect forward secrecy . . . 771

21.9.4 Modeling HSM security . . . 773

21.9.5 Modeling one-sided authentication . . . 776

21.9.6 Modeling channel bindings . . . 777

(15)

21.10 Case study: TLS session setup . . . 777

21.10.1 Authenticated key exchange with preshared keys . . . 780

21.11 Password authenticated key exchange . . . 783

21.11.1 Phishing attacks . . . 783

21.11.2 PAKE: an introduction . . . 786

21.11.3 ProtocolPAKE0 . . . 786

21.11.4 ProtocolPAKE1 . . . 787

21.11.5 ProtocolPAKE2 . . . 789

21.11.6 ProtocolPAKE+2 . . . 792

21.11.7 Explicit key confirmation . . . 793

21.11.8 Phishing again . . . 794

21.12 A fun application: establishing Tor channels . . . 794

21.13 Notes . . . 794

21.14 Exercises . . . 795

22 Key establishment with online Trusted Third Parties 798 22.1 A key exchange protocol with an online TTP . . . 799

22.2 Insecure variations of protocolOnlineTTP . . . 801

22.3 Security proof for protocolOnlineTTP . . . 806

22.4 Case study: Kerberos V5 . . . 806

22.5 O✏ine TTP vs. Online TTP . . . 810

22.6 Notes . . . 811

22.7 Exercises . . . 811

23 Two-party and multi-party secure computation 812 23.1 Yao’s two party protocol . . . 812

23.2 Multi-party secure computation . . . 812

IV Appendices 813 A Basic number theory 814 A.1 Cyclic groups . . . 814

A.2 Arithmetic modulo primes . . . 814

A.2.1 Basic concepts . . . 814

A.2.2 Structure of Zp . . . 815

A.2.3 Quadratic residues . . . 815

A.2.4 Computing in Zp . . . 816

A.2.5 Summary: arithmetic modulo primes . . . 816

A.3 Arithmetic modulo composites . . . 817

B Basic probability theory 819 B.1 Birthday Paradox . . . 819

B.1.1 More collision bounds . . . 821

B.1.2 A simple distinguisher . . . 821

C Basic complexity theory 823

(16)

D Probabilistic algorithms 824

(17)

Part I

Secret key cryptography

(18)

Chapter 2

Encryption

Roughly speaking, encryption is the problem of how two parties can communicate in secret in the presence of an eavesdropper. The main goals of this chapter are to develop a meaningful and useful definition of what we are trying to achieve, and to take some first steps in actually achieving it.

2.1 Introduction

Suppose Alice and Bob share a secret keyk, and Alice wants to transmit a messagem to Bob over a network while maintaining the secrecy ofm in the presence of an eavesdropping adversary. This chapter begins the development of basic techniques to solve this problem. Besides transmitting a message over a network, these same techniques allow Alice to store a file on a disk so that no one else with access to the disk can read the file, but Alice herself can read the file at a later time.

We should stress that while the techniques we develop to solve this fundamental problem are important and interesting, they do not by themselves solve all problems related to “secure commu- nication.”

• The techniques only provide secrecy in the situation where Alice transmits a single message per key. If Alice wants to secretly transmit several messages using the same key, then she must use methods developed in Chapter 5.

• The techniques do not provide any assurances of message integrity: if the attacker has the ability to modify the bits of the ciphertext while it travels from Alice to Bob, then Bob may not realize that this happened, and accept a message other than the one that Alice sent. We will discuss techniques for providing message integrity in Chapter 6.

• The techniques do not provide a mechanism that allow Alice and Bob to come to share a secret key in the first place. Maybe they are able to do this using some secure network (or a physical, face-to-face meeting) at some point in time, while the message is sent at some later time when Alice and Bob must communicate over an insecure network. However, with an appropriate infrastructure in place, there are also protocols that allow Alice and Bob to exchange a secret key even over an insecure network: such protocols are discussed in Chapters 21 and 22.

(19)

2.2 Shannon ciphers and perfect security

2.2.1 Definition of a Shannon cipher

The basic mechanism for encrypting a message using a shared secret key is called a cipher (or encryption scheme). In this section, we introduce a slightly simplified notion of a cipher, which we call a Shannon cipher.

A Shannon cipher is a pairE = (E, D) of functions.

• The functionE (theencryption function) takes as input a keyk and amessage m(also called aplaintext), and produces as output aciphertext c. That is,

c=E(k, m), and we say that cis the encryption of m under k.

• The functionD (the decryption function) takes as input a key k and a ciphertext c, and produces a messagem. That is,

m=D(k, c), and we say that mis the decryption of c under k.

• We require that decryption “undoes” encryption; that is, the cipher must satisfy the following correctness property: for all keysk and all messages m, we have

D(k, E(k, m) ) =m.

To be slightly more formal, let us assume thatKis the set of all keys (thekey space),Mis the set of all messages (themessage space), and that C is the set of all ciphertexts (the ciphertext space). With this notation, we can write:

E :K⇥M!C, D:K⇥C!M. Also, we shall say that E is defined over(K,M,C).

Suppose Alice and Bob want to use such a cipher so that Alice can send a message to Bob.

The idea is that Alice and Bob must somehow agree in advance on a keyk2K. Assuming this is done, then when Alice wants to send a messagem2Mto Bob, she encryptsmunderk, obtaining the ciphertextc=E(k, m)2C, and then sendscto Bob via some communication network. Upon receiving c, Bob decryptscunderk, and the correctness property ensures that D(k, c) is the same as Alice’s original messagem. For this to work, we have to assume thatcis not tampered with in transit from Alice to Bob. Of course, the goal, intuitively, is that an eavesdropper, who may obtain cwhile it is in transit, does not learn too much about Alice’s message m— this intuitive notion is what the formal definition of security, which we explore below, will capture.

In practice, keys, messages, and ciphertexts are often sequences of bytes. Keys are usually of some fixed length; for example, 16-byte (i.e., 128-bit) keys are very common. Messages and ciphertexts may be sequences of bytes of some fixed length, or of variable length. For example, a message may be a 1GB video file, a 10MB music file, a 1KB email message, or even a single bit encoding a “yes” or “no” vote in an electronic election.

(20)

Keys, messages, and ciphertexts may also be other types of mathematical objects, such as integers, or tuples of integers (perhaps lying in some specified interval), or other, more sophisticated types of mathematical objects (polynomials, matrices, or group elements). Regardless of how fancy these mathematical objects are, in practice, they must at some point be represented as sequences of bytes for purposes of storage in, and transmission between, computers.

For simplicity, in our mathematical treatment of ciphers, we shall assume that K, M, and C are sets of finite size. While this simplifies the theory, it means that if a real-world system allows messages of unbounded length, we will (somewhat artificially) impose a (large) upper bound on legal message lengths.

To exercise the above terminology, we take another look at some of the example ciphers discussed in Chapter 1.

Example 2.1. A one-time padis a Shannon cipher E = (E, D), where the keys, messages, and ciphertexts are bit strings of the same length; that is, E is defined over (K,M,C), where

K:=M:=C:={0,1}L,

for some fixed parameter L. For a key k 2 {0,1}L and a message m 2 {0,1}L the encryption function is defined as follows:

E(k, m) :=k m,

and for a keyk2{0,1}L and ciphertext c2{0,1}L, the decryption function is defined as follows:

D(k, c) :=k c.

Here, “ ” denotes bit-wise exclusive-OR, or in other words, component-wise addition modulo 2, and satisfies the following algebraic laws: for all bit vectors x, y, z2{0,1}L, we have

x y =y x, x (y z) = (x y) z, x 0L=x, and x x= 0L.

These properties follow immediately from the corresponding properties for addition modulo 2.

Using these properties, it is easy to check that the correctness property holds for E: for all k, m2 {0,1}L, we have

D(k, E(k, m) ) =D(k, k m) =k (k m) = (k k) m= 0L m=m.

The encryption and decryption functions happen to be the same in this case, but of course, not all ciphers have this property. 2

Example 2.2. A variable length one-time pad is a Shannon cipher E = (E, D), where the keys are bit strings of some fixed lengthL, while messages and ciphertexts are variable length bit strings, of length at mostL. Thus, E is defined over (K,M,C), where

K :={0,1}L and M:=C:={0,1}L.

for some parameterL. Here,{0,1}Ldenotes the set of all bit strings of length at mostL(including the empty string). For a key k2{0,1}L and a messagem 2{0,1}L of length `, the encryption function is defined as follows:

E(k, m) :=k[0 . .` 1] m,

(21)

and for a keyk2{0,1}L and ciphertextc2{0,1}Lof length`, the decryption function is defined as follows:

D(k, c) :=k[0 . .` 1] c.

Here, k[0 . .` 1] denotes the truncation of k to its first ` bits. The reader may verify that the correctness property holds for E. 2

Example 2.3. Asubstitution cipheris a Shannon cipherE = (E, D) of the following form. Let

⌃be a finite alphabet of symbols (e.g., the lettersA–Z, plus a space symbol, ). The message space Mand the ciphertext space C are both sequences of symbols from⌃of some fixed lengthL:

M:=C:=⌃L.

The key spaceKconsists of all permutations on⌃; that is, eachk2K is a one-to-one function from

⌃onto itself. Note that K is a very large set; indeed, |K|=|⌃|! (for |⌃|= 27, |K|⇡1.09·1028).

Encryption of a message m2⌃L under a keyk2K (a permutation on⌃) is defined as follows E(k, m) := k(m[0]), k(m[1]), . . . , k(m[L 1]) ,

where m[i] denotes the ith entry of m (counting from zero), and k(m[i]) denotes the application of the permutation k to the symbol m[i]. Thus, to encrypt m under k, we simply apply the permutation kcomponent-wise to the sequencem. Decryption of a ciphertextc2⌃L under a key k2K is defined as follows:

D(k, c) := k 1(c[0]), k 1(c[1]), . . . , k 1(c[L 1]) .

Here,k 1is the inverse permutation ofk, and to decryptcunderk, we simply applyk 1component- wise to the sequencec. The correctness property is easily verified: for a message m2⌃L and key k2K, we have

D(k, E(k, m) ) =D(k, (k(m[0]), k(m[1]), . . . , k(m[L 1]) )

= (k 1(k(m[0])), k 1(k(m[1])), . . . , k 1(k(m[L 1])))

= (m[0], m[1], . . . , m[L 1]) =m. 2

Example 2.4 (additive one-time pad). We may also define a “addition mod n” variation of the one-time pad. This is a cipher E = (E, D), defined over (K,M,C), where K := M := C :=

{0, . . . , n 1}, wheren is a positive integer. Encryption and decryption are defined as follows:

E(k, m) :=m+kmodn D(k, c) :=c kmodn.

The reader may easily verify that the correctness property holds forE. 2 2.2.2 Perfect security

So far, we have just defined the basic syntax and correctness requirements of a Shannon cipher.

Next, we address the question: what is a “secure” cipher? Intuitively, the answer is that a secure cipher is one for which an encrypted message remains “well hidden,” even after seeing its encryp- tion. However, turning this intuitive answer into one that is both mathematically meaningful and practically relevant is a real challenge. Indeed, although ciphers have been used for centuries, it

(22)

is only in the last few decades that mathematically acceptable definitions of security have been developed.

In this section, we develop the mathematical notion of perfect security — this is the “gold standard” for security (at least, when we are only worried about encrypting a single message and do not care about integrity). We will also see that it is possible to achieve this level of security;

indeed, we will show that the one-time pad satisfies the definition. However, the one-time pad is not very practical, in the sense that the keys must be as long as the messages: if Alice wants to send a 1GB file to Bob, they must already share a 1GB key! Unfortunately, this cannot be avoided:

we will also prove that any perfectly secure cipher must have a key space at least as large as its message space. This fact provides the motivation for developing a definition of security that is weaker, but that is acceptable from a practical point of view, and which allows one to encrypt long messages using short keys.

If Alice encrypts a message m under a key k, and an eavesdropping adversary obtains the ciphertext c, Alice only has a hope of keeping m secret if the key k is hard to guess, and that means, at the very least, that the key k should be chosen at random from a large key space. To say that m is “well hidden” must at least mean that it is hard to completely determine m from c, without knowledge of k; however, this is not really enough. Even though the adversary may not knowk, we assume that he does know the encryption algorithm and the distribution of k. In fact, we will assume that when a message is encrypted, the key k is always chosen at random, uniformly from among all keys in the key space. The adversary may also have some knowledge of the message encrypted — because of circumstances, he may know that the set of possible messages is quite small, and he may know something about how likely each possible message is. For example, suppose he knows the message m is eitherm0 ="ATTACK AT DAWN" orm1 ="ATTACK AT DUSK", and that based on the adversary’s available intelligence, Alice is equally likely to choose either one of these two messages. This, without seeing the ciphertext c, the adversary would only have a 50% chance of guessing which message Alice sent. But we are assuming the adversary does know c. Even with this knowledge, both messages may be possible; that is, there may exist keys k0

and k1 such that E(k0, m0) = c and E(k1, m1) = c, so he cannot be sure ifm = m0 orm = m1. However, he can still guess. Perhaps it is a property of the cipher that there are 800 keysk0 such that E(k0, m0) =c, and 600 keysk1 such that E(k1, m1) = c. If that is the case, the adversary’s best guess would be that m = m0. Indeed, the probability that this guess is correct is equal to 800/(800 + 600) ⇡ 57%, which is better than the 50% chance he would have without knowledge of the ciphertext. Our formal definition of perfect security expressly rules out the possibility that knowledge of the ciphertext increases the probability of guessing the encrypted message, or for that matter, determiningany property of the message whatsoever.

Without further ado, we formally define perfect security. In this definition, we will consider a probabilistic experiment in which the key is drawn uniformly from the key space. We write k to denote the random variable representing this random key. For a message m, E(k, m) is another random variable, which represents the application of the encryption function to our random key and the message m. Thus, every message m gives rise to a di↵erent random variable E(k, m).

Definition 2.1 (perfect security). Let E = (E, D) be a Shannon cipher defined over (K,M,C).

Consider a probabilistic experiment in which the random variablekis uniformly distributed overK. If for all m0, m12M, and all c2C, we have

Pr[E(k, m0) =c] = Pr[E(k, m1) =c],

(23)

then we say that E is a perfectly secure Shannon cipher.

There are a number of equivalent formulations of perfect security that we shall explore. We state a couple of these here.

Theorem 2.1. Let E = (E, D) be a Shannon cipher defined over (K,M,C). The following are equivalent:

(i) E is perfectly secure.

(ii) For every c2C, there exists Nc (possibly depending on c) such that for all m2M, we have

|{k2K :E(k, m) =c}|=Nc.

(iii) If the random variable k is uniformly distributed over K, then each of the random variables E(k, m), for m2M, has the same distribution.

Proof. To begin with, let us restate (ii) as follows: for every c 2 C, there exists a number Pc (depending on c) such that for all m 2 M, we have Pr[E(k, m) = c] =Pc. Here, k is a random variable uniformly distributed over K. Note that Pc = Nc/|K|, where Nc is as in the original statement of (ii).

This version of (ii) is clearly the same as (iii).

(i) =) (ii). We prove (ii) assuming (i). To prove (ii), let c 2 C be some fixed ciphertext.

Pick some arbitrary messagem0 2M, and letPc := Pr[E(k, m0) =c]. By (i), we know that for all m2M, we have Pr[E(k, m) =c] = Pr[E(k, m0) =c] =Pc. That proves (ii).

(ii) =) (i). We prove (i) assuming (ii). Consider any fixed m0, m1 2Mand c2C. (ii) says that Pr[E(k, m0) =c] =Pc = Pr[E(k, m1) =c], which proves (i). 2

As promised, we give a proof that the one-time pad (see Example 2.1) is perfectly secure.

Theorem 2.2. The one-time pad is a perfectly secure Shannon cipher.

Proof. Suppose that the Shannon cipherE = (E, D) is a one-time pad, and is defined over (K,M,C), where K := M := C := {0,1}L. For any fixed message m 2 {0,1}L and ciphertext c 2 {0,1}L, there is a unique key k2{0,1}L satisfying the equation

k m=c,

namely,k:= m c. Therefore, E satisfies condition (ii) in Theorem 2.1 (with Nc = 1 for each c).

2

Example 2.5. Consider again the variable length one-time pad, defined in Example 2.2. This does not satisfy our definition of perfect security, since a ciphertext has the same length as the corresponding plaintext. Indeed, let us choose an arbitrary string of length 1, call it m0, and an arbitrary string of length 2, call itm1. In addition, suppose thatcis an arbitrary length 1 string, and thatk is a random variable that is uniformly distributed over the key space. Then we have

Pr[E(k, m0) =c] = 1/2 and Pr[E(k, m1) =c] = 0, which provides a direct counter-example to Definition 2.1.

(24)

Intuitively, the variable length one-time pad cannot satisfy our definition of perfect security simply because any ciphertext leaks the length of the corresponding plaintext. However, in some sense (which we do not make precise right now), this is theonlyinformation leaked. It is perhaps not clear whether this should be viewed as a problem with the cipher or with our definition of perfect security. On the one hand, one can imagine scenarios where the length of a message may vary greatly, and while we could always “pad” short messages to e↵ectively make all messages equally long, this may be unacceptable from a practical point of view, as it is a waste of bandwidth. On the other hand, one must be aware of the fact that in certain applications, leaking just the length of a message may be dangerous: if you are encrypting a “yes” or “no” answer to a question, just the length of the obvious ASCII encoding of these strings leakseverything, so you better pad “no”

out to three characters. 2

Example 2.6. Consider again the substitution cipher defined in Example 2.3. There are a couple of di↵erent ways to see that this cipher is not perfectly secure.

For example, choose a pair of messages m0, m1 2⌃L such that the first two components of m0 are equal, yet the first two components of m1 are not equal; that is,

m0[0] =m0[1] and m1[0]6=m1[1].

Then for each key k, which is a permutation on ⌃, if c = E(k, m0), then c[0] = c[1], while if c =E(k, m1), then c[0] 6=c[1]. In particular, it follows that if k is uniformly distributed over the key space, then the distributions of E(k, m0) andE(k, m1) will not be the same.

Even the weakness described in the previous paragraph may seem somewhat artificial. Another, perhaps more realistic, type of attack on the substitution cipher works as follows. Suppose the substitution cipher is used to encrypt email messages. As anyone knows, an email starts with a

“standard header,” such as"FROM". Suppose the ciphertext isc2⌃Lis intercepted by an adversary.

The secret key is actually a permutation kon ⌃. The adversary knows that c[0. . .3] = (k(F), k(R), k(O), k(M)).

Thus, if the original message is m 2 ⌃L, the adversary can now locate all positions in m where an F occurs, where an R occurs, where an O occurs, and where an M occurs. Based just on this information, along with specific, contextual information about the message, together with general information about letter frequencies, the adversary may be able to deduce quite a bit about the original message. 2

Example 2.7. Consider the additive one-time pad, defined in Example 2.4. It is easy to verity that this is perfectly secure. Indeed, it satisfies condition (ii) in Theorem 2.1 (withNc = 1 for each c). 2

The next two theorems develop two more alternative characterizations of perfect security. For the first, suppose an eavesdropping adversary applies some predicate to a ciphertext he has obtained. The predicate (which is a boolean-valued function on the ciphertext space) may be something very simple, like the parity function (i.e., whether the number of 1 bits in the ciphertext is even or odd), or it might be some more elaborate type of statistical test. Regardless of how clever or complicated the predicate is, perfect security guarantees that the value of this predicate on the ciphertext reveals nothing about the message.

(25)

Theorem 2.3. LetE= (E, D)be a Shannon cipher defined over(K,M,C). Consider a probabilistic experiment in whichkis a random variable uniformly distributed overK. ThenE is perfectly secure if and only if for every predicate onC, for all m0, m1 2M, we have

Pr[ (E(k, m0))] = Pr[ (E(k, m1))].

Proof. This is really just a simple calculation. On the one hand, suppose E is perfectly secure, and let ,m0, and m1 be given. LetS :={c2C: (c)}. Then we have

Pr[ (E(k, m0))] =X

c2S

Pr[E(k, m0) =c] =X

c2S

Pr[E(k, m1) =c] = Pr[ (E(k, m1))].

Here, we use the assumption that E is perfectly secure in establishing the second equality. On the other hand, suppose E is not perfectly secure, so there exist m0,m1, and csuch that

Pr[E(k, m0) =c]6= Pr[E(k, m1) =c].

Defining to be the predicate that is true for this particularc, and false for all other ciphertexts, we see that

Pr[ (E(k, m0))] = Pr[E(k, m0) =c]6= Pr[E(k, m1) =c] = Pr[ (E(k, m1))]. 2

The next theorem states in yet another way that perfect security guarantees that the ciphertext reveals nothing about the message. Suppose that m is a random variable distributed over the message space M. We do not assume that m is uniformly distributed over M. Now suppose k

is a random variable uniformly distributed over the key space K, independently of m, and define

c := E(k,m), which is a random variable distributed over the ciphertext space C. The following theorem says that perfect security guarantees thatcand mare independent random variables.

One way of characterizing this independence is to say that for each ciphertextc2Cthat occurs with nonzero probability, and each message m2M, we have

Pr[m=m|c=c] = Pr[m=m].

Intuitively, this means that after seeing a ciphertext, we have no more information about the message than we did before seeing the ciphertext.

Another way of characterizing this independence is to say that for each message m2M that occurs with nonzero probability, and each ciphertextc2C, we have

Pr[c=c|m=m] = Pr[c=c].

Intuitively, this means that the choice of message has no impact on the distribution of the ciphertext.

The restriction that mandk are independent random variables is sensible: in using any cipher, it is a very bad idea to choose the key in a way that depends on the message, or vice versa (see Exercise 2.16).

Theorem 2.4. Let E = (E, D) be a Shannon cipher defined over (K,M,C). Consider a random experiment in which k andm are random variables, such that

k is uniformly distributed overK,

(26)

mis distributed over M, and

k and mare independent.

Define the random variable c:=E(k,m). Then we have:

• if E is perfectly secure, thenc and mare independent;

• conversely, if c and m are independent, and each message in M occurs with nonzero proba- bility, thenE is perfectly secure.

Proof. For the first implication, assume that E is perfectly secure. Consider any fixedm2Mand c2C. We want to show that

Pr[c=c^m=m] = Pr[c=c] Pr[m=m].

We have

Pr[c=c^m=m] = Pr[E(k,m) =c^m=m]

= Pr[E(k, m) =c^m=m]

= Pr[E(k, m) =c] Pr[m=m] (by independence ofkand m).

So it will suffice to show that Pr[E(k, m) =c] = Pr[c=c]. But we have Pr[c=c] = Pr[E(k,m) =c]

= X

m02M

Pr[E(k,m) =c^m=m0] (by total probability)

= X

m02M

Pr[E(k, m0) =c^m=m0]

= X

m02M

Pr[E(k, m0) =c] Pr[m=m0] (by independence ofk and m)

= X

m02M

Pr[E(k, m) =c] Pr[m=m0] (by definition of perfect security)

= Pr[E(k, m) =c] X

m02M

Pr[m=m0]

= Pr[E(k, m) =c] (probabilities sum to 1).

For the second implication, assume thatcandmare independent, and each message inMoccurs with nonzero probability. Let m2M and c2C. We will show that Pr[E(k, m) =c] = Pr[c=c], from which perfect security immediately follows. Since Pr[m=m]6= 0, this is seen thusly:

Pr[E(k, m) =c] Pr[m=m] = Pr[E(k, m) =c^m=m] (by independence ofk and m)

= Pr[E(k,m) =c^m=m]

= Pr[c=c^m=m]

= Pr[c=c] Pr[m=m] (by independence ofcand m). 2

(27)

2.2.3 The bad news

We have saved the bad news for last. The next theorem shows that perfect security is such a powerful notion that one can really do no better than the one-time pad: keys must be at least as long as messages. As a result, it is almost impossible to use perfectly secure ciphers in practice: if Alice wants to send Bob a 1GB video file, then Alice and Bob have to agree on a 1GB secret key in advance.

Theorem 2.5 (Shannon’s theorem). Let E = (E, D) be a Shannon cipher defined over (K,M,C). If E is perfectly secure, then |K| |M|.

Proof. Assume that |K|< |M|. We want to show that E is not perfectly secure. To this end, we show that there exist messages m0 and m1, and a ciphertextc, such that

Pr[E(k, m0) =c]>0, and (2.1)

Pr[E(k, m1) =c] = 0. (2.2)

Here,k is a random variable, uniformly distributed overK.

To do this, choose any message m0 2M, and any key k0 2K. Let c:=E(k0, m0). It is clear that (2.1) holds.

Next, let

S:={D(k1, c) :k1 2K}. Clearly,

|S||K|<|M|, and so we can choose a messagem12M \S.

To prove (2.2), we need to show that there is no key k1 such that E(k1, m1) = c. Assume to the contrary that E(k1, m1) =c for somek1; then for this keyk1, by the correctness property for ciphers, we would have

D(k1, c) =D(k1, E(k1, m1) ) =m1,

which would imply thatm1belongs toS, which is not the case. That proves (2.2), and the theorem follows. 2

2.3 Computational ciphers and semantic security

As we have seen in Shannon’s theorem (Theorem 2.5), the only way to achieve perfect security is to have keys that are as long as messages. However, this is quite impractical: we would like to be able to encrypt a long message (say, a document of several megabytes) using a short key (say, a few hundred bits). The only way around Shannon’s theorem is to relax our security requirements. The way we shall do this is to consider not all possible adversaries, but only computationally feasible adversaries, that is, “real world” adversaries that must perform their calculations on real computers using a reasonable amount of time and memory. This will lead to a weaker definition of security calledsemantic security. Furthermore, our definition of security will be flexible enough to allow ciphers with variable length message spaces to be considered secure so long as they do not leak any useful information about an encrypted message to an adversary other than the length of message.

Also, since our focus is now on the “practical,” instead of the “mathematically possible,” we shall also insist that the encryption and decryption functions are themselves efficient algorithms, and not just arbitrary functions.

(28)

2.3.1 Definition of a computational cipher

A computational cipher E = (E, D) is a pair of efficient algorithms, E and D. The encryption algorithmEtakes as input a keyk, along with a messagem, and produces as output a ciphertextc.

The decryption algorithmDtakes as input a keyk, a ciphertextc, and outputs a messagem. Keys lie in some finite key spaceK, messages lie in a finite message spaceM, and ciphertexts lie in some finite ciphertext spaceC. Just as for a Shannon cipher, we say thatE is defined over (K,M,C).

Although it is not really necessary for our purposes in this chapter, we will allow the encryption functionE to be aprobabilisticalgorithm (see Chapter D). This means that for fixed inputskand m, the output of E(k, m) may be one of many values. To emphasize the probabilistic nature of this computation, we write

c R E(k, m)

to denote the process of executingE(k, m) and assigning the output to the program variablec. We shall use this notation throughout the text whenever we use probabilistic algorithms. Similarly, we write

k R K

to denote the process of assigning to the program variable k a random, uniformly distributed element of from the key space K. We shall use the analogous notation to sample uniformly from any finite set.

We will not see any examples of probabilistic encryption algorithms in this chapter (we will see our first examples of this in Chapter 5). Although one could allow the decryption algorithm to be probabilistic, we will have no need for this, and so will only discuss ciphers with deterministic decryption algorithms. However, it will be occasionally be convenient to allow the decryption algorithm to return a specialrejectvalue (distinct from all messages), indicating some kind of error occurred during the decryption process.

Since the encryption algorithm is probabilistic, for a given keykand messagem, the encryption algorithm may output one of many possible ciphertexts; however, each of these possible ciphertexts should decrypt tom. We can state thiscorrectness requirement more formally as follows: for all keys k2K and messagesm2M, if we execute

c R E(k, m),m0 D(k, c), thenm=m0 with probability 1.

From now on, whenever we refer to a cipher, we shall mean a computational cipher, as defined above. Moreover, if the encryption algorithm happens to be deterministic, then we may call the cipher a deterministic cipher.

Observe that any deterministic cipher is a Shannon cipher; however, a computational cipher need not be a Shannon cipher (if it has a probabilistic encryption algorithm), and a Shannon cipher need not be a computational cipher (if its encryption or decryption operations have no efficient implementations).

Example 2.8. The one-time pad (see Example 2.1) and the variable length one-time pad (see Example 2.2) are both deterministic ciphers, since their encryption and decryption operations may be trivially implemented as efficient, deterministic algorithms. The same holds for the substitution

(29)

cipher (see Example 2.3), provided the alphabet⌃ is not too large. Indeed, in the obvious imple- mentation, a key — which is a permutation on⌃— will be represented by an array indexed by⌃, and so we will require O(|⌃|) space just to store a key. This will only be practical for reasonably sized ⌃. The additive one-time pad discussed in Example 2.4 is also a deterministic cipher, since both encryption and decryption operations may be efficiently implemented (if n is large, special software to do arithmetic with large integers may be necessary). 2

2.3.2 Definition of semantic security

To motivate the definition of semantic security, consider a deterministic cipherE= (E, D), defined over (K,M,C). Consider again the formulation of perfect security in Theorem 2.3. This says that for all predicates on the ciphertext space, and all messages m0, m1, we have

Pr[ (E(k, m0))] = Pr[ (E(k, m1))], (2.3) wherek is a random variable uniformly distributed over the key spaceK. Instead of insisting that these probabilities are equal, we shall only require that they are very close; that is,

Pr[ (E(k, m0))] Pr[ (E(k, m1))] ✏, (2.4) for some very small, or negligible, value of ✏. By itself, this relaxation does not help very much (see Exercise 2.5). However, instead of requiring that (2.4) holds for every possible , m0, and m1, we only require that (2.4) holds for all messages m0 and m1 that can be generated by some efficient algorithm, and all predicates that can be computed by some efficient algorithm (these algorithms could be probabilistic). For example, suppose it were the case that using the best possible algorithms for generating m0 and m1, and for testing some predicate , and using (say) 10,000 computers in parallel for 10 years to perform these calculations, (2.4) holds for ✏= 2 100. While not perfectly secure, we might be willing to say that the cipher is secure for all practical purposes.

Also, in defining semantic security, we address an issue raised in Example 2.5. In that example, we saw that the variable length one-time pad did not satisfy the definition of perfect security.

However, we want our definition to be flexible enough so that ciphers like the variable length one- time pad, which e↵ectively leak no information about an encrypted message other than its length, may be considered secure as well.

Now the details. To precisely formulate the definition of semantic security, we shall describe an attack game played between two parties: thechallengerand anadversary. As we will see, the challenger follows a very simple, fixed protocol. However, an adversary Amay follow an arbitrary (but still efficient) protocol. The challenger and the adversary A send messages back and forth to each other, as specified by their protocols, and at the end of the game, A outputs some value.

Actually, our attack game for defining semantic security comprises two alternative “sub-games,”

or “experiments” — in both experiments, the adversary follows the same protocol; however, the challenger’s behavior is slightly di↵erent in the two experiments. The attack game also defines a probability space, and this in turn defines the adversary’sadvantage, which measures the di↵erence between the probabilities of two events in this probability space.

Attack Game 2.1 (semantic security). For a given cipherE= (E, D), defined over (K,M,C), and for a given adversary A, we define two experiments, Experiment 0 and Experiment 1. For b= 0,1, we define

(30)

Challenger A m0, m12M

k R K

ˆb2{0,1} (Experimentb)

c R E(k, mb) c

Figure 2.1: Experiment b of Attack Game 2.1

Experiment b:

• The adversary computesm0, m1 2M, of the same length, and sends them to the challenger.

• The challenger computesk R K,c R E(k, mb), and sendsc to the adversary.

• The adversary outputs a bit ˆb2{0,1}.

For b= 0,1, let Wb be the event that A outputs 1 in Experiment b. We define A’s semantic security advantagewith respect to E as

SSadv[A,E] := Pr[W0] Pr[W1]. 2

Note that in the above game, the eventsW0 andW1 are defined with respect to the probability space determined by the random choice of k, the random choices made (if any) by the encryption algorithm, and the random choices made (if any) by the adversary. The value SSadv[A,E] is a number between 0 and 1.

See Fig. 2.1 for a schematic diagram of Attack Game 2.1. As indicated in the diagram, A’s

“output” is really just a final message to the challenger.

Definition 2.2 (semantic security). A cipher E is semantically secure if for all efficient adversaries A, the value SSadv[A,E] is negligible.

As a formal definition, this is not quite complete, as we have yet to define what we mean by

“messages of the same length”, “efficient adversaries”, and “negligible”. We will come back to this shortly.

Let us relate this formal definition to the discussion preceding it. Suppose that the adversary A in Attack Game 2.1 is deterministic. First, the adversary computes in a deterministic fashion messages m0, m1, and then evaluates a predicate on the ciphertext c, outputting 1 if true and 0 if false. Semantic security says that the value ✏ in (2.4) is negligible. In the case where A is probabilistic, we can view A as being structured as follows: it generates a random value r from

Références

Documents relatifs

The method is composed of a set of performed functions: (a) Domain Knowledge Acquaintance is performed by the Modelling Experts; (b) Co-Modelling Workshop Organisation is

Suppose R is a right noetherian, left P-injective, and left min-CS ring such that every nonzero complement left ideal is not small (or not singular).. Then R

As a final comment, we emphasize that simplicial LS-category is a new strong homotopy invariant, defined in purely combinatorial terms, that generalizes to arbitrary

Note that the above definition of a private-key encryption scheme does not specify any secrecy (or privacy) properties; the only non-trivial requirement is that the decryption

The other rules have to do with second-order types and subtyping; similar rules can be found in the literature for explicitly typed systems (system Fd, but to the best of our

It should not be possible to bypass authentication or encryption by asking for a different address for the same host.. Source Routing and

The URLs which are being standardized in the IETF are directly based on the way World Wide Web built pointers to resources, by creating a uniform way to specify access

The transmitted value of the Checksum field is the checksum value that would have been computed for the entire IP header if the TTL field had been set to zero (i.e., the