• Aucun résultat trouvé

Reproducible builds in Debian and everywhere

N/A
N/A
Protected

Academic year: 2022

Partager "Reproducible builds in Debian and everywhere"

Copied!
132
0
0

Texte intégral

(1)

Reproducible builds in Debian and everywhere

Lunar [email protected]

Libre Software Meeting

2015-06-07

Lunar (Debian) Reproducible builds LSM2015 1 / 126

(2)

What?

(3)

What are reproducible builds?

“reproducible builds”

enable anyone to reproduce identical binary packages

from a given source

Lunar (Debian) Reproducible builds LSM2015 3 / 126

(4)

Trusting a build?

source

build

binary

free software

freedom to study

freedom to run

can be verified can be used

could I get a proof?

(5)

Trusting a build?

source

build

binary free software

freedom to study

freedom to run

can be verified can be used

could I get a proof?

Lunar (Debian) Reproducible builds LSM2015 4 / 126

(6)

Trusting a build?

source

build

binary

free software

freedom to study

freedom to run

can be verified can be used

could I get a proof?

(7)

Trusting a build?

source

build

binary

free software

freedom to study

freedom to run

can be verified can be used

could I get a proof?

Lunar (Debian) Reproducible builds LSM2015 4 / 126

(8)

Why?

(9)

Why?

Reproducible builds allow for independent verifications that a binary matches what the source intended to produce.

… and other nice things.

Lunar (Debian) Reproducible builds LSM2015 6 / 126

(10)

But I’m the developer!

“I know what’s in the binary because I compiled it myself!”

“I’m an upstanding, careful, and responsible individual!”

“Why should I have to worry about hypothetical risks about the contents of my binaries?”

(11)

No kidding

At a CIA conference in 2012:

Source : The Intercept, 2015-03-10

Lunar (Debian) Reproducible builds LSM2015 8 / 126

(12)

Unpleasant thoughts

We think of software development as a fundamentally benign activity.

I “I’m not that interesting.”

Users can be targeted through developers

Known successful attacks against infrastructure used by Linux (2003), FreeBSD (2013)

(13)

Strong motivations

Compromise one computer to get:

I Hundreds of millions of other computers?

I Every bank account in the world?

I Every Windows computer in the world?

I Every Linux server in the world?

Compromise one computer is worth:

I $100k USD? (Market price of remote 0day)

I $100M USD? (Censorship budget of Iran per year)

I $4B USD? (Bitcoin market cap)

Lunar (Debian) Reproducible builds LSM2015 10 / 126

(14)

How small can a backdoor be?

OpenSSH 3.0.2 (CVE-2002-0083) – exploitable security bug (privilege escalation: user can get root)

{ Channel *c;

- if (id < 0 || id > channels_alloc) { + if (id < 0 || id >= channels_alloc) {

log("channel_lookup: %d: bad id", id);

return;

}

(15)

Result of fixing the bug (asm)

cmpl $0x0,0x8(%ebp) cmpl $0x0,0x8(%ebp)

js 16 js 16

mov 0x4,%eax mov 0x4,%eax

cmp %eax,0x8(%ebp) cmp %eax,0x8(%ebp)

jle 30 jl 30

mov 0x8(%ebp),%eax mov 0x8(%ebp),%eax mov %eax,0x4(%esp) mov %eax,0x4(%esp) movl $0x4c,(%esp) movl $0x4c,(%esp)

call 25 call 25

Lunar (Debian) Reproducible builds LSM2015 12 / 126

(16)

Result of fixing the bug (asm)

cmpl $0x0,0x8(%ebp) cmpl $0x0,0x8(%ebp)

js 16 js 16

mov 0x4,%eax mov 0x4,%eax

cmp %eax,0x8(%ebp) cmp %eax,0x8(%ebp)

jle 30 jl 30

mov 0x8(%ebp),%eax mov 0x8(%ebp),%eax mov %eax,0x4(%esp) mov %eax,0x4(%esp) movl $0x4c,(%esp) movl $0x4c,(%esp)

(17)

Resulting difference in the binary

What’s the difference between if (a > b) and if (a >= b) in x86 assembly?

assembly: JLE JL

opcode: 0x7E 0x7C

binary: 01111110 01111100

A single bit!

Other corresponding opcode pairs also differ by just a single bit (JGE=0x7D, JG=0x7F)

Lunar (Debian) Reproducible builds LSM2015 13 / 126

(18)

Result of fixing the bug (hex)

Vulnerable Fixed

55 89 e5 83 ec 28 83 7d 08 00 78 0a a1 04 00 00 00 39 45 08 7e 1a 8b 45 08 89 44 24 04 c7 04 24 4c 00 00 00 e8 fc ff ff ff b8 00 00 00 00 eb 35

55 89 e5 83 ec 28 83 7d 08 00 78 0a a1 04 00 00 00 39 45 08 7c 1a 8b 45 08 89 44 24 04 c7 04 24 4c 00 00 00 e8 fc ff ff ff b8 00 00 00 00 eb 35

(19)

Result of fixing the bug (hex)

Vulnerable Fixed

55 89 e5 83 ec 28 83 7d 08 00 78 0a a1 04 00 00 00 39 45 08 7e 1a 8b 45 08 89 44 24 04 c7 04 24 4c 00 00 00 e8 fc ff ff ff b8 00 00 00 00 eb 35

55 89 e5 83 ec 28 83 7d 08 00 78 0a a1 04 00 00 00 39 45 08 7c 1a 8b 45 08 89 44 24 04 c7 04 24 4c 00 00 00 e8 fc ff ff ff b8 00 00 00 00 eb 35

Overall file size: approx. 500 kBLunar (Debian) Reproducible builds LSM2015 14 / 126

(20)

My first

(21)

Not the first

Bitcoin’s motivation:

Malicious modifications to binaries could result in irrevocable unwanted transfers of bitcoins

Individual developers could be blamed for such modifications

Users might not believe that a developer’s machine was hacked

Reproducible builds therefore protect developers

Lunar (Debian) Reproducible builds LSM2015 16 / 126

(22)

And nothing new, even

From: Martin Uecker <[email protected]>

Cc: [email protected]

Date: Sun, 23 Sep

2007

23:32:59 +0200

I think it would be really cool if the Debian policy required that packages could be rebuild bit-identical from source. At the moment, it is impossible to independly verify the

integricity of binary packages.

https://lists.debian.org/debian-devel/2007/09/msg00746.html

(23)

Why Debian?

Debian is the largest collection of free software More than 21,000 source packages

“Our priorities are our users and free software”

Lunar (Debian) Reproducible builds LSM2015 18 / 126

(24)

How?

(25)

How to achieve reproducibility?

Determine the build environment Reproduce the build environment Eliminate unneeded variations

Lunar (Debian) Reproducible builds LSM2015 20 / 126

(26)

How to

Handle the build environment

(27)

Solution 1: rebuild the environment

Scripts will rebuild the environment from sources URL and checksum of source tarballs are recorded

… or go crazy and check-in everything in the VCS Approach used by Coreboot, OpenWrt, Bazel

Lunar (Debian) Reproducible builds LSM2015 22 / 126

(28)

Solution 1: rebuild the environment

Bonus: changes in the toolchain can be reviewed, merged, bisected, etc. just like other changes in the code.

(29)

Solution 2: system images and packages

Use a virtual machine or container Base install of a common distribution Follow-up installation script

Tools: Gitian (Bitcoin, Tor Browser), rbm (Tor Messenger)

Lunar (Debian) Reproducible builds LSM2015 24 / 126

(30)

Solution 2: system images and packages

Downsides: if you use packages, you need to stick to stable distributions and hope no security fixes will affect the build.

Or you do mix it with solution 1.

(31)

Solution 3: record and replay

In Debian, a new control file *.buildinfo will record:

Versions of build dependencies

I … and their dependencies

Checksum of the source package.

Checksums of the binary packages.

Lunar (Debian) Reproducible builds LSM2015 26 / 126

(32)

Example .buildinfo

Format: 1.9

Build-Architecture: amd64 Source: txtorcon

Binary: python-txtorcon Architecture: all Version: 0.11.0-1

Build-Path: /usr/src/debian/txtorcon-0.11.0-1 Checksums-Sha256:

a26549d9…7b 125910 python-txtorcon_0.11.0-1_all.deb 28f6bcbe…69 2039 txtorcon_0.11.0-1.dsc

Build-Environment:

base-files (= 8), base-passwd (= 3.5.37), bash (= 4.3-11+b1),

(33)

snapshot.debian.org

snapshot.debian.org archives every state of the Debian archive.

2015-05-25: 29 terabytes of data in 17 million files.

Lunar (Debian) Reproducible builds LSM2015 28 / 126

(34)

srebuild

Thin wrapper around sbuild Determines the base release

Installs packages listed in the *.buildinfo file Starts the build

Status: proof-of-concept in #774415

(35)

How to

Eliminate unneeded variations

Lunar (Debian) Reproducible builds LSM2015 30 / 126

(36)

General approach

Gitian (Bitcoin, Tor Browser):

I Use a VM: same kernel, same user, same build path, …

I libfaketime

Debian:

I Fix the tools

I Fix the build systems

I Work-arounds as last resort

(37)

Investigating packages

diffOpenSUSE build-compare debbindiff

Lunar (Debian) Reproducible builds LSM2015 32 / 126

(38)

debbindiff

Examines differences in depth

Outputs HTML or plain text showing the differences Recursively unpack archives

Seeks human readability:

I uncompress PDF

I disassemble binaries,

I unpack Gettext files,

I easy to extend to new file formats

Falls back to binary comparison

(39)

(and test again) Test

Lunar (Debian) Reproducible builds LSM2015 34 / 126

(40)

Finding variations

Build the package Rebuild the package Compare the results

(41)

reproducible.debian.net

Continuous test system driven by Jenkins Bad ass hardware sponsored by ProfitBricks

Tests about 1300 source packages each day on average Results are visible on a website

Recent additions: Coreboot and OpenWrt

Lunar (Debian) Reproducible builds LSM2015 36 / 126

(42)

Variations for Debian packages

The second build differs by:

timetimezone file ordering process ordering

cores used for the build

(43)

Variations for Debian packages

hostname, domainname username, uid, gid umask

language (LANG) and locale (LC_ALL)

kernel version (using linux64 --uname-2.6) PATH

Lunar (Debian) Reproducible builds LSM2015 38 / 126

(44)

Still the same for now

date (but we cheat with timezone) /proc/cpuinfo

rebuilds on different filesystems (currently tmpfs only) Are we forgetting something?

(45)

Findings

Lunar (Debian) Reproducible builds LSM2015 40 / 126

(46)

Identified issues

Timestamps (recording current time) File order

(Pseudo-)randomness:

I Temporary file paths

I UUID

I Protection against complexity attacks

(47)

Identified issues (cont.)

CPU and memory related:

I Code optimizations for current CPU class

I Recording of memory addresses

Build-path

Others, eg. locale settings

Lunar (Debian) Reproducible builds LSM2015 42 / 126

(48)

Identified issues (cont.)

Examples

Timestamps added by build systems

(49)

Timestamps in gzip headers

Lunar (Debian) Reproducible builds LSM2015 44 / 126

(50)

Timestamps written by Maven

(51)

Timestamps in generated Makefiles

Lunar (Debian) Reproducible builds LSM2015 46 / 126

(52)

Timestamps in header files

(53)

Timestamps written by PyQt4

Lunar (Debian) Reproducible builds LSM2015 48 / 126

(54)

Timestamps written by Erlang compiler

(55)

Timestamps in PE binaries

Windows, UEFI, Mono…

Lunar (Debian) Reproducible builds LSM2015 50 / 126

(56)

Timestamps in ADA library information

(57)

Timestamps in Ruby gemspec files

Lunar (Debian) Reproducible builds LSM2015 52 / 126

(58)

Timestamps in PHP registry

(59)

Timestamps by a template engine

Lunar (Debian) Reproducible builds LSM2015 54 / 126

(60)

Timestamps in Python version

(61)

Identified issues (cont.)

Examples

Archives

Lunar (Debian) Reproducible builds LSM2015 56 / 126

(62)

Timestamps in static libraries

(63)

Timestamps in static libraries (cont.)

Lunar (Debian) Reproducible builds LSM2015 58 / 126

(64)

Timestamps in ZIP archives

(65)

Timestamps in Java jar

They are actually ZIP archives.

Lunar (Debian) Reproducible builds LSM2015 60 / 126

(66)

Timestamps in tarballs

(67)

Users and groups in tarballs

Lunar (Debian) Reproducible builds LSM2015 62 / 126

(68)

Random order in tarballs

(69)

Identified issues (cont.)

Examples

Timestamps in documentation

Lunar (Debian) Reproducible builds LSM2015 64 / 126

(70)

Timestamps written by Doxygen

(71)

Timestamps written by docbook-to-man

Lunar (Debian) Reproducible builds LSM2015 66 / 126

(72)

Timestamps written by Groovydoc

(73)

Timestamps written by Epydoc

Lunar (Debian) Reproducible builds LSM2015 68 / 126

(74)

Timestamps written by Sphinx

(75)

Timestamps written by Ghostscript

Lunar (Debian) Reproducible builds LSM2015 70 / 126

(76)

Timestamps written by LaTeX

(77)

Timestamps written by texi2html

Lunar (Debian) Reproducible builds LSM2015 72 / 126

(78)

Timestamps written by texi2html (cont.)

(79)

Timestamps written by help2man

Lunar (Debian) Reproducible builds LSM2015 74 / 126

(80)

Timestamps written by GNU groff

(81)

Timestamps written by Javadoc

Lunar (Debian) Reproducible builds LSM2015 76 / 126

(82)

Timestamps written by man2html

(83)

Timestamps in TeX output (.dvi)

Lunar (Debian) Reproducible builds LSM2015 78 / 126

(84)

Identified issues (cont.)

Examples

“Compiled at/on/by”

(85)

Build time via C preprocessor macros

Lunar (Debian) Reproducible builds LSM2015 80 / 126

(86)

Build time via C preprocessor macros

(87)

Build time recorded via Makefile

Lunar (Debian) Reproducible builds LSM2015 82 / 126

(88)

Hostname recorded via ./configure

(89)

Build time recorded via ./configure

Lunar (Debian) Reproducible builds LSM2015 84 / 126

(90)

m4 macros for autoconf (build time)

(91)

m4 macros for autoconf (username)

Lunar (Debian) Reproducible builds LSM2015 86 / 126

(92)

m4 macros for autoconf (hostname)

(93)

Recorded kernel version

Lunar (Debian) Reproducible builds LSM2015 88 / 126

(94)

Bonus points for programmers

(95)

Identified issues (cont.)

Examples

File ordering

Lunar (Debian) Reproducible builds LSM2015 90 / 126

(96)

File ordering in python-support files

(97)

Identified issues (cont.)

Examples

Randomness

Lunar (Debian) Reproducible builds LSM2015 92 / 126

(98)

Random Perl hash order

See Algorithmic complexity attacks in perlsec(1).

(99)

Random serial numbers in Ogg streams

Lunar (Debian) Reproducible builds LSM2015 94 / 126

(100)

Random import order in Python code

(101)

Random order in Python namespace files

Lunar (Debian) Reproducible builds LSM2015 96 / 126

(102)

Temporary filenames in Ocaml libraries

(103)

Identified issues (cont.)

Examples

Even more timestamps!

Lunar (Debian) Reproducible builds LSM2015 98 / 126

(104)

Timestamp-dependent rebuilds

(105)

Timezone gets recorded

Lunar (Debian) Reproducible builds LSM2015 100 / 126

(106)

Timestamps in EPUB files

(107)

Timestamps in PNG

Even images!

Lunar (Debian) Reproducible builds LSM2015 102 / 126

(108)

Timestamps in TrueType font files

And fonts!

(109)

You think those were enough issues?

Lunar (Debian) Reproducible builds LSM2015 104 / 126

(110)

Default changes with locale

(111)

Not illustrated

Build path getting recorded

Environment variables (e.g. PATH) File permissions inconsistency Cryptographic signatures And even more…

Lunar (Debian) Reproducible builds LSM2015 106 / 126

(112)

Fixes

(113)

Fixing timestamps

Replace with a precise reference to the code:

I Actual version number

I Git commit hash or other VCS reference

Or use a known date (latest modification on the code) Implement support for SOURCE_DATE_EPOCH

Lunar (Debian) Reproducible builds LSM2015 108 / 126

(114)

Fixing timezone related issues

Use UTC!

(115)

Fixing file order variations

Sort!

Be beware of the locale!

Lunar (Debian) Reproducible builds LSM2015 110 / 126

(116)

Example: creating tarballs

Before:

I tar -cf archive.tar src

After:

I find src -print0 | LC_ALL=C sort -z |

tar --null -T - --no-recursion -cf archive.tar

(117)

Fixing random key ordering

Sort!

for my $limit (sort keys(%{$pat})) {

$Data::Dumper::Sortkeys = 1 for key in sorted(d.keys()):

Lunar (Debian) Reproducible builds LSM2015 112 / 126

(118)

Fixing build path

Don’t record build path

Workaround: perform rebuild in the same path as the original build

(119)

Handling cryptographic signatures

Make the signature part of your release process:

1 Do a first (reproducible) build

2 Sign it

3 Ship the signature with the source code

4 Just copy the signature on next builds Example patch: #725803

Lunar (Debian) Reproducible builds LSM2015 114 / 126

(120)

Misc. fixes

Don’t record kernel, CPU type, login…

No added value Privacy issue

(121)

Cleanup via post-processing

re-dzip.sh

pyc-timestamp.sh strip-nondeterminism

Lunar (Debian) Reproducible builds LSM2015 116 / 126

(122)

strip-nondeterminism

Normalize various file formats Currently handles:

I ar archives (.a)

I gzip

I Java jar

I Javadoc HTML

I Maven pom.properties

I PNG

I ZIP archives

Written in Perl (like dpkg-dev)

(123)

How are we doing in Debian?

Lunar (Debian) Reproducible builds LSM2015 118 / 126

(124)

For those in the back

More than 18,000 source packages!

81%

(in our test environment)

(125)

For those in the back

More than 18,000 source packages!

81%

(in our test environment)

Lunar (Debian) Reproducible builds LSM2015 119 / 126

(126)

Still experimenting

It’s not yet possible to reproduce builds of official Debian packages

We are refining changes to the toolchain in an experimental repository

Next steps:

I Allow .buildinfo in the archive (#763822)

I Get changes to dpkg merged

I Ship .buildinfo to the mirrors

I Finish the srebuild srcript (#774415)

(127)

Other distributions

Fedora

http://securityblog.redhat.com/2013/09/18/

reproducible-builds-for-fedora/

OpenSUSE build-compare

https://build.opensuse.org/package/show/openSUSE:

Factory/build-compare NixOS

http://lists.science.uu.nl/pipermail/nix-dev/2013-June/

011357.html

FreeBSDhttps://wiki.freebsd.org/ReproducibleBuildsand https://wiki.freebsd.org/PortsReproducibleBuilds

OpenWrthttps://lists.openwrt.org/pipermail/openwrt-devel/

2015-March/032136.html

Lunar (Debian) Reproducible builds LSM2015 121 / 126

(128)

Reproducible Fedora?

No known activity after the initial blog post.

Fedora is leading developments for key components.

Can we help reproducible Fedora?

(129)

Everywhere?

Please talk to us!

Lunar (Debian) Reproducible builds LSM2015 123 / 126

(130)

Everywhere!

Reproducible builds need to become the norm.

We can help with continuous testing.

Debian wiki wants to be useful to everyone.

(131)

Thanks

Asheesh Laroia, Holger Levsen, Reiner Herrmann, Mattia Rizzolo, Daniel Kahn Gilmor, and so many others…

Mike Perry and Seth Schoen for their 31C3 talk Linux Foundation for sponsoring Holger and myself ProfitBricks for sponsoring jenkins.debian.net Globalsign for sponsoring X.509 certificates

Designers of Tango icons Everyone who helped!

Lunar (Debian) Reproducible builds LSM2015 125 / 126

(132)

Questions? Comments?

?

https://wiki.debian.org/ReproducibleBuilds https://reproducible.debian.net/

#debian-reproducible on OFTC Lunar

Références

Documents relatifs

Software packages uploaded to unstable are normally versions stable enough to be released by the original upstream developer, but with the added Debian-specific packaging and

 Third generation, from now on: Services evolution and global convergence.  Network and

to make it easy for users to install your software?.

It has often been noted that, on a certain conception of essence at least, answers to the question what it is to be the morning star may not themselves answer the question what it is

It cannot be generally the case that what is produced or founded has less reality than what produces or founds it, because both relations may hold between a thing and itself..

contingent grounding, if there is such a thing, the answer is less than clear-cut: certainly, the mixture is something over and above its ingredients, it does not owe its existence

Thus semantic connection should replace semantic value as the principal object of semantic enquiry; and just as the semantic evaluation of a complex expression is naturally taken

It may, however, be worth pointing out that in case the operation of modulation leaves the core meaning unaltered, the possibilities of the &#34;conditional interpretation&#34;