Part III: Technical Topics

15.2 Trusting downloaded software

Trust issues exist even before an individual connects her computer to any network. Installing software that was supplied with your computer or bought in a retail store implies a level of trust - you trust that the software will work in the manner described and that it won't do anything malicious. By purchasing it from a "reputable" company, you believe that you know who wrote the software and what kind of reputation is associated with their software products. In addition, you may be able to take legal action if something goes horribly wrong.

The advent of the Internet changes this model. Now software can be downloaded directly onto your computer. You may not know who the author is and whether the software has been maliciously modified or really does what it claims. We have all heard stories of an individual receiving an attachment via email that, when executed, deleted files on the victim's hard drive.

Ideally, when downloading software from the Internet, we would like to have the same assurances that we have when we purchase the software directly from a store. One might think that simply downloading software from companies that one is already familiar with raises no trust issues.

However, as the sampling of potential problems in Table 15.1 shows, this is not true. The software you are downloading may have been modified by a malicious party before you even begin downloading it. Even if it begins its journey unmodified, it has to travel to you over an untrusted network - the Internet. While the software is traveling on the network, it can be intercepted, modified, and then forwarded to you - all without your knowledge. Even if this doesn't happen, your Internet service provider (ISP) or another party could be logging the fact that you are downloading a particular piece of software or visiting a particular web site. This information can, for example, be used to target specific advertising at you. At the very least, this logging is an invasion of privacy. As we shall see, there are ways of overcoming each of these problems.^[1]

[1] For a more comprehensive discussion of this topic, see Bruce Schneier (1999), Secrets and Lies: Digital Security in a Networked World, John Wiley & Sons.

Table 15.1. Trust issues when downloading software

Risk: [...]
Solution: [...] reputation, or those you know where to find should a problem occur.
Trust principle: Look for positive reputations.

Risk: Software is modified (on server or in transit).
Solution: Check for digital signature on message digest and verify signature against author's certificate.
Trust principle: [...]

Risk: [...]
Solution: Use an anonymity tool so other parties do not get access to information that might link you to a particular download.
Trust principle: Reduce risk.

15.2.1 Message digest functions

Almost all of the software described in this book is given away for free. The only way to acquire it is to download it - you can't walk into your local computer store and purchase it. We would like some way to verify that the downloaded files have not been tampered with in any way. This can be accomplished through the use of a message digest function, which is also known as a cryptographically secure hash function. A message digest function takes a variable-length input message and produces a fixed-length output. The same message will always produce the same output. If the input message is changed in any way, the digest function produces, with overwhelming probability, a different output value. This feature makes digest functions ideal for detecting file tampering.
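The properties just listed can be illustrated with a minimal sketch. It assumes SHA-256 from Python's standard hashlib module as the digest function; the text does not name a particular function, and the helper name digest and the sample messages are purely illustrative.

```python
import hashlib


def digest(message: bytes) -> str:
    """Return the SHA-256 digest of a message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


original = b"Trusting downloaded software"
tampered = b"Trusting downloaded softwarf"  # last character changed

# The same input always gives the same output.
print(digest(original) == digest(original))

# A one-character change gives a completely different digest.
print(digest(original))
print(digest(tampered))

# The output length is fixed (64 hex characters) whatever the input length.
print(len(digest(b"")), len(digest(b"x" * 1_000_000)))
```

Any other cryptographically secure hash function could be substituted; only the fixed-length, input-sensitive behavior matters here.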

Now that we have message digest functions, it looks like all of our tampering problems are solved - the author of a piece of software just places the value of the file's hash on the same web page that contains the file download link. After the user downloads the file, a separate program computes the digest of the file. This digest is then compared with the one on the web page. If the digests don't match, the file has been altered; otherwise it is unchanged.

Unfortunately, things are not that simple. How do we know that the digest given on the web page is correct? Perhaps the server administrator or some malicious hacker changed the software and placed the digest of the modified file on the web page. If someone downloaded the altered file and checked the replaced digest, everything would look fine. The problem is that we do not have a mechanism to guarantee that the author of the file was the one who generated the particular digest. What we need is some way for the author to state the digest value so that someone else cannot change it.
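The download-then-compare step described above might look like the following sketch. It again assumes SHA-256; the function names file_digest and matches_published_digest are hypothetical, and the published digest is assumed to be a hexadecimal string copied from the web page.

```python
import hashlib
import hmac


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the digest copied from the web page."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_digest(path), expected)
```

A True result only shows that the file matches the published digest. As the paragraph above points out, it says nothing about who published that digest.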

15.2.2 Digital signatures

Public key cryptography and digital signatures can be used to help identify the author of a file.

Although the mathematics behind public key cryptography are beyond the scope of this book, suffice it to say that a pair of keys can be generated in such a way that if one key is used to sign some piece of data, the other key can be used to verify the signed data. Keys are essentially large numbers that are needed for the signature and verification operations. One of these keys is kept secret and is therefore called the private key. The other key is made available to everyone and is called the public key.

Someone can send you an authenticated message simply by signing the message with his private key. You can then use his public key to verify the signature on the message.

So it looks like our problem is almost solved. The author of the software generates a public and private key pair. The author then computes the digest of the software package and signs it using the private key. A file containing the signed digest is placed on the same web page as the file to be downloaded (the software package). After downloading the software, an individual computes its digest. The signed digest file is then downloaded from the web site, its signature is verified using the author's public key, and the digest it contains is compared with the one just computed.
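To make the sign-then-verify idea concrete, here is a toy sketch using textbook RSA over a SHA-256 digest. Everything in it is an assumption made for illustration: the text does not name a signature algorithm, the key is built from two small, publicly known Mersenne primes, and there is no padding. It should not be used to protect anything real.

```python
import hashlib

# Toy key pair. P and Q are known Mersenne primes, so this key offers
# no security at all; real keys use large, secret, random primes.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """SHA-256 digest of data, reduced to an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Author's side: sign the digest with the private exponent D."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Downloader's side: check the signature using only (E, N)."""
    return pow(signature, E, N) == digest_as_int(data)


package = b"contents of the software package"
signature = sign(package)
print(verify(package, signature))               # unmodified package
print(verify(package + b" (modified)", signature))  # altered package
```

Only the holder of D can produce a signature that verifies, while anyone who has E and N can check it.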

15.2.3 Digital certificates

The problem with the scheme is that we have no way of verifying the author's public key. How do we know that someone didn't just generate a public/private key pair, modify the file, and sign its digest with the private key just generated? The public key on the web site cannot necessarily be trusted. We need a way to certify that a particular public key does indeed belong to the author of the software.

Digital certificates are meant to provide this binding of public keys to individuals or organizations. Digital certificates are issued by companies called certifying authorities (CAs). These are organizations that mint digital certificates for a fee; they are often called trusted third parties because both you and your correspondent trust them. An individual or corporation requesting a certificate must supply the CA with the proper credentials. Once these credentials have been verified, the CA mints a new certificate in the name of the individual or corporation. The CA signs the certificate with its private key, and this signature becomes part of the certificate. The CA's signature vouches for the certificate's authenticity.

The certificate creation process just described is a simplification of the actual process. Different classes of certificates exist corresponding to the type of credentials presented when applying for the certificate. The more convincing the credentials, the more verification work is created for the CA, and therefore it assesses a higher annual fee on the individual or corporation applying for the certificate.

15.2.4 Signature verification

Now all the pieces are in place. The author of some software applies to a CA for a certificate. This certificate binds her to a public key - only she knows the associated private key. She signs her software using the method described above. The signed digest and a link to the software are placed on the author's web page. In addition, a link to the author's certificate is added to the web page. At some later time, an individual downloads the software and the author's certificate. The digest function is applied to the file. The author's certificate is verified using the CA's public key, which is available on the CA's web page. Once the certificate is verified, the author's public key is used to verify the signature on the digest. This digest is compared to the one just computed from the file. If the digests match, the file has not been tampered with. See Figure 15.1 for an illustration of the process.
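The whole chain can be sketched end to end under heavy simplifying assumptions. The code below uses the same kind of toy, unpadded RSA keys built from known Mersenne primes, so it is insecure by construction. The Certificate layout, the subject name "Alice Author", and every function name are hypothetical; real certificates carry far more fields.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by the toy keys


@dataclass(frozen=True)
class KeyPair:
    n: int  # public modulus (published)
    d: int  # private exponent (kept secret by the owner)


def make_keypair(p: int, q: int) -> KeyPair:
    return KeyPair(n=p * q, d=pow(E, -1, (p - 1) * (q - 1)))


def h(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, key: KeyPair) -> int:
    return pow(h(data, key.n), key.d, key.n)


def verify(data: bytes, sig: int, n: int) -> bool:
    return pow(sig, E, n) == h(data, n)


@dataclass(frozen=True)
class Certificate:
    subject: str       # name the CA vouches for
    subject_n: int     # the subject's public modulus
    ca_signature: int  # CA's signature over subject and key


def cert_body(subject: str, subject_n: int) -> bytes:
    return f"{subject}|{subject_n}".encode()


def issue_certificate(ca: KeyPair, subject: str, subject_n: int) -> Certificate:
    """CA side: bind a name to a public key by signing both."""
    return Certificate(subject, subject_n, sign(cert_body(subject, subject_n), ca))


def verify_download(software: bytes, software_sig: int,
                    cert: Certificate, ca_n: int) -> bool:
    """Downloader side: check the certificate first, then the software."""
    if not verify(cert_body(cert.subject, cert.subject_n), cert.ca_signature, ca_n):
        return False  # certificate was not issued by this CA
    return verify(software, software_sig, cert.subject_n)


ca = make_keypair(2**107 - 1, 2**61 - 1)
author = make_keypair(2**127 - 1, 2**89 - 1)
cert = issue_certificate(ca, "Alice Author", author.n)
package = b"contents of the software package"
package_sig = sign(package, author)
print(verify_download(package, package_sig, cert, ca.n))
```

The downloader needs only three things it did not create itself: the software, the signature, and the certificate. The single value it must obtain from a trusted place is the CA's public key.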

Figure 15.1. Digital signatures and how they are verified

This verification process provides assurance that the downloaded software was signed by someone who holds the private key matching a certificate issued to a software author with a particular name. Of course, there is no guarantee that the software author did not let someone else use her key, or that the key was not stolen without her knowledge. Furthermore, if we don't know anything about the reputation of this particular software author, knowing her name may not give us any confidence in her software (although if we have confidence in the CA, we may at least believe that it might be possible to track her down later should her software prove destructive).

The previously described verification process is not performed by hand. A number of software products are available that automate the task.

Pretty Good Privacy (PGP) is a well-known tool for encrypting files and email. It also allows individuals to sign and verify files. Rather than having to trust a third party, the CA, PGP allows individuals to create their own certificates. These certificates by themselves are not very helpful when trying to verify someone's identity; however, other people can sign the certificates. People who know you can sign your certificate, and you in turn can sign their certificates. If you receive a certificate from someone you don't trust, you can check the signatures on the certificate and see if you trust any of them. Based on this information you can decide if you wish to trust the certificate. This is a trust system based on intermediaries, and it forms what is called the "web of trust." The web of trust can be thought of as a peer-to-peer certification system. No centralized certifying authority is needed. A free version of PGP is available for download at http://web.mit.edu/network/pgp.html.
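The idea of checking a certificate's signers against people you already trust can be sketched as a small graph search. This is a deliberately simplified model and not PGP's actual trust computation: the is_trusted function, the max_hops limit, and the names in the example are all hypothetical, and signatures are assumed to have been cryptographically verified already.

```python
from collections import deque


def is_trusted(target, signatures, trusted, max_hops=2):
    """Decide whether to accept target's certificate.

    signatures maps a certificate owner to the set of people who signed
    that owner's certificate. trusted is the set of people whose
    certificates we already accept. The search walks from the target to
    its signers, then to their signers, for at most max_hops steps.
    """
    frontier = deque([(target, 0)])
    seen = {target}
    while frontier:
        owner, hops = frontier.popleft()
        if owner in trusted:
            return True
        if hops == max_hops:
            continue  # do not follow chains longer than max_hops
        for signer in signatures.get(owner, set()):
            if signer not in seen:
                seen.add(signer)
                frontier.append((signer, hops + 1))
    return False


signatures = {
    "dave": {"carol"},   # carol signed dave's certificate
    "carol": {"bob"},    # bob signed carol's certificate
    "mallory": {"eve"},  # eve signed mallory's certificate
}
trusted = {"bob"}

print(is_trusted("dave", signatures, trusted))     # reachable via carol, bob
print(is_trusted("mallory", signatures, trusted))  # no trusted signer found
```

Limiting the chain length reflects the intuition that trust weakens with each intermediary; how much it weakens is a policy choice left to the individual.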

Unfortunately, digital certificates and signatures don't solve all of our problems. Not all software packages are signed. An author's private key can become compromised, allowing others to sign any piece of software with the compromised key. Just because software is signed doesn't mean that it is free of malicious intent. So one must still be vigilant when it comes to downloaded software.

15.2.5 Open source software

Much of the software available for download is available as source code, which needs to be compiled or interpreted in order to run on a specific computer. This means that one can examine the source code for any malicious intent. However, this is really practical only for rather small programs. This naturally leads to the question of whether it is possible to write a program that examines the source code of another program to determine if that program really does what it claims to do. Unfortunately, computer scientists have shown that, in general, it is impossible to determine if a program does what it claims to do.^[2] However, we can build programs to monitor or constrain the behavior of other programs.

[2] The "Halting Problem," which is undecidable, can be reduced to the problem of deciding certain properties of programs. See, for example, Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing Company.

15.2.6 Sandboxing and wrappers

Programs that place limits on the behavior of other programs existed before the Internet. The most obvious example of this type of program is an operating system such as Unix or Windows NT. Such an operating system, for example, won't allow you to delete a file owned by someone else or read a file owned by another user unless that user has granted you permission. Today, programs exist that can constrain the behavior of programs you download while surfing the Web. When a web page contains a Java applet, that applet is downloaded and interpreted by another program running on your computer. This interpreter prevents the applet from performing operations that could possibly damage your computer, such as deleting files. The process of limiting the type of operations a program can perform is called sandboxing. The applet or other suspicious program is allowed to execute only in a small sandbox. Thus the risk of damage is reduced substantially. Programs called wrappers allow the behavior of CGI scripts to be constrained in a similar manner.
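The wrapper idea can be illustrated with a small sketch in which every file operation goes through an object that refuses to touch anything outside one directory. The FileSandbox class and SandboxViolation exception are hypothetical names. This is a cooperative wrapper shown only to convey the concept: code running in the same Python process could bypass it, whereas a real sandbox is enforced by the interpreter or the operating system.

```python
from pathlib import Path


class SandboxViolation(Exception):
    """Raised when an operation would reach outside the sandbox."""


class FileSandbox:
    """Mediates file operations, allowing them only under one directory."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _check(self, path) -> Path:
        # Resolve "..", symlinks, and absolute paths before deciding.
        candidate = (self.root / path).resolve()
        if candidate != self.root and self.root not in candidate.parents:
            raise SandboxViolation(f"{candidate} is outside {self.root}")
        return candidate

    def write_text(self, path, text: str) -> None:
        self._check(path).write_text(text)

    def delete(self, path) -> None:
        self._check(path).unlink()
```

A program given only a FileSandbox("/some/scratch/dir") object could create and delete files under that directory, while a call such as delete("../important.txt") would raise SandboxViolation before any file is touched.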

From the document Peer to Peer: Harnessing the Power of Disruptive Technologies (pages 159-162)