Symmetric Algorithms Come in Different Flavors

Although symmetric algorithms all use one key to encrypt data and the same key to decrypt data, that doesn’t mean they all work the same. There are quite a few flavors of symmetric algorithms: IDEA, Twofish, DES, 3DES, and AES just to name a few. If all that looks like alphabet soup to you now, you’ll be eating it up by the end of this chapter when you realize that it’s not all that difficult. I go through the commonly used symmetric algorithms and explain their similarities and differences.

Making a hash of it

Not all algorithms are meant to be decrypted. Huh? That’s right — some algorithms are used to encrypt data, but not to decrypt them. Such is the case with a message digest, which is also known as a hash. I’m not sure if “hash” is a nickname the algorithm got because it “hashes up” the message, but it certainly seems logical.

The most commonly used hashes are SHA-1 (pronounced shaw-one) and MD5. Hashes are not truly symmetric algorithms because the encryption works only one way. The end result — the encrypted data — is never meant to be decrypted. Instead, the end result is used sort of like a unique serial number. I include hashes in this section because the output is like a symmetric key.

Hashes take a message and pad it a bit by adding some extra data to the message. The hash then encrypts the message and uses a finite number of bytes from the encrypted portion to be used as a snapshot or fingerprint of the data. Hashes are used to prove that the data that has been transmitted is same as the original data and that nothing has been changed en route. How can it do that? Well, every time you use a hash algorithm on the same data, you’ll get exactly the same result. On the other hand, if the data has been changed, even by one letter or a single space, the hash will change.

Figures 2-1 and 2-2 are two examples of what a hash looks like. I used a hash calculator (of which there are many available online) to compute two different text strings. The context of the text strings is exactly the same; the only thing that changes between the first figure and the second is the capitalization of one word: hash. Note that the hashes are completely different, even though the sentences are essentially the same.

Figure 2-1: First example of a hash.

Figure 2-2: Second example of a hash.

Hashes are useful in ensuring that the software you download from the Web is the same software that the vendor released. It has not been uncommon for hackers to get the source code of software, insert a back door or a Trojan program, and then place it back on the Web for downloading. If you were to compute the hash for the altered software, it would not match the hash that the vendor made from the original software. In this way, hashes are used as

protection mechanisms.

I cover hashes in more detail in Chapter 9. They are something you may want to consider for checking the integrity of data kept in storage and messages that can be used as evidence in court.

Defining blocks and streams

At first this may seem like a non-sequitur if you are just browsing the chapter, but block and stream ciphers are important subsets of symmetric algorithms. I’m not talking about walking around the block or paddling down a stream;

I’m talking about the mechanisms of how the symmetric algorithms go about encrypting the data.

Block ciphers take exact chunks of data, encrypt them with the key table, and then take the next chunk, and so on.

You can think of it as being a digital bucket brigade. In this type of brigade, however, every bucket must contain exactly the same amount of data. If a bucket is short and doesn’t have the correct amount of data, the algorithm drops a bunch of bits in the bucket to even out the amount. The algorithm knows what was used to fill up that bucket, so it can throw those bits away when the decryption process begins.

There is a weakness in block ciphers, though. If two different chunks of data contain the exact same data, the ciphertext could be exactly the same. It’s entirely possible that a statement or string of characters are repeated throughout a document, (like the name of a company or a product name) so block cipher algorithms had to be changed to fix this problem. They do that by starting with an initialization vector (IV). And what exactly is an initialization vector? It’s more random stuff!

In one sense, an IV is similar to the seed data that is added to a PRNG. However, not every algorithm or encryption product uses the same method for creating an IV. Some systems can take random input from the computer’s memory buffer and add that to the chunk of data to be encrypted. There have been some products that have been found to have flawed IVs; that is, they used the same string of data all the time instead of takings something random from the computer. To make you feel more at ease, I can tell you that this flaw is usually quickly exposed and the method of creating an IV is changed.

But we’re not finished yet with the encryption process for a block cipher. In order to further obfuscate the ciphertext, the block ciphers currently in use create multiple loops of encryption called cipher block chaining (CBC). In a nutshell, here’s how the whole thing works:

The algorithm creates some random data called an initialization vector (IV).

This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks.

The IV is XOR’d with the first chunk of data. (XOR-ing is a bit-by-bit comparison that is explained in Chapter 1.)

The XOR’d data is encrypted with an entry from the key table.

The encrypted data is XOR’d with the next chunk of data.

The XOR’d data is encrypted with an entry from the key table.

Repeat Steps 4 and 5 over and over until the entire data file has been encrypted.

And there you have a block cipher that includes an IV and CBC. You’re already beginning to sound like a crypto-geek!

Now we go on to stream ciphers. If you go back to my original analogy of a bucket brigade, you can think of stream ciphers as the full rush of water coming out of the fire hose. The difference is that instead of encrypting each bucket as it comes through, you encrypt each drop of water as it comes out of the hose. It may seem impossible, but modern computers have little trouble handling this type of speed.

Here is a short example of how a stream cipher does its tricks:

Generate a key to create a key stream (a very long key of random data that is at least as long, or longer, than the plaintext).

Grab one byte of plaintext and grab one byte of key stream.

Encrypt the plaintext with the key stream to create the ciphertext.

Start over from Step 2 and continue through Step 3 until all the data is encrypted.

As you can see, a stream cipher is quite a bit different from a block cipher. In one sense, the key stream can be considered the same as a one-time pad (explained in Chapter 1). As long as the pad (the key stream) is used only once, the encrypted data is secure. However, if the plaintext is longer than the key stream, then the algorithm will either have to create a new key stream or use the existing key stream again. After that key stream is repeated, the encryption becomes much weaker. Reusing the key stream makes it easier for an attacker to discover the key and may ultimately crack the encrypted data.

So is one type of cipher better than the other? I cover that next.

Which is better: Block or stream?

The simple answer to this question depends on what you need the encryption for. Stream ciphers are very simple to program and they process very quickly. The most commonly used stream cipher is RC4, which is used in SSL (Secure Sockets Layer in secure Web transactions). To date, that is the only stream cipher that has become a de facto standard.

I mention the possibility of a stream cipher having to use its key stream more than one time. This is a weakness and therefore you shouldn’t use a stream cipher if you have a lot of data to encrypt; that is, unless you are willing to have the process interrupted to re-key every time the algorithm reaches the end of a key stream. This takes time and processing power, too. The only system I’m aware of that has overcome this problem is the secure telephone unit used by the government and the military for secret communications. This particular phone is called a STU-III (Secure Telephone Unit #3). It uses a stream cipher because it is easier to change a conversation into a continuous digital stream and mix the digital conversation with a stream cipher. Because there is no way of telling how long a conversation might last, the STU-III constantly remixes the stream cipher as it encrypts the voice transmissions.

Listening to the encrypted conversations on these systems is a hoot because it makes everyone sound like Donald Duck on helium speaking some alien language.

Block ciphers are slower to process but more block ciphers have become standards than have stream ciphers. Take almost any encryption program available and you’ll discover that they are set to accept DES, 3DES, and AES. That’s because those ciphers are accepted standards. If you are concerned about interoperability with other encryption programs, you’re better off using block ciphers.

If you need to reuse keys, a block cipher is better. That’s because the key table that is created by the algorithm can

create a huge number of keys to use. There is very little likelihood that you would get the same bunch of random numbers to encrypt your data.

Neither cipher type is really “better” than the other; it’s more a question of meeting your encryption requirements.

This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks.

Dans le document How to Use This Book (Page 39-43)