• Aucun résultat trouvé

The Hardware Implementation of Private-key Block - Ciphers

N/A
N/A
Protected

Academic year: 2022

Partager "The Hardware Implementation of Private-key Block - Ciphers"

Copied!
168
0
0

Texte intégral

(1)
(2)
(3)
(4)
(5)

The Hardware Implementation of Private-key Block - Ciphers

by

@M oh sin Riaz

Athes issubmittedto the School ofGradua te Studies in part ialfulfillmen t of the requirement sfor

the degr ee ofMasterof Engineering

Facultyof Engineering and Applied Science MemorialUniversityof Newfoundland

July,1999

St.John's Newfoundl an d Canada

(6)

Dedica ti on

Tomymom.Qa.iser,whose selflesslove andtendern ess for me always mak e mestrivefor bett erinlife.

(7)

Abstract

The Natio nal Instituteof Standards andTechnology (NIST)intheU.S.has initiate d a processto developa Federal InfonnationProcessing Stan d ard(FI PS) for an Advanced Encryption Stand ar d(AES)[t],to becomethe standardforpriva te-keyblock encry p tion.

TheDewencryp tion algorithm will bebased ona128-bit block size andthe keysize can be 128, 192,or256hits. AES will be a replacement for the DataEncryption Standard(DES)[21 whichisbasedona64-bit blocksize andbasa5&.bitkey.Inthisregard, theagencyhas acce ptedcan dida te algori thmnominations for AES.

Oneof the important evaluation criteria concern s theefficiency of theprivate- key block cipherfromthe hardwareimplem enta tio n perspective.Re6{3]and CAST- 256 [41are amon g the fifteencan d ida te algorithms that have been acceptedinthefirstround ofthe AES developmentphase. Thisthesis investigatesthe efficiencyof thesetwoAEScandidatesfrom the hardware implementationperspec tivewith FieldProgrammableGat eArrays (F PG As) as the targettechnology.

Ouranalysis andsynt hesis studiesof boththe ciphers suggestit wouldbedesirable for FPGAim plement at ionsto haveasim pler cipherdesign thatmakesuseof simpleroperati ons that not onlv possessgood cryptographic properties ,butalso make theoverallcipherdesign efficientfromthe hardwareim plementat ionperspective.A2,a result,thethesis also proposes a newprivate-keyblock cipher designth at, not only is very efficient as far asitsimplemen- tati oninFPGAsis concerned,but at thesame time is secure against thetwo mostpotent attacksthat havebeen appliedto block ciphers,namely,differential and linear cryptanalysis.

(8)

Acknowledgements

Iwish to express my gratitude to my mentor and supervisor,Dr.HowardHeysforhis valuable time,guidance, and financialsupport throughout thecourse of my research.His constantencou ragementesp eciallyduring hardtim es is not only appreciated,butwill always beremembered .

A spec ialthankyou to my familyand friends,who had always beenthere for me. To Balaand Sidhu-bett erfriendsIcould notask for-aheart-feltappreciat ionforreminding me what else there isto life.

Ialsoappreciate the timely supportprovided to me byMr.MichaelRendellwhileI was ba t tlingalongwiththeCADtools.

iii

(9)

Contents

Abstract

Ackno w ledge me nts

TableofContents Listof Figures List of Tables

Symbols andAbbreviations

1 Introduction

1.1 Moti vation fortheResearch 1.2 Outlineof Thesis .. . . 2 Review-ofPreviowResearch

2.1 PrivateKeyBlock Ciph ers. . . . .. . . • . . • 2.1.1 Acchitectures

2.1.2 PopularPrivate-keyBlockCiphers

tv

ii

iii

viii

xi xii

12

(10)

2.1.3 AdvancedEncryption Standard (AES) 2.2 Cryptographic Properties ofaBlock Cipher

2.2.1 Nonlinearity... 2.2.2 Avala.nche. . 2.2.3 Completeness .. 2.2.4 Strict Avala.nche Criterion 2.2.5 Information Theory. 2.2.6 Invertibility. . .

2.3 Cryptan alysisofPriva te-ke yBlock:Ciphers.. 2.3 .1 BruteforceAtta.clc

2.3 .2 Differential Cry p t analysis 2.3 .3 LinearCryp tan al ys is.

14 15

I.

17 17 17 18

I

18

.

.. . . .. . ... 19

2 .

21

2.3.4 Timi ngAttack 22

2.4 Security ofRe 6andCAST~256Encryp tion Algorithms. 22

2.5 Conclusion. . _.. ._ . . .. 24

2.

.•• • • •.•.. 29 30 31

.._... ... . 32

33 26 .. . . .. . •• 27

3.3.1 XilinxXC4000St ruc t ure.. 3.3.1.1 XC4000Struct ure 3.2 Field ProgrammableGate Arrays(F P GAs). .

3.2.1 Advantagesof FPGAs overMPGAs.. 3.2.2 Disad van tages ofFPGAs over MPGAs 3.3 SRAM-based FPGAs. .

3 HardwareEnvir onmentsfor CryptographicApplica t ions 3.1 Hardware EncryptionV5.SoftwareEncryp tion. . .

(11)

3.3.1.2 Programm ingTechno logy 3.3.1.3 Inter conn ections 3.3.1.4 XilinxLogic. 3.4 CryptographicAlgorithms:FPG Asvs.ASICs 3.5 Conclusion.

4 DesignofRC6 andCAST-256 4.1 The RC6 Cipher 4.2 The CAST-256 Cipher 4.3 Hardwar e Development Environme nt 4.4 Design ofRCG.

4.4.1 RC6Datapa th

4.4.1.1 Designof32~bitBarrelShifte r.

4.4.1.2 Designof 32-bitAdd er.

4.4.1.3 Designof32~ bitXOR

4.4.1.4 Design of32 x 32 "Partial" IntegerMulti plier. 4.4.2 RC6 ControlPathDesign

4.4.2.1 RC6 GlobalStateMachine 4.4.2.2 RC6Da t aFlow Controller.

4.4.3 Key StorageforRC6. 4.4.4 Simula tion and Synthesis Resu lts 4.5 Designof CAST-256

4.5.1 CAST~2 56Datapath 4.5.1.1 Generi cRoundFunct ion.

vi

34 35 37 40 41

42 42 44 46 47 49 50 51 54 54 59 59 62 63 63 66 67 67

(12)

4.5.1.2 The S-BoxDesign 4.5.2 CAST-256ControlPathDesign .

4.5.2.1 CAST-256GlobalStateMachine 4.5.2.2 CAST-256 DataFlow Controller 4.5.3 The Key Storage Unit

4.5.4 Simulation andSynthesisResults 4.6 Comparison ofRC6 and CAST-256Ciphers

4.6.1 Some RecentModlflcatious . 4.7 Conclusion.

.5 A NewPrivate-keyBlockCipher Design 5.1 TheProposed Cipher.

5.2 FP G A ImplementationofthePropo sed Cipher. 5.2.1 Dat apath

5.2.2 ControlPathDesign 5.2.3 The Key StorageUnit 5.2.4 Simulationand Sy'uthesis Results 5.3 Security Analysis

5.3.1 SelectingNonlinearRoundFunctions . 5.3.2 Linear Cryptanalysis.. 5.3.3 Differential Cryptan alys is 5.4 Conclusion..

6 Concl us io ns and Future Work 6.1 Summaryofthe Thesis. ..

vii

68 69 69 70 71 n 75 76

n 7' 7.

82 82 84 84 85 87 87 88 95 96

.7

98

(13)

6.2 Suggesti onsfor FutureWork . Bibliography

A A VHDLDescription ofRC6 GlobalStateMachine B Gate-Level Simulation of RC6 Cipher C Gate-LevelSimulation ofCAST-256Cip her D Gate-Level Simulation ofFast HardwareCipher (FHC)

viii

100

101

109

116

12 7

13 6

(14)

List of Figures

2.1 A General Cryptosystem.. . 2.2 Private andPu blicKey Cryptosystems 2.3 ABasic Substitution-PermutationNetwork••. . 2.4 A BasicFeist el Structure.

3.1 A SimpleFPGA Taxonomy 3.2 AXilinx XC4QOOstruct ure. 3.3 Pass TransistorControlTechnique. 3.4 MultiplexerContro l Technique. ... 3.5 A Lookup Tab leImplementation 3.6 AXilinxXC4000Switch Matrix. 3.7 Simplified LogicSchematicof aXilinxCLB 3.8 SimplifiedBlockDiagramofXilinxlOB 4.1 EncryptionwithRC6-w/r/b 4.2 Encryptio nwithCAST-256

4.3 Realizati on ofRC6Encryp tion in Hardware 4.4 Funct ionalRepresent at ion of RCGDatapath

ix

10 12

28 33 35 35 36 37 .__ .. .. .. . 38 39

...• •. .•• 44 46 48 50

(15)

4.5 Configurable Logic BlockSchematic of the 32-bitCarry Ripple Adder. 52 4.6 A High LevelOrgani za tion of Wallace Tree Multiplier._. ... ..' . 58

4.7 AComponent Interfaceof RC6Encryp tor 60

4.8 ACompo nent Interface Representationof RC6 GlobalSta teMachin e 62

4.9 CAST -2OOEncryptioninHard ware .. 66

4.10The GenericRoundFUnction Module. .

4.11CAST-2OO State Machine Unit_ _

4.12CAST-2OO Data FlowController.

...__._ . ... .. 68 70 71

5.1 Encryption withFastHard ware Cipher(FHC).

si

5.2 Realiz ati onofFastHardware CipherinHard ware . . . .

-

.

. . . .

.3

n.i

Ga te- LevelSimulation of RC6 CipherDesign. 11.

0.2 Gate- LevelSimulationof RC6CipherDesign Cont'd 11.

D.3 Gate-LevelSimula tionof RC6 Cipher DesignCont'd 120 D.4 Gate-LevelSimulationof RC6 Cipher DesignCont'd 121 8.5 Gate-Level SimulationofRC6 Cipher DesignCont 'd 122

D..

Gate-LevelSimulation ofRC6 Cipher Design Cont'd 123 0.7 Gate-LevelSimulation ofRC6 CipherDesignCont'd 124 B.• Gate-Level Simulation of RC6 CipherDesignCont'd 125

D..

Gate-LevelSimulationofRC6 Cipher DesignCont' d 12.

c.i Gate-LevelSimulatio nofCAST -2OOCipherDesign 12.

C.2 Gat e- Level Simulat ion of CAST- 256CipherDesign Cont'd

.. . . . . . . . .

12

s

C.3 Gat e-LevelSimula tio nof CAST-2OOCiphe r Design Cont' d 130

(16)

C.4 Gate-Level Simulation ofCAST-256 Cipher Design Cont' d C.5Gate- LevelSimula ti onof CAST-256Cipher Design Cont'd C.OGate-LevelSimulationof CAST-256 CipherDesign Cont'd C.7Gate-Level SimulationofCAST-256Ciphe rDesignCont'd C.SGate-LevelSimulationof CAST-256 Cipher Design Cont'd D.l Gate-LevelSimulationof FIIC Design .

0.2 Gate-LevelSimulationofFIICDesign Cont'd 0.3 Gate-LevelSimulationofFIICDesign Cont'd 0.4 Gate-Level Simulationof FHCDesign Cont'd 0.5Gate-LevelSimulationofFHC Design Cont'd D.6Gate-LevelSimulation of FIICDesign Cont'd D.7 Gate-LevelSimulationof FHC DesignCont'd 0.8 Gate-Level Simulationof FIIC Design Cont'd

131 132 133 134 135

137 138 139 140 141 142 143 144

(17)

Li st of Tables

5.1 Probabilities of select ingknonlinearS-boxes_ 5.2 LinearCryptanalysisResultsfor Diffe rent values ofNL..

xli

88 92

(18)

Lis t of Symbols and Abbreviations

M Original plaintextmessage C Encrypted ciphertextmessage

Round function

Totalnumbe rofrounds of enc.-yption Numberof inputs to an S-hox Numberof outputsofan 5-booc:

Number of Scboxesina round function Kr; S-hit roundsubkey fortheithr-ound K"" 32-bitmaskingkey fortheifhcound N(f) Nonlineari tyofanrn-bitbooleanfunct ion N(S) Nonlinearity ofan S-box

K Prim ary key of encryption in bytes NL Nonlinearityof S'box function

Nt Totalnumber ofknown plaintecxtsneed ed for linear cryptana lysisto be successful Ne Totalnumberofchosenplainte:xtsneeded fordifferentialcryptanalysis to be successf AES Advanced EncryptionStandarcl

DES Data Encryption Standard

NIST NationalInstitu teof Standards.andTechn ology FIPS Federal lnformation Processing Standard FPGA FieldProgrammableGa teArra,ars MPGA Mask Program mableGat e Arresy VPN Virtual Privat e Netwo rk

xiii

(19)

LAN LocalArea Network WAN WideArea Network NBS National Bureau of Standards SPN Substitution-Permu tati onNetwork SAC Strict Avalanche Criterion XOR Exclusive-OR LUT LookupTable SR.Ai.\1 StaticRA.\f

ASIC Applica tion SpecificIntegratedCirc uits CLB Configurable LogicBlock rO B Input/ Out putBlock FHC Fas tHardwar e Cip her

CMC CanadianMicroelectronicscorporation CLA CarryLookahead Adder BCLA Block Carry LookaheadAdder CRA. Carry Ripple Adder CSEA Carry SelectAdder GSA Carry Save Adder CPA CarryPropaga te Add er

T""" Tot aldelayforthe 32-bit carrypropagat eadder

Ten<: Time toachieve oneencryption

Td'k Oneclockperiod in nanosecond s One logiclevel delayinnanoseconds

xiv

(20)

po. Probabilityofbestr-ro ua d iterative characteristic Pi Probability of the outpu tXOR, given the inputXORinrou ndi piJ Probability thatthelinear approximationholds true PI;n Probability of one linear or affine function

PI: Probabilitythat k outofa total ofeightS-boxes are nonlinear PI Probability ofthebes tlinear expression for an r-roundcipher

(21)

Chapter 1 Introduction

Civi lizationisthe progr esstowarda society of privacy.The savage 's whole exis- tence is public, ruled bythe laws of his tribe.Civilizationistheprocessof setting man free frommen. Ayn Rand . The Founta inh ead (19';3)

In therecentyears, there has been a great need formuch improved techniques ofsecurely tran smit ti ng andsto ring information.Fromelectronicmailto cellular com municat ions, se- curewebaccess tosmart cards and electroniccommerce, wireless LANand WANcomputer networks tovir t ualpriva te networks (VPNs) -these and othernew information- based ap- plicationswillhave far reachingconsequences,affectingthe waybusiness is doneas well as private communication andsocialinteraction.Asthishappens, security aspectsofcom- municationsystems are of growing commercial and publicinterest . Unfortunately, these aspects have been widely underestimated or ignored inthe past.Today,however,there is highdemand forexpertise and high-quality produ ctsin the field of informationsecurity and cryptogr aphy.

Untilrecently, encryption products weregenerallyinthe fonn of specializedhardware.

(22)

These encryption/decryption devicesplugged into the communicati ons lineandencrypted allthe data going acrossthelin e.Although,software encryp t ion is becomingmore preva- lent today, hardware is still the embod iment of choiceformany military and commerci al applications.As an ind ustry trend , manycompaniesin NorthAmerica as wellas Europe are developing cryptographic hardware for applications suchas securevoice, fax:and data networks,secureVPNcryptographic acceler ators, protoco lsensitiveencryptionfor wide area networks,andDSP voice cipheri ng .

Speed and security are also importantissuesthat playin thefavourof the hardware impleme ntation of encryption devices.Encryp tionalgorit hm s involvemany complex oper- ationson themessageor plaintext bits . Oftenthese areDotthetype of operatio nsthat areincorporatedinto ourtypicaldeskto pcomputers.Themost widelyaccepted priva te- key block cipher,theDataEncryptionStandar d(DES)[2], intro du cedin1977,runs inefficiently on generalpurposeprocessors. Altho ugh,somecryptographershavetriedto shape their algorithms tosuitsoftwareim plemen tations,specialized hard war e suchas an encryption chipwilllikelyemerge as thewinnerinefficiency. Anotherkey factor that favoursthe hard- ware implementationofa blockcipherissecurity. An encryptionalgorit hm beingrun on a genera lizedcomputing machinehasno physical protection.On the otherhand,hardware encryption devicescanbe securelyencapsul atedtoprevent this. Otherfactors that suggest ahard wareimplement ationinclude cost, ease ofinstallation , andlower power consumption.

The National Institute ofStan dar dsandTechnology (NIST)inthe U.S.has initiated a process to develop a FederalInform at ionProcessin g Standard(FIP S) foran Advanced Encryption Standard(AES) [IJ.Thenew encryptionstanda rd isbased on a 128-bitblock sizeanda 128,192, or256-bi t key size.Thisst an dar d will beareplacementforDES.This

(23)

thesisexamin es the hardwareimplemen tationof two private-keyblockciphers,RC6 [3] and CAST-256[4},in FieldProgrammableGa te Arrays (F PGAs).BothRC6and CAST-25f)are among thefifteen candidate algorithmsthat havebeen acceptedinthe first round of AES developmentphase.The thesisalsoproposes a simpler private-keyblock cipher design that isvery efficientintermsofhard ware implementationinFPGAs.

1. 1 Motivation for t he R e s earch

TheDataEnc ryp tion Standard (DES)[2], apriva te-key blockciph er,is the mostwidely usedcryptosysreminthe world.DES was developedbyIBM,asamodificationofan ear lier cryp tosyst em knownasLUCIFER [5). DESwas first publishedinthe FederalRegist er in 1975. DESwasadoptedasastan da rd for"unclassified" applica tio ns in1977 by theNa t ion al Bureau of Standards(NBS).

DES hasbeenatarget ofcriticis m sinceits inceptionin1977.OneobjectionofDES concerns themyst e rysurro undi ng the desi gnofitss-bcses,whichbeing the onlyaoalia- earcomponentof the cryptosystem,isvitalto its security.However,the mostpertinent criticism ofDES is thatthe size of thekey, 56 bits,istoosmall to be really secure.After twenty two years,DESisnearing its demise and is theoretically breakablebytwo powerful cryptanaJytical attacksofdifferential and linearcryptanalysis[6,7J.

The National Instituteof Standarrlsand Techn ology(NIST)has initiated a precess to develop a Federal lnfonn ati on Processing Standard(F IPS) foran AdvancedEncryption Standar d(AES)[IJspecifying an encryption algorit hm forthe twen ty-firstcenturyas a replacem ent of DES.Inthis regard , theage ncy hasannouncedarequest forcan dida tealgo- rithm nominations ofAES.Oneofthe importantevaluationcriteria concernstheefficiency

(24)

of the private-keyblock cipher fromthe hardware implemen ta tion perspective.RC6[3]and CAST -256[41areamong thefifteen candida tealgorit hmsthat have beenpresentedtothe first roundoftheAESdevelopm en t phase. Both ciphers aremodifica tionsofearlier gener- ationciphers(Res[81andCAST- 128[9])basedonsm all er(64-bit) block sizes. Likemost proposed private-key blockciphers,RC6andCAST-256 areclear ly designedforefficien t implementationinsoftware.

Thisthesis discusses theissuesthat effect thehardwareimpleme nt atio nofthe twoAES candidates,RC6 and CAST-256,inFPG As. The twomajor aspectsof speedand hardware complexi ty associated withthe two ciphersare explored and acom par at iveanalysisof the twociphersin terms ofimplementa tioninFPG .J\s is prese nted.

Astheresul tofourst udyof thesetwociphers,wealsopro pose anewprivate- keyblock cipher,specificallytargetedforhard ware implement at ion.Thiscipheris based onsim pler operations that notonlypossessgoodcryptographic properti es,but also make theoverall cipher design efficien t for imp lemen ta tionincusto marchitectures asFPGAs.

1.2 Outline of Thesi s

The thesis is organized as follows:

•Chapter 2 presentsaliterature surveyofthe previousresearchthatisrelevantto ourwork.

•Chap ter3 examines thedifferentissues pertaining tohardwareimplementationof cryptc- graphic algorithmsinFPGAs.

•Chapter 4 examines thedesign ofRC6andCAST-256encry ptionsandtheir implementat- ion intar getFPGAdevices.

(25)

•Chapter5 presents the design of a new private-key block cipher based on simpler crypto- graphic operations and its implementationinFP GAs. The security of the proposed cipher against linear and differential cryptanalysis is also examinedinthis chapter.

•Chapter6 summarizes the results of the thesisand presentscertain suggestionsfor future work.

(26)

Chapter 2

Review of Previous Research

Security ofinformation stems from the needfor private transmission ofboth militaryand pub lic messages.This need is as oldas civilizationitself. Theancient Spartans,forin- stance,encip heredtheirmilitarymessages. The first secure communicationchannels were very simpleand theirreliabilitydepended on the physicalsecurity of messengers. Due to theinventionof computer systemsandthe pervasiveintrusionof computer networks, the spectrumof protectionissues basbeen stretched. :Manyprotectionissues of modern day com- putersystems and networks arestrictlyrelatedto the protectionof communicationchannels.

Due to the natural characteris ticsofany channel, we have a communicationmedium that is accessibleto eavesdroppers,so physical securityis meaningless. The onlywayto enforce security in communication channelsis by theapplicationofcryptography.

The term cryptologyoriginatesfrom Greekroots meaning" hidden"and''word'' and is the umbrella tennusedto describe the entirefield of secretcommunications.Cryptology furtherbran ches int otwo:cryptographyandcryptan alysis.Cryptographyis theartand SCi4 ence of transforming inform atio n intoanintermediateformwhichsecuresthat information

(27)

whileinstorage orintransit,AJ:,opposedto steganogrophy, whichseeks to hide theexis- tence ofanymessa ge, cryptogra physeeks torend er a message unintel ligi ble even when the message is completelyexposed. Cryptanalysis,on theother handis the aspectof cryptology which concernsthe strength analys isofa cryptogra phicsystemor cryptosystem,and the penetrationorbreaking ofacryptosyste m .

Acrypt osyst em isany system that employs methodsof cryptographyto encrypta mes- sage. EnC11/ptionis a process thattransformstheoriginalmessag e orinformationreferred to as the plaintextinto anencrypted message knownas theciphertext. This ciphertextisthen transmitted overan insecure channel.when thisciphertextreaches the receiver,a reverse transformationprocess,referred toas decryption,isperform ed to recover the originalplain- text from thecorr espo ndingciphertext. This encryption/decryptionschemeis also referred to as a cipher[10].Figu re2.1 showstheencryp tion/decry p t ion process in the contextofan insecurecommunica tions channel.A blockcipheris a function thatmaps N·bitplaintext blocks to N-bitciphertextblocks, whe reNis the block length,whichis 64bitsinthe case of DES and128 bits in thecase ofAES.

In1948 Shannon[11]proposed two principles thatpresenta soundtheoreticalbasis for cryptosyste ms with goodsecurity, namelyconfusionanddijjwion.Confusionemploys substitutionto hidetheplaintextandthe key.Diffusionspr eadsthe confusion effectacross the enti reciphert ext,thereby masking any statist icalproper ties ofthe plaintext.

The fieldof cryptography is dividedintotwo mainbranches :private-key cryptography and public-keycryptograp hy.The two types of cryptosystemsare shownin Figure2.2. In private- keycryptosystems,thesame secre t key isusedboth for encryp tion anddecryption . Assuming the algorit hm. to he secure enough, the securityof thecryptosyste m isbased on

(28)

Insecurecommunication channel

Pla lIlte l l(P j- ~ Key(K)

_ ._-

EHCRYPTlON

e,

Key(K)

-_

DECRYPTlO N

. _-

D.

Figu re 2.1:AGeneral Cryptosys tem

keeping the key secret. Incontras t ,every userina public-keycryptosysternpossessestwo keys. One keyispublic andisknown to everyone. Theother isprivateand isonlyknown to the personpossessingitand no one else!Public-keycryptographyhas the advantagethat a securech an nelisnot required to exchangekeys .However,itsdisad vantage is thatitis orders of magnit ude slowerto encrypt as compared to private-keycryptosystems.:\t06lcryp- tesystems use a combination of public and private key cryptography. In a typical scenario, apublickey scheme isfirstused toexchange the secret key thatis then used for encry pting ordecrypt ing the messagesusinga private key encry ptionalgorit hm.

2.1 Private Key Block Ciphers

The securityofaprivatekey blockcipherdepen dson the comm unicat ingparties shari ng the same commonsecretkey.If thiskeyis compromised,then the encryptedmessageswill beeasilydecrypted using the known key.The cipheriscalled ablock cipherbecausethe

(29)

/ _ Vcm ti.t,

r-:=~.=-L r:::::~:=L

-I' la ln tn t ~~:~C1Phtl1nl-"'~-I'lalnlnt ..

c~.~ r-' -

"~ CIPhtrtnl

- j

Decryption, Plaint""....

P1lbUc-kt, Crypl.Oooy-.t.,..

Figure 2.2: Privateand Public Key Cryptcsystems

plaintext isbrokenintofixed lengthblocksbefor e beingencrypted.

2.1.1 Architectures

The first practical private key block cipher designsbasedon Shannon'sprinciples of con- fusion anddiffusion werelaid downby Feistel[51 and Feistel,Notz and Smith[121. These design framewor ksare referredto asSubshtution-Pennutation Network3 (SPNs)andFeistel Network8or Feutelciphers.

The SPN cryptographic network consists of&number ofstages or rounds of substitution- permutation layers. Eachsub6t itutio n-penn ut a tion layer(SP layer)ismade upof several smallersub-blocksubstitutions(knownas Scboxes) followed by a large bitpositionpermu- tation opera tion(knownasP-box). The formerhas the effectofShan non'sconfusion,while the latter operation implementsShannon'sconcept of diffusion.A primarykeyisusedto

(30)

• ••

<,

\

Figu re 2.3: ABasic Substitu tion-Perm utationNetwork

generate all subkeysimplementedin each substitut ion-pe rmutationlayer accord ing to akey schedulescheme. An m xn S-boxhas a nonlinear mappingfrom an m bitinputto an n bit output pattern.The Scboxesfor SPNs,however,must have same numberof inp uts andout- puts.Such S-boxes arealsoknown assvmmetric S-boxes. Moreover,these mappingsshould bebijective,meaningthat theyare a one-to-onemappingand areinvert ible. Invertib ihtyis neededforthe purposeof decryp t ion.Two stagesofScboxes inanSPst ruct urebasedon ..x.. S-boxes areshown in Figur e 2.3.

Another type of private keyblockcipher design isbasedon a Feistel network architecture proposedby Feistel,NOb:,and Smith {12J.In a Feistelarchitecture,as shown in Figure 2.4, Shannon'smixing transformationcan be achieved using Scboxes and permutations insidea round function

1-

But these operations are performed on only half the block at a time.For

10

(31)

eachround, theright halfis fed intoa round functionf ....boseoutput is bitwiseXORedwith thelefthalf.Thisis followed by an immediateswappingofthe twohalves.After atowof Rrounds,thetwo halves areconcaten ated to constitu te the ciphertext block.The complete encryptionprocesscanbe visual ized as aniter atio nof the following oper at ion:

n.+l

f(fl;,

K<l e

L; (2.1)

where

n.

andL;arethe rightand left halves,respectively,of the blockforthe iUr. round.

Also,K,represents the,-rAroundsu bkey,

Decrypti on is similartoencrypt ion,withthe only exceptiontha t thesubkeys are used inreverse order.Theroundfunction is the most criticalcompo nentof the cipher design as itintrod uces an elementof randomness totheplain text. It is basicallythe stru ct ureofthe roundfuncti onthat distinguishesbetween diJferentFeistel ciphers. Feis tel ciphers ,unlike SPNs,can haveasymmetri cS-boxes, t.e.mof:n,

Most ofthe existing commercialcryptogr aphic implementationsuse DES fortheir private keyalgori thms.The DataEncryption Standard,first introducedin1977 asanencryption standardfor unclass ifiedapplications is basedon a 64-bit blocksize.The key sizeis56 bits. The round functionexpandsthe 32-bitinput into a 48-bitblock using an expansion table,followedbyan XORoperation involvinga48-bit subkey generatedbythekey !JckJule scheme andthe 48-bitexpanded block.Theresultan t48 bits arethen fed into eight 6x4 S-boxes.The output ofthe eightS-boxesgoesthrou ghafinal32-b itpermutationgivingthe final32-bit outputofthe roundfunction.

Theciphers are typicallykeyedbyapplyingsubkeybits(deri ved foreach roundbythe keyschedule)to theS-boxesem ployingeither:

11

(32)

i

R,

:

~ i

~

T T

~L,., ~R""

Figure 2.4; A Basic Feistel Structure

(i)selectionkeying:Herethe keybits select the desired mapping fora part icul ar Scbox.

(ii)XORkeying: Herethekeybitsare XORed withthe input bitsbeforefeedinginto the 8-box.

2.1.2 Po pularPriva t e-keyBlo ck Ciphers

Overthe years,many private-key block ciphers have been proposed as potentialreplacements for DES.The structure of these block ciphersmayor may not be a Feistel network.An introduction to some of the popular private-key block ciphers is presentedhere.

Blowfish(131 isan algorithm developed byBruce Schneier.Itis a block cipher with a 64-bitblock size and variable lengthkeys(up to 448 bits).It has gained a fair amount of acceptanceina number of applications.No successfulattacks are knownagains tit.Blowfish

12

(33)

isusedin anum ber of popularsoftware packages,includingNa utilusandPGPfone.

FEAL[14Jisa 64-bitblockcipherwith a 64-bitkey.Itisbasically a Feistelstructure.

The8 x8Scboxes inthe round function execute XOR additionsand byterotatio ns. The algorithmiswellsuitedfor8-bit microprocessors.However, the downside ofthis cipher hasto do with its resistance against differentialcryptanalysis.Ithasbeenshown thatthe algorithmwithlessthan8rounds canbeeasily broken usingdifferentialcryptan alysis[15J.

Thecipherisresistan ttothis kindof att ackonlyifthenumber of roundsexceeds32 [161.

IDEA (Intern ational Data EncryptionAlgorithm)is an algorithmdevelopedatETH Zurich in Switzerl and[17).Ituses a128bit key,anditis general ly considered tobevery secure.The blocksize is again 64 bits. The algorithmuses a mix ofthree differentgroups of operations -bitwiseXOR, integeradditionsand integer multiplication s.Ithas already been around forseveral years, and no practicalattacks on it havebeen publisheddespitethe number ofattemptsto analyzeit.IDEAispatentedinthe UnitedStatesand in moot of the European coun tries.The pa tentis held byAscom-Tech. Non-commercialuseofIDEA isfree.

RC5 [8] isa very efficient word-oriented secret-ke y block cipher.Itis a parameterized familyofsymmetric ciphers.Ituses a variablewordsize, a variable-length secret key and a variable numberof encryptionrounds.The architectureof thisnovelsymm etric block.

cipher does notfall into the realms ofa typicalSPN orFeistel cipher.This algorithmmakes use of data-dependent rotations.Italsomakes use of integeradditions,subtract ionsand bitwise XORs.RC5 has beenshownto bevery resistant againstboth linear anddifferential cryptanalysis(18),although potentiallysus cept ible to timingattacks(19J.

CAST-128[9] isanotherprivate-key blockcipher thatisbased ona 64-bit block size.

13

(34)

It usesa128-hit primary encryp t ion key. The algorithmuses six 8x 32 S-hoxes. The stren gth ofthisalgorithm has been showntoliein thelargesizeofthe S-boxes{9J.Lee, Heys,and Tavares{20]sbowedthatthealgorithmis resistantto both linear and differential cryptanaJ:rsis.

2.1.3 AdvancedEncryptionStandard (A E S)

Asmentioned earlier,the National Institute of Stand ards andTechnology(NlST)basun- veiledaprocesstodevelopafederal lnfonnationProc essing Stan d ard(f IP S) foran Ad- vancedEncryption Standard(A ES){I).TheAESrepresen tsaspecifi cat ion for a privat e-key block cipheras areplacementforDES.As a partofthe AES process,a numberof minim um acceptabilityrequirementshave been draft ed .These candidate algorithm eval ua tion criteria include:

•AESshallbea.symme tricprivat e-key block cipher.

• The adoptedstan dardshall bepublicly defined.

• AESshouldbe sui tableboth for hardwareand software implemen tations.

•ThekeylengthfortheAESmaybeincreased as needed.

•Can did a tealgorithms tha tmeettheabove requirementswillbejudgedonthe basisof the followingfactors;

1.Computational efficien cy 2.Hard warecomplexity 3.Encryp t ion speed

14

(35)

4.Softwaresuit ab ility 5. Memoryrequiremen ts 6.Flexi bility 7.Licensing requirements 8.Simplicity

Fifteen candidate algorithms havebeenpresented to thefirstroun dof AESdevelopm ent phase.Inform a tiononAEScandidat es can be foundin[1].CAST -256 and RC6aretwo ofthefifteenAESsubmissions andareinvestigatedinthisthesis. The emphas is ofthis investigationis on the hardwareimpleme ntation of these ciphers inFPGAs.Bothciphers are strong candidates becausetheyaremod ifica tions oftheirearlierversions (CAST-128(9) andRCSIS]).Like mostofthese proposed ciphers. CAST· 256andRC6aredesignedfor efficient implementationsinsoftware . But as oneof theimplementa tionrequirem en tsforan AEScandidate. the har d wareefficiency ofthese algori t hms has to bethoro ughly investiga ted.

Detailed architectu resoftheCAST-256 and RC6 ciphersarepresentedinChapte r 4.

2.2 Cryptographic Properties o f a Block Cipher

Since itsadoption as a standard,DEShasbeenthe focus ofmostof theresearchinprivat e- key cryptogra phy. :Muchofthe efforthadbeendirected towardscryptanalyringDESor investigatingpropertiesthatmight improvethe overallsecuri ty ofthe cipher.Inthis section.

differen t cryptographicproperti esthatarevitalto thesecurityofablock cipher are presented.

15

(36)

2.2 .1 Nonlinearity

NOtllinearityisthe mos tcrucial featu re in the des ignofprivate-key blockciphers. for instance, if there existsalinearrelationshi p(on a perbit or perblockbasis)between th e ciphertext outputandtheplai ntext input,the ciphercan beeasily brokenby reducingthe cipher toasystem ofline ar equa tions.Theselinear equat ionscan bethen solvedusing a small amountofknownplaintext-ciph ert extpairs. Typ ically, an Scbox isthe only non linear compo nen tofanSPN ora Feistelcipher. Assuch ,theneedtodesign highlynonlinear S-boxes makes the differe ncebetwee n amore oralesssecure cipher.

Anm-bitaffine boolean functiongisdefined[21)as

g(X)=a.oEDatXtED....ED a".X... (2.2) where X=[Xl> ...,x...]is the m-bitbinaryinput,EDis thebitwise exclusive-or,andaie (0, I},0~i~m.TheHamming dist ancebetween twom-bit booleanfunctions ,I(X)and g(X ),isdefinedto be

d(j,g)~#{XE(a,1}mlf (X)Ellg(X)~I}

where#is the totalnum berof m-bttbinary inputs.

The nonlinear ityof anm-bit booleanfunctionIisdefined as

(2.3)

(2.4)

whereAis the setofallm-bitaffinebooleanfunctions.Sincean m x n Scboxhas n output bits,eachof which isanm-bit boolean function,thenonli neari ty of the ScboxSisdefined as theminimum nonlinearity overallnon-zerocombin ations ofoutp ut bitboolean functions:

(2.5)

16

(37)

where Ii isthe m-bn boolean funct ion oftheithoutpu tbitof theSebox,and(c;/ i)( X)==

c;(fi(X))forallX.

He}"SandTavares(21) used random search andfiltering againstknownweaknesses to find highlynonlinearlarge Scbcxes,Theyhave used this techniquein const ru cting S-boxesfor SPN s.

2.2.2 Aval an ch e

feist el,Notz,andSmith[121 first describedthe concept ofavalancheas animportant cryp- tographicproperty inthe design ofa blockcipher.The avalanche pro pertyissatisfiedonly when , onaver age,half the outp utblock bitsvarywhenoneinputbit changes .

2.2.3 Complet en ess

Completenes3wasa conceptintro ducedby KamandDavid a [221.The completenesscriterion issatisfiedifaUoutput bitsdependonallinput bits.Kam andDavid a proposeda class of permutationsinabasicSP N which ensures the completenessof an SPN,providedeach S-boxiscomplete.Drown and Seberry{23Jfoundthat DESis complete afterfourtofive rounds with ahigh probability.

2.2 .4 StrictAvalancheCriteri o n

Webster and Tavares {24]usedthe concepts of completenessand avalanche tocome up witha.

newcryptographic propertythatconcernsnot only the indivi dualS-boXESbutalso complete cryptosyst ems. This propertyis knownastheStrictAvalanche CriterionorSAC.This property statesthat for everyinputbit, invertingthebitcauseseach outputbit tovary with

17

(38)

a probabilityof one half over allpossible inputvectors.Higher order properties ofSAC have been presentedby Ferre [25].Adam s[26] andPreneelet al[27].

2.2 .5 In fo rmati on Theory

Many of thecontributions to the fieldofcryptograp hycome fromthe information theQry concepts introducedbyShannon (11J.Inaciph erthat has perfect secrecy theplaintext is statisticallyindependentof the ciphertext. Thismeans that even with an unlimitedtime andcomputationalresources atour disposal.wecannot guess theplaintext.given knowledge of the ciphertext.For a private-keycipher to beperfectly secure, theuncertainty in the key must be at least as large astha t of theplaintext.

DawsonandTavares[28]furthe r investigated the workof Ferre(29]in using information theoryto design theScboxes.They proposedminimizingthemut ual information between a subset of output bits and any subset of inputand/or out pu t bits inthe design ofS-boxes for SPNs and Feistelciphers.

2.2 .6 In vertibilit y

An nxnScboxis saidto be invert ibleifitis a bijective mappin g.Adams and Tavar es(3D) proposed a methodof constructing8-boxes suchthatit satisfiesei)biject ion.(ii )minimum nonlinearity,(iii )SAC,and(iv )outputbitindepen denceby com bining0- 1 balanced booleanfunctions.However , O'Connor[3 1]wasofthe opinionthatthis technique becomes impracticalas n increases.

18

(39)

2.3 Cryptanalys is of Pri vate-ke y Block C iphers

The purpose ofcryptanal ysisis torecovera secretprimarykey usedinaparticularcryp- tosystem,Thereare threetypes ofgeneral attacks thatcan be applied againstanypar ticular blockcip her.These includ eciphertextonly, knoum plaintext,andchosen plaintezt.Incase of a ciphertextonly attack,the cryptanalystpossesses the cipher textsonly.A known plaintext att ack uses the knowledgeof both plaintextsand their correspo ndingciphertexts.Inthecase ofchosen plaintextattack,thecryptanalystcan select particular plaintexts,and produces thecorrespondingciph er text.This is possibleonly because he or she hastem porary access to the encryptionmachinery.

Themost fundamentalway to breaka cipherisexhaus tive keysear ch,referredto as the brute force attack.Tworecent andpowerf ul methods that ha vedemons t ra tedthe ability to breakmodemday block ciphersaredifferent ial and linearcryptanal ysis.This section describesthesetwowidely knownat tacks against private- keyblockciphersas well.Atthe endofthis section anewtypeofattack,called thetimingattackis presented.

2.3.1 Brut e forceAttack

Abrute force attack,also knownas exhaustivekeysearchis aknownplaintextattack. In thiskind of attack,thecryptanalystgetshold of afew eiphertextsand thei rcorres po nding plain texts .The nextstep isto exh a ustivelysearchallpossible keysby encrypting a known plaintextwitheach ofthese keys .Whenone of thekeys generatesthe corr ect ciphertext,we verylikely havethe correctkey.We canuse afewmoreciph ertext- plai ntex tpairstoverify the correctnessofthekey.

Thebest line of defenseagains tthis type of attackistoincrease the key size suchthat

19

(40)

the att ackbecomes infeasi ble.Theoretical ly speaking,acipher is broken,if thetime and memory resources requiredbyany cryptan alyt ic attack are less thanwhatisneededCora bruteforce attack.

2.3.2 DifferentialCryptanalysis

Difftrtntialcryptanalysis,developed by Biham and Sham.ir[61.isone ofthe most potent techniq ues used to cryptanalyzeman y priva te- keyblock ciphers.SPNsand Feis telciphers belong tothe class ofiterat ed product ciphers and this attackisverymuch applicableto the m.

Differe ntialcryp tana lysisis essentially achosen plaintex tat tack.Blh amand Shamir have successfully attackedDES usingthis technique andhavefoundit tobe mor e efficient than abru teforceatt ack.Thismeth od tak esinto accoun t ciphert ext pairs,whosecorrespon ding plaintextshaveapart iculardifference . In otherwords,it looksattheXO Rdifferenceof twoplain tex ts andconsidersthe correspondingciphertex t pair.InaparticularS-box,if weknow the input XORof a pair,itdoes notensuretheknowledge of itsoutputXOR.

However,there exists a probabilisticrelationbetween theoutput XORs andeveryinpu t XOR.Differential cryptanalysis makes use of the highlyproba bleoccurrences of sequences oCout put XORdiJ£ere nces at each round given aparticular plaintext XORdifference.

Severalmethods have been proposedto ensure theimmunity of a round functionagainst thistyp e of attack .Several methods havebeen usedtoreduce these highly probablece- currences ofoutpu t XO Rsinrelation to inputXORs. Forexample,this can be done by incre as ing the outp utbitsoftheS-box to some reasonablevalue[321.Asecondapproach uses amodularmulti p licationtomask theinputof theS-boxes as awaytoreplace theXOR

20

(41)

operationintheroundfunctionthat involvesthe subkey[33J.

2.3.3 LinearCryptanalysis

Linearcryptanalysis is a known plain text attack, inventedby Matsui[iJ,uses linearexpres- sions to approximatetheaction of a block cipher.This attack exploitsthe statistical linear relati onsbetweenplaintext,ciphertextandsu bkeybits. This implies thatifwe XOR some plaintextbits together,XORsome ciphert ext bitstoget her andthenfinallyXO R the resul t , weend up getting a sing lebitthatis equalto the XORof some ofthekey bitswith a probabil itythat is signi ficantlydifferentthanone-half. Thisdefines alinearapproximation, whichholds with a certainprobability.ITthis probab ilityisdifferent from onehalf,wecan usethis fact to construct a linearapproximationoftheenti realgorithm.This is doneby concatenatinglinearapproximationsof differentrounds .Matsui,in his paper present ed two algorithms used to derive the subkey bits usingalin ear approximation.Algorithm1 is used to recover onesubkey bitthat is the XOR sum of a subsetof subkeybits. The second algorithm, an extens ionofthefirst,determinesa numbe rofthesubkeybits at onetime.

DES is highlysusceptible to this kind of attack as the5-boxes ofDES are not optimized againstthis attack.When this attack ismoun ted against aIs-ro und DES, thecipheris brokenwith247knownplaintexts.Asthe attackgreatlyrelies onthestruct ureof S-boxes, thebest wayto increasethe immunity ofSPNsagainstlinear cryptanalysisis to select highly nonli ne arScboxee. Alternate approach esto thwart linea r cryptan alysi s,involve the useof key-dependentrotat ion s{S]andmodular additio ns andsubt ractions[131.

21

(42)

2.3 .4 Tim in gAttack

Another type of attackaimedat breaking apriva te-key blockcipheristhetimi ng attack.

Basedon the assumptionthat accurate timing measurementsare available for individual encryptions ,thisattackempdoysthe methodology ofderiving the keybitsusing timing inform ati on from a set ofcip laert ext s.The timing attack of RC5 as outlinedin[19j,exploits the fact tha t a naiveim plem e nta tio n of the cipher couldresultindata-dependen trotation s taking a time that is a functionof the data. This implies that it is important forthe designers to be aware of different cryptographicissues when implementingcipherslike RC5. However, the timingat t ack cezn be preventedifa digital hardware irnplemention ens urestha t the rotationstake constant tbne. Abarrel shifteris one piece of digitalhardwaretha t can execute any size rotationsinoneclockcycle.

2.4 Security of R C 6 and CAST-256 Encryption A lgo- rit h m s

As mentionedearlier,CAST -2 56 and RC6 are among the fifteen candidatealgorithms that have beenpresented to the fi-rstround of AES development phase. Altho ughCAST-256 and RC6 are neither SPNsnor- Feistel ciphers,their architectures are extensions of the basic Feistel cipher.This sectionoriefly discusses the security of thesetwoAES submissions against linearand differential.crypranalysis aswell as against brute force attack.

InAES submission forRC B (3j,several modificationshavebeen made such as the use of fourworkingregisters insteadof two as in Re5[8], and theintroduction of a quadratic functiontha t uses theprimitive operation of integer multiplication.The useof multiplication

22

(43)

oper ation enhancesthediffusion effect,there by increasing theoverall security ofthecipher.

It has beenconjecturedin{3J thatthebest approach to atta.ckRCS block cipheristo adoptbrute force atta.clc:.Thisisachievedby carrying out an exhaustivesearch for theusee- supp lied encryption key.Rivest,Robsha w,Sidney,andYin[3]have concludedthatthework loadneeded to exhaustively search foethe~byteencryptionkeyDCtheexpan ded forty-four 32-bit subkeys(as apactofAES submission] ismin(~.2104a}operations! :U fae as the linear and differential cryptanalyticat tacks on this cipher are concerned,the data require- mentsto execute theseattacks onRCSexceedtheavailable data. Foe instance,conside ring an g.couodversion of RC6 wouldrequir e morethan278chosen plaintextpairsfoe success- fully mounting differen tialcryp t an alysis, while itneedsmorethan 200knownplain textsto apply lineae cryptanalysis succ essfu lly. Clearly, applicationofthese attacks to the 2o-round versionofReS,as presented forAESsubmissi on makes these attacksimpractical.

Thesecuri ty analysis ofCAST - 256[4, 34]reveals that the ciphe r is resistant to bothlinear and differential cryptanalysis.Thetotal number ofknownplaintextsneeded fora48-roun d linear approximationof CAST-256is approximately2122,whichisalmost equaltothe to tal numberof plaintexts available (2128 ).This impliesthat linearcryptanalysisis imprac ti cal against CAST-256.Incaseofdifferentialcrypta.oalysisofCAST-256,weneed morethan 21-40chosen plaintexts which ismuchgreaterthan thenumber ofplai ntexts availablefora 128- blocksize.Itthereforeappears tha tCAST-256 is immune todifferential cryptanalysis attack too.

23

(44)

2.5 Conclusion

We have introduced the fundament als ofcryp togr a phyin the beginning of thischapter, followed by a detailed investiga tion of private-key blockciphers .Here ,wehave presented twomain arch itec tures ofblockciphers. Many popularprivate-keyblock ciphershavebeen describedtoo.In addition ,various cryptographic propertiestbat are vitaltothe design and analysiscrs-boxesand ciphers have been discussed. Next,different cryptanalys istechniques as applied tociphers have been presented. Fin all y,wehave discussed. the securityof two block ciphers,Re6andCAST -256,againstthe different attacks presented .

(45)

Chapter 3

Hardware Environments for Cryptographic Applications

Twoim port an tcommunica tion revolution shave catal yzed thegenera tion of an entirenew inform a tion-basedindus try. The first revolut ionwasthe interconnection ofdatanetworks aroundtheglobeculm ina tingintheInternet.The second isthe recent availabilityofinex- pensive highspeed connectionsthatlinkusersathome or in the office tothese networks.

This growingtrendofintem etworking hasledto thecom merc ializ a tion of on-lineservices and electroniccommerce. This has resultedina critical urge fordata security.

Mostmodemday securityapplications makeuseof cryptographichardware as a potent weaponagainst differentsecuritybreachesandintrus ions. Withthe demoniac growth of the Internet,theneed forpri vacy,authenticityandanonymityhas encouragedcryptography tosurface asaviable means ofachievingsecurity . ThereareDOWalot of engineering designcompanies thatare shippingoutcryptographichardwarefor applicationsasdiverse as electro niccommerce andbanking, secure wireless solutions,smartcards,PCM CIAcard

25

(46)

securi ty,certificationauthority,digitalsignaturesetc. Oneofthe most recent applications of cryptographichardware hasto do withthe securityofvirtual private networks or VPNs as they are popular ly known . These are cryptographic accelera tor modulesthatact as fastcoprocessors providingcryptographicprocessi ng at wireline speeds,freeingthe router or firewall toperformothercriticaltasks whileeliminatingcongestionin virtualprivate networks. Otherexamplesofcryptographichardwareinclude LANjWANencryptors.

3. 1 Hardware Encryption vs , Software Encryptio n

Any encryptionalgorithmcan be implemen tedinsoftware. Butthere are sever aldisadvan- tages inherent to software imp lementations.Theseincludelowerspeed, higher cost,and less security.Thespeed of encryptio nisbasical ly restricted by themaximumclockfrequencyof the computingplatform,wher eas in case of a hardwaresolution, wecango foran extremely fas t implementationsuchasafullcustomASICimplement a tion.Moreover,a software-alone solut ionis vulnerabletoviruses,ina dvertanterasing, complicationsfrom systemfailures, and hackers.

Incontras t,hardwa reencryption has many advantages over softwaresolutions. Encry p- tion in hardware isfaster.Ahardwa resolutionis imperviousto system failures suchas viruses.Hardwar e imp lementationscan protect against internalandexternal intruders us- ingtwo-factorauthen ticati on : both the bardwaredeviceanda password are necessaryto accessthe root orprim ary encryp tionkey.

Internalkey manage me ntanddistributionis bette rtakencare ofusinga hardwareen- crypt ionsolution. Hard ware solut io nscan controlaccesstotherootkeys so tha tonecan dist ri bu teaccess codes across sever alindividu alswho mustcoopera teto gain access to the

26

(47)

rootkeys. Moreover, hard war e solutionsoffer scalablesecurity.

3.2 Field P rogranunable Gate Arrays (FPGAs)

A Field ProgrammableGate Array(F PG A) is a general -purpose,multi-levelprogrammable logic devicetha tis customizedinthepackagebythe end users. FPGAs are deviceswhose cores are popula tedwithan array of logic structuresofchanging granularity andpro- grammable interconnect used to connectthem in several differentways.For instance, the logic blockscan beSRA.VI-bas ed lookup tables(LUT s) or even mult iplexers with or without registers,whilespecial purposerouting switchboxes or segmented channelscan makeupthe progr am mable interconnect.

Thestructure, size andnum ber of blocksoflogicas well as the amount ofgluelogic or the connectivityofthein terconnecti on differs largelyamongFP G A architectures. This variation inFPGAarchitectures is dictated by differentprogrammingtechnologiesand different target applicati onsofthe parts. Thisimpliestha t an architectural arrangement that workswell withaparticular progr am mingtechnology does not necessarilywork withanother.

Basedon differentprogrammingtechnologiesand architecturalstyles,FPGAs fall int o four grou ps:

• Island-styleSRA M-progr am med devices.

•Cellular SHAM-programmeddevices.

•Channeled,antifuse..programmed devices.

•Array-styleEPROM orEEPROM- programmeddevices.

SR.A.\1-based islan d-style FPGAsincludeXilinx LeAfamilies. The Xilinx FPGAuses a fairly largelogic blockwithtablelook up functionalityand two 0 flip-flops.Xilinx arrays

27

(48)

'...~K...~l·

/

,~-

.. - -, , -- - -< - -

';::.:.::' I

Figure 3.1:A Simple FPC:\. Taxonomy

•.I'll<'\l · n'n 'n<'<l M y

have specialized routing blocks.This enables interconnection of a subset of inputs to one another.The AT&T Ocre and Alters Flex,as ....'ell as UTFPG AI[351, also belongto this type ofFPC :\. architect ure. Toshiba, Plessey'sERA, Atmel's Cl.ifamily,theAlgotr onix CAL,aswell asTript ych[3GIFrGAsbelong tothecellular-s tyle archit ecture. Algot ronix and CAL reuse someof thelogic cellsthemselves to actas rou ti ng resources.

Ant ifuse-based channeledFrGAsinclud e Actel'sACTl and ACT 2, Quicklogic's pASIC and Crosspoint'sCP20I{Series FPC:\..Actellogic blocksare verysmall and multiplexer based [3i] .Actel arrays use segmented channel resources.EPRO~I-programmeddevices include Altere's~IAX5000 and \IAX 7000,A\ID's\lach andXilinx's EPLD,as'il..ell as a few others. Alteralogic blocks directly support multi-level combinationallogic. A simple taxonomyofFPGAsisillust ratedinFigu re 3.1.

28

(49)

3.2.1 Advanta g esof FPGAsoverMPGAs

Inthis section,we discuss theadvantagesof usingFPGA5as a hardware implementation solutionversus a maskedprogrammed gate array(~ PGA)implementation.

Low ToolingCosts:Everydesign tobeimp lem en ted in a ma.skprogrammed gateamlY (MPGA)requirescus to m masks to constructthe custom wiring patterns.TheC05tof each maskis several thousanddollarsandthiscostisthenamortized overthetotal number ofunitsbeingmanufactured. Asaconsequence , maskin g chargesfordesigns basedonMPGAsaretrem endous .Incontras t,thereis no customtool ing neededfor FPG A designs,presenti ngFPGAsascost -effecti vefor mostlogic designs.

Rap idTurnarou nd: From the completion ofthedesignto thedelivery of the finished products,the manufacturing process takesseveral weeks in case of MPGAs.AnFPGA onthe other han d can be progr amm ed inminutesbythe enduser. Faster design turnaro undleadstofaster prod uct develo pment and short ertime- ta- market fornew FPGAproducts.In[38J,Reine rsten found thatinadesignenvironmentthatca te rs theneedsof ahigh-techindustry,a dela y ofsix monthsinproductdelivery reduces the lifetimeprofitsofaproductbythirty- threepercent.

ReducedRis k :The advantagesoflowinitial non-recurringengineeringchargesand rapid turnaro unddesigntime impliestha taredesigndue toanerrorincurslowexpenses and smalldelays.Thisencouragesrapidpro tot ypi ng~dmore aggressivelogi cdes ign.

EfficientDesign Verification:MP G A users haveto verify their designsbyextensive and elaboratesimulationbeforemanufacture,mainly,because ofhuge non-recurringengi- neering costs andlongmanufacturing delays.AnMPGA designmay includeerrors due

29

(50)

toinaccu racies oroversi m plifica ti o ns madeinthesim ula ti on model.Thisisbecause there is need forlongtime simulations .FPGAsdo notsuffer fromthis dilemma.FPGA users may chooseto usein-ci rcuit verificatio n,inst eadof simulatinglarge amountsof time.

Low Test in g Ex pens es : There are three typesof costs associated with testi ng MPGA par ts : on-chiplogic for testing ,generating test programand final partstest ing when the manufacturingis done.The manufacturer's test program verifiestha t every FPGA will he functional forallpossibledesign s thatmaybe implemen tedon it.FPGAusers do not worryabout writing design-specific testsforthe ir designs.Thiseliminatesthe needto build testabilityintothe design. Moreover, since the test program for FPGAs isthe same foralldesigns as opposed to MPGAs, it isreasonabletoinvest more effort onimprov ing it.This achieves excellent testcover age, providinghigh quali t yles.

3.2.2 Disadvanta gesof FPGAs ove r MPGAs

Fieldprogrammablegate arrays(F P G As) alsohave somedisadvantages.These drawbacks are main lydueto the inherentna t ure ofthetechnologyitself.To begin with,FPGAs suffer fromon-chip programming overhead circui trywhichis responsiblefortheprogrammi ngof agiven part.The area occupiedby programming over head cannot be utilized by the end user.Thisremits in low gate densityforthe FPGA.The programmable switch matrices andint erco nn ects in theFPGAsare larger than theirmask-programmedcounterp artsin MP G As.

The programmableswitchesalso increasesign al delayby addingresistanceand capaci- tance tointerc onnect pa ths.As a consequence,FPGAsare largerand slower than equivalent

30

(51)

MPG As. FPG .

.u

a.lsoexhibitsome design limi tatio nsinrelat ion to implemen ting config- urablecomputingapplicationsinthem.For instance,FPGAs are well suited to algorithms composedoCbit-level operations,suchasintegerarithmetic,butthey donotrenderthem- selves efficient enough for implementingnumeric operations,such as high-precision multipli- cation.Infact,dedicated multiplier circuits such as those usedinmicroprocesso rand OSP chips can be optimized to work more efficien tl y thanthose developedCorconfigurablelogic blocksavailableinFPG As.

Theon-chipmemoryprovidedbyFPGAsforstoringinterm ediate computationalresults istoolittle..This implies thatmost configu ra blecomputi ng applicationsneedsomesort of add it ional externalmemory.Thiswillslowdown thecomputat ions.However,researchers and industrypeopleare developingnewer and more advanced FPGA archite ct ur esthatln- corpo rate enoughon-chip memory,very fastand efficientarithmeti c processing andsome special-purpose functional uni ts. A veryrecent example ofsuch FPG Asare Xilinx'sVirtex FPGAs thatnotonly possess great gate densit ies,buta.lsoharness very highspeeds.The Virtex CamilyFP G As have brokendensi ty andperformance barriers whileofferingunprece- dented systemlevel integration,achievingclockspeeds in excess of150 MHz.

3.3 SRAM-based FP G As

Inthis section.weshallCocus our discuss ionon the issuessurroundingtheSR.A.\t:·based.

FPG As.Thisisbecausewehaveusedthesedevicesas0U!target technology Cor realizing the cipherdesignsin hard ware.

SR..A..\t:-based FPGAsarethe mostpopular. This ismainly duetotheirabili tytoreo configure. Severalresearchers[39,40,41,42,43,44,36,35,45, 46Jhaveallinvestigated

31

(52)

this classofFP G As.AnSR..Ac.\l:-hased FPGA is programm ed bydownloading configura tion memory from anexternal source.Theconfiguration memorycells contro l the logicand the interconnecttha texecutethe applica tionfunctioninsidean FP GA. TheRA.Vfmemoryis not centralized,but ratherdistribu tedamong the con.figu ra ble logic blocks. This type of programming approachhas many advantages aswell asdisadvantages attache dto it.One obviousdrawback isthe volatilenatureofprogramming.Thismeansthat when poweris switchedoff,thedeviceloses its programmi ng and assuchthe FPGAistoreprogr ammed every time its turned on. However,the disadvantageof volatilityfurnishes the benefitof reprogrammability. This abili tyma kesSRAM -b ased FPG As ideal for rapidprot ot ypi ng.

This resultsin very high quality devices. Sincethesame CMOSprocess as usedinASICs is employedtobuild these typ e of FPG As , SRA M-basedFPGAsbenefit from processim- provemen tsdrive nby semicon ductorindustry. Finall y,becau se theseFPGAsimplement logic usingstat ic gates, SR.A!.\1 -based FP G As haveverylow powerconsumption.

3.3.1 Xilinx XC40 0 0 Struc ture

XillnxFPGAs possess anarray based stru cture,each chip com prisin gatwo-dimens ional array of logic blocks interconnectedby horizon tal and vertical routingchannels. Xilinx introdu ced the first FPGA series, the XC2000in1985,andnow offersthreemore generations:

XC3000,XC4000andXC5000devices.Avery recent device famil y is the Xilinx 's Virtex series which is still undergoin gfield tests. Ofallthedevice familiesintroduced so far,Xilinx XC4000 is the one thathas mostproven its worthover the

years.

This is the most widelyused FPGA familyinindus try . More informationcanbe obtai ned fromXilinxdata books[47].

Adet ai led descripti onofthis device famil yispresented.

32

(53)

Figure3.2: AXilinxXC4000structure 3.3.1.1 XC4000 Structure

The Xilinx XC.tOOOFPGAstruc t ureisillustratedinFigure 3.2. The logic inside theFPGA isimplemented in an array of programmable blocks of logic called ronfigurableWgicblocks (CLBs) .Input to and output from the array are takencare of by the input/outputblocks (lO Bs) along the edges of the array. The CLBs and thelO Bs are interconnected using sev-eral types ofprogrammable interconnectarchitectures.Connections to and fromCL&

and10D s canbeprogrammed andwiresegments can he interconnected, tofonnpaths that extend fromone box to anot her usinganarray ofprogrammab leconnect ionblockscalled theswitch matrices.

33

(54)

3.3.1. 2 ProgrammingTechnology

Asmentioned ear lier,Xilinx uses SR.A.:.\1techno logy to storetheprogramming informatio n.

Afterthepoweris appliedto the circuit, theprogr am data definingthelogicconfiguration mustbe loadedinto the SRAM.The FPGA itself contains the logicneededto loadthe informationintoitself fromfrom a PROM.Oncethe programm inginformationhas been loaded, the deviceswit ches from programm ingmode tooperatio nal mode in which the logic isavailable. Thislogicismaintainedaslongasthe device is powered up. Assoonasthe deviceisdown,itlosesit . Theabilityto reprogramthe FPGA permitsa new hardware design onthe fiy.

TheSRAM bits control the logicthat is implemented in a Xilinx XC4000device.This is done usingthree techniques,namely pass transis torcontrol, multiplexercontrol, and table lookup implementation [48J.Figure 3.3shows an SRAM cell drivingthega te terminal of an n-channelMOS trans is tor. Whensuch a transistor is used to makeorbreak abidire ctional connection for passing a signal between two wiring segmen ts,itis called a pass tran sisto r . Whenthe SRA.\1 cellcont ains a zero-bit, thetrans isto ris OFF, the path between thetwo wiring segmentsis OPEN,and assuchno controlsignal can be passed. Ontheother hand, whenthe SRA.r.\1 bit con tai ns a one, the transistorisON, the path betweenthe two wiring segments isCLOSED, permitti ngthe sign al to be passed.AnXC4000series FPGAcontains tens ofthousan ds of suchpass transis torsinthe interconnectionstructure.

Figure 3.4 showsan SRAM cell connected.to theselect inputof a 2x 1multiplexer.

When the SRAM cellcontains a0,value onthe zero input line ispassedtothe multiplexer output.Thest ructure is used tomake selectionsbetweentwo signals.

In theFigure 3.5,we have a lookuptable(LUT)built using the SRAM cells.A LUT for

34

(55)

Figure 3.3:PassTransist or Contro l Technique

Figure 3.4; Multiple xer Control Technique

a threevariablefunctio nF(A,B,C)isillustr ated .TheSR.A~fcellsinthe LUT storethe actualtruthtable for the logic functionF.This implies thateachcell houses the value of thefunctionFfor the corresponding minterm 149].

3.3.1.3 Int e r co n nections

Connect ionsbetween the CLDs aswell as betweenCLDs and lOBs are esta blished using wiring segments.These wiring segmentsextend inboth the horizontal and verticaldirections in channels lying between the various blocks. Some of the segments are very long,spanning the entire length or widthofthe array.Suchsegments are known as long lines. Theyare

(56)

f(A,B,C)

Figure 3.5: ALooku pTa bleImplement at ion

basically intended for highfan -out ,time-crit ical stgnelnete, or theonesthat aredist ributed overlong distances[47]. Other segmentsare long enoughto span a single CLB.Thesecan beinterconnected using the switch matrices. These are calledsingle-length linu. Some of the XC4000 family devices even havedouble-length linu or quad lina. The benefit of using longer wire segments is that signal passes through 1t"S6 series resistance in traversing the same distance compared to single-length lines.

The discussion of Xilinx interconnections is incomplete without describing its"witch matrice".An example of a switch matrix is shown in Figure 3.6.Here we have four segments meeting at a point.At this particularcrosspoint,there are six pass transistors: one vertical, onehorizontal andfour00the diagonals. The horizontal and verticalpasstransistors are shownas aplus sign as theyintersect each other. Theremainingfour passtransistors, represent ed byblack bold lines surroundthesetwopass transistors. Theconnection between twosegmentsis CLOSED for a onestored in the SRA11cell drivingthetransist or gate.

Similarly,the connectionis open forazerostoredin theSRA),lcell. Thus, the SRAM·

36

(57)

I .., C

I

· i

--+---+-+--r i .l ! ~ ~ _L

I

! i

&. '

!

: ' 1 ~1 ;

~ I- T ~

~ i l i i

~ l ! l ,

I ! I I ! !

I I I I

Figure 3.6: AXilinxXC400Q Switch Matrix

controlled pass transistors lie atselectedinterconnec:tionsbetweenthe wiring segments in the routingchannels {50].

3.3 .1.4 XllinxLogic

:Most ofthe logicin a Xilinx XC4000 device lies withinthe CL&andthelOBs. The structure oftheeLBaswell as of thelOBis intern allyprogram mable.A sim plified representation of a.Xilinx XC4000 Ct.Bis showninFigure3.7[47].

Here there are thirteen inputs to theeLB,iDcluding the clock input.Two Sip-Bops andtheirassociated logicare also represented by brokenlines.The remainingpartofthe eL Bisusedto implement the combinational logic.There are threeLUTs thatimplement combinational functions. Twofour -inpu t tables implementtwo functions labeled

F

and

G.

A third function generator,

H

can implement any boolean function ofitsthree inputs. Two 37

(58)

Figure 3.7:SimplifiedLogicSchematic ofaXilinxCLB

ofthese inputs canoptionallybethe

F

and

G

outputsof the twofunction generators.The thirdinp ut essentiallycomes fromoutside theCLB.

AeLBcan implementany of the following functions:

•Anyfunction of up to four variab les,any additionalsecondfunctionof up to four unrelated variables,plusany third funct ion of up to three unrelated variables.

•Any singlefunctionof five variables.

•Any funct ionoffourvariablesalong with some functionsofsix variables.

•Some functio nsof up to ninevariables.

38

(59)

Figure 3.8: SimplifiedBlock Diagram of XilinxIOD

User-programmableinput/outputblocks(IODs) supply theinterface betweenexternal package pins and the internal logic. Asimplified block diagram representation of a Xilinx XC4000IODis giveninFigure 3.8.

To simplifythings,theIOD block is divided into twosections-aninput portionand an output part.The output part of theIOD gives the outputdat a from the interior of the FPGA and the I/Opin.Alternatively, it can providethe stored value of the outputdata from a flip-flop.Atri-state driveronthe outputallowsthe I/Opintobeused as an input, an outputor the input/output.Inthe inputpart,the signal at the I/ O pin enters theinput buffer.The signal can be fed directly to input dat a 1 and input dat a 2.These are two input linesto the interiorof theFPGA orone or both of theinputscan be fed by a storedvalue from theI/O pinsignal orits complement.

39

(60)

3 .4 C ryptographic A lgorithms: F PGAs vs, AS ICs

A new developmentinintegratedcircuits that offers a hardware implementation choice that is much more flexiblethan ApplicationSpecific IntegratedCircuits(ASIes)isthe remarkable entry of reconfigurahlecustom hardware;these large,fast reconfigurable gate arraysareFPGAs.Incontrast,ASICsprovide only functionalityneeded for a specific task.

A well-designedASICchip will support a particular application forwhichitisdesigned, but notaslightly modifiedversion of the sameapplicationintroducedafter the ASICdesign is comp leted.Furthermore,evenifa modified ASIC can be developed, the original hardware is toohighly customized to be reusedinsuccessivegenera tions.Incontras t, the configuration of an FPGA can be easilyreprogrammedto eccomodeeeadesignmodifica tion.

Field ProgrammableGateArrays(FPGAs)have been chosen as the targettechnology for realizingthe Re6 and CAST-256cryptographicalgorithmsin hardwarebecauseof a number of reasons.Replaci ng onecryptographicalgorithm with anotherisatrivial matterin software,butit is notthe samein hardware.Moreover,atthe same time hardwaresolu tions canoffer improved performanceintenus ofspeed,securi ty,and cost.As such the solution tothisproblem is reconfigurablehard wareand FPGAsaretheanswer. As a mat ter of fact, FPGAscan be usedto buildalgorithmagileapplications[5 11.Theterm algorithm agility refersto the fact thatthesame FPGA can be reprogrammedatrun time to supportdifferen t algorithms.Other key fact orsthatfavourthe use of FPGAs forhardwareimplemen tation of ciphers includefaster turnarounddesigntime,saJ.lablesecurity,andvari able architectural parameters.

40

(61)

3. 5 Conclusion

1J:lthischapter.we havetoucheddifferentissues that relateto thehardware lmplementa - tionof cryptographicalgorithmsin FPGAs. Wehavecompared hardware encryptionto encryption in software.Next,weintroduced FPGAs asaviablecustom architectureimple- mentationchoice. Here we have described different FPGA architectures,presented several key advantagesanddisad vantag es ofusingFPG As as a targettechnology.Wehavefocused on SRA..."d·based FPG As in general and Xilinx XC4QOOfamily in particular.Wehaveclosed our discussion byreasoning whyFP G As have beenchosen over ASICsforthisparticular application.

41

(62)

Chapter 4

D esign of RC6 and CAST-256

Inthischapter,wewillbasically focuson the FPGA implementationof two strong AES candidates-RC6 andCAST-256. We willinvestigate issues relatingto the efficiencyof the two ciphers from the hardware implementationperspective.

4. 1 The RC 6 Ciphe r

Re6is a symmetric(p riva te-key) block cipher submitted toNISTfor consideration as the newAES.Re6is an evolutionaryimprovement ofRC5 [81.Modifications have been made to meet the AES requirements,to increase security,andto enhance performance.

Re6-wIT/b, a general version of the ReG cipher [3],operateson units offour w-bit words, with the encryptionconsistingof a nonnegative number of rounds r,andb representingthe length of the encryption keyin bytes. The user supplies aprimary key of b bytes,wher e o:5b:5255 and, fromthiskey,the key schedu leschemeoftheReS-to/ribalgorithm derives 2r+4 subkeys,where each subkeyis a w-bit word. These 2r+4 suh keysare thenstoredin

42

(63)

the arrayS(O,....,2r+3}.Thisarrayof subkeysisusedinbothencrypt ionand decry p tio n.

Encryptio n with the RC6 algorithmis described below. RC6wor kswithfour w-bit registersA, B, C,andDwhich contain theinitial input plaint extaswell asthe outp u t ciphertext at theendoftheencryp tion. Thestand ar dlittle endian convention isused for packing the databytes into theinput/outp ut blocks . The encryp tion block involves the following basicoperations ontwo w-hitwordsaandb:

aEEl b:bitwiseexcl usive-orofw-bitwords a+b:integ eraddition modulo 2'"

a x b:integermultip lica tionmodulo 2'"

a<:b:rotatethew-bitword atothe leftbytheamou nt givenbythe least significant log, wbit s of b

For theAESimpl emen tati on of RC6, w

=

32 andr

=

20. Eachof the four32-bitregis tersA, B, C,andDisupd ated after eachround ofencryp tion.Theoutputofthe 2O-roundencryption is also stored in thefour registersas theciphertext.Theentireprocessof encryption in the RC6 algorithm is illustratedin Figure 4.1.

Decryp tion is similar,butinvolvesreversingthe orderofthe subkeys,replacing leftrota- tions by rightrotationsandreplacing addi tion with subtraction.Since the AES submission requiresthe cipher to oper ateon 32-bit words and there shoul d be twenty rounds of encryp- tion/decryption, the issues related tohard wareimpl eme ntati onofRC6-32/ 20/b versionwill be discussedinthe coming sections.Notethat ferAES,b ::; 64 byt es[i.e. up to 512 bits of key)are allowedaspri m arykey.Fordetails onthekey schedulingschemerefer to [3].

43

Références

Documents relatifs

After a comparative study, in order to reduce the complexity and the latency of the digital block, the output signal of the MIMO simulator will be computed in the frequency

I Threshold Search with Highway-Country road approach for analysing S IMON and S PECK. I Extend the Threshold Search technique for

A trust level is a useful method to identify the required hardware and software security protection mechanisms that a system must include to protect the data confiden-

Public-key cryptography is also known as asymmetric cryptography. The public key, as the name indicates, may be widely distributed , while the secret key is kept

constraints and design algorithms to make projections onto curves spaces with bounded

1 When our work was done in 2011, the best previously published cryptanalytic re- sults on Camellia with FL/FL − 1 functions were square attack on 9-round Camellia- 128 [11],

Cette utilisation d’une fonction de hachage avec une clef secrète intervient dans le calcul d’un MAC, et une attaque d’extension comme celle-ci permet de signer un message

In Chapter 2, I first detail all of the state-of-the-art solutions for masking schemes and some overview of the software choices we made throughout this thesis. In Chapter 3, I