RNS Modular Computations for Cryptographic Applications


HAL Id: hal-01141347

https://hal.inria.fr/hal-01141347

Submitted on 11 Apr 2015

Karim Bigou, Arnaud Tisserand

To cite this version:

Karim Bigou, Arnaud Tisserand. RNS Modular Computations for Cryptographic Applications. RAIM: 7ème Rencontre Arithmétique de l’Informatique Mathématique, Apr 2015, Rennes, France. 2015. ⟨hal-01141347⟩

1. Elliptic Curve Cryptography (ECC)

Elliptic curve over $\mathbb{F}_P$: $y^2 = x^3 + ax + b$, with $P$ an $\ell$-bit prime

Example: $y^2 = x^3 + 4x + 20$ over $\mathbb{F}_{1009}$

Security levels: $\ell \in \{160, \ldots, 600\}$ bits

Curve level operations:

- point addition (ADD): $Q + Q'$
- point doubling (DBL): $Q + Q$
- scalar multiplication: $[k]Q = \underbrace{Q + Q + \cdots + Q}_{k \text{ times}}$

Security relies on the ECDLP (Elliptic Curve Discrete Logarithm Problem): knowing $Q$ and $[k]Q$, it is computationally infeasible to recover $k$.
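The curve-level operations above can be illustrated with a minimal Python sketch on the toy curve $y^2 = x^3 + 4x + 20$ over $\mathbb{F}_{1009}$. The function names (`on_curve`, `add`, `scalar_mult`), the affine-coordinate formulas, the double-and-add loop and the sample point `Q = (1, 5)` are assumptions chosen for illustration; they are not taken from the poster, and the toy field size offers no security.

```python
# Toy parameters: y^2 = x^3 + 4x + 20 over F_1009 (example curve only).
P, A, B = 1009, 4, 20
INF = None  # point at infinity, the neutral element


def on_curve(pt):
    """Check that pt satisfies the curve equation (INF counts as on the curve)."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P == 0


def add(p1, p2):
    """Affine point addition (ADD); falls back to doubling (DBL) when p1 == p2."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF  # p2 is the opposite of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, q):
    """Compute [k]q with a right-to-left double-and-add loop."""
    result, addend = INF, q
    while k > 0:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


Q = (1, 5)  # 5^2 = 25 = 1 + 4 + 20, so Q lies on the curve
```

Each ADD or DBL needs a handful of multiplications and one inversion in $\mathbb{F}_P$ in affine coordinates, which is why the cost of field-level operations dominates the scalar multiplication.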

3. RNS Computation Flow in ECC Applications

RNS allows some field-level operations to be performed in parallel.

[Figure: computation hierarchy and timing. Scalar multiplication $[k]Q$ is built from ADD and DBL, which use $+$, $-$, $\times$ and inversion in $\mathbb{F}_P$, themselves computed with ± and × modulo $m_1, \ldots, m_5$. The time diagram distinguishes an operation over one channel, an operation over one RNS vector (i.e. $n$ channels), base extension, and reduction modulo $P$ in RNS.]

5. New RNS Modular Inversion (MI) (CHES 2013)

State-of-the-art RNS MI methods:

- based on Fermat’s Little Theorem (FLT-MI): $X^{-1} = X^{P-2} \bmod P$, i.e. a large exponentiation with many modular reductions, which costs $O(\log_2 P \times n^2)$ EMMs
- very limited parallelization due to internal data dependencies

Proposed method PM-MI:

- extended binary Euclidean algorithm (binary-ternary version)
- uses the plus-minus trick: if $X$ and $Y$ are odd, then $X + Y \equiv 0 \bmod 4$ or $X - Y \equiv 0 \bmod 4$
- PM-MI works without base extension (BE) and costs $O(\log_2 P \times n)$ EMMs
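The two inversion strategies can be contrasted with a small Python sketch on ordinary integers. This is only an illustration of the plus-minus trick under stated assumptions: it does not work in RNS, it is not the PM-MI algorithm of the CHES 2013 paper, and the names `flt_inverse`, `pm_inverse` and `half_mod` are hypothetical. The loop keeps the invariants $u \equiv aX$ and $v \equiv bX \pmod P$, with $u$ and $v$ odd, so exactly one of $u+v$, $u-v$ is divisible by 4.

```python
def flt_inverse(x, p):
    """Inverse through Fermat's Little Theorem: x^(p-2) mod p (p prime)."""
    return pow(x, p - 2, p)


def half_mod(c, p):
    """Divide c by 2 modulo the odd number p, for c in [0, p)."""
    return c // 2 if c % 2 == 0 else (c + p) // 2


def pm_inverse(x, p):
    """Inverse of x modulo an odd prime p with a plus-minus binary Euclid loop."""
    u, a = x % p, 1   # invariant: u = a * x (mod p)
    v, b = p, 0       # invariant: v = b * x (mod p)
    if u == 0:
        raise ZeroDivisionError("x is not invertible modulo p")
    while u % 2 == 0:             # make u odd before entering the main loop
        u //= 2
        a = half_mod(a, p)
    while True:
        # u and v are odd, so exactly one of u + v and u - v is 0 mod 4
        if (u + v) % 4 == 0:
            t, c = u + v, (a + b) % p
        else:
            t, c = u - v, (a - b) % p
        if t == 0:                # |u| == |v| == gcd(x, p)
            break
        while t % 2 == 0:         # at least two halvings per iteration
            t //= 2
            c = half_mod(c, p)
        if abs(u) >= abs(v):      # replace the larger operand
            u, a = t, c
        else:
            v, b = t, c
    if abs(u) != 1:
        raise ZeroDivisionError("x is not invertible modulo p")
    return a % p if u == 1 else (-a) % p
```

In this sketch every step is an addition, a subtraction, a parity test or a halving, with no full modular exponentiation, which gives an intuition for why a Euclidean approach can avoid the many reductions of FLT-MI.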

[Figure: PM-MI architecture. A shared CTRL block drives, for each channel, local registers and an arithmetic unit with 6 pipeline stages and $w$-bit input/output, a comparator, and memories of precomputed values (multiplication: ≈ $2n \times w$; addition: $17 \times w$).]

Example: # EMMs for $\ell = 192$ bits

n × w      FLT-MI    PM-MI    Gain factor
12 × 17    103140    5474     18
 9 × 22     61884    4106     15
 7 × 29     40110    3193     12

[Figure: inversion time [µs] and speed-up of PM-MI over FLT-MI as a function of $n$, for 192, 256, 384 and 521 bits]

[Figure: area of FLT-MI and PM-MI as a function of $n$ (slices and # blocks, DSP / BRAM), for 192, 256, 384 and 521 bits]

2. Residue Number System (RNS)

A large $\ell$-bit integer $X$ is represented by:

$\overrightarrow{X} = (x_1, \ldots, x_n) = (X \bmod m_1, \ldots, X \bmod m_n)$

[Figure: $n$ independent $w$-bit channels; channel $i$ computes $z_i = x_i \diamond y_i \bmod m_i$ with $\diamond \in \{+, -, \times\}$, producing $Z$ from $X$ and $Y$]

RNS base $B = (m_1, \ldots, m_n)$: $n$ pairwise coprime $w$-bit moduli with $n \times w > \ell$

The Chinese remainder theorem (CRT) is the basis of RNS.

EMM: elementary modular multiplication ($w$ bits)

Pros:

- carry free between channels
- fast parallel $+$, $-$, $\times$ and some exact divisions
- non-positional number system, randomization against SCAs
- flexibility for hardware implementations

Cons:

- comparison, modular reduction and division are much harder
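The representation and its channel-wise operations can be sketched in a few lines of Python. The toy base `BASE = (13, 17, 19, 23)` and the helper names (`to_rns`, `from_rns`, `rns_add`, `rns_sub`, `rns_mul`) are assumptions for illustration only; real implementations use much wider $w$-bit moduli and run the channels in parallel hardware.

```python
from math import prod

BASE = (13, 17, 19, 23)  # hypothetical toy base of pairwise coprime moduli


def to_rns(x, base=BASE):
    """Residues of x, one per channel."""
    return [x % m for m in base]


def from_rns(residues, base=BASE):
    """Reconstruct the integer in [0, M) with the Chinese remainder theorem."""
    big_m = prod(base)
    total = 0
    for r, m in zip(residues, base):
        m_i = big_m // m                  # product of the other moduli
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


def rns_add(xs, ys, base=BASE):
    return [(x + y) % m for x, y, m in zip(xs, ys, base)]


def rns_sub(xs, ys, base=BASE):
    return [(x - y) % m for x, y, m in zip(xs, ys, base)]


def rns_mul(xs, ys, base=BASE):
    # One elementary modular multiplication (EMM) per channel, no carries across.
    return [(x * y) % m for x, y, m in zip(xs, ys, base)]
```

Results are exact as long as the true value stays below $M = \prod m_i$; a product of $n$ residues costs $n$ EMMs, while recovering a positional value (as `from_rns` does) is the expensive part, which reflects the cons listed above.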

4. State-of-the-Art Algorithms and Architectures

RNS Montgomery Reduction

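Input: $\overrightarrow{X}$ (in base $B$), $\overrightarrow{X'}$ (in base $B'$)

Output: $(\overrightarrow{\omega}, \overrightarrow{\omega'})$ with $\omega \equiv X \times M^{-1} \bmod P$

$\overrightarrow{Q} \leftarrow \overrightarrow{X} \times (-\overrightarrow{P^{-1}})$ (in base $B$)

$\overrightarrow{Q'} \leftarrow \mathrm{BE}(\overrightarrow{Q}, B, B')$

$\overrightarrow{S'} \leftarrow \overrightarrow{X'} + \overrightarrow{Q'} \times \overrightarrow{P'}$ (in base $B'$)

$\overrightarrow{\omega'} \leftarrow \overrightarrow{S'} \times \overrightarrow{M^{-1}}$ (in base $B'$)

$\overrightarrow{\omega} \leftarrow \mathrm{BE}(\overrightarrow{\omega'}, B', B)$

BE: base extension; $M = \prod_{i=1}^{n} m_i$

[Figure: data flow of the reduction between bases $B$ and $B'$, with two base extensions (BE)]

[Figure: architecture with $n$ channels (rower 1, ..., rower $n$), a cox unit, $w$-bit datapaths, $n \times w$-bit input/output and a CTRL block]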
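The five steps of the RNS Montgomery reduction can be traced with a short Python sketch. It rests on a strong simplifying assumption: base extension is done here by exact CRT reconstruction (`base_extend`), whereas hardware designs rely on cheaper dedicated BE algorithms. The toy bases, the prime 1009 and all function names are hypothetical choices for illustration.

```python
from math import prod

P = 1009                 # toy prime
BASE_B = (13, 17, 19)    # base B,  M  = 4199  > 2P
BASE_B2 = (23, 29, 31)   # base B', M' = 20677 > 2P


def to_rns(x, base):
    return [x % m for m in base]


def from_rns(residues, base):
    big_m = prod(base)
    total = sum(r * (big_m // m) * pow(big_m // m, -1, m)
                for r, m in zip(residues, base))
    return total % big_m


def base_extend(residues, src, dst):
    """Exact base extension through CRT (a stand-in for a real BE algorithm)."""
    return to_rns(from_rns(residues, src), dst)


def rns_montgomery_reduce(x_b, x_b2, p, base, base2):
    """Return (omega in base, omega in base2) with omega = X * M^-1 mod p."""
    big_m = prod(base)
    # Q = X * (-P^-1) in base B
    q = [(x * -pow(p, -1, m)) % m for x, m in zip(x_b, base)]
    # Q' = BE(Q, B, B')
    q2 = base_extend(q, base, base2)
    # S' = X' + Q' * P' in base B'
    s2 = [(x + qi * (p % m)) % m for x, qi, m in zip(x_b2, q2, base2)]
    # omega' = S' * M^-1 in base B' (exact division of X + Q*P by M)
    w2 = [(s * pow(big_m, -1, m)) % m for s, m in zip(s2, base2)]
    # omega = BE(omega', B', B)
    w = base_extend(w2, base2, base)
    return w, w2
```

For an input $X < M \times P$, the value $\omega = (X + QP)/M$ is below $2P$, so it is congruent to $X M^{-1}$ modulo $P$ without necessarily being fully reduced.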

6. Fast Patterns for RNS Computations (ASAP 2014)

Cost of standard and modular multiplications in RNS:

- standard: $n$ EMMs, fully parallel
- modular: $2n^2 + O(n)$ EMMs (1 multiplication and 1 reduction)

Proposed method:

- splits operands into 2 parts: $\overrightarrow{X} = \overrightarrow{K_x} \times \overrightarrow{M_a} + \overrightarrow{R_x}$, which allows the $2n$ moduli to be replaced by only $\frac{3}{2}n$
- reuses the split result in various computation patterns
- requires a hypothesis on $P$: suitable for ECC/DH, but not for RSA
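The split $X = K_x \times M_a + R_x$ can be sketched in Python as an exact division carried out in a second base. This is a loose illustration under stated assumptions, not the ASAP 2014 algorithm: only two toy bases are used, base extension is replaced by exact CRT reconstruction, and the names `split`, `BASE_A` and `BASE_B` are hypothetical.

```python
from math import prod

BASE_A = (13, 17)   # hypothetical base B_a, M_a = 221
BASE_B = (19, 23)   # hypothetical base B_b, M_b = 437


def to_rns(x, base):
    return [x % m for m in base]


def from_rns(residues, base):
    big_m = prod(base)
    total = sum(r * (big_m // m) * pow(big_m // m, -1, m)
                for r, m in zip(residues, base))
    return total % big_m


def split(x_a, x_b, base_a, base_b):
    """Return (K_x, R_x), both in base_b, such that X = K_x * M_a + R_x."""
    m_a = prod(base_a)
    # R_x = X mod M_a is exactly what the residues in base_a encode;
    # it is extended to base_b (here with an exact CRT stand-in for BE).
    r_b = to_rns(from_rns(x_a, base_a), base_b)
    # K_x = (X - R_x) / M_a is an exact division, done channel-wise in base_b.
    k_b = [((x - r) * pow(m_a, -1, m)) % m
           for x, r, m in zip(x_b, r_b, base_b)]
    return k_b, r_b


X = 54321  # any value below M_a * M_b = 96577
K_B, R_B = split(to_rns(X, BASE_A), to_rns(X, BASE_B), BASE_A, BASE_B)
```

In this sketch $K_x$ and $R_x$ each fit in a base half the size of the pair of bases holding $X$, which gives an intuition for how splitting can shrink the number of moduli involved in later steps.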

Cost for some patterns (# EMMs):

Operation                   State of the art    Ours
$AB \bmod P$                $2n^2 + 4n$         $2.5n^2 + 12.5n$
$A^2 \bmod P$               $2n^2 + 4n$         $1.75n^2 + 10.5n$
Cst $\times A \bmod P$      $2n^2 + 4n$         $1.75n^2 + 7n$
Cst $\times A^2 \bmod P$    $4n^2 + 8n$         $2.75n^2 + 16.5n$

Usage for Diffie-Hellman or ElGamal:

[Figure: ratio Our / Ref of the number of EMMs as a function of $n$ (10 to 70), in two panels: Expo. LSBF and Expo. Montg.]

[Figure: data flow over three bases $B_a$, $B_b$, $B_c$ through the steps SPLIT, PR and MR, distinguishing base extensions (BE) from computations in one base; in base $B_a$, $R_x = X_a$ and $R_y = Y_a$]

Funding from DGA-INRIA PhD grant and project PAVOIS ANR 12 BS02 002 01
