An introspective algorithm for the integer determinant

Jean-Guillaume Dumas, Anna Urbańska

Laboratoire Jean Kuntzmann, UMR CNRS 5224
Université Joseph Fourier, Grenoble I
BP 53X, 38041 Grenoble, FRANCE
{Jean-Guillaume.Dumas;Anna.Urbanska}@imag.fr
ljk.imag.fr/membres/{Jean-Guillaume.Dumas;Anna.Urbanska}

Abstract

We present an algorithm for computing the determinant of an integer matrix $A$. The algorithm is introspective in the sense that it uses several distinct algorithms that run in a concurrent manner. During the course of the algorithm partial results coming from distinct methods can be combined. Then, depending on the current running time of each method, the algorithm can emphasize a particular variant. With the use of very fast modular routines for linear algebra, our implementation is an order of magnitude faster than other existing implementations. Moreover, we prove that the expected complexity of our algorithm is only $O\left(n^3\log^{2.5}(n\|A\|)\right)$ bit operations in the case of random dense matrices, where $n$ is the dimension and $\|A\|$ is the largest entry in absolute value of the matrix.

1 Introduction

One has many alternatives to compute the determinant of an integer matrix.

Over a field, the computation of the determinant is tied to that of matrix multiplication via block recursive matrix factorizations [19]. On the one hand, over the integers, a naïve approach would induce a coefficient growth that would render the algorithm not even polynomial. On the other hand, over finite fields, one can nowadays reach the speed of numerical routines [12].

Therefore, the classical approach over the integers is to reduce the computation modulo some primes of constant size and to recover the integer determinant from the modular computations. For this, at least two variants are possible: Chinese remaindering and $p$-adic lifting.

The first variant requires either a good a priori bound on the size of the determinant or an early termination probabilistic argument [13, §4.2]. It thus achieves an output-dependent bit complexity of $O\left(\log(|\det(A)|)\left(n^\omega + n^2\log(\|A\|)\right)\right)$, where $\omega$ is the exponent of matrix multiplication ($\omega = 3$ for the classical algorithm and $2.375477$ for the Coppersmith-Winograd method, see [4]). Of course, with the coefficient growth, the determinant size can be as large as $O(n\log(n\|A\|))$ (Hadamard's bound), thus giving a large worst case complexity. The algorithm is of Monte Carlo type; its deterministic (always correct) version exists and has a complexity of $O\left(\left(n^{\omega+1} + n^3\log(\|A\|)\right)\log(n\|A\|)\right)$ bit operations.

The second variant uses system solving and $p$-adic lifting [6] to get a potentially large factor of the determinant with an $O\left(n^3\log^2(n\|A\|)\right)$ bit complexity. Indeed, every integer matrix is unimodularly equivalent to a diagonal matrix $S = \mathrm{diag}(s_1, \dots, s_n)$, where $s_i$ divides $s_{i+1}$. This means that there exist integer matrices $U$, $V$ with $\det U, \det V = \pm 1$, such that $A = USV$. The $s_i$ are called the invariant factors of $A$; in the presence of several matrices we will also use the notation $s_i(A)$. Solving a linear system with a random right hand side reveals $s_n$ as the common denominator of the solution vector entries with high probability, see [24, 1].
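As a toy illustration of this last fact (our own example, not from the paper): for

$$A = \begin{pmatrix}2 & 0\\ 0 & 6\end{pmatrix} = U\,\mathrm{diag}(2,6)\,V, \quad U = V = I_2, \qquad A\,x = \begin{pmatrix}1\\1\end{pmatrix} \;\Longrightarrow\; x = \begin{pmatrix}1/2\\ 1/6\end{pmatrix},$$

the common denominator of the entries of $x$ is $6 = s_2$, so here a single solving already reveals the largest invariant factor.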

The idea of [1] is thus to combine both approaches, i.e. to approximate the determinant by system solving and recover only the remaining part $(\det(A)/s_n)$ via Chinese remaindering. The Monte Carlo version of Chinese remaindering leads to an algorithm with the expected output-dependent bit complexity of $O\left(n^\omega\log\left(\left|\frac{\det(A)}{s_n}\right|\right) + n^3\log^2(n\|A\|)\right)$. We use the notion of the expected complexity to emphasize that it requires $\mathbb{E}\left(\log\left(\frac{s_n}{\tilde{s}_n}\right)\right)$ to be $O(1)$, where $\tilde{s}_n$ is the computed factor of $s_n$ and $\mathbb{E}$ denotes the expected value computed over all algorithm instances for a given matrix $A$.

Then G. Villard remarked that at most $O\left(\sqrt{\log(|\det(A)|)}\right)$ invariant factors can be distinct and that in some propitious cases we can expect that only the last $O(\log(n))$ of those are nontrivial [17]. This remark, together with a preconditioned $p$-adic solving to compute the $i$-th invariant factor, leads to an $O\left(n^{2+\frac{\omega}{2}}\log^{1.5}(n\|A\|)\log^{0.5}(n)\right)$ worst case Monte Carlo algorithm. Without fast matrix multiplication, the complexity of the algorithm becomes $O\left(n^{3.5}\log^{2.5}(n\|A\|)\log^2(n)\right)$. The expected number of invariant factors for a set of matrices with entries chosen randomly and uniformly from the set of consecutive integers $\{0, 1, \dots, \lambda - 1\}$ can be proven to be $O(\log(n))$. Thus, we can say that the expected complexity of the algorithm is $O\left(n^3\log^2(n\|A\|)\log(n)\log_\lambda(n)\right)$. Here, the term expected is used in a slightly different context than in algorithm [1] and describes the complexity in the case where the matrix has a propitious property, i.e., a small number of invariant factors.

In this paper we will prefer to use the notion of the expected rather than the average complexity. Formally, to compute the average complexity we have to average the running time of the algorithm over all input and algorithm instances. The common approach is thus to compute the expected outputs of the subroutines and use them in the complexity analysis. This allows us to deal easily with complex algorithms with many calls to subroutines which depend on randomization. The two approaches are equivalent when the dependency on the expected value is linear, which is often the case. However, we can imagine more complex cases of adaptive algorithms where the relation between average and expected complexity is not obvious. Nevertheless, we believe that the evaluation of the expected complexity gives a meaningful description of the algorithm. We emphasize the fact that the propitious input for which the analysis is valid can often be quickly detected at runtime.

Note that the actual best worst case complexity algorithm for dense matrices is $O^{\sim}\left(n^{2.7}\log(\|A\|)\right)$, which is $O^{\sim}\left(n^{3.2}\log(\|A\|)\right)$ without fast matrix multiplication, by [21]. We use the notation $O^{\sim}(N^\alpha\log(\|A\|))$, which is equivalent to $O(N^\alpha\log^\beta(N)\log(\|A\|))$ for some $\beta \ge 0$. Unfortunately, these last two worst case complexity algorithms, though asymptotically better than [17], are not the fastest for the generic case or for the actually attainable matrix sizes. The best expected complexity algorithm is the Las Vegas algorithm of Storjohann [26], which uses an expected number of $O\left(n^\omega\log(n\|A\|)\log^2(n)\right)$ bit operations. In section 5 we compare the performance of this algorithm (for both certified and not certified variants) to ours, based on the experimental results of [27].

In this paper, we propose a new way to extend the idea of [25, 28] to get the last consecutive invariant factors with high probability in section 3.2. Then we combine this with the scheme of [1].

This combination is made in an adaptive way. This means that the algorithm will choose the adequate variant at run-time, depending on discovered properties of its input. More precisely, in section 4, we propose an algorithm which uses timings of its first part to choose the best termination. This particular kind of adaptation was introduced in [23] as introspective; here we use the more specific definition of [5].

In section 4.2 we prove that the expected complexity of our algorithm is $O\left(n^3\log^2(n\|A\|)\sqrt{\log(n)}\right)$ bit operations in the case of dense matrices, gaining a $\log^{1.5}(n)$ factor compared to [17].

Moreover, we are able to detect the worst cases during the course of the algorithm and switch to the asymptotically fastest method. In general this last switch is not required and we show in section 5 that, when used with the very fast modular routines of [9, 12] and the LinBox library [10], our algorithm can be an order of magnitude faster than other existing implementations.

A preliminary version of this paper was presented at the Transgressive Computing 2006 conference [14]. Here we give better asymptotic results for the dense case, adapt our algorithm to the sparse case and give more experimental evidence.

2 Base Algorithms and Procedures

In this section we present the procedures in more detail and describe their probabilistic behavior. We start with a brief description of the properties of the Chinese Remaindering loop (CRA) with early termination (ET) (see [7]), then proceed with the LargestInvariantFactor algorithm to compute $s_n$ (see [1, 17, 25]). We end the section with a summary of the ideas of Abbott et al. [1], Eberly et al. [17] and Saunders et al. [25].

2.1 Output-dependent Chinese Remaindering Loop (CRA)

CRA is a procedure based on the Chinese remainder theorem. Determinants are computed modulo several primes $p_i$. Then the determinant is reconstructed modulo $p_0\cdots p_t$ in the symmetric range via the Chinese reconstruction. The integer value of the determinant is thus computed as soon as the product of the $p_i$ exceeds $2|\det(A)|$. We know that the product is sufficiently big if it exceeds some upper bound on this value or, probabilistically, if the reconstructed value remains identical for several successive additions of modular determinants. The principle of this early termination (ET) is thus to stop the reconstruction before reaching the upper bound, as soon as the determinant remains the same for several steps [7].

Algorithm 2.1 is an outline of a procedure to compute the determinant using CRA loops with early termination, correctly with probability $1 - \epsilon$. We start with a lemma.

Lemma 2.1. Let $H$ be an upper bound for the determinant (e.g. $H$ can be Hadamard's bound: $|\det(A)| \le (\sqrt{n}\|A\|)^n$). Suppose that distinct primes $p_i$ greater than $l > 0$ are randomly sampled from a set $P$ with $|P| \ge 2\lceil\log_l(H)\rceil$. Let $r_t$ be the value of the determinant modulo $p_0\cdots p_t$ computed in the symmetric range. We have:

(i) $r_t = \det(A)$ if $t \ge N$, where $N = \lceil\log_l(|\det(A)|)\rceil$ if $\det(A) \ne 0$ and $N = 0$ if $\det(A) = 0$;

(ii) if $r_t \ne \det(A)$, then there are at most $R = \left\lceil\log_l\left(\frac{|\det(A) - r_t|}{p_0\cdots p_t}\right)\right\rceil$ primes $p_{t+1}$ such that $r_t = \det(A) \bmod p_0\cdots p_t p_{t+1}$;

(iii) if $r_t = r_{t+1} = \cdots = r_{t+k}$ and $\frac{R'(R'-1)\cdots(R'-k+1)}{(|P|-t-1)\cdots(|P|-t-k)} < \epsilon$, where $R' = \left\lceil\log_l\frac{H + |r_t|}{p_0p_1\cdots p_t}\right\rceil$, then $P(r_t \ne \det(A)) < \epsilon$;

(iv) if $r_t = r_{t+1} = \cdots = r_{t+k}$ and $k \ge \left\lceil\frac{\log(1/\epsilon)}{\log(P') - \log(\log_l(H))}\right\rceil$, where $P' = |P| - \lceil\log_l(H)\rceil$, then $P(r_t \ne \det(A)) < \epsilon$.

Proof. For (i), notice that $-\lfloor\frac{p_0\cdots p_t}{2}\rfloor \le r_t < \lceil\frac{p_0\cdots p_t}{2}\rceil$. Then $r_t = \det(A)$ as soon as $p_0\cdots p_t \ge 2|\det(A)|$. With $l$ being the lower bound for the $p_i$, this reduces to $t \ge \lceil\log_l|\det(A)|\rceil$ when $\det(A) \ne 0$.

For (ii), we observe that $\det(A) = r_t + Kp_0\cdots p_t$ and it suffices to estimate the number of primes greater than $l$ dividing $K$.

For (iii), we notice that $k$ primes dividing $K$ are to be chosen with probability $\binom{R}{k}\big/\binom{|P|-(t+1)}{k}$. Applying the bound $R'$ for $R$ leads to the result.

For (iv), we notice that the latter is bounded by $\left(\frac{R'}{P'}\right)^k$ since $R' \le \left\lceil\log_l\left(\frac{2H}{2}\right)\right\rceil \le P'$. Solving for $k$ the inequality $\left(\frac{R'}{P'}\right)^k < \epsilon$ gives the result.

The last two points of the lemma give the stopping condition for early termination. The condition (iii) can be computed on-the-fly (as in Algorithm 2.1). As a default value and for simplicity, (iv) can also be used.

To compute the modular determinant in Algorithm 2.1 we use the LU factorization modulo $p_i$. Its complexity is $O\left(n^\omega + n^2\log(\|A\|)\right)$.

Early termination is particularly useful in the case when the computed determinant is much smaller than the a priori bound. The running time of this procedure is output dependent.

Algorithm 2.1 Early Terminated CRA

Require: an $n\times n$ integer matrix $A$.
Require: $0 < \epsilon < 1$.
Require: $H$, Hadamard's bound ($H = (\sqrt{n}\|A\|)^n$).
Require: $l > 0$, a set $P$ of random primes greater than $l$, $|P| \ge 2\lceil\log_l(H)\rceil$.
Ensure: the integer determinant of $A$, correct with probability at least $1 - \epsilon$.

1: $i = 0$;
2: repeat
3: Choose uniformly and randomly a prime $p_i$ from the set $P$;
4: $P = P \setminus \{p_i\}$;
5: Compute $\det(A) \bmod p_i$;
6: Reconstruct $r_i$, the determinant modulo $p_0\cdots p_i$; // by Chinese remaindering
7: $k = \max\{t : r_{i-t} = \cdots = r_i\}$; $R' = \left\lceil\log_l\frac{H + |r_{i-k}|}{p_0p_1\cdots p_i}\right\rceil$;
8: Increment $i$;
9: until $\frac{R'(R'-1)\cdots(R'-k+1)}{(|P|-i+k-1)\cdots(|P|-i)} < \epsilon$ or $\prod p_i \ge 2H$
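To make the loop concrete, here is a minimal Python sketch of an early terminated CRA (our own illustration, not the paper's LinBox implementation). It simplifies the stopping rule: a fixed stability threshold stands in for the probabilistic conditions (iii)/(iv) of Lemma 2.1, primes are taken from sympy's primerange rather than sampled from a random set P, and the fallback stop at $\prod p_i \ge 2H$ is omitted.

```python
# Sketch of Algorithm 2.1 (early-terminated CRA); illustration only.
from sympy import primerange

def det_mod_p(A, p):
    """Determinant of A modulo p by Gaussian elimination over Z/pZ."""
    M = [[x % p for x in row] for row in A]
    n, det = len(M), 1
    for j in range(n):
        pivot = next((i for i in range(j, n) if M[i][j] != 0), None)
        if pivot is None:
            return 0
        if pivot != j:
            M[j], M[pivot] = M[pivot], M[j]
            det = -det % p                      # row swap flips the sign
        det = det * M[j][j] % p
        inv = pow(M[j][j], -1, p)
        for i in range(j + 1, n):
            f = M[i][j] * inv % p
            for c in range(j, n):
                M[i][c] = (M[i][c] - f * M[j][c]) % p
    return det

def cra_det(A, start_prime=2**20, stable_steps=3):
    """Early-terminated CRA: stop once the symmetric-range residue is stable."""
    m, r, stable = 1, 0, 0
    for p in primerange(start_prime, 2 * start_prime):
        d = det_mod_p(A, p)
        # incremental CRT: lift r (mod m) to r' (mod m*p) with r' = d (mod p)
        t = (d - r) * pow(m, -1, p) % p
        r_new = r + m * t
        m *= p
        if r_new > m // 2:                      # symmetric range
            r_new -= m
        stable = stable + 1 if r_new == r else 0
        r = r_new
        if stable >= stable_steps:              # early termination
            return r
```

For instance, `cra_det([[2, 0], [0, 6]])` returns 12 after a few primes; a full implementation would also certify termination once the product of the primes exceeds $2H$.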

2.2 Largest Invariant Factor

A method to compute $s_n$ for integer matrices was first stated by V. Pan [24] and later in the form of the LargestInvariantFactor procedure (LIF) in [1, 17, 7, 25]. The idea is to obtain a divisor of $s_n$ by computing a rational solution of the linear system $Ax = b$. If $b$ is chosen uniformly and randomly from a sufficiently large set of contiguous integers, then the computed divisor can be as close as possible to $s_n$ with high probability. Indeed, with $A = USV$, we can equivalently solve $SVx = U^{-1}b$ for $y = Vx$, and then solve for $x$. As $U$ and $V$ are unimodular, the least common multiples $d(x)$ and $d(y)$ of the denominators of the entries of $x$ and $y$ satisfy $d(x) = d(y) \mid s_n$.

Thus, solving $Ax = b$ enables us to get $s_n$ with high probability. The cost of solving using Dixon's $p$-adic lifting [6] is $O\left(n^3\log^2(n\|A\|) + n\log^2(\|b\|)\right)$, as stated by [22].

The algorithm takes as input parameters $\beta$ and $r$ which are used to control the probability of correctness; $r$ is the number of successive solvings and $\beta$ is the size of the set from which the values of a random vector $b$ are chosen, i.e. a bound for $\|b\|$. With each system solving, the output $\tilde{s}_n$ of the algorithm is updated as the lcm of the current solution denominator $d(x)$ and the result obtained so far.
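The following Python sketch illustrates the LIF idea (our own code, with $r$ solvings and right hand sides bounded by $\beta$ as above). It uses exact fraction arithmetic instead of Dixon's $p$-adic lifting, so its running time is far from the $O(n^3\log^2(n\|A\|))$ bound; it only shows the mechanism.

```python
# Sketch of the LargestInvariantFactor (LIF) procedure; illustration only.
from fractions import Fraction
from math import lcm
import random

def solve_rational(A, b):
    """Solve A x = b over the rationals by Gauss-Jordan elimination (A nonsingular)."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    for j in range(n):
        piv = next(i for i in range(j, n) if M[i][j] != 0)
        M[j], M[piv] = M[piv], M[j]
        M[j] = [x / M[j][j] for x in M[j]]      # normalize the pivot row
        for i in range(n):
            if i != j and M[i][j]:
                M[i] = [u - M[i][j] * v for u, v in zip(M[i], M[j])]
    return [M[i][n] for i in range(n)]

def largest_invariant_factor(A, r=2, beta=1000):
    """Divisor of s_n(A): lcm of solution denominators over r random right hand sides."""
    n, sn = len(A), 1
    for _ in range(r):
        b = [random.randrange(beta) for _ in range(n)]
        x = solve_rational(A, b)
        sn = lcm(sn, *(xi.denominator for xi in x))
    return sn

# largest_invariant_factor([[2, 0], [0, 6]])  -> 6 with high probability
```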


The following theorem characterizes the probabilistic behavior of the LIF procedure.

Theorem 2.2. Let $A$ be an $n\times n$ matrix, $H$ its Hadamard bound, and let $r$ and $\beta$ be defined as above. Then the output $\tilde{s}_n$ of Algorithm LargestInvariantFactor of [1] is characterized by the following properties:

(i) if $r = 1$, $p$ is a prime and $l \ge 1$, then $P\left(p^l \,\Big|\, \frac{s_n}{\tilde{s}_n}\right) \le \frac{1}{\beta}\left\lceil\frac{\beta}{p^l}\right\rceil$;

(ii) if $r = 2$, $\beta = \lceil\log(H)\rceil$, then $\mathbb{E}\left(\log\left(\frac{s_n}{\tilde{s}_n}\right)\right) = O(1)$;

(iii) if $r = 2$, $\beta = 6 + \lceil 2\log(H)\rceil$, then $s_n = \tilde{s}_n$ with probability at least $1/3$;

(iv) if $r = \lceil 2\log(\log(H))\rceil$, $\beta \ge 2$, then $\mathbb{E}\left(\log\left(\frac{s_n}{\tilde{s}_n}\right)\right) = O(1)$;

(v) if $r = \left\lceil\log(\log(H)) + \log\left(\frac{1}{\epsilon}\right)\right\rceil$, $2 \mid \beta$ and $\beta \ge 2$, then $s_n = \tilde{s}_n$ with probability at least $1 - \epsilon$.

Proof. The proofs of (i) and (iv) are in [1, Thm. 2, Lem. 2]. The proof of (iii) is in [17, Thm. 2.1]. To prove (ii) we adapt the proof of (iii). The expected value of the under-approximation of $s_n$ is bounded by the formula

$$\sum_{p \mid s_n}\ \sum_{k=1}^{\lfloor\log_p(s_n)\rfloor} \log(p)\left(\frac{1}{\beta}\left\lceil\frac{\beta}{p^k}\right\rceil\right)^2,$$

where the sum is taken over all primes dividing $s_n$. As $\frac{1}{\beta}\left\lceil\frac{\beta}{p^k}\right\rceil$ is bounded by $\frac{1}{\beta} + \frac{1}{p^k}$, this can be further expressed as

$$\sum_{p\ \text{prime}}\ \sum_{k=1}^{\infty}\frac{\log(p)}{p^{2k}} + \frac{2}{\beta}\sum_{p\mid s_n}\sum_{k=1}^{\infty}\frac{\log(p)}{p^k} + \frac{1}{\beta^2}\sum_{p\mid s_n}\ \sum_{k=1}^{\lfloor\log_p(s_n)\rfloor}\log(p)$$
$$\le \sum_{p\ \text{prime}}\frac{\log(p)}{p^2-1} + \frac{2}{\beta}\sum_{p\mid s_n}\frac{\log(p)}{p-1} + \frac{1}{\beta^2}\sum_{p\mid s_n}\log(p)\log_p(s_n) \le 1.78 + \frac{2\log(s_n)}{\beta} + \frac{\log^2(s_n)}{\beta^2} \le 5 \in O(1).$$

To prove (v) we slightly modify the proof of (iv) in the following manner. From (i) we notice that for every prime $p$ dividing $s_n$, the probability that it divides the missed part of $s_n$ satisfies

$$P\left(p \,\Big|\, \frac{s_n}{\tilde{s}_n}\right) \le \left(\frac{1}{2}\right)^r.$$

As there are at most $\log(H)$ such primes, we get

$$P(s_n = \tilde{s}_n) \ge 1 - \log(H)\left(\frac{1}{2}\right)^r \ge 1 - \log(H)\,2^{-\log(\log(H)) - \log\left(\frac{1}{\epsilon}\right)} = 1 - \log(H)\frac{1}{\log(H)}\epsilon = 1 - \epsilon.$$

Remark 2.3. Theorem 2.2 enables us to produce a LIF procedure which gives an output $\tilde{s}$ close to $s_n$ with time complexity $O\left(n^3(\log(n) + \log(\|A\|))^2\right)$ (see (ii)).

2.3 Abbott-Bronstein-Mulders, Saunders-Wan and Eberly-Giesbrecht-Villard ideas

Now, the idea of [1] is to combine both the Chinese remainder and the LIF approach. Indeed, one can first compute $s_n$ and then reconstruct only the remaining factors of the determinant by reconstructing $\det(A)/s_n$. The expected complexity of this algorithm is $O\left(n^\omega\log(|\det(A)/s_n|) + n^3\log^2(n\|A\|)\right)$, which is unfortunately $O\left(n^{\omega+1}\right)$ in the worst case.

Now Saunders and Wan [25, 28] proposed a way to compute not only $s_n$ but also $s_{n-1}$ (which they call a bonus) in order to reduce the size of the remaining factor $\det(A)/(s_ns_{n-1})$. The complexity doesn't change.

Then, Eberly, Giesbrecht and Villard have shown that for the dense case the expected number of non-trivial invariant factors is small, namely less than $3\lceil\log_\lambda(n)\rceil + 29$ if the entries of the matrix are chosen uniformly and randomly in a set of $\lambda$ consecutive integers [17]. As they also give a way to compute any $s_i$, this leads to an algorithm with the expected complexity $O\left(n^3\log^2(n\|A\|)\log(n)\log_\lambda(n)\right)$.

Our analysis yields that the bound on the expected number of invariant factors for a random dense matrix can be refined as $O\left(\log^{0.5}(n)\right)$.

Then our idea is to extend the method of Saunders and Wan to get the last invariant factors of A slightly faster than by [17]. Moreover, we will show in the following sections that we are able to build an adaptive algorithm solving a minimal number of systems.

The analysis also yields that it should be possible to change a $\log(n)$ factor in the expected complexity of [17] to a $\log(\log(n))$. This would require a small modification of the algorithm and a careful analysis. Assuming that the number of invariant factors is the expected one, i.e. it equals $N = O(\log(n))$, we can verify the hypothesis by computing the $(n-N-1)$-th factor. If it is trivial, the binary search is done among $O(\log(n))$ elements and there are only $O(\log(n))$ factors to compute, which allows us to lessen the probability of correctness required of each OIF procedure. Thus, in the propitious case, the expected complexity of the algorithm would be $O\left(n^3\log^2(n\|A\|)\log_\lambda(n)\log^2(\log(n))\right)$. However, this cannot be considered as the average complexity in the ordinary sense since we do not average over all possible inputs in the analysis.

3 Computing the product of the $O(\log(n))$ last invariant factors

3.1 On the number of invariant factors

The result in [17] says that an $n\times n$ matrix with entries chosen randomly and uniformly from a set of size $\lambda$ has an expected number of invariant factors bounded by $3\lceil\log_\lambda(n)\rceil + 29$. In search of some sharpening of this result we prove the following theorems.

Theorem 3.1. Let $A$ be an $n\times n$ matrix with entries chosen randomly and uniformly from the set of contiguous integers $\left\{-\left\lfloor\frac{\lambda}{2}\right\rfloor, \dots, \left\lceil\frac{\lambda}{2}\right\rceil\right\}$. Let $p$ be a prime. The expected number of non-trivial invariant factors of $A$ divisible by $p$ is at most 4.

Theorem 3.2. Let $A$ be an $n\times n$ matrix with entries chosen randomly and uniformly from the set $\left\{-\left\lfloor\frac{\lambda}{2}\right\rfloor, \dots, \left\lceil\frac{\lambda}{2}\right\rceil\right\}$. The expected number of non-trivial invariant factors of $A$ is at most $\left\lceil\sqrt{2\log_\lambda(n)}\right\rceil + 3$.

In order to prove the theorems stated above, we start with the following lemmas.

Lemma 3.3. If $j > 1$, the sum $\sum_{8 < p < \lambda}\left(\frac{1}{\lambda}\left\lceil\frac{\lambda}{p}\right\rceil\right)^j$ over primes $p$ can be upper bounded by $\left(\frac{1}{2}\right)^j$.

Proof. We consider separately the primes from the intervals $\frac{\lambda}{2^{k+1}} \le p < \frac{\lambda}{2^k}$, $k = 0, 1, \dots, k_{\max}$. The value of $k_{\max}$ is computed from the condition $p > 8$ and is equal to $\lceil\log(\lambda)\rceil - 4$. For the $k$-th interval, $\left\lceil\frac{\lambda}{p}\right\rceil$ is less than or equal to $2^{k+1}$. In each interval there are at most $\left\lceil\frac{\lambda}{2^{k+2}}\right\rceil$ odd numbers and at most $\frac{\lambda}{2^{k+2}}$ primes: if in the interval there are more than 3 odd numbers, at least one of them is divisible by 3 and is therefore composite. For this to happen it is enough that $\left\lceil\frac{\lambda}{2^{k_{\max}+2}}\right\rceil \ge 3$, which is the case. We may therefore calculate:

$$\sum_{8<p<\lambda}\left(\frac{1}{\lambda}\left\lceil\frac{\lambda}{p}\right\rceil\right)^j \le \sum_{k=0}^{\lceil\log(\lambda)\rceil-4}\frac{\lambda}{2^{k+2}}\left(\frac{2^{k+1}}{\lambda}\right)^j \le \sum_{k=0}^{\lceil\log(\lambda)\rceil-4}\frac{1}{2}\left(\frac{2^{k+1}}{\lambda}\right)^{j-1}$$
$$= \frac{1}{2\lambda^{j-1}}\sum_{k=0}^{\lceil\log(\lambda)\rceil-4}\left(2^{k+1}\right)^{j-1} \le \frac{1}{2\lambda^{j-1}}\left(2^{\lceil\log(\lambda)\rceil-2}\right)^{j-1} \le \frac{1}{2\lambda^{j-1}}\left(2^{\log(\lambda)-1}\right)^{j-1} = \left(\frac{1}{2}\right)^j.$$

Remark 3.4. For $\lambda = 2^l$, $k$ can be allowed to run from 0 up to $l - 3$ instead of $\lceil\log(\lambda)\rceil - 4$, and we can include more primes in the sum. As a result we obtain the inequality $\sum_{4<p<2^l}\left(\frac{1}{2^l}\left\lceil\frac{2^l}{p}\right\rceil\right)^j \le \left(\frac{1}{2}\right)^j$.

Lemma 3.5. Let $A$ be a $k\times n$, $k \le n$, integer matrix with entries chosen uniformly and randomly from the set $\left\{-\left\lfloor\frac{\lambda}{2}\right\rfloor, \dots, \left\lceil\frac{\lambda}{2}\right\rceil\right\}$. The probability that $\mathrm{rank}_p(A)$, the rank modulo $p$ of $A$, equals $j$, $0 < j \le k$, satisfies

$$P(\mathrm{rank}_p(A) = j) \le \prod_{i=0}^{j-1}\left(1 - \alpha^{n-i}\right)\cdot\beta^{(n-j)(k-j)}\cdot\left(\frac{1}{1-\beta}\right)^{\max(k-j-1,\,0)}\left(1 + \beta + \dots + \beta^{k-j}\right) \le \beta^{(n-j)(k-j)}\left(\frac{1}{1-\beta}\right)^{k-j}, \quad (1)$$

where $\alpha = \frac{1}{\lambda+1}\left\lfloor\frac{\lambda+1}{p}\right\rfloor$ and $\beta = \frac{1}{\lambda+1}\left\lceil\frac{\lambda+1}{p}\right\rceil$.

The proof of the lemma is given in Appendix A.1.

Proof (Theorem 3.1). The idea of the proof is similar to that of [17, Thm. 6.2].

For $k \ge j$ let $\mathrm{MDep}_k(p, j)$ denote the event that the first $k$ columns of $A \bmod p$ have rank at most $k - j$ over $\mathbb{Z}_p$. By $I_j(p)$ we denote the event that at least $j$ invariant factors of $A$ are divisible by $p$. This implies that the first columns of $A$ have rank at most $n - j$ mod $p$, or that $\mathrm{MDep}_{n-j+k}(p, k)$ has occurred for all $k = 0, \dots, j$. This proves in particular that $P(I_j(p)) \le P(\mathrm{MDep}_n(p, j))$.

In order to compute the probability $P(\mathrm{MDep}_s(p, j))$, $s \ge j$, we notice that it is less than or equal to

$$P(\mathrm{MDep}_s(p,j)) \le P\left(\mathrm{MDep}_j(p,j)\right) + \sum_{k=j+1}^{s}P\left(\mathrm{MDep}_k(p,j) \wedge \neg\mathrm{MDep}_{k-1}(p,j)\right).$$

Surely, $\mathrm{MDep}_j(p,j)$ means that the first $j$ columns of $A$ are $0$ mod $p$, and consequently the probability is less than or equal to $\beta_p^{jn}$, where the value $\beta_p = \frac{1}{\lambda+1}\left\lceil\frac{\lambda+1}{p}\right\rceil$ is a bound on the probability that an entry of the matrix is determined modulo $p$; it is set to $(\lambda+1)^{-1}$ if $p \ge \lambda + 1$ and is less than or equal to $\frac{2}{p+1}$ in the case $p < \lambda + 1$.

We are now going to find $P\left(\mathrm{MDep}_k(p,j) \wedge \neg\mathrm{MDep}_{k-1}(p,j)\right)$ for $k > j$. Since the event $\mathrm{MDep}_{k-1}(p,j)$ did not occur, $A_{k-1}$ has rank modulo $p$ at least $(k-j)$ and of course at most $(k-1)$. For $\mathrm{MDep}_k(p,j)$ to occur it must be exactly $(k-j)$. This means that we can rewrite $P\left(\mathrm{MDep}_k(p,j) \wedge \neg\mathrm{MDep}_{k-1}(p,j)\right)$ as

$$P\left(\mathrm{MDep}_k(p,j) \mid \mathrm{rank}_p(A_{k-1}) = k-j\right)\cdot P\left(\mathrm{rank}_p(A_{k-1}) = k-j\right),$$

where $\mathrm{rank}_p(A_{k-1})$ denotes the rank modulo $p$ of the submatrix $A_{k-1}$ of $A$, which consists of its first $(k-1)$ columns.

Since the rank modulo $p$ of $A_{k-1}$ is equal to $k-j$, there exists a set of $k-j$ rows $L_{k-j}$ which has full rank mod $p$. This means that we can choose $k-j$ entries of the $k$-th column randomly, but the remaining $n-k+j$ entries will be determined modulo $p$. This leads to the inequality

$$P\left(\mathrm{MDep}_k(p,j) \mid \mathrm{rank}_p(A_{k-1}) = k-j\right) \le \beta_p^{n-k+j}.$$

By Lemma 3.5 we have $P\left(\mathrm{rank}_p(A_{k-1}) = k-j\right) \le \left(\frac{1}{1-\beta_p}\right)^{j-1}\beta_p^{(n-k+j)(j-1)}$.

Finally, we get

$$P\left(\mathrm{MDep}_k(p,j) \wedge \neg\mathrm{MDep}_{k-1}(p,j)\right) \le \left(\frac{1}{1-\beta_p}\right)^{j-1}\beta_p^{(n-k+j)j} \quad (2)$$

and

$$P(\mathrm{MDep}_s(p,j)) \le \left(\frac{1}{1-\beta_p}\right)^{j-1}\sum_{k=j}^{s}\beta_p^{(n-k+j)j} < \left(\frac{1}{1-\beta_p}\right)^{j-1}\beta_p^{j(n-s+j)}\frac{1}{1-\beta_p^j}. \quad (3)$$

The expected number of invariant factors divisible by $p < \lambda$ verifies:

$$\sum_{j=0}^{n}j\left(P(I_j(p)) - P(I_{j+1}(p))\right) = \sum_{j=1}^{n}P(I_j(p)) \le \sum_{j=1}^{n}P(\mathrm{MDep}_n(p,j)) \le \sum_{j=1}^{n}\left(\frac{p+1}{p-1}\right)^{j-1}\left(\frac{2}{p+1}\right)^{j^2}\frac{(p+1)^j}{(p+1)^j - 2^j}.$$

The latter is decreasing in $p$ and therefore less than its value at $p = 2$, which is lower than 3.46.

For $p \ge \lambda + 1$ the result is even sharper:

$$\sum_{j=0}^{n}j\left(P(I_j(p)) - P(I_{j+1}(p))\right) \le \sum_{j=1}^{n}\left(\frac{\lambda+1}{\lambda}\right)^{j-1}\left(\frac{1}{\lambda+1}\right)^{j^2}\frac{(\lambda+1)^j}{(\lambda+1)^j - 1},$$

the latter being lower than 1.18 for $\lambda \ge 1$.

Proof (Theorem 3.2). In addition to $\mathrm{MDep}_k(p, j)$ introduced earlier, let $\mathrm{Dep}_k$ denote the event that the first $k$ columns of $A$ are linearly dependent (over the rationals), and $\mathrm{MDep}_k(j)$ the event that $\mathrm{MDep}_k(p, j)$ occurred for some prime $p$.

Recall from [17, §6] that

$$P(\mathrm{Dep}_1) \le (\lambda+1)^{-n}, \qquad P\left(\mathrm{Dep}_k \wedge \neg\mathrm{Dep}_{k-1}\right) \le P\left(\mathrm{Dep}_k \mid \neg\mathrm{Dep}_{k-1}\right) \le (\lambda+1)^{-n+k-1}.$$

This gives $P(\mathrm{Dep}_k) \le \frac{1}{(\lambda+1)^n} + \dots + \frac{1}{(\lambda+1)^{n-k+1}}$, which is less than $\frac{1}{(\lambda+1)^{n-k+1}}\frac{\lambda+1}{\lambda}$.

As in the previous proof, the probability that the number of non-trivial invariant factors is at least $j$ (event $I_j$) is lower than $P\left(\mathrm{MDep}_{n-j+k}(k) \vee \mathrm{Dep}_{n-j+1}\right)$ for all $k = 0, \dots, j$. The latter can be transformed to $P\left((\mathrm{MDep}_{n-j+k}(k) \wedge \neg\mathrm{Dep}_{n-j+1}) \vee \mathrm{Dep}_{n-j+1}\right)$, and both $P\left(\mathrm{MDep}_{n-j+k}(k) \wedge \neg\mathrm{Dep}_{n-j+1}\right)$ and $P\left(\mathrm{Dep}_{n-j+1}\right)$ can be treated separately.

To compute $P\left(\mathrm{MDep}_{n-j+k}(k) \wedge \neg\mathrm{Dep}_{n-j+1}\right)$ we will sum $P\left(\mathrm{MDep}_{n-j+k}(p, k)\right)$ over all possible primes. Since $\mathrm{Dep}_{n-j+1}$ does not hold, there exists an $(n-j+1)\times(n-j+1)$ non-zero minor, and we have to sum over the primes which divide it. We will treat separately the primes $p < \lambda + 1$ and $p \ge \lambda + 1$. Once again we set $\beta_p = \frac{2}{p+1}$ for $p < \lambda + 1$ and $\beta_p = \frac{1}{\lambda+1}$ for $p \ge \lambda + 1$.

By (3) we have

$$\sum_{p<\lambda}P(\mathrm{MDep}_{n-j+k}(p,k)) < \left(\frac{1}{1-\beta_2}\right)^{k-1}\frac{\beta_2^{kj}}{1-\beta_2^k} + \left(\frac{1}{1-\beta_3}\right)^{k-1}\frac{\beta_3^{kj}}{1-\beta_3^k} + \left(\frac{1}{1-\beta_5}\right)^{k-1}\frac{\beta_5^{kj}}{1-\beta_5^k} + \left(\frac{1}{1-\beta_7}\right)^{k-1}\frac{\beta_7^{kj}}{1-\beta_7^k} + \sum_{8<p<\lambda}\left(\frac{1}{1-\beta_p}\right)^{k-1}\frac{\beta_p^{kj}}{1-\beta_p^k}.$$

This transforms to

$$\sum_{p<\lambda}P(\mathrm{MDep}_{n-j+k}(p,k)) \le 3^{k-1}\left(\frac{2}{3}\right)^{kj}\frac{3^k}{3^k-2^k} + 2^{k-1}\left(\frac{1}{2}\right)^{kj}\frac{2^k}{2^k-1} + \left(\frac{3}{2}\right)^{k-1}\left(\frac{1}{3}\right)^{kj}\frac{3^k}{3^k-1} + \left(\frac{4}{3}\right)^{k-1}\left(\frac{1}{4}\right)^{kj}\frac{4^k}{4^k-1} + \left(\frac{6}{5}\right)^{k-1}\frac{6^k}{6^k-1}\sum_{8<p<\lambda+1}\left(\frac{1}{\lambda+1}\left\lceil\frac{\lambda+1}{p}\right\rceil\right)^{kj}.$$

Thanks to Lemma 3.3, the sum $\sum_{8<p<\lambda+1}\left(\frac{1}{\lambda+1}\left\lceil\frac{\lambda+1}{p}\right\rceil\right)^{kj}$ can be bounded by $\left(\frac{1}{2}\right)^{kj}$.

For primes $p \ge \lambda + 1$ we should estimate the number of primes dividing the $(n-j+1)$-th minor. By Hadamard's bound (notice that $\mathrm{Dep}_{n-j+1}$ does not hold), the minors are bounded in absolute value by $\left((n-j+1)\left(\frac{\lambda+1}{2}\right)^2\right)^{\frac{n-j+1}{2}}$. Therefore the number of primes $p \ge \lambda + 1$ dividing the minor is at most $\frac{n}{2}\left(\log_{\lambda+1}(n) + 2\right)$. Summarizing,

$$P\left((\mathrm{MDep}_{n-j+k}(k) \wedge \neg\mathrm{Dep}_{n-j+1}) \vee \mathrm{Dep}_{n-j+1}\right) \le \left(\frac{2}{3}\right)^{kj}\frac{3^{2k-1}}{3^k-2^k} + \left(\frac{1}{2}\right)^{kj}\frac{2^{2k-1}}{2^k-1} + \left(\frac{3}{2}\right)^{k-1}\left(\frac{1}{3}\right)^{kj}\frac{3^k}{3^k-1} + \left(\frac{4}{3}\right)^{k-1}\left(\frac{1}{4}\right)^{kj}\frac{4^k}{4^k-1}$$
$$\qquad + \left(\frac{6}{5}\right)^{k-1}\frac{6^k}{6^k-1}\left(\frac{1}{2}\right)^{kj} + \frac{n}{2}\left(\log_{\lambda+1}(n)+2\right)\left(\frac{\lambda+1}{\lambda}\right)^{k-1}\frac{1}{(\lambda+1)^{jk}}\frac{(\lambda+1)^k}{(\lambda+1)^k-1} + \frac{\lambda+1}{\lambda}(\lambda+1)^{-(n-j+1)}.$$

We can now compute the expected number of non-trivial invariant factors. Let us fix $h = \max\left(2, \left\lceil\sqrt{2\log_{\lambda+1}(n)}\right\rceil\right)$. We have that, in particular, $P(I_j)$ is less than $P\left((\mathrm{MDep}_{n-j+h}(h) \wedge \neg\mathrm{Dep}_{n-j+1}) \vee \mathrm{Dep}_{n-j+1}\right)$. We can check that $h^2 \ge \log_{\lambda+1}(n) + \log_{\lambda+1}\left(\log_{\lambda+1}(n) + 2\right)$. This gives also $(\lambda+1)^{h^2} > n\left(\log_{\lambda+1}(n) + 2\right)$ and

$$1 > \frac{n}{2}\left(\log_{\lambda+1}(n) + 2\right)\left(\frac{\lambda+1}{\lambda}\right)^{h-1}\frac{(\lambda+1)^{2h}}{\left((\lambda+1)^h-1\right)^2}\frac{1}{(\lambda+1)^{h(h+1)}}.$$

The expected number of non-trivial invariant factors is bounded by:

$$\sum_{j=1}^{h}1 + \sum_{j=h+1}^{n}P\left((\mathrm{MDep}_{n-j+h}(h) \wedge \neg\mathrm{Dep}_{n-j+1}) \vee \mathrm{Dep}_{n-j+1}\right),$$

which in turn is bounded by

$$h + \sum_{j=h+1}^{n}\Bigg(\left(\frac{2}{3}\right)^{hj}\frac{3^{2h-1}}{3^h-2^h} + \left(\frac{1}{2}\right)^{hj}\frac{2^{2h-1}}{2^h-1} + \left(\frac{3}{2}\right)^{h-1}\left(\frac{1}{3}\right)^{hj}\frac{3^h}{3^h-1} + \left(\frac{4}{3}\right)^{h-1}\left(\frac{1}{4}\right)^{hj}\frac{4^h}{4^h-1} + \left(\frac{6}{5}\right)^{h-1}\frac{6^h}{6^h-1}\left(\frac{1}{2}\right)^{hj}$$
$$\qquad + \frac{n}{2}\left(\log_{\lambda+1}(n)+2\right)\left(\frac{\lambda+1}{\lambda}\right)^{h-1}\frac{1}{(\lambda+1)^{hj}}\frac{(\lambda+1)^h}{(\lambda+1)^h-1} + \frac{\lambda+1}{\lambda}(\lambda+1)^{-(n-j+1)}\Bigg)$$
$$\le h + \frac{3^{3h-1}}{(3^h-2^h)^2}\left(\frac{2}{3}\right)^{h(h+1)} + \frac{2^{3h-1}}{(2^h-1)^2}\left(\frac{1}{2}\right)^{h(h+1)} + \left(\frac{3}{2}\right)^{h-1}\frac{3^{2h}}{(3^h-1)^2}\left(\frac{1}{3}\right)^{h(h+1)} + \left(\frac{4}{3}\right)^{h-1}\frac{4^{2h}}{(4^h-1)^2}\left(\frac{1}{4}\right)^{h(h+1)} + \left(\frac{6}{5}\right)^{h-1}\frac{6^h}{6^h-1}\frac{2^h}{2^h-1}\left(\frac{1}{2}\right)^{h(h+1)}$$
$$\qquad + \frac{n}{2}\left(\log_{\lambda+1}(n)+2\right)\left(\frac{\lambda+1}{\lambda}\right)^{h-1}\frac{(\lambda+1)^{2h}}{\left((\lambda+1)^h-1\right)^2}\frac{1}{(\lambda+1)^{h(h+1)}} + \left(\frac{\lambda+1}{\lambda}\right)^2\frac{1}{\lambda+1}$$
$$\le h + f(n,\lambda) + \frac{n}{2}\left(\log_{\lambda+1}(n)+2\right)\left(\frac{\lambda+1}{\lambda}\right)^{h-1}\frac{(\lambda+1)^{2h}}{\left((\lambda+1)^h-1\right)^2}\frac{1}{(\lambda+1)^{h(h+1)}} \le h + f(n,\lambda) + 1,$$

where

$$f(n,\lambda) = \frac{3^{3h-1}}{(3^h-2^h)^2}\left(\frac{2}{3}\right)^{h(h+1)} + \frac{2^{3h-1}}{(2^h-1)^2}\left(\frac{1}{2}\right)^{h(h+1)} + \left(\frac{3}{2}\right)^{h-1}\frac{3^{2h}}{(3^h-1)^2}\left(\frac{1}{3}\right)^{h(h+1)} + \left(\frac{4}{3}\right)^{h-1}\frac{4^{2h}}{(4^h-1)^2}\left(\frac{1}{4}\right)^{h(h+1)} + \left(\frac{6}{5}\right)^{h-1}\frac{6^h}{6^h-1}\frac{2^h}{2^h-1}\left(\frac{1}{2}\right)^{h(h+1)} + \left(\frac{\lambda+1}{\lambda}\right)^2\frac{1}{\lambda+1}$$

is less than 2 as soon as $\lambda \ge 2$. On the other hand, $f(n, 1) \le \left(\frac{2}{2-1}\right)^2\frac{1}{2} \le 2$, which leads to the final result.
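As a quick empirical sanity check of Theorem 3.2 (our own experiment sketch, not part of the paper), one can count the non-trivial invariant factors of random matrices with sympy's Smith normal form and compare the average with $\lceil\sqrt{2\log_\lambda(n)}\rceil + 3$; this is practical only for small $n$:

```python
# Empirical check of Theorem 3.2 on small random matrices (illustration only).
import math
import random
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

def avg_nontrivial_factors(n, lam, trials=10):
    """Average number of invariant factors > 1 over random {-floor(lam/2)..ceil(lam/2)} matrices."""
    total = 0
    for _ in range(trials):
        A = Matrix(n, n, lambda i, j: random.randint(-(lam // 2), (lam + 1) // 2))
        S = smith_normal_form(A, domain=ZZ)
        total += sum(1 for i in range(n) if abs(S[i, i]) > 1)
    return total / trials

# Theorem 3.2 predicts an expectation of at most ceil(sqrt(2*log_lam(n))) + 3, e.g.
# avg_nontrivial_factors(20, 3)  vs  math.ceil(math.sqrt(2 * math.log(20, 3))) + 3
```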

3.2 Extended Bonus Ideas

In his thesis [28], Z. Wan introduces the idea of computing the penultimate invariant factor (i.e. $s_{n-1}$) of $A$ while computing $s_n$, using two system solvings. The additional cost is comparatively small, therefore $s_{n-1}$ is referred to as a bonus. Here, we extend this idea to the computation of the $(n-k+1)$-th factor with $k$ solvings, in the following manner (a code sketch is given after the list):

1. The (matrix) solution of $AX = B$, where $B$ is an $n\times k$ multiple right hand side, can be written as $\tilde{s}_n^{-1}N$, where $\tilde{s}_n$ approximates $s_n(A)$ and the factors of $N$ give some divisors of the last $k$ invariant factors of $A$: see Lemma 3.6.

2. We are actually only interested in getting the product of these invariant factors, which we compute as the gcd of the determinants of two perturbed $k\times k$ matrices $R_1N$ and $R_2N$.

3. Then we show that repeating this solving twice, with two distinct right hand sides $B_1$ and $B_2$, is in general sufficient to remove those extra factors and to get a very fine approximation of the actual product of the last $k$ invariants: see Lemma 3.10.
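Here is a minimal Python sketch of these three steps (our own illustration; it reuses `solve_rational` from the LIF sketch of section 2.2, uses exact rational elimination rather than $p$-adic lifting, and, as a simplification of Lemma 3.10, takes each solving's own denominator instead of a shared $\tilde{s}_n$):

```python
# Sketch of the extended bonus: estimate pi_k = s_n * ... * s_{n-k+1} from two
# perturbed solvings A X = B_i and the gcd of dets of R_i N_i (cf. Lemma 3.10).
from fractions import Fraction
from math import gcd, lcm
import random

def det_fraction(M):
    """Determinant of a small square matrix by fraction-based elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, det = len(M), Fraction(1)
    for j in range(n):
        piv = next((i for i in range(j, n) if M[i][j] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != j:
            M[j], M[piv] = M[piv], M[j]
            det = -det
        det *= M[j][j]
        for i in range(j + 1, n):
            f = M[i][j] / M[j][j]
            M[i] = [u - f * v for u, v in zip(M[i], M[j])]
    return det

def pi_k_estimate(A, k=2, beta=1000):
    """Under-approximation of the product of the k largest invariant factors of A."""
    n, data = len(A), []
    for _ in range(2):                       # two independent (B, R) choices
        B = [[random.randrange(beta) for _ in range(k)] for _ in range(n)]
        X = [solve_rational(A, [row[c] for row in B]) for c in range(k)]   # columns of X
        sn = lcm(*(x.denominator for col in X for x in col))               # ~ s_n
        N = [[int(X[c][i] * sn) for c in range(k)] for i in range(n)]      # n x k numerators
        R = [[random.randrange(beta) for _ in range(n)] for _ in range(k)]
        RN = [[sum(R[a][i] * N[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
        data.append((sn, int(abs(det_fraction(RN)))))
    sn = lcm(data[0][0], data[1][0])
    # mu_k(RN) is the product of all k invariant factors of the k x k matrix RN,
    # i.e. |det(RN)|; dividing its gcd out of sn^k removes the undesired factors.
    return sn**k // gcd(data[0][1], data[1][1], sn**k)
```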

3.2.1 The last k invariant factors

Let $X$ be a (matrix) rational solution of the equation $AX = B$, where $B = [b_i]$, $i = 1, \dots, k$, is a random $n\times k$ matrix. Then the coordinates of $X$ have a common denominator $\tilde{s}_n$ and we let $N = [n_i]$, $i = 1, \dots, k$, denote the matrix of numerators of $X$. Thus, $X = \tilde{s}_n^{-1}N$ and $\gcd(N_{ij}, \tilde{s}_n) = 1$.

Following Wan, we notice that $s_n(A)A^{-1}$ is an integer matrix, the Smith form of which is equal to

$$\mathrm{diag}\left(\frac{s_n(A)}{s_n(A)}, \frac{s_n(A)}{s_{n-1}(A)}, \dots, \frac{s_n(A)}{s_1(A)}\right).$$

Indeed, writing $A = USV$ with $U$, $V$ unimodular gives $s_n(A)A^{-1} = V^{-1}\,\mathrm{diag}\left(\frac{s_n}{s_1}, \dots, \frac{s_n}{s_n}\right)U^{-1}$. Therefore, we may compute $s_{n-k+1}(A)$ when knowing $s_k\left(s_n(A)A^{-1}\right)$. The trick is that the computation of $A^{-1}$ is not required: we can perturb $A^{-1}$ by right multiplying it by $B$. Then $s_k\left(s_n(A)A^{-1}B\right)$ is a multiple of $s_k\left(s_n(A)A^{-1}\right)$. Instead of $s_n(A)A^{-1}B$ we would prefer to use $\tilde{s}_nA^{-1}B$, which is already computed and equal to $N$.

The relation between $A$ and $N$ is as follows.

Lemma 3.6. Let $X = \tilde{s}_n^{-1}N$, $\gcd(\tilde{s}_n, N) = 1$, be a solution to the equation $AX = B$, where $B$ is an $n\times k$ matrix. Let $R$ be a random $k\times n$ matrix. Then

$$\left.\frac{\tilde{s}_n}{\gcd(s_i(N), \tilde{s}_n)}\ \right|\ s_{n-i+1}(A) \quad\text{and}\quad \left.\frac{\tilde{s}_n}{\gcd(s_i(RN), \tilde{s}_n)}\ \right|\ s_{n-i+1}(A), \qquad i = 1, \dots, k.$$

Proof. The Smith forms of $s_n(A)A^{-1}B$ and $N$ are connected by the relation $\frac{s_n(A)}{\tilde{s}_n}s_i(N) = s_i\left(s_n(A)A^{-1}B\right)$, $i = 1, \dots, k$. Moreover, $s_i(N)$ is a factor of $s_i(RN)$. We notice that $\frac{s_n(A)}{s_i\left(s_n(A)A^{-1}B\right)}$ equals $\frac{\tilde{s}_n}{s_i(N)}$, and thus $\frac{\tilde{s}_n}{\gcd(s_i(RN), \tilde{s}_n)}$ is an (integer) factor of $s_{n-i+1}(A)$. Moreover, the under-approximation is solely due to the choice of $B$ and $R$.

Remark 3.7. Taking $\gcd(s_i(RN), \tilde{s}_n)$ is necessary as $\frac{\tilde{s}_n}{s_i(RN)}$ may be a rational number.

3.2.2 Removing the undesired factors

In fact we are interested in computing the product $\pi_k = s_ns_{n-1}\cdots s_{n-k+1}(A)$ of the $k$ biggest invariant factors of $A$. Then, following the idea of [1], we would like to reduce the computation of the determinant to the computation of $\frac{\det(A)}{\tilde{\pi}_k}$, where $\tilde{\pi}_k$ is the factor of $\pi_k$ that we have obtained. We can compute $\tilde{\pi}_k$ as $\tilde{s}_n^k/\gcd\left(\mu_k(RN), \tilde{s}_n^k\right)$, where $\mu_k = s_1s_2\cdots s_k$ is the product of the $k$ smallest invariant factors.

We will need the following technical lemma. Its proof is given in the appendix, see A.5.

Lemma 3.8. Let $V$ be a $k\times n$ matrix such that the Smith form of $V$ is trivial. Let $M$ be an $n\times k$ matrix with entries chosen randomly and uniformly from the set $\{a, a+1, \dots, a+S-1\}$. Then the probability that $p^l < S$ divides the determinant $\det(VM)$ is at most $\frac{3}{p^l}$.

In the following lemmas we show that by repeating the choice of the matrices $B$ and $R$ twice, we will omit only a finite number of bits of $\pi_k$. We start with a remark, which is a modification of [28, Lem. 5.17]. We recall that the order modulo $p$ ($\mathrm{ord}_p$) of a value is the exponent of the highest power of $p$ dividing it.

Remark 3.9. For every $n\times n$ matrix $M$ there exists a $k\times n$, $k \le n$, matrix $V$ with trivial Smith form, such that for any $n\times k$ matrix $B$: if the order modulo $p$, $\mathrm{ord}_p\left(\frac{\mu_k(MB)}{\mu_k(M)}\right)$, is greater than $l$, then also $\mathrm{ord}_p(\det(VB))$ is greater than $l$.

Lemma 3.10. Let $A$ be an $n\times n$ integer matrix and let $B_i$ (resp. $R_i$), $i = 1, 2$, be $n\times k$ (resp. $k\times n$) matrices with entries uniformly and randomly chosen from a set $\mathcal{S}$ of $S$ contiguous integers, $k \ge 2$. Denote by $\mu_k$ the product $s_1\cdots s_k$ of the $k$ smallest invariant factors and by $\pi_k$ the product of the $k$ biggest factors of $A$. Then, for $M = s_n(A)A^{-1}$,

$$\mathbb{E}\left(\log\left(\frac{\pi_k(A)}{s_n(A)^k}\gcd\left(\mu_k(R_1MB_1),\ \mu_k(R_2MB_2),\ s_n(A)^k\right)\right)\right) \in O(1) + O\left(\frac{k^3\log^4(H)}{S}\right),$$

where $H$ is the Hadamard bound for $A$.

Proof. First, notice that $\frac{\pi_k(A)}{s_n(A)^k} = \frac{1}{\mu_k(M)}$. Therefore
