
2370 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 61, NO. 5, MAY 2015

Faster Algorithms for Multivariate Interpolation With Multiplicities and Simultaneous Polynomial Approximations

Muhammad F. I. Chowdhury, Claude-Pierre Jeannerod, Vincent Neiger, Éric Schost, and Gilles Villard

Abstract—The interpolation step in the Guruswami-Sudan algorithm is a bivariate interpolation problem with multiplicities commonly solved in the literature using either structured linear algebra or basis reduction of polynomial lattices. This problem has been extended to three or more variables; for this generalization, all fast algorithms proposed so far rely on the lattice approach. In this paper, we reduce this multivariate interpolation problem to a problem of simultaneous polynomial approximations, which we solve using fast structured linear algebra. This improves the best known complexity bounds for the interpolation step of the list-decoding of Reed-Solomon codes, Parvaresh-Vardy codes, and folded Reed-Solomon codes. In particular, for Reed-Solomon list-decoding with re-encoding, our approach has complexity O˜(ℓ^{ω−1} m² (n − k)), where ℓ, m, n, and k are the list size, the multiplicity, the number of sample points, and the dimension of the code, and ω is the exponent of linear algebra; this accelerates the previously fastest known algorithm by a factor of ℓ/m.

Index Terms—Multivariate polynomial interpolation, polynomial approximation, structured matrix, list decoding, Reed-Solomon codes.

I. INTRODUCTION

IN THIS paper, we consider a multivariate interpolation problem with multiplicities and degree constraints (Problem 1) which originates from coding theory. In what follows, K is our base field and, in the coding theory context, s, ℓ, n, b are respectively known as the number of variables, list size, code length, and as an agreement parameter. The parameters m_1, ..., m_n are known as multiplicities associated with each of the n points; furthermore, the s variables are associated with some weights k_1, ..., k_s. In the application to list-decoding of Reed-Solomon codes, we have s = 1, all the multiplicities are equal to a same value m, n − b/m is an upper bound on the number of errors allowed on a received word, and the weight k := k_1 is such that k + 1 is the dimension of the code. Further details concerning the applications of our results to list-decoding and soft-decoding of Reed-Solomon codes are given in Section IV.

We stress that here we do not address the issue of choosing the parameters s, ℓ, m_1, ..., m_n with respect to n, b, k_1, ..., k_s, as is often done: in our context, these are all input parameters. Similarly, although we will mention them, we do not make some of the usual assumptions on these parameters; in particular, we do not make any assumption that ensures that our problem admits a solution: the algorithm will detect whether no solution exists.

Manuscript received February 3, 2014; revised March 5, 2015; accepted March 6, 2015. Date of publication March 23, 2015; date of current version April 17, 2015. M. F. I. Chowdhury and É. Schost were supported in part by NSERC and in part by the Canada Research Chairs Program. V. Neiger was supported by the International Mobility Grant Explo’ra Doc through the Région Rhône-Alpes. C.-P. Jeannerod and G. Villard were supported by ANR through the HPAC project under Grant ANR 11 BS02 013. The material in this paper was presented in part at the 10th Asian Symposium on Computer Mathematics in 2012 and at the 2013 SIAM Conference on Applied Algebraic Geometry.

M. F. I. Chowdhury is with the Department of Computer Science, University of Western Ontario, London, ON N6A 3K7, Canada (e-mail: [email protected]).

C.-P. Jeannerod is with the Laboratoire de l’Informatique du Parallélisme (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, Lyon 69007, France (e-mail: [email protected]).

V. Neiger is with the Laboratoire de l’Informatique du Parallélisme (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, Lyon 69007, France, and with the Department of Computer Science, University of Western Ontario, London, ON N6A 3K7, Canada (e-mail: [email protected]).

É. Schost is with the Department of Computer Science, University of Western Ontario, London, ON N6A 3K7, Canada (e-mail: [email protected]).

G. Villard is with the Laboratoire de l’Informatique du Parallélisme (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, Lyon 69007, France (e-mail: [email protected]).

Communicated by N. Kashyap, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2015.2416068

Here and hereafter, Z is the set of integers, Z_{≥0} the set of nonnegative integers, and Z_{>0} the set of positive integers. Besides, deg_{Y_1,...,Y_s} denotes the total degree with respect to the variables Y_1, ..., Y_s, and wdeg_{k_1,...,k_s} denotes the weighted degree with respect to weights k_1, ..., k_s ∈ Z on the variables Y_1, ..., Y_s, respectively; that is, for a polynomial

Q = Σ_{(j_1,...,j_s)} Q_{j_1,...,j_s}(X) Y_1^{j_1} · · · Y_s^{j_s},

wdeg_{k_1,...,k_s}(Q) = max_{(j_1,...,j_s)} ( deg(Q_{j_1,...,j_s}) + j_1 k_1 + · · · + j_s k_s ).
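As a concrete illustration of this definition, here is a small Python sketch (ours, not from the paper; the representation and names are our own) that computes the weighted degree from the degrees of the coefficient polynomials Q_{j_1,...,j_s}:

```python
def weighted_degree(coeff_degrees, weights):
    """Weighted degree of Q = sum_j Q_j(X) * Y^j.

    coeff_degrees: dict mapping exponent tuples (j1,...,js) to deg(Q_j).
    weights: tuple (k1,...,ks).
    """
    return max(d + sum(j * k for j, k in zip(js, weights))
               for js, d in coeff_degrees.items())

# Example with s = 2: Q = X^3 + X*Y1 + Y1*Y2^2 and weights k = (2, 1);
# the three terms contribute 3, 1 + 2 = 3, and 0 + 2 + 2 = 4.
Q = {(0, 0): 3, (1, 0): 1, (1, 2): 0}
print(weighted_degree(Q, (2, 1)))  # 4
```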

We call conditions (ii), (iii), and (iv) the list-size condition, the weighted-degree condition, and the vanishing condition, respectively. Note that a point (x, y_1, ..., y_s) is a zero of Q of multiplicity at least m if the shifted polynomial Q(X + x, Y_1 + y_1, ..., Y_s + y_s) has no monomial of total degree less than m; in characteristic zero or larger than m, this is equivalent to requiring that all the derivatives of Q of order up to m − 1 vanish at (x, y_1, ..., y_s).

0018-9448 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

By linearizing condition (iv) under the assumption that conditions (ii) and (iii) are satisfied, it is easily seen that solving Problem 1 amounts to computing a nonzero solution to an M × N homogeneous linear system over K. Here, the number M of equations derives from condition (iv) and thus depends on s, n, m_1, ..., m_n, while the number N of unknowns derives from conditions (ii) and (iii) and thus depends on s, ℓ, b, k_1, ..., k_s. It is customary to assume M < N in order to guarantee the existence of a nonzero solution; however, as said above, we do not make this assumption, since our algorithms do not require it.
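To make these sizes concrete, the following sketch (our own illustration; it uses the expressions for M and for the N_j that appear later in the paper, in Theorem 1 and Lemma 3) counts the equations and unknowns of this linearized system:

```python
from math import comb
from itertools import product

def count_equations(s, multiplicities):
    # M = sum over the n points of C(s + m_i, s + 1)
    return sum(comb(s + m, s + 1) for m in multiplicities)

def count_unknowns(s, ell, b, weights):
    # N = sum over j in Gamma of (b - j.k), where Gamma collects the
    # exponent tuples j with |j| <= ell and j.k < b
    N = 0
    for j in product(range(ell + 1), repeat=s):
        jk = sum(jt * kt for jt, kt in zip(j, weights))
        if sum(j) <= ell and jk < b:
            N += b - jk
    return N

# s = 1, three points of multiplicity 2: M = 3 * C(3, 2) = 9 equations;
# with ell = 2, b = 5, k = (1,): N = 5 + 4 + 3 = 12 unknowns, so M < N here.
print(count_equations(1, [2, 2, 2]), count_unknowns(1, 2, 5, (1,)))
```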

Problem 1 is a generalization of the interpolation step of the Guruswami-Sudan algorithm [23], [49] to s variables Y_1, ..., Y_s, distinct multiplicities, and distinct weights. The multivariate case s > 1 occurs for instance in Parvaresh-Vardy codes [40] or folded Reed-Solomon codes [22]. Distinct multiplicities occur for instance in the interpolation step in soft-decoding of Reed-Solomon codes [28]. We note that this last problem is different from our context since the x_i are not necessarily pairwise distinct; we briefly explain in Section IV-D how to deal with this case.

Our solution to Problem 1 relies on a reduction to a simultaneous approximation problem (Problem 2) which generalizes Padé and Hermite-Padé approximation.

Main Complexity Results and Applications: We first show in Section II how to reduce Problem 1 to Problem 2 efficiently via a generalization of the techniques introduced by Zeh, Gentner, and Augot [54] and Zeh [53, Sec. 5.1.1] for, respectively, the list-decoding and soft-decoding of Reed-Solomon codes.

Then, in Section III we present two algorithms for solving Problem 2. Each of them involves a linearization of the univariate equations (c) into a specific homogeneous linear system over K; if we define

M′ = Σ_{0≤i<µ} M′_i and N′ = Σ_{0≤j<ν} N′_j,

then both systems have M′ equations in N′ unknowns. (As for our first problem, we need not assume that M′ < N′.) Furthermore, the structure of these systems allows us to solve them efficiently using the algorithm of Bostan, Jeannerod, and Schost in [8].

Our first algorithm, detailed in Section III-B, solves Problem 2 by following the derivation of so-called extended key equations (EKE), initially introduced for the particular case of Problem 1 by Roth and Ruckenstein [43] when s = m = 1 and then by Zeh, Gentner, and Augot [54] when s = 1 and m ≥ 1; the matrix of the system is mosaic-Hankel. In our second algorithm, detailed in Section III-C, the linear system is more directly obtained from condition (c), without resorting to EKEs, and has Toeplitz-like structure.

Both points of view lead to the same complexity result, stated in Theorem 2 below, which says that Problem 2 can be solved in time quasi-linear in M′, multiplied by a subquadratic term in ρ = max(µ, ν). In the following theorems, and the rest of this paper, the soft-O notation O˜(·) indicates that we omit polylogarithmic terms. The exponent ω is such that we can multiply n × n matrices in O(n^ω) ring operations over any ring, the best known bound being ω < 2.38 [15], [31], [48], [51]. Finally, the function M is a multiplication time function for K[X]: M is such that polynomials of degree at most d in K[X] can be multiplied in M(d) operations in K, and satisfies the super-linearity properties of [19, Ch. 8]. It follows from the algorithm of Cantor and Kaltofen [11] that M(d) can be taken in O(d log(d) log log(d)) ⊆ O˜(d).

Combining Theorem 2 below with the above-mentioned reduction from Problem 1 to Problem 2, we immediately deduce the following cost bound for Problem 1.

Theorem 1: Let

Γ = {(j_1, ..., j_s) ∈ Z^s_{≥0} | j_1 + · · · + j_s ≤ ℓ and j_1 k_1 + · · · + j_s k_s < b},

and let m = max_{1≤i≤n} m_i, ϱ = max(|Γ|, (s+m−1 choose s)), and M = Σ_{1≤i≤n} (s+m_i choose s+1). There exists a probabilistic algorithm that either computes a solution to Problem 1, or determines that none exists, using

O(ϱ^{ω−1} M(M) log(M)²) ⊆ O˜(ϱ^{ω−1} M)

operations in K. This can be achieved using Algorithm 1 followed by Algorithm 2 or 3. These algorithms choose O(M) elements in K; if these elements are chosen uniformly at random in a set S ⊆ K of cardinality at least 6(M + 1)², then the probability of success is at least 1/2.

We will often refer to the two following assumptions on the input parameters:

H1: m ≤ ℓ,
H2: b > 0 and b > ℓ · max_{1≤j≤s} k_j.

Regarding H1, we prove in Appendix A that the case m > ℓ can be reduced to the case m = ℓ, so that this assumption can be made without loss of generality. Besides, it is easily verified that H2 is equivalent to having Γ = {(j_1, ..., j_s) ∈ Z^s_{≥0} | j_1 + · · · + j_s ≤ ℓ}; when k_j > 0 for some j, H2 means that we do not take ℓ uselessly large. Then, assuming H1 and H2, we have ϱ = |Γ| = (s+ℓ choose s).

As we will show in Section IV, in the context of the list-decoding of Reed-Solomon codes, applications of Theorem 1 include the interpolation step of the Guruswami-Sudan algorithm [23] in O˜(ℓ^{ω−1} m_GS² n) operations and the interpolation step of the Wu algorithm [52] in O˜(ℓ^{ω−1} m_Wu² n) operations, where m_GS and m_Wu are the respective multiplicities used in those algorithms; our result can also be adapted to the context of soft-decoding [28]. Besides, the re-encoding technique of Koetter and Vardy [29] can be used in conjunction with our algorithm in order to reduce the cost of the interpolation step of the Guruswami-Sudan algorithm to O˜(ℓ^{ω−1} m_GS² (n − k)) operations.

In Theorem 1, the probability analysis is a standard consequence of the Zippel-Schwartz lemma; as usual, the probability of success can be made arbitrarily close to one by increasing the size of S. If the field K has fewer than 6(M + 1)² elements, then a probability of success at least 1/2 can still be achieved by using a field extension L of degree d ∈ O(log_{|K|}(M)), up to a cost increase by a factor in O(M(d) log(d)).

Specifically, one can proceed in three steps. First, we take L = K[X]/⟨f⟩ with f ∈ K[X] irreducible of degree d; such an f can be set up using an expected number of O˜(d²) ⊆ O(M) operations in K [19, §14.9]. Then we solve Problem 1 over L by means of the algorithm of Theorem 1, thus using O(ϱ^{ω−1} M(M) log(M)² · M(d) log(d)) operations in K. Finally, from this solution over L one can deduce a solution over K using O(Md) operations in K. This last point comes from the fact that, as we shall see later in the paper, Problem 1 amounts to finding a nonzero vector u over K such that Au = 0 for some M × (M + 1) matrix A over K: once we have obtained a solution u over L, it thus suffices to rewrite it as u = Σ_{0≤i<d} u_i X^i ≠ 0 and, noting that Au_i = 0 for all i, to find a nonzero u_i in O(Md) comparisons with zero and return it as a solution over K.

Furthermore, since the x_i in Problem 1 are assumed to be pairwise distinct, we already have |K| ≥ n and thus we can take d = O(log_n(M)). In all the applications to error-correcting codes we consider in this paper, M is polynomial in n, so that we can take d = O(1), and in those cases the cost bound in Theorem 1 holds for any field.

As said before, Theorem 1 relies on an efficient solution to Problem 2, which we summarize in the following theorem.

Theorem 2: Let ρ = max(µ, ν). There exists a probabilistic algorithm that either computes a solution to Problem 2, or determines that none exists, using

O(ρ^{ω−1} M(M′) log(M′)²) ⊆ O˜(ρ^{ω−1} M′)

operations in K. Algorithms 2 and 3 achieve this result. These algorithms both choose O(M′) elements in K; if these elements are chosen uniformly at random in a set S ⊆ K of cardinality at least 6(M′ + 1)², then the probability of success is at least 1/2.

If K has fewer than 6(M′ + 1)² elements, the remarks made after Theorem 1 still apply here.

Comparison With Previous Work: In the context of coding theory, most previous results regarding Problem 1 focus on the list-decoding of Reed-Solomon codes via the Guruswami-Sudan algorithm, in which s = 1 and the assumptions H1 and H2 are satisfied, as well as

H3: 0 ≤ k < n, where k := k_1,
H4: m_1 = · · · = m_n = m.

The assumption H3 corresponds to the coding theory context, where k + 1 is the dimension of the code; then k + 1 must be positive and at most n (the length of the received word). To support this assumption independently from any application context, we show in Appendix B that if k ≥ n, then Problem 1 has either a trivial solution or no solution at all.

Previous results focus mostly on the Guruswami-Sudan case (s = 1, m ≥ 1) and some of them more specifically on the Sudan case (s = m = 1); we summarize these results in Table I. In some cases [1], [6], [13], [41], the complexity was not stated quite exactly in our terms but the translation is straightforward.

In the second column of that table, we give the cost with respect to the interpolation parameters ℓ, m, n, assuming further m = n^{O(1)} and ℓ = n^{O(1)}. The most significant factor in the running time is its dependency with respect to n, with results being either cubic, quadratic, or quasi-linear. Then, under the assumption H1, the second most important parameter is ℓ, followed by m. In particular, our result in Section IV, Corollary 1, compares favorably to the cost O˜(ℓ^ω m n) obtained by Cohn and Heninger [13], which was, to our knowledge, the best previous bound for this problem.

In the third column, we give the cost with respect to the Reed-Solomon code parameters n and k, using worst-case parameter choices that are made to ensure the existence of a solution: m = O(nk) and ℓ = O(n^{3/2} k^{1/2}) in the Guruswami-Sudan case [23], and ℓ = O(n^{1/2} k^{−1/2}) in the Sudan case [49]. With these parameter choices, our algorithms present a speedup (n/k)^{1/2} over the algorithm in [13].

Most previous algorithms rely on linear algebra, either over K or over K[X]. When working over K, a natural idea is to rely on cubic-time general linear system solvers, as in Sudan's and Guruswami-Sudan's original papers. Several papers also cast the problem in terms of Gröbner basis computation in K[X, Y], implicitly or explicitly: the incremental algorithms of [30], [33], and [37] are particular cases of the Buchberger-Möller algorithm [34], while Alekhnovich's algorithm [1] is a divide-and-conquer change of ordering algorithm for bivariate ideals.

Yet another line of work [43], [54] uses Feng and Tzeng's linear system solver [17], combined with a reformulation in terms of syndromes and key equations. We will use (and generalize to the case s > 1) some of these results in Section III-B, but we will rely on the structured linear system solver of [8] in order to prove our main results. Prior to our work, Olshevsky and Shokrollahi also used structured linear algebra techniques [38], but it is unclear to us whether their encoding of the problem could lead to similar results as ours.

As said above, another approach rephrases the problem of computing Q in terms of polynomial matrix computations, that is, as linear algebra over K[X]. Starting from known generators of the finitely generated K[X]-module (or polynomial lattice) formed by solutions to Problem 1, the algorithms in [4], [6], [9], [10], [13], [32], and [41] compute a Gröbner basis of this module (or a reduced lattice basis), in order to find a short vector therein. To achieve quasi-linear time in n, the algorithms in [4] and [9] use a basis reduction subroutine due to Alekhnovich [1], while those in [6] and [13] rely on a faster, randomized algorithm due to Giorgi, Jeannerod, and Villard [20].

TABLE I: COMPARISON OF OUR COSTS WITH PREVIOUS ONES FOR s = 1

This approach based on the computation of a reduced lattice basis was in particular the basis of the extensions to the multivariate case s > 1 in [9], [10], and [14]. In the multivariate case as well, the result in Theorem 1 improves on the best previously known bounds [9], [10], [14]; we detail those bounds and we prove this claim in Appendix C. In [18], the authors solve a problem similar to Problem 1 except that they do not assume that the x_i are distinct. For simple roots and under some genericity assumption on the points {(x_i, y_{i,1}, ..., y_{i,s})}_{1≤i≤n}, this algorithm uses O(n^{2+1/s}) operations to compute a polynomial Q which satisfies (i), (iii), (iv) with m = 1. However, the complexity analysis is not clear to us in the general case with multiple roots (m > 1).

Regarding Problem 2, several particular cases of it are well known. When all P_i are of the form X^{M′_i}, this problem becomes known as a simultaneous Hermite-Padé approximation problem or vector Hermite-Padé approximation problem [3], [47]. The case µ = 1, with P_1 being given through its roots (and their multiplicities), is known as the M-Padé problem [2]. To our knowledge, the only previous work on Problem 2 in its full generality is by Nielsen in [36, Ch. 2]. Nielsen solves the problem by building an ad hoc polynomial lattice, which has dimension µ + ν and degree max_{i<µ} M′_i, and finding a short vector therein. Using the algorithm in [20], the overall cost bound for this approach is O˜((µ + ν)^ω max_{i<µ} M′_i), to which our cost bound O˜(max(µ, ν)^{ω−1} Σ_{i<µ} M′_i) from Theorem 2 compares favorably.
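As a minimal, self-contained illustration of the kind of problem in this family (our own sketch, not the paper's algorithm, which exploits the structure of the system rather than plain Gaussian elimination), here is the simplest instance, classical Padé approximation: find (p, q) ≠ 0 with deg p < N_p, deg q < N_q and q·A = p mod X^{N_p+N_q−1}, solved by direct linearization over the rationals:

```python
from fractions import Fraction

def pade(A, Np, Nq):
    """Solve q*A = p mod X^(Np+Nq-1) with deg p < Np, deg q < Nq,
    (p, q) != 0, by linearizing coefficient by coefficient.
    Unknown vector: (p_0..p_{Np-1}, q_0..q_{Nq-1})."""
    n, rows = Np + Nq, []
    for t in range(Np + Nq - 1):              # one equation per power X^t
        row = [Fraction(0)] * n
        if t < Np:
            row[t] = Fraction(-1)             # contribution of -p_t
        for j in range(Nq):
            if 0 <= t - j < len(A):
                row[Np + j] = Fraction(A[t - j])  # coeff of X^t in q*A
        rows.append(row)
    # Gauss-Jordan elimination, then read one nullspace vector off
    pivots, r = {}, 0
    for c in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                rows[i] = [a - rows[i][c] * b for a, b in zip(rows[i], rows[r])]
        pivots[c], r = r, r + 1
    free = next(c for c in range(n) if c not in pivots)
    sol = [Fraction(0)] * n
    sol[free] = Fraction(1)
    for c, i in pivots.items():
        sol[c] = -rows[i][free]
    return sol[:Np], sol[Np:]

# Approximate A = 1 + X + X^2 + X^3 (a truncation of 1/(1 - X)):
p, q = pade([1, 1, 1, 1], 2, 2)
print(p, q)   # a scalar multiple of p = 1, q = 1 - X
```

Since there are N_p + N_q unknowns and only N_p + N_q − 1 homogeneous equations, a nonzero solution always exists here; in Problem 2 proper, no such guarantee is assumed.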

Outline of the Paper: First, we show in Section II how to reduce Problem 1 to Problem 2; this reduction is essentially based on Lemma 2, which extends to the multivariate case s > 1 the results in [53] and [54]. Then, after a reminder on algorithms for structured linear systems in Section III-A, we give two algorithms that both prove Theorem 2, in Sections III-B and III-C, respectively. The linearization in the first algorithm extends the derivation of extended key equations presented in [54] to the more general context of Problem 2, ending up with a mosaic-Hankel system. The second algorithm gives an alternative approach, in which the linearization is more straightforward and the structure of the matrix of the system is Toeplitz-like. We conclude in Section IV by presenting several applications to the list-decoding of Reed-Solomon codes, namely the Guruswami-Sudan algorithm, the re-encoding technique and the Wu algorithm, and by sketching how to adapt our approach to the soft-decoding of Reed-Solomon codes. Readers who are mainly interested in those applications may skip Section III, which contains the proofs of Theorems 1 and 2, and go directly to Section IV.

II. REDUCING PROBLEM 1 TO PROBLEM 2

In this section, we show how instances of Problem 1 can be reduced to instances of Problem 2; Algorithm 1 gives an overview of this reduction. The main technical ingredient, stated in Lemma 2 below, generalizes to any s ≥ 1 and (possibly) distinct multiplicities the result given for s = 1 by Zeh, Gentner, and Augot in [54, Proposition 3]. To prove it, we use the same steps as in [54]; we rely on the notion of Hasse derivatives, which allows us to write Taylor expansions in positive characteristic (see Hasse [24] or Roth [42, pp. 87, 276]).

For simplicity, in the rest of this paper we will use boldface letters to denote s-tuples of objects: Y = (Y_1, ..., Y_s), k = (k_1, ..., k_s), etc. In the special case of s-tuples of integers, we also write |k| = k_1 + · · · + k_s, and comparison and addition of multi-indices in Z^s_{≥0} are defined componentwise. For example, writing i ≤ j is equivalent to i_1 ≤ j_1, ..., i_s ≤ j_s, and i − j denotes (i_1 − j_1, ..., i_s − j_s). If y = (y_1, ..., y_s) is in K[X]^s and i = (i_1, ..., i_s) is in Z^s_{≥0}, then Y − y = (Y_1 − y_1, ..., Y_s − y_s) and Y^i = Y_1^{i_1} · · · Y_s^{i_s}. Finally, for products of binomial coefficients, we shall write

(j choose i) = (j_1 choose i_1) · · · (j_s choose i_s).

Note that this integer is zero when i ≰ j.

If A is any commutative ring with unity and A[Y] denotes the ring of polynomials in Y_1, ..., Y_s over A, then for a polynomial P(Y) = Σ_j P_j Y^j in A[Y] and a multi-index i in Z^s_{≥0}, the order-i Hasse derivative of P is the polynomial P^{[i]} in A[Y] defined by

P^{[i]} = Σ_{j ≥ i} (j choose i) P_j Y^{j−i}.

The Hasse derivative satisfies the following property (Taylor expansion): for all a in A^s,

P(Y) = Σ_i P^{[i]}(a) (Y − a)^i.
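For intuition, here is a small sketch (ours, with s = 1 for simplicity and integer coefficients) of the Hasse derivative together with a check of the Taylor-expansion property above:

```python
from math import comb

def hasse_derivative(p, i):
    """Order-i Hasse derivative of P = sum_j p[j] * Y^j (coefficient list):
    P^[i] = sum_{j >= i} C(j, i) * p[j] * Y^(j-i)."""
    return [comb(j, i) * p[j] for j in range(i, len(p))]

def evaluate(p, a):
    return sum(c * a**k for k, c in enumerate(p))

# Taylor expansion: P(Y) = sum_i P^[i](a) * (Y - a)^i, i.e. the values
# P^[i](a) are exactly the coefficients of the shifted polynomial P(Y + a).
P = [1, 0, 2, 5]            # P = 1 + 2Y^2 + 5Y^3
a = 3
shifted = [evaluate(hasse_derivative(P, i), a) for i in range(len(P))]
print(shifted)              # [154, 147, 47, 5], the coefficients of P(Y + 3)
```

Over a field of characteristic zero, P^{[i]} is just the usual i-th derivative divided by i!; the point of the Hasse derivative is that it remains well defined in positive characteristic.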

The next lemma shows how Hasse derivatives help rephrase the vanishing condition (iv) of Problem 1 for one of the points {(x_r, y_r)}_{1≤r≤n}.

Lemma 1: Let (x, y_1, ..., y_s) be a point in K^{s+1} and R = (R_1, ..., R_s) in K[X]^s be such that R_j(x) = y_j for 1 ≤ j ≤ s. Then, for any polynomial Q in K[X, Y], Q(x, y) = 0 with multiplicity at least m if and only if, for all i in Z^s_{≥0} such that |i| < m,

Q^{[i]}(X, R) = 0 mod (X − x)^{m−|i|}.

Proof: Up to a shift, one can assume that the point is (x, y_1, ..., y_s) = (0, 0); in other words, it suffices to show that for R(0) = 0 ∈ K^s, we have Q(0, 0) = 0 with multiplicity at least m if and only if, for all i in Z^s_{≥0} such that |i| < m, X^{m−|i|} divides Q^{[i]}(X, R).

Assume first that (0, 0) ∈ K^{s+1} is a root of Q of multiplicity at least m. Then, Q(X, Y) = Σ_j Q_j Y^j has only monomials of total degree at least m, so that for j ≥ i, each nonzero Q_j Y^{j−i} has only monomials of total degree at least m − |i|. Now, R(0) = 0 ∈ K^s implies that X divides each component of R. Consequently, X^{m−|i|} divides Q_j R^{j−i} for each j ≥ i, and thus Q^{[i]}(X, R) as well.

Conversely, let us assume that for all i in Z^s_{≥0} such that |i| < m, X^{m−|i|} divides Q^{[i]}(X, R), and show that Q has no monomial of total degree less than m. Writing the Taylor expansion of Q with A = K[X] and a = R, we obtain

Q(X, Y) = Σ_i Q^{[i]}(X, R) (Y − R)^i.

Each component of R being a multiple of X, we deduce that for the multi-indices i such that |i| ≥ m, every nonzero monomial in Q^{[i]}(X, R)(Y − R)^i has total degree at least m. Using our assumption, the same conclusion follows for the multi-indices such that |i| < m. □
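The divisibility criterion of Lemma 1 is easy to test mechanically. The sketch below (our own illustration; s = 1, point (0, 0), integer coefficients, with polynomials in X as coefficient lists) checks it on Q = Y² − X³, which has a zero of multiplicity exactly 2 at the origin:

```python
from math import comb

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def padd(a, b):
    return [(a[k] if k < len(a) else 0) + (b[k] if k < len(b) else 0)
            for k in range(max(len(a), len(b)))]

def hasse_substitute(Q, i, R):
    """Coefficients in X of Q^[i](X, R), for Q = {j: coeffs of Q_j(X)}
    and R a polynomial in X (coefficient list) with R(0) = 0."""
    out = [0]
    for j, Qj in Q.items():
        if j >= i:
            term = [comb(j, i) * c for c in Qj]
            for _ in range(j - i):         # multiply by R^(j-i)
                term = pmul(term, R)
            out = padd(out, term)
    return out

def x_valuation(p):
    return next((k for k, c in enumerate(p) if c != 0), len(p))

def multiplicity_at_least(Q, R, m):
    # Lemma 1 at (x, y) = (0, 0): X^(m-i) must divide Q^[i](X, R) for i < m
    return all(x_valuation(hasse_substitute(Q, i, R)) >= m - i
               for i in range(m))

Q = {0: [0, 0, 0, -1], 2: [1]}     # Q = Y^2 - X^3
R = [0, 1]                         # R = X, so R(0) = 0
print(multiplicity_at_least(Q, R, 2), multiplicity_at_least(Q, R, 3))
# True False
```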

Thus, for each of the points {(x_r, y_r)}_{1≤r≤n} in Problem 1, such a rewriting of the vanishing condition (iv) for this point holds. Now intervenes the fact that the x_i are distinct: the polynomials (X − x_a)^α and (X − x_b)^β are coprime for a ≠ b, so that simultaneous divisibility by both those polynomials is equivalent to divisibility by their product (X − x_a)^α (X − x_b)^β. Using the s-tuple R = (R_1, ..., R_s) ∈ K[X]^s of Lagrange interpolation polynomials, defined by the conditions

deg(R_j) < n and R_j(x_i) = y_{i,j}    (1)

for 1 ≤ i ≤ n and 1 ≤ j ≤ s, we can then combine Lemma 1 for all points so as to rewrite the vanishing condition of Problem 1 as a set of modular equations in K[X], as in Lemma 2 below. In what follows, we use the notation from Problem 1 and Theorem 1.
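The interpolants R_j of Eq. (1) can be built by textbook Lagrange interpolation; the following sketch (ours, over the rationals rather than a finite field K) returns the coefficient list of one such R_j:

```python
from fractions import Fraction

def lagrange_coeffs(xs, ys):
    """Coefficients of the unique R with deg(R) < n and R(xs[i]) = ys[i]."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        # basis polynomial prod_{r != i} (X - x_r) / (x_i - x_r)
        basis, denom = [Fraction(1)], Fraction(1)
        for r in range(n):
            if r == i:
                continue
            basis = [Fraction(0)] + basis          # multiply by X ...
            for k in range(len(basis) - 1):
                basis[k] -= xs[r] * basis[k + 1]   # ... minus x_r times old basis
            denom *= xs[i] - xs[r]
        for k in range(len(basis)):
            coeffs[k] += ys[i] * basis[k] / denom
    return coeffs

# Points (0, 1), (1, 3), (2, 9): the interpolant is R = 1 + 2X^2
print(lagrange_coeffs([0, 1, 2], [1, 3, 9]))
```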

Lemma 2: For any polynomial Q in K[X, Y], Q satisfies the condition (iv) of Problem 1 if and only if for all i in Z^s_{≥0} such that |i| < m,

Q^{[i]}(X, R) = 0 mod Π_{1≤r≤n : m_r > |i|} (X − x_r)^{m_r − |i|}.

Proof: This result is easily obtained from Lemma 1 since the x_r are pairwise distinct. □

Note that when all multiplicities are equal, that is, m = m_1 = · · · = m_n, for every |i| the modulus takes the simpler form G^{m−|i|}, where G = Π_{1≤r≤n} (X − x_r).

Writing j · k = j_1 k_1 + · · · + j_s k_s, recall from the statement of Theorem 1 that Γ is the set of all j in Z^s_{≥0} such that |j| ≤ ℓ and j · k < b. Then, defining the positive integers

N_j = b − j · k

for all j in Γ, we immediately obtain the following reformulation of the list-size and weighted-degree conditions of our interpolation problem:

Lemma 3: For any polynomial Q in K[X, Y], Q satisfies the conditions (ii) and (iii) of Problem 1 if and only if it has the form

Q(X, Y) = Σ_{j∈Γ} Q_j(X) Y^j with deg(Q_j) < N_j.

For $i \in \mathbb{Z}^s_{\ge 0}$ with $|i| < m$ and $j \in \Gamma$, let us now define the polynomials $P_i, F_{i,j} \in K[X]$ as

$$P_i = \prod_{1\le r\le n \,:\, m_r > |i|} (X - x_r)^{m_r - |i|} \tag{2a}$$

and

$$F_{i,j} = \binom{j}{i}\, R^{\,j-i} \bmod P_i. \tag{2b}$$

It then follows from Lemmas 2 and 3 that $Q$ in $K[X, Y]$ satisfies conditions (ii), (iii), (iv) of Problem 1 if and only if $Q = \sum_{j\in\Gamma} Q_j Y^{j}$ for some polynomials $Q_j$ in $K[X]$ such that
• $\deg(Q_j) < N_j$ for all $j$ in $\Gamma$,
• $\sum_{j\in\Gamma} F_{i,j}\, Q_j = 0 \bmod P_i$ for all $|i| < m$.

Let now $M_i$ be the positive integers given by

$$M_i = \sum_{1\le r\le n \,:\, m_r > |i|} (m_r - |i|),$$

for all $|i| < m$. Since the $P_i$ are monic polynomials of degree $M_i$ and since $\deg F_{i,j} < M_i$, the latter conditions express the problem of finding such a $Q$ as an instance of

CHOWDHURY et al.: FASTER ALGORITHMS FOR MULTIVARIATE INTERPOLATION 2375

Algorithm 1 Reducing Problem 1 to Problem 2

Input: $s, \ell, n, m_1, \dots, m_n$ in $\mathbb{Z}_{>0}$, $b, k_1, \dots, k_s$ in $\mathbb{Z}$, and points $\{(x_i, y_{i,1}, \dots, y_{i,s})\}_{1\le i\le n}$ in $K^{s+1}$ with the $x_i$ pairwise distinct.

Output: parameters $\mu, \nu, M'_0, \dots, M'_{\mu-1}, N'_0, \dots, N'_{\nu-1}$, $\{(P_i, F_{i,0}, \dots, F_{i,\nu-1})\}_{0\le i<\mu}$ for Problem 2, such that the solutions to this problem are exactly the solutions to Problem 1 with parameters the input of this algorithm.

1. Compute $\Gamma = \{ j \in \mathbb{Z}^s_{\ge 0} \mid |j| \le \ell \text{ and } b - j\cdot k > 0 \}$, $\mu = \binom{s+m-1}{s}$, $\nu = |\Gamma|$, and bijections $\varphi$ and $\psi$ as in (3)
2. Compute $M_i = \sum_{1\le r\le n \,:\, m_r > |i|} (m_r - |i|)$ for $|i| < m$ and $N_j = b - j\cdot k$ for $j \in \Gamma$
3. Compute $P_i$ and $F_{i,j}$ for $|i| < m$, $j \in \Gamma$ as in (2)
4. Return the integers $\mu, \nu, M_{\varphi(0)}, \dots, M_{\varphi(\mu-1)}, N_{\psi(0)}, \dots, N_{\psi(\nu-1)}$ together with the polynomial tuples $\{(P_{\varphi(i)}, F_{\varphi(i),\psi(0)}, \dots, F_{\varphi(i),\psi(\nu-1)})\}_{0\le i<\mu}$

Problem 2. In order to make the reduction completely explicit, define further

$$M = \sum_{|i|<m} M_i, \qquad \mu = \binom{s+m-1}{s}, \qquad \nu = |\Gamma|, \qquad \varrho = \max(\mu, \nu);$$

then choose arbitrary orders on the sets of indices $\{i \in \mathbb{Z}^s_{\ge 0} \mid |i| < m\}$ and $\Gamma$, that is, bijections

$$\varphi : \{0, \dots, \mu-1\} \to \{i \in \mathbb{Z}^s_{\ge 0} \mid |i| < m\} \tag{3a}$$

and

$$\psi : \{0, \dots, \nu-1\} \to \Gamma; \tag{3b}$$

finally, for $i$ in $\{0, \dots, \mu-1\}$ and $j$ in $\{0, \dots, \nu-1\}$, associate $M'_i = M_{\varphi(i)}$, $N'_j = N_{\psi(j)}$, $P'_i = P_{\varphi(i)}$ and $F'_{i,j} = F_{\varphi(i),\psi(j)}$.

At this stage we have proved that the solutions to Problem 1 with input parameters $s, \ell, n, m_1, \dots, m_n$, $b, k_1, \dots, k_s$ and points $\{(x_i, y_{i,1}, \dots, y_{i,s})\}_{1\le i\le n}$ are exactly the solutions to Problem 2 with input parameters $\mu, \nu, M'_0, \dots, M'_{\mu-1}, N'_0, \dots, N'_{\nu-1}$ and polynomials $\{(P'_i, F'_{i,0}, \dots, F'_{i,\nu-1})\}_{0\le i<\mu}$. This proves the correctness of Algorithm 1.

Proposition 1: Algorithm 1 is correct and uses $O(\varrho\,\mathsf{M}(M)\log(M))$ operations in $K$.

Proof: The only thing left to do is the complexity analysis; more precisely, giving an upper bound on the number of operations in $K$ performed in Step 3.

First, we need to compute $P_i$ as in (2a) for every $i$ in $\mathbb{Z}^s_{\ge 0}$ such that $|i| < m$. This involves only $m$ distinct polynomials $P_{i_0}, \dots, P_{i_{m-1}}$, where we have chosen any indices $i_j$ such that $|i_j| = j$. We note that, defining for $j < m$ the polynomial $G_j = \prod_{1\le r\le n \,:\, m_r > j} (X - x_r)$, we have $P_{i_{m-1}} = G_{m-1}$ and, for every $j < m-1$, $P_{i_j} = P_{i_{j+1}} \cdot G_j$. The polynomials $G_0, \dots, G_{m-1}$ have degree at most $n$ and can be computed using $O(m\,\mathsf{M}(n)\log(n))$ operations in $K$; this is $O(\varrho\,\mathsf{M}(M)\log(M))$ since $\varrho \ge \binom{s+m-1}{s} \ge m$ and $M = \sum_{1\le r\le n} \binom{s+m_r}{s+1} \ge n$. Then $P_{i_0}, \dots, P_{i_{m-1}}$ can be computed iteratively using $O(\sum_{j<m} \mathsf{M}(\deg(P_{i_j})))$ operations in $K$; using the super-linearity of $\mathsf{M}(\cdot)$, this is $O(\mathsf{M}(M))$ since $\deg(P_{i_j}) = M_{i_j}$ and $\sum_{j<m} M_{i_j} \le M$.
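As an illustration of this incremental scheme (a naive-arithmetic sketch, not the quasi-linear version the proof assumes), the moduli $P_{i_j}$ can be accumulated from the $G_j$ as follows:

```python
def poly_mul(a, b):
    """Product of two polynomials given as low-to-high coefficient lists."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] += ai * bj
    return res

def poly_eval(p, x):
    v = 0
    for c in reversed(p):
        v = v * x + c
    return v

def moduli(xs, ms):
    """P[j] = prod over r with ms[r] > j of (X - xs[r])^(ms[r] - j), for j < max(ms),
    computed as P[m-1] = G[m-1] and P[j] = P[j+1] * G[j]."""
    m = max(ms)
    G = []
    for j in range(m):
        g = [1]
        for x, mr in zip(xs, ms):
            if mr > j:
                g = poly_mul(g, [-x, 1])
        G.append(g)
    P = [None] * m
    P[m - 1] = G[m - 1]
    for j in range(m - 2, -1, -1):
        P[j] = poly_mul(P[j + 1], G[j])
    return P

P = moduli([0, 1, 2], [2, 1, 2])
assert len(P[0]) == 6                      # deg P_{i_0} = 2 + 1 + 2 = 5
assert poly_eval(P[0], 3) == 3**2 * 2 * 1  # X^2 (X-1) (X-2)^2 at X = 3
assert poly_eval(P[1], 3) == 3 * 1         # X (X-2) at X = 3
```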

Then, we have to compute (some of) the interpolation polynomials $R_1, \dots, R_s$. Due to Lemma 2, the only values of $i \in \{1, \dots, s\}$ for which $R_i$ is needed are those such that the indeterminate $Y_i$ may actually appear in $Q(X, Y) = \sum_{j\in\Gamma} Q_j(X) Y^{j}$. Now, the latter will not occur unless the $i$th unit $s$-tuple $(0, \dots, 0, 1, 0, \dots, 0)$ belongs to $\Gamma$. Hence, at most $|\Gamma|$ polynomials $R_i$ must be computed, each at a cost of $O(\mathsf{M}(n)\log(n))$ operations in $K$. Overall, the cost of the interpolation step is thus in $O(|\Gamma|\,\mathsf{M}(n)\log(n)) \subseteq O(\varrho\,\mathsf{M}(M)\log(M))$.

Finally, we compute $F_{i,j}$ as in (2b) for every $i, j$. This is done by fixing $i$ and computing all products $F_{i,j}$ incrementally, starting from $R_1, \dots, R_s$. Since we compute modulo $P_i$, each product takes $O(\mathsf{M}(M_i))$ operations in $K$. Summing over all $j$ leads to a cost of $O(|\Gamma|\,\mathsf{M}(M_i))$ per index $i$. Summing over all $i$ and using the super-linearity of $\mathsf{M}$ leads to a total cost of $O(|\Gamma|\,\mathsf{M}(M))$, which is $O(\varrho\,\mathsf{M}(M))$. $\Box$

The reduction above is deterministic and its cost is negligible compared to the cost in $O(\varrho^{\omega-1}\,\mathsf{M}(M)\log(M)^2)$ that follows from Theorem 2 with $\rho = \varrho$ and $M' = \sum_{0\le i<\mu} M'_i = M$. Noting that $M = \sum_{|i|<m} M_i = \sum_{1\le r\le n} \binom{s+m_r}{s+1}$, we conclude that Theorem 2 implies Theorem 1.

III. SOLVING PROBLEM 2 THROUGH STRUCTURED LINEAR SYSTEMS

A. Solving Structured Homogeneous Linear Systems

Our two solutions to Problem 2 rely on fast algorithms for solving linear systems of the form $Au = 0$ with $A$ a structured matrix over $K$. In this section, we briefly review useful concepts and results related to displacement rank techniques. While these techniques can handle systems with several kinds of structure, we will only need (and discuss) those related to Toeplitz-like and Hankel-like systems; for a more comprehensive treatment, the reader may consult [39].

Let $M$ be a positive integer and let $Z_M \in K^{M\times M}$ be the square matrix with ones on the subdiagonal and zeros elsewhere:

$$Z_M = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix} \in K^{M\times M}.$$

Given two integers $M$ and $N$, consider the following operators:

$$\Delta_{M,N} : K^{M\times N} \to K^{M\times N}, \qquad A \mapsto A - Z_M A Z_N^{\mathsf{T}}$$

and

$$\Delta'_{M,N} : K^{M\times N} \to K^{M\times N}, \qquad A \mapsto A - Z_M A Z_N,$$


which subtract from $A$ its translate one place along the diagonal and the anti-diagonal, respectively.

Let us discuss $\Delta_{M,N}$ first. If $A$ is a Toeplitz matrix, that is, invariant along diagonals, $\Delta_{M,N}(A)$ has rank at most two. As it turns out, Toeplitz systems can be solved much faster than general linear systems, in quasi-linear time in $M + N$. The main idea behind algorithms for structured matrices is to extend these algorithmic properties to those matrices $A$ for which the rank of $\Delta_{M,N}(A)$ is small, in which case we say that $A$ is Toeplitz-like. Below, this rank will be called the displacement rank of $A$ (with respect to $\Delta_{M,N}$).

A pair of matrices $(V, W)$ in $K^{M\times\alpha} \times K^{\alpha\times N}$ will be called a generator of length $\alpha$ for $A$ with respect to $\Delta_{M,N}$ if $\Delta_{M,N}(A) = VW$. For the structure we are considering, one can recover $A$ from its generator; in particular, one can use a generator of length $\alpha$ as a way to represent $A$ using $\alpha(M+N)$ field elements. One of the main aspects of structured linear algebra algorithms is to use generators as a compact data structure throughout the whole process.
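For intuition (a numerical aside, not from the paper), the rank-two property of the Toeplitz displacement can be checked directly with NumPy:

```python
import numpy as np

def Z(n):
    # the shift matrix Z_n: ones on the subdiagonal, zeros elsewhere
    return np.eye(n, k=-1)

def displace(A):
    # Delta_{M,N}(A) = A - Z_M A Z_N^T
    M, N = A.shape
    return A - Z(M) @ A @ Z(N).T

# a 4x5 Toeplitz matrix built from its first column and first row
c, r = [1, 2, 3, 4], [1, 5, 6, 7, 8]
A = np.array([[c[i - j] if i >= j else r[j - i] for j in range(5)]
              for i in range(4)], dtype=float)

D = displace(A)
assert np.linalg.matrix_rank(D) <= 2  # displacement rank of a Toeplitz matrix
assert np.allclose(D[1:, 1:], 0)      # only the first row and column survive
```

The nonzero first row and first column of $D$ give a length-2 generator from which $A$ is recoverable.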

Up to now, we only discussed the Toeplitz structure. Hankel-like matrices are those which have a small displacement rank with respect to $\Delta'_{M,N}$, that is, those matrices $A$ for which the rank of $\Delta'_{M,N}(A)$ is small. As far as solving the system $Au = 0$ is concerned, this case can easily be reduced to the Toeplitz-like case. Define $B = A J_N$, where $J_N$ is the reversal matrix of size $N$, all entries of which are zero, except the anti-diagonal which is set to one. Then, one easily checks that the displacement rank of $A$ with respect to $\Delta'_{M,N}$ is the same as the displacement rank of $B$ with respect to $\Delta_{M,N}$, and that if $(V, W)$ is a generator for $A$ with respect to $\Delta'_{M,N}$, then $(V, W J_N)$ is a generator for $B$ with respect to $\Delta_{M,N}$. Using the algorithm for Toeplitz-like matrices gives us a solution $v$ to $Bv = 0$, from which we deduce that $u = J_N v$ is a solution to $Au = 0$.
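The same kind of numerical check illustrates the Hankel-to-Toeplitz reduction (again an aside, with $Z_n$ and $J_N$ as in the text):

```python
import numpy as np

Z = lambda n: np.eye(n, k=-1)       # shift matrix
J = lambda n: np.fliplr(np.eye(n))  # reversal matrix

h = [1, 2, 3, 4, 6, 8, 12]          # A[i, j] = h[i + j], a 4x4 Hankel matrix
A = np.array([[h[i + j] for j in range(4)] for i in range(4)], dtype=float)

B = A @ J(4)                        # reversing the columns of A gives a Toeplitz matrix
dA = A - Z(4) @ A @ Z(4)            # Delta'_{M,N}(A), the Hankel-like displacement
dB = B - Z(4) @ B @ Z(4).T          # Delta_{M,N}(B), the Toeplitz-like displacement
assert np.allclose(dB, dA @ J(4))   # the two displacements differ by J_N only...
assert np.linalg.matrix_rank(dA) == np.linalg.matrix_rank(dB)  # ...so ranks agree
```

Any nullspace vector $v$ of $B$ then maps back to the nullspace vector $u = J_N v$ of $A$, since $Bv = A(J_N v)$.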

In this paper, we will not enter the details of algorithms for solving such structured systems. The main result we will rely on is the following proposition, a minor extension of a result by Bostan, Jeannerod, and Schost [8], which features the best known complexity for this kind of task, to the best of our knowledge. This algorithm is based on previous work of Bitmead and Anderson [7], Morf [35], Kaltofen [25], and Pan [39], and is probabilistic (it depends on the choice of some parameters in the base field $K$, and success is ensured provided these parameters avoid a hypersurface of the parameter space).

The proof of the following proposition occupies the rest of this section. Remark that some aspects of this statement could be improved (for instance, we could reduce the cost so that it only depends on $M$, not $\max(M, N)$), but that would be inconsequential for the applications we make of it.

Proposition 2: Given a generator $(V, W)$ of length $\alpha$ for a matrix $A \in K^{M\times N}$, with respect to either $\Delta_{M,N}$ or $\Delta'_{M,N}$, one can find a nonzero element in the right nullspace of $A$, or determine that none exists, by a probabilistic algorithm that uses $O(\alpha^{\omega-1}\,\mathsf{M}(P)\log(P)^2)$ operations in $K$, with $P = \max(M, N)$. The algorithm chooses $O(P)$ elements in $K$; if these elements are chosen uniformly at random in a set $S \subseteq K$ of cardinality at least $6P^2$, the probability of success is at least $1/2$.

Square Matrices: In all that follows, we consider only the operator $\Delta_{M,N}$, since we already pointed out that the case of $\Delta'_{M,N}$ can be reduced to it for no extra cost.

When $M = N$, we use directly [8, Theorem 1], which gives the running time reported above. That result does not explicitly state which solution we obtain, as it is written for general non-homogeneous systems. Here, we want to make sure we obtain a nonzero element in the right nullspace (if one exists), so slightly more details are needed.

The algorithm in that theorem chooses $3M - 2$ elements in $K$, the first $2M - 2$ of which are used to precondition $A$ by giving it generic rank profile; this is the case when these parameters avoid a hypersurface of $K^{2M-2}$ of degree at most $M^2 + M$.

Assume this is the case. Then, following [26], the output vector $u$ is obtained in a parametric form as $u = \lambda(u')$, where $u'$ consists of another set of $M$ parameters chosen in $K$ and $\lambda$ is a surjective linear mapping with image the right nullspace $\ker(A)$ of $A$. If $\ker(A)$ is trivial, the algorithm returns the zero vector in any case, which is correct. Otherwise, the set of vectors $u'$ such that $\lambda(u') = 0$ is contained in a hyperplane of $K^M$, so it is enough to choose $u'$ outside of that hyperplane to ensure success.

To conclude, we rely on the so-called Zippel-Schwartz lemma [16], [45], [55], which can be summarized as follows: if a nonzero polynomial over $K$ of total degree at most $d$ is evaluated by assigning each of its indeterminates a value chosen uniformly at random in a subset $S$ of $K$, then the probability that the resulting polynomial value be zero is at most $d/|S|$. Thus, applying that result to the polynomial of degree $d := M^2 + M + 1 \le 3M^2$ corresponding to the hypersurface and the hyperplane mentioned above, we see that if we choose all parameters uniformly at random in a subset $S \subseteq K$ of cardinality $|S| \ge 6M^2$, the algorithm succeeds with probability at least $1/2$.

Wide Matrices: Suppose now that $M < N$, so that the system is underdetermined. We add $N - M$ zero rows on top of $A$, obtaining an $N \times N$ matrix $A'$. Applying the algorithm for the square case to $A'$, we will obtain a right nullspace element $u$ for $A'$ and thus for $A$, since these nullspaces are the same. In order to do so, we need to construct a generator for $A'$ from the generator $(V, W)$ we have for $A$: one simply takes $(V', W)$, where $V'$ is the matrix in $K^{N\times\alpha}$ obtained by adding $N - M$ zero rows on top of $V$.

Tall Matrices: Suppose finally that $M > N$. This time, we build the matrix $A' \in K^{M\times M}$ by adjoining $M - N$ zero columns to $A$ on the left. The generator $(V, W)$ of $A$ can be turned into a generator of $A'$ by simply adjoining $M - N$ zero columns to $W$ on the left. We then solve the system $A's = 0$, and return the vector $u$ obtained by discarding the first $M - N$ entries of $s$.

The cost of this algorithm fits into the requested bound; all that remains to see is that we obtain a nonzero vector in the right nullspace $\ker(A)$ of $A$ with nonzero probability. Indeed, the nullspaces of $A$ and $A'$ are now related by the equality


$\ker(A') = K^{M-N} \times \ker(A)$. We mentioned earlier that in the algorithm for the square case, the solution $s$ to $A's = 0$ is obtained in parametric form, as $s = \lambda(s')$ for $s' \in K^M$, with $\lambda$ a surjective mapping $K^M \to \ker(A')$. Composing with the projection $\pi : \ker(A') \to \ker(A)$, we obtain a parametrization of $\ker(A)$ as $u = (\pi\circ\lambda)(s')$. The error probability analysis is then the same as in the square case.

B. Solving Problem 2 Through a Mosaic-Hankel Linear System

In this section, we give our first solution to Problem 2, thereby proving Theorem 2; this solution is outlined in Algorithm 2. It consists of first deriving and linearizing the modular equations of Lemma 4 below, and then solving the resulting mosaic-Hankel system using the approach recalled in Section III-A. Note that, when solving Problem 1 using the reduction to Problem 2 given in Section II, these modular equations are a generalization to arbitrary $s$ of the extended key equations presented in [43], [53], and [54] for $s = 1$.

We consider tuples $\{(P_i, F_{i,0}, \dots, F_{i,\nu-1})\}_{0\le i<\mu}$ of polynomials in $K[X]$ with, for all $i$, $P_i$ monic of degree $M'_i$ and $\deg(F_{i,j}) < M'_i$ for all $j$. Given degree bounds $N'_0, \dots, N'_{\nu-1}$, we look for polynomials $Q_0, \dots, Q_{\nu-1}$ in $K[X]$ such that the following holds:

(a) the $Q_j$ are not all zero,
(b) for $0 \le j < \nu$, $\deg(Q_j) < N'_j$,
(c) for $0 \le i < \mu$, $\sum_{0\le j<\nu} F_{i,j} Q_j = 0 \bmod P_i$.

Our goal here is to linearize condition (c) into a homogeneous linear system over $K$ involving $M'$ linear equations in $N'$ unknowns, where $M' = M'_0 + \cdots + M'_{\mu-1}$ and $N' = N'_0 + \cdots + N'_{\nu-1}$. Without loss of generality, we will assume that

$$N' \le M' + 1. \tag{4}$$

Indeed, if $N' \ge M' + 1$, the instance of Problem 2 we are considering has more unknowns than equations. We may set the last $N' - (M' + 1)$ unknowns to zero, while keeping the system underdetermined. This simply amounts to replacing the degree bounds $N'_0, \dots, N'_{\nu-1}$ by $N'_0, \dots, N'_{\nu'-2}, N''_{\nu'-1}$, for $\nu' \le \nu$ and $N''_{\nu'-1} \le N'_{\nu'-1}$ such that $N'_0 + \cdots + N'_{\nu'-2} + N''_{\nu'-1} = M' + 1$. In particular, $\nu$ may only decrease through this process.

In what follows, we will work with the reversals of the input and output polynomials of Problem 2, defined by

$$\bar{P}_i = X^{M'_i} P_i(X^{-1}), \qquad \bar{F}_{i,j} = X^{M'_i - 1} F_{i,j}(X^{-1}), \qquad \bar{Q}_j = X^{N'_j - 1} Q_j(X^{-1}).$$

Let also $\beta = \max_{h<\nu} N'_h$ and, for $0 \le i < \mu$ and $0 \le j < \nu$,

$$\delta_i = M'_i + \beta - 1 \qquad\text{and}\qquad \gamma_j = \beta - N'_j.$$

In particular, $\delta_i > 0$ and $\gamma_j \ge 0$; recalling that $P_i$ is monic, so that $\bar{P}_i$ has constant coefficient $1$ and is thus invertible modulo any power of $X$, we can further define the polynomials $S_{i,j}$ in $K[X]$ as

$$S_{i,j} = X^{\gamma_j}\,\frac{\bar{F}_{i,j}}{\bar{P}_i} \bmod X^{\delta_i}$$

for $0 \le i < \mu$ and $0 \le j < \nu$. (Those polynomials can be seen as a generalization of what is usually called syndrome polynomials in the context of coding theory; see for example [54].) By using these polynomials, we can now reformulate the approximation condition of Problem 2 in terms of a set of extended key equations:

Lemma 4: Let $Q_0, \dots, Q_{\nu-1}$ be polynomials in $K[X]$ that satisfy condition (b) in Problem 2. They satisfy condition (c) in Problem 2 if and only if for all $i$ in $\{0, \dots, \mu-1\}$, there exists a polynomial $T_i$ in $K[X]$ such that

$$\sum_{0\le j<\nu} S_{i,j}\,\bar{Q}_j = T_i \bmod X^{\delta_i} \qquad\text{and}\qquad \deg(T_i) < \beta - 1. \tag{5}$$

Proof: Condition (c) holds if and only if for all $i$ in $\{0, \dots, \mu-1\}$, there exists a polynomial $B_i$ in $K[X]$ such that

$$\sum_{0\le j<\nu} F_{i,j}\,Q_j = B_i P_i. \tag{6}$$

For all $i, j$, the summand $F_{i,j} Q_j$ has degree less than $M'_i + N'_j - 1$, so the left-hand term above has degree less than $\delta_i$. Since $P_i$ has degree $M'_i$, this implies that whenever a polynomial $B_i$ as above exists, we must have $\deg(B_i) < \delta_i - M'_i = \beta - 1$. Now, by substituting $1/X$ for $X$ and multiplying by $X^{\delta_i - 1}$, we can rewrite the identity in (6) as

$$\sum_{0\le j<\nu} \bar{F}_{i,j}\,\bar{Q}_j\,X^{\gamma_j} = T_i \bar{P}_i, \tag{7}$$

where $T_i$ is the polynomial of degree less than $\beta - 1$ given by $T_i = X^{\beta-2} B_i(X^{-1})$. Since the degrees of both sides of (7) are less than $\delta_i$, one can consider the above identity modulo $X^{\delta_i}$ without loss of generality, and since $\bar{P}_i(0) = 1$ one can further divide by $\bar{P}_i$ modulo $X^{\delta_i}$. This shows that (7) is equivalent to the identity in (5), and the proof is complete. $\Box$

Following [43] and [54], we are going to rewrite the conditions in (5) as a linear system in the coefficients of the polynomials $Q_0, \dots, Q_{\nu-1}$, eliminating the unknowns $T_i$ from the outset. Let us first define the coefficient vector of a solution $(Q_0, \dots, Q_{\nu-1})$ to Problem 2 as the vector in $K^{N'}$ obtained by concatenating, for $0 \le j < \nu$, the vectors $[Q_j^{(0)}, Q_j^{(1)}, \dots, Q_j^{(N'_j-1)}]^{\mathsf{T}}$ of the coefficients of $Q_j$.

Furthermore, denoting by $S_{i,j}^{(0)}, S_{i,j}^{(1)}, \dots, S_{i,j}^{(\delta_i-1)}$ the $\delta_i \ge 1$ coefficients of the polynomial $S_{i,j}$, we set up the block matrix

$$A = \big[A_{i,j}\big]_{0\le i<\mu,\,0\le j<\nu} \in K^{M'\times N'},$$

whose block $(i, j)$ is the Hankel matrix

$$A_{i,j} = \big[S_{i,j}^{(u+v+\gamma_j)}\big]_{0\le u<M'_i,\,0\le v<N'_j} \in K^{M'_i\times N'_j}.$$

Lemma 5: A nonzero vector of $K^{N'}$ is in the right nullspace of $A$ if and only if it is the coefficient vector of a solution $(Q_0, \dots, Q_{\nu-1})$ to Problem 2.

Proof: It is sufficient to consider a polynomial tuple $(Q_0, \dots, Q_{\nu-1})$ that satisfies (b). Then, looking at the high-degree terms in the identities in (5), we see that condition (c) is equivalent to the following homogeneous system of linear


Algorithm 2 Solving Problem 2 via a Mosaic-Hankel Linear System

Input: positive integers $\mu, \nu, M'_0, \dots, M'_{\mu-1}, N'_0, \dots, N'_{\nu-1}$ and polynomial tuples $\{(P_i, F_{i,0}, \dots, F_{i,\nu-1})\}_{0\le i<\mu}$ in $K[X]^{\nu+1}$ such that for all $i$, $P_i$ is monic of degree $M'_i$ and $\deg(F_{i,j}) < M'_i$ for all $j$.

Output: polynomials $Q_0, \dots, Q_{\nu-1}$ in $K[X]$ satisfying (a), (b), (c).

1. For $i < \mu$, $j < \nu$, compute the coefficients $S_{i,j}^{(\gamma_j + r)}$ for $r < M'_i + N'_j - 1$, that is, the coefficients of the polynomials $S^\star_{i,j}$ defined in (11)
2. For $i < \mu$ and $j < \nu$, compute the vectors $v_{i,j}$ and $w_{i,j}$ as defined in (8) and (9)
3. For $i < \mu$, compute $r_i = M'_0 + \cdots + M'_{i-1}$; for $j < \nu$, compute $c_j = N'_0 + \cdots + N'_j - 1$
4. Deduce the generators $V$ and $W$ as defined in (10) from the $r_i$, $c_j$, $v_{i,j}$, $w_{i,j}$
5. Use the algorithm of Proposition 2 with input $V$ and $W$; if there is no solution then exit with no solution, otherwise find the coefficients of $Q_0, \dots, Q_{\nu-1}$
6. Return $Q_0, \dots, Q_{\nu-1}$

equations over $K$: for all $i$ in $\{0, \dots, \mu-1\}$ and all $\delta$ in $\{\delta_i - M'_i, \dots, \delta_i - 1\}$,

$$\sum_{\substack{0\le j<\nu \\ 0\le r<N'_j}} S_{i,j}^{(\delta - r)}\, Q_j^{(N'_j - 1 - r)} = 0.$$

The matrix obtained by considering all these equations is precisely the matrix $A$. $\Box$

We will use the approach recalled in Section III-A to find a nonzero nullspace element for $A$, with respect to the displacement operator $\Delta'_{M',N'}$. Not only do we need to prove that the displacement rank of $A$ with respect to $\Delta'_{M',N'}$ is bounded by a value $\alpha$ not too large, but we also have to efficiently compute a generator of length $\alpha$ for $A$, that is, a pair of matrices $(V, W)$ in $K^{M'\times\alpha} \times K^{\alpha\times N'}$ such that $A - Z_{M'} A Z_{N'} = VW$. We will see that here, computing such a generator boils down to computing the coefficients of the polynomials $S_{i,j}$. The cost incurred by computing this generator is summarized in the following lemma; combined with Proposition 2 and Lemma 5, this proves Theorem 2.

Lemma 6: The displacement rank of $A$ with respect to $\Delta'_{M',N'}$ is at most $\mu + \nu$. Furthermore, one can compute a corresponding generator of length $\mu + \nu$ for $A$ using $O((\mu+\nu)\,\mathsf{M}(M'))$ operations in $K$.

Proof: We are going to exhibit two matrices $V \in K^{M'\times(\mu+\nu)}$ and $W \in K^{(\mu+\nu)\times N'}$ such that $A - Z_{M'} A Z_{N'} = VW$. Because of the structure of $A$, at most $\mu$ rows and $\nu$ columns of the matrix $A - Z_{M'} A Z_{N'}$ are nonzero. More precisely, only the first row and the last column of each $M'_i \times N'_j$ block of this matrix can be nonzero. Indexing the rows (resp. columns) of $A - Z_{M'} A Z_{N'}$ from $0$ to $M'-1$ (resp. from $0$ to $N'-1$), only the $\mu$ rows with indices of the form $r_i = M'_0 + \cdots + M'_{i-1}$ for $i = 0, \dots, \mu-1$ can be nonzero, and only the $\nu$ columns with indices of the form $c_j = N'_0 + \cdots + N'_j - 1$ for $j = 0, \dots, \nu-1$ can be nonzero.

For two integers $i, K$ with $0 \le i < K$, define $O_{i,K} = [0 \cdots 0\ 1\ 0 \cdots 0]^{\mathsf{T}} \in K^{K}$ with the $1$ at position $i$, and

$$O^{(V)} = \big[O_{r_i,M'}\big]_{0\le i<\mu} \in K^{M'\times\mu}, \qquad O^{(W)} = \big[O_{c_j,N'}\big]^{\mathsf{T}}_{0\le j<\nu} \in K^{\nu\times N'}.$$

For given $i$ in $\{0, \dots, \mu-1\}$ and $j$ in $\{0, \dots, \nu-1\}$, we will consider $v_{i,j} = [v_{i,j}^{(r)}]_{0\le r<M'_i}$ in $K^{M'_i\times 1}$ and $w_{i,j} = [w_{i,j}^{(r)}]_{0\le r<N'_j}$ in $K^{1\times N'_j}$, which are respectively the last column and the first row of the block $(i, j)$ in $A - Z_{M'} A Z_{N'}$, up to a minor point: the first entry of $v_{i,j}$ is set to zero. The coefficients $v_{i,j}^{(r)}$ and $w_{i,j}^{(r)}$ can then be expressed in terms of the entries $A_{i,j}^{(u,v)} = S_{i,j}^{(u+v+\gamma_j)}$ of the Hankel matrix $A_{i,j} = [A_{i,j}^{(u,v)}]_{0\le u<M'_i,\,0\le v<N'_j}$ as follows:

$$v_{i,j}^{(r)} = \begin{cases} 0 & \text{if } r = 0,\\[2pt] A_{i,j}^{(r,\,N'_j-1)} - A_{i,j+1}^{(r-1,\,0)} & \text{if } 1 \le r < M'_i, \end{cases} \tag{8}$$

$$w_{i,j}^{(r)} = \begin{cases} A_{i,j}^{(0,\,r)} - A_{i-1,j}^{(M'_{i-1}-1,\,r+1)} & \text{if } r < N'_j - 1,\\[2pt] A_{i,j}^{(0,\,N'_j-1)} - A_{i-1,j+1}^{(M'_{i-1}-1,\,0)} & \text{if } r = N'_j - 1. \end{cases} \tag{9}$$

Note that here, we use the convention that an indexed object is zero when the index is out of the allowed bounds for this object.

Then, we define $V_j$ and $W_i$ as

$$V_j = \begin{bmatrix} v_{0,j} \\ \vdots \\ v_{\mu-1,j} \end{bmatrix} \in K^{M'\times 1} \qquad\text{and}\qquad W_i = \big[w_{i,0} \cdots w_{i,\nu-1}\big] \in K^{1\times N'},$$

and we define $V'$ and $W'$ as

$$V' = \big[V_0 \cdots V_{\nu-1}\big] \in K^{M'\times\nu} \qquad\text{and}\qquad W' = \begin{bmatrix} W_0 \\ \vdots \\ W_{\mu-1} \end{bmatrix} \in K^{\mu\times N'}.$$

Now, one can easily verify that the matrices

$$V = \big[V'\ \ O^{(V)}\big] \in K^{M'\times(\mu+\nu)} \tag{10a}$$

and

$$W = \begin{bmatrix} O^{(W)} \\ W' \end{bmatrix} \in K^{(\mu+\nu)\times N'} \tag{10b}$$

are generators for $A$, that is, $A - Z_{M'} A Z_{N'} = VW$.

We notice that all we need in order to compute the generators $V$ and $W$ are the last $M'_i + N'_j - 1$ coefficients of $S_{i,j}(X) = S_{i,j}^{(0)} + S_{i,j}^{(1)} X + \cdots + S_{i,j}^{(\delta_i-1)} X^{\delta_i-1}$ for $0 \le i < \mu$ and $0 \le j < \nu$. Now, recall that

$$S_{i,j} = X^{\gamma_j}\,\frac{\bar{F}_{i,j}}{\bar{P}_i} \bmod X^{\delta_i} = X^{\delta_i-(M'_i+N'_j-1)}\,\frac{\bar{F}_{i,j}}{\bar{P}_i} \bmod X^{\delta_i}.$$


Thus, the first $\delta_i - (M'_i + N'_j - 1)$ coefficients of $S_{i,j}$ are zero, and the last $M'_i + N'_j - 1$ coefficients of $S_{i,j}$ are the coefficients of

$$S^\star_{i,j} = \frac{\bar{F}_{i,j}}{\bar{P}_i} \bmod X^{M'_i+N'_j-1}, \tag{11}$$

which can be computed in $O(\mathsf{M}(M'_i + N'_j))$ operations in $K$ by fast power series division. By expanding products, we see that $\mathsf{M}(M'_i + N'_j) = O(\mathsf{M}(M'_i) + \mathsf{M}(N'_j))$. Summing the costs, we obtain an upper bound of the form

$$O\Big(\sum_{\substack{0\le i<\mu \\ 0\le j<\nu}} \mathsf{M}(M'_i) + \mathsf{M}(N'_j)\Big),$$

which is in $O(\nu\,\mathsf{M}(M') + \mu\,\mathsf{M}(N'))$ using the super-linearity of $\mathsf{M}$. Since we assumed in (4) that $N' \le M' + 1$, this is $O((\mu+\nu)\,\mathsf{M}(M'))$. $\Box$
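Step 1 of Algorithm 2 thus reduces to truncated power series division. Here is a naive $O(k^2)$ sketch of that operation (illustrative only; the complexity analysis assumes a fast, Newton-iteration-based division instead):

```python
from fractions import Fraction

def series_div(f, p, k):
    """Coefficients c[0..k-1] of f/p mod X^k, assuming p[0] != 0."""
    f = [Fraction(c) for c in f] + [Fraction(0)] * k
    p = [Fraction(c) for c in p] + [Fraction(0)] * k
    c = []
    for n in range(k):
        # from f = p*c: c[n] = (f[n] - sum_{t=1}^{n} p[t] c[n-t]) / p[0]
        c.append((f[n] - sum(p[t] * c[n - t] for t in range(1, n + 1))) / p[0])
    return c

# 1/((1-2X)(1-X)) = sum_t (2^{t+1} - 1) X^t
assert series_div([1], [1, -3, 2], 4) == [1, 3, 7, 15]
```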

C. A Direct Solution to Problem 2

In this section, we propose an alternative solution to Problem 2 which leads to the same asymptotic running time as in the previous section but avoids the extended key equations of Lemma 4; it is outlined in Algorithm 3. As above, our input consists of the polynomials $(P_i, F_{i,0}, \dots, F_{i,\nu-1})_{0\le i<\mu}$ and we look for polynomials $Q_0, \dots, Q_{\nu-1}$ in $K[X]$ such that for $0 \le i < \mu$, $\sum_{0\le j<\nu} F_{i,j} Q_j = 0 \bmod P_i$, with the $Q_j$ not all zero and, for $j < \nu$, $\deg Q_j < N'_j$.

In addition, for $r \ge 0$, we denote by $F_{i,j}^{(r)}$ and $P_i^{(r)}$ the coefficients of degree $r$ of $F_{i,j}$ and $P_i$, respectively, and we define $C_i$ as the $M'_i \times M'_i$ companion matrix of $P_i$; if $B$ is a polynomial of degree less than $M'_i$ with coefficient vector $v \in K^{M'_i}$, then the product $C_i v \in K^{M'_i}$ is the coefficient vector of the polynomial $XB \bmod P_i$. Explicitly, we have

$$C_i = \begin{bmatrix} 0 & 0 & \cdots & 0 & -P_i^{(0)} \\ 1 & 0 & \cdots & 0 & -P_i^{(1)} \\ 0 & 1 & \cdots & 0 & -P_i^{(2)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -P_i^{(M'_i-1)} \end{bmatrix} \in K^{M'_i\times M'_i}.$$
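As a small sanity check (illustrative code, not the paper's), multiplying by such a companion matrix indeed realizes $B \mapsto XB \bmod P$ on coefficient vectors:

```python
def companion(p):
    """Companion matrix of the monic P = X^m + p[m-1] X^{m-1} + ... + p[0]."""
    m = len(p)
    C = [[0] * m for _ in range(m)]
    for r in range(1, m):
        C[r][r - 1] = 1              # subdiagonal: multiplication by X shifts coefficients
    for r in range(m):
        C[r][m - 1] = -p[r]          # last column: reduction of X^m modulo P
    return C

def mat_vec(C, v):
    return [sum(C[r][c] * v[c] for c in range(len(v))) for r in range(len(C))]

# P = X^2 - 3X + 2 and B = X: then X*B mod P = X^2 mod P = 3X - 2
C = companion([2, -3])
assert mat_vec(C, [0, 1]) == [-2, 3]
```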

We are going to see that solving Problem 2 is equivalent to finding a nonzero solution to a homogeneous linear system whose matrix is $A' = (A'_{i,j}) \in K^{M'\times N'}$, where for $i < \mu$ and $j < \nu$, $A'_{i,j} \in K^{M'_i\times N'_j}$ is a matrix which depends on the coefficients of $F_{i,j}$ and $P_i$. Without loss of generality, we make the same assumption as in the previous section, that is, $N' \le M' + 1$ holds.

For $i, j$ as above and for $h \in \mathbb{Z}_{\ge 0}$, let $\alpha_{i,j}^{(h)} \in K^{M'_i}$ be the coefficient vector of the polynomial $X^h F_{i,j} \bmod P_i$, so that these vectors are given by

$$\alpha_{i,j}^{(0)} = \begin{bmatrix} F_{i,j}^{(0)} \\ \vdots \\ F_{i,j}^{(M'_i-1)} \end{bmatrix} \qquad\text{and}\qquad \alpha_{i,j}^{(h+1)} = C_i\,\alpha_{i,j}^{(h)}.$$

Let then $A' = (A'_{i,j}) \in K^{M'\times N'}$, where for every $i < \mu$ and $j < \nu$, the block $A'_{i,j} \in K^{M'_i\times N'_j}$ is defined by

$$A'_{i,j} = \big[\alpha_{i,j}^{(0)} \cdots \alpha_{i,j}^{(N'_j-1)}\big].$$

Lemma 7: A nonzero vector of $K^{N'}$ is in the right nullspace of $A'$ if and only if it is the coefficient vector of a solution $(Q_0, \dots, Q_{\nu-1})$ to Problem 2.

Proof: By definition, $A'_{i,j}$ is the $M'_i \times N'_j$ matrix of the mapping $Q \mapsto F_{i,j} Q \bmod P_i$, for $Q$ in $K[X]$ of degree less than $N'_j$. Thus, if $(Q_0, \dots, Q_{\nu-1})$ is a $\nu$-tuple of polynomials that satisfies the degree constraint (b) in Problem 2, applying $A'$ to the coefficient vector of this tuple outputs the coefficients of the remainders $\sum_{0\le j<\nu} F_{i,j} Q_j \bmod P_i$, for $i = 0, \dots, \mu-1$. The claimed equivalence then follows immediately. $\Box$
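A toy instance (chosen for illustration, with $\mu = 1$, $\nu = 2$) makes Lemma 7 concrete: taking $P_0 = X^2 - 1$, $F_{0,0} = X$, $F_{0,1} = 1$ and $N'_0 = N'_1 = 2$, the vector encoding $(Q_0, Q_1) = (1, -X)$ lies in the nullspace of $A'$, matching $X\cdot 1 + 1\cdot(-X) = 0 \bmod P_0$:

```python
def x_times_mod(p, v):
    """Coefficient vector of X*B mod P, for monic P = X^m + sum p[r] X^r."""
    top = v[-1]
    w = [0] + v[:-1]
    return [w[r] - top * p[r] for r in range(len(v))]

def block(p, f, ncols):
    """The block [alpha^(0) ... alpha^(ncols-1)] of columns X^h F mod P."""
    m = len(p)
    col = (f + [0] * m)[:m]
    cols = []
    for _ in range(ncols):
        cols.append(col)
        col = x_times_mod(p, col)
    return [[cols[j][r] for j in range(ncols)] for r in range(m)]

p = [-1, 0]                             # P_0 = X^2 - 1
A0 = block(p, [0, 1], 2)                # F_{0,0} = X
A1 = block(p, [1], 2)                   # F_{0,1} = 1
A = [A0[r] + A1[r] for r in range(2)]   # mosaic [A'_{0,0} | A'_{0,1}]
q = [1, 0, 0, -1]                       # coefficient vector of (Q_0, Q_1) = (1, -X)
assert all(sum(A[r][c] * q[c] for c in range(4)) == 0 for r in range(2))
```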

The following lemma shows that $A'$ possesses a Toeplitz-like structure, with displacement rank at most $\mu + \nu$. Together with Proposition 2 and Lemma 7, this gives our second proof of Theorem 2.

Lemma 8: The displacement rank of $A'$ with respect to $\Delta_{M',N'}$ is at most $\mu + \nu$. Furthermore, one can compute a corresponding generator of length $\mu + \nu$ for $A'$ using $O((\mu+\nu)\,\mathsf{M}(M'))$ operations in $K$.

Proof: We begin by giving two matrices $Y \in K^{M'\times(\mu+\nu)}$ and $Z \in K^{(\mu+\nu)\times N'}$ such that $\Delta_{M',N'}(A')$ is equal to the product $YZ$. Define first the matrix

$$C = \begin{bmatrix} C_0 & 0 & \cdots & 0 \\ 0 & C_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_{\mu-1} \end{bmatrix} \in K^{M'\times M'}.$$

Up to $\mu$ columns, $C$ coincides with $Z_{M'}$; we make this explicit as follows. For $0 \le i < \mu$, we define

$$v_i = \begin{bmatrix} P_i^{(0)} \\ \vdots \\ P_i^{(M'_i-1)} \end{bmatrix} \in K^{M'_i}, \tag{12a}$$

$$V_i = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ v_i \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \in K^{M'}, \qquad W_i = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \in K^{M'}, \tag{12b}$$

where the last entry of $v_i$ in $V_i$ and the coefficient $1$ in $W_i$ have the same index, namely $M'_0 + \cdots + M'_i - 1$. (Hence the last vector $V_{\mu-1}$ only contains $v_{\mu-1}$, without a $1$ after it.) Then, defining $V = [V_0 \cdots V_{\mu-1}] \in K^{M'\times\mu}$ and $W = [W_0 \cdots W_{\mu-1}] \in K^{M'\times\mu}$, we obtain

$$C = Z_{M'} - V_0 W_0^{\mathsf{T}} - \cdots - V_{\mu-1} W_{\mu-1}^{\mathsf{T}} = Z_{M'} - V W^{\mathsf{T}}.$$

As before, we use the convention that an indexed object is zero when the index is out of the allowed bounds for this object.

As before, we use the convention that an indexed object is zerowhen the index is out of the allowed bounds for this object.


Algorithm 3 Solving Problem 2 via a Toeplitz-like Linear System

Input: positive integers $\mu, \nu, M'_0, \dots, M'_{\mu-1}, N'_0, \dots, N'_{\nu-1}$ and polynomial tuples $\{(P_i, F_{i,0}, \dots, F_{i,\nu-1})\}_{0\le i<\mu}$ in $K[X]^{\nu+1}$ such that for all $i$, $P_i$ is monic of degree $M'_i$ and $\deg(F_{i,j}) < M'_i$ for all $j$.

Output: polynomials $Q_0, \dots, Q_{\nu-1}$ in $K[X]$ satisfying (a), (b), (c).

1. Compute $v_i$ and $V_i$ for $i < \mu$, as defined in (12); compute $V = [V_0 \cdots V_{\mu-1}]$
2. Compute $W'_j$ for $j < \nu$, as defined in (13); compute $W' = [W'_0 \cdots W'_{\nu-1}]$
3. Compute $\alpha_{i,j}^{(N'_j)}$, that is, the coefficients of $X^{N'_j} F_{i,j} \bmod P_i$, for $i < \mu$, $j < \nu - 1$
4. Compute $V'_j$ for $j < \nu$, as defined in (13); compute $V' = [V'_0 \cdots V'_{\nu-1}]$
5. Compute the row of index $M'_0 + \cdots + M'_i - 1$ of $A'$, for $i < \mu$, that is, the coefficients of degree $M'_i - 1$ of $X^h F_{i,j} \bmod P_i$ for $h < N'_j$, $j < \nu$ (see Lemma 9 for fast computation)
6. Compute $W^{\mathsf{T}} A'$, whose row of index $i$ is the row of index $M'_0 + \cdots + M'_i - 1$ of $A'$
7. Compute the generators $Y$ and $Z$ as defined in (14)
8. Use the algorithm of Proposition 2 with input $Y$ and $Z$; if there is no solution then exit with no solution, otherwise find the coefficients of $Q_0, \dots, Q_{\nu-1}$
9. Return $Q_0, \dots, Q_{\nu-1}$

For 0 " j < ν, let us further define

V ′j =

⎢⎢⎣

α(0)0, j...

α(0)µ−1, j

⎥⎥⎦ −

⎢⎢⎢⎣

α(N ′

j−1)

0, j−1...

α(N ′

j−1)

µ−1, j−1

⎥⎥⎥⎦∈ KM ′

(13a)

and

W ′j =

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

0...010...0

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

∈ KN ′, (13b)

with the coefficient 1 in W ′j at index N ′

0 + · · · + N ′j−1, and

the compound matrices

V ′ = [V ′0 · · · V ′

ν−1] ∈ KM ′×ν,

W ′ = [W ′0 · · · W ′

ν−1] ∈ KN ′×ν.

Then, we claim that the matrices

$$Y = [-V\ \ V'] \in K^{M'\times(\mu+\nu)} \tag{14a}$$

and

$$Z = \begin{bmatrix} W^{\mathsf{T}} A' Z_{N'}^{\mathsf{T}} \\ W'^{\mathsf{T}} \end{bmatrix} \in K^{(\mu+\nu)\times N'} \tag{14b}$$

are generators for $A'$ for the Toeplitz-like displacement structure, that is,

$$A' - Z_{M'} A' Z_{N'}^{\mathsf{T}} = YZ.$$

By construction, we have $C A' = (B_{i,j})_{i<\mu,\,j<\nu} \in K^{M'\times N'}$, with $B_{i,j}$ given by

$$B_{i,j} = C_i A'_{i,j} = \big[\alpha_{i,j}^{(1)} \cdots \alpha_{i,j}^{(N'_j-1)}\ \ \alpha_{i,j}^{(N'_j)}\big] \in K^{M'_i\times N'_j}.$$

As a consequence, $A' - C A' Z_{N'}^{\mathsf{T}} = V' W'^{\mathsf{T}}$, so finally we get, as claimed,

$$A' - Z_{M'} A' Z_{N'}^{\mathsf{T}} = A' - (C + VW^{\mathsf{T}})\, A' Z_{N'}^{\mathsf{T}} = A' - C A' Z_{N'}^{\mathsf{T}} - V W^{\mathsf{T}} A' Z_{N'}^{\mathsf{T}} = V' W'^{\mathsf{T}} - V W^{\mathsf{T}} A' Z_{N'}^{\mathsf{T}} = YZ.$$

To compute $Y$ and $Z$, the only non-trivial steps are those giving $V'$ and $W^{\mathsf{T}} A'$. For the former, we have to compute the coefficients of $X^{N'_j} F_{i,j} \bmod P_i$ for every $i < \mu$ and $j < \nu - 1$. For fixed $i$ and $j$, this can be done using fast Euclidean division in $O(\mathsf{M}(M'_i + N'_j))$ operations in $K$, which is $O(\mathsf{M}(M'_i) + \mathsf{M}(N'_j))$. Summing over the indices $i < \mu$ and $j < \nu - 1$, this gives a total cost of $O(\nu\,\mathsf{M}(M') + \mu\,\mathsf{M}(N'))$ operations. This is $O((\mu+\nu)\,\mathsf{M}(M'))$, since by assumption $N' \le M' + 1$.

Finally, we show that $W^{\mathsf{T}} A'$ can be computed using $O((\mu+\nu)\,\mathsf{M}(M'))$ operations as well. Computing this matrix amounts to computing the rows of $A'$ of indices $M'_0 + \cdots + M'_i - 1$, for $i < \mu$. By construction of $A'$, this means that we want to compute the coefficients of degree $M'_i - 1$ of $X^h F_{i,j} \bmod P_i$ for $h = 0, \dots, N'_j - 1$ and for all $i, j$. Unfortunately, the naive approach leads to a cost proportional to $M'N'$ operations, which is not acceptable. However, for $i$ and $j$ fixed, Lemma 9 below shows how to do this computation using only $O(\mathsf{M}(M'_i) + \mathsf{M}(N'_j))$ operations, which leads to the announced cost by summing over $i$ and $j$. $\Box$

Lemma 9: Let $P \in K[X]$ be monic of degree $m$, let $F \in K[X]$ be of degree less than $m$, and for $i \ge 0$ let $c_i$ denote the coefficient of degree $m-1$ of $X^i F \bmod P$. Then, for $n \ge 1$ we can compute $c_0, \dots, c_{n-1}$ using $O(\mathsf{M}(m) + \mathsf{M}(n))$ operations in $K$.

Proof: Writing $F = \sum_{0\le j<m} f_j X^j$, we have $X^i F \bmod P = \sum_{0\le j<m} f_j\,(X^{i+j} \bmod P)$. Hence $c_i = \sum_{0\le j<m} f_j\,b_{i+j}$, with $b_i$ denoting the coefficient of degree $m-1$ of $X^i \bmod P$. Since $b_0 = \cdots = b_{m-2} = 0$ and $b_{m-1} = 1$, we can deduce $c_0, \dots, c_{n-1}$ from $b_{m-1}, b_m, \dots, b_{m+n-2}$ in time $O(\mathsf{M}(n))$ by multiplication by the lower triangular Toeplitz matrix $[f_{m+j-i-1}]_{i,j}$ of order $n-1$.

Thus, we are left with the question of computing the $n-1$ coefficients $b_m, \dots, b_{m+n-2}$. Writing $P$ as $P = X^m + \sum_{0\le j<m} p_j X^j$ and using the fact that $X^i P \bmod P = 0$ for all $i \ge 0$, we see that the $b_i$ are generated by a linear recurrence of order $m$ with constant coefficients:

$$b_{i+m} + \sum_{0\le j<m} p_j\,b_{i+j} = 0 \qquad\text{for all } i \ge 0.$$


Consequently, b_m, …, b_{m+n−2} can be deduced from b_0, …, b_{m−1} in time O((n/m) M(m)), which is O(M(m) + M(n)), by ⌈(n−1)/m⌉ calls to Shoup's algorithm for extending a linearly recurrent sequence [46, Th. 3.1]. ∎
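As a deliberately naive illustration of the bookkeeping in Lemma 9, the following Python sketch extends the b_i with the order-m recurrence and then forms the c_i directly. It runs in O(mn) integer operations rather than the O(M(m) + M(n)) of the lemma, which would require Shoup's sequence extension and a single Toeplitz matrix-vector product; the coefficient lists are hypothetical integer polynomials, low degree first.

```python
def trailing_coeffs(p, f, n):
    """c_i = coefficient of degree m-1 of X^i * F mod P, for i = 0..n-1.

    p -- coefficients p_0..p_{m-1} of the monic P = X^m + sum_j p_j X^j
    f -- coefficients f_0..f_{m-1} of F (degree < m)
    Direct O(m*n) version of Lemma 9; the paper reaches O(M(m) + M(n))
    by extending the sequence with Shoup's algorithm and replacing the
    final sums with one Toeplitz matrix-vector product.
    """
    m = len(p)
    # b_i = coefficient of degree m-1 of X^i mod P: the first m values
    # are 0, ..., 0, 1, and the rest satisfy the order-m recurrence
    # b_{i+m} = -sum_{j<m} p_j b_{i+j}.
    b = [0] * (m - 1) + [1]
    while len(b) < m + n - 1:
        b.append(-sum(p[j] * b[len(b) - m + j] for j in range(m)))
    # c_i = sum_{j<m} f_j b_{i+j}
    return [sum(f[j] * b[i + j] for j in range(m)) for i in range(n)]
```

For example, with P = X² − 1 (so p = [−1, 0]) and F = X (so f = [0, 1]), the sequence of trailing coefficients of X^i F mod P alternates between 1 and 0.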

IV. APPLICATIONS TO THE DECODING OF REED-SOLOMON CODES

To conclude, we discuss Theorem 1 in specific contexts related to the decoding of Reed-Solomon codes; in this section we always have s = 1. First, we give our complexity result in the case of list-decoding via the Guruswami-Sudan algorithm [23]; then we show how the re-encoding technique [27], [29] can be used in our setting; then, we discuss the interpolation step of the Wu algorithm [52]; and finally we present the application of our results to the interpolation step of soft-decoding [28]. In these contexts of application, we will use some of the assumptions on the parameters H1, H2, H3, H4 given in Section I. Note that in the context of soft-decoding, the x_i in the input of Problem 1 are not necessarily pairwise distinct: we will explain how to adapt our algorithms to this case. Besides, still in this context, the number of points n is no longer equal to the length of the code and may actually be much larger, unlike in hard-decision (list-)decoding.

A. Interpolation Step of the Guruswami-Sudan Algorithm

We study here the specific context of the interpolation step of the Guruswami-Sudan list-decoding algorithm for Reed-Solomon codes. This interpolation step is precisely Problem 1 where we have s = 1 and we make assumptions H1, H2, H3, H4. Under H2, the set Γ introduced in Theorem 1 reduces to {j ∈ Z_{≥0} : j ≤ ℓ} = {0, …, ℓ}, so that |Γ| = ℓ + 1. Thus, assumption H1 ensures that the parameter ϱ in that theorem is ϱ = ℓ + 1; because of H4 all multiplicities are equal, so that we further have

$$M = \binom{m+1}{2} n = \frac{m(m+1)}{2}\, n.$$

From Theorem 1, we obtain the following result, which substantiates our claimed cost bound in Section I, Table I.

Corollary 1: Taking s = 1, if the parameters ℓ, n, m := m_1 = ··· = m_n, b and k := k_1 satisfy H1, H2, H3, H4, then there exists a probabilistic algorithm that computes a solution to Problem 1 using

O(ℓ^{ω−1} M(m²n) log(mn)²) ⊆ Õ(ℓ^{ω−1} m²n)

operations in K, with probability of success at least 1/2.

We note that the probability analysis in Theorem 1 is simplified in this context. Indeed, to ensure probability of success at least 1/2, the algorithm chooses O(m²n) elements uniformly at random in a set S ⊆ K of cardinality at least 24m⁴n²; if |K| < 24m⁴n², one can use the remarks following Theorem 1 in Section I about solving the problem over an extension of K and retrieving a solution over K. Here, the base field K of a Reed-Solomon code must be of cardinality at least n since the x_i are distinct; then, an extension degree d = O(log_n(m)) suffices and the cost bound above becomes

O(ℓ^{ω−1} M(m²n) log(mn)² · M(d) log(d)).

Besides, in the list-decoding of Reed-Solomon codes we have m = O(n²), so that d = O(1) and the cost bound and probability of success in Corollary 1 hold for any field K (of cardinality at least n).

B. Re-encoding Technique

The re-encoding technique was introduced by Koetter and Vardy [27], [29] in order to reduce the cost of the interpolation step in list- and soft-decoding of Reed-Solomon codes. Here, for the sake of clarity, we present this technique only in the context of Reed-Solomon list-decoding via the Guruswami-Sudan algorithm, using the same notation and assumptions as in Subsection IV-A above: s = 1 and we have H1, H2, H3, H4. Under an additional assumption on the input points of Problem 1, partially pre-solving the problem yields an interpolation problem whose linearization has smaller dimensions. The idea at the core of this technique is summarized in the following lemma [29, Lemma 4].

Lemma 10: Let m be a positive integer, x be an element of K, and Q = ∑_j Q_j(X) Y^j be a polynomial in K[X, Y]. Then, Q(x, 0) = 0 with multiplicity at least m if and only if (X − x)^{m−j} divides Q_j for each j < m.

Proof: By definition, Q(x, 0) = 0 with multiplicity at least m if and only if Q(X + x, Y) has no monomial of total degree less than m. Since Q(X + x, Y) = ∑_j Q_j(X + x) Y^j, this is equivalent to the fact that X^{m−j} divides Q_j(X + x) for each j < m. ∎

This property can be generalized to the case of several roots of the form (x, 0). More precisely, the re-encoding technique is based on a shift of the received word by a well-chosen codeword, which allows us to ensure the following assumption on the points {(x_r, y_r)}_{1≤r≤n}: for some integer n_0 ≥ k + 1,

y_1 = ··· = y_{n_0} = 0 and y_{n_0+1} ≠ 0, …, y_n ≠ 0.  (15)

We now define the polynomial G_0 = ∏_{1≤r≤n_0}(X − x_r), which vanishes at x_r when y_r = 0, and Lemma 10 can be rewritten as follows: Q(x_r, 0) = 0 with multiplicity at least m for 1 ≤ r ≤ n_0 if and only if G_0^{m−j} divides Q_j for each j < m. Thus, we know how to solve the vanishing condition for the n_0 points for which y_r = 0: by setting each of the m polynomials Q_0, …, Q_{m−1} as the product of a power of G_0 and an unknown polynomial. Combining this with the polynomial approximation problem corresponding to the points {(x_r, y_r)}_{n_0+1≤r≤n}, there remains to solve a smaller approximation problem.
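To make the pre-solving data concrete, the following Python sketch builds the three polynomials that parameterize the reduced problem: G_0 from the re-encoded positions (those with y_r = 0), G from the remaining positions, and the interpolant R. Exact rationals stand in for arithmetic in a finite field, and the sample points are hypothetical.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (low degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def roots_to_poly(xs):
    """Monic product of (X - x) over the given roots, e.g. G_0 or G."""
    p = [Fraction(1)]
    for x in xs:
        p = poly_mul(p, [Fraction(-x), Fraction(1)])
    return p

def lagrange(points):
    """Interpolation polynomial R with deg R < len(points) and R(x_r) = y_r."""
    r = [Fraction(0)] * len(points)
    for x_i, y_i in points:
        num, den = [Fraction(1)], Fraction(1)
        for x_j, _ in points:
            if x_j != x_i:
                num = poly_mul(num, [Fraction(-x_j), Fraction(1)])
                den *= (x_i - x_j)
        for k, c in enumerate(num):
            r[k] += y_i * c / den
    return r

# Hypothetical points satisfying (15) after reordering: y = 0 for r <= n0.
points = [(0, 0), (1, 0), (2, 3), (3, 7)]  # here n0 = 2
zero = [x for x, y in points if y == 0]
rest = [(x, y) for x, y in points if y != 0]
G0 = roots_to_poly(zero)                 # vanishes at the re-encoded positions
G = roots_to_poly([x for x, _ in rest])  # modulus of the smaller problem
R = lagrange(rest)                       # deg R < n - n0, R(x_r) = y_r
```

With the sample data, G_0 = X(X − 1), G = (X − 2)(X − 3), and R = 4X − 5.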

Indeed, under the previously mentioned assumptions s = 1 and H1, H2, H3, H4, it has been shown in Section II that the vanishing condition (iv) of Problem 1 restricted to the points {(x_r, y_r)}_{n_0+1≤r≤n} is equivalent to the simultaneous polynomial approximations

$$\sum_{i \le j \le \ell} \binom{j}{i} R^{j-i} Q_j = 0 \bmod G^{m-i} \quad \text{for } i < m,$$

where G = ∏_{n_0+1≤r≤n}(X − x_r) and R is the interpolation polynomial such that deg R < n − n_0 and R(x_r) = y_r for n_0 + 1 ≤ r ≤ n. On the other hand, we have seen that the vanishing condition for the points {(x_r, y_r)}_{1≤r≤n_0} is equivalent to Q_j = G_0^{m−j} Q*_j for each j < m, for some unknown polynomials Q*_0, …, Q*_{m−1}. Combining both equivalences,


Algorithm 4 Interpolation Step of List-Decoding Reed-Solomon Codes Using Re-Encoding

Input: ℓ, n, m, b, k in Z_{>0} satisfying H1, H2, H3, H4, and points {(x_r, y_r)}_{1≤r≤n} in K² with the x_r pairwise distinct and the y_r satisfying (15).

Output: Q_0, …, Q_ℓ in K[X] such that ∑_{j≤ℓ} Q_j Y^j is a solution to Problem 1 with input s = 1, ℓ, n, m = m_1 = ··· = m_n, b, k and {(x_r, y_r)}_{1≤r≤n}.

1. Compute µ = m, ν = ℓ + 1, M′_i = (m − i)(n − n_0) for i < m, N′_j = b − jk − n_0(m − j) for j < m, and N′_j = b − jk for m ≤ j ≤ ℓ
2. Compute G_0 = ∏_{1≤r≤n_0}(X − x_r) and P_i = (∏_{n_0+1≤r≤n}(X − x_r))^{m−i} for i < m
3. Compute the F_{i,j} for i < m and j ≤ ℓ as in (17)
4. Compute a solution Q_0, …, Q_ℓ to Problem 2 on input µ, ν, M′_0, …, M′_{m−1}, N′_0, …, N′_ℓ and the polynomials {(P_i, F_{i,0}, …, F_{i,ℓ})}_{0≤i<m}
5. Return G_0^m Q_0, G_0^{m−1} Q_1, …, G_0 Q_{m−1}, Q_m, …, Q_ℓ, or report "no solution" if Step 4 did

we obtain, for i < m,

$$\sum_{i \le j < m} F_{i,j}\, Q^\star_j + \sum_{m \le j \le \ell} F_{i,j}\, Q_j = 0 \bmod G^{m-i} \qquad (16)$$

with

$$F_{i,j} = \begin{cases} \binom{j}{i} R^{j-i} G_0^{m-j} \bmod G^{m-i} & \text{for } i \le j < m,\\ \binom{j}{i} R^{j-i} \bmod G^{m-i} & \text{for } m \le j \le \ell. \end{cases} \qquad (17)$$

Obviously, the degree constraints on Q_0, …, Q_{m−1} directly correspond to degree constraints on Q*_0, …, Q*_{m−1}, while those on Q_m, …, Q_ℓ are unchanged. The number of equations obtained when linearizing (16) is M′ = ∑_{i<m} deg(G^{m−i}) = (m(m+1)/2)(n − n_0), while the number of unknowns is N′ = ∑_{j<m}(b − jk − (m − j)n_0) + ∑_{m≤j≤ℓ}(b − jk) = ∑_{j≤ℓ}(b − jk) − (m(m+1)/2) n_0. In other words, we have reduced the number of (linear) unknowns as well as the number of (linear) equations by the same quantity (m(m+1)/2) n_0, which is the number of linear equations used to express the vanishing condition for the n_0 points (x_1, 0), …, (x_{n_0}, 0). (Note that if we were in the more general context of possibly distinct multiplicities, we would have set y_i = 0 for the n_0 points which have the highest multiplicities, in order to maximize the benefit of the re-encoding technique.)
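This bookkeeping is easy to check numerically. The following Python snippet, with small hypothetical parameters (not taken from the paper's tables), verifies that the equation count and the unknown count both drop by exactly m(m+1)/2 · n_0.

```python
# Hypothetical small parameters (chosen so that m <= ell, n0 <= n, k < n).
m, ell, n, n0, k, b = 2, 4, 10, 3, 4, 25

# Without re-encoding: M equations, N unknowns after linearization.
M = m * (m + 1) // 2 * n
N = sum(b - j * k for j in range(ell + 1))

# With re-encoding: M' equations, N' unknowns.
M_prime = m * (m + 1) // 2 * (n - n0)
N_prime = (sum(b - j * k - (m - j) * n0 for j in range(m))
           + sum(b - j * k for j in range(m, ell + 1)))

# Linear equations expressing the vanishing condition at the n0 zero points.
saved = m * (m + 1) // 2 * n0
```

Both differences M − M′ and N − N′ equal `saved`, as stated above.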

This re-encoding technique is summarized in Algorithm 4. Assuming that Step 4 is done using Algorithm 2 or 3, we obtain the following result about list-decoding of Reed-Solomon codes using the re-encoding technique.

Corollary 2: Take s = 1 and assume the parameters ℓ, n, m := m_1 = ··· = m_n, b and k := k_1 satisfy H1, H2, H3, H4. Assume further that the points {(x_r, y_r)}_{1≤r≤n} satisfy (15) for some n_0 ≥ k + 1. Then there exists a probabilistic algorithm that computes a solution to Problem 1 using

O(ℓ^{ω−1} M(m²(n − n_0)) log(n − n_0)² + m M(m n_0) + M(n_0) log(n_0)) ⊆ Õ(ℓ^{ω−1} m²(n − n_0) + m² n_0)

operations in K, with probability of success at least 1/2.

Proof: For Steps 1, 2, 3, the complexity analysis is similar to the one in the proof of Proposition 1; we still note that we have to compute G_0, so that these steps use O(ℓ M(m²(n − n_0)) log(n − n_0) + M(n_0) log(n_0)) operations in K. According to Theorem 2, Step 4 uses O(ℓ^{ω−1} M(m²(n − n_0)) log(n − n_0)²) operations in K. Step 5 uses O(m M(m n_0) + M(m²(n − n_0))) operations in K. Indeed, we first compute G_0, …, G_0^m using O(m M(m n_0)) operations, and then the products G_0^{m−j} Q_j for j < m are computed using O(m M(m n_0) + M(m²(n − n_0))) operations: for each j < m, the product G_0^{m−j} Q_j can be computed using O(M(m n_0) + M(deg(Q_j))) operations since G_0^{m−j} has degree at most m n_0; and from Algorithms 2 and 3 we know that deg Q_0 + ··· + deg Q_{m−1} ≤ (∑_{i<m} M′_i) + 1 (see (4) in Section III-B), with here ∑_{i<m} M′_i = (m(m+1)/2)(n − n_0). ∎

Similarly to the remarks following Corollary 1, if |K| < 24m²(n − n_0) then K does not contain enough elements to ensure a probability of success at least 1/2 using our algorithms, but one can solve the problem over an extension of degree O(1) and retrieve a solution over K without impacting the cost bound.

C. Interpolation Step in the Wu Algorithm

Our goal now is to show that our algorithms can also be used to efficiently solve the interpolation step in the Wu algorithm. In this context, we have s = 1 and we make assumptions H1, H2, H4 on the input parameters of Problem 1. We note that here the weight k is no longer related to the dimension of the code; besides, we may have k ≤ 0.

Roughly, the Wu algorithm [52] works as follows. It first uses the Berlekamp-Massey algorithm to reduce the problem of list-decoding a Reed-Solomon code to a problem of rational reconstruction which focuses on the error locations (while the Guruswami-Sudan algorithm directly relies on a problem of polynomial reconstruction which focuses on the correct locations). Then, it solves this problem using an interpolation step and a root-finding step which are very similar to the ones in the Guruswami-Sudan algorithm.

Here we focus on the interpolation step, which differs from the one in the Guruswami-Sudan algorithm by mainly one feature: the points {(x_r, y_r)}_{1≤r≤n} lie in K × (K ∪ {∞}), that is, some y_r may take the special value ∞. For a point (x, ∞), a polynomial Q in K[X, Y] and a parameter ℓ such that deg_Y(Q) ≤ ℓ, Wu defines in [52] the vanishing condition Q(x, ∞) = 0 with multiplicity at least m as the vanishing condition Q̃(x, 0) = 0 with multiplicity at least m, where Q̃ = Y^ℓ Q(X, Y^{−1}) is the reversal of Q with respect to the variable Y and the parameter ℓ. Thus, we have the following direct adaptation of Lemma 10.

Lemma 11: Let ℓ, m be positive integers, x be an element of K, and Q = ∑_{j≤ℓ} Q_j(X) Y^j be a polynomial in K[X, Y] with deg_Y(Q) ≤ ℓ. Then, Q(x, ∞) = 0 with multiplicity at least m if and only if (X − x)^{m−j} divides Q_{ℓ−j} for each j < m.
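As a concrete (and deliberately naive) illustration of Lemma 11, the following Python sketch checks Wu's vanishing condition at (x, ∞) by testing the required divisibilities with repeated synthetic division; Q is a hypothetical list [Q_0, …, Q_ℓ] of integer coefficient lists, low degree first.

```python
def divrem_linear(p, x):
    """Divide p (low-degree-first coefficients) by (X - x);
    return (quotient, remainder), where remainder == p(x)."""
    q = [0] * (len(p) - 1)
    acc = p[-1]
    for k in range(len(p) - 2, -1, -1):
        q[k] = acc
        acc = p[k] + x * acc
    return q, acc

def divides_power(poly, x, e):
    """Does (X - x)^e divide poly? Repeated synthetic division."""
    p = list(poly)
    for _ in range(e):
        if not p:
            return True  # zero polynomial is divisible by anything
        p, rem = divrem_linear(p, x)
        if rem != 0:
            return False
    return True

def vanishes_at_infinity(Q, x, m):
    """Wu's condition Q(x, oo) = 0 with multiplicity >= m:
    (X - x)^(m-j) must divide Q_{ell-j} for each j < m (Lemma 11)."""
    ell = len(Q) - 1
    return all(divides_power(Q[ell - j], x, m - j) for j in range(m))
```

For instance, with ℓ = 2, x = 1, m = 2, the condition requires (X − 1)² | Q_2 and (X − 1) | Q_1, while Q_0 is unconstrained.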

As in the re-encoding technique, assuming we reorder the points so that y_1 = ··· = y_{n_∞} = ∞ and y_r ≠ ∞ for r > n_∞, for some n_∞ ≥ 0, the vanishing condition of Problem 1 restricted to the points {(x_r, y_r)}_{1≤r≤n_∞} is equivalent to Q_{ℓ−j} = G_∞^{m−j} Q*_{ℓ−j} for each j < m, for some unknown polynomials Q*_{ℓ−m+1}, …, Q*_ℓ. The degree constraints on Q_{ℓ−m+1}, …, Q_ℓ directly correspond to degree constraints on Q*_{ℓ−m+1}, …, Q*_ℓ, while those on Q_0, …, Q_{ℓ−m} are unchanged.

This means that in the interpolation problem we are faced with, we can deal with the points of the form (x, ∞) the same way we dealt with the points of the form (x, 0) in the case of the re-encoding technique: we can pre-solve the corresponding equations efficiently, and we are left with an approximation problem whose dimensions are smaller than if no special attention had been paid when dealing with the points of the form (x, ∞). More precisely, let G_∞ = ∏_{1≤r≤n_∞}(X − x_r) as well as G = ∏_{n_∞+1≤r≤n}(X − x_r), and let R of degree less than n − n_∞ be such that R(x_r) = y_r for each r > n_∞. Defining further

$$F_{i,j} = \begin{cases} \binom{j}{i} R^{j-i} & \text{for } i \le j \le \ell - m,\\ \binom{j}{i} R^{j-i} G_\infty^{\,j-\ell+m} & \text{for } \ell - m < j \le \ell, \end{cases}$$

we obtain the following simultaneous polynomial approximations: for i < m,

$$\sum_{i \le j \le \ell - m} F_{i,j}\, Q_j + \sum_{\ell - m < j \le \ell} F_{i,j}\, Q^\star_j = 0 \bmod G^{m-i}.$$

Pre-solving the equations for the points of the form (x, ∞) has reduced the number of (linear) unknowns as well as the number of (linear) equations by the same quantity (m(m+1)/2) n_∞, which is the number of linear equations used to express the vanishing condition for the n_∞ points (x_1, ∞), …, (x_{n_∞}, ∞). We have the following result.

Corollary 3: Take s = 1 and assume that the parameters ℓ, n, m := m_1 = ··· = m_n, b and k := k_1 satisfy H1, H2, H4. Assume further that each of the points {(x_r, y_r)}_{1≤r≤n} is allowed to have the special value y_r = ∞. Then there exists a probabilistic algorithm that computes a solution to Problem 1 using

O(ℓ^{ω−1} M(m²n) log(n)²) ⊆ Õ(ℓ^{ω−1} m²n)

operations in K, with probability of success at least 1/2.

As above, if |K| < 24m²(n − n_∞) then, in order to ensure a probability of success at least 1/2 using our algorithms, one can solve the problem over an extension of degree O(1) and retrieve a solution over K, without impacting the cost bound.

We note that unlike in the re-encoding technique, where the focus was on a reduced cost involving n − n_0, here we are not interested in writing the detailed cost involving n − n_∞. The reason is that n_∞ is expected to be close to 0 in practice. The main advantage of the Wu algorithm over the Guruswami-Sudan algorithm is that it uses a smaller multiplicity m, at least for practical code parameters; details about the choice of the parameters m and ℓ in the context of the Wu algorithm can be found in [5, Sec. IV.C].

D. Application to Soft-Decoding of Reed-Solomon Codes

As a last application, we briefly sketch how to adapt our results to the context of soft-decoding, in which we still have s = 1. The interpolation step in soft-decoding of Reed-Solomon codes [28] differs from Problem 1 because there is no assumption ensuring that the x_r are pairwise distinct among the points {(x_r, y_r)}_{1≤r≤n}. Regarding our algorithms, this is not a minor issue, since this assumption is at the core of the reduction in Section II; we will see that we can still rely on Problem 2 in this context. However, although the number of linear equations ∑_{1≤r≤n} m_r(m_r+1)/2 imposed by the vanishing condition is not changed by the fact that several x_r can be the same field element, it is expected that the reduction to Problem 2 will not be as effective as before. More precisely, the displacement rank of the structured matrix in the linearizations of the problem in Algorithms 2 and 3 may in some cases be larger than if the x_r were pairwise distinct.

To measure to which extent we are far from the situation where the x_r are pairwise distinct, we use the parameter

q = max_{x∈K} |{r ∈ {1, …, n} : x_r = x}|.

For example, q = 1 corresponds to pairwise distinct x_r, while q = n corresponds to x_1 = ··· = x_n; we always have q ≤ n and, if K is a finite field, q ≤ |K|^s with s = 1 in our context here. Then, we can write the set of points P = {(x_r, y_r)}_{1≤r≤n} as the disjoint union of q sets P = P_1 ∪ ··· ∪ P_q, where each set P_h = {(x_{h,r}, y_{h,r})}_{1≤r≤n_h} is such that the x_{h,r} are pairwise distinct; we denote by m_{h,r} the multiplicity associated with the point (x_{h,r}, y_{h,r}) in the input of Problem 1. Now, the vanishing condition (iv) asks that the q vanishing conditions restricted to each P_h hold simultaneously. Indeed, Q(x_r, y_r) = 0 with multiplicity at least m_r for all points (x_r, y_r) in P if and only if, for each set P_h, Q(x_{h,r}, y_{h,r}) = 0 with multiplicity at least m_{h,r} for all points (x_{h,r}, y_{h,r}) in P_h.

We have seen in Section II how to rewrite the vanishing condition as simultaneous polynomial approximations when the x_r are pairwise distinct. This reduction extends to this case: by simultaneously rewriting the vanishing condition for each set P_h, one obtains a problem of simultaneous polynomial approximations whose solutions exactly correspond to the solutions of the instance of (extended) Problem 1 we are considering. Here, we do not give details about this reduction; they can be found in [53, Sec. 5.1.1]. Now, let m^{(h)} be the largest multiplicity among those of the points in P_h; in this reduction to Problem 2, the number of polynomial equations we obtain is ∑_{1≤h≤q} m^{(h)}. Thus, according to Theorem 2, for solving this instance of Problem 2, our Algorithms 2 and 3 use Õ(ρ^{ω−1} M′) operations in K, where ρ = max(ℓ + 1, ∑_{1≤h≤q} m^{(h)}) and M′ = ∑_{1≤r≤n} m_r(m_r+1)/2. We see in this cost bound that the distribution of the points into disjoint sets P = P_1 ∪ ··· ∪ P_q has an impact on the number of polynomial equations in the instance of Problem 2 we get: when choosing


this distribution, multiplicities could be taken into account in order to minimize this impact.

APPENDIX A
ON ASSUMPTION H1

In this appendix, we discuss the relevance of the assumption H1 introduced previously for Problem 1. In the introduction, we did not make any assumption on m = max_{1≤i≤n} m_i and ℓ, but we mentioned that the assumption H1, that is, m ≤ ℓ, is mostly harmless. The following lemma substantiates this claim, by showing that the case m > ℓ can be reduced to the case m = ℓ.

Lemma 12: Let s, ℓ, n, m_1, …, m_n, b, k be parameters for Problem 1, and suppose that m > ℓ. Define P = ∏_{1≤i≤n : m_i>ℓ}(X − x_i)^{m_i−ℓ} and d = deg(P). The solutions to this problem are the polynomials of the form Q = Q*P with Q* a solution for the parameters s, ℓ, n, m′_1, …, m′_n, b − d, k, where m′_i = ℓ if m_i > ℓ and m′_i = m_i otherwise.

Proof: Assume a solution exists, say Q, and let Q_i(X, Y) = Q(X + x_i, Y_1 + y_{i,1}, …, Y_s + y_{i,s}) for i = 1, …, n. Every monomial of Q_i has the form X^h Y^j with h ≥ m_i − ℓ, since |j| ≤ ℓ by condition (ii) and h + |j| ≥ m_i by condition (iv). Therefore, if m_i > ℓ then X^{m_i−ℓ} divides Q_i and, shifting back the coordinates for each i, we deduce that P divides Q.

Let us now consider the polynomial Q* = Q/P and show that it solves Problem 1 for the parameters s, ℓ, n, m′_1, …, m′_n, b − d, k. First, Q* clearly satisfies conditions (i) and (ii). Furthermore, writing Q = ∑_j Q_j(X) Y^j and Q* = ∑_j Q*_j(X) Y^j, we have Q*_j = Q_j / P for all j, so that

wdeg_k(Q*) = max_j (deg(Q_j) − d + k_1 j_1 + ··· + k_s j_s) = wdeg_k(Q) − d < b − d,

so that condition (iii) holds for Q* with b replaced by b − d. Finally, Q* satisfies condition (iv) with each m_i > ℓ replaced by m′_i = ℓ: writing Q*_i(X, Y) = Q*(X + x_i, Y_1 + y_{i,1}, …, Y_s + y_{i,s}) for i ∈ {1, …, n} such that m_i > ℓ, we have

$$Q^\star_i(X, \mathbf{Y}) = \frac{Q_i(X, \mathbf{Y})}{X^{m_i-\ell}\, P_i(X)}, \qquad \text{where} \qquad P_i(X) = \prod_{h \ne i \,:\, m_h > \ell} (X + x_i - x_h)^{m_h - \ell}.$$

All the monomials of Q_i(X, Y)/X^{m_i−ℓ} have the form X^h Y^j with h + |j| ≥ m_i − (m_i − ℓ) = ℓ and, since P_i(0) ≠ 0, the same holds for Q*_i(X, Y).

Conversely, let Q′ be any solution to Problem 1 with parameters s, ℓ, n, m′_1, …, m′_n, b − d, k. Proceeding as in the previous paragraph, one easily verifies that the product Q′P is a solution to Problem 1 with parameters s, ℓ, n, m_1, …, m_n, b, k. ∎
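The parameter reduction in Lemma 12 is easy to mechanize. The following Python sketch, with hypothetical integer inputs, computes the factorization data of P, the clamped multiplicities m′_i, and the new weighted-degree bound b − d; a full solver would then solve the reduced instance for Q* and return Q = Q*P.

```python
def clamp_multiplicities(xs, ms, ell, b):
    """Lemma 12 reduction for max(ms) > ell: record
    P = prod_{m_i > ell} (X - x_i)^(m_i - ell) of degree d as
    (root, exponent) pairs, clamp each multiplicity at ell, and
    lower the weighted-degree bound b to b - d."""
    factors = [(x, m - ell) for x, m in zip(xs, ms) if m > ell]
    d = sum(e for _, e in factors)
    return factors, [min(m, ell) for m in ms], b - d
```

For instance, with points x = (0, 1, 2), multiplicities (5, 2, 4), ℓ = 3 and b = 20, we get P = X²(X − 2) of degree d = 3, clamped multiplicities (3, 2, 3), and the new bound 17.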

APPENDIX B
ON ASSUMPTION H3

In this appendix, we show the relevance of the assumption "k_j < n for some j ∈ {1, …, s}" when considering Problem 1; in particular, when s = 1 or when we assume that k_1 = ··· = k_s =: k, this shows the relevance of the assumption H3: 0 ≤ k < n. More precisely, when k_j ≥ n for every j, Lemma 13 below gives an explicit solution to Problem 1.

Lemma 13: Let s, ℓ, n, m_1, …, m_n, b, k be parameters for Problem 1 and suppose that k_j ≥ n for j = 1, …, s. Define P = ∏_{1≤i≤n}(X − x_i)^{m_i} and d = deg(P) = ∑_{1≤i≤n} m_i. If b ≤ d then this problem has no solution. Otherwise, a solution is given by the polynomial P (considered as an element of K[X, Y]).

Proof: If b > d then it is easily checked that P satisfies conditions (i)–(iv) and thus solves Problem 1. Now, to conclude the proof, let us show that if Problem 1 admits a solution Q, then b > d must hold. Let d_Y = deg_Y Q. If d_Y ≥ m = max_i m_i, then the weighted-degree condition (iii) gives b > wdeg_k(Q) ≥ d_Y (min_j k_j) ≥ mn ≥ d. Let us finally assume d_Y < m. Following the proof of Lemma 12, we can write Q = P*Q* where P* = ∏_{1≤i≤n : m_i>d_Y}(X − x_i)^{m_i−d_Y}, for some Q* in K[X, Y] such that deg_Y Q* = d_Y. Then, the weighted-degree condition gives

$$b > \sum_{1 \le i \le n \,:\, m_i > d_Y} (m_i - d_Y) + \mathrm{wdeg}_{\mathbf{k}}(Q^\star) \ge \sum_{1 \le i \le n \,:\, m_i > d_Y} (m_i - d_Y) + d_Y n \ge \sum_{1 \le i \le n \,:\, m_i > d_Y} m_i + \sum_{1 \le i \le n \,:\, m_i \le d_Y} d_Y \ge d. \qquad \blacksquare$$

APPENDIX C
THE LATTICE-BASED APPROACH

In this appendix, we summarize the approach for solving Problem 1 via the computation of a reduced polynomial lattice basis; this helps us compare the cost bounds for this approach with the cost bound we give in Theorem 1. Here, s ≥ 1 and, for simplicity, we assume that k := k_1 = ··· = k_s, as in the list-decoding of folded Reed-Solomon codes. Besides, we make the assumptions H1, H2, H3, H4 as presented in the introduction. Two main lattice constructions exist in the literature; following [10, §4.5], we present them directly in the case s ≥ 1, and then give the cost bound that can be obtained using polynomial lattice reduction to find a short vector in the lattice.

Let G = ∏_{1≤r≤n}(X − x_r) and let R_1, …, R_s ∈ K[X] be such that deg(R_j) < n and R_j(x_i) = y_{i,j} for every j ∈ {1, …, s} and i ∈ {1, …, n}. In the first construction, the lattice is generated by the polynomials

$$\Big\{\, G^i \prod_{r=1}^{s} (Y_r - R_r)^{j_r} \;\Big|\; i > 0,\ i + |\mathbf{j}| = m \,\Big\} \;\cup\; \Big\{\, \prod_{r=1}^{s} (Y_r - R_r)^{j_r} Y_r^{J_r} \;\Big|\; |\mathbf{j}| = m,\ |\mathbf{J}| \le \ell - m \,\Big\};$$

this construction may be called banded, due to the shape of the generators above when s = 1. In the second construction, which may be called triangular, the lattice is generated by the polynomials

$$\Big\{\, G^i \prod_{r=1}^{s} (Y_r - R_r)^{j_r} \;\Big|\; i > 0,\ i + |\mathbf{j}| = m \,\Big\} \;\cup\; \Big\{\, \prod_{r=1}^{s} (Y_r - R_r)^{j_r} \;\Big|\; m \le |\mathbf{j}| \le \ell \,\Big\}.$$

When s = 1, the first construction is used in [4, Rem. 16], [13], and [32], and the second one is used in [4] and [6]; when s ≥ 1, the former can be found in [10] while the latter appears in [9] and [14]. In both cases the actual lattice bases are the coefficient vectors (in Y) of the polynomials h(X, X^k Y_1, …, X^k Y_s), for h in either of the sets above; these X^k are introduced to account for the weighted-degree condition (iii) in Problem 1.

In this context, for a lattice of dimension L given by generators of degree at most d, the algorithm in [20] computes a shortest vector in the lattice in expected time O(L^ω M(d) log(Ld)), as detailed below. For a deterministic solution, see the algorithm of Gupta, Sarkar, Storjohann, and Valeriote [21], whose cost is O(L^ω M(d)(log(L)² + log(d))).

For the banded basis, its dimension L_B and degree d_B can be taken as follows:

$$L_B = \binom{s+m-1}{s} + \binom{s+m-1}{s-1}\binom{s+\ell-m}{s} \qquad \text{and} \qquad d_B = O(mn).$$

The dimension formula is given explicitly in [10, p. 75], while the degree bound is easily obtained when assuming that the parameters m, n, b of Problem 1 satisfy b ≤ mn; such an assumption is not restrictive, since when b > mn the polynomial Q = G^m is a trivial solution. In this case, the arithmetic cost for constructing the lattice matrix with the given generators is $O\big(\binom{s+m}{s}^2 M(mn)\big)$, which is O(L_B² M(mn)). Similarly, in the triangular case,

$$L_T = \binom{s+\ell}{s} \qquad \text{and} \qquad d_T = O(\ell n),$$

and the cost for constructing the lattice matrix is O(L_T² M(ℓn)).

Under our assumption H1: m ≤ ℓ, we always have L_B ≥ L_T and d_B ≤ d_T; when s = 1, we get L_B = L_T = ℓ + 1.
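The two dimension formulas are straightforward to evaluate; this Python sketch (using math.comb for the binomial coefficients) checks the comparison stated above over small hypothetical parameter ranges: L_B = L_T = ℓ + 1 when s = 1, and L_B ≥ L_T under H1 (m ≤ ℓ).

```python
from math import comb

def L_banded(s, m, ell):
    """Dimension of the banded lattice basis."""
    return comb(s + m - 1, s) + comb(s + m - 1, s - 1) * comb(s + ell - m, s)

def L_triangular(s, ell):
    """Dimension of the triangular lattice basis."""
    return comb(s + ell, s)
```

For s = 1 the banded dimension simplifies to m + (ℓ − m + 1) = ℓ + 1, matching the triangular one.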

To bound the cost of reducing these two polynomial lattice bases, recall that the algorithm of [20] works as follows. Given a basis of a lattice of dimension L and degree d, if x_0 ∈ K is given such that the determinant of the lattice does not vanish at X = x_0, then the basis will be reduced deterministically using O(L^ω M(d) log(Ld)) operations in K. Otherwise, such an x_0 is picked at random in K or, if the cardinality |K| is too small to ensure success with probability at least 1/2, in a field extension 𝕃 of K. In general, 𝕃 should be taken of degree O(log(Ld)) over K; however, here degree 2 will suffice. Indeed, following [6, p. 206] we note that for the two lattice constructions above the determinants have the special form G(X)^{i_1} X^{i_2} for some i_1, i_2 ∈ Z_{≥0}. Since G(X) = (X − x_1)···(X − x_n) with x_1, …, x_n ∈ K pairwise distinct, x_0 can be found deterministically in time O(M(n) log(n)) as soon as |K| > n + 1, by evaluating G at n + 2 arbitrary elements of K; else, |K| is either n or n + 1, and x_0 can be found in an extension 𝕃 of K of degree 2. Such an extension can be computed with probability of success at least 1/2 in time O(log(n)) (see for example [19, §14.9]). Then, with the algorithm of [20] we obtain a reduced basis over 𝕃[X] using O(L^ω M(d) log(Ld)) operations in 𝕃; since the degree of 𝕃 over K is O(1), this is O(L^ω M(d) log(Ld)) operations in K. Eventually, one can use [44, Th. 13 and 20] to transform this basis into a reduced basis over K[X] without impacting the cost bound; or, more directly, since here we are only looking for a sufficiently short vector in the lattice, this vector can be extracted from a shortest vector in the reduced basis over 𝕃[X]. Therefore, by applying the algorithm of [20] to reduce the banded basis and triangular basis shown above, we will always obtain a polynomial Q solution to Problem 1 (assuming one exists) in expected time

O(L_B^ω M(mn) log(L_B mn)) and O(L_T^ω M(ℓn) log(L_T ℓn)),

respectively. For s = 1, the assumption H1 implies that these costs are O(ℓ^ω M(mn) log(ℓn)) and O(ℓ^ω M(ℓn) log(ℓn)), respectively, as reported in [6] and [13]. For s > 1, the costs obtained in [9] and [10] are worse, but only because the short vector algorithms used in those references are slower than the ones we refer to; no cost bound is explicitly given in [14]. The result in Theorem 1 is an improvement over those of both [9] and [10]. To see this, remark that the cost in our theorem is quasi-linear in $\binom{s+\ell}{s}^{\omega-1}\binom{s+m}{s+1} n$, whereas the costs in [9] and [10] are at least $\binom{s+\ell}{s}^{\omega} m n$; a simplification proves our claim.

ACKNOWLEDGMENT

We thank the two reviewers for their thorough reading and helpful comments. We also thank the three reviewers of the preliminary version [12] of this work, and especially the second one for suggesting a shorter proof of Lemma 9.

REFERENCES

[1] M. Alekhnovich, "Linear diophantine equations over polynomials and soft decoding of Reed–Solomon codes," IEEE Trans. Inf. Theory, vol. 51, no. 7, pp. 2257–2265, Jul. 2005.

[2] B. Beckermann, "A reliable method for computing M-Padé approximants on arbitrary staircases," J. Comput. Appl. Math., vol. 40, no. 1, pp. 19–42, Jun. 1992.

[3] B. Beckermann and G. Labahn, "A uniform approach for the fast computation of matrix-type Padé approximants," SIAM J. Matrix Anal. Appl., vol. 15, no. 3, pp. 804–823, Jul. 1994. [Online]. Available: http://dx.doi.org/10.1137/S0895479892230031

[4] P. Beelen and K. Brander, "Key equations for list decoding of Reed–Solomon codes and how to solve them," J. Symbolic Comput., vol. 45, no. 7, pp. 773–786, Jul. 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0747717110000477

[5] P. Beelen, T. Høholdt, J. S. R. Nielsen, and Y. Wu, "On rational interpolation-based list-decoding and list-decoding binary Goppa codes," IEEE Trans. Inf. Theory, vol. 59, no. 6, pp. 3269–3281, Jun. 2013.

[6] D. J. Bernstein, "Simplified high-speed high-distance list decoding for alternant codes," in Post-Quantum Cryptography (Lecture Notes in Computer Science), vol. 7071. Berlin, Germany: Springer-Verlag, 2011, pp. 200–216.

[7] R. R. Bitmead and B. D. O. Anderson, "Asymptotically fast solution of Toeplitz and related systems of linear equations," Linear Algebra Appl., vol. 34, pp. 103–116, Dec. 1980.

[8] A. Bostan, C.-P. Jeannerod, and É. Schost, "Solving structured linear systems with large displacement rank," Theoretical Comput. Sci., vol. 407, nos. 1–3, pp. 155–181, Nov. 2008. [Online]. Available: http://dx.doi.org/10.1016/j.tcs.2008.05.014

[9] K. Brander, "Interpolation and list decoding of algebraic codes," Ph.D. dissertation, Dept. Math., Tech. Univ. Denmark, Kongens Lyngby, Denmark, 2010.

[10] P. Busse, "Multivariate list decoding of evaluation codes with a Gröbner basis perspective," Ph.D. dissertation, Dept. Math., Univ. Kentucky, Lexington, KY, USA, 2008.

[11] D. G. Cantor and E. Kaltofen, "On fast multiplication of polynomials over arbitrary algebras," Acta Inf., vol. 28, no. 7, pp. 693–701, 1991. [Online]. Available: http://dx.doi.org/10.1007/BF01178683

[12] M. F. I. Chowdhury, C.-P. Jeannerod, V. Neiger, É. Schost, and G. Villard, "On the complexity of multivariate interpolation with multiplicities and of simultaneous polynomial approximations," presented at the ASCM, Beijing, China, Oct. 2012.

[13] H. Cohn and N. Heninger, "Ideal forms of Coppersmith's theorem and Guruswami–Sudan list decoding," in Proc. Innovations Comput. Sci., 2011, pp. 298–308. [Online]. Available: http://arxiv.org/pdf/1008.1284

[14] H. Cohn and N. Heninger, "Approximate common divisors via lattices," in Proc. 10th Algorithmic Number Theory Symp., 2013, pp. 271–293.

[15] D. Coppersmith and S. Winograd, "Matrix multiplication via arithmetic progressions," J. Symbolic Comput., vol. 9, no. 3, pp. 251–280, Mar. 1990.

[16] R. A. DeMillo and R. J. Lipton, "A probabilistic remark on algebraic program testing," Inf. Process. Lett., vol. 7, no. 4, pp. 193–195, Jun. 1978.

[17] G.-L. Feng and K. K. Tzeng, "A generalization of the Berlekamp–Massey algorithm for multisequence shift-register synthesis with applications to decoding cyclic codes," IEEE Trans. Inf. Theory, vol. 37, no. 5, pp. 1274–1287, Sep. 1991.

[18] P. Gaborit and O. Ruatta, "Improved Hermite multivariate polynomial interpolation," in Proc. IEEE ISIT, Jul. 2006, pp. 143–147.

[19] J. von zur Gathen and J. Gerhard, Modern Computer Algebra, 3rd ed. Cambridge, U.K.: Cambridge Univ. Press, 2013.

[20] P. Giorgi, C.-P. Jeannerod, and G. Villard, "On the complexity of polynomial matrix computations," in Proc. ISSAC, 2003, pp. 135–142. [Online]. Available: http://doi.acm.org/10.1145/860854.860889

[21] S. Gupta, S. Sarkar, A. Storjohann, and J. Valeriote, "Triangular x-basis decompositions and derandomization of linear algebra algorithms over K[x]," J. Symbolic Comput., vol. 47, no. 4, pp. 422–453, Apr. 2012. [Online]. Available: http://dx.doi.org/10.1016/j.jsc.2011.09.006

[22] V. Guruswami and A. Rudra, "Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy," IEEE Trans. Inf. Theory, vol. 54, no. 1, pp. 135–150, Jan. 2008.

[23] V. Guruswami and M. Sudan, "Improved decoding of Reed–Solomon and algebraic-geometry codes," IEEE Trans. Inf. Theory, vol. 45, no. 6, pp. 1757–1767, Sep. 1999.

[24] H. Hasse, "Theorie der höheren Differentiale in einem algebraischen Funktionenkörper mit vollkommenem Konstantenkörper bei beliebiger Charakteristik," J. Reine Angew. Math., vol. 1936, no. 175, pp. 50–54, 1936.

[25] E. Kaltofen, "Asymptotically fast solution of Toeplitz-like singular linear systems," in Proc. ISSAC, 1994, pp. 297–304.

[26] E. Kaltofen and B. D. Saunders, "On Wiedemann's method of solving sparse linear systems," in Applied Algebra, Algebraic Algorithms and Error-Correcting Codes (Lecture Notes in Computer Science), vol. 539. Berlin, Germany: Springer-Verlag, 1991, pp. 29–38.

[27] R. Koetter, J. Ma, and A. Vardy, "The re-encoding transformation in algebraic list-decoding of Reed–Solomon codes," IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 633–647, Feb. 2011.

[28] R. Koetter and A. Vardy, "Algebraic soft-decision decoding of Reed–Solomon codes," IEEE Trans. Inf. Theory, vol. 49, no. 11, pp. 2809–2825, Nov. 2003.

[29] R. Koetter and A. Vardy, "A complexity reducing transformation in algebraic list decoding of Reed–Solomon codes," in Proc. IEEE ITW, Mar./Apr. 2003, pp. 10–13.

[30] R. Kötter, "Fast generalized minimum-distance decoding of algebraic-geometry and Reed–Solomon codes," IEEE Trans. Inf. Theory, vol. 42, no. 3, pp. 721–737, May 1996.

[31] F. Le Gall, "Powers of tensors and fast matrix multiplication," in Proc. ISSAC, 2014, pp. 296–303. [Online]. Available: http://doi.acm.org/10.1145/2608628.2608664

[32] K. Lee and M. E. O'Sullivan, "List decoding of Reed–Solomon codes from a Gröbner basis perspective," J. Symbolic Comput., vol. 43, no. 9, pp. 645–658, Sep. 2008. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0747717108000059

[33] R. J. McEliece, “The Guruswami–Sudan decoding algorithm forReed–Solomon codes,” California Inst. Technol., Pasadena, CA, USA,Tech. Rep. 42-153, 2003.

[34] H. M. Möller and B. Buchberger, “The construction of multi-variate polynomials with preassigned zeros,” in Computer Algebra(Lecture Notes in Computer Science), vol. 144. Berlin, Germany:Springer-Verlag, 1982, pp. 24–31.

[35] M. Morf, “Doubling algorithms for Toeplitz and related equations,”in Proc. IEEE Conf. Acoust., Speech, Signal Process., Apr. 1980,pp. 954–959.

[36] J. S. R. Nielsen, “List decoding of algebraic codes,” Ph.D. dissertation,Dept. Appl. Math. Comput. Sci., Tech. Univ. Denmark, Kongens Lyngby,Denmark, 2013.

[37] R. R. Nielsen and T. Høholdt, “Decoding Reed–Solomon codes beyondhalf the minimum distance,” in Coding Theory, Cryptography andRelated Areas. Berlin, Germany: Springer-Verlag, 2000, pp. 221–236.

[38] V. Olshevsky and M. A. Shokrollahi, “A displacement approach toefficient decoding of algebraic-geometric codes,” in Proc. STOC,1999, pp. 235–244. [Online]. Available: http://doi.acm.org/10.1145/301250.301311

[39] V. Pan, Structured Matrices and Polynomials. New York, NY, USA:Springer-Verlag, 2001.

[40] F. Parvaresh and A. Vardy, “Correcting errors beyond theGuruswami–Sudan radius in polynomial time,” in Proc. FOCS,Oct. 2005, pp. 285–294.

[41] J.-R. Reinhard, “Algorithme LLL polynomial et applications,”M.S. thesis, Dept. Comput. Sci., École Polytechn., Paris, France, 2003.[Online]. Available: https://hal.inria.fr/hal-01101550

[42] R. M. Roth, Introduction to Coding Theory. Cambridge, U.K.:Cambridge Univ. Press, 2007.

[43] R. M. Roth and G. Ruckenstein, “Efficient decoding of Reed–Solomoncodes beyond half the minimum distance,” IEEE Trans. Inf. Theory,vol. 46, no. 1, pp. 246–257, Jan. 2000.

[44] S. Sarkar and A. Storjohann, “Normalization of row reduced matrices,”in Proc. ISSAC, 2011, pp. 297–304.

[45] J. T. Schwartz, “Fast probabilistic algorithms for verification of polyno-mial identities,” J. ACM, vol. 27, no. 4, pp. 701–717, Oct. 1980.

[46] V. Shoup, “A fast deterministic algorithm for factoring polynomials overfinite fields of small characteristic,” in Proc. ISSAC, 1991, pp. 14–21.

[47] A. Storjohann, “Notes on computing minimal approximant bases,” inProc. Challenges Symbolic Comput. Softw., 2006, p. 06271. [Online].Available: http://drops.dagstuhl.de/opus/volltexte/2006/776

[48] A. J. Stothers, “On the complexity of matrix multiplication,”Ph.D. dissertation, School Math., Univ. Edinburgh, Edinburgh, U.K.,2010.

[49] M. Sudan, “Decoding of Reed Solomon codes beyond the error-correction bound,” J. Complexity, vol. 13, no. 1, pp. 180–193, Mar. 1997.[Online]. Available: http://dx.doi.org/10.1006/jcom.1997.0439

[50] P. V. Trifonov, “Efficient interpolation in the Guruswami–Sudan algo-rithm,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4341–4349,Sep. 2010.

[51] V. V. Williams, “Multiplying matrices faster thanCoppersmith–Winograd,” in Proc. STOC, 2012, pp. 887–898. [Online].Available: http://doi.acm.org/10.1145/2213977.2214056

[52] Y. Wu, “New list decoding algorithms for Reed–Solomon and BCHcodes,” IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3611–3630,Aug. 2008.

[53] A. Zeh, “Algebraic soft- and hard-decision decoding of general-ized Reed–Solomon and cyclic codes,” Ph.D. dissertation, Dept.d’Informatique, École Polytechn., Paris, France, 2013. [Online]. Avail-able: https://pastel.archives-ouvertes.fr/pastel-00866134

[54] A. Zeh, C. Gentner, and D. Augot, “An interpolation procedure for listdecoding Reed–Solomon codes based on generalized key equations,”IEEE Trans. Inf. Theory, vol. 57, no. 9, pp. 5946–5959, Sep. 2011.

[55] R. Zippel, “Probabilistic algorithms for sparse polynomials,” in Symbolicand Algebraic Computation (Lecture Notes in Computer Science),vol. 72. Berlin, Germany: Springer-Verlag, 1979, pp. 216–226.

Muhammad F. I. Chowdhury was born in Sylhet, Bangladesh, on 31 December 1981. He received his BSc in Computer Science and Engineering from Khulna University of Engineering & Technology, Bangladesh. He received his MSc in Computer Science from the University of Western Ontario, Canada, in April 2009, and his PhD in Computer Science from the same university in February 2014. He is currently a senior software engineer at Irdeto Canada Inc., where he designs and develops secure algorithms to be executed in untrusted computational environments.



Claude-Pierre Jeannerod received his PhD in Applied Mathematics from Institut National Polytechnique, Grenoble (France), in 2000. After a postdoctoral fellowship in the Symbolic Computation Group at the University of Waterloo (Canada), he became a researcher at Inria Grenoble - Rhône-Alpes and a member of the LIP Computer Science Laboratory (CNRS, ENSL, Inria, UCBL) of the University of Lyon, France. His research interests include computer algebra, structured linear algebra, and floating-point arithmetic.

Vincent Neiger studied Computer Science at École Normale Supérieure de Lyon in France, where he obtained a Bachelor’s degree in 2010 and a Master’s degree in 2012, and passed the Agrégation national competitive examination in Mathematics in 2013. He is currently working toward a joint PhD degree in Computer Science between École Normale Supérieure de Lyon and the University of Western Ontario, Canada, and is a member of the LIP Computer Science Laboratory (CNRS, ENSL, Inria, UCBL) of the University of Lyon, France, and of the Computer Science Department of the University of Western Ontario, Canada. His current research interests include computer algebra and algebraic coding theory.

Éric Schost received his PhD in Computer Science at École Polytechnique in 2000, under the supervision of Marc Giusti. He is an associate professor in the Department of Computer Science at Western University and holds a Canada Research Chair in Computer Algebra.

Gilles Villard received the PhD degree from the Institut National Polytechnique of Grenoble, and became a research scientist with the French National Center for Scientific Research (CNRS) in 1990. He arrived at the École Normale Supérieure de Lyon in 2000 and headed the Arénaire project-team on computer arithmetic between 2004 and 2009. He was vice-chair and then chair of the LIP Computer Science Laboratory (CNRS, ENSL, Inria, UCBL) of the University of Lyon from 2006 to 2014. His main research interests in symbolic computation are complexity and efficient algorithms for matrix and Euclidean lattice problems, and generic programming techniques for high performance software libraries.

