1408 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 51, NO. 4, APRIL 2005
Error-Correction Capability of Binary Linear Codes
Tor Helleseth, Fellow, IEEE, Torleiv Kløve, Fellow, IEEE, and Vladimir I. Levenshtein, Fellow, IEEE
Abstract—The monotone structure of correctable and uncorrectable errors given by the complete decoding for a binary linear code is investigated. New bounds on the error-correction capability of linear codes beyond half the minimum distance are presented, both for the best codes and for arbitrary codes under some restrictions on their parameters. It is proved that some known codes of low rate are as good as the best codes in an asymptotic sense.
Index Terms—Error-correction capability, linear codes, minimal words, monotone functions, test set, trial set, Reed–Muller codes.
I. INTRODUCTION
IN complete (maximum-likelihood) decoding of linear codes, there is some freedom in the choice of which errors will be corrected. More precisely, the correctable errors are exactly the coset leaders, and when there is more than one vector of minimum weight in a coset, any one of them can be selected as the coset leader. It has long been known that if the lexicographically smallest minimum-weight vectors are chosen as the coset leaders, then the set of correctable errors (coset leaders) gets a monotone structure: if is a correctable error and (that is, for all ), then is also a correctable error. This has been regarded as a nice property, but without much further relevance. However, Zemor [25] has studied some important consequences of this property. The goal of this paper is to argue that it is, in fact, a fundamental property with a number of important implications, and we study some of these implications. The paper is organized as follows.
Section II is a short introduction to the complete decoding and describes the monotone structure of the sets of correctable and uncorrectable errors; it also introduces some notation. In particular, we introduce trial sets of codewords, considering the linear ordering of all vectors of a fixed length defined by if and only if has smaller Hamming weight than or they have the same Hamming weight but is lexicographically smaller than .
In Section III, we describe the minimal uncorrectable errors under the ordering . For any such vector, we determine all vectors in its coset which precede it in the ordering . This allows us to prove that all minimal uncorrectable errors are so-called larger halves of minimal codewords. Further, we use the description to give a gradient-like decoding algorithm based on
Manuscript received December 2, 2002; revised October 4, 2004. This work was supported by the Norwegian Research Council; the work of V. Levenshtein was also supported by the Russian Foundation for Basic Research under Grant 02-01-00687.
T. Helleseth and T. Kløve are with the Department of Informatics, University of Bergen, N-5020 Bergen, Norway.
V. I. Levenshtein is with the Keldysh Institute for Applied Mathematics, RAS, 125047 Moscow, Russia.
Communicated by R. Koetter, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2005.844080
trial sets of codewords. Finally, we give improved estimates of the number of uncorrectable errors of any weight larger than half the minimum distance.
In Section IV, we consider , the fraction of all errors of weight that are correctable. A consequence of the monotonicity of the set of correctable errors and a well-known result on monotone sets is that, for any in the range from half the minimum distance to the covering radius, decreases as grows. Therefore, we also have a well-defined "inverse" , the error-correction capability function, defined as the largest such that . The rest of the paper is devoted to the study of these two functions, in particular, the asymptotic values for infinite sequences of codes.
In our investigation, the quantity

plays a significant role. Note that is a necessary condition for the existence of a -error-correcting code (the Hamming bound). The monotonicity of implies that for any code and any

and hence as . On the other hand, using random selection on a set of codes we prove, for any , , and , the existence of an code for which

Therefore, for such a code, as . These results allow us to obtain in Section V precise estimates of for the best codes with for a fixed , . We give two explicit positive constants and such that for any there exists an code with such that

and that for any code with and any fixed ,

if and is sufficiently large. (Here and later, , , is the parameter which is uniquely defined by the equation

where is the Shannon entropy.)
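The parameter can be computed numerically. The sketch below is our own illustration, not part of the paper: it assumes, as the Hamming-bound asymptotics suggest, that the parameter is the root of H(rho) = 1 - k/n in (0, 1/2], and solves for it by bisection; the function names are ours.

```python
from math import log2

def H(x):
    # binary Shannon entropy
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def rho(R, tol=1e-12):
    # solve H(rho) = 1 - R for rho in (0, 1/2] by bisection
    # (assumed defining equation; see the hedge in the text above)
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H(mid) < 1 - R:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# for rate 1/2 the solution is about 0.11
assert abs(H(rho(0.5)) - 0.5) < 1e-9
```

Higher rate gives a smaller parameter, consistent with the Hamming bound becoming more restrictive.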
We also study sequences of codes where the rate goes to zero. For sequences of codes where and for which for a fixed and ,
0018-9448/$20.00 © 2005 IEEE
we investigate how close can be to . We show that we always have

and prove the existence of asymptotically optimal sequences of codes for which this asymptotic inequality is tight for all fixed , .
In Section VI, we present three bounds on in the range for arbitrary codes under some restrictions on their parameters. As an illustration of the applications of these bounds, we show that the first- and second-order Reed–Muller codes as well as the duals of primitive Bose–Chaudhuri–Hocquenghem (BCH) codes are asymptotically optimal.
In Section VII, we give some open problems.
II. MONOTONE STRUCTURE OF CORRECTABLE AND UNCORRECTABLE ERRORS
Let be the set of all binary vectors (with coordinates and ). We consider as a metric space with (the Hamming) distance and norm (weight) , where . We consider some orderings of . A partial ordering is a reflexive, transitive, and antisymmetric binary relation on . An example is (covering) defined by

if and only if

where

is the support of . As usual, we write if and .
Denote by the set of all vectors of of weight . For any , we consider the numerical value

and the minimum coordinate of the support

For example, for we have , , and . We will use the symbol both for sums of numbers and for sums of vectors, and hope that this does not lead to any confusion.
A linear (or total) ordering is a partial ordering such that or for all and . We consider a couple of linear orderings. Together with the lexicographic ordering on (order by increasing numerical values of vectors), we consider another linear ordering defined as follows:

if and only if or and . (1)

We write if and . Note that in our notation, the inequality is equivalent to .
If , then is called a descendant of , and an immediate descendant of when .
Let be a linear code of dimension or an code. We also use the notation if the code has minimum distance at least . We set and denote the covering radius of by . It is known that is partitioned into cosets , that is,

where for

and where each for some . For , let denote the minimum element of with respect to the ordering . It is called the coset leader. The coset leader is therefore the lexicographically smallest element among the minimum-weight vectors in the coset. We denote the set of all coset leaders by (note that ). It is well known that the (maximum-likelihood or complete) decoding defined by

if (2)

has remarkable properties. First, this decoding maximizes the average probability of correct decoding for the binary-symmetric channel with a probability , , under the additional condition that all code vectors are equally likely to be transmitted. Second, for any we have

if and only if

This means that only error vectors of the set can be corrected, and they are all corrected for transmission of any codeword. Therefore, the elements of are called correctable errors, and the elements of are called uncorrectable errors. Thus, if and only if for all , and if and only if there exists
such that .
This definition gives rise to an algorithm to find, for a received vector , its coset leader and hence decode according to (2). Since is a coset leader if and only if for all , we can apply it to and determine whether is a coset leader. If it is not, then there exists a vector such that , and we can repeat this procedure on the vector . Note that if which has the property

if and only if for all (3)

then we can modify the algorithm to always choose codewords from . We call such a subset a trial set of the code . We investigate trial sets in more detail in Section III.
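For intuition, the coset-leader choice and the monotone structure it induces can be checked exhaustively on a toy code. The sketch below is our own illustration (not an algorithm from the paper): it picks as coset leader the smallest vector under the weight-then-lexicographic ordering and verifies that every vector covered by a correctable error is again correctable.

```python
from itertools import product

def key(v):
    # ordering used for coset leaders: Hamming weight first, then lexicographic
    return (sum(v), v)

def span(gen, n):
    # all codewords generated by the rows of gen over GF(2)
    code = set()
    for coeffs in product([0, 1], repeat=len(gen)):
        code.add(tuple(sum(c * g[j] for c, g in zip(coeffs, gen)) % 2
                       for j in range(n)))
    return code

def coset_leaders(code, n):
    leaders, seen = [], set()
    for v in sorted(product([0, 1], repeat=n), key=key):
        if v not in seen:
            leaders.append(v)  # smallest unseen vector starts a new coset
            seen.update(tuple((v[j] + c[j]) % 2 for j in range(n)) for c in code)
    return set(leaders)

def covers(x, y):
    # y precedes x in the covering order: supp(y) is a subset of supp(x)
    return all(a >= b for a, b in zip(x, y))

# toy [4,2] code {0000, 1100, 0011, 1111}
C = span([(1, 1, 0, 0), (0, 0, 1, 1)], 4)
E0 = coset_leaders(C, 4)
# monotone structure: every vector covered by a correctable error is correctable
assert all(y in E0
           for x in E0 for y in product([0, 1], repeat=4) if covers(x, y))
```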
It is well known (see, for example, [21, p. 58, Theorem 3.11]) that the sets of correctable and uncorrectable errors form a monotone structure, namely, that if , then implies and implies . In particular, it follows from the following simple statement which will be used in Section III.
Lemma 1: If and , then .
Proof: It is sufficient to show the result when is an immediate descendant of ; the general result then follows by induction. Using that and that is an immediate descendant of , we get

(4)

If at least one of the two inequalities in (4) is strict, then , hence, . It remains to consider the case when and . Since , in this case, and . Using that , we get , and so again .
Let
be the set of correctable errors of weight , and
be the set of uncorrectable errors of weight . We have
if
and
if
since the covering radius is the largest weight of a correctable error. Therefore, the probability of a decoding error of a code on the binary-symmetric channel is given by
(5)
Note that , , and do not change if we use a different linear ordering to choose the coset leader among the minimum-weight vectors of the coset. However, other orderings may not give the monotone structure, which will be important in some of our proofs. In Appendix I, we describe all orderings which do give a monotone structure.
III. MINIMAL UNCORRECTABLE ERRORS AND TRIAL SETS
Let denote the set of such that, if and , then . This is the set of minimal uncorrectable errors. Further, let

be the set of minimal uncorrectable errors of weight . By analogy, let denote the set of such that, if and , then . This is the set of maximal correctable errors. Let

Note that if or , and that if or . Hence,
and
One consequence is that
(6)
A vector will be called a “larger half” of a codeword, , if and only if
(7)
(8)
if (9)
if (10)
A larger half of , , can also be defined as a minimal vector in the ordering such that . We will use the first definition, but the reader can see that a proof that these two definitions are equivalent follows from the proof of Theorem 1 below.
Note that if is odd, say , then by (8), and conditions (9) and (10) do not apply. If is even, we have or ; moreover, if , then , and if , then . Thus, any codeword of norm has

larger halves; , if , and if .
For , , let denote the set of all larger halves of , and for any subset , let
By the definition above, for any we have and hence, is an uncorrectable error. Thus,

It is significant to note that if for a codeword , , and , then there exists

such that (11)

and hence . Note also the following fact that follows from (7)–(10): if and , then is an immediate descendant of .
A codeword is called minimal if and with implies that . Minimal codewords have found applications, e.g., in secret sharing; see [20]. Denote by the set of all minimal words of a code . Now we give and use some known basic properties of (see [2] and references therein). Since any columns of an code are dependent, the maximum weight of a minimal word of does not exceed and hence for any we have . All codewords of weight less than are minimal. The linear span of coincides with . We have if and only if any two nonzero codewords have intersecting supports. Such codes are called intersecting; see, e.g., [7].
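These properties are easy to verify computationally on a small code. The following sketch is our own illustration: it computes the minimal words of the [7,4] Hamming code and checks the two facts just stated, namely that codewords of weight below twice the minimum distance are minimal and that minimal words have weight at most n - k + 1.

```python
from itertools import product

def span(gen, n):
    # all GF(2) linear combinations of the generator rows
    code = set()
    for coeffs in product([0, 1], repeat=len(gen)):
        code.add(tuple(sum(c * g[j] for c, g in zip(coeffs, gen)) % 2
                       for j in range(n)))
    return code

def supp(v):
    return frozenset(j for j, b in enumerate(v) if b)

def minimal_words(code):
    # c is minimal if no other nonzero codeword has strictly smaller support
    nz = [c for c in code if any(c)]
    return [c for c in nz
            if not any(c2 != c and supp(c2) < supp(c) for c2 in nz)]

# [7,4] Hamming code (one standard generator matrix)
gen = [(1, 0, 0, 0, 0, 1, 1), (0, 1, 0, 0, 1, 0, 1),
       (0, 0, 1, 0, 1, 1, 0), (0, 0, 0, 1, 1, 1, 1)]
C = span(gen, 7)
M = minimal_words(C)
d = min(sum(c) for c in C if any(c))
# codewords of weight below 2*d are minimal ...
assert all(c in M for c in C if any(c) and sum(c) < 2 * d)
# ... and a minimal word has weight at most n - k + 1 = 4
assert max(sum(c) for c in M) <= 7 - 4 + 1
```

Here the all-ones word of weight 7 is the only non-minimal nonzero codeword, since its support contains that of every weight-3 codeword.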
A description of the set is one of the main problems for monotone structures and has many applications. We give a solution of the problem in terms of the set of all vectors which precede in its coset . As consequences, in particular, we get that all minimal uncorrectable errors are larger halves of minimal codewords, we obtain characterizations of trial sets in terms of minimal uncorrectable errors, and we obtain an upper bound for via the weight distribution of words in any trial set.
For any define
(12)
and note that if and only if is empty. It follows that if and only if is not empty. Further, by Lemma 1

(13)

We also observe that if for some , then and . We now characterize in terms of , that is, describe all codewords such that .
Theorem 1: Let be a linear code with and . Then if and only if
i) for each ;
ii) .
Proof: Let and . First, we show that . For an arbitrary , consider the immediate descendant of with and for . Since

and, hence, we have

and (14)

Suppose . Then we have and, by (14)

and

It follows that and . Note that

and

This leads to a contradiction because the conditions and required in this case, by (14), cannot be satisfied simultaneously. These arguments are valid for any , and hence . We must further show that (8)–(10) are satisfied for . Since we have
(15)
Let and let be the immediate descendant of having . Then and so

(16)

By (15) and (16), (8) is satisfied. Suppose . By (15), . Since , this implies that and, hence, . This gives (9), and so is satisfied in this case. Finally, suppose . In particular, this implies that and so

that is, . Hence, is defined and due to the choice of . We have , by (16), and because . Since , this implies and so . Hence,

This gives (10) and completes the proof of i) for .
To prove ii) when , we suppose that a certain is not a minimal word. Then there exists a codeword such that . Let ; this is also a codeword and . Let the vectors and be defined by
and . Let the vectors and be defined by
and
There are two cases: 1) when both and are nonzero vectors and 2) when one of them, say , equals . In case 1), and , we therefore have and so

that is, . Similarly, and . Hence, by i) and (8)

and so

In particular, by (9), and so

(17)

On the other hand, since , is not a larger half of , and so . Similarly, . Combining, we get

Due to (17) this gives rise to a contradiction in case 1). Now we consider case 2), when . Note that in this case and hence, , by i). Since and , we get that is an immediate descendant of , by the note just after (11). It follows that and , a contradiction. Thus, ii) is true for any .
To complete the proof, it is sufficient to verify that i) implies . First we observe that for any immediate descendant of we have . Indeed, otherwise we would have

and hence, and for some integer . Since , this implies . Hence, and ; a contradiction. Therefore, if i) holds for , then an immediate descendant of cannot be an uncorrectable error. Otherwise, there exists a codeword such that and hence, . Then the same codeword belongs to , by (13), and we have a contradiction with our observation above.
Theorem 1 has some significant implications.
Corollary 1: For any linear code with ,.
We note that Corollary 1 is also true when . In this case, if but , then one can show that there exists an immediate descendant of such that , , and .
Corollary 2: For any code there are no minimal uncorrectable errors of weight more than .
Barg [3] proposed a general gradient-like approach to find a closest codeword of to a (received) vector . This problem is equivalent to finding, in the coset of , a vector of the set

for all

(the Voronoi region of ). Note that contains all vectors of minimal weight in any coset, not only the coset leaders. For example, for the even-weight code we have
(that is, there are two coset leaders), but .
Barg called a subset a test set when
if and only if for all
It follows that
if and only if for some (18)
The point of the test set is that starting from any and applying (18) repeatedly, one can determine a codeword such that has minimum weight in the coset of (however, if there is more than one vector of minimal weight in the coset, it is possible that is not the coset leader). Barg showed that the set of minimal codewords and the set of so-called zero-neighbors introduced by Levitin and Hartman [18] are both examples of test sets.
As explained in Section II, to decode a received vector according to (2), we can use a similar gradient-like algorithm where instead of a test set we use a trial set , which is characterized by (3). It should be emphasized that a trial set is not necessarily a test set and vice versa.
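A minimal sketch of such a gradient-like decoder follows; it is our own illustration, not the paper's pseudocode, and it uses the set of all nonzero codewords as the test set (a valid if wasteful choice). The loop subtracts a test-set codeword whenever doing so lowers the weight of the residual, stopping at a vector of minimum weight in the coset.

```python
from itertools import product

def weight(v):
    return sum(v)

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def gradient_decode(y, test_set):
    # subtract a test-set codeword whenever that lowers the residual weight;
    # the loop terminates at a vector of minimum weight in the coset of y
    e = y
    improved = True
    while improved:
        improved = False
        for c in test_set:
            if weight(add(e, c)) < weight(e):
                e = add(e, c)
                improved = True
                break
    return add(y, e)  # decoded codeword = received word minus residual error

# even-weight [4,3] code and the test set of all nonzero codewords
C = [v for v in product([0, 1], repeat=4) if sum(v) % 2 == 0]
test_set = [c for c in C if any(c)]
# a single bit error on top of a codeword is removed
assert gradient_decode((1, 1, 0, 1), test_set) in C
assert weight(add(gradient_decode((1, 1, 0, 1), test_set), (1, 1, 0, 1))) == 1
```

As the text notes, when the coset contains several minimum-weight vectors, the residual found this way need not be the coset leader; a trial set is what guarantees reaching the leader itself.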
Now we give some alternative characterizations of trial sets using the set of minimal uncorrectable errors.
Corollary 3: Let be a linear code and . The following statements are equivalent:
a) is a trial set for
b) if then
c)
Proof: First a) b). Since , this follows immediately from the definition of a trial set.
b) c). Let . By b), there exists a . By Theorem 1 ii)

c) a). Let . Then there exists an such that . By c), is a larger half of some . Therefore, , and so by (13)

Thus, for any there exists a such that , and hence a) is true by the definition (3) of a trial set.
From Corollaries 1 and 3 we get the following fact.
Corollary 4: The set of minimal codewords is a trial set for a linear code .
A trial set for a code is called minimal if any is not a trial set.
Corollary 5: Let be a linear code with .
i) If is a minimal trial set for , then .
ii) If is an arbitrary trial set for , then is also a trial set for .
Proof: To prove i), let . Then Corollary 3 a) and b) imply that there exists a such that (since otherwise we could remove from and still have a trial set, contradicting the minimality of ). By Theorem 1 ii), . This proves i).
Now let be an arbitrary trial set. Then contains a minimal trial set . By i), and so . Since contains the trial set , it is itself a trial set.
Our approach can be used to prove that the statements of Corollary 5 are valid also for test sets. (The proof is done by replacing by and making the corresponding changes in some definitions, and then making the same steps as in the proofs of the above lemmas and corollaries; many of them become simpler for test sets.)
It turns out that for many codes there exist smaller trial sets than test sets.
Example 1: The even-weight code has a minimal trial set of size consisting of all vectors of weight two having a in the last position. On the other hand, the unique minimal test set for this code is the set of all vectors of weight two.
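The claim of Example 1 can be checked by brute force for small lengths. In this sketch (our own code) the trial-set criterion (3) is spelled out against all nonzero codewords, confirming for the even-weight code of length 4 both that the chosen weight-two words form a trial set and that dropping one of them breaks the property.

```python
from itertools import product

def key(v):
    # weight-then-lexicographic ordering
    return (sum(v), v)

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def is_trial_set(T, code, n):
    # criterion (3): v is a coset leader  <=>  v precedes v + c for every c in T
    nz = [c for c in code if any(c)]
    for v in product([0, 1], repeat=n):
        leader = all(key(v) <= key(add(v, c)) for c in nz)
        via_T = all(key(v) <= key(add(v, c)) for c in T)
        if leader != via_T:
            return False
    return True

n = 4
C = [v for v in product([0, 1], repeat=n) if sum(v) % 2 == 0]
# claimed minimal trial set: weight-two words with a 1 in the last position
T = [tuple(1 if j in (i, n - 1) else 0 for j in range(n)) for i in range(n - 1)]
assert is_trial_set(T, C, n)
assert not is_trial_set(T[:-1], C, n)  # dropping a word breaks the property
```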
A different kind of application of trial sets is to give good bounds on the size of and . The weight distribution of a subset is defined by

Clearly, for all and all .
Corollary 6: Let be an code and a trial set for . Then
Proof: The result follows immediately from Corollaries 2and 3.
Corollary 7: Let be an code and a trial set for . Then for any weight ,
(19)
Proof: By Corollary 3, for any uncorrectable error , there exists a and a such that . If , this means that there exists a such that , , and .
On the other hand, for any , , and any , , there are vectors of weight for each of which there exists such that , , and

If , then contains a larger half of , by (11). If , then these do not contain larger halves of . If , then the vectors with are larger halves of , and the remaining vectors with do not contain larger halves of . Summing over all , we get (19).
We note that even for the trivial trial set , (19) is a refinement of results given in [14], [22], and [24].
From Corollary 1 it follows that a vector is a correctable error if and only if it does not cover a larger half of a minimal codeword. In particular, this implies that the covering radius is equal to the maximum norm of a vector which does not cover a larger half of a minimal codeword.
A vector will be called a "smaller half" of a codeword , , if and , with if . Thus, any codeword of weight has smaller halves, and any codeword of weight has smaller halves. It is worth noting that, from the preceding definitions, it follows that any immediate descendant of a larger half of a codeword is a smaller half of the same codeword.
Corollary 8: Correctable errors which are immediate descendants of any are smaller halves of a minimal codeword.
IV. THE ERROR-CORRECTION CAPABILITY FUNCTION
For a binary linear code consider the ratio
of errors of weight that are correctable. Note that
(20)
for , and for .
The following important property is a well-known fact for monotone structures. However, it has apparently not been stated before in the coding theory context.
Lemma 2: For any code and any
(21)
with strict inequality for .
Proof: The shadow of a set , , is defined by
for some
By [5, p. 12, Theorem 3]
(22)
with equality if and only if or . The monotone structure of correctable and uncorrectable errors implies
(23)
and hence,
(24)
Combined with (22) for , the lemma follows immediately.
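The inequality (22) on normalized shadow sizes is easy to observe numerically. In this sketch (our own illustration), the shadow of a family of weight-3 vectors in length 6 is computed by deleting one support coordinate at a time.

```python
from itertools import product
from math import comb

def shadow(S, n):
    # all immediate descendants: delete one coordinate from the support
    out = set()
    for v in S:
        for j in range(n):
            if v[j]:
                out.add(v[:j] + (0,) + v[j + 1:])
    return out

n, i = 6, 3
# an arbitrary family of weight-3 vectors: those whose first coordinate is 1
S = {v for v in product([0, 1], repeat=n) if sum(v) == i and v[0] == 1}
D = shadow(S, n)
# normalized size of the shadow is at least the normalized size of S
assert len(D) / comb(n, i - 1) >= len(S) / comb(n, i)
```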
Note that (20) and (21) imply the following generalization of the Hamming bound for -error-correcting codes.
Lemma 3: For any code and any ,
(25)
The Hamming bound is obtained for where .
Example 2: Consider the [15, 4] simplex code, for which and . The values of and the bound (25) are given in the following table:
For any integer , , and any , , a binary linear code will be called a -error-correcting code if . In particular, for , this definition coincides with the standard definition of a -error-correcting code, and the -error-correcting simplex code in Example 2 is a -error-correcting code. It is significant to note that, by (21), any -error-correcting code is a -error-correcting code for any integer . This ensures the reasonableness of the definition, for any code , of an error-correction capability function as the maximum such that is a -error-correcting code.
Lemma 2 implies in particular the following corollary.
Corollary 9: For , is nonincreasing, left continuous, and takes all values .
Now we show that the bound (25) is good in a certain approximate sense. To this end, we use the approach proposed in [17, Theorem 1]. We endow with the structure of GF and, for any code and any nonzero , denote by the code , which is also an code.
Lemma 4: For any code and for any weight ,, there exists a nonzero such that
(26)
Proof: If , then there exists a nonzero such that has weight at most . Therefore,

Since there are nonzero in , there exists a for which
Since
this completes the proof.
Consequences of Lemmas 3 and 4 will be given in Section V. Here we only note that, for a sequence of codes and weight , if , then . On the other hand, if the parameters satisfy , then there exists a sequence of codes such that .
Inequality (25) can be sharpened. Since for and hence, for , we get
and
(27)
To strengthen (21) (and, hence, (25) and (27)) we need stronger lower bounds on or nontrivial lower bounds on , or both. A general lower bound on in terms of , where , is the celebrated Kruskal–Katona bound; see, e.g., [5, p. 30]: let and be the unique integers such that
and
then
This bound is the best possible general bound in the sense that there are sets for which the bound is satisfied with equality. This is the case even for for some codes . It seems, however, that the bound is weak for most codes. In particular, (24) is not attained for a linear code for .
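The cascade representation and the resulting lower bound on the shadow size can be computed directly; the sketch below (with our own helper names) obtains the representation greedily and sums the shifted binomial coefficients.

```python
from math import comb

def cascade(m, i):
    # unique representation m = C(a_i,i) + C(a_{i-1},i-1) + ...,
    # with a_i > a_{i-1} > ..., obtained greedily
    rep = []
    while m > 0 and i > 0:
        a = i
        while comb(a + 1, i) <= m:
            a += 1
        rep.append((a, i))
        m -= comb(a, i)
        i -= 1
    return rep

def shadow_lower_bound(m, i):
    # Kruskal-Katona: minimum shadow size of a family of m weight-i vectors
    return sum(comb(a, j - 1) for a, j in cascade(m, i))

# 10 = C(5,3), so a family of 10 weight-3 vectors has shadow size >= C(5,2) = 10
assert shadow_lower_bound(10, 3) == comb(5, 2)
```

For example, a family of 11 weight-3 vectors has 11 = C(5,3) + C(2,2), so its shadow contains at least C(5,2) + C(2,1) = 12 vectors.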
Example 3: Let be the code of all binary vectors with zeros in the last positions. Then, an error is correctable if and only if none of the first positions are affected. Hence,

for all . In particular, for this code, (24) is attained for all such that
Example 4: Let be the code obtained by extending a Hamming code with one position which is always zero. Then, the correctable errors are all single errors, and the double errors affecting the last bit and one of the remaining bits. Hence,
and
and
and
Since
the Kruskal–Katona bound gives
and the corresponding bound
which is very weak for large .
V. BOUNDS FOR THE BEST CODES
In this section, we obtain some inequalities and asymptotic results for the error-correction capability function for the best codes which are consequences of Lemmas 3 and 4. We consider sequences of codes where runs through a subsequence of the positive integers and as .
Theorem 2:
i) For any , , and any there exists an code with such that
(28)
where
ii) For any linear code with , ,and any fixed , , we have
(29)
if
and is sufficiently large.
Proof: To prove the existence of the desired codes we put . We can assume that , since when , and (28) holds for any code when . Using that , , we apply Lemma 7 in Appendix II for and get
By Lemma 4, this implies the existence of an code with
(30)
Using monotonicity of the function , we get (28), and the first part of the theorem is proved.
To prove the second part, we note that by Lemma 3, the value for an code satisfies the inequality
(31)
Therefore, it is sufficient to prove that for and when . We use the well-known facts that

for and that

for a fixed , , and . Using that , we can estimate
from below by
Note that (28) and (29) make more precise the well-known results that, for any , there are codes of rate which correct almost all errors of weight smaller than , and there are no codes which correct almost all errors of weight smaller than .
Now, for a sequence of codes of rate , that is, , we investigate the asymptotic behavior of for a fixed , , as . Note that from (31) it follows that under our assumption that . Thus, can hold only for a sequence of codes with the rate going to zero. In order to investigate the convergence when , we introduce for a code the parameter
Note that when .
Theorem 3:
i) For any and , , there exists an code such that
(32)
ii) If, for a sequence of codes, there exists ,, such that
as
then
(33)
Proof: To prove the first part of the theorem, put where . Let

Note that the condition implies that , , , and . In particular, it follows that . Using that , , we apply Lemma 7 in Appendix II and get
(34)
Since for we have
By Lemma 4, this implies the existence of an code with
(35)
Using monotonicity of the function , we get (32), and the first part of the theorem is proved.
To prove the second part of the theorem, we use two asymptotic relations which follow from known results and can be found in [11, Ch. VII, Sec. 6, Problem 14]. The first asymptotic relation can be formulated as follows. If for an integer , , , and as , then

where

(36)

For an code , we apply this relation with and to the inequality (31), which follows from Lemma 3. As a result, we get
(37)
for some constant and sufficiently large . Since and when , this implies that when . If the condition is not satisfied, it is sufficient to consider the case when and . In this case, we use the following known result:

(38)

if and as . Note that all required conditions are satisfied for the if

Moreover, for such a choice

and hence as , by Lemma 3. This implies (33).
A sequence of codes where as is called asymptotically optimal if for any fixed ,

From Theorem 3 it follows that asymptotically optimal sequences of codes exist for any function such that and as .
VI. BOUNDS ON ERROR CORRECTION BEYOND HALF THE MINIMUM DISTANCE
The inequalities (30) and (35) characterize the maximum weight such that almost all errors of this weight can be corrected. However, these results are not constructive: they are based on the existence, for any , , and , of an code such that

(39)

and on asymptotic analysis of the conditions for .
A significant problem is to find an upper bound for when , for some classes of codes including the primitive BCH codes (with , , and ) and their duals, and the Reed–Muller codes RM of order (with , , and ). In this section, we present three such bounds for codes under some restrictions on their parameters. To prove these bounds, we use the following inequality, which follows from Corollary 7:

(40)

for any trial set for .
Given a positive-valued function , an code is called -binomial if there exists a trial set for such that
(41)
for all .
Example 5: All words of the first-order Reed–Muller code RM , except and , have weight and are minimal. This set of codewords is of course a trial set. Since

as , the code RM is -binomial where .
By a recent result of Blinovsky [4], there exists a constant such that for any and there exists an code which is -binomial (this is proved using the distance distribution of the trial set of all nonzero codewords).
Theorem 4: Let be a -binomial code. Then for any ,
(42)
Moreover, if and
then, for any ,
(43)
Proof: From (40), (41), and the well-known combinatorial identity
(44)
we get
Denote the last double sum by . For , this sum consists of the members over the set of all pairs of integers such that . If we alternatively introduce the summation variables and , we see that they range exactly over the set determined by . Since , we get

and this proves (42).
In the general case, we have
Now we shall show that if
and (45)
then we have the following upper bound for any member of the sum :

(46)

First, if , then , and so

Since and , (46) is clearly satisfied in this case. Next, consider . Define by

if
otherwise.

Then . Further

and so . Since, by (45)

Equation (46) is also satisfied in this case. Thus, we have
(47)
where
(48)
We see that
(49)
This is true because
is smaller than since when . Further, since , if the conditions (45) are satisfied we get

since if is even and if is odd. Combining this with (47)–(49), we get
Corollary 10: For any -binomial code and any ,

In particular, this means that the Elias bound [9], [10] and the expurgated bound (see, e.g., [21, pp. 93–95]) for the probability of decoding error of randomly selected codes on the binary-symmetric channel are also valid for arbitrary sequences of -binomial codes, with the extra factor .
The asymptotic results for the error-correction capability function obtained in Section V are based on finding conditions for

Therefore, asymptotic properties of good codes whose existence was proved using random selection are preserved for -binomial codes if as . In this case, it is sufficient to replace by in the previous proofs.
Corollary 11: Let be a sequence of -binomial codes such that as .
i) If , then for any

ii) If , then there exists a function such that and

and, hence, this sequence of codes is asymptotically optimal.
By a theorem of Sidelnikov [23], the BCH codes with are -binomial with as . However, the sequence has rate if , and for these codes

Hence, by Lemma 3, if is a constant or grows slowly with .
Note that it does not follow from Corollary 11 that the codes RM are asymptotically optimal, since for them . Now we prove a bound similar to (39) for an arbitrary code . In particular, this bound will imply that the codes RM are asymptotically optimal. Note that we can assume that due to (29) and (33).
Theorem 5: For any code and any weight ,
(50)
Proof: From (40), combined with Lemma 8 in Appendix II, it follows that

(51)

Since , using (44) for and , we obtain (50).
Let and
(52)
By Lemma 9 in Appendix II, decreases with increasing from to . Therefore, for any , , the equation has a unique solution , .
Corollary 12: Let be a sequence of codes such that , , where . Then for any , as .
Corollary 12 is valid since the exponent of the right-hand side of (50) does not exceed, asymptotically,

when . This follows from inequality (71) of Lemma 10 in Appendix II with , , and hence, as .
Consider a sequence of codes for which for an , , and hence,
By Theorems 2 and 3, we have and .
Corollary 13: Let be a sequence of codes such that as , where . Then for any such that and
(53)
Corollary 13 follows from (50) and (72), where , , and hence, and as .
The asymptotic inequality (53) is a refinement of the Sidelnikov–Pershakov estimate [24] for Reed–Muller codes RM of a fixed order when . Relation (53) implies that for any sequence of codes for which the conditions of Corollary 13 are satisfied (in particular, for the codes RM ), and for any fixed ,
(54)
Note that (54) implies the asymptotic optimality of theReed–Muller codes RM of the first order. On the other hand,it gives only for the Reed–Mullercodes RM of the second order.
In the proof of (50), we estimated the right-hand side of (40) by that of (51). However, we can directly estimate (40) to obtain a sufficient condition for asymptotic optimality.
Theorem 6: Let be a sequence of codes with trial sets and such that , , and, for some constants and ,
(55)
for all . Define by
If
when (56)
then
when (57)
and so is asymptotically optimal.
HELLESETH et al.: ERROR-CORRECTION CAPABILITY OF BINARY LINEAR CODES 1419
Proof: Let . Using (40), (44), and (68), where , , and , we get
(58)
since . First, we note that
We now use Corollary 15 in Appendix II to estimate the binomial coefficients in (58) for even , . Clearly,
and . Next, since and , we get
for sufficiently large. Finally, since , we get
Applying all these estimates in (72), we see that there exists a constant such that for every even and sufficiently large
(59)
Note that (59) is valid for odd , , as well. Indeed, for , , and, since
Hence, (59) with implies (59) with . Therefore,
By assumption (56), the first sum goes to zero when goes to infinity. We have to show that the same is true for the second sum. We have . Hence , and so
Corollary 14: Let be a sequence of codes such that , , and, for a constant ,
for all
Then
when (60)
and so is asymptotically optimal.
Proof: Since , for any when is sufficiently large. For any code we have , and so condition (56) is satisfied.
We note that it follows immediately from Corollary 14 that the first-order Reed–Muller codes are asymptotically optimal. We give some more examples to illustrate the use of Theorem 6 and Corollary 14.
Example 6: For a fixed , let be the dual of the primitive BCH code of length and designed distance . By the Carlitz–Uchiyama bound, the weight of any nonzero codeword of satisfies (see, e.g., [19, p. 280])
Also, . Hence, the conditions of Corollary 14 are satisfied for sufficiently large. Therefore, is asymptotically optimal. We note that the result is true even if grows with , but not too fast .
Example 7: In this last example, we will apply Theorem 6 to show that the second-order Reed–Muller codes RM are asymptotically optimal. The weight distribution of RM is known (see, e.g., [19, p. 443]):
for
RM
RM
RM otherwise
The weight distribution of the minimal words has been determined by Ashikhmin and Barg [2], but we do not need this, except the obvious result that RM .
Since RM , (55) is satisfied for and where RM (actually, Ashikhmin and Barg [2] showed that
RM RM
and so (55) is satisfied for ; in this proof, we only need some , however). It remains to prove that (56) is satisfied. Note that
and
It follows that
RM
Set
and note that . Then the only nonzero terms in the sum in (56) occur for where . Since
for sufficiently large (actually for ), and the number of nonzero terms in (56) is , we get
RM
that is, (56) is satisfied since . Hence, is asymptotically optimal.
VII. CONCLUDING REMARKS AND OPEN PROBLEMS
Although the monotone structure of correctable and uncorrectable errors has been known for many years, its consequences have not really been studied much. We have investigated this structure, especially the minimal uncorrectable errors. One application has been the introduction of trial sets and the gradient-like decoding algorithm based on these sets. A significant problem is to estimate the number of steps in this algorithm, because the weight of the vector under consideration may remain the same from one step to the next, in contrast to Barg's algorithm for finding a closest codeword based on test sets.
Another problem is to investigate the minimal trial sets for some classes of linear codes. Using the weight distribution of minimal trial sets in inequality (19) could improve the results of Section VI for these codes.
We conjecture that the error-correction capability function has the following threshold property: if for a given sequence of linear codes with , , and a given , , there exists such that
(61)
then (61) holds for any , . The validity of (61) for all , , means that
and
for any as . From Theorem 2, it follows that , and these bounds are attained for the codes of Example 3 and for the best known codes, respectively. Moreover, for the latter codes, the behavior is changed to
when runs through an interval of length . It is an open question whether the threshold property is true in general. The Kruskal–Katona bound implies (see [5, pp. 38–39]) that if
for an (not necessarily integer) , , then for any
However, these results (valid for all monotone structures on subsets of an -set) are not sufficient to prove the threshold property above. We hope that this property can be proved with the help of some strengthening of these results for the specific monotone structure formed by correctable and uncorrectable errors for linear codes. Zemor [25] considered a different ordering on which also ensures a monotone structure of correctable and uncorrectable errors (in Appendix I we characterize all such orderings) and proved that, for any sequence of linear codes such that as , the following threshold property is valid. He defined the threshold probability
by the equation and, for an arbitrary , proved that jumps from almost zero to almost one when runs through the interval . Note that if our conjecture is true, then for any ,
and hence, for and for as . If true, this would give a simple proof of Zemor's result and show that as for the considered sequences of linear codes.
The definition of a -error-correcting code as a code , for which , is a natural generalization of a -error-correcting code and possesses the property that a -error-correcting code is a -error-correcting code for any . An interesting open problem, for a given class of -error-correcting codes, is to find the maximal such that these are -error-correcting codes (that is, essentially, to find ). For example, the well-known result of Gorenstein, Peterson, and Zierler that the double-error-correcting BCH codes are quasi-perfect implies that we have
for these codes. For -error-correcting BCH codes, however, the question is open (cf. the investigation of the coset leaders of these codes due to Charpin and Zinoviev [6]). A solution of this problem for the -error-correcting simplex codes can be found in [15]. The problem is still open for other Reed–Muller codes.
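The quasi-perfectness result of Gorenstein, Peterson, and Zierler can be checked by brute force for a small length. The following sketch uses the [15, 7, 5] double-error-correcting BCH code with generator polynomial x^8 + x^7 + x^6 + x^4 + 1; this particular instance and the variable names are illustrative choices, not taken from the paper.

```python
# Brute-force check that the [15,7,5] double-error-correcting BCH code
# is quasi-perfect, i.e., its covering radius equals t + 1 = 3.
from collections import deque

N = 15
G_POLY = 0b1_1101_0001  # x^8+x^7+x^6+x^4+1 = (x^4+x+1)(x^4+x^3+x^2+x+1)

def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)[x]) polynomial multiplication."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# All 2^7 codewords as 15-bit integers: message polynomial times g(x).
codewords = {clmul(m, G_POLY) for m in range(1 << 7)}

# Minimum distance = minimum nonzero codeword weight (linear code).
d = min(bin(c).count("1") for c in codewords if c)

# Covering radius via BFS on the Hamming graph, starting from the code.
dist = {c: 0 for c in codewords}
queue = deque(codewords)
radius = 0
while queue:
    v = queue.popleft()
    for i in range(N):
        u = v ^ (1 << i)
        if u not in dist:
            dist[u] = dist[v] + 1
            radius = max(radius, dist[u])
            queue.append(u)

print(d, radius)  # minimum distance 5, covering radius 3
```

Every coset leader has weight at most 3, while the code corrects all error patterns of weight at most 2, which is exactly the quasi-perfect property cited above.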
The functions and can also be defined for arbitrary codes (and also in the -ary case). Let
be a partition into maximum-likelihood sets (this means that, for any , , and for any ). Then
One can check that is a nonincreasing function of , as in the case of linear codes.
The problem of estimating for randomly selected codes has been considered earlier. In particular, a result of Ahlswede, Bassalygo, and Pinsker [1, Lemma 5] can be formulated as follows: For any and , , there exists a code
such that
and
However, in this paper, for any , , and , we prove the existence of an code for which . Note that for any , , and such that , we have
for an code of a larger size with any for which
and under a weaker restriction (by (38), this ensures that and, hence, ).
Finally, it is worth noting that in our investigation of the quantity for for an code , we use maximum-likelihood decoding (for which errors are corrected). In some papers, is estimated for special classes of codes and other decoding algorithms, which in general do not correct all errors. In particular, in the papers by Krichevsky [16], Sidelnikov and Pershakov [24], and Dumer [8], good bounds on for were obtained for decoding Reed–Muller codes using majority and recursive decoding algorithms, and the complexity of these algorithms was also estimated.
APPENDIX IALTERNATIVE MONOTONE ORDERINGS
We call a linear ordering on an -ordering if when . Given an -ordering on and an code , we say that there is a monotone structure of the sets of correctable and uncorrectable errors, if , then implies and, hence, implies . This is an analogue of a property of the Voronoi region of a code vector in the (continuous) Euclidean space, taking into account that points at the minimal distance from some code vectors have a nonzero measure in the (discrete) Hamming space. It is easy to give examples of an -ordering and an code such that there exist vectors , , , , and . A natural question is to
investigate which -orderings ensure a monotone structure of the sets of correctable and uncorrectable errors for any code . We call an -ordering on monotone if for any different and such that
and (62)
and for any and vectors and defined by
and (63)
we have if . In particular, the ordering defined by (1) is monotone since
and hence, if . The following statement is a generalization of Lemma 1.
Lemma 5: Let be a monotone ordering on and let be an code . If and , then .
Proof: It is sufficient to show the result when is an immediate descendant of ; the general result then follows by induction. Using that and that is an immediate descendant of , we get
(64)
If at least one of the two inequalities in (64) is strict, then , hence, since is an -ordering. It is left to consider the case when and . Let . In this case, we have , , and , and hence, . Therefore, for , , , , we have by the definition of the monotone ordering.
Lemma 6: An -ordering on gives a monotone structure on the sets of correctable and uncorrectable errors for any code if and only if the ordering is monotone.
Proof: A monotone ordering gives a monotone structure on the sets of correctable and uncorrectable errors for any code according to Lemma 5 and the fact that if and only if there exists such that . If an -ordering is not monotone, then there exist vectors and such that (62) is satisfied, and a number , such that
but
where and are defined by (63). Consider the code generated by . Since the sets and form two cosets of , and hence, and
while .
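The monotone structure described above is easy to observe experimentally: choosing in every coset the smallest minimum-weight vector under some monotone ordering makes the set of coset leaders closed under taking descendants. A small self-contained sketch; the [6, 3] generator matrix is an arbitrary illustrative choice, and comparing vectors as binary numbers is used as a stand-in for the lexicographic ordering (1), which is not reproduced in this copy.

```python
# Check the monotone structure of correctable errors for a small binary
# linear code: with the integer-smallest minimum-weight vector chosen as
# leader in every coset, every descendant e' of a coset leader e
# (supp(e') a subset of supp(e)) is itself a coset leader.

N = 6
# Rows of an illustrative [6,3] generator matrix, as 6-bit integers.
G = [0b100110, 0b010101, 0b001011]

code = {0}
for row in G:                              # linear span of the rows
    code |= {c ^ row for c in code}

# Partition all 2^N vectors into cosets v + C.
cosets = {}
for v in range(1 << N):
    rep = min(v ^ c for c in code)         # canonical coset label
    cosets.setdefault(rep, []).append(v)

def weight(v):
    return bin(v).count("1")

# Coset leader: smallest integer among the minimum-weight coset members.
correctable = set()
for coset in cosets.values():
    w = min(weight(v) for v in coset)
    correctable.add(min(v for v in coset if weight(v) == w))

def submasks(e):
    """All e' with supp(e') a subset of supp(e)."""
    s = e
    while True:
        yield s
        if s == 0:
            return
        s = (s - 1) & e

monotone = all(s in correctable for e in correctable for s in submasks(e))
print(monotone)  # True: the correctable errors form a monotone set
```

Replacing the leader choice by an arbitrary minimum-weight coset member generally destroys this down-closure, which is what the non-monotone-ordering counterexample in the proof of Lemma 6 exploits.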
APPENDIX IISOME INEQUALITIES
Lemma 7: Let where and are integers and is a real number. Then
(65)
Proof: The bound (65) for is known and follows from the bounds
if (66)
which is obtained by estimating the sum of binomial coefficients with the help of a geometric progression, and
if (67)
which is a consequence of the Stirling formula (see, for example, [21, p. 466]). To prove that (65) is valid for any real in the range , we show that, for , the function
increases with , or, equivalently
First we note that for
Therefore, it is left to check that the function
decreases for and . Since , and decreases with , we get for . Finally, we have
since when .
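The entropy bound (66) on partial sums of binomial coefficients is easy to sanity-check numerically. A short sketch, assuming the standard statement of the bound with H the binary entropy function (the helper name H2 is ours):

```python
# Numeric check of the standard entropy bound on partial binomial sums:
# for 0 < p <= 1/2,  sum_{i=0}^{p*n} C(n, i) <= 2^(n*H(p)),
# where H is the binary entropy function.
from math import comb, log2

def H2(p: float) -> float:
    """Binary entropy function."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

ok = True
for n in (10, 25, 50, 100):
    for k in range(1, n // 2 + 1):
        p = k / n
        lhs = sum(comb(n, i) for i in range(k + 1))
        ok &= lhs <= 2 ** (n * H2(p))
print(ok)  # True
```

The slack in the bound shrinks as p approaches 1/2, which matches the role of the geometric-progression estimate in the proof above.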
Lemma 8: Let , , , be integers such that
Then
(68)
Further, if , then
(69)
Proof: Denoting by the ratio of the members of the sum in (68) with and , we have
Using in turn that , , , and , we get
Summing the geometric progression we have
where
This gives (68) since . To show (69), we observe that since the interval for starts with the even number , it is sufficient to verify that and for even . For we have
since , and
since
when .
Recall that
Lemma 9: For , the function decreases with increasing , and
(70)
Proof: First we see that
and so is decreasing. To prove (70) we will show that the function
decreases with growing from a positive value at to at . This follows from the fact that
for since the second partial derivative is
and this is positive in this range.
Lemma 10: Let , , be integers such that
Then
(71)
Proof: We estimate the factorials in
using the following refinement of the Stirling formula [11]: $\sqrt{2\pi n}\,(n/e)^n e^{1/(12n+1)} < n! < \sqrt{2\pi n}\,(n/e)^n e^{1/(12n)}$.
The exponent of in the upper bound equals
which is a sum of negative terms. This implies (71).
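Feller's refinement of the Stirling formula cited in this proof pins n! between two explicit bounds; a quick numeric check (the bounds as stated in [11]):

```python
# Check Feller's refinement of the Stirling formula [11]:
# sqrt(2*pi*n)*(n/e)^n * e^(1/(12n+1)) < n! < sqrt(2*pi*n)*(n/e)^n * e^(1/(12n))
from math import exp, factorial, pi, sqrt

ok = True
for n in range(1, 100):
    base = sqrt(2 * pi * n) * (n / exp(1)) ** n
    lower = base * exp(1 / (12 * n + 1))
    upper = base * exp(1 / (12 * n))
    ok &= lower < factorial(n) < upper
print(ok)  # True
```

Both correction factors tend to 1, so either bound recovers the plain Stirling asymptotics used elsewhere in the appendix.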
Combining Lemmas 9 and 10 we get the following result.
Corollary 15: Let , , be integers such that
Then
(72)
ACKNOWLEDGMENT
The authors wish to thank L. A. Bassalygo, A. Barg, and I. Dumer, who have read parts of the manuscript and provided constructive comments that have improved the presentation.
REFERENCES
[1] R. Ahlswede, L. A. Bassalygo, and M. S. Pinsker, "Localized random and arbitrary errors in the light of arbitrarily varying channel theory," IEEE Trans. Inf. Theory, vol. 41, no. 1, pp. 14–25, Jan. 1995.
[2] A. Ashikhmin and A. Barg, "Minimal vectors in linear codes," IEEE Trans. Inf. Theory, vol. 44, no. 5, pp. 2010–2017, Sep. 1998.
[3] A. Barg, "Minimum distance decoding algorithms for linear codes," in Applied Algebra, Algebraic Algorithms and Error-Correcting Codes (Lecture Notes in Computer Science), T. Mora and H. Mattson, Eds. Berlin, Germany: Springer-Verlag, 1997, vol. 1255, pp. 1–14.
[4] V. M. Blinovsky, "Uniform estimate for linear code spectrum" (in Russian), Probl. Pered. Inf., vol. 37, no. 3, pp. 3–5, 2001. English translation in Probl. Inf. Transm., vol. 37, no. 3, pp. 187–189, 2001.
[5] B. Bollobás, Combinatorics: Set Systems, Hypergraphs, Families of Vectors, and Combinatorial Probability. Cambridge, U.K.: Cambridge Univ. Press, 1986.
[6] P. Charpin and V. A. Zinoviev, "On coset weight distributions of the 3-error-correcting BCH codes," SIAM J. Discr. Math., vol. 10, no. 1, pp. 128–145, 1997.
[7] G. D. Cohen and A. Lempel, "Linear intersecting codes," Discr. Math., vol. 56, pp. 35–43, 1984.
[8] I. Dumer, "Recursive decoding of Reed-Muller codes," in Proc. 37th Annu. Allerton Conf. Communication, Control and Computing, Monticello, IL, Sep. 22–24, 1999, pp. 61–69.
[9] P. Elias, "Coding for noisy channels," IRE Conv. Rec., pt. 4, vol. 3, pp. 37–46, 1955. Reprinted in E. R. Berlekamp, Ed., Key Papers in the Development of Coding Theory. New York: IEEE Press, 1974, pp. 48–55.
[10] ——, "Coding for two noisy channels," in Information Theory, C. Cherry, Ed. London, U.K.: Butterworth, 1956, pp. 61–74.
[11] W. Feller, An Introduction to Probability Theory and Its Applications. London, U.K.: Wiley, 1950, vol. 1.
[12] A. B. Fontaine and W. W. Peterson, "Group code equivalence and optimum codes," IEEE Trans. Inf. Theory, vol. IT-5, no. 5, pp. 60–70, May 1959.
[13] D. C. Gorenstein, W. W. Peterson, and N. Zierler, "Two-error correcting Bose-Chaudhuri codes are quasiperfect," Inf. Contr., vol. 3, pp. 291–294, 1960.
[14] T. Helleseth and T. Kløve, "The Newton radius of codes," IEEE Trans. Inf. Theory, vol. 43, no. 6, pp. 1820–1831, Nov. 1997.
[15] T. Helleseth, T. Kløve, and V. Levenshtein, "A coset count that proves that the simplex codes are not optimal for error correction," in Proc. 2003 IEEE Workshop Information Theory, Paris, France, Mar. 31–Apr. 4, 2003, pp. 234–237.
[16] R. E. Krichevsky, "On the number of errors which can be corrected by the Reed-Muller code" (in Russian), Dokl. Akad. Nauk SSSR, vol. 191, no. 3, pp. 544–547, 1970. English translation in Sov. Phys.–Dokl., vol. 15, pp. 220–222, 1970/1971.
[17] V. I. Levenshtein, "Bounds on the probability of undetected error" (in Russian), Probl. Pered. Inform., vol. 13, no. 1, pp. 3–18, 1977. English translation in Probl. Inform. Transm., vol. 13, no. 1, pp. 1–12, 1977.
[18] L. Levitin and C. R. P. Hartman, "A new approach to the general minimum distance decoding problem: The zero-neighbors algorithm," IEEE Trans. Inf. Theory, vol. IT-31, no. 3, pp. 378–384, May 1985.
[19] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1986.
[20] J. Massey, "Minimal codewords and secret sharing," in Proc. 6th Joint Swedish-Russian Workshop on Information Theory, Mölle, Sweden, 1993, pp. 276–279.
[21] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes. Cambridge, MA: MIT Press, 1972.
[22] G. Poltyrev, "Bounds on the decoding error probability of binary linear codes via their spectra," IEEE Trans. Inf. Theory, vol. 40, no. 4, pp. 1284–1292, Jul. 1994.
[23] V. M. Sidelnikov, "Weight spectrum of binary Bose-Chaudhuri-Hocquenghem codes" (in Russian), Probl. Pered. Inform., vol. 7, no. 1, pp. 14–22, 1971. English translation in Probl. Inf. Transm., vol. 7, no. 1, pp. 11–17, 1971.
[24] V. M. Sidelnikov and A. S. Pershakov, "Decoding Reed-Muller codes with a large number of errors" (in Russian), Probl. Pered. Inform., vol. 28, no. 3, pp. 80–94, 1992. English translation in Probl. Inf. Transm., vol. 28, no. 3, pp. 269–281, 1992.
[25] G. Zemor, "Threshold effects in codes," in Proc. Algebraic Coding, Paris, 1993 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 1994, vol. 781, pp. 278–296.