
Mathematical Programming 41 (1988) 281-315 281 North-Holland

A SIMPLEX ALGORITHM FOR PIECEWISE-LINEAR PROGRAMMING II: FINITENESS, FEASIBILITY AND DEGENERACY

Robert FOURER

Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208, USA

Received March 1985 Revised manuscript received 11 May 1987

The simplex method for linear programming can be extended to permit the minimization of any convex separable piecewise-linear objective, subject to linear constraints. Part I of this paper has developed a general and direct simplex algorithm for piecewise-linear programming, under convenient assumptions that guarantee a finite number of basic solutions, existence of basic feasible solutions, and nondegeneracy of all such solutions. Part II now shows how these assumptions can be weakened so that they pose no obstacle to effective use of the piecewise-linear simplex algorithm. The theory of piecewise-linear programming is thereby extended, and numerous features of linear programming are generalized or are seen in a new light. An analysis of the algorithm's computational requirements and a survey of applications will be presented in Part III.

Key words: Linear programming, simplex methods, piecewise-linear programming, nondifferentiable optimization.

1. Introduction

A piecewise-linear program minimizes a convex separable piecewise-linear objective function, subject to linear constraints. Part I of this paper [8] has developed a version of the simplex algorithm that solves piecewise-linear programs directly. Although this piecewise-linear simplex algorithm differs in certain ways from familiar linear simplex methods, it has substantially the same steps and computational requirements as the algorithm that is commonly applied to solve linear programs in bounded variables.

The derivations in Part I rely on three restrictive assumptions about the nature of piecewise-linear programs. Part II now shows how these assumptions can be weakened so that they pose no obstacle to effective use of the piecewise-linear simplex algorithm. The theory of piecewise-linear programming is thereby extended, and numerous features of linear programming are generalized or are seen in a new light. The stage is thus set for a detailed analysis of the algorithm's computational requirements, in Part III [9] of this paper.

This research has been supported in part by the National Science Foundation under grant DMS-8217261.


Outline

The remainder of this introduction briefly reviews the terminology and properties of piecewise-linear programming that will be required in the sequel, and restates the piecewise-linear simplex algorithm. Subsequent sections consider relaxing or removing Part I's three assumptions, in the reverse of the order in which they were imposed.

According to the third assumption, nondegeneracy, every basic variable in every basic solution must lie strictly within one of the objective function's intervals of linearity. At bases that violate this restriction, the piecewise-linear simplex algorithm may be forced to perform degenerate iterations that fail to reduce the objective. Section 2 shows how the algorithm may be extended so that it is well-defined in the presence of degeneracy, and so that it discourages degenerate iterations or executes them in an efficient way. Section 3 then describes how the algorithm can be further modified, by extension of well-known perturbational and combinatorial approaches, to prevent infinite cycles of degenerate iterations.

The second assumption says that every piecewise-linear program has a basic feasible solution, if it has any feasible solutions at all. Section 4 demonstrates that a basic (but perhaps infeasible) solution can be constructed in all but certain exceptional situations that are easily circumvented. Once any basic solution is known, a basic feasible solution can be found by minimizing a piecewise-linear penalty function or by adding weighted penalty terms to the true objective, much as in bounded-variable linear programming. If penalties are added then the required size of their weights has a simple relation to the optimal dual values.

Under the first assumption, the objective function of a piecewise-linear program must be defined by only finitely many intervals of linearity. Section 5 shows that the piecewise-linear simplex algorithm remains well-defined under much weaker conditions that permit an infinite number of intervals. These conditions must again be somewhat strengthened, however, if the algorithm is to retain its property of finite termination. The principal difficulty in the analysis is that, although the relaxation of finiteness need not affect the local nature of piecewise-linear functions, it does significantly expand their range of asymptotic behavior.

Piecewise-linear programs

A closed proper convex piecewise-linear (P-L) function $[c/\gamma]_k$ of $x_k$ is characterized (to within an additive constant) by an increasing sequence of breakpoints $\gamma_k^{(h)}$,

$\cdots < \gamma_k^{(-1)} < \gamma_k^{(0)} < \gamma_k^{(1)} < \cdots,$

and by an increasing sequence of slopes $c_k^{(h)}$,

$\cdots < c_k^{(-2)} < c_k^{(-1)} < c_k^{(0)} < c_k^{(1)} < c_k^{(2)} < \cdots.$

At least one slope or one breakpoint must be finite. If $c_k^{(h)}$ is finite, then $[c/\gamma]_k x_k$ is linear with slope $c_k^{(h)}$ on the interval between $\gamma_k^{(h)}$ and $\gamma_k^{(h+1)}$. Infinite slopes are interpreted as follows:

$c_k^{(s)} = -\infty \;\Rightarrow\; [c/\gamma]_k x_k = \infty$ for all $x_k < \gamma_k^{(s+1)}$;

$c_k^{(s)} = +\infty \;\Rightarrow\; [c/\gamma]_k x_k = \infty$ for all $x_k > \gamma_k^{(s)}$.

Thus the slope-breakpoint sequences can effectively begin with either some $c_k^{(s)} = -\infty$ and $\gamma_k^{(s+1)}$ finite, or some $\gamma_k^{(s)} = -\infty$ and $c_k^{(s)}$ finite; they can effectively end with either some $\gamma_k^{(s)}$ finite and $c_k^{(s)} = +\infty$, or some $c_k^{(s)}$ finite and $\gamma_k^{(s+1)} = +\infty$.
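This slope-breakpoint representation can be made concrete in a short sketch. The function below (not part of the paper; the normalization to value 0 at the first breakpoint is an assumption of this sketch, consistent with "to within an additive constant") evaluates one component $[c/\gamma]_k x_k$ from finite lists of breakpoints and slopes, with infinite slopes handled as described above:

```python
import math

def pl_value(g, c, x):
    """Convex piecewise-linear function with breakpoints g[0] < ... < g[m-1]
    and slopes c[0] < ... < c[m], where c[h] applies on [g[h-1], g[h]]
    (c[0] to the left of g[0], c[m] to the right of g[m-1]).  Normalized
    to 0 at g[0]; an infinite slope makes the value +inf beyond the
    adjacent breakpoint, as in the text."""
    if x < g[0]:
        return math.inf if c[0] == -math.inf else c[0] * (x - g[0])
    v = 0.0
    for h in range(1, len(c)):
        left = g[h - 1]
        right = g[h] if h < len(g) else math.inf
        if x <= right:
            if x == left:
                return v
            return math.inf if c[h] == math.inf else v + c[h] * (x - left)
        v += c[h] * (right - left)

# Slopes -3 < 1 < 5 around breakpoints -1 < 2: convex and finite everywhere.
assert pl_value([-1.0, 2.0], [-3.0, 1.0, 5.0], 4.0) == 13.0
```

A separable objective $[c/\gamma]x$ is then simply the sum of such components over $k$.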

A closed proper convex separable piecewise-linear function $[c/\gamma]$ of $x = (x_1, \ldots, x_n)$ is defined by a sum of closed proper convex piecewise-linear functions:

$[c/\gamma]x = \sum_{k=1}^{n} [c/\gamma]_k x_k.$

A piecewise-linear program (P-LP) minimizes such a function over all $x$ that satisfy certain linear equations:

Minimize $[c/\gamma]x$
subject to $Ax = b$.

The matrix $A = [a_1, \ldots, a_n]$ is $m \times n$ and may be taken to have linearly independent rows. A solution $\bar x$ to $Ax = b$ is feasible if $[c/\gamma]\bar x$ is finite; thus $\bar x$ is feasible if each $\bar x_k$ lies in an interval of finite slope of $[c/\gamma]_k$.

Basic solutions for piecewise-linear programs

A basis matrix $B$ is an $m \times m$ nonsingular submatrix of $A$. The vector $x_B$ denotes the basic variables corresponding to the columns of $B$, and $x_N$ denotes the remaining, nonbasic variables; an individual variable $x_k$ is described as $x_{Bi}$ when it is in the basic set, and as $x_{Nj}$ when it is nonbasic. An analogous notation is employed to distinguish other vectors and values that correspond to basic and nonbasic variables.

Given a basis matrix $B$ and a vector $\gamma_N$ of finite breakpoints, a unique basic solution is defined by

$\bar x_N = \gamma_N,$
$B\bar x_B = b - \sum_j a_{Nj}\gamma_{Nj}.$

Under the assumption of nondegeneracy, the values of the basic variables determine a unique vector of slopes $c_B$, where

$c_{Bi} = c_{Bi}^{(h)}$ if $\gamma_{Bi}^{(h)} < \bar x_{Bi} < \gamma_{Bi}^{(h+1)}$.

The basic solution is necessarily feasible if $-\infty < c_{Bi} < +\infty$ for all $i$.
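For a tiny numerical instance, the basic solution can be computed exactly as the definition prescribes: fix the nonbasic variables at breakpoints and solve for $x_B$. All data below are invented for illustration, and the 2x2 solver is a stand-in for the factorization a real implementation would use:

```python
def solve2(B, rhs):
    """Solve a 2x2 system B y = rhs by Cramer's rule (illustration only)."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(rhs[0] * B[1][1] - rhs[1] * B[0][1]) / det,
            (B[0][0] * rhs[1] - B[1][0] * rhs[0]) / det]

# Hypothetical data: A = [a1 a2 a3] with m = 2 rows, right-hand side b.
a1, a2, a3 = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
b = [4.0, 3.0]

# Basis B = [a1 a2]; the nonbasic x3 is fixed at its breakpoint gamma3.
gamma3 = 1.0
rhs = [b[i] - a3[i] * gamma3 for i in range(2)]   # b - a_{N3} * gamma_{N3}
x_B = solve2([[a1[0], a2[0]], [a1[1], a2[1]]], rhs)
assert x_B == [3.0, 2.0]   # the unique basic solution for this basis
```

Whether this basic solution is feasible then depends on whether each $\bar x_{Bi}$ falls in an interval of finite slope of its objective term.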


Relative to the breakpoint $\gamma_{Nj}$ at which $x_{Nj}$ is fixed, the slopes and breakpoints of $[c/\gamma]_{Nj}$ are denoted as follows:

Interval | Slope of $x_{Nj}$
$\vdots$ |
$[\gamma_{Nj}^{-\{h\}}, \gamma_{Nj}^{-\{h-1\}}]$ | $c_{Nj}^{-\{h\}}$
$\vdots$ |
$[\gamma_{Nj}^{-\{2\}}, \gamma_{Nj}^{-\{1\}}]$ | $c_{Nj}^{-\{2\}}$
$[\gamma_{Nj}^{-\{1\}}, \gamma_{Nj}]$ | $c_{Nj}^{-\{1\}}$
$[\gamma_{Nj}, \gamma_{Nj}^{+\{1\}}]$ | $c_{Nj}^{+\{1\}}$
$[\gamma_{Nj}^{+\{1\}}, \gamma_{Nj}^{+\{2\}}]$ | $c_{Nj}^{+\{2\}}$
$\vdots$ |
$[\gamma_{Nj}^{+\{h-1\}}, \gamma_{Nj}^{+\{h\}}]$ | $c_{Nj}^{+\{h\}}$
$\vdots$ |

The slopes and breakpoints for each $[c/\gamma]_{Bi}$ are described relative to $c_{Bi}$ by a complementary notation:

Interval | Slope of $x_{Bi}$
$\vdots$ |
$[\gamma_{Bi}^{-\{h+1\}}, \gamma_{Bi}^{-\{h\}}]$ | $c_{Bi}^{-\{h\}}$
$\vdots$ |
$[\gamma_{Bi}^{-\{3\}}, \gamma_{Bi}^{-\{2\}}]$ | $c_{Bi}^{-\{2\}}$
$[\gamma_{Bi}^{-\{2\}}, \gamma_{Bi}^{-\{1\}}]$ | $c_{Bi}^{-\{1\}}$
$[\gamma_{Bi}^{-\{1\}}, \gamma_{Bi}^{+\{1\}}]$ | $c_{Bi}$
$[\gamma_{Bi}^{+\{1\}}, \gamma_{Bi}^{+\{2\}}]$ | $c_{Bi}^{+\{1\}}$
$[\gamma_{Bi}^{+\{2\}}, \gamma_{Bi}^{+\{3\}}]$ | $c_{Bi}^{+\{2\}}$
$\vdots$ |
$[\gamma_{Bi}^{+\{h\}}, \gamma_{Bi}^{+\{h+1\}}]$ | $c_{Bi}^{+\{h\}}$
$\vdots$ |

By convention, $c_{Nj}^{-\{0\}} = c_{Nj}^{+\{1\}}$, $c_{Nj}^{+\{0\}} = c_{Nj}^{-\{1\}}$, and $\gamma_{Nj}^{-\{0\}} = \gamma_{Nj}^{+\{0\}} = \gamma_{Nj}$. Similarly, $\gamma_{Bi}^{-\{0\}} = \gamma_{Bi}^{+\{1\}}$, $\gamma_{Bi}^{+\{0\}} = \gamma_{Bi}^{-\{1\}}$, and $c_{Bi}^{-\{0\}} = c_{Bi}^{+\{0\}} = c_{Bi}$.

Starting from a basic solution $\bar x$, a line of solutions may be generated by pushing some nonbasic variable away from its current breakpoint $\gamma_{Nj}$ while adjusting the values of the basic variables to maintain $Ax = b$:

$x_{Nj} = \gamma_{Nj} + \theta,\quad x_B = \bar x_B - \theta y_B,$ where $By_B = a_{Nj}$; or
$x_{Nj} = \gamma_{Nj} - \theta,\quad x_B = \bar x_B - \theta y_B,$ where $By_B = -a_{Nj}$.   (1.1)


The rate of increase in the objective as $x_{Nj}$ increases from $\gamma_{Nj}$ is given by a reduced cost $d_{Nj}^+$, and the rate of decrease in the objective as $x_{Nj}$ decreases from $\gamma_{Nj}$ is given by $d_{Nj}^-$:

$d_{Nj}^+ = c_{Nj}^{+\{1\}} - c_B y_B = c_{Nj}^{+\{1\}} - \bar\pi a_{Nj},$ or
$d_{Nj}^- = c_{Nj}^{-\{1\}} + c_B y_B = c_{Nj}^{-\{1\}} - \bar\pi a_{Nj},$   (1.2)

where $\bar\pi B = c_B$. Suppose that $x_{Np}$, in particular, is increased or decreased by an amount $\theta$. Then it must lie between some two breakpoints:

$\gamma_{Np}^{+\{r-1\}} \le \gamma_{Np} + \theta \le \gamma_{Np}^{+\{r\}},$ or
$\gamma_{Np}^{-\{r-1\}} \ge \gamma_{Np} - \theta \ge \gamma_{Np}^{-\{r\}},$

and the basic variables $x_{Bi}$ must also lie between certain breakpoints:

$\gamma_{Bi}^{+\{r_i-1\}} \le \bar x_{Bi} - \theta y_{Bi} \le \gamma_{Bi}^{+\{r_i\}}$ for each $y_{Bi} < 0$, and
$\gamma_{Bi}^{-\{r_i-1\}} \ge \bar x_{Bi} - \theta y_{Bi} \ge \gamma_{Bi}^{-\{r_i\}}$ for each $y_{Bi} > 0$.

The rate of increase in the objective as $x_{Np} \to \gamma_{Np} + \theta$, or the rate of decrease in the objective as $x_{Np} \to \gamma_{Np} - \theta$, is given by an effective reduced cost $d_{Np}^+(\theta)$ or $d_{Np}^-(\theta)$:

$d_{Np}^+(\theta) = c_{Np}^{+\{r\}} - \sum_{y_{Bi}<0} c_{Bi}^{+\{r_i-1\}} y_{Bi} - \sum_{y_{Bi}>0} c_{Bi}^{-\{r_i-1\}} y_{Bi},$ or
$d_{Np}^-(\theta) = c_{Np}^{-\{r\}} + \sum_{y_{Bi}<0} c_{Bi}^{+\{r_i-1\}} y_{Bi} + \sum_{y_{Bi}>0} c_{Bi}^{-\{r_i-1\}} y_{Bi}.$

Both are monotonic in $\theta$: $d_{Nj}^+ \le d_{Nj}^+(\theta_1) \le d_{Nj}^+(\theta_2)$ and $d_{Nj}^- \ge d_{Nj}^-(\theta_1) \ge d_{Nj}^-(\theta_2)$ for any $\theta_1 < \theta_2$.
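The effective reduced cost is easy to compute once the one-sided slopes are known. The sketch below (all numerical data hypothetical) evaluates $d_{Np}^+(\theta)$ in the simplest case $r = r_i = 1$, i.e. for a step small enough that no variable crosses a breakpoint:

```python
def d_plus_effective(c_Np_plus1, y_B, c_B_plus, c_B_minus):
    """Effective reduced cost d+_Np(theta) when no breakpoint is crossed
    (r = r_i = 1).  y_B solves B y_B = a_Np; a basic variable with
    y_B[i] < 0 increases, so its right-hand slope c_B_plus[i] applies,
    and one with y_B[i] > 0 decreases, so c_B_minus[i] applies."""
    total = c_Np_plus1
    for yi, cp, cm in zip(y_B, c_B_plus, c_B_minus):
        if yi < 0:
            total -= cp * yi
        elif yi > 0:
            total -= cm * yi
    return total

# A favorable direction: entering slope -2 stays favorable after accounting
# for the basic slopes encountered along the way.
assert d_plus_effective(-2.0, [-1.0, 2.0], [1.0, 3.0], [0.0, 2.0]) == -5.0
```

For larger $\theta$ the one-sided slopes are replaced by $c_{Bi}^{+\{r_i-1\}}$ and $c_{Bi}^{-\{r_i-1\}}$ as breakpoints are crossed, which is what makes $d_{Np}^+(\theta)$ nondecreasing in $\theta$.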

A simplex algorithm for piecewise-linear programs

Let $\bar x$ be a basic feasible solution defined by a basis matrix $B$ and breakpoint values $\gamma_N$, and let $c_B$ be the corresponding vector of basic slopes. One iteration of the piecewise-linear simplex algorithm proceeds as follows:

(1) Solve $\bar\pi B = c_B$.

(2) Test for optimality: If the nonbasic variables satisfy

$c_{Nj}^{-\{1\}} \le \bar\pi a_{Nj} \le c_{Nj}^{+\{1\}}$ for all $j$,

then the basic solution is optimal. Stop.

(3) Select an entering variable: Choose a variable $x_{Np}$ such that

$d_{Np}^+ = c_{Np}^{+\{1\}} - \bar\pi a_{Np} < 0$ or $d_{Np}^- = c_{Np}^{-\{1\}} - \bar\pi a_{Np} > 0$.

(4) Solve $By_B = a_{Np}$ if $d_{Np}^+ < 0$, or $By_B = -a_{Np}$ if $d_{Np}^- > 0$.

(5) Test for unboundedness: If

$\gamma_{Bi}^{+\{1\}} = +\infty$ for all $y_{Bi} < 0$,
$\gamma_{Bi}^{-\{1\}} = -\infty$ for all $y_{Bi} > 0$, and
$\gamma_{Np}^{+\{1\}} = +\infty$ if $d_{Np}^+ < 0$, or
$\gamma_{Np}^{-\{1\}} = -\infty$ if $d_{Np}^- > 0$,

then the objective can decrease without bound. Stop.

(6) Select a leaving variable: Choose any variable $x_{Bq}$ and breakpoint $\gamma_{Bq}^{+\{r_q\}}$ or $\gamma_{Bq}^{-\{r_q\}}$ such that

(a) $(\bar x_{Bq} - \gamma_{Bq}^{+\{r_q\}})/y_{Bq} = \theta$ and $y_{Bq} < 0$, or
(b) $(\bar x_{Bq} - \gamma_{Bq}^{-\{r_q\}})/y_{Bq} = \theta$ and $y_{Bq} > 0$;

or choose $x_{Np}$ and any breakpoint $\gamma_{Np}^{+\{r\}}$ or $\gamma_{Np}^{-\{r\}}$ such that

(c) $(\gamma_{Np}^{+\{r\}} - \gamma_{Np}) = \theta$ and $d_{Np}^+ < 0$, or
(d) $(\gamma_{Np} - \gamma_{Np}^{-\{r\}}) = \theta$ and $d_{Np}^- > 0$;

provided $\theta$ is small enough that

$d_{Np}^+ < 0$ and $d_{Np}^+(\theta) < 0$, or $d_{Np}^- > 0$ and $d_{Np}^-(\theta) > 0$.

(7) Update the basic solution: Reset

$\bar x_B \leftarrow \bar x_B - \theta y_B$, and
$\bar x_{Np} \leftarrow \gamma_{Np} + \theta$ if $d_{Np}^+ < 0$, or
$\bar x_{Np} \leftarrow \gamma_{Np} - \theta$ if $d_{Np}^- > 0$,

and set $c_{Bi}$ accordingly for $i \ne q$:

$c_{Bi} = c_{Bi}^{+\{r_i-1\}}$ for $y_{Bi} < 0$,
$c_{Bi} = c_{Bi}^{-\{r_i-1\}}$ for $y_{Bi} > 0$.

(8) Change the basis, according to the choice made in step (6):

If (a) or (b): Replace $a_{Bq}$ by $a_{Np}$ in $B$. Set $c_{Np} = c_{Np}^{+\{r\}}$ (if $d_{Np}^+ < 0$) or $c_{Np} = c_{Np}^{-\{r\}}$ (if $d_{Np}^- > 0$). Set $\gamma_{Bq} = \gamma_{Bq}^{+\{r_q\}}$ (if $y_{Bq} < 0$) or $\gamma_{Bq} = \gamma_{Bq}^{-\{r_q\}}$ (if $y_{Bq} > 0$).

If (c) or (d): Set $\gamma_{Np} = \gamma_{Np}^{+\{r\}}$ (if $d_{Np}^+ < 0$) or $\gamma_{Np} = \gamma_{Np}^{-\{r\}}$ (if $d_{Np}^- > 0$).

The piecewise-linear simplex algorithm carries out a series of these iterations, each starting from the $B$, $c_B$, $\gamma_N$, and $\bar x$ determined at the end of the preceding iteration. The algorithm stops when, at some iteration, the termination conditions of step (2) or step (5) are satisfied.
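The ratio computations of step (6) for the basic variables can be sketched as follows. This is a deliberately simplified illustration with hypothetical data: it returns only the smallest basic ratio, ignoring the entering variable's own breakpoints (cases (c) and (d)) and the P-L algorithm's option of passing breakpoints while $d(\theta)$ remains favorable:

```python
import math

def basic_ratio_test(x_B, y_B, up, down):
    """Ratios of step (6) for the basic variables: the step theta at which
    x_B[i] - theta * y_B[i] first reaches its next breakpoint.  up[i] is
    the breakpoint above x_B[i] (gamma_Bi^{+{1}}) and down[i] the one
    below (gamma_Bi^{-{1}}); either may be infinite.  Returns the
    smallest ratio and its index q."""
    theta, q = math.inf, None
    for i, (xi, yi) in enumerate(zip(x_B, y_B)):
        if yi < 0 and up[i] < math.inf:
            t = (xi - up[i]) / yi      # x_Bi increases toward up[i]
        elif yi > 0 and down[i] > -math.inf:
            t = (xi - down[i]) / yi    # x_Bi decreases toward down[i]
        else:
            continue
        if t < theta:
            theta, q = t, i
    return theta, q

assert basic_ratio_test([3.0, 2.0], [-1.0, 1.0], [5.0, math.inf],
                        [-math.inf, 0.0]) == (2.0, 0)
```

If every relevant breakpoint is infinite the sketch returns $(\infty, \text{None})$, which corresponds to the unboundedness test of step (5).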

2. Degeneracy

Suppose that the nondegeneracy assumption is dropped. Then the algorithm defined above may encounter a basic feasible solution in which certain basic variables lie exactly at finite breakpoints. For each $i$ such that $\bar x_{Bi} = \gamma_{Bi}^{(s_i)}$, the variable $x_{Bi}$ cannot be associated with a unique slope $c_{Bi}$; the slope of $[c/\gamma]_{Bi} x_{Bi}$ is either $c_{Bi}^{(s_i)}$ or $c_{Bi}^{(s_i-1)}$, depending on whether $x_{Bi}$ increases or decreases from $\bar x_{Bi}$. The vector $\bar\pi$ and the reduced costs $d_{Nj}^+$ and $d_{Nj}^-$ are thus ill-defined in steps (1)-(3), and the algorithm cannot proceed.

To deal with this difficulty, the piecewise-linear simplex algorithm may arbitrarily choose $c_{Bi}$ as either $c_{Bi}^{(s_i)}$ or $c_{Bi}^{(s_i-1)}$ whenever $\bar x_{Bi} = \gamma_{Bi}^{(s_i)}$. The first part of this section shows that the algorithm can then be applied unambiguously, even though it may encounter degenerate iterations at which the objective value fails to decrease. The termination conditions of steps (2) and (5) are also seen to remain valid.

The remainder of this section takes a closer look at the consequences of degeneracy in P-L simplex steps. An iteration is shown to compute both true reduced costs that determine whether the objective can be reduced, and nominal reduced costs that determine optimality. The arrangements for computing these quantities can have a substantial effect upon both the complexity of an iteration and the path of iterations.

Further modifications are necessary to insure that the algorithm cannot fall into an endless cycle of degenerate iterations. Criteria for the prevention of cycling are derived in Section 3 below.

Accommodation of degenerate bases

Suppose that the P-L simplex algorithm is modified so that a finite slope $c_{Bi}$ is assigned arbitrarily to each degenerate basic variable $\bar x_{Bi} = \gamma_{Bi}^{(s_i)}$, as described above. The slopes and breakpoints associated with $x_{Bi}$ are then as follows:

If $c_{Bi} = c_{Bi}^{(s_i-1)} > -\infty$:

$c_{Bi}^{+\{h\}} = c_{Bi}^{(s_i+h-1)}$, $c_{Bi}^{-\{h\}} = c_{Bi}^{(s_i-h-1)}$,
$\gamma_{Bi}^{+\{h\}} = \gamma_{Bi}^{(s_i+h-1)}$, $\gamma_{Bi}^{-\{h\}} = \gamma_{Bi}^{(s_i-h)}$.

If $c_{Bi} = c_{Bi}^{(s_i)} < +\infty$:

$c_{Bi}^{+\{h\}} = c_{Bi}^{(s_i+h)}$, $c_{Bi}^{-\{h\}} = c_{Bi}^{(s_i-h)}$,
$\gamma_{Bi}^{+\{h\}} = \gamma_{Bi}^{(s_i+h)}$, $\gamma_{Bi}^{-\{h\}} = \gamma_{Bi}^{(s_i-h+1)}$.

If both $c_{Bi}^{(s_i-1)} = -\infty$ and $c_{Bi}^{(s_i)} = +\infty$, then no finite $c_{Bi}$ can be assigned. In such a case, however, every feasible solution must have $\bar x_i = \gamma_i^{(s_i)}$, so that $x_i$ can be fixed and dropped from the problem in advance.

Once $c_B$ is chosen, $\bar\pi$ can be computed in step (1). If the conditions of step (2) are satisfied, then the basis must be optimal; the proof of optimality in Part I of this paper [8] is applicable regardless of degeneracy.

If the optimality conditions do not hold, the iteration may proceed to select some $x_{Np}$ in step (3) and to compute $y_B$ in step (4). The meaning of $y_B$ is preserved under degeneracy: when $x_{Np}$ is increased (if $d_{Np}^+ < 0$) or decreased (if $d_{Np}^- > 0$) by $\theta$ from $\gamma_{Np}$, each $x_{Bi}$ changes by $-\theta y_{Bi}$. Thus the proof of unboundedness in Part I is also applicable. If the conditions of step (5) are satisfied, then the P-LP must have no finite minimum.


If the unboundedness conditions do not hold, then a leaving variable can be selected. The meanings of the effective reduced costs are also preserved under degeneracy: for any positive $\theta$, either $d_{Np}^+(\theta)$ is the rate of increase in the objective as $x_{Np}$ approaches $\gamma_{Np} + \theta$ from below, or $d_{Np}^-(\theta)$ is the rate of decrease in the objective as $x_{Np}$ approaches $\gamma_{Np} - \theta$ from above. Thus if step (6) can find a $\theta > 0$ such that $d_{Np}^+(\theta) < 0$ or $d_{Np}^-(\theta) > 0$, the iteration can be completed as before, and the new basic solution will yield a lower value of the objective.

In the presence of degeneracy, however, the algorithm may not be able to choose a positive $\theta$. For each $\bar x_{Bi} = \gamma_{Bi}^{(s_i)}$, one of the ratios in step (6) may be zero:

$(\bar x_{Bi} - \gamma_{Bi}^{+\{1\}})/y_{Bi} = 0$ if $y_{Bi} < 0$ and $c_{Bi} = c_{Bi}^{(s_i-1)}$, or
$(\bar x_{Bi} - \gamma_{Bi}^{-\{1\}})/y_{Bi} = 0$ if $y_{Bi} > 0$ and $c_{Bi} = c_{Bi}^{(s_i)}$.

The proof of termination in Part I argued that $\theta$ could always be chosen positive, because it could always be at least as large as the smallest ratio. If either of the above situations occurs, however, then the smallest ratio is zero, and the reasoning of the proof breaks down. Indeed, it can happen at a degenerate basis that

$d_{Np}^+ < 0$ but $d_{Np}^+(\theta) \ge 0$, or $d_{Np}^- > 0$ but $d_{Np}^-(\theta) \le 0$,

for all $\theta > 0$. Then even though the computed reduced cost is favorable, the objective value cannot decrease when $x_{Np}$ is pushed up or down from $\gamma_{Np}$.

The P-L simplex algorithm can accommodate this situation provided that step (6) is interpreted to allow a choice of $\theta = 0$. The variable $x_{Np}$ then enters the basis at its breakpoint value $\gamma_{Np}$, and some degenerate variable $x_{Bq}$ leaves the basis at its current breakpoint value. This is a degenerate iteration; the basis changes, but the basic solution and the objective value remain the same.

Reduced costs at degenerate bases

At any basic solution, regardless of degeneracy, it is meaningful to imagine that a nonbasic variable $x_{Nj}$ is pushed up or down by a small amount $\delta$ from its current breakpoint value $\gamma_{Nj}$, and that the basic variables are adjusted accordingly to maintain $Ax = b$. From formula (1.1), the resulting values of $x_{Nj}$ and the basic variables may be written as

$x_{Nj} = \gamma_{Nj} + \delta,\quad x_B = \bar x_B - \delta w_{Nj:B},$ or
$x_{Nj} = \gamma_{Nj} - \delta,\quad x_B = \bar x_B + \delta w_{Nj:B},$ where $Bw_{Nj:B} = a_{Nj}$.

For each $x_{Nj}$ there is a different vector $w_{Nj:B}$, whose element $w_{Nj:Bi}$ determines how $x_{Bi}$ will change as $x_{Nj}$ is pushed. For the entering variable $x_{Np}$ in particular, $w_{Np:B}$ is just either $y_B$ (if $d_{Np}^+ < 0$) or $-y_B$ (if $d_{Np}^- > 0$) as computed in step (4) of the algorithm.

All relevant reduced costs for $x_{Nj}$ can be expressed in terms of $w_{Nj:B}$. Given any choice of slopes $c_{Bi}$, formula (1.2) implies that the values $d_{Nj}^+$ and $d_{Nj}^-$ computed


in step (2) are

$d_{Nj}^+ = c_{Nj}^{+\{1\}} - \sum_i c_{Bi} w_{Nj:Bi},$
$d_{Nj}^- = c_{Nj}^{-\{1\}} - \sum_i c_{Bi} w_{Nj:Bi},$   (2.1)

where $c_{Nj}^{-\{1\}}$ and $c_{Nj}^{+\{1\}}$ have been defined as the slopes in $[c/\gamma]_{Nj}$ to the left and right, respectively, of $\gamma_{Nj}$. Let $c_{Bi}^-$ and $c_{Bi}^+$ similarly denote the slopes in $[c/\gamma]_{Bi}$ to the left and right of $\bar x_{Bi}$. Then for all small enough $\delta > 0$, the effective reduced costs must be

$d_{Nj}^+(\delta) = c_{Nj}^{+\{1\}} - \sum_{w_{Nj:Bi}<0} c_{Bi}^+ w_{Nj:Bi} - \sum_{w_{Nj:Bi}>0} c_{Bi}^- w_{Nj:Bi},$
$d_{Nj}^-(\delta) = c_{Nj}^{-\{1\}} - \sum_{w_{Nj:Bi}<0} c_{Bi}^- w_{Nj:Bi} - \sum_{w_{Nj:Bi}>0} c_{Bi}^+ w_{Nj:Bi}.$   (2.2)

If all basic variables lie between breakpoints then $c_B^- = c_B = c_B^+$, so that $d_{Nj}^+ = d_{Nj}^+(\delta)$ and $d_{Nj}^- = d_{Nj}^-(\delta)$ as expected. If any $x_{Bi} = \gamma_{Bi}^{(s_i)}$, however, then $c_{Bi}^- = c_{Bi}^{(s_i-1)}$ but $c_{Bi}^+ = c_{Bi}^{(s_i)}$, and there must be some differences:

Coefficients of $w_{Nj:Bi}$ differ in ... | if ... | and ...
$d_{Nj}^+$ and $d_{Nj}^+(\delta)$ | $w_{Nj:Bi} < 0$ | $c_{Bi} = c_{Bi}^{(s_i-1)}$
 | $w_{Nj:Bi} > 0$ | $c_{Bi} = c_{Bi}^{(s_i)}$
$d_{Nj}^-$ and $d_{Nj}^-(\delta)$ | $w_{Nj:Bi} < 0$ | $c_{Bi} = c_{Bi}^{(s_i)}$
 | $w_{Nj:Bi} > 0$ | $c_{Bi} = c_{Bi}^{(s_i-1)}$

To express the total difference concisely, let $\Omega_{Nj}^+$ and $\Omega_{Nj}^-$ be the disjoint sets of indices $i$ for which the coefficients of $w_{Nj:Bi}$ differ in $d_{Nj}^+$, $d_{Nj}^+(\delta)$ and in $d_{Nj}^-$, $d_{Nj}^-(\delta)$, respectively. Then (2.1) and (2.2) together imply that

$d_{Nj}^+(\delta) = d_{Nj}^+ + \sum_{i \in \Omega_{Nj}^+} (c_{Bi}^{(s_i)} - c_{Bi}^{(s_i-1)}) |w_{Nj:Bi}|,$
$d_{Nj}^-(\delta) = d_{Nj}^- - \sum_{i \in \Omega_{Nj}^-} (c_{Bi}^{(s_i)} - c_{Bi}^{(s_i-1)}) |w_{Nj:Bi}|.$   (2.3)

Since convexity requires $c_{Bi}^{(s_i)} \ge c_{Bi}^{(s_i-1)}$, these relationships confirm that $d_{Nj}^+(\delta) \ge d_{Nj}^+$ and $d_{Nj}^-(\delta) \le d_{Nj}^-$ for all $j$.

The quantities $d_{Nj}^+(\delta)$ and $d_{Nj}^-(\delta)$, for sufficiently small positive $\delta$, are properly regarded as the true reduced costs of $x_{Nj}$, since they describe the actual change in the objective value when $x_{Nj}$ is pushed up or down. The quantities $d_{Nj}^+$ and $d_{Nj}^-$ are nominal reduced costs which, as shown above, can only overstate the true reduced costs. (Even so, at most one of $d_{Nj}^+$ and $d_{Nj}^-$ can be favorable, since $d_{Nj}^+ - d_{Nj}^- = c_{Nj}^{+\{1\}} - c_{Nj}^{-\{1\}} \ge 0$ just as in the nondegenerate case.)
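The distinction between nominal reduced costs (2.1) and true reduced costs (2.2) can be checked numerically. In this sketch (all data hypothetical), one degenerate basic variable has left slope 1 and right slope 3, and the chosen slope is the left one, so the nominal $d^+$ understates the true cost of pushing the nonbasic variable up:

```python
def reduced_costs(cN_plus1, cN_minus1, w, c_B, cB_plus, cB_minus):
    """Nominal reduced costs (2.1) and true reduced costs (2.2) for one
    nonbasic variable.  w is the vector w_{Nj:B}; c_B holds the slopes
    chosen at the current basis, while cB_plus[i] / cB_minus[i] are the
    slopes of [c/gamma]_Bi to the right / left of the basic value."""
    base = sum(cb * wi for cb, wi in zip(c_B, w))
    d_plus, d_minus = cN_plus1 - base, cN_minus1 - base
    true_plus = cN_plus1 - sum(
        (cB_plus[i] if w[i] < 0 else cB_minus[i]) * w[i]
        for i in range(len(w)) if w[i] != 0)
    true_minus = cN_minus1 - sum(
        (cB_minus[i] if w[i] < 0 else cB_plus[i]) * w[i]
        for i in range(len(w)) if w[i] != 0)
    return d_plus, d_minus, true_plus, true_minus

# d+ = 3 nominally but 5 truly (overstated favorability); d- is unaffected
# here because the push decreases the degenerate variable's neighbor slope.
assert reduced_costs(2.0, 1.0, [-1.0], [1.0], [3.0], [1.0]) == (3.0, 2.0, 5.0, 2.0)
```

The example is consistent with (2.3): the true $d^+(\delta)$ exceeds the nominal $d^+$ by $(c^{(s_i)} - c^{(s_i-1)})|w| = (3-1)\cdot 1 = 2$.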


Because the nominal reduced costs can only overstate the true reduced costs, the signs of the reduced costs at a given basis must be related in one of the following three ways:

Nominal reduced costs | True reduced costs | Implication
(a) All $d_{Nj}^+ \ge 0$, $d_{Nj}^- \le 0$. | All $d_{Nj}^+(\delta) \ge 0$, $d_{Nj}^-(\delta) \le 0$. | Basis is optimal. Stop.
(b) Some $d_{Np}^+ < 0$ or $d_{Np}^- > 0$. | All $d_{Nj}^+(\delta) \ge 0$, $d_{Nj}^-(\delta) \le 0$. | Basis may not be optimal. Must take degenerate iteration.
(c) Some $d_{Np}^+ < 0$ or $d_{Np}^- > 0$. | Some $d_{Np}^+(\delta) < 0$ or $d_{Np}^-(\delta) > 0$. | Basis is not optimal. Can take nondegenerate iteration.

The different roles of the nominal and true reduced costs are clear. Optimality must be confirmed by the signs of $d_{Nj}^+$ and $d_{Nj}^-$, while the possibility of a nondegenerate iteration depends on the signs of $d_{Np}^+(\delta)$ and $d_{Np}^-(\delta)$.

When the P-L simplex algorithm is modified to accommodate degeneracy as described earlier in this section, the first part of an iteration computes only the nominal reduced costs, which suffice to identify case (a) above. The remainder of an iteration effectively determines the sign of the true reduced cost for just the entering variable $x_{Np}$. If a $\theta$ is found such that $d_{Np}^+(\theta) < 0$ or $d_{Np}^-(\theta) > 0$, then case (c) is identified, and a nondegenerate iteration is taken. Otherwise cases (b) and (c) cannot be distinguished; indeed, (b) can be confirmed only by checking all of the true reduced costs.

Selection of an entering variable under degeneracy

The preceding analysis suggests that the P-L simplex algorithm might usefully employ the true as well as the nominal reduced costs in selecting an entering variable. If step (2) does not detect optimality, then step (3) may look for an $x_{Np}$ that has a favorable true reduced cost; if no such variable can be found, then $x_{Np}$ may instead be chosen to have a favorable nominal reduced cost:

(2) Test for optimality: If the nonbasic variables satisfy

$d_{Nj}^+ \ge 0$ and $d_{Nj}^- \le 0$

for all $j$, then the basic solution is optimal. Stop.

(3) Select an entering variable: If the nonbasic variables satisfy

$d_{Nj}^+(\delta) \ge 0$ and $d_{Nj}^-(\delta) \le 0$

for all $j$, then begin a degenerate iteration by choosing a variable $x_{Np}$ such that

$d_{Np}^+ < 0$ or $d_{Np}^- > 0$;

otherwise, begin a nondegenerate iteration by choosing a variable $x_{Np}$ such that

$d_{Np}^+(\delta) < 0$ or $d_{Np}^-(\delta) > 0$.

This procedure distinguishes cases (b) and (c) above, and has the advantage that it leads to a nondegenerate iteration whenever possible.
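The modified steps (2)-(3) amount to a simple two-pass selection rule, which can be sketched as follows (the tuple encoding of reduced-cost pairs and all numbers are illustrative assumptions, not the paper's data structures):

```python
def choose_entering(nominal, true):
    """Entering-variable rule from modified step (3): prefer a variable
    with a favorable true reduced cost (nondegenerate iteration); fall
    back to a favorable nominal reduced cost (degenerate iteration).
    Each list holds one (d_plus, d_minus) pair per nonbasic variable."""
    favorable = lambda dp, dm: dp < 0 or dm > 0
    for p, (dp, dm) in enumerate(true):
        if favorable(dp, dm):
            return p, "nondegenerate"
    for p, (dp, dm) in enumerate(nominal):
        if favorable(dp, dm):
            return p, "degenerate"
    return None, "optimal"

nominal = [(-1.0, -2.0), (0.5, -1.0)]   # variable 0 looks favorable,
true    = [( 0.0, -2.0), (0.5, -1.0)]   # but only a degenerate step exists
assert choose_entering(nominal, true) == (0, "degenerate")
```

The "optimal" outcome corresponds to case (a): nominal unfavorability alone certifies optimality, since the nominal costs can only overstate the true ones.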

From a computational standpoint, however, the use of true reduced costs in step (3) is problematical. Since an iteration must examine nominal reduced costs in any case, to test for optimality in step (2), the determination of true reduced costs can only add to an iteration's inherent complexity. In particular, the algorithm cannot compute the true reduced cost of $x_{Nj}$ until it knows the sign of $w_{Nj:Bi}$ for each degenerate $x_{Bi}$.

One possible arrangement is to compute all reduced costs from the vectors $w_{Nj:B}$, using some combination of formulas (2.1), (2.2) and (2.3). Then the vector $\bar\pi$ of step (1) is not required, and the vector $y_B$ of step (4) is $\pm w_{Np:B}$ as previously noted. To determine the necessary vectors $w_{Nj:B}$, an iteration could solve $Bw_{Nj:B} = a_{Nj}$ for each $x_{Nj}$ whose reduced costs were to be computed; but the solution of even a few such systems would add significantly to the cost of an iteration. If instead all vectors $w_{Nj:B}$ were updated at each iteration to reflect the change of basis, then the algorithm would perform the equivalent of a simplex "tableau" update [5, 7], whose drawbacks of inefficiency and instability are well known.

An alternative is to calculate $\bar\pi$ and the nominal reduced costs as in Section 1's statement of the algorithm, and to calculate true reduced costs from the nominal ones by formula (2.3). Then only the values $w_{Nj:Bi}$ corresponding to degenerate variables $x_{Bi}$ need be computed. Writing $e_{Bi}$ for the unit vector such that $Be_{Bi} = a_{Bi}$, it follows that

$w_{Nj:Bi} = \sigma_{Bi} a_{Nj}$ where $\sigma_{Bi} B = e_{Bi}$.

Thus it suffices to determine $\sigma_{Bi}$ for each degenerate $x_{Bi}$. To avoid solving $\sigma_{Bi} B = e_{Bi}$ at each iteration, $\sigma_{Bi}$ may be updated by

$\sigma_{Bq} \leftarrow \sigma_{Bq} / w_{Np:Bq},$
$\sigma_{Bi} \leftarrow \sigma_{Bi} - w_{Np:Bi} \sigma_{Bq}$ for all $i \ne q$.

Thus the principal computations are the inner products $\sigma_{Bi} a_{Nj}$ for each $i$ and $j$ such that $x_{Bi}$ is degenerate and $d_{Nj}^+ < 0$ or $d_{Nj}^- > 0$. In addition, $\sigma_{Bi} B = e_{Bi}$ must be solved for each newly degenerate $x_{Bi}$.
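This $\sigma$ update mirrors a standard product-form pivot: the row for the leaving position is scaled by the pivot element, then subtracted from the other stored rows. The sketch below assumes (as one reading of the update order) that the scaled row $\sigma_{Bq}$ is applied to the others; the dict container reflects that rows are kept only for the positions of interest:

```python
def update_sigma(sigma, w, q):
    """Update stored rows sigma_Bi of the basis inverse (sigma_Bi B = e_Bi)
    after column a_Bq of B is replaced by a_Np, where w = w_{Np:B} solves
    B w = a_Np.  The leaving row is scaled by the pivot w[q], then
    subtracted, weighted by w[i], from each other stored row."""
    new_q = [s / w[q] for s in sigma[q]]
    for i in sigma:
        if i != q:
            sigma[i] = [s - w[i] * nq for s, nq in zip(sigma[i], new_q)]
    sigma[q] = new_q
    return sigma

# Start from B = I, so the sigma_Bi are unit rows; replace column 0 of B
# by a_Np = (2, 1): the updated rows are the rows of the new inverse.
sigma = {0: [1.0, 0.0], 1: [0.0, 1.0]}
assert update_sigma(sigma, [2.0, 1.0], 0) == {0: [0.5, 0.0], 1: [-0.5, 1.0]}
```

One can verify directly that the resulting rows satisfy $\sigma_{Bi} B = e_{Bi}$ for the new basis matrix $B = [(2,1)^T, (0,1)^T]$.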

These analyses suggest that the P-L simplex algorithm cannot compute many true reduced costs if it is to retain the essential computational features of the iteration presented in Section 1 (and of comparable linear simplex iterations). Instead, the algorithm must choose some $c_B$ and rely on the nominal reduced costs. Different choices of $c_B$ necessarily yield different values of $d_{Nj}^+$ and $d_{Nj}^-$, however, which overstate $d_{Nj}^+(\delta)$ and $d_{Nj}^-(\delta)$ by differing amounts. Hence certain strategies for choosing $c_B$ might affect the algorithm's performance.


In formula (2.3), the discrepancies between the nominal and true reduced costs are expressed as sums over the sets $\Omega_{Nj}^+$ and $\Omega_{Nj}^-$. For each $x_{Bi} = \gamma_{Bi}^{(s_i)}$ such that $w_{Nj:Bi} \ne 0$, choosing $c_{Bi} = c_{Bi}^{(s_i-1)}$ places $i$ into exactly one of $\Omega_{Nj}^+$ or $\Omega_{Nj}^-$, and choosing $c_{Bi} = c_{Bi}^{(s_i)}$ places $i$ into the other. Thus it is possible to choose $c_B$ to make either $\Omega_{Nj}^+$ or $\Omega_{Nj}^-$ empty, and hence to make either $d_{Nj}^+ = d_{Nj}^+(\delta)$ or $d_{Nj}^- = d_{Nj}^-(\delta)$. Unfortunately, different choices of $c_B$ are generally necessary to achieve these equalities for different $j$. The most one can hope is that some choice of $c_B$ will tend to make $\Omega_{Nj}^+$ small for those $x_{Nj}$ such that $d_{Nj}^+ < 0$, and $\Omega_{Nj}^-$ small for those such that $d_{Nj}^- > 0$.

Such a choice cannot be made efficiently in any precise sense, since membership in $\Omega_{Nj}^+$ or $\Omega_{Nj}^-$ also depends on the unknown signs of the values $w_{Nj:Bi}$. Nevertheless, there may be some merit in a heuristic choice. Intuitively, $\Omega_{Nj}^+$ and $\Omega_{Nj}^-$ represent "wrong guesses" about the behavior of the basic variables: they contain the indices $i$ for which $x_{Bi}$ will increase although $c_{Bi}$ has been taken as the slope to the left of $\bar x_{Bi}$, or for which $x_{Bi}$ will decrease although $c_{Bi}$ has been taken as the slope to the right. Thus it might help to choose $c_{Bi}$ as $c_{Bi}^{(s_i)}$ if $x_{Bi}$ is expected to increase at the next iteration, or as $c_{Bi}^{(s_i-1)}$ if $x_{Bi}$ is expected to decrease. Several heuristics for predicting the expected behavior of $x_{Bi}$ are described in Part III [9] of this paper.

A final possibility is compromise. If the expected movement of $x_{Bi}$ cannot be usefully predicted, $c_{Bi}$ may be chosen as any value between $c_{Bi}^-$ and $c_{Bi}^+$, such as their average if both are finite. An extension of the above analysis shows that such a choice contributes $(c_{Bi} - c_{Bi}^-)|w_{Nj:Bi}|$ to the overstatement of one of the nominal reduced costs, and $(c_{Bi}^+ - c_{Bi})|w_{Nj:Bi}|$ to the overstatement of the other. The proof of the optimality conditions also extends to nominal reduced costs determined in this way.

3. Prevention of degenerate cycling

It remains to show that the piecewise-linear simplex algorithm can always reach an optimal solution, even though degenerate iterations are unavoidable. This can be done, as in the linear case, by demonstrating a selection rule for the entering and leaving variables that prevents the algorithm from repeatedly visiting a cycle of degenerate bases.

This section examines anti-cycling selection rules of two kinds: those based on perturbation (or equivalent lexicographic tie-breaking) and those based on certain combinatorial analyses. Either kind of rule is sufficient to prevent cycling, but both are of interest for the insight they offer into the algorithm's behavior.

The P-L anti-cycling rules predictably resemble their linear counterparts, but assume some of the flexibility and generality of the P-L simplex algorithm. An essential aspect of these rules is their unambiguous specification of the basic slopes $c_B$ at degenerate bases. If an arbitrary choice of $c_B$ were allowed at each iteration, then the algorithm could be made to cycle between even a pair of bases.


R. Fourer / A simplex method for piecewise-linear programming 293

Cycling can also be prevented by making certain degenerate basis changes according to supplementary criteria, which do not employ the usual reduced costs in selecting the entering variable or the usual ratio test in selecting the leaving variable. The related methods of Balinski and Gomory [1] and of Graves [11] were devised for linear programs, but rely on properties that can be extended to the piecewise-linear case. Rockafellar [16] derives a very general method that can be specialized to P-LPs. Since these methods go beyond the purely primal simplex steps that are the subject of this paper, they are more properly the subject of a future study.

Perturbational and lexicographic selection rules

Since the constraints of a piecewise-linear program have the same form as those of a linear program, they can be perturbed in the familiar way, and to much the same effect. Thus the presentation in this section assumes a knowledge of the usual perturbational arguments [5, 7] and outlines the extension of these arguments to the P-L case.

Given m linearly independent m-vectors v_1, ..., v_m, an ε-perturbed P-LP can be defined by

(P_ε)  Minimize [c/γ]x

       subject to Ax = b + v_1 ε + v_2 ε² + ... + v_m ε^m.

The original P-LP is (P_0). Any basis matrix B and nonbasic breakpoint vector γ_N define a basic solution x̄(ε) for (P_ε) that is perturbed from the basic solution x̄ for (P_0). A basis is said to be ε-feasible if x̄(ε) is feasible for all sufficiently small nonnegative ε; ε-feasibility implies feasibility (taking ε = 0) but the converse need not hold if x̄ is degenerate. By a suitable perturbation of the P-LP, however, any feasible basis can be made ε-feasible and, in a certain sense, ε-nondegenerate:

Property 1. Given any feasible basis for (P_0), linearly independent vectors v_1, ..., v_m may be chosen so that it is ε-feasible.

Property 2. For each basic variable x_Bi in an ε-feasible basic solution, there exists an index s_i such that γ_Bi^{(s_i−1)} < x̄_Bi(ε) < γ_Bi^{(s_i)} for all sufficiently small positive ε.

Proof. Property 1 is elementary even in the P-L case. For Property 2, any basic value may be expressed as a polynomial function of ε,

x̄_Bi(ε) = x̄_Bi + (σ_i v_1)ε + (σ_i v_2)ε² + ... + (σ_i v_m)ε^m,

where σ_i B = e_i (the unit vector such that B e_i = a_Bi). Because the vectors v_1, ..., v_m are independent, at least one of the coefficients σ_i v_l ≠ 0. Hence x̄_Bi(ε) can equal γ_Bi^{(s_i−1)} or γ_Bi^{(s_i)} at only a finite number of ε values, and the Property follows. □


Suppose now that some ε-feasible basis is at hand for selected v_1, ..., v_m. The key property is the ability of the algorithm to move to a strictly better ε-feasible basis:

Property 3. Given any ε-feasible basis, the P-L simplex algorithm selects entering and leaving variables to produce a new basis that is also ε-feasible and that has a lower objective value for all sufficiently small positive ε.

Proof. Consider how the P-L simplex algorithm will behave when applied to any (P_ε). Property 2 implies that the same vector c_B of basic slopes may be associated with x̄(ε) for all small enough ε ≥ 0. Since steps (1) through (5) of the algorithm are otherwise unaffected by the perturbation of b, they must be valid for all sufficiently small nonnegative ε.

In step (6), the "ratios" of interest have the form

(x̄_Bi(ε) − γ_Bi^{+{r_i}})/y_Bi for y_Bi < 0,

(x̄_Bi(ε) − γ_Bi^{-{r_i}})/y_Bi for y_Bi > 0,

and −(γ_Np − γ_Np^{+{r}}) or (γ_Np − γ_Np^{-{r}}). Extending the proof of Property 2, no two of these ratios can be equal except at a finite number of ε values, and hence the ratios lie in the same order for all sufficiently small positive ε. Then it is not hard to show that any selection of leaving variable is valid for all sufficiently small ε ≥ 0. Finally, steps (7) and (8) are the same for all such ε.

Since the selection of entering and leaving variables is valid for all small enough ε ≥ 0, the new basis is necessarily ε-feasible. Furthermore, since the old basic solution was ε-nondegenerate, the new basic solution must yield a lower objective value for all small enough ε > 0. □

The P-L simplex algorithm can thus be guaranteed to optimize the original P-LP when applied to a suitably perturbed variant:

Theorem 1. Given a basic feasible solution for (P_0), ε and v_1, ..., v_m can be chosen so that the P-L simplex algorithm applied to (P_ε) terminates after finitely many iterations with an optimal basis (or an indication of unboundedness) for (P_0).

Proof. Follows from Properties 1 and 3 by the same arguments used in the linear case. □

Also as in the linear case, the behavior of the simplex algorithm for sufficiently small ε can be determined without knowing the actual value of ε. This observation leads to a lexicographic variant, outlined below.

Let the coefficients of the polynomial x̄_Bi(ε) be represented as a vector,

X_Bi = (x̄_Bi, σ_i v_1, σ_i v_2, ..., σ_i v_m),


and let the distance from this vector to a breakpoint γ_Bi^{(h)} be

X_Bi − γ_Bi^{(h)} = (x̄_Bi − γ_Bi^{(h)}, σ_i v_1, σ_i v_2, ..., σ_i v_m).

As previously remarked, the independence of v_1, ..., v_m insures that this vector is not all-zero. Either its first nonzero element is positive (X_Bi ≻ γ_Bi^{(h)}, in the usual notation of lexicographic ordering) or its first nonzero element is negative (X_Bi ≺ γ_Bi^{(h)}).

Using this notation, the proper slope c_Bi for a degenerate x̄_Bi = γ_Bi^{(s_i)} in step (1) is given by the following rule:

c_Bi = c_Bi^{(s_i−1)} if X_Bi ≺ γ_Bi^{(s_i)},

c_Bi = c_Bi^{(s_i)} if X_Bi ≻ γ_Bi^{(s_i)}.

This represents a departure from the linear case, in which lexicographic rules only come into play after the entering variable has been selected.

Once c_B is chosen, steps (1)-(5) may be carried out as before. In step (6), however, all of the ratios become vectors, with X_Bi substituted for x̄_Bi; as an example,

(X_Bi − γ_Bi^{-{r_i}})/y_Bi = ((x̄_Bi − γ_Bi^{-{r_i}})/y_Bi, σ_i v_1/y_Bi, ..., σ_i v_m/y_Bi).

Scalar "ratios" are regarded as vectors by extending them with zeroes; for instance, γ_Np − γ_Np^{-{r}} is treated as (γ_Np − γ_Np^{-{r}}, 0, ..., 0). A leaving variable is then chosen as follows:
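Because the ordering of such polynomials in ε agrees, for all sufficiently small positive ε, with the lexicographic ordering of their coefficient vectors, the vector "ratios" can be compared with ordinary tuple comparison. A minimal sketch (illustrative names, not the paper's code):

```python
# Each "ratio" is a coefficient vector (r0, r1, ..., rm) standing for the
# polynomial r0 + r1*eps + ... + rm*eps^m.  For all sufficiently small
# eps > 0, these polynomials are ordered exactly as their coefficient
# vectors are ordered lexicographically, which is how Python orders tuples.

def lex_min_index(ratios):
    """Index of the lexicographically smallest ratio vector."""
    return min(range(len(ratios)), key=lambda i: ratios[i])

# A scalar ratio is extended with zeroes, as in the text:
scalar = (0.5, 0.0, 0.0)
# Two ratios that tie with it in the unperturbed (first) component:
r1 = (0.5, -1.0, 2.0)
r2 = (0.5, -1.0, 3.0)
print(lex_min_index([scalar, r1, r2]))  # -> 1: r1 precedes both r2 and scalar
```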

(6) Select a leaving variable: Choose any variable x_Bq and breakpoint γ_Bq^{+{r_q}} or γ_Bq^{-{r_q}} such that

(a) (X_Bq − γ_Bq^{+{r_q}})/y_Bq = Θ and y_Bq < 0, or

(b) (X_Bq − γ_Bq^{-{r_q}})/y_Bq = Θ and y_Bq > 0;

or choose x_Np and any breakpoint γ_Np^{+{r}} or γ_Np^{-{r}} such that

(c) −(γ_Np − γ_Np^{+{r}}) = Θ and d_Np^+ < 0, or

(d) (γ_Np − γ_Np^{-{r}}) = Θ and d_Np^- > 0;

provided Θ is lexicographically small enough that d_Np^+ < 0 and

d_Np^+(Θ) = c_Np^{+{r}} − Σ_{y_Bi<0} c_Bi^{+{r_i−1}} y_Bi − Σ_{y_Bi>0} c_Bi^{-{r_i−1}} y_Bi < 0,

or Θ is lexicographically small enough that d_Np^- > 0 and

d_Np^-(Θ) = c_Np^{-{r}} + Σ_{y_Bi<0} c_Bi^{+{r_i−1}} y_Bi + Σ_{y_Bi>0} c_Bi^{-{r_i−1}} y_Bi > 0;

where the indices r_i (for i ≠ q) and r satisfy

(X_Bi − γ_Bi^{+{r_i−1}})/y_Bi ≼ Θ < (X_Bi − γ_Bi^{+{r_i}})/y_Bi   for y_Bi < 0,

(X_Bi − γ_Bi^{-{r_i−1}})/y_Bi ≼ Θ < (X_Bi − γ_Bi^{-{r_i}})/y_Bi   for y_Bi > 0,

−(γ_Np − γ_Np^{+{r−1}}) ≼ Θ < −(γ_Np − γ_Np^{+{r}})   if d_Np^+ < 0,

(γ_Np − γ_Np^{-{r−1}}) ≼ Θ < (γ_Np − γ_Np^{-{r}})   if d_Np^- > 0.


Steps (7) and (8) can then proceed as before, taking θ as the first element of Θ.

Combinatorial selection rules

The combinatorially motivated anti-cycling rules of Bland [3] also extend to piecewise-linear programming. Following some preliminary development of ideas below, a piecewise-linear equivalent of Rule I in [3] is stated and proved in detail, and the features of a P-L Rule II are outlined.

Preliminaries. Imagine that, prior to the start of some iteration, all slopes of [c/γ]_k are reduced by v a_k, for k = 1, ..., n and an arbitrary vector v. The new objective can be expressed as

[c̄/γ]x = [c/γ]x − (vA)x.

Over all x that satisfy Ax = b, the two objectives differ only by a constant vb, and hence have the same minimizers. Indeed, the change has little effect on the quantities c_B, π, d_Nj^+ and d_Nj^- that are computed in steps (1)-(3). Since [c̄/γ] has the same breakpoints as [c/γ], the basic slopes can become c̄_B = c_B − vB. Then π̄B = c̄_B implies π̄ = π − v. The reduced costs are

d̄_Nj^+ = c̄_Nj^{+{1}} − π̄ a_Nj = c_Nj^{+{1}} − v a_Nj − (π − v) a_Nj = c_Nj^{+{1}} − π a_Nj = d_Nj^+,

and similarly for d̄_Nj^-: they are all unchanged.

Of particular interest is the special case in which v is taken to be π itself. In the resulting canonical objective [c̄/γ], π a_Bi = c_Bi is subtracted from every slope of [c/γ]_Bi, and π a_Nj is subtracted from every slope of [c/γ]_Nj. Thus [c̄/γ] has slopes as follows:

c̄_Bi = 0,  (3.1)

c̄_Bi^{+{h}} = c_Bi^{+{h}} − c_Bi > 0,  (3.2)

c̄_Bi^{-{h}} = c_Bi^{-{h}} − c_Bi < 0,  (3.3)

c̄_Nj^{+{1}} = d_Nj^+,  (3.4)

c̄_Nj^{+{h}} = d_Nj^+ + (c_Nj^{+{h}} − c_Nj^{+{1}}) > d_Nj^+,  (3.5)

c̄_Nj^{-{1}} = d_Nj^-,  (3.6)

c̄_Nj^{-{h}} = d_Nj^- + (c_Nj^{-{h}} − c_Nj^{-{1}}) < d_Nj^-.  (3.7)

From (3.1)-(3.3), the slopes of [c̄/γ]_Bi are nonpositive at all points to the left of x̄_Bi and are nonnegative at all points to the right. Similarly, from (3.4)-(3.7), if x_Nj is ineligible to enter the basis (because d_Nj^+ ≥ 0 and d_Nj^- ≤ 0) then the slopes of [c̄/γ]_Nj are nonpositive at all points to the left of x̄_Nj and are nonnegative at all points to the right.
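The canonical transformation just described can be sketched as follows (an illustrative fragment, with the dual vector pi and the column a_k supplied by the caller; not code from the paper):

```python
import numpy as np

def canonical_slopes(slopes, a_k, pi):
    """Subtract pi @ a_k from every slope of the piecewise-linear term for
    variable k, as in the construction of the canonical objective."""
    shift = float(np.dot(pi, a_k))
    return [s - shift for s in slopes]

# For a basic variable whose chosen slope equals pi @ a_Bi, the canonical
# slopes are negative to the left, zero at the chosen piece, and positive
# to the right, in the pattern of (3.1)-(3.3).
pi = np.array([1.0, 0.0])
a_Bi = np.array([2.0, 1.0])          # pi @ a_Bi = 2.0
slopes = [-1.0, 0.5, 2.0, 3.0, 5.0]  # the chosen slope 2.0 is in the middle
print(canonical_slopes(slopes, a_Bi, pi))  # -> [-3.0, -1.5, 0.0, 1.0, 3.0]
```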


Finally, consider a series of degenerate iterations; let x_Nl be the variable that enters the basis at the first iteration of the series, and let [c̄/γ] be the canonical objective relative to the π computed at the first iteration. Since the degenerate iterations leave x̄ unchanged, the above observations about the slopes to the left and the right of x̄ remain true.

Moreover, if x_Bi has been basic throughout the degenerate series and c_Bi has not been changed, then (3.1) says that c̄_Bi remains zero. Similarly, if x_Nl entered with d_Nl^+ < 0 and if c_Nl was fixed at c_Nl^{+{1}}, then (3.4) implies that c̄_Bl remains negative at subsequent iterations. If x_Nl entered with d_Nl^- > 0 and if c_Nl was fixed at c_Nl^{-{1}}, then (3.6) implies that c̄_Bl remains positive.

Rule I. A P-L extension of Bland's first rule in [3] can be stated as follows. The entering variable x_Np must be the first one available:

p = min{j: d_Nj^+ < 0 or d_Nj^- > 0}.  (3.8)

Thus the nominal reduced cost for x_Np must be favorable:

d_Np^+ = c_Np^{+{1}} − Σ_i c_Bi y_Bi < 0  (B y_B = a_Np), or
                                                                (3.9)
d_Np^- = c_Np^{-{1}} + Σ_i c_Bi y_Bi > 0  (B y_B = −a_Np).

If the true reduced cost d_Np^+(δ) or d_Np^-(δ) is also favorable, a leaving variable may be chosen in the usual way. Otherwise, writing c_Bi^-, c_Bi^+ for the slopes of [c/γ]_Bi to the left and right of x̄_Bi as in Section 2,

d_Np^+(δ) = c_Np^{+{1}} − Σ_{y_Bi<0} c_Bi^+ y_Bi − Σ_{y_Bi>0} c_Bi^- y_Bi ≥ 0, or
                                                                (3.10)
d_Np^-(δ) = c_Np^{-{1}} + Σ_{y_Bi<0} c_Bi^+ y_Bi + Σ_{y_Bi>0} c_Bi^- y_Bi ≤ 0.

In this situation, the rule requires that a degenerate leaving variable x_Bq be chosen "early enough" in the following sense:

either c_Bq = c_Bq^- and y_Bq < 0, or c_Bq = c_Bq^+ and y_Bq > 0; and  (3.11)

d_Np:Bq^+(δ) = c_Np^{+{1}} − Σ_{i<q, y_Bi<0} c_Bi^+ y_Bi − Σ_{i<q, y_Bi>0} c_Bi^- y_Bi − Σ_{i≥q} c_Bi y_Bi ≤ 0, or

d_Np:Bq^-(δ) = c_Np^{-{1}} + Σ_{i<q, y_Bi<0} c_Bi^+ y_Bi + Σ_{i<q, y_Bi>0} c_Bi^- y_Bi + Σ_{i≥q} c_Bi y_Bi ≥ 0.

Steps (7) and (8) then proceed as before, but updating c_B as follows:

c_Bi unchanged for i > q,

c_Np = c_Np^{+{1}} if d_Np^+ < 0,  (3.12)

c_Np = c_Np^{-{1}} if d_Np^- > 0.

For i < q, the algorithm may set c_Bi as before.


The nature of the leaving-variable part of the rule may be seen more clearly by defining the set of indices of "wrong" slopes in c_B:

Ω = {i: x̄_Bi = γ_Bi^{(s_i)}, and either c_Bi = c_Bi^- but y_Bi < 0, or c_Bi = c_Bi^+ but y_Bi > 0}.

A comparison of terms in (3.9) and (3.10) shows that both can hold only if Ω is nonempty. Then, comparing (3.9) and (3.11), a degenerate variable x_Bq is seen to be early enough provided that

q ∈ Ω and

d_Np:Bq^+(δ) = d_Np^+ + Σ_{i<q, i∈Ω} (c_Bi^{(s_i)} − c_Bi^{(s_i−1)})|y_Bi| ≤ 0, or
                                                                (3.13)
d_Np:Bq^-(δ) = d_Np^- − Σ_{i<q, i∈Ω} (c_Bi^{(s_i)} − c_Bi^{(s_i−1)})|y_Bi| ≥ 0.

In words, for x_Bq to leave, its slope must be "wrong" in c_B, but its reduced cost must still be favorable or neutral when all lower-numbered slopes are "corrected". This criterion is always satisfied by the lowest-numbered q ∈ Ω, and may be satisfied by other early enough indices q.
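The "wrong slope" set and the "early enough" test just described can be sketched in Python (hypothetical names; the left/right slopes and the direction vector y are passed in explicitly, and only the d^+ case is shown):

```python
def wrong_slope_set(c_B, c_left, c_right, y):
    """Indices i whose chosen basic slope is "wrong" for the direction y:
    the left slope although x_Bi will increase (y[i] < 0), or the right
    slope although x_Bi will decrease (y[i] > 0)."""
    return {i for i in range(len(y))
            if (c_B[i] == c_left[i] and y[i] < 0)
            or (c_B[i] == c_right[i] and y[i] > 0)}

def early_enough(q, d_plus, c_left, c_right, y, omega):
    """Analogue of the d+ case of (3.13): q lies in omega, and the nominal
    reduced cost stays nonpositive after correcting all lower-numbered
    wrong slopes."""
    corrected = d_plus + sum((c_right[i] - c_left[i]) * abs(y[i])
                             for i in omega if i < q)
    return q in omega and corrected <= 0

c_left, c_right = [0.0, 0.0], [1.0, 2.0]
c_B, y = [0.0, 2.0], [-1.0, 1.0]      # both chosen slopes are "wrong"
omega = wrong_slope_set(c_B, c_left, c_right, y)
print(sorted(omega))                                     # -> [0, 1]
print(early_enough(1, -3.0, c_left, c_right, y, omega))  # -> True
```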

Theorem 2. If the piecewise-linear simplex algorithm selects the entering variable by criterion (3.8) and the leaving variable by criterion (3.11), and updates c_B by formula (3.12), then it must terminate after finitely many iterations.

Proof. Suppose that entering and leaving variables are chosen as hypothesized, but that nevertheless some basis is repeated after a cycle of degenerate iterations. Following the reasoning in [3], let x_l be the highest-numbered variable that both enters and leaves the basis in the course of the cycle. Suppose that x_Nl enters with d_Nl^+ < 0; the case of d_Nl^- > 0 is entirely analogous.

Let [c̄/γ] be the canonical objective relative to the π computed when x_Nl is chosen to enter, and consider the subsequent iteration at which x_Bl leaves and some x_Np enters. Several assertions follow from the fact that x_l is the highest-numbered in the cycle, together with the entering-variable and updating rules and the previous observations about [c̄/γ]. Each assertion pertains to the slopes for some variable or class of variables:

• x_Np: Since the pth variable participates in the cycle, p < l. By the entering-variable rule, when x_Nl entered, the pth variable either was basic or was nonbasic and ineligible to enter. Thus the slope of [c̄/γ]_Np is nonpositive at all points to the left of x̄_Np and nonnegative at all points to the right.

• x_Bi, i < l: By the entering-variable rule, when x_Nl entered, the ith variable either was basic or was nonbasic and ineligible to enter. Thus the slope of [c̄/γ]_Bi is nonpositive at all points to the left of x̄_Bi and nonnegative at all points to the right.

• x_Bl: Because the lth variable is the highest-numbered in the cycle, the updating rule insures that c_Nl was set to c_Nl^{+{1}} when x_Nl entered the basis, and that c_Bl was not changed at subsequent iterations. Thus c̄_Bl remains negative.


• x_Bi, i > l: These variables do not participate in the cycle, and so must remain basic throughout it. By the updating rule, c_Bi is left unchanged at every iteration in the cycle. Thus c̄_Bi remains zero.

In sum, [c̄/γ] has the following properties at the iteration where x_Bl leaves:

Variable          Slope in [c̄/γ] satisfies...

x_Bi, i < l       c̄_Bi^- ≤ 0,  c̄_Bi^+ ≥ 0
x_Bl              c̄_Bl < 0
x_Bi, i > l       c̄_Bi = 0

It remains only to observe that (3.13) is unaffected by substituting [c̄/γ] for [c/γ], since neither the reduced costs d_Np^+, d_Np^- nor the differences c_Bi^{(s_i)} − c_Bi^{(s_i−1)} are changed. Thus, to satisfy the leaving-variable rule, the iteration must have y_Bl > 0 (since c_Bl was fixed at c_Bl^+) and

d̄_Np:Bl^+(δ) = c̄_Np^{+{1}} − Σ_{i<l, y_Bi<0} c̄_Bi^+ y_Bi − Σ_{i<l, y_Bi>0} c̄_Bi^- y_Bi − Σ_{i≥l} c̄_Bi y_Bi ≤ 0, or

d̄_Np:Bl^-(δ) = c̄_Np^{-{1}} + Σ_{i<l, y_Bi<0} c̄_Bi^+ y_Bi + Σ_{i<l, y_Bi>0} c̄_Bi^- y_Bi + Σ_{i≥l} c̄_Bi y_Bi ≥ 0.

Yet every term of d̄_Np:Bl^+(δ) is nonnegative, and specifically −c̄_Bl y_Bl > 0; every term of d̄_Np:Bl^-(δ) is nonpositive, and specifically c̄_Bl y_Bl < 0. Thus the assumption of a degenerate cycle leads to a contradiction. □

Rule II. A piecewise-linear version of Bland's second rule can also be demonstrated by reference to the canonical objective. Suppose that x_Np is selected to enter at the basic solution x̄ defined by B and γ_N, and let S^+ = {j ≠ p: d_Nj^+ < 0} and S^- = {j ≠ p: d_Nj^- > 0} be the sets of other nonbasic variables that could have entered. The key to Rule II is the following P-L extension of Observation 2.1 in [3]:

Property 4. If a series of iterations takes the P-L simplex algorithm from the basic solution x̄ to a basic solution x' such that x'_Nj ≤ γ_Nj for all j ∈ S^+ and x'_Nj ≥ γ_Nj for all j ∈ S^-, then x'_Np ≥ γ_Np (if d_Np^+ < 0) or x'_Np ≤ γ_Np (if d_Np^- > 0).

Proof. Let [c̄/γ] be the canonical objective relative to the vector π computed when x̄ was the current solution and x_Np was chosen to enter. From (3.1)-(3.7), the slopes of the individual terms [c̄/γ]_k in the neighborhood of x̄_k must have the following signs:

                                         Slope of [c̄/γ]_k x_k
Variable                             for x_k < x̄_k    for x_k > x̄_k

x_Nj, j ∈ S^+;  x_Np, if d_Np^+ < 0      ≤ 0              < 0
x_Nj, j ∈ S^-;  x_Np, if d_Np^- > 0      > 0              ≥ 0
x_Nj, j ∉ S^+ ∪ S^-;  x_Bi               ≤ 0              ≥ 0

Suppose that x'_Nj ≤ γ_Nj for all j ∈ S^+ and x'_Nj ≥ γ_Nj for all j ∈ S^-, as hypothesized, but that x'_Np < γ_Np (if d_Np^+ < 0) or x'_Np > γ_Np (if d_Np^- > 0). Since x̄_Nj = γ_Nj, and particularly x̄_Np = γ_Np, the above table implies that [c̄/γ]x' ≥ [c̄/γ]x̄. Since the canonical objective differs from the true objective by only a constant, it follows that [c/γ]x' ≥ [c/γ]x̄, contradicting the assumption that x' follows by a series of iterations from x̄. □

To make use of this property, imagine that when x_Np is brought into the basis, [c/γ] is replaced by a simplified objective [c̃/γ̃]:

c̃_Nj^{+{1}} = +∞ for j ∈ S^+,   c̃_Nj^{-{1}} = −∞ for j ∈ S^-;  (3.14)

γ̃_Np^{-{1}} = −∞ if d_Np^+ < 0,   γ̃_Np^{+{1}} = +∞ if d_Np^- > 0.  (3.15)

The restrictions (3.14) force all subsequent basic feasible solutions x' to satisfy the hypothesis of Property 4. As a result, the parts of the objective where x'_Np < γ_Np (if d_Np^+ < 0) or x'_Np > γ_Np (if d_Np^- > 0) can be ignored, which is exactly the effect of the changes specified by (3.15). Every feasible basis for [c̃/γ̃] is also a feasible basis for [c/γ], and the minimization of [c̃/γ̃] subject to Ax = b is a restricted subproblem of the original P-LP.

Suppose that the P-L simplex algorithm terminates finitely when applied to [c̃/γ̃] starting from x̄. Either it finds an unbounded solution, in which case [c/γ] is unbounded as well, or it finds a basic optimal solution x̂. In the latter case x̂ is also a basic feasible solution for [c/γ], and x_Nj has unfavorable reduced costs relative to [c/γ] for all nonbasic j ∉ S^+ ∪ S^-. Thus a new entering variable x_Np can be chosen, and a new restricted subproblem can be constructed, so that the union of the new S^+ and S^- is strictly smaller than before. After only finitely many such subproblems, S^+ and S^- must be empty and an optimal solution must be at hand.

The same approach may be applied recursively to solve any subproblem. At each level of recursion, the number of finite breakpoints is strictly reduced, so that the number of recursion levels (as well as the number of subproblems at each level) must be finite. Hence, unless unboundedness is detected, the top-level iterations must eventually stop at an optimal basis.


By construction, every basic feasible solution for a subproblem at any level is also a basic feasible solution for the problem at the next higher level. Thus any basic feasible solution for any subproblem must be a basic feasible solution for [c/γ]. In fact a careful analysis can show, as in [3], that every iteration on a subproblem can be interpreted as a valid iteration on the full P-LP. Hence the recursion effectively describes a cycle-avoiding rule for the P-L simplex algorithm.

4. Feasibility

Like other simplex methods, the piecewise-linear simplex algorithm requires a basic feasible solution for a starting point. Fortunately, the existence of such solutions is a reasonable assumption for P-LPs, just as for LPs.

A P-LP can fail to have basic feasible solutions because it has no basic solutions at all. The difficulties of this situation are easily circumvented, as observed in the first half of this section, by implicitly or explicitly dropping certain linearly dependent columns of the constraint matrix.

Once a basic solution has been constructed, the P-L simplex algorithm can find a basic feasible solution, or determine that no feasible solutions of any kind exist, by following either a "two-phase" or a "penalty" approach as described in the second half of this section. Both approaches can be viewed as extensions of common linear-programming techniques that replace or supplement the true objective with a simple piecewise-linear function. Successful application of the penalty approach requires that certain sufficiently large penalty values be chosen; the optimality conditions for P-LPs provide a simple demonstration that the minimum sizes of these penalties are determined by the dual values π at the optimal basis.

Finding basic solutions

If every piecewise-linear term [c/γ]_k has at least one finite breakpoint, then a basic (though possibly infeasible) solution is easily constructed. First, any m linearly independent columns a_Bi of A are chosen to define the basic variables. Then the remaining nonbasic variables are fixed at arbitrary finite breakpoints γ_Nj.

If some P-L terms do not have finite breakpoints, then the corresponding free variables can only be basic. Hence bases can exist, and the above construction can be carried through, if and only if the coefficient columns of the free variables are linearly independent. P-LPs that violate this requirement need not be contrived; as an example, consider the linear ℓ₁ estimation problem defined for an m × n matrix G and m-vector g as follows:

Minimize Σ_{i=1}^{m} |u_i|

subject to Gx + u = g.


The variables u_1, ..., u_m have breakpoints at zero, but x_1, ..., x_n have a slope of zero and no finite breakpoints. Thus basic solutions do not exist if the columns of G are linearly dependent.
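For instance (a sketch using NumPy's rank routine, not part of the paper), the existence test for basic solutions of this ℓ₁ problem reduces to a rank check on G:

```python
import numpy as np

# Sketch: in the l1 estimation problem above, the free variables
# x_1, ..., x_n can only be basic, so basic solutions exist exactly when
# the columns of G are linearly independent.
def has_basic_solutions(G):
    m, n = G.shape
    return np.linalg.matrix_rank(G) == n

G_ok = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
G_bad = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])  # second column = 2 * first
print(has_basic_solutions(G_ok), has_basic_solutions(G_bad))  # -> True False
```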

In keeping with the notation for basic and nonbasic variables, let the free variables be x_Fi, and let c_Fi be the one finite slope defining [c/γ]_Fi. Let F be the matrix of columns a_Fi that correspond to the free variables. The following elementary relationship between F and c_F carries over from linear programming:

Property 5. If a P-LP has a finite optimal value, then there exists a vector π such that πF = c_F.

Proof. If no such π exists, then c_F cannot be in the row space of F. Hence c_F cannot be orthogonal to the nullspace of F: there must exist a vector ȳ_F such that Fȳ_F = 0 but c_F ȳ_F ≠ 0. From the existence of ȳ_F the P-LP is easily seen to be unbounded if it has any feasible solutions at all, contradicting the hypothesis. □

If the columns of F are not linearly independent, a maximal independent subset can be chosen. Let x_FBi denote the variables corresponding to any such subset, and let x_FNj denote the remaining free variables. Thus for any values x̄_FNj that might be chosen, there are unique values x̄_FBi such that Fx̄_F = 0. In these terms, a converse to the above property can be stated as follows:

Property 6. Suppose there does exist π such that πF = c_F, and let x*_FN be any arbitrary values for the nonindependent free variables x_FN. If the P-LP has a finite minimum, then it has an optimal solution in which x_FN = x*_FN. If the P-LP has an unbounded solution (or no feasible solution) then it continues to have an unbounded solution (or no feasible solution) subject to the additional constraints x_FN = x*_FN.

Proof. If there are no feasible solutions, then there can be no feasible solutions with x_FN = x*_FN. Otherwise, let x̄ be any feasible solution to Ax = b, and let ȳ_F be the solution to Fȳ_F = 0 such that

ȳ_FN = x*_FN − x̄_FN.

Extend ȳ_F to a ȳ that solves Aȳ = 0, by letting ȳ_k = 0 for all x_k that are not free variables. Then a new solution to Ax = b is given by

x' = x̄ + ȳ.

Since x' differs from x̄ only in the free variables, x' is also feasible. Also x'_FN = x̄_FN + ȳ_FN = x*_FN, and

cx' = cx̄ + c_F ȳ_F.

Since c_F lies in the row space of F by hypothesis, and since ȳ_F is in the nullspace, c_F ȳ_F = 0, from which it follows that cx' = cx̄. Thus, for every feasible solution x̄, there is an associated feasible solution x', achieving the same objective value, in


which the free variables x_FNj take the values x*_FNj. The assertions of the property are a direct consequence. □

For practical purposes, Gaussian elimination on F can serve both to construct a maximal independent subset of columns a_FBi and to determine whether πF = c_F can be satisfied. If π exists then the variables x_FNj can be fixed at any values (most conveniently, zero) and the resulting contracted P-LP may be solved in place of the original one.
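These two computations can be sketched as follows (rank tests, via NumPy, stand in for the Gaussian elimination mentioned above; names are illustrative):

```python
import numpy as np

def independent_columns(F, tol=1e-9):
    """Greedily select a maximal linearly independent subset of columns of F."""
    cols = []
    for j in range(F.shape[1]):
        if np.linalg.matrix_rank(F[:, cols + [j]], tol=tol) == len(cols) + 1:
            cols.append(j)
    return cols

def pi_exists(F, c_F, tol=1e-9):
    """Is pi F = c_F solvable, i.e. does c_F lie in the row space of F?
    Appending c_F as an extra row must not raise the rank."""
    return (np.linalg.matrix_rank(np.vstack([F, c_F]), tol=tol)
            == np.linalg.matrix_rank(F, tol=tol))

F = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
print(independent_columns(F))                    # -> [0, 2]: column 1 repeats column 0
print(pi_exists(F, np.array([3.0, 6.0, 5.0])))   # -> True: 3*(row 1) + 5*(row 2)
print(pi_exists(F, np.array([0.0, 1.0, 0.0])))   # -> False
```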

Alternatively, each objective term [c/γ]_Fi may be transformed to an equivalent P-L function that has an arbitrary "artificial" breakpoint γ*_Fi surrounded by the equal slopes c_Fi on both sides. To define an initial basic solution, some of the x_Fi may be made nonbasic at their artificial breakpoints. Once any iteration brings x_Fi into the basis, however, its breakpoint may be ignored at all subsequent iterations.

When artificial breakpoints are employed in this way, only an independent subset of the columns a_Fi can ever enter the basis. Hence this approach effectively contracts the P-LP, by fixing free variables at arbitrary values γ*_Fi, although it does not decide in advance which variables to fix. (A specialization of this idea has long been used in implementations of the bounded-variable linear simplex algorithm, to handle variables that have a lower bound of −∞ and an upper bound of +∞; the artificial breakpoint is customarily taken to be zero.)

The vector π determined in step (1) of the P-L simplex algorithm must satisfy π a_Fi = c_Fi for all basic free variables x_Fi. If the optimality conditions in step (2) are satisfied, then also c_Fi ≤ π a_Fi ≤ c_Fi for all x_Fi that are nonbasic at their artificial breakpoints. Thus, in finding a finite optimum, the algorithm automatically determines a solution to the condition πF = c_F of Property 5.

Finding basic feasible solutions

Finding a basic feasible solution may or may not be more difficult than finding just a basic solution. If [c/γ] is finite everywhere then all basic solutions must be feasible. On the other hand, if any finite breakpoint γ_k^{(h)} is adjacent to an infinite slope c_k^{(h−1)} = −∞ or c_k^{(h)} = +∞ then many basic solutions may be infeasible, and no basic feasible solution may be evident.

In the latter case, the piecewise-linear simplex algorithm may itself be used to search for a basic feasible solution from a basic infeasible starting point. A simple


"two-phase" approach defines a new P-L function [f/γ] whose slopes are derived from those of [c/γ] as follows:

c_k^{(h)} = −∞ ⟹ f_k^{(h)} = −1,

c_k^{(h)} = +∞ ⟹ f_k^{(h)} = +1,

c_k^{(h)} finite ⟹ f_k^{(h)} = 0.
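This slope mapping can be sketched directly (an illustrative fragment, not from the paper):

```python
import math

def phase_one_slopes(c_slopes):
    """Map the slopes of a term of [c/gamma] to phase-one slopes:
    -inf -> -1, +inf -> +1, finite -> 0."""
    return [-1.0 if s == -math.inf else 1.0 if s == math.inf else 0.0
            for s in c_slopes]

# A term that is finite only between its two outer breakpoints:
print(phase_one_slopes([-math.inf, 0.5, 2.0, math.inf]))
# -> [-1.0, 0.0, 0.0, 1.0]
```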

Each term [f/γ]_k x_k is finite everywhere. Thus any basic solution to Ax = b can be used by the P-L simplex algorithm as a starting point for the "phase one" P-LP,

Minimize [f/γ]x subject to Ax = b.

Since [f/γ]_k x_k either becomes constant or increases to infinity as x_k → −∞ or x_k → +∞, this problem cannot be unbounded. The algorithm must find a basic optimal solution, and the following result applies:

Property 7. If any optimal solution for [f/γ] is infeasible for [c/γ], then every solution to Ax = b is infeasible for [c/γ].

Proof. Each of the terms [f/γ]_k is minimal precisely on the interval where [c/γ]_k x_k is finite. Hence if x̄ and x̂ are a feasible and an infeasible solution for [c/γ], then x̄ must be optimal for [f/γ], with [f/γ]x̄ < [f/γ]x̂. In other words, the infeasible x̂ cannot be optimal for [f/γ] unless no feasible x̄ exists. □

Property 7 implies that, if there does exist a feasible solution for [c/γ], then the P-L simplex algorithm must stop at a basic optimal solution for [f/γ] that is also basic feasible for [c/γ]. Then a starting point is at hand for the minimization of [c/γ] (in "phase two").

An alternative "penalty" approach replaces the infinite slopes in [c/γ] by large finite ones:

c_k^{(h)} = −∞ ⟹ f_k^{(h)} = −M_k^-,

c_k^{(h)} = +∞ ⟹ f_k^{(h)} = +M_k^+,

c_k^{(h)} finite ⟹ f_k^{(h)} = c_k^{(h)}.

The resulting P-L function [f/γ] is convex for all sufficiently large M_k^- and M_k^+. Again it is everywhere finite, so that the P-L simplex algorithm can minimize it, subject to Ax = b, starting from any initial basic solution.

Property 8. If there exists a feasible solution for [c/γ], and if M_k^+ and M_k^- are chosen sufficiently large for all k, then a basic solution is optimal for [c/γ] if and only if it is optimal for [f/γ].


Proof. Let x̄ be any feasible solution and x̂ an infeasible solution for [c/γ]. If the function [f/γ] is varied by letting M_k^- → ∞ and M_k^+ → ∞ for all k, then [f/γ]x̂ → ∞ while [f/γ]x̄ remains unchanged. Thus x̂ cannot be optimal for [f/γ] if M_k^-, M_k^+ are large enough. Since the number of basic infeasible solutions is finite, M_k^- and M_k^+ can be chosen sufficiently large so that every basic optimal solution for [f/γ] is feasible for [c/γ].

Suppose that M_k^- and M_k^+ are so chosen; then it need only be shown that a basic feasible solution x* for [c/γ] is optimal for [c/γ] if and only if it is optimal for [f/γ]. Imagine applying the P-L simplex algorithm to either [c/γ] or [f/γ] at x*. The basic slopes c_B and f_B may be chosen identically, so that the same π is defined by step (1). The optimality conditions of step (2) may differ, if any nonbasic variables lie at breakpoints adjacent to infinite slopes:

    If...                                 then optimality for            but optimality for
                                          [c/γ] requires...              [c̄/γ] requires...

    c_Nj^(s) = -∞,  γ_Nj = γ_Nj^(s+1)     π a_Nj ≤ c_Nj^(s+1)            -M_Nj^- ≤ π a_Nj ≤ c_Nj^(s+1)

    c_Nj^(t) = +∞,  γ_Nj = γ_Nj^(t)       π a_Nj ≥ c_Nj^(t-1)            +M_Nj^+ ≥ π a_Nj ≥ c_Nj^(t-1)

The optimality conditions are the same, however, for all other nonbasic variables.

Suppose first that x* is optimal for [c/γ]. Thus there is some π such that the optimality conditions for [c/γ] are satisfied. The above comparison shows clearly that the optimality conditions for [c̄/γ] will also be satisfied provided that M_Nj^- and M_Nj^+ are chosen large enough for all nonbasic x_Nj. Since there can be only finitely many basic optimal solutions, M_k^- and M_k^+ can be chosen large enough for all k so that every basic optimal x* for [c/γ] is optimal for [c̄/γ].

Conversely, suppose that x* is not optimal for [c/γ]. Then for any π computed in step (1), some optimality condition for [c/γ] is violated in step (2). However, as the above comparison of conditions makes clear, every optimality condition for [c/γ] is also an optimality condition for [c̄/γ]. Hence x* cannot be optimal for [c̄/γ]. □

A finite minimum for [c/γ] can thus be found, if any exists, by minimizing [c̄/γ] with sufficiently large penalties M_k^- and M_k^+. Clearly, moreover, if there is no finite minimum for [c/γ] then there can be none for [c̄/γ] with any choice of penalties. (If the P-LP has no feasible solutions for [c/γ], however, it must still have feasible solutions for [c̄/γ] no matter how large the penalties. Only the two-phase approach can prove an absence of feasible points.)


A precise characterization of sufficiently large penalties can be given in terms of any optimal basis:

Property 9. If a basic solution is optimal for [c/γ], then it is optimal for [c̄/γ] provided that

    -M_k^- ≤ min{π* a_k, c_k^(s+1)}  and  +M_k^+ ≥ max{π* a_k, c_k^(t-1)},

where π* is the vector of dual values from step (1), c_k^(s+1) is the smallest slope of [c/γ]_k greater than -∞, and c_k^(t-1) is the largest slope less than +∞.

Proof. For a basic variable x_Bi, π* a_Bi equals some slope c_Bi^(h), and the conditions of the property merely ensure convexity of [c̄/γ]_Bi. For a nonbasic variable x_Nj that lies at a breakpoint adjacent to two finite slopes, optimality requires c_Nj^(h-1) ≤ π* a_Nj ≤ c_Nj^(h), so again the conditions just ensure convexity.

For a nonbasic x_Nj that lies adjacent to c_Nj^(s) = -∞, optimality for [c/γ] requires π* a_Nj ≤ c_Nj^(s+1). Thus the conditions of the property imply -M_Nj^- ≤ π* a_Nj ≤ c_Nj^(s+1), which is exactly the condition for optimality of [c̄/γ] as shown in the proof of Property 8. The argument for an x_Nj that lies adjacent to c_Nj^(t) = +∞ is analogous. □

Property 9 can be regarded as a special case of a fundamental result for exact penalty functions [12, pages 387-390]. Although it is customary to think of M_k^- and M_k^+ as large numbers, the conditions of the property require only that they be "large enough"; even negative values may suffice for some of them.
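The bounds of Property 9 are easy to evaluate once π* is known. The following fragment (an illustrative sketch with invented names and data, not code from the paper) computes the smallest penalties that satisfy the property for one term, and shows how a negative M_k^- can indeed suffice:

```python
def sufficient_penalties(pi_star_a, c_first_finite, c_last_finite):
    """Smallest penalties meeting the bounds of Property 9 for one term:
    -M_k^- <= min{pi* a_k, c_k^(s+1)} and +M_k^+ >= max{pi* a_k, c_k^(t-1)},
    where c_first_finite stands for c_k^(s+1) and c_last_finite for c_k^(t-1)."""
    M_minus = -min(pi_star_a, c_first_finite)
    M_plus = max(pi_star_a, c_last_finite)
    return M_minus, M_plus

# With all the relevant quantities positive, even a negative M_k^- suffices:
M_minus, M_plus = sufficient_penalties(pi_star_a=1.0,
                                       c_first_finite=0.5,
                                       c_last_finite=2.0)
assert (M_minus, M_plus) == (-0.5, 2.0)
```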

Since π* cannot be known in advance, the choice of M_k^- and M_k^+ in practice must be based on an estimate of π*, or of the reduced costs π* a_Nj - c_Nj^±(1) = -d_Nj^±(1). If the penalties are chosen too small, the minimum of [c̄/γ] will not be feasible for [c/γ], and the algorithm will have to continue with larger values. As M_k^- → ∞ and M_k^+ → ∞, however, the penalty approach becomes indistinguishable from the two-phase approach. Thus the penalties must be "medium sized" for the penalty approach to be most effective.

Both approaches can be viewed as generalizations of well-known linear programming techniques. Large-scale linear simplex codes have traditionally used certain limited versions of the P-L simplex algorithm to solve the phase-one problem [17]; a more general P-L simplex algorithm for phase one was proposed by Rarick [15]. The P-L penalty approach for linear programming has been investigated by Conn [6] and Bartels [2].

5. Finiteness

The assumption of a finite number of finite breakpoints (and finite slopes) is seldom restrictive. If each slope or breakpoint is regarded as a separate piece of data, then such an assumption serves only to ensure that every piecewise-linear


program has a description of finite length, to which an algorithm can meaningfully be applied. Moreover, in practice the breakpoint sets are often inherently small; many of the applications surveyed in Part III of this paper [9], for example, require no more than three finite breakpoints in any term of the objective function.

Nevertheless, it is reasonable to consider solving P-LPs under weaker conditions. If the breakpoint and slope sequences are generated by a finite algorithm, then a P-LP can have a finite description even though the sequences are infinite. A P-L simplex method may still be applied, provided that it makes use of only finitely many breakpoints and slopes at each iteration.

Generated slopes and breakpoints are most likely to arise in piecewise-linear approximations to convex functions [4, 7, 13]. As a simple example, a convex function f_k(x_k) can be approximated by letting the breakpoints be γ_k^(h) = Δh, for any Δ > 0 and -∞ < h < +∞, and by taking the slopes as

    c_k^(h) = (f_k(γ_k^(h+1)) - f_k(γ_k^(h))) / (γ_k^(h+1) - γ_k^(h)).

For a nonlinear program that has a separable convex objective and linear constraints, this kind of P-L approximation yields a piecewise-linear program. Such a P-LP may be more convenient to solve than the original nonlinear program, even though the P-L simplex algorithm cannot guarantee the rapid convergence characteristic of nonlinear methods that exploit information about the curvature of the objective.
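A uniform-grid version of this approximation can be sketched in a few lines (an illustrative fragment; the function names and the particular choices of f and Δ are invented):

```python
def pl_approximation(f, delta, h_lo, h_hi):
    """Breakpoints gamma^(h) = delta*h and secant slopes
    c^(h) = (f(gamma^(h+1)) - f(gamma^(h))) / (gamma^(h+1) - gamma^(h))
    of a P-L approximation to f, for h_lo <= h <= h_hi."""
    gamma = [delta * h for h in range(h_lo, h_hi + 1)]
    c = [(f(gamma[i + 1]) - f(gamma[i])) / (gamma[i + 1] - gamma[i])
         for i in range(len(gamma) - 1)]
    return gamma, c

# Approximate the convex function f(x) = x**2 on a grid of spacing 0.5.
gamma, c = pl_approximation(lambda x: x * x, 0.5, -4, 4)

# Convexity of f makes the secant slopes nondecreasing, so the
# approximating P-L function is itself convex.
assert all(a <= b for a, b in zip(c, c[1:]))
```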

This section begins by formulating minimal assumptions under which the P-L simplex algorithm is well-defined when applied to infinite slope and breakpoint sequences. The conditions for optimality, unboundedness and finite termination can then be re-examined. Conditions for optimality depend only on the local nature of a P-L function, and are preserved under even very weak assumptions. Conditions for unboundedness are more problematical, because they are also influenced by the greater variety of asymptotic behavior that is possible when infinite numbers of breakpoints are allowed. Finally, finite termination can be guaranteed only under fairly restrictive assumptions about the nature of the infinite breakpoint sequences.

Minimal assumptions

If absolutely any increasing collection of breakpoints and slopes were allowed to define a convex piecewise-linear function, then all points on the real line could be taken as breakpoints, and any convex function might be regarded as piecewise-linear. Even dense countable sets of breakpoints (such as the rationals) could suffice to describe any convex function. A reasonably restrictive and intuitive notion of piecewise linearity thus requires that each P-L term be linear on segments of positive length. Equivalently, the breakpoints must not be dense in any interval.

A somewhat stronger assumption is needed to ensure that the P-L simplex algorithm can be carried out as described in Section 1. Step (2) requires that there be a well-defined slope to the left and to the right of every breakpoint, and subsequent steps also depend on being able to determine successive slopes and breakpoints to the left or right of γ_Np and every x̄_Bi. Thus no breakpoint may be a limit of breakpoints.


Certain other anomalous situations can occur when a subsequence of breakpoints converges to some finite value γ_k^(h*). Under the above assumptions, [c/γ]_k γ_k^(h*) may be finite, provided that γ_k^(h*) is not included in the list of breakpoints. Such an arrangement is best avoided, however, for one or more of the following reasons:

• If [c/γ] has different slopes to the left and to the right of γ_k^(h*), then γ_k^(h*) is properly included as a breakpoint by any intuitive definition of a piecewise-linear function.

• If [c/γ] is finite to the left and to the right of γ_k^(h*), then the sequence of slopes cannot be numbered by consecutive integers, as the statement of the P-L simplex algorithm implicitly assumes.

• If every basic optimal solution has x_k = γ_k^(h*), then the P-L simplex algorithm may not be able to reach an optimum unless γ_k^(h*) is one of the breakpoints.

All of these cases can be ruled out by saying that the function value may not be finite at any limit point of the breakpoints.

All of the above conditions will be assumed in the sequel. Subject to these restrictions, a convex P-L function [c/γ]_k either has a largest finite breakpoint (and largest finite slope), or has increasing sequences of infinitely many breakpoints and slopes. In the latter case, [c/γ]_k x_k must fall into one of the following categories as x_k steps to higher and higher breakpoints:

    lim_{h→+∞} γ_k^(h)    lim_{h→+∞} c_k^(h)    lim_{h→+∞} [c/γ]_k γ_k^(h)

    finite                finite                finite
    finite                +∞                    finite
    finite                +∞                    +∞
    +∞                    negative              -∞
    +∞                    zero                  -∞
    +∞                    zero                  finite
    +∞                    positive              +∞
    +∞                    +∞                    +∞

The corresponding table for -∞ is obtained by reflecting the first two columns: +∞ becomes -∞, and "negative" is exchanged with "positive".

The first two rows of the table describe somewhat counterintuitive behavior. As the breakpoints approach their finite limit, the distance between them tends to zero; although the intervening slopes may become progressively larger, they grow slowly enough that the function value is bounded. To preserve continuity, the P-L function could be defined to have

    [c/γ]_k (lim_{h→+∞} γ_k^(h)) = lim_{h→+∞} [c/γ]_k γ_k^(h);

then lim_{h→+∞} γ_k^(h) would be a feasible point for [c/γ]_k (though in the second case it would not be "regularly feasible" [16, Section 8D]).
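A concrete instance of the second row of the table (an invented example, not one from the paper): take breakpoints γ_k^(h) = 1 - 2^(-h) and slopes c_k^(h) = h for h ≥ 0. The slopes tend to +∞, but each linear piece has length 2^(-(h+1)), so the increments h·2^(-(h+1)) are summable and the function values stay bounded as the breakpoints approach their limit 1:

```python
def value_at_breakpoint(H):
    """[c/gamma]_k evaluated at gamma^(H), for gamma^(h) = 1 - 2**-h and
    c^(h) = h, normalized so that the value at gamma^(0) = 0 is zero."""
    total = 0.0
    for h in range(H):
        gap = 2.0 ** -(h + 1)   # gamma^(h+1) - gamma^(h)
        total += h * gap        # slope h over that interval
    return total

# The slopes h grow without bound, yet the values converge (to 1 here),
# so the breakpoints' limit point 1 would be a point of finite value:
vals = [value_at_breakpoint(H) for H in (10, 20, 40)]
assert vals[0] < vals[1] < vals[2] < 1.0
assert 1.0 - vals[2] < 1e-9
```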


However, since by assumption a P-L function may not take a finite value at a limit of breakpoints, [c/γ]_k must instead be regarded as taking an infinite value at lim_{h→+∞} γ_k^(h). Hence [c/γ]_k is not a closed convex function in the first two cases above. The assumptions do guarantee that it is closed in all of the other cases.

Conditions for optimality and unboundedness

Under the above assumptions, the fundamentally piecewise-linear nature of the functions [c/γ]_k is preserved in any sufficiently small neighborhood. Thus local properties, such as the conditions for feasibility and optimality, remain unchanged. In particular, the conditions for termination with optimality in the P-L simplex algorithm remain valid:

Property 10. If the conditions in step (2) of the P-L simplex algorithm are satisfied, then the current basic solution is optimal, regardless of the finiteness of the slope and breakpoint sets.

If the conditions in step (2) are not satisfied, and if the true reduced cost is d_Np^+(θ) < 0 or d_Np^-(θ) > 0 for the variable x_Np chosen in step (3), then the P-L simplex algorithm achieves a reduction in the objective value.

Proof. If some basic feasible x̄ satisfies the optimality conditions of step (2), then it also satisfies these conditions for the "local" objective [c̃/γ̃] defined by

    γ̃_Bi = x̄_Bi,    c̃_Bi^-(1) = c_Bi^-(1),    c̃_Bi^+(1) = c_Bi^+(1),
    γ̃_Nj = γ_Nj,    c̃_Nj^-(1) = c_Nj^-(1),    c̃_Nj^+(1) = c_Nj^+(1),

where c_k^-(1) and c_k^+(1) denote the slopes of [c/γ]_k immediately to the left and right of the current value, extended to -∞ and +∞ respectively. Since [c̃/γ̃] has only finitely many breakpoints, x̄ must be optimal for [c̃/γ̃] by the proof in [8]. Yet by construction,

    (slope of [c̃/γ̃]_k) ≥ (slope of [c/γ]_k) at all x_k < x̄_k,
    (slope of [c̃/γ̃]_k) ≤ (slope of [c/γ]_k) at all x_k > x̄_k.

As a consequence, for any x, [c/γ]x - [c/γ]x̄ ≥ [c̃/γ̃]x - [c̃/γ̃]x̄ ≥ 0. Hence x̄ is also optimal for [c/γ].

If the conditions of step (2) are not satisfied, then an entering variable x_Np can be chosen in step (3). As x_Np begins to increase or decrease from γ_Np, the change in the objective value depends only on the slopes immediately to the left or right of γ_Np and all x̄_Bi. In particular, the objective value decreases if and only if the effective reduced cost is favorable, regardless of how many breakpoints [c/γ] has in all. □


Unboundedness is much more problematical. Conditions for an unbounded solution depend on the asymptotic, rather than local, properties of [c/γ]. The assumption of a finite breakpoint set in [8] assures that [c/γ]_k x_k either becomes infinite or behaves like a linear function as x_k → +∞ or x_k → -∞. When this assumption is relaxed, a convex P-L function can exhibit any of the other kinds of limiting behavior allowed of convex functions.

Thus the conditions in step (5) remain sufficient for unboundedness in the infinite-breakpoint case, since they imply that the objective decreases along a ray of solutions that crosses no breakpoints. The same conditions are no longer necessary, however, because the objective may also decrease along a ray that crosses infinitely many breakpoints. In such a case the objective value may be truly unbounded below or may have a finite infimum that is nowhere achieved, even though the conditions of step (5) are not satisfied at any basic feasible solution.

Various weaker necessary and sufficient conditions for the infinite-breakpoint case can be formulated in terms of the effective reduced costs and their limits:

Property 11. If [c/γ] is closed, then the conditions for unboundedness can be summarized as follows, where d_Np^+(∞) = lim_{θ→∞} d_Np^+(θ) and d_Np^-(∞) = lim_{θ→∞} d_Np^-(θ).

    Condition: at some               Unboundedness: the
    feasible basis...                P-LP satisfies...               Necessary?   Sufficient?

    d_Np^+(∞) < 0 or                 inf [c/γ]x = -∞                 no           yes
    d_Np^-(∞) > 0

    d_Np^+(θ) < 0 or                 inf [c/γ]x                      yes          yes
    d_Np^-(θ) > 0 for all θ > 0      is nowhere achieved

    d_Np^+(∞) ≤ 0 or                 {x: [c/γ]x ≤ [c/γ]x̄}            yes          yes
    d_Np^-(∞) ≥ 0                    is unbounded for
                                     all feasible x̄

The same conditions hold if [c/γ] is not closed, except that the second condition also fails to be necessary.

Proof. All of these assertions follow directly from established properties of arbitrary convex separable objective functions; see, for example, the analysis by Rockafellar [16, Chapter 11]. □

The asymptotic effective reduced costs can be expressed conveniently as a function of the asymptotic slopes, provided that the breakpoints have no finite limits. Writing sup_h c_k^(h) = c_k^(+∞) and inf_h c_k^(h) = c_k^(-∞),

    d_Np^+(∞) = lim_{θ→∞} d_Np^+(θ)
              = c_Np^(+∞) - Σ_{y_Bi<0} c_Bi^(+∞) y_Bi - Σ_{y_Bi>0} c_Bi^(-∞) y_Bi,


and similarly for d_Np^-(∞). These formulas depend on the identity of the basic variables x_B and the entering variable x_Np, but not on the values x̄_N of the nonbasic variables. Thus there are only finitely many quantities d_Np^+(∞) and d_Np^-(∞) associated with a P-LP even when the number of breakpoints is infinite.

Conditions for termination

Since an infinity of breakpoints permits an infinite number of basic solutions, abandonment of the finiteness assumption allows an infinite number of iterations in the P-L simplex algorithm. The proof of finite termination in [8] remains applicable, however, if only finitely many basic feasible solutions actually exist, or if only finitely many can be visited by the algorithm.

Termination can be assured, for example, if the algorithm visits some x̄ such that [c/γ]x̄ > [c/γ]x' for only finitely many other basic feasible solutions x'. This is the case, in particular, when [c/γ] is a closed function and the solution set is bounded:

Property 12. If [c/γ] is closed and if the set of optimal solutions is nonempty and bounded, then the P-L simplex algorithm must eventually stop.

Proof. Consider first any closed and bounded set X of feasible solutions, so that [c/γ]x < ∞ for any x ∈ X. Let X_k = {x_k: (x_1, ..., x_k, ..., x_n) ∈ X} be the closed and bounded interval of values that x_k takes within X; [c/γ]_k x_k is finite for any x_k ∈ X_k. Under the hypothesis that [c/γ] is closed, and under the assumption that [c/γ] takes an infinite value at any limit of breakpoints, the interval X_k may contain only a finite number of breakpoints. It follows that X must contain only finitely many basic solutions.

For any optimal solution x*, the level set {x: [c/γ]x ≤ [c/γ]x*} is nonempty and bounded by hypothesis. If one nonempty level set of a convex function is bounded, however, then all are bounded.

Thus if the algorithm is started at some basic feasible solution x̄, the level set given by X̄ = {x: [c/γ]x ≤ [c/γ]x̄} is nonempty and bounded. X̄ is also closed, because by hypothesis [c/γ] is closed. Hence by the previous argument X̄ can contain only finitely many basic feasible solutions. Yet the algorithm can only visit basic feasible solutions in X̄ (since it never permits the objective to increase) and it can visit each basic solution at most once. As a result, it must stop after only finitely many iterations. □

Other, more restrictive requirements for termination depend only on the nature of the objective. One result can be stated in terms of the asymptotic behavior of [c/γ]:

Property 13. If [c/γ] is closed and if lim_{x_k→-∞} [c/γ]_k x_k = lim_{x_k→+∞} [c/γ]_k x_k = ∞ for every k, then the P-L simplex algorithm must eventually stop.

Proof. Let x̄ be the basic feasible solution at which the algorithm is started. The level set X̄ = {x: [c/γ]x ≤ [c/γ]x̄} is nonempty and convex; the condition that lim_{x_k→-∞} [c/γ]_k x_k = lim_{x_k→+∞} [c/γ]_k x_k = ∞ for every k implies that X̄ is also bounded. Thus, just as in the proof of Property 12, the algorithm cannot visit infinitely many bases. □

Alternatively, a condition for termination may be given in terms of the slopes and breakpoints:

Property 14. If the differences c_k^(h+1) - c_k^(h) between slopes and γ_k^(h+1) - γ_k^(h) between breakpoints are bounded below by some positive value for all k, then the P-L simplex algorithm must eventually stop.

Proof. Suppose that the algorithm visits an infinite sequence of basic feasible solutions. The idea of the proof is to first extract a particularly convenient infinite subsequence. Then the objective value along the subsequence can be shown to eventually increase, producing a contradiction.

It is useful to first note that, by the Property's hypothesis, neither the slopes nor the breakpoints may converge. As a result, under the previous assumptions on breakpoint sequences in [c/γ]_k, either there is a smallest finite slope c_k^(s) (with γ_k^(s) = -∞) or

    lim_{h→-∞} c_k^(h) = lim_{h→-∞} γ_k^(h) = -∞  and  lim_{x_k→-∞} [c/γ]_k x_k = ∞.

Similarly, either there is a largest finite slope c_k^(t) (with γ_k^(t+1) = +∞) or

    lim_{h→+∞} c_k^(h) = lim_{h→+∞} γ_k^(h) = +∞  and  lim_{x_k→+∞} [c/γ]_k x_k = ∞.

Since there are only finitely many ways to choose a nonsingular submatrix B from the columns of A, an infinite sequence of basic solutions must contain a subsequence in which the same m variables are basic and the same n - m are nonbasic. Within this subsequence, moreover, the values x̄_Bi of a basic variable may or may not fall between some two breakpoints infinitely often. Consequently, taking into account the restrictions of the hypothesis, it must be possible to extract a sub-subsequence along which one of the following holds:

    x_BVi: The objective values [c/γ]_Bi x̄_Bi are strictly increasing and tend to ∞ along the subsequence.

    x_BWi: For some index h_i, γ_Bi^(h_i) ≤ x̄_Bi ≤ γ_Bi^(h_i+1) at every iteration in the subsequence.

By performing the extraction for each x_Bi in turn, a subsequence can be found in which every basic variable satisfies one of these restrictions. Let x_BVi and x_BWi denote the basic variables of these two types, as indicated above.

By similar reasoning, the nonbasic values x̄_Nj may or may not lie at the same breakpoint infinitely often, and it is possible to further extract a subsequence such that one of the following holds:

    x_NSj: The values x̄_Nj lie at a strictly decreasing series of breakpoints that tend to -∞ along the subsequence.

    x_NTj: The values x̄_Nj lie at a strictly increasing series of breakpoints that tend to +∞ along the subsequence.

    x_NUj: For some index h_j, x̄_Nj = γ_Nj^(h_j) at every iteration in the subsequence.

By a series of extractions, a subsequence can be found in which every x_Nj satisfies one of the above (and in which the properties of the subsequence relative to the variables x_Bi are preserved). Let x_NSj, x_NTj and x_NUj denote the nonbasic variables that fall into each of the three categories. There must exist a subsequence that has at least one variable of type x_NSj or x_NTj, since otherwise the algorithm could not visit infinitely many distinct basic solutions.

Let x̄ and x̃ denote two successive basic solutions in the subsequence. The remainder of the proof must show that x̃ can be chosen so that [c/γ]x̃ > [c/γ]x̄, contradicting the assumption that x̃ follows x̄ in a series of simplex iterations.

Since x̃_NU = x̄_NU throughout the subsequence, the difference between the objective values at x̃ and x̄ can be broken into just four parts:

    [c/γ]x̃ - [c/γ]x̄ = ([c/γ]_NS x̃_NS - [c/γ]_NS x̄_NS)
                     + ([c/γ]_NT x̃_NT - [c/γ]_NT x̄_NT)
                     + ([c/γ]_BV x̃_BV - [c/γ]_BV x̄_BV)
                     + ([c/γ]_BW x̃_BW - [c/γ]_BW x̄_BW).

However, because [c/γ]_BWi has the same slope c_BWi^(h_i) at both x̃_BWi and x̄_BWi, the difference in objective terms for these variables is simply

    [c/γ]_BWi x̃_BWi - [c/γ]_BWi x̄_BWi = (x̃_BWi - x̄_BWi) c_BWi^(h_i).

The equations Ax̃ = b and Ax̄ = b imply A(x̃ - x̄) = 0, or B(x̃_B - x̄_B) = -Σ_j (x̃_Nj - x̄_Nj) a_Nj. Letting B w_Nj:B = a_Nj as in Section 2, and dropping the terms x̃_NUj - x̄_NUj = 0, it follows that

    x̃_BWi - x̄_BWi = -Σ_j (x̃_NSj - x̄_NSj) w_NSj:BWi - Σ_j (x̃_NTj - x̄_NTj) w_NTj:BWi.

Substituting in the preceding expressions,

    [c/γ]x̃ - [c/γ]x̄
      = Σ_j (x̃_NSj - x̄_NSj) ( ([c/γ]_NSj x̃_NSj - [c/γ]_NSj x̄_NSj) / (x̃_NSj - x̄_NSj) - Σ_i w_NSj:BWi c_BWi^(h_i) )
      + Σ_j (x̃_NTj - x̄_NTj) ( ([c/γ]_NTj x̃_NTj - [c/γ]_NTj x̄_NTj) / (x̃_NTj - x̄_NTj) - Σ_i w_NTj:BWi c_BWi^(h_i) )
      + Σ_i ([c/γ]_BVi x̃_BVi - [c/γ]_BVi x̄_BVi).


Consider the expression within the first sum above, for any j. The first factor (x̃_NSj - x̄_NSj) is negative and bounded away from zero, by the definition of the subsequence and the hypothesis about breakpoints. Within the second factor, the first term is just the average rate of change of the function [c/γ]_NSj when its argument moves from x̄_NSj to x̃_NSj; this quantity is no greater than the slope immediately to the left of x̄_NSj, so it must tend to -∞ as x̄ moves out the subsequence, by the hypothesis about slopes. The second term in the second factor is the same for any x̄ in the subsequence. Thus the first sum (if it is not empty) must grow arbitrarily large as the subsequence proceeds; an entirely analogous argument shows the same for the second sum. Since there must be at least one variable x_NSj or x_NTj, both sums cannot be empty.

Finally, all of the terms in the third sum are positive by the definition of the subsequence. Thus the total difference [c/γ]x̃ - [c/γ]x̄ must be positive for x̃ far enough along the subsequence, which was the contradiction to be shown. □

Given some postulated lower-bound value, each iteration of the P-L simplex algorithm can easily check that all newly encountered slopes and breakpoints obey the conditions of the above property. Then the algorithm must eventually stop, either with an optimal solution or with a warning that some pair of breakpoints or slopes differ by less than the lower bound.
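Such a safeguard might look as follows (a hypothetical helper, not code from the paper): as new slopes or breakpoints of a term are generated, the gaps between consecutive values are compared against the postulated lower bound:

```python
def gaps_obey_bound(values, eps):
    """Check Property 14's hypothesis on a list of newly encountered slopes
    (or breakpoints) of one term, given in increasing order: every gap
    between consecutive values must be at least eps > 0. A False result
    means the finite-termination guarantee no longer applies."""
    return all(b - a >= eps for a, b in zip(values, values[1:]))

assert gaps_obey_bound([0.0, 1.0, 2.5], eps=0.5)
assert not gaps_obey_bound([0.0, 0.2, 2.5], eps=0.5)
```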

Addendum

Two early descriptions of piecewise-linear simplex algorithms predate those surveyed in Part I of this paper [8].

Gol'šteĭn [10] states an algorithm that is essentially equivalent to the P-L simplex algorithm of [8]. Nondegeneracy is assumed, so that the usual proof of termination applies, but no proof of optimality is suggested.

Orden and Nalbandian [14] describe a piecewise-linear simplex algorithm in terms of the traditional simplex tableau. Their method steps from one basic solution to another, in the same way as the algorithm in [8]. Their test for optimality and their selection of an entering variable rely on the true reduced costs (as defined in Section 2 above) rather than on the nominal reduced costs used in [8]; hence a nondegeneracy assumption is necessary for optimality of the final basis as well as for finite termination.

References

[1] M.L. Balinski and R.E. Gomory, "A mutual primal-dual simplex method," in: R.L. Graves and P. Wolfe, eds., Recent Advances in Mathematical Programming (McGraw-Hill, New York, 1963) pp. 17-26.


[2] R.H. Bartels, "A penalty linear programming method using reduced-gradient basis-exchange techniques," Linear Algebra and Its Applications 29 (1980) 17-32.

[3] R.G. Bland, "New finite pivoting rules for the simplex method," Mathematics of Operations Research 2 (1977) 103-107.

[4] A. Charnes and C.E. Lemke, "Minimization of non-linear separable convex functionals," Naval Research Logistics Quarterly 1 (1954) 301-312.

[5] V. Chvátal, Linear Programming (W.H. Freeman, New York, 1983).

[6] A.R. Conn, "Linear programming via a nondifferentiable penalty function," SIAM Journal on Numerical Analysis 13 (1976) 145-154.

[7] G.B. Dantzig, Linear Programming and Extensions (Princeton University Press, Princeton, NJ, 1963).

[8] R. Fourer, "A simplex algorithm for piecewise-linear programming I: Derivation and proof," Mathematical Programming 33 (1985) 204-233.

[9] R. Fourer, "A simplex algorithm for piecewise-linear programming III: Computational analysis and applications," Technical Report 86-03, Department of Industrial Engineering and Management Sciences, Northwestern University (Evanston, IL, 1986).

[10] E.G. Gol'šteĭn, "A certain class of nonlinear extremum problems," Doklady Akademii Nauk SSSR 133; translation in Soviet Mathematics 1 (1960) 863-866.

[11] G.W. Graves, "A complete constructive algorithm for the general mixed linear programming problem," Naval Research Logistics Quarterly 12 (1965) 1-34.

[12] D.G. Luenberger, Linear and Nonlinear Programming, 2nd edition (Addison-Wesley, Reading, MA, 1984).

[13] C.E. Miller, "The simplex method for local separable programming," in: R.L. Graves and P. Wolfe, eds., Recent Advances in Mathematical Programming (McGraw-Hill, New York, 1963) pp. 89-100.

[14] A. Orden and V. Nalbandian, "A bidirectional simplex algorithm," Journal of the Association for Computing Machinery 15 (1968) 221-235.

[15] D.C. Rarick, "Pivot row selection in the WHIZARD LP code," Management Science Systems.

[16] R.T. Rockafellar, Network Flows and Monotropic Optimization (Wiley-Interscience, New York, 1984).

[17] P. Wolfe, "The composite simplex algorithm," SIAM Review 7 (1965) 42-54.

