
A SYSTEMATIC REVIEW AND SYNTHESIS OF HIGHER QUALITY EVIDENCE OF THE EFFECTIVENESS OF EXERCISE INTERVENTIONS FOR NON-SPECIFIC LOW BACK PAIN OF AT LEAST 6 WEEKS’ DURATION

DRIES M. HETTINGA1, ANNE JACKSON1, JENNIFER KLABER MOFFETT2, STEPHEN MAY3, CHRIS MERCER4, STEVE R. WOBY5

1 Research and Clinical Effectiveness Unit, Chartered Society of Physiotherapy, London, UK
2 Institute of Rehabilitation, University of Hull, Hull, UK
3 Faculty for Health and Wellbeing, Sheffield Hallam University, Sheffield, UK
4 Physiotherapy Department, Worthing and Southlands Hospitals NHS Trust, Upper Shoreham, Shoreham-by-Sea, UK
5 Physiotherapy Department, North Manchester General Hospital, Manchester, UK

Systematic reviews and randomised controlled trials (RCTs) generally support the use of exercise interventions to reduce pain and improve function in patients with chronic non-specific low back pain (LBP). However, many RCTs in the field of LBP include small numbers of subjects, have significant methodological limitations and use diverse exercise interventions. This review shows that the smaller RCTs often overestimate the true effectiveness or fail to detect true benefits. Also, statistically significant results from RCTs of low methodological quality might not be valid. This review showed that evidence from RCTs that are larger (≥ 40 subjects in exercise group), score high on the adapted van Tulder methodological quality criteria (≥ 5/10) and have used adequate statistical tests supports the use of exercise for patients with LBP of at least 6 weeks’ duration, although the effect sizes tend to be smaller. This higher quality evidence particularly supports the use of strengthening exercises, (organised) aerobic exercises, general exercises, hydrotherapy and McKenzie exercises for back pain of at least 6 weeks’ duration.

Keywords: Exercise, low back pain, systematic review

Physical Therapy Reviews 2007; 12: 221–232

© W. S. Maney & Son Ltd 2007 DOI 10.1179/108331907X222958

INTRODUCTION

Non-specific low back pain (LBP) is a common condition that results in considerable discomfort and disability for patients and high health and social care costs.1–3 Numerous interventions commonly prescribed by physiotherapists and other healthcare providers can play a role in the prevention and treatment of LBP. However, a definitive answer in respect of the most effective and efficient treatment for LBP is still the topic of much debate and research.4–6

Exercise has been suggested as an effective treatment for LBP,7–9 but research in this area has been plagued by methodological challenges, the wide diversity in exercise interventions and the presence of possible confounders in the diverse group of patients with non-specific LBP.10,11

Multiple clinical guidelines have been compiled to assist practitioners with providing effective and efficient treatments for back pain.12–14 Where available, randomised controlled trials (RCTs) and systematic reviews (SRs) have provided the basis for these guidelines. However, many of these RCTs have a number of limitations, most notably: (i) small sample size; (ii) low methodological quality; and (iii) inadequate or inappropriate statistical tests to analyse the results.11,15,16 Most SRs in the area of LBP fail to recognise all three limitations or the effect of each limitation has not been clarified.

In addition to these potential shortcomings in the design of RCTs, individuals with non-specific LBP represent a heterogeneous group of patients, which consequently means that large sample sizes are required in order to detect clinically and statistically significant results.11 Combining results of multiple smaller RCTs in SRs or meta-analyses can be difficult due to the great variation in exercise interventions employed in the various RCTs.7 Moreover, research in other areas has shown that evidence from large RCTs might contradict evidence from systematic reviews on multiple smaller RCTs.17

The limitations described above might explain why many previous SRs on exercise for non-specific LBP have only found exercise to be of moderate benefit.7,18 Recent attempts to extract more specific information on what exercise programmes should be prescribed for LBP have shown promising results.8,9 However, limitations in the primary studies that formed the basis of these reviews remain an obstacle for compiling high-quality recommendations that can be widely generalised. It is, therefore, of interest to explore the effect of sample size, methodological quality and statistical rigour on the recommendations for exercise interventions in the treatment of LBP.

For this review, we defined persistent or chronic LBP as pain persisting for 6 weeks or more. Although other definitions have been used, 6 weeks is beyond the period of spontaneous recovery for much back pain.2,19 Moreover, specific exercise interventions have been shown to be of limited value for individuals with acute LBP (i.e. less than 6 weeks’ duration).7,12,13

The aim of this review is, therefore, to summarise the best available evidence for exercise interventions for the treatment of non-specific LBP of at least 6 weeks’ duration. In an attempt to overcome the limitations inherent in existing reviews, only those RCTs that included a large sample size, were of good methodological quality, and employed adequate statistical analysis were included in this review. Since the majority of the RCTs have monitored changes in pain and function, these two outcomes have been used for detailed quantitative analyses.

METHODOLOGY

This review was part of a project aimed at developing physiotherapy-specific guidelines for the treatment of persistent low back pain, which will be published by the Chartered Society of Physiotherapy (UK).

Literature search

An extensive literature search was conducted in 2003 to identify SRs and RCTs on physiotherapy interventions for the treatment of LBP. This search in MedLine, EMBASE, CINAHL, AMED, Cochrane, PEDro and the library collection of the Chartered Society of Physiotherapy (UK) resulted in 5065 articles. In 2005, an updated search was conducted using the same databases to identify any new SRs or RCTs published before 1 June 2005. This search resulted in an additional 2660 articles. SRs were used to identify relevant RCTs, while the search on RCTs was aimed at finding RCTs not included in the SRs or published after the last search date of the SRs.

RCTs were included if they met the following criteria:

1. Patients (> 18 years of age) with non-specific low back pain of at least 6 weeks’ duration.

2. Exercise was used as the single physiotherapy intervention for at least one group in the trial. RCTs were not excluded if exercise was combined with other non-physiotherapy interventions (e.g. general practitioner [GP] care, medication).

3. The effectiveness of exercise was tested in at least one of the following areas: pain, function, psychological status or return to work/sick leave.

The methodological quality of the included trials was assessed by two reviewers using an adapted version of the van Tulder criteria.20 The original scale consists of 24 criteria, of which 10 were thought to be the most relevant for this review. Nine of the criteria relate to the internal validity of the RCT and one criterion relates to the similarities of main baseline characteristics/predictors (see Table 1 for a list of all criteria). The other criteria included within the original 24-item van Tulder list were not used to calculate the score in this review, because they were either already included in the scope of this review (e.g. eligibility criteria, description of interventions) or qualitatively assessed (e.g. statistical criteria). Sample size was considered separately since this is not included in the van Tulder criteria. Instead of using the total number of subjects in the RCTs, we have used the number of subjects in the exercise group as an indication of sample size. This controls for RCTs with a relatively large sample size, but multiple subgroups. The smallest RCTs (≤ 15 subjects in the exercise group) that also scored very low on the methodological quality scale (≤ 2 out of the 10 points) were not considered in this review.
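As an illustration of the scoring described above, the following minimal sketch counts the fulfilled criteria (lettered A to J, as in the notes to Table 1) and applies the rule for setting aside very small, very low-quality trials. The function names and the data layout are assumptions made for this example and are not taken from the review; the trial values in the usage lines are hypothetical.

```python
# Illustrative sketch only; function names and data layout are hypothetical.

CRITERIA = set("ABCDEFGHIJ")  # the 10 adapted van Tulder criteria, A to J


def quality_score(fulfilled):
    """Return the number of distinct criteria (A-J) that a trial fulfilled."""
    met = {c.strip().upper() for c in fulfilled}
    unknown = met - CRITERIA
    if unknown:
        raise ValueError(f"Unknown criteria: {sorted(unknown)}")
    return len(met)


def is_excluded(n_exercise, score):
    """Rule from the text: <= 15 subjects in the exercise group AND score <= 2."""
    return n_exercise <= 15 and score <= 2


# Hypothetical trials (not rows from Table 1).
score_a = quality_score(["A", "B", "H", "I", "J"])  # five criteria met
score_b = quality_score(["I"])                      # one criterion met
print(score_a, is_excluded(46, score_a))
print(score_b, is_excluded(12, score_b))
```

Both conditions must hold for a trial to be set aside, so a small trial with a moderate quality score, or a low-scoring trial with a larger exercise group, is retained under this rule.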

Quantitative analysis

Most of the RCTs monitored pain and function before and after intervention; therefore, these two outcomes were analysed quantitatively. Any outcome measure for pain and function was included in this analysis. For every RCT, the change in pain and function (after any follow-up period) was calculated as a percentage of the baseline score of that group. These percentage changes were plotted so that the change in the exercise group was displayed on the x-axis, and the control or alternative group on the y-axis (Figs 1 and 2). Some RCTs reported P-values for this difference in change, while in other cases the statistical significance of this value was unknown. These L’Abbé plots give a clear picture of all the data in the included RCTs and facilitate identification of possible outliers.21 Points towards the bottom-right corner of the plots indicate that the exercise intervention was better, while points towards the top-left corner favour the control interventions. Points along the line y = x indicate that the effectiveness of the exercise intervention was similar to the control intervention.

To assess the effect of sample size and methodological quality, the difference in change (percentage change in the exercise group minus the percentage change in the control/alternative intervention) was plotted versus the quality score of the trial and the number of subjects in the exercise group (Figs 3 and 4).
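The calculations described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the scores are invented, and it assumes a scale on which lower scores mean less pain or disability, so that a fall in score counts as improvement.

```python
# Illustrative sketch only; names are hypothetical, and the sign convention
# (lower score = less pain or disability) is an assumption.


def percent_improvement(baseline, follow_up):
    """Improvement expressed as a percentage of the baseline score."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (baseline - follow_up) * 100.0 / baseline


def difference_in_change(exercise_pct, comparator_pct):
    """Exercise change minus control/alternative change (percentage points)."""
    return exercise_pct - comparator_pct


def labbe_region(exercise_pct, comparator_pct, tol=1e-9):
    """Locate a point (x = exercise, y = comparator) relative to the line y = x."""
    diff = difference_in_change(exercise_pct, comparator_pct)
    if diff > tol:
        return "favours exercise"    # below the line, towards bottom-right
    if diff < -tol:
        return "favours comparator"  # above the line, towards top-left
    return "no difference"           # on the line y = x


# Hypothetical scores on a 0-100 pain scale (not data from the review).
exercise = percent_improvement(60.0, 42.0)  # 30.0
control = percent_improvement(60.0, 51.0)   # 15.0
print(difference_in_change(exercise, control), labbe_region(exercise, control))
```

A positive difference in change corresponds to a point below the line y = x in Figs 1 and 2, and to a point above the line y = 0 in Figs 3 and 4.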


Fig. 1. Percentage improvement in pain (on any outcome measure at any follow-up point) for the exercise group and the corresponding control/alternative group. Each point represents one observation on the difference in change: filled squares, statistically significant; open squares, non-statistically significant; open triangles, unknown level of statistical significance.

Fig. 2. Percentage improvement in function (on any outcome measure at any follow-up point) for the exercise group and the corresponding control/alternative group. Each point represents one observation on the difference in change: filled squares, statistically significant; open squares, non-statistically significant; open triangles, unknown level of statistical significance.

Fig. 3. Association between number of subjects in the exercise group and the difference in percentage change between the exercise group and the control/alternative group. Each point represents one observation on the difference in change: filled squares, statistically significant; open squares, non-statistically significant; open triangles, unknown level of statistical significance.

In the above-mentioned figures, outcomes after any follow-up period are grouped in one graph. To evaluate the effect of follow-up period on outcome, the change over time was plotted versus the follow-up period (Figs 5 and 6).

Qualitative analysis

Since small sample size, low methodological quality and inadequate statistical testing can distort the results of RCTs, only results from large, good quality RCTs with adequate statistical tests comparing the difference in change between intervention groups were considered for the qualitative analysis. To distinguish larger from smaller trials and low quality from higher quality trials, arbitrary cut-off points were agreed. Methodological quality was assessed on a 10-point scale and RCTs were considered to be of lower methodological quality if they scored lower than the median score on the 10-point quality score. Consequently, RCTs scoring the median score or better were considered to be of higher methodological quality.

Although a power analysis should determine adequate sample size, a pragmatic approach was used in this review due to limited resources for this project. An RCT with ≥ 40 subjects in the exercise group was defined as large. This cut-off point was adapted from work by Moore et al.15 on the effect of random chance on variability in patients’ response to pain relief interventions.

Fig. 4. Association between methodological quality score and difference in percentage change between exercise group and control/alternative group. Each point represents one observation on the difference in change: filled squares, statistically significant; open squares, non-statistically significant; open triangles, unknown level of statistical significance.

Fig. 5. The percentage change in pain over time. R² of the best-fitted trend line is 0.0016.

The third criterion for best available evidence in this review is adequate statistical testing to determine the level of statistical significance for the difference in change between the exercise group and control/alternative group. Given that LBP is known to improve in many patients over time,2,19 comparing pre- and post-scores within one group is not sufficient to test the effectiveness of an intervention. This review will only consider the difference in change between the exercise group and the comparator group and the level of statistical significance for that difference in change.

Although all RCTs that fulfilled the three inclusion criteria were included in the quantitative analysis (minus the very small RCTs that also scored low on the quality scale), only RCTs that were large, high quality and employed adequate statistical tests were used to recommend specific exercise interventions for chronic non-specific LBP.
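The three criteria for best available evidence can be expressed as a simple filter. The sketch below is an illustration only: the `Trial` type and the `best_evidence` helper are hypothetical names, and the five trials are a small subset of Table 1 chosen for the example. The review computed the median quality score over all included RCTs, not over a subset like this one.

```python
# Illustrative sketch only; the Trial type and helper names are hypothetical.
from dataclasses import dataclass
from statistics import median

LARGE_N = 40  # cut-off used in the review: >= 40 subjects in the exercise group


@dataclass
class Trial:
    name: str
    quality: int          # adapted van Tulder score, 0-10
    n_exercise: int       # subjects in the exercise group
    adequate_stats: bool  # was the between-group difference in change tested?


def best_evidence(trials):
    """Return trials that are large, of higher quality and adequately tested."""
    cutoff = median(t.quality for t in trials)  # median split on quality
    return [
        t for t in trials
        if t.quality >= cutoff
        and t.n_exercise >= LARGE_N
        and t.adequate_stats
    ]


# Subset of Table 1 for illustration (quality score, n in exercise group, test).
subset = [
    Trial("Donchin et al. (1990)", 5, 46, True),
    Trial("Gur et al. (2003)", 5, 25, True),
    Trial("Jousset et al. (2004)", 4, 41, False),
    Trial("Klaber Moffett et al. (1999)", 7, 89, True),
    Trial("Kankaanpaa et al. (1999)", 3, 30, True),
]
print([t.name for t in best_evidence(subset)])
```

Trials reporting two exercise groups (for example "41/40" in Table 1) would need an explicit rule for which group size to use; the review does not state one, so the sketch uses single-group examples only.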

RESULTS

Table 1 displays the 31 included RCTs with the quality score and sample size for each trial. The median methodological quality score was five and 16 RCTs were considered to be of higher methodological quality (≥ 5 out of 10). None of the included trials scored the maximum score. The total number of subjects varied widely between the included RCTs (n = 36–1334), but 20 of the 31 included RCTs had fewer than 40 subjects in the exercise group. Only seven trials were both of high quality (≥ 5 on quality score) and large (≥ 40 subjects in exercise group). All of these seven RCTs reported adequate statistical tests that directly compared the change in the intervention group with the change in the control/alternative group.

If all RCTs were pooled, exercise resulted in similar improvements in pain and function to alternative interventions (30–32%), while control interventions (i.e. no treatment or placebo) resulted in considerably less benefit (1–16%). When only considering evidence from large, good quality RCTs with adequate statistical testing, the effect of exercise on pain and function was slightly smaller (26–28%), while the effect of alternative interventions remained similar (30–32%) and the effect of control interventions was slightly higher (8–22%). It should be noted, however, that the effect of control interventions was based on a small number of observations. Full details are presented in Table 2.

The (percentage) change in pain in the exercise group and the corresponding control/alternative intervention is displayed in Figure 1. A large number of the points are located around the line of no difference (y = x), although there is a tendency for significant differences (displayed as black squares) to favour the exercise intervention. This is more evident for the figure displaying the effect on function (Fig. 2).

Figure 3 gives the number of subjects in the exercise group on the x-axis and the difference in change (percentage change in exercise group minus percentage change in control group) on the y-axis. Any black squares appearing above the x-axis, therefore, represent trials where a statistically significant finding was reported in favour of the exercise intervention; any black squares appearing below the x-axis represent a significant finding in favour of the control/comparative group. It appears that smaller studies report more varying differences in change than the larger studies. Similarly, although less evident, RCTs scoring lower on the quality scale report more variation in difference in change than RCTs scoring higher on the same scale (Fig. 4).

Table 1. Characteristics of included RCTs

| Reference | Interventions | Methodological quality score (criteria) | n in exercise group (total n) | Stat. test to compare changes? | Outcome measures included in this review |
|---|---|---|---|---|---|
| Callaghan et al. (1994)31 | General exercise (8 sessions) vs general exercise (4 sessions) vs control | 2 (B,I) | 30/30 (80) | Yes | Pain (VAS) |
| Donchin et al. (1990)22 * | Strengthening exercise versus backschool versus control | 5 (A,B,H,I,J) | 46 (142) | Yes | Pain (painful months), function (Oswestry) |
| Elnagger et al. (1991)36 | Mobilising exercise (flexion) vs mobilising exercise (extension) | 5 (A,B,G,I,J) | 28/28 (56) | Yes | Pain (Modified McGill PQ) |
| Gur et al. (2003)37 | General exercise vs general exercise + laser therapy versus laser therapy | 5 (B,E,G,H,I) | 25 (75) | Yes | Pain (VAS), function (Roland DQ, Oswestry) |
| Helmhout et al. (2004)38 | Strengthening exercise (high intensity) vs strengthening exercise (low intensity) | 3 (A,G,I) | 41/40 (81) | Yes | Function (Roland DQ, Oswestry) |
| Hemmila et al. (1997)33, (2002)34 | General exercise versus physical mixed methods versus bone setting | 7 (A,B,D,G,H,I,J) | 35 (132) | Yes | Pain (VAS) |
| Johannsen et al. (1995)39 | Aerobic exercise versus co-ordination exercise | 1 (I) | 20/20 (40) | Yes | Pain (0–4 scale for pain), function (patient’s rating of impairment), return to work (sick days) |
| Jousset et al. (2004)40 | General exercise versus functional restoration | 4 (B,E,H,I) | 41 (86) | No | Pain (VAS), function (impact on activities Dallas Pain Q), return to work (sick leave) |
| Kankaanpaa et al. (1999)28 | General exercise versus placebo (massage + thermal therapy) | 3 (E,H,I) | 30 (59) | Yes | Pain (VAS), function (Pain and Disability Index) |
| Kendall & Jenkins (1968)41 | Strengthening exercise (extension) versus general exercise versus strengthening exercise (isometric) | 2 (H,I) | 14/14/14 (42) | No | Pain (presence of pain on 3-point scale) |
| Klaber Moffett et al. (1999)27 * | General exercise versus GP care | 7 (A,B,E,G,H,I,J) | 89 (187) | Yes | Pain (Aberdeen Back Pain Scale), function (Roland DQ), psychological status (Fear Avoidance Q) |
| Koumantakis et al. (2005)42 | General exercise versus core stability exercise | 7 (A,B,E,G,H,I,J) | 26/29 (55) | Yes | Pain (McGill, VAS), function (Roland-Morris DQ), psychological status (Tampa Kinesiophobia scale) |
| Kuukkanen & Malkia (2000)43 | General exercise versus control (intensive training group excluded due to non-random allocation) | 4 (B,E,H,I) | 29 (90) | No | Pain (Borg scale), function (Oswestry) |
| Martin et al. (1986)31 | General exercise (strengthening and mobilising) versus strengthening exercise (isometric) versus placebo | 1 (I) | 12 (36) | Yes | Pain (0–5 point scale for hourly pain rating), function (VAS for difficulties with ADL) |
| Manniche et al. (1988)44 | General exercise (high intensity) vs general exercise (low intensity) versus physical mixed methods | 7 (A,B,D,G,H,I,J) | 27/29 (105) | No | Pain (0–10 point scale), function (15 ADL questionnaire) |
| Manniche et al. (1991)45 | Strengthening exercise (high intensity) versus strengthening (low intensity) versus control (massage + hot compresses + mild exercise) | 2 (G,I) | 33/36 (105) | Yes | Pain (LBP Rating scale for pain), function (LBP Rating scale for disability and physical impairment) |
| Mannion et al. (1999)24, (2001)46 * | Aerobic exercise versus general exercise versus physical mixed methods | 9 (A,B,D,E,F,G,H,I,J) | 44/47 (148) | Yes | Pain (VAS), function (Roland Morris DQ), psychological status (Modified Zung Questionnaire for depression, Fear Avoidance Beliefs Questionnaire) |
| McIlveen & Robertson (1998)27 * | Hydrotherapy versus control | 6 (A,B,E,G,H,I) | 45 (109) | Yes | Pain (McGill Pain Questionnaire), function (Oswestry) |
| Petersen et al. (2002)23 * | Strengthening exercise versus McKenzie | 6 (A,B,E,H,I,J) | 128/132 (260) | Yes | Pain (Manniche LBP Rating scale), function (15 ADL items), return to work |
| Rasmussen-Barr et al. (2003)47 | Core stability exercise versus manual therapy | 2 (B,I) | 24 (47) | Yes | Pain (VAS), function (Oswestry, VAS for disability rating) |
| Reilly et al. (1989)48 | General exercise (supervised) versus general exercise (unsupervised) | 2 (H,I) | 20/20 (40) | No | Pain (VAS) |
| Risch et al. (1993)19 | General exercise versus control (waiting list) | 4 (B,H,I,J) | 31 (54) | Yes | Pain (West Haven Yale Multidimensional Pain Inventory), function (Sickness Impact Profile) |
| Rittweger et al. (2002)49 | Strengthening exercise versus vibration exercise | 4 (B,E,H,I) | 30/30 (60) | Yes | Pain (VAS), psychological status (Allgemeine Depression Skala) |
| Shaughnessy & Caulfield (2004)29 | Core stability exercise versus control (no intervention) | 2 (B,I) | 20 (41) | Yes | Function (Oswestry, Roland-Morris DQ) |
| Snook et al. (1998)50 | General exercise versus prevention of early morning lumbar flexion | 6 (B,E,F,G,H,J) | 36 (85) | No | Pain (0–10 scale), function (Mean disability/impairment) |
| Storheim et al. (2003)51 | Aerobic exercise versus cognitive intervention versus GP care | 5 (A,B,G,I,J) | 30 (93) | Yes | Pain (VAS, pain diary), function (Roland-Morris DQ), psychological status (Fear Avoidance Belief Questionnaire), return to work (Sick-listing) |
| Torstensen et al. (1998)25 * | General exercise versus aerobic exercise (walking) versus physical mixed methods | 8 (A,B,D,E,G,H,I,J) | 71/70 (208) | Yes | Pain (VAS), function (Oswestry), return to work (Return to work) |
| Tritilanunt & Wajanavisit (2001)52 | Aerobic exercise versus mobilising exercise | 6 (A,B,D,E,H,I) | 35/33 (72) | Yes | Pain (VAS) |
| Turner et al. (1990)32 | General exercise versus general exercise + operant-behaviour versus operant-behaviour | 4 (B,G,H,I) | 24 (96) | Yes | Pain (McGill), function (Sickness Impact Profile), psychological status (CES Depression Scale) |
| UK BEAM (2004)4 * | General exercise + GP care versus general exercise + manual therapy + GP care versus manual therapy + GP care versus GP care | 6 (A,B,D,H,I,J) | 310 (1334) | Yes | Pain (Von Korff scale), function (Roland DQ, Modified Von Korff scales), psychological status (Fear Avoidance Questionnaire) |
| Yozbatiran et al. (2004)53 | General exercise (land-based) vs general exercise (in the pool) | 5 (B,E,H,I,J) | 15 (30) | No | Pain (VAS), function (Oswestry) |

RCTs marked with an asterisk (*) have fulfilled the three criteria of higher quality trials (large, good methodological quality and adequate statistical testing).

Methodological quality criteria: (A) Treatment allocation: Was the treatment allocation concealed? (B) Were the groups similar at baseline regarding the most important prognostic indicators? (e.g. age, duration of complaints, value of main outcome measures). (C) Was the care provider blinded to the intervention? (D) Were co-interventions avoided or comparable? (E) Was the compliance rate (in each group) unlikely to cause bias? (F) Was the patient blinded to the intervention? (G) Was the outcome assessor blinded to the intervention? (H) Was the withdrawal/drop-out rate unlikely to cause bias? (I) Was the timing of the outcome assessments in both groups comparable? (J) Did the analysis include an intention-to-treat analysis? These criteria are derived from van Tulder et al.21

Fig. 6. The percentage change in function over time. R² of the best-fitted trend line is 0.0068.

When all observations are grouped, no clear effect of follow-up period on the percentage change can be identified (Figs 5 and 6). In both cases, the best-fitting linear trend line had an R² close to zero.
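For readers unfamiliar with the statistic, the R² of a straight trend line can be computed as the squared Pearson correlation between the two variables. The sketch below shows this under the assumption that an ordinary least-squares line with an intercept was fitted, which the review does not state explicitly; the function name and the data are hypothetical.

```python
# Illustrative sketch only; the data below are hypothetical, not from the review.


def r_squared(xs, ys):
    """R^2 of the least-squares straight line (with intercept) through (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    # For simple linear regression, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


follow_up_months = [1, 3, 6, 12]       # hypothetical follow-up periods
pct_change = [30.0, 28.0, 28.0, 30.0]  # hypothetical percentage changes
print(r_squared(follow_up_months, pct_change))
```

A value close to zero, as reported for Figs 5 and 6, means that a straight line in follow-up time explains almost none of the variation in percentage change.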

For the qualitative analysis, exercise interventions were grouped according to the type of exercise used and the following exercises were identified (based on the descriptions given in the original articles): mobilising exercises (2 RCTs), strengthening exercises (7 RCTs), aerobic exercises (6 RCTs), general exercises (15 RCTs), core stability exercise (3 RCTs), hydrotherapy (1 RCT), McKenzie exercises (1 RCT) and co-ordination exercises (1 RCT). The individual RCTs in these groups are noted in Table 1. There was evidence from at least one large, good quality RCT with robust statistical analyses to support the following statements:

1. Strengthening exercises are more effective than no treatment or back schools in reducing pain.22 Also, strengthening exercises are equally as effective as McKenzie exercises in reducing pain, although the short-term effect is less clear. Strengthening exercises are equally as effective as McKenzie exercises in improving function and improving psychological status.23

2. Organised aerobic exercises are more effective than physical mixed methods in improving psychological status, although the short-term effect is less clear. Organised aerobic exercises are equally as effective as physical mixed methods in reducing pain and improving function.25 Organised aerobic exercises are also equally as effective as muscle reconditioning exercises in reducing pain and improving function and psychological status.25 However, unsupervised aerobic exercises are less effective than physical mixed methods in reducing pain, but more effective in improving function25 and, in comparison to general exercises, unsupervised aerobic exercises are less effective in reducing pain and improving function.25

3. General exercises are more effective than GP care in reducing pain and improving function, especially in the long term.26 In addition, there is evidence from a very large trial that general exercises are more effective than GP care in reducing pain and improving function and psychological status.4 Compared to physical mixed methods, general exercises are equally as effective in reducing pain and improving function24,25 and more effective in improving psychological status, although short-term psychological benefits are equal.24 In addition, general exercises are more effective than unsupervised aerobic exercises in reducing pain and improving function,25 but equally as effective as organised aerobic exercises in reducing pain and improving function and psychological status.25

4. Hydrotherapy is more effective than no treatment in improving function and equally as effective in reducing pain.27

5. McKenzie exercises are equally as effective as strengthening exercises in improving function, although McKenzie exercises might be more effective in the short term, and McKenzie exercises are equally as effective as strengthening exercises in reducing pain and improving return to work.23

Note: For the qualitative analysis, timing of follow-up measurements was not taken into consideration, except where discrepancies in short-term (< 6 months) and long-term results existed.

DISCUSSION

The aim of this review was to use the best available evidence to evaluate the effectiveness of exercise interventions for patients with non-specific low back pain of at least 6 weeks’ duration. Only evidence from RCTs that were large (≥ 40 subjects in the exercise group), had good methodological quality (≥ 5/10 on the quality score) and had directly compared and analysed the change in the exercise group with the change in the control/alternative group were used in the qualitative analysis of this review. Although most systematic reviews assess methodological quality of the included trials, the effect of sample size and statistical quality on the conclusion has often not been explored. When only considering evidence from large, high-quality RCTs with adequate statistical testing, support was found for the use of strengthening exercises, (structured) aerobic exercises, general exercises, hydrotherapy exercises and McKenzie exercises for pain and disability reduction. Psychological status and return to work were not always reported in the included RCTs; however, there was evidence to support the use of strengthening exercise, McKenzie exercises, (structured) aerobic exercises and general exercises for improving psychological status and the use of McKenzie exercises and strengthening exercises for improving return to work.

Table 2. Change in pain and function after any follow-up duration. Change is expressed as a percentage of the baseline value

| Outcome and evidence set | Exercise interventions: change (%) | Exercise interventions: observations (n) | Alternative interventions: change (%) | Alternative interventions: observations (n) | Control interventions: change (%) | Control interventions: observations (n) |
|---|---|---|---|---|---|---|
| Pain, all RCTs considered | 30.95 | 3408 | 30.46 | 2743 | 16.26 | 264 |
| Pain, only best available evidence | 25.80 | 2021 | 29.98 | 2042 | 22.00 | 50 |
| Function, all RCTs considered | 30.73 | 4204 | 32.31 | 4376 | 1.31 | 175 |
| Function, only best available evidence | 27.65 | 24.97 | 33.56 | 3606 | 8.00 | 50 |

Justification for only considering large, good quality RCTs with adequate statistical testing came from the quantitative analysis of all potentially eligible RCTs. Many RCTs reported changes in pain and function in the exercise group that were similar to the changes on the same outcomes in the control or alternative interventions (Figs 1 and 2). Statistically significant differences between the changes in the exercise group and the comparator group have been reported for trials that found large differences between the two19,28–31 or trials that included large sample sizes.4,25,26 Figure 3 shows that large differences between exercise and comparator have been reported mainly in the smaller trials, while the larger trials reported statistically significant differences that were much smaller.4,25,26 Smaller, but statistically significant, differences in effectiveness found in the larger trials were of similar magnitude to results of some of the smaller trials. However, the smaller trials failed to show statistical significance of these results, probably because of insufficient statistical power. Moreover, the smaller trials and the lower quality trials tended more often to report positive effects for the exercise intervention (shown by the fact that more points in Figs 3 and 4 are above the line y = 0). This might suggest publication bias, as smaller trials with negative results are less likely to be published. Another explanation might lie in differences in subject characteristics. The smaller RCTs might have recruited subjects from a smaller and more specific population, while the larger trials by definition have to recruit from a larger and, therefore, probably more representative population.
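The point about insufficient statistical power can be illustrated with a standard normal-approximation calculation for comparing two group means. This is a sketch only: the standardised effect sizes, the two-sided 5% significance level and the 80% power target are illustrative assumptions and do not come from the trials in this review.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def subjects_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per group for a two-sided, two-sample comparison.

    effect_size is the standardised mean difference (a hypothetical input).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approximate_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of the same comparison for a given group size."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


# Illustrative effect sizes, evaluated at 40 subjects per group.
for d in (0.8, 0.5, 0.2):
    print(d, subjects_per_group(d), round(approximate_power(d, 40), 2))
```

Under these assumptions, smaller true differences require much larger groups, which is consistent with the observation that modest differences reached statistical significance only in the larger trials.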

It is also noteworthy that large superior effects of exercise over comparator interventions have more often been reported in RCTs that score low on the methodological quality criteria (Fig. 4), possibly leading to an overestimation of the true effectiveness of exercise. Low methodological quality in RCTs limits the internal and external validity of the results and may potentially mislead the clinician.

Justification for pooling all follow-up points in the above quantitative analyses came from Figures 5 and 6. There was no clear correlation between timing of follow-up and the change in pain or function. It should be noted, however, that these figures neither support nor refute long-term effects of exercise interventions since the data presented in Figures 5 and 6 are derived from multiple RCTs and do not represent follow-up data of individual treatment groups.

When all the results of only the large, good-quality RCTs with adequate statistical testing are pooled, it appears that the effectiveness of exercise is similar to that of alternative interventions. Previous reviews that have pooled all exercise interventions found only moderate benefits of exercise for LBP. However, clinically more useful recommendations can be made if the type of exercise is specified. Hayden et al.8 concluded in their review and accompanying meta-regression analysis that individually designed, supervised exercise programmes that include strengthening and stretching may improve pain and function in chronic LBP. Discrepancies between the conclusion from Hayden et al.8 and the present review can probably be explained by differences in exercise characteristics included in the analysis (i.e. stretching was not included in our analysis, although it might overlap with mobilising exercises) and by differences in weight given to methodological quality, sample size and statistical rigour. Interestingly, some of the RCTs included in our review found exercise to be inferior to alternative interventions for chronic LBP,25,31–34 although only in one instance did such a finding emerge from a large, high quality RCT.25 Torstensen et al.25 reported that unsupervised walking was less effective than medical exercise therapy or conventional physiotherapy. The lack of supervision might explain this, as suggested by Hayden et al.8

Although this review gives valuable guidance for clinical practice, some limitations should be noted. First, the authors relied on information supplied in original publications, which in some cases was ambiguous. In such cases, consensus was sought between the authors of the present review, who are all familiar with the LBP literature. Second, this review included some outcome measures for pain, function, psychological status and return to work that had not undergone adequate psychometric testing. Despite this, it should be noted that the higher quality studies included within this review employed valid and reliable measures for pain and function such as the McGill Pain Questionnaire, the Aberdeen Back Scale and the Roland-Morris Disability Questionnaire. Consequently, the use of less well-established outcome measures is likely to have had only a minimal impact upon the final conclusions of this review. Finally, whilst presenting the effectiveness of interventions as a percentage of the baseline score enabled us to group various outcome measures, this approach has a number of disadvantages. For instance, regression to the mean might have over-represented the larger changes.
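Regression to the mean can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that each subject has a stable underlying score measured with random error, that only subjects with a high baseline measurement are enrolled, and that no treatment is given. All parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_untreated_change(n_subjects: int = 20000, threshold: float = 60.0, seed: int = 1) -> float:
    """Return the fall in the group mean score as a percentage of the mean baseline.

    Subjects are untreated; they are enrolled only if their baseline score is high.
    """
    rng = random.Random(seed)
    baselines, follow_ups = [], []
    for _ in range(n_subjects):
        true_score = rng.gauss(50, 10)             # stable underlying score
        baseline = true_score + rng.gauss(0, 10)   # noisy baseline measurement
        follow_up = true_score + rng.gauss(0, 10)  # noisy re-measurement, no treatment
        if baseline >= threshold:                  # enrol only high scorers
            baselines.append(baseline)
            follow_ups.append(follow_up)
    return 100.0 * (mean(baselines) - mean(follow_ups)) / mean(baselines)


print(f"Apparent improvement without treatment: {simulate_untreated_change():.1f}%")
```

In this simplified setting the selected group appears to improve although nothing has changed, which is why percentage changes from a high baseline should be read with caution.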

CONCLUSIONS

Many RCTs on exercise for chronic, non-specific LBP have limitations in methodological quality, sample size and/or statistical rigour. This review has shown that these limitations might have resulted in misleading conclusions. Specifically, the results of RCTs with smaller sample sizes, lower methodological quality and inadequate statistical testing should be interpreted with caution. Nevertheless, evidence from larger, higher quality RCTs with adequate statistical testing supports the use of exercise interventions for persons with LBP of at least 6 weeks’ duration. In particular, it would seem that strengthening exercises, (organised) aerobic exercises, general exercises, hydrotherapy exercises and McKenzie exercises are effective forms of exercise intervention for individuals with persistent LBP.

Future RCTs on exercise for LBP should include large sample sizes, based on appropriate power calculations, follow strict methodological guidelines and use adequate statistical testing in order to allow for the true effectiveness of exercise for non-specific LBP to be detected.11 Moreover, sub-classification of non-specific LBP, and exploring what exercises might benefit which sub-group, might improve effectiveness and further guide clinical practice.

ACKNOWLEDGEMENTS

The authors would like to thank the people who assisted with conducting this review (Katherine Dean, Jo Jordan, Jenni Hall and Sharlene Ting), the Library staff at the CSP (Samantha Molloy, Linda Griffiths, Andrea Peace, Anna Sewarniak and Alison Jinks), other staff at the CSP (Alex Warne, Helen Whittaker, Susan Williams) and all members of the Guideline Development Group (Panos Barlos, Sarah Ferguson, Susan Greenhaulgh, Vicki Harding, Deirdre Hurley-Osing, Denis Martin, Jude Monteath, Lisa Roberts, Nia Taylor). Funding for this project was received from the Chartered Society of Physiotherapy and the CSP’s Charitable Trust, London, UK.

REFERENCES

1 Frank JW, Kerr MS, Brooker AS, DeMaio SE, Maetzel A, Shannon HS et al. Disability resulting from occupational low back pain. Part I: What do we know about primary prevention? A review of the scientific evidence on prevention before disability begins. Spine 1996;21:2908–17

2 Frank JW, Brooker AS, DeMaio SE, Kerr MS, Maetzel A, Shannon HS et al. Disability resulting from occupational low back pain. Part II: What do we know about secondary prevention? A review of the scientific evidence on prevention after disability begins. Spine 1996;21:2918–29

3 Maniadakis N, Gray A. The economic burden of back pain in the UK. Pain 2000;84:95–103

4 UK BEAM trial team. United Kingdom back pain exercise and manipulation (UK BEAM) randomised trial: effectiveness of physical treatments for back pain in primary care. BMJ 2004;329:1377

5 van Tulder MW, Koes B, Malmivaara A. Outcome of non-invasive treatment modalities on back pain: an evidence-based review. Eur Spine J 2006;15(Suppl 1):S64–81

6 van der Roer N, Goossens ME, Evers SM, van Tulder MW. What is the most cost-effective treatment for patients with low back pain? A systematic review. Best Pract Res Clin Rheumatol 2005;19:671–84

7 Hayden JA, van Tulder MW, Malmivaara AV, Koes BW. Meta-analysis: exercise therapy for nonspecific low back pain. Ann Intern Med 2005;142:765–75

8 Hayden JA, van Tulder MW, Tomlinson G. Systematic review: strategies for using exercise therapy to improve outcomes in chronic low back pain. Ann Intern Med 2005;142:776–85

9 Liddle SD, Baxter GD, Gracey JH. Exercise and chronic low back pain: what works? Pain 2004;107:176–90

10 Koes BW, Malmivaara A, van Tulder MW. Trend in methodological quality of randomised clinical trials in low back pain. Best Pract Res Clin Rheumatol 2005;19:529–39

11 Bouter LM, van Tulder MW, Koes BW. Methodologic issues in low back pain research in primary care. Spine 1998;23:2014–20

12 van Tulder M, Becker A, Bekkering T, Breen A, Gil Del Real MT, Hutchinson A et al. Chapter 3 European guidelines for the management of acute nonspecific low back pain in primary care. Eur Spine J 2006;15(Suppl 2):s169–91

13 Koes BW, van Tulder MW, Ostelo R, Kim Burton A, Waddell G. Clinical guidelines for the management of low back pain in primary care: an international comparison. Spine 2001;26:2504–13

14 Philadelphia Panel. Philadelphia Panel evidence-based clinical practice guidelines on selected rehabilitation interventions for low back pain. Phys Ther 2001;81:1641–74

15 Moore RA, Gavaghan D, Tramer MR, Collins SL, McQuay HJ. Size is everything – large amounts of information are needed to overcome random effects in estimating direction and magnitude of treatment effects. Pain 1998;78:209–16

16 Kjaergard LL, Villumsen J, Gluud C. Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Ann Intern Med 2001;135:982–9

17 LeLorier J, Gregoire G, Benhaddad A, Lapierre J, Derderian F. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med 1997;337:536–42

18 van Tulder MW, Malmivaara A, Esmail R, Koes BW. Exercise therapy for low back pain. Cochrane Database Syst Rev 2000(2):CD000335

19 Risch SV, Norvell NK, Pollock ML, Risch ED, Langer H, Fulton M et al. Lumbar strengthening in chronic low back pain patients: Physiologic and psychological benefits. Spine 1993;18:232–8

20 van Tulder MW, Assendelft WJ, Koes BW, Bouter LM. Method guidelines for systematic reviews in the Cochrane Collaboration Back Review Group for Spinal Disorders. Spine 1997;22:2323–30

21 Song F. Exploring heterogeneity in meta-analysis: is the L’Abbe plot useful? J Clin Epidemiol 1999;52:725–30

22 Donchin M, Woolf O, Kaplan L, Floman Y. Secondary prevention of low-back pain. A clinical trial. Spine 1990;15:1317–20

23 Petersen T, Kryger P, Ekdahl C, Olsen S, Jacobsen S. The effect of McKenzie therapy as compared with that of intensive strengthening training for the treatment of patients with subacute or chronic low back pain: A randomized controlled trial. Spine 2002;27:1702–8

24 Mannion AF, Muntener M, Taimela S, Dvorak J. A randomized clinical trial of three active therapies for chronic low back pain. Spine 1999;24:2435–48

25 Torstensen TA, Ljunggren AE, Meen HD, Odland E, Mowinckel P, Geijerstam S. Efficiency and costs of medical exercise therapy, conventional physiotherapy, and self-exercise in patients with chronic low back pain: a pragmatic, randomized, single-blinded, controlled trial with 1-year follow-up. Spine 1998;23:2616–24

26 Klaber Moffett J, Torgerson D, Bell-Syer S, Jackson D, Llewlyn-Phillips H, Farrin A et al. Randomised controlled trial of exercise for low back pain: clinical outcomes, costs, and preferences. BMJ 1999;319:279–83

27 McIlveen B, Robertson VJ. A randomized controlled study of the outcome of hydrotherapy for subjects with low back or back and leg pain. Physiotherapy 1998;84:17–26

28 Kankaanpaa M, Taimela S, Airaksinen O, Hanninen O. The efficacy of active rehabilitation in chronic low back pain: effect on pain intensity, self-experienced disability, and lumbar fatigability. Spine 1999;24:1034–42

29 Shaughnessy M, Caulfield B. A pilot study to investigate the effect of lumbar stabilisation exercise training on functional ability and quality of life in patients with chronic low back pain. Int J Rehabil Res 2004;27:297–301

30 Callaghan MJ. Evaluation of a back rehabilitation group for chronic low back pain in an out-patient setting. Physiotherapy 1994;80:677–81

31 Martin PR, Rose MJ, Nichols PJ, Russell PL, Hughes IG. Physiotherapy exercises for low back pain: process and clinical outcome. Int Rehabil Med 1986;8:34–8

32 Turner JA, Clancy S, McQuade KJ, Cardenas DD. Effectiveness of behavioral therapy for chronic low back pain: a component analysis. J Consult Clin Psychol 1990;58:573–9

33 Hemmila HM, Keinanen-Kiukaanniemi SM, Levoska S, Puska P. Does folk medicine work? A randomized clinical trial on patients with prolonged back pain. Arch Phys Med Rehabil 1997;78:571–7

34 Hemmila HM, Keinanen-Kiukaanniemi SM, Levoska S, Puska P. Long-term effectiveness of bone-setting, light exercise therapy, and physiotherapy for prolonged back pain: a randomized controlled trial. J Manipul Physiol Ther 2002;25:99–104

35 Karjalainen K, Malmivaara A, Mutanen P, Pohjolainen T, Roine R, Hurri H. Outcome determinants of subacute low back pain. Spine 2003;28:2634–40

36 Elnaggar IM, Nordin M, Sheikhzadeh A, Parnianpour M, Kahanovitz N. Effects of spinal flexion and extension exercises on low-back pain and spinal mobility in chronic mechanical low-back pain patients. Spine 1991;16:967–72

37 Gur A, Karakoc M, Cevik R, Nas K, Sarac AJ, Karakoc M. Efficacy of low power laser therapy and exercise on pain and functions in chronic low back pain. Lasers Surg Med 2003;32:233–8

38 Helmhout PH, Harts CC, Staal JB, Candel MJ, de Bie RA. Comparison of a high-intensity and a low-intensity lumbar extensor training program as minimal intervention treatment in low back pain: a randomized trial. Eur Spine J 2004;13:537–47

39 Johannsen F, Remvig L, Kryger P, Beck P, Warming S, Lybeck K et al. Exercises for chronic low back pain: a clinical trial. J Orthop Sports Phys Ther 1995;22:52–9

40 Jousset N, Fanello S, Bontoux L, Dubus V, Billabert C, Vielle B et al. Effects of functional restoration versus 3 hours per week physical therapy: a randomized controlled study. Spine 2004;29:487–93; discussion 494

41 Kendall PH, Jenkins JM. Exercises for backache: a double-blind controlled trial. Physiotherapy 1968;54:154–7

42 Koumantakis GA, Watson PJ, Oldham JA. Trunk muscle stabilization training plus general exercise versus general exercise only: randomized controlled trial of patients with recurrent low back pain. Phys Ther 2005;85:209–25

43 Kuukkanen TM, Malkia EA. An experimental controlled study on postural sway and therapeutic exercise in subjects with low back pain. Clin Rehabil 2000;14:192–202

44 Manniche C, Hesselsoe G, Bentzen L, Christensen I, Lundberg E. Clinical trial of intensive muscle training for chronic low back pain. Lancet 1988;2(8626-8627):1473–6

45 Manniche C, Lundberg E, Christensen I, Bentzen L, Hesselsoe G. Intensive dynamic back exercises for chronic low back pain: a clinical trial. Pain 1991;47:53–63

46 Mannion AF, Dvorak J, Taimela S, Muntener M. Increase in strength after active therapy in chronic low back pain (CLBP) patients: muscular adaptations and clinical relevance. Schmerz 2001;15:468–73

47 Rasmussen-Barr E, Nilsson-Wikmar L, Arvidsson I. Stabilizing training compared with manual treatment in sub-acute and chronic low-back pain. Man Ther 2003;8:233–41

48 Reilly K, Lovejoy B, Williams R, Roth H. Differences between a supervised and independent strength and conditioning program with chronic low back syndromes. J Occup Med 1989;31:547–50

49 Rittweger J, Just K, Kautzsch K, Reeg P, Felsenberg D. Treatment of chronic lower back pain with lumbar extension and whole-body vibration exercise: a randomized controlled trial. Spine 2002;27:1829–34

50 Snook SH, Webster BS, McGorry RW, Fogleman MT, McCann KB. The reduction of chronic nonspecific low back pain through the control of early morning lumbar flexion. A randomized controlled trial. Spine 1998;23:2601–7

51 Storheim K, Brox JI, Holm I, Koller AK, Bø K. Intensive group training versus cognitive intervention in sub-acute low back pain: short-term results of a single-blind randomized controlled trial. J Rehabil Med 2003;35:132–40

52 Tritilanunt T, Wajanavisit W. The efficacy of an aerobic exercise and health education program for treatment of chronic low back pain. J Med Assoc Thai 2001;84(Suppl 2):S528–33

53 Yozbatiran N, Yildirim Y, Parlak B. Effects of fitness and aquafitness exercises on physical fitness in patients with chronic low back pain. Pain Clin 2004;16:35–42


DRIES M. HETTINGA

School of Health Sciences and Social Care, Brunel University, Uxbridge UB8 3PH, UK

ANNE JACKSON (for correspondence)

Research and Clinical Effectiveness Unit, Chartered Society of Physiotherapy, 14 Bedford Row, London WC1R 4ED, UK

Tel: +44 (0)1903 212116; Fax: +44 (0)207 306 6653; E-mail: [email protected]

JENNIFER KLABER MOFFETT

Institute of Rehabilitation, University of Hull, 215 Anlaby Road, Hull HU3 2PG, UK

STEPHEN MAY

Faculty for Health and Wellbeing, Sheffield Hallam University, Sheffield S10 2BP, UK

CHRIS MERCER

Physiotherapy Department, Worthing and Southlands Hospitals NHS Trust, Upper Shoreham, Shoreham-by-Sea BN43 6TQ, UK

STEVE R. WOBY

Physiotherapy Department, North Manchester General Hospital, Delauneys Road, Crumpsall, Manchester M8 5RB, UK, and Centre for Rehabilitation Science, University of Manchester, Manchester, UK

