+ All Categories
Home > Documents > A NEW LEARNING APPROACH TO PROCESS IMPROVEMENT IN A TELECOMMUNICATIONS COMPANY

A NEW LEARNING APPROACH TO PROCESS IMPROVEMENT IN A TELECOMMUNICATIONS COMPANY

Date post: 30-Sep-2016
Category:
Upload: tony-cox
View: 212 times
Download: 0 times
Share this document with a friend
11
PRODUCTION AND OPERATIONS MANAGEMENT Vol. 4, No. 3, Summer 1995 Primed in U.S.A. A NEW LEARNING APPROACH TO PROCESS IMPROVEMENT IN A TELECOMMUNICATIONS COMPANY * TONY COX, GEORGE BELL, AND FRED GLOVER US WEST AdvancedTechnologies, Communications Services Research, Boulder, Colorado80303, USA School of Business, University of Colorado at Boulder, Boulder, Colorado 80309, USA Redesigning and improving business processes to better serve customer needs has become a priority in serviceindustries asthey scramble to become more competitive. We describe an approach to process improvement that is being developed collaboratively by applied researchers at US WEST, a major telecommunications company, and the University of Colorado. Motivated by the need to streamline and to add more quantitative power to traditional quality improvement processes, the new approach uses an artificial intelligence (AI) statistical tree growing method using customer survey data to identify operations areas where improvements are expected to affect customers most. This AI/statistical method also identifies realistic quantitative targets for improvement and suggests specific strategies predicted to have high impact. This research, funded in part by the Colorado Advanced Software Institute (CASI) to stimulate profitable innovations, has resulted in a practical methodology used successfully at US WEST to help set process im- provement priorities and guide resource allocation decisions throughout the company. (QUALITY IMPROVEMENT; HEURISTIC OPTIMIZATION; MACHINE LEARNING; SERVICE INDUSTRY; CUSTOMER SERVICE MEASURES) 1. Introduction As the simple doctrines of kaizen and continuous quality improvement have swept from boardrooms to shop floors among Fortune 500 companies, many recent converts- especially members of quality teams with a year or more experience applying the new approaches-have begun to feel a need for something more. 
Even relatively nonquan- titative members of quality teams sometimes sense that Ishikawa diagrams may not adequately illuminate the complexities of control systems with feedback. Likewise, they soon become aware that simple Pareto diagrams may omit important sources of error, cost, and delay when the seeds of customer dissatisfaction stem from interactions among multiple variables rather than from their direct (marginal) contributions. Data envelopment analysis ( DEA) has similarly attracted attention as a way of evaluating of projects and company performance, by solving a collection of linked linear program- ming models. The result, however, is effectively to construct a piecewise linear Pareto * Received January 1994; revised February and August 1995; accepted August 1995. 217 1059-1478/95/0403/217$1.25 Copyright 0 1995, Production and Operations Management society
Transcript

PRODUCTION AND OPERATIONS MANAGEMENT Vol. 4, No. 3, Summer 1995

Primed in U.S.A.

A NEW LEARNING APPROACH TO PROCESS IMPROVEMENT IN A TELECOMMUNICATIONS

COMPANY *

TONY COX, GEORGE BELL, AND FRED GLOVER US WEST Advanced Technologies, Communications Services Research,

Boulder, Colorado 80303, USA School of Business, University of Colorado at Boulder,

Boulder, Colorado 80309, USA

Redesigning and improving business processes to better serve customer needs has become a priority in service industries as they scramble to become more competitive. We describe an approach to process improvement that is being developed collaboratively by applied researchers at US WEST, a major telecommunications company, and the University of Colorado. Motivated by the need to streamline and to add more quantitative power to traditional quality improvement processes, the new approach uses an artificial intelligence (AI) statistical tree growing method using customer survey data to identify operations areas where improvements are expected to affect customers most. This AI/statistical method also identifies realistic quantitative targets for improvement and suggests specific strategies predicted to have high impact. This research, funded in part by the Colorado Advanced Software Institute (CASI) to stimulate profitable innovations, has resulted in a practical methodology used successfully at US WEST to help set process im- provement priorities and guide resource allocation decisions throughout the company. (QUALITY IMPROVEMENT; HEURISTIC OPTIMIZATION; MACHINE LEARNING; SERVICE INDUSTRY; CUSTOMER SERVICE MEASURES)

1. Introduction

As the simple doctrines of kaizen and continuous quality improvement have swept from boardrooms to shop floors among Fortune 500 companies, many recent converts- especially members of quality teams with a year or more experience applying the new approaches-have begun to feel a need for something more. Even relatively nonquan- titative members of quality teams sometimes sense that Ishikawa diagrams may not adequately illuminate the complexities of control systems with feedback. Likewise, they soon become aware that simple Pareto diagrams may omit important sources of error, cost, and delay when the seeds of customer dissatisfaction stem from interactions among multiple variables rather than from their direct (marginal) contributions.

Data envelopment analysis ( DEA) has similarly attracted attention as a way of evaluating of projects and company performance, by solving a collection of linked linear program- ming models. The result, however, is effectively to construct a piecewise linear Pareto

* Received January 1994; revised February and August 1995; accepted August 1995. 217

1059-1478/95/0403/217$1.25 Copyright 0 1995, Production and Operations Management society

218 TONY COX, GEORGE BELL AND FRED GLOVER

curve, and additional tools are often required to translate such information into practical prescriptions.

Yet, despite the oversimplicity of its most widely popularized analytic techniques (or of the model assumptions that underly their implementations) the quality message is too important to ignore. Data-driven, customer-focused efforts to improve business processes are crucial to business success and survival in many service industries. There is an urgent need, increasingly felt by practitioners of TQM, for more realistic and powerful ways of understanding and using complex data to guide resource allocation, organizational re- structuring, and process redesign decisions to achieve TQM goals. Increased customer satisfaction and reduced costs, cycle times, and error rates can be achieved effectively only if improved techniques for moving from data to decisions are developed and de- ployed. We describe the results of an industry-university partnership that is developing innovative approaches to using complex customer service measure (CSM) data to identify areas critical to successful operations and planning.

Our approach involves an innovation in constructing classification trees of the type widely incorporated into many AI and statistics packages (Breiman et al. 1984; Biggs et al. 199 1) . Novel features of this approach include an emphasis on finding classification trees that involve only “actionable” quantities (quantities that can be affected through affordable changes in current practices and processes) and a focus on conclusions (e.g., on the feasibility of achieving different CSM targets in different areas of the business) that are robust, permitting their validity to be established by several independent paths. The technical methodology is still being advanced through the CASI collaboration and has already found practical application within US WEST Communications (uswc). This has profoundly affected the company’s attitudes, strategy, and tactics for process im- provement.

2. Background and Problem Statement

Our AI approach to multivariate data analysis features the application of recursive partitioning algorithms based on concepts found in the AI and statistics literature on machine learning (Buntine 1993; Ohta and Kanaya 199 1) . Our approach has broken through several technical barriers that had frustrated previous efforts to understand and quantify the causes (and potential cures) of customer dissatisfaction. Carried out as part of a customer-focused, data-driven quality improvement process, it has led to insights and recommendations for focusing process improvement efforts on the most critical changes-especially, changes in repair and installation practices gauged to have the greatest impacts on customer perceptions of uswc’s service quality in two mass markets (resi- dential and small business). The AI program’s findings have strong intuitive appeal and quickly earned credibility with customer-behavior experts within US WEST’s Strategic Marketing department. They were incorporated into uswc’s business plan in 1993 and helped to inform process improvement priority-setting at the company’s highest levels.

The accomplishments that made the machine-learning approach so successful were rooted in its ability to overcome technical obstacles that had frustrated previous attempts at clear data analysis and interpretation. The following three contributions emerge, in retrospect, as being most important to its success.

1. It provided a thoroughly nonparametric (model-free) alternative to the previous multivariate statistical models (including path analysis, factor analysis, and logistic and multivariate regression models), that had failed to provide stable, interpretable quanti- tative results as a basis for decision-making and priority-setting. To the statisticians con- ducting the data analysis, the AI approach offers a constructive solution to three challenging problems:

l Nonlinear interactions among different parts of the installation or repair processes in determining the probability distribution of customer-assigned CSM grades (on a

PROCESS IMPROVEMENT IN TELECOMMUNICATIONS 219

scale of A+ to F-). These nonlinearities were embedded in strong, nonobvious substitute and complement effects, deriving from factors that affected customer per- ceptions, such the quality of the interaction with the uswc sales representative, and technician performance. The result had led to model misspecification errors and to unstable parameter estimates within the generalized linear models previously fit to the customer response data.

l Population heterogeneity in the probabilistic responses (i.e., the grade-assignment probabilities) of different customers. Such heterogeneity necessitates using mixture distribution models, in which the proportion of customers behaving according to each submodel, as well as the behaviors associated with the submodels, must be estimated from the data.

l Missing data in customer responses. “No response” occurred in a substantial number of cases ( 5- 10% for some questions). While computationally expensive techniques such as multiple imputation could have been attempted, the AI approach was able to adaptively pool the “NR” category with other outcomes for some questions, and to split it out as a separate response category for others, in such a way that ability to predict final grades was maximized (according to a classzjication entropy criterion discussed below).

These challenges, which had frustrated previous efforts at reproducible data analysis and interpretation based on parametric models, were handled very naturally by the rule- induction approach of the AI framework.

2. The machine learning component provided a clear set of conclusions for better de- cision-making and process improvement countermeasures. To USWC'S top executives, the simple rules learned by the AI program seemed far more interpretable and useful than previous findings and analyses based on factor analysis and other generalized linear modeling approaches. It emerged that a relatively small set of actionable factors strongly affected customer responses. Causal interpretations of these strong predictive relations and an impetus to take actions based on them were rapidly forthcoming.

3. The model provided solid quantitative support for the intuitions and qualitative findings of the company’s strategic marketing experts. To the strategic marketing scientists who designed and managed the administration of the Customer Satisfaction Measure (CSM) survey instruments, the AI program gave detailed quantitative confirmation of the major qualitative suppositions that they had been proposing and debating for over a year: that customer perceptions of uswc’s service quality were sensitive to key behavior patterns such as honoring commitments, installing or repairing equipment properly the first time, and keeping customers informed throughout the installation or repair process. To these generalizations, the AI program added a layer of predictive ability that executives and managers could use to help set priorities and to quantify the expected impact of different proposed changes in operations.

Changing the operations of a major telecommunications provider can cost tens of millions of dollars. The ability to predict probable impact, and thus to set clear priorities, has helped to overcome what could have become an organizational decision impasse- one in which no department was willing to risk the costs involved in making dramatic changes for fear that costs would turn out to exceed benefits. By contrast, AI program findings were incorporated directly into USWC'S 1993 business plan and helped justify priorities for process improvement efforts in 1994.

3. Problem Formulation: Cost-Effective Improvement in the Frequency Distribution of Customer Grades

The CSM data-analysis and decision problem belongs to a family of decision problems characterized by actions with poorly specified probabilities for their consequences. As in

220 TONY COX, GEORGE BELL AND FRED GLOVER

traditional (SEU) decision analysis, it is possible to specify a set A of actions and a set S of states; however, in contrast to traditional models, the feasibility of the actions and the probabilities of the states are both uncertain.

The fundamental decision problem for CSM data analysis is to choose thefirst process improvement actions to take to improve the frequency distribution of the grades customers assign to their experiences with uswc. In a competitive environment, there is some urgency for this decision. More generally, the analytic challenge is to develop sound and practical techniques for moving from customer response data to corporate decisions about which changes to fund in the immediate future to improve customer response.

The customer response data may be summarized in regression model format as a set of pairs,

D = {<Xi, Vi), i = 1,2, . . . , M},

where D denotes the data set; y1 = the overall CSM grade assigned to USWC’S performance by customer i. A4 = number of customers in the sample (typically, several thousand). Xi = the ith customer’s vector of responses to questions on the CSM survey. These

response elements addressed covariates (e.g., sex, age, and occupation) and experiences (such as “Were you put on hold before you talked to someone?‘, “Were you put on hold after you talked to someone?“, “Was the service installed correctly the first time?‘, or “Was the installation completed on time?‘)

Thus, x0 denotes the response of customer i to item j on the survey questionnaire, for j= 1,2 * f 9 questionnaires

N. The number of questions, N, is typically between 40 and 60 for the used by most market units. Questions were administered over the tele-

phone, and typically not all questions had to be answered, i.e., the survey design allowed some questions to be skipped as a result of answers to earlier questions. The descriptors in x included binary, categorical, and ordered categorical (e.g., age category) variables. We will refer to Xi generically as the experience vector for customer i.

For purposes of conceptual modeling, let I;(x, y) denote the empirical joint cumulative distribution function (cdf) for the values of the (x, y) pairs in D, and, taking slight liberties with notation, let P(x) and F(y) denote the corresponding marginal cdfs for vector x and ordered categorical variable y. Let CM denote a finite set of countermeasures that uswc may implement to improve its current processes.

Each countermeasure in CM is a (potentially fundable) discrete change that is ex- pected to have an impact on Ii(x) and to cost a certain amount. For example, the countermeasure “Increase by 5% the number of attendants assigned to answer in- coming calls at the service repair bureau” would be expected to change P’(x) by decreasing the proportion of customers who are put on hold before talking to someone and by decreasing the fraction of customers who encounter busy signals when they call in.

The set of actions available to USWC, denoted by A, consists of all subsets of CM. The interpretation of an action a in A is that the set of countermeasures in a is implemented (i.e., the corresponding changes in operations and practices are made). We need to take this set-oriented perspective because, as we will show, the benefits expected from different countermeasures exhibit strong interactions (synergies and antagonisms).

Let F( x 1 a) denote the cdf for customer experiences x if act a is implemented. Notice that predictions about F(x 1 a) require (or embody) a model for predicting impacts on the joint frequency distribution of experiences, F(x), of acts in A. Let F( y 1 x) denote the cdf of customer grades among those customers receiving (or, more accurately, re-

PROCESS IMPROVEMENT IN TELECOMMUNICATIONS 221

porting) experience vector X. Finally, let J’( y 1 a) denote the cdf of grades induced in the obvious way by act a; thus,

where f( x 1 a) is the probability density of experience vector x given act a [determined from F(x 1 a)]. Then an initial goal of uswc’s process improvement efforts is to identify and select for implementation an act in A that is undominated with respect to cost and the resulting cdf of grades, F(y ] a) (where cdfs are to be compared by first-order stochastic dominance). Alternatively, if it is feared that all customers who give uswc less than a B grade are at risk of switching to a competitor, and if each such loss represents a dollar value of $w in lost revenue, then a more quantitative goal statement might be

minrize {c(a) + w*F(B - la)}

where c(a) denotes the cost of implementing act a. USWC’S stated objective differed from either of these: in keeping with the quality principle of “surprising and delighting” cus- tomers, the corporate goal was formulated as being to find the most cost-effective way to achieve an increased proportion of As from our customers within the next few years. Specific numerical targets were set for different areas of the business. Fortunately, all three formulations turn out to lead to similar recommendations, reflecting the scarcity of undominated alternatives.

4. Solution via a Machine-Learning Approach

A traditional approach to improving CSM grades results from applying a parametric statistical prediction model, e.g., of the form

E(Y) = g(x; b),

where Y = 1 if the grade assigned by a customer meets the desired target (e.g., an A or above) and Y = 0 otherwise, and b is a vector of parameters to be estimated from data set D. Then, an action in A is selected whose effect on x maximizes E(Y), the probability of achieving the target state.

Most traditional parametric statistical models belong to the generalized linear model family, E(Y) = g( bx), where bx is the inner product of vectors b and x ( McCullagh and Nelder 1983). However, attempts over several years to model the CSM data by such methods failed to produce robust, stable results. Indeed, in retrospect, it appears that no such model can describe the empirically observed CSM data. Instead, the coefficients b vary with x, leading to a nonlinear model of unknown functional form.

Recursive partitioning algorithms for nonparametric approximation of unknown non- linear functions have been extensively studied in the computational statistics and machine- learning literature (e.g., through programs such as CART and ID3 through IDS, respectively) (Ohta and Kanaya 199 1; Buntine 1993 ) . Briefly, the uncertainty about y corresponding to a grade distribution function F( y I x) can be quantified by measures such as classification entropy. Then, decision rules represented as classification trees can be automatically constructed, e.g., by always asking next the question that will minimize the expected conditional classification entropy. (Less myopic algorithms, e.g., based on simulated annealing, have recently been created to find more globally optimal trees [ Lutsko and Kuijpers 19941.) Each question represents a split, i.e., a node in the tree from which multiple branches corresponding to the different possible answers descend. A conditional frequency distribution of grades is associated with each node.

Control by Actionable Variables A novel aspect of our use of machine-learning methods is our focus on controling

rather than only on predicting of uswc’s perceived performance. To this end, we first

222 TONY COX, GEORGE BELL AND FRED GLOVER

partitioned the variables in x into actionable variables (e.g., technician showing up on time) and nonactionable variables (e.g., respondent age and sex). We then searched for effective strategies within the space of classification trees. We accomplished this by re- stricting splits to the set of actionable variables and by modifying the tree-evaluation objective function to reflect the quality of the conditional frequency distribution of grades implied by different sets of changes in the actionable variables.

Specifically, we looked for trees in which at least two different branches (paths through the tree) led to conditional grade distributions (at the leafnodes at the end of the branches) that were undominated with respect to the objective function. We sought two paths to increase robustness by avoiding solutions in which successful completion of all of a unique combination of changes was predicted to be necessary and sufficient to achieve better grades. Several different objective functions were explored, including first-order stochastic dominance (which partially ordered the set of actionable classification trees), maximization of the percentage of As (or top scores), and minimization of the percentage of Fs (or lowest scores). These different criteria usually identified identical or almost identical subsets of actions as optimal.

To provide rule-learning analyses of this type for heterogeneous CSM data sets collected by different market units, we needed a classification tree subroutine that would generate multiple plausible trees from a single data set. For this purpose, we used a commercial AI tool called KnowledgeSEEKERTM ( ANGOSS Software, 1994). This software develops classification trees by recursive partitioning in which the question used to split the re- maining data set at each state is selected based on statistical significance with multiple comparisons via Bonferroni adjustment factors. The Bonferroni adjustment factors allow for the fact that the test of significance is modified because of the grouping of categories ‘by the highest level of significance (Biggs et al. 199 1). Applied to the CSM data set, this tool tended to produce short, robust rules for making probabilistic predictions about the effects of different experience vectors on customer ratings of uswc’s performance. The resulting rule sets provide convenient, empirically based approximations to the unknown F( y 1 x). They identify several systematic nonlinearities in customer responses and au- tomatically exploit these nonlinearities to reveal cost-effective strategies for improving uswc’s CSM grades.

An Illustrative Classification Tree

Figure 1 provides an example of a classification tree grown by applying KnowledgeSEEKER TM to 1992 customer data. The dependent variable is the frequency distribution of grades (measured from a low of 1 to a high of 4) for “overall quality of service.” (This variable has been found to be strongly associated with other outcome measures of interest, such as customers’ own reported assessments of their loyalty to US WEST.) The unconditional frequency distribution of grades can be approximated by its empirical realization (top node), which assigns a probability mass of 63.4% to the top grade, 28.5% to the next highest, 4.4% to a 2, and 3.8% to a 1.

For the 800 customers surveyed, the single variable that most reduced expected pre- diction error (i.e., error variance, the criterion used in preparing this diagram), was the Technician promptness rating (this factor is named for illustrative purposes only-the actual factors emerging in our analysis cannot be revealed for competitive reasons). This variable was scored from 1 (lowest) to 10 (highest), with missing data being assigned an arbitrary code of 98. Only 35.4% of customers who rate Technician promptness l-3 give a 4 for overall quality. By contrast, 82.8% of customers who give a rating of 9 or 10 for Technician promptness give the top grade for overall quality. While such statistical as- sociations may in part be explained by artifacts (e.g., some customers may tend to give high grades on all items while others tend to give low grades on all items), these numbers

PROCESS IMPROVEMENT IN TELECOMMUNICATIONS 223

FIGURE 1. Sample KnowledgeSEEKER TM classification tree. The first four numbers in each box are the percentage of customers giving US WEST an overall rating of 1 (worst), 2, 3, or 4 (best). The final number is the number of customers in that category. The factors appearing in this tree are for illustrative purposes only. Their actual names cannot be revealed due to competitive pressures.

are enough to suggest that showing up on time may play a key role in determining customer perceptions of value.

Reading the rest of the tree similarly, we may extract a set of powerful factors for predicting the overall quality score. This technique goes well beyond simply selecting those factors having the highest correlation with the overall quality score since the tree- structured technique takes into account combinations of multiple factors. The factors selected by this tree-structured technique tend to be those that work well in combination and those that are robust with regard to increases in sample size.

Interpretation as a Probabilistic Learning System

Each classification tree can be interpreted as a set of probabilistic expert system rules in standard (Horn-clause) form, augmented by a conditional cdf or pdf for the possible values of y (in place of more problematic and less informative conjidence factors). From this perspective, tree-growing programs can be described as methods for “learning rules from data.” The rules learned tend to be the shortest ones that have high descriptive and predictive power. They are found by simply traversing paths from the root to the leaf nodes in the classification tree. For example, in Figure 1, the rule “(The probability of receiving a 4 exceeds 90%) if [ ( Technician-promptness-rating = 9 or 10) and (Rep-promptness-rating = 10) and (Ease-of-getting-through-to-rep) = 10 ] ” cor- responds to traversal of the a path at the very right side of the tree. By multiple hypothesis testing techniques, this rule (possibly with a modification of the 90% level as stated) could be validated at the 95% confidence level.

Binary versus Multiple Category Variables

Most CSM questionnaires contain a mixture of binary (Yes/ No) and graded responses, many with over 10 (ordered) response categories. We noticed that KnowledgeSEEKERTM

224 TONY COX, GEORGE BELL AND FRED CLOVER

tends to split preferentially on questions with a large number of ordered response cate- gories. One explanation of this phenomenon is the following: since the algorithm attempts to choose a split which differentiates customers as much as possible, the questions which will differentiate them the most are likely to be those that contain the most information, i.e., response categories. This phenomenon would also be expected to result when some customers are hard graders and others easy graders. For example, consider the following pair of questions:

Qr : On a scale of l-10, where 1 is worst and 10 is best, how would you rate the phone representative on how well they explained the products and services you were interested in? (Response categories: l-10, or N/A)

QZ : Did the phone representative satisfactorily explain the products and services you were interested in? (Response categories: Yes, No or N/A)

The set of customers S, answering “10” to Qi is likely to contain a higher proportion of “easy graders” than the set of customers S2 answering “Yes” to Q2, which would lead to a higher overall grade among the customers S1 than &-exactly the trend seen in the actual data. In the data sets we have analyzed, we believe that both actual differences in customer experiences and good graders contribute to KnowledgeSEEKER’s preference for questions with a large number of ordered response categories, like Q, . Whatever the reason for this phenomenon, one implication is that to place all questions on an equal footing, surveys should be designed so that all questions with ordered response categories have the same number of categories.

One can easily convert a question with many response categories into a binary variable by combining response categories. The search for effective strategies is especially simple with binary variables. The effects of different subsets of changes were estimated from the marginal frequency distributions in multidimensional contingency tables formed using the actionable variables as independent variables and proportions of high (or low) grades as the dependent variable. An action is idealized as shifting the proportion of people who answer a specific, actionable question positively from its current empirical value to 100%. When the number of actionable variables to be considered is not greater than about 12, it is practical to enumerate all subsets and identify the ones that give the best distributions of grades. While there is usually a trivial solution-fix everything, so that only positive answers occur-it is more useful to search for “best” (undominated) subsets of size k for k I 4. In our experience, these best small subsets tend to have the following two desirable robustness properties:

Pi : The best subset of size k is included in one of the p best subsets of size (k + 1). Often, this statement is true for p = 1. When p = 1, this indicates that the notion of best subset is stable under size increase.

P2: The marginal improvement in the distribution of grades from fixing more factors (measured by the increase in percentage of As or by the decrease in the percentage of poor grades) tends to decrease rapidly once k exceeds about 3 or 4.

A sample data set from a particular US WEST market unit is used to illustrate these two properties. The data consists of 749 customer responses to 8 binary questions (factors) plus the binary response grade of whether or not they gave US WEST an overall A grade. The kth column of Figure 2 shows a ranking of subsets of size k by the degree to which fixing these factors raises the CSM grade. In this context, the impact ofjixirzg a factor is to calculate the CSM score among those customers who responded in the positive sense to the question attached to that factor. Due to the large number of subsets for k 2 1, we only show the first and last four subsets of size k. A line connects sets of factors in adjacent columns when one set of factors is a subset of another. Property P, is clearly evident for the highest ranking combination of factors (and also for the lowest ranking combination). Clearly, for any value of p 2 1 there exist data sets for which property P1 does not hold. However, these are rare in our experience, even when p = 1.


FIGURE 2. Ranking of combinations of binary factors by their ability to raise the CSM score, for subset sizes k = 1 through 4 (the top four and bottom four subsets are shown in each column). Note that the best subset of factors of size k is contained within the best subset of factors of size k + 1.

Note that factor 1 is individually the most effective at raising the percentage of As. However, factor 3 appears in all four of the top-ranked subsets of size 4 (and indeed, in every one of the top 21 ranked subsets of size 4). This suggests that factor 3 is a critical ingredient in any improvement involving more than three factors; in this sense, it is most different from, or orthogonal to, the other factors. Such conclusions illustrate the power of this technique, which goes well beyond a traditional Pareto analysis in revealing the interactions between factors.
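The "critical ingredient" observation above can be automated by counting how often each factor appears among the top-ranked subsets. The sketch below uses synthetic data in the shape described in the text (749 customers, 8 binary factors, a binary A grade); the data generator and scoring rule are assumptions, not the authors' code.

```python
from collections import Counter
from itertools import combinations
import random

random.seed(1)

# Synthetic stand-in for the survey data: the A grade is loosely
# driven by how many answers are positive (an assumption).
customers = []
for _ in range(749):
    answers = [random.random() < 0.7 for _ in range(8)]
    customers.append((answers, sum(answers) + random.randint(0, 2) >= 8))

def fixed_score(subset):
    """A-grade rate among customers already positive on `subset`."""
    kept = [g for a, g in customers if all(a[i] for i in subset)]
    return sum(kept) / len(kept) if kept else 0.0

def factor_frequency(k, top):
    """Count how often each factor appears among the `top` best subsets
    of size k; a factor present in every one of them is a candidate
    'critical ingredient' in the sense used in the text."""
    ranked = sorted(combinations(range(8), k), key=fixed_score,
                    reverse=True)[:top]
    return Counter(f for s in ranked for f in s)

print(factor_frequency(4, 21))
```

A factor whose count equals `top` (here, 21) belongs to every one of the highest-ranked strategies, which is exactly the pattern the text reports for factor 3.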

In Figure 3, we show the impact of fixing combinations of factors. The greatest improvement is made by fixing all eight factors; however, the best combination of four factors achieves over 80% of this maximum. Figure 3 exhibits property P2: it is important to select the right combination of factors, but only a small subset (three or four factors) needs to be chosen to achieve a large fraction of the improvement that can be made by fixing all factors.

In other words, most of the benefit from improving actionable variables is expected to be achieved after the first few changes have been made, and a core set of changes with the most benefit is often easy to identify. These two features make it possible to draw useful, robust conclusions that are simple to communicate.
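The diminishing-returns curve of Figure 3 can be reproduced on synthetic data by expressing the best achievable score for each subset size as a fraction of the total possible improvement. This is a sketch under assumed data (grades loosely driven by the factors), not the authors' analysis.

```python
from itertools import combinations
import random

random.seed(2)

# Synthetic data: grades loosely driven by the factors, so that
# fixing factors actually helps (an assumption of the sketch).
customers = []
for _ in range(749):
    a = [random.random() < 0.7 for _ in range(8)]
    customers.append((a, sum(a) + random.randint(0, 2) >= 8))

def fixed_score(subset):
    """A-grade rate among customers already positive on `subset`."""
    kept = [g for a, g in customers if all(a[i] for i in subset)]
    return sum(kept) / len(kept) if kept else 0.0

base = fixed_score(())                  # current overall A rate
ceiling = fixed_score(tuple(range(8)))  # A rate with everything fixed

def fraction_of_max(k):
    """Share of the total possible improvement captured by the best
    subset of size k (the quantity plotted in Figure 3)."""
    best = max(fixed_score(s) for s in combinations(range(8), k))
    return (best - base) / (ceiling - base) if ceiling > base else 1.0

for k in range(1, 9):
    print(k, round(fraction_of_max(k), 2))
```

On data with property P2, this loop plateaus near 1.0 by k = 3 or 4, matching the observation that the best four-factor strategy captures over 80% of the maximum.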

Refinement: From Prediction to Strategy

In the CASI project, we further refined the distinction between actionable and nonactionable variables by assigning a cost of change to each variable and treating nonactionable variables as infinitely costly to change. Incorporating this additional layer of knowledge


FIGURE 3. Range of improvement gained by fixing a specific number of factors, for subset sizes k = 0 through 8. The vertical scale shows the range of improvement in CSM score, stated as a percentage of the total possible improvement from fixing all eight factors.

into the tree-growing procedure led to implementable strategies rather than merely high-quality predictions. We represented these strategies as countermeasure transition diagrams showing how predicted CSM grade distributions change with the implementation of different subsets of countermeasures. Such diagrams, summarizing the actionable recommendations from the AI data analysis, became the basis at US WEST for suggesting CSM targets applicable to different parts of the business (e.g., residential repair, small business billing, etc.). In addition, we used them to characterize specific strategies for achieving these targets. Based on the results of this analysis, a new approach to scheduling technician time was implemented in Salt Lake City. As predicted, the percentage of customers rating USWC at 4 increased promptly and dramatically among customers in the trial area. We emphasize the importance of coordinating the various components of implementation to achieve such outcomes.
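The cost-of-change refinement described above amounts to constrained subset search: nonactionable variables receive infinite cost, so no finite-budget strategy can include them. The sketch below illustrates the idea; the cost values, budget, and synthetic data are all hypothetical, introduced only for the example.

```python
import math
from itertools import combinations
import random

random.seed(3)

# Hypothetical costs of change per variable. Nonactionable variables
# (here, factors 4 and 6) are modeled as infinitely costly, so no
# affordable strategy can include them.
COSTS = {0: 3.0, 1: 1.5, 2: 2.0, 3: 1.0, 4: math.inf,
         5: 2.5, 6: math.inf, 7: 1.0}
BUDGET = 5.0  # hypothetical total cost-of-change budget

customers = [([random.random() < 0.7 for _ in range(8)],
              random.random() < 0.4) for _ in range(749)]

def fixed_score(subset):
    """A-grade rate among customers already positive on `subset`."""
    kept = [g for a, g in customers if all(a[i] for i in subset)]
    return sum(kept) / len(kept) if kept else 0.0

def best_affordable(k):
    """Best size-k strategy whose total cost of change fits the budget."""
    feasible = [s for s in combinations(range(8), k)
                if sum(COSTS[i] for i in s) <= BUDGET]
    return max(feasible, key=fixed_score, default=None)

print(best_affordable(3))
```

Treating nonactionability as infinite cost keeps a single selection mechanism: the infeasible subsets simply never enter the candidate list, so the same ranking machinery yields implementable strategies rather than unconstrained predictions.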

5. Conclusions and Next Steps

The CSM data analysis conclusions with the largest impact at US WEST include identifying realistic targets for CSM improvement and suggesting specific areas to focus on to achieve these improvements.

Additional conclusions quantified the expected impact on grades of changes in these areas, identified other actionable variables strongly associated with these three, and suggested areas where more precise wording of questions was expected to be most useful in improving the prediction of impacts and the identification of high-impact changes.

The CSM data analysis described in this paper is the basis for a no-frills approach to process quality improvement and process re-engineering being developed at US WEST Technologies. Our goal for this approach is to move from data to critical management decisions as quickly as possible by using rapid exploratory data analysis (including the CSM data analysis) to identify the areas of action where more careful study is expected to yield the greatest benefits. Such data-driven focusing helps to condense several of the steps of problem selection, formulation, and refinement that can otherwise consume weeks or months of a quality improvement team's time.

The use of the CSM data interpretation program and classification trees to guide process improvement priorities has now spread to over half a dozen market units, and the CSM


results and recommendations for our mass markets are being routinely updated and fed into resource allocation decision making. The more difficult tasks of identifying and implementing specific changes within the areas targeted by the CSM data analysis are ongoing in most parts of USWC. The results of implemented countermeasures and changes in CSM will be carefully monitored over the next year and used to refine the CSM data modeling process, in order to better predict customer purchasing behavior, as opposed to customer perceptions and grades.

Finally, the new approaches to data analysis, classification tree optimization, and automated identification of promising process improvement strategies developed through the CASI project appear promising for reuse in several new business and engineering applications, possibly including low-cost network fault diagnosis and efficient indexing of text and multimedia documents. CASI has selected this collaboration as the sole recipient of its 1993 prize for achieving its goal of a successful industry-university partnership leading to a commercially valuable technology innovation that improves business in Colorado.¹

¹ Note: A previous version of this paper has been published in The Annals of Operations Research. KnowledgeSEEKER™ is a trademark of ANGOSS Software, Suite 201, 430 King Street West, Toronto, Canada M5V 1J5, Tel: (416) 593-1122, Fax: (416) 593-5077.


