
A methodology for calculating sample size to assess localized corrosion of process components

Date post: 05-Sep-2016
Upload: mohamed-khalifa

Contents lists available at ScienceDirect

Journal of Loss Prevention in the Process Industries 25 (2012) 70-80

journal homepage: www.elsevier.com/locate/jlp

A methodology for calculating sample size to assess localized corrosion of process components

Mohamed Khalifa*, Faisal Khan, Mahmoud Haddara

Process Engineering, Faculty of Engineering & Applied Science, Memorial University, Prince Philip Dr., St John's, NL A1B 3X5, Canada

Article info

Article history:
Received 15 April 2011
Received in revised form 8 June 2011
Accepted 10 July 2011

Keywords: Inspection; Sample size; Localized corrosion; Extreme value method; Bootstrap; Gumbel probability distribution

* Corresponding author. E-mail address: [email protected] (M. Khalifa).

0950-4230/$ - see front matter © 2011 Elsevier Ltd. doi:10.1016/j.jlp.2011.07.004

Abstract

This paper proposes a methodology to estimate the sample size required to assess, with a specified precision, the localized corrosion of process components. The proposed methodology uses the extreme value and bootstrap methods. The estimated sample size ensures that the maximum localized corrosion predicted with the extreme value method is within an acceptable margin of error at a specified confidence level. Using the results of the proposed methodology, an equation is introduced to calculate sample size as a function of the acceptable margin of error, the population size, the standard deviation of the corrosion data and the required confidence level. The probability of exceedance of a critical limit of localized corrosion is also estimated. The methodology is explained through a case study of localized corrosion in process piping.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Process components are subjected to several structural damage mechanisms during their operational life. The most common damage mechanisms are:

A. Corrosion
Corrosion results in metal loss, pitting, cracks and/or degradation of material properties due to changes in the material microstructure. Corrosion can be general or localized. General corrosion is metal loss widely distributed over the surface area of an asset. Localized corrosion results in localized metal loss or cracks at small, separate areas of the material surface; examples are pitting corrosion, galvanic corrosion, crevice corrosion, stress corrosion cracking and hydrogen-induced cracking.

B. Fatigue
Fatigue occurs due to fluctuating stress (mechanical fatigue) or fluctuating temperature (thermal fatigue). Fatigue causes initiation and growth of cracks, especially at locations of material discontinuities, until the crack size reaches a critical limit at which the asset is no longer able to resist the applied load.

C. Creep
Creep is continuous plastic deformation that occurs when an asset is subjected to a load for a long period. Creep deformation is accelerated at high temperatures. Creep is accompanied by microscopic voids that eventually lead to macroscopic cracks and crack growth.

D. Corrosion-fatigue-creep interaction
An asset can be subjected to combined degradation mechanisms such as corrosion, fatigue and creep. The degradation is accelerated when several mechanisms are present at the same time.

Due to variability in operating conditions and uncertainty in how they may affect asset health, inspection is undertaken to ensure that assets are performing their intended function and that deterioration is not threatening asset integrity. A practical problem is to decide the sample size required to ensure that the inspection sample represents the population. Few studies address the calculation of sample size for localized corrosion, and there is still no clear consensus on this problem, but it is widely believed that the larger the sample size, the smaller the error of the sample estimate (Kowaka, Tsuge, Akashi, Katsumi, & Ishimoto, 1984 and Alfonso, Caleyo, Hallen, Esplna-Hernandez, & Escamilla-Davish, 2008).

This problem was studied previously by Shibata (1991), Schneider, Muhammed, and Sanderson (2001), Wang (2006) and Alfonso et al. (2008). Shibata (1991) showed from historical data that localized corrosion may be modeled with a Gumbel extreme value distribution whose location parameter (λ) and scale parameter (S) change with time, while the ratio (S/λ) remains approximately constant irrespective of time for a given material in the same environment. Shibata (1991) plotted sample size (n) versus return period (T) for different ratios (S/λ) estimated using the minimum variance linear unbiased estimator method. From this plot, the optimum sample size is obtained for a given return period. Schneider et al. (2001) estimated the required inspection area by investigating the dependence between data points at different distances from each other. Wang (2006) estimated the number of tubes required to assess the minimum remaining tube thickness in industrial heat exchangers subjected to corrosion. Alfonso et al. (2008) aimed to estimate the optimum sample size and unit inspection area in a long pipeline based on the required accuracy of the estimate of maximum pit depth in un-pigable, buried pipelines using the extreme value method. They performed extensive Monte Carlo simulations to estimate the mean square error of the estimated maximum localized corrosion as an indicator of the accuracy of the estimate for different sample sizes.

Abbreviations

AI          Unit inspection area
AT          Total inspection area
ANOVA       Analysis of variance
CI          Confidence interval
COEmax      Coefficient of error of the maximum
COEmean     Coefficient of error of the mean
F(xmax)     Cumulative distribution function of xmax
FPB         Finite population bootstrap
FPCF        Finite population correction factor
MOEaccept   Acceptable margin of error
MOEmax      Margin of error of the maximum
MOEmean     Margin of error of the mean
n           Inspection sample size
N           Population size
nb          Bootstrap sample size
POE         Probability of exceedance
S           Scale parameter
SEmax       Standard error of the maximum
SEmean      Standard error of the mean
T           Return period
ti          Time of the ith inspection
xi          Maximum localized corrosion at ti
xmax        Maximum value
α           Significance level
λ           Location parameter
μ           Population mean
μs          Sample mean
σ           Population standard deviation
σs          Sample standard deviation
Φ⁻¹         Inverse of the standard normal distribution

API 570 (2009) (piping inspection code) provides excellent recommendations for conducting good piping inspections; however, it does not provide specific guidelines for determining the inspection sample size. In API 570, inspection sample size is left to the inspection practitioner's judgment (Hobbs & Ku, 2002).

API 581 (2000) (risk-based inspection resource document) considers risk as a basis for inspection planning. Although API 581 provides guidance for determining the frequency of inspections, it does not provide specific guidance for determining the minimum number of locations to be inspected to represent the condition of the complete system.

The extreme value method is widely used to predict the maximum localized corrosion over the entire population using sample data. A major limitation in applying the extreme value method to corrosion assessment is that the sample size affects the accuracy of the extreme value prediction (Jarrah et al., 2011). The proposed methodology calculates the sample size that must be used with the extreme value method to predict, with a specified precision, the maximum localized corrosion of process components.

2. Proposed methodology to estimate sample size for localized corrosion

2.1. Key assumptions

1- The sample data is independent.
2- The measurement error is negligible.
3- The sample data (the maximum corrosion in each inspected component/area) follows a Gumbel extreme value distribution.

2.2. The proposed methodology

The proposed methodology comprises the following main parts, as shown in the methodology flowchart (Fig. 1):

2.2.1. Part 1-layering separation

Through layering separation, the equipment of an installation subjected to corrosion is classified into groups or areas. The groups obtained by this classification process are usually referred to as corrosion circuits or loops. A corrosion circuit (loop) is a group of similar assets in the plant that have the same material and are exposed to the same corrosion conditions. Each group is considered a population from which sampling is required. The objective of layering separation is to reduce the sources of variability in the inspection data within each group. This helps to reduce the required sample size when sampling randomly within a group, because sample size is strongly dependent on the standard deviation, σ, of the population.

2.2.2. Part 2-physical sampling within each group

A randomly selected number of components/areas within a group is inspected. Only the maximum localized corrosion of each component/area is recorded and fitted to a Gumbel extreme value distribution.

2.2.3. Part 3-bootstrap sampling and extreme value analysis

2.2.3.1. Use of bootstrap sampling methods to estimate standard error and confidence interval. The standard error is a measure of the accuracy of an estimator obtained from sample data. For statistics other than the mean, there is generally no simple exact formula for the standard error.

The standard error of the mean, SEmean, is given by:

$$\mathrm{SE}_{\mathrm{mean}} = \frac{\sigma}{\sqrt{n}} \qquad (1)$$

where σ is the standard deviation of the population and n is the sample size.

The width of the confidence interval, CI, is estimated as a multiple of the standard error, SE, as follows:

$$\mathrm{CI} = 2\,\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\mathrm{SE} \qquad (2)$$

where Φ⁻¹ is the inverse of the standard normal cumulative distribution function and (1 − α) is the confidence level.

Fig. 1. Proposed methodology flowchart. The flowchart steps are:

Part 1
i. Layering separation into groups.

Part 2
ii. Sample with size n from a group (population).

Part 3
iii. Fit Gumbel distribution to inspection data.
iv. Generate a bootstrap population of size N following the fitted Gumbel distribution.
v. Let the bootstrap sample size = nb ≤ N.
vi. Draw a bootstrap sample without replacement from the bootstrap population.
vii. Fit Gumbel distribution to the drawn bootstrap sample.
viii. Predict the maximum corrosion with the extreme value method.
ix. Calculate the bootstrap standard error of the maximum (SEmax).
x. Estimate the coefficient of error of the maximum (COEmax).
(Loops: bootstrap replications, next nb, next bootstrap population.)

Part 4
xi. Average COEmax for each sample size nb ≤ N, plot average COEmax versus nb and obtain the sample size nb corresponding to the acceptable COEaccept.
Decision: is the obtained nb ≤ n? If no, increase n to nb. If yes, go to step xii.
xii. The number of all inspected components/areas is the required sample size.
(Then proceed to the next group.)

The margin of error of an estimator obtained using sample data is the maximum acceptable deviation of this estimator from the population value. It expresses the required precision of the estimate. For example, if the desired precision is set at ±0.1 mm, then the margin of error is 0.1 mm.

The margin of error is expressed as half the width of the confidence interval. From Eqs. (1) and (2), the margin of error of the mean, MOEmean, is given by:

$$\mathrm{MOE}_{\mathrm{mean}} = \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\frac{\sigma}{\sqrt{n}} \qquad (3)$$
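Eqs. (1)-(3) can be evaluated numerically as in the following minimal Python sketch. The function names and the input values (σ = 1.0, n = 25, α = 0.05) are illustrative assumptions and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist


def z_value(alpha: float) -> float:
    """Phi^-1(1 - alpha/2) for a two-sided interval at confidence level 1 - alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def standard_error_mean(sigma: float, n: int) -> float:
    """Eq. (1): standard error of the mean."""
    return sigma / sqrt(n)


def confidence_interval_width(se: float, alpha: float) -> float:
    """Eq. (2): full width of the confidence interval."""
    return 2 * z_value(alpha) * se


def margin_of_error_mean(sigma: float, n: int, alpha: float) -> float:
    """Eq. (3): half the width of the confidence interval of the mean."""
    return z_value(alpha) * standard_error_mean(sigma, n)


# Hypothetical inputs, chosen only for illustration.
sigma, n, alpha = 1.0, 25, 0.05
se = standard_error_mean(sigma, n)
print(se, confidence_interval_width(se, alpha), margin_of_error_mean(sigma, n, alpha))
```

By construction, the margin of error is exactly half of the confidence interval width.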

In the case of localized corrosion, there is no closed-form analytical expression for the confidence interval, and therefore for the margin of error, of the maximum localized corrosion. Bootstrap sampling is used in the proposed methodology for this purpose. Bootstrap sampling is a simulation of the physical sampling process. It is a convenient tool that requires fewer assumptions and less analytical computation.

In bootstrapping, a resample, called a bootstrap sample, can be drawn randomly using one of three main procedures:

i. From the sample data itself (with replacement). This procedure is referred to as the non-parametric bootstrap and was first proposed by Efron (1979) to estimate the standard error and confidence interval when the standard methods cannot be used.

ii. From a created virtual population (either with or without replacement). This population is called the bootstrap population. An example of this procedure is the finite population bootstrap (FPB), which was first introduced by Gross (1980). The simplest case is when the population size, N, is a multiple of the sample size, n (i.e., N = C·n). In this case the bootstrap population is created by repeating the sample C times (Cohen, 1997). In the proposed methodology, the FPB method without replacement is used to estimate the confidence interval of the maximum corrosion predicted with the extreme value method. The bootstrap population is generated using Monte Carlo simulation following the Gumbel extreme value distribution of the sample data.

iii. From a distribution fitted to the sample data, assuming that the sample distribution is an approximation to the population distribution. This may be called a Monte Carlo procedure (Brooker & Geoffery, 2004) and is also referred to as the parametric bootstrap. This procedure is useful where a parametric model that fits the population distribution is known. It is more accurate than the non-parametric bootstrap (Efron & Tibshirani, 1993).

The steps of bootstrapping to estimate standard error andconfidence intervals are summarized as follows:

a) Bootstrap sample:
Draw a bootstrap sample.

b) Bootstrap statistic:
A statistic such as the mean, median or maximum is evaluated for a large number of bootstrap samples.

c) Bootstrap distribution:
The bootstrap distribution of the bootstrap statistic is obtained.

d) Bootstrap standard error and confidence interval:
The standard error and confidence interval of the statistic are estimated as the standard deviation of the bootstrap distribution and the confidence interval for its mean, respectively.

Fig. 2 shows a flowchart of bootstrapping to estimate standarderror and confidence intervals.
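Steps (a)-(d) can be sketched in Python as follows for procedure (i), the non-parametric bootstrap with replacement. This is an illustration only: the function name, the replication count and the data are assumptions, and the interval is taken as a normal-theory interval around the mean of the bootstrap distribution, which is one possible reading of step (d).

```python
import random
import statistics
from statistics import NormalDist


def bootstrap_se_ci(data, statistic, replications=2000, alpha=0.05, seed=0):
    """Return (standard error, (lower, upper)) of `statistic` by bootstrapping `data`."""
    rng = random.Random(seed)
    n = len(data)
    # (a) + (b): draw bootstrap samples with replacement and evaluate the statistic.
    values = [statistic(rng.choices(data, k=n)) for _ in range(replications)]
    # (c) + (d): the spread of the bootstrap distribution is the standard error.
    se = statistics.stdev(values)
    center = statistics.mean(values)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return se, (center - z * se, center + z * se)


# Hypothetical pit depths in mm, for illustration only.
depths = [1.2, 0.8, 1.9, 1.1, 1.5, 0.9, 2.3, 1.4, 1.0, 1.7]
se_mean, ci_mean = bootstrap_se_ci(depths, statistics.mean)
print(se_mean, ci_mean)
```

The same function accepts `max` or `statistics.median` as the statistic, which is the case where no simple standard-error formula is available.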

2.2.3.2. Use of the extreme value statistical method to predict the maximum localized corrosion. The extreme value distribution is classified into three types (Type I, Type II and Type III) for two cases (maximum values and minimum values). Type I (in the case of maximum values) is known as the Gumbel distribution. It is common practice to use the Gumbel distribution to represent the probability distribution of maximum localized corrosion; see Kowaka et al. (1984). The cumulative probability of the Gumbel distribution is given by:

$$F(x_{\max}) = \exp\left\{-\exp\left[-\frac{x_{\max} - \lambda}{S}\right]\right\} \qquad (4)$$

where F is the cumulative distribution function of the random variable xmax (maximum value), and λ and S are the location and scale parameters, respectively. Several methods can be used to estimate λ and S, such as the maximum likelihood method or fitting a straight line on Gumbel probability paper.

The mean, μ, and standard deviation, σ, are expressed in terms of the location and scale parameters as follows:

$$\mu = \lambda + \gamma S \qquad (5)$$

$$\sigma = \frac{\pi}{\sqrt{6}}\,S \qquad (6)$$

where γ = 0.57722 is Euler's constant.

Fig. 2. Bootstrapping to estimate standard error and confidence intervals. (Flowchart: bootstrap sample, bootstrap statistic, bootstrap distribution, bootstrap standard error and confidence interval, with a loop over bootstrap replications.)
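Eqs. (4)-(6) translate directly into code. The Python sketch below also inverts Eqs. (5) and (6) to recover λ and S from a mean and standard deviation; this moment-based inversion is an assumption made here for illustration, since the text names maximum likelihood and probability-paper fitting as estimation methods. All function names and parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.57722  # Euler's constant, as quoted with Eq. (5)


def gumbel_cdf(x: float, loc: float, scale: float) -> float:
    """Eq. (4): Gumbel (Type I, maxima) cumulative probability."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_mean(loc: float, scale: float) -> float:
    """Eq. (5): mean in terms of location and scale."""
    return loc + EULER_GAMMA * scale


def gumbel_std(scale: float) -> float:
    """Eq. (6): standard deviation in terms of scale."""
    return math.pi / math.sqrt(6) * scale


def gumbel_params_from_moments(mean: float, std: float):
    """Invert Eqs. (5) and (6): return (location, scale) for a given mean and std."""
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


# Hypothetical parameters, for illustration only.
loc, scale = 2.0, 0.8
print(gumbel_mean(loc, scale), gumbel_std(scale), gumbel_cdf(3.0, loc, scale))
```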

Localized corrosion may be modeled by the extreme value distribution (Khan and Howard, 2007). Inspection data modeled by the extreme value distribution (case of maximum values) may be extrapolated to the whole population to predict the maximum corrosion size in uninspected areas (Kowaka et al., 1984; The Health and Safety Executive, 2002; ASTM G46-94, 2005).

To demonstrate how the maximum corrosion is predicted over the whole population with the extreme value method, consider a system having a total of N components/areas (a population of size N), of which a sample of n components/areas is inspected. The sample data points (the measured maximum corrosion for each component/area) are arranged in order of increasing rank. With the average rank method, the cumulative probability, F, of the ith data point is calculated as i/(n + 1), where i is the rank and n is the sample size (number of recorded maxima). The highest value of the sample cumulative probability is therefore F = n/(n + 1). This value corresponds to the maximum corrosion over the sample. Similarly, the highest value of the cumulative probability for the whole population of size N can be estimated as F = N/(N + 1). The Gumbel extreme value cumulative probability function is a straight line on Gumbel probability paper. Thus, the maximum localized corrosion over the entire population can be predicted by extrapolating the Gumbel probability plot linearly from point A to point B (Fig. 3).

The predicted maximum corrosion, xmax, corresponding to point B can also be estimated as follows:

$$x_{\max} = \lambda + S \ln T \qquad (7)$$

where S is the scale parameter and T is the return period, given by (see Alfonso et al., 2008; Kowaka et al., 1984; Shibata, 1991):

$$T = \frac{1}{1 - F_{\max}} \qquad (8)$$

where Fmax = N/(N + 1) is the maximum value of the cumulative distribution function. This leads to:

$$T = N + 1 \qquad (9)$$

As N is large in comparison with 1, T can be approximated as:

$$T \approx N = \frac{A_T}{A_I} \qquad (10)$$

where AT is the total area of the population (group) and AI is the unit inspection area.
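The prediction in Eqs. (7)-(9) can be sketched in Python as below. The sketch also evaluates the exact inverse of Eq. (4) at Fmax = N/(N + 1), which shows how close the logarithmic form of Eq. (7) is for a given N. The function names and the parameter values (λ = 2.0, S = 0.8, N = 100) are assumptions for illustration.

```python
import math


def return_period(N: int) -> int:
    """Eqs. (8)-(9): T = 1 / (1 - Fmax) with Fmax = N / (N + 1), i.e. T = N + 1."""
    return N + 1


def predicted_maximum(loc: float, scale: float, N: int) -> float:
    """Eq. (7): predicted maximum over a population of N components/areas."""
    return loc + scale * math.log(return_period(N))


def predicted_maximum_exact(loc: float, scale: float, N: int) -> float:
    """Exact inverse of Eq. (4) evaluated at Fmax = N / (N + 1), for comparison."""
    f_max = N / (N + 1)
    return loc - scale * math.log(-math.log(f_max))


# Hypothetical Gumbel parameters (mm) and population size.
loc, scale, N = 2.0, 0.8, 100
print(predicted_maximum(loc, scale, N), predicted_maximum_exact(loc, scale, N))
```

For large N the two values differ only slightly, because −ln(1 − 1/T) is close to 1/T.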

Thus, the return period T of localized corrosion can be interpreted as the number of unit inspection areas over which the maximum corrosion is observed. The scale of the return period, T = 1/(1 − F), is shown on the right-hand vertical axis of Fig. 3.

The extrapolation shown in Fig. 3 is valid provided that the statistical characteristics of the sample completely represent those of the whole population. Thus, determining the sample size needed to represent the whole population is important to ensure the precision of the extreme value method in predicting the maximum corrosion.

In the proposed methodology, a large number of bootstrap samples of different sizes nb ≤ N are drawn without replacement from generated bootstrap populations of the same size, N, as the original population. The maximum localized corrosion is predicted with the extreme value method for each bootstrap sample. The bootstrap standard error, SEmax, of the maximum corrosion predicted with the extreme value method is then estimated.
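The procedure described above can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not taken from the paper: the Gumbel parameters are fitted by the method of moments (via Eqs. (5) and (6)), the population maximum is predicted with Eq. (7) using T = N + 1, SEmax is divided by a supplied sample standard deviation, and the numbers of populations and replications are arbitrary. All function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.57722


def gumbel_draw(rng, loc, scale):
    """One random value from a Gumbel (maxima) distribution by inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def fit_gumbel_moments(sample):
    """Method-of-moments estimates of (location, scale); an assumed fitting method."""
    scale = statistics.stdev(sample) * math.sqrt(6) / math.pi
    loc = statistics.mean(sample) - EULER_GAMMA * scale
    return loc, scale


def average_coe_max(loc, scale, N, nb, sigma_s, populations=20, replications=100, seed=0):
    """Average ratio SEmax / sigma_s over several bootstrap populations."""
    rng = random.Random(seed)
    coes = []
    for _ in range(populations):
        # Bootstrap population of size N following the fitted Gumbel distribution.
        population = [gumbel_draw(rng, loc, scale) for _ in range(N)]
        predictions = []
        for _ in range(replications):
            sample = rng.sample(population, nb)  # without replacement
            b_loc, b_scale = fit_gumbel_moments(sample)
            predictions.append(b_loc + b_scale * math.log(N + 1))
        # SEmax is the spread of the predicted maxima for this population.
        coes.append(statistics.stdev(predictions) / sigma_s)
    return statistics.mean(coes)


# Hypothetical inputs: Gumbel parameters in mm, population of 100 components.
for nb in (20, 40, 80):
    print(nb, average_coe_max(2.0, 0.8, N=100, nb=nb, sigma_s=1.0))
```

Because sampling is without replacement from a finite population, the spread of the predicted maximum shrinks as nb approaches N.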

Fig. 3. Extrapolation of the Gumbel probability plot to predict the maximum localized corrosion. (Horizontal axis: localized corrosion size. Left vertical axis: cumulative probability, F. Right vertical axis: return period, T. Point A, at F = n/(n + 1), marks the maximum corrosion over the sample; point B, at Fmax = N/(N + 1) and T = N, marks the maximum corrosion over the population.)

2.2.4. Part 4-calculation of the required sample size to predict the maximum localized corrosion within each group

It is required to estimate the appropriate sample size within each group that provides an accurate description of the state of the whole group.

When the sampling objective is to evaluate the population mean, the sample size is estimated as follows:

a) When the standard deviation of the population, σ, is known, Eq. (3) leads to a sample size n, corresponding to a pre-defined acceptable margin of error, MOEaccept, given by:

$$n = \left[\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\frac{\sigma}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2 \qquad (11)$$

b) When the standard deviation of the population, σ, is unknown and has to be estimated from the data, one can use the t-distribution. In this case, Φ⁻¹(1 − α/2) and σ in Eq. (11) are replaced with t(1 − α/2, n − 1) and σs, respectively, as follows:

$$n = \left[t_{1-\alpha/2,\,n-1}\,\frac{\sigma_s}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2 \qquad (12)$$

where t(1 − α/2, n − 1) is the critical value of the t-distribution with (n − 1) degrees of freedom at probability (1 − α/2), and σs is the sample standard deviation. For a large sample size n (for example n > 50), the t-distribution approaches the standard normal distribution. In this case, the above two equations give approximately equal sample sizes. The solution for n in Eq. (12) should be obtained by trial and error because t(1 − α/2, n − 1) is itself a function of n.

c) If the population is finite, the standard deviation in Eqs. (11) and (12) is multiplied by a finite population correction factor (FPCF), given by (Bernstein & Bernstein, 1999):

$$\mathrm{FPCF} = \sqrt{\frac{N - n}{N - 1}} \approx \sqrt{1 - \frac{n}{N}} \qquad (13)$$

Eqs. (11) and (12) can be used without the FPCF when the population size (N) is infinite or large in comparison with the sample size, n (i.e., n ≤ 0.05N). When n/N is less than 0.05, the reduction in the sample size estimate due to the FPCF is usually of little importance.

Instead of multiplying the standard deviation by the FPCF, the sample size estimated from Eq. (11) can be divided by (1 + n/N), where n is the Eq. (11) estimate, as suggested by ASTM E122 (2009). This yields the following classical equation for calculating the sample size needed to evaluate the mean of a finite population:

$$n = \frac{\left[\Phi^{-1}\!\left(1 - \dfrac{\alpha}{2}\right)\dfrac{\sigma}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2}{1 + \dfrac{1}{N}\left[\Phi^{-1}\!\left(1 - \dfrac{\alpha}{2}\right)\dfrac{\sigma}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2} \qquad (14)$$
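Eqs. (11) and (14) can be evaluated as in the following Python sketch. The function name and the input values are illustrative assumptions. The t-distribution variant of Eq. (12) is omitted because it requires an iterative solution.

```python
from statistics import NormalDist


def sample_size_mean(sigma, moe_accept, alpha, N=None):
    """Sample size to estimate a population mean.

    With N=None this is Eq. (11) (infinite population); otherwise the
    finite-population form of Eq. (14) is applied.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = (z * sigma / moe_accept) ** 2  # Eq. (11)
    if N is None:
        return n0
    return n0 / (1 + n0 / N)  # Eq. (14)


# Illustrative inputs: sigma = 0.99 mm, MOEaccept = 0.5 mm, 95% confidence, N = 100.
n_infinite = sample_size_mean(0.99, 0.5, 0.05)
n_finite = sample_size_mean(0.99, 0.5, 0.05, N=100)
print(n_infinite, n_finite)  # round up to the next integer in practice
```

The finite-population result is always smaller than the Eq. (11) value and can never exceed N.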

In the case of localized corrosion, evaluation of the mean is not sufficient because failure is expected when the maximum corrosion at any location in the population exceeds the critical limit. Thus, the sampling objective is to predict the maximum corrosion, not the mean, over the whole population (inspected and uninspected components/areas). To achieve that, the following steps are undertaken in the proposed methodology:

i) The standard error of the population maximum (SEmax) is estimated as the standard deviation of the population maximum predicted using the extreme value method for a large number of bootstrap samples of sizes nb ≤ N (see Figs. 2 and 3).

ii) The ratio of the standard error of the population maximum (SEmax) to the standard deviation of the sample data (σs) is evaluated for each bootstrap sample. In this work, this ratio is referred to as the coefficient of error of the population maximum (COEmax) and is defined as follows:

$$\mathrm{COE}_{\max} = \frac{\mathrm{SE}_{\max}}{\sigma_s} \qquad (15)$$

iii) The margin of error of the population maximum, MOEmax, is expressed as half of the confidence interval of the maximum and is given by:

$$\mathrm{MOE}_{\max} = \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\mathrm{SE}_{\max} \qquad (16)$$

iv) From Eqs. (15) and (16), the coefficient of error of the maximum, COEmax, is given by:

$$\mathrm{COE}_{\max} = \frac{\mathrm{MOE}_{\max}}{\Phi^{-1}\!\left(1 - \dfrac{\alpha}{2}\right)\sigma_s} \qquad (17)$$

v) The acceptable coefficient of error, COEaccept, is estimated from Eq. (17) for a pre-defined acceptable margin of error, MOEaccept, as follows:

$$\mathrm{COE}_{\mathrm{accept}} = \frac{\mathrm{MOE}_{\mathrm{accept}}}{\Phi^{-1}\!\left(1 - \dfrac{\alpha}{2}\right)\sigma_s} \qquad (18)$$

vi) The bootstrap sample size, nb, is plotted versus COEmax estimated from Eq. (17). From this plot, the required sample size is obtained as the value of nb corresponding to the acceptable COEaccept.

2.3. Analysis of variance (ANOVA)

Analysis of variance was performed to check the significance of the factors that could affect COEmax. These factors are sample size (n), population size (N), mean (μ) and standard deviation (σs). The levels of these factors used for ANOVA are shown in Table 1. The units of the mean and standard deviation are consistent.

In ANOVA, factors and factor interactions with a P-value less than a specified significance level α (for example 0.01 or 0.05) are considered significant at confidence level (1 − α). The analysis of variance results showed that sample size and population size have a significant effect on COEmax (P-value = 0.000), while mean and standard deviation have an insignificant effect (P-value > 0.05). When the effect of a factor depends on the level of another factor, the two factors are said to interact. All factor interactions were found to have an insignificant effect (P-value > 0.05) except the interaction of sample size and population size (P-value = 0.000). This conclusion about the significance of the factors and factor interactions for COEmax was used to confirm the proposed equation for estimating sample size.

2.4. Proposed equation to estimate sample size

Eq. (14) is used to estimate the sample size required to evaluate the mean value of a population of size N. In the proposed methodology, our aim is to obtain an equation similar to Eq. (14) to estimate the sample size required to evaluate the population maximum instead of the population mean.

First, we estimate the sample size required to evaluate the population mean using the proposed methodology, in order to compare its results with the classical equation (Eq. (14)). In this case, step (viii) shown in Fig. 1 was replaced with an estimate of the mean, and the maximum was replaced with the mean in steps (ix), (x) and (xi). An equation is fitted to the results obtained with the proposed methodology as follows:

a) The sample mean and standard deviation are set at two levels (low and high) of 0.1 and 100 units. The scale and location parameters of the Gumbel distribution are estimated for each sample.

b) Bootstrap populations are generated following the sample Gumbel distribution with size, N, at levels of 10^2, 10^3, 10^4, 10^5, 10^6 and 10^7.

c) COEmean is estimated for all possible combinations of the levels of sample mean (μs), standard deviation (σs), population size (N) and bootstrap sample size nb ≤ N.

d) The estimated COEmean is plotted versus bootstrap sample size nb for different levels of population size N. For example, Fig. 4 shows COEmean versus bootstrap sample size, nb, for population size N = 100, with μs and σs each at the two levels of 0.1 and 100 units.

Table 1. Factor levels.

Factor                            Levels
Sample size, n                    20, 40, 60, 80, 100
Population size, N                10^2, 10^3, 10^4, 10^5, 10^6, 10^7
Sample mean, μs                   0.1, 100
Sample standard deviation, σs     0.1, 100

e) The results for COEmean are fitted to the following equation:

$$\mathrm{COE}_{\mathrm{mean}} = \sqrt{\frac{1}{n} - \frac{1}{N}} \qquad (19)$$

where COEmean is defined, similarly to Eq. (15), as follows:

$$\mathrm{COE}_{\mathrm{mean}} = \frac{\mathrm{SE}_{\mathrm{mean}}}{\sigma_s} \qquad (20)$$

The margin of error of the mean is expressed, similarly to Eq. (16), as follows:

$$\mathrm{MOE}_{\mathrm{mean}} = \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\mathrm{SE}_{\mathrm{mean}} \qquad (21)$$

From Eqs. (19)-(21), the proposed methodology yields the classical equation (Eq. (14)). Thus, the sample size obtained using the proposed methodology is the same as the sample size obtained with the classical method when the sampling objective is to estimate the population mean.
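The step from Eqs. (19)-(21) to Eq. (14) can be written out as follows, assuming that the population standard deviation σ in Eq. (14) is estimated by the sample value σs and that the margin of error of the mean is set equal to MOEaccept. Combining Eqs. (20) and (21) and substituting Eq. (19):

$$\mathrm{MOE}_{\mathrm{accept}} = \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\sigma_s\,\sqrt{\frac{1}{n} - \frac{1}{N}}$$

Solving for 1/n:

$$\frac{1}{n} = \left[\frac{\mathrm{MOE}_{\mathrm{accept}}}{\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\sigma_s}\right]^2 + \frac{1}{N}$$

Writing $n_0 = \left[\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right)\sigma_s / \mathrm{MOE}_{\mathrm{accept}}\right]^2$ for the infinite-population sample size of Eq. (11), this becomes:

$$n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}} = \frac{n_0}{1 + \dfrac{n_0}{N}}$$

which has the same form as Eq. (14).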

The methodology is then extended to estimate the sample size required to evaluate the population maximum, as shown in Fig. 1. An equation is then fitted to the results. As was done for the mean, the sample mean and standard deviation are set at two levels (low and high) of 0.1 and 100 units, and the bootstrap populations are generated with size, N, at levels of 10^2, 10^3, 10^4, 10^5, 10^6 and 10^7. COEmax is plotted versus bootstrap sample size nb for all possible combinations of the levels of sample mean (μs), standard deviation (σs) and population size (N). For example, Fig. 5A and B show COEmax versus bootstrap sample size, nb, for population sizes N = 10^2 and N = 10^7, respectively, with μs and σs each at the two levels of 0.1 and 100 units.

The results in Fig. 5A and B show that COEmax is a function of the sample size and population size but not of the mean and standard deviation, as was also evident from the ANOVA. The results for COEmax are fitted to the following proposed equation:

$$\mathrm{COE}_{\max} = f(N)\,\sqrt{\frac{1}{n} - \frac{1}{N}}, \quad N \le 10^7 \qquad (22)$$

where

$$f(N) = \begin{cases} 1.3\,N^{0.2}, & N \le 200 \\ 2.1\,N^{0.13}, & 200 < N \le 10^7 \end{cases}$$

Eq. (22) is similar to Eq. (19) except for the function f(N). The COEmax predicted using Eq. (22) is plotted on the horizontal axis versus the actual COEmax obtained using the proposed methodology on the vertical axis for different possible combinations of the levels of bootstrap mean, standard deviation and population size. For example, Fig. 6A and B show this plot for population sizes N = 10^2 and N = 10^7, respectively.

Fig. 6A and B each show approximately a straight line with a slope of 45°. This means that the predicted and actual COEmax are consistent.

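From Eq. (22):

$$n = \frac{1}{\left[\dfrac{\mathrm{COE}_{\max}}{f(N)}\right]^2 + \dfrac{1}{N}} \qquad (23)$$

From Eq. (18) and Eq. (23), the sample size required to predict the population maximum with a pre-defined MOEaccept can be calculated using the following equation:

$$n = \frac{1}{\left[\dfrac{\mathrm{MOE}_{\mathrm{accept}}}{f(N)\,\Phi^{-1}\!\left(1 - \dfrac{\alpha}{2}\right)\sigma_s}\right]^2 + \dfrac{1}{N}}, \quad N \le 10^7 \qquad (24)$$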
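Eq. (24), together with the piecewise function f(N) of Eq. (22), can be evaluated as in the Python sketch below. The function names are hypothetical, and the input values are used purely as an illustration of the arithmetic.

```python
from statistics import NormalDist


def f_of_N(N: float) -> float:
    """Piecewise factor f(N) fitted alongside Eq. (22)."""
    if N <= 200:
        return 1.3 * N ** 0.2
    if N <= 10 ** 7:
        return 2.1 * N ** 0.13
    raise ValueError("Eq. (22) was fitted only for N <= 10^7")


def sample_size_maximum(moe_accept, sigma_s, alpha, N):
    """Eq. (24): sample size to predict the population maximum within +/- moe_accept."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    coe_accept = moe_accept / (z * sigma_s)  # Eq. (18)
    return 1.0 / ((coe_accept / f_of_N(N)) ** 2 + 1.0 / N)  # Eq. (23)


# Illustrative inputs: MOEaccept = 0.5 mm, sigma_s = 0.99 mm, 95% confidence, N = 100.
n_required = sample_size_maximum(0.5, 0.99, 0.05, N=100)
print(n_required)  # round up to the next integer in practice
```

Because of the 1/N term in the denominator, the result can never exceed the population size N.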

Fig. 4. COEmean versus bootstrap sample size, nb, for population size, N = 100. (Series: proposed methodology and classical equation.)

The sample size estimated using the proposed equation (Eq. (24)) ensures that the maximum localized corrosion predicted using the extreme value method is within a pre-defined ±MOEaccept at a confidence level (1 − α).

Table 2 shows a comparison of the proposed equation andclassical equation to estimate sample size:

Fig. 5. A. Bootstrap sample size, nb, versus COEmax for N = 10^2. B. Bootstrap sample size, nb, versus COEmax for N = 10^7.

Fig. 6. A. Predicted versus actual COEmax for N = 10^2. B. Predicted versus actual COEmax for N = 10^7.

2.5. Probability of exceedance

Probability of exceedance (POE) is the probability that a corrosion flaw exceeds a specified critical limit at the inspection time. Once the sample size is determined and the maximum corrosion is predicted, the probability of exceedance at a specified critical limit may be determined. As the entire population is not inspected and limited sample data is used to represent the population, there is uncertainty in the POE estimate. This uncertainty can be handled using bootstrapping. A sufficient number of bootstrap populations are generated, from which bootstrap samples are drawn. The bootstrap POE is estimated for each bootstrap sample as one minus the bootstrap cumulative probability at the specified critical limit. The expected POE is the average of the bootstrap POE values (see Fig. 7).
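The bootstrap POE calculation described above can be sketched in Python as follows. This is an illustration under assumptions that are not taken from the paper: the Gumbel parameters are fitted by the method of moments, and the numbers of bootstrap populations and replications, the parameter values and the critical limits are arbitrary. All function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.57722


def gumbel_cdf(x, loc, scale):
    """Gumbel (maxima) cumulative probability, Eq. (4)."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_draw(rng, loc, scale):
    """One random Gumbel value by inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def fit_gumbel_moments(sample):
    """Method-of-moments estimates of (location, scale); an assumed fitting method."""
    scale = statistics.stdev(sample) * math.sqrt(6) / math.pi
    return statistics.mean(sample) - EULER_GAMMA * scale, scale


def expected_poe(loc, scale, N, nb, critical_limit, populations=20, replications=50, seed=0):
    """Average of bootstrap POE = 1 - F(critical_limit) over all bootstrap samples."""
    rng = random.Random(seed)
    poes = []
    for _ in range(populations):
        population = [gumbel_draw(rng, loc, scale) for _ in range(N)]
        for _ in range(replications):
            sample = rng.sample(population, nb)  # without replacement
            b_loc, b_scale = fit_gumbel_moments(sample)
            poes.append(1.0 - gumbel_cdf(critical_limit, b_loc, b_scale))
    return statistics.mean(poes)


# Hypothetical fitted parameters (mm) and critical limits (mm).
print(expected_poe(2.0, 0.8, N=100, nb=60, critical_limit=4.0))
print(expected_poe(2.0, 0.8, N=100, nb=60, critical_limit=6.0))
```

With the same seed, raising the critical limit lowers every bootstrap POE and therefore the expected POE.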

Table 2. Comparison of the proposed equation and the classical equation.

Equation
  Proposed equation (Eq. (24)): $n = 1 \big/ \left\{\left[\dfrac{\mathrm{MOE}_{\mathrm{accept}}}{f(N)\,\Phi^{-1}(1 - \alpha/2)\,\sigma_s}\right]^2 + \dfrac{1}{N}\right\}$, $N \le 10^7$
  Classical equation (Eq. (14)): $n = \left[\Phi^{-1}(1 - \alpha/2)\,\dfrac{\sigma}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2 \Big/ \left\{1 + \dfrac{1}{N}\left[\Phi^{-1}(1 - \alpha/2)\,\dfrac{\sigma}{\mathrm{MOE}_{\mathrm{accept}}}\right]^2\right\}$

Sampling objective
  Proposed equation: To estimate the population maximum
  Classical equation: To estimate the population mean

Method of estimation
  Proposed equation: The population maximum is estimated with the extreme value method using a sample of size n
  Classical equation: The population mean is estimated as the average of a sample of size n

Precision
  Both equations: The estimate error does not exceed a pre-defined MOEaccept at a specified confidence level (1 − α)

Sample size with same precision
  Proposed equation: Larger
  Classical equation: Smaller

3. Case study

Table 3 shows sample data of pitting corrosion in offshore process piping. The data represent the maximum pit depth, xmax, measured using an ultrasonic inspection technique in 30 straight piping segments (design thickness 13.5 mm). These segments were selected randomly over the piping.

The proposed methodology was applied to estimate the sample size required to predict the maximum pit depth using the extreme value method within ±0.5 mm (i.e., MOEaccept = 0.5 mm) at a 0.95 confidence level (i.e., α = 0.05) as follows:

Layering separation:

The total number of piping segments (population size, N) is 100. These segments represent one group as they are similar and subjected to the same corrosion conditions.

Fig. 7. Probability of exceedance (POE) calculation. Flowchart steps: (1) fit a Gumbel distribution to the inspection data; (2) generate a bootstrap population following the fitted Gumbel distribution; (3) draw a bootstrap sample without replacement from the generated bootstrap population; (4) estimate the bootstrap POE; (5) repeat steps 2 to 4 for the next bootstrap population until all bootstrap replications are completed; (6) the expected POE = average bootstrap POE.

Fig. 8. Gumbel probability plot of the inspection data.

Physical sampling:

The inspected segments are selected randomly over the piping.

Bootstrap sampling and extreme value analysis:

The data in Table 3 fall on a straight line in the Gumbel probability plot shown in Fig. 8, so they fit a Gumbel distribution. The Gumbel probability plot is obtained by plotting xmax versus −ln(−ln(F(xmax))). The slope of the straight line gives 1/S, the reciprocal of the scale parameter, and the intersection with the X-axis gives the location parameter λ. The standard deviation of the sample data, $\sigma_s$, is estimated as 0.99 mm.
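A Gumbel probability plot fit of this kind can be sketched as below, using the 30 pit depths of Table 3. The plotting position i/(n + 1) for the cumulative probability F and the least-squares line are assumptions made for illustration (the paper's own choices are not restated in this section), and the function name is hypothetical.

```python
import math
import statistics

# Maximum pit depths (mm) from Table 3.
PIT_DEPTHS_MM = [
    2, 2, 3.5, 2, 3.5, 3.5, 2, 3.5, 2, 3,
    2.5, 3, 5, 3.5, 3.5, 2, 3, 1, 3.5, 3,
    5, 2.5, 3, 3.5, 4.5, 3, 3.5, 3.5, 2, 1,
]


def gumbel_plot_fit(data):
    """Least-squares line through (x, -ln(-ln F)) on a Gumbel probability plot."""
    xs = sorted(data)
    n = len(xs)
    # Assumed plotting position: F_i = i / (n + 1).
    ys = [-math.log(-math.log(i / (n + 1))) for i in range(1, n + 1)]
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    scale = 1.0 / slope            # slope of the line is 1 / scale
    loc = -intercept / slope       # x value where the reduced variate is zero
    return loc, scale


loc, scale = gumbel_plot_fit(PIT_DEPTHS_MM)
sigma_s = statistics.stdev(PIT_DEPTHS_MM)  # sample standard deviation
```

The sample standard deviation computed this way rounds to the 0.99 mm quoted above; the fitted location and scale depend on the assumed plotting position.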

Bootstrap samples of different sizes nb ≤ N are drawn from bootstrap populations randomly generated following the fitted Gumbel distribution. The maximum localized corrosion is predicted with the extreme value method and the corresponding COEmax is estimated for each bootstrap sample.

The results of COEmax for different bootstrap sample sizes are shown in Fig. 9.

Sample size calculation:

From Eq. (18)

$$COE_{accept} = \frac{MOE_{accept}}{\Phi^{-1}\left(1-\frac{\alpha}{2}\right)\cdot\sigma_s}$$

$$COE_{accept} = \frac{0.5}{1.96 \times 0.99} = 0.26$$
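The arithmetic of Eq. (18) can be checked with a short sketch. The function name is hypothetical, and the standard normal quantile is taken from Python's `statistics.NormalDist` rather than the rounded value 1.96.

```python
from statistics import NormalDist


def coe_accept(moe_accept, alpha, sigma_s):
    """Eq. (18): acceptable margin of error scaled by the normal quantile and sigma_s."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return moe_accept / (z * sigma_s)


coe_first_30 = coe_accept(0.5, 0.05, 0.99)
```

With MOEaccept = 0.5 mm, α = 0.05 and a sample standard deviation of 0.99 mm, the result rounds to 0.26.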

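From Fig. 9, the bootstrap sample size at COEaccept = 0.26 is 58.

The proposed equation (Eq. (24)) can be used to calculate the sample size as an alternative to the full methodology, as follows:

$$n = \frac{1}{\left[\dfrac{MOE_{accept}}{f(N)\cdot\Phi^{-1}\left(1-\frac{\alpha}{2}\right)\cdot\sigma_s}\right]^2 + \dfrac{1}{N}}$$

$$f(N) = 1.3N^{0.2} = 1.3 \times 100^{0.2} = 3.265$$

$$n = \frac{1}{\left[\dfrac{0.5}{3.265 \times 1.96 \times 0.99}\right]^2 + \dfrac{1}{100}} = 62$$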
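A minimal sketch of Eq. (24) follows. The function name is hypothetical, the value of f(N) is passed in by the caller because its definition over different ranges of N is given elsewhere in the paper, and rounding the result up to the next integer is an assumption.

```python
import math
from statistics import NormalDist


def proposed_sample_size(moe_accept, alpha, sigma_s, pop_size, f_n):
    """Eq. (24): sample size needed to estimate the population maximum."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    ratio = moe_accept / (f_n * z * sigma_s)
    # Rounding up is an assumed convention for a whole number of segments.
    return math.ceil(1.0 / (ratio ** 2 + 1.0 / pop_size))


f_100 = 1.3 * 100 ** 0.2  # f(N) for N = 100, as used in the case study
n_required = proposed_sample_size(0.5, 0.05, 0.99, 100, f_100)
```

For the case-study inputs (MOEaccept = 0.5 mm, α = 0.05, standard deviation 0.99 mm, N = 100) this reproduces n = 62.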

Table 3
Recorded maximum pit depth (in mm) in 30 inspected piping segments.

2    2    3.5  2    3.5  3.5  2    3.5  2    3
2.5  3    5    3.5  3.5  2    3    1    3.5  3
5    2.5  3    3.5  4.5  3    3.5  3.5  2    1

The sample size obtained using the methodology (58) is close to the one obtained using the proposed equation (62).

The required sample size, n, to be inspected is assumed equal to the obtained bootstrap sample size, nb. The inspection is carried out for an additional 28 segments (i.e., total n = 58 inspected segments). Table 4 shows the recorded maximum pit depth for all 58 inspected piping segments.

The data shown in Table 4 fit a Gumbel distribution with scale parameter 0.71 mm and location parameter 2.31 mm. The standard deviation of all inspection data shown in Table 4 is 0.91 mm.

From Eq. (18), COEaccept is re-estimated based on the new data (Table 4) as 0.28, which corresponds to nb = 56 in Fig. 9.

The required sample size n can also be obtained using the proposed equation (Eq. (24)), which yields:

$$n = \frac{1}{\left[\dfrac{0.5}{3.265 \times 1.96 \times 0.91}\right]^2 + \dfrac{1}{100}} = 58$$

The number of inspected segments (58) is not less than the estimated sample size, so it is not necessary to inspect more piping segments.

Prediction of the maximum pit depth over the entire population (100 piping segments):

The maximum pit depth over the inspected sample is 5 mm, as shown in Table 4. The maximum pit depth over the entire population was predicted as 5.54 mm by extrapolation of the Gumbel extreme value distribution of the inspection data in Table 4. The margin of error in this prediction is 0.5 mm at 0.95 confidence level.
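One common way to extrapolate a fitted Gumbel distribution is the return-level formula with a return period equal to the population size. The sketch below assumes that convention (T = N = 100) and the rounded parameters 2.31 mm and 0.71 mm; the paper's exact extrapolation rule is defined in an earlier section and may differ, and the function name is hypothetical.

```python
import math


def gumbel_return_level(loc, scale, return_period):
    """Value exceeded on average once in `return_period` units (assumed T = N)."""
    p = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p))


x_max_100 = gumbel_return_level(2.31, 0.71, 100)
```

Under these assumptions the extrapolated maximum is about 5.58 mm, in the neighbourhood of the 5.54 mm reported above; the small difference is of the order expected from rounding the fitted parameters.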

Probability of exceedance (POE):

The critical limit of the maximum pit depth is assumed to be 75% of the design thickness (i.e., approximately 10 mm). The critical corrosion size in pipelines may be estimated from the principles of fracture mechanics or obtained from standards for the assessment of corroded pipelines such as ASME B31G (2009) or DNV-RP-F101 (2000). Following the flowchart shown in Fig. 7, the probability of exceedance (POE) of this critical limit is estimated as $2 \times 10^{-5}$.

Fig. 9. COEmax versus bootstrap sample size for population size of 100.

Table 4
Maximum localized corrosion (in mm) in 58 inspected piping segments.

2    2    3.5  2    3.5  3.5  2    3.5  2    3
2.5  3    5    3.5  3.5  2    3    1    3.5  3
5    2.5  3    3.5  4.5  3    3.5  3.5  2    1
2.5  2    3.5  3.5  2.5  2    2.5  2    2    3
2    2.5  2.5  3    1.5  3.5  1    2.5  5    3
2    2.5  2    2.5  2    2    2.5  2

Sample size to estimate the mean:

Suppose instead that the mean of the corrosion, rather than the maximum, is to be estimated from the sample data shown in Table 4 with the same margin of error (0.5 mm) and confidence level (0.95). Assuming that the standard deviation of the sample (0.99 mm) is an approximation to the standard deviation of the population (N = 100), the classical equation (Eq. (14)) yields a sample size of n = 15. The sample size required to estimate the population mean (n = 15) is thus much smaller than the sample size (n = 58) required to estimate the maximum of the same population with the same precision (i.e., margin of error). The sampling strategy therefore depends on what is actually being investigated, for example, the mean or the maximum.
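The classical calculation for the mean can be sketched as below, assuming Eq. (14) is the usual normal-approximation formula with an optional finite population term. The function name and the optional `pop_size` argument are illustrative, not taken from the paper.

```python
from statistics import NormalDist


def classical_sample_size(moe_accept, alpha, sigma, pop_size=None):
    """Sample size to estimate a population mean within moe_accept."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n0 = (z * sigma / moe_accept) ** 2
    if pop_size is None:
        return n0  # infinite-population form
    # Finite population term (assumed form of the correction).
    return n0 / (1.0 + n0 / pop_size)


n_infinite = classical_sample_size(0.5, 0.05, 0.99)
n_finite = classical_sample_size(0.5, 0.05, 0.99, pop_size=100)
```

Without the finite population term the result is about 15.06, which rounds to the n = 15 quoted above; with N = 100 the term lowers it to roughly 13. Either way, the sample needed for the mean is far smaller than the 58 segments needed for the maximum.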

Effect of the population size:

In order to show the effect of the population size on the estimated sample size, it was assumed that the total number of segments of the pipeline (population size) is 1000, and the proposed methodology was reapplied for this population size.

Fig. 10 shows COEmax versus sample size for a population size of 1000.

Fig. 10. COEmax versus sample size for population size of 1000.

From Fig. 10, the sample size at COEaccept = 0.26 is 278.

Using the proposed equation:

$$f(N) = 2.1N^{0.13} = 2.1 \times 1000^{0.13} = 5.155$$

$$n = \frac{1}{\left[\dfrac{0.5}{5.155 \times 1.96 \times 0.99}\right]^2 + \dfrac{1}{1000}} = 286$$

The sample size is 58 when the population size is 100 (i.e., 58%), while it is 278 when the population size is 1000 (i.e., 27.8%), with the same acceptable margin of error (0.5 mm) and at the same confidence level (0.95). Thus, the larger the population size, the smaller the required ratio of sample size to population size (n/N). A larger population size can be obtained by reducing the unit inspection area (AI) to the limit that does not, in practice, affect the detectability of the localized corrosion by the inspection technique used, such as radiographic, ultrasonic or eddy current testing. The recommended practical inspection unit area can be obtained from the applicable codes such as API 570 (2009), ASME Section XI (2004) and ASTM G46-94 (2005).


4. Conclusion

A methodology for calculating the sample size required to predict, with a specified precision, the maximum localized corrosion of process components is proposed.

The proposed methodology is divided into four main parts: i) layering separation, ii) physical sampling, iii) bootstrap sampling and extreme value analysis, and iv) calculation of sample size.

The estimated sample size ensures that the predicted maximum localized corrosion using the extreme value method is within a pre-defined margin of error at a specified confidence level.

An equation for calculating the sample size is obtained using the results of the proposed methodology.

The application of the methodology has been illustrated through a case study of offshore process piping subjected to pitting corrosion.

Acknowledgment

The authors gratefully acknowledge the financial support provided by Petroleum Research Atlantic Canada (PRAC) and the strategic grant provided by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

References

Alfonso, L., Caleyo, F., Hallen, J. M., Espina-Hernandez, J. H., & Escamilla-Davish, J. J. (2008). Application of extreme value statistics to the prediction of maximum pit depth in non-pigable, buried pipelines. In Proceedings of IPC2008 7th International pipeline conference, Calgary, Alberta, Canada.

API 570. (2009). Piping inspection code: In-service inspection, repair, and alteration of piping systems (3rd ed.).

API 581. (2000). Risk based inspection resource document (1st ed.). American Petroleum Institute.

ASME Boiler and Pressure Vessel Code, Section XI. (2004). Rules for inservice inspection of nuclear power plant components. New York: The American Society of Mechanical Engineers.

ASME B31G. (2009). Manual for determining the remaining strength of corroded pipelines. New York: The American Society of Mechanical Engineers.

ASTM E122. (2009). Standard practice for calculating sample size to estimate, with specified precision, the average for a characteristic of a lot or process.

ASTM G46-94. (2005). Standard guide for examination and evaluation of pitting corrosion.

Bernstein, S., & Bernstein, R. (1999). Elements of statistics II, Schaum's outlines series (1st ed.). New York: McGraw-Hill.

Brooker, D. C., & Geoffery, K. C. (2004). Accurate calculation of confidence intervals on predicted extreme met-ocean conditions when using small data sets. In Proceedings of the 23rd International Conference on Offshore Mechanics and Arctic Engineering (OMAE), Vol. 2 (pp. 149–154).

Cohen, M. P. (1997). Bayesian bootstrap for unequal probability sample designs for application to multiple imputation. 555 New Jersey Avenue NW, Washington DC 20208–5654: National Center for Education Statistics.

DNV-RP-F101. (2000). Corroded pipelines. Det Norske Veritas.

Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7(1), 1–26.

Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. New York: Chapman and Hall.

Gross, S. (1980). Median estimation in sample surveys. Proceedings of the section of survey research methods. American Statistical Association. pp. 181–184.

Hobbs, D., & Ku, A. (2002). Statistical consideration for determining extend of piping inspections for RBI or API-570 driven inspections. In ASME pressure vessels and piping conference, Vancouver, British Columbia, Canada (pp. 167–172).

Jarrah, A., Bigerelle, M., Guillemot, G., Najjar, D., Iost, A., & Nianga, G. M. (2011). A generic statistical methodology to predict the maximum pit depth of a localized corrosion process. Corrosion Science, 53(8), 2453–2467.

Khan, F., & Howard, R. (2007). Statistical approach to inspection planning and integrity assessment. Journal of Non-destructive Testing and Condition Monitoring, 49(1), 26–36.

Kowaka, M., Tsuge, H., Akashi, M., Katsumi, M., & Ishimoto, H. (1984). Introduction to life prediction of industrial plant materials: application of extreme value statistical method for corrosion analysis. The Japan Society of Corrosion Engineers.

Schneider, C. R. A., Muhammed, A., & Sanderson, R. M. (2001). Predicting the remaining lifetime of in-service pipelines based on sample inspection data. Insight: Non-destructive Testing and Corrosion Monitoring, 43(2), 102–104.

Shibata, T. (1991). Evaluation of corrosion failure by extreme value statistics. ISIJ International, 31(2), 115–121.

The Health and Safety Executive, HSE. (2002). Guidelines for the use of statistics analysis of sample inspection of corrosion. Research Report 016. Cambridge, U.K.: TWI.

Wang, W. D. (2006). Extreme value analysis of heat exchanger tube inspection data. In Proceedings of PVP2006-ICPVT-11 ASME Pressure Vessels and Piping Division Conference. Vancouver, BC, Canada.

