Predicting injection profiles using ANFIS

Information Sciences 177 (2007) 4445–4461


Mingzhen Wei a,*, Baojun Bai a, Andrew H. Sung b, Qingzhong Liu b, Jiachun Wang c, Martha E. Cather d

a University of Missouri-Rolla, 129 McNutt Hall, 1870 Miner Circle, Rolla, MO 65409, United States

b New Mexico Institute of Mining and Technology, 801 Leroy Place, Socorro, NM 87801, United States

c Daqing Petroleum Company Limited, PetroChina, Daqing, Haerbing, PR China

d New Mexico Petroleum Recovery Research Center/New Mexico Tech, 801 Leroy Place, Socorro, NM 87801, United States

Received 21 September 2005; received in revised form 16 March 2007; accepted 17 March 2007

Abstract

Decision making pertaining to injection profiles during oilfield development is one of the most important factors that affect an oilfield's performance. Since injection profiles are affected by multiple geological and development factors, it is difficult to model their complicated, non-linear relationships using conventional approaches. In this paper, two neuro-fuzzy systems based on the adaptive-network-based fuzzy inference system (ANFIS) are presented: (1) a grid partition based fuzzy inference system (FIS), named ANFIS-GRID, and (2) a subtractive clustering based FIS, named ANFIS-SUB. We compare the performance of the resultant FIS and study the effect of their parameters. A real-world injection profile data set from the Daqing Oilfield, China is used. FIS are generated and tested using training and testing data from that data set. The impact of data quality on the performance of FIS is also studied. Experiments demonstrate that although soft computing methods are somewhat tolerant of inaccurate inputs, cleaned data results in more robust models for practical problems. ANFIS-GRID outperforms ANFIS-SUB due to its simplicity in parameter selection and its fitness for the target problem.

© 2007 Elsevier Inc. All rights reserved.

Keywords: ANFIS; Subtractive clustering; Grid partition; Petroleum industry; Data quality problems

1. Introduction

In water flooding oilfields, injected water pushes petroleum fluid (oil, gas and/or water) toward the wellbore through the underground porous media. Injection profiles of injection wells present the distribution of injected water in the active or producing strata. Understanding injection profiles significantly aids in analyzing production related problems, such as residual oil distribution, residual reserve estimation, water flooding efficiency, and injection and production balance.

0020-0255/$ - see front matter © 2007 Elsevier Inc. All rights reserved.

doi:10.1016/j.ins.2007.03.021

* Corresponding author. Tel.: +1 573 341 4221; fax: +1 505 835 6031. E-mail address: [email protected] (M. Wei).


Many methods can be applied to obtain injection profiles in oilfields, such as sealed coring, sidewall coring, interpretation of logging data, C/O spectral logging, numerical simulation, and comprehensive analysis of static and dynamic data from the oilfield development. Most of those methods, except for numerical simulation, obtain injection profiles by in-place measurement and interpretation. They are expensive and time-consuming. In addition, it is impossible to obtain injection profiles whenever and wherever they are needed for improving oil recovery (IOR) purposes. Numerical reservoir simulation models oil/gas production by combining petroleum fluid flow and other models. By properly modeling the reservoir and matching the historical production data, reservoir simulation generates injection profiles over the production history and predicts injection profiles in the future. However, reservoir simulation has its own inherent problems: (1) it is difficult to model multiple parameters and integrate sub-models; (2) history matching is still largely a trial-and-error, and consequently time-consuming, process that depends heavily on reservoir simulation expertise; and (3) reservoir simulators sometimes encounter difficulties in modeling actual reservoir features due to their built-in limitations. In addition, time-consuming post-processing is required to obtain injection profile data from reservoir simulation results. Considering that injection profiles are required in many different IOR projects, it is desirable to have the data readily available when it is required.

Soft computing techniques are known for their efficiency in dealing with complicated problems when conventional analytical methods are infeasible or too expensive and only sets of operational data are available. Soft computing methods have been widely applied in many areas of the petroleum industry, such as reservoir description [27], well logging interpretation [16], production prediction [29] and treatment optimization [17]. In this paper, two neuro-fuzzy systems, ANFIS-GRID and ANFIS-SUB, are employed to model the relationships between injection profiles and their influential parameters. A set of data from real injection profiles in the Daqing Oilfield of China is employed to train and test these neuro-fuzzy systems. Average prediction accuracy of about 80% is achieved.

The rest of this paper is organized as follows. Section 2 describes the injection profile modeling problem; Section 3 is a brief introduction to ANFIS, ANFIS-GRID, and ANFIS-SUB; Section 4 studies the effects of parameters for these neuro-fuzzy systems and presents the experimental results on the raw data; Section 5 demonstrates the effect of low quality data on the performance of FIS and presents the improved result using cleaned data; Section 6 concludes the paper.

2. Problem statement

In water flooding oilfields, the injection profile is tightly related to fluid flow in the underground porous media. Therefore, the water injectivity of injection wells is affected by many parameters. For example, the larger the permeability, the larger the water injectivity. Water always breaks through along the high permeability channels to the producing wells. Formation communication between injection and producing wells, which depends on the depositional environment for oil/gas generation and transportation, is another important factor that affects the injectivity. With good communication, the oil/gas volume stored in reservoirs can be produced easily, and injectivity is correspondingly good.

In the underground porous media, petroleum fluid flow follows Darcy's law, described by the following equation:

$$
u = -\frac{k}{\mu}\,\frac{dP}{dx}, \tag{1}
$$

where $u$ is the superficial velocity; $k$ is the permeability; $\mu$ is the viscosity of the petroleum fluid; and $dP/dx$ is the pressure gradient in the fluid flow direction. Considering the complex interaction of rock and fluid properties and the anisotropy of permeability, the fluid flow can be generally described as follows:

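$$
\begin{cases}
\nabla \cdot \left\{ \dfrac{K K_{rw}}{\mu_w B_w} \left( \nabla p_w - \gamma_w \nabla d \right) \right\} - Q_w = \phi \dfrac{\partial}{\partial t} \left( \dfrac{S_w}{B_w} \right) & \text{for water}, \\[2ex]
\nabla \cdot \left\{ \dfrac{K K_{ro}}{\mu_o B_o} \left( \nabla p_o - \gamma_o \nabla d \right) \right\} - Q_o = \phi \dfrac{\partial}{\partial t} \left( \dfrac{S_o}{B_o} \right) & \text{for oil}, \\[2ex]
\nabla \cdot \left\{ \dfrac{K K_{ro} R_s}{\mu_o B_o} \left( \nabla p_o - \gamma_o \nabla d \right) \right\} + \nabla \cdot \left\{ \dfrac{K K_{rg}}{\mu_g B_g} \left( \nabla p_g - \gamma_g \nabla d \right) \right\} - \left( R_s Q_o + Q_g \right) = \phi \dfrac{\partial}{\partial t} \left( \dfrac{S_g}{B_g} + \dfrac{R_s S_o}{B_o} \right) & \text{for gas},
\end{cases}
\tag{2}
$$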


where $\nabla$ is the differential operator in the spatial dimensions, with $\nabla(f) = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z$; $K$ stands for the absolute permeability of the producing formations; $K_{rw}$, $K_{ro}$, $K_{rg}$ denote the relative permeability of water, oil, and gas, respectively, in multiple phase compressible flow; $\mu_o$, $\mu_w$, $\mu_g$ are the underground viscosities of oil, water, and gas, respectively; $B_o$, $B_w$, $B_g$ are the volume factors of oil, water, and gas, respectively; $p_o$, $p_w$, $p_g$ are the pressures of the oil, water, and gas phases, respectively; $\gamma_o$, $\gamma_w$, $\gamma_g$ are the specific gravities of oil, water, and gas, respectively; $S_o$, $S_w$, $S_g$ are the saturations of oil, water, and gas, respectively; and $R_s$ is the solution gas ratio in the oil phase.

Eq. (2) reveals the complexity of fluid flow in the underground porous media, which partly explains the complicated nature of injection profile modeling and prediction using conventional approaches such as reservoir simulation. Therefore, soft computing based modeling is proposed.

2.1. Parameter selection

As discussed above, injection profile prediction is a complicated problem that involves multiple interacting factors. In order to build a reasonably accurate model for prediction, proper parameters must be selected. The following are some practical considerations in parameter selection:

• The selected parameters must affect the target problem, i.e., strong relationships must exist among the parameters and the target (or output) variables.

• The selected parameters must be well-populated, and the corresponding data must be as clean as possible. Since soft computing methods model problems based on available data, the availability and quality of data are both essential.

For modeling and predicting injection profiles, studies [28] have been conducted in the Daqing Oilfield, China. The authors selected three parameters to construct the fuzzy membership functions and fuzzy rules, based on analysis results of 25 wells (218 active strata in total) and their own expertise. These parameters are: (1) sand type, which is reflected by main sand, subordinate sand or untabulated stratum sand thickness; (2) connection status, which is reflected by the correspondence of sand types near the injection and producing wells; and (3) well spacing, which is the distance between an injection well and the surrounding producing wells.

Formation permeability of active strata is a key factor that affects the injection profiles. Studies on available data show that the absolute permeability of active strata is positively related to the sand type in the Daqing Oilfield, China [28]. Sand types can be represented by sand thickness and the communication between injection and producing wells, as shown in Table 1. Besides being positively associated with sand type, permeability data is not widely available in the tested area. Therefore, permeability is not considered in modeling and prediction.

Based on their research results, and considering the difficulties in identifying fuzzy membership functions and the domain expertise needed to construct a proper FIS, we select the following parameters in our problem modeling:

• Gross pay thickness near the wellbore of injection wells, in meters, denoted as $h_{gross1}$

• Net pay thickness near the wellbore of injection wells, in meters, denoted as $h_{net1}$

• Gross pay thickness near the wellbore of nearby producing wells, in meters, denoted as $h_{gross2}$

• Net pay thickness near the wellbore of nearby producing wells, in meters, denoted as $h_{net2}$

• Spacing distance between injection wells and surrounding producing wells, in meters, denoted as $d$

Among these five parameters, the first four reflect the depositional environment and the communication between injection and producing wells. Well spacing distance reflects the effect of well pattern and production criteria. The larger the well spacing, the smaller the injection capability is. Data for these parameters are available from in-place measurement.

Table 1
Relationship of permeability and pay zone thickness in the active producing strata [28]

Sand thickness (m)            <0.5 gross   ≥0.5 gross   0.2–0.5 net   0.5–1.0 net   1.0–1.5 net   1.0–1.5 net
Average permeability (μm²)    0.037        0.123        0.264         0.802         1.064         2.181

2.2. Problem formulation

The relationship between the injection profile and the selected parameters is not obvious. In this paper, the injection profile is calculated by summing up the relative water injectivity of the producing wells perforated in each active stratum, formulated as follows:

$$
ri_i = RI \cdot \frac{ratio_i}{\sum_i ratio_i}, \tag{3}
$$

$$
ratio_i = h_{net2,i} + \left( h_{gross2,i} - h_{net2,i} \right) / 3.3, \tag{4}
$$

where $i = 1, 2, \ldots$ refers to one of the producing wells surrounding an injection well, and $RI$ is the injectivity of a producing stratum. The water injectivity of an active stratum is calculated as follows:

$$
RI_k = \sum_i ri_{k,i}, \tag{5}
$$

where $k$ is the index of a producing stratum of an injection well, and $i$ is the index of the surrounding producing wells of the injection well in that producing stratum.
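The allocation in Eqs. (3)–(5) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample thickness values are hypothetical, and the sketch assumes that the stratum injectivity RI and the thicknesses are already available as plain numbers.

```python
def allocation_ratios(h_gross2, h_net2):
    """Eq. (4): ratio_i = h_net2_i + (h_gross2_i - h_net2_i) / 3.3."""
    return [net + (gross - net) / 3.3 for gross, net in zip(h_gross2, h_net2)]


def split_injectivity(stratum_ri, h_gross2, h_net2):
    """Eq. (3): share the stratum injectivity RI among the producing wells."""
    ratios = allocation_ratios(h_gross2, h_net2)
    total = sum(ratios)
    if total == 0:
        raise ValueError("all allocation ratios are zero")
    return [stratum_ri * r / total for r in ratios]


def stratum_injectivity(ri_per_well):
    """Eq. (5): RI_k is the sum of the per-well values ri_{k,i}."""
    return sum(ri_per_well)


# Hypothetical stratum: RI = 12 %, three surrounding producing wells.
shares = split_injectivity(12.0, h_gross2=[2.0, 1.0, 3.3], h_net2=[2.0, 1.0, 0.0])
print(shares)                       # [6.0, 3.0, 3.0]
print(stratum_injectivity(shares))  # 12.0
```

Summing the per-well shares recovers the stratum injectivity, which is the consistency that Eqs. (3) and (5) imply.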

When a system lacks a complete or computationally feasible analytic description, soft computing methods can be used to model it. In the case of the injection profile, the variables are $ri_i$, $h_{gross1,i}$, $h_{net1,i}$, $h_{gross2,i}$, $h_{net2,i}$, and $d_i$. Available profile data is used to construct the model. The resultant model is then validated on independent data sets. Prediction accuracy is calculated by comparing predicted and measured relative injectivity. If the difference is within tolerance, i.e., $|RI_k^{predicted} - RI_k^{measured}| \le \varepsilon$, the prediction is counted as accurate. The tolerance $\varepsilon$ is defined based on the accuracy requirement in the petroleum industry; in injection profile prediction, errors of ±2% are allowed. The prediction accuracy is therefore defined as follows:

$$
accuracy = \frac{\#\ \text{of}\ |RI_k^{predicted} - RI_k^{measured}| \le \varepsilon}{\|\text{predicted set}\|} \times 100. \tag{6}
$$
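Eq. (6) amounts to counting the predictions that fall within the tolerance. The sketch below assumes that predicted and measured relative injectivities are given in percent and that the tolerance is expressed in the same units (2.0 for the ±2% band); the function name and the sample values are hypothetical.

```python
def prediction_accuracy(predicted, measured, tolerance=2.0):
    """Eq. (6): percentage of predictions with |predicted - measured| <= tolerance."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for p, m in zip(predicted, measured) if abs(p - m) <= tolerance)
    return 100.0 * hits / len(predicted)


# Hypothetical values: absolute differences are 1.0, 3.5, 2.0 and 0.5.
predicted = [10.0, 5.5, 20.0, 3.0]
measured = [11.0, 9.0, 18.0, 3.5]
print(prediction_accuracy(predicted, measured))  # 75.0
```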

2.3. Sample injection profile data

Injection profiles from 10 wells are used for the problem modeling. Fig. 1 shows the complicated relationships between the five selected parameters and the relative injectivity in each active stratum per producing well. It is obvious that non-linear relationships exist between $h_{gross1}$, $h_{net1}$, $h_{gross2}$, $h_{net2}$, $d$ and $ri$, making it difficult to build a model using conventional approaches such as regression and curve fitting.

Fig. 1. Relationships of selected parameters to relative injectivity of producing oil strata, where gross1 stands for $h_{gross1}$, net1 for $h_{net1}$, gross2 for $h_{gross2}$, net2 for $h_{net2}$, and well spacing for $d$. (Axes: relative injectivity (%), pay thickness (meter), well spacing (meter).)

3. Neuro-fuzzy systems

3.1. Fuzzy logic and fuzzy inference systems

Fuzzy logic (FL) and fuzzy inference systems (FIS), first proposed by Zadeh [31], provide a solution for making decisions based on vague, ambiguous, imprecise or missing data. FL represents models or knowledge using IF–THEN rules in the form of "if X and Y then Z". As shown in Fig. 2, a fuzzy inference system mainly consists of fuzzy rules, membership functions, and fuzzification and de-fuzzification operations. By applying fuzzy inference, ordinary crisp input data produces ordinary crisp output, which is easy to understand and interpret. A more generalized description of fuzzy problems and uncertainty is provided in [32].

Broadly speaking, there are two categories of fuzzy inference systems, namely Mamdani [19] and Takagi–Sugeno (ST) [26] FIS. A Mamdani FIS consists of simple rules such as

IF pressure is high and temperature is low, then volume is small,

where pressure, temperature and volume are linguistic variables; high, low and small are linguistic values that are characterized by membership functions. ST type fuzzy rules only involve fuzzy sets or membership functions in the premise part. A FIS with two inputs and two ST rules can generally be represented as follows:

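$$
\begin{aligned}
R_1 &: \text{if } x_1 \text{ is } A_1^1 \text{ and } x_2 \text{ is } A_2^1, \text{ then } f_1 = p_1 x_1 + q_1 x_2 + c_1; \\
R_2 &: \text{if } x_1 \text{ is } A_1^2 \text{ and } x_2 \text{ is } A_2^2, \text{ then } f_2 = p_2 x_1 + q_2 x_2 + c_2.
\end{aligned}
\tag{7}
$$

Eq. (7) represents first-order ST type fuzzy rules. The output part can also be a constant, which is named the Takagi–Sugeno–Kang fuzzy model [25], represented as

$$
\begin{aligned}
R_1 &: \text{if } x_1 \text{ is } A_1^1 \text{ and } x_2 \text{ is } A_2^1, \text{ then } f_1 = C_1; \\
R_2 &: \text{if } x_1 \text{ is } A_1^2 \text{ and } x_2 \text{ is } A_2^2, \text{ then } f_2 = C_2.
\end{aligned}
\tag{8}
$$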

For complicated problems such as the one discussed in this paper, the first-order ST FIS is widely employed to model the relationships between inputs and outputs.
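To make the first-order rules of Eq. (7) concrete, the following sketch evaluates a two-input, two-rule system. Several choices here are assumptions rather than details given in the text: Gaussian membership functions, the product as the "and" operator, and a firing-strength-weighted average as the crisp output. All parameter values are hypothetical.

```python
import math


def gauss(x, center, sigma):
    """Gaussian membership function (assumed MF type)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_output(x1, x2, rules):
    """Weighted average of first-order rule outputs f_r = p*x1 + q*x2 + c."""
    weights, outputs = [], []
    for rule in rules:
        # Firing strength: product of the two premise memberships (assumed AND).
        w = gauss(x1, *rule["mf1"]) * gauss(x2, *rule["mf2"])
        p, q, c = rule["coef"]
        weights.append(w)
        outputs.append(p * x1 + q * x2 + c)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(w * f for w, f in zip(weights, outputs)) / total


# Hypothetical rule base: each MF is (center, sigma), coef is (p, q, c).
rules = [
    {"mf1": (0.0, 1.0), "mf2": (0.0, 1.0), "coef": (1.0, 1.0, 0.0)},
    {"mf1": (2.0, 1.0), "mf2": (2.0, 1.0), "coef": (0.0, 0.0, 4.0)},
]

# At (1, 1) both rules fire equally, so the output is the mean of f1 = 2 and f2 = 4.
y = sugeno_output(1.0, 1.0, rules)
```

Replacing the `coef` triple by `(0.0, 0.0, C)` gives the constant-output form of Eq. (8).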

3.2. FIS identification and refinement

Identification of the rule base is the key to a fuzzy inference system. The problems are: (1) there are no standard methods for transforming human knowledge or experience into the rule base; and (2) the membership functions (MF) must be tuned further to minimize the output errors and maximize the performance, as stated in [9]. There are many methods [27,20,14] that can be applied to identify the MF and FIS. In this paper, two commonly used methods are applied for FIS identification and refinement.

Fig. 2. Components of fuzzy inference systems [9].



3.2.1. ANFIS

ANFIS is a multi-layer adaptive-network-based fuzzy inference system proposed by Jang [9]. An ANFIS consists of five layers that implement different node functions to learn and tune the parameters of a FIS using a hybrid learning mode. In the forward pass, with the premise parameters fixed, the least squares estimate is employed to update the consequent parameters and to pass the errors to the backward pass. In the backward pass, the consequent parameters are fixed and the gradient descent method is applied to update the premise parameters. The premise and consequent parameters of the MF and FIS are identified by repeating the forward and backward passes. ANFIS has been widely used in automation control [20] and other areas.
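The forward pass of the hybrid scheme can be illustrated on a deliberately small case. The sketch below is not Jang's implementation: it assumes a single input, two rules with Gaussian premise MFs that stay fixed, and it solves the least squares problem through the normal equations with a hand-written elimination routine. The backward (gradient descent) pass is omitted, and all names and data are hypothetical.

```python
import math


def gauss(x, center, sigma):
    """Gaussian membership function (an assumed MF type)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def design_row(x, premise):
    """Regressors for one sample: [w1*x, w1, w2*x, w2, ...], w normalized."""
    w = [gauss(x, c, s) for c, s in premise]
    total = sum(w)
    row = []
    for wi in w:
        row += [wi / total * x, wi / total]
    return row


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_consequents(xs, ys, premise):
    """Least squares consequents (p_r, c_r per rule) with premise MFs fixed."""
    rows = [design_row(x, premise) for x in xs]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(ata, aty)


def predict(x, premise, consequents):
    """FIS output: the design row is linear in the consequent parameters."""
    return sum(a * b for a, b in zip(design_row(x, premise), consequents))


# Hypothetical data from y = 2x + 1; premise MFs are (center, sigma) pairs.
premise = [(0.0, 1.0), (4.0, 1.0)]
xs = [0.5 * i for i in range(9)]
ys = [2.0 * x + 1.0 for x in xs]
theta = fit_consequents(xs, ys, premise)
```

Because the FIS output is linear in the consequent parameters once the premise MFs are fixed, this step has a closed-form solution, which is why the hybrid scheme reserves gradient descent for the premise parameters only.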

3.2.2. ANFIS-GRID

The ANFIS-GRID fuzzy inference system is the combination of grid partition and ANFIS. Grid partition divides the data space into rectangular sub-spaces using axis-parallel partitions based on a pre-defined number of membership functions and their types in each dimension, as shown in Fig. 3. Premise fuzzy sets and parameters are calculated using the least squares estimate method based on the partition and MF types. When constructing the fuzzy rules, the consequent parameters in the linear output MF are set to zero. Hence the parameters must be identified and refined using ANFIS. The combination of grid partition and ANFIS has been reported in [1,15].

The wider application of grid partition in FL and FIS is blocked by the curse of dimensionality: the number of fuzzy rules increases exponentially as the number of input variables increases. For example, if there are on average $m$ MF for every input variable and a total of $n$ input variables for the problem, the total number of fuzzy rules is $m^n$. The large number of rules clearly limits the wide application of grid partition. According to [10,13], grid partition is only suitable for cases with a small number of input variables (e.g., fewer than six). In this paper, the injection profile modeling problem has five antecedent variables, so it is reasonable to apply ANFIS-GRID.
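The exponential growth is easy to check by enumerating the rule antecedents directly. In this sketch, which uses hypothetical function names, each rule is represented by one MF index per input variable.

```python
from itertools import product


def grid_rules(num_mf):
    """Enumerate grid-partition rule antecedents: one MF index per input."""
    return list(product(*(range(m) for m in num_mf)))


def rule_count(m, n):
    """Number of rules with m MF on each of n inputs: m ** n."""
    return m ** n


rules = grid_rules([2, 2, 2, 2, 2])
print(len(rules))          # 32
print(rules[0])            # (0, 0, 0, 0, 0)
print(rule_count(3, 10))   # 59049
```

With five inputs and two MF per input the rule base stays small, while ten inputs with three MF each would already require tens of thousands of rules.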

3.2.3. ANFIS-SUB

Fig. 3. Grid partition of an input domain with two input variables and two membership functions for each input variable. (Axes: X1 and X2; sub-spaces labeled S1 to S9.)

The ANFIS-SUB fuzzy inference system combines the subtractive clustering method and ANFIS. The subtractive clustering method was proposed by Chiu [2] as an extension of the mountain clustering method [30]. It clusters data points in an unsupervised way by measuring the potential of data points in the feature space. When it is not clear how many clusters should be used for a given data set, it can be used to estimate the number of clusters and the cluster centers. Subtractive clustering assumes that each data point is a potential cluster center and calculates the potential of each data point based on the density of surrounding data points. The data point with the highest potential is selected as the first cluster center, and the potential of data points near the first cluster center (within the influential radius) is destroyed. Then the data point with the highest remaining potential is selected as the next cluster center, and the potential of data points near the new cluster center is destroyed. The influential radius is critical for determining the number of clusters. A smaller radius leads to many smaller clusters in the data space, which results in more rules, and vice versa. Hence it is important to select a proper influential radius for clustering the data space.
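The subtractive clustering procedure can be sketched as follows. This is a simplified illustration rather than the exact algorithm of [2]: it assumes the data are already scaled to a unit hypercube, uses exponential potentials of the form exp(-4 d² / r²), and stops as soon as the best remaining potential falls below a fixed fraction of the first one. The parameter names `quash` and `stop_ratio` and their default values are illustrative assumptions.

```python
import math


def subtractive_clustering(points, radius, quash=1.25, stop_ratio=0.15):
    """Return cluster centers chosen by a simplified subtractive clustering."""
    alpha = 4.0 / radius ** 2            # controls the density neighborhood
    beta = 4.0 / (quash * radius) ** 2   # controls the potential reduction

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Potential of each point: density of the surrounding data points.
    potential = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Destroy potential near the new center, including the center itself.
        for i, p in enumerate(points):
            potential[i] -= peak * math.exp(-beta * sqdist(p, points[best]))
    return centers


# Hypothetical data: two tight groups near (0, 0) and (1, 1).
points = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05),
          (1.0, 1.0), (0.95, 1.0), (1.0, 0.95)]
centers_wide = subtractive_clustering(points, radius=0.5)
centers_narrow = subtractive_clustering(points, radius=0.05)
```

On this toy data the wider radius yields one center per group, while the much smaller radius produces more centers, which mirrors the relation between radius and rule count described above.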

After clustering the data space, the number of fuzzy rules and the premise fuzzy MF are determined. Then the least squares estimate is used to determine the consequents in the output MF, resulting in a valid FIS. As described above, ANFIS learns and refines the premise fuzzy MF and the consequents using the least squares estimate and back propagation. Tuned by ANFIS, the resultant FIS achieves minimum training errors.

The combination of ANFIS and subtractive clustering has been widely applied in automation control [12],function approximation [3] and resolving engineering problems [4,7,11,22].

4. Injection profile modeling and prediction

In injection profile modeling, we apply neuro-fuzzy inference systems and split the sample data set by injection well: each training set contains data from nine wells and each testing set contains data from the remaining well. Ten training and testing data sets are generated in this way to check the performance of the selected FIS.
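The well-based split can be sketched as a generator that holds out one well at a time. The record layout (a well identifier followed by features and a target) and the sample values are hypothetical.

```python
def leave_one_well_out(records):
    """Yield (held_out_well, training_records, testing_records) per well.

    Each record is assumed to be a tuple whose first item is the well id.
    """
    wells = sorted({record[0] for record in records})
    for held_out in wells:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test


# Hypothetical records: (well id, input features, relative injectivity in %).
records = [
    ("W1", (1.2, 0.8, 1.0, 0.6, 250.0), 4.1),
    ("W1", (0.9, 0.5, 1.1, 0.7, 250.0), 2.3),
    ("W2", (2.0, 1.5, 1.8, 1.2, 300.0), 9.8),
    ("W3", (0.6, 0.2, 0.7, 0.3, 200.0), 1.0),
]
splits = list(leave_one_well_out(records))
```

With data from ten wells this produces ten training/testing pairs, each training set covering nine wells.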

When modeling practical problems using neuro-fuzzy systems, it is important to obtain proper training and testing data sets. If they are not selected properly, the testing data does not validate the model obtained using the training data, as shown in Fig. 4. For the second training and testing data set, although the checking errors (the curve chk 2) are not very large compared with the curve chk 4, the minimum checking error (MCE) is achieved within the first epoch. This indicates that the checking data presented to ANFIS for training is sufficiently different from the training data set. Hence, the trained FIS does not capture the features of the testing data set very well, and the model must be retrained with different membership function types or a different number of membership functions. Properly selected training and checking sets should have error curves like trn 4 and chk 4 in Fig. 4: the checking error decreases as training proceeds until a jump point, and overfitting occurs when training passes that point. This problem is taken into account when constructing FIS using ANFIS, ANFIS-GRID and ANFIS-SUB.
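The check described above can be automated by locating the epoch at which the checking error is smallest. The sketch below uses hypothetical error curves and function names; it flags a data split as suspect when the MCE occurs in the very first epoch.

```python
def minimum_checking_error(checking_errors):
    """Return (epoch, error) of the minimum checking error; epochs start at 1."""
    best = min(range(len(checking_errors)), key=checking_errors.__getitem__)
    return best + 1, checking_errors[best]


def mce_in_first_epoch(checking_errors):
    """True if the checking error never improves after the first epoch."""
    epoch, _ = minimum_checking_error(checking_errors)
    return epoch == 1


# Hypothetical checking-error curves over six epochs.
well_behaved = [3.0, 2.5, 2.1, 2.0, 2.4, 3.1]
suspect = [2.0, 2.2, 2.5, 2.9, 3.4, 4.0]

print(minimum_checking_error(well_behaved))  # (4, 2.0)
print(mce_in_first_epoch(suspect))           # True
```

For the well-behaved curve, training would be stopped at the reported epoch, since later epochs only overfit the training data.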

4.1. Behavior of the ANFIS

When generating a FIS using ANFIS, it is important to select proper parameters, including the number of MF, num_MF, for each individual antecedent variable. It is also important to select proper parameters for the learning and refining process, including the initial step size S, the step size increase rate RInc, and the step size decrease rate RDec. Parameter selection and its impact on ANFIS have been addressed in the literature [6,11,22]. For specific training and testing data sets, we analyze the effect of these parameters on the final ANFIS performance, including the training and testing MCE.

Fig. 4. Training and checking errors obtained by ANFIS using two different training-testing data sets. (Plotted series: trn2, chk2, trn4, chk4; axis label: training error.)

Fig. 5. Training and testing errors using different combinations of radii. The initial step size S = 0.01, step size decrease rate RDec = 0.9, and step size increase rate RInc = 1.1. The radius combination is: [2 2 2 2 2] for trn 1 and chk 1, [2 2 2 3 2] for trn 2 and chk 2, [2 2 3 2 2] for trn 3 and chk 3, and [3 2 2 2 2] for trn 4 and chk 4. (Plotted series: trn1–trn4, chk1–chk4; axis label: training error.)

Fig. 6. Training and checking errors using different initial step sizes. The constant radius Rc = 2, the step decreasing rate RDec = 0.8 and the step increasing rate RInc = 1.10. The initial step size is: S = 0.01 for trn 1 and chk 1, S = 0.03 for trn 2 and chk 2, S = 0.05 for trn 3 and chk 3, S = 0.07 for trn 4 and chk 4. (Plotted series: trn1–trn4, chk1–chk4; axis labels: training error, checking error.)

Fig. 7. Training and checking errors using different step increasing rates. Initial step size S = 0.01 and step decreasing rate RDec = 0.8. The step size increasing rate is: RInc = 1.05 for trn 1 and chk 1, RInc = 1.10 for trn 2 and chk 2, RInc = 1.15 for trn 3 and chk 3, RInc = 1.20 for trn 4 and chk 4. (Plotted series: trn1–trn4, chk1–chk4; axis labels: training error, checking error.)

Fig. 8. Training and checking errors using different step decreasing rates. The initial step size S = 0.01 and step increasing rate RInc = 1.05. The step size decreasing rate is: RDec = 0.8 for trn 1 and chk 1, RDec = 0.85 for trn 2 and chk 2, RDec = 0.90 for trn 3 and chk 3, RDec = 0.95 for trn 4 and chk 4. (Plotted series: trn1–trn4, chk1–chk4; axis label: training error.)

Figs. 5–8 present the impact of the parameters (num_MF, S, RDec, and RInc) on the training/checking errors. Fig. 5 shows that the combination of num_MF can affect the training and testing errors significantly, and that increasing the number of MF does not necessarily improve the performance of the FIS. In our case, there are 335 data points in the training set, so a FIS with 48 fuzzy rules (321 parameters in total) is the limit. The best combination of radius is [2 2 2 2 2], that is, a constant radius Rc in every dimension of the data space.
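The rule and parameter counts quoted for the 48-rule FIS can be reproduced with a short calculation. The sketch assumes a first-order Sugeno FIS (one linear coefficient per input plus a constant in every rule) and three premise parameters per membership function, as for a generalized bell MF. The MF type is not stated in the text, so the value of `params_per_mf` is an assumption, as is the function name.

```python
from math import prod


def anfis_parameter_count(num_mf, params_per_mf=3):
    """Return (number of rules, total parameters) for a grid-partition FIS."""
    n_inputs = len(num_mf)
    n_rules = prod(num_mf)                  # one rule per MF combination
    premise = sum(num_mf) * params_per_mf   # parameters of all input MFs
    consequent = n_rules * (n_inputs + 1)   # linear coefficients plus constant
    return n_rules, premise + consequent


print(anfis_parameter_count([2, 2, 2, 3, 2]))  # (48, 321)
print(anfis_parameter_count([2, 2, 2, 2, 2]))  # (32, 222)
```

Under these assumptions the combination [2 2 2 3 2] gives 48 rules and 321 parameters, just below the 335 available training points, which is consistent with that configuration being described as the limit.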

Fig. 6 shows the training and testing errors using different initial step sizes S. The initial step size does not affect the value of the MCE, i.e., the jump points in the checking error curves, but it does affect the training epoch at which the MCE appears. The larger the initial step size, the earlier the MCE occurs. Fig. 7 presents the effect of the step size increase rate RInc on the training and checking errors. The conclusion is similar to that for the initial step size: the larger the increase rate, the faster the FIS achieves the MCE. Fig. 8 shows that no significant difference exists between FIS obtained using different step size decrease rates.

Based on the above analysis, the 4th training and testing sets are used to construct a FIS and to cross test the other testing sets. The parameters are selected as follows: S = 0.01, RDec = 0.9, RInc = 1.10, and num_MF = 2. The cross checking results of prediction accuracy based on Eq. (6) are listed in Table 2. The testing accuracy can be as high as 93%, which indicates that the trained FIS captures most of the data patterns in the testing data. The overall average prediction accuracy is 78.7% over both training and testing sets.

4.2. Using the ANFIS-GRID

Due to the curse of dimensionality in ANFIS-GRID, only the radius combinations in Fig. 5 are tried. Results similar to those in Fig. 5 are achieved. This indicates that after initializing the FIS with the same architecture, ANFIS identifies and tunes the parameters to achieve the least training error.

Table 3 lists the training and cross testing accuracy of 10 pairs of data sets. In the table, Fis1 through Fis10 are generated by the first through tenth training sets, respectively. All training sets generate FIS with five inputs and 32 Takagi–Sugeno rules. The ANFIS training parameters are S = 0.01, RInc = 1.1, and RDec = 0.9.

4.3. Using the ANFIS-SUB

In order to generate a proper FIS using ANFIS-SUB, it is critical to determine the proper influential radiusfor each dimension in the data space. It is also important to select proper values or combinations of following

Table 2
Cross-validation results of 10 training and testing data sets. The FIS is obtained using the 4th training/testing data set

Data set           Set 1  Set 2  Set 3  Set 4  Set 5  Set 6  Set 7  Set 8  Set 9  Set 10
Training acc. (%)  80     80     79     81     80     78     79     79     80     80
Testing acc. (%)   77     78     81     59     68     93     85     84     74     79

Table 3
Validation accuracy results using different training and testing data sets (training/testing accuracy, %)

                Fis1   Fis2   Fis3   Fis4   Fis5   Fis6   Fis7   Fis8   Fis9   Fis10
Train1/test1    82/63  79/87  81/78  80/78  80/79  79/93  80/81  80/84  82/69  80/84
Train2/test2    77/83  80/59  77/83  78/70  78/74  76/93  78/75  76/88  79/69  78/79
Train3/test3    76/76  76/74  79/44  75/77  76/79  73/93  75/81  75/82  77/67  76/79
Train4/test4    79/76  80/74  79/83  80/56  79/74  77/93  78/81  78/86  80/72  79/84
Train5/test5    83/80  84/80  83/81  84/74  84/74  82/95  82/85  82/92  84/74  83/84
Train6/test6    76/83  75/83  76/81  77/67  77/68  79/56  75/77  75/84  76/82  76/74
Train7/test7    78/80  78/80  78/81  79/74  79/63  76/93  77/66  77/86  79/69  78/84
Train8/test8    75/83  76/74  75/81  76/78  76/74  73/96  79/75  79/58  76/69  76/74
Train9/test9    76/73  75/83  75/81  76/78  76/68  74/93  75/75  75/84  80/36  76/84
Train10/test10  78/76  78/76  78/81  79/74  78/74  76/93  77/87  77/84  79/72  81/32

Highlighted results in bold have low testing accuracy (lower than 60%). The highlighted FIS in italics achieves best performance.


parameters: (1) the influential radius Rc, which affects the clustering result directly; (2) the quash factor CQuash, which multiplies the given radius values to quash the potential of outlying points to be considered as part of that cluster; (3) the accept ratio RAccept, which sets the potential, as a fraction of the potential of the first cluster center, above which a data point will be accepted as a cluster center; and (4) the reject ratio RReject, which sets the potential, as a fraction of the potential of the first cluster center, below which a data point will be rejected as a cluster center.
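How the four parameters interact can be seen in a minimal sketch of subtractive clustering in the style of [2]. It assumes that the data are already scaled to the unit hypercube and that a single radius applies to every dimension; the function name, the default parameter values, and the six sample points are assumptions for illustration, not the configuration used in the experiments.

```python
import math


def subtractive_clustering(points, r_c, c_quash=1.25, r_accept=0.5, r_reject=0.15):
    """Return cluster centers picked from `points` (assumed scaled to [0, 1])."""

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    alpha = 4.0 / r_c ** 2               # neighborhood that contributes potential
    beta = 4.0 / (c_quash * r_c) ** 2    # wider neighborhood whose potential is quashed
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        ratio = potential[k] / first_peak
        if ratio <= r_reject:
            break  # remaining potential is too low: stop clustering
        if centers and ratio <= r_accept:
            # Gray zone: accept only if the candidate is far enough from existing centers.
            d_min = min(math.sqrt(sq_dist(points[k], c)) for c in centers)
            if d_min / r_c + ratio < 1.0:
                potential[k] = 0.0  # reject this candidate and try the next one
                continue
        peak = potential[k]
        centers.append(points[k])
        # Subtract the influence of the new center from every point's potential.
        potential = [p - peak * math.exp(-beta * sq_dist(x, points[k]))
                     for p, x in zip(potential, points)]
    return centers


# Two tight groups of made-up points near opposite corners of the unit square.
data = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05), (1.0, 1.0), (0.95, 1.0), (1.0, 0.95)]
coarse = subtractive_clustering(data, r_c=0.5)
fine = subtractive_clustering(data, r_c=0.05)
print(len(coarse), len(fine))
```

Each cluster center becomes one fuzzy rule, so a smaller radius, which yields more centers on this toy data, corresponds to a FIS with more rules.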

Extensive experiments are conducted to select a proper combination of the constant radius Rc, the quash factor CQuash, the accept ratio RAccept, and the reject ratio RReject, which affect the number of fuzzy rules significantly, as shown in Fig. 9. The numbers of fuzzy rules are obtained using different combinations of CQuash, RAccept and RReject. Fig. 9 shows that the most critical parameter is Rc: the larger the radius constant, the fewer the fuzzy rules in the resultant FIS. With the other parameters unchanged, a larger RReject leads to fewer rules, and a larger quash factor also decreases the number of fuzzy rules.

When using ANFIS-SUB to generate FIS, more than one fuzzy rule is required to refine the parameters. Hence, parameter combinations that lead to only one fuzzy rule in the resultant FIS are not valid. Cross-validation results using 10 optimized FIS based on the MCE criterion are presented in Table 4.

4.4. Discussion

From the above discussion, the performance of ANFIS-SUB is hard to predict. When the number of clusters for a given data set is unclear, it is difficult to specify the influential radius for each dimension, considering the impact of the quash factor, accept ratio and reject ratio on clustering.

The performance of the ANFIS-GRID is mainly affected by the num_MF for each dimension, which can usually be determined by the data distribution and the size of the training data set. For practical problems with

Fig. 9. Effect of Rc, CQuash, RAccept, and RReject on the number of fuzzy rules. [Plot: number of fuzzy rules (1 to 6) on the vertical axis, horizontal axis ticks from 1 to 58; series: Rc = 0.20, Rc = 0.30, Rc = 0.40.]

Table 4
Validation accuracy results using different training and testing data sets (training/testing accuracy, %)

                Fis1   Fis2   Fis3   Fis4   Fis5   Fis6   Fis7   Fis8   Fis9   Fis10
Train1/test1    70/80  72/90  70/87  67/77  70/77  66/83  68/80  63/83  68/90  67/80
Train2/test2    73/57  74/72  72/67  70/60  72/61  67/65  69/65  67/52  71/67  68/67
Train3/test3    70/75  74/75  71/69  68/75  70/75  67/67  69/69  65/67  70/72  69/64
Train4/test4    70/74  74/70  72/67  68/74  71/67  67/72  69/70  66/59  71/66  68/63
Train5/test5    70/84  73/84  71/84  68/74  70/79  66/79  68/78  65/79  70/84  68/79
Train6/test6    70/77  72/84  70/79  68/72  69/84  66/77  68/74  64/79  69/79  67/77
Train7/test7    69/79  72/81  70/77  67/79  69/77  66/75  67/79  62/85  69/79  66/81
Train8/test8    71/68  75/66  72/66  70/62  72/62  69/54  69/70  68/48  72/62  70/56
Train9/test9    71/72  76/59  72/67  70/56  71/67  69/53  72/41  68/46  72/56  69/64
Train10/test10  73/32  75/53  73/42  70/42  71/53  68/48  69/57  66/63  72/42  70/42

Highlighted results in bold have low testing accuracy (lower than 60%).


few input variables (fewer than six input variables), ANFIS-GRID can be a good choice. Table 3 shows that the overall average accuracy is up to 78% for Fis7. This is a very satisfactory result in injection profile prediction. Hence, the ANFIS-GRID is chosen to construct FIS in this paper.

5. Effects of noisy data

For soft computing methods, when the prediction accuracy is discussed, it is assumed that both the data used to train the models and the testing data used to make predictions are free of errors [24]. But a data set is rarely clean unless extraordinary effort has been put into cleaning it [23]. For the problem of injection profile prediction, with measurement errors caused by reading and equipment for all six selected parameters, especially the relative injectivity, it is not unusual to have some extreme patterns that decrease the model accuracy. In this paper, we briefly discuss the effect of data quality on FIS using the selected ANFIS-GRID.

5.1. Approximate dependencies

In this work, raw data is analyzed using the approximate functional dependency mining method. An approximate functional dependency, or an approximate dependency, is a functional dependency that is almost valid, with some exceptional data tuples. A functional dependency describes the relationships of attributes in one or several tables, and claims that the value of an attribute is uniquely determined by the values of some other attributes. The discovery of functional dependencies in databases leads to the discovery of useful knowledge and data quality problems.

More formally, a functional dependency over a relation (or a table) is expressed as $X \to A$, where $X \subseteq R$ and $A \in R$. The dependency is valid in a given relation $r$ if, for all pairs of records $t, u \in r$, the following statement holds: if $t[B] = u[B]$ for all $B \in X$, then $t[A] = u[A]$. A functional dependency $X \to A$ is minimal if $A$ is not functionally dependent on any proper subset of $X$. The dependency $X \to A$ is trivial if $A \in X$. The task in functional dependency mining is to find all minimal non-trivial dependencies that hold in $r$.

Approximate dependencies arise in many databases when there are natural dependencies between attributes, but some records contain errors or inconsistencies. One example is the relationship between zip code and the combination of city and state in a country. Another example is the social security number (SSN) and the corresponding person residing in the USA. Theoretically, these attributes have consistent relationships, as one person is associated with one SSN, and one zip code is associated with one combination of city and state in a country. But if errors are somehow introduced, the relationships between these attributes will be violated, which leads to approximate dependencies.

The TANE algorithm [8], which discovers functional and approximate dependencies in large data files, is an effective algorithm in practice. The TANE algorithm partitions the set of tuples into equivalence classes according to their attribute values. By checking whether the tuples that agree on the left-hand side also agree on the right-hand side, one can determine whether a dependency holds or not. By analyzing the identified approximate dependencies, one can identify potentially erroneous data in the relation.
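The partition-based check for a single dependency can be sketched as follows. The code groups tuples by their left-hand-side values and counts the minimum number of rows that would have to be deleted for the dependency to hold exactly, which is the error measure used for approximate dependencies in [8]. The function name and the row layout are assumptions for illustration; the sample rows are the first six tuples of Table 6. This is a sketch of the validity check only, not of the TANE search over candidate dependencies.

```python
from collections import Counter, defaultdict


def rows_to_delete(rows, lhs, rhs):
    """Minimum number of rows to remove so that lhs -> rhs holds exactly."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)  # equivalence class on the left-hand side
        groups[key][row[rhs]] += 1        # right-hand-side values seen in that class
    # Within each class, keep the most frequent right-hand-side value.
    return sum(sum(c.values()) - max(c.values()) for c in groups.values())


columns = ("hgross1", "hnet1", "hgross2", "hnet2", "d_prime", "ri")
table6_head = [
    (0.2, 0, 0.4, 0, 150, 0),
    (0.2, 0, 0.4, 0, 150, 2),
    (0.2, 0, 0.6, 0.5, 150, 0),
    (0.2, 0, 0.6, 0.5, 150, 3),
    (0.2, 0, 0.4, 0.2, 150, 0),
    (0.2, 0, 0.4, 0.2, 150, 5),
]
rows = [dict(zip(columns, values)) for values in table6_head]

violations = rows_to_delete(rows, lhs=columns[:5], rhs="ri")
print(violations, violations / len(rows))
```

A dependency holds exactly when the count is zero; a small non-zero count relative to the table size marks it as approximate, and the rows outside the majority value in each class are the candidates for inspection.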



In this paper, relationships of five pivot parameters (hgross1, hnet1, hgross2, hnet2, and d) and relative injectivity (ri) are analyzed using the TANE algorithm. For equivalence partition, hgross1, hnet1, hgross2, and hnet2 are kept in their original representation. The relative injectivity is kept at a precision of 1%. The well spacing (d) is converted into discrete values using Eq. (9). The discretized spacing d′ keeps the unit of d, which is the meter.
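A minimal sketch of the discretization defined by Eq. (9), in which each spacing in the range 125 m ≤ d < 325 m is mapped to the center of its 50 m bin. The function name and the error raised outside that range are assumptions, since the source does not state how other spacings are handled.

```python
def discretize_spacing(d):
    """Map a well spacing d (in meters) to the discrete value d' of Eq. (9)."""
    bins = [(125, 175, 150), (175, 225, 200), (225, 275, 250), (275, 325, 300)]
    for lower, upper, d_prime in bins:
        if lower <= d < upper:
            return d_prime
    # Assumption: spacings outside the tabulated range are reported, not clipped.
    raise ValueError(f"spacing {d} m is outside the range covered by Eq. (9)")


print([discretize_spacing(d) for d in (125, 174.9, 175, 260, 324)])
```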

$$
d' =
\begin{cases}
150, & 125 \le d < 175, \\
200, & 175 \le d < 225, \\
250, & 225 \le d < 275, \\
300, & 275 \le d < 325.
\end{cases}
\tag{9}
$$

5.2. Results from TANE algorithm

After data pre-processing, four approximate dependencies are discovered, as shown in Table 5. Although all these dependencies reflect the relationships among parameters, the first dependency is the most important one because it shows that the five selected parameters have a consistent association with the water injectivity per active layer, except for a few data tuples. This is a very important dependency for injection profile prediction.

Table 5
Approximate dependencies detected using the TANE algorithm

Index  Approximate dependency                     Rows to delete
1      hgross1, hnet1, hgross2, hnet2, d′ → ri    25
2      hgross1, hnet1, hgross2, hnet2, ri → d′    20
3      hgross1, hnet1, hgross2, d′, ri → hnet2    24
4      hgross1, hgross2, hnet2, d′, ri → hnet1    23

Table 6
Conflicting tuples identified by analyzing the first approximate dependency in Table 5

Index  hgross1  hnet1  hgross2  hnet2  d′   ri
1      0.2      0      0.4      0      150  0
2      0.2      0      0.4      0      150  2
3      0.2      0      0.6      0.5    150  0
4      0.2      0      0.6      0.5    150  3
5      0.2      0      0.4      0.2    150  0
6      0.2      0      0.4      0.2    150  5
7      0.4      0.2    0.4      0      150  0
8      0.4      0.2    0.4      0      150  1
9      0.5      0      0.4      0      150  0
10     0.5      0      0.4      0      150  2
11     0.5      0.5    0.5      0.5    150  0
12     0.5      0.5    0.5      0.5    150  1
13     0.5      0      0.8      0      150  0
14     0.5      0      0.8      0      150  2
15     0.5      0.4    1.0      0.4    150  1
16     0.5      0.4    1.0      0.4    150  6
17     0.6      0.2    0.5      0.2    200  1
18     0.6      0.2    0.5      0.2    200  6
19     1.3      1.1    1.2      0.4    150  0
20     1.3      1.1    1.2      0.4    150  3

Highlighted tuples contain obvious conflicting or erroneous information.

To identify exceptional tuples by analyzing the approximate dependencies, it is necessary to investigate the equivalence partitions of both the left-hand and right-hand sides of an approximate dependency. It is non-trivial work that could lead to the discovery of problematic data. By analyzing the first dependency, conflicting tuples are identified, as given in Table 6. From Table 6, one can see that detected tuples contain conflicting


relationships or associations among parameters, and some of the conflicts are severe. For example, tuples 5 and 6, and tuples 15 and 16, have the same values of hgross1, hnet1, hgross2, hnet2, and d′, yet their relative water injectivity per active layer differs greatly. These tuples could create trouble for injection profile modeling and prediction. Based on domain experts' suggestions, tuples 4, 6, 16, 18, and 19 are removed from the raw data sets, and more experiments are conducted using the above methods.

5.3. Results using cleaned sample data

Repeating the experiments with ANFIS-GRID on the cleaned data yields better results, as shown in Table 7. The corresponding first-order ST FIS is represented in Figs. 10 and 11.

Fig. 10 shows the membership functions for the five input variables, and Fig. 11 lists the linear consequent equations in the format C1*input1 + C2*input2 + C3*input3 + C4*input4 + C5*input5 + C, where C1, C2, C3, C4, C5, and C are the coefficients shown in Fig. 11. The five input variables are input1 (gross thickness near the injection wells), input2 (net thickness near the injection wells), input3 (gross thickness near the producing wells), input4 (net thickness near the producing wells), and input5 (well distance between injection wells and surrounding producing wells). Combining the membership functions of the inputs with the outputs, first-order ST fuzzy rules can be established in the format of Eq. (7), expressing the relationships among hgross1, hnet1, hgross2, hnet2, d, and ri. A FIS can be constructed with a set of fuzzy rules and complicated fuzzification and defuzzification operations. With the constructed FIS, the crisp output, the relative injectivity, can be calculated by feeding in proper crisp inputs.
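To make the inference step concrete, the following is a minimal sketch of evaluating a first-order Sugeno-type FIS with generalized bell-shaped membership functions, as in standard ANFIS [9]: the firing strength of each rule is the product of its input memberships, and the crisp output is the firing-strength-weighted average of the linear consequents. The two-input layout, all membership parameters, and all coefficients below are hypothetical; they are not taken from Figs. 10 and 11.

```python
from itertools import product


def gbell(x, a, b, c):
    """Generalized bell membership function 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def sugeno_output(x, mfs, consequents):
    """Evaluate a first-order Sugeno FIS built by grid partition.

    mfs[i] lists the (a, b, c) parameters of the MFs of input i;
    consequents maps each combination of MF indices to [C1, ..., Cn, C].
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for combo in product(*(range(len(m)) for m in mfs)):
        strength = 1.0
        for xi, params, j in zip(x, mfs, combo):
            strength *= gbell(xi, *params[j])  # product of input memberships
        coeffs = consequents[combo]
        rule_output = sum(c * xi for c, xi in zip(coeffs, x)) + coeffs[-1]
        weighted_sum += strength * rule_output
        weight_total += strength
    return weighted_sum / weight_total


# Hypothetical two-input example with two MFs per input (four rules).
mfs = [[(0.5, 2, 0.0), (0.5, 2, 1.0)], [(0.5, 2, 0.0), (0.5, 2, 1.0)]]
consequents = {
    (0, 0): [1.0, 0.0, 0.0],
    (0, 1): [0.0, 1.0, 0.5],
    (1, 0): [2.0, -1.0, 1.0],
    (1, 1): [0.5, 0.5, 2.0],
}
print(sugeno_output((0.3, 0.8), mfs, consequents))
```

For the FIS described here, the same loop would run over 32 index combinations of five inputs, with six coefficients per rule.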

Comparing Tables 3 and 7, the effect of data quality on FIS is clearly observed. Therefore, it is highly advisable to analyze the quality of data sets before they are applied in soft computing modeling. Fig. 12 shows the predicted relative injectivity using Fis6 in Table 7. The overall accuracy is 86.1%.

5.4. Practical results

A fuzzy mathematical approach was applied in the Daqing Oilfield of China to predict injection profiles, as reported in [28]. Promising results were achieved in understanding the residual oil distribution and improving the oil recovery in the Daqing Oilfield. In that fuzzy system, sand types, connection status of injection and production wells, and distance from the injection well to the production well were taken as appraisal objects or influential factors. Each appraisal factor was classified into several categories to construct fuzzy membership functions based on domain expertise. The target variable, relative injectivity, was also classified into categories according to the percentage of relative injectivity. Three subordinate relationship plates were constructed for the membership functions of sand type, connection status and well spacing with respect to the water absorbing status. The relative injectivity was classified either into five categories, 0, (0,3), [3,5), [5,7), and [7,inf), or into three categories: good absorbing, bad absorbing and non-absorbing. The prediction accuracy is 71% and 79% for these two classification methods, respectively.
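The five-category scheme can be written as a simple binning rule. The sketch below only illustrates the interval boundaries quoted above; the function name and the handling of negative values are assumptions, and the membership functions of [28] are not reproduced.

```python
from bisect import bisect_right


def injectivity_category(ri):
    """Return the five-category label for a relative injectivity ri (%)."""
    if ri < 0:
        raise ValueError("relative injectivity cannot be negative")
    if ri == 0:
        return "0"
    labels = ["(0,3)", "[3,5)", "[5,7)", "[7,inf)"]
    # bisect_right places a value equal to a boundary into the upper interval.
    return labels[bisect_right([3, 5, 7], ri)]


print([injectivity_category(v) for v in (0, 2.9, 3, 6.5, 7, 12)])
```

Every value of 7% or more falls into the single open-ended category, however large it is.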

Table 7
Validation results from different training and testing data sets (training/testing accuracy, %)

                Fis1   Fis2   Fis3   Fis4   Fis5   Fis6   Fis7   Fis8   Fis9   Fis10
Train1/test1    85/42  84/77  82/90  82/87  81/95  83/89  83/85  83/81  83/80  83/83
Train2/test2    81/84  86/49  81/86  81/85  80/95  81/89  81/81  81/83  82/82  82/83
Train3/test3    83/84  82/87  86/62  82/85  81/95  82/89  82/81  82/86  83/78  82/90
Train4/test4    82/84  83/72  81/88  85/66  80/95  82/89  82/85  82/81  82/82  82/90
Train5/test5    81/84  81/82  80/88  81/83  84/60  81/89  81/78  81/83  81/82  81/90
Train6/test6    86/84  87/82  85/92  86/85  85/95  86/89  87/78  87/81  87/82  86/90
Train7/test7    84/84  84/85  83/90  84/85  83/93  84/89  84/56  84/81  84/82  84/86
Train8/test8    83/84  83/79  81/90  83/83  81/95  82/89  86/85  86/53  83/82  82/86
Train9/test9    81/84  82/74  80/88  81/83  80/93  81/89  81/81  81/86  85/54  81/90
Train10/test10  84/84  85/77  83/92  84/85  83/95  84/89  85/81  85/81  84/87  86/66

The training and testing sets are cleaned based on results in Table 6. Highlighted results in bold have low testing accuracy (lower than 70%). Highlighted FIS in italics has best performance.





Fig. 10. The generalized bell-shaped membership functions of five input variables using Fis6 in Table 7.


With intensive domain expertise, it is possible to construct a proper FIS as in [28]. But for relative absorbing percentages over 7%, the classification is blurred, which makes the prediction of high-level water injection difficult. With about 10% of the data points having water absorption of no less than 7%, a more detailed classification is needed.

In addition, it is usually difficult to construct proper fuzzy membership functions, even with the assistance of domain experts. Therefore, self-learning and refinement, which enable the FIS to be learned from available training data sets, are the best choices. In order to test the resultant FIS, injection profile data from N2-D2-B447 is applied using Fis6 in Table 7. N2-D2-B447 is an independent injection well in the South II District of the Daqing Oilfield, China. Its injection profile data contains thin strata with large injectivity, such as S24a and S25, and thick strata with small injectivity, such as S216 and P21a. The results in Fig. 13 show that these strata are poorly predicted. The overall prediction accuracy is up to 85%.


Output1: [27.39 -21.58 -10.14 -18.05 -0.2389 -1.172]

Output2: [-125.4 59.82 48.78 31.97 -0.5125 91.85]

Output3: [-323.8 249 25.99 424.2 -0.7146 -91.64]

Output4: [648.4 -111.6 -190.7 -554.5 2.675 -249.8]

Output5: [-81.45 -14.16 -185 -72.65 5.384 294.1]

Output6: [1093 -365.8 -578.6 457.7 3.203 -155.6]

Output7: [1047 133.5 1016 881.3 -36.34 125.8]

Output8: [248.2 11.14 7.57 42.19 -12.61 -7.887]

Output9: [-114.5 -54.33 -78.3 118.7 0.9122 36.67]

Output10: [108.1 321.5 59.6 231.3 3.395 -571.8]

Output11: [521.8 150.7 438.1 -527.9 -6.152 -216.1]

Output12: [429.9 -58.44 707.3 209.8 -33.33 41.54]

Output13: [809 105.7 149.5 -287.9 0.8564 449.8]

Output14: [-1591 -187.9 858.9 902.3 -15.03 243.6]

Output15: [274.9 52.83 41.51 103.7 -90.28 13.79]

Output16: [-72.69 14.51 91.94 67.63 67.01 59.28]

Output17: [86.01 -30.34 -44.12 -51.6 -3.541 -6.4]

Output18: [-53.38 65.14 -433.4 -1103 -4.199 108.3]

Output19: [-1986 -594.8 962.1 -1750 68.58 -675.6]

Output20: [-72.16 204.5 381.9 -16.21 50.59 -24.52]

Output21: [841.2 1110 -77.11 1451 4.623 -32.76]

Output22: [238.7 -183.3 -349.7 469.3 21.87 -81.53]

Output23: [-353 -146 -180.2 -232.7 -245 -89.46]

Output24: [20.41 -5.733 -45.98 -8.671 27.18 -0.1444]

Output25: [-62.79 109.7 -103.8 -151.1 0.3008 13.69]

Output26: [166.9 -719.4 625 -336.9 -4.252 457.5]

Output27: [-547.5 -419.4 1079 278.3 2.249 83.72]

Output28: [-229.2 -333.9 299.6 -20.64 69.48 12.75]

Output29: [992.9 252.4 -1074 -36.96 -11.88 -203]

Output30: [516.6 161.3 39.42 208.2 -12.85 59.8]

Output31: [-8.946 -0.8251 -142.4 27.05 136.8 -12.12]

Output32: [30.04 -7.689 -16.4 -4.141 -337.4 7.105]

Fig. 11. Linear output coefficients of fuzzy rules of Fis6 in Table 7.

Fig. 12. Prediction of the sample data set using Fis6 in Table 7. [Plot: relative injectivity (%), 0 to 20, on the vertical axis, horizontal axis ticks from 1 to 331; series: FIS Predicted and Original.]

Fig. 13. Prediction of the injection profile for N2-D2-B447 using Fis6 mentioned in Table 7. [Plot: injectivity (%), 0 to 16, by stratum from S12a to P210c; series: Real Water Injection Profile and Predicted Profile.]


Tested with data from the Daqing Oilfield, the average injection profile prediction accuracy is improved. The new approach can be expected to have wider application in resolving complicated problems related to petroleum exploration and development. Because FIS identification and modification are easy, more knowledge can be saved in a real knowledge base for a given development unit or locality. Significant savings in production cost and improvements in work efficiency can be achieved.


6. Conclusions

As an efficient neuro-fuzzy system, ANFIS can be applied to learn FIS and to identify and refine the antecedent and consequent parameters in MF and fuzzy rules using training data sets. It provides an effective approach for many complicated engineering problems in various fields. In this paper, studies on ANFIS, ANFIS-GRID and ANFIS-SUB indicate that the selection of an appropriate neuro-fuzzy system depends on the problem and the available data sets. Taha and Noureldin [22] found that, in their cases, different selections of num_MF do not affect the performance of ANFIS significantly, while the initial step size S and the step change rates RInc and RDec are significant to the training RMSE of the model. In contrast, in our experiments, RDec has little effect on the results.

Although ANFIS-GRID is known for the "curse of dimensionality", it works best in our problem, compared with ANFIS and ANFIS-SUB. In problem modeling using ANFIS-GRID, it is important to investigate the size of the training set and the architecture of the FIS, with the size of the training set being larger than the total number of parameters (e.g. for premise fuzzy sets and linear output) in the FIS.
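The parameter total for a grid-partitioned FIS can be estimated with a short calculation. The sketch assumes three premise parameters per generalized bell-shaped MF and one linear consequent with n + 1 coefficients per rule, which is consistent with the six coefficients per rule listed in Fig. 11; the function name is illustrative.

```python
def anfis_grid_parameter_count(n_inputs, num_mf, params_per_mf=3):
    """Return (rules, premise, consequent, total) parameter counts."""
    n_rules = num_mf ** n_inputs                 # one rule per MF combination
    premise = n_inputs * num_mf * params_per_mf  # MF shape parameters
    consequent = n_rules * (n_inputs + 1)        # linear coefficients plus constant
    return n_rules, premise, consequent, premise + consequent


print(anfis_grid_parameter_count(5, 2))  # (32, 30, 192, 222)
```

Under these assumptions, the five-input FIS with num_MF = 2 has 222 adjustable parameters, so the training set should contain more than 222 samples.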

In this paper, generalized bell-shaped membership functions are applied for the five input variables. Compared with the manual work done in [28], the FIS is constructed more easily, directly from the available data sets, and the prediction accuracy in our results is higher.

There are other types of membership functions [21] that should be tried for complicated engineering problems. Because ANFIS-GRID has its own disadvantages for problems with more than five input variables, it is also recommended to improve the performance of ANFIS-SUB by applying genetic algorithms for parameter selection, as in [7], and other improvements, as seen in [5,18].

Acknowledgements

The authors thank an anonymous referee for the insightful comments, which helped to improve the paper greatly. Financial support from the New Mexico Petroleum Recovery Research Center, a research division of New Mexico Tech, is gratefully acknowledged.

References

[1] J. Abonyi, H. Andersen, L. Nagy, F. Szeifert, Inverse fuzzy-process-model based direct adaptive control, Mathematics and Computers in Simulation 51 (1999) 119–132.
[2] S. Chiu, Fuzzy model identification based on cluster estimation, Journal of Intelligent and Fuzzy Systems 2 (3) (1994) 267–278.
[3] S. Chiu, Extracting fuzzy rules from data for function approximation and pattern classification, in: D. Dubois, H. Prade, R. Yager (Eds.), Chapter 9 in the Fuzzy Information Engineering: A Guided tour of Applications, Springer, Berlin, 1997, pp. 149–162.
[4] K. Demirli, S.X. Cheng, P. Muthukumaran, Subtractive clustering based modeling of job sequencing with parametric search, Fuzzy Sets and Systems 137 (2) (2003) 235–270.
[5] K. Demirli, P. Muthukumaran, Higher order fuzzy system identification using subtractive clustering, Journal of Intelligent and Fuzzy Systems 9 (3–4) (2000) 129–158.
[6] B. Fritzke, Incremental neuro-fuzzy systems, in: Proc. Application of Soft Computing, SPIE International Symposium on Optical Science, Engineering and Instrumentation, San Diego, 1997, pp. 86–97.
[7] M.A. Hassanain, M.M. Reda Taha, A. Noureldin, N. El-Sheimy, Automation of an INS/GPS integrated system using genetic optimization, in: Proc. the 5th International Symposium on Intelligent Automation and Control, Seville, Spain, 2004, pp. 347–352.
[8] Y. Huhtala, J. Karkkainen, P. Porkka, H. Toivonen, TANE: an efficient algorithm for discovering functional and approximate dependencies, The Computer Journal 42 (2) (1999) 100–111.
[9] J.R. Jang, ANFIS: adaptive-network-based fuzzy inference system, IEEE Transactions on Systems, Man and Cybernetics 23 (3) (1993) 665–685.
[10] J.R. Jang, Frequently asked questions – ANFIS in the fuzzy logic toolbox, <http://www.cs.nthu.edu.tw/~jang/anfisfaq.htm>.
[11] J.R. Jang, C. Sun, E. Mizutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall Inc., Englewood Cliffs, 1997.
[12] S. Jonic, T. Jankovic, V. Gajic, D. Popovic, Three machine learning techniques for automatic determination of rules to control locomotion, IEEE Transactions on Biomedical Engineering 46 (3) (1999) 300–310.
[13] S.D. Kaehler, Fuzzy logic tutorial, the newsletter of the Seattle Robotics Society.
[14] N. Kasabov, Q. Song, DENFIS: dynamic evolving neural-fuzzy inference system and its application for time-series prediction, IEEE Transactions on Fuzzy Systems 10 (2) (2002) 144–154.
[15] P. Kennedy, M. Condon, J. Dowling, Torque-ripple minimization in switched reluctant motors using a neuro-fuzzy control strategy, in: Proc. IASTED International Conference on Modeling and Simulation, 2003.
[16] R.R. Lbatullin, N.G. Lbragimov, R.S. Khisamov, E.D. Podymov, A.A. Shutov, Application and method based on artificial intelligence for selection of structures and screening of technologies for enhanced oil recovery, SPE 75175, in: Proc. SPE/DOE Improved Oil Recovery Symposium, Tulsa, Oklahoma, 2002.
[17] Y. Liu, B. Bai, Y.X. Li, J.-P. Coste, Optimization design for conformance control based on profile modification treatments of multiple injectors in a reservoir, SPE 64731, in: Proc. the 2000 SPE Symposium, Beijing, 2000.
[18] W. Liu, C. Xiao, B. Wang, Y. Shi, S. Fang, Study on combining subtractive clustering with fuzzy c-means clustering, in: Proc. the 2nd International Conference on Machine Learning and Cybernetics, Xi’an, China, 2003, pp. 2659–2662.
[19] E.H. Mamdani, Advances in the linguistic synthesis of fuzzy controllers, International Journal of Man-Machine Studies 8 (1976) 669–678.
[20] P. Melin, O. Castillo, Intelligent control of a stepping motor drive using an adaptive neuro-fuzzy inference system, Information Sciences 170 (2–4) (2005) 133–150.
[21] S. Mitaim, B. Kosko, The shape of fuzzy sets in adaptive function approximation, IEEE Transactions on Fuzzy Systems 9 (4) (2001) 637–656.
[22] M.M. Reda Taha, A. Noureldin, N. El-Sheimy, Improving INS/GPS positioning accuracy during GPS outages using fuzzy logic, in: Proc. GPS-GNSS 2003, Oregon, 2003, pp. 499–508.
[23] T.C. Redman, The impact of poor data on the typical enterprise, Communications of the ACM (1998) 79–82.
[24] D.F. Rossin, B.D. Klein, Data errors in neural network and linear regression models: an experimental comparison, Data Quality 5 (1) (1999) 33–43.
[25] M. Sugeno, Industrial Applications of Fuzzy Control, Elsevier Science, Amsterdam, the Netherlands, 1985.
[26] T. Takagi, M. Sugeno, Derivation of fuzzy control rules from human operator’s control actions, in: Proc. the IFAC Symp. on Fuzzy Information, Knowledge Representation and Decision Analysis, 1983, pp. 55–60.
[27] D. Tamhane, P.M. Wong, F. Aminzadeh, M. Nikravesh, Soft computing for intelligent reservoir characterization, SPE 59397, in: Proc. SPE Asia Pacific Conference on Integrated Modeling for Asset Management, Yokohama, Japan, 2000.
[28] J. Wang, X. Li, H. Li, Prediction of injection profile of an injector using a fuzzy mathematical method, CIPC2005-134, in: Proc. Petroleum Society’s 6th Canadian International Petroleum Conference (56th Annual Technical Meeting), Calgary, Alberta, Canada, 2005.
[29] W. Weiss, R. Balch, How artificial intelligence methods can forecast oil production, SPE 75143, in: Proc. SPE/DOE Improved Oil Recovery Symposium, Tulsa, Oklahoma, 2002.
[30] R. Yager, D. Filev, Generation of fuzzy rules by mountain clustering, Journal of Intelligent and Fuzzy Systems 2 (3) (1994) 209–219.
[31] L.A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338–353.
[32] L.A. Zadeh, Toward a generalized theory of uncertainty (GTU) – an outline, Information Sciences 172 (2005) 1–40.
