Real time phase detection based online monitoring of batch fermentation processes
Soumen K. Maiti a, Rajesh K. Srivastava b, Mani Bhushan a, Pramod P. Wangikar a,b
a Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
b School of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
Keywords:
Principal component analysis
Mean
Covariance
Moving window
Singular point
ABSTRACT
Industrial fermentations conducted in a batch or semi-batch mode demonstrate significant batch-to-
batch variability. Current batch process monitoring strategies involve manual interpretation of highly
informative but low frequency offline measurements such as concentrations of products, biomass and
substrates. Fermentors are also fitted with computer interfaced instrumentation, enabling high
frequency online measurements of several variables, and automated techniques that can utilize these
data would be desirable. Evolution of a batch fermentation, which typically uses complex medium, can
be conceptualized as a sequence of several distinct metabolic phases. Monitoring of batch processes can
then be achieved by detecting the phase change events, also termed as singular points (SP). In this work,
we propose a novel moving window based real-time monitoring strategy for SP detection based only on
online measurements. The key hypothesis of the strategy is that the statistical properties of the online
data undergo a significant change around an SP. The strategy is easily implementable and does not
require past data or prior knowledge of the number or time of occurrence of SPs. The efficacy of the
proposed approach has been demonstrated to be superior compared to that of reported techniques for
industrially relevant model organisms. The proposed approach can be used to decide offline sampling
timings in real time.
1. Introduction
Fermentation processes are widely used in the food, pharmaceutical, agrochemical and chemical industries. The production units range from small scale for biopharmaceuticals to large scale for bulk chemicals. A majority of the processes are operated in a batch or semi-batch mode. Intense competition and regulatory requirements pose severe demands on consistency of these batches in terms of the end of batch productivity and product quality [1]. However, fermentation processes are subject to intrinsic batch-to-batch variability due to variability in raw material quality, state of the seed culture and operator skills. It is therefore desirable to automate monitoring, fault detection and diagnosis and control of fermentation processes. This can lead to improved process reliability, product quality and productivity as well as reduced development time, manpower inputs and cost of production [2].
Typically, during operation, the product quality and batch performance are monitored via off-line measurements of concentrations of the product, byproducts, biomass and substrates. These measurements are expensive, labor intensive and time consuming, are obtained at low frequencies (e.g., every few hours) at pre-defined intervals and hence may not always lead to timely information about the status of the batch. Further, in some processes, the product formation begins only towards the later parts of the batch and this leads to additional difficulty in adequately monitoring the process using these offline measurements [3]. Fermentors are typically equipped with several on-line sensors such as pH, temperature, concentrations of dissolved oxygen (DO) and carbon dioxide and partial pressure of oxygen and carbon dioxide in the exhaust gas. These measurements are inexpensive, usually available at high frequencies (e.g., every few seconds) and are obtained in an automated fashion. Hence, there is enormous potential to use these measurements to effectively monitor batch fermentation processes.
In the general process systems engineering literature, several different techniques have been reported for process monitoring and fault diagnosis [4]. These can be broadly classified as process model based, knowledge based and historical data based. The success of any model based strategy depends critically on the adequacy of the underlying model. Industrial fermentation processes typically employ complex media with multiple substitutable carbon and nitrogen substrates, which leads to difficulties in developing adequate process models. Further, several aspects of fermentation processes, such as the dynamic evolution of pH and concentration of dissolved oxygen, are not well understood in general and this may lead to additional difficulties in developing reliable process models. Hence, model based strategies may not be suitable for monitoring of the majority of industrial fermentation processes. Knowledge based monitoring techniques such as those based on fuzzy logic require expert knowledge of the system and therefore are system specific [4,5]. Such expert knowledge may not be available for the system of interest. Further, even for cases where such knowledge exists in terms of the manpower knowledgeable about the system, it is not straightforward to translate such knowledge to a form that can be readily utilized by automated monitoring systems. Historical data based methods rely on a large amount of past data to capture the underlying relationships between the process variables [4,6,7]. However, due to batch-to-batch variability intrinsic to fermentation processes, it is difficult for these techniques to delineate between normal and abnormal variations.
Another set of methods, based on ideas from statistical control literature, have been proposed that rely only on data available from the current batch [8–10]. Fermentation processes typically utilize complex organic substrates such as yeast extract in addition to defined components such as glucose and ammonia. This provides a substitutable multisubstrate milieu, which may result in sequential and/or simultaneous utilization of the substrates. The cellular metabolism may be different in each such substrate uptake phase [11]. Evolution of a batch fermentation process can then be conceptualized as a sequence of such phases, each with its own duration and dynamics. It is expected that batch-to-batch variability would therefore, among other things, translate to variations in switching times between the phases [12]. Hence, effective monitoring can be achieved by detecting the time of occurrence of these various phases. The reported technique based on this philosophy detects the phase change time by identifying qualitative changes in trajectories of the test statistic $T^2$ and principal component score plots [9]. Being qualitative in nature, this technique is difficult to automate. While other statistical process monitoring techniques such as Shewhart charts, cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) [8,10,13] have been applied for monitoring batch processes in general, they have not been specifically applied for monitoring fermentation processes characterized by multiple phases, since it is typically assumed in these techniques that the entire batch data is characterized by a single set of statistical properties (such as mean and covariance).
In this article, we present a real time phase detection based process monitoring scheme that does not require a process model or historical data. The scheme is inspired by statistical control literature, is multivariate in nature, relies only on online measurements and can be easily automated to work with industrial processes. The basic premise in our approach is that statistical properties of online measured data are different in different phases. Hence, the problem of phase change detection is treated as equivalent to that of detecting changes in the statistical properties of the data. To be consistent with earlier work [9,12], we refer to a point where phase change is detected as a singular point (SP).
2. Experimental methods
In this study, experimental data has been collected for two different strains,
Amycolatopsis balhimycina DSM5908 and Bacillus pumilus ATCC 21951 while the
data for Amycolatopsis mediterranei S699 was taken from Doan et al. [9]. For A.
balhimycina and B. pumilus, the fermentation experiments were performed in a 2.5 l
fermentor equipped with various sensors and data acquisition system (Model:
Biostat B, B. Braun, Germany). The fermentor was aerated at a constant flow rate of
1.0 vvm (volume of air per unit volume of medium per minute) using a mass flow
controller. Dissolved oxygen (DO) concentration in the fermentor was maintained
at 40% of saturation value by controlling the stirrer speed in cascade mode with DO.
The concentrations of oxygen and carbon dioxide in the exhaust gas were measured
by paramagnetic analysis and infrared spectroscopy, respectively (Analyser
BINOS1002 M, Rosemount Analytical, Germany). The online measurements were
stored at 5 min intervals.
The Amycolatopsis balhimycina strain was a gift from Prof Anna Eliasson Lantz of
Denmark’s Technical University, Denmark, and was stored on Bennett agar plates at
4 °C. Seed culture was grown in 100 ml medium in a 500 ml capacity Erlenmeyer
flask with single baffle and incubated at 30 °C and 150 rpm. The seed medium
contained per liter of distilled water: glucose: 15 g, glycerol: 15 g, soya peptone:
15 g, NaCl: 5 g and yeast extract: 3 g. Upon reaching an optical density of ~12 at
600 nm, 25 ml of the seed culture was transferred to a fermentor containing 1 l of
production medium. The production medium contained, per liter of distilled water,
glucose: 54–100 g, glycerol: 0–16 g, ammonium sulfate: 3–6.6 g, yeast extract:
0.75–1.5 g, defatted soybean flour: 0.25–1.0 g, ZnSO4: 0.02 g, FeSO4: 0.02 g,
trisodium citrate: 0.025 g, MgSO4: 1.5 g, MnSO4: 0.01 g, NaCl: 1 g, MES: 1.045 g
and KH2PO4: 0.2 g. In addition, the following vitamins were added: biotin:
0.00005 g, calcium-pantothenate: 0.001 g, nicotinic acid: 0.001 g, myo-inositol:
0.025 g, thiamin HCl: 0.001 g, pyridoxine HCl: 0.001 g and para-aminobenzoic
acid: 0.0002 g. Temperature was maintained at 30 °C and pH was maintained at 7.0
by adding 1.5 N NaOH solution using a pH controller. The online measurements
included NaOH flow rate, pH, agitator speed and DO.
A transketolase (tkt) deficient strain of Bacillus pumilus ATCC 21951 was procured
from the Institute for Fermentation, Osaka, Japan. The strain was maintained on Luria
Bertani agar slant and was stored at 4 °C. The preparation of pre-seed and seed
cultures and the culture transfer criteria were as described earlier [14]. The
production medium contained per liter of distilled water: glucose: 200 g, cas amino
acids: 15 g or corn steep liquor: 12 g, ammonium sulfate: 5 g, CaCO3: 16 g, MnSO4:
0.5 g, leucine: 0.5 g and tryptophan: 0.05 g. The temperature was maintained at
37 °C. The online measurements available for Bacillus pumilus were: pH, dissolved
oxygen, agitator speed and CO2 and O2 concentration in exhaust gas.
For both the strains, samples were drawn from the fermentation medium at
regular intervals to obtain the time profiles of concentrations of dry cell weight
(DCW), product(s) and substrate(s). Glucose, glycerol, D-ribose, acetate, acetoin and
2,3-butanediol were analyzed via RI detector on HPLC (Hitachi, Merck KGaA,
Darmstadt, Germany) using HP-Aminex-87-H column (Biorad, Hercules, CA, USA)
with column temperature maintained at 60 °C. A mobile phase of 5 mM sulfuric acid
with flow rate of 0.6 ml/min was used. The concentration of free amino acids was
estimated via the ninhydrin method. The details are described in earlier works
[11,14,15]. Ammonia was measured using Nessler’s reagent [16]. For A.
balhimycina, DCW was measured by filtering 10 ml of the fermentation broth
using pre-weighed filter papers (Whatman, Brentford, Middlesex, UK) as reported
elsewhere [11]. Micrococcus luteus was used as a test organism to measure
antimicrobial activity of balhimycin [17]. For this purpose, agar test plates with
Micrococcus luteus growth medium were prepared. Holes were punched in the agar
medium and filled with fermentor samples. Then the plates were incubated for two
days at 30 °C. The growth inhibition diameter around the holes was measured and
the concentration of balhimycin was determined using a pre-computed calibration
curve.
The data for Amycolatopsis mediterranei S699 was taken from literature [9] and
consisted of the following online measurements: pH, dissolved oxygen, agitator
speed and CO2 and O2 concentration in exhaust gas.
3. Phase detection technique
3.1. Algorithm
In this work, the problem of monitoring of fermentation process has been posed as that of detection of singular points (SPs). We assume that the underlying characteristic dynamics and in turn the statistical properties of the online data vary from one phase to another. Thus, we propose that an SP can be detected by appropriately detecting the change in the statistical properties of the available online data as described below (Fig. 1). For the current phase $\phi_i$ and a new data point $x_k$ ($x_k = [x_{1k}\; x_{2k}\; x_{3k} \ldots x_{pk}]$, where $p$ is the number of variables being measured), the following hypothesis is checked:
$$\text{Null hypothesis: } H_0: x_k \in \phi_i \qquad \text{Alternative hypothesis: } H_1: x_k \notin \phi_i \tag{1}$$
Fig. 1. Schematic representation of the proposed ‘‘Moving window-dynamic principal component analysis (MW-DPCA)’’ approach for singular point (SP) detection.
Let the data belonging to phase $\phi_i$ correspond to a probability distribution $P_i$. Then these hypotheses can be tested by constructing an appropriate test statistic depending on the nature of the distribution $P_i$. In this work, we assume that $P_i$ is a normal distribution, i.e. $P_i = N(\mu_i, \Sigma_i)$, where $\mu_i$ and $\Sigma_i$ are the mean and covariance matrix of the data corresponding to phase $\phi_i$, which can be approximated by the sample average $\bar{x}_i$ and sample covariance $S_i$ calculated from the available data belonging to phase $\phi_i$ as
$$\bar{x}_i = \frac{1}{n_i} \sum_{x_j \in \phi_i} x_j; \qquad S_i = \frac{1}{n_i - 1} \sum_{x_j \in \phi_i} (x_j - \bar{x}_i)^T (x_j - \bar{x}_i) \tag{2}$$
where $n_i$ is the number of data points belonging to phase $\phi_i$. In order to obtain reliable estimates of $\mu_i$ and $\Sigma_i$, the hypothesis testing is performed only after collecting data for a minimum window length ($W_{min}$), i.e., $n_i \ge W_{min}$. The relevant test statistic is then [18]:
$$T_k^2 = (x_k - \bar{x}_i)\, S_i^{-1}\, (x_k - \bar{x}_i)^T \tag{3}$$
which represents the squared Mahalanobis distance of the current point $x_k$ from the mean of the data corresponding to phase $\phi_i$. The null hypothesis is rejected when $T_k^2$ violates the lower or upper control limits, $T^2_{LCL}$ and $T^2_{UCL}$, respectively, as:
$$T_k^2 \le T^2_{LCL} \quad \text{or} \quad T_k^2 \ge T^2_{UCL} \tag{4}$$
where
$$T^2_{LCL} = \frac{p (n_i - 1)(n_i + 1)}{n_i (n_i - p)}\, F(\alpha/2;\, p,\, n_i - p) \tag{5}$$
$$T^2_{UCL} = \frac{p (n_i - 1)(n_i + 1)}{n_i (n_i - p)}\, F(1 - \alpha/2;\, p,\, n_i - p) \tag{6}$$
where $\alpha$ is the significance level. For this study, $\alpha = 0.01$ has been used. When the null hypothesis is rejected, a phase change event is declared and the index i, which keeps track of the number of phases detected so far, is incremented by 1: i = i + 1. Data corresponding to the new phase is then collected afresh and the procedure continued. On the other hand, when the null hypothesis is not rejected, the current point is appended to the sample available for phase $\phi_i$ and the statistical properties of this phase are recomputed by Eq. (2) before testing the hypothesis (i.e. Eq. (1)) for the next available measurement.
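The basic procedure of Eqs. (1)-(6) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes NumPy and SciPy, at least two measured variables, and a minimum window larger than the number of variables. The function names are hypothetical, and letting the rejecting point start the new phase is an assumption, since the text only states that data for the new phase are collected afresh.

```python
import numpy as np


def hotelling_t2(phase_data, x_new):
    """T^2 of row vector x_new against the sample mean and covariance
    of phase_data (rows are observations), following Eqs. (2)-(3)."""
    phase_data = np.asarray(phase_data, dtype=float)
    diff = np.asarray(x_new, dtype=float) - phase_data.mean(axis=0)
    cov = np.cov(phase_data, rowvar=False)  # divides by n_i - 1, as in Eq. (2)
    return float(diff @ np.linalg.solve(cov, diff))


def f_limits(n, p, alpha=0.01):
    """Lower and upper T^2 control limits of Eqs. (5)-(6)."""
    from scipy.stats import f  # imported lazily; only needed for the limits
    scale = p * (n - 1) * (n + 1) / (n * (n - p))
    return scale * f.ppf(alpha / 2, p, n - p), scale * f.ppf(1 - alpha / 2, p, n - p)


def detect_phase_changes(data, w_min, limits=f_limits):
    """Return the indices k at which H0 is rejected (phase change declared)."""
    phase, sps = [], []
    for k, x in enumerate(data):
        if len(phase) >= w_min:  # test only once the minimum window is filled
            t2 = hotelling_t2(phase, x)
            lo, hi = limits(len(phase), len(x))
            if t2 <= lo or t2 >= hi:  # Eq. (4)
                sps.append(k)
                phase = [x]  # assumption: the rejecting point starts the new phase
                continue
        phase.append(x)  # H0 not rejected: the point joins the current phase
    return sps
```

The `limits` argument is left pluggable so that other control limits can be substituted without touching the loop.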
The proposed approach, if implemented directly, can suffer from the following drawbacks: (a) high sensitivity to measurement and process noise: this can lead to a high false alarm rate (detection of an SP even if there has been no phase change in the process), (b) inability to capture dynamic relationships in the measured data, and (c) unnecessary computational overload if online data is very frequent, since time constants of the fermentation process may be much larger. To deal with these drawbacks, we incorporate the following modifications to the basic approach.
3.1.1. Incorporating robustness to noise
(i) A phase change event is declared only if the null hypothesis is rejected for at least h out of j consecutive data points. The parameters h and j can be tuned to achieve an acceptable trade off between false alarm rate and speed of phase change detection.
(ii) Principal component analysis (PCA): PCA involves projecting the measured data onto a few orthogonal directions (referred to as loadings) and monitoring the projection of the data (scores) only on those directions. These orthogonal directions are the eigenvectors of the covariance matrix corresponding to its largest eigenvalues. The number of directions used depends on the fraction of variability of the data captured in those directions. Based on this number b, $T^2$ and the control limits used for monitoring are changed as [18]:
$$T_k^2 = (x_k - \bar{x}_i)\, P_b\, [\mathrm{diag}(\lambda_b)]^{-1}\, P_b^T\, (x_k - \bar{x}_i)^T \tag{7}$$
where $\mathrm{diag}(\lambda_b)$ is the diagonal matrix of the b largest eigenvalues of the covariance matrix $S_i$ and the columns of the matrix $P_b$ are the corresponding b eigenvectors of $S_i$. The corresponding expressions for the lower and upper control limits are [19]
$$T^2_{LCL} = \frac{b (n_i - 1)(n_i + 1)}{n_i (n_i - b)}\, F(\alpha/2;\, b,\, n_i - b) \tag{8}$$
$$T^2_{UCL} = \frac{b (n_i - 1)(n_i + 1)}{n_i (n_i - b)}\, F(1 - \alpha/2;\, b,\, n_i - b) \tag{9}$$
While the covariance matrix was utilized in the above discussion on PCA, use of the correlation matrix for PCA has also been reported [9]. We provide results based on both the techniques.
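As an illustration of the reduced-dimension statistic in Eq. (7), the following sketch computes $T^2$ from the b leading eigenpairs of the sample covariance matrix. It assumes NumPy and row-vector observations; the names `pca_t2` and `choose_b` are hypothetical, and the cumulative-variance rule in `choose_b` is only one common way of fixing b, not necessarily the one used in this work.

```python
import numpy as np


def pca_t2(phase_data, x_new, b):
    """T^2 of x_new using only the b largest principal components (Eq. (7))."""
    phase_data = np.asarray(phase_data, dtype=float)
    diff = np.asarray(x_new, dtype=float) - phase_data.mean(axis=0)
    cov = np.cov(phase_data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:b]      # indices of the b largest
    scores = diff @ eigvecs[:, top]          # projection onto the loadings P_b
    return float(np.sum(scores ** 2 / eigvals[top]))


def choose_b(eigvals, fraction):
    """Smallest b whose leading eigenvalues capture at least `fraction`
    of the total variance (an assumed selection rule)."""
    ordered = sorted(eigvals, reverse=True)
    total = sum(ordered)
    captured = 0.0
    for b, value in enumerate(ordered, start=1):
        captured += value
        if captured / total >= fraction:
            return b
    return len(ordered)
```

Because the eigenvectors are orthonormal, the matrix product in Eq. (7) reduces to a sum of squared scores divided by the corresponding eigenvalues, which is what `pca_t2` evaluates.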
3.1.2. Incorporation of dynamic relationships among variables
The current online measurements may depend on the past online measurements. To capture such dynamic relationships, appropriately lagged data can be added to the current measurement [12]. Let the data at current time be related to data up to d samples in the past, where d is known as the lag; then the current data $x_k$ is modified as $x_k^d = [x_k\; x_{k-1} \ldots x_{k-d}]$. Accordingly, the current phase data matrix $X_i$ is changed to $X_i^d$, and the mean and covariance matrix of the current phase and the upper and lower control limits are also changed to reflect this modification. The parameter d needs to be tuned to obtain a balance between the predictive ability of the model, computational cost and speed of SP detection. To be consistent with the nomenclature used in literature [18], PCA when applied to the lagged data will be referred to as Dynamic PCA (DPCA).
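The construction of the lagged vectors can be sketched in a few lines. This is a plain-Python illustration with a hypothetical function name; it assumes each measurement is a list of p values and drops the first d time points, for which a full history is not yet available.

```python
def lag_augment(rows, d):
    """Return [x_k, x_{k-1}, ..., x_{k-d}] flattened into one row
    for every k = d, ..., len(rows) - 1."""
    lagged = []
    for k in range(d, len(rows)):
        vec = []
        for lag in range(d + 1):
            vec.extend(rows[k - lag])  # most recent measurement first
        lagged.append(vec)
    return lagged
```

With p measured variables and lag d, each augmented row has p(d + 1) entries, which is the dimension that then enters the mean, covariance and control limits.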
3.1.3. Reducing the computational requirement
Typically the online data is measured at time scales (order of seconds) which are much faster than the time constants (typically order of hours) of fermentation processes. We consider data sampled at a lower frequency (sampling rate t) for detecting phase shifts. This sampling rate is then a tuning parameter which should be chosen to be consistent with the time constants of the fermentation process.
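A minimal sketch of this subsampling step, assuming the data are stored at a fixed interval (5 min in the experiments described in Section 2) and that t is an integer multiple of that interval. The function name and argument names are illustrative only.

```python
def downsample(samples, stored_interval_min, t_min):
    """Keep every (t_min / stored_interval_min)-th stored sample."""
    if t_min % stored_interval_min != 0:
        raise ValueError("t must be a multiple of the storage interval")
    step = t_min // stored_interval_min
    return samples[::step]
```

For example, data stored every 5 min and monitored with t = 10 min would retain every second sample.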
The overall approach with the above modifications is summarized in Fig. 1 and this technique will be referred to as "moving window dynamic PCA" (MW-DPCA). For the sake of comparison, in the results section, we have also considered SP detection without reducing the dimensionality of the data. For this purpose, Eqs. (3), (5) and (6) are used. This approach will be referred to as "moving window all dimensions" (MW-AD). The results are also compared with the conventional PCA (with and without lag) approach, where the entire data is assumed to correspond to a single mean vector and covariance matrix [9]. For such a scenario, the $T^2$ for every data point follows a beta distribution since each data point is used to estimate the mean and covariance [19]. Then, the upper control limit ($T^2_{UCL}$) and lower control limit ($T^2_{LCL}$) are determined as:
$$T^2_{LCL} = \frac{(m - 1)^2}{m}\, B\big(\alpha/2;\, b/2,\, (m - b - 1)/2\big) \tag{10}$$
$$T^2_{UCL} = \frac{(m - 1)^2}{m}\, B\big(1 - \alpha/2;\, b/2,\, (m - b - 1)/2\big) \tag{11}$$
Similar to the modifications adopted for the strategies proposed in this article, the SP detection algorithm based on these conventional PCA techniques also utilizes the heuristic that for an SP to be declared, h out of j consecutive points should be out of control limits, where data points are assumed to be available at sampling frequency t.
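The h out of j heuristic can be sketched as a scan over a sequence of per-point control limit violations. This is an illustrative plain-Python version with a hypothetical function name; it assumes the window consists of the current point and the j - 1 points before it.

```python
def first_sp_index(violations, h, j):
    """Index of the first point at which at least h of the last j
    violation flags are True, or None if that never happens."""
    for k in range(len(violations)):
        window = violations[max(0, k - j + 1):k + 1]
        if sum(window) >= h:  # True counts as 1
            return k
    return None
```

Larger h for a given j suppresses isolated violations caused by noise, at the cost of declaring the SP later.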
3.2. Selection of model parameters
Implementation of the proposed algorithm in an effective manner requires the specification of the tuning parameters d, t, j, and h. In general, the optimal values of the parameters will vary from one organism to another due to significant differences in their fermentation physiology. While searching through the parameter space, the following values for these parameters have been considered regardless of the organism: d $\in$ [0, 4, 8, 12, 16], t $\in$ [5, 10, 15, 20] min, j $\in$ [4, 5, 6, 7, 8] and h $\in$ [2, 3, 4, 5, 6, 7, 8]. We first construct the receiver operating characteristic (ROC) curve for all possible models (i.e. all combinations of parameters). ROC captures the trade off between the sensitivity and specificity for a binary classifier system [20,21]. The problem of SP detection can also be considered to be a binary classification problem where each data point needs to be classified as either a normal point (not an SP) or an SP as shown in Eq. (12).
$$\text{Null hypothesis } H_0: \text{Point is not SP} \qquad \text{Alternative hypothesis } H_1: \text{Point is SP} \tag{12}$$
Four types of outcomes are possible while testing these competing hypotheses: (i) true negative: H0 is actually true and it is not rejected by the model, (ii) true positive: H1 is actually true and H0 is rejected by the model, (iii) false negative: H0 is actually false but it is not rejected by the model, and (iv) false positive: H0 is actually true but is rejected by the model. For a given model and batch data, let the number of instances of each of the above outcomes be denoted by TN, TP, FN and FP, respectively. For the given batch, knowing the true status of each time point (whether it is a normal point or an SP) from offline measurement data and/or expert knowledge, these numbers are then computed for each model (combination of parameters) under consideration.
All these models are then represented on the ROC curve, which is a plot of sensitivity versus 1-specificity, where sensitivity and specificity are defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}; \qquad \text{Specificity} = \frac{TN}{TN + FP}$$
In the ROC curve, models lying in the top left hand corner indicate the optimal trade off between high sensitivity and specificity.
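Each model therefore contributes one point to the ROC plot. A minimal sketch, with a hypothetical function name and assuming that the batch contains at least one true SP and at least one normal point so that neither denominator is zero:

```python
def roc_point(tp, tn, fp, fn):
    """Return (1 - specificity, sensitivity), the ROC coordinates of one model."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 1.0 - specificity, sensitivity
```

A model that detects every SP without any false alarm maps to (0, 1), the top left corner of the plot.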
Based on the specific requirements (related to sensitivity and specificity) of the user, any of the models which capture the best trade off can be used. However, in the absence of such requirements this choice is not straightforward, and single metrics which are a combination of specificity and sensitivity can be used to rank these models [22]. One such popular metric is the Matthews Correlation Coefficient (MCC), which is defined as [23]
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{13}$$
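Eq. (13) translates directly into code. The sketch below uses a hypothetical function name; returning 0 when the denominator vanishes is a commonly used convention and is an assumption here, since the text does not discuss that case.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient of Eq. (13)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the table is empty
    return (tp * tn - fp * fn) / denom
```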
Fig. 2. Detection of singular points (SPs) in batch fermentation of D-ribose producing, transketolase deficient strain of Bacillus pumilus ATCC 21951. A) Profiles of the online measurements; B) $T^2$ plot for the MW-DPCA approach with covariance matrix using model parameters as d = 4, t = 5 min, j = 6 and h = 2; C) $T^2$ plot for DPCA with covariance matrix using model parameters as d = 8, t = 5 min, j = 4 and h = 2; and D) Profiles for the offline measurement of concentrations of the substrates, products and dry cell weight. The measured SPs (MSPs) are q: end of lag phase and start of exponential growth phase, r: ribose production start, s: end of exponential growth phase and start of stationary phase, u: end of stationary phase and v: glucose exhaustion. Initial media composition for the batch: glucose 200 g l$^{-1}$, corn steep liquor (CSL) 12 g l$^{-1}$, ammonium sulfate 5 g l$^{-1}$, CaCO3 16 g l$^{-1}$, MnSO4 0.5 g l$^{-1}$, leucine 0.5 g l$^{-1}$ and tryptophan 0.05 g l$^{-1}$. Gray filled circle symbols indicate the MSP, dark filled circle symbols (i.e., q', r', u' and v') denote the model predicted SPs (PSPs) which match with MSP and open circle symbols (i.e., w, z, y and t) denote the PSPs which do not match with MSP. In B, the arrows are indicative of the presence of points which go out of the range of the plot. In this particular batch, acetate and acetoin were not detected.
The MCC takes values between −1 and +1. A value of −1 indicates complete mismatch between model predictions and the true nature of every point (SP or not an SP), while a value of +1 indicates complete agreement. For finding a common (across different batches), high-performing model for a given organism, models which have MCC values greater than t% of the maximum MCC (for that batch) and which do not miss more than some specified number $n_m$ of true SPs (for that batch), for all batches, are considered. Here t and $n_m$ are user specified parameters. If more than one model satisfies these criteria, any of these models can be selected.
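The cross-batch selection rule (MCC above t% of the per-batch maximum and at most $n_m$ missed SPs in every batch) can be sketched as a simple filter. The data layout and names below are hypothetical: `results` maps a model label to one `(mcc, missed)` pair per batch, with batches in the same order for every model.

```python
def select_models(results, t_pct, n_m):
    """Return the models that satisfy both criteria in every batch."""
    n_batches = len(next(iter(results.values())))
    # Best MCC achieved by any model in each batch.
    best = [max(r[i][0] for r in results.values()) for i in range(n_batches)]
    selected = []
    for model, per_batch in results.items():
        ok = all(
            score > (t_pct / 100.0) * best[i] and missed <= n_m
            for i, (score, missed) in enumerate(per_batch)
        )
        if ok:
            selected.append(model)
    return selected
```

With t = 50 and $n_m$ = 2, a model is kept only if, in each batch, its MCC exceeds half of the best MCC obtained for that batch and it misses at most two true SPs.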
4. Results
We present the results for the proposed moving window based techniques: (i) MW-DPCA using covariance; (ii) MW-DPCA using correlation; and (iii) MW-AD. The results are also compared with conventional approaches which use the data for the entire batch duration, namely, (iv) DPCA using covariance; (v) DPCA using correlation; and (vi) using all dimensions (AD). For the PCA based approaches, three PCs were used. For all the approaches, $W_{min}$ is chosen to be 30. A brief discussion about the rationale behind this choice is presented in Appendix A. Efficacy of these six approaches is compared for SP detection for case studies involving three different microorganisms. The predicted SPs (PSP) were compared with SPs identified manually (MSP) based on the offline measurements. Note that due to the low frequency of the offline measurements, the occurrence of true SPs may differ from the identified MSPs by some tolerance d. Hence, the following strategy was used:
If PSP $\in$ [MSP $\pm$ d], then PSP is considered as a true SP.
If PSP $\notin$ [MSP $\pm$ d], then PSP is considered as a false SP.
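The matching rule above can be sketched as follows. The function name is hypothetical, times are assumed to be in hours, and the interval is treated as closed (a PSP exactly at the tolerance counts as a match), which is an assumption. MSPs with no PSP inside their interval are reported as missed.

```python
def classify_psps(psps, msps, tol):
    """Split predicted SP times into true and false SPs and list missed MSPs."""
    true_sps = [p for p in psps if any(abs(p - m) <= tol for m in msps)]
    false_sps = [p for p in psps if p not in true_sps]
    missed = [m for m in msps if not any(abs(p - m) <= tol for p in psps)]
    return true_sps, false_sps, missed
```

For instance, with MSPs at 10, 20 and 35 h and a tolerance of 2.5 h, PSPs at 9 and 21.5 h count as true SPs, PSPs at 14 and 40 h count as false SPs, and the MSP at 35 h is missed.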
In this work, the tolerance d was chosen to be 2.5 h.
4.1. Case study 1: Bacillus pumilus
The tkt deficient strains of Bacillus are reported to be commercially important for the production of D-ribose. Two batches, I and II, with different initial conditions were conducted, where the following online variables were recorded: pH, DO, agitator speed and concentrations of CO2 and O2 in the exhaust gas. Batch II was conducted in triplicate (labeled as IIa, IIb and IIc). As explained in the section on model parameter selection, for a given approach, MCC values and the number of missed SPs for all combinations of model parameters were calculated for these batches. The SP detection results for batch I with t = 50 and $n_m$ = 2 are presented in Fig. 2. Other satisfactory models had only minor variations in model parameters such as h and j and are therefore not presented. The MSPs shown in this figure were identified based on the physiological characteristics as captured by the offline measurements shown in Fig. 2D. Fig. 2A shows the raw online data while Fig. 2B shows the $T^2$ values for the MW-DPCA approach along with the corresponding time varying values for $T^2_{LCL}$ and $T^2_{UCL}$. By monitoring the control limit violation of the $T^2$ value, the proposed MW-DPCA with covariance approach has detected four of the five MSPs. Two additional SPs are also detected. These may be false alarms or true SPs that are not captured as MSPs due to the low frequency of offline measurements. This type of prediction accuracy would be difficult to achieve simply by visual inspection of the online data, which shows sharp changes at several time points. For example, the plot of agitator speed has several visible troughs and peaks. However, not all these sharp changes correspond to SPs. Results for MW-DPCA with correlation and MW-AD approaches are not presented since these were inferior to those of MW-DPCA with covariance for this microorganism. For the sake of comparison, results for conventional single model techniques viz. DPCA with covariance, DPCA with correlation and AD were also generated. Fig. 2C shows the $T^2$ values along with the corresponding $T^2_{UCL}$ and $T^2_{LCL}$ for DPCA with covariance, which was found to be the best among these three conventional approaches. The static nature of DPCA is reflected in the time invariant nature of the $T^2$ control limits. It is seen from Fig. 2B and C that MW-DPCA performs better than DPCA.
Fig. 3. Monitoring of the batch-to-batch variability by monitoring the occurrence of SPs for batch fermentation of Bacillus pumilus conducted in triplicate. (A) MSPs (gray filled symbols) and PSPs (dark filled symbols when the time of PSP matches with that of MSP and open symbols otherwise) detected by MW-DPCA with covariance matrix using model parameter values of d = 4, t = 5 min, j = 6 and h = 2 for the triplicate batches. (B–H) Profiles of the offline measurements of concentrations of the substrates, products and dry cell weight. The MSPs for different batches are: Batch IIa: q: start of acetoin and ribose production, r: start of acetate formation, s: start of stationary phase, u: ammonium sulfate consumption stops, v: amino acid in culture broth starts to increase and death phase starts. Batch IIb: q: start of logarithmic phase, r: ribose production starts, s: end of logarithmic phase and start of acetate production, u: ammonium sulfate consumption stops, v: amino acid in media starts to increase and death phase starts. Batch IIc: q: ribose production starts and logarithmic phase ends, r: start of acetate formation, s: acetoin production starts and ammonium sulfate consumption stops, u: amino acid in media starts to increase and death phase starts. Media composition for triplicate batches is: Glucose 200 g l$^{-1}$, cas amino acid 15 g l$^{-1}$, ammonium sulfate 5 g l$^{-1}$, CaCO3 16 g l$^{-1}$, MnSO4 0.5 g l$^{-1}$, leucine 0.5 g l$^{-1}$ and tryptophan 0.05 g l$^{-1}$.
Offline data and results for Batch II are shown in Fig. 3. From the offline data it can be seen that there is significant variability in these batches. In particular, D-ribose, acetate and acetoin production in batch IIc appears to be delayed compared to the other two batches. This variability is reflected in different times of occurrence of MSPs corresponding to similar events such as the start of death phase. The MW-DPCA model applied to batch I is able to capture the batch-to-batch variability in terms of the PSPs in batch II as well. Moreover, the results are superior to those of conventional DPCA (data not shown).
Fig. 4. Comparison of different monitoring techniques for batch fermentation of balhimycin producing strain of Amycolatopsis balhimycina, batch I. (A) The MSPs (gray filled circle symbols) and PSPs (dark filled circle symbols when the time of PSP matches with that of MSP and open circle symbols otherwise) by different approaches: (i) MW-DPCA using covariance matrix, d = 0, t = 5 min, j = 8 and h = 5, (ii) MW-DPCA using correlation matrix, d = 4, t = 10 min, j = 7 and h = 2, (iii) MW-AD, d = 0, t = 5 min, j = 5 and h = 3, (iv) DPCA using covariance matrix, d = 8, t = 5 min, j = 4 and h = 2, (v) DPCA using correlation matrix, d = 8, t = 5 min, j = 4 and h = 2, (vi) using all dimensions (AD), d = 8, t = 5 min, j = 8 and h = 3 and (vii) MSPs. (B) Profiles for the offline measurement of concentrations of the substrates, products and dry cell weight. (C) Profiles of the online measurements. The MSPs are: q: start of antibiotic production, r: glycerol exhaustion, s: glucose consumption stops and stationary phase starts. Media composition for batch I: Glucose 100 g l$^{-1}$, glycerol 10 g l$^{-1}$, yeast extract 1 g l$^{-1}$, ammonium sulfate 4.95 g l$^{-1}$, defatted soybean flour 1 g l$^{-1}$, and micronutrients ZnSO4 0.02 g l$^{-1}$, FeSO4 0.02 g l$^{-1}$, trisodium citrate 0.025 g l$^{-1}$, MgSO4 1.5 g l$^{-1}$, MnSO4 0.01 g l$^{-1}$, NaCl 1 g l$^{-1}$, KH2PO4 0.16 g l$^{-1}$, biotin 0.00005 g l$^{-1}$, calcium-pantothenate 0.001 g l$^{-1}$, nicotinic acid 0.001 g l$^{-1}$, myo-inositol 0.025 g l$^{-1}$, thiamin HCl 0.001 g l$^{-1}$, pyridoxine HCl 0.001 g l$^{-1}$, para-aminobenzoic acid 0.0002 g l$^{-1}$ and MES buffer 1.045 g l$^{-1}$.
4.2. Case study 2: Amycolatopsis balhimycina
Balhimycin (a glycopeptide antibiotic) producer strain of A. balhimycina was cultivated in media containing multiple carbon and nitrogen substrates including complex sources such as yeast extract and defatted soybean flour. The substitutable substrates may be taken up sequentially or simultaneously, thereby complicating the task of SP detection from the raw online data. Seven batches (labeled I–VII) were conducted. Results for batches I–IV, with t = 50 and $n_m$ = 2, and common model parameter values across batches for a given approach, are shown in Figs. 4 and 5. Due to space limitations, results for batches V–VII are presented as supplementary material. Batch I is characterized by three MSPs based on the offline data (Fig. 4B). The three moving window based approaches have successfully identified these SPs whereas the
three single model conventional approaches have missed one or more of these SPs (Fig. 4A). For example, strategy (v) has missed all MSPs. It is interesting to note that between 36 and 48 h, most of the techniques have predicted two additional SPs, which may correspond to the diauxic nature of the growth as the organism begins to utilize glucose in this interval. Identification of such events, which fall in the interval between two offline sampling times, is possible only based on online data. The online data for this batch, in particular DO and agitator speed, is noisier than that for B. pumilus (Fig. 2A). This precludes manual identification of SPs by visual inspection of online data for this batch. However, the automated MW techniques have successfully identified the SPs.
Fig. 5. Monitoring of batch fermentation of A. balhimycina via MW-DPCA covariance matrix, d = 0, τ = 5 min, ξ = 8 and η = 5 for three different batches. The profiles of offline measurements, MSPs (gray filled circle symbols) and PSPs (dark filled circle symbols when the time of PSP matches with that of MSP and open circle symbols otherwise) are shown for (A) Batch II; (B) Batch III; and (C) Batch IV. The MSPs of batch II are: q: end of glycerol and exponential growth phase, r: start of antibiotic production, s: start of death phase as well as ammonia increase in media and u: complete glucose exhaustion. Media composition for batch II: glucose 84 g l⁻¹, yeast extract 1.64 g l⁻¹, ammonium sulfate 6.0 g l⁻¹, defatted soybean flour 0.37 g l⁻¹. The MSPs of batch III are: q: start of antibiotic production, r: end of glycerol and end of logarithmic phase, s: start of death phase, u: glucose exhaustion. Media composition for batch III: glucose 51.5 g l⁻¹, glycerol 16.5 g l⁻¹, yeast extract 0.35 g l⁻¹, ammonium sulfate 6.0 g l⁻¹, defatted soybean flour 2.3 g l⁻¹. The MSPs of batch IV are: q: end of lag phase, r: end of first exponential phase, s: end of glycerol and start of antibiotic production, u: ammonium and amino acid start to increase in culture broth, v: end of glucose and start of death phase. Media composition for batch IV: glucose 68 g l⁻¹, glycerol 2.7 g l⁻¹, yeast extract 1.7 g l⁻¹, ammonium sulfate 6.6 g l⁻¹, defatted soybean flour 0.16 g l⁻¹. In each batch, micronutrients were added as mentioned in legend to Fig. 4.
The MW-DPCA results for batches II, III and IV along with the corresponding offline data are presented in Fig. 5A–C respectively. For batch II, all the MSPs are captured by the MW-DPCA approach. Some of the additional PSPs detected in this batch may correspond to phase change events not identified by the offline measurements. For batch III, three of the four MSPs are correctly identified while the MSP u does not match any PSP. However, it should be noted that the time of occurrence of u may be inaccurate due to lack of offline measurements between 108 and 120 h. It is interesting to note that there is a PSP at approximately 112 h, which may indicate the actual time of exhaustion of glucose (the event corresponding to MSP u). For batch IV, four of the five MSPs are captured by MW-DPCA. The missed MSP r corresponds to the end of first exponential growth phase. However, due to lack of offline measurements, this event could have actually occurred anytime in the 24–36 h interval. Once again, it is interesting to note that there is a PSP in this interval, which may correspond to the actual time of occurrence of this event.
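The bookkeeping behind statements such as "three of the four MSPs are correctly identified" can be illustrated with a minimal Python sketch. It assumes that a PSP counts as a match when it lies within a tolerance δ (in hours) of an MSP and that each PSP is used at most once; the function name `match_sps`, the greedy nearest-neighbour pairing and the example times are hypothetical and are not taken from the paper.

```python
def match_sps(msps, psps, delta):
    """Pair each MSP with the nearest unused PSP within +/- delta hours.

    Returns (matched pairs, missed MSPs, unmatched PSPs). The greedy
    pairing is an illustrative assumption, not the authors' procedure.
    """
    unused = sorted(psps)
    matched, missed = [], []
    for m in sorted(msps):
        near = [p for p in unused if abs(p - m) <= delta]
        if near:
            best = min(near, key=lambda p: abs(p - m))
            unused.remove(best)
            matched.append((m, best))
        else:
            missed.append(m)
    return matched, missed, unused

# Hypothetical times in hours, for illustration only.
matched, missed, extra = match_sps([24, 60, 114], [25, 40, 44, 61, 112], delta=2)
print(matched, missed, extra)
```

In this sketch, the length of `missed` plays the role of the number of missed SPs, and the unmatched PSPs are candidates for phase changes that fell between two offline samples.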
4.3. Case study 3: Amycolatopsis mediterranei S699
Data for two batches I and II for this case study has been taken from Doan et al. [9]. The results for batch I for MW-DPCA with t = 75, $n_m$ = 2 and DPCA with t = 50, $n_m$ = 1 along with online and offline data, and the $T^2$ plots are shown in Fig. 6. For DPCA there was no model parameter combination which met the t = 75 and $n_m$ = 2 criteria for both batches and hence a lower t value had to be used to obtain a common model. The MW-DPCA approach has successfully predicted the three MSPs in batch I while the DPCA approach has predicted only one MSP. From the online data (Fig. 6A), it is seen that there are no sharp changes corresponding to MSPs q and r and this may be the reason for the failure of the DPCA approach in capturing these events. However, the use of multiple models in MW-DPCA enables successful prediction of these events.
Fig. 6. Comparison of MW-DPCA and DPCA approaches for monitoring of Amycolatopsis mediterranei fermentation case study, batch I. (A) Profiles of the online measurements; (B) $T^2$ plot for the MW-DPCA approach with covariance matrix using model parameters as d = 4, τ = 5 min, ξ = 8 and η = 5; (C) $T^2$ plot for DPCA with covariance matrix using model parameters as d = 16, τ = 5 min, ξ = 8 and η = 6 and (D) Profiles for the offline measurement of concentrations of the substrates, product and dry cell weight. The MSPs are: q: exhaustion of amino acids and beginning of adaptation to ammonium sulfate, r: beginning of exponential growth on ammonium sulfate and start of rapid consumption of ammonium sulfate and s: exhaustion of ammonium sulfate. Media composition of batch I: glucose 80 g l⁻¹ and ammonium sulfate 4 g l⁻¹. Additional micronutrients included potassium sulfate 1 g l⁻¹, magnesium sulfate 1 g l⁻¹, ferrous sulfate 1 g l⁻¹, zinc sulfate 0.01 g l⁻¹ and cobalt chloride 0.03 g l⁻¹. In B–D, MSPs are denoted by gray filled circle symbols while PSPs are denoted by dark filled circle symbols when the time of PSP matches with that of MSP and open circle symbols otherwise. Data in A and D reproduced from Doan et al. [9]. The arrows are indicative of the presence of points which go out of the range of the plot.
Batch II contains two alternate nitrogen substrates namely ammonia and nitrate leading to sequential uptake. MW-DPCA successfully predicted three out of four MSPs whereas the conventional DPCA is able to predict only one (Fig. 7A). Note that in the neighbourhood of MSPs q and r, there are no noticeable sharp changes in the online profiles. Despite this, MW-DPCA has captured MSP r and has predicted an SP approximately 8 h prior to MSP q.
Fig. 7. Comparison of different monitoring techniques for batch fermentation of Amycolatopsis mediterranei, batch II. (A) The MSPs (gray filled circle symbols) and PSPs (dark filled circle symbols when the time of PSP matches with that of MSP and open circle symbols otherwise) by different approaches (i) MW-DPCA using covariance matrix with d = 4, τ = 5 min, ξ = 8 and η = 5, (ii) MW-DPCA using correlation matrix with d = 4, τ = 5 min, ξ = 8 and η = 3, (iii) MW-AD with d = 4, τ = 5 min, ξ = 6 and η = 5, (iv) DPCA using covariance matrix with d = 16, τ = 5 min, ξ = 8 and η = 6, (v) DPCA using correlation matrix with d = 8, τ = 5 min, ξ = 6 and η = 3, (vi) using all dimensions (AD) with d = 8, τ = 5 min, ξ = 8 and η = 3 and (vii) manually identified SPs from experiment (MSPs). (B) Profiles for the offline measurement of concentrations of the substrates, products and dry cell weight. (C) Profiles of the online measurements. The MSPs are: q: start of log phase, r: end of KNO3 adaptation period, s: ammonium sulfate consumption stops, u: stationary phase starts. Media composition: Glucose 80 g l⁻¹, KNO3 4.76 g l⁻¹, ammonium sulfate 1.3 g l⁻¹ and micronutrients as mentioned in legend to Fig. 6.
5. Discussion
In this work we have presented a moving window based approach for the detection of SPs in fermentation processes. The approach has the following salient features: (i) the method does not need to assume that a single statistical model is applicable for the entire batch duration, (ii) the switching times from one statistical model to the next are not decided a priori and are instead decided in real time based on the dynamic evolution of the batch under consideration, (iii) similarly, the $T^2$ control limits are not fixed a priori but are decided in real time based on the amount of data available in the corresponding phase, and (iv) the approach can be used even in the absence of historical data. To demonstrate the efficacy of our approach, we have presented a comparison with the conventional single model based approach. The results were found to be superior to the conventional single model approach even though the latter utilizes the data for the entire batch duration. In contrast, our approach utilizes only the currently available data in an evolving batch in real time. This feature makes our approach amenable to real time implementation, which is not possible with the conventional single model approach.
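The real-time decision described above can be illustrated with a minimal Python sketch. It assumes, in line with the parameter definitions in the nomenclature, that an SP is flagged once at least η of the last ξ consecutive T² values violate either control limit. The function name `detect_sp`, the fixed limits and the sample trace are hypothetical; in the proposed method the limits are recomputed as data accumulate in a phase, which this sketch does not attempt to reproduce.

```python
from collections import deque

def detect_sp(t2_values, lcl, ucl, xi=8, eta=5):
    """Return the index at which an SP is flagged, or None if none is found.

    Assumed rule: an SP is flagged as soon as at least `eta` of the last
    `xi` consecutive T^2 values fall below `lcl` or above `ucl`.
    """
    recent = deque(maxlen=xi)  # violation flags for the last xi points
    for k, t2 in enumerate(t2_values):
        recent.append(t2 < lcl or t2 > ucl)
        if sum(recent) >= eta:
            return k
    return None

# Hypothetical T^2 trace with fixed limits, for illustration only.
trace = [1.0, 1.2, 0.9, 9.0, 9.5, 0.1, 1.1, 8.7, 9.9, 1.0]
print(detect_sp(trace, lcl=0.5, ucl=5.0, xi=4, eta=3))  # prints 5
```

Because only the last ξ flags are kept, the check needs no data from later in the batch, which is the property that allows it to run while the batch is still evolving.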
Note that the proposed SP detection is based on violation of either the upper or the lower control limit by the $T^2$ statistic (Eqs. (5) and (6)). In contrast, monitoring techniques have conventionally relied on violation of $T^2_{\mathrm{UCL}}$ alone [24,25]. Note that while changes in μ mainly lead to violation of $T^2_{\mathrm{UCL}}$, changes in Σ can manifest as violations of either $T^2_{\mathrm{UCL}}$ or $T^2_{\mathrm{LCL}}$ [19]. Since a phase change can correspond to either a change in the mean (operating level of the variables) or the covariance (relationships between variables), we chose to use both $T^2_{\mathrm{LCL}}$ and $T^2_{\mathrm{UCL}}$ to detect SPs. Indeed, we have observed several cases where violation of $T^2_{\mathrm{LCL}}$ detects the SP (data not shown). Additionally, to provide insight into the nature of the SP, we perform statistical tests to check if the means and the covariances of adjacent phases are identical. Details about these statistical tests are presented in Appendix B. Based on these tests, it was observed that several of the phase shift points which were detected due to $T^2_{\mathrm{LCL}}$ violation corresponded to changes only in the covariance and not in the mean (data not shown). This indicates the utility of using $T^2_{\mathrm{LCL}}$, apart from $T^2_{\mathrm{UCL}}$, as a bound on $T^2$ to improve SP detection.
From the point of view of sensor selection, it would be of interest to determine the utility of the various online measurements in SP detection. To this end, the contribution of various online measurements in SP detection was quantified for the case studies presented in this work (see Appendix C). Fig. 8 shows the contribution plots for SP prediction by MW-DPCA for batches in case study I. Note that the contributions of agitator speed and DO were more significant than those of other online variables in the majority of the batches. This is consistent with the fact that the rate of aerobic growth dictates the oxygen requirement, which in turn determines the DO and agitator speed as seen from Eq. (14).
$$\frac{dC_0}{dt} = k_L a\,(C^* - C_0) - \frac{\mu X_{BM}}{Y_{B/O}} \qquad (14)$$
Fig. 8. Contribution of the different online variables toward SP detection by MW-DPCA approach with covariance matrix for batch fermentation of B. pumilus. (A) Batch I; (B) Batch IIa; (C) Batch IIb; (D) Batch IIc. Exhaust CO2: gray filled bars; dissolved oxygen: dark filled bars and agitator speed: bars filled with slanted lines. The contributions of pH and exhaust oxygen are not significant to detection of the SPs and are therefore not shown.
Fig. A1. The profile of $T^2_{\mathrm{UCL}}$ and $T^2_{\mathrm{LCL}}$ at different sample size and different number of variables (p) to define the $W_{\min}$.
At pseudo-steady state, $dC_0/dt$ becomes zero and Eq. (15) holds:
$$k_L a \propto f(\mu X_{BM}) \qquad (15)$$
But since
$$k_L a \propto h(N) \qquad (16)$$
we get
$$N \propto g(\mu X_{BM}) \qquad (17)$$
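The step from Eq. (14) to Eq. (15) can be made concrete with a small Python sketch. Setting the left-hand side of Eq. (14) to zero and solving for the mass transfer coefficient gives kLa = μ X_BM / (Y_B/O (C* − C0)), so the oxygen transfer that the agitator must deliver scales with the growth term μ X_BM. The function name `required_kla` and all numerical values are hypothetical and carry no particular units.

```python
def required_kla(mu, x_bm, y_bo, c_star, c0):
    """k_L a that balances oxygen transfer and uptake when dC0/dt = 0 in Eq. (14)."""
    if c_star <= c0:
        raise ValueError("driving force C* - C0 must be positive")
    return mu * x_bm / (y_bo * (c_star - c0))

# Hypothetical values: doubling the biomass doubles the required k_L a.
base = required_kla(mu=0.2, x_bm=5.0, y_bo=1.0, c_star=0.008, c0=0.004)
doubled = required_kla(mu=0.2, x_bm=10.0, y_bo=1.0, c_star=0.008, c0=0.004)
print(base, doubled)
```

If the DO controller holds C0 near a set point, this proportionality is what links agitator speed to the growth rate in Eqs. (16) and (17).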
The proposed SP detection technique can in principle be used in conjunction with online data collected via a variety of sensors [26–29]. These sensors may range from simple probes such as those for pH, dissolved oxygen concentration, and optical density to complex probes which can acquire near infrared (NIR) or fluorescence spectroscopy based measurements [30,31]. Some of the complex probes are more informative but may suffer from drawbacks such as limited measurement range and high cost. We believe that the proposed approach can be used to evaluate the utility of a given sensor in SP detection. Additionally, the proposed approach can be used as a guide in real time in making decisions about the timings of offline sampling. In particular, an offline sample can be collected whenever an SP based on online data is detected instead of collecting offline samples based on arbitrarily specified timings.
Acknowledgements
The authors acknowledge the generous gift of the Amycolatopsis balhimycina strain from Anna Eliasson Lantz of Denmark's Technical University, Denmark. The work was partially supported by a grant from the Department of Biotechnology, Government of India.
Appendix A. Selection of Wmin
The case studies considered in this article involved either four (A. balhimycina DSM5908) or five (B. pumilus ATCC 21951 and A. mediterranei S699) online measurements. For these case studies, $W_{\min}$ was taken to be 30 since it was found (Fig. A1) that there was not much variation in the $T^2_{\mathrm{LCL}}$ and $T^2_{\mathrm{UCL}}$ values with respect to $W_{\min}$ beyond 30 for p = 4 and 5.
Appendix B. Covariance and mean comparison of adjacent phases
The key idea in our proposed algorithm is that occurrence of an SP corresponds to change in statistical properties of online data. Under the assumption of normally distributed data, this change in statistical properties will be reflected as differences in the mean vectors and/or covariance matrices of adjacent data sets (before and after detection of an SP). The data in the two adjacent populations is considered to be normal with means $\mu_1$, $\mu_2$ and covariances $\Sigma_1$, $\Sigma_2$ respectively. To identify changes in these parameters, the following statistical tests were used:
The modified likelihood ratio test [32] was used for testing whether the covariance matrices of adjacent populations are identical. In particular, the following hypotheses were tested:
null hypothesis: $H_0: \Sigma_1 = \Sigma_2$
alternative hypothesis: $H_1: \Sigma_1 \neq \Sigma_2$
An approximate test of $H_0$ at significance level α based on the modified likelihood ratio statistic is to reject $H_0$ if $-2\rho \log \Lambda^* > c_f(1-\alpha)$, where $c_f(1-\alpha)$ denotes the percentage point from the $\chi^2_f$ distribution such that the area to the left is $1-\alpha$,
$$\rho = 1 - \frac{2p^2 + 3p - 1}{6(p+1)n}\left(\sum_{i=1}^{2}\frac{1}{k_i} - 1\right),$$
$f = p(p+1)/2$ is the degrees of freedom and
$$\Lambda^* = \frac{\prod_{i=1}^{2}(\det S_i)^{(n_i-1)/2}}{(\det S)^{(n-2)/2}} \cdot \frac{(n-2)^{p(n-2)/2}}{\prod_{i=1}^{2}(n_i-1)^{p(n_i-1)/2}}$$
is the modified likelihood ratio. In these expressions, $S_i$ is the ith population sample covariance matrix, $S = \sum_{i=1}^{2} S_i$, $n_i$ is the size of the ith population sample, $n = \sum_{i=1}^{2} n_i$ and $k_i = (n_i - 1)/(n - 2)$.
For checking whether the mean vectors of adjacent populations are identical, the following hypotheses were tested [33]:
null hypothesis: $H_0: \mu_1 = \mu_2$
alternative hypothesis: $H_1: \mu_1 \neq \mu_2$
The null hypothesis was rejected if the test statistic $T_S^2 > T_C$ (assuming $n_1 \le n_2$), where
$$T_S^2 = n_1(\bar{x}_1 - \bar{x}_2)^T C^{-1}(\bar{x}_1 - \bar{x}_2),$$
$$\bar{x}_1 = \frac{1}{n_1}\sum_{g=1}^{n_1} x_{1g}, \quad \text{where } x_{1g} \text{ is the gth data point of the 1st population,}$$
$$\bar{x}_2 = \frac{1}{n_2}\sum_{b=1}^{n_2} x_{2b}, \quad \text{where } x_{2b} \text{ is the bth data point of the 2nd population,}$$
$$C = \frac{1}{n_1 - 1}\sum_{g=1}^{n_1}(u_g - \bar{u})(u_g - \bar{u})^T,$$
$$u_g = x_{1g} - (n_1/n_2)^{1/2}\, x_{2g}, \quad g = 1, 2, \ldots, n_1,$$
$$\bar{u} = \frac{1}{n_1}\sum_{g=1}^{n_1} u_g,$$
$$T_C = \frac{n_1 p}{n_1 - p + 1}\, F(1-\alpha;\, p,\, n_1 - p + 1).$$
For both covariance and mean checking, the value of α was taken as 0.01.
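The two test statistics of this appendix can be sketched in plain Python as follows. This is a minimal illustration and not the authors' code: all function names are hypothetical, the matrices S_i in the likelihood ratio are taken to be sums of squares and products about the sample means (an assumption, following the usual convention for this statistic), and the critical values c_f(1 − α) and T_C are not computed and would have to come from χ² and F tables or a statistics library.

```python
import math

def mean_vec(rows):
    """Column-wise mean of a list of equal-length observation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter(rows):
    """Sum of squares and products of the rows about their mean."""
    m = mean_vec(rows)
    p = len(m)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows)
             for j in range(p)] for i in range(p)]

def solve_and_det(a, b):
    """Gaussian elimination with partial pivoting; returns (x, det) for a x = b."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[piv][c]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x, det

def covariance_statistic(x1, x2):
    """-2 rho log(Lambda*); to be compared with the chi-square point, f = p(p+1)/2."""
    n1, n2, p = len(x1), len(x2), len(x1[0])
    n = n1 + n2
    s1, s2 = scatter(x1), scatter(x2)
    s = [[s1[i][j] + s2[i][j] for j in range(p)] for i in range(p)]
    logdet = lambda mat: math.log(solve_and_det(mat, [0.0] * p)[1])
    log_lam = ((n1 - 1) / 2 * logdet(s1) + (n2 - 1) / 2 * logdet(s2)
               - (n - 2) / 2 * logdet(s)
               + p * (n - 2) / 2 * math.log(n - 2)
               - p * (n1 - 1) / 2 * math.log(n1 - 1)
               - p * (n2 - 1) / 2 * math.log(n2 - 1))
    k1, k2 = (n1 - 1) / (n - 2), (n2 - 1) / (n - 2)
    rho = 1 - (2 * p * p + 3 * p - 1) / (6 * (p + 1) * n) * (1 / k1 + 1 / k2 - 1)
    return -2 * rho * log_lam

def mean_statistic(x1, x2):
    """T_S^2 for H0: mu1 == mu2, pairing the first n1 points of each sample."""
    n1, n2, p = len(x1), len(x2), len(x1[0])
    if n1 > n2:
        raise ValueError("the test assumes n1 <= n2")
    d = [a - b for a, b in zip(mean_vec(x1), mean_vec(x2))]
    w = math.sqrt(n1 / n2)
    u = [[x1[g][i] - w * x2[g][i] for i in range(p)] for g in range(n1)]
    c = [[v / (n1 - 1) for v in row] for row in scatter(u)]
    y, _ = solve_and_det(c, d)  # y = C^{-1} (xbar1 - xbar2)
    return n1 * sum(di * yi for di, yi in zip(d, y))
```

Each function returns only the statistic, so the decision to reject is left to the caller. The pure-Python linear algebra is meant for the four or five online variables used in the case studies; a numerical library would be the natural choice for larger problems.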
Appendix C. Contribution of variables towards SP detection
When an SP is detected, the variables primarily responsible for occurrence of SP can be identified based on the contribution plot of the variables. The procedure is as follows [18]:
When an SP is detected at the kth time point corresponding to observation $x_k$, then $T^2_k > T^2_{\mathrm{UCL}}$ or $T^2_k < T^2_{\mathrm{LCL}}$. The normalized scores $t_i^2/\lambda_i$ are then computed for the ith principal component (i = 1, 2, ..., b), where $t_i$ is the score of the projection of $x_k$ onto the ith loading vector. The principal components for which $t_i^2/\lambda_i > (1/b)\,T^2_{\mathrm{UCL}}$ (in case $T^2_k > T^2_{\mathrm{UCL}}$) or $t_i^2/\lambda_i < (1/b)\,T^2_{\mathrm{LCL}}$ (in case $T^2_k < T^2_{\mathrm{LCL}}$) are determined to be responsible for the out of control status. Let the number of such principal components be r. Then the contribution of each variable j to the out of control score $t_i$ can be defined as $\mathrm{cont}_{i,j} = (t_i/\lambda_i)\, p_{i,j}\,(x_j - \mu_j)$, where $p_{i,j}$ is the (i, j)th element of the loading matrix P. If $\mathrm{cont}_{i,j}$ is negative, it is set equal to zero. The total contribution of the jth process variable is then:
$$\mathrm{CONT}_j = \sum_{i=1}^{r} \mathrm{cont}_{i,j}$$
The variables with large values of CONT are identified as primary causes for phase change detection. This information can potentially aid the process operator in determining the nature of phase change as well as in taking any control action if required.
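The contribution procedure can be sketched in Python as below. The sketch assumes that the loadings, eigenvalues and phase mean are already available from the PCA model of the current phase; the function name `contributions`, the layout `loadings[i][j]` (weight of variable j in component i) and the numbers in the example are hypothetical.

```python
def contributions(x, mean, loadings, eigvals, limit, upper=True):
    """Total contribution CONT_j of each variable to an out-of-limit T^2 value.

    limit is the violated control limit: the upper limit when upper=True,
    the lower limit when upper=False. loadings[i][j] is assumed to hold
    the weight of variable j in principal component i.
    """
    b = len(eigvals)
    centered = [xj - mj for xj, mj in zip(x, mean)]
    scores = [sum(p * c for p, c in zip(loadings[i], centered)) for i in range(b)]
    share = limit / b  # per-component share of the control limit
    total = [0.0] * len(x)
    for i in range(b):
        normalized = scores[i] ** 2 / eigvals[i]
        responsible = normalized > share if upper else normalized < share
        if not responsible:
            continue
        for j, c in enumerate(centered):
            cont = scores[i] / eigvals[i] * loadings[i][j] * c
            total[j] += max(cont, 0.0)  # negative contributions are set to zero
    return total

# Hypothetical two-variable example with unit loadings and eigenvalues.
print(contributions([3.0, 1.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                    [1.0, 1.0], limit=4.0))  # prints [9.0, 0.0]
```

In the example only the first component exceeds its share of the limit, so the whole contribution is assigned to the first variable.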
Appendix D. Nomenclature
b    number of eigenvalues to be considered for PCA
B(α/2; a, b)    beta-distribution with α% significance level and a and b degrees of freedom
C*    saturation concentration of oxygen
$C_0$    dissolved oxygen concentration in medium
d    dynamic lag
diag($\lambda_b$)    the diagonal matrix of b largest eigenvalues of covariance matrix $S_i$
DPCA    dynamic principal component analysis
F(α/2; a, b)    percentage point from F-distribution with a and b degrees of freedom such that the area to the left is α/2
$k_L a$    volumetric oxygen mass transfer coefficient
m    total number of data points for the entire batch
MSP    SP based on offline measurements
MW-AD    moving window all dimensions
MW-DPCA    moving window dynamic PCA
$n_i$    the number of data points belonging to phase $\phi_i$
$n_m$    number of missed SPs during prediction
N    agitator speed
p    number of variables being measured
$P_i$    probability distribution corresponding to ith phase
$P_b$    matrix of b eigenvectors of $S_i$ corresponding to the largest b eigenvalues
PSP    predicted SPs
$S_i$    sample covariance matrix of ith phase
SP    singular point
t    percent of MCC considered to find the common model
$T^2_k$    Mahalanobis distance of the current point $x_k$ from the mean of the data corresponding to phase $\phi_i$
$T^2_{\mathrm{LCL}}$    lower control limit of $T^2$
$T^2_{\mathrm{UCL}}$    upper control limit of $T^2$
$W_{\min}$    minimum window length for calculating covariance matrix
$x_k$    data (row) vector at kth time
xpk    value of pth variable at kth time
xdk    data (row) vector with lag d at kth time
Xdi    data matrix of ith phase with lag d
$X_{BM}$    biomass concentration
$Y_{B/O}$    yield of biomass per unit of oxygen consumed
Greek letters
α    the significance level
$\phi_i$    ith phase
η    number of points required to violate the control limits for SP detection
μ    specific growth rate
τ    sampling rate
ξ    number of consecutive points checked for violation of control limits for SP detection
δ    tolerance for comparing MSP and PSP
$\mu_i$    population mean of ith phase
$\Sigma_i$    population covariance matrix of ith phase
Appendix E. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.procbio.2009.03.008.
References
[1] Sonnleitner B, Locher G, Fiechter A. Automatic bioprocess control 1. A general concept. J Biotechnol 1991;19:1–17.
[2] DePalma A. Process monitoring on-line & in real-time. Genet Eng News 2006;26:50–1.
[3] Lopes JA, Costa PF, Alves TP, Menezes JC. Chemometrics in bioprocess engineering: process analytical technology (PAT) applications. Chemometr Intell Lab 2004;74:269–75.
[4] Venkatasubramanian V, Rengaswamy R, Kavuri SN, Yin K. A review of process fault detection and diagnosis Part III: Process history based methods. Comput Chem Eng 2003;27:327–46.
[5] Kamimura R, Konstantinov K, Stephanopoulos G. Knowledge-based systems, artificial neural networks and pattern recognition: applications to biotechnological processes. Curr Opin Biotechnol 1996;7:231–4.
[6] Lee JM, Yoo CK, Lee IB. On-line batch process monitoring using a consecutively updated multiway principal component analysis model. Comput Chem Eng 2003;27:1903–12.
[7] Lee JM, Yoo CK, Lee IB. Enhanced process monitoring of fed-batch penicillin cultivation using time-varying and multivariate statistical analysis. J Biotechnol 2004;110:119–36.
[8] de Vargas VDCC, Lopes LFD, Souza AM. Comparative study of the performance of the CuSum and EWMA control charts. Comput Ind Eng 2004;46:707–24.
[9] Doan XT, Srinivasan R, Bapat PM, Wangikar PP. Detection of phase shifts in batch fermentation via statistical analysis of the online measurements: a case study with rifamycin B fermentation. J Biotechnol 2007;132:156–66.
[10] Neubauer AS. The EWMA control chart: properties and comparison with other quality-control procedures by computer simulation. Clin Chem 1997;43:594–601.
[11] Bapat PM, Das D, Dave NN, Wangikar PP. Phase shifts in the stoichiometry of rifamycin B fermentation and correlation with the trends in the parameters measured online. J Biotechnol 2006;127:115–28.
[12] Doan XT, Srinivasan R. Online monitoring of multi-phase batch processes using phase-based multivariate statistical process control. Comput Chem Eng 2008;32:230–43.
[13] Cinar A, Parulekar SJ, Undey C, Birol G. Batch Fermentation Modeling, Monitoring and Control. New York: Marcel Dekker; 2003, 245–261.
[14] Srivastava RK, Wangikar PP. Combined effects of carbon, nitrogen and phosphorus substrates on D-ribose production via transketolase deficient strain of Bacillus pumilus. J Chem Technol Biotechnol 2008;83:1110–9.
[15] Bapat PM, Bhartiya S, Venkatesh KV, Wangikar PP. Structured kinetic model to represent the utilization of multiple substrates in complex media during rifamycin B fermentation. Biotechnol Bioeng 2006;93:779–90.
[16] Morrison GR. Microchemical determination of organic nitrogen with Nessler reagent. Anal Biochem 1971;43:527–32.
[17] Allen NE, LeTourneau DL, Hobbs Jr JN. The role of hydrophobic side chains as determinants of antibacterial activity of semisynthetic glycopeptide antibiotics. J Antibiot (Tokyo) 1997;50:677–84.
[18] Chiang LH, Russell EL, Braatz RD. Fault Detection and Diagnosis in Industrial Systems. London: Springer-Verlag; 2001, 21–55.
[19] Tracy ND, Young JC, Mason RL. Multivariate control charts for individual observations. J Qual Technol 1992;24:88–95.
[20] Obuchowski NA. Receiver operating characteristic curves and their use in radiology. Radiology 2003;229:3–8.
[21] Zweig MH, Campbell G. Receiver operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem 1993;39:561–77.
[22] Baldi P, Brunak S, Chauvin Y, Andersen CAF, Nielsen H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 2000;16:412–24.
[23] Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 1975;405:442–51.
[24] Chen JH, Liu KC. On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem Eng Sci 2002;57:63–75.
[25] Lennox B, Montague GA, Hiden HG, Kornfeld G, Goulding PR. Process monitoring of an industrial fed-batch fermentation. Biotechnol Bioeng 2001;74:125–35.
[26] Esbensen K, Kirsanov D, Legin A, Rudnitskaya A, Mortensen J, Pedersen J, Vognsen L, Makarychev-Mikhailov S, Vlasov Y. Fermentation monitoring using multisensor systems: feasibility study of the electronic tongue. Anal Bioanal Chem 2004;378:391–5.
[27] Grube M, Gapes JR, Schuster KC. Application of quantitative IR spectral analysis of bacterial cells to acetone–butanol–ethanol fermentation monitoring. Anal Chim Acta 2002;471:127–33.
[28] Riley MR, Rhiel M, Zhou XJ, Arnold MA, Murhammer DW. Simultaneous measurement of glucose and glutamine in insect cell culture media by near infrared spectroscopy. Biotechnol Bioeng 1997;55:11–5.
[29] Turner C, Rudnitskaya A, Legin A. Monitoring batch fermentations with an electronic tongue. J Biotechnol 2003;103:87–91.
[30] Arnold SA, Crowley J, Woods N, Harvey LM, McNeill B. In-situ near infrared spectroscopy to monitor key analytes in mammalian cell cultivation. Biotechnol Bioeng 2003;84:13–9.
[31] Rhee JI, Kang TH. On-line process monitoring and chemometric modeling with 2D fluorescence spectra obtained in recombinant E. coli fermentations. Process Biochem 2007;42:1124–34.
[32] Muirhead RJ. Aspects of Multivariate Statistical Theory. New York: John Wiley & Sons; 1982. pp. 291–311.
[33] Ito K. On the effect of heteroscedasticity and nonnormality upon some multivariate test procedure. In: Krishnaiah PR, editor. Multivariate Analysis II. Academic Press; 1969. p. 87–119.