Simultaneous Confidence Bands for the Mean of Dense Functional Data
By
LI Xiaoyu
(17250811)
A thesis submitted in partial fulfillment of the requirements
for the degree of
Bachelor of Science (Honours) in MATH & STAT QDA
at
Hong Kong Baptist University
Supervisor: Prof CHENG Ming-Yen
13/10/2020
Acknowledgement
First of all, I want to express my sincere gratitude to Professor CHENG
Ming-Yen, my supervisor, who gave me necessary research papers, detailed
instructions, considerable assistance and valuable suggestions during the first
semester. Moreover, I am grateful to my parents, my friends, and the professors of the
Department of Mathematics. Without their assistance, encouragement and
guidance in my life, I could not have successfully completed my final year project.
Finally, the plasma data set comes from Andersen et al. (1981) and Hart and Wehrly
(1986), and the R package ‘SCBmeanfd’ comes from Degras (2016).
I thank them for their contributions.
______________________
Signature of Student
______________________
Student Name
Department of Mathematics
Hong Kong Baptist University
Date: ________________
LI Xiaoyu
08/01/2021
Simultaneous Confidence Bands for the Mean of Dense Functional Data
LI Xiaoyu
(17250811)
Department of Mathematics
ABSTRACT
The mean function is an important topic in functional data analysis (FDA). To
analyze the mean of functional data on the entire domain, one should conduct
simultaneous inference rather than pointwise inference. Before building
simultaneous confidence bands (SCB), one needs to estimate the mean function by
nonparametric smoothing, choose the smoothing parameter, and estimate the
covariance function.
This thesis first introduces the whole process and two specific methods to build
SCB for the mean of dense functional data, and then uses the R package
‘SCBmeanfd’ to analyze and interpret the plasma data set. This example illustrates
how SCB apply to hypothesis testing and provide evidence for deciding whether or
not to reject the null hypothesis.
Keywords: FDA, SCB, hypothesis testing, R package ‘SCBmeanfd’
1. Introduction
As shown in Degras (2017), rapid developments in science and industry have helped collect high-resolution data in the areas of time, space, and frequency. Such data can be considered as discrete observations of functions and thus are called functional data. For example, in figure 1, the daily temperature in Spain on average per year during the period 1980 to 2009 is Spanish weather functional data. The R package ‘fda.usc’ comes from Bande et al. (2020).
> install.packages("fda.usc")
> library(fda.usc)
> plot(aemet$temp)
FIGURE 1: daily temperature in Spain on average per year over 30 years
As shown in figure 1, the Spanish temperature data possess dense features; such data are called dense functional data.
Figures 2 and 3 show the daily wind speed and the log precipitation in Spain on average per year during the period 1980 to 2009, respectively, giving further examples of Spanish weather dense functional data.
> plot(aemet$wind.speed)
> plot(aemet$logprec)
FIGURE 2: daily wind speed in Spain on average per year over 30 years
FIGURE 3: daily log precipitation in Spain on average per year over 30 years
More importantly, functional data are very useful and crucial in biomedical science, social science, climate science, and finance. The common feature of these fields is that data must be collected many times over a specific period, from which a continuous curve is drawn over a domain t.
To analyze functional data, new theory and statistical methods are needed. As discussed in Degras (2017), the goal of functional data analysis (FDA) is to analyze the information given on curves or functions, for example through mean function estimation and functional principal component analysis (FPCA). In other words, FDA is the process of analyzing surfaces, curves, or functions.
There are many excellent past works on how to build SCB for functional data, such as Choi and Reimherr (2018), Cao et al. (2016), and Degras (2011). This thesis mainly focuses on SCB for the mean of dense functional data.
This thesis is organized as follows. Section 2 discusses the estimation of the mean and covariance of functional data. Section 3 explains how to build simultaneous confidence bands (SCB). Section 4 illustrates the methods by analyzing the plasma data set available through R. Finally, section 5 concludes and section 6 lists all references.
2. Preparation process before building SCB
Before conducting the estimation, one needs some basic statistical concepts. First, a stochastic process {X(t) | t ∈ T} is simply a collection of random variables, where the index t is often time. Second, a random variable X is normally distributed with mean μ and variance σ² if its probability density function is

f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)).

Third, a Gaussian process (GP) is a special kind of stochastic process on a continuous domain in which every finite collection of its values follows a multivariate normal distribution; in other words, every finite linear combination of its values is normally distributed. Hence a Gaussian process is uniquely determined by its mean function and covariance function.
Now let X_i(t) denote a stochastic process and assume X_i(t) follows a Gaussian process with mean function μ(t) and covariance function Γ(s, t); that is, X_i(t) ~ GP(μ(t), Γ(s, t)). A general statistical model for the functional data is then

Y_ij = X_i(t_j) + ε_ij, for i = 1, ..., n and j = 1, ..., p,

where Y_ij is the observation of the ith statistical unit at grid point t_j and ε_ij is a measurement error (Degras, 2017).
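This model is easy to simulate. The following is a minimal sketch in Python with NumPy (the thesis itself uses R); the grid, the mean function, and the squared-exponential covariance are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 10, 14                       # number of curves and grid points (illustrative)
t = np.linspace(0.0, 1.0, p)        # common observation grid t_1, ..., t_p

mu = np.sin(2 * np.pi * t)          # an assumed mean function mu(t)
# an assumed squared-exponential covariance Gamma(s, t) for the Gaussian process
Gamma = 0.25 * np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * 0.2 ** 2))

# draw X_i ~ GP(mu, Gamma) on the grid, then add measurement error eps_ij
X = rng.multivariate_normal(mu, Gamma, size=n)
Y = X + 0.1 * rng.standard_normal((n, p))   # Y_ij = X_i(t_j) + eps_ij
```

Each row of Y is one observed curve, discretized on the common grid.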
2.1 Estimate the mean function
In general, the most common estimator of the mean function μ is the sample mean of the curves, X̄ = (1/n) Σ_{i=1}^n X_i. However, a small number of grid points t_j together with large noise ε_ij leads to unstable and inaccurate estimation of the mean function. According to Degras (2017), one can calculate the sample mean Ȳ_j = (1/n) Σ_{i=1}^n Y_ij at every grid point t_j and then smooth these values in a nonparametric fashion.
There are two main benefits of data smoothing. First, it generates a better-behaved estimator of the mean. Second, it cancels out much of the adverse effect of the unstable measurement errors ε_ij.
2.2 Nonparametric smoothing
As stated in Degras (2017), the basic process of nonparametric smoothing is as follows. First, let w_j denote the weight functions, which are determined by the given data and the smoothing method. Second, estimate the mean μ by the nonparametric smoother

μ̂(t) = Σ_{j=1}^p w_j(t) Ȳ_j.

Wand and Jones’s book (as cited in Degras, 2017) presents the Nadaraya–Watson estimator, a very common example of a nonparametric smoother,

μ̂(t) = Σ_{j=1}^p K((t − t_j)/h) Ȳ_j / Σ_{j=1}^p K((t − t_j)/h),

for a kernel function K and a positive constant h, the bandwidth, which is chosen by the analyst. In this case the weight functions are

w_j(t) = K((t − t_j)/h) / Σ_{i=1}^p K((t − t_i)/h).

Note that Σ_{j=1}^p w_j(t) = 1 for t ∈ D.
The kernel function K and the bandwidth h are both very important concepts in data smoothing. According to Guidoum (2015), a function K(t) can be considered a kernel if it satisfies the following conditions:
(1) K is non-negative and integrates to 1.
(2) K is symmetric around the origin.
(3) K has a finite second moment, that is, ∫_R t² K(t) dt < ∞.
As shown in Guidoum (2015), the most frequently used kernel function is the standard normal density with mean zero and variance 1, that is, K(t) = (1/√(2π)) e^(−t²/2).
In addition, as discussed in Degras (2017), the bandwidth h determines the effective size of the local averaging window. It has the following characteristics. For a fixed grid point t_j, if the bandwidth h is small, then the weight w_j(t) given to the sample mean Ȳ_j is large only when t is close to t_j. On the contrary, when the bandwidth h is large, the weight w_j(t) remains appreciable even when t is far from t_j. Furthermore, if the grid point t_j and the bandwidth h are both fixed, then the weight w_j(t) increases as t moves closer to t_j.
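To make the role of the weights concrete, here is a minimal Nadaraya–Watson smoother sketched in Python with NumPy (the thesis uses R; the grid, the stand-in sample means, and the bandwidth are assumptions). Because the weights sum to one at every t, the smoother reproduces constant data exactly:

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal kernel K(u)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def nw_smooth(t_grid, ybar, t_eval, h):
    """Nadaraya-Watson smoother of the pointwise sample means ybar."""
    w = gaussian_kernel((t_eval[:, None] - t_grid[None, :]) / h)
    w /= w.sum(axis=1, keepdims=True)    # normalise: weights w_j(t) sum to 1 at each t
    return w @ ybar

t_grid = np.linspace(0.0, 1.0, 20)
ybar = np.sin(2 * np.pi * t_grid)        # stand-in for the pointwise sample means
mu_hat = nw_smooth(t_grid, ybar, t_grid, h=0.05)
```

Larger h averages over more neighbouring grid points; smaller h keeps the estimate close to the raw sample means.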
Finally, as stated in Eilers and Marx (1996), B-splines provide an important tool for nonparametric smoothing, and there are many other nonparametric smoothing methods such as the spline smoothing methodology (Silverman, 1985). One should choose a suitable nonparametric smoothing method based on the specific features of the data. As shown in Degras (2017), one of the most common nonparametric smoothing techniques is the local polynomial smoothing method. Its estimator is μ̂(t) = β̂_0, where

(β̂_0, ..., β̂_m) = argmin_{β_0, ..., β_m} Σ_{j=1}^p [Ȳ_j − Σ_{k=0}^m β_k (t_j − t)^k]² K((t_j − t)/h),

and K is the kernel function and h is the bandwidth.
However, it is well known that the choice among smoothing methods affects the final results far less than the choice of the bandwidth h. Therefore, one can also choose the nonparametric smoothing method based on preference.
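The local polynomial criterion amounts to a kernel-weighted least-squares fit at each evaluation point. Below is a minimal sketch in Python with NumPy (illustrative, not the thesis’s R code); note that a local linear fit (m = 1) reproduces exactly linear data:

```python
import numpy as np

def local_poly_smooth(t_grid, ybar, t_eval, h, degree=1):
    """Local polynomial smoother: mu_hat(t) is the intercept beta_0 of a
    kernel-weighted polynomial fit centred at t."""
    out = np.empty(len(t_eval))
    for i, t0 in enumerate(t_eval):
        u = (t_grid - t0) / h
        sw = np.sqrt(np.exp(-0.5 * u ** 2))           # sqrt of Gaussian kernel weights
        # design matrix with columns (t_j - t0)^k, k = 0, ..., degree
        X = np.vander(t_grid - t0, degree + 1, increasing=True)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], ybar * sw, rcond=None)
        out[i] = beta[0]                               # beta_0 estimates mu(t0)
    return out

t_grid = np.linspace(0.0, 1.0, 25)
mu_hat = local_poly_smooth(t_grid, 2.0 + 3.0 * t_grid, t_grid, h=0.2)
```

With degree = 0 this reduces to the Nadaraya–Watson estimator of section 2.2.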
2.3 Choose smoothing parameter
The selection of the smoothing parameter is a key step. In this section, the smoothing parameter is just the bandwidth h introduced in section 2.2.
In probability and statistics, the mean squared error (MSE) measures the average of the squared errors; in other words, it is the average squared difference between the estimated values and the actual values. For the mean estimator it takes the form

MSE(μ̂) = E[(μ̂(t) − μ(t))²] for t ∈ D.

A standard decomposition gives

MSE(μ̂) = E[(μ̂(t) − E[μ̂(t)] + E[μ̂(t)] − μ(t))²] = Var(μ̂(t)) + Bias(μ̂(t), μ(t))².

Therefore, if the MSE of the mean estimator μ̂(t) is small, then both its variance and its bias must be small. According to Degras (2017), the smoothing parameter h governs the trade-off between the variance and the bias of μ̂(t). The larger h is, the larger the bias and the smaller the variance, because the fitted curve becomes smoother as h increases. In contrast, the smaller h is, the smaller the bias and the larger the variance.
Now let AMSE denote the average mean squared error,

AMSE(h) = (1/p) Σ_{j=1}^p E[(μ̂_h(t_j) − μ(t_j))²] (Degras, 2017).

One needs to determine the value of the smoothing parameter h that minimizes the AMSE. This is the goal of section 2.3.
Leave-one-out cross-validation (LOOCV) can achieve this goal. As studied in Steorts (2017), the basic process of the simplest leave-one-data-point-out cross-validation is as follows. Assume a data set consists of n data points. Each time, pick (n − 1) points as the training set to adjust the parameters of the model and use the remaining point as the validation set; repeat until every point has served as the validation set, so that the ith training set includes all data points except point i. One eventually obtains n models and n corresponding squared errors, MSE_1, ..., MSE_n. The LOOCV estimator is their average,

CV_(n) = (1/n) Σ_{i=1}^n MSE_i.

This resampling method can reduce the bias because it uses n − 1 points as the training set each time.
According to Degras (2017), the goal of leave-one-data-point-out cross-validation is to minimize the CV score

CV(h) = (1/np) Σ_{i=1}^n Σ_{j=1}^p (μ̂_h^(−j)(t_j) − Y_ij)²

with respect to the smoothing parameter h, where μ̂_h^(−j) is the mean estimator based on all but the data at grid point t_j, for j = 1, ..., p.
However, in functional data analysis one analyzes curves rather than individual data points. The corresponding leave-one-curve-out cross-validation score is

CV(h) = (1/np) Σ_{i=1}^n Σ_{j=1}^p (μ̂_h^(−i)(t_j) − Y_ij)²,

where μ̂_h^(−i) is the mean estimator based on all but the ith curve (Rice & Silverman, 1991, as cited in Degras, 2017).
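The leave-one-curve-out score can be sketched directly from this formula. A Python/NumPy illustration follows (the simulated curves, the candidate bandwidths, and the use of a Nadaraya–Watson smoother are all assumptions; ‘SCBmeanfd’ itself smooths with local polynomials):

```python
import numpy as np

def nw_weights(t_grid, h):
    """Nadaraya-Watson smoothing weights on the observation grid."""
    w = np.exp(-0.5 * ((t_grid[:, None] - t_grid[None, :]) / h) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def cv_score(Y, t_grid, h):
    """Leave-one-curve-out CV(h): smooth the mean of all curves but i,
    then measure how well it predicts curve i."""
    n, p = Y.shape
    W = nw_weights(t_grid, h)
    total = 0.0
    for i in range(n):
        ybar_minus_i = np.delete(Y, i, axis=0).mean(axis=0)
        mu_minus_i = W @ ybar_minus_i          # leave-one-curve-out mean estimate
        total += np.sum((Y[i] - mu_minus_i) ** 2)
    return total / (n * p)

rng = np.random.default_rng(1)
t_grid = np.linspace(0.0, 1.0, 15)
Y = np.sin(2 * np.pi * t_grid) + 0.2 * rng.standard_normal((20, 15))

candidates = [0.02, 0.05, 0.1, 0.2, 0.4]
h_best = min(candidates, key=lambda h: cv_score(Y, t_grid, h))
```

In practice one minimises CV(h) over a grid of candidate bandwidths, as done here.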
More importantly, it can be shown that the expectation of the CV score is approximately AMSE(h) + c for a constant c independent of h. Therefore, the smoothing parameter h affects the expected CV score and the average mean squared error (AMSE) in the same way. Accordingly, one can select an appropriate smoothing parameter h by minimizing the cross-validation score instead of the AMSE. This method is the cross-validation bandwidth selection implemented in the R package ‘SCBmeanfd’.
2.4 Estimate the covariance function
The last preparation step before building SCB is to estimate the covariance function of the mean estimator μ̂(t). Hart and Wehrly’s and Degras’s papers (as cited in Degras, 2017) state that the covariance of μ̂ satisfies

Cov(μ̂(s), μ̂(t)) ≈ Γ(s, t) / n as n, p → ∞, for s, t ∈ D,

where Γ is the covariance of the Gaussian process X_i(t). Therefore the covariance estimation task simplifies to estimating Γ.
There are mainly two methods to estimate the covariance Γ(s, t) of the Gaussian process X_i(t).
First, as stated in Degras (2017), apply the smoothing method that was applied to the average data curve to every individual curve. This process is so-called data presmoothing. The presmoothed data can be expressed as

X̃_i(t) = Σ_{j=1}^p w_j(t) Y_ij,

where w_j is the weight function. In this case the mean estimator μ̂(t) is the mean of the presmoothed data, μ̂ = (1/n) Σ_{i=1}^n X̃_i. Hence one can use the sample covariance of the X̃_i to estimate the covariance Γ:

Γ̂(s, t) = (1/n) Σ_{i=1}^n (X̃_i(s) − μ̂(s)) (X̃_i(t) − μ̂(t)).
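The presmoothing route can be sketched as follows in Python with NumPy (illustrative; the simulated curves, kernel weights, and bandwidth are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 25
t = np.linspace(0.0, 1.0, p)
Y = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal((n, p))   # raw curves

# kernel smoothing weights W[j, k] = w_k(t_j), rows summing to 1
W = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.08) ** 2)
W /= W.sum(axis=1, keepdims=True)

X_tilde = Y @ W.T                 # presmooth each curve with the same weights
mu_hat = X_tilde.mean(axis=0)     # mean estimator = mean of the presmoothed curves
# sample covariance Gamma_hat(s, t) of the presmoothed curves
R = X_tilde - mu_hat
Gamma_hat = R.T @ R / n
```

Gamma_hat is then a symmetric, positive semi-definite p × p matrix evaluated on the grid.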
Ramsay and Silverman (2005) studied the second method to estimate the covariance Γ, which is to conduct functional principal component analysis (FPCA). In short, one computes the eigenvalues λ̂_1 ≥ λ̂_2 ≥ ... ≥ λ̂_n ≥ 0 and the corresponding eigenfunctions φ̂_1, φ̂_2, ..., φ̂_n of the sample covariance Γ̂(s, t) from the first method; the eigenfunctions φ̂_k are orthonormal. Indritz’s book (as cited in Degras, 2017) states that, based on Mercer’s theorem, the sample covariance can be expanded as

Γ̂(s, t) = Σ_{k=1}^n λ̂_k φ̂_k(s) φ̂_k(t).

However, this full expansion takes up a huge amount of storage space. In this case, one should select a suitable truncation order K, much smaller than n, and decompose Γ̂(s, t) into the following reduced form:

Γ̂(s, t) = Σ_{k=1}^K λ̂_k φ̂_k(s) φ̂_k(t) (Degras, 2017).

The reduced form saves a considerable amount of storage space while maintaining most of the variance of the sample covariance Γ̂(s, t).
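On a discrete grid, this truncated Mercer expansion is simply a rank-K eigendecomposition of the covariance matrix. A sketch in Python with NumPy (the stand-in covariance and the choice K = 3 are assumptions):

```python
import numpy as np

p, K = 25, 3
t = np.linspace(0.0, 1.0, p)
# stand-in for a sample covariance matrix evaluated on the grid
Gamma_hat = np.exp(-np.abs(t[:, None] - t[None, :]) / 0.3)

evals, evecs = np.linalg.eigh(Gamma_hat)        # eigenvalues in ascending order
evals, evecs = evals[::-1], evecs[:, ::-1]      # reorder to descending
# rank-K reconstruction: sum_{k<=K} lambda_k phi_k(s) phi_k(t)
Gamma_K = (evecs[:, :K] * evals[:K]) @ evecs[:, :K].T
retained = evals[:K].sum() / evals.sum()        # fraction of total variance kept
```

The quantity `retained` shows how much of the covariance’s variance the first K components keep, which guides the choice of K.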
3. Build SCB
Having estimated the mean function μ and its covariance, it is time to build the SCB.
Recall that a Gaussian process is a set of random variables in which every finite collection follows a multivariate normal distribution, and that the general model for dense functional data with Gaussian process X_i(t) is Y_ij = X_i(t_j) + ε_ij, as introduced in section 2.
As stated in Degras (2011), let GP(0, Γ) denote a Gaussian process with mean zero and covariance Γ. Then the mean estimator μ̂ satisfies, by the functional central limit theorem (FCLT),

√n (μ̂ − μ) → GP(0, Γ) as n, p → ∞.

Recall that the central limit theorem states that, under suitable conditions, sums of mutually independent random variables are approximately normally distributed; the functional central limit theorem is its analogue for random functions.
There are two basic methods to build simultaneous confidence bands (SCB) for μ (Degras, 2017).
First, let σ(t) = √Γ(t, t) and ρ(s, t) = Γ(s, t)/(σ(s)σ(t)) denote the standard deviation and correlation functions of the Gaussian process X_i(t), respectively, and let Z be another Gaussian process with mean zero and covariance ρ, that is, Z ~ GP(0, ρ). Then σZ ~ GP(0, Γ), which is exactly the limit Gaussian process appearing in the FCLT above.
The first method is to use the Gaussian process GP(0, Γ) to build the SCB. In this case, the SCB for μ at confidence level 1 − α is

( μ̂(t) − z_{α,Γ}/√n , μ̂(t) + z_{α,Γ}/√n ), t ∈ D,

where σZ ~ GP(0, Γ) and z_{α,Γ} satisfies P( sup_{t ∈ D} |σ(t) Z(t)| ≤ z_{α,Γ} ) = 1 − α.
Second, one can use the Gaussian process GP(0, ρ) to build the SCB instead of the limit Gaussian process GP(0, Γ). Recall that σ(t) = √Γ(t, t) and ρ(s, t) = Γ(s, t)/(σ(s)σ(t)). Then the SCB for μ at confidence level 1 − α is approximately

( μ̂(t) − z_{α,ρ} σ(t)/√n , μ̂(t) + z_{α,ρ} σ(t)/√n ), t ∈ D,

where z_{α,ρ} satisfies P( sup_{t ∈ D} |Z(t)| ≤ z_{α,ρ} ) = 1 − α. This SCB construction parallels the construction of pointwise confidence intervals: μ falls inside the band with probability approaching 1 − α when n and p are large enough. As the correlation ρ increases, the quantile z_{α,ρ} decreases. In particular, if the correlation ρ is identically 1, then z_{α,ρ} equals 1.96 for α = 0.05, the familiar quantile of the standard normal distribution.
As mentioned in Degras (2017), in practice σ(t) is difficult to compute, but it is easily estimated by σ̂(t) = √Γ̂(t, t). Moreover, the quantile z_{α,ρ} can be approximated by simulating the Gaussian process Z ~ GP(0, ρ̂). The practical SCB for μ at confidence level 1 − α is therefore

( μ̂(t) − z_{α,ρ̂} σ̂(t)/√n , μ̂(t) + z_{α,ρ̂} σ̂(t)/√n ), t ∈ D,

where the quantile z_{α,ρ̂}, obtained approximately by simulating Z ~ GP(0, ρ̂), can be viewed as a parametric bootstrap approximation to the distribution of sup_{t ∈ D} √n |μ̂(t) − μ(t)| / σ̂(t).
The process of the parametric bootstrap is as follows (Degras, 2017):
First, discretize the Gaussian process Z ~ GP(0, ρ̂) over a fine grid {τ_1, ..., τ_m} in D and let Z_m denote the resulting random vector. Generally, the grid size m should be large enough that the discretization effect is negligible. However, as m grows the computation cost, of order O(m³), increases quickly, so one needs to choose a suitable m by weighing factors such as the size of the domain D and the roughness of ρ̂. If the domain D runs from 0 to 1, the grid size m should be no higher than about 500.
Second, simulate the multivariate normal distribution Z_m ~ N(0, M_ρ̂), where (M_ρ̂)_{jk} = ρ̂(τ_j, τ_k) for 1 ≤ j, k ≤ m. This simulation incurs a memory cost of O(m²) and a computation cost of O(m³), and both grow quickly as the grid size m increases, so, as in the first step, m should be chosen with care. To greatly reduce both costs, Γ̂(s, t) can be expressed in the reduced form

Γ̂(s, t) = Σ_{k=1}^K λ̂_k φ̂_k(s) φ̂_k(t),

where the truncation order K is much smaller than the grid size m; here the φ̂_k are the eigenfunctions of the sample covariance Γ̂ and the λ̂_k are the corresponding eigenvalues. Recall that this reduced decomposition arises when one uses the FPCA method to estimate Γ(s, t). With it, simulating Z_m ~ N(0, M_ρ̂) only requires simulating a K-dimensional standard normal vector ξ and setting Z_m(τ_j) = Σ_{k=1}^K √λ̂_k ξ_k φ̂_k(τ_j) / σ̂(τ_j).
The last step is to compute the L-infinity norm of Z_m, that is, the maximum absolute value of its entries. Repeat the simulation many times, and take the empirical quantile of level 1 − α of the simulated L-infinity norms as the quantile z_{α,ρ̂}. It is worth mentioning that the parametric bootstrap does not account for the uncertainty associated with selecting the smoothing parameter h and estimating the covariance; the resulting SCB may therefore be inaccurate, especially when the sample size n is not large enough.
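The steps above can be sketched as follows in Python with NumPy (the correlation function, the grid size m, and the number of replications B are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 100                                  # grid size, kept modest: cost grows like O(m^3)
tau = np.linspace(0.0, 1.0, m)
# step 1: discretise Z ~ GP(0, rho) on the grid, for an assumed correlation rho
rho = np.exp(-(tau[:, None] - tau[None, :]) ** 2 / (2 * 0.2 ** 2))
rho += 1e-10 * np.eye(m)                 # tiny jitter for numerical positive definiteness

# step 2: simulate many realisations of the discretised process Z_m
B = 2000
Z = rng.multivariate_normal(np.zeros(m), rho, size=B)

# step 3: L-infinity norm of each realisation, then the 1 - alpha quantile
sup_abs = np.abs(Z).max(axis=1)
z_alpha = np.quantile(sup_abs, 0.95)     # simultaneous quantile z_{alpha, rho}
```

Under these assumptions the simulated quantile comes out well above the pointwise value 1.96, which is precisely why simultaneous bands are wider than pointwise intervals.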
Besides the parametric bootstrap, one can also use a nonparametric bootstrap of sup_{t ∈ D} √n |μ̂(t) − μ(t)| / σ̂(t) to determine the value of z_{α,ρ̂} (Degras, 2017). In general, draw a nonparametric bootstrap sample {X̃*_1, ..., X̃*_n} from the presmoothed curves {X̃_1, ..., X̃_n} and compute the bootstrap estimator μ̃ = (1/n) Σ_{i=1}^n X̃*_i. This process needs to be repeated many times to determine the bootstrap distribution. Afterwards one only needs to use the quantile of level 1 − α of the bootstrap distribution as z_{α,ρ̂}.
Although the nonparametric bootstrap requires more computation than the parametric bootstrap, its results are more accurate because it accounts for the uncertainty in estimating the covariance function and selecting the smoothing parameter h.
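A sketch of the nonparametric bootstrap in Python with NumPy (the “presmoothed” curves here are simulated stand-ins; resampling is over whole curves, with replacement):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 20, 15
t = np.linspace(0.0, 1.0, p)
# stand-ins for the presmoothed curves X_tilde_1, ..., X_tilde_n
X_tilde = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal((n, p))

mu_hat = X_tilde.mean(axis=0)
sigma_hat = X_tilde.std(axis=0, ddof=1)

B = 1000
sup_stats = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)          # resample whole curves, with replacement
    mu_star = X_tilde[idx].mean(axis=0)
    # bootstrap analogue of sup_t sqrt(n) |mu_hat(t) - mu(t)| / sigma_hat(t)
    sup_stats[b] = np.max(np.sqrt(n) * np.abs(mu_star - mu_hat) / sigma_hat)

z_boot = np.quantile(sup_stats, 0.95)         # bootstrap quantile z_{alpha, rho_hat}
```

Resampling entire curves, rather than individual observations, preserves the within-curve dependence structure.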
4. Analyze plasma data set
I will now use the above methodology to analyze the plasma data set through the R package ‘SCBmeanfd’. Before conducting the analysis, I emphasize again that the plasma data set comes from Andersen et al. (1981) and Hart and Wehrly (1986), and that the R package ‘SCBmeanfd’ comes from Degras (2016); thanks for their work. This example illustrates an important application of SCB: a one-sample hypothesis test for the population mean.
> install.packages("SCBmeanfd")
> library(SCBmeanfd)
> data(plasma)
The data set contains the plasma citrate concentrations of 10 randomly sampled persons on one day, measured every hour from 8 AM to 9 PM. The plasma data set is therefore a 10 × 14 matrix, as shown in figure 4.
FIGURE 4: plasma data table
Furthermore, figure 5 below shows the curves of the plasma data set over continuous time, including the mean curve.
> matplot(8:21, t(plasma), type = "l", col = 3, lty = 3,
xlab = "hour per day", ylab = "plasma citrate concentration",
main = "plasma data")
> lines(8:21, colMeans(plasma), col = 1, lwd = 2)
> legend("top", col = 1, lty = 1, legend = "the average")
FIGURE 5: curve of plasma data sets on the continuous index time
According to Leslie and Renty (2016), the normal plasma citrate concentration is larger than 100 μM. I now want to test whether the overall average plasma citrate concentration is in this normal range, i.e. larger than 100 units, given the sample plasma data. Note that the sample mean plasma citrate concentration, computed in R, is 119.1357. Design a one-tailed hypothesis test for the population mean plasma citrate concentration per day: H_0: μ ≡ 100 versus H_a: μ(t_0) > 100 for some t_0 ∈ D.
In this case, I used the cross-validation bandwidth selection method to select the bandwidth h, with the degree of the local polynomial fit equal to 1. Setting the significance level α = 0.05, the simultaneous confidence bands can be drawn as in figure 6:
> h <- cv.select(8:21, plasma,degree=1,interval=c(0.5,1))
> scb <- scb.mean(8:21, plasma, bandwidth =h, scbtype = "both", gridsize =2e3)
> plot(scb, xlab = "hr", ylab = "concentration", main = "plasma data")
> legend("topright", col = "green", lty = 2, lwd = 1, legend = "normal SCB")
> legend("topleft", col = "blue", lty = 2, lwd = 1, legend = "bootstrap SCB")
FIGURE 6: SCB curves for population mean of plasma citrate concentration
As shown in figure 6, the green dotted lines mark the boundary of the normal SCB while the blue dotted lines mark the boundary of the bootstrap SCB, both at confidence level 0.95 for the population mean plasma citrate concentration. The normal SCB excludes 100 over the whole day, and the bootstrap SCB excludes 100 from 6 PM to 8 PM. So we should reject the null hypothesis at significance level 0.05. In other words, the population mean plasma citrate concentration should be larger than 100 μM, which is in the normal range.
Then double-check the test statistic, normal p value, and bootstrap p value with a goodness-of-fit test.
> scb.model(8:21,plasma,model=1,bandwidth=h,level=.05,scbtype="both",
gridsize =2e3)
Goodness-of-fit test
Model for the mean function: linear
Bandwidth: 0.75
SCB type: normal and bootstrap
Significance level: 0.05
Test statistic and p value
stat normal p bootstrap p
5.174 <1e-16 0.0374
So the test statistic is T = 5.174, the normal p value is smaller than 10^(−16), and the bootstrap p value is 0.0374. In this case, there is enough evidence to reject H_0 at significance level 0.05. In other words, there is enough evidence that the mean human plasma citrate concentration is higher than 100 μM, which is in the normal range.
5. Conclusion
Dense functional data are widely used in biomedicine, social science, and finance, and are thus an important topic in statistics. Simultaneous inference is a good statistical tool for functional data analysis (FDA). In particular, simultaneous confidence bands have two main advantages: powerful visualization and easy interpretation.
To conclude, this thesis introduces the basic process and methods to build SCB for the mean of dense functional data. Moreover, it applies SCB to a one-sample hypothesis test for the population mean by analyzing the plasma data set through the R package ‘SCBmeanfd’, and finally gives normal and bootstrap p values that provide evidence that the null hypothesis should be rejected at significance level α = 0.05.
6. Reference
Andersen, A. H., Jensen, E. B., & Schou, G. (1981). Two-way analysis of variance with correlated errors. International Statistical Review, 49, 153–157.
Bande, M. F., Fuente, M. O., Galeano, P., Nieto, A., & Portugues, E. G. (2020). fda.usc: Functional Data Analysis and Utilities for Statistical Computing. Version 2.0.2.
Cao, G., Wang, L., Li, Y., & Yang, L. (2016). Oracle-efficient confidence envelopes for covariance functions in dense functional data. Statistica Sinica, 26(1), 359–383.
Choi, H., & Reimherr, M. (2018). A geometric approach to confidence regions and bands for functional parameters. J. R. Stat. Soc. B, 80, 239–260.
Degras, D. (2011). Simultaneous confidence bands for nonparametric regression with functional data. Statistica Sinica, 21, 1735–1765.
Degras, D. (2016). SCBmeanfd: Simultaneous Confidence Bands for the Mean of Functional Data. Version 1.2.2.
Degras, D. (2017). Simultaneous confidence bands for the mean of functional data. WIREs Comput Stat, 9, 1397.
Eilers, P., & Marx, B. (1996). Flexible smoothing with B-splines and penalties. Stat Sci, 11, 89–121.
Guidoum, A. C. (2015). Kernel Estimator and Bandwidth Selection for Density and its Derivatives: The kedd package. Retrieved October 22, 2020 from https://cran.r-project.org/web/packages/kedd/vignettes/kedd.pdf
Hart, J. D., & Wehrly, T. E. (1986). Kernel regression estimation using repeated measurements data. J Am Stat Assoc, 81, 1080–1088.
Leslie, C. C., & Renty, B. F. (2016). Plasma citrate homeostasis: how it is regulated; and its physiological and clinical implications. An important, but neglected, relationship in medicine. HSOA J Hum Endocrinol, 1(1).
Ramsay, J. O., & Silverman, B. W. (2005). Functional Data Analysis. Springer Series in Statistics. 2nd ed. Springer, New York.
Silverman, B. W. (1985). Some aspects of the spline smoothing approach to non-parametric regression curve fitting. J R Statist. Soc. Series B, 47, 1–52.