Abstract
An accurate and prompt diagnosis of pediatric pneumonia is imperative for successful treatment intervention. One approach to diagnose pneumonia cases is using radiographic data. In this article, we propose a novel parsimonious scalaronimage classification model adopting the ideas of functional data analysis. Our main idea is to treat images as functional measurements and exploit underlying covariance structures to select basis functions; these bases are then used in approximating both image profiles and corresponding regression coefficient. We reexpress the regression model into a standard generalized linear model where the functional principal component scores are treated as covariates. We apply the method to (1) classify pneumonia against healthy and viral against bacterial pneumonia patients, and (2) test the null effect about the association between images and responses. Extensive simulation studies show excellent numerical performance in terms of classification, hypothesis testing, and efficient computation.
Keywords: Xrays; Image classification; Functional data analysis; Pneumonia detection; Scalaronimage regression.
1 Introduction
Pneumonia is a form of an acute respiratory illness affecting the lungs. Though pneumonia can occur at any age, younger children are most vulnerable. Bacteria, viruses, or fungi can categorize the common causes of pneumonia. In all these cases, children experience breathing difficulty as their lungs get contaminated with pus and fluid, which results in the inflammation of air sacs limiting the oxygen intake. Pediatric pneumonia is the primary cause of fatality that claims 3 million children globally every year, according to the United Nations Children’s Fund (UNICEF) [51]. Unfortunately, the majority of these deaths were preventable and occurred due to the lack of proper immunization, adequate nutrition, and the availability of antibiotic treatments. The developing countries, particularly in Southeast Asia and subSaharan Africa, suffer most and have the majority of death tolls. The developed countries still suffer substantially from the burden of disease, with 2.5 million incidences yearly leading a third to a half of these to hospitalizations incurring high healthcareassociated costs; see [3, 20, 11]. Very recently, with the discovery of the pneumococcal vaccine, the risk of pneumonia in the United States has been reduced substantially. In contrast, the developed countries are yet to avail such measures, and they have been working strenuously to achieve the Integrated Global Action Plan for the Prevention and Control of Pneumonia and Diarrhoea (GAPPD) under the supervision of the World Health Organization (WHO) and UNICEF.
The clinical signs or symptoms for pneumonia are often nonspecific and vary based on the patient’s physical characteristics. Accurate and timely diagnosis is the key to fight pneumonia. The current stateoftheart diagnostic approaches include polymerase chain reaction (PCR), serology, culture, complete blood cell (CBC), chest radiography, and ultrasonography based test. Some of these tests require advanced lab facilities and are not entirely immune from producing false negatives. The PCR test is widely used but requires 2448 hours to get back the diagnostic results and often requires close contact to collect the specimens from the suspected individuals manually, which is risky and has a high chance of transmitting infectious diseases like COVID19. Recently, chest Xrays have shown a lot of promise and been used in diagnosis or as a confirmatory test for pneumonia [30, 28]. Such medical images need to be interpreted by radiologists to detect appropriately any abnormalities; unfortunately, this is often a manual, timeconsuming step, and requires experts with domainknowledge to diagnose. Therefore, there is a need for intelligent decision support that can empower and augment clinical decision making, which in turn saves time and prevents physicians’ burnout and morbidity. Such a tool, if developed, may classify, flag, or confirm pneumonialike patients using radiography images or can even stratify symptomatic patients into different risk categories as needed; see [2, 27, 53, 50, 52, 16] and many others.
Deep learningbased artificial intelligence (AI) systems have been used extensively in studying radiology images and detecting patients [27]
. While AI works well in prediction, it requires a substantially large dataset to train the machine learning model, which may not always be feasible. Besides, such an approach does not offer flexibility to either make inference about the parameters of interest or quantify uncertainty in estimation. Very recently, statisticalbased methodologies have been applied in neuroimaging; see
[45, 26, 22, 55, 42] and many others. [39, 26, 22] adopted Bayesian spatial modeling approaches to model the correlation between neighboring voxels with carefully placing priors on brain regions in predicting adverse clinical outcomes of Alzheimer’s disease. In a similar spirit, [46, 18, 45]fitted univariate linear regressions on each region of interest of brain image separately to explore the relationship between images and scalar measures. While most of these methods primarily focus on prediction, the inferential aspects of parameter estimates are not discussed explicitly.
[55, 54] model an image data as functional measurements and apply the regularization penalty to induce zeroeffect for the region where there is no association between responses and images. Here the authors used prespecified Haar wavelet basis functions to approximate coefficient functions for which the selection of the optimum number of basis is based on either information criterion or crossvalidation, which is a timeconsuming step for multidimensional images. Such methodologies are also not directly applicable in the presence of noisy image data. Furthermore, the authors adopted a permutationbased test to make inference about the nonzero effect in a region, and it is unclear how to test the nullity of the overall regression coefficient.In this paper, we address these limitations and introduce a novel scalaronimage regression procedure based on functional principal component (FPC) analysis. There are three key novelties of this paper. The first novelty is the use of the functional data analysis approach exploiting the underlying covariance structure of 2dimensional (2D) Xray data, which has not previously been used, to the best of our knowledge, in this generality in image classification. The second novelty is the parsimonious modeling of the coefficient function using datadriven basis functions. The third novelty is casting the scalaronimage regression model into a generalized linear model framework, which allows hypothesis testing about the parameters of interests explaining the association between responses and covariates. The main advantages of this approach are as below  (1) It provides excellent numerical performance in terms of classification accuracy; (2) it is computationally efficient and magnitude faster; (3) unlike the deep learning approach, it does not require a massive set of images to train the model; (4) it offers inference about the effect of radiographic images in classification; (5) it is applicable for images corrupted with subtle noises.
The paper is organized as follows. Section 2 describes the collection and processing of the Xray. Section 3 details the image classification approach. Section 4 presents the estimation procedure. Section 5 describes the inferential procedure. Section 6 presents the application in classifying pediatric chest Xrays to detect pneumonia. Multiple simulation studies are performed in Section 7, showing the robustness of our method mimicking the original data. Section 8 concludes the paper with a discussion.
2 Data processing
Figure 1 illustrates the posteroanterior Xray images for (a) four randomly chosen healthy subjects (top four panels), (b) two bacterial pneumonia patients (twobottomleft), and (c) two viral pneumonia patients (twobottomright). (a) The normal chest Xrays have no signs of abnormal opacification such as fluid or solid materials within the airways in the images, (b) Bacterial pneumonia usually displays a focal lobar consolidation where a segment of lung shows confluent cottonlike opacity, (c) Apparently, more diffuse “interstitial” patterns such as scattered white patches in both lungs are evident in the Xrays of viral pneumonia patients.
The pneumonia dataset [7] consists of 5,863 XRay images (in JPEG format) for two categories: pneumonia (bacterial and viral) and normal cases; see Figure 1. This dataset is freely available in Kaggle. These chest Xray images (anteriorposterior) were collected from pediatric patients aged between one to five years who were treated in the Guangzhou Women and Children’s Medical Center, Guangzhou, China, before 2018. These chest Xray images were recorded during the patients’ routine clinical visits. For the purpose of analysis, all chest radiographs were initially screened for quality control by removing all low quality or unreadable scans, and the diagnoses were labeled by two expert physicians [7]. For the purpose of our analysis, we use the Xray images saved in the Train folder as there could be an issue with labeling in the Test folder identified by independent researchers; see [9]. There are 5,216 images (1,341 normal and 3,875 pneumonia cases) in the folder, which we use for our analysis, assuming that the current labeling is correct.
Each Xray image is grayscaled 2dimensional (2D) array structure of dimension where and represents the width and height of the images. The image data are comprised of intensity profiles with values between 0 and 1 representing black and white, respectively. We reshape all images into the same size, i.e., for our computational convenience.
3 Proposed methodology
Let the observed data be where indexes the subject, is the scalar response associated with the th subject, is the th observed clinical or demographic (e.g, age, prior history of lung diseases, number of days since the start of symptoms or hospitalization), is the realized discrete intensity value of a twodimensional (2D) projection image at discrete grid points and such that and where both and are closed compact sets. We assume that and are regular, dense design in and respectively. In our application, takes values either 0 or 1 indicating the disease status (i.e., normal or pneumonia) of the th subject and ’s indicate the intensity values of the Xray images which are the 2D projections of 3D objects. In our modeling, we treat each 2D image object as a functional measurement.
Denote by the exponential family distribution with mean and dispersion parameter As our objective is to classify patients with respect to their radiograph images, we propose to fit the following generalized functional regression model
(1)  
where
is a known monotone link function (e.g., logit link for the binary responses),
is an unknown scalar intercept, and is an unknown bivariate function defined on quantifying the association between mean responses and functional covariate. If the chest Xrays of healthy and pneumonia afflicted patients provide nondistinguishable pattern, then For notational convenience, we denote simply by and use them interchangeably. Model 7 is a variation of the classical functional linear models described in [4, 23, 43, 15, 34, 13] and also an extension of the functional varying coefficient models [5, 29, 47, 48].3.1 Model approximation
Let be a set of orthonormal basis defined in such that if and 1 otherwise. Similarly, define by the orthonormal basis functions spanned in such that if
and 1 otherwise. We use the tensor product of these basis functions to approximate
such that where ’s are the unknown basis coefficients and can be defined uniquely by For our modeling exposition, we let vary within each resulting in We represent the functional covariate using the same basis functions and in two steps: first, we write where is a zeromean smooth process defined on and independent identically distributed (IID) over Next, for each write where ’s are the weights associated with the corresponding basis functions and IID over and . It follows that(2)  
Here the infinite summation is intractable and needs to be truncated at some finite levels, say and for direction and direction, respectively. In a similar spirit to [47], the truncation values and control the degree of smoothness of in both directions. As in the nonparametric regression approach, the choice of truncation values leads to overly smooth or wiggly patterns in and therefore needs to be selected with caution that we discuss in the next section. Using the equation 2, we cast the model into a generalized linear model [35] framework for given ’s as below
(3) 
where , , are the unknown regression coefficients. Note that the number of basis functions is kept same across for the purpose of exposition only. We relax this assumption in the next sections assuming
In our model, first, we need to estimate . Subsequently, using the estimated components, we fit the model 3 and estimate ’s so as
3.2 Selection of basis functions
The selection of orthonormal basis functions is critical in approximating the model in the form of 3. One approach is to select a prespecified orthonormal basis functions similar to [54, 12, 48]. An alternative approach is to use a datadriven basis; see [47, 37] where the main idea is to estimate the marginal covariance function of the observed functional covariates and select the corresponding orthogonal eigenbasis as a basis. In our approach, we use the latter approach as it has the advantage of exploiting the underlying covariance structures of radiographical images and allows us to account for correlation between the neighboring points.
The observed functional measurements are most likely to be contaminated with error or white noise. Unlike
[54], assume , where ’s are the noisy realizations of the true functional covariates, ’s, and is a white noise with mean 0 and covariance, and is independent over , and . Define , where is the sampling density of ’s; is a proper covariance function (positive semidefinite and symmetric function); see [41, 19]. Define the covariance function of the observed functional covariate realized at a point in direction by The spectral decomposition of results in where’s are the nonnegative eigenvalues and
’s are the orthonormal eigenbasis functions spanned in which are used in approximating The infnite summation is truncated at a finite level such aswhich is determined based on predetermined percentage of variance explained (PVE) value where the main idea is to choose the smallest integer
such that is larger than the preset PVE; such approach is adopted in [15, 47, 43, 8]. This follows that and the observed covariate can be written as where the th observed loadings are In particular, where is a nugget effect. In a similar spirit to [47], we model the loadings to exploit the underlying covariance structures in the direction. Define the covariance function of the th latent smooth process, by Assume A spectral decomposition of leads to , where ’s are the corresponding eigencomponents with ’s being the nonnegative eigenvalues and ’s being the orthonormal basis functions spanned in Using the truncated KarhunenLoève (KL) expansion, we approximate by where ’s are the weights for the th functional principal component (FPC) associated with the th direction for the th subject, and is uniquely defined by . The ’s are the covariates used in modeling 3. Such idea of using FPCs with the largest variation are most predictive of responses in an association model is the basis of functional principal component regression (FPCR); [13, 43]. In a similar spirit to the selection of the truncation value is also chosen by prespecified PVE value.4 Estimation
4.1 Estimation of components related to functional predictor
Estimation is done following the procedures described in [47, 59] where the ideas were developed for a sparse design, we borrow the same techniques for a dense design. We briefly review the idea here. We demean the observed functional predictor to ensure Define the demeaned covariates by Assume is a matrix of stacking up all the image data over subjects. A pooled sample covariance estimator is obtained as a crossproduct of the elements of divided by the total number of elements in each column such that the th element is Denote the covariance matrix by Due to the presence of noise in the diagonal terms in are inflated by variance and thus requires smoothing. We apply the bivariate smoothing approach [57] to obtain the smoothed covariance function which is denoted by Applying spectral decomposition on with a preset PVE value results in a set of eigencomponents, where the terms bear the meaning as described in section 3.2. Next the corresponding loadings are obtained through numerical integration as
Let the data obtained from the estimated loadings for a fixed be . Using the idea described in [59], we estimate the covariance function of the loadings and denote it by ’s. Next the spectral decomposition leads to The corresoponding scores ’s are obtained by casting the model for
’s into a linear mixed model framework assuming that
’s follow Gaussian distribution; see
[59] for details. We replicate this procedure for each components.4.2 Estimation of response related parameters
Given the estimated scores the approximating model in 3 can be written as Define and let be the norms. As in [47] we set PVE generously large enough to capture all nonnegligible association between the functional predictor and scalar responses. It is possible that a direction is only associated with noise and have variance much larger than the signal, such components introduce wiggly pattern of and are likely to overfit the model. To circumvent this problem, we use a penalized (PEN) technique to estimate parameters. In our application, we also use nonpenalized (NonPEN) version.
For NonPEN, we estimate by maximizing the loglikelihood function of the GLM such as
For PEN, we use a group LASSO penalty to regularize estimation of ; we impose the penalty on the effects total magnitude . Here we maximize the following penalized criterion
adopting the ideas described in [60, 58]. Here is a tuning parameter controlling sparsity by shrinking all coefficients associated with a group to zero simultaneously; is selected by Kfold crossvalidation (CV) technique.
4.3 Prediction of response
Once ’s are estimated, we estimate the mean of by where is the inverse of the link function
5 Inference
Though our idea is illustrated by using 2D Xray images, the proposed methodology can be leveraged in other radiography images such as magnetic resonance imaging (MRI), positron emission tomography (PET) images, ultrasounds, and mammography. While there are various kinds of medical images, it is a question of interest to investigate which images are sufficiently discriminative enough to differentiate between cases and noncases. For instance, if there is no difference between the chest Xrays of pneumonia and healthy patients, then the odds ratio for a given covariate
will be equal to 1. Such a hypothesis of interest can be written formally as(4)  
Equivalently, with an abuse of notation, we write where Using the orthonormal property of the basis functions this can further be written as
This follows that the hypothesis 4 can be reformulated as
(5)  
The parsimonious modeling framework 3 allows to make inference about the significance of the association between binary responses and functional measurements through testing the nullity of ’s for all ’s and ’s using the conventional hypothesis testing procedures such as likelihood ratio test (LRT), F1 test, Wald test, or score test. While all these tests are asymptotically equivalent, we use LRT due to its amenable theoretical properties in testing both constrained and unconstrained parameter space; see [25, 33]. In particular, the test is based on the differences between the likelihoods computed from the maximum likelihood estimates (MLEs) under and
Define the test statistic by
(6) 
where refers to the likelihood value with respect to the MLEs under and the MLEs under ; it can be shown that follows asympotically distribution with degrees of freedom where equals the difference in the number of free parameters to be estimated under and such that LRT has been studied extensively and implemented in various applications for decades; we refer to [40, 56, 31] and references there in for details. Note that such hypothesis testing procedure makes more sense for the NonPEN model. If the PEN model is used, will ideally shrink all coefficents, ’s, to zero when there is no association between responses and covariates given that the optimum value for is chosen appropriately.
6 Pneumonia image classification
We apply the methodology on the pediatric pneumonia dataset described in section 2. The main objectives are to classify the images by (S1) regular and pneumonia and (S2) viral pneumonia and bacterial pneumonia cases, and evaluate the statistical significance of the association between Xray images and binary classes. In particular, we use the model
(7) 
where is a logit link assuming is a binary response such that in (S1), for pneumonia case and 0 otherwise. Similarly in (S2), let be for bacterial pneumonia, and 0 for viral pneumonia. While we use both healthy and pneumonia data in the former classification problem, the latter exploits only the pneumonia data.
6.1 Computational details
We implement the method in R version 3.6.3 [49]. We read and edit the Xray images using the package EBImage [38]. We use the fpca.face and fpca.sc functions from the refund [14] to estimate the covariates related components. We set the PVE value equal to 0.99 in our application. The NonPEN is solved using the glm function from the package stats. The PEN model is fitted by gglasso function of the gglasso package [58]. The LRT is calculated using the lrtest function from the lmtest package [61]. The computation time to train and test the NonPEN model in a sample of 2,600 Xray images is approximately 40 seconds, and it takes approximately 240 seconds to trainandvalidate the sample of 5,216 images in a machine with an Intel Corei7 processor having 16GB RAM.
6.2 Permormance metrics
We split the data into a training set (insample) on which the model is built and a test set (outofsample) on which the performance is evaluated; we report the results for both insample and outofsample dataset. Two sampling mechanisms are considered. (M1) All 5,216 images (1,341 standard and 3,875 pneumonia) are used where 350 pictures from each type are selected randomly and kept aside to constitute the test dataset; the remaining 4,466 Xrays are used to train the model. (M2) A random sample of 1,300 from each type (i.e., healthy and pneumonia) so that a total of 2,600 images form the whole dataset. As in (M1), a total of 700 copies (i.e., 350 healthy and 350 pneumonia) form the test set, and the remaining 1,900 images build the training cohort. There are 1,345 viral and 2,345 bacterial pneumonia Xrays. Notice (M2) ensures balanced cases and uses much smaller sample to train the model than that of (M1); the purpose is to assess the model’s performance with a smaller sample. We consider two ways of forming the test set  (a) replicating the testtrain split 200 times and evaluating the performance within each split (b) forming the testtrain split for once. In both cases, test images are selected randomly without replacement. Approach (a) will potentially induce more randomness into the traintest split and is less susceptible to sampling bias. Let and be the set of patients referring to the pneumonia (i.e., ) and healthy cases (i.e., ) respectively such that where is the cardinality of a set. Define

True positive (TP) =

False negative (FN) =

True negative (TN) =

False positive (FP) =
Here is an indicator function taking values 0 and 1. We consider the following performance metrics

True positive rate (TPR) = TP / (TP + FN)

True negative rate (TNR) = TN / (TN + FP)

Positive predictive value (PPV) = TP / (TP + FP)

Accuracy = TP + TN / (TP + TN + FP + FN)

Mathhews correlation coefficient (MCC) =

F =

Area under the curve (AUC) of a receiver operator characteristic (ROC) curve; ROC is generated by plotting the TPR versus false positive rate (FPR) at various thresholds of a binary classifier system.
6.3 Classification performance
Table 1 provides the numerical performance of the model for both insample and outofsample image data for (S1). The classification results are close in the validation set for both (a) and (b); due to the smaller training sample in (b), unsurprisingly, insample performances are marginally better. The high ACC values indicate that the model identifies pneumonia and healthy patients using Xrays correctly. The high F values suggest a good balance between precision (i.e., TPR) and recall (i.e., PPV) referring to the number of instances that the model correctly classifies and instances that the model misses, respectively. Besides, the high AUC values () indicate an excellent balance between TPR and the false positive rate (FPR). Apparently, we observe better classification performances in the NonPEN method for (M1) than that of (M2). This could be due to the fact that (M1) is comprised of all images implying more variation in the data; therefore, an FPC analysis with a high PVE (i.e., 0.99) is likely to provide some directions associated with noise that could potentially overfit the model. Therefore, by shrinking such directions to zero, we improve the model performance. The IQR values provide evidence of small variation in the reported summary values across 200 validation sets for (a). The other important metrics such as TNR, PPV, and are reported in the Supplementary Material; see Section in 9.
Table 2 provides the numerical performance of the model for (S2). Separating viral and bacterial cases using a scalaronimage regression is a much more difficult problem than that of (S1). Overall, the model performs satisfactorily in differentiating between viral and bacterial pneumonia patients. As above, we observe better outofsample performances in PEN approach.
We are interested in evaluating two scientific questions. (1) Do the radiographic image data differ between healthy and pneumonia patients? (2) Do the image data differ between viral and pneumonia patients? We use 6 to test the hypotheses. As the calculated pvalue is 2.2e16 for testing (1), we may conjecture that the observed patterns in the Xrays are distinctive between healthy individuals and pneumonia patients. Similarly, since the pvalue for testing (2) is 2.2e16, we may conclude that there is a statistically significant difference in the image data between viral and pneumonia patients. Further results related to testing the equality of mean scores between healthy and pneumonia patients are provided in Section of the Supplementary Material 9.
7 Numerical experiment
7.1 Classification performance on simulated data
In this section, we consider a simulation study generating Xray images for healthy and pneumonia patients. The objective is to assess the model performance when the images are alterted by noises. Let the observed data be where is the binary response associated with the th subject for who the simulated image, is rendered. Let We generate grayscaled pixel images for each patient such that and Denote by the model that generates noisy realizations of Xray images; where is a white noise which follows IID Here ’s are the original grayscaled images selected randomly from 5,216 original images; i.e. or
. Note that the standard deviation
controls the departure in pixel from the true Xray images; as departs from zero, the image quality drops gradually and gets perturbed on both and directions. In our simulation study, we consider see Figure 2. Note that as the magnitude of noise increases, it gets difficult to interpret and becomes indistinguishable between healthy and pneumonia cases. For each setting, we run simulations for 500 times, and within each run, we simulate the Xray images for 300 healthy and 300 pneumonia patients. We label the binary responses such that if a patient belongs to the healthy cohort and 1, otherwise. To assess the classification performance of the proposed method, we divide each simulated dataset into training (75%) and test set (25%) and calculate the classification metrics for both insample and outofsample data. We set PVE equal to 0.90 for this experiment.Table 3 displays the binary classification results for the simulated data with different magnitude of noises. The results are close for and , which provides evidence of the efficacy of the proposed methodology in the presence of noise. Overall, both PEN and NonPEN perform competitively. While the model still classifies patients satisfactorily for the performance drops to an extent. This phenomenon is not too unexpected as it becomes hard for the model to extract the signal out of the data from such noisy images. In reality, such lowquality images are not even recommended for clinical use anyway. Additional results are provided in Section of the Supplementary Material 9.
Figure 3 illustrates the area under the curve (AUC) of a receiver operator characteristic for different noise levels for the NonPEN (left) and PEN (right) model; the differences are very marginal and visually indistinguishable. The model performance with respect to AUC remains somewhat similar for and As expected, with , the performance of the classification model decays gradually.
7.2 Numerial evaluation of inferential property
As before, we simulate the images by where or , and follows Gaussian distribution with mean 0 and standard deviation and is an IID. Let All ’s are selected randomly from the pool of original 5,216 images. Let control the number of samples taken from the pneumonia cohort. We consider two scenarios. (N1) Simulate 600 ’s from the original 1,341 files associated with the normal Xrays by sampling without replacement. We assign the first 300 labels as noncases (i.e., ) and the rest 300 as cases (i.e., ) ignoring their true class assignments as (N2) Simulate ’s in such a way so that data are comprised of both normal and penumonia patients. The first 300 Xrays are generated from the healthy cohort (i.e., ) and we label the corresponding responses as For the second half, % of the remaining 300 images are selected from the healthy cohort (i.e., ) and % are from the pneumonia cohort (i.e., ). For these 300 cases, we label the corresponding responses as Note that (N1) and (N2) evaluate the rejection rates of under and as in 4, respectively.
The top section of table 4
reports the rejection probabilities and corresponding standard errors for testing the null effect under the case when indeed
is true. The results are based on 8,000 simulations. The empirical typeI error rates are around the nominal levels for
However, we notice slightly lower rejection rates for . The bottom section illustrates the power properties for testing the null effect when data are generated under . The results associated with the power are based on 500 simulations. As departs from 0, more pneumonia patients are added in the sample. As expected, the empirical rejection rates for the model get closer to 1 for higher at different8 Discussion
In this paper, we propose a novel scalaronimage regression by borrowing the ideas from functional data analysis where we view images as functional measurements. We approximate both functional covariate and coefficient using the same orthonormal datadriven basis functions and cast the model into a GLM framework. The methodology relies on the assumption that the leading eigenbasis functions of the functional predictor are most predictive of generalized response and that the latent predictor signals are relatively smooth.
The proposed idea is applicable to other response types that follow multinomial, Poisson, Gaussian, or any other distribution belonging to the Exponential family; in this case, we need to select the corresponding link function appropriately (e.g., log link for Poisson). The methodology can be extended to the case with multiple functional predictors observed with or without noise and defined on diverse sampling designs. Numerical results using chest Xrays show excellent classification performances in detecting (a) pneumonia and healthy patients, (b) viral and bacterial pneumonia cases. The method performs competitively even when we have a smaller sample to train the model. Two numerical experiments based on the data application show the robustness of the methodology and exhibit efficacy in attaining the inferential properties in terms of size and power. Despite the increased flexibility, the method is computationally efficient (computation time is in seconds) and can be implemented with the existing freely available software.
This work also opens a few avenues for future exploration. One possibility is to extend this methodology to explore the association between scalar responses and images that are observed longitudinally in a sparse design. Such an idea is vital to study the prognosis of diseases. In this case, we need to borrow the concept of longitudinal data analysis and merge it into the proposed scalaronimage regression.
While the methodology is applied to the pediatric pneumonia cases, the approach can also be implemented in detecting pneumonialike other infectious diseases. One possible test case could be the detection of COVID19 patients in the recent pandemic. Based on the Johns Hopkins database [24], as of the 6th of May, 2020, approximately 3,755,341 people are infected by 2019 Novel coronavirus known as COVID19 and more than 263,831 individuals died across the globe. Figure 4 illustrates the chest Xray image for three COVID patients extracted from the freely available Kaggle dataset [6]; here the scattered white patches in both lungs are noticeable.
COVID19 is very infectious, and once exposed, a person can transmit the infection being in an asymptomatic state, which is a state without showing medically known clinical signs or symptoms. Also, the detection of COVID19 patients becomes a significant challenge due to the limited detection kits and short of medical supplies. The current testing approach is based on PCR, which is a timeconsuming procedure. Besides, the community hospitals in remote areas are falling short in medical supplies and staff. Therefore, an alternative testing procedure based on radiography examination may play a role in identifying COVID patients as a supplement to the existing process. In a recent study, the chest radiography images of COVID19 patients indicate abnormalities in their Xrays, suggesting the effectiveness of a radiography examination as an alternative infection detection approach; see [21, 36, 1]. We want to address that COVID detection is more complicated due to the presence of preexisting conditions and uncertainty about diseasebiology as we are still in an early stage of this outbreak, and thus requires substantial validation and very highquality data.
Acknowledgments
The content is solely the responsibility of the author and does not represent the official views of the UnitedHealth Group R&D or its any associated subsidiaries.
References
 [1] (2020) Correlation of chest ct and rtpcr testing in coronavirus disease 2019 (covid19) in china: a report of 1014 cases. Radiology, pp. 200642. Cited by: §8.
 [2] (2015) Deep learning with nonmedical training used for chest pathology identification. In Medical Imaging 2015: ComputerAided Diagnosis, Vol. 9414, pp. 94140V. Cited by: §1.
 [3] (2018) Pediatric pneumonia. Medscape. External Links: Link Cited by: §1.
 [4] (1999) Functional linear model. Statistics & Probability Letters 45, pp. 11–22. Cited by: §3.
 [5] (2008) Varyingcoefficient functional linear regression models. Communications in Statistics—Theory and Methods 37, pp. 3186–3203. Cited by: §3.
 [6] (2020) COVID19 image data collection. arXiv 2003.11597. External Links: Link Cited by: §8.
 [7] (2018) Labeled optical coherence tomography (oct) and chest xray images for classification. Note: Mendeley Data, v2 External Links: Link Cited by: §2.

[8]
(2009)
Multilevel functional principal component analysis
. The annals of applied statistics 3 (1), pp. 458. Cited by: §3.2.  [9] (2018) Labeled optical coherence tomography (oct) and chest xray images for classification. External Links: Link Cited by: §2.
 [10] (2018) An introduction to generalized linear models. CRC press. Cited by: §4.2.
 [11] (2019) Pediatric pneumonia. In StatPearls [Internet], Cited by: §1.
 [12] (2015) Functional additive regression. The Annals of Statistics 43 (5), pp. 2296–2325. Cited by: §3.2.
 [13] (2013) Longitudinal scalaronfunctions regression with application to tractography data. Biostatistics 14, pp. 447–461. Cited by: §3.2, §3.
 [14] (2018) Refund: regression with functional data. R package version 0.117. External Links: Link Cited by: §6.1.
 [15] (2011) Penalized functional regression. Journal of computational and graphical statistics 20 (4), pp. 830–851. Cited by: §3.2, §3.
 [16] (2020) Coronavirus detection and analysis on chest ct with deep learning. arXiv preprint arXiv:2004.02640. Cited by: §1.
 [17] (2000) Principles and practical application of the receiveroperating characteristic analysis for diagnostic tests. Preventive veterinary medicine 45 (12), pp. 23–41. Cited by: §4.3.
 [18] (2009) Clinical severity of alzheimer’s disease is associated with pib uptake in pet. Neurobiology of aging 30 (12), pp. 1902–1909. Cited by: §1.
 [19] (2012) Inference for functional data with applications. Vol. 200, Springer Science & Business Media. Cited by: §3.2.
 [20] (2019) Global childhood pneumonia: the good news, the bad news, and the way ahead. The Lancet Global Health 7 (1), pp. e4–e5. Cited by: §1.
 [21] (2020) Clinical features of patients infected with 2019 novel coronavirus in wuhan, china. The Lancet 395 (10223), pp. 497–506. Cited by: §8.
 [22] (2013) Bayesian scalaronimage regression with application to association between intracranial dti and cognitive outcomes. NeuroImage 83, pp. 210–223. Cited by: §1.
 [23] (2005) Functional adaptive model estimation. Journal of the American Statistical Association 100, pp. 565–576. Cited by: §3.
 [24] (2019) 2019 novel coronavirus covid19 (2019ncov) data repository by johns hopkins csse. External Links: Link Cited by: §8.
 [25] (1963) Econometric methods. Vol. 26, New York. Cited by: §5.
 [26] (2011) Meta analysis of functional neuroimaging data via bayesian spatial point processes. Journal of the American Statistical Association 106 (493), pp. 124–134. Cited by: §1.
 [27] (2018) Identifying medical diagnoses and treatable diseases by imagebased deep learning. Cell 172 (5), pp. 1122–1131. Cited by: §1, §1.
 [28] (2007) Round pneumonia: imaging findings in a large series of children. Pediatric radiology 37 (12), pp. 1235–1240. Cited by: §1.
 [29] (2016) Longitudinal functional models with structured penalties. Statistical Modelling 16, pp. 114–139. Cited by: §3.
 [30] (1998) Pneumonia in infants and children: radiologicalpathological correlation. In Seminars in roentgenology, Vol. 33, pp. 151–162. Cited by: §1.
 [31] (2018) Generalized linear models. Routledge. Cited by: §5.
 [32] (1989) Generalized linear models chapman and hall. New York. Cited by: §4.2.
 [33] (2007) Likelihood ratio, score, and wald tests in a constrained parameter space. The American Statistician 61 (1), pp. 22–27. Cited by: §5.
 [34] (2012) Functional additive models. Journal of the American Statistical Association 103, pp. 1534–1544. Cited by: §3.
 [35] (1972) Generalized linear models. Journal of the Royal Statistical Society: Series A (General) 135 (3), pp. 370–384. Cited by: §3.1.
 [36] (2020) Imaging profile of the covid19 infection: radiologic findings and literature review. Radiology: Cardiothoracic Imaging 2 (1), pp. e200034. Cited by: §8.
 [37] (2015) Longitudinal functional data analysis. Stat 4 (1), pp. 212–226. Cited by: §3.2.
 [38] (2010) EBImage—an r package for image processing with applications to cellular phenotypes. Bioinformatics 26 (7), pp. 979–981. Cited by: §6.1.
 [39] (2005) Bayesian fmri time series analysis with spatial priors. NeuroImage 24 (2), pp. 350–362. Cited by: §1.
 [40] (2010) A practical guide to multilevel modeling. Journal of school psychology 48 (1), pp. 85–112. Cited by: §5.
 [41] (2007) Applied functional data analysis: methods and case studies. Springer. Cited by: §3.2.
 [42] (2015) Waveletdomain regression and predictive inference in psychiatric neuroimaging. The annals of applied statistics 9 (2), pp. 1076. Cited by: §1.
 [43] (2007) Functional principal component regression and functional partial least squares. Journal of the American Statistical Association 102 (479), pp. 984–996. Cited by: §3.2, §3.
 [44] (1999) The interpretation of diagnostic tests. Statistical methods in medical research 8 (2), pp. 113–134. Cited by: §4.3.
 [45] (2010) Voxelbased analysis of alzheimer’s disease pet imaging using a triplet of radiotracers: pib, fddnp, and fdg. Neuroimage 52 (2), pp. 488–496. Cited by: §1.
 [46] (2006) Tractbased spatial statistics: voxelwise analysis of multisubject diffusion data. Neuroimage 31 (4), pp. 1487–1505. Cited by: §1.
 [47] (2020) Longitudinal dynamic functional regression. Journal of the Royal Statistical Society: Series C (Applied Statistics) 69 (1), pp. 25–46. Cited by: §3.1, §3.2, §3.2, §3, §4.1, §4.2.
 [48] (2020) Optimal emg placement for a robotic prosthesis controller with sequential, adaptive functional estimation (safe). Annals of Applied Statistics. Cited by: §3.2, §3.
 [49] (2013) R: a language and environment for statistical computing. Cited by: §6.1.
 [50] (2019) Semisupervised multitask learning with chest xray images. In International Workshop on Machine Learning in Medical Imaging, pp. 151–159. Cited by: §1.
 [51] (2019) Estimates developed by the un interagency group for child mortality estimation. UNICEF. External Links: Link Cited by: §1.

[52]
(2020)
COVIDnet: a tailored deep convolutional neural network design for detection of covid19 cases from chest radiography images
. arXiv preprint arXiv:2003.09871. Cited by: §1. 
[53]
(2018)
Tienet: textimage embedding network for common thorax disease classification and reporting in chest xrays.
In
Proceedings of the IEEE conference on computer vision and pattern recognition
, pp. 9049–9058. Cited by: §1.  [54] (2017) Classification of adni pet images via regularized 3d functional data analysis. Biostatistics & epidemiology 1 (1), pp. 3–19. Cited by: §1, §3.2, §3.2.
 [55] (2014) Regularized 3d functional regression for brain image data via haar wavelets. The annals of applied statistics 8 (2), pp. 1045. Cited by: §1.
 [56] (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (1), pp. 3–36. Cited by: §5.
 [57] (2013) Fast bivariate psplines: the sandwich smoother. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75, pp. 577–599. Cited by: §4.1.
 [58] (2015) A fast unified algorithm for solving grouplasso penalize learning problems. Statistics and Computing 25 (6), pp. 1129–1141. Cited by: §4.2, §6.1.
 [59] (2005) Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association 100 (470), pp. 577–590. Cited by: §4.1, §4.1.
 [60] (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (1), pp. 49–67. Cited by: §4.2.
 [61] (2002) Diagnostic checking in regression relationships. R News 2 (3), pp. 7–10. External Links: Link Cited by: §6.1.
 [62] (1993) Receiveroperating characteristic (roc) plots: a fundamental evaluation tool in clinical medicine. Clinical chemistry 39 (4), pp. 561–577. Cited by: §4.3.
9 Supplementary Materials
This Supplementary Material consists of two sections. Section 9.1 provides additional results from the analysis of the image classification for healthy and pneumonia patients. Further results from the numerical experiment are illustrated in Section 9.2.
9.1 Additional results for pnumonia detection analysis
9.1.1 Exploratory analysis
Figure 5 illustrates grayscaled image data for three randomly chosen patients  healthy (left), viral (middle), and bacterial (right) pneumonia cases, respectively. The observed functional covariate, associated with the th subject, is comprised of intensity profiles of the corresponding Xray image containing values between 0 (black) and 1 (white). The corresponding smoothed profiles are displayed in dashed black lines.
9.1.2 Data analysis
We illustrate the idea for (S1). We apply functional principal component (FPC) analysis on the marginal covariance function induced by the functional predictor comprised of intensity profiles associated with 5,216 Xray images. Setting a percentage of variance explained (PVE) value equal to 99% provides directions at which we have maximum variation in the data. We estimate functional principal components and corresponding eigenvalues where . Figure 6 displays the estimated directions with the explained percentage of variance; first, 6 FPCs explain almost 96% variation in the data. Similarly, spectral decomposition on for each results in ’s and ’s. Figure 7 displays the estimated directions associated with the profile of loadings ; first, 6 FPCs explain approximately 96% variation in the data. Similarly, we do it for each In our example,
Figure 8 displays the boxplots of the estimated scores (first 60 components) corresponding to both normal and pneumonia patients’ intensity profiles. Figure 9 displays the log pvalues for testing the equality of mean scores between healthy and pneumonia patients at 5% level of significance. In particular, we test for each and where and refer to the mean values associated with the
th component for normal and pneumonia patients, respectively. A twotailed ttest statistic is used to make inference. For testing,
for each we adjust for multiple comparisons by Bonferroni correction.We report additional classification metrics associated with a confusion matrix. The metrics are defined as in Section 6. For completeness, we recall the necessary items here.

True negative rate (TNR) = TN / (TN + FP)

Positive predictive value (PPV) = TP / (TP + FP)

Mathhews correlation coefficient (MCC) =

F =
Table 5 reports results for (S1). We observe high TNRs implying low falsepositive rates (FPRs), and high PPVs indicating low false discovery rates (FDR). Such observations are dominant across all settings. As before, PEN outperforms NonPEN in the setting (M1), and the magnitude of improvement is higher in evaluating outofsample cases than that of insample ones. These findings are in alignment with what we observe in Table 1 of the draft.
Table 6 provides additional results for (S1). The results between PEN and NonPEN are competitive.
9.2 Additional results for numerical experiment
Table 7 displays additional binary classification results for the simulated data with different magnitude of noises. We notice similar phenomena as in Table 3 of the draft; the classification results for PEN and NonPEN are very close. We observe high MCC values for smaller noise levels for ; where MCC is viewed as a correlation coefficient between observed and predicted binary responses.
Comments
There are no comments yet.