1 Introduction
Cell therapy is one of the most promising new treatment approaches over the last decades, demonstrating great potential in treating cancers, including leukemia and lymphoma (Kim and De Vellis, 2009; Yin, 2017). Among those therapies, chimeric antigen receptor (CAR) T cell therapy (Bonifant et al., 2016; June et al., 2018), involving the reprogramming of a patient’s T cells to effectively target and attack tumor cells, has shown innovative therapeutic effects in clinical trials, leading to a recent approval (i.e., the treatment of CD19+ hematological malignancies, see Prasad, 2018) by FDA as a new cancer treatment modality. As illustrated in Figure 1, a typical CAR T cell therapy involves four steps – deriving cells from a patient, genetically modifying the cells, culturing the cells, and readministering back to the patient. With increasingly mature gene modification technology, more and more researchers focus on the culturing step (i.e., the red box in Figure 1), where the goal is to substantially increase the cell amount from a small batch to one dose for delivery to the patient. However, a key challenge is the intrinsic patienttopatient variability in the starting material, i.e., cells derived from different patients vary in their viabilities, acceptance rates of genetic modification, and reactions to culture media (Hinrichs and Restifo, 2013). These variabilities introduce difficulties in cell culturing scaleup (i.e., cell manufacturing), and therefore, the current CAR T cell therapy is hindered by low scalability, laborintensive processes, and extremely high cost (Harrison et al., 2019). To achieve high quality and acceptable veintovein cost, we present in this work a statistical framework for online monitoring in cell manufacturing, which can alleviate the negative effect of the intrinsic patienttopatient variability.
There are two reasons why a new statistical method is needed for monitoring critical quality attributes (exampled by the cell concentration) in cell manufacturing. Firstly, a direct measurement method for cell concentrations is not suitable in cell manufacturing (Slouka et al., 2016). Such a method typically requires experienced technicians to collect culture media, take microscopic images, and perform computation via an imagebased software (e.g., ImageJ, see Collins, 2007). Therefore, it is laborintensive, timeconsuming, and may introduce contamination to the culture media. Furthermore, the direct measurement method is oftentimes destructive – the collected cells would be killed for taking microscopic images. Secondly, while there are nondestructive sensors available, these sensors need to be calibrated due to the unknown parameters in the sensing relationship (Pan et al., 2019). For example, impedance sensors (adopted in this work, see Figure 2
), which measure the dielectric relaxation of cell suspension, can be used to effectively estimate cell concentrations
after the calibration of unknown electrical attributes, e.g., permittivity (Gheorghiu and Asami, 1998) and resistivity (Goh and Ram, 2010). However, due to the patienttopatient variability, those electrical attributes not only are unknown but also vary among different patients, leading to difficulties in recovering cell concentrations from sensor readings.We introduce in this work a calibrationfree statistical method for online monitoring in cell manufacturing. Specifically, the intrinsic patienttopatient variability is modeled by a patientspecific calibration parameter. We propose to use multiple sensor readings to construct a patientinvariance statistic, where a transformation is adopted to isolate and alleviate the effect of the calibration parameter. The constructed invariance statistic is then used to model the critical quality attribute of interest. In the training stage, we use the historical data to estimate a transformation and model parameters via a carefully formulated optimization problem, rather than estimate the calibration parameter as in the standard calibration problem (Kennedy and O’Hagan, 2001; Tuo and Wu, 2015). In the monitoring stage, we use the online sensor readings to recover the underlying critical quality attribute through the patientinvariance statistic, free from the calibration parameter. We demonstrate improvements of the proposed calibrationfree method in both simulation experiments and a realworld case study of monitoring viable cell concentrations in cell manufacturing. The proposed approach provides an effective way to monitor cell manufacturing, and therefore, reduces the cost for the promising CAR T cell therapy in treating cancers.
The remaining part of the article is organized as follows. In Section 2, we formulate the biosensing problem in cell manufacturing, with an emphasis on its challenging aspects. In Section 3, we present the proposed calibrationfree method. A detailed simulation study and a realworld cell manufacturing case study are conducted in Sections 4 and 5, respectively. We conclude this work with future directions in Section 6.
2 Biosensing in cell manufacturing
We first describe the biosensing problem of recovering the Viable Cell Concentration (VCC) in cell manufacturing. We then discuss the key challenge – the patienttopatient variability, and related works.
2.1 Impedancebased biosensing
As discussed in Section 1, the goal is to monitor VCC in cell manufacturing, thereby reducing the cost of the CAR T cell therapy. One stateoftheart approach is to use biosensors to measure impedance signals, as indicators for the VCC of interest (Gheorghiu and Asami, 1998; Pan et al., 2019). As illustrated in Figure 2, we adopt impedancebased biosensors with a facingelectrode (FE) design (Miura and Uno, 2019): Our FE biosensor consists of a pair of parallelplate electrodes and silicone at four corners to maintain a gap between them; it would be soaked in media to monitor floating cells in between the electrodes.
With the adopted biosensors, we need a biosensing method to recover VCCs from impedance readings. From physical knowledge, we know that the impedance reading between the two electrodes reflects the cell amount due to the capacitive property of viable cell membranes (Schwan, 1957). The sensing relationship is denoted by
(1) 
where is the impedance reading, is the underlying VCC of interest, denotes the sensor geometry, e.g., the gap width, and models the underlying electrical attributes, e.g., permittivity and resistivity. Here, the relationship is unknown due to the dynamic interaction between cells and biosensors, and extremely challenging to simulate via computer codes considering the micro size of cells. In the proposed calibrationfree method, we first learn the sensing relationship from the historical data of different patients (i.e., the “training” stage), and then conduct online inference of the VCC using the impedance readings for new patients (i.e., the “monitoring” stage).
2.2 Patienttopatient variability
The key challenge in biosensing is that the electrical attributes are unknown in both training and monitoring and different for each patient (also see an illustrating gravity application in Section 4.1). Note that it is impractical to compute for a patient from first principle since it represents the intrinsic properties from genetic material of the patient. One popular way is to model as a calibration parameter, and estimate it from the training dataset . Existing approaches include Bayesian implementation (Kennedy and O’Hagan, 2001), maximal likelihood estimation (Joseph and Melkote, 2009), and an interpretative optimization (Tuo and Wu, 2015). However, such methods are proposed specifically for data fusion of computer experiments and physical experiments, where is available in the former and a constant in the latter. Whereas in biosensing, there is no effective computer simulator, and electrical attributes vary among physical experimental runs for different patients. In the literature, this challenge is also related to the functional calibration problem (Plumlee et al., 2016; Brown and Atamturktur, 2018; Ezzat et al., 2018), where the calibration parameter is modeled as a function of the input variables. In the biosensing application, however, the calibration parameter varies among different patients yet is a constant over different input VCCs for each patient.
Methods  Online  Historical  

Inverse problem (Aster et al., 2018)  Known  Unknown  N.A. 
Supervised learning (Bishop, 2006)  Unknown  Same  Same 
Calibration (Kennedy and O’Hagan, 2001)  Unknown  Unknown  Known 
Calibrationfree (proposed)  Unknown  Unknown  Unknown 
The biosensing problem is also related to the inverse problem in the literature (Aster et al., 2018), where one would estimate both and via an optimization problem. However, such a method typically assumes that the sensing relationship is known or can be easily learned with a complete data , whereas, the calibration parameter is unknown even in the training stage in biosensing. Furthermore, one may regard as an additional model parameter in the unknown relationship , and adopt a standard supervised learning scheme (Bishop, 2006) for finding both and ; this implicitly assumes that is a constant for different patients, which is not true in biosensing. Table 1 summarizes the related methods discussed; with extensive efforts in literature search, we have not found a standard method, which can be directly adopted for the biosensing problem in cell manufacturing.
3 Calibrationfree biosensing method
We present the proposed calibrationfree method in four parts. First, we discuss the sensing relationship for multiple sensors. We then introduce an invariance statistic to alleviate patienttopatient variability. In the online monitoring stage, we use the invariance statistic to recover VCCs. In the training stage, we propose a carefully constructed optimization problem and an algorithmic framework to estimate the underlying sensing model.
3.1 Sensing relationship with multiple sensors
The key idea of the calibrationfree method is to use multiple sensors to address the unknown and patientspecific calibration parameter. For the th sensor, we let be its geometry parameter. Consider first the sensing relationship for a given patient (or experiment). Denote as the scalar impedance reading (more details in Section 5) from the th sensor at experimental time . Following (1), we model via the sensing relationship . Here, is the VCC at experimental time , and is the calibration parameter. It is important to note that measurements from different sensors can be modeled by a same calibration parameter , featuring the underlying values for electrical attributes of the specific patient’s cells. This patientspecific property is the key to “canceling out” the calibration parameter using multiple sensor readings (see Section 3.2).
We then introduce an additional superscript for different patients:
(2) 
Equation (2) further layouts our biosensing settings with multiple sensors: (i) We assume the homogeneity of VCC, i.e., at given time and a given patient , the VCC is the same for different senors at different locations (see Figure 2 (b)). This is because, in suspension cell manufacturing, the culture media is constantly stirred to ensure the homogeneity of nutrition, and thereby VCC (Haycock, 2011). (ii) We use the same set of sensors with known parameters for all patients and at different experimental time . Those parameters are known from the fabrication process or can be easily measured from the sensors. Note that the proposed method is also effective for different sets of sensors, as long as those sensor parameters are known – the same sensor assumption is only for fabrication convenience and notation simplicity. (iii) Besides for different sensors , the calibration parameter is the same among different measurement time . This is because models the intrinsic property of the th patient’s cells, which typically does not change during cell manufacturing. After we clearly layout the above settings, we can then construct the patientinvariance statistic and recover the underlying VCC.
3.2 Invariance statistic
For notation simplicity, we drop the experimental time and write in the subscript in this subsection. Furthermore, we rewrite (2) by decomposing into two parts:
(3) 
Here, for a given sensor , models the part of effect of VCC on impedance reading , without hampered by the patientspecific calibration parameter ; and is the remaining effect of both and on . Intuitively speaking, can be viewed as the mean process of by plugging in some population average of , ignoring the patienttopatient variability; it can also be interpreted as the physical understanding of the sensing relationship. In practice, the mean relationship is oftentimes known, at least to a certain degree, prior to experimentation according to the domainspecific knowledge (e.g., the known set of basis functions, see Section 5). On the other hand, is the variability term, i.e., how patienttopatient variability affects the impedance reading. Such a term leads to different readings, even when the VCC is the same. Note that in one of the considered baseline methods (see Sections 4 and 5), we ignore , i.e., assuming the calibration parameter is a constant; this will lead to noticeable errors when estimating . This variability term is typically unknown. Such a decomposition of a mean trend and a variability term is widely assumed in different modeling methods (see, e.g., Guillas et al., 2018; Chen et al., 2019).
Assume for now , which suggests that the mean relationship extracts all the dependency of on (further discussion in Section 3.4). In other words, is assumed to be separable for each sensor :
(4) 
Now, we construct a statistic which is invariant to the calibration parameter . To gain intuition, consider the following illustrating example with known for (see the illustration application in Section 4.1). If we take a logtransformation to the variability term , the effect of the calibration parameter is further separated from . Therefore, by subtracting the (logtransformed) variability at different sensors, one can obtain an invariance statistic . Note that we incorporate the patientspecific property of the calibration parameter when constructing the invariance statistic.
Following the above intuition yet with the unknown variability term , we construct the following statistic, via a linear combination of the transformed :
(5) 
for patient with . Here, are predefined combination coefficients, and . With a properly selected transformation , (5) gives the target invariance statistic . Note that here we adapt a general transformation , instead of the specific logtransformation in the above example. The transformation would be selected so that the dependency of the invariance statistic to is minimal; a detail estimation method for will be discussed in Section 3.4.
It is important to note that we reconstruct the sensing model from (2) to (5) via the proposed invariance statistic. This is again due to the key challenge of patienttopatient variability. Consider first using (2) for VCC recovery (also see the discussion in Section 2.2). Due to the unknown and patientspecific calibration parameter , it is challenging to either learn a sensing model from training data or recover VCCs for a new patient. However, the new model (5) contains only the invariance statistic, and is free from the calibration parameter. Thanks to the properly selected transformation and the combination (see Section 3.4), the invariance statistic would be approximately a constant for different patients. Therefore, our new model (5) allows an effective estimation of the sensing relationship (only the mean part needed) similar to the standard calibration problem with a constant calibration parameter (Tuo and Wu, 2015), and then a calibrationfree recovery of the VCC of interest.
3.3 Online calibrationfree recovery
We present next the method for recovering the VCC of interest , in the online monitoring stage for a new patient denoted by . At any time , the sensor reading is denoted as along with the unknown calibration parameter . Assume for now the mean sensing relationship and the transformation are known (see Section 3.4 for the estimation). We adopt the new sensing model (5) with the invariance statistic
(6) 
where is the target VCC. Note that the computed in online monitoring is also invariant to the calibration parameter . Therefore, the VCC of interest can be recovered by minimizing the squared difference between the computed value and the underlying value (see Section 3.4 for the estimation):
(7) 
In the cell manufacturing application, we are interested in recovering a VCC curve over the whole manufacturing period . To this end, we perform optimization (7) for times corresponding to each experimental time . Note that here we have not incorporated the timedependency (or smoothness) of the recovered function in online recovering; one can use postprocessing methods or directly model via a parametric form in the optimization (7). Readers are referred to functional data analysis literature (Ramsay, 2004; Wang et al., 2016) for more discussion.
3.4 Parameter estimation
We estimate the unknown transformation , and the physical relationship using the training data at hand. In our implementation, the transformation is paramterized by the BoxCox transformation (Box and Cox, 1964; Sakia, 1992)
(8) 
Note that the logtransformation in the above example is a special case in the BoxCox transformation. Here, the BoxCox transformation contains an unknown parameter . A twoparameter Box–Cox or Yeo–Johnson transformation (Yeo and Johnson, 2000) can also be used if the data are not restricted to be nonnegative. Furthermore, in this article, we will focus on the parametric transformation (8), but our proposed method are general and can be applied to nonparametric cases.
As for the physical relationship , we adopt the following basis decomposition:
(9) 
Here, are the predefined basis functions and
denotes the vector of corresponding coefficients. Such a set of basis
is selected by the physical knowledge of the cell manufacturing system or the observation from data.Meanwhile, to account for the separable assumption in Section 3.2, we introduce slack variables to account for the “goodness” of the assumption (more discussion below). Furthermore, the underlying value for the best invariance statistic is also unknown and need to be estimated from data (see Section 3.3).
We propose to estimate the unknown parameters via the following optimization problem with two penalization terms:
(10) 
Here, and are two penalization coefficients, and denotes the vector norm.
The main objective term (i.e., the first term) in (10) is for achieving the best patientinvariance property of the constructed statistic . Specifically, we minimize the meansquared error (MSE) of its underlying truth and the computed value from data. This is equivalent to modeling the patientinvariance statistic
for each patient as i.i.d. random draws from a normal distribution
with meanand variance
. Moreover, the first penalization term in (10) is for the separable assumption . Here, we minimize the corresponding MSE of the set to the underlying truth , for each sensor and patient . Similarly, this can also be viewed as modelingvia i.i.d. normal random variables; the corresponding penalization
can then be interpreted as the ratio between the variances of the two normal distributions. Finally, the second penalization term is for basis selection, similar to the widely used LASSO method in the literature (Tibshirani, 1996; Donoho, 2006). This is because, in the cell manufacturing application, one would only have an intuitive understanding of the sensing relationship; we will collect a set of basis functions from experience and select the suitable ones via this penalization.From the duality of the optimization problem, (10) can also be viewed as unpenalized loglikelihood combining both normal random variables with a sparsity constraint (Boyd and Vandenberghe, 2004). The parameter sets the variance ratio between the two random variables, and controls the desired sparsity level as parameter . Since the objective is to obtain a high recovery accuracy of the VCC of interest, and would be tuned using crossvalidation techniques (Friedman et al., 2001).
Consider now estimating the parameters via optimization (10) for fixed and . We propose to use the following Blockwise Coordinate Descent (BCD) optimization algorithm, described below. First, assign initial values for . Next, iterate the following three steps until the convergence is achieved: (i) for fixed and , update via the following estimates from firstorder conditions
(11) 
(ii) for fixed and , update the transformation parameter ignoring the two penalization terms; and (iii) for fixed , and , optimize for via numerical line search methods, e.g., LBFGS algorithm (Liu and Nocedal, 1989). The full optimization procedure is provided in Algorithm 1. Since (10) is a nonconvex optimization problem, the proposed BCD algorithm only converges to a stationary solution (Tseng, 2001). Because of this, we suggest performing multiple runs of Algorithm 1 with random initializations for each run, then taking the converged estimates for the run with smallest objective function. These runs should be performed in parallel if possible, to take advantage of the parallel computing capabilities in many computing systems.
It is important to note the difference between the training stage in our calibrationfree method and the calibration stage in the standard calibration problem (Kennedy and O’Hagan, 2001). In calibration methods, the calibration parameter , assumed to be a constant, is directly estimated from the training set. This can be viewed as estimating a population average of the historical , which would not be helpful in our cell manufacturing application. Due to the patienttopatient variability, the calibration parameter corresponding to the new patient can be completely different from the historical average value. In contrast, our calibrationfree method adopts a patientinvariance statistic , constructed from multiple sensor readings, to alleviate this patienttopatient variability. In our training setup, we learn the unknown mean relationship and the transformation , which provide the best patientinvariance statistic . We can then use the invariance statistic to effectively recover VCCs via (7), free from the patientspecific calibration parameter .
4 Simulation study
A detailed simulation study is conducted in this section. We first look into a toy application of recovering gravitational acceleration coefficients, to show the applicability of the proposed method. We then discuss more simulation experiments with different sensing relationships.
4.1 A gravity application
Consider the following toy application, where the goal is to recover the gravitational acceleration coefficient , for a different planet. As illustrated in Figure 3, we drop a ball and measure the traveling distance of the ball after a certain period of time . From the physical knowledge, we have the relationship
(12) 
Here, is the traveling distance measured by, e.g., taking a photo, is the initial velocity of the ball, and is the time period between dropping the ball and taking the photo. Suppose the ball is dropped by an engineer, meaning that the initial velocity is nonzero and changes among different drops. With the collected data , typically, one cannot recover the gravitational acceleration even with the known relationship (12). This is because the initial velocity is also unknown. The key idea of the proposed calibrationfree method is to take multiple photos at different time . Therefore, more data is collected with the same initial velocity . We can then use the proposed invariance statistic and Algorithm 1 to “cancel out” and conduct inference on the gravitational acceleration of interest.
The setup for recovering the gravity coefficient is as follows. We set the number of photos , with parameters (i.e., the time of taking photos) . A historical dataset of size is generated with calibration parameters (i.e., initial speed, unknown) , gravity coefficients , and each sensor reading (i.e., traveling distance) simulated by (12) with an additional i.i.d. measurement error following . To test the recovery accuracy, we let the underlying truth , randomly generated, and obtained by (12) with the same measurement error.
The proposed calibrationfree method (via Algorithm 1) is applied to find the best transformation and then recover the gravitational acceleration . The linear combination coefficients are , and the set of candidate basis functions is . The proposed calibrationfree method is repeated, with newly generated test data , for 20 times.
We consider the following two baseline methods (also see Table 1). First, we implement the supervised learning setting (Bishop, 2006), i.e., assuming the calibration parameter is the same in both training stage and monitoring stage. Such an assumption is not true in the considered cell manufacturing application. Specifically, we use the historical dataset to estimate an and a relationship , which would then be used to recover . Here, we use the same set of basis functions for , and a similar optimization scheme with LASSO type penalization (Tibshirani, 1996) for estimating the coefficients . This method is referred to as “SameCal”. The other baseline method is the standard calibration method suggested by Tuo and Wu (2015). In order to adopt this method, we need to assume that the calibration parameters in the historical data are known, which is not true in reality. Therefore, we refer this method as “Oracle”. To estimate the sensing relationship , we adopt the set of basis functions and a similar optimization scheme with LASSO type penalization. Both baseline methods are implemented to recover of interest via minimizing the squared difference similar to (7), using the same 20 simulated test data.
Figure 4 (a) shows the boxplots of the estimated using the proposed calibrationfree method and the two baseline methods. The red line indicates the ground truth value . Among the three methods, the Oracle baseline preforms the best since it queries additional information of the calibration parameter in the training stage, which is again not feasible in reality. We notice that the proposed calibrationfree method performs almost as good as Oracle. It can accurately recover the true value, with the mean over the 20 estimates and a relatively small variance, whereas for the SameCal baseline, the mean for 20 estimates is and a noticeable bigger variance is observed.
We also conduct a pairwise comparison over the 20 test repetitions. Figure 4 (b) shows the boxplots of absolute error ratios of the proposed method over the two baseline methods, with the red line indicating the ratio of 1.0. We notice that the proposed method is only slightly worse than Oracle; this is impressive since our method do not query the underlying calibration parameter in the training stage. Furthermore, the proposed calibrationfree method is noticeably better in recovering the true compared to SameCal. More specifically, our method outperforms SameCal with smaller errors in 17 estimates over 20 test runs in total. This is not surprising since the calibrationfree method, utilizing the patientinvariance statistic, can address the patientspecific calibration parameters .
4.2 More experiments
Here, we conduct more experiments on the proposed calibrationfree method. Specifically, we consider the following four underlying sensing relationships :

;

;

;

.
Note that function is quite similar to the sensing relationship (12) in the gravity application in Section 4.1. For functions and , we notice the existence of interaction terms between and , which means the separable assumption in Section 3.2 does not hold. However, and can still be approximately represented by the adopted set of basis functions . For function , it is quite complex, and cannot be represented by . We test all four functions, using the proposed calibrationfree method and the same two baseline methods – SameCal and Oracle – introduced in Section 4.1. The detailed test procedure is the same as that in Section 4.1.
Figure 5 shows the boxplots of the absolute error ratios of the proposed method over the baseline SameCal method (a) and the baseline Oracle method (b), under all four underlying sensing functions . We notice the error ratios in Figure 5 (a) are mostly smaller than , indicting that the proposed method can achieve smaller errors compared to SameCal. This is because the assumption of constant calibration parameter in SameCal does not hold in cell manufacturing application (and thereby in this simulation study), whereas, our calibrationfree method can address the patientspecific calibration parameter via a proper combination of multiple sensor readings. Moreover, compared to Oracle, the proposed method is only slightly worse. This shows the good performance of our calibrationfree method: Though we do not know the values of the calibration parameter, we can recover the underlying parameter of interest similar to the Oracle baseline, where whose values are assumed accessible. Finally, we notice that for the sensing relationship , while the proposed method adopts an inappropriate basis decomposition , it still outperforms SameCal. This is again because the proposed calibrationfree method introduces the invariance statistic to alleviate the effect of patienttopatient variability, and therefore, shows improved performances in recovering the quantity of interest.
5 Cell manufacturing case study
In this section, we apply the proposed calibrationfree method to the motivating case study of cell manufacturing. As discussed in Sections 1 and 2, we are interested in recovering and monitoring the Viable Cell Concentration (VCC) at different experimental time . This is because the goal of cell manufacturing is to culture a small batch of cells to a significant amount, for an effective readministering in the CAR T cell therapy (see Figure 1).
5.1 Experimental setup
In our experiment, human leukemic T cells (Jurkat E61; American Type Culture Collection, ATCC^{®}) are cultured in ATCCformulated culture medium (RPMI1640; GE Healthcare) with 10% fetal bovine serum, 2 mM Lglutamine, 10 mM 4(2hydroxyethyl)1piperazineethanesulfonic acid (HEPES), 1 mM sodium pyruvate, 4500 mg/L glucose, and 1500 mg/L sodium bicarbonate in a 75 cm^{2} petri dish (Nunc™ EasYFlask™; ThermoFisher Scientific™). The cells are cultured in a humidified incubator controlled at 37°C and 5% CO^{2}, and the culture media is preheated to avoid the temperature effect on the impedance measurement (Dlugolecki et al., 2010).
The impedance measurements are obtained by our electric cellsubstrate impedance sensing. Figure 6 illustrates the experimental setting for the impedance measurement. Here, we use sandwich shape 3D impedance sensors, consisting a pair of parallelplate electrodes and PDMS (Sylgard 184, Tow Corning) to maintain a gap between two electrodes (see Figure 2 (b)). In our experimental setup, the geometry parameter
of the sensors is the edge length of the electrode pads, and
. Impedance measurements are conducted by an LCR meter (E4980AL; Keysight Technologies) with a sinusoidal signal of 22 mVrms under multiple frequencies ranging from 500 Hz to 100 kHz. We let impedance measurement be the relaxation strength computed from the raw impedance readings over frequencies, i.e., the difference between permittivity values of highfrequency end and lowfrequency end of the dielectric relaxation process, for its high dependency to the VCC of interest (Schwan, 1957). The measurement is taken every minutes for around 3035 hours. This is because typically after 35 hours, we have to change the culture media and expand the cells to a bigger cell culture flask, which would inevitably interrupt the online monitoring. This results in an online monitoring dataset with the underlying VCCs to be recovered.The ground truth VCCs are obtained by an automated cell counter (TC20™; BioRad Laboratories, Inc.), and the concentration is maintained between 1 × 10^{5} and 1 × 10^{6} cells/mL. Multiple repetitions are performed, with the averaged value reported as the underlying VCC . Note that the measurement procedure is not only laborintensive but may also introduce contamination to the culture media (see Section 1). We will only measure VCC around six times per cell culture experiment, which leads to a much smaller () yet full training dataset .
We conduct the cell culturing for experiments, each with different starting materials, i.e., different calibration parameters . We use the same sensors for the experiments. In experiment , we measure and compute the relaxation strength of impedance for each sensor , at different time (every 15 minutes, in total). Meanwhile, we measure the ground truth VCC with a much lower resolution ( in total).
Exp 1  Exp 2  Exp 3  Exp 4  Exp 5  Mean  

Proposed  0.092  0.270  0.080  0.379  0.590  0.282 
SameCal  0.500  0.174  0.328  0.742  0.760  0.501 
5.2 Cross validation of viable cell concentration
For the collected training dataset , we first preform a crossvalidation test (Friedman et al., 2001) on the recovered VCCs. More specifically, we apply the proposed calibrationfree method (via Algorithm 1) to learn the sensing relationship using four out of five experiments, and then recover VCC for the remaining experiment via (7). We let the linear combination coefficients and select the same set of basis functions as in Section 4.1. Furthermore, a logtransformation on VCCs is performed prior to the analysis. We consider SameCal (see Section 4) as the baseline method. Such a method introduces an additional assumption that the calibration parameter is a constant. Note that the Oracle baseline cannot be adopted here since the actual values of the calibration parameter are always unknown, which is the key motivation of the proposed calibrationfree method (also see Section 2.2 and Table 1).
Table 2 shows the absolute errors of the recovered log VCCs when the ground truth VCCs are measured in each experiment. We observe that the proposed method outperforms the baseline SameCal method four experiments out of five. Furthermore, the mean error of the five experiments by the proposed method is , which is almost two times smaller than that of by SameCal. This is due to the fact that the calibration parameter, which models the patienttopatient variability, is not a constant in cell manufacturing (Hinrichs and Restifo, 2013); the proposed calibrationfree method properly addresses this variability via the construction of the patientinvariance statistic.
5.3 Online recovery of viable cell concentration
We then perform VCC recovery on the online monitoring set . Here, the sensing relationship is estimated using all five experiments in the training set . Figure 7 shows the two recovered log VCC curves over the whole culture time , via the proposed calibrationfree method (in red line) and the baseline SameCal method (in green dash line). The ground truth (log) VCC measurements in are also plotted in black dots. We see that the proposed method recovers a meaningful estimation of VCC. The recovered increases approximately linear over the culture time , indicating growths exponentially in time; this matches the preliminary understanding in the cell culture literature (Haycock, 2011). Furthermore, the recovered approximately passes through the ground truth measurements. However, due to the huge patienttopatient variability, the baseline SameCal method struggles in either passing through the ground truth experiments or providing reasonable estimates of VCC curves. Our calibrationfree method, adopting the patientinvariance statistic, appears to alleviate such variability well.
The proposed calibrationfree method can also provide important biological insights for cell growth in cell manufacturing. From Figure 7 (b), we notice a decrease in the VCC curve at around hour 32. This may be due to the lack of nutrition in the media since the culture media typically needs to be changed after 30 hours. Furthermore, we observe from Figure 7 that the VCC curves decrease slightly in the first two hours in cell manufacturing. This may be because of the lack of viability of the cells at the beginning of the culture process – though we have already thaw cells and stood them still for several minutes, it seems that a certain portion of cells still do not gain full viability and die soon. As a result, we suggest standing the cell still longer for future experiments. Last but not least, we notice a small VCC decrease when conducting the ground truth VCC measurements. One reason for this is that the measurement itself is not inline and needs to contact the culture media; it may introduce contamination, and therefore, kill a small portion of cells (Haycock, 2011). In contrast, the proposed calibrationfree biosensing method, together with impedancebased biosensors, provides an inline, nondestructive, and noncontact way for VCC monitoring in cell manufacturing.
6 Conclusion
In this work, we propose a new calibrationfree method for monitoring viable cell concentration in cell manufacturing, which is a critical component in the promising CAR T cell therapy. The key challenge here is the patienttopatient variability in the initial culturing material, leading to poor performances in recovering viable cell concentrations via existing methods. We propose to use multiple impedancebased biosensors with different geometries and an associated calibrationfree statistical framework for online recovery of viable cell concentrations. Specifically, we model the patienttopatient variability via a patientspecific calibration parameter. We then construct a patientinvariance statistic, which uses a transformation and a linear combination of sensor readings to alleviate the effect of the calibration parameter. In the training stage, we learn the best transformation and the sensing relationship via a carefully formulated optimization problem. In the online monitoring stage, viable cell concentrations can be recovered via the invariance statistic, free from the patientspecific calibration parameter. We then apply the proposed calibrationfree method in different simulation experiments and a realworld case study of cell manufacturing, where the proposed method demonstrates substantial improvements against the existing methods. Therefore, we believe the proposed calibrationfree method can play an essential role in cell manufacturing and reduce the cost of the promising CAR T cell therapy.
Looking ahead, there are several interesting directions for future exploration. To begin with, a more thorough analysis of impedancebased sensors can be conducted, with a detailed comparison of sensitivity using different experimental settings such as sensor geometries and electrode materials. Moreover, we adopt in this work a parametric sensing relationship and a heuristic approach for parameter estimation. This is mainly due to the already improved performance compared to the baseline methods. A more flexible, and nonparametric Gaussian process regression method
(Santner et al., 2013; Lin and Joseph, 2019)with a rigorous likelihoodbased parameter estimation scheme may lead to further improvements in recovering viable cell concentrations, as well as other critical quality attributes. Finally, micro cameras can also be used in cell manufacturing. Therefore, we are also interested in monitoring cell manufacturing based on cell morphology. In this case, physicsinformed deep learning frameworks in the literature
(Raissi et al., 2017; Chen et al., 2020) appear to be suitable for recovering critical quality attributes in cell manufacturing.References
 Aster et al. (2018) Aster, R. C., Borchers, B., and Thurber, C. H. (2018). Parameter Estimation and Inverse Problems. Elsevier.
 Bishop (2006) Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
 Bonifant et al. (2016) Bonifant, C. L., Jackson, H. J., Brentjens, R. J., and Curran, K. J. (2016). Toxicity and management in car tcell therapy. Molecular TherapyOncolytics, 3:16011.
 Box and Cox (1964) Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, Series B, 26:211–252.
 Boyd and Vandenberghe (2004) Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.
 Brown and Atamturktur (2018) Brown, D. A. and Atamturktur, S. (2018). Nonparametric functional calibration of computer models. Statistica Sinica, pages 721–742.
 Chen et al. (2019) Chen, J., Mak, S., Joseph, V. R., and Zhang, C. (2019). Functiononfunction kriging, with applications to 3d printing of aortic tissues. arXiv preprint arXiv:1910.01754.
 Chen et al. (2020) Chen, J., Xie, Y., Wang, K., Zhang, C., Vannan, M. A., Wang, B., and Qian, Z. (2020). Active image synthesis for efficient labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence.
 Collins (2007) Collins, T. J. (2007). ImageJ for microscopy. Biotechniques, 43(S1):S25–S30.
 Dlugolecki et al. (2010) Dlugolecki, P., Ogonowski, P., Metz, S. J., Saakes, M., Nijmeijer, K., and Wessling, M. (2010). On the resistances of membrane, diffusion boundary layer and double layer in ion exchange membrane transport. Journal of Membrane Science, 349(12):369–379.
 Donoho (2006) Donoho, D. L. (2006). For most large underdetermined systems of linear equations the minimal norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 59(6):797–829.
 Ezzat et al. (2018) Ezzat, A. A., Pourhabib, A., and Ding, Y. (2018). Sequential design for functional calibration of computer models. Technometrics, 60(3):286–296.
 Friedman et al. (2001) Friedman, J., Hastie, T., and Tibshirani, R. (2001). The Elements of Statistical Learning, volume 1. Springer Series in Statistics, NY.
 Gheorghiu and Asami (1998) Gheorghiu, E. and Asami, K. (1998). Monitoring cell cycle by impedance spectroscopy: experimental and theoretical aspects. Bioelectrochemistry and Bioenergetics, 45(2):139–143.
 Goh and Ram (2010) Goh, S. and Ram, R. (2010). Impedance spectroscopy for in situ biomass measurements in microbioreactor. In Proceedings of the 14th International Conference on Miniaturized Systems for Chemistry and Life Science, Groningen, The Netherlands, pages 3–7.
 Guillas et al. (2018) Guillas, S., Sarri, A., Day, S. J., Liu, X., and Dias, F. (2018). Functional emulation of high resolution tsunami modelling over cascadia. The Annals of Applied Statistics, 12(4):2023–2053.
 Harrison et al. (2019) Harrison, R. P., Zylberberg, E., Ellison, S., and Levine, B. L. (2019). Chimeric antigen receptor–T cell therapy manufacturing: modelling the effect of offshore production on aggregate cost of goods. Cytotherapy, 21(2):224–233.
 Haycock (2011) Haycock, J. W. (2011). 3D cell culture: a review of current approaches and techniques. In 3D Cell Culture, pages 1–15. Springer.
 Hinrichs and Restifo (2013) Hinrichs, C. S. and Restifo, N. P. (2013). Reassessing target antigens for adoptive Tcell therapy. Nature Biotechnology, 31(11):999.
 Joseph and Melkote (2009) Joseph, V. R. and Melkote, S. N. (2009). Statistical adjustments to engineering models. Journal of Quality Technology, 41(4):362–375.
 June et al. (2018) June, C. H., O’Connor, R. S., Kawalekar, O. U., Ghassemi, S., and Milone, M. C. (2018). CAR T cell immunotherapy for human cancer. Science, 359(6382):1361–1365.
 Kennedy and O’Hagan (2001) Kennedy, M. C. and O’Hagan, A. (2001). Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B, 63(3):425–464.
 Kim and De Vellis (2009) Kim, S. U. and De Vellis, J. (2009). Stem cellbased cell therapy in neurological diseases: a review. Journal of Neuroscience Research, 87(10):2183–2200.
 Lin and Joseph (2019) Lin, L.H. and Joseph, V. R. (2019). Transformation and additivity in gaussian processes. Technometrics, pages 1–11.
 Liu and Nocedal (1989) Liu, D. C. and Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(13):503–528.
 Miura and Uno (2019) Miura, T. and Uno, S. (2019). Computer simulation for electrochemical impedance of a living cell adhered on the interdigitated electrode sensors. Japanese Journal of Applied Physics, 58(SB):SBBG15.
 Pan et al. (2019) Pan, Y., Hu, N., Wei, X., Gong, L., Zhang, B., Wan, H., and Wang, P. (2019). 3D cellbased biosensor for cell viability and drug assessment by 3D electric cell/matrigelsubstrate impedance sensing. Biosensors and Bioelectronics, 130:344–351.
 Plumlee et al. (2016) Plumlee, M., Joseph, V. R., and Yang, H. (2016). Calibrating functional parameters in the ion channel models of cardiac cells. Journal of the American Statistical Association, 111(514):500–509.
 Prasad (2018) Prasad, V. (2018). Tisagenlecleucel – the first approved CARTcell therapy: implications for payers and policy makers. Nature Reviews Clinical Oncology, 15(1):11–12.
 Raissi et al. (2017) Raissi, M., Perdikaris, P., and Karniadakis, G. E. (2017). Physics informed deep learning (part i): Datadriven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561.
 Ramsay (2004) Ramsay, J. O. (2004). Functional data analysis. Encyclopedia of Statistical Sciences, 4.
 Sakia (1992) Sakia, R. M. (1992). The BoxCox transformation technique: a review. Journal of the Royal Statistical Society: Series D (The Statistician), 41(2):169–178.
 Santner et al. (2013) Santner, T. J., Williams, B. J., and Notz, W. I. (2013). The Design and Analysis of Computer Experiments. Springer Science & Business Media.
 Schwan (1957) Schwan, H. P. (1957). Electrical properties of tissue and cell suspensions. In Advances in Biological and Medical Physics, volume 5, pages 147–209. Elsevier.
 Slouka et al. (2016) Slouka, C., Wurm, D. J., Brunauer, G., WelzlWachter, A., Spadiut, O., Fleig, J., and Herwig, C. (2016). A novel application for low frequency electrochemical impedance spectroscopy as an online process monitoring tool for viable cell concentrations. Sensors, 16(11):1900.
 Tibshirani (1996) Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288.
 Tseng (2001) Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable minimization. Journal of Optimization Theory and Applications, 109(3):475–494.
 Tuo and Wu (2015) Tuo, R. and Wu, C. F. J. (2015). Efficient calibration for imperfect computer models. The Annals of Statistics, 43(6):2331–2352.
 Wang et al. (2016) Wang, J.L., Chiou, J.M., and Müller, H.G. (2016). Functional data analysis. Annual Review of Statistics and its Application, 3:257–295.
 Yeo and Johnson (2000) Yeo, I.K. and Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87(4):954–959.
 Yin (2017) Yin, P. (2017). Keys to scaleup CAR Tcell therapy manufacturing. BioProcess Online.