 # Contribution of Interval Linear Algebra to the Ongoing Discussions on Multiple Breath Washout Test

This paper introduces the interval least squares approach to fitting data with interval uncertainties and discusses its solution from the perspective of interval linear algebra. Careful use of interval linear algebra makes it possible to significantly speed up the computation in specialized cases. The interval least squares approach is then applied to a lung function testing method, the multiple breath washout test (MBW), where it is used for algebraic handling of uncertainties arising during the measurement. Surprisingly, it sheds new light on various aspects of this procedure: it shows that the precision of currently used sensors does not allow verified prediction. Moreover, it shows that the curve most commonly used to model the nitrogen washout process from the lung is wrong. Such insight contributes to the ongoing discussions on the possibility of predicting clinically relevant indices (e.g., LCI).


## 1 Introduction 1 – Multiple Breath Washout test (MBW)

The first works concerning the multiple breath washout test date back to the ’40s and ’50s. In those days the method faced crucial limitations which prevented its use in clinical practice. The precision of sensors was not sufficient to measure low gas concentrations accurately, and the computational power of digital computers was insufficient to handle problems described by many parameters (much of the mathematical work was still done manually). With the increasing precision of sensors and power of computers, MBW was reborn in the ’90s.

MBW is a very promising lung function test since it does not require any specific breathing maneuvers. The only requirement is regular tidal breathing with no leaks. This makes it applicable to a wide age range of patients, including infants, who undergo the test in sleep (either artificial or natural).

In contrast to the conventional methods (e.g., spirometry, body plethysmography), MBW is able to evaluate even the most peripheral airway disease. Its high sensitivity to the most peripheral airway changes has been shown in most chronic lung diseases (e.g., bronchial asthma, cystic fibrosis, primary ciliary dyskinesia).

The test consists of two phases: washin and washout. During the first phase the lung is filled with an inert gas (sulphur hexafluoride $SF_6$, helium $He$, or the resident inert gas nitrogen $N_2$); during the second phase the inert gas is washed out by air or by 100% oxygen (depending on the inert gas used). The concentration of the respective inert gas, the volume of exhaled gas and the flow are measured in real time. The measurement is stopped after the inert gas concentration within the lung reaches a certain level (usually 2.5% of the initial gas concentration). The pattern of the decrease in inert gas concentration gives information about the homogeneity of ventilation and thus about the patency of the most peripheral airways. The washout procedure can be seen in Figure 1.

In our work we focus on the nitrogen multiple breath washout test ($N_2$-MBW). Although $SF_6$ has been used for a much longer time, using nitrogen as the inert gas has many advantages, because $SF_6$ has several drawbacks:

• $SF_6$ is not used in medicine, so it must be specially prefabricated, whereas $N_2$ is naturally present in the surrounding air

• $SF_6$ is an exogenous inert gas, which needs to be washed in to the lungs

• $SF_6$ is not routinely available in medical settings, it is quite expensive and it has a severe greenhouse effect

In contrast, $N_2$ is naturally present in the surrounding air and in the lung (a so-called endogenous inert gas), so there is no need for a washin phase. Moreover, $N_2$ is also present in poorly ventilated areas of the lung, which makes the evaluation of ventilation inhomogeneity in severely affected patients more accurate. A small drawback is that nitrogen is not ideal because of its solubility in blood and its back-diffusion from tissues and blood to the lung during the washout phase.

The main output is depicted in Figure 2. It consists of two graphs: the actual flow (bottom curve) and the decreasing nitrogen concentration (top curve) measured in each time slice. These data are further used for computing clinically significant indices (FRC, LCI, Scond, Sacin, etc.). Some of them will be mentioned further in the text. The advantage of MBW is its high sensitivity to the early stages of various lung diseases, which enables early therapeutic intervention. There are studies that describe a typical evolution of the mentioned indices for a given disease.

Figure 2: Nitrogen concentration (top curve) and air flow (bottom curve) in time measured during the nitrogen washout process. At about the 50th second the washout phase begins (the first large drop of $N_2$ concentration).

## 2 LCI and FRC

Functional residual capacity (FRC, defined as the volume of air in the lung at the end of tidal exhalation) and lung clearance index (LCI, defined as the number of lung volume turnovers required to wash out the inert gas to 2.5% of its initial concentration) are currently the most important indices derived from MBW data. If we omit the correction for deadspace ventilation, the FRC is calculated as follows:

$$\mathrm{FRC} = \frac{V_{N_2}^{out}}{C_{N_2}^{start} - C_{N_2}^{end}},$$

where $V_{N_2}^{out}$ is the total volume of expired $N_2$ and $C_{N_2}^{start}$, $C_{N_2}^{end}$ are the concentrations of nitrogen at the start and end point of the FRC computation, respectively. The index LCI is calculated simply as

$$\mathrm{LCI} = \frac{V^{out}}{\mathrm{FRC}}, \tag{1}$$

where $V^{out}$ is the total volume of expired air. It is necessary to specify how we define the terminal breath (the end of measurement). It is defined as the first of three consecutive breaths with end tidal concentration of inert gas under a preset level; historically this level is 2.5%. The corresponding LCI index is then marked LCI2.5. The whole washout process down to 2.5% might be too time consuming, making it difficult for uncooperative patients to finish the MBW test properly.

The FRC relates to the size of the lung. The LCI states how many exchanges of an air volume equal to the FRC are necessary to clear the inert gas from the lung (more specifically, to reach 2.5% of the initial inert gas concentration). The LCI seems to be very useful for evaluating the homogeneity of lung ventilation (the most peripheral airways included).

Currently, there are ongoing discussions about the possibility of using the 5% level as the end of washout (LCI5). We would also like to contribute to this discussion with our results.
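The two formulas can be illustrated with a minimal sketch. The function names and the sample numbers below are hypothetical and chosen only for illustration; concentrations are given as fractions and volumes in litres, and the deadspace correction is omitted as in the text.

```python
def functional_residual_capacity(v_n2_out, c_n2_start, c_n2_end):
    """FRC = expired N2 volume / (start concentration - end concentration)."""
    drop = c_n2_start - c_n2_end
    if drop <= 0:
        raise ValueError("end concentration must be below start concentration")
    return v_n2_out / drop


def lung_clearance_index(v_out, frc):
    """LCI = total expired air volume / FRC, as in formula (1)."""
    return v_out / frc


# Hypothetical values, not taken from any measurement.
frc_value = functional_residual_capacity(v_n2_out=1.5, c_n2_start=0.78, c_n2_end=0.03)
lci_value = lung_clearance_index(v_out=14.0, frc=frc_value)
print(f"FRC = {frc_value:.3f} l, LCI = {lci_value:.3f}")
```

With these inputs the nitrogen concentration drops by 0.75, so the sketch reports an FRC of about 2 litres and an LCI of about 7 turnovers.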

## 3 Introduction 2 – Interval methods

The history of interval analysis is actually quite similar to that of MBW. It was developed in the ’50s, but it took some time (until the ’90s) before it started to be used in practice (mainly because of the insufficient computational power of machines).

The basic notion is an interval (denoted in boldface), which is for our purpose a real closed interval containing all real numbers within the lower and upper bounds,

$$\mathbf{x} = [\underline{x}, \overline{x}] = \{x \in \mathbb{R} : \underline{x} \le x \le \overline{x}\}.$$

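An interval works in a verified way. That means it is guaranteed to contain some desired value (e.g., a physical constant or a root of a polynomial), although it is not known where exactly within the interval the value lies. Arithmetic on intervals can be defined as follows.

Let $\mathbf{x} = [\underline{x}, \overline{x}]$ and $\mathbf{y} = [\underline{y}, \overline{y}]$ be two intervals; then

$$\begin{aligned}
\mathbf{x} + \mathbf{y} &= [\underline{x} + \underline{y},\ \overline{x} + \overline{y}],\\
\mathbf{x} - \mathbf{y} &= [\underline{x} - \overline{y},\ \overline{x} - \underline{y}],\\
\mathbf{x} * \mathbf{y} &= [\min(S),\ \max(S)], \quad \text{where } S = \{\underline{x}\,\underline{y},\ \underline{x}\,\overline{y},\ \overline{x}\,\underline{y},\ \overline{x}\,\overline{y}\},\\
\mathbf{x} / \mathbf{y} &= \mathbf{x} * (1/\mathbf{y}), \quad \text{where } 1/\mathbf{y} = [1/\overline{y},\ 1/\underline{y}],\ 0 \notin \mathbf{y}.
\end{aligned}$$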

Interval arithmetic can be (carefully) incorporated into computational problems (solving systems of equations, integration, data fitting) in order to preserve the verified nature of the data.
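The four operations can be sketched in a few lines. The class name `Interval` is an illustrative choice, and the sketch deliberately ignores outward rounding of the bounds, which a verified implementation (such as an interval package) must perform; it is therefore only a model of the formulas, not a verified arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # The lower bound uses the upper bound of the subtrahend and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        s = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(s), max(s))

    def reciprocal(self):
        if self.lo <= 0 <= self.hi:
            raise ZeroDivisionError("interval contains zero")
        return Interval(1 / self.hi, 1 / self.lo)

    def __truediv__(self, other):
        return self * other.reciprocal()


x = Interval(1.0, 2.0)
y = Interval(2.0, 4.0)
print(x + y, x - y, x * y, x / y)
```

Note that `x - x` evaluates to `Interval(-1.0, 1.0)` rather than zero; this dependency effect is one reason why interval operations have to be arranged carefully.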

Interval computation is used when dealing with measurement and rounding errors. In the MBW procedure there are many sources of such errors:

• Imprecision of sensors

• Changing viscosity and humidity of air

• Time shift of signals

• Physiological noise (heart pulse, hiccups, leaks)

• Irregular breathing pattern, apnea

• Computer and machine rounding errors

• etc.

The unknown distributions and interplay of the mentioned uncertain variables result in intervals with unknown distribution. Hence they only provide verified lower and upper bounds. The situation is depicted in Figure 3.

When it comes to estimating data, one usually thinks of some form of regression. Of course, the meaning of "regression on interval data" needs to be specified first. We provide the definition in Section 4. We will see that interval analysis provides an interesting tool for dealing with such uncertainties algebraically (by means of interval linear algebra).

Figure 3: An illustration of how intervals occur when having discrete sampling and a given measurement error (a segment of a nitrogen washout curve).

More about interval analysis and its use can be found in e.g., [13, 15, 17].

### 3.1 Nitrogen concentration in peaks

To obtain interval bounds for the nitrogen concentration in peaks (ends of breaths) we must first locate the ends of breaths. For that purpose we developed our own algorithm, which is able to outperform the existing state-of-the-art approaches and even commercial software (Spiroware). After the breath ends are located, the imprecision of the machine sensors must be incorporated. We used the Exhalyzer D machine by Ecomedics, Duernten, Switzerland, which does not measure nitrogen concentration directly. It computes the nitrogen concentration (in %) according to the formula

$$100 = N_2\% + O_2\% + CO_2\% + Ar\%,$$

where the concentrations of nitrogen, oxygen, carbon dioxide and argon in inspired and expired air are supposed to sum to 100%, and where the argon concentration is tied to the nitrogen concentration by a fixed ratio, $Ar\% = 0.0118 \cdot N_2\%$. Together this gives

$$N_2\% = \frac{1}{1.0118}\,(100 - O_2\% - CO_2\%),$$

where all quantities are in percent.

According to the manufacturer, the $O_2$ sensor has an accuracy of 0.3% and the $CO_2$ sensor has an accuracy of 5%. From that we can derive a verified interval $[\underline{n}_i, \overline{n}_i]$ bounding the nitrogen concentration in each time slice $i$ according to the formulas

$$\underline{n}_i = \frac{1}{1.0118}\,(100 - 1.003 \cdot O_2\% - 1.05 \cdot CO_2\%), \tag{2}$$

$$\overline{n}_i = \frac{1}{1.0118}\,(100 - 0.997 \cdot O_2\% - 0.95 \cdot CO_2\%). \tag{3}$$

The minimal possible concentrations of $O_2$ and $CO_2$ are subtracted from 100 to obtain the upper bound on the nitrogen concentration, and the maximal possible values are subtracted to obtain the lower bound.
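Formulas (2) and (3) can be sketched as a small function. The function and constant names are illustrative, and the sample readings are hypothetical; the sketch uses plain floating point without outward rounding.

```python
ARGON_FACTOR = 1.0118   # N2% * 1.0118 = N2% + Ar%
O2_ACCURACY = 0.003     # 0.3 % relative accuracy of the O2 sensor
CO2_ACCURACY = 0.05     # 5 % relative accuracy of the CO2 sensor


def nitrogen_bounds(o2_pct, co2_pct, o2_acc=O2_ACCURACY, co2_acc=CO2_ACCURACY):
    """Return (lower, upper) bounds on N2% following formulas (2) and (3)."""
    lower = (100.0 - (1 + o2_acc) * o2_pct - (1 + co2_acc) * co2_pct) / ARGON_FACTOR
    upper = (100.0 - (1 - o2_acc) * o2_pct - (1 - co2_acc) * co2_pct) / ARGON_FACTOR
    return lower, upper


# Hypothetical late-washout reading: 90 % O2 and 5 % CO2.
lo, hi = nitrogen_bounds(o2_pct=90.0, co2_pct=5.0)
print(f"N2 in [{lo:.3f}, {hi:.3f}] %, width {hi - lo:.3f}")
```

For this hypothetical reading the nominal value is roughly 4.9% while the interval is about one percentage point wide, which illustrates how large the relative uncertainty becomes near the end of the washout.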

## 4 Regression on interval data

Various authors have approached the topic of regression on interval data. Behind interval regression or interval estimation the following general definition can be seen.

###### Definition 1

A result of the multi-linear interval regression on (interval) data tuples

$$(\mathbf{x}_{i1}, \mathbf{x}_{i2}, \dots, \mathbf{x}_{in}, \mathbf{y}_i)$$

is generally

$$\mathbf{r}(x_1, x_2, \dots, x_n) = \mathbf{p}_1 x_1 + \mathbf{p}_2 x_2 + \dots + \mathbf{p}_n x_n,$$

where $\mathbf{p}_1, \dots, \mathbf{p}_n$ are interval parameters.

The resulting $\mathbf{r}$ can be viewed as a multi-dimensional band. A two-dimensional example can be seen in Figure 4.

Figure 4: An example of $\mathbf{r} = \mathbf{p}_1 x + \mathbf{p}_2$. The band actually forms an interval line, which passes through the interval boxes.

As explained above, there are various types of interval regression. They differ in how the interval parameters $\mathbf{p}$ are computed. For example, $\mathbf{p}$ could be computed so as to force the band to contain all the data tuples, or at least to cross all the interval data. For our purpose the interval least squares approach is the most meaningful.

###### Definition 2

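For given data, namely an interval matrix $\mathbf{X}$ whose $i$-th row is the tuple

$$(\mathbf{x}_{i1}, \mathbf{x}_{i2}, \dots, \mathbf{x}_{in}),$$

and an $m$-dimensional column vector $\mathbf{y}$ whose coefficients are $\mathbf{y}_1, \dots, \mathbf{y}_m$, the interval parameters $\mathbf{p}$ of the interval least squares estimation are defined in the following way,

$$\mathbf{p} = \square\,\{p : X^TXp = X^Ty \text{ for some } X \in \mathbf{X},\ y \in \mathbf{y}\},$$

where $\square$ is the tightest possible enclosure of a given set by an $n$-dimensional box (interval vector).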

### 4.1 Computation of interval least squares and our improvement

When we are given real data $X, y$, the least squares parameters can be obtained by solving the normal equations $X^TXp = X^Ty$. However, this is not the recommended approach, since the condition number of the matrix of the new system is squared. There are various ways to approach this problem (QR decomposition, Krylov subspace methods). Both rely on orthogonality; however, when we are tackling general interval data $\mathbf{X}, \mathbf{y}$, the orthogonality of two interval vectors makes no sense, hence these methods are of no use. Applying the first mentioned approach to the interval case seems attractive, since there are many methods for solving interval linear systems. Unfortunately, multiplication of two interval matrices results not only in a squared condition number but also in exceptional growth of interval widths, therefore the obtained solution would generally be useless. The state-of-the-art approach is based on solving the following system:

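$$\begin{pmatrix} I & \mathbf{X} \\ \mathbf{X}^T & 0 \end{pmatrix} \begin{pmatrix} p_2 \\ p \end{pmatrix} = \begin{pmatrix} \mathbf{y} \\ 0 \end{pmatrix}. \tag{4}$$

The first block row gives $p_2 = y - Xp$ (the residual), and substituting it into the second block row, $X^Tp_2 = 0$, recovers the normal equations $X^TXp = X^Ty$. The enclosure of the parameter vector $\mathbf{p}$ therefore appears as the last $n$ components of the obtained enclosure. From $\mathbf{X}$ we form a much larger square matrix, which is why we call it the supersquare or supsquare approach. The price is that a much larger system of interval equations needs to be solved.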
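The structure of system (4) can be checked on ordinary real data. The sketch below is only an illustration on point (non-interval) data with made-up numbers: it builds the block matrix with the identity and $X$ in the first block row and $X^T$ and zero in the second, solves it exactly with rational arithmetic, and reads the least squares parameters from the trailing components (for this block layout the leading components hold the residual). The helper names are hypothetical, and no interval solver is involved.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a nonsingular square system by Gauss-Jordan elimination over the rationals."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def augmented_system(x, y):
    """Build the block matrix [[I, X], [X^T, 0]] and the right-hand side (y, 0)."""
    rows, cols = len(x), len(x[0])
    size = rows + cols
    a = [[0] * size for _ in range(size)]
    for i in range(rows):
        a[i][i] = 1
        for j in range(cols):
            a[i][rows + j] = x[i][j]
            a[rows + j][i] = x[i][j]
    return a, list(y) + [0] * cols


# Made-up data: fit y = p1 * k + p2 through four points.
x = [[1, 1], [2, 1], [3, 1], [4, 1]]
y = [6, 5, 7, 10]
a, rhs = augmented_system(x, y)
solution = solve(a, rhs)
residual, params = solution[:len(x)], solution[len(x):]
print("parameters:", params)
```

For four data points and two parameters the system already has size six instead of two, which is the growth the text refers to.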

Later, we want to use regression with nonlinear models that are linearizable, therefore the data $X, \mathbf{y}$ formed out of the MBW data depicted in Figure 5 will have a certain shape:

• $X$ is thin (its entries are real numbers, not proper intervals)

• Intervals are to be found only in the right-hand side $\mathbf{y}$

• The number of parameters $n$ is going to be small, depending on the model used (see Table 3)

• Depending on the linearization used, $X$ might consist of integers only (ones, numbers of breaths or their powers)

• (component-wise)

Figure 5: Illustration of the decreasing concentration of nitrogen in peaks bounded with intervals. From such data the $X, \mathbf{y}$ for the regression will be formed.

All these are really favorable properties. That is why we asked whether it is possible to design a method returning tighter enclosures than (4). Unfortunately, we were not able to find such a method. We believe that it is a really hard task, since the mentioned properties also work in favor of (4). However, we were able to rewrite the formulas to obtain algorithms that are much faster.

### 4.2 Case 2×2

When $X$ is an integer matrix of size $m \times 2$, then $X^TX$ is of size $2 \times 2$. We can apply the state-of-the-art supsquare approach; however, in this case the "not recommended" approach of solving the interval normal system of equations may pay off. This actually means computing some verified enclosure of $\mathbf{p}$, where

$$\mathbf{p} = (X^TX)^{-1}X^T\mathbf{y}.$$

When computing the inverse matrix, fractions can occur, and with them numbers that are not machine representable. That is why we still need to compute in a verified way with intervals. Nevertheless, it is advantageous to postpone the interval computation as long as possible, because classical floating point arithmetic is usually faster (e.g., in Octave or Matlab). In this case we use the simple shape of the inverse of a $2 \times 2$ matrix,

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$

It is possible to compute $M$, the adjugate matrix on the right-hand side taken for $X^TX$, in floating point arithmetic since $X^TX$ contains only integers, and similarly for $MX^T$.

When we compute the expression $(X^TX)^{-1}(X^T\mathbf{y})$, we first multiply $X^T$ with the interval vector $\mathbf{y}$, which unfortunately causes a large growth of interval radii. Then we multiply the result with the matrix $(X^TX)^{-1}$, which causes another growth. A much more suitable way is to compute the whole expression as

$$((X^TX)^{-1}X^T)\,\mathbf{y},$$

that is, multiplying the matrices first and only then multiplying with $\mathbf{y}$. In conclusion, the enclosure of $\mathbf{p}$ can be computed as

$$(MX^T)(\mathbf{q}\,\mathbf{y}) \supseteq \mathbf{p},$$

where $\mathbf{q} = \square\,(1/\det(X^TX))$ and the symbol $\square$ stands for the tightest enclosure of the given expression by an interval with machine representable bounds.
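The idea of multiplying the real matrices first can be sketched for the design used in the tests below, where the first column of $X$ holds breath numbers and the second column holds ones. The function name is hypothetical, and exact rational arithmetic stands in for the outward rounding that a verified implementation would use, so this is an illustration of the ordering of operations rather than of the authors' code. The adjugate of $X^TX$ times $X^T$ is computed with integers only, and the intervals enter in a single final step.

```python
from fractions import Fraction


def enclose_line_parameters(breaths, y_intervals):
    """Enclose slope and intercept for X = [k, 1] rows and an interval right-hand side."""
    m = len(breaths)
    a = sum(k * k for k in breaths)      # X^T X = [[a, b], [b, m]]
    b = sum(breaths)
    det = a * m - b * b                  # positive unless all breath numbers coincide
    if det == 0:
        raise ValueError("X^T X is singular")
    # Rows of adj(X^T X) X^T; all entries are exact integers.
    rows = [[m * k - b for k in breaths],
            [a - b * k for k in breaths]]
    enclosures = []
    for row in rows:
        # Each y_j occurs once, so picking the endpoint by sign gives the exact range.
        lower = sum(c * Fraction(lo if c >= 0 else hi)
                    for c, (lo, hi) in zip(row, y_intervals))
        upper = sum(c * Fraction(hi if c >= 0 else lo)
                    for c, (lo, hi) in zip(row, y_intervals))
        enclosures.append((lower / det, upper / det))
    return enclosures


# Made-up data: centers 6, 5, 7, 10 with radius 1.
slope, intercept = enclose_line_parameters(
    [1, 2, 3, 4], [(5, 7), (4, 6), (6, 8), (9, 11)])
print("slope in", slope, "intercept in", intercept)
```

Because the intervals appear only in the right-hand side and each of them is used exactly once per parameter, no dependency-related overestimation arises in this final product.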

The system (4) can be solved by any cited method computing enclosures of interval systems. We used the method within the Octave interval package. We tested the difference between the two mentioned approaches on random systems whose sizes represent the ceiling for the maximum number of breaths generally occurring during MBW testing. For the purpose of the testing, $X$ consisted of two columns, the first consisting of the numbers 1 to $m$ and the second of ones. For the right-hand side we first generated random intervals with fixed radius 1 and then placed these intervals along a random line. Both methods were tested for each size on 100 systems. To show the difference, our method was tested with and without postponing the interval operations. In all cases the methods computed enclosures for $\mathbf{p}$ of the same width. However, the computation times differed; they are displayed in Table 1, where the difference between postponed and non-postponed interval computation can be clearly seen.

### 4.3 Case 3×3 and larger

It would be more complicated to find a similar formula for the inverse of a general square matrix. That is why this time we refrain from postponing interval computations and work directly with tight intervals. We again compare the result with the supsquare approach. The obtained enclosures of $\mathbf{p}$ are again the same and the computation times are displayed in Table 2. This method is still faster than the supsquare approach.

## 5 Our data

We collected the data from real patients measured for medical purposes. The measurement technique adhered to the ERS/ATS recommendation and the standard operating procedure for $N_2$-MBW.

The three necessary conditions to obtain reliable data were:

• Patients have sufficiently regular breathing pattern during measurement

• There is no leakage during the measurement

• Wash-out phase is finished (nitrogen is washed out to a preset level – 2.5%)

The data was captured using Exhalyzer D machine by Ecomedics, Duernten, Switzerland.

We included 15 raw data files (A-files) from healthy volunteers (50% male) with a mean age of 12.4 years. Additionally, we included 12 A-files from patients with cystic fibrosis (40% male) with a mean age of 10.6 years. The study was approved by the institutional ethical committee of University Hospital Motol, Prague. The legal representatives of the patients gave written informed consent. In all A-files breath ends were detected using our own algorithm. The corresponding end tidal nitrogen concentrations were expressed as intervals according to the formulas (2) and (3). The pre-washout parts of the data were automatically trimmed.

After long discussions we formulated a few questions that are interesting from both the clinical and the mathematical point of view. An important and still discussed question is the behaviour of the nitrogen washout curve in time. There is an observable difference between healthy and diseased persons, but an objective description is still missing. The long duration of the washout (especially in severely affected patients) limits the feasibility of the test, especially in small children (toddlers and pre-schoolers). Currently, premature cessation of the washout (before reaching 2.5% of the starting nitrogen concentration) prevents us from analysing the data. The possibility to derive some substitute indices computable from an incomplete washout curve would be of great benefit.

## 7 In search for a model

One of the main goals is to determine the shape of the nitrogen washout curve. In other words, we try to derive the following function

$$f(n), \quad \text{for } n = 1, 2, \dots$$

where $n$ is the number of a peak (the initial peak has number 1) and the function returns the nitrogen concentration in each peak (it can be an interval concentration). We call such a function a nitrogen washout curve model. This goal was addressed in earlier work using a simplified model of the lungs. The authors were not able to compute models with more parameters due to limited computational power (they handled many calculations manually). Their approach could be described as "bottom-up". A similar approach, but for a different goal, can also be found in the literature.

Our approach is slightly different; we could call it "top-down". Using a computer we explore the most frequent mathematical models of decay and try to fit the existing medical data with them. From the fitting it will hopefully be possible to obtain more information about the real behaviour of the nitrogen washout process, and such knowledge will help to better predict the continuation of an incomplete measurement.

Of course, there can be some outliers in the data (e.g., false breaths) that could prevent a "perfect" fit. We can use the iterative refinement procedure: first, the data are fitted, then the worst outlier is discarded and the data are fitted again. We tried such refinement with 1 up to 5 iterations. We discuss the practical use of the refinement later.
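The refinement loop can be sketched as follows. A straight-line fit is used here only as a stand-in for whichever linearized model is being fitted, "worst outlier" is taken to mean the point with the largest absolute residual (an assumption), and the data are made up.

```python
def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def refine(points, iterations):
    """Fit, drop the point with the largest absolute residual, and fit again."""
    kept = list(points)
    for _ in range(iterations):
        slope, intercept = fit_line(kept)
        worst = max(kept, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))
        kept.remove(worst)
    return fit_line(kept), kept


# Made-up data on the line y = 10 - x with one false breath at x = 3.
data = [(1, 9), (2, 8), (3, 20), (4, 6), (5, 5), (6, 4)]
(slope, intercept), kept = refine(data, iterations=1)
print(slope, intercept, kept)
```

In this toy case a single iteration removes the false breath and the refit recovers the underlying line.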

### 7.1 Center data

In the previous sections we showed how to derive verified interval data from the measured patient data. We applied this procedure to all datasets. First, to get at least a rough idea of the washout curve model, classical least squares fitting was applied to centered data (real data obtained by replacing each interval with its center). We were interested in fitting curves for which the fitting problem can be transformed to solving a linear system of equations. To talk about the quality of a fit we need to measure it. The typically used measure is the mean squared error (MSE), the mean of the squared distances between the model fit and the real data. More specifically, we use rMSE, the square root of MSE, and we fit the data in the least squares manner. The mean absolute scaled error (MASE) is another measure of the quality of fit that we use. It compares the fit of a model with that of the naive predictor (a function that predicts for the next step the value that occurred in the current step).

If we evaluate the measurements visually, we can detect "exponential-like" decay in all data. An example can be seen in Figure 5. Many papers and books (and also the medical software shipped with the Exhalyzer D machine) describe this decay with an exponential function, which is one of the classical fitting models. We tried to find the most suitable among the classical fitting models.

From the large collection of models we first selected the candidates fulfilling the visual criteria. They are listed in Table 3. The left column contains the shortcut by which we refer to a model, the second column its mathematical description, and the third column the parameters that need to be computed to fit a given dataset. As already mentioned, all of these models can be linearized.
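Both measures are short to write down. The sketch below assumes the common definition of MASE, in which the mean absolute error of the model is divided by the mean absolute error of the one-step naive predictor on the same series; whether the study uses exactly this scaling is an assumption. The sample values are made up.

```python
import math


def rmse(observed, fitted):
    """Square root of the mean squared distance between data and fit."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed))


def mase(observed, fitted):
    """Model MAE scaled by the MAE of the naive 'same as previous step' predictor."""
    model_mae = sum(abs(o - f) for o, f in zip(observed, fitted)) / len(observed)
    naive_mae = sum(abs(observed[t] - observed[t - 1])
                    for t in range(1, len(observed))) / (len(observed) - 1)
    return model_mae / naive_mae


# Made-up end tidal concentrations (in %) and a made-up model fit.
observed = [80.0, 60.0, 44.0, 32.0]
fitted = [78.0, 61.0, 45.0, 30.0]
print(rmse(observed, fitted), mase(observed, fitted))
```

A MASE below 1 means the model does better than the naive predictor on the given data; rMSE, in contrast, is expressed in the units of the data and punishes large misfits more heavily.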

For each dataset (one measurement) each model was fitted with the following procedure:

1. For a given dataset (A-file), try to fit the given model via least squares procedure

2. Compute metrics – rMSE, MASE
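The two steps can be sketched for a single model. Since Table 3 is not reproduced here, the form $f(n) = a\,e^{bn}$ for the exp model is an assumption, as are the function names and the synthetic data. Taking logarithms turns the fit into a straight-line least squares problem in $n$.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n


def fit_exp(concentrations):
    """Fit c_n = a * exp(b * n) via the linearization ln c_n = ln a + b * n."""
    breaths = list(range(1, len(concentrations) + 1))
    b, log_a = fit_line(breaths, [math.log(c) for c in concentrations])
    return math.exp(log_a), b


def rmse(observed, fitted):
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed))


# Synthetic, noise-free washout: each breath keeps 80 % of the previous peak.
peaks = [80.0 * 0.8 ** n for n in range(1, 9)]
a, b = fit_exp(peaks)
fitted = [a * math.exp(b * n) for n in range(1, len(peaks) + 1)]
error = rmse(peaks, fitted)
print(a, b, error)
```

One design consequence worth keeping in mind is that the least squares criterion is minimized in the logarithmic domain, so on noisy data the result generally differs from a direct nonlinear least squares fit on the original scale.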

At first we tried to remove the outliers from the initial fit. We tried removing 1 to 5 outliers (iteratively or at once); however, it did not lead to any significant improvement. Usually, the poorly fitted initial parts of the washout curve were the ones omitted, which left almost no difference in the terminal part of the refitted curve. We assumed that the levels of 2.5% and 5% are the significant ones for medical specialists. When we follow the nitrogen curve in time beyond the 2.5% concentration level, we can see that the concentration peaks can be approximated by a nearly horizontal line. It is difficult for all models to fit such a slowly decreasing end properly. That is why we also measured the quality of fit up to the level where something is "still happening" (the curve does not yet decrease so slowly), that is, up to 5%. The rMSE results can be seen in Tables 4 and 5.

From the perspective of the rMSE measure the model loglin is the winner. The rMSE heavily penalizes large misfits, and the loglin curve can fit the initial part of the washout curve quite well. All other models are penalized, except for the model exploglin. It sometimes seems to be better; however, the coefficient $d$ in the exponential term of its formula is usually an extremely tiny number. That is why this model is usually practically the same as loglin. Following Occam's principle, we further consider only the loglin model.

The curve with the best rMSE fit is not necessarily the best for predicting the behaviour of the washout curve. This can be seen from the MASE values in Tables 6 and 7. If we compare the curves to the naive predictor, we see that the exppow model often does better than loglin. The exp model, which is heavily used in the literature to describe the nitrogen washout curve, is however not so accurate.

From the mentioned tables we can get an idea of how the washout curve behaves. The model loglin fits the data best. However, the tail of this model curve usually tends to rise, as we will see later, which is not plausible behaviour. Still, taken as a whole, this curve models the entire washout the best.

We ran an experiment in which all the curves were fitted to only the first third of the data. This way we could show that other models (exp, pow, exppow) do much better than the model loglin. This led us to the idea that the problem may lie in the too shallow descent of the end of the washout curve, which cannot be modeled well by any of the model curves used.

When the data sets were shortened to the point where the nitrogen concentration decreases below 5% of its initial concentration, the model exppow worked much better on this initial phase and its fitting error improved. Nevertheless, the best fitting model is still loglin. We therefore have some candidates for interval fitting models: the ones that have the best rMSE and MASE at the same time. We omit the model exploglin, since it is too complicated. We exclude the model log, since it is contained in loglin and does not have better results than loglin. We also discard the models explin and explog due to their large error. We have four remaining candidates (exp, pow, exppow, loglin) that we use further.

None of the checked model curves was able to fit the data from 5% to 2.5% well. The level of 5% seems to be a level that still enables plausible fitting with one of the classical models. This could also be an important fact for the current discussions about the advantages of LCI5 over LCI2.5. However, we must be careful not to draw conclusions too quickly, because the part of the washout curve between 5% and 2.5% can possibly contain some important information about the quality of the patient's airways.

### 7.2 Interval models – least squares

We took the four candidate fitting curves (exp, pow, exppow, loglin) and performed the interval fitting. Each fitting can be transformed to solving an interval linear system of equations. Unfortunately, the results were not encouraging. Due to the errors of the sensors the interval data consist of intervals with large widths. That is why the resulting interval washout curve models are too thick. Another reason for such overestimation might be that solving an interval linear system exactly is a hard task (NP-hard in the language of computational complexity); therefore we usually use only approximate methods, which provide verified but possibly overestimated enclosures. Shapes typical for each interval washout model are depicted in Figure 6. No curve was able to fit the data completely well. The exp function misses the initial and final parts of the washout data. The pow model misses the initial part. The exppow model is usually too wide; however, it contains the data inside the interval curve. The loglin model usually tends to widen in time, ruining any possibility of prediction.

Figure 6: Interval curves fitting the real data with real measurement errors. The small red rectangles represent the interval data. The blue line represents the level of 5% of the initial nitrogen concentration. Notice that the y-scale of each graph is different. Interval least squares fitting curves (interval washout models) are depicted in pink.

### 7.3 Hypothetical sensors

We showed that the problem of the least squares fitting lies in the precision of current sensors (0.3% for the $O_2$ sensor and 5% for the $CO_2$ sensor of the Exhalyzer D machine) and possibly also in the methods for solving interval systems of equations. One might claim that the main flaw lies in the methods for solving interval systems and their overestimation. To shed more light on this, let us assume we have sensors whose accuracy is better by one order of magnitude, i.e., 0.03% for the $O_2$ sensor and 0.5% for the $CO_2$ sensor.

Let us repeat the same procedure as in Figure 6, this time for the hypothetical sensors. The surprising results are displayed in Figure 7. We checked all four mentioned models manually by visual evaluation. We omitted the model pow, because it gave poor fitting results in the initial parts. We also omitted the model exp: although it gave very narrow curves, it resulted in a really poor fit. We checked the two remaining models, exppow and loglin. The problems with loglin still persist; even for narrow intervals the curve tends to rise at its end. This gives us the winning descriptive model, exppow. If we take a look at Figure 7, we see that the exppow model does not fit the data well under the blue line (the 5% concentration level). However, down to that line it behaves well. We check its properties further in the next section.

Figure 7: Interval curves fitting the real data with hypothetical measurement errors. The small red rectangles represent the interval data. The blue line represents the level of 5% of the initial nitrogen concentration. Note the different y-scale of each graph. Interval least squares fitting curves are depicted in pink.

### 7.4 Prediction

As mentioned above, the level of nitrogen concentration at which we stop the measurement is 2.5% or 5%. This boundary was set historically. For small uncooperative infants it might be difficult to prevent leaks and maintain calm and regular breathing for a longer time. Sometimes the measurement must be aborted. In order not to waste a measurement that was good so far, we can try to predict the subsequent behaviour of the washout curve. Using the previously developed interval washout models we focus on determining the terminal breath of a measurement. To recall the definition, for a given level of nitrogen concentration (20%, 10%, 5% or 2.5%), the terminal breath for this concentration is defined to be the first of three consecutive breaths with concentration below the respective level.
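The definition of the terminal breath translates directly into code. In this sketch the level is taken relative to the first peak of the series (an assumption about what counts as the initial concentration), breaths are numbered from 1, and the peak values are made up.

```python
def terminal_breath(end_tidal, level_pct):
    """Return the 1-based number of the first of three consecutive breaths whose
    end tidal concentration lies below level_pct percent of the initial peak,
    or None if no such breath exists in the data."""
    threshold = end_tidal[0] * level_pct / 100.0
    for i in range(len(end_tidal) - 2):
        if all(c < threshold for c in end_tidal[i:i + 3]):
            return i + 1
    return None


# Made-up end tidal N2 concentrations (in %), one value per breath.
peaks = [80, 40, 20, 10, 3.9, 4.2, 3.8, 3.5, 3.0, 1.9, 1.8, 1.7]
print(terminal_breath(peaks, 5))     # 5 % level
print(terminal_breath(peaks, 2.5))   # 2.5 % level
print(terminal_breath(peaks, 1))     # level never reached in this series
```

Breath 5 dips below the 5% threshold but is followed by a breath above it, so it does not qualify; this is exactly the situation the three-breath rule is meant to filter out.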

As discussed earlier, we limited our prediction to the part of the washout curve between 10% and 5%, as depicted in Figure 8. The goal was to predict the interval containing the terminal breath at the 5% level and compare it with the real terminal breath at the corresponding level. For the prediction we used both the real and the hypothetical sensors; the results are in Tables 8 (real sensors) and 9 (hypothetical sensors).

Figure 8: Concentrations of nitrogen in % of the initial nitrogen concentration.

The case of real sensors is provided just for illustration; the resulting intervals predicting the terminal breaths are too large. In the case of hypothetical sensors, the prediction is generally not bad. However, in some cases the prediction is completely wrong. We suppose that none of the tested models is entirely suitable for reliable prediction. Nevertheless, the quality of prediction brings us to a very important question that we address in the following subsection.

### 7.5 An alternative clinical index?

The prediction of washout curve in current software (Spiroware) is of poor quality. We could see that the prediction using verified interval regression is also not too trustworthy. The problem lies in an unsatisfactory model of the nitrogen washout process. We discussed many washout curve models, however none of them was plausible enough (for the purpose of prediction). Before one starts a hunt for better models, it needs to be specified, why exactly we need predictions and models of washout process. One reason has been documented previously on an example of an interrupted measurement because of patient’s weak cooperation. Indeed, the possibility to predict washout process would be of a great clinical value. Unfortunately, our results indicate, that predictions are not possible within the currently used approach to washout data analysis.

Let us say we want to predict LCI from an incomplete measurement. To derive the LCI, the FRC is also needed. Deriving the FRC requires computing a volume (as an integral of the flow), so we need to know the missing flow data, whose prediction is nearly impossible because the flow curve is too jagged. In conclusion, even if we had a good prediction of the washout curve, there would be no way to compute a meaningful LCI from it.
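To make the dependence on flow explicit, the sketch below uses the common textbook definitions FRC = (expired nitrogen volume) / (drop in end-tidal nitrogen fraction) and LCI = (cumulative expired volume) / FRC. It is a simplified illustration, not the procedure implemented in Spiroware: the function names, the sign convention for flow, the uniform sampling step, and the neglect of dead-space and re-inspired-gas corrections are all assumptions.

```python
from typing import Sequence, Tuple


def trapezoid(y: Sequence[float], dt: float) -> float:
    """Integrate uniformly sampled values with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * dt for i in range(len(y) - 1))


def frc_and_lci(flow: Sequence[float],
                n2_fraction: Sequence[float],
                dt: float,
                c_start: float,
                c_end: float) -> Tuple[float, float]:
    """Return (FRC, LCI) from sampled flow and nitrogen fraction.

    Assumptions of this sketch: positive flow means expiration,
    samples are taken every `dt` seconds, and `c_start`, `c_end` are the
    end-tidal nitrogen fractions at the start and end of the washout.
    """
    if c_start <= c_end:
        raise ValueError("c_start must exceed c_end")
    # Keep only the expiratory part of the flow signal.
    expired = [max(f, 0.0) for f in flow]
    # Cumulative expired volume: integral of expiratory flow.
    cev = trapezoid(expired, dt)
    # Expired nitrogen volume: integral of flow times nitrogen fraction.
    v_n2 = trapezoid([f * x for f, x in zip(expired, n2_fraction)], dt)
    frc = v_n2 / (c_start - c_end)
    return frc, cev / frc


# Hypothetical constant signals, used only to exercise the functions.
frc, lci = frc_and_lci([1.0] * 11, [0.4] * 11, 0.1, 0.78, 0.0195)
```

Both the numerator and the denominator of LCI are integrals of the measured flow, which is why a predicted concentration curve alone is not enough to recover the index.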

With that, a new question arises: can LCI be replaced by another index that describes ventilation inhomogeneity, is more suitable for prediction, and is robust enough to tolerate some inaccuracy of the prediction? A clinical index based on the curvature of the washout curve might be much more suitable. It would also make it possible to omit the computation of the volume of air/nitrogen. During our early regression tests it seemed that the model exppow works better for healthy persons and the model loglin works better for patients with cystic fibrosis. We wanted to derive a new clinical index as the ratio of the quality of fit of these two models. That is why all the tables contain the rightmost column "rat". Another option would be the index depicted in Figure 9. It is the angle between two lines: the first goes through the initial concentration and the 20% concentration, the second through the 20% concentration and the 5% concentration. However, these two indices remain hypothetical so far, since the relation between them and lung properties is a subject of further clinical study.

Figure 9: An example of an alternative hypothetical clinical index. The angle between two lines: one going through the initial 100% concentration of nitrogen and 20%, the second going through 20% and 5%.
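The angle index can be computed from three points of the washout curve. The sketch below is hypothetical: it assumes that the horizontal axis is the breath number, that the vertical axis is the concentration in % of the initial concentration, and it introduces an `x_scale` parameter because the angle depends on the relative scaling of the two axes, which the text does not fix. The function name and the sample breath numbers are illustrative.

```python
import math


def washout_angle(x100: float, x20: float, x5: float,
                  x_scale: float = 1.0) -> float:
    """Angle in degrees between two lines on the washout curve.

    The first line joins the points (x100, 100) and (x20, 20), the second
    joins (x20, 20) and (x5, 5), where x100, x20 and x5 are the positions
    (assumed here to be breath numbers) at which the concentration equals
    100%, 20% and 5% of the initial value. `x_scale` converts one unit of
    the horizontal axis into units of the vertical axis.
    """
    if not (x100 < x20 < x5):
        raise ValueError("expected x100 < x20 < x5")
    # Direction vectors of the two line segments.
    v1 = ((x20 - x100) * x_scale, 20.0 - 100.0)
    v2 = ((x5 - x20) * x_scale, 5.0 - 20.0)
    # Both vectors point right and down, so their inclinations lie
    # strictly between -90 and 0 degrees and the difference is below 90.
    a1 = math.atan2(v1[1], v1[0])
    a2 = math.atan2(v2[1], v2[0])
    return math.degrees(abs(a2 - a1))


# Hypothetical breath numbers at which 100%, 20% and 5% are reached.
angle = washout_angle(0, 8, 23)
```

When the three points are collinear the angle is zero; a larger angle indicates a stronger bend of the curve between the fast initial washout and its slow tail.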

## 8 Conclusions

We summarize the results in the form of the following list:

• We were able to significantly speed up the interval least squares procedure for certain specialized cases (e.g., the output data from MBW).

• We showed an example of handling uncertainties algebraically.

• We demonstrated that the models commonly used in the literature to describe the behaviour of the nitrogen washout process are not plausible.

• We showed that, among the classical fitting models, the best (but still not ideal) model for describing the washout curve is exppow.

• Fitting the data with classical models down to the 5% level is much more achievable than fitting them down to the 2.5% level.

• The current accuracy of the Exhalyzer D sensors is insufficient for interval data estimation and for making reasonable predictions.

• If the sensors were more accurate by just one order of magnitude, the verified fitting would work.

• It is impossible to predict the future value of LCI from an interrupted measurement, due to the properties of LCI.

• The possibility of new clinical indices was discussed.

Our work opened numerous directions for future research: finding better models of the washout process, combining the top-down and bottom-up approaches in washout modeling, and searching for new clinical indices that will enable better prediction (our newly proposed indices are currently subjects of further clinical study). It would also be interesting to combine the algebraic approach to uncertainty with the statistical one.

## References

•  Michal Černý, Jaromír Antoch, and Milan Hladík. On the possibilistic approach to linear regression models involving uncertain, indeterminate or interval data. Information Sciences, 244:26–47, 2013.
•  John E Cotes, David J Chinn, and Martin R Miller. Lung function: physiology, measurement and application in medicine. John Wiley & Sons, 2009.
•  Yadin David, Wolf W Von Maltzahn, Michael R Neuman, and Joseph D Bronzino. Clinical engineering. CRC Press, 2003.
•  Jane C Davies, Steve Cunningham, Eric WFW Alton, and JA Innes. Lung clearance index in CF: a sensitive marker of lung disease severity. Thorax, 63(2):96–97, 2008.
•  Francisco de AT de Carvalho, Eufrasio de A Lima Neto, and Camilo P Tenorio. A new method to fit a linear regression model for interval-valued data. In Annual Conference on Artificial Intelligence, pages 295–306. Springer, 2004.
•  Kent Green, Frederik F Buchvald, June Kehlet Marthin, Birgitte Hanel, Per M Gustafsson, and Kim Gjerum Nielsen. Ventilation inhomogeneity in children with primary ciliary dyskinesia. Thorax, pages thoraxjnl–2011, 2011.
•  Per M Gustafsson. Peripheral airway involvement in CF and asthma compared by inert gas washout. Pediatric Pulmonology, 42(2):168–176, 2007.
•  O Heimlich. GNU Octave interval package, version 1.4.1, 2016.
•  Milan Hladík. New operator and method for solving real preconditioned interval linear equations. SIAM J. Numer. Anal., 52(1):194–206, 2014.
•  Milan Hladík and Michal Černý. Interval regression by tolerance analysis approach. Fuzzy Sets and Systems, 193:85–107, 2012.
•  Jaroslav Horáček and Milan Hladík. Computing enclosures of overdetermined interval linear systems. Reliable Computing, 19:143, 2013.
•  Jaroslav Horáček, Václav Koucký, and Milan Hladík. New insight into automated breath detection. In preparation.
•  Luc Jaulin, Michel Kieffer, Olivier Didrit, and Éric Walter. Applied interval analysis. With examples in parameter and state estimation, robust control and robotics. Springer, London, 2001.
•  Renee Jensen, Kent Green, Per Gustafsson, Philipp Latzin, Jessica Pittman, Felix Ratjen, Paul Robinson, Florian Singer, Sanja Stanojevic, and Sophie Yammine. Standard operating procedure: multiple breath nitrogen washout. EcoMedics AG, Duernten, Switzerland, 2013.
•  R Baker Kearfott. Interval computations: Introduction, uses, and resources. Euromath Bulletin, 2(1):95–112, 1996.
•  Kenneth A Macleod, Alex R Horsley, Nicholas J Bell, Andrew P Greening, J Alastair Innes, and Steve Cunningham. Ventilation heterogeneity in children with well controlled asthma with normal spirometry indicates residual airways disease. Thorax, 64(1):33–37, 2009.
•  R.E. Moore, R.B. Kearfott, and M.J. Cloud. Introduction to interval analysis. Society for Industrial Mathematics, 2009.
•  Arnold Neumaier. Linear interval equations. In Interval Mathematics 1985, pages 109–120. Springer, 1986.
•  Arnold Neumaier. Interval Methods for Systems of Equations. Cambridge University Press, Cambridge, 1990.
•  Paul Robinson, Philipp Latzin, Sylvia Verbanck, Graham L Hall, Alexander Horsley, Monika Gappa, Cindy Thamrin, Hubertus GM Arets, Paul Aurora, S Fuchs, et al. Consensus statement for inert gas washout measurement using multiple and single breath tests. European Respiratory Journal, pages erj00697–2012, 2012.
•  Robert G Rossing, M Bryan Danford, Earl L Bell, and Raul Garcia. Mathematical models for the analysis of the nitrogen washout curve. Technical report, DTIC Document, 1967.
•  Vera Sit, Melanie Poulin-Costello, and Wendy Bergerud. Catalogue of curves for curve fitting. Forest Science Research Branch, Ministry of Forests, 1994.
•  Hideo Tanaka and Haekwan Lee. Interval regression analysis by quadratic programming approach. IEEE Transactions on Fuzzy Systems, 6(4):473–481, 1998.
•  Merryn Howatson Tawhai and Peter J Hunter. Multibreath washout analysis: modelling the influence of conducting airway asymmetry. Respiration physiology, 127(2-3):249–258, 2001.