
Numerical computation of formal solutions to interval linear systems of equations

03/15/2019
by   Sergey P. Shary, et al.

The work is devoted to the development of numerical methods for computing "formal solutions" of interval systems of linear algebraic equations. These solutions are found in Kaucher interval arithmetic, which extends and completes the classical interval arithmetic algebraically. The need to solve these problems naturally arises, for example, in inner and outer estimation of various solution sets to interval linear systems of equations. The work develops two approaches to the construction of stationary iterative methods for computing the formal solutions that are based on splitting the matrix of the system. We consider their convergence and implementation issues, compare with the other approaches to computing formal solutions.


1 Introduction

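The main object we study in this article is interval systems of linear algebraic equations having the form

$$\begin{cases} \mathbf{a}_{11}x_1 + \mathbf{a}_{12}x_2 + \dots + \mathbf{a}_{1n}x_n = \mathbf{b}_1, \\ \qquad\vdots \\ \mathbf{a}_{m1}x_1 + \mathbf{a}_{m2}x_2 + \dots + \mathbf{a}_{mn}x_n = \mathbf{b}_m, \end{cases} \tag{1}$$

with intervals $\mathbf{a}_{ij}$, $\mathbf{b}_i$, which will be considered as elements of Kaucher complete interval arithmetic [16]. It is convenient to denote such systems as

$$\mathbf{A}x = \mathbf{b}, \tag{2}$$

where $\mathbf{A} = (\mathbf{a}_{ij})$ is an interval $m\times n$-matrix and $\mathbf{b} = (\mathbf{b}_i)$ is an interval $m$-vector.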

For interval equations and systems of equations, various definitions of solutions and solution sets exist. Additionally, practice requires that these solution sets be evaluated in a wide variety of ways. As a consequence, there are many problem statements associated with interval equations and systems of equations. In our paper, we are going to concentrate on computing the so-called formal solutions to interval linear systems of the form (1)–(2).

Definition 1 An interval vector is called a formal solution to an interval equation or a system of equations if substituting it into the equation or system and executing all operations in interval arithmetic results in a valid equality.

For a number of reasons, it is useful to find formal solutions to interval systems of equations in Kaucher interval arithmetic $\mathbb{KR}$ rather than in classical interval arithmetic $\mathbb{IR}$. $\mathbb{KR}$ is an algebraic and order completion of $\mathbb{IR}$.

The problem we consider is not new. It was first noted in the paper by S. Berti [2] with respect to one interval quadratic equation. Then H. Ratschek and W. Sauer [32] studied such solutions for a single interval linear equation, and they used the term algebraic solution. K. Nickel [29] considered formal solutions to interval linear systems of equations in complex interval arithmetic. However, formal (algebraic) solutions of interval equations and systems of equations have been studied in only about a dozen papers over the last decades, during which interval analysis developed rapidly.

The need to consider formal solutions arises, in particular, in the interval analogue of the method of undetermined coefficients, as well as in modeling linear static systems under interval uncertainty of their parameters, as a consequence of the so-called formal (algebraic) approach to their analysis [36, 37, 39]. Recall that this approach replaces the original problem of estimating the internal states of the system with the problem of finding algebraic solutions of an auxiliary equation in Kaucher arithmetic $\mathbb{KR}$. In this work, we do not focus on the derivation of Kaucher interval arithmetic or on a discussion of its remarkable properties. The interested reader should refer to the original works [9, 10, 16], or to the summaries in [20, 36, 37, 38, 39].

Everywhere below, we denote intervals and interval objects (vectors, matrices) in bold (for example, A, B, C, …, x, y, z); $\underline{\mathbf{x}}$ and $\overline{\mathbf{x}}$ are understood as the lower (left) and upper (right) endpoints of an interval x; other symbols also follow the informal international standard [18]. Some results of this work were previously published in [36].

2 Kaucher interval arithmetic

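The classical interval arithmetic $\mathbb{IR}$ is known to be an algebraic system formed by intervals $\mathbf{a} = [\underline{\mathbf{a}}, \overline{\mathbf{a}}]$ of the real axis with the operations between them defined “by representatives”, i. e. according to the following fundamental principle

$$\mathbf{a}\star\mathbf{b} = \{\, a\star b \mid a\in\mathbf{a},\ b\in\mathbf{b} \,\}, \qquad \star\in\{+,-,\cdot,/\}. \tag{3}$$

The constructive reformulation of the above relation for separate arithmetic operations looks as follows:

$$\mathbf{a}+\mathbf{b} = [\,\underline{\mathbf{a}}+\underline{\mathbf{b}},\ \overline{\mathbf{a}}+\overline{\mathbf{b}}\,], \tag{4}$$
$$\mathbf{a}-\mathbf{b} = [\,\underline{\mathbf{a}}-\overline{\mathbf{b}},\ \overline{\mathbf{a}}-\underline{\mathbf{b}}\,], \tag{5}$$
$$\mathbf{a}\cdot\mathbf{b} = [\,\min\{\underline{\mathbf{a}}\,\underline{\mathbf{b}}, \underline{\mathbf{a}}\,\overline{\mathbf{b}}, \overline{\mathbf{a}}\,\underline{\mathbf{b}}, \overline{\mathbf{a}}\,\overline{\mathbf{b}}\},\ \max\{\underline{\mathbf{a}}\,\underline{\mathbf{b}}, \underline{\mathbf{a}}\,\overline{\mathbf{b}}, \overline{\mathbf{a}}\,\underline{\mathbf{b}}, \overline{\mathbf{a}}\,\overline{\mathbf{b}}\}\,], \tag{6}$$
$$\mathbf{a}/\mathbf{b} = \mathbf{a}\cdot[\,1/\overline{\mathbf{b}},\ 1/\underline{\mathbf{b}}\,] \quad\text{for } 0\notin\mathbf{b}. \tag{7}$$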
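The endpoint formulas for the four classical operations can be sketched in a few lines of Python. This is a minimal illustration, assuming intervals are stored as `(lower, upper)` tuples of exact numbers; the function names are hypothetical and outward rounding is deliberately ignored.

```python
from itertools import product

# Classical (proper) intervals are stored as (lower, upper) tuples.

def add(a, b):
    # Endpoint-wise sum of two intervals.
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    # Lower endpoint minus upper endpoint, and vice versa.
    return (a[0] - b[1], a[1] - b[0])

def mul(a, b):
    # Min and max over the four endpoint products.
    ps = [x * y for x, y in product(a, b)]
    return (min(ps), max(ps))

def div(a, b):
    # Division is defined only for divisors that do not contain zero.
    if b[0] <= 0 <= b[1]:
        raise ZeroDivisionError("divisor interval contains zero")
    return mul(a, (1 / b[1], 1 / b[0]))

print(add((1, 2), (3, 4)))   # (4, 6)
print(mul((-1, 2), (3, 4)))  # (-4, 8)
print(sub((1, 2), (1, 2)))   # (-1, 1)
```

The last line shows that subtracting an interval of nonzero width from itself does not give zero, which is the absence of additive inverses discussed in the text.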

The algebraic properties of the classical interval arithmetic $\mathbb{IR}$ are much poorer than those of the usual number systems (the ring of integers, the field of rational numbers and the field of real numbers $\mathbb{R}$), because

  • all intervals with nonzero width, i. e. most of the elements of $\mathbb{IR}$, do not have inverse elements with respect to the operations (4)–(7),

  • the arithmetic operations (4)–(7) are related to each other only by a weak sub-distributivity relation (see [27, 28, 40]), and the full distributivity of multiplication (and division) with respect to addition and subtraction does not hold.

As a consequence, first, in $\mathbb{IR}$ elementary equations with respect to the unknown variable $\mathbf{x}$, such as $\mathbf{a}+\mathbf{x}=\mathbf{b}$, $\mathbf{a}\cdot\mathbf{x}=\mathbf{b}$

and their like, do not always have solutions. Secondly, the technique of symbolic transformations in the classical interval arithmetic is quite poor. We cannot even transfer terms from one side of an equation to the other, and the lack of distributivity makes it impossible to collect similar terms.

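In addition, the order properties of the classical interval arithmetic with respect to the inclusion ordering “$\subseteq$” are unsatisfactory. In partially ordered sets, the possibility of taking, for any two elements, their lower bound “$\wedge$” and upper bound “$\vee$” with respect to the order in question plays a huge role. In $\mathbb{IR}$, the corresponding operations are

$$\mathbf{a}\wedge\mathbf{b} = [\,\max\{\underline{\mathbf{a}},\underline{\mathbf{b}}\},\ \min\{\overline{\mathbf{a}},\overline{\mathbf{b}}\}\,] \tag{8}$$
(taking the lower bound with respect to “$\subseteq$”),
$$\mathbf{a}\vee\mathbf{b} = [\,\min\{\underline{\mathbf{a}},\underline{\mathbf{b}}\},\ \max\{\overline{\mathbf{a}},\overline{\mathbf{b}}\}\,] \tag{9}$$
(taking the upper bound with respect to “$\subseteq$”).

Since the first of these operations cannot always be carried out, $\mathbb{IR}$ is, in a sense, “not closed” with respect to the inclusion order.

Footnote 1: If $\mathbf{a}$, $\mathbf{b}$ are usual one-dimensional intervals with non-empty intersection, then $\mathbf{a}\wedge\mathbf{b}$ and $\mathbf{a}\vee\mathbf{b}$ coincide with $\mathbf{a}\cap\mathbf{b}$ and $\mathbf{a}\cup\mathbf{b}$ respectively. But this is not valid in general. For example, $[1,2]\wedge[3,4]$ is not defined in $\mathbb{IR}$.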

The elements of the complete interval arithmetic $\mathbb{KR}$ are pairs of real numbers $[\alpha,\beta]$, not necessarily connected by the relation $\alpha\le\beta$. Thus, $\mathbb{KR}$ is obtained by appending improper intervals $[\alpha,\beta]$, $\alpha>\beta$, to the set $\mathbb{IR}$ of proper intervals and real numbers (identified with degenerate intervals of zero width). The elements of Kaucher arithmetic and the more complex objects formed from them (vectors, matrices) will be highlighted in bold, similar to usual intervals. Moreover, if $\mathbf{a}=[\alpha,\beta]$, then $\alpha$ is called the left (or lower) endpoint of the interval a and is denoted by $\underline{\mathbf{a}}$ or $\inf\mathbf{a}$, and $\beta$ is called the right (or upper) endpoint of the interval a and is denoted by $\overline{\mathbf{a}}$ or $\sup\mathbf{a}$. The interval a is called balanced if $\underline{\mathbf{a}} = -\overline{\mathbf{a}}$.

Without trying to replace the rigorous mathematical constructions performed by E. Kaucher, we will consider the description of the arithmetic along with informal motivations that help to understand the meaning of various constructions.

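Proper and improper intervals, the two halves of $\mathbb{KR}$, transform into each other under the dualization mapping $\operatorname{dual}:\mathbb{KR}\to\mathbb{KR}$, which swaps (turns over) the endpoints of the interval, i. e. such that

$$\operatorname{dual}\mathbf{a} = [\,\overline{\mathbf{a}},\ \underline{\mathbf{a}}\,].$$

The proper projection of an interval a is the value

$$\operatorname{pro}\mathbf{a} = \begin{cases}\mathbf{a}, & \text{if } \mathbf{a} \text{ is proper},\\ \operatorname{dual}\mathbf{a}, & \text{otherwise}.\end{cases}$$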
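As an illustration, dualization and proper projection can be written down directly for intervals stored as `(lower, upper)` tuples. This representation and the function names are assumptions made for the sketch, not notation from the paper.

```python
# Kaucher intervals as (lower, upper) tuples; lower > upper is allowed.

def is_proper(a):
    # A proper interval has lower endpoint <= upper endpoint.
    return a[0] <= a[1]

def dual(a):
    # Dualization swaps the endpoints.
    return (a[1], a[0])

def pro(a):
    # Proper projection: the interval itself if proper, its dual otherwise.
    return a if is_proper(a) else dual(a)

print(dual((3, 1)))  # (1, 3)
print(pro((3, 1)))   # (1, 3)
print(pro((1, 3)))   # (1, 3)
```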

Similar to the classical interval arithmetic $\mathbb{IR}$, the “inclusion” of one interval in another is defined in $\mathbb{KR}$ as follows:

$$\mathbf{a}\subseteq\mathbf{b} \quad\Longleftrightarrow\quad \underline{\mathbf{a}}\ge\underline{\mathbf{b}} \ \text{ and } \ \overline{\mathbf{a}}\le\overline{\mathbf{b}}. \tag{10}$$

For example, $[3,1]\subseteq[2,2]$. As a consequence, the operations of taking the minimum (8) and the maximum (9) keep their definitions unchanged in $\mathbb{KR}$, but now they are always possible due to the presence of improper intervals. In particular, $[1,2]\wedge[3,4]=[3,2]$. Thus, the extension of $\mathbb{IR}$ to $\mathbb{KR}$ makes the set of intervals a lattice, and even a conditionally complete lattice with respect to the inclusion ordering (10).

Footnote 2: A conditionally complete lattice is a partially ordered set in which every non-empty bounded subset has exact upper and lower bounds [3]. This is more than just a lattice, but less than a complete lattice, where minima and maxima can be taken for any families of elements.
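The inclusion order and the two lattice operations are easy to experiment with. The sketch below assumes the endpoint characterization of inclusion (lower endpoints compared by ≥, upper endpoints by ≤) and stores intervals as `(lower, upper)` tuples; the names `included`, `meet` and `join` are hypothetical.

```python
# Kaucher intervals as (lower, upper) tuples; lower > upper is allowed.

def included(a, b):
    # a is included in b: lower endpoint not smaller, upper not larger.
    return a[0] >= b[0] and a[1] <= b[1]

def meet(a, b):
    # Greatest lower bound with respect to inclusion.
    return (max(a[0], b[0]), min(a[1], b[1]))

def join(a, b):
    # Least upper bound with respect to inclusion.
    return (min(a[0], b[0]), max(a[1], b[1]))

m = meet((1, 2), (3, 4))
print(m)                     # (3, 2), an improper interval
print(join((1, 2), (3, 4)))  # (1, 4)
print(included(m, (1, 2)), included(m, (3, 4)))  # True True
```

The meet of two disjoint proper intervals is an improper interval, which is why the operation is always defined once improper intervals are admitted.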

In addition to the set-theoretic inclusion on the set of intervals $\mathbb{KR}$, there is another partial ordering, which naturally generalizes the linear order “$\le$” on the real axis:

Definition 2 For the intervals $\mathbf{a},\mathbf{b}\in\mathbb{KR}$, we say that a does not exceed b and write “$\mathbf{a}\le\mathbf{b}$” if and only if $\underline{\mathbf{a}}\le\underline{\mathbf{b}}$ and $\overline{\mathbf{a}}\le\overline{\mathbf{b}}$.

The interval is called nonnegative, i. e. “$\mathbf{a}\ge 0$”, if both its endpoints are nonnegative. The interval is nonpositive, i. e. “$\mathbf{a}\le 0$”, if both its endpoints are nonpositive.

For example, $[1,2]\le[3,2]$, and both compared intervals $[1,2]$ and $[3,2]$ are nonnegative.

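It is useful to introduce the concept of the sign of an interval, which we define as

$$\operatorname{sgn}\mathbf{a} = \begin{cases} +1, & \text{if } \mathbf{a}\ge 0,\\ -1, & \text{if } \mathbf{a}\le 0,\\ \text{undefined}, & \text{otherwise}. \end{cases}$$

The zero, i. e. the zero interval $[0,0]$, may have any sign.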

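The semigroup of all proper intervals with the operation of addition is fairly simple: the addition of intervals splits into independent additions of the left and right endpoints of the operands. As a consequence, the extension of addition from $\mathbb{IR}$ to $\mathbb{KR}$ is easy, and it is defined in $\mathbb{KR}$ in exactly the same way as in classical interval arithmetic:

$$\mathbf{a}+\mathbf{b} = [\,\underline{\mathbf{a}}+\underline{\mathbf{b}},\ \overline{\mathbf{a}}+\overline{\mathbf{b}}\,].$$

But now it follows from the existence of improper intervals that each element a of $\mathbb{KR}$ has a unique inverse with respect to addition (also called the opposite), denoted by “$\operatorname{opp}\mathbf{a}$”, and the equality $\mathbf{a}+\operatorname{opp}\mathbf{a}=0$ implies that

$$\operatorname{opp}\mathbf{a} = [\,-\underline{\mathbf{a}},\ -\overline{\mathbf{a}}\,]. \tag{11}$$

With respect to addition, the arithmetic $\mathbb{KR}$ is thus a commutative group that is isomorphic to the additive group of the standard linear space $\mathbb{R}^2$. For brevity, we denote by “$\ominus$” the operation inverse to addition, and we call it internal subtraction in $\mathbb{KR}$ (or algebraic subtraction). Then

$$\mathbf{a}\ominus\mathbf{b} = \mathbf{a}+\operatorname{opp}\mathbf{b} = [\,\underline{\mathbf{a}}-\underline{\mathbf{b}},\ \overline{\mathbf{a}}-\overline{\mathbf{b}}\,].$$

It is easy to verify that, for the addition in $\mathbb{KR}$, the following relation is valid:

$$\operatorname{dual}(\mathbf{a}+\mathbf{b}) = \operatorname{dual}\mathbf{a}+\operatorname{dual}\mathbf{b}. \tag{12}$$

It means, in particular, that the addition of improper intervals in $\mathbb{KR}$ is a “mirror image” of the addition of proper intervals in $\mathbb{IR}$. This is consistent with the status of improper intervals, so it seems necessary to preserve this property when defining the other arithmetic operations in $\mathbb{KR}$.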
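The group structure of addition can be checked on small examples. The following sketch assumes endpoint-wise addition and the opposite element obtained by negating each endpoint in place, with intervals stored as `(lower, upper)` tuples; the names `add`, `opp`, `asub` and `dual` are hypothetical.

```python
# Kaucher intervals as (lower, upper) tuples; lower > upper is allowed.

def add(a, b):
    # Endpoint-wise addition.
    return (a[0] + b[0], a[1] + b[1])

def opp(a):
    # Opposite element: negate each endpoint without swapping them.
    return (-a[0], -a[1])

def asub(a, b):
    # Internal (algebraic) subtraction: a + opp(b).
    return add(a, opp(b))

def dual(a):
    return (a[1], a[0])

a, b = (1, 3), (2, 7)
print(add(a, opp(a)))  # (0, 0)
print(asub(b, a))      # (1, 4); adding a back gives b
print(dual(add(a, b)) == add(dual(a), dual(b)))  # True
```

Note that `opp((1, 3))` is the improper interval `(-1, -3)`, which is exactly why additive inverses do not exist inside the classical arithmetic.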

We start extending the multiplication to the complete interval arithmetic with the simplest situation, i. e. multiplication by real numbers. It is natural to define it in exactly the same way as in classical interval arithmetic:

$$\mu\cdot\mathbf{a} = \begin{cases} [\,\mu\,\underline{\mathbf{a}},\ \mu\,\overline{\mathbf{a}}\,], & \text{if } \mu\ge 0,\\ [\,\mu\,\overline{\mathbf{a}},\ \mu\,\underline{\mathbf{a}}\,], & \text{otherwise}. \end{cases} \tag{13}$$

Further, for nonnegative proper intervals, multiplication in $\mathbb{IR}$ is performed “through the endpoints”, quite similar to addition. Therefore, it makes sense to define, in $\mathbb{KR}$, the multiplication of nonnegative intervals, which are not necessarily proper, according to the same “endpoint formulas”, i. e. as $\mathbf{a}\cdot\mathbf{b} = [\,\underline{\mathbf{a}}\,\underline{\mathbf{b}},\ \overline{\mathbf{a}}\,\overline{\mathbf{b}}\,]$. In fact, in this way, we embed a multiplicative semigroup of positive intervals into a group. If, in the product $\mathbf{a}\cdot\mathbf{b}$, one of the factors is a nonpositive interval, then we can make it nonnegative by multiplying by $(-1)$ and using formula (13), thus reducing to the previous case.

Note that in all these cases the property analogous to (12) holds:

$$\operatorname{dual}(\mathbf{a}\cdot\mathbf{b}) = \operatorname{dual}\mathbf{a}\cdot\operatorname{dual}\mathbf{b}.$$

How can the definition of multiplication be extended to the entire set $\mathbb{KR}$? The ability of algebra to do this has already been exhausted, and we need to involve considerations concerning the inclusion ordering in $\mathbb{KR}$ and the related properties of arithmetic operations.

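Using the maximum with respect to inclusion (9), the fundamental property (3), which defines the operations of classical interval arithmetic, can be rewritten in the following equivalent form:

$$\mathbf{a}\star\mathbf{b} = \bigvee_{a\in\mathbf{a}}\ \bigvee_{b\in\mathbf{b}}\ (a\star b), \tag{14}$$

where $\star\in\{+,-,\cdot,/\}$. It is easily seen that the addition extended to the entire set of proper and improper intervals $\mathbb{KR}$, as well as the multiplication defined for intervals that do not contain zero and are not contained in zero, can be represented in a similar way through the operations (8) and (9) of taking minimum and maximum with respect to inclusion. If both operands a and b are improper, then

$$\mathbf{a}\star\mathbf{b} = \bigwedge_{a\in\operatorname{pro}\mathbf{a}}\ \bigwedge_{b\in\operatorname{pro}\mathbf{b}}\ (a\star b), \tag{15}$$

where $\star\in\{+,\cdot\}$. The lower arguments of the operations “$\bigwedge$” should range over the proper projections $\operatorname{pro}\mathbf{a}$ and $\operatorname{pro}\mathbf{b}$, since improper intervals themselves are contained in points due to the definition of inclusion in $\mathbb{KR}$.

The property (15) is evidently obtained from (14) with the use of dualization. But the next properties of the operation $\star$ are not obvious:

$$\mathbf{a}\star\mathbf{b} = \bigvee_{a\in\mathbf{a}}\ \bigwedge_{b\in\operatorname{pro}\mathbf{b}}\ (a\star b)$$

if a is proper and b is improper, and

$$\mathbf{a}\star\mathbf{b} = \bigwedge_{a\in\operatorname{pro}\mathbf{a}}\ \bigvee_{b\in\mathbf{b}}\ (a\star b)$$

if a is improper and b is proper. Their validity can be verified by direct calculation.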

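All the formulas written above, starting with (14), can be combined into one as follows. We introduce the so-called conditional operation of taking the extremum with respect to inclusion:

$$\bigvee\nolimits^{\mathbf{a}} := \begin{cases} \bigvee, & \text{if } \mathbf{a} \text{ is proper},\\ \bigwedge, & \text{otherwise}. \end{cases} \tag{16}$$

This is an operation that depends on the interval parameter a standing as its upper index. The operation is either maximum or minimum with respect to the inclusion “$\subseteq$”, depending on whether a is proper or improper. This extremum is taken over all $a$ from the proper projection of the interval a. Note that any interval can be represented as

$$\mathbf{a} = \bigvee\nolimits^{\mathbf{a}}_{a\in\operatorname{pro}\mathbf{a}}\ a$$

(instead of $a$, any letter can be used in the formula). Anyway, for $\star\in\{+,-,\cdot,/\}$, the following relation is valid:

$$\mathbf{a}\star\mathbf{b} = \bigvee\nolimits^{\mathbf{a}}_{a\in\operatorname{pro}\mathbf{a}}\ \bigvee\nolimits^{\mathbf{b}}_{b\in\operatorname{pro}\mathbf{b}}\ (a\star b). \tag{17}$$

This representation, first obtained in [14], expresses the relationship between the result of the interval operation $\mathbf{a}\star\mathbf{b}$ and the results of the point operations $a\star b$ for $a\in\operatorname{pro}\mathbf{a}$ and $b\in\operatorname{pro}\mathbf{b}$. It can be taken as a basis for the definition of arithmetic operations in the complete interval arithmetic $\mathbb{KR}$.

It is not hard to derive, from (17), the monotonicity of the interval arithmetic operations with respect to inclusion.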

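In order to write out explicit formulas for multiplication in complete interval arithmetic, we select the following subsets in $\mathbb{KR}$:

$\mathcal{P} = \{\,\mathbf{a}\in\mathbb{KR} \mid \underline{\mathbf{a}}\ge 0 \text{ and } \overline{\mathbf{a}}\ge 0\,\}$: nonnegative intervals,
$\mathcal{Z} = \{\,\mathbf{a}\in\mathbb{KR} \mid \underline{\mathbf{a}}\le 0\le\overline{\mathbf{a}}\,\}$: zero-containing intervals,
$-\mathcal{P} = \{\,\mathbf{a}\in\mathbb{KR} \mid -\mathbf{a}\in\mathcal{P}\,\}$: nonpositive intervals,
$\operatorname{dual}\mathcal{Z} = \{\,\mathbf{a}\in\mathbb{KR} \mid \operatorname{dual}\mathbf{a}\in\mathcal{Z}\,\}$: intervals contained in zero.

Overall, $\mathbb{KR} = \mathcal{P}\cup\mathcal{Z}\cup(-\mathcal{P})\cup(\operatorname{dual}\mathcal{Z})$. Then the multiplication in Kaucher interval arithmetic can be described by Table 1 [16], the cells of which are obtained by writing out in detail the particular cases of applying formula (17) and our previous results. A remarkable fact is that this table extends the similar table for multiplication in classical interval arithmetic by one more row and one more column, which correspond to operands from the set $\operatorname{dual}\mathcal{Z}$.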

Table 1: Multiplication in Kaucher complete interval arithmetic
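Since the body of Table 1 is not reproduced here, the following Python sketch gives one possible case-by-case implementation of Kaucher multiplication. The sixteen cells are an assumption of this sketch: they were re-derived from the min/max representation of the arithmetic operations with respect to inclusion, not copied from [16]. Intervals are `(lower, upper)` tuples, and the names `kind` and `kmul` are hypothetical.

```python
def kind(a):
    """Classify (lo, hi) as nonnegative, nonpositive, zero-containing
    or contained in zero. Boundary cases may belong to two classes;
    the formulas agree on the overlaps."""
    lo, hi = a
    if lo >= 0 and hi >= 0:
        return "P"
    if lo <= 0 and hi <= 0:
        return "-P"
    return "Z" if lo < hi else "dZ"

def kmul(a, b):
    """Kaucher product of a and b by case analysis (assumed cells)."""
    al, ah = a
    bl, bh = b
    table = {
        ("P", "P"): (al * bl, ah * bh),
        ("P", "Z"): (ah * bl, ah * bh),
        ("P", "-P"): (ah * bl, al * bh),
        ("P", "dZ"): (al * bl, al * bh),
        ("Z", "P"): (al * bh, ah * bh),
        ("Z", "Z"): (min(al * bh, ah * bl), max(al * bl, ah * bh)),
        ("Z", "-P"): (ah * bl, al * bl),
        ("Z", "dZ"): (0, 0),
        ("-P", "P"): (al * bh, ah * bl),
        ("-P", "Z"): (al * bh, al * bl),
        ("-P", "-P"): (ah * bh, al * bl),
        ("-P", "dZ"): (ah * bh, ah * bl),
        ("dZ", "P"): (al * bl, ah * bl),
        ("dZ", "Z"): (0, 0),
        ("dZ", "-P"): (ah * bh, al * bh),
        ("dZ", "dZ"): (max(al * bl, ah * bh), min(al * bh, ah * bl)),
    }
    return table[(kind(a), kind(b))]

print(kmul((1, 2), (3, 4)))    # (3, 8)
print(kmul((-1, 2), (5, -3)))  # (0, 0): a nontrivial zero divisor
print(kmul((2, -1), (5, -3)))  # (10, -6)
```

Building the whole dictionary on every call is wasteful but keeps the sketch close to the tabular form; a production implementation would branch on the two classes instead.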

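As we see, the multiplication in Kaucher arithmetic admits non-trivial zero divisors. For example, $[-1,2]\cdot[5,-3]=0$. The interval multiplication in Kaucher arithmetic turns out to be commutative and associative [9, 15, 16], but the multiplicative group in $\mathbb{KR}$ is formed only by intervals a for which $\underline{\mathbf{a}}\,\overline{\mathbf{a}}>0$ (or, equivalently, $0\notin\operatorname{pro}\mathbf{a}$), because no wider subset of $\mathbb{KR}$ satisfies the so-called “cancellation law”

$$\mathbf{a}\cdot\mathbf{c} = \mathbf{b}\cdot\mathbf{c} \quad\Longrightarrow\quad \mathbf{a}=\mathbf{b}.$$

This is the algebraic condition under which a commutative semigroup can be embedded into a group.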

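Therefore, for any interval a of $\mathbb{KR}$ that does not contain zero and is not contained in zero itself, there is a unique inverse element with respect to multiplication, which we will denote by “$\operatorname{inv}\mathbf{a}$”. From the equality $\mathbf{a}\cdot\operatorname{inv}\mathbf{a}=1$, it follows that

$$\operatorname{inv}\mathbf{a} = [\,1/\underline{\mathbf{a}},\ 1/\overline{\mathbf{a}}\,]. \tag{18}$$

For brevity, we will denote the inverse operation of the multiplication, the so-called internal (algebraic) division in $\mathbb{KR}$, by “$\oslash$”, so that $\mathbf{a}\oslash\mathbf{b} = \mathbf{a}\cdot\operatorname{inv}\mathbf{b}$.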

The above table of explicit formulas for multiplication in the complete interval arithmetic is convenient for computer implementation, but rather cumbersome and hard to survey. In some cases, it makes sense to resort to other formulas for interval multiplication in $\mathbb{KR}$, which were proposed by A.V. Lakeyev in [21, 23]. Recall the following definition (see, for example, [3, 5]):

Definition 3 For a real number $a$, the values

$$a^+ = \max\{a, 0\}, \qquad a^- = \max\{-a, 0\}$$

are called the positive part and the negative part of $a$ respectively.

Then $a = a^+ - a^-$ and $|a| = a^+ + a^-$.

Proposition 1 (Lakeyev formulas)  For any intervals $\mathbf{a}$, $\mathbf{b}\in\mathbb{KR}$, there holds

If one of the intervals a, b is proper, then

(19)

This formula is not simplified if we additionally know that both intervals a, b are proper.

If one of the intervals a, b is proper and the other is improper, then

(20)


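The advantage of the Lakeyev formulas is their global character. They give a single and uniform expression for the interval product over the entire domain of a and b, whereas the representation via Table 1 has a piecewise character. The piecewise form is inconvenient in the study of interval functions “as a whole”, in particular, in the study of differentiability and its analogues, in the calculation and evaluation of generalized derivatives, etc.

Subtraction and division in the arithmetic $\mathbb{KR}$ are defined in the same way as in classical interval arithmetic:

$$\mathbf{a}-\mathbf{b} = \mathbf{a}+(-1)\cdot\mathbf{b}, \qquad \mathbf{a}/\mathbf{b} = \mathbf{a}\cdot[\,1/\overline{\mathbf{b}},\ 1/\underline{\mathbf{b}}\,] \quad\text{for } 0\notin\operatorname{pro}\mathbf{b}.$$

Similar to its classical predecessor, all operations of the complete interval arithmetic are inclusion monotone, i. e. monotone with respect to the partial order (10): if $\mathbf{a}\subseteq\mathbf{a}'$ and $\mathbf{b}\subseteq\mathbf{b}'$, then

$$\mathbf{a}\star\mathbf{b} \subseteq \mathbf{a}'\star\mathbf{b}'$$

for any arithmetic operation $\star\in\{+,-,\cdot,/\}$. This follows from their definition according to formula (17).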

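The relationship between addition and multiplication in Kaucher arithmetic is expressed by the following inclusions:

$$\mathbf{a}\cdot(\mathbf{b}+\mathbf{c}) \subseteq \mathbf{a}\cdot\mathbf{b}+\mathbf{a}\cdot\mathbf{c} \quad\text{if } \mathbf{a} \text{ is proper} \tag{21}$$
(subdistributivity),
$$\mathbf{a}\cdot(\mathbf{b}+\mathbf{c}) \supseteq \mathbf{a}\cdot\mathbf{b}+\mathbf{a}\cdot\mathbf{c} \quad\text{if } \mathbf{a} \text{ is improper} \tag{22}$$
(superdistributivity).

These inclusions turn into exact equalities when, in particular, a shrinks to a point, that is, $\mathbf{a}=a\in\mathbb{R}$:

$$a\cdot(\mathbf{b}+\mathbf{c}) = a\cdot\mathbf{b}+a\cdot\mathbf{c}. \tag{23}$$

Another important case of distributivity is the case when the signs of the intervals b and c coincide with each other:

$$\mathbf{a}\cdot(\mathbf{b}+\mathbf{c}) = \mathbf{a}\cdot\mathbf{b}+\mathbf{a}\cdot\mathbf{c} \quad\text{if } \operatorname{sgn}\mathbf{b}=\operatorname{sgn}\mathbf{c}. \tag{24}$$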

For a complete description of all cases of distributivity, E. Gardeñes et al. introduced in [9] the concept of distributivity areas defined by an interval a. Membership of the operands in these distributivity areas leads to equalities instead of the inclusions (21)–(22). Later, S. Markov and his co-workers, as a classification of various particular cases of distributivity in $\mathbb{KR}$, proposed a “generalized distributive law” [6, 26, 31], covering a large number of situations. Of the variety of cases considered in these articles, we will further need the following relation:

(25)

if the intervals , c and have definite signs and .

3 Theory

In this section, we consider the basic theoretical facts concerning interval linear systems of equations [36, 37, 38, 39].

Despite the simple structure of the system of equations (1), elimination methods, symbolic transformations, etc., can be used for its solution only in very particular situations. The reason is the insufficient algebraic properties of interval arithmetics. The absence of full distributivity in Kaucher arithmetic makes it generally impossible to perform even such a simple operation as collecting similar terms. It is for this reason that the methods considered in our work are essentially numerical. An important theoretical result on formal solutions to interval linear systems was obtained by A.V. Lakeyev, who showed the NP-hardness (intractability) of computing formal (algebraic) solutions to interval systems of linear equations in general form [21, 22].

To find formal solutions of interval linear systems of equations, several numerical methods were proposed, of which the subdifferential Newton method is the most efficient (see [36, 38, 40], and its computer implementations are freely available at [41]). The goal of this paper is the development of stationary single-step iterative methods for computing formal (algebraic) solutions to interval systems of linear algebraic equations. The need to build such methods is due to a number of facts. Despite very high efficacy of the subdifferential Newton method in practice, its justification for the most general case faces a number of difficulties. In addition to calculating the solution, the methods of the type we are going to construct provide also a proof of the uniqueness of the found solution, which is not provided by the subdifferential Newton method. Finally, another reason for the need to develop single-step stationary iterative methods is the fact that they are able to solve interval systems of equations that are not linear in form, for example,

where is an interval function of the unknown variable .

To date, computational mathematics has accumulated a large arsenal of theoretical approaches and efficient practical algorithms for solving a wide variety of equations and systems of equations. Can we use any of these methods? Is it possible to apply to our problem any of the traditional numerical methods for solving “operator equations”? Yes, but with some reservations and modifications.

The majority of traditional methods for the solution of equations and systems of equations relate to operator equations in linear spaces. Formally, these methods are not applicable to the problem of computing the formal (algebraic) solutions of system (1), since $\mathbb{IR}^n$ and $\mathbb{KR}^n$ are not linear spaces (see [38]). However, we can easily circumvent this difficulty by embedding the space $\mathbb{IR}^n$ or $\mathbb{KR}^n$ into the usual well-studied Euclidean space $\mathbb{R}^{2n}$.

As we have already noted, the problem of finding formal solutions to interval equations is, in essence, the traditional mathematical problem of solving some equations, and most of classical numerical analysis is devoted to solving such problems. But the peculiarity of our situation is that the basic set $\mathbb{KR}^n$, on which the equations to be solved are considered, is not a linear space at all: the lack of distributivity in interval arithmetic leads to a violation of the axiom of linear space that requires the fulfillment of the identity

$$(\mu+\nu)\,\mathbf{x} = \mu\,\mathbf{x}+\nu\,\mathbf{x}$$

for all $\mathbf{x}\in\mathbb{KR}^n$ and any scalars $\mu,\nu\in\mathbb{R}$. Thus, most of the existing approaches to the study of operator equations and to the calculation of their solutions are not directly applicable to our problem.

Moreover, remaining within the interval space $\mathbb{KR}^n$, we will not be able to perform a theoretical analysis of the situation and understand some phenomena. For example, the point matrix

$$\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \tag{26}$$

is regular (non-singular) in the sense of classical linear algebra, but multiplication by this matrix in $\mathbb{KR}^2$ can nullify a non-zero vector:

$$\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \cdot \begin{pmatrix} [-1,1] \\ [1,-1] \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

What is the reason? It is hardly possible to detect it from within the interval space, which is essentially non-linear. So, there is an urgent need to transfer our considerations to a certain linear space, which we denote by $U$ for generality. We also assume that a topological structure consistent with its linear structure is defined on $U$.
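The phenomenon of a non-singular point matrix nullifying a non-zero interval vector is easy to reproduce numerically. The sketch below is an illustration under stated assumptions: the matrix `Q` and the vector `x` are chosen for this example, intervals are `(lower, upper)` tuples, and multiplication of an interval by a real number swaps the endpoints for negative factors.

```python
def scal(mu, a):
    # Multiplication of a Kaucher interval by a real number.
    if mu >= 0:
        return (mu * a[0], mu * a[1])
    return (mu * a[1], mu * a[0])

def add(a, b):
    # Endpoint-wise addition.
    return (a[0] + b[0], a[1] + b[1])

def matvec(Q, x):
    # Product of a point matrix Q and a vector x of Kaucher intervals.
    result = []
    for row in Q:
        acc = (0, 0)
        for q, xi in zip(row, x):
            acc = add(acc, scal(q, xi))
        result.append(acc)
    return result

Q = [[1, 1], [-1, 1]]            # determinant 2, so Q is non-singular
x = [(-1, 1), (1, -1)]           # a non-zero interval vector
print(matvec(Q, x))              # [(0, 0), (0, 0)]
```

The second row works because multiplying `(-1, 1)` by `-1` returns `(-1, 1)` again, so the negative coefficient has no effect on a balanced interval.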

From an abstract mathematical point of view, we have two different spaces, the interval space $\mathbb{KR}^n$ and the linear space $U$, on which essentially different algebraic structures are given. How is it possible to “jump over” from the first one to the second? We are going to do it in a way similar to a change of variables, which is defined in the following subsection and is called immersion.

3.1 Definition and main properties

To transfer our considerations from the interval space to a linear one, we should build some mapping

$$\iota:\ \mathbb{KR}^n \to U,$$

an embedding of the interval space $\mathbb{KR}^n$ into the linear space $U$. It must be bijective (a one-to-one mapping “onto”) in order to correctly restore the interval preimage from its image in $U$, and vice versa. Further, it is easy to understand that any bijection $\iota:\mathbb{KR}^n\to U$ also generates a bijection from the set of all mappings of $\mathbb{KR}^n$ into itself onto the set of all mappings of $U$ into itself. More precisely, each $\varphi:\mathbb{KR}^n\to\mathbb{KR}^n$ is associated with a uniquely determined mapping

$$\iota\circ\varphi\circ\iota^{-1}:\ U\to U, \tag{27}$$

where “$\circ$” denotes a composition of mappings. Thus, we can consider mappings of linear spaces as “exact copies” of interval mappings.

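Definition 4 For an interval mapping $\varphi:\mathbb{KR}^n\to\mathbb{KR}^n$ and a fixed embedding $\iota:\mathbb{KR}^n\to U$, we will refer to the mapping $\iota\circ\varphi\circ\iota^{-1}$ of the linear space $U$ into itself, which is determined by (27), as the induced mapping for $\varphi$ (or, in expanded form, $\iota$-induced).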

Visually, the situation is represented by a commutative diagram in Fig. 1.

[Commutative diagram: the original mapping acts between interval spaces, the induced mapping acts between linear spaces.]

Figure 1: How an immersion generates an induced mapping.

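The properties of the mappings $\varphi$ and $\iota\circ\varphi\circ\iota^{-1}$ are closely related, so instead of studying $\varphi$, one can investigate the induced mapping $\iota\circ\varphi\circ\iota^{-1}$. Moreover, we can replace the problem of solving the equation $\varphi(x)=0$ in $\mathbb{KR}^n$ with the problem of solving the equation $(\iota\circ\varphi\circ\iota^{-1})(y)=\iota(0)$ in the linear space $U$, coming to a situation more familiar to modern numerical analysis.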

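Definition 5 Let the equation be given in the interval space $\mathbb{KR}^n$,

$$\varphi(x) = \psi(x), \tag{28}$$

where $\varphi,\psi:\mathbb{KR}^n\to\mathbb{KR}^n$ are some mappings, and an embedding $\iota:\mathbb{KR}^n\to U$ is fixed. We shall call the induced equation for (28) such an equation

$$\Phi(y) = \Psi(y)$$

in the linear space $U$, that $\Phi$ and $\Psi$ are induced mappings for $\varphi$ and $\psi$ respectively, i. e. $\Phi=\iota\circ\varphi\circ\iota^{-1}$ and $\Psi=\iota\circ\psi\circ\iota^{-1}$.

Thus, the initial interval equation

$$\varphi(x) = \psi(x) \tag{29}$$

has a formal solution $x^*$ if and only if the induced equation

$$\Phi(y) = \Psi(y)$$

has a solution $y^*$ in the linear space. In this case, the desired formal interval solution for (29) is uniquely reconstructed from $y^*$ by the relation $x^*=\iota^{-1}(y^*)$.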

We are interested in the specific situation with interval linear equations (1)–(2). We can change the original problem — finding solution of the equation

such that

to the problem of solving equations

in the linear space with induced mappings

defined as

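A more general consideration. Since $\iota$ and $\iota^{-1}$ are bijections, the invertibility of any mapping $\varphi$ on the interval space $\mathbb{KR}^n$ is equivalent to the invertibility of the $\iota$-induced map $\iota\circ\varphi\circ\iota^{-1}$ acting on the linear space $U$. Moreover,

$$(\iota\circ\varphi\circ\iota^{-1})^{-1} = \iota\circ\varphi^{-1}\circ\iota^{-1}. \tag{30}$$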

The main question concerning the construction of the embedding of the interval space into the linear space is to choose a reasonable compromise between the simplicity of this mapping and a convenient form of the induced mappings (27). Among all bijective embeddings $\iota:\mathbb{KR}^n\to U$, it makes sense to select special embeddings that

  • preserve the additive algebraic structure of $\mathbb{KR}^n$,
    i. e. such that $\iota(\mathbf{x}+\mathbf{y})=\iota(\mathbf{x})+\iota(\mathbf{y})$ for any $\mathbf{x},\mathbf{y}\in\mathbb{KR}^n$,

  • preserve the topological structure of $\mathbb{KR}^n$,
    i. e. such that both the mapping $\iota$ itself
    and its inverse $\iota^{-1}$ are continuous.

Embeddings satisfying the two conditions above will be called immersions of the interval space $\mathbb{KR}^n$ into the linear space $U$. Thus, formally we accept the following

Definition 6 [36] Let $U$ be a linear space. A bijective mapping $\iota:\mathbb{KR}^n\to U$ will be called an immersion of $\mathbb{KR}^n$ in $U$ if it satisfies the following properties:
   
(1) $\iota$ is an isomorphism of the additive groups $\mathbb{KR}^n$ and $U$,
   
(2) $\iota$ is a homeomorphism of the topological spaces $\mathbb{KR}^n$ and $U$.

For example, if the interval $\mathbf{x}=[\underline{\mathbf{x}},\overline{\mathbf{x}}]$ is matched with the pair of numbers $(\underline{\mathbf{x}},\overline{\mathbf{x}})$, i. e. its endpoints, “forgetting” about their interval sense, then the mapping $\mathbb{KR}\to\mathbb{R}^2$ is an immersion. This example is typical in some sense, since, by involving dimension considerations (see, e. g., [7]), it is easy to show that Definition 6 determines the linear space $U$ uniquely: $U$ must be the Euclidean space $\mathbb{R}^{2n}$. This fact is in good agreement with our analytical intuition, and we do not give its rigorous justification here, so as not to overload the text of the article. The purpose of this preparatory section is the study of the simplest properties of immersions that we will need in the future, when solving the induced equations.
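The endpoint-pair example can be sketched for interval vectors as follows. This is only an illustration of the idea: the functions `immerse` and `restore` are hypothetical names, interval vectors are lists of `(lower, upper)` tuples, and the image is a flat tuple of length 2n.

```python
def immerse(x):
    # Flatten a vector of Kaucher intervals into a point of R^(2n).
    return tuple(e for iv in x for e in iv)

def restore(y):
    # Inverse mapping: regroup consecutive coordinates into intervals.
    return [(y[i], y[i + 1]) for i in range(0, len(y), 2)]

def add_vec(x, z):
    # Component-wise addition of two interval vectors.
    return [(a[0] + b[0], a[1] + b[1]) for a, b in zip(x, z)]

x = [(1, 3), (4, -2)]
z = [(0, 5), (-1, -1)]

lhs = immerse(add_vec(x, z))
rhs = tuple(p + q for p, q in zip(immerse(x), immerse(z)))
print(lhs == rhs)               # True: addition is preserved
print(restore(immerse(x)) == x)  # True: the mapping is invertible
```

Continuity of the mapping and of its inverse is not checked here; it is immediate for this coordinate-wise map.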

Denote by $\mathbf{0}$ and $0$ the zero vectors in the spaces $\mathbb{KR}^n$ and $U$ respectively. It immediately follows from Definition 6 that, for any immersion $\iota$, we have

$$\iota(\mathbf{0}) = 0. \tag{31}$$

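At the same time,

$$\iota(\operatorname{opp}\mathbf{x}) = -\iota(\mathbf{x}).$$

In addition, the inverse $\iota^{-1}$ of the immersion map also satisfies conditions similar to (1)–(2) from Definition 6, and

$$\iota^{-1}(0) = \mathbf{0}, \qquad \iota^{-1}(-y) = \operatorname{opp}\iota^{-1}(y). \tag{32}$$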

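Proposition 2 The immersion $\iota:\mathbb{KR}^n\to U$ is a positive-homogeneous map, i. e.

$$\iota(\lambda\mathbf{x}) = \lambda\,\iota(\mathbf{x}) \quad\text{for all } \mathbf{x}\in\mathbb{KR}^n \text{ and } \lambda\ge 0.$$

The mapping $\iota^{-1}$, inverse to an immersion, is also positively homogeneous.

Proof is standard. Let $\mathbf{x}\in\mathbb{KR}^n$. If $\lambda$ is a natural number, then

$$\iota(\lambda\mathbf{x}) = \iota(\underbrace{\mathbf{x}+\dots+\mathbf{x}}_{\lambda}) = \underbrace{\iota(\mathbf{x})+\dots+\iota(\mathbf{x})}_{\lambda} = \lambda\,\iota(\mathbf{x}).$$

If $\lambda=1/q$ for some natural $q$, then, from

$$\iota(\mathbf{x}) = \iota\bigl(q\cdot\tfrac{1}{q}\mathbf{x}\bigr) = q\,\iota\bigl(\tfrac{1}{q}\mathbf{x}\bigr),$$

it follows that

$$\iota\bigl(\tfrac{1}{q}\mathbf{x}\bigr) = \tfrac{1}{q}\,\iota(\mathbf{x}).$$

If $\lambda=p/q$ for natural numbers $p$ and $q$, then, using the already considered cases, we obtain

$$\iota\bigl(\tfrac{p}{q}\mathbf{x}\bigr) = p\,\iota\bigl(\tfrac{1}{q}\mathbf{x}\bigr) = \tfrac{p}{q}\,\iota(\mathbf{x}).$$

Hence, the equality $\iota(\lambda\mathbf{x})=\lambda\,\iota(\mathbf{x})$ is valid for all nonnegative rational numbers $\lambda$. Extending it to all nonnegative real numbers $\lambda$ can be done by passing to the limit, using the continuity of the immersion $\iota$.

The proof for a mapping which is inverse to an immersion is performed in a completely similar way.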
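Positive homogeneity, and its failure for negative factors, can be observed on the endpoint-pair immersion of a single interval. The sketch below assumes that particular immersion and the usual rule for multiplying an interval by a real number; the names `scal` and `immerse` are hypothetical.

```python
def scal(mu, a):
    # Multiplication of a Kaucher interval by a real number.
    if mu >= 0:
        return (mu * a[0], mu * a[1])
    return (mu * a[1], mu * a[0])

def immerse(a):
    # Endpoint-pair immersion of one interval into R^2.
    return (a[0], a[1])

def times(mu, y):
    # Ordinary scalar multiplication in R^2.
    return tuple(mu * t for t in y)

a = (1, 3)
print(immerse(scal(2, a)) == times(2, immerse(a)))      # True
print(immerse(scal(0.5, a)) == times(0.5, immerse(a)))  # True
print(immerse(scal(-1, a)), times(-1, immerse(a)))      # (-3, -1) (-1, -3)
```

The last line shows why the proposition is stated only for nonnegative factors: for a negative factor the interval product swaps the endpoints, while the linear space does not.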

Proposition 3 If $\iota:\mathbb{KR}^n\to U$ is an immersion, and $T:U\to U$ is a non-singular linear transformation of the space $U$, then $T\circ\iota$ is also an immersion.

Conversely, any other immersion $\kappa:\mathbb{KR}^n\to U$ can be represented as $\kappa=T\circ\iota$ for some non-singular linear transformation $T:U\to U$.

Proof. The first part of the Proposition is justified trivially.

To prove the second part, we consider the mapping $\kappa\circ\iota^{-1}:U\to U$. Being a composition of two isomorphisms, it is an automorphism of the additive group of the linear space $U$, and by virtue of Proposition 2 this map is also positively homogeneous. Also, for any