 # Max-Min and Min-Max universally yield Gumbel

"A chain is only as strong as its weakest link" says the proverb. But what about a collection of statistically identical chains: How long till all chains fail? The answer to this question is given by the Max-Min of a random matrix whose (i,j) entry is the failure time of link j of chain i: take the minimum of each row, and then the maximum of the rows' minima. The corresponding Min-Max is obtained by taking the maximum of each column, and then the minimum of the columns' maxima. The Min-Max applies to the storage of critical data. Indeed, consider multiple copies (backups) of a set of critical data items, and consider the ( i,j) matrix entry to be the time at which item j on copy i is lost; then, the Min-Max is the time at which the first critical data item is lost. In this paper we establish that the Max-Min and Min-Max of large random matrices are universally governed by asymptotic Gumbel statistics. We further establish that the domains of attraction of the asymptotic Gumbel statistics are effectively all-encompassing. Also, we show how the asymptotic Gumbel statistics can be applied to the design of large systems.



## I Introduction

Extreme Value Theory (EVT) is a branch of probability theory that focuses on extreme-value statistics such as maxima and minima [Gal]-[HF]. EVT has major applications in science and engineering [Cas]-[BGS]; examples range from insurance to finance, and from hydrology to computer vision [RT]-[Sch]. At the core of EVT stands its fundamental theorem, the Fisher-Tippett-Gnedenko theorem [FT]-[Gne], which establishes the three extreme-value laws: Weibull [Wei1]-[Wei2], Frechet [Fre], and Gumbel [Gum].

The fundamental theorem of EVT applies to ensembles of independent and identically distributed (IID) real-valued random variables, and is described as follows [BGT]. Consider an ensemble whose $n$ components are IID copies of a general real-valued random variable, the input. Further consider the ensemble’s maximum $M_n$, and an affine scaling of this maximum:

$$\tilde{M}_n = s_n \cdot (M_n - \delta_n), \tag{1}$$

where $s_n$ is a positive scale parameter, and where $\delta_n$ is a real location parameter. The fundamental theorem of EVT explores the convergence in law (as $n \to \infty$) of the scaled maximum $\tilde{M}_n$ to a non-trivial limiting random variable $L$.

Firstly, the fundamental theorem determines its admissible ‘inputs’: the classes of random variables that yield non-trivial limits $L$. Secondly, given an admissible input, the fundamental theorem specifies the adequate scale parameter $s_n$ and location parameter $\delta_n$. Thirdly, as noted above, the fundamental theorem establishes that its ‘outputs’ are the three extreme-value laws: the statistics of the non-trivial limits $L$ are either Weibull, Frechet, or Gumbel. The domain of attraction of each extreme-value law is the class of inputs yielding, respectively, each output law.

The fundamental theorem of EVT yields asymptotic approximations for the maxima of large ensembles of IID real-valued random variables. Indeed, consider the scaled maximum $\tilde{M}_n$ to converge in law (as $n \to \infty$) to a non-trivial limit $L$. Then, for a given large ensemble ($n \gg 1$), the ensemble’s maximum $M_n$ admits the following extreme-value asymptotic approximation in law:

$$M_n \simeq L_* := \delta_n + \frac{1}{s_n} \cdot L. \tag{2}$$

The extreme-value asymptotic approximation of Eq. (2) has the following meaning: the deterministic asymptotic approximation of the ensemble’s maximum $M_n$ is the location parameter $\delta_n$; the magnitude of the random fluctuations about the deterministic asymptotic approximation is $1/s_n$, the inverse of the scale parameter $s_n$; and the statistics of the random fluctuations about the deterministic asymptotic approximation are those of the limit $L$, which is governed by one of the three extreme-value laws.

The three extreme-value laws are universal in the sense that they are the only non-trivial limiting statistics obtainable (as $n \to \infty$) from the scaled maximum $\tilde{M}_n$. However, universality holds neither for the corresponding domains of attraction, nor for the corresponding scale parameter $s_n$ and location parameter $\delta_n$. Indeed, each extreme-value law has a very specific and rather narrow domain of attraction [BGT]. Also, for any given admissible input, the scale parameter $s_n$ and location parameter $\delta_n$ are ‘custom tailored’ in a very precise manner [BGT].

In essence, the fundamental theorem of EVT considers a random-vector setting: the maxima of what can be perceived as vector-structured ensembles of IID real-valued random variables. This paper elevates from the random-vector setting to the following random-matrix setting: the Max-Min and the Min-Max of matrix-structured ensembles of IID real-valued random variables. The Max-Min is obtained by taking the minimum of each matrix-row, and then taking the maximum of the rows’ minima. The Min-Max is obtained by taking the maximum of each matrix-column, and then taking the minimum of the columns’ maxima.
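The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `max_min` and `min_max` and the small example matrix are assumptions introduced here, not part of the paper.

```python
from typing import Sequence


def max_min(matrix: Sequence[Sequence[float]]) -> float:
    """Take the minimum of each row, then the maximum of the row minima."""
    return max(min(row) for row in matrix)


def min_max(matrix: Sequence[Sequence[float]]) -> float:
    """Take the maximum of each column, then the minimum of the column maxima."""
    return min(max(column) for column in zip(*matrix))


# Hypothetical 2-by-3 matrix (illustrative values only).
example = [
    [3.0, 1.0, 4.0],
    [2.0, 5.0, 6.0],
]
print(max_min(example))  # row minima are 1.0 and 2.0
print(min_max(example))  # column maxima are 3.0, 5.0 and 6.0
```

For any matrix the Max-Min never exceeds the Min-Max, which offers a quick sanity check on an implementation.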

The Max-Min and the Min-Max values of matrices emerge naturally in science and engineering. Perhaps the best known example of the Max-Min and the Min-Max comes from game theory [FuT]-[MSZ]. Indeed, consider a player that has a set of admissible strategies, and that faces a set of viable scenarios. A payoff matrix determines the player’s gains (or, alternatively, losses) for each strategy it applies and for each scenario it encounters. The player’s goal is to optimize with respect to the worst-case scenario. Hence, in the case of gains, the player goes Max-Min: calculate the minimal gain of each strategy over all scenarios, and then pick the strategy that yields the largest minimal gain. And, in the case of losses, the player goes Min-Max: calculate the maximal loss of each strategy over all scenarios, and then pick the strategy that yields the smallest maximal loss. In the field of game theory the Max-Min and the Min-Max values appear also in the context of game-searching procedures on trees [Pea]-[KDN].

Architectural illustrations of the Max-Min and the Min-Max values come from reliability engineering [BP]-[Fin], where one is interested in calculating the failure time (or the failure load) of a given system. Two important system architectures are the so-called “series-parallel” and “parallel-series” architectures [Kolo; Kolo1; Kolo2]. In the series-parallel architecture a system is a parallel array of sub-systems, and each sub-system is a serial array of components. In the parallel-series architecture a system is a serial array of sub-systems, and each sub-system is a parallel array of components. The Max-Min and the Min-Max values correspond, respectively, to the failure times (or the failure loads) of systems with series-parallel and with parallel-series architectures [Kolo; Kolo1; Kolo2].

There are several limit-law results – counterparts of the fundamental theorem of EVT – for the Max-Min and the Min-Max of random matrices (with IID entries). The pioneering mathematical results were presented by Chernoff and Teicher [CT], reliability-engineering results were presented by Kolowrocki [Kolo]-[Kolo2], and relatively recent reliability-engineering results were presented by Reis and Castro [RC]. All these limit-law results use affine scalings – similar to that of Eq. (1) – for the Max-Min and the Min-Max. Also, all these limit-law results employ asymptotic couplings of the dimensions of the random matrices (as these dimensions are taken to infinity).

Chernoff and Teicher established that the limit-laws for the Max-Min and the Min-Max are the three extreme-value laws [CT]: Weibull, Frechet, and Gumbel. Kolowrocki investigated limit-laws for the Max-Min and the Min-Max in the context of systems with the aforementioned series-parallel and parallel-series architectures [Kolo], [Kolo1]-[Kolo2]. Considering the Max-Min, and applying the fundamental theorem of EVT iteratively – first to the minimum of each matrix-row, and then to the maximum of the rows’ minima – Reis and Castro established a Gumbel limit-law [RC]; this limit-law applies to matrix entries that belong to sub-sets of the domains of attraction of the three extreme-value laws.

For the results of [CT]-[RC] – as in the case of the fundamental theorem of EVT – universality holds neither with regard to the domains of attraction, nor with regard to the affine scalings. Also, for these results, universality does not hold with regard to the asymptotic couplings of the dimensions of the random matrices. Moreover, as the results of [CT]-[RC] involve very intricate mathematical conditions and schemes, their practical implementation is extremely challenging.

The limit-law results of [CT]-[RC] are derived via an ‘EVT machinery’, i.e. methods similar to the Fisher-Tippett-Gnedenko theorem, together with other EVT results (e.g. [BdH]). In this paper we take an altogether different approach: a ‘bedrock’ Poisson-process method. Specifically, we dive down to the bedrock level of the rows’ minima and the columns’ maxima (of random matrices with IID entries), and establish Poisson-process limit-laws for these minima and maxima. Then, elevating back from the bedrock level to the Max-Min and the Min-Max, we establish Gumbel limit-laws for these values.

The limit-laws presented here have the following key features. Firstly, their domain of attraction is vast: the limit-laws hold whenever the entries’ distribution has a density. Secondly, they use affine scalings similar to that of Eq. (1) with: a location parameter that is tunable (it can be set as we wish within the interior of the support of the IID entries); and a scale parameter that depends on the entries’ distribution only up to a coefficient. Thirdly, their asymptotic couplings (of the dimensions of the random matrices) are geometric. Due to these features the practical implementation of the limit-laws presented here is easy and straightforward, and hence these results are highly applicable.

Figure 1 demonstrates the potency of the Gumbel limit-law for the Max-Min (see section III for the details). This figure depicts numerical simulations of the Max-Min of random matrices whose IID entries are drawn from an assortment of distributions: Exponential, Gamma, Log-Normal, Inverse-Gauss, Uniform, Weibull, Beta, Pareto, and Normal. For all these distributions, the convergence of the simulations to the theoretical prediction of the Max-Min result is evident. The MATLAB code that was used in order to generate the simulations is detailed in the Appendix; this short code shows just how easy it is to apply, in practice, the novel Gumbel limit-laws presented here.

The remainder of this paper is organized as follows. Section II presents the random-matrix setting, and the ‘bedrock’ Poisson-process limit-law for the rows’ minima. Then, section III establishes the Gumbel limit-law for the Max-Min, which is motivated by the following question: within a collection of IID chains, how long will the strongest chain hold? Section IV further establishes the counterpart Gumbel limit-law for the Min-Max, which is based on a counterpart ‘bedrock’ Poisson-process limit-law for the columns’ maxima, and which is motivated by the following question: using a collection of IID data-storage backup copies, how long can the data be stored reliably by the backup copies? Section V describes the application of the Gumbel limit-laws as approximation tools and as design tools. An in-depth discussion of the limit-laws is held in section VI. Finally, section VII concludes, and the proofs of the key results stated throughout the paper are detailed in the Appendix.

## II Bedrock

Consider a collection of $c$ chains, labeled by the index $i = 1, \ldots, c$. Each chain comprises $l$ links, and all the links are IID copies of a generic link. In this paper we take a temporal perspective and associate the failure time of the generic link with a real-valued random variable $T$. Namely, $T$ is the random time at which the generic link fails mechanically.

As the analysis to follow is probabilistic, we introduce relevant statistical notation. Denote by $F(t) = \Pr(T \le t)$ ($-\infty < t < \infty$) the distribution function of the generic failure time $T$, and by $\bar{F}(t) = \Pr(T > t)$ ($-\infty < t < \infty$) the corresponding survival function. These functions are coupled by $F(t) + \bar{F}(t) = 1$ ($-\infty < t < \infty$). The density function of the generic failure time $T$ is given by $f(t) = F'(t) = -\bar{F}'(t)$ ($-\infty < t < \infty$). In particular, this notation covers the case of a positive-valued generic failure time $T$. We note that, alternative to the temporal perspective taken here, the random variable $T$ can manifest any other real-valued quantity of interest of the generic link, e.g. its mechanical strength (in which case $T$ is positive-valued).

The following random matrix underlies the collection of chains:

$$\mathbf{T} = \begin{pmatrix} T_{1,1} & \cdots & T_{1,l} \\ \vdots & \ddots & \vdots \\ T_{c,1} & \cdots & T_{c,l} \end{pmatrix}. \tag{3}$$

The dimensions of the random matrix $\mathbf{T}$ are $c \times l$, and its entries are IID copies of the generic failure time $T$. Row $i$ of the random matrix $\mathbf{T}$ represents the links of chain $i$, and the entries of this row manifest the respective failure times of the links of chain $i$. Specifically, the entry $T_{i,j}$ is the failure time of link $j$ of chain $i$.

“A chain is only as strong as its weakest link” says the proverb. So, chain $i$ fails as soon as one of its links fails. Hence the chain’s failure time is given by the minimum of the failure times of its links:

$$\wedge_i = \min\{T_{i,1}, \ldots, T_{i,l}\} \tag{4}$$

($i = 1, \ldots, c$). Namely, the random variable $\wedge_i$ is the minimum over the entries of row $i$ of the random matrix $\mathbf{T}$.

Now, consider an arbitrary reference time $t_*$ of the generic failure time $T$, e.g. its median, its mean (in case the mean is finite), or its mode (in case the density function is unimodal). In general, the reference time $t_*$ can be any real number that satisfies two basic requirements: (i) $0 < F(t_*) < 1$, which is equivalent to $0 < \bar{F}(t_*) < 1$; and (ii) $0 < f(t_*) < \infty$. These requirements are met by all the interior points in the support of the input $T$.

With respect to the reference time $t_*$, we apply the following affine scaling to the failure time of chain $i$:

$$\tilde{\wedge}_i = l \cdot (\wedge_i - t_*) \tag{5}$$

($i = 1, \ldots, c$). Namely, in the affine scaling of Eq. (5) the chains’ common length $l$ is the positive scale parameter, and the reference time $t_*$ is the real location parameter.

Our goal is to analyze the limiting behavior of the chains’ scaled failure times in the case of a multitude of long chains: $c \gg 1$ and $l \gg 1$. To that end we set our focus on the ensemble of the chains’ scaled failure times: $\{\tilde{\wedge}_1, \ldots, \tilde{\wedge}_c\}$. Also, to that end we introduce the following asymptotic geometric coupling between the number $c$ of the chains and the common length $l$ of the chains: $c \sim 1/\bar{F}(t_*)^l$. Specifically, the asymptotic geometric coupling is given by the limit

$$\lim_{c \to \infty,\, l \to \infty} c \cdot \bar{F}(t_*)^l = 1. \tag{6}$$

With the affine scaling of Eq. (5), and the asymptotic geometric coupling of Eq. (6), we are now in position to state the following Poisson-process limit-law result.

###### Proposition 1

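The ensemble $\{\tilde{\wedge}_1, \ldots, \tilde{\wedge}_c\}$ converges in law, in the limit of Eq. (6), to a limiting ensemble that is a Poisson process over the real line with the following intensity function: $\lambda(x) = \bar{\epsilon} \exp(-\bar{\epsilon} x)$ ($-\infty < x < \infty$), where $\bar{\epsilon} = f(t_*)/\bar{F}(t_*)$.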

See the Appendix for the proof of proposition 1. Table 1 summarizes proposition 1 and its underlying setting. We now elaborate on the meaning of this proposition.

A Poisson process is a countable collection of points that are scattered randomly over its domain, according to certain Poisson-process statistics that are determined by its intensity function [Kin]-[Str]. Poisson processes are of key importance in probability theory, and their applications range from insurance and finance [EKM] to queueing systems [Wol], and from fractals [LT] to power-laws [PWPL].

In the case of the Poisson process of proposition 1 the domain is the real line ($-\infty < x < \infty$), and the intensity function is $\lambda(x) = \bar{\epsilon} \exp(-\bar{\epsilon} x)$. The points of the Poisson process of proposition 1 manifest, in the limit of Eq. (6), the chains’ scaled failure times. The informal meaning of the intensity function is the following: the probability that the infinitesimal interval $(x, x + dx)$ contains a point of the Poisson process is $\lambda(x)\,dx$, and this probability is independent of the scattering of points outside the interval $(x, x + dx)$.
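The following heuristic calculation, which is a sketch and not the Appendix proof, indicates why the scaling of Eq. (5) and the coupling of Eq. (6) lead to an exponential intensity. It assumes that the survival function $\bar{F}$ is differentiable at $t_*$ with $\bar{F}'(t_*) = -f(t_*)$, and it writes $\bar{\epsilon} = f(t_*)/\bar{F}(t_*)$ for the hazard rate at $t_*$.

```latex
\begin{align*}
\Pr\left(\tilde{\wedge}_i > x\right)
  &= \Pr\left(\wedge_i > t_* + \tfrac{x}{l}\right)
   = \bar{F}\left(t_* + \tfrac{x}{l}\right)^{l} \\
  &= \bar{F}(t_*)^{l}
     \left[1 - \frac{f(t_*)}{\bar{F}(t_*)}\,\frac{x}{l} + o\!\left(\tfrac{1}{l}\right)\right]^{l}
   \approx \bar{F}(t_*)^{l}\,\exp(-\bar{\epsilon}\,x), \\[4pt]
\mathbb{E}\left[\#\{\, i : \tilde{\wedge}_i > x \,\}\right]
  &= c \cdot \Pr\left(\tilde{\wedge}_i > x\right)
   \approx c\,\bar{F}(t_*)^{l}\,\exp(-\bar{\epsilon}\,x)
   \;\longrightarrow\; \exp(-\bar{\epsilon}\,x).
\end{align*}
```

Under these assumptions the expected number of scaled failure times above a level $x$ tends to $\exp(-\bar{\epsilon} x)$, which is the integral over $(x, \infty)$ of an intensity of the form $\bar{\epsilon} \exp(-\bar{\epsilon} x)$.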

The exponent $\bar{\epsilon}$ of the intensity function manifests the hazard rate of the generic failure time $T$ at time $t_*$ [BP]-[Fin]: $\bar{\epsilon}$ is the likelihood that the generic link will fail right after time $t_*$, conditioned on the information that the generic link did not fail up to time $t_*$. Specifically, this hazard rate is given by the following limit:

$$\bar{\epsilon} = \lim_{\Delta \to 0} \frac{1}{\Delta} \Pr(T \le t_* + \Delta \mid T > t_*). \tag{7}$$

The hazard rate is a widely applied tool in reliability engineering and in risk management [BP]-[Fin].
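As a numerical illustration of Eq. (7), the sketch below compares the finite-difference form of the limit with the ratio of density to survival function at the reference time. The Weibull input (shape 2, unit scale), the step size `delta`, and all function names are assumptions chosen for this example only.

```python
import math


def hazard_rate_ratio(density, survival, t_star):
    """Hazard rate as density over survival function at t_star."""
    return density(t_star) / survival(t_star)


def hazard_rate_finite_difference(survival, t_star, delta=1e-6):
    """Approximate (1/delta) * Pr(T <= t_star + delta | T > t_star)."""
    surviving = survival(t_star)
    return (surviving - survival(t_star + delta)) / (delta * surviving)


# Illustrative input: Weibull failure time with shape 2 and unit scale.
def weibull_survival(t):
    return math.exp(-t * t)


def weibull_density(t):
    return 2.0 * t * math.exp(-t * t)


t_star = math.sqrt(math.log(2.0))  # median of this Weibull input
print(hazard_rate_ratio(weibull_density, weibull_survival, t_star))
print(hazard_rate_finite_difference(weibull_survival, t_star))
```

For this input the hazard rate is `2 * t_star`, so the two printed values should agree to several decimal places.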

## III Max-Min

With proposition 1 at our disposal, we now set the focus on the strongest chain, i.e. the last chain standing. The failure time of the strongest chain is given by the maximum of the chains’ failure times:

$$\wedge_{\max} = \max\{\wedge_1, \ldots, \wedge_c\}. \tag{8}$$

Namely, the random variable $\wedge_{\max}$ is the Max-Min over the entries of the random matrix $\mathbf{T}$: for each and every row of the matrix pick the minimal entry, and then pick the rows’ largest minimal entry.

As with the chains’ failure times, we apply the affine scaling of Eq. (5) to the failure time of the strongest chain:

$$\tilde{\wedge}_{\max} = l \cdot (\wedge_{\max} - t_*), \tag{9}$$

where $t_*$ is the above reference time. Also, as with the ensemble $\{\tilde{\wedge}_1, \ldots, \tilde{\wedge}_c\}$, we analyze the limiting behavior of the random variable $\tilde{\wedge}_{\max}$ in the case of a multitude of long chains: $c \gg 1$ and $l \gg 1$.

Here and hereinafter $G$ denotes a ‘standard’ Gumbel random variable. Namely, $G$ is a real-valued random variable whose statistics are governed by the following ‘standard’ Gumbel distribution function:

$$\Pr(G \le t) = \exp[-\exp(-t)] \tag{10}$$

($-\infty < t < \infty$). We note that within the three extreme-value laws, Gumbel is the only law whose range is the entire real line.

The three extreme-value laws have one-to-one correspondences with the maximal points of specific Poisson processes [GinEVT]. In particular, the Gumbel extreme-value law has a one-to-one correspondence with the maximal point of the Poisson process of proposition 1. This connection leads to the following Gumbel limit-law result.

###### Proposition 2

The random variable $\tilde{\wedge}_{\max}$ converges in law, in the limit of Eq. (6), to a limiting random variable $\bar{\eta} \cdot G$, where $\bar{\eta} = \bar{F}(t_*)/f(t_*)$, and where $G$ is the ‘standard’ Gumbel random variable of Eq. (10).

See the Appendix for the proof of proposition 2. Table 2 summarizes proposition 2 and its underlying setting. In Figure 1 we use numerical simulations to demonstrate proposition 2. To that end nine different distributions of the generic failure time $T$ are considered: Exponential, Gamma, Log-Normal, Inverse-Gauss, Uniform, Weibull, Beta, Pareto, and Normal. In all nine cases, the convergence of the simulations to the theoretical prediction of proposition 2 is evident. See the Appendix for the MATLAB code that was used in order to generate the numerical simulations.
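A small Monte Carlo check of proposition 2 can be sketched in Python as follows. This is not the MATLAB code of the Appendix; it is an independent illustration for a single input, a unit-rate Exponential failure time with the median as reference time, for which the survival function at the reference time is 1/2 and the ratio of survival function to density is 1. The function names, matrix sizes, sample count, and seed are all assumptions.

```python
import math
import random


def simulate_scaled_max_min(sample_entry, t_star, n_links, n_chains, rng):
    """One draw of n_links * (Max-Min - t_star) for an n_chains-by-n_links matrix."""
    row_minima = (
        min(sample_entry(rng) for _ in range(n_links)) for _ in range(n_chains)
    )
    return n_links * (max(row_minima) - t_star)


def gumbel_cdf(t):
    """Standard Gumbel distribution function of Eq. (10)."""
    return math.exp(-math.exp(-t))


def empirical_vs_gumbel(n_samples=400, n_links=8, seed=0):
    """Return {t: (empirical CDF, Gumbel CDF)} for unit-rate Exponential entries."""
    rng = random.Random(seed)
    t_star = math.log(2.0)   # median: survival function equals 1/2 here
    n_chains = 2 ** n_links  # geometric coupling: n_chains * (1/2)**n_links == 1
    eta_bar = 1.0            # survival/density ratio of the unit-rate Exponential
    draws = [
        simulate_scaled_max_min(
            lambda r: r.expovariate(1.0), t_star, n_links, n_chains, rng
        ) / eta_bar
        for _ in range(n_samples)
    ]
    results = {}
    for t in (-1.0, 0.0, 1.0, 2.0):
        empirical = sum(d <= t for d in draws) / n_samples
        results[t] = (empirical, gumbel_cdf(t))
    return results
```

Calling `empirical_vs_gumbel()` returns pairs of empirical and theoretical probabilities that should be close, up to sampling noise of a few percent for 400 samples.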

Proposition 2 yields an asymptotic approximation for the Max-Min of large random matrices with dimensions $c \times l$. Indeed, consider the matrix-dimensions ($c \gg 1$ and $l \gg 1$) and the reference time ($t_*$) to satisfy the relation $c \cdot \bar{F}(t_*)^l \simeq 1$. Then, the Max-Min random variable $\wedge_{\max}$ admits the following Gumbel asymptotic approximation in law:

$$\wedge_{\max} \simeq G_* := t_* + \frac{\bar{\eta}}{l} \cdot G, \tag{11}$$

where $\bar{\eta}$ and $G$ are as in proposition 2.

The Gumbel asymptotic approximation of Eq. (11) has the following meaning: the deterministic asymptotic approximation of the Max-Min $\wedge_{\max}$ is the reference time $t_*$; the magnitude of the random fluctuations about the deterministic asymptotic approximation is $\bar{\eta}/l$; and the statistics of the random fluctuations about the deterministic asymptotic approximation are Gumbel. Table 3 summarizes the Gumbel asymptotic approximation of Eq. (11), and details the key statistical features of this approximation.
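To make Eq. (11) concrete, the sketch below evaluates a few summary statistics of the approximating variable $t_* + (\bar{\eta}/l) \cdot G$. It relies only on standard properties of the ‘standard’ Gumbel law (mode 0, median $-\ln\ln 2$, mean equal to the Euler-Mascheroni constant, variance $\pi^2/6$); it is not a transcription of Table 3, and the function names and numerical inputs are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel law


def max_min_gumbel_summary(t_star, eta_bar, n_links):
    """Summary statistics of t_star + (eta_bar / n_links) * G."""
    scale = eta_bar / n_links
    return {
        "mode": t_star,
        "median": t_star - scale * math.log(math.log(2.0)),
        "mean": t_star + scale * EULER_GAMMA,
        "std": scale * math.pi / math.sqrt(6.0),
    }


def max_min_gumbel_quantile(p, t_star, eta_bar, n_links):
    """Level exceeded with probability 1 - p under the approximation (0 < p < 1)."""
    return t_star - (eta_bar / n_links) * math.log(-math.log(p))


# Hypothetical inputs: reference time 1.0, eta_bar 2.0, chains of 10 links.
print(max_min_gumbel_summary(1.0, 2.0, 10))
print(max_min_gumbel_quantile(0.95, 1.0, 2.0, 10))
```

Because the fluctuation scale is `eta_bar / n_links`, doubling the chain length halves the spread of the approximation around the reference time.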

## IV Min-Max

So far we addressed the Max-Min of the random matrix $\mathbf{T}$: pick the minimum of each row ($\wedge_i$), and then pick the maximum of these minima ($\wedge_{\max}$). Analogously, we can address the Min-Max of the random matrix $\mathbf{T}$: pick the maximum of each column

$$\vee_j = \max\{T_{1,j}, \ldots, T_{c,j}\} \tag{12}$$

($j = 1, \ldots, l$), and then pick the minimum of these maxima

$$\vee_{\min} = \min\{\vee_1, \ldots, \vee_l\}. \tag{13}$$

To illustrate the Min-Max consider the collection of the aforementioned $c$ chains to be $c$ copies of a given DNA strand. The chains’ $l$ links represent $l$ sites along the DNA strand, where each of these sites codes a critical information item. The links’ generic failure time $T$ manifests the time at which the information coded by a specific DNA site is damaged; namely, the matrix entry $T_{i,j}$ is the time at which the information item $j$ on the DNA copy $i$ is damaged. The information item $j$ is lost once all its $c$ copies are damaged, and hence the failure time of the information item $j$ is given by Eq. (12). As all the information items are critical, a system-failure occurs once any of the $l$ information items is lost. Hence, the time of the system-failure is given by the Min-Max of Eq. (13).

More generally, the Min-Max applies to a setting in which $l$ critical information items are stored on $c$ different backup copies, where: $j = 1, \ldots, l$ is the index of the information items; $i = 1, \ldots, c$ is the index of the copies; and $T_{i,j}$ is the time at which the information item $j$ on the backup copy $i$ is damaged. The above ‘DNA model’ was for the sake of illustration, following the ‘chains model’ of section II, which we used in order to illustrate the Max-Min.

The analysis presented above was with regard to the Max-Min. Analogous analysis holds with regard to the Min-Max. Indeed, consider the above reference time $t_*$, and apply the following affine scaling to the failure time of the information item $j$:

$$\tilde{\vee}_j = c \cdot (\vee_j - t_*) \tag{14}$$

($j = 1, \ldots, l$). Namely, in the affine scaling of Eq. (14) the number $c$ of the copies is the positive scale parameter, and the reference time $t_*$ is the real location parameter.

Also, introduce an asymptotic geometric coupling between the number $l$ of the information items and the number $c$ of the copies: $l \sim 1/F(t_*)^c$. Specifically, the asymptotic geometric coupling is given by the limit

$$\lim_{l \to \infty,\, c \to \infty} l \cdot F(t_*)^c = 1. \tag{15}$$

With the affine scaling of Eq. (14), and the asymptotic geometric coupling of Eq. (15), we are now in position to state the following counterpart of proposition 1.

###### Proposition 3

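The ensemble $\{\tilde{\vee}_1, \ldots, \tilde{\vee}_l\}$ converges in law, in the limit of Eq. (15), to a limiting ensemble that is a Poisson process over the real line with the following intensity function: $\lambda(x) = \epsilon \exp(\epsilon x)$ ($-\infty < x < \infty$), where $\epsilon = f(t_*)/F(t_*)$.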

See the Appendix for the proof of proposition 3. Table 1 summarizes proposition 3 and its underlying setting. The notion of Poisson processes was described right after proposition 1. The exponential intensity function of proposition 3, and the Poisson process that this intensity characterizes, are most intimately related to the notion of accelerating change [AccCha]; readers interested in a detailed analysis of the (rich) statistical structure of this Poisson process are referred to [AccCha]. The exponent $\epsilon$ has the following limit interpretation:

$$\epsilon = \lim_{\Delta \to 0} \frac{1}{\Delta} \Pr(T > t_* - \Delta \mid T \le t_*), \tag{16}$$

which is a time-reversal analogue of the hazard rate of Eq. (7).

Continuing on from proposition 3, and considering the above reference time $t_*$, we apply the affine scaling of Eq. (14) to the time of the system-failure:

$$\tilde{\vee}_{\min} = c \cdot (\vee_{\min} - t_*). \tag{17}$$

Then, as proposition 1 led to proposition 2, proposition 3 leads to the following Gumbel limit-law result, which is the Min-Max counterpart of proposition 2.

###### Proposition 4

The random variable $\tilde{\vee}_{\min}$ converges in law, in the limit of Eq. (15), to a limiting random variable $-\eta \cdot G$, where $\eta = F(t_*)/f(t_*)$, and where $G$ is the ‘standard’ Gumbel random variable of Eq. (10).

See the Appendix for the proof of proposition 4. Table 2 summarizes proposition 4 and its underlying setting. Proposition 4 yields an asymptotic approximation for the Min-Max of large random matrices with dimensions $c \times l$. Indeed, consider the matrix-dimensions ($c \gg 1$ and $l \gg 1$) and the reference time ($t_*$) to satisfy the relation $l \cdot F(t_*)^c \simeq 1$. Then, the Min-Max random variable $\vee_{\min}$ admits the following Gumbel asymptotic approximation in law:

$$\vee_{\min} \simeq G_* := t_* - \frac{\eta}{c} \cdot G, \tag{18}$$

where $\eta$ and $G$ are as in proposition 4.

The Gumbel asymptotic approximation of Eq. (18) is the Min-Max counterpart of the Max-Min Gumbel asymptotic approximation of Eq. (11). Specifically: the deterministic asymptotic approximation of the Min-Max $\vee_{\min}$ is the reference time $t_*$; the magnitude of the random fluctuations about the deterministic asymptotic approximation is $\eta/c$; and the statistics of the random fluctuations about the deterministic asymptotic approximation are Gumbel. Table 3 summarizes the Gumbel asymptotic approximation of Eq. (18), and details the key statistical features of this approximation.
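The minus sign in Eq. (18) flips the Gumbel fluctuations, so the approximating variable $t_* - (\eta/c) \cdot G$ has its long tail towards early times. The sketch below writes out the implied distribution function, its inverse, and the mean; the formulas follow from Eq. (10) by a change of variables, and the function names and numerical inputs are assumptions for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel law


def min_max_gumbel_cdf(x, t_star, eta, n_copies):
    """Pr(t_star - (eta / n_copies) * G <= x) for a standard Gumbel G."""
    return 1.0 - math.exp(-math.exp((x - t_star) * n_copies / eta))


def min_max_gumbel_quantile(p, t_star, eta, n_copies):
    """Inverse of min_max_gumbel_cdf for 0 < p < 1."""
    return t_star + (eta / n_copies) * math.log(-math.log(1.0 - p))


def min_max_gumbel_mean(t_star, eta, n_copies):
    """Mean of the approximation; it lies below the reference time."""
    return t_star - (eta / n_copies) * EULER_GAMMA


# Hypothetical inputs: reference time 5.0, eta 2.0, 20 backup copies.
print(min_max_gumbel_cdf(5.0, 5.0, 2.0, 20))       # equals 1 - exp(-1)
print(min_max_gumbel_quantile(0.05, 5.0, 2.0, 20))  # an early-failure level
print(min_max_gumbel_mean(5.0, 2.0, 20))
```

In a storage setting, a low quantile such as the 5% level gives a time before which data loss is unlikely under the approximation.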

## V Application

The Gumbel asymptotic approximations of Eq. (11) and of Eq. (18) can be applied in two modalities: as approximation tools and as design tools for the Max-Min and the Min-Max, respectively. Both applications are based on the fact that, for Eqs. (11) and (18) to hold, it is required that the matrix-dimensions ($c$ and $l$) and the reference time ($t_*$) be properly coupled. In this section we describe and demonstrate these applications.

We start with the Max-Min, and its Gumbel asymptotic approximation of Eq. (11). This approximation requires the following coupling between the matrix-dimensions and the reference time: $c \cdot \bar{F}(t_*)^l \simeq 1$, where $c, l \gg 1$. Consequently, if the matrix-dimensions are given ($c$ and $l$) then the approximation of Eq. (11) holds with the following implied reference time:

$$t_* = \bar{F}^{-1}\left[\left(\frac{1}{c}\right)^{1/l}\right]. \tag{19}$$

For example, if $c = 2^l$ then the implied reference time is the median of the generic failure time $T$. This application is an approximation tool: given the random matrix $\mathbf{T}$, Eq. (11) with the implied reference time of Eq. (19) approximates the Max-Min of the matrix.
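Eq. (19) can be evaluated directly once the inverse survival function of the input is available. The sketch below does so for a unit-rate Exponential input, which is an assumption chosen for illustration, as are the function names.

```python
import math


def implied_reference_time_max_min(inverse_survival, n_chains, n_links):
    """Eq. (19): inverse survival function evaluated at (1/c)**(1/l)."""
    return inverse_survival((1.0 / n_chains) ** (1.0 / n_links))


def exponential_inverse_survival(p, rate=1.0):
    """Inverse of the survival function exp(-rate * t) of an Exponential input."""
    return -math.log(p) / rate


# With c = 2**l the implied reference time is the median, ln(2) for unit rate.
print(implied_reference_time_max_min(exponential_inverse_survival, 2 ** 10, 10))

# A matrix whose dimensions are not tied by c = 2**l: 1000 chains of 5 links.
print(implied_reference_time_max_min(exponential_inverse_survival, 1000, 5))
```

For the Exponential input the second call reduces to `ln(1000) / 5`, which shows how the implied reference time moves away from the median when the number of chains is large relative to their length.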

To demonstrate the design-tool application of the Gumbel asymptotic approximation of Eq. (11), consider a system with a “series-parallel” architecture: the system is a parallel array of $c$ sub-systems (labeled $i = 1, \ldots, c$), and each sub-system is a serial array of $l$ components (labeled $j = 1, \ldots, l$). In terms of the random matrix $\mathbf{T}$ of Eq. (3), the failure time of component $j$ in sub-system $i$ is $T_{i,j}$. The series-parallel architecture implies that the system’s failure time is the Max-Min $\wedge_{\max}$. Now, assume that our goal is to design a system whose failure time has the following properties: its deterministic approximation is $t_*$, and the magnitude of its random fluctuations about its deterministic approximation is $\bar{m}$, where $t_*$ and $\bar{m}$ are specified target values. Then, to meet the goal, the dimensions of the system should be designed as follows:

$$l \simeq \frac{1}{\bar{m}} \frac{\bar{F}(t_*)}{f(t_*)} \quad \& \quad c \simeq \frac{1}{\bar{F}(t_*)^l}. \tag{20}$$
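The design rule of Eq. (20) can be sketched as a short function. Rounding both dimensions to the nearest positive integer is an assumption made here, since the equation itself is only an approximate relation; the function name and the Exponential example are likewise illustrative.

```python
import math


def design_series_parallel(t_star, m_bar, density, survival):
    """Eq. (20): components per sub-system (l) and number of sub-systems (c).

    Rounding to the nearest positive integer is an assumption of this sketch.
    """
    surviving = survival(t_star)
    n_links = max(1, round(surviving / (density(t_star) * m_bar)))
    n_chains = max(1, round(1.0 / surviving ** n_links))
    return n_links, n_chains


# Illustrative input: unit-rate Exponential component failure times,
# target failure time at the median, target fluctuation magnitude 0.1.
def exp_density(t):
    return math.exp(-t)


def exp_survival(t):
    return math.exp(-t)


print(design_series_parallel(math.log(2.0), 0.1, exp_density, exp_survival))
```

Note that the number of sub-systems grows geometrically with the number of components per sub-system, so tight fluctuation targets quickly lead to very large designs.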

Let’s turn now to the Min-Max, and its Gumbel asymptotic approximation of Eq. (18). This approximation requires the following coupling between the matrix-dimensions and the reference time: $l \cdot F(t_*)^c \simeq 1$, where $c, l \gg 1$. Consequently, if the matrix-dimensions are given ($c$ and $l$) then the approximation of Eq. (18) holds with the following implied reference time:

$$t_* = F^{-1}\left[\left(\frac{1}{l}\right)^{1/c}\right]. \tag{21}$$

For example, if $l = 2^c$ then the implied reference time is the median of the generic failure time $T$. This application is an approximation tool: given the random matrix $\mathbf{T}$, Eq. (18) with the implied reference time of Eq. (21) approximates the Min-Max of the matrix.
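Eq. (21) is the mirror image of Eq. (19), with the inverse distribution function in place of the inverse survival function. A minimal sketch follows, using a Uniform input on the unit interval and a unit-rate Exponential input as illustrative assumptions; the function names are hypothetical.

```python
import math


def implied_reference_time_min_max(inverse_distribution, n_items, n_copies):
    """Eq. (21): inverse distribution function evaluated at (1/l)**(1/c)."""
    return inverse_distribution((1.0 / n_items) ** (1.0 / n_copies))


def uniform_inverse_distribution(p):
    """Inverse distribution function of a Uniform input on (0, 1)."""
    return p


def exponential_inverse_distribution(p, rate=1.0):
    """Inverse of the distribution function 1 - exp(-rate * t)."""
    return -math.log(1.0 - p) / rate


# 1000 information items on 20 copies, Uniform damage times on (0, 1).
print(implied_reference_time_min_max(uniform_inverse_distribution, 1000, 20))

# With l = 2**c the implied reference time is the median of the input.
print(implied_reference_time_min_max(exponential_inverse_distribution, 2 ** 10, 10))
```

In the Uniform case the result is simply `1000 ** (-1 / 20)`, and substituting it back into the coupling gives `l * F(t_star) ** c` equal to 1.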

To demonstrate the design-tool application of the Gumbel asymptotic approximation of Eq. (18), consider a system with a “parallel-series” architecture: the system is a serial array of $l$ sub-systems (labeled $j = 1, \ldots, l$), and each sub-system is a parallel array of $c$ components (labeled $i = 1, \ldots, c$). In terms of the random matrix $\mathbf{T}$ of Eq. (3), the failure time of component $i$ in sub-system $j$ is $T_{i,j}$. The parallel-series architecture implies that the system’s failure time is the Min-Max $\vee_{\min}$. Now, assume that our goal is to design a system whose failure time has the following properties: its deterministic approximation is $t_*$, and the magnitude of its random fluctuations about its deterministic approximation is $m$, where $t_*$ and $m$ are specified target values. Then, to meet the goal, the dimensions of the system should be designed as follows:

$$c \simeq \frac{1}{m} \frac{F(t_*)}{f(t_*)} \quad \& \quad l \simeq \frac{1}{F(t_*)^c}. \tag{22}$$
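A sketch of the design rule of Eq. (22) is given below. As with any integer design, the dimensions are rounded to the nearest positive integer, which is an assumption of this sketch rather than part of the equation; the function name and the Uniform input are illustrative.

```python
def design_parallel_series(t_star, m, density, distribution):
    """Eq. (22): components per sub-system (c) and number of sub-systems (l).

    Rounding to the nearest positive integer is an assumption of this sketch.
    """
    failed = distribution(t_star)
    n_copies = max(1, round(failed / (density(t_star) * m)))
    n_items = max(1, round(1.0 / failed ** n_copies))
    return n_copies, n_items


# Illustrative input: Uniform failure times on (0, 1), so F(t) = t and f(t) = 1.
def uniform_density(t):
    return 1.0


def uniform_distribution(t):
    return t


# Target failure time 0.8 with target fluctuation magnitude 0.04.
print(design_parallel_series(0.8, 0.04, uniform_density, uniform_distribution))
```

For these targets the rule gives 20 parallel components per sub-system, and the coupling then fixes the number of serial sub-systems at roughly `1 / 0.8 ** 20`.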

Eq. (19) and Eq. (21) are explicit formulae facilitating the approximation of the Max-Min and Min-Max of large random matrices. Eq. (20) and Eq. (22) are explicit formulae facilitating the design of systems with, respectively, “series-parallel” and “parallel-series” architectures. The practical implementation of these formulae is easy and straightforward.

## VI Discussion

We opened this paper with the fundamental theorem of EVT, and with a short discussion of the extreme-value asymptotic approximations emerging from this theorem. We now continue with this discussion, and expand it to include the Gumbel asymptotic approximations of Eqs. (11) and (18), as well as the asymptotic approximation emanating from the Central Limit Theorem (CLT) of probability theory [Fel1]-[Fel2]. To that end we begin with a succinct review of the CLT.

As in the case of the fundamental theorem of EVT, the CLT applies to ensembles of IID real-valued random variables: ensembles whose $n$ components are IID copies of a general real-valued random variable, the input. The input is assumed to have a finite (positive) standard deviation $\sigma$, and hence also a finite (real) mean $\mu$. We consider the ensemble’s average $A_n$, and further consider the following affine scaling of this average:

$$\tilde{A}_n = \frac{\sqrt{n}}{\sigma} \cdot (A_n - \mu). \tag{23}$$

Eq. (23) is the CLT counterpart of Eq. (1), with the term $\sqrt{n}/\sigma$ assuming the role of the positive scale parameter ($s_n$ in Eq. (1)), and with the mean $\mu$ assuming the role of the real location parameter ($\delta_n$ in Eq. (1)).

The CLT asserts that the scaled average $\tilde{A}_n$ converges in law (as $n \to \infty$) to a limiting random variable $N$ that is ‘standard’ Normal; i.e. the statistics of the limit $N$ are Normal (Gauss) with zero mean and with unit variance. Consequently, for a given large ensemble ($n \gg 1$), the ensemble’s average $A_n$ admits the following Normal asymptotic approximation in law:

$$A_n \simeq N_* := \mu + \frac{\sigma}{\sqrt{n}} \cdot N. \tag{24}$$

The Normal asymptotic approximation of Eq. (24) has the following meaning: the deterministic asymptotic approximation of the ensemble’s average $A_n$ is the mean $\mu$; the magnitude of the random fluctuations about the deterministic asymptotic approximation is $\sigma/\sqrt{n}$; and the statistics of the random fluctuations about the deterministic asymptotic approximation are Normal.

It is illuminating to compare the extreme-value asymptotic approximation of Eq. (2), the Normal asymptotic approximation of Eq. (24), and the Gumbel asymptotic approximations of Eqs. (11) and (18). Such a comparison will highlight the analogies and the differences between these asymptotic approximations – as we shall now see.

The extreme-value asymptotic approximation of Eq. (2) has the following key features. (I) The domains of attraction are characterized by narrow tail conditions: regular-variation conditions for the Weibull and Frechet extreme-value laws, and a complicated condition for the Gumbel extreme-value law (see theorems 8.13.2 - 8.13.4 in [BGT], and [BdH]). (II) The deterministic asymptotic approximation $\delta_n$ is highly dependent on the input. (III) The fluctuations’ magnitude $1/s_n$ is highly dependent on the input. (IV) The limit $L$ is either Weibull, Frechet, or Gumbel. (V) The information required in order to apply this asymptotic approximation is infinite-dimensional: the input’s distribution function.

The Normal asymptotic approximation of Eq. (24) has the following key features. (I) The domain of attraction is characterized by a wide moment condition: inputs with a finite variance. (II) The deterministic asymptotic approximation $\mu$ is the input’s mean. (III) The fluctuations’ magnitude $\sigma/\sqrt{n}$ depends on the input only via the coefficient $\sigma$ (which is the input’s standard deviation); hence the asymptotic order $1/\sqrt{n}$ of the fluctuations’ magnitude is independent of the input. (IV) The limit $N$ is ‘standard’ Normal. (V) The information required in order to apply this asymptotic approximation is two-dimensional: the input’s mean and standard deviation.

The Gumbel asymptotic approximations of Eqs. (11) and (18), for a preset reference time $t_*$, have the following key features. (I) The domain of attraction is characterized by a wide smoothness condition: inputs with a density function. (II) The deterministic asymptotic approximation $t_*$ is the preset reference time. (III) The fluctuations’ magnitudes $\bar{\eta}/l$ and $\eta/c$ depend on the input only via the coefficients $\bar{\eta}$ and $\eta$, respectively; hence the asymptotic orders $1/l$ and $1/c$ of the fluctuations’ magnitudes are independent of the input $T$. (IV) The limit $G$ is ‘standard’ Gumbel. (V) The information required in order to apply these asymptotic approximations is two-dimensional: the value of the input’s distribution function and density function at the reference time $t_*$.

On the one hand, the key features of the Gumbel asymptotic approximations of Eqs. (11) and (18) are quite different from those of the extreme-value asymptotic approximation of Eq. (2). On the other, the key features of these Gumbel asymptotic approximations are markedly similar to those of the Normal asymptotic approximation of Eq. (24). Thus, the Gumbel asymptotic approximations presented here are ‘as universal’ as the Normal asymptotic approximation; the similarities between these approximations are summarized in Table 4.

As its name suggests, a cornerstone of the Central Limit Theorem (CLT) is its centrality. In terms of the Normal asymptotic approximation of Eq. (24), centrality is manifested as follows: the ensemble’s average is approximated about the ‘center point’ of the input, namely its mean. In effect, the CLT ‘magnifies’ the statistical behavior of the ensemble’s average about this ‘center point’.

The fundamental theorem of EVT is diametric to the CLT. Indeed, consider the upper bound of the support of the input; this upper bound can be either finite or infinite. Specifically, in the Weibull case it is finite, in the Frechet case it is infinite, and in the Gumbel case it can be either (see theorems 8.13.2 - 8.13.4 in BGT, and BdH). In effect, the fundamental theorem of EVT ‘magnifies’ the statistical behavior of the ensemble’s maximum about this upper bound.

Thus, on the one hand, the Normal asymptotic approximation of Eq. (24) ‘anchors’ at the mean, which is an interior point of the support of the input. And, on the other hand, the extreme-value asymptotic approximation of Eq. (2) ‘anchors’ at the upper bound, which is a boundary point of the support of the input. So, also from an ‘anchoring perspective’: the Gumbel asymptotic approximations of Eqs. (11) and (18) are different from the extreme-value asymptotic approximation of Eq. (2), and are similar to the Normal asymptotic approximation of Eq. (24). Indeed, these Gumbel asymptotic approximations ‘anchor’ at the reference time, which is an interior point of the support of the input.

Notably, in the design-tool modality, the Gumbel asymptotic approximations of Eqs. (11) and (18) offer a feature that even the CLT does not offer: tunability. The ‘center point’ at which the Normal asymptotic approximation of Eq. (24) ‘anchors’ is the mean, and this anchoring point is fixed. The ‘center point’ at which the Gumbel asymptotic approximations of Eqs. (11) and (18) ‘anchor’ is the reference time, and this anchoring point is tunable. Namely, propositions 1-4 allow us to set the reference time as we wish within the support of the input.
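The tunability described above can be illustrated with a minimal sketch. It assumes an exponential input with unit mean (the input used in the MATLAB listing of the Appendix) and takes the geometric coupling between the number of chains and the number of links in the form used in that listing, `c = floor((1 - F(t))^(-l))`. The function names `exponential_cdf`, `exponential_pdf` and `max_min_design` are hypothetical and introduced only for this illustration.

```python
import math


def exponential_cdf(t, mean=1.0):
    """Distribution function F(t) of an exponential input (illustrative choice)."""
    return 1.0 - math.exp(-t / mean) if t > 0.0 else 0.0


def exponential_pdf(t, mean=1.0):
    """Density function f(t) of the same exponential input."""
    return math.exp(-t / mean) / mean if t > 0.0 else 0.0


def max_min_design(t_ref, links, cdf=exponential_cdf, pdf=exponential_pdf):
    """Quantities needed to anchor the Max-Min approximation at t_ref.

    Only two numbers about the input are used: F(t_ref) and f(t_ref).
    """
    F = cdf(t_ref)
    f = pdf(t_ref)
    if not 0.0 < F < 1.0 or f <= 0.0:
        raise ValueError("t_ref must be an interior point of the support")
    survival = 1.0 - F
    eta_bar = survival / f  # coefficient of the Gumbel fluctuations
    return {
        "eta_bar": eta_bar,
        # number of chains, using the coupling of the Appendix listing (assumption)
        "chains": math.floor(survival ** (-links)),
        # magnitude of the fluctuations about t_ref
        "scale": eta_bar / links,
    }


# Moving the reference time changes the required number of chains.
for t_ref in (0.1, 0.2, 0.3):
    print(t_ref, max_min_design(t_ref, links=40))
```

For a fixed number of links, a later reference time calls for geometrically more chains, while the fluctuations’ magnitude `scale` stays of order `1/links`.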

Perhaps the most straightforward approach to tackle the Max-Min and the Min-Max of random matrices is to apply the fundamental theorem of EVT iteratively. Reis and Castro did precisely so for the Max-Min RC: they applied the fundamental theorem first to the minimum of each and every row of the random matrix (of Eq. (3)), and then to the maximum of the rows’ minima. Interestingly, the results of Reis and Castro and our results both yield Gumbel limit-laws. Nonetheless, these seemingly identical limit-law results are profoundly different. “God is in the details”, or in the features, as we shall now elucidate.

Consider the iterative EVT approach. The first iteration of the fundamental theorem implicitly confines the input to one of the theorem’s narrow domains of attraction (Weibull, Frechet, Gumbel); moreover, as noted above, this iteration ‘anchors’ at the upper bound of the support of the input. To apply the second iteration one has to impose further conditions, as well as to introduce an asymptotic coupling between the dimensions of the random matrix. Consequently, the iterative EVT approach comes with an expensive ‘intricacy price tag’. Specifically, for the limit-law of RC the following are highly dependent on the input, and are also highly elaborate: the Max-Min domain of attraction, scaling scheme, and asymptotic coupling. Matters are as intricate also in the Max-Min and Min-Max results of CT -Kolo2 (which are derived via ‘EVT machineries’).

Here, rather than mimicking the fundamental theorem of EVT, we mimicked the CLT. Firstly, we set a vast domain of attraction: inputs with a density function. Secondly, we devised particular asymptotic couplings and affine scalings: Eqs. (6) and (9) for the Max-Min, and Eqs. (15) and (17) for the Min-Max. Thirdly, we showed that these particular asymptotic couplings and affine scalings always yield the Gumbel limit-laws of propositions 2 and 4; i.e. they do so for all inputs that belong to the vast domain of attraction. These novel Gumbel limit-laws were achieved via a Poisson-process approach: the ‘bedrock’ Poisson-process limit-laws of propositions 1 and 3. This approach enabled us to circumvent the use of the fundamental theorem of EVT.

The Gumbel limit-laws of propositions 2 and 4 are truly workable tools for the Max-Min and the Min-Max of random matrices with IID entries. In turn, so are the Gumbel asymptotic approximations of Eqs. (11) and (18). A short MATLAB code given in the Appendix shows just how easy it is to apply these tools in practice.

## VII Conclusion

This paper explored the Max-Min value and the Min-Max value of a random matrix with: $c$ rows, $l$ columns, and entries that are IID real-valued random variables. This IID setting is common to random-matrix theory, to the fundamental theorem of Extreme Value Theory, and to the Central Limit Theorem. The Max-Min and the Min-Max values of matrices emerge naturally in science and engineering, e.g. in game theory and in reliability engineering. We motivated the Max-Min value by the following question: within a collection of $c$ IID chains, each with $l$ links, how long will the strongest chain hold? And, we motivated the Min-Max value by the following question: how long can $l$ critical information items be stored reliably on $c$ IID backup copies?

We showed that if the number of rows and the number of columns are large, and are coupled geometrically, then: the Max-Min value and the Min-Max value admit, respectively, the Gumbel asymptotic approximations of Eq. (11) and of Eq. (18) (in law). These Gumbel asymptotic approximations are similar, in form, to the Normal asymptotic approximation that follows from the Central Limit Theorem. Moreover, in their design-tool modality, the Gumbel asymptotic approximations display a special feature: their deterministic part – the reference time – is tunable. Hence, these Gumbel asymptotic approximations can be used, via Eqs. (20) and (22), to design the Max-Min and Min-Max values.

The Gumbel asymptotic approximations are founded on the Gumbel limit-laws of propositions 2 and 4. In turn, the Gumbel limit-laws are founded on the ‘bedrock’ Poisson-process limit-laws of propositions 1 and 3. These four novel limit-laws have a vast domain of attraction, have simple affine scalings, and use geometric asymptotic couplings (of $c$ and $l$). With their generality, their CLT-like structure, their straightforward practical implementation, and their many potential applications, the results established and presented in this paper are expected to serve diverse audiences in science and engineering.

Acknowledgments. R.M. acknowledges Deutsche Forschungsgemeinschaft for funding (ME 1535/7-1) and support from the Foundation for Polish Science within an Alexander von Humboldt Polish Honorary Research Fellowship. S.R. gratefully acknowledges support from the Azrieli Foundation and the Sackler Center for Computational Molecular and Materials Science.

## VIII Appendix

### VIII.1 A general Poisson-process limit-law result

In this subsection we establish a general Poisson-process limit-law result. The setting of the general result is as follows. Consider $X_1,\ldots,X_n$ to be IID copies of a generic random variable $X$. The random variable $X$ is real-valued, and its density function is given by

$$f_\theta(x)=\kappa_\theta\cdot g_\theta(x) \tag{25}$$

($-\infty<x<\infty$), where: $\theta$ is a positive parameter; $\kappa_\theta$ is a positive constant; $g_\theta(x)$ is a non-negative function.

Consider the joint limits $n\to\infty$ and $\theta\to\infty$. We assume that $n$ and the constant $\kappa_\theta$ admit the following asymptotic coupling:

$$\lim_{n\to\infty,\,\theta\to\infty} n\cdot\kappa_\theta=\kappa , \tag{26}$$

where $\kappa$ is a positive limit value. Also, we assume that

$$\lim_{\theta\to\infty} g_\theta(x)=g(x) \tag{27}$$

($-\infty<x<\infty$), where $g(x)$ is a non-negative limit function.

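Now, let’s analyze the asymptotic statistical behavior of the ensemble $\{X_1,\ldots,X_n\}$ in the joint limits $n\to\infty$ and $\theta\to\infty$. To that end we take a real-valued ‘test function’ $\phi(x)$ ($-\infty<x<\infty$), and compute the characteristic functional of the ensemble with respect to this test function:

$$\begin{aligned}
\mathbf{E}[\phi(X_1)\cdots\phi(X_n)] &= \mathbf{E}[\phi(X)]^n=\left\{\int_{-\infty}^{\infty}\phi(x)f_\theta(x)\,dx\right\}^n \\
&= \left\{1-\int_{-\infty}^{\infty}[1-\phi(x)]f_\theta(x)\,dx\right\}^n \\
&= \left\{1-\frac{1}{n}\int_{-\infty}^{\infty}[1-\phi(x)]\,[(n\kappa_\theta)\cdot g_\theta(x)]\,dx\right\}^n
\end{aligned} \tag{28}$$

(in Eq. (28) we used the IID structure of the ensemble $\{X_1,\ldots,X_n\}$, and Eq. (25)). Applying the limits of Eqs. (26)-(27), Eq. (28) implies that:

$$\lim_{n\to\infty,\,\theta\to\infty}\mathbf{E}[\phi(X_1)\cdots\phi(X_n)]=\exp\left\{-\int_{-\infty}^{\infty}[1-\phi(x)]\,[\kappa\cdot g(x)]\,dx\right\} . \tag{29}$$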
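The passage from Eq. (28) to Eq. (29) rests on an elementary limit, spelled out here as a sketch. It assumes that the limit and the integral can be interchanged, which is a regularity assumption on the test function and on the functions of Eq. (27) that is not stated explicitly above; the symbols $a_{n,\theta}$ and $a$ are introduced only for this illustration.

$$a_{n,\theta}:=\int_{-\infty}^{\infty}[1-\phi(x)]\,[(n\kappa_\theta)\cdot g_\theta(x)]\,dx \;\longrightarrow\; a:=\int_{-\infty}^{\infty}[1-\phi(x)]\,[\kappa\cdot g(x)]\,dx
\quad\Longrightarrow\quad
\left(1-\frac{a_{n,\theta}}{n}\right)^{n}\longrightarrow \exp(-a) .$$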

The characteristic functional of a Poisson process $\mathcal{P}$ over the real line, with intensity function $\lambda(x)$ ($-\infty<x<\infty$), is given by Kin:

$$\mathbf{E}\left[\prod_{x\in\mathcal{P}}\phi(x)\right]=\exp\left\{-\int_{-\infty}^{\infty}[1-\phi(x)]\,\lambda(x)\,dx\right\} , \tag{30}$$

where $\phi(x)$ ($-\infty<x<\infty$) is a real-valued ‘test function’. We emphasize that the characteristic functional of Eq. (30) is indeed characteristic Kin: if $\mathcal{P}$ is a collection of real points that satisfies Eq. (30), then $\mathcal{P}$ is a Poisson process over the real line, with intensity function $\lambda(x)$ ($-\infty<x<\infty$). Hence, combined together, Eqs. (29) and (30) yield the following general result:

###### Proposition 5

The ensemble $\{X_1,\ldots,X_n\}$ converges in law, in the joint limits $n\to\infty$ and $\theta\to\infty$, to a Poisson process over the real line with intensity function $\lambda(x)=\kappa\cdot g(x)$ ($-\infty<x<\infty$).

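### VIII.2 Proof of proposition 1

Eq. (4) implies that

$$\begin{aligned}
\Pr(\wedge_i>t) &= \Pr[\min\{T_{i,1},\ldots,T_{i,l}\}>t] \\
&= \Pr(T_{i,1}>t)\cdots\Pr(T_{i,l}>t) \\
&= \Pr(T>t)^l=\bar{F}(t)^l
\end{aligned} \tag{31}$$

($-\infty<t<\infty$). Eq. (5) and Eq. (31) imply that

$$\begin{aligned}
\Pr(\tilde{\wedge}_i>t) &= \Pr[l\cdot(\wedge_i-t_*)>t] \\
&= \Pr\left(\wedge_i>t_*+\frac{t}{l}\right)=\bar{F}\left(t_*+\frac{t}{l}\right)^l
\end{aligned} \tag{32}$$

($-\infty<t<\infty$). Differentiating Eq. (32) with respect to the variable $t$ implies that the density function of the scaled random variable $\tilde{\wedge}_i$ is given by

$$-\frac{d}{dt}\Pr(\tilde{\wedge}_i>t)=\bar{F}\left(t_*+\frac{t}{l}\right)^l\cdot\bar{h}\left(t_*+\frac{t}{l}\right) \tag{33}$$

($-\infty<t<\infty$), where $\bar{h}(t)=f(t)/\bar{F}(t)$. In what follows we use the shorthand notation $\bar{\epsilon}=\bar{h}(t_*)$. Note that the two basic requirements imply that: $0<\bar{\epsilon}<\infty$.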

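Now, apply proposition 5 to the following setting: $n=c$, $\theta=l$, and $X_i=\tilde{\wedge}_i$ ($i=1,\ldots,c$). Eq. (33) implies that

$$f_\theta(x)=\underbrace{\bar{F}(t_*)^\theta}_{\kappa_\theta}\cdot\underbrace{\left[\frac{\bar{F}\left(t_*+\frac{x}{\theta}\right)}{\bar{F}(t_*)}\right]^\theta\cdot\bar{h}\left(t_*+\frac{x}{\theta}\right)}_{g_\theta(x)} \tag{34}$$

($-\infty<x<\infty$). Note that

$$\begin{aligned}
\left[\frac{\bar{F}\left(t_*+\frac{x}{\theta}\right)}{\bar{F}(t_*)}\right]^\theta &= \left[\frac{\bar{F}(t_*)-f(t_*)\frac{x}{\theta}+o\left(\frac{1}{\theta}\right)}{\bar{F}(t_*)}\right]^\theta \\
&= \left[1-\bar{\epsilon}\,\frac{x}{\theta}+o\left(\frac{1}{\theta}\right)\right]^\theta \underset{\theta\to\infty}{\longrightarrow}\exp(-\bar{\epsilon}x)
\end{aligned} \tag{35}$$

($-\infty<x<\infty$). Eqs. (34) and (35) imply that

$$\lim_{\theta\to\infty}g_\theta(x)=g(x):=\bar{\epsilon}\exp(-\bar{\epsilon}x) \tag{36}$$

($-\infty<x<\infty$). Also, the asymptotic geometric coupling of Eq. (6) implies that the asymptotic coupling of Eq. (26) holds with $\kappa=1$. Hence, the result of proposition 5 holds with the intensity function

$$\lambda(x)=\bar{\epsilon}\exp(-\bar{\epsilon}x) \tag{37}$$

($-\infty<x<\infty$). This proves proposition 1.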
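As a worked illustration (not part of the original proof), consider the exponential input with unit mean that is used in the MATLAB listing of the Appendix, and a reference time $t_*>0$. Here the ratio $\bar{h}(t)=f(t)/\bar{F}(t)$, which arises when differentiating Eq. (32), is constant, and the limit of Eq. (35) holds exactly for every $\theta$ rather than only asymptotically:

$$\bar{F}(t)=e^{-t},\quad f(t)=e^{-t},\quad \bar{h}(t)=\frac{f(t)}{\bar{F}(t)}=1,\quad \bar{\epsilon}=1
\quad\Longrightarrow\quad
\left[\frac{\bar{F}\left(t_*+\frac{x}{\theta}\right)}{\bar{F}(t_*)}\right]^\theta=\left[e^{-x/\theta}\right]^\theta=e^{-x}
\qquad (x>-\theta t_*) .$$

In this special case the limiting intensity function of Eq. (37) is $\lambda(x)=\exp(-x)$.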

### VIII.3 Proof of proposition 2

Set $\mathcal{P}$ to be a Poisson process, over the real line, with intensity function $\lambda(x)=\bar{\epsilon}\exp(-\bar{\epsilon}x)$ ($-\infty<x<\infty$) and exponent $\bar{\epsilon}$. Consider the number $N(t)$ of points of the Poisson process $\mathcal{P}$ that reside above a real threshold $t$. The Poisson-process statistics imply that the number $N(t)$ is a Poisson-distributed random variable with mean

$$\mathbf{E}[N(t)]=\int_t^\infty\lambda(x)\,dx=\int_t^\infty\bar{\epsilon}\exp(-\bar{\epsilon}x)\,dx=\exp(-\bar{\epsilon}t) . \tag{38}$$

Now, consider the maximal point $M$ of the Poisson process $\mathcal{P}$. This maximal point is no larger than the threshold $t$ if and only if no points of the Poisson process reside above this threshold: $\{M\le t\}\Leftrightarrow\{N(t)=0\}$. Hence, as $N(t)$ is a Poisson-distributed random variable with mean $\mathbf{E}[N(t)]$, Eq. (38) implies that the distribution function of the maximal point $M$ is given by

$$\Pr(M\le t)=\exp[-\exp(-\bar{\epsilon}t)] \tag{39}$$

($-\infty<t<\infty$). The distribution function of Eq. (39) characterizes the Gumbel law. A ‘standard’ Gumbel-distributed random variable $G$ is governed by the distribution function of Eq. (10): $\Pr(G\le t)=\exp[-\exp(-t)]$ ($-\infty<t<\infty$). Eqs. (39) and (10) imply that the maximal point $M$ admits the following Gumbel representation in law:

$$M=\bar{\eta}\cdot G , \tag{40}$$

where

$$\bar{\eta}=\frac{1}{\bar{\epsilon}}=\frac{\bar{F}(t_*)}{f(t_*)} . \tag{41}$$

Proposition 1 established that the ensemble $\{\tilde{\wedge}_1,\ldots,\tilde{\wedge}_c\}$ converges in law, in the limit of Eq. (6), to the Poisson process $\mathcal{P}$. Consequently, the maximum of the ensemble converges in law, in the limit of Eq. (6), to the maximal point $M$ of the Poisson process $\mathcal{P}$. Hence, Eq. (40) proves proposition 2.
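The step from Eqs. (39) and (10) to the representation of Eq. (40) can be checked in one line. In the sketch below $G$ denotes a ‘standard’ Gumbel random variable with distribution function $\exp[-\exp(-t)]$, and $\bar{\eta}=1/\bar{\epsilon}$ is positive:

$$\Pr(\bar{\eta}\cdot G\le t)=\Pr(G\le\bar{\epsilon}\,t)=\exp[-\exp(-\bar{\epsilon}\,t)]=\Pr(M\le t) .$$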

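### VIII.4 Proof of proposition 3

For the random variable $\vee_j$ we have

$$\begin{aligned}
\Pr(\vee_j\le t) &= \Pr[\max\{T_{1,j},\ldots,T_{c,j}\}\le t] \\
&= \Pr(T_{1,j}\le t)\cdots\Pr(T_{c,j}\le t) \\
&= \Pr(T\le t)^c=F(t)^c
\end{aligned} \tag{42}$$

($-\infty<t<\infty$). In turn, for the scaled random variable $\tilde{\vee}_j$ Eq. (42) implies that

$$\begin{aligned}
\Pr(\tilde{\vee}_j\le t) &= \Pr[c\cdot(\vee_j-t_*)\le t] \\
&= \Pr\left(\vee_j\le t_*+\frac{t}{c}\right)=F\left(t_*+\frac{t}{c}\right)^c
\end{aligned} \tag{43}$$

($-\infty<t<\infty$). Differentiating Eq. (43) with respect to the variable $t$ implies that the density function of the scaled random variable $\tilde{\vee}_j$ is given by

$$\frac{d}{dt}\Pr(\tilde{\vee}_j\le t)=F\left(t_*+\frac{t}{c}\right)^c\cdot h\left(t_*+\frac{t}{c}\right) \tag{44}$$

($-\infty<t<\infty$), where $h(t)=f(t)/F(t)$. In what follows we use the shorthand notation $\epsilon=h(t_*)$. Note that the two basic requirements imply that: $0<\epsilon<\infty$.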

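Now, apply proposition 5 to the following setting: $n=l$, $\theta=c$, and $X_j=\tilde{\vee}_j$ ($j=1,\ldots,l$). Eq. (44) implies that

$$f_\theta(x)=\underbrace{F(t_*)^\theta}_{\kappa_\theta}\cdot\underbrace{\left[\frac{F\left(t_*+\frac{x}{\theta}\right)}{F(t_*)}\right]^\theta\cdot h\left(t_*+\frac{x}{\theta}\right)}_{g_\theta(x)} \tag{45}$$

($-\infty<x<\infty$). Note that

$$\begin{aligned}
\left[\frac{F\left(t_*+\frac{x}{\theta}\right)}{F(t_*)}\right]^\theta &= \left[\frac{F(t_*)+f(t_*)\frac{x}{\theta}+o\left(\frac{1}{\theta}\right)}{F(t_*)}\right]^\theta \\
&= \left[1+\epsilon\,\frac{x}{\theta}+o\left(\frac{1}{\theta}\right)\right]^\theta \underset{\theta\to\infty}{\longrightarrow}\exp(\epsilon x)
\end{aligned} \tag{46}$$

($-\infty<x<\infty$). Eqs. (45) and (46) imply that

$$\lim_{\theta\to\infty}g_\theta(x)=g(x):=\epsilon\exp(\epsilon x) \tag{47}$$

($-\infty<x<\infty$). Also, the asymptotic geometric coupling of Eq. (15) implies that the asymptotic coupling of Eq. (26) holds with $\kappa=1$. Hence, the result of proposition 5 holds with the intensity function

$$\lambda(x)=\epsilon\exp(\epsilon x) \tag{48}$$

($-\infty<x<\infty$). This proves proposition 3.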

### VIII.5 Proof of proposition 4

Set $\mathcal{P}$ to be a Poisson process, over the real line, with intensity function $\lambda(x)=\epsilon\exp(\epsilon x)$ ($-\infty<x<\infty$) and exponent $\epsilon$. Consider the number $N(t)$ of points of the Poisson process $\mathcal{P}$ that reside up to a real threshold $t$. The Poisson-process statistics imply that the number $N(t)$ is a Poisson-distributed random variable with mean

$$\mathbf{E}[N(t)]=\int_{-\infty}^t\lambda(x)\,dx=\int_{-\infty}^t\epsilon\exp(\epsilon x)\,dx=\exp(\epsilon t) . \tag{49}$$

Now, consider the minimal point $M$ of the Poisson process $\mathcal{P}$. This minimal point is larger than the threshold $t$ if and only if no points of the Poisson process reside up to this threshold: $\{M>t\}\Leftrightarrow\{N(t)=0\}$. Hence, as $N(t)$ is a Poisson-distributed random variable with mean $\mathbf{E}[N(t)]$, Eq. (49) implies that the survival function of the minimal point $M$ is given by

$$\Pr(M>t)=\exp[-\exp(\epsilon t)] \tag{50}$$

($-\infty<t<\infty$). A ‘standard’ Gumbel-distributed random variable $G$ is governed by the distribution function of Eq. (10): $\Pr(G\le t)=\exp[-\exp(-t)]$ ($-\infty<t<\infty$). Eqs. (50) and (10) imply that the minimal point $M$ admits the following Gumbel representation in law:

$$M=-\eta\cdot G , \tag{51}$$

where

$$\eta=\frac{1}{\epsilon}=\frac{F(t_*)}{f(t_*)} . \tag{52}$$

Proposition 3 established that the ensemble $\{\tilde{\vee}_1,\ldots,\tilde{\vee}_l\}$ converges in law, in the limit of Eq. (15), to the Poisson process $\mathcal{P}$. Consequently, the minimum of the ensemble converges in law, in the limit of Eq. (15), to the minimal point $M$ of the Poisson process $\mathcal{P}$. Hence, Eq. (51) proves proposition 4.
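The step from Eqs. (50) and (10) to the representation of Eq. (51) can be verified in the same way as for the maximal point. In the sketch below $G$ is a ‘standard’ Gumbel random variable, whose distribution function $\exp[-\exp(-t)]$ is continuous, and $\eta=1/\epsilon$ is positive:

$$\Pr(-\eta\cdot G>t)=\Pr(G<-\epsilon\,t)=\exp[-\exp(\epsilon\,t)]=\Pr(M>t) .$$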

### VIII.6 MATLAB code for Figure 1

% This script computes the scaled Max-Min: l*(MaxMin-t)/eta_bar

% N specifies the number of random matrices to be generated
N=10^5;

% MaxMin will hold the N Max-Min values that will be computed
MaxMin=zeros(1,N);

% pd specifies the distribution of the random matrix entries
pd = makedist('Exponential','mu',1);

% CDF_t specifies the value of the cumulative distribution function at the anchor point

CDF_t=1/5;

% This computes the anchor point t by inverting cumulative distribution function
t=icdf(pd,CDF_t);

% l sets the number of links
l=70;

% c sets the number of chains via geometric coupling
% (c grows geometrically with l; for CDF_t=1/5 and l=70 it is about 6*10^6)
c=floor((1-CDF_t)^(-l));

% This for-loop generates the random matrices and computes the MaxMin
for k=1:N
    M=random(pd,c,l);
    % min over each row (chain), then max over the rows' minima
    MaxMin(k)=max(min(M'));
end

% This computes the coefficient eta_bar
eta_bar=(1-CDF_t)/pdf(pd,t);

% This computes the scaled MaxMin/eta_bar
MaxMin=(MaxMin-t)*l/eta_bar;
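The MATLAB listing above generates full random matrices. As a lighter-weight cross-check, the following Python sketch samples the Max-Min of a matrix with unit-mean exponential entries directly, without building the matrix. This shortcut is an assumption-specific substitute for the listing, not the authors' method: it relies on the fact that the minimum of `links` IID unit-mean exponential variables is exponential with rate `links`, and that the maximum of `chains` IID variables with distribution function `H` has distribution function `H**chains`. The function name `sample_scaled_max_min` is hypothetical.

```python
import math
import random


def sample_scaled_max_min(links, cdf_at_ref, rng):
    """One draw of l*(MaxMin - t_ref)/eta_bar for unit-mean exponential entries."""
    survival = 1.0 - cdf_at_ref                 # F_bar(t_ref)
    t_ref = -math.log(survival)                 # inverse CDF of the exponential input
    chains = math.floor(survival ** (-links))   # coupling as in the MATLAB listing
    eta_bar = survival / math.exp(-t_ref)       # F_bar(t_ref)/f(t_ref)
    u = rng.random()
    while u == 0.0:                             # avoid log(0)
        u = rng.random()
    # The Max-Min has distribution function (1 - exp(-links*x))**chains.
    # Invert it at u; expm1 keeps accuracy when chains is very large.
    max_min = -math.log(-math.expm1(math.log(u) / chains)) / links
    return (max_min - t_ref) * links / eta_bar


rng = random.Random(0)
samples = [sample_scaled_max_min(70, 0.2, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
below_zero = sum(x <= 0.0 for x in samples) / len(samples)
# For a 'standard' Gumbel variable the mean is the Euler-Mascheroni constant
# (about 0.5772) and the probability of a non-positive value is exp(-1).
print("sample mean:", mean)
print("fraction <= 0:", below_zero, "vs exp(-1) =", math.exp(-1.0))
```

Because the sampling is exact for this particular input, the sketch only probes the quality of the Gumbel approximation itself; other inputs would require either their own inversion formulas or full matrix generation as in the listing.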

## References

• (1) Galambos, J., 1978. The asymptotic theory of extreme order statistics (No. 04; QA274, G3.).
• (2) Coles, S., 2001. An introduction to statistical modeling of extreme values (Vol. 208). London: Springer.
• (3) De Haan, L. and Ferreira, A., 2007. Extreme value theory: an introduction. Springer Science & Business Media.
• (4) Castillo, E., 2012. Extreme value theory in engineering. Elsevier.
• (5) Kotz, S. and Nadarajah, S., 2000. Extreme value distributions: theory and applications. World Scientific.
• (6) Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J.L., 2006. Statistics of extremes: theory and applications. John Wiley & Sons.
• (7) Reiss, R.D. and Thomas, M., 2007. Statistical analysis of extreme values (Vol. 2). Basel: Birkhäuser.
• (8) Embrechts, P., Klüppelberg, C. and Mikosch, T., 2013. Modelling extremal events: for insurance and finance (Vol. 33). Springer Science & Business Media.
• (9) Scheirer, W.J., 2017. Extreme Value Theory-Based Methods for Visual Recognition. Synthesis Lectures on Computer Vision, 7(1), pp.1-131.
• (10) Fisher, R.A. and Tippett, L.H.C., 1928, April. Limiting forms of the frequency distribution of the largest or smallest member of a sample. In Mathematical Proceedings of the Cambridge Philosophical Society (Vol. 24, No. 2, pp. 180-190). Cambridge University Press.
• (11) B. Gnedenko, Ann. Math. 44 (1943) 423 (translated and reprinted in: Breakthroughs in Statistics I, edited by S. Kotz and N.L. Johnson, pp. 195-225, Springer, New York, 1992).
• (12) W. Weibull, Ingeniors Vetenskaps Akademiens, Stockholm (1939) 151.
• (13) W. Weibull, ASME J. Appl. Mech. 18 (1951) 293.
• (14) M. Fréchet, Ann. Soc. Polon. Math. Cracovie, 6 (1927) 93.
• (15) Gumbel, E.J., 2012. Statistics of extremes. Courier Corporation.
• (16) Bingham, N.H., Goldie, C.M. and Teugels, J.L., 1989. Regular variation (Vol. 27). Cambridge university press.
• (17) Fudenberg, D. and Tirole, J., 1991. Game theory. Cambridge, Massachusetts: MIT Press.
• (18) M. Maschler, E. Solan, and S. Zamir, Game Theory (Cambridge University Press, Cambridge, 2013).
• (19) J. Pearl, Asymptotic properties of minimax trees and game-searching procedures, Artificial Intelligence 14 (1980) 113-138.

• (20) T.A. Khan, L. Devroye, and R. Neininger, A limit law for the root value of minimax trees, Electron. Comm. Probab. 10 (2005) 273-281.
• (21) Barlow, R.E. and Proschan, F., 1996. Mathematical theory of reliability (Vol. 17). Siam.
• (22) Finkelstein, M., 2008. Failure rate modelling for reliability and risk. Springer Science & Business Media.
• (23) K. Kolowrocki, Limit reliability functions of some series-parallel and parallel-series systems, Applied Math. Comp. 62 (1994) 129-151.
• (24) K. Kolowrocki, On a class of limit reliability functions of some regular homogeneous series-parallel systems, Reliability Eng. System Safety 39 (1993) 11-23.
• (25) K. Kolowrocki, On asymptotic reliability functions of series-parallel and parallel-series systems with identical components, Reliability Eng. System Safety 41 (1993) 251-257.
• (26) H. Chernoff and H. Teicher, Limit distributions of the minimax of independent identically distributed random variables, Trans. American Math. Soc. 116 (1965) 474-491.
• (27) P. Reis and L.C. Castro, Limit model for the reliability of a regular and homogeneous series-parallel system, Revstat 7 (2009) 227-243.
• (28) Balkema, A. A., and L. De Haan. On R. von Mises’ condition for the domain of attraction of $\exp(-e^{-x})$. The Annals of Mathematical Statistics (1972): 1352-1354.
• (29) Kingman, J.F.C., 1992. Poisson processes (Vol. 3). Clarendon Press.
• (30) Cox, D.R. and Isham, V., 1980. Point processes (Vol. 12). CRC Press.
• (31) Streit, R.L., 2010. Poisson point processes: imaging, tracking, and sensing. Springer Science & Business Media.
• (32) Wolff, R.W., 1989. Stochastic modeling and the theory of queues. Pearson College Division.
• (33) Lowen, S.B. and Teich, M.C., 2005. Fractal-based point processes (Vol. 366). John Wiley & Sons.
• (34) Eliazar, I. and Klafter, J., 2012. A probabilistic walk up power laws. Physics Reports, 511(3), pp.143-175.
• (35) Eliazar, I. and Sokolov, I.M., 2010. Gini characterization of extreme-value statistics. Physica A: Statistical Mechanics and its Applications, 389(21), pp.4462-4472.
• (36) Eliazar, I. and Shlesinger, M.F., 2018. Universality of accelerating change. Physica A: Statistical Mechanics and its Applications, 494, pp.430-445.
• (37) Feller, W., 2008. An introduction to probability theory and its applications (Vol. 1). John Wiley & Sons.
• (38) Feller, W., 2008. An introduction to probability theory and its applications (Vol. 2). John Wiley & Sons.