Isotonic regression for metallic microstructure data: estimation and testing under order restrictions

02/03/2020 · by Martina Vittorietti et al. · Delft University of Technology

Investigating the main determinants of the mechanical performance of metals is not a simple task. Already known, physically inspired qualitative relations between 2D microstructure characteristics and 3D mechanical properties can act as the starting point of the investigation. Isotonic regression allows such ordering relations to be taken into account and leads to more efficient and accurate results when the underlying assumptions actually hold. The main goal of this paper is to test order relations in a model inspired by a materials science application. The statistical estimation procedure is described for three different scenarios according to the knowledge of the variances: known variance ratio, completely unknown variances, and variances under order restrictions. New likelihood ratio tests are developed for the last two cases. Both parametric and non-parametric bootstrap approaches are developed for finding the distribution of the test statistics under the null hypothesis. Finally, an application on the relation between Geometrically Necessary Dislocations and the number of observed microstructure precipitates is presented.


1. Introduction

Understanding the intrinsic nature of the mechanical properties of metals is usually not an easy task. In order to get insight into what gives desired mechanical performance to a metal, a deep and detailed analysis of the metal microstructure characteristics is needed. For instance, it is known in the literature that dislocations, i.e. line defects in the crystalline arrangement of the atoms [14], play a fundamental role in the mechanical behavior of metal alloys. More specifically, the appearance of Geometrically Necessary Dislocations (GNDs) during plastic deformation of the material contributes to the hardening of the material.

Dislocations are usually classified into redundant and non-redundant dislocations, respectively called Statistically Stored Dislocations (SSDs) and Geometrically Necessary Dislocations (GNDs). GNDs are dislocations with a cumulative effect; they allow the accommodation of lattice curvature due to non-homogeneous deformation. They control the work hardening individually, by acting as obstacles to slip, and collectively, by creating a long-range back stress.

Detecting GNDs from 2D microstructure images is often challenging. One widely accepted way is to use the so-called Kernel Average Misorientation (KAM) [22]. The KAM, measured in Electron BackScatter Diffraction (EBSD), quantifies the average misorientation around a measurement point with respect to a defined set of nearest or nearest plus second-nearest neighbor points [6]. In [18, 27, 5], studies on the relation between GNDs and microstructure properties such as grain size and carbide size are presented. The relation between GNDs and grain size has both theoretical and experimental confirmation, and it can be related to the well-known macroscopic physical Hall-Petch relation [12, 25]. In fact, the Hall-Petch relation, in its original version, describes the negative dependence of yield stress (a mechanical property) on grain size; loosely speaking, the smaller the grains are, the stronger the material is. More specifically, in [15] the authors explain the relation between GNDs and grain size as follows: as the grain size decreases, the grain boundary layer in which GNDs typically accumulate occupies a greater volume fraction of the material, so it is reasonable to expect that the smaller the grains, the more GNDs will be observed.
Still unclear, on the other hand, is the relation between carbides and GNDs. Since the 1940s, several studies on how carbides affect the mechanical behaviour of metals have been conducted. In [26] the authors state that the primary carbides and their distribution have a major influence on the wear resistance and the toughness of the material. However, carbides tend to precipitate along the grain boundaries, which, as said before, are the locations in which GNDs typically accumulate. Until now, no direct physical relationship has been found between carbides and Geometrically Necessary Dislocations. Therefore, isolating the carbide effect and assessing the conjecture on the positive relation between carbides and GNDs is a problem of interest.
In [13], a descriptive statistical analysis with the KAM, used as a proxy for GNDs, as response variable, and the number of grains, the number of carbides and the position of carbides as explanatory variables, revealed an almost monotone trend of the response variable along increments of the explanatory variables.
Therefore, in order to take the already known direction of the physical relation into account, we propose an approach that incorporates this information, together with a procedure for testing the aforementioned conjectures on a new dataset.
In this context, isotonic regression comes to our aid. The idea at the basis of isotonic regression is to take order restrictions into account in order to improve the efficiency of the statistical analysis, reducing the (expected) error of the estimates and increasing the power of the testing procedures, provided that the hypothesized order restriction actually holds. The first papers about isotonic regression appeared in the 1950s [1, 38], and the books [2, 29] are well-known references for statistical inference under order restrictions. Isotonic regression proves its power in different fields, such as epidemiology, in testing the effects of different treatments or in dose-finding [30, 37], but also in genetics [19], business [16] and biology [3]. There are not many examples of the use of isotonic regression in Materials Science. Throughout this paper, special attention is given to the peculiar data structure. Nowadays, developments towards multivariate isotonic regression, isotonic regression in inverse and censoring problems [11, 10] and Bayesian isotonic regression [17] are ongoing. But even in the most basic framework something is still missing.
In this paper, starting off with the most basic case, univariate isotonic regression of means under normality assumptions with known variances, we guide the reader through estimation and testing of order restriction assumptions, considering different conditions on the variances.
Three different scenarios are considered. In all three cases, we focus on maximum likelihood as the estimation procedure and on the likelihood ratio test for hypothesis testing.
The first case is the basic one, in which the variances are known, or unknown but with a known ratio. This instance is considered extensively in [2, 29], and results for estimation and testing under order restrictions are already known.
The second scenario is, from an applications point of view, the most common: the variances are completely unknown. In [33], the authors derive a two-step procedure for estimating means and variances, and interesting results on existence and uniqueness of the maximum likelihood estimates are obtained under special conditions. Another iterative method, proposed in [34], is extended to the unknown-variances case. The derivation of the test statistic and of its distribution in this scenario is not trivial. In fact, the estimate of the mean under the null hypothesis is also affected by the variances being unknown. We propose the likelihood ratio test statistic and two different bootstrap approaches, one parametric and one non-parametric, for obtaining the distribution of the test statistic under the null hypothesis.

The last model considers order restrictions not only on the means but also on the variances. This case has not often been addressed, probably because it is not common to have prior knowledge on the order of both means and variances. As in the unknown-variances scenario, a two-step procedure for estimating means and variances is derived in [32], and similar results on existence and uniqueness under specific conditions on the empirical variances are given. In [34], an improved algorithm called the Alternating Iterative Method (AIM) and more general results about convergence are derived. For testing in this case, we derive the likelihood ratio test taking the order of the variances into account also under the null hypothesis, and we apply parametric and non-parametric bootstrap approaches, in line with those derived in the unknown-variances case, to obtain approximate p-values.
The structure of the paper is as follows. In Section 2 we explain the estimation procedure for the isotonic means in the three different cases (Sections 2.1, 2.2, 2.3). In Section 3 the focus is on the likelihood ratio test; we present it in the three different cases (Sections 3.1, 3.2, 3.3). In Section 4 we propose both a parametric and a non-parametric bootstrap approach for approximating the distribution of the test statistics under the null hypothesis. Finally, in Section 5 we come back to the application and illustrate step by step how to deal with a real problem, and more precisely how to perform isotonic regression and test for monotonicity of the KAM with respect to the number of carbides. The paper ends with conclusions in Section 6.

2. Estimating restricted means in the normal case

We first introduce isotonic regression and the notation used in the rest of the paper in a more general context. Normality is assumed throughout this section.

Let $Y_{ij}$, $i = 1,\dots,k$, $j = 1,\dots,n_i$, be the $j$th observation of the response variable corresponding to the $i$th level of the explanatory variable $X$.
We assume $Y_{ij}$ to be independent random variables, normally distributed with means $\mu_i$ and variances $\sigma_i^2$, $i = 1,\dots,k$, $j = 1,\dots,n_i$.
The log-likelihood is then given by

$$\ell(\mu, \sigma^2) = -\frac{1}{2}\sum_{i=1}^{k} n_i \log \sigma_i^2 - \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{n_i} \frac{(y_{ij}-\mu_i)^2}{\sigma_i^2} + c, \tag{1}$$

where $c$ is a constant which does not depend on the parameters $\mu = (\mu_1,\dots,\mu_k)$ and $\sigma^2 = (\sigma_1^2,\dots,\sigma_k^2)$.
Furthermore, we assume that $\mu$ satisfies

$$\mu_1 \le \mu_2 \le \dots \le \mu_k. \tag{2}$$

A $k$-dimensional vector $x = (x_1,\dots,x_k)$ is said to be isotonic if $i \le j$ implies $x_i \le x_j$. Let $\mathcal{I}$ be the set of all isotonic vectors in $\mathbb{R}^k$,

$$\mathcal{I} = \left\{x \in \mathbb{R}^k : x_1 \le x_2 \le \dots \le x_k\right\}. \tag{3}$$

In this section we are interested in the maximum likelihood estimator of $(\mu, \sigma^2)$, where $\mu$ is isotonic and $\sigma^2 \in (0,\infty)^k$. Depending on the information available on $\sigma^2$, different MLEs have been derived.
In the following three subsections the three different cases are considered.

2.1. Isotonic regression of means with known variance ratio

This first case constitutes the most basic setting, in which the variances are either known or unknown but differ according to known multiplicative constants $b_1,\dots,b_k > 0$. This means that the variance of the response variable at level $i$ is given by:

$$\sigma_i^2 = b_i\,\sigma^2, \qquad i = 1,\dots,k.$$

This specific case is already covered in [2, 29], but we hereafter report the main results. The problem of maximizing the log-likelihood (1) in $\mu$ can be rewritten equivalently as solving:

$$\min_{\mu \in \mathcal{I}} \sum_{i=1}^{k} w_i\,(\bar{y}_i - \mu_i)^2, \tag{4}$$

where $\bar{y}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}$ and $w_i = n_i/b_i$. Note that this objective function does not depend on $\sigma^2$. The solution, $\hat{\mu} = (\hat{\mu}_1,\dots,\hat{\mu}_k)$, is called the isotonic regression of $\bar{y} = (\bar{y}_1,\dots,\bar{y}_k)$ with weights $w$ [33]. For obtaining the solution to (4), different algorithms have been proposed in the literature ([2], [29]). In this paper, the "Pool-Adjacent-Violators Algorithm" (PAVA) is used.
More details about the algorithm are provided in Appendix A.1.
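To make the procedure concrete, here is a minimal Python sketch of weighted PAVA for problem (4); the function name `pava` and its interface are our own illustration, not code from [2] or [29].

```python
import numpy as np

def pava(y, w):
    """Weighted isotonic regression (increasing order) of y,
    solving problem (4) by Pool-Adjacent-Violators."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    blocks = []  # each block: [weighted mean, total weight, length]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # pool adjacent blocks as long as they violate the ordering
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2),
                           w1 + w2, n1 + n2])
    # expand block means back to a vector of the original length
    return np.concatenate([[m] * int(n) for m, _, n in blocks])
```

For instance, `pava([1.0, 0.5, 0.7], [1.0, 1.0, 1.0])` pools all three levels and returns approximately `[0.733, 0.733, 0.733]`.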

2.2. Isotonic regression of means with unknown variances

In this second case, no assumptions on the variances are made. They are unknown, and for obtaining the maximum likelihood estimate of $\mu$ they need to be estimated as well. In [33] the authors consider this case and obtain interesting results on existence and uniqueness of the MLE. We hereby recall the main results. The approach is to maximize the log-likelihood (1), with $\mu \in \mathcal{I}$ and $\sigma^2 \in (0,\infty)^k$.
For any fixed $\sigma^2$, the maximizer of (1) over $\mu \in \mathcal{I}$ is the isotonic regression of $\bar{y} = (\bar{y}_1,\dots,\bar{y}_k)$ with weights $w_i = n_i/\sigma_i^2$, $i = 1,\dots,k$.
On the other hand, for any fixed $\mu$, the maximizer of (1) over $\sigma^2$ is $\hat{\sigma}^2(\mu)$, where $\hat{\sigma}_i^2(\mu) = \frac{1}{n_i}\sum_{j=1}^{n_i}(y_{ij}-\mu_i)^2$.
Substituting $\hat{\sigma}^2(\mu)$ into (1), we can express the profile log-likelihood of $\mu$ as

$$\ell_p(\mu) = -\frac{1}{2}\sum_{i=1}^{k} n_i \log\left(s_i^2 + (\bar{y}_i - \mu_i)^2\right) + c, \tag{5}$$

where $s_i^2 = \frac{1}{n_i}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2$ is the sample variance of the $i$th normal population and $c$ a constant that does not depend on $\mu$. Note that $\ell_p(\mu) \to -\infty$ if $\mu_1 \to -\infty$ or $\mu_k \to \infty$. Hence, maximizing $\ell_p$ over $\mathcal{I}$ is equivalent to maximizing it over a compact subset of $\mathcal{I}$ of the type $\mathcal{I} \cap [-M, M]^k$. As $\ell_p$ is continuous on $\mathcal{I}$, a maximizer over $\mathcal{I}$ exists.
As previously said, the authors in [33] also discuss uniqueness of the MLE of $\mu$. They state that $\ell_p$ is not a concave function in general, and that the following condition suffices to guarantee uniqueness (see Theorem 2.3 in [33]):

Condition 2.1.

For $i = 1,\dots,k$, the sample variance $s_i^2$ is large enough, relative to the spread of the sample means, to satisfy the lower bound given in Theorem 2.3 of [33].

For finding a maximizer of (5), a two-step iterative algorithm based on PAVA has been proposed in [33]. From an initial guess for $\sigma^2$, the associated maximizer in $\mu$ is computed, then the maximizer in $\sigma^2$ based on this $\mu$, and so on. This iterative procedure stops when the maximum difference between the estimated means at step $m$ and at step $m+1$ is less than an arbitrarily small threshold value $\epsilon$, e.g.,

$$\max_{1\le i\le k}\left|\hat{\mu}_i^{(m+1)} - \hat{\mu}_i^{(m)}\right| < \epsilon.$$

In [34] the authors propose a new algorithm called the Alternating Iterative Method (AIM). The procedure is based on the minimization of a semi-convex function. In particular, the problem is restated in terms of minimizing $f(\mu, \tau) = -\ell(\mu, \tau)$, where $\tau = (\tau_1,\dots,\tau_k)$ with $\tau_i = 1/\sigma_i^2$; given a convex subset $A$ of $\mathbb{R}^k$ and a convex subset $B$ of $(0,\infty)^k$, $f$ is a semi-convex function because: i) $f$ is defined on $A \times B$; ii) for any given $\tau \in B$, $f(\cdot,\tau)$ is strictly convex on $A$ and, for any given $\mu \in A$, $f(\mu,\cdot)$ is strictly convex on $B$. The algorithm, originally proposed for simultaneous order restrictions on means and variances, can easily be extended to the unknown-variances case. The iteration alternates the search for the minimum point $\mu^{(m+1)}$ of $f(\cdot, \tau^{(m)})$ on a compact subset of $\mathcal{I}$ and the search for the minimum point $\tau^{(m+1)}$ of $f(\mu^{(m+1)}, \cdot)$ on $B$. The proof of convergence of the algorithm does not require additional conditions [34]. The iterative procedure stops when the difference between the log-likelihoods at step $m$ and at step $m+1$ is less than an arbitrarily small threshold value:

$$\left|\ell^{(m+1)} - \ell^{(m)}\right| < \epsilon. \tag{6}$$
A more detailed version of both algorithms is reported in Appendix A.2.
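As an illustration, the two-step procedure can be sketched as follows, reusing the `pava` function above and working on the sufficient statistics (group means $\bar{y}_i$, ML sample variances $s_i^2$, group sizes $n_i$); all names, starting values and tolerances are our own choices.

```python
def isoreg_unknown_variances(ybar, s2, n, tol=1e-6, max_iter=1000):
    """Two-step iterative MLE (Section 2.2): alternate isotonic
    regression of the means with plug-in ML variance updates."""
    ybar = np.asarray(ybar, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n = np.asarray(n, dtype=float)
    mu = ybar.copy()            # initial guess for the means
    sigma2 = s2.copy()          # initial guess for the variances
    for _ in range(max_iter):
        mu_new = pava(ybar, n / sigma2)        # maximize over mu
        sigma2 = s2 + (ybar - mu_new) ** 2     # maximize over sigma^2
        if np.max(np.abs(mu_new - mu)) < tol:  # stopping rule on the means
            break
        mu = mu_new
    return mu_new, sigma2
```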

2.3. Isotonic regression of means and variances simultaneously

We now assume that both the means and the variances are restricted by simple orderings. Therefore, in addition to assumption (2), we also assume:

$$\sigma_1^2 \ge \sigma_2^2 \ge \dots \ge \sigma_k^2. \tag{7}$$

The reason for taking decreasing order relates to our application considered in Section 5; increasing variances can be dealt with analogously. In [32], maximum likelihood estimation under simultaneous order restrictions on the means and variances of a Normal population is studied. Some of the most important results are hereby recalled. The approach is to maximize the log-likelihood (1) with $\mu \in \mathcal{I}$ and $\sigma^2 \in \mathcal{D}$, where $\mathcal{D}$ is the closure of

$$\left\{\sigma^2 \in \mathbb{R}^k : \sigma_1^2 \ge \sigma_2^2 \ge \dots \ge \sigma_k^2 > 0\right\}. \tag{8}$$

This means that the maximizer will have positive $\sigma^2$-values if there is variation within the groups. Then, for any fixed $\sigma^2$, the maximizer of (1) over $\mu \in \mathcal{I}$ is the isotonic regression of $\bar{y}$ with weights $w_i = n_i/\sigma_i^2$, $i = 1,\dots,k$.
Furthermore, for any $\mu$, the maximizer of (1) over $\sigma^2 \in \mathcal{D}$ is the so-called antitonic regression (isotonic regression with reversed order [10]) of $\hat{\sigma}^2(\mu)$, $\hat{\sigma}_i^2(\mu) = \frac{1}{n_i}\sum_{j=1}^{n_i}(y_{ij}-\mu_i)^2$, with weights $n_i$. Existence of the MLE is guaranteed (see Theorem 2.1 in [32]).
Uniqueness is proved under the following condition (see Theorem 2.2 in [32]):

Condition 2.2.

For $i = 1,\dots,k$, the sample variance $s_i^2$ satisfies a lower bound, given in Theorem 2.2 of [32], expressed in terms of $\bar{y}_{\max}$ and $\bar{y}_{\min}$, the maximal and the minimal sample means respectively.

As in the unknown-variances case, a two-step iterative algorithm is proposed for finding the solution for both means and variances under order restrictions. The proof of convergence of this algorithm is given under Condition 2.2.
Later, in [34], as mentioned in the previous section, the authors show that, restating the problem in terms of $(\mu, \tau)$ with $\tau_i = 1/\sigma_i^2$, Condition 2.2 is not needed for proving that the algorithm converges. In fact, the proposed AIM algorithm can also be employed in this case. Since $f(\mu, \tau) = -\ell(\mu, \tau)$ has continuous second-order partial derivatives and its Hessian matrix with respect to $\tau$ is a positive definite diagonal matrix for any fixed $\mu$, by Theorem 4 in [34] the iterative sequence of solutions $(\mu^{(m)}, \tau^{(m)})$ converges to the MLE and, consequently, so does the corresponding sequence $(\mu^{(m)}, \sigma^{2\,(m)})$.
As in the previous case, the alternating iterative procedure is stopped when the difference between the log-likelihoods at step $m$ and at step $m+1$ is less than an arbitrarily small threshold value (see (6)).
A pseudo code of the algorithms can be found in Appendix A.3.
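Analogously, here is a hedged sketch of the alternating procedure for the simultaneous order restrictions, reusing `pava` from Section 2.1; the antitonic step is implemented by reversing the data, and the stopping rule follows (6).

```python
def antitonic(y, w):
    """Antitonic (decreasing) regression: PAVA on the reversed data."""
    return pava(np.asarray(y)[::-1], np.asarray(w)[::-1])[::-1]

def isoreg_ordered_variances(ybar, s2, n, tol=1e-8, max_iter=1000):
    """Alternating estimation (Section 2.3): isotonic means and
    antitonic variances with weights n_i, stopping via (6)."""
    ybar = np.asarray(ybar, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n = np.asarray(n, dtype=float)
    sigma2 = antitonic(s2, n)
    ll_old = -np.inf
    for _ in range(max_iter):
        mu = pava(ybar, n / sigma2)                     # isotonic means
        sigma2 = antitonic(s2 + (ybar - mu) ** 2, n)    # antitonic variances
        # log-likelihood up to the constant c in (1)
        ll = -0.5 * np.sum(n * np.log(sigma2)
                           + n * (s2 + (ybar - mu) ** 2) / sigma2)
        if ll - ll_old < tol:
            break
        ll_old = ll
    return mu, sigma2
```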

3. Likelihood Ratio Test: constant against monotonicity

We are interested in testing hypotheses of monotonicity of $\mu$ under the various assumptions on the variances discussed in Section 2. There exists extensive literature on testing hypotheses on means. In most cases, a standard testing procedure entails testing the hypothesis of equality of means against the hypothesis that they are different. In this paper, we consider the same null hypothesis but a different alternative: monotonicity of the means. As in the previous section, we consider three different testing frameworks according to the different assumptions on the variances. In all three scenarios the test statistic of interest is the likelihood ratio test (LRT) statistic, an intuitive and powerful tool in hypothesis testing. In both [2] and [29], an entire chapter is dedicated to the development of the LRT and its use for testing order restriction hypotheses under the normality assumption with known variance ratio. Using the notation of Section 2, we wish to test

$$H_0 : \mu_1 = \mu_2 = \dots = \mu_k$$

against monotonicity of the means,

$$H_1 : \mu_1 \le \mu_2 \le \dots \le \mu_k, \text{ with at least one strict inequality.} \tag{9}$$

The likelihood ratio test for $H_0$ against $H_1$ can be defined via:

$$\Lambda = \frac{\sup_{(\mu,\sigma^2) \in \Theta_0} L(\mu,\sigma^2)}{\sup_{(\mu,\sigma^2) \in \Theta_1} L(\mu,\sigma^2)}, \tag{10}$$

where $L$ denotes the likelihood, $\Theta_0$ the parameter space under $H_0$ and $\Theta_1$ the parameter space under $H_1$. The test rejects the null hypothesis for small values of $\Lambda$, or alternatively for large values of $T = -2\log\Lambda$. The convenience of this second form lies in the analogy with the statistic used to test $H_0$ against the unrestricted alternative $H_2$ that not all $\mu_i$'s, $i = 1,\dots,k$, are the same.
In the following subsections, more explicit expressions for $\Lambda$ are given, depending on the specific assumptions on the means and variances.

3.1. LRT with known variance ratio

As in Section 2.1, let $Y_{ij}$, $i = 1,\dots,k$, $j = 1,\dots,n_i$, be independent observations, normally distributed with unknown means $\mu_i$ and variances $b_i\sigma^2$, with $b_1,\dots,b_k$ known. Under $H_0$, the maximum likelihood estimate of the common mean is given by:

$$\hat{\mu}_0 = \frac{\sum_{i=1}^{k} w_i\,\bar{y}_i}{\sum_{i=1}^{k} w_i}, \tag{11}$$

with $w_i = n_i/b_i$. Under $H_1$, the MLE of $\mu$ is $\hat{\mu}^* = (\hat{\mu}_1^*,\dots,\hat{\mu}_k^*)$, the isotonic regression of $\bar{y}$ with weights $w_i$, with respect to the simple order defined in (9).
If the variances are fully known, i.e. both the $b_i$ and $\sigma^2$ are known, the likelihood ratio test for $H_0$ against $H_1$ boils down to rejecting for large values of

$$T_{01} = -2\log\Lambda = \frac{1}{\sigma^2}\left(\sum_{i=1}^{k} w_i\,(\bar{y}_i - \hat{\mu}_0)^2 - \sum_{i=1}^{k} w_i\,(\bar{y}_i - \hat{\mu}_i^*)^2\right). \tag{12}$$

It is easy to check that the test is equivalent to rejecting for large values of:

$$\bar{\chi}^2_{01} = \frac{\sum_{i=1}^{k} w_i\,(\hat{\mu}_i^* - \hat{\mu}_0)^2}{\sigma^2}, \tag{13}$$

where $w_i = n_i/b_i$ and $\sigma^2$ is the (known) common value of the variance when $b_i \equiv 1$. The equivalence follows from the decomposition $\sum_i w_i(\bar{y}_i - \hat{\mu}_0)^2 = \sum_i w_i(\bar{y}_i - \hat{\mu}_i^*)^2 + \sum_i w_i(\hat{\mu}_i^* - \hat{\mu}_0)^2$, a standard property of isotonic regression.

Now, let us consider the more general case, with $b_1,\dots,b_k$ known and $\sigma^2$ unknown. The estimator of $\sigma^2$ under the null hypothesis is

$$\hat{\sigma}_0^2 = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-\hat{\mu}_0)^2}{b_i}, \qquad N = \sum_{i=1}^{k} n_i, \tag{14}$$

and under $H_1$

$$\hat{\sigma}_1^2 = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-\hat{\mu}_i^*)^2}{b_i}. \tag{15}$$

The likelihood ratio test rejects for small values of $\Lambda_{01} = (\hat{\sigma}_1^2/\hat{\sigma}_0^2)^{N/2}$ or equivalently, taking $\bar{E}^2_{01} = 1 - \Lambda_{01}^{2/N}$, for large values of

$$\bar{E}^2_{01} = 1 - \frac{\hat{\sigma}_1^2}{\hat{\sigma}_0^2} = \frac{\sum_{i=1}^{k} w_i\,(\hat{\mu}_i^* - \hat{\mu}_0)^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij}-\hat{\mu}_0)^2/b_i}. \tag{16}$$

An extension to the multivariate case with covariance matrix unknown but common can be found in [24, 31].
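For concreteness, the following sketch computes $\bar{\chi}^2_{01}$ of (13) and $\bar{E}^2_{01}$ of (16) as reconstructed above, assuming the `pava` helper from Section 2.1; the interface is our own choice.

```python
def lrt_known_ratio(y_groups, b, sigma2=None):
    """Compute chibar^2_01 as in (13) when sigma2 is known,
    or Ebar^2_01 as in (16) when sigma2 is None (unknown)."""
    n = np.array([len(g) for g in y_groups], dtype=float)
    ybar = np.array([np.mean(g) for g in y_groups])
    w = n / np.asarray(b, dtype=float)
    mu0 = np.sum(w * ybar) / np.sum(w)   # H0 estimate, cf. (11)
    mu_star = pava(ybar, w)              # H1 isotonic estimate
    num = np.sum(w * (mu_star - mu0) ** 2)
    if sigma2 is not None:               # known variances: (13)
        return num / sigma2
    rss0 = sum(np.sum((np.asarray(g) - mu0) ** 2) / bi
               for g, bi in zip(y_groups, b))
    return num / rss0                    # Ebar^2_01 as in (16)
```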

3.2. LRT with unknown variances

In this second case, no assumptions on the variances are made: they are unknown and possibly unequal. Using the notation of Section 2.2, let $Y_{ij}$, $i = 1,\dots,k$, $j = 1,\dots,n_i$, be independent observations from univariate Normal distributions with unknown mean vector $\mu$ and completely unknown variances $\sigma_1^2,\dots,\sigma_k^2$. Let $(\hat{\mu}^*, \hat{\sigma}^{2*})$ be the solution of the isotonic regression problem with unknown variances, found using Algorithm 2.2 in Appendix A.
The first example of testing when all the variances are unknown can be found in [4]; the univariate version of the test statistic proposed there is of the form

$$T = \sum_{i=1}^{k} \frac{n_i\left(\hat{\mu}_i^* - \hat{\mu}_0\right)^2}{s_i^2}, \tag{17}$$

with the unknown variances replaced by their sample counterparts $s_i^2$ and $\hat{\mu}_0$ a weighted estimate of the common mean under $H_0$. This test is clearly inspired by the LRT, but it is not the likelihood ratio test itself.
Let us first consider the maximum likelihood solution $(\hat{\mu}_0, \hat{\sigma}_0^2)$ under the null hypothesis. The log-likelihood under the null hypothesis is

$$\ell_0(\mu, \sigma^2) = -\frac{1}{2}\sum_{i=1}^{k} n_i \log\sigma_i^2 - \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-\mu)^2}{\sigma_i^2} + c. \tag{18}$$

Differentiating this log-likelihood with respect to $\mu$ and $\sigma_1^2,\dots,\sigma_k^2$, the following $k+1$ score equations in $k+1$ unknowns emerge:

$$\mu = \frac{\sum_{i=1}^{k} (n_i/\sigma_i^2)\,\bar{y}_i}{\sum_{i=1}^{k} n_i/\sigma_i^2}, \qquad \sigma_i^2 = s_i^2 + (\bar{y}_i - \mu)^2, \quad i = 1,\dots,k. \tag{19}$$

Substituting the variance equations into (18), the profile log-likelihood of $\mu$ is:

$$\ell_p(\mu) = -\frac{1}{2}\sum_{i=1}^{k} n_i \log\left(s_i^2 + (\bar{y}_i - \mu)^2\right) + c. \tag{20}$$
Theorem 3.1.

A maximizer of (20) over $\mu \in \mathbb{R}$ exists, and it is contained in $[\min_i \bar{y}_i, \max_i \bar{y}_i]$. Moreover, if $[\min_i \bar{y}_i, \max_i \bar{y}_i] \subseteq \bigcap_{i=1}^{k}(\bar{y}_i - s_i, \bar{y}_i + s_i)$, then the maximizer is unique.

Proof.

Maximizing the profile log-likelihood (20) boils down to maximizing the sum of the functions

$$f_i(\mu) = -\frac{n_i}{2}\log\left(s_i^2 + (\bar{y}_i - \mu)^2\right), \qquad i = 1,\dots,k. \tag{21}$$

Functions of type (21) are unimodal with mode at $\bar{y}_i$ and strictly concave on $(\bar{y}_i - s_i, \bar{y}_i + s_i)$. The sum of such unimodal functions is decreasing to the right of the rightmost mode $\max_i \bar{y}_i$ (since all terms are decreasing there) and, from $-\infty$ up to the leftmost mode $\min_i \bar{y}_i$, the sum is increasing (as all of the functions are increasing on that set). Therefore, any maximizer of $\ell_p$, if it exists, belongs to the interval $[\min_i \bar{y}_i, \max_i \bar{y}_i]$. As $\ell_p$ is continuous on this compact interval, existence of a maximizer is guaranteed.
Now consider the (possibly empty) interval $\bigcap_{i=1}^{k}(\bar{y}_i - s_i, \bar{y}_i + s_i)$ on which all the functions in (21) are strictly concave; on that interval the sum (20) is also strictly concave. If $[\min_i \bar{y}_i, \max_i \bar{y}_i]$ is contained in this intersection, $\ell_p$ is strictly concave on it. Hence $\ell_p$ has a unique maximizer on $[\min_i \bar{y}_i, \max_i \bar{y}_i]$, and therefore on $\mathbb{R}$. ∎

Remark: in a setting with real data, it is easy to check whether $[\min_i \bar{y}_i, \max_i \bar{y}_i] \subseteq \bigcap_{i=1}^{k}(\bar{y}_i - s_i, \bar{y}_i + s_i)$, and hence to determine whether the maximizer is unique.
However, as seen from (19), the MLE $\hat{\mu}_0$ has no closed-form expression. Therefore, in [8] and [23] two different methods for finding the optimal solution are proposed. The first is an iterative procedure based on the Newton-Raphson method; a reasonable initial value for $\mu$ is the so-called Graybill-Deal estimator [9],

$$\hat{\mu}_{GD} = \frac{\sum_{i=1}^{k} (n_i/s_i^2)\,\bar{y}_i}{\sum_{i=1}^{k} n_i/s_i^2}.$$

The convergence speed of the algorithm strongly depends on the initial values. The second method is based on the profile likelihood approach: the authors in [23] propose the bisection method for finding the zero of the derivative of the profile log-likelihood (20) with respect to $\mu$. Under $H_1$, we use as estimates of $(\mu, \sigma^2)$ the values $(\hat{\mu}^*, \hat{\sigma}^{2*})$ found using the iterative procedure described in Section 2.2.
The likelihood ratio test statistic when the variances are completely unknown can be expressed, after substituting the ML variance estimates under both hypotheses, as

$$T_{02} = -2\log\Lambda_{02} = \sum_{i=1}^{k} n_i \log\frac{s_i^2 + (\bar{y}_i - \hat{\mu}_0)^2}{s_i^2 + (\bar{y}_i - \hat{\mu}_i^*)^2}.$$

Therefore, as in the previous case, the test rejects for small values of $\Lambda_{02}$ or, equivalently, for large values of $T_{02}$.
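In code, the profile-likelihood route to $\hat{\mu}_0$ and the statistic $T_{02}$ can be sketched as follows, under the expressions derived above and reusing `isoreg_unknown_variances`; `brentq` finds the root of the score equation in (19), which is bracketed by the extreme sample means.

```python
from scipy.optimize import brentq

def common_mean_mle(ybar, s2, n):
    """H0 estimate of the common mean: root of the score equation (19),
    bracketed by the smallest and largest sample means."""
    def score(mu):
        return np.sum(n * (ybar - mu) / (s2 + (ybar - mu) ** 2))
    return brentq(score, ybar.min() - 1e-9, ybar.max() + 1e-9)

def lrt_unknown_variances(ybar, s2, n):
    """T_02 = -2 log Lambda_02: H0 (equal means) versus
    H1 (isotonic means), completely unknown variances."""
    mu0 = common_mean_mle(ybar, s2, n)
    mu_star, _ = isoreg_unknown_variances(ybar, s2, n)
    return np.sum(n * np.log((s2 + (ybar - mu0) ** 2)
                             / (s2 + (ybar - mu_star) ** 2)))
```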

3.3. LRT with ordered variances

Using the notation of Section 2.3, let $Y_{ij}$, $i = 1,\dots,k$, $j = 1,\dots,n_i$, be independent observations from Normal distributions with mean vector $\mu$ and ordered variances $\sigma_1^2 \ge \dots \ge \sigma_k^2$. As in the previous case, the first step is the estimation of $(\mu, \sigma^2)$ under the null hypothesis. In this case we need to maximize (18) under the restriction

$$\sigma_1^2 \ge \sigma_2^2 \ge \dots \ge \sigma_k^2. \tag{22}$$

Theorem 3.2.

Suppose that $s_i^2 > 0$ for $i = 1,\dots,k$. Then there exists a maximizer of (18) under the constraints (22).

Proof.

First consider the situation for fixed $\sigma^2$ with $\sigma_i^2 > 0$ for all $i$. Differentiating (18) with respect to $\mu$ yields the equation

$$\sum_{i=1}^{k} \frac{n_i(\bar{y}_i - \mu)}{\sigma_i^2} = 0.$$

This shows that, for this $\sigma^2$, the (unique) maximizer of (18) in $\mu$ is given by the following weighted sum of level-means,

$$\hat{\mu}(\sigma^2) = \frac{\sum_{i=1}^{k} (n_i/\sigma_i^2)\,\bar{y}_i}{\sum_{i=1}^{k} n_i/\sigma_i^2}.$$

Consequently, $\hat{\mu}(\sigma^2) \in [\min_i \bar{y}_i, \max_i \bar{y}_i]$, bounding the set of possible maximizers of (18) in $\mu$ irrespective of the precise value of $\sigma^2$.
Now, given any $\mu$, the corresponding optimal $\sigma^2$ is the solution to the antitonic regression problem of $\hat{\sigma}^2(\mu)$ with weights $n_i$, $i = 1,\dots,k$ (see [29], Example 1.5.5). The vector to be projected has elements $\hat{\sigma}_i^2(\mu) = s_i^2 + (\bar{y}_i - \mu)^2$. This means that, if $\mu$ is restricted to $[\min_i \bar{y}_i, \max_i \bar{y}_i]$, the coordinates to be projected all belong to the interval $[\min_i s_i^2,\; \max_i s_i^2 + (\max_i \bar{y}_i - \min_i \bar{y}_i)^2]$. So, if $\mu$ ranges over $[\min_i \bar{y}_i, \max_i \bar{y}_i]$, the optimal $\sigma^2$ is also contained in a closed bounded region. By our assumption that all $s_i^2 > 0$, the MLE exists, being a maximizer of a continuous function on a compact set in $\mathbb{R}^{k+1}$. ∎

If we consider this case as a special case of the setting considered in [32], the solution is unique if Condition 2.2 holds. Given that the solution is not available in closed form, we use an iterative procedure to approximate it. As a starting value for $\mu$, a modified version of the Graybill-Deal estimator of the common mean when the variances are subject to order restrictions, proposed in [21], appears to be a good choice:

$$\hat{\mu}^{(0)} = \frac{\sum_{i=1}^{k} (n_i/\tilde{s}_i^2)\,\bar{y}_i}{\sum_{i=1}^{k} n_i/\tilde{s}_i^2}, \tag{23}$$

where $\tilde{s}^2$ is the antitonic regression (isotonic regression with reversed order) of $(s_1^2,\dots,s_k^2)$ with weights $n_i$, $i = 1,\dots,k$.
Under $H_1$, we use as estimates of $(\mu, \sigma^2)$ the values $(\hat{\mu}^*, \hat{\sigma}^{2*})$ found using the iterative procedure described in Section 2.3.
In contrast with the previous cases, it is not possible to further reduce the expression of the LRT, because under the null hypothesis the term

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij} - \hat{\mu}_0)^2}{\hat{\sigma}_{0,i}^2}$$

does not reduce to a constant. The same holds under $H_1$. Therefore, the LRT in this case is computed by substituting the solutions obtained via the iterative procedures under $H_0$ and $H_1$ into the generic expression given in (10):

$$\Lambda_{03} = \frac{L(\hat{\mu}_0, \hat{\sigma}_0^2)}{L(\hat{\mu}^*, \hat{\sigma}^{2*})}, \tag{24}$$

and the test rejects for small values of $\Lambda_{03}$ or, equivalently, for large values of $T_{03} = -2\log\Lambda_{03}$.
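A sketch of the corresponding computations, reusing `antitonic` and `isoreg_ordered_variances` from Section 2.3; the $H_0$ routine and the statistic below follow the iterative scheme described above and are our own illustration.

```python
def h0_ordered_variances(ybar, s2, n, tol=1e-8, max_iter=1000):
    """H0 estimation sketch: common mean with antitonic variances,
    started from the modified Graybill-Deal estimator (23)."""
    sigma2 = antitonic(s2, n)          # antitonic regression of s^2
    mu = np.sum(n / sigma2 * ybar) / np.sum(n / sigma2)   # cf. (23)
    for _ in range(max_iter):
        sigma2 = antitonic(s2 + (ybar - mu) ** 2, n)
        mu_new = np.sum(n / sigma2 * ybar) / np.sum(n / sigma2)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu_new, sigma2

def lrt_ordered_variances(ybar, s2, n):
    """T_03 = -2 log Lambda_03, from the full log-likelihoods in (24)."""
    def loglik(mu, sigma2):
        return -0.5 * np.sum(n * np.log(sigma2)
                             + n * (s2 + (ybar - mu) ** 2) / sigma2)
    mu0, sig0 = h0_ordered_variances(ybar, s2, n)
    mu1, sig1 = isoreg_ordered_variances(ybar, s2, n)
    return -2.0 * (loglik(mu0, sig0) - loglik(mu1, sig1))
```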

4. Null hypothesis distribution of the test statistics: bootstrap approach

In order to determine the significance of the various test statistics proposed in the previous sections, we need their distribution under the null hypothesis. The main distributional results concerning $\bar{\chi}^2_{01}$ and $\bar{E}^2_{01}$, the test statistics derived in the known variance ratio case, are contained in [2] (Theorems 3.1-3.2). However, problems related to the value of $k$ can arise in the analytical derivation of the p-values. Numerical approximation can be necessary, especially if $k$ is large and if the variation in the range of the weights is not 'moderate' [28, 35].
Furthermore, in the case of completely unknown variances, the null distribution depends on the unknown variances themselves. When analytical derivation of the null distribution is particularly complex or not possible, bootstrap methodology is a good option. Therefore, we propose both a parametric and a non-parametric bootstrap approach that can easily be employed for finding approximate p-values under the different assumptions on the variances. To overcome the complex derivation when the variances are unknown, bootstrap procedures have been proposed in the literature [20, 4].
In particular, in [20] an interesting review of the methods used to approximate the null distribution of the test statistic under $H_0$ and the restrictive normality assumption (with which we will not deal in this paper) is reported. Moreover, the authors propose both a parametric and a non-parametric bootstrap approach to the null distribution of the likelihood ratio test for one-sided hypothesis testing for means in a multivariate setting [20]. Also in [4], a bootstrap approach is used to test the homogeneity of order-restricted mean vectors when the covariance matrices are unknown. In line with those previous approaches, we propose here two general bootstrap procedures, parametric and non-parametric, that can be used for testing the null hypothesis under the various assumptions on the variances.

Parametric bootstrap

Algorithm:

(1):

Obtain the estimates $\hat{\mu}_0$ and $\hat{\mu}^*$ (with the corresponding variance estimates) from the original data and compute the observed value $T_{\text{obs}}$ of the test statistic of interest ($\bar{\chi}^2_{01}$, $\bar{E}^2_{01}$, $T_{02}$ or $T_{03}$).

(2):

Generate, for $b = 1,\dots,B$, bootstrap observations $y_{ij}^{(b)} \sim N(\hat{\mu}_0, \hat{\sigma}_{0,i}^2)$, $i = 1,\dots,k$, $j = 1,\dots,n_i$, independently.

(3):

For $b = 1,\dots,B$, obtain the estimates under $H_0$ and $H_1$ from the $b$th bootstrap sample and compute the bootstrap test statistic of interest $T^{(b)}$.

(4):

Repeat (2)-(3) a sufficiently large number of times $B$.

The bootstrap approximation of the p-value is then given by:

$$\hat{p} = \frac{\#\{b : T^{(b)} \ge T_{\text{obs}}\}}{B}, \tag{25}$$

and the null hypothesis is rejected whenever this p-value is less than the nominal level $\alpha$.
Step (2) is the key step, in which the assumptions on the variances play a crucial role. It is interesting to notice that the above procedure can be simplified further: instead of generating individual observations, we can directly generate the empirical means $\bar{y}_i^{(b)} = \hat{\mu}_0 + \hat{\sigma}_{0,i}\,Z_i/\sqrt{n_i}$, with $Z_1,\dots,Z_k$ standard Normally distributed (for the statistics that also involve the sample variances, these can be drawn from the corresponding scaled $\chi^2$ distributions, using the independence of sample means and variances in the Normal model). A sketch of the full procedure follows.
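A minimal sketch of the parametric bootstrap, assuming the unknown-variances scenario of Section 3.2 and reusing the functions sketched earlier; the test statistic is passed as a function of the grouped data, and all names are our own.

```python
rng = np.random.default_rng(12345)

def stat_T02(y_groups):
    """T_02 computed from raw grouped data."""
    n = np.array([len(g) for g in y_groups], dtype=float)
    ybar = np.array([np.mean(g) for g in y_groups])
    s2 = np.array([np.mean((np.asarray(g) - m) ** 2)
                   for g, m in zip(y_groups, ybar)])
    return lrt_unknown_variances(ybar, s2, n)

def parametric_bootstrap_pvalue(y_groups, statistic=stat_T02, B=1000):
    """Steps (1)-(4) and the p-value (25): resample from
    N(mu0_hat, sigma0_i^2_hat) under H0."""
    n = np.array([len(g) for g in y_groups], dtype=float)
    ybar = np.array([np.mean(g) for g in y_groups])
    s2 = np.array([np.mean((np.asarray(g) - m) ** 2)
                   for g, m in zip(y_groups, ybar)])
    mu0 = common_mean_mle(ybar, s2, n)
    sigma0 = np.sqrt(s2 + (ybar - mu0) ** 2)   # H0 standard deviations
    t_obs = statistic(y_groups)
    t_boot = np.array([statistic([rng.normal(mu0, sd, size=int(ni))
                                  for sd, ni in zip(sigma0, n)])
                       for _ in range(B)])
    return np.mean(t_boot >= t_obs)            # p-value as in (25)
```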

Non-parametric bootstrap

The non-parametric version of the bootstrap drops the normality assumption for the bootstrap samples. However, a relatively large sample size is required for the following approach to work well.

Algorithm:

(1):

Obtain the estimates $\hat{\mu}_0$ and $\hat{\mu}^*$ (with the corresponding variance estimates) from the original data and compute the observed value $T_{\text{obs}}$ of the test statistic of interest ($\bar{\chi}^2_{01}$, $\bar{E}^2_{01}$, $T_{02}$ or $T_{03}$).

(2):

Standardize the original observations to obtain 'standardized residuals', e.g. $e_{ij} = (y_{ij} - \bar{y}_i)/s_i$.

(3):

Combine all $N = \sum_{i=1}^{k} n_i$ residuals into one vector and draw simple random samples with replacement, of respective sizes $n_1,\dots,n_k$.

(4):

Transform the resampled residuals into bootstrap observations $y_{ij}^{(b)} = \hat{\mu}_0 + \hat{\sigma}_{0,i}\,e_{ij}^{(b)}$.

(5):

For each bootstrap sample, obtain the estimates under $H_0$ and $H_1$ and compute the bootstrap test statistic of interest $T^{(b)}$.

(6):

Repeat (3)-(4)-(5) a sufficiently large number of times $B$.

The bootstrap approximated p-value is defined as in the parametric case (25).
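A matching sketch of the non-parametric version, under the same assumptions and the residual definitions of steps (2)-(4) above, reusing `rng`, `stat_T02` and `common_mean_mle`:

```python
def nonparametric_bootstrap_pvalue(y_groups, statistic=stat_T02, B=1000):
    """Steps (1)-(6): pool standardized residuals, resample with
    replacement, and rebuild bootstrap samples under H0."""
    n = [len(g) for g in y_groups]
    ybar = np.array([np.mean(g) for g in y_groups])
    s2 = np.array([np.mean((np.asarray(g) - m) ** 2)
                   for g, m in zip(y_groups, ybar)])
    mu0 = common_mean_mle(ybar, s2, np.array(n, dtype=float))
    sigma0 = np.sqrt(s2 + (ybar - mu0) ** 2)
    resid = np.concatenate([(np.asarray(g) - m) / np.sqrt(v)   # step (2)
                            for g, m, v in zip(y_groups, ybar, s2)])
    t_obs = statistic(y_groups)
    t_boot = np.array([statistic([mu0 + sd * rng.choice(resid, size=ni)
                                  for sd, ni in zip(sigma0, n)])
                       for _ in range(B)])
    return np.mean(t_boot >= t_obs)
```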

5. Application

One of the most common ways of investigating the strength and ductility of metallic materials is by performing a tensile test. Loosely speaking, a tensile test is an experiment in which force is applied to a test sample, causing deformation of the material, temporary (elastic behavior) or permanent (plastic behavior), and eventually its fracture [7]. The data used in this paper are image data of the microstructure of the material subjected to a plastic strain (deformation) of 0.139, obtained by performing a uniaxial tensile test, in which force is applied to the test sample along one specific axis (Fig. 1).

Figure 1. Tensile testing machine

At the microstructure level, the deformation of the material corresponds to displacements in the lattice structure and to the possible appearance of Geometrically Necessary Dislocations (GNDs). The material used in this paper is an annealed AISI420 stainless steel with carbides, and the aim is to investigate the carbide effect on GND formation. The Kernel Average Misorientation (KAM) is used as a proxy for the GNDs. In Figure 2, the KAM is represented by red filaments, the blue lines represent the grain boundaries, and the carbides are the black dots.

Figure 2. Microstructure image showing the KAM at strain level 13.9% (overlapped grid of 625 squares).
Figure 3. Plot of KAM against number of carbides for the 625 square areas of Figure 2.

The results of the estimation in the three different scenarios of Sections 2.1-2.3 are summarized in Table 1.

Number of carbides                      0      1      2      3
Sample means                          0.815  0.833  0.870  0.854
Sample variances (divisor n_i)        0.035  0.024  0.017  0.022
Sample variances (divisor n_i - 1)    0.035  0.024  0.017  0.023
Isotonic means (Section 2.1)          0.815  0.833  0.867  0.867
Isotonic means (Section 2.2)          0.815  0.833  0.867  0.867
Variance estimates (Section 2.2)      0.035  0.024  0.017  0.022
Isotonic means (Section 2.3)          0.815  0.833  0.866  0.866
Antitonic variances (Section 2.3)     0.035  0.024  0.018  0.018
Group sizes n_i                         340    211     54     18
Table 1. Values of the estimated means and variances of the KAM, conditioned on the number of carbides visible in a square of the grid, according to the different order restriction assumptions (13.9% strain).
                     mu_0 (under H_0)   Test statistic   p-value (parametric)   p-value (non-parametric)
chibar^2_01               0.827             5.760              0.0323                  0.0310
Ebar^2_01                 0.831             0.0121             0.0112                  0.0085
T_02                      0.831             n/a                0.0178                  0.0222
T_03                      0.831             n/a                0.0212                  0.0251

Estimated variances under H_0 (number of carbides 0, 1, 2, 3):
Unknown variances (Section 3.2)    0.035  0.024  0.018  0.022
Ordered variances (Section 3.3)    0.035  0.024  0.019  0.019
Table 2. Estimated values for the four different likelihood ratio tests with the corresponding parametric and non-parametric p-values.

Modeling the relationship between KAM and carbides, and more generally understanding the inhomogeneous distribution of the KAM over the microstructure, is now the main aim; it can be considered a starting point for finding a stochastic model for predicting mechanical properties from 2D microstructure images. We apply the estimation procedures and perform tests under order restrictions: three different univariate isotonic regressions, according to the assumptions on the variances.
The first step for obtaining the data in the most suitable form for the analysis is 'overlaying' a grid on the image; in Figure 2, a grid of 625 squares is added to the image. We denote by $y_{ij}$ the mean KAM value of the $j$th grid square in which $i$ carbides are observed, $i = 0, 1, 2, 3$. The explanatory variable in all three isotonic regressions is the number of carbides observed in the grid squares. A plot of the data is shown in Figure 3.
We wish to test the null hypothesis $H_0 : \mu_0 = \mu_1 = \mu_2 = \mu_3$, i.e. that the expected KAM is the same in all the squares of the grid, regardless of the number of carbides observed, where $\mu_i$ denotes the expected KAM in squares containing $i$ carbides. The alternative hypothesis

$$H_1 : \mu_0 \le \mu_1 \le \mu_2 \le \mu_3, \text{ with at least one strict inequality,} \tag{26}$$

represents the idea that the KAM tends to be higher in areas where more carbides are observed. Moreover, in the ordered variances case, we assume that

$$\sigma_0^2 \ge \sigma_1^2 \ge \sigma_2^2 \ge \sigma_3^2. \tag{27}$$

This is in accordance with what we see in Figure 3. In fact, the idea behind this assumption is that in areas in which fewer carbides are observed, GNDs have more freedom to move, resulting in a larger dispersion. In Table 2, the results of testing the null hypothesis are shown. For computing $\bar{\chi}^2_{01}$ and $\bar{E}^2_{01}$, the variance ratio is supposed to be known; in this specific case, we take the total KAM variance of the whole image as the true known variance. For computing both the parametric and the non-parametric p-values, the two bootstrap approaches described in Section 4 have been used, with a large number of replications $B$. Independently of the knowledge of, or assumptions on, the variances, the conclusion is the same and leads to the rejection of the null hypothesis.
