Group decision making (GDM) is a common activity in daily life. For a typical multi-attribute group decision making (MAGDM) problem, a group of decision makers are usually required to express their assessments over alternatives with regard to some predefined criteria. Afterwards, the evaluation information is aggregated to form a group opinion, based on which a collective evaluation and a ranking of alternatives can be obtained [1, 2]. Current GDM problems demand quick solutions, and decision makers may doubt or have vague or uncertain knowledge about the alternatives; hence they cannot express their assessments with exact numerical values. Consequently, a more realistic approach may be to use linguistic assessments instead of numerical values [3, 4]. In the literature, MAGDM problems involving uncertainty are usually dealt with by linguistic modeling, which implies computing with words (CW) processes to obtain accurate and easily understood results [5, 6].
Despite a large amount of research conducted on GDM with linguistic information [7, 8, 9, 10], there are still some challenges that need to be tackled. One of them is how to deal with GDM problems with large groups under a linguistic environment. In traditional GDM problems, only a small number of decision makers take part in the decision process. In recent years, the increase of technological and societal demands has given birth to new paradigms and means of making large-scale group decisions (such as e-democracy and social networks). As a result, large-scale GDM problems have received more and more attention from scholars. Research on large-scale GDM (LGDM) can be grouped into four categories, i.e., clustering methods in LGDM [12, 13, 14], consensus reaching processes in LGDM [11, 15], LGDM methods [16, 17] and LGDM support systems [18, 19].
For linguistic LGDM problems, one important challenge is how to represent the group’s linguistic assessment, especially when anonymity is needed to protect the privacy of decision makers. It seems that the linguistic models and computational processes used in traditional linguistic GDM problems can be directly extended to linguistic LGDM problems, which may include the linguistic aggregation operator-based approach and the models based on uncertain linguistic terms [21, 22] and hesitant fuzzy linguistic term sets [23, 24, 25]. However, in linguistic LGDM problems, the group’s assessments usually tend to present a distribution over the terms in the linguistic term set used, which can reflect the preference tendencies of decision makers and provide more information about the collective assessments of alternatives. The linguistic models and computational processes introduced to deal with linguistic information in traditional linguistic GDM problems could imply an oversimplification of the elicited information from the very beginning, and may thus lead to loss and distortion of information. In order to keep the maximum information elicited by the decision makers in a group in the initial stages of the decision process, this paper proposes the use of linguistic distribution assessments [26, 27] to represent the group’s linguistic information in linguistic LGDM problems.
Additionally, in linguistic GDM problems, multiple sources of information with different degrees of knowledge and background may take part in the decision process, which usually implies the appearance and the necessity of multiple linguistic scales (multi-granular linguistic information) to properly model the different knowledge elicited by each source of information. Different approaches have been introduced in the literature not only to model and manage such a type of information but also for computing with it [29, 30, 31, 32, 33, 34, 35]. Therefore, in a linguistic LGDM problem, decision makers may use different linguistic term sets to provide their assessments over alternatives. In order to keep the maximum information in the initial stages of the decision process, the linguistic distribution assessments will be multi-granular linguistic ones. Hence, there is a clear need to deal with multi-granular linguistic distribution assessments in the decision process. Moreover, according to the CW scheme [36, 20], it is also crucial to obtain final linguistic results that are interpretable to decision makers. Therefore, new models for representing and managing multi-granular linguistic distribution assessments will be developed.
Consequently, the aim of this paper is to introduce a new linguistic computational model that is able to deal with multi-granular linguistic information by keeping the maximum information at the initial stages of the decision process. The model removes initial aggregation processes and represents the information provided by experts with linguistic distribution assessments, obtaining a solution set of alternatives by a classical decision approach with specific operators defined for linguistic distribution assessments, thus providing interpretable results.
The remainder of this paper is organized as follows. In Section II, some necessary preliminaries for the proposed model are presented. In Section III, improved distance measures and a ranking approach for linguistic distribution assessments are provided. In Section IV, a new linguistic computational model is introduced to deal with multi-granular linguistic distribution assessments. In Section V, an approach is developed to deal with large-scale linguistic MAGDM problems using multi-granular linguistic distribution assessments. In Section VI, an example is given to illustrate the proposed MAGDM approach. Finally, this paper is concluded in Section VII.
In order to make this paper as self-contained as possible, some related preliminaries are presented in this section. In subsection II-A, we review some basic knowledge related to linguistic information and decision making. In subsection II-B, how to deal with multi-granular linguistic information is presented. In subsection II-C, related concepts about linguistic distribution assessments are provided.
II-A Linguistic information and decision making
Many aspects of decision making activities in the real world are usually assessed in a qualitative way due to the vague or imprecise knowledge of decision makers. In such cases, the use of linguistic information seems to be a better way for decision makers to express their assessments. To manage linguistic information in decision making, linguistic modeling techniques are needed. In linguistic modeling, the linguistic variable defined by Zadeh [37, 38, 39] is usually employed to reduce the communication gap between humans and computers. A linguistic variable is a variable whose values are not numbers but words in a natural or artificial language.
To facilitate the assessment process in linguistic decision making, a linguistic term set and its semantics should be chosen in advance. One way to generate the linguistic term set is to consider all the linguistic terms distributed on a scale with a total order. The most widely used linguistic term sets have an odd granularity, with triangular-shaped, symmetrical and uniformly distributed membership functions. A formal description of a linguistic term set is given below.
Let $S = \{s_0, s_1, \ldots, s_g\}$ denote a linguistic term set with odd cardinality, where the element $s_i$ represents the $i$-th linguistic term in $S$ and $g+1$ is the cardinality of the linguistic term set $S$. Moreover, for the linguistic term set $S$, it is usually assumed that the midterm $s_{g/2}$ represents an assessment of “approximately 0.5”, with the rest of the terms placed uniformly and symmetrically around it. Moreover, $S$ should satisfy the following characteristics [41, 6]: (1) The set is ordered: $s_i \ge s_j$ if $i \ge j$; (2) There is a negation operator: $\mathrm{Neg}(s_i) = s_j$, such that $j = g - i$; (3) Maximization operator: $\max(s_i, s_j) = s_i$, if $s_i \ge s_j$; (4) Minimization operator: $\min(s_i, s_j) = s_i$, if $s_i \le s_j$.
Different linguistic computational models have been developed for CW [6, 20], such as models based on fuzzy membership functions, symbolic models based on ordinal scales, and models based on type-2 fuzzy sets. However, such models sometimes lead to information loss or lack interpretability. To enhance the accuracy and interpretability of linguistic computational models, Herrera and Martínez proposed the 2-tuple linguistic representation model, which is defined below.
Let $S = \{s_0, \ldots, s_g\}$ be a linguistic term set and $\beta \in [0, g]$ be a value representing the result of a symbolic aggregation operation; then the 2-tuple $(s_i, \alpha)$ that expresses the equivalent information to $\beta$ is obtained with the following function:
$$\Delta: [0, g] \to S \times [-0.5, 0.5), \qquad \Delta(\beta) = (s_i, \alpha), \quad \text{with } i = \mathrm{round}(\beta), \ \alpha = \beta - i,$$
with $\alpha \in [-0.5, 0.5)$, where “round(·)” is the usual round operation, $s_i$ has the closest index label to $\beta$, and $\alpha$ is the value of the symbolic translation.
Let $S = \{s_0, \ldots, s_g\}$ be a linguistic term set and $(s_i, \alpha)$ be a 2-tuple; there exists a function $\Delta^{-1}: S \times [-0.5, 0.5) \to [0, g]$, which can transform a 2-tuple into its equivalent numerical value $\beta \in [0, g]$. The transformation function is defined as
$$\Delta^{-1}(s_i, \alpha) = i + \alpha = \beta.$$
Based on the above definitions, a linguistic term $s_i$ can be considered as a linguistic 2-tuple by adding the value 0 to it as a symbolic translation, i.e. $(s_i, 0)$. In this paper, the 2-tuple linguistic model will be used as the basic linguistic computational model.
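Under an index-based reading of the definitions above (the terms $s_0, \ldots, s_g$ identified with their indices), the pair of transformations can be sketched in Python; the rounding step is implemented as $\lfloor \beta + 0.5 \rfloor$ so that the symbolic translation always stays in $[-0.5, 0.5)$:

```python
import math

# Sketch of the 2-tuple linguistic model; the term set S = {s_0, ..., s_g}
# is represented simply by the term indices 0..g.

def delta(beta):
    """Transform an aggregation value beta in [0, g] into a 2-tuple (i, alpha)."""
    i = math.floor(beta + 0.5)   # index of the closest label (usual rounding)
    return i, beta - i           # symbolic translation alpha in [-0.5, 0.5)

def delta_inv(i, alpha):
    """Transform a 2-tuple (i, alpha) back into its numerical value beta."""
    return i + alpha
```

For instance, an aggregation result of 2.3 on a five-term scale corresponds to the term $s_2$ with symbolic translation 0.3.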
II-B Multi-granular linguistic information
When multiple decision makers or multiple criteria are involved in a linguistic decision making problem, the assessments concerning the alternatives are usually in the form of multi-granular linguistic information. This is due to the fact that a decision maker who wants to provide precise information may use a linguistic term set with a finer granularity, whereas a decision maker who cannot be very precise about a certain domain may choose a linguistic term set with a coarser granularity [29, 45].
To manage multi-granular linguistic information, different linguistic computational models have been proposed, including models based on fuzzy membership functions [46, 21], ordinal models based on a basic linguistic term set [32, 29, 47], the linguistic hierarchies (LH) model, ordinal models based on hierarchical trees, models based on qualitative description spaces, and ordinal models based on discrete fuzzy numbers. For a systematic review of multi-granular fuzzy linguistic modeling, the readers can refer to the literature.
To fuse linguistic information expressed on any linguistic scales, Espinilla et al. introduced an extended linguistic hierarchies (ELH) model based on the LH model. In this paper, the ELH model will be used to handle multi-granular linguistic information. Before introducing the ELH model, we first recall the LH model proposed by Herrera and Martínez.
A LH is the union of all levels $l(t, n(t))$: $LH = \bigcup_t l(t, n(t))$, where each level $t$ of a LH corresponds to a linguistic term set with a granularity of $n(t)$, denoted as $S^{n(t)} = \{s_0^{n(t)}, \ldots, s_{n(t)-1}^{n(t)}\}$, and the linguistic term set of level $t+1$ is obtained from its predecessor as $l(t, n(t)) \to l(t+1, 2 \cdot n(t) - 1)$. Based on the LH basic rules, a transformation function between any two linguistic levels $t$ and $t'$ of the LH is defined as below.
Let $LH = \bigcup_t l(t, n(t))$ be a LH whose linguistic term sets are denoted as $S^{n(t)} = \{s_0^{n(t)}, \ldots, s_{n(t)-1}^{n(t)}\}$, and let us consider the 2-tuple linguistic representation. The transformation function from a linguistic label in level $t$ to a label in level $t'$, satisfying the LH basic rules, is defined as
$$TF^{t}_{t'}\bigl(s_i^{n(t)}, \alpha^{n(t)}\bigr) = \Delta\!\left(\frac{\Delta^{-1}\bigl(s_i^{n(t)}, \alpha^{n(t)}\bigr) \cdot (n(t') - 1)}{n(t) - 1}\right).
$$
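A minimal sketch of this transformation function, assuming 2-tuples are represented as (index, symbolic translation) pairs:

```python
import math

def tf(i, alpha, n_src, n_dst):
    """LH transformation between levels: rescale the 2-tuple (i, alpha) from a
    term set of granularity n_src to an equivalent one of granularity n_dst."""
    beta = (i + alpha) * (n_dst - 1) / (n_src - 1)
    j = math.floor(beta + 0.5)   # usual rounding keeps alpha in [-0.5, 0.5)
    return j, beta - j
```

Going up from granularity 3 to 5, the midterm $s_1$ maps exactly to the midterm $s_2$; going down from 5 to 3 can leave a non-zero symbolic translation.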
The ELH model constructs extended linguistic hierarchies based on the following proposition.
Let $\{S^{n_1}, \ldots, S^{n_m}\}$ be a set of linguistic term sets, where the granularity $n_k$ is an odd value, $k = 1, \ldots, m$. A new linguistic term set $S^{n^*}$ that keeps all the former modal points of the $m$ linguistic term sets has the minimal granularity
$$n^* = \mathrm{LCM}(n_1 - 1, n_2 - 1, \ldots, n_m - 1) + 1,$$
where LCM is the least common multiple. The set of former modal points of the level $t$ is defined as $FP^t = \{fp_j^t \mid j = 0, \ldots, n(t) - 1\}$ and each former modal point $fp_j^t$ is located at $j/(n(t) - 1)$.
Based on Proposition 1, an ELH, which is the union of the $m$ levels required by the experts and the new level $l(t^*, n(t^*))$ with $n(t^*) = n^*$ that keeps all the former modal points to provide accuracy in the processes of CW, is denoted by
$$ELH = \bigcup_{t=1}^{m+1} l(t, n(t)).$$
Espinilla et al. defined a transformation function which can transform between any pair of linguistic term sets, $S^{n(t)}$ and $S^{n(t')}$, in the ELH without loss of information. The basic idea of the transformation function is as follows. First, transform the linguistic terms at any level $l(t, n(t))$ in the ELH into those at $l(t^*, n(t^*))$, with $n(t^*) = n^*$, which keeps all the former modal points of the level $t$, by means of $TF^{t}_{t^*}$ without loss of information; then transform the linguistic terms at $l(t^*, n(t^*))$ into any level $l(t', n(t'))$ by means of $TF^{t^*}_{t'}$ without loss of information.
Assume $S^{n(t)}$ and $S^{n(t')}$ are any pair of linguistic term sets in the ELH and $l(t^*, n(t^*))$ is the level in the ELH that keeps all the former modal points; the new extended transformation function is defined as
$$ETF^{t}_{t'} = TF^{t^*}_{t'} \circ TF^{t}_{t^*},$$
where $TF^{t}_{t^*}$ and $TF^{t^*}_{t'}$ are the transformation functions as defined in the LH model.
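The granularity $n^*$ of the new level and the two-step extended transformation can be sketched as follows; `etf` simply composes two LH transformations through the level of granularity $n^*$ (indices stand for linguistic terms, as before):

```python
import math
from math import gcd
from functools import reduce

def elh_granularity(grans):
    """Minimal granularity n* = LCM(n_1 - 1, ..., n_m - 1) + 1 that keeps the
    former modal points of all the term sets (Proposition 1)."""
    lcm = reduce(lambda a, b: a * b // gcd(a, b), (n - 1 for n in grans))
    return lcm + 1

def _tf(i, alpha, n_src, n_dst):
    """LH transformation between granularities (2-tuples as (index, alpha))."""
    beta = (i + alpha) * (n_dst - 1) / (n_src - 1)
    j = math.floor(beta + 0.5)
    return j, beta - j

def etf(i, alpha, n_src, n_dst, n_star):
    """Extended transformation: go up to the level of granularity n*, then down."""
    j, a = _tf(i, alpha, n_src, n_star)
    return _tf(j, a, n_star, n_dst)
```

For example, for granularities {3, 5} the new level has granularity LCM(2, 4) + 1 = 5, and for {5, 7} it has granularity LCM(4, 6) + 1 = 13.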
II-C Linguistic distribution assessments
In this subsection, some related concepts of linguistic distribution assessments are presented. First, the definition of a linguistic distribution assessment is reviewed.
Let $S = \{s_0, s_1, \ldots, s_g\}$ denote a linguistic term set and $\beta_k$ be the symbolic proportion of $s_k$, where $\beta_k \ge 0$, $k = 0, 1, \ldots, g$, and $\sum_{k=0}^{g} \beta_k = 1$; then the assessment $m = \{(s_k, \beta_k) \mid k = 0, 1, \ldots, g\}$ is called a linguistic distribution assessment of $S$, and the expectation of $m$ is defined as the linguistic 2-tuple $E(m) = \Delta\bigl(\sum_{k=0}^{g} k \beta_k\bigr)$. (This representation differs from the one provided in the original definition, but they have the same meaning.) For two linguistic distribution assessments $m_1$ and $m_2$, if $E(m_1) > E(m_2)$, then $m_1 > m_2$.
A linguistic distribution assessment can be used to represent the linguistic assessment of a group. Assume that the originality of a research project was assessed by five experts using linguistic terms from a linguistic term set . If the assessments of the five experts were , then the overall assessment could be denoted as a linguistic distribution assessment . Based on a linguistic distribution assessment, we not only can roughly know the possible assessment of an alternative in a linguistic way, but also can derive the distribution of each linguistic term, which keeps the maximum information elicited by decision makers in a group.
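Since the concrete term set and expert assessments of this example are omitted above, the following sketch uses hypothetical values: five experts on a five-term set, with one vote for $s_2$, three for $s_3$ and one for $s_4$. Distributions are represented as dicts mapping term indices to symbolic proportions:

```python
import math

def expectation(m):
    """Expectation of a linguistic distribution assessment as a 2-tuple.
    `m` maps each term index k to its symbolic proportion beta_k (summing to 1)."""
    beta = sum(k * p for k, p in m.items())
    i = math.floor(beta + 0.5)   # closest term index
    return i, beta - i           # symbolic translation

# Hypothetical group assessment on S = {s_0, ..., s_4}: one expert chose s_2,
# three chose s_3, and one chose s_4.
m = {2: 0.2, 3: 0.6, 4: 0.2}
```

Here the expectation of `m` is $(s_3, 0)$: the collective assessment is centered exactly on $s_3$, while the distribution additionally records how the individual opinions spread around it.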
Zhang et al. developed the weighted averaging operator for linguistic distribution assessments (i.e., the DAWA operator), which is defined as follows.
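The DAWA definition itself is not reproduced above; the sketch below assumes the usual component-wise form, in which the resulting proportion of each term is the weighted sum of the individual proportions:

```python
def dawa(dists, weights):
    """Component-wise weighted average of linguistic distribution assessments:
    the resulting proportion of term s_k is the weighted sum of the individual
    proportions (the weights are assumed to sum to 1).
    Distributions are dicts mapping term indices to symbolic proportions."""
    out = {}
    for m, w in zip(dists, weights):
        for k, p in m.items():
            out[k] = out.get(k, 0.0) + w * p
    return out
```

Because the weights sum to 1 and each input's proportions sum to 1, the output proportions also sum to 1, so the result is again a linguistic distribution assessment.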
The distance measure between two linguistic distribution assessments is also given in the literature, as shown below.
Let $m_1 = \{(s_k, \beta_k^1) \mid k = 0, \ldots, g\}$ and $m_2 = \{(s_k, \beta_k^2) \mid k = 0, \ldots, g\}$ be two linguistic distribution assessments of a linguistic term set $S$; then the distance between $m_1$ and $m_2$ is defined as
$$d(m_1, m_2) = \frac{1}{2} \sum_{k=0}^{g} \bigl|\beta_k^1 - \beta_k^2\bigr|. \tag{8}$$
III Improving distance and ranking methods for linguistic distribution assessments
In this section, it is pointed out that the previous distance measure and ranking method for linguistic distribution assessments present some flaws, and a new distance measure and a new ranking method are then introduced to overcome them. First, the flaws of the distance measure of Definition 7 are shown with Example 1.
Let be a linguistic term set and there are three linguistic distribution assessments: , and . By the definition of the linguistic distribution assessment, we know that a linguistic term of is a special case of the linguistic distribution assessment with and , for all , i.e. , and . However, by Definition 7, we can obtain
which means that the distance between and is equal to that between and . Obviously it is unreasonable.
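Example 1's concrete values are omitted above, but the flaw can be reproduced with hypothetical single-term distributions on a five-term set, assuming the Definition 7 distance has the proportion-only form $d(m_1, m_2) = \frac{1}{2}\sum_k |\beta_k^1 - \beta_k^2|$:

```python
def dist_prop(m1, m2, g):
    """Proportion-only distance assumed for Definition 7:
    d(m1, m2) = 1/2 * sum_k |beta1_k - beta2_k|.
    Distributions are dicts mapping term indices 0..g to proportions."""
    return 0.5 * sum(abs(m1.get(k, 0.0) - m2.get(k, 0.0)) for k in range(g + 1))

# Single terms of S = {s_0, ..., s_4} as degenerate distributions.
A, B, C = {0: 1.0}, {1: 1.0}, {4: 1.0}
```

Both `dist_prop(A, B, 4)` and `dist_prop(A, C, 4)` equal 1, even though $s_1$ is adjacent to $s_0$ while $s_4$ is the farthest term, which is exactly the problem discussed above.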
From Definition 7, it can be seen that (8) just calculates the deviation between symbolic proportions and ignores the positions of the linguistic terms. In this paper, a novel distance measure between two linguistic distribution assessments is defined as:
Let and be two linguistic distribution assessments of a linguistic term set , then the distance between and is defined as
Consider now the problem of ranking a collection of linguistic distribution assessments. Zhang et al. utilized the expectation values to rank linguistic distribution assessments. However, there may be cases where the expectation values of some linguistic distribution assessments are equal. As a result, the comparison rule mentioned in Definition 5 sometimes cannot distinguish these linguistic distribution assessments. The uncertainty, in the sense of inaccuracy, of a linguistic distribution assessment is reflected by its distribution, which can be measured using Shannon’s entropy. It is therefore proposed that linguistic distribution assessments be ranked by means of an inaccuracy function and the comparison rules introduced below.
Let $m = \{(s_k, \beta_k) \mid k = 0, 1, \ldots, g\}$ be a linguistic distribution assessment of a linguistic term set $S$, where $\beta_k \ge 0$, $k = 0, 1, \ldots, g$, and $\sum_{k=0}^{g} \beta_k = 1$. The inaccuracy function of $m$ is defined as
$$I(m) = -\sum_{k=0}^{g} \beta_k \ln \beta_k,$$
where $0 \ln 0 = 0$ is defined in this paper.
Let $m_1$ and $m_2$ be two linguistic distribution assessments; then the comparison rules are defined as follows: (1) If $E(m_1) > E(m_2)$, then $m_1 > m_2$; (2) If $E(m_1) = E(m_2)$ and $I(m_1) < I(m_2)$, then $m_1 > m_2$; if $E(m_1) = E(m_2)$ and $I(m_1) = I(m_2)$, then $m_1 = m_2$.
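A sketch of the resulting ranking procedure, assuming the inaccuracy function is the Shannon entropy with natural logarithm (taking $0 \ln 0 = 0$) and that ties in expectation are broken in favor of the less inaccurate assessment:

```python
import math

def expectation_value(m):
    """Numerical expectation sum_k k * beta_k of a distribution assessment."""
    return sum(k * p for k, p in m.items())

def inaccuracy(m):
    """Shannon-entropy inaccuracy, taking 0 * ln(0) = 0."""
    return -sum(p * math.log(p) for p in m.values() if p > 0)

def compare(m1, m2, eps=1e-9):
    """Return 1 if m1 > m2, -1 if m1 < m2 and 0 if they are indifferent:
    higher expectation wins; on expectation ties, lower inaccuracy wins."""
    e1, e2 = expectation_value(m1), expectation_value(m2)
    if abs(e1 - e2) > eps:
        return 1 if e1 > e2 else -1
    i1, i2 = inaccuracy(m1), inaccuracy(m2)
    if abs(i1 - i2) <= eps:
        return 0
    return 1 if i1 < i2 else -1
```

For instance, $\{(s_2, 1)\}$ and $\{(s_1, 0.5), (s_3, 0.5)\}$ share the expectation $s_2$, but the first ranks higher because its distribution carries less uncertainty.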
Let be a linguistic term set and there are three linguistic distribution assessments: , and .
IV Dealing with multi-granular linguistic distribution assessments
As the focus of this paper is to deal with LGDM problems with multi-granular linguistic information, this section is devoted to developing a new computational model for multi-granular linguistic distribution assessments. Because our proposal for dealing with multi-granular linguistic distribution assessments and obtaining interpretable results is based on tools introduced for linguistic 2-tuple values, Subsection IV-A shows how to transform a linguistic 2-tuple into a linguistic distribution assessment. Afterwards, a new model for managing multi-granular linguistic distribution assessments is developed in Subsection IV-B.
IV-A Transforming a linguistic 2-tuple into a linguistic distribution assessment
This subsection discusses the relationship between a linguistic 2-tuple and a linguistic distribution assessment. For convenience, let $S = \{s_0, \ldots, s_g\}$ be a linguistic term set as defined in Section II and $(s_i, \alpha)$ be a linguistic 2-tuple; then:
(1) If $\alpha > 0$, $(s_i, \alpha)$ denotes the linguistic information between $s_i$ and $s_{i+1}$.
(2) If $\alpha < 0$, $(s_i, \alpha)$ denotes the linguistic information between $s_{i-1}$ and $s_i$.
(3) If $\alpha = 0$, $(s_i, \alpha)$ denotes the linguistic information $s_i$.
Let $\lfloor \beta \rfloor$ be the integer part of $\beta = \Delta^{-1}(s_i, \alpha)$; then a linguistic 2-tuple $(s_i, \alpha)$ denotes the linguistic information between $s_{\lfloor \beta \rfloor}$ and $s_{\lfloor \beta \rfloor + 1}$ if $\alpha \ne 0$.
We consider two cases.
Case 1: $\alpha > 0$. In this case, $\beta = i + \alpha \in (i, i + 0.5)$. Hence, $\lfloor \beta \rfloor = i$, and $(s_i, \alpha)$ denotes the linguistic information between $s_i$ and $s_{i+1}$.
Case 2: $\alpha < 0$. In this case, $\beta = i + \alpha \in [i - 0.5, i)$. Hence, $\lfloor \beta \rfloor = i - 1$, and $(s_i, \alpha)$ denotes the linguistic information between $s_{i-1}$ and $s_i$.
According to the previous results, a linguistic 2-tuple $(s_i, \alpha)$ denotes the linguistic information between $s_{\lfloor \beta \rfloor}$ and $s_{\lfloor \beta \rfloor + 1}$ if $\alpha \ne 0$. This completes the proof of Proposition 2. ∎
Proposition 2 demonstrates that a linguistic 2-tuple $(s_i, \alpha)$, $\alpha \ne 0$, can denote the linguistic information between two successive linguistic terms $s_{\lfloor \beta \rfloor}$ and $s_{\lfloor \beta \rfloor + 1}$. From the perspective of linguistic distribution assessments, the linguistic information between $s_{\lfloor \beta \rfloor}$ and $s_{\lfloor \beta \rfloor + 1}$ should be denoted as a linguistic distribution assessment $m = \{(s_{\lfloor \beta \rfloor}, 1 - \gamma), (s_{\lfloor \beta \rfloor + 1}, \gamma)\}$. It is then necessary to determine the value of $\gamma$.
As the linguistic information expressed by $(s_i, \alpha)$ and by $m$ is equivalent, the expectation of $m$ should be equal to $(s_i, \alpha)$. Therefore, $E(m) = (s_i, \alpha)$, i.e.
$$\lfloor \beta \rfloor (1 - \gamma) + (\lfloor \beta \rfloor + 1) \gamma = \beta. \tag{10}$$
By solving (10), $\gamma = \beta - \lfloor \beta \rfloor$.
From the previous analysis, a linguistic 2-tuple $(s_i, \alpha)$, $\alpha \ne 0$, can be denoted as a linguistic distribution assessment $m = \{(s_{\lfloor \beta \rfloor}, 1 - \gamma), (s_{\lfloor \beta \rfloor + 1}, \gamma)\}$, where $\lfloor \beta \rfloor$ is the integer part of $\beta = \Delta^{-1}(s_i, \alpha)$ and $\gamma = \beta - \lfloor \beta \rfloor$.
It is easy to verify that the above statement also holds for the case $\alpha = 0$: a linguistic 2-tuple $(s_i, 0)$ can be denoted as the linguistic distribution assessment $m = \{(s_i, 1)\}$, with $\lfloor \beta \rfloor = i$ and $\gamma = 0$.
Let $S = \{s_0, \ldots, s_g\}$ be a linguistic term set and $M$ be the set of all the linguistic distribution assessments of $S$; there exists a function $F: S \times [-0.5, 0.5) \to M$, which can transform a linguistic 2-tuple into its equivalent linguistic distribution assessment. The transformation function is defined as
$$F(s_i, \alpha) = \{(s_{\lfloor \beta \rfloor}, 1 - \gamma), (s_{\lfloor \beta \rfloor + 1}, \gamma)\}, \tag{11}$$
where $\lfloor \beta \rfloor$ is the integer part of $\beta = \Delta^{-1}(s_i, \alpha)$ and $\gamma = \beta - \lfloor \beta \rfloor$.
For Definition 11, the following theorem is given:
Let $S = \{s_0, \ldots, s_g\}$ be a linguistic term set. The equivalent linguistic distribution assessment of a linguistic 2-tuple $(s_i, \alpha)$ is
$$F(s_i, \alpha) = \begin{cases} \{(s_i, 1 - \alpha), (s_{i+1}, \alpha)\}, & \text{if } \alpha \ge 0, \\ \{(s_{i-1}, -\alpha), (s_i, 1 + \alpha)\}, & \text{if } \alpha < 0. \end{cases}$$
If $\alpha \ge 0$, then $\lfloor \beta \rfloor = i$ and $\gamma = \beta - \lfloor \beta \rfloor = \alpha$. By (11), $F(s_i, \alpha) = \{(s_i, 1 - \alpha), (s_{i+1}, \alpha)\}$.
If $\alpha < 0$, then $\lfloor \beta \rfloor = i - 1$ and $\gamma = \beta - \lfloor \beta \rfloor = 1 + \alpha$. By (11), $F(s_i, \alpha) = \{(s_{i-1}, -\alpha), (s_i, 1 + \alpha)\}$.
This completes the proof of Theorem 1. ∎
Let be a linguistic term set, and be two linguistic 2-tuples. Based on Theorem 1, and .
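Since the concrete 2-tuples of this example are omitted, the transformation of Theorem 1 can be illustrated with hypothetical values; the expectation of the resulting distribution recovers the original $\beta = \Delta^{-1}(s_i, \alpha)$:

```python
def to_distribution(i, alpha):
    """Equivalent linguistic distribution assessment of the 2-tuple (s_i, alpha),
    following Theorem 1: {(s_i, 1 - alpha), (s_{i+1}, alpha)} for alpha >= 0 and
    {(s_{i-1}, -alpha), (s_i, 1 + alpha)} for alpha < 0.
    Distributions are dicts mapping term indices to symbolic proportions."""
    if alpha > 0:
        return {i: 1 - alpha, i + 1: alpha}
    if alpha < 0:
        return {i - 1: -alpha, i: 1 + alpha}
    return {i: 1.0}
```

For example, $(s_2, 0.3)$ becomes $\{(s_2, 0.7), (s_3, 0.3)\}$ and $(s_3, -0.4)$ becomes $\{(s_2, 0.4), (s_3, 0.6)\}$; in both cases the expectation of the distribution equals the original 2-tuple.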
IV-B Unifying multi-granular linguistic distribution assessments
To deal with decision making problems with multi-granular linguistic information, a natural solution is to unify them and derive linguistic information based on the same linguistic term set [29, 30]. Afterwards, the multi-granular linguistic information can be fused. This subsection focuses on the unification of multi-granular linguistic distribution assessments.
For convenience, some notations are defined as follows. Let $\{S^{n_1}, \ldots, S^{n_m}\}$ be a set of linguistic term sets, where $S^{n_k}$ is a linguistic term set with an odd granularity $n_k$, $k = 1, \ldots, m$, and let an ELH be constructed by Eq. (5) as $ELH = \bigcup_{t=1}^{m+1} l(t, n(t))$, where $l(t^*, n(t^*))$ is the level that keeps all the former modal points. By Proposition 1, $n(t^*) = n^* = \mathrm{LCM}(n_1 - 1, \ldots, n_m - 1) + 1$. For the level $t^*$, the linguistic term set is denoted by $S^{n^*} = \{s_0^{n^*}, s_1^{n^*}, \ldots, s_{n^* - 1}^{n^*}\}$. Moreover, a linguistic distribution assessment on a linguistic term set $S^{n_k}$ is denoted by $m^{n_k} = \{(s_i^{n_k}, \beta_i^{n_k}) \mid i = 0, 1, \ldots, n_k - 1\}$.
Now, it is necessary to transform a linguistic distribution assessment into a linguistic distribution assessment on another linguistic term set , where and .
Motivated by the extended transformation function of the ELH model, a two-stage procedure is proposed to conduct the transformation process.
Stage 1: Transform the linguistic distribution assessment into a linguistic distribution assessment on .
Stage 2: Transform the linguistic distribution assessment on into a linguistic distribution assessment on .
Looking at Stage 1, intuitively, the linguistic terms in the source linguistic term set can first be transformed into linguistic information in the linguistic term set of the highest level by the transformation function. As the transformation is from a low level to a high level, the transformed linguistic information consists of normative linguistic terms without symbolic translations. As a result, it is only necessary to attach the corresponding symbolic proportions to each transformed linguistic term. By doing so, a linguistic distribution assessment on the highest level is derived. Formally, the following definition is given.
Let and be defined as before, then can be transformed into a linguistic distribution assessment on by
where , .
The meaning of Definition 12 is to find the linguistic terms in the higher-level term set whose corresponding linguistic terms in the source term set have non-zero symbolic proportions, and then to assign those non-zero symbolic proportions to them.
The transformed is a linguistic distribution assessment of .
According to the LH basic rules, the transformation from the source level to the level that keeps all the former modal points is one-to-one, i.e. each linguistic term of the source linguistic term set corresponds to exactly one linguistic term of the higher-level linguistic term set. Hence the non-zero symbolic proportions are simply re-attached to the corresponding linguistic terms, and their sum remains equal to 1. Therefore, the transformed assessment is a linguistic distribution assessment of the higher-level linguistic term set, which completes the proof of Theorem 2. ∎
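Stage 1 can be sketched as a simple relabelling of term indices, using the fact that $(n^* - 1)$ is a multiple of $(n_k - 1)$ by construction, so every source term maps to an exact term of the higher level:

```python
def stage1_to_nstar(m, n_src, n_star):
    """Stage 1: re-express a distribution on a term set of granularity n_src as
    a distribution on the level of granularity n*. Low-to-high transformations
    land on exact terms, so the proportions are simply re-attached.
    Distributions are dicts mapping term indices to symbolic proportions."""
    factor = (n_star - 1) // (n_src - 1)   # integer by construction of n*
    return {i * factor: p for i, p in m.items()}
```

For example, the distribution $\{(s_1, 0.4), (s_2, 0.6)\}$ on a 3-term set becomes $\{(s_2, 0.4), (s_4, 0.6)\}$ on the 5-term level, with the proportions unchanged.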
At Stage 2, the linguistic distribution assessment on the highest level is transformed into a linguistic distribution assessment on the target linguistic term set.
At first glance, it might be thought that the transformation function can also be utilized to transform the linguistic terms of the highest level into linguistic information of the target linguistic term set, and then to attach the corresponding symbolic proportions. However, such a transformation is from a high level to a low level. Hence, some linguistic terms may be transformed into linguistic 2-tuples of the target linguistic term set. In this way, the derived result is not a normative linguistic distribution assessment.
To address this issue, note that according to Definition 11, a linguistic 2-tuple can be transformed into its equivalent linguistic distribution assessment by (11). Therefore, each linguistic 2-tuple derived by (15) can first be transformed into its equivalent linguistic distribution assessment using Definition 11, which yields linguistic distribution assessments of the target linguistic term set, i.e.
where is the integer part of and .
Considering the symbolic proportions as weights, these linguistic distribution assessments can be aggregated into a new one by the DAWA operator, which yields a linguistic distribution assessment of the target linguistic term set. Formally, the following definition is provided.
The procedures of the two-stage transformation are illustrated by Fig. 1.
derived by Definition 13 is a linguistic distribution assessment.
Based on the above analysis, we have that each , is a linguistic distribution assessment. Moreover, . Since the weighted average of some linguistic distribution assessments is also a linguistic distribution assessment , is a linguistic distribution assessment. ∎
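The second stage can be sketched as follows: each term of the highest level is mapped down to the target granularity, any resulting 2-tuple is split into its two-term equivalent distribution as in Theorem 1, and the pieces are averaged using the symbolic proportions as weights (playing the role of the DAWA operator):

```python
import math

def stage2_from_nstar(m_star, n_star, n_dst):
    """Stage 2: map each level-n* term down to granularity n_dst, split any
    resulting 2-tuple into its two-term distribution (Theorem 1), and average
    the pieces using the symbolic proportions as weights (DAWA-style).
    Distributions are dicts mapping term indices to symbolic proportions."""
    out = {}
    for j, p in m_star.items():
        beta = j * (n_dst - 1) / (n_star - 1)
        i = math.floor(beta + 0.5)         # closest target index
        alpha = beta - i                   # symbolic translation
        if alpha >= 0:
            pieces = {i: 1 - alpha, i + 1: alpha}
        else:
            pieces = {i - 1: -alpha, i: 1 + alpha}
        for k, q in pieces.items():
            if q > 0:
                out[k] = out.get(k, 0.0) + p * q
    return out
```

The result is again a normative distribution: the proportions stay non-negative and sum to 1, since each split preserves the proportion mass of its source term.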
Let and be two linguistic term sets, and there are two linguistic distribution assessments to be fused, i.e. and
Here, there are two linguistic term sets, i.e. and . According to Proposition 1, we have . Therefore, . For the first linguistic distribution assessment, since , , , by (14), then , , and a linguistic distribution assessment is derived.