Application of Genetic Algorithms to the Multiple Team Formation Problem

03/08/2019 ∙ by Jose G. M. Esgario, et al. ∙ 0

Allocating people to multiple projects is an important issue when the efficiency of groups is considered from the point of view of social interaction. In this paper, based on previous works, the Multiple Team Formation Problem (MTFP) based on sociometric techniques is formulated as an optimization problem that takes into account the social interaction among team members. Because the resulting optimization problem is NP-hard, we propose a Genetic Algorithm to solve it. Social cohesion directly impacts the productivity of the work environment, so maintaining an appropriate level of cohesion keeps a group together and has a positive impact on the results of a project. The aim of the proposal is to ensure the best possible effectiveness from the point of view of social interaction. In this way, the presented algorithm serves as a decision-making tool for managers who build teams of people in multiple projects. To analyze the performance of the proposed method, computational experiments with benchmarks were performed and compared with the exhaustive method. The results are promising and show that the algorithm generally obtains near-optimal results within a short computational time.




I Introduction

Empirical studies in the area of multi-project management are still rare [1], and when building a team of people, technical skills are usually the attributes that project managers consider most. However, recent studies have shown that the success or failure of a group often depends on: a) the interdependence of group skills; b) integration; c) trust; and d) the technical skills of each member [2]. In addition, an appropriate level of cohesion is required to bring people together and make them collaborate to build the basis of quality work [3]. In sociological theory, some studies point out that team cohesion is an important variable in the emergence of consensus among team members and that cohesion directly and positively impacts the productivity of a work environment, influencing team motivation, morale, coordination of efforts, productivity, satisfaction and cooperation [4]. Besides that, Social Identity Theory suggests that the more a person identifies as belonging to a group, the more actively that person will contribute to achieving a common goal [5].

In a study designed to determine the effects of social cohesion on a group of people, 50 army teams performed multiple tasks, and a positive effect of social cohesion on the teams’ physical and mental performance was identified [6]. Seers, Petty and Cashman [7] reported a study of 103 manufacturing workers that found a positive association between job satisfaction, motivation, and group cohesion. Sanders and Nauta [8] pointed out how increased cohesion reduces employee absenteeism.

Social interaction therefore plays an important role in the success of a project. However, it was only in 2009 that Lappas, Liu and Terzi [9] made the first attempt to allocate people with different skills to a group while seeking to maximize their social compatibilities. Since then, several other approaches to the problem have been developed, and other works have proposed better models for determining an optimal solution with lower computational costs. For instance, a new allocation model was proposed [10] that takes into account the collaboration among members and seeks to minimize the likelihood of someone leaving the project, which would decrease performance and put the project at risk. In [11] the problem of allocating multiple people (either full-time or in smaller time fractions) to various groups is called the Multiple Team Formation Problem (MTFP), and some algorithms were proposed for its solution. The MTFP has received attention in the scientific literature [12, 13]; however, the majority of these works are based only on the psychological and behavioral perspective.

Ballesteros-Pérez, González-Cruz and Fernández-Diego [14] developed a method for allocating people to multiple projects so that the combination of human resources allocated to the different working groups maximizes the efficiency of the groups from the point of view of social interaction. This approach allocates individuals based on sociometric techniques, and its main point was to present a manual calculation method that determines a sufficiently good allocation in a very practical way, although the optimal solution is not guaranteed. Such a method is quite practical for a small number of people, but because of the NP-hard nature of the problem it becomes prohibitive as this number increases. For computer-based calculations, where the optimal solution to the problem at hand is guaranteed, the authors limited themselves to stating that a computational application must take into account the constraints and information given by the project managers. Unfortunately, computing all possible permutations to determine the optimal allocation is viable only for teams with a small number of individuals.

Yannibelli and Amandi [15] proposed a deterministic crowding evolutionary algorithm for the formation of collaborative learning teams, so that the roles of students in a group are balanced. The idea of applying an evolutionary algorithm to this problem is to find good solutions in a short time. Other works also made use of evolutionary approaches for grouping people based on their interpersonal relationships [16, 17, 18, 19]. Da Silva and Krohling [20] present an algorithm based on Sociometry that incorporates fuzzy numbers into the MTFP and allows the personal preferences given in the sociometric test to be expressed in a more natural way.

In this paper, the MTFP is formulated as a cohesion maximization problem subject to constraints of the requirements matrix. To solve the problem, a Genetic Algorithm (GA) is proposed. This paper extends the work developed by Ballesteros-Pérez, González-Cruz and Fernández-Diego [14] and was inspired by Yannibelli and Amandi [15], who proposed an evolutionary approach to solve the MTFP, which allows finding optimal or semi-optimal solutions to the problem when the number of individuals to be allocated is large and the computation of all possible solutions is not viable. The remainder of this paper is organized as follows: section 2 presents the problem formulation; section 3 is devoted to describing the proposed approach; section 4 presents the experimental results and finally a conclusion is presented in section 5.

II Problem Formulation

II-A Group Cohesion and Sociometric Matrix

Given an organization that develops projects of any kind, which is composed of different individuals belonging to different departments and having different skills, it is desired to allocate each one of the individuals to some group of the organization to carry out a project in a way that maximizes the cohesion as a whole. Fig. 1 shows a scheme of how MTFP works.

Fig. 1: Multiple team formation problem scheme.

Group cohesion is defined as the degree to which individuals feel accepted or rejected by a given group [21]. In developing the proposed algorithm, it was postulated, as in [14], that the result achieved by a team depends greatly on the way in which individuals develop their relationships and interactions. Thus, if it is possible to maximize the cohesion of several groups by bringing together certain people, the whole project is expected to perform better.

The cohesion degree of a given group can be obtained through several methods, such as: sociometric tests and work environment studies. The sociometric test is a tool used by Sociometry created by Jacob Levy Moreno [22] to understand how relationships between members of a given group are structured.

In the sociometric test each member accepts or rejects each of the other members of the project. The results of the sociometric test can be obtained through the following questions: 1) “Which employees would you like to work with?” 2) “Which employees would you not like to work with?”.

From the results obtained in the sociometric test a matrix called Sociometric Matrix is then constructed, whose values +1, 0 and -1, mean that a given member “chose”, omitted his opinion or “rejected” another member, respectively. The values contained in the main diagonal of the Sociometric Matrix will always be equal to zero since self-evaluations are not allowed.
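As an illustration of how such a matrix can be assembled, the following minimal Python sketch converts the answers to the two questions into a matrix of +1, 0 and -1 values. The function name `build_sociometric_matrix`, the dictionary-based input format and the three-member example are assumptions made for this sketch and do not come from the original work.

```python
import numpy as np

def build_sociometric_matrix(n_members, chosen, rejected):
    """Build an n x n matrix of +1 (chosen), -1 (rejected) and 0 (no opinion).

    chosen / rejected map a member index to the indices that member
    chose or rejected (hypothetical input format).
    """
    S = np.zeros((n_members, n_members), dtype=int)
    for i, others in chosen.items():
        for j in others:
            if i != j:          # self-evaluations are not allowed
                S[i, j] = 1
    for i, others in rejected.items():
        for j in others:
            if i != j:
                S[i, j] = -1
    return S

# Hypothetical answers of three members (indices 0, 1, 2)
chosen = {0: [1, 2], 1: [0], 2: [0]}
rejected = {1: [2], 2: [1]}
S = build_sociometric_matrix(3, chosen, rejected)
print(S)
```

The matrix is generally non-symmetric, because member i choosing member j says nothing about how j evaluates i.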

II-B The Sociometric and the Project Requirement Matrices

To solve the MTFP, two matrices are required as input: the Sociometric Matrix and the Project Requirement Matrix. The Sociometric Matrix is described by


where each entry is the value that one project member assigns to another member who will be part of the main project to be formed. As stated earlier, each entry takes the value +1, 0 or -1 depending on the answers given to the sociometric test questions. Each member is also characterized by skills, training or the department of the company to which the member belongs. For example, worker number 7 in the finance department of a given company may be denoted as F7, where the letter “F” means “Department of Finance”.

The other matrix needed to solve the problem is called the Project Requirement Matrix, described by


where each entry is the number of people from a given department, or the number of people with a certain skill, required for each group or subproject.
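One notation consistent with the descriptions above is sketched below. The symbols $S$, $R$, $s_{ij}$, $r_{dg}$, $N$ (individuals), $D$ (departments) and $G$ (groups) are assumptions introduced here for illustration and may differ from the authors' notation. The $D \times G$ orientation of the Project Requirement Matrix follows Table II, which for dataset 1 has four rows and three columns, matching the four departments and three groups listed in Table I.

```latex
\begin{aligned}
S &= \left[ s_{ij} \right]_{N \times N}, \qquad s_{ij} \in \{-1,\, 0,\, +1\}, \qquad s_{ii} = 0, \\
R &= \left[ r_{dg} \right]_{D \times G}, \qquad r_{dg} \in \{0, 1, 2, \dots\}.
\end{aligned}
```

Here $s_{ij}$ is the value that member $i$ assigns to member $j$, and $r_{dg}$ is the number of people from department $d$ required by group $g$.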

II-C Problem Constraints and Solution Space

Once the input matrices are obtained, we proceed to the second phase of the problem. In this phase all Allocation Matrices are generated, which correspond to the possible solutions of the problem. An Allocation Matrix is described by


where the elements indicate to which group or subproject each worker will be allocated. A feasible solution must necessarily meet the following constraints:

Comply with the Project Requirement Matrix (2).

Allocate each individual to only one group according to the equation
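A compact way to write the Allocation Matrix and the two constraints is sketched below. The symbols $X$, $x_{ig}$, $d(i)$ (department of individual $i$) and $r_{dg}$ (number of people from department $d$ required by group $g$) are assumptions chosen for illustration, as are $N$ and $G$ for the numbers of individuals and groups; the binary, one-row-per-individual layout follows the Solution Matrix of Table VI.

```latex
\begin{aligned}
X &= \left[ x_{ig} \right]_{N \times G}, \qquad x_{ig} \in \{0, 1\}, \\
\sum_{g=1}^{G} x_{ig} &= 1 \qquad \text{for every individual } i, \\
\sum_{i \,:\, d(i) = d} x_{ig} &= r_{dg} \qquad \text{for every department } d \text{ and every group } g.
\end{aligned}
```

The second line states that each individual belongs to exactly one group, and the third line states compliance with the Project Requirement Matrix.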


II-D General Cohesion

To determine which allocation maximizes the General Cohesion, it is necessary to compute the General Cohesion of each generated solution. The General Cohesion is calculated as


where the sum runs over all the groups that form the main project, the weight of each group is its number of individuals divided by the total number of individuals in the main project, and the cohesion of each group is defined as


The higher this value is, the greater the cohesion among the members of a group.
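The computation can be checked numerically on dataset 1. The sketch below assumes that the cohesion of a group is the sum of the sociometric values exchanged among its members divided by the group size, and that each group is weighted by its share of the individuals, i.e. GC = sum over g of (n_g / N) * C_g. This reading is an assumption, but with the Sociometric Matrix of Table III and the Solution Matrix of Table VI it reproduces the best fitness of 1.6000 reported in Table V. The function name `general_cohesion` is hypothetical.

```python
import numpy as np

# Sociometric Matrix of dataset 1 (Table III)
S = np.array([
    [0, 1, 0, 0, 1, -1, 1, 1, 1, -1],
    [0, 0, 0, 0, 1, 1, 1, 0, -1, 1],
    [1, 1, 0, 1, -1, 1, 1, -1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, -1, -1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, -1, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 1, 1, 0, -1, 0],
])

# Solution Matrix of dataset 1 (Table VI): one row per individual, one column per group
X = np.array([
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0],
    [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1],
])

def general_cohesion(S, X):
    """Weighted sum of group cohesions (assumed formula, see lead-in)."""
    n_total = X.shape[0]
    gc = 0.0
    for g in range(X.shape[1]):
        members = np.flatnonzero(X[:, g])
        n_g = len(members)
        if n_g == 0:
            continue  # an empty group contributes nothing
        c_g = S[np.ix_(members, members)].sum() / n_g  # group cohesion
        gc += (n_g / n_total) * c_g                    # weight n_g / N
    return gc

gc = general_cohesion(S, X)
print(round(gc, 4))  # 1.6
```

The within-group sums are 8, 7 and 1 for the three groups, so the General Cohesion is 16 / 10 = 1.6.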

III Proposed Approach

The Genetic Algorithm (GA), originally developed by Holland [23], is a powerful optimization algorithm inspired by the concepts of Darwin’s Theory of Evolution. In this paper, we propose to solve the MTFP using a GA.

The main components of the algorithm are: genetic representation, population initialization, genetic operators (selection, crossover and mutation) and the fitness function. All components are detailed in the following.

III-A Genetic Representations

The individuals of the population are possible solutions of the optimization problem, and the encoded individuals are known as chromosomes. Each chromosome is represented as a vector with one position per individual of the MTFP. The value stored at each position indicates the group to which the corresponding individual must be allocated. The variables of a chromosome, called genes, use binary coding. The number of bits used for each gene is equal to the number of groups of the MTFP. Exactly one bit receives the value 1, and its position indicates the group that the individual will join.

Fig. 2 illustrates the coding of a Solution Matrix, taking as an example a matrix containing five individuals and three groups. The chromosome has five genes, and each gene is encoded by three bits.

Fig. 2: Codification of a Solution Matrix

The initial population is generated as a set of random solutions satisfying equation (4). The genes of each chromosome are first initialized with zeros, and then for each gene a random bit is selected and assigned the value 1.
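The representation and initialization described above can be sketched as follows. The function names `random_chromosome` and `decode`, the use of NumPy and the population size of four are assumptions made for this illustration; a chromosome is stored as a matrix with one row (gene) per individual and one bit per group.

```python
import numpy as np

def random_chromosome(n_individuals, n_groups, rng):
    """Start from all-zero genes and set one random bit per gene to 1."""
    chromosome = np.zeros((n_individuals, n_groups), dtype=int)
    chosen_bits = rng.integers(0, n_groups, size=n_individuals)
    chromosome[np.arange(n_individuals), chosen_bits] = 1
    return chromosome

def decode(chromosome):
    """Return, for each individual, the index of the group it is allocated to."""
    return chromosome.argmax(axis=1)

rng = np.random.default_rng(0)
# Five individuals and three groups, as in the example of Fig. 2
population = [random_chromosome(5, 3, rng) for _ in range(4)]
print(population[0])
print(decode(population[0]))
```

Because every gene has exactly one bit set, each chromosome satisfies the one-group-per-individual constraint by construction; compliance with the Project Requirement Matrix is not guaranteed at this stage.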

III-B Genetic Operators

Next, we describe the three genetic operators.

III-B1 Selection

The selection process used by the GA is based on Darwin’s theory of natural selection, whereby fitter individuals are more likely to survive. To select the fittest individuals, the tournament selection method was used, since it is simple and widely used in the literature [24]. Two candidate solutions, or individuals, are selected randomly from the current population; their fitness values are compared, and the one with the better value is selected for the next generation.
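A minimal sketch of binary tournament selection is given below. It assumes that the two competitors are distinct, that ties go to the first competitor, and that tournaments are repeated until the new population has the same size as the current one; these details, like the function name `tournament_selection`, are assumptions of the sketch.

```python
import random

def tournament_selection(population, fitness_values, rng):
    """Fill the next generation with the winners of random pairwise tournaments."""
    selected = []
    for _ in range(len(population)):
        a, b = rng.sample(range(len(population)), 2)  # two distinct competitors
        winner = a if fitness_values[a] >= fitness_values[b] else b
        selected.append(population[winner])
    return selected

# Hypothetical population of four labelled individuals with known fitness values
population = ["w", "x", "y", "z"]
fitness_values = [0.0, 1.0, 2.0, 3.0]
selected = tournament_selection(population, fitness_values, random.Random(0))
print(selected)
```

With distinct competitors the worst individual can never win a tournament, so it disappears from the next generation, while the best individual wins every tournament in which it takes part.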

III-B2 Crossover

The crossover operator acts on the individuals resulting from the selection process through the exchange of genetic material, thus generating new individuals. Individuals of the population are selected in pairs, called parents, and are crossed so that each gene has a given probability of being exchanged between them. The crossing process results in a new population composed of these new pairs of individuals, called offspring. Fig. 3 shows an example of applying the crossover operator to a pair of individuals. The chromosomes have five genes; two genes of the parents were exchanged, generating two new individuals.

Fig. 3: Example of a crossover operator application.
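The gene-wise exchange can be sketched as follows, with each parent stored as a matrix of one-hot genes (one row per individual). The function name `uniform_gene_crossover`, the parameter name `p_c` and the value 0.5 used in the example are assumptions of this sketch.

```python
import numpy as np

def uniform_gene_crossover(parent_a, parent_b, p_c, rng):
    """Exchange each gene (row) between the parents with probability p_c."""
    swap = rng.random(parent_a.shape[0]) < p_c
    child_a, child_b = parent_a.copy(), parent_b.copy()
    child_a[swap] = parent_b[swap]
    child_b[swap] = parent_a[swap]
    return child_a, child_b

rng = np.random.default_rng(0)
one_hot = np.eye(3, dtype=int)
parent_a = one_hot[[0, 1, 2, 0, 1]]  # five genes, three bits each
parent_b = one_hot[[2, 2, 0, 1, 0]]
child_a, child_b = uniform_gene_crossover(parent_a, parent_b, 0.5, rng)
print(child_a.argmax(axis=1), child_b.argmax(axis=1))
```

Because whole genes are exchanged rather than single bits, every offspring gene still has exactly one bit set, so the one-group-per-individual constraint is preserved.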

III-B3 Mutation

The mutation operator increases the diversity of the solutions and helps the algorithm escape from local optima by replacing one or more genes of a chromosome with random values. The mutation is performed with a given probability, called the mutation probability. For each gene of a chromosome a random value in the interval [0, 1] is generated, and the mutation is performed if this value is lower than the mutation probability. The mutation sets all bits of the gene to zero and then randomly selects a new bit to receive the value 1.
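A minimal sketch of this operator on the one-hot representation is shown below. The function name `mutate` and the parameter name `p_m` are assumptions; the example call uses a mutation probability equal to the reciprocal of the number of genes, so that on average one gene per chromosome is mutated.

```python
import numpy as np

def mutate(chromosome, p_m, rng):
    """Reset each gene with probability p_m and set one random bit to 1."""
    mutated = chromosome.copy()
    n_individuals, n_groups = chromosome.shape
    for i in range(n_individuals):
        if rng.random() < p_m:
            mutated[i, :] = 0                       # clear the gene
            mutated[i, rng.integers(n_groups)] = 1  # pick a new group at random
    return mutated

rng = np.random.default_rng(0)
chromosome = np.eye(3, dtype=int)[[0, 1, 2, 0, 1]]
mutated = mutate(chromosome, 1.0 / chromosome.shape[0], rng)
print(mutated.argmax(axis=1))
```

Note that the newly selected bit may coincide with the previous one, in which case the gene is unchanged.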

III-C Fitness function

Due to the intrinsic constraints of the problem, there is a need to use some constraint-handling strategy that forces the algorithm to produce feasible solutions. Among the several approaches developed to treat constrained optimization problems, many use the penalty method [25] that adds a penalty component to the objective function by transforming the constrained optimization problem into an unconstrained one. The fitness function with the penalty method for the MTFP is calculated according to


The fitness combines the General Cohesion calculated by (5) with a penalty function that measures how much a candidate solution violates the constraints of the problem, i.e., the sum of the absolute values of the differences between the obtained Requirement Matrix and the desired Requirement Matrix. The penalty is given by


Given a Solution Matrix generated from an individual of the population, the obtained Requirement Matrix is calculated as follows


where each element of the matrix is equal to the number of individuals of the same department who were allocated to work in the same group, and it is calculated as


where each individual is counted under the department to which that individual belongs.
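The penalty computation can be sketched on dataset 1 as follows. The department of each individual is not listed in the text, so the vector `departments` below is an assumption: individuals are taken to be ordered by department, which is consistent with Tables II and VI because it yields a zero penalty for the reported Solution Matrix. The function names are hypothetical, and how the penalty is weighted against the General Cohesion in the fitness is deliberately left out.

```python
import numpy as np

# Desired Requirement Matrix of dataset 1 (Table II): rows = departments, columns = groups
R = np.array([[2, 2, 0], [2, 1, 0], [0, 1, 1], [0, 0, 1]])

# Solution Matrix of dataset 1 (Table VI)
X = np.array([
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0],
    [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1],
])

# ASSUMPTION: department index of each individual (not given in the text)
departments = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3])

def obtained_requirements(X, departments, n_departments):
    """Count, per department and group, the individuals allocated by X."""
    obtained = np.zeros((n_departments, X.shape[1]), dtype=int)
    for i, d in enumerate(departments):
        obtained[d] += X[i]
    return obtained

def penalty(X, departments, R):
    """Sum of absolute differences between obtained and desired requirements."""
    return int(np.abs(obtained_requirements(X, departments, R.shape[0]) - R).sum())

feasible_penalty = penalty(X, departments, R)

# Moving the first individual from group 1 to group 3 breaks two requirements
X_bad = X.copy()
X_bad[0] = [0, 0, 1]
infeasible_penalty = penalty(X_bad, departments, R)
print(feasible_penalty, infeasible_penalty)  # 0 2
```

A feasible allocation has zero penalty, and every misplaced individual adds at least 2, since one cell of the obtained matrix loses a person and another gains one.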

The pseudo-code of the proposed approach to solve the MTFP is described in Algorithm 1.

Input: Requirement Matrix, Sociometric Matrix
Output: Solution Matrix

Initialize population
while the number of generations is not met do
    Compute the fitness of the population
    Selection
    Crossover
    Mutation
end while
Return the Solution Matrix
Algorithm 1 Genetic Algorithm
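The loop of Algorithm 1 can be sketched compactly in Python. This is an illustration under several assumptions, not the authors' implementation: each gene is stored as a group index instead of a one-hot bit string (the two encodings decode to the same allocation), the penalty is subtracted from the General Cohesion with unit weight, the best individual seen so far is remembered and returned, and the toy instance, parameter values and function names are all hypothetical.

```python
import numpy as np

def cohesion(groups, S, n_groups):
    """General Cohesion: within-group sociometric sums divided by N (assumed form)."""
    total = 0
    for g in range(n_groups):
        members = np.flatnonzero(groups == g)
        total += S[np.ix_(members, members)].sum()
    return total / len(groups)

def penalty(groups, departments, R):
    """Sum of absolute differences between obtained and desired requirements."""
    obtained = np.zeros_like(R)
    np.add.at(obtained, (departments, groups), 1)
    return np.abs(obtained - R).sum()

def fitness(groups, S, R, departments):
    # ASSUMPTION: the penalty is subtracted with unit weight.
    return cohesion(groups, S, R.shape[1]) - penalty(groups, departments, R)

def genetic_algorithm(S, R, departments, pop_size=20, generations=100,
                      p_c=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n, n_groups = S.shape[0], R.shape[1]
    p_m = 1.0 / n  # on average one mutated gene per chromosome
    pop = rng.integers(0, n_groups, size=(pop_size, n))
    best, best_fit = pop[0].copy(), -np.inf
    for _ in range(generations):
        fit = np.array([fitness(ind, S, R, departments) for ind in pop])
        if fit.max() > best_fit:  # remember the best individual seen so far
            best_fit, best = fit.max(), pop[fit.argmax()].copy()
        # Selection: binary tournaments between randomly drawn individuals
        a = rng.integers(0, pop_size, size=pop_size)
        b = rng.integers(0, pop_size, size=pop_size)
        pop = pop[np.where(fit[a] >= fit[b], a, b)]
        # Crossover: exchange genes between consecutive pairs with probability p_c
        for k in range(0, pop_size - 1, 2):
            swap = rng.random(n) < p_c
            tmp = pop[k, swap].copy()
            pop[k, swap] = pop[k + 1, swap]
            pop[k + 1, swap] = tmp
        # Mutation: reassign each gene to a random group with probability p_m
        mask = rng.random(pop.shape) < p_m
        pop[mask] = rng.integers(0, n_groups, size=int(mask.sum()))
    return best, float(best_fit)

# Hypothetical toy instance: 6 individuals, 2 departments, 2 groups
toy_rng = np.random.default_rng(1)
S_toy = toy_rng.integers(-1, 2, size=(6, 6))
np.fill_diagonal(S_toy, 0)
departments_toy = np.array([0, 0, 0, 1, 1, 1])
R_toy = np.array([[2, 1], [1, 2]])
best, best_fit = genetic_algorithm(S_toy, R_toy, departments_toy)
print(best, best_fit)
```

Infeasible individuals are not repaired or discarded; they simply receive a lower fitness through the penalty, which is what pushes the population toward allocations that comply with the Requirement Matrix.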

IV Experimental Results

IV-A Benchmarks

In order to perform the experiments, seven datasets were developed. The features of each dataset are presented in Table I.

Dataset Individuals Groups Departments
1 10 3 4
2 15 3 3
3 20 2 4
4 21 3 3
5 50 4 4
6 100 5 4
7 200 6 5
TABLE I: Features of the proposed datasets

The datasets consist of randomly generated non-symmetric sociometric matrices and requirement matrices. To illustrate the MTFP, the input matrices of dataset 1 are presented in detail in Tables II and III. The rest of the datasets and the source code are available at:

Department G1 G2 G3
1 2 2 0
2 2 1 0
3 0 1 1
4 0 0 1
TABLE II: Dataset 1 - Requirement Matrix (rows: departments; columns G1-G3: groups)
0 1 0 0 1 -1 1 1 1 -1
0 0 0 0 1 1 1 0 -1 1
1 1 0 1 -1 1 1 -1 1 1
1 1 1 0 0 1 1 1 1 1
0 0 -1 -1 0 1 1 0 0 0
0 1 1 0 0 0 1 -1 0 1
1 1 0 0 0 0 0 1 1 0
0 0 1 0 0 0 0 0 1 1
1 0 0 0 0 0 0 0 0 0
0 1 -1 0 0 1 1 0 -1 0
TABLE III: Dataset 1 - Sociometric Matrix

IV-B Experimental Setup

Experiments were performed in the Matlab 2015a development environment, running on a PC with an Intel Core i5 processor, 8 GB of memory and the Microsoft Windows 10 operating system.

Before performing the experiments it is necessary to set values for the GA parameters: number of generations, population size, crossover probability and mutation probability. The effectiveness of the algorithm greatly depends on the choice of parameters, so empirical tests were performed to find a combination of values appropriate to this problem.

Aiming to estimate the number of generations it was taken into account that a Solution Matrix generated by a chromosome has dimension

in such a way that satisfying the constraint (4) the total combinations of possible solutions is . In order to make the number of generations scalable for each problem, the information of the number of possible combinations was used so that where is a constant. However, in this way the number of generations grows abruptly as the number of individuals increases. To overcome this problem a logarithmic transformation was performed, resulting in . In the experiments performed was chosen.
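The scaling argument can be made concrete with a short calculation. Under constraint (4) each individual is placed in exactly one group, so an instance with N individuals and G groups has G^N candidate allocations, and the logarithm of this count is N ln(G). The symbols N and G, the helper names and the use of the natural logarithm are assumptions of this sketch, and the constant chosen by the authors is not reproduced here. As an observation only, the GA "Func. Eval." values of Table V differ from 1000 N ln(G) by fewer than 25 evaluations for every dataset.

```python
import math

# dataset: (individuals N, groups G, GA function evaluations reported in Table V)
datasets = {
    1: (10, 3, 11000),
    2: (15, 3, 16500),
    3: (20, 2, 13850),
    4: (21, 3, 23050),
    5: (50, 4, 69300),
    6: (100, 5, 160950),
    7: (200, 6, 358350),
}

def search_space_size(n, g):
    """Number of allocations when each of n individuals picks one of g groups."""
    return g ** n

def log_search_space(n, g):
    """Natural logarithm of the search-space size, computed without overflow."""
    return n * math.log(g)

for name, (n, g, reported) in datasets.items():
    digits = len(str(search_space_size(n, g)))
    print(name, f"search space has {digits} digits",
          round(1000 * log_search_space(n, g)), reported)
```

The raw search-space size has 156 digits for dataset 7, which shows why a generation count proportional to it would be unusable, whereas its logarithm grows only linearly with the number of individuals.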

For population size with , little improvements have been observed. Therefore, for all experiments was used . The crossover probability did not have a significant impact on the results. In the experiments it was set up to which is equivalent to . In the first case, approximately of the genes would be exchanged and in the second case of the genes would be exchanged, resulting in similar offspring pairs.

Initial tests showed that a fixed value of mutation probability for all datasets did not present good results. This happens due to the difference in the number of individuals of the problem at hand. In addition, the tests have shown that applying the mutation on average in only one gene of the chromosome provides good results. Therefore, since a chromosome has dimension equivalent to the number of individuals of the MTFP, it was set up to . Taking as an example a dataset with then we have , i.e., each gene has a chance of mutation.

Table IV presents the parameters used in computational experiments with GA.

Parameter Value
Number of generations
Population size
Crossover probability
Mutation probability
TABLE IV: Parameters of Genetic Algorithm

IV-C Results Analysis

The GA was run on all datasets; statistical results were collected over repeated runs and are presented in Table V. In addition to the GA results, Table V presents the results for the exhaustive method, which generates all possible permutations satisfying the Requirement Matrix (2) and evaluates the cohesion of every permutation that meets the problem requirements. The exhaustive method results represent the optimal solution of the problem. For both algorithms, the time was computed as the mean over the runs.

Dataset  EM Best Fitness  EM Time (s)  EM Permutations  EM Func. Eval.  GA Max  GA Mean  GA Std  GA Min  GA Time (s)  GA Func. Eval.
1        1.6000           1.8e-2       1296             36              1.6000  1.5800   0.0616  1.4000  1.2          11000
2        2.3333           134.5        1.4e+7           7200            2.3333  2.2100   0.1087  2.0000  2.2          16500
3        3.5000           234.6        2.5e+7           5040            3.5000  3.4425   0.0494  3.3500  2.0          13850
4        2.6667           6082.9       6.1e+8           31360           2.6667  2.5667   0.1155  2.3810  4.0          23050
5        N/A              N/A          N/A              N/A             3.1400  2.6600   0.2569  2.2000  37.5         69300
6        N/A              N/A          N/A              N/A             4.1600  3.5350   0.2573  3.1400  289.7        160950
7        N/A              N/A          N/A              N/A             5.5600  4.5930   0.4232  3.8500  2659.3       358350
TABLE V: Simulation results for Exhaustive Method (EM) and Genetic Algorithm (GA)

Due to the NP-hard nature of the problem, the exhaustive method was applied only to the less complex datasets. Among the first four datasets, dataset 1 is the only one for which the exhaustive method took less time than the GA, since its optimal solution is obtained with a small number of permutations. The Solution Matrix obtained for this dataset with the exhaustive method is presented in Table VI.

1 0 0
1 0 0
0 1 0
0 1 0
1 0 0
0 1 0
1 0 0
0 0 1
0 1 0
0 0 1
TABLE VI: Dataset 1 - Solution Matrix
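For an instance as small as dataset 1, the exhaustive search can be reproduced by brute force. The sketch below is not the authors' enumeration procedure: it simply tries all 3^10 assignments and keeps those that comply with Table II. It assumes that individuals are ordered by department (the vector `departments`), which is not stated in the text, and that the General Cohesion equals the within-group sociometric sums divided by the number of individuals. Under these assumptions it finds 36 feasible allocations, matching the "Func. Eval." entry of the exhaustive method for dataset 1, and a single best allocation equal to Table VI with General Cohesion 1.6.

```python
from itertools import product

import numpy as np

# Sociometric Matrix (Table III) and Requirement Matrix (Table II) of dataset 1
S = np.array([
    [0, 1, 0, 0, 1, -1, 1, 1, 1, -1],
    [0, 0, 0, 0, 1, 1, 1, 0, -1, 1],
    [1, 1, 0, 1, -1, 1, 1, -1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, -1, -1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, -1, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 1, 1, 0, -1, 0],
])
R = np.array([[2, 2, 0], [2, 1, 0], [0, 1, 1], [0, 0, 1]])

# ASSUMPTION: department index of each individual (not given in the text)
departments = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3])

def is_feasible(groups):
    """Check compliance with the Requirement Matrix."""
    obtained = np.zeros_like(R)
    np.add.at(obtained, (departments, groups), 1)
    return np.array_equal(obtained, R)

def general_cohesion(groups):
    """Within-group sociometric sums divided by the number of individuals."""
    total = 0
    for g in range(R.shape[1]):
        members = np.flatnonzero(groups == g)
        total += S[np.ix_(members, members)].sum()
    return total / len(groups)

candidates = (np.array(a) for a in product(range(3), repeat=10))
feasible = [groups for groups in candidates if is_feasible(groups)]
best = max(feasible, key=general_cohesion)
best_gc = float(general_cohesion(best))
print(len(feasible), best_gc, best + 1)  # 36 1.6 [1 1 2 2 1 2 1 3 2 3]
```

The printed group numbers (1-based) correspond row by row to the Solution Matrix of Table VI.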

Comparing both methods shows how effective the GA is, especially on datasets 1-4, where its results are close to the optimal results obtained by the exhaustive method and, for datasets 2-4, are obtained in a much shorter time. The penalty method was quite effective for this problem, since the GA was able to find feasible solutions in all datasets and runs. Although the optimal values for datasets 5-7 are not known, the number of generations was adapted so that the algorithm runs for more generations on more complex problems. The higher the number of generations, the greater the number of function evaluations, which leads the algorithm to explore more of the search space.

In order to verify how restrictive the MTFP is due to its combinatorial nature, a series of experiments was performed in different scenarios (different input matrices), varying the number of individuals and the number of groups for a fixed number of departments. Due to differences in the input Requirement Matrices, the number of permutations performed by the exhaustive method also differs. In order to compute the average time, repeated runs were performed for each combination of the number of individuals and the number of groups. The outliers (maximum and minimum values) were discarded and the mean was calculated from the remaining values. The average execution time of the exhaustive method is shown in Fig. 4. The mean time curves are close to straight lines in the semi-log plot, which suggests that the time grows exponentially with the number of individuals.

Fig. 4: Average time to execute the exhaustive method by the number of individuals.

V Conclusion

In this paper, an evolutionary approach was proposed to solve the MTFP based on sociometry in order to form teams with high cohesion in a reduced time. The genetic operators were adapted to deal with the constraints of the problem. In addition, the penalty method was used to force the algorithm to find feasible solutions. Computational experiments were performed to evaluate the performance of the proposed algorithm. The algorithm was executed for datasets with different levels of complexity, and the statistical results of cohesion and time are presented. The performance of the algorithm was compared with the exhaustive method in four out of seven datasets. The comparison on the remaining datasets was impracticable due to the high computational time required by the exhaustive method for more complex problems. The proposed method provides results close to the optimal ones with a reduced computational time, so the approach turns out to be quite promising. Future research will investigate the use of an Allocation Matrix with fractional values, which will allow the workload of employees to be distributed over more than one group.


This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. R. A. Krohling would like to thank the Brazilian agency CNPq and the local Agency of the state of Espírito Santo FAPES for financial support under grant No. 309161/2015-0 and No. 039/2016, respectively.


  • [1] P. Patanakul and D. Milosevic, “The Effectiveness in Managing a Group of Multiple Projects: Factors of Influence and Measurement Criteria,” International Journal of Project Management, vol. 27, no. 3, pp. 216–233, 2009.
  • [2] B. K. Baiden and A. D. Price, “The Effect of Integration on Project Delivery Team Effectiveness,” International Journal of Project Management, vol. 29, no. 2, pp. 129–136, 2011.
  • [3] M. Hoegl and H. G. Gemuenden, “Teamwork Quality and the Success of Innovative Projects: A Theoretical Concept and Empirical Evidence,” Organization Science, vol. 12, no. 4, pp. 435–449, 2001.
  • [4] R. Dwivedula and C. N. Bredillet, “Profiling Work Motivation of Project Workers,” International Journal of Project Management, vol. 28, no. 2, pp. 158–165, 2010.
  • [5] I. Maurer, “How to Build Trust in Inter-organizational Projects: The Impact of Project Staffing and Project Rewards on the Formation of Trust, Knowledge Acquisition and Product Innovation,” International Journal of Project Management, vol. 28, no. 7, pp. 629–637, 2010.
  • [6] M. H. Jordan, H. S. Feild, and A. A. Armenakis, “The Relationship of Group Process Variables and Team Performance: A Team-level Analysis in a Field Setting,” Small Group Research, vol. 33, no. 1, pp. 121–150, 2002.
  • [7] A. Seers, M. Petty, and J. F. Cashman, “Team-member Exchange under Team and Traditional Management: A Naturally Occurring Quasi-experiment,” Group & Organization Management, vol. 20, no. 1, pp. 18–38, 1995.
  • [8] K. Sanders and A. Nauta, “Social Cohesiveness and Absenteeism: The Relationship Between Characteristics of Employees and Short-term Absenteeism within an Organization,” Small Group Research, vol. 35, no. 6, pp. 724–741, 2004.
  • [9] T. Lappas, K. Liu, and E. Terzi, “Finding a Team of Experts in Social Networks,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 467–476, ACM, 2009.
  • [10] M. Fathian, M. Saei-Shahi, and A. Makui, “A New Optimization Model for Reliable Team Formation Problem Considering Experts’ Collaboration Network,” IEEE Transactions on Engineering Management, vol. 64, pp. 586–593, Nov 2017.
  • [11] J. H. Gutiérrez, C. A. Astudillo, P. Ballesteros-Pérez, D. Mora-Melià, and A. Candia-Véjar, “The Multiple Team Formation Problem Using Sociometry,” Computers & Operations Research, vol. 75, pp. 150–162, 2016.
  • [12] M. A. Campion, G. J. Medsker, and A. C. Higgs, “Relations Between Work Group Characteristics and Effectiveness: Implications for Designing Effective Work Groups,” Personnel psychology, vol. 46, no. 4, pp. 823–847, 1993.
  • [13] E. L. Fitzpatrick and R. G. Askin, “Forming Effective Worker Teams with Multi-functional Skill Requirements,” Computers & Industrial Engineering, vol. 48, no. 3, pp. 593–608, 2005.
  • [14] P. Ballesteros-Pérez, M. C. González-Cruz, and M. Fernández-Diego, “Human Resource Allocation Management in Multiple Projects Using Sociometric Techniques,” International Journal of Project Management, vol. 30, no. 8, pp. 901–913, 2012.
  • [15] V. Yannibelli and A. Amandi, “A Deterministic Crowding Evolutionary Algorithm to form Learning Teams in a Collaborative Learning Context,” Expert Systems with Applications, vol. 39, no. 10, pp. 8584–8592, 2012.
  • [16] C.-M. Lin and M. Gen, “Multi-criteria Human Resource Allocation for Solving Multistage Combinatorial Optimization Problems using Multiobjective Hybrid Genetic Algorithm,” Expert Systems with Applications, vol. 34, no. 4, pp. 2480–2490, 2008.
  • [17] L. E. Agustín-Blas, S. Salcedo-Sanz, E. G. Ortiz-García, A. Portilla-Figueras, Á. M. Pérez-Bellido, and S. Jiménez-Fernández, “Team Formation based on Group Technology: A Hybrid Grouping Genetic Algorithm Approach,” Computers & Operations Research, vol. 38, no. 2, pp. 484–495, 2011.
  • [18] R.-C. Chen, “Grouping Optimization based on Social Relationships,” Mathematical Problems in Engineering, vol. 2012, pp. 1–19, 2012.
  • [19] R.-C. Chen, J.-Y. Li, N.-J. Ma, and Y.-T. Chang, “Application of Sociometry and Genetic Algorithm to Selection of Class Officers,” International Information Institute (Tokyo). Information, vol. 16, no. 2, p. 1233, 2013.
  • [20] I. E. da Silva and R. A. Krohling, “A Fuzzy Sociometric Approach to Human Resource Allocation,” in IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–8, July 2018.
  • [21] D. J. Beal, R. R. Cohen, M. J. Burke, and C. L. McLendon, “Cohesion and Performance in Groups: A Meta-analytic Clarification of Construct Relations,” Journal of Applied Psychology, vol. 88, no. 6, p. 989, 2003.
  • [22] J. L. Moreno, “Fundamentos de Sociometría,” tech. rep., 1961.
  • [23] J. Holland, “Adaptation in Natural and Artificial Systems,” Ann Arbor: The University of Michigan Press, 1975.
  • [24] B. L. Miller, D. E. Goldberg, et al., “Genetic Algorithms, Tournament Selection, and the Effects of Noise,” Complex Systems, vol. 9, no. 3, pp. 193–212, 1995.
  • [25] Y.-B. Hu, Y.-P. Wang, and F.-Y. Guo, “A New Penalty based Genetic Algorithm for Constrained Optimization Problems,” in 2005 International Conference on Machine Learning and Cybernetics, vol. 5, pp. 3025–3029, Aug 2005.