Mobile Crowdsourcing (MCS) is the act of outsourcing sensing tasks traditionally performed by employees or contractors to a large, undefined, and dynamic group of Internet users or a cyber community through an open or targeted call. It harnesses the built-in sensors of mobile devices (e.g., smartphones, tablets, smart devices) and allows Internet-of-Things (IoT) devices to establish relationships and cooperate to complete specific sensing and data collection tasks without requiring pre-deployed dedicated infrastructure [7491206, 7538981].
In a typical MCS architecture [article5], there are three main agents: task requesters, task workers, and the cloud platform hosting the main framework. The task requester, which may be a human carrying a smartphone or a machine (e.g., an autonomous vehicle), provides its task information and requirements to the cloud platform in order to have a certain task executed, e.g., collecting photos or sensing data. The platform then uses these criteria to recruit suitable workers and provides them with the task requirements. Most existing MCS approaches tackle simple tasks such as photo collection or improving labeling accuracy. In each of these MCS tasks, the selected workers carry out their assignments independently of each other, and their partial results are combined to produce the overall outcome. To this end, most recent MCS studies, for example Cheng et al., have focused on optimizing the recruitment process by hiring skilled workers for each task such that they can fulfill the task's requirements and provide suitable results.
However, in many other MCS applications, the sets of tasks, also called projects, are very complex, and their successful completion depends not only on the expertise of the selected workers but also on how efficiently these workers can work together as a team. This could be, for example, the case of an MCS framework that helps build a virtual search party of smartphone users to find and return lost items, pets, or persons. The enlisted workers are divided into convenient groups, using specific criteria required by the requester, and are asked to collaborate in the search by providing up-to-the-minute reports about any updates. If the collaboration fails for one reason or another, the job cannot be completed successfully. Therefore, beyond the required skills, the successful completion of the project is very sensitive to the way team members collaborate and communicate. This type of MCS is called Collaborative MCS (CMCS).
CMCS is a teamwork-based paradigm where workers, often with diverse and complementary skills, form groups and work together to complete complex tasks [2222, Mass1912, 7806269]. In traditional crowdsourcing applications, workers are recruited and asked to complete the same task independently of each other and without any contact. Recent studies have begun to address the need to recruit a team of workers in MCS [article, 7248382, AAAI1612106]. In fact, some approaches focused on dividing complex tasks into flows of simple sub-tasks and allocating these sub-tasks to a team of workers. At the end, the partial results are combined to produce the overall outcome. These approaches focused only on the expertise of the recruited team members and did not consider the interaction among members. Other approaches limited their focus to team formation in Social Networks (SNs) and proposed solutions that hire teams with good social relationships regardless of the members' levels of expertise [8488386, article7, article8].
In this paper, we complement these studies and investigate two CMCS recruitment techniques that combine social interaction with expertise and consider four key recruitment metrics: required skills, social relationships, budget allocation, and recruiter confidence level. The first recruitment approach is a platform-based strategy in which the CMCS platform itself is responsible for forming the entire team based on its knowledge of the workers' SNs and their attributes (e.g., profile, history, experience, previous performance, reliability). The second is a leader-based strategy in which the cloud platform selects a worker as a leader and delegates the team formation procedure to it. The chosen leader recruits team members based on its knowledge of other workers in its SN vicinity (e.g., social incentive mechanism, man-to-man, friendship). In short, we propose:
A platform-based recruitment strategy that exploits the platform's knowledge of the workers to recruit suitable team members.
A leader-based recruitment strategy that uses the knowledge of a leader appointed by the platform to recruit the rest of the team.
The common goal of both recruitment strategies is to form teams that are not only skilled but also socially connected. The platform-based approach utilizes the overall knowledge of the platform about the workers' attributes and SNs. However, although the platform typically has a global view of all available workers' attributes, its knowledge has limited accuracy and precision. To address this, we introduce the leader-based approach, which makes use of the workers' knowledge about the skills and profiles of other workers in their vicinity. This approach relies on locality and exploits the knowledge of the leader, who is usually better informed than the platform about the workers in its SN vicinity. Both CMCS recruitment strategies are modeled as Integer Linear Programs (ILP). Simulation results show a performance trade-off between the two recruitment strategies when varying the workers' SN edge degree. Compared to the leader-based strategy, the platform-based strategy recruits a more skilled team but with weaker SN relationships and a higher cost.
II CMCS Model
In a typical CMCS system, two external parties interact with the platform. As illustrated in Fig. 1, these actors are the project initiator (e.g., a local authority, weather company, or mobile user) and the available workers (e.g., humans with smartphones, autonomous vehicles, or sensors). When it needs services, the project initiator submits its CMCS project, along with its set of required skills (e.g., expertise for humans or device specifications), to the platform. The latter is usually a centralized computing architecture responsible for recruiting a suitable team.
We denote by the set of workers registered in the CMCS platform where . Given the set of all possible skills in the system, we define the logical skill quantity for a project by where if the skill is required by project and otherwise. Hence, the skill set required by the project . Each worker has a degree of expertise in skill denoted by where . The term represents the actual expertise value of skill that worker has and it is interpreted as follows: means that the worker is an expert in skill . Otherwise, . We assume that a recruiter , which can be a worker as well as the platform itself, does not perfectly know the degree of the skill of each worker. Instead, it knows an estimated value expressed as follows: , where is a skill error made by the recruiter given its knowledge about worker . Let be the set of skills provided by worker . We suppose that each recruited worker can only contribute one required skill. Consequently, for a project having as a skill set , the number of team members must equal the number of required skills.
To execute a task with skill , a worker may request a certain cost denoted by .
We suppose that the workers in the platform form an SN modeled as an undirected weighted graph . Every vertex of corresponds to a worker, while the set of edges represents the SN relationships between the workers. Initially, we only consider the edges connecting pairs of workers that can directly communicate and collaborate, and we associate the value to their weights. Then, the edges between the remaining pairs of vertices, e.g., , which are not directly connected, are given a weight computed using the smallest number of hops, denoted by , needed for one vertex of the pair to reach the other. Hence, the graph is converted into a complete graph where all vertices are directly connected and the edge weights indicate the social relationship level between each pair of workers. The real values on each edge between two workers and are given as: . We also assume that the recruiter does not perfectly know the relationship degree between workers in the SN. Instead, it knows an estimated value expressed as: , where and represent the relationship errors made by recruiter given its knowledge about workers and , respectively. Both noises depend on the confidence level of the recruiter towards worker . If an isolated sub-graph exists, then the weights connecting a node of this sub-graph to external vertices are pure noise (i.e., ).
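The conversion of the SN into a complete graph rests on the hop count between every pair of workers. The following is a minimal sketch of that step only, assuming an unweighted adjacency-set representation of the direct relationships. The names `hop_counts` and `toy_sn` are illustrative and not taken from the paper, and the mapping from hop count to edge weight is deliberately left out.

```python
from collections import deque

def hop_counts(adjacency):
    """Return {(u, v): hops} for every ordered pair of distinct workers.

    adjacency maps each worker to the set of workers it is directly
    connected to. Pairs with no connecting path get None (isolated sub-graph).
    """
    hops = {}
    for source in adjacency:
        # Breadth-first search gives the smallest number of hops from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target in adjacency:
            if target != source:
                hops[(source, target)] = dist.get(target)
    return hops

# Hypothetical toy SN: a path a - b - c plus an isolated worker d.
toy_sn = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": set()}
toy_hops = hop_counts(toy_sn)
```

In this toy graph, workers `a` and `c` are two hops apart, while every pair involving `d` has no path, which corresponds to the isolated sub-graph case where the estimated weight is treated as pure noise.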
The efficiency of worker chosen by recruiter to contribute with skill is written as follows:
where is the set of hired workers in the formed team. The efficiency expression, which the platform aims to maximize, is established using the four key metrics. The quantities in (1) are introduced for normalization purposes so that the four key metrics have the same order of magnitude in the efficiency expression. The weights , with and , indicate the recruiter's recruitment strategy. For example, the project requester may only care about its task being completed by the workers having the highest skills (i.e., , and are set to 0). If the recruiter is looking for a cost-effective team, a higher value of is recommended.
III Problem Formulation
In this section, we formulate a general optimization problem that covers both recruitment strategies. We start by defining the required decision variables. Then, we introduce the optimization problem, specifying the constraints common to the two approaches as well as those specific to each recruitment strategy.
Let be the set of possible team recruiters, which can be defined as follows:
In order to assign to the recruiter the chosen workers for a project and their contributed skill , we introduce a binary decision variable defined as follows:
Depending on the recruitment strategy, the index can take either the value of for the platform-based approach or any value in for the leader-based approach. Furthermore, this index indicates the identity of the recruiter, which tells the optimizer on which entity's knowledge of the workers' attributes to base its recruitment decision.
Another binary decision variable is introduced to capture the social relationships between the project teammates. This variable is defined as follows:
The variable indicates all the positive 2-tuple combinations of the chosen team members. Its value can be computed using the following expression:
where the symbol ($\wedge$) represents the logical operator AND.
The objective of this paper is to hire the most suitable team to complete a CMCS project . To this end, we introduce a general team formation optimization problem for both approaches and we define it as follows:
The previous constraints are common for both recruitment strategies. The following constraint is specifically added to the platform-based approach:
The leader indicator is an endogenous binary variable that equals 1 if worker is the leader of the hired team and 0 otherwise. Constraint (6) forces each worker to provide at most one skill for the project . The value of is computed using a product of the decision variables . Therefore, we use the standard linearization technique and replace given in (5) with the constraints (7a) and (7b).
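The linearization step can be checked by enumeration. The sketch below uses the textbook linearization of a product of two binary variables, z = x AND y, through the inequalities z <= x, z <= y, and z >= x + y - 1. It is an illustration of the general technique under that assumption; the exact grouping of these inequalities into constraints (7a) and (7b) may differ, and the function names are hypothetical.

```python
from itertools import product

def feasible_z(x, y):
    """Binary values of z that satisfy the linear constraints for given x, y."""
    return [z for z in (0, 1) if z <= x and z <= y and z >= x + y - 1]

def linearization_is_exact():
    """Check that the linear constraints force z = x * y for all binary x, y."""
    return all(feasible_z(x, y) == [x * y]
               for x, y in product((0, 1), repeat=2))
```

For every combination of binary inputs the constraints leave exactly one admissible value of z, namely the product, which is why the nonlinear term can be dropped from an ILP without changing its solutions.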
Constraints (8) address the SNs of the workers within the team. In fact, the first term eliminates the case of counting a worker to be its own co-worker within the hired team. The second term highlights the symmetric relations between vertices in the undirected graph of workers’ SNs.
The constraints presented in (9) and (10) have analogous goals but are adapted to each strategy. On the one hand, constraint (9) applies to the platform-based approach and ensures that each worker within the hired team contributes a required skill defined by the project. On the other hand, constraints (10a), (10b), (10c), and (10d) result from the Big-M method and guarantee that the team leader recruits a team of workers having the required skills. The term represents the upper bound of . For the leader-based approach, we add the first term in constraints (11) to ensure that the chosen leader is also a team member that contributes one required skill to the project. To guarantee the uniqueness of the leader, we include the second term in constraints (11).
The optimization problem in (P) is formulated as an ILP and can be solved optimally using off-the-shelf software such as Gurobi or CPLEX, which integrate the branch-and-bound algorithm and the simplex method. Also, note that there are cases where the problem is infeasible, for example when the number of workers is less than the number of skills . However, in CMCS platforms, this case is unlikely to occur since, by definition, in large-scale IoT systems, the number of workers far exceeds the number of required skills.
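To make the structure of the assignment concrete, the toy sketch below replaces the ILP solver with an exhaustive search over assignments of one distinct worker per required skill, which enforces the at-most-one-skill-per-worker rule. It is only an illustration under stated assumptions: `best_team`, `toy_score`, and the expertise values are hypothetical, and the score is a placeholder that sums expertise only, not the efficiency expression (1).

```python
from itertools import permutations

def best_team(workers, required_skills, score):
    """Exhaustively assign one distinct worker to each required skill.

    Returns (best_score, assignment), where assignment maps skill -> worker,
    or (None, None) when there are fewer workers than required skills.
    """
    best = (None, None)
    # permutations() never repeats a worker, so each worker gets at most one skill.
    for team in permutations(workers, len(required_skills)):
        assignment = dict(zip(required_skills, team))
        value = score(assignment)
        if best[0] is None or value > best[0]:
            best = (value, assignment)
    return best

# Hypothetical expertise values, indexed by (worker, skill).
expertise = {
    ("w1", "photo"): 0.9, ("w1", "audio"): 0.2,
    ("w2", "photo"): 0.4, ("w2", "audio"): 0.8,
    ("w3", "photo"): 0.5, ("w3", "audio"): 0.5,
}

def toy_score(assignment):
    # Placeholder objective: total expertise of the assigned workers.
    return sum(expertise[(worker, skill)] for skill, worker in assignment.items())

toy_value, toy_team = best_team(["w1", "w2", "w3"], ["photo", "audio"], toy_score)
```

The search returns no team when fewer workers than required skills are available, mirroring the infeasible case noted above. Exhaustive enumeration grows factorially with the team size, which is why a dedicated ILP solver is used in practice.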
IV Experiments and Evaluation
In order to simulate the team formation process in CMCS, we use synthetic data with different types of project requirements and workers' skills. We use the Watts-Strogatz model to randomly produce the graph with small-world properties. For the platform-based approach, the error levels on the workers' skills and the workers' relationships are modeled according to the workers' history (i.e., workers with a longer history in the platform have lower uncertainty levels). On the other hand, for the leader-based recruitment strategy, the uncertainty levels of potential leaders towards workers and the error levels on the workers' relationships are proportional to, and increase with, the number of hops between the team leader and other workers. For both approaches, the error is modeled as a zero-mean normal distribution.
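The simulation setup can be sketched as follows. This is a minimal illustration, assuming the usual Watts-Strogatz construction (a ring lattice whose edges are rewired with a given probability) and a zero-mean Gaussian error whose standard deviation grows linearly with the hop count. The function names, the parameter `sigma_per_hop`, and the linear scaling are assumptions made for the example, not values taken from the paper.

```python
import random

def watts_strogatz(n, k, p, rng):
    """Small-world graph as adjacency sets.

    Starts from a ring of n nodes, each joined to its k nearest neighbours
    (k even, k < n), then rewires each lattice edge with probability p.
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < p:
                # Pick a new endpoint that creates no self-loop or duplicate edge.
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def noisy_estimate(true_value, hops, sigma_per_hop, rng):
    """Recruiter's estimate: true value plus zero-mean Gaussian error.

    The standard deviation is assumed to scale linearly with the hop count.
    """
    return true_value + rng.gauss(0.0, sigma_per_hop * hops)
```

Rewiring preserves the number of edges, so the edge density of the generated graph is controlled by `n` and `k` alone, while `p` controls how far the graph departs from the regular ring.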
In Figs. 2-5, we perform Monte Carlo simulations where realizations of different parameter settings are generated. We evaluate the averages of four metrics of the selected teams: skills efficiency, recruiter confidence, team cost, and social relationships, while varying the edge density of the SN graph for and . We adopt a proportional setting and set , . We also set the values of and to and , respectively.
The result of this simulation for shows that, for the leader-based approach, the performance on all metrics improves as the edge density increases. For instance, the uncertainty level on the selected team skill decreases and tends to zero when the edge density reaches one. This can be explained by the fact that, by increasing the edge density of the graph, the number of hops between the leader and any worker in decreases. Moreover, the number of directly connected workers increases until the resultant graph is fully connected (i.e., everyone knows everyone). Also, the team skills level, budget allocation, and relationships rate increase with the edge density because the team leader has more workers connected to it and consequently a wider choice in its vicinity. When the edge density increases, the relationships rate within the team also increases because team members are more likely to have more connections. However, the performance of the platform-based approach, except for the relationships rate, remains essentially invariant while varying the edge density. We notice a growth in the relationships rate and a slight decrease in all the other metrics. This can be explained by the fact that the platform bases its recruitment decision on the workers' history, so any change of the edge density in only affects the relationships term in the objective function. Notice that, for the leader-based approach, the recruited team skills level exceeds that of the platform-based approach when the SN becomes nearly fully connected. These observations remain valid for . In fact, expanding the network while maintaining the number of required team members slightly enhances the performance of both recruitment strategies by increasing the team skill level and decreasing the recruiter uncertainty. We notice that, for the leader-based approach, the team skill level curve has a higher slope and converges faster than when .
In Fig. 6, we present an example of the recruited teams using both recruitment strategies for , . The figure shows that the leader-based approach recruits a congregated team (workers are close to each other in the SN) while the platform-based approach recruits a team relatively scattered but with higher skills.
In this paper, we presented two recruitment strategies that form a team of skilled and socially connected workers in CMCS IoT systems. The platform-based approach exploits the platform's knowledge to hire the team. The leader-based approach uses workers' knowledge about their SN neighbors to designate a leader that recruits the rest of the team. Results show a performance trade-off between the two strategies when varying the workers' SN edge degree. The platform-based strategy recruits a more skilled team than the leader-based approach but with weaker SN relationships and higher rewards. In future work, we will focus on designing low-complexity recruitment algorithms enabling real-time CMCS operations.