Trust Evaluation Mechanism for User Recruitment in Mobile Crowd-Sensing in the Internet of Things

03/04/2019 ∙ by Nguyen Binh Truong, et al. ∙ 0

Mobile Crowd-Sensing (MCS) has emerged as a promising solution for large-scale data collection, leveraging built-in sensors and social applications in mobile devices to enable a variety of Internet of Things (IoT) services. However, human involvement in MCS brings a high risk that users unintentionally contribute corrupted or falsified data, or intentionally spread disinformation for malevolent purposes, consequently undermining IoT services. Therefore, recruiting trustworthy contributors plays a crucial role in collecting high-quality data and providing better quality of services while minimizing the vulnerabilities and risks to MCS systems. In this article, a novel trust model called Experience-Reputation (E-R) is proposed for evaluating trust relationships between any two mobile device users in an MCS platform. To enable the E-R model, virtual interactions among the users are generated and assessed based on the quality of the data contributed by these users. Based on these interactions, two indicators of trust called Experience and Reputation are calculated accordingly. By incorporating the Experience and Reputation trust indicators (TIs), trust relationships between the users are established, evaluated and maintained. Based on these trust relationships, a novel trust-based recruitment scheme is carried out for selecting the most trustworthy MCS users to contribute to data sensing tasks. In order to evaluate the performance and effectiveness of the proposed trust-based mechanism as well as the E-R trust model, we deploy several recruitment schemes in an MCS testbed which consists of both normal and malicious users. The results highlight the strength of the trust-based scheme as it delivers better quality for MCS services while being able to detect malicious users.


I Introduction

Emerging Internet of Things (IoT) applications and services depend heavily on data collected from sensing campaigns such as sensor networks and crowd-sourcing. Traditional sensor networks deploy sensors in the terrain to acquire data on a variety of aspects of human lives, but they have never reached their full potential or been successfully implemented in the real world. This is due to a number of unsolved challenges, such as high installation cost and insufficient spatial coverage [r01]. The new sensing paradigm called Mobile Crowd-Sensing (MCS), which is a form of crowd-sourcing that leverages built-in sensors and applications in smart mobile devices, has recently emerged as a promising solution for IoT sensing campaigns [r02]. MCS allows increasing numbers of mobile device owners to share sensed data and, in exchange, the owners get incentives for their contributions. The data that can be collected from smart mobile devices are diverse, including local news, noise levels, traffic conditions, and social knowledge. With diversified spatial coverage due to the mobility of large-scale mobile users, MCS is expected to enable a variety of IoT services including public safety, traffic planning, environment monitoring, and social recommendation. This human-powered sensing approach augments the capabilities of existing IoT infrastructures without introducing additional costs, resulting in a win-win strategy for both users and IoT systems.

However, the introduction of MCS also poses some significant challenges such as cross-space data mining, retaining privacy and providing high-quality data [r03]. Low-quality data could lead to numerous difficulties in providing high-quality services or even damage MCS systems. Certain methods have been proposed for improving the quality of data (QoD) in MCS, including estimation and prediction of sensing data, along with statistical processing for identifying and removing outliers in sensing values [r04]. Data selection techniques are also used to filter low-quality or irrelevant data and to generate a high-quality dataset for further processing in IoT services [r05]. Another approach is the use of a recruitment mechanism for selecting trustworthy users who are expected to contribute high-quality data. An appropriate recruitment scheme would therefore not only reduce system costs but also minimize vulnerabilities, risks and potential attacks in MCS systems.

In this article, a novel trust evaluation mechanism called Experience-Reputation (E-R) is proposed for evaluating trust relationships between any two mobile device users in an MCS platform. To establish and evaluate the trust relationships, we utilize our conceptual trust model in the IoT environment called Reputation-Experience-Knowledge (REK), which comprises the trust indicators (TIs) Reputation, Experience and Knowledge proposed in [r06, r07]. To employ the E-R mechanism, virtual interactions between service requesters and data contributors are generated when one user requests an MCS service and other users contribute their sensing data to fulfill it. These interactions are then assessed by performing a QoD assessment over the contributed data. Based on these interactions, Experience relationships between service requesters and data contributors are established and calculated. Then, based on all of these Experience relationships between the users, the Reputation of each user is calculated accordingly. Trust relationships between users are finalized by combining the two associated TIs: Experience and Reputation. As a result, the proposed trust-based recruitment scheme examines the trust relationships between a service requester and potential participants in order to select the most trustworthy contributors for a requested service.

To verify the effectiveness of a user recruitment scheme, we propose an evaluation model for the quality of MCS service (QoS) based on the QoD assessment of data contributed to the service. We simulate the trust-based recruitment scheme along with two popular recruitment mechanisms using predictive algorithms in the same MCS testbed for comparison. The results indicate that the trust-based scheme not only provides better QoS for MCS services but also efficiently differentiates between high-quality, low-quality and malicious users. As a result, using the proposed trust evaluation mechanism for recruiting trustworthy data contributors not only prevents adversaries from contributing falsified data and mounting attacks but also motivates users to provide high-quality data in order to be recruited in the next sensing task, hence further strengthening the MCS platform.

The main contributions of this article are three-fold:

  • The E-R trust mechanism, consisting of the Experience and Reputation models, for evaluating trust relationships between MCS users.

  • A practical trust-based user recruitment scheme, deployable in the real world, leveraging the QoD assessment and the E-R mechanism.

  • A simulation of an MCS testbed consisting of three types of user models and deploying different user recruitment schemes, including our trust-based user recruitment, together with an evaluation model for QoS based on QoD assessment.

The rest of the article is organized as follows. Section II presents background and related work on the MCS platform and user recruitment schemes. Section III introduces the trust-based MCS system model and components and the following section specifies the proposed trust evaluation mechanism including the Experience and Reputation computational models in detail. Section V describes the simulation scenarios including the testbed and user recruitment algorithms. Section VI presents the outcomes with analysis and discussion. The last section concludes our work and outlines future research directions.

II Mobile Crowd-Sensing Background and Related Work

II-A Background on Mobile Crowd-Sensing in the IoT

Fig. 1: A Centralized MCS Platform Architecture

In IoT ecosystems, data from various sources such as actuators, sensors, and smart devices are gathered, analyzed and processed to provide ubiquitous and intelligent services [r08, r09]. In this environment, users could contribute to the process through sharing not only data sensed from their own devices’ sensors but also their incidents and knowledge over social networks without the need to pre-allocate sensing devices in the area [r10], hence saving deployment costs [r11, r12]. This prospect gave rise to the term MCS, which has since gained popularity as a promising data acquisition approach for the IoT because of the increasing usage of mobile smart devices. These devices are equipped with many different types of sensors such as Global Positioning System (GPS) receivers, accelerometers, gyroscopes, microphones and cameras, along with advanced features including processing and wireless communications that can efficiently support crowd-sensing processes [r13, r14]. In an MCS platform, heterogeneous information regarding different aspects of human life is collected from mobile devices before being aggregated, analyzed and mined for supporting a variety of IoT applications and services (Fig. 1).

The data acquisition models for an MCS system can be categorized as either opportunistic or participatory [r01]. In opportunistic sensing systems, data is automatically collected using a background process, such as reporting speed and GPS coordinates in navigation services while driving. Sensing decisions are application- or device-driven, meaning that the involvement of participants is minimal; thus, user recruitment is not necessary. Conversely, in participatory sensing systems, participants agree to a requested sensing task that is dispatched from a centralized MCS platform. Users are explicitly engaged in the sensing process by accepting or rejecting the sensing request, and by actively collecting data such as taking a picture, reporting an available parking lot and manually providing information (illustrated as steps (2) and (3) in Fig. 1). Such sensing data can be extracted and directly consumed by end-users for supporting some prompt services or further aggregated in the cloud for large-scale sensing and community intelligence mining [r04]. It is worth noting that in both data acquisition models, the participant trajectories could be revealed by the MCS platform, resulting in the risk of privacy leakage. As a consequence, mobile users may not be enthusiastic about contributing sensing data to the platform even though they get incentives (step (5) in Fig. 1). Privacy-preserving mechanisms should therefore also be implemented in the MCS platform [r1_4].

Generally, the life cycle of an MCS system comprises three phases: ‘task creation and user recruitment’, ‘task execution’, and ‘data collection and processing’ [r15]. More recently, Zhang et al. have divided the life cycle into four phases: ‘task creation’, ‘task assignment’, ‘individual task execution’, and ‘sensing data integration’ [r16], where the ‘task assignment’ phase recruits users and assigns individual sensing tasks to the participants. In either formulation, the user recruitment scheme plays a key role in the success of any participatory MCS system. The high density of mobile device users, especially in urban areas, allows an MCS system to select only a subset of all available data contributors, and different user recruitment schemes may lead to different system performance. In order to obtain high-quality data, a simple solution is to recruit as many participants as possible [r5_3]. However, collecting data from collocated users may result in data redundancy, which does not further improve the QoD while wasting incentive budget and storage space and imposing network overheads. Therefore, a good recruitment scheme not only selects proper users for providing high-quality data but also allows MCS service providers to manage expenditure by considering incentive costs based on users’ contributions. These MCS systems are tailored to the centralized MCS platform illustrated in Fig. 1, which facilitates major system control operations including user recruitment.

II-B Related Work

A variety of user recruitment schemes in a centralized MCS platform have been investigated. Reddy et al. have proposed a recruitment mechanism in a participatory sensing platform that considers core attributes such as geographic and temporal coverage and user behaviors to define participant profiles comprising availability, reputation and cost in their recruitment policy [r17]. Building on these attributes, Karaliopoulos et al. have come up with a deterministic and stochastic mobility model for solving an optimization problem on cost minimization and user location in their recruitment policy [r18]. Lately, other researchers have employed piggyback crowd-sensing techniques for gathering more information from mobile-device owners such as phone calls, GPS coordinates, and mobile application usage. As a result, these proposed recruitment mechanisms are able to predict geographical coverage and user availability; thus, they are capable of determining a minimum number of participants for a sensing task in an energy-efficient recruitment strategy [r19, r20, r21]. For instance, the authors in [r23] have demonstrated a recruitment policy based on statistics of social services usage to compute a ‘sociability’ metric, indicating the willingness of users to participate in sensing tasks. Wang et al. have theoretically leveraged mobile social networks such as Facebook, Twitter and FourSquare as the medium for information sharing and propagation in a novel recruitment platform and proposed two recruitment algorithms. The ultimate goal is to select a near-optimal set of social network users used as seeds (i.e., influenced users) in order to maximize the temporal-spatial coverage of MCS sensing tasks [minor_1]. The authors in [r5_6] have proposed a prediction-based recruitment mechanism considering a factor called ‘contact probability’, which indicates whether two MCS users are at the same point of interest (PoI). They have used a semi-Markov model to determine the probability distribution of the users’ arrival time at a PoI to calculate the inter-user contact probability, which is used in a prediction strategy to recruit users with the purpose of lowering data uploading cost. Similarly, Li et al. considered a recruitment scheme in a large-scale piggyback MCS system with dynamic and heterogeneous sensing tasks with the aim of minimizing the number of participants while still achieving a stable task coverage [r5_1]. Most of the aforementioned recruitment approaches share the purpose of developing an energy-efficient and cost-effective recruitment strategy by minimizing the sensing costs for an MCS service provider while guaranteeing certain requirements of requested services such as sensing area coverage. These approaches normally use an auction mechanism for negotiating incentives with mobile-device users [r21, r22]. However, such recruitment mechanisms need to obtain location traces, history of phone calls, and personal information from social services, which poses a serious risk of user privacy leakage. Moreover, the quality of the contributed sensing data from the recruited users is largely neglected. There are multiple factors that affect the recruitment process, and the assurance of high-quality sensing data is of paramount importance.

  • Quality of Data in MCS User Recruitment

Recently, several efforts have proposed to recruit users based not only on time, location and statistical metrics but also on the QoD and the quality of information (QoI). Liu et al. have taken the QoI requirements of sensing tasks into account for some incentive-based recruitment schemes using a bidding mechanism [r1_1]. However, such schemes only work in a trustworthy environment with no malicious users, because they assume that the recruited users will always provide data satisfying the QoI requirements for the sensing tasks as stated in the bid. Li et al. also performed statistical analysis on the history of participation in previous sensing tasks for learning and predicting the QoD of the next sensing task [minor_3]. The drawback of this idea is the requirement of calculating the similarity features among sensing tasks in order to recruit high-quality users. The ultimate goal of this work is similar to ours, but our approach is more practical and is not based on the calculation of this similarity. The authors in [r5_2] proposed a participant selection scheme to provide high-QoI satisfaction while minimizing overall energy consumption. The scheme is based on two criteria, the remaining energy level and the ‘willingness of participation’ defined by the rejection probability, which serve as the input for a constrained optimization solution. Again, this scheme only works if there is no malicious user who purposely uploads high-quality sensor readings as samples in order to be recruited and then provides false data to mislead the MCS system. QoI is not only used as a criterion for user recruitment but also for incentive schemes in MCS systems. For example, the authors in [r5_4] leverage QoI assessment to allocate suitable incentives for data contributors, resulting in a fair incentive mechanism.

  • Reputation and Trust in MCS User Recruitment

In order to deal with the presence of malicious users, reputation can be used as an indicator to identify trustworthy participants in MCS sensing tasks on the assumption that regular users and adversaries behave differently. Kantarci et al. have proposed a reputation-based MCS management approach adopting the M-Sensing auction approach [r24] in which a statistical reputation is taken into account [r25]. This statistical reputation is simply the percentage of true sensor readings over total readings. Pouryazdan et al. have further employed a vote-based approach using a social network for evaluating users’ reputation [r26, r27]. In this platform, users who have recently participated in a common sensing task form a community. All members of the community will then vote on the reputation of a newly joining user based on their similarity on sensor readings. The same authors have also considered a vote-based mechanism implementing a Subgame Perfect Equilibrium (SPE) and gamification techniques based on the calculation of users’ reputation in the three-step recruitment process for improving the platform utility. The reputation scores are used as the core attributes for recruitment and incentivizing users in sensing tasks in [minor_5, minor_6]. Nevertheless, such reputation-based recruitment schemes have implicitly equated reputation with trust and used reputation in its place. In reality, reputation is one of several TIs partially affecting trust, but should not be confused with trust itself [r06]. Moreover, the mechanisms used in such approaches are either too simple [r25, r29], being based only on statistical sensor readings, or rely on impractical assumptions [r26, r27, minor_5, minor_6]. For instance, one assumption is that if two users join the same sensing task, then they will get connected and directly interact with each other. Another assumption is that any user has the right to access all previous readings of other users in the same community for making up their votes. These assumptions make such mechanisms infeasible to deploy in the real world. The authors in [r1_2] have proposed a dynamic trust-based framework for recruiting suitable mobile users that provide high-quality sensing data on time. In that paper, an overall trust degree is calculated for selecting trustworthy users by aggregating three factors: Direct Trust, Feedback Trust and an Incentive Function. The final goal is similar to our research work; however, the drawback of this approach is that it requires feedback from task recruiters for the Feedback Trust and needs to keep track of non-cooperative behaviors of mobile users for the Incentive Function. Restuccia et al. have summarized recent research about developing a framework for discovering trust in MCS [r5_5]. They have furthermore discussed current challenges and different approaches for evaluating trust through a collection of trust indicators.

Given this state-of-the-art, we propose a trust evaluation mechanism that can be effectively used to recruit trustworthy users while still being practically deployable for real-world services.

III E-R Trust Mechanism in MCS Platform: Model and System Components

This section explores an MCS system model and scenarios, then introduces the E-R trust evaluation mechanism and its components deployed on top of a centralized MCS platform.

III-A MCS System Model and Scenario

In an MCS platform, users share and provide data from their smart devices through being physically close (direct sensing model) or through a centralized MCS platform (indirect sensing model) [r30]. In the direct sensing model, direct interactions exist between a requester and a provider such that sensing data is transmitted in a peer-to-peer manner. This sensing model uses a variety of wireless communication technologies such as Wi-Fi Direct, ZigBee, Near-Field Communication (NFC) and Bluetooth over a social platform that operates among nearby smart device users [r31, r32]. In the indirect sensing model, a requester and a provider indirectly interact via a centralized MCS platform. In this model, users can upload and obtain data to and from a cloud server through wide-range communication technologies such as Wi-Fi and 3G/4G. The indirect sensing model adopts the well-known service-oriented approach model called Sensing as a Service (S2aaS) [r33]. Melino et al. have further developed a Cloud-based SaaS model designated for MCS systems called Mobile Crowd-Sensing as a Service (MCSaaS) [r34].

Nevertheless, in any MCS model, a user can be either a requester that asks for a service or a data provider that collects and delivers data used by another service; thus MCS users interact with each other either directly or indirectly. This introduces either a ‘direct’ or an ‘indirect’ relationship between a service requester and a data provider depending on the sensing model deployed in an MCS system. In this article, we consider MCS systems that adopt the indirect sensing model with a participatory data acquisition style, which is overwhelmingly the most common in real-world usage. For such a system model, there is a centralized MCS cloud platform that handles and operates all the MCS processes including data collection and processing, task creation and execution, and the user recruitment and incentive schemes, as illustrated in Fig. 1.

III-B E-R Trust Mechanism in the MCS Platform

Trust can be considered as the ‘belief’ of a trustor that the trustee will perform a task as the trustor expects. Trust plays an important role in supporting participants to overcome the perception of uncertainty and risks when making a decision [r06]. In the MCS context, trust can be utilized to predict whether a mobile device user (i.e., the trustee) is going to provide high-quality data for a service requested by a service requester (i.e., the trustor). To establish and evaluate trust relationships between service requesters and data contributors, the REK trust model proposed in [r06, r07, r35] is employed.

As depicted in Fig. 2, trust comprises three TIs called Reputation, Experience and Knowledge. Knowledge is identified as ‘direct trust’ and evaluated by inferring trustees’ characteristics considering the trust context [r06]. In the MCS context, Knowledge is constituted from a variety of attributes such as availability, the mobility model, GPS coordinates and geographic coverage. These attributes specify criteria for user ability and eligibility for fulfilling crowd-sensing campaigns. Experience and Reputation, in contrast, are identified as ‘indirect trust’ and are quantified by accumulating previous interactions between mobile device users. Experience is a relationship between two users reflecting the personal perception of a trustor on a trustee. Reputation is the property of a user indicating the global perception of that user, obtained by considering all personal perceptions toward it [r06].

Fig. 2: Trust Indicators and Attributes in the REK Trust Model

Knowledge assessment requires various information from mobile device users, which raises critical privacy concerns in this context. Moreover, some information is challenging to retrieve or is not practical to implement in real-world scenarios [r06]. For those reasons, we simplify the REK model into a model, which we call E-R, that relies on only two indicators: Experience and Reputation. Knowledge is neglected in the E-R model, but some information could play a supplemental role in strengthening the evaluation of trust. As illustrated in Fig. 3, the E-R trust components are integrated in a centralized MCS cloud platform that establishes and manages virtual interactions between mobile-device users. An indirect interaction occurs after each sensing task is accomplished, and the interaction value is calculated based on the QoD provided to the MCS system (from data providers) and feedback (from service consumers). Experience between any two users is established and updated by an aggregation model on the virtual interactions. Based on all Experiences between users, the Reputation of each user is calculated accordingly. Finally, the value of a trust relationship is calculated by aggregating the Experience and Reputation. Detailed calculation models for the Experience, Reputation and trust are presented in Section IV.

Fig. 3: E-R Trust Mechanism in a Centralized MCS platform

III-C Quality of Data Assessment

The aim of MCS systems is to extract useful knowledge and intelligence from sensing data for delivering smart services; to achieve this, high QoD must be ensured [r36]. Low-quality data might cause numerous problems such as deception in decision making, consumer dissatisfaction and distrust of the system [r37]. Well-known research works have pointed out that QoD consists of evaluating measurable properties (dimensions) that represent certain aspects of the data [r37, r38], and data can be identified as high quality based on the measurements of these dimensions [r37]. Six data quality dimensions are specified by Askham et al. in [r38] and have been widely accepted, namely Accuracy, Completeness, Consistency, Timeliness, Uniqueness, and Validity. Detailed analysis and measurement methodologies for the six dimensions have also been proposed in related articles. Therefore, based on the system requirements, context, and system goals, these dimensions can be taken into consideration for the QoD assessment [r40, r41].
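As an illustration of how per-dimension measurements might be folded into a single QoD score, the following minimal Python sketch takes a score in [0, 1] for each dimension under consideration and returns their weighted mean. The weighted-mean aggregation, the function name `qod_score`, the optional weights and the [0, 1] scale are assumptions made for this example only; the cited works define their own measurement methodologies.

```python
# Hypothetical sketch: aggregate per-dimension data quality scores into one QoD score.
# The weighted mean and the [0, 1] scale are assumptions, not the cited methodology.

QOD_DIMENSIONS = (
    "accuracy",
    "completeness",
    "consistency",
    "timeliness",
    "uniqueness",
    "validity",
)


def qod_score(dimension_scores, weights=None):
    """Return the weighted mean of the supplied dimension scores.

    dimension_scores: dict mapping a dimension name to a score in [0, 1].
    weights: optional dict mapping a dimension name to a non-negative weight;
             dimensions without an explicit weight default to 1.0.
    """
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    total = 0.0
    weight_sum = 0.0
    for name, score in dimension_scores.items():
        if name not in QOD_DIMENSIONS:
            raise ValueError(f"unknown QoD dimension: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must lie in [0, 1]")
        weight = 1.0 if weights is None else weights.get(name, 1.0)
        if weight < 0.0:
            raise ValueError(f"weight for {name} must be non-negative")
        total += weight * score
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("weights must not all be zero")
    return total / weight_sum
```

Passing only the dimensions that matter for a given service, for example `qod_score({"accuracy": 1.0, "timeliness": 0.5})`, reflects the point that the set of dimensions is chosen according to system requirements and context.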

Fig. 4: QoD Monitoring Module for traffic and parking sensors in the Wise-IoT project

We have utilized the QoD calculation mechanisms in [r37, r38] for measuring the QoD of live data streams from traffic sensors and parking sensors deployed in Santander City Center, Spain, as a result of the Wise-IoT project (http://wise-iot.eu/en/home). As the data is presented in semantic form, we have proposed two further novel dimensions called Syntactic Accuracy and Semantic Accuracy in the QoD assessment [r42]. These two dimensions are suitable for checking data syntax and semantics from live information produced by the sensors (Fig. 4) using predefined data quality rules as well as the ontology validating rules developed by EGM (http://www.eglobalmark.com) [r42]. We believe this mechanism can be reused here for evaluating sensing data in an MCS platform because the underlying theoretical and practical QoD assessments are identical.

III-D User Feedback

QoD is the most important indicator of how contributors fulfill an assigned sensing task, but it may not be sufficient alone because QoD scores do not completely reflect the level of consumers’ satisfaction with the service provider. In this regard, feedback can complement the assessment of the extent to which a service provider has accomplished a requested service. Feedback can be either implicit or explicit, and may or may not require human participation. Explicit feedback could be obtained by directly asking customers to give opinions after a service has been provided. This approach has been used in many e-commerce services such as eBay, Amazon and Airbnb, but it requires considerable effort to encourage users to participate, and opinions are sometimes biased. The implicit approach is based on calculation models with some predefined criteria to estimate the outcome, which normally do not require a human participant. For example, this has been applied in some networking protocols as an ACK message to indicate whether a packet or a file is transmitted successfully or unsuccessfully [r45].

User feedback of this kind is, however, outside the scope of this article. In the E-R trust component we neglect the feedback mechanism at this stage, and thus indirect interactions between users rely on QoD scores only. Nevertheless, user feedback could be an important component for improving the quality of IoT services, and we will consider it as part of further work.

IV E-R Trust Evaluation Model

In this section, the mathematical calculation models for the E-R trust mechanism are described in detail.

IV-A Experience Model

Experience is an asymmetric relationship between two entities built up from previous interactions reflecting to what extent a trustor trusts a trustee. After each interaction, awareness between the trustor and the trustee is supposed to improve, and Experience should be maintained to correctly indicate the relationship between the two (illustrated in Fig. 5).

Fig. 5: Experience computation model based on feedback mechanism

The proposed Experience model for MCS systems follows human relationships investigated in sociological literature [r46, r47]. That is, Experience increases due to cooperative interactions and decreases by uncooperative interactions. Experience also decays if no interactions occur after a period of time. The amount of the increase, decrease and decay depends on the intensity of interactions, interaction scores, and current Experience value. Therefore, Experience can be modeled using mathematical models as follows with the notations denoted in Table I:

Description
Experience value at a given time
Maximum value of Experience, normally set to 1
Minimum value of Experience, normally set to 0
Initial Experience value at the bootstrap
Interaction value (i.e., QoD score) at a given time
Maximum Increase value
Rate of the Decrease, normally greater than 1
Cooperative Threshold for the interaction value
Uncooperative Threshold for the interaction value
Minimum Decay value
Decay Rate

TABLE I: NOTATIONS USED IN THE EXPERIENCE MODEL

  • Increase Model (due to cooperative interactions)

A cooperative interaction is when the interaction value (i.e., the QoD score) reaches the Cooperative Threshold. The Increase function is modeled using a linear difference equation as follows:

(1)
(2)
  • Decrease Model (due to uncooperative interactions)

An uncooperative interaction is when the QoD score falls below the Uncooperative Threshold. The Decrease function is modeled as follows:

(3)

where the increase amount is already determined by (2).

  • Decay Model (due to no or neutral interactions)

Experience TI decays if there is no interaction after a period of time or the interactions are neutral (i.e., the QoD score lies between the Uncooperative Threshold and the Cooperative Threshold). The Decay function is proposed as follows:

(4)
(5)
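For orientation, one set of update rules that is consistent with the verbal description of the three models can be written as below. This is a sketch under stated assumptions rather than the article's exact formulation: the symbols $E_t$ (Experience value at step $t$), $q_t$ (interaction value, i.e., QoD score), $E_{\max}$ and $E_{\min}$ (maximum and minimum Experience), $\alpha$ (Maximum Increase value), $\beta$ (Rate of the Decrease), $\theta_{co}$ and $\theta_{unco}$ (Cooperative and Uncooperative Thresholds), $\delta$ (Minimum Decay value) and $\gamma$ (Decay Rate) are names chosen here for the quantities listed in Table I, and the functional forms of the decrease and decay terms, including the lower bounds, are assumed.

```latex
\begin{align*}
\text{Increase } (q_t \ge \theta_{co}):\quad
  & E_{t+1} = E_t + q_t\,\Delta_{t+1},
  \qquad \Delta_{t+1} = \alpha\left(1 - \frac{E_t}{E_{\max}}\right) \\
\text{Decrease } (q_t < \theta_{unco}):\quad
  & E_{t+1} = \max\bigl(E_{\min},\; E_t - \beta\,(1 - q_t)\,\Delta_{t+1}\bigr) \\
\text{Decay (no or neutral interaction)}:\quad
  & E_{t+1} = \max\bigl(E_{\min},\; E_t - D_{t+1}\bigr),
  \qquad D_{t+1} = \delta\left(1 + \gamma\left(1 - \frac{E_t}{E_{\max}}\right)\right)
\end{align*}
```

As a worked instance of the assumed increase rule, taking $E_{\max} = 1$, $\alpha = 0.1$, $E_t = 0.5$ and $q_t = 0.9$ gives $\Delta_{t+1} = 0.1 \times 0.5 = 0.05$ and $E_{t+1} = 0.5 + 0.9 \times 0.05 = 0.545$; repeating the same interaction from $E_t = 0.9$ yields only $0.9 + 0.9 \times 0.01 = 0.909$, which illustrates why stronger relationships grow more slowly.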

IV-B Analysis and Discussion for Experience Model

As we are imitating the relationships seen in human society, it is expected that Experience TI is accumulated from cooperative interactions and depends on both the QoD score and the current Experience value. Also, a strong relationship should require more and more cooperative interactions to attain. Considering the trust evaluation in which trust values and QoD scores are in the range [0, 1], Experience values should be normalized to the range [0, 1]; thus we set the maximum Experience value to 1 and the minimum to 0, with the initial value lying between the two. It is obvious that the increase model defined in (1) and (2) is incremental, and that the increase from one time step to the next is relatively large when the current Experience value is small and vice versa (considering the same interaction value), meaning that a higher Experience value gets more difficult to achieve.

Lemma IV.1.

The proposed increase function is always less than and asymptotic to .

Proof.

From (1) and (2) with , the function can be re-written as:

(6)

Subtracting both sides of (6) from :

(7)

According to (7), because , , and , ; in other words, .

Moreover, because , we have:

(8)

Because , , and are three pre-defined parameters, thus:

(9)

Applying the Squeeze theorem on (8) and (9), we have: . Therefore, the increase of is asymptotic to .

As with the Increase function, the Decrease function in (3) is decremental and the decrease value depends on both the current value of and the uncooperative QoD score. It is worth noting that the Decrease rate should be greater than because a strong relationship (i.e., high value) is difficult to gain but easy to lose (e.g., means that the value lost to an uncooperative interaction is twice the amount gained in the corresponding cooperative interaction). The Decrease function also ensures that strong relationships are more resistant to uncooperative interactions whereas weak relationships are severely damaged.

Regarding the Decay function, is the minimal decay value which guarantees that even strong relationships still decay; and is the decay rate. In sociology, relationships between people decay over time if participants do not interact, although the decay rates differ depending on the strength of the relationships [r48]. Similarly, the proposed decay model shows that relationships require periodic maintenance, but strong ones tend to persist longer even without reinforcing cooperative interactions. As can be seen in (4), the decay value is assumed to be inversely proportional to the current Experience value, thus strong relationships exhibit less decay than weak ones.
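The qualitative behavior discussed above can be illustrated with a minimal Python sketch. Equations (1) to (5) are not reproduced in this text, so the functional forms below are assumptions chosen only to match the described properties (increase that slows near the maximum, decrease amplified by the Decrease rate, a small decay that is weaker for strong relationships). The names `ExpParams`, `increase_step` and `update_experience` are hypothetical, and the default values loosely follow Table II under an assumed mapping of values to symbols.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpParams:
    # Assumed mapping of the Table II values onto the model parameters.
    max_exp: float = 1.0
    min_exp: float = 0.0
    alpha: float = 0.1        # maximum increase per interaction
    beta: float = 2.0         # decrease rate (losing is faster than gaining)
    theta_co: float = 0.6     # cooperative threshold on the QoD score
    theta_unco: float = 0.3   # uncooperative threshold on the QoD score
    delta: float = 0.005      # minimum decay per idle or neutral step
    gamma: float = 0.005      # decay rate


def increase_step(exp: float, p: ExpParams) -> float:
    """Assumed increment: shrinks linearly as exp approaches max_exp."""
    return p.alpha * (1.0 - exp / p.max_exp)


def update_experience(exp: float, qod: Optional[float],
                      p: ExpParams = ExpParams()) -> float:
    """Apply one update; qod=None means no interaction in this period."""
    if qod is None or p.theta_unco <= qod < p.theta_co:
        # Decay: at least delta, slightly larger for weaker relationships.
        decay = p.delta * (1.0 + p.gamma * (1.0 - exp / p.max_exp))
        return max(p.min_exp, exp - decay)
    if qod >= p.theta_co:
        # Increase: scaled by the QoD score; stays below max_exp.
        return exp + qod * increase_step(exp, p)
    # Decrease: scaled by how poor the score is and amplified by beta.
    return max(p.min_exp, exp - p.beta * (1.0 - qod) * increase_step(exp, p))
```

Under these assumptions, repeatedly calling `update_experience(e, 1.0)` moves `e` toward `max_exp` in ever smaller steps, which is the asymptotic behavior stated in Lemma IV.1.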

IV-C Reputation Model

Reputation is a property of a user reflecting the overall opinion of a community about that user. In the MCS environment, especially in urban scenarios with a large number of mobile users, only small numbers of users have already interacted with others, resulting in a very high possibility that a service requester (i.e., the trustor) and a data provider (i.e., the trustee) are new to each other, thus no prior Experience relationship exists between the two. The Reputation of the trustee, therefore, is a vital indicator for the trust evaluation.

As Reputation is an overall opinion, the calculation for the reputation of a user , denoted as , needs to take all users that have prior Experience with into consideration. Intuitively, Reputation can be quantified using a graph analysis algorithm on the Experience relationship graph, which is somewhat similar to the Google PageRank [r49] and the weighted PageRank [r50] approaches. The difference from the two previous models is that each user contributes differently to , in either a positive or negative manner, depending on both (i.e., the Experience from toward ) and the user’s reputation (i.e., ).

To come up with the new model for Reputation, we modify the PageRank models proposed in [r49, r50] by classifying the Experience relationships into two sub-groups: Positive Experiences (i.e., ) and Negative Experiences (i.e., ) where is a predefined threshold. Let be the number of users in a MCS system, and is a damping factor () as defined in the standard PageRank [r6_18]. Then, the Reputation model is proposed as a composition of the two components Positive Reputation and Negative Reputation as follows:

  • Positive Reputation

The positive reputation can be calculated as follows:

(10)

Where: is the sum of all positive Experience from user .

  • Negative Reputation

The negative reputation can be calculated as follows:

(11)

Where: is the sum of all complements of negative Experience from user .

  • Overall Reputation

Finally, the overall reputation is the combination of the two positive and negative reputations:

(12)

IV-D Mathematical Analysis for Reputation Model

According to the proposed model, the reputation of a user is recursively calculated from other users’ reputations and the corresponding Experience relationships; consequently reputations of all users (forming a -vector denoted as ) in a MCS platform are correlated with each other. Therefore, this vector might not exist due to the correlations among users’ reputations; or the vector might be ambiguous (i.e., not unique: a user might have more than one reputation value) which is not reasonable.

Lemma IV.2.

The reputation vector calculated by the proposed reputation model exists and is unique.

Proof.

Regarding (10), let be the diagonal matrix where the diagonal element . Let be a matrix that:

(13)

Let be the positive reputation vector consisting of elements . Then, (10) can be expressed in matrix notation as follows:

(14)

where is a matrix of . Let us define:

(15)

Thus, (14) can be rewritten as:

(16)

Therefore, is the of matrix with . We now prove that the of the matrix exists and is unique. Equations (13) and (14) are reminiscent of the stationary distribution of a Markov chain which moves among the set of states from to with the transition matrix where go from state to state .

Let us consider a discrete-time Markov chain defined by a set of states as the entities and a transition probability matrix :

(17)

Consequently, the Markov chain can be defined as follows:

(18)

Fortunately, this turns out to be the random-surfer model with random jumps, as in the edge-weighted PageRank model [r6_29]. It follows that the Markov chain is strongly connected, and the vector, which is the stationary distribution of the Markov chain, is unique [r6_18, r6_26, r6_28].

Similarly, the vector from (11) exists and is unique. As a consequence, the overall reputation vector defined in (12) also exists and is unique.
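As a hedged illustration of the model analyzed above, the following Python sketch computes a positive and a negative rank by power iteration and combines them. Equations (10) to (12) are not reproduced in this text, so the update rule is an assumption: an edge-weighted PageRank where positive votes are weighted by the Experience value, negative votes by its complement, and the overall value is the positive rank minus the negative rank, floored at zero. The names `weighted_rank` and `reputation`, the threshold `theta = 0.5` and the damping factor `d = 0.85` are hypothetical choices.

```python
from typing import List, Optional, Sequence


def weighted_rank(weights: Sequence[Sequence[float]], d: float = 0.85,
                  tol: float = 1e-12, max_iter: int = 1000) -> List[float]:
    """Edge-weighted PageRank by power iteration.

    weights[i][j] >= 0 is the vote weight from user i toward user j.
    Users without outgoing votes simply pass nothing on (a simplification).
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1.0 - d) / n] * n
        for i in range(n):
            if out_sum[i] == 0.0:
                continue
            share = d * rank[i] / out_sum[i]
            for j in range(n):
                new[j] += share * weights[i][j]
        if max(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
    return rank


def reputation(exp: Sequence[Sequence[Optional[float]]],
               theta: float = 0.5, d: float = 0.85) -> List[float]:
    """exp[i][j] is the Experience of user i toward user j, or None if absent."""
    pos = [[e if e is not None and e > theta else 0.0 for e in row] for row in exp]
    neg = [[1.0 - e if e is not None and e < theta else 0.0 for e in row] for row in exp]
    rep_pos = weighted_rank(pos, d)
    rep_neg = weighted_rank(neg, d)
    # Assumed combination rule: positive minus negative, floored at zero.
    return [max(0.0, p - q) for p, q in zip(rep_pos, rep_neg)]
```

In a toy graph where two users hold high Experience toward a third and low Experience toward another, the sketch ranks the praised user above the other two. Because each iteration shrinks the difference between successive vectors by at least the factor `d`, the loop converges from any starting vector, which mirrors the uniqueness argument of the lemma.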

IV-E Final Trust Value

A trust value is an aggregation of both the Experience and Reputation values. There are a variety of techniques for combining the two TIs such as Bayesian neural networks, fuzzy logic, and machine learning depending on the specific use-cases and individual users' preferences. A simple weighted sum for calculating a final trust value between trustor A and trustee B is used as follows:

(19)

Where are weighting factors satisfying . The weighting factors can be autonomously tuned using different techniques such as machine learning and semantic reasoning.
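A weighted sum of this kind can be sketched in a few lines of Python. The constraint on the weighting factors is not legible in this text, so the sketch assumes two non-negative weights that sum to 1; the function name `final_trust` and the default weight are hypothetical.

```python
def final_trust(experience: float, reputation: float, w_exp: float = 0.5) -> float:
    """Weighted sum of the two trust indicators.

    Assumes the two weights are non-negative and sum to 1, so only the
    Experience weight is passed and the Reputation weight is 1 - w_exp.
    """
    if not 0.0 <= w_exp <= 1.0:
        raise ValueError("w_exp must lie in [0, 1]")
    return w_exp * experience + (1.0 - w_exp) * reputation
```

A requester that values its own history more than community opinion would pick a larger `w_exp`, for instance `final_trust(0.8, 0.4, w_exp=0.75)`.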

V Simulation Testbed and User Recruitment Schemes

This section presents a MCS testbed in which the trust-based user recruitment is simulated along with two other schemes called Average and Polynomial Regression predictive models [r51].

V-A User Models in MCS

Some statistics and analysis were carried out on QoD scores in a real-time data stream collected from traffic sensors (footnote 3: https://mu.tlmat.unican.es:8443/v2/entities?limit=1&type=ParkingSpot) and parking sensors (footnote 4: https://mu.tlmat.unican.es:8443/v2/entities?limit=1&type=TrafficFlowObserved) deployed in the city of Santander, Spain as part of the Wise-IoT project. Histograms of QoD from various sensors were analyzed and normalized in the range . Based on these histograms, we observed that the QoD score distribution from any sensor fits the Beta probability distribution family well. By using a Beta parameter estimation mechanism, we categorize users in a MCS system into three groups based on their QoD score distribution as follows:

Fig. 6: User Models in MCS systems
  • High-Quality Users

High quality users consistently produce high QoD scores in most sensing tasks. Based on the statistical information, QoD scores from a high-quality user distribute in the interval but the highest distribution is in the range . QoD scores from a high-quality user follow a unimodal Beta distribution with two positive shape parameters satisfying and . The probability density function (PDF) of the Beta distributions for 50 high-quality users are shown in Fig. 6.

  • Low-Quality Users

Low-quality users consistently produce average or below-average QoD scores in most of the sensing tasks. QoD scores are in the interval but mostly fall in the range . Similar to high-quality users, QoD scores from a low-quality user follow a unimodal Beta distribution with the two positive shape parameters satisfying and . The PDF of the Beta distribution for 50 low-quality users are depicted in Fig. 6.

  • Intelligent Malicious Users

Even though no data from malicious smart devices was collected, a feasible intelligent malicious user might follow the behaviors below:

  • Normally produces very high QoD scores in order to pose as a strong candidate for recruitment schemes.

  • Unpredictably and intentionally produces very low-quality data once the user is recruited in a sensing task to destroy a targeted MCS service. The service will be heavily damaged if the data is used for fulfilling requested services.

According to the above description, the malicious user model follows a bi-modal Beta distribution. Thus, firstly we define two Beta distribution models, one for very high QoD scores , satisfying and ; and one for very low QoD scores , satisfying and . Then the two Beta distributions are mixed in order to form the desired bimodal Beta distribution using a mixture coefficient parameter as follows:

(20)

Fig. 6 also illustrates 25 malicious users with the mixture coefficient , meaning that the users follow the in of the sensing tasks (providing high quality data) and provide very low quality data in of the sensing tasks (i.e., following the ).
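The three user models can be simulated with the Beta sampler in Python's standard library. The shape parameters and the mixture coefficient used in the article are not legible in this text, so every numeric value below is a hypothetical stand-in chosen only to give the described shapes (high scores, average or below-average scores, and a bimodal mix of very high and very low scores). The function name `sample_qod` is likewise an assumption.

```python
import random


def sample_qod(kind: str, rng: random.Random, mix: float = 0.8) -> float:
    """Draw one QoD score in [0, 1] for a user of the given kind.

    All shape parameters are illustrative, not the article's values.
    """
    if kind == "high":
        return rng.betavariate(8.0, 2.0)       # unimodal, mass near the top
    if kind == "low":
        return rng.betavariate(3.0, 4.0)       # unimodal, average or below
    if kind == "malicious":
        # Bimodal mixture: behave well with probability mix, sabotage otherwise.
        if rng.random() < mix:
            return rng.betavariate(20.0, 2.0)  # pose as a strong candidate
        return rng.betavariate(2.0, 20.0)      # occasional very low quality
    raise ValueError(f"unknown user kind: {kind}")
```

Sampling the mixture by first choosing a component and then drawing from it is equivalent to drawing from the weighted sum of the two densities, which is the construction described for the malicious user model.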

V-B QoS Evaluation Model for MCS Services

To evaluate and compare the effectiveness of different user recruitment schemes in terms of the performance of MCS services, a QoS evaluation model is proposed. Low-quality data lowers system efficiency and misleads system operations, which directly leads to customer dissatisfaction [r52]. Low-quality data also increases system operational overheads and cost; and imposes vulnerabilities and risks to the system. Some QoS evaluation models for IoT services have been proposed, taking into consideration different factors at various layers of the IoT infrastructure [r54]; and the QoD is one of the pivotal factors in the evaluation of QoS for MCS services.

Consider a service request that comprises sensing tasks ; each sensing task is fulfilled by participants providing datasets with respectively. The QoS for the service R is calculated as follows:

(21)
(22)

Equation (21) shows that the QoS of the service request is proportional to the QoD scores of each sensing task , represented by the product of the natural logarithm of these scores. The score of the sensing task is calculated by taking the average of the QoD scores from the contributors associated with the sensing task. This is because contributors in the same sensing task are normally required to collect the same sort of data; such redundant datasets are then filtered and pre-processed to retrieve a high-quality dataset before processing and mining. However, the number of participants in each sensing task should be small enough to avoid significant computation and storage overheads. Nevertheless, user recruitment plays a crucial role in providing high-quality services because even in a sensing task fulfilled by many participants, some attackers providing extremely low QoD data could result in massive damage to MCS services.
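The two-level aggregation described above (average within a task, logarithmic product across tasks) can be sketched as follows. Equations (21) and (22) are not reproduced in this text, and the exact argument of the logarithm is unknown; the sketch uses ln(1 + score) purely as a stand-in so that each factor is non-negative for scores in [0, 1]. The function names are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def task_score(qod_scores: Sequence[float]) -> float:
    """Score of one sensing task: the average QoD of its contributors."""
    return mean(qod_scores)


def service_qos(tasks: Sequence[Sequence[float]]) -> float:
    """QoS of a service request as a product over its sensing tasks.

    ln(1 + score) is an assumed transform, not the article's exact form.
    """
    qos = 1.0
    for scores in tasks:
        qos *= math.log(1.0 + task_score(scores))
    return qos
```

Because the task scores are multiplied, a single task with a very low average pulls the whole service score down, which is why one recruited attacker can damage a service even when the other tasks are served well.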

V-C Trust-based, Average, and Polynomial Regression User Recruitment Schemes

Generally, all three recruitment schemes have the same purpose of recruiting mobile device users that are expected to provide high QoS scores for sensing tasks in a MCS service request. The algorithms to recruit users in the three schemes rely only on QoD scores of sensing data contributed by users who have been recruited in previous sensing tasks. The Trust-based recruitment scheme uses trust relationships between a service requester and other users for recruiting participants. The Average-QoD and Polynomial Regression-QoD schemes use the two popular predictive schemes; namely Average and Polynomial Regression, respectively, for predicting the QoD scores, and recruiting users who are likely to provide the highest QoD scores for the next sensing task accordingly.

For the comparison among the recruitment schemes, all of the algorithms have the same inputs consisting of Users, Service Requests, and associated sensing tasks and the same output as the QoS score for the requested services:

Input :  Users. Service Requests . Each requires Sensing Tasks and . Each is fulfilled by participants .
Output :  scores for the Service Requests
Algorithm 1 Inputs and Outputs for User Recruitment Algorithms

Then, the three algorithms are demonstrated in mathematical-style pseudo-code as follows:

  • Trust-based User Recruitment scheme

This scheme establishes and maintains trust relationships between users based on the E-R trust model proposed in Section IV and recruits users with the highest trust values with a service requester. As can be seen in Algorithm 2, it firstly initiates the matrices EXP, REP and TRUST for keeping track of Experience relationships, Reputation values, and Trust relationships for users (line #1). The output at the beginning state is set to 0 (line #2). For each request from a user and for each sensing task , the algorithm recruits participants that have the highest trust values with (line #5). When the sensing task has been accomplished, the algorithm calculates the QoD score for the sensing data collected from the recruited users and updates EXP, REP and TRUST accordingly (line #6 to line #9). Finally, the output is updated by adding the QoS score of the requested service (line #11).

1 Initialization TRUST[][], EXP[][], REP[]; ;
2 = 0; ;
3 foreach request from user  do
4       foreach sensing task  do
5             Recruit( users with highest TRUST[ users][]);
6             QoD(Sensing data from users);
7             Update(EXP[][ users]);
8             Update(REP[]);
9             Update(TRUST[][]);
10            
11       end foreach
12       + QoS();
13 end foreach
Return
Algorithm 2 Trust-Based Recruitment Algorithm
  • Average-QoD User Recruitment scheme

This scheme maintains a list of the average QoD scores for users and recruits participants with highest average QoD scores. As can be seen in Algorithm 3, it initiates the AVG matrix for keeping track of the average QoD scores for users (line #1). The output at the beginning state is set to 0 (line #2). For each request from a user and for each sensing task , the algorithm simply recruits participants with the highest average QoD score (line #5). When the sensing task has been accomplished, the algorithm calculates the QoD score for the sensing data collected from the recruited users (line #6) and updates the AVG matrix accordingly (line #7). Finally, the output is updated by adding the QoS score of the requested service (line #9).

1 Initialization AVG[]; ;
2 = 0; ;
3 foreach request from user  do
4       foreach sensing task  do
5             Recruit( users with highest AVG[] score);
6             QoD(Sensing data from users);
7             Update(AVG[ users]);
8            
9       end foreach
10       + QoS();
11 end foreach
Return
Algorithm 3 Average-based QoD Recruitment Algorithm
  • Polynomial Regression-based QoD User Recruitment scheme

This scheme maintains a history of QoD scores that users have contributed to the MCS system and recruits participants based on a prediction on QoD scores for next sensing tasks using a polynomial regression model. The 3-degree polynomial model by means of the least-square fit method is used as the predictive model in the algorithm.

As can be seen in Algorithm 4, it initiates the matrix for storing the history of QoD scores in previous sensing tasks for users (line #1). The output at the beginning state is set to 0 (line #2). For each request from a user and for each sensing task , the algorithm uses the and functions for finding the coefficients and predicting the next QoD scores for each user (line #5, line #6); then, it recruits users with highest predicted QoD scores (line #7). When the sensing task has been accomplished, the algorithm calculates the QoD score for the sensing data collected from the recruited users (line #8) and updates the matrix accordingly (line #9). Finally, the output is updated by adding the QoS score of the requested service (line #11).

1 Initialization QoDScore[][]; ;
2 = 0; ;
3 foreach request from user  do
4       foreach sensing task  do
5             f = polyfit((t, QoDScore[][],3));;
6             polyval((f, t+1)); ;
7             Recruit( users with highest predicted QoD score);
8             QoD(Collected Data from users);
9             Update(QoDScore[ users]);
10            
11       end foreach
12       + QoS();
13 end foreach
Return
Algorithm 4 Polynomial Regression-based QoD Recruitment Algorithm
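The selection step shared by the three schemes can be sketched in Python as "score every user, then take the top k". The sketch below is an illustration under assumptions, not the testbed code: the helper names are hypothetical, users without history receive a score of 0, and the cubic fit is solved with plain normal equations, which is adequate for short histories but numerically fragile for long ones.

```python
import heapq
from typing import List, Sequence


def recruit_top_k(scores: Sequence[float], k: int, exclude: int = -1) -> List[int]:
    """Indices of the k highest-scoring users, skipping the requester."""
    candidates = ((s, u) for u, s in enumerate(scores) if u != exclude)
    return [u for _, u in heapq.nlargest(k, candidates)]


def average_score(history: Sequence[float]) -> float:
    """Average-QoD scheme: mean of past QoD scores (0 if none, an assumption)."""
    return sum(history) / len(history) if history else 0.0


def fit_poly(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients, lowest order first, via normal equations."""
    m = degree + 1
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(m)]
    for col in range(m):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    coeffs = [0.0] * m
    for r in reversed(range(m)):  # back substitution
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def eval_poly(coeffs: Sequence[float], x: float) -> float:
    return sum(c * x ** i for i, c in enumerate(coeffs))


def regression_score(history: Sequence[float], degree: int = 3) -> float:
    """Regression scheme: predicted next QoD from a polynomial fit."""
    n = len(history)
    if n <= degree:  # too few points for a well-posed fit; fall back to the mean
        return average_score(history)
    return eval_poly(fit_poly(list(range(n)), history, degree), n)
```

For the Trust-based scheme, `scores` would be the requester's row of trust values and `exclude` the requester's own index; for the other two schemes, `scores` would be built with `average_score` or `regression_score` over each user's history.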

VI Simulation Results and Discussions

The testbed is implemented in Matlab containing a set of users consisting of low-quality, high-quality and malicious users, a number of service requests, and the three user recruitment schemes. For comparison purposes, all three schemes take the same inputs (i.e., the set of users and the service requests) and produce outputs as QoS scores for the requested services. The source code for the implementation can be found on GitHub (footnote 5: https://github.com/nguyentb/MCS_project).

VI-A Parameter Settings for Experience Model

As discussed in Section IV.B, and are set to and , respectively. is set to at the bootstrap state. According to the statistics of the QoD scores discussed in Section V.A, if a user provides a dataset with a QoD score then it is a cooperative interaction; otherwise if the QoD score is meaning that the user provides a very low-quality dataset, then it is an uncooperative interaction. is the maximum Increase value and the smaller the is, the more interactions are required to get a strong relationship. As can be seen in Fig. 7, we set , as a result, it takes more than interactions in order to attain a strong relationship (i.e., the Experience value ). Similar experiments were conducted to come up with the other controlling parameters for the Decrease model and Decay model (i.e., and ) for forming reasonable curves as shown in Fig. 7.

Fig. 7: Experience Model with Increase, Decrease and Decay models

Note that different use-cases might result in different parameter settings, depending on how difficult it is to build up a strong relationship as well as to lose and decay the relationship. Details for the parameters used in this article are shown in Table II.

Parameters Values Parameters Values
1 0.005
0 0.005
0.3 0.3
0.1 0.6
2

TABLE II: PARAMETERS SETTING FOR THE EXPERIENCE MODEL

VI-B Calculation Mechanism for the Reputation Model

The Reputation values in a MCS system can be calculated either algebraically or iteratively. The traditional algebraic method to solve the matrix equations in (15) and (16) takes roughly operations, which is not suitable for a large number of users ( is the network size, i.e., the number of users). On the other hand, the iterative method is much faster because the and vectors converge after a number of iterations [r55]. We therefore use the second method in this simulation and, with the damping factor set to , the , and the number of users ranging from to , it takes from to iterations to converge. This reputation calculation is suitable for huge networks like the IoT as the scaling factor is roughly linear in the logarithm of [r07].

  • Testbed simulation scenarios

The number of service requests is varied from to , and without loss of generality, we assume that each service request is fulfilled by a random number of sensing tasks from to . Each sensing task requires a number of users from to (50% of the total users). The total number of users is set at ; and the number of malicious users is varied from 0% to 25% of . We also assume that a user can participate in several tasks simultaneously.

VI-C Results and Discussion

We have implemented the three algorithms outlined above in the simulation and, for better observation, we have also implemented a random selection method as the simplest recruitment scheme. As can be seen in Fig. 8, the Trust-based scheme outperforms all other schemes in most of the cases, meaning that the quality of the requested services using the proposed trust-based user recruitment is better than with the other schemes. All the schemes, except the Random Selection, produce better QoS scores as more requested services are served. After a period of about requests (i.e., the learning phase), the Trust-based scheme achieves consistent QoS scores for subsequent services, whereas the Average-based and the Polynomial Regression schemes take about and requests, respectively. After the learning phase, the Trust-based scheme persistently achieves the highest QoS scores compared to the other schemes at about to , whereas the Average-based scheme fluctuates between and while the Regression outcomes steadily increase and ultimately reach about to .

Fig. 8: QoS scores after numbers of services using different User Recruitment schemes

The three schemes all learn from previous data contributors for maximizing the outcomes. However, with the exception of the Trust-based scheme, they fail to detect malicious users. That is why some malicious users are still recruited in these schemes resulting in lowering the QoS scores for requested services. This is understandable because the Average-based scheme will consider malicious users to be high-quality users due to their average QoD scores being similar. Compared to the Average-based scheme, the Regression method produces just slightly better QoS scores and is more consistent after a long learning phase. This is because malicious users contribute high-quality data most of the time so that low-quality data, which rarely occurs, could be considered as outliers in the regression model. As such, some malicious users are quantified as high-quality users. The regression model also requires more data points for a more accurate prediction, resulting in the longer learning phase.

Fig. 9: QoS scores in different Percentages of Malicious Users using different User Recruitment Schemes

Unlike these two schemes, the E-R model heavily penalizes a user who sometimes produces very low QoD scores, resulting in rapid drops in the trust relationship and the reputation value of that user. By looking at the reputation vector for all users after the learning phase, we notice that the reputation values of malicious users are normally lower than those of low-quality users and far lower than those of high-quality users. Considering a scenario in which the number of malicious users is 10% (40 malicious users out of 400 total users, as implied by the percentages in Table III), we examined the users with the lowest reputation values after the learning phase (i.e., after service requests). As can be seen in Table III, 80% of the malicious users are detected just by looking at the 10% of users having the lowest reputation values. Moreover, among the 80 users (20% of the total users) with the lowest reputation values there are 35 of the 40 malicious users. That is why, after the learning phase, the trust-based scheme tends to avoid recruiting these malicious users: there is a very high possibility that a low reputation value results in a low overall trust value.

Lowest Reputation Malicious Low-Quality High-Quality
10 users (2.5%) 10 (100%) 0 (0%) 0 (0%)
20 users (5%) 19 (95%) 1 (5%) 0 (0%)
30 users (7.5%) 26 (87%) 4 (13%) 0 (0%)
40 users (10%) 32 (80%) 8 (20%) 0 (0%)
80 users (20%) 35 43 2

TABLE III: LOWEST REPUTATION VALUES IN ACCORDANCE WITH PERCENTAGE OF USERS TYPES
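The figures in Table III can be read in two ways: the share of the inspected users that are malicious, and the share of all malicious users that fall inside the inspected group. The short Python sketch below recomputes both from the table's counts. The total of 40 malicious users is inferred from the table itself (10 users correspond to 2.5%, so there are 400 users, 10% of which are malicious); the function name `detection_stats` is hypothetical.

```python
from typing import Dict, Tuple

# Malicious users found among the k users with the lowest reputation (Table III).
MALICIOUS_IN_LOWEST: Dict[int, int] = {10: 10, 20: 19, 30: 26, 40: 32, 80: 35}
TOTAL_MALICIOUS = 40  # inferred: 10% of 400 users


def detection_stats(k: int, malicious_found: int) -> Tuple[float, float]:
    """Return (share of inspected users that are malicious,
    share of all malicious users that were found)."""
    return malicious_found / k, malicious_found / TOTAL_MALICIOUS


if __name__ == "__main__":
    for k, found in MALICIOUS_IN_LOWEST.items():
        in_group, of_all = detection_stats(k, found)
        print(f"lowest {k:>2}: {in_group:.1%} of group, {of_all:.1%} of malicious users")
```

For the 40-user row both ratios coincide at 80% because the group size equals the number of malicious users; for the 80-user row they differ, which is why that row is easier to interpret with both numbers side by side.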

We also examined scenarios in which the number of malicious users is varied. As shown in Fig. 9, as the percentage of malicious users over total users increases (i.e., from 0% to 25% of the total number of users), the QoS decreases in all scenarios with different numbers of requested services (i.e., 10, 40, 80, 160 requested services). This is inevitable because the possibility of recruiting malicious users is higher. However, as the number of requested services increases, the QoS scores from all recruitment schemes, except the Random Selection, get higher. For instance, at 15% of malicious users, the QoS scores from the Trust-based scheme increased from about , , and after serving , , and services, respectively.

As can also be seen in Fig. 9, as the number of malicious users increases, the gap in QoS scores between the Trust-based scheme and the other schemes expands, especially when more requested services are served, showing the advantages of the Trust-based scheme in untrustworthy environments. For example, when the number of requested services is (as shown in the bottom-right subplot of Fig. 9), with 10% of malicious users, the QoS scores obtained from the Trust-based scheme and the Regression scheme are and , respectively; with 25% of malicious users, the QoS scores are and . Therefore, the QoS score gap between the proposed Trust-based and the Regression schemes increases from to .

If the percentage of malicious users is less than 10%, the Average-based scheme appears to be the best option, as it offers similar QoS scores but requires fewer computing resources. Unlike the Experience model, the Reputation model requires significant computational resources. Thus, it is not necessarily desirable to execute the Reputation mechanism in every evaluation of trust. In practice, the reputation mechanism should be performed periodically, which could drastically save time and computational resources.

VII Conclusion and Future Work

In this article we propose a trust evaluation mechanism, called E-R, to create and maintain trust relationships between mobile device users in a MCS platform. To establish and manage the trust relationships, we introduce the concept of virtual interactions in a centralized MCS platform, which form when a user contributes data for a sensing task from a service requester. Such interactions are quantified by assessing the quality of the contributed data and are used as the inputs for calculating the two indicators of trust, Experience and Reputation. The trust relationships between the MCS users are then obtained by incorporating these two TIs. Based on the trust relationships, a trust-based user recruitment scheme in a MCS platform is proposed for selecting the most trustworthy users with the purpose of contributing high-quality data.

In order to show the effectiveness of the proposed E-R trust mechanism and the trust-based user recruitment, we simulate a MCS testbed consisting of both normal and malicious users with the deployment of the trust-based recruitment scheme along with three other recruitment mechanisms for comparison. The results reveal that the trust-based scheme outperforms the other schemes as it provides better QoS for MCS services in most cases. The trust-based scheme is also able to identify different types of users including intelligent malicious users, preventing them from being recruited for sensing tasks. Moreover, the proposed recruitment mechanism has been practically implemented in real-world IoT services, as we have done in the Wise-IoT project (footnote 6: http://wise-iot.eu/2018/03/29/march-2018-8), which is also an advantage over other recruitment mechanisms that rely on unrealistic assumptions.

This article opens some future research directions. The first direction is an automatic adaptation of parameter settings for the Experience and Reputation models in a context-aware manner. Different MCS systems have different characteristics and types of users which need to be examined, meaning that the QoD assessment, the user models and the QoS evaluation model could also be different. This opens a second research direction for customizing the proposed mechanism for specific MCS use-cases. For example, the trustworthiness of data contributors can also be used in a crowd-sensing data model for better handling of noisy and unreliable data from mobile users, which could effectively improve the data quality in MCS systems [r1_3]. A further direction is the integration of the Knowledge TI, which contains various useful information about MCS systems and could result in even more precise indications of trustworthy mobile users. Other mechanisms, such as incentives, could also be integrated for a better recruitment scheme.

Acknowledgment

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (2018R1A2B2003774).

References