Recent years have witnessed significant progress in using machine learning to solve challenging problems in the Internet of Things (IoT), including data caching [25, 34], activity recognition [40, 2], channel estimation, and security [39, 38]. Such machine learning-based solutions require massive datasets for system training. Trading IoT data among firms allows data vendors to make a profit by offering their data to service providers over the Internet. Service providers use the purchased data to create and train advanced machine learning-based IoT services (for the rest of this paper, we use “machine learning-based IoT services” and “services” interchangeably), e.g., fraud detection, activity recognition, acoustic modeling, and medical diagnosis. However, the economics of IoT services among service providers, data vendors, and customers is rarely studied in the literature. IoT market models and optimal pricing schemes are therefore required to maximize the profits of firms through optimal utilization of IoT resources and optimal subscription fees for services.
IoT services can be either sold separately or bundled and sold as one service package. In particular, multiple service providers, e.g., fraud detection and recommender systems, can cooperate to sell a bundled service at a discounted rate while sharing the resulting profit (product bundling is a widely used marketing strategy, e.g., a fast food meal consisting of a sandwich and a soft drink). Several major questions related to this service bundling process arise. First, how is service quality defined, and what are the optimal sizes of data that should be bought from data vendors? Second, should service providers cooperate to offer a bundled service instead of selling their services standalone? Third, once a coalition is formed, how do the cooperative service providers divide the resulting profit of a bundled service among themselves?
This paper answers the aforementioned questions by proposing IoT market models and optimal pricing schemes for selling machine learning-based IoT services separately or as bundled packages. The main objective is to maximize the profits of IoT service providers while providing bundled services to customers at a discounted price. The key contributions of this paper can be summarized as follows:
From the service provider’s perspective, we define the service quality as a mapping between the bought data size and the resulting accuracy of machine learning algorithms.
We develop an IoT market model for the standalone selling of services. We then propose a non-linear optimization problem with the objective of maximizing the profit of service providers. Specifically, a service provider will make decisions on the optimal data size to buy from data vendors and the optimal subscription fee that should be charged to service customers.
We formulate and solve an IoT service bundling optimization to obtain the optimal bundle subscription fee and data sizes for each service in the bundle. Here, the optimization objective is to maximize the total profit of the cooperative providers inside the bundling coalition. We show that a bundled service increases the profit of service providers compared to the standalone sales of services. Moreover, bundled IoT services are favored by customers due to their discounted subscription fees compared to the standalone sales.
We present a profit sharing scheme to divide the generated profit of a bundled IoT service among the cooperative providers based on their contributions to the bundle profit. We apply the concepts of core solution and Shapley value to find the payoff allocations to providers.
Additionally, we provide closed-form solutions for each optimization problem in this paper. Unlike iterative solutions of nonlinear problems with inequality constraints, e.g., interior point methods, closed-form solutions can be evaluated in finite time and are computationally efficient.
The rest of this paper is organized as follows. Section II presents the related work. Section III discusses the IoT market model and assumptions, and presents a machine learning-based model as a measure of service quality. Then, optimization problems for profit maximization are derived in Section IV for the standalone services and in Section V for the bundled IoT services. Section VI presents a model for payoff allocation and sharing among service providers in service bundling. Section VII discusses the experimental evaluation results. Finally, Section VIII concludes the paper.
II Related Work
A survey of employees from countries reports that successful firms apply machine learning five times more than low-performing ones . This clearly shows the importance of adopting the recent advances in machine learning for generating revenues and meeting the market demands on intelligent IoT systems. Moreover, market models and pricing strategies are integral for maximizing the profit of selling products and services including wired and wireless network access [31, 32, 43], cloud computing , and mobile crowdsensing [15, 41], just to name a few.
II-A Economics of Information Goods
Pricing information goods, e.g., software products and movies, is a well-studied problem in the literature. In , three pricing schemes of information goods were presented, namely, connection time-based, search-based, and subscription-based pricing. It has been argued that connection time-based pricing is less profitable than the other schemes for highly skilled users. The authors of  presented a nonlinear pricing scheme of information goods based on the customer usage, i.e., defining a fixed price to each usage level. In , the authors discussed the pay-per-use and unlimited subscription of information goods. It is shown that a maximum profit is achieved through competitive pricing. The authors in  discussed the benefits of bundling information goods. It is shown that an accurate estimation of the user behavior can be achieved for bundled goods compared to the individual sales of products.
Pricing and bundling IoT services is more challenging than pricing and bundling classical information goods, because the quality of information goods can be easily determined, e.g., the quality of a software product is defined by its supported features, while the cast and genre of a movie define its price. In contrast, the quality of IoT services cannot be directly measured.
II-B Economics of Query-based Data Services
Query-based data services extract data from structured databases and visualize them according to customer requests. Simple query pricing models may be preferred by buyers, but applying a complex pricing model increases the revenues of the content provider . The authors in  presented a query-based pricing scheme of data. The price of a query, i.e., a question, is defined based on the number of data views required to provide the answer. In , the authors presented a pricing method in privacy-preserving query systems. This differential privacy pricing aims at maximizing the accuracy of queries while maintaining the privacy of the data owners. The privacy level is defined by the query buyers. The authors in  proposed a query pricing scheme with arbitrage prevention. This prevents buyers from combining simple and cheap queries to obtain the same result as a complex and expensive query. Such existing query-based schemes are restricted by design to structured and relational data, and they do not address unstructured data, e.g., IoT data, which is the dominant form of data in real-world settings.
II-C Incentive Mechanisms for Crowdsensing
Incentive mechanisms are required for maintaining high user participation and providing fair reward allocation. The authors in  presented a user-centric incentive mechanism formulated as a reverse auction, and a platform-centric incentive scheme modeled as a Stackelberg game. The user-centric mechanism enables the participants to compete for a higher reward. In the platform-centric mechanism, the service acts as the game leader and announces the reward of completing the data collection task. The participants are the game followers and set their sensing time in order to maximize the received reward. Likewise, the authors in  modeled the crowdsensing problem as an online auction, while assuming that the arrival of participants has a stochastic distribution. In , the reward allocation problem was modeled as an all-pay auction such that the participant with the highest contribution receives the full reward.
This paper is fundamentally different from all past works. First, the problem of defining the quality of service in IoT has not been previously addressed. Second, existing works have not considered the data demand and pricing in IoT services when machine learning is heavily utilized. Third, none of the existing works has considered the problem of cooperation among service providers to form a bundled service which is an effective strategy for profit maximization. This paper addresses these limitations and presents a market model, bundling strategies, and optimal pricing schemes of IoT services. The novel data-driven optimizations of this paper allow achieving the maximum profits from offering IoT services to customers.
III System Model
In this section, we first briefly introduce the concept of machine learning-based IoT services and give real world examples of IoT services where optimal pricing and bundling are required. Then, we present a method for defining the value of data from a machine learning perspective.
III-A Machine Learning-Based IoT Services
We consider the IoT service architecture shown in Figure 1. An IoT service is typically composed of a data vendor, service provider, and service customers.
Data vendor: The IoT data is first collected, stored, and filtered by a data vendor. The data can be generated by different devices and technologies, e.g., sensor nodes, IoT gadgets, and smart phones. In addition to the deployment cost, the process of data gathering also requires costly human intervention for data annotation, validation, and preprocessing, e.g., anomaly detection and missing data imputation. Accordingly, the data vendor charges data buyers a price. The cost of one data unit (we assume that a data unit includes one percent of the full dataset samples) is denoted by .
Service provider: The raw data owned by data vendors is unprofitable unless suitable analytics tools are applied. A service provider is a business entity which buys data from one or more data vendors, uses machine learning tools, and offers a service to customers willing to pay a subscription fee. For profit maximization, a rational service provider decides the data size (in data units) to buy from data sources and the subscription fee to charge for his service.
Service customers: We assume that there are a total of customers. Each customer is an independent entity who decides whether to subscribe to a service provider based on their willingness-to-pay for the service and the subscription fee set by the provider. The willingness-to-pay defines the maximum subscription fee that a customer is prepared to pay for a service based on their evaluation of and need for the service.
This IoT market model is useful in many service-oriented architectures.
A data marketplace is an online store where entities can trade data. Machine learning algorithms can be applied to the data to generate prediction models. Datasets of various types are offered and exchanged as assets. Examples of data markets include Azure Marketplace (https://www.datamarket.azure.com), Qlik DataMarket (https://www.qlik.com), and Infochimps (http://www.infochimps.com).
Crowdsourcing services also require optimized market models for data exchange. Placemeter (https://www.placemeter.com), for example, provides real-time information on pedestrian and vehicular movement in cities and urban areas. Placemeter allows users to upload videos of streets and public areas, and pays them based on the video quality. Computer vision algorithms are used by Placemeter to extract information from real-time data.
IoT services are a new trend of IoT platforms designed to visualize and trade IoT data. Thingful (https://www.thingful.net) is an example of such a platform; it allows IoT vendors and owners to visualize the geolocations of connected devices around the world. Likewise, health care systems, such as PatientsLikeMe (https://www.patientslikeme.com), sell rich medical data which can be collected with IoT gadgets and other smart technologies.
The list of symbols used in this paper is summarized in Table I.
|Requested data size|
|Cost of one data unit|
|Number of customers|
|Willingness-to-pay (reservation price) for a service by customers|
|Profit function of a service provider|
|Hessian matrix of|
|th-order principal minor of|
|Lagrangian dual problem|
|Bundle subscription fee|
|Service quality function|
|Fitting parameters of|
|The core solution set|
|Profit function of a bundling coalition|
|Hessian matrix of|
|Lagrangian dual function of a bundling coalition|
|Payoff allocation for provider in the core solution|
|Payoff allocation for provider by the Shapley solution|
III-B Machine Learning in IoT
As shown in Figure 2, consider a training dataset consisting of tuples, each pairing a feature set with a class label. The class label is only present in supervised learning, e.g., classification and regression problems. Depending on the problem at hand, a feature set contains sensing data such as images in vision problems, audio in acoustic modeling, text in document classification, and temperature values in weather forecasting. Generally, a machine learning optimization problem adheres to the following general formula:
where is a learning objective function designed to fit a model to historical data , is a regularization term, is a weighted hyper-parameter. For example, a deep network  has a learning function which is defined as follows:
where is the number of layers in the deep model, is the number of neurons at layer, is the deep model prediction for input . The first term is the average sum-of-squared errors, and the second term is a weight-decay regularizer that sums over the model parameters. After the model is trained on the training data, its accuracy is measured on unseen testing data.
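To make the two terms concrete, the following minimal sketch evaluates a regularized loss of this shape in Python. It assumes scalar predictions, a factor of one half on both terms, and nested lists for the layer weights; the function name `regularized_loss` and the toy numbers are illustrative and are not taken from the paper.

```python
def regularized_loss(predictions, targets, weights, weight_decay):
    """Average half squared error plus an L2 (weight-decay) penalty.

    `weights` is a list of layers; each layer is a list of rows of floats.
    The one-half factors are an assumed convention, not the paper's.
    """
    m = len(targets)
    # Data-fit term: mean over samples of 0.5 * squared prediction error.
    fit = sum(0.5 * (p - t) ** 2 for p, t in zip(predictions, targets)) / m
    # Weight decay: 0.5 * lambda * sum of squared weights over all layers.
    squared = sum(w ** 2 for layer in weights for row in layer for w in row)
    return fit + 0.5 * weight_decay * squared


# Toy example: two weight layers and three scalar predictions.
weights = [[[0.5, -1.0], [2.0, 0.0]], [[1.0, 1.0]]]
loss = regularized_loss([0.9, 0.2, 0.4], [1.0, 0.0, 0.5], weights, weight_decay=0.01)
```

A larger `weight_decay` penalizes large weights more strongly, which trades training fit for smoother models.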
III-C Service Quality
One of the key barriers in the development of data marketplaces is defining data quality and value to the potential service providers. In this section, we define the data quality based on the performance of machine learning models trained using the data.
Customers do not always subscribe to the cheapest IoT services. Instead, they weigh the subscription fee against the service quality. More data is important to increase the accuracy of IoT services [13, 10]. However, this accuracy gain increases the cost of the purchased data. From the service provider’s perspective, we define the quality function of a service as a mapping from the bought data size to the resulting accuracy of the machine learning model. The quality of the service is equal to the utility of the bought data from the service provider’s perspective. For example, machine learning in intrusion detection systems run by energy-constrained sensor networks aims to locate and identify intruders . The cost of detecting an intruder is a direct reflection of the value of data.
We introduce the service quality function to meet the following empirical assumptions:
is an increasing function such that . This assumption is intuitive as more data improves the accuracy of the analytics and quality of the service.
has a decreasing marginal utility such that . This assumption reflects the empirical accuracy of machine learning models.
In our data market and pricing framework, is defined as follows:
where is the data size and is a tuple of three fitting parameters. To find the fitting parameters , we vary the size of raw data used to fit the model . Specifically, a series of experimentation points is performed, where is the service accuracy resulting from a data size of with . is then found by applying nonlinear least squares for minimizing the sum of squared errors as follows:
The parameter fitting problem in (4) can be solved iteratively to find the fitting parameter . can be estimated before transmitting the data from the data vendor to the service provider through a third party broker which charges a broker fee. In real-world IoT services, the broker can also manage the financial transactions, service performance, and service delivery, e.g., measuring the compliance with the service-level agreement (SLA).
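The fitting step can be illustrated with a small sketch. Because the functional form is not legible in this text, the code assumes a saturating exponential, q(n) = a1 - a2 * exp(-a3 * n), which is increasing with decreasing marginal utility as required by the two assumptions above. It also swaps the iterative solver for a simpler scheme: a grid over a3 combined with closed-form linear least squares for a1 and a2. All names and numbers are hypothetical.

```python
import math


def quality(n, a1, a2, a3):
    # Assumed quality curve: increasing and concave in the data size n.
    return a1 - a2 * math.exp(-a3 * n)


def fit_quality(sizes, accuracies, a3_grid):
    """Least-squares fit of (a1, a2, a3) to (data size, accuracy) points."""
    best = None
    y_mean = sum(accuracies) / len(accuracies)
    for a3 in a3_grid:
        # For a fixed a3 the model is linear in z = exp(-a3 * n).
        z = [math.exp(-a3 * n) for n in sizes]
        z_mean = sum(z) / len(z)
        var_z = sum((zi - z_mean) ** 2 for zi in z)
        if var_z == 0.0:
            continue
        cov = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, accuracies))
        slope = cov / var_z
        a1, a2 = y_mean - slope * z_mean, -slope
        sse = sum((quality(n, a1, a2, a3) - y) ** 2
                  for n, y in zip(sizes, accuracies))
        if best is None or sse < best[0]:
            best = (sse, (a1, a2, a3))
    return best[1], best[0]


# Synthetic experimentation points; sizes are in data units (percent of the dataset).
sizes = [5, 10, 20, 40, 60, 80, 100]
accuracies = [quality(n, 0.95, 0.60, 0.05) for n in sizes]
params, sse = fit_quality(sizes, accuracies, [k / 1000 for k in range(1, 201)])
```

With noisy measurements the grid over a3 can be refined around the best value, or the result can seed an iterative nonlinear least-squares solver.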
IV IoT Market Model and Optimal Pricing of Standalone Services
In this section, we first develop an IoT market model of selling standalone IoT services, and we formulate an optimization problem to maximize the profit of a service provider. Then, the closed-form solutions for the optimal subscription fee and requested data size are provided.
We consider the separate sales of a monopolist service provider offering a service to a set of potential customers, as shown in Figure 1. Each customer has a different willingness-to-pay for the data service. If the willingness-to-pay of a particular customer is higher than or equal to the subscription fee weighted by the service quality such that , that customer will subscribe to the service. This indicates that the willingness-to-pay value depends on the customer evaluation of the service as well as its quality, i.e., high quality services will attract more customers. Based on our real-world customer surveys (Figure 6), we show that the willingness-to-pay follows a uniform distribution. Then, the profit of the service provider is found as follows:
where is the service quality function. The first term of (5) is the revenue of offering the service to the customers with the subscription fee . As the willingness-to-pay is assumed to follow a uniform distribution, the expression defines the probability that a customer subscribes to the service. The second term of (5) is the total data price paid to the data vendor, which is equal to the requested data size multiplied by the data unit cost. The marginal cost of running the service is assumed to be negligible.
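The structure of this profit function (revenue from subscribing customers minus the data cost) can be sketched as follows. The sketch rests on assumptions that are not confirmed by the text as extracted: the willingness-to-pay is uniform on [0, 1], a customer subscribes when the willingness-to-pay multiplied by the quality is at least the fee, and the quality curve is the hypothetical saturating exponential with made-up parameters.

```python
import math


def quality(n, a1=0.95, a2=0.60, a3=0.05):
    # Hypothetical quality curve; the parameters are illustrative only.
    return a1 - a2 * math.exp(-a3 * n)


def subscribe_probability(fee, q):
    # P(theta * q >= fee) for theta ~ Uniform(0, 1), clipped to [0, 1].
    return min(1.0, max(0.0, 1.0 - fee / q))


def standalone_profit(n, fee, customers, unit_cost):
    # Expected subscription revenue minus the price paid for n data units.
    q = quality(n)
    return customers * fee * subscribe_probability(fee, q) - unit_cost * n


profit = standalone_profit(n=40, fee=0.4, customers=1000, unit_cost=1.0)
```

Raising the fee increases the revenue per subscriber but lowers the subscription probability, while buying more data raises both the quality and the cost; the optimization in the next subsection balances these effects.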
We next formulate a profit maximization optimization based on the profit function in (5) for the service provider to decide their optimal requested data size and subscription fee.
IV-A Profit Maximization of Service Providers
The service provider is assumed to be rational and is interested in maximizing his individual profit. A non-linear optimization problem can be formulated to choose the optimal data size that should be bought from the data vendor and optimal subscription fee to be charged to the customers as follows:
The objective function of (7) maximizes the profit of the service provider. The two constraints and are required to ensure non-negative solutions for the subscription fee and requested data size.
The profit function of the service provider is concave, and hence the closed-form solution of the optimization problem is globally optimal.
We use Sylvester’s criterion of twice differentiable functions to prove that the Hessian matrix of is negative semidefinite, and hence the concavity of . Suppose that the principal minors of are denoted as , where is the order of the principal minors. Based on Sylvester’s criterion, is negative semidefinite if and only if the condition
holds, i.e., every odd-order principal minor is non-positive and every even-order principal minor is non-negative. The Hessian matrix of is given in (11). The first-order principal minors of are derived as follows:
The principal minor of order two is derived as follows:
Accordingly, Sylvester’s criterion is satisfied. is negative semidefinite and is concave. ∎
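The principal-minor test used in this proof can also be checked numerically. The sketch below does so for an assumed profit function, F(n, p) = M * p * (1 - p / q(n)) - c * n with q(n) = a1 - a2 * exp(-a3 * n); this form and every constant are assumptions for illustration, and the second derivatives were derived by hand for that assumed form rather than copied from (11).

```python
import math

A1, A2, A3 = 0.95, 0.60, 0.05  # hypothetical fitting parameters
M = 1000                       # hypothetical number of customers


def hessian(n, fee):
    """Hessian of the assumed profit F(n, fee) with variable order (n, fee)."""
    e = math.exp(-A3 * n)
    q, dq, d2q = A1 - A2 * e, A2 * A3 * e, -A2 * A3 ** 2 * e
    f_pp = -2.0 * M / q
    f_pn = 2.0 * M * fee * dq / q ** 2
    f_nn = M * fee ** 2 * (d2q * q - 2.0 * dq ** 2) / q ** 3
    return [[f_nn, f_pn], [f_pn, f_pp]]


def is_negative_semidefinite(h, tol=1e-9):
    # Principal-minor test for a 2x2 symmetric matrix:
    # first-order minors must be <= 0 and the second-order minor >= 0.
    first_order = (h[0][0], h[1][1])
    second_order = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return all(m <= tol for m in first_order) and second_order >= -tol


points = [(n, fee) for n in range(0, 101, 10) for fee in (0.0, 0.1, 0.3, 0.5, 0.9)]
concave_everywhere = all(is_negative_semidefinite(hessian(n, fee)) for n, fee in points)
```

A numerical check on sample points does not replace the proof; it is only a quick sanity test for a particular parameter choice.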
IV-B Optimal Subscription Fee and Requested Data Size
Converting the constrained profit maximization problem (7) into a minimization problem and formulating the Lagrangian dual problem results in the following unconstrained Lagrange dual function:
where and are Lagrange multipliers for the constraints and , respectively. The first derivatives of (12) with respect to and are as follows:
where and (no active constraints). Setting both derivatives to zero, the optimal closed-form solutions are found as follows:
where the condition must hold to ensure positive values of and . Accordingly, the maximum profit of the provider is found by substituting the optimal values of and in (5).
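As a worked illustration of how such a closed form behaves, the sketch below solves the first-order conditions for an assumed model and compares the result with a brute-force grid search. The assumed profit is F(n, p) = M * p * (1 - p / q(n)) - c * n with q(n) = a1 - a2 * exp(-a3 * n); under that assumption the stationary point is p = q(n) / 2 and n = ln(M * a2 * a3 / (4 * c)) / a3, which is positive only when M * a2 * a3 > 4 * c. These expressions follow from the assumed model and are not necessarily the paper's own formulas.

```python
import math

A1, A2, A3 = 0.95, 0.60, 0.05  # hypothetical fitting parameters
M, C = 1000, 1.0               # hypothetical customer count and data unit cost


def quality(n):
    return A1 - A2 * math.exp(-A3 * n)


def profit(n, fee):
    # Assumed standalone profit: revenue minus data cost.
    return M * fee * (1.0 - fee / quality(n)) - C * n


def closed_form():
    ratio = M * A2 * A3 / (4.0 * C)
    if ratio <= 1.0:
        raise ValueError("no positive data size: need M * a2 * a3 > 4 * c")
    n_star = math.log(ratio) / A3
    return n_star, quality(n_star) / 2.0


n_star, fee_star = closed_form()
best = profit(n_star, fee_star)

# Brute-force check on a grid: n = 0, 0.5, ..., 100 and fee = 0.005, ..., 0.945.
grid_best = max(profit(n / 2, k / 200) for n in range(0, 201) for k in range(1, 190))
```

The closed form is evaluated with a handful of arithmetic operations, whereas the grid search needs tens of thousands of profit evaluations to reach a slightly worse answer.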
V IoT Market Model and Optimal Pricing of Bundled Services
In this section, we first present a market model where bundling is used by two cooperative service providers (denoted as Service 1 and Service 2). We denote the bundling coalition as . The two services are grouped together in one package and sold at a discounted subscription fee, i.e., subscribing to the two services separately costs more than the bundled service. It is important to note that customers can still subscribe to services separately. Then, we develop an optimization problem to select the optimal bundle subscription fee and requested data sizes by both providers.
V-A Profit Maximization of Bundled Services
Consider two service providers that form a coalition to provide a bundled service to a base of customers as shown in Figure 3. The subscription fee of the bundle is denoted as . The qualities of Services 1 and 2 are denoted as and , respectively. The data unit costs paid by Services 1 and 2 are and , respectively. We define the reservation prices and as the willingness of the customers to pay for Services 1 and 2 in the bundle. Specifically, a customer will decide to subscribe to a bundled service if the following relation holds
which indicates that the customer evaluation of the bundled service is higher than its subscription fee. The profit optimization problem of a bundled service is then defined as follows:
where is the demand on the bundle service by customers, i.e., the probability that a customer will decide to subscribe to a bundled service. The objective function of (18) maximizes the total bundling profit of cooperative service providers which is equal to the sale revenue minus the paid data cost. , , and are required to ensure positive solution values for , , and , respectively. The constraints and are the optimization constraints which depend on the bundle price and service quality. In particular, there are 4 possible demand patterns (defining the optimization constraints and ) where (17) holds as demonstrated by the shaded areas in Figure 4. Specifically, can be found for each case as follows:
The profit function of the bundling coalition is concave, and hence the closed-form solution of the optimization problem is globally optimal.
We next prove the concavity of the profit function for Case 1 ( and ) using Sylvester’s criterion, which provides a necessary and sufficient condition for the concavity of . The other demand cases (Cases 2-4) can be analyzed similarly and are omitted due to space limits. The Hessian matrix of for Case 1 is given in (23). , and are the three first-order principal minors of which can be derived as follows:
There are also three second-order principal minors for which are obtained as follows:
Finally, has one third-order principal minor defined as follows:
Accordingly, Sylvester’s criterion is satisfied. This proves that is negative semidefinite and is concave. Then, the solution of the optimization problem is globally optimal. ∎
We next provide the closed-form solutions for the profit maximization problem of bundled services by applying the KKT conditions.
V-B Optimal Subscription Fee and Requested Data Sizes
V-B1 Case 1 and
V-B2 Case 2 and
where are the Lagrange multipliers. Taking the derivatives of (42) with respect to , , and , the closed-form solution can be found by solving the resulting derivatives as follows:
V-B3 Case 3 and
Taking the derivatives of (50) with respect to , , and , the closed-form solution can be expressed as follows:
V-B4 Case 4 and
Taking the derivatives of (58) with respect to , , and , the closed-form solution can be deduced as follows:
The cooperative providers should run the optimization of the four cases derived in this section. The case with the maximum resulting profit should be used in the bundled service.
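The selection among demand cases can be illustrated numerically. The sketch below assumes that the two reservation prices are independent and uniform on [0, 1] and that a customer subscribes when the quality-weighted sum of the reservation prices is at least the bundle fee. Under that assumption the demand is a piecewise function of the fee, and a coarse grid search over the fee and both data sizes picks the most profitable point across all pieces. The quality curve, its parameters, and all function names are hypothetical.

```python
import math


def quality(n, a1, a2, a3):
    # Hypothetical saturating quality curve.
    return a1 - a2 * math.exp(-a3 * n)


def bundle_demand(fee, q1, q2):
    """P(theta1 * q1 + theta2 * q2 >= fee) for independent Uniform(0, 1) thetas."""
    if fee <= 0.0:
        return 1.0
    if fee >= q1 + q2:
        return 0.0
    lo, hi = min(q1, q2), max(q1, q2)
    if fee <= lo:    # fee below both qualities
        return 1.0 - fee ** 2 / (2.0 * q1 * q2)
    if fee <= hi:    # fee between the two qualities
        return 1.0 - (2.0 * fee - lo) / (2.0 * hi)
    return (q1 + q2 - fee) ** 2 / (2.0 * q1 * q2)  # fee above both qualities


def bundle_profit(fee, n1, n2, customers, c1, c2, params1, params2):
    q1, q2 = quality(n1, *params1), quality(n2, *params2)
    return customers * fee * bundle_demand(fee, q1, q2) - c1 * n1 - c2 * n2


def best_bundle(customers, c1, c2, params1, params2):
    # Coarse search: data sizes in steps of 5 units, fee in steps of 0.05.
    candidates = (
        (bundle_profit(k / 100, n1, n2, customers, c1, c2, params1, params2),
         k / 100, n1, n2)
        for n1 in range(0, 101, 5)
        for n2 in range(0, 101, 5)
        for k in range(5, 200, 5)
    )
    return max(candidates)


profit, fee, n1, n2 = best_bundle(1000, 1.0, 1.5, (0.95, 0.60, 0.05), (0.90, 0.50, 0.08))
```

Evaluating the piecewise demand directly plays the role of running each case and keeping the best one; closed-form solutions per case avoid the search entirely.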
VI Profit Sharing Among IoT Providers
After forming a coalition to sell IoT services as a bundle, the cooperative providers share the resulting profit. This section presents a profit sharing model using the core solution and Shapley value from cooperative game theory  to define the payoff allocations for the cooperative providers in .
VI-A The Core Solution
Let indicate the profit share of service provider . The core solution set is calculated as follows :
is a vector of profit allocations, is the profit of bundling, and is the profit of separate selling when . includes a set of profit allocations which guarantee no service provider will reject the payoff allocation, i.e., no incentive of leaving the coalition to sell services separately.
The core solution set can be empty, can contain a large number of possible solutions, or can be unfair to a service provider relative to its individual contribution to the bundle. Therefore, we next present the Shapley solution, which provides a single and fair solution to the profit sharing problem.
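The core conditions described above, and the Shapley split introduced in the next subsection, can be sketched for a two-provider coalition as follows. The code uses the standard textbook definitions from cooperative game theory (the core requires that the whole bundle profit is distributed and that no sub-coalition receives less than it earns alone; the Shapley value averages marginal contributions over all joining orders). The profit figures are hypothetical.

```python
from itertools import combinations, permutations

# Hypothetical profits: standalone sales for providers 1 and 2, and the bundle.
VALUE = {
    frozenset(): 0.0,
    frozenset({1}): 177.0,
    frozenset({2}): 150.0,
    frozenset({1, 2}): 380.0,
}
PLAYERS = (1, 2)


def in_core(allocation, tol=1e-9):
    """True if the allocation is efficient and no coalition would defect."""
    grand = frozenset(PLAYERS)
    if abs(sum(allocation.values()) - VALUE[grand]) > tol:
        return False
    for size in range(1, len(PLAYERS)):
        for group in combinations(PLAYERS, size):
            if sum(allocation[p] for p in group) < VALUE[frozenset(group)] - tol:
                return False
    return True


def shapley_values():
    """Average marginal contribution of each player over all joining orders."""
    totals = {p: 0.0 for p in PLAYERS}
    orders = list(permutations(PLAYERS))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += VALUE[joined] - VALUE[coalition]
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}


phi = shapley_values()
stable = in_core(phi)
```

In this example the bundle earns more than the two standalone profits combined, so the core is non-empty and the Shapley allocation lies inside it.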
VI-B The Shapley Solution
The Shapley solution provides a fair allocation of the bundling profit among the service providers forming a bundling coalition . For each cooperative provider , the Shapley value assigns a payoff found as :