I Introduction
Recent decades have witnessed remarkable achievements of wireless technologies in offering connectivity to people. Recently there has been a growing interest in providing ubiquitous connectivity for machines and objects, many of which do not require interactions with humans [1]. This is being driven by the rapid advance of the Internet of Things (IoT), which will significantly benefit the way we conduct business, deliver education, health care and government services, and the way we live our everyday lives [2]. Typical IoT applications, as shown in Fig. 1, include smart healthcare, in which wearable devices transmit continuous streams of accurate data to the cloud for better care decisions; smart home, which enables home automation with the aid of intelligent appliances such as the smart speaker even when people are at remote locations; smart manufacturing, which supports streamlined business operations and optimized productivity in factories by automatically collecting and analyzing data from sensors to make better-informed decisions for actuators such as robots; and smart transportation, in which connected vehicles make transportation more efficient and help us get from place to place more quickly. Targeting this growing IoT market, the 5G cellular technology roadmap has already identified massive machine-type communications (mMTC) as one of the three main use cases, along with enhanced mobile broadband (eMBB) and ultra-reliable and low-latency communications (URLLC).
The fundamental challenge of mMTC for IoT is to enable data transmission from a massive number of devices in an efficient and timely manner. However, the key characteristic of the IoT traffic is that the device activity patterns are typically sporadic, so that at any given time only a small and random fraction of all devices are active, as shown in Fig. 2. The sporadic traffic pattern may be, e.g., due to the fact that often devices are designed to sleep most of the time in order to save energy and are activated only when triggered by external events, as is typically the case in a sensor network. In these scenarios, the active users need to be dynamically identified along with the reception of their data, which is a challenging task.
I-A Grant-Based Random Access Scheme
The common user access approach in cellular systems is to perform grant-based random access using the dedicated random-access control channel, so that the uncoordinated devices can contend for physical-layer resource blocks for data transmission [3], as shown in Fig. 3. Specifically, in the first stage each active device picks a random preamble, sometimes referred to as a pilot sequence, from a predefined set of orthogonal preamble sequences, to notify the base station (BS) that the user has become active. In the second stage, the BS sends a response corresponding to each activated preamble as a grant for transmitting in the next step. In the third stage, each device that has received a response to its preamble transmission sends a connection request in order to demand resources for subsequent data transmission. If a preamble has been selected by a single device, the connection request of the device is granted by the BS, which in turn sends a contention resolution message informing the device about the resources reserved for the pending data transmission. However, if two or more devices have selected the same preamble in the first stage, their connection requests collide. When the BS detects a collision, it does not reply with a contention resolution message; the affected devices restart the random access procedure after a timer expires. In the above procedure, the messages sent by the active devices in the first and third stages correspond to metadata, since they carry control information for establishing the connection and do not contain any data.
This access mechanism can be seen as an instance of the classical ALOHA, imposing a limit on the number of active devices that can get the grant to access the network. Recently, extensive efforts have been devoted to different variants of the random access schemes with advanced contention resolution strategies [4, 5]. However, due to the large number of collisions in the massive IoT scenarios, still many users cannot access the network even if some of the colliding connection requests could be resolved, as shown in the following example.
Example 1
Consider a cellular network consisting of one BS and users. Let denote the length (and thus the number) of the orthogonal preambles available for the devices to choose. Assume that in each time slot, out of these devices are active, each picking one of the orthogonal pilots at random. The coherence bandwidth and the coherence time of the wireless channel are 1 MHz and 1 ms, respectively; thus 1000 symbols can be transmitted in each coherence block. Moreover, we consider both the scenario in which contention resolution is not performed and the scenario in which, even when there is a collision, the BS can always grant access to one of the colliding devices (this could happen, e.g., due to the capture effect in random access networks). Under this setup, the average numbers of devices that are granted the permission to access the network, for both the cases with and without contention resolution, versus different values of are plotted in Fig. 4. The plot is obtained by Monte Carlo simulations. It is observed that to guarantee success rate, at least and out of symbols are needed, respectively, as pilot for the cases with and without contention resolution.
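The kind of Monte Carlo experiment described in Example 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation used for Fig. 4: the function names and the parameter values in the usage note are hypothetical, and "with contention resolution" is modeled in the idealized way described above, i.e., exactly one device per chosen preamble is granted access. Closed-form expectations for the same idealized model are included as a cross-check.

```python
import random
from collections import Counter


def average_grants(num_active, num_preambles, trials, rng):
    """Monte Carlo estimate of the average number of devices granted access.

    Returns (without_resolution, with_resolution):
      - without resolution, only preambles picked by exactly one device succeed;
      - with idealized resolution, every picked preamble yields one grant.
    """
    no_resolution = 0
    with_resolution = 0
    for _ in range(trials):
        # Each active device picks one orthogonal preamble uniformly at random.
        picks = Counter(rng.randrange(num_preambles) for _ in range(num_active))
        no_resolution += sum(1 for count in picks.values() if count == 1)
        with_resolution += len(picks)
    return no_resolution / trials, with_resolution / trials


def expected_grants(num_active, num_preambles):
    """Closed-form expectations for the same idealized model."""
    # A given device is alone on its preamble if the other devices avoid it.
    p_alone = (1.0 - 1.0 / num_preambles) ** (num_active - 1)
    # A given preamble is unused if no active device picks it.
    p_unused = (1.0 - 1.0 / num_preambles) ** num_active
    return num_active * p_alone, num_preambles * (1.0 - p_unused)
```

For instance, `average_grants(100, 300, 2000, random.Random(0))` can be compared against `expected_grants(100, 300)`; the values 100 and 300 are illustrative only and are not taken from the example above.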
A question arising from the above example is how we can accommodate more devices with low latency requirements in future massive IoT connectivity systems. One promising solution is the grant-free random access scheme based on advanced compressed sensing techniques.
I-B Grant-Free Random Access Scheme
Under the grant-free random access scheme, each active device directly transmits its metadata and data to the BS without waiting for any permission, as shown in Fig. 5. Specifically, in contrast to the grant-based random access scheme in which pilot sequences are randomly selected at each time slot, each device under the grant-free random access scheme is pre-assigned a unique pilot sequence that is used in all time slots. This pilot sequence thus also serves as the ID for this user and is reminiscent of the role that a code-division multiple-access (CDMA) sequence plays in facilitating the extraction of a user's data under interference from other users. At each time slot, the BS first detects the active devices by detecting which pilot sequences are used. Next, the BS estimates their channels based on the received metadata, and then decodes the data with the estimated channels [6, 7].
Clearly, the fact that both metadata and data in grant-free access are sent in a single step offers the possibility of decreasing the access latency compared to grant-based access. However, device activity detection is now more challenging: owing to the massive number of devices in the network as well as the limited channel coherence time, it is not possible to assign orthogonal pilot sequences to all the devices. The difference from classical CDMA systems is that the activation dynamics cover a much larger population, placing this problem in the realm of sparse signal processing.
This article aims to pave the way for a theoretical investigation of how sparse signal processing technologies can enable accurate and efficient active device detection under the grant-free access scheme. We first point out that device activity detection can be cast as a compressed sensing problem. Next, a random pilot sequence design is introduced, and the use of the approximate message passing (AMP) algorithm [8] is proposed for detecting the active devices. Further, we show that massive multiple-input multiple-output (MIMO) [9, 10], which has already exhibited outstanding performance in enhancing the spectrum efficiency of human-type communications, provides an opportunity to leverage the so-called multiple-measurement vector (MMV) compressed sensing technique [11, 12] to achieve asymptotically perfect device activity detection accuracy in massive IoT machine-type communications. Another important fact about mMTC is that it predominantly relies on short packet transmissions. We elaborate on a new method to embed a small number of information bits in the short packets that can be decoded in the device activity detection process. This is enabled by letting each active device randomly select one pilot from a predefined set and letting the BS detect which pilot is used by each active device using AMP. Finally, this article discusses the related technique of coded ALOHA [13] for device activity detection.
II Device Activity Detection as a Compressed Sensing Problem
As discussed above, it is the sporadic IoT traffic and device activity detection that impose the greatest challenge in the design of the grant-free device access protocol. Interestingly, it is also the sporadic IoT traffic itself that provides a promising opportunity for tackling this challenge. As only a small subset of users are active at each time slot, user activity detection amounts to a sparse signal recovery problem.
Suppose that there are $N$ users in the system, which are denoted by the set $\mathcal{N} = \{1, \ldots, N\}$. Further, assume that the BS is equipped with one antenna, and the channel from user $n$ to the BS is denoted by $h_n$. In each coherent time slot, define the user activity indicator function as
(1) $\alpha_n = \begin{cases} 1, & \text{if user } n \text{ is active}, \\ 0, & \text{otherwise}, \end{cases} \quad \forall n \in \mathcal{N}.$
Assume that each device $n$ decides in each coherence block whether or not to access the channel with probability $\epsilon_n$ in an independent manner. Then, $\alpha_n$ can be modeled as a Bernoulli random variable so that $\Pr(\alpha_n = 1) = \epsilon_n$, $\Pr(\alpha_n = 0) = 1 - \epsilon_n$, $\forall n$. The sparse activity level $\epsilon_n$ depends on the specific applications. The model is sufficiently general so that it can capture a variety of applications, e.g., a sensor fusion network in which the sampling rates at different sensors may even be different.
Suppose that each device $n$ is assigned one pilot sequence $\mathbf{a}_n \in \mathbb{C}^{L \times 1}$, where $L$ denotes the length of the device pilot sequence. Furthermore, we assume that the active users are synchronized within the cyclic prefix, and accurately enough in frequency such that the block fading assumption yields a legitimate model for the channel. This is justified by having the BS send a beacon that invites uplink transmissions from the active devices. The received signal at the BS for device activity detection is then
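(2) $\mathbf{y} = \sqrt{\xi} \sum_{n \in \mathcal{N}} \alpha_n h_n \mathbf{a}_n + \mathbf{z} = \sqrt{\xi}\, \mathbf{A} \mathbf{x} + \mathbf{z},$
where $\mathbf{y} \in \mathbb{C}^{L \times 1}$ is the received signal over $L$ symbols, $\xi$ is the total transmit energy of the pilot for each active device, $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$ is the independent additive white Gaussian noise (AWGN) at the BS, $\mathbf{A} = [\mathbf{a}_1, \ldots, \mathbf{a}_N]$ is the collection of pilot sequences of all the devices, and $\mathbf{x} = [x_1, \ldots, x_N]^T$ with $x_n = \alpha_n h_n$ denoting the effective channel of device $n$, i.e., the product of its activity indicator $\alpha_n$ from (1) and its channel $h_n$. The goal for the BS is to detect the active devices and estimate their channels by recovering $\mathbf{x}$ based on the noisy observation $\mathbf{y}$.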
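To make the received-signal model concrete, the following sketch draws one realization of the device activities, the pilots, and the noisy observation at a single-antenna BS. It is an illustration under stated assumptions rather than the setup of any figure in this article: the function and argument names are hypothetical, all devices share one activity probability `eps`, the channels are unit-variance Rayleigh fading, and the pilot entries are i.i.d. complex Gaussian with variance `1/L`.

```python
import math
import random


def complex_gaussian(rng, variance):
    """Draw one sample from a circularly symmetric CN(0, variance)."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def simulate_received_pilot(N, L, eps, xi, sigma2, rng):
    """Generate (alpha, x, A, y) for y = sqrt(xi) * A x + z.

    N: number of devices, L: pilot length, eps: activity probability,
    xi: pilot transmit energy, sigma2: noise variance (all assumed names).
    """
    # Bernoulli activity indicators, one per device.
    alpha = [1 if rng.random() < eps else 0 for _ in range(N)]
    # Channels (assumed unit-variance Rayleigh fading) and effective channels.
    h = [complex_gaussian(rng, 1.0) for _ in range(N)]
    x = [alpha[n] * h[n] for n in range(N)]
    # L x N pilot matrix; column n is the pilot sequence of device n.
    A = [[complex_gaussian(rng, 1.0 / L) for _ in range(N)] for _ in range(L)]
    y = [
        math.sqrt(xi) * sum(A[l][n] * x[n] for n in range(N))
        + complex_gaussian(rng, sigma2)
        for l in range(L)
    ]
    return alpha, x, A, y
```

Because only about `eps * N` entries of `x` are nonzero, recovering `x` from the `L < N` entries of `y` is a sparse recovery problem, which is the point developed next.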
Restricted by the limited coherence time in a practical massive IoT connectivity scenario, the length $L$ of the device pilot sequence is much smaller than the number of devices $N$, i.e., $L \ll N$. Hence, (2) describes an underdetermined linear system with more unknown variables than equations. However, since the unknown vector $\mathbf{x}$ is sparse with many zero entries based on (1), such a reconstruction problem is a sparse optimization problem that can possibly be solved via nonlinear compressed sensing techniques.
There are two main theoretical questions in compressed sensing. First, how to design the sensing matrix $\mathbf{A}$ so as to capture almost all the information about the sparse signal $\mathbf{x}$ with a minimal cost $L$? Second, given a sensing matrix $\mathbf{A}$, how to recover $\mathbf{x}$ from the noisy observation $\mathbf{y}$ even if $L \ll N$? In fact, these two questions are coupled: a good design of the sensing matrix $\mathbf{A}$ leads to an easier algorithm for recovering the sparse signal $\mathbf{x}$. For the massive IoT connectivity setting, this indicates that the device pilot sequences should be carefully designed to enable efficient activity detection schemes at the BS side.
Although a number of desirable properties for a good sensing matrix are known, e.g., the restricted isometry property (RIP), optimizing the sensing matrix design is quite a challenging problem. This magazine article mainly focuses on simple ways to construct the sensing matrix that are easy to implement for practical pilot design. In Sections III and IV, we consider the case when each entry of the sensing matrix $\mathbf{A}$ is i.i.d. randomly generated based on a Gaussian distribution and review the AMP algorithm [8] to recover the sparse signal $\mathbf{x}$ [14, 6]. Later in Section VI, we will briefly review other choices of the sensing matrix and the corresponding compressed sensing algorithms, e.g., the sparse-graph based algorithm with a sparse $\mathbf{A}$ [15], and their applications in device activity detection, e.g., coded slotted ALOHA [13].
III AMP-based Device Activity Detection
AMP, proposed in the seminal work [8], is an efficient iterative thresholding method designed for large-scale compressed sensing problems, making it appealing in the massive IoT connectivity scenario of interest. An attractive feature of the AMP framework is that it allows an analytic performance characterization via the so-called state evolution [16]. In the following, we introduce how the AMP algorithm works for device activity detection in massive IoT connectivity.
III-A Device Pilot Sequence Design
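We assume in this section that the entries of the user pilots are generated from an i.i.d. complex Gaussian distribution with zero mean and variance $1/L$, where $L$ is the pilot length, i.e.,
(3) $a_{n,l} \sim \mathcal{CN}(0, 1/L), \quad \forall n, l.$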
This particular choice of user pilot sequence is convenient for use with the AMP algorithm for two reasons: first, the convergence of the AMP algorithm for device activity detection is guaranteed if the sensing matrix $\mathbf{A}$ is generated in this way [8]; second, with such a Gaussian sensing matrix, the state evolution of the AMP algorithm is well established [16], based on which detection performance, e.g., missed detection probability (probability that an active device is not detected) and false alarm probability (probability that an inactive device is declared to be active), can be analytically characterized in the asymptotic limit.
III-B Algorithm Design and Performance Analysis
III-B1 General Form of AMP Algorithm
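The AMP algorithm aims to provide an estimate $\hat{\mathbf{x}}(\mathbf{y})$ of the sparse signal $\mathbf{x}$ based on the observation $\mathbf{y}$ that minimizes the mean-squared error (MSE)
(4) $\mathrm{MSE} = \mathbb{E}_{\mathbf{x}, \mathbf{y}} \left\| \hat{\mathbf{x}}(\mathbf{y}) - \mathbf{x} \right\|_2^2.$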
Based on an approximation of the message passing algorithm and starting with and , the AMP algorithm proceeds at each iteration as [8, 17]:
(5)  
(6) 
where is the index of the iteration, is the estimate of at iteration , denotes the corresponding residual, is the so-called denoiser, and is the first-order derivative of . The basic intuition is that since the solution should minimize , the algorithm makes progress in (5) by moving in the direction of the gradient of , i.e., , , and then promotes sparsity by applying an appropriately designed denoiser . The residual is then updated in (6), but corrected with a so-called Onsager term involving .
III-B2 State Evolution
An important analytical result about the AMP algorithm is the so-called state evolution in the asymptotic regime when , while their ratios converge to some fixed positive values and with . In systems for massive IoT connectivity, these assumptions indicate that the length of the pilot sequence is of the same order as the number of active users or total users. After the th iteration of the AMP algorithm, define a set of random variables ’s as
(7) 
where the random variables ’s capture the distributions of ’s, follows the normal distribution, i.e., , and is independent of as well as , , and is the state variable, which changes from iteration to iteration as modeled by a simple scalar iterative function known as the MSE map:
(8) 
Here, the expectation is over the random variables ’s and ’s and over all . Under the aforementioned asymptotic regime, [16] shows that applying the denoiser to in (5) is statistically equivalent to applying the denoiser to as shown in (7).
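The scalar recursion behind state evolution can be illustrated numerically. The sketch below is a simplified stand-in, not the exact map (8): it assumes a real-valued Bernoulli-Gaussian signal, unit pilot energy, an MSE map of the standard form $\tau_{t+1}^2 = \sigma^2 + (N/L)\,\mathbb{E}|\eta(X + \tau_t V) - X|^2$, and a soft-thresholding denoiser with threshold proportional to $\tau_t$. All names and the Monte Carlo evaluation of the expectation are choices made here for illustration.

```python
import math
import random


def soft_threshold(u, theta):
    """Shrink a real value toward zero by theta; [-theta, theta] maps to 0."""
    if u > theta:
        return u - theta
    if u < -theta:
        return u + theta
    return 0.0


def state_evolution(eps, beta, sigma2, n_over_l, alpha, num_iters, num_samples, rng):
    """Track the state variable tau_t^2 over iterations.

    eps: activity probability, beta: variance of a nonzero entry,
    sigma2: noise variance, n_over_l: ratio N/L,
    alpha: threshold multiplier, i.e., threshold = alpha * tau_t (assumed rule).
    """
    # Before any denoising, the whole signal energy counts as error.
    tau2 = sigma2 + n_over_l * eps * beta
    history = [tau2]
    for _iteration in range(num_iters):
        tau = math.sqrt(tau2)
        squared_error = 0.0
        for _sample in range(num_samples):
            # Bernoulli-Gaussian signal sample and standard normal noise sample.
            x = rng.gauss(0.0, math.sqrt(beta)) if rng.random() < eps else 0.0
            v = rng.gauss(0.0, 1.0)
            estimate = soft_threshold(x + tau * v, alpha * tau)
            squared_error += (estimate - x) ** 2
        tau2 = sigma2 + n_over_l * squared_error / num_samples
        history.append(tau2)
    return history
```

A call such as `state_evolution(0.05, 1.0, 0.01, 2.0, 1.5, 15, 20000, random.Random(0))` returns the sequence of $\tau_t^2$ values; the parameter values are illustrative only.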
III-B3 Minimax Framework for Denoiser Design
The flexibility in the AMP algorithm design lies in the denoiser in (5). In the AMP literature, the prior distribution of is in general assumed to be unknown. In this case, the denoiser is designed under the minimax framework to optimize the AMP algorithm performance for the worst-case or least-favorable distribution of [18]. Such a design leads to a soft thresholding denoiser for promoting sparsity even for with the worst-case distribution [8]:
(9) 
where the distribution of is captured by , and is the threshold for device for the th iteration of the AMP algorithm, which can be optimized based on the state evolution (8) to minimize the MSE as given in (4). With this denoiser, after the th iteration of the AMP algorithm as shown in (5) and (6), device is declared to be active if , and declared to be inactive otherwise. Note that AMP with soft thresholding implicitly solves the LASSO problem [18], i.e., the sparse signal recovery problem as an $\ell_1$-penalized least squares optimization.
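The iteration described above (matched filtering, denoising, and a residual update with an Onsager correction) can be sketched compactly. The code below is a textbook-style real-valued AMP with a soft-thresholding denoiser, given as an assumption-laden illustration rather than the exact algorithm of this article: the model is taken as $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{z}$ with the pilot energy absorbed into $\mathbf{x}$, the threshold rule `alpha * ||r|| / sqrt(L)` is a common heuristic chosen here, and all function names are hypothetical.

```python
import math
import random


def soft_threshold(u, theta):
    """Shrink a real value toward zero by theta; [-theta, theta] maps to 0."""
    if u > theta:
        return u - theta
    if u < -theta:
        return u + theta
    return 0.0


def amp_soft(A, y, num_iters=30, alpha=1.5):
    """Real-valued AMP with soft thresholding.

    A: list of L rows, each of length N; y: list of length L.
    Returns (x_hat, u), the final estimate and the last denoiser inputs.
    """
    L, N = len(A), len(A[0])
    x = [0.0] * N
    r = list(y)
    u = [0.0] * N
    for _ in range(num_iters):
        # Threshold tied to the empirical residual level (assumed heuristic).
        theta = alpha * math.sqrt(sum(v * v for v in r) / L)
        # Matched-filter step: u_n = a_n^T r + x_n.
        u = [sum(A[l][n] * r[l] for l in range(L)) + x[n] for n in range(N)]
        x_new = [soft_threshold(v, theta) for v in u]
        # Onsager term: (N/L) * mean derivative of the denoiser, which for
        # soft thresholding is the fraction of nonzero outputs.
        onsager = sum(1 for v in x_new if v != 0.0) / L
        Ax = [sum(A[l][n] * x_new[n] for n in range(N)) for l in range(L)]
        r = [y[l] - Ax[l] + onsager * r[l] for l in range(L)]
        x = x_new
    return x, u


def demo(seed, N=100, L=50, K=5):
    """Noise-free toy run; returns (true active set, detected active set)."""
    rng = random.Random(seed)
    support = sorted(rng.sample(range(N), K))
    x_true = [0.0] * N
    for n in support:
        x_true[n] = rng.gauss(0.0, 1.0)
    A = [[rng.gauss(0.0, math.sqrt(1.0 / L)) for _ in range(N)] for _ in range(L)]
    y = [sum(A[l][n] * x_true[n] for n in range(N)) for l in range(L)]
    x_hat, _ = amp_soft(A, y)
    detected = [n for n, v in enumerate(x_hat) if v != 0.0]
    return support, detected
```

Here a device is declared active when its denoised estimate is nonzero, which for soft thresholding is the same as its denoiser input exceeding the threshold in magnitude. A complex-valued version would replace the transposes by conjugate transposes and shrink magnitudes instead of signed values.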
III-B4 Bayesian Framework for Denoiser Design
On the other hand, if the distribution of is known in (2), we can design the minimum mean-squared error (MMSE) denoiser via the Bayesian approach to minimize the MSE for the estimation of as given in (4) [18]. Considering the equivalent signal model (7), the MMSE denoiser is given as the following conditional expectation
(10) 
where the expectation is over and .
For example, if we assume a Rayleigh fading channel such that , where denotes the pathloss and shadowing component of user and is assumed to be known by the BS, then the effective channel follows a Bernoulli-Gaussian distribution. Under this particular distribution of , an analytical expression of the above MMSE denoiser can be found in [14]; it is in general nonlinear and has a complicated form. Similar to the soft thresholding denoiser case, with the MMSE denoiser (10), we can detect the user activity based on whether the magnitude of is larger or smaller than a carefully designed threshold .
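For a scalar Bernoulli-Gaussian signal observed in complex Gaussian noise, the conditional mean has a well-known closed form, which the following sketch evaluates. It is an illustration under stated assumptions and is not claimed to match the exact expression in [14]: the signal is zero with probability `1 - eps` and $\mathcal{CN}(0, \beta)$ otherwise, the effective noise is $\mathcal{CN}(0, \tau^2)$, and the names `eps`, `beta`, and `tau2` are chosen here.

```python
import math


def mmse_denoiser(u, eps, beta, tau2):
    """Posterior mean E[X | X + tau*V = u] for a Bernoulli-Gaussian X.

    u: (complex) denoiser input, eps: activity probability,
    beta: variance of an active device's channel, tau2: effective noise variance.
    """
    if eps <= 0.0:
        return 0j
    # Linear MMSE gain that applies when the device is known to be active.
    gain = beta / (beta + tau2)
    # Ratio p(u | inactive) / p(u | active), weighted by the prior odds.
    exponent = -abs(u) ** 2 * (1.0 / tau2 - 1.0 / (beta + tau2))
    inactive_odds = ((1.0 - eps) / eps) * ((beta + tau2) / tau2) * math.exp(exponent)
    # Posterior activity probability times the linear MMSE estimate.
    return gain * u / (1.0 + inactive_odds)
```

For small `|u|` the posterior activity probability is close to zero and the output is heavily shrunk, while for large `|u|` the output approaches the linear estimate `gain * u`; this gives the smooth thresholding behavior of the MMSE denoiser.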
A comparison between the soft thresholding and MMSE denoisers with a Bernoulli-Gaussian distributed is given in Fig. 6. It can be observed that the MMSE denoiser is also a thresholding-based denoiser, but is softer in the regime around the threshold. Moreover, the threshold for the MMSE denoiser is obtained by calculating (10) to minimize the MSE (4), while the design of the threshold for the soft thresholding denoiser follows a minimax framework, which is in general not optimal for a particular distribution of .
III-B5 Analytical Performance Characterization
The state evolution also allows an analytical performance characterization of the AMP algorithm. For example, with both the soft thresholding and MMSE denoisers, a missed detection event happens if one user is active but , while a false alarm event happens if one user is inactive but . Since defined in (7) captures the statistical distribution of , the probabilities of missed detection and false alarm for device after the th iteration of the AMP algorithm thus can be expressed as
(11)  
(12) 
respectively.
Given the distribution of and denoiser , we can track the values of ’s over iterations based on the state evolution (8), then calculate the probabilities of missed detection and false alarm based on (11) and (12).
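As a worked instance of this calculation, the sketch below evaluates the two error probabilities in closed form for one special case. The assumptions are made here for illustration and are not taken from (11) and (12): the device is detected by comparing the magnitude of the denoiser input with a threshold, an active device has a Rayleigh fading channel with known variance `beta`, and the effective noise is complex Gaussian with variance `tau2` as delivered by state evolution. Under these assumptions the squared magnitude of the input is exponentially distributed with mean `tau2` (inactive) or `beta + tau2` (active).

```python
import math


def false_alarm_prob(threshold, tau2):
    """P(|tau*V| > threshold) when the device is inactive."""
    return math.exp(-threshold ** 2 / tau2)


def missed_detection_prob(threshold, beta, tau2):
    """P(|h + tau*V| < threshold) when the device is active."""
    return 1.0 - math.exp(-threshold ** 2 / (beta + tau2))


def equal_error_threshold(beta, tau2, num_steps=200):
    """Bisection for the threshold at which both error probabilities coincide."""
    lo, hi = 0.0, 100.0 * math.sqrt(beta + tau2)
    for _ in range(num_steps):
        mid = 0.5 * (lo + hi)
        # Missed detection grows and false alarm shrinks as the threshold rises.
        if missed_detection_prob(mid, beta, tau2) < false_alarm_prob(mid, tau2):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Feeding the converged value of `tau2` from the state evolution into these functions gives the predicted detection accuracy without running the AMP algorithm itself.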
Example 2
Here we provide a numerical example to show the probabilities of missed detection and false alarm achieved by the AMP algorithm, under the same setup that is used in Example 1. The devices are assumed to be randomly located in a cell with a radius meters, while each device accesses the channel with an identical probability , , i.e., and of the devices are active at any given time. The transmit power of each user for sending its pilot is dBm. The power spectral density of the AWGN at the BS is assumed to be dBm/Hz. Moreover, we define the system-level missed detection and false alarm probabilities as and , where and denote the missed detection and false alarm probabilities of device achieved by AMP after its convergence. Hence, and are the average numbers of missed detection and false alarm events at each time slot in a system with devices. In addition, under both the soft thresholding and MMSE based AMP algorithms, the thresholds ’s are carefully selected such that .
Fig. 7 shows the device activity detection accuracy achieved by the AMP algorithm with the soft thresholding denoiser and the MMSE denoiser. It is observed that with the MMSE denoiser, active devices can be detected with the AMP algorithm when the length of pilot sequence satisfies . Recall that in Example 1 of the random access scheme, such a performance can be achieved only when the pilot sequence length is longer than even if contention resolution is performed. Moreover, with a careful design of the MMSE denoiser, the MMSE denoiser based AMP algorithm outperforms the soft thresholding denoiser based AMP algorithm in terms of device activity detection.
IV From SMV to MMV: Massive MIMO for Massive IoT Connectivity
As compared to most other applications of compressed sensing such as imaging, a unique and essential opportunity provided by the wireless massive IoT connectivity system design lies in the potential for utilizing the MMV technique for compressed sensing [11], thanks to the multi-antenna technologies nowadays used ubiquitously in cellular networks. The previous section deals with the application of compressed sensing techniques for user activity detection when the BS is equipped with one antenna. In the literature of compressed sensing, the case with one measurement vector is referred to as a single-measurement vector (SMV) problem. Recently, massive MIMO has emerged as a revolutionary technology for dealing with the future data deluge for human-type communications. This section shows that massive MIMO is also a natural solution for accommodating a huge number of IoT devices for future machine-type communications. From the compressed sensing perspective, device activity detection in massive MIMO systems corresponds to the MMV problem, which generalizes the sparse signal recovery problem to the case with a group of measurement vectors for a group of signal vectors that are assumed to be jointly sparse and share a common support. It is of both theoretical and practical importance to investigate the role of massive MIMO in massive IoT connectivity, which is the aim of this section.
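Suppose that the BS is equipped with $M$ antennas. In this case, the channel from user $n$ to the BS is $\mathbf{h}_n \in \mathbb{C}^{M \times 1}$. Then, the signal model given in (2) is generalized to
(13) $\mathbf{Y} = \sqrt{\xi}\, \mathbf{A} \mathbf{X} + \mathbf{Z},$
where $\mathbf{Y} \in \mathbb{C}^{L \times M}$ is the matrix of received signals across $M$ antennas over $L$ symbols, $\xi$ is the pilot transmit energy and $\mathbf{A}$ is the collection of pilot sequences as in (2), $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N]^T$ with $\mathbf{x}_n = \alpha_n \mathbf{h}_n$ denoting the effective channel of user $n$ (the product of its activity indicator and its channel), and $\mathbf{Z} = [\mathbf{z}_1, \ldots, \mathbf{z}_M]$ with $\mathbf{z}_m \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$, $\forall m$, is the independent AWGN at the BS.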
As compared to the SMV signal model (2), the main difference lies in the fact that the unknown signal matrix in (13) is row-sparse, i.e., if one entry of one particular row is zero, the other entries of that row must also be zero. This information can be utilized to improve the user detection accuracy. A comparison between the SMV model (2) and MMV model (13) is illustrated in Fig. 8. In the following, we discuss how to generalize the AMP-based algorithm in Section III to the massive MIMO scenario and quantify its significant improvement in device activity detection accuracy over the single-antenna BS case.
IV-A Algorithm Design
With massive MIMO at the BS, the user pilot sequence assignment still follows (3), which is the same as the case with one antenna at the BS. However, the AMP algorithm is modified as [12]
(14)  
(15) 
As compared to (5) and (6), the dimensions of the signals are now and ; moreover, the denoiser is a mapping in higher dimension, i.e., .
The state evolution of the AMP algorithm still holds for MMV in the asymptotic regime that with fixed ratios and . Specifically, define [12]
(16) 
where the random vector captures the distribution of , is the independent Gaussian noise, and can be tracked over iterations as follows
(17) 
Here, the expectation is over ’s and ’s and over all . Then, in (14), applying denoiser to is statistically equivalent to applying denoiser to
(18) 
where the distributions of and are captured by and .
Based on the above state evolution, denoisers of the MMV-based AMP algorithm can be designed based on different criteria as for the SMV case. For example, the soft thresholding denoiser is
(19) 
Further, assuming Bernoulli-Gaussian distributed ’s, the MMSE denoiser
(20) 
is characterized in [6]. With both the soft thresholding and MMSE denoisers, after the th iteration of the AMP algorithm, user can be declared to be active if , and declared to be inactive otherwise, where is the carefully designed threshold for device detection.
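A row-wise (group) soft-thresholding denoiser and the corresponding norm-based activity test can be sketched as follows. This is the standard vector generalization of soft thresholding, given as an assumption since the exact form of (19) is not reproduced here; the function names are hypothetical. Each row holds the denoiser input of one device across the BS antennas, and the whole row is shrunk or zeroed together, which is how the row-sparse structure is exploited.

```python
import math


def row_norm(row):
    """Euclidean norm of one row (works for real or complex entries)."""
    return math.sqrt(sum(abs(v) ** 2 for v in row))


def group_soft_threshold(row, theta):
    """Shrink the row's norm by theta, keeping its direction; zero if norm <= theta."""
    norm = row_norm(row)
    if norm <= theta:
        return [0.0 * v for v in row]
    scale = 1.0 - theta / norm
    return [scale * v for v in row]


def is_active(row, threshold):
    """Declare the device active when the norm of its row exceeds the threshold."""
    return row_norm(row) > threshold
```

For example, `group_soft_threshold([3.0, 4.0], 2.5)` shrinks a row of norm 5 to norm 2.5 without changing its direction, whereas any row with norm at most 2.5 is set to zero in all antennas at once.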
IV-B Asymptotically Perfect Device Activity Detection
For a fixed number of antennas at the BS, $M$, the missed detection and false alarm probabilities of the MMSE denoiser based AMP algorithm, denoted by and (reducing to (11) and (12) when $M = 1$), are characterized in [6]. Interestingly, perfect device activity detection is achieved in the asymptotic regime of $M \to \infty$ if the thresholds for device detection, i.e., ’s, are properly selected (cf. [6, Theorem 4]):
(21) 
This important result implies that in a massive MIMO system, in which can be larger than , the AMP-based grant-free access scheme is able to detect device activity with extremely high accuracy in massive IoT connectivity systems.
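The intuition for why more antennas help can be illustrated with a deliberately simplified calculation, which is not the analysis of [6]. Assume, for illustration only, that the per-device test statistic is the energy of an $M$-dimensional vector divided by $M$, that its entries are i.i.d. complex Gaussian with variance `tau2` for an inactive device and `beta + tau2` for an active one, and that `tau2` is held fixed as $M$ grows. The total energy is then a scaled Erlang variable, whose tail has a finite-sum closed form.

```python
import math


def gamma_sf(x, shape):
    """P(G > x) for G ~ Gamma(shape, 1) with integer shape (Erlang tail)."""
    term = math.exp(-x)
    total = term
    for k in range(1, shape):
        term *= x / k
        total += term
    return total


def energy_detector_errors(M, beta, tau2, level):
    """Return (missed detection, false alarm) for the test ||u||^2 / M > level.

    M: number of antennas, beta: channel variance of an active device,
    tau2: effective noise variance (assumed fixed), level: decision level,
    which should lie between tau2 and beta + tau2.
    """
    p_fa = gamma_sf(M * level / tau2, M)
    p_md = 1.0 - gamma_sf(M * level / (beta + tau2), M)
    return p_md, p_fa
```

With the illustrative values `beta = 1.0`, `tau2 = 0.1`, and `level = 0.4`, evaluating `energy_detector_errors` for `M` in `(1, 4, 16, 64)` shows both error probabilities shrinking as `M` grows, because the normalized energy concentrates around `tau2` or `beta + tau2`.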
Example 3
Here we provide a numerical example to show the power of massive MIMO for massive IoT connectivity, under the same setup that is used in Examples 1 and 2. Fig. 9 shows the probabilities of missed detection and false alarm (which are made equal by adjusting the detection threshold) versus pilot sequence length , with antennas at the BS. Here, similar to Example 2, we define the system-level missed detection and false alarm probabilities as and , where and denote the missed detection and false alarm probabilities of device achieved by AMP after its convergence. As compared to Fig. 7, it is observed that even with antennas at the BS, both the missed detection and false alarm probabilities can be driven down to when the pilot sequence is , several orders of magnitude lower than the SMV case with the same .
This article mainly focuses on the device activity detection performance under the grant-free access scheme. However, as shown in Fig. 5, besides device activity detection, channel estimation is performed as well via the metadata; moreover, data also should be decoded. Fortunately, the state evolution of the AMP algorithm enables us to characterize the channel estimation performance analytically, thus making it possible to quantify the user achievable rate with the effect of device activity detection taken into consideration [7]. Readers interested in information-theoretical studies on the capacity of massive IoT connectivity systems with randomly active devices (also known as the many-access channel) can refer to [19, 20]. These references provide a justification for our proposed strategy to first detect the user activity via preambles and then decode the user messages, i.e., the grant-free access scheme shown in Fig. 5.
V AMP-based Device Activity Detection with Embedded Information
The AMP algorithm is introduced above for device activity detection. In this section, we show how a modified version of AMP may be used for non-coherent detection of information bits embedded in the pilot transmission. Although the two-phase grant-free access scheme shown in Fig. 5 works very well in most cases when the user messages are of moderate or large size [7, 19, 20], as discussed at the end of the last section, the strategy discussed in this section can be an effective alternative in the special case when very short messages (one or several bits) are transmitted.
V-A Motivation
In many applications the amount of data to be transmitted per block may comprise only a small number of information bits, or even a single bit. This situation is particularly common in control signaling, where the message may contain acknowledgment (ACK/NACK) bits in a retransmission protocol, or simply a concise request for a particular kind of response from the BS.
The transmission of extremely short packets is a challenging problem from two perspectives. First, fundamentally, the protection of very short packets against transmission errors is very expensive. For a single bit, repetition coding is the only possible strategy, and for short blocks, block codes with low coding gains must be used. Second, as only error probability matters, capacity is an irrelevant metric. In fact, for extremely short blocks even finite-blocklength information theory becomes inapplicable, as the corresponding bounds and approximations are too loose to be of practical value.
As an aside, it is noteworthy that most academic work tends to deal with the transmission of long coded blocks, with Shannon capacity as the primary performance metric. Conversely, much of the effort invested in standardization and system design is concerned with the transmission of short data blocks on the control plane, for which Shannon capacity is, mostly, an illegitimate performance measure. An explanation for this situation might be that digital transmission on the control plane is too hard to model and tackle with rigorous information theory: there is no “Shannon theory” available for its analysis, whereas established recipes are available for capacity analysis of long-block transmission. A contributing reason might also be that many academic researchers are simply unaware of the importance and the magnitude of the problem.
There are practical solutions for transmission of a single bit of control information. For example, [21] considers the joint transmission of linearly coded payload data and a single “additional bit”. The transmitter uses the additional bit, through a one-to-one mapping, to select one out of two possible codebooks for the encoding of the payload. The receiver uses a fast algorithm to detect which codebook was used, such that the additional bit can be detected before attempting to decode the payload data.
V-B Algorithm Design
In the context of grant-free random access with non-orthogonal pilots, the main focus of our discussion, a small number, say $J$, of bits may be encoded as follows [22, 23]. Each terminal is assigned, a priori, $2^J$ distinct, typically non-orthogonal, pilots. Upon transmission, the terminal uses the $J$ bits to select one of these pilots; specifically, it selects pilot number $j$, which depending on the bits ranges from 1 to $2^J$. The BS detects “activity” using the AMP algorithm, but now, importantly, “activity” means the combination of the event that a particular terminal is active and the event that a particular string of bits is being communicated. One may think of the resulting communication scheme as non-coherent transmission.
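The bit-to-pilot mapping described above can be sketched in a few lines. The specific mapping used here (reading the bits as a binary number, most significant bit first, and adding one) is an assumption for illustration; the text only requires that each bit string select a distinct pilot number between 1 and the number of pilots assigned to the terminal. The function names are hypothetical.

```python
def pilot_index_from_bits(bits):
    """Map a list of 0/1 bits to a pilot number in 1..2**len(bits)."""
    index = 0
    for bit in bits:
        index = 2 * index + bit
    return index + 1


def bits_from_pilot_index(index, num_bits):
    """Inverse mapping, used at the BS once the active pilot has been detected."""
    value = index - 1
    return [(value >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]
```

At the BS, detecting which of a terminal's pilots is active therefore reveals both that the terminal is active and which bits it sent, e.g., `bits_from_pilot_index(pilot_index_from_bits([1, 0, 1]), 3)` returns `[1, 0, 1]`.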
The analytical model for device activity detection with embedded information in a massive MIMO system is given by:
(22) 
where denotes the collection of all the pilots that can be used by the devices, and denotes the collection of all the effective channels of the devices. Specifically, the effective channel is modeled as , and , where
(23) 
As compared to (13) for sole device activity detection, the dimensions of the sensing matrix and the effective channels are enlarged by a factor of $2^J$ to embed $J$ information bits per device.
The AMP for device activity detection in the form described above in principle could be directly applied to this problem as it stands. However, significantly, it is suboptimal because the BS knows a priori that among the pilots assigned to each terminal, only one can be active at a time, i.e., if , then , .
Here we discuss the modified AMP algorithm for joint detection of user activity and embedded information bits, as proposed in [22]. For conciseness of the exposition we focus here on the case of a single embedded bit (for which we omit the index) – i.e., ; then each user is assigned one out of two unique, but generally nonorthogonal pilot sequences. The modification of the AMP should introduce the constraint that out of the two possible pilots, at most one may be transmitted at at a time; the possible options are that either none of these pilots are sent (device silent), the first one is sent (device active and communicates “0”), or the second one is sent (device active and communicates “1”). The overarching idea is to modify the AMP denoiser function, , to take into this constraint into account.
In more detail, similar to (16), consider the two vectors associated with the two possible pilots of a given device, one for information bit “0” and one for information bit “1”; we omit the iteration index of the AMP algorithm here for brevity. The statistical characterization of these two vectors is
(24) 
Based on these characterizations, we construct the following likelihood ratios:
(25) 
We now rework the denoiser such that, in each update, the constraint that at most one of the two vectors can be nonzero is taken into account. Suppose that a device is detected to be active. In principle, comparing the likelihood ratios in (25) to a threshold would yield a hypothesis test that could be used to discriminate between the two possibilities, bit “0” or bit “1”, or equivalently between which of the two vectors is nonzero; one of the two vectors could then be set to zero based on the outcome of this test. In this process, however, taking a soft decision as given in (19) is preferable in order to avoid premature incorrect decisions on the embedded bits, which may propagate to subsequent iterations. Experimentation in [22] has shown that a good heuristic is to use a soft decision obtained by taking the original soft-thresholding denoiser function given in (19) and multiplying the denoisers for the two vectors by their respective weights, where each weight is a modified sigmoid function of the corresponding likelihood ratio with a given inflection point and a parameter that controls the sharpness of the sigmoid function. The intuition is that the more strongly the likelihood ratios favor one of the two bit values, the closer the weight for the corresponding vector is to unity, and the closer the weight for the other vector is to zero. This way, the effect of the denoiser on the favored vector is similar to the effect of the soft thresholding (19) as used in the original AMP algorithm solely for device activity detection, whereas the other vector is instead pushed down towards zero. A similar interpretation holds for the opposite case. Importantly, while the modified denoiser outlined here yields good results in numerical experiments, it is not optimal in any known sense. Research opportunities are available to find improved denoisers that make better use of the constraint that at most one of the two vectors can be nonzero. An extension of the modified AMP denoiser to the case of multiple embedded bits is available [23].
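To make the soft-weighting idea concrete, the following Python sketch scales the outputs of a base denoiser by sigmoid weights driven by a likelihood ratio. It is not the denoiser of [22]: the Gaussian model behind `log_likelihood_ratio`, the use of complementary weights, the inflection point at equal likelihoods, and all names (`beta`, `tau2`, `sharpness`, `base_denoiser`) are assumptions made for illustration.

```python
import math


def log_likelihood_ratio(x0, x1, beta, tau2):
    """Log-likelihood ratio of 'bit 0 sent' versus 'bit 1 sent'.

    Assumed model (not from the text): the vector of the transmitted pilot is
    CN(0, (beta + tau2) I) and the other vector is CN(0, tau2 I).
    """
    energy0 = sum(abs(v) ** 2 for v in x0)
    energy1 = sum(abs(v) ** 2 for v in x1)
    return (energy0 - energy1) * (1.0 / tau2 - 1.0 / (beta + tau2))


def sigmoid_weight(log_lr, sharpness):
    """Logistic weight in [0, 1]; equals 0.5 when both bits are equally likely."""
    z = sharpness * log_lr
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # numerically stable branch for negative arguments
    return e / (1.0 + e)


def weighted_denoiser(x0, x1, base_denoiser, beta, tau2, sharpness=1.0):
    """Scale the base denoiser outputs so that at most one vector stays large."""
    w0 = sigmoid_weight(log_likelihood_ratio(x0, x1, beta, tau2), sharpness)
    out0 = [w0 * v for v in base_denoiser(x0)]
    out1 = [(1.0 - w0) * v for v in base_denoiser(x1)]
    return out0, out1
```

Here `base_denoiser` stands in for the activity-detection denoiser of (19) and can be any function that maps a list of complex numbers to a list of the same length. A larger `sharpness` moves the rule closer to a hard decision between the two bit values, while a smaller value keeps both vectors alive for more iterations.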
A final remark is that the embedding of one or several information bits of course comes at the expense of storing more pilot sequences at the device and at the BS. Also, for a given coherence block length, more resources need to be dedicated to pilot transmission in order to maintain the same error probability performance. Yet, when very short messages are to be transmitted, the embedding scheme has been shown to be efficient compared to the conventional scheme consisting of pilot-based channel estimation (using the sparsity/AMP-based techniques proposed here) followed by coherent detection [23].
VI Other Compressed Sensing Techniques for Device Activity Detection
Besides AMP, researchers with diverse backgrounds have developed many other powerful algorithms to reconstruct sparse signals from low-dimensional linear measurements as given in (2) and (13). These compressed sensing algorithms can also be leveraged for device activity detection in our considered massive IoT connectivity setting, for example in connection with coded slotted ALOHA.
One powerful algorithm of low complexity is the so-called sparse-graph based compressed sensing algorithm [15], where the sensing matrix is designed by sparsifying each row of the measurement matrix with zero patterns guided by sparse-graph codes. The reason for such a sparse sensing matrix design is to disperse the signal into singletons, i.e., measurements that contain only one nonzero element of the sparse signal, and to peel them off from multitons, i.e., measurements that contain two or more nonzero elements, such that the multitons can in turn become singletons.
Fig. 10 gives a simple example to briefly illustrate how this algorithm works to recover in the ideal case without noise in (2), in which the dimensions of and are and , respectively, and , , are nonzero entries in . Moreover, the sensing matrix is
(26) 
Due to the sparsity in both and , in the received signal , , , and only contain information about , , and , respectively. Thus, can be detected from the singleton . Next, is removed from and , which become singletons so that and can be decoded. Note that the effect of the channels is not taken into account in this example.
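The peeling procedure can be sketched in Python as follows, for the noiseless case. The bins, the signal values, and in particular the index-weighted second measurement used to recognize singletons are hypothetical choices for illustration; they are not the sensing matrix of (26), and the singleton test is only reliable when the nonzero values are generic.

```python
def measure(bins, x):
    """Each bin lists the indices it observes and returns two noiseless
    measurements: the plain sum and an index-weighted sum."""
    return [[sum(x[k] for k in b), sum((k + 1) * x[k] for k in b)] for b in bins]


def peel(bins, y, n, tol=1e-9):
    """Recover a sparse length-n signal by repeatedly peeling singletons."""
    y = [list(m) for m in y]  # work on a copy of the measurements
    x_hat = [0.0] * n
    progress = True
    while progress:
        progress = False
        for i, b in enumerate(bins):
            s, w = y[i]
            if abs(s) < tol:
                continue  # empty bin, or everything in it was already peeled
            ratio = w / s
            k = round(ratio) - 1
            if abs(ratio - round(ratio)) > tol or k not in b:
                continue  # multiton: cannot be resolved yet
            x_hat[k] = s  # singleton: the ratio reveals the index
            for j, other in enumerate(bins):
                if k in other:  # remove the resolved entry everywhere
                    y[j][0] -= s
                    y[j][1] -= (k + 1) * s
            progress = True
    return x_hat


# Hypothetical example: three nonzero entries observed through four bins.
example_bins = [[0, 1], [0, 2, 3], [2, 4, 5], [1, 3, 5]]
example_x = [2.0, 0.0, -1.5, 0.0, 4.0, 0.0]
recovered = peel(example_bins, measure(example_bins, example_x), len(example_x))
```

In this example the first bin is a singleton, so entry 0 is decoded first; removing it turns the second bin into a singleton, which reveals entry 2, and removing that in turn exposes entry 4 in the third bin.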
Density evolution, a powerful tool in modern coding theory, tracks the average density of the remaining edges that are not decoded after a fixed number of peeling iterations. The convergence of the above algorithm is guaranteed by showing that the density evolution converges to zero.
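As a rough numerical illustration of density evolution, the Python sketch below iterates a standard recursion for peeling decoders on random bipartite graphs. The ensemble is an assumption and is not the one analyzed in [15]: every unknown is taken to have the same number of edges (`degree`), the measurement-node degrees are taken to be Poisson distributed, and `load` denotes the ratio of nonzero unknowns to measurements.

```python
import math


def density_evolution(load, degree, iterations=200):
    """Track the fraction of unresolved edges over peeling iterations.

    Assumed ensemble: every unknown has `degree` edges and measurement-node
    degrees are Poisson with mean load * degree.
    """
    q = 1.0  # initially no edge is resolved
    history = [q]
    for _ in range(iterations):
        # A measurement resolves an edge only if all its other edges are resolved.
        p = 1.0 - math.exp(-load * degree * q)
        # An edge of an unknown stays unresolved only if all its other edges failed.
        q = p ** (degree - 1)
        history.append(q)
    return history
```

Under these assumptions and with `degree=3`, the unresolved fraction vanishes for `load=0.7`, so peeling succeeds asymptotically, whereas for `load=0.9` the recursion stalls at a nonzero fixed point, which indicates that a constant fraction of the unknowns remains undecoded.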
We remark that the above “successive interference cancellation” procedure is the principle of coded slotted ALOHA, a powerful multiuser access scheme in which the active devices transmit replicas of their packets in randomly chosen slots that contain both metadata (i.e., pilot sequences) and data. A successful detection of a packet replica in some slot enables removal of the related replicas from the slots in which they occur. This, in turn, lowers the number of colliding packets in the affected slots and boosts their detection probability, instigating new rounds of successive interference cancellation. If a single user packet can be detected in a slot, then the entries of the sparse signal denote the packets of the active users, and the entries of the sparse sensing matrix denote the choice of the slots in which the packets are repeated. On the other hand, the possibility of decoding multiple user packets in a slot is also discussed in [13, 24] as a means to improve the detection performance.
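At the packet level, the successive interference cancellation described above can be sketched in Python as follows. This sketch assumes an idealized collision model in which a slot is decodable only when exactly one undecoded replica remains in it, and in which cancellation is perfect; the function name and the input format are hypothetical.

```python
def sic_decode(replica_slots, n_slots):
    """Return the set of users decoded by successive interference cancellation.

    replica_slots maps each active user to the set of slots carrying a replica
    of its packet.
    """
    slots = [set() for _ in range(n_slots)]
    for user, chosen in replica_slots.items():
        for s in chosen:
            slots[s].add(user)
    decoded = set()
    while True:
        # Slots holding exactly one undecoded replica can be decoded now.
        singles = {next(iter(s)) for s in slots if len(s) == 1}
        if not singles:
            return decoded
        decoded |= singles
        for s in slots:
            s -= singles  # cancel all replicas of the decoded users in place


# Hypothetical frame with four slots and three active users.
frame = {"A": {0, 1}, "B": {1, 2}, "C": {1, 2, 3}}
decoded_users = sic_decode(frame, 4)
```

In this frame, users A and C each have a collision-free replica and are decoded first; cancelling their replicas leaves user B alone in slots 1 and 2, so B is decoded in the second round. If two users choose exactly the same slots, no slot ever becomes collision-free and neither of them is decoded.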
We remark that besides AMP and the sparse-graph based algorithm, many powerful compressed sensing algorithms exist in the literature, including LASSO [25], Orthogonal Matching Pursuit (OMP) [26], and so on. Further, the group sparsity in the MMV model (13) can also be utilized in LASSO [27], in which a mixed-norm penalty, i.e., the sum of the $\ell_2$ norms of the groups, is used to promote the desired sparsity pattern. The potential to apply these advanced compressed sensing techniques for user activity detection has been discussed in [28, 29, 30]. It would be of great significance to investigate which compressed sensing algorithm is best suited for device activity detection in the massive IoT connectivity setting, in terms of the complexity of the pilot sequence design, the pilot sequence length required to achieve reasonable device detection accuracy, the corresponding missed detection and false alarm probabilities, the channel estimation performance, etc.
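To illustrate how a sum-of-norms penalty promotes group sparsity, the Python sketch below implements its proximal operator, which is the shrinkage step used inside proximal-gradient solvers for group LASSO. Treating each row as the entries of one device across the BS antennas is an assumption about the layout of (13), and the function name is hypothetical.

```python
import math


def group_soft_threshold(rows, threshold):
    """Proximal operator of threshold * (sum of the l2 norms of the rows).

    Each row is shrunk towards zero as a whole, so a row is either kept
    (with reduced norm) or set to zero entirely.
    """
    result = []
    for row in rows:
        norm = math.sqrt(sum(abs(v) ** 2 for v in row))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        result.append([scale * v for v in row])
    return result
```

A row whose norm is below the threshold is zeroed as a whole, which corresponds to declaring the device inactive on all antennas jointly, while a row with a larger norm keeps its direction and only loses `threshold` in norm.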
VII Conclusions
A key feature of the future IoT network is the massive number of devices, e.g., sensors, actuators, etc., each with sporadic data traffic. Facilitating the data transmission from so many IoT devices with extremely low latency poses plenty of new research challenges to the signal processing community. To embrace the upcoming era of IoT, this article advocates a grant-free access scheme that mitigates the delay arising from contention resolution in the current random access scheme, and outlines a compressed sensing based approach for device activity detection that enables the grant-free access scheme to work. Most notably, the massive MIMO technology, originally proposed for improving the spectrum efficiency of human-type communications, is able to boost the device activity detection accuracy remarkably for massive IoT connectivity as well, with the aid of the MMV-based AMP algorithm. We have also discussed the potential to decode some short messages along with the device activity detection process.
References
 [1] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable, and low-latency wireless communication with short packets,” Proc. IEEE, vol. 104, no. 9, pp. 1711–1726, Sep. 2016.
 [2] L. Da Xu, W. He, and S. Li, “Internet of Things in industries: A survey,” IEEE Trans. Ind. Informat., vol. 10, no. 4, pp. 2233–2243, Nov. 2014.
 [3] M. Hasan, E. Hossain, and D. Niyato, “Random access for machine-to-machine communication in LTE-advanced networks: Issues and approaches,” IEEE Commun. Mag., vol. 51, no. 6, pp. 86–93, June 2013.
 [4] O. Y. Bursalioglu, C. Wang, H. Papadopoulos, and G. Caire, “RRH based massive MIMO with on the “fly” pilot contamination control,” in Proc. IEEE Int. Conf. Commun. (ICC), 2016.
 [5] E. Björnson, E. de Carvalho, J. H. Sørensen, E. G. Larsson, and P. Popovski, “A random access protocol for pilot allocation in crowded massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 4, pp. 2220–2234, Apr. 2017.
 [6] L. Liu and W. Yu, “Massive connectivity with massive MIMO–Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933–2946, Jun. 2018.
 [7] L. Liu and W. Yu, “Massive connectivity with massive MIMO–Part II: Achievable rate characterization,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2947–2959, Jun. 2018.
 [8] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proc. Nat. Acad. Sci., vol. 106, no. 45, pp. 18914–18918, Nov. 2009.
 [9] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
 [10] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals of Massive MIMO, Cambridge, UK: Cambridge University Press, Nov. 2016.
 [11] J. Ziniel and P. Schniter, “Efficient high-dimensional inference in the multiple measurement vector problem,” IEEE Trans. Signal Process., vol. 61, no. 2, pp. 340–354, Jan. 2013.
 [12] J. Kim, W. Chang, B. Jung, D. Baron, and J. C. Ye, “Belief propagation for joint sparse recovery,” Feb. 2011, [Online] Available: http://arxiv.org/abs/1102.3289.
 [13] E. Paolini, C. Stefanović, G. Liva, and P. Popovski, “Coded random access: How coding theory helps to build random access protocols,” IEEE Commun. Mag., vol. 53, no. 6, pp. 144–150, June 2015.
 [14] Z. Chen, F. Sohrabi, and W. Yu, “Sparse activity detection for massive connectivity,” IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1890–1904, Apr. 2018.
 [15] X. Li, S. Pawar, and K. Ramchandran, “Sub-linear time compressed sensing using sparse-graph codes,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), 2015, pp. 1645–1649.
 [16] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.
 [17] S. Rangan, “Generalized approximate message passing for estimation with random linear mixing,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), 2011, pp. 2168–2172.
 [18] D. L. Donoho, A. Maleki, and A. Montanari, “Message passing algorithms for compressed sensing: I. Motivation and construction,” in Proc. Inf. Theory Workshop, (Cairo, Egypt), pp. 1–5, Jan. 2010.
 [19] X. Chen, T.-Y. Chen, and D. Guo, “Capacity of Gaussian many-access channels,” IEEE Trans. Inf. Theory, vol. 63, no. 6, pp. 3516–3519, Jun. 2017.
 [20] W. Yu, “On the fundamental limits of massive connectivity,” in Inf. Theory and Appl. (ITA), Workshop, Feb. 2017.
 [21] E. G. Larsson and R. Moosavi, “Piggybacking an additional lonely bit on linearly coded payload data,” IEEE Wireless Commun. Lett., pp. 292–295, Aug. 2012.
 [22] K. Senel and E. G. Larsson, “Device activity and embedded information bit detection using AMP in massive MIMO,” in Proc. IEEE Global Commun. Conf. (Globecom) workshops, 2017.
 [23] K. Senel and E. G. Larsson, “Joint user activity and non-coherent data detection in mMTC-enabled massive MIMO using machine learning algorithms,” in Proc. of International ITG Workshop on Smart Antennas (WSA), Mar. 2018.
 [24] G. Wunder, C. Stefanović, P. Popovski, and L. Thiele, “Compressive coded random access for massive MTC traffic in 5G systems,” in Proc. of 49th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, Nov. 2015.
 [25] R. Tibshirani, “Regression shrinkage and selection via the Lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.
 [26] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
 [27] L. Jacob, G. Obozinski, and J. Vert, “Group Lasso with overlap and graph Lasso,” in Proc. Int. Conf. on Machine Learning, Montreal, QC, Canada, Jun. 2009.
 [28] Z. Utkovski, O. Simeone, T. Dimitrova, and P. Popovski, “Random access in C-RAN for user activity detection with limited-capacity fronthaul,” IEEE Signal Process. Lett., vol. 24, no. 1, pp. 17–21, Jan. 2017.
 [29] H. Zhu and G. B. Giannakis, “Exploiting sparse user activity in multiuser detection,” IEEE Trans. Commun., vol. 59, no. 2, pp. 454–465, Feb. 2011.
 [30] G. Wunder, H. Boche, T. Strohmer, and P. Jung, “Sparse signal processing concepts for efficient 5G system design,” IEEE Access, vol. 3, pp. 195–208, 2015.