1 Introduction
Compressive sensing (CS) enables one to sample signals that admit a sparse representation in some transform basis well below the Nyquist rate, while still enabling their faithful recovery [3, 7]. Since many natural and man-made signals exhibit sparse representations, CS has the potential to reduce the costs associated with sampling in numerous practical applications.
1.1 Spatial-multiplexing cameras
The single-pixel camera (SPC) [8] and its multi-pixel extensions [38, 6, 21] are spatial-multiplexing camera (SMC) architectures that rely on CS. In this paper, we focus on such SMC designs, which acquire random (or coded) projections of a (typically static) scene using a spatial light modulator (SLM) in combination with a small number of optical sensors, such as single photodetectors or bolometers. The use of a small number of optical sensors—in contrast to full-frame sensors having millions of pixel elements—turns out to be advantageous when acquiring scenes at non-visible wavelengths. Since the acquisition of scene information beyond the visible spectrum often requires sensors built from exotic materials, corresponding full-frame sensor devices are either too expensive or too cumbersome [10].
Obviously, the use of a small number of sensors is, in general, not sufficient for acquiring complex scenes at high resolution. Hence, existing SMCs assume that the scene to be acquired is static and acquire multiple measurements over time. For static scenes (i.e., images) and for a single-pixel SMC architecture, this sensing strategy has been shown to deliver good results [8], typically at a compression of 2–8×. This approach, however, fails for time-variant scenes (i.e., videos). The main reason is that the time-varying scene to be captured is ephemeral, i.e., each measurement acquires information about a (slightly) different scene. The situation is further aggravated for SMCs having a very small number of sensors (e.g., only one for the SPC). Virtually all existing methods for CS-based video recovery (e.g., [25, 32, 34, 37, 22]) overlook the important fact that the scene changes while one acquires compressive measurements. In fact, all of the mentioned SMC video systems treat scenes as a sequence of static frames (i.e., as piecewise-static scenes) as opposed to a continuously changing scene. This disconnect between the real-world operation of SMCs and the assumptions commonly made for video CS motivates novel SMC acquisition systems and recovery algorithms that are able to deal with the ephemeral nature of real scenes. Figure 1 illustrates the effect of assuming piecewise-static scenes. Put simply, grouping too few measurements for reconstruction results in poor spatial resolution; grouping too many results in severe temporal-aliasing artifacts.
1.2 The “chicken-and-egg” problem of video CS
High-quality video-CS recovery methods for camera designs relying on temporal multiplexing (in contrast to the spatial multiplexing of SMCs) are generally inspired by video-compression schemes and exploit motion estimation between individually recovered frames [28]. Applying such techniques to SMC architectures, however, results in a fundamental problem: on the one hand, obtaining motion estimates (e.g., the optical flow between pairs of frames) requires knowledge of the individual video frames; on the other hand, recovering the video frames in the absence of motion estimates is difficult, especially at low sampling rates and with a small number of sensor elements (cf. Fig. 1). Attempts to address this “chicken-and-egg” problem either perform multiscale sensing [25] or sense separate patches of the individual video frames [22]. However, both approaches ignore the time-varying nature of real-world scenes and rely on a piecewise-static scene model.
1.3 The CS-MUVI framework
In this paper, we propose a novel sensing and recovery method for videos acquired by SMC architectures, such as the SPC [8]. We start (in Sec. 3) with an overview of our sensing and recovery framework. In Sec. 4, we study the recovery performance for time-varying scenes and demonstrate that the performance degradation caused by violating the static-scene assumption is severe, even at moderate levels of motion. We then detail a novel video-CS strategy for SMC architectures that overcomes the static-scene assumption. Our approach builds upon a co-design of scene acquisition and video recovery. In particular, we propose a novel class of CS matrices that enables us to obtain a low-resolution “preview” of the scene at low computational complexity. This preview video is used to extract robust motion estimates (i.e., the optical flow) of the scene at full resolution (in Sec. 5). We exploit these motion estimates to recover the full-resolution video using off-the-shelf convex-optimization algorithms typically used for CS (in Sec. 6). We demonstrate the performance and capabilities of our SMC video-recovery algorithm for different scenes in Sec. 7, show video recovery on real data in Sec. 8, and discuss our findings in Sec. 9. Given the multiscale nature of our framework, we refer to it as CS multi-scale video, or CS-MUVI for short.
We note that a short version of this paper appeared at the IEEE International Conference on Computational Photography [31] and at the Computational Optical Sensing and Imaging [40] meeting. This paper contains an improved recovery algorithm, a more detailed performance analysis, and a larger number of experimental results. Most importantly, we show—to the best of our knowledge—the first high-quality video-recovery results from real data obtained with a laboratory SPC; see Fig. 2 for corresponding results.
2 Background
2.1 Design of multiplexing systems
Suppose that we have a signal-acquisition system characterized by y = Φx + e, where x ∈ R^N is the signal to be sensed, y is the measurement obtained using the matrix Φ, and e denotes zero-mean measurement noise with per-entry variance σ². The entries of the measurement matrix Φ are usually restricted to the range [0, 1]. Given an invertible matrix Φ, the recovery error associated with the least-squares estimate x̂ = Φ⁻¹y satisfies E‖x̂ − x‖² = σ² tr((ΦᵀΦ)⁻¹). Traditional imaging systems mostly use the identity as the measurement matrix, i.e., Φ = I; such measurements result in an error equal to Nσ².
A classical problem is the design of the matrix Φ that results in the minimal recovery error. As shown in [14], Hadamard matrices are optimal in guaranteeing the smallest possible error when the measurement noise is signal-independent. Specifically, if an N×N Hadamard matrix were to exist, then the recovery error (using the associated 0/1 S-matrix) satisfies E‖x̂ − x‖² ≈ 4σ², which is a dramatic reduction from the Nσ² achieved by Φ = I.
While Hadamard multiplexing provides immense benefits in the context of imaging, it still requires an invertible measurement matrix, i.e., the dimensionality of the measurement y needs to be the same as (or greater than) that of the sensed signal x. For SMCs that aggregate measurements over a time period, this implies a long acquisition period as the dimensionality of the signal increases. This also leads to poorer temporal resolution. All of these concerns could potentially be addressed if it were possible to reconstruct a signal from far fewer measurements than its dimensionality, i.e., when M < N. Such a sensing framework is popularly referred to as compressive sensing. We discuss this approach next.
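The multiplex advantage discussed above can be checked numerically. The sketch below compares the least-squares error factor tr((ΦᵀΦ)⁻¹) for the identity matrix and for a 0/1 S-matrix derived from a Hadamard matrix; the matrix size is an illustrative choice.

```python
import numpy as np
from scipy.linalg import hadamard

# Illustrative comparison of multiplexing gain (a sketch; N chosen small).
# The S-matrix of order N-1 is built from a Sylvester Hadamard matrix of order N.
N = 32
H = hadamard(N)                        # entries in {-1, +1}
S = (1 - H[1:, 1:]) // 2               # 0/1 "S-matrix" of order N - 1

# For y = Phi x + e with white noise of variance sigma^2, the least-squares
# error equals sigma^2 * trace((Phi^T Phi)^{-1}); smaller trace is better.
def ls_error_factor(Phi):
    return np.trace(np.linalg.inv(Phi.T @ Phi))

identity_factor = ls_error_factor(np.eye(N - 1))    # = N - 1
smatrix_factor = ls_error_factor(S.astype(float))   # ~ 4, the multiplex advantage

print(identity_factor, smatrix_factor)
```

Running this shows the trace dropping from N − 1 for the identity to roughly 4 for the S-matrix, matching the error reduction discussed above.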
2.2 Compressive sensing
CS deals with the estimation of a vector x ∈ R^N from M < N non-adaptive linear measurements [3, 7]

y = Φx + e,  (1)

where Φ ∈ R^{M×N} is the sensing matrix and e represents measurement noise. Estimating the signal x from the compressive measurements y is an ill-posed problem, in general, since the (noiseless) system of equations y = Φx is underdetermined.
is underdetermined. Early results in sparse polynomial interpolation
[1] showed that, in the noiseless setting, it is possible to recover a sparse vector from measurements; however, the use of algebraic methods involving polynomials of highdegrees made the solutions fragile to perturbations. A fundamental result from CS theory states that a robust estimate of the vector can be obtained from(2) 
measurements if (i) the signal x admits a K-sparse representation s = Ψᵀx in an orthonormal basis Ψ (i.e., s has no more than K non-zero entries), and (ii) the effective sensing matrix ΦΨ satisfies the restricted isometry property (RIP) [2]. For example, if the entries of the sensing matrix Φ are i.i.d. zero-mean Gaussian distributed, then ΦΨ is known to satisfy the RIP with high probability. Furthermore, any K-sparse signal satisfying (2) can be estimated stably from the noisy measurement y by solving the following convex-optimization problem [3]:

(P1)  minimize ‖Ψᵀx‖₁ subject to ‖y − Φx‖₂ ≤ ε.

Here, (·)ᵀ denotes matrix transposition and the parameter ε ≥ ‖e‖₂ is a bound on the measurement noise. For K-sparse signals, it can be shown that the recovery error is bounded from above as ‖x̂ − x‖₂ ≤ Cε, where C is a constant. Hence, in the noiseless setting (where ε = 0), the K-sparse signal can be recovered perfectly, even from far fewer measurements (2) than the signal’s dimensionality.
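As a concrete illustration of sparse recovery, the sketch below solves an unconstrained (Lagrangian) variant of (P1) with ISTA, a simple iterative soft-thresholding scheme. The problem sizes, the regularization weight, and the debiasing threshold are illustrative choices (not values from the paper), and the sparsifying basis is taken to be the identity (Ψ = I) for simplicity.

```python
import numpy as np

# A minimal sketch of sparse recovery from compressive measurements using
# ISTA (iterative soft-thresholding); all sizes and parameters are illustrative.
rng = np.random.default_rng(0)
N, M, K = 64, 32, 3                      # signal length, measurements, sparsity
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.choice([-1.0, 1.0], K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # i.i.d. Gaussian sensing matrix
y = Phi @ x                                      # noiseless measurements

L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
lam, xhat = 0.05, np.zeros(N)
for _ in range(2000):                    # ISTA: gradient step + soft threshold
    g = xhat - (Phi.T @ (Phi @ xhat - y)) / L
    xhat = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)

# Debias: least squares restricted to the detected support.
support = np.abs(xhat) > 0.1
xhat[~support] = 0.0
xhat[support] = np.linalg.lstsq(Phi[:, support], y, rcond=None)[0]
```

After debiasing, the estimate matches the true sparse vector up to numerical precision, illustrating the perfect noiseless recovery claimed above.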
Signals with sparse gradients
The results of compressive sensing have been extended to a broad class of signals beyond sparse signals; an example are signals that exhibit sparse gradients. For such signals, one can solve problems of the form [5, 24]

(P2)  minimize ‖x‖_TV subject to ‖y − Φx‖₂ ≤ ε,

where the gauge ‖·‖_TV promotes sparse gradients. For a 2-D signal x (i.e., an image), the operator can be defined as

‖x‖_TV = Σ_i √((∇_x x)_i² + (∇_y x)_i²),

where ∇_x and ∇_y are the spatial gradients in the x- and y-directions of the 2-dimensional image x, respectively. This definition can easily be extended to higher-dimensional signals, such as RGB color images or videos (where the third dimension is time). We next look at the prior art devoted specifically to CS of videos.
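For intuition, the isotropic TV gauge defined above can be sketched in a few lines; the forward-difference discretization and replicate boundary handling below are common choices, not prescribed by the paper.

```python
import numpy as np

# A sketch of the (isotropic) total-variation gauge for a 2-D image,
# using forward differences with replicate boundary handling.
def total_variation(img):
    dx = np.diff(img, axis=1, append=img[:, -1:])   # gradient in x-direction
    dy = np.diff(img, axis=0, append=img[-1:, :])   # gradient in y-direction
    return np.sum(np.sqrt(dx**2 + dy**2))

flat = np.ones((8, 8))                   # constant image: zero TV
step = np.zeros((8, 8))
step[:, 4:] = 1.0                        # a single vertical edge: small TV
print(total_variation(flat), total_variation(step))
```

A constant image has zero TV, and a piecewise-constant image with one edge has small TV, which is exactly why (P2) favors such "cartoon-like" reconstructions.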
2.3 Video compressive sensing
An important challenge in CS of videos is that the temporal dimension is fundamentally different from the spatial and spectral dimensions due to its ephemeral nature. The causality of time prevents us from obtaining additional measurements of an event that has already occurred. This is especially relevant for SMCs that aggregate measurements over a time period. Further, the temporal statistics of a video are often different from its spatial statistics. These unique characteristics have led to a large body of work dedicated to video CS, which can broadly be grouped into signal models with corresponding recovery algorithms, and novel compressive-imaging architectures.
2.3.1 Spatial multiplexing cameras
SMCs are imaging architectures that build on the ideas of CS. In particular, they employ an SLM, e.g., a digital micromirror device (DMD) or liquid crystal on silicon (LCOS), to optically compute a series of linear projections of the scene x; these linear projections determine the rows of the sensing matrix Φ. Since SMCs are usually built with only a few sensor elements, they can operate at wavelengths where corresponding full-frame sensors are too expensive. In the recovery stage, one estimates the image x from the compressive measurements collected in y, for example, by solving (P1) or variants thereof.
Single-pixel camera
A prominent SMC is the SPC [8]; its main feature is the ability to acquire images using only a single sensor element (i.e., a single pixel) while taking significantly fewer multiplexed measurements than the number of pixels of the scene to be recovered. In the SPC, light from the scene is focused onto a programmable DMD, which directs light from only a subset of activated micromirrors onto the photodetector. The programmable nature of the DMD enables us to freely direct light from each of the micromirrors towards the photodetector or away from it. As a consequence, the voltage measured at the photodetector corresponds to an inner product of the image focused on the DMD and the activation pattern of the DMD (see Figure 3). Specifically, at time t, if the DMD pattern is φ_t and the scene is x_t, then the photodetector measures the scalar value y_t = ⟨φ_t, x_t⟩, where ⟨·, ·⟩ denotes the inner product between two vectors. If the scene is static (x_t = x for all t), then multiple measurements can be aggregated to form the expression in (1), with the patterns φ_t forming the rows of Φ. The SPC leverages the high operating speed of the DMD, i.e., the mirror-orientation patterns on the DMD can be reprogrammed at kHz rates. The DMD’s operating speed defines the measurement bandwidth (i.e., the number of measurements per second), which is one of the key factors that determine the achievable spatial and temporal resolutions.
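The SPC measurement model described above can be sketched as follows; the pattern statistics and scene are synthetic placeholders chosen for illustration.

```python
import numpy as np

# A sketch of the SPC measurement model: at each time t the photodetector
# reports the inner product of the DMD pattern phi_t with the current scene.
rng = np.random.default_rng(1)
N, M = 16 * 16, 64                       # pixels in the scene, number of samples
patterns = rng.integers(0, 2, (M, N))    # binary DMD patterns (rows of Phi)
scene = rng.random(N)                    # a static scene, vectorized

# Sequential single-pixel measurements y_t = <phi_t, x_t>, with x_t = x (static).
y = np.array([patterns[t] @ scene for t in range(M)])

# For a static scene, the M samples aggregate into the CS model y = Phi x of (1).
assert np.allclose(y, patterns @ scene)
```

The aggregation step is exactly where the static-scene assumption enters: if the scene changed between samples, the stacked measurements would no longer correspond to a single x.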
There have been many recovery algorithms proposed for video CS using the SPC. Wakin et al. [37] use 3-dimensional wavelets as a sparsifying basis for videos and recover all frames of the video jointly under this prior. Unlike images, however, videos are not well represented using wavelets, since they have additional temporal properties, like brightness constancy, that are better represented using motion-flow models. Park and Wakin [26] analyzed the coupling between the spatial and temporal bandwidths of a video. In particular, they argue that reducing the spatial resolution of a scene implicitly reduces its temporal bandwidth and hence lowers the error caused by the static-scene assumption. This builds the foundation for the multiscale sensing and recovery approach proposed in [25], where several compressive measurements are acquired at multiple scales for each video frame. The recovered video at coarse scales (low spatial resolution) is used to estimate motion, which is then used to boost the recovery at finer scales (high spatial resolution). Other scene models and recovery algorithms for video CS with the SPC use block-based models [22, 9], sparse frame-to-frame residuals [35, 4], linear dynamical systems [34, 32, 33], and low-rank-plus-sparse models [39]. To the best of our knowledge, all of them report results only on synthetic data and use the assumption that each frame of the video remains static for a certain duration of time (typically a fraction of a second)—an assumption that is violated when operating with an actual SPC.
2.3.2 Temporal multiplexing cameras
In contrast to SMCs, which use sensors having low spatial resolution and seek to spatially super-resolve images and videos, temporal multiplexing cameras (TMCs) have low frame-rate sensors and seek to temporally super-resolve videos. In particular, TMCs use SLMs for temporal multiplexing of videos and sensors with high spatial resolution, such that the intensity observed at each pixel is coded temporally by the SLM during each exposure.
Veeraraghavan et al. [36] showed that periodic scenes could be imaged at very high temporal resolutions by using a global shutter or a “flutter shutter” [27]. This idea was extended to non-periodic scenes in [16], where a union-of-subspaces model was used to temporally super-resolve the captured scene. Reddy et al. [28] proposed the per-pixel compressive camera (P2C2), which extends the flutter-shutter idea with per-pixel shuttering. Inspired by video-compression standards such as MPEG-1 [18] and H.264 [29], the recovery of videos from the P2C2 camera was achieved using the optical flow between pairs of consecutive frames of the scene. The optical flow between pairs of video frames is estimated from an initial reconstruction of the high-frame-rate video obtained using wavelet priors on the individual frames. A second reconstruction is then performed that further enforces the brightness-constancy constraints provided by the optical-flow fields. The implementation of the recovery procedure described in [28] is tightly coupled to the imaging architecture, which prevents its use for SMC architectures. Nevertheless, the use of optical-flow estimates for video-CS recovery inspired the recovery stage of CS-MUVI as detailed in Sec. 6.
Gu et al. [12] propose to use the rolling shutter of a CMOS sensor to enable higher temporal resolution. The key idea there is to stagger the exposures of each row randomly and to use image/video statistics to recover a high-frame-rate video. Hitomi et al. [15] use per-pixel coding, similar to P2C2, that is implementable in modern CMOS sensors with per-pixel electronic shutters; a hallmark of their approach is the use of a highly overcomplete dictionary of video patches to recover the video at high frame rates. This results in highly accurate reconstructions even when brightness constancy—the key construct underlying optical-flow estimation—is violated. Llull et al. [20] propose a TMC that uses a translating mask in the sensor plane to achieve temporal multiplexing. This approach avoids the hardware complexity involved with DMDs and LCOS, and enjoys other benefits including low operational power consumption. In Yang et al. [42], a Gaussian mixture model (GMM) is used as a signal prior to recover high-frame-rate videos for TMCs; a hallmark of this approach is that the GMM parameters are not just trained offline but also adapted and tuned in situ during the recovery process. Harmany et al. [13] extend coded-aperture systems by incorporating a flutter shutter [27] or a coded exposure; the resulting TMC provides immense flexibility in the choice of the measurement matrix. They also show that the resulting system provides measurement matrices that satisfy the RIP.

3 Overview of CS-MUVI
State-of-the-art video-compression methods rely on estimating the motion in the scene, compressing a few reference frames, and using the motion vectors that relate the remaining parts of the scene to these reference frames. While this approach is possible in the context of video compression, i.e., where the algorithm has prior access to the entire video, it is significantly more difficult in the context of compressive sensing.
A general strategy to enable the use of motion-flow-based signal models for video CS is to use a two-step approach [28]. In the first step, an initial estimate of the video is generated by recovering each frame individually using sparse wavelet or gradient priors. The initial estimate is used to derive the motion flow between consecutive frames; this enables a powerful description that relates intensities at pixels across frames. In the second step, the video is re-estimated, but now with the aid of the extracted motion-flow constraints in addition to the measurement constraints. The success of this two-step strategy critically depends on the ability to obtain reliable motion estimates, which, in turn, depends on obtaining robust initial estimates in the first step. Unfortunately, in the context of SMCs, obtaining reliable initial estimates of the frames of the video, in the absence of motion knowledge, is inherently hard due to the violation of the static-scene model (recall Fig. 1).
The proposed framework, referred to as CS-MUVI, enables a robust initial estimate by obtaining the individual frames at a lower spatial resolution. This approach has two important benefits towards reducing the violation of the static-scene model. First, obtaining the initial estimate at a lower spatial resolution reduces the dimensionality of the video significantly. As a consequence, we can estimate individual frames of the video from fewer measurements. In the context of an SMC, this implies a smaller time window over which these measurements are obtained and, hence, a reduced misfit to the static-scene model. Second, spatial downsampling naturally reduces the temporal resolution of the video [26]; this is a consequence of the additional blur due to spatial downsampling. This implies that the violation of the static-scene assumption is naturally reduced when the video is downsampled. In Sec. 4, we study this strategy in detail and characterize the error in the initial estimates obtained at a lower resolution. Specifically, given W consecutive measurements from an SMC, we are interested in estimating a single static image at a resolution of √W × √W pixels. Note that varying W, which denotes the window length, varies both the spatial resolution of the recovered frame (since it has a resolution of √W × √W pixels) and its temporal resolution (since the acquisition time is proportional to W). We analyze the various sources of error in the recovered low-resolution frame. This analysis provides conditions for stable recovery of the initial estimates, which leads to the design of measurement matrices in Sec. 5.
The proposed CS-MUVI framework for video CS relies on three steps. First, we recover a low-resolution video by reconstructing each frame of the video individually using simple least-squares techniques. Second, this low-resolution video is used to obtain motion estimates between frames. Third, we recover a high-resolution video by enforcing a spatio-temporal gradient prior, the constraints induced by the compressive measurements, as well as the constraints due to the motion estimates. Fig. 4 provides an overview schematic of these steps.
4 Spatiotemporal tradeoff
We now study the recovery error that results from the static-scene assumption when sensing a time-varying scene (video) with an SMC. We also identify a fundamental tradeoff underlying the multiscale recovery procedure, which is used in Sec. 5 to identify novel sensing matrices that minimize the spatio-temporal recovery errors. Since the SPC is the most challenging SMC architecture, as it provides only a single-pixel sensor, we focus solely on the SPC in the following. Generalizing our results to other SMC architectures with more than one sensor is straightforward.
4.1 SMC acquisition model
The compressive measurements taken by a single-pixel SMC at sample instants t = 1, 2, …, M can be modeled as

y_t = ⟨φ_t, x_t⟩ + e_t,

where M is the total number of acquired samples, φ_t ∈ R^N is the measurement vector, e_t represents measurement noise, and x_t is the scene (or frame) at sample instant t. In the remainder of the paper, we assume that the 2-dimensional scene consists of n×n spatial pixels, which, when vectorized, results in the vector x_t of dimension N = n². We also use the notation ȳ to represent the vector consisting of a window of W successive compressive measurements (samples), i.e.,

ȳ = [y₁, y₂, …, y_W]ᵀ.  (3)
4.2 Static-scene and downsampling errors
Suppose that we rewrite our (time-varying) scene x_t for a window of W consecutive sample instants as follows:

x_t = z + Δx_t,  t = 1, …, W.

Here, z is the static component (assumed to be invariant for the considered window of W samples), and Δx_t is the error at sample instant t caused by the static-scene assumption. By defining Δy = [⟨φ₁, Δx₁⟩, …, ⟨φ_W, Δx_W⟩]ᵀ and e = [e₁, …, e_W]ᵀ, we can rewrite (3) as

ȳ = Φz + Δy + e,  (4)

where Φ ∈ R^{W×N} is the sensing matrix whose t-th row corresponds to the transposed measurement vector φ_tᵀ.
We now investigate the error caused by spatial downsampling of the static component z in (4). To this end, let z_L be the downsampled static component of dimension N_L = n_L², with n_L < n. By defining linear upsampling and downsampling operators U: R^{N_L} → R^N and D: R^N → R^{N_L}, respectively, with z_L = Dz, we can rewrite (4) as follows:

ȳ = ΦU z_L + Φ(z − U z_L) + Δy + e,  (5)

since z = U z_L + (z − U z_L). Inspection of (5) reveals three sources of error in the CS measurements of the low-resolution static scene z_L: (i) the spatial-approximation error Φ(z − U z_L) caused by downsampling, (ii) the temporal-approximation error Δy caused by assuming that the scene remains static for W samples, and (iii) the measurement error e. Note that when W ≥ N_L, the matrix ΦU has at least as many rows as columns and, hence, we can obtain an estimate of z_L. We next study the error induced by this least-squares estimate in terms of the relative contributions of the spatial-approximation and temporal-approximation terms.
4.3 Estimating a low-resolution image
In order to analyze the tradeoff that arises from the static-scene assumption and the downsampling procedure, we consider the scenario where the effective matrix ΦU is of dimension W × N_L with W ≥ N_L; that is, we aggregate at least as many compressive samples as the downsampled spatial resolution. If ΦU has full (column) rank, then we can obtain a least-squares (LS) estimate of the low-resolution static scene z_L from (5) as

ẑ_L = (ΦU)† ȳ = z_L + (ΦU)† (Φ(z − U z_L) + Δy + e),  (6)

where (·)† denotes the pseudo-inverse. From (6) we observe the following facts: (i) the window length W controls a tradeoff between the spatial-approximation error Φ(z − U z_L) and the error Δy induced by assuming a static scene, and (ii) the LS estimator matrix (ΦU)† (potentially) amplifies all three error sources.
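The error decomposition in (6) can be verified numerically. The sketch below uses an illustrative replication upsampler U and synthetic error terms; it checks that the LS preview estimate equals z_L plus the pseudo-inverse applied to the three error sources.

```python
import numpy as np

# Numerical check of the error decomposition in (6); the replication
# upsampler U, the sizes, and all error magnitudes are illustrative choices.
rng = np.random.default_rng(2)
n, nL = 8, 4                              # full and preview resolutions
N, NL = n * n, nL * nL
W = 2 * NL                                # window length with W >= N_L

# U replicates each low-resolution pixel into a 2x2 block of the scene.
U = np.zeros((N, NL))
for r in range(n):
    for c in range(n):
        U[r * n + c, (r // 2) * nL + (c // 2)] = 1.0

Phi = rng.choice([-1.0, 1.0], (W, N))     # +/-1 measurement patterns
zL = rng.random(NL)                       # low-resolution static component
z = U @ zL + 0.1 * rng.standard_normal(N)  # static scene plus spatial detail
dy = 0.05 * rng.standard_normal(W)        # temporal-approximation error
e = 0.01 * rng.standard_normal(W)         # measurement noise
y = Phi @ z + dy + e                      # window of samples, cf. (4)-(5)

P = np.linalg.pinv(Phi @ U)
z_hat = P @ y                             # least-squares preview estimate (6)
resid = P @ (Phi @ (z - U @ zL) + dy + e)  # the three amplified error terms
assert np.allclose(z_hat, zL + resid)
```

The final check confirms that the preview error is exactly the pseudo-inverse applied to the spatial, temporal, and measurement error terms, as stated below (6).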
4.4 Characterizing the tradeoff
The spatial-approximation error and the temporal-approximation error are both functions of the window length W. We now show that carefully selecting W minimizes the combined spatial and temporal error in the low-resolution estimate ẑ_L. A close inspection of (6) shows that for W = 1, the temporal-approximation error is zero, since the static component is able to perfectly represent the scene at each sample instant. As W increases, the temporal-approximation error increases for time-varying scenes; simultaneously, increasing W reduces the error caused by downsampling (see Fig. 5(b)). For W = N, there is no spatial-approximation error (as long as ΦU is invertible). Note that characterizing both errors analytically is, in general, difficult, as they depend heavily on the scene under consideration.
Figure 5 illustrates the tradeoff controlled by W and the individual spatial and temporal approximation errors, characterized in terms of the recovery signal-to-noise ratio (SNR). The figure highlights our key observation that there is an optimal window length W for which the total recovery SNR is maximized. In particular, we see from Fig. 5(c) that the optimal window length increases (i.e., towards higher spatial resolution) when the scene changes slowly; on the contrary, when the scene changes rapidly, the window length (and, consequently, the spatial resolution) should be low. Since N_L ≤ W, the optimal window length dictates the resolution at which accurate low-resolution motion estimates can be obtained. Hence, the optimal window length depends on the scene to be acquired, the rate at which measurements can be acquired, and the sensing matrix itself.
5 Design of sensing matrix
In order to bootstrap CS-MUVI, a low-resolution estimate of the scene is required. We next show that carefully designing the CS sensing matrix enables us to compute high-quality low-resolution scene estimates at low complexity, which improves the performance of video recovery.
5.1 Dual-scale sensing matrices
The choice of the sensing matrix Φ and the upsampling operator U is critical to arrive at a high-quality estimate of the low-resolution image z_L. Indeed, if the effective matrix ΦU is ill-conditioned, then application of the pseudo-inverse amplifies all three sources of error in (6), eventually resulting in a poor estimate. For virtually all sensing matrices commonly used in CS, such as i.i.d. (sub-)Gaussian matrices, as well as subsampled Fourier or Hadamard matrices, right-multiplying them with an upsampling operator U often results in an ill-conditioned or even rank-deficient matrix. Hence, well-established CS matrices are a poor choice for obtaining a high-quality low-resolution preview. Figures 6(a) and 6(b) show recovery results for naïve recovery using (P1) and least squares (LS), respectively, using a random sensing matrix. We immediately see that both recovery methods result in poor performance, even for large window sizes or for a small amount of motion.

In order to achieve good CS recovery performance with minimal noise enhancement when computing a low-resolution preview according to (6), we propose a novel class of sensing matrices, referred to as dual-scale sensing (DSS) matrices. These matrices (i) satisfy the RIP to enable CS and (ii) remain well-conditioned when right-multiplied by a given upsampling operator U. Such a DSS matrix enables robust low-resolution previews, as shown in Fig. 6(c). We next discuss the details.
5.2 DSS matrix design
In this section, we detail a particular design that is suited for SMC architectures. In SMC architectures, we are constrained in the choice of the entries of the sensing matrix Φ. Practically, the DMD limits us to matrices having binary-valued entries (e.g., ±1) if we are interested in the highest possible measurement rate. (It is possible to employ more general sensing matrices, e.g., using spatial and/or temporal half-toning; this, however, comes at the cost of spatial resolution and/or speed. The design of such matrices is not within the scope of this paper but is an interesting research direction.) We propose the matrix Φ to satisfy ΦU = H, where H is an N_L × N_L Hadamard matrix (in what follows, we assume that N_L is chosen such that a Hadamard matrix exists) and U is a predefined upsampling operator. Recall from Section 2.1 that Hadamard matrices have the following advantages: (i) they have orthogonal columns, (ii) they exhibit optimal SNR properties over matrices restricted to ±1 entries, and (iii) applying the (forward and inverse) Hadamard transform requires very low computational complexity (i.e., the same complexity as a fast Fourier transform).
We now show the construction of such a DSS matrix (see Fig. 7(a)). A simple way is to start with a Hadamard matrix H and to write the CS matrix Φ as

Φ = HD + F,  (7)

where D is a downsampling matrix satisfying DU = I, and F is an auxiliary matrix that obeys the following constraints: (i) the entries of Φ are ±1, (ii) the matrix Φ has good CS recovery properties (e.g., satisfies the RIP), and (iii) F is chosen such that FU = 0, which ensures ΦU = H. Note that an easy way to ensure that Φ is ±1-valued is to interpret F as sign flips of the Hadamard patterns in HD. One could choose F to be an all-zeros matrix; this choice, however, results in a sensing matrix having poor CS recovery properties. In particular, such a matrix would inhibit the recovery of high spatial frequencies. Choosing random entries in F such that FU = 0 (i.e., using random patterns of high spatial frequency) provides excellent performance.

To arrive at an efficient implementation of CS-MUVI, we additionally want to avoid storing the entire matrix F. To this end, we generate each row of F as follows: associate the row vector with an image of the scene, partition this image into blocks, and associate a vector with each block. We can then use the same vector for each block, chosen such that the full matrix satisfies FU = 0. We also permute the columns of the Hadamard matrix H to achieve better incoherence with the sparsifying bases used in Sec. 6 (see Fig. 7(b) for the details).
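One concrete, simplified instance of the construction in (7) can be sketched as follows. Here we assume, purely for illustration, that U is a zero-insertion upsampler (U = Dᵀ with D a subsampling matrix, so that DU = I); this is our own choice, not necessarily the operator used in the paper. With this choice, filling the non-retained pixels with random ±1 values yields FU = 0 and hence ΦU = H exactly, while keeping binary entries.

```python
import numpy as np
from scipy.linalg import hadamard

# An illustrative DSS-style construction: each row equals a low-resolution
# Hadamard pattern at the pixels retained by the subsampler D, and random
# +/-1 "high-frequency" values elsewhere, so that Phi U = H holds exactly.
rng = np.random.default_rng(3)
n, nL = 8, 4
N, NL = n * n, nL * nL

# D keeps the top-left pixel of every 2x2 block; U = D^T inserts zeros.
keep = [r * n + c for r in range(0, n, 2) for c in range(0, n, 2)]
D = np.zeros((NL, N))
D[np.arange(NL), keep] = 1.0
U = D.T

H = hadamard(NL)                               # N_L x N_L Hadamard matrix
F = rng.choice([-1.0, 1.0], (NL, N))           # random high-frequency part
F[:, keep] = 0.0                               # enforce F U = 0
Phi = H @ D + F                                # DSS matrix, cf. (7)

assert np.array_equal(Phi @ U, H)              # exact low-resolution Hadamard
assert set(np.unique(Phi)) <= {-1.0, 1.0}      # binary entries for the DMD

# Fast preview: for a scene of the form U z_L, invert via H^{-1} = H / N_L.
zL = rng.random(NL)
y = Phi @ (U @ zL)
assert np.allclose(H @ y / NL, zL)
```

The last check is the preview computation of the next section: because ΦU = H, the low-resolution component can be recovered with a single (fast) Hadamard transform instead of a pseudo-inverse.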
5.3 Preview mode
The use of Hadamard matrices for the low-resolution part of the proposed DSS matrices has an additional benefit. Hadamard matrices have fast inverse transforms, which can significantly speed up the recovery of the low-resolution preview frames. Such a “fast” DSS matrix has the key capability of generating a high-quality preview of the scene (see Fig. 8) with very low computational complexity; this is beneficial for video CS as it allows one to easily and quickly extract an estimate of the scene motion. The motion estimate can then be used to recover the video at its full resolution (see Sec. 6). In addition, the use of fast DSS matrices can be beneficial in various other ways, including (but not limited to) the following:
Digital viewfinder
Conventional SMC architectures do not enable the observation of the scene until CS recovery is performed. Due to the high computational complexity of most existing CS recovery algorithms, there is typically a large latency between the acquisition of a scene and its observation. Fast DSS matrices offer an instantaneous visualization of the scene, i.e., they can provide a realtime digital viewfinder; this capability substantially simplifies the setup of an SMC in practice.
Adaptive sensing
The immediate knowledge of the scene—even at a low resolution—is a key enabler for adaptive sensing strategies. For example, one may seek to extract the changes that occur in a scene from one frame to the next or track the locations of moving objects, while avoiding the typically high latency caused by computationally complex CS recovery algorithms.
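The fast preview computation described in this section relies on the fast Walsh–Hadamard transform. A minimal sketch (in Sylvester ordering, matching scipy.linalg.hadamard) is:

```python
import numpy as np
from scipy.linalg import hadamard

# Sketch of the fast (Sylvester-ordered) Walsh-Hadamard transform that makes
# the preview cheap: O(N_L log N_L) operations instead of a dense multiply.
def fwht(a):
    a = a.astype(float).copy()
    h = 1
    while h < len(a):                       # butterfly stages, like an FFT
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

x = np.arange(16, dtype=float)
assert np.allclose(fwht(x), hadamard(16) @ x)   # matches the dense transform
# Preview inversion of y = H z is then simply fwht(y) / len(y).
```

Because the transform is self-inverse up to scaling (H² = N_L I), the same routine serves for both sensing-side simulation and preview recovery, which is what enables a real-time digital viewfinder.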
5.4 Selecting W
Crucial to the design of the DSS matrix is the selection of the window length W. While the best choice of W is often scene-specific, a good rule of thumb is as follows: given an n×n scene, choose W such that the motion of objects is less than n/√W full-resolution pixels in the amount of time required to obtain W measurements. Basically, this restricts the motion in the preview images to about 1 pixel (at the resolution of the preview image).
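The rule of thumb above amounts to simple arithmetic; the scene size, window length, and DMD rate below are hypothetical numbers used only to illustrate the computation.

```python
# A back-of-the-envelope sketch of the rule of thumb for selecting W; the
# DMD rate and the scene parameters are hypothetical, not measurements.
def max_object_speed(n, W, dmd_rate):
    """Largest full-resolution speed (pixels/s) that keeps preview motion
    at or below one preview pixel per preview frame."""
    preview_pixel = n / W ** 0.5        # one preview pixel, in full-res pixels
    frame_time = W / dmd_rate           # seconds needed to collect W samples
    return preview_pixel / frame_time

# Example: 256x256 scene, W = 1024 (32x32 previews), DMD at 10 kHz.
speed = max_object_speed(n=256, W=1024, dmd_rate=10_000)
print(speed)
```

Note the tradeoff baked into the formula: increasing W shrinks the preview pixel and lengthens the frame time simultaneously, so the tolerable object speed falls quickly with W.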
6 Optical-flow-based video recovery
We next detail the second part of CS-MUVI, where we recover the video at high spatial resolution by estimating the motion between frames and enforcing the resulting constraints.
6.1 Optical-flow estimation
Thanks to the preview mode, we can estimate the optical flow between any two (low-resolution) preview frames. For CS-MUVI, we compute optical-flow estimates at full spatial resolution between pairs of upsampled preview frames. For the results in this paper, we used bicubic interpolation to upsample the frames. This approach turns out to yield more accurate optical-flow estimates than first estimating the optical flow at low resolution and then upsampling the flow field. Let $\hat{x}_i$ denote the $i$th upsampled preview frame. The optical-flow constraints between two frames $\hat{x}_i$ and $\hat{x}_j$ can be written as
$$\hat{x}_i(p_x, p_y) = \hat{x}_j(p_x + u_p,\; p_y + v_p),$$
where $(p_x, p_y)$ denotes a pixel in the image plane of $\hat{x}_i$, and $u_p$ and $v_p$ correspond to the translation of the pixel $(p_x, p_y)$ between frames $i$ and $j$ (see [17, 19]).
In practice, the estimated optical flow may contain subpixel translations, i.e., the flow components $u_p$ and $v_p$ are not necessarily integer valued. If this is the case, then we approximate $\hat{x}_j(p_x + u_p, p_y + v_p)$ as a linear combination of its four closest neighboring pixels,
$$\hat{x}_j(p_x + u_p,\, p_y + v_p) \approx \sum_{a=0}^{1} \sum_{b=0}^{1} w_{a,b}\, \hat{x}_j\big(\lfloor p_x + u_p \rfloor + a,\, \lfloor p_y + v_p \rfloor + b\big),$$
where $\lfloor \cdot \rfloor$ denotes rounding towards $-\infty$ and the weights $w_{a,b}$ are chosen according to the location within the four neighboring pixels (i.e., bilinearly). In order to obtain robustness against occlusions, we enforce consistency between the forward and backward optical flows; specifically, we discard optical-flow constraints at pixels where the sum of the forward and backward flows causes a displacement greater than one pixel.
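The subpixel interpolation and the forward-backward consistency check can be sketched as follows (the function names and tolerance parameter are ours, not the paper's):

```python
import math

def bilinear_neighbors(px, py, u, v):
    """Four integer pixels neighboring (px + u, py + v) with bilinear weights."""
    tx, ty = px + u, py + v
    x0, y0 = math.floor(tx), math.floor(ty)   # rounding towards -infinity
    fx, fy = tx - x0, ty - y0                 # fractional offsets in [0, 1)
    return [(x0,     y0,     (1 - fx) * (1 - fy)),
            (x0 + 1, y0,     fx * (1 - fy)),
            (x0,     y0 + 1, (1 - fx) * fy),
            (x0 + 1, y0 + 1, fx * fy)]

def flow_is_consistent(forward, backward, tol=1.0):
    """Keep a constraint only if forward + backward flow displaces <= tol pixels."""
    du, dv = forward[0] + backward[0], forward[1] + backward[1]
    return math.hypot(du, dv) <= tol

weights = bilinear_neighbors(10, 20, 0.25, 0.75)
keep = flow_is_consistent((0.4, 0.0), (-0.3, 0.1))
```

The four weights always sum to one, so the interpolation preserves the local brightness level.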
6.2 Choosing the recovery frame rate
Before we detail the individual steps of the CS-MUVI video-recovery procedure, it is important to specify the rate of the frames to be recovered. When sensing scenes with SMC architectures, there is no obvious notion of frame rate. One candidate is the measurement rate, which in the case of the SPC is the operating rate of the DMD. However, this rate is extremely high and leads to videos whose dimensions are too large for computation to be feasible. Further, each frame would be associated with a single measurement, which leads to a severely ill-conditioned inverse problem. A more useful definition comes from the work of Park and Wakin [26], who argue that the frame rate is not necessarily defined by the measurement rate. Specifically, the spatial bandwidth of the video often places an upper bound on its temporal bandwidth as well. Intuitively, the larger the pixel size (i.e., the smaller the spatial bandwidth), the greater the motion required to register a change in the scene. Hence, given a scene motion in pixels/second, a suitable notion of frame rate is one that ensures subpixel motion between consecutive frames. This notion is more meaningful since it weaves the observability of the motion into the definition of the frame rate. Under this definition, we wish to find the largest window size such that there is virtually no motion at full resolution within each window. In practice, this window size can be estimated by analyzing the preview frames. Hence, the number of full-resolution frames we ultimately recover equals the total number of compressive measurements divided by the window size. Note that a smaller window size decreases the amount of motion associated with each recovered frame; this, however, increases the computational complexity (and memory requirements) substantially, as the number of full-resolution frames to be recovered grows.
Finally, the choice of window size is inherently scene-specific; scenes with fast-moving, highly textured objects require a smaller window than those with slow-moving, smooth objects. The window size could potentially be made time-varying as well and derived from the preview; this showcases the versatility of having the preview and is an important avenue for future research.
6.3 Recovery of full-resolution frames
We are now ready to detail the final stage of CS-MUVI. Assume that the window size is chosen such that there is little to no motion within each preview frame. Next, associate each preview frame with a high-resolution frame to be recovered by grouping the compressive measurements in the immediate vicinity of that frame. Then, compute the optical flow between successive (upscaled) preview frames.
We can now recover the high-resolution video frames as follows. We enforce sparse spatiotemporal gradients using the 3D total variation (TV) norm. We furthermore consider the following two constraints: (i) consistency with the acquired CS measurements, i.e., $y_k = \langle \phi_k, x_{f(k)} \rangle$, where $f(\cdot)$ maps the sample index $k$ to the associated frame index; and (ii) the estimated optical-flow constraints between consecutive frames. Together, we arrive at the following convex optimization problem:
$$(\mathrm{PV})\quad \underset{x}{\text{minimize}}\ \ \mathrm{TV}_{3\mathrm{D}}(x)\quad \text{subject to}\ \ \sum_{k} \big( y_k - \langle \phi_k, x_{f(k)} \rangle \big)^2 \le \varepsilon_1^2 \ \ \text{and}\ \ \sum_{(i,j,\mathbf{p})} \big( x_i(p_x, p_y) - x_j(p_x + u_p, p_y + v_p) \big)^2 \le \varepsilon_2^2,$$
where the second sum runs over all retained optical-flow constraints. This problem can be solved using standard convex-optimization techniques; specifically, we employed variable splitting in combination with ALM/ADMM.
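For concreteness, the 3D total variation norm mentioned above can be sketched as below. This is a minimal anisotropic variant of our own; the exact form used in the recovery may differ:

```python
def tv3d(video):
    """Anisotropic 3D-TV of video[t][y][x]: sum of |forward differences|
    along the two spatial axes and the temporal axis."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    tv = 0.0
    for t in range(T):
        for y in range(H):
            for x in range(W):
                if x + 1 < W:
                    tv += abs(video[t][y][x + 1] - video[t][y][x])
                if y + 1 < H:
                    tv += abs(video[t][y + 1][x] - video[t][y][x])
                if t + 1 < T:
                    tv += abs(video[t + 1][y][x] - video[t][y][x])
    return tv

# A static, constant video has zero TV; moving content increases it.
static = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
moving = [[[0.0, 1.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]]]
```

Penalizing this quantity favors videos that are piecewise smooth in space and change slowly in time, which is exactly the prior that the optical-flow constraints complement.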
The parameters $\varepsilon_1$ and $\varepsilon_2$ in (PV) reflect the measurement noise level and the inaccuracies in brightness constancy, respectively.
The parameter $\varepsilon_1$ captures all sources of measurement noise, including photon, dark, and read noise. Photon noise is signal dependent; however, in an SPC, each measurement is the sum of light from a random selection of half the micromirrors on the DMD. For most natural scenes, we can therefore expect the measurements to be tightly clustered, specifically around one-half of the total light level of the scene, so the photon noise has nearly the same variance across measurements. Hence, for the SPC, all sources of measurement noise can be combined into the single parameter $\varepsilon_1$, which is set via a calibration process. The parameter $\varepsilon_2$ is set based on the thresholds used to detect violations of brightness constancy during optical-flow estimation; for the results in this paper, it is set in proportion to the total number of pixel pairs for which we enforce brightness constancy.

7 Evaluation and Comparisons
In this section, we validate the performance and capabilities of the CS-MUVI framework using simulations. Results on real data obtained from our SPC lab prototype are presented in Sec. 8. All simulation results were generated from high-speed videos; the preview videos have a correspondingly lower spatial resolution. We assume an SPC architecture as described in [8], with parameters chosen to mimic the operation of our lab setup. Noise was added to the compressive measurements using an i.i.d. Gaussian noise model such that the resulting SNR was 60 dB. Optical-flow estimates were extracted using the method described in [19]. The computation time of CS-MUVI is dominated by both optical-flow estimation and solving (PV). Typical runtimes for the entire algorithm are 2–3 hours on an off-the-shelf quad-core CPU. However, computation of the low-resolution preview can be done almost instantaneously.
Video sequences from a high-speed camera
The results shown in Figs. 9 and 10 correspond to scenes acquired by a high-speed (HS) video camera operating at 250 frames per second. Both videos show complex (and fast) movement of large objects as well as severe occlusions. For both sequences, we emulate an SPC by computing compressive measurements directly from the frames of the HS camera. Both recovered videos demonstrate the effectiveness of CS-MUVI.
Comparison with the P2C2 algorithm
In the P2C2 camera [28], a two-step recovery algorithm similar to CS-MUVI is presented. This algorithm is near-identical to CS-MUVI except that the measurement model does not use DSS measurement matrices; hence, an initial recovery using wavelet-sparsity models is used to obtain an initial estimate that plays the role of the preview frames. Figure 11 presents the results of both CS-MUVI and the recovery algorithm of the P2C2 camera [28] at the same number of measurements/compression level. It should be noted that the P2C2 camera algorithm was developed for temporal-multiplexing cameras and not for SMC architectures. Nevertheless, we observe from Figs. 11 (a) and (d) that the naïve wavelet-sparsity recovery delivers significantly worse initial estimates than the preview mode of CS-MUVI. The advantage of CS-MUVI for SMC architectures is also visible in the corresponding optical-flow estimates (see Figs. 11 (b) and (e)). The P2C2 recovery exhibits substantial artifacts, whereas the result of CS-MUVI is visually pleasing. In all, this demonstrates the importance of the DSS matrix and the ability to robustly obtain a preview of the video.
Comparisons against single-image super-resolution
There has been remarkable progress in single-image super-resolution (SR). Figure 12 compares CS-MUVI to a sparse dictionary-based super-resolution algorithm [41]. From our observations, the results produced by super-resolution are comparable to CS-MUVI at modest upsampling factors. However, the best known SR methods seldom produce meaningful results at large super-resolution factors. Our proposed technique is in many ways similar to SR, except that we obtain multiple coded measurements of the scene; this allows us to obtain higher super-resolution factors at a potential loss in temporal resolution.
Performance analysis
Finally, we provide a quantitative evaluation of CS-MUVI for varying compression ratios and input measurement noise levels. Our performance metric is the reconstruction SNR in dB, defined as
$$\mathrm{SNR} = 20 \log_{10} \frac{\| x \|_2}{\| x - \widehat{x} \|_2},$$
where $x$ and $\widehat{x}$ are the ground-truth and estimated videos, respectively. The test data is a 250 fps video of vehicles on a highway; a few frames from this video are shown in Fig. 13(a). We establish a baseline for these results using two different algorithms. First, we consider "Nyquist cameras" that blindly trade off spatial and temporal resolution to achieve the desired compression; for example, a Nyquist camera could deliver full spatial resolution at a proportionally reduced temporal resolution, or a reduced spatial resolution at a correspondingly higher temporal resolution, and so on. This spatiotemporal trade-off is feasible in most traditional imagers by binning pixels at readout. Second, we consider videos recovered using naïve frame-to-frame wavelet priors. For such reconstructions, we optimized over different window lengths of measurements associated with each recovered frame and chose the setting that provided the best results. Figures 13(b) and (c) show the reconstruction SNR of CS-MUVI and the two baseline algorithms for varying levels of compression. At high compression ratios, the performance of CS-MUVI suffers from poor optical-flow estimates. Finally, in Fig. 13(d), we present performance for varying levels of measurement (input) noise. Again, at high noise levels, the optical-flow estimates suffer, leading to poorer reconstructions. In all, CS-MUVI delivers high-quality reconstructions over a wide range of compression and noise levels.
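The reconstruction-SNR metric can be computed in a few lines; the sketch below assumes the standard $\ell_2$-based definition (20 times the log-ratio of signal norm to error norm):

```python
import math

def reconstruction_snr_db(ground_truth, estimate):
    """Both inputs are flat lists of pixel values of equal length."""
    signal = math.sqrt(sum(v * v for v in ground_truth))
    error = math.sqrt(sum((a - b) ** 2 for a, b in zip(ground_truth, estimate)))
    return 20.0 * math.log10(signal / error)

x = [1.0, 2.0, 3.0, 4.0]
x_hat = [1.1, 2.0, 3.0, 4.0]           # one pixel off by 0.1
snr = reconstruction_snr_db(x, x_hat)  # ~34.8 dB
```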
8 Hardware implementation
We now present video recovery results on real data from our SPC lab prototype.
Hardware prototype
The SPC setup we used to image real scenes comprises a DMD operating at 10,000 mirror flips per second. The real measured data was acquired using a SWIR photodetector for the scenes involving the pendulum and a visible photodetector for the rest (the hand and windmill scenes). While the DMD we used is capable of imaging the scene at XGA resolution (i.e., 1024×768 pixels), we operate it at a lower spatial resolution, mainly for two reasons. First, recall that the measurement bandwidth of an SPC is determined by the operating speed of the DMD; in our case, this was 10,000 measurements per second. Even with a large compression factor, our device would be equivalent to a conventional sampler with a correspondingly larger measurement bandwidth, which at 30 frames/sec would still support only a modest number of pixels per frame. Hence, we operate the DMD at a reduced spatial resolution by grouping neighboring mirrors together into superpixels. Second, the patterns displayed on the DMD had to be preloaded onto the memory board attached to the DMD via a USB port. With limited memory (typically 96 GB), any reasonable temporal resolution at XGA resolution would be infeasible on our current SPC prototype. We emphasize that both of these are limitations of the prototype used and not of the underlying algorithms. Recent commercial DMDs can operate orders of magnitude faster [23], and the resulting increase in measurement bandwidth would enable sensing at higher spatial and temporal resolutions.
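The bandwidth argument above can be made concrete with back-of-the-envelope arithmetic. The 10 kHz DMD rate is from our setup, but the compression factor and frame rate below are purely illustrative:

```python
def pixels_per_frame(dmd_rate_hz, compression_factor, fps):
    """Pixels per frame supported by an SPC viewed as an equivalent
    conventional sampler: post-compression samples per second divided
    by the desired frame rate."""
    effective_samples_per_s = dmd_rate_hz * compression_factor
    return effective_samples_per_s // fps

# Example: 10 kHz DMD, hypothetical 100x compression, 30 frames/sec.
n_pixels = pixels_per_frame(10_000, 100, 30)  # ~33,333 pixels, i.e. fewer than 200x200
```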
Gallery of real data results
Figure 14 shows a few example reconstructions from our SPC lab setup. Each video corresponds to a few seconds of measurements from the SPC, and all previews (the top row in each sub-image of Fig. 14) were recovered at the low preview resolution. The supplemental material has videos for each of the results.
Role of different signal priors
Figures 2, 15, and 16 show the performance of three different signal priors on the same set of measurements. In Fig. 2, we compare wavelet sparsity of the individual frames, 3D total variation, and CS-MUVI, which uses optical-flow constraints in addition to the 3D total variation model. CS-MUVI delivers superior performance in recovering both the spatial statistics (the textures in the individual frames) and the temporal statistics (the textures in temporal slices). In Fig. 15, we look at specific frames across a wide gamut of reconstructions where the target motion is very high. Again, we observe that the reconstructions from CS-MUVI are not just free of artifacts; they also resolve spatial features better (the ring on the hand, palm lines, etc.). Finally, for completeness, in Fig. 16, we vary the number of measurements associated with each frame for both 3D total variation and CS-MUVI. Predictably, while the performance of 3D total variation is poor for fast-moving objects, CS-MUVI delivers high-quality reconstructions across a wide range of target motion.
Achieved spatial resolution
Recall that an SMC seeks to super-resolve a low-resolution sensor using optical coding and spatial light modulators. Hence, it is of utmost importance to verify that the device actually delivers the promised improvement in spatial resolution; Figs. 17 and 18 examine this.
In Fig. 17, we present reconstruction results on a resolution chart. The resolution chart was translated so as to enter and exit the field of view of the SPC within 8 seconds. A video was recovered from the resulting measurements at a significant overall compression. Fig. 17 indicates that CS-MUVI recovers spatial detail to per-pixel precision, validating the claimed compression. For this result, we regularized the optical flow to be translational: after estimating the flow between the preview frames, we used the median of the flow vectors as a global translational flow.
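The global-translation step can be sketched as follows (a minimal version of our own; taking the median independently per component is one common robust choice):

```python
from statistics import median

def global_translational_flow(flow_vectors):
    """flow_vectors: list of per-pixel (u, v) estimates -> single global (u, v).

    The median suppresses outlier flow vectors, e.g., from occlusions or
    noisy preview pixels.
    """
    return (median(u for u, _ in flow_vectors),
            median(v for _, v in flow_vectors))

# Usage: three consistent estimates plus one outlier.
flows = [(1.0, 0.1), (1.2, -0.1), (0.9, 0.0), (5.0, 3.0)]
u, v = global_translational_flow(flows)
```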
In Fig. 18, we characterize the spatial resolution achieved by CS-MUVI by comparing it to the image of a static scene obtained using pure Hadamard multiplexing. As expected, we observe that the preview image matches the resolution of the appropriately downsampled static image. Frames recovered by CS-MUVI exhibit sharper texture than the downsampled static frame, but slightly less than the full-resolution static image. Note that this scene contained complex, non-rigid, and fast motion.
Variations in speed, illumination, and size
Finally, we look at performance on real data for varying levels of scene illumination, object speed, and object size. For illumination (Fig. 20), we use the SPC measurement level as a guide to the amount of scene illumination. For object speed (Fig. 19), we instead slow down the DMD, since this indirectly provides finer control over the apparent speed of the object. For size (Fig. 21), we vary the size of the moving target. In all cases, we show the recovered frame corresponding to the object moving at the fastest speed. The performance of CS-MUVI degrades gracefully across all variations. The interested reader is referred to the supplemental material for videos of these results.
9 Discussion
Summary
The promise of an SMC is to deliver high spatial resolution images and videos from a low-resolution sensor. The most extreme form of such SMCs is the SPC, which possesses only a single photodetector, i.e., a sensor with no spatial resolution by itself. In this paper, we demonstrate—for the very first time on real data—successful video recovery at super-resolution for fast-moving scenes. This result has important implications for regimes where high-resolution sensors are prohibitively expensive. An example of this is imaging in SWIR; to this end, we show results using an SPC with a photodetector tuned to this spectral band.
At the heart of our proposed framework is the design of a novel class of sensing matrices and an optical-flow-based video reconstruction algorithm. In particular, we have proposed dual-scale sensing (DSS) matrices that (i) exhibit no noise enhancement when performing least-squares estimation at low spatial resolution and (ii) preserve information about high spatial frequencies. We have developed a DSS matrix having a fast transform, which enables us to compute instantaneous preview images of the scene at low cost. The preview computation supports a large number of novel applications for SMC-based devices, such as providing a digital viewfinder, enabling human-camera interaction, or triggering adaptive sensing strategies.
Limitations
Since CS-MUVI relies on optical-flow estimates obtained from low-resolution images, it can fail to recover small objects with rapid motion. More specifically, moving objects that are of subpixel size in the preview mode are lost. Figure 9 shows an example of this limitation: the cars are moved using fine strings, which are visible in Fig. 9(a) but not in Fig. 9(b). Increasing the spatial resolution of the preview images eliminates this problem at the cost of more motion blur. To avoid these limitations altogether, one must increase the sampling rate of the SMC. In addition, reducing the complexity of solving (PV) is of paramount importance for practical implementations of CS-MUVI.
Faster implementations
The current implementation of CS-MUVI takes on the order of hours for high-resolution videos with a large number of frames. This large runtime can be attributed to the full DSS matrix lacking a fast transform, as well as to the inherent complexity associated with high-resolution signals. Faster implementations of the recovery algorithm are an interesting research direction.
Multiscale preview
A drawback of our approach is the need to specify the resolution at which preview frames are recovered; this requires prior knowledge of object speed. An important direction for future work is to relax this requirement via the construction of multiscale sensing matrices that go beyond the DSS matrices proposed here. The recently proposed sum-to-one (STOne) transform [11] provides such a multiscale sensing matrix. Specifically, the STOne transform is a carefully designed Hadamard transform that remains a Hadamard transform at lower resolutions when downsampled. Using the STOne transform in place of the DSS matrix could potentially provide previews at various spatial resolutions.
Multi-frame optical flow
The majority of the artifacts in the reconstructions stem from inaccurate optical-flow estimates—a result of residual noise in the preview images. It is worth noting, however, that we are using an off-the-shelf optical-flow estimation algorithm; such an approach ignores the continuity of motion across multiple frames. We envision significant performance improvements if we use multi-frame optical-flow estimation [30]. Such an approach could potentially alleviate some of the challenges faced in pairwise optical flow, including the inability to recover precise flow estimates for both slow-moving and fast-moving targets.
Towards high-resolution imagers
The spatial resolution of an SMC is limited by the resolution of its spatial light modulator. Commercially available DMDs, LCDs, and LCoS devices have spatial resolutions of several megapixels. An important direction for future research is the design of imaging architectures, signal models, and recovery algorithms that deliver videos at this spatial resolution (and, say, 30 fps temporal resolution). The key stumbling block for an SPC-based approach is the measurement bandwidth, which for the SPC is limited by the operating rate of the DMD. One approach to increasing the measurement rate is to use a multi-pixel architecture [38, 6, 21]; such imagers can be interpreted as operating each pixel on the sensor as an SPC. Hence, building on the successful demonstration in this paper, megapixel videos could potentially be achieved with a photodetector array. However, the very high dimensionality of the recovered videos raises important computational challenges with regard to the use of optical-flow-based recovery algorithms.
Acknowledgments
ACS was supported by the NSF grant CCF-1117939. LX, YL, and KFK were supported by ONR (N66001-11-1-4090), DARPA KeCoM (#11-DARPA-1055) through Lockheed Martin, and Princeton MIRTHE (NSF EEC #0540832). RGB was supported by the grants NSF CCF-0431150, CCF-0728867, CCF-0926127, CCF-1117939, ARO MURI W911NF-09-1-0383, W911NF-07-1-0185, DARPA N66001-11-1-4090, N66001-11-C-4092, N66001-08-1-2065, ONR N00014-12-1-0124, and AFOSR FA9550-09-1-0432.
References

 [1] M. Ben-Or and P. Tiwari, A deterministic algorithm for sparse multivariate polynomial interpolation, in ACM Symposium on Theory of Computing, 1988, pp. 301–309.
 [2] E. J. Candès, The restricted isometry property and its implications for compressed sensing, Comptes rendus Mathématique, 346 (2008), pp. 589–592.
 [3] E. J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory, 52 (2006), pp. 489–509.

 [4] V. Cevher, A. C. Sankaranarayanan, M. F. Duarte, D. Reddy, R. G. Baraniuk, and R. Chellappa, Compressive sensing for background subtraction, in Euro. Conf. Computer Vision (ECCV), Marseille, France, Oct. 2008.
 [5] A. Chambolle, An algorithm for total variation minimization and applications, J. Mathematical Imaging and Vision, 20 (2004), pp. 89–97.

 [6] H. Chen, M. S. Asif, A. C. Sankaranarayanan, and A. Veeraraghavan, FPA-CS: Focal plane array-based compressive imaging in short-wave infrared, in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, June 2015.
 [7] D. L. Donoho, Compressed sensing, IEEE Trans. Inf. Theory, 52 (2006), pp. 1289–1306.
 [8] M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, Single-pixel imaging via compressive sampling, IEEE Signal Process. Mag., 25 (2008), pp. 83–91.
 [9] J. E. Fowler, S. Mun, E. W. Tramel, M. R. Gupta, Y. Chen, T. Wiegand, and H. Schwarz, Block-based compressed sensing of images and video, Foundations and Trends in Signal Processing, 4 (2010), pp. 297–416.
 [10] M. E. Gehm and D. J. Brady, Compressive sensing in the EO/IR, Appl. Opt., 54 (2015), pp. C14–C22.
 [11] T. Goldstein, L. Xu, K. F. Kelly, and R. G. Baraniuk, The STOne Transform: Multiresolution image enhancement and realtime compressive video, arXiv preprint arXiv:1311.3405, (2013).
 [12] J. Gu, Y. Hitomi, T. Mitsunaga, and S. Nayar, Coded rolling shutter photography: Flexible space-time sampling, in IEEE Intl. Conf. Computational Photography (ICCP), Cambridge, MA, USA, Apr. 2010.
 [13] Z. T. Harmany, R. F. Marcia, and R. M. Willett, Compressive coded aperture keyed exposure imaging with optical flow reconstruction, arXiv preprint arXiv:1306.6281, (2013).
 [14] M. Harwit and N. J. Sloane, Hadamard transform optics, New York: Academic Press, 1979.
 [15] Y. Hitomi, J. Gu, M. Gupta, T. Mitsunaga, and S. K. Nayar, Video from a single coded exposure photograph using a learned overcomplete dictionary, in IEEE Intl. Conf. Computer Vision (ICCV), Barcelona, Spain, Nov. 2011.
 [16] J. Holloway, A. C. Sankaranarayanan, A. Veeraraghavan, and S. Tambe, Flutter shutter video camera for compressive sensing of videos, in IEEE Intl. Conf. Computational Photography (ICCP), Seattle, WA, Apr. 2012.
 [17] B. K. P. Horn and B. G. Schunck, Determining optical flow, Artif. Intel., 17 (1981), pp. 185–203.
 [18] D. Le Gall, MPEG: A video compression standard for multimedia applications, Communications of the ACM, 34 (1991), pp. 46–58.
 [19] C. Liu, Beyond Pixels: Exploring New Representations and Applications for Motion Analysis, PhD thesis, Mass. Inst. Tech., 2009.
 [20] P. Llull, X. Liao, X. Yuan, J. Yang, D. Kittle, L. Carin, G. Sapiro, and D. J. Brady, Coded aperture compressive temporal imaging, Optics express, 21 (2013), pp. 10526–10545.
 [21] A. Mahalanobis, R. Shilling, R. Murphy, and R. Muise, Recent results of medium wave infrared compressive sensing, Appl. Opt., 53 (2014), pp. 8060–8070.
 [22] S. Mun and J. E. Fowler, Residual reconstruction for block-based compressed sensing of video, in Data Compression Conf., Snowbird, UT, USA, Apr. 2011.
 [23] S. G. Narasimhan, S. J. Koppal, and S. Yamazaki, Temporal dithering of illumination for fast active vision, in Euro. Conf. Computer Vision (ECCV), Marseille, France, Oct. 2008.
 [24] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, An iterative regularization method for total variationbased image restoration, Multiscale Modeling and Simulation, 4 (2005), pp. 460–489.
 [25] J. Y. Park and M. B. Wakin, A multiscale framework for compressive sensing of video, in Pict. Coding Symp., Chicago, IL, USA, May 2009.
 [26] J. Y. Park and M. B. Wakin, A multiscale algorithm for reconstructing videos from streaming compressive measurements, Journal of Electronic Imaging, 22 (2013), p. 021001.
 [27] R. Raskar, A. Agrawal, and J. Tumblin, Coded exposure photography: Motion deblurring using fluttered shutter, ACM Trans. Graphics, 25 (2006), pp. 795–804.
 [28] D. Reddy, A. Veeraraghavan, and R. Chellappa, P2C2: Programmable pixel compressive camera for high speed imaging, in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, June 2011.
 [29] I. E. Richardson, H.264 and MPEG-4 video compression: Video coding for next-generation multimedia, John Wiley and Sons, 2004.
 [30] M. Rubinstein, C. Liu, and W. T. Freeman, Towards longer long-range motion trajectories, in British Machine Vision Conf., 2012.
 [31] A. C. Sankaranarayanan, C. Studer, and R. G. Baraniuk, CS-MUVI: Video compressive sensing for spatial-multiplexing cameras, in IEEE Intl. Conf. Computational Photography (ICCP), Seattle, WA, Apr. 2012.
 [32] A. C. Sankaranarayanan, P. Turaga, R. Baraniuk, and R. Chellappa, Compressive acquisition of dynamic scenes, in Euro. Conf. Computer Vision (ECCV), Crete, Greece, Sep. 2010.
 [33] A. C. Sankaranarayanan, P. K. Turaga, R. Chellappa, and R. G. Baraniuk, Compressive acquisition of linear dynamical systems, SIAM Journal on Imaging Sciences, 6 (2013), pp. 2109–2133.
 [34] N. Vaswani, Kalman filtered compressed sensing, in IEEE Conf. Image Processing (ICIP), San Diego, CA, USA, Oct. 2008.
 [35] N. Vaswani and W. Lu, Modified-CS: Modifying compressive sensing for problems with partially known support, IEEE Trans. Signal Processing, 58 (2010), pp. 4595–4607.
 [36] A. Veeraraghavan, D. Reddy, and R. Raskar, Coded strobing photography: Compressive sensing of high speed periodic events, IEEE Trans. Pattern Anal. Mach. Intell., 33 (2011), pp. 671–686.
 [37] M. B. Wakin, J. N. Laska, M. F. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. F. Kelly, and R. G. Baraniuk, Compressive imaging for video representation and coding, in Pict. Coding Symp., Beijing, China, Apr. 2006.
 [38] J. Wang, M. Gupta, and A. C. Sankaranarayanan, LiSens — A scalable architecture for video compressive sensing, in IEEE Conference on Computational Photography (ICCP), Houston, TX, USA, Apr. 2015.
 [39] A. E. Waters, A. C. Sankaranarayanan, and R. G. Baraniuk, SpaRCS: Recovering low-rank and sparse matrices from compressive measurements, in Adv. Neural Information Processing Systems (NIPS), Vancouver, BC, Canada, Dec. 2011.
 [40] L. Xu, A. Sankaranarayanan, C. Studer, Y. Li, R. G. Baraniuk, and K. F. Kelly, Multiscale compressive video acquisition, in Computational Optical Sensing and Imaging, 2013, pp. CW2C–4.
 [41] J. Yang, Z. Wang, Z. Lin, S. Cohen, and T. Huang, Coupled dictionary training for image super-resolution, IEEE Trans. Image Processing, 21 (2012), pp. 3467–3478.
 [42] J. Yang, X. Yuan, X. Liao, P. Llull, D. J. Brady, G. Sapiro, and L. Carin, Video compressive sensing using Gaussian mixture models, IEEE Trans. Image Processing, 23 (2014), pp. 4863–4878.