Vaneet Aggarwal

is this you? claim profile


Assistant Professor with the School of Industrial Engineering at Purdue University

  • Covfefe: A Computer Vision Approach For Estimating Force Exertion

    Cumulative exposure to repetitive and forceful activities may lead to musculoskeletal injuries which not only reduce workers' efficiency and productivity, but also affect their quality of life. Thus, widely accessible techniques for reliable detection of unsafe muscle force exertion levels for human activity is necessary for their well-being. However, measurement of force exertion levels is challenging and the existing techniques pose a great challenge as they are either intrusive, interfere with human-machine interface, and/or subjective in the nature, thus are not scalable for all workers. In this work, we use face videos and the photoplethysmography (PPG) signals to classify force exertion levels of 0%, 50%, and 100% (representing rest, moderate effort, and high effort), thus providing a non-intrusive and scalable approach. Efficient feature extraction approaches have been investigated, including standard deviation of the movement of different landmarks of the face, distances between peaks and troughs in the PPG signals. We note that the PPG signals can be obtained from the face videos, thus giving an efficient classification algorithm for the force exertion levels using face videos. Based on the data collected from 20 subjects, features extracted from the face videos give 90% accuracy in classification among the 100% and the combination of 0% and 50% datasets. Further combining the PPG signals provide 81.7% accuracy. The approach is also shown to be robust to the correctly identify force level when the person is talking, even though such datasets are not included in the training.

    09/25/2018 ∙ by Vaneet Aggarwal, et al. ∙ 8 share

    read it

  • Rank Determination for Low-Rank Data Completion

    Recently, fundamental conditions on the sampling patterns have been obtained for finite completability of low-rank matrices or tensors given the corresponding ranks. In this paper, we consider the scenario where the rank is not given and we aim to approximate the unknown rank based on the location of sampled entries and some given completion. We consider a number of data models, including single-view matrix, multi-view matrix, CP tensor, tensor-train tensor and Tucker tensor. For each of these data models, we provide an upper bound on the rank when an arbitrary low-rank completion is given. We characterize these bounds both deterministically, i.e., with probability one given that the sampling pattern satisfies certain combinatorial properties, and probabilistically, i.e., with high probability given that the sampling probability is above some threshold. Moreover, for both single-view matrix and CP tensor, we are able to show that the obtained upper bound is exactly equal to the unknown rank if the lowest-rank completion is given. Furthermore, we provide numerical experiments for the case of single-view matrix, where we use nuclear norm minimization to find a low-rank completion of the sampled data and we observe that in most of the cases the proposed upper bound on the rank is equal to the true rank.

    07/03/2017 ∙ by Morteza Ashraphijuo, et al. ∙ 0 share

    read it

  • Unsupervised clustering under the Union of Polyhedral Cones (UOPC) model

    In this paper, we consider clustering data that is assumed to come from one of finitely many pointed convex polyhedral cones. This model is referred to as the Union of Polyhedral Cones (UOPC) model. Similar to the Union of Subspaces (UOS) model where each data from each subspace is generated from a (unknown) basis, in the UOPC model each data from each cone is assumed to be generated from a finite number of (unknown) extreme rays.To cluster data under this model, we consider several algorithms - (a) Sparse Subspace Clustering by Non-negative constraints Lasso (NCL), (b) Least squares approximation (LSA), and (c) K-nearest neighbor (KNN) algorithm to arrive at affinity between data points. Spectral Clustering (SC) is then applied on the resulting affinity matrix to cluster data into different polyhedral cones. We show that on an average KNN outperforms both NCL and LSA and for this algorithm we provide the deterministic conditions for correct clustering. For an affinity measure between the cones it is shown that as long as the cones are not very coherent and as long as the density of data within each cone exceeds a threshold, KNN leads to accurate clustering. Finally, simulation results on real datasets (MNIST and YaleFace datasets) depict that the proposed algorithm works well on real data indicating the utility of the UOPC model and the proposed algorithm.

    10/15/2016 ∙ by Wenqi Wang, et al. ∙ 0 share

    read it

  • On Deterministic Conditions for Subspace Clustering under Missing Data

    In this paper we present deterministic conditions for success of sparse subspace clustering (SSC) under missing data, when data is assumed to come from a Union of Subspaces (UoS) model. We consider two algorithms, which are variants of SSC with entry-wise zero-filling that differ in terms of the optimization problems used to find affinity matrix for spectral clustering. For both the algorithms, we provide deterministic conditions for any pattern of missing data such that perfect clustering can be achieved. We provide extensive sets of simulation results for clustering as well as completion of data at missing entries, under the UoS model. Our experimental results indicate that in contrast to the full data case, accurate clustering does not imply accurate subspace identification and completion, indicating the natural order of relative hardness of these problems.

    07/11/2016 ∙ by Wenqi Wang, et al. ∙ 0 share

    read it

  • Information-theoretic Bounds on Matrix Completion under Union of Subspaces Model

    In this short note we extend some of the recent results on matrix completion under the assumption that the columns of the matrix can be grouped (clustered) into subspaces (not necessarily disjoint or independent). This model deviates from the typical assumption prevalent in the literature dealing with compression and recovery for big-data applications. The results have a direct bearing on the problem of subspace clustering under missing or incomplete information.

    08/14/2015 ∙ by Vaneet Aggarwal, et al. ∙ 0 share

    read it

  • Adaptive Sampling of RF Fingerprints for Fine-grained Indoor Localization

    Indoor localization is a supporting technology for a broadening range of pervasive wireless applications. One promis- ing approach is to locate users with radio frequency fingerprints. However, its wide adoption in real-world systems is challenged by the time- and manpower-consuming site survey process, which builds a fingerprint database a priori for localization. To address this problem, we visualize the 3-D RF fingerprint data as a function of locations (x-y) and indices of access points (fingerprint), as a tensor and use tensor algebraic methods for an adaptive tubal-sampling of this fingerprint space. In particular using a recently proposed tensor algebraic framework in [1] we capture the complexity of the fingerprint space as a low-dimensional tensor-column space. In this formulation the proposed scheme exploits adaptivity to identify reference points which are highly informative for learning this low-dimensional space. Further, under certain incoherency conditions we prove that the proposed scheme achieves bounded recovery error and near-optimal sampling complexity. In contrast to several existing work that rely on random sampling, this paper shows that adaptivity in sampling can lead to significant improvements in localization accuracy. The approach is validated on both data generated by the ray-tracing indoor model which accounts for the floor plan and the impact of walls and the real world data. Simulation results show that, while maintaining the same localization accuracy of existing approaches, the amount of samples can be cut down by 71 and 55

    08/10/2015 ∙ by Xiao-Yang Liu, et al. ∙ 0 share

    read it

  • Achieving Approximate Soft Clustering in Data Streams

    In recent years, data streaming has gained prominence due to advances in technologies that enable many applications to generate continuous flows of data. This increases the need to develop algorithms that are able to efficiently process data streams. Additionally, real-time requirements and evolving nature of data streams make stream mining problems, including clustering, challenging research problems. In this paper, we propose a one-pass streaming soft clustering (membership of a point in a cluster is described by a distribution) algorithm which approximates the "soft" version of the k-means objective function. Soft clustering has applications in various aspects of databases and machine learning including density estimation and learning mixture models. We first achieve a simple pseudo-approximation in terms of the "hard" k-means algorithm, where the algorithm is allowed to output more than k centers. We convert this batch algorithm to a streaming one (using an extension of the k-means++ algorithm recently proposed) in the "cash register" model. We also extend this algorithm when the clustering is done over a moving window in the data stream.

    07/26/2012 ∙ by Vaneet Aggarwal, et al. ∙ 0 share

    read it

  • On Deterministic Sampling Patterns for Robust Low-Rank Matrix Completion

    In this letter, we study the deterministic sampling patterns for the completion of low rank matrix, when corrupted with a sparse noise, also known as robust matrix completion. We extend the recent results on the deterministic sampling patterns in the absence of noise based on the geometric analysis on the Grassmannian manifold. A special case where each column has a certain number of noisy entries is considered, where our probabilistic analysis performs very efficiently. Furthermore, assuming that the rank of the original matrix is not given, we provide an analysis to determine if the rank of a valid completion is indeed the actual rank of the data corrupted with sparse noise by verifying some conditions.

    12/05/2017 ∙ by Morteza Ashraphijuo, et al. ∙ 0 share

    read it

  • Tensor Train Neighborhood Preserving Embedding

    In this paper, we propose a Tensor Train Neighborhood Preserving Embedding (TTNPE) to embed multi-dimensional tensor data into low dimensional tensor subspace. Novel approaches to solve the optimization problem in TTNPE are proposed. For this embedding, we evaluate novel trade-off gain among classification, computation, and dimensionality reduction (storage) for supervised learning. It is shown that compared to the state-of-the-arts tensor embedding methods, TTNPE achieves superior trade-off in classification, computation, and dimensionality reduction in MNIST handwritten digits and Weizmann face datasets.

    12/03/2017 ∙ by Wenqi Wang, et al. ∙ 0 share

    read it

  • On the Optimality of Scheduling Dependent MapReduce Tasks on Heterogeneous Machines

    MapReduce is the most popular big-data computation framework, motivating many research topics. A MapReduce job consists of two successive phases, i.e. map phase and reduce phase. Each phase can be divided into multiple tasks. A reduce task can only start when all the map tasks finish processing. A job is successfully completed when all its map and reduce tasks are complete. The task of optimally scheduling the different tasks on different servers to minimize the weighted completion time is an open problem, and is the focus of this paper. In this paper, we give an approximation ratio with a competitive ratio 2(1+(m-1)/D)+1, where m is the number of servers and D> 1 is the task-skewness product. We implement the proposed algorithm on Hadoop framework, and compare with three baseline schedulers. Results show that our DMRS algorithm can outperform baseline schedulers by up to 82%.

    11/27/2017 ∙ by Vaneet Aggarwal, et al. ∙ 0 share

    read it

  • Stochastic Non-preemptive Co-flow Scheduling with Time-Indexed Relaxation

    Co-flows model a modern scheduling setting that is commonly found in a variety of applications in distributed and cloud computing. A stochastic co-flow task contains a set of parallel flows with randomly distributed sizes. Further, many applications require non-preemptive scheduling of co-flow tasks. This paper gives an approximation algorithm for stochastic non-preemptive co-flow scheduling. The proposed approach uses a time-indexed linear relaxation, and uses its solution to come up with a feasible schedule. This algorithm is shown to achieve a competitive ratio of (2m+1)(1+√(mΔ))(1+m√(Δ))(3+Δ)/2 for zero-release times, and (2m+1)(1+√(mΔ))(1+m√(Δ))(2+Δ) for general release times, where Δ represents the upper bound of squared coefficient of variation of processing times, and m is the number of servers.

    02/11/2018 ∙ by Ruijiu Mao, et al. ∙ 0 share

    read it