
Covfefe: A Computer Vision Approach For Estimating Force Exertion
Cumulative exposure to repetitive and forceful activities may lead to musculoskeletal injuries, which not only reduce workers' efficiency and productivity but also affect their quality of life. Thus, widely accessible techniques for reliably detecting unsafe muscle force exertion levels during human activity are necessary for workers' wellbeing. However, measuring force exertion levels is challenging, and the existing techniques are either intrusive, interfere with the human-machine interface, and/or are subjective in nature, and thus do not scale to all workers. In this work, we use face videos and the photoplethysmography (PPG) signals to classify force exertion levels of 0%, 50%, and 100% (representing rest, moderate effort, and high effort), thus providing a non-intrusive and scalable approach. Efficient feature extraction approaches have been investigated, including the standard deviation of the movement of different facial landmarks and the distances between peaks and troughs in the PPG signals. We note that the PPG signals can be obtained from the face videos, thus giving an efficient classification algorithm for the force exertion levels using face videos. Based on the data collected from 20 subjects, features extracted from the face videos give 90% accuracy in classifying the 100% level against the combination of the 0% and 50% datasets. Further combining the PPG signals provides 81.7% accuracy. The approach is also shown to be robust, correctly identifying the force level when the person is talking, even though such datasets are not included in the training.
09/25/2018 ∙ by Vaneet Aggarwal, et al.
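The peak-trough PPG features mentioned above can be sketched as follows. This is a minimal illustration using `scipy.signal.find_peaks`, not the authors' pipeline; the synthetic pulse signal, sampling rate, and feature names are assumptions of this sketch.

```python
import numpy as np
from scipy.signal import find_peaks

def ppg_features(ppg, fs):
    """Toy peak/trough features from a PPG trace (illustrative only)."""
    peaks, _ = find_peaks(ppg)      # local maxima (systolic peaks)
    troughs, _ = find_peaks(-ppg)   # local minima
    # Mean vertical peak-to-trough distance, and a pulse-rate estimate
    # from the mean spacing between consecutive peaks.
    amplitude = float(np.mean(ppg[peaks]) - np.mean(ppg[troughs]))
    rate_hz = fs / float(np.mean(np.diff(peaks)))
    return {"amplitude": amplitude, "rate_hz": rate_hz}

def landmark_std(trajectory):
    """Std. dev. of a facial landmark's (x, y) positions over time."""
    return np.std(np.asarray(trajectory, dtype=float), axis=0)

# A synthetic 1.2 Hz pulse sampled at 100 Hz stands in for a real PPG trace.
t = np.arange(0, 10, 1 / 100)
feats = ppg_features(np.sin(2 * np.pi * 1.2 * t), fs=100)
```

In a real setting the trace would come from averaged pixel intensities of a face region, and the landmark trajectories from a face tracker; the feature vectors would then feed a classifier over the three exertion levels.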

Rank Determination for Low-Rank Data Completion
Recently, fundamental conditions on the sampling patterns have been obtained for finite completability of low-rank matrices or tensors given the corresponding ranks. In this paper, we consider the scenario where the rank is not given, and we aim to approximate the unknown rank based on the locations of the sampled entries and some given completion. We consider a number of data models, including single-view matrix, multi-view matrix, CP tensor, tensor-train tensor, and Tucker tensor. For each of these data models, we provide an upper bound on the rank when an arbitrary low-rank completion is given. We characterize these bounds both deterministically, i.e., with probability one given that the sampling pattern satisfies certain combinatorial properties, and probabilistically, i.e., with high probability given that the sampling probability is above some threshold. Moreover, for both the single-view matrix and the CP tensor, we are able to show that the obtained upper bound is exactly equal to the unknown rank if the lowest-rank completion is given. Furthermore, we provide numerical experiments for the single-view matrix case, where we use nuclear norm minimization to find a low-rank completion of the sampled data and observe that in most cases the proposed upper bound on the rank is equal to the true rank.
07/03/2017 ∙ by Morteza Ashraphijuo, et al.
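A numpy-only sketch of the experiment described above: complete the sampled matrix with a soft-impute style singular-value-thresholding loop (a simple stand-in for nuclear norm minimization), then read off the rank estimate as the number of non-negligible singular values. The parameter values and data sizes here are assumptions, not the paper's setup.

```python
import numpy as np

def soft_impute(X, mask, lam=0.1, iters=200):
    """Iterative singular-value thresholding: a simple surrogate for
    nuclear-norm-minimizing completion of the observed entries."""
    M = np.where(mask, X, 0.0)
    for _ in range(iters):
        filled = np.where(mask, X, M)            # keep observed entries fixed
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        M = (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink singular values
    return M

def estimated_rank(M, tol=1e-6):
    """Rank read-off: count singular values above tol * (largest one)."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))  # true rank 2
mask = rng.random(A.shape) < 0.8                                 # 80% sampled
M = soft_impute(A, mask)
```

The paper's point is that the rank of such a completion, combined with the sampling pattern, bounds the unknown true rank; the thresholding loop above is just one cheap way to produce a candidate completion.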

Unsupervised clustering under the Union of Polyhedral Cones (UOPC) model
In this paper, we consider clustering data that is assumed to come from one of finitely many pointed convex polyhedral cones. This model is referred to as the Union of Polyhedral Cones (UOPC) model. Similar to the Union of Subspaces (UOS) model, where the data in each subspace are generated from an (unknown) basis, in the UOPC model the data in each cone are assumed to be generated from a finite number of (unknown) extreme rays. To cluster data under this model, we consider several algorithms: (a) Sparse Subspace Clustering by Nonnegative constraints Lasso (NCL), (b) Least squares approximation (LSA), and (c) the K-nearest neighbor (KNN) algorithm, each used to arrive at an affinity between data points. Spectral Clustering (SC) is then applied to the resulting affinity matrix to cluster the data into different polyhedral cones. We show that, on average, KNN outperforms both NCL and LSA, and for this algorithm we provide deterministic conditions for correct clustering. For an affinity measure between the cones, it is shown that as long as the cones are not very coherent and the density of data within each cone exceeds a threshold, KNN leads to accurate clustering. Finally, simulation results on real datasets (MNIST and YaleFace) show that the proposed algorithm works well on real data, indicating the utility of the UOPC model and the proposed algorithm.
10/15/2016 ∙ by Wenqi Wang, et al.
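The KNN-affinity-plus-spectral-clustering pipeline can be sketched with scikit-learn. The data here are synthetic points drawn as nonnegative combinations of assumed extreme rays of two well-separated cones, not the MNIST/YaleFace experiments, and the neighbor count is an arbitrary choice.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import SpectralClustering

def uopc_knn_cluster(X, n_clusters, k=5):
    """KNN affinity between data points, then spectral clustering on it."""
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity",
                         include_self=False)
    A = 0.5 * (A + A.T)  # symmetrize the directed KNN graph
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=0)
    return sc.fit_predict(A.toarray())

rng = np.random.default_rng(1)
rays1 = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])  # extreme rays, cone 1
rays2 = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]])  # extreme rays, cone 2
# Nonnegative combinations (coefficients bounded away from 0) of each cone's rays.
X1 = (0.5 + rng.random((40, 2))) @ rays1
X2 = (0.5 + rng.random((40, 2))) @ rays2
labels = uopc_knn_cluster(np.vstack([X1, X2]), n_clusters=2)
```

When the cones are incoherent and each is densely sampled, the KNN graph stays (almost) block-diagonal, which is what makes the spectral step succeed.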

On Deterministic Conditions for Subspace Clustering under Missing Data
In this paper, we present deterministic conditions for the success of sparse subspace clustering (SSC) under missing data, when the data are assumed to come from a Union of Subspaces (UoS) model. We consider two algorithms, which are variants of SSC with entry-wise zero-filling that differ in the optimization problems used to find the affinity matrix for spectral clustering. For both algorithms, we provide deterministic conditions for any pattern of missing data such that perfect clustering can be achieved. We provide extensive sets of simulation results for clustering as well as completion of data at missing entries, under the UoS model. Our experimental results indicate that, in contrast to the full-data case, accurate clustering does not imply accurate subspace identification and completion, indicating the natural order of relative hardness of these problems.
07/11/2016 ∙ by Wenqi Wang, et al.
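A minimal sketch of one such variant, assuming a Lasso-based self-expression step: zero-fill the missing entries, regress each column on the others, and use the absolute coefficients as the affinity. The regularization weight, data model, and observation rate below are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np
from sklearn.linear_model import Lasso

def zf_ssc_affinity(X, mask, alpha=0.01):
    """Zero-fill missing entries of the d x n data matrix X, then express
    each column sparsely in terms of the remaining columns."""
    Z = np.where(mask, X, 0.0)
    n = Z.shape[1]
    C = np.zeros((n, n))
    for j in range(n):
        others = np.delete(Z, j, axis=1)  # exclude the trivial self-representation
        model = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
        model.fit(others, Z[:, j])
        C[np.arange(n) != j, j] = model.coef_
    return np.abs(C) + np.abs(C).T  # symmetric affinity for spectral clustering

rng = np.random.default_rng(2)
u = rng.standard_normal(30); u /= np.linalg.norm(u)   # direction of subspace 1
v = rng.standard_normal(30); v /= np.linalg.norm(v)   # direction of subspace 2
X = np.hstack([np.outer(u, 1 + rng.random(15)),
               np.outer(v, 1 + rng.random(15))])      # 30 x 30, two 1-D subspaces
mask = rng.random(X.shape) < 0.9                      # 90% of entries observed
W = zf_ssc_affinity(X, mask)
```

With most entries observed, the affinity mass should concentrate within each subspace's block, which is the property the deterministic conditions in the paper formalize.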

Information-theoretic Bounds on Matrix Completion under Union of Subspaces Model
In this short note, we extend some of the recent results on matrix completion under the assumption that the columns of the matrix can be grouped (clustered) into subspaces (not necessarily disjoint or independent). This model deviates from the typical assumption prevalent in the literature dealing with compression and recovery for big-data applications. The results have a direct bearing on the problem of subspace clustering under missing or incomplete information.
08/14/2015 ∙ by Vaneet Aggarwal, et al.

Adaptive Sampling of RF Fingerprints for Fine-grained Indoor Localization
Indoor localization is a supporting technology for a broadening range of pervasive wireless applications. One promising approach is to locate users with radio frequency fingerprints. However, its wide adoption in real-world systems is challenged by the time- and manpower-consuming site survey process, which builds a fingerprint database a priori for localization. To address this problem, we model the 3D RF fingerprint data, a function of location (x, y) and access point index, as a tensor, and use tensor algebraic methods for adaptive tubal sampling of this fingerprint space. In particular, using the recently proposed tensor algebraic framework of [1], we capture the complexity of the fingerprint space as a low-dimensional tensor-column space. In this formulation, the proposed scheme exploits adaptivity to identify reference points that are highly informative for learning this low-dimensional space. Further, under certain incoherency conditions, we prove that the proposed scheme achieves bounded recovery error and near-optimal sampling complexity. In contrast to several existing works that rely on random sampling, this paper shows that adaptivity in sampling can lead to significant improvements in localization accuracy. The approach is validated both on data generated by a ray-tracing indoor model, which accounts for the floor plan and the impact of walls, and on real-world data. Simulation results show that, while maintaining the same localization accuracy as existing approaches, the number of samples can be cut down by 71% and 55%, respectively.
08/10/2015 ∙ by Xiao-Yang Liu, et al.
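As a much-simplified matrix analogue of the adaptive scheme described above (the paper works with the tensor-column space of a 3-D fingerprint tensor, not a matrix), one can greedily pick the reference points (columns) with the largest residual energy after projecting out the columns already sampled. Everything below, including the rank-2 synthetic "fingerprint" matrix, is an illustrative assumption.

```python
import numpy as np

def adaptive_column_sampling(M, r):
    """Greedily pick r columns: each step takes the column with the largest
    residual energy after projecting out the span of the picked columns."""
    picked = []
    residual = M.copy()
    for _ in range(r):
        j = int(np.argmax((residual ** 2).sum(axis=0)))
        picked.append(j)
        Q, _ = np.linalg.qr(M[:, picked])   # orthonormal basis of picked columns
        residual = M - Q @ (Q.T @ M)        # project that span out of the data
    return picked

rng = np.random.default_rng(3)
F = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 30))  # rank-2 data
cols = adaptive_column_sampling(F, r=2)
```

The adaptivity is the point: the columns are chosen where the current model explains the data worst, rather than uniformly at random, mirroring the paper's informative-reference-point selection.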

Achieving Approximate Soft Clustering in Data Streams
In recent years, data streaming has gained prominence due to advances in technologies that enable many applications to generate continuous flows of data. This increases the need to develop algorithms that can efficiently process data streams. Additionally, real-time requirements and the evolving nature of data streams make stream mining problems, including clustering, challenging research problems. In this paper, we propose a one-pass streaming soft clustering algorithm (the membership of a point in a cluster is described by a distribution) which approximates the "soft" version of the k-means objective function. Soft clustering has applications in various aspects of databases and machine learning, including density estimation and learning mixture models. We first achieve a simple pseudo-approximation in terms of the "hard" k-means algorithm, where the algorithm is allowed to output more than k centers. We then convert this batch algorithm to a streaming one (using a recently proposed extension of the k-means++ algorithm) in the "cash register" model. We also extend this algorithm to the setting where clustering is done over a moving window in the data stream.
07/26/2012 ∙ by Vaneet Aggarwal, et al.
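The batch building blocks the paper starts from can be sketched as follows: k-means++ seeding to pick the centers, and a soft (Gibbs-style) membership distribution per point. The temperature parameter beta and the synthetic data are assumptions of this sketch, not the paper's streaming algorithm.

```python
import numpy as np

def kmeanspp_seed(X, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to the squared distance to the nearest existing center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def soft_memberships(X, centers, beta=5.0):
    """Soft clustering: each point gets a distribution over the centers,
    decaying exponentially in squared distance."""
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    w = np.exp(-beta * d2)
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centers = kmeanspp_seed(X, 2, rng)
P = soft_memberships(X, centers)
```

The streaming version replaces the seeding pass with an online rule in the cash-register model; the soft-membership step is unchanged in spirit.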

On Deterministic Sampling Patterns for Robust Low-Rank Matrix Completion
In this letter, we study deterministic sampling patterns for the completion of a low-rank matrix corrupted with sparse noise, also known as robust matrix completion. We extend the recent results on deterministic sampling patterns in the absence of noise, which are based on a geometric analysis on the Grassmannian manifold. A special case where each column has a certain number of noisy entries is considered, for which our probabilistic analysis performs very efficiently. Furthermore, assuming that the rank of the original matrix is not given, we provide an analysis to determine whether the rank of a valid completion is indeed the actual rank of the data corrupted with sparse noise, by verifying some conditions.
12/05/2017 ∙ by Morteza Ashraphijuo, et al.

Tensor Train Neighborhood Preserving Embedding
In this paper, we propose Tensor Train Neighborhood Preserving Embedding (TTNPE) to embed multi-dimensional tensor data into a low-dimensional tensor subspace. Novel approaches to solve the optimization problem in TTNPE are proposed. For this embedding, we evaluate the novel trade-offs among classification, computation, and dimensionality reduction (storage) for supervised learning. It is shown that, compared to state-of-the-art tensor embedding methods, TTNPE achieves a superior trade-off in classification, computation, and dimensionality reduction on the MNIST handwritten digits and Weizmann face datasets.
12/03/2017 ∙ by Wenqi Wang, et al.

On the Optimality of Scheduling Dependent MapReduce Tasks on Heterogeneous Machines
MapReduce is the most popular big-data computation framework, motivating many research topics. A MapReduce job consists of two successive phases, i.e., a map phase and a reduce phase. Each phase can be divided into multiple tasks. A reduce task can only start when all the map tasks finish processing. A job is successfully completed when all its map and reduce tasks are complete. Optimally scheduling the different tasks on different servers to minimize the weighted completion time is an open problem, and is the focus of this paper. We give an approximation algorithm with a competitive ratio of 2(1 + (m-1)/D) + 1, where m is the number of servers and D > 1 is the task-skewness product. We implement the proposed algorithm in the Hadoop framework and compare it with three baseline schedulers. Results show that our DMRS algorithm can outperform the baseline schedulers by up to 82%.
11/27/2017 ∙ by Vaneet Aggarwal, et al.
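The precedence structure described above (reduce tasks are gated on the completion of every map task) can be illustrated with a toy list scheduler. This greedy earliest-free-server rule is only a baseline sketch, not the paper's DMRS algorithm, and the task durations are made up.

```python
import heapq

def schedule_job(map_times, reduce_times, m):
    """Assign each task to the earliest-free of m servers; all reduce tasks
    may only start once the last map task has finished."""
    free = [0.0] * m
    heapq.heapify(free)
    for t in map_times:                      # map phase
        heapq.heappush(free, heapq.heappop(free) + t)
    maps_done = max(free)                    # reduce phase is gated on this
    free = [maps_done] * m
    heapq.heapify(free)
    for t in reduce_times:                   # reduce phase
        heapq.heappush(free, heapq.heappop(free) + t)
    return max(free)                         # job completion time

# 2 servers: all maps finish at time 3, then both reduces run in parallel.
completion = schedule_job([3, 2, 1], [2, 2], m=2)
```

The gating step is exactly what makes the problem harder than independent-task scheduling: a single slow map task delays every reduce task, which is why task placement across heterogeneous servers matters for weighted completion time.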

Stochastic Non-preemptive Coflow Scheduling with Time-Indexed Relaxation
Coflows model a modern scheduling setting that is commonly found in a variety of applications in distributed and cloud computing. A stochastic coflow task contains a set of parallel flows with randomly distributed sizes. Further, many applications require non-preemptive scheduling of coflow tasks. This paper gives an approximation algorithm for stochastic non-preemptive coflow scheduling. The proposed approach uses a time-indexed linear relaxation and uses its solution to construct a feasible schedule. This algorithm is shown to achieve a competitive ratio of (2m+1)(1+√(mΔ))(1+m√(Δ))(3+Δ)/2 for zero release times, and (2m+1)(1+√(mΔ))(1+m√(Δ))(2+Δ) for general release times, where Δ is an upper bound on the squared coefficient of variation of the processing times, and m is the number of servers.
02/11/2018 ∙ by Ruijiu Mao, et al.
Vaneet Aggarwal
Assistant Professor with the School of Industrial Engineering at Purdue University