
Learning Networked Exponential Families with Network Lasso
The data arising in many important big data applications, ranging from social networks to network medicine, consist of high-dimensional data points related by an intrinsic (complex) network structure. In order to jointly leverage the information conveyed in the network structure as well as the statistical power contained in high-dimensional data points, we propose networked exponential families. We apply the network Lasso to learn networked exponential families as a probabilistic model for heterogeneous datasets with intrinsic network structure. In order to allow for accurate learning from high-dimensional data, we borrow statistical strength, via the intrinsic network structure, across the dataset. The resulting method amounts to regularized empirical risk minimization using the total variation of the model parameters as regularizer. This minimization problem is a nonsmooth convex optimization problem, which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.
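The regularized empirical risk minimization with a total-variation regularizer described above can be sketched on a toy graph. The following is a minimal illustration only, not the paper's method or primal-dual solver: it uses a squared-error loss, a hypothetical 6-node chain graph, and plain subgradient descent instead of primal-dual splitting.

```python
import numpy as np

# Toy chain graph: 6 nodes, edges between consecutive nodes, unit edge weights.
n = 6
edges = [(i, i + 1) for i in range(n - 1)]

# Labels observed only on a subset of nodes (semi-supervised setting).
labeled = {0: 1.0, 1: 1.0, 4: -1.0, 5: -1.0}

lam = 0.5  # regularization strength (illustrative value)


def objective(w):
    """Empirical risk on labeled nodes plus total variation over edges."""
    risk = sum((w[i] - y) ** 2 for i, y in labeled.items())
    tv = sum(abs(w[i] - w[j]) for i, j in edges)
    return risk + lam * tv


def subgrad(w):
    """A subgradient of the nonsmooth objective at w."""
    g = np.zeros(n)
    for i, y in labeled.items():
        g[i] += 2 * (w[i] - y)
    for i, j in edges:
        s = np.sign(w[i] - w[j])
        g[i] += lam * s
        g[j] -= lam * s
    return g


w = np.zeros(n)
for t in range(1, 501):
    w -= (0.1 / np.sqrt(t)) * subgrad(w)  # diminishing step size
```

The TV term pushes the learned parameters to be nearly constant within well-connected clusters, so the unlabeled middle nodes inherit values from their labeled neighbours.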
05/22/2019 ∙ by Alexander Jung, et al.

Semi-Supervised Learning in Network-Structured Data via Total Variation Minimization
We propose and analyze a method for semi-supervised learning from partially labeled, network-structured data. Our approach is based on a graph signal recovery interpretation under a clustering hypothesis: the labels of data points belonging to the same well-connected subset (cluster) are similar in value. This naturally lends itself to learning the labels by total variation (TV) minimization, which we solve by applying a recently proposed primal-dual method for nonsmooth convex optimization. The resulting algorithm allows for a highly scalable implementation using message passing over the underlying empirical graph, which renders the algorithm suitable for big data applications. By applying tools of compressed sensing, we derive a sufficient condition on the underlying network structure such that TV minimization recovers the clusters in the empirical graph of the data. In particular, we show that the proposed primal-dual method amounts to maximizing network flows over the empirical graph of the dataset. Moreover, the learning accuracy of the proposed algorithm is linked to the set of network flows between data points having known labels. The effectiveness and scalability of our approach are verified by numerical experiments.
01/28/2019 ∙ by Alexander Jung, et al.

Localized Linear Regression in Networked Data
The network Lasso (nLasso) has been proposed recently as an efficient learning algorithm for massive networked datasets (big data over networks). It extends the well-known least absolute shrinkage and selection operator (Lasso) from learning sparse (generalized) linear models to network models. Efficient implementations of the nLasso have been obtained using convex optimization methods. These implementations naturally lend themselves to highly scalable message passing methods. In this paper, we analyze the performance of nLasso when applied to localized linear regression problems involving networked data. Our main result is a sufficient condition on the network structure and the available label information such that nLasso accurately learns a localized linear regression model from few labeled data points.
03/26/2019 ∙ by Alexander Jung, et al.

Classifying Partially Labeled Networked Data via Logistic Network Lasso
We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an intrinsic network structure, across the dataset. The resulting logistic network Lasso amounts to a regularized empirical risk minimization problem using the total variation of a classifier as regularizer. This minimization problem is a nonsmooth convex optimization problem, which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.
03/26/2019 ∙ by Nguyen Tran, et al.

Analysis of Network Lasso for Semi-Supervised Regression
We characterize the statistical properties of the network Lasso for semi-supervised regression problems involving network-structured data. This characterization is based on the connectivity properties of the empirical graph, which encodes the similarities between individual data points. Loosely speaking, the network Lasso is accurate if the available label information is well connected to the boundaries between clusters of the network-structured dataset. We make this property precise using the notion of network flows. In particular, the existence of a sufficiently large network flow over the empirical graph implies a network compatibility condition which, in turn, ensures accuracy of the network Lasso.
08/22/2018 ∙ by Alexander Jung, et al.

Classifying Process Instances Using Recurrent Neural Networks
Process mining consists of techniques by which logs created by operational systems are transformed into process models. In process mining tools it is often desired to be able to classify ongoing process instances, e.g., to predict how long the process will still require to complete, or to classify process instances into different classes based only on the activities that have occurred in the process instance thus far. Recurrent neural networks and their subclasses, such as the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), have been demonstrated to be able to learn relevant temporal features for subsequent classification tasks. In this paper we apply recurrent neural networks to classifying process instances. The proposed model is trained in a supervised fashion using labeled process instances extracted from event log traces. To our knowledge, this is the first time GRUs have been used for classifying business process instances. Our main experimental result shows that GRU markedly outperforms LSTM in training time while giving almost identical accuracy. An additional contribution of our paper is improving the classification model training time by filtering out infrequent activities, a technique commonly used, e.g., in Natural Language Processing (NLP).
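For intuition on the recurrence a GRU computes over an activity sequence, here is a minimal numpy sketch of the standard GRU cell equations. This is not the authors' trained model: the dimensions, random weights, and input vectors below are illustrative placeholders, and in practice the final hidden state would feed a classifier over process-instance classes.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_cell(x, h, W, U, b):
    """One step of a standard GRU: update gate z, reset gate r, candidate state."""
    z = sigmoid(W["z"] @ x + U["z"] @ h + b["z"])
    r = sigmoid(W["r"] @ x + U["r"] @ h + b["r"])
    h_tilde = np.tanh(W["h"] @ x + U["h"] @ (r * h) + b["h"])
    return (1 - z) * h + z * h_tilde


d_in, d_h = 4, 8  # input (activity encoding) and hidden state sizes
W = {k: rng.normal(size=(d_h, d_in)) * 0.1 for k in "zrh"}
U = {k: rng.normal(size=(d_h, d_h)) * 0.1 for k in "zrh"}
b = {k: np.zeros(d_h) for k in "zrh"}

# Run a sequence of activity vectors through the recurrence; the final
# hidden state h summarizes the activities seen in the instance so far.
h = np.zeros(d_h)
for x in rng.normal(size=(10, d_in)):
    h = gru_cell(x, h, W, U, b)
```

The GRU has fewer gates (and thus fewer parameters) than an LSTM, which is consistent with the faster training time reported above.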
09/16/2018 ∙ by Markku Hinkka, et al.

On The Complexity of Sparse Label Propagation
This paper investigates the computational complexity of sparse label propagation, which has been proposed recently for processing network-structured data. Sparse label propagation amounts to a convex optimization problem and might be considered an extension of basis pursuit from sparse vectors to network-structured datasets. Using a standard first-order oracle model, we characterize the number of iterations for sparse label propagation to achieve a prescribed accuracy. In particular, we derive an upper bound on the number of iterations required to achieve a certain accuracy and show that this upper bound is sharp for datasets having a chain structure (e.g., time series).
04/25/2018 ∙ by Alexander Jung, et al.

Classifying Big Data over Networks via the Logistic Network Lasso
We apply the network Lasso to solve binary classification (clustering) problems on network-structured data. To this end, we generalize ordinary logistic regression to non-Euclidean data defined over a complex network structure. The resulting logistic network Lasso classifier amounts to solving a nonsmooth convex optimization problem. A scalable classification algorithm is obtained by applying the alternating direction method of multipliers (ADMM) to solve this optimization problem.
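The logistic network Lasso objective combines the logistic loss on labeled nodes with the total variation of the classifier over the empirical graph. The sketch below only writes this objective down and evaluates it on a hypothetical 6-node chain graph (it does not implement the ADMM solver from the paper; all names and values are illustrative).

```python
import numpy as np


def logistic_nlasso_objective(w, labels, edges, lam):
    """Logistic loss on labeled nodes plus total variation over edges."""
    loss = sum(np.log1p(np.exp(-y * w[i])) for i, y in labels.items())
    tv = sum(abs(w[i] - w[j]) for i, j in edges)
    return loss + lam * tv


# Toy chain graph with two clusters and binary labels in {-1, +1}.
edges = [(i, i + 1) for i in range(5)]
labels = {0: +1, 1: +1, 4: -1, 5: -1}
lam = 0.1

# A clustered, piecewise-constant classifier fits this labeling better
# than the all-zeros classifier, despite paying a small TV penalty.
w_flat = np.zeros(6)
w_clustered = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
obj_flat = logistic_nlasso_objective(w_flat, labels, edges, lam)
obj_clustered = logistic_nlasso_objective(w_clustered, labels, edges, lam)
```

The TV penalty is paid only on the single edge crossing the cluster boundary, which is why piecewise-constant classifiers are favored.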
05/07/2018 ∙ by Henrik Ambos, et al.

Graph Signal Sampling via Reinforcement Learning
We formulate the problem of sampling and recovering a clustered graph signal as a multi-armed bandit (MAB) problem. This formulation lends itself naturally to learning sampling strategies using the well-known gradient MAB algorithm. In particular, the sampling strategy is represented as a probability distribution over the individual arms of the MAB and optimized using gradient ascent. Some illustrative numerical experiments indicate that sampling strategies based on the gradient MAB algorithm outperform existing sampling methods.
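The gradient MAB update mentioned above can be sketched as follows. This is the generic gradient-bandit algorithm with a made-up fixed reward per arm, not the paper's actual recovery-error reward; the arm count and reward values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

K = 5  # number of arms (e.g., candidate nodes or sampling sets)
true_reward = np.array([0.1, 0.2, 0.9, 0.3, 0.4])  # illustrative only

H = np.zeros(K)             # arm preferences
baseline, alpha = 0.0, 0.1  # running average reward baseline, step size

for t in range(1, 3001):
    p = np.exp(H - H.max())
    p /= p.sum()                            # softmax sampling distribution
    a = rng.choice(K, p=p)                  # sample an arm
    r = true_reward[a] + 0.1 * rng.normal() # noisy reward for chosen arm
    baseline += (r - baseline) / t
    # Gradient-ascent update on the preferences: chosen arm moves with
    # (1 - p[a]), all other arms move with -p[b].
    H -= alpha * (r - baseline) * p
    H[a] += alpha * (r - baseline)
```

Over the iterations the softmax distribution p concentrates on the arm with the highest expected reward, which is exactly how a sampling strategy is "learned" in this formulation.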
05/15/2018 ∙ by Oleksii Abramenko, et al.

Predicting Electricity Outages Caused by Convective Storms
We consider the problem of predicting power outages in an electrical power grid due to hazards produced by convective storms. These storms produce extreme weather phenomena such as intense wind, tornadoes, and lightning over a small area. In this paper, we discuss the application of state-of-the-art machine learning techniques, such as random forest classifiers and deep neural networks, to predict the amount of damage caused by storms. We cast this application as a classification problem where the goal is to classify storm cells into a finite number of classes, each corresponding to a certain amount of expected damage. The classification method uses as input features estimates of storm cell location and movement, which have to be extracted from the raw data. A main challenge of this application is that the training data are heavily imbalanced, as the occurrence of extreme weather events is rare. In order to address this issue, we applied the SMOTE technique.
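The SMOTE oversampling step can be sketched roughly as follows: synthetic minority samples are created by interpolating between a minority-class point and one of its minority-class neighbours. This is a minimal single-nearest-neighbour variant with made-up 2-D feature vectors, not necessarily the exact configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)


def smote(X_minority, n_new):
    """Create synthetic minority samples by interpolating from a randomly
    chosen minority sample towards its nearest minority neighbour."""
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_minority))
        x = X_minority[i]
        # Nearest neighbour of x within the minority class (excluding x).
        d = np.linalg.norm(X_minority - x, axis=1)
        d[i] = np.inf
        nn = X_minority[np.argmin(d)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(x + gap * (nn - x))
    return np.array(synthetic)


# Rare "high damage" storm cells as illustrative 2-D feature vectors.
X_min = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]])
X_new = smote(X_min, n_new=4)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays within the region of feature space the minority class actually occupies.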
05/21/2018 ∙ by Roope Tervo, et al.

Recovery Conditions and Sampling Strategies for Network Lasso
The network Lasso is a recently proposed convex optimization method for machine learning from massive network-structured datasets, i.e., big data over networks. It is a variant of the well-known least absolute shrinkage and selection operator (Lasso), which underlies many methods in learning and signal processing involving sparse models. Highly scalable implementations of the network Lasso can be obtained by state-of-the-art proximal methods, e.g., the alternating direction method of multipliers (ADMM). By generalizing the compatibility condition put forward by van de Geer and Buehlmann as a powerful tool for the analysis of the plain Lasso, we derive a sufficient condition, the network compatibility condition (NCC), on the underlying network topology such that the network Lasso accurately learns a clustered underlying graph signal. The NCC relates the location of the sampled nodes to the clustering structure of the network. In particular, the NCC informs the choice of which nodes to sample or, in machine learning terms, which data points provide the most information if labeled.
09/03/2017 ∙ by Alexandru Mara, et al.
Alexander Jung
Assistant Professor of Computer Science at Aalto University