
AutoML: A Survey of the State-of-the-Art
Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study of the state-of-the-art in AutoML. First, we introduce the AutoML techniques in detail according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with human-designed models. Finally, we present several open problems for future research.
08/02/2019 ∙ by Xin He, et al.

Simple Physical Adversarial Examples against End-to-End Autonomous Driving Models
Recent advances in machine learning, especially techniques such as deep neural networks, are promoting a range of high-stakes applications, including autonomous driving, which often relies on deep learning for perception. While deep learning for perception has been shown to be vulnerable to a host of subtle adversarial manipulations of images, end-to-end demonstrations of successful attacks, which manipulate the physical environment and result in physical consequences, are scarce. Moreover, attacks typically involve carefully constructed adversarial examples at the level of pixels. We demonstrate the first end-to-end attacks on autonomous driving in simulation, using simple physically realizable attacks: the painting of black lines on the road. These attacks target deep neural network models for end-to-end autonomous driving control. A systematic investigation shows that such attacks are surprisingly easy to engineer, and we describe scenarios (e.g., right turns) in which they are highly effective, and others that are less vulnerable (e.g., driving straight). Further, we use network deconvolution to demonstrate that the attacks succeed by inducing activation patterns similar to entirely different scenarios used in training.
03/12/2019 ∙ by Adith Boloor, et al.
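The physically realizable perturbation studied above — black lines painted on the road — can be sketched in a few lines as a direct operation on the image array. The stripe positions and width below are illustrative choices, not the paper's optimized attack parameters:

```python
import numpy as np

def paint_black_lines(image, xs, width=4):
    """Paint vertical black stripes onto an HxWx3 road image, mimicking
    painted lines on the roadway (positions/width are illustrative)."""
    attacked = image.copy()
    for x in xs:
        attacked[:, x:x + width, :] = 0  # zero out all channels in the stripe
    return attacked

road = np.full((64, 64, 3), 180, dtype=np.uint8)  # uniform gray "road" patch
adv = paint_black_lines(road, xs=[20, 40])
print((adv == 0).all(axis=2).sum())  # number of fully blacked-out pixels
```

In the paper's setting, such an image would then be fed to the end-to-end driving network to measure the induced steering deviation; here the snippet only shows the perturbation itself.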

Symmetry-constrained Rectification Network for Scene Text Recognition
Reading text in the wild is a very challenging task due to the diversity of text instances and the complexity of natural scenes. Recently, the community has paid increasing attention to the problem of recognizing text instances with irregular shapes. One intuitive and effective way to handle this problem is to rectify irregular text into a canonical form before recognition. However, such methods might struggle when dealing with highly curved or distorted text instances. To tackle this issue, we propose in this paper a Symmetry-constrained Rectification Network (ScRN) based on local attributes of text instances, such as center line, scale, and orientation. Such constraints, together with an accurate description of text shape, enable ScRN to generate better rectification results than existing methods and thus lead to higher recognition accuracy. Our method achieves state-of-the-art performance on text with both regular and irregular shapes. Specifically, the system outperforms existing algorithms by a large margin on datasets that contain a considerable proportion of irregular text instances, e.g., ICDAR 2015, SVT-Perspective, and CUTE80.
08/06/2019 ∙ by Mingkun Yang, et al.

Linear-Time Succinct Encodings of Planar Graphs via Canonical Orderings
Let G be an embedded planar undirected graph that has n vertices, m edges, and f faces but has no self-loop or multiple edge. If G is triangulated, we can encode it using (4/3)m - 1 bits, improving on the best previous bound of about 1.53m bits. In case exponential time is acceptable, roughly 1.08m bits have been known to suffice. If G is triconnected, we use at most (2.5 + 2 log 3) min{n, f} - 7 bits, which is at most 2.835m bits and smaller than the best previous bound of 3m bits. Both of our schemes take O(n) time for encoding and decoding.
01/27/2001 ∙ by Xin He, et al.
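As a sanity check on the triconnected-case constant: Euler's formula for connected planar graphs gives n + f = m + 2, hence min{n, f} <= (m + 2)/2, so a bound of (2.5 + 2 log 3) min{n, f} bits is roughly 2.835m bits, matching the figure quoted in the abstract. A quick numeric verification (ours, not from the paper):

```python
import math

# Coefficient from the triconnected-case bound: (2.5 + 2*log2(3)) per vertex/face
coeff = 2.5 + 2 * math.log2(3)

# Euler's formula: n + f = m + 2, so min{n, f} <= (m + 2) / 2.
# Per edge, the bound is therefore roughly coeff / 2 bits.
per_edge = coeff / 2

print(round(coeff, 4))     # ~5.6699 bits per min{n, f} unit
print(round(per_edge, 4))  # ~2.835 bits per edge, matching the stated 2.835m
```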

A Fast General Methodology for Information-Theoretically Optimal Encodings of Graphs
We propose a fast methodology for encoding graphs with information-theoretically minimum numbers of bits. Specifically, a graph with property pi is called a pi-graph. If pi satisfies certain properties, then an n-node m-edge pi-graph G can be encoded by a binary string X such that (1) G and X can be obtained from each other in O(n log n) time, and (2) X has at most beta(n)+o(beta(n)) bits for any continuous superadditive function beta(n) such that there are at most 2^(beta(n)+o(beta(n))) distinct n-node pi-graphs. The methodology is applicable to general classes of graphs; this paper focuses on planar graphs. Examples of such pi include all conjunctions over the following groups of properties: (1) G is a planar graph or a plane graph; (2) G is directed or undirected; (3) G is triangulated, triconnected, biconnected, merely connected, or not required to be connected; (4) the nodes of G are labeled with labels from 1, ..., ell_1 for ell_1 <= n; (5) the edges of G are labeled with labels from 1, ..., ell_2 for ell_2 <= m; and (6) each node (respectively, edge) of G has at most ell_3 = O(1) self-loops (respectively, ell_4 = O(1) multiple edges). Moreover, ell_3 and ell_4 are not required to be O(1) for the case of pi being a plane triangulation. These examples are novel applications of small cycle separators of planar graphs and are the only nontrivial classes of graphs, other than rooted trees, with known polynomial-time information-theoretically optimal coding schemes.
01/23/2001 ∙ by Xin He, et al.
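To make the information-theoretic criterion concrete, consider the rooted trees mentioned at the end of the abstract: the number of rooted ordered trees on n nodes is the Catalan number C(n-1), whose logarithm approaches 2n bits — the length of the classic balanced-parenthesis encoding, which is therefore optimal in this beta(n) sense. A small illustration (ours, not from the paper):

```python
import math

def num_rooted_ordered_trees(n):
    """Rooted ordered trees on n nodes are counted by the Catalan number
    C(n-1) = binom(2n-2, n-1) / n."""
    return math.comb(2 * (n - 1), n - 1) // n

for n in (10, 100, 1000):
    # beta(n) up to lower-order terms: the minimum number of bits needed
    bits_needed = math.log2(num_rooted_ordered_trees(n))
    print(n, round(bits_needed, 1), 2 * n)  # log2(count) approaches 2n from below
```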

Multilayer Nonlinear Processing for Information Privacy in Sensor Networks
A sensor network wishes to transmit information to a fusion center to allow it to detect a public hypothesis, but at the same time prevent it from inferring a private hypothesis. We propose a multilayer nonlinear processing procedure at each sensor to distort the sensor's data before it is sent to the fusion center. In our proposed framework, sensors are grouped into clusters, and each sensor first applies a nonlinear fusion function to the information it receives from sensors in the same cluster and in a previous layer. A linear weighting matrix is then used to distort the information it sends to sensors in the next layer. We adopt a nonparametric approach and develop a modified mirror descent algorithm to optimize the weighting matrices so as to ensure that the regularized empirical risk of detecting the private hypothesis stays above a given privacy threshold, while minimizing the regularized empirical risk of detecting the public hypothesis. Experiments on empirical datasets demonstrate that our approach achieves a good trade-off between the error rates of the public and private hypotheses.
11/13/2017 ∙ by Xin He, et al.
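The mirror-descent machinery can be illustrated with a plain, unmodified entropic update on a toy problem; the simplex-constrained quadratic below is a generic stand-in, not the paper's privacy-constrained objective over weighting matrices:

```python
import numpy as np

def mirror_descent_simplex(grad, x0, steps=2000, lr=0.5):
    """Entropic mirror descent (exponentiated gradient) over the probability
    simplex: multiplicative update followed by renormalization."""
    x = x0.copy()
    for _ in range(steps):
        x = x * np.exp(-lr * grad(x))  # mirror step in the entropic geometry
        x /= x.sum()                   # Bregman projection back onto the simplex
    return x

# Toy objective: f(x) = ||x - target||^2 restricted to the simplex,
# whose minimizer is the target distribution itself.
target = np.array([0.7, 0.2, 0.1])
grad = lambda x: 2 * (x - target)
x_star = mirror_descent_simplex(grad, np.ones(3) / 3)
print(np.round(x_star, 3))  # converges toward the target distribution
```

The paper's "modified" variant additionally enforces that the private-hypothesis risk stays above a threshold; the sketch shows only the unconstrained update geometry.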

Scalable kernel-based variable selection with sparsistency
Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm should be flexible, scalable, and come with theoretical guarantees, yet most existing algorithms cannot attain these properties at the same time. In this article, a three-step variable selection algorithm is developed, involving kernel-based estimation of the regression function and its gradient functions, as well as a hard-thresholding step. Its key advantage is that it makes no explicit model assumption, admits general predictor effects, allows for scalable computation, and attains desirable asymptotic sparsistency. The proposed algorithm can be adapted to any reproducing kernel Hilbert space (RKHS) with different kernel functions, and can be extended to interaction selection with slight modification. Its computational cost is only linear in the data dimension, and can be further improved through parallel computing. The sparsistency of the proposed algorithm is established for general RKHS under mild conditions, including linear and Gaussian kernels as special cases. Its effectiveness is also supported by a variety of simulated and real examples.
02/26/2018 ∙ by Xin He, et al.
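The final hard-thresholding step can be illustrated in a simplified setting: with a linear kernel, the estimated gradient of the regression function reduces to the least-squares slope, and predictors whose gradient norm clears a threshold are kept. The data, threshold, and linear-kernel shortcut below are illustrative, not the paper's general RKHS estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(n)  # only x0, x3 matter

# Stand-in for the RKHS gradient-function estimates: with a linear kernel the
# estimated gradient of the regression function is just the least-squares slope.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
grad_norms = np.abs(beta_hat)  # empirical norm of each partial derivative

tau = 0.5                      # hard threshold (illustrative choice)
selected = np.where(grad_norms > tau)[0]
print(selected)                # -> [0 3]
```

With a nonlinear kernel (e.g., Gaussian), the same recipe applies but the gradient norms come from the fitted RKHS function, which is what allows for general predictor effects.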

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
Driven by deep neural networks and large-scale datasets, scene text detection methods have progressed substantially over the past years, continuously refreshing the performance records on various standard benchmarks. However, limited by the representations (axis-aligned rectangles, rotated rectangles, or quadrangles) adopted to describe text, existing methods may fall short when dealing with much more free-form text instances, such as curved text, which are actually very common in real-world scenarios. To tackle this problem, we propose a more flexible representation for scene text, termed TextSnake, which is able to effectively represent text instances in horizontal, oriented, and curved forms. In TextSnake, a text instance is described as a sequence of ordered, overlapping disks centered at symmetric axes, each of which is associated with a potentially variable radius and orientation. Such geometry attributes are estimated via a Fully Convolutional Network (FCN) model. In experiments, the text detector based on TextSnake achieves state-of-the-art or comparable performance on Total-Text and SCUT-CTW1500, the two newly published benchmarks with special emphasis on curved text in natural images, as well as the widely used datasets ICDAR 2015 and MSRA-TD500. Specifically, TextSnake outperforms the baseline on Total-Text by more than 40%.
07/04/2018 ∙ by Shangbang Long, et al.
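The disk-sequence representation can be sketched as a small data structure: a text instance is a list of (center, radius, orientation) disks along the symmetric axis, from which a polygonal text region can be recovered. The reconstruction below is a simplified illustration, not the paper's FCN post-processing:

```python
import math
from dataclasses import dataclass

@dataclass
class Disk:
    cx: float      # center x on the text's symmetric axis
    cy: float      # center y
    r: float       # local radius (roughly half the text height)
    theta: float   # local orientation, in radians

def snake_to_polygon(disks):
    """Recover a boundary polygon by offsetting each disk center perpendicular
    to its local orientation: top edge forward, bottom edge in reverse."""
    top, bottom = [], []
    for d in disks:
        nx, ny = -math.sin(d.theta), math.cos(d.theta)  # unit normal
        top.append((d.cx + d.r * nx, d.cy + d.r * ny))
        bottom.append((d.cx - d.r * nx, d.cy - d.r * ny))
    return top + bottom[::-1]

# A gently curved, roughly horizontal text instance described by three disks.
snake = [Disk(0, 0, 1, 0.0), Disk(2, 0.3, 1, 0.15), Disk(4, 0.9, 1, 0.3)]
poly = snake_to_polygon(snake)
print(len(poly))  # 6 boundary points for 3 disks
```

Because the radius and orientation vary per disk, the same structure covers horizontal, oriented, and curved text with no change to the representation.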

AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference
The intrinsic error tolerance of neural networks (NNs) makes approximate computing a promising technique for improving the energy efficiency of NN inference. Conventional approximate computing focuses on balancing the efficiency-accuracy trade-off for existing pre-trained networks, which can lead to suboptimal solutions. In this paper, we propose AxTrain, a hardware-oriented training framework to facilitate approximate computing for NN inference. Specifically, AxTrain leverages the synergy between two orthogonal methods: one actively searches for a network parameter distribution with high error tolerance, and the other passively learns resilient weights by numerically incorporating the noise distributions of the approximate hardware into the forward pass during the training phase. Experimental results from various datasets with near-threshold computing and approximate multiplication strategies demonstrate AxTrain's ability to obtain resilient neural network parameters and improve system energy efficiency.
05/21/2018 ∙ by Xin He, et al.
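The "passive" half of the framework — folding the hardware's noise model into the forward pass — can be sketched in a few lines of NumPy. The multiplicative Gaussian noise model below is an assumed stand-in for an approximate multiplier, not AxTrain's actual hardware characterization:

```python
import numpy as np

rng = np.random.default_rng(42)

def noisy_forward(x, w, rel_sigma=0.05):
    """Forward pass of a linear layer whose individual multiplications are
    perturbed by multiplicative Gaussian noise, mimicking an approximate
    multiplier (rel_sigma is an assumed relative-error level)."""
    products = x[:, :, None] * w[None, :, :]           # per-multiplication products
    noise = 1 + rel_sigma * rng.standard_normal(products.shape)
    return (products * noise).sum(axis=1)              # noisy accumulate

x = rng.standard_normal((8, 4))   # batch of 8 inputs, 4 features
w = rng.standard_normal((4, 3))   # 4-in, 3-out weight matrix
clean = x @ w
noisy = noisy_forward(x, w)
print(np.abs(noisy - clean).mean())  # small perturbation around the exact result
```

Training with such a noisy forward pass (while backpropagating as usual) is what lets the learned weights become resilient to the hardware's approximation errors at inference time.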

Scene Text Detection and Recognition: The Deep Learning Era
With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has inevitably been influenced by this wave of revolution, consequently entering the era of deep learning. In recent years, the community has witnessed substantial advancements in mindset, approach, and performance. This survey aims to summarize and analyze the major changes and significant progress of scene text detection and recognition in the deep learning era. Through this article, we aim to: (1) introduce new insights and ideas; (2) highlight recent techniques and benchmarks; (3) look ahead to future trends. Specifically, we emphasize the dramatic differences brought by deep learning and the grand challenges that still remain. We expect this review to serve as a reference for researchers in this field. Related resources are also collected and compiled in our GitHub repository: https://github.com/Jyouhou/SceneTextPapers.
11/10/2018 ∙ by Shangbang Long, et al.

On Measurement of the Spatio-Frequency Property of OFDM Backscattering
Orthogonal frequency-division multiplexing (OFDM) backscatter systems, such as WiFi backscatter, have recently been recognized as a promising technique for IoT connectivity, owing to their ubiquity and low cost. This paper investigates the spatio-frequency property of OFDM backscatter, taking both distance and angle into account across different frequency bands. We deploy three typical scenarios for performing measurements to evaluate the received signal strength of the backscatter link. From the measurement data, we observe the impact of the distances among the transmitter, the tag, and the receiver, as well as of the angle between the transmitter and the tag. The evaluation shows that the best tag location is close to either the receiver or the transmitter, depending on the frequency band, and that the best angle between the transmitter and the receiver is 90 degrees. This work sheds light on the spatial deployment of backscatter tags in different frequency bands, with the aim of improving performance and reducing interference.
10/27/2018 ∙ by Xiaoxue Zhang, et al.