Large-scale Kernel Methods and Applications to Lifelong Robot Learning
As available datasets grow in size and richness, so do the opportunities for solving increasingly challenging problems with algorithms that learn directly from data. Consequently, the ability of learning algorithms to handle large amounts of data has become a crucial scientific and technological requirement for their practical applicability, and large-scale learning is currently attracting considerable effort in the machine learning community. In this thesis, we focus on kernel methods, a theoretically sound and effective class of learning algorithms yielding nonparametric estimators. Kernel methods, in their classical formulations, are accurate and efficient on datasets of limited size, but they do not scale up in a cost-effective manner. Recent research has shown that approximate learning algorithms with time-memory-accuracy trade-off mechanisms, for instance randomized approximation methods such as Nyström subsampling and random features, offer more scalable alternatives. In this thesis, we analyze the generalization properties and computational requirements of several such approximation schemes. In particular, we expose the tight relationship between statistics and computation, with the goal of tailoring the accuracy of the learning process to the available computational resources. Our results are supported by experiments on large-scale datasets and by numerical simulations. We also study how large-scale learning can enable accurate, efficient, and reactive lifelong learning for robotics. In particular, we propose algorithms that allow robots to learn continuously from experience and to adapt to changes in their operational environment. The proposed methods are validated on the iCub humanoid robot as well as on standard benchmarks.
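As a concrete illustration of the kind of approximation scheme mentioned above, the following minimal sketch uses random Fourier features to approximate a Gaussian kernel within ridge regression, trading a small loss in accuracy for large gains in time and memory. The synthetic dataset, the number of features M, the bandwidth sigma, and the regularization parameter lam are illustrative assumptions, not settings or results from the thesis.

```python
import numpy as np

# Sketch (assumptions throughout): random Fourier features approximating the
# Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 * sigma^2)), followed by
# ridge regression in the approximate feature space.

rng = np.random.default_rng(0)

# Synthetic regression data (assumption: 10,000 points in 5 dimensions).
n, d = 10_000, 5
X = rng.standard_normal((n, d))
y = np.sin(X.sum(axis=1)) + 0.1 * rng.standard_normal(n)

# Random Fourier feature map z(x) = sqrt(2/M) * cos(W^T x + b),
# with columns of W drawn from N(0, 1/sigma^2) and b ~ Uniform[0, 2*pi).
M, sigma = 500, 1.0
W = rng.standard_normal((d, M)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, size=M)

def feature_map(X):
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

Z = feature_map(X)  # n x M feature matrix instead of the n x n kernel matrix

# Ridge regression on the explicit features: roughly O(n M^2 + M^3) time and
# O(n M) memory, versus O(n^3) time and O(n^2) memory for exact kernel ridge
# regression on the full kernel matrix.
lam = 1e-3
alpha = np.linalg.solve(Z.T @ Z + lam * n * np.eye(M), Z.T @ y)

# Prediction on new points reuses the same feature map.
X_test = rng.standard_normal((100, d))
y_pred = feature_map(X_test) @ alpha
```

In this construction, the number of random features M controls the time-memory-accuracy trade-off: larger M yields a better kernel approximation at higher computational cost, which is the mechanism the thesis analyzes to match statistical accuracy to the available computational budget.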