Mapping Energy Landscapes of Non-Convex Learning Problems
In many statistical learning problems, the target functions to be optimized are highly non-convex in various model spaces and are therefore difficult to analyze. In this paper, we compute Energy Landscape Maps (ELMs), which characterize and visualize an energy function with a tree structure in which each leaf node represents a local minimum and each non-leaf node represents the barrier between adjacent energy basins. The ELM also associates each node with the estimated probability mass and volume of the corresponding energy basin. We construct ELMs by adopting the generalized Wang-Landau algorithm and a multi-domain sampler, which simulate a Markov chain that traverses the model space by dynamically reweighting the energy function. We construct ELMs in the model space for two classic statistical learning problems: i) clustering with Gaussian mixture models or Bernoulli templates; and ii) bi-clustering. We propose a way to measure the difficulty (or complexity) of these learning problems and study how various conditions affect the landscape complexity, such as the separability of the clusters, the number of examples, and the level of supervision. We also visualize the behaviors of different algorithms, such as K-means, EM, two-step EM, and Swendsen-Wang cuts, in the energy landscapes.
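To make the tree structure concrete, the following is a minimal sketch of an ELM-like tree for a toy one-dimensional energy function sampled on a grid. It is an illustration only and not the sampler-based construction described above: it enumerates the grid exhaustively instead of using Wang-Landau sampling, it omits the probability mass and volume estimates, and the names `Node`, `local_minima`, and `build_elm` are hypothetical. It assumes a non-empty list of energies with no flat plateaus.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # For a leaf: energy of the local minimum. For an internal node: barrier energy.
    energy: float
    children: List["Node"] = field(default_factory=list)
    index: Optional[int] = None  # grid index of the minimum (leaves only)


def local_minima(energies: List[float]) -> List[int]:
    """Indices of strict local minima on a 1D grid (endpoints may qualify)."""
    n = len(energies)
    minima = []
    for i in range(n):
        left = energies[i - 1] if i > 0 else float("inf")
        right = energies[i + 1] if i < n - 1 else float("inf")
        if energies[i] < left and energies[i] < right:
            minima.append(i)
    return minima


def build_elm(energies: List[float]) -> Node:
    """Build a barrier tree: leaves are minima, internal nodes are barriers.

    Assumption (toy setting): in 1D, the barrier between neighbouring basins
    is the highest energy on the grid segment between their minima.
    """
    minima = local_minima(energies)
    nodes = [Node(energy=energies[i], index=i) for i in minima]
    # barriers[j] separates nodes[j] and nodes[j + 1].
    barriers = [max(energies[a:b + 1]) for a, b in zip(minima, minima[1:])]
    while len(nodes) > 1:
        # Merge the two neighbouring basins separated by the lowest barrier.
        k = min(range(len(barriers)), key=barriers.__getitem__)
        merged = Node(energy=barriers[k], children=[nodes[k], nodes[k + 1]])
        nodes[k:k + 2] = [merged]
        del barriers[k]
    return nodes[0]


root = build_elm([3, 1, 4, 0, 5, 2, 6])
```

For this toy input, the grid has three local minima (at indices 1, 3, and 5). The two left basins merge first at barrier energy 4, and the resulting basin merges with the rightmost one at barrier energy 5, which becomes the root.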