Structure Learning of H-colorings
We study the structure learning problem for graph homomorphisms, commonly referred to as H-colorings, including the weighted case, which corresponds to spin systems with hard constraints. The learning problem is as follows: for a fixed (and known) constraint graph H with q colors and an unknown graph G=(V,E) with n vertices, given uniformly random H-colorings of G, how many samples are required to learn the edges of the unknown graph G? We give a characterization of the constraint graphs H for which the problem is identifiable for every G, i.e., for which we can learn G with an infinite number of samples.

We focus particular attention on the case of proper vertex q-colorings of graphs of maximum degree d, where intriguing connections to statistical physics phase transitions appear. We prove that when q > d the problem is identifiable and we can learn G in poly(d,q) × O(n^2 log n) time. In contrast, for soft-constraint systems such as the Ising model, the best possible running time is exponential in d. When q ≤ d we prove that the problem is not identifiable, and hence we cannot hope to learn G. When q < d − √d + Θ(1) we prove that even learning an equivalent graph (any graph with the same set of H-colorings) is computationally hard: the sample complexity is exponential in n in the worst case. For the q-colorings problem, the threshold for efficient learning thus appears to be connected to the uniqueness/non-uniqueness phase transition at q = d. We explore this connection for general H-colorings and prove that, under a well-known condition in statistical physics known as the Dobrushin uniqueness condition, we can learn G in poly(d,q) × O(n^2 log n) time.
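To make the learning setup concrete, below is a minimal Python sketch of the natural pairwise test this setting suggests for proper q-colorings: adjacent vertices never share a color, while for q > d any non-adjacent pair shares a color with probability bounded away from zero, so a pair is declared an edge only if it disagrees in every sample. The function name and data layout are illustrative assumptions, not necessarily the algorithm used in the paper.

```python
import itertools

def learn_graph_from_colorings(samples, n):
    """Illustrative sketch: recover the edges of an unknown graph G on n
    vertices from uniformly random proper q-colorings of G.

    `samples` is a list of colorings; samples[t][v] is the color assigned
    to vertex v in the t-th sample (an assumed input format).

    Heuristic: in a proper coloring adjacent vertices always disagree, so
    with enough samples every non-edge is eventually witnessed with equal
    colors (when q > d), and only true edges survive the test below.
    """
    edges = set()
    for u, v in itertools.combinations(range(n), 2):
        # Declare {u, v} an edge iff the two vertices disagree in every sample.
        if all(coloring[u] != coloring[v] for coloring in samples):
            edges.add((u, v))
    return edges
```

With m samples this test takes O(m n^2) time, which is consistent in form with the stated poly(d,q) × O(n^2 log n) bound when m = poly(d,q) · O(log n); treating this as the paper's actual procedure is an assumption.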