Statistical Relational Artificial Intelligence with Relative Frequencies: A Contribution to Modelling and Transfer Learning across Domain Sizes
Dependencies on the relative frequency of a state in the domain are common when modelling probabilistic dependencies on relational data. For instance, the likelihood of a school closure during an epidemic might depend on the proportion of infected pupils exceeding a threshold. Often, rather than depending on discrete thresholds, dependencies are continuous: for instance, the likelihood of any one mosquito bite transmitting an illness depends on the proportion of carrier mosquitoes. Current approaches usually only consider probabilities over possible worlds rather than over domain elements themselves. We introduce two formalisms that explicitly incorporate relative frequencies into statistical relational artificial intelligence. The first formalism, Lifted Bayesian Networks for Conditional Probability Logic, expresses discrete dependencies on probabilistic data. The second formalism, Functional Lifted Bayesian Networks, expresses continuous dependencies. Incorporating relative frequencies is not only beneficial to modelling; it also provides a more rigorous approach to learning problems where training and test or application domains have different sizes. To this end, we provide a representation of the asymptotic probability distributions induced by the two formalisms on domains of increasing sizes. Since that representation has well-understood scaling behaviour across domain sizes, it can be used to estimate parameters for a large domain consistently from randomly sampled subpopulations.
READ FULL TEXT