Inference on the History of a Randomly Growing Tree

by Harry Crane, et al.

The spread of infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied for many popular models for randomly growing trees such as uniform attachment, linear preferential attachment and uniform attachment on a D-regular tree. For inference of the root under shape-exchangeability, we propose computationally scalable algorithms for constructing confidence sets with valid frequentist coverage as well as bounds on the expected size of the confidence sets. We also provide efficient sampling algorithms that extend our methods to a wide class of inference problems.
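
To make the setting concrete, the following is a minimal sketch of two of the growth models named above (uniform attachment and linear preferential attachment), together with the kind of label-free snapshot from which the history must be inferred. It is an illustration only, not the authors' inference algorithm; the function names `grow_tree` and `snapshot`, the model codes `"ua"` and `"pa"`, and the convention that node 0 is the root are assumptions introduced here.

```python
import random


def grow_tree(n, model="ua", seed=None):
    """Grow a random recursive tree on n nodes (hypothetical helper).

    Returns a list `parent` where parent[0] is None (node 0 is the root)
    and parent[t] is the existing node that node t attached to on arrival.
    """
    rng = random.Random(seed)
    parent = [None]
    degree = [0]  # degree[v] = current number of neighbours of v
    for t in range(1, n):
        if model == "ua":
            # Uniform attachment: every existing node is equally likely.
            p = rng.randrange(t)
        elif model == "pa":
            # Linear preferential attachment: probability proportional to degree.
            # The lone root has degree 0, so the second node attaches to it directly.
            p = 0 if t == 1 else rng.choices(range(t), weights=degree)[0]
        else:
            raise ValueError(f"unknown model: {model!r}")
        parent.append(p)
        degree[p] += 1
        degree.append(1)
    return parent


def snapshot(parent, seed=None):
    """Hide the arrival order (hypothetical helper).

    Relabels the nodes with a uniformly random permutation and shuffles the
    edge list. Returns (edges, root_label), where root_label is the hidden
    ground truth that an inference method would try to cover.
    """
    rng = random.Random(seed)
    n = len(parent)
    relabel = list(range(n))
    rng.shuffle(relabel)
    edges = [(relabel[t], relabel[parent[t]]) for t in range(1, n)]
    rng.shuffle(edges)
    return edges, relabel[0]


if __name__ == "__main__":
    history = grow_tree(10, model="pa", seed=0)
    edges, true_root = snapshot(history, seed=1)
    print("observed edges:", edges)
    print("hidden root label:", true_root)
```

Only `edges` would be available to the analyst; `history` and `true_root` play the role of the latent growth process and the source of the infection.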


Root and community inference on the latent growth process of a network using noisy attachment models

We introduce the PAPER (Preferential Attachment Plus Erdős–Rényi) model ...

Confidence Sets for the Source of a Diffusion in Regular Trees

We study the problem of identifying the source of a diffusion spreading ...

Knowing what you know: valid confidence sets in multiclass and multilabel prediction

We develop conformal prediction methods for constructing valid predictiv...

Correlated randomly growing graphs

We introduce a new model of correlated randomly growing graphs and study...

Persistence of the Jordan center in Random Growing Trees

The Jordan center of a graph is defined as a vertex whose maximum distan...

Tree-Values: selective inference for regression trees

We consider conducting inference on the output of the Classification and...