Statistical methods for modeling spatially-referenced paired genetic relatedness data

09/28/2021
by Joshua L. Warren, et al.

Understanding factors that contribute to the increased likelihood of disease transmission between two individuals is important for infection control. Measures of genetic relatedness of bacterial isolates between two individuals are often analyzed to determine their associations with these factors using simple correlation or regression analyses. However, these standard approaches ignore the potential for correlation in paired data of this type, arising from the presence of the same individual across multiple paired outcomes. We develop two novel hierarchical Bayesian methods for properly analyzing paired genetic relatedness data in the form of patristic distances and transmission probabilities. Using individual-level spatially correlated random effect parameters, we account for multiple sources of correlation in the outcomes as well as other important features of their distribution. Through simulation, we show that the standard analyses drastically underestimate uncertainty in the associations when correlation is present in the data, leading to incorrect conclusions regarding the covariates of interest. Conversely, the newly developed methods perform well across a range of correlation levels, including when the data are uncorrelated. All methods are applied to Mycobacterium tuberculosis data from the Republic of Moldova where we identify factors associated with disease transmission and, through analysis of the random effect parameters, key individuals and areas with increased transmission activity. Model comparisons show the importance of the new methodology in this setting. The methods are implemented in the R package GenePair.
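
The claim that standard regression underestimates uncertainty in paired data can be illustrated with a small toy simulation. The sketch below is not the authors' model and does not use the GenePair package. It assumes a deliberately simple setup: each pair outcome is y_ij = beta1 * x_ij + theta_i + theta_j + eps_ij, with independent (non-spatial) individual effects theta_i and a pair covariate x_ij = z_i + z_j built from an individual-level covariate z. All function names, parameter names, and default values are hypothetical. It compares the naive ordinary least squares standard error, which treats pairs as independent, with the actual spread of the slope estimate across replicates.

```python
import numpy as np


def simulate_once(rng, n=30, beta1=0.0, sigma_theta=1.0, sigma_eps=1.0):
    """Fit naive OLS to one simulated set of paired outcomes.

    Returns the slope estimate and its naive standard error, i.e. the
    standard error computed as if all pairs were independent.
    """
    i, j = np.triu_indices(n, k=1)                  # all unordered pairs i < j
    z = rng.normal(size=n)                          # individual-level covariate (assumed)
    theta = rng.normal(scale=sigma_theta, size=n)   # individual random effects (assumed i.i.d.)
    x = z[i] + z[j]                                 # pair-level covariate (assumed form)
    eps = rng.normal(scale=sigma_eps, size=i.size)  # pair-level noise
    # Each individual's effect enters every pair that individual belongs to,
    # so outcomes that share an individual are correlated.
    y = beta1 * x + theta[i] + theta[j] + eps

    X = np.column_stack([np.ones_like(x), x])       # intercept + slope design
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2_hat = resid @ resid / (y.size - X.shape[1])
    naive_cov = sigma2_hat * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(naive_cov[1, 1])


def compare_uncertainty(n_reps=300, seed=0, beta1=0.0, **kwargs):
    """Compare the naive SE with the empirical spread of the slope estimate."""
    rng = np.random.default_rng(seed)
    draws = np.array(
        [simulate_once(rng, beta1=beta1, **kwargs) for _ in range(n_reps)]
    )
    est, naive_se = draws[:, 0], draws[:, 1]
    covered = np.abs(est - beta1) <= 1.96 * naive_se  # nominal 95% interval
    return {
        "empirical_sd": float(est.std(ddof=1)),
        "mean_naive_se": float(naive_se.mean()),
        "naive_coverage": float(covered.mean()),
    }


if __name__ == "__main__":
    print("correlated pairs:  ", compare_uncertainty())
    print("independent pairs: ", compare_uncertainty(sigma_theta=0.0))
```

Under these assumptions, the naive standard error should come out noticeably smaller than the empirical standard deviation of the estimates when sigma_theta is positive, and the two should roughly agree when sigma_theta is zero. The methods described in the abstract go further than this sketch by modeling the individual-level effects explicitly and allowing them to be spatially correlated.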
