Hypothesis Testing of One-Sample Mean Vector in Distributed Frameworks

10/06/2021
by Bin Du, et al.

Distributed frameworks are widely used to handle massive data, where the sample size n is very large and the data are often stored on k different machines. For a random vector X∈ℝ^p with expectation μ, testing the mean vector H_0: μ=μ_0 vs H_1: μ≠μ_0 for a given vector μ_0 is a basic problem in statistics. Centralized test statistics incur heavy communication costs, which can be a burden when p or k is large. To reduce the communication cost, this paper proposes distributed test statistics for this problem based on the divide and conquer technique, a commonly used approach for distributed statistical inference. Specifically, we extend two commonly used centralized test statistics, which apply to the low- and high-dimensional cases respectively, to distributed versions. A comparison of the power of the centralized and distributed test statistics shows that there is a fundamental tradeoff between communication cost and the power of the test. This is quite different from the application of the divide and conquer technique in many other problems such as estimation, where the associated distributed statistics can be as good as the centralized ones. Numerical results confirm the theoretical findings.
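The abstract does not state the exact form of the distributed statistics, so the following is only a minimal sketch of the divide and conquer idea in the low-dimensional case, under stated assumptions. It assumes a Hotelling-type statistic n(x̄−μ_0)ᵀS⁻¹(x̄−μ_0) as the centralized test, and it assumes that each of the k machines computes the same statistic on its local block and sends a single scalar to be summed. The function names `hotelling_t2` and `distributed_t2` and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np


def hotelling_t2(x, mu0):
    """Hotelling-type statistic n * (xbar - mu0)' S^{-1} (xbar - mu0) for an (n, p) sample."""
    n = x.shape[0]
    diff = x.mean(axis=0) - mu0
    s = np.atleast_2d(np.cov(x, rowvar=False))  # p x p sample covariance
    return float(n * diff @ np.linalg.solve(s, diff))


def distributed_t2(blocks, mu0):
    """Assumed aggregation rule: sum the local statistics, one scalar per machine.

    Under H_0 with large local sample sizes, each local statistic is
    approximately chi-square with p degrees of freedom, so the sum of k
    independent local statistics is approximately chi-square with k * p.
    """
    p = blocks[0].shape[1]
    total = sum(hotelling_t2(b, mu0) for b in blocks)
    return total, len(blocks) * p


# Simulated data (illustrative only): n observations in p dimensions on k machines.
rng = np.random.default_rng(0)
n, p, k = 6000, 3, 10
mu0 = np.zeros(p)

x_null = rng.normal(size=(n, p))   # H_0 holds: true mean equals mu0
x_alt = x_null + 0.5               # H_1 holds: every coordinate shifted by 0.5

# Centralized statistic uses all data in one place (reference df = p).
t_central_null = hotelling_t2(x_null, mu0)
t_central_alt = hotelling_t2(x_alt, mu0)

# Distributed statistic only needs k scalars (reference df = k * p).
t_dist_null, df_dist = distributed_t2(np.array_split(x_null, k), mu0)
t_dist_alt, _ = distributed_t2(np.array_split(x_alt, k), mu0)

print("centralized:", t_central_null, t_central_alt, "df =", p)
print("distributed:", t_dist_null, t_dist_alt, "df =", df_dist)
```

In this sketch each machine transmits one number instead of a p-dimensional mean and a p×p covariance matrix. The price is that the summed statistic is referred to a chi-square distribution with kp rather than p degrees of freedom while the total signal stays roughly the same, which is one way to see why power can drop as communication is reduced. A p-value could then be obtained from a chi-square survival function, for example `scipy.stats.chi2.sf`.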


