Randomized Empirical Processes and Confidence Bands via Virtual Resampling
Let X, X_1, X_2, ... be independent real-valued random variables with a common distribution function F, and consider {X_1, ..., X_N}, possibly a big concrete data set, or an imaginary random sample of size N ≥ 1 on X. In the latter case, or when a concrete data set in hand is too big to be processed in its entirety, the sample distribution function F_N and the population distribution function F are both to be estimated. In this paper, this is achieved by viewing {X_1, ..., X_N} as a finite population of real-valued random variables with N labeled units, and sampling its indices {1, ..., N} with replacement m_N := ∑_{i=1}^N w_i^{(N)} times, so that for each 1 ≤ i ≤ N, w_i^{(N)} is the number of times the index i of X_i is chosen in this virtual resampling process. This exposition extends the classical Doob-Donsker theory of weak convergence of empirical processes to the randomly weighted empirical processes thus created, when N, m_N → ∞ so that m_N = o(N^2).
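The resampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `resampling_counts` and `weighted_edf` are hypothetical, the standard normal data and the sizes N = 1000, m_N = 200 are arbitrary choices, and the weighted distribution function x ↦ (1/m_N) ∑_{i=1}^N w_i^{(N)} 1{X_i ≤ x} is one natural way to use the counts, not necessarily the exact randomized process studied in the paper.

```python
import random
from bisect import bisect_right
from collections import Counter


def resampling_counts(n, m, rng):
    """Draw m indices from {0, ..., n-1} with replacement; return the counts w_i."""
    draws = Counter(rng.choices(range(n), k=m))
    return [draws.get(i, 0) for i in range(n)]


def weighted_edf(data, weights):
    """Return the function x -> (1/m) * sum_i w_i * 1{X_i <= x}, where m = sum_i w_i."""
    m = sum(weights)
    pairs = sorted(zip(data, weights))
    xs = [x for x, _ in pairs]
    cum = []  # cumulative weights in increasing order of the data values
    total = 0
    for _, w in pairs:
        total += w
        cum.append(total)

    def F(x):
        k = bisect_right(xs, x)  # number of data points <= x
        return cum[k - 1] / m if k > 0 else 0.0

    return F


# Illustrative setup (assumed, not from the paper): N standard normal observations.
rng = random.Random(0)
N, m_N = 1000, 200
data = [rng.gauss(0.0, 1.0) for _ in range(N)]

w = resampling_counts(N, m_N, rng)  # w[i] = number of times index i was drawn
F_w = weighted_edf(data, w)
print(sum(w) == m_N, F_w(0.0))
```

Only the indices are resampled, so the counts can be generated without touching the data, and only the observations with w_i^{(N)} > 0 (at most m_N of them) are needed to evaluate the weighted function. This is what makes the scheme usable when the full data set is too big to process.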