Adaptive random Fourier features with Metropolis sampling
The supervised learning problem of determining a neural network approximation $\mathbb{R}^d \ni x \mapsto \sum_{k=1}^{K} \hat\beta_k e^{i\omega_k \cdot x}$ with one hidden layer is studied as a random Fourier features algorithm. The Fourier features, i.e., the frequencies $\omega_k \in \mathbb{R}^d$, are sampled using an adaptive Metropolis sampler. The Metropolis test accepts proposal frequencies $\omega_k'$, with corresponding amplitudes $\hat\beta_k'$, with probability $\min\{1, (|\hat\beta_k'|/|\hat\beta_k|)^{\gamma}\}$, where the positive parameter $\gamma$ is determined by minimizing the approximation error for given computational work. This adaptive, non-parametric stochastic method leads asymptotically, as $K \to \infty$, to equidistributed amplitudes $|\hat\beta_k|$, analogous to deterministic adaptive algorithms for differential equations. The equidistributed amplitudes are shown to correspond asymptotically to the optimal density for independent samples in random Fourier features methods. Numerical evidence is provided to demonstrate the approximation properties and efficiency of the proposed algorithm, which is tested both on synthetic data and on a real-world high-dimensional benchmark.
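To make the Metropolis test concrete, the following is a minimal sketch of one way the sampling loop could be organized. Only the acceptance probability $\min\{1, (|\hat\beta_k'|/|\hat\beta_k|)^{\gamma}\}$ is taken from the abstract. Everything else is an assumption made for illustration: the Gaussian random-walk proposal with step size `delta`, the Tikhonov-regularized least-squares fit with parameter `lam`, the zero initialization of the frequencies, the default values, and the function names `fit_amplitudes`, `predict`, and `adaptive_rff`.

```python
import numpy as np


def fit_amplitudes(x, y, omega, lam=1e-6):
    """Regularized least-squares amplitudes for fixed frequencies (assumed fit)."""
    S = np.exp(1j * (x @ omega.T))  # feature matrix, shape (N, K)
    A = S.conj().T @ S + lam * x.shape[0] * np.eye(omega.shape[0])
    return np.linalg.solve(A, S.conj().T @ y)


def predict(x, omega, beta):
    """Real part of sum_k beta_k * exp(i * omega_k . x)."""
    return np.real(np.exp(1j * (x @ omega.T)) @ beta)


def adaptive_rff(x, y, K=32, gamma=3.0, delta=0.5, n_iter=200, lam=1e-6, seed=0):
    """Sketch of adaptive frequency sampling; all defaults are illustrative."""
    rng = np.random.default_rng(seed)
    omega = np.zeros((K, x.shape[1]))  # assumed initialization
    beta = fit_amplitudes(x, y, omega, lam)
    for _ in range(n_iter):
        # Assumed proposal: Gaussian random walk around the current frequencies.
        omega_prop = omega + delta * rng.standard_normal(omega.shape)
        beta_prop = fit_amplitudes(x, y, omega_prop, lam)
        # Metropolis test per frequency: accept with probability
        # min(1, (|beta'_k| / |beta_k|)^gamma), written without a division.
        u = rng.uniform(size=K)
        accept = u * np.abs(beta) ** gamma < np.abs(beta_prop) ** gamma
        omega[accept] = omega_prop[accept]
        beta[accept] = beta_prop[accept]
    # Refit the amplitudes once for the final set of frequencies.
    beta = fit_amplitudes(x, y, omega, lam)
    return omega, beta
```

Calling `adaptive_rff(x, y)` with `x` of shape `(N, d)` and real targets `y` of shape `(N,)` returns the sampled frequencies and their amplitudes; `predict(x, omega, beta)` then evaluates the fitted sum. Comparing `u * |beta|^gamma` against `|beta'|^gamma` avoids dividing by amplitudes that may be close to zero while giving the same acceptance probability.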