Learning Sparse Additive Models with Interactions in High Dimensions

04/18/2016
by Hemant Tyagi, et al.

A function $f: \mathbb{R}^d \to \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM) if it is of the form $f(x) = \sum_{l \in S} \phi_l(x_l)$, where $S \subset [d]$ and $|S| \ll d$. Assuming the $\phi_l$'s and $S$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM that allows for second-order interaction terms. For some $S_1 \subset [d]$ and $S_2 \subset \binom{[d]}{2}$, the function $f$ is assumed to be of the form $f(x) = \sum_{p \in S_1} \phi_p(x_p) + \sum_{(l,l') \in S_2} \phi_{(l,l')}(x_l, x_{l'})$. Assuming $\phi_p$, $\phi_{(l,l')}$, $S_1$, and $S_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $S_1$ and $S_2$. Consequently, this also enables us to estimate the underlying $\phi_p$ and $\phi_{(l,l')}$. We derive sample complexity bounds for our scheme and extend our analysis to the setting where the queries are corrupted with noise that is either stochastic or arbitrary but bounded. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
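
To make the model form concrete, the following is a minimal Python sketch of a generalized SPAM with a small, made-up choice of components. The dimension `d`, the index sets `S1` and `S2`, the component functions, and the helper `mixed_difference` are all illustrative assumptions, not taken from the paper. The brute-force check over all pairs is not the randomized scheme described above; it only illustrates the elementary fact that a mixed second-order difference in coordinates $(l, l')$ vanishes whenever no term of $f$ couples $x_l$ and $x_{l'}$.

```python
import math
from itertools import combinations

# Hypothetical toy instance: d = 6 variables, indexed from 0.
d = 6
# S1: univariate components phi_p, keyed by coordinate p (assumed for illustration).
S1 = {0: math.sin, 3: lambda t: t ** 2}
# S2: bivariate components phi_(l, l'), keyed by the pair (l, l') (assumed for illustration).
S2 = {(1, 4): lambda u, v: u * v, (2, 4): lambda u, v: math.cos(u + v)}


def f(x):
    """Evaluate f(x) = sum_{p in S1} phi_p(x_p) + sum_{(l,l') in S2} phi_(l,l')(x_l, x_l')."""
    univariate = sum(phi(x[p]) for p, phi in S1.items())
    bivariate = sum(phi(x[l], x[m]) for (l, m), phi in S2.items())
    return univariate + bivariate


def mixed_difference(func, x, l, m, h=0.5):
    """Mixed second-order difference of func at x along coordinates l and m.

    It is zero (up to rounding) when func has no term depending jointly on x_l and x_m.
    """
    def shifted(dl, dm):
        y = list(x)
        y[l] += dl
        y[m] += dm
        return func(y)

    return shifted(h, h) - shifted(h, 0.0) - shifted(0.0, h) + shifted(0.0, 0.0)


# A fixed base point; a single point can miss an interaction whose mixed
# difference happens to vanish there, so this is only a demonstration.
x0 = [0.1 * (i + 1) for i in range(d)]
detected = {
    (l, m)
    for l, m in combinations(range(d), 2)
    if abs(mixed_difference(f, x0, l, m)) > 1e-9
}
print(sorted(detected))
```

This exhaustive check queries $f$ at four points for each of the $\binom{d}{2}$ pairs, which is exactly the kind of cost that becomes prohibitive in high dimensions and that motivates sample-efficient recovery of $S_1$ and $S_2$.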
