Generalized Gapped-kmer Filters for Robust Frequency Estimation
In this paper, we study the generalized gapped k-mer filters and derive a closed-form solution for their coefficients. We consider nonnegative integers ℓ and k, with k≤ℓ, and an ℓ-tuple B=(b_1,…,b_ℓ) of integers b_i≥2, i=1,…,ℓ. We introduce and study an incidence matrix A=A_{ℓ,k;B}. We develop a Möbius-like function ν_B, which helps us obtain closed forms for a complete set of mutually orthogonal eigenvectors of A^⊤ A, as well as a complete set of mutually orthogonal eigenvectors of AA^⊤ corresponding to nonzero eigenvalues. The reduced singular value decomposition of A and combinatorial interpretations of the nullity and rank of A are among the consequences of this approach. We then combine the obtained formulas, some results from linear algebra, and combinatorial identities of elementary symmetric functions and ν_B to provide the entries of the Moore-Penrose pseudo-inverse matrix A^+ and the gapped k-mer filter matrix A^+ A.
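The objects named above can be explored numerically on a small instance. The sketch below is illustrative only and rests on an assumption the abstract does not spell out: that A has one row per gapped k-mer (a choice of k of the ℓ positions together with a letter at each chosen position, position i drawing from an alphabet of size b_i), one column per full ℓ-mer, and a 1 wherever the ℓ-mer agrees with the gapped k-mer at the chosen positions. The function name `incidence_matrix` is hypothetical, and the pseudo-inverse is computed numerically with NumPy rather than from the paper's closed forms.

```python
import itertools
import numpy as np

def incidence_matrix(ell, k, B):
    """Build a 0/1 incidence matrix under the ASSUMED definition:
    rows = gapped k-mers, columns = full ell-mers."""
    # All ell-mers: position i takes a letter in range(B[i]).
    seqs = list(itertools.product(*[range(b) for b in B]))
    # All gapped k-mers: k chosen positions plus a letter at each of them.
    gapped = [
        (pos, letters)
        for pos in itertools.combinations(range(ell), k)
        for letters in itertools.product(*[range(B[i]) for i in pos])
    ]
    A = np.zeros((len(gapped), len(seqs)))
    for r, (pos, letters) in enumerate(gapped):
        for c, s in enumerate(seqs):
            # Entry is 1 when the ell-mer matches at every chosen position.
            if all(s[i] == a for i, a in zip(pos, letters)):
                A[r, c] = 1.0
    return A

# Small example: ell = 3, k = 2, alphabet sizes B = (2, 3, 2).
A = incidence_matrix(3, 2, (2, 3, 2))
A_pinv = np.linalg.pinv(A)        # numerical Moore-Penrose pseudo-inverse A^+
F = A_pinv @ A                    # filter matrix A^+ A
rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank

print(A.shape, rank, nullity)
```

Whatever the exact definition of A, the matrix A^+ A is the orthogonal projector onto the row space of A, so it is symmetric and idempotent and its trace equals the rank of A. These properties give a quick sanity check for any closed-form entries against the numerical result.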