Tracking Multiple Audio Sources with the von Mises Distribution and Variational EM
In this paper, we address the problem of simultaneously tracking several audio sources, that is, estimating source trajectories from a sequence of observed features. We propose to use the von Mises distribution to model audio-source directions of arrival (DOAs) with circular random variables. This leads to a multi-target Kalman filter formulation that is intractable because of the combinatorial explosion of associating observations to state variables over time. We propose a variational approximation of the filter's posterior distribution and derive a computationally efficient variational expectation maximization (VEM) algorithm. We also propose an audio-source birth method that favors smooth source trajectories and is used both to initialize the number of active sources and to detect new sources. We perform experiments with a recently released dataset comprising several moving sources as well as a moving microphone array.
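To illustrate why a circular distribution suits DOAs, the following is a minimal sketch of the standard von Mises density, p(theta; mu, kappa) = exp(kappa cos(theta - mu)) / (2 pi I0(kappa)), where mu is the mean direction and kappa the concentration. It is not the authors' implementation: the function names (`bessel_i0`, `von_mises_pdf`, `wrap_angle`) and the concentration value are assumptions chosen for illustration, and the Bessel function is approximated by a truncated power series that is only adequate for moderate kappa.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0.

    Truncated power series: sum_k (x/2)^(2k) / (k!)^2.
    Adequate for moderate x (illustrative assumption, not a general-purpose routine).
    """
    total = 0.0
    term = 1.0  # k = 0 term
    for k in range(1, terms + 1):
        total += term
        term *= (x / 2.0) ** 2 / (k * k)
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density of angle theta (radians), mean mu, concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def wrap_angle(theta):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))


# Hypothetical DOA example: directions of 179 and -179 degrees are only
# 2 degrees apart on the circle, although their raw values differ by 358.
kappa = 4.0  # illustrative concentration, not a value from the paper
mu = math.radians(-179.0)
near = von_mises_pdf(math.radians(179.0), mu, kappa)
far = von_mises_pdf(math.radians(0.0), mu, kappa)
print(near > far)
```

Because the density depends on theta only through cos(theta - mu), it is periodic by construction, so no explicit angle unwrapping is needed when evaluating it; `wrap_angle` is only a convenience for reporting angles in a fixed range.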