Sparsity-based audio declipping methods: overview, new algorithms, and large-scale evaluation
Recent advances in audio declipping have substantially improved the state of the art in certain saturation regimes. Yet practitioners need guidelines to choose a method, and while existing benchmarks have been instrumental in advancing the field, larger-scale experiments are needed to guide such choices. First, we show that the saturation levels in existing small-scale benchmarks are moderate, and we call for benchmarks with more perceptually significant saturation levels. We then propose a general algorithmic framework for declipping that covers existing and new combinations of the flavors of state-of-the-art techniques exploiting time-frequency sparsity: synthesis vs. analysis sparsity, with plain or structured sparsity. Finally, we systematically compare these combinations against state-of-the-art methods. Using a large-scale numerical benchmark and a smaller-scale formal listening test, we provide guidelines for various saturation levels, both for speech and for various musical "genres" from the RWC database. The code is made publicly available for the purposes of reproducible research and benchmarking.
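To make the sparsity-based approach concrete, the following is a minimal illustrative sketch (not the paper's actual framework) of a SPADE-style declipping iteration in the synthesis-sparsity flavor: alternate between hard-thresholding the signal in a time-frequency transform (here, a plain DFT for simplicity) and projecting onto the clipping-consistency set, while gradually relaxing the sparsity budget. The function name, the use of the DFT instead of a redundant Gabor frame, and the parameter choices are all assumptions made for illustration.

```python
import numpy as np

def declip_sparse(y, clip_level, n_iter=50, k_step=1):
    """Illustrative SPADE-flavored declipper (synthesis sparsity sketch).

    y          : clipped signal (clipping level assumed known and symmetric)
    clip_level : saturation threshold
    n_iter     : number of thresholding/projection iterations
    k_step     : increment of the sparsity budget per iteration
    """
    n = len(y)
    reliable = np.abs(y) < clip_level   # samples untouched by clipping
    clip_pos = y >= clip_level          # samples saturated at +clip_level
    clip_neg = y <= -clip_level         # samples saturated at -clip_level

    x = y.copy()
    k = k_step                          # current sparsity budget
    for _ in range(n_iter):
        # Sparsify: keep only the k largest DFT coefficients
        X = np.fft.fft(x)
        if k < n:
            X[np.argsort(np.abs(X))[:-k]] = 0
        x = np.real(np.fft.ifft(X))
        # Project onto the clipping-consistency set: reliable samples must
        # match the observations; saturated samples must lie beyond the level
        x[reliable] = y[reliable]
        x[clip_pos] = np.maximum(x[clip_pos], clip_level)
        x[clip_neg] = np.minimum(x[clip_neg], -clip_level)
        k += k_step                     # gradually relax sparsity
    return x
```

The analysis-sparsity flavor would instead threshold the transform coefficients of the current estimate and solve a constrained inversion; structured (e.g., group) sparsity replaces the element-wise hard threshold with a group-wise one. A typical use: clip a test signal at a known level, run the loop, and compare reconstruction error against the clipped observation.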