LSSD: a Controlled Large JPEG Image Database for Deep-Learning-based Steganalysis "into the Wild"
For many years, the image databases used in steganalysis have been relatively small, on the order of ten thousand images. This limits the diversity of the images and thus prevents large-scale analysis of steganalysis algorithms. In this paper, we describe a large JPEG database composed of 2 million colour and grey-scale images. This database, named LSSD for Large Scale Steganalysis Database, was built through intensive use of controlled development procedures. LSSD has been made publicly available, and we hope the steganalysis community will use it for large-scale experiments. We introduce the pipeline used to build the various versions of the image database. We detail the general methodology, which can be used to redevelop the entire database and further increase its diversity. We also discuss the computational and storage costs of developing the images.