IStego100K: Large-scale Image Steganalysis Dataset

11/13/2019
by Zhongliang Yang, et al.

To promote the rapid development of image steganalysis, we construct and release IStego100K, a large-scale, multivariable image steganalysis dataset. It contains 208,104 images, all of size 1024×1024. Of these, 200,000 images (100,000 cover-stego image pairs) form the training set and the remaining 8,104 form the test set. Because we hope IStego100K will help researchers explore universal image steganalysis algorithms, we place as few constraints as possible on its images. For each image, the quality factor is randomly set in the range 75-95, the steganographic algorithm is randomly selected from three well-known algorithms (J-UNIWARD, nsF5 and UERD), and the embedding rate is randomly set to a value in the range 0.1-0.4. In addition, to account for the mismatch between training and test samples that can occur in real environments, we add a test set (DS-Test) whose samples come from a different source than the training set. We hope this test set will help evaluate the robustness of steganalysis algorithms. We evaluate several recent steganalysis algorithms on IStego100K; the results and their analysis are given in the experimental section. We hope that IStego100K will further promote the development of universal image steganalysis technology. The description of IStego100K and instructions for use can be found at https://github.com/YangzlTHU/IStego100K
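The per-image randomization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' generation code. The names `StegoConfig` and `sample_config` are hypothetical, and the abstract does not state the sampling distributions, so the sketch assumes an integer quality factor drawn uniformly from 75 to 95 inclusive, a uniform choice among the three algorithms, and a continuous uniform embedding rate between 0.1 and 0.4. The embedding step itself is not shown.

```python
import random
from dataclasses import dataclass

# The three steganographic algorithms named in the abstract.
ALGORITHMS = ("J-UNIWARD", "nsF5", "UERD")


@dataclass(frozen=True)
class StegoConfig:
    """Hypothetical container for the per-image embedding settings."""
    quality_factor: int    # assumed integer, 75..95 inclusive
    algorithm: str         # one of ALGORITHMS
    embedding_rate: float  # assumed continuous, 0.1..0.4


def sample_config(rng: random.Random) -> StegoConfig:
    """Draw one random configuration under the assumptions stated above."""
    return StegoConfig(
        quality_factor=rng.randint(75, 95),    # randint includes both endpoints
        algorithm=rng.choice(ALGORITHMS),      # uniform over the three algorithms
        embedding_rate=rng.uniform(0.1, 0.4),  # uniform over the stated range
    )


# Seeded generator so the sampled settings are reproducible.
rng = random.Random(0)
configs = [sample_config(rng) for _ in range(1000)]

for cfg in configs[:3]:
    print(cfg)
```

Passing an explicit `random.Random` instance, rather than using the module-level functions, keeps the sampled settings reproducible from a single seed, which helps when a cover image and its stego counterpart need to be regenerated consistently.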

