Photographic dataset: random peppercorns

by Teemu Helenius, et al.

This is a photographic dataset collected for testing image processing algorithms. The idea is to have sets of different but statistically similar images. In this work the images show randomly distributed peppercorns. The dataset is made available at .




1. Introduction

This document reports the acquisition, structure and properties of a digital photographic dataset collected at the Industrial Mathematics Laboratory of the Department of Mathematics and Statistics of the University of Helsinki, Finland.

The idea is to have sets of different but statistically similar images that show objects of the same size scale.

The collected dataset is intended to be ideal for computational approaches based on sparse patch-based dictionaries [1]. Similar images have already been used in such contexts, see [2].

2. Materials and Methods

2.1. Camera equipment

We use a Phase One XF medium-format camera equipped with an achromatic IQ260 digital back. The lens is a Phase One Digital AF 120mm F4. The pixel size in the resulting 16-bit TIFF image file is .

2.2. Lighting

The targets were lit with five Olight X6 Marauder LED flashlights with a luminous flux of 5000 lm (nominal value; 4825 lm was measured in the vendor's laboratory). The lights were positioned in a roughly equiangular arrangement. The distance of each light from the target was 1.0–1.2 m. The lights heated up quickly because they were used at maximum power. Cooling was enhanced with three regular household fans (Merox Floor Fan FE-45A).

A diffuser was placed between the lights and the target to make the lighting more uniform and to reduce sharp shadows.

See Figure 1 for the imaging setup.

Figure 1. The imaging setup.

2.3. Size and scale considerations

How many objects should be visible in the image?

Here are the relevant numbers: the diameter of the approximately round peppercorns is 4–5 mm. The patch sizes we have in mind for image processing applications are between 8 × 8 (used in JPEG coding) and . The full image size is pixels.

The relationship between patch size and object size should be chosen carefully. One option is to make the smallest patch size (8 × 8 pixels) roughly equal to the size of the smaller peppercorns, namely 4 mm. A larger patch would then contain several objects. This choice leads to two pixels per millimeter, which means that the full image would cover a target area of roughly 4.5 m by 3.4 m. Covering such an area would require an impractically large number of peppercorns.
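The arithmetic behind this first option can be sketched as follows. This is only an illustration under stated assumptions: the smallest patch is taken to be 8 × 8 pixels (the JPEG block size), and the full image size of about 9000 × 6800 pixels is a rough value back-computed from the 4.5 m by 3.4 m target area, not a figure taken from the camera specification. All names are hypothetical.

```python
# Sketch of the scale arithmetic for the first (rejected) option.
# All constants are assumptions chosen to match the numbers in the text.
PATCH_PX = 8              # smallest patch side in pixels (JPEG block size)
PEPPERCORN_MM = 4.0       # size of the smaller peppercorns in millimeters
IMAGE_PX = (9000, 6800)   # ASSUMED approximate full image size (width, height)


def pixels_per_mm(object_px, object_mm):
    """Resolution at which an object of object_mm spans object_px pixels."""
    return object_px / object_mm


def target_size_m(image_px, px_per_mm):
    """Physical (width, height) in meters covered by the full image."""
    return tuple(side / px_per_mm / 1000.0 for side in image_px)


scale = pixels_per_mm(PATCH_PX, PEPPERCORN_MM)
width_m, height_m = target_size_m(IMAGE_PX, scale)
print(f"{scale:.1f} px/mm -> {width_m:.1f} m x {height_m:.1f} m")
```

Under these assumptions the sketch reproduces the two pixels per millimeter and the several-meter target area that make this option impractical.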

However, we cannot predict all possible future uses of this open dataset. In some studies it may be important that several patches fit completely inside one peppercorn. Therefore, we choose the scale so that, after downsampling the images by a factor of four, the image of a peppercorn has a diameter of approximately 8 pixels. This means that we need to cover about one square meter with objects; see Figure 2.
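A minimal sketch of how the chosen scale leads to roughly one square meter is given below. It assumes a peppercorn size of 4 mm and a full image of about 9000 × 6800 pixels; the latter is an approximate, assumed value rather than one quoted from the source, and the function name is hypothetical.

```python
# Sketch of the chosen scale. Constants are assumptions matching the text.
DOWNSAMPLE = 4             # downsampling factor mentioned in the text
DIAMETER_PX_SMALL = 8      # peppercorn diameter in the downsampled image
PEPPERCORN_MM = 4.0        # ASSUMED physical size of a small peppercorn
IMAGE_PX = (9000, 6800)    # ASSUMED approximate full image size (width, height)


def covered_area(image_px, px_per_mm):
    """Return (width_m, height_m, area_m2) covered by the full image."""
    width_m = image_px[0] / px_per_mm / 1000.0
    height_m = image_px[1] / px_per_mm / 1000.0
    return width_m, height_m, width_m * height_m


# At full resolution a peppercorn must span 8 * 4 = 32 pixels.
diameter_px_full = DIAMETER_PX_SMALL * DOWNSAMPLE
px_per_mm = diameter_px_full / PEPPERCORN_MM
width_m, height_m, area_m2 = covered_area(IMAGE_PX, px_per_mm)
print(f"{px_per_mm:.0f} px/mm, area of about {area_m2:.2f} m^2")
```

With these assumed values the resolution is 8 pixels per millimeter and the covered area comes out slightly below one square meter, in line with the estimate above.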

A box of size m was filled with peppercorns, whereas the photographed area was about m; see Figure 3.

Figure 2. The dataset scale.

Figure 3. The photographed area with a tape measure.

3. Results

The lens–target distance was 150 cm. The image sensor was roughly parallel to the surface of the object layer. The aperture was f/11, the ISO setting 200, and the shutter speed was set to .

The dataset consists of 10 photos in total; see Figure 4. Between photos, the peppercorns were carefully reshuffled by hand. Although care was taken to prevent the bottom of the container from being visible through the peppercorns, the possibility of such an occurrence cannot be fully excluded.

Figure 4. The dataset downsampled to size .


References

[1] R. Rubinstein, M. Zibulevsky, and M. Elad, Double sparsity: Learning sparse dictionaries for sparse signal approximation, IEEE Transactions on Signal Processing, 58 (2010), pp. 1553–1564.

[2] S. Soltani, Studies of Sensitivity in the Dictionary Learning Approach to Computed Tomography: Simplifying the Reconstruction Problem, Rotation, and Scale, Technical University of Denmark, 2015.