Measures of Complexity for Large Scale Image Datasets

08/10/2020
by   Ameet Annasaheb Rahane, et al.
Large-scale image datasets are a growing trend in machine learning. However, it is hard to quantitatively understand or specify how various datasets compare to each other, i.e., whether one dataset is more complex or harder to “learn” for a deep-learning-based network. In this work, we build a series of relatively computationally simple methods to measure the complexity of a dataset. Furthermore, we present an approach for visualizing high-dimensional data to assist with visual comparison of datasets. We present our analysis using four datasets from the autonomous driving research community: Cityscapes, IDD, BDD, and Vistas. Using entropy-based metrics, we present a rank-order complexity of these datasets, which we compare with an established rank order with respect to deep learning.
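
To make the idea of an entropy-based rank order concrete, the following is a minimal sketch, not the authors' method: the abstract does not specify which entropy metrics are used, so this example assumes the simplest one, the Shannon entropy of the 8-bit pixel-intensity histogram, averaged over a dataset. The function names (`shannon_entropy`, `mean_dataset_entropy`, `rank_by_entropy`) and the toy datasets are hypothetical.

```python
import numpy as np


def shannon_entropy(image):
    """Shannon entropy in bits of an 8-bit image's intensity histogram.

    Assumption: pixel-histogram entropy stands in for the paper's
    (unspecified) entropy-based metrics.
    """
    pixels = np.asarray(image, dtype=np.uint8).ravel()
    counts = np.bincount(pixels, minlength=256)
    probs = counts[counts > 0] / counts.sum()
    # Sum of p * log2(1/p); empty bins are excluded above.
    return float(np.sum(probs * np.log2(1.0 / probs)))


def mean_dataset_entropy(images):
    """Average per-image entropy over an iterable of images."""
    return float(np.mean([shannon_entropy(img) for img in images]))


def rank_by_entropy(datasets):
    """Return dataset names ordered from highest to lowest mean entropy."""
    scores = {name: mean_dataset_entropy(imgs) for name, imgs in datasets.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one constant image and one using all 256 levels.
flat = np.zeros((16, 16), dtype=np.uint8)
varied = np.arange(256, dtype=np.uint8).reshape(16, 16)
ranking = rank_by_entropy({"flat": [flat], "varied": [varied]})
```

In this toy setup the image that uses every intensity level ranks above the constant one. Applying the same ranking to real datasets such as Cityscapes or IDD would require loading their images as arrays, which is omitted here.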


research
01/05/2021

Data Quality Measures and Efficient Evaluation Algorithms for Large-Scale High-Dimensional Data

Machine learning has been proven to be effective in various application ...
research
11/04/2021

OpenFWI: Benchmark Seismic Datasets for Machine Learning-Based Full Waveform Inversion

We present OpenFWI, a collection of large-scale open-source benchmark da...
research
12/13/2020

Predicting Generalization in Deep Learning via Local Measures of Distortion

We study generalization in deep learning by appealing to complexity meas...
research
05/14/2022

Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps

In this paper, we present DendroMap, a novel approach to interactively e...
research
08/15/2023

ADD: An Automatic Desensitization Fisheye Dataset for Autonomous Driving

Autonomous driving systems require many images for analyzing the surroun...
research
11/12/2018

Fast Computing von Neumann Entropy for Large-scale Graphs via Quadratic Approximations

The von Neumann graph entropy (VNGE) can be used as a measure of graph c...
research
06/07/2022

Pushing the Limits of Learning-based Traversability Analysis for Autonomous Driving on CPU

Self-driving vehicles and autonomous ground robots require a reliable an...
