Efficient Convolutional Neural Networks for Pixelwise Classification on Heterogeneous Hardware Systems

09/11/2015 ∙ by Fabian Tschopp, et al.

This work presents and analyzes three convolutional neural network (CNN) models for efficient pixelwise classification of images. When convolutional neural networks classify single pixels from patches of a whole image, sliding window networks carry out many redundant computations. The new architectures solve this issue by either removing redundant computations or using fully convolutional architectures that inherently predict many pixels at once. The implementations of the three models are accessible through a new utility on top of the Caffe library. The utility provides support for a wide range of image input and output formats, pre-processing parameters and methods to equalize the label histogram during training. The Caffe library has been extended with new layers and a new OpenCL backend, making it available on a wider range of hardware such as CPUs and GPUs. On AMD GPUs, speedups of 54× (SK-Net), 437× (U-Net) and 320× (USK-Net) have been observed, taking the SK-equivalent SW (sliding window) network as the baseline. The label throughput is up to one megapixel per second. The analyzed neural networks have distinctive characteristics that apply during training or processing, and not every data set is suitable for every architecture. The quality of the predictions is assessed on two neural tissue data sets, one of which is the ISBI 2012 challenge data set. Two different loss functions, Malis loss and Softmax loss, were used during training. The whole pipeline, consisting of models, interface and modified Caffe library, is available as open source software under the working title Project Greentea.
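The redundancy of sliding window evaluation mentioned in the abstract can be illustrated with a toy sketch. The code below is not the paper's SK-Net, U-Net or USK-Net and does not use Caffe; it assumes a hypothetical two-layer network of 3×3 filters with a ReLU in between, with no pooling, and all names (`conv_valid`, `two_layer`, `sliding_window`) are invented for illustration. It compares classifying each pixel from its own 5×5 patch against running the same two layers once over the whole image, and counts the multiplications in each case.

```python
# Toy illustration only: a two-layer 3x3 "network" stands in for a real CNN.
# All function names here are hypothetical and not part of Caffe or Greentea.

def conv_valid(img, k):
    """Valid 2D cross-correlation; returns (output, number of multiplications)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    mults = 0
    for y in range(oh):
        for x in range(ow):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += img[y + i][x + j] * k[i][j]
                    mults += 1
            out[y][x] = acc
    return out, mults

def relu(fm):
    """Elementwise ReLU on a 2D feature map."""
    return [[max(0, v) for v in row] for row in fm]

def two_layer(img, k1, k2):
    """Apply conv -> ReLU -> conv to the whole input in one pass."""
    h1, m1 = conv_valid(img, k1)
    h2, m2 = conv_valid(relu(h1), k2)
    return h2, m1 + m2

def sliding_window(img, k1, k2, patch=5):
    """Classify each pixel from its own patch (5x5 is the receptive field)."""
    oh, ow = len(img) - patch + 1, len(img[0]) - patch + 1
    out = [[0] * ow for _ in range(oh)]
    mults = 0
    for y in range(oh):
        for x in range(ow):
            p = [row[x:x + patch] for row in img[y:y + patch]]
            o, m = two_layer(p, k1, k2)  # yields a single 1x1 prediction
            out[y][x] = o[0][0]
            mults += m
    return out, mults

# Small deterministic 8x8 integer image and two arbitrary 3x3 filters.
img = [[(3 * y + 5 * x) % 7 - 3 for x in range(8)] for y in range(8)]
k1 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k2 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

sw_out, sw_mults = sliding_window(img, k1, k2)
fc_out, fc_mults = two_layer(img, k1, k2)

print(sw_out == fc_out)    # both approaches give identical predictions
print(sw_mults, fc_mults)  # multiplication counts for each approach
```

On this 8×8 input both approaches produce the same 4×4 output, but the sliding window version performs 1440 multiplications against 468 for the single pass, because neighbouring patches recompute the same first-layer values. The gap widens with deeper networks and larger images. Networks with pooling or strided layers need extra care to be evaluated densely, which this sketch deliberately leaves out.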




Code Repositories


Caffe model zoo and scripts



A C++ interface for the caffe library



This project was developed for identifying vehicles in a video stream. The project is a cornerstone for a real-time vehicle tracking algorithm that employs semantic pixel-wise methods. This project solves the tracking problem for the Udacity final project in a different way than the general approach presented in the course. Instead of using the HOG features and other features extracted from the color space of the images, we used the U-Net[1] which is a convolutional network for biomedical image



CIL road segmentation project repository
