Flexible and Scalable Deep Learning with MMLSpark

04/11/2018
by Mark Hamilton et al.

In this work we detail a novel open-source library, called MMLSpark, that combines the flexible deep learning library Cognitive Toolkit with the distributed computing framework Apache Spark. To achieve this, we have contributed Java language bindings to the Cognitive Toolkit and added several new components to the Spark ecosystem. We also integrate the popular image processing library OpenCV with Spark, and present a tool for the automated generation of PySpark wrappers from any SparkML estimator, which we use to expose all of this work to the PySpark ecosystem. Finally, we provide a large library of tools for working and developing within the Spark ecosystem. We apply this work to the automated classification of snow leopards from camera trap images, and provide an end-to-end solution for the non-profit conservation organization, the Snow Leopard Trust.


