Performance and Energy Consumption of Parallel Machine Learning Algorithms

05/01/2023
by Xidong Wu et al.

Machine learning models have achieved remarkable success in real-world applications such as data science, computer vision, and natural language processing. However, training these models requires large-scale data sets and many iterations before they perform well, and parallelizing the training algorithms is a common strategy for speeding up this process. Many studies of model training and inference, though, focus only on performance. Power consumption is also an important metric for any kind of computation, especially for high-performance applications. Machine learning algorithms for low-power platforms such as sensors and mobile devices have been studied, but far less power optimization has been done for algorithms designed for high-performance computing. In this paper, we present C++ implementations of logistic regression and the genetic algorithm, and a Python implementation of neural networks trained with stochastic gradient descent (SGD), on classification tasks. We show how the complexity of the model and the size of the training data affect the parallel efficiency of these algorithms in terms of both power and performance. We also test these implementations with shared-memory parallelism, distributed-memory parallelism, and GPU acceleration to speed up model training.
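As a rough illustration of the kind of shared-memory parallelism the abstract refers to, below is a minimal C++/OpenMP sketch of one gradient step for binary logistic regression. This is an assumed example, not the authors' actual code; the function and variable names are hypothetical.

```cpp
// Minimal sketch (assumed example, not the paper's implementation): one
// full-batch gradient-descent step for binary logistic regression, with the
// per-sample gradient accumulation parallelized via OpenMP shared memory.
#include <cmath>
#include <cstddef>
#include <vector>
#include <omp.h>

// X: n_samples x n_features (row-major), y: labels in {0,1}, w: weight vector.
void logistic_gradient_step(const std::vector<std::vector<double>>& X,
                            const std::vector<int>& y,
                            std::vector<double>& w,
                            double lr) {
    const std::size_t n = X.size();
    const std::size_t d = w.size();
    std::vector<double> grad(d, 0.0);

    #pragma omp parallel
    {
        std::vector<double> local(d, 0.0);  // per-thread partial gradient

        #pragma omp for nowait
        for (long long i = 0; i < static_cast<long long>(n); ++i) {
            double z = 0.0;
            for (std::size_t j = 0; j < d; ++j) z += w[j] * X[i][j];
            const double p = 1.0 / (1.0 + std::exp(-z));   // sigmoid prediction
            const double err = p - static_cast<double>(y[i]);
            for (std::size_t j = 0; j < d; ++j) local[j] += err * X[i][j];
        }

        #pragma omp critical  // merge thread-local gradients into the shared total
        for (std::size_t j = 0; j < d; ++j) grad[j] += local[j];
    }

    // Average over the batch and take a gradient-descent step.
    for (std::size_t j = 0; j < d; ++j)
        w[j] -= lr * grad[j] / static_cast<double>(n);
}
```

The same accumulate-then-reduce pattern carries over to the distributed-memory and GPU settings the abstract mentions, with the thread-local buffers replaced by per-rank or per-block partial gradients.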

