
How sustainable is "common" data science in terms of power consumption?

by Bjorge Meulemeester, et al.

Continuous developments in data science have brought forth an exponential increase in the complexity of machine learning models. At the same time, data scientists have become ubiquitous in the private market, in academic environments, and even among hobbyists. Both trends are on a steady rise, and both are associated with an increase in power consumption and carbon footprint. The growing carbon footprint of large-scale advanced data science has already received attention, but that of widespread, small-scale data science has not. This work aims to estimate the contribution of the increasingly popular "common" data science to the global carbon footprint. To this end, the power consumption of several typical common data science tasks is measured and compared to three references: large-scale "advanced" data science, common computer-related tasks, and everyday non-computer-related tasks. To make the comparison, the measurements are converted to the equivalent unit of "km driven by car". Our main findings are: "common" data science consumes 2.57 times more power than regular computer usage, but less than some common everyday power-consuming tasks such as lighting or heating; large-scale data science consumes substantially more power than common data science.
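The conversion to "km driven by car" can be illustrated with a minimal sketch. The abstract does not state how the authors perform the conversion or which emission factors they use, so the function name `energy_to_car_km`, its parameters, and all numeric values below are assumptions chosen for illustration only. The sketch assumes that measured energy is first converted to emitted CO2 through a grid emission factor, then divided by a per-kilometre emission figure for a car.

```python
def energy_to_car_km(energy_kwh, grid_kg_co2_per_kwh, car_kg_co2_per_km):
    """Return the car distance (km) that emits as much CO2 as the given energy use.

    All three arguments are hypothetical inputs; none are taken from the paper.
    """
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    if grid_kg_co2_per_kwh < 0:
        raise ValueError("grid_kg_co2_per_kwh must be non-negative")
    if car_kg_co2_per_km <= 0:
        raise ValueError("car_kg_co2_per_km must be positive")

    # Step 1: energy consumed -> CO2 emitted by generating that electricity.
    emitted_kg_co2 = energy_kwh * grid_kg_co2_per_kwh
    # Step 2: CO2 emitted -> distance a car would drive to emit the same amount.
    return emitted_kg_co2 / car_kg_co2_per_km


if __name__ == "__main__":
    # Placeholder values for illustration, not measurements from the study.
    example_km = energy_to_car_km(
        energy_kwh=2.0,
        grid_kg_co2_per_kwh=0.5,
        car_kg_co2_per_km=0.25,
    )
    print(f"Equivalent distance: {example_km:.2f} km")
```

Expressing every task in one shared unit is what allows workloads as different as model training, office computer use, and household lighting to be compared on a single scale. The result depends directly on the two emission factors, which vary by region and vehicle.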

