Disaggregated Accelerator Management System for Cloud Data Centers

10/26/2020
by Ryousei Takano, et al.

A conventional data center built from monolithic servers faces several limitations, including a lack of operational flexibility, low resource utilization, and low maintainability. Resource disaggregation is a promising way to address these issues. We propose Flow-in-Cloud (FiC), a disaggregated cloud data center architecture that enables an existing cluster computer system to expand an accelerator pool through a high-speed network. FlowOS-RM manages the resources of the entire pool and deploys a user job on a slice constructed dynamically according to the user's request. A slice consists of compute nodes and accelerators, with each accelerator attached to its corresponding compute node. This paper demonstrates the feasibility of FiC in a proof-of-concept experiment that runs a distributed deep learning application on the prototype system. The results support the applicability of the proposed system.
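
The slice model described in the abstract can be illustrated with a minimal sketch. This is not the FlowOS-RM interface: the class names `ResourcePool` and `Slice`, the methods `build_slice` and `release`, and the first-come allocation policy are all hypothetical and chosen only to show how a pool might attach accelerators to compute nodes on request.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    # Maps each compute node to the accelerators attached to it.
    attachments: dict[str, list[str]] = field(default_factory=dict)


class ResourcePool:
    """Hypothetical pool manager for illustration; not the FlowOS-RM API."""

    def __init__(self, nodes, accelerators):
        self.free_nodes = list(nodes)
        self.free_accels = list(accelerators)

    def build_slice(self, num_nodes, accels_per_node):
        # Assumed policy: take resources in pool order, fail if short.
        needed = num_nodes * accels_per_node
        if num_nodes > len(self.free_nodes) or needed > len(self.free_accels):
            raise RuntimeError("insufficient free resources in the pool")
        sl = Slice()
        for _ in range(num_nodes):
            node = self.free_nodes.pop(0)
            sl.attachments[node] = [
                self.free_accels.pop(0) for _ in range(accels_per_node)
            ]
        return sl

    def release(self, sl):
        # Return every node and accelerator of the slice to the pool.
        for node, accels in sl.attachments.items():
            self.free_nodes.append(node)
            self.free_accels.extend(accels)
        sl.attachments.clear()
```

For example, a request for two nodes with one accelerator each would yield a slice in which each node holds exactly one accelerator taken from the shared pool; releasing the slice makes those resources available to the next job.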

research
01/28/2021

Cloud Computing Concept and Roots

Cloud computing is a particular implementation of distributed computing....
research
12/03/2018

Hoard: A Distributed Data Caching System to Accelerate Deep Learning Training on the Cloud

Deep Learning system architects strive to design a balanced system where...
research
03/12/2021

A Risk-taking Broker Model to Optimise User Requests placement on On-demand and Contract VMs

Cloud providers offer end-users various pricing schemes to allow them to...
research
04/05/2018

A Markov Model for Request-Based Inter-Slice Resource Management in 5G Networks

The emerging feature of network slicing in future Fifth Generation (5G) ...
research
01/05/2022

CDN Slicing over a Multi-Domain Edge Cloud

We present an architecture for the provision of video Content Delivery N...
research
03/03/2015

Disaggregated and optically interconnected memory: when will it be cost effective?

The "Disaggregated Server" concept has been proposed for datacenters whe...
research
03/19/2021

Performance Analysis of Deep Learning Workloads on a Composable System

A composable infrastructure is defined as resources, such as compute, st...
