Amir Erfan Eshratifar

  • A Hardware-Friendly Algorithm for Scalable Training and Deployment of Dimensionality Reduction Models on FPGA

    With the ever-increasing application of machine learning models in various domains such as image classification, speech recognition and synthesis, and health care, designing efficient hardware for these models has attracted considerable attention. While the majority of research in this area focuses on efficient deployment of machine learning models (a.k.a. inference), this work concentrates on the challenges of training these models in hardware. In particular, this paper presents a high-performance, scalable, reconfigurable solution for both training and deployment of different dimensionality reduction models in hardware by introducing a hardware-friendly algorithm. Compared to state-of-the-art implementations, our proposed algorithm and its hardware realization decrease resource consumption by 50% without any degradation in accuracy. (An illustrative code sketch of hardware-friendly dimensionality reduction training follows this entry.)

    01/11/2018 ∙ by Mahdi Nazemi, et al.
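
    The abstract above does not spell out the algorithm itself, so the following is only a generic illustration of what "hardware-friendly" training of a dimensionality reduction model can look like: a streaming-PCA update (Oja's rule) built entirely from multiply-accumulate operations, with no matrix inversion or SVD. The rule, the synthetic data, and the learning rate are assumptions for illustration, not the algorithm proposed in the paper.

    ```python
    # Minimal streaming PCA via Oja's rule -- illustrative only, not the paper's algorithm.
    # Every update uses only multiply-accumulate operations, which is the kind of
    # arithmetic that maps well onto FPGA fabric.
    import numpy as np

    def oja_update(w, x, lr=0.005):
        """One Oja's-rule step: nudge w toward the leading principal direction."""
        y = np.dot(w, x)                   # project the sample onto the current direction
        return w + lr * y * (x - y * w)    # Hebbian term with a built-in normalization term

    rng = np.random.default_rng(0)
    # Synthetic 2-D data whose leading principal component lies along [1, 1] / sqrt(2).
    data = rng.normal(size=(5000, 2)) @ np.array([[2.0, 1.5], [1.5, 2.0]])

    w = rng.normal(size=2)
    w /= np.linalg.norm(w)
    for x in data:
        w = oja_update(w, x)
        w /= np.linalg.norm(w)             # keep the direction unit-length for stability

    print("estimated leading direction:", w)
    ```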

  • A Meta-Learning Approach for Custom Model Training

    Transfer learning and meta-learning are two effective methods for applying knowledge learned from large data sources to new tasks. In few-class, few-shot target task settings (i.e., when only a few classes and training examples are available in the target task), meta-learning approaches that optimize for future task learning have outperformed the typical transfer approach of initializing model weights from a pre-trained starting point. But as we show experimentally, meta-learning algorithms that work well in the few-class setting do not generalize well in many-shot and many-class cases. In this paper, we propose a joint training approach that combines transfer learning and meta-learning. Benefiting from the advantages of each, our method obtains improved generalization performance on unseen target tasks in both few- and many-class and few- and many-shot scenarios. (A code sketch of one possible joint objective follows this entry.)

    09/21/2018 ∙ by Amir Erfan Eshratifar, et al.
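
    The abstract above describes the method only at a high level: a joint objective that combines a transfer-style supervised loss with a MAML-style meta-learning loss. The sketch below shows one plausible way to combine the two on shared parameters; the toy model, the task distribution, and the mixing weight `lam` are assumptions for illustration, not the paper's exact formulation.

    ```python
    # A plausible joint transfer + meta-learning objective (illustrative assumptions only):
    # total loss = supervised loss of the shared model on pooled task data
    #              + lam * MAML-style loss after one differentiable inner adaptation step.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    w = torch.randn(1, requires_grad=True)   # a one-parameter "model": prediction = w * x
    lam, inner_lr = 1.0, 0.1                 # lam balances the two terms (assumed value)

    def loss_fn(w, x, y):
        return F.mse_loss(w * x, y)

    opt = torch.optim.SGD([w], lr=0.05)
    for step in range(200):
        # Toy tasks: regress y = a * x for a few task-specific slopes a.
        tasks = [(torch.randn(10), a) for a in (1.5, 2.0, 2.5)]

        # Transfer-style term: ordinary supervised loss on the pooled task data.
        transfer_loss = sum(loss_fn(w, x, a * x) for x, a in tasks) / len(tasks)

        # MAML-style term: query loss after one inner gradient step, differentiated through.
        meta_loss = 0.0
        for x, a in tasks:
            x_support, x_query = x[:5], x[5:]
            g, = torch.autograd.grad(loss_fn(w, x_support, a * x_support), w, create_graph=True)
            w_adapted = w - inner_lr * g
            meta_loss = meta_loss + loss_fn(w_adapted, x_query, a * x_query)
        meta_loss = meta_loss / len(tasks)

        opt.zero_grad()
        (transfer_loss + lam * meta_loss).backward()
        opt.step()

    print("learned shared parameter:", w.item())
    ```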

  • Gradient Agreement as an Optimization Objective for Meta-Learning

    This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model for an agent that adapts rapidly when presented with previously unseen tasks. Tasks are sampled from a specific distribution which is assumed to be similar for both seen and unseen tasks. We focus on a family of meta-learning methods that learn initial parameters of a base model which can be fine-tuned quickly on a new task with a few gradient steps (e.g., MAML). Our approach is based on pushing the parameters of the model in a direction on which the tasks agree. If the gradients of a task agree with the parameter update vector, their inner product is a large positive value. As a result, given a batch of tasks to be optimized for, we associate a positive (negative) weight with the loss function of a task if the inner product between its gradients and the average of the gradients of all tasks in the batch is positive (negative). The degree to which a task contributes to the parameter update is therefore controlled by introducing a set of weights on the task loss functions. Our method can be easily integrated with existing meta-learning algorithms for neural networks. Our experiments demonstrate that it yields models with better generalization than MAML and Reptile. (A code sketch of the weighting scheme follows this entry.)

    10/18/2018 ∙ by Amir Erfan Eshratifar, et al.
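
    The weighting scheme described above is concrete enough to sketch directly: each task's contribution to the update is scaled by the inner product between its gradient and the batch-average gradient, so tasks that agree with the consensus direction are weighted positively and conflicting tasks negatively. The toy tasks, the learning rate, and the particular normalization of the weights are assumptions for illustration.

    ```python
    # Gradient-agreement weighting across a batch of tasks (sketch).
    import torch

    torch.manual_seed(0)
    theta = torch.randn(2, requires_grad=True)   # shared (meta-)parameters
    lr = 0.1

    def task_loss(theta, target):
        return ((theta - target) ** 2).sum()

    # Toy "tasks": each pulls theta toward a different target; two agree, one conflicts.
    targets = [torch.tensor([1.0, 1.0]),
               torch.tensor([1.2, 0.8]),
               torch.tensor([-1.0, -1.0])]

    for step in range(100):
        grads = [torch.autograd.grad(task_loss(theta, t), theta)[0] for t in targets]
        g_avg = torch.stack(grads).mean(dim=0)

        # Agreement weights: positive when a task's gradient aligns with the batch average,
        # negative when it points the other way (one simple normalization choice).
        inner = torch.stack([torch.dot(g, g_avg) for g in grads])
        weights = inner / inner.abs().sum().clamp_min(1e-12)

        update = sum(w_i * g_i for w_i, g_i in zip(weights, grads))
        with torch.no_grad():
            theta -= lr * update

    print("final parameters:", theta.detach())
    ```

    Tasks whose gradients oppose the batch consensus end up down-weighted or weighted negatively, which is the behavior the abstract argues improves generalization over plain gradient averaging.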

  • JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services

    Deep neural networks are among the most influential deep learning architectures and are deployed in many mobile intelligent applications. End-side services, such as intelligent personal assistants (IPAs), autonomous cars, and smart home services, often employ either simple local models or complex remote models on the cloud. Mobile-only and cloud-only computation are currently the status quo approaches. In this paper, we propose an efficient, adaptive, and practical engine, JointDNN, for collaborative computation between a mobile device and the cloud for DNNs in both the inference and training phases. JointDNN not only provides an energy- and performance-efficient method of querying DNNs for the mobile side, but also benefits the cloud server by reducing its workload and communication compared to the cloud-only approach. Given the DNN architecture, we investigate the efficiency of processing some layers on the mobile device and some layers on the cloud server. We provide optimization formulations at layer granularity for forward and backward propagation in DNNs, which can adapt to mobile battery limitations, cloud server load constraints, and quality of service. JointDNN achieves up to 18x and 32x reductions in the latency and mobile energy consumption of querying DNNs, respectively. (A simplified layer-partitioning sketch follows this entry.)

    01/25/2018 ∙ by Amir Erfan Eshratifar, et al.
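
    The core decision JointDNN automates is whether each layer should run on the mobile device or in the cloud, given per-layer computation and communication costs. The sketch below is a simplified dynamic program over a linear chain of layers that minimizes end-to-end latency; the paper's actual formulation is richer (energy, battery, server load, quality of service), and every latency number here is invented for illustration.

    ```python
    # Simplified layer-partitioning dynamic program for a linear chain of DNN layers.
    # For each layer we choose mobile or cloud execution; crossing the wireless link
    # costs the time to transmit the activation flowing between the two layers.
    # All latency numbers below are invented for illustration.

    layers = [
        # (mobile_latency_ms, cloud_latency_ms, output_transfer_ms)
        (12.0, 1.5, 8.0),   # conv block 1
        (20.0, 2.0, 6.0),   # conv block 2
        (25.0, 2.5, 3.0),   # conv block 3
        (15.0, 1.0, 0.5),   # classifier
    ]
    input_upload_ms = 10.0   # cost of shipping the raw input to the cloud

    INF = float("inf")
    best_mobile, best_cloud = 0.0, INF   # latency so far, by where the previous layer ran
    transfer_in = input_upload_ms        # cost of moving the current layer's input across the link

    for mobile_ms, cloud_ms, output_ms in layers:
        new_mobile = min(best_mobile, best_cloud + transfer_in) + mobile_ms
        new_cloud = min(best_cloud, best_mobile + transfer_in) + cloud_ms
        best_mobile, best_cloud = new_mobile, new_cloud
        transfer_in = output_ms          # the next layer's input is this layer's output

    # The final prediction has to end up back on the mobile device.
    end_to_end = min(best_mobile, best_cloud + transfer_in)
    print(f"best collaborative latency: {end_to_end:.1f} ms")
    print(f"mobile-only: {sum(l[0] for l in layers):.1f} ms, "
          f"cloud-only: {input_upload_ms + sum(l[1] for l in layers) + layers[-1][2]:.1f} ms")
    ```

    The same recurrence extends naturally to energy rather than latency by swapping in per-layer energy costs.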

  • Towards Collaborative Intelligence Friendly Architectures for Deep Learning

    Modern mobile devices are equipped with high-performance hardware resources such as graphics processing units (GPUs), making end-side intelligent services more feasible. More recently, specialized silicon such as neural engines has started to appear in mobile devices. However, most mobile devices are still not capable of performing real-time inference with very deep models. Computations associated with deep models for today's intelligent applications are typically performed solely on the cloud. This cloud-only approach requires significant amounts of raw data to be uploaded to the cloud over the mobile wireless network and imposes considerable computational and communication load on the cloud server. Recent studies have shown that the latency and energy consumption of deep neural networks in mobile applications can be notably reduced by splitting the workload between the mobile device and the cloud. In this approach, referred to as collaborative intelligence, intermediate features computed on the mobile device are offloaded to the cloud instead of the raw input data of the network, reducing the amount of data that needs to be sent to the cloud. In this paper, we design a new collaborative-intelligence-friendly architecture by introducing a unit, placed after a selected layer of a deep model, that further reduces the size of the feature data that must be offloaded to the cloud. Across different wireless networks, our proposed method achieves on average a 53x improvement in end-to-end latency and a 68x improvement in mobile energy consumption compared to the status quo cloud-only approach for ResNet-50, while the accuracy loss is less than 2%. (An illustrative sketch of such a reduction unit follows this entry.)

    02/01/2019 ∙ by Amir Erfan Eshratifar, et al.
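
    The architectural idea above, a small unit that shrinks the feature tensor before it is offloaded to the cloud, can be sketched as a module inserted after a selected layer. The squeeze/expand design, layer sizes, and channel counts below are assumed stand-ins, not the exact unit proposed in the paper.

    ```python
    # An illustrative "feature reduction unit" placed after a selected layer of a deep
    # model: the mobile side shrinks the activation before offloading, the cloud side
    # restores its shape. This is an assumed stand-in, not the paper's exact design.
    import torch
    import torch.nn as nn

    class ReductionUnit(nn.Module):
        def __init__(self, channels, reduced_channels):
            super().__init__()
            # Runs on the mobile device: fewer channels and half the spatial resolution.
            self.squeeze = nn.Sequential(
                nn.Conv2d(channels, reduced_channels, kernel_size=1),
                nn.AvgPool2d(kernel_size=2),
            )
            # Runs on the cloud server: restore the original tensor shape.
            self.expand = nn.Sequential(
                nn.Upsample(scale_factor=2, mode="nearest"),
                nn.Conv2d(reduced_channels, channels, kernel_size=1),
            )

        def forward(self, x):
            compact = self.squeeze(x)          # this tensor is what crosses the network
            return self.expand(compact), compact.numel()

    # Example: insert the unit after an assumed 256-channel, 28x28 intermediate feature map.
    unit = ReductionUnit(channels=256, reduced_channels=16)
    features = torch.randn(1, 256, 28, 28)
    restored, offloaded_numel = unit(features)
    print(f"offloaded {offloaded_numel} values instead of {features.numel()} "
          f"({features.numel() / offloaded_numel:.0f}x smaller), restored shape {tuple(restored.shape)}")
    ```

    In this toy configuration the offloaded tensor is 64x smaller than the original feature map, which is the kind of reduction that makes offloading intermediate features cheaper than uploading raw inputs.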

  • BottleNet: A Deep Learning Architecture for Intelligent Mobile Cloud Computing Services

    Recent studies have shown that the latency and energy consumption of deep neural networks can be significantly improved by splitting the network between the mobile device and the cloud. This paper introduces a new deep learning architecture, called BottleNet, for reducing the size of the features that need to be sent to the cloud. Furthermore, we propose a training method that compensates for the potential accuracy loss due to the lossy compression of features before transmitting them to the cloud. BottleNet achieves on average a 30x improvement in end-to-end latency and a 40x improvement in mobile energy consumption compared to the cloud-only approach, with negligible accuracy loss. (A sketch of compression-aware training follows this entry.)

    02/04/2019 ∙ by Amir Erfan Eshratifar, et al.
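
    Training through a lossy feature compressor, as the abstract above describes, means dealing with a non-differentiable operation in the middle of the network. A common way to do this is a straight-through estimator: apply the lossy operation in the forward pass and pass gradients through unchanged in the backward pass. The sketch below uses coarse quantization as a stand-in for the actual codec; the codec choice, the number of quantization levels, and the tiny encoder/decoder are assumptions for illustration, not details taken from the abstract.

    ```python
    # Compression-aware training sketch: run a lossy operation on the offloaded features
    # in the forward pass, but let gradients flow through it unchanged (straight-through),
    # so the surrounding layers learn to tolerate the compression loss.
    import torch
    import torch.nn.functional as F

    class LossyCompress(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            # Simulate lossy compression by quantizing to 16 levels per tensor.
            scale = x.abs().max().clamp_min(1e-8)
            return torch.round(x / scale * 15) / 15 * scale

        @staticmethod
        def backward(ctx, grad_output):
            # Straight-through: pretend the compressor is the identity for gradients.
            return grad_output

    # Tiny demo: learn to reconstruct the input through the lossy bottleneck.
    torch.manual_seed(0)
    encoder = torch.nn.Linear(8, 4)     # mobile-side layers before offloading
    decoder = torch.nn.Linear(4, 8)     # cloud-side layers after offloading
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)

    x = torch.randn(64, 8)
    for step in range(300):
        compressed = LossyCompress.apply(encoder(x))   # the features actually sent to the cloud
        loss = F.mse_loss(decoder(compressed), x)
        opt.zero_grad()
        loss.backward()
        opt.step()

    print("reconstruction loss after compression-aware training:", loss.item())
    ```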