COSIME: FeFET based Associative Memory for In-Memory Cosine Similarity Search

07/25/2022
by   Che-Kai Liu, et al.

In a number of machine learning models, an input query is searched across the trained class vectors to find the closest feature class vector under the cosine similarity metric. However, computing the cosine similarities between vectors on von Neumann machines involves a large number of multiplications, Euclidean normalizations, and division operations, thus incurring heavy hardware energy and latency overheads. Moreover, due to the memory wall problem present in conventional architectures, frequent cosine similarity-based searches (CSSs) over the class vectors require a large amount of data movement, limiting the throughput and efficiency of the system. To overcome these challenges, this paper introduces COSIME, a general in-memory associative memory (AM) engine based on the ferroelectric FET (FeFET) device for efficient CSS. By leveraging the one-transistor AND gate function of FeFET devices, a current-based translinear analog circuit, and winner-take-all (WTA) circuitry, COSIME can realize parallel in-memory CSS across all the entries in a memory block and output the word closest to the input query under the cosine similarity metric. Evaluation results at the array level suggest that the proposed COSIME design achieves 333X and 90.5X latency and energy improvements, respectively, and realizes better classification accuracy when compared with an AM design implementing approximated CSS. The proposed in-memory computing fabric is evaluated on an HDC problem, showing that COSIME can achieve on average 47.1X speedup and 98.5X energy efficiency improvement compared with a GPU implementation.
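
For orientation, the search that the abstract describes can be sketched in software as follows. This is a minimal illustration of a cosine similarity-based search over stored class vectors, not the COSIME circuit or the authors' code. The function names `cosine_similarity` and `nearest_class`, the example vectors, and the handling of zero vectors are all assumptions made for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_class(query, class_vectors):
    """Return (index, similarity) of the stored vector closest to the query."""
    scores = [cosine_similarity(query, c) for c in class_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical binary class vectors. For 0/1 entries, x * y equals x AND y,
# so the dot product counts the positions where both vectors hold a 1.
classes = [
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
query = [1, 0, 1, 0, 0, 0]

index, score = nearest_class(query, classes)
print(index, round(score, 4))
```

In this example the first and third stored vectors have the same dot product with the query, and only the normalization separates them. Each comparison costs a dot product, two norms, and a division, and every stored vector must be read to evaluate it, which is the per-search cost and data movement the abstract refers to.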

research
10/21/2019

A Comparison of Semantic Similarity Methods for Maximum Human Interpretability

The inclusion of semantic information in any similarity measures improve...
research
09/16/2019

High-Throughput In-Memory Computing for Binary Deep Neural Networks with Monolithically Integrated RRAM and 90nm CMOS

Deep learning hardware designs have been bottlenecked by conventional me...
research
08/27/2016

Testing APSyn against Vector Cosine on Similarity Estimation

In Distributional Semantic Models (DSMs), Vector Cosine is widely used t...
research
12/18/2018

Index-based, High-dimensional, Cosine Threshold Querying with Optimality Guarantees

Given a database of vectors, a cosine threshold query returns all vector...
research
05/03/2019

When parallel speedups hit the memory wall

After Amdahl's trailblazing work, many other authors proposed analytical...
research
02/15/2022

Fast and Scalable Memristive In-Memory Sorting with Column-Skipping Algorithm

Memristive in-memory sorting has been proposed recently to improve hardw...
