Manu: A Cloud Native Vector Database Management System

06/28/2022
by   Rentong Guo, et al.

With the development of learning-based embedding models, embedding vectors are widely used for analyzing and searching unstructured data. As vector collections grow beyond billion scale, fully managed and horizontally scalable vector databases become necessary. Over the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have: long-term evolvability, tunable consistency, good elasticity, and high performance. We present Manu, a cloud native vector database that implements these features. Integrating all of them is difficult under traditional DBMS design rules. Because most vector data applications do not require complex data models or strong data consistency, our design philosophy is to relax the data model and consistency constraints in exchange for the aforementioned features. Specifically, Manu first exposes the write-ahead log (WAL) and binlog as backbone services. Second, write components are designed as log publishers, while all read-only analytic and search components are designed as independent subscribers to the log services. Finally, we use multi-version concurrency control (MVCC) and a delta consistency model to simplify communication and cooperation among the system components. These designs achieve low coupling among the system components, which is essential for elasticity and evolution. We also extensively optimize Manu for performance and usability with hardware-aware implementations and support for complex search semantics.
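The log-centric design described in the abstract can be illustrated with a small sketch. It is not Manu's implementation: the class names `Log` and `SearchNode`, the integer timestamps, and the staleness rule (a query may be served when the subscriber lags the query timestamp by at most `delta`) are all assumptions made for illustration. The writer only publishes to the log, a read-only subscriber replays it at its own pace, and MVCC lets each query see the newest version that is no newer than its timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Hypothetical append-only log shared by the writer and all subscribers."""
    entries: list = field(default_factory=list)  # (timestamp, key, vector)

    def publish(self, ts, key, vector):
        # Writers never talk to readers directly; they only append to the log.
        self.entries.append((ts, key, vector))


@dataclass
class SearchNode:
    """Hypothetical read-only subscriber that replays the log independently."""
    log: Log
    offset: int = 0       # next log position to consume
    synced_ts: int = 0    # timestamp of the last consumed entry
    versions: dict = field(default_factory=dict)  # key -> [(ts, vector), ...]

    def poll(self, max_entries=1):
        # Consume up to max_entries new log records.
        batch = self.log.entries[self.offset:self.offset + max_entries]
        for ts, key, vector in batch:
            self.versions.setdefault(key, []).append((ts, vector))
            self.synced_ts = ts
            self.offset += 1

    def can_serve(self, query_ts, delta):
        # Assumed delta-consistency check: tolerate at most `delta` staleness.
        return query_ts - self.synced_ts <= delta

    def read(self, key, query_ts):
        # MVCC read: newest version whose timestamp is <= query_ts.
        visible = [v for ts, v in self.versions.get(key, []) if ts <= query_ts]
        return visible[-1] if visible else None


log = Log()
log.publish(1, "a", [0.1])
log.publish(5, "a", [0.9])

node = SearchNode(log)
node.poll()                                # consumed only the entry at ts=1
stale = node.can_serve(query_ts=6, delta=2)
node.poll()                                # caught up to ts=5
fresh = node.can_serve(query_ts=6, delta=2)
```

In this sketch a larger `delta` lets a lagging subscriber answer immediately, while `delta=0` forces it to catch up first, which is one way to read the "tunable consistency" goal. Because the subscriber keeps its own offset, more subscribers can be added or removed without changes to the writer.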


