Thistle: A Vector Database in Rust

03/25/2023
by Brad Windsor, et al.

We present Thistle, a fully functional vector database. Thistle is an entry into the domain of using latent knowledge to answer search queries, an ongoing research topic at both start-ups and search engine companies. We implement Thistle with several well-known algorithms and benchmark it on the MS MARCO dataset. The results help clarify the latent knowledge domain as well as the growing Rust ML ecosystem.
