Encode-Store-Retrieve: Enhancing Memory Augmentation through Language-Encoded Egocentric Perception

08/10/2023
by Junxiao Shen, et al.

We depend on our own memory to encode, store, and retrieve our experiences, yet memory lapses occur. One promising avenue for memory augmentation is to use augmented reality head-mounted displays to capture and preserve egocentric videos, a practice commonly referred to as life logging. However, life logging generates a volume of video data that current technology cannot encode and store efficiently. Retrieving specific information from extensive video archives also requires substantial computational power, which complicates quick access to desired content. To address these challenges, we propose a memory augmentation system that encodes video data in natural language and stores the encodings in a vector database. The approach uses large vision language models to perform the language encoding and large language models to support natural language querying. We evaluated the system extensively on the QA-Ego4D dataset, where it achieved state-of-the-art results with a BLEU score of 8.3, outperforming conventional machine learning models that scored between 3.4 and 5.8. In a user study on real-life episodic memory tasks, our system received a higher mean response score of 4.13/5, compared to the human participants' score of 2.46/5.
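The encode-store-retrieve flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the captions are hypothetical stand-ins for the output of a vision language model, the `embed` function is a toy bag-of-words replacement for a learned text encoder, the in-memory `MemoryStore` class stands in for a vector database, and `build_prompt` only assembles the text that a large language model would presumably receive.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stands in for a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # each entry: (timestamp, caption, vector)

    def add(self, timestamp, caption):
        # "Encode" and "store": keep the caption alongside its vector.
        self.entries.append((timestamp, caption, embed(caption)))

    def retrieve(self, query, k=2):
        # "Retrieve": rank stored captions by similarity to the query.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(ts, cap) for ts, cap, _ in ranked[:k]]


def build_prompt(question, memories):
    """Assemble the context a language model would answer from (no model is called here)."""
    context = "\n".join(f"[{ts}] {cap}" for ts, cap in memories)
    return f"Memories:\n{context}\n\nQuestion: {question}"


# Hypothetical captions, as a vision language model might produce them.
store = MemoryStore()
store.add("09:02", "I put my keys on the kitchen table")
store.add("12:30", "I ate a sandwich in the park")
store.add("18:45", "I locked the bike outside the library")

question = "where did I put my keys"
top = store.retrieve(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Storing short text captions instead of raw video keeps the archive small, and similarity search over their vectors narrows a query to a few candidate memories before any language model is involved.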


research
01/02/2023

NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory

Searching long egocentric videos with natural language queries (NLQ) has...
research
09/21/2023

Memory-Augmented LLM Personalization with Short- and Long-Term Memory Coordination

Large Language Models (LLMs), such as GPT3.5, have exhibited remarkable ...
research
05/23/2023

RET-LLM: Towards a General Read-Write Memory for Large Language Models

Large language models (LLMs) have significantly advanced the field of na...
research
04/15/2021

Time-Stamped Language Model: Teaching Language Models to Understand the Flow of Events

Tracking entities throughout a procedure described in a text is challeng...
research
06/14/2023

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Recent research on Large Language Models (LLMs) has led to remarkable ad...
research
08/07/2023

RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling

The Attention module finds common usage in language modeling, presenting...
research
11/20/2022

Graceful Forgetting II. Data as a Process

Data are rapidly growing in size and importance for society, a trend mot...
