Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

04/26/2023
by Xinnian Liang, et al.

Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses. The controller determines which memories from the archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLM, enabling it to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs that are not optimized for multi-turn dialogue to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will release a test set covering common long-text input scenarios for evaluating the ability of LLMs to process long documents. (Work in progress: https://github.com/wbbeyourself/SCM4LLMs)
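The abstract outlines the control flow: the agent consumes the input turn by turn, every turn is appended to the memory stream, and the controller assembles short-term flash memory (recent turns) plus activated archived memories into the next model input. Below is a minimal Python sketch of that loop. The class names, the word-overlap relevance score, and the `FLASH_SIZE`/`TOP_K` parameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the SCM loop described in the abstract. All names here
# (MemoryStream, MemoryController, llm_generate, FLASH_SIZE, TOP_K) are
# illustrative assumptions, not the paper's actual implementation.

from dataclasses import dataclass, field

def relevance(query: str, memory: str) -> float:
    """Toy relevance score based on word overlap; a real system
    would likely use embedding similarity."""
    q, m = set(query.lower().split()), set(memory.lower().split())
    return len(q & m) / (len(q) + 1e-9)

@dataclass
class MemoryStream:
    """Stores every historical turn (user input, agent response) verbatim."""
    turns: list = field(default_factory=list)

    def add(self, user: str, agent: str) -> None:
        self.turns.append({"user": user, "agent": agent})

class MemoryController:
    """Splits the stream into flash (recent) and archived (older) memory,
    and activates only the archived memories relevant to the current input."""
    FLASH_SIZE = 3   # short-term window size, assumed
    TOP_K = 2        # number of archived memories to activate, assumed

    def build_context(self, stream: MemoryStream, query: str) -> str:
        flash = stream.turns[-self.FLASH_SIZE:]
        archived = stream.turns[:-self.FLASH_SIZE]
        # Activate the archived memories most relevant to the query.
        activated = sorted(
            archived,
            key=lambda t: relevance(query, t["user"] + " " + t["agent"]),
            reverse=True,
        )[: self.TOP_K]
        lines = [f"[archived] U: {t['user']} A: {t['agent']}" for t in activated]
        lines += [f"[flash] U: {t['user']} A: {t['agent']}" for t in flash]
        return "\n".join(lines)

def llm_generate(prompt: str) -> str:
    """Placeholder for any frozen LLM; SCM requires no fine-tuning."""
    return f"(LLM response to {len(prompt)} prompt chars)"

# Iteratively process an ultra-long input, one chunk/turn at a time.
stream, controller = MemoryStream(), MemoryController()
for user_turn in ["chunk 1 ...", "chunk 2 ...", "summarize everything so far"]:
    context = controller.build_context(stream, user_turn)
    answer = llm_generate(context + "\nUser: " + user_turn)
    stream.add(user_turn, answer)
    print(answer)
```

Because the controller selects only a bounded number of memories per step, the prompt handed to the frozen LLM stays within its context window no matter how long the overall input grows, which is what allows the system to scale to arbitrarily long texts.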

Related research

06/12/2023 · Augmenting Language Models with Long-Term Memory
Existing large language models (LLMs) can only afford fix-sized inputs d...

08/29/2023 · Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Most open-domain dialogue systems suffer from forgetting important infor...

05/23/2023 · Narrative XL: A Large-scale Dataset For Long-Term Memory Models
Despite their tremendous successes, most large language models do not ha...

09/01/2021 · ∞-former: Infinite Memory Transformer
Transformers struggle when attending to long contexts, since the amount ...

08/09/2016 · A deep language model for software code
Existing language models such as n-grams for software code often fail to...

10/17/2022 · Keep Me Updated! Memory Management in Long-term Conversations
Remembering important information from the past and continuing to talk a...

12/11/2018 · Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data
Current generation of memory-augmented neural networks has limited scala...
