Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

05/24/2023
by Kejuan Yang, et al.

We identify two crucial limitations in the evaluation of the recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context length of language models (e.g., 2048 for LLaMA) by harnessing window-wise attention and positional embedding techniques. We first show that a simple yet strong baseline, a weighted-sum ensemble, is missing from the in-context few-shot classification evaluation. Moreover, on more challenging Chain-of-Thought (CoT) reasoning tasks (e.g., HotpotQA), PCW exhibits unexpected deterioration in the form of question miscomprehension and false inference. Based on our findings, we suggest that the existing PCW design may not guarantee sufficient improvement and practicality for handling lengthy documents in real-world applications, and we call for more community effort on enabling language models to understand long contexts.
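The weighted-sum ensemble baseline mentioned above can be sketched as follows. This is our reading of the abstract, not the paper's implementation: instead of fusing demonstration windows inside a single forward pass as PCW does, the model is queried once per window and the per-window label distributions are mixed with scalar weights. The function `score_labels` is a hypothetical stand-in for a real LM scorer that would return log P(label | window + query).

```python
import math

def score_labels(window, query, labels):
    # Toy scorer for illustration only: favors labels whose name appears
    # in the window's demonstrations. A real implementation would ask an
    # LM for the log-probability of each candidate label given the
    # concatenated window and query.
    return {lab: float(sum(lab in line for line in window)) for lab in labels}

def weighted_sum_ensemble(windows, query, labels, weights=None):
    """Combine per-window label distributions with scalar weights."""
    if weights is None:
        weights = [1.0 / len(windows)] * len(windows)  # uniform by default
    combined = {lab: 0.0 for lab in labels}
    for w, window in zip(weights, windows):
        scores = score_labels(window, query, labels)
        # Softmax-normalize each window's scores before mixing, so every
        # window contributes a proper probability distribution.
        z = sum(math.exp(s) for s in scores.values())
        for lab, s in scores.items():
            combined[lab] += w * math.exp(s) / z
    return max(combined, key=combined.get)
```

For example, with three demonstration windows (two labeled "positive", one "negative") and uniform weights, the ensemble predicts the majority label. The design point is that each window is processed independently, so no modification of attention or positional embeddings is required.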


