HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

09/06/2023
by Guijin Son, et al.

Large Language Models (LLMs) pretrained on massive corpora exhibit remarkable capabilities across a wide range of tasks; however, non-English languages have received limited attention in this field of research. To address this gap and assess the proficiency of language models in the Korean language and culture, we present HAE-RAE Bench, which covers 6 tasks including vocabulary, history, and general knowledge. Our evaluation of language models on this benchmark highlights the potential advantages of employing Large Language-Specific Models (LLSMs) over a comprehensive, universal model like GPT-3.5. Remarkably, our study reveals that models approximately 13 times smaller than GPT-3.5 can exhibit similar performance levels in language-specific knowledge retrieval. This observation underscores the importance of homogeneous corpora for training professional-level language-specific models. At the same time, we observe a perplexing performance dip in these smaller LMs when they are tasked with generating structured answers.


