WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

08/18/2023
by   Haipeng Luo, et al.

Large language models (LLMs), such as GPT-4, have shown remarkable performance on natural language processing (NLP) tasks, including challenging mathematical reasoning. However, most existing open-source models are pre-trained only on large-scale internet data, without math-related optimization. In this paper, we present WizardMath, which enhances the mathematical reasoning abilities of Llama-2 by applying our proposed Reinforcement Learning from Evol-Instruct Feedback (RLEIF) method to the domain of math. Through extensive experiments on two mathematical reasoning benchmarks, GSM8k and MATH, we demonstrate the strong capabilities of our model. WizardMath surpasses all other open-source LLMs by a substantial margin. Furthermore, our model outperforms ChatGPT-3.5, Claude Instant-1, PaLM-2 and Minerva on GSM8k, and also surpasses Text-davinci-002, PaLM-1 and GPT-3 on MATH. More details and model weights are publicly available at https://github.com/nlpxucan/WizardLM and https://huggingface.co/WizardLM.

research
06/14/2023

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

Code Large Language Models (Code LLMs), such as StarCoder, have demonstr...
research
09/01/2023

No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function

Large language models (LLMs) exhibit impressive language understanding a...
research
09/21/2023

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Large language models (LLMs) have pushed the limits of natural language ...
research
05/26/2023

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

As large language models (LLMs) are continuously being developed, their ...
research
05/22/2023

VideoLLM: Modeling Video Sequence with Large Language Models

With the exponential growth of video data, there is an urgent need for a...
research
05/03/2022

ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models

Nowadays, pretrained language models (PLMs) have dominated the majority ...
research
10/05/2022

Ask Me Anything: A simple strategy for prompting language models

Large language models (LLMs) transfer well to new tasks out-of-the-box s...
