Evaluating ChatGPT and GPT-4 for Visual Programming

07/30/2023
by   Adish Singla, et al.
0

Generative AI and large language models have the potential to drastically improve the landscape of computing education by automatically generating personalized feedback and content. Recent works have studied the capabilities of these models for different programming education scenarios; however, these works considered only text-based programming, in particular, Python programming. Consequently, they leave open the question of how well these models would perform in visual programming domains popularly used for K-8 programming education. The main research question we study is: Do state-of-the-art generative models show advanced capabilities in visual programming on par with their capabilities in text-based Python programming? In our work, we evaluate two models, ChatGPT (based on GPT-3.5) and GPT-4, in visual programming domains for various scenarios and assess performance using expert-based annotations. In particular, we base our evaluation using reference tasks from the domains of Hour of Code: Maze Challenge by Code-dot-org and Karel. Our results show that these models perform poorly and struggle to combine spatial, logical, and programming skills crucial for visual programming. These results also provide exciting directions for future work on developing techniques to improve the performance of generative models in visual programming.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/29/2023

Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors

Generative AI and large language models hold great promise in enhancing ...
research
05/26/2023

Neural Task Synthesis for Visual Programming

Generative neural models hold great promise in enhancing programming edu...
research
06/27/2023

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

We evaluate AI-assisted generative capabilities on fundamental numerical...
research
03/16/2023

Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?

We evaluated the capability of generative pre-trained transformers (GPT)...
research
06/15/2023

Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle to Pass Assessments in Higher Education Programming Courses

This paper studies recent developments in large language models' (LLM) a...
research
05/03/2023

GPTutor: a ChatGPT-powered programming tool for code explanation

Learning new programming skills requires tailored guidance. With the eme...
research
08/15/2023

Large Language Models in Introductory Programming Education: ChatGPT's Performance and Implications for Assessments

This paper investigates the performance of the Large Language Models (LL...

Please sign up or login with your details

Forgot password? Click here to reset