On the Applicability of Language Models to Block-Based Programs

02/08/2023
by Elisabeth Griebl, et al.

Block-based programming languages like Scratch are increasingly popular for programming education and end-user programming. Recent program analyses build on the insight that source code can be modelled using techniques from natural language processing. Many of the regularities of source code that support this approach are due to the syntactic overhead imposed by textual programming languages. This syntactic overhead, however, is precisely what block-based languages remove in order to simplify programming. Consequently, it is unclear how well this modelling approach performs on block-based programming languages. In this paper, we investigate the applicability of language models to the popular block-based programming language Scratch. We model Scratch programs using n-gram models, the most essential type of language model, and transformers, a popular deep learning model. Evaluation on the example tasks of code completion and bug finding confirms that blocks inhibit predictability, but that the use of language models is nevertheless feasible. Our findings serve as a foundation for improving tooling and analyses for block-based languages.
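To make the n-gram idea concrete, the following is a minimal sketch of how an n-gram model could suggest the next block in a script. It is an illustration under stated assumptions, not the paper's implementation: each script is assumed to be flattened into a linear sequence of block opcodes, the toy corpus is invented for this example, and the helper names `train_ngram` and `complete` are hypothetical.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # padding tokens (an assumption of this sketch)


def train_ngram(sequences, n=3):
    """Count which token follows each (n-1)-token context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        padded = [START] * (n - 1) + list(seq) + [END]
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            counts[context][padded[i]] += 1
    return counts


def complete(counts, prefix, n=3, k=3):
    """Return up to k most frequent next tokens for the given prefix."""
    padded = [START] * (n - 1) + list(prefix)
    context = tuple(padded[len(padded) - (n - 1):])
    followers = counts.get(context, Counter())
    return [token for token, _ in followers.most_common(k)]


# Invented toy corpus: each script is a flat list of block opcodes.
scripts = [
    ["event_whenflagclicked", "control_forever",
     "motion_movesteps", "motion_ifonedgebounce"],
    ["event_whenflagclicked", "control_forever",
     "motion_movesteps", "motion_turnright"],
    ["event_whenflagclicked", "control_forever",
     "motion_movesteps", "motion_ifonedgebounce"],
    ["event_whenflagclicked", "looks_say"],
]

model = train_ngram(scripts, n=3)
suggestions = complete(
    model,
    ["event_whenflagclicked", "control_forever", "motion_movesteps"],
    n=3,
)
print(suggestions)
```

With a trigram model, only the last two blocks of the prefix determine the suggestion. A real corpus would also need smoothing or back-off for contexts that were never observed, which this sketch omits.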


