Human Languages in Source Code: Auto-Translation for Localized Instruction

09/10/2019
by Chris Piech, et al.

Computer science education has promised open access around the world, but access is largely determined by which human language you speak. As younger students learn computer science, it is less appropriate to assume that they should learn English beforehand. To that end, we present CodeInternational, the first tool to translate code between human languages. To develop a theory of non-English code and to inform our translation decisions, we conduct a study of public code repositories on GitHub. To the best of our knowledge, this study is the first on human language in code; it covers 2.9 million Java repositories. To demonstrate CodeInternational's educational utility, we build an interactive version of the popular English-language Karel reader and translate it into 100 spoken languages. Our translations have already been used in classrooms around the world and represent a first step toward solving an important open problem in CS education.


