Expanding the scope of statistical computing: Training statisticians to be software engineers

by Alex Reinhart, et al.

Traditionally, statistical computing courses have taught the syntax of a particular programming language or specific statistical computation methods. Since the publication of Nolan and Temple Lang (2010), we have seen a greater emphasis on data manipulation, reproducible research, and visualization. This shift better prepares students for careers working with complex datasets and producing analyses for multiple audiences. But, we argue, statisticians are now often called upon to develop not just analyses but statistical software, such as R packages implementing new analysis methods or machine learning systems integrated into commercial products. This demands different skills. We describe a graduate course developed to meet this need by focusing on four themes: programming practices; software design; important algorithms and data structures; and essential tools and methods. Through code review and revision, and a semester-long software project, students practice all the skills of software engineering. The course allows students to expand their understanding of computing as applied to statistical problems while building expertise in the kind of software development that is increasingly the province of the working statistician. We see this as a model for the future evolution of the computing curriculum in statistics and data science.



