TaskTracker-tool: a Toolkit for Tracking of Code Snapshots and Activity Data During Solution of Programming Tasks

by Elena Lyulina, et al.

The process of writing code and using features of an integrated development environment (IDE) is a fruitful source of data in computing education research. Existing studies use records of students' actions in the IDE, consecutive code snapshots, compilation events, and other data to gain deep insight into the process of student programming. In this paper, we present a set of tools for collecting and processing data on student activity during problem-solving. The first tool is a plugin for IntelliJ-based IDEs (PyCharm, IntelliJ IDEA, CLion). By capturing snapshots of code and IDE interaction data, it makes it possible to analyze the process of writing code in different languages: Python, Java, Kotlin, and C++. The second tool is designed for post-processing the data collected by the plugin and is capable of basic analysis and visualization. To validate and showcase the toolkit, we present a dataset collected with our tools. It consists of records of activity and IDE interaction events during the solution of programming tasks by 148 participants of different ages and levels of programming experience. We propose several directions for further exploration of the dataset.




1. Introduction

With the ever-increasing presence of software in our lives, programming education is becoming increasingly popular (Mcgettrick et al., 2005; Robins et al., 2003; Yang et al., 2015). However, teaching programming is challenging. Programming courses are usually based on tasks and projects (Ambrose et al., 2010; Gratchev and Jeng, 2018) that students should complete by themselves. Many students may enroll in a particular programming course at once (Danielsiek and Vahrenhold, 2016). As a result, it is not always possible for teachers to pay enough attention to each individual student to have a detailed understanding of their progress. The process of programming can be a valuable source of insight into the learning process: for example, typical “novice errors” may indicate particular gaps in students’ understanding (Altadmri and Brown, 2015; Robins et al., 2003; Konecki, 2014). These challenges make tracking and analysis of students’ coding behavior a promising technique for educational research.

A number of studies focus on the analysis of students’ general behavior (Jadud, 2005; Vihavainen et al., 2014), their code patterns (Blikstein et al., 2014; Bulmer et al., 2018), and errors (Altadmri and Brown, 2015; McCall and Kölling, 2014; Tirronen et al., 2015). Other studies aim to facilitate the process of teaching by observing the learning progress within groups in computer science (CS) courses (Yan et al., 2019; Blikstein, 2011). Such studies need data that records the interaction between a student and their programming environment. This data may include sequential code snapshots, compilation events, or actions performed in the integrated development environment (IDE).

The functionality of existing data collection tools (Brown et al., 2014; Shah, 2003; Norris et al., 2008; Spacco et al., 2006) varies depending on the purpose of their use, but they still have a lot in common. Such tools are usually implemented as plugins for IDEs, which simplifies their installation and preserves a natural programming environment. These tools tend to have a client-server architecture, which makes it possible to automatically receive and store the gathered data. This data typically consists of the user’s interactions with the IDE and snapshots of their code. However, existing tools have several restrictions. They are often tailored to support a single programming language (Norris et al., 2008; Brown et al., 2014), do not track the context of students’ actions (Norris et al., 2008; Spacco et al., 2006; Kazerouni et al., 2017), or do not provide detailed information about tasks (Norris et al., 2008; Brown et al., 2014). Some of the tools offer many other features besides data collection, which makes the data gathering process difficult and confusing for both students and researchers (Spacco et al., 2006; Shah, 2003).

Our intention was to create a tool capable of collecting all consecutive code snapshots and user actions during the solution of a programming task in various programming languages. Such a tool could make teaching more effective by providing additional information about the process of programming. Detailed insights into the task-solving process could help teachers improve their courses and understand which topics and assignments may be more difficult for students. In addition, the tool could be used by researchers to gather data beyond the classroom. It is important that collected activity data, such as IDE interaction events, is linked to concrete tasks so that the context of actions is available in further analysis. Finally, the tool should be as flexible and easy to customize as possible. In particular, data gathering should not be limited to one IDE or programming language.

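This work presents the following contributions:

  • TaskTracker-tool, a toolkit for collecting and processing code snapshots and IDE activity data during the solution of programming tasks in IntelliJ-based IDEs;

  • a publicly available dataset collected with the toolkit, covering solutions by 148 participants of different ages and levels of programming experience.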

The remainder of the paper is organized as follows. In Section 2 we discuss related work and motivate the development of TaskTracker-tool. Section 3 describes the proposed tools for data gathering, post-processing, and analysis. Section 4 describes the dataset and highlights the points of interest in a sample of data produced by TaskTracker-tool. In Section 5 we draw conclusions, discuss the encountered difficulties, and outline future work.

2. Background

Analysis of students’ behavior in the IDE during problem-solving is an established technique in computing education research (Ihantola et al., 2015). Some studies rely on existing datasets for analysis (Altadmri and Brown, 2015; McCall and Kölling, 2014), while other researchers gather their own datasets (Jadud, 2005; Vihavainen et al., 2014). Such datasets usually include information relevant to the programming process, such as code snapshots, interactions of users with the IDE, and various demographic data (Hundhausen et al., 2017; Ihantola et al., 2015). Collecting such data is a tedious process that involves substantial technical work. In this section we review prior studies that collect such datasets with various tools and briefly describe their approaches to data collection.

Norris et al. (Norris et al., 2008) developed the Clockit plugin for the BlueJ IDE (blu, 2020), augmented with a web-based data visualizer, to visually compare the behavior of novice programmers with that of more experienced ones. Their plugin is designed to capture and log various events in the IDE, such as compilation, running the code, or changes in file size. The collected log files are later sent to a server. Marmoset (Spacco et al., 2006) is a plugin for the Eclipse IDE (ecl, 2020a) that was developed as an automated testing system to grade submissions of solutions to programming tasks. Its notable feature is automatic synchronization of contents with the version control system each time a student saves their project. This approach enables the collection of sequential snapshots of code for further analysis.

DevEventTracker (Kazerouni et al., 2017) is another Eclipse plugin that collects all code snapshots during the coding process. The plugin works with the Web-CAT (Shah, 2003) system, which supports analysis of the data gathered in the classroom. The system provides features such as scoring of solutions, providing feedback to students, and tracking their progress. Blackbox (Brown et al., 2014) is a large-scale data collection project for the BlueJ IDE. The data gathered by Blackbox includes code changes and user actions in the IDE. After a user consents to send their data, all changes that occur through the user’s interaction with the IDE are sent to the Blackbox server. The authors collected a large open dataset, which is available to the community.

Existing tools provide wide opportunities for data collection and analysis. They can be used to support students through the whole learning process or to collect comprehensive datasets with code and IDE interactions of students all around the world. However, existing tools also have some limitations.

Data gathering tools that are designed to assist in teaching often include additional features, like code review or chats. While providing additional value, such features make the tools less flexible and harder to integrate into an already established teaching workflow. Because they are plugins for IDEs such as BlueJ or Eclipse, existing tools are suitable only for a particular programming language or a relatively narrow audience (ecl, 2017, 2020b). IDEs for beginners differ from professional IDEs (Hagan and Markham, 2000; Kölling, 2008), so the actions performed by users and the general flow of solutions may differ between them, which limits the scope of use for the gathered data. On top of that, users accustomed to other IDEs cannot participate in data gathering or use their familiar tools while learning.

Data collected by existing tools is rich enough to interpret the programming process for a variety of purposes, since it contains both user interactions with the IDE and snapshots of code. However, sometimes code snapshots are too sparse to meaningfully restore the solution’s timeline (Kazerouni et al., 2017; Spacco et al., 2006; Brown et al., 2014). Moreover, information about the task that the student is currently working on is not always complete (Brown et al., 2018). Such information can be essential to analyze and facilitate the process of implementation of the solution, which causes most of the difficulties for students (Lahtinen et al., 2005) and is therefore interesting to study deeply.

In our work, we strive to overcome the restrictions of existing tools, such as low flexibility, narrow range of supported IDEs, and sparsity of data. We present a tool called TaskTracker-tool. It is designed as a plugin for IntelliJ-based IDEs (int, 2020) and therefore supports working with various programming languages (Python, Java, Kotlin, C++). In addition, support for any other language integrated with these IDEs can be easily added to our toolkit.

The main purpose of TaskTracker-tool is to collect all code changes and IDE actions that happen during the process of solving programming assignments. This data can either be used in the classroom or collected for further study in a research environment. The user interface (UI) of the plugin allows the user to choose a task to solve. This feature is useful both for data gathering and for use in class: for example, the tool can be used during a test with predefined tasks to track students’ individual solution patterns. The data collected by TaskTracker-tool is grouped per task. In contrast to existing work, this allows exploring the data while keeping track of the exact context where it was produced. The client-server architecture allows the plugin to be used as a data gathering tool. Finally, the distribution package of TaskTracker-tool includes utilities for post-processing and visualization of data, providing basic analysis capabilities and suitable for use as a base for more advanced analysis.

3. TaskTracker-tool

Our intention was to create a tool to collect fine-grained data on how students solve their programming assignments and to analyze their learning progress. An important requirement for such a tool, imposed by the limitations of existing approaches, is flexibility and ease of tailoring it to a particular environment. The tool should both extend the (currently narrow) range of IDEs that have similar plugins available and be capable of collecting snapshots of the solution’s code along with information about the task. To broaden potential use cases, the tool should not be overloaded with extra features that may affect students’ behavior or require significant changes in teaching practices.

We have developed a tool called TaskTracker-tool that allows gathering sequences of code snapshots and user-to-IDE interactions during the process of solution. TaskTracker-tool includes:

  1. TaskTracker-plugin: an IDE plugin to track the process of solving tasks within the IDE;

  2. TaskTracker-server: a server to gather data remotely, collect solutions of multiple students in one place, and customize the UI of the plugin;

  3. Data post-processing tool: a set of utilities for basic processing of data collected by TaskTracker-plugin.

The first part of TaskTracker-tool is TaskTracker-plugin for IDEs based on the IntelliJ Platform (int, 2020), such as PyCharm, CLion, IntelliJ IDEA, and others. The plugin collects all code changes during the solution process. In addition, it works in conjunction with the Activity Tracker (act, 2020) plugin, which captures students’ interactions with the IDE such as copy-paste, run-debug, and other actions. Figure 1 presents the workflow of a student using the plugin. First, the plugin has to be installed into the IDE. The work starts with filling out a survey that covers gender, age, country, and programming experience. This information is then stored locally and does not need to be entered again when switching to the next task or starting a new session. Next, the student selects a task to solve from the list of tasks defined in the plugin’s configuration. Each task has its own description, which the student should read before starting to solve it. The plugin automatically creates a draft file for each solution.

Figure 1. TaskTracker-plugin student workflow

TaskTracker-tool has a client-server architecture. The use of a server makes it possible to configure the plugin remotely (for example, by updating the list of tasks) and to collect data in a centralized manner. In addition, the server reduces the workload on the user’s computer. The only requirement of the plugin is an internet connection.

The last part of TaskTracker-tool is Data post-processing tool, a set of utilities for the analysis of the collected data. Its task is to prepare the gathered data for further analysis by adapting it to specific problems. For example, it can filter out unnecessary steps in the solution flow captured by TaskTracker-plugin or merge consecutive IDE actions and code snapshots. It also assists in the exploration of the gathered data through visualization of aggregated information, such as distributions of different measures of participants, the progress of individual solutions, or unusual patterns in IDE interaction.

3.1. TaskTracker-plugin

The goal of TaskTracker-plugin is to collect all consecutive code changes and IDE actions during the solution of a programming task.

Collected data. Every time the solution is updated, even by one character, the plugin writes a snapshot of its code to the corresponding .csv file. High-frequency snapshots ensure that every detail of the solution process is recorded. To respect students’ privacy, the plugin only collects data related to the solution. To achieve this, the plugin automatically creates a dedicated file for each task where the user should write their solution, and discards changes in other files.
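The snapshot-on-every-change behavior can be illustrated with a minimal Python sketch. It is not the plugin's implementation (the plugin runs inside IntelliJ-based IDEs); the `SnapshotLogger` class and the column names `timestamp`, `task`, and `fragment` are assumptions made for illustration only.

```python
import csv
import io
import time


class SnapshotLogger:
    """Appends one CSV row per document change (hypothetical column layout)."""

    FIELDS = ["timestamp", "task", "fragment"]

    def __init__(self, stream, task, clock=time.time):
        self._writer = csv.DictWriter(stream, fieldnames=self.FIELDS)
        self._writer.writeheader()
        self._task = task
        self._clock = clock
        self._last = None

    def on_change(self, code):
        """Record the full text of the solution file after an edit."""
        if code == self._last:  # skip events that did not change the text
            return False
        self._writer.writerow(
            {"timestamp": self._clock(), "task": self._task, "fragment": code}
        )
        self._last = code
        return True


def demo():
    """Simulate a few edits and read the resulting rows back."""
    buffer = io.StringIO()
    logger = SnapshotLogger(buffer, task="max_3")
    for text in ["p", "pr", "pr", "print(max(a, b, c))"]:
        logger.on_change(text)
    buffer.seek(0)
    return list(csv.DictReader(buffer))
```

Storing the full text of the file in each row, rather than a diff, keeps every row self-contained at the cost of larger files.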

Interactions between the student and the IDE are tracked by an additional plugin, Activity Tracker, and are also stored in a .csv file. The format of logs and a complete list of tracked events can be found on the project’s page (act, 2020). All the data is sent anonymously (no user names, paths, or data unrelated to tasks are sent) and with the student’s permission, as soon as they mark their solution as done and submit it.

User interface. Besides data collection, the plugin provides a convenient interface to ease the routine of problem-solving for both students and teachers. Its UI shows the description of every task right within the IDE, so that the student is not distracted from the solution process when they need to consult the description. In addition, tasks can be remotely customized by the teacher for each new session.

3.2. TaskTracker-server

TaskTracker-server is used in conjunction with TaskTracker-plugin for data gathering. The server can be launched locally or remotely, depending on the desired setup. A dedicated server allows the plugin workflow to be flexible, automated, and, if needed, remote. The two primary functions of the server are sending data to TaskTracker-plugin to initialize its UI and receiving data with user solutions and IDE actions. Remote UI configuration makes it possible to customize the list of tasks and the language of the interface. Remote data collection greatly facilitates the process of data gathering. The setup is as easy as cloning the server’s repository, deploying it remotely or locally, and setting its URL in the plugin configuration.

Server configuration. The first feature of the server is remote customization of TaskTracker-plugin’s UI. The server stores a list of supported UI languages along with translations of UI texts. Therefore, to add a new language, one should add it to the list of languages and extend the existing translations. Tasks are customized in a similar way by editing the list of tasks, which is also stored on the server. A detailed description of these lists and of the implemented data models can be found in the server repository (TaskTracker-server documentation: https://github.com/JetBrains-Research/task-tracker-server/wiki).

Data storage. The server provides another major feature — receiving and storing files from TaskTracker-plugin. The files generated by the plugin are uploaded into the server’s database. Each new user receives a unique identifier that does not disclose their identity. This is necessary to be able to attribute solutions to a particular user later, as all data is gathered anonymously. At the end of the gathering process, the collected data is divided into folders per user and task and can be downloaded as an archive.
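As a rough illustration of anonymous identifiers and the per-user, per-task layout of the downloaded archive, consider the following Python sketch. The function names and the exact folder structure are assumptions; the server's actual data models are described in its repository.

```python
import uuid
from pathlib import PurePosixPath


def new_user_id():
    """Return a random identifier that carries no personal information."""
    return uuid.uuid4().hex


def storage_path(root, user_id, task, filename):
    """Build a per-user, per-task location for an uploaded file (assumed layout)."""
    return PurePosixPath(root) / user_id / task / filename
```

A random identifier lets several submissions be attributed to the same participant without storing a name or an email address.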

3.3. Data post-processing tool

Data post-processing tool prepares raw data collected by TaskTracker-plugin for further analysis. This data contains snapshots of code collected during the solution process and records of user interaction with the IDE. The tool consists of two major modules. The first is responsible for data processing, and the other handles data visualization.

Post-processing. The data post-processing module is required to prepare raw data for further analysis. While post-processing could be implemented directly in the plugin, we chose to implement it as a separate set of tools for several reasons. First, we wanted to minimize the size of the plugin. In addition, some operations, such as scoring of solutions, may take a long time. Keeping such operations, which do not have to be real-time, in the plugin could slow it down substantially. Finally, keeping data processing separate makes it possible to use different combinations of data processing submodules to produce different datasets. For example, one may want to create a family of datasets varying in degree of data granularity, or only keep a particular part of the data, such as IDE interactions or code snapshots.

Currently, the data post-processing module consists of several submodules:

  • merging Activity Tracker and TaskTracker files;

  • scoring solutions;

  • removing intermediate diffs.

Code snapshots and IDE interactions are collected by separate plugins and are saved into separate files. The data post-processing module contains a submodule for combining this data.
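One possible way to combine the two streams is to align them by time. The Python sketch below attaches to every IDE event the most recent snapshot taken at or before it. The alignment rule, the dictionary keys, and the action names in the test data are illustrative assumptions, not the merging procedure of the actual submodule.

```python
from bisect import bisect_right


def merge_events(snapshots, events):
    """Attach to each IDE event the latest snapshot taken at or before it.

    Both inputs are lists of dicts sorted by a numeric "timestamp" key.
    Events that precede the first snapshot get a fragment of None.
    """
    times = [s["timestamp"] for s in snapshots]
    merged = []
    for event in events:
        idx = bisect_right(times, event["timestamp"]) - 1
        fragment = snapshots[idx]["fragment"] if idx >= 0 else None
        merged.append({**event, "fragment": fragment})
    return merged
```

Binary search keeps the merge at O(n log m) for n events and m snapshots, which matters when a snapshot is stored for every keystroke.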

The submodule for scoring of solutions allows running the tests to calculate a correctness rating for each task. The correctness of a solution can be defined as the percentage of passing tests. Similarly to TaskTracker-plugin, the scoring submodule supports multiple languages: Java, Python, Kotlin, and C++.
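The correctness rating described above can be sketched as follows. For brevity, the solution is assumed to be a Python callable and each test an (arguments, expected result) pair; the actual submodule runs programs in four languages, so this is only an illustration of the percentage-of-passing-tests idea.

```python
def correctness(solution, tests):
    """Return the fraction of (args, expected) pairs the solution passes."""
    if not tests:
        raise ValueError("at least one test is required")
    passed = 0
    for args, expected in tests:
        try:
            if solution(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing solution simply fails this test
    return passed / len(tests)


def max3_buggy(a, b, c):
    """A deliberately wrong solution to the Max 3 task: it ignores c."""
    return a if a > b else b


# Hypothetical test cases for the Max 3 task.
MAX3_TESTS = [((1, 2, 3), 3), ((3, 2, 1), 3), ((2, 3, 1), 3), ((1, 1, 1), 1)]
```

With these four tests, `correctness(max3_buggy, MAX3_TESTS)` evaluates to 0.75, while the built-in `max` scores 1.0.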

The submodule for removing intermediate diffs is capable of deleting all intermediate, i.e. non-final, states in code snapshots that are collected during the implementation of a solution. The definition of “final” is configurable: for example, final snapshots may be taken after completion of every new line, or adjusted by other criteria. Filtering out intermediate diffs enables adjustments to the level of data granularity, which in turn helps to reduce noise and adapt the collected data to further processing.
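A minimal Python sketch of one such criterion, "keep the state reached just before a new line is started", is shown below. The function name and the newline-count rule are assumptions chosen for illustration; the actual submodule supports other definitions of "final".

```python
def drop_intermediate(fragments):
    """Keep snapshots that directly precede the start of a new line, plus the last one.

    `fragments` is the chronological list of full code snapshots.
    """
    kept = []
    for current, following in zip(fragments, fragments[1:]):
        if following.count("\n") > current.count("\n"):
            kept.append(current)
    if fragments:
        kept.append(fragments[-1])
    return kept
```

For the sequence `["a", "a=", "a=1", "a=1\n", "a=1\np", "a=1\nprint(a)"]`, this keeps only `"a=1"` and `"a=1\nprint(a)"`, reducing six snapshots to two.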

Data visualization. The data visualization module is designed to find interesting and unusual patterns in the gathered data by plotting it.

Charts of distributions of participants and tasks are available for plotting. They may be crucial to determine whether the dataset is representative and complete.

The next types of plots can provide deep insight into the solution process by recreating its flow. The first plot (Figure 2) visualizes the sequence of actions performed by the user during the solution, which makes these actions visible and easy to analyze. The second plot (Figure 3) represents the dynamics of the score, which helps to identify problematic stages in the solution process. It also highlights characteristic patterns, like a drop in the score after an unsuccessful change. Each of these plots may assist in the assessment of students’ understanding of the task and general programming concepts, as well as their familiarity with IDE features, and helps to tailor the learning process to suit each student best.

3.4. Use cases

Data collected by TaskTracker-tool can find use in practical settings (for example, to support the learning process in programming courses) and in research environments (for example, to collect a dataset of the behavior of a group of students during their work on a particular task). TaskTracker-tool has detailed documentation. In particular, to ease the setup and use of the toolkit, it covers plugin setup and server deployment in detail.

Programming courses. In the environment of a programming course, TaskTracker-tool can serve several goals. First, thanks to remote configuration capabilities, it can serve as a framework for observed problem-solving in the classroom. In addition, analysis of the gathered data within the course group can help improve the course by adjusting the difficulty of tasks or letting the teacher focus on topics that the students may find hard. Moreover, during data gathering, students have an option to disclose their identifiers to the teacher. This makes it possible to build a personalized overview of the solution data for each student and provide individual insights. While the same result can be achieved by talking to students, conversations with each of them may take more of the teacher’s time and energy than monitoring the visualized solution process. Finally, the plugin can help identify cheating. For example, if a large piece of code was inserted by a paste command and immediately leads to a perfect score, the teacher may want to be more attentive. Such events can easily be highlighted in data collected by TaskTracker-tool.
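The paste-then-perfect-score pattern mentioned above could be flagged with a simple heuristic such as the Python sketch below. The row format, the `EditorPaste` action name, and the 100-character threshold are assumptions for illustration; a flagged row is a prompt for the teacher to look closer, not evidence of cheating.

```python
def suspicious_pastes(rows, min_chars=100):
    """Return indices where a paste adds many characters and the score becomes perfect.

    Each row is a dict with "action", "fragment", and "score" (0.0 to 1.0),
    in chronological order.
    """
    flagged = []
    for i in range(1, len(rows)):
        grown = len(rows[i]["fragment"]) - len(rows[i - 1]["fragment"])
        if (
            rows[i]["action"] == "EditorPaste"
            and grown >= min_chars
            and rows[i]["score"] == 1.0
            and rows[i - 1]["score"] < 1.0
        ):
            flagged.append(i)
    return flagged
```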

Data gathering. Besides classroom use, TaskTracker-tool can be used to collect data in a wide variety of research settings. The primary characteristic that makes it suitable for research data gathering is flexibility. TaskTracker-tool has a client-server architecture, which enables remote data gathering. Moreover, UI text and tasks can be configured in any language, which does not limit the collection of data only to English-speaking participants. The demographic survey allows for collecting additional information about users.

TaskTracker-plugin is capable of gathering incremental data on the process of problem-solving in different programming languages in a variety of IDEs. The plugin is easy to install into the IDE. In addition, we provide detailed step-by-step guides for installing and uninstalling the plugin (TaskTracker-plugin guides: https://github.com/JetBrains-Research/task-tracker-plugin/wiki). Our own experience with using TaskTracker-tool for data collection demonstrates that even users with little programming experience are able to install and set up TaskTracker-tool easily.

TaskTracker-tool is designed with care for users’ privacy. The data is sent anonymously, and the tracking feature only extends to files automatically created by TaskTracker-plugin. This ensures that users’ privacy is respected, and protects their personal data from being unintentionally fingerprinted during data collection.

4. Dataset

Figure 2. Actions performed in IDE during solution of a task

To showcase TaskTracker-tool, we present a dataset collected with our toolkit. While we use it in our ongoing research project, it can find use in other studies and thus may be of value to the community. We discuss some potential applications in Section 4.1.

The dataset consists of code snapshots, IDE actions, and demographic information. In total, 148 participants aged 11 to 40 (mean age 19 years) took part in the data gathering process. Their programming experience ranges from zero to more than 6 years.

During data gathering, solutions were accepted in one of four languages: Python, Java, Kotlin, or C++. However, some of the students chose not to submit tasks or solved some tasks incorrectly. At the same time, some students solved some tasks many times in multiple languages. All submitted solutions are included in the final dataset.

For this experiment, we proposed six different programming problems of different difficulty levels as tasks. Table 1 presents an overview of the problems. Our dataset includes 474 solutions to these problems. Table 2 presents solution statistics per task and language.

Task       Description
Pies       A single pie costs A dollars and B cents in the cafe. Calculate how many dollars and cents one needs to pay for N pies.
Max 3      Print the largest of three numbers in the input.
Is zero    Check if there are zeros among numbers in the input.
Voting     Given three numbers, each of them being 1 or 0, determine which one occurs more often: 1 or 0. Print the number that occurs more often.
Max digit  Given a string containing only digits, find and print the largest digit.
Brackets   Place opening and closing brackets into the input string like this: for odd length: example → e(x(a(m)p)l)e; for even length: card → c(ar)d, but not c(a()r)d.

Table 1. Task descriptions

Task       Python      Java        Kotlin      C++         All
           S     NS    S     NS    S     NS    S     NS    S     NS
Pies       67    31    16    7     4     4     1     0     88    42
Max 3      31    12    22    2     4     4     1     0     58    18
Is zero    27    13    8     0     7     1     2     0     44    14
Voting     32    29    24    0     3     4     1     0     60    33
Max digit  16    5     5     1     5     4     1     0     27    10
Brackets   37    28    4     1     6     2     2     0     49    31
All        210   118   79    11    29    19    8     0     326   148

Table 2. Number of submitted solutions. S: number of correct solutions; NS: number of incorrect solutions

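The totals in Table 2 can be cross-checked against the 474 solutions reported in the text with a few lines of Python. The numbers are transcribed from the table; the dictionary layout and the `totals` helper are only an illustrative choice.

```python
# (S, NS) pairs per task and language, transcribed from Table 2.
TABLE_2 = {
    "Pies": {"Python": (67, 31), "Java": (16, 7), "Kotlin": (4, 4), "C++": (1, 0)},
    "Max 3": {"Python": (31, 12), "Java": (22, 2), "Kotlin": (4, 4), "C++": (1, 0)},
    "Is zero": {"Python": (27, 13), "Java": (8, 0), "Kotlin": (7, 1), "C++": (2, 0)},
    "Voting": {"Python": (32, 29), "Java": (24, 0), "Kotlin": (3, 4), "C++": (1, 0)},
    "Max digit": {"Python": (16, 5), "Java": (5, 1), "Kotlin": (5, 4), "C++": (1, 0)},
    "Brackets": {"Python": (37, 28), "Java": (4, 1), "Kotlin": (6, 2), "C++": (2, 0)},
}


def totals(table):
    """Sum correct (S) and incorrect (NS) solutions over all tasks and languages."""
    solved = sum(s for row in table.values() for s, _ in row.values())
    unsolved = sum(ns for row in table.values() for _, ns in row.values())
    return solved, unsolved


print(totals(TABLE_2))  # (326, 148), which adds up to 474 solutions
```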
Figure 3. Changes of task score during a sample solution

4.1. Use cases

The gathered dataset is publicly available. To preserve privacy, we anonymized variables and function names in the code; IDs of users are also depersonalized. In this subsection, we present several topics for exploration in our dataset, which other researchers may find of interest.

Experience and feature use. The dataset can be used to investigate the relationship between programming experience and the use of language features or IDE actions. Beginners may not yet be familiar with some basic features of an IDE, such as a debugger. This may influence their productivity during problem-solving and pose a barrier for further learning.

Influence of age on feature use. It is not only children who learn to program. In a group of students with similar programming experience, ages may differ greatly. It would be interesting to see how the use of language features and the patterns of actions performed in the IDE are related to age.

Actions after solution. After getting the solution right, students do not always submit the task immediately: for example, they may first refactor the code to improve it. It could be interesting to know the most frequent changes and IDE actions that occur after achieving a perfect score.
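As a starting point for this kind of analysis, the Python sketch below counts the IDE actions recorded after the first snapshot with a perfect score. The row format (dicts with `action` and `score` keys) and the action names in the test data are assumptions, not the dataset's actual column layout.

```python
from collections import Counter


def actions_after_solved(rows):
    """Count IDE actions recorded after the first row with a perfect score.

    `rows` is a chronological list of dicts with "action" and "score" keys.
    Returns an empty Counter if the solution never reaches a score of 1.0.
    """
    for i, row in enumerate(rows):
        if row["score"] == 1.0:
            return Counter(r["action"] for r in rows[i + 1:])
    return Counter()
```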

Common errors. While our tasks may look trivial to experienced programmers, even participants with substantial experience still made mistakes during the solution. However, common errors may differ between people with different levels of experience. A deeper study of error patterns for different experience groups could be another potential application of our dataset.

Advanced solution metrics. Solution snapshots in our dataset could be evaluated using different measures. For example, one could derive a total count of characters added or removed during the solution, or a ratio of the time spent writing code to the total time since the start of the solution. Such metrics could highlight individual traits of students and potentially be used to personalize their learning process and environment.
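The two example metrics can be sketched in Python as follows. The character counts are derived from a `difflib` alignment of consecutive snapshots, and "time spent writing code" is approximated by summing gaps between snapshots shorter than an idle threshold; both choices, and the 60-second default, are assumptions rather than definitions from the dataset.

```python
from difflib import SequenceMatcher


def char_churn(fragments):
    """Total characters added and removed across consecutive snapshots."""
    added = removed = 0
    for old, new in zip(fragments, fragments[1:]):
        matcher = SequenceMatcher(None, old, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                removed += i2 - i1
            if tag in ("replace", "insert"):
                added += j2 - j1
    return added, removed


def active_ratio(timestamps, idle_after=60.0):
    """Share of total time spent in gaps shorter than idle_after seconds.

    `timestamps` is a non-empty, sorted list of snapshot times in seconds.
    """
    total = timestamps[-1] - timestamps[0]
    if total <= 0:
        return 0.0
    active = sum(
        b - a for a, b in zip(timestamps, timestamps[1:]) if b - a < idle_after
    )
    return active / total
```

For example, a session with snapshots at 0, 10, 20, 420, and 430 seconds contains one long pause, so only 30 of its 430 seconds count as active.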

Generating personalized hints. Another interesting application of the dataset, which we pursue in our ongoing research project, is the generation of personalized hints. Changes of code can be used to build a model of the solution process of a given task and to suggest personalized hints that help students stuck in common pitfalls make the next step towards a correct solution. For this purpose, both the information about the current task and consecutive code snapshots are important sources of data. To the best of our knowledge, there is no such dataset for high-level programming languages, so we collected our own.

4.2. Analysis

We analyzed the collected dataset using the Data post-processing tool to explore users’ behavior and programming patterns.

IDE actions. To look closely at actions performed in the IDE while solving tasks, we plotted them together with the length of the current code fragment. Figure 2 presents an example of such a plot for one of the solutions of the Brackets task. Descriptions and arrows were added manually for clarification.

At the beginning of the solution, the user pasted a large piece of code, which turned out to be the keyboard input code from a previous solution. The user then took a break of about 6.5 minutes, possibly to come up with a solution before starting to code. Another interesting detail is an abundance of code runs, which is unusual compared to most users. This behavior might reflect a preference to keep the solution code valid at all stages, even while it is still incomplete. We also found that in the middle of the process the user pasted the actual and expected outputs of their program for character-by-character comparison, thus using the editor as a diff tool.

Score changes. Another way to analyze users’ behavior is to explore the dynamics of their solution score. Using our post-processing tool, we removed all intermediate code changes and colored each solution state according to its score. The plot of this type in Figure 3 shows that the process of this solution can be separated into three stages: implementing an initial solution, correcting it, and refactoring. Analysis of the score dynamics alongside snapshots of source code makes it possible to reason about students’ intent in detail, thus providing deep insights into their approach to each individual solution.

5. Conclusion

In this paper, we introduced TaskTracker-tool, a toolkit for collecting exhaustive code snapshots and user actions during the solution of programming tasks in various programming languages in IntelliJ-based IDEs. The toolkit enables its user to gather solution data separated per individual task, collect it on a server, process the data, and perform basic analysis of it. Using the toolkit, we collected a dataset of solution activity by 148 participants with different experience levels. The dataset is publicly available. We found several interesting patterns in the dataset and suggested multiple directions for future use of the dataset and our toolkit.

Future work on TaskTracker-tool would involve adding support for collecting programming data in arbitrary environments to allow tracking of code in settings beyond predefined tasks while respecting privacy. Apart from that, we are planning to extend our dataset to more participants. Finally, we are going to provide support for further development of TaskTracker-tool and its adaptation to other researchers’ needs, if the community considers the toolkit valuable.


  • ecl (2017) 2017. Developer Productivity Report 2017: Java Tools Usage. https://www.jrebel.com/blog/java-development-tools-usage-stats
  • act (2020) 2020. Activity Tracker. https://github.com/dkandalov/activity-tracker
  • blu (2020) 2020. BlueJ IDE. https://bluej.org/
  • ecl (2020a) 2020a. Eclipse IDE. https://www.eclipse.org/eclipseide/
  • ecl (2020b) 2020b. IntelliJ IDEA dominates the IDE market with 62% adoption among JVM developers. https://snyk.io/blog/intellij-idea-dominates-the-ide-market-with-62-adoption-among-jvm-developers
  • int (2020) 2020. IntelliJ platform. https://www.jetbrains.com/opensource/idea/
  • Altadmri and Brown (2015) Amjad Altadmri and Neil CC Brown. 2015. 37 million compilations: Investigating novice programming mistakes in large-scale student data. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education. 522–527.
  • Ambrose et al. (2010) Susan A Ambrose, Michael W Bridges, Michele DiPietro, Marsha C Lovett, and Marie K Norman. 2010. How learning works: Seven research-based principles for smart teaching. John Wiley & Sons.
  • Blikstein (2011) Paulo Blikstein. 2011. Using learning analytics to assess students’ behavior in open-ended programming tasks. In Proceedings of the 1st international conference on learning analytics and knowledge. 110–116.
  • Blikstein et al. (2014) Paulo Blikstein, Marcelo Worsley, Chris Piech, Mehran Sahami, Steven Cooper, and Daphne Koller. 2014. Programming pluralism: Using learning analytics to detect patterns in the learning of computer programming. Journal of the Learning Sciences 23, 4 (2014), 561–599.
  • Brown et al. (2018) Neil CC Brown, Amjad Altadmri, Sue Sentance, and Michael Kölling. 2018. Blackbox, five years on: An evaluation of a large-scale programming data collection project. In Proceedings of the 2018 ACM Conference on International Computing Education Research. 196–204.
  • Brown et al. (2014) Neil Christopher Charles Brown, Michael Kölling, Davin McCall, and Ian Utting. 2014. Blackbox: a large scale repository of novice programmers’ activity. In Proceedings of the 45th ACM technical symposium on Computer science education. 223–228.
  • Bulmer et al. (2018) Jeff Bulmer, Angie Pinchbeck, and Bowen Hui. 2018. Visualizing code patterns in novice programmers. In Proceedings of the 23rd Western Canadian Conference on Computing Education. 1–6.
  • Danielsiek and Vahrenhold (2016) Holger Danielsiek and Jan Vahrenhold. 2016. Stay on these roads: Potential factors indicating students’ performance in a CS2 course. In Proceedings of the 47th ACM Technical Symposium on Computing Science Education. 12–17.
  • Gratchev and Jeng (2018) Ivan Gratchev and Dong-Sheng Jeng. 2018. Introducing a project-based assignment in a traditionally taught engineering course. European Journal of Engineering Education 43, 5 (2018), 788–799.
  • Hagan and Markham (2000) Dianne Hagan and Selby Markham. 2000. Teaching Java with the BlueJ environment. In Proceedings of Australasian Society for Computers in Learning in Tertiary Education Conference ASCILITE.
  • Hundhausen et al. (2017) Christopher David Hundhausen, Daniel M Olivares, and Adam S Carter. 2017. IDE-based learning analytics for computing education: a process model, critical review, and research agenda. ACM Transactions on Computing Education (TOCE) 17, 3 (2017), 1–26.
  • Ihantola et al. (2015) Petri Ihantola, Arto Vihavainen, Alireza Ahadi, Matthew Butler, Jürgen Börstler, Stephen H Edwards, Essi Isohanni, Ari Korhonen, Andrew Petersen, Kelly Rivers, et al. 2015. Educational data mining and learning analytics in programming: Literature review and case studies. In Proceedings of the 2015 ITiCSE on Working Group Reports. 41–63.
  • Jadud (2005) Matthew C Jadud. 2005. A first look at novice compilation behaviour using BlueJ. Computer Science Education 15, 1 (2005), 25–40.
  • Kazerouni et al. (2017) Ayaan M Kazerouni, Stephen H Edwards, T Simin Hall, and Clifford A Shaffer. 2017. DevEventTracker: Tracking development events to assess incremental development and procrastination. In Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education. 104–109.
  • Kölling (2008) Michael Kölling. 2008. Using BlueJ to introduce programming. In Reflections on the Teaching of Programming. Springer, 98–115.
  • Konecki (2014) M Konecki. 2014. Problems in programming education and means of their improvement. DAAAM international scientific book 2014 (2014), 459–470.
  • Lahtinen et al. (2005) Essi Lahtinen, Kirsti Ala-Mutka, and Hannu-Matti Järvinen. 2005. A study of the difficulties of novice programmers. ACM SIGCSE Bulletin 37 (2005), 14–18.
  • McCall and Kölling (2014) Davin McCall and Michael Kölling. 2014. Meaningful categorisation of novice programmer errors. In 2014 IEEE Frontiers in Education Conference (FIE) Proceedings. IEEE, 1–8.
  • Mcgettrick et al. (2005) Andrew Mcgettrick, Roger Boyle, Roland Ibbett, John Lloyd, Gillian Lovegrove, and Keith Mander. 2005. Grand challenges in computing: Education—a summary. Comput. J. 48, 1 (2005), 42–48.
  • Norris et al. (2008) Cindy Norris, Frank Barry, James B Fenwick Jr, Kathryn Reid, and Josh Rountree. 2008. ClockIt: collecting quantitative data on how beginning software developers really work. In Proceedings of the 13th annual conference on Innovation and technology in computer science education. 37–41.
  • Robins et al. (2003) Anthony Robins, Janet Rountree, and Nathan Rountree. 2003. Learning and teaching programming: A review and discussion. Computer science education 13, 2 (2003), 137–172.
  • Shah (2003) Anuj Ramesh Shah. 2003. Web-cat: A web-based center for automated testing. Ph.D. Dissertation. Virginia Tech.
  • Spacco et al. (2006) Jaime Spacco, David Hovemeyer, William Pugh, Fawzi Emad, Jeffrey K Hollingsworth, and Nelson Padua-Perez. 2006. Experiences with marmoset: designing and using an advanced submission and testing system for programming courses. ACM Sigcse Bulletin 38, 3 (2006), 13–17.
  • Tirronen et al. (2015) Ville Tirronen, Samuel Uusi-Mäkelä, and Ville Isomöttönen. 2015. Understanding beginners’ mistakes with Haskell. Journal of Functional Programming 25 (2015).
  • Vihavainen et al. (2014) Arto Vihavainen, Juha Helminen, and Petri Ihantola. 2014. How novices tackle their first lines of code in an ide: Analysis of programming session traces. In Proceedings of the 14th Koli Calling International Conference on Computing Education Research. 109–116.
  • Yan et al. (2019) Lisa Yan, Annie Hu, and Chris Piech. 2019. Pensieve: Feedback on coding process for novices. In Proceedings of the 50th ACM Technical Symposium on Computer Science Education. 253–259.
  • Yang et al. (2015) Tzu-Chi Yang, Gwo-Jen Hwang, Stephen JH Yang, and Gwo-Haur Hwang. 2015. A two-tier test-based approach to improving students’ computer-programming skills in a web-based learning environment. Journal of Educational Technology & Society 18, 1 (2015), 198–210.