1. Introduction
In recent years, the Python programming language has gained a reputation for being popular among hackers [Sarwar2021, p. v]. Since Python was one of the most widely used languages in 2021, this may not come as a surprise [StackOverflow2021]. Several online resources support this hypothesis [Simplilearn2022][Analyticsinsight2021][calltutors2020][ubuntupit2021]. Unfortunately, there are hardly any scientific studies on the prevalence of programming languages in the hacking community. Our paper aims to help close this research gap. The question we try to answer is: Which programming languages do hackers use?
In order to address this research question, we conducted a survey among the members of the German Chaos Computer Club (CCC). As “Europe’s largest association of hackers” [CCC2022], the club provides a good basis for our study. Before we explain our approach and discuss the results, the term hacker needs to be clarified. Since there is no standard notion in the scientific literature, we define the term as follows:
Definition 1.
A hacker is someone who uses his/her technical expertise to deal with computers with special regard to their security. We explicitly refer to the Chaos Computer Club’s hacker ethics [HackerEthics2022]. These form the core of the club’s hacker definition and consequently that of our study.
Some authors distinguish between “black-hat”, “white-hat” and “grey-hat” hackers with respect to their ethics [Marushat2019, p. 20] [Sarwar2021, pp. 12-13]. By this definition, a white-hat hacker has no criminal intent, while black-hats use their computer knowledge for unlawful activities. Grey-hats are in between these two concepts. Raymond’s guide on how to become a hacker uses the term “cracker” to describe malicious behaviour. According to Raymond, “hackers build things, crackers break them” [Raymond2020]. Since we conducted our survey at the Chaos Computer Club, our research focuses on their hacker ethics, in line with Definition 1. The club’s hacker concept is not based on the black-hat/white-hat definition outlined above, but constitutes separate principles such as freedom of information and protection of private data [HackerEthics2022].
There are numerous works on hacking in the (scientific) literature. Not all of them draw a line between different groups of hackers. Whether this distinction is important for the choice of programming language is another question. Programming languages are mainly discussed from a functional standpoint in the literature. Raymond, for example, mentions Python, C, Perl, and Lisp as adequate choices [Raymond2020]. Ericson provides code snippets in C [Ericson2008]. Simpson and Antill describe C as “one of the most popular programming languages for security professionals and hackers” [SimpsonAntill_2017, p. 196]. The authors also write that hackers “use Perl to create automated exploits and malicious bots” [SimpsonAntill_2017, p. 178]. Clark explains hacking techniques with shell scripts under Linux and Windows as well as programmatic approaches with Python and other languages [Clark2014]. Python is covered by a variety of further hacking books [O_Connor_2012], [Seitz2015], [Sinha_2017]. Since software vulnerabilities play an important role in security, Turner analysed C, Java, C++, Objective-C, C#, PHP, Visual Basic, Python, Perl and Ruby in this regard [Turner_2014].
As mentioned above, it is hard to find literature on the prevalence of programming languages in the hacking community. One reason might be that hackers often operate anonymously and are difficult to reach. Samtani et al. sidestepped this problem by exploring hacker assets in underground forums [Samtani2015]. The authors used machine learning algorithms to classify whether code postings were written in Java, Python, C/C++, HTML, PHP, Delphi, ASP, SQL, Ruby, or Perl. This way, they could show that XSS attacks were primarily implemented in Perl, password cracks and keyloggers in Java, and finally banking vulnerabilities and Microsoft exploits in SQL [Samtani2015, p. 35]. Our paper follows an alternative approach to shed light on this topic.
2. Approach
In May 2021, we conducted a cross-sectional survey [mcmillan1996educational] at the Chaos Computer Club. For this purpose, we sent a link to an online questionnaire to the local and regional affiliates of the CCC (so-called Erfa-Kreise) [ErfaKreise2022]. The design of the questionnaire was inspired by the 2019 Stack Overflow Developer Survey [StackOverflow2019]. Other sources of inspiration were the Kaggle Data Science Survey [Kaggle2020] and Raymond’s guide on how to become a hacker [Raymond2020]. Our questions focused on programming languages as well as related topics such as operating systems and development environments. For a better interpretation of the results, we also asked the participants how important they consider the choice of programming language to be for hacking.
In our questionnaire, we used Likert scales as well as multiple-choice questions and drop-down lists. Control questions were included to ensure proper answering [raithel2008quantitative]. The survey instrument was pre-tested with three selected participants. Based on the pre-test, we slightly revised some survey items, especially the enclosed answer options, and added examples and instructions.
We opened the questionnaire on 1 May and closed it on 30 May 2021. In total, we received 43 responses. As not all questions were mandatory, the number of responses for a given question may be lower than the total number of participants. It is clear that the results do not allow for a representative conclusion on our research question. Nevertheless, the study offers first insights into the topic and can serve as a starting point for further, broader research.
3. Discussion of Results
In the following, we discuss the most important results of our survey. For a better overview, we have divided the responses into several subsections.
3.1. Experience
Our first questions focused on the experience of the participants. Table 1 shows that the majority of respondents had a general programming experience of several years. A slightly different picture emerged when we asked about the specific hacking experience as presented in Table 2.
Table 1. General programming experience of the participants.

Options | % Percentages | # Responses |
---|---|---|
20+ years | 41.86 % | 18 |
10 - 20 years | 32.56 % | 14 |
5 - 10 years | 18.60 % | 8 |
3 - 5 years | 2.33 % | 1 |
1 - 3 years | 4.65 % | 2 |
1 year | 0.00 % | 0 |
I have never written code | 0.00 % | 0 |
Total | | 43 |

Table 2. Hacking experience of the participants.

Options | % Percentages | # Responses |
---|---|---|
20+ years | 27.50 % | 11 |
10 - 20 years | 27.50 % | 11 |
5 - 10 years | 20.00 % | 8 |
3 - 5 years | 15.00 % | 6 |
1 - 3 years | 2.50 % | 1 |
1 year | 5.00 % | 2 |
I have no hacking experience | 2.50 % | 1 |
Total | | 40 |
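The difference between Table 1 and Table 2 can be made concrete by comparing the share of respondents with at least ten years of experience. The following is a minimal sketch, not part of the original analysis: the counts are copied from the two tables, while the ten-year threshold and all names (`share`, `LONG_EXPERIENCE`) are assumptions chosen for illustration.

```python
# Response counts copied from Table 1 (programming) and Table 2 (hacking).
programming = {
    "20+ years": 18, "10 - 20 years": 14, "5 - 10 years": 8,
    "3 - 5 years": 1, "1 - 3 years": 2, "1 year": 0, "none": 0,
}
hacking = {
    "20+ years": 11, "10 - 20 years": 11, "5 - 10 years": 8,
    "3 - 5 years": 6, "1 - 3 years": 1, "1 year": 2, "none": 1,
}

# Assumption: "long" experience means the two top answer options.
LONG_EXPERIENCE = ("20+ years", "10 - 20 years")


def share(counts, options):
    """Return the percentage of responses that fall into the given options."""
    total = sum(counts.values())
    return round(100 * sum(counts[o] for o in options) / total, 2)


prog_share = share(programming, LONG_EXPERIENCE)
hack_share = share(hacking, LONG_EXPERIENCE)
print(f"10+ years of programming experience: {prog_share} %")
print(f"10+ years of hacking experience:     {hack_share} %")
```

Under these assumptions, roughly three quarters of the respondents report ten or more years of programming experience, compared with just over half for hacking experience.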
3.2. Programming Language
When asked about the programming languages used for hacking in the last year, participants named a variety of technologies. Table 3 lists the responses in descending order.
Table 3. Programming languages used for hacking in the last year (multiple answers possible).

Options | % Percentages | # Responses |
---|---|---|
Bash/Shell/PowerShell | 72.50 % | 29 |
Python | 70.00 % | 28 |
C | 32.50 % | 13 |
JavaScript | 32.50 % | 13 |
HTML/CSS | 30.00 % | 12 |
C++ | 27.50 % | 11 |
Go | 22.50 % | 9 |
SQL | 22.50 % | 9 |
Java | 20.00 % | 8 |
Others | 20.00 % | 8 |
Assembly | 17.50 % | 7 |
C# | 15.00 % | 6 |
PHP | 15.00 % | 6 |
Rust | 12.50 % | 5 |
Ruby | 10.00 % | 4 |
Perl | 7.50 % | 3 |
TypeScript | 7.50 % | 3 |
Kotlin | 5.00 % | 2 |
Scala | 5.00 % | 2 |
VB/VBA | 5.00 % | 2 |
Lisp | 2.50 % | 1 |
Swift | 2.50 % | 1 |
Objective-C | 0.00 % | 0 |
R | 0.00 % | 0 |
Total Respondents | | 40 |
The options of programming languages provided in the questionnaire were based on a Stack Overflow survey [StackOverflow2019], Raymond’s hacking guide [Raymond2020] and feedback from our pre-testers. Whether technologies such as HTML, Bash or SQL can be called programming languages is of course debatable. We have included them in the list anyway to avoid possible gaps in the study.
According to Table 3, shell scripts (e.g. Bash) and Python were used most frequently. It also appears that the C language family (C, C++, C#, and Objective-C) is still common. With regard to Java, the survey supports Raymond’s argument that this language is not the first choice for hackers [Raymond2020]. Table 4 shows that the language preference of the majority of participants has changed over time. Compared to programming languages used more than a year before (see Table 10 in the appendix), a shift towards shell scripts and Python can be noted. Python’s rise in popularity has been observed by other developer surveys too [StackOverflow2019] [StackOverflow2021].
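The shift between the two time periods can be expressed in percentage points. The sketch below is illustrative only: the counts are taken from Table 3 (40 respondents) and Table 10 (34 respondents), the selection of five languages is an arbitrary subset, and the helper name `pct` is hypothetical. Because the two questions were answered by different numbers of participants, the differences should be read as a rough indication rather than a precise measurement.

```python
RESPONDENTS_RECENT = 40   # Table 3: languages used in the last year
RESPONDENTS_EARLIER = 34  # Table 10: languages used more than a year before

# Response counts for an illustrative subset of languages.
recent = {"Bash/Shell/PowerShell": 29, "Python": 28, "C": 13, "PHP": 6, "Perl": 3}
earlier = {"Bash/Shell/PowerShell": 16, "Python": 14, "C": 13, "PHP": 11, "Perl": 5}


def pct(count, respondents):
    """Share of respondents who selected an option (multi-select question)."""
    return 100 * count / respondents


# Difference in percentage points: positive values mean increased use.
shift = {
    lang: round(
        pct(recent[lang], RESPONDENTS_RECENT) - pct(earlier[lang], RESPONDENTS_EARLIER),
        2,
    )
    for lang in recent
}

for lang, delta in sorted(shift.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang:<24}{delta:+.2f} pp")
```

In this subset, Python and shell scripts show the largest gains, while PHP shows the largest decline.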
Table 4. Change in programming language preference over time.

Options | % Percentages | # Responses |
---|---|---|
yes, my preference has changed | 77.50 % | 31 |
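no, I always used the same programming languages | 22.50 % | 9 |
Total | | 40 |

Table 5. Agreement with the statement that the choice of programming language is important for hacking.

Options | % Percentages | # Responses |
---|---|---|
Strongly Agree | 5.00 % | 2 |
Agree | 20.00 % | 8 |
Neither Agree nor Disagree | 32.50 % | 13 |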
Disagree | 20.00 % | 8 |
Strongly Disagree | 22.50 % | 9 |
Total | | 40 |
An interesting result appears in Table 5. Most respondents (75%) did not agree that the choice of programming language is important for hacking: 42.5% disagreed or strongly disagreed, and a further 32.5% were neutral. Only 25% of participants agreed or strongly agreed with this statement. The prevalence of Python for hacking might therefore simply reflect the general increase in its use in recent years. Consequently, one could expect that the language preferences of hackers will continue to change in the future as technology evolves.
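The aggregated shares discussed here follow directly from the counts in Table 5. A minimal sketch of the arithmetic, with the grouping into three buckets and all variable names being assumptions made for illustration:

```python
# Response counts copied from Table 5 (five-point Likert scale).
likert = {
    "Strongly Agree": 2,
    "Agree": 8,
    "Neutral": 13,
    "Disagree": 8,
    "Strongly Disagree": 9,
}
total = sum(likert.values())  # 40 respondents answered this question


def percent(*options):
    """Combined share of the given answer options in percent."""
    return 100 * sum(likert[o] for o in options) / total


agree = percent("Strongly Agree", "Agree")
neutral = percent("Neutral")
disagree = percent("Disagree", "Strongly Disagree")
not_agree = neutral + disagree  # everyone who did not explicitly agree

print(f"agree: {agree} %, neutral: {neutral} %, disagree: {disagree} %")
print(f"did not agree: {not_agree} %")
```

Note that the 75% figure combines neutral and disagreeing answers; explicit disagreement alone accounts for 42.5%.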
3.3. Ecosystem
We also asked the participants which operating systems (OS) they used for hacking in the last year. Table 6 shows that the vast majority of respondents chose a Linux-based variant. This is not surprising, since Kali Linux is a distribution built specifically for security and penetration testing [Kali2022]. When asked about integrated development environments (IDEs), participants selected a variety of tools. As evident from Table 7, which lists the five most frequent answers, Vim and Visual Studio Code were chosen most often (see Table 11 in the appendix for the full list).
Table 6. Operating systems used for hacking in the last year (multiple answers possible).

Options | % Percentages | # Responses |
---|---|---|
Linux-based | 95.00 % | 38 |
Windows | 40.00 % | 16 |
MacOS | 32.50 % | 13 |
BSD | 17.50 % | 7 |
Others | 5.00 % | 2 |
Total Respondents | | 40 |

Table 7. Development environments used for hacking (top five, multiple answers possible).

Options | % Percentages | # Responses |
---|---|---|
Vim | 60.00 % | 24 |
Visual Studio Code | 50.00 % | 20 |
Others | 22.50 % | 9 |
IntelliJ | 17.50 % | 7 |
Visual Studio | 17.50 % | 7 |
… | … | … |
Total Respondents | | 40 |
3.4. Demographics
Finally, we asked the participants about their gender identity and age. Table 8 shows that most respondents identified themselves as male. Their age varied, with the majority between 25 and 44 years old, as revealed by Table 9.
Table 8. Gender identity of the participants.

Options | % Percentages | # Responses |
---|---|---|
woman | 2.50 % | 1 |
man | 60.00 % | 24 |
nonbinary | 12.50 % | 5 |
prefer not to say | 20.00 % | 8 |
prefer to self-describe | 5.00 % | 2 |
Total | | 40 |

Table 9. Age of the participants.

Options | % Percentages | # Responses |
---|---|---|
0 - 17 | 2.50 % | 1 |
18 - 21 | 2.50 % | 1 |
22 - 24 | 5.00 % | 2 |
25 - 29 | 17.50 % | 7 |
30 - 34 | 22.50 % | 9 |
35 - 39 | 15.00 % | 6 |
40 - 44 | 20.00 % | 8 |
45 - 49 | 7.50 % | 3 |
50 - 54 | 2.50 % | 1 |
55 - 59 | 5.00 % | 2 |
60 - 69 | 0.00 % | 0 |
70+ | 0.00 % | 0 |
Total | | 40 |
4. Conclusion
The purpose of this paper was to shed light on the question of which programming languages are used by hackers. To this end, we conducted a survey at the German Chaos Computer Club in May 2021. Our results show that the members were using a range of programming languages at the time. Shell scripts and Python were chosen most frequently. It also appears that the C language family is still common. Another important finding is that the choice of programming language does not appear to play a vital role for hackers: only a quarter of respondents considered it important. Their language preference has changed over time and will presumably continue to do so in the future.
The number of responses we received does not allow for a representative conclusion. Furthermore, the survey targeted only members of the CCC. The findings might therefore be biased both regionally and in favour of a specific group. Our results do, however, add to the extremely scarce literature on the subject. The approach could serve as a model for future surveys, possibly at an international level.
Appendix A Additional Questions
This section contains responses to our questionnaire that were referenced but not included in the main text.
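Table 10. Programming languages used for hacking more than a year before the survey (multiple answers possible).

Options | % Percentages | # Responses |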
---|---|---|
Bash/Shell/PowerShell | 47.06 % | 16 |
Python | 41.18 % | 14 |
C | 38.24 % | 13 |
PHP | 32.35 % | 11 |
C++ | 26.47 % | 9 |
JavaScript | 26.47 % | 9 |
Assembly | 23.53 % | 8 |
Java | 23.53 % | 8 |
Others | 20.59 % | 7 |
HTML/CSS | 17.65 % | 6 |
Perl | 14.71 % | 5 |
SQL | 8.82 % | 3 |
VB/VBA | 8.82 % | 3 |
C# | 5.88 % | 2 |
Rust | 5.88 % | 2 |
R | 2.94 % | 1 |
Ruby | 2.94 % | 1 |
Go | 0.00 % | 0 |
Kotlin | 0.00 % | 0 |
Lisp | 0.00 % | 0 |
Objective-C | 0.00 % | 0 |
Scala | 0.00 % | 0 |
Swift | 0.00 % | 0 |
TypeScript | 0.00 % | 0 |
Total Respondents | | 34 |

Table 11. Development environments used for hacking (full list, multiple answers possible).

Options | % Percentages | # Responses |
---|---|---|
Vim | 60.00 % | 24 |
Visual Studio Code | 50.00 % | 20 |
Others | 22.50 % | 9 |
IntelliJ | 17.50 % | 7 |
Visual Studio | 17.50 % | 7 |
Android Studio | 15.00 % | 6 |
Eclipse | 15.00 % | 6 |
Nano | 15.00 % | 6 |
Notepad++ | 15.00 % | 6 |
PyCharm | 12.50 % | 5 |
Sublime Text | 12.50 % | 5 |
Atom | 7.50 % | 3 |
IPython / Jupyter | 7.50 % | 3 |
NetBeans | 5.00 % | 2 |
PHPStorm | 5.00 % | 2 |
RubyMine | 5.00 % | 2 |
Xcode | 5.00 % | 2 |
Coda | 2.50 % | 1 |
Emacs | 2.50 % | 1 |
TextMate | 2.50 % | 1 |
Komodo | 0.00 % | 0 |
RStudio | 0.00 % | 0 |
Total Respondents | | 40 |
Appendix B Glossary
Term | Description |
---|---|
CCC | Chaos Computer Club |
Cracker | Individual using computer knowledge with malicious intent |
Erfa-Kreise | Local and regional affiliates of the Chaos Computer Club |
Hacker | See Definition 1 |
Keylogger | Software for keystroke recording |
XSS | Cross-site scripting |