Malicious actors employ secure channels to hide communications from detection agents. These channels enable activities such as the installation of exploit kits, distribution of malware and adware, and communication of useful information to controllers. The contents of these channels therefore remain unknown while malicious use of secure channels surges: a cloud vendor blocked 1.7 billion threats using TLS in the second half of 2018. Malware classes that frequently use TLS secure channels for communications are bots and ransomware.
The ransomware element of an attack can be defined as the weaponisation part, where a ransomware element can be packaged with defined distribution and infection methods, and then targeted as required. The first phase of ransomware, such as WannaCry, was often fairly scattergun in its targets, but the usage of ransomware in targeted attacks is increasing. In 2019, Symantec found that enterprise-targeting ransomware showed a yearly increase of 12% and attacks on mobile devices rose by 33%. With WannaCry we saw the weaponisation of ransomware, which used the Eternal Blue vulnerability for its infection. The NHS in the UK was but one of many organizations that suffered major outages from the ransomware infection.
Modern bot malware is generally multi-purpose. Once installed on a client, an external controller determines bot activities through issued commands. Encrypted controller-to-bot channels prevent defenders from discovering commands. Ransomware clients are generally single purpose in that users pay to recover document or device access. Crypto-ransomware, where documents on an infected client are encrypted and a payment is required, typically in Bitcoins, for the decryption key, is a common variant. Communications between crypto-ransomware clients and controllers may include useful information such as encryption keys.
Current methods of dealing with malicious use of secure channels have limitations. With unencrypted channels, payload inspection provides knowledge of malware activity, whereas with encrypted channels, detection methods rely on discovering anomalies that distinguish malicious from benign activity. These methods assist in the detection and possible prevention of malicious activity but cannot provide detailed knowledge of that activity.
This paper investigates decrypting TLS communications of real-world malware. A framework uses a standard approach for decrypting TLS traffic to analyse and decrypt the secure communications. For malware, performance challenges can result from the use of different cryptographic libraries. So, the framework is extended to accommodate the Windows cryptographic library. Experiments evaluate decrypting real bot and ransomware command and control communications using the extension. The contribution is a novel method for discovering cryptographic artefacts used by real malware. As these are discovered in single memory extracts and decrypted in less than a second, the communicated activities of unknown malware can be discovered.
The rest of the paper is structured as follows. Related research on malware command and control channels is presented in Section II. Section III discusses sourcing of real-world malware samples while Section IV evaluates and discusses the limitations of decrypting using a standard TLS decryption methodology. A new approach for decrypting traffic using Windows cryptographic libraries is presented in Section V. The results are presented and discussed in Section VI and conclusions drawn in Section VII.
II Related Work
A number of papers focus on memory inspection to discover the malware threat using calls to APIs. Gupta et al analysed API calls within Windows and mapped a total of 534 important API calls to 26 categories (A-Z). These were then used to identify five types of malware (Worm, Trojan-Downloader, Trojan-Spy, Trojan-Dropper and Backdoor). Hampton et al furthered this work by analysing 14 strains of ransomware on Windows platforms and created mappings of the API call frequencies.
Other studies detect the presence of malware command and control channels. Signature-based detection systems that check for known byte sequences in packet headers or payloads may be less successful with new malware variants or encrypted traffic. So, anomaly detection methods using data mining techniques can distinguish benign from malware traffic. For example, features such as correlations between malware channel request and response times, short packet sizes, TLS header information, or a combination of features are possible differentiators. The features facilitate malware detection and prevention but the encrypted contents remain hidden.
Controller emulation can discover malware client plaintext. Logging client requests may provide useful insights, although cited challenges include environment security, scalability for large botnets, and transparency, where malware detects the presence of a test environment and terminates. A controlled environment is advantageous and can be used to detect adversarial activities. However, emulators have drawbacks for decrypting real-world malware communications: valid controller responses may not be known and, perhaps more importantly, the malware must be known a priori to execute in the environment.
Patil et al define an investigation framework for analysing captured memory for the detection of malware using information from processes, running threads, opened registry keys, and user authentication details. Google, too, focus on memory inspection for the detection of malware and use a number of virtual machines to detect anomalous behaviour.
Feichtner et al define a method of detecting cryptographic misuse in Apple iOS applications. Their work uses a decompilation method to analyse the code calls to the core cryptographic libraries. In their analysis they found that 82% of the applications sampled had a cryptographic flaw. For this they defined six main rules to identify a cryptographic flaw. These included: the usage of the ECB (Electronic Code Book) mode for encryption; the usage of a non-random Initialization Vector (IV) for CBC encryption; and the usage of constant encryption keys. Most of the flaws found related to the usage of a non-random IV and the usage of constant encryption keys. They also found that 27% of the sampled apps used ECB for symmetric key encryption.
A priori malware knowledge may also enable plaintext discovery. While virtual machine environments, such as DRAKVUF, support malware dynamic analysis, such solutions succeed only where actual, or suspected, malware is obtained. Our approach requires no prior knowledge, so discovering the plaintext in the encrypted communications of unknown malware is possible.
Discovery of malware cryptographic artefacts may enable their communications to be decrypted. For example, researchers have discovered TLS encryption keys in memory. However, initialization vectors, which are necessary for the commonly-used AES-GCM encryption, were not discovered, so plaintext was not derived. Furthermore, whereas keys were discovered in Linux memory, malware is still predominantly Windows-focused.
III Sourcing Malware Samples
Malware communications analysis ideally executes real malware samples. Samples of recent provenance are preferable as these may reflect the approaches of modern malware authors. Maintained online databases provide guides to current usage. For SSL traffic, the SSL Blacklist website lists bot and ransomware clients in reverse chronological order of awareness. From the list, 35 potential malware entities were identified. For each, multiple client executable samples were downloaded in compressed, password-protected form from the VirusShare website. The compressed files were securely copied to the client, on which Windows Defender was disabled, and then uncompressed and executed. As the responder acting as a malware controller was not configured to respond appropriately, only three executables successfully established a TLS connection, including a data exchange, with a server running the OpenSSL server application. The three malware executables are listed in Table I, followed by pertinent background.
Zbot, also known as ZeuS or Zeus, is a well-known bot malware instance. Detected in 2006, Zbot is primarily known for stealing banking passwords by injecting code into a user browser, as illustrated in Figure 1. Other Zbot functionality includes extracting information such as browser history and cookies, certificates, and mail account information, as well as performing actions such as manipulating local files, installing ransomware, logging keystrokes, taking screenshots, and managing a botnet of other infected computers. Although Zbot previously used the HTTP protocol for communications, information and commands are now generally concealed with TLS.
Gozi, also known as Ursnif inter alia, is also an information-stealing bot. Detected in 2007, Gozi is commonly used by malicious actors for stealing confidential banking information, as shown in Figure 2. Although functions include theft of cookies and email credentials, and logging of keystrokes and browsing activity, a key feature is intercepting network traffic to hijack financial transactions. For example, when a money transfer is detected, Gozi issues an encrypted message through its command and control server to prevent the correct transfer and redirect the funds to a controlled account.
TorrentLocker is an instance of crypto-ransomware. Known since 2014, and sufficiently similar to CryptoLocker to also be known as Crypt0L0cker, it encrypts user documents including pictures, advises the user of its action, and demands payment using Bitcoins, as indicated in Figure 3. Although client-controller communications were previously encrypted using XOR, TLS is now a common communications mechanism. Information transmitted to the controller includes the ransom page, the encryption key which is RSA-encrypted with a TorrentLocker public key, counts of encrypted files, address book contacts, email credentials, and logs.
The analysis framework executes in a virtualized environment. A hypervisor supports a virtual machine monitor executing on a privileged virtual machine, a suspect client virtual machine executing malware samples, and virtual machines providing server functionality for client communications. The framework extracts read/write client virtual machine memory, analyses memory extracts to discover small sets of candidate cryptographic artefacts, and decrypts encrypted network traffic until a decrypt is validated. The framework accommodates SSH and TLS protocols and encryption algorithms such as AES and ChaCha20.
A standard TLS extension searches memory extracts for key blocks that a pseudo-random number generator (PRNG) creates after the handshake. For AES-GCM, the agreed encryption algorithm for each malware sample, key blocks contain client and server encryption keys, and client and server implicit initialization vector (IV) segments. MemDecrypt memory analysis searches for key blocks following the process illustrated in Figure 4, where the explicit IV segment in an Application Data network packet enables searches for candidate implicit IVs, which in turn enable candidate key block discovery. Candidate implicit IVs are memory extract segments co-located with explicit IV segment values. Candidate key blocks are memory extract segments co-located with candidate implicit IV segments and where the entropy of the key block client key and server key exceeds a threshold.
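The first step of this search can be sketched as follows. This is a minimal illustration, not the MemDecrypt implementation: the function names, the `window` parameter, and the reading of "co-located" as "within a fixed number of bytes" are assumptions made for the example.

```python
def find_all(haystack: bytes, needle: bytes):
    """Yield every offset at which needle occurs in haystack (overlaps allowed)."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)


def candidate_implicit_ivs(extract: bytes, explicit_iv: bytes,
                           window: int = 64, iv_len: int = 4) -> set:
    """Collect distinct iv_len-byte segments lying within `window` bytes of
    each occurrence of the explicit IV in a memory extract.

    `window` is a hypothetical co-location distance, not a value from the paper.
    """
    candidates = set()
    for hit in find_all(extract, explicit_iv):
        lo = max(0, hit - window)
        hi = min(len(extract), hit + len(explicit_iv) + window)
        for off in range(lo, hi - iv_len + 1):
            # Keep only segments that do not overlap the explicit IV itself.
            if off + iv_len <= hit or off >= hit + len(explicit_iv):
                candidates.add(extract[off:off + iv_len])
    return candidates
```

A frequently occurring explicit IV value produces many hits and therefore a large candidate set, which is why the choice of search anchor matters for analysis time.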
IV-A Test Environment
The Xen Project 4.4.1 hypervisor runs on a Core 2 Duo Dell personal computer with 40 GB of disk storage and 3 GB of RAM. It supports a privileged hypervisor console running Debian (kernel release 3.16.0-4-amd64) and the MemDecrypt framework, and three unprivileged virtual machines. Experiments execute on a Windows client and a Linux server virtual machine. The client runs the Windows 10 (10.0.16299) operating system with 2 GB of memory and 40 GB of disk, and the server runs an Ubuntu 14.04 build (“Trusty”) with 512 MB of allocated memory and 4 GB of disk storage.
The environment is configured for malware containment. The malware client is prevented from communicating with external servers to prevent possible corruption of other environments. So, an additional Linux machine running Ubuntu 14.04 build (“Trusty”) with 512 MB of allocated memory and 4 GB of disk storage is established as a DNS server using the ‘dnsmasq’ package. Benign DNS requests, such as those for *.microsoft.com, are answered with the DNS server IP address, and all other requests are answered with the IP address of the target TLS server. For the first experiment, debug mode was enabled to log keys, IVs and plaintext. The OpenSSL server command used was:
openssl s_server -accept 443 -debug -cert crt.pem -key key.pem -WWW
With Zbot, when the standard TLS extension was applied to memory extraction and analysis, the decrypt analysis component was projected to require approximately 34 hours to identify the correct artefacts. Excluding artefacts less than 1000 bytes apart in memory extracts reduced this to 15 minutes, so the combined analysis duration was 38 minutes. Although quicker than brute-force, cryptanalytic, and side-channel approaches, this duration may be too long for practical application in live scenarios. Furthermore, the duration is substantially longer than in earlier experiments with, for example, the OpenSSL library, warranting further analysis.
Two factors cause this increase. One is the 8-byte explicit IV segment obtained from an Application Data message. For Zbot, the first explicit IV segment is ‘0x0000000000000001’, as illustrated in the highlighted section of the Wireshark packet capture in Figure 5. This byte sequence occurs more frequently in memory extracts than randomly generated explicit IVs. So, when the explicit IV was used to discover possible four-byte implicit IV segments, 578,629 possible instances were found. Entropy measure thresholds reduced the candidate key block set size to 23,361. By contrast, an experiment with an OpenSSL client application yielded three candidate implicit IVs and 79 candidate keys.
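To make the role of the explicit IV concrete, the sketch below extracts it from a TLS 1.2 Application Data record and combines it with a candidate implicit IV to form the 12-byte AES-GCM nonce, following the record layout in RFC 5288. The function names are illustrative and not taken from MemDecrypt.

```python
import struct

APPLICATION_DATA = 0x17   # TLS record content type for Application Data
EXPLICIT_IV_LEN = 8       # explicit nonce carried at the start of each record
TAG_LEN = 16              # AES-GCM authentication tag


def explicit_iv_from_record(record: bytes) -> bytes:
    """Return the 8-byte explicit IV from a TLS 1.2 AES-GCM record."""
    if len(record) < 5:
        raise ValueError("record shorter than the TLS record header")
    content_type, _version, length = struct.unpack("!BHH", record[:5])
    if content_type != APPLICATION_DATA:
        raise ValueError("not an Application Data record")
    if length < EXPLICIT_IV_LEN + TAG_LEN or len(record) < 5 + length:
        raise ValueError("record too short for AES-GCM")
    return record[5:5 + EXPLICIT_IV_LEN]


def gcm_nonce(implicit_iv: bytes, explicit_iv: bytes) -> bytes:
    """Build the AES-GCM nonce: 4-byte implicit IV followed by 8-byte explicit IV."""
    if len(implicit_iv) != 4 or len(explicit_iv) != EXPLICIT_IV_LEN:
        raise ValueError("unexpected IV segment length")
    return implicit_iv + explicit_iv
```

A counter-style explicit IV such as 0x0000000000000001 is valid on the wire, but as a search pattern it matches many unrelated memory locations.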
The other factor is masquerading. To evade detection, malware applications may camouflage their activities, and one such mechanism is masquerading as a benign application, so the malware may also be known as a ‘trojan’. In the Windows environment, examples of benign applications used for masquerading include the Edge browser and Windows Explorer. However, when Zbot masquerades as Windows Explorer, for example, the data collection component extracts 265 read/write memory files totalling 73.2 MB for each separate extraction. The combination of these two factors leads to large sets of candidate keys and IVs.
V Windows Library Extension
Memory extract features suggest a more efficient alternative. Using the session key and IV from OpenSSL server logs, a search of malware client application memory yielded three observations: the key occurs frequently in different extract files, the sizes of memory extract files containing the key fall within specific ranges, and two unusual ASCII strings are present near encryption key locations in the memory extract files.
The repeated occurrence of the key in memory extracts may result from data protection or an absence of data cleansing. After a TLS handshake when client and server keys have been generated by a pseudo-random generator, the keys may be copied to record data structures for simplified access by the encryption process. The malware may copy the keys repeatedly to ensure access. Alternatively, the malware writer may copy keys on different occasions but fail to cleanse the source or copy. In any case, this feature is not used in the extension.
Sizes of memory extract files containing encryption keys ranged between 2 MB and 4 MB. Consistent with prior SSH and TLS investigations, the size probably originates from an application memory allocation request (‘malloc’) for a data structure to hold the encryption or decryption fields such as keys, encrypt/decrypt flags, key length, mode, etc. As illustrated in Figure 6, which maps the number of segments above a 4.5 entropy threshold (Y-axis) against the extracted size (X-axis), the distribution of high-entropy counts in malware application memory suggests that prioritizing regions for analysis may speed up the IV and key discovery process.
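The per-extract count plotted in Figure 6 can be approximated with a Shannon entropy measure over fixed-size segments. The following is a sketch under stated assumptions: the 32-byte segment length, the non-overlapping stride, and treating the threshold as inclusive are choices made for the example and are not specified in the text.

```python
import math
from collections import Counter


def shannon_entropy(segment: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    if not segment:
        return 0.0
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in Counter(segment).values())


def high_entropy_count(extract: bytes, seg_len: int = 32,
                       threshold: float = 4.5) -> int:
    """Count non-overlapping seg_len-byte segments at or above the threshold.

    seg_len=32 is an assumed segment size; a 32-byte segment has a maximum
    entropy of 5.0 bits, so 4.5 selects near-random segments.
    """
    return sum(
        1
        for off in range(0, len(extract) - seg_len + 1, seg_len)
        if shannon_entropy(extract[off:off + seg_len]) >= threshold
    )
```

Sorting extract files by this count, or by size band, is one way the prioritisation suggested above could be realised.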
The presence of specific ASCII strings in memory extracts containing keys is more significant. The strings are ‘3LLS’ and ‘KSSM’, or in big-endian format, ‘SSL3’ and ‘MSSK’. Researchers identified ‘MSSK’ in the Windows security policy application, Local Security Authority Subsystem Service (LSASS), as a possible acronym for ‘Microsoft Symmetric Key’. ‘SSL3’ may refer to the deprecated forerunner of TLS, SSLv3. Kambic identified probable fields in the undocumented LSASS data structure including: encryption data structure sizes; TLS version; and the encryption key. The field identified by Kambic as the probable IV field is inconsistent with MemDecrypt memory extracts. The implicit IV is located approximately 20 bytes after the ‘3LLS’ string, and the key approximately 30 bytes after the ‘KSSM’ string. Although random occurrences are possible, the strings provide good indicators for identifying candidate memory extracts containing cryptographic artefacts when Microsoft security libraries are used.
A MemDecrypt extension to decrypt TLS communications from executables that use Microsoft security libraries accommodates these features. Microsoft encryption libraries are assumed when TLS Application Data messages contain explicit IV values of ‘0x0000000000000001’. Additional techniques such as the identification of executable linked libraries might validate this assumption. Extract file sizes are banded to prioritise medium-sized files. The extract files are searched for the ASCII strings, and nearby fields in the same extracts with sufficient entropy to be candidate keys and IVs are identified. The Microsoft memory analysis algorithm is shown in Algorithm 1. The banding is wider than the empirically observed size range, and the maximum allowable distances in memory between ‘3LLS’ and a candidate IV, and between ‘KSSM’ and a candidate key, exceed the empirically observed values to allow for potential data structure changes, as may result from operating system upgrades. The entropy thresholds for ‘IVsize’ and ‘keysize’ are set to 1.5 and 4.5 respectively based on previous experiments with PRNG functions. If the Microsoft library extension fails to find cryptographic artefacts, the TLS extension provides a fall-back.
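The marker-based search described above might look like the following sketch. It is not Algorithm 1 itself: the function names, the 32-byte key size, the maximum distances of 40 and 60 bytes, the byte-by-byte scan, and the inclusive thresholds are all assumptions chosen to be looser than the observed offsets of roughly 20 and 30 bytes.

```python
import math
from collections import Counter


def entropy(seg: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    n = len(seg)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seg).values()) if n else 0.0


def candidates_after_marker(extract: bytes, marker: bytes, size: int,
                            max_dist: int, threshold: float) -> list:
    """Return distinct size-byte segments that start within max_dist bytes
    after any occurrence of marker and meet the entropy threshold."""
    found = []
    pos = extract.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = min(start + max_dist, len(extract) - size)
        for off in range(start, end + 1):
            seg = extract[off:off + size]
            if entropy(seg) >= threshold and seg not in found:
                found.append(seg)
        pos = extract.find(marker, pos + 1)
    return found


def ms_candidates(extract: bytes, max_iv_dist: int = 40, max_key_dist: int = 60):
    """Candidate implicit IVs near '3LLS' and candidate keys near 'KSSM'.

    Distances and the 32-byte key size are hypothetical defaults.
    """
    ivs = candidates_after_marker(extract, b"3LLS", 4, max_iv_dist, 1.5)
    keys = candidates_after_marker(extract, b"KSSM", 32, max_key_dist, 4.5)
    return ivs, keys
```

Because the search is anchored on two rare strings rather than on a common explicit IV value, the candidate sets stay small even when the extract is large.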
VI Windows Library Extension Evaluation
Experiments were executed to evaluate the Windows library extension. Each malware sample was executed on a Windows 10 client, memory was extracted, and Ubuntu OpenSSL server logs were collected. Analysis decrypts were validated by evaluating compliance with HTTP 1.1 and by comparison with server logs. An example of Zbot decrypted analysis output is illustrated in Figure 7, with verification provided by the OpenSSL server log shown in Figure 8. All Zbot, Gozi, and TorrentLocker samples were decrypted with 100% success. Decrypt output examples for all malware samples are shown in Table II. Host names and GET image names vary for different test runs and, furthermore, Gozi decrypts produce POST as well as GET requests.
|User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 10.0; Win64; x64)|
|TorrentLocker||POST /topic.php HTTP/1.1|
|User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 10.0; Win64; x64)|
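The validation step, in which candidate keys and IVs are tried until a decrypt complies with HTTP 1.1, can be sketched as below. The request-line pattern and the `decrypt` callable are assumptions: `decrypt` stands in for an AES-GCM routine supplied by the caller, and is expected to raise an exception when authentication fails.

```python
import re

# Request line of an HTTP/1.1 message; the method list is illustrative.
REQUEST_LINE = re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.1\r\n")


def looks_like_http11_request(plaintext: bytes) -> bool:
    """True if the plaintext starts with a well-formed HTTP/1.1 request line."""
    return REQUEST_LINE.match(plaintext) is not None


def first_valid_decrypt(keys, ivs, decrypt):
    """Try every (key, iv) pair and return the first validated decrypt.

    `decrypt(key, iv)` is a hypothetical caller-supplied function returning
    plaintext bytes or raising on failure.
    """
    for key in keys:
        for iv in ivs:
            try:
                plaintext = decrypt(key, iv)
            except Exception:
                continue  # wrong key or IV: authentication failed
            if looks_like_http11_request(plaintext):
                return key, iv, plaintext
    return None
```

The number of decrypt attempts is bounded by the product of the two candidate set sizes, so small sets translate directly into short decrypt analysis durations.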
Analysis component durations for each malware sample confirm the extension’s performance. As illustrated in Table III the maximum combined duration for memory analysis and decrypt analysis is below 1 second, a direct consequence of reduced candidate cryptographic artefact set sizes. With the Microsoft library extension, the set sizes range between three and six, and IV set sizes between 79 and 483.
|Memory Analysis||Decrypt Analysis|
The small experimental set size might inhibit complete confidence in the extension’s capacity to decrypt encrypted malware command and control traffic. However, when malware writers have developed custom security routines, analysts have easily broken them, so known cryptographic libraries are more commonly used. The Microsoft library presents a good opportunity for malware writers as it is pre-loaded with the Windows operating system. Use of libraries such as OpenSSL would require an additional download, increasing the risk of detection. It is concluded that payloads of secure TLS communications between malware clients executing on Windows clients and their controllers can be rapidly discovered.
Decrypting a single request is not conclusive. This outcome is determined by the testbed configuration, as the server responds with OK to any TLS request. In the absence of a reasonable response, the client may terminate or cease further communications with the controller. Also, decryption may not necessarily provide usable information, particularly where the plaintext includes a secondary encryption layer as with TorrentLocker. However, having discovered the key and IV, a complete session is decryptable.
Rapid decryption of live TLS malware traffic offers exciting prospects. For instance, by permitting the client to communicate with its controller in a managed environment, knowledge such as client-controller interaction details may contribute to enhanced malware defences. Furthermore, decrypting unmanaged communications between malware and controller may provide ransomware keys or stolen banking details. Future work will explore these opportunities as well as expanding the range of tested malware clients.
-  D. Desai, “What’s hiding in encrypted traffic? Millions of advanced threats,” 2019. [Online]. Available: https://www.zscaler.com/blogs/research/whats-hiding-encrypted-traffic-millions-advanced-threats
-  Symantec, “Internet Security Threat Report (ISTR) 2019,” https://solutionsreview.com/endpoint-security/by-the-numbers-endpoint-security-vulnerabilities/, 2019, [Online; accessed 20-Mar-2019].
-  J. Hoeksma, “NHS cyberattack may prove to be a valuable wake up call,” BMJ: British Medical Journal (Online), vol. 357, 2017.
-  A. Kharraz, W. Robertson, D. Balzarotti, L. Bilge, and E. Kirda, “Cutting the gordian knot: A look under the hood of ransomware attacks,” in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2015, pp. 3–24.
-  S. Gupta, H. Sharma, and S. Kaur, “Malware characterization using windows API call sequences,” in International Conference on Security, Privacy, and Applied Cryptography Engineering. Springer, 2016, pp. 271–280.
-  N. Hampton, Z. Baig, and S. Zeadally, “Ransomware behavioural analysis on Windows platforms,” Journal of information security and applications, vol. 40, pp. 44–51, 2018.
-  A. S. Shekhawat, F. Di Troia, and M. Stamp, “Feature analysis of encrypted malicious traffic,” Expert Systems with Applications, 2019.
-  G. Gu, J. Zhang, and W. Lee, “BotSniffer: Detecting botnet command and control channels in network traffic,” in NDSS ’08 Proceedings of the 15th Annual Network and Distributed System Security Symposium, 2008.
-  X. Ma, X. Guan, J. Tao, Q. Zheng, Y. Guo, L. Liu, and S. Zhao, “A novel IRC botnet detection method based on packet size sequence,” in 2010 IEEE International Conference on Communications. IEEE, 2010, pp. 1–5.
-  B. Anderson, S. Paul, and D. McGrew, “Deciphering malware’s use of TLS (without decryption),” Journal of Computer Virology and Hacking Techniques, pp. 1–17, 2016.
-  P. McLaren, G. Russell, and B. Buchanan, “Mining malware command and control traces,” in 2017 Computing Conference. IEEE, 2017, pp. 788–794.
-  B. Lin, Q. Hao, L. Xiao, L. Ruan, Z. Zhang, and X. Cheng, “Botnet emulation: challenges and techniques,” in Emerging Technologies for Information Systems, Computing, and Management. Springer, 2013, pp. 897–908.
-  C. P. Lee, “Framework for botnet emulation and analysis,” Ph.D. dissertation, Georgia Institute of Technology, 2009.
-  S. Sentanoe, B. Taubmann, and H. P. Reiser, “Virtual machine introspection based SSH honeypot,” in Proceedings of the 4th Workshop on Security in Highly Connected IT Systems. ACM, 2017, pp. 13–18.
-  D. N. Patil and B. B. Meshram, “Windows Physical Memory Analysis to Detect the Presence of Malicious Code,” in Recent Findings in Intelligent Computing Techniques. Springer, 2019, pp. 3–13.
-  E. Thioux, M. Amin, and O. A. Ismael, “System and method for analysis of a memory dump associated with a potentially malicious content suspect,” Feb. 5 2019, US Patent App. 10/198,574.
-  J. Feichtner, D. Missmann, and R. Spreitzer, “Automated Binary Analysis on iOS: A Case Study on Cryptographic Misuse in iOS Applications,” in Proceedings of the 11th ACM Conference on Security & Privacy in Wireless and Mobile Networks. ACM, 2018, pp. 236–247.
-  T. K. Lengyel, S. Maresca, B. D. Payne, G. D. Webster, S. Vogl, and A. Kiayias, “Scalability, Fidelity and Stealth in the DRAKVUF Dynamic Malware Analysis System,” in Proceedings of the 30th Annual Computer Security Applications Conference, ser. ACSAC ’14. ACM, 2014, pp. 386–395.
-  B. Taubmann, C. Frädrich, D. Dusold, and H. P. Reiser, “TLSkex: Harnessing virtual machine introspection for decrypting TLS communication,” Digital Investigation, vol. 16, pp. S114–S123, 2016.
-  “SSL Blacklist,” 2018. [Online]. Available: https://sslbl.abuse.ch
-  “VirusShare.” [Online]. Available: https://virusshare.com/
-  OpenSSL Software Foundation, “OpenSSL: Cryptography and SSL/TLS toolkit,” https://www.openssl.com/, 2018, [Online; accessed 29-Jan-2019].
-  Trend Micro, “Trend Micro,” https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digital-threats/online-banking-trojan-brief-history-of-notable-online-banking-trojans, 2015, [Online; accessed 20-Mar-2019].
-  “Trojan.Zbot,” 2016. [Online]. Available: https://www.symantec.com/security-center/writeup/2010-011016-3514-99
-  Panda, “Zeus is Still the Base of Many Current Trojans,” 2017. [Online]. Available: https://www.pandasecurity.com/mediacenter/panda-security/zeus-trojan/
-  R. PP, “Zeus/Zbot Trojan Attacks Credit Cards of Banks,” 2016. [Online]. Available: https://techpp.com/2010/07/15/zeuszbot-trojan-attacks-credit-cards-of-banks/
-  D. Jackson, “Gozi Trojan,” 2007.
-  M. Alvarez, T. Agayev, and T. Darsan, “Q1 2018 Results: Gozi (Ursnif) Takes Larger Piece of the Pie and Distributes IcedID,” 2018. [Online]. Available: https://securityintelligence.com/q1-2018-results-gozi-ursnif-takes-larger-piece-of-the-pie-and-distributes-icedid/
-  E. Brumaghin, H. Unterbrink, and A. Weller, “Gozi ISFB Remains Active in 2018, Leverages "Dark Cloud" Botnet For Distribution,” 2018. [Online]. Available: https://blogs.cisco.com/security/talos/gozi-isfb-remains-active-in-2018
-  M. Garnaeva, F. Sinitsyn, Y. Namestnikov, D. Makrushin, and A. Liskin, “Kaspersky Security Bulletin: Overall Statistics for 2016,” p. 31, 2016. [Online]. Available: https://kasperskycontenthub.com/securelist/files/2016/12/Kaspersky_Security_Bulletin_2016_Statistics_ENG.pdf
-  A. Mohanta, A. Saldanha, and P. Kimayong, “The Gozi Sleeper Cell,” 2018.
-  C. Weller, “CyberSecurity in 120 Secs_ The Comeback of Gozi Malware,” 2016. [Online]. Available: https://blog.ensilo.com/cyber-security-in-120-secs-the-comeback-of-gozi-malware
-  M.-E. M. Léveillé, “TorrentLocker,” ESET, Tech. Rep., 2014. [Online]. Available: http://www.welivesecurity.com/wp-content/uploads/2014/12/torrent_locker.pdf
-  M.-E. M. Léveillé, “TorrentLocker: Crypto-ransomware still active, using same tactics,” 2016. [Online]. Available: https://www.welivesecurity.com/2016/09/01/torrentlocker-crypto-ransomware-still-active-using-tactics/
-  J. M. Kambic, “Extracting CNG TLS/SSL Artifacts from LSASS Memory,” 2016.