Malware Analysis – Lesson 1; an Introduction

In our last post I discussed how I have returned to college and will be digitizing my notes as I make them. This is the first post in that series. It covers a few areas of malware analysis at a high level, focusing on definitions and a few descriptive lines on each. These areas will be discussed in greater detail in future posts.

What is Malware Analysis?

Malware analysis is an extremely interesting area of cyber security where we take a piece of malware or malicious code and put it under a microscope. We examine the code with the aim of understanding it: how did it infect our system, how does it spread, what does it do, and what is its intent? The information we collect from this investigation can be used to stop the malware spreading, and can even help us improve our security to prevent a similar infection in future.

WannaCry is a great example of this. Marcus Hutchins[1] famously identified the malware’s kill switch using dynamic analysis: by monitoring the malware’s network connections and traffic he saw multiple queries being made to an unregistered domain. By registering that domain he unwittingly stopped the malware in its tracks. In his MalwareTech blog post he also gives us a good insight into how malware analysis is carried out:

  1. Look for unregistered or expired C2 domains belonging to active botnets and point it to our sinkhole (a sinkhole is a server designed to capture malicious traffic and prevent control of infected computers by the criminals who infected them).
  2. Gather data on the geographical distribution and scale of the infections, including IP addresses, which can be used to notify victims that they’re infected and assist law enforcement.
  3. Reverse engineer the malware and see if there are any vulnerabilities in the code which would allow us to take-over the malware/botnet and prevent the spread or malicious use, via the domain we registered. [2]

There are two types of malware analysis we are going to discuss next: static analysis and dynamic analysis.

What is Static Analysis?

With static analysis we analyse the malware itself, in the form of code reviews. By going through the code or the malware’s structure we try to identify the functions within it. This can be a very challenging task, especially once you see the methods malicious coders deploy to prevent analysis of their code.

This kind of analysis takes place when the malware is “at rest”, that is, when it is not being run, and it can be useful as a preliminary first step. There are two types of static analysis. Basic analysis makes use of simple hash comparisons (commonly seen in older antivirus) and the extraction of strings, headers, functions and system API calls to try to build a picture of what the malware is. Advanced analysis is much cooler: we disassemble the executable into assembly language and then review that, jumps and all! We have several tools to help us with this, IDA Pro being the one that comes to mind.
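
As a rough illustration of the basic static steps above (hashing a sample and pulling out readable strings), here is a minimal Python sketch. The function names, the four-character minimum and the sample bytes are my own assumptions for the example; a real sample would be read from disk inside an isolated lab.

```python
import hashlib
import re


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the sample as a hex string."""
    return hashlib.sha256(data).hexdigest()


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


if __name__ == "__main__":
    # Hypothetical stand-in bytes, not a real malware sample
    sample = b"\x00\x01MZ\x90\x00http://example.test/c2\x00\xff\xfeKERNEL32.dll\x00ab\x00"
    print(sha256_of(sample))
    print(extract_strings(sample))
```

The hash can be compared against known-bad lists, and the strings often hint at domains or API calls worth a closer look.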

What is Dynamic Analysis?

Dynamic analysis, as the name suggests, deals with malware “in process” (i.e. malware actively running on the victim’s machine). When we carry out this kind of analysis we knowingly execute the malware in order to observe (and document!) its impact on the victim. This gives us a much better and more detailed view of what the malware is doing, and can be especially useful if the executable is packed or encrypted in part or in whole. This is because the program has to be decrypted and unpacked in memory before it can run.

There are two types of dynamic analysis, as with static: basic and advanced. With basic dynamic analysis we run the malware and gather information on what it is doing. We can identify, for example, what changes are being made on the file system and to configuration files, what registry keys are being created and edited, what network activity is taking place, and more.

For advanced dynamic analysis we thoroughly debug the malware binary and step through its execution. This involves trying to identify each instruction and the outcome of that instruction, giving us a better understanding of everything the malware does.
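
To make the file system part of basic dynamic analysis a little more concrete, here is a minimal Python sketch that snapshots a directory before and after a sample runs and reports what changed. The function names are hypothetical, and this is only one simple way to do it; dedicated monitoring tools capture far more (registry, network, processes).

```python
import hashlib
import os


def snapshot(root: str) -> dict[str, str]:
    """Map each file path under root to the SHA-256 of its contents."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    state[path] = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                # Unreadable files are simply skipped in this sketch
                continue
    return state


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Return the paths created, deleted and modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

In use, we would call `snapshot()` on the victim’s directory of interest, execute the sample, call `snapshot()` again, and pass both results to `diff_snapshots()`.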

Should we do our analysis in a virtual or physical environment?

When we decide to get practical we must figure out whether to use a VM running on our laptop as a virtual environment or to get an old derelict computer as a physical environment. In general physical environments are preferred, as modern malware may attempt to detect whether it is running on a VM (and, if so, refuse to perform any actions). If we are using physical machines we need to take care that they are sufficiently segregated from our internal network and the public internet. There are a few ways to do this, mostly by restricting traffic on your firewall, or even by not connecting the device to the network at all (the infamous air-gap method). There are also tools to help us with snapshotting and rollbacks so we don’t have to reinstall after every execution, such as ghost imaging software or technology like Deep Freeze, which resets core configurations to a known good state on every reboot.

Having a virtual environment simply means using VMware or VirtualBox on your standard workstation to run VMs of your “victim” to execute the malware on. This can be great for speedy rollback, as there is usually snapshotting in your virtual environment, but if the malware detects the virtual nature of the environment it may not run. There may be ways to mask this, but I will need to research them for a later post.

What is automated analysis?

After reading about static and dynamic analysis, large portions look like they might be repetitive and tedious. Both of these traits indicate a good candidate for automation, and malware analysis is no exception. Automation frees up the ever more precious human analyst for more important work and reduces the ever-present risk of error to some degree. There are many automated analysis tools available. Some, such as Comodo Valkyrie and Threat Expert, are cloud-based tools to which we upload our malware samples, while others, such as ZeroWine and Buster, are locally installed. These analysers are sandboxed, so they can run malware automatically with less risk (because risk is never 0!) of compromising the host system. They are great for reducing the noise our analysts sift through and for highlighting the most important findings for further review, but they come with several drawbacks:

  • If the target malware is VM-aware, it may identify the analyser as a virtual environment and not execute as normal;
  • If the malware requires a particular trigger to run, an automated tool may not supply it. Such triggers include:
    • Requiring human interaction;
    • Executing at a particular time;
    • Executing only after a predefined action or similar condition, such as fileless cryptojackers waiting until the device has been idle for a period of time before starting to mine.
  • They might miss certain logical indicators that a human would not.

Despite these flaws, they are a great first step.

How does malware stop us from analysing it?

Malware has a few techniques it uses to try to prevent us from analysing it. A few have already been mentioned in the automated analysis section. If the malware detects it is running on a virtual machine it may not run normally, or at all. It might rely on triggers to avoid or delay detection of its activities until conditions are right, or it may even remain dormant for a long period of time, frustrating the analyst into believing it is benign. One very effective method malware employs against analysts is obfuscation.

Obfuscation is the use of a few techniques that make it challenging to read the code and identify its purpose. With encryption the malware is composed of two parts: one is the encrypted main body of code, and the other is the decryptor used to recover that code. In general a different encryption key is used for each iteration, resulting in different encrypted outputs and hashes, which confuses antivirus software. The decryptor itself tends to remain unchanged, however, providing a way to detect these infections.
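
The effect described above can be shown with harmless placeholder data. In this minimal Python sketch, a repeating-key XOR stands in for whatever cipher a real sample would use, and the stub and body are made-up strings; these are all my assumptions for illustration. The point is only that a new key changes the overall hash, while the fixed decryptor bytes stay searchable.

```python
import hashlib
from itertools import cycle

# Hypothetical placeholders: a fixed "decryptor" and a harmless "body"
DECRYPTOR_STUB = b"[decryptor stub placeholder]"
BODY = b"harmless placeholder for the main body of code"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (a toy stand-in for real encryption)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def build_sample(key: bytes) -> bytes:
    """Return the fixed stub followed by the body encrypted under key."""
    return DECRYPTOR_STUB + xor_bytes(BODY, key)


if __name__ == "__main__":
    for key in (b"\x11\x22\x33", b"\xa5\x5a\xc3"):
        sample = build_sample(key)
        digest = hashlib.sha256(sample).hexdigest()
        # The digest differs per key, but the stub is present every time
        print(digest[:16], sample.startswith(DECRYPTOR_STUB))
```

A hash-only signature would need a new entry for every key, whereas a signature on the stub bytes would match both samples.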

The malware may also use encoding such as XOR or Base64 to transform its code and make it less readable by humans. This is especially true for malware that uses custom encoding. It means that even once an analyst has unpacked the malware to get at its code, there is an extra layer of defense they must navigate. Defense in depth, it seems, is used by both sides in this computational war.
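
From the analyst’s side, peeling back simple encodings can look something like the following minimal Python sketch. It assumes (purely for illustration) a string that was XORed with a single-byte key and then Base64 encoded, and that we expect the decoded text to contain a known marker such as “http”. The function names and the sample text are hypothetical.

```python
import base64


def xor_single(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_xor(data: bytes, marker: bytes) -> list[tuple[int, bytes]]:
    """Try all 256 keys and keep the results that contain marker."""
    hits = []
    for key in range(256):
        candidate = xor_single(data, key)
        if marker in candidate:
            hits.append((key, candidate))
    return hits


if __name__ == "__main__":
    # Build a harmless encoded string so there is something to decode
    secret = b"connect to http://example.test"
    encoded = base64.b64encode(xor_single(secret, 0x5A))

    raw = base64.b64decode(encoded)      # layer 1: undo Base64
    for key, text in brute_force_xor(raw, b"http"):  # layer 2: undo XOR
        print(hex(key), text)
```

Custom encodings take more work than this, but the approach of undoing one layer at a time is the same.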

Packing is another way malware can try to obfuscate itself. Packing is used by many applications, malicious and legitimate alike. By packing the code we are compressing it for distribution. This can help malware avoid detection and analysis. Most packing tools are common and detectable, but the analyst must still unpack the malware before analysing it.

Had your fill of obfuscation yet? Or are you still finding it incredibly interesting? Code obfuscation is a common source of frustration for analysts worldwide. Malware authors play with their code to make it as confusing as possible. They might re-order the code so it does not flow in a logical fashion. They might insert functions and calls that do nothing (called dead code), leading analysts down a dead end. They might substitute common instructions with lesser-known equivalents. In some cases they may change the assembly language jump instructions to cause further confusion.
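
One common heuristic for spotting packed or compressed data, which is my addition rather than something covered in class, is to measure byte entropy: compressed data tends to look close to random. Here is a minimal Python sketch, using zlib compression as a stand-in for a packer and made-up sample data.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical "unpacked" data: plain digits and spaces
    plain = " ".join(str(i) for i in range(5000)).encode("ascii")
    packed = zlib.compress(plain)  # stand-in for a packer's output
    print(round(shannon_entropy(plain), 2))
    print(round(shannon_entropy(packed), 2))
```

A high score does not prove a file is packed, but it is a cheap hint that unpacking may be needed before static analysis will show anything useful.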
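
To give a feel for dead code and instruction substitution, here is a tiny and entirely harmless Python sketch of my own: two functions that do the same thing, one written plainly and one deliberately muddied. Real malware does this at the assembly level, so treat this only as an analogy.

```python
def add_one_clear(x: int) -> int:
    """The readable version: add one to x."""
    return x + 1


def add_one_obfuscated(x: int) -> int:
    """Same result as add_one_clear, written to be harder to follow."""
    junk = (x * 31) ^ 0x5F   # dead code: computed but never used in the result
    if junk == junk + 1:     # a condition that can never be true
        return -1            # dead end the analyst may waste time on
    return -(~x)             # substitution: -(~x) equals x + 1 for integers


if __name__ == "__main__":
    print(add_one_clear(41), add_one_obfuscated(41))
```

Both functions return the same value for any integer, but the second takes noticeably longer to read, which is exactly the point.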

Summary

For a first class, this was jam-packed with exciting information, especially around code obfuscation. If you thought that was skimmed over, don’t worry, I’ll be writing a dedicated post on it in the future! There should be one new blog post per week on malware analysis, so bookmark us and never miss a chapter! If you can’t wait that long to go further on your malware hunting journey, I recommend Malware Unicorn’s Reverse Engineering 101 for a cool course on both static and dynamic analysis: https://securedorg.github.io/RE101/

It is complete with some amazing graphics that put my text heavy blog to shame. 🙂

Until next time.


[1] https://twitter.com/MalwareTechBlog

[2] https://www.malwaretech.com/2017/05/how-to-accidentally-stop-a-global-cyber-attacks.html
