Static Analysis - process of analyzing the code or structure of a program to determine its function. The program itself is not run.
Antivirus Scanning: A Useful First Step
A good first step - Run the suspect file through multiple antivirus tools. They rely on a database of identifiable pieces of known suspicious code (signatures) as well as behavioral and pattern-matching analysis (heuristics) to identify suspect files.
Problem - Malware writers can easily modify their code, changing their programs' signatures and evading antivirus scanners.
Websites like VirusTotal allow you to upload a file for scanning by multiple antivirus engines and generate a report that gives the total number of engines that marked it as malicious, the malware name and additional information.
Hashing: A Fingerprint for Malware
A common method to uniquely identify malware. Running the malware through a hashing program produces a hash that identifies it (a fingerprint). MD5 and SHA-1 are commonly used.
e.g. Using the freely available md5deep
C:\>md5deep c:\WINDOWS\system32\sol.exe
Once you have a unique hash you can:
- use it as a label
- share with other analysts to help them identify malware
- search for it online to see if it has already been identified
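If md5deep is not at hand, the same fingerprint can be computed with a few lines of Python. This is a minimal sketch using the standard hashlib module; the function name file_hashes is my own, and SHA-256 is included as an extra that the notes above do not mention.

```python
import hashlib

def file_hashes(path, chunk_size=65536):
    """Return MD5, SHA-1 and SHA-256 hex digests for the file at `path`."""
    # SHA-256 is an addition here; the notes above only mention MD5 and SHA-1.
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as f:
        # Read in chunks so a large sample does not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for d in digests.values():
                d.update(chunk)
    return {name: d.hexdigest() for name, d in digests.items()}
```

Calling file_hashes on the path of a sample returns a dictionary keyed by algorithm name, and any of the values can be used as a label or pasted into an online search.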
Finding Strings
A program contains strings if it prints a message, connects to a URL, or copies a file to a specific location.
Searching through the strings can give us hints about the functionality of the program. e.g. if the program accesses a URL, you will see that URL stored as a string. We can use the Strings program to search an executable for strings stored in ASCII or Unicode format.
Both ASCII and Unicode store characters in sequences that end with a NULL terminator. ASCII strings use 1 byte per character and Unicode uses 2 bytes per character.
e.g. ASCII representation of the string "BAD"
e.g. Unicode representation of the string "BAD"
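The two representations can be reproduced in Python. This is a small sketch that assumes "Unicode" here means UTF-16 little-endian, which is how Windows stores wide strings.

```python
text = "BAD"

# ASCII: one byte per character, followed by a single NULL byte.
ascii_bytes = text.encode("ascii") + b"\x00"

# Wide (UTF-16LE) string: two bytes per character, followed by a
# two-byte NULL terminator. Assumes "Unicode" means UTF-16LE.
wide_bytes = text.encode("utf-16-le") + b"\x00\x00"

print(ascii_bytes.hex(" "))  # 42 41 44 00
print(wide_bytes.hex(" "))   # 42 00 41 00 44 00 00 00
```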
When Strings searches executables for ASCII and Unicode strings, it ignores context and formatting, so that it can analyze any file type and detect strings across an entire file (this also means that it may report sequences of bytes as strings when they are not). Sometimes it incorrectly interprets memory addresses, CPU instructions or data as strings. Fortunately, most invalid strings are obvious because they do not represent legitimate text.
If a string is short and doesn't correspond to words, it is mostly meaningless. Here, the strings GetLayout and SetLayout are Windows functions used by the Windows graphics library. Windows function names normally begin with a capital letter, and each subsequent word also begins with a capital letter.
GDI32.DLL is meaningful because it's the name of a common Windows DLL used by graphics programs. We also see an IP address, which the malware will most likely use in some fashion.
Finally, the last string is an error message. Error messages often give us the most useful information. This message reveals two things: The subject malware sends messages (probably through email) and it depends on a mail system DLL. This suggests that we might want to check email logs for suspicious traffic, and that another DLL might be associated with this malware.
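To get a feel for what a tool like Strings does, here is a rough Python sketch. It is not how the Strings program is actually implemented; the function name extract_strings and the default minimum length of 4 are my own assumptions.

```python
import re

def extract_strings(data, min_len=4):
    """Return printable ASCII and UTF-16LE strings of at least `min_len` chars."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters, each followed by a zero byte.
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(data)]
    return found
```

Like the real tool, this ignores context completely, so it will also report runs of code or data bytes that happen to look like text. Usage would be something like extract_strings(open(path, "rb").read()), where path points at the sample.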
Packed and Obfuscated Malware
Obfuscated programs are ones whose execution the malware author has attempted to hide. Packed programs are a subset of obfuscated programs in which the malicious program is compressed and cannot be analyzed. Both techniques severely limit our attempts to statically analyze the malware.
Packed malware will contain very few strings.
Packing Files
When the packed program is run, a small wrapper program also runs to decompress the packed file and then run the unpacked file. When the packed program is analyzed statically only the small wrapper program can be dissected.
One way to detect packed files is the PEiD program.
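PEiD works by matching known packer signatures. A different rough heuristic, which is not covered in these notes and is only my own illustration, is to measure byte entropy: compressed data tends to look close to random. A minimal sketch (the function name shannon_entropy is hypothetical, and no cut-off value is implied):

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)  # how often each byte value occurs
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A value near 8 for a file or section suggests compressed or encrypted content, but this is only a hint and should be confirmed with other tools.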
When a program is packed, you must unpack it before you can analyze it. Unpacking is often complex, but the UPX packer is very popular and programs packed with it are easy to unpack. To decompress a UPX-packed program:
upx -d <packed program.exe>
PE File Format
Understanding the format can reveal important information about the malware. See "Dissecting The PE File Format Series" to learn more.
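As a taste of what the header contains, the sketch below reads a few fields straight from the raw bytes. The offsets come from the published PE/COFF layout rather than from these notes, and the function name read_pe_summary is my own. It only looks at the file header and section names, not the imports.

```python
import struct

def read_pe_summary(data):
    """Return a few basic header fields from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C (e_lfanew) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the signature.
    machine, num_sections, timestamp, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # The section table follows the optional header; each entry is 40 bytes
    # and starts with an 8-byte, NULL-padded name.
    table = pe_offset + 4 + 20 + opt_size
    names = []
    for i in range(num_sections):
        raw = data[table + 40 * i : table + 40 * i + 8]
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return {"machine": machine, "timestamp": timestamp, "sections": names}
```

The timestamp is the compile time as recorded by the linker (easy to fake), and unusual section names can be a first hint that something other than a standard compiler produced the file.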
Linked Libraries and Functions
One of the most useful pieces of information about an executable is the list of functions it imports. Imports are functions used by one program that are stored in a different program, such as code libraries that contain functionality common to many programs. Code libraries can be connected to the main executable by linking.
Programmers can link imports to their programs so that they don't need to re-implement certain functionality in multiple programs. Code libraries can be linked statically, at runtime or dynamically. Knowing how the library code is linked is critical to our understanding of malware because the information we can find in the PE file header depends on how the library code has been linked.
Static, Runtime and Dynamic Linking
Static Linking - least common. All code from the library is copied into the executable, which makes the executable grow in size. When analyzing code, it's difficult to differentiate between statically linked code and the executable's own code, because nothing in the PE file header indicates that the file contains linked code.
Runtime linking is commonly used in packed or obfuscated malware. Executables that use runtime linking connect to a library only when one of its functions is needed, not at program start as dynamically linked programs do.
Several Windows functions allow programmers to import linked functions not listed in the program's file header. Of these, the two most commonly used are LoadLibrary and GetProcAddress. LdrGetProcAddress and LdrLoadDll are also used. LoadLibrary and GetProcAddress allow a program to access any function in any library on the system, which means that when these functions are used, you can't tell statically which functions are being linked to by the suspect program.
Dynamic Linking is the most common. When libraries are dynamically linked, the host OS searches for the necessary libraries when the program is loaded. When the program calls the linked library function, that function executes within the library.
The PE file header stores information about every library that will be loaded and every function that will be used by the program. The libraries used and functions called are often the most important parts of a program, and identifying them is particularly important, because it allows us to guess what the program does. e.g. if it imports URLDownloadToFile, it is an easy guess that the program downloads a file from the internet.
Exploring Dynamically Linked Functions with Dependency Walker
Dependency Walker lists the dynamically linked functions in an executable.
The left pane shows the DLLs being imported. When you click on one of them, the upper right pane shows the functions imported from that DLL. The middle right pane lists all functions in that DLL that can be imported. Functions can be imported by name or by ordinal. When a function is imported by ordinal, its name never appears in the original executable, which makes analysis harder. In this case, you can identify the imported function by looking up its ordinal in the middle right pane.
Imported Functions
The PE header will include information about specific functions used by an executable. The names of these Windows functions can give us a good idea about what the executable does.
Exported Functions
Like imports, DLLs and EXEs export functions to interact with other programs and code. Typically, a DLL implements one or more functions and exports them for use by an executable that can then import and use them.
The exported functions can be found in the PE file. Exports are most commonly found in DLLs. Exported functions are generally named meaningfully, but the author can give them any name, so the names of exported functions are of limited use when analyzing malware.
Static Analysis in Practice
PotentialKeylogger.exe
We see that this executable imports the following functions.
You see a huge list of imported functions but only some of them are useful.
The imports from kernel32.dll tell us that this software can open and manipulate processes and files. The functions FindFirstFile and FindNextFile can be used to navigate through directories.
The imports from user32.dll have a lot of GUI based functions which suggests that this program might have a GUI (it may not necessarily be displayed to the user).
The function SetWindowsHookEx is commonly used in spyware and is the most common way that keyloggers receive keyboard input. This function has some legitimate uses, but if you suspect malware and see this import, you are probably looking at keylogging functionality.
The function RegisterHotKey is also interesting. It registers a hot key so that whenever the user presses the hot key combination the application is notified.
The imports from GDI32.dll are graphics related and confirm that this probably has a GUI.
The imports from shell32.dll tell us that this program can launch other programs, another common feature of malware.
The imports from Advapi32.dll tell us that this program uses the registry, which in turn tells us that we should look for strings that look like registry keys. Registry keys look a lot like directories. In this case we found Software\Microsoft\Windows\CurrentVersion\Run, which controls the programs that are automatically run when Windows starts up.
This program also has a few exports: LowLevelKeyboardProc and LowLevelMouseProc.
Microsoft documentation - "The LowLevelKeyboardProc hook procedure is an application-defined or library-defined callback function used with the SetWindowsHookEx function." In other words, this function is used with SetWindowsHookEx to specify which function is called when a specific event occurs - in this case, a low-level keyboard event. We were able to learn this because the programmer gave the export the same name that the Microsoft documentation uses.
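The reasoning in this walkthrough (spot an interesting import, map it to a likely capability) can be sketched as a simple lookup. The table below only contains functions mentioned in these notes, the wording of the hints is mine, and the names IMPORT_HINTS and flag_imports are hypothetical. It is a triage aid, not a verdict.

```python
# Hints drawn from the notes above; illustrative, not exhaustive.
IMPORT_HINTS = {
    "SetWindowsHookEx": "installs a hook; common in keyloggers",
    "RegisterHotKey": "registers a hot key that notifies the program",
    "FindFirstFile": "navigates through directories",
    "FindNextFile": "navigates through directories",
    "URLDownloadToFile": "downloads a file from a URL",
    "LoadLibrary": "runtime linking; real imports may be hidden",
    "GetProcAddress": "runtime linking; real imports may be hidden",
}

def flag_imports(imports):
    """Map each interesting imported function name to a capability hint."""
    hits = {}
    for name in imports:
        base = name
        # Many Windows functions come in ANSI/wide variants with an A or W
        # suffix (e.g. SetWindowsHookExA), so try the name without it too.
        if name[-1:] in ("A", "W") and name[:-1] in IMPORT_HINTS:
            base = name[:-1]
        if base in IMPORT_HINTS:
            hits[name] = IMPORT_HINTS[base]
    return hits
```

For example, flag_imports(["SetWindowsHookExA", "CreateFileW", "RegisterHotKey"]) returns entries for the first and last names only, since CreateFile is not in the table.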
PackedProgram.exe - A Dead End
If a program is packed, you will see very few imported functions, which prevents us from learning more through static analysis.
The PE File Headers and Sections
Understanding the PE header and the information obtained from it is very useful to understand certain malware functionality. See the PE File Format series for more information.