Malware Root Cause Analysis

Sunday, July 29, 2012 Posted by Corey Harrell
The purpose to performing root cause analysis is to find the cause of a problem. Knowing a problem’s origin makes it easier to take steps to either resolve the problem or lessen the impact the next time the problem happens again. Root cause analysis can be conducted on a number of issues; one happens to be malware infections. Finding the cause of an infection will reveal what security controls broke down that let the malware infect the computer in the first place. In this post I’m expanding on my Compromise Root Cause Analysis Model by showing how a malware infection can be modeled using it.

Compromise Root Cause Analysis Revisited


The Compromise Root Cause Analysis Model is a way to organize information and artifacts to make it easier to answer questions about a compromise. The attack artifacts left on a network and/or computer fall into one of these categories: source, delivery mechanism, exploit, payload, and indicators. The relationship between the categories is shown in the image below.


I’m only providing a brief summary about the model but for more detailed information see the post Compromise Root Cause Analysis Model. At the model’s core is the source of the attack; this is where the attack came from. The delivery mechanisms are for the artifacts associated with the exploit and payload being sent to the target. Lastly, the indicators category is for the post compromise activity artifacts. The only thing that needs to be done to use the model during an examination is to organize any relevant artifacts into these categories. I typically categorized every artifact I discover as an indicator until additional information makes me move them to a different category.

Another Day Another Java Exploit


I completed this examination earlier in the year but I thought it made a great case to demonstrate how to determine a malware infection’s origin by using the Root Cause Analysis Model. The examination was kicked off when someone saw visual indicators on their screen that their computer was infected. My antivirus scan against the powered down computer confirmed there was an infection as shown below.


The antivirus scan flagged four files as being malicious. Two binaries (fuo.exe and 329991.exe) were identified as the threat: Win32:MalOb-GR[Cryp]. One cached webpage (srun[1].htm) was flagged as JS:Agent-PK[Trj] while the ad_track[1].htm file was identified as HTML:RedirME-inf[Trj]. A VirusTotal search on the fuo.exe file’s MD5 hash provided more information about the malware.

I mentally categorized the four files as indicators of the infection until it’s proven otherwise. The next examination step that identified additional artifacts was timeline analysis because it revealed what activity was occurring on the system around the time when malware appeared. A search for the files fuo.exe and 329991.exe brought me to the portion of the timeline shown below.


The timeline showed the fuo.exe file was created on the computer after the 329991.exe file. There were also indications that Java executed; the hsperfdata_username file was modified which is one artifact I highlighted in my Java exploit artifact research. I was looking at the activity on the system before the fuo.exe file appeared which is shown below.


The timeline confirmed Java did in fact execute as can be seen by the modification made to its prefetch file. The file 329991.exe was created on the system at 1/15/2012 16:06:22 which was one second after a jar file appeared in the anon user profile’s temporary folder. This activity resembles exactly how an infection looks when a Java exploit is used to download malware onto a system. However, additional legwork was needed to confirm my theory. Taking one look at the jar_cache8544679787799132517.tmp file in JD-GUI was all that I needed. The picture below highlights three separate areas in the jar file.


The first area labeled as 1 shows a string being built where the temp folder (str1) is added to 329991.exe. The second area labeled as 2 first shows the InputStream function sets the URL to read from while the FileOutputStream function writes the data to a file which happens to be str3. Remember that str3 contains the string 329991.exe located inside the temp folder. The last highlighted area is labeled as 3 which is where the Runtime function starts to run the newly created 329991.exe file. The analysis on the jar file confirmed it was responsible for downloading the first piece of malware onto the system. VirusTotal showed that only 8 out of 43 scanners identified the file as a CVE-2010-0840 Java exploit. (for another write-up about how to examine a Java exploit refer to the post Finding the Initial Infection Vector). At this point I mentally categorized all of the artifacts associated with Java executing and the Java exploit under the exploit category. The new information made me move 329991.exe from the indicator to the payload category since it was the payload of the attack.

I continued working the timeline by looking at the activity on the system before the Java exploit (jar_cache8544679787799132517.tmp) appeared on the system. I noticed there was a PrivacIE entry for a URL ending in ad_track.php. PrivacIE entries are for 3rd party content on websites and this particular URL was interesting because Avast flagged the cached webpage ad_track[1].htm. I started tracking the URLs in an attempt to identify the website that served up the 3rd party content. I didn’t need to identify the website per say since I already reached my examination goal but it was something I personally wanted to know. I gave up looking after spending about 10 minutes working my way through a ton of Internet Explorer entries and temporary Internet files for advertisements.


I answered the “how” question but I wanted to make sure the attack only downloaded the two pieces of malware I already identified. I went back in the timeline to when the fuo.exe file was created on the system. I started looking to see if any other files were created on the system but the only activity I really saw involved the antivirus software installed on the system.


Modeling Artifacts


The examination identified numerous artifacts and information about how the system was compromised. The diagram below shows how the artifacts are organized under the Compromise Root Cause Analysis Model.


As can be seen in the picture above the examination did not confirm all of the information about the attack. However, categorizing the artifacts helped make it possible to answer the question how did the system become infected. It was a patching issue that resulted in an extremely vulnerable Java version running on the system. In the end not only did another person get their computer cleaned but they also learned about the importance of installing updates on their computer.


Usual Disclaimer: I received permission from the person I assisted to discuss this case publicly.
  1. Corey,

    Great stuff! This really goes a long way toward helping to codify and document the root cause of an incident. Thanks so much for sharing this insightful, well written, and well thought out post.

  2. Corey,

    I'd be interested in knowing if anything appeared in the shimcache data...

  3. Harlan,

    I wanted to know the same thing when I went back over the case for this post. I did my analysis in January 2012 and I first learned about the shim cache in April 2012 when Mandiant’s post came out about it. The shim cache would have been useful since the prefetch files didn’t show the malware executing or provide additional timestamps. Unfortunately, the only thing I held onto from the case was the exploit, my timeline, and my notes.

  4. Anonymous

    Hi Corey,

    Very interesting stuff. How would you compare this to VERIS or the work on malware root cause analysis that I did in http://download.microsoft.com/download/0/3/3/0331766E-3FC4-44E5-B1CA-2BDEB58211B8/Microsoft_Security_Intelligence_Report_volume_11_Zeroing_in_on_Malware_Propagation_Methods_English.pdf ?

  5. @anon,

    Thanks for the comment; it took a bit to respond since I was reading a few things about what you mentioned.

    First, I really enjoy reading the Microsoft security reports because they provide a wealth of information. I missed the report you mentioned but read it once I saw your link. It was good by the way. We are both talking about the same issue but doing it from different perspectives. I think this is the main difference between your report and what I'm talking about. The report like most Microsoft or other security reports are coming from the malware analysis perspective. Looking at understnading malware once it is already on a computer. Don't get me wrong, the malware analysis perspective is needed and I enjoy reading about it. However, my perspective is coming from the malware forensics side of the house. Looking at a system to find the malware and determine how it got there. Your report even mentions on page 12 "the actual method of infection is very difficult to determine without performing forensic work on each computer". The comment was made in reference to the MSRT detections but that's the point to the Compromise Root Cause Analysis Model. To help those performing forensic work to determine the actual method of infection.

    I never tried to compare my model against others such as VERIS. Thinking about it I don't think its comparable to any model. The other models may be helpful by providing a language to describe security incidents. To me they aren't much help in telling me about if malware on the system came from a drive-by targetting Java, a malicious email attachment, or a network share. This is where the Root Cause Analysis Model comes into play; it helps to answer the "how" of a security incident occured.

Post a Comment