How to Spot a Zero-Day Sight-Unseen

Zero-day attacks have businesses and consumers alike worried about how to protect data. If we don’t know what a threat looks like, can we really protect ourselves against it?

For some time, security tools have been developed with the objective of helping organizations defend against the unknown. But zero-day attacks are by definition brand new, leveraging new vulnerabilities or techniques, and that makes them difficult to detect with traditional security software.

Let’s look for a moment at antivirus or antimalware software. You have to install updates for these products continuously because new signatures (or threat “definitions”) for new strains or versions of malware are constantly being added to their dictionaries. But this doesn’t work for zero-days. What happens when the software doesn’t yet have a definition for a specific piece of malware?

The “old school” way of detecting malware relies on spotting indicators of compromise (IOCs). The problem with this method is that detecting malware via an IOC requires having seen that IOC before (see Figure 1). This is essentially a signature- and rules-based approach, which requires the software to be given the “definition” of an intrusion before it is able to spot it. As you can imagine, detecting zero-day malware through this process is extremely difficult: zero-day malware is brand new, which means that its specific IOCs haven’t been seen before.
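
To make the limitation concrete, here is a minimal sketch of a signature lookup, assuming a file's signature is simply the SHA-256 hash of its contents. The dictionary contents and the function name `is_known_malware` are hypothetical, chosen for illustration; real antimalware products use far richer definitions.

```python
import hashlib

# Hypothetical signature dictionary: hashes of samples that were seen
# and catalogued in the past (placeholder byte strings, not real malware).
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"known malware sample A").hexdigest(),
    hashlib.sha256(b"known malware sample B").hexdigest(),
}


def is_known_malware(file_bytes: bytes) -> bool:
    """Return True only if this exact content has been catalogued before."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_BAD_HASHES


# A catalogued sample matches its signature.
print(is_known_malware(b"known malware sample A"))
# A brand-new sample has no entry in the dictionary, so the lookup misses it.
print(is_known_malware(b"brand-new zero-day sample"))
```

The second lookup fails not because the sample is harmless, but because nothing in the dictionary describes it yet.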

Figure 1: Malware with known IOCs is detected. The zero-day malware avoids detection because its specific IOC has not yet been seen and catalogued.

These traditional applications can be very good at protecting against what has been seen before. But can you train a machine to spot malicious software that has never been observed in the wild? In other words, can a machine detect a zero-day?

In a nutshell, yes.

Behavioral analytics, which we can call the “new school” approach, enables an entirely different way of detecting threats that is uniquely suited to zero-day attacks. The trick is that nearly all malware, even a zero-day, exhibits behaviors that, once detected, become unavoidable clues about an attack in progress.

At its core, behavioral analytics uses anomaly detection to look for behaviors that differ from those of regular, well-behaved software. Malware behaves differently in some way, and it can therefore be detected by analytics (see Figure 2).

Another powerful advantage of relying on behaviors instead of signatures is that you can even detect so-called “fileless malware”: malware that exists only in memory and does not write itself to a file, making it nearly impossible for traditional file-based signature scanning and whitelist techniques to detect. But guess what? Fileless malware still exhibits behaviors that can be detected.
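
As a rough illustration of anomaly detection, the sketch below scores a new observation by how many standard deviations it sits from a baseline of normal activity. The metric (DNS queries per hour for one host), the sample numbers, and the threshold of 3.0 are all assumptions made for this example; a production system would model many behaviors with more robust statistics.

```python
from statistics import mean, stdev


def anomaly_score(baseline: list[float], observed: float) -> float:
    """Distance of `observed` from the baseline mean, in standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is maximally unusual.
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma


def is_anomalous(baseline: list[float], observed: float,
                 threshold: float = 3.0) -> bool:
    """Flag observations that fall far outside the host's normal range."""
    return anomaly_score(baseline, observed) > threshold


# Hypothetical baseline: DNS queries per hour from one host on normal days.
normal_dns_rate = [40, 42, 38, 45, 41, 39, 43, 44]

print(is_anomalous(normal_dns_rate, 44))   # within the usual range
print(is_anomalous(normal_dns_rate, 400))  # far outside the usual range
```

Nothing in this check depends on knowing which malware caused the spike; it only needs to know what normal looks like.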

Figure 2: All malware (even zero-day malware, depicted in red) triggers unusual behaviors, which can be detected.

Let’s consider how this detection process might play out by imagining the progression of the worst possible scenario: a zero-day, fileless attack (Figure 3). First, a phishing email delivers a malicious Word document to your computer. Next, the malware attempts to gain a foothold on the infected computer: Microsoft Word launches PowerShell, and registry keys are modified. PowerShell then communicates with the attacker’s command and control (C&C) server to prepare the payload. And finally, the payload is executed. Mission complete.

Figure 3: Sample timeline of a zero-day attack

Because this is zero-day malware, your antivirus software won’t recognize it when it enters your computer; it hasn’t “met” the malware before. Further, this is a fileless attack: it can’t be scanned as a file, doesn’t install any new executable code on the machine, and uses only programs that already exist on your computer. But several of the activities in the timeline are “unusual behaviors” that analytics can identify. Word launching PowerShell is an unusual parent-child process relationship. C&C domain queries produce anomalous DNS query frequency and time-of-day patterns. Protocol tunneling exhibits unusual network payload sizes for that protocol. And so on. In this case, the zero-day threat may be entirely unknown, but the behavioral changes it causes can’t be hidden.

The really cool thing about this approach is that behavioral analytics eliminates the need to know a threat before it comes your way. With behavioral analytics, you can see the warning signs of even brand-new malware before it does irreparable damage.
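
The parent-child process signal from the timeline above can be sketched in a few lines. This is a simplified illustration, not a description of any particular product: the baseline events, the process names, and the `min_count` threshold are hypothetical, and a real deployment would learn its baseline from endpoint telemetry over time.

```python
from collections import Counter

# Hypothetical baseline of (parent, child) process launches seen on a host
# during normal operation.
baseline_events = (
    [("explorer.exe", "winword.exe")] * 50
    + [("explorer.exe", "powershell.exe")] * 5
    + [("winword.exe", "splwow64.exe")] * 12
)

# Count how often each parent-child pair occurred in the baseline.
pair_counts = Counter(baseline_events)


def is_unusual_launch(parent: str, child: str, min_count: int = 1) -> bool:
    """Flag a launch whose parent-child pair was seen fewer than min_count times."""
    # Counter returns 0 for pairs that never appeared in the baseline.
    return pair_counts[(parent, child)] < min_count


# Opening Word from the desktop is routine on this host.
print(is_unusual_launch("explorer.exe", "winword.exe"))
# Word spawning PowerShell never appeared in the baseline, so it is flagged.
print(is_unusual_launch("winword.exe", "powershell.exe"))
```

The check knows nothing about the payload; it only notices that Word is doing something it has never done before on this machine.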

About the Author: Stephan Jou is the chief technology officer at Interset, an In-Q-Tel-backed security analytics company. He leads the development of advanced analytics and mathematical modeling for unsupervised machine learning to detect how corporate intellectual property is being attacked, moved, shared, and utilized. Prior to Interset, Jou served as a technical architect at IBM’s Business Analytics Office of the CTO, a role in which he architected the development of more than ten Cognos and IBM products in the areas of cloud computing, mobile, visualization, semantic search, data mining, and neural networks. Stephan has also published award-winning articles for Verizon’s Data Breach Investigations Report and the ISSA on the topics of deep learning, machine learning, and insider threats. Jou holds a Master of Science in computational neuroscience and biomedical engineering.

Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.