How to Use DNS Analytics to Find the Compromised Domain in a Billion DNS Queries

Co-authored by Chenta Lee, Aviv Ron, Marc Ph. Stoecklin and Emily Ratliff.

Finding a needle in a haystack is hard, but it’s nothing compared to finding a single piece of hay in the haystack that was put there with malicious intentions.

As Verisign noted in its August 2018 “Domain Name Industry Brief,” there were around 339.8 million registered domains at the end of the second quarter, with approximately 7.9 million new domains registered in the last year. Additionally, public Domain Name System (DNS) providers log hundreds of billions of queries every day. Cloudflare reported that it serves 130 billion DNS queries per day, and in 2014, Google reported that it served more than 400 billion DNS queries per day. Furthermore, Let’s Encrypt issues around 600,000 digital certificates per day.

Buried in all this information are clues to past compromise, ongoing attacks and future malware campaigns. The goal of advanced DNS analytics is to transform this data into actionable threat intelligence, which enables security teams to block domain names, detect infected devices, identify insider threats and assess an organization’s overall security hygiene.

Why Are DNS Analytics Important?

DNS is a critical component of the internet. It is essentially a telephone book that translates domain names into IP addresses. Because of its ubiquity, DNS gives advanced threat actors a highly attractive channel to exploit across a variety of threat vectors.

The image below shows the different ways actors misuse DNS in attacks, grouped by the stages of the Gartner Cyber Kill Chain. For example, a domain name that looks similar to a well-known domain can be easily overlooked by a user in a phishing email, or attackers may register domain names expecting victims to mistype their destination, a tactic known as typosquatting.

[Image: DNS Analytics]

The DNS protocol is also one of the few application protocols that are allowed to cross network perimeters of organizations. As such, DNS provides a channel for threat actors and botnet operators to establish hard-to-block and stealthy communication channels between infected devices and command-and-control (C&C) infrastructures. These include fast fluxing, domain generation algorithms (DGAs) and combinations thereof, as well as using the DNS protocol to exfiltrate secret information with covert tunneling techniques.

Today, organizations typically employ rules and filters based on blacklists to block known malicious domain names. However, comprehensive monitoring of DNS traffic is often overlooked in organizations’ cybersecurity strategies. That’s not to say DNS monitoring is trivial; in general, DNS-based attacks cannot be detected from a single DNS lookup. Detection requires advanced analytics of the context, including the history of lookups, the contents of the response and correlation with additional data sources, including proactive analysis of website contents.

The 4 Types of Advanced DNS Analytics

We can begin to understand the multiple types of advanced DNS analytics by breaking them down into four categories based on what the analysis is being used for:

  1. Threat intelligence — Identification of malicious domains (e.g., squatting, command-and-control, compromised name).

  2. Infection detection — Detection of infected endpoints (e.g., suspicious behavior patterns).

  3. Domain categorization — Automatic categorization of domain names (e.g., domains similar to unwanted known domains).

  4. Forensic markers — Providing actionable information for forensics (e.g., past lookups and responses).

The first category uses DNS analytics to generate new threat intelligence that can be used to block domain names, inhibiting future access to malicious domains. The focus here is on early identification of malicious domains, perhaps even before they have been actively spread and used in an attack. The second category, infection detection, is about finding compromised systems quickly based on suspicious DNS behavior. The more quickly compromised endpoints can be detected, the faster they can be remediated, thus limiting the damage that is done.

Domain categorization is similar to threat intelligence, but is broader in that domains can be classified as gambling, gaming or with other classification markers, rather than marked as explicitly malicious or not. Forensic markers bring in provable facts about historical information. Because DNS is highly dynamic, with associations between IP addresses and names changing at a high rate, it is important to maintain the exact history of which IP address an endpoint resolved a name to in the past as part of a threat hunt or forensic investigation.
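To make the forensic-marker idea concrete, here is a minimal sketch of a resolution history that answers the question "what did this name resolve to at that time?" The class name ResolutionHistory and its methods are hypothetical, and the in-memory layout is an assumption for illustration; a real deployment would sit on a persistent store.

```python
from bisect import bisect_right
from collections import defaultdict


class ResolutionHistory:
    """Hypothetical in-memory store: domain name -> time-ordered answers."""

    def __init__(self):
        self._times = defaultdict(list)    # name -> observation timestamps
        self._answers = defaultdict(list)  # name -> answers, parallel to _times

    def record(self, name, timestamp, ip):
        # Assumption: observations arrive in time order for each name.
        self._times[name].append(timestamp)
        self._answers[name].append(ip)

    def resolved_at(self, name, timestamp):
        """Return the latest answer observed at or before `timestamp`."""
        idx = bisect_right(self._times.get(name, []), timestamp)
        return self._answers[name][idx - 1] if idx else None
```

An investigator could then call resolved_at with the time of a suspicious connection to recover the IP address the endpoint actually received, even if the name points elsewhere today.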

Detecting Security Threats With a Proactive Approach

Significant research into using DNS analytics to detect fast fluxing and DGAs has been published in the years since IBM security researcher Douglas L. Schales socialized the concept in a report titled “Stream Computing for Large-Scale, Multi-Channel Cyber Threat Analytics: Architecture, Implementation, Deployment, and Lessons Learned.” With the above categories in mind and access to large data streams of ongoing DNS queries, domain name registration data (WHOIS) and digital certificate registration information, what types of analysis can we perform to identify security threats?

Ahead-of-threat detection is an advanced approach to protecting users from malicious domains. Generally, malicious domain protection involves a time-of-use approach in a browser. Ahead-of-threat detection is a fundamentally different approach to finding potentially malicious domains, and the bad actors behind those domains, before an actual security threat is visible.

Threat actors typically generate malicious domains by applying generation methods, e.g., domain squatting. Ahead-of-threat detection finds those newly registered malicious domains from Quad9 and identifies the actors behind the domains. Additional analytics on related data — WHOIS analytics and site image comparison, for example — can reduce the chance of a false positive.
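As a rough illustration of how squatting-style generation methods can be turned around for detection, the sketch below generates single-edit typo variants of a protected name and checks newly registered domains against them. The function names and the choice of edits (omission, substitution, transposition, duplication) are assumptions for this example, not the method used in the research described here. It also assumes each input is a registered domain whose first label is the name of interest.

```python
import string


def typo_variants(label):
    """Generate single-edit typo variants of a domain label (no TLD)."""
    alphabet = string.ascii_lowercase + string.digits
    variants = set()
    for i in range(len(label)):
        # Omission: drop one character.
        variants.add(label[:i] + label[i + 1:])
        # Substitution: replace one character.
        for ch in alphabet:
            variants.add(label[:i] + ch + label[i + 1:])
        # Transposition: swap two adjacent characters.
        if i + 1 < len(label):
            variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
        # Duplication: repeat one character.
        variants.add(label[:i] + label[i] + label[i:])
    variants.discard(label)  # the original name is not a typo of itself
    variants.discard("")
    return variants


def squatting_matches(brand, newly_registered):
    """Return newly registered names whose first label is a typo of `brand`."""
    candidates = typo_variants(brand)
    return [d for d in newly_registered if d.split(".")[0] in candidates]
```

Matches from a check like this are only candidates; the WHOIS and site image comparisons mentioned above would still be needed to weed out false positives.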

Our approach to ahead-of-threat protection is different. It proactively finds phishing sites targeting well-known domains, making it possible to monitor the life cycle of the malicious domains. We have investigated around 1 million potentially malicious domains and found that the number of existing suspicious domains for cryptocurrency sites is considerably higher than for other well-known domains. We can investigate the trend of malicious domains and improve the analysis by applying core technologies.

Content and information analysis is based on the fast stream of DNS queries. Such analytics have been used to detect DNS tunneling, aiming at identifying data exfiltration and other covert channels over DNS. The idea is to quickly filter out nontunnel traffic and flag suspicious activity, such as repeated queries to a particular domain, for additional review. This analytics technique was described in a research paper titled “Practical Comprehensive Bounds on Surreptitious Communication Over DNS.”
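One simple way to approximate this filtering step is to add up how much unique subdomain data a client sends to each domain within a time window and flag only the domains that exceed a budget. The sketch below is a simplification under stated assumptions, not the procedure from the cited paper: the 200-byte budget is arbitrary, and registered_domain naively takes the last two labels, whereas a real system would consult the Public Suffix List.

```python
from collections import defaultdict


def registered_domain(qname):
    # Naive assumption: the registered domain is the last two labels.
    return ".".join(qname.rstrip(".").lower().split(".")[-2:])


def tunnel_candidates(queries, max_bytes=200):
    """Flag domains whose unique subdomain payload exceeds `max_bytes`.

    `queries` is an iterable of query names seen from one client
    during one time window.
    """
    seen = defaultdict(set)  # registered domain -> unique subdomain prefixes
    for qname in queries:
        name = qname.rstrip(".").lower()
        domain = registered_domain(name)
        prefix = name[: -len(domain)].rstrip(".") if domain else ""
        if prefix:
            seen[domain].add(prefix)
    # Repeated lookups of the same name carry no new information,
    # so only unique prefixes count toward the volume.
    volume = {d: sum(len(p) for p in prefixes) for d, prefixes in seen.items()}
    return {d: v for d, v in volume.items() if v > max_bytes}
```

Ordinary traffic, such as repeated lookups of www.example.com, contributes only a few bytes and drops out immediately, leaving a short list for additional review.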

Volumetric analysis can be used to detect suspicious DNS activity by identifying anomalous peaks in DNS query traffic. When traffic spikes for a given domain name, peaks for benign domains often have normal behavior immediately preceding and following the peak, whereas peaks in queries for malicious domains do not. Volumetric DNS analytics can also be used to detect DNS rebinding attacks by characterizing the difference between benign resolution of IP addresses (common usage) and malicious usage for rebinding attacks.
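The peak comparison can be sketched as a check on the "shoulders" around the largest spike in a domain's query counts. Everything specific here is an assumption for illustration: the function name, the three-interval window, the use of the median as the baseline and the tolerance factor.

```python
from statistics import median


def peak_has_normal_shoulders(counts, window=3, tolerance=3.0):
    """Return True if traffic around the largest peak looks like baseline.

    `counts` holds per-interval query counts for a single domain.
    """
    peak = max(range(len(counts)), key=counts.__getitem__)
    before = counts[max(0, peak - window):peak]
    after = counts[peak + 1:peak + 1 + window]
    if not before or not after:
        return False  # peak at the edge of the series: not enough context
    baseline = median(counts)
    if baseline == 0:
        return False  # the domain is normally silent, so no normal level exists

    def near_baseline(values):
        mean = sum(values) / len(values)
        return baseline / tolerance <= mean <= baseline * tolerance

    return near_baseline(before) and near_baseline(after)
```

A domain with steady traffic and one spike passes the check, while a domain that appears out of nowhere, spikes and goes quiet does not, which makes it a candidate for closer inspection.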

Focus In on the Data by Limiting the Scope of Your Analysis

While volumetric analytics are extremely effective, efficiencies can be gained by limiting the scope of the analysis. This can be done by recognizing that malicious actors primarily use DNS in one of two ways:

  1. New domains created for the attack.
  2. Existing domains hijacked for malicious activity.

Focusing on newly observed domains (NODs) enables early detection analytics that respond to the observation of a new domain, which allows for early detection of malicious domains in the absence of full knowledge about the domain. Likewise, focusing on changes to an existing domain limits the volume of data that has to be processed and enables early detection.
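A minimal sketch of NOD tracking, assuming a hypothetical NewDomainTracker class and an arbitrary one-day window during which a domain still counts as new:

```python
class NewDomainTracker:
    """Hypothetical first-seen tracker for newly observed domains (NODs)."""

    def __init__(self, nod_window=86400):
        self.nod_window = nod_window  # seconds a domain stays "new" (assumed)
        self._first_seen = {}         # normalized domain -> first observation

    def observe(self, domain, timestamp):
        """Record a lookup; return True while the domain still counts as new."""
        key = domain.rstrip(".").lower()
        first = self._first_seen.setdefault(key, timestamp)
        return timestamp - first < self.nod_window
```

Only lookups for which observe returns True need to be passed to the more expensive early-detection analytics, which is where the reduction in data volume comes from.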

Using the adversarial technique of fast fluxing, an attacker quickly cycles through a large number of different IP addresses for a domain name in an attempt to avoid being blocked. Schales described two analysis techniques for fast flux detection using MapReduce and Feature Collection and Correlation Engine (FCCE) in “Scalable Analytics to Detect DNS Misuse for Establishing Stealthy Communication Channels,” which was published in the 2016 special issue of the IBM Journal of Research and Development on Cybersecurity.
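The core signal behind fast flux detection, many distinct IP addresses for one name in a short time, can be sketched in a few lines. This is a plain single-process stand-in, not the MapReduce or FCCE implementations described in the paper, and the class name, window length and threshold are assumptions.

```python
from collections import defaultdict, deque


class FastFluxDetector:
    """Flag domains that resolve to many distinct IPs within a time window."""

    def __init__(self, window_seconds=3600, max_distinct_ips=10):
        self.window = window_seconds
        self.threshold = max_distinct_ips
        self._events = defaultdict(deque)  # domain -> deque of (timestamp, ip)

    def observe(self, domain, timestamp, ip):
        """Record one answer; return True if the domain exceeds the threshold."""
        events = self._events[domain]
        events.append((timestamp, ip))
        # Drop answers that fell out of the window (assumes time-ordered input).
        while events and events[0][0] <= timestamp - self.window:
            events.popleft()
        return len({addr for _, addr in events}) > self.threshold
```

Large content delivery networks also rotate addresses, so a threshold like this would need tuning or an allowlist before it could be used for blocking.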

Another way to reduce the volume of DNS query data is to focus on failed domain resolutions (NXDOMAIN). In that same paper, Schales described using this technique to detect domain fluxing, an adversarial technique whereby malware algorithmically computes domains for its C&C server and tries to resolve these domains until it finds a live server. Looking at the failed domain names, this technique extracts 43 features of each domain, such as whether it is punycode-encoded and the language(s) of any dictionary words it contains. Using these features, the technique calculates a weighted confidence score to measure the likelihood that the domain belongs to a malicious domain cluster.
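To show the general shape of feature extraction followed by a weighted score, here is a small sketch. It uses five illustrative features rather than the 43 from the paper, and the feature definitions, normalization constants and WEIGHTS values are all assumptions made up for this example.

```python
import math
from collections import Counter


def features(domain):
    """Extract a few illustrative features, each scaled to the range 0..1."""
    label = domain.split(".")[0].lower()
    n = len(label) or 1
    counts = Counter(label)
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    return {
        "is_punycode": 1.0 if label.startswith("xn--") else 0.0,
        "length": min(len(label) / 30, 1.0),
        "entropy": min(entropy / 4.0, 1.0),
        "digit_ratio": sum(ch.isdigit() for ch in label) / n,
        "vowel_deficit": 1.0 - sum(ch in "aeiou" for ch in label) / n,
    }


# Hypothetical weights; a real system would fit these to labeled data.
WEIGHTS = {
    "is_punycode": 1.0,
    "length": 1.0,
    "entropy": 2.0,
    "digit_ratio": 1.5,
    "vowel_deficit": 1.0,
}


def confidence(domain, weights=WEIGHTS):
    """Weighted average of the features: higher means more DGA-like."""
    values = features(domain)
    return sum(weights[k] * values[k] for k in weights) / sum(weights.values())
```

Applied to failed lookups only, a score like this ranks random-looking names above dictionary-like ones, so analysts or downstream clustering can start from the top of the list.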

In many deployments, the volume of DNS queries and responses makes scaling a challenge in terms of both central processing unit (CPU) and memory. Attempts to perform in-depth analysis of every DNS name will simply not scale. To tackle this, we are exploring the use of sketch/streaming algorithms, which are typically approximation algorithms that operate very efficiently. We utilize these as a first-line method for identifying candidates for in-depth analysis. This, combined with a high-speed store that contains a recent history of DNS data, has proven to be highly scalable.
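The article does not say which sketch algorithms are in use, so as one well-known example of the family, here is a minimal Count-Min sketch. It estimates per-domain query counts in fixed memory and may overestimate but never underestimates. The width and depth values are arbitrary assumptions.

```python
import hashlib


class CountMinSketch:
    """Approximate counter with fixed memory: depth rows of width counters."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, item):
        # One independent hash per row, derived by salting BLAKE2b with the row.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._indexes(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest upper bound on the true count.
        return min(self.table[row][col] for row, col in self._indexes(item))
```

Domains whose estimated counts cross a threshold can then be handed to the in-depth analytics, while the vast majority of names are never examined individually.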

Get Creative to Separate the Malicious From the Benign

Each of these DNS analytics techniques uses a unique method of analyzing DNS data to advance the goal of quickly finding and eliminating threats and to help threat analysts make the best and fastest determinations possible. Predictive analysis powered by DNS analytics reduces the complexity of security operations by providing actionable threat intelligence. DNS analytics also help threat hunters to narrow down their investigations by providing more indicators. Research is ongoing in this field, so we can expect to see more interesting ways of processing the high volume of DNS data into actionable intelligence.

When MythBusters raced to find the needle in a haystack, co-host Adam Savage floated the hay over water so the needle would sink below the hay. IBM Research is approaching advanced DNS analytics in similarly novel ways to separate the malicious from the benign as efficiently as possible.