There’s no doubt that artificial intelligence (AI) for cybersecurity is surrounded by an incredible amount of hype.
Cognitive intelligence and machine learning have the potential to combat a series of tricky issues facing the industry. Enterprise security leaders are fighting a widening skills gap, a higher volume of sophisticated attacks and a wider attack surface. Cybersecurity AI has enormous potential to identify advanced persistent threats (APTs) and create automation and efficiency in the overworked security operations center (SOC). AI can learn and adapt to detect anomalies from a real-time stream of data on zero-day threats, attempted data breaches or breach events with data loss.
However, AI’s potential for automation isn’t just limited to the enterprise. Machine learning techniques can be used for both attack and defense, and cybersecurity AI should be viewed as a potential attack vector for adversarial machine learning.
What You Need to Know About Adversarial Machine Learning
Security experts are increasingly concerned about the use of adversarial machine learning techniques aimed directly at cybersecurity AI in the enterprise. Deepfakes, or AI-generated social engineering attacks involving video and audio files, were at the top of MIT Technology Review‘s 2019 list of emerging cyberthreats. Highly convincing media files can now be “pulled off by anyone with a decent computer and a powerful graphics card,” wrote Martin Giles, the San Francisco bureau chief of MIT Technology Review. Deepfakes can be deployed for convincing social engineering attacks, whaling or even to cause mass damage, such as the manipulation of stock prices.
Safely deploying cybersecurity AI requires security leaders to understand the types of adversarial techniques, tactics and procedures (TTPs) attackers use to trick, poison or mislead enterprise AI tools. Some common TTPs include:
- Adversarial inputs — data crafted to be reliably misclassified by AI technologies so that threat actors can evade detection. This category includes malicious documents and attachments designed to slip past spam filters or antivirus tools.
- Data poisoning — the method of feeding “poisoned” training data to cybersecurity tools. Poisoning attacks can occur when data is fed to a classifier to skew the machine learning model’s ability to distinguish adverse events from normal events.
- Feedback weaponization — a method of data poisoning that tricks a machine learning model into generating an enormous volume of false positives to create excessive noise in the SOC and evade detection.
- Model stealing — an attack that incorporates techniques used to create a duplicate of a machine learning model or steal model training data. This methodology can be used to steal AI models used to classify incidents, events and malicious content. Stealing models enables bad actors to develop sophisticated, highly targeted attacks against cybersecurity AI.
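Data poisoning in particular is easy to illustrate. The toy classifier below is a hypothetical sketch, not any real product's detection logic: it scores messages by how often each token appeared in training messages labeled spam, and a handful of deliberately mislabeled training samples is enough to flip its verdict.

```python
# Toy illustration of a data-poisoning attack (hypothetical, simplified).
# A naive classifier scores messages by how often each token appeared in
# training messages labeled "spam". Flipping a few training labels
# ("poisoning") shifts those scores and lets spam evade detection.

from collections import Counter

def train(samples):
    """samples: list of (tokens, label) pairs. Returns per-token spam score."""
    spam, ham = Counter(), Counter()
    for tokens, label in samples:
        (spam if label == "spam" else ham).update(tokens)
    return {t: spam[t] / (spam[t] + ham[t]) for t in spam | ham}

def is_spam(scores, tokens, threshold=0.5):
    relevant = [scores[t] for t in tokens if t in scores]
    return bool(relevant) and sum(relevant) / len(relevant) > threshold

clean_data = [
    (["win", "prize", "now"], "spam"),
    (["win", "money", "fast"], "spam"),
    (["meeting", "agenda", "notes"], "ham"),
    (["project", "status", "notes"], "ham"),
]

# The attacker injects spam-looking messages mislabeled as "ham".
poisoned_data = clean_data + [
    (["win", "prize", "now"], "ham"),
    (["win", "prize", "money"], "ham"),
]

msg = ["win", "prize", "now"]
print(is_spam(train(clean_data), msg))     # True — caught by the clean model
print(is_spam(train(poisoned_data), msg))  # False — poisoned model misses it
```

Real poisoning attacks target far more robust models, but the mechanism is the same: corrupt the training inputs and the classifier's decision boundary moves with them.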
Researchers have demonstrated the potential of each of these categories of attacks. However, there’s little evidence in industry studies of data breaches that machine learning attacks have begun occurring at scale. While the “evil robots” haven’t fully arrived yet, they’re probably on their way.
According to cybersecurity author Stuart McClure, “a battle between AI systems in cybersecurity … will come in the next three to five years.” While there’s no reason to panic just yet, it’s smart to prepare for a future in which adversarial AI could reach maturity and become commoditized like ransomware and distributed denial-of-service (DDoS) threats.
3 Categories of Cybersecurity AI Risk
Despite the buzz around cybersecurity AI and research showing that chief information security officers (CISOs) are confident in this technology, adoption statistics show the industry hasn't fully wrapped its arms around cognitive intelligence in the SOC. Nearly one-third of cybersecurity teams have deployed AI, with the majority still focused on testing solutions.
Realizing the remarkable benefits of AI in the SOC requires organizations to recognize that machine learning can be a potential attack vector or an asset. Furthermore, safely deploying AI requires organizations to address data, resources and human risks.
1. Data
Data used for training machine learning models is a sensitive resource that can be exploited as an attack vector if organizations fail to protect the quality of model inputs against adversarial inputs or poisoning attacks. Data can be protected via smart sampling methodologies that prevent training data from being skewed. AI solutions that rely on anonymous data sharing can also strengthen the quality of inputs and avoid the risks of a poor-quality sample.
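One simple guardrail for training-data intake can be sketched as follows. The function names and tolerance are illustrative assumptions, not a standard API: before new samples are folded into a training set, the batch's label balance is compared against a trusted baseline, and batches that deviate sharply are rejected, a crude defense against bulk poisoning of one class.

```python
# Hypothetical intake check: reject a new training batch whose label
# distribution deviates sharply from a trusted baseline, which can
# indicate an attempt to poison one class in bulk.

def label_fractions(labels):
    total = len(labels)
    return {label: labels.count(label) / total for label in set(labels)}

def batch_acceptable(baseline_labels, batch_labels, tolerance=0.15):
    base = label_fractions(baseline_labels)
    batch = label_fractions(batch_labels)
    # Reject unseen labels, or any class whose share shifted past tolerance.
    return all(
        label in base and abs(batch[label] - base[label]) <= tolerance
        for label in batch
    )

baseline = ["normal"] * 90 + ["malicious"] * 10
ok_batch = ["normal"] * 17 + ["malicious"] * 3    # 15% malicious — plausible
bad_batch = ["normal"] * 8 + ["malicious"] * 12   # 60% malicious — suspicious

print(batch_acceptable(baseline, ok_batch))   # True
print(batch_acceptable(baseline, bad_batch))  # False
```

A production system would use richer statistical tests than a flat tolerance, but the principle is the same: treat training data as untrusted input and validate it before it reaches the model.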
2. Resources
Cybersecurity AI requires effective algorithms, models and platforms for the rapid detection of threats. Human intervention is needed to validate machine learning models and understand changes in AI behavior. Methods for testing algorithm quality include dark launches, in which a candidate model runs silently alongside the production model on the same live data, and A/B testing. A technique called backtesting validates a model against historical datasets with known outcomes to measure the accuracy of its detections.
In addition, before an AI solution is launched, it’s wise to test the solution against a golden dataset that contains representations of normal events and sophisticated anomalies to validate the accuracy of the model before it’s brought into production.
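That pre-launch gate can be sketched in a few lines. This is a minimal illustration with hypothetical names and a trivial stand-in model; the point is the pattern: a model must hit a minimum accuracy on a curated golden dataset of normal events and known anomalies before it is promoted, and the same helper can backtest against historical data.

```python
# Sketch of a pre-launch validation gate (names are illustrative): a model
# must hit a minimum accuracy on a curated "golden dataset" of labeled
# normal events and known anomalies before promotion to production.

def accuracy(model, dataset):
    """dataset: list of (event, expected_label) pairs."""
    correct = sum(1 for event, label in dataset if model(event) == label)
    return correct / len(dataset)

def ready_for_production(model, golden_dataset, min_accuracy=0.95):
    return accuracy(model, golden_dataset) >= min_accuracy

# Trivial stand-in model: flag events with a high failed-login count.
model = lambda event: "anomaly" if event["failed_logins"] > 5 else "normal"

golden = [
    ({"failed_logins": 0}, "normal"),
    ({"failed_logins": 2}, "normal"),
    ({"failed_logins": 8}, "anomaly"),
    ({"failed_logins": 40}, "anomaly"),
]

print(ready_for_production(model, golden, min_accuracy=1.0))  # True
```

Running the same check on a historical dataset with known outcomes is, in effect, the backtesting described above.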
3. Human Expertise
AI is not intended to replace or automate human intelligence. In the cognitive SOC, humans and AI work side by side to identify real-time threats. Cybersecurity AI specialists and data scientists are critically needed to ensure the safe and effective use of machine learning in the enterprise.
It’s risky to wait until adversarial machine learning reaches maturity to bolster your internal AI talent. There’s a critical shortage of qualified experts, and the void may be even deeper than the general cybersecurity talent gap. According to Security Now, global estimates of the number of AI experts range from just 22,000 to 300,000. As a result, many organizations are beginning to hire physicists, astronomers and people in other disciplines with a focus on mathematics, hoping those abilities will translate to AI roles. In addition to nontraditional talent pipelines, third-party expertise could be key to mitigating the risks of AI adoption.
What’s Next for AI and Cybersecurity?
Blockchain dominated the discussion in 2018, and security AI is the most-hyped technology of 2019. When any emergent technology is surrounded by buzz, it's wise for security leaders to look past claims that a solution is a silver bullet. AI for security hasn't reached full maturity, but it's closing in and approaching prime time at many organizations. When machine learning and cognitive intelligence are safely deployed to enhance human expertise, organizations can achieve automation, efficiency and the potential to detect sophisticated threats such as adversarial AI.
A war between offensive and defensive AI is likely inevitable, but organizations shouldn’t wait until adversarial machine learning reaches maturity to address these risks. Using techniques to protect your models and algorithms against future AI threats can improve the value of machine learning models in the present. It’s time for organizations to build a cognitive SOC with the strength to defend against intelligent threats by partnering with AI expertise, strengthening data inputs, and creating processes to test and validate machine learning models.