The modern public cloud landscape offers a world of possibilities. The major cloud services (Amazon Web Services [AWS], Azure and Google Cloud Platform [GCP]) provide robust solutions with an ever-growing list of services and features. Today 81 percent of organizations leverage multiple cloud service providers (CSPs).
The reasons for adopting a multi-cloud strategy vary: organizations may build disaster recovery (DR) sites on a different cloud platform or divide workloads according to the most suitable cloud services, and sometimes multi-cloud is simply the product of mergers and acquisitions. No matter how or why multi-cloud was introduced, securing multiple cloud platforms is challenging.
Cloud vendors are in a race to close the gaps in capabilities among themselves as well as to create product differentiation that will attract and retain customers. Their security services are evolving rapidly to offer powerful functionality across different security areas, but they are doing so in a variety of ways. Some services may look similar, but minor differences can lead to security issues and misconfigurations. Let’s explore some of the challenges that security organizations face with multi-cloud deployments.
Different vendors, different account models
The first challenge starts at the beginning: each cloud vendor provides a different account management model. Security organizations often need to map resources to the owners who hold the relationship with the cloud vendor. To do this, they must understand the correct permissions model they need to apply. When several different CSPs are used, this task becomes even more challenging.
GCP is based on projects. Any GCP resource must belong to a project. Projects are placed in folders, and multi-level folders are supported.
While the account concepts of the different vendors are related, subtleties that can impact security still exist. To apply the right security model, it is important to understand each platform's resource hierarchy.
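To make the GCP hierarchy concrete, here is a minimal sketch that models projects and multi-level folders and walks from a project up to its top-level folder. The `Node` class, the `ancestry` helper and the folder and project names are all hypothetical and exist only for illustration; they are not part of any GCP API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """A folder or project in a simplified, hypothetical resource hierarchy."""
    name: str
    kind: str  # "folder" or "project"
    parent: Optional["Node"] = None


def ancestry(node: Node) -> List[str]:
    """Return the names from the top-level folder down to the given node."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current.name)
        current = current.parent
    return list(reversed(path))


# Hypothetical layout: a project nested two folders deep.
engineering = Node("engineering", "folder")
payments = Node("payments", "folder", parent=engineering)
billing_prod = Node("billing-prod", "project", parent=payments)

print(" / ".join(ancestry(billing_prod)))
```

A mapping like this is one way to answer "who owns this resource?" consistently, because every resource resolves to exactly one project and one chain of folders.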
Controlling security groups on different platforms
IT engineers have decades of experience with private networks. But while in the physical data center (DC) they control everything, from the wires all the way up to the application, in the cloud it is Amazon, Microsoft and Google that control the physical layer and invent different services that run on top of the virtual network. Routing models used by cloud solutions differ from those used in the DC, and different cloud solutions use different models. The network firewall from the DC is embedded into the infrastructure as Security Groups (SGs), and there are some differences among the SGs.
The AWS SG includes inbound and outbound rules for traffic. These rules are permissive, so effectively they work as a whitelist. Users can attach multiple SGs to each Elastic Compute Cloud (EC2) instance (technically, to the Elastic Network Interface, or ENI), and the rules of each security group are effectively aggregated to create one set of rules. SGs can be applied to different entities, including instances or managed services such as load balancers.
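The aggregation behavior can be sketched in a few lines. This is a simplified model, not the AWS API: the `AllowRule` type, the `is_inbound_allowed` function and the example groups are hypothetical, and only inbound rules with a single port are considered.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, List


@dataclass(frozen=True)
class AllowRule:
    """An allow-only rule, as in the SG model described above."""
    protocol: str
    port: int
    source_cidr: str


def is_inbound_allowed(groups: Iterable[List[AllowRule]],
                       protocol: str, port: int, source_ip: str) -> bool:
    """Traffic is allowed if ANY rule in ANY attached group matches it."""
    addr = ip_address(source_ip)
    for group in groups:
        for rule in group:
            if (rule.protocol == protocol and rule.port == port
                    and addr in ip_network(rule.source_cidr)):
                return True
    return False  # nothing matched: implicit deny


# Hypothetical groups attached to the same instance.
web_sg = [AllowRule("tcp", 443, "0.0.0.0/0")]
admin_sg = [AllowRule("tcp", 22, "10.0.0.0/8")]

print(is_inbound_allowed([web_sg], "tcp", 22, "10.1.2.3"))
print(is_inbound_allowed([web_sg, admin_sg], "tcp", 22, "10.1.2.3"))
```

Because there are no deny rules in this model, attaching one more group can only widen access, never narrow it.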
Azure Network Security Groups (NSGs) and Google Cloud Platform (GCP) firewall rules provide an experience closer to classic firewalls, holding lists of allow and deny rules. The order of the rules is significant; rules with higher priority (in Azure, a lower priority number) control the decision to allow or deny traffic. Azure allows only a single NSG to a virtual machine (VM), and NSGs can also be applied to subnets or the network interfaces (NICs) attached to VMs. GCP firewall rules are based on tags, which allow attaching rules to assets such as a VM.
When building the network, it is critical to keep in mind the need to implement the right model. Setting the wrong rule priority in Azure can lead to traffic being accepted by mistake. Applying additional SGs to a VM in AWS can lead to accepting traffic that the original SG did not allow. Engineers who work primarily with AWS can set the priority of deny rules incorrectly on other platforms and block access to the service (or, in other cases, expose the service to the Internet when it should not be). Configuring security requires the right mindset, and IT engineers shifting back and forth among different deployments can easily make mistakes.
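By contrast, a priority-ordered model can be sketched as first match wins. The code below is a simplified illustration, not the Azure or GCP API: the `Rule` type, the `evaluate` function and the example rules are hypothetical. It assumes that a lower priority number is evaluated first, as in Azure NSGs, and that traffic matching no rule is denied.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List


@dataclass(frozen=True)
class Rule:
    priority: int      # assumption: lower number is evaluated first
    action: str        # "allow" or "deny"
    port: int
    source_cidr: str


def evaluate(rules: List[Rule], port: int, source_ip: str) -> str:
    """Return the action of the first matching rule in priority order."""
    addr = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port and addr in ip_network(rule.source_cidr):
            return rule.action
    return "deny"  # assumption: default deny when nothing matches


# Intended policy: SSH from the internal range only.
intended = [
    Rule(100, "allow", 22, "10.0.0.0/8"),
    Rule(200, "deny", 22, "0.0.0.0/0"),
]
# A broad allow slipped in ahead of the deny rule.
mistaken = intended + [Rule(150, "allow", 22, "0.0.0.0/0")]

print(evaluate(intended, 22, "203.0.113.5"))
print(evaluate(mistaken, 22, "203.0.113.5"))
```

The same three rules give opposite results depending only on the priority number, which is the kind of mistake that is easy to make when switching from an allow-only model.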
Virtual Networks in the cloud have different behavior
We can go one level deeper into the network. AWS virtual private cloud (VPC) subnets can be either private or public; a subnet is public if its route table has a route to an internet gateway (IGW). Only public subnets allow resources deployed in them to access the internet directly. Azure VNet does not distinguish between private and public subnets; resources connected to a VNet have access to the internet by default. An engineer who is used to AWS relies on private subnets to block instances from accessing the internet. When creating a DR site on Azure, however, the engineer needs to deny internet access explicitly. Context-switching issues like this can be hazardous.
Another complication in AWS networking is related to Network Access Control Lists (NACLs). NACLs operate at the subnet level by examining traffic entering and exiting the subnet, while SGs operate at the VM level (strictly speaking, at the Elastic Network Interface level). NACLs are stateless, meaning that even if ingress traffic is allowed, the response is not automatically allowed unless an outbound rule for the subnet explicitly allows it. The following diagram provided by AWS explains the flow of traffic across different security layers.
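In AWS, whether a subnet counts as public is usually determined by its route table: a route whose target is an internet gateway makes the subnet public. The check below is a minimal sketch of that idea. The `Route` type, the `is_public_subnet` function and the example IDs are hypothetical; the only convention it relies on is that internet gateway IDs start with `igw-`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Route:
    destination_cidr: str
    target: str  # e.g. an internet gateway or NAT gateway ID


def is_public_subnet(routes: List[Route]) -> bool:
    """A subnet is treated as public if any route targets an internet gateway."""
    return any(route.target.startswith("igw-") for route in routes)


# Hypothetical route tables for two subnets.
public_routes = [
    Route("10.0.0.0/16", "local"),
    Route("0.0.0.0/0", "igw-0abc1234"),
]
private_routes = [Route("10.0.0.0/16", "local")]

print(is_public_subnet(public_routes))
print(is_public_subnet(private_routes))
```

A check like this has no direct Azure equivalent, which is why outbound internet access there has to be restricted with explicit rules rather than by subnet placement.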
Azure and GCP do not use the concept of NACL, so engineers who migrate from those platforms to AWS often find themselves puzzled as to why traffic is blocked, even if it is clearly allowed by the SGs.
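The difference between stateful and stateless filtering is easy to see in a small model. This sketch is a simplification under stated assumptions: the `PortRule` type and both round-trip functions are hypothetical, rules match on port ranges only, and the client's return port is taken from a typical ephemeral range.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortRule:
    direction: str  # "inbound" or "outbound"
    port_from: int
    port_to: int


def permits(rules: List[PortRule], direction: str, port: int) -> bool:
    """True if any rule in the given direction covers the port."""
    return any(r.direction == direction and r.port_from <= port <= r.port_to
               for r in rules)


def stateful_round_trip(rules: List[PortRule], server_port: int) -> bool:
    """SG-like: once the request is allowed in, the response is allowed out."""
    return permits(rules, "inbound", server_port)


def stateless_round_trip(rules: List[PortRule],
                         server_port: int, client_port: int) -> bool:
    """NACL-like: request and response are checked independently."""
    return (permits(rules, "inbound", server_port)
            and permits(rules, "outbound", client_port))


inbound_only = [PortRule("inbound", 443, 443)]
with_return = inbound_only + [PortRule("outbound", 1024, 65535)]

print(stateful_round_trip(inbound_only, 443))
print(stateless_round_trip(inbound_only, 443, 50000))
print(stateless_round_trip(with_return, 443, 50000))
```

The second call is the puzzling case: the inbound rule looks correct, yet the connection fails because nothing lets the response leave the subnet.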
A few recommendations
The examples we have explored are only a taste of the real experiences and challenges that multi-cloud solutions bring. Becoming an expert with one cloud platform takes a lot of time, and working with multiple clouds at the same time is more difficult and can make teams prone to errors. To reduce risk, you can follow a few recommendations:
- Automate processes. Manual changes are prone to errors; automation can reduce the chance of making mistakes.
- Employ a cross-cloud system. A solution that provides a level of abstraction and a similar experience across different cloud platforms can eliminate context-switching issues.
- Adopt a change-review mechanism.
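As one possible starting point for automation and change review, proposed firewall changes can be checked by a script before they are applied. The sketch below assumes a hypothetical, cloud-neutral rule format (a dictionary with `cloud`, `action`, `port` and `source_cidr` keys) and a hypothetical list of sensitive ports; neither comes from any vendor's API.

```python
from typing import Dict, List

# Assumption: ports that should never be open to the whole internet.
SENSITIVE_PORTS = {22, 3389}
ANY_SOURCE = {"0.0.0.0/0", "::/0"}


def review_change(proposed_rules: List[Dict]) -> List[str]:
    """Return a finding for every allow rule that exposes a sensitive port."""
    findings = []
    for rule in proposed_rules:
        if (rule["action"] == "allow"
                and rule["source_cidr"] in ANY_SOURCE
                and rule["port"] in SENSITIVE_PORTS):
            findings.append(
                f"{rule['cloud']}: port {rule['port']} "
                f"open to {rule['source_cidr']}"
            )
    return findings


# Hypothetical change set spanning two clouds.
change_set = [
    {"cloud": "aws", "action": "allow", "port": 443, "source_cidr": "0.0.0.0/0"},
    {"cloud": "azure", "action": "allow", "port": 3389, "source_cidr": "0.0.0.0/0"},
    {"cloud": "gcp", "action": "deny", "port": 22, "source_cidr": "0.0.0.0/0"},
]

for finding in review_change(change_set):
    print(finding)
```

Running the same check against every platform gives reviewers one consistent view, regardless of which cloud the change targets.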
About the author: Zohar Alon is the Founder and CEO of Dome9 Security, and a veteran in networking security. He helped shape the early days of network security while at Check Point Software where he built Provider-1,