AWS Identity and Access Management (IAM) roles are a significant component of the way customers operate in Amazon Web Services (AWS). In this post, I’ll dive into the details of how cloud security architects and account administrators can protect IAM roles from misuse by using trust policies. By the end of this post, you’ll know how to build trust policies for IAM roles that work at scale, providing guardrails to control access to resources in your organization.
In general, there are four different scenarios where you might use IAM roles in AWS:
- One AWS service accesses another AWS service – When an AWS service needs access to other AWS services or functions, you can create a role that will grant that access.
- One AWS account accesses another AWS account – This use case is commonly referred to as a cross-account role pattern. This allows human or machine IAM principals from other AWS accounts to assume this role and act on resources in this account.
- A third-party web identity needs access – This use case allows users with identities in third-party systems like Google and Facebook, or Amazon Cognito, to use a role to access resources in the account.
- Authentication using SAML 2.0 federation – This is commonly used by enterprises with Active Directory that want to connect using an IAM role so that their users can use single sign-on workflows to access AWS accounts.
In all cases, the makeup of an IAM role is the same as that of an IAM user and is only differentiated by the following qualities:
- An IAM role does not have long-term credentials associated with it; rather, a principal (an IAM user, machine, or other authenticated identity) assumes the IAM role and inherits the permissions assigned to that role.
- The tokens issued when a principal assumes an IAM role are temporary. Their expiration reduces the risks associated with credentials leaking and being reused.
- An IAM role has a trust policy that defines which conditions must be met to allow other principals to assume it. This trust policy reduces the risks associated with privilege escalation.
Recommendation: You should make extensive use of IAM roles, which provide temporary credentials, rather than long-term credentials such as those of IAM users. For more information, see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html
While the list of users having access to your AWS accounts can change over time, the roles used to manage your AWS account probably won’t. The use of IAM roles essentially decouples your enterprise identity system (SAML 2.0) from your permission system (AWS IAM policies), simplifying management of each.
Managing access to IAM roles
Let’s dive into how you can create relationships between your enterprise identity system and your permissions system by looking at the policy types you can apply to an IAM role.
An IAM role has three places where it uses policies:
- Permission policies (inline and attached) – These policies define the actions that a principal assuming the role is allowed (or denied) to perform, and on which resources.
- Permissions boundary – A permissions boundary is an advanced feature for using a managed policy to set the maximum permissions that an identity-based policy can grant to an IAM entity. An entity’s permissions boundary allows it to perform only the actions that are allowed by both its identity-based permission policies and its permissions boundaries.
- Trust relationship – This policy defines which principals can assume the role, and under which conditions. This is sometimes referred to as a resource-based policy for the IAM role. We’ll refer to this policy simply as the ‘trust policy’.
A role can be assumed by a human user or a machine principal, such as an Amazon Elastic Compute Cloud (Amazon EC2) instance or an AWS Lambda function. Over the rest of this post, you’ll see how you can narrow the conditions under which principals can use roles by configuring their trust policies.
An example of a simple trust policy
A common use case is when you need to provide security audit access to your account, allowing a third party to review the configuration of that account. After attaching the relevant permission policies to an IAM role, you need to add a cross-account trust policy to allow the third-party auditor to make the sts:AssumeRole API call to elevate their access in the audited account. The following trust policy shows an example policy created through the AWS Management Console:
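The policy is not reproduced in this text, so here is a minimal sketch of what such a cross-account trust policy might look like, built as a Python dictionary and printed as JSON. The account ID 111122223333 is the auditor account used throughout this post; the empty Condition block is an assumption about what the console generates.

```python
import json

# Auditor account ID as used in the text.
AUDITOR_ACCOUNT_ID = "111122223333"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # ":root" here means any authenticated and authorized principal
            # in the account, not the account's root user.
            "Principal": {"AWS": f"arn:aws:iam::{AUDITOR_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
            # Assumption: the console emits an empty Condition block.
            "Condition": {},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```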
As you can see, it has the same structure as other IAM policies with Effect, Action, and Condition components. It also has the Principal parameter, but no Resource attribute. This is because the resource, in the context of the trust policy, is the IAM role itself. For the same reason, the Action parameter will only ever be set to one of the following values: sts:AssumeRole, sts:AssumeRoleWithSAML, or sts:AssumeRoleWithWebIdentity.
Note: The suffix root in the policy’s Principal attribute equates to “authenticated and authorized principals in the account,” not the special and all-powerful root user principal that is created when an AWS account is created.
Using the Principal attribute to reduce scope
In a trust policy, the Principal attribute indicates which other principals can assume the IAM role. In the example above, 111122223333 represents the AWS account number for the auditor’s AWS account. In effect, this allows any principal in the 111122223333 AWS account with sts:AssumeRole permissions to assume this role.
To restrict access to a specific IAM user account, you can define the trust policy like the following example, which would allow only the IAM user LiJuan in the 111122223333 account to assume this role. LiJuan would also need to have sts:AssumeRole permissions attached to their IAM user for this to work:
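As a sketch of that narrower policy, the only change from the account-wide version is the Principal value, which now names the IAM user's ARN. The user name LiJuan and the account ID come from the text; the rest of the structure is assumed to mirror the simple policy described earlier.

```python
import json

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # A single IAM user instead of the whole account.
            "Principal": {"AWS": "arn:aws:iam::111122223333:user/LiJuan"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```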
The principals set in the Principal attribute can be any principal defined by the IAM documentation, and can refer to an AWS or a federated principal. You cannot use a wildcard (“*” or “?”) within a Principal for a trust policy, with one exception that I’ll come back to in a moment. You must define precisely which principal you are referring to, because IAM translates each principal in a submitted trust policy to that principal’s hidden principal ID, and it can’t do that if there are wildcards in the principal.
The only scenario where you can use a wildcard in the Principal parameter is where the parameter value is only the “*” wildcard. Use of the global wildcard “*” for the Principal isn’t recommended unless you have clearly defined Condition attributes in the policy statement to restrict use of the IAM role. Without Condition attributes, the wildcard permits assumption of the role by any principal in any AWS account.
Using identity federation on AWS
Federated users from SAML 2.0 compliant enterprise identity services are given permissions to access AWS accounts through the use of IAM roles. While the user-to-role configuration of this connection is established within the SAML 2.0 identity provider, you should also put controls in the trust policy in IAM to reduce the risk of abuse.
Because the Principal attribute contains configuration information about the SAML mapping, in the case of Active Directory, you need to use the Condition attribute in the trust policy to restrict use of the role from the AWS account management perspective. You can do this by restricting the source IP address with the aws:SourceIp key, as demonstrated later, or by using one or more of the SAML-specific Condition keys available. My recommendation is to be as specific as is practical in reducing the set of principals that can use the role. This is best achieved by adding qualifiers to the Condition attribute of your trust policy.
There’s a very good guide on creating roles for SAML 2.0 federation that contains a basic example trust policy you can use.
Using the Condition attribute in a trust policy to reduce scope
The Condition statement in your trust policy sets additional requirements for the Principal trying to assume the role. If you don’t set a Condition attribute, the IAM engine will rely solely on the Principal attribute of this policy to authorize role assumption. Given that it isn’t possible to use wildcards within the Principal attribute, the Condition attribute is a really flexible way to reduce the set of users that are able to assume the role without necessarily specifying the principals.
Limiting role use based on an identifier
Occasionally teams managing multiple roles can become confused as to which role achieves what and can inadvertently assume the wrong role. This is referred to as the Confused Deputy problem. This next section shows you a way to quickly reduce this risk.
The following trust policy requires that principals from the 111122223333 AWS account provide a special phrase when making their request to assume the role. Adding this condition reduces the risk that someone from the 111122223333 account will assume this role by mistake. This phrase is configured by specifying the sts:ExternalId condition context key.
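A minimal sketch of such a policy follows. The phrase ExampleSpecialPhrase is the value discussed in this section; the use of the StringEquals operator is my assumption about how the condition is expressed.

```python
import json

# The phrase is an identifier, not a secret or a password.
EXTERNAL_ID = "ExampleSpecialPhrase"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            # The caller must pass the same value in its AssumeRole request.
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

A caller would then supply the value in the request, for example with the --external-id option of aws sts assume-role.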
In the example trust policy above, the value ExampleSpecialPhrase isn’t a secret or a password. Adding the ExternalId condition prevents this role from being assumed using the console, because the only way to add the ExternalId argument to the role assumption API call is to use the AWS Command Line Interface (AWS CLI) or a programming interface. Having this condition doesn’t prevent a user who knows about this relationship and the ExternalId from assuming what might be a privileged set of permissions, but it does help manage risks like the Confused Deputy problem. I see customers using an ExternalId that matches the name of the AWS account, which helps ensure that an operator is working on the account they believe they’re working on.
Limiting role use based on multi-factor authentication
By using the Condition attribute, you can also require that the principal assuming this role has passed a multi-factor authentication (MFA) check before they’re permitted to use this role. This again limits the risk associated with mistaken use of the role and adds some assurances about the principal’s identity.
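As a sketch of an MFA-gated trust policy, the example below applies the BoolIfExists operator to the aws:MultiFactorAuthPresent key. Scoping the Principal to the 111122223333 account is an assumption carried over from the earlier examples.

```python
import json

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            # BoolIfExists: if the key is present in the request context it
            # must be "true"; if the key is absent the condition still passes.
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": "true"}
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```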
In the example trust policy above, I also introduced the MultiFactorAuthPresent conditional context key. Per the AWS global condition context keys documentation, the MultiFactorAuthPresent conditional context key does not apply to sts:AssumeRole requests in the following contexts:
- When using access keys in the CLI or with the API
- When using temporary credentials without MFA
- When a user signs in to the AWS Console
- When services (like AWS CloudFormation or Amazon Athena) reuse session credentials to call other APIs
- When authentication has taken place via federation
In the example above, applying the BoolIfExists qualifier to the MultiFactorAuthPresent conditional context key means that the condition evaluates as true if:
- The principal type can have an MFA attached, and does.
- The principal type cannot have an MFA attached.
This is a subtle difference but makes the use of this conditional key in trust policies much more flexible across all principal types.
Limiting role use based on time
During activities like security audits, it’s quite common for the activity to be time-bound and temporary. There’s a risk that the IAM role could be assumed even after the audit activity concludes, which might be undesirable. You can manage this risk by adding a time condition to the Condition attribute of the trust policy. This means that rather than being concerned with disabling the IAM role created immediately following the activity, customers can build the date restriction into the trust policy. You can do this by using policy attribute statements, like so:
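A minimal sketch of a time-bound trust policy is shown below. The window (noon on September 1, 2020 to noon on September 7, 2020, in UTC) is an assumption borrowed from the combined example later in this post; the date operators and the aws:CurrentTime key are standard IAM policy elements.

```python
import json

# Assumed audit window, expressed as ISO 8601 timestamps in UTC.
AUDIT_START = "2020-09-01T12:00:00Z"
AUDIT_END = "2020-09-07T12:00:00Z"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            # Both operators must match, so the role can only be assumed
            # between the start and the end of the window.
            "Condition": {
                "DateGreaterThan": {"aws:CurrentTime": AUDIT_START},
                "DateLessThan": {"aws:CurrentTime": AUDIT_END},
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```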
Limiting role use based on IP addresses or CIDR ranges
If the auditor for a security audit is using a known fixed IP address, you can build that information into the trust policy, further reducing the opportunity for the role to be assumed by unauthorized actors calling the AssumeRole API from another IP address or CIDR range:
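The sketch below restricts role assumption with the IpAddress operator and the aws:SourceIp key. The 203.0.113.0/24 range is an assumed documentation range, and the in_allowed_range helper is a hypothetical local check that only illustrates which addresses fall inside the CIDR block; it is not how IAM evaluates the policy.

```python
import ipaddress
import json

# Assumed CIDR range for the auditor's network.
AUDITOR_CIDR = "203.0.113.0/24"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"IpAddress": {"aws:SourceIp": AUDITOR_CIDR}},
        }
    ],
}


def in_allowed_range(source_ip: str) -> bool:
    """Hypothetical helper: report whether an address is inside AUDITOR_CIDR."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(AUDITOR_CIDR)


print(json.dumps(trust_policy, indent=2))
print(in_allowed_range("203.0.113.24"))
print(in_allowed_range("198.51.100.7"))
```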
Limiting role use based on tags
IAM tagging capabilities can also help you build flexible and adaptive trust policies, creating an attribute-based access control (ABAC) model for IAM management. You can build trust policies that only permit principals that have already been tagged with a specific key and value to assume a specific role. The following example requires that IAM principals in the AWS account 111122223333 be tagged with department = OperationsTeam for them to assume the IAM role.
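As a sketch of that tag-based policy, the condition key is formed by appending the tag key to aws:PrincipalTag/. The tag key department and value OperationsTeam come from the text; the StringEquals operator is my assumption.

```python
import json

TAG_KEY = "department"
TAG_VALUE = "OperationsTeam"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            # Only principals carrying the tag department=OperationsTeam match.
            "Condition": {
                "StringEquals": {f"aws:PrincipalTag/{TAG_KEY}": TAG_VALUE}
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```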
If you want to create this effect, I highly recommend the use of the PrincipalTag pattern above, but you must also be cautious about which principals are given iam:TagUser, iam:TagRole, iam:UntagUser, and iam:UntagRole permissions. You might even use the aws:PrincipalTag condition within the permissions boundary policy to restrict their ability to retag their own IAM principal or that of another IAM role they can assume.
Limiting or extending access to a role based on AWS Organizations
Since AWS Organizations was announced in 2016, almost every enterprise customer I work with has adopted it. This AWS service allows customers to create an organizational structure for their accounts by creating hard boundaries to manage blast-radius risks, among other advantages. You can use the aws:PrincipalOrgID condition key to limit assumption of an organization-wide core IAM role.
Caution: As you’ll see in the example below, you need to set the Principal attribute to “*” to do this, which would, without the conditional restriction, allow all role assumption requests to be accepted for this role, irrespective of the source of that assumption request. For that reason, be especially careful about the use of this pattern.
It isn’t practical to write out all the AWS account identifiers into a trust policy, and because of the way policies like this are evaluated, you can’t include wildcard characters for the account number in the principal’s account number field. The use of the PrincipalOrgID global condition context key provides us with a neat and dynamic mechanism to create a short policy statement.
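A minimal sketch of the organization-wide pattern follows. The organization ID o-abcd12efg1 is a placeholder that appears later in this post; note that the wildcard Principal is only safe because the Condition narrows it.

```python
import json

# Placeholder organization ID.
ORG_ID = "o-abcd12efg1"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Global wildcard: without the Condition below, any principal in
            # any AWS account could assume the role.
            "Principal": {"AWS": "*"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": ORG_ID}},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```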
There are instances where a third party might themselves be using IAM roles, or where an AWS service resource that has already assumed a role needs to assume another role (perhaps in another account), and customers might need to allow only specific IAM roles in that remote account to assume the IAM role you create in your account. You can use role chaining to build permitted role escalation routes using role assumption from within the same account or AWS organization, or from third-party AWS accounts.
Consider the following trust policy example where I use a combination of the Principal attribute to scope down to an AWS account, and the aws:UserId global conditional context key to scope down to a specific role using its RoleId. To capture the RoleId for the role you want to be able to assume, you can run the following command using the AWS CLI:
Here is the example trust policy that limits to only the CrossAccountAuditor role from AWS Account 111122223333.
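Neither the command nor the policy is reproduced in this text. One AWS CLI command that returns a role's RoleId is aws iam get-role --role-name CrossAccountAuditor. The sketch below then uses that identifier in a StringLike condition on the aws:userid key, which for an assumed-role session has the form RoleId:session-name. The value AROAEXAMPLEID is a made-up placeholder, not a real role ID.

```python
import json

# Hypothetical RoleId placeholder for the CrossAccountAuditor role.
AUDITOR_ROLE_ID = "AROAEXAMPLEID"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            # Match any session of the role; the part after the colon is the
            # role session name. Condition key names are not case-sensitive.
            "Condition": {
                "StringLike": {"aws:userid": f"{AUDITOR_ROLE_ID}:*"}
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```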
If you’re using an IAM user and have assumed the CrossAccountAuditor IAM role, the policy above will work through the AWS CLI with a call to aws sts assume-role and through the console.
This type of trust policy also works for services like Amazon EC2, allowing those instances using their assigned instance profile role to assume a role in another account to perform actions. We’ll touch on this use case later in the post.
Putting it all together
AWS customers can use combinations of all the above Principal and Condition attributes to hone the trust they’re extending out to any third party, or even within their own organization. They might create an accumulated trust policy for an IAM role which achieves the following effect:
Allows only a user named PauloSantos, in AWS account number 111122223333, to assume the role if they have also authenticated with an MFA, are logging in from an IP address in the 203.0.113.0/24 CIDR range (203.0.113.0 to 203.0.113.255), and the date is between noon of September 1, 2020, and noon of September 7, 2020.
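The accumulated policy can be sketched as follows. I am reading the IP range as 203.0.113.0/24 and the times as UTC, both of which are assumptions; all condition operators in a statement must match for the statement to apply.

```python
import json

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:user/PauloSantos"},
            "Action": "sts:AssumeRole",
            # Every operator below must evaluate to true (logical AND).
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": "true"},
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
                "DateGreaterThan": {"aws:CurrentTime": "2020-09-01T12:00:00Z"},
                "DateLessThan": {"aws:CurrentTime": "2020-09-07T12:00:00Z"},
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```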
I’ve seen customers use this to create IAM users who have no permissions attached besides sts:AssumeRole. Trust relationships are then configured between the IAM users and the IAM roles, creating ultimate flexibility in defining who has access to what roles without needing to update the IAM user identity pool at all.
A word on Effect: Deny and NotPrincipal in IAM role trust policies
I have seen some customers make use of an “Effect”: “Deny” clause in their trust policies. This pattern can help manage a wildcard statement in another “Effect”: “Allow” clause of the same trust policy. However, this isn’t the best approach for most scenarios. You will typically be able to define each principal in your policy as being allowed access. An example of where this might not be true is where you have a clause that uses the global wildcard “*” as a principal, in which case it will be necessary to add Deny statements to further filter the access.
Putting a wildcard into the Principal attribute of an Allow policy statement, particularly in relation to trust policies, can be dangerous if you haven’t done a robust job of managing the Condition attribute in the same statement. Be as specific as possible in your Allow statement, and use Principal attributes first, rather than relying on Deny statements to manage potential security gaps created by your use of wildcards.
The following trust policy allows all IAM principals within the o-abcd12efg1 organization to assume the IAM role, but only if it’s before September 7, 2020:
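As a sketch of that two-statement policy, the Allow statement admits the whole organization and the Deny statement cuts access off after the deadline. The exact cutoff time (noon UTC on September 7, 2020) is an assumption; the text gives only the date.

```python
import json

ORG_ID = "o-abcd12efg1"
# Assumed cutoff time on the date given in the text.
CUTOFF = "2020-09-07T12:00:00Z"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": ORG_ID}},
        },
        {
            # An explicit Deny overrides the Allow once the cutoff has passed.
            "Effect": "Deny",
            "Principal": {"AWS": "*"},
            "Action": "sts:AssumeRole",
            "Condition": {"DateGreaterThan": {"aws:CurrentTime": CUTOFF}},
        },
    ],
}

print(json.dumps(trust_policy, indent=2))
```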
The use of NotPrincipal in trust policies
You can also build a NotPrincipal element into your trust policies. Again, this is rarely the best choice, because it can introduce unnecessary complexity and confusion into your policies. Instead, you can avoid that problem by using fairly simple and prescriptive Principal statements.
Statements with NotPrincipal can also use a Deny effect, which can create quite baffling policy logic that, if misunderstood, could create unintended opportunities for misuse or abuse.
Here’s an example where you might think to use Deny and NotPrincipal in a trust policy. Notice, though, that this has the same effect as adding arn:aws:iam::123456789012:role/CoreAccess in a single Allow statement. In general, Deny with NotPrincipal statements in trust policies create unnecessary complexity, and should be avoided.
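The sketch below contrasts the two shapes. The first policy is my assumption of what the discouraged Deny plus NotPrincipal form might look like (a broad account-level Allow narrowed by a Deny on everyone except CoreAccess); the second is the single Allow statement that this post recommends instead.

```python
import json

CORE_ACCESS_ROLE = "arn:aws:iam::123456789012:role/CoreAccess"

# Discouraged shape (assumed): broad Allow, then Deny for everyone else.
complex_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "sts:AssumeRole",
        },
        {
            "Effect": "Deny",
            "NotPrincipal": {"AWS": CORE_ACCESS_ROLE},
            "Action": "sts:AssumeRole",
        },
    ],
}

# Recommended shape: name the one permitted principal and rely on the
# default deny for everything else.
simple_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": CORE_ACCESS_ROLE},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(complex_policy, indent=2))
print(json.dumps(simple_policy, indent=2))
```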
Remember, your Principal attribute should be very specific, to reduce the set of those able to assume the role, and an IAM role trust policy won’t permit access if a corresponding Allow statement isn’t explicitly present in the trust policy. It’s better to rely on the default deny policy evaluation logic where you’re able, rather than introducing unnecessary complexity into your policy logic.
Creating trust policies for AWS services that assume roles
There are two types of contexts where AWS services need access to IAM roles to function:
- Resources managed by an AWS service (like Amazon EC2 or Lambda, for example) need access to an IAM role to execute functions on other AWS resources, and need permissions to do so.
- An AWS service that abstracts its functionality from other AWS services, like Amazon Elastic Container Service (Amazon ECS) or Amazon Lex, needs access to execute functions on AWS resources. These are called service-linked roles and are a special case that’s out of the scope of this post.
In both contexts, you have the service itself as an actor. The service is assuming your IAM role so it can provide your credentials to your Lambda function (the first context) or use those credentials to do things (the second context). In the same way that IAM roles are used by human operators to provide an escalation mechanism for users operating with specific functions in the examples above, so, too, do AWS resources, such as Lambda functions, Amazon EC2 instances, and even AWS CloudFormation, require the same mechanism. You can find more information about how to create IAM roles for AWS services in the IAM User Guide.
An IAM role for a human operator and an IAM role for an AWS service are exactly the same, except that they have a different principal defined in the trust policy. For a service role, the policy’s Principal defines the AWS service that is permitted to assume the role for its function.
Here’s an example trust policy for a role designed for an Amazon EC2 instance to assume. You can see that the principal provided is the ec2.amazonaws.com service:
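A minimal sketch of such a service trust policy is shown below. The only difference from the earlier examples is that the Principal uses the Service key with the service's principal name rather than an AWS account or user ARN.

```python
import json

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # The EC2 service, not an account or user, assumes the role.
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```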
Every configuration of an AWS resource should be passed a specific role unique to its function. So, if you have two Amazon EC2 launch configurations, you should design two separate IAM roles, even if the permissions they require are currently the same. This allows each configuration to grow or shrink the permissions it requires over time, without needing to reattach IAM roles to configurations, which might create a privilege escalation risk. Instead, you update the permissions attached to each IAM role independently, knowing that it will only be used by that one service resource. This helps reduce the potential impact of risks. Automating your management of roles will help here, too.
Several customers have asked if it’s possible to design a trust policy for an IAM role such that it can only be passed to a specific Amazon EC2 instance. This isn’t directly possible. You cannot place the Amazon Resource Name (ARN) for an EC2 instance into the Principal of a trust policy, nor can you use tag-based condition statements in the trust policy to limit the ability for the role to be used by a specific resource.
The only option is to manage access to the iam:PassRole action within the permission policy for those IAM principals you expect to be attaching IAM roles to AWS resources. This special Action is evaluated when a principal tries to attach another IAM role to an AWS service or AWS resource.
You should restrict access to the iam:PassRole action by using permission policies and permissions boundaries. This limits who can attach roles to instance profiles for Amazon EC2, rather than relying on the trust policy of the role assumed by the EC2 instance to achieve this. This approach makes it much easier to manage scaling for both the principals attaching roles to EC2 instances and the instances themselves.
You could attach the following permission policy to a role so that it can pass IAM roles to Amazon EC2 instances only if the role name is prefixed with EC2-Webserver-:
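One way to sketch such a permission policy is an Allow on iam:PassRole scoped to role ARNs with the EC2-Webserver- prefix. The account ID 123456789012, the iam:PassedToService condition, and the may_pass helper are assumptions for illustration; the helper uses fnmatch only as a rough local stand-in for IAM's wildcard matching.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical account ID for the account that owns the roles.
ACCOUNT_ID = "123456789012"
ALLOWED_ROLE_PATTERN = f"arn:aws:iam::{ACCOUNT_ID}:role/EC2-Webserver-*"

permission_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            # Only roles whose names start with EC2-Webserver- can be passed.
            "Resource": ALLOWED_ROLE_PATTERN,
            # Assumption: additionally limit passing to the EC2 service.
            "Condition": {
                "StringEquals": {"iam:PassedToService": "ec2.amazonaws.com"}
            },
        }
    ],
}


def may_pass(role_arn: str) -> bool:
    """Hypothetical helper: approximate the Resource wildcard match locally."""
    return fnmatchcase(role_arn, ALLOWED_ROLE_PATTERN)


print(json.dumps(permission_policy, indent=2))
print(may_pass(f"arn:aws:iam::{ACCOUNT_ID}:role/EC2-Webserver-Frontend"))
print(may_pass(f"arn:aws:iam::{ACCOUNT_ID}:role/AdminAccess"))
```

Because no other statement allows iam:PassRole, any role outside the prefix falls through to the default deny.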
You now have all the tools you need to build robust and effective trust policies that work at scale, providing guardrails for your users and those who might want to access resources in your account from outside your organization.
Policy logic isn’t always simple, and I encourage you to use sandbox accounts to try out your ideas. In general, simplicity should win over cleverness. IAM policies and statements that might well be frugal in their use of policy language might also be difficult to read, interpret, and update by other IAM administrators in the future. Keeping your trust policies simple helps to build IAM relationships everyone understands and can manage, and use, effectively.