Write Better CloudFormation Templates
At Gerald we heavily use CloudFormation to deploy our cloud infrastructure. Our backend architecture is fairly complex with a lot of moving parts. From API servers to Big Data Engines and Data Science components, everything is deployed using cloudformation which integrates seamlessly with our CI/CD pipelines. The benefits of using cloudformation which is an Infrastructure as a Code service provided by AWS is huge.
A few things that I love about using CloudFormation.
-
Track your Infrastructure via version control since CFN templates are just
.yaml
or.json
files. -
Easy to replicate. Since CFN templates are just text files you can easily create replicas of your existing infrastructure to create new AWS environments.
-
Easily integrates with CI/CD tools and pipelines ā¤ļø.
I want to write a few good practices that me and my team follows while working with CloudFormation. These practices have helped us to write better cloudformation templates and deploy more often with extra confidence. Letās start š
Donāt provide a Resource Name
ā¦ If it is not required. A lot of CFN resources have a Name
or a similar attribute which is used to give a logical name to a resource. One major problem that comes if you provide a static value for that would be that CFN cannot replace that resource if needs to while updating.
Resources:
APIDockerRepository:
Type: AWS::ECR::Repository
Properties:
RepositoryName: 'api-repository'
SomeOtherAttribute: 'value'
Letās say to perform an update on this ECR Repository CFN needs to replace it and create a new one in its place. In the above case, CFN wonāt be able to do that since the RepositoryName
has a static value. Why? To understand that you need to know how CFN updates resources.
Whenever you make a change to a CFN Stack, CFN will first decide whether it can just update the existing resource or does it need to create a new one and replace the existing one with it.
In the case when it can just update the existing resource, it will do that and remember that action as well as the previous state of that resource. In case it needs to perform a Rollback it will undo the updates performed on that resource and will restore its previous state.
Now in the case when it cannot update the existing resource and it needs to create a new one. CFN will create a new resource and replace the existing one with the new one. It will remember the previous resource and its state just in case it needs to perform a Rollback. In the case of a Rollback, it will delete the newly created resource and bring back the one that was deleted (in most of the cases).
Now hereās the thing, to replace the ECR repository with a new one, CFN will try to create a new repository with the same attributes and one of the attributes is RepositoryName
which has a static value of api-repository. Now, if youāve worked with ECR, you might know that you cannot have two repositories with the same name. So when CFN tries to create a new ECR repository, it fails as ECR throws an error āRepository already existedā.
There are workarounds to this problem (delete first, create after) but a better solution (if it fits your requirements) is to omit the name attribute at all. By doing that, you leave the responsibility of naming your resources on CFN. CFN will give a pseudo-random name to your resource when it creates it.
Resources:
APIDockerRepository:
Type: AWS::ECR::Repository
Properties:
SomeOtherAttribute: 'value'
Now, the next time CFN needs to replace it, it can create a new resource with a new random name and replace it with the existing one. You donāt have to break a sweat when updating resources š
Comment Everything
I canāt emphasize how important it is to comment every resource, parameter, attribute that is not very obvious in the first look. A CFN file can be overwhelming to look at when youāre seeing it for the first couple of times. It will be an absolute disaster if there are no comments provided to help you understand the reasoning behind the values. Thatās why I prefer to use .yaml
files as you can add comments in a YAML file which is not the case with .json
templates where there is no way to add comments.
I also make sure to add references in the comments to the official docs as well as statements that describe what was the reason behind adding an attribute or its value. It has helped me more than anyone else especially when I go back to view a template that I created a while ago. I donāt have to re-think the reasoning as I can read it in the comments.
3. Donāt Hard-Code
Hard-Coding is bad. Especially when you are creating resources in the cloud. Your infrastructure should be flexible enough to change as per the architecture. Iāll list some typical things that Iāve seen are commonly hardcoded in CFN Templates.
- RAM/Compute/Disk Sizesā¦
- S3 Bucket Names
- CIDR/IP Ranges
- Instance Types
- AutoScaling/ReplicaSet Values
- ā¦etc
There will be a problem associated with each one of the above if you hard code their values. Letās go over some of them and see where the problem lies.
RAM/CPU/Disk Size
Letās assume that you are using 4GiB as the RAM value for production. Now you need to deploy the same Stack for testing. Of course, you will not use 4GiB for the testing environment, it will be something like 1Gib. Will you create a duplicate stack for testing? or re-use the same stack with dynamic values for production
and testing
.
S3 Bucket Names
Letās assume in the current Stack you have an S3 bucket with name user.images
where you are storing all the user images. Now if you want to store user images in different buckets based on the userās region. You would need multiple buckets. Would you create duplicate stacks with different bucket names and deploy them in multiple regions? or you would want to re-use the existing stack more intelligently.
But how to do that? CFN template language has multiple constructs to make our templates dynamic and thus more flexible.
Parameters
Parameters are user provided inputs when CFN is creating/updating a Stack. It comes with type checking, regex validations, pre-defined options etc.
Parameters:
Environment:
Description: An environment name
Type: String
AllowedValues:
- production
- stage
UserRegion:
Description: "A value for the company's region"
Type: String
AllowedValues:
- asia
- us
- eu
Resources:
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
ContainerDefinitions:
- Environment:
- Name: NODE_ENV
Value: production
- Name: NODE_CONFIG_ENV
# providing Environment value to docker container as an environment variable
Value: !Ref Environment
In this way, I can re-use the same template and deploy it for different environments and it will automatically use the right values. The value of the environment parameter will be provided by the person deploying the Stack.
Mappings
Mappings are key-value pairs that have different values based on different context.
Extending the previous example, letās say I want to use different instance types for different environments. Hereās how the CFN Mappings can help. I am also using different bucket names for different regions.
Mappings:
EnvBased:
stage:
mainApiEc2Type: t3.small
mainApiMem: 512
production:
mainApiEc2Type: m5a.xlarge
mainApiMem: 1024
RegionBased:
asia:
userImageBucket: asia.user.images
us:
userImageBucket: us.user.images
eu:
userImageBucket: eu.user.images
Resources:
EC2LaunchConfig:
Type: AWS::AutoScaling::LaunchConfiguration
DependsOn: PrivateRoute
Properties:
InstanceType: !FindInMap [EnvBased, !Ref Environment, mainApiEc2Type]
TaskDefinition:
Type: AWS::ECS::TaskDefinition
DependsOn: MainAPITaskRole
Properties:
ContainerDefinitions:
- Memory: !FindInMap [EnvBased, !Ref Environment, mainApiMem]
CodeArtifactsBucket:
Type: AWS::S3::Bucket
Properties:
AccessControl: Private
BucketName: !FindInMap [RegionBased, !Ref UserRegion, userImageBucket]
Conditions
Sometimes you also want to conditionally create resources. For example, you only want Cloudtrail enabled for the production account. Conditions help you define conditions which you can used to specify whether to create a resource or not.
Parameters:
Environment:
Description: An environment name
Type: String
AllowedValues:
- production
- stage
Conditions:
CreateProdResources: !Equals [!Ref Environment, production]
Resources:
CloudTrail:
Type: AWS::CloudTrail::Trail
Condition: CreateProdResources
Properties:
S3BucketName: 'cloud-trail'
This way if you deploy this template in stage
environment Cloudtrail will not be created.
Use Cross-Stack References
A lot of times you will have multiple stacks that powers your projectās infrastructure. In a lot of those scenarios, you might want to use a resource across multiple stacks. For example, you could have a Network stack that manages your VPCs and you will need to refer to VPCs defined in that stack when creating resources in other stacks. In such cases, it is not the right approach to copy-paste values from one stack and hardcode them in another stack. Instead, itās advisable to use cross-stack references using Export and Import.
The following stack exports values of 2 resources that it creates. The first one is the ID of a VPC exported by the name !Sub ${AWS::StackName}-VPCId
where {AWS::StackName}
is the unique name of that stack. Itās a good practice to prefix your exported
values with the Stack id/name to avoid name collisions. The second exported value is the Subnet Id of a public subnet.
# vpc.yaml
# Stack Name is "VPC-Stack"
Outputs:
VPCId:
Description: The ID of the VPC that this stack creates
Value: !Ref MainVPC
Export:
## If Stack name is "VPC-Stack" then this value is "VPC-Stack-VPCId"
Name: !Sub ${AWS::StackName}-VPCId
PublicSubnetOne:
Description: Public subnet one
Value: !Ref PublicSubnetOne
Export:
Name: !Sub ${AWS::StackName}-PublicSubnetOne
This is how you would use these values in an another stack.
# another-stack.yaml
Parameters:
VPCStack:
Description: Name of the VPC Stack
Type: String
Default: VPC-Stack
Resources:
SecGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Outbound Access To Anywhere
VpcId:
# This value after substitution becomes "VPC-Stack-VPCId"
Fn::ImportValue: !Sub '${VPCStack}-VPCId'
BatchCompute:
Type: 'AWS::Batch::ComputeEnvironment'
Properties:
Type: Managed
ComputeResources:
Subnets:
- Fn::ImportValue: !Sub '${VPCStack}-PublicSubnetOne'
Type: Ec2
To import a resource reference out of a stack you need to use the exact same name with which that resource was exported. One caveat with using this approach is that once an exported value is used (imported) in another stack it can not be changed.
Divide your Stacks into Units
Now Iāve learned this lesson from a real-life experience where I created a 2500+ lines of CFN template. Although it was properly organized but it still became a bottleneck for us to make changes to it.
In general, your Stack should be like a unit of similar resources that all achieve one primary goal. For instance, a typical web architecture includes a database, API server, frontend server.
You could divide it into the following stacks.
- AppName-VPC - that creates VPC,Subnets,VPC Endpoints, IG, NAT Gateways etc.
- AppName-API - that creates the API server resources like AutoScaling group, Launch Config, Security Groups, S3 Buckets, Load Balancers.
- AppName-Front - that creates the S3 bucket, Cloudfront distribution, Route53.
- AppName-DB - that creates the databases required for the application.
This way you would be able to make targeted changes to the infrastructure which will not affect all parts of it. You can use cross-reference stack values to reference/use resources from other stacks.
CloudFormation is AWS
Majority of the things that Iāve learned about different AWS services is through CloudFormation docs. A lot of [Important] things which you might miss while using the Console have to be explicitly specified when using CFN Templates which provides a holistic view of that AWS service. You should read about all the attributes in the docs when defining a resource. It will help you gain an in-depth understanding of how the service works under the hood.
Conclusion
Well, thatās about it. In this post, I went through some of the things I learned while working with CloudFormation. Two years ago, I started using CloudFormation and Iāve never looked back at the Console for creating things in AWS. From side projects to enterprise CloudFormation has always proven itself to be a great choice for managing Cloud Infrastructure on AWS. If youāve never tried it I encourage you to try it once. Start with small and remember Sky is the Limit! š