Introduction to DevOps
DevOps merges cultural philosophies, practices, and tools to enhance an organization’s ability to deliver applications and services at high velocity, outpacing organizations with traditional development and infrastructure management processes. This synergy of development and operations aims for a continuous delivery model emphasizing repeatability, reliability, stability, resilience, and security, alongside operational efficiency improvements.

The essence of DevOps culture lies in eliminating the barriers between development and operations teams, fostering an environment where both work in unison to amplify productivity and operational reliability. The movement’s core values are encapsulated in the mantra “People over Process over Tools”.
These values guide the practical benefits of DevOps principles, allowing for frequent code deployments and the creation of resilient, self-healing systems equipped with advanced monitoring and alerting capabilities.
DevOps practices enable organizations to deploy code multiple times a day, significantly reducing outages and downtime through the use of resilient systems. AWS services further enhance DevOps practices by providing tools that support continuous integration and delivery, infrastructure automation, and a consistent approach across projects.

Key components include:

Infrastructure as Code is a paradigm that manages and provisions infrastructure through code rather than manual processes, promoting reliability, reproducibility, and documentation. IaC tools range from ad hoc scripts for single-use tasks to configuration management tools like Chef, Puppet, Ansible, and SaltStack, which automate software installation on servers.
Below is an overview of the different types of tools used in IaC.
Ad hoc scripts are simple, often improvised commands or sets of commands that are used to perform a specific task on one or more servers. They are the most basic form of automation, providing a quick and easy way to get things done without the need for more complex tooling. However, they can become difficult to manage and scale as infrastructure grows and changes.
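For illustration, a minimal ad hoc script might look like the following sketch (the paths and the port value are hypothetical). Note that it is written to be idempotent — running it twice has the same effect as running it once — which is exactly the property configuration management tools generalize:

```shell
#!/usr/bin/env bash
# Ad hoc setup script (sketch): idempotently create an app directory
# and a config file. Paths and values are illustrative assumptions.
set -euo pipefail

APP_DIR="./demo-app"
CONFIG="$APP_DIR/app.conf"

mkdir -p "$APP_DIR"              # create the directory if it is missing
if [ ! -f "$CONFIG" ]; then      # write the config only on the first run
  echo "port=8080" > "$CONFIG"
fi
echo "configured: $CONFIG"
```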

Configuration management tools automate the process of controlling and tracking changes in the software, and ensuring that it is consistent and maintains its integrity over time. They can install and manage software on existing servers, enforce desired states, and automate routine tasks.
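As a sketch, a configuration management run in Ansible might look like this (the webservers host group and the nginx package are illustrative assumptions, not from the original text):

```yaml
# Ansible playbook sketch: enforce a desired state on existing servers
- hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present
    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Running the playbook repeatedly converges the servers to the same state, rather than re-executing a sequence of commands.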

Server templating tools are used to create images of server configurations, which can be rapidly deployed. This allows for the creation of consistent, repeatable server setups that can be quickly spun up or down as needed.
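A server template can be described, for example, in Packer's HCL syntax (the AMI IDs, region, names, and the installed package below are illustrative assumptions):

```hcl
# Packer sketch: bake a reusable machine image from a base AMI
source "amazon-ebs" "web" {
  region        = "eu-central-1"
  source_ami    = "ami-0abcdef1234567890"   # placeholder base AMI
  instance_type = "t2.micro"
  ssh_username  = "ec2-user"
  ami_name      = "web-template-{{timestamp}}"
}

build {
  sources = ["source.amazon-ebs.web"]

  provisioner "shell" {
    inline = ["sudo yum install -y nginx"]  # software baked into the image
  }
}
```

The resulting image can then be launched many times, giving identical servers without per-server installation steps.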

Server provisioning tools are responsible for the initial setup of servers. They can create servers, install operating systems, and then hand them off to configuration management tools for further setup.

Procedural languages are characterized by their focus on the sequence of operations to perform a task. They do not inherently capture the complete state of the infrastructure, making it difficult to understand the deployment’s current state without knowing the order in which scripts or templates were executed. This sequential nature also limits the reusability of procedural code, as adjustments must often be made based on the infrastructure’s existing state.
Procedural languages: Chef & Ansible
In contrast, declarative languages, as used in tools like CloudFormation and Terraform, allow for the description of the desired state of the infrastructure without specifying the sequence of steps to achieve it. This approach ensures that the code always accurately represents the infrastructure’s current state, enhancing clarity, reusability, and manageability.
Declarative languages: Terraform, CloudFormation, SaltStack, Puppet and OpenStack Heat

CloudFormation is a service provided by AWS that automates the provisioning and management of a wide range of AWS resources. It allows users to use programming languages or simple text files to model and provision, in an automated and secure manner, all the resources needed for their applications across all regions and accounts.
In advanced CloudFormation architecture, the focus is on designing scalable, resilient, and efficient infrastructure by leveraging the following concepts:

StackSets extend the functionality of CloudFormation stacks by enabling you to create, update, or delete stacks across multiple accounts and regions with a single operation. This is particularly useful for large-scale deployments where consistency and automation across accounts and regions are critical.
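A StackSet deployment might be driven from the AWS CLI roughly as follows (the stack set name, account IDs, regions, and template path are placeholders):

```shell
# Sketch: create a stack set, then roll it out to multiple accounts/regions
aws cloudformation create-stack-set \
  --stack-set-name my-stack-set \
  --template-body file://template.yml

aws cloudformation create-stack-instances \
  --stack-set-name my-stack-set \
  --accounts 111111111111 222222222222 \
  --regions eu-central-1 us-west-1
```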

Nested Stacks allow you to organize your CloudFormation templates into reusable, manageable components. A nested stack is a stack that you create within another stack by using the AWS::CloudFormation::Stack resource. This “modular” approach simplifies the management of larger systems by allowing you to build layers of abstraction.
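A minimal sketch of a nested stack resource (the template URL, parameter name, and CIDR are placeholders):

```yaml
# Parent template embedding a nested stack
Resources:
  NetworkStack:
    Type: "AWS::CloudFormation::Stack"
    Properties:
      TemplateURL: "https://s3.amazonaws.com/my-bucket/network.yml"
      Parameters:
        VpcCidr: "10.0.0.0/16"
```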

CloudFormation templates can be written in JSON or YAML format. Their main sections are Parameters, Mappings, Conditions, Resources, and Outputs; only the Resources section is required.
{
  "AWSTemplateFormatVersion": "2010-09-09", // Optional - Identifies the template format version
  "Description": "An example CloudFormation template.", // Optional
  "Metadata": {
    // Optional - Additional information about the template, for documentation and tagging
    "Template": "BasicExample"
  },
  "Parameters": {
    // Optional
    "InstanceType": {
      "Description": "EC2 instance type",
      "Type": "String",
      "Default": "t2.micro"
    },
    "EnvType": {
      "Description": "Environment type",
      "Type": "String",
      "Default": "test"
    }
  },
  "Mappings": {
    // Optional
    "RegionMap": {
      "us-west-1": { "AMI": "ami-0abcdef1234567890" },
      "eu-central-1": { "AMI": "ami-1234567890abcdef0" }
    }
  },
  "Conditions": {
    // Optional
    "CreateProdResources": {
      "Fn::Equals": [{ "Ref": "EnvType" }, "prod"]
    }
  },
  "Transform": {
    // Optional
    "Name": "AWS::Include",
    "Parameters": {
      "Location": "s3://my-bucket/my-transform-macro.yml"
    }
  },
  "Resources": {
    // Required - Main part of the template
    "MyEC2Instance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "InstanceType": { "Ref": "InstanceType" },
        "ImageId": {
          "Fn::FindInMap": ["RegionMap", { "Ref": "AWS::Region" }, "AMI"]
        }
      }
    }
  },
  "Outputs": {
    // Optional
    "InstanceId": {
      "Description": "The Instance ID",
      "Value": { "Ref": "MyEC2Instance" }
    }
  }
}
AWSTemplateFormatVersion: "2010-09-09" # Optional - Identifies the template format version
Description: An example CloudFormation template. # Optional
Metadata: # Optional - Additional information about the template, for documentation and tagging
  Template: BasicExample
Parameters: # Optional
  InstanceType:
    Description: EC2 instance type
    Type: String
    Default: t2.micro
  EnvType:
    Description: Environment type
    Type: String
    Default: test
Mappings: # Optional
  RegionMap:
    us-west-1:
      AMI: ami-0abcdef1234567890
    eu-central-1:
      AMI: ami-1234567890abcdef0
Conditions: # Optional
  CreateProdResources:
    Fn::Equals:
      - Ref: EnvType
      - prod
Transform: # Optional
  Name: "AWS::Include"
  Parameters:
    Location: "s3://my-bucket/my-transform-macro.yml"
Resources: # Required - Main part of the template
  MyEC2Instance:
    Type: "AWS::EC2::Instance"
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: !FindInMap [RegionMap, !Ref "AWS::Region", AMI]
Outputs: # Optional
  InstanceId:
    Description: The Instance ID
    Value: !Ref MyEC2Instance
The AWS Cloud Development Kit (CDK) is a development framework for defining AWS cloud infrastructure in familiar programming languages and provisioning it through CloudFormation.

Terraform is a powerful tool designed for building, changing, and versioning infrastructure safely and efficiently. As an open-source project initiated by HashiCorp in 2014, it has rapidly become a key player in the infrastructure as code (IaC) paradigm.
provider: Specifies a plugin that Terraform uses to interact with cloud providers, services, and other APIs. It defines the necessary information to connect to a service, like AWS or Google Cloud, such as credentials and region.
variable: Variables in Terraform are placeholders for values that can be set at runtime. They allow for customization of Terraform configurations without altering the code.
resource: A resource block defines a piece of infrastructure, like a virtual machine, network, or database. Terraform uses these definitions to create, manage, and update infrastructure components.
output: Output values are like return values for a Terraform module. They can be used to extract information about the infrastructure, such as IPs, hostnames, and IDs, which can be used elsewhere or displayed to the user.
module: Modules are containers for multiple resources that are used together. A module can be reused across different projects to create predefined sets of resources.
data: Data sources allow Terraform to use information defined outside of Terraform, or defined by another separate Terraform configuration.
terraform: A special block where you define Terraform settings, such as required Terraform version, backend configuration, etc.
locals: Locals are named values that you can use to simplify or avoid repetition in your Terraform code. Unlike variables, locals are not user input but are more like constants within a module.
resource "aws_instance" "web" {
  ami           = "ami-0375ca3842950ade6"
  instance_type = "t2.micro"
}

resource "dnsimple_record" "web" {
  domain = "hashicorp.com"
  name   = "web"
  ttl    = "3600"
  type   = "A"
  value  = aws_instance.web.public_ip
}
Architecture: Core <-> Plugins <-> Upstream APIs
All interactions with Terraform occur via the CLI.
Terraform is a local tool (runs on the current machine).
There is an ecosystem with different providers of cloud services and a module registry.
terraform init initializes a new working directory containing .tf config files.
terraform fmt rewrites configuration files into the canonical format; terraform validate reports syntax errors.


The terraform destroy command is used to destroy the Terraform-managed infrastructure. It asks for confirmation before destroying.
Options: see terraform destroy --help
Terraform only knows the configuration and state of your infrastructure; there is no built-in undo. Use version control to revert to an earlier version of the configuration (e.g. main.tf), then run terraform apply on it.
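A rollback might look roughly like this (the commit hash is a placeholder you would look up yourself):

```shell
# Sketch: revert the configuration in git, then re-apply it
git log --oneline -- main.tf        # find the commit to return to
git checkout <commit> -- main.tf    # restore the earlier configuration
terraform apply                     # converge infrastructure back to it
```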

Terraform stores the state of the infrastructure from the last time Terraform was run. The state is used to create plans and make changes to your infrastructure. It is critical that this state is maintained appropriately so future runs operate as expected. Note that Terraform state files can contain sensitive data, so it is recommended not to store the state in source control.
By default, Terraform stores its state locally in terraform.tfstate (unencrypted). For team collaboration, the state can be stored remotely, for example in Amazon S3 or Terraform Cloud, to ensure consistency. Remote state encryption is backend-specific!
Meta-Arguments: Control Terraform’s behavior, not directly linked to cloud resources.
count: Define the number of identical resources to create without loops.
depends_on: Explicitly set dependencies for resource creation order.
provider: Specify which provider to use for a resource, useful in multi-provider setups.
lifecycle: Manage resource lifecycle rules, like prevention of destruction.
Access: requires AWS account details. Interaction: the provider defines how Terraform interacts with the AWS API. Configuration:
provider "aws" {
  region     = "us-west-2"
  access_key = "anaccesskey"
  secret_key = "asecretkey"
}
File structure: define all providers in providers.tf.
Security: never hardcode access keys; use environment variables or config files.
Aliases: use aliases for handling multiple provider instances.
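For example, credentials can be supplied via environment variables instead of being written into the configuration (the values below are placeholders):

```shell
# Credentials picked up automatically by the AWS provider
export AWS_ACCESS_KEY_ID="anaccesskey"
export AWS_SECRET_ACCESS_KEY="asecretkey"
export AWS_DEFAULT_REGION="us-west-2"
```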
Example with an alias and an explicit provider setting on a resource:
provider "aws" {
  version = ">= 1.19.0"
  alias   = "providerAlias"
  region  = var.region
}

resource "aws_vpn_gateway" "vpn_gw" {
  provider = aws.providerAlias
  vpc_id   = "vpc_123456gw"
}
Types: the most common types are strings, numbers, lists, and maps. Other accepted types are booleans, sets, objects, and tuples. If omitted, the type is inferred from the default value. If both the type and the default value are missing, the variable is assumed to be a string.
# Variable declaration with string type
variable "image_id" {
  type = string
}

# Variable with a default list value
variable "availability_zone_names" {
  type    = list(string)
  default = ["us-west-1a"]
}

# Variable declaration for a map
variable "tags" {
  type = map(string)
}

# Usage of string interpolation
resource "aws_instance" "example" {
  ami           = var.image_id
  instance_type = "t2.micro"
  # Interpolate variable into a string
  tags = {
    Name = "Server-${var.image_id}"
  }
}

# Multiline string with heredoc syntax
resource "aws_security_group" "example" {
  name        = "security_group_name"
  description = <<EOF
This is a multiline description
that spans several lines
using heredoc syntax.
EOF
}

# Numeric values, including hex
resource "aws_ebs_volume" "example" {
  size = 10 # base 10 integer
  # Hexadecimal value for the number of IOPS
  iops = 0x100
}

# Boolean value
resource "aws_instance" "example_with_condition" {
  ami           = var.image_id
  instance_type = "t2.micro"
  monitoring    = true # Boolean value
}

# List value
resource "aws_autoscaling_group" "example" {
  availability_zones = var.availability_zone_names
  min_size           = 1
  max_size           = 5
}

# Map value
resource "aws_instance" "example_with_tags" {
  ami           = var.image_id
  instance_type = "t2.micro"
  # Map variable usage
  tags = var.tags
}

# Conditional expression
resource "aws_elb" "example" {
  name               = "foobar-terraform-elb"
  availability_zones = var.availability_zone_names
  # Conditional example - if environment is production, use the production
  # instances, else use the single development instance
  instances = var.environment == "production" ? aws_instance.production[*].id : [aws_instance.development.id]
}
In Terraform, locals are used to simplify and reuse expressions within a module. Think of it as a local variable within a function in Python that can only be addressed within the function.
Example:
locals {
  # Define a local value
  service_name = "my-service"
}

resource "aws_s3_bucket" "example" {
  # Use the local value
  bucket = "${local.service_name}-data"
}
The AWS provider facilitates interactions with the many resources supported by AWS. Resources are defined as follows:
resource "TYPE" "NAME" {
  CONFIG ...
  [for_each = FOR_EACH]
  [count = COUNT]
  [depends_on = [NAME, ...]]
  [provider = PROVIDER]
}
A basic resource configuration for an AWS instance might look like this:
resource "aws_instance" "example" {
  ami           = "ami-275f631"
  instance_type = "t2.micro"
}
for_each and count are used to create multiple instances of a resource:
for_each iterates over a map or set of values, creating one resource per item.
count creates a specified number of instances of a resource.
Examples:
# for_each
resource "aws_subnet" "public_subnet" {
  for_each = var.subnet_numbers
  # Additional configurations ...
}

# count
resource "aws_subnet" "public_subnet" {
  count = 4
  # Additional configurations ...
}
Lifecycle policies and timeouts can be configured to control resource behavior on changes:
lifecycle can be used to ignore certain changes or prevent resource destruction. timeouts define how long Terraform should wait for a resource to be created or deleted.
resource "aws_instance" "example" {
  # Configurations ...
  lifecycle {
    ignore_changes  = [ami]
    prevent_destroy = true
  }
  timeouts {
    create = "60m"
    delete = "2h"
  }
}
# Example of a module block
module "my_module" {
  source = "./modules/my_module"
  # Additional configurations ...
}
Provisioners in Terraform are used to execute scripts on a local or remote machine as part of resource creation or destruction.
resource "aws_instance" "example" {
  ami           = "ami-275f631"
  instance_type = "t2.micro"

  provisioner "local-exec" {
    # self refers to the resource the provisioner is attached to
    command = "echo ${self.private_ip} >> inventory.txt"
  }
}
Provisioners can also run on resource destruction with when = destroy. Alternatives to provisioners include user_data or AWS cloud-init; a remote-exec provisioner on a base AMI can run a few commands upon instance creation. Data sources in Terraform are used to fetch or compute data for use elsewhere in your Terraform configuration. They allow a Terraform configuration to build on information defined outside of Terraform, or defined by another separate Terraform configuration. For most AWS resources, there is an equivalent data source available for querying data.
Example of a Data Source configuration:
data "aws_ami" "web" {
  filter {
    name   = "state"
    values = ["available"]
  }
  filter {
    name   = "tag:Component"
    values = ["web"]
  }
  most_recent = true
}
Data from another configuration's state can be consumed via the terraform_remote_state data source, for example:
cluster_id = data.terraform_remote_state.base.outputs.ecs_cluster_id
Outputs in Terraform are used to expose important data from your Terraform configuration that you want to easily access or use in other configurations. This data is displayed when terraform apply is run and can be queried using the terraform output command.
Outputs are particularly useful for displaying computed values like IP addresses, DNS names, and resource IDs. They can be consumed by other Terraform configurations or modules.
Example of defining an output:
output "public_ip" {
value = aws_instance.web.public_ip
}
output "public_dns" {
value = aws_instance.web.public_dns
}
Example of querying an output:
> terraform output
public_dns = ec2-34-222-156-11.us-west-2.compute.amazonaws.com
public_ip = 34.222.156.11
Backends in Terraform are configuration elements that determine where and how the infrastructure state is stored, crucial for collaboration in teams and managing remote operations.
State Storage: Backends allow storing the state in a remote environment like AWS S3 instead of locally on the disk. This promotes collaboration as the team can access the same state.
Locking Mechanism: To prevent state corruption, some backends, such as Terraform Cloud or Enterprise, offer locking mechanisms that block concurrent state modifications.
Sensitive Information: By using backends like S3, sensitive information is not stored on the local disk, enhancing security.
Remote Operations: For large infrastructures or specific changes, terraform apply operations can take a long time. Backends enable these operations to be executed remotely, allowing you to turn off your computer in the meantime.
The terraform init command must be called whenever a new environment is set up or any change to the backend configuration is made, to initialize or update the backend.
A backend’s configuration is done directly in Terraform files within the terraform block.
Example of S3 backend configuration:
terraform {
backend "s3" {
bucket = "mybucket"
key = "path/to/my/key"
region = "us-east-1"
}
}
In this example, the S3 backend is configured to store the Terraform state in a specified S3 bucket. The path to the state key and the bucket’s region are specified. This configuration allows multiple users to manage the state consistently and carry out operations securely and efficiently.
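For team use, the same backend can be extended with locking and encryption. The following sketch assumes a pre-created DynamoDB table (the table name is illustrative; it needs a "LockID" partition key):

```hcl
terraform {
  backend "s3" {
    bucket         = "mybucket"
    key            = "path/to/my/key"
    region         = "us-east-1"
    encrypt        = true               # encrypt state objects at rest
    dynamodb_table = "terraform-locks"  # DynamoDB table used for state locking
  }
}
```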
Example main.tf:
provider "aws" {
  region = "eu-west-1"
}

resource "aws_vpc" "this" {
  cidr_block           = "10.10.0.0/16"
  enable_dns_hostnames = true
}

output "this_vpc_id" {
  value = aws_vpc.this.id
}
As the project grows (20+ resources and data sources), a single configuration becomes hard to review and every change requires a full terraform apply. Modules solve these issues by organizing Terraform configurations into folders.
Resource modules (e.g. terraform-aws-modules) wrap individual resources, such as a security group, into reusable units.
Infrastructure modules compose several resource modules into a complete piece of infrastructure.
Resource Module Example:
module "atlantis_alb_sg" {
  source  = "terraform-aws-modules/security-group/aws//modules/https-443"
  version = "v2.0.0"

  name        = "atlantis-alb"
  vpc_id      = "vpc-12345678"
  description = "Security group with HTTPS ports open for everybody (IPv4 CIDR)"

  ingress_cidr_blocks = ["0.0.0.0/0"]
}
Infrastructure Module Example:
module "atlantis" {
  source = "terraform-aws-modules/atlantis/aws"

  name = "atlantis"

  # VPC
  cidr            = "10.20.0.0/20"
  azs             = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
  public_subnets  = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"]

  # DNS
  route53_zone_name = "terraform-aws-modules.modules.tf"

  # Atlantis app
  atlantis_github_user       = "atlantis-bot"
  atlantis_github_user_token = "examplegithubtoken"
}
Categorize by function:
Utilize the Terraform Module Registry for discovering and using community modules.
.terraform directory stores module references, allowing immediate access to module changes. Use tree or ls -1 to view the .terraform directory contents for modules and plugins.
Mismanagement of resources in the cloud can lead to critical issues, such as drift between the real infrastructure and Terraform's state.
Terraform provides a series of commands to help manage and troubleshoot resources:
To see the current state of resources as known by Terraform:
terraform state list
To actively query the current state of the resources and detect any changes:
terraform plan
To apply the necessary changes to reach the desired state configuration:
terraform apply
If a resource exists in the cloud but not in Terraform’s state, it can be imported:
terraform import <ADDRESS> <ID>
For example, to import an AWS instance:
terraform import aws_instance.example i-abcd1234
The terraform refresh command updates the state file with the real-world infrastructure (in recent Terraform versions, prefer terraform apply -refresh-only):
terraform refresh
This is useful for ensuring that Terraform’s state matches the actual infrastructure and for detecting drift.
The terraform state list command will then list the updated resources known to the state file.
Terraform Workspaces are used to manage multiple states within the same Terraform configuration, allowing for parallel management of different environments such as development, staging, and production. Each workspace encapsulates a set of infrastructure with its state and variables, enabling changes to be applied without affecting other environments.
terraform workspace new <workspace_name>
terraform workspace select <workspace_name>
Local values and provider configuration can be adapted based on the workspace:
locals {
  environment = terraform.workspace == "default" ? "development" : terraform.workspace
  // Other local variables mapped per environment...
}

provider "aws" {
  region              = "us-west-1"
  allowed_account_ids = [local.allowed_account_ids]
  // Assume role if necessary...
}
Resources can be conditionally created based on the workspace:
resource "aws_instance" "example" {
  count = terraform.workspace == "prod" ? 1 : 0
  // Other configuration...
}
Automating workspace operations through a CI/CD pipeline is recommended for safety and efficiency:
# CI/CD pipeline example for Terraform
build:
  commands:
    - terraform init
    - terraform validate
    - terraform workspace select ${WORKSPACE_NAME} || terraform workspace new ${WORKSPACE_NAME}
    - terraform plan
    - terraform apply
for_each in Terraform:
variable "user_names" {
  description = "Create IAM users with these names"
  type        = list(string)
  default     = ["neo", "trinity", "morpheus"]
}

resource "aws_iam_user" "example" {
  for_each = toset(var.user_names)
  name     = each.value
}

output "all_arns" {
  value = values(aws_iam_user.example)[*].arn
}
This example creates IAM users for each name in the user_names list and outputs their ARNs after terraform apply.
resource "aws_autoscaling_group" "example" {
  # (...)
  dynamic "tag" {
    for_each = var.custom_tags
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }
}
The dynamic block with for_each loops over custom_tags and creates tags for the autoscaling group.
for_each with expressions:
variable "names" {
  description = "A list of names"
  type        = list(string)
  default     = ["neo", "trinity", "morpheus"]
}

output "upper_names" {
  value = [for name in var.names : upper(name)]
}

output "short_upper_names" {
  value = [for name in var.names : upper(name) if length(name) <= 5]
}
The first output transforms all names to uppercase, while the second output includes only names with 5 or fewer characters in uppercase.
Conditional logic in for_each:
dynamic "tag" {
  for_each = {
    for key, value in var.custom_tags :
    key => upper(value) if key != "Name"
  }
  content {
    key                 = tag.key
    value               = tag.value
    propagate_at_launch = true
  }
}
This dynamic block uses a for expression with a conditional to exclude certain tags.
Note that count and for_each cannot be combined within the same resource block. Since Terraform 0.13, for_each and count can also be used within module definitions. The for_each argument allows Terraform to create multiple instances of a resource or module. It loops over a given collection and creates one instance per item. Conditionals can be used to filter or modify the collection. The outputs can then collect the attributes of the created resources.
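As a sketch, for_each on a module block looks like this (the module path and bucket names are illustrative assumptions):

```hcl
# One module instance per element of the set (Terraform >= 0.13)
module "bucket" {
  source   = "./modules/s3-bucket"          # hypothetical local module
  for_each = toset(["logs", "assets"])

  name = "my-${each.key}"                   # e.g. my-logs, my-assets
}
```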