Practical patterns for Terraform modules at scale: versioning, composition, testing, and avoiding the monolith trap.

Terraform Modules Done Right: Lessons from Managing 50+ Services

After managing infrastructure for 50+ microservices with Terraform, we've learned which module patterns scale and which become nightmares. Here's what works.

The Monolith Trap

Our first approach was one massive Terraform repo with everything in it. A single terraform plan took 12 minutes. A typo in a dev variable once triggered a production change. So we split it up.
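
As a rough sketch of what the split can look like, each service gets its own root configuration and its own state key, so a plan for one service never evaluates another service's resources. The bucket name is the one used elsewhere in this post; the `services/payment-api/` key layout is a hypothetical example, not a prescribed convention.

```hcl
# services/payment-api/backend.tf (hypothetical path)
# One state file per service keeps the blast radius of any single apply small.
terraform {
  backend "s3" {
    bucket = "terraform-state-prod"
    key    = "services/payment-api/terraform.tfstate" # assumed key layout
    region = "us-east-1"
  }
}
```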

Pattern 1: Layered Modules

We organize modules in three layers:

modules/
  base/          # VPC, subnets, DNS zones
  platform/      # EKS cluster, RDS, ElastiCache
  service/       # Per-service: ALB, task def, IAM role

Each layer depends only on the layer directly below it, and reads that layer's outputs through remote state data sources:

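# Read the platform layer's outputs from its state file (read-only).
data "terraform_remote_state" "platform" {
  backend = "s3"
  config = {
    bucket = "terraform-state-prod"
    key    = "platform/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_lb_target_group" "service" {
  # vpc_id must be declared as an output in the platform layer's root module.
  vpc_id = data.terraform_remote_state.platform.outputs.vpc_id
  # ...
}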
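
For the service layer to read `vpc_id` from the platform state, the platform layer has to publish it as an output. A minimal sketch of that side follows, assuming the VPC is created in the base layer and re-exported by the platform layer; the file name and the `base/terraform.tfstate` key are assumptions that mirror the example above.

```hcl
# platform/outputs.tf (hypothetical file name)
# The platform layer reads the base layer's state and re-exports only the
# values the service layer needs, so services never read base state directly.
data "terraform_remote_state" "base" {
  backend = "s3"
  config = {
    bucket = "terraform-state-prod"
    key    = "base/terraform.tfstate" # assumed key, mirroring the platform key
    region = "us-east-1"
  }
}

output "vpc_id" {
  description = "ID of the VPC created by the base layer"
  value       = data.terraform_remote_state.base.outputs.vpc_id
}
```

Only root module outputs are visible through `terraform_remote_state`, so anything a higher layer needs has to be declared this way.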

Pattern 2: Versioned Module Registry

We publish reusable modules to a private registry with semantic versioning:

module "service" {
  source  = "app.terraform.io/ourorg/service/aws"
  version = "~> 2.0"

  name        = "payment-api"
  environment = "production"
  cpu         = 512
  memory      = 1024
}

Rules we follow:

Breaking changes = major version bump
New optional variables = minor version bump
Bug fixes = patch version bump
Teams pin to a major version (~> 2.0, which accepts any 2.x release), not to an exact version
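
To make the pinning rule concrete, here is a sketch of how three constraint styles behave for the same module. The module names other than the first and the specific version numbers `2.1.0` and `2.1.3` are hypothetical; the remaining inputs are elided.

```hcl
# Major-version pin: accepts any 2.x release (>= 2.0, < 3.0).
# Minor and patch releases arrive on the next `terraform init -upgrade`.
module "payment_api" {
  source  = "app.terraform.io/ourorg/service/aws"
  version = "~> 2.0"
  # ...
}

# Tighter pin: patch releases only (>= 2.1.0, < 2.2.0).
module "billing_api" {
  source  = "app.terraform.io/ourorg/service/aws"
  version = "~> 2.1.0"
  # ...
}

# Exact pin: never moves until someone edits this line.
module "ledger_api" {
  source  = "app.terraform.io/ourorg/service/aws"
  version = "2.1.3"
  # ...
}
```

The major-version pin only stays safe if module authors hold to the rules above: a breaking change released as a minor version reaches every consumer of the first form.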

Pattern 3: Composition Over Configuration

Instead of one module with 40 variables and 15 conditional blocks, we compose small modules:

module "alb" {
  source = "./modules/alb"
  # ...
}

module "ecs_service" {
  source           = "./modules/ecs-service"
  target_group_arn = module.alb.target_group_arn
  # ...
}

module "monitoring" {
  source       = "./modules/cloudwatch-alarms"
  service_name = module.ecs_service.name
  # ...
}

Each module does one thing. Connecting them is explicit, not hidden behind flags.
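
The wiring above depends on each small module exposing a narrow, explicit interface. A sketch of what the `alb` module might contain to support `module.alb.target_group_arn` follows; the variable names, the port, and the target type are assumptions for illustration, not the actual module.

```hcl
# modules/alb/main.tf (hypothetical contents)
variable "name" {
  type        = string
  description = "Name used for the target group"
}

variable "vpc_id" {
  type        = string
  description = "VPC in which to create the target group"
}

resource "aws_lb_target_group" "this" {
  name        = var.name
  port        = 8080 # assumed container port
  protocol    = "HTTP"
  target_type = "ip" # assumed; matches tasks that get their own network interface
  vpc_id      = var.vpc_id
}

# The only value other modules are allowed to depend on.
output "target_group_arn" {
  description = "ARN of the target group, consumed by the ECS service module"
  value       = aws_lb_target_group.this.arn
}
```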

Pattern 4: Automated Testing

We test modules with terraform validate, tflint, and integration tests:

# In CI pipeline
cd modules/service
terraform init -backend=false
terraform validate
tflint --init
tflint

# Integration test (creates real resources, then destroys them)
# tests/ is relative to modules/service
cd tests/
go test -v -timeout 30m ./...
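
Between static checks and full integration tests there is a cheaper middle step: a plan-only test written in HCL with Terraform's built-in test framework (Terraform 1.6 or later). This is a different technique from the Go tests above, shown here only as a sketch. The file name and the `name` output are assumptions about the service module; the input values are the ones from the registry example.

```hcl
# modules/service/tests/plan.tftest.hcl (hypothetical file)
# Inputs applied to every run block in this file.
variables {
  name        = "payment-api"
  environment = "production"
  cpu         = 512
  memory      = 1024
}

run "plan_production_service" {
  # Plan only: nothing is created, so this finishes in seconds.
  command = plan

  assert {
    # Assumes the module declares an output called "name".
    condition     = output.name == "payment-api"
    error_message = "Service name output did not match the name input."
  }
}
```

Run it with `terraform test` from the module directory after `terraform init`. A plan-only run still needs a working AWS provider configuration.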

Best Practices Summary

Keep blast radius small: one service per state file
Version your modules: semantic versioning with a changelog
Compose, don't configure: small modules > one mega-module
Test in CI: validate + lint + integration tests
Use remote state carefully: treat it as a read-only data source for published outputs, not as a way to couple stacks to each other's internals
Document inputs/outputs: a module without docs is a module no one trusts
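
As an illustration of the last point, descriptions and validation blocks let the module document itself and reject bad input at plan time. This is a sketch: the allowed environment names, the default CPU value, and the `name` output are assumptions.

```hcl
variable "name" {
  type        = string
  description = "Service name; used in resource names and tags."
}

variable "environment" {
  type        = string
  description = "Deployment environment for this service."

  validation {
    # Assumed set of environments; adjust to match your own.
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "The environment must be one of: dev, staging, production."
  }
}

variable "cpu" {
  type        = number
  description = "CPU units for the task (1024 units = 1 vCPU)."
  default     = 256 # assumed default
}

output "name" {
  description = "Service name, for use by monitoring and other modules."
  value       = var.name
}
```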

Terraform at scale is a software engineering problem, not just an infrastructure problem. Treat your modules like libraries.

About Kiril urbonas