EDP
terraform-platform-infra-live
Visual infrastructure guide

This repo is the live Terraform composition for the Enterprise Data Platform. It builds the AWS network, lake storage, IAM, processing, orchestration, serving, analytics backend, frontends, and monitoring resources for each environment.

VPC · S3 data lake · KMS + IAM · DMS · Glue · Step Functions · MWAA · ECS frontends · CloudWatch
Module dependency map

What Terraform builds, and why order matters

Module dependency order. Read in order: foundation outputs are created first, then reusable runtime modules, then optional demo modes.

1. Foundation

  • networking: VPC, subnets, gateways, endpoint
  • data-lake: S3 buckets, Athena results
  • iam-metadata: KMS, roles, Glue databases

2. Default dev runtime

  • processing: Glue connection + workgroup
  • step-functions: default orchestrator
  • analytics-agent: ECS, ECR, ALB, UI services
  • monitoring: dashboards, alarms, SNS

3. Optional modes

  • ingestion: RDS PostgreSQL + DMS CDC
  • serving: Redshift Serverless
  • orchestration: MWAA Airflow visual DAG

Foundation first: Networking, buckets, KMS, IAM roles, and Glue databases are shared prerequisites. Most later modules cannot be planned correctly without their outputs.
Default dev path: Processing, Step Functions, Analytics Agent, and Monitoring are the normal short-session setup. This keeps the platform usable without running every expensive component.
Optional modules: Ingestion, Redshift serving, and MWAA are enabled when needed. The diagram draws them with dashed borders because they are intentionally not always on in dev.
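
The ordering rule above can be sketched as a small dependency graph. This is a minimal illustration, not the repo's actual wiring: the module names come from the guide, but the specific edges in `DEPENDS_ON` and the helper `apply_order` are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Each module maps to the modules whose outputs it consumes.
# Module names are from the guide; the exact edges are ASSUMED for illustration.
DEPENDS_ON = {
    # 1. Foundation
    "networking": set(),
    "data-lake": set(),
    "iam-metadata": {"data-lake"},
    # 2. Default dev runtime
    "processing": {"networking", "data-lake", "iam-metadata"},
    "step-functions": {"processing", "iam-metadata"},
    "analytics-agent": {"networking", "iam-metadata"},
    "monitoring": {"step-functions", "analytics-agent"},
    # 3. Optional modes
    "ingestion": {"networking", "data-lake", "iam-metadata"},
    "serving": {"networking", "iam-metadata"},
    "orchestration": {"networking", "data-lake", "iam-metadata"},
}


def apply_order(enabled):
    """Return the enabled modules so that every prerequisite comes first.

    Raises ValueError if an enabled module needs a module that is not enabled,
    which mirrors the point that later modules cannot be planned without
    the foundation outputs.
    """
    enabled = set(enabled)
    missing = {dep for m in enabled for dep in DEPENDS_ON[m]} - enabled
    if missing:
        raise ValueError(f"missing prerequisites: {sorted(missing)}")
    graph = {m: DEPENDS_ON[m] for m in enabled}
    return list(TopologicalSorter(graph).static_order())
```

Calling `apply_order` with only `"processing"` fails, while adding the three foundation modules yields an order in which they precede it. Terraform derives the same kind of ordering from references between module outputs and inputs, so a sketch like this is only a mental model.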
Network topology

What lives in public vs private subnets

VPC: 10.10.0.0/16 in eu-central-1. Placement view: public edge at the top, private workloads in the middle, durable storage and logs at the bottom.

Public subnet

  • Internet Gateway: internet edge
  • Application LB: browser entry
  • NAT Gateway: outbound only
  • Bastion: SSM access path

Private subnets

  • ECS Fargate: agent + UIs
  • Glue jobs: private Spark
  • RDS PostgreSQL: source database
  • DMS: reads WAL
  • MWAA: Airflow UI

Private AWS service access

  • S3 VPC Endpoint
  • S3 data lake
  • CloudWatch

Public means edge only: The ALB receives browser traffic. The NAT Gateway lets private workloads call AWS APIs or Claude, but it does not allow inbound access.
Private means protected compute: ECS, Glue, RDS, DMS, and MWAA live in private subnets. They communicate through VPC routes, security groups, IAM, and S3 endpoints.
S3 endpoint reduces exposure: Glue, DMS, and ECS can reach S3 over the AWS private network instead of crossing the public internet.
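
To make the public/private split concrete, the sketch below carves the VPC range into subnet blocks. Only the VPC CIDR `10.10.0.0/16` comes from the guide; the /24 size, the count of two subnets per tier, and the helper `carve_subnets` are assumptions for illustration and may not match the repo's real subnet layout.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")  # stated in the guide


def carve_subnets(vpc, new_prefix=24, public=2, private=2):
    """Split the VPC range into equal blocks and label the first few.

    The prefix length and per-tier counts are ASSUMED values, chosen only
    to show how non-overlapping public and private ranges can be derived.
    """
    blocks = vpc.subnets(new_prefix=new_prefix)  # iterator over /24 blocks
    layout = {"public": [], "private": []}
    for tier, count in (("public", public), ("private", private)):
        for _ in range(count):
            layout[tier].append(next(blocks))
    return layout


layout = carve_subnets(VPC_CIDR)
for tier, nets in layout.items():
    print(tier, [str(n) for n in nets])
```

Under these assumptions the public tier would hold the ALB, NAT Gateway, and bastion, and the private tier would hold ECS, Glue, RDS, DMS, and MWAA.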
Data and serving architecture

How infrastructure supports the full analytics platform

Complete platform built by Terraform. Infrastructure creates the lanes. Application repos put code into those lanes.

Optional ingestion

  • RDS PostgreSQL: source
  • DMS: CDC replication

S3 data lake

  • Bronze: raw CDC parquet
  • Silver: clean tables
  • Gold: analytics marts

Processing

  • Glue PySpark: Bronze to Silver
  • dbt + Athena: Silver to Gold
  • Step Functions: default control path

Serving and visibility

  • ECS analytics service: FastAPI + web frontends
  • ALB: browser entrypoint
  • CloudWatch: logs, metrics, alarms

Optional

  • MWAA: visual DAG
  • Redshift: BI serving

Infrastructure is the stage: This repo does not generate data or run analytics code itself. It creates the AWS services that the application repos need to run safely.
Two orchestration choices: Step Functions is the default fast path. MWAA is the optional Airflow UI path for visual DAG demonstrations.
Serving boundary: The infrastructure creates ECS, ALB, ECR, and monitoring for the analytics service; the detailed frontend behavior belongs in the analytics-agent repo.
Session modes

What is active in dev and why

Dev session operating modes. Dev has three operating modes. They are choices, not a single runtime sequence.

Default short session

  • Foundation + processing: network, lake, IAM, Glue, Athena
  • Step Functions pipeline: fast startup, low run cost
  • Analytics serving: ECS agent + ALB + monitoring

Refresh source data

  • Enable ingestion: RDS + DMS when CDC needs a rerun
  • Run simulator: new OLTP changes land in Bronze

Airflow demo mode

  • Enable MWAA: visual Airflow task graph
  • Run same pipeline: Glue → crawler → dbt → artifacts

Default is practical: Most sessions only need the foundation, Step Functions, agent serving, and monitoring. That keeps startup faster and cost lower.
Ingestion is special: RDS and DMS cost money while running, so ingestion is only enabled when you need fresh CDC from PostgreSQL.
MWAA is for visual orchestration: Airflow is useful for demos and DAG inspection, but Step Functions is the lean default pipeline path.
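
The three modes can be read as different sets of enabled modules. The following is a rough sketch under stated assumptions: the mode keys (`default`, `refresh-source-data`, `airflow-demo`) and the helper `modules_for` are hypothetical names, and it assumes the two non-default modes add a module on top of the default session rather than replacing anything.

```python
# Module names are from the guide; the grouping into sets is an illustration.
FOUNDATION = {"networking", "data-lake", "iam-metadata"}
DEFAULT_RUNTIME = {"processing", "step-functions", "analytics-agent", "monitoring"}

# Extra modules each mode switches on (ASSUMED to be additive to the default).
MODE_EXTRAS = {
    "default": set(),
    "refresh-source-data": {"ingestion"},  # RDS PostgreSQL + DMS CDC
    "airflow-demo": {"orchestration"},     # MWAA
}


def modules_for(mode):
    """Return the set of modules that would be enabled for a dev session mode."""
    if mode not in MODE_EXTRAS:
        raise ValueError(f"unknown mode: {mode!r}")
    return FOUNDATION | DEFAULT_RUNTIME | MODE_EXTRAS[mode]


for mode in MODE_EXTRAS:
    print(mode, sorted(modules_for(mode)))
```

Because the modes are choices rather than a sequence, nothing in this sketch orders them. Each call simply answers which cost-bearing modules a given session would turn on.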
Concept map

The infrastructure ideas this repo teaches

Terraform composition

environment → modules → AWS resources

The environment folder is the composition root. It wires module outputs and variables together for dev, staging, or prod.

Least privilege by layer

KMS · IAM roles · scoped buckets

Each role is created for the specific service that needs it. Glue, DMS, MWAA, Redshift, and ECS do not all get the same permissions.

Private compute

private subnets · S3 endpoint · NAT outbound

The platform exposes only what must be public. Most services have no public IP and reach S3 privately.

Orchestration modes

Step Functions or MWAA

Both paths run the same data pipeline. Step Functions is lean; MWAA provides the Airflow UI and DAG graph.

Serving choices

FastAPI · Streamlit · HTML · Slack

The frontends should be presentation layers. The Analytics Agent backend remains the governed query brain.

Observability

logs · metrics · alarms

The monitoring module makes failures visible: pipeline errors, ECS health, ALB problems, and stale Silver data.

Environment mental model

Same code, different environment values

dev

  • Fast iteration and demos
  • Short-lived sessions to save cost
  • Ingestion and MWAA enabled only when needed

staging

  • Production-like validation
  • Stronger availability settings
  • Useful before promoting infra changes

prod

  • Same module structure
  • Production-grade variables
  • Designed to avoid console-created drift