# Building a Hybrid AWS Microservices Platform with API Gateway, Lambda, ECS, and Load Balancers
## Introduction

When teams start splitting a large backend into smaller services, the first infrastructure question is usually not "How do we build a microservice?" but "How do we expose many different services safely, consistently, and without creating a networking mess?"

Our architecture provides a practical answer to that problem using a hybrid AWS design:

- API Gateway as the front door
- Lambda for lightweight serverless capabilities and supporting workflows
- ECS Fargate for containerized business services
- Internal load balancers for private service routing
- Terraform for repeatable, staged infrastructure delivery

The important architectural idea is separation of concerns. Public access, authentication, routing, container execution, and service discovery are all handled by different layers. That keeps the platform easier to scale and much easier to evolve as the number of services grows.

At a high level, the platform follows this flow:

1. A client sends an HTTPS request to API Gateway.
2. API Gateway applies request-level controls such as API key enforcement, CORS behavior, and route matching.
3. The request is sent either to a Lambda-backed endpoint or to a private containerized service.
4. For ECS services, traffic goes through a VPC Link into internal load balancing.
5. The load balancer forwards the request to the correct ECS service based on path rules.
6. ECS Fargate runs one or more healthy tasks for that service and returns the response.

This gives consumers a single API surface while allowing the backend implementation to vary by use case. A platform like this benefits from using both compute models rather than forcing every workload into one.
## Choosing Between Lambda and ECS Fargate

Lambda is a strong fit for:

- lightweight request handlers
- event-driven tasks
- simple orchestration
- platform support functions
- endpoints that do not need a full container lifecycle

ECS Fargate is a better fit for:

- long-lived HTTP microservices
- containerized frameworks and dependencies
- services that need more predictable runtime behavior
- APIs that benefit from load balancing, health checks, and horizontal scaling

Our architecture supports both. Some APIs are routed to Lambda-based services, while others are routed to ECS services defined through service configuration. That hybrid model is useful in real organizations because not all services have the same runtime needs.

## A Staged Terraform Layout

One of the strongest ideas in our architecture is the staged Terraform layout. Instead of deploying everything together, the infrastructure is split into three layers.

The first stage establishes the network foundation:

- VPC selection or creation
- public and private subnet discovery or provisioning
- internal Network Load Balancer
- internal Application Load Balancer
- VPC Link for API Gateway
- ECS task security group
- ALB log storage and network observability components

This stage is intentionally infrastructure-only. No application services are deployed here.

The second stage provisions the actual execution environment:

- ECS cluster on Fargate
- ECR repositories for service images
- target groups per service
- ALB listener and listener rules
- ECS service definitions
- CloudWatch log groups
- Lambda functions used by the platform

This stage consumes outputs from the networking stage, so the compute layer never hardcodes network assumptions in its own design.
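To make the staging idea concrete, here is a minimal sketch of how the compute stage could consume the networking stage's outputs. The backend bucket, state key, variable names, and resource names are illustrative assumptions, not details from the actual repository:

```hcl
# Hypothetical sketch: the compute stage reads the networking stage's
# outputs via a terraform_remote_state data source instead of
# hardcoding VPC or subnet IDs.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "platform-terraform-state"            # assumed state bucket
    key    = "stage1-network/terraform.tfstate"    # assumed state key
    region = "us-east-1"
  }
}

resource "aws_ecs_cluster" "platform" {
  name = "platform-${var.environment}"             # var.environment is assumed
}

resource "aws_lb_target_group" "example" {
  name        = "svc-example"
  port        = 8080
  protocol    = "HTTP"
  target_type = "ip"   # required for Fargate tasks with awsvpc networking
  vpc_id      = data.terraform_remote_state.network.outputs.vpc_id
}
```

The key property is that the compute layer only knows output names, not network values, so the networking stage can change internally without breaking downstream stages.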
The third stage exposes services through API Gateway:

- a public API for internet-facing consumption
- a private API for VPC-only access
- route creation from service metadata
- VPC Link integrations for containerized services
- Lambda proxy integrations for Lambda-backed services
- API keys, usage plans, and stage configuration

This split is operationally important. Teams can change routing without rebuilding networking, and they can add services without redesigning the entire platform.

## Private Ingress for Containerized Services

For containerized microservices, the implementation follows a private ingress model. The path is:

Client -> API Gateway -> VPC Link -> internal NLB -> internal ALB -> ECS service -> ECS task

That may look like one hop too many at first, but each layer has a purpose.

API Gateway is the public control plane. It handles:

- TLS termination at the edge
- route exposure
- API key enforcement
- request and header mapping
- CORS handling
- stage-based deployment

It gives consumers a stable API contract while keeping the backend private.

ECS services are not exposed directly to the internet. Instead, API Gateway connects privately into the VPC using a VPC Link, which allows the public API layer to reach internal services without making the services themselves public. This is a strong security pattern: the application runtime stays inside the VPC, but consumers still get a clean managed API endpoint.

A useful implementation detail in our architecture is that the VPC Link targets an internal Network Load Balancer, and that NLB forwards to an internal Application Load Balancer. This arrangement provides two separate benefits:

- The NLB is the stable target for the API Gateway VPC Link.
- The ALB performs path-based routing to the actual microservices.

The ALB is what makes many ECS services practical behind one internal entry point. Each service gets its own listener rule and target group, so the platform can route based on URL path rather than provisioning a separate load balancer per service.
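The NLB-to-ALB chain above can be sketched in Terraform roughly as follows. Resource names, ports, and variables are placeholders chosen for illustration:

```hcl
# Hypothetical sketch of the private ingress chain: a REST API VPC Link
# targets an internal NLB, and the NLB forwards to the internal ALB
# through an ALB-type target group.
resource "aws_lb" "internal_nlb" {
  name               = "platform-internal-nlb"
  internal           = true
  load_balancer_type = "network"
  subnets            = var.private_subnet_ids    # assumed variable
}

resource "aws_api_gateway_vpc_link" "platform" {
  name        = "platform-vpc-link"
  target_arns = [aws_lb.internal_nlb.arn]        # VPC Link targets the NLB
}

resource "aws_lb_target_group" "alb_target" {
  name        = "nlb-to-alb"
  port        = 80
  protocol    = "TCP"
  target_type = "alb"    # register the internal ALB as the NLB's target
  vpc_id      = var.vpc_id
}

resource "aws_lb_target_group_attachment" "alb" {
  target_group_arn = aws_lb_target_group.alb_target.arn
  target_id        = var.internal_alb_arn        # assumed ALB ARN input
  port             = 80
}

resource "aws_lb_listener" "nlb_to_alb" {
  load_balancer_arn = aws_lb.internal_nlb.arn
  port              = 80
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb_target.arn
  }
}
```

The NLB provides the stable, static entry point the VPC Link requires, while the ALB behind it carries the path-based routing logic.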
## Service-Oriented Load Balancing

The load-balancing model is service-oriented. Each ECS microservice contributes:

- a base API path
- an ALB path pattern
- a listener rule priority
- a container port
- a health check definition

From that metadata, Terraform creates:

- one target group per service
- one listener rule per service
- one ECS service per service

This means the routing layer is not manually duplicated for every new microservice. The service declares its path and runtime settings, and the platform generates the infrastructure around it.

Each target group points to ECS tasks using IP targets. That is the correct choice for Fargate because tasks run with their own elastic network interfaces rather than on shared EC2 hosts.

The target groups also use application-level health checks. A task is considered healthy only when its service endpoint responds successfully on the configured health path. That matters because container startup is not the same as application readiness: a service may be running from ECS's perspective but still not ready to receive traffic.

The ALB listener is configured once, and each service gets a path-based rule. For example, a service under a quoting path can be matched independently from a service under a product-pricing path. This keeps the routing layer centralized and avoids deploying a dedicated ALB per service, which would become expensive and operationally noisy as the platform grows.

The platform uses health checks in multiple places:

- API health endpoints at the application level
- ALB target group health checks
- ECS service health grace periods
- container health checks inside the task definition

That layered approach improves resilience:

- unhealthy tasks are removed from target groups
- ECS replaces failed tasks
- API Gateway continues to route through the same private entry point

The result is a platform that can recover from instance-level failures without changing the public API contract.
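The metadata-driven routing described above can be sketched with a `for_each` over a service map. The variable shape, service names, and the `aws_lb_listener.http` reference are assumptions for illustration, not the repository's actual schema:

```hcl
# Hypothetical sketch: one target group and one listener rule generated
# per service from declared metadata.
variable "services" {
  type = map(object({
    path_pattern   = string
    priority       = number
    container_port = number
    health_path    = string
  }))
  default = {
    quoting = {
      path_pattern   = "/quoting/*"
      priority       = 10
      container_port = 8080
      health_path    = "/quoting/health"
    }
  }
}

resource "aws_lb_target_group" "svc" {
  for_each    = var.services
  name        = "tg-${each.key}"
  port        = each.value.container_port
  protocol    = "HTTP"
  target_type = "ip"          # Fargate tasks register by ENI IP
  vpc_id      = var.vpc_id

  health_check {
    path    = each.value.health_path
    matcher = "200"           # healthy only on a successful app response
  }
}

resource "aws_lb_listener_rule" "svc" {
  for_each     = var.services
  listener_arn = aws_lb_listener.http.arn   # assumed shared ALB listener
  priority     = each.value.priority

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.svc[each.key].arn
  }

  condition {
    path_pattern {
      values = [each.value.path_pattern]
    }
  }
}
```

Adding a new microservice then means adding one entry to the map rather than copying routing resources.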
## Repeatable ECS Service Deployment

The ECS side of the platform is built for repeatability rather than one-off service definitions.

The platform provisions a shared ECS cluster per environment. That allows multiple microservices to run within the same operational boundary while still being isolated at the task and service level. The cluster uses Fargate, which removes the need to manage EC2 worker nodes and simplifies operations significantly:

- no patching of container hosts
- no cluster capacity management at the instance level
- easier scaling by task count

Instead of defining each ECS service from scratch, the platform uses a reusable Terraform module for service deployment. That module is responsible for:

- task definition creation
- container logging configuration
- IAM role wiring
- ECS service creation
- target group attachment
- subnet and security group placement
- optional capacity provider strategy

This is a strong platform choice. It makes service onboarding consistent and reduces drift between services.

Each service runs as a Fargate task with:

- a named container image from ECR
- CPU and memory settings
- environment variables
- a health check command
- CloudWatch logging

The task definition pattern also supports an optional X-Ray sidecar container, which is useful for distributed tracing in a microservice environment.

Tasks run with awsvpc networking, which gives each task its own network interface and private IP. This is the standard model for ECS on Fargate and is what allows ALB target groups to use IP mode cleanly.

## Network Flexibility

The platform supports both existing/default VPC usage and a more segmented custom VPC model. That flexibility matters because many teams start in a default-VPC or dev-friendly setup and later move to stricter network isolation for staging and production. The network layer discovers public and private subnets where available; in a custom VPC, the design supports proper private subnet deployment.
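The task definition pattern could look roughly like the following. Image names, ports, the health check command, and role variables are placeholders, not the repository's actual values:

```hcl
# Hypothetical sketch of the Fargate task definition pattern: an app
# container with a health check and awslogs logging, plus an optional
# X-Ray daemon sidecar for distributed tracing.
resource "aws_ecs_task_definition" "service" {
  family                   = "svc-example"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"      # each task gets its own ENI
  cpu                      = 256
  memory                   = 512
  execution_role_arn       = var.execution_role_arn   # assumed input

  container_definitions = jsonencode([
    {
      name         = "app"
      image        = "${var.ecr_repo_url}:latest"
      essential    = true
      portMappings = [{ containerPort = 8080 }]
      healthCheck = {
        command  = ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
        interval = 30
        retries  = 3
      }
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = "/ecs/svc-example"
          awslogs-region        = var.region
          awslogs-stream-prefix = "app"
        }
      }
    },
    {
      name         = "xray-daemon"
      image        = "amazon/aws-xray-daemon"
      essential    = false   # tracing sidecar failure should not kill the task
      portMappings = [{ containerPort = 2000, protocol = "udp" }]
    }
  ])
}
```

Wrapping this in a shared module is what keeps every service's task definition shaped the same way.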
In a simpler default VPC setup, the platform can fall back to available public subnets when private ones are not present. This is an important operational nuance:

- development environments often optimize for simplicity
- higher environments usually optimize for stricter isolation

The platform is built to handle both.

The security model follows least-privilege intent:

- ECS tasks accept application traffic only from the internal load-balancing layer
- services are not directly internet-facing
- API Gateway reaches backend services through private network integration

This keeps the application tier out of direct public exposure while still allowing a public API facade.

## Configuration-Driven Service Onboarding

One of the most scalable ideas in our architecture is that services are registered through configuration rather than by handcrafting infrastructure every time. A master service registry lists the enabled services per environment, and each service provides its own deployment metadata, including:

- service identity
- container port
- desired task count
- CPU and memory
- API base path
- ALB path pattern
- listener priority
- health check behavior
- logging retention
- autoscaling preferences

This creates a platform model rather than a collection of unrelated microservices. Adding a new service becomes a repeatable process:

1. Create the service.
2. Define its configuration.
3. Register it in the service catalog.
4. Build and publish the image.
5. Apply the Terraform stages.

That is much easier to maintain than cloning infrastructure blocks over and over.

## Container Supply Chain

For ECS workloads, the container supply chain is straightforward:

1. Build the service image.
2. Push it to an ECR repository.
3. Reference the tagged image in the ECS task definition.
4. Update the ECS service to roll out the new task definition.

Our platform provisions one ECR repository per service, with image scanning enabled. That is a good baseline for a microservices platform because it keeps artifacts separated by service while still following a common naming convention.
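The per-service ECR repositories with scanning enabled could be provisioned along these lines; the naming convention and variable are assumptions for illustration:

```hcl
# Hypothetical sketch: one ECR repository per registered service,
# with image scanning on push enabled as a supply-chain baseline.
variable "service_names" {
  type    = list(string)
  default = ["quoting", "product-pricing"]   # illustrative service names
}

resource "aws_ecr_repository" "svc" {
  for_each = toset(var.service_names)
  name     = "platform/${each.value}"        # assumed naming convention

  image_scanning_configuration {
    scan_on_push = true
  }

  image_tag_mutability = "IMMUTABLE"   # assumed: pin deployments to exact tags
}
```

Driving this from the same service list used for routing keeps the registry, routing, and artifact layers in sync.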
There is also an explicit deployment phase between infrastructure provisioning and API exposure where container images are built and pushed. That is a practical real-world step many diagrams omit, but it is essential because ECS cannot run a service until the image exists in the registry.

## Lambda as a First-Class Option

Lambda is used here as a first-class platform option, not as an afterthought. There are two useful Lambda patterns in our architecture.

First, some services can be exposed through API Gateway using Lambda proxy integration. This is ideal for capabilities that are naturally event-driven, lightweight, or operationally simpler as functions than as always-on containers. In this model:

- API Gateway owns the route
- Lambda executes the business logic
- API Gateway returns the Lambda response directly

This avoids unnecessary load-balancer and container overhead for smaller workloads.

Second, our architecture provisions Lambda functions that support the overall platform, such as authentication-related or onboarding-related workflows. This is a smart use of Lambda in a hybrid platform because not every supporting concern needs to run inside ECS.

## API Protection at the Gateway

Our architecture treats API protection as an API Gateway concern. The current public API implementation enforces API key usage through API Gateway methods, API keys, and usage plans. The codebase also provisions a supporting API key validation Lambda function and related permissions, which shows the platform is designed to accommodate Lambda-based validation flows where needed.

The important architectural takeaway is this:

- keep authentication and traffic governance at the gateway layer
- keep service containers focused on business logic
- keep private workloads private

That separation keeps the platform easier to secure and easier to reason about.

## Public and Private APIs

Another strength of our architecture is that it supports both public and private APIs. The public API is intended for internet-facing access.
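A Lambda proxy route with API key enforcement could be wired up roughly as follows. The API name, resource path, and function references are hypothetical:

```hcl
# Hypothetical sketch: a Lambda proxy integration on the public REST API,
# with the API key requirement enforced at the method level.
resource "aws_api_gateway_resource" "onboard" {
  rest_api_id = aws_api_gateway_rest_api.public.id
  parent_id   = aws_api_gateway_rest_api.public.root_resource_id
  path_part   = "onboard"                  # illustrative path
}

resource "aws_api_gateway_method" "onboard_post" {
  rest_api_id      = aws_api_gateway_rest_api.public.id
  resource_id      = aws_api_gateway_resource.onboard.id
  http_method      = "POST"
  authorization    = "NONE"
  api_key_required = true                  # enforced via API keys + usage plans
}

resource "aws_api_gateway_integration" "onboard_lambda" {
  rest_api_id             = aws_api_gateway_rest_api.public.id
  resource_id             = aws_api_gateway_resource.onboard.id
  http_method             = aws_api_gateway_method.onboard_post.http_method
  integration_http_method = "POST"
  type                    = "AWS_PROXY"    # Lambda proxy: response passes through
  uri                     = aws_lambda_function.onboard.invoke_arn
}

resource "aws_lambda_permission" "apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.onboard.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_api_gateway_rest_api.public.execution_arn}/*/*"
}
```

Note there is no load balancer or VPC Link anywhere in this path, which is exactly the overhead the Lambda pattern avoids.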
It handles:

- external client access
- API keys and usage plans
- CORS behavior
- Lambda and ECS route exposure

The private API is intended for internal or VPC-scoped access. It is useful when services should only be reachable from trusted network boundaries such as internal AWS workloads, integration environments, or enterprise connectivity paths. This split is helpful when some capabilities should be public and others should remain internal even though they share the same service platform underneath.

## Observability

A microservices platform is only as good as its operational visibility. Our architecture includes observability at several levels:

- CloudWatch log groups for ECS services
- CloudWatch logs for Lambda functions
- API Gateway stage logging
- ALB logging support
- VPC flow logging
- X-Ray-friendly task patterns

That combination helps answer the most common production questions:

- Did the request reach the gateway?
- Was it routed to the right backend?
- Was the target healthy?
- Did the service fail or time out?
- Was the problem in networking, routing, or application logic?

Without that layered visibility, hybrid platforms become difficult to troubleshoot.

## Scaling

This architecture scales well because each layer can evolve somewhat independently. API Gateway absorbs public traffic without requiring the backend to manage edge-facing concerns directly. ECS services scale by task count, and each service can define:

- desired count
- minimum and maximum capacity
- CPU and memory sizing
- autoscaling thresholds

That means heavily used services can scale out without affecting lighter services. As more services are added, the platform does not need a new ingress pattern each time: the same path-based routing model continues to work as long as route definitions and listener priorities stay clean.

This architecture also aligns well with AWS best-practice design principles, especially the AWS Well-Architected mindset.
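Per-service autoscaling on task count could be expressed as a target-tracking policy like the one below. Capacity bounds, the CPU target, and variable names are illustrative assumptions:

```hcl
# Hypothetical sketch: target-tracking autoscaling on an ECS service's
# desired count, driven by average CPU utilization.
resource "aws_appautoscaling_target" "svc" {
  service_namespace  = "ecs"
  resource_id        = "service/${var.cluster_name}/${var.service_name}"
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 2    # assumed floor for availability
  max_capacity       = 6    # assumed ceiling for cost control
}

resource "aws_appautoscaling_policy" "cpu" {
  name               = "${var.service_name}-cpu-target"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.svc.service_namespace
  resource_id        = aws_appautoscaling_target.svc.resource_id
  scalable_dimension = aws_appautoscaling_target.svc.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = 60   # add tasks when average CPU stays above ~60%
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}
```

Because these settings live in per-service metadata, a heavily used service can scale to different bounds than a lightly used one without any shared-platform change.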
## Operational Excellence

We have structured the platform so that it is operated as a system rather than as a collection of one-off deployments. This is reflected in:

- staged Terraform deployments for clearer ownership and safer changes
- configuration-driven service onboarding
- consistent ECS service patterns through reusable modules
- standardized logging and deployment workflows

This reduces manual drift and makes operational changes more repeatable.

## Security

Security is addressed through layered controls rather than a single protection point. We have adhered to good AWS security practices by:

- placing ECS services behind private networking rather than exposing them directly
- using API Gateway as the controlled ingress layer
- applying API-level protection at the gateway
- using security groups to limit east-west traffic
- supporting encrypted log and storage patterns
- separating public access from internal service routing

This follows the AWS principles of strong boundaries, least privilege, and defense in depth.

## Reliability

Reliability comes from designing for failure at the service and routing layers. We have incorporated that through:

- multi-AZ subnet placement
- load balancer health checks
- ECS task replacement behavior
- target group isolation per service
- decoupled gateway and backend layers
- staged infrastructure dependencies with clear outputs between layers

This means a failing task or unhealthy target does not require the API surface itself to change.

## Performance Efficiency

The architecture chooses the right compute model for the right workload. That is an AWS best practice because it avoids treating all traffic the same. Examples include:

- Lambda for lighter, event-oriented, or supporting workflows
- ECS Fargate for containerized services that need steady HTTP handling
- ALB path-based routing for efficient multi-service consolidation
- service-specific CPU, memory, and scaling settings

This lets us tune services independently instead of overprovisioning everything at the platform level.

Cost optimization is also visible in the design choices.
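The east-west restriction mentioned above (tasks accept traffic only from the internal load-balancing layer) can be sketched as a security group rule. The port and the ALB security group reference are assumptions:

```hcl
# Hypothetical sketch: ECS tasks admit application traffic only from
# the internal ALB's security group, not from arbitrary VPC sources.
resource "aws_security_group" "ecs_tasks" {
  name   = "platform-ecs-tasks"
  vpc_id = var.vpc_id

  ingress {
    description     = "App traffic from the internal ALB only"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.internal_alb.id]   # assumed SG
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]   # outbound for image pulls, logs, APIs
  }
}
```

Referencing the ALB's security group, rather than a CIDR range, keeps the rule correct even as load balancer IPs change.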
## Cost Optimization

We are not multiplying infrastructure unnecessarily. Instead, the architecture encourages shared but controlled platform components:

- one API layer for many services
- one internal routing layer for many ECS workloads
- shared ECS cluster patterns per environment
- service-level scaling instead of blanket scaling
- support for Fargate and optional capacity-provider strategies where appropriate

That is much closer to AWS best practice than provisioning separate ingress and compute stacks for every small service.

## Sustainability

Even when sustainability is not called out directly, maintainable designs usually consume fewer engineering and infrastructure resources over time. The architecture helps here by:

- reducing duplicated infrastructure definitions
- making service onboarding metadata-driven
- encouraging reuse of shared platform components
- keeping the public contract stable while backend services evolve

That leads to lower long-term complexity, which is a practical form of architectural efficiency.

## Standardization with Flexibility

This AWS pattern is effective because it balances standardization with flexibility. It standardizes:

- deployment stages
- ingress architecture
- service registration
- load-balancer behavior
- logging and health checks
- ECS service creation

It stays flexible by allowing:

- Lambda-backed endpoints
- ECS-backed endpoints
- public and private APIs
- different service-level scaling and runtime settings
- multiple environments with different networking strategies

That is exactly what a growing microservices platform needs.

## Implementing a Similar Architecture

If you want to implement a similar architecture, a good sequence is:

1. Build the networking foundation first.
2. Keep all service backends private.
3. Put API Gateway in front of everything external.
4. Use ECS Fargate for containerized APIs that benefit from long-lived service behavior.
5. Use Lambda for support functions and lightweight endpoints.
6. Register services through metadata, not repetitive infrastructure definitions.
7. Use path-based ALB routing so many services can share one internal ingress layer.
Finally, add strong health checks and centralized logs before traffic grows. The key is not just choosing AWS services, but assigning each AWS service a clear responsibility.

## Conclusion

Our architecture demonstrates a mature way to implement Lambda and ECS-based microservices behind API Gateway without exposing backend services directly. The architecture uses:

- staged Terraform for separation of concerns
- API Gateway as the public and private API facade
- Lambda where serverless execution makes sense
- ECS Fargate for containerized microservices
- NLB and ALB together for private, path-aware routing
- config-driven onboarding for scale

For teams building an enterprise microservices platform, this is a strong pattern because it supports security, operational clarity, and service growth without forcing every workload into the same runtime model. Most importantly, it turns infrastructure into a reusable platform: once that platform is in place, adding the next service becomes much easier than adding the first one.

## Key Takeaways

- Keeping API Gateway as the front door and backend services private makes the architecture easier to secure and easier to evolve.
- Using both Lambda and ECS is more practical than forcing every use case into a single compute model.
- Path-based routing through shared internal load balancing scales better than creating isolated ingress infrastructure for every service.
- Service onboarding becomes significantly easier when routing, health checks, scaling, and runtime settings are driven by configuration.
- Health checks, logging, and observability need to be designed from the beginning; adding them later is much harder in a distributed system.
- A staged infrastructure model reduces operational risk because networking, compute, and API exposure can be changed independently.
- Standardizing platform patterns early saves substantial effort as the number of microservices grows.
