Building SwiftDeploy: From Declarative Deployments to Policy-Gated Releases
## Introduction

SwiftDeploy is a DevOps deployment tool I built as part of the HNG DevOps Track. The project started as a deployment automation task and later grew into a policy-gated, observable deployment system.

In Stage 4A, I built the deployment engine. The tool could read a `manifest.yaml` file, generate infrastructure files, deploy the stack, and switch between stable and canary modes. In Stage 4B, I extended the project with observability, policy enforcement, chaos testing, status checks, and audit reporting.

The goal was to build a tool that does not just deploy containers, but also checks whether the environment is safe before deployment and promotion.

## The Design: A Tool That Writes Its Own Infrastructure

The main idea behind SwiftDeploy is that `manifest.yaml` remains the single source of truth. Instead of manually writing `docker-compose.yml` and `nginx.conf`, the CLI reads the manifest and generates those files from templates. The flow looks like this:

```text
manifest.yaml
      |
      v
swiftdeploy init
      |
      v
nginx.conf + docker-compose.yml
      |
      v
Docker Compose deployment
```

This means that if the generated files are deleted, the CLI can recreate them from the manifest.

The manifest defines the service image, port, mode, version, Nginx settings, network settings, and policy thresholds. Example:

```yaml
services:
  image: 10johnny-swiftdeploy-stage4b:latest
  port: 3000
  mode: stable
  version: "1.0.0"
  restart_policy: unless-stopped

nginx:
  image: nginx:latest
  port: 8080
  proxy_timeout: 30

network:
  name: swiftdeploy-net
  driver_type: bridge

policy:
  opa_url: http://localhost:8181
  thresholds:
    min_disk_free_gb: 10
    max_cpu_load: 2.0
    max_error_rate_percent: 1
    max_p99_latency_ms: 500
```

## Architecture

The application runs behind Nginx.

```text
User / Browser / curl
          |
          v
Nginx container on port 8080
          |
          v
Python API service on internal port 3000
```

Only Nginx is exposed to the host. The Python API service is not exposed directly.
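The "only Nginx is exposed" rule maps directly onto the generated Compose file: the Nginx service publishes a host port, while the API service only joins the internal network. An illustrative, hand-written fragment follows (the service names and the container-side port mapping are assumptions, not the tool's actual generated output):

```yaml
services:
  app:
    image: 10johnny-swiftdeploy-stage4b:latest
    expose:
      - "3000"          # reachable only from other containers on swiftdeploy-net
    networks:
      - swiftdeploy-net

  nginx:
    image: nginx:latest
    ports:
      - "8080:8080"     # the only port published to the host
    networks:
      - swiftdeploy-net

networks:
  swiftdeploy-net:
    driver: bridge
```

Because the `app` service uses `expose` rather than `ports`, nothing can reach it from the host except by going through Nginx.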
This ensures all traffic goes through the reverse proxy.

The generated Docker Compose file creates the service container, the Nginx container, the network, a named volume, and the OPA policy sidecar.

## Stage 4A: Deployment Lifecycle

SwiftDeploy supports the main deployment lifecycle commands:

```shell
python swiftdeploy init
python swiftdeploy validate
python swiftdeploy deploy
python swiftdeploy promote canary
python swiftdeploy promote stable
python swiftdeploy teardown
python swiftdeploy teardown --clean
```

### Init

The `init` command reads `manifest.yaml` and generates:

- `nginx.conf`
- `docker-compose.yml`

### Validate

The `validate` command performs pre-flight checks before deployment. It checks that:

- `manifest.yaml` exists and is valid YAML
- required fields are present
- the Docker image exists locally
- the Nginx port is free
- the `nginx.conf` syntax is valid

### Deploy

The `deploy` command runs `init`, starts the stack, and waits until the health check passes.

### Promote

The `promote canary` command switches the application to canary mode. The `promote stable` command switches it back to stable mode. The mode is updated in `manifest.yaml`, the Docker Compose file is regenerated, and only the service container is restarted.

## Observability: Adding /metrics

For Stage 4B, I added a `/metrics` endpoint in Prometheus text format. The API service tracks:

- `http_requests_total`
- `http_request_duration_seconds`
- `app_uptime_seconds`
- `app_mode`
- `chaos_active`

These metrics cover throughput, errors, latency, uptime, the current mode, and the active chaos state.

The metrics endpoint can be tested with:

```shell
curl http://localhost:8080/metrics
```

This gives SwiftDeploy visibility into the running service.

## The Guardrails: OPA Policy Enforcement

Stage 4B also introduced Open Policy Agent (OPA). OPA acts as the policy decision engine. The CLI does not make the allow-or-deny decision itself. Instead, SwiftDeploy collects data and sends it to OPA. OPA evaluates the policy and returns a decision with reasons.
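Under the hood, this exchange is a plain HTTP call to OPA's Data API (`POST /v1/data/<policy path>` with an `input` document). A minimal sketch of how the CLI might ask for an infrastructure decision, assuming the Rego package is named `swiftdeploy.infrastructure` and returns `allow` and `reasons` fields (both are illustrative guesses, not confirmed from the repo):

```python
import json
import urllib.request

def ask_opa(opa_url: str, policy_path: str, input_doc: dict) -> dict:
    """POST an input document to OPA's Data API and return the decision result."""
    # OPA expects the data wrapped in an {"input": ...} envelope.
    body = json.dumps({"input": input_doc}).encode()
    req = urllib.request.Request(
        f"{opa_url}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # OPA wraps the policy's output under "result".
        return json.load(resp).get("result", {})

# Example usage (requires a running OPA at localhost:8181 with the policy loaded):
# decision = ask_opa("http://localhost:8181", "swiftdeploy/infrastructure",
#                    {"disk_free_gb": 25, "cpu_load": 0.8})
# if not decision.get("allow"):
#     print("Deployment blocked:", decision.get("reasons", []))
```

The CLI stays a thin messenger: it gathers facts, ships them to OPA, and acts on whatever decision comes back.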
The policies are stored in the `policies/` directory:

```text
policies/
├── infrastructure.rego
└── canary.rego
```

### Infrastructure Policy

The infrastructure policy answers this question: is the host safe enough for deployment?

It blocks deployment if:

- disk free space is below 10 GB
- CPU load is above 2.0

This protects the deployment from running on an unhealthy host.

### Canary Safety Policy

The canary policy answers this question: is the canary safe enough to promote?

It blocks promotion if:

- the error rate is above 1%
- the P99 latency is above 500 ms

Before promotion, SwiftDeploy scrapes `/metrics`, calculates the error rate and P99 latency, and sends the result to OPA.

### Why Policy Isolation Matters

Each policy domain owns one responsibility. The infrastructure policy only checks host safety. The canary policy only checks canary safety.

This separation is important because a change in one policy should not require changing another. It also makes the system easier to debug, because each policy gives its own decision and reason. Example:

- Infrastructure policy: PASS
- Canary policy: FAIL

This makes it clear which part of the deployment process is unsafe.

## Gated Deploy and Promote

The deploy flow now works like this:

```text
swiftdeploy deploy
        |
        v
start OPA
        |
        v
collect disk, CPU, and memory data
        |
        v
ask OPA infrastructure policy
        |
        v
deploy only if allowed
```

The promote flow works like this:

```text
swiftdeploy promote canary
        |
        v
start OPA
        |
        v
scrape /metrics
        |
        v
calculate error rate and P99 latency
        |
        v
ask OPA canary policy
        |
        v
promote only if allowed
```

OPA decisions include reasons, not just true or false. This is useful because the operator can understand why an operation passed or failed.

## The Chaos: Testing Slow and Error States

The application has a `/chaos` endpoint that works in canary mode. It supports slow mode:

```json
{ "mode": "slow", "duration": 3 }
```

It also supports error mode:

```json
{ "mode": "error", "rate": 0.5 }
```

Slow mode delays responses. Error mode causes some requests to return 500 errors.
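Stripped of any web framework, the two chaos modes boil down to a small wrapper around the request handler. A framework-agnostic sketch (the state dictionary and function names are hypothetical, not the actual app code):

```python
import random
import time

# Hypothetical in-memory chaos state, set by POST /chaos and cleared on reset.
chaos_state = {"mode": None, "duration": 0, "rate": 0.0}

def apply_chaos(handler):
    """Wrap a request handler with the currently configured chaos behaviour."""
    def wrapped(*args, **kwargs):
        if chaos_state["mode"] == "slow":
            # Slow mode: delay every response by the configured duration.
            time.sleep(chaos_state["duration"])
        elif chaos_state["mode"] == "error":
            # Error mode: fail a random fraction of requests with a 500.
            if random.random() < chaos_state["rate"]:
                return 500, "chaos: injected error"
        return handler(*args, **kwargs)
    return wrapped

@apply_chaos
def handle_request():
    return 200, "ok"
```

With `rate: 0.5`, roughly half the requests short-circuit into a 500 before the real handler runs, which is exactly the signal the canary policy needs to see.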
This allows the canary safety policy to be tested under unhealthy conditions. Example request:

```shell
curl -X POST http://localhost:8080/chaos \
  -H "Content-Type: application/json" \
  -d "{\"mode\":\"error\",\"rate\":0.5}"
```

After injecting chaos, the `/metrics` endpoint records the request behaviour. The `swiftdeploy status` command then displays the current metrics and policy compliance. Example status view:

```text
python swiftdeploy status

Requests: 20
Error Rate: 5.0%
P99 Latency: 700ms
Policy Compliance:
- Infrastructure: PASS
- Canary: FAIL
```

This shows how SwiftDeploy can detect an unhealthy canary instead of promoting it blindly.

## Status Command

I added:

```shell
python swiftdeploy status
```

The status command scrapes `/metrics`, calculates request statistics, checks policy compliance, and appends every scrape to `history.jsonl`. This creates a simple audit trail of what happened over time.

## Audit Command

I also added:

```shell
python swiftdeploy audit
```

This command reads `history.jsonl` and generates `audit_report.md`. The audit report contains:

- the deployment timeline
- mode changes
- policy checks
- policy violations

This gives SwiftDeploy memory. Instead of only showing the current state, it records what happened during deployment, promotion, chaos testing, and policy checks.

## Repository Structure

The project structure is:

```text
manifest.yaml
swiftdeploy
Dockerfile
README.md
app/
templates/
policies/
```

Generated files are created in the root folder:

- `nginx.conf`
- `docker-compose.yml`

Audit files include:

- `history.jsonl`
- `audit_report.md`

## How to Run the Project

Build the Docker image:

```shell
docker build -t 10johnny-swiftdeploy-stage4b:latest .
```
Generate configuration files:

```shell
python swiftdeploy init
```

Deploy the stack:

```shell
python swiftdeploy deploy
```

Check health:

```shell
curl http://localhost:8080/healthz
```

Check metrics:

```shell
curl http://localhost:8080/metrics
```

Promote to canary:

```shell
python swiftdeploy promote canary
```

Run status:

```shell
python swiftdeploy status
```

Generate the audit report:

```shell
python swiftdeploy audit
```

Tear down:

```shell
python swiftdeploy teardown --clean
```

## Lessons Learned

This project helped me understand how deployment tools work beyond simply starting containers. I learned how to:

- generate infrastructure files from templates
- use Docker Compose to manage multiple containers
- use Nginx as a reverse proxy
- expose Prometheus-style metrics
- use OPA for policy decisions
- separate policy logic from CLI logic
- test canary behavior with chaos injection
- generate an audit report from deployment history

The biggest lesson is that deployment automation should not only focus on speed. A good deployment tool should also provide safety, visibility, clear decision-making, and traceability.

SwiftDeploy started as a deployment automation tool, but Stage 4B made it more reliable by adding observability, policy guardrails, chaos testing, and auditing.

## GitHub Repository

GitHub Repo: https://github.com/10Johnny/swiftdeploy-stage4a
