Distributed Email Infrastructure
A horizontally scalable email system handling 50,000+ emails/day across 21 nodes with sub-5-second failover.
Overview
Designed and built a production-grade distributed email infrastructure from the ground up. The system replaced a fragile, single-point-of-failure legacy setup with a stateless, horizontally scalable architecture across 21+ nodes. Every node can be fully rebuilt in ~2 hours, deployments use canary rollouts (25% → 100%), and the entire stack is observable via centralized logging and metrics.
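The canary rollout pattern (25% → 100%) can be sketched roughly as follows. This is a minimal illustration, not the project's actual deployment tooling: the function names `canary_waves` and `rollout`, the `deploy` and `healthy` callbacks, and the node names are all hypothetical, and the sketch assumes the percentages refer to the share of nodes updated in each wave.

```python
import math
from typing import Callable, List, Sequence


def canary_waves(nodes: Sequence[str],
                 fractions: Sequence[float] = (0.25, 1.0)) -> List[List[str]]:
    """Split nodes into rollout waves; each fraction is a cumulative target."""
    waves: List[List[str]] = []
    done = 0
    for frac in fractions:
        # Round up so even a small cluster gets at least one canary node.
        target = min(len(nodes), math.ceil(len(nodes) * frac))
        if target > done:
            waves.append(list(nodes[done:target]))
            done = target
    return waves


def rollout(nodes: Sequence[str],
            deploy: Callable[[str], None],
            healthy: Callable[[str], bool],
            fractions: Sequence[float] = (0.25, 1.0)) -> bool:
    """Deploy wave by wave; stop before the next wave if any node is unhealthy."""
    for wave in canary_waves(nodes, fractions):
        for node in wave:
            deploy(node)
        if not all(healthy(node) for node in wave):
            return False  # halt: the remaining nodes keep the old version
    return True


# Hypothetical node names for a 21-node cluster.
nodes = [f"smtp-{i:02d}" for i in range(1, 22)]
waves = canary_waves(nodes)
print([len(w) for w in waves])
```

With 21 nodes, the first wave covers 6 nodes and the second covers the remaining 15. Stopping after an unhealthy canary wave limits a bad release to roughly a quarter of the cluster.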
Challenges
1. Migrating from CentOS 7 to Ubuntu 22.04 with zero-downtime email delivery across all nodes
2. Implementing SASL + TLS hardening across all 21 SMTP nodes simultaneously
3. Achieving sub-5-second failover using HAProxy with fine-tuned health checks
4. Building centralized log aggregation across a geographically distributed cluster
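To make the failover target concrete, the sketch below estimates how long a load balancer takes to notice a dead backend from its health-check settings. It uses a deliberately simple approximation: a node that dies just after a passing check is marked down after `fall` consecutive failed checks spaced `inter` apart, plus one check timeout if the last probe hangs. The parameter names mirror HAProxy's `inter`, `fall`, and `timeout check` settings, but the values shown are illustrative assumptions, not the project's actual configuration, and the function names are hypothetical.

```python
def worst_case_detection_ms(inter_ms: int, fall: int, check_timeout_ms: int) -> int:
    """Approximate worst-case time to mark a failed backend as down.

    Assumes the node fails immediately after a successful check, so the
    fall-th failing check starts at fall * inter_ms and may take up to
    check_timeout_ms to be declared failed.
    """
    if inter_ms <= 0 or fall <= 0 or check_timeout_ms < 0:
        raise ValueError("inter_ms and fall must be positive; timeout non-negative")
    return fall * inter_ms + check_timeout_ms


def fits_budget(inter_ms: int, fall: int, check_timeout_ms: int, budget_ms: int) -> bool:
    """True if the approximate worst case stays strictly under the failover budget."""
    return worst_case_detection_ms(inter_ms, fall, check_timeout_ms) < budget_ms


# Illustrative values only: 1 s interval, 3 failures, 1 s check timeout.
print(worst_case_detection_ms(1000, 3, 1000))  # 4000
print(fits_budget(1000, 3, 1000, 5000))        # True
print(fits_budget(2000, 3, 1000, 5000))        # False
```

Shorter intervals and lower failure counts reduce detection time but raise the risk of marking a healthy node down on a transient blip, which is the trade-off that tuning the health checks has to balance.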
Future Improvements
- Add ML-based anomaly detection for proactive spam/delivery issue alerts
- Implement geo-distributed nodes for globally low-latency delivery
- Add automated DKIM key rotation and SPF/DMARC record management tooling