Client Challenge
The client had an existing network load balancer (NLB) exposed to the internet to allow traffic for various applications. While the NLB fulfilled its function, it posed a significant security risk due to its direct public accessibility without a firewall. Additionally, the NLB was integrated with Kubernetes ingress, which further complicated the setup. The client needed a secure, scalable solution that would still allow traffic to reach the required services but with enhanced protection and visibility.
Solution
To close these gaps, we used Terraform to place an Application Load Balancer (ALB) protected by AWS WAF (Web Application Firewall) in front of the NLB. AWS WAF cannot be associated with an NLB directly, which is why the ALB was introduced. The key aspects of this solution include:
- ALB Targeting the Private NLB:
One of the primary steps was making the NLB private so that it was no longer directly reachable from the internet. An ALB cannot register an NLB as a target, so we pointed an IP-type target group at the private IPs of the NLB's network interfaces. These IPs stay fixed for the lifetime of the NLB as long as its subnets are unchanged, which let us route traffic through the ALB while keeping the NLB inside the private network.
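The lookup of the NLB's private interface IPs can be sketched as follows. This is a minimal illustration in Python rather than the Terraform used in the project. It assumes the response shape of the EC2 `DescribeNetworkInterfaces` API and the convention that NLB-owned interfaces carry a description of the form `ELB net/<name>/<id>`. The function name `nlb_private_ips`, the load balancer names, and the sample addresses are hypothetical.

```python
def nlb_private_ips(describe_response, nlb_name):
    """Return the sorted private IPs of the interfaces owned by the named NLB.

    `describe_response` is assumed to look like the output of the EC2
    DescribeNetworkInterfaces API: {"NetworkInterfaces": [{...}, ...]}.
    """
    # Assumption: NLB interfaces are described as "ELB net/<name>/<id>".
    prefix = f"ELB net/{nlb_name}/"
    ips = [
        eni["PrivateIpAddress"]
        for eni in describe_response.get("NetworkInterfaces", [])
        if eni.get("Description", "").startswith(prefix)
    ]
    return sorted(ips)


# Hypothetical sample data, for illustration only.
sample = {
    "NetworkInterfaces": [
        {"Description": "ELB net/internal-nlb/0123456789abcdef",
         "PrivateIpAddress": "10.0.2.15"},
        {"Description": "ELB net/internal-nlb/0123456789abcdef",
         "PrivateIpAddress": "10.0.1.20"},
        {"Description": "ELB app/public-alb/fedcba9876543210",
         "PrivateIpAddress": "10.0.3.7"},
    ]
}

targets = nlb_private_ips(sample, "internal-nlb")
print(targets)
```

Each returned address would then be registered in the ALB's IP-type target group; the ALB's own interface in the sample is ignored because its description does not match the NLB prefix.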
- WAF Security Rules:
We attached a WAF to the ALB to provide a robust security layer. The WAF was configured with the following rules:
- AWS Common Rule Set: A baseline security layer protecting against common vulnerabilities like SQL injection and cross-site scripting (XSS).
- AWS IP Reputation Lists: Blocking malicious IP addresses known for engaging in abusive or suspicious behavior.
- GeoBlocking: Denying traffic from specific countries where the client did not conduct business, reducing unwanted exposure.
- Rate Limiting: Limiting the number of requests from a single IP address to mitigate brute-force attacks and HTTP floods, including distributed denial-of-service (DDoS) attempts.
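As a rough sketch of how these four rules fit together, the Python below builds them as plain dictionaries in the JSON shape used by the AWS WAFv2 API. The project itself defined them in Terraform; this only illustrates the structure. The managed rule group names are the AWS-published ones, while the rule priorities, the country code `AQ`, and the limit of 1000 requests are placeholder assumptions, not the client's actual values.

```python
def visibility(metric_name):
    """Visibility settings required on every WAFv2 rule."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def managed_group_rule(name, priority):
    """Reference an AWS managed rule group and keep its own rule actions."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        # Rule groups take OverrideAction rather than Action.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility(name),
    }


def geo_block_rule(country_codes, priority):
    """Block requests that originate from the listed countries."""
    return {
        "Name": "geo-block",
        "Priority": priority,
        "Statement": {"GeoMatchStatement": {"CountryCodes": list(country_codes)}},
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("geo-block"),
    }


def rate_limit_rule(limit, priority):
    """Block an IP once it exceeds `limit` requests in the evaluation window."""
    return {
        "Name": "rate-limit",
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {"Limit": limit, "AggregateKeyType": "IP"}
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("rate-limit"),
    }


def build_rules(blocked_countries, rate_limit):
    # Assumed ordering: managed groups first, then geo block, then rate limit.
    return [
        managed_group_rule("AWSManagedRulesCommonRuleSet", 0),
        managed_group_rule("AWSManagedRulesAmazonIpReputationList", 1),
        geo_block_rule(blocked_countries, 2),
        rate_limit_rule(rate_limit, 3),
    ]


# Placeholder inputs, not the client's real configuration.
rules = build_rules(["AQ"], 1000)
```

The managed rule groups use `OverrideAction` because their individual rules already carry their own actions, whereas the custom geo and rate rules state an explicit `Block` action.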
- Centralized Monitoring & Logging:
To enhance visibility into the traffic and security events, we enabled logging for both the ALB and WAF. These logs were stored in an S3 bucket, where they could be analyzed and transformed into actionable insights. The WAF also provided a real-time dashboard to monitor blocked traffic and other security events.
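One simple way to turn the stored WAF logs into a summary is to count blocked requests per terminating rule. The sketch below assumes each log record is one JSON object per line with `action` and `terminatingRuleId` fields, as in AWS WAF's log format. Downloading and decompressing the objects from S3 is left out, and the sample records and the name `blocked_by_rule` are made up for illustration.

```python
import json
from collections import Counter


def blocked_by_rule(log_lines):
    """Count BLOCK decisions per terminating rule across WAF log lines."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get("action") == "BLOCK":
            counts[record.get("terminatingRuleId", "unknown")] += 1
    return counts


# Hypothetical log records, for illustration only.
sample_lines = [
    '{"action": "BLOCK", "terminatingRuleId": "geo-block"}',
    '{"action": "ALLOW", "terminatingRuleId": "Default_Action"}',
    '{"action": "BLOCK", "terminatingRuleId": "rate-limit"}',
    '{"action": "BLOCK", "terminatingRuleId": "geo-block"}',
    "",
]

summary = blocked_by_rule(sample_lines)
for rule_id, count in summary.most_common():
    print(rule_id, count)
```

The same loop could group by other fields of the record, such as the client country or URI, depending on which view of the traffic is needed.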
- Terraform & Modularization:
The entire implementation was done using Terraform, and we modularized the ALB and WAF setup to make it easy to add, adjust, and replicate the configuration across different environments. This approach not only simplified maintenance but also allowed rapid deployment of changes when needed.
Implementation & Testing in Production
To ensure a smooth transition and minimize downtime for the client’s applications, we implemented and tested the solution in a production environment with a carefully staged rollout:
- Creation of New Internal NLB:
We began by creating an entirely new internal network load balancer (NLB) while keeping the original NLB active and unchanged. New applications were set to use the new internal NLB to isolate the impact of the new configuration.
- ALB and WAF Setup:
The ALB and WAF were set up to target the new internal NLB, ensuring that traffic would flow securely through the WAF before reaching the applications.
- Traffic Management with Route 53:
Using Route 53, we created two weighted records for the same name: one pointing to the old, internet-facing NLB and the other to the new ALB. Initially the old NLB's record received 100% of the traffic, so the existing flow was undisturbed.
- Gradual Traffic Shift:
Over the course of a week, we gradually adjusted the weights in Route 53 to increase the share of traffic going to the ALB while reducing the share going to the old NLB. This let us monitor the new path under real load and address issues while only part of the traffic was affected. Client-reported issues were resolved as they came in.
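The effect of each weight change can be worked out with a few lines of arithmetic. With Route 53 weighted routing, a record receives its weight divided by the sum of all weights in the group, and weights are integers from 0 to 255. The schedule below is a hypothetical example and not the one used in this rollout; the record names `old-nlb` and `new-alb` are placeholders.

```python
def traffic_share(weights):
    """Return the fraction of traffic each weighted record receives."""
    for name, weight in weights.items():
        if not isinstance(weight, int) or not 0 <= weight <= 255:
            raise ValueError(f"weight for {name!r} must be an integer 0-255")
    total = sum(weights.values())
    if total == 0:
        # Route 53 treats an all-zero group as equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical shift schedule, one entry per step of the rollout.
schedule = [
    {"old-nlb": 100, "new-alb": 0},
    {"old-nlb": 90, "new-alb": 10},
    {"old-nlb": 50, "new-alb": 50},
    {"old-nlb": 0, "new-alb": 100},
]

shares = [traffic_share(step) for step in schedule]
for step, share in zip(schedule, shares):
    print(step, "->", share)
```

Because resolvers cache answers for the record's TTL, the observed split follows a weight change only gradually, which is worth keeping in mind when reading metrics right after a step.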
- Final Switch and Monitoring:
Once the client’s support tickets and issues subsided, we completed the traffic shift by directing 100% of the traffic to the new ALB. We continued monitoring performance closely, keeping the old NLB available as a fallback in case we needed to revert the routing.
Challenges and Solutions
- XSS Rules Blocking Large Requests:
One issue we encountered was that AWS WAF inspects only the first 8 KB of the request body for web ACLs attached to an ALB. For certain endpoints with larger request bodies, this limitation caused the XSS rules to block legitimate traffic. To resolve this, we created a new rule that exempted these larger requests from the XSS scan for specific endpoints. At the same time, we implemented XSS protection in the application backend so that these requests were still checked for malicious content.
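The text does not spell out how the exemption rule was built, so the following is only one possible approach, sketched in Python using the WAFv2 JSON rule shape. It switches the core rule set's `CrossSiteScripting_BODY` rule to count mode and adds a follow-up rule that blocks requests carrying that rule's label unless the URI starts with an exempt path. The path `/api/upload` is hypothetical, and the label string is an assumption based on the naming pattern of the AWS core rule set that should be checked against the AWS documentation before use.

```python
# Assumed label emitted by the core rule set's body XSS rule; verify before use.
XSS_BODY_LABEL = "awswaf:managed:aws:core-rule-set:CrossSiteScripting_Body"


def visibility(metric_name):
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def uri_starts_with(path):
    """Match requests whose URI path begins with `path`."""
    return {
        "ByteMatchStatement": {
            "SearchString": path,
            "FieldToMatch": {"UriPath": {}},
            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            "PositionalConstraint": "STARTS_WITH",
        }
    }


def any_exempt_path(paths):
    """Match any of the exempt paths (an OR needs at least two statements)."""
    if not paths:
        raise ValueError("at least one exempt path is required")
    matches = [uri_starts_with(p) for p in paths]
    return matches[0] if len(matches) == 1 else {"OrStatement": {"Statements": matches}}


def xss_body_rules(exempt_paths, xss_body_label=XSS_BODY_LABEL):
    core = {
        "Name": "AWSManagedRulesCommonRuleSet",
        "Priority": 0,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesCommonRuleSet",
                # Count instead of block, so the decision moves to the next rule.
                "RuleActionOverrides": [
                    {"Name": "CrossSiteScripting_BODY", "ActionToUse": {"Count": {}}}
                ],
            }
        },
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility("core-rule-set"),
    }
    enforce = {
        "Name": "block-xss-body-outside-exempt-paths",
        "Priority": 1,
        "Statement": {
            "AndStatement": {
                "Statements": [
                    {"LabelMatchStatement": {"Scope": "LABEL", "Key": xss_body_label}},
                    {"NotStatement": {"Statement": any_exempt_path(exempt_paths)}},
                ]
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("xss-body-enforce"),
    }
    return [core, enforce]


xss_rules = xss_body_rules(["/api/upload"])  # hypothetical endpoint
```

Keeping the exemption in a separate label-based rule means every other rule in the core rule set still applies to the exempt endpoints.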
- GeoBlock Exceptions:
Another challenge was that the GeoBlock rule also blocked some legitimate traffic, so specific IPs had to be allowed through. To address this, we created an IP set containing the approved addresses and combined it with the geographic match using “and” and “not” statements: a request is blocked only if it comes from a restricted country and its source IP is not in the approved set. All other traffic from those countries remains blocked.
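The combined statement can be sketched as below, again as a Python dictionary in the WAFv2 JSON rule shape rather than the project's Terraform. The IP set ARN, country codes, and CIDR range are placeholders. The small `would_block` helper is a local model of the rule's logic for illustration; it is not part of any AWS API.

```python
import ipaddress


def geo_block_with_exceptions(country_codes, ip_set_arn, priority):
    """Block listed countries unless the source IP is in the approved IP set."""
    return {
        "Name": "geo-block-with-exceptions",
        "Priority": priority,
        "Statement": {
            "AndStatement": {
                "Statements": [
                    {"GeoMatchStatement": {"CountryCodes": list(country_codes)}},
                    {
                        "NotStatement": {
                            "Statement": {
                                "IPSetReferenceStatement": {"ARN": ip_set_arn}
                            }
                        }
                    },
                ]
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "geo-block-with-exceptions",
        },
    }


def would_block(client_ip, client_country, blocked_countries, allowed_cidrs):
    """Local model: blocked country AND source IP not in the approved ranges."""
    ip = ipaddress.ip_address(client_ip)
    approved = any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
    return client_country in blocked_countries and not approved


# Placeholder values, not the client's configuration.
rule = geo_block_with_exceptions(
    ["AQ"],
    "arn:aws:wafv2:us-east-1:123456789012:regional/ipset/geo-allowlist/EXAMPLE-ID",
    2,
)
blocked = {"AQ"}
approved_ranges = ["203.0.113.0/24"]
print(would_block("203.0.113.10", "AQ", blocked, approved_ranges))  # approved IP
print(would_block("198.51.100.5", "AQ", blocked, approved_ranges))  # other IP
```

Putting the exception inside the block rule, rather than in a separate allow rule, means approved IPs skip only the geographic check and are still evaluated by the remaining rules.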
Results
- Enhanced Security: By routing traffic through the ALB with WAF protection and making the NLB private, we effectively shielded it from public access while implementing advanced security controls.
- Scalable & Flexible Solution: Defining the ALB and WAF in Terraform made provisioning repeatable, so the setup can be extended as demand grows. Modularizing the configuration also made it easy to replicate and adjust for different environments.
- Actionable Insights: The combination of S3 logging and the WAF dashboard gave the client clear visibility into the traffic reaching their applications, including threats that were being blocked, allowing them to respond proactively to potential security incidents.
- Zero Downtime Transition: Through careful planning and the staged rollout, we introduced the new security architecture without a service outage. The issues that did surface were resolved while traffic was still being shifted gradually.
Conclusion
By leveraging AWS ALB and WAF, along with making the NLB private, we were able to secure the client’s infrastructure, reduce their exposure to security risks, and provide them with greater control and insight into their traffic. The modular approach using Terraform ensured that the solution was not only secure but also adaptable and scalable for the client’s future needs. The phased implementation strategy also ensured zero downtime for the client’s production environment, providing confidence and stability throughout the transition.