The client had an existing network load balancer (NLB) exposed to the internet to allow traffic for various applications. While the NLB fulfilled its function, it posed a significant security risk due to its direct public accessibility without a firewall. Additionally, the NLB was integrated with Kubernetes ingress, which further complicated the setup. The client needed a secure, scalable solution that would still allow traffic to reach the required services but with enhanced protection and visibility.
Solution
To resolve the security vulnerabilities, we designed a solution using Terraform to implement an Application Load Balancer (ALB) combined with AWS Web Application Firewall (WAF). The key aspects of this solution include:
ALB Targeting the Private NLB:
One of the primary steps was making the NLB private, ensuring it was no longer directly accessible from the internet. Because an ALB target group cannot reference an NLB directly, we registered the private IP addresses of the NLB's network interfaces as IP targets instead. These addresses (one per enabled Availability Zone) stay fixed for the lifetime of the NLB, allowing us to route traffic through the ALB while keeping the NLB protected within a private network.
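As a rough illustration of the IP lookup described above, the sketch below filters network interfaces by the "ELB net/<name>/<id>" description that AWS assigns to NLB interfaces and builds the target list for an ALB target group of type ip. The sample data, ARN, and helper names are hypothetical; in practice the interface list would come from an EC2 DescribeNetworkInterfaces call or a Terraform data source.

```python
def nlb_private_ips(nlb_arn, network_interfaces):
    """Return the private IPs of the network interfaces that belong to one NLB."""
    # The ARN ends in "loadbalancer/net/<name>/<id>"; the matching interfaces
    # carry the description "ELB net/<name>/<id>".
    suffix = nlb_arn.split(":loadbalancer/")[1]
    wanted = f"ELB {suffix}"
    return sorted(
        eni["PrivateIpAddress"]
        for eni in network_interfaces
        if eni.get("Description") == wanted
    )


def ip_targets(ips, port):
    """Build the Targets list for an ALB target group with target type 'ip'."""
    return [{"Id": ip, "Port": port} for ip in ips]


# Hypothetical sample shaped like the "NetworkInterfaces" list returned by
# EC2 DescribeNetworkInterfaces.
SAMPLE_ENIS = [
    {"Description": "ELB net/internal-nlb/0123456789abcdef", "PrivateIpAddress": "10.0.2.15"},
    {"Description": "ELB net/internal-nlb/0123456789abcdef", "PrivateIpAddress": "10.0.1.20"},
    {"Description": "ELB app/public-alb/fedcba9876543210", "PrivateIpAddress": "10.0.1.99"},
]
SAMPLE_NLB_ARN = (
    "arn:aws:elasticloadbalancing:eu-west-1:111122223333:"
    "loadbalancer/net/internal-nlb/0123456789abcdef"
)

targets = ip_targets(nlb_private_ips(SAMPLE_NLB_ARN, SAMPLE_ENIS), 443)
print(targets)
```

The resulting list has the shape expected by the ELBv2 RegisterTargets operation; port 443 is an assumption, not a value from the project.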
WAF Security Rules:
We attached a WAF to the ALB to provide a robust security layer. The WAF was configured with the following rules:
AWS Common Rule Set: A baseline security layer protecting against common web exploits such as cross-site scripting (XSS), local and remote file inclusion, and oversized requests.
AWS IP Reputation Lists: Blocking malicious IP addresses known for engaging in abusive or suspicious behavior.
GeoBlocking: Denying traffic from specific countries where the client did not conduct business, reducing unwanted exposure.
Rate Limiting: Limiting the number of requests from a single IP address to mitigate brute force attacks and distributed denial-of-service (DDoS) attempts.
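The four rules above could be expressed roughly as follows. This is a sketch of the rule list in the JSON shape used by the WAFv2 API, not the project's Terraform; the rule names, the country code, and the limit of 2000 requests are placeholders, since the text does not give the client's actual values.

```python
def visibility(metric_name):
    """Metrics and sampling settings that every WAFv2 rule must carry."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def managed_group_rule(name, priority, group_name):
    """Reference an AWS managed rule group, keeping the group's own actions."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": group_name}
        },
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility(name),
    }


def geo_block_rule(priority, country_codes):
    """Block requests originating from the listed ISO 3166-1 alpha-2 countries."""
    return {
        "Name": "geo-block",
        "Priority": priority,
        "Statement": {"GeoMatchStatement": {"CountryCodes": list(country_codes)}},
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("geo-block"),
    }


def rate_limit_rule(priority, limit):
    """Block a source IP once it exceeds `limit` requests in the evaluation window."""
    return {
        "Name": "rate-limit-per-ip",
        "Priority": priority,
        "Statement": {"RateBasedStatement": {"Limit": limit, "AggregateKeyType": "IP"}},
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("rate-limit-per-ip"),
    }


def build_rules(blocked_countries, rate_limit):
    """Assemble the four rules in evaluation order (lower priority runs first)."""
    return [
        managed_group_rule("aws-common-rules", 0, "AWSManagedRulesCommonRuleSet"),
        managed_group_rule("aws-ip-reputation", 1, "AWSManagedRulesAmazonIpReputationList"),
        geo_block_rule(2, blocked_countries),
        rate_limit_rule(3, rate_limit),
    ]


# Placeholder inputs: "AQ" and 2000 are illustrative only.
rules = build_rules(["AQ"], 2000)
print([rule["Name"] for rule in rules])
```

Managed rule groups use OverrideAction rather than Action because the group decides the action for each of its own rules.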
Centralized Monitoring & Logging:
To enhance visibility into the traffic and security events, we enabled logging for both the ALB and WAF. These logs were stored in an S3 bucket, where they could be analyzed and transformed into actionable insights. The WAF also provided a real-time dashboard to monitor blocked traffic and other security events.
Terraform & Modularization:
The entire implementation was done using Terraform, and we modularized the ALB and WAF setup to make it easy to add, adjust, and replicate the configuration across different environments. This approach not only simplified maintenance but also allowed rapid deployment of changes when needed.
Implementation & Testing in Production
To ensure a smooth transition and minimize downtime for the client’s applications, we implemented and tested the solution in a production environment with a carefully staged rollout:
Creation of New Internal NLB:
We began by creating an entirely new internal network load balancer (NLB) while keeping the original NLB active and unchanged. New applications were set to use the new internal NLB to isolate the impact of the new configuration.
ALB and WAF Setup:
The ALB and WAF were set up to target the new internal NLB, ensuring that traffic would flow securely through the WAF before reaching the applications.
Traffic Management with Route 53:
Using Route 53, we set up two weighted records for the same hostname: one pointing to the old, internet-facing NLB and the other to the new ALB. Initially, 100% of the traffic was routed to the old NLB, so the existing flow was not disrupted.
Gradual Traffic Shift:
Throughout the week, we gradually adjusted the weights in Route 53 to increase traffic going to the ALB while reducing traffic to the old NLB. This allowed us to monitor and address any issues that arose during the transition. Any client-reported issues were resolved as they came in, ensuring that the transition was smooth.
Final Switch and Monitoring:
Once the client’s support tickets and issues subsided, we completed the traffic shift by directing 100% of the traffic to the new ALB. We continued monitoring performance closely, keeping the old NLB available as a fallback in case we needed to revert the routing.
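The weighted shift described in the steps above might look like the following sketch, which builds the change batch for two weighted alias records in the shape accepted by the Route 53 ChangeResourceRecordSets operation. The hostname, DNS names, hosted zone IDs, and the rollout percentages are all hypothetical; the text only says the weights were adjusted gradually over the week.

```python
def weighted_change_batch(record_name, old_nlb, new_alb, alb_percent):
    """Build a change batch that sends alb_percent of lookups to the new ALB."""
    if not 0 <= alb_percent <= 100:
        raise ValueError("alb_percent must be between 0 and 100")

    def change(set_id, target, weight):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "SetIdentifier": set_id,
                # Route 53 answers in proportion to weight / sum of weights.
                "Weight": weight,
                "AliasTarget": {
                    "HostedZoneId": target["zone_id"],
                    "DNSName": target["dns_name"],
                    "EvaluateTargetHealth": True,
                },
            },
        }

    return {
        "Comment": f"Shift {alb_percent}% of traffic to the ALB",
        "Changes": [
            change("old-nlb", old_nlb, 100 - alb_percent),
            change("new-alb", new_alb, alb_percent),
        ],
    }


# Hypothetical targets and rollout schedule.
OLD_NLB = {"dns_name": "old-nlb.example.elb.amazonaws.com", "zone_id": "ZEXAMPLEOLD"}
NEW_ALB = {"dns_name": "new-alb.example.elb.amazonaws.com", "zone_id": "ZEXAMPLENEW"}
ROLLOUT_STEPS = [0, 10, 25, 50, 100]

for percent in ROLLOUT_STEPS:
    batch = weighted_change_batch("app.example.com", OLD_NLB, NEW_ALB, percent)
    weights = [c["ResourceRecordSet"]["Weight"] for c in batch["Changes"]]
    print(percent, weights)
```

Keeping the old record at weight 0 rather than deleting it preserves the fallback path: reverting is a single weight change.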
Challenges and Solutions
XSS Rules Blocking Large Requests:
One issue we encountered was that, for a web ACL attached to an ALB, AWS WAF inspects only the first 8 KB of the request body. For certain endpoints with larger request bodies, this limitation caused the XSS rules to block legitimate traffic. To resolve this, we created a new rule that exempted requests to those specific endpoints from the XSS scan. At the same time, we implemented XSS protection in the application backend so that these requests were still checked for malicious input.
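One possible way to express such an exemption, which may differ from the rule actually deployed, is to set the managed body XSS rule to Count and add a custom rule that blocks on the label it emits unless the URI is on an exempt list. The endpoint prefix below is hypothetical, and the label string is an assumption that should be checked against the AWS managed rule group documentation before use.

```python
# Assumption: label added by the managed rule CrossSiteScripting_BODY.
# Verify the exact string in the AWS managed rules documentation.
XSS_BODY_LABEL = "awswaf:managed:aws:core-rule-set:CrossSiteScripting_Body"


def visibility(metric_name):
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def uri_starts_with(prefix):
    """Match requests whose URI path begins with the given prefix."""
    return {
        "ByteMatchStatement": {
            "SearchString": prefix.encode("utf-8"),
            "FieldToMatch": {"UriPath": {}},
            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            "PositionalConstraint": "STARTS_WITH",
        }
    }


def common_rules_with_xss_counted(priority):
    """Common rule set with the body XSS rule downgraded from Block to Count."""
    return {
        "Name": "aws-common-rules",
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesCommonRuleSet",
                "RuleActionOverrides": [
                    {"Name": "CrossSiteScripting_BODY", "ActionToUse": {"Count": {}}}
                ],
            }
        },
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility("aws-common-rules"),
    }


def block_xss_except(priority, exempt_prefixes):
    """Block requests labelled as body XSS unless the path is exempt."""
    if not exempt_prefixes:
        raise ValueError("at least one exempt prefix is required")
    matches = [uri_starts_with(p) for p in exempt_prefixes]
    # OrStatement needs at least two nested statements.
    exempt = matches[0] if len(matches) == 1 else {"OrStatement": {"Statements": matches}}
    return {
        "Name": "block-xss-body-except-exempt-paths",
        "Priority": priority,
        "Statement": {
            "AndStatement": {
                "Statements": [
                    {"LabelMatchStatement": {"Scope": "LABEL", "Key": XSS_BODY_LABEL}},
                    {"NotStatement": {"Statement": exempt}},
                ]
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("block-xss-body-except-exempt-paths"),
    }


# Hypothetical endpoint that accepts large request bodies.
xss_rules = [common_rules_with_xss_counted(0), block_xss_except(1, ["/api/upload"])]
print([rule["Name"] for rule in xss_rules])
```

The custom rule needs a higher priority number than the managed group so that the label already exists when it is evaluated.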
GeoBlock Exceptions:
Another challenge was the need to allow specific IPs through the GeoBlock, as some legitimate traffic was being blocked. To address this, we created an IP set that contained the approved IP addresses. Using a combination of “and” and “not” statements, we ensured that these IPs were exempt from the GeoBlock while still enforcing the geographic restrictions for all other traffic.
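The "and"/"not" combination described above can be sketched as a single WAFv2 rule: block when the request comes from a blocked country and the source IP is not in the approved IP set. The country code, IP set ARN, and CIDR ranges are placeholders; would_block is a local stand-in for the rule's logic so it can be checked without deploying anything.

```python
import ipaddress


def geo_block_with_exceptions(priority, country_codes, allowed_ip_set_arn):
    """Rule: (country is blocked) AND NOT (source IP is in the approved IP set)."""
    return {
        "Name": "geo-block-with-exceptions",
        "Priority": priority,
        "Statement": {
            "AndStatement": {
                "Statements": [
                    {"GeoMatchStatement": {"CountryCodes": list(country_codes)}},
                    {
                        "NotStatement": {
                            "Statement": {
                                "IPSetReferenceStatement": {"ARN": allowed_ip_set_arn}
                            }
                        }
                    },
                ]
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "geo-block-with-exceptions",
        },
    }


def would_block(country, client_ip, country_codes, allowed_cidrs):
    """Local model of the rule above, for reasoning about individual requests."""
    ip = ipaddress.ip_address(client_ip)
    approved = any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
    return country in country_codes and not approved


# Placeholder values: the real country list and IP set are not given in the text.
BLOCKED = {"AQ"}
APPROVED = ["203.0.113.0/24"]
IP_SET_ARN = "arn:aws:wafv2:eu-west-1:111122223333:regional/ipset/approved-ips/EXAMPLE-ID"

rule = geo_block_with_exceptions(2, sorted(BLOCKED), IP_SET_ARN)
print(would_block("AQ", "203.0.113.10", BLOCKED, APPROVED))
print(would_block("AQ", "198.51.100.7", BLOCKED, APPROVED))
```

Folding the exception into the GeoBlock rule itself, rather than adding a separate allow rule, means approved IPs are still evaluated by every other rule in the web ACL.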
Visibility of Web Traffic and Real-Time Dashboards
The product integrates AWS WAF with real-time dashboards that provide actionable insights into traffic patterns and detected threats. These dashboards allow users to filter and analyze collected logs using multiple criteria, including:
Web ACL: Identify which Web ACL triggered the rules and how it impacted traffic flow.
WAF Rule: Analyze specific rules to see how they performed, including block, allow, and count actions.
URI: Monitor traffic to specific endpoints and detect anomalies in access patterns.
HTTP Method: Track methods like GET, POST, DELETE to identify unusual activity patterns.
IP Address: Analyze requests from specific IPs, including identifying repeated offenders.
Region and Country: Gain geographic insights into traffic, including potential GeoBlocked traffic.
Log Fields: Filter and search logs for specific fields such as headers, request IDs, or custom metadata.
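To make the filter criteria above concrete, here is a small sketch that applies them to raw AWS WAF log lines (one JSON object per line). The field names follow the standard WAF log format; the sample records and the function name are made up for illustration and are not part of the product.

```python
import json


def filter_waf_logs(lines, action=None, rule_id=None, uri_prefix=None,
                    method=None, client_ip=None, country=None):
    """Yield parsed log records that satisfy every criterion that is not None."""
    for line in lines:
        record = json.loads(line)
        request = record.get("httpRequest", {})
        checks = [
            action is None or record.get("action") == action,
            rule_id is None or record.get("terminatingRuleId") == rule_id,
            uri_prefix is None or request.get("uri", "").startswith(uri_prefix),
            method is None or request.get("httpMethod") == method,
            client_ip is None or request.get("clientIp") == client_ip,
            country is None or request.get("country") == country,
        ]
        if all(checks):
            yield record


# Made-up sample records using the standard WAF log field names.
SAMPLE_LINES = [
    json.dumps({"action": "BLOCK", "terminatingRuleId": "geo-block",
                "httpRequest": {"clientIp": "198.51.100.7", "country": "AQ",
                                "uri": "/login", "httpMethod": "POST"}}),
    json.dumps({"action": "ALLOW", "terminatingRuleId": "Default_Action",
                "httpRequest": {"clientIp": "203.0.113.10", "country": "NL",
                                "uri": "/api/items", "httpMethod": "GET"}}),
    json.dumps({"action": "BLOCK", "terminatingRuleId": "rate-limit-per-ip",
                "httpRequest": {"clientIp": "198.51.100.7", "country": "AQ",
                                "uri": "/api/items", "httpMethod": "GET"}}),
]

blocked_api = list(filter_waf_logs(SAMPLE_LINES, action="BLOCK", uri_prefix="/api/"))
print(len(blocked_api))
```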
Dashboard Capabilities
Real-Time Monitoring:
Dashboards display live data on traffic volume, rule triggers, and blocked threats. Users can view metrics like request count, latency, and error rates.
Custom Filters:
Users can create and save custom filters to focus on specific threats or traffic patterns. Predefined filters include commonly used fields such as IP Address, Country, and WAF Rule.
Integration with Visualization Tools:
Logs are centralized in Amazon S3 and analyzed using Amazon Athena, with results visualized in tools like Amazon QuickSight or Kibana. This enables dynamic dashboards with drill-down capabilities to investigate specific incidents.
Proactive Alerts:
Dashboards are integrated with CloudWatch Alarms to send notifications for anomalous activity, such as spikes in blocked requests or rule violations.
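An alert on a spike in blocked requests could be defined along these lines. The sketch builds the arguments for the CloudWatch PutMetricAlarm operation against the AWS/WAFV2 BlockedRequests metric; the web ACL name, threshold, period, and SNS topic ARN are assumptions chosen for illustration.

```python
def blocked_requests_alarm(web_acl_name, region, threshold, sns_topic_arn):
    """Arguments for an alarm that fires when blocked requests exceed a threshold."""
    return {
        "AlarmName": f"{web_acl_name}-blocked-requests-spike",
        "Namespace": "AWS/WAFV2",
        "MetricName": "BlockedRequests",
        "Dimensions": [
            {"Name": "WebACL", "Value": web_acl_name},
            {"Name": "Region", "Value": region},
            # "ALL" aggregates across every rule in the web ACL.
            {"Name": "Rule", "Value": "ALL"},
        ],
        "Statistic": "Sum",
        "Period": 300,             # seconds per datapoint
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        # Periods without blocked requests publish no datapoint; treat them as healthy.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical names and threshold.
alarm = blocked_requests_alarm(
    "prod-web-acl",
    "eu-west-1",
    500,
    "arn:aws:sns:eu-west-1:111122223333:waf-alerts",
)
print(alarm["AlarmName"])
```

With boto3 this dictionary could be passed as keyword arguments to the CloudWatch client's put_metric_alarm call.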
Centralized Logging for AWS WAF
This guide provides step-by-step instructions for centralizing AWS WAF logs from multiple AWS accounts so they can be viewed in a single dashboard. The solution uses Amazon S3, Amazon CloudWatch, and Amazon Athena for log collection, analysis, and display.
1. Prerequisites
An Amazon S3 bucket configured in a logging account to store logs from multiple AWS accounts.
Appropriate IAM roles in the source accounts to allow cross-account logging.
Amazon CloudWatch Log Groups configured in each account for real-time log streaming.
Optional: A BI tool like Amazon QuickSight or external tools for visualization.
2. Configure Centralized Logging
Step 1: Set Up an S3 Bucket for Centralized Logging
Create an S3 bucket in the centralized logging account. AWS WAF requires the bucket name to start with aws-waf-logs-.
Enable server-side encryption (SSE-S3 or SSE-KMS).
Configure a bucket policy to allow logs to be written from other AWS accounts.
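A bucket policy for cross-account delivery might look like the sketch below. It follows the general pattern AWS documents for log delivery to S3 (the delivery.logs.amazonaws.com service principal, restricted to the listed source accounts); the bucket name and account IDs are placeholders, and a production policy may need further conditions such as aws:SourceArn.

```python
import json


def waf_log_bucket_policy(bucket_name, source_account_ids):
    """Bucket policy letting the log delivery service write WAF logs for each account."""
    if not bucket_name.startswith("aws-waf-logs-"):
        raise ValueError("WAF log bucket names must start with 'aws-waf-logs-'")
    accounts = list(source_account_ids)
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    principal = {"Service": "delivery.logs.amazonaws.com"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AWSLogDeliveryWrite",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:PutObject",
                # One object prefix per source account.
                "Resource": [f"{bucket_arn}/AWSLogs/{acct}/*" for acct in accounts],
                "Condition": {
                    "StringEquals": {
                        "s3:x-amz-acl": "bucket-owner-full-control",
                        "aws:SourceAccount": accounts,
                    }
                },
            },
            {
                "Sid": "AWSLogDeliveryAclCheck",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetBucketAcl",
                "Resource": bucket_arn,
                "Condition": {"StringEquals": {"aws:SourceAccount": accounts}},
            },
        ],
    }


# Placeholder bucket name and account IDs.
policy = waf_log_bucket_policy(
    "aws-waf-logs-central-example", ["111122223333", "444455556666"]
)
print(json.dumps(policy, indent=2))
```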
Step 2: Enable AWS WAF Logging
In each source account, go to the AWS WAF Console → Web ACL → Logging and Metrics.
Add the centralized S3 bucket as the logging destination.
Step 3: Stream Logs to CloudWatch (Optional)
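AWS WAF can deliver logs directly to a CloudWatch Logs log group or an Amazon Kinesis Data Firehose delivery stream instead of S3, but each web ACL has a single logging destination. Firehose neither reads from S3 nor writes to CloudWatch Logs, so to copy S3-delivered logs into CloudWatch Log Groups for further analysis, use a separate process such as an AWS Lambda function triggered by S3 object-created events.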
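3. Analyze Logs Using Amazon Athena
Configure AWS Glue (for example, a crawler or a table definition) to catalog the WAF logs in the centralized bucket. S3 server access logs are only needed if you also want to audit access to the bucket itself.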
Use Athena to query the WAF logs across accounts with a predefined schema.
Example query for blocked requests:
SELECT action, httpRequest.clientIp AS client_ip, COUNT(*) AS request_count
FROM waf_logs
WHERE action = 'BLOCK'
GROUP BY action, httpRequest.clientIp
ORDER BY request_count DESC;
4. Visualize Logs with Amazon QuickSight
Integrate QuickSight with Athena to create near-real-time dashboards displaying metrics like:
Total blocked requests by IP.
Geo-location insights for traffic sources.
Trends for request volume and threat activity.
5. Managing Multi-Account Access
Use AWS Organizations or set up cross-account IAM roles to automate the log centralization process.
Automate logging configuration using Terraform or AWS CloudFormation for consistency.
Results
Enhanced Security: By routing traffic through the ALB with WAF protection and making the NLB private, we effectively shielded it from public access while implementing advanced security controls.
Scalable & Flexible Solution: The use of Terraform allowed for seamless provisioning and scaling of the ALB and WAF, ensuring the infrastructure could adapt to growing demands. Modularizing the configuration also made it easy to replicate and adjust for different environments.
Actionable Insights: The combination of S3 logging and the WAF dashboard gave the client clear visibility into the traffic reaching their applications, including threats that were being blocked, allowing them to respond proactively to potential security incidents.
Zero Downtime Transition: Through careful planning and the staged rollout, we were able to implement the new security architecture with minimal impact on the client’s services, ensuring that the transition was seamless.
False-Positive Mitigation Documentation
This document outlines the steps and mechanisms for addressing false positives flagged by AWS WAF rules within the product. The product automatically minimizes false positives while offering customers the tools and guidance to handle them effectively.
1. Automatic Analysis of False Positives
Dynamic Rule Refinement: If legitimate traffic consistently triggers specific WAF rules (e.g., SQL injection or XSS rules), the system dynamically adjusts the rule sensitivity or adds specific exclusions for the identified request patterns.
IP Whitelisting: Trusted IPs or sources identified through repeated legitimate access are added to an allow list. These IPs bypass rules like rate limiting and GeoBlocking to ensure smooth operation for trusted users.
2. Customer Handling of False Positives
Retrieve WAF Logs: Access flagged requests via the WAF logs stored in an Amazon S3 bucket, or view them in near real time in Amazon CloudWatch.
Modify Rule Sensitivity: Adjust rule settings for specific scenarios, such as increasing request size thresholds or excluding specific endpoints from inspection.
Whitelist Legitimate Traffic: Use IP sets to allow trusted IPs or source regions to bypass the offending rule.
Rule Exclusions for Specific Paths: Add exclusions to rules for certain URL paths or request patterns that generate false positives, using WAF’s Custom Rule Builder.
3. Support Paths for False Positives
Knowledge Base: Access to a detailed guide on analyzing flagged requests, modifying rules, and addressing common scenarios.
Direct Support: Customers can report complex false positives through the support portal or email. Support engineers analyze the traffic and recommend tailored adjustments to the WAF rules.
4. Proactive Monitoring to Minimize False Positives
Dashboards and Alerts: The product integrates with CloudWatch and provides real-time dashboards to monitor flagged traffic. These dashboards include insights into rule triggers, traffic patterns, and flagged requests.
Continuous Improvement: WAF rules are periodically reviewed and updated to align with emerging traffic patterns and customer feedback.
5. Examples of Common False-Positive Scenarios
Large API Payloads: Problem: Large POST requests exceed the 8 KB of request body that AWS WAF inspects on an ALB. Solution: Exclude these endpoints from body inspection and validate them in the application layer.
Legitimate Traffic from GeoBlocked Regions: Problem: Certain users in GeoBlocked regions require access. Solution: Whitelist IPs for these users while maintaining GeoBlocking for other traffic.
Conclusion
By leveraging AWS ALB and WAF, along with making the NLB private, we were able to secure the client’s infrastructure, reduce their exposure to security risks, and provide them with greater control and insight into their traffic. The modular approach using Terraform ensured that the solution was not only secure but also adaptable and scalable for the client’s future needs. The phased implementation strategy also ensured zero downtime for the client’s production environment, providing confidence and stability throughout the transition.