Load Balancing Strategies: L4 vs L7, Algorithms, and Health Checks
Learn how load balancers distribute traffic, the differences between Layer 4 and Layer 7, common algorithms (round robin, least connections, IP hash), and health check mechanisms.
Why Load Balancing Matters
In any distributed system, a single server cannot handle all the traffic. Load balancers are the traffic cops that distribute incoming requests across multiple servers to ensure no single server becomes a bottleneck.
Interview insight: When designing any scalable system, you'll almost always need a load balancer. Be prepared to explain why you chose a particular type and algorithm.
Layer 4 (Transport Layer) vs Layer 7 (Application Layer) Load Balancers
Load balancers operate at different layers of the OSI model, each with distinct capabilities.
Layer 4 Load Balancers
- Operate at the transport layer (TCP/UDP)
- Make routing decisions based on IP addresses and port numbers
- Cannot see the content of the requests (e.g., HTTP headers, URLs, cookies)
- Faster and less resource-intensive because they don't inspect application data
- Examples: AWS Network Load Balancer, HAProxy in TCP mode
Layer 7 Load Balancers
- Operate at the application layer (e.g., HTTP, HTTPS, gRPC)
- Can inspect and make decisions based on HTTP headers, URLs, cookies, and even the request body
- Enable advanced features like SSL termination, content-based routing, and rate limiting
- More flexible but slightly slower due to deep packet inspection
- Examples: AWS Application Load Balancer, NGINX, Envoy
Key trade-off: L4 is faster and simpler; L7 is more intelligent and flexible. Choose L4 for raw TCP traffic (like databases) and L7 for HTTP/WebSocket traffic where you need routing based on content.
Common Load Balancing Algorithms
The algorithm determines how the load balancer selects a backend server for each request.
Round Robin
- Distributes requests sequentially to each server in the pool
- Simple and works well when all servers have similar capacity
- Does not account for server load or response time
Weighted Round Robin
- Assigns weights to servers based on their capacity
- More powerful servers get more requests
- Example: Server A (weight 3) gets 3 requests for every 1 request to Server B (weight 1)
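The two round-robin variants above can be sketched in a few lines. This is a simplified illustration, not any particular load balancer's implementation; the class name and the dict-of-weights input are our own:

```python
import itertools

class WeightedRoundRobin:
    """Expand each server into the pool `weight` times, then cycle.
    With all weights set to 1 this degenerates to plain round robin."""

    def __init__(self, servers):
        # servers: dict mapping server name -> integer weight
        pool = [name for name, w in servers.items() for _ in range(w)]
        self._cycle = itertools.cycle(pool)

    def next_server(self):
        return next(self._cycle)

lb = WeightedRoundRobin({"A": 3, "B": 1})
picks = [lb.next_server() for _ in range(8)]
# Over 8 picks, server A receives 6 requests and server B receives 2
```

Note that this naive expansion sends requests to A in bursts (A, A, A, B, ...). Production implementations typically interleave picks instead, e.g., NGINX's "smooth" weighted round robin, which spreads the weighted selections evenly over time.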
Least Connections
- Sends requests to the server with the fewest active connections
- Better for long-lived connections (like WebSockets or database connections)
- Accounts for varying request processing times
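A minimal sketch of least connections, assuming the load balancer itself tracks in-flight connections per backend (real implementations also handle concurrency and server removal):

```python
class LeastConnections:
    """Track active connections per server and always pick the least busy."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        # Pick the server with the fewest in-flight connections
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection closes so counts stay accurate
        self.active[server] -= 1

lb = LeastConnections(["A", "B"])
first = lb.acquire()   # both idle, so ties break deterministically
second = lb.acquire()  # goes to the other, now less-busy, server
```

The key difference from round robin is the `release` step: because the balancer knows when connections end, slow requests naturally steer new traffic away from the server handling them.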
Least Response Time
- Sends requests to the server with the fastest average response time
- Requires monitoring response times, adding overhead
- Good for minimizing latency
IP Hash
- Uses the client's IP address to determine which server to send the request to
- Ensures the same client always goes to the same server (session persistence)
- Useful when you can't use sticky cookies (e.g., non-HTTP traffic)
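IP hash reduces to hashing the client address onto the server pool. A sketch, using SHA-256 purely for a stable, well-distributed hash (real balancers use faster non-cryptographic hashes):

```python
import hashlib

def pick_server(client_ip, servers):
    # Stable hash: the same client IP always maps to the same server
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
assert pick_server("203.0.113.7", servers) == pick_server("203.0.113.7", servers)
```

One caveat: with plain `hash % N`, adding or removing a server remaps most clients to different backends, breaking session persistence. Consistent hashing is the usual mitigation when the pool changes frequently.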
Random with Two Choices
- Picks two servers at random and selects the one with the lower load
- Provides good load distribution with less overhead than checking all servers
- Used in load balancers such as NGINX (the `random two` upstream setting) and Envoy
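The core of "power of two choices" is just a sample-and-compare. A sketch, assuming the balancer keeps a per-server load metric such as active connections:

```python
import random

def pick_two_choices(loads):
    """loads: dict mapping server -> current load (e.g., active connections).
    Sample two servers at random and return the less loaded one."""
    a, b = random.sample(list(loads), 2)
    return a if loads[a] <= loads[b] else b

loads = {"A": 10, "B": 2, "C": 7}
choice = pick_two_choices(loads)
# The chosen server is never the busier of the two sampled
```

The appeal is that comparing two random servers gets load distribution close to scanning the entire pool (as least connections does) at a fraction of the bookkeeping cost, which matters when many balancer instances share stale load data.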
Sticky sessions: Algorithms like IP hash enable sticky sessions, but they can lead to uneven load distribution if certain clients generate more traffic. Consider using cookie-based stickiness (at L7) for better balance.
Health Checks: Ensuring Traffic Goes Only to Healthy Servers
Load balancers must avoid sending traffic to servers that are down or unhealthy. Health checks are periodic probes that determine a server's fitness.
Types of Health Checks
- TCP check: Can you establish a TCP connection to the server's port? (L4)
- HTTP check: Send an HTTP GET request and expect a 2xx or 3xx response. (L7)
- HTTPS check: Same as HTTP but over TLS.
- Custom check: Run a script or endpoint that returns 200 if the server is healthy (e.g., checks disk space, queue depth, etc.)
Health Check Configuration
- Interval: How often to perform the check (e.g., every 5 seconds)
- Timeout: How long to wait for a response before marking the check as failed
- Unhealthy threshold: Number of consecutive failed checks before marking the server as unhealthy
- Healthy threshold: Number of consecutive successful checks required to mark an unhealthy server as healthy again (prevents flapping)
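The threshold logic above is a small state machine: a server flips to unhealthy only after N consecutive failures, and back to healthy only after M consecutive successes. A sketch (class and field names are ours, not any vendor's API):

```python
class HealthTracker:
    """Mark a server unhealthy after `unhealthy_threshold` consecutive
    failed checks, and healthy again only after `healthy_threshold`
    consecutive successes -- the asymmetry prevents flapping."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive checks pushing toward a state change

    def record(self, success):
        if self.healthy:
            self._streak = self._streak + 1 if not success else 0
            if self._streak >= self.unhealthy_threshold:
                self.healthy = False
                self._streak = 0
        else:
            self._streak = self._streak + 1 if success else 0
            if self._streak >= self.healthy_threshold:
                self.healthy = True
                self._streak = 0
        return self.healthy

t = HealthTracker()
t.record(False); t.record(False)   # still healthy: 2 of 3 failures
t.record(False)                    # third consecutive failure: unhealthy
t.record(True)                     # one success is not enough to recover
t.record(True)                     # second consecutive success: healthy again
```

Note that any success while healthy (or failure while unhealthy) resets the streak, which is exactly why "consecutive" matters in the configuration.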
Best practice: Use an endpoint like /health that checks critical dependencies (database, cache, etc.) and returns 200 only if all are healthy. Avoid overly complex health checks that might fail transiently.
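A minimal /health endpoint along those lines, using only Python's standard library. The dependency checks (`check_database`, `check_cache`) are hypothetical stubs standing in for real probes with short timeouts:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def check_database():
    return True  # stub: e.g., run `SELECT 1` against the primary, short timeout

def check_cache():
    return True  # stub: e.g., PING the cache, short timeout

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_response(404)
            self.end_headers()
            return
        # 200 only if every critical dependency is reachable
        ok = check_database() and check_cache()
        self.send_response(200 if ok else 503)
        self.end_headers()

# To serve health checks on port 8080:
# HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

Returning 503 (rather than an error page with 200) matters because most load balancers only look at the status code. Keep the checks fast and tolerant of transient blips, or the balancer may drain a healthy server.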
Putting It All Together: Choosing a Load Balancer
When designing a system, consider:
- Traffic type: Is it HTTP/WebSocket (L7) or raw TCP (L4)?
- Features needed: Do you need SSL termination, URL-based routing, or WAF capabilities? (L7)
- Performance requirements: Do you need the lowest possible latency? (L4 might be better)
- Algorithm suitability: Match the algorithm to your workload (e.g., least connections for long-lived requests, round robin for short-lived HTTP requests).
- Observability: Choose a load balancer that provides good metrics and logging.
Common mistake: Overlooking health checks. A misconfigured health check can lead to all servers being marked unhealthy (too strict) or unhealthy servers receiving traffic (too lax).
What to Remember for Interviews
- L4 vs L7: Know the differences in what they can inspect and their typical use cases.
- Algorithms: Be able to explain round robin, least connections, and IP hash, and when to use each.
- Health checks: Understand why they're critical and how to configure them properly.
- Sticky sessions: Know the trade-offs between IP hash and cookie-based stickiness.
- Cloud vs self-managed: Be familiar with offerings like AWS ALB/NLB, GCP Cloud Load Balancing, and open-source options like HAProxy and NGINX.
Practice: Draw a diagram of a typical web architecture with clients, L7 load balancer, web servers, and a database. Explain how you'd choose the load balancer type, algorithm, and health check for each layer.