You’re waiting in line for concert tickets. You see “Your current position in line: 50,000” and start wondering what technology powers this. As a developer, you might assume these high-traffic systems use modern real-time tech like WebSockets or Server-Sent Events (SSE). After all, they’re designed for real-time communication, right?

Well, not quite. When you’re dealing with hundreds of thousands of simultaneous users, the secret to stability often lies in a much simpler method: HTTP Polling.


How HTTP Polling Works

HTTP Polling is straightforward: your browser repeatedly asks the server “Is it my turn yet?” at regular intervals. The server responds with your current position, and then the connection closes. Your browser waits a bit, then asks again.

Think of it like checking your mailbox—you don’t keep the mail slot open all day. You check it periodically, get your mail, close it, and check again later.
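
To make that concrete, here is roughly what a single poll looks like from the client's side, sketched in Go for illustration (the endpoint, JSON fields, and user ID are placeholders that match the server example later in this post):

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

func main() {
    // One poll: open a connection, ask for the current position, read the
    // answer, and let the connection close. Nothing stays open in between.
    resp, err := http.Get("http://localhost:8080/queue/status?user_id=abc123")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    var status struct {
        Position int `json:"position"`
        Total    int `json:"total"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&status); err != nil {
        log.Fatal(err)
    }

    fmt.Printf("You are number %d of %d in line\n", status.Position, status.Total)
}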

This might seem inefficient compared to WebSockets, but it’s actually the foundation that keeps massive systems like South Korea’s SRT train booking site running smoothly under extreme load.


The TTL Mechanism

If every user polled every second, the server would collapse. That’s where TTL (Time To Live) comes in.

Instead of letting the client decide when to poll again, the server controls the pace. When it responds, it includes a TTL value that tells the client: “Don’t ask again for 10 seconds” or “Check back in 1 second.”

This gives the server complete control over traffic flow. When load is high, it can increase the TTL to reduce requests. When things calm down, it can decrease the TTL for faster updates.

Here’s how you might implement this in practice:

Go Example

package main

import (
    "encoding/json"
    "net/http"
)

type QueueStatus struct {
    Position int `json:"position"`
    Total    int `json:"total"`
    TTL      int `json:"ttl"` // seconds until next poll
}

func getQueueStatus(userID string) (int, int) {
    // Your queue logic here - check Redis, database, etc.
    // Returns: position, total
    return 5000, 50000
}

func calculateTTL(position, total int) int {
    // Adaptive TTL based on position: the closer to the front, the faster the polling
    if total <= 0 {
        return 10
    }
    progress := float64(position) / float64(total)

    switch {
    case progress < 0.1: // Front 10% of the line
        return 1 // Poll every second
    case progress < 0.5: // Front half
        return 3 // Poll every 3 seconds
    case progress < 0.9: // Front 90%
        return 5 // Poll every 5 seconds
    default: // Back of the line
        return 10 // Poll every 10 seconds
    }
}

func queueHandler(w http.ResponseWriter, r *http.Request) {
    userID := r.URL.Query().Get("user_id")
    
    position, total := getQueueStatus(userID)
    ttl := calculateTTL(position, total)
    
    status := QueueStatus{
        Position: position,
        Total:    total,
        TTL:      ttl,
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(status)
}

func main() {
    http.HandleFunc("/queue/status", queueHandler)
    http.ListenAndServe(":8080", nil)
}

Rust Example

use serde::{Deserialize, Serialize};
use axum::{
    extract::Query,
    http::StatusCode,
    response::Json,
    routing::get,
    Router,
};

#[derive(Serialize, Deserialize)]
struct QueueStatus {
    position: usize,
    total: usize,
    ttl: u64, // seconds until next poll
}

#[derive(Deserialize)]
struct QueueParams {
    user_id: String,
}

fn calculate_ttl(position: usize, total: usize) -> u64 {
    // Adaptive TTL based on position: the closer to the front, the faster the polling
    if total == 0 {
        return 10;
    }
    let progress = position as f64 / total as f64;

    match progress {
        p if p < 0.1 => 1,  // Front 10%: poll every second
        p if p < 0.5 => 3,  // Front half: poll every 3 seconds
        p if p < 0.9 => 5,  // Front 90%: poll every 5 seconds
        _ => 10,            // Back of the line: poll every 10 seconds
    }
}

async fn get_queue_status(user_id: &str) -> (usize, usize) {
    // Your queue logic here - check Redis, database, etc.
    // Returns: (position, total)
    (5000, 50000)
}

async fn queue_handler(Query(params): Query<QueueParams>) -> Result<Json<QueueStatus>, StatusCode> {
    let (position, total) = get_queue_status(&params.user_id).await;
    let ttl = calculate_ttl(position, total);
    
    Ok(Json(QueueStatus {
        position,
        total,
        ttl,
    }))
}

#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/queue/status", get(queue_handler));
    
    axum::Server::bind(&"0.0.0.0:8080".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}

The client then uses the TTL value to determine when to poll again:

async function pollQueueStatus(userId) {
    let delaySeconds = 10; // fallback delay if the request fails

    try {
        const response = await fetch(`/queue/status?user_id=${encodeURIComponent(userId)}`);
        const { position, total, ttl } = await response.json();

        console.log(`Position: ${position}/${total}`);
        delaySeconds = ttl;
    } catch (err) {
        console.error('Poll failed, retrying after fallback delay', err);
    }

    // Wait for the server-provided TTL (or the fallback) before polling again
    setTimeout(() => pollQueueStatus(userId), delaySeconds * 1000);
}

Why Not WebSockets or SSE?

A natural question: why not use WebSockets or Server-Sent Events (SSE)? Both create persistent connections over which the server can push updates directly—WebSockets for bidirectional communication, SSE for one-way server-to-client updates. On paper, that sounds perfect for queue status updates.

But at scale—think 500,000 simultaneous users—both become a liability:

WebSockets

  1. Memory overhead: Each WebSocket connection consumes server memory even when idle—typically around 8-16KB per connection for buffers, protocol state, and a file descriptor. With 500,000 persistent connections, that’s roughly 4-8GB of memory just to keep the connections open, before any actual data processing, plus half a million open file descriptors sitting idle.

  2. The reconnection storm: If the network hiccups and those 500,000 connections drop, every user tries to reconnect simultaneously. This creates an unintentional DDoS attack that can take down the entire system. Recovery becomes nearly impossible because the reconnection attempts keep hammering the server.

Server-Sent Events (SSE)

SSE has the same fundamental problems:

  1. Memory overhead: Like WebSockets, SSE maintains persistent connections. Each SSE connection typically uses 4-8KB of memory, so 500,000 of them still means roughly 2-4GB of memory and 500,000 open file descriptors.

  2. Reconnection issues: While SSE has built-in automatic reconnection, that’s actually part of the problem. When connections drop, all 500,000 clients reconnect automatically at once, creating the same reconnection storm.

  3. Connection limits: Many servers, load balancers, and proxies cap the number of concurrent connections, and persistent connections run into those caps far sooner than short-lived polling requests do.

The core issue with both WebSockets and SSE is that they’re stateful—they require maintaining persistent connections. At massive scale, stateless polling often wins.


Why Polling Works at Scale

Polling succeeds because it’s stateless. Each request is independent—the server processes it, sends a response, and forgets about it. No connection to maintain, no state to track.

Checking a queue position is a simple operation: read a number from memory (or Redis—see this post for how Redis fits into the queue architecture), return it, done. It’s lightweight. Even with 500,000 users polling every few seconds, the server handles it because each request is quick and self-contained.
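
To illustrate how small that read really is, here is one way the getQueueStatus stub from the Go example could be filled in with a Redis sorted set. This is a sketch under assumptions: the key name waiting_queue, the go-redis client, and enqueueing users with their arrival timestamp as the score are placeholders, not a prescribed design.

package queue

import (
    "context"

    "github.com/redis/go-redis/v9"
)

// GetQueueStatus returns the user's 1-based position and the total queue size,
// assuming users were added to the "waiting_queue" sorted set with their
// arrival timestamp as the score.
func GetQueueStatus(ctx context.Context, rdb *redis.Client, userID string) (int, int, error) {
    // ZRank is O(log N): the user's 0-based rank in the sorted set.
    rank, err := rdb.ZRank(ctx, "waiting_queue", userID).Result()
    if err != nil {
        return 0, 0, err // redis.Nil means the user is not in the queue
    }

    // ZCard is O(1): total number of users currently waiting.
    total, err := rdb.ZCard(ctx, "waiting_queue").Result()
    if err != nil {
        return 0, 0, err
    }

    return int(rank) + 1, int(total), nil
}

Both commands are cheap, which is exactly what keeps each poll self-contained and fast.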

The math is simple: 500,000 lightweight requests spread over time is more manageable than 500,000 persistent connections that all need to be maintained simultaneously.
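
To put rough numbers on that (these are illustrative assumptions, not measurements): suppose the average TTL works out to 10 seconds and each status lookup takes about 2 ms.

  • Polling: 500,000 users ÷ 10 s ≈ 50,000 requests per second, and at 2 ms each only about 50,000 × 0.002 ≈ 100 requests are actually in flight at any instant.
  • Persistent connections: all 500,000 connections stay open the entire time, whether or not anything changes.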


Adaptive Polling

Modern polling systems don’t use fixed intervals. They use adaptive polling where the interval changes based on your position:

  • Far from the front? Poll every 10 seconds.
  • Getting closer? Poll every 5 seconds.
  • Almost there? Poll every 1 second.

This balances server load with user experience. Users get faster updates as they approach the front, but the server isn’t overwhelmed by constant requests from users who are still far back in line.
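
And because the interval is dictated by the server, the same mechanism doubles as a pressure valve: under heavy load, every TTL can simply be stretched. Here is a minimal sketch of that idea, which could sit next to calculateTTL in the Go example above (loadFactor is a hypothetical metric you might derive from request rate, CPU, or queue depth):

// adjustTTLForLoad widens the polling interval when the system is under pressure.
// loadFactor is a hypothetical load metric: 1.0 under normal load, higher when stressed.
func adjustTTLForLoad(baseTTL int, loadFactor float64) int {
    ttl := int(float64(baseTTL) * loadFactor)
    if ttl < baseTTL {
        ttl = baseTTL // never poll faster than the position-based interval
    }
    return ttl
}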


The Takeaway

WebSockets and SSE aren’t bad—they’re perfect for real-time games, chat apps, or collaborative tools where you need persistent connections. But for massive waiting queues, the simplest, most stable approach often wins.

The lesson here is about choosing the right tool for the job. Sometimes the “old” way is exactly what keeps your system running when traffic spikes. Real engineering isn’t about using the flashiest technology—it’s about understanding the constraints and picking what actually works.