Real-Time Device Monitoring: Instant Status Updates with WebSockets

In This Article

  1. Why Real-Time Matters
  2. What We Built
  3. How It Works: The Real-Time Pipeline
  4. Email Alerts with Smart Cooldown
  5. Notification Preferences
  6. Get Started

When a switch goes down at 2 AM, the difference between knowing in 30 seconds and knowing in 5 minutes is the difference between a brief blip and a cascading outage that takes half your network with it. Most monitoring dashboards make you wait — you refresh the page, stare at stale data, and hope the next poll cycle catches the problem. That lag is where incidents grow from minor to major.

With Down Device v3.0.0, we're shipping real-time WebSocket updates for device monitoring. Your dashboard now reflects the actual state of your infrastructure within 30 seconds of a change — no manual refresh, no polling lag, no guesswork. This post covers why we built it, how the architecture works under the hood, and the smart alerting features that come with it.

Why Real-Time Matters

Traditional monitoring dashboards rely on the browser polling the API at intervals. You load the page, see a snapshot of your infrastructure, and that snapshot is already aging. If your polling interval is 60 seconds and a device goes offline one second after your last poll, you won't see it for another 59 seconds — at best. In practice, most dashboards poll less frequently to reduce server load, and many require a full page refresh to fetch updated data.

For IT admins managing dozens or hundreds of devices, this creates a real operational problem:

Real-time delivery via WebSockets eliminates all of these problems. Status changes arrive at the browser the moment a check detects them: not on the next browser poll, not on the next page refresh, but typically within milliseconds of detection.

What We Built

Down Device v3.0.0 introduces a WebSocket-based real-time update system for device monitoring. Here's what it delivers:

What This Means in Practice

Open your Down Device dashboard and leave it open. When a device goes offline anywhere in your infrastructure, you'll see the status change within 30 seconds — automatically. No clicking, no refreshing, no waiting. This works whether you're watching one device or a thousand.

How It Works: The Real-Time Pipeline

The real-time system is a pipeline with four stages. Each stage is designed to be fast, reliable, and independently scalable. Here's how a device status change goes from your network to your browser.

Stage 1: Workers Execute Checks

ARQ workers — lightweight async task processors — run monitoring checks against your devices on a 30-second cycle. Each worker picks up check tasks from a Redis queue, executes the ICMP ping or SNMP query, and records the result: the device's status (online, offline, degraded), response time, packet loss, and any error information.

Workers are distributed across monitoring regions and connected via WireGuard VPN, so checks run close to your infrastructure regardless of where the Down Device API is hosted. A worker in your region pings your device, and the result is available in under a second.
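As a rough illustration, the result a worker records could be modeled like this. This is a minimal sketch: the class and field names (CheckResult, response_time_ms, packet_loss_pct) are assumptions made for the example, not the actual Down Device schema. Only the kinds of data recorded come from the description above.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class DeviceStatus(str, Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    DEGRADED = "degraded"


@dataclass(frozen=True)
class CheckResult:
    # Field names are hypothetical; the article only lists what is recorded.
    device_id: str
    status: DeviceStatus
    response_time_ms: Optional[float]  # None when the device did not answer
    packet_loss_pct: float
    error: Optional[str] = None

    def to_json(self) -> str:
        """Serialize the result so it can be sent over the wire as JSON."""
        payload = asdict(self)
        payload["status"] = self.status.value  # store the plain string, not the Enum
        return json.dumps(payload)
```

A failed ping would then look something like CheckResult("sw-01", DeviceStatus.OFFLINE, None, 100.0, "timeout"), with no response time and full packet loss.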

Stage 2: Workers Publish to Redis

As soon as a check completes, the worker publishes the result to a Redis pub/sub channel. Redis pub/sub is a message broadcasting system — when a message is published to a channel, every subscriber on that channel receives it immediately. There's no queue to drain, no batch window to wait for. The message goes out the instant it arrives.

Each account has its own pub/sub channel, which means the API server only receives updates relevant to the accounts its connected clients belong to. This keeps message volume manageable even at scale and ensures strict tenant isolation — you never receive data about another customer's devices.
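To make the per-account channel idea concrete, here is a small in-memory stand-in for pub/sub. It is a sketch, not the production code: the channel naming scheme (account:<id>:device-status) and the InMemoryPubSub class are invented for illustration, and a real deployment would call Redis rather than this toy class. What it does show is the fan-out behavior described above: a message reaches only the subscribers of its own channel, and nothing is stored for later.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


def channel_for_account(account_id: str) -> str:
    # Hypothetical naming scheme; the real channel names are not documented here.
    return f"account:{account_id}:device-status"


class InMemoryPubSub:
    """Toy stand-in for Redis pub/sub: fan-out to current subscribers, no storage."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # .get() avoids creating an empty entry for channels nobody listens to.
        handlers = self._subscribers.get(channel, [])
        for handler in handlers:
            handler(message)
        # Like Redis PUBLISH, report how many subscribers received the message.
        return len(handlers)
```

Publishing to channel_for_account("acct-a") never touches handlers subscribed to channel_for_account("acct-b"), which is the tenant isolation property in miniature. A message published with no subscribers is simply dropped, which is why the reconnect step in Stage 4 fetches current state.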

Stage 3: API Server Broadcasts via WebSocket

The Down Device API server subscribes to the Redis pub/sub channel for each account that has at least one connected user. When a check result arrives on a channel, the API server immediately forwards it to every WebSocket client authenticated to that account.

This is where the multi-tenant architecture matters. The API server maintains a mapping of WebSocket connections to account IDs. When a message arrives on an account's channel, it's routed only to browsers that belong to that account. The server never broadcasts data to the wrong tenant, and the filtering happens at the connection level — not in the browser.
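The connection-to-account mapping can be sketched as a small registry. This is an illustrative version under stated assumptions: ConnectionRegistry is a made-up name, and the connection objects are assumed to expose an async send_text method (as WebSocket objects in several Python frameworks do). The actual server code may be structured differently.

```python
import asyncio
from collections import defaultdict
from typing import DefaultDict, Protocol, Set


class Sender(Protocol):
    """Anything with an async send_text method (assumed connection interface)."""

    async def send_text(self, data: str) -> None: ...


class ConnectionRegistry:
    def __init__(self) -> None:
        self._by_account: DefaultDict[str, Set[Sender]] = defaultdict(set)

    def register(self, account_id: str, conn: Sender) -> None:
        self._by_account[account_id].add(conn)

    def unregister(self, account_id: str, conn: Sender) -> None:
        conns = self._by_account.get(account_id)
        if conns is None:
            return
        conns.discard(conn)
        if not conns:
            del self._by_account[account_id]

    async def broadcast(self, account_id: str, message: str) -> int:
        """Send to every connection of one account; return how many succeeded."""
        conns = list(self._by_account.get(account_id, ()))
        if not conns:
            return 0
        results = await asyncio.gather(
            *(conn.send_text(message) for conn in conns), return_exceptions=True
        )
        delivered = 0
        for conn, result in zip(conns, results):
            if isinstance(result, BaseException):
                self.unregister(account_id, conn)  # drop connections that failed
            else:
                delivered += 1
        return delivered
```

Because broadcast looks up connections by account ID before sending anything, a message for one account has no code path to another account's sockets. The filtering happens on the server, as described above.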

Stage 4: Browser Receives and Renders

The browser's WebSocket client receives the JSON payload containing the updated device status, parses it, and updates the dashboard in place. There's no full-page re-render, no API call to fetch the latest state, no flash of loading spinners. The specific device row in your dashboard updates its status indicator, response time, and last-checked timestamp — and that's it.

If the WebSocket connection is interrupted for any reason, the client automatically reconnects using an exponential backoff strategy. On reconnection, it fetches the latest state via a standard API call to ensure nothing was missed during the gap, then resumes listening for real-time updates.
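The reconnect behavior follows a common pattern, sketched below. The real client runs in the browser; this version is in Python purely for illustration, and every name and number in it (backoff_delays, reconnect, the 1-second base, the 30-second cap, the attempt limit) is an assumption rather than the actual client configuration.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0) -> Iterator[float]:
    """Yield wait times that grow geometrically until they reach the cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(cap, delay * factor)


def reconnect(connect: Callable[[], T],
              fetch_latest_state: Callable[[], None],
              sleep: Callable[[float], None] = time.sleep,
              max_attempts: int = 10) -> T:
    """Retry connect() with exponential backoff, then resync state once."""
    delays = backoff_delays()
    for _ in range(max_attempts):
        try:
            conn = connect()
        except ConnectionError:
            sleep(next(delays))  # wait longer after each failed attempt
            continue
        fetch_latest_state()  # catch up on anything missed during the gap
        return conn
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

With these defaults the waits are 1, 2, 4, 8 and 16 seconds, then 30 seconds for every later attempt. Production clients often add random jitter to each delay so that many browsers do not reconnect at the same instant. That is omitted here to keep the sketch short.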

The Full Path in Numbers

Worker checks device (ICMP round-trip: ~5–50ms) → publishes to Redis (~1ms) → API server receives and routes (~1ms) → WebSocket delivers to browser (~10–50ms over the internet). Total pipeline latency from check completion to dashboard update: typically under 100 milliseconds. The 30-second check interval dominates the total: a change that happens just after one check is caught by the next, so you see status changes within roughly 30 seconds of the actual state change on your network.
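The arithmetic behind those numbers can be written out as a quick back-of-the-envelope calculation. The stage estimates are the ones quoted above; the function names are invented for the example, and the 50 ms check duration is an assumption. A ping to a device that is actually offline runs until its timeout, which this article does not specify, so the real worst case depends on that setting.

```python
CHECK_INTERVAL_S = 30.0

# (low, high) estimates in milliseconds for each stage after the check completes.
PIPELINE_MS = {
    "publish_to_redis": (1, 1),
    "api_receive_and_route": (1, 1),
    "websocket_delivery": (10, 50),
}


def pipeline_latency_ms() -> tuple:
    """Sum the low and high estimates across all post-check stages."""
    low = sum(lo for lo, _ in PIPELINE_MS.values())
    high = sum(hi for _, hi in PIPELINE_MS.values())
    return low, high


def worst_case_detection_s(check_duration_ms: float = 50.0) -> float:
    """Upper bound on time from a state change to the dashboard update.

    Worst case: the device changes state just after a check, so it waits a
    full interval, then the next check runs and flows through the pipeline.
    """
    _, high = pipeline_latency_ms()
    return CHECK_INTERVAL_S + (check_duration_ms + high) / 1000.0
```

Under these assumptions the post-check pipeline takes 12 to 52 ms, and the worst-case end-to-end delay is about 30.1 seconds. The check interval, not the delivery path, sets the detection time.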

Email Alerts with Smart Cooldown

Real-time dashboard updates are powerful when you're watching the screen, but you can't watch a dashboard 24/7. That's where email alerts come in — and getting email alerts right is harder than it sounds.

Starting with v3.1.0, Down Device sends email notifications when a device goes offline and when it comes back online. Every alert includes the device name, IP address, the time the state change was detected, and the monitoring region that observed it. You get the information you need to start investigating without having to log in and look it up.

But anyone who has managed a monitoring system knows the real problem with email alerts: alert fatigue. A device with an intermittent connection might flap between online and offline every few minutes, generating a flood of emails that bury the alerts that actually matter. Your inbox fills up with "Device X is offline" / "Device X is online" pairs, and you start ignoring all of them — including the one that says your core router just went down.

Down Device addresses this with cooldown protection. After sending an offline alert for a device, the system enforces a cooldown period before sending another alert for the same device. If the device flaps back online and then offline again within the cooldown window, you don't get a second wave of emails. You get one offline alert, one recovery alert when the device stabilizes, and silence in between.
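The core of the cooldown check fits in a few lines. This is a minimal sketch, not the shipped implementation: the class name and the 10-minute window are assumptions (the window length is not stated here), and it models only the suppression of repeated offline alerts. How the system decides that a device has stabilized before sending the recovery alert is not covered.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OfflineAlertCooldown:
    cooldown_s: float = 600.0  # assumed window; the real value is not documented here
    _last_sent: Dict[str, float] = field(default_factory=dict)

    def should_send(self, device_id: str, now: float) -> bool:
        """Return True (and start a new window) if an offline alert may go out."""
        last = self._last_sent.get(device_id)
        if last is not None and now - last < self.cooldown_s:
            return False  # still inside the cooldown window: suppress
        self._last_sent[device_id] = now
        return True
```

If a device flaps offline at t=0, t=120 and t=300 seconds, only the first event passes should_send. A fresh outage at t=600 or later alerts again, and each device has its own independent window.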

This approach gives you two critical properties:

Notification Preferences

Not everyone on your team needs the same alerts. The network engineer responsible for core infrastructure wants to know about every switch and router state change. The developer who added a test server to monitoring doesn't want emails about it at 3 AM. The billing admin doesn't need device alerts at all.

With v3.2.0, Down Device introduces per-user notification preferences. Each team member can independently control their alert settings:

These preferences are per-user, not per-account. Each person on your team configures their own notification settings without affecting anyone else. An admin can have both offline and online alerts enabled, while a viewer on the same account can have alerts disabled entirely.

This matters for teams of any size. Even a two-person team benefits from being able to have one person receive all alerts while the other only checks the dashboard during business hours. For larger teams with on-call rotations, individual notification preferences mean the on-call engineer gets alerts while everyone else sleeps undisturbed.
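Per-user preferences amount to filtering the recipient list for each event. Here is an illustrative sketch: the NotificationPrefs fields and the recipients_for helper are hypothetical names, and defaulting both toggles to enabled is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NotificationPrefs:
    email: str
    notify_offline: bool = True  # assumed default
    notify_online: bool = True   # assumed default


def recipients_for(event: str, team: Iterable[NotificationPrefs]) -> List[str]:
    """Return the addresses of team members who opted in to this event type."""
    if event not in ("offline", "online"):
        raise ValueError(f"unknown event type: {event!r}")
    wanted = "notify_offline" if event == "offline" else "notify_online"
    return [member.email for member in team if getattr(member, wanted)]
```

An on-call engineer with both toggles on, a teammate who only wants offline alerts, and a viewer with everything off can all belong to the same account. Each alert goes only to the people whose own settings ask for it.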

Coming Next

Notification preferences in v3.2.0 cover email alerts for device online/offline events. We're working on expanding this to include additional channels (Slack, webhook, SMS) and per-device alert rules in future releases. The foundation is built — the granularity will keep growing.

What This All Adds Up To

Real-time WebSocket updates, smart email alerts with cooldown protection, and granular notification preferences work together to solve a single problem: making sure you know about infrastructure issues the moment they happen, without drowning in noise.

Here's what changes in your day-to-day workflow:

These features ship as part of Down Device v3.0.0 through v3.2.0. If you're already a Down Device user, WebSocket updates are live now in v3.0.0: just open your dashboard. Email alerts follow in v3.1.0 and notification preferences in v3.2.0.

See Your Infrastructure in Real Time

Down Device delivers device status updates to your browser within 30 seconds via WebSockets. Pair that with smart email alerts and per-user notification preferences, and you have monitoring that keeps you informed without overwhelming you. Free plan available — no credit card required.

Wrapping Up

Monitoring that makes you wait isn't really monitoring; it's periodic checking with a pretty interface. Real-time delivery changes the dynamic. Your dashboard becomes a live view of your infrastructure, not a snapshot that was accurate 60 seconds ago. Email alerts reach you within seconds of a detected state change, filtered through cooldown logic so you can trust every notification you receive.

The architecture behind it (workers publishing to Redis pub/sub, the API server routing messages to authenticated WebSocket connections, the browser updating in place) is designed to be fast and reliable at scale. Whether you're monitoring 10 devices or 10,000, the pipeline typically delivers updates in under 100 milliseconds from check completion to dashboard render.

If you've been refreshing your monitoring dashboard to check on devices, that stops today. Start your free trial and see what real-time monitoring actually looks like, or reach out to our team if you have questions about how it fits your infrastructure.