Fleet health and status heatmap
The fleet health card gives you a bird's-eye view of your entire fleet's availability. It surfaces which agents are online, idle, or offline, calculates fleet-wide uptime, and provides a status heatmap showing how agent statuses are distributed over time.
Fleet health summary
The summary section shows the key fleet-level metrics at a glance:
| Metric | Description |
|---|---|
| Online / Idle / Offline | Current count of agents in each status category |
| Fleet uptime % | Weighted average uptime across all agents in the selected range |
| Total transitions | Number of status changes across all agents in the selected range |
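To illustrate how a weighted fleet uptime differs from a plain average, here is a minimal sketch. It assumes each agent is weighted by how long it was observed within the selected range; the product's actual weighting is not specified here, and `AgentStats` and its fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    """Hypothetical per-agent figures for the selected range."""
    name: str
    observed_seconds: float  # time the agent was tracked in the range
    up_seconds: float        # portion of that time counted as "up" (assumption)
    transitions: int         # status changes in the range


def fleet_uptime_percent(agents: list[AgentStats]) -> float:
    """Uptime weighted by observed time (assumed weighting)."""
    total_observed = sum(a.observed_seconds for a in agents)
    if total_observed == 0:
        return 0.0
    total_up = sum(a.up_seconds for a in agents)
    return 100.0 * total_up / total_observed


def total_transitions(agents: list[AgentStats]) -> int:
    """Sum of status changes across all agents."""
    return sum(a.transitions for a in agents)


agents = [
    AgentStats("agent-a", observed_seconds=86_400, up_seconds=86_400, transitions=0),
    AgentStats("agent-b", observed_seconds=43_200, up_seconds=21_600, transitions=4),
]

print(round(fleet_uptime_percent(agents), 2))  # 83.33 (a plain average would give 75.0)
print(total_transitions(agents))               # 4
```

Under this assumption, an agent that was only tracked for half the range contributes half as much to the fleet figure as one tracked for the whole range.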
Offline agents
If any agents are currently offline, they are listed in a dedicated section. Each entry shows the agent name, the reason it went offline (heartbeat timeout, manual status change, or error detection), the HTTP status code of the last failed heartbeat, and how long the agent has been offline.
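As a rough sketch of the information each offline entry carries, the fields described above could be modeled as follows. The class and field names are hypothetical and do not reflect a documented schema; only the three offline reasons come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class OfflineReason(Enum):
    HEARTBEAT_TIMEOUT = "heartbeat timeout"
    MANUAL_STATUS_CHANGE = "manual status change"
    ERROR_DETECTION = "error detection"


@dataclass
class OfflineAgent:
    """Hypothetical shape of one entry in the offline agents list."""
    name: str
    reason: OfflineReason
    last_http_status: Optional[int]  # status code of the last failed heartbeat, if any
    offline_since: datetime


def format_offline_duration(entry: OfflineAgent, now: datetime) -> str:
    """Render how long the agent has been offline as hours and minutes."""
    total_minutes = int((now - entry.offline_since).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


entry = OfflineAgent(
    name="agent-b",
    reason=OfflineReason.HEARTBEAT_TIMEOUT,
    last_http_status=503,
    offline_since=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
now = datetime(2024, 1, 1, 14, 35, tzinfo=timezone.utc)
print(format_offline_duration(entry, now))  # 2h 35m
```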
Lowest uptime agents
Below the offline agents, a ranked list shows the agents with the lowest uptime percentage in the selected range. This highlights chronically unreliable agents that may need configuration changes, endpoint fixes, or replacement.
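The ranking itself is a simple ascending sort on uptime. The sketch below assumes you already have per-agent uptime percentages in a dictionary; the function name, the `limit` parameter, and the alphabetical tie-break are illustrative choices, not documented behavior.

```python
def lowest_uptime(uptime_by_agent: dict[str, float], limit: int = 5) -> list[tuple[str, float]]:
    """Return the `limit` agents with the lowest uptime, worst first.

    Ties are broken by agent name so the order is stable (an assumption).
    """
    ranked = sorted(uptime_by_agent.items(), key=lambda item: (item[1], item[0]))
    return ranked[:limit]


uptimes = {"agent-a": 99.9, "agent-b": 72.5, "agent-c": 88.0}
print(lowest_uptime(uptimes, limit=2))  # [('agent-b', 72.5), ('agent-c', 88.0)]
```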
Network status heatmap
The status heatmap visualizes agent status distribution over time. Each row is an agent and each column is a time bucket. Cells are colored by the agent's dominant status during that bucket: green for online, amber for idle, red for offline. This makes it easy to spot fleet-wide outages (vertical red bands) and agent-specific reliability issues (horizontal red streaks).
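One way to think about "dominant status" is the status that covers the most time within a bucket. The sketch below computes one heatmap row under that assumption; the interval format, the function name, and the tie-breaking behavior are hypothetical, while the color mapping follows the description above.

```python
from collections import defaultdict
from typing import Optional

# (start, end, status) with times in seconds; a hypothetical input format.
Interval = tuple[int, int, str]

STATUS_COLORS = {"online": "green", "idle": "amber", "offline": "red"}


def dominant_statuses(
    intervals: list[Interval], range_start: int, range_end: int, bucket_seconds: int
) -> list[Optional[str]]:
    """Return the status covering the most time in each bucket (None if no data)."""
    cells: list[Optional[str]] = []
    bucket_start = range_start
    while bucket_start < range_end:
        bucket_end = min(bucket_start + bucket_seconds, range_end)
        time_in_status: dict[str, int] = defaultdict(int)
        for start, end, status in intervals:
            # Length of the overlap between this interval and the bucket.
            overlap = min(end, bucket_end) - max(start, bucket_start)
            if overlap > 0:
                time_in_status[status] += overlap
        # On a tie, the status seen first wins (an arbitrary choice in this sketch).
        cells.append(max(time_in_status, key=time_in_status.get) if time_in_status else None)
        bucket_start = bucket_end
    return cells


history = [(0, 50, "online"), (50, 70, "idle"), (70, 180, "offline")]
row = dominant_statuses(history, range_start=0, range_end=180, bucket_seconds=60)
print(row)                                  # ['online', 'offline', 'offline']
print([STATUS_COLORS[s] for s in row if s])  # ['green', 'red', 'red']
```

Note that the short idle period disappears from the row because it never dominates a bucket; smaller buckets would surface it.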
Export options
Copy to clipboard: copies the fleet health summary and offline agent list as formatted text for pasting into incident reports or chat messages.
CSV export: downloads the full agent health data as a CSV file, including uptime percentages, transition counts, and current status for every agent.
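Once exported, the CSV can be processed with standard tooling. The sketch below uses Python's `csv` module to list agents below an uptime threshold. The column names (`agent_name`, `uptime_percent`, `transitions`, `current_status`) and the sample rows are assumptions made for illustration; check the header row of your actual export before relying on them.

```python
import csv
import io

# Illustrative sample only; real exports may use different column names.
SAMPLE_CSV = """agent_name,uptime_percent,transitions,current_status
agent-a,99.95,2,online
agent-b,71.20,14,offline
agent-c,88.40,6,idle
"""


def agents_below(csv_text: str, threshold: float) -> list[str]:
    """Return names of agents whose uptime percentage is below `threshold`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["agent_name"]
        for row in reader
        if float(row["uptime_percent"]) < threshold
    ]


print(agents_below(SAMPLE_CSV, 90.0))  # ['agent-b', 'agent-c']
```

To read a downloaded file instead of the inline sample, pass its contents, for example `agents_below(open(path, newline="").read(), 90.0)`.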
Next
Drill into per-agent uptime timelines and outage details. See Agent uptime →