Agent uptime

The agent uptime card provides detailed availability metrics for every agent in your fleet. Each row includes uptime percentage, downtime duration, mean time to recovery, and an expandable status timeline that visualizes exactly when each agent was online, idle, or offline.

Per-agent uptime table

Column            Description
Uptime %          Percentage of time the agent was online or idle (not offline) in the range
Downtime hours    Total hours the agent spent in offline status
MTTR (minutes)    Mean time to recovery: average duration of offline periods before returning online
Transitions       Number of status changes (online → offline, offline → online, etc.)
Worst outage      Duration of the longest single offline period in the range

A high MTTR combined with few transitions indicates a small number of long outages that are slow to resolve. A low MTTR with many transitions suggests flapping: the agent repeatedly goes offline and recovers quickly, which may indicate an unstable endpoint.
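To make the column definitions concrete, the following Python sketch shows one way these values could be derived from a list of status intervals. It is an illustration only, not the product's implementation: the Interval type, the function names, and the flapping thresholds are all hypothetical. It assumes the intervals are time-ordered, contiguous, and already merged so that adjacent intervals have different statuses. It also counts every offline interval toward MTTR, including one still open at the end of the range.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Interval:
    """One contiguous period in a single status (hypothetical shape)."""
    status: str  # "online", "idle", or "offline"
    start: datetime
    end: datetime


def uptime_metrics(intervals):
    """Derive the table columns from time-ordered, contiguous intervals."""
    total = sum((i.end - i.start for i in intervals), timedelta())
    outages = [i.end - i.start for i in intervals if i.status == "offline"]
    downtime = sum(outages, timedelta())
    # A transition is any change of status between neighbouring intervals.
    transitions = sum(
        1 for a, b in zip(intervals, intervals[1:]) if a.status != b.status
    )
    worst = max(outages, default=timedelta())
    return {
        # Online and idle both count as "up"; only offline time is subtracted.
        "uptime_pct": 100.0 * (1 - downtime / total) if total else 0.0,
        "downtime_hours": downtime.total_seconds() / 3600,
        "mttr_minutes": (
            (downtime / len(outages)).total_seconds() / 60 if outages else 0.0
        ),
        "transitions": transitions,
        "worst_outage_minutes": worst.total_seconds() / 60,
    }


def looks_like_flapping(metrics, max_mttr_minutes=5.0, min_transitions=20):
    """Heuristic for low MTTR plus many transitions; thresholds are made up."""
    return (
        metrics["transitions"] >= min_transitions
        and metrics["mttr_minutes"] <= max_mttr_minutes
    )
```

For example, a 24-hour range containing one 60-minute outage and one 30-minute outage works out to 1.5 downtime hours, 93.75% uptime, an MTTR of 45 minutes, and a worst outage of 60 minutes.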

Expandable status timeline

Click any agent row to expand it and reveal the status timeline visualization. The timeline is a horizontal bar spanning the selected time range. Colored segments represent the time spent in each status:

  • Green segments — agent was online with recent liveness activity
  • Amber segments — agent was idle (no liveness activity for more than 15 minutes)
  • Red segments — agent was offline (endpoint unreachable, health check failed, or no activity for more than 12 hours)
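
The status rules in this list can be read as a small decision function. The Python sketch below is an assumption-laden illustration: the function name, parameter names, and the order in which the checks run are hypothetical, and only the 15-minute and 12-hour thresholds come from the descriptions above.

```python
from datetime import datetime, timedelta

IDLE_AFTER = timedelta(minutes=15)   # idle threshold from the amber rule
OFFLINE_AFTER = timedelta(hours=12)  # inactivity threshold from the red rule


def classify_status(now, last_activity, endpoint_reachable=True,
                    health_check_ok=True):
    """Map liveness signals to "online", "idle", or "offline"."""
    # Assumed ordering: a failed reachability or health check wins outright.
    if not endpoint_reachable or not health_check_ok:
        return "offline"
    inactive = now - last_activity
    if inactive > OFFLINE_AFTER:
        return "offline"
    if inactive > IDLE_AFTER:
        return "idle"
    return "online"
```

Both thresholds use a strict "more than" comparison, so in this sketch an agent whose last activity was exactly 15 minutes ago still counts as online.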

Segment widths are proportional to time. A thin red sliver indicates a brief outage; a wide red band indicates a prolonged one. Hover over any segment to see the exact start time, end time, and duration.

The timeline renders using the same time range as the rest of the analytics page. A 24-hour range shows fine-grained detail; a 30-day range compresses the timeline so short outages may appear as thin lines.
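
The effect of the selected range on segment width is simple proportional arithmetic. The Python sketch below assumes a hypothetical bar width of 960 pixels; the actual rendered width depends on your browser window and is not specified here.

```python
def segment_width_px(segment_seconds, range_seconds, bar_width_px=960):
    """Width of one timeline segment, proportional to its share of the range."""
    return bar_width_px * segment_seconds / range_seconds


FIVE_MINUTES = 5 * 60
ONE_DAY = 24 * 60 * 60
THIRTY_DAYS = 30 * ONE_DAY

# The same 5-minute outage at two different ranges.
width_24h = segment_width_px(FIVE_MINUTES, ONE_DAY)      # about 3.3 px
width_30d = segment_width_px(FIVE_MINUTES, THIRTY_DAYS)  # about 0.11 px
```

Under these assumptions a 5-minute outage is a few pixels wide on a 24-hour range but well under one pixel on a 30-day range, which is why short outages can shrink to thin lines.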

Use the search field to filter the uptime table by agent name. The filter is case-insensitive and matches partial names.
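
The matching behavior can be pictured as a case-insensitive substring test. This Python sketch is illustrative only: the function name and the agent names are made up, and treating "partial" as "substring anywhere in the name" is an assumption.

```python
def filter_agents(names, query):
    """Keep names that contain the query, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


# Hypothetical agent names, for illustration only.
matches = filter_agents(["Billing-Agent", "search-agent", "Indexer"], "AGENT")
```

Here both "Billing-Agent" and "search-agent" match the query "AGENT", while "Indexer" does not.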

CSV export

Click the CSV export button to download the uptime data for all agents. The file includes uptime percentage, downtime hours, MTTR, transition count, and worst outage duration for each agent.
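
Once downloaded, the file can be processed with any CSV tooling. The Python sketch below lists agents whose uptime falls below a threshold. The column header names and the sample rows are hypothetical placeholders, so check the header row of your own export and adjust the keys accordingly.

```python
import csv
import io

# Made-up sample in an assumed layout; real exports may use other headers.
SAMPLE = """agent,uptime_pct,downtime_hours,mttr_minutes,transitions,worst_outage_minutes
agent-a,99.5,0.12,3.6,4,5
agent-b,93.75,1.5,45,5,60
"""


def agents_below(csv_text, min_uptime_pct):
    """Return names of agents whose uptime percentage is under the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["agent"]
        for row in reader
        if float(row["uptime_pct"]) < min_uptime_pct
    ]


low_uptime = agents_below(SAMPLE, 99.0)
```

To run this against a real export, read the downloaded file's contents into a string and pass it in place of SAMPLE.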

Next

Review individual status transition events. See Status transitions →