Mean time to remediate (MTTR) is the security metric that most teams track and few understand well. The number — average days from finding detected to finding closed — looks simple. The interpretation is harder than it appears, and the levers that actually move it are different from what most security programs focus on.
I've seen programs reduce their MTTR from 47 days to 19 days without becoming meaningfully more secure, and I've seen programs where MTTR held flat but actual risk exposure dropped significantly. The metric is useful when you measure and segment it correctly. Measured naively, it's a number that looks good on a dashboard while the real prioritization problem goes unsolved.
This post is about how to measure MTTR in a way that's actually diagnostic, what drives it down in a way that matters, and where optimizing for MTTR will mislead you.
The Naive MTTR Calculation and Why It Lies
Simple MTTR: take all findings closed in a given period, sum the number of days each took to close (from first-seen to closed), divide by count. Report the result monthly.
The problem with this calculation is the denominator. If your security program closes 200 low-severity findings in a month and 5 critical findings, your MTTR averages across all 205. The low-severity findings are typically closed fast (they're easy, engineers don't push back, patch cycles handle them automatically). The critical findings may take much longer and represent almost all of your actual risk. If the low-severity findings average 12 days and the critical findings average 38, the blended MTTR comes out at about 12.6 days: a number that looks reasonable while hiding the fact that your critical MTTR is more than three times your low MTTR.
The median-not-mean problem compounds this. A handful of very old findings that finally get closed in a given period will inflate the average dramatically. In a month with 50 to 80 closures, two findings that sat for 400 days in the backlog before finally getting resolved can push your average up by 10-15 days even if every other finding was closed promptly. The median barely moves in the same situation, which is why it belongs next to the mean in any report.
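To make the outlier effect concrete, here is a minimal sketch using a made-up month: 60 findings closed in 15 days each, plus two backlog findings closed after 400 days. The counts and durations are assumptions chosen for illustration, not data from a real program.

```python
from statistics import mean, median

# Hypothetical month: 60 findings closed promptly (15 days each),
# plus two backlog findings that sat for 400 days before closure.
prompt_closures = [15] * 60
with_backlog = prompt_closures + [400, 400]

# How far each statistic moves once the two old findings are included.
mean_shift = mean(with_backlog) - mean(prompt_closures)
median_shift = median(with_backlog) - median(prompt_closures)

print(f"mean shift:   {mean_shift:.1f} days")
print(f"median shift: {median_shift:.1f} days")
```

The mean moves by roughly 12 days while the median does not move at all. The size of the mean shift depends heavily on how many findings closed that month, so the same two outliers matter much less in a high-volume month.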
Correct MTTR measurement requires segmentation: report separate MTTR numbers for each severity tier and for each business-context category. Critical MTTR, high MTTR, and medium MTTR should be tracked as separate metrics with separate trend lines. The critical MTTR is the one that matters for security program effectiveness; the others matter for compliance reporting but shouldn't dilute your operational picture.
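A minimal sketch of severity-segmented reporting might look like the following. The `Finding` record, its field names, and the sample dates are all hypothetical; a real export from your scanner or ticketing system will have its own schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean, median


@dataclass
class Finding:
    # Hypothetical record shape; adapt the fields to your own export.
    severity: str      # e.g. "critical", "high", "medium", "low"
    first_seen: date
    closed: date

    @property
    def days_to_close(self) -> int:
        return (self.closed - self.first_seen).days


def mttr_by_severity(findings):
    """Return {severity: {"count", "mean", "median"}} for closed findings."""
    buckets = defaultdict(list)
    for f in findings:
        buckets[f.severity].append(f.days_to_close)
    return {
        sev: {"count": len(days), "mean": mean(days), "median": median(days)}
        for sev, days in buckets.items()
    }


# Illustrative data only.
findings = [
    Finding("critical", date(2024, 1, 2), date(2024, 2, 9)),   # 38 days
    Finding("critical", date(2024, 1, 5), date(2024, 2, 12)),  # 38 days
    Finding("low", date(2024, 1, 10), date(2024, 1, 22)),      # 12 days
    Finding("low", date(2024, 1, 11), date(2024, 1, 23)),      # 12 days
    Finding("low", date(2024, 1, 12), date(2024, 1, 24)),      # 12 days
]

report = mttr_by_severity(findings)
for severity, stats in report.items():
    print(severity, stats)
```

Reporting the count alongside the mean and median matters: a tier with only a handful of closures in a period produces a noisy number, and the reader should be able to see that.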
The Phase Decomposition: Where Time Is Actually Spent
Total MTTR is a sum of phases. If you want to reduce it, you need to know which phase is consuming the most time. The phases:
Detection-to-triage lag: time from scanner first-sees-finding to security team acknowledges it and assigns it. In programs without automated triage, this phase alone can be 5-15 days for non-critical findings that sit in the queue waiting for a weekly review cycle.
Triage-to-ticket lag: time from security acknowledgment to remediation ticket created and assigned to an engineer. If the security team is doing manual ticket creation from a spreadsheet, this is often 2-7 days. Automated ticket generation eliminates this phase almost entirely — it's the single highest-leverage reduction in MTTR that doesn't require any engineering effort.
Ticket-to-start lag: time from ticket assigned to engineer starts working on it. This is queue depth plus prioritization clarity. If an engineer receives a ticket for a CVE alongside 40 other tickets and no clear signal of which to work on first, this phase is unpredictable. If the ticket arrives ranked, with a clear "this is your #1 security item this sprint," it gets started faster.
Active remediation time: time from work start to fix complete. This is mostly outside security's control — it's determined by patch availability, deployment process, change management requirements, and technical complexity. For most CVEs with available patches, active remediation time is hours, not days.
Verification-to-close lag: time from engineer says it's fixed to scanner confirms it's fixed and the finding is officially closed. Scan cadence is the primary driver — daily scans close this phase fast, weekly scans add up to 7 days automatically.
A typical MTTR of 40 days in a program without automated workflows breaks down roughly as: 10 days detection-to-triage, 5 days triage-to-ticket, 12 days ticket-to-start, 3 days active remediation, 10 days verification-to-close. If you focus on "making engineers fix things faster" (active remediation time), you might save 1-2 days. If you focus on the first three phases — all of which are tooling and process problems, not engineering speed problems — you can save 20+ days.
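If you record a timestamp at each handoff, the phase breakdown is just the gap between consecutive timestamps. The sketch below assumes six hypothetical lifecycle fields per finding (`first_seen`, `triaged`, `ticketed`, `work_started`, `fix_reported`, `closed`); those names are mine, and most programs will need to join scanner and ticketing data to get them.

```python
from datetime import date
from statistics import mean

# Each phase is the gap between two consecutive lifecycle timestamps.
# The field names are hypothetical; map them to whatever your scanner
# and ticketing system actually export.
PHASES = [
    ("detection_to_triage", "first_seen", "triaged"),
    ("triage_to_ticket", "triaged", "ticketed"),
    ("ticket_to_start", "ticketed", "work_started"),
    ("active_remediation", "work_started", "fix_reported"),
    ("verification_to_close", "fix_reported", "closed"),
]


def phase_days(finding):
    """Days spent in each phase for a single finding (a dict of dates)."""
    return {name: (finding[end] - finding[start]).days
            for name, start, end in PHASES}


def mean_phase_days(findings):
    """Average days per phase across all findings."""
    per_finding = [phase_days(f) for f in findings]
    return {name: mean(p[name] for p in per_finding) for name, _, _ in PHASES}


# One illustrative finding that follows the 40-day breakdown above.
example = {
    "first_seen": date(2024, 3, 1),
    "triaged": date(2024, 3, 11),
    "ticketed": date(2024, 3, 16),
    "work_started": date(2024, 3, 28),
    "fix_reported": date(2024, 3, 31),
    "closed": date(2024, 4, 10),
}

breakdown = mean_phase_days([example])
slowest = max(breakdown, key=breakdown.get)
print(breakdown)
print("slowest phase:", slowest)
```

Sorting phases by average duration, per severity tier, tells you where to spend effort before anyone asks engineers to patch faster.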
A Concrete Before/After
Consider a scenario: a 3-person security team at a B2B software company managing about 2,400 assets across two AWS accounts and an on-prem datacenter. Their initial critical MTTR was 52 days. The phase breakdown was: 14 days detection-to-triage (weekly review process), 7 days triage-to-ticket (manual spreadsheet-to-Jira workflow), 18 days ticket-to-start (flat queue, no priority signal), 4 days active remediation, 9 days verification-to-close (weekly scan cadence).
The changes made: automated ticket generation (eliminated 7 days from triage-to-ticket), priority-ranked ticket queue with Vendrsec Risk Score visible in each ticket description (ticket-to-start improved from 18 days to 7), scan cadence increased from weekly to daily for critical-asset groups (verification-to-close from 9 to 3 days). No change to active remediation or detection-to-triage.
New critical MTTR: 28 days (14 + 0 + 7 + 4 + 3). The 24-day improvement came entirely from tooling changes, not from asking engineers to move faster or adding security headcount. The active remediation time, the thing most programs try to optimize, was never the constraint.
The Risk-Weighted MTTR
Plain MTTR treats all findings as equal in the denominator. Risk-weighted MTTR treats days-open on high-criticality findings as more expensive than days-open on low-criticality findings.
Risk-weighted MTTR = (sum of days_open × finding_risk_weight for all closed findings) / (sum of finding_risk_weight for all closed findings).
Using the same finding risk weight as the posture drift calculation (CVSS × asset criticality × reachability × exploit status), this gives a metric that improves faster when you close high-risk findings first and slower when you close low-risk findings first. It's a direct measure of whether your prioritization is actually putting the riskiest work first or just processing tickets in arrival order.
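Here is a small sketch of the formula, with the risk weight treated as an input that is already computed. The weights and day counts are invented for illustration: one high-risk finding (weight 9) and three low-risk findings (weight 1 each), closed in two different orders.

```python
def risk_weighted_mttr(findings):
    """findings: iterable of (days_open, risk_weight) pairs for closed findings."""
    findings = list(findings)
    total_weight = sum(weight for _, weight in findings)
    if total_weight == 0:
        raise ValueError("total risk weight is zero")
    return sum(days * weight for days, weight in findings) / total_weight


def plain_mttr(findings):
    """Unweighted mean of days_open, for comparison."""
    findings = list(findings)
    return sum(days for days, _ in findings) / len(findings)


# Same four findings, same set of closure times, different ordering.
# Weights are illustrative, not output from a real scoring model.
risk_first = [(5, 9), (30, 1), (30, 1), (30, 1)]     # high-risk closed in 5 days
arrival_order = [(30, 9), (5, 1), (30, 1), (30, 1)]  # high-risk closed in 30 days

print(plain_mttr(risk_first), plain_mttr(arrival_order))
print(risk_weighted_mttr(risk_first), risk_weighted_mttr(arrival_order))
```

Plain MTTR is identical in both scenarios (23.75 days), because the set of closure times is the same. The risk-weighted number is 11.25 days when the high-risk finding goes first and about 27.9 days when it waits, which is exactly the difference the plain average cannot see.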
We're not saying risk-weighted MTTR is a replacement for plain severity-segmented MTTR. They answer different questions. Plain MTTR by severity tier answers "how fast are we processing Critical versus High findings?" Risk-weighted MTTR answers "are we doing the important work first?" Both are useful. Neither alone is sufficient.
Where MTTR Optimization Goes Wrong
The gaming risk with MTTR is significant. Programs that are measured on MTTR will find ways to reduce it that don't reflect actual security improvement.
The most common gaming pattern is closing-then-reopening. A finding is closed when the engineer says the patch is applied, before scanner verification confirms it. When the scanner runs and finds it still present (patch applied incorrectly, to the wrong host, or to one of two affected hosts), it reopens. The MTTR for the original finding looks great. The reopened finding starts a new timer. Aggregate MTTR goes down; actual remediation rate is unchanged.
The second pattern is accepted-risk inflation. Findings that are hard to close get moved to "accepted risk" status, which removes them from MTTR calculation. MTTR drops because the difficult findings are no longer in the denominator. The accepted-risk queue grows quietly in the background.
The defense against gaming isn't more rules. It's measuring the right companion metrics alongside MTTR. Track accepted-risk queue size over time. Track findings that are closed and then reopened within 14 days. Track the ratio of accepted-risk findings to total findings by severity tier. If MTTR is improving while the accepted-risk queue is growing, someone is optimizing the metric rather than the outcome.
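Two of these companion metrics are easy to sketch. The data shapes below (dicts keyed by finding ID, a status string of `"accepted_risk"`) and the sample IDs are assumptions for illustration; adapt them to however your tracker represents state changes.

```python
from datetime import date, timedelta

REOPEN_WINDOW = timedelta(days=14)


def reopened_within_window(closures, reopens, window=REOPEN_WINDOW):
    """closures, reopens: {finding_id: date}.

    Return the sorted IDs of findings reopened within `window` of closure.
    """
    return sorted(
        fid for fid, closed_on in closures.items()
        if fid in reopens
        and timedelta(0) <= reopens[fid] - closed_on <= window
    )


def accepted_risk_ratio(findings):
    """findings: iterable of (severity, status). Return {severity: ratio}."""
    totals, accepted = {}, {}
    for severity, status in findings:
        totals[severity] = totals.get(severity, 0) + 1
        if status == "accepted_risk":
            accepted[severity] = accepted.get(severity, 0) + 1
    return {sev: accepted.get(sev, 0) / n for sev, n in totals.items()}


# Illustrative data with hypothetical finding IDs.
closures = {"F-1": date(2024, 5, 1), "F-2": date(2024, 5, 3), "F-3": date(2024, 5, 6)}
reopens = {"F-1": date(2024, 5, 8), "F-3": date(2024, 6, 30)}
flapping = reopened_within_window(closures, reopens)

statuses = [
    ("critical", "open"), ("critical", "accepted_risk"),
    ("critical", "closed"), ("critical", "accepted_risk"),
    ("high", "closed"),
]
ratios = accepted_risk_ratio(statuses)

print("reopened within 14 days:", flapping)
print("accepted-risk ratio by severity:", ratios)
```

Plot both next to the MTTR trend line. A falling MTTR with a rising reopen count or a rising critical accepted-risk ratio is the pattern worth investigating.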
MTTR is a proxy for exposure duration. A shorter MTTR on the highest-risk findings means attackers have less time to exploit a known window. That's the goal. The metric is worth tracking, worth reporting, and worth optimizing — as long as you keep the goal in view rather than the number.