Executive Context

A European telecommunications provider faced rising operating costs from reactive maintenance across a large DSL infrastructure footprint. Faults were rare but disruptive, so the business maintained excess field capacity to respond after customers were affected.

The operational constraint was uncertainty. Without a way to predict where faults would happen, labor allocation had to cover too much of the network at once.

The Actual Problem

The stated problem was fault prediction. The economic problem was inefficient risk pricing.

The organization was treating infrastructure failure as a binary classification problem. But with very low fault rates, the signal was hard to detect directly. The more useful frame was decay: how fast is this line degrading, and when does intervention become worthwhile?
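The question of when intervention becomes worthwhile can be illustrated with a simple expected-cost comparison. This is a minimal sketch, not the engagement's actual pricing logic: the function names and the cost figures are hypothetical, and it assumes a planned visit fully prevents the fault.

```python
def break_even_probability(cost_reactive: float, cost_planned: float) -> float:
    """Failure probability above which a planned visit is cheaper than waiting.

    Assumption: a planned visit fully prevents the fault, so waiting costs
    p_fail * cost_reactive in expectation and acting costs cost_planned.
    """
    if cost_reactive <= 0:
        raise ValueError("cost_reactive must be positive")
    return cost_planned / cost_reactive


def intervention_worthwhile(p_fail: float, cost_reactive: float, cost_planned: float) -> bool:
    """True when the expected cost of waiting exceeds the cost of acting now."""
    return p_fail * cost_reactive > cost_planned


# Hypothetical costs: a reactive repair costs four times a planned visit.
threshold = break_even_probability(cost_reactive=400.0, cost_planned=100.0)
print(threshold)                                   # 0.25
print(intervention_worthwhile(0.30, 400.0, 100.0)) # True
print(intervention_worthwhile(0.20, 400.0, 100.0)) # False
```

Under these assumed costs, a line is worth visiting once its predicted failure probability over the planning window passes 25%.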

Diagnostic Approach

Data Sufficiency: Basic signal data was not enough. The analysis widened to include peripheral DSL and hardware telemetry.
Methodological Fit: Extreme class imbalance made standard classification weak. Time-to-failure modeling fit the operating reality better.
Infrastructure Readiness: Data sovereignty and on-premise constraints ruled out a simple cloud-native approach.
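The weakness of standard classification under extreme imbalance can be shown with a short calculation. The sketch below assumes a fault rate of 1 in 2,000,000 observations, the pre-intervention figure from the Outcome table, and treats the unit of observation as unspecified; the function name is hypothetical.

```python
def always_healthy_metrics(n_observations: int, n_faults: int) -> tuple[float, float]:
    """Accuracy and recall of a trivial model that never predicts a fault."""
    if n_observations <= 0 or not 0 < n_faults <= n_observations:
        raise ValueError("need at least one fault and a positive observation count")
    accuracy = (n_observations - n_faults) / n_observations
    recall = 0.0  # the model catches none of the real faults
    return accuracy, recall


# Assumed rate: 1 fault per 2,000,000 observations.
accuracy, recall = always_healthy_metrics(2_000_000, 1)
print(accuracy)  # 0.9999995
print(recall)    # 0.0
```

A model that never flags anything scores near-perfect accuracy while being operationally useless, which is why a time-to-failure framing was a better fit.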

Strategic Intervention

We shifted the workflow from break-fix response to probabilistic operations.

1. Model Time-to-Failure

Survival and decay models estimated a line-level risk profile, flagging likely failures before customer disruption.
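As an illustration of the time-to-failure idea, the sketch below fits the simplest survival model, a constant hazard rate with right-censored observations, and converts it into a failure probability over a planning horizon. The source does not name the specific models used, so this is an assumption chosen for brevity; the data and function names are hypothetical, and a line-level model would also condition the hazard on telemetry.

```python
import math


def fit_exponential_hazard(durations: list[float], failed: list[bool]) -> float:
    """Maximum-likelihood constant hazard from right-censored observations.

    durations: days each line was observed.
    failed:    True if observation ended in a fault, False if censored
               (the line was still healthy when observation stopped).
    """
    if len(durations) != len(failed):
        raise ValueError("durations and failed must have the same length")
    exposure = sum(durations)
    if exposure <= 0:
        raise ValueError("total observed time must be positive")
    events = sum(1 for f in failed if f)
    return events / exposure


def failure_probability(hazard: float, horizon_days: float) -> float:
    """P(failure within horizon) = 1 - S(horizon), with S(t) = exp(-hazard * t)."""
    return 1.0 - math.exp(-hazard * horizon_days)


# Hypothetical observations: two faults, three lines censored while healthy.
durations = [120.0, 300.0, 45.0, 365.0, 365.0]
failed = [True, False, True, False, False]

hazard = fit_exponential_hazard(durations, failed)
risk_30d = failure_probability(hazard, horizon_days=30.0)
print(f"hazard per day: {hazard:.6f}, 30-day failure risk: {risk_30d:.3f}")
```

Censored lines still contribute observed time to the denominator, which is what lets a survival model learn from the large majority of lines that never fail.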

2. Score Locally and Frequently

The scoring architecture ran on-premise and refreshed risk profiles at a high cadence without violating data residency constraints.

3. Govern Dispatch by Risk

The dispatch logic changed from geography and ticket response to a refurbishment roster ranked by predicted failure risk.
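A risk-ranked roster can be sketched as a sort followed by a capacity cut. The record fields, identifiers, and crew capacity below are hypothetical; the actual dispatch rules are not described in detail here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineRisk:
    line_id: str
    region: str
    failure_risk: float  # predicted probability of failure in the planning window


def build_roster(lines: list[LineRisk], capacity: int) -> list[LineRisk]:
    """Return the highest-risk lines, up to the number of visits crews can make."""
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    ranked = sorted(lines, key=lambda line: line.failure_risk, reverse=True)
    return ranked[:capacity]


# Hypothetical scored lines across two regions.
scored = [
    LineRisk("line-a", "north", 0.02),
    LineRisk("line-b", "south", 0.41),
    LineRisk("line-c", "north", 0.17),
    LineRisk("line-d", "south", 0.08),
]

roster = build_roster(scored, capacity=2)
print([line.line_id for line in roster])  # ['line-b', 'line-c']
```

Region is carried on each record but does not drive the ordering, so crews go where predicted risk is highest rather than where tickets happen to cluster.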

Outcome

Metric	Before	After	Impact
Annual cost recovery	N/A	GBP 11M	Labor optimization
CSAT score	4.6 / 10	8.7 / 10	Shifted from apology to advisory
Fault rate	1 in 2,000,000	1 in 4,000,000	Healthier subnet
Labor efficiency	Double shifts common	40% reduction	Less excess standby time

Strategic Takeaway

AI value came from mapping the method to operational physics. The useful question was not “is this line broken?” but “how quickly is this line decaying?”

When the business problem is capital allocation under uncertainty, the model must price risk, not just classify events.

Predictive maintenance and capital allocation
