Predictive maintenance and capital allocation
GBP 11M annual cost recovery
Recovering GBP 11M annually by shifting from reactive dispatch to probabilistic decay modeling.
Executive Context
A European telecommunications provider faced rising operating costs from reactive maintenance across a large DSL infrastructure footprint. Faults were rare but disruptive, so the business maintained excess field capacity to respond after customers were affected.
The operational constraint was uncertainty. Without a way to predict where faults would happen, labor allocation had to cover too much of the network at once.
The Actual Problem
The stated problem was fault prediction. The economic problem was inefficient risk pricing.
The organization was treating infrastructure failure as a binary classification problem. But with very low fault rates, the signal was hard to detect directly. The more useful frame was decay: how fast is this line degrading, and when does intervention become worthwhile?
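The decay framing can be made concrete with a minimal sketch. This is not the model used in the engagement. It fits a straight-line trend to a line-quality reading and extrapolates to the point where the reading crosses a service threshold. The metric (`margin_db`), the threshold value, and the function name are illustrative assumptions; the source does not say which telemetry fields or model form were used.

```python
from typing import Optional, Sequence


def days_until_threshold(
    days: Sequence[float],
    margin_db: Sequence[float],
    threshold_db: float,
) -> Optional[float]:
    """Least-squares line through (day, reading) pairs, extrapolated to a threshold.

    Returns the number of days after the last observation at which the fitted
    line reaches threshold_db, or None if the fitted trend is not declining.
    """
    n = len(days)
    if n < 2 or n != len(margin_db):
        raise ValueError("need at least two paired observations")
    mean_x = sum(days) / n
    mean_y = sum(margin_db) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("observations must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, margin_db))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        return None  # flat or improving: no decay to extrapolate
    crossing_day = (threshold_db - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical readings: the margin drops 1 dB every 10 days.
remaining = days_until_threshold([0, 10, 20, 30], [12.0, 11.0, 10.0, 9.0], 6.0)
print(f"estimated days until threshold: {remaining:.1f}")
```

A linear trend is the simplest possible decay model. Its purpose here is to show the change of question from "is it broken?" to "how long until it matters?".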
Diagnostic Approach
- Data Sufficiency: Basic signal data was not enough. The data scope was widened to include peripheral DSL and hardware telemetry.
- Methodological Fit: Extreme class imbalance made standard classification weak. Time-to-failure modeling fit the operating reality better.
- Infrastructure Readiness: Data sovereignty and on-premise constraints ruled out a simple cloud-native approach.
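To illustrate why extreme class imbalance weakens standard classification, the sketch below scores a trivial baseline that labels every observation "no fault". It uses the 1 in 2,000,000 fault rate from the outcome table and assumes, for illustration only, that the rate is counted per observation; the source does not state the unit.

```python
def always_healthy_baseline(faults: int, observations: int) -> dict:
    """Metrics for a classifier that predicts 'no fault' for every observation."""
    if not 0 < faults <= observations:
        raise ValueError("faults must be between 1 and observations")
    true_negatives = observations - faults
    return {
        "accuracy": true_negatives / observations,
        "recall": 0.0,  # it never flags a fault, so it catches none of them
    }


baseline = always_healthy_baseline(faults=1, observations=2_000_000)
print(f"accuracy={baseline['accuracy']:.7f} recall={baseline['recall']:.1f}")
```

The baseline is about 99.99995% accurate while detecting no faults at all. Accuracy-driven classification therefore has almost nothing to learn from at this base rate.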
Strategic Intervention
We shifted the workflow from break-fix response to probabilistic operations.
1. Model Time-to-Failure
Survival and decay models estimated a line-level risk profile, flagging likely failures before customer disruption.
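As a hedged illustration of time-to-failure modeling, the sketch below implements the Kaplan-Meier estimator, a standard non-parametric survival method. The source does not say which survival or decay models were used, so the estimator is a stand-in. The sample durations and the names `durations` and `observed` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float],
    observed: Sequence[bool],
) -> List[Tuple[float, float]]:
    """Kaplan-Meier survival curve.

    durations[i] is how long line i was monitored; observed[i] is True if the
    line failed at that time and False if monitoring simply ended (censored).
    Returns (time, survival probability) at each distinct failure time.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")
    failures = Counter(t for t, failed in zip(durations, observed) if failed)
    exits = Counter(durations)  # failures plus censored lines leaving at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical monitoring windows in days; False marks a line still healthy
# when observation stopped.
curve = kaplan_meier([5, 8, 8, 12, 15], [True, False, True, True, False])
for t, s in curve:
    print(f"day {t}: estimated survival {s:.2f}")
```

Censored lines still count toward the at-risk total until they leave. This is how survival methods use the large population of lines that never failed, which a binary classifier would treat as an undifferentiated negative class.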
2. Score Locally and Frequently
The scoring architecture ran on-premise and refreshed risk profiles at a high cadence without violating data residency constraints.
3. Govern Dispatch by Risk
The dispatch logic changed from geography and ticket response to a refurbishment roster ranked by predicted failure risk.
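A risk-ranked roster can be sketched in a few lines. The `LineRisk` record, its fields, and the `capacity` parameter are hypothetical names used for illustration; the source describes the ranking principle, not its implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LineRisk:
    line_id: str
    region: str
    failure_risk: float  # predicted probability of failure in the planning window


def build_roster(lines: Sequence[LineRisk], capacity: int) -> List[LineRisk]:
    """Return the `capacity` highest-risk lines, regardless of region.

    Ties on risk are broken by line_id so the roster is deterministic.
    """
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    ranked = sorted(lines, key=lambda line: (-line.failure_risk, line.line_id))
    return ranked[:capacity]


# Hypothetical scores: the two riskiest lines sit in different regions.
scored = [
    LineRisk("A-001", "north", 0.02),
    LineRisk("B-417", "south", 0.31),
    LineRisk("A-093", "north", 0.18),
    LineRisk("C-220", "east", 0.07),
]
for line in build_roster(scored, capacity=2):
    print(line.line_id, line.region, line.failure_risk)
```

The region field is carried along but does not drive the ordering. That is the change from geography-led dispatch to risk-led dispatch.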
Outcome
| Metric | Before | After | Impact |
|---|---|---|---|
| Annual cost recovery | N/A | GBP 11M | Labor optimization |
| CSAT score | 4.6 / 10 | 8.7 / 10 | Shifted from apology to advisory |
| Fault rate | 1 in 2,000,000 | 1 in 4,000,000 | Healthier subnet |
| Labor efficiency | Double shifts common | 40% reduction | Less excess standby time |
Strategic Takeaway
The value of AI came from matching the method to the physics of the operation. The useful question was not “is this line broken?” but “how quickly is this line decaying?”
When the business problem is capital allocation under uncertainty, the model must price risk, not just classify events.
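One simple way to read "pricing risk" is an expected-cost rule: intervene when the probability-weighted cost of a failure exceeds the cost of a planned visit. The sketch below assumes that rule. All cost figures are made up for illustration, and the source does not describe the actual decision threshold.

```python
def should_intervene(
    p_failure: float,
    failure_cost: float,
    intervention_cost: float,
) -> bool:
    """True when the expected cost of waiting exceeds the cost of acting now."""
    if not 0.0 <= p_failure <= 1.0:
        raise ValueError("p_failure must be a probability")
    return p_failure * failure_cost > intervention_cost


# Hypothetical costs: a reactive fault costs 1000, a planned visit costs 150.
print(should_intervene(0.20, 1000.0, 150.0))  # expected loss 200 exceeds 150
print(should_intervene(0.10, 1000.0, 150.0))  # expected loss 100 is below 150
```

A classifier answers only whether a line is flagged. A rule like this needs a calibrated probability, so it can only be applied when the model outputs risk rather than a label.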
Want to find the same kind of logic leak?
Start with a Clarity Call. We will look for the point where data, model choice, and operating decision stop matching.