Thousands of journeys are recorded every day. GPS data, tire pressure, fuel consumption, engine speed - every journey leaves a digital fingerprint. Most of it is normal. But sometimes there is something in the data stream that is not right - a journey that behaves differently from all the others. Not serious enough to trigger a red warning light. Not conspicuous enough to attract attention. And that is exactly the problem.
An experienced car mechanic hears a vehicle pull into the driveway and says, even before he has looked up: "Something sounds funny." No alarm, no fault code - just a gut feeling that has been trained over the years. This instinct for the unusual is difficult to describe, but immediately recognizable when you have it.
Anomaly detection attempts to translate precisely this intuition into a system - for thousands of journeys simultaneously, based on dozens of measured values that no human could ever have a complete overview of.
The system is not looking for obvious faults. Instead, it looks for non-obvious irregularities, anomalies, peculiarities - things that only become visible in comparison with the overall picture.
The obvious idea is to show the system what a conspicuous journey looks like and then let it search for more of the same. Sounds sensible - but it fails in practice because there are no labeled examples: no collection of journeys that someone has already classified as "conspicuous". And even if there were: if you knew in advance exactly what you were looking for, you wouldn't need a model.
This is why unsupervised machine learning is used - machine learning without predefined examples. The model is not given a classification of "good" or "bad". Instead, it independently attempts to recognize patterns in the data, deduces what is considered normal - and flags anything that clearly deviates from this.
An analogy: someone moves to a new city. After a few weeks, they have a sense of what belongs to the area - which faces they recognize, which cars are parked in front of which houses. If a strange vehicle suddenly appears at the same time every morning, it is noticeable. Not because someone has said: "Watch out for this car." But because it doesn't fit in.
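The article does not name a specific algorithm. A minimal sketch of the idea - a model that learns "normal" from unlabeled data and flags deviations - might look like this, here using an Isolation Forest from scikit-learn on synthetic data (the algorithm choice and all numbers are assumptions for illustration):

```python
# Minimal sketch of unsupervised anomaly detection. The algorithm
# (Isolation Forest) and the synthetic data are illustrative assumptions;
# the article does not specify which model is used.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic "journeys" with two features: mostly normal, a few odd ones.
normal_trips = rng.normal(loc=[7.5, 2.4], scale=[0.8, 0.3], size=(500, 2))
odd_trips = rng.normal(loc=[12.0, 1.2], scale=[0.5, 0.2], size=(5, 2))
X = np.vstack([normal_trips, odd_trips])

# No labels anywhere: the model infers on its own what "normal" looks like.
model = IsolationForest(contamination=0.01, random_state=42).fit(X)

scores = model.decision_function(X)  # lower = more anomalous
flags = model.predict(X)             # -1 = anomaly, +1 = normal
print(f"{(flags == -1).sum()} of {len(X)} journeys flagged as anomalous")
```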
A telematics system records hundreds of measured values per journey - but not all of them are equally meaningful. The art lies in feature selection: Which key figures describe the driving behavior so precisely and comparably that a model can really learn something from them? In technical jargon, a feature is a single measurable variable or factor - for example, tire pressure, braking intensity or engine speed.
Three principles guide the selection (a feature-generation sketch follows the list):

- **Comparability:** instead of absolute figures (total fuel consumed), relative values are used (fuel per 100 km). Only then can journeys of different lengths be compared meaningfully.
- **Customer focus:** the selection of features follows the customer's specific requirements and priorities - not every technically available measurement is relevant to the question at hand.
- **Statistical robustness:** for each journey, not only mean values are calculated but also percentiles - such as the median (the journey's typical value) and the 90th percentile (what holds under heavy load). Individual outliers within a journey therefore do not distort the picture.
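A hedged sketch of what such feature generation could look like with pandas; all column names (trip_id, fuel_l, dist_km, tire_fr, rpm) are hypothetical, since the article does not specify the raw data schema:

```python
# Hypothetical sketch: condensing raw per-second sensor readings into one
# feature row per journey. Column names are invented for illustration.
import pandas as pd

def p10(s: pd.Series) -> float:
    return s.quantile(0.10)

def p90(s: pd.Series) -> float:
    return s.quantile(0.90)

def journey_features(raw: pd.DataFrame) -> pd.DataFrame:
    per_trip = raw.groupby("trip_id").agg(
        fuel_l=("fuel_l", "sum"),
        dist_km=("dist_km", "sum"),
        # Robustness: percentiles instead of means alone.
        tire_fr_median=("tire_fr", "median"),
        tire_fr_p10=("tire_fr", p10),
        tire_fr_p90=("tire_fr", p90),
        rpm_median=("rpm", "median"),
        rpm_p90=("rpm", p90),
    )
    # Comparability: relative values instead of absolute totals.
    per_trip["fuel_per_100km"] = 100 * per_trip["fuel_l"] / per_trip["dist_km"]
    return per_trip.drop(columns=["fuel_l", "dist_km"])
```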
Some concrete examples:
| Feature | What it reveals |
| --- | --- |
| Tire pressure (all four tires, median/P10/P90 each) | Pressure slightly too low over the long term - no alarm, but noticeable |
| Fuel consumption (liters/100 km) | Significantly more than comparable vehicles may indicate engine problems, load or driving style |
| Braking intensity (braking events per km or per minute) | Frequent braking may indicate driving style or road conditions |
| Acceleration (forward, lateral, vertical) | Characterizes driving style and possible road damage |
| Engine speed and engine oil pressure | Values permanently outside the typical range may indicate wear |
| Time in certain gear positions | How long is the vehicle driven in neutral or reverse, for example? |
| Idle fuel consumption | Is the engine often left running unnecessarily? |
Anyone comparing a truck with a car is not measuring anything meaningful - they are generating noise. Increased fuel consumption in a heavily loaded truck is an everyday occurrence; in a small car, it would be a warning signal. Frequent braking is unavoidable in city traffic - on the highway it would be a cause for concern.
For this reason, all journeys are first divided into populations - comparable groups. The overall process consists of five steps (a code sketch follows the list):
1. **Feature generation:** a structured data set is created from the raw sensor data for each journey, with all relevant key figures.
2. **Division into populations:** the journeys are grouped by vehicle type and route length. The features are then re-evaluated within each population: features that are missing in all journeys of a group, or that have the same value everywhere, are removed - they carry no information and would only confuse the model.
3. **Training per population:** a separate model is trained for each group and learns what counts as "normal" within it.
4. **Anomaly score:** each journey receives a score indicating how strongly it deviates from the normal state. The most conspicuous journeys - their proportion is configurable - are marked as anomalies.
5. **Downstream analysis:** the results are visualized and explained.
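Steps 2 to 4 could look roughly like this - a sketch under assumptions: the population keys (vehicle_type, route_class), the 2% anomaly share and the Isolation Forest itself are illustrative choices, since the article only states that the proportion is configurable:

```python
# Hedged sketch of steps 2-4: per-population training and scoring.
# Population keys, feature names, the model and the anomaly share are
# assumptions; the article only says the flagged proportion is configurable.
import pandas as pd
from sklearn.ensemble import IsolationForest

ANOMALY_SHARE = 0.02  # configurable proportion of flagged journeys

def detect_anomalies(features: pd.DataFrame) -> pd.DataFrame:
    results = []
    # Step 2: group journeys into populations (vehicle type x route length).
    for _, pop in features.groupby(["vehicle_type", "route_class"]):
        X = pop.drop(columns=["vehicle_type", "route_class"])
        # Remove features that carry no information within this population:
        # all-missing columns and columns with the same value everywhere.
        X = X.dropna(axis=1, how="all")
        X = X.loc[:, X.nunique() > 1]
        X = X.fillna(X.median())
        # Step 3: one model per population learns this group's "normal".
        model = IsolationForest(contamination=ANOMALY_SHARE, random_state=0)
        model.fit(X)
        # Step 4: score every journey; flag the most conspicuous ones.
        pop = pop.assign(
            anomaly_score=-model.decision_function(X),  # higher = more unusual
            is_anomaly=model.predict(X) == -1,
        )
        results.append(pop)
    return pd.concat(results)
```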
A journey has dozens of features. That cannot be represented in a single graphic - at least not directly. A special visualization method solves this problem: it takes all feature values and condenses them into two coordinates. Details are inevitably lost in the process - that is the price of simplification. What is retained are the local similarities: journeys that resemble each other end up close to each other. That is enough for the purpose here: a first, intuitive picture of the data's structure.
Points that sit far away from the main cloud are often exactly the journeys that the model has marked as anomalies - a match that frequently shows up in practice, but is not guaranteed.
What makes this display so valuable is that you can see at a glance whether there are isolated outliers or whether an entire group of journeys behaves conspicuously differently. And you can see it without knowing a single figure.
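The article does not name the projection method; t-SNE (bundled with scikit-learn) is one widely used technique that preserves exactly these local similarities, UMAP another. A sketch under that assumption:

```python
# Sketch: projecting the many-dimensional per-journey features to 2D.
# t-SNE is an assumed choice; the article does not name the method.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

def plot_trips_2d(X: np.ndarray, is_anomaly: np.ndarray) -> None:
    """X: feature matrix; is_anomaly: boolean flag per journey."""
    # Scale first so no single feature dominates the distance computation.
    X_scaled = StandardScaler().fit_transform(X)
    coords = TSNE(n_components=2, random_state=0).fit_transform(X_scaled)

    plt.scatter(coords[~is_anomaly, 0], coords[~is_anomaly, 1],
                s=8, alpha=0.5, label="normal")
    plt.scatter(coords[is_anomaly, 0], coords[is_anomaly, 1],
                s=30, color="red", label="anomaly")
    plt.legend()
    plt.title("Journeys projected to 2D: similar journeys land close together")
    plt.show()
```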
A model that only says "this journey is conspicuous" is only half useful. The really interesting question is: Why?
An explainability procedure breaks down which features contributed to a decision - and to what extent. There are two levels of consideration:
At the level of all journeys in a population, the procedure shows which features have the strongest overall influence on anomaly detection. For trucks on long journeys, for example, these are fuel consumption, tire pressure and engine oil pressure.
This is not only useful for understanding the model - it is also a way of checking whether the model is reacting to the right things. If it reacts to something unexpected, this is an indication that something in the process needs to be adjusted.
For a single conspicuous journey, the process specifically shows which features were unusual for this particular journey. A so-called waterfall diagram makes this visible: the tire pressure at the rear right was significantly below the median of the group. The braking intensity was unusually high. Fuel consumption, on the other hand, was within the normal range.
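The article does not name the explainability tool, but the waterfall diagram it describes matches what the shap library produces. A sketch under that assumption, where `model` and `X` stand for one population's trained model and feature matrix from the pipeline sketch above:

```python
# Hedged sketch of both explainability views with the shap library
# (assumed; the article does not name the tool). `model` and `X` come
# from the per-population training step above; X should be a DataFrame
# so the plots can show feature names.
import shap

# Explain the continuous anomaly score rather than the -1/+1 label.
explainer = shap.Explainer(model.decision_function, X)
shap_values = explainer(X)

# Global view: which features drive anomaly detection in this population?
shap.plots.bar(shap_values)

# Local view: why was this one journey flagged? The waterfall diagram
# shows how each feature pushed the score away from the group's baseline.
most_unusual = model.decision_function(X).argmin()
shap.plots.waterfall(shap_values[most_unusual])
```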
This enables targeted measures to be taken - tire pressure checks, driver coaching, workshop appointments - instead of looking helplessly at an anomaly score.
Models learn from data. But data does not explain itself.
Technical experts - fleet managers, experienced dispatchers - know things that are not in any data set: why a certain vehicle structurally needs a higher tire pressure, why a short journey with high fuel consumption on a certain route is completely normal, which measured values are distorted in which situations. Without this knowledge, the model remains blind to correlations that everyone in the company is aware of.
Specifically, technical experts play a decisive role in three areas: the selection of features, the definition of population boundaries, and the calibration of the threshold values.
The interplay between visualization and explainability proved particularly informative: experts were able to look at individual conspicuous journeys and understand why the model had flagged them. Sometimes the result was confirmation: "Yes, that is indeed conspicuous." But at least as often: "No, that's completely normal for this vehicle - the model is wrong here."
The latter sounds like a setback. It is not. It was precisely this feedback that was incorporated into the next iteration - by adjusting the selection of features, changing the population limits and recalibrating the threshold values. With each round, the models became more accurate. Not through more data. Through more understanding.
Anomaly detection is not a panacea - nor is it the end of the story. It is a tool that makes visible what would otherwise remain hidden: non-obvious irregularities, anomalies, peculiarities - in the stream of thousands of inconspicuous journeys.
What is important here is that an anomaly is neither "good" nor "bad". It is unusual. What this means in individual cases is decided by the person - not the model.
And in most cases, anomaly detection is just the beginning. The marked journeys become the basis for the next step: supervised machine learning, in which the model learns not only to recognize the unusual, but also to classify it into specific categories and derive recommendations for action. Anomaly detection - including its own explainability - is just one building block in a larger development project.
The invisible becomes visible - and the journey continues.