SAP APM Visual Inspection: Edge AI Inferencing Via...

This article walks through a practical example of running AI/ML inference directly on an edge device — a thermal analytics use case for visual inspection. The goal is to show how the edge inferencing approach actually works in practice, end to end, from capturing a thermal image to triggering a downstream action in SAP Asset Performance Management (APM) Visual Inspection.

For broader context on how AI/ML models are trained, managed, and deployed on Cumulocity (covering both cloud and edge), readers can refer to the foundation article: Bringing AI/ML to Cumulocity — Cloud and Edge Inferencing.

The scenario is straightforward: a thermal infrared camera is connected to Cumulocity via thin-edge.io. The camera continuously captures thermal images of equipment, and the goal is to detect hotspots — regions where the temperature exceeds a defined threshold.

When a hotspot is detected, an event should be raised in Cumulocity with the relevant details (bounding box, temperature values, annotated image), and the same information should also be sent to a downstream service (in this example, SAP APM Visual Inspection) for further processing.

The advantage of this integration is that the inspection results land directly in the hands of field workers and inspection engineers, where they can be reviewed, validated, and acted upon as part of the existing maintenance process. Instead of AI insights staying isolated in a separate system, they become part of the standard inspection workflow — closing the gap between automated detection and real-world action.

This is a classic computer vision workflow, and a good fit for edge inferencing. The thermal images are large, the analysis needs to happen close to the source, and only the meaningful results need to be sent upstream.

The demo uses a thermal infrared camera, such as the Optris Xi 410 or any similar radiometric camera, with a Raspberry Pi running thin-edge.io as the edge device. The captured thermal images are analyzed locally on the Raspberry Pi by an AI model.

Modern thermal cameras produce radiometric images — images that not only show temperature variation visually but also contain per-pixel temperature data. This makes them especially well-suited for ML-based hotspot detection.

For this demo, synthetic radiometric images generated by a simulator are used as input. Eight thermal frames are fed into the pipeline — most show normal temperature variation, while the last two contain regions of significantly higher temperature. The data format matches exactly what a typical thermal camera SDK would produce, so the pipeline is camera-ready and can be connected to a real camera with minimal changes.
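To make the input concrete, the following is a minimal sketch of how such a simulated sequence could be produced. It is not the simulator used in the demo: the frame size, the temperature values, and the function names `make_frame` and `make_sequence` are assumptions chosen for illustration. Each frame is a grid of per-pixel temperatures in degrees Celsius, which is the essential property of a radiometric image.

```python
import random

# All constants below are illustrative assumptions, not values from the demo.
WIDTH, HEIGHT = 32, 24   # frame size in pixels
AMBIENT_C = 35.0         # baseline equipment temperature
HOTSPOT_C = 95.0         # temperature inside a simulated hotspot


def make_frame(rng, hotspot=None):
    """Return a HEIGHT x WIDTH grid of per-pixel temperatures in deg C."""
    frame = [[AMBIENT_C + rng.uniform(-2.0, 2.0) for _ in range(WIDTH)]
             for _ in range(HEIGHT)]
    if hotspot is not None:
        x0, y0, x1, y1 = hotspot  # inclusive pixel bounds of the hot region
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                frame[y][x] = HOTSPOT_C + rng.uniform(-1.0, 1.0)
    return frame


def make_sequence(seed=0):
    """Eight frames: six with normal variation, then two with a hot region."""
    rng = random.Random(seed)
    frames = [make_frame(rng) for _ in range(6)]
    frames += [make_frame(rng, hotspot=(10, 8, 14, 11)) for _ in range(2)]
    return frames


frames = make_sequence()
# Peak temperature per frame; only the last two should stand out.
peaks = [max(max(row) for row in frame) for frame in frames]
```

Because the frames are plain grids of temperatures, swapping the simulator for a camera SDK only changes where the grid comes from, not what the rest of the pipeline sees.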

For edge inferencing, a generic ONNX Pipeline Runner is used in combination with Cumulocity. The runner is an open-source solution and will be made available through a GitHub repository. It is packaged as a standard Debian package, installed once on the device through Cumulocity's Software Repository, and then it acts as a reusable inference engine on the edge.

The runner does not know anything about thermal imaging, anomaly detection, or any other use case. It simply executes a three-step pipeline — pre-processing, ONNX inference, post-processing — and expects these three components as configuration files on the device. These files are managed centrally through Cumulocity's Configuration Repository.
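The idea of a use-case-agnostic runner can be sketched in a few lines. This is an illustration of the pattern only, not the actual runner's code: the `Pipeline` class and its field names are hypothetical, and the inference step is a plain Python callable standing in for an ONNX model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pipeline:
    """Hypothetical three-step pipeline; the runner knows nothing else."""
    preprocess: Callable[[Any], Any]   # raw input -> model input
    infer: Callable[[Any], Any]        # model input -> model output
    postprocess: Callable[[Any], Any]  # model output -> result or action

    def run(self, raw: Any) -> Any:
        return self.postprocess(self.infer(self.preprocess(raw)))


# A toy configuration: parse readings, take the peak, compare to a limit.
toy = Pipeline(
    preprocess=lambda raw: [float(value) for value in raw],
    infer=lambda values: max(values),
    postprocess=lambda peak: {"alert": peak > 80.0, "peak": peak},
)

result = toy.run(["35.5", "91.0", "36.2"])
```

The runner only depends on the shape of the three steps. Which sensor is read, which model is executed, and what happens with the result are all decided by the components that are plugged in.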

For this thermal analytics use case, the three components look like this:

Pre-processing — Reads the thermal frame (a radiometric array) and prepares it as input for the ONNX model. When connecting a real camera, this script is simply updated to read from the camera SDK instead of the simulated files.
ONNX inference — A lightweight hotspot detector runs on the input frame. It scans for regions where the temperature exceeds the defined threshold and outputs a heat grid indicating where the hotspots are.
Post-processing — Interprets the model output. If temperatures are within range, nothing happens. If the threshold is exceeded, the script generates an event in Cumulocity with the bounding box, temperature values, and an annotated image.
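The inference and post-processing steps can be illustrated with a small sketch. Note the assumptions: the `heat_grid` function below is a plain Python stand-in for the ONNX model, not the model itself, and the threshold, the cell size, and the result field names are made up for this example. The frame is divided into cells, each cell is flagged when its peak temperature exceeds the threshold, and the flagged cells are turned into a bounding box.

```python
THRESHOLD_C = 80.0  # assumed temperature threshold in deg C
CELL = 4            # assumed heat grid cell size in pixels


def heat_grid(frame, threshold=THRESHOLD_C, cell=CELL):
    """Stand-in for the model: 1 where a cell's peak exceeds the threshold."""
    rows, cols = len(frame), len(frame[0])
    grid = []
    for gy in range(0, rows, cell):
        grid_row = []
        for gx in range(0, cols, cell):
            peak = max(frame[y][x]
                       for y in range(gy, min(gy + cell, rows))
                       for x in range(gx, min(gx + cell, cols)))
            grid_row.append(1 if peak > threshold else 0)
        grid.append(grid_row)
    return grid


def postprocess(frame, grid, cell=CELL):
    """Return None if nothing is hot, else a bounding box and peak value."""
    hot = [(gx, gy) for gy, row in enumerate(grid)
           for gx, flag in enumerate(row) if flag]
    if not hot:
        return None
    rows, cols = len(frame), len(frame[0])
    # Bounding box in pixel coordinates, at heat grid cell resolution.
    x0 = min(gx for gx, _ in hot) * cell
    y0 = min(gy for _, gy in hot) * cell
    x1 = min((max(gx for gx, _ in hot) + 1) * cell, cols) - 1
    y1 = min((max(gy for _, gy in hot) + 1) * cell, rows) - 1
    peak = max(frame[y][x]
               for y in range(y0, y1 + 1) for x in range(x0, x1 + 1))
    return {"bbox": [x0, y0, x1, y1], "max_temp_c": peak}


# An 8x8 frame at 30 deg C with a 2x2 patch at 92.5 deg C.
frame = [[30.0] * 8 for _ in range(8)]
for y in (4, 5):
    for x in (5, 6):
        frame[y][x] = 92.5

detection = postprocess(frame, heat_grid(frame))
```

In this sketch the bounding box snaps to the heat grid cells, so it is slightly larger than the hot patch itself. A finer grid gives a tighter box at the cost of more cells to evaluate.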

A Cumulocity microservice then forwards the event to SAP APM Visual Inspection as a REST call with the corresponding JSON payload.

Once the pipeline is deployed, here is how the end-to-end flow looks in action.

Step 1 — Inference on the edge

The runner picks up the thermal frame, runs it through the ONNX model, and identifies hotspots locally on the device. No data leaves the edge unless something meaningful is detected.

Step 2 — Event raised in Cumulocity

When the temperature threshold is exceeded, the post-processing step generates an event in Cumulocity. The event carries the bounding box, temperature values, and the annotated thermal image.
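As a rough illustration, an event body for such a detection could be assembled as shown below. The `source`, `type`, `text`, and `time` fields follow the general shape of a Cumulocity event. The event type name, the `thermal_Hotspot` fragment and its fields, and the `build_event` helper are assumptions for this sketch and are not taken from the demo. The annotated image is left out here, since a binary would be attached to the event separately.

```python
import json
from datetime import datetime, timezone


def build_event(device_id, detection, threshold_c, when=None):
    """Assemble an event body for a detected hotspot (illustrative only)."""
    when = when or datetime.now(timezone.utc)
    return {
        "source": {"id": device_id},
        "type": "thermal_HotspotDetected",  # hypothetical event type
        "text": (f"Hotspot detected: {detection['max_temp_c']:.1f} C "
                 f"exceeds {threshold_c:.1f} C"),
        "time": when.isoformat(timespec="milliseconds"),
        # Hypothetical custom fragment carrying the detection details.
        "thermal_Hotspot": {
            "bbox": detection["bbox"],
            "maxTemperatureC": detection["max_temp_c"],
            "thresholdC": threshold_c,
        },
    }


sample_detection = {"bbox": [4, 4, 7, 7], "max_temp_c": 92.5}
event = build_event(
    "12345",  # placeholder device ID
    sample_detection,
    threshold_c=80.0,
    when=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
event_json = json.dumps(event, indent=2)
```

Keeping the detection details in a dedicated fragment makes it straightforward for a downstream consumer to pick out the bounding box and temperatures without parsing the human-readable text.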

Step 3 — Result available in SAP APM Visual Inspection Manager

The Cumulocity microservice picks up the same event and sends the structured JSON payload to SAP APM Visual Inspection Manager via a REST call. From here, field workers and inspection engineers can review the results and act on them as part of the regular maintenance workflow.
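The forwarding step can be sketched as a function that maps an event to an outgoing HTTP request. This article does not show the SAP APM endpoint or its payload schema, so everything specific below is a placeholder: the URL, the token, the payload field names, and the `thermal_Hotspot` fragment on the incoming event are assumptions. The sketch only prepares the request and does not send it.

```python
import json
import urllib.request


def build_forward_request(event, endpoint_url, token):
    """Prepare (but do not send) a POST request for a downstream service."""
    hotspot = event["thermal_Hotspot"]  # hypothetical fragment name
    payload = {
        # Placeholder field names; the real schema is defined by the
        # receiving service and is not shown in this article.
        "deviceId": event["source"]["id"],
        "detectedAt": event["time"],
        "boundingBox": hotspot["bbox"],
        "maxTemperatureC": hotspot["maxTemperatureC"],
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


sample_event = {
    "source": {"id": "12345"},
    "time": "2024-01-01T12:00:00.000+00:00",
    "thermal_Hotspot": {"bbox": [4, 4, 7, 7], "maxTemperatureC": 92.5},
}
request = build_forward_request(
    sample_event,
    endpoint_url="https://example.invalid/inspection-results",  # placeholder
    token="PLACEHOLDER_TOKEN",
)
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

Separating payload mapping from sending keeps the mapping easy to test and leaves retries, authentication, and error handling to the microservice around it.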

The inferencing happens entirely on the edge. Only the meaningful results (events and structured payloads) are sent to the cloud, which keeps bandwidth usage low and avoids a cloud round trip for every frame.

One of the strongest aspects of this design is reusability. Today, the device runs a thermal hotspot detector. Tomorrow, the same device could run an anomaly detection model without rebuilding or reinstalling the runner.

The pipeline runner stays exactly the same. Only the configuration files change: a different pre-processor (to read sensor data instead of thermal frames), a different ONNX model (trained for anomaly detection), and a different post-processor (to emit anomaly events). All of this is pushed through Cumulocity's Configuration Repository — no SSH, no code changes, no downtime on the device.

The same applies to model updates. As new training data comes in from the field, the model can be retrained externally, exported as a new ONNX file, and pushed down to the edge through Configuration Management. The runner picks up the new model automatically.
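How a runner might notice a new model file can be sketched as follows. The article does not describe the actual mechanism, so this is one possible approach under stated assumptions: the `ModelSlot` class is hypothetical, and it compares a content hash of the model file on each call and reloads only when the hash changes. The loader here simply reads the file; with ONNX Runtime it could instead create an `onnxruntime.InferenceSession` from the path.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """SHA-256 of the file content, used to detect a changed model."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def read_bytes(path):
    """Trivial loader for this sketch; a real one would build a session."""
    with open(path, "rb") as handle:
        return handle.read()


class ModelSlot:
    """Hypothetical holder that reloads the model when the file changes."""

    def __init__(self, path, loader):
        self.path = path
        self.loader = loader
        self.digest = None
        self.model = None
        self.reloads = 0

    def current(self):
        digest = file_digest(self.path)
        if digest != self.digest:  # first call, or the file was replaced
            self.model = self.loader(self.path)
            self.digest = digest
            self.reloads += 1
        return self.model


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.onnx")
    with open(model_path, "wb") as handle:
        handle.write(b"model-v1")
    slot = ModelSlot(model_path, loader=read_bytes)
    first = slot.current()
    again = slot.current()   # unchanged file, no reload
    with open(model_path, "wb") as handle:
        handle.write(b"model-v2")  # simulates a pushed model update
    second = slot.current()
    reload_count = slot.reloads
```

Hashing the content rather than checking the modification time avoids a reload when the same file is written again, at the cost of reading the file on every check. For large models, checking less frequently or comparing size and modification time first would be a reasonable refinement.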

This article showed how the generic edge inferencing approach plays out in a real visual inspection scenario. The thermal analytics use case is just one example — the same pipeline can be adapted to anomaly detection, predictive maintenance, or any other ML use case that benefits from running close to the data.

The key idea remains simple: train your models using your own preferred tools, manage them through Cumulocity's existing repositories, and let the edge runner handle the inference.
