Martin Rudorfer


Lecturer in Robotics and AI





service-based architectures for machine vision

This is the work I did during my PhD. The abstract follows:

The growing trend towards high-mix, low-volume production demands more flexible and reconfigurable control for assembly systems. In unstructured or less structured environments, object detection and pose estimation are key capabilities for enabling industrial robotics applications such as grasping, handling and assembly.

The integration and interconnectivity of such automation functions is fostered by Industry 4.0 through the adoption of service-based ecosystems. The main objective of this thesis is to create a service-based framework for robust object detection and pose estimation in manufacturing environments. Such a framework could be a viable alternative to traditional machine vision systems such as smart cameras and embedded PCs, which are challenged by the high diversity and fast-paced progress in the field of object detection and pose estimation.

We approach this problem in three steps. In the first step, we propose a framework and demonstrate that it is realizable. It has a REST/gRPC interface that allows all detection methods to be handled uniformly. A virtualization strategy enables scaling and easy deployment, and the new OPC UA vision specification is exploited to integrate the detector services into the production environment.

Figure: industrial vision services
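To illustrate what handling all detection methods uniformly could look like, here is a minimal sketch in plain Python. It is not the thesis implementation: the names `Detection`, `Detector`, `DummyDetector`, `REGISTRY` and `handle_request`, as well as the JSON field names, are assumptions made for this example. A real service would expose the handler through a REST or gRPC endpoint instead of calling it directly.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass


@dataclass
class Detection:
    """One detected object instance (hypothetical result schema)."""
    object_id: str
    position: tuple   # (x, y, z) translation, units are an assumption
    rotation: tuple   # unit quaternion (w, x, y, z)
    score: float


class Detector(ABC):
    """Common interface that every detection method implements."""

    @abstractmethod
    def detect(self, image, object_id):
        """Return a list of Detection results for the given image."""


class DummyDetector(Detector):
    """Stand-in method that always reports one fixed pose."""

    def detect(self, image, object_id):
        return [Detection(object_id, (0.0, 0.0, 0.5), (1.0, 0.0, 0.0, 0.0), 0.9)]


# Hypothetical registry: method name -> detector instance.
REGISTRY = {"dummy": DummyDetector()}


def handle_request(body: str) -> str:
    """Dispatch a JSON request to the selected method and serialize the result."""
    request = json.loads(body)
    detector = REGISTRY.get(request["method"])
    if detector is None:
        return json.dumps({"error": "unknown method"})
    detections = detector.detect(request.get("image"), request["object_id"])
    return json.dumps({"detections": [asdict(d) for d in detections]})
```

Because the caller only sees the request and response format, a different detection method can be added by registering another `Detector` subclass, without changing client code.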

In a second step, we examine three exemplary object detection and pose estimation methods and prove that they can be integrated into the framework. Integration essentially requires automatic training from a CAD model and parameterization without expert knowledge, which is possible for two of the three methods. Regarding the third method – a Deep Learning approach – we demonstrate that synthetic images can be generated from the model to enable training, but further measures are required to attain the desired pose accuracy.

Figure: render and compose
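One common way to build synthetic training images is to render the object from its model and paste the rendering onto a background image using a per-pixel alpha mask. The sketch below shows only this compositing step, on small grayscale images stored as nested lists. It is an illustration under stated assumptions, not necessarily the pipeline used in the thesis; the function name `compose` and its parameters are hypothetical.

```python
def compose(background, rendering, alpha, top, left):
    """Paste a rendered object onto a background using per-pixel alpha.

    background: 2D list of floats (grayscale image)
    rendering:  2D list of floats, the rendered object patch
    alpha:      2D list of floats in [0, 1], same shape as rendering
    top, left:  position of the patch's upper-left corner in the background
    """
    out = [row[:] for row in background]  # copy so the input stays unchanged
    for r, (render_row, alpha_row) in enumerate(zip(rendering, alpha)):
        for c, (fg, a) in enumerate(zip(render_row, alpha_row)):
            y, x = top + r, left + c
            # Skip patch pixels that fall outside the background.
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = a * fg + (1.0 - a) * out[y][x]
    return out


# Usage: a 2x2 rendering pasted into the lower-right of a 3x3 background.
background = [[0.0] * 3 for _ in range(3)]
rendering = [[1.0, 1.0], [1.0, 1.0]]
alpha = [[1.0, 0.5], [0.0, 1.0]]
image = compose(background, rendering, alpha, top=1, left=1)
```

Since the paste position is known, the corresponding label (for example a bounding box or the rendered pose) can be recorded alongside each composed image.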

Finally, in a third step, we characterize the framework to identify its strengths and weaknesses compared to conventional machine vision systems. We perform a scenario-based analysis to determine certain quality attributes and find that both system types have their merits. The proposed service-based framework enables more efficient resource utilization and offers better configurability, maintainability and availability. On the other hand, conventional systems have better timing behavior and do not require such elaborate security measures. Timing, resource utilization and reusability are, moreover, strongly affected by the chosen detection method. Given a particular application, our characterization helps to identify the most suitable system type.

Altogether, in this work we have contributed a novel type of vision system and demonstrated that it is a viable alternative for object detection and pose estimation applications. The framework structure as well as the identified architectural trade-offs can furthermore be generalized to other machine vision or automation tasks. Promising future research directions include facilitating the training of Deep Learning methods, quantifying architectural trade-offs in case studies, and integrating other vision applications to create an ecosystem of vision services.