Information-Collection in Robotic Process Monitoring: An Active Perception Approach

05/01/2020 ∙ by Martin A. Sehr, et al. ∙ 0

Active perception systems maximizing information gain to support both monitoring and decision making have seen considerable application in recent work. In this paper, we propose and demonstrate a method of acquiring and extrapolating information in an active sensory system through use of a Bayesian Filter. Our approach is motivated by manufacturing processes, where automated visual tracking of system states may aid in fault diagnosis, certification of parts and safety; in extreme cases, our approach may enable novel manufacturing processes relying on monitoring solutions beyond passive perception. We demonstrate how using a Bayesian Filter in active perception scenarios permits reasoning about future actions based on measured as well as unmeasured but propagated state elements, thereby substantially increasing the quality of information available to decision-making algorithms used in control of overarching processes. We demonstrate use of our active perception system in physical experiments, where we use a time-varying Kalman Filter to resolve uncertainty for a representative system in additive manufacturing.




I Introduction

In modern manufacturing, a wide array of thermomechanical processes is used in the production and modification of parts and objects made of various materials. The purposes of these processes are diverse, including hardening of structures, increasing corrosion resistance, additive manufacturing and many others. Despite their widespread application in industry, complex thermomechanical manufacturing processes remain difficult to control precisely for a variety of reasons, many of which relate to insufficient knowledge of process states for either reactive or proactive decision making, often going hand in hand with little to no use of adequate process models. A common challenge with thermomechanical processes is that even slight changes in temperature above certain thresholds can result in severe damage or even loss of parts. Hence, accurate real-time estimation of temperature fields in parts is of great interest.

Unfortunately, only a few applications have access to the measurement data needed to accomplish this. In some cases, one could attempt to distribute stationary sensors over a workspace and combine their measurements to reconstruct the desired information. While possible in principle, this method is often impractical for a variety of reasons, including workspace constraints such as occlusions and moving parts, the unit cost of each added sensor, or excessive exposure to heat, vibrations and other process factors. One can circumvent these issues by dynamically repositioning a mobile sensor and gathering all required data from that single sensor. Naturally, this alternative still allows incorporating additional stationary or mobile sensors installed in the workspace. The nature of this problem typifies those encountered in active sensing, where re-orientation of a single sensor, such as a camera taking images from various angles, provides the same, if not richer, sensory signals compared to multiple stationary sensors.

In order to move or re-orient the sensor so as to provide that richer signal, decision criteria are required. A reasonable choice in practice is often to maximize information gain within the feasible active sensing field, as motivated by information-theoretic control [8]. The aim of this paper is to provide a general framework for an active-sensing system that can extrapolate from sensed information to estimate current and predict future states, and use that information to inform future sensory actions and ultimately regulate underlying physical processes. Our algorithmic approach, visualized in Figure 1, is designed for active sensing of distributed physical phenomena such as temperature fields with cameras mounted on robot arms. In Section IV, we demonstrate our approach for a challenging monitoring problem in additive manufacturing, which requires continuous repositioning of an IR camera to capture interior and exterior temperature distributions.

Fig. 1: Active sensing for distributed physical phenomena, repeating indefinitely: (1) estimation of current state values using physical process model and Bayesian Filter, (2) evaluation of state and uncertainty information, (3) robot path planning for new sensor position capturing critical/uncertain part locations, (4) trajectory execution on robot arm, (5) measurement data capture, (6) data processing including projection and mapping of image data to process model.

I-A Prior Work

As detailed in [10], interest in active perception and information-theoretic control has grown rapidly in recent years. There are different ways to derive positions for information maximization. Optimal sensing actions are derived in [14, 1, 2] by sensing at locations with process uncertainty. Similarly, Gaussian process regression is used in [18, 13, 9, 5] to measure at locations that minimize the uncertainty of a surface model.

For more active, real-time scenarios, Bayesian Optimization techniques can help choose sensing locations by maximizing the probability of providing useful information [11]. Regardless of how one chooses the information criteria, it is well accepted that maintaining a belief representation [20], whether by accumulation of sensory information into an evolving state vector or by updating one or multiple probability models in a non-Markovian sense, will yield richer subsequent actions. For example, recent work in [7] indicates that use of a Memory Unscented Particle Filter is a more efficient means of tactile localization. Combining both the Bayesian and Gaussian modelling, Extended Kalman Filters are used in [15] to generate optimal trajectories and find considerable success over random methods. A special case of active perception, which is also related to our contribution, is concerned with determining the next pose for a visual sensor given its previous measurements. This is often called the "next best view" problem [4, 3, 16, 17].

I-B Contributions

Following the steps outlined in [15], we develop a generalized framework for the active sensing problem wherein the information used for decision-making stems from a Bayesian Filter. Given a process-specific state-evolution matrix and state vector as well as noise statistics, we merge observations and physical process models in order to obtain high-dimensional state estimates including associated measures of uncertainty. The main contribution of this paper is an active system that is tailored towards reducing the uncertainty about dynamically varying scalar fields on objects in manufacturing processes. The system can be used to automatically and actively gather concurrent state information from thermomechanical processes at runtime. This application illustrates the feasibility of joint active sensing and closed-loop state estimation of scalar fields on physical objects.

II Methodology

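This paper discusses a particular class of state estimation problems for nonlinear autonomous systems of the form

$$x_{k+1} = f_k(x_k, w_k), \tag{1}$$

where $x_k$ denotes the state vector, $w_k$ the process noise vector and $f_k$ the state evolution function for $k = 0, 1, \dots$. The class of estimation problems tackled in this paper is motivated by successively repositioning sensor equipment to resolve the state of system (1), leading to output equation

$$y_k = h_k(x_k, v_k), \tag{2}$$

where $y_k$ denotes the system output vector, $v_k$ the measurement noise vector, and $h_k$ the output function for $k = 0, 1, \dots$. The particular output function $h_k$ at time instance $k$ depends on the current location of the sensing equipment, which is selected based on the statistics of the Bayesian Filter [19] used to estimate the state vector,

$$p(x_k \mid Y_k) = \frac{p(y_k \mid x_k)\, p(x_k \mid Y_{k-1})}{\int p(y_k \mid x_k)\, p(x_k \mid Y_{k-1})\, dx_k}, \tag{3}$$

$$p(x_{k+1} \mid Y_k) = \int p(x_{k+1} \mid x_k)\, p(x_k \mid Y_k)\, dx_k, \tag{4}$$

where $p(x_k \mid Y_j)$ denotes the probability density function (pdf) of $x_k$ given past measurement data $Y_j = \{y_0, \dots, y_j\}$ and initial prior density, $p(x_0 \mid Y_{-1}) = p(x_0)$. This leads to the following abstract routine:

1:for $k = 0, 1, \dots$ do
2:     Based on $p(x_k \mid Y_{k-1})$, position sensors
3:     Based on sensor positions, derive $h_k$
4:     Measure system output $y_k$ via (2)
5:     Compute probability density $p(x_k \mid Y_k)$ via (3)
6:     Compute $p(x_{k+1} \mid Y_k)$ via (4)
Algorithm 1 Active Perception System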
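To make Lines 5 and 6 of Algorithm 1 concrete, the following minimal Python sketch performs the measurement update and the time update of a Bayesian Filter on a finite state space. It is an illustration under stated assumptions, not the implementation used in this paper: the state is assumed to take finitely many values, and the names `bayes_update`, `bayes_predict`, `likelihood` and `transition` are hypothetical. The chosen sensor position would enter through `likelihood`, since it determines the output function.

```python
def bayes_update(prior, likelihood):
    """Measurement update: posterior is proportional to likelihood * prior.

    prior[i]      = P(state i | measurements so far)
    likelihood[i] = P(current measurement | state i), which depends on
                    where the sensor is currently positioned (assumption).
    """
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("measurement has zero probability under the prior")
    return [u / total for u in unnormalized]


def bayes_predict(posterior, transition):
    """Time update: propagate the posterior through the process model.

    transition[i][j] = P(next state j | current state i).
    """
    n = len(posterior)
    return [sum(posterior[i] * transition[i][j] for i in range(n))
            for j in range(n)]
```

For instance, a uniform prior over two states combined with likelihoods 0.9 and 0.1 gives a posterior of 0.9 and 0.1, which the time update then spreads according to the transition probabilities.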

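The remainder of this paper discusses the implementation of a linear variant of Algorithm 1. Notice, however, that while the Bayesian Filter (3)-(4) reduces to a classic Kalman Filter in the linear case, the problem structure remains as in the general, nonlinear setup. We chose to proceed with the linear case to simplify the description of our experiments in Section IV, which demonstrate the use of Algorithm 1 for a particular problem variant in which temperatures of an object are observed with an IR camera. The camera is relocated dynamically using a robotic arm to resolve uncertainty about estimated temperatures on the object surface as well as in its interior. In the linear case, the system equation (1) reduces to

$$x_{k+1} = A_k x_k + w_k, \tag{5}$$

the output equation (2) to

$$y_k = C_k x_k + v_k, \tag{6}$$

and the Bayesian Filter to a classic Kalman Filter with

$$\begin{aligned}
\hat{x}_{k|k-1} &= A_{k-1}\, \hat{x}_{k-1|k-1}, \\
P_{k|k-1} &= A_{k-1} P_{k-1|k-1} A_{k-1}^T + Q_{k-1}, \\
K_k &= P_{k|k-1} C_k^T \left( C_k P_{k|k-1} C_k^T + R_k \right)^{-1}, \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \left( y_k - C_k \hat{x}_{k|k-1} \right), \\
P_{k|k} &= \left( I - K_k C_k \right) P_{k|k-1},
\end{aligned} \tag{7}$$

where $A_k$ and $C_k$ denote the state evolution and observation matrices, $\hat{x}_{i|j}$ denotes the state estimate vector at time $i$ using all measurements up to time $j$, $Q_k$ and $R_k$ denote the respective noise covariance matrices of $w_k$ and $v_k$, $P_{k|k}$ the posterior state estimate covariance matrix and $K_k$ the Kalman Filter gain.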
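The Kalman Filter recursion for the linear case can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that each measurement observes exactly one state entry (an observation row that is a unit vector) and that measurement noise is uncorrelated across measurements, so several measurements can be processed one at a time without a matrix inverse. The function names `kf_predict` and `kf_update_point` are hypothetical.

```python
def kf_predict(x, P, A, Q):
    """Time update: x <- A x and P <- A P A^T + Q (nested lists)."""
    n = len(x)
    x_new = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    AP = [[sum(A[i][k] * P[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    P_new = [[sum(AP[i][k] * A[j][k] for k in range(n)) + Q[i][j]
              for j in range(n)] for i in range(n)]
    return x_new, P_new


def kf_update_point(x, P, i, y, r):
    """Measurement update for a single observed state entry i.

    y is the measured value and r the measurement noise variance.
    """
    n = len(x)
    s = P[i][i] + r                       # innovation variance
    gain = [P[j][i] / s for j in range(n)]  # Kalman gain column
    innovation = y - x[i]
    x_new = [x[j] + gain[j] * innovation for j in range(n)]
    # (I - K C) P with C equal to the i-th unit row vector
    P_new = [[P[j][k] - gain[j] * P[i][k] for k in range(n)]
             for j in range(n)]
    return x_new, P_new
```

With a prior covariance of `[[1.0, 0.5], [0.5, 1.0]]`, measuring only the first state also shifts the estimate and shrinks the variance of the second, which is the mechanism by which unmeasured but correlated state elements are informed by a measurement.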

III Implementation

We continue with a particular implementation of Algorithm 1, aimed at estimating states of physical objects using a single line-of-sight sensor that can be positioned and oriented dynamically as required to resolve state estimate uncertainty. While our implementation is based on the linear formulation captured by (5)-(7), it can be extended seamlessly to its more general nonlinear counterpart (1)-(4). Similarly, given that our fundamental approach is based on a Kalman Filter, additional fixed and mobile sensors may be added without fundamental changes to our system. In the following, we use libigl [12], an open-source C++ visualization library, to recreate physical objects in virtual space and derive the observation matrices from the sensor position at each time instance, corresponding to Line 3 in Algorithm 1.

III-A Initialization

Using libigl, we create matrices to generate vertices and edges for given objects. To avoid excessive state dimensions in our active perception algorithm, we select only a reduced number of object vertices to form the spatial discretization of our system state to be estimated. This reduction, essential especially for complex object shapes with large numbers of vertices, is based on projecting uniform points across bounding boxes of given objects (see e.g. Figure 2) onto their surface areas. This projection is performed by selecting the vertices closest to each of the points distributed uniformly across the bounding boxes, resulting in an evenly spaced, reduced number of vertices, which form the state vector of our system model (1) and are referred to as control points below (see Footnote 1). That is, each physical quantity (e.g. temperature) captured in our state vector is associated with a location specified by a unique control point. Based on this system state, we generate the state evolution matrices, for instance via discretization of a partial differential equation capturing the system dynamics, such as the heat equation in our experiments below, taking into account specific control point locations.

Footnote 1: Notice that, while objects with occlusions may result in a low number of control points within said occlusions, these occlusions would typically also fall outside the field of vision of line-of-sight sensors such as cameras. Moreover, additional control points on the surface and in the interior of a given object may be added manually.
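As a toy illustration of how a state evolution matrix can arise from discretizing the heat equation, the sketch below builds the explicit-Euler matrix for the one-dimensional heat equation on equally spaced control points with insulated ends. The discretization scheme, the one-dimensional geometry and the names `heat_matrix_1d`, `alpha`, `dx` and `dt` are assumptions made for this example; the paper does not specify the scheme used in its experiments.

```python
def heat_matrix_1d(n, alpha, dx, dt):
    """Explicit-Euler state evolution matrix for the 1-D heat equation.

    n     : number of equally spaced control points
    alpha : thermal diffusivity
    dx    : spacing between neighbouring control points
    dt    : model update time step
    Each interior temperature relaxes towards its two neighbours;
    end points have a single neighbour (insulated boundary).
    """
    lam = alpha * dt / dx ** 2
    if lam > 0.5:
        # Standard stability limit of the explicit scheme in one dimension.
        raise ValueError("explicit Euler is unstable for alpha*dt/dx^2 > 0.5")
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        A[i][i] = 1.0 - lam * len(neighbours)
        for j in neighbours:
            A[i][j] = lam
    return A
```

Every row of the resulting matrix sums to one, so a uniform temperature field is left unchanged, and because the matrix is symmetric the total heat content is conserved from one step to the next.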

Fig. 2: Bounding box for the Stanford Bunny [6]; points annotated with Cartesian coordinates in -notation.
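The control point selection described in this section can be sketched as follows: a uniform grid is laid across the bounding box of the object, and for each grid point the closest mesh vertex is kept. This is a hedged sketch and not the libigl-based implementation. It assumes that the grid fills the whole box volume, that vertices are given as `(x, y, z)` tuples, and uses a brute-force nearest-vertex search; the name `select_control_points` and the parameter `points_per_axis` are hypothetical.

```python
from itertools import product


def select_control_points(vertices, points_per_axis):
    """Return sorted indices of vertices closest to a uniform bounding-box grid."""
    if points_per_axis < 2:
        raise ValueError("need at least two grid points per axis")
    lo = [min(v[d] for v in vertices) for d in range(3)]
    hi = [max(v[d] for v in vertices) for d in range(3)]
    m = points_per_axis
    # Evenly spaced coordinates along each axis of the bounding box.
    ticks = [[lo[d] + (hi[d] - lo[d]) * t / (m - 1) for t in range(m)]
             for d in range(3)]
    chosen = set()
    for g in product(*ticks):
        # Brute-force search for the vertex nearest to grid point g.
        nearest = min(
            range(len(vertices)),
            key=lambda i: sum((vertices[i][d] - g[d]) ** 2 for d in range(3)),
        )
        chosen.add(nearest)  # duplicates collapse, so each vertex appears once
    return sorted(chosen)
```

Because several grid points may map to the same vertex, the number of control points returned can be smaller than the number of grid points, which is consistent with the reduced point count within occluded regions mentioned above.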

III-B Object Rotation and Projection

At each measurement step of our active perception process, Lines 2-4 in Algorithm 1, we move the sensor to its desired location and project a plane onto the object surfaces in line of sight. To determine the face of the object the sensor is currently observing, we assume that the sensor is positioned at the coordinate origin and oriented pointing along positive -direction; that is, the positive -axis is facing from the sensor towards the center of its field of view. Based on this fixed setup, we calculate rotation and translation matrices to re-orient the camera from its fixed rest location to its current state and perform the inverse operations on the vertices of the object observed. Doing so orients the virtual object in the sensor reference frame. We next average the -coordinates of a slice of vertices centered on the - plane, denoting their average by

. The system then classifies as observed by the sensor any control point whose

-coordinate lies between the average and the coordinate origin (see Footnote 2).

Footnote 2: Notice that, while this process of determining control points in line of sight assumes relatively smooth object shapes with limited surface curvature, our implementation performed well in experiments with various objects, such as the Stanford Bunny [6].

Finally, for all observed points, the system populates a row in the current observation matrix, , as defined in equation 6. Algorithm 2 summarizes schematically the procedure to construct these observation matrices, where and denote the matrices of object vertices in default frame and sensor frame, respectively. Furthermore, denotes the set of -coordinates explored around , where , are Lagrange basis vectors in -directions, respectively, corresponds to the rotation matrix required to re-orient the global -axis to the positive sensor-orientation.

1:procedure Frame Change
4:procedure Partitioning
10:procedure Observation Matrix
11:     for  do
Algorithm 2 Observation Algorithm
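A simplified version of the visibility test and observation matrix construction can be sketched as below. This is not a reconstruction of Algorithm 2. It assumes that depth is measured along a unit viewing direction `view_dir` from the sensor position, that the depth threshold is the average over all control points rather than over a slice, and that each observed control point contributes a row selecting its own state entry. All names are hypothetical.

```python
def observed_indices(points, sensor_pos, view_dir):
    """Indices of control points lying between the sensor and the mean depth.

    points     : list of (x, y, z) control point positions
    sensor_pos : (x, y, z) position of the sensor
    view_dir   : unit vector along which the sensor is looking
    """
    # Depth of each point along the viewing direction, sensor at depth 0.
    depths = [sum((p[d] - sensor_pos[d]) * view_dir[d] for d in range(3))
              for p in points]
    mean_depth = sum(depths) / len(depths)
    return [i for i, z in enumerate(depths) if 0.0 < z < mean_depth]


def observation_matrix(observed, n_states):
    """One row per observed control point; each row picks that point's state."""
    return [[1.0 if j == i else 0.0 for j in range(n_states)]
            for i in observed]
```

For four points at `(0, 0, 1)`, `(0, 0, -1)`, `(1, 0, 0)` and `(-1, 0, 0)` viewed from `(0, 0, 3)` along `(0, 0, -1)`, only the first point lies in front of the mean depth, so the observation matrix has the single row `[1, 0, 0, 0]`.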
(a) To rotate the object into the sensor frame, we record translation and rotation matrices from fixed rest location to current sensor position (white) and orientation (positive -direction), respectively.
(b) We distinguish between observed and occluded faces of the object by calculating a local average and only projecting onto those points that are on the same side of that average as is the sensor.
Fig. 3: Reference frame rotation and projection of sensor information onto a cylindrical object; control points on cylinder highlighted by black dots; points annotated with Cartesian coordinates in -notation.

After identifying the observed control points via Algorithm 2, we record the value of the closest sensor datum (e.g. the nearest pixel in the case of a camera) as the measurement value, generating an observation vector with one entry per observed control point of the object. Based on this measurement vector, the current state estimate vector and covariance matrix are updated through a Kalman Filter of form (7). Figure 3 highlights an example of our rotation-projection routine used to map sensor information onto the surface of a given object.

III-C Active Perception System

Based on the filter estimating concurrently the state of the system, two different methods were used to prescribe new sensor positions: either the system navigated the sensor to observe the control point with the highest value (e.g. temperatures in our experiments in Section IV), or it navigated to the control point at which the state estimate was most uncertain. Notice that these two variants may be augmented and combined rather arbitrarily without structural modifications to the framework presented in this study; they represent example setups for detection of extreme values and smoothing overall system uncertainty, respectively.
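The two selection rules described above can be sketched as a single function. This is a minimal illustration that assumes the uncertainty of a control point is measured by the corresponding diagonal entry of the state estimate covariance matrix; the name `select_target` and the mode strings are hypothetical.

```python
def select_target(x_hat, P, mode):
    """Pick the index of the control point the sensor should observe next.

    x_hat : list of state estimates, one per control point
    P     : covariance matrix of the estimate as a nested list
    mode  : "max_value" targets the largest estimate,
            "max_uncertainty" targets the largest variance (assumption)
    """
    if mode == "max_value":
        scores = list(x_hat)
    elif mode == "max_uncertainty":
        scores = [P[i][i] for i in range(len(x_hat))]
    else:
        raise ValueError("unknown mode: " + repr(mode))
    # Ties resolve to the lowest index.
    return max(range(len(scores)), key=scores.__getitem__)
```

A combined criterion, for example a weighted sum of the two scores, would fit the same interface without changing the rest of the loop.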

Regardless of this choice, new sensor positions are generated by first identifying the vector connecting the geometric center of the object, denoted , to the control point in question and scaling it by to define the displacement vector, denoted . Based on that, the sensor position is defined as the sum of the displacement vector and the object’s geometric center. The sensor orientation vector, , is defined as the anti-parallel unit-vector to the unit-vector in the direction of the displacement vector. This procedure is summarized by Algorithm 3, where denotes the set of observed control points as per Algorithm 2, is the matrix of Cartesian positions of each data point in the sensor frame, are the Cartesian coordinates of all object control points, denotes the sensor data at location and time instance , and and denote sensor position and orientation, respectively. Positions are assigned to each sensor data point depending on the average distance of the sensor to the object, the current state estimate vector, , and sensor-specific width and height of observed frames.

1:procedure Measurement and Filtering
2:     for  do
4:     ,
5:procedure Max Value Update
10:procedure Max Uncertainty Update
Algorithm 3 Measurement and Updating
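The geometric part of this procedure, computing a new sensor position and orientation from a target control point, can be sketched as follows. The scaling factor is passed in as a hypothetical parameter `scale`, since its value is not recoverable from the text here; the function name `sensor_pose` is likewise an assumption.

```python
import math


def sensor_pose(center, control_point, scale):
    """Sensor position and viewing direction for a target control point.

    center        : geometric center of the object, (x, y, z)
    control_point : target control point, (x, y, z)
    scale         : factor applied to the center-to-point vector (assumed input)
    """
    # Vector from the object center to the control point, then scaled.
    displacement = [scale * (q - c) for q, c in zip(control_point, center)]
    norm = math.sqrt(sum(d * d for d in displacement))
    if norm == 0.0:
        raise ValueError("control point coincides with the object center")
    # Sensor sits at center + displacement ...
    position = [c + d for c, d in zip(center, displacement)]
    # ... and looks back along the displacement, towards the object.
    orientation = [-d / norm for d in displacement]
    return position, orientation
```

For an object centered at the origin with a control point at `(0, 0, 2)` and `scale = 3`, the sensor is placed at `(0, 0, 6)` and looks along `(0, 0, -1)`.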

IV Hardware Experiments

In order to validate our method for active perception in thermomechanical manufacturing processes, we next describe in detail a specific use case scenario and experiments for an emulated thermomechanical process. Specifically, we use our system to capture the thermal field over a part being produced by means of robotic manufacturing. We consider a two-robot scenario where Robot A produces a part via robotic additive manufacturing and Robot B executes our algorithmic pipeline for active perception.

For practical reasons, we emulate this scenario in a hardware-in-the-loop configuration, where the physical process guided by Robot A is simulated in virtual space while Robot B executes the motion prescribed by our algorithm in physical space with a scaled physical replica model of the part to be produced. The physical Robot B is fully aware of the virtual world in which Robot A lives and the manufacturing process takes place, but acts in the real world. This setup is possible because the manufacturing process does not depend on Robot B's actions. The physical components of this setup are displayed in Figure 4, with a UR5 robot arm representing Robot B and an Optris IR camera mounted facing parallel to the outward-pointing axis of the tool center point. In the experiments performed for this paper, the images captured by the IR camera are discarded in favor of noisy process values governed by our detailed thermomechanical process model, which serves as ground truth for our experiments. However, while we discard the actual IR image data, we process the physical camera orientation via the exact projection pipeline described in Section III above to extract the surface temperature data in the camera field of view from the underlying detailed thermomechanical simulation model.

Fig. 4: Experiment configuration: UR5 robot arm with mounted IR camera running active perception pipeline.

Notice that the detailed thermomechanical model used to generate ground truth data in our hardware-in-the-loop experiments is not identical in structure to the one used by our active perception algorithm, which operates using a lower-dimensional linear time-varying model capturing the continuous deposition of material during the additive manufacturing process of interest. Instead, we use dedicated simulation tools for thermomechanical processes to generate a detailed high-fidelity model of the underlying additive manufacturing process. The data generated by this software is saved as a look-up table which is parsed by our hardware-in-the-loop configuration at runtime.

While this hardware-in-the-loop configuration may appear to reduce the complexity of our case study, notice that it has several key advantages over using an entirely physical manifestation of our use case scenario. For instance, the hardware-in-the-loop configuration allows us to capture the thermomechanical process with an additional, user-specified and more fine-grained dynamic model than that processed by our Kalman Filter at runtime. In addition, our experimental setup permits access to quantities such as interior part temperatures over time, which would be inaccessible in a complete physical version of this scenario. This in turn allows us to analyze in detail the performance of our algorithms in the emulated additive manufacturing scenario.

The part produced in our experiments is of rectangular geometry with a concentric rectangular pocket at its center, and is produced via a tool path depositing three concentric beads per layer of material. Notice that, while we chose a rather simple geometry for illustration, our numerical model has to capture both material deposition and heat transfer at the same time, which requires a time-varying version of our Kalman Filter (7). For numerical efficiency, our active perception algorithm captures the temperature distribution only for the layers of material produced most recently; this approximation allows us to increase spatial resolution at locations where temperatures are expected to be the highest without sacrificing computational speed. The resulting experiment data for active layers of material and fixed model update time step of is captured by Figure 5. Given the time required to reposition the IR camera using the robot arm, we obtain measurement data approximately every at optimal locations in terms of temperature estimate covariance data. After an initialization period, during which the first layers of material are deposited, the algorithm results in steady progression of both temperature errors and covariances, as expected using our algorithmic approach. The residual errors are explained by the differences between the detailed model employed to generate the ground truth data and the linear time-varying (LTV) model used to estimate the temperature field at runtime. While one could adjust the models to reduce these errors, we believe they are representative of what one may experience in a fully physical experiment.

V Conclusions

We demonstrated the feasibility of an active sensory system for estimating scalar fields from camera frames, where only partial information is available to the sensory system. This was shown using a Bayesian Filter to provide decision-making criteria, which depend on states that are not directly measured or cannot be measured. For the specific use case in our experiments, we demonstrated that the active sensory system iteratively improves information gain in a challenging additive manufacturing scenario. Given the generality of our approach, it may be used similarly in other, more complex thermomechanical scenarios and ultimately for direct process control.


  • [1] A. R. Ansari and T. D. Murphey (2016) Sequential action control: closed-form optimal control for nonlinear and nonsmooth systems. IEEE Transactions on Robotics 32 (5), pp. 1196–1214.
  • [2] N. Atanasov, J. Le Ny, K. Daniilidis, and G. J. Pappas (2014) Information acquisition with sensing robots: algorithms and error bounds. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 6447–6454.
  • [3] J. E. Banta, L. Wong, C. Dumont, and M. A. Abidi (2000) A next-best-view system for autonomous 3-D object reconstruction. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 30 (5), pp. 589–598.
  • [4] A. Bircher, M. Kamel, K. Alexis, H. Oleynikova, and R. Siegwart (2016) Receding horizon "next-best-view" planner for 3D exploration. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 1462–1468.
  • [5] D. A. Duecker, A. R. Geist, E. Kreuzer, and E. Solowjow (2019) Learning environmental field exploration with computationally constrained underwater robots: Gaussian processes meet stochastic optimal control. Sensors 19 (9).
  • [6] G. Turk and M. Levoy (1994) Zippered polygon meshes from range images. SIGGRAPH, pp. 311–318.
  • [7] G. Vezzani, U. Pattacini, G. Battistelli, L. Chisci, and L. Natale (2017) Memory unscented particle filter for 6-DOF tactile localization. IEEE Transactions on Robotics 33, pp. 1139–1155.
  • [8] B. Grocholsky (2002) Information-theoretic control of multiple sensor platforms. Ph.D. Thesis, University of Sydney, School of Aerospace.
  • [9] G. A. Hollinger, B. Englot, F. S. Hover, U. Mitra, and G. S. Sukhatme (2013) Active planning for underwater inspection and the benefit of adaptivity. The International Journal of Robotics Research 32 (1), pp. 3–18.
  • [10] J. Bohg, K. Hausman, B. Sankaran, O. Brock, D. Kragic, S. Schaal, and G. Sukhatme (2017) Interactive perception: leveraging action in perception and perception in action. IEEE Transactions on Robotics 33, pp. 1273–1291.
  • [11] J. R. Souza, R. Marchant, L. Ott, D. F. Wolf, and F. Ramos (2014) Bayesian optimisation for active perception and smooth navigation. In ICRA.
  • [12] A. Jacobson, D. Panozzo, et al. (2017) libigl: a simple C++ geometry processing library.
  • [13] E. Kreuzer and E. Solowjow (2018) Learning environmental fields with micro underwater vehicles: a path integral–Gaussian Markov random field approach. Autonomous Robots 42 (4), pp. 761–780.
  • [14] N. Jamali, C. Ciliberto, L. Rosasco, and L. Natale (2016) Active perception: building objects' models using tactile exploration. In IEEE-RAS International Conference on Humanoid Robots.
  • [15] P. Salaris, R. Spica, P. R. Giordano, and P. Rives (2017) Online optimal active sensing control. In ICRA.
  • [16] R. Pito (1999) A solution to the next best view problem for automated surface acquisition. IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (10), pp. 1016–1030.
  • [17] C. Potthast and G. S. Sukhatme (2014) A probabilistic framework for next best view estimation in a cluttered environment. Journal of Visual Communication and Image Representation 25 (1), pp. 148–164.
  • [18] S. Caccamo, P. Güler, H. Kjellström, and D. Kragic (2016) Active perception and modeling of deformable surfaces using Gaussian processes and position-based dynamics. In IEEE-RAS International Conference on Humanoid Robots.
  • [19] D. Simon (2006) Optimal state estimation: Kalman, H infinity, and nonlinear approaches. John Wiley & Sons.
  • [20] S. Thrun, W. Burgard, and D. Fox (2005) Probabilistic robotics. MIT Press.