Design of Adaptive Compliance Controllers for Safe Robotic Assembly

04/22/2022
by Devesh K Jha, et al.
MERL

Insertion operations are a critical element of most robotic assembly operations, and peg-in-hole (PiH) insertion is one of the most widely studied tasks in the industrial and academic manipulation communities. PiH insertion is in fact an entire class of problems, where the complexity of the problem depends on the type of misalignment and contact formation during an insertion attempt. In this paper, we present the design and analysis of adaptive compliance controllers for insertion-type assembly tasks, including learning-based compliance controllers that can handle uncertainty in the goal location during robotic assembly. We first present the design of compliance controllers that ensure safe operation of the robot by limiting the contact forces experienced during contact formation. Next, we analyze the force signature obtained during contact formation to learn the corrective action needed to perform insertion. Finally, we use the proposed compliance controllers and learned models to design a policy that successfully performs insertion in novel test conditions with an almost perfect success rate. We validate the proposed approach on a physical robotic test-bed using a 6-DoF manipulator arm.



I Introduction

Over the last several decades, robots have become very precise at performing repetitive pick-and-place operations. However, complications arise when the positions of the parts involved in assembly vary between repetitions of the operation. A classical example of such a task is PiH insertion, which has been studied extensively in assembly for a long time due to its relevance to manufacturing [22]. This task is a major component of many assembly operations. Despite this long history of study in robotics and automation research, the problem remains open in multiple respects. The presence of pose uncertainty for the parts being assembled leads to complex contact configurations between the parts. Consequently, manipulation for successful assembly requires the design of force-feedback controllers that can interpret contact forces and correct the contact configuration so that the parts can be assembled. Since these contact configurations depend on the physical as well as the geometrical features of the objects being assembled, they are notoriously difficult to model precisely. As a result, learning-based approaches have been very popular for designing the required corrective controllers. However, learning-based approaches also present challenges when it comes to designing efficient controllers that can reliably perform assembly in the presence of sustained contact interactions.

Fig. 1: Experimental setup with a Mitsubishi Electric Factory Automation (MELFA) RV-AS-D Assista 6-DoF manipulator arm in a possible contact configuration with the hole environment. The diameter of the peg is approximately mm and the tolerance is approximately mm. The figure also shows an Intel Realsense D435 camera which is used in the experiments for the detection of the hole.

The design of learning-based controllers requires an initial exploration phase in which the robot has to explore different possible contact configurations so that a generalizable corrective policy can be learned. We believe that a key component of the design of such controllers is the design of compliant controllers that ensure safe interaction between the robot and the assembly components during this exploration phase. Although this is a key requirement for learning adaptive assembly controllers, the design of such safe controllers remains largely unexplored. With this motivation, we present the design of a class of accommodation controllers which guarantee that the contact forces remain within safe bounds. This accommodation controller is used to collect contact force data in order to learn a relationship between misalignment and the expected contact forces. To interpret the data efficiently, we present an analysis of the contact forces for the purpose of selecting features that can be used to learn a predictive model between contact forces and the amount of expected misalignment. Finally, this predictive model is used to design a corrective policy that allows assembly.

Contributions. This paper has the following key contributions:

  1. We present the design and analysis of two different controllers for safe interaction between the robot and its environment in the presence of sustained contacts.

  2. We present feature analysis for the design of an efficient force-feedback controller for interpretation of different complex contact configurations.

  3. We present the design and verification of a learning-based controller that makes use of the proposed safe accommodation controller and the proposed feature analysis for insertion-type assembly using a 6-DoF manipulator system.

II Related Work

Automatic assembly is one of the most common robot applications, and what differentiates it from many other robotic applications is the need to carefully consider the effect of contact between assembled parts. Pure position control is usually inadequate, because if the robot uses such a controller to follow a reference trajectory exactly, even small misalignments between where the assembled parts are and where they were expected to be would result in very large forces, possibly damaging the robot and/or the parts. A much more suitable type of control for this application is force control that adjusts the motion of the robot in response to experienced contact forces. Such methods for robotic assembly are commonly known as adaptive assembly strategies (AAS) ([19, 8, 1, 15, 5, 14, 17]). Much of the research in this area has focused on the PiH insertion problem, as a prototypical operation for various assembly tasks.

The main challenge in an AAS is how to interpret the measured force/torque (F/T) signals in order to direct the motion of the robot so as to accomplish the insertion. As early as the 1970s, it was successfully demonstrated that high-accuracy PiH insertion was possible by direct interpretation of F/T signals by a robot program [10]. However, the development of such robot programs is very complex, laborious, expensive, and case-dependent, so this approach turned out to be impractical for wide industrial use. A more universally applicable approach is to follow a suitable position trajectory that would accomplish the task in the absence of collisions, and to adjust the robot's motion in response to contact forces, thus forming a force feedback controller [19]. Many such controllers use a mapping from the F/T readings measured at the wrist of the robot (or the platform that the hole is mounted on) onto a correction to the trajectory. In some rare instances, this mapping can be computed analytically: for example, when the peg and hole are circular, have no angular misalignment, overlap at least to some extent, and the point around which the moments of the F/T sensor are computed lies on the axis of the peg, which is also the direction of insertion [7]. However, this kind of solution requires careful placement and alignment of the F/T sensor, and is not general enough for regular use.

As this approach to designing force controllers reduces to finding a suitable mapping between F/T measurements and corrections to a nominal trajectory, a much more general method for obtaining this mapping, and thus a working controller, is to use machine learning to estimate the mapping from data. One early such method for programmed compliance, proposed in [20], used linear least squares to estimate a linear mapping between F/T measurements and corrections to either position or velocity, effectively learning the admittance and accommodation matrices used in linear compliance controllers. The training examples needed for learning were constructed based on general considerations about what the corrections should be for prototypical situations, and what contact forces might be measured in them. However, it was later demonstrated that for the contact configurations usually experienced in PiH insertion tasks, the mapping between forces and corrections is not linear [2, 3], and it was suggested to use neural networks to represent a non-linear mapping between the two. This advance significantly expanded the type of mappings that could be learned, but still left open the question of how a suitable data set of training examples could be compiled, as doing this manually is excessively difficult for all but the simplest geometries. A much more appealing solution is to measure contact forces directly on a real robot by putting the peg and hole in various contact situations. Gullapalli et al. [8] proposed a reinforcement learning (RL) solution based on trial and error, which learned to associate the contact forces with a correction that was advantageous in bringing the peg closer to its desired end position, while also minimizing contact forces. The desired outcome was encoded in the reward function of the RL problem formulation. Although this approach achieved remarkable results, learning to insert a peg in a hole with clearances significantly lower than the accuracy (repeatability) of the robot used, it still needed accurate knowledge of where the goal position was, in order to use it in the reward function. This precludes its direct use in the version of the problem we consider, where the uncertainty is precisely in the position of the hole, and thus in the correct end position that the peg should reach.

Following this seminal application of RL to PiH insertion, a number of later works explored the use of machine learning models for the design of adaptive force controllers. The application of deep RL for learning end-to-end visuomotor policies was demonstrated in [15]. In addition to F/T sensors, tactile sensors have been employed, too ([6, 5]). As instantaneous F/T readings might not be sufficient to disambiguate the contact configuration, the use of recurrent neural nets has been proposed in [11, 14]. However, as is well known, RL often suffers from unfavorable sample complexity, making it less suitable for use on real mechanical systems. In contrast, we explore below a supervised learning approach to learning mappings between forces and corrections, thus significantly reducing the number of training samples needed.

The work proposed in this paper is closest to our previous work in [13]. However, compared to [13], we present the design of an additional nonlinear accommodation controller, a proof of convergence for the controllers, and a feature analysis for threshold detection. Furthermore, we show an improvement in the final insertion system, which uses a DL-based hole detection method along with a faster controller for insertion.

III Problem Statement

In this section, we present the problem that we are trying to solve in this paper. Loosely speaking, the objective is to control the contact state and the problem state (i.e., the pose of the peg in our case) during an insertion attempt. The schematic in Figure 2 shows the twofold objective of using force feedback in controller design for assembly. The force feedback is used to design a lower-level controller that limits interaction forces in the event of contact formation during an insertion attempt. As can be seen in the figure, any insertion attempt leads to a contact formation, and the goal is to use the corresponding force signature to correct for the underlying misalignment. However, we require that the interaction forces obtained during any arbitrary contact formation remain bounded, irrespective of the reference trajectory provided to the robot. Furthermore, we would like to learn models from the quasi-steady behavior of the system, for ease of learning and prediction.

Fig. 2: The control system design that we study in this paper. The important point to note is that the force feedback is used to design the accommodation controller as well as to correct the object pose for successful insertion.

In all of these cases, we assume that the misalignment is only in a plane, and that there is no angular misalignment between the peg and the hole. This corresponds to a case often encountered in practice, when the hole base slides across a working surface in a factory, for example a workbench. In order to train a model, we try to solve the following problems in this paper, which are then used together to design a force-feedback controller for performing peg-in-hole assembly in the presence of significant positional inaccuracy.

  1. Suppose that we have a reference trajectory for insertion, denoted as $x^r_t$. Suppose that, due to contact formation, the robot experiences a sequence of contact forces denoted as $F_t$. The force control task is to design a force feedback controller that modifies $x^r_t$ using a force feedback law so that the contact forces converge, i.e., there exists a $T$ such that $\|F_{t+1} - F_t\| \le \epsilon$, $\forall t \ge T$, where $\epsilon$ is arbitrarily small.

  2. The second task is then to analyze and use the force signature data obtained under the force controller to design a force feedback controller that corrects the misalignment between the peg and the hole position.

In summary, the goal is to use force feedback to design both the lower-level accommodation controller and the corrective policy that allows the robot to correct any contact formation for successful assembly (see also Figure 2).

IV Controller Design

In this section, we present the design and analysis of the compliance controllers that we use to ensure safe interaction during insertion. We believe that this is a critical step to ensure the safety of the learning process. Even though there has recently been a lot of work on robot learning approaches for manipulation, ensuring the safety of the contact-rich interactions during these tasks has largely been overlooked. However, this is a critical requirement for the adoption of learning-based approaches in assembly and many other related operations. Based on this motivation, we present the design and analysis of two different kinds of controllers using force feedback, with different convergence behaviors.

In both of these controllers, we use the force measured by a force-torque sensor mounted at the wrist of the robot (see Figure 1) to adapt a reference trajectory so as to regulate the interaction forces experienced by the robot with its environment. The idea is to use force feedback to modify the reference trajectory so as to limit the contact forces to allowable bounds. For clarity of presentation, we present block diagrams for both controllers in Figure 3.

Fig. 3: Block diagrams for the two controllers described in the paper. $B$: Accommodation Matrix, $K$: Stiffness Gains of the low-level stiffness controller.

IV-A Linear Accommodation Controller Design

The operation of the proposed linear accommodation controller is presented in Figure 3. As can be seen in the block diagram, the accommodation controller modifies the reference trajectory using force feedback. In particular, it uses the following feedback law to modify the reference trajectory of the robot. Let us denote the discrete-time reference trajectory by $x^r_t$, the trajectory commanded to the low-level position controller by $x^c_t$, the experienced forces by $F_t$, and the measured position by $x_t$, at any instant $t$. Note that $t$ here denotes the control time index and not the actual time in seconds. In this design, we employ a low-level compliant position controller that makes the robot behave like a spring-damper system with desired stiffness and damping coefficients. Most robot vendors provide such a stock controller with the robot, or if not, one can be implemented relatively easily ([18]). Let us denote the stiffness constant of the compliant position controller by $K$ and the accommodation matrix for the force feedback by $B$. For simplicity, we consider a diagonal matrix $B$. With this assumption, we present the force-feedback law for updating the commanded position along each individual axis next. The commanded trajectory sent to the robot is computed using the following update rule (also see Figure 3):

$x^c_t = x^c_{t-1} + \Delta x^r_t - B \sum_{i=0}^{t} \gamma^{t-i} F_i$   (1)

where $\gamma \in (0, 1)$ is a discounting parameter for computing the integral error, and $\Delta x^r_t = x^r_t - x^r_{t-1}$ are the desired position increments computed from the reference trajectory. An actual force trajectory obtained for a reference trajectory that advances with constant velocity along the vertical axis of the robot under the operation of the linear accommodation controller is shown in Figure 4a. Note that even though the reference trajectory keeps advancing, the experienced force stabilizes; this behavior is in contrast to that of the stock compliant controller, where contact forces grow proportionally to the advance of the reference position, and can easily become dangerously large for the robot or the manipulated parts. (It is generally not feasible to limit these forces by making the stiffness of the stock compliant controller very low, because the robot does not know exactly where an obstacle will be encountered. In contrast, the proposed accommodation controller guarantees bounded forces even if the reference trajectory advances to infinity, as long as this happens at a constant velocity. The latter condition can be guaranteed easily by sampling any desired geometric reference trajectory accordingly.)
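The qualitative behavior of this update law is easy to reproduce in simulation. The sketch below runs Equation (1) in one dimension against a rigid obstacle, assuming the stock compliant controller behaves as a pure spring; all numbers (stiffness, gains, obstacle location) are made up for illustration and are not the values used on the real robot:

```python
# 1-D simulation of the linear accommodation controller: the commanded
# position advances at constant velocity, and a discounted integral of
# the measured force is subtracted from it (all parameters illustrative).
K = 1000.0     # stiffness of the low-level compliant controller [N/m]
b = 2e-6       # accommodation gain
gamma = 0.9    # discount factor for the force integral
dx = 0.0005    # reference increment per control step [m]
x_wall = 0.01  # location of the (unknown) rigid obstacle [m]

x_c, I, forces = 0.0, 0.0, []
for t in range(4000):
    x = min(x_c, x_wall)   # the rigid environment stops the robot
    F = K * (x_c - x)      # contact force (toy spring model, noise-free)
    I = gamma * I + F      # discounted integral of observed forces
    x_c += dx - b * I      # accommodation update
    forces.append(F)
```

Because the feedback term is a discounted integral of the force, the commanded increment is eventually cancelled and the force settles near dx*(1-gamma)/b, even though the reference keeps advancing; a stock stiffness controller would instead let the force grow in proportion to the advance of the reference.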

IV-B Non-linear Accommodation Controller Design

(a) Force signature obtained by the Linear Accommodation Controller
(b) Force signature obtained by the Nonlinear Accommodation Controller
Fig. 4: Force trajectories obtained with the same reference trajectory, but using the two different accommodation controllers from Section IV. As we can observe from the plots above, the non-linear force controller achieves much faster convergence, and the forces converge to lower values for the same reference trajectory.

Next, we present a nonlinear feedback law to design an accommodation controller. The corresponding block diagram for this controller is shown in Figure 3. Using the nomenclature from Section IV-A, the non-linear force feedback law is given by the following equation (also see Figure 3):

$x^c_t = x^c_{t-1} + (1 - s(F_t)) \Delta x^r_t$   (2)

where $s(F) = 1 / (1 + e^{-b(\|F\| - F^*)})$ is a sigmoid with gain $b$, and $F^*$ is specified by the user and approximately defines the force around which the controller converges. Note that the control law in (2) does not use an integration block. Rather, the idea here is to use the force feedback to cancel any increment of the commanded trajectory. The proposed feedback law ensures that the feedback does not interfere with the movement of the robot in free space (as the force feedback term is close to zero in free space). However, since $s(F_t)$ quickly converges to $1$ as forces grow beyond $F^*$, this leads to convergence of the commanded trajectory, and hence of the contact forces. The convergence behavior can also be seen in the plot for the nonlinear accommodation controller in Figure 4b.
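A one-dimensional sketch of this law is shown below, again assuming the stock compliant controller behaves as a pure spring against a rigid obstacle; the sigmoid shape and all parameter values are illustrative assumptions:

```python
import math

# 1-D simulation of the nonlinear accommodation law: a sigmoid of the
# measured force gates the reference increment (illustrative parameters).
K = 1000.0     # stiffness of the low-level compliant controller [N/m]
beta = 5.0     # sigmoid sharpness (the accommodation term)
F_star = 2.0   # user-specified force level the controller settles around [N]
dx = 0.0005    # reference increment per control step [m]
x_wall = 0.01  # rigid obstacle location [m]

def s(F):
    # ~0 in free space (feedback does not disturb free motion),
    # -> 1 once |F| exceeds F_star (feedback cancels the increment)
    return 1.0 / (1.0 + math.exp(-beta * (abs(F) - F_star)))

x_c, forces = 0.0, []
for t in range(2000):
    x = min(x_c, x_wall)
    F = K * (x_c - x)
    x_c += dx * (1.0 - s(F))   # increment is cancelled as F exceeds F_star
    forces.append(F)
```

Unlike the integral law, no force history is kept: the commanded trajectory simply stops advancing once the force climbs past F_star, so the force settles a little above that user-chosen level.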

IV-C Convergence Analysis

Next, we state a theorem which shows that, under the assumption of constant velocity of the reference trajectory, the interaction forces converge for the controllers presented in Sections IV-A and IV-B. To prove convergence of the forces, we need an assumption that relates the robot position and the commanded position to the contact forces.

Assumption 1

The robot is equipped with a stiffness controller with stiffness constant $K$ such that the forces observed during an interaction are given by $F_t = K(x_t - x^c_t) + \eta_t$, where $x_t$, $x^c_t$ and $\eta_t$ are the robot actual state, the robot commanded state and the observation noise, respectively.

We make another assumption regarding the velocity of the reference trajectory of the robot.

Assumption 2

The reference trajectory of the robot has a constant velocity, i.e., $\Delta x^r_t = x^r_t - x^r_{t-1} = v$ for all $t$, for some constant $v$.

With these two assumptions, we can now state the following theorem.

Theorem IV.1

Suppose that a reference trajectory with constant velocity is modified with the force feedback specified in Equations (1) and (2). Suppose that the robot makes contact with a rigid environment at time instant $t_c$. Then, there exists a $T \ge t_c$ such that $\|F_{t+1} - F_t\| \le \epsilon$, $\forall t \ge T$, where $\epsilon$ can be made arbitrarily small.

Since the robot moves with constant velocity in free space, there is no force experienced by the force sensor (except for the measurement noise). Thus, we ignore the part of the trajectory before contact formation.

Upon contact formation with the external environment, the measured position remains at the (rigid) contact location, $x_t = \bar{x}$. Using Assumption 1 (we ignore the noise term) and Equation (1), we get the following:

$F_t = K\left(\bar{x} - x^c_{t-1} - \Delta x^r_t + B \sum_{i=0}^{t} \gamma^{t-i} F_i\right)$   (3)

For simplicity of notation, let us denote the summation term by $I_t$. Thus, the above equation can be simplified as follows:

$F_t = F_{t-1} - K(\Delta x^r_t - B I_t)$   (4)

Using the above equation, we can write that $\|F_t - F_{t-1}\| = \|K(\Delta x^r_t - B I_t)\|$. Note that $B I_t$ is a discounted infinite sum of the sequence of observed forces times a gain term. To show convergence, we make the assumption that we can find at least one $\gamma$ and accommodation term $B$ such that $\|\Delta x^r_t - B I_t\| \le \delta$, $\forall t \ge T$, where $\delta$ is arbitrarily small. Using this assumption, we then have that $\|F_t - F_{t-1}\| \le \|K\| \delta$, i.e., the contact forces converge.

Convergence of the nonlinear controller given by Equation (2) is straightforward. It can be shown using the convergence properties of the sigmoid $s(\cdot)$, as follows. Equation (2) can be re-arranged as:

$x^c_t - x^c_{t-1} = (1 - s(F_t)) \Delta x^r_t$   (5)

The convergence rate of the sigmoid function in Equation (2) can be controlled by the accommodation term $b$. Using the asymptotic convergence of $s(\cdot)$, we have that there exists a $T$ such that $1 - s(F_t) \le \tilde{\delta}$, $\forall t \ge T$. Then we can use this to re-write Equation (5) as:

$\|x^c_t - x^c_{t-1}\| \le \tilde{\delta} \|v\|$   (6)

where $v = \Delta x^r_t$ is the constant velocity of the reference trajectory. Convergence of $x^c_t$ follows from the fact that $\tilde{\delta}$ can be made arbitrarily small. Then, using Assumption 1 and Equation (6), we can show that $\|F_t - F_{t-1}\|$ can also be made arbitrarily small.

The assumption regarding the existence of $\gamma$ and $B$ is not very strict. In practice, we were able to find an interval for $\gamma$ for which our infinite sum converged. The plots shown in Figure 4a were obtained with one such choice of parameters. The above provides a solution to the first problem presented in Section III. In the next section, we analyze the data collected using the proposed force controller and present the design of a learning-based controller for peg insertion.
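This existence claim can also be checked numerically in a simplified setting. Assuming a one-dimensional rigid contact where the measured force is a linear spring function of the commanded position (Assumption 1) and the feedback is a discounted integral of past forces, the closed loop in contact reduces to a two-state linear system, and convergence holds exactly when its spectral radius is below one. The gains below are illustrative:

```python
# Spectral-radius check for a 1-D linear accommodation loop in contact.
# State (F_t, I_{t-1}), derived from F = K*(x_c - x_wall) and the
# discounted-integral update; the specific numbers are illustrative.
def spectral_radius(K, b, gamma):
    # closed-loop matrix: F_{t+1} = (1 - K*b) F_t - K*b*gamma I_{t-1} + const
    #                     I_t     = F_t + gamma I_{t-1}
    a11, a12 = 1.0 - K * b, -K * b * gamma
    a21, a22 = 1.0, gamma
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        r = max(abs((tr + disc ** 0.5) / 2.0), abs((tr - disc ** 0.5) / 2.0))
    else:
        r = det ** 0.5          # complex pair: |lambda| = sqrt(det)
    return r

K, b = 1000.0, 0.003
stable = [g / 100.0 for g in range(1, 100)
          if spectral_radius(K, b, g / 100.0) < 1.0]
# "stable" is a whole interval of discount factors, not isolated points
```

For this (made-up) stiffness and gain, every discount factor above 0.5 stabilizes the loop, illustrating that a workable interval of parameters exists rather than a single fragile tuning.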

V Learning Predictive Model for Misalignment

In this section, we analyze contact wrench data to understand the dependence of the contact wrench on the misalignment. To provide a complete understanding of this relationship, we analyze the force signature data that is collected during an initial training phase. The purpose of this model is to predict misalignment based on the force signature which is characteristic of a certain contact configuration.

To learn the predictive model, we collect training data consisting of the force signature for the contact configuration created by a known amount of misalignment. We use the accommodation controller presented earlier during data collection, to ensure safe interaction during this exploration phase. Furthermore, we ensure that we measure the force signature for a given misalignment at a quasi-steady state, when it has converged to an asymptotic value, which simplifies the learning problem.

V-A Data Collection

To learn a predictive model for correcting misalignment, we collect data by introducing misalignment in the position of the peg with respect to the hole. The work in this paper only considers planar misalignment between the peg and the hole. Consequently, we introduce misalignment in the $x$ and $y$ axes from the known hole location. The misalignment is sampled from a uniform distribution over an interval of a few mm. This interval was chosen because the deep learning-based hole detection method we use achieves similar accuracy in the estimated position of the hole. With the added misalignment in the position of the peg, any insertion attempt leads to a contact formation between the peg and the hole environment. The contact formation produces a force signature that is observed through the F/T sensor mounted at the wrist of the robot (see Figure 1). For every episode of data collection, the robot follows the insertion trajectory and records the force measurements from the F/T sensor for the resulting contact formation. Thus, we collect a data set storing each misalignment together with the measured force signature corresponding to it. We use a Mitsubishi Electric Factory Automation (MELFA) RV-AS-D Assista 6-DoF arm (see Figure 1) for the experiments. The robot has sub-millimeter pose repeatability and is equipped with a Mitsubishi Electric F/T sensor F-FS-W (see Figure 1). In the initial set of experiments, we also verified that Assumption 1 is valid for our robotic setup.
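The episode loop can be sketched as follows. The robot and F/T sensor are replaced here by a synthetic stub; the linear force model, the noise levels, and the +/-2 mm sampling interval are invented for illustration:

```python
import random

rng = random.Random(7)
HOLE_XY = (0.0, 0.0)   # hole location known during training [mm]

def attempt_insertion(target_xy):
    # Stub for "run a compliant insertion attempt and return the
    # quasi-steady wrench": a toy linear force model plus noise.
    dx, dy = target_xy[0] - HOLE_XY[0], target_xy[1] - HOLE_XY[1]
    return (2.0 * dx + rng.gauss(0.0, 0.2),   # lateral force ~ misalignment
            2.0 * dy + rng.gauss(0.0, 0.2),
            -10.0 + rng.gauss(0.0, 0.2))      # insertion-axis force

dataset = []
for episode in range(500):
    miss = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))  # sampled offset
    wrench = attempt_insertion((HOLE_XY[0] + miss[0], HOLE_XY[1] + miss[1]))
    dataset.append((wrench, miss))            # force signature + label
```

Each record pairs the known, deliberately injected misalignment with the force signature it produced; the supervised models of the following sections are trained on exactly this kind of (wrench, misalignment) pair.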

V-B Numerical Analysis for Convergence

We analyze the convergence properties of the proposed controllers. In Figures 5 and 6, the statistics of the force signature measured by the F/T sensor along the vertical direction are reported at regular time intervals for all the experiments described in Section V-A with the linear controller. Similarly, we report these quantities for the nonlinear controller in Figures 7 and 8.

Fig. 5: Average values of the vertical forces computed every consecutive on the trajectory. Left: shows the mean and the confidence interval of these values for all the experiments. Right: zooms in only the last of the trajectory.
Fig. 6: Standard deviation values of the vertical forces computed every consecutive on the trajectory. Left: shows the mean and the confidence interval of these values for all the experiments. Right: zooms in only the last of the trajectory.

In particular, we have computed the mean, $\mu$, and twice the standard deviation, $2\sigma$, of the vertical force for each time interval along the trajectory, over all the 1200 experiments. The mean and the confidence interval of these two statistics are reported in Figures 5 and 6 for the linear controller, and in Figures 7 and 8 for the nonlinear controller, respectively.

The practical purpose of this analysis is to be able to decide online, as early as possible, when the controller has converged to a stable value of the vertical forces. The criterion we selected to decide the convergence of the system is based on the changes we can observe in the 4 statistics described above: the mean of the mean, $\mu_\mu$, the standard deviation of the mean, $\sigma_\mu$, the mean of the standard deviation, $\mu_\sigma$, and the standard deviation of the standard deviation, $\sigma_\sigma$. We then took the difference of each of these statistics between consecutive time intervals, $\Delta s_k = |s_k - s_{k-1}|$, where $s$ is one of the four statistics and $k$ indexes the time intervals. We declare that the system has converged if

$(\Delta \mu_{\mu,k} < \epsilon_{th}) \wedge (\Delta \sigma_{\mu,k} < \epsilon_{th}) \wedge (\Delta \mu_{\sigma,k} < \epsilon_{th}) \wedge (\Delta \sigma_{\sigma,k} < \epsilon_{th})$   (7)

holds for 2 consecutive time intervals $k$, where $\wedge$ is the and operator. Basically, we are looking for the first time interval from which the changes in all the statistics are less than a predetermined threshold $\epsilon_{th}$.
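A single-trace version of this criterion can be sketched as follows. The actual criterion aggregates four statistics across all experiments; here, as a simplification, we track only the windowed mean and standard deviation of one force signal, with made-up window and threshold values:

```python
import statistics

def converged_at(signal, window, eps, n_consec=2):
    # Windowed mean/std over consecutive intervals; declare convergence
    # once their interval-to-interval changes stay below eps for
    # n_consec consecutive intervals. Returns the interval index, or None.
    means, stds = [], []
    for i in range(0, len(signal) - window + 1, window):
        chunk = signal[i:i + window]
        means.append(statistics.mean(chunk))
        stds.append(statistics.pstdev(chunk))
    run = 0
    for k in range(1, len(means)):
        if (abs(means[k] - means[k - 1]) < eps
                and abs(stds[k] - stds[k - 1]) < eps):
            run += 1
            if run >= n_consec:
                return k
        else:
            run = 0
    return None
```

On a trace that settles after a transient, the detector fires shortly after the transient ends; on a steadily ramping trace it never fires, so data collection would not be cut short prematurely.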

Criterion (7) applied to the linear controller (see the signals in Figures 5-6) indicates that the controller converged after . Analogously, for the nonlinear controller (see the signals in Figures 7-8), the controller converged after . Note that the confidence intervals never go to zero because of the measurement noise. This empirical analysis confirms the theoretical convergence results shown in Section IV. Therefore, the classifiers can be computed based on values at convergence, without having to wait for the end of the experiment.

Fig. 7: Average values of the vertical forces computed every consecutive on the trajectory. Left: shows the mean and the confidence interval of these values for all the experiments. Right: zooms in only the first of the trajectory.
Fig. 8: Standard deviation values of the vertical forces computed every consecutive on the trajectory. Left: shows the mean and the confidence interval of these values for all the experiments. Right: zooms in only the first of the trajectory.

V-C Model Learning Performance

We train classification and regression models using the collected contact force data to learn predictive models for the direction and magnitude of the misalignment. We use the results from the previous section to decide on convergence, and use the convergence criterion to decide when to stop collecting data during the contact formation; the convergence time found in the previous section is used when collecting the force signature. We then train a classification and a regression model to learn the direction and the magnitude of the error, respectively, to understand the efficacy of the models. The results of classification and regression are shown in Tables I and II (see the results with full features). Note that we achieve better classification results with the linear accommodation controller; however, the nonlinear controller converges faster and thus allows faster prediction. Another point to notice is that we achieve good RMSE scores with both controllers, with the linear controller again slightly better than the nonlinear one. However, this problem requires that we predict the directions accurately. Both models predict the direction of misalignment along the $x$ and $y$ axes with high accuracy (see Table I), with the linear controller achieving the higher overall accuracy. This might be due to the higher interaction forces it allows, which lead to less noise in the force signatures.

Classification Accuracy (higher is better)
Axis | Linear Controller           | NonLinear Controller
     | Full Feat | Reduced Feat    | Full Feat | Reduced Feat
X    | 0.9964    | 0.9916          | 0.9928    | 0.9916
Y    | 0.939     | 0.949           | 0.92      | 0.920
TABLE I: Classification accuracy in the prediction of the direction of misalignment along the X and Y axes using Gaussian process classifiers.
RMSE [mm] (lower is better)
Axis | Linear Controller           | NonLinear Controller
     | Full Feat | Reduced Feat    | Full Feat | Reduced Feat
X    | 0.59      | 0.61            | 0.59      | 0.55
Y    | 0.83      | 0.67            | 0.92      | 0.72
TABLE II: Regression accuracy in the prediction of the magnitude of misalignment along the X and Y axes using Gaussian process regression.
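We do not reproduce the Gaussian process models here; as a lightweight stand-in, the sketch below illustrates the shape of the direction-classification task using a 1-nearest-neighbour rule on synthetic force features. The toy force model and all numbers are invented:

```python
import random

rng = random.Random(0)

def episode():
    # Synthetic contact episode: the quasi-steady lateral force carries
    # the sign of the planar misalignment (toy model, invented numbers).
    d = rng.uniform(-2.0, 2.0)
    fx = 3.0 * (1.0 if d > 0 else -1.0) + rng.gauss(0.0, 0.5)  # informative
    fz = -10.0 + rng.gauss(0.0, 0.5)                           # uninformative
    return (fx, fz), int(d > 0)

train = [episode() for _ in range(200)]
test = [episode() for _ in range(100)]

def predict(feat):
    # 1-nearest-neighbour in the force-feature space
    nearest = min(train, key=lambda ex: sum((a - b) ** 2
                                            for a, b in zip(ex[0], feat)))
    return nearest[1]

accuracy = sum(predict(f) == y for f, y in test) / len(test)
```

Even this trivial classifier separates the two directions almost perfectly when the lateral force is informative, which is the same structure the Gaussian process models exploit on the real force signatures.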

V-D Feature Importance

We use feature importance analysis to identify which features are relevant for learning the predictive model for misalignment from force observations. Feature analysis can help with a better understanding of the problem. In particular, we use a forest of trees to evaluate the importance of the force features on the classification task [16]. We consider the Cartesian force signals and the corresponding moment signals from the F/T sensor, which form the wrench signal we use as features for identifying the hole misalignment. The fitted attribute provides the feature importances, computed as the mean and standard error of the accumulated impurity decrease within each tree. We observe that the $F_x$ and $F_y$ force signals and the $M_x$ and $M_y$ moments are found important for the classification task, whereas the $F_z$ force signal and the $M_z$ moment are unimportant. In Figure 9, the bars show the feature importances of the forest, along with their inter-tree variability represented by the error bars. This agrees with the physical intuition about the insertion: since the forces in the $z$ direction are constant for all trials, they should not provide any discriminating information for class separation. Similarly, the contact formation during insertion attempts should not lead to any moment about $z$, and thus this information is also not useful for making misalignment decisions. We repeat the classification and regression modeling with the reduced feature sets, and the results are listed in Tables I and II (see the reduced-feature results). It can be observed that we can achieve comparable or better performance than when using the full force signature for learning a predictive model. This shows the effectiveness of feature selection: we are able to do at least as well as using the entire 6-dimensional wrench vector.

Fig. 9: Feature importances of the forest of trees, computed as the mean and standard error of the impurity decrease accumulated within each tree. Blue and orange bars show which features are relevant for learning the predictive model for X- and Y-direction misalignment, respectively.
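The forest-of-trees importance analysis above can be reproduced in a few lines. This is a sketch on synthetic data: the 6-dimensional wrench samples and their correlation with the class label are fabricated to mimic the paper's finding (informative X/Y channels, uninformative Z channels), using scikit-learn's `feature_importances_` attribute, which is the impurity-based measure described in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical wrench features per insertion attempt: (Fx, Fy, Fz, Mx, My, Mz).
rng = np.random.default_rng(1)
n = 300
label = rng.integers(0, 2, size=n)        # direction of X misalignment
wrench = 0.1 * rng.standard_normal((n, 6))
wrench[:, 0] += label                     # Fx carries the class information
wrench[:, 4] += 0.5 * label               # My is partially informative
# Fz (index 2) and Mz (index 5) remain pure noise, mimicking the constant
# normal force and the absent z-moment during contact formation.

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(wrench, label)
importances = forest.feature_importances_                     # mean impurity decrease
std = np.std([t.feature_importances_ for t in forest.estimators_], axis=0)

for name, imp, s in zip(["Fx", "Fy", "Fz", "Mx", "My", "Mz"], importances, std):
    print(f"{name}: {imp:.3f} +/- {s:.3f}")
```

Plotting `importances` with `std` as error bars gives a figure of the same form as Fig. 9.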

VI Results for Insertion

In this section, we present the design of an insertion policy based on the predictive models learned from the force signature data described in the previous section. For completeness, we also give brief details of the deep learning-based hole detection framework that is used to test the performance of the force controllers proposed in this paper. This vision module, based on our previous work [12], is used to perform hole-detection-based insertion. We first present details of the vision module used for hole detection, and then present results for insertion using the learned predictive models.

VI-A Vision System for Hole Detection

We choose a supervised learning approach to detect the hole location from visual sensory data obtained from an RGB-D sensor (Intel RealSense D435). Traditional computer vision approaches, such as template matching [4] or the Hough circle transform [23], can produce false positives when the object pose is unknown. We therefore use the Mask R-CNN [9] deep learning architecture for instance-level segmentation to detect hole locations. Our classification setup has two classes, one for the background and one for the hole location, and the network predicts the corresponding segmentation masks for hole locations. We performed supervised transfer learning from weights pre-trained on the MS COCO dataset. For the training dataset, we captured 300 images of size 640×480 at different distances and annotated the hole pixels with the labelme annotation tool [21]. At inference time, we use the detected segmentation mask of the hole location to look up the corresponding registered point cloud data points; the output of the approach is an estimate of the 3D hole location from the visual sensory data. Figure 10 shows qualitative samples of the hole detection approach on the point cloud of the test object.

Fig. 10: Samples of the visual hole detection shown on the point cloud of the object. Detected hole locations are indicated in red and the masks are shown in green.
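The last step above, turning a detected mask into a 3D hole location, can be sketched as follows. This is a hypothetical helper, not the paper's implementation: it assumes a registered depth image and pinhole intrinsics `fx, fy, cx, cy`, deprojects the mask pixels into the camera frame, and averages them.

```python
import numpy as np

def hole_location_3d(mask, depth, fx, fy, cx, cy):
    """Estimate the 3D hole center (camera frame) from a binary segmentation
    mask and a registered depth image, via pinhole deprojection of the mask
    pixels followed by averaging. Hypothetical sketch, not the paper's code."""
    v, u = np.nonzero(mask)        # pixel coordinates inside the mask
    z = depth[v, u]
    valid = z > 0                  # drop pixels with missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx          # standard pinhole deprojection
    y = (v - cy) * z / fy
    return np.array([x.mean(), y.mean(), z.mean()])

# Toy usage: a small square mask centered on the principal point at 0.5 m depth
# should deproject to a point on the optical axis.
mask = np.zeros((480, 640), dtype=bool)
mask[235:246, 315:326] = True
depth = np.full((480, 640), 0.5)
center = hole_location_3d(mask, depth, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

Averaging over the mask makes the estimate robust to individual noisy depth pixels; a median could be substituted if the mask occasionally bleeds onto the background.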

VI-B Insertion Using Force Feedback Models

In this section, we present results from performing insertion with the trained force feedback models in the presence of error in the detection of the hole location. We use the vision module to detect the approximate location of the hole in the environment of the robot. Compared to our previous work [12, 13], we experiment with parts with tighter tolerances to test the performance of the force controller. To test the integration of the force controller with the vision-based hole detection, we move the object within the field of view of the RGB-D sensor (see Figure 1), and the robot is asked to perform insertion based on the DL estimate of the hole location. We find that the DL-based method is fairly accurate for reaching the vicinity of the hole. We then use the learned force controller to perform insertion, overcoming any remaining misalignment: the robot uses the prediction of the trained classifiers to move by a fixed unit step in the predicted direction while maintaining contact with the object surface. This is repeated until the robot either succeeds in insertion or diverges beyond a preset threshold, with an upper limit on the number of correction attempts. We measure the number of corrections made by the linear and non-linear controllers, moving the object to random locations in the view of the camera and attempting insertion. We observe that both models achieve near-perfect success rates: the ML model with the non-linear controller fails in 1 out of 20 attempts (95% success rate), and the ML model with the linear controller succeeds with a small average number of corrections per attempt. A more thorough analysis of the controller performance is left to an extended version of the paper.
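The correction loop described above can be sketched as a simple policy. This is an illustrative sketch, not the paper's code: `predict_direction` is a hypothetical stand-in for the trained classifier acting on force signatures, and the step size, divergence threshold, and attempt limit are placeholder values.

```python
import numpy as np

def insertion_policy(start_offset, predict_direction, step=0.5,
                     diverge_limit=5.0, max_attempts=10):
    """Correction loop: while the peg is misaligned, query the direction
    classifier and take one unit step toward the predicted hole direction,
    stopping on success, divergence, or the attempt limit."""
    offset = np.asarray(start_offset, dtype=float)  # residual misalignment, mm
    for corrections in range(max_attempts):
        if np.linalg.norm(offset) < step:           # close enough: insertion succeeds
            return True, corrections
        if np.linalg.norm(offset) > diverge_limit:  # drifted too far from the hole
            return False, corrections
        offset -= step * predict_direction(offset)  # one corrective unit step
    return np.linalg.norm(offset) < step, max_attempts

# An ideal per-axis direction classifier (sign of the misalignment) standing in
# for the Gaussian process classifiers trained on force signatures.
ok, n_corrections = insertion_policy([1.2, -0.8], lambda o: np.sign(o))
```

With a perfect classifier the loop converges in a handful of steps; classifier errors show up as wasted or wrong-direction steps, which is what the measured average number of corrections captures.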

VII Conclusions and Future Work

In this paper, we presented the design and analysis of accommodation controllers for contact interaction during assembly operations, and their use in adaptive assembly strategies based on machine learning. Most assembly operations with tight tolerances result in complex contact formations which might damage the parts being assembled; ensuring safe operation of robots therefore requires force feedback controllers that can limit contact forces in the presence of sustained contacts. We presented two designs of generalized accommodation controllers that use force feedback during contact interaction to ensure limited contact forces, and analyzed them to show convergence of the contact forces under the assumption of a constant-velocity underlying reference trajectory. We then presented results from different machine learning models trained on different signal statistics, and compared them to find an optimal set of signal features. Finally, we used the trained models to perform insertion with a DL-based vision algorithm for hole detection. We show that we are able to achieve a near-perfect success rate for insertion using the proposed controllers together with the DL-based vision system for detecting the hole location, on parts with tight tolerances.

In the future, we will perform a more rigorous comparison between the linear and non-linear controllers at different operating velocities of the robot, and identify the operating conditions that lead to the fastest insertion times and the highest success rates.

References

  • [1] F. J. Abu-Dakka, B. Nemec, J. A. Jørgensen, T. R. Savarimuthu, N. Krüger, and A. Ude (2015) Adaptation of manipulation skills in physical contact with the environment to reference force profiles. Autonomous Robots 39 (2), pp. 199–217. External Links: ISSN 1573-7527 Cited by: §II.
  • [2] H. Asada (1990) Teaching and learning of compliance using neural nets: Representation and generation of nonlinear compliance. In International Conference on Robotics and Automation, pp. 1237–1244. External Links: ISBN 0818620617, Document Cited by: §II.
  • [3] H. Asada (1993) Representation and Learning of Nonlinear Compliance Using Neural Nets. IEEE Transactions on Robotics and Automation 9 (6), pp. 863–867. External Links: Document, ISSN 1042296X Cited by: §II.
  • [4] K. Briechle and U. D. Hanebeck (2001) Template matching using fast normalized cross correlation. In Optical Pattern Recognition XII, Vol. 4387, pp. 95–102. Cited by: §VI-A.
  • [5] S. Dong, D. Jha, D. Romeres, S. Kim, D. Nikovski, and A. Rodriguez (2021) Tactile-RL for insertion: generalization to objects of unknown geometry. In 2021 IEEE International Conference on Robotics and Automation (ICRA). Cited by: §II.
  • [6] S. Dong and A. Rodriguez (2019-11) Tactile-Based Insertion for Dense Box-Packing. In IEEE International Conference on Intelligent Robots and Systems, pp. 7953–7960. External Links: ISBN 9781728140049, Document, ISSN 21530866 Cited by: §II.
  • [7] S. Gottschlich and A. C. Kak (1989) A dynamic approach to high-precision parts mating. IEEE Transactions on Systems, Man, and Cybernetics 19, pp. 797–810. Cited by: §II.
  • [8] V. Gullapalli, J. A. Franklin, and H. Benbrahim (1994) Acquiring Robot Skills via Reinforcement Learning. IEEE Control Systems 14 (1), pp. 13–24. External Links: Document, ISSN 1066033X Cited by: §II, §II.
  • [9] K. He, G. Gkioxari, P. Dollár, and R. Girshick (2017) Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision. Cited by: §VI-A.
  • [10] H. Inoue (1974-08) Force Feedback in Precise Assembly Tasks. Massachusetts Institute of Technology. Cited by: §II.
  • [11] T. Inoue, G. De Magistris, A. Munawar, T. Yokoya, and R. Tachibana (2017-12) Deep reinforcement learning for high precision assembly tasks. In IEEE International Conference on Intelligent Robots and Systems, pp. 819–825. External Links: ISBN 9781538626825, Document Cited by: §II.
  • [12] S. Jain, D. Romeres, D. K. Jha, W. Yerazunis, D. Nikovski, and A. Sullivan (2022) Automated visual hole detection for robotic peg-in-hole assembly. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. under review. Cited by: §VI-B, §VI.
  • [13] D. K. Jha, D. Romeres, W. Yerazunis, and D. Nikovski (2021) Imitation and supervised learning of compliance for robotic assembly. arXiv preprint arXiv:2111.10488. External Links: 2111.10488 Cited by: §II, §VI-B.
  • [14] P. Kulkarni, J. Kober, R. Babuška, and C. D. Santina (2021-09) Learning Assembly Tasks in a Few Minutes by Combining Impedance Control and Residual Recurrent Reinforcement Learning. Advanced Intelligent Systems, pp. 2100095. External Links: Document, ISSN 2640-4567 Cited by: §II, §II.
  • [15] S. Levine, C. Finn, T. Darrell, and P. Abbeel (2016) End-to-end training of deep visuomotor policies. Journal of Machine Learning Research 17, pp. 1–40. External Links: Document Cited by: §II, §II.
  • [16] A. Liaw, M. Wiener, et al. (2002) Classification and regression by randomforest. R news 2 (3), pp. 18–22. Cited by: §V-D.
  • [17] Y. Liu, D. Romeres, D. K. Jha, and D. Nikovski (2020) Understanding multi-modal perception using behavioral cloning for peg-in-a-hole insertion tasks. CoRR abs/2007.11646. External Links: 2007.11646 Cited by: §II.
  • [18] K. Lynch and F. Park (2017) Modern Robotics. Cambridge University Press. Cited by: §IV-A.
  • [19] J. L. Nevins and D. E. Whitney (1977) Research on Advanced Assembly Automation. IEEE Computer 10 (12), pp. 24–38. External Links: Document Cited by: §II, §II.
  • [20] M. A. Peshkin (1990) Programmed Compliance for Error Corrective Assembly. IEEE Transactions on Robotics and Automation 6 (4), pp. 473–482. External Links: Document, ISSN 1042296X Cited by: §II.
  • [21] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman (2008) LabelMe: a database and web-based tool for image annotation. International journal of computer vision 77 (1), pp. 157–173. Cited by: §VI-A.
  • [22] J. Xu, Z. Hou, Z. Liu, and H. Qiao (2019) Compare contact model-based control and contact model-free learning: a survey of robotic peg-in-hole assembly strategies. arXiv preprint arXiv:1904.05240. Cited by: §I.
  • [23] H. Yuen, J. Princen, J. Illingworth, and J. Kittler (1990) Comparative study of hough transform methods for circle finding. Image and vision computing 8 (1), pp. 71–77. Cited by: §VI-A.