I Introduction
In many assembly operations, fixtures and jigs are designed to fit the shape of a particular part. Such tools, which fix or support assembly parts, are frequently used to firmly determine or control part orientations, for example in insertion operations. However, designing and producing these tools requires a large amount of time and a high cost. To achieve high-mix low-volume production in a short time at a low cost, we require a versatile assembly jig that can replace the current custom-made jigs designed for each product.
Kiyokawa et al. [6] made the first attempt to use soft materials for a jig. They proposed a soft-jig made of a silicone membrane filled only with beads. This jig fixes parts through a jamming transition, as in a jamming gripper [2], induced by vacuuming the air inside the jig.
If we use part-customized rigid jigs, the object orientation on the jig can be determined precisely because the highly rigid surrounding planes firmly constrain the object's surfaces. In contrast, if we use a soft-jig whose surface can change with a high degree of freedom, the uncertainty of the orientation of the jig-fixed object becomes a crucial issue.

Fig. 1: (a) Soft-jig with a thin flexible membrane filled with transparent beads and oil. (b) Principal normal vector of the target object (cube) estimated by the proposed soft-jig system (red arrow) and the normal vector measured using a motion capture system (green arrow).
To reduce the uncertainty, the orientation of an object can be estimated from its image captured by an external camera. Several state-of-the-art methods for 3D position and orientation estimation have achieved high accuracy [3]. However, when the object is occluded by the gripper and the jig, the orientation estimation deteriorates significantly. Therefore, it is difficult for a robot to manipulate an object with high accuracy using only an external camera.
We attempt to reduce the uncertainty of the object orientation by using a flexible jig with a sensing function that acquires tactile information from the parts in contact with it. Because tactile information is independent of the color and material of the object, it can be treated as highly reliable information.
Therefore, to estimate the orientation of a part while fixing it, we used transparent beads and an oil whose refractive index is close to that of the beads. In this study, to make the markers on the inner surface of the jig recognizable, we replaced the air used in jamming grippers with transparent oil, building on our previous studies [14], [13]. If the refractive indices of all the materials inside the jig are the same, light travels straight through them, and the surface of the membrane can be captured optically by a camera. To convert the captured images into tactile information usable by a robot, markers are painted on the inner surface of the membrane, and two cameras are installed inside the jig.
Because the part orientation needs to be estimated, we propose to calculate a principal normal vector, which represents the normal of the bottom surface of the target object. In our estimation method, we detect the markers on the membrane, estimate the marker positions, fit an approximate plane to the markers, and take the plane normal as the principal normal vector, as shown in Fig. 1. The principal normal vector reduces the uncertainty of the orientation of the object on the jig and allows more precise assembly operations, such as re-grasping or aligning the assembled object with the target orientation.
To evaluate the sensing ability of the proposed soft-jig, we conducted experiments on the principal normal vector estimation using three target objects in different orientations.

II Related work
Flexible fixtures have been developed using a wide range of strategies. Reconfigurable fixtures [10], [12] fix parts by rearranging or replacing their components; the ability to change the fixing configuration provides flexibility. Pin-array fixtures [11], [15] use a shape-memorable mechanism to fix parts by constraining multiple points of the part with pins; the high degree of freedom of the pins provides flexibility. Because these fixtures can fix different types of objects, they are suitable for high-mix low-volume production in terms of part fixing; however, they do not address the uncertainty of the fixed object's orientation. In contrast, our proposed fixture not only fixes parts using the jamming transition but also senses the principal normal vector at the same time.
Among sensors that use flexible materials to acquire object shapes, camera-based methods are attracting attention because of their high spatial resolution. GelSight acquires a normal map using multiple light sources of different colors and photometric stereo [16]. SoftBubble [1], [7] relies on a depth camera to capture the shape of an air-inflated balloon and to acquire the shape of the contact area. Lin et al. [9] proposed a sensor that estimates the curvature of a flexible material using the subtractive color-mixing principle. In our previous work, we used a pinhole camera model to estimate depth from a monocular camera using markers of known size [14], [13].
Our proposed system fixes the object's orientation using the jamming transition, acquires the membrane shape by triangulation of the estimated marker positions, and estimates the orientation of the contacted or fixed object. A depth camera could also be used to estimate the object's orientation, but it is difficult to acquire point-cloud data with a commercial depth camera because the light must pass through a transparent plate, oil, and beads before it reaches the inner surface of the membrane where the markers are attached.


III Design of soft-jig
The soft-jig, shown in Fig. 2, is composed of the following four key components:
- A silicone membrane with markers to capture the surface deformation
- Glass beads for the jamming transition
- A base plate cut out of transparent PET plastic
- Two cameras to acquire the point cloud data around the target object using triangulation
The base plate size is , and the height of the silicone membrane is approximately from the base plate. The actual soft-jig is filled with oil inside the membrane, and the markers can be seen even when the membrane is filled with beads, as shown in Fig. 3.
III-A Silicone membrane
A silicone membrane with multiple rounded convex markers was formed using the mold shown in Fig. 4. The thickness of the membrane is less than (approximately ). To form a thin membrane, we employed a method for coating liquid silicone materials (Dragon Skin FX-Pro, Smooth-On) on a mold surface instead of casting the material with several molds. The mold had hemispherical indentations to create markers. The indentations were filled with white silicone that was colored with pigment (Silc Pig, Smooth-On) and mixed with a thickener (THI-VEX, Smooth-On).
After the white silicone that filled the indentations hardened, we applied black silicone over the mold surface. The silicone was colored with pigment, rather than left in its natural color, to make the markers easier to detect in image processing. The mold was printed with a 3D printer, and to easily remove the hardened silicone from the mold, we puttied the mold surface to be as smooth as possible.
III-B Glass beads
With large beads and a thin membrane, the shapes of the beads appear as bumps on the membrane surface, which changes the contact between the membrane and the object from surface contact to point contact. This causes a decrease in the coefficient of friction between the object and the membrane [4]. The beads must therefore not only be small in diameter to avoid degrading the fixing performance, but also transparent to enable optical sensing.
Based on these requirements, we selected glass beads (FGB-20, Fuji Manufacturing) with a diameter of .
III-C Base plate
The plate holds the silicone membrane, optical markers, and tube connectors. The silicone membrane and tube connectors were fixed with screws, and the optical markers were fixed by inserting pins.
Tube connectors with a filter are attached to the underside of the base plate to pump oil while preventing the beads from flowing out. Because of the small diameter of the beads, the filters need to have a fine mesh, which reduces the mass flow rate $\dot{m}$, as shown in the equation $\dot{m} = \rho A v$, where $\rho$ is the oil density, $A$ is the cross-sectional area of the tube, and $v$ is the flow velocity. In the production line, it is necessary to increase the total flow rate to shorten the takt time. For this purpose, eight tube connectors were attached to the base plate.
III-D Camera
Two cameras (RealSense D435, Intel) were placed approximately below the base plate with a baseline of . Note that although these are depth cameras, they are used only to acquire color images, with image rectification performed in the camera processor.
The cameras were covered with a board to block ambient light, and an LED light maintained a constant brightness.
IV Optical Sensing
Fig. 5 shows a flowchart of the proposed optical sensing procedure, and Fig. 6 shows the markers seen through the oil. The markers on the inner surface of the membrane are visible thanks to the refractive-index-tuned oil, even though the light travels through many beads to reach the membrane. However, because the refractive indices are not perfectly matched, the markers appear blurred in the camera image.
The Laplacian of Gaussian (LoG) was used to accurately detect the marker centers in the blurred images. The standard deviation $\sigma$ of the Gaussian filter was determined from the known variation of the marker size in the image, and the marker centers were obtained by extracting local maxima from the LoG image.
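As a concrete illustration of this step, the following Python sketch detects candidate marker centers with a LoG filter. The values of sigma and the threshold are illustrative assumptions, not values from the paper; the markers are treated as bright blobs on a dark membrane, as described in Section III-A.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def detect_marker_centers(gray, sigma=4.0, threshold=0.02):
    """Detect bright, blob-like markers in a blurred grayscale image.

    sigma and threshold are illustrative values; in practice sigma is
    chosen from the known marker size in the image.
    """
    img = gray.astype(np.float32) / 255.0
    # The negative LoG responds positively at the centers of bright blobs
    # whose radius is roughly sqrt(2) * sigma.
    log = -gaussian_laplace(img, sigma=sigma)
    # Keep pixels that are local maxima of the response and above threshold.
    is_peak = (log == maximum_filter(log, size=int(3 * sigma))) & (log > threshold)
    ys, xs = np.nonzero(is_peak)
    return log, np.stack([xs, ys], axis=1)   # response map and (N, 2) centers
```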
The marker centers detected in the left and right images were refined to sub-pixel accuracy by least-squares fitting of a quadratic function to the neighborhood of each center, and were then converted to a point cloud by triangulation.
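A minimal sketch of the sub-pixel refinement and triangulation follows, assuming rectified images and known 3x4 projection matrices; the names P_left and P_right and the prior left-right matching of markers are assumptions, and the quadratic fit here uses only the immediate neighbors of each integer peak.

```python
import numpy as np
import cv2

def refine_subpixel(response, x, y):
    """Refine an integer peak (x, y) of the LoG response by fitting a
    quadratic to its immediate neighbors along each image axis."""
    def offset(minus, center, plus):
        denom = minus - 2.0 * center + plus
        return 0.0 if abs(denom) < 1e-9 else 0.5 * (minus - plus) / denom
    dx = offset(response[y, x - 1], response[y, x], response[y, x + 1])
    dy = offset(response[y - 1, x], response[y, x], response[y + 1, x])
    return x + dx, y + dy

def triangulate_markers(P_left, P_right, pts_left, pts_right):
    """Triangulate matched sub-pixel marker centers into 3D points.

    P_left, P_right: 3x4 projection matrices of the rectified stereo pair.
    pts_left, pts_right: (N, 2) arrays of matched pixel coordinates.
    """
    pts4d = cv2.triangulatePoints(P_left, P_right,
                                  pts_left.T.astype(np.float64),
                                  pts_right.T.astype(np.float64))
    return (pts4d[:3] / pts4d[3]).T   # (N, 3) points in the left-camera frame
```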
We used singular value decomposition (SVD) to estimate the plane of the inner surface of the soft-jig along the bottom surface of the object, expressed as $ax + by + cz = d$, from the point cloud $P$. Here, $(a, b, c)$ is the principal normal vector of the object, and $d$ corresponds to the amount by which the object is pushed into the soft-jig. Note that if the SVD plane estimation is performed on the entire point cloud, the plane is estimated incorrectly because of markers in areas where the object's bottom surface is not in contact. We solve this problem by using fiducial markers on the base plate of the jig: the point cloud is transformed into the jig-center coordinates, and only the points whose radial distance from the jig center is smaller than the extent of the object's base are kept.
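The cropping and plane fit described above can be sketched as follows; the points are assumed to already be expressed in the jig-center coordinates, and radius_limit is a hypothetical parameter standing in for the size of the object's base.

```python
import numpy as np

def principal_normal(points_jig, radius_limit):
    """Fit a plane to the membrane markers under the object's bottom surface.

    points_jig: (N, 3) marker positions in the jig-center coordinate system.
    radius_limit: only markers closer to the jig center than this radius
        are used, so markers outside the contact area do not bias the fit.
    """
    radial = np.linalg.norm(points_jig[:, :2], axis=1)
    pts = points_jig[radial < radius_limit]
    centroid = pts.mean(axis=0)
    # The plane normal is the right singular vector associated with the
    # smallest singular value of the centered points.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    if normal[2] < 0:              # orient the normal toward the jig's +Z axis
        normal = -normal
    push_in = centroid @ normal    # plane offset d, i.e. the push-in amount
    return normal, push_in         # (a, b, c) and d of the plane ax+by+cz=d
```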


V Experiments and Results
V-A Principal Normal Vector Estimation
We compared the ground-truth data obtained by motion capture with the object's principal normal vector measured from inside the soft-jig. The six reflective markers on the jig and the four reflective markers on the cube were measured from above using motion capture (V120: Trio, OptiTrack), with the setup shown in Fig. 8.
We denote the captured pose of the jig by a rotation matrix $R_j$ and translation vector $t_j$, and the captured pose of the target object by a rotation matrix $R_o$ and translation vector $t_o$. The ground-truth principal normal vector in the jig's marker coordinate system is then calculated from these two poses.
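One plausible form of this computation, under the assumption that the ground-truth normal is the object's local z-axis re-expressed in the jig's marker frame, is sketched below; the choice of axis and the function name are ours.

```python
import numpy as np

def ground_truth_normal(R_jig, R_obj, obj_axis=np.array([0.0, 0.0, 1.0])):
    """Express the object's bottom-surface normal in the jig's marker frame.

    R_jig, R_obj: 3x3 rotation matrices of the jig and the object measured
    by motion capture in the same world frame. Treating the object's local
    z-axis as its bottom-surface normal is an assumption of this sketch.
    """
    return R_jig.T @ (R_obj @ obj_axis)
```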
The projections of the normal vectors estimated from the jig and of the normal vectors obtained from motion capture onto the YZ and XZ planes are shown in Fig. 8. The locations of the peaks and valleys of the ground truth and the estimated values are consistent, but the absolute values of the angles differ. This result shows that the estimated values need to be corrected by calibration before the robot can use them to manipulate the object.

V-B Jig Coordinate Calibration
We performed a calibration to convert the information acquired from the cameras inside the jig into a coordinate system based on the markers attached to the base plate. For the calibration, we used a manipulator (LBR iiwa 14 R820, KUKA) to precisely control the orientation of the object pressed against the jig. The robot and the soft-jig were arranged as shown in Fig. 9. To transform the orientation of an object from the robot coordinate system to the jig coordinate system, a hand-eye camera and four circular markers on the jig were used for coordinate alignment by solving the perspective-n-point (PnP) problem [8].
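A minimal sketch of this coordinate alignment using OpenCV's PnP solver (EPnP, as cited in [8]) is given below; the marker coordinates, camera intrinsics, and distortion coefficients are placeholders, not values from the paper.

```python
import numpy as np
import cv2

def jig_pose_from_markers(markers_jig, markers_image, K, dist):
    """Estimate the jig pose in the hand-eye camera frame from the four
    circular markers on the base plate.

    markers_jig: (4, 3) marker positions in the jig coordinate system.
    markers_image: (4, 2) detected marker centers in the camera image.
    K, dist: camera intrinsic matrix and distortion coefficients.
    """
    ok, rvec, tvec = cv2.solvePnP(markers_jig.astype(np.float64),
                                  markers_image.astype(np.float64),
                                  K, dist, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP estimation failed")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec   # transform from jig coordinates to camera coordinates
```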
Fig. 10 shows a diagram of the robot motion used to press an object against the jig. Because the orientation is estimated from the bottom surface, the estimation accuracy decreases as the bottom surface becomes smaller. In addition, the deviation between the actual and estimated angles becomes large at large contact angles. To investigate the sensing ability with respect to object size and object orientation, we conducted estimation experiments using cylindrical objects with three different diameters tilted at different angles.


Fig. 11 shows the cylindrical objects used for the calibration. The robot grasps each object and moves it to the origin of the jig coordinate system with a horizontal orientation in the XY plane, using the markers recognized by the camera. The object is then moved by the robot in the Z direction until it contacts the jig, and is tilted by the robot to the specified angles of 5, 10, 15, and 20 degrees about the center of the contacting bottom surface. Contact with the jig was judged to occur when the force applied to the wrist of the manipulator changed by or more. The sensing results from inside the jig were acquired while changing the tilt direction over 360 degrees.
The tilt angles acquired in the YZ and XZ planes during the calibration are shown in Fig. 13, together with the number of samples used for the calibration. These angles were calculated by projecting the normal vector onto the YZ and XZ planes and converting the projections into angles. We calculated three parameters, two offsets and one scale, as calibration results to correct the acquired tilt angles. The acquired angle sequences in the YZ and XZ planes can be regarded as pure sine waves, because the robot changes the tilt direction over 360 degrees while keeping the tilt angle constant. Therefore, the offsets and the scale can be calculated by fitting sine functions to the acquired angle sequences, as sketched below.
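One possible computation is sketched here, assuming the offset is taken from a least-squares sine fit over the full revolution and the scale is the ratio of the commanded tilt angle to the fitted amplitude; the variable names are ours, not the paper's.

```python
import numpy as np

def calibrate_offset_scale(tilt_dir_rad, phi_acquired_deg, theta_commanded_deg):
    """Fit offset + a*sin(dir) + b*cos(dir) to the acquired tilt angles.

    tilt_dir_rad: (N,) tilt directions over a full revolution, in radians.
    phi_acquired_deg: (N,) tilt angles estimated from inside the jig.
    theta_commanded_deg: constant tilt angle commanded to the robot.
    """
    A = np.column_stack([np.ones_like(tilt_dir_rad),
                         np.sin(tilt_dir_rad),
                         np.cos(tilt_dir_rad)])
    (offset, a, b), *_ = np.linalg.lstsq(A, phi_acquired_deg, rcond=None)
    amplitude = np.hypot(a, b)               # amplitude of the fitted sine wave
    scale = theta_commanded_deg / amplitude
    return offset, scale
```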
The tilt angles calibrated using these parameters are presented in Fig. 13. The root-mean-square error (RMSE) of the difference between the tilt angles and the angles controlled by the robot is shown in Fig. 14.
When the diameter of the object was small, the number of markers that could be referenced was small, and the normal estimation became noisy. When the diameter of the bottom surface was larger than 50 mm, the noise was reduced and the RMSE stabilized. In addition, the values in the lower-right corner of the RMSE table are larger. This is probably because when a large object is tilted and pressed against the jig at a large angle, part of the object's bottom surface lifts up and loses contact with the jig.

VI Discussion
Our proposed soft-jig has some limitations in terms of material selection. The blurred images are caused by the refractive indices of the beads and the liquid not matching exactly, and by the beads not being completely clear. In this study, we used 500 g of inexpensive glass beads intended for polishing, but more transparent materials, such as acrylic beads, would be more suitable.
A remaining issue is that a camera observing the jig from outside is still required for position estimation. This is because it is difficult to estimate the position of the object from the tactile information alone, owing to the insufficient density of the marker placement. However, we believe that by introducing a mechanism that can switch the membrane between visible and invisible states, as in a previous study [5], we will be able to estimate the positional uncertainty as well.
VII Conclusion
We proposed a soft-jig with a sensing function for assembly systems, a method for estimating the principal normal vector from tactile information, and a calibration method using a real robot. Conventional flexible jigs do not consider the uncertainty caused by their flexibility, whereas the proposed jig with a sensing function can simultaneously fix an object and estimate its orientation. The principal normal vector can be used to refine the object orientation acquired from an external camera during object manipulation, even when the object is occluded.
References
- [1] (2019) Soft-bubble: A highly compliant dense geometry tactile sensor for robot manipulation. In Proceedings of the IEEE International Conference on Soft Robotics (RoboSoft), pp. 597–604. Cited by: §II.
- [2] (2010) Universal robotic gripper based on the jamming of granular material. Proceedings of the National Academy of Sciences 107 (44), pp. 18809–18814. Cited by: §I.
- [3] (2021) Vision-based robotic grasping from object localization, pose estimation, grasp detection to motion planning: A review. Artificial Intelligence Review 54, pp. 1677–1734. Cited by: §I.
- [4] (2021) Effect of the granular material on the maximum holding force of a granular gripper. Granular Matter 23. Cited by: §III-B.
- [5] (2021) Seeing through your skin: Recognizing objects with a novel visuotactile sensor. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1218–1227. Cited by: §VI.
- [6] (2021) Soft-jig-driven assembly operations. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Cited by: §I.
- [7] (2020) Soft-bubble grippers for robust and perceptive manipulation. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 9917–9924. Cited by: §II.
- [8] (2009) EPnP: An accurate O(n) solution to the PnP problem. International Journal of Computer Vision 81 (2), pp. 155. Cited by: §V-B.
- [9] (2020) Curvature sensing with a spherical tactile sensor using the color-interference of a marker array. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 603–609. Cited by: §II.
- [10] (2016) Design and testing of a highly reconfigurable fixture with lockable robotic arms. Journal of Mechanical Design 138 (8). Cited by: §II.
- [11] (2019) A novel universal gripper based on meshed pin array. International Journal of Advanced Robotic Systems 16 (2). Cited by: §II.
- [12] (2013) Reconfigurable handling systems as an enabler for large components in mass customized production. Journal of Intelligent Manufacturing 24, pp. 977–990. Cited by: §II.
- [13] (2019) A parallel gripper with a universal fingertip device using optical sensing and jamming transition for maintaining stable grasps. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 5814–5819. Cited by: §I, §II.
- [14] (2018) A universal gripper using optical sensing to acquire tactile information and membrane deformation. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 6431–6436. Cited by: §I, §II.
- [15] (2021) Development of a shape-memorable adaptive pin array fixture. Advanced Robotics 35 (10), pp. 591–602. Cited by: §II.
- [16] (2017) GelSight: high-resolution robot tactile sensors for estimating geometry and force. Sensors 17 (12), pp. 2762. Cited by: §II.