SEED: Series Elastic End Effectors in 6D for Visuotactile Tool Use

by H. J. Terry Suh et al.
Toyota Research Institute

We propose the framework of Series Elastic End Effectors in 6D (SEED), which combines a spatially compliant element with visuotactile sensing to grasp and manipulate tools in the wild. Our framework generalizes the benefits of series elasticity to 6-dof, while providing an abstraction of control using visuotactile sensing. We propose an algorithm for relative pose estimation from visuotactile sensing, and a spatial hybrid force-position controller capable of achieving stable force interaction with the environment. We demonstrate the effectiveness of our framework on tools that require regulation of spatial forces. Video link:






I Introduction

Many tasks in robot manipulation require handling general tools in the wild; in the future, we believe that robots will be able to grab any tool and perform meaningful control in order to accomplish various tasks and exchange forces with the environment. To manipulate tools skillfully and robustly, we will need end effectors that allow controllable hand-tool interaction in hardware, while having sensing capabilities over this interaction to enable closed-loop feedback.

Parallel-jaw grippers are sufficient for grasping [antipodal], but quickly meet limitations when it comes to forceful tool use. Even when sensing is given via finger attachments [gelslim], the hardware often relies on friction to handle the forces that arise in hand-tool interaction, and may lack the ability to resist spatial forces in some of the axes. For example, torques applied perpendicular to the finger surface are hard to resist, but arise in many tool-use scenarios [holladay]. Multi-fingered hands are much more versatile, but existing solutions (e.g. imposing a full-rank grasp matrix on the tool [liandsastry, cutkosky]) also rely on frictional contacts, which can limit the amount of force they can exert.

More fundamentally, the rigidity of most of our hardware requires non-smooth contact forces to be used to resist external forces in tool-use. Such forces can be notoriously hard to control robustly as their behavior changes instantaneously [suh2021bundled]. In the absence of the challenges brought by rigid contacts, custom tool changers that are rigidly attached to the robot have demonstrated impressive capability to achieve finely controlled force interaction with the environment [grinder]. However, such solutions require modifying tools with specialized handles compatible with the tool changer, which limits the robot’s ability to use unmodified tools.

To alleviate the difficulties coming from rigidity and the non-smooth behavior it brings, we ask the following question in this work: can we consider visuotactile hardware not only as a mechanism for sensing, but also as an opportunity to provide compliance for control? Indeed, similar ideas have been proposed in Series Elastic Actuators (SEAs) [sea]; by attaching a soft spring in front of the gearbox whose deformation can be measured, SEAs have been successful in achieving smooth and stable force control by turning the problem into one of position control [sea, seathesis, hoganimpedancecontrol].

Fig. 1: A visual illustration of our framework.

How can we generalize the benefits of SEAs to the setting of grasping and using arbitrary tools? We propose an answer that attaches soft, spatially compliant elements at the end effector right where interaction with the tool occurs. Mechanically, such a solution can be attached to a low-cost, position-controlled robot, while still achieving the benefits of SEAs in the interaction of the end effector and the tool. Through our solution, we aim to achieve a 6D generalization of SEAs that can be useful for spatial tool use.

Our characterization of spatial series-elastic actuators would not be complete unless we can measure the deformation of the spatial compliance in real time and use the feedback for force control. To achieve this, we leverage recent advances in visuotactile sensing that measure 6D deformation using vision [gelsight, gelslim, bubblegrippers, bigbubble]. In contrast to many works that utilize deep learning to directly process data from visuotactile sensors in an end-to-end manner, we propose to measure the pose of the grasped tool relative to the end effector, abstracting visuotactile sensing as a relative pose estimator.

Our proposed framework of Series Elastic End Effectors in 6D (SEED) consists of three elements: a manipulator capable of accurate position control, a 6D spatially compliant stiffness element, and visuotactile sensing that measures the deformation of the spatial compliance. With these three elements, we show that we can achieve spatial force control of tools with closed-loop feedback from visuotactile sensing.

II Literature Review

II-A Tool-Use in Manipulation

Tool-use has long been one of the hallmarks of intelligence [animaltooluse], as well as a practical problem to solve for robotic applications. As such, many existing works [toussainttooluse, holladay] center around how to give robots the ability to use tools. However, only a few works attempt to perform explicit force control with a tool that has not been rigidly attached to the robot, but rather, must be grabbed before it can be used.

Most existing works in this setting focus on planning, where the grabbed tool must be used to manipulate the pose of another object [toussainttooluse, holladay, pushandpull]. Such plans can be very useful in reaching confined spaces [pushandpull] or beyond the workspace of the manipulator [toussainttooluse]. However, as the focus of these works lies more in planning, tasks that require force exchange among static objects, such as using a squeegee, surface grinding, wiping a table, or using torque drivers, are often not considered.

On the other extreme, classical works in robot force control excel in forceful manipulation with rigidly attached tools. Strategies such as impedance control [hoganimpedancecontrol] and hybrid force-velocity control [hybridpositionforce] have been extensively tested and applied on problems that require force exchange between the robot and the environment [grinder, albuschaffer2, forcecontrol]. However, customized tool changers are quite limited in terms of versatility in the wild.

Finally, works that attempt to explicitly apply forces with the grasped tool [toussaintforce] often run into hardware limitations, as typical parallel jaw grippers with rigid, flat fingers are unable to resist forces and provide compliance in certain directions due to their relatively small contact patch.

II-B Manipulation using Visuotactile Sensing

Visuotactile sensors [gelsight, gelslim, bubblegrippers] consist of a deformable membrane which interacts with objects, and a camera (color, depth or both) under the membrane to measure its deformation during interactions. As the measurements from visuotactile sensors are images, some works have leveraged deep learning approaches to learn the dynamics [swingbot], or directly learn a map from the input image or optical flow to the policy [visionandtouch, tactilerl]. While such approaches can be effective, we first focus on interpretable abstractions in this work that are more conducive for understanding, and may provide more inductive bias [inductivebias] for designing deep models in the future.

Other works have taken a more model-based route. In [tactiledexterity], visuotactile sensors are used to track geometric features of the objects such as lines and points. These features are utilized to track the pose of the object and the contact state, which is then used for feedback control. Similarly, [cablemanipulation] tracks the state of a deformable cable by estimating the contact patch ellipse, and fits a linear dynamics model which is stabilized by LQR. Although we use a similar model-based approach, our work is unique in that we generalize the estimator spatially, then explicitly do force control.

II-C Tactile Force and Pose Estimation

Many of the existing works in tactile pose/force estimation attempt to deal with dense measurements. In [contactpatchposeestimation], ICP is used from dense depth information in order to estimate the poses of the object. Similarly, [tactileposeestimation] uses geometric contact rendering which is then compared with the dense tactile image. While such dense information is useful for classification [gelsight], it is unclear if such dense information is necessary for control.

On the other hand, [tactiledexterity] estimates simpler features such as points and edges, and [cablemanipulation] estimates ellipsoidal contact patches that are sufficient for achieving the task. We use a representation similar to [cablemanipulation], but estimate the patch in 3D instead of localizing it on the plane. While such approaches are efficient to implement and more relevant to the task, we note that they lack the geometric generalizability of dense information.

III Preliminary: 1D Series Elastic Actuator

In this section, we briefly review the concept of 1D SEAs, their proposed benefits, and the corresponding control strategies. Although this section is entirely a review of previous work on SEAs, the ideas presented here have direct correspondences with our generalization.

III-A 1D Series Elasticity

Closed-loop force control often requires a motor, a gearbox, and a force sensor in series. Typically, a relatively stiff sensor based on strain gauges is used. However, this force-feedback setting can result in instability due to high contact stiffness [sea], as well as non-collocation of sensors and actuators [hoganimpedancecontrol, flexiblevehicles, macromicromanipulator]. This prevents the use of the high gains that are necessary to overcome undesired effects of the gearbox.

SEAs, initially proposed in [sea], can be understood as a special case where the sensor stiffness is very low. Under this setting, force-feedback enjoys better stability properties at the expense of controller bandwidth, as the spring acts like a mechanical low-pass filter [stableactuator, hoganimpedancecontrol]. For many household tool-use tasks such as wiping with a squeegee, the loss of control bandwidth does not pose a big problem, as such tasks are usually quasistatic. Thus, one may use high-gain position control to overcome unwanted effects of the gearbox, while still maintaining stability of the system and achieving greater force accuracy [sea].

III-B Force Control of Series Elastic Actuators

We present a simple version of force control with series elastic actuators. In force control, the user supplies a desired force F_d, which can be turned into a desired relative position using the sensor stiffness k. Then, a high-gain position controller can be used to achieve this relative position. The detailed procedure is described in Algorithm 1. In practice, frequency-domain analysis can be done to carefully choose gains that stabilize the closed-loop system.


1 Given: desired force F_d, sensor stiffness k;
2 Convert the desired force to a desired spring deformation δ_d = F_d / k;
3 Convert the desired spring deformation to a desired motor position;
4 Use position control to regulate the motor to the desired position.
Algorithm 1 Force Control with SEAs
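As a concrete illustration, one step of the loop above can be sketched in a few lines of Python (a minimal sketch; the function name, variable names, and stiffness value are our assumptions, not from the paper):

```python
def sea_force_control_step(f_desired, k_sensor, x_load):
    """One step of Algorithm 1: turn a force command into a position command.

    f_desired : desired force on the load (N)
    k_sensor  : stiffness of the series spring / sensor (N/m)
    x_load    : current position of the load side of the spring (m)
    """
    delta_desired = f_desired / k_sensor   # step 2: desired spring deformation
    return x_load + delta_desired          # step 3: desired motor position

# If the high-gain position loop tracks the command, the resulting spring
# force k * (x_motor - x_load) matches the desired force at steady state.
k = 200.0                                   # N/m, assumed spring stiffness
x_cmd = sea_force_control_step(5.0, k, x_load=0.02)
f_spring = k * (x_cmd - 0.02)               # realized spring force
```

At steady state the realized spring force equals the commanded 5 N, which is exactly the sense in which SEAs turn force control into position control.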

III-C Multi-DOF SEAs for Tool-Use

How can we utilize the benefits of SEAs in multiple degrees of freedom? One straightforward answer might be to connect SEAs serially at the joint level [albuschaffer2]. However, achieving accurate end-effector position and force tracking using joint-level SEAs requires fast and accurate joint-level torque sensing, which is not available on many position-controlled robots. Instead, we offer an alternative generalization of SEAs that concentrates the 6D elasticity into the end effector, while allowing the robot to remain stiff. Our generalization involves the following three components:

  1. A 6D deformable element capable of being stiff in multiple directions simultaneously.

  2. A mechanism to sense the spatial deformation of the above element.

  3. A manipulator capable of controlling spatial pose of the deformable element.

IV SEED: Series Elastic End Effectors in 6D

In this section, we present Series Elastic End Effectors in 6D (SEED), a spatial generalization of 1D SEAs that satisfies the three requirements in Sec.III-C by using a soft deformable membrane, visuotactile sensing to sense the spatial deformation of the membrane, and a position-controlled manipulator to control the pose of the membrane base.

IV-A Defining 6D Series Elasticity

One of the challenges of generalizing the 1D SEA using a spatially compliant element comes from defining an appropriate notion of spatial stiffness [cartesianmatrix], especially for large rotations (rotations up to 30 degrees are common in our experiments). Rotational stiffness has been traditionally defined on the roll-pitch-yaw and axis-angle parameterization of rotations [spatialimpedanceaxisangle, natale], which can be made to work for large rotations.

In this work we have chosen the bushing model, which was initially proposed as a coordinate-free parameterization of a bushing element in Drake [drake]. The bushing model also works for large rotations, and can be interpreted more intuitively due to its correspondence to a spring-loaded gimbal (Fig. 1). Based on the bushing model, we will develop a generalized stiffness map that relates the relative pose between two frames to a spatial force.

IV-B Frame Definition

We give the definition of the frames here in order to better ground our notion of 6D series elasticity in the setting of a soft, tactile hand grabbing a tool. At the moment of grasp between the soft hand and the tool, two frames are initialized: the compliance frame C, which is rigidly attached to the gripper at a pre-defined nominal location (e.g. the center of the gripper), and the tool frame T, which is rigidly attached to the tool and initialized to be coincident with C (i.e. with identity relative transform).

IV-C The Generalized Stiffness Map

Given the definition of these frames, our goal is to characterize the relation between the relative pose of T with respect to C (denoted ^C X^T ∈ SE(3)) and the spatial force (written in frame C) applied on T, which we denote as F. We abstractly denote this as a generalized stiffness map f such that Eq. 1 holds:

F = f(^C X^T).     (1)

We expect f to be a generalized notion of stiffness with smoothness and monotonicity properties under the following assumption of no slip.

Assumption 1.

No slip occurs between the contact patch of the gripper and the tool, such that f smoothly maps the relative transform to a spatial force.

We now concretely describe the bushing model f. We denote r as the roll-pitch-yaw parametrization (which lives on a gimbal) of the rotation of ^C X^T, and p as the position component of ^C X^T. Similarly, the spatial force F is divided into torques τ and forces f_p. Then, the bushing model gives the spatial force for a pose via

τ = N(r) K_r r,     f_p = K_p p,

where K_r is the gimbal stiffness matrix, and K_p is the standard translational stiffness matrix. N(r) is the coordinate transformation matrix necessary to convert gimbal torques to spatial torques. The necessity of the matrix N becomes apparent by visualizing K_r r as torques that are being exerted on each axis of the gimbal, while τ is defined spatially in frame C. We obtain the matrix N by equating the power in the spatial representation, τ · ω, to the power in the gimbal representation, (K_r r) · ṙ, and using the standard conversion between angular velocities and gimbal rates. Throughout our work, we make the following assumption on the structure of K_r and K_p.

Assumption 2.

The gimbal stiffness matrix K_r and the translational stiffness matrix K_p are positive definite diagonal matrices.

Under Assumption 2, we present the following theorem, which gives a more rigorous notion of the smoothness mentioned in Assumption 1.

Theorem 1.

The bushing model stiffness map f is a diffeomorphism under Assumption 2 everywhere for pitch ∈ (−π/2, π/2).


Proof. Since there is no coupling between the orientation and translational maps, it suffices to show separately that each is a diffeomorphism. The translational map f_p = K_p p is trivially a diffeomorphism under Assumption 2. We use the Inverse Function Theorem to prove the inverse differentiability of the orientation map. The determinant of the Jacobian for the orientation map can be written in closed form in terms of the diagonal elements of K_r and the gimbal angles, and is well defined and nonzero everywhere for pitch ∈ (−π/2, π/2). We complete the proof by noting that the orientation map is bijective, which we show by providing a well-defined nonlinear inverse from torques to gimbal angles. Note that the inverse is written in semi-implicit form to save space: one can easily make it explicit by substituting values starting from the bottom row. ∎

Theorem 1 tells us that our model f has the desirable property of smoothly mapping back and forth between relative pose deformation and spatial force, which we can effectively use in order to do force and impedance control in a manner akin to SEAs. We also note that in hardware, we expect the deformation to stay well within this range before Assumption 1 is broken and slip occurs.
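To make the bushing stiffness map concrete, here is a small Python sketch (an illustration under stated assumptions: we take the roll-pitch-yaw convention R = Rz(yaw) Ry(pitch) Rx(roll), so the exact form of the conversion matrix may differ from the paper's; the function names are ours):

```python
import numpy as np

def rpy_rates_to_angvel(rpy):
    """Matrix E with omega = E(rpy) @ rpy_dot, for R = Rz(y) Ry(p) Rx(r).
    Singular at pitch = +/- pi/2, matching the domain in Theorem 1."""
    _, p, y = rpy
    return np.array([
        [np.cos(y) * np.cos(p), -np.sin(y), 0.0],
        [np.sin(y) * np.cos(p),  np.cos(y), 0.0],
        [-np.sin(p),             0.0,       1.0],
    ])

def bushing_spatial_force(rpy, pos, K_gimbal, K_trans):
    """Bushing stiffness map: (relative pose) -> (spatial torque, force).
    Gimbal torques K_gimbal @ rpy are mapped to spatial torques by power
    equivalence: tau . omega = tau_gimbal . rpy_dot  =>  tau = E^{-T} tau_gimbal.
    """
    E = rpy_rates_to_angvel(rpy)
    tau = np.linalg.solve(E.T, K_gimbal @ rpy)  # spatial torque
    force = K_trans @ pos                        # translational force
    return tau, force
```

Near the identity pose E is the identity, so spatial torques reduce to the gimbal torques K_gimbal @ rpy, which matches the intuition of a spring-loaded gimbal.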

V Force Control with SEED

Fig. 2: Left: System identification setup with the Soft-Bubble gripper [bubblegrippers] and a 6-axis force/torque sensor. Center: Identified stiffness values for different axes, per change in pressure. Right top: Change in pressure as a function of the gripper distance command. Right bottom: Frame definition.

Now we present our main algorithms for control with SEED, which follow the general philosophy of controllers using SEAs: a force control problem is turned into a position control problem [sea]. Thus, we assume access to a manipulator that can achieve reliable position commands with high gains and rates (akin to how SEAs use high gains to quickly overcome gearbox effects and achieve accurate positions), which describes most position-controlled manipulators with high mechanical repeatability.

V-A Problem Setup: Feedback and Action

To set up the control problem, we note that the position-controlled manipulator can command the end-effector pose at high rates using direct inverse kinematics or integration of differential inverse kinematics. Our feedback signal comes from the estimate of the relative pose ^C X^T, which is measured by the visuotactile method given in Sec. VII. The goal is then to find a policy that achieves some desired specification of the user.

Throughout the section, we will assume we have estimates of the gimbal and translational stiffness parameters that define the generalized stiffness map, and denote the estimated map accordingly.

V-B 6D Force Control

In force control, the user specifies some desired spatial force F_d, described in the world frame. SEED achieves this specified spatial force by converting it into a desired relative transform with the estimated generalized stiffness map. Then, a position command is sent to the manipulator to achieve this relative pose. We describe the detailed process in Algorithm 2.

1 Given: desired wrench F_d expressed in the world frame;
2 Given: estimated SEED stiffness map;
3 while the task is running do
4       Convert F_d to frame C using the adjoint transform with the current pose;
5       Using the estimated SEED stiffness map, convert the desired wrench into a desired relative pose by inverting the map;
6       Convert the desired relative pose into a desired end-effector pose;
7       Send the position command to the position controller.
Algorithm 2 Force Control with SEED

The expression for the orientation part of the inverse has been given in Eq. 5, while inverting the position part simply requires multiplying by the inverse of the translational stiffness matrix. We note that, like most force control strategies, the controller is not well behaved when there is no contact with an external environment. In particular, while position-only force control can move until contact and then maintain some desired force, the orientation-torque controller must keep rotating until contact, which quickly runs into workspace limitations of the manipulator and makes the controller impractical to use in free space.
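For intuition, the translational part of one iteration of Algorithm 2 can be sketched as follows (a sketch only: the names are ours, the rotational part is omitted, and we assume the force on the tool equals the stiffness times the relative displacement):

```python
import numpy as np

def seed_force_control_step(f_world, R_WC, K_trans, p_WT):
    """Translation-only iteration of SEED force control.

    f_world : desired force on the tool, expressed in the world frame
    R_WC    : rotation of the compliance frame C in the world frame
    K_trans : diagonal translational stiffness of the membrane (N/m)
    p_WT    : current world position of the tool frame T
    Returns the position command for the gripper (frame C).
    """
    f_C = R_WC.T @ f_world                  # express the wrench in frame C
    p_CT_d = np.linalg.solve(K_trans, f_C)  # invert the stiffness map
    # Command the gripper so that, with the tool held in place by contact,
    # the relative pose C -> T equals the desired deformation.
    return p_WT - R_WC @ p_CT_d
```

For example, to press down with 10 N through a membrane of 100 N/m, the gripper is commanded 10 cm past the tool's current height, and the membrane deformation supplies the force.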

V-C 6D Hybrid Force/Pose Control

In many tasks involving tools, the goal is to simultaneously control force and torque in certain directions, while controlling position and orientation in other directions. We naturally extend spatial force control with SEED to this setting by defining a partial inverse of the impedance map that attempts to construct the spatial deformation from a subset of specified forces.

V-C1 Hybrid Force/Position Control

Given a task-relevant decomposition matrix S, which selects a subspace for the desired position p_d, with the orthogonal complement assigned to the desired force f_d, we can compute the position command that achieves the specified positions and forces:

p* = S^T S p_d + S_perp^T S_perp K_p^{-1} f_d,

where S_perp is the matrix that represents the orthogonal complement of S.
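In the special case where the decomposition is axis-aligned and the stiffness is diagonal, the partial inverse reduces to a per-axis selection, which we can sketch as follows (hypothetical names; the paper's decomposition matrix is more general than a boolean mask):

```python
import numpy as np

def hybrid_force_position(force_axes, f_desired, p_desired, K_trans):
    """Axis-aligned hybrid force/position command.

    force_axes : boolean mask, True where the axis is force-controlled
    f_desired  : per-axis desired forces (used where force_axes is True)
    p_desired  : per-axis desired positions (used elsewhere)
    K_trans    : diagonal translational stiffness matrix
    """
    mask = np.asarray(force_axes, dtype=bool)
    k = np.diag(K_trans)                    # per-axis stiffness values
    # Force-controlled axes get the deformation f / k; others get positions.
    return np.where(mask, f_desired / k, p_desired)
```

This is the structure used later for the squeegee and pen tasks: one or two axes carry a force command while the rest carry a trajectory.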

V-C2 Hybrid Torque/Orientation Control

Unlike force / position control, coordinate transform in rotational space does not happen in a linear manner. Thus, defining hybrid torque/orientation control for an arbitrary task-relevant coordinate representation is significantly more difficult. To deal with this problem, we make the following assumption:

Assumption 3.

The decomposition of specified orientations and torques happens in the frame of C.

Such an assumption is not too restrictive for a large class of tools, as most tools require decomposition of torques and angles in a manner consistent with their natural task-relevant coordinate frames. Under this assumption, we can define partial maps from a subset of desired torques to the full orientation as follows:

  1. Two torques, one angle: given the stiffness map, we solve for the gimbal angles that achieve the two desired torques and the one desired angle.

  2. One torque, two angles: given the stiffness map, we solve for the gimbal angles that achieve the one desired torque and the two desired angles.


After recovering the full pose from a subset of desired forces and torques, we use position control to command this pose, as done in Alg. 2.

VI System Identification

In order to apply our framework, we need to identify the parameters of the stiffness map, namely the gimbal and translational stiffness matrices. The stiffness parameters can be identified by measuring the static sensitivity of the wrench with respect to the pose. We achieve this by having a dexterous manipulator grab a 6-axis force/torque sensor and perturbing the pose to observe the responses in wrench.
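For a diagonal stiffness model, the static sensitivity described above amounts to a per-axis linear fit. A minimal sketch of that identification step (assuming paired samples of commanded deformation and measured wrench along one axis; the function name is ours):

```python
import numpy as np

def identify_axis_stiffness(deformations, wrenches):
    """Least-squares stiffness k for one axis, fitting w ~ k * d.

    deformations : 1D array of commanded pose perturbations along the axis
    wrenches     : 1D array of measured wrench responses along the axis
    """
    d = np.asarray(deformations, dtype=float)
    w = np.asarray(wrenches, dtype=float)
    # Zero-intercept least squares: k = <d, w> / <d, d>.
    return float(d @ w / (d @ d))
```

Repeating the fit at several internal pressures would produce stiffness-versus-pressure curves of the kind reported in Fig. 2.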

In addition, to see if squeezing or pressurizing the gripper changes the stiffness parameters of the hand, we use the pressure sensor on board the soft-bubble hand [bubblegrippers] to characterize how the gripper distance affects the pressure, and in turn, how the pressure affects the identified stiffness values. The results of our experiments are presented in Fig.2.

Along with the quantitative values of stiffness, we summarize our findings from the identification process:

  1. The dependence of the internal pressure of the hand on the gripper distance is linear.

  2. For some torque directions and all of the forces, higher pressure corresponds near-linearly to higher stiffness. The identified stiffness values also have low standard deviation.

  3. For the remaining torque directions, the measurement is relatively unreliable and the identified stiffness values are subject to large standard deviations. In addition, higher pressure does not seem to lead to higher stiffness values along these directions.

The results of system identification, combined with the monotonicity of the stiffness map, lead to a very natural interpretation: if stiffer behavior is desired while controlling the tool, the hardware gives us the means to control the stiffness by grabbing the tool more firmly or by pressurizing it further.

VII Tactile Relative Pose Estimation

In principle, our control framework can work well with any tactile end effector that is compliant enough, together with an estimation algorithm that produces a well-behaved estimate of the relative pose ^C X^T. In our work, we show an example of such a relative pose estimator by utilizing the PicoFlexx IR-depth camera mounted within the bubble grippers [bubblegrippers].

VII-A Contact Patch Estimation

We estimate the position of the contact patch using a simple background subtraction algorithm. Denote D_t as the depth image at time t. We compare D_t to the initial depth image D_0, taken when the bubble is not in contact. After performing a thresholding operation on the difference, we perform a morphological transformation using an elliptical kernel to obtain a binary mask M_t. We then use the calibration matrix to transform the masked depth image M_t ∘ D_t (where ∘ denotes element-wise multiplication) into a set of points expressed in the left camera frame. Finally, we take the mean of these points to obtain the 3D coordinate of the left contact patch, and repeat this process for the right camera.
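The background-subtraction step can be sketched with NumPy alone (a sketch: the threshold value is an assumption, and the morphological cleanup of the mask is omitted here for brevity):

```python
import numpy as np

def contact_patch_centroid(depth_ref, depth_now, K_inv, thresh=0.002):
    """Contact patch centroid via background subtraction.

    depth_ref : depth image of the bubble with no contact (m)
    depth_now : current depth image (m)
    K_inv     : 3x3 inverse camera intrinsic matrix
    """
    mask = (depth_ref - depth_now) > thresh    # membrane deformed toward camera
    v, u = np.nonzero(mask)
    if u.size == 0:
        return None                             # no contact detected
    z = depth_now[v, u]
    pix = np.stack([u * z, v * z, z])           # (u*z, v*z, z) per masked pixel
    pts = K_inv @ pix                           # back-project into camera frame
    return pts.mean(axis=1)                     # mean 3D point of the patch
```

Running this on both cameras yields the left and right patch centroids used in the next subsection.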

VII-B Frame Estimation from Contact Patches

Given the locations of the contact patches on the left and right bubbles, p_L and p_R, expressed in the gripper frame, we average the positions of the two patches to obtain the position of the contact frame: p = (p_L + p_R) / 2.
To compute the rotation, we introduce an intermediate frame whose first axis is aligned with the line connecting the two contact patches, and whose pitch component is zero. Denoting the columns of the rotation matrix as the unit vectors that define the frame axes, we compute the columns using the following process:

  1. Set the first axis to be the normalization of the vector between the two patch locations.

  2. Choose the second axis so as to define the zero-pitch frame.

  3. Compute the third axis as the cross product of the first two, and normalize it.

  4. Recompute the second axis as the cross product of the remaining two, so that the frame is orthonormal.
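One way to realize the construction above is a Gram-Schmidt-style procedure (our reconstruction of the steps, since the exact axis assignment is not fully specified in the extracted text; the construction degenerates when the patch-to-patch line is vertical):

```python
import numpy as np

def frame_from_patches(p_left, p_right):
    """Intermediate contact frame from the two patch centroids.

    Origin: midpoint of the patches.
    x-axis: along the patch-to-patch line.
    z-axis: world z projected orthogonal to x (the zero-pitch choice).
    """
    origin = 0.5 * (p_left + p_right)
    x = p_right - p_left
    x = x / np.linalg.norm(x)
    z = np.array([0.0, 0.0, 1.0]) - x[2] * x   # project world z off the x-axis
    z = z / np.linalg.norm(z)                   # fails if x is vertical
    y = np.cross(z, x)                          # complete a right-handed frame
    R = np.column_stack([x, y, z])              # columns are the frame axes
    return origin, R
```

The returned rotation is orthonormal with determinant one by construction, so it can be composed directly with the gripper pose.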

Fig. 3: Frame Definition and example images for Relative Pose Estimation.

VII-C Pitch Estimation with Optical Flow

After computing the intermediate frame, we recover the pitch by computing the rotation about its axis. We estimate this quantity by computing the optical flow of the IR image. We denote u_t as the Eulerian flow of the image at time t relative to the initial image. Then, we compute the mean curl of u_t,

pitch ≈ c · mean( ∂u_t^(y)/∂x − ∂u_t^(x)/∂y ),

where the superscript denotes the component of the vector field, and c is a normalization constant we calibrate for. The gradients are computed using a Sobel filter with the corresponding kernels.
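The curl-based rotation estimate can be sketched as follows (using central-difference gradients in place of the Sobel kernels to keep the sketch dependency-free; the constant c stands in for the calibration factor):

```python
import numpy as np

def flow_curl_angle(flow_u, flow_v, c=1.0):
    """Rotation estimate from the mean curl of a 2D optical-flow field.

    flow_u, flow_v : HxW arrays of the x and y flow components
    c              : calibration constant (assumed)
    """
    dv_dx = np.gradient(flow_v, axis=1)   # d(flow_y)/dx
    du_dy = np.gradient(flow_u, axis=0)   # d(flow_x)/dy
    curl = dv_dx - du_dy                  # z-component of curl(flow)
    return c * curl.mean()
```

For a rigid rotation of the image by a small angle theta, the mean curl equals 2*theta, which is exactly the scale that the calibration constant c absorbs.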

VII-D Validation Results

In order to validate the performance of the proposed relative pose estimator, we use the same setup that was used for system identification (Fig. 2). Through the forward kinematics of the manipulator, and the fact that the grasp transform is fixed in the system identification setup, we compare the measured values of the relative pose with the results of the relative pose estimator. Our results, illustrated in Fig. 4, show that the tracking performance depends on which axis is being tracked:

Fig. 4: Tracking performance of the Relative Pose Estimation
  1. The position component along the camera depth axis, which uses depth information from each camera, can be tracked reliably.

  2. On the other hand, the remaining position components are not tracked very reliably, due to the large contact patch caused by the cylindrical geometry of the tool.

  3. The locations of the contact patches on both sides give a very good estimate of the roll angle. Optical flow is also successful in tracking pitch.

  4. While yaw shows reasonable behavior, the estimate tends to underestimate the true yaw angle, as the contact patch lags behind the true rotation due to the softness of the membrane (i.e. perfect rolling does not occur).

VIII Experiment Methods & Results

VIII-A Simulation Methods & Results

To verify the performance of our proposed pipeline, we first set up a simulation in Drake [drake], where the compliance between the tool frame and the compliance frame is simulated using Drake's 6D compliance element LinearBushingRollPitchYaw. By assuming a perfect measurement of the relative pose, we aim to decouple the validity of the proposed controller from the accuracy of the tactile relative pose estimator.

VIII-A1 The Squeegee Task

The squeegee is a tool that requires regulation of spatial forces along some principal axes, while requiring regulation of position along others. We illustrate the frame definition in Fig. 5, and decompose the spatial forces and positions in the following directions in order to set a task specification for the hybrid force-position controller:

  1. The positions in the table plane are used for position control, in order to specify the trajectory of the tool from a table-top view.

  2. The downward force components are used to enforce the magnitude of pressure between the blade and the table.

  3. The torque about the blade axis is used to enforce equal pressure distribution.

As a baseline, we include an open-loop trajectory that is tuned such that the squeegee barely contacts the table, within the mechanical repeatability of the manipulator (0.1 mm). In addition, we modify the controller for the case where the tool is rigidly fixed (welded) to the end effector, in order to simulate the performance of a custom tool changer. The resulting contact forces are inspected based on how much force is exerted and how well the pressure distribution on the blade is balanced. The resulting trajectory is shown in Fig. 6.

We show that, compared to the case where the tool is rigidly attached to the end effector, the compliant hardware allows much better torque tracking, such that equal pressure is applied on both sides of the squeegee. We mainly attribute this behavior to the built-in compliance, as the pressure distribution behaves well even in the open-loop setup. By commanding the desired force in closed loop, however, the 6D hybrid force-position controller adds the ability to exert the desired amount of force. Finally, we note that there exists an offset in the tracking error due to the unaccounted weight of the tool.

Fig. 5: Left: Drake Simulation environment. Right: Frame and contact point definitions for the squeegee tool.
Fig. 6: Controller results of force and torque tracking.

VIII-B Hardware Methods & Results

Though we have verified the behavior of the controller in simulation assuming perfect pose tracking, demonstrating the controller on hardware requires coupling the pose estimator and the controller in all six axes. However, the estimator is unreliable in certain directions, such as yaw or some position components, which can destabilize the closed-loop behavior.

In order to overcome these limitations of the estimator, we propose a simple yet effective strategy: we purposely align the axes that require force tracking with the axes in which our estimator performs well. As most tasks require at most two or three components of force tracking, we show that it is possible to estimate only a subset of the relative pose well and still achieve the underlying task.

VIII-B1 Pen Writing Task

We first test the controller on a pen-writing task, where the robot is commanded to write characters in the plane of the paper while a force is commanded in the downward direction. Our setup is illustrated in Fig. 7.D. As the result in Fig. 7.B demonstrates, our controller achieves good tracking performance of the specified force, as observed by the differences in marker stroke width and darkness.

We also test our controller by writing letters in Fig. 7.C. While we are successful in tracking the characters, the inherent softness of the hardware sacrifices the bandwidth of the position controller, and frictional interactions between the marker and the paper (e.g. those caused by the Painlevé effect) can compromise the tracking performance of SEED.

VIII-B2 Squeegee Task

We test the proposed controller on the real-life task of using a squeegee to clean liquid off of a cutting board. The results of our hardware experiment are shown in Fig. 7.A. While the open-loop baseline fails to exert much force on the board, the closed-loop controller is successful in pressing down firmly and clearing all of the liquid.

Fig. 7: A. Deploying our controller on hardware for using the squeegee (top row), against the open-loop baseline (bottom row). B. Comparison of open-loop stroke against closed-loop strokes with different specified downward forces. C. Performance of character tracking. D. Pen writing setup.

IX Conclusion

We have presented SEED, a control and hardware framework that combines the benefits of hardware compliance with visuotactile sensing. Throughout our work, we have demonstrated that we can measure the relative pose of a tool with respect to the gripper using visuotactile sensing. Combined with offline-identified parameters of our spatial stiffness model, we have shown that we can achieve closed-loop spatial force control that can be useful for tool-use. By our demonstration, we aim to alleviate some of the difficulties that rigid contacts and the associated non-smooth behavior bring in the setting of grasping and using tools in the wild.