Tensegrity robots are composed of rigid elements (rods) and flexible elements (cables), which are connected together to form a lightweight deformable structure. Their adaptive and safe features motivate many applications, such as manipulation , locomotion , morphing airfoil  and spacecraft lander design 
. While useful and versatile, they are difficult to accurately model and control because of their high number of degrees of freedom and complex dynamics. Identifying system parameters is necessary, either to learn controllers in simulation, as real-world experiments are time-consuming, expensive and perhaps dangerous, or for traditional model-based control.
The movement of tensegrity robots is made possible by collision responses with objects that are in contact, which is facilitated by normal reaction and friction forces. Different object materials result in normal reaction forces ranging from perfectly elastic to inelastic, and the imperfect contact surface can lead to static or dynamic friction forces as well. Since these forces are generally nonlinear and difficult to model, researchers have proposed various first principles [19, 25, 10, 29, 4] to describe them. These equations have different forms and parameters. In the end, all of these equations are only approximations of reality, and system identification processes often are limited into considering linear combinations of these equations and their parameters.
physics engines, often based on neural networks, allow for automatic parameter inference from data using backpropagation. They are gaining attention as a data-driven approach for system identification. Compared to traditional methods, such as L-BFGS-B  and CMA-ES , they often converge faster and are more accurate. Nevertheless, training a differentiable physics engine requires significant amounts of data. This has prompted hybrid designs, where differentiable engines are combined with analytical models to reduce data requirements [42, 15].
A differentiable physics engine to model cable forces in tensegrity robots was recently proposed . While it has high accuracy and low data requirements, it did not include contact or actuation modeling, preventing applicability to locomotion. This paper augments this method with a new linear contact model, which can predict complex collision responses. It also includes a new linear actuator model for motors, which apply forces to expand or contract the flexible cables. These models are embedded into the engine, which leads to an end-to-end differentiable solution for modeling tensegrities and collision responses. A progressive training pipeline is proposed, which helps avoid local optima and reduces data requirements. To the best of the authors’ knowledge, this is the first differentiable physics engine for such complex robots with all of these features included.
The differentiable engine is evaluated using data sampled from the MuJoCo physics engine , a popular closed-source simulator, which supports rigid body dynamics based on linear complementarity principles (LCP)  and includes a model for NASA’s icosahedron SUPERballBot . The constraint-based solver inside MuJoCo tries to resolve contact constraints at each time step. MuJoCo does not exactly follow the same first principles contact modeling as that of the developed differential physics engine. This feature makes MuJoCo a good candidate as a ground-truth, simulated model to mimic trajectories that would be obtained from a real platform.
The evaluation shows that the output of the differentiable engine is sufficiently close to the ground truth MuJoCo engine. The criterion for what constitutes close enough comes from the use of the differentiable engine as a simulation platform for learning locomotion policies. In particular, after the engine is trained with few trajectories to match the MuJoCo’s engine model, a locomotion policy learned on the differentiable engine is successfully transferred back to MuJoCo. Training the controller requires orders of magnitude more data than training the differential engine and is preferable to be executed in simulation.
Ii Related Work
The traditional approach for system identification involves building a dynamics model by minimizing the prediction error [38, 14]. This is achieved by iterating between parameter refinement and data sampling. Data-driven techniques can avoid this iterative process by directly fitting a physical model to data [30, 31, 2]. Nevertheless, these techniques are data hungry because they treat the dynamics as a black box, and require retraining in a new environment.
A black box view of the environment can be avoided by modularizing objects and their interactions in an interaction network . A more general hierarchical relation network has been proposed  for graph-based object representations of rigid and soft bodies by decomposing them into particles. A multi-step propagation network was recently proposed . A Hamiltonian network achieves better conservation of an energy-like quantity without damping . These methods still follow a black-box approach regarding the interactions between different physical elements. This is often a data hungry direction and does not utilize well-understood governing equations.
Differentiable physics engines have received a surge in interest for many applications, including for the prediction of forward dynamics of articulated rigid bodies , the solution of inverse dynamics problems for soft materials using the Material Point Method (MPM) , linear complementarity problems (LCP) for multi-body contacts , and nonlinear optimization with the augmented Lagrangian method . A compiler for programming differentiable engines has also been introduced . Differentiable engines have also been used with traditional physics simulators for control . Differentiable physics engines that are specific to particular objects have been recently investigated, such as molecules , fluids , and cloth .
Prior work by the authors on tensegrity locomotion [36, 24, 37] has achieved complex behaviors even on uneven terrain using the NTRT simulator , which has been manually tuned to match a real platform [26, 6]. While the locomotion policies have been demonstrated on a real system after trained in NTRT, the sim2real transfer of locomotion policies was quite challenging. This line of work motivated the authors to mitigate the reality gap for cable driven robots. The current work is the first step in this direction by showing sim2sim transfer, where the different and unknown physical model of MuJoCo acts as the ground truth.
Iii Problem Formulation
The objective is to design a predictive dynamics model for tensegrity robots using the governing equations of motion as first principles. The model must be able to return the next state using knowledge of the current state and control signals . The focus is on modeling the entire state space of cable-driven tensegrity robots, which includes rigid elements, flexible cables, complex contacts with the environment, and associated non-linearities and oscillations.
Since analytical solutions that express the dynamics of such systems either employ approximations that do not hold in the real world, or require knowledge of parameters (such as friction, restitution coefficient, mass, etc.) that are unknown, this work adopts a (model-based) machine learning perspective. The available data are pairs of the form, which are collected by observing and decomposing trajectories sampled from a target, ground-truth platform, which in this paper come from MuJoCo . The current work focuses on addressing the high-dimensionality and complex dynamics of tensegrity robots. It assumes that the robot’s state is directly observable as a stepping stone towards addressing the general case where sensing data are used to indirectly and imperfectly observe the system’s state.
The metric of success for the above objective is whether the differentiable physics engine allows the learning of locomotion policies using an off-the-shelf controller by using fewer ground-truth trajectories than would be otherwise required (i.e., without the differentiable physics engine). In particular, consider the following scenario:
An RL-based controller requires amount of ground-truth data of the form to solve a task with the target platform, with a performance score .
The amount of ground-truth data required to train the differentiable physics engine is .
The controller can also be used to learn locomotion policies using the differentiable physics engine and fewer ground-truth data. It achieves a performance score on the target platform in this case.
Then, the metric of success for this work is that and . This means that the controller requires fewer ground-truth data to be trained in conjunction with the differentiable physics engine, in contrast to when it is trained directly on the ground-truth system. Moreover, it should achieve at least the same level of performance.
The experiments accompanying this paper use the SUPERballBot platform , as simulated in the MuJoCo engine as the target, ground-truth platform. Its state is 72-dimensional since there are 6 rigid rods for which the 3D position , linear velocity , quaternion and angular velocity are tracked over time. The space of control signals is 24-dimensional, since there are 24 cables and it is possible to set the target length for each of them individually. For the learned controller, this work employs iterative LQG (iLQG) , used before to learn locomotion policies for this tensegrity [37, 36], where the objective was to achieve a desired speed for the platform’s center of mass.
Iv-a Differentiable Physics Engine
provides the architecture of the proposed differentiable physics engine. The loss function is the L2-norm of the difference between the predicted next stateand the ground-truth state at time . The internal parameters of the physics engine are optimized using backpropagation with gradient descent. The engine includes multiple modules, which fall into two categories: a) the contact simulation modules, and b) the tensegrity robot simulation modules.
The contact simulation modules include: 1) a collision detector, 2) a dynamic interaction graph, and 3) a collision response generator. The collision detector rebuilds the world-space configuration of the system by using the current state and an off-the-shelf collision detection algorithm, which outputs contact information, including the colliding object pairs, collision point and intersection depth. This information is used for computing a dynamic interaction graph, which describes the contact relations between the different objects. Figure 3 illustrates a simple environment with two rods, one cable and the ground. The bidirectional connectors reflect relations between objects. In particular, the dashed connectors denote the contact relations, which are dynamic during the simulation. The collision response generator predicts collision forces and torques for each colliding pair of objects.
Following prior work , the robot simulation modules include 1) a static topology graph, 2) a cable actuator, 3) a cable force generator, 4) a rod acceleration generator, and 5) an integrator. The static topology graph shows how cables and rods are connected together, and where the connection anchors are. The solid connections in Figure 3 denote the static topology of the robot’s components. The cable actuator controls the cable’s rest length with the control signal . The cable force generator is an analytic model based on Hooke’s law. The rod acceleration generator takes forces and torques as inputs and maps them to the rod’s linear and angular accelerations using the relation graph. Finally, the integrator applies the semi-implicit Euler method  to compute .
Iv-B Collision Response Generator
The collision response generator produces contact forces based on contact states computed using the dynamic interaction graph. Each contact state includes the contact point, contact depth , contact normal , and the indices of the two contacting objects, which have relative linear velocity at the contact point. With , and , the module computes the contact forces including the reaction force, sliding friction force, torsional friction force, and rolling friction force.
Iv-B1 Reaction Force
The reaction force is derived using an impulse-based contact model . The reaction force formula is:
where is the bias factor, is restitution coefficient, is the simulation time step, are the masses of the objects,
are their inertia tensors in world frame, andare the torque arms of the two objects at the collision point.
Inspired from Baumgarte stabilization , the bias factor accounts for the percentage of error, i.e. , that is corrected at each time step with the collision response.
The coefficient of restitution relates the pre-collision relative velocity at the contact point to the post-collision relative velocity along the contact normal using the relation . Prior methods have used for modeling hard contacts, showing that is in the opposite direction of, and less than, . Nevertheless, the choice made here is to set , accounting for the case when is only less than, but in the same direction as, . This allows to model soft contacts.
The only parameters that require system identification are and . Since both of them are within the interval , they can be conveniently identified during the training process using the same learning rate that is used by all other parameters for updating the gradients. Although the additional terms , do arise from physical first principles, they also benefit the optimization during training. Since the magnitudes of and are different, help to normalize them to the same interval.
The linear model can also be viewed as an improved version of the Kelvin-Voigt linear model :
where is the stiffness, is the damping coefficient. A concern with the Kelvin-Voigt model is that the transition between contact and non-contact conditions is discontinuous due to the use of a linear dashpot. The damping parameter may also result in a non-physical negative normal force during separation . Parameters and also have different magnitudes, and a unique learning rate may find it difficult to update their gradients. This model doesn’t consider the effects of collision torques either. Our linear model, however, changes this situation, since all parameters can use the same learning rate, and torques are considered as well.
Iv-B2 Sliding Friction Force
Sliding friction opposes relative sliding between the two objects. The sliding friction is derived as from the impulse-based contact model :
where is the sliding friction direction, which is opposite to the tangential projection of the relative velocity onto the contact plane, and is the damping coefficient.
The coefficient of damping relates the pre-collision relative velocity of the contact point to the post-collision relative velocity along the tangential sliding friction direction as follows . The parameter is set to be in . Thus, can only be in the same direction, but less than .
The sliding friction is governed by Coulomb friction, i.e.,
where is the coefficient of friction and is the norm of the reaction force. The only parameters that require system identification are and .
Iv-B3 Torsional and Rolling Friction
The torsional and rolling friction forces oppose the spinning and rolling of the object. Instead of LCP, a simplified method is used. Since the only object that might spin or roll is the rod (the ground has infinite inertia), and the rod can only roll along its principle axis, a frictional torque is applied along its axis to mimic the effects of these two friction forces.
where is the object’s angular velocity, and is the damping coefficient, which relates the pre-collision object angular velocity to the post-collision angular velocity as follows . By setting , then could only be in same direction, but less than . The only parameter that requires system identification is .
In summary, the collision response generator only has three linear models with 5 parameters to identify, .
Iv-C Cable Actuator
The tensegrity robot has actuators at the end of each cable, which are motors with a reel that cast and retract the cables. The control signal is the target position that the actuator must roll to. The actuation forces are computed as follows:
where is the motor speed, is the motor position, is a scalar mapping the motor position to the change in the cable’s rest length, and is the cable’s rest length when the motor position . The only parameters that require system identification are and . The cable’s rest length is updated by the motor position at each time step.
Iv-D Progressive Training
We apply progressive training to avoid local optima. Combining tensegrity and contact simulation, the system dynamics are highly nonlinear and have many parameters to identify. This makes the training loss landscape highly nonconvex and gradient descent often converges to parameter estimates that poorly describe the ground-truth data. Parallel random search could be a possible solution but is computationally expensive. Instead of identifying the whole system in one shot, we divide all system parameters into groups and identify these groups progressively. We also prioritize the identification for groups of parameters that have higher fidelity with our model.
We first identify non-contact parameters such as the actuator speed, actuator position scalar, cable stiffness, cable damping, rod mass, etc., which only describe the robot itself. In simulation, we “hang” the SUPERball in the air, actuate with random controls and sample non-contact trajectories. We use these trajectories to identify non-contact parameters.
Then we trust and fix these non-contact parameters, and identify the remaining contact parameters. The contact model is identified in a progressive way as well. We first train our linear contact model and then fix
it. Then we augment the linear model with a Multi-layer Perceptron (MLP) to learn complex dynamics.
The engine is implemented using Pytorch, which provides built-in support for auto-differentiation. The range of parameters are limited by activation functions, e.g., the sigmoid and hyperbolic tangent functions. The learning module receives the current stateand returns a prediction . The loss function is the mean square error (MSE) between the predicted and the ground-truth state
. The proposed decomposition, first-principles approach, and the cable’s linear nature allow the application of linear regression, which helps with data efficiency. This linear regression step has been implemented as a single layer neural network without activation functions.
The setup of MuJoCo is listed as follows. The rod mass is 10kg. The rod’s length is 1.684m and cable rest length is 0.95m. The cable stiffness is 10000 and the cable damping is 1000. The cable actuator has control range . The activation dynamics type is filter. The gain parameter is 1000. The dimensionality of the contact space is 6, including sliding, torsional and rolling friction. The torsional and rolling coefficient is set to 1. This setup makes the robot motion in MuJoCo closer to its behavior in the real world.
V-a Tensegrity Robot Model Identification
First we identify parameters of the tensegrity robot itself without considering contacts. The parameters to estimate are rod mass , cable stiffness and damping , actuator motor speed and position scalar . We fix one rod of the SUPERball in the air, actuate all 24 cables with random controls, as shown in Figure 1 (left). We only sample one trajectory for training, one for validation and 10 for testing. Each trajectory lasts for 5 seconds, i.e. 5,000 time steps. The simulation time step is 1ms.
We train with an Adam optimizer for 40 epochs, 256 batch size. The initial learning rate is 0.1, and is reduced by half every 10 epochs. Both the training loss and validation loss go down from 0.055 to 3e-11. Figure1 (right) shows the stable training curves.
The identified parameters are . Although they look different compared to ground truth, these parameters produce the exact same motions for the robots, since the accelerations are the same. For example, if a cable’s two ends have length change of and relative velocity of , thus the linear acceleration is . The ground truth linear acceleration is . The linear acceleration from our identified parameters is .
We test our identified parameters on two datasets. One includes 10 trajectories with controls every 1ms and 10 trajectories with controls every 100ms. The controls are randomly sampled from the uniform distribution in range. We include 100ms controls here because 100ms is the control interval in the real world setup. Figure 4 shows that our model performs good on both of them. After 5,000 time steps, we can achieve both position and quaternion error in a magnitude of .
V-B Tensegrity Contact Model Identification
Now we have identified robot parameters without contact and it’s time to decipher the contact parameters. The parameters to estimate are of the collision response generator. We throw SUPERball to a random direction with a random initial speed. The SUPERball will start rolling on the ground and finally stop. We take 1 such trajectory for training, 1 for validating and 10 for testing. Each trajectory is 5 seconds long, i.e. 5,000 time steps. The environment setup and the training loss curve is shown in Figure 5.
We train our engine with an Adam optimizer for 60 epochs, 256 batch size. The initial learning rate is 0.1, and is reduced by half every 20 epochs. Both the training loss and validation loss go down to 0.13. Compared to the non-contact training process, the minimum training error here, 0.13, is larger than the maximum training error there, 0.055. This shows the necessity of the progressive identification pipeline. Figure 5 shows that the simplified linear contact model converges fast and stable. It only takes 1 trajectory for training, which is favorable for a real world setup.
We test the identified parameters on a dataset with 10 rolling trajectories, which also start with a random direction and a random initial speed. The evaluation is from the trajectory difference of CoM (center of mass). Figure 6 shows that our model has error approx. 0.63m after 4,000 time steps. Considering the average trajectory length is 2.35m, the relative error is in the order of 26.8% its length.
In the real world, as we keep running a robot, we would have an increasing amount of data. Then we increase the training dataset to 5 trajectories. But Figure 6 tells us the testing error doesn’t go down. It is because our contact model only has 5 parameters, however the real world is more complex. This reminds us to introduce the neural network to learn more complex dynamics.
V-B1 Comparison to a MLP
MLP (Multi-layer Perceptron) is a popular neural network model to learn functions. Replacing our contact response generator, the MLP receives contact states and predicts contact forces and torques. The input of MLP is a 29 dimension vector, based on contact principles, including( is the time step during collision). To use less training data, all input elements are transformed into the contact frame. MLP has 3 fc layers and each layer has 100 hidden variables.
We train the MLP for 5 epochs, 256 batch size. The initial learning rate is 0.1, and is reduced by half every 1 epoch. We take less epochs and reduce learning rate quickly, because MLP is quite easy to overfit a small training dataset. The training and validation loss is shown in Figure 7 left. The performance comparison is in Figure 7 right. Although the validation loss keeps steady during the training process, the testing error indicates that MLP overfits the training data. In a small dataset, both training and validation dataset are biased and the validation loss is not that helpful to tell if the model is overfitted.
V-B2 Augment Linear Contact Model with a MLP
Then the next idea to explore is ”why not accompany the MLP with our linear contact model?” This can help learn complex dynamics as a complement to the drawbacks of the linear model. Figure 8 shows how it performs. Thankfully, after 15 epochs, the training error keeps going down to 0.005, which is much smaller than the linear model training error, 0.13, and the testing CoM error, 0.5m, is smaller as well, which is only 21% in the order of the length average 2.35m. Remember it only takes 5 trajectories for training now. Adding more trajectories would continue to improve the performance. This indicates that the model can be learned in a lifelong mode. The linear model also helps to get rid of overfitting of MLP.
Since the linear model has a different learning rate to the MLP, we still apply a progressive training method here. We first learn our linear contact model and fix it. Then, we learn a MLP with a smaller learning rate. The MLP actually learns the ’error’ of the linear model. Here the learning rate of MLP is 0.01 and is reduced every 4 epochs.
V-C SUPERball Wheel Rolling Control
This experiment executes iLQG (iterative Linear Quadratic Gaussian) both on MuJoCo and our differentiable engine. The evaluation is in terms of 1) data requirements: number of training trajectories and 2) policy transfer: difference in policies trained on the engine and on MuJoCo.
The task is to control the SUPERball rolling to the right at speed . The setup for iLQG is as follows. We set the target linear velocity of CoM to , the weight of each dimension is , which means we prioritize the motion on x axis, allow swings on y axis and hope to maintain its height on z axis. The weight of KL divergence is 0.005. We use the exactly same setup for MuJoCo and our differentiable simulator.
We use our linear contact model trained with only one trajectory in this section. MLP is not included.
V-C1 Data Requirement
The iLQG is trained in an iterative way. We run 20 iterations in total, sample 40 trajectories in each iteration and each trajectory is 2.5s, i.e. 2,500 time steps. So if we directly train the policy on the real platform, it needs time steps. But our engine only needs a 5s trajectory to identify non-contact parameters and a 5s trajectories to identify contact parameters, i.e. time steps, which is only 0.5% of .
V-C2 Policy Transfer
We run iLQG on MuJoCo directly to obtain the ground truth policy. Then we run iLQG on our engine to get our policy. Finally, we evaluate the ground truth policy and our policy on the MuJoCo platform and compare their performance. We sampled a trajectory from each policy and computed the difference to the target linear velocity. Figure 9 shows the trajectories from the two policies. The trajectory costs in Figure 10 indicate that the two policies are very similar. We don’t apply any online adaptation for policy transfer.
This paper proposed an end-to-end differentiable physics engine for tensegrity robots. This engine is composed only of linear models and all parameters are explainable. This helps to reduce its data requirement for system identification, which is critical to mitigate the reality gap in practice. The explainable parameters are easy to understand and their uniform features are amenable to gradient descent. We also proposed a progressive training process to get rid of local optima and achieve higher accuracy for more confident parameter evaluation. The linear model can be also integrated with a neural network to benefit from larger amounts of data and can also be deployed in the context of lifelong learning. The identified engine has been tested by generating policies and the policy can still perform well on the original platform without adaptation.
-  (2019) Combining Physical Simulators and Object-Based Networks for Control. In IEEE International Conference on Robotics and Automation (ICRA), Cited by: §II.
-  (2019) Vid2Param: online system identification from video for robotics applications. arXiv preprint arXiv:1907.06422. Cited by: §II.
-  (2016) Interaction networks for learning about objects, relations and physics. In Advances in neural information processing systems, pp. 4502–4510. Cited by: §II.
-  (2017) Contact modelling for forward dynamics of human motion. Master’s Thesis, University of Waterloo. Cited by: §I, §IV-B1.
SUPERball: exploring tensegrities for planetary probes.
12th International Symposium on Artificial Intelligence, Robotics, and Automation in Space (i-SAIRAS). Cited by: §I.
-  (2014) Design and control of compliant tensegrity robots through simulation and hardware validation. Journal of the royal society interface 11 (98), pp. 20140520. Cited by: §II.
-  (2020) Design and control of tensegrity morphing airfoils. Mechanics Research Communications 103, pp. 103480. Cited by: §I.
-  (2009) The linear complementarity problem. SIAM. Cited by: §I.
-  (2018) End-to-end differentiable physics for learning and control. In Advances in Neural Information Processing Systems, pp. 7178–7189. Cited by: §I, §II.
-  (1961) Impact: the theory and physical behaviour of colliding solids. JSTOR. Cited by: §I.
-  (1987) Robot dynamics algorithms. Springer, USA. Cited by: §IV-B1.
-  (2001) Classical mechanics. Pearson; 3rd edition, USA. Cited by: §IV-B1, §IV-B1, §IV-B2.
-  (2019) Hamiltonian Neural Networks. In Advances in Neural Information Processing Systems 32, pp. 15353–15363. External Links: Cited by: §II.
-  (2001) Completely derandomized self-adaptation in evolution strategies. Evolutionary computation 9 (2), pp. 159–195. Cited by: §I, §II.
-  (2020) Augmenting differentiable simulators with neural networks to close the sim2real gap. arXiv preprint arXiv:2007.06045. Cited by: §I, §IV-D.
-  (2019) Interactive differentiable simulation. arXiv preprint arXiv:1905.10706. Cited by: §II.
-  (2019) DiffTaichi: differentiable programming for physical simulation. arXiv preprint arXiv:1910.00935. Cited by: §II.
-  (2019) ChainQueen: a real-time differentiable physical simulator for soft robotics. In 2019 International Conference on Robotics and Automation (ICRA), pp. 6265–6271. Cited by: §II.
-  (1975) Coefficient of restitution interpreted as damping in vibroimpact. Cited by: §I.
-  (2019) A differentiable augmented lagrangian method for bilevel nonlinear optimization. arXiv preprint arXiv:1902.03319. Cited by: §II.
-  (2016) A bio-inspired tensegrity manipulator with multi-dof, structurally compliant joints. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5515–5520. Cited by: §I.
-  (2019) Propagation networks for model-based control under partial observation. In 2019 International Conference on Robotics and Automation (ICRA), pp. 1205–1211. Cited by: §II.
-  (2019) Differentiable cloth simulation for inverse problems. In Conference on Neural Information Processing Systems (NeurIPS), Cited by: §II.
-  (2019-accepted) Kinodynamic planning for spherical tensegrity locomotion with effective gait primitives. International Journal of Robotics Research (IJRR). External Links: Cited by: §II.
-  (1999) A compliant contact model with nonlinear damping for simulation of robotic systems. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 29 (6), pp. 566–572. Cited by: §I.
-  (2015) Towards bridging the reality gap between tensegrity simulation and robotic hardware. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5357–5363. Cited by: §II.
-  (2018) Flexible neural representation for physics prediction. In Advances in Neural Information Processing Systems, pp. 8799–8810. Cited by: §II.
-  (Accessed 2020) NASA Tensegrity Robotics Toolkit. Note: https://github.com/NASA-Tensegrity-Robotics-Toolkit/NTRTsim Cited by: §II.
-  (1995) Two-dimensional models of boundary and mixed friction at a line contact. Cited by: §I.
-  (1958) The perceptron: a probabilistic model for information storage and organization in the brain.. Psychological review 65 (6), pp. 386. Cited by: §II.
-  (1986) Learning representations by back-propagating errors. nature 323 (6088), pp. 533–536. Cited by: §II.
-  (2018) Design, simulation, and testing of a flexible actuated spine for quadruped robots. arXiv preprint arXiv:1804.06527. Cited by: §I.
-  (2018) SPNets: differentiable fluid dynamics for deep neural networks. In Proceedings of the Second Conference on Robot Learning (CoRL), Zurich, Switzerland. Cited by: §II.
-  (2019) JAX m.d.: end-to-end differentiable, hardware accelerated, molecular dynamics in pure python. Note: https://github.com/google/jax-md, https://arxiv.org/abs/1912.04232 External Links: Cited by: §II.
-  (2000) An implicit time-stepping scheme for rigid body dynamics with coulomb friction. In Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No. 00CH37065), Vol. 1, pp. 162–169. Cited by: §IV-A.
Adaptive Tensegrity Locomotion: Controlling a Compliant Icosahedron with Symmetry-Reduced Reinforcement Learning. International Journal of Robotics Research (IJRR). Cited by: §II, §III.
-  (2018-11/2018) Any-axis tensegrity rolling via bootstrapped learning and symmetry reduction. In International Symposium on Experimental Robotics (ISER), Buenos Aires, Argentina. External Links: Cited by: §II, §III.
-  (1997) Optimal robot excitation and identification. IEEE transactions on robotics and automation 13 (5), pp. 730–740. Cited by: §I, §II.
-  (2012) Mujoco: a physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033. Cited by: §I, §III.
-  (2005) A generalized iterative lqg method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proceedings of the 2005, American Control Conference, 2005., pp. 300–306. Cited by: §III.
-  (2018) Design of superball v2, a compliant tensegrity robot for absorbing large impacts. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2865–2871. Cited by: §I, §III.
-  (2020) A first principles approach for data-efficient system identification of spring-rod systems via differentiable physics engines. Proceedings of Machine Learning Research 120, pp. 1–15. Cited by: §I, §I, §IV-A.