FAKIR : An algorithm for estimating the pose and elementary anatomy of archaeological statues

07/26/2019 ∙ by Tong Fu, et al. ∙ CNRS 0

The digitization of archaeological artefacts has become an essential part of cultural heritage research be it for purposes of preservation or restoration. Statues, in particular, have been at the center of many projects. In this paper, we introduce a way to improve the understanding of acquired statues by registering a simple and pliable anatomical model to the raw point set data. Our method performs a Forward And bacKward Iterative Registration (FAKIR) which proceeds joint by joint, needing only a few iterations to converge. Furthermore, we introduce a simple detail-preserving skinning approach working directly on the point cloud, without needing a mesh. By combining FAKIR with our skinning method we are able to detect the pose and the elementary anatomy of a sculpture and modify it, paving the way for pose-independent style comparison and statue restoration by combination of parts belonging to statues with different poses.



There are no comments yet.


page 3

page 13

page 14

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1. Introduction

With the progress of 3d scanning techniques, it is now common to create digital replicas of artworks, which will remain forever intact, while the real-world counterparts will slowly decay due to time damage or human activity. However when the digitization is performed, the statues are often already degraded. Restorations being costly, invasive and sometimes even risky, museums are often reluctant to carry out such processes. Hopefully, digitization offers another huge advantage: the possibility to build and test restoration hypotheses, whether or not they are applied to the real model afterwards. In the statue’s case, one of the first steps towards virtual restoration is to be able to identify anatomical parts, in order to guide the restoration. While this task can be performed manually, it is often long and tedious. In this paper, we are thus interested in recovering the anatomy of a statue, using a simple anatomical model, while keeping user interventions to a minimum. We focus on human statues with no or few garments, since those will benefit directly from a registered anatomical model. Many Roman or Gallo-Roman statues fall within this scope. Furthermore, we consider that the digitized statues are provided as point sets.

Our objective is to identify the elementary anatomy of a statue allowing in turn to change the statue’s pose. To do so, we first propose a method for calibrating and registering a simple anatomical model to a point set. This step is achieved directly on the point cloud, avoiding thus the tedious meshing step and preserving the accuracy of the initial sampling. To perform the calibration and registration, we introduce the Forward And bacKward Iterative Registration (FAKIR) algorithm, inspired by recent inverse kinematics approaches. FAKIR permits to efficiently register the anatomical model in only a few iterations. Once the model is registered, the point set surface is locally represented as a residual heightfield above the registered anatomical model. It is then possible to change the pose of the statue and combine parts of different statues after they are brought to a common pose. To do this, we propose a skinning approach that is applied directly to the point cloud, without using a mesh.

To summarize, our contributions are the following:

  • A simple anatomical model efficiently representing a statue pose.

  • An efficient calibration and registration process based on inverse kinematics.

  • A point set skinning process permitting to modify the point set following an anatomy and/or pose change.

2. Related work

Anatomical Model.

Designing anatomical models for human shapes has raised a lot of interest. The most common representation consists in a more or less detailed graph of bones such as the ones used in the MakeHuman framework (makehuman). While some methods go beyond the human skeleton representation and model every single muscle to increase realism (Lee12), we will focus here on skeletons, which are simple and efficient enough for our purpose. Among skeleton-based models, the sphere-mesh model (Thiery13), a variant of convolution surfaces (Bloomenthal:1991:CS:122718.122757), has been introduced for representing mesh models by packing sphere into it and encoding its structure. Conceptually, the sphere-mesh model can be seen as a piecewise linear simplification of the computational geometry skeleton (Tagliasacchi16). Although the original sphere-mesh construction proposed to discover entirely the shape structure from an input mesh, the sphere-mesh model can be used to represent an anatomical model by imposing constraints on it. As such, it has been used successfully for representing hand skeletons (Tkach16; Remelli17). This model is light and pliable and we will also rely on it.

Skeleton rigging and skinning.

Once a skeleton model is chosen, the next problem for animation purposes is to position it inside an input mesh and to attach surface points to it, processes that are called rigging and skinning respectively. Skeleton rigging can be performed manually, but a few methods have investigated automatic processes. In particular, the Pinocchio algorithm (Baran:2007:ARA:1276377.1276467) adapts a skeleton to a static mesh by defining an objective function and maximizing it. It works by packing spheres into the mesh and by considering their centers, gathered in a graph, as the admissible joint positions. This pre-computation makes the skeleton pose estimation tractable. If the input data is dynamic, it is possible to infer, or track, a skeleton from it. Most tracking approaches focus on the direct independent and simultaneous capture of the positions of the joints, using a temporal sequence and prior constraints. The pose parameters (angles) and intrinsic parameters (e.g. bone lengths) are then inferred from it. Many of such tracking methods work with depth streams or videos starting from a previously calibrated skeleton but the calibration itself can be performed from a depth video and a set of known admissible poses (Tkach16; Remelli17). Such methods require a dynamic scene and cannot apply to the static mesh rigging problem. It is also possible to rely on a whole database of people scans to learn the pose and deformation of human bodies (Anguelov05scape:shape; Hasler09cgf; 10.1007/978-3-642-33783-3_18; SMPL:2015). Finally, some methods (Hasler09smi; Zhang17) aim at finding a person’s pose despite its sometimes loose clothing, but this is outside the scope of our paper. For the sake of completeness it is also important to mention that many methods perform human tracking without requiring a model using multiple view acquisitions with or without markers. These methods are only remotely linked to our problem and we refer the reader to (Bogo:ICCV:2015) for an excellent overview.

Once the skeleton is correctly positioned, skinning methods aim at attaching the surface model to it by using weights that define the influence of the bones on the position of surface points. Then, when changing the pose of the skeleton, the attached surface should deform accordingly. Setting the right weights is an important question: while the profile of the weights is generally sketched by graphic designers (Mohr:2003:DMI:641480.641488), there exist automatic weighting techniques that, for example, use heat diffusion (Baran:2007:ARA:1276377.1276467). As far as skinning techniques are concerned, Linear Blend Skinning (Magnenat-Thalmann:1989:JLD:102313.102317) is one of the most popular ones. A mesh surface point is transformed by a linearly weighted combination of the motion of the bones it is attached to. Several methods attempt to fix the well-known collapsing problem of Linear Blend Skinning, such as Pose Space Deformation (Lewis:2000:PSD:344779.344862), Multi-Weight Enveloping (Wang:2002:MEL:545261.545283), spherical Skinning (Kavan:2005:SBS:1053427.1053429) and Dual-Quaternions Skinning (Kavan:2007:SDQ:1230100.1230107). A recent skinning method (Le:2019:DDM:3306346.3322982) corrects artefacts of linear blend skinning by locally estimating the rigid transformation that best restores the relative position of a vertex with respect to its neighbors using Laplacian differential coordinates. This method, designed for meshes, involves a definition of details in terms of Laplacian differences. In our approach, we rather define the detail as the residual over our anatomical model. Taking a different perspective on the problem, Implicit skinning (Vaillant:2013:ISR:2461912.2461960) uses an implicit formulation of the surface that better supports pose changes and reprojects skinned vertices on the implicit model after each pose change. In this paper, we also use a proxy model but it is explicit.

Pose change.

When a model is rigged and skinned, it is possible to change its pose manually by interacting with some joints of the skeleton. Through the skinning weights, the mesh surface should deform accordingly. However, it is often tedious to design every single motion of each joint for each frame of an animation. As a consequence, research has focused on inferring the motion from some key joints and frames with given skeleton positions. In this inverse kinematics context, the Fabrik (Aristidou:2011:FABRIK) and CCD (86079) algorithms define kinematic chains and aim at transforming each chain from its input pose to its target pose by updating pose parameters one after the other alternatively forward and backward along the chain. Our registration method will also use kinematic chains in a forward and backward approach, but the similarity ends here, since our goal is to estimate not only the pose but also length and width of the model limbs using data-attachment constraints in a static framework. Quite differently, some approaches transfer the volume delimited by a mesh to the interior of another mesh by minimizing some harmonic energy (Ali-Hamadi:2013:AT:2508363.2508415). To do this, a deformation field is set up between the two meshes, using the correspondence with the nearest point in the current iteration (gilles:inria-00516374).

Shape Synthesis.

Shape synthesis is a very active research area. It is often done by reusing parts of existing models. (Funkhouser04) proposed an example-based modeling system, where new models are generated by stitching together statue parts from a database constructed by segmenting a set of input models. For completing a 3D model, a general idea is to retrieve suitable models from the database and warp the retrieved models to conform with the incomplete model (Pauly05; Chaudhuri10). More recently, a probabilistic representation for the components of a shape has been developed to suggest relevant components during an interactive assembly-based modeling session (Chaudhuri11; Kalogerakis12). We will propose an application to statue synthesis by part combination, but the choice of the parts is done manually while the part adaptation is automatic.

3. Anatomy and Pose estimation

3.1. Human model

Our work focuses on artworks representing human beings without or with only a few garments. A challenge of artistic data compared to human scans lies in the difference in aesthetic perception. Indeed many sculptors favored the perceived beauty of their work over the realism of human proportions (143371; newell_1934). Figure 2 shows examples of such unrealistic statues of the Roman and Gallo-Roman eras. In this context, it is necessary to devise a human model with few constraints, allowing to fit a sculpture which does not follow the human proportion beauty canons. The existing fully detailed human templates modeling every single limb in a very realistic way are too constrained for our purpose. In particular, in our model we avoid modeling muscles. After we have registered our simplified human model to the statue point cloud, a further step could be to add new flexible muscle models, but this goes beyond the scope of this paper and is unnecessary for statues where the sculptor uses muscles as stylistic elements, in the same way that he could use arabesques, yielding a possibly unrealistic result (Figure 1(b)).

(a) A Gallo-Roman statue of a Gallic warrior (Avignon, France, picture: F. Philibert-Caillat )
(b) Statue of Heracles and Cacus (1530-1534) by Baccio Bandinelli, (Florence, Italy, picture: Cyberuly)
Figure 2. Examples of statues with unrealistic anatomies.

We introduce an anatomical model inspired by the sphere-mesh model (Thiery13), already successfully used for hand tracking (Tkach16; Remelli17), using only one-dimensional elements. In this model, each bone is represented by a sphere-mesh corresponding to the envelope of the union of a set of spheres centered on a segment and with a linearly varying radius (Figure 2(b)). Each bone is defined by two end sphere centers and with associated radii and respectively. The segment is the medial axis of the bone. For each point , the radius of the sphere centered at is , with .

The sphere-mesh model is controlled by the length and the pair of sphere radii . Consequently, we denote the sphere-mesh model for one bone as . We also denote by the angle of the conic part of the bone, as illustrated on Figure 2(a). Importantly enough, the bones we are defining do not correspond to anatomical bones, but more to limbs (i.e. it includes a coarse description of the flesh volume around the anatomical bone). By analogy to inverse kinematics, we keep the word bone instead of limb.

(a) Cross-section of a bone
(b) Sphere-mesh of a 3D bone.
Figure 3. The sphere-mesh of a bone is the union of the spheres centered on segment , with radius varying linearly between the two extremities.

With this type of bone element, we construct a simple human body template with a very coarse respect of human proportions as an initial body shape (Figure 4). Our human body template contains 22 bones . Three of those correspond to the pelvis and have no relative motion: their length is fixed up to a common scale parameter that will be determined during the registration, along with the orientation of the triplet. Additionally, a special bone is used to connect the spine bone to the neck, and its length and orientation directly depend on the adjacent spine bone. The other bones have no constraint on their relative proportions. The bones are organized into 5 chains, depicted in different colors in figure 3(a): the spine chain, the right arm chain, the left arm chain, the right leg chain and the left leg chain. These chains are independent with the only constraint that some extremities must remain anchored to the spine. The chain organization is used to define the notion of predecessor and successor for one bone in a chain, and this ordering will be extensively used in our kinematic registration. In particular, we will reverse the ordering of the chain to process it forward and backward several times along the process. Each bone is thus fully defined by its intrinsic parameters (length and two radii) and by its extrinsic parameter (rotation with respect to its predecessor). Furthermore, two successive bones share a common radius. Because of the simplicity of the sphere-mesh bone model, the distance from a point to the model can be easily computed. In contrast, using a mesh model would make these computations much more demanding.

(a) Control skeleton
(b) Sphere-mesh model
Figure 4. Our anatomical human model: the bones are organized into 5 chains shown in different colors. additional bones are drawn in black: the pelvis which is a constrained triplet of bones, and the connection bone between the spine and the neck.

3.2. Distance between the model and a point set

To capture the anatomy and the pose of a statue, we need a distance function to measure how the sphere-mesh model fits a point set , even if the points are far from the limbs they should be attached to. We assume that the coordinates of the points are provided with a coarse approximation of the normal so that we can locally distinguish the inside from the outside of the sampled surface. If normals are unavailable, in most cases, they can be roughly estimated and orientated using viewing directions from the 3D sensor for example.

Distance from a point to a one-bone model.

We start by defining the normal-constrained projection of a sampled point on a single bone

by using the normal vector

to disambiguate the choice between several orthogonal projection possibilities. The sphere-mesh model calibration and registration strives to reduce the distance between the sampled points and their corresponding points on the sphere-mesh and using the normal orientation helps this process. Given a point in the ambient space with oriented normal , we consider the lines passing through and orthogonally intersecting the sphere-mesh surface (possibly crossing its interior) at some points. We abusively call these intersection points orthogonal projections (see appendix A for the detailed cases). From all these possibilities, we select the projection whose normal has positive scalar product with . Considering this normal-constrained orthogonal projection allows to better handle the case where the bone is not initialized close to the point set. Since each point has a normal-constrained projection on several bones, we refer to its normal-constrained projection on bone as . If no subscript is provided, refers to the normal-constrained projection of on the associated closest bone.

Distance from a point set to a sphere-mesh model.

Given a point set and a sphere-mesh model of bones, we first need to approximate the subset of corresponding points for each bone. In the following, we define the point set as the subset of points which are closest to bone using the distance . However, we try to favor the assignation to the conical part of a bone. Therefore, if projects on the spherical part of the closest bone and if it also projects on the conical part of another bone with a similar distance, we assign it to this second bone. Once the assignment is computed, the one-bone distance function is defined as the sum of squared distances from points of to bone :


Importantly enough, the subset and the one-bone energy depend on the position of the initial extremity of the chain of bones involving , as well as the parameters of the previous bones in the chain.

In the following, the sum of one-bone distance functions is considered as an energy that we aim to minimize in order to capture the anatomy and the pose of the sphere-mesh that best correspond to our point set:


In the next sections, we will also be interested in the distance restricted to two adjacent bones and , which we call two-bones distance:


3.3. FAKIR : Forward And bacKward Iterative Registration

To register our anatomical model to the acquired static point cloud, we propose a kinematic approach taking into account the articulated property of the skeleton. Contrarily to many tracking methods that exploit the redundancy of information between several frames or several views (Tkach16; Remelli17), our method requires only one joint of the skeleton to be close to its optimal position, the rest of the skeleton pose being arbitrary, except for a mild assumption that in practice means using oversized initial lengths for the bones. It is one more assumption than the Pinocchio algorithm (Baran:2007:ARA:1276377.1276467), which also works in the static case and starts with no prerequisite and we will compare our algorithm to it. Inspired by the FABRIK (Aristidou:2011:FABRIK) and CCD (86079) algorithms, our registration algorithm successively loops forward and backward through the chains of bones so as to rotate and rescale them to match the data, refining the parameters while temporarily fixing the extremities of some bones. Hence our algorithm is named Forward And bacKward Iterative Registration (FAKIR). An originality of our method is that bones are not only considered one by one but also by consecutive pairs, which allows for a more robust estimation of the pose and skeleton parameters along a chain.

Registration process for a chain of bones.

If the bone is close to the points to which it should be associated, the estimation of the one-bone energy function becomes meaningful and the minimization of this energy can be used to coarsely optimize the position and radii of that bone with respect to the data. Therefore, our approach amounts to iteratively updating the rotation and intrinsic parameters of until we catch a coarse estimation of and , one extremity of being kept fixed. Our algorithm gradually rotates the current bone with respect to its predecessor, updating after each rotation, so that gradually contains more relevant points. Once the position of has been approximately found, the algorithm turns to the coarse estimation of the position of . All these computations are driven by the minimization of the one-bone energy. However, the one-bone energy alone might be inefficient to approximate the full length of a bone accurately and we use this energy only to obtain a first approximation of the registration for one bone. In an intertwined manner, a finer local registration procedure of our algorithm is performed each time two consecutive bones and have been processed. Its goal is to optimize the common joint position and radius while fixing the two other joint extremities. This optimization is performed by minimizing a two-bones energy function, defined above. Once a chain of bones has been positioned and scaled over its entire length, we repeat the process forward and backward in the chain in order to further refine the joints positions and radii between pairs of consecutive bones, only using two-bones energies. Finally, the position of the last extremity of a chain is also optimized between each forward and backward step of the loop by optimizing the one-bone energy function. Each parameter optimization is thus made by minimizing an energy related to either one or two bones. The full process is summarized in Algorithm 1 and illustrated with a chain of three bones in Figure 5.

Note that during the process, two consecutive bones should not be enclosed in a same limb of the model point set, otherwise a joint might remain enclosed inside this limb, trapping the optimization in a local minimum. This can be avoided by introducing a condition on the initial bone lengths ensuring that the sum of the initial lengths of two consecutive bones is below the optimal length of each of the limbs they are close to. A practical trick is to slightly oversize the bones before starting the registration process.

0:  A point set and a sphere-mesh chain of bones with one chain extremity close to its optimal position
0:  The registered sphere-mesh chain.
1:  Initialization:
2:  Fix the center of the first extremity of the chain. Rotate the first bone and adjust its radii and length by minimizing the one-bone energy function;
3:  for  to  do
4:     Consider the pair of bones :
5:     Fix the position of the joint common to and ;
6:     Alternate between the optimization of ’s rotation w.r.t , optimization of ’s intrinsic parameters and update of ;
7:     Fix the positions of the 2 joints that and do not share, and free their common joint;
8:     Compute the position and the radius of the common joint by using the two-bones energy function.
9:  end for
10:  Compute the length of the last bone and the radius of the last sphere.
11:  Forward and Backward registration loop:
12:  repeat
13:     Reverse the order of the bones in the chain;
14:     for  to  do
15:        Consider the pair of bones :
16:        Fix the positions of the 2 joints that and do not share;
17:        Compute the position and the radius of the common joint by using the two-bones energy function.
18:     end for
19:     Compute the length of the last bone and the radius of the last sphere with the one-bone energy function.
20:  until convergence
Algorithm 1 Forward and backward iterative registration
Figure 5. Overview of the forward and backward iterative registration for a 3-bone chain. From an initial position (a), the chain extremity is fixed and the first bone is rotated and scaled to roughly calibrate its dimensions and pose through the optimization of a one-bone energy function (b); the bone is fixed and the parameters of the second bone are roughly calibrated in turn (c); joint which is common to the first two bones is scaled and its position is optimized, by using a two-bones energy function, the other joints being fixed (d); The position and length of the third bone are then coarsely calibrated through one-bone optimization and the process continues by alternating single bone optimization and two-bones optimization, until the last bone of the chain (e). After this first coarse calibration forward pass finishes, a backward pass using only two-bone optimizations is performed (f) permitting to refine the pose and skeleton parameters and solve for the chain extremity position. With few forward and backward pass involving two-bone optimization only, the model is registered (g).

Optimization of a single bone

As stated above, the optimization of the one-bone energy is only used for estimating the parameters of the extremities of a chain, or for the very first forward pass in Algorithm 1). After this step, the optimization of the position of a joint using a bone pair should be preferred because it is more precise.

In the single-bone case, the aim is to estimate the 3D rotation of the bone, its length and the radius of its free extremity by minimizing the one-bone energy function , where are the angles of rotation with respect to the predecessor bone. This optimization is performed using the Levenberg-Marquardt algorithm for each parameter. In particular, the rotation can be decomposed into two rotations around two axes that are orthogonal to , indeed the rotation around is not considered since it leaves the bone unchanged. In the case of the minimization with respect to the vector of rotation angles , we iteratively try to replace the current angles with an update . At a minimum, , and the value for follows. The details for the damped least-squares estimation are provided in Appendix B.

Optimization for the joint between two consecutive bones.

The optimization of the position and radius of the joint between two consecutive bones is performed by optimizing a set of four parameters in a loop (an angle, two lengths and a radius) minimizing the two-bones energy function. The two end-sphere centers being fixed ( and in Figure 6), we first compute the optimal rotation of the two bones around axis , all other parameters being fixed. We then optimize the bone lengths and and, finally, the radius of the common joint is computed as . The parameters optimization alternates with a recomputation of point sets and , which refines the point-to-bone assignment. Similarly to the single-bone case, the optimization is also performed using the Levenberg-Marquardt algorithm whose details are provided in Appendix C.

(a) Rotate and
(b) Refine length of
(c) Refine length of
(d) Refine radius
Figure 6. Pairwise Optimization. With fixed extremities and , the pair of bones and is first rotated around axis in order to minimize the two-bones energy. Then the lengths of the bones and and their common radius are optimized successively. After these updates, the point-to-bone assignment is recomputed. As the process is repeated the distances are more accurate since the point-to-bone assignment becomes more meaningful.

Full Skeleton Registration

Our anatomical model is composed of chains, one of which is of particular importance: the spine chain which connects all other chains (the arms, and the legs through the pelvis block). We assume that the pelvis part of our model is initialized near the corresponding part of the point set, which requires a very limited user interaction - basically only one point and click. Each chain is then registered in turn using FAKIR yielding a registered skeleton both in terms of intrinsic parameters and pose in only a couple of iterations. The position of the pelvis is then revised during the registration of the spine chain. The registration order is the following: first the spine chain is registered, followed by each of the two leg chains and each of the two arm chains. When registering the arms and legs chains, the position for the joint attached to the spine or the pelvis remains fixed.

4. Point set skinning and application to statue pose and anatomy modification

Once the anatomical model is registered to the statue point set, it is possible to exploit the assignment between the model and the point set to change the pose and the elementary anatomy of a statue by modifying the extrinsic and intrinsic parameters of the bones respectively. To do so we attach the point set to its registered anatomical model so that deforming the model will deform the point set adequately. This process, well known in the animation research community, is called Skinning. It allows to deform the skin following an underlying skeletal animation, a widespread method to animate 3D models. In most skinning methods, the skin is a 3D mesh and the skeleton is a tree whose nodes represent joints of the skeleton and edges represent bones. An originality of our approach is to avoid using a mesh with fixed connectivity whose triangles quality may be altered by deformations related to pose and anatomy changes, possibly creating triangle slivers and self-intersections. We rather propose to develop a point set skinning process.

In our case, since after applying FAKIR, the anatomical model is registered to the point set, it can serve as the skeleton of usual animation methods. We introduce a skinning method working on point sets and taking a better account of the twisting and bending rotations, alleviating known artefacts of Linear Blend Skinning and Dual Quaternions. We further propose to consider the skin of the model as a set of details, encoded as a heightfield on top of our elementary anatomical and transfered back on the model after pose or intrinsic parameters changes.

Local heightfield over the sphere-mesh model

From now on, let us assume that the sphere-mesh model is registered to the point set. The heightfield value of a point is defined as a signed distance from to its projection on the sphere-mesh base point : , where is the normal to the sphere-mesh surface at . Importantly enough, is the regular orthogonal projection on the bone is assigned to, in opposition to the normal-constrained projection defined in section 3.2 and appendix A. Hence the surface point set is decomposed into the set of base points on the sphere-mesh and the residual orthogonal heightfield. The pose and anatomy changes modify, through skinning, the position of the base point on the sphere-mesh model. Then the heightfield is added back to the modified projection yielding the final point set, as described in algorithm 2.

Under pose change, a point remains affected to the same bone but its evolution depends on both the evolution of its base point , the normal at and the possible scaling of its the heightfield value . The evolution of the base point is driven by the bone is assigned to and its adjacent bones in the chain. The motion induced by our skinning model is continuous. Indeed, the position of a base point continuously depends on a set of one to three consecutive bones. Although the point-to-bone assignation is fixed during deformation, the base point can slide along the bone and move from the conic part to the sphere part of the bone and conversely. Since the conic part is tangent to the sphere, both the base point and its normal evolve continuously even in case of cone to sphere or sphere to cone base point motion. Since the heightfield of a point is either preserved or continuously scaled, the whole motion is continuous throughout the deformation. Naturally, some points may become hidden because of self-intersections between bones caused by the rotation. In this work, we choose to keep the self-intersections, since the resulting outer envelop of the model is visually satisfying. Compared to other heightfield decomposition methods, the heightfield is carried by the points themselves and not by the base surface. Hence it can encode folded surface sheets over the sphere-mesh model (see the Dancer with Crotales example on Figure 21). In the following we describe our proposed point set skinning method to be applied to the base point sets before lifting them with the heightfields (Algorithm 2). Since the skinning can be applied independently of the heightfield decomposition, we describe the method in a generic point set context for notation simplicity. Note however that heightfield skinning should be preferred to direct point set skinning, since it provides a much better result as illustrated on Figure 7.

Figure 7. Heightfield skinning. From left to right: original point set; pose change with direct point set skinning; pose change with heightfield skinning.
0:  Sphere-mesh model registered to the input point set . Target pose and anatomy.
0:  Point set corresponding to the target changes.
1:  Compute base point set corresponding to on the registered sphere-mesh and compute the set of height values
2:  Apply point set skinning on the base points
3:  Deform to the target pose and anatomy, yielding
4:  Re-project points of on the sphere-mesh yielding
5:  Lift points to the skin surface using the - possibly scaled - initial height of each point, yielding .
Algorithm 2 Height-field skinning

4.1. Point Set Skinning

The general idea of skinning is to attach each point to one or more bones with weights measuring the influence of each bone on it. The position of the point after a deformation is a weighted linear combination of the positions relative to its influencing bones. Linear Blend Skinning (Magnenat-Thalmann:1989:JLD:102313.102317), one of the most popular skinning method, causes some well-known collapsing problems at joints, in particular for large rotations. For example, the volume at the joint is not preserved when it is bent around the axis orthogonal to the two adjacent bones (Figure 8). Similarly, if we apply a twist around the axis of a bone while keeping its predecessor fixed, Linear Blend Skinning produces a folding of the joint around a singular point (Figure 11(a)). These flaws are avoided by Dual Quaternions Skinning (Kavan:2007:SDQ:1230100.1230107) which interprets a combination of rigid transformations as a rigid transformation, however artefacts still appear in the concavities (Figures 11 and 14).

We propose to deal with the pose change in a different way that breaks down the movement into its natural components. We consider that the motion between two bones at a joint is the combination of a twisting rotation around the axis of one of the bones and a bending rotation around an axis perpendicular to both bones’ axes. We also introduce motion-dependent skinning weights: the weight of a point is not the same for twisting and bending rotations. The impact of bending on the base point set should be limited to an area loosely enclosing the rotating sphere joint, while the impact of twisting obviously extends to the length of the bones adjacent to the twisted articulation. This is more coherent with the fact that underlying muscles are arranged along the bones and attached to the joints. In our approach, a point can only be influenced by the bones adjacent to so that it has at least one and at most three weights.

Figure 8. Linear Blending artefacts: joint collapse (a); (b) trace of the evolution of a point through iterated re-skinning with direct linear blend skinning after splitting the rotation into smaller ones. Through Linear Blend Skinning, a point located opposite to the bending is moved inside the joint along a line where and are the initial positions of with respect to bones and respectively.

Bending rotation with anisotropic weights.

We decompose the bending rotation into a sequence of small bending rotations and update both the position through Linear Blend Skinning and the corresponding weights of the points after each rotation (Figure 7(b)). At each step, the bone is slightly rotated by around the joint bending axis and the points are moved following adapted Linear Blend Skinning weights. The weights of a point during a bending rotation follow a Gaussian profile of the distances from to each of the bones that influence it. It is driven by a parameter controlling the size of the influence area. In our experiments, we set larger than the average distance between the point set and our model. The weight , relative to one of ’s influencing bones writes:


where is a normalizing factor ensuring that the weights sum to , and is the regular orthogonal projection of onto , Hence if is on bone (as is the case for base points), . Figure 9 shows the weights profile along two bones.

Figure 9. Skinning weights for the bending rotation around a joint. (a) the red curve represents the influence weights of the left bone and the blue curve represents the influence of the right bone . The influence area of each bone is controlled by . (b): values of anisotropic on a bone. Points such that are shown in white. From yellow to red, the value of increases linearly.

This discretization yields similar results as Dual Quaternion skinning. However, it is possible to further improve the method, by noticing that if is large, the convex part behaves well, it is populated by points sliding from the conic parts of both bones to the joint sphere, while an artefact is created in the concave part. On the contrary, if is close to , which corresponds to no skinning at all, a hole appears in the convex part, but a self-intersection is created in the concave part (Figure 10). Quite counter-intuitively, this self-intersection is much more visually satisfying than the thinning artefact which can be observed otherwise. Indeed, only the outer envelop is visible and, when allowing self-intersection, this envelop is similar to the one obtained if a contact area was computed between the two bones (Vaillant:2013:ISR:2461912.2461960). To get the most of the two possibilities, we propose to adapt so that it is close to for points in the concave part and it is larger for points on the convex parts. More precisely, varies continuously with respect to an anisotropy angle.

Figure 10. Influence of on the skinning resulting from a bending rotation. In figure 9(a), . The value increases from (b) to (d).

We define the anisotropy angle at as the angle between the plane defined by and axes and the plane defined by ’s axis and . This angle serves to transition continuously over the skinned surface between points with no skinning, favoring local self-intersections (), and points with skinning, favoring the diffusion of points over the spherical joint surface (). Here, is deduced from the anisotropy angle as , with controlling the influence area size. Therefore, the weight associated to point and one of its influencing bone is:

(a) Linear blend skinning
(b) Dual Quaternions skinning
(c) Our anisotropic skinning
Figure 11. Comparison of our anisotropic skinning method with Linear Blend Skinning and Dual Quaternions for a bending rotation. For a fair comparison, the two first methods use the Gaussian weight of Equation 4 which is made anisotropic in our case. Dual Quaternions fix the volume collapse of Linar Blend Skinning near the convexity, but artefacts remain in the concave part, while our skinning method does not suffer from any of these flaws.

Twisting rotation.

Let us consider a bone twisted around its axis by an angle , with its predecessor kept fixed. Such a rotation impacts points attached to bones and . To handle this twist, we drop the skinning framework by linear combinations of bone motions and replace it with rotations adapted to the points. More precisely, each point is rotated around ’s (resp. ) axis by an angle that depends on its distance to (resp. ). For rotating around the axis of , this angle writes with


is the projection of on ’s axis, and (resp. ) is a point on ’s axis (resp. ) delimiting the impacted areas (Figure 13). By default and , but different impacted areas can be designed by choosing different and . Here, the expression for corresponds to a linear evolution of the rotation angle along the bone, but other types of influences could be designed. On Figures 12 and 13, we compare our method with Linear Blend Skinning and Dual Quaternions for a twisting rotation, the trace of the points initially aligned is much smoother with our approach.

(a) Linear blend skinning
(b) Dual-quaternions skinning
Figure 12. Linear blend skinning and dual-quaternions skinning to be compared with our approach in figure 13 that uses twist specific weights. Green dots show points that were aligned before the twisting rotation.
Figure 13. The blue curve represents the value of . Points and define the range of the twisting effect. is the projection of on the skeleton line.

Combination of twist and bend.

Twist and bend specific skinning are combined to get the skinning result for any rotation around a joint. We perform first the twisting rotation and then the bending rotation. The positions and the weights of the points relative to each bone are updated between these two specific rotations. Figure 14 shows a comparison of our approach with Linear Blend Skinning and Dual Quaternions on synthetic data with a two-bones sphere-mesh model. As was expected, Linear Blend Skinning yields the well-known collapse at the joint. To alleviate this effect, bending with Dual Quaternions and our bending approach without anisotropy get a similar result, with a region of influence restricted to the neighborhood of the joint. However, the twist is handled differently in our approach and its effect is distributed over the length of the bone, yielding a more natural result.

Figure 14. Comparison of skinning approaches on a sphere-mesh model with combined twist and bend rotation movements. Left to right: Linear Blend Skinning, skinning with Dual Quaternions, Ours

5. Results

In this section, we show the performance of FAKIR both on synthetic data and on point sets resulting from statue digitization. We developed our algorithm in C++, using OpenMP for distance update parallelization. All experiments are run on an Intel Core i7-4790K CPU @ 4.00GHz.

5.1. Experiments on synthetic data

We first test our algorithm on synthetic data to provide a quantitative evaluation of the FAKIR performances. We consider a point set of points sampled on a sphere-mesh of a 4-bone chain in a specific pose. We start from a generic 4-bone chain that we try to register to the point set. Although the point set and the initial chain are quite distant from each other, a user-given initial approximate position of a single anchor point (one of the extremity) is enough to register accurately the chain. The accuracy of the registration is evaluated as the average distance between the point set and the model:


For points, without any additional noise, our algorithm takes to converge to in iterations for this synthetic model of bones, including 3.2s for the initialization. The distance of the point set to the model with respect to the iterations for larger point sets and increasing noise is shown on Figure 15: the number of points has only a moderate impact on the number of iterations needed to converge (around

). When there is noise in the data, the distance also converges in a few iterations independently of the noise. However the distance at convergence is directly correlated to the variance of the noise. In fact, FAKIR is rather resilient to even relatively high levels of Gaussian noise (Figure

16). Figure 17 shows how FAKIR handles an initial position of the anchor point that is not in the vicinity of its optimal position in the point set. FAKIR is rather robust, but in some cases (last column), when the initial chain position is such that the points corresponding to the first bone only project in a small area around the joint between the first and second bones. In this case, the optimization of the one-bone energy function alone fails to reduce the length of the first bone and the radius of its free extremity degenerates to instead. This is due to the fact that no point is projected on the spherical free extremity of that bone. This problem could be avoided by slightly modifying the one-bone energy function by adding a bone occupancy term. A preferential alternative would be to modify the one-bone energy function of the first and last bones by adding a term corresponding to the distance of the free caps to the data points. However, with our working assumptions those cases are avoided. FAKIR is also rather robust to missing data thanks to the iterated forward and backward passes (Figure 18). Naturally when the missing parts are on the first or last bone or when a full bone is missing, the algorithm cannot predict the right length or angle.

Figure 15. Evolution of the registration distance with the iterations for different number of points in the point set (left image) and different levels of noise (right image).
Figure 16. Evaluation of FAKIR with respect to increasing Gaussian noise after iterations. The first row shows the initial point set and the bottom row shows the registered bone chain. From left to right: without noise, , , and . The total groundtruth model length is 140.
Figure 17. Evaluation of FAKIR with respect to a bad initial anchor point position after iterations. The first row shows the initial point set and the bottom row shows the registered bone chain. The last column shows that due to a bad initialization, the points (plotted in red) that are affected to the first bone do not bring enough information for the one-bone energy function to move the chain extremity. Then, not enough bones remain to approximate the whole point set.
Figure 18. Evaluation of the FAKIR algorithm with respect to missing data after iterations. The first row shows the initial point set and the bottom row shows the registered bone chain.

5.2. Skeleton registration results on statues

We selected some interesting statues from various sources.

  1. Dancer with Crotales: Louvre Museum, Paris, France.

  2. Dancing Faun: Pompei excavations, Italy.

  3. Aphrodite, Museum of Thorvaldsens, Copenhagen, Denmark.

  4. Old man walking: Nye Carlsberg Glyptotek, Copenhagen, Denmark

  5. Esquiline Venus: found in 1874 at the Horti Lamiani in Rome. Capitoline Museums, Rome, Italy

  6. Old Fisherman Vatican-Louvre (or Dying seneca): found in Rome. Louvre Museum, Paris, France

  7. Venus de Milo: Louvre Museum, Paris, France

  8. Prince Paris: Ny Carlsberg Glyptotek, Copenhagen, Denmark.

While the ’Dancer with crotales’ is a raw point set. The other 7 models are point sets sampled on meshes extracted from the Sketchfab website. Figure 21 shows obtained registrations on four statues. It also compares resulting models after skeleton pose change and skinning with our method or skinning with Dual Quaternions. Our skinning method clearly improves the quality of the modified model near bone joints. The registration algorithm performs well for statues depicting naked characters: in this case, the registration is not hindered by additional clothing or accessories, and the simple sphere-mesh model fits well the data. Even with moderate clothing (Dancer with Crotales) FAKIR recovers the pose of the statue, and the skinning process yields a plausible result.

As far as the complexity of FAKIR is concerned, the computational bottleneck lies in the assignment and re-assignement of each point several times during the optimization process. This reassignment takes place during the updates of the one-bone and two-bones energy updates. The number of these updates is related to the geometry of the surface and not to the number of points sampled on that surface. Therefore the overall complexity is linear with respect to the number of points. From an experimental point of view, FAKIR is a reasonably light algorithm: for a point cloud of points and the 22-bone model, the first forward pass of FAKIR takes and the average execution time of one pass of the FAKIR process takes less than .

We compare FAKIR with Pinocchio (Baran:2007:ARA:1276377.1276467) in Figure 19. The FAKIR algorithm yields a better skeleton registration, in particular for the shoulders and neck bones. As far as computation times are concerned, the Pinocchio method takes about 35s for a mesh with 138048 vertices, which is roughly the same time as the 10 iterations of the FAKIR process optimizing not only for the joint positions but also for the bone radii (38s). Furthermore, a single iteration of FAKIR takes 9s and already provides a better result with a much more plausible shoulders location. However it is important to note that the Pinocchio method does not require an initial skeleton position, while our method requires one of the joint to be not far from its optimal position (in this experiment we chose the pelvis joint).

Figure 19. Skeletons for the Aphrodite statue using the Pinocchio rigging method and comparison with our FAKIR algorithm. From left to right: Pinocchio with the Pinocchio-provided initial skeleton (17 bones); Pinocchio with our initial skeleton (22 bones); FAKIR with our initial skeleton after a single forward iteration; FAKIR with our initial skeleton in 10 iterations. Only the skeleton is displayed since the bone radii are not taken into account by Pinocchio.

5.3. Statue restoration by fragment combination

A direct application of statue pose and anatomy modification is statue restoration, by harmoniously combining parts belonging to different statues after bringing them to a common pose and anatomy. This type of statue restoration by part combinations was quite common in the 18th and 19th centuries 111https://art.thewalters.org/detail/22879/torso-of-artemis-with-head-of-aphrodite/. Performing this restoration virtually has two advantages: first, several hypotheses can be tested, second the pose and anatomy of the statues can be made similar before the combination, avoiding thus adding an oversized arm to an undersized body. Examples of statues with missing parts are shown in Figures 1 and 22. To complete a statue, we first register our anatomical model to the point set using FAKIR. In case there is no clue for the size a limb, such as for the missing arm, forearm and hand of the Esquiline Venus (Figure 22, first row), we use default human proportions. Then we choose a statue which is complementary to the incomplete statue and change its elementary anatomy and pose to match the ones of the incomplete statue. The last step is to integrate the selected parts into the first statue. We assume that the selected parts and the statue to complete slightly overlap, which it necessary to blend harmoniously statue parts.

The two point sets being brought to a common pose and anatomy, the combination area is defined as the area with two layers of points, one from each statue. Here we handle the case where there is only a single layer in each fragment above the sphere-mesh in the combination area. To prevent layers superposition artefacts, it is necessary to merge the information in these areas. In a nutshell, the merging consists in removing the lower layer in the overlap area, creating a sharp boundary between the point sets, and then blending the points near the boundary. Recall that our sphere-mesh model is used as a basis surface to express the residual heightfield information after the registration. We propose to use it to combine the data points. Let be the subset of points on the two models that project on , the projection of on the sphere-mesh model, up to precision . The first step is to keep only the upper layer in the overlap area. To do so, we consider the subset of the points in whose heightfield value is larger than , where is the maximum heightfield value for points in . Then we replace the heightfield value of by a Gaussian-weighted average of the heightfield values in . The resulting heightfield value of is:


with a weight normalizing factor. This brings the points of the lower layer in the overlap area to the upper layer, creating a sharp boundary. Finally, the sharp boundary is smoothed by gaussian-weighted averaging of the heightfield values across the boundary.

Figure 20. In the left figure, blue points and red points come from different statues. The right figure shows the result after merging the two parts by taking the maximum followed by boundary smoothing.

In figure 1, we show the process of completing a statue which lacks arms and legs. Figure 22 shows the restoration of three other statues. The Esquiline Venus (first row) and the Venus de Milo (third row) are both missing arms while The Old Fisherman is missing legs (second row). The restoration method recovers plausible arms and legs in all three cases, leading to plausible restored statues.

5.4. Limitations

Our FAKIR algorithm works well to estimate the anatomical position of statues with or without clothes as long as the anatomical information remains visible. For example, in the Dancer with Crotales case (first row of Figure 21), the dimension and the pose of the legs is easy to infer although the legs are partially hidden by the dress. However if the clothes hide a large part of the statue, such as in the Venus de Milo case (third row in figure 22), FAKIR fails at recovering the anatomy and pose of the legs, as they are covered by the drape. For our point set model, a change in pose may reveal areas where there is no data. This occurs wherever two parts are stuck together in the initial pose. For example in Figure 21 (fourth row), there seems to be a veil between the right arm and the body (see also Figure 22, second row). Finally, combining parts of different statues for restoration can sometimes look unnatural because the materials and styles of the combined parts are different, style transfer would be necessary to alleviate this effect but this is a whole different research topic.

6. Conclusion and perspectives

We introduced a sphere-mesh anatomical model and a combined calibration and registration algorithm to estimate the anatomy and the pose of digitized archaeological statues. We also proposed a point set skinning method to modify the point set when the pose of a statue is changed. We compared our registration and skinning approaches with existing approaches to highlight their benefits. We also illustrated that this framework can be used to combine statue parts or add missing elements. While our method already gives good results, a further improvement would be to handle the case of a clothed statue which would involve modifying the FAKIR algorithm since anatomy parts may be hidden. Sometimes, the point cloud from the scan of a statue is incomplete, or some areas, occluded in the original statue, are revealed by the pose change, creating thus a hole in the point set. In that case an inpainting process is necessary and will be the topic for our future work. Finally, our current approach to combine parts remains very rudimentary and is similar to a union despite a slight smoothing. As a future work, we would like to design a more respectful approach to the geometry and style of different fragments.


This work was funded by the e-Roma project from the Agence Nationale de la Recherche (ANR-16-CE38-0009). The Dancer with crotales model is a point set of the Farman Dataset (ipol.2011.dalmm_ps). The other 7 statues data are sampled on meshes from the Sketchfab website: the Dancing Faun model is courtesy of Moshe Caine, the Venus de Milo model is courtesy of Sketchfab user ”tux”, and the other 5 statues models (Aphrodite, Old Man Walking, Esquiline Venus, Old Fisherman and Prince Paris) are courtesy of Geoffrey Marchal.

Figure 21. Registration and pose changes of 4 statues: the Dancer with Crotales (first row), the Dancing Faun (second row), Aphrodite (third row) and the Old Man Walking (fourth row). First column: initial point set, second column: overlay of the registered model and the point cloud, third column: registered model, fourth column: final point set in a modified pose by our skinning method, fifth column: skinning result with Dual Quaternion. As can be seen in particular in the areas circled in red in the fifth column, our method suppresses or at least reduces the Dual Quaternion artefacts around the bone joints.
Figure 22. Registration and restoration of 3 incomplete statues, Esquiline Venus (first row), Old Fisherman (second row) and Venus de Milo (third row). First column: initial point set, second column: overlay of the registered model and the point set, third column: registered model alone, fourth column: final restoration.


Appendix A One-bone distance computation

In this appendix, we detail the projection of a point on a bone . Instead of using the usual orthogonal projection on the bone, we constrain the projection to have a normal coherent with the one of . This constraint is helpful when the bone lies far away from its corresponding point set: the point will then be projected on the “right side” of the bone. In the following, without loss of generality, let us assume . All the following computations depend on an angle defined in Fig. 23 and which can be expressed as . Let us first compute the projection of on line , and two translations of these points along this line: at the distance of and at the distance , as illustrated on Figure 23. Let , so that can be expressed as . Different cases can occur:

  • : the point projects on the cone part of the bone. Let be the intersection of segment with the cone. is the orthogonal projection of on the bone. If the normal to has a positive scalar product with the normal of , . Otherwise, normals are deemed inconsistent and , i.e. the intersection of with the cone that is the farthest from .

  • (resp. ): is the projection of on the sphere centered at (resp. ) with consistent normal direction, except if this normal-constrained projection falls within the bone and not on the envelop.

In any case, the distance between and its normal-constrained projection vanishes when is located near the surface of one bone, with a normal oriented consistently. It may happen that the returned projection does not provide a point belonging to the surface of the bone: on figure 23, is the normal-constrained projection of point , but it is not on the surface of the bone. It corresponds to a case where the point is very far from the part of the bone which is coherent with its normal. In this situation, we compute the distances between point and its projections on the two spheres and choose the closest projection point for .

Figure 23. Various projection cases. has two possible projections and depending on the orientation of the normal at . Point is the projection of on line . If the normal at is oriented upward . Otherwise, . The same strategy is used to project points and .

For completeness, let us express unsigned distance in the various cases since they will be required in the following Levenberg-Marquardt optimization formulations. If (resp. ), (resp. ). If :


Since the radius of the cone varies linearly along line :


with: and . Furthermore . Hence, for each bone, we first compute the angle, then, for each point , we compute its projection on and the corresponding yielding and .

Appendix B Optimization for one bone

Let us assume that (Fig. 23) is fixed and let us optimize for the pose and intrinsic parameters of bone . In a local reference frame centered at with x-axis aligned with , has coordinates and has initial coordinates . The rotation of the bone can be parameterized by a rotation of angle around the y-axis followed by a rotation of angle around the z-axis. The one-bone energy is invariant by rotation around the x-axis. After the double rotation, has coordinates . Let us call the coordinates of point in this local coordinate system and express with respect to parameters , and . We have:

The one-bone energy function is (dropping the subscript for simplicity):


The optimization is performed on three set of parameters in turn: angles , bone length and bone radii .

the optimization for bone with respect to writes:


Following the Levenberg-Marquardt algorithm, at each iteration, parameter is replaced by a new estimate , computed as:


which is computed by taking:

We finally get :

where , and and is a column vector whose entries are for each point . is a damping factor set to initially and adapting it throughout iterations.

In the following, we assume and . In this case, projects on and with , and . Hence:


The full expression for the derivatives can be easily derived given the expressions for , , above. The cases , or can be computed similarly.

Appendix C Optimization for a joint between two consecutive bones

Let us consider the geometric optimization of the center of the joint between two bones by optimizing the two-bones energy with respect to the lengths and . Each length is optimized in turn, with a side-effect on the value of the other length. The two-bones energy can be expressed as a function of :


Following the Levenberg-Marquardt algorithm, at each iteration, each parameter is replaced by a new estimate :


By setting , we get:


where and are expressed as functions of .

Let us detail the expression of with respect to : during the pairwise optimization and remain fixed (Figure 6). Let be the origin of a local reference frame with the x-axis aligned with . In this frame, the coordinates write , and while a point has coordinates . Then , .

Let us assume that projects on (the case can be deduced with minor changes). Using the same notation as in figure 23 and appendix B, recall that . Since when optimizing the orthogonal projection on does not change, remains the same. However both and change. Since with , we get:


Simple geometric considerations give , and , whose differentiation with respect to is easy.

One must also express distances as functions of . In that case, the projection on bone is slightly different, since the position of point changes with . The formulas are only slightly modified by it, but this time also depends on . We get:


The full expression for the derivatives can be easily computed using the following formulas:

Plugging all the derivatives in Equation 18 yields , and can be updated as . This impacts the position of , whose new position is computed as , and is recomputed as : .

The two-bones energy is then optimized with respect to . This optimization is symmetric to the case above and can be easily adapted. Finally, the optimization of the radius of the common joint and rotation angle around axis are done in a similar manner.