Learned Interpolation for 3D Generation

12/08/2019 ∙ by Austin Dill, et al. ∙ Carnegie Mellon University

In order to generate novel 3D shapes with machine learning, one must allow for interpolation. The typical approach for incorporating this creative process is to interpolate in a learned latent space so as to avoid the problem of generating unrealistic instances by exploiting the model's learned structure. The process of the interpolation is supposed to form a semantically smooth morphing. While this approach is sound for synthesizing realistic media such as lifelike portraits or new designs for everyday objects, it subjectively fails to directly model the unexpected, unrealistic, or creative. In this work, we present a method for learning how to interpolate point clouds. By encoding prior knowledge about real-world objects, the intermediate forms are both realistic and unlike any existing forms. We show not only how this method can be used to generate "creative" point clouds, but how the method can also be leveraged to generate 3D models suitable for sculpture.


1 Introduction

Merging abstract concepts, also known as conceptual blending, is frequently seen as a fundamental component of creativity, as this merging allows novel concepts to emerge from simple components Pereira (2007). One computational way to approach this is interpolation, a process that smoothly transitions from one instance to another.

In order to generate novel 3D shapes with machine learning, one must allow for such interpolations. The typical approach for incorporating this creative process is to interpolate in a learned latent space, exploiting the model's learned structure to avoid generating unrealistic instances. For 2D images, this often relies on a trained Generative Adversarial Network Radford et al. (2015); Brock et al. (2018) or Autoencoder Kingma and Welling (2013); Tolstikhin et al. (2017), an approach that has shown promising results in creative generation Carter and Nielsen (2017). As a basic requirement, the interpolation is expected to form a semantically smooth morphing Berthelot et al. (2018). While this approach is sound for synthesizing realistic media such as lifelike portraits or new designs for everyday objects, it subjectively fails to directly model the unexpected, unrealistic, or creative Bidgoli and Veloso (2019); Kingma and Dhariwal (2018).
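For concreteness, a minimal sketch of this latent-space recipe in Python, assuming a pretrained autoencoder whose `encode` and `decode` functions are hypothetical placeholders (this is the baseline approach the literature above uses, not our method):

```python
import numpy as np

def latent_interpolation(x_a, x_b, encode, decode, steps=10):
    """Interpolate between two instances through a learned latent space.

    `encode`/`decode` stand in for a trained autoencoder's networks;
    they are hypothetical placeholders, not part of this paper's method.
    """
    z_a, z_b = encode(x_a), encode(x_b)
    outputs = []
    for alpha in np.linspace(0.0, 1.0, steps):
        z = (1.0 - alpha) * z_a + alpha * z_b  # linear blend of latent codes
        outputs.append(decode(z))              # map each code back to data space
    return outputs
```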

In this work, we present a method for learning how to interpolate point clouds. By encoding prior knowledge about real-world objects, the intermediate forms are both realistic and unlike any existing forms. We show not only how this method can be used to generate "creative" point clouds, but how the method can also be leveraged to generate 3D models suitable for sculpture.

2 Interpolation as Generation

Figure 1: Learned interpolation between an airplane and a chair.

2.1 Naive Interpolation

Consider two point clouds $X = \{x_1, \dots, x_n\}$ and $Y = \{y_1, \dots, y_n\}$, where each $x_i, y_i \in \mathbb{R}^3$ represents a 3-dimensional point. To generate a point cloud $Z_\alpha$ that semantically lies in between the inputs $X$ and $Y$, a straightforward method is to interpolate linearly between paired points:

$Z_\alpha = \{(1 - \alpha)\, x_i + \alpha\, y_i \mid i = 1, \dots, n\}, \quad \alpha \in [0, 1].$

Intuitively, this draws a line between each pair of points in $X$ and $Y$ and returns the point lying a fraction $\alpha$ of the distance along it. While this is simple to implement, it does not produce results that are semantically in between the objects the point clouds represent, as can be seen in Figure 2.
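A minimal NumPy sketch of this naive scheme, assuming the two clouds are stored as $(n, 3)$ arrays with a fixed pairing between rows:

```python
import numpy as np

def naive_interpolation(X, Y, alpha):
    """Linearly blend paired point clouds X, Y of shape (n, 3).

    Each output point lies a fraction `alpha` of the way along the
    segment from X[i] to Y[i]; alpha=0 returns X, alpha=1 returns Y.
    """
    assert X.shape == Y.shape, "clouds must be paired point-for-point"
    return (1.0 - alpha) * X + alpha * Y

# Example (load_point_cloud is a hypothetical loader):
# X = load_point_cloud("airplane.npy"); Y = load_point_cloud("chair.npy")
# Z_mid = naive_interpolation(X, Y, alpha=0.5)
```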

2.2 Learned Point Cloud Interpolation

While prior algorithms for generating creative point clouds have relied on a pretrained model, our method directly learns a transformation from point clouds to point clouds, using interpolation as the guiding framework. We parameterize interpolation from a start point cloud $X$ to a goal point cloud $Y$: at each time step $t$, we apply a learned transformation to each point independently, given an encoding of $Y$, denoted $E(Y)$:

$x_i^{t+1} = f_t\big(x_i^t, E(Y)\big), \quad t = 0, \dots, T - 1.$

In each of the above transformations, the function $f_t$ is a multilayer perceptron Rosenblatt (1958). All of the transformations are trained so that the set $X^T$ produced after $T$ transformations and the goal $Y$ are close in Chamfer distance, an error metric frequently used in set generation tasks Achlioptas et al. (2017). The encoding network $E$ is parameterized as a Deep Sets model Zaheer et al. (2017).
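A condensed PyTorch sketch of this training setup; the layer widths, the residual form of the per-point update, and the number of steps $T$ are our assumptions (the text above does not fix them), while the Chamfer distance and Deep Sets encoder follow the cited formulations only in outline:

```python
import torch
import torch.nn as nn

def chamfer_distance(A, B):
    """Symmetric Chamfer distance between point sets A (n, 3) and B (m, 3)."""
    d = torch.cdist(A, B)                         # (n, m) pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

class DeepSetsEncoder(nn.Module):
    """Permutation-invariant encoding of the goal cloud (Zaheer et al., 2017)."""
    def __init__(self, dim=128):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.rho = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, Y):                         # Y: (m, 3)
        return self.rho(self.phi(Y).sum(dim=0))   # sum-pool over points -> (dim,)

class StepMLP(nn.Module):
    """One learned transformation f_t, applied to each point independently."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3 + dim, dim), nn.ReLU(), nn.Linear(dim, 3))

    def forward(self, X, code):                   # X: (n, 3), code: (dim,)
        cond = code.expand(X.shape[0], -1)        # broadcast goal encoding per point
        return X + self.net(torch.cat([X, cond], dim=1))  # residual update (our choice)

T = 8                                             # number of steps: our assumption
encoder, steps = DeepSetsEncoder(), nn.ModuleList(StepMLP() for _ in range(T))
opt = torch.optim.Adam([*encoder.parameters(), *steps.parameters()], lr=1e-3)

def train_step(X, Y):                             # X, Y: (n, 3) tensors
    code = encoder(Y)
    for f_t in steps:                             # the intermediate clouds form
        X = f_t(X, code)                          # the interpolation trajectory
    loss = chamfer_distance(X, Y)                 # match the goal after T steps
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```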

Figure 2: Naive interpolation fails to produce an interesting midpoint.

While this formulation lets us visualize the trajectory of each point as it is transformed and gives us enough expressivity to approximate the goal point clouds, it does not enforce the requirement that each intermediate point cloud be realistic. For example, it could allow a mapping through a completely meaningless intermediate state that would not be recognized as a plausible (if unusual) 3D object.

For this reason, we introduce an additional loss term motivated by work in computer graphics on reversible harmonic maps Ezuz et al. (2018).

With this added term, we are able to maintain the topology of the source object, causing the network to find the most plausible correspondence between the source object and the target object. The loss penalizes only the final output of the algorithm, but it has the side effect of keeping each intermediate step topologically consistent as well.
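The exact form of this loss is not spelled out here; purely as an illustration of a topology-preserving penalty, the sketch below substitutes a simple edge-length-preservation term over the source cloud's k-nearest-neighbor graph (our stand-in, not the authors' exact term):

```python
import torch

def edge_preservation_loss(X0, XT, k=8):
    """Penalize distortion of local structure between source X0 and output XT.

    Hypothetical stand-in for the topology term: fix the k-nearest-neighbor
    graph of the source cloud and penalize changes in its edge lengths.
    X0, XT: (n, 3) tensors with corresponding rows.
    """
    d0 = torch.cdist(X0, X0)                            # (n, n) source distances
    knn = d0.topk(k + 1, largest=False).indices[:, 1:]  # drop the self-neighbor
    e0 = torch.gather(d0, 1, knn)                       # source edge lengths
    dT = torch.cdist(XT, XT)
    eT = torch.gather(dT, 1, knn)                       # same edges after transform
    return ((eT - e0) ** 2).mean()
```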

2.3 Mesh Generation

Figure 3: Generated meshes from interpolating between pairs of objects (panels: toilet and plant; airplane and person; piano and bowl; person and plant).

This technique allows one to use the vertices from a mesh as input, providing the correspondence needed to mesh the output. This removes the problem of meshing for creative sculpture generation, the motivating factor for prior creative-sculpture algorithms Ge et al. (2019). In the authors' opinion, the conflict between the representational advantage of point clouds for machine learning tools and the artist's frequent need for solid 3D shapes has limited the adoption of generative models for sculptural art. Our method therefore represents an important step forward for creative AI.
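A short sketch of how that correspondence can be exploited, assuming the `trimesh` library; `interpolate_vertices` is a hypothetical stand-in for the trained network. Because the output points are the transformed source vertices, the source face indices carry over unchanged:

```python
import trimesh

def mesh_from_interpolation(source_mesh, transformed_vertices):
    """Re-mesh a transformed point cloud using the source mesh's faces.

    Each output point corresponds to one source vertex, so the source
    connectivity (faces) applies to the new vertex positions directly.
    """
    return trimesh.Trimesh(vertices=transformed_vertices,
                           faces=source_mesh.faces)

# src = trimesh.load("airplane.obj")              # hypothetical asset
# V_out = interpolate_vertices(src.vertices)      # trained network (stand-in)
# mesh_from_interpolation(src, V_out).export("sculpture.obj")
```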

References

  • P. Achlioptas, O. Diamanti, I. Mitliagkas, and L. Guibas (2017) Learning representations and generative models for 3d point clouds. arXiv preprint arXiv:1707.02392. Cited by: §2.2.
  • D. Berthelot, C. Raffel, A. Roy, and I. Goodfellow (2018) Understanding and improving interpolation in autoencoders via an adversarial regularizer. arXiv preprint arXiv:1807.07543. Cited by: §1.
  • A. Bidgoli and P. Veloso (2019) DeepCloud: the application of a data-driven, generative model in design. arXiv preprint arXiv:1904.01083. Cited by: §1.
  • A. Brock, J. Donahue, and K. Simonyan (2018) Large scale gan training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096. Cited by: §1.
  • S. Carter and M. Nielsen (2017) Using artificial intelligence to augment human intelligence. Distill 2 (12), pp. e9. Cited by: §1.
  • D. Ezuz, J. Solomon, and M. Ben-Chen (2018) Reversible harmonic maps between discrete surfaces. arXiv preprint arXiv:1801.02453. Cited by: §2.2.
  • S. Ge, A. Dill, E. Kang, C. Li, L. Zhang, M. Zaheer, and B. Poczos (2019) Developing creative AI to generate sculptural objects. arXiv preprint arXiv:1908.07587. Cited by: §2.3.
  • D. P. Kingma and M. Welling (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114. Cited by: §1.
  • D. P. Kingma and P. Dhariwal (2018) Glow: generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pp. 10215–10224. Cited by: §1.
  • F. C. Pereira (2007) Creativity and artificial intelligence: a conceptual blending approach. Vol. 4, Walter de Gruyter. Cited by: §1.
  • A. Radford, L. Metz, and S. Chintala (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434. Cited by: §1.
  • F. Rosenblatt (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65 (6), pp. 386. Cited by: §2.2.
  • I. Tolstikhin, O. Bousquet, S. Gelly, and B. Schoelkopf (2017) Wasserstein auto-encoders. arXiv preprint arXiv:1711.01558. Cited by: §1.
  • M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. R. Salakhutdinov, and A. J. Smola (2017) Deep sets. In Advances in neural information processing systems, pp. 3391–3401. Cited by: §2.2.