19 November 2021

Voxel Chiseling

Learning Sculptural Representations

Over the past decade, generative models have increasingly been used to explore, support and understand art and creativity. Unsupervised learning models are often used to generate visual works that challenge our contemporary views on art. Although generative models have proven promising in 2D, comparatively little work has been done in the 3D generative space.

During my master's thesis at the University of the Arts London (UAL), I explored 3D shape synthesis using unsupervised learning models. The study asked how sculptural art might arise from generative models: to what extent can deep learning systems learn representations that we might deem sculptural? Concretely, the aim was to have a ProGAN predict new volumetric representations from a data distribution. The results show a promising method for computing occupancy grids with a generative adversarial network (GAN).

Rendered animation of all the model’s predictions over the training process

Voxel Chiseling

Similar to the process of a classical sculptor, voxel chiseling is the method of learning which bits to remove from a cube in order to synthesise a geometry. The classical sculptor chisels away bits of marble from a cube, exposing the geometry that was hidden in the solid primitive. The Sculpture GAN learns geometrical representations in a similar way. Starting with a voxel grid of size 28^3, the model learns at which indices it should remove a voxel in order to generate a 3D geometry of a given shape.
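
To make the chiseling idea concrete, here is a minimal NumPy sketch. It is not the thesis implementation: the `chisel` helper and the sphere-shaped removal mask are hypothetical stand-ins for whatever the trained model decides to remove. Only the 28^3 grid size comes from the text.

```python
import numpy as np

GRID_SIZE = 28  # side length from the text: a 28^3 voxel grid


def chisel(block, remove_mask):
    """Return a copy of `block` with voxels cleared wherever `remove_mask` is True."""
    carved = block.copy()
    carved[remove_mask] = False
    return carved


# Start from a solid cube: every voxel is occupied.
block = np.ones((GRID_SIZE, GRID_SIZE, GRID_SIZE), dtype=bool)

# Hypothetical target shape (an assumption for illustration): a sphere.
# In the post, the model learns which indices to remove instead.
z, y, x = np.indices(block.shape)
centre = (GRID_SIZE - 1) / 2
radius = GRID_SIZE / 2 - 1
outside_sphere = (x - centre) ** 2 + (y - centre) ** 2 + (z - centre) ** 2 > radius ** 2

sculpture = chisel(block, outside_sphere)
print("voxels before:", int(block.sum()), "voxels after:", int(sculpture.sum()))
```

The original block is left untouched, so the same cube can be carved repeatedly with different masks.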

A cube consisting of 28^3 voxels

Learning Sculptural Representations

In 2014, Ian J. Goodfellow and colleagues proposed the Generative Adversarial Network (GAN). A GAN is a machine learning model that combines two separate networks, a Generator and a Discriminator. The Generator is trained to produce fake samples, and the Discriminator tries to classify whether a given sample comes from the dataset or was created by the Generator. This framework has proved to be an extremely powerful class of neural networks for unsupervised learning.
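
The competition between the two networks can be written as the minimax objective from the 2014 GAN paper. The notation is the paper's, not this post's: G is the Generator, D the Discriminator, x a sample from the dataset, and z a noise vector drawn from a prior p_z. It describes the general GAN framework and may differ from the exact loss used to train the Sculpture GAN.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The Discriminator maximises V by scoring real samples high and generated samples low. The Generator minimises V by producing samples that the Discriminator scores as real.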

The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles. — Ian J. Goodfellow, 2014
Data augmentation (3D noise) to increase the size of the dataset

The GAN that resulted from my thesis predicts volumetric maps that are similar to the targets. Beyond that, it produces a certainty score for each voxel index, which provides insight into the geometrical features of an embedding. Each score indicates how sure the network is that a voxel should be placed at that index.
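
One way to read such per-voxel certainties is sketched below. It rests on assumptions the post does not confirm: that the scores lie between 0 and 1, and that a fixed threshold turns them into an occupancy grid. The `to_occupancy` helper, the 0.5 threshold, the 0.1 ambiguity band and the random stand-in data are all hypothetical.

```python
import numpy as np


def to_occupancy(certainty, threshold=0.5):
    """Mark a voxel as occupied when its certainty reaches the threshold (assumed rule)."""
    return certainty >= threshold


# Stand-in for a model prediction: random scores in [0, 1) on a 28^3 grid.
rng = np.random.default_rng(0)
certainty = rng.random((28, 28, 28))

occupancy = to_occupancy(certainty)

# Voxels whose score sits close to the threshold are the ones the network
# is least sure about; the 0.1 band is an arbitrary choice for illustration.
ambiguous = np.abs(certainty - 0.5) < 0.1

print("occupied voxels:", int(occupancy.sum()))
print("ambiguous voxels:", int(ambiguous.sum()))
```

Keeping the raw scores alongside the thresholded grid lets a render show confident voxels differently from uncertain ones.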

Rendered prediction during training process (left) and Interpretation of “Le Penseur” (Auguste Rodin, 1904) (right)