Gestural Exploration of Neural Network Latent Spaces

Collaboration with Aman Tiwari, Spring 2019

Currently accepted to IEEE-GEM 2019 as Xoromancy: Image Creation via Gestural Control of High-Dimensional Spaces.
SIGGRAPH 2019 submission pending.
To be shown at New York Live Arts in May 2019.

Xoromancy explores methods for hand-gestural control of high-dimensional spaces in the real-time production of images by generative adversarial networks.

Correlating proprioception with visual feedback builds an intuitive understanding of how to navigate the highly nonlinear mappings between input dimensions and generated output imagery.

Xoromancy uses a Leap Motion sensor to map the movement and rotation of the hands onto the input vectors of BigGAN, trained on ImageNet, which outputs a sequence of images generated in real time. This direct engagement allows users to develop an embodied fluency in exploring high-dimensional spaces.
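As a rough illustration of this kind of mapping, the sketch below projects a hand pose onto a latent vector. It is not Xoromancy's actual mapping, which is not specified here. The 12-value pose layout, the 128-dimensional latent size, the fixed random projection, and the names pose_to_latent, PROJECTION, and limit are all assumptions made for the example, and the pose values are assumed to be already read from the tracker and normalized.

```python
import math
import random

LATENT_DIM = 128  # assumed latent size; it differs between BigGAN variants
POSE_DIM = 12     # assumed layout: two hands x (x, y, z, pitch, yaw, roll)

# Fixed random projection from pose space to latent space (an illustrative
# choice). It is seeded so the same gesture always yields the same vector.
_rng = random.Random(0)
PROJECTION = [
    [_rng.gauss(0.0, 1.0) / math.sqrt(POSE_DIM) for _ in range(POSE_DIM)]
    for _ in range(LATENT_DIM)
]


def pose_to_latent(pose, limit=2.0):
    """Map a normalized hand pose to a latent vector clamped to [-limit, limit]."""
    if len(pose) != POSE_DIM:
        raise ValueError(f"expected {POSE_DIM} pose values, got {len(pose)}")
    latent = []
    for row in PROJECTION:
        # Every latent dimension is a weighted sum of all pose values,
        # so one hand movement changes many dimensions at once.
        value = sum(w * p for w, p in zip(row, pose))
        latent.append(max(-limit, min(limit, value)))
    return latent


# Hands at rest (all pose values zero) map to the origin of the latent space.
resting_latent = pose_to_latent([0.0] * POSE_DIM)
```

Because the projection is linear and the clamp never amplifies differences, small hand movements produce small changes in the latent vector, which is the property that lets motion and image change stay correlated.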

Many tools for exploring the latent space of generative adversarial networks (GANs) provide only stepwise, image-by-image traversal. By using hand tracking to control many latent dimensions simultaneously, with real-time image response, Xoromancy enables rapid and fluent exploration.

It is the first interactive real-time tool leveraging the human body’s proprioception and fine motor control to control the generation of images by neural networks. These images, produced by BigGAN (Brock et al., 2018), are high-resolution and can often represent, near-photorealistically, objects and scenes that would be time-consuming to render by other means.

Artistically, these BigGAN images range from the highly realistic to the distinctly surreal, often containing perceptual cues that convey realistic lighting, texture, and form, but with decidedly unreal subjects.

However, the input spaces of BigGAN and similar networks are extremely high-dimensional, which makes them prohibitively difficult to explore and conceptualize. Exhaustive exploration is intractable, and random exploration is slow and often unrewarding.

We recognize that the human body is itself a continuous, high-dimensional control system, and that modern body tracking, such as the Leap Motion sensor, enables the body’s motion to drive the high-dimensional input vector controlling the GAN’s visual output.

Although the possible dimensions of bodily movement are many, only a subset are mechanically comfortable to control and conceptually intuitive. The input mappings chosen for Xoromancy support both fast exploration through a variety of images and specific refinement towards a goal once an image has been found.

Xoromancy includes distinct gesture categories to independently control the subject of the generated image and its specific composition and form.
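One way to picture this separation is to treat the generator's input as two parts that are updated by different gestures: a class vector that sets the subject, and a latent vector that sets composition and form. The sketch below is illustrative only and is not the project's implementation. GanInput, blend_classes, apply_subject_gesture, and apply_form_gesture are hypothetical names, the 128-dimensional latent size is assumed, and representing the subject as a weighted blend of two ImageNet class indices is an assumption.

```python
from dataclasses import dataclass, replace

NUM_CLASSES = 1000  # ImageNet has 1000 classes
LATENT_DIM = 128    # assumed latent size; it differs between BigGAN variants


@dataclass(frozen=True)
class GanInput:
    class_vector: tuple  # what is depicted (subject)
    latent: tuple        # how it is depicted (composition and form)


def blend_classes(class_a, class_b, t):
    """Return a class vector that moves from class_a (t=0) to class_b (t=1)."""
    t = max(0.0, min(1.0, t))
    weights = [0.0] * NUM_CLASSES
    weights[class_a] += 1.0 - t
    weights[class_b] += t
    return tuple(weights)


def apply_subject_gesture(state, class_a, class_b, t):
    """A subject gesture changes only the class vector; the latent is kept."""
    return replace(state, class_vector=blend_classes(class_a, class_b, t))


def apply_form_gesture(state, delta):
    """A form gesture offsets only the latent; the class vector is kept."""
    if len(delta) != len(state.latent):
        raise ValueError("delta must match the latent dimensionality")
    moved = tuple(z + d for z, d in zip(state.latent, delta))
    return replace(state, latent=moved)


# Start from a single class with a zero latent, then apply one gesture of each kind.
state = GanInput(blend_classes(1, 1, 0.0), (0.0,) * LATENT_DIM)
state = apply_subject_gesture(state, 1, 2, 0.25)      # subject changes, form kept
state = apply_form_gesture(state, [0.1] * LATENT_DIM)  # form changes, subject kept
```

Keeping the two updates in separate functions means a user could, in principle, settle on a subject with one gesture and then adjust its composition with another without losing the subject.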