---
layout: post
title: Neural scene representation and rendering
byline: Eslami & Rezende et al.
doi: 10.1126/science.aar6170
tags:
  - neural-network
  - machine-learning
  - scene-representation
  - rendering
  - visual-system
  - vision
  - group:deepmind
summary: GQNs learn to represent 3D scenes and can predict their layout based upon 2D images.
---

DeepMind published this exciting work, which generates a scene representation from simple static imagery. Much like a young animal (or human!), this network begins with no understanding of how the world works or is organized; it has no notion of photon trajectories or raytracing. Instead, it's given a handful of 2D images of a scene taken from different viewpoints and must build up its own internal representation from those observations alone.

This system is dubbed a Generative Query Network (GQN). A GQN consists of two components: a representation network and a generation network.

The representation network converts an image into a scene representation, much as you might infer a 3D scene from a photograph, and the generation network then re-renders that scene from a different perspective. Given enough training data, this means the network can produce a 2D rendering of a scene from an angle other than that of the original photograph.
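
To make the two-part structure concrete, here is a minimal, hypothetical PyTorch sketch of how a representation network and a generation network could fit together. This is not the architecture from the paper (the real generation network is a recurrent latent-variable model and the encoders are much larger); the class names, the 5-D viewpoint encoding, and the `predict_view` helper are all assumptions made for illustration. The point is the pipeline: encode each context image together with its camera viewpoint, sum the per-view representations into a single scene representation, then decode that representation from a query viewpoint.

```python
import torch
import torch.nn as nn

VIEW_DIM = 5  # hypothetical camera encoding, e.g. (x, y, z, yaw, pitch)

class RepresentationNet(nn.Module):
    """Encodes one (image, viewpoint) pair into a scene representation vector."""
    def __init__(self, repr_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64 + VIEW_DIM, repr_dim)

    def forward(self, image, viewpoint):
        feat = self.conv(image).flatten(1)                    # (B, 64)
        return self.fc(torch.cat([feat, viewpoint], dim=1))   # (B, repr_dim)

class GenerationNet(nn.Module):
    """Decodes the aggregated scene representation plus a query viewpoint into an image."""
    def __init__(self, repr_dim=256):
        super().__init__()
        self.fc = nn.Linear(repr_dim + VIEW_DIM, 64 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, scene_repr, query_viewpoint):
        h = self.fc(torch.cat([scene_repr, query_viewpoint], dim=1))
        return self.deconv(h.view(-1, 64, 8, 8))               # (B, 3, 32, 32) predicted view

def predict_view(repr_net, gen_net, context_images, context_viewpoints, query_viewpoint):
    # Encode each context observation and sum the results: the summation is
    # permutation-invariant, so the order of the context views does not matter.
    r = sum(repr_net(img.unsqueeze(0), vp.unsqueeze(0))
            for img, vp in zip(context_images, context_viewpoints))
    return gen_net(r, query_viewpoint.unsqueeze(0))
```

The summation is what lets the sketch accept any number of context views in any order; aggregating more views gives the generator less uncertainty to resolve when it renders the queried viewpoint.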

The animations (link below) of a camera moving through these scenes are really interesting: the system quite consistently learns the shapes of simple geometric objects and projects them accurately into the new rendering, even though object identity, position, rotation, and color aren't explicitly labeled in the network's training data.

You can read more about this work on DeepMind's blog.