---
layout: post
title: Neural scene representation and rendering
byline: Eslami & Rezende et al.
doi: 10.1126/science.aar6170
tags:
  - neural-network
  - machine-learning
  - scene-representation
  - rendering
  - visual-system
  - vision
  - group:deepmind
summary: GQNs learn to represent 3D scenes and can predict their layout based upon 2D images.
---

DeepMind published this exciting work, which generates a scene representation from simple static imagery. Much like a young animal (or human!), this network begins with no understanding of how the world works or is organized; it has no notion of photon trajectories or raytracing. Instead, it's given a handful of 2D images of a scene taken from different viewpoints and must build up its own internal representation from those observations alone.

This system is dubbed a Generative Query Network (GQN). A GQN consists of two components: a representation network and a generation network.

The representation network converts an image into a scene representation, much as you might infer a 3D scene from a photograph, and the generation network then re-renders that scene from a different perspective. Given enough training data, this means the network can produce a 2D rendering of a scene from an angle other than that of the original photograph.
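
To make the two-part structure concrete, here is a minimal, hypothetical PyTorch sketch of how a representation network and a generation network could fit together. This is not the architecture from the paper (the real generation network is a recurrent latent-variable model and the encoders are much larger); the class names, the 5-D viewpoint encoding, and the `predict_view` helper are all assumptions made for illustration. The point is the pipeline: encode each context image together with its camera viewpoint, sum the per-view representations into a single scene representation, then decode that representation from a query viewpoint.

```python
import torch
import torch.nn as nn

VIEW_DIM = 5  # hypothetical camera encoding, e.g. (x, y, z, yaw, pitch)

class RepresentationNet(nn.Module):
    """Encodes one (image, viewpoint) pair into a scene representation vector."""
    def __init__(self, repr_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64 + VIEW_DIM, repr_dim)

    def forward(self, image, viewpoint):
        feat = self.conv(image).flatten(1)                    # (B, 64)
        return self.fc(torch.cat([feat, viewpoint], dim=1))   # (B, repr_dim)

class GenerationNet(nn.Module):
    """Decodes the aggregated scene representation plus a query viewpoint into an image."""
    def __init__(self, repr_dim=256):
        super().__init__()
        self.fc = nn.Linear(repr_dim + VIEW_DIM, 64 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, scene_repr, query_viewpoint):
        h = self.fc(torch.cat([scene_repr, query_viewpoint], dim=1))
        return self.deconv(h.view(-1, 64, 8, 8))               # (B, 3, 32, 32) predicted view

def predict_view(repr_net, gen_net, context_images, context_viewpoints, query_viewpoint):
    # Encode each context observation and sum the results: the summation is
    # permutation-invariant, so the order of the context views does not matter.
    r = sum(repr_net(img.unsqueeze(0), vp.unsqueeze(0))
            for img, vp in zip(context_images, context_viewpoints))
    return gen_net(r, query_viewpoint.unsqueeze(0))
```

The summation is what lets the sketch accept any number of context views in any order; aggregating more views gives the generator less uncertainty to resolve when it renders the queried viewpoint.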

The animations (link below) of a camera moving through these scenes are really interesting: the system quite consistently learns the shapes of simple geometric objects and projects them accurately into the new rendering, even though object identity, position, rotation, and color aren't explicitly labeled in the network's training data.

You can read more about this work on DeepMind's blog.