June 15, 2018 report

Generative query network lets computer create multi-view 3-D model from 2-D photographs

by Bob Yirka, Tech Xplore

A team of researchers working with Google's DeepMind division in London has developed what they describe as a Generative Query Network (GQN), a system that allows a computer to create, from 2-D photographs, a 3-D model of a scene that can be viewed from different angles. In their paper published in the journal Science, the team describes the new type of neural network system and what it represents. They also offer a more personal take on their project in a post on their website. Matthias Zwicker, with the University of Maryland, offers a Perspective on the work done by the team in the same journal issue.

In computer science, big jumps in systems engineering can seem small because the results look so simple; it is not until someone applies the results that the big leap is truly recognized. This was the case, for example, when the first systems began to appear that were able to listen to what a person says and extract meaning from it. In this new endeavor, the team at DeepMind might have made a similar leap.

In traditional computer applications, including deep learning networks, a computer must be spoon-fed labeled data in order to behave as if it has learned something. That is not the case for the GQN, which learns purely from observation, like human infants. The system can observe a real-world scene, such as blocks sitting on a table, and then build a model of it that can show the scene from other angles. At first glance, as Zwicker notes, this might not seem all that groundbreaking. It is only when considering what the system must do to come up with those new angles that the real power of the system becomes clear. Using only the 2-D information provided by cameras, it has to look at the scene and infer characteristics of occluded objects that it cannot observe directly. There is no radar or depth finder, and no images of what blocks are supposed to look like stored in its data banks. All it has to work with are the few photographs it takes.

Accomplishing this, the team explains, involves using two neural networks: one to analyze the scene, and the other to use the resulting data to create a 3-D model of it that can be viewed from angles not shown in the photographs. There is much more work to be done, of course, most obviously determining whether the approach can be broadened to more complex objects. But even in its primitive form, it clearly represents a new way to allow computers to learn.
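To make the two-network arrangement concrete, here is a toy sketch in plain Python of how data might flow through such a system. It is not DeepMind's implementation. The image size, the seven-number camera pose, the single random (untrained) weight matrix per network, and the choice to combine views by summing them are all assumptions made for illustration, and the names represent and generate are hypothetical.

```python
import math
import random

# All sizes below are illustrative assumptions, not values from the article.
IMG_PIXELS = 4 * 4 * 3   # a tiny flattened RGB "photograph"
VIEW_DIM = 7             # numbers describing where the camera is and where it points
REPR_DIM = 8             # length of the internal scene description

random.seed(0)


def random_matrix(rows, cols):
    """Untrained random weights standing in for learned parameters."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


W_REPR = random_matrix(IMG_PIXELS + VIEW_DIM, REPR_DIM)
W_GEN = random_matrix(REPR_DIM + VIEW_DIM, IMG_PIXELS)


def linear(vector, weights):
    """Multiply a row vector by a weight matrix."""
    return [sum(v * row[j] for v, row in zip(vector, weights))
            for j in range(len(weights[0]))]


def represent(images, viewpoints):
    """Network 1: encode each (photo, camera pose) pair and sum the results."""
    scene = [0.0] * REPR_DIM
    for image, viewpoint in zip(images, viewpoints):
        encoded = [math.tanh(a) for a in linear(image + viewpoint, W_REPR)]
        scene = [s + e for s, e in zip(scene, encoded)]
    return scene


def generate(scene, query_viewpoint):
    """Network 2: predict the pixels seen from a camera pose that was not observed."""
    logits = linear(scene + query_viewpoint, W_GEN)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid: values in (0, 1)


# Three made-up photographs of one scene, each with its camera pose.
images = [[random.random() for _ in range(IMG_PIXELS)] for _ in range(3)]
viewpoints = [[random.gauss(0.0, 1.0) for _ in range(VIEW_DIM)] for _ in range(3)]
query = [random.gauss(0.0, 1.0) for _ in range(VIEW_DIM)]

scene = represent(images, viewpoints)
predicted = generate(scene, query)
print(len(scene), len(predicted))
```

Because the weights here are random, the predicted image is meaningless; the sketch only shows the shape of the pipeline. Summing the per-photo encodings is one simple way to make the scene description independent of the order and number of photographs supplied.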

GQN agent “imagining” new viewpoints in rooms with multiple objects. Credit: DeepMind

GQN agent operating in partially observed maze environments. Credit: DeepMind

GQN agent performing the Shepard-Metzler object rotation task. Credit: DeepMind

More information: S. M. Ali Eslami et al. Neural scene representation and rendering, Science (2018). DOI: 10.1126/science.aar6170

Abstract
Scene representation—the process of converting visual sensory data into concise descriptions—is a requirement for intelligent behavior. Recent work has shown that neural networks excel at this task when provided with large, labeled datasets. However, removing the reliance on human labeling remains an important open problem. To this end, we introduce the Generative Query Network (GQN), a framework within which machines learn to represent scenes using only their own sensors. The GQN takes as input images of a scene taken from different viewpoints, constructs an internal representation, and uses this representation to predict the appearance of that scene from previously unobserved viewpoints. The GQN demonstrates representation learning without human labels or domain knowledge, paving the way toward machines that autonomously learn to understand the world around them.

Journal information: Science

Citation: Generation query network lets computer create multi-view 3-D model from 2-D photographs (2018, June 15) retrieved 18 April 2024 from https://techxplore.com/news/2018-06-query-network-multi-view-d.html
