Wednesday, 5 October 2016

Google teaches robots to teach each other

23:19 Posted by Anonymous No comments

Poet John Donne said, "no man is an island," and that is even more true for robots. While we humans can share our experiences and expertise through language and demonstration, robots have the potential to instantly share all the skills they've learned with other robots simply by transmitting the information over a network. It is this "cloud robotics" that Google researchers are looking to take advantage of to help robots gain skills more quickly.

The human brain has billions of neurons, and between them they form an unfathomable number of connections. As we think and learn, neurons interact with each other and certain pathways that correspond to rewarding behavior will be reinforced over time so that those pathways are more likely to be chosen again in future, teaching us and shaping our actions.

An artificial neural network follows a similar structure on a smaller scale. Robots may be given a task and instructed to employ trial and error to determine the best way to achieve it. Early on, their behavior may look totally random to an outside observer, but by trying out different things, over time they'll learn which actions get them closer to their goals and will focus on those, continually improving their abilities.

While effective, this process is time-consuming, which is where cloud robotics comes in. Rather than have every robot go through this experimentation phase individually, their collective experiences can be shared, effectively allowing one robot to teach another how to perform a simple task, like opening a door or moving an object. Periodically, the robots upload what they've learned to the server, and download the latest version, meaning each one has a more comprehensive picture than any would through their individual experience.

Using this cloud-based learning, the Google Research team ran three different types of experiments, teaching the robots in different ways to find the most efficient and accurate way for them to build a common model of a skill.

First, multiple robots on a shared neural network were tasked with opening a door through trial and error alone. As the video below shows, at first they seem to be blindly fumbling around as they explore actions and figure out which ones get them closer to the goal.

After a few hours of experimentation, the robots collectively worked out how to open the door: reach for the handle, turn and pull. They understand that these actions lead to success, without necessarily building an explicit model of why that works.

In the second experiment, the researchers tested a predictive model. Robots were given a tray full of everyday objects to play with, and as they push and poke them around, they develop a basic understanding of cause and effect. Once again, their findings are shared, and the group can then use their ever-improving cause and effect model to predict which actions will lead to which results.

Using a computer interface showing the test area, the researchers could then tell the robots where to move an object by clicking on it, and then clicking a location. With the desired effect known, the robot can draw on its shared past experiences to find the best actions to achieve that goal.

Finally, the team employed human guidance to teach another batch of robots the old "open the door" trick. Each robot was physically moved by a researcher through the steps to the goal, and then playing that chain of action back forms a "policy" for opening the door that the robots can draw from.

Then, the robots were tasked with using trial and error to improve this policy. As each robot explored slight variations in how to perform the task, they got better at the job, up to the point where their collective experience allowed them to account for slight variations in the door and handle positions, and eventually, using a type of handle that none of the robots had actually encountered before.

So what's the point of all this? For neural networks, the more data the better, so a team of robots learning simultaneously and teaching each other can produce better results much faster than a single robot working alone. Speeding up that process could open the door for robots to tackle more complex tasks sooner.

0 comments:

Post a Comment