Last week marked the kickoff of Neural Information Processing Systems (NeurIPS), one of the largest AI and machine learning conferences in the world. NeurIPS 2017 and NeurIPS 2018 received 3,240 and 4,854 research paper submissions, respectively, and this year’s event — which took place from December 8 to December 14 in Vancouver, Canada — handily broke those records with around 6,600 submissions. More than 4,200 people queued in the registration line on Sunday afternoon, and all told, over 13,000 people attended, up 40% from the prior conference.
One particularly active category of research this year was robotics, which saw workshop and paper contributions from Intel, the University of California, Berkeley, and others. Perhaps the most intriguing of these were novel approaches to training a team of machines to jointly solve a problem, and a multi-stage learning technique that uses pixel-level translation of human videos to train robots to complete tasks.
Multi-stage task learning
Researchers at Berkeley’s department of electrical engineering and computer sciences designed a system that aims to reduce human burden, at least where defining a task and resetting an environment is concerned. Their framework — AVID — translates human instructions for each step into robot instructions via a CycleGAN, a technique that involves the training of image-to-image translation models using a collection of images from two domains that needn’t be related.
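The core CycleGAN constraint — that translating an image to the other domain and back should recover the original — can be sketched in miniature. The snippet below is a toy illustration of that cycle-consistency idea, not AVID itself: the "generators" are stand-in linear maps (a real CycleGAN learns convolutional networks adversarially), and the domains, dimensions, and loss are invented for demonstration.

```python
import numpy as np

# Toy sketch of CycleGAN's cycle-consistency constraint (illustrative only):
# two "generators" map between a human-video domain A and a robot domain B.
# Real CycleGANs learn these maps adversarially from unpaired image sets;
# here they are hand-built linear maps so the cycle closes exactly.

rng = np.random.default_rng(0)

W_ab = rng.normal(size=(4, 4))   # generator A -> B (human -> robot)
W_ba = np.linalg.inv(W_ab)       # generator B -> A, chosen so B -> A undoes A -> B

def g_ab(x):
    """Translate a (flattened) human-domain image into the robot domain."""
    return W_ab @ x

def g_ba(x):
    """Translate a robot-domain image back into the human domain."""
    return W_ba @ x

def cycle_consistency_loss(x):
    """L1 penalty encouraging G_BA(G_AB(x)) ~ x — the constraint that lets
    CycleGAN train on two domains without paired examples."""
    return np.abs(g_ba(g_ab(x)) - x).mean()

x = rng.normal(size=4)           # a stand-in "human video frame"
print(round(cycle_consistency_loss(x), 6))  # ~0: the toy cycle is exact
```

In a trained CycleGAN this loss is minimized jointly with adversarial losses in both directions; it is what keeps the translation from drifting, since the two domains needn’t share paired images.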
In practice, robots internalize tasks one stage at a time, automatically discovering how to reset each stage and retry it without human intervention. This makes the learning process largely automatic, from the intuitive specification of tasks via videos to training.
Better still, the researchers say that in experiments, AVID successfully learned tasks such as operating a coffee machine and retrieving a cup directly from raw image observations. It required only 20 minutes of human demonstrations and about 180 minutes of robot interaction with the environment, and in one of the tasks it outperformed behavioral cloning trained on real robot demonstrations rather than videos of human demonstrations.
They leave to future work amortizing the cost of training the CycleGAN models for specific tasks, perhaps by reusing trained CycleGAN models to translate demonstrations for other, somewhat related tasks. The researchers believe training could be generalized with a large data set involving multiple different human and robot behaviors in an environment, enabling new tasks to be learned with just a few human demonstrations.
Teaching robots teamwork
Researchers at Intel sought to tackle two longstanding problems in machine learning — a disinclination to explore environments and a high sensitivity to the choice of hyperparameters, or parameters whose values are set before the learning process begins — with a framework dubbed CERL, or collaborative evolutionary reinforcement learning. It’s a collection of optimized algorithms that together achieve greater sample efficiency, and that dynamically distribute computational resources to favor the best-performing models of the bunch.
Learning objectives in CERL are split into two optimization processes that operate simultaneously. A population of model “teams” is constructed and each team is evaluated on its performance on the actual task. Following these evaluations, strong teams are retained while weak teams are eliminated, and new teams are formed.
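The evolutionary half of that loop — evaluate teams, retain the strong, eliminate the weak, form replacements — can be sketched with a toy hill climber. The code below is an illustrative stand-in, not Intel's implementation: the "teams" are bare parameter vectors, and the fitness function, population size, and mutation scale are all invented for demonstration.

```python
import random

# Toy sketch of CERL-style evolutionary selection (illustrative only):
# a population of parameter vectors ("teams") is scored on a stand-in task,
# the strongest are kept as elites, and the weakest are replaced by
# mutated copies of the elites.

random.seed(0)

def fitness(team):
    # Stand-in task score: negative squared distance to a target vector.
    return -sum((p - 0.5) ** 2 for p in team)

def mutate(team, scale=0.1):
    # New teams are formed by perturbing a surviving elite.
    return [p + random.gauss(0, scale) for p in team]

population = [[random.random() for _ in range(3)] for _ in range(8)]

for generation in range(50):
    population.sort(key=fitness, reverse=True)  # evaluate and rank teams
    elites = population[:2]                     # strong teams are retained
    # Weak teams are eliminated; replacements descend from the elites.
    population = elites + [mutate(random.choice(elites)) for _ in range(6)]

best = max(population, key=fitness)
print(round(-fitness(best), 4))  # residual error after selection pressure
```

The actual framework pairs this selection loop with gradient-based learners running in parallel, which is what the "two optimization processes that operate simultaneously" refers to.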
Importantly, each model is provided a shared replay buffer, or a data repository where it can store its experiences as it explores. The framework constructs as many shared buffers as there are team positions, so a team member can learn from the experiences of all of its versions across all of the teams. And it’s this split-level approach that enables CERL to achieve state-of-the-art performance on a number of difficult benchmarks, including training a 3D humanoid model to walk from scratch.
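The per-position pooling described above amounts to indexing experience by team slot rather than by individual model. The sketch below shows that data structure in the simplest possible form; the class name, transition format, and API are hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict

# Illustrative sketch (not Intel's code) of position-indexed shared replay
# buffers: every experience a model gathers is filed under its team
# position, so any occupant of that position — in any team, at any point
# in training — can learn from all of its versions' experience.

class SharedReplayBuffers:
    def __init__(self):
        self.buffers = defaultdict(list)  # one buffer per team position

    def store(self, position, transition):
        self.buffers[position].append(transition)

    def sample_for(self, position):
        # A member at `position` sees experience from every team that ever
        # fielded that position, not just its own rollouts.
        return list(self.buffers[position])

buffers = SharedReplayBuffers()
# Members of two different teams occupying position 0 both contribute.
buffers.store(0, ("state_a", "action_a", 1.0))
buffers.store(0, ("state_b", "action_b", 0.5))
buffers.store(1, ("state_c", "action_c", 0.0))

print(len(buffers.sample_for(0)))  # 2: position 0 pools both teams' data
```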
In the future, the team plans to investigate similar problems involving multi-task learning in scenarios that have no well-defined reward feedback. They also hope to explore the role of communication in solving such tasks, which they note represents a class of problems that are a step up from simple perception.
Bonus round: Curling robots
Who knew robots could curl so well? A team hailing from Korea University and the Berlin Institute of Technology describes in a paper a machine — nicknamed Curly — that holds its own on real-world curling ice. An AI-based curling strategy and simulation engine guide the thrower robot, which autonomously drives and recognizes the field configuration thanks to a combination of traction control, cameras, and machine vision.
As the researchers note, curling ice sheets are traditionally covered with pebbles, whose condition changes over time depending on factors like temperature, humidity, the ice makers, time elapsed since maintenance ended, and the amount of sweeping done during the game. The trajectory of the stones varies over time as a result.
Curly contends with this by deploying a physics-based simulator designed to adjust parameters including throw angle, velocity, and curl direction until an optimal strategy is discovered. The robot’s thrower component performs this strategy on the ice sheet while holding and rotating a curling stone, which it releases by unfolding a gripper arm. A skip component, meanwhile, keeps tabs on the locations and trajectories of the stones while accounting for this variability.
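The adjust-and-retry loop such a simulator enables can be sketched as follows. Everything below is invented for illustration: the constant-deceleration physics, target coordinates, friction value, and parameter ranges bear no relation to the paper's engine, which models pebbled ice far more faithfully.

```python
import math
import random

# Toy sketch of simulation-based shot planning in the spirit of Curly
# (illustrative only): candidate throws are rolled out in a crude
# simulator, and the throw whose stone stops closest to the target wins.

random.seed(1)
TARGET = (0.0, 25.0)   # hypothetical target spot on the sheet, in meters
FRICTION = 0.06        # hypothetical constant deceleration, in m/s^2

def simulate(angle_deg, speed):
    """Stop position of a stone thrown at `angle_deg` with `speed`,
    under constant deceleration (a deliberate simplification)."""
    distance = speed ** 2 / (2 * FRICTION)  # from v^2 = 2 * a * d
    rad = math.radians(angle_deg)
    return (distance * math.sin(rad), distance * math.cos(rad))

def shot_error(angle_deg, speed):
    x, y = simulate(angle_deg, speed)
    return math.hypot(x - TARGET[0], y - TARGET[1])

# Randomly sample throw parameters and keep the best, mimicking the
# adjust-and-retry search the simulator makes cheap to run.
best = min(
    ((random.uniform(-5, 5), random.uniform(1.0, 2.5)) for _ in range(2000)),
    key=lambda p: shot_error(*p),
)
print(round(shot_error(*best), 3))  # residual miss distance after search
```

A real planner would also sweep curl direction and account for drift in the ice conditions between throws, but the structure — simulate, score, adjust — is the same.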
According to the researchers, Curly performed well in on-the-ice experiments — namely, in classical game situations and when interacting with human opponents like a top-ranked Korean amateur high school team. They leave to future research using explainable AI techniques to gain a better understanding of critical shot impacts, allowing the robot to better learn from its mistakes.