Using several fingers at once to grasp and manipulate objects is a straightforward ask of humans, despite the primitiveness of our hand anatomy. But robots have a much tougher go of it. That’s because tasks like writing on paper with a pencil require physics models that account for the forces imparted on the target object, as well as the repeated making and breaking of contacts. And with conventional approaches to the problem of dexterous manipulation, such models are often difficult to generate accurately.
That’s why researchers at Google Brain pursued a novel robot task planning technique involving deep dynamics models, or DDM. They claim that their method, which they describe in a preprint paper published this week on arXiv.org (“Deep Dynamics Models for Learning Dexterous Manipulation”), enabled a mechanical hand to manipulate multiple objects using just four hours of real-world data.
It builds on a rich body of AI research addressing the problem of robotic hand manipulation, including a recent Tencent study that investigated a five-finger Rubik’s cube solver. Separately, OpenAI researchers last July detailed a system capable of directing hands in grasping and manipulating objects with state-of-the-art precision. In September of last year, an MIT CSAIL team proposed a computer vision system, dubbed Dense Object Nets, that allowed robots to inspect, visually understand, and manipulate objects they’ve never seen before. And for its part, Google earlier this year collaborated with researchers at Princeton, Columbia, and MIT to develop a picker robot they dubbed TossingBot, which learns to grasp and throw objects into boxes outside the confines of its “natural range.”
“Model-free [machine learning] … methods can learn policies that achieve good performance on complex [robotic manipulation] tasks. [H]owever … these state-of-the-art algorithms struggle when a high degree of flexibility is required,” wrote the coauthors. “[C]omplex contact dynamics and high chances of task failure make the overall skill much more difficult. Model-free methods also require large amounts of data, making them difficult to use in the real world … In this work, we aim to push the boundary on this task complexity.”
To this end, the team’s method combined what they describe as an ensemble of “uncertainty-aware” AI models with state-of-the-art trajectory optimization. Reinforcement learning, an algorithmic training technique that employs rewards to drive software policies toward goals, helped teach the system nuanced hand and object interactions. Each candidate sequence of actions was scored by its mean predicted reward across several machine learning models, and that score was used to optimize the choice of sequence. The hand executed only the first action of the chosen sequence, after which it received updated state information and replanned at the following step.
The researchers note that the “closed-loop” replanning method conferred the advantage of mitigating inaccuracies by preventing errors from accumulating. Additionally, they say that it allowed new goals to be swapped in at run time, independent of the trained machine learning models.
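To make the planning loop concrete, here is a minimal Python sketch of the general idea: score candidate action sequences by their mean predicted reward across an ensemble, execute only the first action, then replan. It is not the researchers’ code. The one-dimensional “state,” the toy ensemble whose members differ only in a gain value, the random-sampling optimizer, and every name in it (make_ensemble, plan, true_dynamics, run_episode) are assumptions chosen purely for illustration.

```python
import random


def make_ensemble(n_models, rng):
    """Toy stand-ins for learned dynamics models (an assumption).

    Each model predicts next_state = state + gain * action, and the
    gains differ slightly to mimic disagreement between ensemble members.
    """
    def make_model(gain):
        return lambda state, action: state + gain * action

    return [make_model(1.0 + rng.uniform(-0.1, 0.1)) for _ in range(n_models)]


def predicted_reward(model, state, actions, goal):
    """Roll one model forward and sum the negative distance to the goal."""
    total = 0.0
    for action in actions:
        state = model(state, action)
        total -= abs(state - goal)
    return total


def plan(ensemble, state, goal, rng, horizon=3, n_candidates=300):
    """Sample random action sequences and keep the one with the best
    mean predicted reward across the ensemble."""
    best_seq, best_score = None, float("-inf")
    for _ in range(n_candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        score = sum(
            predicted_reward(m, state, seq, goal) for m in ensemble
        ) / len(ensemble)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq


def true_dynamics(state, action):
    """Hypothetical 'real' system that no ensemble member matches exactly."""
    return state + 0.95 * action


def run_episode(goal, steps=30, seed=0):
    """Closed-loop control: plan, execute the first action only, replan."""
    rng = random.Random(seed)
    ensemble = make_ensemble(5, rng)
    state = 0.0
    for _ in range(steps):
        seq = plan(ensemble, state, goal, rng)
        state = true_dynamics(state, seq[0])  # discard the rest of the plan
    return state


if __name__ == "__main__":
    print(run_episode(goal=3.0))
```

Because the goal is only an argument to the planner, calling run_episode with a different goal reuses the same toy models unchanged, which mirrors the run-time goal swapping described above. Replanning at every step is also what keeps the mismatch between the toy models and true_dynamics from compounding.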
The researchers tasked the system with solving several real-world manipulation challenges, all of which required making contact with objects and finagling them into a target position. One of the most difficult was rotating two Baoding balls around the palm without dropping them, but the researchers’ models impressively managed to solve it using just 100,000 data points, or about 2.7 hours’ worth of data.
In a separate experiment, the team repurposed models trained on the Baoding task to accomplish other tasks without additional training, including moving a single ball to a goal location in the robotic hand and performing clockwise rotations instead of the learned counter-clockwise ones. (The hand in question was the Shadow Hand, which has a wrist with two actuated joints, plus middle and ring fingers that each have three actuated joints and one underactuated joint, as well as a little finger and thumb with five actuated joints.) They say that it managed to rotate the two balls 90 degrees and 180 degrees without dropping them using under two hours of real-world data captured from a camera, with success rates of about 100% and 54%, respectively.
In a subsequent test designed to study the flexibility of their system, the team experimented with handwriting in a simulation environment. They say that the method’s separation of modeling and task-specific control importantly allowed for generalization across behaviors, as opposed to discovering and memorizing the answer to a specific task or movement.
“Our approach, based on deep model-based [reinforcement learning,] challenges the general machine learning community’s notion that models are difficult to learn and do not yet deliver control results that are as impressive as model-free methods,” wrote the paper’s coauthors. “On our simulated suite of dexterous manipulation tasks, [it] consistently outperforms these prior methods both in terms of learning speed and final performance, often solving flexible tasks that prior methods cannot … To the best of our knowledge, our paper demonstrates for the first time that deep neural network models can indeed enable sample-efficient and autonomous discovery of fine motor skills with high-dimensional manipulators, including a real-world dexterous hand trained entirely using just … hours of real-world data.”
The researchers intend to open-source the code soon.