Summary

  • Google DeepMind, the London-based AI research company, has unveiled two new AI models, Gemini Robotics and Gemini Robotics-ER, designed to help robots better understand and interact with the physical world.
  • The models build on the company’s Gemini 2.0 language model and are designed to work with a range of robot types, including humanoid robot assistants.
  • While hardware for robot platforms is advancing, creating an AI system that can pilot these robots through novel scenarios with safety and precision remains difficult, according to DeepMind.
  • Gemini Robotics is a “vision-language-action” model that uses camera footage of a scene to let a robot recognise objects and carry out commands, while Gemini Robotics-ER adds an “embodied reasoning” capability with enhanced spatial understanding.
  • DeepMind has also released the ASIMOV dataset to help researchers assess the safety implications of robotic actions in real-world scenarios.

By Benj Edwards