Honor's New AI Agent Can Read and Understand Your Screen
Summary
Mobile company Honor has developed a “GUI-based mobile AI agent” that can understand a screen’s graphical user interface to handle tasks, such as booking a restaurant.
Delegate restaurant bookings to an AI agent, and it might work fine, but you’ll normally end up taking over before the process is complete because the software has little of the intuitive understanding that a human brings to this sort of task.
That said, the AI can run through a multi-step process, and far less clumsily than some of the early attempts at browsers posing as wizards for online booking systems.
It can do this thanks to “multimodal screen context recognition” and an “in-house execution model”.
The AI can also learn from user preferences when, for instance, choosing a restaurant or deciding what to order.
The AI is built on Google’s Gemini 2 large language model.
Developers want AI to act as an intelligent personal assistant that can work across apps and execute complex tasks.