This Tool Probes Frontier AI Models for Lapses in Intelligence
1 min read
Summary
Scale AI has created Scale Evaluation, a platform that tests AI models against a wide range of criteria, pinpoints weaknesses, and flags where further training is needed, with the company supplying the necessary additional data.
Humans have so far contributed to AI development mainly by training models and providing feedback on their outputs, a field in which Scale has risen to prominence.
The new platform automates some of this work using Scale's own machine-learning algorithms.
Head of product Daniel Berrios says the tool lets makers of AI models analyse results, identify areas where their models underperform, and target those areas for improvement through subsequent data campaigns.
Use of the platform could also support standardised testing of AI outputs, which some are calling for as a way to identify errors and blind spots in the technology.