Rick W / Wednesday, November 5, 2025

New Benchmark Shows AI Agents Perform Poorly When Automating Real Jobs

A new paper from the Center for AI Safety and Scale AI has introduced the Remote Labor Index (RLI), the first benchmark designed to measure how well AI agents can perform paid, remote jobs.

Tags: AI
