
Tech groups are rushing to redesign how they test and evaluate their artificial intelligence models, as the fast advancing technology surpasses current benchmarks.
OpenAI, Microsoft, Meta and Anthropic have all recently announced plans to build AI agents that can execute tasks for humans autonomously on their behalf. To do this effectively, the systems must be able to perform increasingly complex actions, using reasoning and planning.
您已閱讀10%(572字),剩余90%(5433字)包含更多重要信息,訂閱以繼續探索完整內容,并享受更多專屬服務。