Last Exams - Search News

Hosted on MSN

'Humanity's last exam' reveals how accurate AI actually is. Chatbots might want to look away now.

Artificial intelligence (AI) researchers have created what they are calling "Humanity's Last Exam" in an attempt to benchmark the progress of large language models (LLMs). Looking at the performance ...

Hosted on MSN

AI is failing 'Humanity's last exam'—so what does that mean for machine intelligence?

How do you translate ancient Palmyrene script from a Roman tombstone? How many paired tendons are supported by a specific sesamoid bone in a hummingbird? Can you identify closed syllables in Biblical ...

AOL

OpenAI’s deep research can complete 26% of Humanity’s Last Exam—a benchmark for the frontier of human knowledge

Artificial intelligence may be more than a quarter of the way to surpassing the boundaries of human knowledge. OpenAI’s new autonomous agent, deep research, has stormed past competing models and set a ...

AOL

Humanity’s last exam, the test that modern AI still struggles to pass

Artificial intelligence systems now breeze through many academic tests that once challenged both machines and people. That success created an unexpected problem. The benchmarks used to measure AI ...

New York Post

AI dangerously close to solving test that only the brightest minds on Earth could: ‘Human expertise still matters’

This system could game us. Artificial intelligence is already outperforming humans at various intelligence-based activities ranging from chess to pattern recognition. Now, experts claim they’re a year ...

MediaPost

3.know: Grok Explains Humanity's Last Exam, Its Relevance To Ad Pros

A grade of 45 might not seem gold star-worthy by old school human exam standards, but that's how xAI's Grok 3 chose to illustrate this column when I interviewed the chatbot on "leaked" rumors that its ...
