Saturday, August 2, 2025
World Tribune

AI keeps getting more powerful, making it harder to judge how smart models actually are

August 1, 2025
in Business

How do you judge an AI model when it’s already starting to perform better than human beings? That’s the challenge faced by researchers like Russell Wald, executive director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). 

“As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly,” Wald said last week in a presentation hosted at the Fortune Brainstorm AI Singapore conference. “AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.”

HAI releases the AI Index each year, which aims to provide a comprehensive, data-driven snapshot of where AI is today. At Fortune Brainstorm AI Singapore, Wald shared a few highlights from the 2025 edition of the AI Index, such as the increasing power of today’s models, the growing dominance of industry on the AI frontier, and how China is poised to overtake the U.S.


The following transcript has been lightly edited for conciseness and clarity.

I’m Russell Wald, the executive director of the Stanford Institute for Human-Centered Artificial Intelligence, or what we call “HAI”. 

We are Stanford University’s globally recognized interdisciplinary research institute at the forefront of shaping AI development for the public good. HAI was established in 2019 with the goal of advancing AI research, education, policy and practice. And, through our convening role and rigorous study of AI, we have become the trusted partner on AI governance for decision makers in industry, government and civil society. 

I’m going to talk about what we produce at HAI, which is the AI Index, an annual data-driven analysis of trends in AI that tracks research, development, deployment and the socio-economic impact of AI across academia, government and industry.

We see AI performance consistently improve year over year. We use Midjourney, a text-to-image generator, asking for a hyper-realistic image of Harry Potter. And from February 2022 to July 2024, we see rapidly increasing quality in these generated images. 

In 2022, the model produced cartoonish, inaccurate renderings of Harry Potter, but by 2024, it could create startlingly realistic depictions. We have gone from what mirrors a Picasso painting to an uncanny rendering of Daniel Radcliffe, the actor who played Harry Potter in the movies. 

Because of this consistent performance growth, we are increasingly challenged when it comes to benchmarking these models. As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly. From image recognition to competition-level mathematics to PhD-level science questions, AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.

From healthcare to transportation, AI is rapidly moving from the lab to our daily life. In 2023, the U.S. Food and Drug Administration approved 223 AI-enabled medical devices, up from just six in 2015. 

On the roads, self-driving cars are no longer experimental. For example, Waymo, which I regularly take while living in San Francisco, is one of the largest U.S. operators and provides over 150,000 autonomous rides each week, while Baidu’s affordable Apollo Go robotaxi has a fleet now that serves numerous cities across China. 

Business use of AI increased significantly after stagnating from 2017 to 2023. The latest McKinsey report reveals that 78% of surveyed respondents say their organizations have begun to use AI in at least one business function, marking a significant increase from 55% in 2023. 

Driven by increasingly capable small models, the inference cost for a system performing at the level of [GPT-3.5] dropped over 280-fold between November 2022 and October 2024. Hardware costs have declined 30% annually, while energy efficiency has improved by 40% each year. 
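
To put those rates on a common footing, here is a minimal Python sketch of the arithmetic. It assumes a smooth exponential decline, which the talk does not claim, and counts November 2022 to October 2024 as 23 months; the function names are illustrative only.

```python
import math


def halving_time_months(fold_drop: float, months: float) -> float:
    """Months needed for cost to halve, assuming a steady exponential decline."""
    return months / math.log2(fold_drop)


def remaining_fraction(annual_decline: float, years: float) -> float:
    """Fraction of the original cost left after compounding an annual decline."""
    return (1 - annual_decline) ** years


# Assumption: the 280-fold drop is spread evenly over 23 months.
print(f"Implied halving time: {halving_time_months(280, 23):.1f} months")

# Assumption: the 30% annual hardware decline compounds for three years.
print(f"Hardware cost after 3 years: {remaining_fraction(0.30, 3):.0%} of the original")
```

Under these assumptions, the inference figure implies cost halving roughly every three months, and a 30% annual decline leaves about a third of the original hardware cost after three years.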

Open-weight models are also closing the gap with closed models, reducing the performance [gap] from 8% to just 1.7% on some benchmarks in a single year. Together, these trends are rapidly lowering the barriers to advanced AI. 

However, even with inference and hardware costs going down, training costs remain out of reach for academia and most small players. Nearly 90% of notable AI models in 2024 came from industry, which is up from 60% in 2023. And while academia remains a top source of highly cited research, it does struggle at this point to stay as advanced at the frontier level. 

Model scale continues to grow rapidly. Training compute doubles every five months, datasets every eight, and power use annually. Yet performance gaps are shrinking. The score difference between the top and 10th ranked models fell from 11.9% to 5.4% in a year, and the top two models are now separated by just 0.7%. The frontier is increasingly competitive and increasingly crowded. 
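
The doubling times quoted above can be converted into yearly growth multiples. The Python sketch below assumes steady exponential growth, which is a simplification rather than something stated in the talk; the helper name is hypothetical.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 / doubling_months)


# Doubling times as quoted in the talk (in months).
quoted = [("training compute", 5), ("dataset size", 8), ("power use", 12)]

for name, months in quoted:
    print(f"{name}: about {annual_growth_factor(months):.1f}x per year")
```

On that reading, training compute grows a little over fivefold each year, datasets nearly threefold, and power use twofold.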

In recent years, AI model performance at the frontier has converged, with multiple providers now offering highly capable models. This marks a shift from late 2022, when ChatGPT’s launch, widely seen as AI’s breakthrough into the public consciousness, coincided with a landscape dominated by just two players: OpenAI and Google. 

One of the most important things to note is that the transformer model cost $930 for Google to train in 2017—and that is the T in GPT, the baseline level of architecture—and now today we’re at $200 million to train Gemini Ultra. 

Last year’s AI Index was among the first publications to highlight the lack of standard benchmarks for AI safety and responsibility evaluations. The index has also been analyzing global public opinion. If you are from a non-Western industrialized nation, you are more likely to view AI positively than not. China has an 83% positive view, Indonesia 80%, and Thailand 77%, whereas Canada is at 40%, the U.S. 39%, and the Netherlands 36%. 

I’ll close with the geopolitical situation. The U.S. still maintains a lead in AI, followed closely by China. However, this gap is tightening. My intention is not to exacerbate the idea of an AI arms race between China and the U.S., but instead to highlight the different approaches between the most advanced frontier AI model developers. 

Over the last several years, the U.S. has relied on a few proprietary model providers. Meanwhile, China has deeply invested in its talent base, and more importantly, an open-source environment. If this trend continues, and I appear next year, at this rate, China would surpass the U.S. in terms of model performance. 

© 2024 World Tribune - All Rights Reserved!
