GPT-4.5 release a bit of a dud

Free

*censored*
VVO Supporter 🍦🎈👾❤
Joined
Sep 22, 2018
Messages
42,314
Location
Moonbase Caligula
SL Rez
2008
Joined SLU
2009
SLU Posts
55565
The verdict is in: OpenAI's newest and most capable traditional AI model, GPT-4.5, is big, expensive, and slow, providing marginally better performance than GPT-4o at 30x the cost for input and 15x the cost for output. The new model seems to prove that longstanding rumors of diminishing returns in training unsupervised-learning LLMs were correct and that the so-called "scaling laws" cited by many for years have possibly met their natural end.

An AI expert who requested anonymity told Ars Technica, "GPT-4.5 is a lemon!" when comparing its reported performance to its dramatically increased price, while frequent OpenAI critic Gary Marcus called the release a "nothing burger" in a blog post (though to be fair, Marcus also seems to think most of what OpenAI does is overrated).

Former OpenAI researcher Andrej Karpathy wrote on X that GPT-4.5 is better than GPT-4o but in ways that are subtle and difficult to express. "Everything is a little bit better and it's awesome," he wrote, "but also not exactly in ways that are trivial to point to."

OpenAI is well aware of these limitations, and it took steps to soften the potential letdown by framing the launch as a relatively low-key "Research Preview" for ChatGPT Pro users and spelling out the model's limitations in a GPT-4.5 release post published Thursday.

Fireship makes it short and sweet:

 
  • 1 Thanks
Reactions: Govi

Free

*censored*
VVO Supporter 🍦🎈👾❤
Joined
Sep 22, 2018
Messages
42,314
Location
Moonbase Caligula
SL Rez
2008
Joined SLU
2009
SLU Posts
55565
Using SimpleQA, the company's in-house factuality benchmarking tool, OpenAI admitted in its release announcement that its new large language model (LLM) GPT-4.5 hallucinates — which is AI parlance for confidently spewing fabrications and presenting them as fact — 37 percent of the time.

Yes, you read that right: in tests, the latest AI model from a company that's worth hundreds of billions of dollars is telling lies for more than one out of every three answers it gives.

We should have elected it President!
 
  • 1 Dead!
Reactions: CronoCloud Creeggan

Argent Stonecutter

Emergency Mustelid Hologram
Joined
Sep 20, 2018
Messages
7,460
Location
Coonspiracy Central, Noonkkot
SL Rez
2005
Joined SLU
Sep 2009
SLU Posts
20780
They're all lemons. They hallucinate 100% of the time because hallucination is how they work. If the hallucinations happen to line up with reality that's just chance.
 
  • 1 Like
  • 1 Winner
Reactions: Beebo Brink and Govi

Casey Pelous

Senior Discount
VVO Supporter 🍦🎈👾❤
Joined
Sep 24, 2018
Messages
3,228
Location
USA, upper left corner
SL Rez
2007
Joined SLU
February, 2011
SLU Posts
10461
Former OpenAI researcher and compulsive liar Andrej Karpathy Tommy Flanagan wrote on X that GPT-4.5 is better than GPT-4o but in ways that are subtle and difficult to express. "Everything is a little bit better and it's awesome," he wrote, "but also not exactly in ways that are trivial to point to. Yeah .... that's the ticket!" *giant bong hit* "I'd explain it but, you know, it's like, very, very technical. And, you know what else, I rode a surfboard on a tidal wave from Hawaii all the way to Silicon Valley, never hanging less than five the whole way. Yeah ....I was great .... "
 

Noodles

The sequel will probably be better.
Joined
Sep 20, 2018
Messages
5,994
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
When life hands you lemons, make LSD (and put it in some lemonade).
Is that why it makes pictures of people with ten fingers? The LSD?
 
  • 1 LOL
  • 1 Agree
Reactions: Govi and Free