It’s a little hard to believe that just over a year ago, a group of leading researchers asked for a six-month pause in the development of larger systems of artificial intelligence, fearing that the systems would become too powerful. “Should we risk loss of control of our civilization?” they asked.
There was no pause. But now, a year later, the question isn’t really whether A.I. is too smart and will take over the world. It’s whether A.I. is too stupid and unreliable to be useful. Consider this week’s announcement from OpenAI’s chief executive, Sam Altman, who promised he would unveil “new stuff” that “feels like magic to me.” But it was just a rather routine update that makes ChatGPT cheaper and faster.
It feels like another sign that A.I. is not even close to living up to its hype. In my eyes, it’s looking less like an all-powerful being and more like a bad intern whose work is so unreliable that it’s often easier to do the task yourself. That realization has real implications for the way we, our employers and our government should deal with Silicon Valley’s latest dazzling new, new thing. Acknowledging A.I.’s flaws could help us invest our resources more efficiently and also allow us to turn our attention toward more realistic solutions.
Others voice similar concerns. “I find my feelings about A.I. are actually pretty similar to my feelings about blockchains: They do a poor job of much of what people try to do with them, they can’t do the things their creators claim they one day might, and many of the things they are well suited to do may not be altogether that beneficial,” wrote Molly White, a cryptocurrency researcher and critic, in her newsletter last month.
Let’s look at the research.
In the past 10 years, A.I. has conquered many tasks that were previously unimaginable, such as successfully identifying images, writing complete, coherent sentences and transcribing audio. It even enabled a singer who had lost his voice to release a new song, using a model trained on clips of his old recordings.
But some of A.I.’s greatest accomplishments seem inflated. Some of you may remember that the A.I. model GPT-4 aced the uniform bar exam a year ago. Turns out that it scored in the 48th percentile, not the 90th, as claimed by OpenAI, according to a re-examination by the M.I.T. researcher Eric Martínez. Or what about Google’s claim that it used A.I. to discover more than two million new chemical compounds? A re-examination by experimental materials chemists at the University of California, Santa Barbara, found “scant evidence for compounds that fulfill the trifecta of novelty, credibility and utility.”
Meanwhile, researchers in many fields have found that A.I. often struggles to answer even simple questions, whether about the law, medicine or voter information. Researchers have even found that A.I. does not always improve the quality of computer programming, the task it is supposed to excel at.
I don’t think we’re in cryptocurrency territory, where the hype turned out to be a cover story for a number of illegal schemes that landed a few big names in prison. But it’s also pretty clear that we’re a long way from Mr. Altman’s promise that A.I. will become “the most powerful technology humanity has yet invented.”
Take Devin, a recently released “A.I. software engineer” that was breathlessly touted by the tech press. A flesh-and-blood software developer named Carl Brown decided to take on Devin. A task that took the generative A.I.-powered agent over six hours took Mr. Brown just 36 minutes. Devin also executed poorly, running a slower, outdated programming language through a complicated process. “Right now the state of the art of generative A.I. is it just does a bad, complicated, convoluted job that just makes more work for everyone else,” Mr. Brown concluded in his YouTube video.
Cognition, Devin’s maker, responded by acknowledging that Devin did not complete the output requested and adding that it was eager for more feedback so it can keep improving its product. Of course, A.I. companies are always promising that an actually useful version of their technology is just around the corner. “GPT-4 is the dumbest model any of you will ever have to use again by a lot,” Mr. Altman said while talking up GPT-5 at a recent event at Stanford University.
The reality is that A.I. models can often prepare a decent first draft. But I find that when I use A.I., I have to spend almost as much time correcting and revising its output as it would have taken me to do the work myself.
And consider for a moment the possibility that perhaps A.I. isn’t going to get that much better anytime soon. After all, the A.I. companies are running out of new data on which to train their models, and they are running out of energy to fuel their power-hungry A.I. machines. Meanwhile, authors and news organizations (including The New York Times) are contesting the legality of having their data ingested into the A.I. models without their consent, which could end up forcing quality data to be withdrawn from the models.
Given these constraints, it seems just as likely to me that generative A.I. could end up like the Roomba, the mediocre vacuum robot that does a passable job when you are home alone but not if you are expecting guests.
Companies that can get by with Roomba-quality work will, of course, still try to replace workers. But in workplaces where quality matters — and where workforces such as screenwriters and nurses are unionized — A.I. may not make significant inroads.
And if the A.I. models are relegated to producing mediocre work, they may have to compete on price rather than quality, which is never good for profit margins. In that scenario, skeptics such as Jeremy Grantham, an investor known for correctly predicting market crashes, could be right that the A.I. investment bubble is very likely to deflate soon.
The biggest question raised by a future populated by unexceptional A.I., however, is existential. Should we as a society be investing tens of billions of dollars, precious electricity that could instead power the transition away from fossil fuels, and a generation of the brightest math and science minds in incremental improvements to mediocre email writing?
We can’t abandon work on improving A.I. The technology, however middling, is here to stay, and people are going to use it. But we should reckon with the possibility that we are investing in an ideal future that may not materialize.