LLMs, or the content wasteland ahead
Why I'm excited about most applications of AI, but have misgivings about the monetization of LLMs.
On most days, I detest punditry. That said, the emergence of large language models (LLMs) is perhaps the most interesting development in my entire career in tech; it’s simply too consequential to ignore.
I’m not inclined to speculate about the advent of artificial general intelligence (AGI): a hypothetical set of algorithms that match or surpass humans at any given task. Unlike many of my industry peers, I’m unconvinced that we’re nearing that threshold. In an earlier post, I remarked the following:
“The technology feels magical and disruptive, but we felt the same way about the first chatbot — ELIZA — and about all the Prolog-based expert systems that came on its heels. This isn’t to say that ChatGPT is a dud; it’s just that the shortcomings of magical technologies take some time to snap into view.”
The bottom line is that LLMs are designed to function as text predictors trained on far more data than we can possibly imagine. It remains to be seen if their humanlike behavior is merely a parlor trick, a straightforward if unexpected consequence of the vastness of the internet; or if they exhibit some yet-unknown emergent property that sets us on a path toward true AGI.
Instead of taking sides in that debate, I’d like to make a simpler prediction about LLMs as they operate today. I suspect that barring urgent intervention, within two decades, most interactions on the internet will be fake. It might seem like a weirdly pessimistic claim, but there are two powerful effects at play. First, there are robust commercial incentives to use LLMs to simulate human behavior on a massive scale. Second, some of the most plausible applications of LLMs — as search agents — will almost certainly have a chilling effect on the creation of new content on the internet.
To illustrate the first point, consider customer support. The need to devote resources to this task is seen by many large companies as a barrier to growth, so over the past three decades, we’ve seen the proliferation of maze-like phone IVRs, chatbots, and support wizards on the web. The first versions of such systems offered simple ways to reach a human. Today, most designs are more adversarial and escape-proof: for example, in an IVR, pressing “0” to speak to a human might just get you chided by a robot that refuses to cooperate. (For a taste, call UPS at 800-742-5877.)
Even with these countermeasures in place, human support continues to be a drag on revenue. Many businesses try to reduce costs by outsourcing the job to third-party call centers where low-wage workers with little agency must follow a rigid script. On the backend, the decision to offer a refund or make other accommodations is seldom predicated on what the customer has to say; instead, it boils down to a formula for the account’s worth. If you reliably make the company money, they will entertain the odd meritless demand. If you don’t, they will play hardball in hopes that you go away.
Herein lies the irresistible allure of LLMs: platforms such as ChatGPT are already capable of replacing these vestiges of customer support with something that feels more “authentic” to most customers — and doing so at a fraction of the cost. I have no doubt that LLMs will be employed for this purpose soon.
Most of the time, the solutions won’t be built on outright deception. Instead, they will follow the telemedicine blueprint outlined in a recent Twitter thread by a Princeton-educated SF Bay Area entrepreneur:
The author boasted about the improved metrics and higher satisfaction scores when human volunteers were replaced by a bot. Critically, there was a fig leaf placed over the naughty bits: a human still supervised the conversation, with the robot merely “advising” what to say. This will likely continue — except that down the line, we’re going to have a human “supervising” two, five, or ten parallel conversations, doing just enough to keep up appearances while delivering faked compassion at a marvelous scale.
[Update: as of June 2024, Amazon Clinic is already using “Curai Health”, an “AI-assisted primary care model” that follows this exact blueprint.]
The same incentives will cause LLMs to swarm other types of online communications. Consider that many businesses don’t shy away from covert marketing: from paid product placement, to astroturfing on Reddit, to content-farmed blog articles, to outright fake reviews. Government agencies and political operatives regularly succumb to the same temptations when they feel the cause is just.
Of course, these marketing and propaganda strategies exist today, but they are kept in check by cost: you need to pay real humans to engage with others in believable ways. LLMs offer a solution, letting you effortlessly conjure millions of human-like personas that are practically indistinguishable from real people and can seemingly live complex online lives — but at the end of the day, exist only to advance your goals.
Marketing aside, the same toolkit would be indispensable for crime. Scams and spear-phishing campaigns would reach new levels if one could perfectly tailor communications to each mark’s professional and social background, and do so millions of times per day. ChatGPT is already capable of “style transfer” that flawlessly adjusts a message to match a person’s background. Its developers try to detect overtly malicious uses, but self-hosted implementations will be free of such constraints.
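To make this concrete, here is a minimal sketch of such style transfer on a self-hosted model, using the Hugging Face transformers library. The model name, the message, and the persona string are placeholders of my choosing; any locally run instruction-tuned LLM would behave much the same way. The point is that nothing but a prompt stands between the operator and arbitrarily tailored output:

    # A minimal sketch of LLM "style transfer" on a self-hosted model.
    # The model name is an arbitrary example; any local instruction-tuned
    # LLM works the same way, with no vendor-side abuse detection.
    from transformers import pipeline

    generator = pipeline("text-generation",
                         model="mistralai/Mistral-7B-Instruct-v0.2")

    message = "Your invoice for March is overdue. Please remit payment."
    persona = "a friendly colleague who met the recipient at a conference"

    prompt = (f"Rewrite the message below in the voice of {persona}, "
              f"keeping the same call to action:\n\n{message}\n\nRewrite:")

    # The pipeline returns the prompt plus the generated continuation.
    result = generator(prompt, max_new_tokens=120, do_sample=True)
    print(result[0]["generated_text"])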
The explosion of inauthentic content will be accompanied by diminishing incentives for humans to publish on the internet. It is nearly a given that we will learn to depend on ChatGPT-style digital assistants to instantly retrieve, summarize, and apply the sum of human knowledge to any problem at hand; the revolution is likely to happen even if Microsoft Clippy 2.0 occasionally makes a mistake or two.
All these LLMs are made possible by the culture of uninhibited and organic sharing on the internet; they are trained on Reddit, Quora, Stack Exchange, Wikipedia, and countless websites that make up the “small web”. But the emergence of LLM-based assistants threatens to destroy this ecosystem. In a world where Clippy 2.0 has all the answers, nobody will visit your website, ask you for advice, or send you a “thank you” note. Some creators might still find solace in helping humanity in a very abstract way, but many will give up — or flee to walled gardens where robots are not allowed.
A new breed of content licenses or other legal solutions to keep robots at bay might help preserve some degree of openness. The alternative is an internet of small, hermetic communities where members know each other, and the risk of drive-by robotic infiltration is easily kept in check.
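For what it’s worth, the closest mechanism we have today is the robots exclusion protocol: OpenAI, Google, and Anthropic all publish user-agent strings for their training crawlers, so a site can ask them to stay away. Compliance is entirely voluntary, which is precisely why licenses or legislation may need to pick up the slack. A minimal robots.txt along these lines might look like:

    # Opt out of the major AI training crawlers (honored voluntarily).
    User-agent: GPTBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /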
On that topic: https://www.ic3.gov/Media/News/2024/240709.pdf
Coding, for sure, is one area where huge changes lie ahead.
An example: I'm not a coder, but sometimes I want to solve a trivial problem with a bit of scripting. Believe me, designing and writing all those loops and conditions was real mind-training, and definitely not much fun. It also ate a lot of time that I could have spent better, working on other parts of my project. Before, I had to do what professionals joke about doing: read Stack Overflow and pick out the crumbs useful for my solution. And sometimes (as now, when I'm working on FontForge scripting) there were no ready answers, so I had to read documentation that is really easy to follow, but only for coders, not for me. Several times, I tried asking for help. Only once did I get an answer; the topic was "fresh" enough. The other times, I waited for weeks and there was silence.
Now I just have to write something like a "ladder", an outline: first do this, then do that. I can even ask it to design individual functions, so I control the data flow. But I don't have to code; I just need to understand the code well enough to interact with a really fast autistic savant (you know, like Rain Man). What used to take me weeks now takes me hours.
So, your prophecy really can come true. It is sad that this subtle network of mutual help will disappear. And when it vanishes, the free help from LLMs will fade too: why should it remain free once there is no free competition?