The Internet is Becoming an Ocean of LLM-Generated Junk

The internet’s full of content, but most of it is becoming junk. I’m talking about the stuff generated by Large Language Models (LLMs). These AI tools are cranking out endless articles, and the quality? It's bad—really bad.

Trevor I. Lasn ·

Sep 9, 2024 · 3 min read

Founder & CEO of 0xinsider.com — the Bloomberg terminal for prediction markets.

The internet has changed. It’s become flooded with content. Most of it is just regurgitated word salad or plain junk. The rise of Large Language Models (LLMs) is partly to blame. But here’s the catch—most of it is worthless. And I’m not saying this lightly.

When you read LLM-generated content, something feels off. It’s usually repetitive, overly wordy, and lacks depth. Sure, it can be grammatically correct and sound “professional.” But does it add real value? Most times, no.

LLMs don’t understand the content they generate. They’re just parroting back patterns they’ve seen from massive datasets. It’s like having a parrot that learned how to mimic conversations. Yeah, it can say things that make sense, but it doesn’t understand what it’s saying.

I’m Skeptical of Almost Everything I Read Now

Here’s the frustrating part: I’m skeptical of almost everything I read online now. I wonder, “Was this written by a human or generated by an LLM?” It’s not just that the quality is poor—it’s that the trust is gone. Even if the content seems polished, I still second-guess its accuracy and depth.

I’m sure you’ve felt the same way. You’re reading an article or documentation, and it feels oddly familiar. It’s like you’ve seen the same phrasing in three other articles. You start questioning, “Is this just recycled LLM output?”

The problem is that LLM-generated content isn’t just clogging up blogs or listicles—it’s starting to leak into everything. From tutorials to documentation, I spend more time verifying whether what I’m reading is trustworthy. And that’s time I could be spending learning or building something, not playing detective.

It’s getting harder to tell whether something is written by a human or a machine. But there are some telltale signs. If an article feels like it’s dragging on, or if it keeps repeating itself without adding any real value, that’s a red flag. LLMs tend to produce a lot of fluff to make the content seem longer or more thorough.

Another sign is repetition. If the same points keep popping up in slightly different wording, you’re probably reading machine-generated content. It’s like the AI doesn’t know when it’s already made a point, so it just keeps going in circles.

Here’s my advice: If you’re writing content, don’t just scratch the surface. Provide depth, real-world examples, and explanations that go beyond the basics. Otherwise, you’re just adding to the growing ocean of junk.

Note: This article is me blowing off steam. I don’t have any solutions to fix the issue.

Trevor I. Lasn

Founder & CEO of 0xinsider.com — the Bloomberg terminal for prediction markets. Product engineer based in Tartu, Estonia, building and shipping for over a decade.

Found this article helpful? You might enjoy my free newsletter. I share dev tips and insights to help you grow your coding skills and advance your tech career.

Check out these related articles that might be useful for you. They cover similar topics and provide additional insights.

Tech

5 min read

The Fight to Free JavaScript from Oracle's Control

The creator of JavaScript and Node.js are challenging Oracle's control over the JavaScript name

Pkl: Apple's New Configuration Language That Could Replace JSON and YAML

A deep dive into Pkl, Apple's configuration language that aims to replace JSON and YAML

The Credit Vacuum

Being a developer sometimes feels like being the goalkeeper in a soccer match. You make a hundred great saves, and no one bats an eye. But let one ball slip through, and suddenly you're the village idiot.