
Trevor I. Lasn

Staff Software Engineer, Engineering Manager

The Internet is Becoming an Ocean of LLM-Generated Junk

The internet’s full of content, but most of it is becoming junk. I’m talking about the stuff generated by Large Language Models (LLMs). These AI tools are cranking out endless articles, and the quality? It's bad—really bad.

The internet has changed. It’s become flooded with content, and most of it is regurgitated word salad or plain junk. The rise of LLMs is partly to blame: they make it cheap to crank out articles at volume, but most of what they produce is worthless. And I’m not saying this lightly.

When you read LLM-generated content, something feels off. It’s usually repetitive, overly wordy, and lacks depth. Sure, it can be grammatically correct and sound “professional.” But does it add real value? Most times, no.

LLMs don’t understand the content they generate. They’re just parroting back patterns they’ve seen in massive datasets. It’s like having a parrot that has learned to mimic conversations. Yeah, it can say things that make sense, but it doesn’t understand what it’s saying.

I’m Skeptical of Almost Everything I Read Now

Here’s the frustrating part: I’m skeptical of almost everything I read online now. I wonder, “Was this written by a human or generated by an LLM?” It’s not just that the quality is poor—it’s that the trust is gone. Even if the content seems polished, I still second-guess its accuracy and depth.

I’m sure you’ve felt the same way. You’re reading an article or documentation, and it feels oddly familiar. It’s like you’ve seen the same phrasing in three other articles. You start questioning, “Is this just recycled LLM output?”

The problem is that LLM-generated content isn’t just clogging up blogs or listicles. It’s starting to leak into everything. Whether it’s a tutorial or documentation, I now spend more time verifying that what I’m reading is trustworthy. And that’s time I could be spending learning or building something, not playing detective.

It’s getting harder to tell whether something was written by a human or a machine, but there are some telltale signs. One is padding: if an article drags on without adding any real value, that’s a red flag. LLMs tend to produce a lot of fluff that makes the content seem longer or more thorough.

Another sign is repetition. If the same points keep popping up in slightly different wording, you’re probably reading machine-generated content. It’s like the AI doesn’t know when it’s already made a point, so it just keeps going in circles.

Here’s my advice: If you’re writing content, don’t just scratch the surface. Provide depth, real-world examples, and explanations that go beyond the basics. Otherwise, you’re just adding to the growing ocean of junk.

Note: This article is me blowing off steam. I don’t have any solutions to fix the issue.



This article was originally published on https://www.trevorlasn.com/blog/the-internet-is-becoming-an-ocean-of-llm-generated-junk. It was written by a human and polished using grammar tools for clarity.