January 13, 2022 in Tips

Recently John Mueller said that low-quality content can reflect poorly on a website’s overall credibility. What does this mean? Is it true?

Here’s what he said, according to Roger Montti on SEJ:

"So, in short, I guess if you have a very low quality translation [or other content] that’s also indexed and that’s also very visible in search, then that can definitely pull down the good quality translation as well or the good quality original content that you also have.”


That got me thinking. 

A side business of mine, where I provide guided SEO campaigns, uses a lot of low-quality content. I hold weekly office hours with participants. They often ask great questions (not low-quality questions), so I thought others might like to see these questions answered. I’ve posted those videos on YouTube and added an automated transcript (hence the low quality) to a blog post for each one. I thought, “well, this isn’t a great transcript, but it could help someone find the video that might help them out, even if the transcript itself doesn’t make much sense.”

If what Mueller said is correct, this poor-quality content might be holding back my website as a whole. 

What is poor quality content?

That’s hard to define, but I think it’s safe to say you know it when you see it. That’s why I immediately thought of my other website. I gathered some data to see whether a trend emerged between low-quality content and Google search performance.

Here’s how I collected my data: 

  1. I crawled the website using Screaming Frog (limited to the Office Hours section). Some pages have low-quality automated transcripts, others have good manual transcripts, and others have short summaries (and no transcripts).
  2. Along with the URLs, I collected the following information from each page:
    1. I used the grammar checker in Screaming Frog to quantify the number of grammatical errors on each page. The number of grammar errors helps distinguish between good and imperfect transcripts.
    2. I looked at the number of words on each page. The length of a page is a way to distinguish between the transcribed videos vs. the ones that were not.
    3. I gathered Search Console data for each page. Search impressions indicate whether or not Google thought a page was good enough to show in the search results, so I recorded the number of impressions each page received.
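The steps above can be sketched in a few lines of Python. The file contents, URLs, and column names here are hypothetical stand-ins (Screaming Frog and Search Console both export CSVs, but your exports will have different headers); the point is just the join and the error-rate calculation:

```python
import csv
import io

# Hypothetical Screaming Frog export: URL, word count, grammar errors per page.
crawl_csv = """url,word_count,grammar_errors
https://example.com/office-hours/ep-1,4200,210
https://example.com/office-hours/ep-2,650,4
"""

# Hypothetical Search Console export: URL and monthly impressions.
gsc_csv = """url,impressions
https://example.com/office-hours/ep-1,0
https://example.com/office-hours/ep-2,180
"""

# Index impressions by URL, then join them onto the crawl data.
impressions = {row["url"]: int(row["impressions"])
               for row in csv.DictReader(io.StringIO(gsc_csv))}

pages = []
for row in csv.DictReader(io.StringIO(crawl_csv)):
    words = int(row["word_count"])
    errors = int(row["grammar_errors"])
    pages.append({
        "url": row["url"],
        "word_count": words,
        # Grammar errors as a percentage of words, so pages of
        # different lengths can be compared on the same axis.
        "error_pct": 100.0 * errors / words,
        # Pages missing from Search Console had zero impressions.
        "impressions": impressions.get(row["url"], 0),
    })

for p in pages:
    print(p["url"], p["word_count"], round(p["error_pct"], 2), p["impressions"])
```

Each resulting record has the three dimensions the charts below use: word count, grammar-error rate, and impressions.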

Here’s what I found:

Here’s what that looks like:

What this graph means:

To make this chart a little clearer (because so much is crowded in the lower-left), here’s the same data with the number of words on a logarithmic scale:

Low-quality content doesn’t do well in Google.

Mueller was right. Boy, that’s an arrogant statement. Perhaps I should put it another way: he wasn’t trying to socially engineer us into doing something Google wants. You can see it in the graphs above. Notice how many tiny dots (representing zero impressions in a month) sit high on the y-axis, where the grammar-error rate is highest? This data seems to confirm that low-quality content doesn’t perform as well in Google.

You can also see that most of the big circles (representing more impressions) appear below the 4% grammatical-error line.

Now, we don’t know exactly how Google quantifies “low-quality,” but I’d suspect grammar might be part of it. There’s more to low quality than grammar alone, as the few small dots on the bottom-right of the chart show.
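As a rough sketch of that kind of screen (the 4% cutoff comes from eyeballing my chart, not from anything Google publishes, and the page list is illustrative), you could flag the pages whose grammar-error rate crosses the threshold:

```python
# Flag pages whose grammar-error rate exceeds a chosen threshold.
# The 4% default mirrors the line visible in my chart; it is not a
# number Google uses, just a heuristic from this one dataset.
def flag_low_quality(pages, threshold_pct=4.0):
    return [p["url"] for p in pages
            if 100.0 * p["grammar_errors"] / p["word_count"] > threshold_pct]

# Hypothetical pages: one automated transcript, one edited transcript.
sample = [
    {"url": "/office-hours/ep-1", "word_count": 4200, "grammar_errors": 210},  # 5.0%
    {"url": "/office-hours/ep-2", "word_count": 650, "grammar_errors": 4},     # ~0.6%
]

print(flag_low_quality(sample))  # only ep-1 crosses the 4% line
```

Pages this flags would be the first candidates for rewriting or removal.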

But wait, there’s more. This data shows us some additional insights.

Google doesn’t prefer long content.

I’ve heard it said (and others have told me) that we need to write long content for Google because it “prefers” it. This data seems to show otherwise. 

Notice all the tiny dots (representing zero impressions) with a lot of words, in the lower-right of the graph. You will also notice that most of the bigger circles (representing pages with more impressions) have a limited amount of content. This data puts a hole in the theory that Google likes long content: just because an “article”/transcript was long doesn’t mean it showed up in Google.

On a side note, I recommend long content on pages but not because Google “ranks” long content over shorter content. The value of longer content is each post’s opportunity for long-tail traffic.

Limitations of this data

It’s always hard to conclude anything definitive from SEO data. In this case, these observations assume that people are interested in the topics of these videos. If nobody else wants to know the answer to a question, the best grammar or longest article won’t help. SEO responds to demand. 

How can I fix this?

So, I’ve got 94 pages on my website that Google seems to deem unworthy of showing in the search results. How can I fix this?

First, I need to stop cutting corners with automatically generated content. These transcripts are not only hard to read, they’re not helpful. I had hoped they would be enough of a signal to the search engines to get people to the videos, but that’s not working.

At the same time, it might not be worth getting complete transcripts of each of these videos, even good ones. Notice all the low-error, long-length content at the lower-right of the graph? Those are the human-edited transcripts, and they don’t seem to bring in impressions from Google either. That’s good news, because generating quality transcripts for each of these videos would be expensive.

Instead, the data seems to point to a strategy of writing a simple summary of each video. Most of the better-performing (big-dot) content is between 500 and 3,000 words long. A few summary paragraphs might be all I need for these pages.

Still, that’s a lot of pages to update. What do I do in the meantime? Well, I’m simply going to remove the automated transcripts. They aren’t generating any impressions right now, so removing them won’t cost me traffic. And if the idea Mueller proposed (that low-quality content reflects poorly on a website as a whole) is accurate, removing them might help my website earn more overall impressions.

I’ll keep you updated with the results of this project. It’s a lot of articles, though, so it might take a while to get all these summaries written.

Do you have a low-quality content problem on your website?

Can you replicate my data on your website? Does the same data tell a different story? I’d be interested in hearing from you. 
