What is quality content?
Columnist Patrick Stox takes a comprehensive look at what Google might consider to be "quality content" and adds his own thoughts and tips based on his experience in the SEO industry.
We’ve
all heard that content is king and that you need to write high-quality content,
or now “10x
content,” as coined by Rand Fishkin. Ask SEOs what “quality content” is
and you’ll receive a lot of varied and opinionated answers. Quality is
subjective, and each person views it differently.
Ask SEOs what Google considers to be quality content, and you will
get a lot of blank stares. I know because I like to ask this a lot.
The
number one answer I get, sadly, is that content should be x number
of words, where x is usually 200, 300, 500, 700, 1,000, 1,500,
or 2,000. More content does not mean better content. A simple query about the
age of an actor can be fully answered in a sentence and doesn’t require their
life story and filmography.
Another
answer I receive is that the content should be “relevant.” The problem with
this is that low-quality pages can be relevant as well.
Other
SEOs I’ve asked have given amazingly detailed answers from patents or ideas
from machine learning about word2vec, RankBrain, deep learning, count-based
methods and predictive methods.
Is
there a right answer?
Google Webmaster Quality Guidelines
Google has quality guidelines here. However, you may notice that there are many guidelines around negative signals but few around positive signals. When reading these, think for a minute what happens when two, ten or a hundred websites aren’t doing anything bad. How do you determine the quality difference if no one does anything wrong?
Basic
principles
*Make pages primarily for users, not for
search engines.
*Don’t deceive your users.
*Avoid tricks intended to improve search
engine rankings. A good rule of thumb is whether you’d feel comfortable
explaining what you’ve done to a website that competes with you, or to a Google
employee. Another useful test is to ask, “Does this help my users? Would I do
this if search engines didn’t exist?”
*Think about what makes your website unique,
valuable or engaging. Make your website stand out from others in your field.
Specific
guidelines
Avoid the
following techniques:
*Automatically generated content
*Participating in link schemes
*Creating pages with little or no original
content
*Cloaking
*Sneaky redirects
*Hidden text or links
*Doorway pages
*Scraped content
*Participating in affiliate programs without
adding sufficient value
*Loading pages with irrelevant keywords
*Creating pages with malicious behavior, such
as phishing or installing viruses, Trojans or other badware
*Abusing rich snippets markup
*Sending automated queries to Google
Follow good
practices like these:
*Monitoring your site for hacking and removing
hacked content as soon as it appears
*Preventing and removing user-generated spam on your site
Google on how to create valuable content
Then
there’s this
section from Google’s Webmaster Academy course, which tells
you how to “create valuable content.” There are a few good tips here
on what to avoid: broken links, wrong information, grammar or spelling
mistakes, excessive ads and so on. These are useful tips, but again, they focus
on what not to do.
There are some
tips on how to make your site useful, credible and engaging; however, when
it comes to being more valuable or high-quality, Google basically says, “be
more valuable or high-quality.”
As you
begin creating content, make sure your website is:
Useful
and informative: If you’re launching a site for a restaurant, you can include
the location, hours of operation, contact information, menu and a blog to share
upcoming events.
More
valuable and useful than other sites: If you write about how
to train a dog, make sure your article provides more value or a different
perspective than the numerous articles on the web on dog training.
Credible: Show
your site’s credibility by using original research, citations, links, reviews
and testimonials. An author biography or testimonials from real customers can
help boost your site’s trustworthiness and reputation.
High-quality: Your
site’s content should be unique, specific and high-quality. It should not be
mass-produced or outsourced on a large number of other sites. Keep in mind that
your content should be created primarily to give visitors a good user
experience, not to rank well in search engines.
Engaging: Bring
color and life to your site by adding images of your products, your team or
yourself. Make sure visitors are not distracted by spelling, stylistic and
factual errors. An excessive number of ads can also be distracting for
visitors. Engage visitors by interacting with them through regular updates,
comment boxes or social media widgets.
Google’s Panda algorithm
Panda algorithmically
assessed website quality. The algorithm targeted many signals of low-quality
sites but again didn’t provide much in the way of useful information for
positive signals.
Google’s Search Quality Rating Guidelines
There
were a lot of signals for both high- and low-quality content and websites in
the Google Search Quality Rating Guidelines. The document is worth reading in
its entirety multiple times, but I pulled out some of the important parts here:
What
makes a High-quality page? A High-quality page may
have the following characteristics:
*High level of Expertise,
Authoritativeness and Trustworthiness (E-A-T)
*A satisfying amount of high quality MC (Main
Content)
*Satisfying website information and/or
information about who is responsible for the website, or satisfying customer
service information if the page is primarily for shopping or includes financial
transactions
*Positive website reputation for a website that is responsible
for the MC on the page
They
expand further on the concept of E-A-T. This was the part of the guidelines I
found the most interesting and relevant in determining quality of content (or a
website in general).
6.1 Low
Quality Main Content
One of
the most important criteria in PQ (Page Quality) rating is the quality of the
MC, which is determined by how much time, effort, expertise and talent/skill
have gone into the creation of the page and also informs the E-A-T of the page.
Consider
this example: Most students have to write papers for high school or college.
Many students take shortcuts to save time and effort by doing one or more of
the following:
*Buying papers online or getting someone else
to write for them
*Making things up
*Writing quickly, with no drafts or editing
*Filling the report with large pictures or
other distracting content
*Copying the entire report from an
encyclopedia or paraphrasing content by changing words or sentence structure
here and there
*Using commonly known facts, for example,
“Argentina is a country. People live in Argentina. Argentina has borders.”
*Using a lot of words to communicate only basic ideas or facts,
for example, “Pandas eat bamboo. Pandas eat a lot of bamboo. Bamboo is the best
food for a Panda bear.”
I found the part about large images amusing. I’m not a fan of hero images unless
they are exceptional. Unfortunately, most end up being generic. Some
publications make it worse and use generic hero sliders. Remember, there is an
algorithm for “above-the-fold” content, and I feel like hero images work
directly against it: most push the useful content down so that users
have to scroll to reach it.
In
section 7.0, “Lowest Quality Pages,” Google notes that the following types of
pages/websites should receive the lowest quality rating:
*Harmful or malicious pages or websites
*True lack of purpose pages or websites
*Deceptive pages or websites
*Pages or websites which are created to make
money with little to no attempt to help users
*Pages with extremely low or lowest-quality MC
*Pages on YMYL websites that are so lacking in
website information that it feels untrustworthy
*Hacked, defaced or spammed pages
*Pages or websites created with no expertise
or pages which are highly untrustworthy, unreliable, unauthoritative,
inaccurate or misleading
*Websites which have extremely negative or
malicious reputations
*Violations of the Google Webmaster Quality Guidelines
Speaking
more specifically about page content in section 7.4, “Lowest Quality Main
Content,” the guidelines note that the following types of Main Content (MC)
should be judged as lowest quality:
*No helpful MC at all or so little MC that the
page effectively has no MC
*MC which consists almost entirely of “keyword
stuffing”
*Gibberish or meaningless MC
*“Auto-generated” MC, created with little to
no time, effort, expertise, manual curation or added value for users
*MC which consists almost entirely of content
copied from another source with little time, effort, expertise, manual curation or added
value for users.
Finally,
in section 7.2, “Lack of Purpose Pages,” Google notes that:
Sometimes
it is impossible to figure out the purpose of the page. Such pages serve no
real purpose for users. For example, some pages are deliberately created with
gibberish or meaningless (nonsense) text. No matter how they are created, true
lack of purpose pages should be rated lowest quality.
I love
how these sections are all basically saying that your page needs to have a
purpose and be understood. I’ve seen many marketing pages that use so much
lingo, jargon or marketing-speak that even people at the company can’t tell you
what the page is about. What’s worse is when good content is stripped away to
make more of these kinds of pages.
There
are also some interesting snippets regarding the different elements and signals
of trust that might need to be included based on the type of website. This
information is extremely important, and it’s easy to brainstorm the different
website elements that a local business would need (such as “about us” or
“contact”), compared to an e-commerce store that might need reviews,
pricing and so forth.
The
point is that you need to understand the questions your customers are asking
and provide that information to them.
12.7 Understanding User Intent
It can
be helpful to think of queries as having one or more of the following intents.
*Know query,
some of which are Know Simple queries
*Do query,
some of which are Device Action queries
*Website query,
when the user is looking for a specific website or webpage
*Visit-in-person query, some of
which are looking for a specific business or organization, some of which are
looking for a category of businesses
The
above is very similar to the standard “informational, navigational and
transactional” system, but I like this better.
Google
elaborates on the idea of matching user intent with the purpose of the
page elsewhere in the document — section 2.2, “What is the Purpose of a
Webpage?” lists the following common page purposes:
*To share information about a topic
*To share personal or social information
*To share pictures, videos or other forms of
media
*To express an opinion or point of view
*To entertain
*To sell products or services
*To allow users to post questions for other
users to answer
*To allow users to share files or to download
software
Boom! Jackpot. Matching the user intent with the purpose of a page and type of content
expected is exactly what I’m looking for in trying to determine quality.
This
makes sense if you think about it from the standpoint of semantic search. If I’ve
got a product page, and the top results for the keyword I’m targeting are all
informational in nature, then I need to either create an
informational page or add more information to my product page if I want to
compete at all.
I see
this mismatch often when people ask why they’re not ranking for a specific
term.
Google’s guidance on building high-quality
websites
Even
before the Quality Raters Guidelines, way back in 2011, there was this
gem on the Google Webmaster Central Blog that told us the
questions Google engineers asked themselves when building the algorithm.
*Would you trust the information presented in
this article?
*Is this article written by an expert or
enthusiast who knows the topic well, or is it shallower in nature?
*Does the site have duplicate, overlapping or
redundant articles on the same or similar topics with slightly different
keyword variations?
*Would you be comfortable giving your credit
card information to this site?
*Does this article have spelling, stylistic or
factual errors?
*Are the topics driven by genuine interests of
readers of the site, or does the site generate content by attempting to guess
what might rank well in search engines?
*Does the article provide original content or
information, original reporting, original research or original analysis?
*Does the page provide substantial value when
compared to other pages in search results?
*How much quality control is done on content?
*Does the article describe both sides of a
story?
*Is the site a recognized authority on its
topic?
*Is the content mass-produced by or outsourced
to a large number of creators or spread across a large network of sites, so
that individual pages or sites don’t get as much attention or care?
*Was the article edited well, or does it
appear sloppy or hastily produced?
*For a health-related query, would you trust
information from this site?
*Would you recognize this site as an
authoritative source when mentioned by name?
*Does this article provide a complete or
comprehensive description of the topic?
*Does this article contain insightful analysis
or interesting information that is beyond the obvious?
*Is this the sort of page you’d want to
bookmark, share with a friend or recommend?
*Does this article have an excessive amount of
ads that distract from or interfere with the main content?
*Would you expect to see this article in a
printed magazine, encyclopedia or book?
*Are the articles short, unsubstantial or
otherwise lacking in helpful specifics?
*Are the pages produced with great care and
attention to detail vs. less attention to detail?
*Would users complain when they see pages from this site?
Once
again, spelling, factual errors and content quality control are mentioned, just
like in the Google Search Quality Rating Guidelines. There are also a couple of
questions about a site being recognized as an authority on the topic or an
authority in general.
Additionally, there are questions that ask whether the author knows the topic
well, whether the content is unique and how comprehensively the topic is
covered. This matches up perfectly with the E-A-T concept from the Search
Quality Rating Guidelines.
Some content quality signals you can
control
*Broken links. Crawl
your site with a program like Screaming Frog and
fix them.
*Wrong information. Do
research and find the right sources.
*Grammatical mistakes. You
can use a tool like Grammarly or
have someone proofread your writing.
*Spelling mistakes. Use
spell-check or an editor.
*Reading level. The Hemingway App is
good for this. You should be adjusting your reading level based on your target
audience and the intent of the query.
*Excessive ads. Just
don’t.
*Page load speed. Go read
this.
*Website features. The
features you should have will change depending on the type of website and the
intent of the query.
*Matching the user intent
with the purpose of a page and type of content expected. Take
a look at the search results to see what is already ranking.
*Authority and comprehensiveness. Keep
reading.
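Of the signals in the list above, reading level is one you can roughly estimate yourself. The sketch below uses the standard Flesch Reading Ease formula with a crude vowel-group syllable count. It is only an illustration, not how Hemingway App computes its grade, and the function names and sample sentences are assumptions made up for this example. Higher scores mean easier text.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


# Hypothetical sample texts: one plain, one jargon-heavy.
simple = "Pandas eat bamboo. Pandas eat a lot of bamboo."
dense = (
    "Algorithmic evaluation of informational comprehensiveness "
    "necessitates sophisticated computational methodologies."
)

print(round(flesch_reading_ease(simple), 1))
print(round(flesch_reading_ease(dense), 1))
```

The absolute numbers depend heavily on the syllable heuristic, so treat the score as a way to compare drafts against each other or against pages that already rank, not as a target in itself.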
There
are things outside of your control in the short term, but you can play the long
game and continue to build your authority over time by consistently creating
comprehensive content.
At SMX
West, I briefly showed a
way of identifying all topics/subtopics in an industry and how to completely
cover these based on keyword groupings. I believe that if you’re covering
everything that’s being searched for and answering every question that people
are asking about a topic, then you have a complete answer, and it will be the
best answer for a search engine to return in the results.
How do I determine quality content?
I want
to share a little more about my actual process and what I look for on a page or
a section of the site as it relates to the content of the page. Besides
technical on-page elements, these are what I usually look for in the content
itself:
*Concepts and entities
*Co-occurrence of keywords/phrases
*Topical completeness
Concepts and entities
We know
that Google looks for concepts and entities in the content, so I usually start
here. I use Alchemy
API for this.
If I enter the page from Google about creating valuable content — https://support.google.com/webmasters/answer/6001093?hl=en — I get back some
information on entities such as Search Console, search engines, Google and
social media. Concepts returned are website, Google search, PageRank, web
search engine, Bing and Google. Keyword relevance is also returned through
Alchemy.
If you
run many of the top ranking websites for a search query through Alchemy API,
you will find a lot of overlap that indicates useful data. There are likely
consistent concepts and entities that you would want to include in the body of
your text. Alchemy has a JSON output, and I know a lot of people use Blockspring
to pull it into Google Sheets.
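Once the entities and concepts for each top-ranking page are exported, finding the overlap is a simple counting exercise. The sketch below assumes you already have the extracted terms per page in hand; it makes no calls to Alchemy API, and the page labels, the sample terms and the `shared_terms` helper are all hypothetical.

```python
from collections import Counter


def shared_terms(pages: dict[str, set[str]], min_share: float = 0.5) -> list[tuple[str, int]]:
    """Return (term, page_count) pairs for terms found on at least
    min_share of the pages, most widely shared first."""
    counts: Counter[str] = Counter()
    for terms in pages.values():
        # Lowercase and de-duplicate so each page counts a term once.
        counts.update({t.lower() for t in terms})
    threshold = min_share * len(pages)
    shared = [(term, n) for term, n in counts.items() if n >= threshold]
    return sorted(shared, key=lambda item: (-item[1], item[0]))


# Hypothetical entity/concept sets for three ranking pages.
pages = {
    "page-a": {"Google", "Search Console", "social media"},
    "page-b": {"Google", "search engines", "Search Console"},
    "page-c": {"google", "Bing"},
}

for term, n in shared_terms(pages, min_share=0.6):
    print(f"{term}: on {n} of {len(pages)} pages")
```

Terms that clear the threshold are candidates to cover in your own copy; terms that appear on only one page are more likely to be noise.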
Co-occurrence of keywords and phrases
Ultimate Keyword Hunter lists the words and phrases that are used most often
on the pages. I normally sort by co-occurrence across websites and find that
two-, three- and four-keyword phrases are usually the most useful.
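The idea of sorting by co-occurrence across websites can be sketched in a few lines. This is not how Ultimate Keyword Hunter works internally, which the article does not describe; it is a minimal illustration that counts, for every two-, three- and four-word phrase, how many pages use it. The sample page texts and function names are hypothetical.

```python
import re
from collections import Counter


def ngrams(text: str, n: int) -> set[str]:
    """Return the set of n-word phrases found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrases_by_site_count(pages: dict[str, str], sizes=(2, 3, 4)) -> list[tuple[str, int]]:
    """Count how many pages contain each phrase, most widely shared first."""
    counts: Counter[str] = Counter()
    for text in pages.values():
        seen: set[str] = set()
        for n in sizes:
            seen |= ngrams(text, n)
        # Each page contributes at most one count per phrase.
        counts.update(seen)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical body text from three ranking pages.
pages = {
    "site-a": "how to train a dog",
    "site-b": "train a dog at home",
    "site-c": "dog food reviews",
}

for phrase, n in phrases_by_site_count(pages)[:5]:
    print(f"{phrase}: {n} sites")
```

Counting pages rather than raw occurrences keeps one keyword-stuffed page from dominating the list.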