DIY: Duplicate content check


Duplicate content might confuse Google. If the same content appears on multiple pages, on your own site or on other websites, Google won’t know which one to rank first. Prevent duplicate content as much as possible. Perform a duplicate content check every now and then to find copied content.

In the RSS section of our Yoast SEO plugin, we have predefined a snippet to add to your feed entries saying “This article first appeared on yourwebsite.com”. The link in this snippet makes sure that every scraper that republishes your feed includes a link to the original article. This already helps to prevent duplicate content issues, as Google will find that backlink to your website.

Nevertheless, if you write awesome content, your content will be duplicated. And that copy won’t always include a link to your website. All the more reason to do a duplicate content check on a regular basis. In this article, I will show you quick ways to find duplicate content for your website.

CopyScape duplicate content checker

There are a lot of tools to find duplicate content. One of the best-known duplicate content checkers is probably CopyScape.com. This tool is easy to use: insert a link and CopyScape tells you where else the content on that page can be found:

CopyScape: duplicate content checker

That’s step one. It will return a number of results (9 in this case), presented like Google’s search result pages. Simply click one for more details.

duplicate content checker - CopyScape

In this case, 2% of the Creativ Form page is copied from our website. CopyScape nicely highlights the text it found to be duplicate. By doing so, this duplicate content checker gives you an idea of how severe the copying is. If it’s just 2% of the page, like in this case, I wouldn’t worry. If it’s over 40%, that is quite a large part of the other page, and I would simply email them and ask them to change the copied text.
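To make the percentage idea concrete, here is a minimal Python sketch that estimates what share of a page’s words also appears, in runs of a few consecutive words, in your own text. This is not how CopyScape calculates its numbers (that method isn’t described here). The function names, the 5-word run length and the sample texts are all assumptions for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def copied_share(page_text, source_text, n=5):
    """Return the fraction of words in page_text that sit inside an
    n-word run that also appears in source_text.
    The run length n=5 is an arbitrary choice, not a tool setting."""
    page = words(page_text)
    source = words(source_text)
    # All n-word runs ("shingles") that occur in the source text.
    source_runs = {tuple(source[i:i + n]) for i in range(len(source) - n + 1)}
    covered = [False] * len(page)
    for i in range(len(page) - n + 1):
        if tuple(page[i:i + n]) in source_runs:
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(page) if page else 0.0

# Hypothetical sample texts, purely for illustration.
ours = "duplicate content might confuse google when it ranks pages"
theirs = "we sell forms and duplicate content might confuse google today"
print(f"{copied_share(theirs, ours):.0%}")  # 50%
```

A low share like the 2% above is nothing to worry about. For a share above roughly 0.40, the rule of thumb from this article would be to contact the other site.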

By the way, dear Creativ Form. If you want to copy our content, please tailor it to your website. “In this article” makes absolutely no sense in this case 😛


By the way, we frequently find that manufacturer descriptions used in online shops are duplicate content. Usually, these are automatically imported into the shop’s content management system, and usually not just into yours but into other shops’ systems as well. Be aware of this. I understand it’s quite the hassle to write unique product descriptions for every product, but at least start with your best-selling products and take it from there. Start now.

Use the CopyScape duplicate content checker to find copied content from your website on other websites. Again, it’s one of many tools but this one’s free and easy to use. If you want to dive a bit deeper into your duplicate content, CopyScape also offers a premium version for more insights at 5c per search.

Siteliner internal duplicate content check

Siteliner is CopyScape’s brother that searches for internal duplicate content. This duplicate content checker will find duplicate content on your own site. A very common example of this is when a WordPress blog doesn’t use excerpts but shows the entire blog post on the blog’s homepage. That simply means that the blog post is available on at least two pages: the homepage and the post itself. And probably on the category and tag overview pages as well. That’s four versions of the same article on your own website already.

The advantage of using excerpts is that the excerpt always has a proper link to the post. This link will tell Google that the original content is not on that blog/category/tag page but in the post itself. I think we recommend the use of excerpts in half of all the WordPress website reviews we do. That also means half of the websites actually have this internal duplicate content issue.

The Siteliner duplicate content check will show you a lot of things, but the free version is limited to 250 pages and 30 days. Again, there is a premium version, but the free one will already give you a good idea. Just do a search, find the overview page and click through to the details. Don’t get scared by high numbers of internal duplicate content, as this duplicate content check even tells you the excerpts are duplicate content:

SiteLiner: internal duplicate content check

Percentages

Where Google understands what a sidebar is, CopyScape and Siteliner seem to include all text on a page in their percentage calculations. Please keep this in mind when you use one of these duplicate content checkers. The actual percentage of the duplicate content, when just looking at the main content of a page, might be higher. Just a heads-up!

Am I worried? No. Simply click one of the links and check if it’s indeed the excerpt (it is). The total of the matched words is 223, but in fact, the ‘duplicate part’ is just 57 words of 1,086 words in total in the main content section of that article. And the excerpt obviously links to the post, so we’re covered.
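As a quick check of the numbers above, the share of the main content that is really duplicated can be worked out in a few lines of Python. The figures come from this article; the variable and function names are just illustrative.

```python
def duplicate_share(duplicate_words, total_words):
    """Fraction of the content that is duplicated."""
    return duplicate_words / total_words

matched_words_reported = 223  # total matched words reported by the tool
duplicate_words_main = 57     # words of the excerpt that are really duplicated
main_content_words = 1086     # words in the main content section of the article

share = duplicate_share(duplicate_words_main, main_content_words)
print(f"{share:.1%}")  # 5.2%
```

So only about 5% of the article’s main content is matched elsewhere on the site, and that match is the excerpt linking back to the post.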

Manual duplicate content check

CopyScape and Siteliner are nice, easy-to-use duplicate content checkers. However, if you want to see what’s duplicate according to Google, you could also use Google itself.

If you have a certain page that you’d like to check, simply go to that page. Copy a text snippet, preferably from a section that you think might be attractive for others to copy. Insert the exact snippet in Google using double quotation marks like this:

Duplicate content check in Google

“WordPress is one of the best, if not the best content management systems when it comes to SEO. That being said, spending time on your WordPress SEO might seem like a waste”. Limit that phrase to 32 words, as Google will only take the first 32 words into account. This search query returns ‘about 517 results’ according to Google, which is well over the 9 results CopyScape returned.
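If you run this manual check regularly, a small helper can trim a snippet to 32 words, wrap it in double quotes and turn it into a search link. The sketch below is a minimal Python version. The function names are made up for this example, and it only builds the URL for you to open in a browser; it does not fetch or parse any results.

```python
from urllib.parse import quote_plus

MAX_QUERY_WORDS = 32  # the word limit mentioned in this article

def exact_match_query(snippet, max_words=MAX_QUERY_WORDS):
    """Keep the first max_words words and wrap them in double quotes."""
    kept = snippet.split()[:max_words]
    return '"' + " ".join(kept) + '"'

def google_search_url(snippet):
    """Build a Google search URL for the quoted, trimmed snippet."""
    return "https://www.google.com/search?q=" + quote_plus(exact_match_query(snippet))

snippet = (
    "WordPress is one of the best, if not the best content management "
    "systems when it comes to SEO. That being said, spending time on "
    "your WordPress SEO might seem like a waste"
)
print(exact_match_query(snippet))
print(google_search_url(snippet))
```

Trimming on whitespace keeps punctuation attached to the words, which is what you want for an exact-match search.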

Check your own duplicate content

Use a duplicate content checker like CopyScape to find what has been copied from your site, and use Google to see where else on the internet this content ended up. These are simple tools that serve a higher goal: to prevent duplicate content. If you want to read more on duplicate content, start with our Duplicate content: causes and solutions article. Or visit our duplicate content tag page.

Read more: ‘rel=canonical: the ultimate guide’ »

 
