6 Ways to Remove Duplicate Content on WordPress

Google’s algorithm updates and penalties have caused a lot of panic and have cost websites heavy losses in traffic and revenue. While a site can be penalized for many reasons, one of the most common is duplicate content.

Duplicate content is one of the major issues that every webmaster may face, largely because of the complex content structure used in the WordPress CMS.

How do Search Engines track Duplicate Content?

In the eyes of search engine bots, two or more pages that have the same or largely identical content under different URLs are duplicate content. The main function of a search engine is to return unique and relevant results, so search engines don’t appreciate sites with lots of duplicate content.

Google not only penalizes the single piece of duplicate content on your site, but can penalize the entire website, reducing overall site performance and ranking. Duplicate content doesn’t just mean a copy of a whole page; it can also be a copy of the title, meta tags, and other elements.

You will also face a duplicate content issue if someone copies your content and republishes it on a different URL.

In most cases, nobody notices the issue until the whole site is penalized. This post offers some advice for dealing with duplicate content issues on WordPress.

6 Common Causes of Duplicate Content

If your site is penalized for duplicate content, you will notice a drop in rankings across the whole website, not just on some pages.

If you have a lot of duplicate pages, bots also take longer to find and index your newer content. Search engines find and index only a certain number of pages each time they crawl, so duplicates use up part of that allowance and lower the effective crawl rate.

You can use Google Webmaster Tools to check for duplicate content warnings and to analyse the crawl rate and indexing stats (under Diagnostics > Crawl Stats). This shows how many duplicate pages you have, and you can download the data, which saves time in identifying the affected pages.

Below are the most common causes of duplicate content on WordPress blogs.

  • “?replytocom=” links (comment reply links)
  • Pages with same title and meta tag elements
  • Your blog post is copied and republished elsewhere
  • Category, tag, author and archive pages being indexed
  • Post image attachment link
  • Comment pagination

You should also read Google’s take on duplicate content.

6 Ways to Fix and Avoid Duplicate Content

Below are solutions to the 6 common causes of duplicate content, along with tools and resources you can use to fix the problem.

  • “Replytocom” Parameter in Links:

Example: yourwebsite.com/article-title/?replytocom=654

This is one of the most common issues with WordPress. On posts with a lot of comments, each comment gets a reply link that carries its comment ID, and when bots access these URLs they find the same content on every one of them.

Search engines don’t know which version to include in or exclude from their indices, and they don’t know which version to rank for a query; hence, many of these links, including the original link, are pushed down in search results.
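To see why these URLs count as duplicates, here is a minimal Python sketch that strips the “replytocom” parameter and shows that several comment reply links collapse into one page. The function name `strip_replytocom` and the sample URLs are assumptions made up for illustration; this is not part of WordPress or of any plugin.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_replytocom(url: str) -> str:
    """Return the URL without its replytocom parameter and without a fragment."""
    parts = urlsplit(url)
    # Keep every query parameter except replytocom.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() != "replytocom"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Hypothetical URLs a bot might discover on one post.
urls = [
    "https://yourwebsite.com/article-title/",
    "https://yourwebsite.com/article-title/?replytocom=654",
    "https://yourwebsite.com/article-title/?replytocom=655#respond",
]
canonical = {strip_replytocom(u) for u in urls}
print(canonical)
```

All three URLs reduce to the same address, which is the single page a search engine should be indexing.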

Solution:

Remove “replytocom” links from Google’s index. Search “site:yoursite.com inurl:replytocom” to check whether “replytocom” links from your blog are already indexed in search results. If such links are indexed, you should remove them via Google Webmaster Tools.

To stop such links from being indexed by search engines in future, install the Replytocom redirect WordPress plugin, and also refer to the WordPress duplicate content fix.

You could also add the rule below to robots.txt to stop bots from crawling links with the “replytocom” parameter.

User-agent: *
Disallow: /*?replytocom
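Before relying on a wildcard rule like this, it can help to check which paths it would actually catch. The Python sketch below assumes Google-style matching, where “*” matches any run of characters and a trailing “$” anchors the end of the URL; the helper `rule_matches` is a hypothetical name, and the pattern is written with a leading slash. Other crawlers may interpret wildcards differently.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path plus query string.

    Assumes '*' matches any run of characters, a trailing '$' anchors
    the end, and matching starts at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

rule = "/*?replytocom"
for path in ["/article-title/?replytocom=654",
             "/article-title/",
             "/article-title/?page=2&replytocom=654"]:
    print(path, "blocked" if rule_matches(rule, path) else "allowed")
```

Under these assumptions the rule only catches URLs where “replytocom” is the first query parameter; the last sample path, where it follows another parameter, is not blocked.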

  • Duplicate Title and Meta Tags Fix:

When you have a huge number of posts, some of them can easily end up with exactly the same title as another post. To solve this issue, use the Duplicate post remover plugin.
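If you have exported a list of post URLs and titles, a short script can flag the repeats before you clean them up. This is a minimal Python sketch; the function `find_duplicate_titles` and the sample posts are assumptions for illustration and have nothing to do with how the plugin works internally.

```python
from collections import defaultdict

def find_duplicate_titles(posts):
    """Group post URLs by normalized title; return only titles used more than once."""
    by_title = defaultdict(list)
    for url, title in posts:
        # Normalize case and collapse repeated whitespace before comparing.
        key = " ".join(title.lower().split())
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}

# Hypothetical export of (URL, title) pairs.
posts = [
    ("/seo-tips/", "10 SEO Tips"),
    ("/seo-tips-2/", "10  seo tips"),
    ("/about/", "About"),
]
print(find_duplicate_titles(posts))
```

Titles are compared case-insensitively here, on the assumption that titles differing only in capitalization or spacing should be treated as the same title.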

  • Prevent Content Theft

This issue is very common; copycats will copy any good blog.

Solution:

Show only a summary in RSS feeds (Settings > Reading > For each article in a feed, show: select “Summary”), and use tools like Copyscape.com, PlagSpotter.com and the Dooplee duplicate content checker plugin to find copied posts.
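If you already suspect a particular page of copying one of your posts, you can get a rough measure of the overlap yourself. The Python sketch below compares two texts by their shared word sequences (“shingles”) using Jaccard similarity. It is a generic illustration, not a description of how Copyscape, PlagSpotter or Dooplee work, and the function names and the shingle size are assumptions.

```python
import re

def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

original = ("Duplicate content worries are one of the major issues "
            "that every webmaster could face")
suspect = ("Duplicate content worries are one of the major issues "
           "that every blogger could face")
unrelated = "The quick brown fox jumps over the lazy dog"

print(similarity(original, suspect, size=3))
print(similarity(original, unrelated, size=3))
```

A score near 1.0 suggests a near-copy and a score near 0.0 suggests unrelated text; where to draw the line between them is a judgment call.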

  • Limit Categories, Tag, and Archive Pages

Each tag and category creates a new page, and each of these pages shows the title and the first paragraph of every post that belongs to that category or tag. This confuses bots when they crawl your site.

Using fewer tags and only a few important categories is good practice for avoiding duplicate WordPress content.

For example, to keep it simple, make “Colour” a category and use ‘red’, ‘green’ and ‘yellow’ as tags.

By default, a search engine crawls every section of your website, and if it finds the same content again, that content is treated as duplicate content.

Therefore, it is important to tell search engines which areas of your blog or website they should ignore. You can set canonical URLs, or noindex the pages you don’t want search engines to index.

For this, you can use the noindex tag. I also recommend the Robots Meta plugin; SEO plugins such as WP SEO have this feature built in.

You can also add this code to your ‘header.php’ file, inside the <head> section. It makes sure that only the home page, posts, pages and category pages are indexed by search engine spiders, while everything else (feeds, archives, etc.) is excluded. The code is:

<?php
// Index the first page of the blog home, single posts, pages and category
// archives; mark every other view noindex.
if ( ( is_home() && $paged < 2 ) || is_single() || is_page() || is_category() ) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>
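After adding a robots meta tag, it is worth confirming that each type of page sends the directive you expect. The Python sketch below parses a page’s HTML and reports whether it is marked noindex. The class `RobotsMetaParser`, the function `is_indexable` and the sample HTML strings are assumptions for illustration; in practice you would feed in the HTML source of your own pages.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.extend(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def is_indexable(html: str) -> bool:
    """Return False if the page carries a noindex robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives

# Hypothetical page sources for a single post and a tag archive.
post_page = '<head><meta name="robots" content="index,follow" /></head>'
tag_page = '<head><meta name="robots" content="noindex,follow" /></head>'
print(is_indexable(post_page), is_indexable(tag_page))
```

A page with no robots meta tag at all is reported as indexable, on the assumption that search engines index by default.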

  • Image Attachment Link

This is another issue which creates duplicate pages. When you add an image to a post, you have some link options:

  1. link to file URL
  2. link to post
  3. link to the image attachment

Never link an image to its attachment page. If you already have a lot of posts whose images link to attachment pages, a simple fix is to install the attachment pages redirect plugin.

  • Comment Pagination

WordPress 2.7 introduced comment pagination, which splits comments across several pages when a post has so many comments that they slow the page down.

The problem is that each comment page repeats the content that people are commenting on. You can fix this under Settings > Discussion by unchecking the ‘Break comments into pages’ option.

I hope this post has helped you fix duplicate content issues on your blog. Let us know about your experience in dealing with duplicate content.
