Avoiding Google Penalty for Duplicate Content

WordPress sites usually expose the same content under several different URLs. Googlebot has grown more sensitive to this and flags such pages in Google Webmaster Tools as having a “Duplicate Title Tag”. Once a page is flagged this way, it can be dropped from the search engine results.

The effect is that your site, or at least the pages with duplicate title tags, will not be shown. And once that happens, your traffic will drop as the visitors referred by the search engines disappear!

The easy solution is to use the robots.txt file to tell Google about this and have it skip indexing the duplicate pages. For example:

eBusiness Adviser | Network Monitoring Service

/business-process/network-health-monitoring-service/
The excerpt above was taken from Google Webmaster Tools and shows the single page for “Network Monitoring Service” being flagged by Google for having “Duplicate Title Tags”. In this case, it has three duplicates!
The solution, as we have found, is to put the following entries in the robots.txt file:
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /*?*
Disallow: /*?
Disallow: /*page/*
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /comments
Disallow: /feed
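After uploading the file, it is worth confirming that robots.txt is actually being served from your site root and still contains the new rules. Here is a minimal Python sketch of that check; https://example.com is a placeholder for your own domain:

import urllib.request

# Placeholder URL: substitute your own site's root domain.
ROBOTS_URL = "https://example.com/robots.txt"

# A few of the Googlebot rules from the file above.
EXPECTED_RULES = ["Disallow: /*?", "Disallow: /*/feed", "Disallow: /*/trackback"]

with urllib.request.urlopen(ROBOTS_URL) as response:
    robots_txt = response.read().decode("utf-8", errors="replace")

missing = [rule for rule in EXPECTED_RULES if rule not in robots_txt]
print("Missing rules:", ", ".join(missing) if missing else "none")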
These Disallow patterns stop Googlebot from indexing any page they match. So, taking the situation above, the page at /?p=183 won't be indexed by Google because of the “Disallow: /*?” directive, which blocks every URL containing a query string.
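To see how the wildcard matching works, here is a minimal Python sketch of the Googlebot-style rules: an asterisk matches any run of characters, and each pattern is anchored at the start of the URL path (the “$” end-of-URL anchor is left out for brevity). The /?p=183 path comes from the example above; the other test paths are hypothetical WordPress duplicates:

import re

def googlebot_match(pattern, path):
    # Split on "*", escape the literal pieces, and rejoin with ".*"
    # so each "*" matches any run of characters. Anchor at the start.
    regex = "^" + ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.match(regex, path) is not None

DISALLOW = ["/*/trackback", "/*/feed", "/*/comments", "/*?*", "/*?", "/*page/*"]

TEST_PATHS = [
    "/?p=183",    # query-string permalink from the example above
    "/business-process/network-health-monitoring-service/feed/",
    "/business-process/network-health-monitoring-service/trackback/",
    "/page/2/",   # hypothetical paged archive view
]

for path in TEST_PATHS:
    blocked = any(googlebot_match(rule, path) for rule in DISALLOW)
    print(path, "-> blocked:", blocked)

Running this prints blocked: True for every path, confirming that the query-string, feed, trackback, and paged duplicates are all covered by the rules above.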
Here are some great ideas on how to make a WordPress blog duplicate-content safe.