There Is No Duplicate Content Penalty

March 24, 2010

By Jill Whalen

Dec. 2012 Update: Please see this updated article about Duplicate Content and Google Penalties.

The SEO industry has been plagued for years by a lack of consistency with SEO terms and definitions. One of the most prevalent inaccurate terms we hear is "duplicate content penalty." While duplicate content is not something you should strive for on your website, there's no search engine penalty for having it.

Duplicate content has been and always will be a natural part of the Web. It's nothing to be afraid of. If your site has some dupe content for whatever reason, you don't have to lose sleep every night worrying about the wrath of the Google gods. They're not going to shoot lightning bolts at your site from the sky, nor are they going to banish your entire website from ever showing up for relevant searches.

They are simply going to filter out the dupes.

The search engines want to index and show their users (the searchers) as much unique content as algorithmically possible. That's their job, and they do it quite well considering what they have to work with: spammers using invisible or irrelevant content, technically challenged websites whose pages crawlers can't easily find, copycat scraper sites that exist only to obtain AdSense clicks, and a whole host of other such nonsense.

There's no doubt that duplicate content is a problem for search engines. If a searcher is looking for a particular type of product or service and is presented with pages and pages of results that provide the same basic information, then the engine has failed to do its job properly. In order to supply users with a variety of information on their search query, search engines have created duplicate content "filters" (not penalties) that attempt to weed out the information they already know about. Certainly, if your page is one of those that is filtered, it may very well feel like a penalty to you, but it's not – it's a filter.

Penalties Are for Spammers

Search engine penalties are reserved for pages and sites that are purposely trying to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers the hanky-panky through spam reports and other means. To many people's surprise, penalties rarely happen to the average website. Sites that receive a true penalty typically know exactly what they did to deserve it. If they don't, they haven't been paying attention.

Honestly, the search engines are not out to get you. If you have a page on your site that sells red hats and another very similar page selling blue hats, you aren't going to find your site banished off the face of Google. The worst thing that will happen is that only the red hat page may show up in the search results instead of both pages showing up. If you need both to show up in the search engines, then you'll need to make them substantially unique.

Suffice it to say that just about any content that is easily created without much human intervention (i.e., automated) is not a great candidate for organic SEO purposes.

Article Reprints

Another duplicate-content issue that many are concerned about is the republishing of online articles. Reprinting someone's article on your site is not going to cause a penalty. While you probably don't want every article on your site to be a reprint of someone else's, if the reprints are helpful to your site visitors and your overall mission, then it's not a problem for the search engines.

If your own bylined articles are getting published elsewhere, that's a good thing. You don't need to provide a different version to other sites or not allow them to be republished at all. The more sites that host your article, the more chances you have to build your credibility as well as to gain links back to your site through a short bio at the end of the article. In many cases, Google doesn't even filter out duplicate articles in searches, but even if they eventually show only one version, it's still okay.

Inadvertent Multiple URLs for the Same Content

Where duplicate content CAN be a problem is when a website shows essentially the same page on numerous URLs. WordPress blogs often fall victim to this when multiple tags or categories are chosen to label any one blog post. The blog software then creates numerous URLs for the same article, depending on which category or tag a user clicked to view it. While this type of duplicate content won't cause a search engine penalty, it will often split the overall link popularity of the article, which you don't want.
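To make this concrete, here's the kind of URL sprawl a single WordPress post can end up with (the domain, slug, and permalink settings here are purely hypothetical):

    # One post assigned to two categories, with /%category%/%postname%/ permalinks:
    http://example.com/fashion/red-hats-vs-blue-hats/
    http://example.com/accessories/red-hats-vs-blue-hats/

    # Plus the default query-string address for the same post:
    http://example.com/?p=123

Each of those addresses returns the same article, so any links the article earns get divided among them.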

Any backend system or CMS that creates numerous URLs for any one piece of content can indeed be a problem for search engines, because it makes their spiders do more work. It's silly to have the spider finding the same information over and over again, when you'd rather have it finding other, unique information to index. This type of unintended duplicate content should definitely be cleaned up either through 301-redirects or by using the canonical link element (rel=canonical).
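As a minimal sketch of both fixes (assuming an Apache server and the hypothetical URLs from above), the 301-redirect goes in your .htaccess file, while the canonical link element goes in the <head> of each duplicate page:

    # .htaccess (Apache): permanently redirect the duplicate URL to the preferred one
    Redirect 301 /accessories/red-hats-vs-blue-hats/ http://example.com/fashion/red-hats-vs-blue-hats/

    <!-- Or, in the <head> of each duplicate page, name the preferred URL: -->
    <link rel="canonical" href="http://example.com/fashion/red-hats-vs-blue-hats/" />

The redirect physically consolidates the URLs into one; the canonical element leaves the duplicates in place but tells the search engines which version should receive the credit.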

When it comes to duplicate content, the search engines are not penalizing you or thinking that you're a spammer; they're simply trying to show some variety in their search results pages and don't want to waste time indexing content they already have in their databases.


---
Jill Whalen is the CEO of High Rankings, a Boston SEO Consulting Agency.

If you learned from this article, be sure to sign up for the High Rankings Advisor SEO Newsletter so you can be the first to receive similar articles in the future!
 
 
Comments

 Michael Cottam said:
Great article Jill...I would add that duplicate content can cause an issue in terms of crawl depth. If you've got extensive multiple-URL structures for your content, it can affect the amount of content that the search engines (Google, at least) are willing to crawl and index.

Similarly, having a lot of pages with duplicate title and description tags, or lots of 404s or 500s, or an overly slow load time for your site, all appear to cause Google to crawl and index fewer pages. It's not clear to me whether these are completely separate factors in Googlebot's determination, or whether they combine into some sort of "quality score" that Google uses to determine how much crawl and index effort your site is worth. Clearly, Matt Cutts is avoiding specifics on whether there are set "cap" levels and what controls those caps.
 John said:
Hi Jill,

Another good article, although I describe the filter as a penalty to clients, as they seem to grasp that more easily.

I had to explain to one last week that if they have 4 different websites that all show the same content, there will be problems. Either the engines get confused and decide that, since they all came online at the same time, there is "something afoot" and it's some sort of attempt to game the results, or the engines rank some pages on one site and other pages on a second site, while the client really wants a 3rd site to be the one visitors come to, for branding reasons.
 Herman said:
"This type of unintended duplicate content should definitely be cleaned up either through 301-redirects or by using the canonical link element (rel=canonical)."

Please explain how to locate the duplicate content issues in WordPress and how to solve them.
 Daniel O'Neil said:
Hi,

What happens if the duplicate content is not on the same URLs, but on different domain names? Do the search engines compare content across domains?
 Jim Sullivan said:
Hi Jill,

Another great article. What about a client that is targeting multiple countries that speak the same language? Should they have just one website with one publication of each article? Or should they duplicate the USA content (.com site) on an Australian TLD (.com.au), for example?
 Jill Whalen said:
Thanks everyone for the great comments and questions. Let me address what we have so far, but keep 'em coming!

@Michael Cottam, that's correct, which is why you want to avoid the same content being found on multiple URLs within your site. It's silly to make the spider do all that work just to index what you already have indexed under a different URL.

@John, I agree that clients probably understand "penalty" better than "filter" but since it's not at all a penalty, I strongly dislike using the wrong terms. Penalty is a scary thing to clients, and in most cases, the dupe content is not the end of the world so there's no reason to scare them that way. With your 4 websites having the same content scenario, it's usually not that big of a deal since Google knows how to choose just one. But it's much better to choose the real one yourself and also not split any link popularity between the various domains. So it would be a much smarter idea to 301 redirect the extra domains to the real one. But if you don't, again, it's not the end of the world and Google won't strike them down dead!

@Herman, you can locate any dupe content issues, including those with WordPress, by using Google's site: search operator, as well as by taking a snippet of content from your pages, putting it in quotes, and searching for it in Google. Fixing it is another story. There are plug-ins for WP that do that. See our forum, where there's currently a discussion on this in one of the threads.
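For example (with a hypothetical domain and snippet), the two kinds of searches would look like this:

    site:example.com "an exact sentence copied from one of your pages"
    "an exact sentence copied from one of your pages"

The first query surfaces duplicates within your own site; the second finds copies anywhere on the web.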

@Daniel O'Neil, the search engines do compare content across domains, and that's where the filtering often comes in as discussed in the article.
 Joe said:
This is something I have been worried about all the time. We have a lot of similar products in our ecommerce store. We try to write unique content, but because the products are the same, the content ends up similar. It is good to know that we will not get punished for it. I understand this duplicate content issue much better now.
 Brett Borders said:
I understand your point - that there isn't apparently some kind of hard and fast "rule." I guess it's semantics... whether you want to call it a penalty or not... but I have seen some scraper sites and duplicate pages that REALLY don't rank well in the Google search results ;)
 poch said:
Hi Miss Whalen,

Since my site is primarily about news, I sometimes duplicate exact copies but always with a link to the source. I was always afraid, though, that someday I would get blacklisted. Now you've taken out that fear of mine. Thank you very much.

poch
 Jill Whalen said:
@Joe, you won't get punished for it, like I said, but you will have a hard time showing up in the search engines for your targeted keywords if you aren't creating your own product descriptions. One thing you can do to add unique content is gather some user-generated content, such as allowing comments or reviews of the products to be posted.

@Brett Borders, yes, of course scraper sites do poorly in the search results. They're filtered out! (And nobody is linking to them.)

@poch, it's fine if you're occasionally putting up news articles that appear elsewhere, but like I said in the original article, if that's all you have on your site, you're highly unlikely to show up in the search results. It's also important to note that those news articles will be unlikely to show up in the search results either. So as long as you're posting them because they add value to your site and provide interesting content to your readers, that's fine. Just don't do it for SEO purposes, as you'll be wasting your time, for the most part.
 Conrad Lopez said:
Thanks for a great article, Jill. I have used multiple categories for a long time on my WordPress blogs, and it never occurred to me there were SEO implications. Thanks for pointing that out. I'll be sharing this with some of my fellow bloggers.
 George said:
Hi Jill,
thanks for the post. I have a small doubt in my mind. Consider having 5 sites, each with a blog installed. When I post a blog entry on the main site, it gets posted automatically to the other 4 sites. In this case, the same blog entry is duplicated on the other 4 sites...
Will I be able to use the canonical link element (rel=canonical) in this case? If so, can you please explain?
 Jill Whalen said:
Yes George, you can use the canonical link element across domains these days if you'd like.
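As a minimal sketch (with a hypothetical URL), each of the 4 secondary copies of a post would include a line like this in its <head>, pointing at the original on the main site:

    <link rel="canonical" href="http://www.mainsite.com/blog/original-post/" />

That tells the engines which of the five copies should be treated as the one to index and rank.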
 Stephanie said:
A question about this topic - my client has several 'landing pages' with SEPARATE domain names, pointing all to the same site (not the same page).
Is this a type of site content duplication?
Does the robot see this site as, for example, 4 websites with the same content?

Thank you.
 Jill Whalen said:
@Stephanie, it depends on what you mean by "pointing." If you mean that they're 301-redirected, then the robots will never see the content that is "pointed."

But pointed means different things and can be done in different ways, so I can't answer the question with this limited information.
 SEO Doctor said:
I'm not sure getting one article on a lot of sites is going to help your link profile. I can submit 1 article to 5k article directories by automation, but I don't think it's worth it.
 John said:
I have to say that you're correct. Press releases are a classic example of "duplicate content," because huge chunks of text are largely republished in the name of news and not edited dramatically from the source.
 Jeff Selig said:
If duplicate content were truly an issue, the AP wire and Reuters would have been toast long ago.
 crockstar said:
Great article...

After a recent problem with duplicate content that a friend of mine was having, it would appear that the opposite may in fact be true. BOLD disclaimer - I am not trying to advocate duplicate content... to me it is blatant plagiarism if it is not cited.

However, to continue with the story: a friend of mine has recently been having trouble with his rankings for content which has been duplicated by other (less strong) sites. The article he wrote over 12 months ago does not really rank, yet as the topic went hot recently, the duplicating site took all the gains, because its copy was made quite recently and QDF and real-time search gave the nod to the newer content.

It would almost seem as though there is currently a perverse incentive to duplicate content on popular issues as they become hot. I wouldn't advise it, but it seems as though the major engines do reward it with the most recent algorithms.
 Mike said:
That's the first time I've heard this. For sure I get to hear 'duplicate content is spam' all the time. Now, I know why so many people get away with it.
 Andre Arnett said:
Thanks for the article. Always have a problem trying to figure out that duplicate content penalty. There seem to be so many versions as to what it can do. Thanks for a little clarification on this issue.
 Jill Whalen said:
@crockstar, your story is not the opposite of what I wrote in this article. It's exactly what I said. Some of the duplicates will be filtered out and others will be shown.

But it's not a penalty. Which is the point.

It doesn't make it suck any less, I'll give you that!
 stephanie said:
To further elaborate on my previous question.....
My client has several domain names that he uses for marketing purposes - they all reside on his major (main) site.

They point to specific landing pages that are part of the major website....

For example:
Majorsite.com is the website.

myproduct.com points to Majorsite.com/mycoolproduct.html
otherproduct.com points to Majorsite.com/myotherproduct.html
thirdproduct.com points to Majorsite.com/thirdproduct.html

All those pages, of course, have links to pages in the majorsite....

Does that help?
 Jill Whalen said:
@stephanie, I'm still a little confused, because you haven't explained what you mean by "pointing."

As long as by pointing, you mean that those additional domains are 301-redirected to a specific page on Majorsite.com then there will be no problems.

There are so many ways of "pointing" a URL to another, but the only good way is via a 301-redirect.
 stephanie said:
I just use the cPanel admin on the site to create an "Addon Domain", which has a redirection URL, which I redirect to the landing page on the major site. I don't know the underlying functions it is doing... it is just done by the webhosting service company.

I don't know if we are talking the same thing - I need to look at the 301-redirect more...

thanks for your help.
 Jill Whalen said:
You can check the URL that you are redirecting via a "server header checker" to see whether it's a 301 redirect or something else. If you Google server header checker you should find some suitable ones.
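For example, checking from a command line with curl (using Stephanie's hypothetical domains), a properly configured 301 would come back looking something like this:

    $ curl -I http://myproduct.com/
    HTTP/1.1 301 Moved Permanently
    Location: http://Majorsite.com/mycoolproduct.html

If the first line shows 200 OK instead, the "pointing" isn't a 301-redirect, and the same content is effectively being served at two addresses.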
 stephanie said:
thanks - learned something new!
 Adam Melson said:
Definitely glad this article came out. I think there's been a scare that sites will get penalized, when really duplicate content just wastes the number of pages you'll have crawled. You're not doing anything wrong with duplicate content, just being redundant, and having a page shoved into the omitted results is usually the outcome. Thanks again for the nice recap!
 Teena said:
Great article shedding light on this issue of duplicate content. I too feared duplicate content, and now that you've explained it, I don't have to worry about it anymore. I just needed a good read like yours that provides a detailed explanation of what duplicate content is all about and whether there really are penalties... Many thanks.
 Dwight said:
Jill,

Wrong again.
 Jill Whalen said:
@Dwight, care to elaborate?

If not, your post adds no value to the conversation and I'll just remove it.
 Claire said:
Duplicate content is one of those areas that people worry about incessantly as they've heard it will cause them problems with Google. Many thanks for this latest advice!
 David said:
This article on the subject is worth looking at; it describes sources of duplicate content of which we might not be aware:

http://www.searchenginejournal.com/11-sources-of-duplicate-content-you’re-probably-unaware-of/10717/
 Jill Whalen said:
@David, I disagree with much of what's in that article.

Saying that your navigation, headers, and footers are duplicate content that will get you into trouble with Google is just plain wrong.

That's exactly the sort of article that my article here was written to debunk.

I repeat, you will not be PENALIZED or get into trouble with search engines for having duplicate content.
 Chong said:
Very nice article! I have a question about the HTTP status code. What if a subsite with a status code of 200 is showing the same content as the main site? Does the main site suffer because of the duplicate content and rank lower?
 Jill Whalen said:
@Chong, one of the sites may be filtered out of the results, if both sites show a 200 OK and they both have the same content.
 Ben said:
Hi Jill,

I find the article very interesting. I'm wondering how to define the specific line between malicious conduct (the spammers) versus non-malicious conduct? Is there something that makes Google say "Okay, something is definitely not right here"?
Thanks!
 Jill Whalen said:
Hi Ben,

It's usually based on authority and trust. The less authoritative one should theoretically be filtered out.
 Tim said:
Hi Jill

Good article as usual.

I do think that duplicate content and duplication practices can become unacceptably 'malicious' over time too.

My own example is that I mirror my .co.uk site and my .com. There are historic reasons for this, in that the search engines in the USA used to ignore anything that wasn't a dot com. The mirroring does mean that we are unfairly dominant in the search results, and so this is, I suppose, spammy. Yup, it would be great to have 2 different sites, but we haven't the time to run 2 different sites. We thought that Google would in time sort this out to show the right results in the right country, but they haven't. I have been warned by SEOs that we are on thin ice, but my ISP and web designers say absolutely no problem. I fear that what I am doing now looks malicious, although the intent was originally perfectly innocent.

We are going to close down the .co.uk and redirect all the traffic by 301s to the .com.
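For what it's worth, a site-wide version of that redirect (a rough sketch, assuming an Apache server with mod_rewrite; the domains here are placeholders) would look something like:

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^(www\.)?example\.co\.uk$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

This sends every URL on the .co.uk, page for page, to its counterpart on the .com.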
 Jill Whalen said:
Tim, have you tried setting the region of the .co.uk site to the UK within Google's Webmaster Tools? That might help you.
 Tim Clarke said:
Jill.
Interesting thought. Our incoming links, though, are fairly randomly spread between the site names, with perhaps more on the .co.uk. I am keen to concentrate the effect of these by using 301s, and I don't know whether setting the .co.uk and the .com to different locations might divide them - assuming any of this still matters, of course.
Tim
 Jill Whalen said:
Tim, if you 301 it, then I don't believe you can set it as anything in Webmaster Tools. I wouldn't redirect it if you are targeting the UK with it.
 Robert Somerville said:
But isn't this just a matter of definition? Having your content filtered out due to being duplicate is in itself a form of penalty?
 Jill Whalen said:
Robert, no because true search engine penalties are something totally different. It's important to understand the distinction.
 Mitchell said:
Wow, Thanks Jill for a great post!

Finally, I feel like the clouds have lifted and I can actually understand what is going on!!!

Thanks!
 Shirley Kelly said:
Thank you for writing in plain, easy to understand English. I finally understand the mystery of duplicate content and how it affects my websites.
 Gifts retailer said:
Yes, Google are going to filter out the duplicates and probably won't penalise you for duplicate content. But the whole reason SEOs advise against duplicate content is that you want the RIGHT page in the listings in the first place. If Google picks a blog entry you made instead of a product page, that puts a serious knock on your conversion rate.
 Dave said:
Until Google gets some serious neural networking built into its visual capabilities, design sites have a real problem with duplicated content. Design for the user?? Until Google gets "eyes"... I don't think so :\
 Katie Rainer said:
It's good to read that it's not as big a problem as everyone has been saying to me - I've recently been having some ranking problems and thought one reason could have been submitting articles promoting my site... but suppose 'duplicate content' is like one library complaining that another one has a book that's on their shelves lol, content can be found in a lot of places, not always badly copied.
 Ferodynamics said:
Good advice. I recently started posting the same content to my multiple blogs (if there is some relevancy) and the results are good so far.

Seems better to post a duplicate than let a website stagnate.
 Leander said:
Hey Jill, nice post! With a lot of good news :)

Do have one question though..
When I am writing articles and publishing them, there's no need to make each one of them unique? I am talking about publishing one article 2-3 times, not putting it in 100 different places.
 Jill Whalen said:
@Leander, no, you don't have to make them unique. They may or may not show up in the search engines for the keywords contained within them, however. (As per this article.)
 Leander said:
@ Jill

Thanks for your reply. It will save me quite some time then. Thanks!
 Jessy said:
Thank God someone is writing the TRUTH about dup content. @Jill Whalen, SEO industry needs more articles like yours. Google officially stated several times there's no such thing as a "duplicate content penalty".

Google: “Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.” Quoted from: http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html

If you're doing spammy or malicious things, Google will ban you. If not, no need to overstress yourself. Dupes pretty much don't stand a chance of outranking the original one. (Oh well... maybe... not if you use a massive amount of backlinks to overcome it.)
 Kathy said:
Hi

This is a great post, Jill. I have a .co.uk site with about 300 pages of pure original content written by us. I have now started a .com - do you think it would be OK to use some of the same pages on there? Will it be OK to use this ezine plugin to send my posts to ezines? All my posts are my own work, so there should be no duplicates.
 Jill Whalen said:
Kathy, it's fine, as long as you don't mind that your site may not show up for the keyword phrases targeted in the articles.
 Donna B. said:
I don't see dates here, so hope you're still around, Jill!

Thanks for a careful and clear explanation of duplicate content.

I have a question about Huff Post, however, which I think can be considered pretty high in rankings, regardless of what one thinks of it (I think a lot of it is plagiarism, and I rarely go there). They have about 25 tags at the very beginning of each "article"! I have 4 WordPress blogs, and it just seems wrong that I have to be concerned about my 3 tags, or 2 categories, and they get gravy.

Any comment on their practice there?
 Jill Whalen said:
Donna, I'd call that keyword stuffing not dupe content.
 Jonathan said:
Very interesting post. Is a filter not just as bad as a 'penalty' at the end of the day? I recently started a blog on combat sports called MMApages, and I included a section called news, which is a feed from a top MMA news site. My problem is that because it appears at the bottom of every page, I'm worried this will hurt my rankings. It gives full credit to the site it's from, and I put it there as I feel my readers would enjoy it, but I don't feel at ease about having it there in case my pages get filtered out. I'm thinking I should remove it. Very confusing topic.
 Jill Whalen said:
@Jonathan it's not a problem.
 Allen Nugent said:
"They are simply going to filter out the dupes" -- if only !

Almost every Google search I do returns over 50% redundant hits -- same website, same content, redundantly farmed or poached content -- whatever. I don't understand why I should have to see the same crap over and over. There is no effective redundancy filtering in Google.
 Allen Nugent said:
Addendum: I suppose I should have said "content" instead of "crap", because sometimes (though rarely) the redundant information is actually what I was looking for.
 Keyword Rich Name Removed said:
Interesting how an article written way back in 2010 can still cause so much consternation post the Panda and Hummingbird pandemonium (no pun intended).

The myth of the dup content penalty (not sure who started it, though - the genius of it) has always been perpetuated by those still new or clueless about SEO.

News aggregators have long used this method to spread the news, as it were, and their pages, to this day, still rank in the search engines.

Then of course there's the syndication method, where you copy the entire content, or part thereof, add your own intro and ending, then attribute the source of the content to its origin, and you still can rank.

This article is to the point and well thought out - the issue is in the interpretation of it.
We'll never really know, as Google is not in the business of divulging SEO secrets - nor should they be!