Google Spider Yes! Competitors Not!
Posted 14 April 2008 - 03:48 AM
I am busy putting together the layout for a new website. It will be an informational website, and gathering the info will take a LOT of time. This information would be very useful to competitors / other companies as well.
What I want is the following: (1) Google must be able to spider the information, but (2) competitor bots must NOT.
Is there an easy way to arrange this?
Thanks a lot in advance
Posted 14 April 2008 - 07:29 AM
First, to allow in Googlebot but keep everyone else away, you'd have to perform some type of cloaking where you deliver the actual content to Googlebot and perhaps other legitimate spiders, but not to other user agents. This is fraught with difficulties on a variety of fronts, first and foremost being that it can be construed by the engines to be a bad thing, thus attracting closer scrutiny or potentially a penalty for trying to "trick" the search engines. Another issue is that if you rely solely on the user-agent string to cloak things, it's a relatively simple procedure for others to mask their browser or bot to appear to be Googlebot, thus getting themselves into your restricted area.
The other obvious issue is that if you allow Googlebot et al. to cache/archive the content, it'll still be available to anybody who goes to the search engines and reviews the cached version of your pages.
In other words, it's hard to do, on a couple of fronts.
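On the point about spoofed user-agent strings: a common way to make that harder is to check the visitor's IP address instead of trusting the user-agent, using a reverse DNS lookup followed by a forward lookup on the resulting hostname. The following is only a minimal sketch of that idea. The function name and the injectable lookup parameters are made up for illustration, and this does nothing about the cloaking or cache concerns above.

```python
import socket

# Hostname suffixes used by Google's crawlers (leading dot required,
# so "fakegooglebot.com" does not match).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google crawler hostname
    and that hostname resolves back to the same IP."""
    # The lookups are injectable so the check can be tested offline
    # (an assumption of this sketch, not a requirement of the technique).
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # Forward-confirm: the claimed hostname must map back to this IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # Any DNS failure is treated as "not verified".
        return False
```

A request that claims to be Googlebot in its user-agent but fails this check can then be treated like any other unknown bot.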
What usually works best, and may in your case too, is to feed the spiders and visitors the same thing, but make this freely available content a synopsis of the actual info you're making available. In other words, a shorter version that contains enough of your keyword phrases to rank well, but not something that gives it all up. Keep the really valuable stuff hidden behind some sort of password protection to give you better control over who gets to see the real meat.
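As a rough illustration of the synopsis approach, the page logic can be as simple as the sketch below. The function name and the word limit are hypothetical, and cutting the text off after a fixed number of words is only a stand-in; a hand-written synopsis would normally read and rank better.

```python
def public_view(full_text, logged_in, synopsis_words=60):
    """Return the full text to logged-in users and a short synopsis to
    everyone else, spiders included, so no cloaking is involved."""
    if logged_in:
        return full_text
    words = full_text.split()
    if len(words) <= synopsis_words:
        # Already short enough to show in full.
        return full_text
    # Crude synopsis: the first N words plus an ellipsis marker.
    return " ".join(words[:synopsis_words]) + " ..."
```

Because Googlebot and anonymous visitors get exactly the same output, there is nothing for the engines to construe as a trick.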
Posted 15 April 2008 - 02:15 AM
Are there other people that deal with the same kind of problem?
As I mentioned in the first message, it will take a lot of time to gather all the info that we want to publish on our site. We are certain it is info that people are looking for, but they do not want to pay for it. We have other revenue streams, so that is not a problem. But we need to attract people to our site, so I want to have as much info on the site as possible while preventing competitors from simply copying and pasting the info from our site onto their own.
Putting a password on the site is a suggestion we are thinking of ourselves as well, but it would only be to prevent competitors' spiders from scraping our site. But I assume that when a competitor has HIS username and password he can put that into his spider and let it scrape our site ... or am I wrong here? I have no knowledge about spiders, as you can tell.
If anybody has another solution … I would appreciate it if you could share it with me.
Posted 15 April 2008 - 06:44 AM
If someone is using an automated process to scrape your site content and you can identify the bad bots that way, it's a fairly simple process to send them off into null space and/or block them outright. As you say, even if you set up password protection, the bad bot owner might have the ability to feed it a valid user/password and still get to your content.
The other way of handling such things is waiting until after they've lifted some of your content, then complaining directly to their hosting company. There's always someone upstream from these types of folks, and most hosts have a copyright infringement/DMCA policy in place these days.
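To give an idea of what "identify and block" can look like, here is a minimal sketch that flags a request either by a known-bad user-agent substring or by an unusually high request rate from one IP. All names and thresholds are hypothetical and would need tuning against your own logs. It also works for logged-in scrapers, since it looks at behaviour and not at credentials.

```python
import time
from collections import defaultdict, deque

# Hypothetical values; adjust them to your own traffic.
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "examplescraper")
MAX_REQUESTS = 30      # requests allowed per window, per IP
WINDOW_SECONDS = 60

_recent = defaultdict(deque)  # ip -> timestamps of recent requests


def should_block(ip, user_agent, now=None):
    """Return True if this request looks like an unwanted scraper."""
    now = time.time() if now is None else now
    agent = (user_agent or "").lower()
    if any(s in agent for s in BLOCKED_AGENT_SUBSTRINGS):
        return True
    hits = _recent[ip]
    # Drop timestamps that have fallen out of the sliding window.
    while hits and now - hits[0] >= WINDOW_SECONDS:
        hits.popleft()
    hits.append(now)
    # A human rarely requests this many pages in one window.
    return len(hits) > MAX_REQUESTS
```

A blocked request could get an error response or an empty page. Legitimate spiders can crawl quickly too, so in practice you would want to exempt the ones you trust before applying the rate limit.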