
Serving Alternate Content To The Search Bot



#1 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 08 July 2009 - 11:19 AM

Here's a quandary. The site I work with has a 'search' form with a JavaScript-driven submit button and a POST method. For a variety of reasons, both technical and social, we can't redo the form, but we do need Google to find the URLs on the other side of it. The solution devised was to serve direct links to all of the possible results pages when the user-agent is Googlebot. The huge list of links would be useless to human users, so we don't want them to see it, but we do need the engines to find those pages.
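For context, the kind of form being described looks roughly like the sketch below (the form name, IDs and URLs are invented for illustration). Because the submit is script-driven and the method is POST, a crawler that doesn't execute JavaScript or submit forms never discovers the result URLs:

CODE
<!-- Hypothetical sketch of a search form whose results a crawler can't reach:
     the only route to the results is a scripted POST submission. -->
<form id="siteSearch" action="/search" method="post">
  <input type="text" name="query">
  <!-- The "button" is a link wired to JavaScript; there is no plain
       submit control or crawlable <a href> pointing at the results. -->
  <a href="#" onclick="document.getElementById('siteSearch').submit(); return false;">Search</a>
</form>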

Now, I know Google doesn't want websites to serve different content to the bot than to the user, but do you think this would be an 'allowable case'? The hidden links aren't there to game the system, but rather to help the bot penetrate otherwise inaccessible areas of the site. I'd hate for them to penalize us for an action with positive intentions.

Any thoughts?

#2 Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 08 July 2009 - 01:05 PM

QUOTE
Now, I know Google doesn't want websites to serve different content to the bot than to the user, but do you think this would be an 'allowable case'?


No. And if they discover it you're going to be sorry. Because...

QUOTE
The hidden links aren't there to game the system


Yes they are, at least as far as they're concerned.

If you want them to get to those pages you need to provide a more transparent way to get to them. There are lots of ways to accomplish this. However hiding content or making content visible based upon the user agent isn't one of them.

#3 PIGuy

    HR 1

  • Members
  • 3 posts

Posted 08 July 2009 - 02:09 PM

QUOTE(Randy @ Jul 8 2009, 07:05 PM)
No. And if they discover it you're going to be sorry. Because...
Yes they are, at least as far as they're concerned.

If you want them to get to those pages you need to provide a more transparent way to get to them. There are lots of ways to accomplish this. However hiding content or making content visible based upon the user agent isn't one of them.

Why can't you simply supply a sitemap?

#4 Jill

    Recovering SEO

  • Admin
  • 33,244 posts

Posted 08 July 2009 - 04:32 PM

Have you looked into Google's First Click Free program?

That might do the trick if it's appropriate for what you want.

#5 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 09 July 2009 - 09:44 AM

Thanks for the tips, all. I guess I'm going to have to take it up with our developers and see if they're able to do something different for now. Adding the hidden content has gotten over 12,000 more of our pages into the index, but if it means risking 100% of them being dropped from the index, that may not be worth it. Unfortunately we're dealing with code and data structures over a decade old, so adding sitemaps and other user-accessible content to the screen is a development and support nightmare... we may simply have to accept being unindexable until the development calendar can fit in a total redesign...

#6 Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 09 July 2009 - 10:20 AM

Here's a pretty simple thought, and one I've seen work before to get and keep these types of pages indexed. It should work since you already have the list of search links available.

Instead of having that hidden on a page, why not construct a separate 'Most Popular Searches' page that lists, say, the questions/searches being entered by users, where each question itself is anchor text linked to the answer? Possibly even add a short snippet for each to add context.

Basically think of all of those Knowledge Base applications out there and emulate how they display such data.

By offloading it to a separate Most Popular Searches/Questions type of page (or pages), you make it visible and available to users. That gets you away from the specter of hidden content and potentially adds value for real users. And you can simply link to this new Most Popular page from wherever you're now providing the hidden content.
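To make that concrete, a minimal sketch of what such a page might contain (the URLs and questions are invented placeholders), with each popular search rendered as an ordinary crawlable link plus a short snippet:

CODE
<h1>Most Popular Searches</h1>
<ul>
  <li>
    <a href="/results?topic=shipping-rates">What are your shipping rates?</a>
    - a short snippet summarizing the answer, for context.
  </li>
  <li>
    <a href="/results?topic=return-policy">How do returns work?</a>
    - another one-line summary.
  </li>
</ul>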

#7 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 09 July 2009 - 10:48 AM

I think the solution here is actually the <noscript> tag. I'm not sure why it didn't occur to me before, but the same content that the search engine can't find is also inaccessible to users without script. By putting the links to the inaccessible pages into a <noscript> tag, we're helping the bots and the users at the same time without violating any rules.
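Roughly what that looks like in the page (the result URLs here are placeholders): the same result links that scripted users reach by submitting the form are also present in a <noscript> block, so clients without JavaScript, search bots included, can follow them directly:

CODE
<form id="siteSearch" action="/search" method="post">
  <input type="text" name="query">
  <a href="#" onclick="document.getElementById('siteSearch').submit(); return false;">Search</a>
</form>
<noscript>
  <!-- Plain links for clients that can't run the scripted search -->
  <a href="/results?item=1">Result page 1</a>
  <a href="/results?item=2">Result page 2</a>
  <a href="/results?item=3">Result page 3</a>
</noscript>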

#8 Alan Perkins

    Token male admin

  • Admin
  • 1,648 posts
  • Location:UK

Posted 13 July 2009 - 04:44 AM

QUOTE(Drew)
I think the solution here is actually the <noscript> tag. I'm not sure why it didn't occur to me before, but the same content that the search engine can't find is also inaccessible to users without script. By putting the links to the inaccessible pages into a <noscript> tag, we're helping the bots and the users at the same time without violating any rules.


Yep, you need to make your site more accessible to all JavaScript-less clients, not just search bots. The noscript tag is one simple way to achieve this.

#9 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 13 July 2009 - 11:32 AM

Yeah, although we're discovering that pages accessible via the <noscript> links aren't getting picked up nearly as quickly as they did before. We posted the new code on Friday, and Google has only added three out of about 20,000 URLs to the index. We'll wait and see, though - they may not index stuff in a <noscript> as quickly as they do stuff that's visible on the page.

#10 Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 13 July 2009 - 06:05 PM

How many links are in the <noscript> tag of each page? Are you saying that each page has 20,000 links in it? Or even 500 links?

If so, I would imagine both might appear pretty spammy at first glance to the search engines, or at least look similar enough to stuff real spammers do to make the bot take a step back. Think about it from their perspective: Mr. Bot goes to a page and sees 20-30 links to internal pages in the visible code, then also sees an additional 500 or 20,000 links that, as far as the bot can tell, never appear in the visible page for users.

This is why I don't like to use a <noscript> unless it is referencing urls that do appear elsewhere in the page, but just can't be read by the non-js user agents. It's too easy to go too far without realizing it and suddenly put the site under greater scrutiny than it is due.

Have you considered compiling those into an XML sitemap you can then submit to the search engines? Not that it's guaranteed to get the pages spidered or indexed, especially if they're not linked to from any other already-indexed pages.
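For reference, a bare-bones XML sitemap of the kind being suggested might look like the sketch below (the URLs are placeholders); it can then be submitted through Google Webmaster Tools or referenced from robots.txt:

CODE
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/results?item=1</loc>
  </url>
  <url>
    <loc>http://www.example.com/results?item=2</loc>
  </url>
</urlset>

A single sitemap file can hold up to 50,000 URLs under the sitemaps.org protocol, so the roughly 20,000 result pages mentioned above would fit in one file.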

#11 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 27 July 2009 - 10:36 AM

Well, we're finding that since we put in the <noscript> tag, our content is not getting picked up at all. When we were doing the cloak, Google was grabbing hundreds of new URLs a day, and now it's grabbing about four. It looks like Google is mostly ignoring the <noscript> content.

For now, I think what we're going to do is serve a different submit button on the form when the user-agent is a bot. According to http://searchenginel...detection-10638, Google's engineers have said they're fine with replacing search-engine-unfriendly links with alternates to let the bot get through. Since our content is behind a JavaScript- and POST-driven form, giving the bot a plain <a> link to the same results instead of the search button seems like an allowable alteration. The content will be identical for users and bots, with the exception that users will run a search, whereas bots will simply get a list of all of the possible result URLs. That will have to do until we can get around to building a user-workable link directory.
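As a rough illustration of that arrangement (the markup and URL are invented for the example), the variant served to a bot user-agent would swap the scripted submit control for an ordinary crawlable link:

CODE
<!-- Variant served when the user-agent is identified as a bot: the
     scripted submit button is replaced with a plain link that leads
     to a page listing every possible result URL. -->
<form action="/search" method="post">
  <input type="text" name="query">
  <a href="/search/all-results">Search</a>
</form>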

#12 Jill

    Recovering SEO

  • Admin
  • 33,244 posts

Posted 27 July 2009 - 11:02 AM

You'll want to make sure that users who click through to your site from the Google listing see exactly what Google saw, as per my previous post about First Click Free.

#13 Drew

    HR 3

  • Active Members
  • 91 posts
  • Location:Cleveland, OH

Posted 28 July 2009 - 02:55 PM

Our dilemma isn't really a 'First Click Free' situation anyway. It's not that the human user has to register or pay to get the content, but rather that they have to submit a search form that relies on the POST method and some JavaScript to function. The bot doesn't index the results because it can't ever get to them.

So with our new configuration, the content for users and bots is identical, but the method of reaching it is slightly different. The user starts at the search form, clicks the 'Search' button, gets the first twenty result links, and then has to click additional (script-driven) links to get to the rest of the results. The search engine bot starts at the same search form and follows a search link (served in place of the form's submit button), which returns a complete list of ALL search result links. So in effect we're giving the user and the bot identical access to identical content - it's just that the user gets to filter the results to what they want, while the bot gets a flood of all possible links so that it can spider through all of them.

#14 Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 28 July 2009 - 03:37 PM

Don't fool yourself into thinking it's identical, or that the search engines are going to view it as identical, Drew. Because it's not.

You're treading on some very dangerous ground if the engines ever do look under the hood and see that it's different for bots and humans. They may or may not look, and may or may not zap you for it if they look. IMO it's okay to do as long as you go into it with full knowledge that at some point the site may well get penalized. You have to factor this possibility into your decisions, because that's the reality.

If you can stomach the risk, if the reward is great enough and if you have a backup plan, go for it. If not on any of those three, don't. And if you do continue serving up the altered content and get zapped don't be surprised or whine about it. You've been forewarned and given other options to achieve the goal.

#15 2Clean

    HR 3

  • Active Members
  • 62 posts

Posted 29 July 2009 - 04:23 AM

Google won't pass links in a noscript because it's a container for doing exactly the kind of stuff you're doing. You need to be a little creative. Sitemaps won't work at this kind of scale unless you add pages to them gradually, which you might be able to script.



