
Indexing Password Protected Content


This topic has been archived. This means that you cannot reply to this topic.
35 replies to this topic

#1 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts

Posted 03 August 2003 - 08:30 AM

[This discussion has broken out from the session id thread]

Question for Mel:

If you require a username and password to access specific content, why would you want to allow a spider access to it?

The SE doesn't want to index pages that the average user cannot see. If I click on the result and get a login box, I'm not going to be a happy camper with the SE for sending me there...

Edited by Alan Perkins, 04 August 2003 - 03:11 AM.


#2 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 03 August 2003 - 08:51 AM

Question for Mel:

If you require a username and password to access specific content, why would you want to allow a spider access to it?

The SE doesn't want to index pages that the average user cannot see.  If I click on the result and get a login box, I'm not going to be a happy camper with the SE for sending me there...

Hi Scottie:
I do not think search engines are used only by "average users"; they are used by less-than-average and better-than-average users and every shade in between. I also fail to see why ALL the content on a site should not be indexed by the search engines.

If you click on that result and get a log-in or register box you may not be a happy camper, but it is up to the site how they want to allow access to their content. As an example, say a site has valuable papers available for interested parties, but as a condition of access it wants you to register first (maybe even pay a fee). If you are not willing to register to access the content, that is your decision, but IMO your opinion should not be forced on all users.

This is a very normal situation with the large online research companies who require a subscription and payment before you can access their online content. Do you believe that their research and papers should not be indexed so that a person can find them if they want them?

#3 Ron Carnell

Ron Carnell

    HR 6

  • Moderator
  • 966 posts

Posted 03 August 2003 - 10:13 AM

One would have to question the effectiveness of that, Mel. While I'm sure it depends on your target audience, I think it's a mistake to assume that people are stupid, and spoofing a user-agent isn't exactly rocket science these days. Take one not-stupid user, a forum or two that targets the same market, and all of your content just became free.
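To illustrate the point about spoofing, here is a minimal sketch, assuming a site that lets spiders past the login purely on the strength of the User-Agent header. The function names and the token list are hypothetical, not taken from any real site or library.

```python
# Hypothetical gate: a request counts as a spider if its User-Agent
# header contains a known crawler token. The token list is illustrative.
SPIDER_TOKENS = ("googlebot", "slurp", "msnbot")


def looks_like_spider(user_agent: str) -> bool:
    """Return True if the User-Agent string claims to be a known spider."""
    ua = user_agent.lower()
    return any(token in ua for token in SPIDER_TOKENS)


def can_view_protected(user_agent: str, logged_in: bool) -> bool:
    """Grant access to logged-in users and to anything claiming to be a spider."""
    return logged_in or looks_like_spider(user_agent)


# An ordinary browser that is not logged in is refused.
print(can_view_protected("Mozilla/4.0 (compatible; MSIE 6.0)", logged_in=False))

# The header is supplied by the client, so any visitor can send a
# spider-like string and receive the protected content.
print(can_view_protected("Googlebot/2.1", logged_in=False))
```

The check never verifies who is actually asking, which is why a User-Agent test on its own does little to protect paid content.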

#4 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 03 August 2003 - 10:22 AM

Hi Ron:
Yep, you are right, there are always thieves out there, but can you suggest a better solution to getting such content spidered?

#5 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts

Posted 03 August 2003 - 10:42 AM

If it were my client, I would advise them to write teasers or summaries that were publicly available to allow search engines and prospective subscribers to read them.

If I'm looking for your information and I only get a login box, I'd assume the content wasn't there anymore and move on to the next result. Or I'd just pull up the Google cache and read it without logging in.

If however, I came to an abstract or summary of what was contained in the pay-only version, I would be more inclined to get my credit card out and sign up, knowing that what I want is in that document.
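A rough sketch of the teaser approach, assuming a simple article record with a public abstract and a subscriber-only body. The `Article` class and `render` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    abstract: str   # public teaser, visible to everyone
    full_text: str  # subscriber-only body


def render(article: Article, is_subscriber: bool) -> str:
    """Return the page body: full text for subscribers, abstract otherwise."""
    if is_subscriber:
        return f"{article.title}\n\n{article.full_text}"
    # Spiders and anonymous visitors take this same branch, so the
    # search engine indexes exactly what a searcher will see.
    return (
        f"{article.title}\n\n{article.abstract}\n\n"
        "[Subscribe or log in to read the full article]"
    )


paper = Article(
    title="Widget Market Forecast",
    abstract="A summary of the key findings.",
    full_text="The complete paid report.",
)
print(render(paper, is_subscriber=False))
```

Because there is no special case for spiders, nothing here depends on detecting them.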

#6 Jill

Jill

    Recovering SEO

  • Admin
  • 33,005 posts

Posted 03 August 2003 - 02:13 PM

This is a very normal situation with the large online research companies who require a subscription and payment before you can access their online content. Do you believe that their research and papers should not be indexed so that a person can find them if they want them?


Hmm...that's a very interesting situation.

The question is really what do the search engines think of this? Do they want to index your password-protected content which will not be available to the user unless they pay?

If they said yes to this, then I would say what you're doing is fine. My gut tells me that they would not actually want to index that stuff. But I could definitely be wrong. I am interested in discussing this with some search engine reps., however, and I am going to try really hard to ask some of them in San Jose in a few weeks.

We really would need their answer on this to be sure.

To be on the safe side, I would suggest doing what Scottie recommends. In fact, I've recommended the same thing to clients in the past. It never would have occurred to me to have a search engine index password-protected stuff.

I hope that the engine reps will let me know their rules about this, and I will let you know what they say.

(Note to Scottie...make sure I remember this at the conference, and feel free to ask them yourself. I get a bit brain dead at these things from staying up too late and drinking too much!) :)

:)

Jill

#7 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,642 posts

Posted 03 August 2003 - 04:52 PM

If you click on that result and get a log-in or register box you may not be a happy camper, but it is up to the site how they want to allow access to their content.

It's also up to the search engine what content they want to index. A search engine needs to see what the searcher will see to make this decision.

This is a very normal situation with the large online research companies who require a subscription and payment before you can access their online content. Do you believe that their research and papers should not be indexed so that a person can find them if they want them?

Yes, that's exactly what I think! It's also what most free access, free inclusion, general purpose search engines think, IMO.

Some engines, e.g. the old Northern Light, allow "special collections" of paid content to be searched, separately to the free search - not as part of it.

These days, if you use a PFI program you may be OK. I suggest checking with your PFI provider first! The general rule of thumb, though, is that search engines want searchers to see what the spider saw without having to offer any kind of payment.

Scottie's solution is the generally accepted workaround.

FWIW I don't see any problem with using Content Delivery to remove session IDs from URLs. I wouldn't call that cloaking - just as I wouldn't call it cloaking to use Content Delivery to add session IDs to URLs for browsers that supported sessions ... it amounts to the same thing.
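As a minimal sketch of stripping session IDs from URLs, assuming the session ID travels as a query-string parameter. The parameter names in `SESSION_PARAMS` are assumptions for illustration; a real site would use whatever names its own software generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to match the site's software.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_session_id("http://www.example.com/forum/topic.php?id=42&sid=abc123"))
```

The page content is untouched; only the link format changes, so the spider sees one stable URL per page instead of a new one on every visit.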

Edited by Alan Perkins, 03 August 2003 - 05:04 PM.


#8 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts

Posted 03 August 2003 - 05:18 PM

(Note to Scottie...make sure I remember this at the conference, and feel free to ask them yourself. I get a bit brain dead at these things from staying up too late and drinking too much!) ;)

I love that you think I am going to be more clearheaded than you. :D

#9 Ron Carnell

Ron Carnell

    HR 6

  • Moderator
  • 966 posts

Posted 03 August 2003 - 06:01 PM

Scottie, I think, is right. And the proof is probably right in these forums.

How many people here subscribe (or have subscribed) to the Member's Area at searchenginewatch.com? Would you have done so if Danny's free content hadn't proven to you he would deliver?

And Mel? In RL, my experience has been that thieves are fairly rare. On the Internet, however, they are rampant when it comes to content or intellectual property rights. Most seem to believe that anything composed of bits and bytes, from music to software to private pages, is theirs for the taking.

#10 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 03 August 2003 - 11:37 PM

Well if you do many searches you will often come up against links to articles in research sites that require payment to access them, so one would assume that the search engines do have a way of indexing them even though they are PW protected.

FWIW IMO the purpose of a search engine is to find and deliver relevant links to users in response to their queries. I have never seen any search engine say that they will only index content that is free, and if PW protected content is not indexed then the search engine has not done as good a job as it could have in indexing the web. It might be noted that content which requires payment may well be of better quality than that which is free.

There are other similar situations also, I know of quite a few sites where payment is not required to access content but registration is.

Ron:
Someone who is going to set up and spoof a user agent to read a single web page is beyond my experience, but I suppose there may be those who do this just for the challenge. FWIW I believe that there are many, many more honest users than thieves, but the thieves get much, much more publicity and this is contributing to a general sense of unease.

As a test, how many of you have been successfully ripped off by an online credit card scam for instance?

I would love to see less publicity about how the web is such a lawless, dangerous place and more about how the great majority of users find it a great and useful place.

#11 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 04 August 2003 - 12:14 AM

If it were my client, I would advise them to write teasers or summaries that were publicly available to allow search engines and prospective subscribers to read them.

If I'm looking for your information and I only get a login box, I'd assume the content wasn't there anymore and move on to the next result.  Or I'd just pull up the Google cache and read it without logging in.

If however, I came to an abstract or summary of what was contained in the pay-only version, I would be more inclined to get my credit card out and sign up, knowing that what I want is in that document.

Hi Scottie:

I agree that an abstract of the content is a great way to both give users some indication of what they may find behind the veil, and to give the search engines something to chew on.

Why would you assume that the content was no longer there if you were asked to register to view it??

At any rate, pulling up the Google cache is not an option for .pdf pages, which many sites with valuable content use to prevent copying.

#12 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts

Posted 04 August 2003 - 12:23 AM

Why would you assume that the content was no longer there if you were asked to register to view it?? 

I would assume it was no longer there because obviously it was there when the search engine spider came by but now it has been moved or removed and there is a login screen in place of the content.

I'd just move on to the next result until I found what I wanted. How many people do you think would assume that the content behind that login was exactly what they wanted and trust it to be there enough to give a credit card number?

Actually, that would make a very interesting usability test! If you are interested, I'll run one just for fun! PM me a search query that will return a login screen and let me run it by some test subjects and see how they react.

#13 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,642 posts

Posted 04 August 2003 - 03:06 AM

Well if you do many searches you will often come up against links to articles in research sites that require payment to access them, so one would assume that the search engines do have a way of indexing them even though they are PW protected.

No, they can't without something further taking place. For it to take place, one of two things has almost certainly happened:

1) The site has cloaked, or
2) There is a commercial arrangement between the site and the search engine (e.g. PFI)

There are other alternatives. For example, a well known Web forum precludes unregistered access from several ISPs, some of whose users abuse the forum with robots. I commonly access the Web with one such ISP, so when my search results include listings from that forum I am required to register or sign in before I can view the content. In this case the forum hasn't necessarily made provision for spiders, but it is performing a kind of Content Delivery to exclude many thousands of humans, which could be interpreted similarly. In this case, though, the content isn't really password-protected. There is another mechanism in place that makes it appear that way.

#14 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,642 posts

Posted 04 August 2003 - 03:12 AM

Pinging new thread...

I've split this off into a new thread since I think we've left session ids behind.

#15 Matt B

Matt B

    The modem is the message.

  • Active Members
  • 558 posts

Posted 04 August 2003 - 07:59 AM

I would have to cast my vote for Scottie's abstract of the article, study, etc. In doing so, you allow the user to peruse the information to ensure first that the article is relevant, and then that it will provide the citation they need.

From slaving over many research papers in college, I can tell you that many article references looked great, yet the articles themselves did nothing to support my research. When there was an abstract available, it made more sense to read that first, rather than an entire study or paper on the subject.

I also think that converting people to a subscription would be more effective if an abstract were presented, rather than a page with only a login/password/registration required. Obviously, I'm not going to pay for something that I can't be sure that I need. Paying for an article that is related, but counter to your thesis, would be a real kick in the pants. ;) Being able to evaluate the content prior to purchase is the ideal conversion scenario (in a paid subscription model).



