My Site Is Based On X-cart!


6 replies to this topic

#1 Faizee

Faizee

    HR 1

  • Members
  • Pip
  • 6 posts

Posted 05 April 2008 - 04:33 AM

Peace be upon you,

I prepared a robots.txt file for the very first time today, and I am a little afraid to upload it.
My site is based on X-cart software, and we generate html pages with the catalog option.

1. I disallowed all php pages to avoid duplicate content.
2. I disallowed all ftp folders except the images folder.

Is there anything important that I should disallow, or have I disallowed anything important that I shouldn't have?

regards,

Faizee

#2 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 05 April 2008 - 08:55 AM

You'll probably want to exclude the dynamic version of your pages since you're using the html catalog in x-cart. How to do that depends upon where your cart is installed.

For instance, if your dynamic cart is installed in a subdirectory one level down from root named /cart/ and your html version is in a different subdirectory, you could simply put the following in your robots.txt file:
CODE
User-agent: *
Disallow: /cart/


If the dynamic cart is installed at the root level of your domain with the html version in a subdirectory, you can simply exclude the cart.php page since all other x-cart pages feed off of it. That would look something like:
CODE
User-agent: *
Disallow: /cart.php


#3 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 05 April 2008 - 09:07 AM

QUOTE
1. I disallowed all php pages to avoid duplicate content.


You may want to paste in here how you're disallowing those, just in case.

#4 Faizee

Faizee

    HR 1

  • Members
  • Pip
  • 6 posts

Posted 07 April 2008 - 04:27 AM

Thank you, you people are very helpful. This is my first time here. I'm learning SEO, and English also.

I disallowed all php pages one by one.
If I just disallow cart.php, is that enough for all php pages?

QUOTE(Jill @ Apr 5 2008, 09:07 AM)
You may want to paste in here how you're disallowing those, just in case.


User-agent: *
Disallow: /filename.php?

Thanks once again.

regards,

Faizee




#5 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 07 April 2008 - 08:06 AM

The idea behind using something like cart.php is that it'll allow you to disallow all of the dynamic pages that use this particular file. Basically you get 'em all and only have to have one instruction in robots.txt because there is an implied wildcard at the end of each Disallow line. You'll need to look at your x-cart installation to determine which files to disallow.
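To picture the implied wildcard, here is a rough Python sketch of how a Disallow line behaves as a prefix match. It is a simplified model only: the function name `is_blocked` and the sample paths are made up for illustration, and it ignores Allow lines and any pattern extensions that individual search engines support.

```python
def is_blocked(path, disallow_rules):
    """Simplified model: a path is blocked if it starts with any Disallow value.

    An empty Disallow value blocks nothing, so empty rules are skipped.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# Hypothetical rule and paths, for illustration only.
rules = ["/cart.php"]
for path in ["/cart.php", "/cart.php?mode=example", "/catalog/index.html"]:
    status = "blocked" if is_blocked(path, rules) else "allowed"
    print(path, "->", status)
```

Because the match is on the start of the path, one line covers the file itself plus every query-string variation of it.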

For instance, on one x-cart site I watch over, the dynamic store is located at www.domain.com/store/ with the html catalog located at www.domain.com/store/catalog. The dynamic side of things uses a file called home.php in the /store/ subdirectory for many pages. So to exclude those via robots.txt my instruction would look like:

CODE
User-agent: *
Disallow: /store/home.php


That alone would get all of the base pages and category pages, since x-cart shows category pages like www.domain.com/store/home.php?cat=124.

Additionally, when I drill down through the dynamic side of things to the individual product pages, I see that x-cart uses a different php file named product.php followed by some variables and values. So to block all of the product pages with one more exclusion I would change my robots.txt to read:

CODE
User-agent: *
Disallow: /store/home.php
Disallow: /store/product.php


To take it a bit further, I have several extra pages set up for things like Ordering Information, Return Policy, Shipping Info and so on. The html converter makes these pages too, so I'll probably want to exclude the dynamic version. In my x-cart installation these dynamic pages are called via www.domain.com/store/pages.php?pageid=#. So to exclude these pages also I'd change my robots.txt to read:

CODE
User-agent: *
Disallow: /store/home.php
Disallow: /store/product.php
Disallow: /store/pages.php
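
If you want to check rules like these before uploading them, one option is Python's standard `urllib.robotparser` module. The sketch below assumes the home.php, product.php and pages.php rules described in this post. The helper name `build_parser`, the `ExampleBot` user agent, the `productid=1` and `pageid=1` query strings and the `index.html` filename are placeholders I made up, not anything x-cart guarantees.

```python
from urllib.robotparser import RobotFileParser

# The three exclusions described above, for a store installed under /store/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /store/home.php
Disallow: /store/product.php
Disallow: /store/pages.php
"""


def build_parser(text):
    """Load robots.txt rules from a string instead of fetching them over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Sample URLs; the query strings and index.html are illustrative assumptions.
SAMPLE_URLS = [
    "http://www.domain.com/store/home.php?cat=124",
    "http://www.domain.com/store/product.php?productid=1",
    "http://www.domain.com/store/pages.php?pageid=1",
    "http://www.domain.com/store/search.php",
    "http://www.domain.com/store/catalog/index.html",
]

for url in SAMPLE_URLS:
    verdict = "allowed" if parser.can_fetch("ExampleBot", url) else "blocked"
    print(verdict, url)
```

The three dynamic URLs should come back blocked, while the search page and the html catalog stay crawlable. Keep in mind this only tells you how Python's parser reads the file; each search engine has its own implementation.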


So with just three lines I've managed to exclude the entire dynamic side of my store, forcing everything to go through my html catalog version first. If I wanted to, I could also exclude the Search page, which is located at /store/search.php in my case. But since that's the page my html catalog points to as well, I can leave it be. The search engine spiders aren't going to go past it anyway, since it uses an html form to perform a site search. For these reasons I've chosen not to exclude the search page, though I could.

Make sense?

Basically, have a look around the dynamic side of your site and make note of the path (e.g. /store/ in my case) and filenames being used. Jot them down as you're surfing around. Once you have the exact files and locations being used to produce the dynamic pages, you can easily restrict the spiders from seeing or using them.
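
If jotting things down by hand gets tedious, the note-taking step can be roughed out in Python. This sketch assumes you have pasted in a list of dynamic URLs you visited. The function name `disallow_lines` and the sample URLs (including the `productid=1` query string) are made up for illustration. It strips the query strings and keeps one Disallow line per distinct path.

```python
from urllib.parse import urlsplit


def disallow_lines(urls):
    """Return one 'Disallow:' line per distinct path, ignoring query strings."""
    paths = sorted({urlsplit(url).path for url in urls if urlsplit(url).path})
    return ["Disallow: " + path for path in paths]


# Hypothetical URLs noted while browsing the dynamic side of a store.
visited = [
    "http://www.domain.com/store/home.php?cat=124",
    "http://www.domain.com/store/home.php?cat=7",
    "http://www.domain.com/store/product.php?productid=1",
    "http://www.domain.com/store/pages.php?pageid=1",
]

print("User-agent: *")
print("\n".join(disallow_lines(visited)))
```

Review the result by hand before uploading it, since anything in the list that your html catalog depends on (such as a search page) would be blocked too.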


#6 Faizee

Faizee

    HR 1

  • Members
  • Pip
  • 6 posts

Posted 07 April 2008 - 10:44 AM

This is amazing knowledge that I got today about the robots file. The concept of dynamic pages is clear,
but I am still thinking about the other folders.

There are many folders in my subdirectory and, with my little knowledge, I disallowed almost all of them except the images folder.

Please say something about it, Sir.

Thanks

#7 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 07 April 2008 - 05:08 PM

It depends upon what other subfolders you're talking about.

x-cart and many other shopping carts automatically set up a bunch of subdirectories because that's how they keep things at least a little bit neat and tidy. However, there is no need to exclude any of these because the search engines will never even see that they exist. The files in those folders are dynamically included in other files, but there is no direct link to any of them in the resulting html code.

Anything that stands a chance of getting linked to that you don't want the engines to index, feel free to exclude. Honestly, that usually won't be many folders for the average site.

However, don't think excluding a subdirectory via robots.txt is anything close to a security measure, because it's not. Quite the opposite in fact, since there are bad bots out there that specifically look for Disallowed folders and files to see if they can use them as a way to break into a site. Robots.txt is not a form of site security.
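
To see why, remember that robots.txt is a plain text file anyone can request. A few lines of Python are enough to list every Disallow entry in it. This is only a sketch: the function name `disallowed_paths` and the sample file contents are invented for illustration.

```python
def disallowed_paths(robots_txt):
    """List the values of all non-empty Disallow lines in a robots.txt string."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents, for illustration only.
sample = """\
User-agent: *
Disallow: /private/   # anyone can read this line
Disallow: /store/home.php
"""

print(disallowed_paths(sample))
```

Anything that truly needs protecting should sit behind real access controls, not just a Disallow line.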



