
Large Website File Structure- What's Best For Se's?


17 replies to this topic

#1 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 02 January 2007 - 08:05 PM

Hi,

I operate a commerce site that has grown beyond our original plans. We sell items that, once sold, are never taken off the site, so that their owners and third parties can use the pages for research. The site is heavily linked to. Problem is, years ago, our file structure on the server was such that all htm files were placed in root and all images were placed in a directory called 'images'. Now, we have over 4000 pages so there are that many files in root (I can see you all wincing now...) and even more image files under their separate image directory. If I restructure the site and move things, I risk losing a serious number of outside sites that have linked to us, and it would take an extraordinary amount of time to redo everything. The site is going to grow even more in the future, so I plan to put new files in a better structure and leave the old ones where they are.

Knowing now that it's bad to have so many files under a directory and root especially, I have two main questions.

1 - Is it bad to have too many DIRECTORIES from root for a search engine to spider the site with ease?
2 - Is it best to structure the site with subdirectories by keeping image files and htm files separate in their own subdirectories rather than mixing in one directory?

I am trying to figure out what file structure on the server is most friendly to a search engine when spidering. (Ease of customer navigation is not the issue here, just how we set up the site on the disk.)

Thanks

#2 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts
  • Location:Columbia, SC

Posted 02 January 2007 - 08:17 PM

QUOTE(deemeranddommer @ Jan 2 2007, 08:05 PM)
Knowing now that it's bad to have so many files under a directory and root especially, I have two main questions.

1 - Is it bad to have too many DIRECTORIES from root for a search engine to spider the site with ease?
2 - Is it best to structure the site with subdirectories by keeping image files and htm files separate in their own subdirectories rather than mixing in one directory?


It's not bad for spiders at all. They don't care a whit about your file structure- I've had pages that were indexed and ranked well that appeared to be seven folders deep.

Structure your files how it makes sense for YOU. I can't imagine trying to find something in a folder with 8000 items! But there's nothing inherently wrong about it from a search engine point of view.

Personally, I like to give images their own folder but it's a housekeeping thing and certainly has no advantages either way when it comes to SEO.

What search engines care about is how many links it takes them to get to the page.

Example:

www.domain.com/file.htm
www.domain.com/folder/anotherfolder/thirdfolder/fourthfolder/file.htm

are both linked from the home page- they both get the same "weight" or vote from the home page.

However,

www.domain.com/somefile.htm,
even though it's in the root directory, isn't linked from the home page. The user has to click

www.domain.com/category/ then
www.domain.com/subcategory/

to reach it, so it's 3 clicks away from the home page. It may be many visits before a search engine spider ever reaches that page (because they follow every link on prior pages) and it's linked from a lower importance page, so they naturally determine that it's not as important as the other pages on your site.

Does that make sense?

Customers and search engine spiders navigate your site in similar ways, so if you create a structure that easily allows people to navigate your site, it works for the engines too. The actual structure of the URL/folders really has no bearing on anything (other than your own sanity when trying to find something!)
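
As a rough illustration of the "clicks from the home page" idea above, here is a minimal sketch that computes click depth with a breadth-first search over a link graph. The `site_links` dictionary is a hypothetical stand-in for the example URLs in this post, not data from any real site, and real search engines weigh many other signals.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks needed to reach each page from home.

    links maps a page URL to the list of URLs it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The first time a page is seen is also the shortest route to it.
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph mirroring the example URLs in the post.
site_links = {
    "/": [
        "/file.htm",
        "/folder/anotherfolder/thirdfolder/fourthfolder/file.htm",
        "/category/",
    ],
    "/category/": ["/subcategory/"],
    "/subcategory/": ["/somefile.htm"],
}

depths = click_depths(site_links, "/")
for url, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, url)
```

In this sketch the file four folders deep sits at depth 1 because the home page links to it directly, while `/somefile.htm` in the root sits at depth 3. Folder depth in the URL plays no part in the calculation.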

#3 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 02 January 2007 - 08:28 PM

Yes, thanks. Your last point is understandable. Going forward, I was thinking of two options to get on the right track for structuring, at least from a file server mechanical perspective, since so many files under root and/or a single directory is not what servers like. I didn't want to move in a new direction of file structure if it will hurt the site's rank overall in the future, considering its great positioning now with the SE's.

In my case, I actually have links on the home page in a Frontpage shared border that may entail several clicks from the home page to the final page. I also have each of these items linked in a "recent additions" page (that is one click from home) when placed on the site initially. The SE's love my site but I don't want to change that slowly over time by causing crawlers undue issues compared to the site's simplicity now.

#4 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 02 January 2007 - 08:39 PM

Here's another thought, Scottie, based on what you said in the end. If I am putting my submenu links in a Frontpage shared border that is seen on the home page, does this carry any less ranking weight than if they were simple text or image links coded right on the home page the old-fashioned way?

#5 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts
  • Location:Columbia, SC

Posted 02 January 2007 - 09:01 PM

I'm not the right person to ask about Frontpage... I've always found it more confusing than just coding the page myself. I can't ever make it do what I want it to do. I guess it makes sense if you build it from scratch using their wizards and whatnot, but I've never been able to successfully edit a site in Frontpage!

Are shared borders in frames? If not, then they are probably fine. I would think these days they use include files, which assemble the page before sending it to the browser- in which case, the end result is a single page which is fine.

Simplicity is a good thing, no matter how you cut it. I don't think your server cares about the number of files in a folder either, to be honest. If things are going well and you don't mind the mass of files in the root... I'd leave well enough alone.

As a long term strategy though, I think you'd do well to have a better organizational structure just for your own needs in finding and editing, etc.

You should also probably consider a site map, linked from the footer. It's a good way to get all your pages accessible, no matter where they are located in your site structure.
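
A minimal sketch of how such a site map page could be generated, assuming you already have a list of page URLs and titles. The `example_pages` entries and the `build_site_map` helper are hypothetical names for illustration only.

```python
from html import escape


def build_site_map(pages):
    """Build an HTML list of links from (url, title) pairs."""
    items = [
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical pages; a real list would come from your own catalog.
example_pages = [
    ("/item-1001.htm", "Item 1001"),
    ("/item-1002.htm", "Item 1002"),
]

site_map_html = build_site_map(example_pages)
print(site_map_html)
```

With thousands of pages, a single list like this would probably need to be split across several site map pages, each linked from a short index.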

#6 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 02 January 2007 - 09:14 PM

I agree Frontpage can be quite strange but from the beginning, I just coded the site simply without their bells and whistles of special Frontpage features. The shared border is not a frame but more of a script to a common header, footer or sidebar, whatever you choose to set it up as.

Indexing thousands of files in a single directory on a drive is not as fast for a server compared to having them chopped up in subdirectories. I have noticed this myself on my own drive. I find files easily by clicking on the list and just typing in the first few letters. All my htm and image files are coded by the catalog number so it's easy to find them this way.

I don't know if having a ton of directories is the same as a ton of files in a directory. If anyone else knows this from a file retrieval speed issue, please let me know. I guess from a SE perspective, it doesn't matter.

#7 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,372 posts

Posted 02 January 2007 - 10:23 PM

QUOTE
Knowing now that it's bad to have so many files under a directory and root especially


It's not bad at all. If you change things now, you will have many months of pain. So unless you absolutely positively have to change your URLs, I would highly recommend against doing it.

#8 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 02 January 2007 - 11:18 PM

QUOTE(Jill @ Jan 2 2007, 10:23 PM)
It's not bad at all. If you change things now, you will have many months of pain. So unless you absolutely positively have to change your URLs, I would highly recommend against doing it.



Hi Jill,

I was informed by one of the techies who host the site that so many files (4000+) in my root directory, or in any directory, is bad for indexing by the server as far as efficiently finding a file for a request. I was not going to move anything but just start from today and begin a more efficient file structure. My main concern was whether there is some structure that spiders don't like over others.

thanks

#9 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,372 posts

Posted 02 January 2007 - 11:37 PM

QUOTE
I was informed by one of the techies who host the site that so many files (4000+) in my root directory, or in any directory, is bad for indexing by the server as far as efficiently finding a file for a request.


I don't believe that is true.

#10 Ron Carnell

Ron Carnell

    HR 6

  • Moderator
  • 959 posts
  • Location:Michigan USA

Posted 03 January 2007 - 12:55 AM

QUOTE
I was informed by one of the techies who host the site that so many files (4000+) in my root directory, or in any directory, is bad for indexing by the server as far as efficiently finding a file for a request.

Sorry, Jill, but that is indeed true. How true it is depends on the operating system and file system. Directories on older file systems, like FAT32, are simply linked lists. A directory with N entries will take, on average, N/2 reads to find any given entry. A newer file system, like NTFS, stores directory entries in a B-tree, so an average access drops to roughly the logarithm of N. Much, much, much more efficient. It's also safer since, as all old DOS veterans will attest, linked lists tend to occasionally become unlinked lists.

Unix-type systems, just like Microsoft ones, can go either way. I haven't checked in a few years, but last time I did, Red Hat Linux was still using the older linked list. Randy might be able to corroborate that? Since Linux itself uses a Virtual File System, other vendors can very easily connect to other, potentially better, file systems.

QUOTE
I don't know if having a ton of directories is the same as a ton of files in a directory. If anyone else knows this from a file retrieval speed issue, please let me know.

In a linked list, an entry is an entry is an entry. So, yes, directories are the same as files. Whether your directory holds 10K files or 10K sub-directories, the effects on (in)efficiency are the same.

You are NOT necessarily going to get a lot more efficiency out of a more complicated structure. Some, but not a lot.

If you have to find a file in a single directory with N entries, as we already said, it's going to take on average N/2 reads to do it. If you bury that file three directories deep, it's still going to take X/2 + Y/2 + Z/2 reads, where the variables represent the number of entries in each of the three directory levels. Unless you actually create a directory structure that resembles a binary tree, something that isn't always easily done, you're not going to gain a lot. Some, but not a lot.

That being the case, I agree with Jill, you certainly aren't going to gain enough to offset what you will definitely be losing with already established URLs. My advice would be to do it only on a go-forward basis and "try" to design a structure with multiple branches (nodes) at every level. You can use the formulas above (N/2 and X/2 + Y/2 + ...) to get a feel for how much efficiency you are gaining.

p.s. On a modern computer, 4K files in a single directory is a lot, but it's not really a LOT. I frequently run applications that have 10K files in a directory. You've probably picked a very good time to start addressing the problem because, in my opinion, you're just entering the edge of concern.
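
To get a feel for the N/2 and X/2 + Y/2 + Z/2 formulas above, here is a small sketch that simply evaluates them. It assumes the linear-scan (linked list) directory model described in this post; the function names and the example layouts are hypothetical, and actual costs depend on the file system and on caching.

```python
def average_reads_flat(n):
    """Average reads to find one entry in a single directory of n entries."""
    return n / 2


def average_reads_nested(entries_per_level):
    """Average reads when each directory level is scanned in turn.

    entries_per_level lists how many entries sit in the directory
    visited at each level, for example [X, Y, Z].
    """
    return sum(n / 2 for n in entries_per_level)


# 4000 files in one directory.
flat = average_reads_flat(4000)

# Evenly branched layout: 16 folders, each with 16 folders, each with 16 files.
nested_balanced = average_reads_nested([16, 16, 16])

# Lopsided layout: 2 top-level folders holding 2000 files each.
nested_lopsided = average_reads_nested([2, 2000])

print(flat, nested_balanced, nested_lopsided)  # 2000.0 24.0 1001.0
```

Under this model, the saving depends almost entirely on how evenly the entries are spread across levels, which is why the advice is to aim for multiple branches at every level rather than a few very large folders.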

#11 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 03 January 2007 - 08:21 AM

QUOTE
I haven't checked in a few years, but last time I did, Red Hat Linux was still using the older linked list. Randy might be able to corroborate that?


It was as of RHEL 3, Ron. In fact, there was a bug that needed to be addressed with regard to corrupted linked lists in 3, if memory serves. I've not checked RHEL 4, but would suspect that it too uses linked lists. Such an alteration of the base code would have been a major announcement in the Changelog if they'd changed it, and I haven't seen that one floating around anywhere.

FTR, I agree completely with all of Ron's conclusions.

#12 deemeranddommer

deemeranddommer

    HR 1

  • Members
  • Pip
  • 7 posts
  • Location:Africa

Posted 08 January 2007 - 03:34 PM

Thanks Ron and Randy. My hosting company's tech dept informed me they don't like people FTP'ing and pulling up directories with over 2000 files because of CPU usage on the server. I can understand this. I guess the best course is to start a different but simple structure for new files alongside what is already there, so that no file moves will break my links and rank. Hopefully, by the time I reach another 4K in files, directories or whatnot, technology will improve on disk access.

#13 perkle

perkle

    HR 4

  • Active Members
  • PipPipPipPip
  • 121 posts

Posted 21 January 2007 - 05:50 PM

Hi Jill
You said in your statement:
It's not bad at all. If you change things now, you will have many months of pain. So unless you absolutely positively have to change your URLs, I would highly recommend against doing it.

How long a period of time are you talking about?
Also, could I have left the home page as default.asp instead of changing it to .html?

#14 perkle

perkle

    HR 4

  • Active Members
  • PipPipPipPip
  • 121 posts

Posted 22 January 2007 - 01:18 PM

QUOTE(Jill @ Jan 3 2007, 12:23 AM)
It's not bad at all. If you change things now, you will have many months of pain. So unless you absolutely positively have to change your URLs, I would highly recommend against doing it.


Can anyone help with my question? My original site was built with software and the home page had a URL like domainname.com/default.asp, and I have now changed it to domainname.com/index.html

Should I (and could I) have left the home page name as it was and changed the others to static html pages?

How long does it usually take Google to sort this out and assign page rank? Is there anything I can do?

Perkle

#15 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 January 2007 - 04:05 PM

Did all of the page names change? Or just the extension? And how long ago did you make the change?

You'll probably want to 301 the default.asp page over to the index.html page for starters, just in case anyone (or even you) were linking to it with the page filename as part of the link. If all of the rest of the page names were different as well, you can also redirect them to the most appropriate new page.

The rest is going to depend upon how long ago you changed the structure. You don't want to be flip flopping back and forth too much because you'll just confuse the spiders and start the re-indexing process all over again.

Changing page names can be painful in the short term, but isn't nearly as bad as changing the domain name.
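
A minimal sketch of the redirect logic described above, written as a plain Python function so the idea is easy to follow. In practice a 301 is normally set up in the web server's own configuration rather than in application code. The `OLD_TO_NEW` table and the `resolve` helper are hypothetical names, and the only mapping shown is the default.asp to index.html case from this thread.

```python
# Map each retired URL path to the page that replaces it.
OLD_TO_NEW = {
    "/default.asp": "/index.html",
}


def resolve(path):
    """Return (status, headers) for a requested path.

    Retired paths get a permanent (301) redirect to their replacement;
    anything else is served as usual, represented here by status 200.
    """
    new_path = OLD_TO_NEW.get(path)
    if new_path is not None:
        return 301, {"Location": new_path}
    return 200, {}


print(resolve("/default.asp"))
print(resolve("/index.html"))
```

If other page names changed as well, each old path would get its own entry pointing at the most appropriate new page.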



