Listed In Yahoo And Msn Not Google
Posted 16 December 2004 - 10:43 AM
I have been working on this site, www.garah.com, for the past 6 months. I have optimized my page and built up links from directories. I am ranking well in Yahoo and MSN for a few keywords, but Google has yet to show any sign that the site has even been found. Any help would be appreciated.
Posted 16 December 2004 - 10:51 AM
There seem to be exceptions to this rule, but in general a new site nowadays can take a (long) while before it ranks in Google the way it does in the other search engines.
Posted 16 December 2004 - 10:53 AM
The first thing I would do is go to the pages that link to your site and check that those pages have been cached by Google. Google does not even have a partial index for your site, so it looks like it does not know the site exists.
I would check that first and post back. It shouldn't take long. It could be that your domain is blacklisted, but this is highly unlikely as there is no record of any previous life for that domain on Wayback. I would guess that Google just doesn't know about your site. It has been dropping directory pages like crazy for some time now, so I would certainly look for a cached page that has a live link to your site on it.
Posted 16 December 2004 - 10:59 AM
Thanks for the reply!
My site was last spidered by Google on Dec. 2nd. The information returned was 2+2. It doesn't look like it is spidering very deep.
Posted 16 December 2004 - 11:03 AM
Thanks for the pointers!
I checked out a few of the links from other pages, and they are cached by Google. Could the use of PHP be causing a negative reaction from Google?
Posted 16 December 2004 - 11:07 AM
Normally when Google comes across a link it does not know about, it is really quick to get over there and at least partially index it. If it were a coding problem, you would show up as a URL only (partial index) in the SERPs, and a site:yourdomain search would show the partial.
Posted 16 December 2004 - 11:08 AM
Posted 16 December 2004 - 11:14 AM
I'm also getting a forced Session ID (PHPSESSID) being set by the server on both sites. That'll keep Googlebot away if others are seeing it too.
Posted 16 December 2004 - 11:18 AM
Posted 16 December 2004 - 11:47 AM
When you say set to no-store, no-cache, are you talking about the Google tag or about something else? There shouldn't be a no-cache tag on the page.
Posted 16 December 2004 - 12:28 PM
HTTP Status Code: HTTP/1.1 200 OK
Date: Thu, 16 Dec 2004 17:25:30 GMT
Server: Apache/1.3.33 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.9 mod_ssl/2.8.22 OpenSSL/0.9.7a
Set-Cookie: PHPSESSID=a0412f21795eb9fbc3fba53fca18ef89; path=/
Expires: Sat, 1 Jan 2000 00:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Last-Modified: Thu, 16 Dec 2004 17:25:30 GMT
Cache-Control: post-check=0, pre-check=0
These are the response headers.
I have checked the incoming links to this site, and the few that were PM'd to me are cached and are plain a href links, so there is no reason why G does not know about this site.
Re dup content, has this site ever been spidered? I.e., was it picked up and then dropped because of the dup?
Posted 16 December 2004 - 01:22 PM
This seems to indicate that Google is not caching the site, and I'm not sure why. The site has been spidered, but when I checked the log files they showed 2+2 from Googlebot.
Posted 16 December 2004 - 02:01 PM
Posted 16 December 2004 - 03:06 PM
Basically, Google looked at our site and made two hits in December. As for the two domains, garantahome is pointing to garah.com. They are on the same server.
Posted 16 December 2004 - 04:08 PM
When I was looking at it earlier both sites were on the same IP and had the same nameservers. That usually means they're simply aliased. That is okay, but it also removes all control you might want to have over which pages from which site get indexed.
In addition to the two already mentioned there is at least one more domain out there that is yet another duplicate. All sitting on the same IP number. One has 861 pages indexed by Google, another 104 and the one you're asking about 0.
Now if I saw that in 2 minutes worth of looking, you can safely assume that Google knows about that and more if there are others out there. To me it's little wonder they have chosen not to index the site you asked about, since all of the same content is already likely in their index under other domain names.
Hopefully you knew about these other domains, Jay. If not, you should probably have a sit-down chat with your client to find out what the real score is.
As to the PHPSESSID issue, if I had to guess, your server's PHP installation probably has Transparent Session ID support enabled. You can usually turn that off at the domain level if you need to with a line as follows in an .htaccess file placed at the root level of the site:
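(The snippet from the original post did not survive in this copy of the thread. What follows is a reconstruction, not the original line. It assumes PHP is running as an Apache module, so that php_flag is available, and that the host allows PHP settings in .htaccess; session.use_trans_sid is the PHP setting that controls Transparent Session IDs.)

```apache
# Assumption: PHP runs as an Apache module (mod_php) and the host permits
# php_flag overrides in .htaccess. If it does not, this line can trigger
# a 500 error, so test right after uploading.
# Stops PHP from rewriting links to carry PHPSESSID in the URL.
php_flag session.use_trans_sid off
```

Note that this only affects session IDs appended to URLs. The Set-Cookie: PHPSESSID header shown earlier will still be sent for as long as the script starts a session.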
I kind of doubt that's the issue, because I can chop off the PHPSESSID portion and still get the same page. So the session ID isn't required to surf the site. If it were, that would keep the spiders from being able to crawl it.
But it could cause confusion down the line if some of those pages with a session id end up getting indexed. Google tends to stay away from those, but I see a lot more of them in Yahoo/MSN. So, it could become troublesome. Bottom line: Don't set 'em if you don't absolutely need them.
Lastly, about the Cache-Control header: I didn't see you setting it in the page, but with PHP it can be set in the code without ever showing up in the HTML. If you're not setting it in your code, it's possible that your host has Apache configured to produce those header lines.
Check your raw PHP code first to see if it's being set in the script, as that will override everything else. The lines you're looking for should start with header("Cache-Control: and header("Pragma: based upon what the headers I saw said. That'll give you an easy way to search for them.
If you can't find those two lines there you can most likely use an .htaccess file to override whatever Cache-Control headers the server is sending all on its own. That would look like this:
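(The snippet from the original post is also missing here. This is a reconstruction, assuming the server has mod_headers loaded; the 86400 value is the one-day max-age discussed in the next paragraph.)

```apache
# Assumption: mod_headers is available on this server.
# The IfModule wrapper keeps the site from erroring out if it is not.
<IfModule mod_headers.c>
    # Replace the server's "no-store, no-cache" with a one-day lifetime.
    # 86400 seconds = 24 hours.
    Header set Cache-Control "max-age=86400"
</IfModule>
```

Whether this wins out over headers produced by PHP itself depends on how the server is set up, so re-check the response headers afterwards to confirm that the Cache-Control line actually changed.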
The max-age above would tell the browser to pull the files from your server again if 24 hours had passed since it was last there. You can change the 86400 (seconds) to be whatever number you need it to be. I left it at one day in case the information on your site changes often, as often happens with realty/property sites. That'll make sure everybody gets a fresh copy from day to day, but should still allow Google et al. to cache your content.
If you're really a glutton for punishment, and your server supports it, you can even use the mod_expires module of Apache to set an Expires header for different file types. So you could have your text files set to renew every day or two, but your image files that are already in someone's cache set to only update if a month has passed.
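For the curious, a rough sketch of what that could look like, assuming mod_expires is loaded. The MIME types and time periods below are just illustrative picks, not anything taken from the site in question.

```apache
# Assumption: mod_expires is available on this server.
<IfModule mod_expires.c>
    ExpiresActive On
    # HTML pages: considered fresh for one day after they are fetched.
    ExpiresByType text/html "access plus 1 day"
    # Images: considered fresh for a month after they are fetched.
    ExpiresByType image/gif "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
</IfModule>
```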
But that one is outside the scope of this issue.
[Edit ... sheez, typo city!]
Edited by Randy, 16 December 2004 - 04:16 PM.