Matt Cutts Giving Out Clues
Posted 24 July 2007 - 08:24 AM
Tidy cleans up many errors, but not all. It's probably in a webmaster's interest to keep the code clean to minimize the risk of validation errors, and of course to improve usability by ensuring pages load properly and quickly in all browsers.
Google's "validation" may be a subset of HTML validation. They may not care about harmless errors, but they might want to warn webmasters whose pages are so FUBAR that Google can't index them properly. Another possibility is that if a site has lots of pages and Google needs to do extra work to decipher them, they might decide not to process all the pages because it's more work than the site is worth.
Just speculating, but I find it very odd that Matt pulls validation out of his back pocket.
Posted 24 July 2007 - 08:44 AM
I think it would be a good idea if they did, and it would help with cross-browser compatibility, standardisation and accessibility, which can only be a good thing.
It needs players like the SEs to embrace standards and to encourage / force websites to adhere to them; the W3C can't do it alone, especially when the feeling among SEO communities has very much been 'screw the standards, it makes no odds to SEO and ROI'.
As soon as it makes a difference to ROI, just watch all the websites suddenly become valid and standards compliant.
Let's just hope they do adopt W3C's standards and not their own!
Edited by 1dmf, 24 July 2007 - 08:50 AM.
Posted 24 July 2007 - 08:47 AM
There are a couple of plugins for Firefox that validate using Tidy. That way, you can debug code errors as you design and preview, which saves loads of grief.
I am using HTML Validator 0.8.3.9, which suits me, but you have to be a little bit selective with its "Warnings" as opposed to its "Errors".
Posted 24 July 2007 - 08:54 AM
Are there any plugins for IE? And if I use FF, which warnings do you mean?
Posted 24 July 2007 - 09:05 AM
Don't get me wrong; standards are great, but for a different set of reasons. Standards should be promoted so that less development time is spent on cross-browser consistency and the like, and more on developing better, more accessible web sites. They shouldn't be promoted as a means of judging a page's worth, though.
PS 1dmf - you yourself have illustrated another reason that standards shouldn't be used to judge a page: meaningless errors often crop up causing validation to fail, but the "errors" have no real effect. Try validating the CSS on the first link in your signature.
Posted 24 July 2007 - 09:17 AM
But as far as most devices are concerned, invalid CSS is irrelevant; it's the markup that is the important stuff - the content, not the design / colour painted over it via CSS.
But thanks for pointing it out, I like my stuff to validate.
Posted 24 July 2007 - 09:23 AM
And hehe - fair enough. I was just pointing that out as an example of how easily little errors can crop up.
Posted 24 July 2007 - 11:19 AM
Having said that, if I were to add validation to my search engine mix on the assumption that accessible code is good for visitors, then I might not use the full validation script to do it. I might only use the parts that have been shown to affect usability and accessibility, and that I wasn't already looking at for other reasons.
For example, Google already looks at ALT attributes for images. It would be easy to spot and report images that have no ALT attribute. That is related to W3C validation, but it would also be search validation. On the other hand, declaring a document to be HTML 4 Strict but then using XHTML for a widget someplace on the page is really unlikely to cause an issue with any browser I'm aware of, nor with search engines. (You used ">" instead of " />"? Horrors! Into Supplemental you go!)
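For the curious, that ALT check really is only a few lines. A minimal sketch in Python (a hypothetical illustration, not anything Google actually runs), using the standard library's forgiving HTMLParser to count <img> tags with no ALT attribute:

    # Sketch: flag images that have no ALT attribute at all.
    from html.parser import HTMLParser

    class AltChecker(HTMLParser):
        def __init__(self):
            super().__init__()
            self.total = 0    # every <img> seen
            self.missing = 0  # those with no alt="" of any kind

        def handle_starttag(self, tag, attrs):
            if tag == "img":
                self.total += 1
                if "alt" not in dict(attrs):
                    self.missing += 1

    checker = AltChecker()
    checker.feed('<p><img src="a.gif" alt="logo"><img src="b.gif"></p>')
    print(checker.missing, "of", checker.total, "images lack ALT")  # 1 of 2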
Posted 24 July 2007 - 12:35 PM
If you declare a doctype and then don't use the correct end tag format, this could throw a spanner in the works, I guess, but I'm not sure; as you say, the difference between <br> and <br /> is hardly going to kill the browsers or the SEs.
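To make that concrete: a lenient parser literally can't tell the two apart unless it goes out of its way to. A quick sketch with Python's standard HTMLParser (assuming, plausibly, that an SE's parser is at least this forgiving):

    from html.parser import HTMLParser

    class TagLogger(HTMLParser):
        # handle_startendtag falls back to handle_starttag by default,
        # so <br> and <br /> land in exactly the same callback.
        def handle_starttag(self, tag, attrs):
            print("saw tag:", tag)

    TagLogger().feed("one<br>two")    # saw tag: br
    TagLogger().feed("one<br />two")  # saw tag: br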
Posted 24 July 2007 - 07:10 PM
The plugin validator I mentioned indicates both errors and warnings. The warnings could just be deprecated tags or attributes, as opposed to errors such as a tag opened but not closed, i.e. a missing </td> etc.
Posted 24 July 2007 - 10:47 PM
Actually, no, it wouldn't.
Those sorts of programs have a goal that is different to an SE parser's. A tidy/validator program is interested in helping you remove any and all errors. An SE parser, OTOH, is interested in extracting data from an HTML document. Whether or not the code is "tidy", "validates" or anything else is really irrelevant. They just want to pull out the information, in a way that maintains the semantic markup (headings, titles etc) so they can use it to score documents.
Put in simple terms, why would an SE care if your code is "<table><td>hello</td></table>" or "<table><tr><td>hello</td></tr></table>"? Either way, they extract the word "hello". A lot of semantic markup is irrelevant to SEs, as are a lot of the rules "violations" that a validator (neccessarily and rightly) will point out.
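A toy demonstration of that point, assuming nothing more than an error-tolerant parser (Python's built-in HTMLParser qualifies): the sloppy table and the valid table yield exactly the same extracted text.

    from html.parser import HTMLParser

    class TextGrabber(HTMLParser):
        def __init__(self):
            super().__init__()
            self.words = []

        def handle_data(self, data):
            if data.strip():
                self.words.append(data.strip())

    for markup in ("<table><td>hello</td></table>",
                   "<table><tr><td>hello</td></tr></table>"):
        grabber = TextGrabber()
        grabber.feed(markup)
        print(grabber.words)  # ['hello'] both times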
Besides which, using two programs to parse a document, both written in a high-level language like PERL, is likely slower and less efficient than writing a C app from scratch. Remember too: Google has been doing this for ~10 years, and given that this is the most important part of their whole business (no parsing and indexing, no SE), I doubt they would use anyone else's validation code for the parser.
Have a read of http://infolab.stanf...rub/google.html, the original paper by Brin and Page, and you'll see this quote:
So they have been working on the parser for years! Besides which, any "tidy" program would need to be robust enough to not only spot, but also fix, any validation errors to be of any use whatsoever. <hmtl> is hard to fix, as is "<P We are the best" or heaps of other issues.
Actually, anyone interested in SEO should read that paper (it isn't very long).
A lot of the assumptions are that Matt offered validation because it matters to Google, which it doesn't for rankings, but does for parsing. The best way to solve your problem from that quote, Jonathon, is not with code but with social engineering. Validated code will always be easier to parse (because it obeys the rules), and that alone is good for Google. I bet they are constantly working on the parser, as new and improved idiots are always being spat out of the genetic production line, imagining new ways to F^%$^% up their code.
Even though the W3C validator exists, having a validator as part of Webmaster Central (from here on in, GWC) is likely to have an effect on many people's code, as they try to "make Google happy". It is far more efficient (and scales better) to solve problems by helping people help themselves.
That is the premise of open source: everyone pitching in is better, and everyone pitching in to write better code helps Google parse code more efficiently and effectively, build a better index, serve users better, etc.
Separate from why they would offer it is how they would provide a validation service. Remember: that is where this speculation started, with Monsieur Cutts' question about what the Webmaster Central team should work on next. Run the W3C PERL validator code on an HTML document they have already downloaded, with a simple change of template, and voilà - validation right in the GWC. That doesn't flood the W3C servers, which is why people make code available, and it keeps people at the GWC.
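For what it's worth, the W3C service can also be queried over plain HTTP, so a "validate this page" feature is mostly glue code. A rough sketch (the X-W3C-Validator-* response headers belong to the legacy markup validator's interface; treat the details here as an assumption, and note a GWC-scale tool would self-host the validator rather than hammer the public one):

    import urllib.parse
    import urllib.request

    def validate(page_url):
        # Ask the public W3C markup validator about a URL.
        check = ("http://validator.w3.org/check?uri="
                 + urllib.parse.quote(page_url, safe=""))
        with urllib.request.urlopen(check) as resp:
            status = resp.headers.get("X-W3C-Validator-Status")  # Valid / Invalid
            errors = resp.headers.get("X-W3C-Validator-Errors")  # error count
        return status, errors

    print(validate("http://www.example.com/"))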
As I said, no conspiracy, nothing to get excited about.
The effort to "validate" code, AFAIAC, is simply too much for an SE to compute for every single web document, when the benefits are minimal and the time cost huge (8 billion documents validated is a LOT of CPU cycles). Why bother, when the parser is already under enough pressure and there are time constraints?
Edited by projectphp, 24 July 2007 - 10:57 PM.
Posted 24 July 2007 - 11:09 PM
My response is, "Give me better rankings if I provide a better quality website." Better quality includes lots of things: clean coding, proofread content, no broken links, and so on, that are good for users. Neatness counts.
Posted 24 July 2007 - 11:40 PM
Validation != quality.
The benefit of validation is that there is no confusion. Non-validation is like a garbled call: you might succeed in communicating your message or you might not. With validation, you know that what you wrote is being interpreted the way you intended (unless your code validates but your coding is poor). That is why people recommend <H#> rather than styled text, because an <H#> means something.
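Here's roughly what "means something" looks like from a parser's side (a hypothetical sketch, same standard-library parser as above): the <H#> is visible as a heading, while text that is merely styled big is just text.

    from html.parser import HTMLParser

    HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

    class HeadingGrabber(HTMLParser):
        def __init__(self):
            super().__init__()
            self.in_heading = False
            self.headings = []

        def handle_starttag(self, tag, attrs):
            if tag in HEADINGS:
                self.in_heading = True

        def handle_endtag(self, tag):
            if tag in HEADINGS:
                self.in_heading = False

        def handle_data(self, data):
            if self.in_heading and data.strip():
                self.headings.append(data.strip())

    grabber = HeadingGrabber()
    grabber.feed('<h1>Widgets</h1><span style="font-size:2em">Gadgets</span>')
    print(grabber.headings)  # ['Widgets'] - the styled span is invisible as a heading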
Some of the other things you mention might help, and some are pretty easy to check without validation (if you link to a page that 404s, that relationship is known, and you can track back to the linking pages and mark them "not really reliable"), while some are less so (proofreading is a challenge, especially when it is hard to know what content matches up).
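And the broken-link check really is that mechanical. A bare-bones sketch (standard library only; a real crawler would reuse pages it had already fetched instead of re-requesting them, and the URL below is just a placeholder):

    import urllib.error
    import urllib.request

    def is_broken(url):
        # HEAD keeps it cheap: we only want the status code, not the body.
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.status >= 400
        except urllib.error.HTTPError as e:
            return e.code >= 400   # 404 and friends arrive as exceptions
        except urllib.error.URLError:
            return True            # DNS failure, refused connection, ...

    print(is_broken("http://www.example.com/no-such-page"))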
The question is not what makes a good site, but how much gain there is in adding an element to the SERP equation. Improvement / energy in = relative value of using it. The energy in for validation is high, and the improvement for a binary flag (validates || doesn't) is likely rather small.
Of course, I could be entirely wrong, but as there is no evidence validation is part of the algo, Occam is my mate.
Posted 25 July 2007 - 12:02 AM
Validation and length of code are two rough ways to predict the effort required to parse a page. Long pages take more resources, and so do pages with lots of errors. As you cited above, it's hard to write a parser that can handle lots of errors and still run reasonably efficiently.
Posted 25 July 2007 - 01:34 AM
The issue then is twofold: bloat and parsing time.
What you really want to do is remove any code that doesn't carry meaning. <font color=""> is one such example, as it conveys nothing about the content, only about the layout. CSS, by and large, is a good starting point there, but there really isn't a blanket rule on code bloat. Some CSS-heavy sites are bloated (add in a bunch of ads and everything bloats), and some CSS-free, validation-free sites are bloat-free.
Look, validation certainly can't hurt, but the help is of the "will have fewer issues than random code, as a general rule" kind, NOT "validating is a bonus with Google".
Google's goal is still to help users find stuff and, by and large, that has nothing to do with validation.
At least IMHO.