MS Word to HTML Converter
#1
Posted 14 January 2005 - 03:22 PM
I have some MS Word files that I need to convert to HTML. When I do so with Word, it makes the HTML very complicated. I don't want any font tags etc.; all I want is a simple conversion using <p> (new paragraph), <b> (bold) and <li> (for bullets).
No font tags or other extra tags, since all of that is picked up on my site from the stylesheet.
Any ideas on what's the best way?
#2
Posted 14 January 2005 - 03:26 PM
http://tidy.sourcefo....html#word-2000
Since HTML Tidy can be used in batch mode, that might be a good start.
Ian
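For what it's worth, here is a rough sketch of what a batch run might look like if you drive Tidy from Python. It assumes the `tidy` executable is on your PATH and that the Word files have already been saved as HTML into one folder. `build_tidy_command` and `tidy_folder` are names made up for this example. `word-2000` and `bare` are Tidy configuration options aimed at Word output, but check the documentation for your Tidy version before relying on them.

```python
import subprocess
from pathlib import Path


def build_tidy_command(html_file):
    """Return the argument list for one Tidy run on html_file (hypothetical helper)."""
    return [
        "tidy",
        "-q",                  # quiet: suppress nonessential output
        "-m",                  # modify the input file in place
        "--word-2000", "yes",  # strip Word 2000 specific markup
        "--bare", "yes",       # strip Microsoft specific HTML, plain spaces for nbsp
        str(html_file),
    ]


def tidy_folder(folder):
    """Run Tidy on every file matching *.htm* in folder; return the files processed."""
    done = []
    for path in sorted(Path(folder).glob("*.htm*")):
        # Tidy exits with 1 for warnings and 2 for errors, so check=True is not used.
        result = subprocess.run(build_tidy_command(path))
        if result.returncode < 2:
            done.append(path)
    return done
```

Because `-m` overwrites the files, it is safer to run this on a copy of the folder, for example `tidy_folder("word_export_copy")`.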
#3
Posted 14 January 2005 - 03:32 PM
I don't know what version of Word you are using, but this plug-in from MS will go a long way towards cleaning up your Word-generated HTML files:
The only catch: it is for Word 2000...
Phil
#4
Posted 14 January 2005 - 03:47 PM
If you can get your hands on Macromedia Dreamweaver, there's a command called "Clean Up Word HTML".
Also, when you choose to save the document as a web page from the "File" menu in Word, the resulting dialog lets you choose the file type, which you should set to "WebPage with filter" - or something like that - straight translation from my non-English Office.
Another possible workaround would be to save your document from Word as HTML, view it in MS Internet Explorer, then select and copy the text into an HTML editor. Some editors preserve the formatting but rid the code of garbage. No guarantee here: it works in Dreamweaver and may work in HomeSite, so give it a try if nothing else helps.
#5
Posted 14 January 2005 - 07:41 PM
The document has a large number of bulleted lists with sub bullets. I need something that would give me simple HTML using ul and li tags.
Any suggestions?
#6
Posted 14 January 2005 - 08:50 PM
It's a simple solution, and you lose all formatting, but you also lose all that MS Word stuff that's just not needed.
I drop it into Notepad, save it, then copy it into my HTML editor... removes everything. Doesn't cost a dime.
#7
Posted 15 January 2005 - 03:41 AM
In FrontPage: Ctrl-A (select all) Ctrl-Shift-Z (remove formatting)
That should remove all inline formatting, such as bold, italic and fonts, but still leave the block level formatting, like a list or H1, intact.
And, believe it or not, the IE DOM actually has a way to accomplish that through JavaScript. Run a Google search on execCommand RemoveFormat for more information.
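If you would rather do this kind of stripping outside the browser, a small script can do something similar. This is only a sketch, not what FrontPage or RemoveFormat actually does: it uses Python's standard `html.parser`, keeps a short whitelist of tags, and throws away every attribute and every other tag while keeping the text. The `KEEP` and `SKIP` sets and the names `SimpleHtml` and `simplify` are my own choices for illustration. Unlike RemoveFormat, it keeps `<b>`, since that was asked for earlier in the thread.

```python
from html import escape
from html.parser import HTMLParser

KEEP = {"p", "b", "ul", "ol", "li", "h1", "h2", "h3"}  # tags to keep (assumed list)
SKIP = {"style", "script", "title", "xml"}             # dropped together with their contents


class SimpleHtml(HTMLParser):
    """Re-emit only whitelisted tags, without attributes, plus the text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a SKIP element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag in KEEP and not self.skip_depth:
            self.out.append(f"<{tag}>")  # attributes such as class/style are dropped

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Re-escape &, < and > because the parser has already decoded entities.
            self.out.append(escape(data, quote=False))


def simplify(html):
    """Return html reduced to the whitelisted tags and plain text."""
    parser = SimpleHtml()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `simplify('<p class="MsoNormal"><font face="Arial">Hello <b>world</b></font></p>')` gives back `<p>Hello <b>world</b></p>`. Nested `<ul>`/`<li>` lists pass through as long as the Word export actually wrote them as list tags. An unclosed `SKIP` tag would swallow the rest of the file, so check the result by eye.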
#8
Posted 15 January 2005 - 12:46 PM
Quoting #5: "The document has a large number of bulleted lists with sub bullets. I need something that would give me simple HTML using ul and li tags. Any suggestions?"
It shouldn't; <font>, <ul> et al. will get carried over. Have you tried any of the methods? Did anything work?
#9
Posted 15 January 2005 - 05:27 PM
Did you try Ian's suggestion of using Tidy?
L.