Hi,
I was just wondering if it would be possible to generate CLEAN html from MS Word files. Is it?
Thanks!
I don't know why you would. If you want to make a website with some quality use Nvu or Dreamweaver. Or learn HTML.
But here is something I found with a Google search: http://www.codinghorror.com/blog/archives/000485.html. You could have found it yourself.
Just use Notepad++ for editing and creating your (X)HTML code.
It has syntax highlighting for various scripting languages, and you can open several documents at once in tabs. If you do not know how to write an HTML document, learn that first: search Google for keywords such as "html guide" or "html tutorials". When you have finished writing your code, validate it with the W3C validator (validator.w3.org). Also check www.hotdesign.com/seybold/ for info about stylesheets.
In my opinion, no. Just try it yourself... take any document, format it, put some colors, different fonts, and then save it as html and take a look at the code. I did this once just to check and the code was a disaster.
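To give an idea of what that mess looks like and how much of it is mechanical, here is a rough Python sketch that strips a few typical Word leftovers (conditional comments, `<o:p>` tags, `Mso...` classes, inline styles). The function name and the sample string are made up for illustration, and regular expressions are a blunt tool for HTML, so treat it as a starting point only.

```python
import re

def strip_word_markup(html: str) -> str:
    """Remove a few common Word-specific constructs (hypothetical helper)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Namespaced Office tags such as <o:p> and </o:p>; their text is kept
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.IGNORECASE)
    # class=MsoNormal, class="MsoListParagraph" and similar
    html = re.sub(r'\s+class="?Mso\w*"?', "", html, flags=re.IGNORECASE)
    # Quoted inline style attributes
    html = re.sub(r"\s+style=(\"[^\"]*\"|'[^']*')", "", html, flags=re.IGNORECASE)
    return html

# Made-up sample in the style of Word's "Save as HTML" output
sample = '<p class=MsoNormal style="margin:0cm;mso-pagination:none">Hello<o:p></o:p></p>'
print(strip_word_markup(sample))
```

Even after this you would still need to check the result by hand, since it does nothing about font tags, empty spans or broken nesting.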
OK... when submitting stuff to my blog, this was something I really tried to get working (so that bold converts to <b>, every paragraph is wrapped in <p>, and hyperlinks are simple <a> tags)...
I haven't seen the program in Roald's post before, and it doesn't do exactly what I would want anyway.
The solution (the way I see it) is to use FCKeditor from
http://www.fckeditor.net/
for typing up stuff in the first place.
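If you already have pasted markup and only want to keep the simple tags described above, a whitelist filter is one way to do it. This is a minimal sketch using Python's standard `html.parser`; the class name, the `ALLOWED` set and the `simplify` helper are my own assumptions and have nothing to do with how FCKeditor works internally.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these are the only tags the blog post should keep
ALLOWED = {"p", "b", "a"}

class SimpleTagFilter(HTMLParser):
    """Keep allowed tags (without extra attributes) and all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text still passes through
        href = dict(attrs).get("href") if tag == "a" else None
        if href:
            self.out.append('<a href="%s">' % escape(href, quote=True))
        else:
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def simplify(html):
    """Return the fragment reduced to <p>, <b> and <a href> (hypothetical helper)."""
    parser = SimpleTagFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

messy = ('<p class="MsoNormal"><span style="color:red"><b>Hi</b></span> '
         '<a href="http://example.com" target="_blank">link</a></p>')
print(simplify(messy))
```

It is meant for body fragments only: the contents of a `<style>` or `<script>` block would come through as plain text, so cut those out first.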
Thank you. I haven't tested the program; I just searched Google and posted the first result.
I would never ever ever recommend using MS Word for making websites. You will never get clean code.
You're better off learning HTML.
For this, online I would recommend sites such as
www.w3schools.com
www.alistapart.com
For books, I would recommend the "SAMS Teach Yourself" series.
If I remember right, Dreamweaver has a feature to load Word-generated HTML and clean it up.
Next I tried FCKeditor, but when I wanted to use it in a real application it cleaned nothing; its Word HTML cleanup works poorly.
I also tried the W3C's HTML Tidy.
Now I use a tool that is not designed for this job.
It is Texy!
I like its syntax. I use it to convert documents from Word (and other programs) by copy and paste.
Also, the W3C, the organisation that sets the coding standards, has its own application called Amaya. It can be downloaded from here - http://www.w3.org/Amaya/
Obviously, as it's written by the W3C, it produces standards-compliant code.
In my experience Dreamweaver churns out some pretty dirty code too! But it's a step in the right direction...
Years ago I used Symantec Visual Page, but I don't believe that exists in the same form any more, ah well.
Personally, these days I'd recommend a simple text editor (just Google for the best) coupled with a CSS-based approach, which leads to very quick, clean and accessible code.
In addition to welshsteve's recommendations, might I add www.positioniseverything.net - there are more links for further reading on there too.
Good luck with whatever you're designing!
Cheers,
fuzzy
The best way to get clean code is to hand code it. You can save time by using an editor to do most of the work and then going back to edit the code to make it standards compliant. I don't care what editor you're using, there will always be some extra code left over as you add and remove items in your design. The only way to get rid of it is to edit the source once you have the finished design.
| rfwrangler wrote: |
| The best way to get clean code is to hand code it. You can save time by using an editor to do most of the work and then going back to edit the code to make it standards compliant. I don't care what editor you're using, there will always be some extra code left over as you add and remove items in your design. The only way to get rid of it is to edit the source once you have the finished design. |
I agree.
I used to use a program called Webstyle by Xara. It produces ridiculously bad code: it adds its own attributes to tags, uses VBScript to concatenate strings into tags, and so on, which means that once the code has been changed by hand, the Webstyle editor can no longer open the file if you want to edit it again.
However, it shipped with some excellent, nice-looking templates, so I produced a single page and then edited the code by hand to make it standards compliant.
I will reiterate what everyone else has said; there is absolutely no substitute for learning and using HTML. However, if you have Word documents that absolutely must be converted to HTML, the demoroniser tool can be useful:
http://www.fourmilab.ch/webtools/demoroniser/
I haven't used this tool, but it is supposed to be very good and useful.
HTH.
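One thing any Word cleanup has to deal with is Word's typographic punctuation (curly quotes, long dashes, the single-character ellipsis), which tends to turn into garbage when the page encoding is wrong. Below is a small Python sketch of that one step. It is not the demoroniser's code and I'm not claiming it works the same way; the table and the function name are just my own illustration.

```python
# Hypothetical mapping from Word's "smart" punctuation to plain ASCII
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict of single characters mapped to strings
TABLE = str.maketrans(REPLACEMENTS)

def plain_punctuation(text):
    """Replace typographic punctuation with ASCII equivalents."""
    return text.translate(TABLE)

print(plain_punctuation("\u201cIt\u2019s fine\u2026\u201d"))
```

If you would rather keep the nice typography, the alternative is to leave the characters alone and make sure the page is served as UTF-8.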
Actually you are all wrong. You can convert Microsoft Word documents into HTML using media-convert. I would not recommend using it to make proper web pages, as it can sometimes alter the layout slightly, but generally it works!!!
B'Owl
Word is for office use, and the only form in which it should appear online is the .pdf file printed from it! This is my opinion. I also consider trying to build a website in Word a bad habit. I guess the best idea is to outsource a job you cannot do yourself: you save time and nerves, and someone else gains. It's better to improve your abilities than to struggle with something pointless.
| Dougie1 wrote: |
| Actually you are all wrong. You can convert Microsoft Word documents into HTML using media-convert. I would not recommend using it to make proper web pages, as it can sometimes alter the layout slightly, but generally it works!!! |
It definitely works better than Word's own conversion and generates cleaner code, but it still produces lots of deprecated tags, and the code could easily be cleaned up much further by hand...
| mariohs wrote: |
| | Dougie1 wrote: |
| | Actually you are all wrong. You can convert Microsoft Word documents into HTML using media-convert. I would not recommend using it to make proper web pages, as it can sometimes alter the layout slightly, but generally it works!!! |
| It definitely works better than Word's own conversion and generates cleaner code, but it still produces lots of deprecated tags, and the code could easily be cleaned up much further by hand... |
Yes. Unfortunately the code is not perfect, and the conversions on that site are not always perfect either. For example, the WMV it produces does not stream to Xbox 360s, so it is not encoded correctly. It is a great site though. And even though it does convert documents into HTML, it is much better to learn simple HTML than to do it this way.
| Dougie1 wrote: |
| Yes. Unfortunately the code is not perfect, and the conversions on that site are not always perfect either. For example, the WMV it produces does not stream to Xbox 360s, so it is not encoded correctly. It is a great site though. And even though it does convert documents into HTML, it is much better to learn simple HTML than to do it this way. |
Yes, it is indeed a pretty good site, I have no doubts about it... I'd say that if you know HTML and CSS and have some spare time, write the HTML from scratch. Otherwise, use the site's converter (which, I'd like to stress, is better than Word's own conversion). Your code will be far from clean, but the resulting HTML will look close to your Word document.
If you guys are looking for a way to tidy your html then the best solution is to use HTML TIDY.
It cleans up the errors for you and gives you warnings about what you've done wrong, such as attributes in the wrong places or tags that are nested incorrectly.
You can easily find HTML TIDY by googling it.
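To show what a nesting problem actually is, here is a tiny checker written with Python's standard `html.parser`. It is only my own illustration of the kind of thing such a tool warns about; it is not HTML TIDY, it fixes nothing, and the names in it are made up. It is also stricter than HTML requires, because it expects every non-void tag to be closed explicitly.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (a partial list, for illustration)
VOID = {"br", "hr", "img", "input", "meta", "link"}

class NestingChecker(HTMLParser):
    """Track open tags on a stack and record mismatched closing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        elif tag in self.stack:
            self.problems.append("</%s> closes before </%s>" % (tag, self.stack[-1]))
            # Recover by discarding everything up to the matching open tag
            while self.stack.pop() != tag:
                pass
        else:
            self.problems.append("</%s> has no matching open tag" % tag)

def check_nesting(html):
    """Return a list of human-readable nesting problems (hypothetical helper)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for tag in checker.stack:
        checker.problems.append("<%s> is never closed" % tag)
    return checker.problems

for problem in check_nesting("<p><b><i>oops</b></i></p>"):
    print(problem)
```

A real cleanup tool goes much further and rewrites the markup for you, so use something like this only to understand the warnings.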
