Dave Raggett's HTML Tidy (http://tidy.sourceforge.net) is a wonderful open source tool for cleaning up HTML pages, including converting them to XHTML. Use it. HTML Tidy is a command line tool written in reasonably portable ANSI C, and precompiled binaries are available for most major platforms. To run it, just put the binary somewhere in your path, and use the --output-xhtml option to indicate you want XHTML output (instead of HTML). For example, the command below converts the file shows.html to XHTML.
C:/>tidy --output-xhtml shows.html
This dumps the converted document onto stdout, from where it can be redirected into a file in the usual way. If you prefer to convert the file in place, use the -m option.
C:/>tidy --output-xhtml -m shows.html
HTML Tidy cannot fix every error it finds in a typical malformed HTML document, and a few corrections will likely still have to be applied manually. However, it does a very good job with most files, and it fixes enough problems to save you a significant amount of time.
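When there are many pages to convert, the in-place invocation can be scripted. The following Python sketch is only an illustration, not part of HTML Tidy itself. It assumes the tidy binary is on your path, that your build follows the usual exit status convention (0 for a clean run, 1 for warnings, 2 for errors), and that it accepts an explicit yes value after --output-xhtml. The function names tidy_command and convert_tree are hypothetical.

```python
import subprocess
from pathlib import Path


def tidy_command(path, in_place=False):
    """Build the argument list for converting one HTML file to XHTML."""
    # Assumption: the option's value is spelled out explicitly as "yes".
    cmd = ["tidy", "--output-xhtml", "yes"]
    if in_place:
        cmd.append("-m")  # modify the original file instead of writing to stdout
    cmd.append(str(path))
    return cmd


def convert_tree(root):
    """Convert every .html file under root in place.

    Returns the files Tidy reported errors for, which are the ones
    most likely to need manual correction.
    """
    needs_review = []
    for page in sorted(Path(root).rglob("*.html")):
        result = subprocess.run(tidy_command(page, in_place=True))
        # Assumed convention: exit status 2 or higher means errors.
        if result.returncode >= 2:
            needs_review.append(page)
    return needs_review
```

Calling convert_tree("site") would run Tidy over every .html file below a directory named site and return the pages that still need attention. Because -m overwrites the originals, run it on a copy or under version control.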