Quote:
Originally Posted by tripy
XSL cannot be applied to HTML.
|
But it can be applied to XHTML, since XHTML is well-formed XML.
Quote:
Originally Posted by tripy
I have no past experience in Java, but try to look for a DOM parser.
It allows you to instantiate a tree of the page, and then you should be able to extract what is useful for you, and re-create a simple HTML page with an XSL processor.
|
Since version 1.4, all standard Java installs come with both an XML parser and an XSL transformer (the JAXP API, in the javax.xml.parsers and javax.xml.transform packages).
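To give an idea of what that looks like, here is a rough sketch using only the classes bundled with the JDK. The stylesheet, the class name and the sample page are made up for illustration; it assumes the input is already well-formed XHTML with no DOCTYPE.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XhtmlTransform {
    // Hypothetical stylesheet: keeps only the text of the <h1> elements
    // and wraps it in a very small HTML page.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:x='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='x'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<html><body><p><xsl:value-of select='//x:h1'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet above to an XHTML string and returns the result.
    static String transform(String xhtml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xhtml)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
            + "<h1>Title</h1><div>noise</div></body></html>";
        System.out.println(transform(xhtml));
    }
}
```

Note the namespace prefix in the XPath: XHTML elements live in the XHTML namespace, so a plain `//h1` would not match them.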
For HTML, one option could be to use JTidy to get a DOM object. This can be fed to a transformer, manipulated, or just written out.
For your particular case, try JTidy to create a DOM, then remove the nodes corresponding to the divs you don't want, then just output the DOM object as XHTML.
Of note, JTidy is also pretty good at dealing with bad markup.
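A minimal sketch of the remove-and-output step, under a few assumptions: the unwanted divs are identified by a class attribute (the class names here are invented), and to keep the example free of external jars the DOM is built with the JDK parser from XHTML that is already well-formed. With real-world HTML you would get the Document from JTidy instead (its `Tidy.parseDOM` method returns an `org.w3c.dom.Document`). JTidy's DOM implementation may not support every DOM method, so treat this as a starting point rather than tested JTidy code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DivStripper {
    // Removes every <div> whose class attribute equals unwantedClass.
    static void removeDivsByClass(Document doc, String unwantedClass) {
        NodeList divs = doc.getElementsByTagName("div");
        // The NodeList is live, so collect the matches first, then remove them.
        List<Element> doomed = new ArrayList<>();
        for (int i = 0; i < divs.getLength(); i++) {
            Element div = (Element) divs.item(i);
            if (unwantedClass.equals(div.getAttribute("class"))) {
                doomed.add(div);
            }
        }
        for (Element div : doomed) {
            if (div.getParentNode() != null) {
                div.getParentNode().removeChild(div);
            }
        }
    }

    // Writes the DOM back out as XML (XHTML) using an identity transformer.
    static String serialize(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses well-formed XHTML, strips the unwanted divs, returns the markup.
    static String strip(String xhtml, String unwantedClass) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));
        removeDivsByClass(doc, unwantedClass);
        return serialize(doc);
    }

    public static void main(String[] args) throws Exception {
        String page = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
            + "<div class='ad'>buy now</div>"
            + "<div class='main'>the real content</div>"
            + "</body></html>";
        System.out.println(strip(page, "ad"));
    }
}
```

The same `removeDivsByClass` and `serialize` steps should carry over once the Document comes from JTidy; only the parsing line changes.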
|