TagSoup - Just Keep On Truckin'

TagSoup is a SAX-compliant parser written in Java that, instead of parsing
well-formed or valid XML, parses HTML as it is found in the wild: poor,
nasty and brutish, though quite often far from short.  TagSoup is designed
for people who have to process this stuff using some semblance of a rational
application design.  By providing a SAX interface, it allows standard XML
tools to be applied to even the worst HTML.  TagSoup also includes
a command-line processor that reads HTML files and can generate either
clean HTML or well-formed XML that is a close approximation to XHTML.
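
Because the parser is exposed through SAX, code written against the
standard org.xml.sax interfaces does not need to know which parser is
behind it.  The sketch below illustrates that idea with a small handler
that collects link targets.  The LinkCollector class and its collect()
helper are hypothetical names made up for this example.  To stay runnable
without extra jars, main() uses the JDK's built-in parser, which only
accepts well-formed input; the comment shows where TagSoup's
org.ccil.cowan.tagsoup.Parser would be swapped in, assuming the TagSoup
jar is on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the href of every <a> element
// reported by whatever SAX XMLReader it is attached to.
class LinkCollector extends DefaultHandler {
    final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Parsers that are not namespace-aware leave localName empty,
        // so fall back to the qualified name.
        String name = localName.isEmpty() ? qName : localName;
        if ("a".equalsIgnoreCase(name)) {
            String href = atts.getValue("href");
            if (href != null) {
                hrefs.add(href);
            }
        }
    }

    // Runs the given reader over the markup and returns the links found.
    static List<String> collect(XMLReader reader, String markup)
            throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.hrefs;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in: the JDK parser, which requires well-formed input.
        // With the TagSoup jar available, this line would instead be:
        //   XMLReader reader = new org.ccil.cowan.tagsoup.Parser();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        String markup =
            "<html><body><a href=\"x.html\">x</a></body></html>";
        System.out.println(collect(reader, markup));
    }
}
```

Only the line that creates the XMLReader depends on the parser in use;
the handler itself is ordinary SAX code.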

WWW: http://vrici.lojban.org/~cowan/XML/tagsoup