Jericho HTML Parser
Release Notes

3.4   (2015-10-24)
       - Bug Fixes:
         - [62] Fixed GC performance problem in StreamedSource.
         - [71] Renderer.setHRLineLength(0) doesn't completely disable
           rendering of HR element.
         - [72] Fixed performance problem in Attributes.
         - [80] Fixed position discarded exception in StreamedSource.
         - [81] Limited left margin in Renderer based on MaxLineLength.
         - Little-endian BOM encoding detection broken.
         - HTML5 elements with forbidden end tags weren't present in
           HTMLElements.getEndTagForbiddenElementNames()
       - CHANGES THAT COULD AFFECT THE BEHAVIOUR OF EXISTING PROGRAMS:
         - Changed default character reference encoding behaviour.
           (see Config.DEFAULT_CHARACTER_REFERENCE_ENCODING_BEHAVIOUR)
         - Changed the the ordering of OutputSegments for more intuitive
           behaviour, but still backward compatible with the old API contract.
       - Added Apache License as an option for licensing.
       - Added Config.CurrentCharacterReferenceEncodingBehaviour parameter.
       - Performance improvements in name and attribute based searches after
         full sequential parse.
       - Performance improvement in CharacterReference.decode methods.
       - Added LoggerProvider.getSourceLogger() for better performance of
         highly concurrent applications.
       - Performance improvement in StreamedSource by avoiding exception
         at end of stream.
       - Compiling to target Java 1.7 instead of Java 1.5.
         (source code is however still compatible with Java 1.6)
       - Removed all raw type references from source code.
       - Improved documentation of TagType.isValidPosition to include mention
         of potential problems with Microsoft downlevel-revealed conditional
         comment tags.
       - INPUT elements missing a name attribute no longer result in an
         error message being logged.
       - INPUT elements with type attribute values of date, datetime,
         datetime-local, month, time, week, number, range, email, url, search,
         tel, and color are now recognised as text controls without warnings
         appearing in the log.
       - HTMLSanitiser.stripInvalidMarkup sample now removes content from
         <script> tags, not just the tags.
       - Upgraded to the following logger APIs:
         slf4j-api-1.7.12, log4j-2.4.1

3.3   (2012-10-31)
       - Bug Fixes:
         - [3581664] CharacterReference.decode() does not decode entities
           containing digits - &frac12; &frac14; &frac34; &sup1; &sup2; &sup3;
           &there4;
         - [3311286] SourceCompactor does not respect TEXTAREA
         - [3519131] Renderer output incorrect when constructed with an
           Element object.
         - [3538829] Renderer output of font decoration on block boundaries
           incorrect.
         - Segment.getAllStartTags(name) and Segment.getFirstElement(name)
           do not work if the argument contains upper case characters.
         - The end delimiter of a common server tag inside an escaped server
           tag is falsely recognised as the end delimiter of the escaped tag.
       - CHANGES THAT COULD AFFECT THE BEHAVIOUR OF EXISTING PROGRAMS:
         - [3427073] Segment.getStyleURISegments() now includes style element
           content as well as style attribute values.
         - [3427927] Segment.getURIAttributes() now includes the archive
           attributes of object and applet elements.
         - Comments no longer recognised inside script elements during full
           sequential parse. Previously they were recognised for compatibility
           with major browsers but modern browser behaviour has changed.
         - Changed the log level of all parsing errors from INFO to ERROR, and
           the log level of the Source.fullSequentialParse() advisory message
           from WARN to INFO. The previous levels gave the advisory message a
           higher severity than the parsing errors, preventing logging systems
           from hiding the advisory message while showing parsing errors.
           Character encoding warnings remain unchanged at WARN level.
         - Changed the behaviour of the Renderer.renderHyperlinkURL(StartTag)
           method so that relative URLs are not rendered.
         - Changed the behaviour of the Renderer so that hyperlink element
           content is not rendered if it is the same as the hyperlink URL,
           ignoring any http:// prefix or / suffix.
         - EndTag.tidy() now removes whitespace before the closing bracket.
       - Added Source(File) constructor.
       - Added OutputDocument.getSegment() method.
       - Added OutputDocument.remove(int begin, int end) method.
       - Added Renderer.setHRLineLength() method.
       - Added RenderToText.jsp webapp sample.
       - Added Segment.getRowColumnVector() method.
       - Encoding detection now ignores common encodings specified in meta tags
         that have a code unit size incompatible with the preliminary encoding.
       - Upgraded to the following logger APIs:
         slf4j-api-1.7.2, log4j-1.2.17

3.2   (2011-01-30)
       - Bug Fixes:
         - [2826979] IllegalCharsetNameException thrown when illegal encoding
           specified in the document.
         - [2837434] Potential multithreading bug in Source.getNewLine()
         - [3036182] NullPointerException when run with stringent java.policy
         - TextExtractor did not include any attribute values.
         - All unterminated character references were decoded regardless of the
           configuration settings (bug introduced in 3.1).
	       - Renderer class - <div> under <li> resulted in new line.
	       - SourceFormatter did not handle TEXTAREA elements correctly.
	       - No exceptions thrown if invalid charset is specified by server or in
	         source document.
	       - Byte order mark character was included in the source document.
	     - HTML5 elements added to HTMLElementName and HTMLElements classes.
	     - Detects HTML5 character encoding declaration.
	     - Uses Windows-1252 as the default 8-bit encoding when available instead
	       of the subset encoding ISO-8859-1.
	     - Added Renderer.setIncludeAlternateText(boolean) method.
	     - Added Renderer.renderAlternateText(StartTag) method.
	     - Added Renderer.setIncludeFirstElementTopMargin(boolean) method.
	     - Added Renderer.setDefaultTopMargin(String,int) static method.
	     - Added Renderer.setDefaultBottomMargin(String,int) static method.
	     - Added Renderer.setDefaultIndent(String,boolean) static method.
	     - Renderer now evaluates inline styles for top, bottom and left margins.
       - Added Attribute.getStartTag() method.
       - Added Segment.getURIAttributes() method.
       - Added Segment.getStyleURISegments() method.
       - Added deregister() methods to the extended tag type classes.
       - Added MicrosoftConditionalCommentTagTypes class.
       - Added StartTagType.SERVER_COMMON_COMMENT tag type.
       - SourceFormatter now inlines DOCTYPE tags.
       - Added Segment.getMaxDepthIndicator() method.
       - Added static Config.IsHTMLEmptyElementTagRecognised parameter.
       - Deprecated MicrosoftTagTypes class.
       - Upgraded to the following logger APIs:
         slf4j-api-1.6.1, log4j-1.2.16

3.1   (2009-06-11)
       - Bug Fixes:
         - [2793556] Infinite loop on Segment.getAllStartTags()
         - Infinite loop on Segment.getAllElements()
         - Segment.getFirst* methods returned segments outside the bounding
           segment.
         - Segment.getAllElements methods did not return all enclosed elements
           in some circumstances.
         - Fixed documentation errors in Segment.getAllElements methods.
       - Added StreamedSource class.
       - CHANGES THAT COULD AFFECT THE BEHAVIOUR OF EXISTING PROGRAMS:
         - Changed ParseText from class to interface.
         - Segment.getNodeIterator() now returns character references as
           separate nodes.
       - Added tag search methods based on attribute value regular expressions.
       - Added tag search methods based on HTML class attribute.
       - Added static Source.LegacyNodeIteratorCompatabilityMode property
         temporarily to restore Segment.getNodeIterator() functionality to
         that of previous versions.
       - Removed char[] based search methods in ParseText.
       - Added CharacterReference.appendCharTo(Appendable) method.
       - Added OutputDocument(Segment) constructor.
       - Added StreamedSourceCopy sample program.

3.0   (2009-04-09)
       - Requires runtime Java 5 or later
       - Bug Fixes:
         - Character references representing unicode supplementary characters
           were not decoded correctly to UTF-16 code unit pairs.
         - [2188446] Element.getDepth() and Element.getParentElement()
           returned incorrect results if called in parse on demand mode.
       - Comments are now recognised inside <script> elements.
       - API CHANGES THAT ARE NOT BACKWARD COMPATIBLE:
         - Changed package name to net.htmlparser.jericho
         - Attribute values must now be String rather than CharSequence.
         - Removed all deprecated methods/classes from previous versions.
       - All find* methods deprecated in favour of get* methods in order to
         apply a consistent naming convention across all tag search methods.
       - Tag, Element and HTMLElements classes no longer implement the
         HTMLElementName interface. (use static import instead)
       - All collections now stongly typed using generics.
       - Changed FormControlOutputStyle class to enum.
       - Changed FormControlType class to enum.
       - Added CharStreamSource.appendTo(Appendable) method.
       - Added Source.iterator() method.
       - Source now implements Iterable.
       - Internally uses StringBuilder for better performance.
       - Added Source.getNextStartTag(StartTagType) method.
       - Added Source.getNextEndTag(EndTagType) method.
       - Added Source.getPreviousStartTag(StartTagType) method.
       - Added Source.getPreviousEndTag(EndTagType) method.
       - Added Segment.getAllStartTags(StartTagType) method.
       - Added all Segment.getFirst* methods.
       - Added Renderer.renderHyperlinkURL(StartTag) method.
       - Added HTMLSanitiser sample program.
       - Upgraded to slf4j-api-1.5.6

2.6.2 (development)
       - Bug Fixes:
         - TextExtractor did not include any attribute values.

2.6.1 (2008-12-27)
       - MAVEN release only to fix corrupt MANIFEST.MF file.

2.6   (2008-06-05)
       - Bug Fixes:
         - [1906051] Exponential recursion when non-server tags are present
           inside attribute values during full seq parse (introduced v2.5).
         - [1927391] Renderer had indenting problems.
         - [1991529] Wrong encoding with DISPLAY_VALUE and select Tags.
         - An element whose start tag and end tag have different names, such
           as a Mason component called with content, had no end tag.
         - SourceFormatter did not preserve original indentation inside server
           tags as specified in documentation.
         - A start tag containing a server tag immediately before its closing
           delimiter was not parsed correctly.
         - StartTag.tidy() removed server tags outside of attribute values.
         - Nested elements formed from non-normal tag types were not parsed
           correctly.
         - CharStreamSourceUtil.toString(charStreamSource) broke if
           charStreamSource.getEstimatedMaximumOutputLength()<-1
       - CHANGES THAT COULD AFFECT THE BEHAVIOUR OF EXISTING PROGRAMS:
         - Non-server tags are no longer recognised inside server tags.
           (see the TagType.isValidPosition documentation for details)
         - Elements inside <script> elements are now ignored up until the first
           occurrence of the character sequence "</script" (previously "</")
           during a full sequential parse.
         - Added static Config.ConvertNonBreakingSpaces property, which
           affects the default behaviour of several methods.
         - StartTag.isEmptyElementTag() now checks that the start tag is not
           one that has an optional or required end tag.
         - Element.isEmptyElementTag() is now implemented to be identical to
           StartTag.isEmptyElementTag().
       - Added StartTag.isSyntacticalEmptyElementTag() method.
       - Improved performance of internal stream writing methods.
       - Added StartTagType.SERVER_COMMON_ESCAPED standard tag type.
       - Added MicrosoftTagTypes.DOWNLEVEL_REVEALED_CONDITIONAL_COMMENT
         extended tag type.
       - Added Source(URLConnection) constructor.
       - Added Source.findNextStartTag(pos,name,startTagType) method.
       - Added Source.findPreviousStartTag(pos,name,startTagType) method.
       - Added SourceCompactor class and CompactSource sample program.
       - Added Segment.getNodeIterator() method.
       - Reduced risk of stack overflow when parsing large documents without
         full sequential parse by avoiding recursive comment search.
       - Added TextExtractor.includeAttribute(StartTag,Attribute) method.
       - TextExtractor now includes attribute contents in order of appearance
         in the source document.
       - TextExtractor now includes contents of href attributes if the
         IncludeAttributes property is set.
       - Added Renderer.IncludeHyperlinkURLs property.
       - Renderer no longer includes A element href if it is equal to "#"
         or starts with "javascript:".
       - Added Segment.getSource() method.
       - Added EndTagType.getEndTagName(String startTagName) method.
       - Added OutputDocument.writeTo(Writer, int begin, int end) method.
       - OutputDocument now ignores output segments enclosed by other
         output segments.
       - FormFields.getDataSet() Map entries are now ordered to match the
         order of appearance of the keys in the source document.
       - FormFields.getValues() now returns a List rather than a Collection.
       - FormField.getValues() now returns a List rather than a Collection.
       - Added WriterLogger.log(String level, String message) method.
       - Upgraded to the following logger APIs:
         slf4j-api-1.5.2, commons-logging-api-1.1.1, log4j-1.2.15

2.5   (2007-09-02)
       - Bug Fixes:
         - [1747493] RenderToText does not handle multiple <br> correctly.
         - RenderToText does not handle whitespace after <br> correctly.
         - Resetting to invalid mark exception during encoding detection.
         - INPUT elements of type "button" and "reset" incorrectly 
           interpreted as form controls of type FormControlType.TEXT.
         - Valid end tags containing white space rejected.
       - Elements inside <script> elements are now ignored, up until the first
         occurrence of the character sequence "</".
       - Improved encoding detection.
       - Added Source.getPreliminaryEncodingInfo() method.
       - Improved parsing of attributes containing server tags.
       - Changed Source.isXML() algorithm.
       - Added Renderer.ConvertNonBreakingSpaces property.
       - Added TextExtractor.ConvertNonBreakingSpaces property.
       - Added TextExtractor.ExcludeNonHTMLElements property.
       - Added extendible TextExtractor.excludeElement(StartTag) method.
       - TextExtractor now includes value of content attribute.
       - Deprecated OverlappingOutputSegmentsException class.
       - Added OutputDocument.getRegisteredOutputSegments() method.
       - Added OutputDocument.getDebugInfo() method.
       - Added fullSequentialParseData parameter to TagType.isValidPosition.
       - Removed all methods/classes deprecated in 2.2.

2.4   (2007-05-20)
       - Released under dual EPL/LGPL licence.
       - Bug Fixes:
         - [1583814] Indent method outputs multiple </script> tags
         - [1576991] Bug in ConvertStyleSheets sample program
         - [1597587] various NPEs in findFormFields()
         - [1599700] Segment.findAllStartTags(attributeName...) infinite loop
         - Overlapping elements resulted in some elements being listed as a
           child of more than one parent element.
         - OutputDocument.writeTo(Writer) closed the writer.
       - Server tags no longer interfere with parsing of start tag attributes.
       - Added Renderer class and Segment.getRenderer() method.
       - Added TextExtractor class and Segment.getTextExtractor() method.
       - Deprecated segment.extractText methods.
       - Added SourceFormatter class and Source.getSourceFormatter() method.
       - Deprecated Source.indent method.
       - Added Logger interface along with the related LoggerProvider
         interface and BasicLoggerProvider and WriterLogger classes.
       - Added Source.setLogger(Logger) and Source.getLogger() methods.
       - Deprecated Source.setLogWriter(Writer) and Source.getLogWriter()
         methods.
       - Added Source.findNextElement(int pos, String attributeName,
           String value, boolean valueCaseSensitive) method.
       - Added Segment.findAllElements(String attributeName, String value,
           boolean valueCaseSensitive) method.
       - Calling the ignoreWhenParsing methods on overlapping segments no
         longer results in an OverlappingOutputSegmentsException.
       - Added CharacterReference.getEncodingFilterWriter(Writer) method.
       - Added CharacterReference.encode(char) method.
       - Added Source.getNewLine() method.
       - Added static Config.NewLine parameter.
       - All text output now uses Config.NewLine instead of hard-coded '\n'.
       - Source.fullSequentialParse() method no longer parses the source again
         if it has already been called.
       - Some methods that require the parsing of the entire source now call
         Source.fullSequentialParse() automatically.
       - Some changes to the output of various getDebugInfo() methods.
       - Added categorised class list in javadoc.
       - Removed all methods/constants deprecated in 2.0.

2.3   (2006-09-11)
       - Bug Fixes:
         - [1510438] NullPointerException in Source.indent.
         - [1511480] Incorrect detection of non-html element with nested
           empty-element tag of same name.
         - [1547562] Fault in caching mechanism.
         - Source.fullSequentialParse() sometimes resulted in unregistered
           tags being returned in tag searches.
         - Invalid Empty-element tags whose name is in either of the sets
           HTMLElements.getEndTagOptionalElementNames() or
           HTMLElements.getEndTagRequiredElementNames() were rejected by the
           parser if the slash immediately follows the tag name.
         - StartTag.tidy() only included a slash before the closing delimiter
           of the tag if the tag name was in the set of
           HTMLElements.getEndTagForbiddenElementNames().  It now includes the
           slash for all tag names not in getEndTagOptionalElementNames().
       - Source.fullSequentialParse() now clears the cache automatically
         instead of throwing an IllegalStateException if the cache is not
         empty.
       - Changes to behaviour of Source.indent:
         - preserves indenting in SCRIPT elements, server elements,
           HTML comments and CDATA sections.
         - keeps SCRIPT elements, HTML comments, XML declarations,
           XML processing instructions and markup declarations inline.
       - Minor documentation improvements.

2.2   (2006-06-20)
       - Bug Fixes:
         - Fault in caching mechanism resulted in missed tags in rare
           circumstances. (SubCache.findNextTag method)
         - [1407179] Segment.extractText() threw NullPointerException if
           the last character position was part of a tag.
       - Segment.extractText() now converts some tags to whitespace and
         ignores text inside SCRIPT and STYLE elements.
       - Added Segment.extractText(boolean includeAttributes) option.
       - Added Source.fullSequentialParse() method.
       - Added CharStreamSource interface for dealing with char output.
       - Added Source.indent(String indentText, boolean tidyTags,
          boolean collapseWhiteSpace, boolean indentAllElements) method.
       - Added Segment.getChildElements() method.
       - Added Element.getParentElement() method.
       - Added Element.getDepth() method.
       - Named tag search methods now only return unregistered tags if the
         specified name is not a valid XML tag name.
       - Changed Attributes.DefaultMaxErrorCount system default from 1 to 2.
       - Added EndTag.getElement() method.
       - Added Tag.getElement() abstract method.
       - Added Tag.getNameSegment() method.
       - Added Tag.getUserData() and Tag.setUserData(Object) methods.
       - Added Tag.findNextTag() method.
       - Added Tag.findPreviousTag() method.
       - Added Tag.tidy() and Tag.tidy(boolean toXHTML) methods.
       - Added and renamed many methods in OutputDocument class to make the
         interface more intuitive.
       - Added HTMLElements.getNestingForbiddenElementNames() method.
       - Illegally nested elements with required end tags now terminate at
         start of illegally nested start tag, avoiding possible stack overflow
         in the common case of multiple unterminated <a name=...> elements.
       - Tag search methods called with a pos argument that is out of range
         now return null or empty results rather than throwing an exception.
       - Renamed output(Writer) method in OutputSegment to writeTo(Writer).
       - Deprecated Tag.regenerateHTML() method.
       - Deprecated Source.getNextTagIterator() method.
       - Deprecated AttributesOutputSegment class.
       - Deprecated StringOutputSegment class.
       - Removed BlankOutputSegment class from public API.
       - Removed CharOutputSegment class from public API.
       - Removed IOutputSegment which was deprecated in 2.0.

2.1   (2005-12-24)
       - Added Source(InputStream) constructor.
       - Added Source(Reader) constructor.
       - Added Source(URL) constructor.
       - Added Source.getEncoding() method.
       - Added Source.getEncodingSpecificationInfo() method.
       - Added Source.isXML() method.
       - Added Source.findNextElement(pos) method.
       - Added Source.findNextElement(pos,name) method.
       - Added Segment.extractText() method.
       - Added StartTag.getAttributeValue(attributeName) method.
       - Added Element.getAttributeValue(attributeName) method.
       - Added ExtractText and SourceEncoding sample programs.

2.0   (2005-11-10)
       - Complete rewrite of the parsing engine to allow the encapsulation of
         different tag types into the new TagType class.
       - Requires Java 1.4 or later.
       - All programs written for previous versions of the library will have
         to be recompiled with the new version, regardless of whether any
         changes are required.  This is because several methods, including the
         Source constructor, now expect a CharSequence as an argument instead
         of a String.
       - Changes that could require modifications to existing programs:
         - The toString() method of Segment and all subclasses now returns the
           source text of the segment instead of a string useful for debugging
           purposes.  This change was necessary because Segment now
           implements CharSequence.
         - For consistency, the toString() methods of all IOutputSegment
           implementations now return the output string instead of a string
           useful for debugging purposes.
         - The return type of the OutputDocument.getSourceText() method is now
           CharSequence instead of String.
         - Character references in Attribute.getValue() are now decoded
         - StartTag.isEmptyElementTag() no longer checks whether the end tag
           is required.
         - Element.getContent() now returns zero-length segment instead of null
           in case of an empty element.
         - FormField.getPredefinedValues() now returns an empty collection
           instead of null if the form field has no predefined values.
         - Segment.findAllStartTags() now returns server tags that are found
           inside other tags.
         - Attributes segment now ends immediately after the last attribute
           instead of immediatley before the end-of-tag delimiter.
         - Modified Segment.isWhiteSpace(char) to match HTML specification
         - CharacterReference.encode(CharSequence) no longer encodes
           apostrophes by default
         - Tags of type SERVER_COMMON now always have the name "%" regardless
           of whether an identifier immediately follows it.
         - Modified and enhanced aspects of StartTag searches relating to
           special tags
         - P elements are now terminated by TABLE elements.
           See the HTMLElementName.P documentation for more information.
       - removed public fields in Attribute class that were deprecated in 1.2
       - removed Source.getSourceTextLowerCase() method deprecated in 1.3
       - removed Source.findEnd(int pos, SpecialTag) method which was
         accidentally added as a public method in 1.4
       - Deprecated numerous methods (details in javadoc)
       - Deprecated IOutputSegment interface and replaced with OutputSegment
       - Improved caching system
       - Added recognition of markup declarations
       - Added recognition of CDATA sections
       - Added recognition of SGML marked sections
       - Doctype declarations containing markup declarations now supported
       - Segment class now implements CharSequence and Comparable
       - Added getDebugInfo() to Segment and all subclasses to replace the
         previous functionality of the toString() method
       - OutputSegment interface now implements CharSequence
       - Added getDebugInfo() to the OutputSegment interface to replace the
         previous functionality of the toString() method
       - Attributes class now implements List
       - FormFields class now implements Collection
       - Added HTMLElementName interface and HTMLElements class
       - Added RowColumnVector class and associated methods in Source class
       - Added FormControl class
       - Added various methods to the FormField, FormFields and OutputDocument
         classes related to FormControl objects and the manipulation and output
         of form submission values.
       - Added Config and related classes
       - Added TagType class and subclasses
       - Added various tag search methods to the Source and Segment classes
         including searches by TagType, attribute values, and other criteria.
       - Added AttributesOutputSegment class
       - Added Util class
       - Added OverlappingOutputSegmentsException class
       - Added many other methods to existing classes
       - Documentation improvements

1.4.1 (2005-11-10)
       - Bug Fixes:
         - [1065861] Named StartTag search did not find a tag immediately
           following a comment
         - Unnamed StartTag search did not find a comment if the search starts
           at the first character of the comment
         - Character references in FormField.getPredefinedValues() items were
           not decoded
         - FormControlType.SELECT_SINGLE.allowsMultipleValues() returned false
           instead of the correct value of true, resulting in the same
           incorrect value from FormField.allowMultipleValues() when multiple
           SELECT_SINGLE controls with the same name were present in the form

1.4   (2004-09-02)
       - Added CharacterEntityReference and NumbericCharacterReference classes
       - Added CharOutputSegment class
       - Attributes allow whitespace around '=' sign
       - Added convenience method Element.getAttributes()
       - Some documentation improvements

1.3   (2004-07-25)
       - Deprecated Source.getSourceTextLowerCase()
       - Added ignoreWhenParsing methods to Source and Segment classes
         (See sample called JSPTest)
       - Added parseAttributes methods to Source, Segment and StartTag classes
       - Added ability to search for tags in a specified namespace
       - Added BlankOutputSegment class
       - Fixed bug relating to HTML comments with alphabetic characters
         immediately following the opening <!-- characters

1.2   (2004-06-16)
       - Deprecated public fields in Attribute class in favour of accessor
         methods
       - Following methods return empty list instead of null if no result:
         (WARNING - This could possibly break existing programs)
          Segment.findAllStartTags(String name)
          Segment.findAllComments()
          Segment.findAllElements(String name)
          Segment.findAllElements()
       - Added hashCode() method to Segment class
       - Server tags such as ASP, JSP, PSP, PHP and Mason are now recognised
       - Basic parser logging introduced (see Source.setLogWriter() method)
       - Start tags with too many badly formed attributes rejected
         (reduces number of false positives when searching for start tags)
       - Added public IOutputSegment.COMPARATOR field
       - Improved caching

1.1   (2004-03-07)
       - All elements defined in HTML 4.01 are recognised and their properties
         used to aid analysis
       - StartTag.getElement() method enhanced to return the correct span of
         elements which have a missing optional end tag
       - StartTag.isEndTagForbidden() method enhanced to also check the name of
         the tag against the list of elements in the HTML spec whose end tags
         are forbidden
       - Numerous new methods
       - Huge performance enhancement from the use of internal caching
       - Bug Fixes:
         [909944] Parser does not work with unclosed comments.

1.0   (2004-02-07) Initial Release

