One change that I think most people are overlooking is the change to HTML comments in HTML 5. If you read the working draft of the HTML 5 spec, you'll notice that previously valid comment markup may no longer be valid in HTML 5.

Here is the definition from HTML 5:

Comments must start with the four character sequence U+003C LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS (<!--). Following this sequence, the comment may have text, with the additional restriction that the text must not start with a single U+003E GREATER-THAN SIGN ('>') character, nor start with a U+002D HYPHEN-MINUS (-) character followed by a U+003E GREATER-THAN SIGN ('>') character, nor contain two consecutive U+002D HYPHEN-MINUS (-) characters, nor end with a U+002D HYPHEN-MINUS (-) character. Finally, the comment must be ended by the three character sequence U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN (-->).
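To make those restrictions a bit more concrete, here is a rough Python sketch that checks a complete comment string against the rules quoted above. The function name `is_valid_html5_comment` is my own invention, and this only mirrors the wording of the draft as quoted here; it is not how any browser or validator actually implements it.

```python
def is_valid_html5_comment(markup: str) -> bool:
    """Check a whole comment (delimiters included) against the quoted rules."""
    # Must start with "<!--" and end with "-->".
    if not (markup.startswith("<!--") and markup.endswith("-->")):
        return False
    # The two delimiters may not overlap, as they would in "<!-->" or "<!--->".
    if len(markup) < 7:
        return False

    text = markup[4:-3]  # everything between "<!--" and "-->"

    # The text must not start with ">" or with "->".
    if text.startswith(">") or text.startswith("->"):
        return False
    # The text must not contain two consecutive hyphens.
    if "--" in text:
        return False
    # The text must not end with a hyphen.
    if text.endswith("-"):
        return False
    return True


print(is_valid_html5_comment("<!-- a plain comment -->"))  # True
print(is_valid_html5_comment("<!-- bad comment -- -->"))   # False
```

The second call fails only because of the "two consecutive hyphens" rule, which is the change this post is about.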

Compared to HTML 4:

White space is not permitted between the markup declaration open delimiter("<!") and the comment open delimiter ("--"), but is permitted between the comment close delimiter ("--") and the markup declaration close delimiter (">"). A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments.

Information that appears between comments has no special meaning (e.g., character references are not interpreted as such).

Note that comments are markup.

The main difference is that having dash-dash (--) within a comment is no longer acceptable. In HTML 4, you can nest any number of opening and closing delimiters (--), which can cause unwanted behaviour for web authors who do not know the comment definition that well.

For example, take a look at this chunk of HTML:

<!-- bad comment -- -->
<p>Hello, World</p>
<!--p>Hide me!</p-->

You'd probably assume the output to be Hello, World (with "Hide me!" being commented out). However, because the first comment contains two consecutive dashes within its text, the --> no longer closes the entire comment. Rather, the parser treats the markup that follows as part of the comment text, and once it reaches the two dashes in <!--p>, those dashes finally close the initial comment block.
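If that is hard to follow, here is a simplified Python model of the HTML 4 (SGML-style) reading, run against the chunk of HTML above. The function `sgml_declaration_end` is a made-up name, and the model is only a sketch: it assumes every -- after <! toggles between "inside a comment" and "outside a comment", and that only a > seen outside a comment ends the declaration. It ignores the rule that only white space may appear outside a comment, so take it as an illustration rather than a real parser.

```python
def sgml_declaration_end(html: str, start: int = 0) -> int:
    """Return the index just past the '>' that ends the declaration at `start`.

    Simplified model: each '--' toggles comment state, and only a '>'
    found outside a comment closes the declaration. Returns -1 if the
    declaration is never closed.
    """
    if not html.startswith("<!--", start):
        raise ValueError("no comment declaration at this position")
    i = start + 2  # skip the markup declaration open delimiter '<!'
    in_comment = False
    while i < len(html):
        if html.startswith("--", i):
            in_comment = not in_comment  # comment open or close delimiter
            i += 2
        elif html[i] == ">" and not in_comment:
            return i + 1  # markup declaration close delimiter
        else:
            i += 1
    return -1


sample = (
    "<!-- bad comment -- -->\n"
    "<p>Hello, World</p>\n"
    "<!--p>Hide me!</p-->"
)
end = sgml_declaration_end(sample)
print(repr(sample[end:]))  # 'Hide me!</p-->'
```

In this model the first paragraph ends up inside the comment, and what is left over after the declaration closes is the "Hide me!" text that was supposed to be hidden.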

In HTML 5, the first comment would be invalid markup. I don't think any browser has implemented this yet, and I'm not sure how invalid comment markup would be handled.

Note: I only saw this behaviour in Firefox, not in IE or Safari. It's actually a bug in those two browsers that they do not parse the comment properly.

 


