Friday, January 27, 2012

XHTML Syntax Rules

The syntax rules require that you:

  • Close all elements
  • Terminate empty elements
  • Quote all attribute values
  • Give values to all attributes
  • Define all element and attribute names in lower case (because XHTML is case-sensitive)
  • Nest elements correctly

Close all Elements

Unlike in HTML, every non-empty element in XHTML must have both an opening and a closing tag.
In HTML, this is allowed:
<p>This is a paragraph.
In XHTML, the <p> tag must be closed:
<p>This is a paragraph.</p>

Terminate Empty Elements
Some tags are termed "empty" because they have their functionality self-contained within the tag (such as a
line break <br> or an image <img> tag) and do not have separate closing tags.

They do, however, need to be closed. To make these tags well-formed, add a slash (/) before the final angle
bracket (>).
For example,
<img src="myimg.jpg"/>

Quote all Attribute Values
Delimit all attribute values with quotation marks. XML accepts single or double quotes; double quotes are the usual convention:
<table width="100%">

Supply Values for all Attributes
All attributes must have explicit values. Attribute minimization is forbidden. For example, the following HTML
code has a minimized attribute "checked":
<input type=checkbox checked>
Correct syntax requires that a Boolean attribute whose value is implicit in HTML should, in XHTML, be set
equal to itself. Thus, the preceding example should be written:
<input type="checkbox" checked="checked"/>

Define all Element and Attribute Names in Lowercase
XML is case-sensitive, and since the XHTML DTDs define elements and attributes in lowercase, content must
use lowercase names as well.
This is illegal in XHTML:
<H1>My Big Title</H1>
<Table width="90%">
Correct XHTML form:
<h1>My Big Title</h1>
<table width="90%">
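
A mixed-case document can still be well-formed XML, so a parser alone will not flag it. The following is a minimal Python sketch, not part of the original guidelines, that lists element and attribute names containing uppercase letters. The function name uppercase_names is hypothetical, and the sketch assumes the fragment is already well-formed:

```python
import xml.etree.ElementTree as ET

def uppercase_names(fragment):
    """Return element and attribute names that are not entirely lowercase.

    Hypothetical helper for illustration; assumes the fragment is well-formed.
    """
    offenders = []
    for element in ET.fromstring(fragment).iter():
        if element.tag != element.tag.lower():
            offenders.append(element.tag)
        for name in element.attrib:
            if name != name.lower():
                offenders.append(name)
    return offenders

bad = '<div><H1>My Big Title</H1><Table width="90%"></Table></div>'
good = '<div><h1>My Big Title</h1><table width="90%"></table></div>'
print(uppercase_names(bad))   # ['H1', 'Table']
print(uppercase_names(good))  # []
```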

Nest Elements Correctly
The following code shows two elements incorrectly nested:
<p>This is bold <b>text</p></b>
The correct nested format is:
<p>This is bold <b>text</b></p>
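
Most of the syntax rules above are XML well-formedness rules, so any XML parser can check them. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample fragments are illustrative assumptions, not part of any XHTML tool:

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# Each pair is (rule being illustrated, fragment to test).
samples = [
    ("unclosed element",      "<div><p>This is a paragraph.</div>"),
    ("closed element",        "<div><p>This is a paragraph.</p></div>"),
    ("unterminated empty",    "<p>line one<br>line two</p>"),
    ("terminated empty",      "<p>line one<br/>line two</p>"),
    ("unquoted value",        "<table width=100%></table>"),
    ("minimized attribute",   '<input type="checkbox" checked/>'),
    ("explicit value",        '<input type="checkbox" checked="checked"/>'),
    ("incorrect nesting",     "<p>This is bold <b>text</p></b>"),
    ("correct nesting",       "<p>This is bold <b>text</b></p>"),
]

for rule, fragment in samples:
    print(f"{rule:22} {is_well_formed(fragment)}")
```

A well-formedness check of this kind does not validate against the XHTML DTD; it only confirms that the markup obeys the XML rules described above.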

Encode Non-US-ASCII Characters Using URL Encoding
Non-US-ASCII characters are not valid in hrefs or any other URL attribute values (RFC 1738). This means
the author must encode such characters using URL encoding. URL encoding of a character consists of a "%"
symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point
for the character.

Note: Different web servers expect different encodings. For example, Tomcat on Windows NT expects hêllò to be encoded (in ISO-Latin, an extended ASCII encoding) as "h%EAll%F2". However, the HTML specification recommends using UTF-8, resulting in "h%C3%AAll%C3%B2", which Tomcat does not understand. The author must therefore URL-encode non-US-ASCII characters in the encoding that the application server understands. MIS passes the encoded URLs through unchanged.
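
The difference between the two encodings can be reproduced with Python's standard urllib.parse.quote function. This is only a sketch of the encoding step; which encoding a given server expects still has to be checked against that server:

```python
from urllib.parse import quote

# "h\u00eall\u00f2" is the string hêllò, written with escapes so the
# example does not depend on the encoding of the source file.
text = "h\u00eall\u00f2"

latin1 = quote(text, encoding="latin-1")  # one byte per accented character
utf8 = quote(text, encoding="utf-8")      # two bytes per accented character

print(latin1)  # h%EAll%F2
print(utf8)    # h%C3%AAll%C3%B2
```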

Document Rules
XHTML is HTML defined as an XML application. The XML document rules require that all documents have
one root element and conform to the XML specification. Any XHTML documents that you work with must also
follow this convention.

A Root Element is Required
The root element contains all the other elements on a page. In XHTML, the root element is the <html> element.

XML Declaration Required
The XML declaration identifies the document as XML and states which version of the XML specification it
follows. The declaration can carry three attributes: version (mandatory), encoding, and standalone (both optional).
The shortened syntax is as follows:
<?xml version="1.0"?>

Note: There is no space separating the "?" from the angle brackets.
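
Both document rules can be checked with a few lines of Python. The sketch below is an illustration only: the function name check_document and the sample document are assumptions, and the declaration test is a simple string check rather than a full parse of the declaration:

```python
import xml.etree.ElementTree as ET

def check_document(text):
    """Report whether the text starts with an XML declaration and name its root element.

    Hypothetical helper; raises xml.etree.ElementTree.ParseError if the
    document is not well-formed.
    """
    # The declaration, when present, must be the very first thing in the document.
    has_declaration = text.startswith("<?xml ")
    root = ET.fromstring(text)
    # ElementTree reports namespaced tags as "{namespace-uri}name"; keep the name only.
    local_name = root.tag.rsplit("}", 1)[-1]
    return {"declaration": has_declaration, "root": local_name}

doc = (
    '<?xml version="1.0"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>This is a paragraph.</p></body>"
    "</html>"
)

print(check_document(doc))  # {'declaration': True, 'root': 'html'}
```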

Automate HTML to XHTML Mark-up:
