Revision as of 15:23, 2 October 2005
eXtensible HyperText Markup Language, or XHTML, is a markup language that has the same expressive possibilities as HTML, but a stricter syntax. Whereas HTML is an application of SGML, a very flexible markup language, XHTML is an application of XML, a more restrictive subset of SGML. Because they need to be well-formed (syntactically correct), XHTML documents allow for automated processing to be performed using a standard XML library — unlike HTML, which requires a relatively complex, lenient, and generally custom parser (though an SGML parser library could possibly be used). XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000.
Overview
XHTML is the successor to HTML. As such, many consider XHTML to be the "current version" of HTML, but it is a separate, parallel standard; the W3C continues to recommend the use of either XHTML 1.1, XHTML 1.0, or HTML 4.01 for web publishing.
The need for a stricter version of HTML arose primarily because World Wide Web content now needs to be delivered to many devices other than traditional computers (such as mobile devices), which cannot devote extra resources to supporting the additional complexity of HTML syntax.
Most recent versions of popular web browsers render XHTML properly, and many older browsers also render it, because XHTML is mostly compatible with HTML and most browsers do not require valid HTML. Similarly, almost all web browsers that are compatible with XHTML also render HTML properly. Some argue that this compatibility is slowing the switch from HTML to XHTML.
An especially useful feature of XHTML is that different XML namespaces (such as MathML and Scalable Vector Graphics) can be incorporated within it.
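As a rough illustration of mixing namespaces, the sketch below embeds a small MathML fragment in an XHTML document and reads it back with Python's standard `xml.etree.ElementTree` module. The sample document and the variable names are made up for this example; only the two namespace URIs are the ones published by the W3C.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample document: an XHTML page with an inline MathML expression.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>The sum is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mi>a</mi><mo>+</mo><mi>b</mi>
      </math>
    </p>
  </body>
</html>"""

root = ET.fromstring(doc)

# ElementTree spells qualified names as "{namespace-uri}local-name".
identifiers = [el.text for el in root.iter(f"{{{MATHML_NS}}}mi")]
paragraphs = root.findall(f".//{{{XHTML_NS}}}p")

print(identifiers)
print(len(paragraphs))
```

Because each element carries its namespace, a program can tell XHTML paragraphs and MathML identifiers apart even though they sit in the same tree.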
The changes from HTML to XHTML are minor, and are mainly to achieve conformance with XML. The most important change is the requirement that the document must be well formed and all elements must be closed. Additionally, in XHTML, all element and attribute names must be written in lowercase. This is in direct contrast to established traditions which began around the time of HTML 2.0, when most people preferred uppercase tags. In XHTML, all attribute values must be enclosed by quotes. (This is optional in SGML, and hence in HTML, where quotes may be omitted in some circumstances.) All elements must also be explicitly closed, including empty elements such as img and br. This can be done by adding a closing slash to the start tag: <img … /> and <br />. Attribute minimization (e.g., <option selected>) is also prohibited; instead, use <option selected="selected">. More differences are detailed in the W3C XHTML specification.
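The syntax rules above can be checked mechanically with any XML parser. The following is a minimal sketch that uses Python's `xml.etree.ElementTree` as a stand-in for "a standard XML library"; the helper name `is_well_formed` and the sample fragments are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# HTML-style syntax next to its XHTML-style equivalent.
html_style = "<p>Line one<br>Line two</p>"            # unclosed empty element
xhtml_style = "<p>Line one<br />Line two</p>"         # closed with a trailing slash
minimized = "<select><option selected>A</option></select>"
expanded = '<select><option selected="selected">A</option></select>'

for sample in (html_style, xhtml_style, minimized, expanded):
    print(is_well_formed(sample), sample)
```

Note that this checks well-formedness only. An XML parser accepts uppercase names, for instance, so the lowercase rule and the other constraints of the XHTML DTDs need a separate validation step.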
Versions of XHTML
XHTML 1.0
The original XHTML W3C Recommendation, XHTML 1.0, was simply a reformulation of HTML 4.01 in XML. There are three different 'flavors' of XHTML 1.0, each equal in scope to their respective HTML 4.01 versions.
- XHTML 1.0 Strict is the same as HTML 4.01 Strict, but follows XML syntax rules.
- XHTML 1.0 Transitional allows some common deprecated elements and attributes not found in XHTML 1.0 Strict to be used, such as <center>, <u>, <strike>, and <applet>.
- XHTML 1.0 Frameset allows the use of HTML framesets.
XHTML 1.1
The most recent XHTML W3C Recommendation is XHTML 1.1: Module-based XHTML. Authors can import additional features (such as framesets) into their markup. This version also allows for ruby markup support, needed for East-Asian languages (especially CJK).
The XHTML 2.0 draft specification
Work on XHTML 2.0 is, as of 2005, still underway; in fact, the DTD has not even been authored yet. The XHTML 2.0 draft is controversial because it breaks backwards compatibility with all previous versions, and is therefore in effect a new markup language created to circumvent (X)HTML's limitations rather than being simply a new version.
New features brought into the HTML family of markup languages by XHTML 2.0:
- HTML forms will be replaced by XForms.
- HTML frames will be replaced by XFrames.
- The DOM Events will be replaced by XML Events, which uses the XML Document Object Model.
- A new list element type, the <nl> element type, will be included in order to specifically designate a list as a navigation list. This will be useful in creating nested menus which are currently created by a wide variety of means.
- Any element will be able to act as a hyperlink, e.g., <li href="articles.html">Articles</li>.
- Any element will be able to reference alternative media with the src attribute, e.g., <p src="lbridge.jpg" type="image/jpeg">London Bridge</p> will replace <img src="lbridge.jpg" alt="London Bridge" />.
- The <img src="" alt="" /> element has been removed in favor of <object type="MIME/ContentType" src="">Alt</object>.
- The heading elements (i.e. <h1>, <h2>, <h3>, etc.) will be deprecated in favour of the single element <h>. Levels of headings will instead be indicated by the nested <section> elements each with their own <h> heading.
- The presentational elements <i>, <b> and <tt>, still allowed in XHTML 1.x (even Strict), will be absent from XHTML 2.0. The only presentational elements remaining will be <sup> and <sub> for superscript and subscript respectively.
Others in the XHTML family
- XHTML Basic: A special "light" version of XHTML for devices which cannot use the full XHTML set, primarily used on handhelds such as mobile phones. This is the intended replacement for WML and C-HTML.
- XHTML Mobile Profile: Based on XHTML Basic, this OMA (Open Mobile Alliance) effort targets mobile phones specifically by adding phone-specific elements to XHTML Basic.
Validating XHTML documents
An XHTML document that conforms to the XHTML specification is said to be a valid document. In a perfect world, all browsers would follow the web standards and valid documents would predictably render on every browser and platform. Although validating your XHTML does not ensure cross-browser compatibility, it is recommended. A document can be checked for validity with the W3C Markup Validation Service.
DOCTYPEs
For a document to validate, it must contain a Document Type Declaration, or DOCTYPE. A DOCTYPE declares to the browser what Document Type Definition (DTD) the document conforms to. A Document Type Declaration should be placed at the very beginning of an XHTML document. These are the most common XHTML Document Type Declarations:
- XHTML 1.0 Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML 1.0 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- XHTML 1.0 Frameset
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
- XHTML 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
The system identifier, which in these examples is the URL that begins with "http", need only point to a copy of the DTD to use if the validator cannot locate one based on the public identifier (the other quoted string). It does not need to be the specific URL that is in these examples; in fact, authors are encouraged to use local copies of the DTD files when possible. The public identifier, however, must be character-for-character the same as in the examples.
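The distinction between the two identifiers can be sketched in a few lines of Python. This is an illustrative assumption about how a tool might recognise an XHTML flavour, not how any particular validator works; `xhtml_flavor`, `KNOWN_PUBLIC_IDS` and the local DTD path are hypothetical names. The lookup uses only the public identifier and ignores the system identifier entirely.

```python
import re
from typing import Optional

# Public identifiers copied from the Document Type Declarations listed above.
KNOWN_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
    "-//W3C//DTD XHTML 1.1//EN": "XHTML 1.1",
}

# Captures the public identifier (group 1) and the system identifier (group 2).
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>')

def xhtml_flavor(document: str) -> Optional[str]:
    """Return the XHTML flavour named by the DOCTYPE, or None if unrecognised."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None
    # Exact, character-for-character comparison of the public identifier.
    return KNOWN_PUBLIC_IDS.get(match.group(1))

# A local copy of the DTD (hypothetical path) is fine as the system identifier.
local_copy = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
              ' "dtd/xhtml1-strict.dtd">')
print(xhtml_flavor(local_copy))
```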
Character encoding may be specified at the beginning of an XHTML document in the XML declaration and within a meta http-equiv element. (If an XML document lacks encoding specification, an XML parser assumes that the encoding is UTF-8 or UTF-16, unless the encoding has already been determined by a higher protocol.)
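As a small sketch of where the encoding lives in the XML declaration, the hypothetical helper below pulls the declared encoding name out of the first bytes of a document. It assumes an ASCII-compatible encoding (it would not recognise a UTF-16 document), and it does not look at the meta http-equiv element or at any higher protocol.

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the document and
# captures the value of its encoding pseudo-attribute.
XML_DECL_ENCODING = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)

def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if absent."""
    match = XML_DECL_ENCODING.match(data)
    return match.group(1).decode("ascii") if match else None

print(declared_encoding(b'<?xml version="1.0" encoding="UTF-8"?><html/>'))
print(declared_encoding(b"<html/>"))
```

When the helper returns None, the rule quoted above applies: an XML parser falls back to UTF-8 or UTF-16 unless a higher protocol has already settled the question.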
Common errors
Some of the most common errors in XHTML are:
- Not closing empty elements (elements without closing tags)
  - Incorrect: <br>
  - Correct: <br />
- Not closing non-empty elements
  - Incorrect: <p>This is a paragraph.<p>This is another paragraph.
  - Correct: <p>This is a paragraph.</p><p>This is another paragraph.</p>
- Improperly nesting elements (elements must be closed in reverse order)
  - Incorrect: <em><strong>This is some text.</em></strong>
  - Correct: <em><strong>This is some text.</strong></em>
- Not specifying alternate text for images (using the alt attribute, which helps make pages accessible for devices that don't load images or screen-readers for the blind)
  - Incorrect: <img src="/skins/common/images/poweredby_mediawiki_88x31.png" />
  - Correct: <img src="/skins/common/images/poweredby_mediawiki_88x31.png" alt="MediaWiki" />
- Putting text directly in the body of the document (this is not an error in XHTML 1.0 Transitional)
  - Incorrect: <body>Welcome to my page.</body>
  - Correct: <body><p>Welcome to my page.</p></body>
- Nesting block-level elements within inline elements
  - Incorrect: <em><h2>Introduction</h2></em>
  - Correct: <h2><em>Introduction</em></h2>
- Not putting quotation marks around attribute values
  - Incorrect: <td rowspan=3>
  - Correct: <td rowspan="3">
- Using the ampersand outside of entities (use &amp; to display the ampersand character)
  - Incorrect: <title>Cars & Trucks</title>
  - Correct: <title>Cars &amp; Trucks</title>
- Using uppercase tag names and/or tag attributes
  - Incorrect: <BODY><P>The Best Page Ever</P></BODY>
  - Correct: <body><p>The Best Page Ever</p></body>
- Attribute minimization
  - Incorrect: <textarea readonly>READ-ONLY</textarea>
  - Correct: <textarea readonly="readonly">READ-ONLY</textarea>
This is not an exhaustive list, but gives a general sense of errors that XHTML coders often make.
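Some of these errors stop an XML parser outright, while others (uppercase names, a missing alt attribute) are well-formed XML and only fail validation against the DTD. The sketch below is a hypothetical mini-linter that shows the difference; `lint_fragment` and its messages are invented for illustration, it covers only three of the errors listed, and it is no substitute for the W3C Markup Validation Service.

```python
import xml.etree.ElementTree as ET
from typing import List

def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix, if present."""
    return tag.rsplit("}", 1)[-1]

def lint_fragment(fragment: str) -> List[str]:
    """Report a few common XHTML mistakes found in a markup fragment."""
    try:
        root = ET.fromstring(fragment)
    except ET.ParseError as err:
        # Unclosed elements, bad nesting, unquoted or minimized attributes
        # and bare ampersands all end up here.
        return [f"parse error: {err}"]

    problems = []
    for el in root.iter():
        name = local_name(el.tag)
        if name != name.lower():
            problems.append(f"uppercase element name: {name}")
        for attr in el.attrib:
            if local_name(attr) != local_name(attr).lower():
                problems.append(f"uppercase attribute name: {local_name(attr)}")
        if name.lower() == "img" and "alt" not in el.attrib:
            problems.append(f"img without alt: {el.get('src', '')}")
    return problems

print(lint_fragment("<title>Cars & Trucks</title>"))
print(lint_fragment("<BODY><P>The Best Page Ever</P></BODY>"))
print(lint_fragment('<p><img src="a.png" /></p>'))
```

Errors that depend on the content model, such as text directly inside body or a block-level element inside an inline one, need a validating parser and the DTD, which this sketch does not attempt.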
Backward compatibility
XHTML 1.0 is backward compatible with HTML when served as text/html. However, there are associated problems, especially for Internet Explorer. For more information, refer to the XHTML section in the criticisms of Internet Explorer article. Furthermore, XHTML 2.0 has been criticised for not being backward compatible with XHTML 1.x.
Example
The following is an example of XHTML 1.0 Strict.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>XHTML Example</title>
  </head>
  <body>
    <p>This is a tiny example of an XHTML document.</p>
  </body>
</html>
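Because the example is well-formed XML, a general-purpose XML library can read it without any HTML-specific parser. The sketch below feeds the same document to Python's `xml.etree.ElementTree` and extracts the title and paragraph text; the variable names are arbitrary, and the parser used here does not fetch or apply the DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

document = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>XHTML Example</title>
  </head>
  <body>
    <p>This is a tiny example of an XHTML document.</p>
  </body>
</html>"""

root = ET.fromstring(document)

# Every element is in the XHTML namespace declared on the root element.
title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title").text
paragraphs = [p.text for p in root.iter(f"{{{XHTML_NS}}}p")]

print(title)
print(paragraphs)
```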
See also
- Dynamic HTML
- List of document markup languages
- Comparison of document markup languages
- Comparison of layout engines (XHTML)
- microformats - formats built with XHTML
- XOXO - use of XHTML Modularization to define an outline format.
External links
- W3C's Markup Home Page
- XHTML 1.0 Specification
- XHTML 1.1 Specification
- Sending XHTML as text/html Considered Harmful
- Rated XHTML - pros and cons of XHTML
- A simple and free XHTML editor