XML catalog

Article snapshot taken from Wikipedia with Creative Commons Attribution-ShareAlike license.

XML documents typically refer to external entities, for example through the public and/or system identifier of the Document Type Definition (DTD). These external relationships are expressed using URIs, typically URLs.

However, absolute URLs only work when the network can reach them. Relying on remote resources makes XML processing susceptible to both planned and unplanned network downtime.

Relative URLs are only useful in the context in which they were created. For example, the URL "../../xml/dtd/docbookx.xml" resolves correctly only from a location with the same directory layout as the one where it was written.

One way to avoid these problems is to use an entity resolver (a standard part of SAX) or a URI resolver (a standard part of JAXP). A resolver can examine the URIs of the resources being requested and determine how best to satisfy those requests. An XML catalog is a document that describes a mapping between external entity references and locally cached equivalents, which such a resolver can consult.
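
The resolver idea can be illustrated without any catalog at all. The following is a minimal sketch, assuming Java 11 or later; the class name LocalDtdResolver and the hard-coded map of public IDs to local paths are hypothetical and serve only to show what a SAX EntityResolver is asked and what it may return.

```java
import java.nio.file.Path;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver that maps known public IDs to local copies. */
final class LocalDtdResolver implements EntityResolver {
    private final Map<String, Path> localCopies;

    LocalDtdResolver(Map<String, Path> localCopies) {
        this.localCopies = Map.copyOf(localCopies);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // The parser passes the identifiers found in the document.
        Path local = (publicId == null) ? null : localCopies.get(publicId);
        if (local == null) {
            // Unknown identifier: returning null lets the parser open systemId itself.
            return null;
        }
        // Known identifier: point the parser at the local copy instead.
        InputSource source = new InputSource(local.toUri().toString());
        source.setPublicId(publicId);
        return source;
    }
}
```

A catalog moves a table of this kind out of the program and into a separate document, so that the mapping can be changed without recompiling.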

Example Catalog.xml

The following simple catalog shows how one might provide locally cached DTDs for an XHTML page validation tool.

 
  <?xml version="1.0"?>
  <!DOCTYPE catalog
    PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
           "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
  <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
           prefer="public">
    <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
            uri="dtd/xhtml1/xhtml1-strict.dtd"/>
    <public publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
            uri="dtd/xhtml1/xhtml1-transitional.dtd"/>
    <public publicId="-//W3C//DTD XHTML 1.1//EN"
            uri="dtd/xhtml11/xhtml11-flat.dtd"/>
  </catalog>

This catalog makes it possible to resolve -//W3C//DTD XHTML 1.0 Strict//EN to the local URI dtd/xhtml1/xhtml1-strict.dtd. Similarly, it provides local URIs for two other public IDs.

Note that the document above includes a DOCTYPE – this may cause the parser to attempt to access the system ID URL for the DOCTYPE (i.e. http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd) before the catalog resolver is fully functioning, which is probably undesirable. To prevent this, simply remove the DOCTYPE declaration.

The following example omits the DOCTYPE declaration and uses <system/> entries, which match on the system ID, as an alternative to the <public/> entries.

  <?xml version="1.0"?>
  <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
            uri="dtd/xhtml1/xhtml1-strict.dtd"/>
    <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
            uri="dtd/xhtml1/xhtml1-transitional.dtd"/>
    <system systemId="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
            uri="dtd/xhtml11/xhtml11-flat.dtd"/>
  </catalog>
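
The lookups that such catalogs describe can be tried directly. The sketch below is an illustration under stated assumptions: it requires Java 11 or later and uses the JDK's javax.xml.catalog API (available since Java 9) rather than any particular third-party resolver. The class name CatalogLookupDemo and the temporary directory are hypothetical; the catalog content combines one <public/> and one <system/> entry from the examples above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class CatalogLookupDemo {
    // One <public> and one <system> entry, without a DOCTYPE declaration.
    static final String CATALOG_XML = String.join("\n",
        "<?xml version=\"1.0\"?>",
        "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">",
        "  <public publicId=\"-//W3C//DTD XHTML 1.0 Strict//EN\"",
        "          uri=\"dtd/xhtml1/xhtml1-strict.dtd\"/>",
        "  <system systemId=\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\"",
        "          uri=\"dtd/xhtml11/xhtml11-flat.dtd\"/>",
        "</catalog>",
        "");

    /** Writes the catalog to a temporary directory and loads it. */
    static Catalog loadCatalog() throws IOException {
        Path dir = Files.createTempDirectory("xml-catalog-demo");
        Path file = dir.resolve("catalog.xml");
        Files.writeString(file, CATALOG_XML);
        return CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
    }

    public static void main(String[] args) throws IOException {
        Catalog catalog = loadCatalog();
        // Look up by public ID, by system ID, and with an ID that has no entry.
        System.out.println(catalog.matchPublic("-//W3C//DTD XHTML 1.0 Strict//EN"));
        System.out.println(catalog.matchSystem("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"));
        System.out.println(catalog.matchPublic("-//Example//DTD Unknown//EN"));
    }
}
```

Relative uri values are resolved against the location of the catalog file, so the first two lookups should print absolute file: URIs ending in the paths given in the catalog, while an identifier with no entry yields null.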

Using a catalog – Java SAX example

Catalog resolvers are available for various programming languages. The following Java example creates a SAX parser that uses org.apache.xml.resolver.tools.CatalogResolver to resolve external entities to locally cached copies while parsing an input source. This resolver comes from the Apache XML Commons project and is distributed with Apache Xerces; the Sun Java runtime has bundled a copy of it, and since Java 9 the JDK also provides a standard catalog API in the javax.xml.catalog package.

A SAXParser is created in the standard way using factories. The entity resolver of its underlying XMLReader is then set to a catalog resolver, either the default implementation or a custom one.

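  import javax.xml.parsers.SAXParser;
  import javax.xml.parsers.SAXParserFactory;
  import org.apache.xml.resolver.tools.CatalogResolver;
  import org.xml.sax.ContentHandler;
  import org.xml.sax.InputSource;
  import org.xml.sax.XMLReader;

  final SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
  final XMLReader reader = saxParser.getXMLReader();
  final ContentHandler handler = ...;
  final InputSource input = ...;
  reader.setEntityResolver( new CatalogResolver() );  // resolve external entities through the catalog
  reader.setContentHandler( handler );
  reader.parse( input );  // parse via the reader so the resolver stays in effect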

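It is important to call the parse method on the XMLReader, not on the SAXParser: the SAXParser.parse methods that accept a DefaultHandler install that handler as the reader's entity resolver, which would replace the catalog resolver.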
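
For readers who want to run the idea end to end, the following sketch is a self-contained variant under stated assumptions. It requires Java 11 or later and substitutes the JDK's javax.xml.catalog.CatalogResolver for the Apache resolver so that no extra library is needed. The class name CatalogSaxDemo, the file note.dtd, and the system ID http://example.invalid/dtd/note.dtd are hypothetical. The local DTD declares an entity so that successful resolution is visible in the parsed text.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.io.StringReader;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class CatalogSaxDemo {
    /** Parses a document whose DTD is served from a local copy via a catalog. */
    static String parseWithCatalog() throws Exception {
        Path dir = Files.createTempDirectory("xml-catalog-sax");
        // Local DTD copy declaring one entity, so that resolution is observable.
        Files.writeString(dir.resolve("note.dtd"),
            "<!ELEMENT note (#PCDATA)>\n<!ENTITY greeting \"hello\">\n");
        // Catalog mapping the remote system ID to the local copy.
        Path catalogFile = dir.resolve("catalog.xml");
        Files.writeString(catalogFile,
            "<?xml version=\"1.0\"?>\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <system systemId=\"http://example.invalid/dtd/note.dtd\" uri=\"note.dtd\"/>\n"
            + "</catalog>\n");

        CatalogResolver resolver =
            CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalogFile.toUri());

        // Collect character data; the parser may deliver it in several chunks.
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(handler);
        // Parse through the reader so the catalog resolver remains installed.
        reader.parse(new InputSource(new StringReader(
            "<!DOCTYPE note SYSTEM \"http://example.invalid/dtd/note.dtd\">\n"
            + "<note>&greeting;</note>")));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseWithCatalog());
    }
}
```

Because the entity greeting is declared only in the local DTD, the returned text contains its replacement only if the parser obtained the DTD through the catalog rather than from the network.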

References

  1. Walsh, Norman (7 October 2005). "XML Catalogs OASIS Standard V1.1" (PDF). OASIS. Retrieved 4 November 2023.