Chapter 2. XML Fundamentals

2.8 Going Beyond the XML Specification

The standards developed at the W3C promote interoperability between distributed systems and among application developers around the world. As we progress in this book from XML tools and strategies in your local applications to distributed application development, several new XML terms and issues come to the forefront.

2.8.1 XML Namespaces

As discussed in Section 1.2.2 in Chapter 1, namespaces provide a means to combine elements from different knowledge domains or schemas. The Namespaces specification accomplishes this by allowing element and attribute names to be qualified with a URI; every URI corresponds to a unique namespace. Namespaces are used for several purposes in practice, but the most important is to allow a document to contain elements defined by different schemas (possibly originating from different organizations) without naming conflicts.

A namespace is declared by binding a prefix to a URI with an attribute of the form xmlns:prefix. An element or attribute is then placed in that namespace by writing the prefix, followed by the reserved colon character, in front of its name. For example:

<sumc:purchaseOrder refnum="389473984-38844"
    xmlns:sumc="http://www.superultramegacorp.com">
  <sumc:product name="Magical Widget" sku="398-4993833">
    <sumc:qty value="24">One Case Order</sumc:qty>
    <sumc:amount value="34.56">34.56</sumc:amount>
    <sumc:shipping value="overnight">Next-day</sumc:shipping>
  </sumc:product>
</sumc:purchaseOrder>

This document declares the SuperUltraMegaCorp namespace and associates the prefix sumc with it through the xmlns:sumc attribute. Elements prefixed with sumc: are within this namespace. This purchaseOrder now has a context that can set it apart from a similarly structured purchase order intended for a different business domain.
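
To see how a namespace-aware parser treats the purchase order, here is a minimal sketch using the xml.etree.ElementTree module from the Python standard library. The prefix po used in the lookup is an arbitrary choice made for this illustration; only the namespace URI has to match the document.

```python
import xml.etree.ElementTree as ET

# The purchase order from the example, held in a string.
ORDER = """\
<sumc:purchaseOrder refnum="389473984-38844"
    xmlns:sumc="http://www.superultramegacorp.com">
  <sumc:product name="Magical Widget" sku="398-4993833">
    <sumc:qty value="24">One Case Order</sumc:qty>
    <sumc:amount value="34.56">34.56</sumc:amount>
    <sumc:shipping value="overnight">Next-day</sumc:shipping>
  </sumc:product>
</sumc:purchaseOrder>
"""

SUMC = "http://www.superultramegacorp.com"

root = ET.fromstring(ORDER)

# ElementTree reports qualified names as "{namespace-uri}local-name";
# the prefix itself is not part of the element's identity.
print(root.tag)

# The query prefix ("po", chosen for this sketch) is mapped to the same URI,
# so it finds elements written with the sumc: prefix in the document.
qty = root.find("po:product/po:qty", {"po": SUMC})
print(qty.get("value"), qty.text)

# Unprefixed attributes such as refnum are not placed in the namespace.
refnum = root.get("refnum")
print(refnum)
```

The lookup succeeds even though the query and the document use different prefixes, which illustrates that the URI, not the prefix, identifies the namespace.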

2.8.2 Extracting Information Using XPath

XPath is discussed at length in Chapter 5. For now it is worth a mention, lest you start to develop your own method for querying XML without understanding what standards are offered.

XPath offers a standardized method of querying XML for specific information, whether it's a single element or node, or a collection of elements. The standardization is of value not when you're writing the backend part of your application, but rather when you need to expose search capabilities either programmatically or via the web.
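
As a small taste of what such queries look like, the sketch below uses xml.etree.ElementTree from the Python standard library, which supports only a limited subset of XPath. The document is a simplified purchase order without namespaces, and the second product is hypothetical data added so that the queries have something to distinguish.

```python
import xml.etree.ElementTree as ET

# Simplified order; the "Ordinary Widget" entry is invented for illustration.
ORDER = """\
<purchaseOrder refnum="389473984-38844">
  <product name="Magical Widget" sku="398-4993833">
    <qty value="24">One Case Order</qty>
  </product>
  <product name="Ordinary Widget" sku="398-0000001">
    <qty value="2">Sample</qty>
  </product>
</purchaseOrder>
"""

root = ET.fromstring(ORDER)

# ".//product" selects every product element below the root.
names = [p.get("name") for p in root.findall(".//product")]
print(names)

# A predicate narrows the selection to the product with a given sku,
# and the trailing step selects that product's qty child.
match = root.find(".//product[@sku='398-4993833']/qty")
print(match.get("value"))
```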

2.8.3 Using XLink to Link XML Documents

The XLink language allows for the insertion of elements into XML documents to create and describe links between different resources. XLink uses XML syntax to create structures representing links similar to hyperlinks used in HTML, as well as more complex linking structures. Link specifications are encoded in the attributes of elements in the source document, or in supplemental documents that can describe links among other documents. The most common applications embed link information at the link source. The target of a link is described using a URI reference; the URI specifies the target resource, and an optional fragment identifier, typically an XPointer expression built on XPath, specifies a specific location in the linked resource. XLink is still a young specification and is not discussed further in this book.
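
Because XLink information is carried in namespaced attributes, an ordinary XML parser can read it. The following sketch assumes a hypothetical catalog document and an example.com target invented for illustration; only the XLink namespace URI and the type and href attribute names come from the XLink specification.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the XLink specification.
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical document: one element carries a simple link, one does not.
DOC = """\
<catalog xmlns:xlink="http://www.w3.org/1999/xlink">
  <item xlink:type="simple"
        xlink:href="http://www.example.com/widgets.xml">Widgets</item>
  <item>No link here</item>
</catalog>
"""

root = ET.fromstring(DOC)

# Namespaced attributes use the same "{uri}name" form as element tags.
href_attr = "{%s}href" % XLINK

# Collect (text, target) for every element that embeds link information.
links = [(el.text, el.get(href_attr))
         for el in root.iter()
         if el.get(href_attr) is not None]
print(links)
```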

2.8.4 Communicating with XML Protocols

The XML Protocol working group is a W3C group tasked with investigating the development of XML-based messaging and communications standards. These standards attempt to define a method of packaging information and sending it across the Internet. Some are focused on transactions, some are focused on guaranteed delivery, and others are focused on routing and enveloping mechanisms. The Protocol Activity page (http://www.w3.org/2000/03/29-XML-protocol-matrix) is an excellent online resource for comparing these different protocols when developing distributed systems. The Web Distributed Authoring and Versioning specifications from the IETF, collectively known as WebDAV, use XML to support interoperable tools for web site management and authoring. Chapter 9 covers such items as remote procedure calls and web services (including SOAP) in greater detail. Additional specifications deal with other aspects of distributed computing, especially topics such as authentication and secure communications.
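
To make the idea of packaging information as XML concrete, here is a minimal sketch using XML-RPC, one XML-based remote procedure call protocol, through the xmlrpc.client module in the Python standard library. The method name placeOrder and its parameters are hypothetical, and no network connection is made; the example only encodes a call and decodes it again.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method as an XML-RPC request body.
request = xmlrpc.client.dumps((24, "overnight"), methodname="placeOrder")

# The request is plain XML text that could be sent in an HTTP POST.
print(request)

# A receiver decodes the same text back into parameters and a method name.
params, method = xmlrpc.client.loads(request)
print(method, params)
```

The protocols compared on the Protocol Activity page differ in what they wrap around the payload, but each defines an XML packaging for a message in roughly this spirit.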

2.8.5 Replacing HTML with XHTML

The Extensible Hypertext Markup Language, or XHTML (http://www.w3.org/TR/xhtml1/), is a welcome gift to those of us who have had to struggle with parsing HTML. Though there is a W3C specification for HTML, most implementations conform only partially. This is due in part to the growth of HTML from some early implementations rather than a formal specification, and also to the browser implementers' attempts to do "the right thing" even with badly broken markup.

The attempts to force HTML to fit into an SGML mold after the fact probably hindered compliance further, if only because the rules for parsing it became more complex and implementers don't like to start over. When a browser parses HTML, it concerns itself with display attributes, not organization of the information in the document. While XHTML doesn't change the focus on appearance, it is an XML-based markup language, allowing you to parse it with an XML parser. This can drastically reduce the effort of handling XHTML. It also allows you to reuse XHTML in other XML applications, and to use XML Namespaces to mix XHTML with markup from other domains and systems.
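
As an illustration of parsing XHTML with a general-purpose XML parser, the sketch below reads a small page with xml.etree.ElementTree from the Python standard library. The page content is hypothetical; the XHTML namespace URI is the one defined by the XHTML 1.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the XHTML 1.0 specification.
XHTML = "http://www.w3.org/1999/xhtml"

# A small, well-formed page invented for this example.
PAGE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Order Status</title></head>
  <body>
    <p>See the <a href="/orders/1">first order</a> and the
       <a href="/orders/2">second order</a>.</p>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Map a prefix of our choosing ("h") to the XHTML namespace for queries.
ns = {"h": XHTML}

title = root.find("h:head/h:title", ns).text
hrefs = [a.get("href") for a in root.iterfind(".//h:a", ns)]

print(title)
print(hrefs)
```

No HTML-specific error recovery is involved here: the page parses because it is well-formed XML, and the same code would reject markup with unclosed or badly nested tags.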

The first version of the XHTML specification, XHTML 1.0, defines a monolithic document type that corresponds closely with the HTML 4 specification. Future versions of XHTML, starting with XHTML 1.1, are moving toward a modular approach; different aspects of the language will be defined in separate components, and different applications will have the flexibility to determine which components they support. Part of the intent is to allow browsers with simpler displays, such as mobile phones, to avoid having to implement portions of XHTML that do not make sense for the application (such as tables for very small textual displays). An additional benefit is that application developers can define new modules that allow documents to be created that can be used for both presentation to people and improved computer-to-computer communications.

2.8.6 Transforming XML with XSLT

The Extensible Stylesheet Language, or XSL, consists of two component specifications: XSL Transformations (XSLT) and XSL Formatting Objects (XSL-FO). The transformation language is used to translate XML documents from their original form to some other form, which may be XML, HTML, or anything else (including plain text). XSLT is covered in more detail in Chapter 6. The XSL-FO specification describes specific presentational styling and is used to describe a formatted document that could be printed to a typesetting device or displayed on a screen. It is not as widely implemented as XSLT and is not covered further in this book.