Canonical XML

Canonical XML fixes a number of details of the serialized form, some of which are:

  • the UTF-8 encoding is used

  • line-ends are represented using the character 0x0A

  • whitespace in attribute values is normalized

  • entity references are expanded

  • CDATA marked sections are not used

  • empty elements are encoded as start/end pairs, not using the special empty-element syntax

  • default attributes are made explicit

  • superfluous namespace declarations are deleted
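A few of the rules above can be observed with a short sketch. It assumes Python 3.8 or later, whose standard library provides xml.etree.ElementTree.canonicalize; that function implements the C14N 2.0 revision, so its output may differ in some details from other Canonical XML versions. The sample document and the variable names are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: CRLF line ends, an empty-element tag,
# and a CDATA marked section.
source = '<doc>\r\n  <empty/>\r\n  <![CDATA[1 < 2]]>\r\n</doc>'

# With no output file given, canonicalize() returns the canonical
# form as a text string.
canonical = canonicalize(source)
print(canonical)
```

In the printed result the empty element appears as a start/end pair, the CDATA section is replaced by ordinary escaped text, and the line ends are plain 0x0A characters.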

According to the W3C, if two XML documents have the same canonical form, then the two documents are logically equivalent within the given application context (except for limitations regarding a few unusual cases).
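The equivalence test described here can be sketched as a comparison of canonical forms. This is a minimal illustration, again assuming Python 3.8 or later and its C14N 2.0 implementation in xml.etree.ElementTree.canonicalize; the helper same_canonical_form and the sample documents are hypothetical names introduced only for this example.

```python
from xml.etree.ElementTree import canonicalize


def same_canonical_form(xml_a, xml_b):
    """Return True if both documents serialize to the same canonical form."""
    return canonicalize(xml_a) == canonicalize(xml_b)


# Same content, different surface syntax: attribute order, quote style,
# spacing inside the start tag, and empty-element syntax all differ.
doc_a = '<item b="2" a="1"><note/></item>'
doc_b = "<item a='1'   b='2'><note></note></item>"

# Different content: one attribute value has changed.
doc_c = '<item a="1" b="3"><note/></item>'

print(same_canonical_form(doc_a, doc_b))
print(same_canonical_form(doc_a, doc_c))
```

The first pair differs only in ways that canonicalization removes, so the two documents compare equal; the third document carries a different attribute value, which survives canonicalization and makes the comparison fail.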