NAME

tdom::schema -
Create a schema validation command

SYNOPSIS

package require tdom

    tdom::schema cmdName
    

DESCRIPTION

This command creates validation commands with a simple API. The validation commands have methods to define a grammar and can validate XML or DOM trees (and, to some degree, other kinds of hierarchical data) against this grammar.

Additionally, a validation command may be used as the argument to the -validateCmd option of the dom parse and expat commands, which enables validation in addition to the checks they otherwise perform.
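The -validateCmd usage described above can be sketched as follows. This is a minimal illustration, assuming a tDOM build that includes tdom::schema; the command name bookSchema, the variable names and the sample document are made up for this example.

```tcl
package require tdom

# Hypothetical validation command with a tiny grammar.
tdom::schema bookSchema
bookSchema defelement addressBook {
    element card *
}
bookSchema defelement card {
    element name
    element email
}
bookSchema defelement name {text}
bookSchema defelement email {text}

set xml {<addressBook><card><name>John Smith</name><email>js@example.com</email></card></addressBook>}

# Parse and validate in one pass. An invalid document is expected to make
# dom parse raise an error instead of returning a DOM tree.
set ok [expr {![catch {dom parse -validateCmd bookSchema $xml} result]}]
if {$ok} {
    puts "valid, root element: [[$result documentElement] nodeName]"
    $result delete
} else {
    puts "invalid: $result"
}
bookSchema delete
```

Validating while parsing avoids a second pass over the document, and no DOM tree is built for input that fails validation.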

The valid methods of the created commands are:

defelement name ?namespace? <definition script>
This method defines the element name (optionally in the namespace namespace) in the grammar. The definition script is evaluated and defines the content model of the element. If the namespace argument is given, any element or ref references in the definition script that are not wrapped inside a namespace command are resolved in that namespace. If an element definition already exists for the name/namespace combination, the command raises an error.
defpattern name ?namespace? <definition script>
This method defines a (possibly complex) content particle with the name name (optionally in the namespace namespace) in the grammar, to be referenced in other definition scripts with the definition command ref. The definition script is evaluated and defines the content model of the content particle. If the namespace argument is given, any element or ref references in the definition script that are not wrapped inside a namespace command are resolved in that namespace. If a pattern definition already exists for the name/namespace combination, the command raises an error.
define <definition script>
This method defines several elements or patterns, or a whole grammar, with one call.
start documentElement ?namespace?
This method defines the name and namespace of the root element of a tree to validate. If this method is used, the root element must match for the document to be valid. If start isn't used, any element defined with defelement may be the root of a valid document. The start method may be used several times with varying arguments during the lifetime of a validation command. If the method is called with just the empty string (and no namespace argument), the validation constraint for the root element is removed and any defined element is valid as the root of a tree to validate.
event (start|end|text) ?event specific data?
This method validates hierarchical data, supplied as a sequence of events, against the content constraints defined so far in the validation command.
start name ?attributes? ?namespace?
Checks whether the current validation state allows the element name in the namespace namespace to start here. It raises an error if not.
end
Checks whether the innermost open element may end here in the current state without violating validation constraints. It raises an error if not.
text text
Checks whether the current validation state allows the given text content. It raises an error if not.
validate <XML string> ?objVar?
Returns true if the <XML string> is valid or false otherwise. If validation failed and the optional objVar argument is given, then the variable with that name is set to a validation error message.
delete
This method deletes the validation command.
state
This method returns the current validation state of the validation command. The possible return values and their meanings are:
READY
The validation command is ready to start validation.
VALIDATING
The validation command is in the process of validating input.
FINISHED
The validation has finished; no further events are expected.
reset
This method resets the validation command to the state READY (while preserving the defined grammar).
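The event and reset methods listed above can be combined to validate data that does not come from an XML parser. The following is a minimal sketch, assuming a tDOM build that includes tdom::schema; the command name pairSchema, the helper proc feed and the event lists are hypothetical and exist only for this illustration.

```tcl
package require tdom

# Hypothetical grammar: a pair holds a key and an optional value.
tdom::schema pairSchema
pairSchema defelement pair {
    element key
    element value ?
}
pairSchema defelement key {text}
pairSchema defelement value {text}

# Feed a list of events to the validation command.
# Returns 1 if every event was accepted, 0 at the first rejected event.
proc feed {schema events} {
    # Start from a clean validation state, keeping the grammar.
    $schema reset
    foreach ev $events {
        if {[catch {$schema event {*}$ev} msg]} {
            puts "rejected '$ev': $msg"
            return 0
        }
    }
    return 1
}

# A pair with only a key, then a pair that wrongly starts with value.
set good {{start pair} {start key} {text k1} {end} {end}}
set bad  {{start pair} {start value}}

set goodResult [feed pairSchema $good]
set badResult  [feed pairSchema $bad]
puts "good: $goodResult, bad: $badResult"
```

Calling reset before each run lets the same validation command be reused after a rejected event.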

Schema definition scripts

Schema definition scripts are ordinary Tcl scripts that are evaluated in the namespace tdom::schema. Several schema definition commands in this Tcl namespace allow a wide variety of document structures to be defined. Every schema definition command establishes a validation constraint on the content. For the content to be valid, all constraints must match and no unmatched content may remain.

The schema definition commands are:

element name ?quant? ?<definition script>?
This command refers to the element defined with defelement with the name name in the current context namespace. Forward references to an element that is not yet defined and recursive references are allowed.
ref name ?quant?
This command refers to the content particle defined with defpattern with the name name in the current context namespace. Forward references to a pattern that is not yet defined and recursive references are allowed.
group ?quant? <definition script>
choice ?quant? <definition script>
interleave ?quant? <definition script>
mixed ?quant? <definition script>
text <definition script>
any
attribute <definition script>
namespace <definition script>
empty
defelement name ?namespace? <definition script>
defpattern name ?namespace? <definition script>
start name ?namespace?

Quantity specifier

Several schema definition commands expect a quantifier as one of their arguments, which specifies how often the content particle specified by the command is expected. The valid values for a quant argument are:

!
The content particle must occur exactly once in valid documents.
?
The content particle may occur at most once (zero or one time) in valid documents.
*
The content particle may occur zero or more times in a row in valid documents.
+
The content particle must occur one or more times in a row in valid documents.
n
The content particle must occur exactly n times in a row in valid documents. The quantifier must be an integer greater than zero.
{n m}
The content particle must occur n to m times (both inclusive) in a row in valid documents. The quantifier must be a Tcl list with two elements. Both elements must be integers, with n >= 0 and n < m.

If an optional quantifier is missing then it defaults to ! - the content particle must occur exactly once in valid documents.
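The quantifier rules above amount to a minimum and maximum occurrence count. The following plain Tcl sketch spells that mapping out. The procs quantRange and quantAllows are hypothetical helpers written for this illustration; they are not part of tDOM and say nothing about how tDOM implements quantifiers internally.

```tcl
# Hypothetical helper, not part of tDOM: map a quant argument to the
# {min max} occurrence range it stands for. An empty max means unbounded.
# A missing quantifier defaults to ! (exactly once).
proc quantRange {{quant !}} {
    switch -exact -- $quant {
        ! { return {1 1} }
        ? { return {0 1} }
        * { return {0 {}} }
        + { return {1 {}} }
    }
    if {[string is integer -strict $quant] && $quant > 0} {
        return [list $quant $quant]
    }
    if {[string is list $quant] && [llength $quant] == 2} {
        lassign $quant n m
        if {[string is integer -strict $n] && [string is integer -strict $m]
            && $n >= 0 && $n < $m} {
            return [list $n $m]
        }
    }
    error "invalid quantifier \"$quant\""
}

# Returns 1 if count consecutive occurrences satisfy the quantifier, else 0.
proc quantAllows {quant count} {
    lassign [quantRange $quant] min max
    expr {$count >= $min && ($max eq "" || $count <= $max)}
}
```

For example, quantAllows {2 5} 3 checks whether three occurrences in a row satisfy the {n m} form with n = 2 and m = 5.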

Examples

The XML Schema Part 0: Primer Second Edition (https://www.w3.org/TR/xmlschema-0/) starts with this example schema:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:annotation>
    <xsd:documentation xml:lang="en">
     Purchase order schema for Example.com.
     Copyright 2000 Example.com. All rights reserved.
    </xsd:documentation>
  </xsd:annotation>

  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

  <xsd:element name="comment" type="xsd:string"/>

  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items"  type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>

  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="name"   type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city"   type="xsd:string"/>
      <xsd:element name="state"  type="xsd:string"/>
      <xsd:element name="zip"    type="xsd:decimal"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:NMTOKEN"
                   fixed="US"/>
  </xsd:complexType>

  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice"  type="xsd:decimal"/>
            <xsd:element ref="comment"   minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Stock Keeping Unit, a code for identifying products -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
    

A roughly one-to-one translation of that into a tDOM schema definition script would be:

tdom::schema grammar
grammar define {

    # Purchase order schema for Example.com.
    # Copyright 2000 Example.com. All rights reserved.

    defelement purchaseOrder {ref PurchaseOrderType}

    defelement comment {text}

    defpattern PurchaseOrderType {
        element shipTo {ref USAddress}
        element billTo {ref USAddress}
        element comment ?
        element items
        attribute orderDate
    }

    defpattern USAddress {
        element name ! {text}
        element street ! {text}
        element city ! {text}
        element state ! {text}
        element zip ! {text isNumber}
        attribute country ! {text {fixed "US"}}
    }

    defelement items {
        element item * {
            element productName ! {text}
            element quantity ! {text {maxExclusive 100}}
            element USPrice ! {text isNumber}
            element comment ?
            element shipDate ? {text isDate}
            # The pattern is braced so that Tcl does not substitute it.
            attribute partNum ! {text {pattern {\d{3}-[A-Z]{2}}}}
        }
    }
}
      
    

The RELAX NG Tutorial (http://relaxng.org/tutorial-20011203.html) starts with this example:

Consider a simple XML representation of an email address book:

<addressBook>
  <card>
    <name>John Smith</name>
    <email>js@example.com</email>
  </card>
  <card>
    <name>Fred Bloggs</name>
    <email>fb@example.net</email>
  </card>
</addressBook>

The DTD would be as follows:

<!DOCTYPE addressBook [
<!ELEMENT addressBook (card*)>
<!ELEMENT card (name, email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>

A RELAX NG pattern for this could be written as follows:

<element name="addressBook" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="card">
      <element name="name">
        <text/>
      </element>
      <element name="email">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>
      
    

This schema definition script will do the same:

tdom::schema grammar      
grammar define {
    defelement addressBook {
        element card *
    }
    defelement card {
        element name
        element email
    }
    foreach e {name email} {
        defelement $e {text}
    }
}
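To show this grammar in use, the following sketch validates one valid and one invalid address book with the validate method. It assumes a tDOM build that includes tdom::schema. The command name abook, the variable names and the two sample documents are chosen for this illustration only.

```tcl
package require tdom

# Same grammar as above, under a hypothetical command name.
tdom::schema abook
abook define {
    defelement addressBook {
        element card *
    }
    defelement card {
        element name
        element email
    }
    foreach e {name email} {
        defelement $e {text}
    }
}
# Require addressBook as the root element.
abook start addressBook

set valid   {<addressBook><card><name>John Smith</name><email>js@example.com</email></card></addressBook>}
# The second document omits the required name element.
set invalid {<addressBook><card><email>js@example.com</email></card></addressBook>}

set results {}
foreach xml [list $valid $invalid] {
    # Reset explicitly so every run starts from the READY state.
    abook reset
    if {[abook validate $xml msg]} {
        lappend results 1
        puts "valid"
    } else {
        lappend results 0
        puts "invalid: $msg"
    }
}
abook delete
```

The optional second argument to validate receives the validation error message, which is usually what a caller wants to report.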