dotNoted


Observations of .Net development in the wild

xsd:any and the Unique Particle Attribution constraint

Here’s a beauty –
 
"Wildcard ‘##any’ allows element ‘foo’, and causes the content model to become ambiguous. A content model must be formed such that during validation of an element information item sequence, the particle contained directly, indirectly or implicitly therein with which to attempt to validate each item in the sequence in turn can be uniquely determined without examining the content or attributes of that item, and without any information about the items in the remainder of the sequence."
 
Whew, ok then. So that’s one thing you can do if you are an out-of-work English Lit major: write prosaic exceptions for Microsoft.
 
You’ll get that error when you have an xsd:any element (a "wildcard") in your XSD schema and try to validate an XML document against that schema in .Net 2.0. This is because you are violating the Unique Particle Attribution constraint on content models. (What is a particle? The short, and usually sufficient, answer is that it is the ‘abstract base class’ for elements and groups [and wildcards]. The long answer is here.) The Unique Particle Attribution constraint (UPA) states that every element in an XML document must be attributable to exactly one particle in the associated schema, and that the validator must be able to tell which one without looking ahead. Wildcards can mess with this in much the same way that they mess with your carefully constructed Regular Expressions: they eat too much. Consider the following declaration from a schema:
 
<xsd:sequence>
    <xsd:element name="foo" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="bar" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:any minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
</xsd:sequence>

Now if I had the following XML fragment, it looks as though all would be well:
<foo>MyFoo</foo>
<bar>MyBar</bar>
<baz>MyBaz</baz>

…but the following would cause problems:
<foo>MyFoo</foo>
<bar>MyBar</bar>
<bar>MyBar</bar>

You can see why. Which schema declaration do I attribute the second <bar> element to? It could be the element declaration or the wildcard. This is what makes the XML validator fall down, or at least should make it, since it violates the standard’s UPA constraint. Strictly speaking, the ambiguity lives in the schema rather than in any one document: the wildcard can just as well claim the first <foo>, which is why the error message names ‘foo’. It seems, however, that the validator in .Net 1.1 didn’t fall down here. I just downloaded the User Interface Process Application Block, compiled and ran the Insurance Client Management quickstart, and it fell down on the provided XML and XSD (the XSD has a wildcard). I’ll have to wait to try it on 1.1, since I don’t have it installed at work here, but my expectation is that it will work. Incorrectly work, that is.
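
To make the "which particle claims this element?" question concrete, here is a rough Python sketch of the sequence above. It is only a toy model, not how System.Xml or any real validator works: the names Particle, candidates, ambiguous_names and SCHEMA are made up for illustration, namespaces are ignored, and every particle is assumed to be optional and unbounded, as in the schema in this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Particle:
    """One item of the xsd:sequence. name=None stands for an xsd:any wildcard.

    Simplifying assumption: every particle is minOccurs=0, maxOccurs=unbounded.
    """
    label: str
    name: Optional[str]

    def accepts(self, element: str) -> bool:
        # A wildcard accepts anything; a named element accepts only its own name.
        return self.name is None or self.name == element


# Hypothetical model of the sequence from the post: foo*, bar*, any*
SCHEMA = [
    Particle("element foo", "foo"),
    Particle("element bar", "bar"),
    Particle("wildcard ##any", None),
]


def candidates(sequence: List[Particle], start: int, element: str) -> List[str]:
    """Labels of the particles that could claim `element` when validation
    currently stands at index `start`. Because every particle is optional,
    any particle from `start` onwards is reachable."""
    return [p.label for p in sequence[start:] if p.accepts(element)]


def ambiguous_names(sequence: List[Particle]) -> List[str]:
    """Declared element names that more than one particle could claim."""
    declared = [p.name for p in sequence if p.name is not None]
    return [n for n in declared if len(candidates(sequence, 0, n)) > 1]


for element in ("foo", "bar", "baz"):
    print(element, "->", candidates(SCHEMA, 0, element))

print("ambiguous:", ambiguous_names(SCHEMA))
```

Under these assumptions, both foo and bar come back with two candidates (their own declaration and the wildcard), while baz has only the wildcard. In this toy model, at least, the trouble is visible from the schema alone, before any particular document is looked at.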
 
So, that leaves the question: is xsd:any a useful thing or not, if it breaks content models so easily? There are two camps. The next post on this subject will explore these camps.

Filed under: Getting Data
