XML Schema Clinic: Enforcing Association Cardinality

Will Provost

Originally published at XML.com, June 26, 2002

If you're like me, XML document design brings out your darker side. We like a schema whose heart is black (do we not?), a schema that's just itching to call a candidate document on the carpet for the slightest nonconformity. None of this any-element-will-do namespace validation for us. We enjoy the dirty work: schemas, we think, are best built to be aggressive in ferreting out the little mistakes in the information set that could later confuse the more sheltered and constructive-minded XML application.

We'll practice a bit of that merciless science right here: specifically, we'll look at ways to control the cardinality of associations between XML elements. The basic implementation of an association between two types is simple enough, and we'll review it in a moment, but it is only sufficient for many-to-one relationships. What if multiple references, or zero references, to an item are unacceptable? What if a referenced item may or may not be present? These variations will require other techniques, and these are essential for the truly Draconian schema author.

This article will use a simple UML notation to illustrate patterns and examples. Knowledge of both XML Schema 1.0 (Part 1 in particular) and UML is assumed, although in developing our notation we'll have a chance to review a little of both.

A Simple UML Notation

The Unified Modeling Language, or UML, provides a basis for a simple notation that will serve our needs in identifying rudimentary design patterns and in illustrating specific examples. (Note that many UML-to-Schema mappings are possible; see the "XMI Production for XML Schema" specification from the OMG for one much more formal option.)

First, let's denote a single complex type as what UML calls a class, with XML attributes and simple-type child elements represented by UML attributes; we're also borrowing the XPath
Expressing Compositions: Content Models

The above example also represents the most basic sort of composition, in which all the components are of simple type. Composition of one complex type in terms of another in UML is most literally denoted as an aggregation by value; note the use of a name, as for an attribute, on the composing type, and the depiction of cardinality:
One brief note on composition: it is so basic to the XML information model that it is easy to overuse. It's an excellent habit to challenge the assumption that composition is the best characterization of a given relationship, and with XML Schema, as we're about to see, there's no practical reason to avoid implementing a looser association where it's appropriate to the design.

Expressing Associations: Key Relationships

For types that are related, but not by composition, the UML association is a useful concept. It declares that between two types there is a clear relationship, which may be named on either side, or on both, and usually that there is a clear understanding of cardinality as well. The association also may be navigable in either or both directions; to navigate the association is to find an associated instance based on knowledge of an associating instance.

Although the association is an OO concept, its implementation in XML Schema relies on a classic relational database feature, the foreign-key relationship. A key must be defined to allow unique instances of the associated type (
The precise validation rules that are realized by these two components are as follows:
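Sketched in XML Schema 1.0, a key and key reference pair, with the rules each one enforces noted in comments, might look like the following. The names `Model`, `A`, `B`, `id`, `b`, `BKey`, and `AToB` are hypothetical, chosen only to match the "A references B" wording used in this article; they are not taken from the article's own schemas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- All element, attribute, and constraint names here are illustrative. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Model">
    <xs:complexType>
      <xs:sequence>
        <!-- The associated type: each B carries an identifying field. -->
        <xs:element name="B" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <!-- The associating type: each A carries a referencing field. -->
        <xs:element name="A" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="b" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Key: every selected B must have the field present, and no two
         B elements under the same Model may share a value. -->
    <xs:key name="BKey">
      <xs:selector xpath="B"/>
      <xs:field xpath="@id"/>
    </xs:key>

    <!-- Key reference: every A/@b value that is present must equal the
         value of some B/@id under the same Model. -->
    <xs:keyref name="AToB" refer="BKey">
      <xs:selector xpath="A"/>
      <xs:field xpath="@b"/>
    </xs:keyref>
  </xs:element>

</xs:schema>
```

Under this sketch, several `A` elements may name the same `B`, and a `B` may go unreferenced, which is the many-to-one starting point.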
The basic implementation of an association, shown above, expresses many-to-one cardinality: multiple A to singular B. Let's speak generally now of associations "from A to B". By nature, the

Controlling Association Cardinality: Optional or Multiple Associated Type

The cardinality of the associated type is the easier of the two: this is really a matter of controlling the composition of referencing fields. Think of a referencing field (such as

For example, consider a model a newspaper might use for printing movie listings:

[Figure: UML diagram of the movie listings model]

Here, a

But, wait: where's the "multiple" B type? The diagram above maps the XML schema construction exactly, and in doing so lays out the

Controlling Association Cardinality: Optional Associating Type

On the associating side, we can implement optional cardinality by insisting on the uniqueness of the referencing fields: in other words, multiple A elements referencing the same B are disallowed. For this purpose a third XML Schema construct can be used: the uniqueness constraint, implemented using the

Let's say a restaurant keeps its reservation book as an XML document. It lists its tables, organizes each evening into planned seating times, and then captures each reservation as a name and number of people related to table and time:

[Figure: UML diagram of the restaurant reservation model]

Clearly, though each reservation references a table and seating time, we don't want double-bookings. The Restaurant schema enforces this constraint by laying an

A more practical version of this design would include dates, rather than repeating all that table and seating information in multiple documents. There are a few approaches to this:
Controlling Association Cardinality: Singular Associating Type

To achieve singular cardinality on A is to insist that for every B there is some referencing A. This is the exact inverse of the key reference that's already in place, and the way to implement this cardinality is to create a symmetrical key and key reference. So, for a one-to-one association, there would be two keys and two key references, one each on types A and B.

Notice that this will work even when the cardinality of B is not singular. The idea of "exactly one A optionally referencing a B" may fit into the human brain only with some pushing and twisting, but it is both conceptually sound and technically feasible. This sort of relationship often will make more sense at every level if it is restated as "for every B there is exactly one A", and certainly this is simpler to implement. However, there may be tension between ease of implementation and the correct naming and navigability as designed: "inversion" to the implementer may be "perversion" to the designer. Keep in mind that the effort here is all in the schema; there is no burden on the instance documents, and if anything they may be most readable and intuitive according to the original design.

Summary

The following table summarizes the techniques for various multiplicity requirements for each side of the relationship:
[Table: techniques by multiplicity requirement for each side of the relationship]
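As a rough illustration of the one-to-one case described under "Singular Associating Type", the sketch below declares two keys and two key references on a pair of types. It is only one possible arrangement, not the article's own schema: the names `Model`, `A`, `B`, `id`, `b`, and the four constraint names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- All element, attribute, and constraint names here are illustrative. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Model">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="B" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="A" maxOccurs="unbounded">
          <xs:complexType>
            <!-- Required referencing field: singular B for every A. -->
            <xs:attribute name="b" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Each B is uniquely identified by its id. -->
    <xs:key name="BKey">
      <xs:selector xpath="B"/>
      <xs:field xpath="@id"/>
    </xs:key>

    <!-- Every A must reference an existing B. -->
    <xs:keyref name="AToB" refer="BKey">
      <xs:selector xpath="A"/>
      <xs:field xpath="@b"/>
    </xs:keyref>

    <!-- No two A elements may reference the same B. -->
    <xs:key name="ARefKey">
      <xs:selector xpath="A"/>
      <xs:field xpath="@b"/>
    </xs:key>

    <!-- The inverse reference: every B must be referenced by some A. -->
    <xs:keyref name="BToA" refer="ARefKey">
      <xs:selector xpath="B"/>
      <xs:field xpath="@id"/>
    </xs:keyref>
  </xs:element>

</xs:schema>
```

Replacing `ARefKey` with an `xs:unique` over the same selector and field, and dropping `BToA`, would relax the A side to optional: at most one A per B, with unreferenced B elements allowed. Declaring the `b` attribute with `use="optional"` would likewise relax the B side, since a key reference is only checked for elements in which the referencing field is present.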