Saturday, 31 May 2008

Mathematical Dependence

OOA96 added the concept of mathematical dependence and mathematically dependent attributes to information modelling. Such attributes are annotated with an (M) suffix and are known as derived attributes in most other object-oriented methods (including Executable UML). The term mathematically dependent is a mouthful and the alternative term derived is much easier to use, but the longer term does allow us to distinguish between the analysis concept and the design/implementation concept of derived.

When mathematically dependent attributes are calculated depends on the software architecture. They may be calculated on demand, i.e. whenever their value is accessed. Alternatively, they may be calculated whenever the attributes they depend on change and then cached for instant access when needed. However, Recursive Design in OOA08 allows attribute updated mappings to be created in bridges that execute action language code whenever an attribute changes. If such a mapping is made from a particular mathematically dependent attribute then the second approach must be implemented for that attribute.
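The two calculation strategies can be sketched in Python as follows. This is only an illustration, not how OOA Tool or any particular software architecture implements them: the rectangle objects, the area attribute and the on_area_updated callback (standing in for an attribute updated mapping in a bridge) are all hypothetical.

```python
class RectangleOnDemand:
    """Hypothetical object whose area (M) is recalculated on every access."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Calculated on demand from the attributes it depends on.
        return self.width * self.height


class RectangleCached:
    """Hypothetical object whose area (M) is recalculated on change and cached."""

    def __init__(self, width, height, on_area_updated=None):
        # on_area_updated is an assumed stand-in for an attribute updated mapping.
        self._on_area_updated = on_area_updated
        self._width = width
        self._height = height
        self._area = width * height

    @property
    def area(self):
        return self._area  # instant access to the cached value

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        self._width = value
        self._recalculate()

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, value):
        self._height = value
        self._recalculate()

    def _recalculate(self):
        new_area = self._width * self._height
        if new_area != self._area:
            self._area = new_area
            if self._on_area_updated is not None:
                self._on_area_updated(new_area)
```

The cached variant is the only one that can notify a listener at the moment the derived value changes, which is why a mapping from a mathematically dependent attribute forces the second approach.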

OOA08 also allows mathematically dependent relationships, which are a new type of binary relationship formalized using action language code. Code needs to be provided for navigation in both directions. This code is given an instance of the source participant and returns a set of corresponding instances of the target participant. Using mathematically dependent relationships, analysts can now define composed relationships of the form R1 = R2 - R3 etc.
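As a rough illustration of the navigation code for R1 = R2 - R3, the Python sketch below assumes that the minus sign means set difference over relationship instances, and that relationship instances can be held as simple (left, right) pairs. Both are assumptions made for the example, as are the instance names and function names; real models would use action language.

```python
# Hypothetical relationship instances held as (left, right) pairs.
R2 = {("a1", "b1"), ("a1", "b2"), ("a2", "b2")}
R3 = {("a1", "b2")}


def navigate(pairs, source, forward=True):
    """Return the set of instances related to source across a stored relationship."""
    if forward:
        return {right for (left, right) in pairs if left == source}
    return {left for (left, right) in pairs if right == source}


def r1_forward(a):
    """Navigate R1 from the left participant: related via R2 but not via R3."""
    return navigate(R2, a) - navigate(R3, a)


def r1_backward(b):
    """Navigate R1 from the right participant: related via R2 but not via R3."""
    return navigate(R2, b, forward=False) - navigate(R3, b, forward=False)
```

Each function takes one source instance and returns a set of target instances, matching the shape of navigation code described above.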

Thursday, 29 May 2008

Referential Attributes

Referential attributes are used to formalize binary and associative relationships. All objects should have one or more identifiers each of which is composed of one or more identifying attributes. Any attribute on an object can be identifying. It is common for identifiers to be composed of more than one identifying attribute. Some analysts get lazy and create arbitrary ID identifiers everywhere but I try to avoid them whenever possible. To formalize a one-to-one or one-to-many binary relationship between two objects (or participants), a target identifier is chosen on one object and a set of referential attributes is chosen on (or added to) the other object where each referential attribute is mapped to an identifying attribute of the target identifier. These referential attributes can't be set directly by the analyst; they exist only as a by-product of a relationship instance existing. A referential attribute may be used in many participant mappings each of which is annotated in the referential suffix.

OOA08 (and OOA Tool) fully supports the use of referential attributes to formalize binary and associative relationships. However, OOA08 requires all referential attributes be resolvable to a single base attribute (non-referential and non-polymorphic). This wasn't an explicit requirement in OOA91 or OOA96. An analyst could keep a model consistent by using the same data type (or attribute domain) for overlapping referential attributes. However, in my experience it is always possible to resolve overlapping referential attributes to a single base attribute and the resulting model is always more complete.

As a result of this requirement, all referential attributes have a base attribute status of either:

  • Unresolved indicating that there is no path from the referential attribute to any base attribute,
  • Partially Resolved indicating that there is a path from the referential attribute to a base attribute but that there are also other non-circular paths which don't lead to a base attribute yet,
  • Fully Resolved indicating that all paths from the referential attribute lead to a single base attribute (ignoring circular references),
  • Incompatible indicating that one or more paths from the referential attribute lead to a base attribute with a data type that doesn't match the manual data type associated with this attribute (which can't happen if a manual data type is not specified),
  • Multiple Base Attributes indicating that one or more paths from the referential attribute lead to multiple base attributes (whether they have the same data type is not relevant in OOA08),
  • and Only Circular References indicating that all paths from the referential attribute are circular in nature.

The example above should help illustrate the different base attribute statuses. Attribute A1 is Unresolved since it is declared as referential but doesn't appear in any participant mappings. Attribute A2 is Unresolved since although it references Attribute B1, that attribute is Unresolved. Attribute A3 is Partially Resolved since although it references Attribute B2 which is a base attribute, it also references Attribute C1 which is Unresolved. Attribute A4 is Fully Resolved since it only references Attribute D1 which is a base attribute. If Attribute A4 had a manual data type of Integer and Attribute D1 had a manual data type of String then Attribute A4 would have an Incompatible status instead. If Attribute C1 was a base attribute rather than a referential attribute then Attribute A3 would have a Multiple Base Attributes status instead. If Attribute B1 formalized relationship R4 then both Attribute A2 and Attribute B1 would have an Only Circular References status.

Circular references are common when composed identifiers are used, i.e. the identifiers for a set of related objects may all have a common identifying attribute. Having circular references is not a problem unless there are only circular references. However, once a referential attribute with this problem has been identified, it can normally be easily fixed, i.e. a set of related objects with a common identifying attribute implies a relationship should exist to an object defining the common attribute as a base attribute. There are situations that appear to be correct but which still have referential attributes with only circular references, i.e. when an identifying attribute which should have been a base attribute has become a referential attribute. However, an analyst should always be able to determine the home object of a referential attribute and if that attribute is not a base attribute, the analyst should make it a base attribute by separating the referential duties off into another attribute and noting the constraint.

The algorithm for resolving base attributes is highly iterative because referential attribute mappings form arbitrary graphs which may be cyclic, i.e. they may include circular paths. The algorithm is briefly outlined below:

  1. Add all referential attributes to a set of unresolved referential attributes
  2. Loop through the set of unresolved referential attributes until a complete iteration occurs without resolving a single referential attribute
    • Attempt to resolve the base attribute and base attribute status of each referential attribute using previously resolved referential attributes (or previously determined true attributes for polymorphic attributes) while checking for error conditions at the same time
    • If any unresolved referential attributes are referenced by this attribute then skip the resolution of this attribute for now
    • A base attribute status can be resolved if all references from this attribute are non-referential attributes, previously resolved referential attributes or if an error condition can be detected without the need to completely resolve all references from this attribute, e.g. for Incompatible or Multiple Base Attributes error conditions
    • If a base attribute status can be resolved then remove this attribute from the set and perform another complete iteration of the unresolved referential attributes after this iteration completes
  3. If the set still contains unresolved referential attributes then a final iteration through the set is made by re-entering the above loop, except that references to unresolved referential attributes are treated as circular references and ignored
  4. Any unresolved referential attributes after this final iteration are all given the base attribute status of Only Circular References
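The outline above can be sketched in Python as follows. This is a simplified reading of the steps, not OOA Tool's actual code: it leaves out polymorphic attributes, manual data types and the Incompatible status, and it visits attributes in name order, which is an arbitrary choice. The function name and the refs structure (each referential attribute mapped to the list of attributes it references, with any name missing from the mapping treated as a base attribute) are assumptions made for the example.

```python
UNRESOLVED = "Unresolved"
PARTIALLY_RESOLVED = "Partially Resolved"
FULLY_RESOLVED = "Fully Resolved"
MULTIPLE_BASE = "Multiple Base Attributes"
ONLY_CIRCULAR = "Only Circular References"


def resolve_base_attributes(refs):
    """Return a dict mapping each referential attribute to its status."""
    result = {}              # name -> (status, set of base attributes reached)
    unresolved = set(refs)   # step 1

    def try_resolve(name, ignore_unresolved):
        if not refs[name]:
            return UNRESOLVED, set()  # no participant mappings at all
        bases, dead_end, pending = set(), False, False
        for target in refs[name]:
            if target not in refs:            # non-referential: a base attribute
                bases.add(target)
            elif target in result:            # previously resolved attribute
                status, target_bases = result[target]
                bases |= target_bases
                dead_end |= status in (UNRESOLVED, PARTIALLY_RESOLVED)
            elif not ignore_unresolved:       # still unresolved: wait
                pending = True
            # otherwise: treated as a circular reference and ignored
        if len(bases) > 1:
            return MULTIPLE_BASE, bases       # error detectable early
        if pending:
            return None                       # skip for now
        if len(bases) == 1:
            return (PARTIALLY_RESOLVED if dead_end else FULLY_RESOLVED), bases
        if dead_end:
            return UNRESOLVED, set()
        return None                           # nothing but ignored references

    def loop(ignore_unresolved):
        progress = True
        while progress:                       # stop after an iteration with no progress
            progress = False
            for name in sorted(unresolved):
                outcome = try_resolve(name, ignore_unresolved)
                if outcome is not None:
                    result[name] = outcome
                    unresolved.discard(name)
                    progress = True

    loop(False)                               # step 2
    loop(True)                                # step 3
    for name in unresolved:                   # step 4
        result[name] = (ONLY_CIRCULAR, set())
    return {name: status for name, (status, _) in result.items()}
```

Fed the mappings from the example discussed earlier (A1 and B1 and C1 with no mappings, A2 referencing B1, A3 referencing B2 and C1, A4 referencing D1), the sketch gives A1 and A2 as Unresolved, A3 as Partially Resolved and A4 as Fully Resolved.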

The data type of a referential attribute is determined from the resolved base attribute. However, a manual data type can be specified for a referential attribute for consistency checking purposes. An error condition will be flagged if the resolved base attribute's data type doesn't match the manual data type.

The conditionality of the referential attribute is determined from the resolved base attribute's conditionality along with the navigation conditionality across all the paths to the resolved base attribute. This can also be weakened by specifying a manual conditionality for the referential attribute. Further discussion of attribute conditionality will be left for another day.

Monday, 26 May 2008

Polymorphic Attributes

Polymorphic attributes are new to OOA08 and have only really become useful since OOA96 introduced mathematically dependent (i.e. derived) attributes. Previously, if a common non-identifying attribute was defined in most (not necessarily all) subtypes of a subtype-supertype relationship, it would be abstracted into the supertype object. This would only cause a problem if the attribute was referential in any of the subtype objects, in which case the analyst would have to leave the referential attributes where they were and decide whether having a copy of the attribute in the supertype object was still beneficial. OOA96 introduced the concept of a constrained referential attribute and this could be used to constrain any referential attributes left in subtypes after a common attribute is abstracted into a supertype object. This situation does not happen often in my experience. However, with the introduction of mathematically dependent attributes, abstracting common attributes of mixed type becomes much more difficult. How do you abstract a common attribute which is a simple attribute in one subtype and a mathematically dependent attribute in another subtype? The answer is a polymorphic attribute.

Polymorphic attributes are similar to polymorphic events (which were also introduced in OOA96) in that polymorphic attributes in a supertype object are mapped to true attributes in subtype objects. An analyst associates a subtype-supertype relationship with a polymorphic attribute and then selects a subtype attribute in each of the relationship's subtype objects. Subtype attributes may themselves be polymorphic attributes. When we refer to true attributes we mean non-polymorphic attributes with any intermediate polymorphic attributes flattened out. Obviously, the object containing the polymorphic attribute must be the relationship's supertype object. It is rare for there to be more than one choice of subtype-supertype relationship, i.e. multiple supertype objects are rare.

Polymorphic attributes and their corresponding true attributes do not have to have the same name but they must have the same data type. However, OOA Tool can pre-select subtype attributes that have the same name and data type. This makes the creation of a polymorphic attribute a 2 second job in most cases. The conditionality of a polymorphic attribute is the weakest conditionality of any of its true attributes, i.e. the polymorphic attribute is conditional (1c) if any true attribute is conditional. Furthermore, an analyst can make a polymorphic attribute conditional even if none of its true attributes are conditional. One could argue that the data type of a polymorphic attribute should be the weakest data type of any of its true attributes. However, to keep things simple, the data types of all true attributes must match in OOA08.
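The data type and conditionality rules can be written down as a small Python sketch. The function name and the representation of true attributes as (data type, is conditional) pairs are assumptions made for the example and are not taken from OOA Tool.

```python
def polymorphic_attribute(true_attributes, manually_conditional=False):
    """Derive (data type, is conditional) for a polymorphic attribute.

    true_attributes is a list of (data_type, is_conditional) pairs, one per
    true attribute; this shape is hypothetical and chosen for illustration.
    """
    data_types = {data_type for data_type, _ in true_attributes}
    if len(data_types) != 1:
        # All true attributes must share exactly one data type.
        raise ValueError("true attributes must all have the same data type")
    # Weakest conditionality wins; the analyst may also weaken it manually.
    conditional = manually_conditional or any(c for _, c in true_attributes)
    return data_types.pop(), conditional
```

For example, true attributes of ("Integer", False) and ("Integer", True) yield a conditional Integer polymorphic attribute, while mixing Integer and String true attributes is rejected.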

The example above shows a typical polymorphic attribute with a true attribute defined in each of the subtype objects. The attribute suffix of a polymorphic attribute begins with 'P' and includes the associated subtype-supertype relationship ID, e.g. (P-R1). Using the relationship ID is important since the associated relationship may be defined in another subsystem and thus may not be shown on a particular diagram. The attribute suffix (P) is used if there is no associated relationship at present. In the example above the true attributes are simple, mathematically dependent and referential.

Polymorphic Attribute above could be replaced by Other Attribute if the associated calculation code performed a simple switch on the supertype's current subtype returning the appropriate true attribute value. This would be unthinkable in an object-oriented programming language. However, in Shlaer-Mellor OOA, a subtype-supertype relationship is complete in the sense that all subtypes are identified by the analyst in the information model. Therefore, the switch code for Other Attribute would also be complete. OOA Tool will inform an analyst if a subtype switch statement is no longer complete, e.g. when a new subtype object is added. The disadvantage of using this approach is that the polymorphic relationship is hidden in Other Attribute's calculation code.

The diagram above shows how polymorphic attributes can be used to create interface like objects without having to create additional redundant relationships or use arbitrary IDs. Main ID and Parent ID in Interface would normally have to be referential attributes but that would require some form of relationship to Parent to resolve the referential attributes (OOA08 requires all referential attributes to resolve to a single base attribute). Alternatively, an arbitrary ID could be added to Interface and then used in Child B and Child C. Making these attributes polymorphic means that they can resolve downwards in R3 while the matching attributes in Child B and Child C reference upwards to formalize R3. This works because the matching attributes in Child B and Child C have an alternative path to their base attributes in Parent and Main. How referential attributes are resolved to base attributes and why they should all resolve to a single base attribute will be discussed another day.

Friday, 23 May 2008

Flattening Subtype-Supertype Hierarchies

One of the first tasks when mapping (or translating) Shlaer-Mellor information models to XML documents or Java code is to flatten subtype-supertype hierarchies. Obviously, XML does not support any kind of generalization. Java only allows single inheritance, and this facility may be required to implement the objects within the Java software architecture.

The diagram above shows leaf objects with single and multiple supertypes. The definition of a leaf object is any object which doesn't participate in a subtype-supertype relationship as a supertype object. This example is easily flattened into:

  • Leaf A containing Leaf A and Supertype A
  • Leaf B containing Leaf B, Supertype A and Supertype B
  • Leaf C containing Leaf C and Supertype B

The diagram above shows an example of an improper subtype-supertype relationship. Leaf B is a subtype of Supertype B and Supertype C. However, the subtype-supertype relationship R1 specifies that all Supertype A objects are either Supertype B or Supertype C but not both. Executable UML (the UML version of Shlaer-Mellor) explicitly labels these relationships as disjoint to emphasize the point. C++ allows multiple inheritance with common base classes but Shlaer-Mellor OOA does not.

The other form of improper subtype-supertype relationship is where a subtype ends up being a supertype of itself via a circular chain of subtype-supertype navigations.

The diagram above shows an example of compound subtype-supertype relationships where Supertype is a Leaf A or Leaf B and a Leaf C or Leaf D. To flatten this example requires the introduction of new compound leaf objects:

  • Leaf A/C containing Leaf A, Supertype and Leaf C
  • Leaf A/D containing Leaf A, Supertype and Leaf D
  • Leaf B/C containing Leaf B, Supertype and Leaf C
  • Leaf B/D containing Leaf B, Supertype and Leaf D

Compound subtypes may participate in other subtype-supertype relationships either as supertypes or subtypes leading to complex hierarchies. However, as long as there are no improper subtype-supertype relationships then these hierarchies can always be flattened into leaf and compound leaf objects.

The algorithm for flattening subtype-supertype hierarchies gave me a few headaches before I settled on a two pass strategy:

  1. The first pass iterates over all leaf objects (non-supertype objects) creating flat elements each containing a set of component objects (as shown in the examples above). This pass uses limited recursion when navigating to supertypes and can easily check for improper subtype-supertype relationships by checking whether an object has already been added to a flat element's set of component objects prior to adding a new object.
  2. The second pass iterates over all multiple supertype objects (non-leaf objects which are supertypes in multiple subtype-supertype relationships). It then matches each subtype object to all flat elements whose set of component objects contain the subtype object. The set of flat elements for the subtype objects in each of the subtype-supertype relationships are then multiplied out by creating new flat elements containing the combined set of component objects (ignoring duplicates). The new flat elements replace the previous flat elements before the second pass iteration continues. This ensures multiple compound subtype-supertype relationships are correctly flattened without the need for recursion in this pass.
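A minimal Python sketch of the two passes is given below. It is an interpretation of the description above rather than the actual implementation: the flatten function, the relationships structure (relationship ID mapped to a supertype and its list of subtypes) and the use of frozensets for flat elements are assumptions made for the example. Objects that take part in no subtype-supertype relationship are not handled, and a purely circular chain has no leaf to start from and so goes undetected here.

```python
from itertools import product


def flatten(relationships):
    """relationships: dict of relationship ID -> (supertype, [subtypes])."""
    supertypes = {sup for sup, _ in relationships.values()}
    objects = supertypes | {s for _, subs in relationships.values() for s in subs}
    parents = {}  # object -> list of its direct supertypes
    for sup, subs in relationships.values():
        for sub in subs:
            parents.setdefault(sub, []).append(sup)

    def add_with_supertypes(obj, components):
        # Reaching the same object twice signals an improper relationship.
        if obj in components:
            raise ValueError(f"improper subtype-supertype relationship at {obj}")
        components.add(obj)
        for sup in parents.get(obj, []):
            add_with_supertypes(sup, components)

    # Pass 1: one flat element per leaf object (any non-supertype object).
    flats = set()
    for leaf in sorted(objects - supertypes):
        components = set()
        add_with_supertypes(leaf, components)
        flats.add(frozenset(components))

    # Pass 2: multiply out supertypes that have several relationships.
    groups_by_supertype = {}
    for sup, subs in relationships.values():
        groups_by_supertype.setdefault(sup, []).append(subs)
    for sup, groups in groups_by_supertype.items():
        if len(groups) < 2:
            continue
        # For each relationship, the flat elements containing one of its subtypes.
        matched = [[f for f in flats if any(s in f for s in subs)] for subs in groups]
        combined = {frozenset().union(*combo) for combo in product(*matched)}
        flats -= {f for group in matched for f in group}
        flats |= combined  # new flat elements replace the previous ones
    return flats
```

Applied to the compound example above (Supertype with one relationship to Leaf A and Leaf B and another to Leaf C and Leaf D), the sketch produces the four compound leaf objects A/C, A/D, B/C and B/D, each also containing Supertype.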

All flat elements contain at least one leaf object. A flat element containing multiple leaf objects represents a compound leaf object. A suitable name may need to be chosen for compound leaf objects since they aren't explicitly defined in Shlaer-Mellor information models.

Flat elements can be easily mapped to XML elements as long as a suitable name for compound leaf objects can be determined. The XML attributes for the element are determined from the combined attributes of the flat element's set of component objects. All simple attributes except Arbitrary ID or Ordinal ID attributes are required. Mathematically dependent attributes may be useful but are not absolutely required. All referential attributes used solely to formalize subtype-supertype relationships can be ignored. Referential attributes used to formalize containment relationships can also be ignored. Only referential attributes used to formalize referential relationships need to be considered. All polymorphic attributes can be ignored since the true attributes must also be present. Attribute names may need to be qualified when they are mapped to XML attribute names. I will leave the discussion of containment and referential relationships within XML documents to another day.

Monday, 19 May 2008

OOA Interchange Format

The OOA Interchange Format allows Shlaer-Mellor OOA models to be saved and exchanged between tools. Models are saved as XML based ".ooa" files conforming to the OOA Interchange Format. This allows models to be opened and manipulated using ordinary XML tools. The format is really a mapping from the current OOA of OOA to XML.

At present (Build 008), each BETA release of OOA Tool introduces some changes to the format which break previously saved models, primarily because each new build defines an expanded OOA of OOA. I have to manually fix the models that are distributed with each release. However, any users who have been downloading the tool and creating models must be getting annoyed when they break after each new BETA release. Unfortunately, there are too many unresolved issues for the format to be finalized yet beyond the fact that it is an XML format. However, what I have decided to do from Build 009 is to track all changes to the format so that users can fix their own models fairly easily. Once OOA Tool 1.0 Final is released I will support old versions of the format as new versions are released.

With this goal in mind, my first task was to ensure the DTD which formalizes the format is up-to-date and valid. I renamed the DTD as "OOAInterchangeFormat0.01.dtd" and updated it so that ".ooa" files can now be validated against the DTD. I then added some preferences to OOA Tool so that users can select whether to embed the DTD, reference the version on the website or reference the version installed with OOA Tool. I also added some options to add default values and/or add mathematically dependent attributes (which are not normally saved). The format will also now be versioned independently of the tool. Version 0.01 is the first tracked version supplied with Build 009.

After this, I have become a little sidetracked. The first thing that sidetracked me was the pretty print logic in OOA Tool which allows patterns, archetypes and actions to be pretty printed as colour-coded HTML files so that they can be viewed on the web. The logic relies on lexical rules in a ".pattern" file to identify tokens which are then colour-coded. First, I separated the logic out into a separate command-line application that will be released with Build 009. I then added the ability to pretty print XML and DTD files. This required an "ExtensibleMarkupLanguage.pattern". I went directly to the latest Extensible Markup Language (XML) 1.0 (Fourth Edition) specification and converted the syntax. This wasn't as easy as it looks because the XML specification does not separate lexical and syntax rules. In fact, to identify all the tokens defined in the specification my lexical rules have to perform some additional syntax parsing. I also identified and fixed a couple of bugs in my Pattern Language library. The first involved using integer literals as character patterns. The second involved referencing token rules from within other token rules. I am pretty happy with the end result:

Syntax and Lexical Rules for Extensible Markup Language

The next thing that has sidetracked me is the mapping from the OOA of OOA to XML. Why can't I automatically map the OOA of OOA to XML using an XML service domain? Obviously, I can't do it at present since OOA Tool is far from complete. However, I have tried to perform the required logic in a Java program. The first problem to solve is mapping object hierarchies to element hierarchies which isn't as easy as it sounds since OOA allows an object to have multiple supertypes and allows supertype objects to define multiple subtype-supertype relationships. The next problem is that we need to identify containment and reference relationships between XML elements. I have not solved all of these problems yet. However, the DTD generated so far has identified discrepancies with the current hand-coded DTD which are best fixed immediately, e.g. the current DTD only defines a single Attribute element, while the OOA of OOA defines multiple Attribute objects which should have their own elements.

Attempting to synchronize "OOAInterchangeFormat0.01.dtd" with an automatically generated version is the main thing holding up Build 009 now. I'm also trying to hold off from modelling the Extensible Markup Language service domain that I have added to the OOA Tool model. I can do that later when I am in a better position to implement it. This week's main goal is to finish messing with the format and release Build 009. That will allow me to return to the paper I am attempting to write on Recursive Design.