Wednesday 30 June 2010

Week 25 of 2010

This week's report is a little late since I thought I might finish some modelling first. The only problem with that is that modelling often leads to more modelling! I've been working on the Recursive Design and Simulation subsystems of the OOA of OOA. In fact, I've now replaced the old Simulation and Translation subsystems with new Population, Population Runtime and Colour Model subsystems. Most of the old Simulation subsystem was actually concerned with population data anyway. I've moved the few objects directly relating to simulation and translation into the Recursive Design subsystem which is a good home for them anyway. The new Recursive Design subsystem is shown below:

You can now see that recursive design involves partitioning a system into domains, bridging between domains, adding layers to those domains if appropriate (new in OOA10), populating those domains, simulating the system for testing and translating the system into the final software product. Recursive design also involves colouring arbitrary aspects of the model and/or data to control aspects of simulation/translation such as population boundaries and thread allocations. I'm providing support for basic thread pools during simulation to help facilitate successful reuse, i.e. users should always test concurrent models using multiple thread simulations. There are many other architectural concerns that are not supported during simulation. Those concerns can only be tested by running a translated version of the system using an appropriate software architecture. I may need to add support for additional architectural concerns in future, e.g. concurrent access controls. Any comments here?

Colour models and Colours in Shlaer-Mellor are known as Marking Models and Marks in Executable UML. This is a neglected area of Shlaer-Mellor methodology and not well defined in MDA either. However, the need for those ideas became evident as I started to think how I would capture thread allocations for simulation purposes and population boundaries where I want to include a subset of a given population in a simulation/translation (a good example here is metamodel populations which rarely need to be used entirely). I'm still working on the Colour Model subsystem so I won't show it here. I have been thinking how I can incorporate colour models into OOA Tool but I'll leave that topic to another day also. One interesting issue with colours is how you access that information within archetype templates which traditionally only access object, attribute and relationship instance data.

Back to populations. The new Population subsystem is shown below:

I've made quite a few changes here. Populations are now either Input Populations which are project assets in their own right (i.e. can be imported into other projects), Executable Populations used with simulations or Translatable Populations used with translations. Input populations are Simple Populations (i.e. user data), Metamodel Populations, Composed Populations allowing a set of input populations to be merged into one, and Output Populations which are derived from the output state of a completed simulation run (e.g. a software architecture may code an optimizer as a model that can be simulated to generate optimization flags that can then be passed as input into a translation). Composed populations are a powerful concept when combined with output populations, allowing multi-stage build processes to be defined.
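
As a rough illustration of the multi-stage idea, here is a minimal Python sketch in which a composed population merges a simple population with an output population. Every class, field and function name is hypothetical (none is taken from OOA Tool), instance data is reduced to plain dictionaries, and the "simulation" is a stand-in function.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SimplePopulation:
    """User-supplied instance data: object name -> list of instance dicts."""
    name: str
    instances: dict = field(default_factory=dict)

    def resolve(self):
        return {obj: list(rows) for obj, rows in self.instances.items()}


@dataclass
class OutputPopulation:
    """Derived from the output state of a completed simulation run."""
    name: str
    simulate: Callable[[], dict]  # hypothetical stand-in for a simulation run

    def resolve(self):
        return self.simulate()


@dataclass
class ComposedPopulation:
    """Merges a set of input populations into one."""
    name: str
    parts: list

    def resolve(self):
        merged = {}
        for part in self.parts:
            for obj, rows in part.resolve().items():
                merged.setdefault(obj, []).extend(rows)
        return merged


def run_optimizer():
    # Stand-in for simulating an optimizer model and reading its final state.
    return {"Optimization Flag": [{"name": "inline_accessors", "value": True}]}


user_data = SimplePopulation("user data", {"Account": [{"id": 1}, {"id": 2}]})
flags = OutputPopulation("optimizer output", run_optimizer)
build_input = ComposedPopulation("translation input", [user_data, flags])
resolved = build_input.resolve()
```

Here `resolved` holds both the user's Account instances and the flags produced by the earlier stage, which is the shape of input a later translation stage would consume.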

An interesting issue with translation is what do you do about derived attributes and relationships? Do you use an active population during translation so that derived attributes and relationships are still determined on demand or do you pre-calculate all such information before you start the translation? I've taken the view that pre-calculation is best so that translation will for the most part be a deterministic process, i.e. if you recalculate on demand then the results may change if the calculation logic is non-deterministic (e.g. involves current time). However, this means that I can't set any attributes during translation (which is probably a bad idea anyway). In the above model, a translatable population is an input population with all derived data pre-calculated.
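
The difference between the two options can be sketched in a few lines of Python. This is only an illustration under assumed names (`ActiveInstance`, `precalculate`), with a counter standing in for a non-deterministic calculation such as one involving the current time.

```python
import itertools
import types


class ActiveInstance:
    """Derived attributes are recalculated every time they are read."""

    def __init__(self, base, derived):
        self.base = dict(base)
        self.derived = dict(derived)  # attribute name -> function of base values

    def get(self, name):
        if name in self.derived:
            return self.derived[name](self.base)
        return self.base[name]


def precalculate(instance):
    """Calculate every derived attribute once and return read-only data."""
    values = dict(instance.base)
    for name, calc in instance.derived.items():
        values[name] = calc(instance.base)
    return types.MappingProxyType(values)


ticks = itertools.count()  # stand-in for a clock: changes on every read
order = ActiveInstance(
    base={"quantity": 3, "unit_price": 5},
    derived={
        "total": lambda b: b["quantity"] * b["unit_price"],
        "stamp": lambda b: next(ticks),
    },
)

on_demand = (order.get("stamp"), order.get("stamp"))  # two reads disagree
frozen = precalculate(order)
repeated = (frozen["stamp"], frozen["stamp"])         # two reads agree
```

The read-only mapping also mirrors the consequence noted above: attributes in the pre-calculated data cannot be set while it is being read.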

Population data and executable population data are shown in the model above. However, runtime population data could not also be shown without considerable mess. Thus, I've put all supplemental runtime population data into a new Population Runtime subsystem. This was the work I was trying to finish before I did this report but I haven't quite finished it yet so I won't include it here yet. An additional requirement here is that sufficient data be gathered so that an automatically generated Thread of Control Chart (or Sequence Diagram) can be associated with each simulation run.

So I plan on finishing the Population Runtime and Colour Model subsystems now. I'm considering whether it's a good idea to add support for what I've done here to OOA Tool immediately without waiting for OOA Tool 2 to be completed. It would allow me to validate some of the ideas. The problem is that without a working Action Language and Archetype Language, it doesn't give us much.

Monday 21 June 2010

Week 24 of 2010

I worked on the Simulation subsystem of the OOA of OOA most of the week since OOA Tool 2 needs an OOA virtual machine immediately (all derived calculations will be coded using Action Language code rather than Java code). The bulk of the Simulation subsystem focuses on data populations and the instance data associated with different types of population.

Simple populations define Object Instances, Base Attribute Instances (simple, arbitrary ID and mathematically dependent attributes), Relationship Instances (simple, associative and subtype-supertype relationships) and Participant Instances. They also define Counterpart Mapping Instances allowing object instances in one domain to be converted to counterpart object instances in another domain within bridging code. The only population dependent values that can be used in simple populations are object instance values. Any attribute instances with a population dependent type that isn't an object instance type will have an undefined value.

Arbitrary ID attributes have a current value representing an allocated arbitrary or ordinal ID. The system automatically allocates or reallocates IDs as and when needed within limits controlled by the analyst, e.g. ordinal IDs reflect an ordering which must be preserved. Within limits, these IDs may be used outside of the system. For both of these reasons, arbitrary ID attributes require instance data. I really must finish writing up the technical note on arbitrary IDs in OOA10!

Mathematically dependent attributes can be calculated in many different ways. The simplest is on demand which doesn't require instance data. However, a minimum duration between recalculations can also be used by the analyst to restrict how often resource intensive calculations are performed. Presentation data rarely needs to be calculated more than once per second. We may also want to write calculation code which uses previous values to optimize the current iteration (verifying that the previous value is still current may be very fast and worth the effort). These kinds of analysis optimization are difficult to automate at present but should be used sparingly. None of these reasons require a current value to be persisted long-term. However, they do present a need for instance data at some point.
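
A minimal Python sketch of the minimum-duration idea follows. The class name, the injectable clock and the convention of passing the previous value to the calculation are all assumptions made for illustration; they are not how OOA Tool implements it.

```python
import time


class ThrottledAttribute:
    """Recalculates a derived value at most once per `min_interval` seconds."""

    def __init__(self, calculate, min_interval, clock=time.monotonic):
        self._calculate = calculate      # receives the previous value (or None)
        self._min_interval = min_interval
        self._clock = clock
        self._value = None               # instance data: last calculated value
        self._stamp = None               # instance data: time of last calculation

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._min_interval:
            self._value = self._calculate(self._value)
            self._stamp = now
        return self._value


# Demonstration with a fake clock so the behaviour is repeatable.
fake_now = [0.0]
calls = []


def expensive_total(previous):
    calls.append(previous)   # record what the calculation was handed
    return len(calls)        # stand-in for a resource intensive result


attr = ThrottledAttribute(expensive_total, 1.0, clock=lambda: fake_now[0])
first = attr.get()           # calculated
fake_now[0] = 0.5
second = attr.get()          # served from instance data, too soon to recalculate
fake_now[0] = 1.5
third = attr.get()           # recalculated, previous value passed in
```

The cached value and timestamp are exactly the kind of short-lived instance data the paragraph above describes: needed while the simulation runs, but not worth persisting long-term.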

Referential and/or polymorphic attributes have no instance data since they are calculated on demand which generally isn't often since their main purpose in life is to ensure the model is formalized. Composed and mathematically dependent relationships also have no instance data since they are also calculated on demand.

Executable populations extend simple populations with Event Destination Instances, Event Instances, Event Data Item Instances, Operation Instances and Input Parameter Instances. They also define Threads and Thread Allocations allowing concurrent processing resources to be allocated. The population dependent values that can be used in executable populations are object, event or operation instance values. Return coordinate and transfer vector values can't be used.

Simulations are started by generating an event or invoking an operation. An executable population allows us to define a set of triggers which can be run. It also allows us to define a thread allocation policy which is the minimum information required to execute a realistic concurrent simulation. A default mapping to multiple threads for concurrent simulation purposes is not useful in my view. If you aren't interested in defining a thread allocation policy then use a single thread. I should point out that some systems may require a minimum number of threads to avoid deadlock, e.g. invoking a synchronous service across a bridge will generally block a thread.

Runtime populations extend executable populations with Generated Event Instances (specifying whether delayed and/or recurring), Invoked Operation Instances, Output Parameter Instances and Return Communication Instances (allowing us to track return coordinate and transfer vector returns).

Generated event instances are queued on threads as determined by the thread allocation policy and are consumed as determined by OOA policy. Delayed events block the stream of events between a source/target pair of event destinations (key OOA policy is to ensure sequential processing of events between a source/target pair). Recurring events are re-queued after they are consumed allowing heartbeats to be implemented. However, re-queued events may not stop a simulation from completing.
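
The queueing rules above can be sketched for a single thread as follows. This is a simplified Python illustration, not the OOA Tool design: the `Event` and `EventQueue` names, the numeric simulation time and the "oldest ready stream first" tie-break are all assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    label: str
    source: str
    target: str
    due: float = 0.0                 # earliest time the event may be consumed
    period: Optional[float] = None   # set for recurring events


class EventQueue:
    """One FIFO stream per source/target pair; only stream heads are eligible."""

    def __init__(self):
        self._streams = {}   # (source, target) -> deque of (sequence, event)
        self._seq = 0

    def generate(self, event):
        key = (event.source, event.target)
        self._streams.setdefault(key, deque()).append((self._seq, event))
        self._seq += 1

    def consume(self, now):
        """Return the next consumable event at time `now`, or None."""
        ready = [
            (stream[0][0], key)
            for key, stream in self._streams.items()
            if stream and stream[0][1].due <= now
        ]
        if not ready:
            return None
        _, key = min(ready)  # oldest eligible head wins
        _, event = self._streams[key].popleft()
        if event.period is not None:
            # Recurring events are re-queued after they are consumed.
            self.generate(Event(event.label, event.source, event.target,
                                now + event.period, event.period))
        return event


queue = EventQueue()
queue.generate(Event("E1 (delayed)", "A", "B", due=5.0))
queue.generate(Event("E2", "A", "B"))   # stuck behind E1 in the A->B stream
queue.generate(Event("E3", "C", "B"))   # different pair, not blocked
consumed = [queue.consume(t) for t in (0.0, 0.0, 5.0, 5.0)]
labels = [e.label if e else None for e in consumed]

heartbeat = EventQueue()
heartbeat.generate(Event("tick", "Timer", "Timer", due=1.0, period=1.0))
beats = [heartbeat.consume(t) is not None for t in (1.0, 1.0, 2.0)]
```

In the first queue, E2 cannot overtake the delayed E1 because they share a source/target pair, while E3 is consumed immediately. The sketch does not address when a simulation counts as complete in the presence of re-queued heartbeats.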

Invoked operation instances are logically pushed on to threads as invoked and pulled off when invocation completes. However, these instances may persist for various reasons, e.g. logging. A thread will also block after invoking a bridging operation mapped to a synchronous service awaiting a synchronous return operation with the correct return coordinate. A thread can also block for other reasons, e.g. awaiting input/output.

Return communication instances allow us to track how timely and often returns occur. Return coordinates must have a single timely return since a thread is waiting for the results. If a timely return does not occur then the system may generate a default return and any later return would then be flagged as an error. Transfer vectors are more flexible as determined by the analyst. They may be conditional or have a multiplicity of many. However, if not conditional then a default initial return may be generated if a time constraint has been specified. Whether any later return is flagged as an error would then depend on whether multiple returns are allowed.
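
For return coordinates, the tracking rule could look something like the Python sketch below. It covers only the single-timely-return case (not transfer vectors), and the class name, the numeric timestamps and the error message format are all hypothetical.

```python
class ReturnCoordinateTracker:
    """Tracks the single timely return expected for one return coordinate."""

    def __init__(self, issued_at, timeout, default):
        self.deadline = issued_at + timeout
        self.default = default
        self.resolved = False
        self.value = None
        self.errors = []

    def check_timeout(self, now):
        # If no timely return occurred, generate the default return.
        if not self.resolved and now >= self.deadline:
            self.resolved = True
            self.value = self.default

    def on_return(self, now, value):
        self.check_timeout(now)
        if self.resolved:
            # A return after the default (or a second return) is an error.
            self.errors.append(f"unexpected return {value!r} at t={now}")
        else:
            self.resolved = True
            self.value = value


timely = ReturnCoordinateTracker(issued_at=0, timeout=2, default="default")
timely.on_return(1, "ok")

late = ReturnCoordinateTracker(issued_at=0, timeout=2, default="default")
late.check_timeout(2)        # waiting thread gives up and takes the default
late.on_return(3, "ok")      # the real return arrives too late
```

A transfer vector tracker would differ mainly in the last branch: whether a later return is an error would depend on the multiplicity the analyst allowed.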

Once I finish tidying up the Simulation subsystem then I will update the OIM on the website. I also need to ensure I have a clean workable design for representing runtime models in OOA Tool 2 based on this work. I also want to be able to generate Thread of Control Charts (and Sequence Diagrams for Executable UML) from runtime populations on completion.

Tuesday 15 June 2010

Week 23 of 2010

This week was very short since I travelled up to Nottingham to attend a family 50th celebration. I also spent some time with my much neglected relatives!

My current development focus is on the new OOA Tool 2 framework. Two subsystems of that are Dictionary and Repository. The dictionary subsystem is a compact metamodel suitable for representing the OOA of OOA. It adds support for multiple notations and Java locales (language and country). It also adds support for generic GUI presentation and validation which will evolve as the needs of the generic GUI expand. I've implemented a DictionaryTool which can convert any domain (loaded from a "<project>.ooa" XML file) into a dictionary (stored as a "<domain>.dictionary" XML file). The core dictionary files are generated from the "OOATool.ooa" model which continues to evolve (using the current version of OOA Tool).

The repository subsystem defines rows (objects/attributes are represented as tables/columns in the dictionary), transactions and location nodes. Rows not only represent object instances, they also represent relationship instances as column data since they are resolved to object references as required. Rows also represent event destination and event data since simulation objects are mapped to special tables in OOA Tool 2. This allows me to treat all data changes in a uniform way for transaction purposes. The alternative would require special undo/redo handling for every new type of data change. I've just started reworking the Simulation subsystem of the OOA of OOA so that all required special tables for simulation can be identified and implemented. Execution (rather than modelling) is the priority for OOA Tool 2. Repository data will also be stored as XML files but that work hasn't been done yet. I also need to implement a RepositoryTool which can convert any OOA Tool "<project>.ooa" XML file into a single "<project>.ooa2" XML file. The generated repository file will reference the previously generated core dictionary files which will be included in future OOA Tool 2 releases. I'll leave discussion of location nodes to another day.
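
To show why uniform rows make undo/redo simpler, here is a small Python sketch. It is not the OOA Tool 2 repository: the `Repository` class, its method names and the table names are hypothetical, and persistence to XML is ignored entirely.

```python
_MISSING = object()  # marks "column had no value before the change"


class Repository:
    """Every change is a (table, row, column, old, new) record in a transaction."""

    def __init__(self):
        self.tables = {}     # table name -> {row id -> {column -> value}}
        self._undo = []
        self._redo = []
        self._current = None

    def begin(self):
        self._current = []

    def set(self, table, row_id, column, value):
        row = self.tables.setdefault(table, {}).setdefault(row_id, {})
        old = row.get(column, _MISSING)
        row[column] = value
        self._current.append((table, row_id, column, old, value))

    def commit(self):
        self._undo.append(self._current)
        self._current = None
        self._redo.clear()

    def _apply(self, table, row_id, column, value):
        row = self.tables[table][row_id]
        if value is _MISSING:
            row.pop(column, None)
        else:
            row[column] = value

    def undo(self):
        changes = self._undo.pop()
        for table, row_id, column, old, _ in reversed(changes):
            self._apply(table, row_id, column, old)
        self._redo.append(changes)

    def redo(self):
        changes = self._redo.pop()
        for table, row_id, column, _, new in changes:
            self._apply(table, row_id, column, new)
        self._undo.append(changes)


repo = Repository()
repo.begin()
repo.set("Account", 1, "Balance", 100)
repo.commit()

repo.begin()
repo.set("Account", 1, "Balance", 50)          # ordinary attribute data
repo.set("Account", 1, "R1", 9)                # relationship held as column data
repo.set("Event Instance", 7, "Target", 1)     # simulation data in a special table
repo.commit()

repo.undo()
after_undo = (dict(repo.tables["Account"][1]), dict(repo.tables["Event Instance"][7]))
repo.redo()
after_redo = (dict(repo.tables["Account"][1]), dict(repo.tables["Event Instance"][7]))
```

Because attribute, relationship and event changes are all recorded the same way, one undo reverts all three without any special handling per kind of data.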

I'll cut it short this week so I can start some real work! I just hope England does better come Friday!!

Tuesday 8 June 2010

Week 15/16/17/18/19/20/21/22 of 2010

Sorry for the silence here but I have been adjusting to single life again and it's been more difficult than anticipated.

Yesterday I finally released Build 014 of OOA Tool. However, I abandoned attempts to finish all of the features that I had started. The main reason for the release was to make available the version of the tool that I am currently using to develop the OOA Tool 2 platform. Now if I have an idea that I want to blog about that uses any of the features that were implemented in that codebase then I can go ahead and blog about it, knowing that users can replicate any models that I show. One example is reverse parent-child relationships which I blogged about a few days ago. Whether I release any further builds based on this codebase will depend on whether I discover any serious bugs before the new platform is ready. If anyone has a problem loading their existing models using this build then they can email me their ".ooa" file and I will fix any format problems they have.

Last week, I published a new technical note for the first time in ages regarding reverse parent-child relationships. I used this note as a vehicle for testing out some new JavaScript and CSS rules allowing notation to be switched using a control that persists as a cookie across web sessions. Executable UML users can set this once and reduce the amount of Shlaer-Mellor speak they have to wade through. Shlaer-Mellor users will also be able to reduce the amount of UML stuff they have to ignore. Now it will take time to roll out notation switching across the whole website but that is my ultimate objective. Note that the notation switching JavaScript code only works with the latest browsers, e.g. Firefox 3 or IE 8. I'm not going to bother trying to support old browsers.

Friday 4 June 2010

Formalizing Reverse Parent-Child Relationships

A notation switchable version of the following technical note can be found here: Technical Note - Formalizing Reverse Parent-Child Relationships.

Parent-child relationships are binary relationships where one participant has some form of ownership over the other. They often appear on information models as defines or contains relationships and on UML models as compositions or aggregations. The parent will normally have a longer lifespan compared to the child and will often be one-to-many but not always. In OOA10, a parent-child relationship can be explicitly formalized by adding an Arbitrary ID Attribute to the child and then using a Parent Allocated ID Type to formalize the attribute. The relationship can be ordered by specifying the type as ordinal. Ordinal ID base attributes are normally called Order in my models while non-ordinal arbitrary ID base attributes are simply called Arbitrary ID. In either case, OOA Tool automatically annotates the attribute with an "A" suffix (short for arbitrary) followed by the relationship ID of the parent-child relationship (see diagram below for examples).

All of the above is fairly straightforward. However, there is a complication when creating reverse or secondary parent-child relationships where the child needs to reference its own parent to formalize a second relationship between the parent and child or where the parent needs to reference a specific child for special treatment. For simplicity I will refer to both types of relationship as reverse relationships from now on. Reverse relationships will normally be one-to-one since parent-child relationships are normally one-to-many. This technical note aims to discuss some issues with formalizing such relationships. A simple example involves the relationships between Object and Identifier in the OOA of OOA. There are two obvious relationships here, the parent-child relationship defines and the reverse relationship is preferred by. The diagram below illustrates four different ways of formalizing the reverse relationship. To simplify matters, the four approaches were all defined in the same domain so I have had to use object name qualifiers, e.g. Object A and Identifier A for approach A, etc.

Approach A is the obvious approach in OOA91. The parent-child relationship R1 is formalized using Identifier A.Object against the preferred identifier of Object A with Object A.Name as the base attribute. In OOA91, we could formalize the reverse relationship R2 using Object A.Name and Object A.Preferred Identifier since we can still manually specify Name as a base attribute. OOA10 takes this option out of the hands of the modeller to greatly tighten the rigor of information models, i.e. attributes are only base attributes when they are not referential or polymorphic attributes. However, OOA Tool still allows this approach to be captured but both Object A.Name and Identifier A.Object are annotated in the OOA Tool browser with the warning "Only circular references" indicating that no base attribute could be found.

Approach B replaces Object A.Preferred Identifier with the referential attribute Identifier B.Preferred mapped to Object B.Name. This looks a little weird when you first see it but one must remember that referential attributes are not implementation fields, only a modelling mechanism to formalize relationships. Approach B has the same relationships as approach A but we have moved the formalization from the parent to the child, avoiding the problem with base attributes. However, in doing so we have weakened the formalization since Identifier B could potentially be the preferred identifier of another object entirely which we certainly don't want. In OOA10, we can add a loop constraint to the reverse relationship R4 to ensure this doesn't happen. This approach works but may confuse some.
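
What the loop constraint rules out can be shown with a small Python check. This is only a sketch of the condition, not OOA Tool behaviour: the record layout and function name are hypothetical, and labelling the parent-child relationship in approach B as R3 is an assumption based on the numbering of the other relationships.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentifierB:
    object_name: str           # referential attribute Object (assumed R3, defines)
    identifier_id: int
    preferred: Optional[str]   # referential attribute Preferred (R4), may be unset


def loop_violations(identifiers):
    """Identifiers that are preferred by an object other than their own parent."""
    return [
        i for i in identifiers
        if i.preferred is not None and i.preferred != i.object_name
    ]


population = [
    IdentifierB("Dog", 1, preferred="Dog"),   # fine: preferred by its own parent
    IdentifierB("Dog", 2, preferred=None),    # fine: not preferred at all
    IdentifierB("Cat", 1, preferred="Dog"),   # violation: preferred by another object
]
violations = loop_violations(population)
```

Only the third instance fails the check, which is the case the weakened formalization would otherwise allow.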

Approach C changes Identifier B.Preferred into the boolean attribute Identifier C.Preferred and makes the reverse relationship R6 a mathematically dependent relationship (added in OOA10). The main problem with this approach is that we need to stop an object having more than one preferred identifier. We have two choices here with regard to the Action Language code formalizing R6:

  1. we can decide which identifier flagged as preferred is actually the preferred identifier (e.g. first flagged) but we can't use the boolean attribute directly as an indicator,
  2. or we can determine all preferred identifiers and the modeller can rely on a constraint error being flagged if more than one preferred identifier is set.
Both choices still require careful coding when determining preferred identifiers and setting them.
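
The two choices can be contrasted in a short Python sketch. This is not Action Language and not how OOA Tool evaluates R6; the function names, the dictionary layout and the `ConstraintError` exception are hypothetical, and "first flagged" is taken to mean first in the order supplied.

```python
class ConstraintError(Exception):
    """Raised when a one-to-one relationship would have several participants."""


def preferred_first_flagged(identifiers):
    """Choice 1: pick one flagged identifier (here the first) as the preferred one."""
    return next((i for i in identifiers if i["preferred"]), None)


def preferred_strict(identifiers):
    """Choice 2: collect all flagged identifiers and fail if there are several."""
    flagged = [i for i in identifiers if i["preferred"]]
    if len(flagged) > 1:
        raise ConstraintError(f"{len(flagged)} preferred identifiers set")
    return flagged[0] if flagged else None


identifiers_of_object = [
    {"id": 1, "preferred": False},
    {"id": 2, "preferred": True},
    {"id": 3, "preferred": True},   # a second flag set by mistake
]

chosen = preferred_first_flagged(identifiers_of_object)
try:
    preferred_strict(identifiers_of_object)
    strict_failed = False
except ConstraintError:
    strict_failed = True
```

With choice 1, identifier 3 is flagged but is not actually the preferred identifier, which is why the boolean can't be used directly as an indicator. With choice 2 the mistake surfaces as an error instead.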

The final approach D is the recommended approach since it has none of the drawbacks of the previous approaches yet it can always be used to formalize reverse relationships. It uses the associative object Preferred Identifier to formalize the reverse relationship R8. The only perceived disadvantage here is the addition of the new object. However, this is a modelling mechanism and does not imply any additional implementation code is required. Modellers may try to avoid this approach to reduce clutter but the previous approaches add complexity and additional error possibilities. If an Identifier D.Preferred boolean attribute is still desirable (perhaps to allow changes to be more easily observable in another domain) then a mathematically dependent attribute can still be added to Identifier D which checks whether a Preferred Identifier exists.
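
A minimal Python sketch of approach D, including the optional derived boolean, might look like this. The `Domain` class and its method names are hypothetical, and storing the associative object instances in a dictionary keyed by object name is just one convenient way to keep the relationship one-to-one in the sketch.

```python
class Domain:
    def __init__(self):
        # Instances of Identifier D as (object name, identifier id) pairs.
        self.identifiers = set()
        # Instances of the associative object Preferred Identifier (R8):
        # object name -> identifier id, so at most one per object.
        self.preferred = {}

    def add_identifier(self, object_name, identifier_id):
        self.identifiers.add((object_name, identifier_id))

    def prefer(self, object_name, identifier_id):
        # The associative instance can only link an object to its own identifier.
        if (object_name, identifier_id) not in self.identifiers:
            raise ValueError("identifier does not belong to this object")
        self.preferred[object_name] = identifier_id

    def is_preferred(self, object_name, identifier_id):
        """Derived boolean: does a Preferred Identifier instance exist for this pair?"""
        return self.preferred.get(object_name) == identifier_id


domain = Domain()
domain.add_identifier("Dog", 1)
domain.add_identifier("Dog", 2)
domain.add_identifier("Cat", 1)
domain.prefer("Dog", 2)
flags = [domain.is_preferred("Dog", 1), domain.is_preferred("Dog", 2)]
```

Changing the preferred identifier replaces the single associative instance, so the derived boolean can never report two preferred identifiers for one object.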

As a side note, the OOA of OOA actually uses approach A and is able to do that because Name isn't defined as a base attribute of Object in the Information Model subsystem. Instead Object is a subtype of Entity which defines Name as a base attribute. This indirection performs a similar role as the associative object in approach D. However, adding a subtype-supertype relationship above a parent is not a general solution to the problem which is why it's not given as an approach here.