An Easy Executable Software Specification – A Proposal

Executable Specification 1Executable specification is a "holly graal" of modern software engineering. It's very hard to implement as it requires:

  • Formal specification of rules
  • Transformation of those rules into real-system predicates
  • Stimulating system under tests state changes in repeatable manner

FitNesse is one of such approaches that specifies function sample inputs and outputs then allow to run such test cases using special connectors and provide colour reports from test runs. It's easy to use (from specification point of view), but has the following drawbacks:

  • Software architecture must be compatible with unit testing (what is very good in general: cut unnecessary dependencies, Inverse of Control, …) – your current system might require heavy refactoring to reach such state
  • Rules are written and evaluated only during single test execution – different scenario requires another test case (no Design By Contract style continuous state validation)

Above features are present in any other unit test framework: JUnit, unittest, … All such testing changes state and checks output.

And now quite crazy idea has come to my mind:

What if we forget about changing state and focus only on state validation?

Executable Specification 2Then the executable specification won't be responsible for "test driver" job. Only state validation would be needed (including some temporal logic to express time constraints). That is much easier task to accomplish.

What about state changing then (the "Test Driver" role in the diagram) – you might ask? We have two options here:

  • random state changes to simulate global state machine coverage (randomtest.net)
  • click'n'play tools (which I hate, BTW) to deliver coverage over some defined paths (Selenium)
  • some API-level access to change system state and execute some interesting scenarios (system-dependant)

So let's go back to Specification then. Let's review some high level requirements (taken randomly from IPTV area) written in English and how they could be transformed into formal specification languages (I'm borrowing idea of "test tables" from FitNesse):

A requirement: video should be visible after device boot unless it's in "factory reset" state

Here we have the following state variables:

  • 1st_video_frame = true: detected by checking for special strings in decoder logs
  • device_boot = true: one could find unique entry in kernel log that shows device boot
  • factory_reset = true: missing some local databases state

Once parameters meaning has been specified by existence of log entries we could write the decision table here:

| device_boot | factory_reset | 1st_video_frame? | timeout |
| true | false | true | 120s |
| true | true | false | 120s |

As you might have noticed I've added extra parameter "timeout" that delivers the temporal part of the constraint. The meaning of this parameter is as follows:

Given input condition is set the output condition should be met (even temporarily) in the timeout period

A requirement: the device should check for new software versions during boot or after TR-69 trigger, user might decide about the upgrade in both cases

Here we define the following state variables:

  • device_boot¬† = true: the first kernel entry
  • check_software = true: proper log entry related to network activity for new software version retrieval
  • tr69_upgrade_trigger = true: local TR69 agent logs
  • user_upgrade_dialog = true: upgrade decision dialog visible

The decision table:

| device_boot | tr69_upgrade_trigger | check_software? | user_upgrade_dialog? | timeout |
| true | – | true | – | 160s |
| – | true | true | – | 30s |

(I use "-" as "doesn't matter" marker)

And here comes the hard part: we don't know whether new software is present for installation, so we cannot decide about user dialog. New property is needed:

  • new_sf_version_available

And the table looks like this:

| device_boot | tr69_upgrade_trigger | new_sf_version_available | check_software? | user_upgrade_dialog? | timeout |
| true | – | true | true | true | 160s |
| – | true | true | true | true | 30s |
| true | – | true | false | false | 160s |
| – | true | true | false | false | 30s |

We can see that table behaviour is a bit redundant there: we needed to multiply specification entries to show (tr69_upgrade_trigger x new_sf_version_available) cases.

However, above test will show failures when:

  • Despite the new version presence there was no software update dialog visible
  • No new version check has been triggered 160s after boot

A requirement: rapid channel change should not exceed 600ms

This one looks a bit harder because of the low timeout and the usual buffering done on logs. However, having log flush interval limited to 100ms one can keep quite good performance and measure time with enough granularity.

Another caveat here is to exclude channels that are not available in current provisioning of the customer (you should see some up-sell page instead).

The states variable definition:

  • 1st_video_frame: specification as above
  • channel_up: P+ key has been pressed
  • channel_down: P- key has been pressed
  • channel_unavailable: system detects that current channel/event is not purchased yet

The specification:

| channel_up | channel_down | channel_unavailable | 1st_video_frame? | timeout |
| true | – | false | true | 800ms |
| – | true | false | true | 800ms |

Note that channel availability detection must be done at the same moment as channel change event – might be not true in most implementations (so not possible to reflect in our simple temporal logic language).

The implementation feasibility

In order above method to work it requires some kind of implementation.

  • System state changes detection: the output of a system could be serialized in single stream of textual events (redirected to serial line for embedded device, application server logs on server-based system)
  • Each interesting state variable changes could be derived from regexp parsing of above stream
  • On each state change all the collected rules would be examined to find active ones
  • The rules with timeout would setup a timeout callback to check the expected output state changes

The outputs of such test:

  • Failed scenarios with timestamp – for further investigation
  • Rules coverage – will tell you how good (or bad) your test driver is (how much coverage is delivered)

Based on first output you need to adjust your rules / matching patterns. Based on the second you should adjust your test driver policy.

It look like a feasible idea for implementation. I'm going to deliver a proof of concept in randomtest.net project.

This entry was posted in en. Bookmark the permalink.