Entries from October 2011.

The Software Project Enemy: Regression

I track the current project state with an automated test suite that runs 24/7. It gives information about stability by randomly exploring the project's state space. Test runs are based on fresh builds from the auto-build system connected to the project's master branch.

Recently I hit a typical stability regression scenario a few times: the N+1th commit, which should not have had any unexpected side effects, caused a crash in the auto-testing suite after just 2 minutes of random testing, which blocked the full test suite from being executed. OK, we finally had feedback (from the auto-test), but let's compute the delay between introducing the bug and detecting it:

  • auto-build phase: 30 minutes .. 1 hour (we build a few branches in a cycle; sometimes a build waits a long time, especially if the build cache has to be refilled for some reason)
  • test queue wait phase: 30 minutes (pending tests have to finish before a new version is loaded)
  • review of testing results (currently manual): ~1 hour (we send automatic reports, but only twice a day)
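
The phases listed above can be summed to get a rough range for the detection delay. This is only a back-of-the-envelope sketch: the (best, worst) pairs are taken from the list, and the names `PHASES` and `detection_delay` are made up for illustration.

```python
# Phase durations in minutes as (best, worst), taken from the list above.
PHASES = {
    "auto-build": (30, 60),
    "test queue wait": (30, 30),
    "manual review": (60, 60),
}


def detection_delay(phases):
    """Return (best, worst) total delay in minutes across all phases."""
    best = sum(low for low, _ in phases.values())
    worst = sum(high for _, high in phases.values())
    return best, worst


best, worst = detection_delay(PHASES)
print(f"best case: {best} min, worst case: {worst} min")
# prints: best case: 120 min, worst case: 150 min
```

The listed phases alone add up to 2 to 2.5 hours. The twice-a-day report schedule is not part of this sum, and it is presumably what can push detection much later.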

In the worst case we may notice the problem (regression) the next day! That is too long in my opinion: a broken master branch may block other developers and slow down their work. There must be a better way to organise things so that we can react faster to such problems:

  1. Ask developers to launch the random test suite locally before every commit. A 5-minute automatic test should keep obvious crashes from being published on the shared master branch
  2. Auto-notify all developers about every failed test: we still need to wait for the build/test queue, and messages may start being ignored after a while
  3. Employ an automatic gatekeeper-like mechanism that promotes (merges) a commit after some sanity tests have passed

Option 1 looks like the most efficient and the easiest to implement. Option 2 is too "spammy" and option 3 is the hardest to set up (probably for the next project, in a language with ready support for "commit promotion").

Option 2 would look interesting if applied to the latest submitters only. Maybe I'll give it a try.
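
Option 1 from the list above (a short, time-boxed random test run before each commit) could be sketched roughly as follows. This is not the test suite described in the post: `run_random_tests` and the idea of passing plain callables as "actions" are assumptions made for illustration.

```python
import random
import time


def run_random_tests(actions, budget_seconds, seed=None):
    """Call randomly chosen actions until the time budget is used up.

    Returns (steps, None) when the budget ran out without a failure,
    or (steps, exception) for the first action that raised.
    """
    rng = random.Random(seed)  # a fixed seed makes a failing run repeatable
    deadline = time.monotonic() + budget_seconds
    steps = 0
    while time.monotonic() < deadline:
        action = rng.choice(actions)
        try:
            action()
        except Exception as exc:
            return steps, exc
        steps += 1
    return steps, None
```

A pre-commit hook could call this with a budget of 300 seconds and refuse the commit when the second element of the result is not None.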

Assert: To Abort Or Not To Abort, That's The Question

Everyone agrees that internal state checking using assert(), Q_ASSERT() or an assert statement is a good thing. A programmer can declare the expected input (asserting parameters) and internal state (invariants) and verify return values (postconditions), and the runtime will check those expectations. There are languages with direct support for assertions in all three variants (Eiffel with its Design By Contract philosophy).

Those assertions typically show filename/line number/message information and abort the program or raise an exception if the condition is not met. The runtime environment can then collect the current stack trace to give the developer more information about the failed expectation.

One can disable assertions entirely (C++, Java) or select a subset of assertions (Eiffel) for production releases. The resulting code will be faster, but failed expectations will no longer be detected by the software, so problem reports may be harder to interpret.

On the other hand, if an assertion is too strict (the assumption may be invalid), it may abort the program, giving the user a negative impression of the software's stability.

What to do then? How can we keep the problem-diagnosing power of enabled assertions and prevent minor or invalid failed assertions from aborting the whole program?

The answer is: weak assertions (assertion without abort).

OK, you may ask, if we are not calling abort() then we can lose the details of the failed assertion. Nobody will send a bug report if the program does not crash.

But why do we need to rely on end user? ;-) Let's send failed assertion reports automatically under-the-covers and leave system running!

I'm attaching a solution for Q_ASSERT (C++, Qt) below, but you should get the idea:

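#include <cstdio>   // snprintf
#include <cstdlib>  // getenv, abort

static const size_t BUFSIZE = 1024;  // assumed value; not given in the original snippet
// Helpers from the posts mentioned below; declared so that the snippet compiles.
void print_stacktrace_to_logs(const char *msg);
void send_stacktrace_over_network(const char *msg);

// Overrides Qt's handler: report the failure, abort only when ASSERT_FATAL is set.
void qt_assert_x(const char *where, const char *what, const char *file, int line) {
    char buf[BUFSIZE];
    snprintf(buf, BUFSIZE, "%s:%d: Q_ASSERT(%s, %s) FAILED", file, line, where, what);
    print_stacktrace_to_logs(buf);
    send_stacktrace_over_network(buf);
    if (getenv("ASSERT_FATAL")) {
        abort();
    }
}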

If you place this code in an LD_PRELOAD-ed library, it will override the symbol from the Qt library and catch every reference to qt_assert_x(). The supplied function saves the important information to a log file, sends it over a network connection (if possible) and then returns. I described in this post how we can collect such crash reports on a server, and this post covers the implementation of stack traces for C++.

Optionally you can ask for an abort() call (the typical Q_ASSERT behaviour) using some kind of configuration (an environment variable in my case). It may be useful in a development environment to die early if something goes wrong, to get better diagnostics.

Using custom error handlers you can have both benefits:

  • "robust" program that will not die after minor expectation is not met (end user will not care)
  • automatic diagnostics about failed expectation
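
The same "report and keep running" idea is not tied to Qt. Below is a minimal sketch of a weak assertion in Python; `weak_assert` and its `report` hook are hypothetical names, and the ASSERT_FATAL switch mirrors the environment variable used in the C++ snippet.

```python
import logging
import os
import traceback

log = logging.getLogger("weak_assert")


def weak_assert(condition, message="", report=None):
    """Report a failed expectation without stopping the program.

    Returns True when the condition holds and False otherwise.
    Raises AssertionError only when ASSERT_FATAL is set in the environment.
    """
    if condition:
        return True
    # Drop the last frame (this function) so the trace ends at the caller.
    stack = "".join(traceback.format_stack()[:-1])
    text = f"ASSERT FAILED: {message}\n{stack}"
    log.error(text)
    if report is not None:
        report(text)  # hypothetical hook, e.g. a sender that posts to a collector
    if "ASSERT_FATAL" in os.environ:
        raise AssertionError(message)
    return False
```

In development one would export ASSERT_FATAL to fail fast; in production the variable stays unset and the failure only shows up in logs and reports.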

First Draft Of Public Calendar Specification

After deciding to launch an online booking service for existing calendars (Google / Microsoft Exchange), it's time to write a minimal functionality specification. Then a landing page will be prepared to check whether the subject attracts attention, followed by an MVP to prove that the basic flows work as expected.

Glossary:

  • Customer: calendar owner
  • Client: wants to book an appointment with customer
  • Time slot: period of time that can be selected for appointment

MVP Use case list below (will define implementation order and priorities):

  • Show customer's calendar on a website with available slots
    • WHY: Proves that we can read from an existing event source
  • Allow booking an appointment for a selected slot
    • WHY: Proves that we can write to an existing event source
  • Allow customer to register a new calendar
    • WHY: Allows external tests by potential customers
    • Allows setting timezone, day of week range, hours range
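
To make the first use case more concrete, here is a minimal sketch of how available slots could be derived from the configured hours range and the events already in the calendar. The function name, the fixed slot length and the naive (timezone-less) datetimes are all assumptions; a real implementation would have to honour the customer's timezone and day of week range.

```python
from datetime import date, datetime, time, timedelta


def available_slots(day, work_start, work_end, slot_minutes, busy):
    """List (start, end) slots within working hours that overlap no busy interval.

    busy is a list of (start, end) datetime pairs read from the calendar.
    """
    step = timedelta(minutes=slot_minutes)
    start = datetime.combine(day, work_start)
    day_end = datetime.combine(day, work_end)
    slots = []
    while start + step <= day_end:
        end = start + step
        # A slot is free when it ends before, or starts after, every busy interval.
        if all(end <= b_start or start >= b_end for b_start, b_end in busy):
            slots.append((start, end))
        start = end
    return slots
```

For example, with working hours 9:00 to 11:00, 30-minute slots and one event from 9:30 to 10:30, the function returns the 9:00 and 10:30 slots.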

Optional use case list:

  • SMS booking confirmation
    • WHY: prevent abuse
    • Phone number is required from client
    • SMS code should be entered on confirmation screen to finish booking a visit
  • SMS notifications
    • WHY: minimize no-shows
    • A notification SMS is sent one day before a visit
  • Vacation management by Customer
    • WHY: block booking for selected days
    • allow to specify date range for "unavailable" days
  • Automatic SMS after moving a reservation
    • WHY: easy schedule changes
    • compares local cache with current calendar state
    • sends SMS to client and e-mail to customer
  • Custom SMS messages (with proper character limit check)
    • WHY: more details (address) in a message
  • Client cancels appointment
    • WHY: minimize no-shows
    • Maybe as easy as sending SMS to some address with confirmation?
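
The "compares local cache with current calendar state" step from the list above could look roughly like this. It is only a sketch: it assumes both states are available as dictionaries mapping an event id to its start time, and `moved_events` is a hypothetical name.

```python
def moved_events(cached, current):
    """Return {event_id: (old_start, new_start)} for events whose start changed.

    Events present on only one side (new or deleted) are ignored here.
    """
    return {
        event_id: (old_start, current[event_id])
        for event_id, old_start in cached.items()
        if event_id in current and current[event_id] != old_start
    }
```

Each entry in the result would then trigger one SMS to the client and one e-mail to the customer.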

To be continued ...

Uptime Statistics Visualisations For Site-uptime.net

Would you like to see how your web application's performance behaves over longer time periods? Now it's possible with site-uptime.net! We've just added visualisation of statistics:

The sample above is not bad: the system responded in <1s in >88% of cases. It looks like no optimisations are needed here. Let's look at a different case:

Oops! We had some performance problems in 2010-03 .. 2010-07 (twitter.com, BTW). It looks like they were fixed in the following months.

We are going to store and show day-by-day visualisations as well, to give you feedback over shorter time spans.
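
A figure such as "responded in <1s in >88% of cases" can be computed from raw response time samples in a few lines. This is an illustrative sketch only, not the code behind site-uptime.net; `share_faster_than` is a made-up name.

```python
def share_faster_than(response_times, threshold_seconds=1.0):
    """Return the percentage of samples strictly below the threshold."""
    if not response_times:
        raise ValueError("no samples")
    fast = sum(1 for t in response_times if t < threshold_seconds)
    return 100.0 * fast / len(response_times)
```

Grouping the samples by month or by day before calling the function gives one value per bar in a chart.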

Bazaar to GIT migration

Today I've moved site-uptime.net development from a Bazaar repository to Git using the elegant bzr2git script. The why:

  • In-place branches (I used to use them heavily)
  • Faster (no Python libs loading during "cold" start)
  • Can't live without "git rebase -i" now :-)

Encryption Using GPG: Minimal HOWTO

I assume you want to encrypt some files using your public GPG key. I'll focus on simplicity rather than completeness (minimal steps required to implement encryption).

First you have to generate a key pair:

$ mkdir -p ~/.gnupg
$ gpg --gen-key

Then look up your new key ID and export it to a public key server:

$ gpg --list-keys # get KEY_ID from output
$ gpg --keyserver "hkp://subkeys.pgp.net" --send-key <KEY_ID>

On the remote machine, import the key and mark it as trusted (to avoid warnings during encryption):

$ gpg --keyserver "hkp://subkeys.pgp.net" --recv-keys <KEY_ID>
$ gpg --edit-key <KEY_ID>
> trust

Then you can use this key to encrypt files and delete the originals (if needed):

$ gpg -r <KEY_ID> --output <FILE>.gpg --encrypt <FILE>
$ rm <FILE>

And the decryption (on the host where the private key is stored; no recipient option is needed here):

$ gpg --output <FILE> --decrypt <FILE>.gpg

Python Web Framework Selection

I've been using many different Python Web Frameworks so far:

Every framework has its strengths and weaknesses. For a new project that will handle appointments using existing calendars I decided to give web2py a try. Rationale:

  • all stuff included on board, no manual integration of 3rd party libraries
  • stable API
  • small and elegant
  • integrates with GAE (with subset of DB layer)
  • template selection separated from controller (easier unit testing)
  • easy template syntax (reuses Python code embedded into markup language)

After the first phase of the product I'll report whether my expectations above were correct and what kind of problems I found (if any).

Online Booking For Your Google / iPhone / Exchange Calendar

We at Aplikacja.info have been delivering online booking services for Polish customers through the med.aplikacja.info service since 2008. The service is mainly targeted at small and medium medical businesses and integrates seamlessly with an existing webpage.

I was recently asked about integration with a hosted Microsoft Exchange calendar. The customer (also a medical business) currently uses an iPhone to record visits and would like to keep using the existing contacts and calendar events while allowing his customers to book online.

Our current persistence mechanism is a local SQL database. To integrate with an external contact / event source we have the following options:

  • keep local SQL storage and make periodical import/export of events/contacts to Microsoft Exchange
  • replace local SQL storage with remote-only storage thus import/export will not be needed

The drawback of the first option is possible de-synchronisation and a higher probability of conflicts. The benefit is that local storage is more reliable than relying only on the availability of a remote system. Also, some customers don't have an existing calendar we can connect to; for them the setup is much simpler (just sign up for our online service).

The latter gives better guarantees of eliminating conflicts, but may be slower / less reliable (when remote system is unavailable for some reason).

A few weeks ago we started an attempt to integrate with Google Calendar; the interface is pretty well documented. The option that will probably win is to rely only on remote storage, with some data cached locally to make things faster. Maybe the same approach will be used for MS Exchange (probably with CalDAV as the transport option).
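
The "remote storage with some data cached locally" approach could be sketched as a small read-through cache. Everything here is an assumption made for illustration: the class name, the `fetch_remote` callable standing in for a Google Calendar or Exchange client, and the time-based expiry.

```python
import time


class CachedEvents:
    """Read-through cache in front of a remote calendar."""

    def __init__(self, fetch_remote, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch_remote = fetch_remote  # hypothetical: calendar_id -> list of events
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # calendar_id -> (expires_at, events)

    def events(self, calendar_id):
        """Return cached events while fresh, otherwise ask the remote system."""
        now = self._clock()
        hit = self._entries.get(calendar_id)
        if hit is not None and now < hit[0]:
            return hit[1]
        events = self._fetch_remote(calendar_id)
        self._entries[calendar_id] = (now + self._ttl, events)
        return events

    def invalidate(self, calendar_id):
        """Drop the cached copy, e.g. right after writing a booking remotely."""
        self._entries.pop(calendar_id, None)
```

The remote calendar stays the single source of truth, so conflicts are limited to the cache lifetime; a short lifetime plus invalidation after every booking keeps that window small.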