Dariusz on Software Quality & Performance

11/02/2012

Large C++ Project Build Time Optimisation

Filed under: en — Tags: — dariusz.cieslak @

When a project reaches a certain code size, you start to observe the following sequence:

  1. Developer creates and tests a feature
  2. Before submitting the commit to the repository, an update/fetch/sync is done
  3. Developer builds the project again to check that the build and basic functionality are not broken
  4. Smoke tests
  5. Submit

During step 3 you hear "damn slow rebuild!". It turns out that synchronising with the repository forces a rebuild of 20% of the files in the project (and that takes a long time when the project is really huge). Why?

The answer is: header dependencies. Some header files are included (directly or indirectly) by many source files, which is why so many files have to be rebuilt. You have the following options:

  • Skip build dependencies and pray that the resulting build is stable, or works at all
  • Reduce header dependencies

I'll explain the second option.

The first thing to do is to locate the problematic headers. Here's a script that lists headers together with the number of source files that depend on them:

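#!/bin/sh
# Prints "<count> <header>": how many source files include each header,
# directly or indirectly. With an optional argument (the base name of a
# source file, e.g. main.cpp) only headers that file depends on are listed.

awk -v F="$1" '
# Record every #include: incl[h] lists the base names of files that include h.
/^[ \t]*#[ \t]*include/ {
    a=$0; sub(/[^<"]*[<"]/, "", a); sub(/[>"].*/, "", a);
    f=FILENAME; sub(/.*\//, "", f); incl[a]=incl[a] f " ";
}

# Count the distinct source files that (transitively) include f.
# located[] collects source files already counted,
# seen[] marks headers already visited (guards against include cycles).
function compute_includes(f, located, seen,
arr, n, i, sum) {
    if (f ~ /\.(c|cpp)$/) {
        if (f in located) {
            return 0
        }
        located[f] = 1
        return 1
    }
    if (!(f in incl) || (f in seen)) {
        return 0
    }
    seen[f] = 1
    n = split(incl[f], arr)
    sum = 0
    for (i=1; i<=n; i++) {
        sum += compute_includes(arr[i], located, seen)
    }
    return sum
}

END {
    for (a in incl) {
        # Reset both sets for every header
        split("", located); split("", seen)
        n = compute_includes(a, located, seen)
        # Skip headers whose names start with Q (this filters out Qt headers such as QString)
        if (a ~ /^Q/) {
            continue
        }
        if (F != "") {
            if (F in located) {
                print n, a
            }
        }
        else if (n) {
            print n, a
        }
    }
}

' $(find . -name '*.cpp' -o -name '*.h' -o -name '*.c') \
| sort -n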

Sample output:

266 HiddenChannelsDefinitions.h
266 nmc-hal/hallogger.h
268 favoriteitemdefinitions.h
270 nmc-hal/playback.h
279 pvrsettingsitemdefinitions.h
279 subscriberinfoquerier.h
280 isubscriberinfoquerier.h
286 notset.h
292 asserts.h

As you can see, there are header files that force ~300 source files to be rebuilt after a change. You can start optimising with those files.

Once you have located the headers to start with, you can use the following techniques:

  • Use a forward declaration (class XYZ;) instead of header inclusion (#include "XYZ.h") when possible
  • Split large header files into smaller ones, rearrange includes
  • Use PIMPL to split interfaces from implementations
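
Here is a minimal sketch of the first and the last technique combined. All names (Widget, Logger, widget.h, logger.h, widget.cpp) are made up for illustration; the comments mark where each part would live in a real project, and everything is squeezed into a single file only so that it compiles on its own.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- widget.h (hypothetical): the header that many files include ----
class Logger;   // forward declaration instead of #include "logger.h"

class Widget {
public:
    explicit Widget(Logger& log);
    ~Widget();                                  // defined in widget.cpp, where Impl is complete
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    void rename(const std::string& name);
    const std::string& name() const;

private:
    struct Impl;                                // PIMPL: data members are hidden from clients
    std::unique_ptr<Impl> impl_;
};

// ---- logger.h (hypothetical): included only by widget.cpp ----
class Logger {
public:
    void log(const std::string& msg) { lines_.push_back(msg); }
    std::size_t count() const { return lines_.size(); }

private:
    std::vector<std::string> lines_;
};

// ---- widget.cpp (hypothetical): the only file that sees Impl and Logger ----
struct Widget::Impl {
    Logger& log;
    std::string name;
};

Widget::Widget(Logger& log) : impl_(new Impl{log, "unnamed"}) {}

Widget::~Widget() = default;

void Widget::rename(const std::string& name) {
    impl_->name = name;
    impl_->log.log("renamed to " + name);
}

const std::string& Widget::name() const { return impl_->name; }
```

With this layout, a change in logger.h or in the private data of Widget should only force widget.cpp to be recompiled, not every file that includes widget.h. The price is one extra heap allocation and one pointer indirection per object, so it is worth checking whether the header really is a rebuild hotspot first.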
