Dariusz on Software Quality & Performance

11/02/2012

Large C++ Project Build Time Optimisation

Filed under: en — Tags: — dariusz.cieslak @

When you hit some level of code size in a project you starting to observe the following sequence:

  1. Developer creates and tests a feature
  2. Before submitting commit to repository update/fetch/sync is done
  3. Developer builds project again to check if build/basic functionality is not broken
  4. Smoke tests
  5. Submit

During step 3 you hear "damn slow rebuild!". One discovers that synchronization with repository forces him to rebuild 20% of files in a project (and it takes time when project is really huge). Why?

The answer here is: header dependencies. Some header files are included (directly and indirectly) in many source code files, that's rebuild of so many files is needed. You have the following options:

  • Skip build dependencies and pray resulting build is stable / working at all
  • Reduce header dependencies

I'll explain second option.

The first thing to do is to locate problematic headers. Here's a script that will find most problematic headers:

#!/bin/sh

awk -v F=$1 '
/^# *include/ {
    a=$0; sub(/[^<"]*[<"]/, "", a); sub(/[>"].*/, "", a); uses[a]++;
    f=FILENAME; sub(/.*\//, "", f); incl[a]=incl[a] f " ";
}

function compute_includes(f, located,
arr, n, i, sum) {
    # print "compute_includes(" f ")"
    if (f ~ /\.c/) {
        if (f in located) {
            return 0
        }
        else {
            located[f] = 1
            return 1
        }
    }
    if (!(f in incl)) {
        return 0
    }
    # print f "->" incl[f]
    n = split(incl[f], arr)
    sum = 0
    for (i=1; i<=n; i++) {
        if (f != arr[i]) {
            sum += compute_includes(arr[i], located)
        }
    }
    return sum
}

END {
    for (a in incl) {
        n = compute_includes(a, located)
        if (F) {
            if (F in located && a !~ /^Q/) {
                print n, a
            }
        }
        else {
            if (n && a !~ /^Q/) {
                print n, a
            }
        }
        for (b in located) {
            delete located[b]
        }
    };
}

' `find . -name \*.cpp -o -name \*.h -o -name \*.c` \
| sort -n

Sample output:

266 HiddenChannelsDefinitions.h
266 nmc-hal/hallogger.h
268 favoriteitemdefinitions.h
270 nmc-hal/playback.h
279 pvrsettingsitemdefinitions.h
279 subscriberinfoquerier.h
280 isubscriberinfoquerier.h
286 notset.h
292 asserts.h

As you can see there are header files that require ~300 source files to be rebuilt after change. You can start optimisations with those files.

If you locate headers to start with you can use the following techniques:

  • Use forwad declaration (class XYZ;) instead of header inclusion (#include "XYZ.h") when possible
  • Split large header files into smaller ones, rearrange includes
  • Use PIMPL to split interfaces from implementations

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress