The Oldest Unchanged Kernel Code

The kernel’s internal routine for computing the TickCountMultiplier in the KUSER_SHARED_DATA is here presented as the kernel’s longest-lasting code that has any substance.

The code is in one routine, named very straightforwardly as ExComputeTickCountMultiplier. It dates from version 3.10, which does not have the KUSER_SHARED_DATA but computes the multiplier for the kernel’s implementation of NtGetTickCount. Dating during the development of this first Windows version is possible from pre-release builds that are redistributed on the Internet as abandon-ware, possibly unlawfully but invaluably for historians. Nothing like this routine yet exists in the pre-release build 3.10.297.1 from June 1992, but the routine is present in the pre-release build 3.10.328.1 from October 1992. At the other end of the time scale, it is still present at least as recently as the 2004 edition of Windows 10—and not just 32-bit, but 64-bit too.

At not even 100 bytes, substance is relative. But at roughly 60 to 80 bytes, depending on the version, it’s not trivial. And through all these years, this code has executed every time that anyone starts Windows. Though the binary code varies between versions, the source code in C looks like it has never changed in getting on to three decades, which is no small point of distinction.

Algorithm

Also distinctive is that the routine is much less slight than it could be. Because it works, it doesn’t need to be simpler or smaller. Though it exists only to prepare an optimisation, it need not itself be optimised. But all the routine has to do is zero-extend a 32-bit integer to 64 bits, shift left by 24 bits, and divide by 10,000. This routine could be coded as a single C statement. Had it been, it would almost certainly have got inlined, very possibly leaving (in the binary) no discernable trace of its being still written as a separate routine. Indeed, since its computation is wanted just the once, the programmer might have written the one statement in-place. Of all routines that might have survived unchanged for so long as active code, this one has strikingly little reason ever to have existed as its own routine, let alone to have survived.

For critical review, specifically of the routine’s improbable survival, let’s transcribe Microsoft’s code from the binary into C:

ULONG ExComputeTickCountMultiplier (ULONG Period)
{
    ULONG integer = Period / 10000;
    ULONG remainder = Period - integer * 10000;
    ULONG fraction = 0;
    for (ULONG n = 24; n != 0; n --) {
        remainder <<= 1;
        fraction <<= 1;
        if (remainder < 10000) continue;
        remainder -= 10000;
        fraction |= 1;
    }
    return (integer << 24) | fraction;
}

One way to parse the computed TickCountMultiplier is that it’s a fixed-point integer with the high 8 bits for the integer part and the low 24 for the fraction. The kernel literally computes it in these parts: an integer division of the maximum period by 10,000 to get the integer part and a remainder; then 24 loops of a shift-and-subtract algorithm to divide this remainder by 10,000 to get the fractional part. Because of this particular fixed-point form, with 8 bits for the integer part, the result is meaningful only if Period is less than 2,560,000. Given that this is assumed of the input, the code is correct.

Microsoft is evidently not unhappy with this code. Except that the routine’s name has the Exp prefix in version 3.10 but changed to Ex when version 3.50 gave the computed result more exposure, the routine looks to have been written once and then left untouched. Even if its longevity is due only to nobody at Microsoft having ever cared enough to review it, let alone rewrite it, someone certainly reviewed it for ReactOS and seems similarly to have been not unhappy with it. For all the talk of ReactOS having reverse-engineered the Windows kernel using clean-room techniques, ReactOS’s implementation of this routine is just a slightly different transcription from mine above.

Yet the kernel’s code for this routine is undeniably laboured. For what’s assumed of the input, an alternative in one line gives exactly the same result and is far more readable:

ULONG ExComputeTickCountMultiplier (ULONG Period)
{
    return ((ULONGLONG) Period << 24) / 10000;
}

It’s just not credible that a competent programmer could write this as ten statements without realising that the computation can be done in one (or that a competent reverse engineer could read Microsoft’s binary code and not capture the functionality for the programmers in the clean room, rather than dictate a precise implementation). The only explanation that makes sense to me (about Microsoft’s programming, not ReactOS’s transcription) is that in 1992 the one-statement implementation was unavailable.

No 64-Bit Integer

The obvious, if not only, reason that Microsoft’s programmers in 1992 would not have plumbed for the one-statement implementation is that there was no 64-bit integer type to do 64-bit arithmetic with.

To this day, the x86 kernel exports a set of Rtl functions for arithmetic on 64-bit integers expressed as LARGE_INTEGER or ULARGE_INTEGER unions, and headers in kits for kernel-mode programming add macros and inline routines. All are documented in the DDK for Windows NT 3.1 as if intended for real-world use, which they got, both within the kernel and by drivers. Some remain in real-world use even now, but since at least the DDK for Windows NT 3.51 their documentation has been reduced to saying that each function “is exported to support existing driver binaries and is obsolete. For better performance use the compiler support for 64-bit integer operations.” Headers in the DDK for Windows NT 3.1 are written as if such support is anticipated but is not yet available. The familiar LONGLONG and ULONGLONG types are defined, but as double. They have the desired 64-bit size and alignment but are not usable for integer arithmetic. (This fallback to double if __int64 is thought to be unavailable has never been completely removed. See for yourself in NTDEF.H.)

Such were the limited programming resources of the Windows NT 3.1 DDK and contemporaneous (or at least recent) versions of Microsoft’s 32-bit compiler. Given this, the coding that the kernel has kept all these years arguably is optimised. It credibly was the most efficient, both for execution and presentation, that was possible at the time. Its survival in active use means it’s a living fossil of coding practice—indeed, of expert coding practice—from a time when Microsoft’s compiler for 32-bit Windows could not yet do 64-bit arithmetic.