Optimize Your Code with Register Variables
by Robert Zale, PowerBASIC Inc.
Register variables... a powerful tool for optimization, but often
overlooked or misunderstood. Many compilers use them in one form or
another. Better compilers, including all 32-bit PowerBASIC products,
offer automatic allocation as well as specific control by the
programmer. But more about that later.
So just what is a register? It's simply a small storage location
built directly into the CPU. The 32-bit Intel x86 chips offer eight
general-purpose 32-bit registers, while the x87 numeric coprocessor
sports another eight 80-bit floating point registers. Typically,
registers are used by the compiler for temporary storage and
calculation. Because they're "on-chip", access is very fast... much
faster than conventional memory. Even better, the code needed to
access them is smaller, too!
So why not examine the code in a Sub or Function to find the few
local variables which are used the most? Then, instead of storing
these popular variables in memory, reserve a couple of the CPU
registers and keep them there. We'd get a big boost in speed, and
smaller code size, too. There you have it: REGISTER VARIABLES.
So just how much of a difference do register variables make? Let's
look at a very simple example written in 32-bit PB/CC:

    Function PBMain()
        Register X&, Counter&
        For Counter& = 1 To 300000000
            X& = Counter& + Counter& + Counter&
        Next
    End Function

With Register Variables disabled on a P2/300, this runs in 7 seconds.
Now, turn them on, and the identical code runs in under 4 seconds, a
major improvement! That's close to double the execution speed!
Will it work for floating point code, too? You bet it will! Any
local, extended precision (80-bit) float variable can be declared as
a Register Variable. It even helps to use them for storage of
overworked numeric constants -- anything that limits memory access
will help your bottom line performance. Take this simple example:

    Function PBMain()
        Register Counter&
        Register x##, y##
        x## = 1
        y## = 0.00001
        For Counter& = 1 To 100000000
            x## = x## * y##
        Next
    End Function

Watch this closely... It just gets better and better! Without the
benefit of Register Variables, this code runs in about 6.7 seconds.
Fairly respectable by most standards. But turn them on, and running
time is slashed to just 1.6 seconds. More than four times faster!
Just by telling the compiler to make full use of Register Variables.
Register variables are always local to the Sub or Function where they
are declared. In the current version of PowerBASIC, there may be up
to two integer class register variables (word/dword/integer/long) and
up to four extended precision (80-bit) floats within each Sub or
Function. It's possible that future versions of the compiler will
change these limits, so we place no restrictions on how many you may
declare. Any "extra" register variables are simply reclassified by
the compiler as locals.
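As a small sketch of how those limits play out (the Sub and variable
names here are made up for illustration), a routine may ask for three
integer class register variables and still compile. Going by the
limits above, the current compiler can place only two of them in CPU
registers, presumably the first two declared, and quietly treats the
extra one as an ordinary local:

    Sub CountPairs()
        ' Hypothetical example: three requests, two registers.
        Register Row&, Col&      ' candidates for the CPU registers
        Register Total&          ' the "extra" one: becomes a local
        For Row& = 1 To 100
            For Col& = 1 To 100
                Total& = Total& + 1
            Next
        Next
    End Sub

Nothing breaks either way. The only cost of over-declaring is that
the extra variable runs at ordinary local variable speed.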
PowerBASIC stores the two integer class Register Variables in the CPU
registers ESI and EDI, though you really don't need to worry about it
unless you use inline assembler. Since these two registers aren't
addressable at the byte level, PowerBASIC must disallow bytes as
Register Variables. Supported integer class variables would be any
of integer, long, word, and dword. As with all 32-bit programs,
32-bit data size is preferred. It's faster and the generated code is
smaller, too. In the 32-bit world, avoid the use of bytes, integers,
and words whenever it's reasonable. Your code will benefit!
The Intel x87 numeric coprocessor offers eight 80-bit floating point
registers. PowerBASIC takes four of them to store Register Variables.
Since these registers are 80 bits in size, only extended precision
floating point variables (such as x##) are eligible. If singles or
doubles were allowed, a value held in a register would carry more
precision than the same value stored in memory, so round-off
discrepancies would be introduced. Simply put, that would mean slight
changes in calculation depending upon whether Register Variables
were enabled: an unacceptable option.
The REGISTER statement, supported in PB/CC and PB/DLL, allows you to
choose which variables will be classified as register variables. They
are local, so each Sub and Function may have its own unique set. If
you do not make the choice in a particular Sub/Function, the compiler
will attempt to choose for you. By default, the compiler will always
assign any integer class local variables available. Extended
precision float variables will be automatically allocated only in
functions which contain no external function calls.
The $REGISTER metastatement, also supported in PB/CC and PB/DLL,
allows you to specify the method of auto-allocation of Register
Variables. $REGISTER ALL requests automatic allocation of all
possible register variables, both integer class and floating point.
$REGISTER DEFAULT requests automatic allocation of integer class
variables, but allocates floating point variables only in subs and
functions which contain no external function calls. $REGISTER NONE
disables automatic allocation of register variables. In the current
version of the compilers, $REGISTER applies to the entire program.
It must therefore precede any Sub or Function.
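For a rough way to see the effect on your own machine, here is a
minimal timing sketch. It assumes the PB/CC TIMER function and PRINT
statement, and it declares the variables as plain locals so that
allocation is left entirely to the $REGISTER setting. Compile it once
as shown, then again with $REGISTER NONE changed to $REGISTER ALL,
and compare the two results:

    $Register None           ' must precede any Sub or Function

    Function PBMain()
        Local X&, Counter&   ' left to automatic allocation
        Local Started#
        Started# = Timer
        For Counter& = 1 To 300000000
            X& = Counter& + Counter& + Counter&
        Next
        Print "Elapsed seconds:"; Timer - Started#
    End Function

The actual times will depend on your processor, so treat the figures
quoted earlier as a guide rather than a promise.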
Integer class register variables are almost always desirable and
beneficial. It's generally best to select those which are referenced
most frequently, such as For/Next loop counters, and those
used repeatedly as array indexes. Float register variables should
generally be chosen with a bit more caution, since the compiler
must generate code to save and restore them to conventional memory
around each call to a Sub or Function. In some rather rare cases,
it is possible that float register variables could actually reduce
execution speed. However, they are extremely valuable with intensive
floating point calculations in functions which have few references to
other Subs or Functions.
Due to the structure of the numeric coprocessor, and the instruction
set available, the first float register variable declared in your
program has far more optimization possibilities than the others.
Use care in choosing the variable which is used most within floating
point expressions (that is, on the right side of the '=' assignment
operator), in order to gain the greatest advantage in execution speed.
Also, remember it is typically valuable to assign floating point
constants to register variables when they are used in repetitive or
intensive calculations.
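Here is one way those two suggestions might look in practice. This is
only a sketch: the function name ScaledSum and the value 0.5 are
invented for illustration. The variable used most within the
expression is declared first, and the constant is loaded into a
register variable once, ahead of the loop:

    Function ScaledSum(ByVal n As Long) As Ext
        Register i&
        ' total## is declared first: it is used most in the
        ' floating point expression inside the loop.
        Register total##, scale##
        scale## = 0.5        ' constant kept in a register
        total## = 0
        For i& = 1 To n
            total## = total## * scale## + 1
        Next
        Function = total##
    End Function

Note that the loop makes no calls to other Subs or Functions, so no
save and restore code is needed around the calculation.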
So what about Register Variables and inline assembler? Generally
speaking, it's pretty straightforward. In most cases, you'll only
be accessing integer class variables, and you can do that by just
using the variable name. For example:

    Register xyz&
    asm mov eax, xyz&
    asm mov eax, esi

In the above example, both "mov" instructions are interpreted in
exactly the same way, as PowerBASIC is smart enough to understand
that the Register Variable xyz& is stored in the cpu register esi.
It's much to your advantage to use intuitive variable names rather
than hardware registers, for obvious reasons, so be sure you do
that whenever possible. That rule always applies when you are
dealing with integer class Register Variables. That is, integer,
long, word, and dword Register Variables.
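As a slightly fuller sketch of the same idea (the function name
AddTen is hypothetical), the routine below reads and writes an
integer class Register Variable strictly by name. The source never
mentions ESI or EDI, so it keeps working even if the compiler ends up
treating the variable as an ordinary local:

    Function AddTen(ByVal n As Long) As Long
        Register Total&
        Total& = n
        ' Load the variable by name, add 10, and store it back.
        ' EAX is used only as scratch space.
        Asm mov eax, Total&
        Asm add eax, 10
        Asm mov Total&, eax
        Function = Total&
    End Function

Called as AddTen(5), this should return 15.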
You probably won't have nearly as much need to access floating
point Register Variables from inline assembler, and that's good!
If you try it, the rewards can be great, but there are hazards.
You must use a good deal more care with assembler floating point
code in functions with Register Variables. Floating point register
variables may occupy up to four of the coprocessor registers, so
you must limit your use of x87 registers to the remaining four.
Further, float register variables should not be referenced by name
in assembler code, as the compiler can't always track the register
locations with absolute certainty. Here's why... Registers on
the x87 are organized as a stack. The first value loaded is saved
in register st(0). The second value loaded goes to st(0) as well,
but pushes the first to st(1). And so on, for up to eight float
registers. When you declare float Register Variables, the first
is stored in st(0), the second in st(1), then st(2) and st(3).
Each time more values are loaded or stored, the Register Variables
can shift up to four register positions in either direction! This
isn't a problem with compiled PowerBASIC code, but it can be a
logical nightmare with inline assembler. So the PowerBASIC rule
is simple: Never, ever reference a float register variable by
name from inline assembler. Just don't do it! (smile) Reference
floats only by the register: st(0) through st(7). That's the
safe thing to do.
One final restriction: Since Register Variables have no memory
location, they cannot be used with the VARPTR() function.
Register Variables are supported in both 32-bit PowerBASIC compilers. PB/CC,
the PowerBASIC Console Compiler, creates text mode applications for Win95/98/NT.
It is ideally suited for those situations where a graphical user interface is
neither needed nor desirable, such as Internet Web Server and CGI applications, or
for a straightforward port of DOS Basic code to 32-bit Windows. PB/DLL, the
PowerBASIC DLL Compiler, creates DLLs and executables for Win95/98/NT. Its
industry-standard DLLs may be accessed from any Windows language to enhance
capabilities and total performance. Both compilers offer multi-threaded
capabilities, inline assembler, pointers, unsigned integers, conditional
compilation, and much more.