The Alvilda Optimizing Compiler
Back in 1993 I had the good fortune to be part of a three-man team
writing a compiler for a compiler course at DAIMI, Aarhus University,
Denmark. The group was Anders Skovsted Buch, Jan Hvid
Sørensen, and me (Tommy Thorn). Now, we weren't content
with
just making a compiler -- we wanted an optimizing compiler with
state-of-the-art register allocation (scheduling wasn't a big thing at
that time). Furthermore, we set out to try the unproven internal
representation CORTL, suggested by Carl McConnell in his Thesis
proposal "Tree Based Optimizations". The project
was highly successful and vastly superior to any of the other groups
(but probably also took orders more effort). A few useful lessons
were learned on the way:
- Naively I had assumed that we didn't need an AST representation of
the program but could generate CORTL directly as we parsed, via the parsing
actions.
Well, we could, but it turned out to make the front much more convoluted than
necessary.
- Using CORTL wasn't an obvious advantage. The advantages of
always having structured representation was offset by the pain of
maintaining this representation. Notably code motion was tedious.
- Ambition and deadlines don't match. We didn't finish for the
dead-line and rather than submitting a partial solution, we instead
submitted nothing, re-enrolled in the course the following year, and
submitted the completed project.
In May 2004 I dug out the sources again and only had to
change three lines in the Makefile to make it compile.
The compiler (written in Gnu C) translates a program in the
Pascal-like language Alvilda to SPARC assembly.
Download
- The complete project with both the Alvilda and Amanda compiler: Alvilda-2004-05-04.tar.gz
- The final rapport (alas, without the benchmark results). In
danish: dOvs-rapport-uden-kode.ps.gz
Using Alvilda
Build it with a simple make
command. The default
operating mode is to compile with optimization and generate a SPARC
assembly file, however there are a number of options to change that:
./alvilda -h
Usage: ./alvilda [-hptcrdD] [-O
level] [-n num] [-m num] [-R symbol] [-o file] file
where options are
-h
print this text
-p
pipe mode. write code to standard output
-t
trivial memorybased register allocation
-c
check the consistency of the rtl-program
-r
include code for range checking
-d
include code for checking pointer dereferences
-D
debugging mode. write what the compiler is doing
-O
level
set optimizing level (default = 0)
-n
number max
number of arguments in registers
-m
number max
number of words to copy without using a loop
-o
filename write code to
named file
-R
symbol
output rtl code.
symbol must be a symbol printed in debugging mode or
REMINOR redefine register minor numbers
The option -R is useful to examine the result of a stage. To know
the name of a stage, first run alvilda with debugging enabled.
Examples:
./alvilda -D Test/C/unionfind.alv
Parsing Alvilda Program (PARSE)
Building Procedure Define and Use
Information (BCDU)
Conversion to Static Single
Assignment Form (SSA)
Common Subexpression Elimination
(CSE)
Dead Code Elimination (DCE1)
Dead Assignment Elimination (DAE1)
Resolving Memory Registers (MEM)
Constant Folding (CF)
Common Subexpression Elimination
(CSE)
Dead Code Elimination (DCE1)
Dead Assignment Elimination (DAE1)
Converting Comparisions to
If-instructions (CTI)
Multiplication by Constant values
(MCONST)
Multiplication by Procedure Call
(MCALL)
Removing Big Constants (BCONST)
Combining Instructions (COMBINE)
Dead Assignment Elimination (DAE2)
Register Allocation (ALLOC)
Stripping SSA Information (STRIP)
Jump Elimination (JUMPELIM)
Dead Code Elimination (DCE2)
./alvilda -R DCE1
Test/C/unionfind.alv
Procedure: 3 [use: %g7.1, m35.1, %fp.1, %i0.1, m149.1, %i1.1, m152.1,
m153.1, m34.1, %sp.1][def: m35.3[UNDEF], m149.3[UNDEF], m1
52.3[UNDEF], m153.6]
221:11: { [dep:1, rep:0]
228: 12: { [dep:1, rep:0]
229: r148.1 := %i0.1
233: r151.1 := %i1.1
245: m153.2 := update(m153.1, r151.1, r148.1)
246: 13: { [dep:1, rep:1]
742: r239.0 := phi(r148.1, r162.1)
449: m153.3 := phi(m153.2, m153.4)
250: r159.4 := r239.0
252: r162.1 := access(m34.1, r239.0)
752: r249.0 := r162.1 = 0
253: r161.1 := r249.0
753: r250.0 := not r249.0
254: r163.1 := r250.0
754: r251.0 := not r250.0
255: r165.1 := r251.0
256: if r165.1 then
257: depart 13 [0]
269: m153.4 := update(m153.3, r151.1,
r162.1)
270: repeat 13 [0]
246: }
447: m153.5 := m153.3
271: depart 12 [0]
228: }
428: m153.6 := m153.3
275: depart 11 [0]
221:}
...
Notice, procedure 3 happens to be the Alvilda procedure find:
procedure
find(p: in ufe; q: in out ufe) is
q:=p;
while not(q.up=null) loop
q:=q.up;
end loop;
end find;
./alvilda -R DCE2 Test/C/unionfind.alv
Procedure: 3 [use: %g7, m35, %fp,
%i0, m149, %i1, m152, m153, m34, %sp][def: m35, m149, m152, m153]
221:11:
228: 12: {
245: m153 :=
update(m153, %i1, %i0)
837: %l0 := %i0
246: 13: {
252:
%l0 := access(m34, %l0)
818:
if %l0 = 0 then
257:
depart 13
269:
m153 := update(m153, %i1, %l0)
270:
repeat 13
246: }
228: }
...
./alvilda -p Test/C/unionfind.alv
.text
.global _main
.align 4
Proc3:
save %sp, -72, %sp
st %i0, [%i1]
mov %i0, %l0
L13repeat:
ld [%l0], %l0
subcc %l0, 0, %g0
be L13depart
nop
b L13repeat
st %l0, [%i1]
L13depart:
ret
restore
...
A few explanations: memory access (aka. pointer dereference) is
modeled with an access
operation which takes two parameters: the abstract memory (~ alias
class) and the address. Memory
updates
accordingly takes three
arguments: the abstract memory, the address, and the value (always a
full word). Updating the memory creates a new abstract memory,
thus everything fits into simple scheme where memory operations aren't
special and the dependences are explicit (the only additional
constraint is that abstract memories are single-threaded, thus an
update cannot be moved ahead of any access using the same memory).
As can be seen from this little example, there's still room for
improvement but in general it's hard to move the update out of the loop
due to Alvilda's out parameter semantics which leads to interesting
alias problems.
Comments and questions are most welcome: alvilda$numba-tu.com (replace the '$' with a '@').
Last update: 2006-03-20