GitHub - amoss/codestew: A flexible compiler backend

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 108 Commits
cogens		cogens
src		src
util		util
Makefile		Makefile
README		README
cropmd5		cropmd5

Repository files navigation

Rationale
---------
There are no raw sources for the testcases, everything is generated programatically from
C code.  This avoids the need for a parser for a testcase language, better models the
standard use as an API, and extends more easily to families of generated testcases.

Layout
------
Each of the testcases has a .cc generator in the cogen directory. The tests target in
the makefile will ensure that these are up to date. For each
  cogens/BASENAME.cc
an executable called
  cogens/BASENAME
will be created. This can then be executed to create output for the different target
machines. The tests target in the Makefile will try and generate:
  results/BASENAME-orig.dot       The SimpleMachine source in the testcase
  results/BASENAME-orig.pdf       As above
  results/BASENAME-arm.dot        Compiled to ARM instructions
  results/BASENAME-arm.pdf        As above
  results/BASENAME-arm-regs.dot   Compiled to ARM instructions and registers 
  results/BASENAME-arm-regs.pdf   As above
  results/BASENAME-arm.asm        Codeworks compatible ARM code
  results/BASENAME-x86.dot        Compiled to x86-64 instructions
  results/BASENAME-x86.pdf        As above
  results/BASENAME-x86-regs.dot   Compiled to x86-64 instructions and registers 
  results/BASENAME-x86-regs.pdf   As above

Testcases
---------
Each of the simple operation types has a range of testcases that vary the sizes of the
operands. The cases that use operands smaller than the word size of the target 
architecture should combine operands in an appropriate way to improve efficiency. The
cases that use operands larger than the word size of the target architecture should
split the operands into appropriate pieces. In all cases the appropriate split / merge
of operands is one that allows a sequence of instructions with equivalent semantics.

The xor testcases are interesting because the instruction is a piece-wise application
of a binary operator; operands can be split into arbitrary partitions without
introducing any dependencies. The merge operation is simply concatenation.

The add testcases can also be decomposed into arbitrary partitions, but a dependency
on the carry is introduced between each pair of operands in a split. Merging of the
operands eliminates the carry dependency.

The motivation for the mul testcases is to examine the trade-off between register
pressure and memory traffic that results from different traversal orders in the
accumulation of the partial products. 

The hamw testcase shows the sum operation; a generalisation of the add operation to an
arbitrary number of inputs. There are several well-known hacks for computing this 
function using few instructions, the research question behind this testcase is can 
a split/merge operation on sums and a conversion between sum-2/add create a search
problem with a strong enough bias to find these efficient formulations?

Issues
------

There is nowhere in the code/docs that defines the semantics of the instructions that
we use. There is no form of validity checking to enforce the semantics. There is no
definite list of what can or cannot be used in any of the machines. The only place that
the semantics are exposed is implicity in the translation functions.

The simple machine should avoid the use of carry flags, although every machine that we
compile directly onto will use them they are an artifact of the implementation and should
be avoided in the source language. Each use of an instruction that modifies an
architecturally unique and unnamed carry flag introduces a sequential dependency into
the instruction stream and breaks the declarative dataflow in the program.

The sensible way to enforce validity is to have a series of well-defined points for the
block where it is valid relative to the semantics of one particular machine. This implies
that each block must hold a reference to a particular machine that defines it. The only
exception will be during construction of blocks, which is deferred to the translation
functions that map from the semantics of one machine to another.

If we stick a machine in the block that is passed as an initialiser then it becomes 
impossible to use a default constructor for a block... would this be a problem later?
There are limited uses for arrays of blocks... jump tables and some representations of
CFGs... As these will be used sparingly we can tolerate arrays of pointers where 
necessary and use explicit initialisation...