See the slides of Lecture 12 on code generation for an approach to get started.
Develop tests for your compiler that explore the edge cases.
Define a transformation strategy that translates ChocoPy expressions to RISC-V code, including variable definitions, integer and boolean constants and operators.
Develop a simplification transformation that transforms expressions such that more concise code can be generated.
Follow the slides of Lecture 12 to set up a basic compiler pipeline:
compile-to-rv32im-ast :: Program -> RProgram
compile-to-rv32im-ast =
compile-cpy-to-cir
; compile-cir-to-rv32im
; compile-rv32im
Use debug(!msg)
or dbg(|msg)
to print the current term to the console.
You can also debug your generated code, by creating a .cpy
file, transforming it to an .rv32im
file using the Spoofax > Generation
menu, and copying the generated output to the online Venus editor, where you can use the Simulator
tab to run your code or step through each line, and inspect the memory and registers.
Develop a first compilation phase to transform ChocoPy programs to an intermediate language that makes control flow explicit.
compile-cpy-to-cir :: Program -> CProgram
compile-cpy-to-cir =
explicate-types
; desugar
; uniquify
; remove-complex-operands
; explicate-control
A useful transformation is to type specialize the constructors of the AST, such that the analysis results are no longer needed.
That is useful when applying transformations, since preserving those annotations cannot be done for all transformations.
The following explicate-types
transformation should be invoked on the AST before invoking the compiler transformation.
signature
constructors
AddInt : Exp * Exp -> Exp
rules
explicate-types =
innermost(type-specialize)
type-specialize :
add@Add(e1, e2) -> AddInt(e1, e2)
where <get-type> add => Int()
Note that the rules for translating integer additions should be adapted to reflect the change in constructors.
To improve the result of code generation, it can be useful to transform the source language expression. For example, left-associative additions may produce a better result.
desugar-exp :
AddInt(e1, AddInt(e2, e3)) -> AddInt(AddInt(e1, e2), e3)
Define a transformation to desugar
ChocoPy programs, making the task of downstream compilation easier.
This simplification can include constant folding, eventually. However, do not yet include constant folding rules, since you want to test the translation of operators. That is, your code generator should be able to translate arbitrary combinations of operators, so be prepared for the general case.
Think about (and try out in the online compiler!) the differences between adding integers and concatenating strings in RISC-V. In ChocoPy, they initially both use the Add(...)
constructor, so make sure to disambiguate them into separate constructors you will later use for code generation. The same holds for other operations, but you have to think about those yourself.
Define a transformation that eliminates shadowing. (See slides)
To simplify instruction selection, simplify expressions to only use atomic expressions (constants or variables) creating new variables to store intermediate results.
Use an intermediate language to explicate control.
That is, the explicate-control
strategy translates from ChocoPy ASTs
signature
sorts CID CINT CProgram CBlock CLabel CTail CStmt
CType CExp CAtom CVar
constructors
: string -> CID
: string -> CINT
CProgram : List(CBlock) -> CProgram
CBlock : CLabel * CTail -> CBlock
CLabel : CID -> CLabel
CReturn : CExp -> CTail
CReturnNone : CTail
CSeq : CStmt * CTail -> CTail
CVarDec : CVar * CType * CExp -> CStmt
CAssign : CVar * CExp -> CStmt
CIntT : CType
: CAtom -> CExp
CMin : CAtom -> CExp
CAdd : CAtom * CAtom -> CExp
CMul : CAtom * CAtom -> CExp
CDiv : CAtom * CAtom -> CExp
CInt : CINT -> CAtom
: CVar -> CAtom
CVar : CID -> CVar
Define a transformation from C-IR to RISC-V instructions:
compile-cir-to-rv32im :: CProgram -> RProgram
compile-cir-to-rv32im =
select-instructions-cprogram
The essence of this translation consists of instruction selection, i.e. mapping the high-level instructions of the intermediate language to concrete RISC-V instructions. For example, register allocation for integer constants, variables, and addition may be defined as follows:
select-instrs-exp(|RArg) :: CExp -> List(RLine)
select-instrs-exp(|x) :
CInt(i) -> [RLi(x, <cint-to-rint>i)]
select-instrs-exp(|x) :
CVar(y) -> [RMv(x, RVar(<cid-to-string>y))]
select-instrs-exp(|x) :
CAdd(y@CVar(_), z@CVar(_)) -> [RAdd(x, <cvar-to-rvar>y, <cvar-to-rvar>z)]
Note that because expressions have been transformed to applications of operators to atomic expressions, this step does not need to invent names for intermediate results.
We use RVar(x)
as symbolic registers in RISC-V code in order to postpone the mapping to concrete registers.
Define transformation rules for all operators of ChocoPy expressions.
RISC-V provides specialized instructions for some operations.
For example, the addi
instruction allows directly adding an integer constant (between -2048 and 2047) to a register.
A compiler can make use of such instructions, by detecting special patterns in the source language.
For example, the following rule, detects additions with an integer constant, and translates those to applications of addi
, avoiding the use of an extra register.
select-instrs-exp(|x) :
CAdd(z@CVar(_), CInt(i)) -> [RAddi(x, <cvar-to-rvar>z, <cint-to-rint>i)]
with debug(!"select-instrs-exp: ")
Can you detect other specialized instructions and corresponding source language patterns that provide a more concise and/or faster target code?
In ChocoPy we compile the Booleans True
and False
to integers in RISC-V (1
and 0
, respectively).
Implement the Boolean operators and
, or
and not
, and integer comparison operators ==
, !=
, <
, >
, <=
, >=
(you can ignore is
for now since it operates on objects).
You can make use of the online compiler, or the RISC-V instruction set to find the proper instructions in RISC-V.
Finally, we define a compiler stage that patches up the RISC-V code:
compile-rv32im :: RProgram -> RProgram
compile-rv32im =
assign-homes
; patch-instructions
Map symbolic registers to concrete registers. Either by mapping variables to actual registers or by using a stack discipline. We will look at register allocation next, so keep it simple.
This is the place for any mopping up that needs to be done.