Linköpings universitet, SE–581 83 Linköping
+46 13 28 10 00, www.liu.se
Linköping University | Department of Computer Science
Master thesis, 30 ECTS | Datateknik
2018 | LIU-IDA/LITH-EX-A--18/014--SE
Efficient IR for the OpenModelica Compiler
Effektiv IR för OpenModelica-kompilatorn
Patrik Andersson, Simon Eriksson
Supervisor: Martin Sjölund
Examiner: Peter Fritzson
Copyright
The publishers will keep this document online on the Internet – or its possible re-
placement – for a period of 25 years starting from the date of publication barring
exceptional circumstances. The online availability of the document implies per-
manent permission for anyone to read, to download, or to print out single copies
for his/her own use and to use it unchanged for non-commercial research and
educational purpose. Subsequent transfers of copyright cannot revoke this per-
mission. All other uses of the document are conditional upon the consent of the
copyright owner. The publisher has taken technical and administrative measures
to assure authenticity, security and accessibility. According to intellectual prop-
erty law the author has the right to be mentioned when his/her work is accessed
as described above and to be protected against infringement. For additional in-
formation about the Linköping University Electronic Press and its procedures for
publication and for assurance of document integrity, please refer to its www home
page: http://www.ep.liu.se/.
© Patrik Andersson, Simon Eriksson
Abstract
The OpenModelica compiler currently generates code directly from a syntax
tree representation, which leads to inefficient code in several cases. This
thesis introduces a lower-level intermediate representation (IR) for the
compiler, which aims to simplify the compiler back end and enable more
optimizations. The resulting design features flat primitive operations and
control flow built from basic blocks and terminators. Variables are mutable,
unlike in SSA-based representations. Introducing the IR did not significantly
change the run-time performance of the test programs. The number of lines of
code was reduced to a quarter of that of the old back end. This reduction and
the simpler representation will help future work on optimization passes and
on an LLVM-based back end.
Contents
Abstract
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Aim
  1.4 Research questions
  1.5 Delimitations
2 Theory
  2.1 Compilers
  2.2 The Modelica language
  2.3 The OpenModelica environment
3 Method
  3.1 Design
  3.2 Implementation
  3.3 Performance evaluation
  3.4 Code complexity measurements
4 Results
  4.1 Overview of the MidCode design
  4.2 Performance measurements
5 Discussion
  5.1 Performance results
  5.2 Design of MidCode
  5.3 Code Complexity of MidCode
  5.4 Related Work
  5.5 The work in a wider context
6 Conclusion
  6.1 Future work
A Performance test functions
  A.1 Fibonacci
  A.2 Mandelbrot
  A.3 Quicksort
  A.4 Takeuchi function
Bibliography
List of Figures
2.1 Compilation of Haskell in GHC
2.2 Overview of translation phases in the OpenModelica compiler
2.3 Overview of the OpenModelica components
4.1 Overview of MidCode phases
List of Tables
2.1 Available LVALUE types in MIR
2.2 Available RVALUE types in MIR
2.3 Available terminators in MIR
4.1 Lines of code for corresponding parts of old back end
4.2 Lines of code for new back end
4.3 Performance measurements for old and new code generator
Chapter 1
Introduction
1.1 Background
OpenModelica is an open-source modeling and simulation environment developed
mainly by the non-profit Open Source Modelica Consortium (OSMC). It implements
the open standard Modelica modeling language. Modelica is a declarative
equation-based language designed for describing complex and dynamic systems
and can be used for simulating, for example, mechanical, electrical, hydraulic
and process-oriented systems. OpenModelica is mainly targeted towards
industrial and academic use.
1.2 Motivation
Currently, the C code generated by the OpenModelica compiler is inefficient in
many cases. This causes significant performance issues, especially since the
data inputs are often large. One major reason is that the code is generated
directly from a high-level, syntax-tree-based representation of the Modelica
code [2].
A better solution would be to first convert this representation to a
lower-level intermediate representation (IR) that is more suitable for
optimization and code generation, and only then generate the code. Later on,
it would be feasible to implement a stage that converts the new lower-level IR
to a more common representation such as LLVM.
1.3 Aim
The aim of this thesis is to design and implement a new efficient and
maintainable IR for OpenModelica. The new IR stage should be able to compile
various test programs with run-time performance roughly equal to that of the
old code generation, while simplifying the back-end code generation and
enabling useful lower-level transformations in the future. The new code
generation should be easy to extend with more lower-level optimizations and
new back ends. A future LLVM-based back end would be especially interesting in
the long term.
1.4 Research questions
1. How can a new IR help the implementation of optimizations and other code
transformations?
2. Which IR design choices (of some common alternatives) are most suitable
for OpenModelica?
3. How can this new IR be implemented in the OpenModelica compiler?
4. How much of the back end can be moved to a shared portable format, leaving
target-specific implementations simpler?
5. How will the new IR affect the run-time performance of OpenModelica?
1.5 Delimitations
This project will focus on evaluating some common IR approaches and on imple-
menting the new IR and its corresponding C code generator. The IR approaches to
evaluate should be low-level but platform-independent and well suited for trans-
forming to LLVM. Alternative special-purpose C code generator variants (for ex-
ample for parallelization or embedded devices) will not be considered.
While Modelica has many language features specific to simulation, such as
equation-based models and model connections, the project will focus on
implementing Modelica functions, their algorithm feature subset and the
MetaModelica extension, which provides various features common in functional
programming but not included in the Modelica standard, such as pattern
matching, tagged unions and linked lists. This part of Modelica is closer to
general-purpose programming and is therefore less complex and easier to
compare to other languages and their corresponding IR solutions. The Modelica
support for multi-dimensional arrays was deemed too complex and time-consuming
to implement, and was therefore skipped.
While performance improvements at this stage are of course desirable, the main
focus is on creating a solution that can later be extended with new
optimizations and back ends.
Chapter 2
Theory
2.1 Compilers
Modern compilers are commonly structured as a pipeline of several phases, each
taking a structure, transforming it, and sending it to the next phase. This
pipeline can be divided into a front end, which parses and analyzes the source
code, and a back end, which takes the structure produced by the front end and
converts it into an executable program [1].
Some common operations of the front end are lexical analysis (converting the
source text to more easily parsable tokens), syntax analysis (checking that
the syntax is correct and converting the token stream into an abstract syntax
tree), and semantic analysis (e.g. checking that the types are correct), while
some common tasks of the back end are optimization and code generation [1].
2.1.1 Optimization
In order to improve the performance, size and/or power consumption of a
generated program, compilers may attempt to optimize the generated code rather
than producing the most obvious conversion of the source code. An optimization
must give the same results as the unoptimized version and provide a
sufficiently large improvement, while being fast enough to still give
acceptable compilation times [1].
2.1.2 Intermediate representations
Intermediate representations are internal forms of a computer program, created
and used by a compiler in order to aid the compilation process. An
intermediate representation lies somewhere between the original source and the
compiled target and can be at different levels depending on its area of use;
it can be high-level (close to the original source code), low-level (close to
the target language) or something in between. The data structures used can
also vary; they can for example be a graph, a tree or a linear list. Compilers
often have multiple intermediate representations in their pipeline, with each
IR serving a different purpose and each phase converting the program to a
lower-level IR form [19].
One major advantage of having intermediate representations is that the
compiler can more easily be retargeted to new source languages or new target
platforms, reusing independent components for those. Instead of having to
write one compiler for every source/target combination, developers can just
add a single front end in order to support one source language, and a single
back end in order to support one target platform [19][1].
Common IR designs
One common linear representation of operations is three-address code, where
each operation is binary and represented by two source variables, one
destination variable and an operation type. These operations can be stored in
memory as records called quadruples, which store the three variables and the
operation type. A variable can be a named variable, a constant or a
compiler-generated temporary variable. Unary operations may be defined as
having just a single source variable. Expression trees are flattened by
storing intermediate results in temporary variables, which are then used as
source variables in later operations. Temporary variables are usually given
unique names and are not shared between different intermediate results.
Special call, jump and conditional operations can be implemented for
representing control flow [1].
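The flattening of an expression tree into quadruples can be illustrated with a minimal Python sketch. The names Quad and flatten, the tuple encoding of expressions and the temporary names t1, t2, ... are assumptions made for this illustration only; they are not taken from OpenModelica or from [1].

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Quad:
    """One three-address instruction: dest = src1 op src2 (hypothetical record)."""
    op: str
    src1: str
    src2: str
    dest: str


def flatten(expr, quads, fresh):
    """Flatten a nested (op, left, right) tuple into quadruples.

    Leaves are variable names or constants written as strings. The function
    returns the name of the variable that holds the value of expr.
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    a = flatten(left, quads, fresh)
    b = flatten(right, quads, fresh)
    temp = f"t{next(fresh)}"          # every intermediate result gets a unique temporary
    quads.append(Quad(op, a, b, temp))
    return temp


# (a + b) * (c - 2)
quads = []
result = flatten(("*", ("+", "a", "b"), ("-", "c", "2")), quads, count(1))
for q in quads:
    print(f"{q.dest} = {q.src1} {q.op} {q.src2}")
# t1 = a + b
# t2 = c - 2
# t3 = t1 * t2
```

Each temporary is written exactly once, which matches the convention described above that temporaries are not shared between intermediate results.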
One frequently used control flow representation is the control flow graph
(CFG). The instructions are partitioned into basic blocks, which are
instruction sequences where control flow can only enter through the first
instruction and exit through the last, which may be one of various jump
constructions. The basic blocks are then represented as nodes in the graph and
execution paths as directed edges between the blocks. Each basic block should
preferably contain as many instructions as possible without violating these
rules. The instructions used in basic blocks are primitive and in
three-address form. This representation simplifies many analyses and is
therefore useful for performing many optimizations [1].
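As a rough illustration of the partitioning, the following Python sketch splits a linear instruction list into basic blocks and derives the CFG edges. The instruction encoding (tuples tagged "label", "assign", "jump", "branch" and "return") and the function names are hypothetical and chosen only for this example.

```python
TERMINATORS = {"jump", "branch", "return"}


def split_blocks(instrs):
    """Partition instructions into basic blocks.

    A new block starts at every label and directly after every terminator,
    so control can only enter a block at its first instruction.
    """
    blocks, current = [], []
    for ins in instrs:
        if ins[0] == "label" and current:
            blocks.append(current)
            current = []
        current.append(ins)
        if ins[0] in TERMINATORS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def successors(blocks):
    """Return the CFG as a dict from block index to successor block indices."""
    label_to_block = {b[0][1]: i for i, b in enumerate(blocks) if b[0][0] == "label"}
    edges = {}
    for i, block in enumerate(blocks):
        last = block[-1]
        if last[0] == "jump":
            edges[i] = [label_to_block[last[1]]]
        elif last[0] == "branch":
            edges[i] = [label_to_block[last[2]], label_to_block[last[3]]]
        elif last[0] == "return":
            edges[i] = []
        else:  # no terminator: execution falls through to the next block
            edges[i] = [i + 1] if i + 1 < len(blocks) else []
    return edges


# i = 0; while i < n: i = i + 1
program = [
    ("assign", "i", "0"),
    ("label", "loop"),
    ("assign", "c", "i < n"),
    ("branch", "c", "body", "done"),
    ("label", "body"),
    ("assign", "i", "i + 1"),
    ("jump", "loop"),
    ("label", "done"),
    ("return",),
]
blocks = split_blocks(program)
edges = successors(blocks)
print(edges)  # {0: [1], 1: [2, 3], 2: [1], 3: []}
```

The back edge from block 2 to block 1 is what makes the loop visible to analyses that work on the graph rather than on the instruction text.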
Another, complementary representation for tracking data is static single
assignment (SSA), which differs from plain three-address code in that each
variable cannot be assigned more than once in a function, which simplifies
data flow analysis significantly. This gives SSA the important property of
referential transparency, meaning that a reference can be replaced with its
definition, and therefore that the variable values are independent of the
order in which their statements are listed. Referential transparency also
allows a computation to be replaced by its result, which enables well-known
transformations such as common subexpression elimination. As such, compilers
that perform data flow analysis can do a conversion pass over the non-SSA
representation and produce an SSA-based representation that is easier to
reason about. SSA is used together with a basic block structure and uses a
special φ (phi) function where execution paths merge. The φ function takes a
list of source variables and assigns one of them to a new variable depending
on the previously executed block. The same transformations can in general be
made with other methods, but SSA has the advantage of being both intuitive and
efficient, making more optimizations easier to implement while also enabling
fast compilation times, often fast enough that it can be used in just-in-time
compilers [17].
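The renaming idea behind SSA can be sketched in Python for the simplest case, a single basic block without branches. The function names to_ssa and phi, the (dest, op, args) encoding and the x_1, x_2 naming scheme are assumptions of this sketch; a full SSA construction also has to place φ functions at merge points, which is not shown here.

```python
def to_ssa(assignments):
    """Rename straight-line (dest, op, args) assignments into SSA form.

    Every assignment creates a new version of its destination, and every use
    refers to the latest version. Names that were never assigned (function
    inputs and constants) are left unchanged.
    """
    version = {}
    out = []
    for dest, op, args in assignments:
        new_args = [f"{a}_{version[a]}" if a in version else a for a in args]
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}_{version[dest]}", op, new_args))
    return out


def phi(previous_block, incoming):
    """Model of a phi function: pick the value that belongs to the block
    that was executed immediately before the merge point."""
    return incoming[previous_block]


# x = a + b; x = x * 2; y = x + 1
program = [("x", "+", ["a", "b"]), ("x", "*", ["x", "2"]), ("y", "+", ["x", "1"])]
ssa = to_ssa(program)
for dest, op, args in ssa:
    print(f"{dest} = {args[0]} {op} {args[1]}")
# x_1 = a + b
# x_2 = x_1 * 2
# y_1 = x_2 + 1
```

After renaming, each name has exactly one definition, so a use such as x_2 can be traced to its defining statement without inspecting statement order.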
Example: GCC
One example of IR usage is GCC, which has two intermediate representations
called GENERIC and GIMPLE. GENERIC represents a function and its statements as
tree structures, while GIMPLE is a subset of GENERIC, produced by a process
called gimplification and used in the optimization stage. Both representations
are independent of the source programming language [12].
Example: LLVM
LLVM is an IR that is, among other things, used in the Clang compiler for C.
LLVM is also used as a back end by other compilers, for example those for Rust
and Swift, and GHC (for Haskell). These compilers use LLVM for optimizations
and code generation. LLVM also aims to be a portable format by supporting
several targets for code generation. LLVM uses a basic block and terminator
model for control flow. The instructions are in three-address code form with
static single assignment variables.
Example: Rust and MIR
The official compiler for the Rust programming language has added a new IR
named MIR (Mid-level Intermediate Representation) between its high-level AST
and its low-level LLVM code generation, whereas previously the LLVM code was
generated directly from the AST. The design of MIR is based on primitive
three-operand statements and basic blocks with terminators. This design makes
translation to LLVM, which also uses primitive operations and a similar
control flow representation, relatively simple [10].
Some of the main goals of MIR in Rust are improving compilation time by having
more efficient data structures, enabling more Rust-specific optimizations,
reducing redundancy in the code base and making optimizations and other
transformations easier to work with and reason about in general [10].
One notable difference from LLVM is that MIR is not SSA-based, i.e. it allows
multiple assignments to the same variable, and named variables are kept as-is.
However, generated temporaries are still typically single-assignment. As more
advanced optimizations relying on SSA representations are typically done by
LLVM, an lvalue-based representation rather than SSA is considered sufficient
for this purpose [10][11].
The complete Rust language, including its various syntactic sugar
constructions, is reduced to a small subset that is easier to work with:
redundant variations of a single low-level feature, which previously all had
to be handled separately, are now represented by fewer variants, meaning that
the analyses have fewer cases to handle. Where control flow analyses were
previously done on separate control-flow graphs that had to be generated from
the AST, they can now be done directly on MIR. The Rust safety analyses are
also more accurate, since the lower-level nature of MIR makes the difference
between the analyzed structure and the final code smaller. Rust-specific
optimizations can be done directly as a separate stage, whereas they were
previously often done during conversion to LLVM, adding unwanted complexity to
that conversion phase. Apart from simplifying LLVM generation, MIR also adds
potential for adding other low-level back ends in the future [11].
The MIR data structure describes the workings of a single function and
contains a control-flow graph stored as a list of basic blocks, a list of
compiler-generated temporary variables, and a list of user-declared variables.
A single basic block contains a list of statements and a terminator, which
describes the control-flow action that occurs at the end of the basic block
execution. A statement can either be a variable assignment or a drop
(deallocation) of a variable, which is described explicitly, unlike in the
source language. An assignment statement contains an rvalue for the right-hand
side and an lvalue for the left-hand side.
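The shape of such a function record can be sketched in Python as follows. This is a loose model for illustration, not the Rust compiler's actual data structures: all class and field names are hypothetical, rvalues are modelled as Python callables rather than as data, drop statements are left out, and the return value is assumed to live in a variable named "ret".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Assign:
    """Statement: evaluate the rvalue in the current environment, store it in lvalue."""
    lvalue: str
    rvalue: Callable[[Dict[str, int]], int]


@dataclass
class Goto:
    target: int          # index of the next basic block


@dataclass
class If:
    cond: str            # variable holding the truth value
    then_block: int
    else_block: int


@dataclass
class Return:
    pass


@dataclass
class BasicBlock:
    statements: List[Assign]
    terminator: Union[Goto, If, Return]


@dataclass
class Function:
    args: List[str]
    user_vars: List[str]
    temps: List[str]
    blocks: List[BasicBlock]


def run(fn: Function, *arg_values: int) -> int:
    """Interpret fn starting in block 0; the result is read from the variable "ret"."""
    env = dict(zip(fn.args, arg_values))
    current = 0
    while True:
        block = fn.blocks[current]
        for stmt in block.statements:
            env[stmt.lvalue] = stmt.rvalue(env)
        term = block.terminator
        if isinstance(term, Return):
            return env["ret"]
        if isinstance(term, Goto):
            current = term.target
        else:
            current = term.then_block if env[term.cond] else term.else_block


# sum_to(n) = 1 + 2 + ... + n, written as four basic blocks
sum_to = Function(
    args=["n"],
    user_vars=["total", "i", "ret"],
    temps=["t0"],
    blocks=[
        BasicBlock([Assign("total", lambda e: 0), Assign("i", lambda e: 1)], Goto(1)),
        BasicBlock([Assign("t0", lambda e: e["i"] <= e["n"])], If("t0", 2, 3)),
        BasicBlock([Assign("total", lambda e: e["total"] + e["i"]),
                    Assign("i", lambda e: e["i"] + 1)], Goto(1)),
        BasicBlock([Assign("ret", lambda e: e["total"])], Return()),
    ],
)
print(run(sum_to, 4))  # 10
```

Note that total and i are assigned several times, which is allowed in an lvalue-based representation, while the temporary t0 only ever receives the loop condition.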
An lvalue can be a variable of different kinds, such as a named, temporary,
argument or return variable, a field in a struct or tuple, a pointer
dereference, an array index, or an enum downcast [11], see table 2.1.
LVALUE form            Meaning
B                      User-declared variable binding
TEMP                   Compiler-generated temporary
ARG                    Function argument
RETURN                 Return value
LVALUE.f               Struct or tuple field
*LVALUE                Pointer dereference
LVALUE[LVALUE]         Array index
(LVALUE as VARIANT)    Enum downcast
Table 2.1: Available LVALUE types in MIR
An rvalue represents an expression and can be the use of an lvalue, a mutable
or immutable reference, a cast, a constant, a literal of a struct or built-in
container type, the length of an object, or a common simple binary or unary
operation [11], see table 2.2. As table 2.2 shows, most rvalue operations only
take lvalues as arguments, meaning that constants and data structure literals
can only be used through temporary variables. The special BOX value represents
the memory allocation function, which takes the struct constructor method as
its sole argument and is used in the MIR call representation just like other
functions [11].
A terminator in MIR can jump to another basic block with or without stack
unwinding, jump to one of two specified basic blocks depending on the truth
value of a variable, jump to one basic block from a list depending on the
value of a variable, call a function and afterwards jump to one of two basic
blocks depending on whether the function succeeded or failed, or simply return
from the function with or without stack unwinding [11], see table 2.3.
RVALUE form                   Meaning
Use(LVALUE)                   Value of LVALUE
[LVALUE; LVALUE]              Array literal of specified size with the same
                              defined value for all cells
&’REGION LVALUE               Reference to LVALUE
&’REGION mut LVALUE           Mutable reference to LVALUE
LVALUE as TYPE                Cast
LVALUE <BINOP> LVALUE         Binary operation
<UNOP> LVALUE                 Unary operation
Struct { f: LVALUE0, ... }    Struct literal
(LVALUE...LVALUE)             Tuple literal
[LVALUE...LVALUE]             Array literal
CONSTANT                      Constant
LEN(LVALUE)                   Length of LVALUE
BOX                           Memory allocation function for box operator
Table 2.2: Available RVALUE types in MIR
Example: Swift and SIL
Similar to the Rust compiler, the official Swift compiler has also added a new
mid-level IR between the AST and the generated LLVM code, named SIL (Swift
Intermediate Language). Unlike MIR, SIL is SSA-based, but replaces the phi
node concept with basic block arguments that are set by the terminators
jumping to that block. Within the block, the argument variables work like
typical source variables. As with MIR, literals have to be saved in
temporaries before they can be used in operations. Calls are implemented
differently in SIL: while MIR implements calls as terminators, SIL implements
them as regular statements. Operators are also implemented as calls to
built-in functions rather than as special rvalue constructions as in MIR. More
low-level memory operations are stored as explicit constructions than in MIR,
including heap and stack allocations, memory accesses and reference counting
handling [20].
Terminator                    Meaning
GOTO(BB)                      Jump unconditionally to basic block BB
PANIC(BB)                     Start stack unwinding and jump to basic block
                              BB for cleanup
IF(LVALUE, BB0, BB1)          Jump to BB0 if LVALUE is true, otherwise jump
                              to BB1
SWITCH(LVALUE, BB...)         Jump to one of the listed basic blocks
                              depending on value of enum LVALUE
CALL(LVALUE0 = LVALUE1(LVALUE2...), BB0, BB1)
                              Call function referenced in LVALUE1 with
                              arguments in LVALUE2 onwards, store return
                              value in LVALUE0, jump to BB0 if call
                              succeeded or BB1 if it panicked
DIVERGE                       Return and unwind stack
RETURN                        Return
Table 2.3: Available terminators in MIR
Example: Glasgow Haskell Compiler
One major compiler for the functional language Haskell is the Glasgow Haskell
Compiler, GHC, which is an open source project. One of GHC’s IRs is Core.
While being an IR, Core corresponds well to a simple source-level language
which eliminates superfluous ways to express the same language construct [9].
For example, a list comprehension needs to be changed from a native Haskell
construct into an expression based on variable bindings and functions in Core.
In Core, case-expressions are also restricted, as they cannot match nested
constructors of a value [7]. A case-expression is used to see which member of
a union a value contains, as well as to access the attributes of the record.
Core also flattens expressions by restricting their usage. An argument to a
function must be a literal or a variable (called an atom in the paper), so the
dependencies between function calls are explicitly ordered by variable
bindings.
GHC has another, lower-level IR: the Spineless Tagless Graph Reduction
Machine, or STG [8]. The difference between STG and Core is that Core is meant
to simplify expressions in a functional setting, while STG is meant to help
simplifications targeted at modern processors. As such it specifies
operational semantics, unlike Core. In addition, all type information is lost
when transforming Core to STG.
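[Figure: the GHC pipeline. Parse Tree is desugared to Core, Core is converted
to STG (STGify), and STG is converted to Cmm (CodeGen). From Cmm, assembly is
produced by the NCG, or by way of LLVM and the LLVM compiler, or by way of C
and a C compiler.]
Figure 2.1: Compilation of Haskell in GHC [9]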
The operational semantics include a stack for arguments, returns, and the
implementation of the lazy calling convention. Arguments are pushed when a
function application is evaluated and popped when entering closures with
arguments. The return entries on the stack are actually not for function
returns: since the only evaluation comes from pattern matching, each entry is
for the result of a pattern match. The lazy calling convention is implemented
with a stack entry that causes a suspended computation to be overwritten in
memory with the value just computed.
STG also has a heap, which contains all allocated values until they are
deallocated by garbage collection. An important feature for long-running
computations in a lazy language is black holes. When a computation is entered,
it is replaced by a black hole, which does not keep any of the references of
the computation alive, although the ones still in use while evaluating the
computation are kept alive. This means that if garbage collection is performed
while evaluating the black-holed value, more things can be collected. For
example, in code for finding the last value of a long linked list, earlier
elements can be collected even if garbage collection happens in the middle of
evaluation. Additionally, if evaluation tries to evaluate a black hole that it
has itself created, an infinite loop has been detected, so an exception can be
thrown.
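The black hole mechanism can be imitated in a few lines of Python. The sketch below is only an analogy under simplifying assumptions: the class Thunk, the sentinel _BLACK_HOLE and the exception BlackHoleError are hypothetical names, and Python's own memory management stands in for the STG heap and garbage collector.

```python
class BlackHoleError(Exception):
    """Raised when a suspended computation depends on its own result."""


_BLACK_HOLE = object()   # sentinel marking a computation that is being evaluated


class Thunk:
    """A suspended computation that is evaluated at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self._done = False

    def force(self):
        if self._done:
            return self._value
        if self._compute is _BLACK_HOLE:
            raise BlackHoleError("thunk was entered while already being evaluated")
        # Replace the computation by a black hole before running it. The thunk
        # no longer references the closure, so data that only the closure kept
        # alive can be reclaimed as soon as the running code stops using it.
        compute, self._compute = self._compute, _BLACK_HOLE
        self._value = compute()
        self._done = True
        return self._value


ok = Thunk(lambda: 1 + 2)
print(ok.force())        # 3

# A thunk whose value is defined in terms of itself
loop = Thunk(lambda: loop.force() + 1)
try:
    loop.force()
except BlackHoleError as err:
    print("detected:", err)
```

The second example corresponds to the infinite-loop detection described above: the inner call finds the black hole left by the outer call and raises instead of recursing forever.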
Further down in the compilation pipeline we find Cmm, which is a
processor-portable intermediate language reminiscent of LLVM. Cmm consists of
simple control flow between blocks, basic types that reflect machine
representation, and an unlimited number of stack-backed variables [21]. Cmm
contains no type information except for machine-level representations such as
32-bit signed integers. It also explicitly represents the heap and stack, as
well as writes to byte addresses. As can be seen in figure 2.1 [9], there are
several back ends that start from Cmm and then generate assembly.
2.2 The Modelica language
Modelica is a declarative and object-oriented language developed for
equation-based modeling of complex and dynamic physical systems. It can be
used for simulating, for example, mechanical, electrical, hydraulic and
process-oriented systems [5]. The Modelica standard exists in multiple
implementations and is governed by the international non-profit Modelica
Association [16]. Systems can be separated into smaller components, which can
then be connected to other components and distributed in model libraries. This
enables equation systems to be reused and combined to make larger systems.
Many common standard components are distributed by the Modelica Association in
their Modelica Standard Library [16].
2.2.1 Primitive types and arrays
The primitive types supported are integers, reals (floating-point), booleans,
strings, enumerations and a special clock type used for synchronous systems.
Multi-dimensional arrays are also supported, and can have dimension sizes that
are unspecified at compile time. In addition, a data type for complex numbers
is implemented in the standard Modelica library [4].
Some of the primitive operations supported in expressions are scalar arithmetic
operations (such as addition, subtraction, division, multiplication and exponenti-
ation), elementwise arithmetic operations on arrays, comparisons, logical opera-
tions, and if-expressions[4].
2.2.2 Models and equations
Modelica model classes describe the system to be modelled as a system of
variables with optional initial values and differential, algebraic and
discrete equations, which can then be compiled and solved by the Modelica
implementation for a given time slice. The class defined at the top of the
program is automatically instantiated, and other classes can be instantiated
by declaring them as variables in the top class [4].
Each equation consists of two expressions, one on each side of an equality (=)
operator. The equations are not affected by the order in which they are listed
and are acausal, meaning they do not have a fixed data flow direction. In
order to support variation over time, variables can be wrapped in the der()
time derivative operator, and the time variable can also be accessed directly
as time. For-loops can also be used to declare repetitive series of equations
in a shorter way [16][4].
Variables can optionally have defined initial values, and models also support
additional variable types such as named constants and parameters, which unlike
normal named constants can be set before simulation without recompiling [4].
For example, a pendulum can be modelled as in the following example, taken
from page 21 of Principles of Object-Oriented Modeling and Simulation with
Modelica 3 [4]. This model contains both differential equations and algebraic
equations, and is therefore an example of a differential algebraic equation
system (DAE). The system can be simulated by calling the simulate function,
for example by writing simulate(Pendulum, stopTime=6) [4], and then plotted by
calling the plot function with the variable to be plotted as its argument [4].
model Pendulum
  parameter Real m=1, g=9.81, L=0.5; // mass, gravity, length of pendulum
  Real F; // force
  output Real x(start=0.5), y(start=0); // x and y position with set start values
  output Real vx, vy; // x and y velocity
equation
  m * der(vx) = -(x / L) * F;
  m * der(vy) = -(y / L) * F - m * g;
  der(x) = vx;
  der(y) = vy;
  x^2 + y^2 = L^2;
end Pendulum;
2.2.3 Model inheritance
Models can extend other models, and thereby provide more specialization while
reusing code, similar to hierarchical class inheritance in typical
object-oriented languages. By inheriting equations, data variables and class
members from a base class, a subclass can inherit part of its behaviour while
modifying and extending it with additional equations and variables [4].
Model classes can be partial, meaning that their equation systems are
underspecified and can only be made solvable by extending them with subclasses
providing additional equations. This can be seen as an analog to abstract
classes in object-oriented languages. Variables of an instance are accessed
through dot syntax, though they can be protected from outside access by
putting them in the protected section, which blocks direct access from outside
but still makes them available in submodels [16][4].
Classes can also contain variables with type declarations that are replaceable
by subclasses, similar to generics in other languages. A field with a
replaceable type is simply prefixed by the replaceable keyword. To make a new
class based on a class with replaceable types, a new type definition
specifying the types is made, which can then be instantiated like a regular
class [4].
2.2.4 Connections
Model instances can be connected to each other through special
connect-equations in order to create larger systems. The interfaces for these
connections are specified by connector classes, which contain a list of the
variables that are carried by the signals. Variables in a connector can
optionally be declared as flow variables, indicating that the values of all
connected signals will sum to zero instead of being equal [4].
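The two kinds of equations that result from connecting several connectors at one node can be sketched in Python as below. The function name connection_equations, the string encoding of equations and the component names R1, C1 and L1 are hypothetical; the sketch also ignores the sign conventions that the Modelica specification defines for connectors at different hierarchy levels.

```python
def connection_equations(pins, potential="v", flow="i"):
    """Generate the equations for a set of connectors joined at one node.

    Non-flow (potential) variables are set equal to each other, while flow
    variables are summed to zero, as in Kirchhoff's current law.
    """
    first = pins[0]
    equations = [f"{first}.{potential} = {p}.{potential}" for p in pins[1:]]
    equations.append(" + ".join(f"{p}.{flow}" for p in pins) + " = 0")
    return equations


for eq in connection_equations(["R1.n", "C1.p", "L1.p"]):
    print(eq)
# R1.n.v = C1.p.v
# R1.n.v = L1.p.v
# R1.n.i + C1.p.i + L1.p.i = 0
```

For n connected pins this yields n - 1 equality equations and one sum-to-zero equation, so the node contributes exactly n equations.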
Connections are generally acausal, meaning that they, like equations, lack a
specified data direction, but they can also be specified as input or output
connections, meaning that they can only receive from or send to a component,
respectively [4].
When connecting one variable in a component to many subcomponents without
having to make a large number of connect-equations explicitly, it can be made
implicitly by pre�xing the shared variable in the top component with the innerkeyword and declaring a reference variable with the same name in the subcom-
ponents pre�xed by the outer keyword[4].
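As a rough illustration of what a set of connected pins means in terms of equations, the following Python sketch generates the equations for one connection set: the non-flow variables are set equal, while the flow variables sum to zero. Python is used only as a neutral notation here. The function name, the default variable names and the string form of the equations are assumptions made for this example, and the sketch ignores details such as sign conventions for connectors of the enclosing model.

```python
def connection_equations(pins, potential="v", flow="i"):
    """Return the equations implied by connecting all pins in `pins`.

    `pins` is a list of pin names such as "resistor.n". The variable
    names `potential` and `flow` are hypothetical defaults.
    """
    equations = []
    # Non-flow variables of all connected pins are equal.
    for first, second in zip(pins, pins[1:]):
        equations.append(f"{first}.{potential} = {second}.{potential}")
    # Flow variables of all connected pins sum to zero.
    flow_sum = " + ".join(f"{pin}.{flow}" for pin in pins)
    equations.append(f"{flow_sum} = 0")
    return equations


eqs = connection_equations(["resistor.n", "capacitor.p", "out"])
for eq in eqs:
    print(eq)
```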
Discrete events
Discrete instantaneous events can be modelled by using the when-statement,
which only activates its subequations at the exact moment in time when
one or more of its condition expressions transition to true. Discrete and
continuous components can be freely combined to create hybrid systems. A
when-statement can contain a special reinit equation that resets a variable to a new
value on the event. In a reinit equation, the previous value of the variable can
be accessed through the pre operator. Apart from when-statements, simple
if-expressions and if-statements in normal equations may also be used to model
discrete changes[4].
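The event semantics described above can be imitated in a few lines of Python. The sketch below simulates a bouncing ball with a fixed-step explicit Euler integrator: the bounce fires only on the step where the condition h <= 0 changes from false to true, and the new velocity is computed from the value just before the event, similar to reinit combined with pre. This is only an illustration under simplifying assumptions (fixed step size, no event localisation), not how a Modelica tool actually solves hybrid systems. All names and parameter values are chosen for the example.

```python
def simulate_bouncing_ball(h0=1.0, e=0.7, g=9.81, dt=1e-3, t_end=1.0):
    """Fixed-step simulation of a ball with a when-style bounce event."""
    h, v = h0, 0.0
    bounces = 0
    condition_before = h <= 0.0
    for _ in range(int(round(t_end / dt))):
        # Continuous part: der(h) = v, der(v) = -g
        h += dt * v
        v -= dt * g
        condition_now = h <= 0.0
        # The "when" body fires only when the condition becomes true.
        if condition_now and not condition_before:
            pre_v = v          # value just before the event, like pre(v)
            v = -e * pre_v     # like reinit(v, -e*pre(v))
            bounces += 1
        condition_before = condition_now
    return h, v, bounces


height, velocity, bounce_count = simulate_bouncing_ball()
print(height, velocity, bounce_count)
```

Because the condition is edge-triggered, the bounce is not repeated on the following steps even though the height may stay slightly below zero for a short while.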
Basic electronics example
In listing 1, we take examples from Principles of Object-Oriented Modeling and
Simulation with Modelica 3 to give a taste of Modelica. The listing defines electrical
components in Modelica by defining variables, equations, connectors and using
inheritance so that shared equations can be defined in a single partial superclass
[4].
Packages
In order to avoid name conflicts and simplify sharing code, libraries can be
distributed as packages, which give all content in the library its own hierarchical
namespace. Packages can then be imported into another package with the
import keyword, which optionally allows importing namespaces directly at the
top level within the package. Within a package, an imported namespace can be
given a custom name so that typing can be reduced without risking name conflicts
as with top-level imports.
type Voltage = Real(unit="V");
type Current = Real(unit="A");
type Resistance = Real(unit="Ohm");
type Capacitance = Real(unit="F");

connector Pin "Electrical pin"
  Voltage v;
  flow Current i;
  // the flow keyword indicates that any connected
  // variables should sum to zero
end Pin;

partial model TwoPin "Electrical component with two pins"
  // partial since it does not have enough equations
  // to be fully defined
  Pin p, n;
  Voltage v;
  Current i;
equation
  v = p.v - n.v;
  0 = p.i + n.i;
  i = p.i;
end TwoPin;

model Resistor
  extends TwoPin;
  // include all variables and equations from TwoPin
  parameter Resistance R;
equation
  R*i = v;
end Resistor;

model Capacitor
  extends TwoPin;
  parameter Capacitance C;
equation
  C*der(v) = i;
end Capacitor;

model Ground
  Pin p;
equation
  0 = p.v;
end Ground;

model LowPass
  Pin pin_in, pin_out;
  parameter Resistance R;
  parameter Capacitance C;
  Resistor resistor(R=R);
  Capacitor capacitor(C=C);
  Ground ground;
equation
  connect(pin_in, resistor.p);
  connect(resistor.n, pin_out);
  connect(pin_out, capacitor.p);
  connect(capacitor.n, ground.p);
end LowPass;
Listing 1: Models for basic electronics simulation
2.2.5 Functions and algorithms
More traditional imperative code can be written in Modelica inside algorithm
sections. Unlike in normal equation sections, variables are assigned values directly
with the := assignment operator, and they can be assigned multiple times within
a single section. Both recursion and common imperative control flow statements
such as if-then-else, for and while are supported. Algorithm sections in Modelica
are pure, i.e. without side effects and global state, in order to support safe usage
inside equation systems[4].
The special function class type can be used for implementing named mathematical
functions using algorithm sections. Functions can have multiple input variables
and, unlike many other languages, multiple output variables as well. Functions
can also declare local variables inside protected sections for use in the algorithm
section. [4]
Two examples of implementations for the factorial function are provided below:
function factorial_recursive
  input Integer i;
  output Integer o;
algorithm
  if i > 1 then
    o := i * factorial_recursive(i-1);
  else
    o := 1;
  end if;
end factorial_recursive;

function factorial_imperative
  input Integer i;
  output Integer o;
protected
  Integer acc;
algorithm
  acc := 1;
  for x in 2:i loop
    acc := x*acc;
  end for;
  o := acc;
end factorial_imperative;
2.2.6 MetaModelica
MetaModelica is an extended version of Modelica designed for modeling program-
ming languages. It complements the algorithm support in Modelica with various
features common to functional programming, such as tagged unions with support
for recursion, linked lists, tuples, and pattern matching. It also adds support for
exception handling and generics [6].
Parameterized types
Parameterized types enable types to be specialized by another type as a parameter,
and are similar to generics in other programming languages. Most of the new
built-in types in MetaModelica support type parameters[6].
Lists
Lists contain an arbitrary number of objects of a single type. Lists are
implemented as immutable linked lists like in many functional languages, which
enables parts of lists to be shared between different lists. New lists can be
created in constant time by inserting new values before existing lists with the
:: (cons) operator[6]. However, some operations like appending, getting a value
from a specific index, and calculating the list length have linear time
complexity. Lists can be created either with the cons operator or by
brace-surrounded list literals listing all values in the list. The literal
syntax is also used to represent the empty list {}[13].
In addition, pattern matching can be used for extracting values from or comparing
lists[6]. MetaModelica also has several built-in methods for performing various
operations on linked lists[13]:
listAppend — Returns a copy of a list concatenated with another list
listDelete — Returns a copy of a list with the object at a specified index
removed
listEmpty — Returns a boolean indicating if a list is empty (has length 0)
listHead — Returns the first object in a list
listGet — Get an object in a list by index (1-indexed)
listMember — Returns a boolean indicating if a list contains a speci�c value
listLength — Returns the length of a list
listRest — Returns the tail of the linked list (every object except the first)
listReverse — Return a reversed copy of a list
list<Integer> l, l2, l3; //variable declaration

l := {3, 4, 5}; //list literal
l2 := 2 :: l; //creating a new list {2, 3, 4, 5} with the cons operator
i := listGet(l, 2); //accessing the second value in the list (4)
len := listLength(l); //getting the list length (3)
l3 := listReverse(l); //getting a reversed list ({5, 4, 3})
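To make the cost model of immutable linked lists more concrete, the following Python sketch models a cons cell and shows that prepending a value reuses the existing list as the tail, while length and indexed access have to walk the list. This is only an analogy written in Python. The class and function names are invented for this example and do not correspond to the MetaModelica runtime.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Cons:
    """One immutable list node; a tail of None plays the role of {}."""
    head: Any
    tail: Optional["Cons"]


def cons(value, lst):
    """Constant time: allocates one node and reuses `lst` as the tail."""
    return Cons(value, lst)


def list_length(lst):
    """Linear time: walks the whole list."""
    count = 0
    while lst is not None:
        count += 1
        lst = lst.tail
    return count


def list_get(lst, index):
    """Linear time, 1-indexed like listGet."""
    for _ in range(index - 1):
        lst = lst.tail
    return lst.head


l = cons(3, cons(4, cons(5, None)))
l2 = cons(2, l)  # shares all nodes of l instead of copying them
print(l2.tail is l, list_length(l2), list_get(l, 2))
```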
Tuples
Tuples contain an arbitrary number of objects of mixed types, and can be seen
as a way to create simple records without having to write record declarations.
Values in the tuple can be accessed either through pattern matching or by dot
notation, denoted by following the tuple with a dot and the index of the object
(1-indexed)[6].
tuple<Integer, String, list<Real>> t; //variable declaration

t := (12, "hello", {1.0, 2.0, 3.0}); //tuple literal
i := t.2; //accessing the second value through dot notation
Union types
Union type objects store record data with a type-safe constructor describing their
variant, and are similar to algebraic data types in functional programming. One
or more record types can be defined for a single union type. Union type instances
are also immutable, i.e. their fields cannot be modified after they have been created.
Union types can be recursive, meaning that they can have fields of their own type,
and are therefore useful for describing tree structures, such as abstract syntax
trees. Pattern matching can be used for checking and extracting field values[6].
uniontype Number
  record INT
    Integer int;
  end INT;
  record RATIONAL
    Integer int1;
    Integer int2;
  end RATIONAL;
  record REAL
    Real re;
  end REAL;
  record COMPLEX
    Real re;
    Real im;
  end COMPLEX;
end Number;

Number a; //variable declaration

a := RATIONAL(8, 13); //literal with RATIONAL constructor
a := REAL(1.618033); //literal with REAL constructor
Option types
Option type values either carry a single field of a specific type or none at all, and
are generally used for cases where objects are optionally defined. They are
implemented as a built-in parameterized union type with the constructors NONE()
or SOME(x), where x is an object of the parameter type. The constructor can be
checked with the isSome and isNone functions, and option type values can also
be unpacked with pattern matching like other union types[6].
Option<String> o; //variable declaration

o := NONE(); //none literal
o := SOME("hej"); //some literal

if isNone(o) then
  ...
end if;
Pattern matching
One of the most important features in MetaModelica is its pattern matching
support, which is similar to pattern matching in many functional languages. This can
be used for more advanced control flow and enables simple and powerful handling
of structural data[6].
The cases are tested in the order they are listed. Each case contains a pattern,
the body to be executed, and a case return expression that is calculated and
returned by the match expression after the body has finished. The unit value ()
can be returned if an actual return value is not desired. The return value can also
be a tuple, allowing multiple values to be returned. The return values of all cases
in a single match statement are required to be of the same type. The body of each
case can either be an algorithm section or an equation section. Equation sections
are however not allowed to contain differential equations. A match statement can
have its own set of local variables, which can also be used for pattern binding[6].
Patterns that can be matched in a case include scalar constants such as integers
and strings, record constructors with named or positional arguments, tuples, lists
made with literal syntax, lists made with the cons (::) operator, and the _
wildcard, which accepts and ignores all values. These patterns can also be nested.
Variables placed in a pattern will be bound, i.e. assigned the actual value, if the
case match succeeds. In addition, the whole pattern itself can be bound to a
variable with the special as binding operator. The __ pattern as the single
argument to a record constructor can be used to bind all fields without having to
explicitly name them. Apart from the pattern expression itself, a pattern can also
include a guard expression which must be true for the matching to succeed. The
guard expression can include variables from the pattern expression[6].
Pattern matching expressions come in two variants with different behaviour when
an exception is raised in the case body: match, which makes the whole match
statement fail as expected, and matchcontinue, which instead rewinds the
state and tries the following patterns, failing the whole match expression only
when all patterns have been exhausted[6].
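The difference between the two variants can be sketched in Python with an ordinary exception. In the sketch below each case is a pair of callables, one for the pattern test and one for the body. The helper run_match, the MatchFailure exception and the continue_on_failure flag are all hypothetical names made up for this illustration, and the sketch does not model the rewinding of state that matchcontinue performs.

```python
class MatchFailure(Exception):
    """Signals that a case body failed or that no case matched."""


def run_match(cases, value, continue_on_failure):
    """Try `cases` in order; each case is a (pattern, body) pair of callables.

    With continue_on_failure=False this mimics match: a failure in the
    selected body propagates. With True it mimics matchcontinue: the
    failure is caught and the following cases are tried instead.
    """
    for pattern, body in cases:
        if not pattern(value):
            continue
        try:
            return body(value)
        except MatchFailure:
            if not continue_on_failure:
                raise
    raise MatchFailure("all cases exhausted")


def failing_body(value):
    raise MatchFailure("body failed")


cases = [
    (lambda v: v > 0, failing_body),         # matches, but its body fails
    (lambda v: True, lambda v: "fallback"),  # wildcard-like case
]

result = run_match(cases, 5, continue_on_failure=True)

try:
    run_match(cases, 5, continue_on_failure=False)
    match_failed = False
except MatchFailure:
    match_failed = True

print(result, match_failed)
```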
Comprehensions
List and array comprehensions allow the user to write concise mapping and
filtering on collections using some syntactic sugar. They take a map expression and
one or more collections with a named iterator variable for each collection, and
can optionally take guards filtering the values. There are also “threaded”
comprehensions which work like a zip between any number of lists[6].
list<Integer> l0 := list(1+x for x guard 0<=x in otherList);
list<Integer> l1 := list(a+b threaded for a in 1:2, b in 3:4); // {1+3,2+4}
list<Integer> l2 := list(a+b for a in 1:2, b in 3:4); // {4,5,5,6}
Exception handling and asserts
Exceptions such as out-of-bounds accesses and divisions by zero can be tested by
putting the expression or statement inside a failure call, which will succeed
if the test statement causes an exception and throw an exception if the test state-
ment succeeds. If an unhandled exception occurs inside a matchcontinue case,
the program will then rewind the state and try the following cases rather than
making the entire match statement fail. Exceptions can also be generated explic-
itly with the fail function, or by assertions using the assert function, which
takes an assertion condition, a message string and optionally an assertion severity
level[6].
2.3 The OpenModelica environment
OpenModelica is an open-source Modelica-based simulation and modeling
environment. Its main purposes include providing efficient, easy-to-use and
well-visualized Modelica-based simulations while also serving as a teaching and
research tool and as a reference implementation that is itself written largely in
Modelica[5]. Most of the development of OpenModelica is done by Linköping
University in Sweden.
2.3.1 Compiler structure
The OpenModelica compiler takes Modelica code and translates it to C code which
can be compiled by a standard compiler. The subsystem also provides an inter-
preter so that code can be tested interactively[3].
Most parts of the OpenModelica compiler are written in MetaModelica. The
OpenModelica compiler can compile MetaModelica code, including bootstrapping
itself[18].
[Figure: Modelica source → Translator → DAE with flattened models → Analyzer →
DAE with sorted equations → Optimizer → DAE with optimized sorted equations →
Code Generator → C source code → C Compiler → Executable program → Simulation]
Figure 2.2: Overview of translation phases in the OpenModelica compiler
The OpenModelica Compiler is organized, like most other compilers, as a pipeline
of these phases[4][3] as seen in figure 2.2:
Translator — parses the source code into the initial Absyn-format AST,
converts it into the simplified SCode-format intermediate AST, and reduces the
object-oriented structures to a single flat equation system in the DAE-format
AST. Type checking and other static analyses are also performed here.
Analyzer — performs transformations on the equation system so that it can
be efficiently solved, including dependency sorting the equations and
converting to imperative assignments.
Optimizer — performs optimizations on the DAE.
Code Generator — generates compilable C code from the DAE. This code is then
passed to a C compiler.
A more detailed overview of some of the most relevant modules used in the code
generation is shown in figure 2.3.
[Figure: Modelica code → Parse → Absyn → SCode/explode → SCode → Inst → DAE →
BackendDAECreate → Backend DAE → Symbolic operations (BackEnd) →
Sorted and optimized DAE → SimCode → SimCode → Code generator → C code;
the Lookup, Static and Ceval modules are shown alongside the pipeline]
Figure 2.3: Overview of the OpenModelica components
2.3.2 Susan as a Code Generator
Susan is a template language used by the OpenModelica Compiler. Its purpose is
to allow easy-to-use text generation from MetaModelica structures.
A Susan file consists of several templates that accept some MetaModelica data
type and return text. Templates can also use so-called buffers to fill in holes
left in the returned text. Templates may be used solely for their effects on buffers
and not for the text they return.
See listing 2 for an example of a Susan template. The listing contains a buffer
auxFunction and a match on var. The cases of the match return the final
result of the entire template. The VARIABLE case has a nested template
contextCref to which it passes the auxFunction buffer.
template funArgBoxedDefinition(Variable var)
  "A definition for a boxed variable is always of type
   modelica_metatype, unless it's a function pointer"
::=
  let &auxFunction = buffer ""
  match var
  case VARIABLE(__) then
    'modelica_metatype <%contextCref(name,contextFunction,&auxFunction)%>'
  case FUNCTION_PTR(__) then
    'modelica_fnptr _<%name%>'
end funArgBoxedDefinition;
Listing 2: A snippet in the Susan template language
2.3.3 The DAE representation
The DAE representation is an AST representation that, unlike the previous
representation stages, has the object-oriented structures such as class instances
and connections simplified and flattened into a single equation system. This
flattening is done from the SCode representation by the Inst module. However,
MetaModelica data structures are still preserved and constructed at run time. Like
the other representations in OpenModelica, it is implemented using MetaModelica
data structures such as union types, optionals and lists[3].
A function in DAE can contain various elements, such as algorithms,
equations of different kinds, variables, reinit statements, calls and asserts[15].
This overview will focus on the part implementing the algorithm subset, which
is the subset most relevant to the IR implemented in this thesis.
Element and Algorithm union types
A function contains elements of various types, such as algorithm sections,
equations of different forms, and variables. These are represented by the Element
union type[15]. Described below are the element types most important to this thesis.
Although all element types contain a source field of the ElementSource union
type containing metadata such as source code line numbers and the classes and
instances it belongs to, this field is skipped in these descriptions for brevity.
VAR - This element type represents variables and contains many fields related to
names, types, equation flow and connections. The most important ones for
this thesis are the component reference and the type field.
ALGORITHM - This element type represents algorithm sections and contains a
field of the Algorithm union type, which simply contains a list of
statements.
ComponentRef union type
Component references represent hierarchical path names and are typically used
for describing variables[15].
CREF_IDENT — This record type represents a non-hierarchical or bottom-level
identifier, and contains the name as a string, its type and a list of optional
subscripts.
CREF_ITER — This record type is used for iterators, and contains an index used
for code generation in addition to the data in CREF_IDENT.
CREF_QUAL — This record type represents a higher level in a hierarchical path,
and contains a component reference to the level below in addition to the
data in CREF_IDENT.
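The recursive shape of component references can be illustrated with a small Python model. The sketch below keeps only the name and the link to the level below, leaving out the type, subscript and iterator index fields mentioned above. The class names, field names and the helper cref_to_string are assumptions for this example and are not the actual MetaModelica definitions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class CrefIdent:
    """Bottom-level identifier (type and subscripts omitted here)."""
    ident: str


@dataclass(frozen=True)
class CrefQual:
    """Higher level of a path, pointing to the level below."""
    ident: str
    component_ref: "ComponentRef"


ComponentRef = Union[CrefIdent, CrefQual]


def cref_to_string(cref):
    """Render a component reference as a dotted path name."""
    parts = []
    while isinstance(cref, CrefQual):
        parts.append(cref.ident)
        cref = cref.component_ref
    parts.append(cref.ident)
    return ".".join(parts)


# The hierarchical name resistor.p.v as nested records.
resistor_p_v = CrefQual("resistor", CrefQual("p", CrefIdent("v")))
print(cref_to_string(resistor_p_v))
```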
Absyn.Path union type
While Absyn.Path is strictly part of the Absyn representation definitions, it
is frequently used in DAE for externally accessible objects such as functions or
union types, and so it is mentioned here.
IDENT — This record type represents a non-hierarchical or bottom-level
identifier, and contains the name as a string.
QUALIFIED — This path type represents a higher level in a hierarchical path,
and contains the path to the level below in addition to the name string of
its level.
Statement union type
The statement record types available are assignments of various types and control
flow statements such as calls, if statements, loop statements like for and while,
when statements, and simple skipping statements like break, continue and
return[15]. Described below are the statement types most important to this thesis.
Although all statement types, like the element types, contain an ElementSource
source field containing metadata, this field is skipped in these descriptions for
brevity as well.
STMT_ASSIGN — This statement type describes an assignment and contains the
type of the assignment and the expressions of the left and right hand side.
STMT_IF and the Else union type — This statement type describes an if
statement and contains the conditional expression, a list of statements to
be executed when the condition is true, and a value of the Else union
type to describe the behaviour when the condition is false. The value in the
Else union type field can either be NOELSE, signifying that nothing is done,
ELSEIF, performing another conditional step and having the same fields
as a STMT_IF, or ELSE, which simply contains a list of statements to be
executed on a false condition.
STMT_FOR — This statement type describes a for(each) statement and contains
the type of the iterator, the name of the iterator variable, the range expres-
sion to be iterated over and a list of statements executed in the loop body.
It also contains a few additional code generation-aiding variables which did
not have to be considered in the development of this thesis.
STMT_WHILE — This statement type describes a while statement and contains
a conditional expression and a list of statements executed in the loop body.
STMT_NORETCALL — This statement type describes a call not having or storing
any return values, and the only field it contains is an expression of the call
type described further down.
STMT_BREAK, STMT_CONTINUE and STMT_RETURN — These statement
types simply describe break, continue and return statements and do not contain
any additional data. Note that value returns in Modelica are done by
assignments to designated output variables rather than by return statements.
Therefore STMT_RETURN does not contain any return values, but simply
exits the function.
Type union type
This union type represents the data types used in DAE[15].
T_INTEGER, T_REAL, T_STRING and T_BOOL — These types simply repre-
sent the basic data types in Modelica, i.e. integers, reals, strings and
booleans.
T_NORETCALL — This type represents the return value of a call without output
variables.
T_TUPLE — This type represents tuples as returned from functions with
multiple output values, and contains a list of types indicating the type of each
tuple element and an optional list of tuple field names as strings.
T_METALIST — This type represents MetaModelica lists and contains a type
field indicating the type of its elements.
T_METATUPLE — This type represents MetaModelica tuples and contains a list
of types indicating the type of each tuple element.
T_METAOPTION — This type represents MetaModelica optionals indicating the
type of its element when it contains a value.
T_METAUNIONTYPE — This type represents MetaModelica union types.
T_METARECORD — This type represents MetaModelica records, and contains an
Absyn.Path to the union type, an Absyn.Path to the record, a list
containing the type of each field, the constructor ID for the record, a list of the
Var components of each field, and a boolean indicating if the record type is
a singleton.
T_METAARRAY — This type represents MetaModelica arrays and contains a type
field indicating the type of its elements.
T_METABOXED — This type represents MetaModelica boxed values.
Exp union type
This union type represents the expression types that can be used in DAE such as
literals, operators, variable references and calls[15].
ICONST, RCONST, SCONST and BCONST — These expression types simply
represent constants of the basic Modelica data types, i.e. integers, reals,
strings and booleans. Their sole field is the constant value they contain.
CREF — This expression type represents a variable reference and contains a
component reference field and the type of the variable.
BINARY and UNARY — These expression types represent binary or unary
arithmetic operations and contain one or two subexpressions and an Operator
value denoting the operation to be performed.
LBINARY and LUNARY — These expression types represent binary or unary
logical operations such as and, not, and or. Similar to the arithmetic
operations, they contain one or two subexpressions and an Operator value
denoting the operation to be performed.
RELATION — This expression type represents comparisons. Apart from having
two subexpressions and an Operator value like other binary operations,
it has some additional fields for model simulation handling which are not
considered here.
IFEXP — This expression type represents an if expression and contains three
subexpressions: one for the condition, and one each for the true and false
case.
CALL — This expression type represents a call and contains the name of the
function, a list of subexpressions denoting the arguments and a special
CallAttributes field storing various additional data about the call.
Some of the data stored in CallAttributes are the type of the return
value, whether the function call returns multiple values as a tuple, whether the
call is to a built-in function, and whether the call is inline or a tail call.
RANGE — This expression type represents numeric ranges and is typically used in
for statements. It contains the type of the numeric values, the start value,
the end value and optionally the step between each value, which is 1 if not
specified.
CAST — This expression type represents a type cast and contains the type the
value is cast to and a subexpression representing the value being cast.
TSUB — This expression type represents tuple subscripts and contains the
subexpression to be subscripted, the integer index, and the type of the returned
value.
ASUB — This expression type represents array subscripts and contains the
subexpression to be subscripted and a list of integer indexes with each value
representing a different array dimension.
RSUB — This expression type represents record value accesses and contains the
subexpression of the record, the integer offset of the field, the name of the
field, and the type of the returned value.
LIST — This expression type represents a MetaModelica list literal or a nil node
and contains a list of subexpressions denoting each element stored in the
list.
CONS — This expression type represents a MetaModelica list node and contains
two subexpressions denoting the head and tail of the list node.
META_TUPLE — This expression type represents a MetaModelica tuple node and
contains a list of subexpressions denoting each element stored in the tuple.
META_OPTION — This expression type represents a MetaModelica optional and
contains an optional subexpression.
METARECORDCALL — This expression type represents a MetaModelica record
constructor and contains the path to the record, the arguments as a list of
subexpressions, a list of field names, the record variant number, and a list of
types for each field.
MATCHEXPRESSION — This expression type represents match expressions
and contains a field of the MatchType union type that can be
MATCHCONTINUE or MATCH, a list of subexpressions for the expressions
to be matched, a list of local declarations as Element values, a list of
cases as MatchCase values, and the type of the match expression. The
MatchCase union type is described in more detail below.
BOX — This expression type represents a MetaModelica boxed value and contains
a subexpression for the value to be boxed.
UNBOX — This expression type represents the unboxing of a MetaModelica boxed
value and contains a subexpression for the value to be unboxed and a type field
indicating the type of the unboxed value.
PATTERN — This expression type represents various patterns as used in match
statements. Its sole value is of the Pattern union type described more in
detail below.
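As a sketch of how such a recursive expression type can be processed, the following Python example models four of the variants above with heavily reduced fields and evaluates a small tree. The operators are plain strings here instead of Operator values, and type fields are left out. All class names, field names and the evaluate function are assumptions for this illustration and not the DAE definitions themselves.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class IConst:
    value: int


@dataclass(frozen=True)
class Binary:
    lhs: object
    op: str
    rhs: object


@dataclass(frozen=True)
class Relation:
    lhs: object
    op: str
    rhs: object


@dataclass(frozen=True)
class IfExp:
    cond: object
    then_exp: object
    else_exp: object


ARITHMETIC = {"+": operator.add, "-": operator.sub, "*": operator.mul}
COMPARISON = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def evaluate(exp):
    """Evaluate an expression tree by recursing over its variants."""
    if isinstance(exp, IConst):
        return exp.value
    if isinstance(exp, Binary):
        return ARITHMETIC[exp.op](evaluate(exp.lhs), evaluate(exp.rhs))
    if isinstance(exp, Relation):
        return COMPARISON[exp.op](evaluate(exp.lhs), evaluate(exp.rhs))
    if isinstance(exp, IfExp):
        branch = exp.then_exp if evaluate(exp.cond) else exp.else_exp
        return evaluate(branch)
    raise TypeError(f"unsupported expression: {exp!r}")


# Tree for: if 2 > 1 then 3 * 4 else 0
tree = IfExp(Relation(IConst(2), ">", IConst(1)),
             Binary(IConst(3), "*", IConst(4)),
             IConst(0))
print(evaluate(tree))
```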
MatchCase union type
This union type represents a single case in a match expression and contains a
single variant record type CASE. It contains a list of patterns of the Pattern
union type, an optional guard subexpression, a list of local declarations as
elements, a case body as a list of statements, an optional case return
subexpression, and some source-code related metadata[15].
Pattern union type
This union type represents patterns used in match expressions, and can also be
recursive like expressions[15].
PAT_WILD — This pattern type represents a wildcard that accepts all values
without binding anything. It does not contain any data.
PAT_CONSTANT — This pattern type matches various literals like numerals,
strings, empty list, and NONE. The record contains the expression and op-
tionally a type used for unboxing the value.
PAT_AS — This pattern type allows binding the entire value to a name while
continuing to match on its contents, such as listVar as _::tailVar,
and contains an identifier, an optional type for unboxing, some attributes of
the identifier, and the pattern that will be matched.
PAT_META_TUPLE — This pattern type matches the content of a tuple and con-
tains a list of patterns, one for each element.
PAT_CONS — This pattern type represents a linked list node and contains two
subpatterns representing the head and tail of the list.
PAT_CALL — This pattern type matches a union type constructor and contains
a name, the index of the matched record within its union type, the patterns
for each record attribute, a list of variables for each attribute, a list of types,
and a boolean indicating if the union type is known to be a singleton.
PAT_SOME — This pattern type represents an optional with a SOME value and
contains a subpattern for the actual value.
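The recursive nature of these patterns can be illustrated with a small Python matcher covering four of the variants. Lists are modelled as Python tuples, and a plain variable binding is modelled as an as-pattern wrapping a wildcard, which is a simplification chosen for this sketch rather than a statement about how the compiler encodes it. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PatWild:
    pass


@dataclass(frozen=True)
class PatConstant:
    value: Any


@dataclass(frozen=True)
class PatAs:
    name: str
    pattern: Any


@dataclass(frozen=True)
class PatCons:
    head: Any
    tail: Any


def match_pattern(pattern, value, bindings=None):
    """Return a dict of bindings if `value` matches, otherwise None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, PatWild):
        return bindings
    if isinstance(pattern, PatConstant):
        return bindings if value == pattern.value else None
    if isinstance(pattern, PatAs):
        extended = dict(bindings)
        extended[pattern.name] = value
        return match_pattern(pattern.pattern, value, extended)
    if isinstance(pattern, PatCons):
        # A cons pattern needs a non-empty list (a tuple in this sketch).
        if not isinstance(value, tuple) or len(value) == 0:
            return None
        after_head = match_pattern(pattern.head, value[0], bindings)
        if after_head is None:
            return None
        return match_pattern(pattern.tail, value[1:], after_head)
    raise TypeError(f"unsupported pattern: {pattern!r}")


# Roughly: listVar as _ :: tailVar
pattern = PatAs("listVar", PatCons(PatWild(), PatAs("tailVar", PatWild())))
bound = match_pattern(pattern, (1, 2, 3))
print(bound)
```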
Chapter 3
Method
3.1 Design
During the design phase, different IR designs and existing IR solutions of notable
compilers were evaluated and compared in order to create an initial IR design. The
evaluation focused on extensibility, ability to implement optimizations and ease
of implementation with regard to conversions from the AST and to the back-end
code, with special focus on easy conversion to SSA-based back ends such as
LLVM.
The code base of the OpenModelica compiler and its corresponding documentation
were also investigated in order to make good design decisions.
3.2 Implementation
The implementation roughly consists of three parts: one phase converting the
DAE representation to the new IR, one optimization phase where the generated
IR is improved in some respect, and another one converting the new IR to com-
pilable C-code. MetaModelica was used as the programming language for the
implementation, since this language is used by the rest of the compiler.
3.3 Performance evaluation
During the evaluation phase, the code quality and performance of the new code
generator were compared to the results for the old code generator. These results
were then analyzed in order to see how large the differences are between the new
representation and its optimizations and whether the new generator gives an
improvement.
The time was measured with the execStat timing module that is built into
the OpenModelica compiler. As the execution time of compiled code was not
previously measured, this had to be implemented separately with a core change
outside the MidCode code base. The test cases were executed multiple times in order
to guard against anomalous results, and a result representing the median case was
then picked. The input data and exact number of executions were chosen so that
the total time would be large enough to be accurately measured while not taking
too long to run. The computer used to run the measurements was a laptop
with an Intel i7 2630QM (Sandy Bridge) processor.
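The measurement procedure described above, with repeated runs and a median pick, can be sketched generically in Python. This is not the execStat module or the actual measurement code of the thesis. The function name, its parameters and the use of time.perf_counter are assumptions made for this illustration.

```python
import statistics
import time


def median_runtime(func, repetitions=5, calls_per_repetition=100):
    """Time `func` several times and return the median sample in seconds.

    Each sample covers `calls_per_repetition` calls so that the measured
    interval is long enough to be meaningful.
    """
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        for _ in range(calls_per_repetition):
            func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


elapsed = median_runtime(lambda: sum(range(1000)),
                         repetitions=5, calls_per_repetition=10)
print(f"median time: {elapsed:.6f} s")
```

Taking the median rather than the mean keeps a single unusually slow run, for example one disturbed by other processes, from dominating the reported value.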
The following benchmark functions were made, which can be seen in appendix
A:
fibonacci – Recursive fibonacci F_30 without memoization, executed 100 times
mandelbrot – ASCII Mandelbrot with 1000 iterations returning a linked list of
characters, executed 200 times
tak – Takeuchi function tak(18, 12, 6), executed 10000 times
qsort – Quick-sort of a random array of 20000 elements, executed 100 times
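For readers unfamiliar with the first and third benchmark, the following Python sketch shows what such functions commonly look like. It is not the benchmark code from appendix A, which is written in MetaModelica. In particular, the tak variant shown here is the widely used one that returns z in the base case, which is an assumption about the variant used in the thesis.

```python
def fibonacci(n):
    """Naive doubly recursive Fibonacci without memoization."""
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)


def tak(x, y, z):
    """Takeuchi function, in the common variant that returns z."""
    if y < x:
        return tak(tak(x - 1, y, z), tak(y - 1, z, x), tak(z - 1, x, y))
    return z


print(fibonacci(20))
print(tak(18, 12, 6))
```

Both functions do very little work per call, so their running time is dominated by function call overhead, which makes them sensitive to how a code generator implements calls.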
The C compiler used for compiling the generated code was GCC 7.2. The opti-
mization setting for the C compilation was changed to -O2 rather than the usual
-O0 since it was noticed that the low-level style of the MidCode-generated C code
was poorly suited for unoptimized compilation. It was also noted that the parti-
tion function in the Quicksort test was tail-call optimized by the original genera-
tor, something that has not been implemented in the current MidCode generator.
3.4 Code complexity measurements
The complexity of the different code generators was also measured using the
number of lines of code (LOC). This is measured because simpler code generators
mean that less work is needed to port the language to another code generator.
According to Nguyen et al., LOC is used widely within industry and literature
while being an essential component of several more advanced software
complexity measurements [14]. Specifically, we use the number of lines in the
file, including empty lines, comments, etc. A discussion of how appropriate and
relevant this metric is can be found in section 5.3.
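The counting rule described above (every physical line counts, including
blank lines and comments) can be sketched as follows. The function name
count_lines and the sample template text are made up for illustration.

```python
def count_lines(text):
    """Count physical lines, including blank lines and comment lines."""
    return len(text.splitlines())


# A made-up four-line template fragment: code, blank line, comment, code.
sample = "template foo()\n\n// a comment\nend foo;\n"
loc = count_lines(sample)
```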
Both of the target-specific implementations are in the Susan template language.
We compare to CodegenCFunctions.tpl, which is closest in functionality.
Unfortunately, this is not a precise comparison since the file chosen for
comparison implements more features than our implementation. The old back end
also has more template files, like CodegenC.tpl (see table 4.1), but these
mostly implement features that are outside the scope of this thesis.
Chapter 4
Results
4.1 Overview of the MidCode design
MidCode, the resulting IR, represents the control flow of a procedure by the
common approach of basic blocks. Each basic block has a terminator which
declares what control flow action happens at the end of the block. This may
include operations returning values, such as calls. The data flow of the
procedure is represented by named variables, compiler-created temporaries and
simple unary or binary operations. Unlike in SSA, named variables can be
rewritten.
The MidCode related code paths are divided into three phases: “From Modelica
to MidCode”, “MidCode Transformations”, and “From MidCode to C”.
4.1.1 IR design details
This part describes the uniontypes and records defined for MidCode and the
fields contained within these.
Program
A program is represented by the Program type. This type contains a name and
a list of functions.
Figure 4.1: Overview of MidCode phases (DAE/SimCode representation ->
DAEToMid -> MidCode -> MidCode transformations -> MidCode -> MidToC -> C code)
Function
Functions are represented by the Function type. Each function contains a name
as an Absyn.Path, several lists of local, input and output variables, a body rep-
resented as a list of basic blocks, and ID references to the special entry and exit
basic blocks.
Block
Basic blocks are represented by the Block type. They contain a block ID number,
a list of statements and a terminator.
Stmt
Statements are represented by the Stmt type and can either be a NOP or an
ASSIGN, which simply assigns the value of an RValue to a Var. A statement
has linear control flow but may otherwise have various effects.
Var
Variables are represented by the Var type, and are used to represent both vari-
ables used by the Modelica code and variables introduced during the translation
process. Vars have a name and a data type.
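To make the shape of these types concrete, the following Python sketch mirrors
the fields listed above using dataclasses. The real definitions are
MetaModelica uniontypes and records; the field names, the use of plain strings
for the path, the data type and the terminator, and the example function are
all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Var:
    name: str
    ty: str  # data type; a plain string in this sketch


@dataclass
class Assign:
    dest: Var
    rvalue: object  # an RValue, left opaque here


@dataclass
class Nop:
    pass


@dataclass
class Block:
    id: int
    stmts: list         # Assign or Nop instances
    terminator: object  # a Terminator, left opaque here


@dataclass
class Function:
    name: str            # stands in for an Absyn.Path
    local_vars: List[Var]
    inputs: List[Var]
    outputs: List[Var]
    body: List[Block]
    entry_id: int        # ID of the special entry block
    exit_id: int         # ID of the special exit block


@dataclass
class Program:
    name: str
    functions: List[Function]


# A hypothetical one-input, one-output function that copies x to y.
x = Var("x", "Integer")
y = Var("y", "Integer")
identity = Function(
    name="Example.identity",
    local_vars=[],
    inputs=[x],
    outputs=[y],
    body=[
        Block(0, [Assign(y, ("MOVE", x))], ("GOTO", 1)),
        Block(1, [Nop()], ("RETURN",)),
    ],
    entry_id=0,
    exit_id=1,
)
program = Program("Example", [identity])
```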
OutVar
Since output variables can be thrown away by the caller, lists of output variables
in call statements contain the OutVar type rather than the plain Var type.
Instances of this type can either be an OUT_VAR containing an actual Var instance,
or an OUT_WILD indicating that the caller will not save the value.
RValue
An RValue is a value that can be placed on the right side of an assignment.
The RValue type in MidCode contains a few expressions like addition of two
Vars and negating a Var. They appear in MidCode as part of assign statements.
RValues do not have other RValues as operands; instead, temporary variables
are created during the translation process and then used as operands.
UNARYOP — An UNARYOP is a constructor of the RValue union representing
operations with a single operand, i.e. a single Var. UNARYOP has variants
for copying the unchanged value, negating, logically inverting,
boxing and unboxing a variable. The operation to choose is determined by
an enumeration value.
BINARYOP — A BINARYOP is a constructor of the RValue union representing
operations with two operands, i.e. two Vars. BINARYOP has several vari-
ants representing common operations like addition, subtraction, division,
multiplication, logical or/and, and comparisons. The operation to choose is
determined by an enumeration value.
Literal value constructors — A group of constructors of the RValue union rep-
resenting literal values. The LITERALINTEGER constructor represents
integer literals, the LITERALREAL constructor represents real
(floating-point) literals, the LITERALBOOLEAN constructor represents boolean
values, and LITERALSTRING represents string literals. The more complex meta
object literals used for records, linked lists, optionals and tuples are
represented by the LITERALMETATYPE constructor.
Meta object data accessors — A group of constructors that are used for access-
ing data about meta objects. The METAFIELD constructor returns a value
from a meta object slot and is used for accessing record and tuple fields.
There are also three constructors specifically made for pattern matching:
UNIONTYPEVARIANT returning the value of the record variant for uniontypes,
ISCONS for checking if a linked list node is cons or nil, and ISSOME for
checking if an optional has a value.
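The idea that an RValue takes only variables as operands and selects its
operation through an enumeration value can be sketched in Python as below.
Only a few operations are modelled, boxing and the meta object accessors are
left out, and all class, enum and function names are assumptions of this
sketch rather than the MidCode definitions.

```python
import operator
from dataclasses import dataclass
from enum import Enum, auto


class UnaryOpKind(Enum):
    MOVE = auto()  # copy the value unchanged
    NEG = auto()   # arithmetic negation
    NOT = auto()   # logical inversion


class BinaryOpKind(Enum):
    ADD = auto()
    SUB = auto()
    MUL = auto()
    DIV = auto()
    LESS = auto()
    EQUAL = auto()


@dataclass
class LiteralInteger:
    value: int


@dataclass
class UnaryOp:
    op: UnaryOpKind
    src: str  # operand is a variable name, never a nested RValue


@dataclass
class BinaryOp:
    op: BinaryOpKind
    lhs: str
    rhs: str


UNARY = {
    UnaryOpKind.MOVE: lambda v: v,
    UnaryOpKind.NEG: operator.neg,
    UnaryOpKind.NOT: operator.not_,
}
BINARY = {
    BinaryOpKind.ADD: operator.add,
    BinaryOpKind.SUB: operator.sub,
    BinaryOpKind.MUL: operator.mul,
    BinaryOpKind.DIV: operator.truediv,
    BinaryOpKind.LESS: operator.lt,
    BinaryOpKind.EQUAL: operator.eq,
}


def eval_rvalue(rvalue, env):
    """Evaluate an RValue given env, a mapping from variable names to values."""
    if isinstance(rvalue, LiteralInteger):
        return rvalue.value
    if isinstance(rvalue, UnaryOp):
        return UNARY[rvalue.op](env[rvalue.src])
    if isinstance(rvalue, BinaryOp):
        return BINARY[rvalue.op](env[rvalue.lhs], env[rvalue.rhs])
    raise TypeError(f"not an RValue: {rvalue!r}")


env = {"a": 3, "b": 4}
total = eval_rvalue(BinaryOp(BinaryOpKind.ADD, "a", "b"), env)
negated = eval_rvalue(UnaryOp(UnaryOpKind.NEG, "a"), env)
```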
Terminator
Each basic block has a terminator controlling the control flow following the block,
which is represented by the Terminator type. Terminators have effects and can
cause branching and/or exceptional control flow.
GOTO — The GOTO terminator simply jumps to a given block.
RETURN — The RETURN terminator simply exits the procedure.
BRANCH — The BRANCH terminator jumps to one of two given blocks depending
on whether the given condition variable is true or false. It is used when
translating several statement types.
SWITCH — The SWITCH terminator jumps to one of multiple given blocks in a
dictionary depending on the value of the given condition variable. This is
used when generating code for match statements.
CALL — The CALL terminator is a function call to another Modelica function.
Since it can cause control flow via exceptions (for example through the
fail function), it is defined as a terminator rather than a statement.
LONGJMP, PUSHJMP and POPJMP — The LONGJMP terminator causes a control
flow transfer to the active PUSHJMP call site, even across function
boundaries. The PUSHJMP terminator is used to add a new active location
for LONGJMP, while the POPJMP terminator is used to deactivate a
corresponding active PUSHJMP and cause the previously called one to become
active.
ASSERT and TERMINATE — The ASSERT terminator aborts the program with
an error message if a Var containing a condition result has a false value.
The TERMINATE terminator unconditionally aborts with an error message.
The error message for both terminators is given by a Var.
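How blocks, statements and the simpler terminators fit together can be shown
with a small Python interpreter for a toy control-flow graph. It only models
GOTO, BRANCH, SWITCH and RETURN. The tuple encoding of terminators, the use of
Python callables as statements and the example program (summing 1..n) are
assumptions of this sketch, not the MidCode representation.

```python
def run(blocks, entry, env):
    """Execute blocks starting at entry until a RETURN terminator is reached."""
    current = entry
    while True:
        stmts, term = blocks[current]
        for dest, compute in stmts:  # statements have linear control flow
            env[dest] = compute(env)
        kind = term[0]
        if kind == "GOTO":
            current = term[1]
        elif kind == "BRANCH":
            _, cond, if_true, if_false = term
            current = if_true if env[cond] else if_false
        elif kind == "SWITCH":
            _, cond, table = term
            current = table[env[cond]]
        elif kind == "RETURN":
            return env
        else:
            raise ValueError(f"unsupported terminator: {kind}")


# acc := 1 + 2 + ... + n, written as four basic blocks.
blocks = {
    0: ([("i", lambda e: 1), ("acc", lambda e: 0)], ("GOTO", 1)),
    1: ([("c", lambda e: e["i"] <= e["n"])], ("BRANCH", "c", 2, 3)),
    2: ([("acc", lambda e: e["acc"] + e["i"]),
         ("i", lambda e: e["i"] + 1)], ("GOTO", 1)),
    3: ([], ("RETURN",)),
}
result = run(blocks, 0, {"n": 4})
```

Because every block ends in exactly one terminator, the interpreter needs a
single dispatch per block and no knowledge of if, while or for.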
4.1.2 From Modelica to MidCode
MidCode is designed to represent interesting low-level properties uniformly,
which means that we need to lower several high-level Modelica representations
into a composition of MidCode constructs. The DAEToMid phase takes Modelica
functions as given from the SimCode module and converts them to MidCode.
The most important fields in a SimCode function object are its name as
an Absyn.Path, its variable definitions, and its list of DAE statements.
Expressions are flattened and converted to MidCode by recursively translating
subexpressions into statements. The subexpression statements store their values
into temporary variables which are then used as operands in the statement of the
containing expression.
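The flattening step can be sketched as a short recursive function. Nested
expressions are modelled here as Python tuples of the form (operator,
operand, ...), named variables as strings, and statements as (destination,
operator, operands) triples. These encodings and the _tmp naming scheme are
assumptions of the sketch.

```python
import itertools


def flatten(expr, stmts, counter):
    """Return the name of the variable holding expr, appending statements to stmts."""
    if isinstance(expr, str):  # a named variable is already a valid operand
        return expr
    op, *args = expr
    # Subexpressions are translated first; each yields a variable name.
    operands = tuple(flatten(arg, stmts, counter) for arg in args)
    tmp = f"_tmp{next(counter)}"
    stmts.append((tmp, op, operands))
    return tmp


# a * b + (-c)
stmts = []
result = flatten(("+", ("*", "a", "b"), ("neg", "c")), stmts, itertools.count(1))
```

After the call, stmts holds one statement per operation, each with only
variable names as operands, and result names the temporary that holds the
value of the whole expression.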
An if statement is translated into a set of blocks evaluating the condition,
terminated by a branch. One branch target is the set of blocks representing
the body of the if; the other leads either to the else body or to the end of
the if statement. At the end of the if body is a jump to the end of the
statement. For a while statement, the terminator at the end of the body
instead jumps back to the evaluation of the condition. A for statement is
translated somewhat similarly to a while statement but generates its own code
for iterator handling and comparisons. When translating loops, the generated
body and end labels are inserted into stacks so that break and continue
statements can be implemented correctly.
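A minimal Python sketch of lowering a while loop with break and continue into
basic blocks is given below. The Builder class, the tuple encoding of the
input statements, the _c condition variable and the choice to keep loop head
and loop end labels on the stacks are assumptions of this sketch and
simplify what DAEToMid does.

```python
class Builder:
    def __init__(self):
        self.blocks = {}            # block id -> {"stmts": [...], "term": ...}
        self.break_targets = []     # stack of loop end blocks
        self.continue_targets = []  # stack of loop head blocks
        self.current = self.new_block()

    def new_block(self):
        block_id = len(self.blocks)
        self.blocks[block_id] = {"stmts": [], "term": None}
        return block_id

    def terminate(self, term, next_block):
        """Close the current block with term and continue in next_block."""
        self.blocks[self.current]["term"] = term
        self.current = next_block

    def lower(self, stmt):
        kind = stmt[0]
        if kind == "assign":
            self.blocks[self.current]["stmts"].append(stmt[1])
        elif kind == "while":
            _, cond, body = stmt
            head, body_block, end = (self.new_block(), self.new_block(),
                                     self.new_block())
            self.terminate(("GOTO", head), head)
            # The head block evaluates the condition, then branches on it.
            self.blocks[head]["stmts"].append(f"_c := {cond}")
            self.terminate(("BRANCH", "_c", body_block, end), body_block)
            self.break_targets.append(end)
            self.continue_targets.append(head)
            for inner in body:
                self.lower(inner)
            self.break_targets.pop()
            self.continue_targets.pop()
            self.terminate(("GOTO", head), end)  # back edge to the condition
        elif kind == "break":
            # Code after a break is unreachable; it goes into a fresh block.
            self.terminate(("GOTO", self.break_targets[-1]), self.new_block())
        elif kind == "continue":
            self.terminate(("GOTO", self.continue_targets[-1]), self.new_block())
        else:
            raise ValueError(f"unsupported statement: {kind}")


builder = Builder()
builder.lower(("while", "i < n", [("assign", "i := i + 1"), ("break",)]))
```

The block created after the break is never reached from the entry block. The
sketch leaves it in place and relies on a later cleanup pass to remove it.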
Match expressions are translated into a state machine that keeps track of which
case is next. Patterns are translated into a series of checks, consisting of MidCode
blocks and branches, and assignments. If a check fails, it advances the case state
machine and branches to the next case, but if all checks succeed, the match per-
forms the assignments and continues with the body of the case. If no cases match,
then an exception is raised.
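The case state machine can be modelled at a high level in Python. This sketch
interprets a match directly instead of generating MidCode blocks, and the
representation of a case as a pair of check predicates and a body function is
an assumption made for the example.

```python
def run_match(value, cases):
    """Try cases in order; each case is (checks, body)."""
    state = 0  # index of the case to try next
    while state < len(cases):
        checks, body = cases[state]
        if all(check(value) for check in checks):
            return body(value)  # all checks succeeded: run the case body
        state += 1              # a failed check advances the state machine
    raise RuntimeError("match failure")  # no case matched


def fib(i):
    # Mirrors the structure of the Fibonacci benchmark in appendix A.
    return run_match(i, [
        ([lambda v: v == 0], lambda v: 0),
        ([lambda v: v == 1], lambda v: 1),
        ([], lambda v: fib(v - 1) + fib(v - 2)),  # else case: no checks
    ])


value = fib(10)
```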
Exceptional control flow is translated into MidCode constructs that are close to
the current C back-end implementation. Basically, “landing pads” are constructed
and removed using PUSHJMP and POPJMP. There are also restrictions put in
place to simplify efficient transformations of exceptional structures; see more
details in section 4.1.5. These restrictions are not limiting for this phase since
all exceptional control flow from MetaModelica already fits into the restricted
structure, but the phase does need to take care not to introduce more complicated
control flow including exceptional components.
One interesting symbiosis presents itself between the two phases “From
Modelica to MidCode” and “MidCode Transformations”. The optimization phases will
contain many transformations that make MidCode constructs more efficient,
allowing the initial transformation to be simpler as long as it produces
optimizable MidCode. For example, any dead code produced can be removed by an
optimization pass. It is also possible to add additional “cleanup” optimizations
tuned for removing inefficiencies often introduced in our translation.
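One simple cleanup of this kind is removing basic blocks that cannot be
reached from the entry block. The Python sketch below shows the idea on a
control-flow graph that maps block IDs to terminators only. The tuple encoding
and the function names are assumptions, and the sketch is not the pass
implemented in MidToMid.mo.

```python
def successors(term):
    """Block IDs that control can reach directly from a terminator."""
    kind = term[0]
    if kind == "GOTO":
        return [term[1]]
    if kind == "BRANCH":
        return [term[2], term[3]]
    if kind == "RETURN":
        return []
    raise ValueError(f"unsupported terminator: {kind}")


def remove_unreachable(blocks, entry):
    """Return a copy of blocks without the blocks unreachable from entry."""
    reachable = set()
    worklist = [entry]
    while worklist:
        block_id = worklist.pop()
        if block_id in reachable:
            continue
        reachable.add(block_id)
        worklist.extend(successors(blocks[block_id]))
    return {bid: term for bid, term in blocks.items() if bid in reachable}


# Block 4 is the kind of dead block a naive translation of "break" leaves behind.
blocks = {
    0: ("GOTO", 1),
    1: ("BRANCH", "c", 2, 3),
    2: ("GOTO", 3),
    3: ("RETURN",),
    4: ("GOTO", 1),
}
cleaned = remove_unreachable(blocks, 0)
```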
4.1.3 MidCode Transformations
MidCode is designed to allow transformations from MidCode to MidCode that
optimize or otherwise improve the code.
One such transformation that is special is the one that removes function-local
longjmps. MidCode allows longjmp within functions, but that is not correct in C,
so local longjmps must be removed at some point. Doing it in the “from
Modelica” phase would mean that the optimization phase is not allowed to
introduce longjmps, for example due to inlining. Doing it in the later “to C”
phase would mean that other back ends with the same restrictions need to do the
same work, which could be handled once and for all in MidCode transformations.
Doing it in the transformation phase allows the analyzer to look at the new
control flow generated from the transformation and try to glean further
beneficial transformations.
4.1.4 From MidCode to C
The MidToC phase takes MidCode functions produced by the previous
DAEToMid phase and transforms them to C code using Susan templates. There
are templates for generating basic MidCode constructs like functions and
statements as well as helper templates where appropriate to help generate the
constructs. The code generation phase does appear to be simpler. It requires no
use of Susan buffers, which means the generation can simply be implemented as
string appending.
4.1.5 Exceptional Control Flow
MetaModelica has language constructs for exceptional control flow. This includes
constructs like matchcontinue and fail. In C, the interesting parts of this are
implemented using longjmp. The MidCode implementation is straightforwardly
based on the same model.
MidCode contains three terminators for handling this. The PUSHJMP terminator
uses the C call setjmp and a special jump buffer to set a landing pad that the next
LONGJMP terminator will go to. The POPJMP terminator uses an old buffer saved
by a corresponding PUSHJMP to remove the landing pad made by that PUSHJMP.
As previously hinted at, there should be some structure with respect to PUSHJMP
and POPJMP. Each PUSHJMP should be accompanied by a corresponding
POPJMP with the correct variables. The path between can branch or loop, but the
terminators should cover the only entry and exit to a subgraph between them. See
listing 3 for a pseudo-code example of exceptional effects “hidden” by branching.
if (c) pushjmp(...);
body(...);
if (c) popjmp(...);
Listing 3: Incorrect exceptional control flow in MidCode.
These restrictions might be lifted but currently exist in order to simplify the nec-
essary analysis for converting local longjmps to goto.
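One possible way to check a necessary part of this structure is to verify that
every block is always reached with the same number of active landing pads.
The Python sketch below does this on a graph of terminators. The tuple
encoding is an assumption, the edge from a PUSHJMP to its landing pad is left
out for brevity, and the check is only an illustration, not the analysis used
in the implementation.

```python
def check_jmp_structure(blocks, entry):
    """Return True if each block is reached with one consistent landing-pad depth."""
    depth_at = {}
    worklist = [(entry, 0)]
    while worklist:
        block_id, depth = worklist.pop()
        if block_id in depth_at:
            if depth_at[block_id] != depth:
                return False  # block reachable both inside and outside a region
            continue
        depth_at[block_id] = depth
        term = blocks[block_id]
        kind = term[0]
        if kind == "PUSHJMP":
            worklist.append((term[1], depth + 1))
        elif kind == "POPJMP":
            if depth == 0:
                return False  # POPJMP without an active PUSHJMP
            worklist.append((term[1], depth - 1))
        elif kind == "GOTO":
            worklist.append((term[1], depth))
        elif kind == "BRANCH":
            worklist.append((term[2], depth))
            worklist.append((term[3], depth))
        elif kind == "RETURN":
            if depth != 0:
                return False  # returning with a landing pad still active
        else:
            raise ValueError(f"unsupported terminator: {kind}")
    return True


# The shape of listing 3: PUSHJMP and POPJMP each hidden behind a branch.
hidden = {
    0: ("BRANCH", "c", 1, 2),
    1: ("PUSHJMP", 2),
    2: ("GOTO", 3),            # body
    3: ("BRANCH", "c", 4, 5),
    4: ("POPJMP", 5),
    5: ("RETURN",),
}
# A well-formed region: the pair encloses a branching subgraph.
structured = {
    0: ("PUSHJMP", 1),
    1: ("BRANCH", "c", 2, 3),
    2: ("GOTO", 3),
    3: ("POPJMP", 4),
    4: ("RETURN",),
}
hidden_ok = check_jmp_structure(hidden, 0)
structured_ok = check_jmp_structure(structured, 0)
```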
4.1.6 Data in MidCode
Many boxed data types from MetaModelica are translated into a special uniform
representation in C. This includes union types, linked lists, arrays and optionals.
The representation is a heap-allocated sequence of pointer-sized words with the
first one being a header. The header contains the length of the sequence as well
as a constructor tag for union types. Lists and options are coded like union types,
as can be expected.
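The header-plus-fields layout can be modelled in Python as below. The bit
layout (tag in the low 8 bits, length above it), the convention that the
length counts the words after the header, and the tag values for nil and cons
are hypothetical and chosen only for this sketch; they are not taken from the
OpenModelica runtime.

```python
TAG_BITS = 8  # hypothetical width of the constructor tag


def make_header(length, ctor_tag):
    """Pack the number of fields and the constructor tag into one word."""
    if not 0 <= ctor_tag < (1 << TAG_BITS):
        raise ValueError("constructor tag out of range")
    return (length << TAG_BITS) | ctor_tag


def header_length(header):
    return header >> TAG_BITS


def header_tag(header):
    return header & ((1 << TAG_BITS) - 1)


def box(ctor_tag, fields):
    """Model a boxed value as [header, field0, field1, ...]."""
    return [make_header(len(fields), ctor_tag)] + list(fields)


NIL_TAG, CONS_TAG = 0, 1  # hypothetical tags: a list coded like a union type
nil = box(NIL_TAG, [])
cell = box(CONS_TAG, [42, nil])  # the list {42}
```

With such a layout, a check like ISCONS only needs to read the tag from the
header word, and a field access is an index relative to the header.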
The data representation in MidCode reuses many parts of the MetaModelica rep-
resentation and is very similar.
There are some simple types: integer, real, enumeration, bool. Several
of the “meta”-prefixed types have the same representation in C,
modelica_metatype, so in MidCode they are unified under METATYPE. The
DAE types included are METATUPLE, METARECORD, METALIST, METAARRAY,
and METAOPTION.
4.1.7 Complexity of MidCode
For the complexity analysis, table 4.1 contains LOC measurements for the current
code generator, while table 4.2 contains the LOC measurements for the code
introduced in this thesis. The differences are large, but not all of the files
included in the results here are relevant for comparisons, and the comparisons
made are further qualified and elaborated upon in later chapters.
Lines   File
 6094   CodegenC.tpl
 6995   CodegenCFunctions.tpl
12089   total
Table 4.1: Lines of code for corresponding parts of old back end

Lines   File
  727   MidToC.tpl
 1513   DAEToMid.mo
   99   HashTableMidVar.mo
  230   MidCode.mo
  184   MidToMid.mo
 2969   total
Table 4.2: Lines of code for new back end

Program      C        MidC     %
fibonacci    0.942s   1.498s   63%
mandelbrot   3.037s   3.309s   92%
tak          2.762s   2.459s   112%
qsort        2.368s   2.895s   82%
Table 4.3: Performance measurements for old and new code generator
4.2 Performance measurements
Table 4.3 contains the runtime measurements for code from the old code
generator (C) and the new MidCode-based code generator (MidC), together with
the performance of the new generator's code as a percentage of the performance
of the old generator's code.
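The percentage column is consistent with dividing the old runtime by the new
runtime, so that values above 100% mean the MidCode-generated code was
faster. The following Python snippet recomputes the column from the timings in
table 4.3; the function and variable names are made up for this example.

```python
# (old C generator runtime, MidCode-based generator runtime) in seconds
timings = {
    "fibonacci": (0.942, 1.498),
    "mandelbrot": (3.037, 3.309),
    "tak": (2.762, 2.459),
    "qsort": (2.368, 2.895),
}


def relative_performance(old_seconds, new_seconds):
    """Performance of the new code as a percentage of the old (higher is better)."""
    return round(100 * old_seconds / new_seconds)


percentages = {name: relative_performance(old, new)
               for name, (old, new) in timings.items()}
```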
Chapter 5
Discussion
5.1 Performance results
The performance measurements showed that performance was mostly similar,
which is not particularly surprising. While the MidCode-generated code was
noticeably better in some cases, it could also be significantly worse in others.
Many optimizations are made in DAE, which means MidCode gets the same
advantages as the previous code generator. As we have not yet implemented
impactful enough optimizations, the simplifications did not by themselves improve
the performance of the code generator. However, we hope that the new IR will
prove easy to implement optimizations in. A likely improvement is to use LLVM
instead of C as the output; this will likely be helped by having the simpler basic
block-based MidCode as a starting point.
5.2 Design of MidCode
During the design of MidCode there were many choices that had to be made. In
this section we will discuss some of them.
While SSA is a powerful tool in IRs, MidCode does not use SSA. In order to get
value from SSA we would need to perform an analysis like LLVM’s mem2reg,
which we initially did not want to plan for. One of our inspirations, Rust’s MIR,
also does not feature SSA because LLVM performs many of the analyses that use
SSA. Since MidCode is intended to be used with LLVM in the future, OpenModelica
will also be able to leverage these optimizations.
We decided to transform the many control flow constructs in DAE to fewer
MidCode constructs. We used basic blocks with terminators that point to the next
block, like MIR. This was done because control flow is crucial for being able to
reason about a program, and any logic used for transforming the code needs to be
able to handle every construct. Otherwise it may produce incorrect results, for
example if the analysis fails to notice that a return or break means some code is
not reached and its effects should not happen.
We did not change the types of variables because we did not find a suitable model
and since the DAE type system was considered sufficient at this stage. Maybe
with more time, we could have made a lower-level type system more suitable for
a low-level IR.
5.3 Code Complexity of MidCode
In the LOC comparison, we compare Susan templates to each other, hence we
compare the same style and use the same method. Thus, according to Nguyen et
al., the size measurement itself is useful for comparison [14].
Unfortunately, while the size is comparable, the functionality is not.
CodegenCFunctions does more things than our implementation, for
example supporting parallel computations. We do, however, plan for our template
file not to increase significantly in size as additional features are added to our
new back end. Thus we hope that the numbers would hold if future work
makes the comparison more fair in terms of features. But currently we
do not think that any particular importance can be placed on this result.
5.4 Related Work
MidCode is fairly similar to other intermediate representations mentioned in the
Theory chapter. Most notably, it takes inspiration from the MIR representation
used in the Rust compiler. Like MIR, it is based on flat primitive operations and
control-flow graphs with basic blocks and terminators, and does not use SSA to
describe the data flow. It also has a similar purpose, as the Rust compiler
introduced MIR as a new stage between the AST and the LLVM generation, whereas
previously the LLVM code was generated directly from the AST. The SIL
representation used in the Swift compiler also has a similar purpose and a similar
design, but has several differences compared to MidCode and MIR, like an
SSA-based data flow, basic block arguments, and calls implemented as regular
statements rather than terminators.
The GHC Haskell compiler also has several representations that we could compare
to. MidCode is intended as a low-level abstraction over C and thus fits GHC’s
Cmm very well. Compared to Cmm, MidCode has fewer features, and does not
allow control over register allocation or tail calls. So that leaves us to wonder
how OpenModelica’s version of STG or Core would look. Of these, Core seems
the most interesting since it is the “Core” of Haskell semantics. There is a similar
higher-level intermediate representation used in OpenModelica, DAE. DAE is used
as a lower-level representation than the AST, but it still retains control structures
expressed in redundant forms, for example for, while, and if. Core means
to get rid of redundant ways to express the same thing because, from a program
analysis point of view, fewer constructs means fewer cases to consider. MidCode
does get rid of redundant forms, e.g. control flow forms, and may be closer to
Core with regard to the purpose of the representations.
5.5 The work in a wider context
The work of this thesis will hopefully enable future performance improvements
to the OpenModelica compiler, and therefore increase productivity in the indus-
try and academia where OpenModelica-based simulations are done. It may also
increase competitiveness for the OpenModelica-based solution compared to pro-
prietary and commercial Modelica environments.
Chapter 6
Conclusion
While the project did not give immediate performance benefits, it provides a good
starting point for better optimizations and simpler code generation, and it has
potential to be further extended and fine-tuned to match the current and future
needs of the OpenModelica compiler project. It has been able to demonstrate a
practical way of lowering the representation of a subset of MetaModelica dealing
with imperative algorithms. However, even then there are several major Modelica
features left that have to be added to the representation in order to make it truly
useful in production.
6.1 Future work
The work has plenty of room for further extensions. One of the main motivators
for the work was to enable more optimizations. As an example, to implement
common subexpression elimination, we would like several general features. These
dataflow-related features include some way to track mutation of variables, like
static single assignment, and tracking whether an operation has side effects.
There are also interesting possibilities for control flow analysis. After performing
a branch on a condition we, in that branch, have the knowledge that the condition
is true, and vice versa for a condition shown to be false. This knowledge can be
used to simplify code in that branch, for example by removing another branch on
a condition implied true by the gained knowledge. By figuring out whether
a function can fail, a region of code could be shown not to fail and a PUSHJMP,
POPJMP pair may be removed, which in turn can allow for further optimizations
since side-effecting terminators were removed.
Implementing inlining in MidCode is a potent future technique that is very im-
portant for enabling optimizations by opening the callee to analysis and special-
ization.
MidCode does not currently support all of Modelica and MetaModelica. Support
for more language constructs can be added as future work. Similarly, some
support is lacking or incorrectly implemented; for example, the current
implementation has some issues with the handling of lexical scopes.
Appendix A
Performance test functions
A.1 Fibonacci
function fib
  input Integer i;
  output Integer o;
algorithm
  o := match i
    case 0 then 0;
    case 1 then 1;
    else then fib(i - 1) + fib(i - 2);
  end match;
end fib;
A.2 Mandelbrot
function mandelbrot_display
  output list<String> out;
algorithm
  out := {};
  for y in -39:39 loop
    for x in -39:39 loop
      if mandelbrot(x/40.0, y/40.0) == 0 then
        out := "*" :: out;
      else
        out := " " :: out;
      end if;
    end for;
    out := "\n" :: out;
  end for;
end mandelbrot_display;

function mandelbrot
  input Real x;
  input Real y;
  output Integer result;
protected
  Real cr, ci, zr, zi, zr2, zi2;
  Real tmp;
  Integer iter;
algorithm
  cr := x - 0.5;
  ci := y;
  zr := 0;
  zi := 0;
  iter := 0;
  while true loop
    iter := iter + 1;
    tmp := zr * zi;
    zr2 := zr * zr;
    zi2 := zi * zi;
    zr := zr2 - zi2 + cr;
    zi := 2 * tmp + ci;
    if zi2 + zr2 > 16 then
      result := iter;
      return;
    end if;
    if iter > 1000 then
      result := 0;
      return;
    end if;
  end while;
end mandelbrot;
A.3 Quicksort
function qsort
  input list<Integer> l;
  output list<Integer> o;
protected
  list<Integer> smaller;
  list<Integer> larger;
  Integer pivot;
  list<Integer> rest;
algorithm
  o := match l
    case {} then {};
    case pivot::rest
      algorithm
        (smaller, larger) := partition(pivot, rest, {}, {});
      then listAppend(qsort(smaller),
                      listAppend({pivot}, qsort(larger)));
  end match;
end qsort;

function partition
  input Integer pivot;
  input List<Integer> l;
  input output List<Integer> smaller;
  input output List<Integer> larger;
protected
  Integer head;
  list<Integer> tail;
algorithm
  (smaller, larger) := match l
    case {} then (smaller, larger);
    case head::tail guard head <= pivot
      then partition(pivot, tail, head::smaller, larger);
    case head::tail guard head > pivot
      then partition(pivot, tail, smaller, head::larger);
    else then ({},{});
  end match;
end partition;
A.4 Takeuchi function
function tak
  input Integer x;
  input Integer y;
  input Integer z;
  output Integer o;
algorithm
  o := if y < x then
    tak(tak(x-1, y, z), tak(y-1, z, x), tak(z-1, x, y))
  else
    z;
end tak;
Bibliography
[1] A.V. Aho, M.S. Lam, J.D. Ullman, and R. Sethi.
Compilers: Principles, Techniques, and Tools. Pearson Education, 2011.
isbn: 9780133002140.
[2] Efficient IR for the OpenModelica Compiler (thesis proposal). 2016.
url: https://openmodelica.org/images/docs/OpenMasterThesis/2016_Efficient_IR_for_OpenModelica_compiler_v1.pdf.
[3] Peter Fritzson et al. OpenModelica System Documentation, Version 2014-02-01 for Modelica 1.9.1 Beta1. Feb. 2014.
url: https://github.com/OpenModelica/OpenModelica-doc/blob/d5928d96c0157e3c8762b2b85b67a7a963be9763/OpenModelicaSystem.pdf.
[4] Peter Fritzson. Principles of Object-Oriented Modeling and Simulation with Modelica 3.3: A Cyber-Physical Approach. 2015. isbn: 9781118859124.
[5] Peter Fritzson, Peter Aronsson, Håkan Lundvall, Kaj Nyström, Adrian Pop,
Levon Saldamli, and David Broman. “The OpenModelica Modeling,
Simulation, and Development Environment”. In: 2005.
[6] Peter Fritzson, Adrian Pop, and Martin Sjölund. Towards Modelica 4 Meta-Programming and Language Modeling with MetaModelica 2.0.
Tech. rep. 2011:10. Linköping University, PELAB - Programming
Environment Laboratory, May 2011. 297 pp.
url: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68361 (visited on 04/01/2013).
[7] Simon L Peyton Jones. “Compiling Haskell by program transformation: A
report from the trenches”. In: European Symposium on Programming.
Springer. 1996, pp. 18–44.
[8] Simon L Peyton Jones. “Implementing lazy functional languages on stock
hardware: the Spineless Tagless G-machine”.
In: Journal of functional programming 2.02 (1992), pp. 127–202.
[9] Simon Marlow, S Peyton Jones, et al. The Glasgow Haskell Compiler. 2004.
[10] Niko Matsakis. Introducing MIR. 2016.
url: https://blog.rust-lang.org/2016/04/19/MIR.html.
[11] Niko Matsakis. Rust RFC #1211. 2015.
url: https://github.com/nox/rust-rfcs/blob/master/text/1211-mir.md.
[12] Jason Merrill.
“GENERIC and GIMPLE: A new tree representation for entire functions”.
In: Proceedings of the 2003 GCC Developers’ Summit. 2003, pp. 171–179.
[13] MetaModelica documentation at openmodelica.org.
url: https://build.openmodelica.org/Documentation/MetaModelica.html.
[14] Vu Nguyen, Sophia Deeds-rubin, Thomas Tan, and Barry Boehm.
“A SLOC Counting Standard”. In: COCOMO II Forum 2007.
[15] OMCompiler/DAE.mo. Mar. 2017.
url: https://github.com/OpenModelica/OMCompiler/blob/b8fe1840ca6a758e39255b674a549a2c7c4a4bbe/Compiler/FrontEnd/DAE.mo.
[16] Martin Otter and Hilding Elmqvist.
“Modelica-Language, Libraries, Tools, Workshop and EU-Project”.
In: (2001).
[17] Fabrice Rastello. SSA-based Compiler Design. 1st.
Springer Publishing Company, Incorporated, 2016.
[18] Martin Sjölund, Peter Fritzson, and Adrian Pop. “Bootstrapping a
Compiler for an Equation-Based Object-Oriented Language”.
In: Modeling, Identification and Control 35.1 (2014), pp. 1–19.
[19] J Stanier and D Watson.
“Intermediate Representations in Imperative Compilers: A Survey”.
In: ACM Computing Surveys 45.3 (2013). issn: 03600300.
[20] Swift Intermediate Language (SIL). 2017.
url: https://github.com/apple/swift/blob/57ecaa7fae78d30ae4f90cb4606c98504723717e/docs/SIL.rst.
[21] David A Terei and Manuel MT Chakravarty.
“An LLVM backend for GHC”. In: ACM Sigplan Notices. Vol. 45. 11.
ACM. 2010, pp. 109–120.