
An Implementation of the Abstract Programming Model

Final Report

Alexander Broadbent

Student Number: 110225104

Supervisor: Matthew Huntbach

25th April 2016

School of Electronic Engineering and Computer Science

Alexander Broadbent 110225104

Page 1 of 47

Acknowledgements

Thanks to my project supervisor, Matthew Huntbach, for the advice at each stage of the project. Also thanks to my brother, Philip Broadbent, for the continued help over the course of the project.


Abstract

Programming language theory is a branch of computer science that deals with the design, implementation, analysis, characterisation and classification of programming languages and their individual features. A model of computation defines a set of allowable operations and functions used in computation and their costs [1].

The main objective of the project is to implement a programming language.


Table of Contents

Acknowledgements
Abstract
Table of Contents
Table of Figures
1 Introduction
1.1 Project Aim
1.2 Project Motivation
1.3 Project Structure
2 Background Research
2.1 Research Definitions
2.1.1 Compiler
2.1.2 Interpreter
2.1.3 Operator Precedence
2.1.4 Infix Notation
2.1.5 Stack
2.1.6 Reverse Polish Notation (Postfix)
2.1.7 Literal
2.1.8 Finite State Machine
2.1.9 Bitwise Operations
2.1.10 Lexical Analysis
2.1.11 Syntactical Analysis
2.1.12 Currying
2.1.13 Manual Memory Management
2.1.14 Eager Evaluation
2.1.15 Lazy Evaluation
2.1.16 Agile Software Development
2.2 Research Topics
2.2.1 SECD Machine
2.2.2 Lambda Calculus
2.2.3 Shunting-Yard Algorithm
2.2.4 Postfix Evaluation Algorithm
2.3 Research Summary
3 Scope
4 Requirements
5 Design
5.1 Development Environment
5.2 Coding Design
5.3 User Interface
5.4 High Level Design
5.4.1 Singleton Design Pattern
5.4.2 Template Method Design Pattern
5.5 Low Level Design
5.5.1 Model Package
5.5.2 GUI Package
5.5.3 Eval Package
5.5.4 Lexer Package
5.5.5 Parser Package
5.5.6 Operator Package
6 Implementation
6.1 Finite State Machine Model Implementation
6.2 Lambda Calculus Model Implementation
6.3 Final Solution
6.3.1 Lexer (Lexical Analysis)
6.3.2 Operator Precedence
6.3.3 Parser (Syntactical Analysis)
6.3.4 Shunting-Yard Algorithm
6.3.5 Expression Execution
6.3.6 Custom Function Definition
6.3.7 Lazy Evaluation
6.3.8 Linked Lists
6.3.9 For Loops
6.3.10 Recursive Expressions
6.3.11 Syntactic Sugar
6.4 Evaluation
7 Testing
7.1 Low Level Design
7.1.1 Framework Package
7.1.2 Architecture Package
7.1.3 Feature Package
7.1.4 Function Package
7.1.5 Operator Package
7.1.6 Syntax Package
7.1.7 Type Package
7.2 Framework Creation
7.3 Test Case Setup
7.4 Test Coverage
7.5 Testing Analysis
8 Conclusions
9 Future Developments
9.1 Implementing the Backus-Naur Form Analysis
9.2 Return Results in Input Format
9.3 Full Recursion
9.4 Memorization within the Domain
9.5 How to: Add a New Operator or Function
10 References
11 Appendices

Table of Figures

Figure 1 - Example of infix to postfix
Figure 2 - A short example of the shunting-yard algorithm
Figure 3 - A screenshot of the user interface displaying a list of variables
Figure 4 - High level class diagram of the Domain class
Figure 5 - An example of the Template Method in the toPostFix method
Figure 6 - Example evaluation of the finite state automaton model
Figure 7 - UML diagram of the Expression Node class hierarchy
Figure 8 - Pseudo code of the parse method
Figure 9 - Pseudo code for the shunting yard algorithm
Figure 10 - Pseudo code for the postfix algorithm
Figure 11 - An example of a custom function declaration and execution
Figure 12 - UML diagram of the framework package within the test folder
Figure 13 - Test coverage run at the end of the final implementation
Figure 14 - Example code for an Average function class
Figure 15 - Sample execution of the final solution
Figure 16 - Sample output of the JUnit tests

Table 1 - Example shunting-yard algorithm analysis
Table 2 - Example postfix algorithm analysis
Table 3 - Tokens used in the Lexical Analysis of inputs
Table 4 - Operator precedence values
Table 5 - Testing Framework types and purposes
Table 6 - Average run time to execute all tests in the testing framework
Table 7 - Base classes used for Operators


1 Introduction

This chapter provides an overview of the motivations and decisions that occurred before starting the actual project implementation.

1.1 Project Aim

The aim of the project is to implement a model of computation in order to create a programming language.

The principal objectives of the project are to:

• Understand how programming languages work, and how to implement one

• Develop an extensible programming language foundation

• Build features of the language upon this foundation

1.2 Project Motivation

Programming languages play a major role in computer science, and every student of the subject has learnt to use at least one. Most computers currently in use are built upon a handful of programming languages, which makes these languages very important in modern technology.

This importance motivated the choice of topic. Programming languages are an intrinsic and complex aspect of computer science as a whole. Personally, I have been programming for over 10 years and still have a lot to learn about how programming languages actually work.

The nature of this project is purely academic, requiring an understanding of the theory behind programming languages and a great depth of research within the topic. Programming languages are at the foundation of computer science, and this, combined with my personal interest, provides the motivation to learn how languages are implemented.

1.3 Project Structure

As the aim of the project is to implement a model of computation, there are multiple models to explore, research and possibly implement. Because three models were implemented over the course of the project, each chapter discusses each implementation separately where appropriate.


2 Background Research

Before and during the implementation of the project, a great deal of research was carried out in order to understand the topics involved in programming languages. The order in which the topics appear is not necessarily the order in which they were researched; they have been intentionally ordered by complexity.

This chapter is divided into two sections: the first defines the topics that were researched, and the second describes in more detail the subjects that are key to the implementation of the project.

2.1 Research Definitions

This section covers the areas researched during the project, providing a simple definition for each topic that will be discussed in the report.

2.1.1 Compiler

A compiler is a program that transforms source code (i.e. instructions in a high-level programming language) into a lower-level language, usually machine code or assembly language, so that the program can be executed.

2.1.2 Interpreter

An interpreter is a program that immediately runs the instructions given to it as input, without a separate compilation step beforehand.

2.1.3 Operator Precedence

The precedence of an operator denotes its order of execution when compared to other operators. Most programming languages follow the same ordering commonly used in mathematics. In addition, many operators (such as subtraction and division) are not associative, so a rule is needed for grouping operators of equal precedence. They are usually grouped left to right, matching the order in which the user wrote them; for example, the expression 16 ÷ 4 ÷ 4 would be evaluated as (16 ÷ 4) ÷ 4.
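The grouping rule can be observed directly in Java, whose division operator is left-associative. The short sketch below is illustrative only and is not code from the project; the class and method names are made up for the example.

```java
class AssociativityDemo {
    // Parsed as (16 / 4) / 4 because "/" groups left to right.
    static int leftGrouped() {
        return 16 / 4 / 4;
    }

    // Brackets force the other grouping: 16 / (4 / 4).
    static int rightGrouped() {
        return 16 / (4 / 4);
    }

    // Multiplication has higher precedence than addition: 2 + (3 * 4).
    static int precedence() {
        return 2 + 3 * 4;
    }

    public static void main(String[] args) {
        System.out.println(leftGrouped());  // 1
        System.out.println(rightGrouped()); // 16
        System.out.println(precedence());   // 14
    }
}
```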

2.1.4 Infix Notation

Infix notation is the mathematical notation in which operators are placed between their operands. It is the most common notation for expressing a formula. Brackets are used to override the default order of operations, as in 3 + (4 − 2) ∗ 3.

2.1.5 Stack

A stack is a data type which holds a collection of elements and allows an element to be pushed onto, or removed (popped) from, the top of the stack only. It therefore implements a last-in-first-out (LIFO) policy.
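As a small illustration of the LIFO policy (a sketch, not code from the project), the standard library class java.util.ArrayDeque can be used as a stack. The class and method names below are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class StackDemo {
    /** Pushes the values in the order given, then pops them all and returns the pop order. */
    static int[] pushThenPopAll(int... values) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int value : values) {
            stack.push(value);            // place the value on top of the stack
        }
        int[] popped = new int[values.length];
        for (int i = 0; i < popped.length; i++) {
            popped[i] = stack.pop();      // remove the most recently pushed value
        }
        return popped;
    }

    public static void main(String[] args) {
        // Values pushed as 1, 2, 3 come back as 3, 2, 1.
        System.out.println(Arrays.toString(pushThenPopAll(1, 2, 3)));
    }
}
```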

2.1.6 Reverse Polish Notation (Postfix)

Reverse Polish Notation (RPN), also known as postfix notation, is a notation in which operators follow their operands. Operator precedence is applied when converting from infix to postfix, after which the order of operations is fixed by the position of the tokens, and so brackets are no longer necessary.

3 ÷ 𝑥 − (4 + 2) → 3 𝑥 ÷ 4 2 + −


Figure 1 - Example of infix to postfix

2.1.7 Literal

In terms of data representation, a literal is a notation for a fixed value. The value can be of any type, such as a number, text, boolean, or decimal.

2.1.8 Finite State Machine

A finite-state machine, also referred to as a finite-state automaton, is an abstract machine that is in exactly one of a finite number of states at any time. It moves between states as it processes its input, and the input is accepted if the machine stops in an accepting state.
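To make the definition concrete, the following sketch shows a small finite-state machine that accepts unsigned decimal numbers such as 42 or 3.14. It is a hypothetical example written for this section and is not the finite state machine model implemented in the project; the state names are assumptions.

```java
class NumberMachine {
    enum State { START, INTEGER, POINT, FRACTION, REJECT }

    /** Transition function: the next state for the current state and input character. */
    static State step(State state, char c) {
        boolean digit = c >= '0' && c <= '9';
        switch (state) {
            case START:    return digit ? State.INTEGER : State.REJECT;
            case INTEGER:  return digit ? State.INTEGER : (c == '.' ? State.POINT : State.REJECT);
            case POINT:    return digit ? State.FRACTION : State.REJECT;
            case FRACTION: return digit ? State.FRACTION : State.REJECT;
            default:       return State.REJECT;   // REJECT is a dead state
        }
    }

    /** Runs the machine over the whole input and reports whether it ends in an accepting state. */
    static boolean accepts(String input) {
        State state = State.START;
        for (char c : input.toCharArray()) {
            state = step(state, c);
        }
        return state == State.INTEGER || state == State.FRACTION;
    }

    public static void main(String[] args) {
        System.out.println(accepts("3.14")); // true
        System.out.println(accepts("3."));   // false: stops in POINT, which is not accepting
    }
}
```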

2.1.9 Bitwise Operations

A bitwise operation operates on one or more bit patterns or binary numerals at the level of their individual bits. These operations are primitive and fast, because they are supported directly by the processor at low cost.
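A few common uses of bitwise operators in Java are sketched below. The helper names are invented for illustration and do not come from the project.

```java
class BitwiseDemo {
    /** A number is even when its lowest bit is 0. */
    static boolean isEven(int n) {
        return (n & 1) == 0;
    }

    /** Shifting left by three bits multiplies by 2^3 = 8 (ignoring overflow). */
    static int timesEight(int n) {
        return n << 3;
    }

    /** Sets the given bit position in a set of flags using bitwise OR. */
    static int setFlag(int flags, int bit) {
        return flags | (1 << bit);
    }

    /** Tests whether the given bit position is set using bitwise AND. */
    static boolean hasFlag(int flags, int bit) {
        return (flags & (1 << bit)) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isEven(6));       // true
        System.out.println(timesEight(5));   // 40
        System.out.println(setFlag(0, 2));   // 4
        System.out.println(hasFlag(4, 2));   // true
    }
}
```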

2.1.10 Lexical Analysis

The process of converting an input, usually a sequence of characters, into a list of

tokens, i.e. giving a character sequence an identified meaning.
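The idea can be sketched with a very small tokeniser. This is an assumption-laden illustration rather than the project's lexer: the token types, the class name MiniLexer and the supported characters are all hypothetical, and the number rule is deliberately simplistic (it would accept 1.2.3 as a single token).

```java
import java.util.ArrayList;
import java.util.List;

class MiniLexer {
    enum Type { NUMBER, IDENTIFIER, OPERATOR, LEFT_BRACKET, RIGHT_BRACKET, COMMA }

    static final class Token {
        final Type type;
        final String text;

        Token(Type type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + ":" + text;
        }
    }

    /** Converts a character sequence into a list of tokens, skipping whitespace. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isDigit(c)) {
                // Consume digits and decimal points as one number token.
                while (i < input.length()
                        && (Character.isDigit(input.charAt(i)) || input.charAt(i) == '.')) {
                    i++;
                }
                tokens.add(new Token(Type.NUMBER, input.substring(start, i)));
            } else if (Character.isLetter(c)) {
                // Consume letters and digits as one identifier (e.g. a function name).
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(Type.IDENTIFIER, input.substring(start, i)));
            } else {
                i++;
                String text = String.valueOf(c);
                switch (c) {
                    case '(': tokens.add(new Token(Type.LEFT_BRACKET, text)); break;
                    case ')': tokens.add(new Token(Type.RIGHT_BRACKET, text)); break;
                    case ',': tokens.add(new Token(Type.COMMA, text)); break;
                    case '+': case '-': case '*': case '/':
                        tokens.add(new Token(Type.OPERATOR, text)); break;
                    default:
                        throw new IllegalArgumentException("Unexpected character: " + c);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("max(2, 3) / 3"));
    }
}
```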

2.1.11 Syntactical Analysis

This process, more commonly known as parsing, is the analysis of a string of symbols to determine whether, and how, it conforms to the rules of a formal grammar.

2.1.12 Currying

The technique of transforming a function that takes multiple arguments into a sequence of functions that each take a single argument. In this project, currying was used to take a lambda expression with multiple variables and substitute a given value for one variable, returning a function with one fewer unassigned variable. For instance, taking the lambda expression λx.λy.λz.(x × y + z) and giving z the value 5 returns the expression λx.λy.(x × y + 5), which has two unassigned variables instead of three.
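The same idea can be sketched in Java using java.util.function.Function. This is an illustration only, not the project's implementation; to mirror the example above, z is assumed to be the first argument supplied, so the curried form is written as z, then x, then y.

```java
import java.util.function.Function;

class CurryDemo {
    // Curried form of (z, x, y) -> x * y + z: each function takes exactly one argument.
    static final Function<Integer, Function<Integer, Function<Integer, Integer>>> CURRIED =
            z -> x -> y -> x * y + z;

    public static void main(String[] args) {
        // Supplying z = 5 leaves a function that still waits for x and y.
        Function<Integer, Function<Integer, Integer>> withZ = CURRIED.apply(5);
        System.out.println(withZ.apply(2).apply(3)); // 2 * 3 + 5 = 11
    }
}
```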

2.1.13 Manual Memory Management

Manual memory management is the use of explicit instructions by the programmer to identify and deallocate unused objects. The programmer must avoid memory leaks, where unused objects are never returned to free storage, and must ensure that both pointers and objects are deleted properly so that no pointer refers to a deleted object. The Java language instead provides a garbage collection mechanism that reclaims unused memory at run time.

2.1.14 Eager Evaluation

An evaluation strategy, used by most programming languages, in which expressions are evaluated as soon as possible. Also known as strict evaluation, the strategy has several benefits: the results of expressions do not need to be tracked or scheduled, and expressions follow a predefined order of evaluation. However, eager evaluation has some costs: results may be calculated unnecessarily, and the user is required to write expressions in execution order [2].

2.1.15 Lazy Evaluation

An evaluation strategy in which an expression is evaluated only when its result is needed by another expression. Also known as non-strict evaluation, it can make programs faster by reducing the amount of unnecessary processing [3]. A technique called memoization can also be used to cache the results of computations so that they can be retrieved quickly without repeating the work [4].
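A minimal sketch of lazy evaluation with memoization in Java is given below. It is not taken from the project: the class Lazy, the counter and the example computation are assumptions made for illustration, and the sketch is not thread-safe.

```java
import java.util.function.Supplier;

/** Wraps a computation that is run at most once, on first demand. */
class Lazy<T> implements Supplier<T> {
    private Supplier<T> computation;   // released after the first evaluation
    private T value;
    private boolean evaluated;

    Lazy(Supplier<T> computation) {
        this.computation = computation;
    }

    @Override
    public T get() {
        if (!evaluated) {
            value = computation.get(); // evaluate only when the result is needed
            evaluated = true;
            computation = null;
        }
        return value;                  // later calls return the cached result
    }
}

class LazyDemo {
    static int evaluations = 0;

    /** Requests the same lazy value twice and returns the sum of the two results. */
    static int runDemo() {
        evaluations = 0;
        Lazy<Integer> expensive = new Lazy<Integer>(() -> {
            evaluations++;             // counts how often the computation actually runs
            return 6 * 7;
        });
        // Nothing has been computed at this point.
        int first = expensive.get();   // runs the computation
        int second = expensive.get();  // served from the cache
        return first + second;
    }

    public static void main(String[] args) {
        System.out.println(runDemo() + " computed with " + evaluations + " evaluation(s)");
    }
}
```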

2.1.16 Agile Software Development

Agile software development, whose ideas can be traced back to 1974 [5] and whose principles were set out in the Agile Manifesto in 2001, defines a set of principles for developers to follow in order to complete their product lifecycle efficiently. There are many different implementations of the agile methodology, each suited to a particular combination of developers and stakeholders.

2.2 Research Topics

This section introduces areas of research in more detail, as these topics are more complex or have been used more extensively within the project.

2.2.1 SECD Machine

The SECD Machine is a virtual machine that was created as a target for functional programming language compilers. The machine has four registers (Stack, Environment, Control, and Dump), whose initial letters form the name of the machine. Peter J. Landin introduced it as an abstract machine specifically for evaluating lambda expressions [6]. It is similar to the shunting-yard algorithm, though more complex in its evaluation.

2.2.2 Lambda Calculus

Lambda calculus is a formal mathematical system for expressing computation based on function abstraction and application, using variable binding and substitution. It is a universal model of computation, implemented by functional programming languages [7], and has played an important role in developing the theory of programming languages.

Lambda calculus has many applications in areas of mathematics, computer science and beyond. For example, in the functional programming language LISP, the anonymous function that squares a number can be defined as (lambda (x) (* x x)). These anonymous functions are often referred to as lambda expressions. Programming languages can evaluate lambda expressions using different reduction strategies (such as full beta reduction, normal order, or call by need) and can also evaluate them in parallel or concurrently.
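For comparison with the LISP example, a lambda expression can also be written in Java. The sketch below is illustrative only and is not part of the project; applying the function to an argument plays the role of a beta reduction, and the helper twice is a made-up example of a function that takes another function as its argument.

```java
import java.util.function.Function;

class LambdaDemo {
    // Java counterpart of the LISP expression (lambda (x) (* x x)).
    static final Function<Integer, Integer> SQUARE = x -> x * x;

    /** Returns a function that applies f two times in succession. */
    static <T> Function<T, T> twice(Function<T, T> f) {
        return x -> f.apply(f.apply(x));
    }

    public static void main(String[] args) {
        System.out.println(SQUARE.apply(4));        // 16
        System.out.println(twice(SQUARE).apply(3)); // (3 * 3) squared = 81
    }
}
```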


2.2.3 Shunting-Yard Algorithm

A method for converting an expression in infix notation to postfix (Reverse Polish) notation. Created by Edsger Dijkstra [8], and named after its resemblance to a railroad shunting yard, the algorithm moves operands from the input directly to the output and keeps a stack of operators, which are popped off to the output in order of their precedence.

Figure 2 - A short example of the shunting-yard algorithm

For example, the analysis of the input 𝑠𝑖𝑛(𝑚𝑎𝑥(2, 3) ÷ 3 × 3.14) can be expressed in

the following table:

Token          | Action                       | Output (in RPN)         | Operator Stack

sin            | Push token to stack          | (empty)                 | sin

max            | Push token to stack          | (empty)                 | max sin

2              | Add token to output          | 2                       | max sin

3              | Add token to output          | 2 3                     | max sin

÷              | Pop token to output          | 2 3 max                 | ÷ sin

3              | Add token to output          | 2 3 max 3               | ÷ sin

×              | Pop token to output          | 2 3 max 3 ÷             | × sin

3.14           | Add token to output          | 2 3 max 3 ÷ 3.14        | × sin

(end of input) | Pop operator stack to output | 2 3 max 3 ÷ 3.14 × sin  | (empty)

Table 1 - Example shunting-yard algorithm analysis

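This analysis shows how each token of the input is added to the output directly if it is a value. If the token is an operator, it is pushed onto the operator stack after any operators already on the stack with a higher or equal precedence have been popped off to the output; this is why × pops ÷ in the table. Brackets and argument separators are omitted from the table for brevity.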
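The conversion can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the input is assumed to be already split into tokens, the ASCII symbols / and * stand in for ÷ and ×, only the four left-associative arithmetic operators and the functions sin and max are recognised, and the precedence values are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ShuntingYard {
    /** Hypothetical precedence values; higher binds tighter, -1 means "not an operator". */
    static int precedence(String token) {
        switch (token) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            default: return -1;
        }
    }

    static boolean isFunction(String token) {
        return token.equals("sin") || token.equals("max");
    }

    /** Converts a list of infix tokens into postfix (Reverse Polish) order. */
    static List<String> toPostfix(List<String> infix) {
        List<String> output = new ArrayList<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String token : infix) {
            if (isFunction(token) || token.equals("(")) {
                operators.push(token);
            } else if (precedence(token) > 0) {
                // Left-associative: pop operators of higher or equal precedence first.
                while (!operators.isEmpty() && precedence(operators.peek()) >= precedence(token)) {
                    output.add(operators.pop());
                }
                operators.push(token);
            } else if (token.equals(",") || token.equals(")")) {
                // Flush operators back to the matching opening bracket.
                while (!operators.isEmpty() && !operators.peek().equals("(")) {
                    output.add(operators.pop());
                }
                if (operators.isEmpty()) {
                    throw new IllegalArgumentException("Mismatched brackets or separator");
                }
                if (token.equals(")")) {
                    operators.pop(); // discard the "("
                    if (!operators.isEmpty() && isFunction(operators.peek())) {
                        output.add(operators.pop()); // the function that owned the brackets
                    }
                }
            } else {
                output.add(token); // a value goes straight to the output
            }
        }
        while (!operators.isEmpty()) {
            String operator = operators.pop();
            if (operator.equals("(")) {
                throw new IllegalArgumentException("Mismatched brackets");
            }
            output.add(operator);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> infix = Arrays.asList(
                "sin", "(", "max", "(", "2", ",", "3", ")", "/", "3", "*", "3.14", ")");
        System.out.println(String.join(" ", toPostfix(infix))); // 2 3 max 3 / 3.14 * sin
    }
}
```

Run on the tokens of the example above, the sketch produces the same final output as Table 1, with the brackets and comma handled explicitly rather than skipped.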

2.2.4 Postfix Evaluation Algorithm

The postfix algorithm defines how to evaluate any expression that is in Reverse Polish Notation. The algorithm uses a stack: values are pushed onto it, and each operator pops off the number of operands it requires and pushes its result back onto the stack. After the algorithm has finished, only one value should remain on the stack, which is the result of the expression. For the example in Table 1, which gives the output 2 3 max 3 ÷ 3.14 × sin, the analysis can be expressed in the following table:

Token | Action                                     | Stack

2     | Push token to stack                        | 2

3     | Push token to stack                        | 3 2

max   | Pop two values from stack, and push result | 3

3     | Push token to stack                        | 3 3

÷     | Pop two values from stack, and push result | 1

3.14  | Push token to stack                        | 3.14 1

×     | Pop two values from stack, and push result | 3.14

sin   | Pop one value from stack, and push result  | ≈ 0

Table 2 - Example postfix algorithm analysis

The algorithm assumes that each operator requires a known number of operands, which is why the action in this analysis states how many operands are popped from the stack. The same idea can be used to validate an expression: a counter is incremented for each token, and for each operator the number of operands it requires is subtracted. The counter must never go below zero, and a final result of 1 means the expression is valid.
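Both the evaluation and the counter-based validation can be sketched in Java. As before, this is an illustration under assumptions rather than the project's code: tokens are plain strings, / and * stand in for ÷ and ×, and the set of operators and their operand counts are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class PostfixEvaluator {
    /** Number of operands a token consumes, or 0 if the token is a value. */
    static int arity(String token) {
        switch (token) {
            case "+": case "-": case "*": case "/": case "max": return 2;
            case "sin": return 1;
            default: return 0;
        }
    }

    /** Counter-based validation: the expression is valid if exactly one result remains. */
    static boolean isValid(List<String> postfix) {
        int count = 0;
        for (String token : postfix) {
            count -= arity(token);   // operands consumed by an operator
            if (count < 0) {
                return false;        // an operator ran out of operands
            }
            count++;                 // the value, or the operator's result, is pushed
        }
        return count == 1;
    }

    static double apply(String operator, double left, double right) {
        switch (operator) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            case "max": return Math.max(left, right);
            case "sin": return Math.sin(right); // unary: only the popped value is used
            default: throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    /** Evaluates a postfix expression using a stack of intermediate values. */
    static double evaluate(List<String> postfix) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : postfix) {
            int operands = arity(token);
            if (operands == 0) {
                stack.push(Double.parseDouble(token));
                continue;
            }
            if (stack.size() < operands) {
                throw new IllegalArgumentException("Too few operands for " + token);
            }
            double right = stack.pop();
            double left = (operands == 2) ? stack.pop() : 0.0;
            stack.push(apply(token, left, right));
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("Expression leaves " + stack.size() + " values");
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        List<String> postfix = Arrays.asList("2", "3", "max", "3", "/", "3.14", "*", "sin");
        System.out.println(isValid(postfix));
        System.out.println(evaluate(postfix)); // sin(3.14), a value close to zero
    }
}
```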

2.3 Research Summary

From the outset of the project, a very large amount of research was needed before any implementation could begin. This was due to having no prior knowledge or experience of the theory of programming languages or of implementing an abstract programming model.

Initially, the main area of research was the basics of programming language theory. This led to an understanding of compilers, interpreters, lexical analysis, syntactic analysis, and parse trees. These topics were enough to begin the first implementation of the programming language; however, while that implementation was in progress, some flaws were noticed that required further research.

As the scope of the project changed, each change to the implementation brought new areas that had to be researched. In order to implement the lambda calculus model, lambda expressions had to be thoroughly researched, and techniques such as currying and beta-reduction had to be understood before they could be implemented.

Over the course of the project, many areas had to be researched. While not all of the topics are in the final implementation, the knowledge gained from those areas was essential to completing this project and to understanding further how the abstract programming model can be implemented.


3 Scope

Early in the project, a significant decision was made about how far into the process of creating a programming language the project would go. The original idea was to write the project in the programming language C, so as to make the best use of full control over memory management. However, learning a whole language, in addition to the research needed to implement a programming language, was too much for the timeline of the project. Therefore, the decision was made to implement all the features of a programming language, but in a simpler format, using the familiar programming language Java.

In the specification, the scope of the project was to implement two different programming languages. The first was to be a sort of tutorial, implementing the lambda calculus model. The second was to be the final implementation of a concurrent programming model using single-assignment, single-writer, multiple-reader variables, as described in an academic paper written by the project supervisor, Matthew Huntbach [9].

During the first implementation, problems were encountered in understanding the

concepts and so a simpler model of computation was implemented instead. The

scope of the project then changed in order to implement the finite state machine

model and finally the lambda calculus model became the eventual solution.

The first implementation of the finite state machine model was completed within the expected time. This left an eight-week period to fully research and implement the lambda calculus model, which initially seemed to be enough time.

However, during the implementation of the lambda calculus model, there proved to

be too much research required in order to implement the language within the timeline

of the project. Therefore, the lambda calculus features were removed from the

language in order to have a working implementation at the end of the project.

As the current language still had a solid foundation, which formed an implementation of the abstract programming model in its own right, the choice was made to extend the current implementation towards the original objective of the single-assignment, single-writer, multiple-reader variables model.

Some of the changes in scope during the project can be attributed to miscalculating how long the research and planning stages of the implementation would take, and also to being initially overambitious and aiming to embark on a more complex project than the time constraints allowed.

However, overall taking on a project in which a large amount of research is required

has been good for personal development and proved to be an intellectually

stimulating journey into both academia and good coding practices.


4 Requirements

A programming language has a defined set of components, and identifying these was the first step in determining what is required in order to implement such a language.

In order to implement the abstract model, only a subset of the overall components of a programming language is required. The aim of the implementation at each stage is to code a low-level architecture which can be extended by functions, so that implementing a new feature or component of the language is a fairly minimal addition in terms of code.

This minimalist design to the high-level side of the code meant that the low-level

code would have to be robust and flexible enough to support all of the components

of the language if they were to be implemented.

The requirements of the language to be implemented can be defined by the

requirements of the project, which were to understand how programming languages

work and how to implement one, to develop an extensible programming language

foundation, and to build features of the language upon this foundation.

While the aims of the project do not specify a particular abstract model to implement,

the specification of the project defines two models to implement over the duration.

Therefore, the project could be considered a success with the aims being met even if

only one model is implemented, or a different one entirely. The aim is to implement a

programming language by any model of computation available.

The project was created using the agile methodology, which uses short bursts of planning and development to write the code for the project rapidly. Because the project specification changed at various stages, the agile approach to writing the implementation has been key to the success of the project. There are several different implementations of the agile methodology, of which I had previous experience with Scrum. However, Scrum is most effective within a team of developers, and so only its basic elements were necessary for a single developer.

The short implementation bursts mean that features of the code can be added in a

timely manner, and if the feature cannot be implemented cleanly by the end of this

timespan then the feature could be pulled and left as an expansion for future work. This

keeps the project at a healthy pace and ensures that unrealistic goals are not

entertained for long enough to risk affecting the success of the implementation.

A clean testing framework is also required to ensure that when building features in

this rapid agile manner, no previous features are broken. Regressions can occur for

any reason, especially when making changes to the base code of the language, and

so it will be imperative for the success of this project to keep on top of failed tests.

Each feature that is implemented should have, at least, a test case for a successful

use and a test case for the correct exception being thrown for each unsuccessful use.

5 Design

The abstract programming model defines a set of allowed operations used in

computation. While there are many different implementations of this model there is

a basic set of operations that are implemented by all the programming languages

which implement these models. One of the aims of this project is to implement a

subset of these operations in order to have a functioning programming language

while building an extensible foundation in which all operations can be implemented.

While there were multiple implementations over the course of the project, each one

was created initially to implement the same operations, but through a different model

of computation, and build as many operations as possible in the time permitted or

leave the code in an extensible state to make the language reusable.

5.1 Development Environment

The application was written entirely in Java, and was developed using the IntelliJ

IDEA IDE. The external libraries were pulled in through Maven; the

libraries that were used are Apache Commons-Lang, Google Collections and JUnit.

While JUnit was used to create the testing framework, Commons-Lang and

Collections were used to create cleaner and more readable code.

5.2 Coding Design

The design of the code base matches the pattern outlined by the model of

computation which the code is implementing. Therefore, the design pattern changed

throughout the project when the different models were implemented.

At the beginning of the project, this was not an ideal situation to be in as it meant

having to spend time redesigning the flow of the code base which is very time

consuming. However, as the final implementation was the biggest and took most of

the project time, this design could be thought out fully and certain choices could be

made to increase the extensibility of the code base.

One major design choice that is the backbone of the final implementation was to use

the shunting yard algorithm, as this is the foundation of how all features are

implemented and, at its core, is very extensible.

In terms of the coding style, several conventions were adopted in order to make

the code as readable and extensible as possible. Interfaces have a name starting with “I”.

Classes do not contain inline static text; instead, every class declares private static fields to

contain such data, which improves code readability and robustness. Constants that are

used within a whole package are declared in a separate file (named

“IConstants”).

5.3 User Interface

The user interface was designed as a simple command-line interface. The user can

type a set of commands in and when the enter key is pressed the input would be

evaluated and executed.

A command-line interface is the best input for an interpreter to use and keeps the

programming language simple from the user perspective by hiding any technical

aspects of the language. A simple tutorial has been written in order for the user to

have a guide and a reference to all the implemented operators and functions of the

language.

Any errors raised by the input or execution are displayed to the user

without the overall program crashing. Several exception classes were created, each handling one

type of error that could occur within the program, and the user interface class

catches these exceptions and displays a user-friendly message.

During the implementation, a design choice was made to expose some of the

underlying memory of the application to the user by using a special command, which

can help with debugging as the user can check which variables they have created

and see a list of functions and operators available to them.

Figure 3 - A screenshot of the user interface displaying a list of variables

While the user interface is normally an important factor in software production, the

user interface in this project has been kept to a bare minimum. The design choice was to

hide the complexity of the program because the user does not need to know how

exactly the program works, only that their input has been evaluated and the expected

output is returned or a message is displayed to explain why a result was not possible.

The simplicity of the user interface also has benefits in the testing of the project, as

a simple test can be carried out to check the input and output of the console.

5.4 High Level Design

The implementation contains creational and behavioural design patterns. These

design patterns help to create an efficient program, and solve many design issues

faced by object-oriented programmers. Such issues include: determining object

granularity, specifying object interfaces and implementations, putting reuse

mechanisms to work, and delegation [10].

5.4.1 Singleton Design Pattern

The Domain class uses the Singleton design pattern to keep only one instance

of a domain object within the whole application, and provides one global point of

access to the instance.

Figure 4 - High level class diagram of the Domain class

At the start of the project, the domain was modelled with the idea of having

immutable objects that are read by multiple concurrent readers, as per the single-

assignment, single-writer, multiple-reader variables model. This way, there is only

one copy of variables and functions used within the language, which cuts out having

different values for the same variable after the evaluation has finished.
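The Singleton arrangement described above can be illustrated with a minimal sketch. The class name DomainSketch, the variable map and the two accessor methods are assumptions made for illustration; only the static getInstance method and the static instance field are taken from the description of the Domain class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Domain class; the real class also holds
// operators and functions.
final class DomainSketch {
    // The single shared instance, created lazily on first access.
    private static DomainSketch instance;

    // Assumed storage for variables, keyed by name.
    private final Map<String, Object> variables = new HashMap<>();

    // A private constructor prevents any other class creating an instance.
    private DomainSketch() { }

    // The one global point of access to the instance.
    static synchronized DomainSketch getInstance() {
        if (instance == null) {
            instance = new DomainSketch();
        }
        return instance;
    }

    void setVariable(String name, Object value) {
        variables.put(name, value);
    }

    Object getVariable(String name) {
        return variables.get(name);
    }
}
```

Because every caller receives the same object, a variable stored through one reference is visible through any other reference obtained from getInstance.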

This design pattern was the cause of further issues in the final solution, such as

creating a recursive function or using a for loop. In order to use recursion, a trace of

each variable is required as they can have a different value each time the recursive

call is made.

5.4.2 Template Method Design Pattern

Classes which implement the ICalculable interface use the Template Method

behavioural pattern. Using this design pattern takes away the design problem of

algorithmic dependencies by separating the algorithms that could change over the

course of development from the objects that depend on them.

All operators and functions in the language implement the ICalculable interface, and

each subclass can override the implementation it inherits from its superclass. For example,

in the case of the RightParenthesis class, the toPostFix method is implemented in the

base classes and again, differently, in the RightParenthesis class itself.

Figure 5 - An example of the Template Method in the toPostFix method

This pattern is used because, in general, the different base operator classes have

their own implementations that work for the operators which extend them, but there are

some classes that require their own specific implementation.
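The overriding described above can be sketched as follows. This is only an illustration: the class names, the use of strings for operators, and the signature of toPostFix are assumptions, and the sketch shows a subclass replacing an inherited default rather than the project's actual hierarchy.

```java
import java.util.Deque;
import java.util.List;

// Hypothetical base class: by default an operator pushes itself onto the
// operator stack during the postfix conversion.
abstract class OperatorSketch {
    final String symbol;

    OperatorSketch(String symbol) {
        this.symbol = symbol;
    }

    void toPostFix(Deque<String> stack, List<String> output) {
        stack.push(symbol);
    }
}

// These two operators are happy with the inherited behaviour.
class LeftParenSketch extends OperatorSketch {
    LeftParenSketch() { super("("); }
}

class PlusSketch extends OperatorSketch {
    PlusSketch() { super("+"); }
}

// The right parenthesis needs its own implementation.
class RightParenSketch extends OperatorSketch {
    RightParenSketch() { super(")"); }

    @Override
    void toPostFix(Deque<String> stack, List<String> output) {
        // Move operators to the output until the matching "(" is found.
        while (!stack.isEmpty() && !stack.peek().equals("(")) {
            output.add(stack.pop());
        }
        if (stack.isEmpty()) {
            throw new IllegalStateException("Unbalanced parentheses");
        }
        stack.pop(); // discard "(" so that parentheses never reach the output
    }
}
```

Calling toPostFix on a left parenthesis, a plus and then a right parenthesis leaves the stack empty and the output holding only the plus operator.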

[Figure 4 diagram: the Domain class, with a static Domain instance field and a static getInstance() method that returns instance.]

[Figure 5 diagram: the toPostFix(…) method as declared in ICalculable, IOperator, Operator, NullaryOperator and RightParenthesis.]

5.5 Low Level Design

This chapter describes the low-level details of the implementation, discussing the

classes used in each package with a brief explanation of any design choices

made.

5.5.1 Model Package

Domain: A singleton class that houses all data used by the language, such as the operators,

functions, and variables. The Singleton design pattern was used because there should only

ever be one instance of this class in the language at any point, and it can be accessed from

any class within the code and so the static getInstance method allows this access.

5.5.2 GUI Package

IConstants: An interface that holds static final fields that are used within the context of the

user interface; such as the program name and version.

MainGUI: This class is the main user interface class, being the only runnable class within

the whole project. The class has one field, a static Domain object, which is set using the

getInstance method to create the single instance of the Domain class.

XLogger: An abstract class that provides static methods to print feedback to the console for

the user to see. Creating a logger, rather than just calling System.out.println

wherever needed, meant that all outputs conform to the same standard, and debug

mode can restrict some messages from being displayed. The class provides methods for

printing log, warning and severe messages.

5.5.3 Eval Package

Placeholders: Represents the list of placeholder classes that extend the Literal class in

order to keep an order of objects in place within an Expression. Placeholder classes are Flag,

ConditionalPlaceholder and FunctionPlaceholder.

Expression: The class represents an expression in the language. The expression class

extends Literal, and has methods to check if the input is valid, convert the expression to

postfix and evaluate the result of the expression. An expression can only be created from either

a list of Token objects or a list of ICalculable objects, together with a Domain object.

ExpressionException: This class extends the Exception class of Java, and can be thrown

by any class involved in the evaluation method of an Expression.

IncomparableTypeException: This class extends the Exception class of Java, and is

thrown during the parsing method if there is a problem with the user input.

ICalculable: The base interface of all classes used within the language, this includes

operators, functions and literals. The methods contained in the interface define how the

object will handle the evaluate, toPostFix and getType methods.

ICalculableType: An interface that holds static final integer fields to represent the different

types of ICalculable objects. The getType method in the ICalculable interface returns one of

these integers to define which type the class belongs to.

Literal: The class represents any data object in the language, which could be an integer,

boolean, string, binary or decimal type; and allows access to this data through get methods.

Variable: This class extends the Literal class to add a name field to the object. The variable

class represents a variable within an expression and so contains all the methods and fields

of the Literal class, and has methods to get the name.

XList: This class is a wrapper class for a LinkedList datatype, providing methods to add,

remove, and get an item from the list. The class also implements the Iterable interface so

that the list can be iterated over.

5.5.4 Lexer Package

ILexer: An interface that has the base methods of any Lexer class, which are to return a list

of Token objects from an input string.

IToken: An interface containing the static integer representations of the Token class to

represent a type of input. The regular expression for each token is also held in this interface.

Lexer: This class implements the ILexer interface, and has the purpose of providing a method

to turn an input string from the user into a list of Token objects that represents an expression.

Token: This class implements the IToken interface, and has the purpose of representing a

subsection of the input string and holds an integer identifier of what type of data that

sequence is.

TokenInfo: This class represents an association between the regular expression and the

integer identifier of a Token.

UnknownSequenceException: This class extends the Exception class of Java, and is

thrown during the lexical analysis method if there is a problem with the user input.

5.5.5 Parser Package

IParser: An interface that any Parser class must implement. Only contains one method that

will take a Domain and List of Tokens, and return a list of ICalculable objects.

Parser: This class implements the IParser interface to provide the parser analysis on an

input string.

ParserException: This class extends the Exception class of Java, and is thrown during the

parsing method if there is a problem with the user input.

5.5.6 Operator Package

Associativity: This class is an enumeration of the available associativity values of the

operators. The values are None, Left-to-Right, and Right-to-Left. See Table 2 (page 12) to

see the associativity of all operators.

IConstants: This interface holds static final fields that are used within the context of

operators, such as the token sequence for each operator.

IFunction: This interface must be implemented by a function class.

IOperator: This interface must be implemented by an operator class.

IPrecedence: This interface holds the values for operator precedence, which are used to

order operators when they need to be evaluated.

IUserFunction: This interface extends the IFunction interface, and must be implemented by

a UserFunction class.

5.5.6.1 Base Package

BinaryOperator: This abstract class is the base operator class for operators that require

two operands.

NullaryOperator: This abstract class is the base operator class for operators that require

zero operands.

Operator: This abstract class is the base operator class. The other base classes extend this

class so that a method shared across multiple classes can be implemented once, in a simpler

manner.

TernaryOperator: This abstract class is the base operator class for operators that require

three operands.

UnaryOperator: This abstract class is the base operator class for operators that require one

operand.

5.5.6.2 Bitwise Package

This package contains the operator classes that are used to carry out bitwise operations.

These classes are Add, LeftShift, Not, Or, RightShift, and XOr.

5.5.6.3 Common Package

This package contains the operator classes that are common to the language and do not

pertain to a special feature of the language. These classes are Add, ArgSeparator,

Assignment, Concat, Decrement, Divide, Increment, LeftParenthesis, LogicalAnd, LogicalNot,

LogicalOr, Multiply, RightParenthesis, and Subtract.

5.5.6.4 Comparator Package

This package contains the operator classes that are used to compare two numerical values

together. These classes are GreaterThan, GreaterThanEqual, LessThan, and LessThanEqual.

5.5.6.5 Conditional Package

This package contains the operator classes that are used for conditional statements. These

classes are Conditional and ConditionalElse.

5.5.6.6 Equality Package

This package contains the operator classes that are used to compare two values of any type.

These classes are Equal and NotEqual.

5.5.6.7 Function Package

This package contains the operator classes that are used to store internal functions. These

classes are Function, List, Max, Random, Sum, and UserFunction.

5.5.6.8 List Package

This package contains the operator classes that are used for the list functionality; these

classes are ArrayAccessEnd, ArrayAccessStart, Cons, Empty, Head, ListEnd, ListStart, Size,

and Tail.

5.5.6.9 Loop Package

This package contains the operator classes that are used in for loops; these classes are Do,

ForLoop, and In.

5.5.6.10 Math Package

This package contains the operator classes that are used for math operations; these classes

are ACosine, ASine, ATangent, Cosine, Exponential, Log10, Mod, NaturalLog, Power, Sine,

SquareRoot, and Tangent.

6 Implementation

The language was implemented using Java in the JetBrains IntelliJ IDEA IDE [11] with

some additional libraries. Git was used for version control through GitHub [12];

throughout each iteration of the implementation process the same external libraries

were used: Google Collections [13], JUnit [14], and Apache Commons-Lang [15].

The specification outlined two implementations to create; the first being the lambda

calculus model and the second being the single-writer, single-assignment, multiple-

reader variables model. However, the scope of the project changed after the

specification was produced which led to three different implementations being

produced by the end of the project.

Because the foundation of the Lambda Calculus Model Implementation was used to

create the Final Solution, the foundation is discussed in detail in the Final

Solution chapter, and only the differences are discussed in the Lambda Calculus

Model Implementation chapter.

6.1 Finite State Machine Model Implementation

Before implementing the first model, an interpreter for a simple calculator was

created to learn and implement the basics of a programming language. The language

was implemented using the Finite State Machine model, in which the interpreter of

the language expected an input which would traverse a set of states and produce a

valid output from them.

This model was implemented first because it involved the basic logic of a simple

programming language without being too complex, and there was some level of

previous understanding, as finite state automata had been taught within a module at

university. This meant the language could be understood by only learning the

principles of programming language theory and applied to the previous knowledge of

finite state automata.

The FSA model gave a basic understanding of how a simple programming language can

work, with inputs being lexically analysed into valid tokens, parsed into an expression

of objects that represent the equation being input, and then the expression being

evaluated to give a result.

The lexical analysis converts the input into a list of Tokens – an identifier that

represented the type of input, such as Number or Variable. This was performed by

having a predefined list of regular expressions to compare against the input string

and separating the input when a match is found.

The parser builds an expression tree by analysing the input in terms of the

following elements, creating an expression node for each one it recognises:

Sign, Term, Factor, Argument, Value.

Once a list of expression nodes is determined, the parser builds a tree to represent

the given expression by parsing through the given input and creating a node for each

element [16].

For example, the input 3 + 4 × 2 ÷ 4 produces the following parse tree:

Figure 6 - Example evaluation of the finite state automaton model

During the implementation of this model, it became clear that the finite state

automata model was not the most efficient approach. The parser was very specific in the way

that it handled each part of the input as a special case, i.e. essentially it was a list of

if-statements for each type of operator the input could be, and so extending this

model any further than the set of basic operators implemented would be too

inefficient.

The expression nodes have a base interface containing a list of node type identifiers,

and methods that each node type should implement; these were to get the type of

the node, get the value the node represents, and to accept a visitor.

Figure 7 - UML diagram of the Expression Node class hierarchy

The expression nodes had different types, which defined how each node was

evaluated. The most notable was the SequenceExpressionNode, which held a

list of nodes internally and would sum up the evaluated result of each node.
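The idea of evaluating a tree of expression nodes can be sketched as below for the example input 3 + 4 × 2 ÷ 4. The node types here are hypothetical and much simpler than the hierarchy in Figure 7: there is no visitor and no sequence node, only a constant node and a binary operator node.

```java
// Hypothetical node interface: every node can report the value it represents.
interface NodeSketch {
    double getValue();
}

// A leaf node holding a literal number.
final class ConstantNodeSketch implements NodeSketch {
    private final double value;

    ConstantNodeSketch(double value) {
        this.value = value;
    }

    @Override
    public double getValue() {
        return value;
    }
}

// An inner node that combines the values of its two children.
final class BinaryNodeSketch implements NodeSketch {
    private final char operator;
    private final NodeSketch left;
    private final NodeSketch right;

    BinaryNodeSketch(char operator, NodeSketch left, NodeSketch right) {
        this.operator = operator;
        this.left = left;
        this.right = right;
    }

    @Override
    public double getValue() {
        double l = left.getValue();
        double r = right.getValue();
        switch (operator) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default:
                throw new IllegalStateException("Unknown operator: " + operator);
        }
    }
}
```

Building an addition node whose children are the constant 3 and a division node (itself holding the multiplication of 4 and 2, divided by 4) and calling getValue on the root evaluates the children first and yields 5.0.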

This is when further discussions with the project supervisor, Matthew Huntbach, took

place to determine which model would be achievable within the timeframe of the

project and the most interesting to learn. The conclusion of which was to implement

the Lambda Calculus model.

[Figure 6 diagram: a parse tree with the leaf values 3, 4, 2 and 4, combined by a Multiplication Node, a Division Node and an Addition Node.]

6.2 Lambda Calculus Model Implementation

Further research led to the discovery of the shunting-yard algorithm, which is

used by machines such as calculators to convert an input from the user

into Reverse Polish Notation (RPN), also known as postfix notation [17].

The lexical analysis was reused from the finite state machine model, as there were

no real changes necessary. The regular expressions were updated to include the new

syntax of the language. This shunting-yard algorithm was vastly more efficient than

the parser approach used in the previous implementation, and allowed for new

operators to be added easily.

Initially, the design choice was taken to mimic the syntax of the LISP language [18].

Lisp was one of the early functional programming languages, and so had a fairly

simple syntactical structure. This syntax separated the calculation from the variable

declaration unlike the conventional lambda calculus notation. For instance, the

expression λ𝑥. λ𝑦. (𝑥 × 𝑦) would be expressed in Lisp by 𝑙𝑎𝑚𝑏𝑑𝑎 (𝑥 𝑦)(𝑥 × 𝑦). If initial

values were given to the expression then they could be expressed after the

calculation in another set of brackets, in the same form of being separated by a

space, such as 𝑙𝑎𝑚𝑏𝑑𝑎 (𝑥 𝑦)(𝑥 × 𝑦)(3 5) to give 𝑥 the value 3 and 𝑦 the value 5.

This simplistic syntax allows for the parser to create the variables in the domain, then

parse the calculation through the shunting-yard algorithm, and then set the values

(if there are any).

For example, if the user entered the input 𝑙𝑎𝑚𝑏𝑑𝑎 (𝑥 𝑦)(𝑥 × 𝑦)(3 5), the lexer would

first split the tokens using the regular expressions for each type of valid input. In

short form, the tokens found would be the lambda token, each parenthesis, each

variable, the arithmetic operator token, and the number token.

The parser then reads in the tokens. In this case, the parser would disregard the

lambda token, create a variable in the domain for each token in the first set of

brackets, create an Expression object from the objects in the second set of brackets

(which would trigger the postfix conversion of the expression), and then set the

values of the variables from the values in the third set of brackets.

Therefore, for this expression, the domain would contain 𝑥 = 3 and 𝑦 = 5. The

expression is then evaluated, and so 𝑥 𝑦 × would be evaluated to 3 5 × and return the

result of 15 as a decimal.
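The evaluation step for the postfix form of the expression entered above (𝑥 × 𝑦 with 𝑥 = 3 and 𝑦 = 5) can be sketched as follows. The class name, the use of plain strings for tokens and the Map standing in for the domain are all assumptions; the sketch only supports the four basic arithmetic operators.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

final class PostfixEvalSketch {

    // Evaluates a postfix token sequence, looking variable names up in the
    // supplied domain and treating any other operand as a decimal literal.
    static double evaluate(String[] postfix, Map<String, Double> domain) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : postfix) {
            switch (token) {
                case "+":
                case "-":
                case "*":
                case "/": {
                    // The right operand was pushed last, so it is popped first.
                    double right = stack.pop();
                    double left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default: {
                    Double bound = domain.get(token);
                    stack.push(bound != null ? bound : Double.parseDouble(token));
                }
            }
        }
        return stack.pop();
    }

    private static double apply(String operator, double left, double right) {
        switch (operator) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;
        }
    }
}
```

With a domain containing x = 3.0 and y = 5.0, evaluating the tokens x, y, * substitutes the two values, multiplies them and returns 15.0.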

While there was not much difficulty getting to this stage of the implementation, once

the modern notation of lambda expressions was implemented many more issues

were observed. The simple change in syntax was not too difficult, for instance the

parser changed to only use the shunting-yard algorithm and have the symbol \ to

define the use of a lambda term.

Therefore, now the expression 𝑙𝑎𝑚𝑏𝑑𝑎 (𝑥 𝑦)(𝑥 × 𝑦)(3 5) would be input as \𝑥.\𝑦. (𝑥 +

𝑦) which defines an application of 𝑥 onto 𝑦 applied to the function 𝑥 + 𝑦. The lexer

would act in the usual manner to split the tokens into a meaningful sequence, and

the parser would implement these tokens into objects that can be understood by the

system. Once the expression is parsed, it is converted to postfix notation and so

would become \𝑥 \𝑦 𝑥 𝑦 + . . . which would put all the tokens into an order where

they can be stacked up and evaluated in order.

Currying was introduced to the language in order to reduce an equation by giving one

of the unassigned variables a value. This would then allow the language to substitute

the use of the variable for the value assigned to it, which would then reduce the

expression down by one less unknown variable.
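Currying can be illustrated in Java itself, independently of the interpreter. The sketch below is not the project's implementation: it simply writes \𝑥.\𝑦.(𝑥 + 𝑦) as a function that returns a function, using the standard java.util.function.Function interface, so that supplying one value leaves an expression with one fewer unknown.

```java
import java.util.function.Function;

final class CurrySketch {

    // \x.\y.(x + y): taking x returns a new function that is still waiting for y.
    static final Function<Double, Function<Double, Double>> ADD = x -> y -> x + y;

    public static void main(String[] args) {
        // Assigning x = 3 reduces the expression to \y.(3 + y).
        Function<Double, Double> addThree = ADD.apply(3.0);

        // Assigning y = 5 leaves no unknowns, so the sum can be computed.
        System.out.println(addThree.apply(5.0)); // prints 8.0
    }
}
```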

During this implementation, using reduction to simplify the expression

input by the user proved particularly hard to implement. To reduce a lambda expression

down to the normal form, the most reduced form, there are several types of reduction

techniques. Reduction order defines which redex to reduce when there is a choice:

the leftmost redex is the one which is textually to the left of the other redexes, an

outermost redex is one which is not contained within any other redex, and an

innermost redex is one which contains no other redex. These reductions can be

ordered in two different ways: applicative-order reduction or normal-order

reduction. Applicative-order aims to reduce the leftmost innermost redex first,

whereas normal-order aims to reduce the leftmost outermost redex first [19].

Using these techniques to reduce an expression input by the user proved too difficult

with the underlying architecture of the shunting-yard algorithm. One branch of the

implementation explored the possibility of bypassing some of the foundation in

certain cases in order to reduce the expression, but ultimately, not all expressions

could be reduced in this format and so the branch was dropped.

This problem halted this implementation, as the time left to have a

meaningful implementation in place was running out and the final report would soon

have to take priority over the implementation. From this stage of the lambda calculus

model being implemented, basic calculus was available and simple lambda

expressions could be calculated as long as reduction was not necessary. Another

discussion with the project supervisor led to discussions about what could be

implemented in the remaining time, which saw a change in the final implementation.

It was then decided that the foundation of this implementation would be extended in

a simpler way that keeps the shunting-yard algorithm intact, and to build

features on top to move towards the single-assignment, single-writer,

multiple-reader variables model that was the original target of the project. Although

the expectation was set that not all the features of the model would be implemented,

it would be more of a compromise between the current state of the project and the

target model, due to the remaining time left of the project.

6.3 Final Solution

Once the implementation for the shunting yard algorithm was completed, the scope

was changed to no longer include lambda calculus. The time it was taking to fully

research the lambda calculus notation and implement an algorithm that efficiently

could analyse, reduce, and evaluate the expressions was proving too long for the

timeline of the project.

The final project started off with most of the base code from the lambda calculus

model implementation, with all modules and occurrences of lambda calculus being

removed. The implementation was then extended to include a linked list variable

type and a simple form of manual memory management, and a large number of

bugs were also fixed.

The language shifted towards the original motivation of the project, which was to

implement the single-assignment, single-writer, multiple-reader variables model.

This has been achieved by keeping variables in the domain immutable, which means

that once the value is set it cannot be changed. Therefore, the language can run

concurrently as multiple operations can write variables once and read from the

domain without overwriting data. The language can be further extended to implement

all the features of this model, but only a basic set are included in the final solution.
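The single-assignment behaviour can be sketched with a store that accepts the first write to a name and rejects any later one. This is an illustration under assumptions, not the project's Domain class: the class and method names are hypothetical, and ConcurrentHashMap is used here so that the first-writer check is atomic when several threads write at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SingleAssignmentStoreSketch {

    private final ConcurrentMap<String, Object> values = new ConcurrentHashMap<>();

    // putIfAbsent is atomic, so only the first writer for a name succeeds.
    void assign(String name, Object value) {
        Object previous = values.putIfAbsent(name, value);
        if (previous != null) {
            throw new IllegalStateException("Variable already assigned: " + name);
        }
    }

    // Any number of readers may read a value once it has been assigned.
    Object read(String name) {
        Object value = values.get(name);
        if (value == null) {
            throw new IllegalStateException("Variable not assigned: " + name);
        }
        return value;
    }
}
```

Since a value never changes after it is written, readers need no further coordination: a value that has been read once cannot later be overwritten.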

The solution is broken down in the following chapters, analysing the areas that were

researched and implemented in order to form the programming language, and some

describing the features of the language that was created. There was some overlap

between the final solution and previous implementations, which will also be outlined

where relevant.

Note: The supporting material contains a folder called Tutorial which contains a pdf

tutorial of how to use the language.

6.3.1 Lexer (Lexical Analysis)

The purpose of the lexical analysis is to split the input string into some logical

sequence of tokens. The lexer method takes the input string and assigns a token to

each element. The class TokenInfo stores a regular expression as a string together with an

integer that represents the token, and the class Token holds the matched input

sequence and its integer token.

While the integer token does not serve a purpose later in the language, having a token

definition could be used if more features were implemented in the base structure, and

so has remained in the final implementation to ensure the code is as extensible as

possible. In the parser, the tokens produced by the lexer are matched with ICalculable

objects by comparing their sequence (for example, “+”) rather than their token

number.

The regular expressions and states used were as follows:

Name                      Token  Regular Expression                     Example Matching
Whitespace                0      \s
Boolean Comparator        1      \!\= | \=\=                            x != y
Assignment                2      \=                                     x = 21
Increment Decrement       3      \+\+ | \-\-                            x++
Binary 8-bit              4      [0-1]{8}                               z = 00010011
Decimal                   5      [+-]?(\d*.\d+)                         x = -3.2
Number                    6      [+-]?[0-9]+                            y = -102
Arithmetic                7      \+ | \- | \#                           2 + 3
Geometric                 8      \* | / | \^ | \%                       2^4
Boolean                   9      true | false                           x = true
Function Declaration      10     func                                   func square(x) = x*x
Logical Operator          11     \! | \&\& | \|\|                       !true
Bitwise Operator          12     \~ | \& | \| | \$                      ~01101101
Bit Shift                 13     \<\< | \>\>                            01101101 >> 2
Math Comparator           14     \<\= | \>\=                            x <= y
Math Equality Comparator  15     \< | \>                                x > y
Parenthesis               16     \( | \)                                (x)
For Loop Element          17     (for | in | do)[\W]                    for x in y do x
Text                      18     \"([^\" | ]*)\" | \'([^\' | ]*)\'      name = “hello”
Variable Or Function      19     \w(\w | \d)*                           name = “world”
Conditional               20     \? | \:                                (x < y) ? x : y
List                      24     \{ | \}                                { 1, 2, 3 }
Argument Separator        26     \,                                     { 1, 2, 3 }
Array Access              27     \[ | \]                                array[1]

Table 3 - Tokens used in the Lexical Analysis of inputs

Note: The Boolean regular expression is compiled with the flag IGNORE_CASE so that upper

case letters are allowed in the input.

The lexer function takes the input, and iterates over the regular expressions until a

match is found at the start of the input. A token is created with the sequence and

state of the matched text, which is then added to a list. The sequence is then removed

from the input, and this process continues until the input is empty. The list of tokens

is returned, which represents the whole input; or if there are unmatched tokens an

error is thrown.
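The loop described above can be sketched as follows. The rule set is a small, simplified subset of Table 3 chosen for illustration (for instance, numbers are unsigned here), tokens are represented as plain "state:sequence" strings rather than Token objects, and the exception type is a stand-in for UnknownSequenceException.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexerSketch {

    // Rules are tried in insertion order; the key is the integer token state.
    private static final Map<Integer, Pattern> RULES = new LinkedHashMap<>();

    static {
        RULES.put(0, Pattern.compile("\\s+"));            // whitespace (skipped)
        RULES.put(2, Pattern.compile("="));               // assignment
        RULES.put(6, Pattern.compile("[0-9]+"));          // number
        RULES.put(7, Pattern.compile("\\+|-"));           // arithmetic
        RULES.put(8, Pattern.compile("\\*|/"));           // geometric
        RULES.put(19, Pattern.compile("[A-Za-z_]\\w*"));  // variable or function
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        String rest = input;
        while (!rest.isEmpty()) {
            boolean matched = false;
            for (Map.Entry<Integer, Pattern> rule : RULES.entrySet()) {
                Matcher matcher = rule.getValue().matcher(rest);
                // lookingAt only succeeds if the match starts at the beginning.
                if (matcher.lookingAt()) {
                    if (rule.getKey() != 0) {
                        tokens.add(rule.getKey() + ":" + matcher.group());
                    }
                    // Remove the matched sequence and continue with the remainder.
                    rest = rest.substring(matcher.end());
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("Unknown sequence at: " + rest);
            }
        }
        return tokens;
    }
}
```

For the input x = 3 * 5 + 4 this returns the tokens 19:x, 2:=, 6:3, 8:*, 6:5, 7:+ and 6:4, while an input containing a character that no rule matches raises the exception.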

6.3.2 Operator Precedence

The order of operations was based on the operator precedence of the programming

language C [20]. The values used were:

Precedence  Operator       Description                                            Associativity
1           ( ) [ ] { }    Function call, array access, list (compound literal)   Left-to-right
2           ++ -- ! ~      Increment and decrement, logical NOT and bitwise NOT   Right-to-left
3           * / %          Multiplication, division, and modulo                   Left-to-right
4           + -            Addition and subtraction                               Left-to-right
5           << >>          Bitwise left shift and right shift                     Left-to-right
6           < <= > >=      Less than, less than or equal, more than, more than or equal  Left-to-right
7           == !=          Equal to, not equal to                                 Left-to-right
8           &              Bitwise AND                                            Left-to-right
9           $              Bitwise XOR (exclusive or)                             Left-to-right
10          |              Bitwise OR (inclusive or)                              Left-to-right
11          &&             Logical AND                                            Left-to-right
12          ||             Logical OR                                             Left-to-right
13          ? :            Ternary conditional                                    Right-to-left
14          =              Assignment                                             Right-to-left
15          ,              Comma (argument separator)
99          None           None

Table 4 - Operator precedence values

Within the implementation, each operator has an assigned precedence value and associativity. The precedence value is used to order the operators in the postfix conversion using the shunting-yard algorithm. The associativity value, which is a choice from none, left-to-right, or right-to-left, is used to determine the order in which operators of equal precedence are applied when parentheses are not used.

Within the shunting-yard algorithm, the associativity and precedence are used to determine which operators are to be moved from the operator stack to the output, by comparing the precedence of the current operator with that of the operator on the top of the stack. For example, if the expression 𝑥 = 3 × 5 + 4 is to be converted to postfix, the operator × would be on the stack when the operator + is read. Because both operators have the same associativity, and the × operator has a precedence value of 3 while the + operator has a precedence value of 4 (a lower value binds more tightly), the × operator is moved to the output first, and the + operator is then pushed to the operator stack.

The value None was used for operators that do not require any operands, such as

parentheses or the list access operator “[“. These operators are not added to the

postfix expression and so no ordering is required for them.


6.3.3 Parser (Syntactical Analysis)

The parse method uses the following algorithm to convert a list of tokens into an array of ICalculable objects.

Figure 8 - Pseudo code of the parse method

The algorithm converts each token to its ICalculable class representation, which means each token becomes an operator, function, or literal object. The ICalculable class is the base class for all objects within the project and provides methods, amongst others, to identify the type of object, convert infix to postfix, and to evaluate the value. During the parser analysis, a ParserException may be thrown if the user is creating a function with a name that already exists, or if a syntax error occurs.

Although the syntax is not analysed within the parser function, an ExpressionException is thrown if an error is found once the expression is evaluated. For instance, if the expression contains three literals and a single operator that takes two operands, then evaluating the operator leaves two values on the stack, and an error is thrown.

The algorithm was later adjusted when the custom function feature was added. The modified algorithm had an if statement before the for loop to check if the first token is the “function declaration” token. If this is true, then the result of the parseFunctionDeclaration method is returned. This method is similar to the parse method, but there is a UserFunction object, and as the for loop iterates over the tokens there are checks to add to this object. When the name is set, the function object is added to the domain and to the infix expression. Any subsequent variables parsed before the assignment operator (=) are added to the domain’s functional variable map and to the function object as an argument. After the assignment operator is reached, all subsequent operators or variables are treated as normal, and added to the infix expression.

    For each token:
        If token is a variable:
            Get the variable and add to the infix expression
        Else if the token is an operator:
            Get the operator from domain and add to infix expression
        Else if the token is a function:
            Get the function from domain and add to infix expression
        Else:
            Parse the token as a literal, and add to infix expression
    Return infix expression


6.3.4 Shunting-Yard Algorithm

Each object contains a toPostFix method, which is how all objects are converted from

infix to postfix notation [17].

Figure 9 - Pseudo code for the shunting yard algorithm

While the algorithm implies that it should be implemented in one method, the function was divided across each ICalculable class. Each class had the following method: void toPostFix(List<ICalculable> infix, int infixIndex, List<ICalculable> postfix, Stack<IOperator> operatorStack). For example, a Literal class would just push itself onto the postfix list and an operator would push itself onto the operator stack. More complex logic is needed for the opening and closing brackets.

The shunting-yard algorithm has a time complexity of O(n) because each token, whatever token it is, is read just once and is pushed onto and popped from the operator stack at most once. Therefore, there are at most a constant number of operations executed per token, which gives a running time that is linear in the size of the input.

    For each token in the input:
        If the token is a literal:
            Push the token onto the output queue
        If the token is a function:
            Push the token onto the operator stack
        If the token is an argument separator (“,”):
            Until the token at the top of the operator stack is an opening bracket:
                Pop off the token on the operator stack and push onto the output queue
            (If no opening bracket is found – throw an error: mismatching brackets)
        If the token is an operator:
            While there is an operator on the stack, and either the token is left-associative
            and its precedence is less than or equal to that of the operator on the stack, or
            the token is right-associative and has lower precedence:
                Pop off the operator stack and push onto the output queue
            Push the token onto the operator stack
        If the token is an opening bracket (“(“):
            Push the token onto the operator stack
        If the token is a closing bracket (“)”):
            Until the token at the top of the stack is an opening bracket:
                Pop off the stack and push onto the output queue
            (If no opening bracket is found – throw an error: mismatching brackets)
            Pop the opening bracket from the operator stack
            If the token at the top of the stack is a function:
                Pop off the stack and push onto the output queue
    If the operator stack is not empty:
        While there are still tokens on the operator stack:
            If the token on the top of the stack is a bracket:
                Throw an error – mismatching brackets
            Pop the operator from the stack and push onto the output queue
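As an illustration of the conversion, the following Java sketch implements the algorithm in a single method over plain string tokens. It is an assumption-laden simplification and not the project's implementation, which is divided across the ICalculable classes: the names ShuntingYardSketch, toPostfix and popsBefore are hypothetical, functions and argument separators are left out, and only a few operators from Table 4 are included. The precedence values follow Table 4, where a lower value binds more tightly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ShuntingYardSketch {
    // Precedence values taken from Table 4: a LOWER value binds more tightly.
    private static final Map<String, Integer> PRECEDENCE =
            Map.of("*", 3, "/", 3, "%", 3, "+", 4, "-", 4, "=", 14);

    // Operators that group right-to-left (assignment in Table 4).
    private static final Set<String> RIGHT_ASSOCIATIVE = Set.of("=");

    static List<String> toPostfix(List<String> infix) {
        List<String> output = new ArrayList<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String token : infix) {
            if (PRECEDENCE.containsKey(token)) {
                // Move stacked operators that must be applied before this one.
                while (!operators.isEmpty()
                        && PRECEDENCE.containsKey(operators.peek())
                        && popsBefore(operators.peek(), token)) {
                    output.add(operators.pop());
                }
                operators.push(token);
            } else if (token.equals("(")) {
                operators.push(token);
            } else if (token.equals(")")) {
                while (!operators.isEmpty() && !operators.peek().equals("(")) {
                    output.add(operators.pop());
                }
                if (operators.isEmpty()) {
                    throw new IllegalArgumentException("Mismatching brackets");
                }
                operators.pop(); // discard the opening bracket
            } else {
                output.add(token); // literal or variable
            }
        }
        while (!operators.isEmpty()) {
            String top = operators.pop();
            if (top.equals("(")) {
                throw new IllegalArgumentException("Mismatching brackets");
            }
            output.add(top);
        }
        return output;
    }

    /** True if the stacked operator has to be output before the incoming one is pushed. */
    private static boolean popsBefore(String stacked, String incoming) {
        int stackedValue = PRECEDENCE.get(stacked);
        int incomingValue = PRECEDENCE.get(incoming);
        return RIGHT_ASSOCIATIVE.contains(incoming)
                ? stackedValue < incomingValue
                : stackedValue <= incomingValue;
    }
}
```

With this sketch, the tokens of 𝑥 = 3 × 5 + 4 are reordered to x 3 5 * 4 + =, matching the worked example in the Operator Precedence section: × is moved to the output before + is pushed.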


6.3.5 Expression Execution

In order to get the value from an expression, the postfix algorithm [21] is used:

Figure 10 - Pseudo code for the postfix algorithm

The expression can be executed efficiently using the postfix evaluation method, with

a time-complexity of O(n) as each token is visited once, stored on the stack, and then

popped off the stack and so the running time is linear to the size of the input [22].

Figure 15 (page 46) in the appendix shows the example output of the programming

language when given various expressions to execute.
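The postfix evaluation used in this section can be sketched in Java with a single stack. The sketch below is illustrative only: PostfixEvaluatorSketch and its methods are hypothetical names, tokens are plain strings, and only the four basic binary operators on decimal values are supported, whereas the project evaluates ICalculable objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class PostfixEvaluatorSketch {
    static double evaluate(List<String> postfix) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : postfix) {
            if (isOperator(token)) {
                if (stack.size() < 2) {
                    throw new IllegalStateException("Input does not have sufficient values");
                }
                double right = stack.pop(); // the top of the stack is the right-hand operand
                double left = stack.pop();
                stack.push(apply(token, left, right));
            } else {
                stack.push(Double.parseDouble(token)); // a value is pushed straight onto the stack
            }
        }
        // Exactly one value should remain; anything else means the input was malformed.
        if (stack.size() != 1) {
            throw new IllegalStateException("Input has too many or too few values");
        }
        return stack.pop();
    }

    private static boolean isOperator(String token) {
        return token.length() == 1 && "+-*/".contains(token);
    }

    private static double apply(String operator, double left, double right) {
        switch (operator) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            default: throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }
}
```

Each token is pushed or consumed once, which is where the linear running time comes from. An input with three literals and one binary operator, such as 1 2 3 +, leaves two values on the stack and is rejected.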

    While token list is not empty:
        Read the next token from input
        If the token is a value:
            Push the token onto the stack
        Else, the token is an operator with n operands:
            If there are fewer than n values on the stack:
                Throw an error – input does not have sufficient values
            Else, remove the top n values from the stack
            Evaluate the operator, with the values as arguments
            Push the returned results, if any, back onto the stack
    If there is more than one value left on the stack:
        Throw an error – input has too many values

6.3.6 Custom Function Definition

A feature implemented into the final solution allows users to define their own functions to use within the language. An example of this functionality looks like this:

Figure 11 - An example of a custom function declaration and execution

A user can use the predefined variables, operators and functions in their expression, and can add their own variables, which have to be declared within the brackets after the function name. The logic of these custom function classes follows that of the predefined functions, only adding that

6.3.7 Lazy Evaluation

Conditional statements in the implementation utilise lazy evaluation. A conditional statement takes three arguments: a boolean determinant, an expression to execute if the determinant is true, and an expression to execute if the determinant is false. Lazy evaluation means that expressions are only executed when it is necessary to, which in this case is particularly useful as the code is separated into two paths: where an eager interpreter will evaluate both expressions before determining which path is being taken, a lazy interpreter will only execute the expression that is returned.

An example of this evaluation in the language is the expression 𝑓𝑢𝑛𝑐 𝑒𝑣𝑒𝑛𝑜𝑑𝑑(𝑦) = (𝑦 %2 == 0) ? 𝑦 ÷ 2 ∶ 𝑦 × 2, which defines a function that takes a single argument and returns a new value depending on whether the argument is even or odd. If the language were to use eager evaluation, both 𝑦 ÷ 2 and 𝑦 × 2 would be executed, and then once (𝑦 %2 == 0) was determined the correct result would be returned.

Instead, the language will execute the determinant (𝑦 %2 == 0) immediately and store the result, then create an expression for 𝑦 ÷ 2 and 𝑦 × 2, storing them separately. The result is these three objects (boolean, expression, expression); if the boolean is true then the first expression is executed, otherwise the second expression is executed. This keeps the processing down to a minimum, ensuring that only the required evaluation of expressions is done as and when the result is needed.
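The idea can be sketched in Java by passing the two branches as deferred computations. This is only an illustration of the principle under assumed names (LazyConditionalSketch, conditional, evenOdd); the project stores unevaluated expression objects rather than java.util.function.Supplier instances.

```java
import java.util.function.Supplier;

final class LazyConditionalSketch {
    /**
     * The determinant is already a value, but each branch is wrapped in a Supplier,
     * so only the branch that is selected is ever executed.
     */
    static <T> T conditional(boolean determinant, Supplier<T> ifTrue, Supplier<T> ifFalse) {
        return determinant ? ifTrue.get() : ifFalse.get();
    }

    /** Mirrors the evenodd example: halve even arguments, double odd ones. */
    static double evenOdd(double y) {
        return LazyConditionalSketch.<Double>conditional(y % 2 == 0, () -> y / 2, () -> y * 2);
    }
}
```

In this sketch evenOdd(8) only ever computes 8 ÷ 2, and evenOdd(3) only ever computes 3 × 2; the branch that is not chosen is never run.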

Lazy evaluation can introduce memory leaks, as objects that are created but not executed will take up memory and need to be handled. This could be a problem in other languages, but as this implementation has been written using the Java language, the built-in garbage collector will handle these unused objects. The garbage collector is a great feature of Java for just this behaviour: while these expressions could be manually removed from memory if not required, the tracking of each object is not necessary and would, if anything, cause the program to run slower.

6.3.8 Linked Lists

The language contains a linked list data type, named XList, that is a class with one field, a LinkedList object, and has methods to manipulate the list object. The object started out as a wrapper class just to make the best use of the toString() method for displaying a linked list to the user in a more pleasing style (where the LinkedList default is “[1, 2, 3]”, the XList output is “1 2 3”). During development the XList also proved to be more useful for testing purposes than a regular LinkedList object.

In terms of testing improvements, by overriding the equals(Object o) method that Java provides, the list of elements can be checked against another list given to the method. The procedure for this method is to return false if the Object o that is passed to the function is not an XList object, and then to return false if the size of the list is not the same as the size of the list of the comparison XList object o. At this point, both objects are XList objects with lists of the same size, and so a for loop iterates over the size of the list to compare each value of the lists. If the values at any position are not the same then the function returns false; at the end of the loop, the function returns true as all values of the lists are the same.
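A wrapper of this kind might look like the following Java sketch. The class name XListSketch and its constructor are hypothetical, and the hashCode override is added here only because Java expects it to accompany equals; the report does not describe one.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.Objects;
import java.util.stream.Collectors;

final class XListSketch {
    private final LinkedList<Object> list;

    XListSketch(Object... values) {
        this.list = new LinkedList<>(Arrays.asList(values));
    }

    /** Prints "1 2 3" rather than the LinkedList default "[1, 2, 3]". */
    @Override
    public String toString() {
        return list.stream().map(String::valueOf).collect(Collectors.joining(" "));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof XListSketch)) {
            return false; // not an XList at all
        }
        XListSketch other = (XListSketch) o;
        if (list.size() != other.list.size()) {
            return false; // different lengths can never be equal
        }
        // Index-based loop as described in the text; get(i) is linear on a
        // LinkedList, so paired iterators would be faster for long lists.
        for (int i = 0; i < list.size(); i++) {
            if (!Objects.equals(list.get(i), other.list.get(i))) {
                return false;
            }
        }
        return true;
    }

    @Override
    public int hashCode() {
        return list.hashCode(); // kept consistent with equals
    }
}
```

With equals defined this way, a test can compare the XList produced by an expression directly against an expected XList instead of comparing element by element.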

6.3.9 For Loops

The class XList has been implemented so objects can be iterated over to perform an

expression on each value in the list. This functionality allows for lists to be operated

on in the same way as the other literal objects. As all variables are immutable in the


language, for loops handle this by constructively returning a new list instead of

destructively changing a list that is passed to the loop.

Adding a recursive domain proved too difficult within the time frame of the project, but some extent of recursion is available. The reasons why full recursion is not available in the language are documented in the Full Recursion section of the Future Developments chapter (page 41).

For example, a variable 𝑦 can be created by multiplying the variable 𝑥 by 3 in the expression 𝑦 = 𝑥 × 3. The same can be done for a list object in the expression 𝑦 = 𝑓𝑜𝑟 𝑣 𝑖𝑛 𝑥 𝑑𝑜 𝑣 × 3, with a new list being created from the output of the for loop. For instance, if this expression was executed with 𝑥 = { 1, 2, 3 } then 𝑦 would be set to { 3, 6, 9 }.
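The constructive behaviour of the for loop can be illustrated in Java as below. ForLoopSketch and forEach are hypothetical names, and the loop body is represented by a Java lambda, which is an assumption made for this sketch rather than how the language stores the expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class ForLoopSketch {
    /** Applies the body to every value and returns a NEW list; the source is never modified. */
    static List<Double> forEach(List<Double> source, UnaryOperator<Double> body) {
        List<Double> result = new ArrayList<>(source.size());
        for (Double value : source) {
            result.add(body.apply(value));
        }
        return result;
    }
}
```

Here ForLoopSketch.forEach(x, v -> v * 3) plays the role of 𝑓𝑜𝑟 𝑣 𝑖𝑛 𝑥 𝑑𝑜 𝑣 × 3: the list bound to 𝑥 keeps its original values and the result is bound to 𝑦.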

6.3.10 Recursive Expressions

The language can process recursive custom functions, but only to a certain degree with the current architecture. Again, as with for loops, recursion is not fully available. Expressions that contain a call to the domain are not respected on several levels, and so only non-domain objects can be used in the expression.

Currently, the language can process a recursive function as long as the expression

does not contain a call to the domain. For instance, the function declaration

𝑓𝑢𝑛𝑐 𝑝𝑟𝑜𝑐𝑒𝑠𝑠𝐿𝑖𝑠𝑡(𝑙) = 𝑒𝑚𝑝𝑡𝑦(𝑙) ? "𝑓𝑖𝑛𝑖𝑠ℎ𝑒𝑑" ∶ 𝑝𝑟𝑜𝑐𝑒𝑠𝑠𝐿𝑖𝑠𝑡(𝑡𝑎𝑖𝑙(𝑙)) will create a

function that takes a list argument. The conditional expression performs an empty

function call on the list; if the list is empty then the text "𝑓𝑖𝑛𝑖𝑠ℎ𝑒𝑑" is returned,

otherwise there is a recursive call to the 𝑝𝑟𝑜𝑐𝑒𝑠𝑠𝐿𝑖𝑠𝑡 function with the tail of the list.

While this function does not perform any real operation, it proves that recursion is

indeed possible within the language.

The design pattern of the domain was the cause of this problem. In order to use recursion, a trace of each variable is required, as it can have a different value each time the recursive call is made. For example, in the Fibonacci function 𝑓𝑢𝑛𝑐 𝑓𝑖𝑏(𝑛) = 𝑛 < 2 ? 𝑛 ∶ 𝑓𝑖𝑏(𝑛 − 1) + 𝑓𝑖𝑏(𝑛 − 2), the variable 𝑛 will have a different value each time that the recursive call is made. Therefore, once 𝑛 is set, the result will not return to the previous function with that value of 𝑛; it will merely get stuck without getting a satisfactory result, and eventually Java will stop the function from running.
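One common way to keep such a trace is to give every call its own frame of variable bindings on a stack. The Java sketch below shows the idea for the Fibonacci example; it is a hypothetical approach with assumed names (CallFrameSketch, frames) and is not something the current implementation does.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class CallFrameSketch {
    // One map of variable bindings per active call; the top frame belongs to the current call.
    private final Deque<Map<String, Integer>> frames = new ArrayDeque<>();

    int fib(int argument) {
        Map<String, Integer> frame = new HashMap<>();
        frame.put("n", argument);
        frames.push(frame); // entering the call creates a fresh binding for n
        try {
            int n = frames.peek().get("n");
            if (n < 2) {
                return n;
            }
            int first = fib(n - 1);
            // The inner call has popped its own frame, so the top frame is this
            // call's frame again and still holds this call's value of n.
            int second = fib(frames.peek().get("n") - 2);
            return first + second;
        } finally {
            frames.pop(); // leaving the call restores the caller's bindings
        }
    }
}
```

Because each call reads 𝑛 from its own frame, the value set by an inner call cannot overwrite the value that the outer call needs when it resumes.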

6.3.11 Syntactic Sugar

In order for the instructions to be entered more quickly, there are some allowances in the

syntax permitted by the language. For instance, during a for loop, the defined syntax

to use is 𝒇𝒐𝒓 𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒 𝒊𝒏 𝑙𝑖𝑠𝑡 𝒅𝒐 𝑠𝑜𝑚𝑒 𝑒𝑥𝑝𝑟𝑒𝑠𝑠𝑖𝑜𝑛 where the variable is a new name

(i.e. has not already been assigned a value) and the list can be either a variable or a

call to the list function to create one.

The keyword 𝑖𝑛 defines that the variable will represent each value in the list, but it

can also be left out as the language discards the 𝑖𝑛 operator at the evaluation stage.

Therefore, the syntax 𝒇𝒐𝒓 𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒 𝑙𝑖𝑠𝑡 𝒅𝒐 𝑠𝑜𝑚𝑒 𝑒𝑥𝑝𝑟𝑒𝑠𝑠𝑖𝑜𝑛 would be equally valid.


The same principle can be applied to simple function calls, where parentheses are not necessary, although they are required when the call is used in combination with other operators, to define the order of precedence. For instance, the expression 𝑥 = 𝑙𝑖𝑠𝑡(1, 2, 3), which creates variable 𝑥 and assigns a list of values, can also be achieved with the expression 𝑥 = 𝑙𝑖𝑠𝑡 1 2 3.

6.4 Evaluation

Looking back at the aims of the project: “Understand how programming languages

work and how to implement one, develop an extensible programming language

foundation, and build features of the language upon this foundation.” The final

implementation has satisfied these aims, as the language contains a solid foundation

which contains the principles of programming language theory, and has features

built in to allow a large quantity of expression types to be evaluated.

Whilst the project has been a success as measured against the original objectives, there are still a number of areas that could have been better had the project kept more closely to the original specification. The specification outlined two specific implementations, namely the lambda calculus model and the single-writer single-assignment multiple-reader model, and so by that measurement the project could also be deemed a failure.

However, the idea of the language is to implement any abstract model, and so it is a success in its own right. Many features are available to use, and the language is very extensible in the way it has been coded; it was also written using the agile methodology to rapidly adjust to the changes in the scope.

While there are elements of the code base that were “hacked in”, and are not part of

an ideal solution and possibly even hold back the extensibility of the code, these were

a sad necessity of writing the project while having other modules at university to work

on and getting something to work by the end of the project. Personally, I think that I

know the solution to these hacks, and could have made the changes to remove them,

but they came too late to be a stable change and would risk introducing bugs to the

code.

Overall, the language contains all of the processes of a programming language, such

as the lexical analysis and parsing analysis, which was all learnt over the duration of

the project. While the final solution may not be a 100% complete implementation, I

am happy with the amount of knowledge that I now have about the theory of

programming languages and think that my coding practices have improved as a

result of the project.


7 Testing

With the language being rapidly programmed, an efficient and full testing suite was a

necessity to ensure that every new feature added was working as expected in a

number of given scenarios and that the addition did not break any other feature.

In order to maintain these tests, the JUnit testing framework was used to create both unit tests and integration tests. Unit tests were created to test individual units of source code and integration tests were created to test functionality groups. The integration tests are more specifically used to ensure that no new feature breaks any old functionality.

Note: This chapter contains references to the supporting material, the folders AllTest-Coverage-Report and AllTest-Output.

7.1 Low Level Design

This chapter describes the low-level details of the testing framework; discussing the

classes used in each package and some brief explanation about any design choices

made.

The classes in the framework package make up the abstract base of all unit test

classes. The base classes of the framework show how it has been created with

extensibility in mind; all test cases are within classes that extend one of the following

base classes, which provide the methods the tests require.

Figure 12 - UML diagram of the framework package within the test folder

7.1.1 Framework Package

BaseTest: This abstract class defines a test class for framework classes to extend. Can

contain static fields for log messages that can be used in multiple framework class methods.

ExpressionTest: This abstract class defines the basis for any test suite class to extend in

order to validate expressions that are input the same as a user would. There are methods

provided to allow domain manipulation to generate variables required for the test and to reset

the domain before and after each test.


FunctionTest: This abstract class extends the ExpressionTest class and provides additional methods to test the custom function feature. These methods, runFunctionDeclaration and runFunctionTest, validate function declaration expressions and the result of a function execution.

LogTest: This abstract class is to be extended by test suites for testing output written to the

console via the XLogger class. The class provides methods for writing different levels of

messages, and asserting that the correct formatting is used. The output console used by the

testing framework is preserved using the Before and After JUnit annotation methods, as the

LogTest methods require a clean console to read from. Provides methods to run assertions

through runLogTest, runWarningTest, runSevereTest, and runDebugTest.

7.1.2 Architecture Package

This package contains the test cases to validate the underlying architecture of the

programming language. These cases are in the classes LoggerUT and ParserUT.

7.1.3 Feature Package

This package contains the test cases to validate specific features that were added to

the language. This package is essential to prevent regressions from occurring when

new features are added. These cases are in the classes ForLoopUT, ListOperatorUT,

RecursiveFunctionUT, UserFunctionUT, and XListUT.

7.1.4 Function Package

This package contains the test cases to validate the functions used within the

language. Unit tests check that each function executes as expected for valid data and

also throws the correct exception when invalid data is used. These cases are in the

class PredefinedFunctionUT.

7.1.5 Operator Package

This package contains the test cases to validate the operators used within the

language. Unit tests check that each operator executes as expected for valid data

and also throws the correct exception when invalid data is used. These cases are in

the classes AssignmentUT, BitwiseOperatorUT, ComparatorOperatorUT,

ComplexCalculationUT, ConditionalOperatorUT, EqualityOperatorUT,

LogicalOperatorUT, MathOperatorUT, MultipleVariableUT, and SimpleCalculationUT.

7.1.6 Syntax Package

This package contains the test cases to validate the syntax of the language performs

as expected. This package is particularly important as regressions from new features

may mean that the expected outcome of valid data is still correct, but the syntactic

sugar methodology means that some invalid data can still be valid within the

language. These cases are in the classes InputFormatUT, PredefinedVariableUT,

UnknownSequenceExceptionUT, VariableDefinitionUT, and VariableNameUT.

7.1.7 Type Package

This package contains the test cases to validate the strongly typed nature of the

language, ensuring that operators are executed only on the types of operands that

they are expected to run on. These cases are in the classes BitwiseByteParseUT and

IncomparableTypeUT.


7.2 Framework Creation

During development, the tests were created in unison with classes to maintain a high

test coverage, and as outlined earlier the rapid output of features meant that testing

was needed to determine if a new feature broke an existing one.

The testing framework was simple to begin with, just determining that a standard

input would give the expected output; but then over time repetition occurred within

test cases and more types of tests were needed.

By the final implementation the framework had 5 types of test:

Test Type    Purpose
Expression   Given an input and expected output. Validates the output is the same as the expected output.
Variable     Given an input, variable and expected output; checks that the input runs correctly, and validates that the newly created variable exists in the domain with the expected output.
Exception    Given an input and class that extends Exception, and validates that the class is caught while executing the input.
Function     Given an input to declare a function, an input to run the function and an expected output of the run. The test runs the declaration, runs the test input, and validates the output is the same as the expected output.
Log          Methods provided to test a log, warning, or severe message. Given a message to output, and then validates that the logger outputs the correct prefixed string. Also has methods for changing the debug setting.

Table 5 – Testing Framework types and purposes

Creating an extensible testing framework meant that writing a test case simply

involved picking the type of test and then feeding in the required elements, such as

input, expected output, class, or variable name.

The framework proved essential for testing all areas of the programming language within the time frame of the project. Testing each part of the language that could fail would take a very long time, so by breaking it down into simpler use cases the unit tests were more efficient.

Figure 16 (page 46) shows a sample of the output text from the framework. The class name of the test is displayed, usually followed by the expected result and then the actual result. The type of test being run determines what output template is used: the output for Expression, Variable, and Function tests is largely the same, Exception tests have a different output, and Log tests do not output anything as they use a new output console to run the test.


7.3 Test Case Setup

When writing test cases, there is often a requirement to use similar objects between

tests and creating them each time adds extra time and data costs. JUnit allows for

these objects to be created either before each test case executes or at the start of

each test suite.

The framework takes advantage of JUnit’s Before and After annotations to reset the domain. This is to stop changes to the domain in one test affecting another test. For instance, in a scenario where two different test cases created a variable with the same name, the second test that runs would fail due to the variables being immutable. Therefore, the reset before and after every test is necessary to ensure that tests are run in a completely clean and fair environment.

As the 180 test cases run quickly, this cleaning of the domain does not have much of an effect on the performance of the testing suite. Without the cleaning in between each test run, all test cases would need to use a unique variable or function name, which after a few tests becomes difficult to keep track of. This Before and After JUnit feature therefore makes writing the test cases much simpler.

Run 1    Run 2    Run 3    Average Run
375ms    300ms    369ms    348ms

Table 6 - Average run time to execute all tests in the testing framework

The above table shows that the testing framework runs the 180 test cases in an efficient time, at an average of 1.93ms per test.

7.4 Test Coverage

Figure 13 (page 45) shows the overview of the coverage report, while the full coverage report (supplied within the supporting material) displays the classes of the implementation and shows the lines that are covered by test cases, with green highlighting indicating the line is covered by a test case and red highlighting indicating that the line is not covered by any test case. The report is incredibly useful in closing any gaps in testing. As the report displays, the final implementation had 85% of the lines of code covered by at least one test case.

The class coverage, which was 97%, is high because on top of the base classes, each feature had its own unit test, and the extensible nature of the project meant each unit test just needed to test the few methods written for each feature. Unit testing was therefore kept to a simple minimum and so high class coverage was easily achieved.

For each feature, a test case was created to check that the expected result was returned from a valid input, and that any exceptions thrown by the new feature were caught when given the invalid input that should trigger them.

7.5 Testing Analysis

At the end of the final implementation, the testing coverage reached 97% for class

coverage and 85% for line coverage. This coverage is higher than expected at the

start of the project, and the quality of the final implementation is reflective of the

testing standards used.

The line coverage could have been higher, as certain aspects of the language were not tested. However, some parts of the language were not testable: user interface methods and some methods of operators are hard to test without a very in-depth test suite, which was beyond the timescale of this project.

The agile methodology used meant that the testing framework had to be robust enough to handle different types of tests for each feature of the language, and a test would have to be created for each possible operation. With this in mind at the start of the project, a robust framework was built which made adding test cases very simple.

When analysing the coverage, one area of code that was untested within most of the

operators was the getPrecedence function. This function is used when determining

which order the operators in the expression should be executed in and so a lack of

testing of this method shows that most test cases do not have more than one

operator in them. This testing gap could mean that some operators may not be

executed in the expected order when input in combination with other operators.

Overall, the class coverage and line coverage are very high. To a certain degree, all of the classes that can be tested are being tested with the current testing suite. The high class coverage means that there should be a low level of unexpected behaviour, as the classes that contain an operator have all been tested to confirm that they perform the expected behaviour. Of course, a high class coverage does not mean that errors cannot arise during production, and so class coverage is not a perfect indicator of the success of test coverage. As a testing gap has already been identified, a combination of operators may result in unexpected behaviour.


8 Conclusions

This report documents how the project was researched, designed, scoped,

implemented and tested; and in my own opinion, although the scope had been

changed during the project, the initial aims and objectives have been met.

At the start of the project there was little knowledge, and a very weak and flawed implementation was created, whereas at the end of the project there was an efficient and well-researched solution in place. The implementation serves its purpose in that it is usable and extendable in its current state, so from the perspective of research, design, and implementation the project can be viewed as a success.

Although the implementation does not produce bytecode which would normally be

produced by a programming language, the theory behind the language has been

implemented. Whilst it may have been an expectation from the title, producing

bytecode within the timeframe of the project is near impossible when combined with

the amount of research needed in order to do so. There is a model of computation

that has been implemented to produce a programming language.

When examining the code of the programming language, it is visible that the structure

has built up in an agile way, as features are built in fairly flat to continue to be

extensible while supporting all aspects of the language syntax. I believe this makes

the code easier to read, and understand, and therefore building out more features into

this code base should be a realistic goal.

Therefore, the programming language created by the implementation meets the three

aims set out to achieve at the start of this project; the knowledge of programming

language theory has been applied to the implementation, which is extensible for

future use, and many operators and functions have been built in to provide a feature-rich language.


9 Future Developments

One aim of the project was to create a platform on which the programming language can be expanded further. Due to the complexity and research needed to complete the project, it was known at the start that a full implementation would be too much to complete. By keeping the code in an extensible state, the implementation can be seen as an academic project that could be extended. This was achieved by creating a solid foundation for the language, involving the Lexer, Parser and Domain classes. This meant that adding each operator would involve a single class addition and then additional references in the lexer and domain classes.

9.1 Implementing the Backus-Naur Form Analysis

One improvement that would have been made, if not for time constraints, is to implement the lexical and syntactical analysis using a Backus-Naur Form (BNF) grammar. Using a BNF grammar would mean that the syntactical analysis could catch errors in the input immediately, unlike the current implementation, which does not catch errors until the expression is evaluated. It would also make changes to the grammar much simpler to implement, for example by using a parser generator such as ANTLR [23].
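As a rough illustration of catching errors during syntactic analysis rather than at evaluation, the sketch below hand-codes a recursive-descent recogniser for a small, assumed BNF grammar of arithmetic expressions. The grammar, the class name SyntaxCheckSketch and its methods are hypothetical and are not taken from the project; a generator such as ANTLR would produce an equivalent parser from the grammar itself.

```java
/**
 * Hypothetical sketch: a recogniser for the assumed grammar
 *   <expr>   ::= <term>   { ("+" | "-") <term> }
 *   <term>   ::= <factor> { ("*" | "/") <factor> }
 *   <factor> ::= <number> | "(" <expr> ")"
 * Malformed input is rejected before anything is evaluated.
 */
class SyntaxCheckSketch {
    private final String input;
    private int pos = 0;

    private SyntaxCheckSketch(String input) {
        // Whitespace is simply dropped to keep the sketch short;
        // a real lexer would produce tokens instead.
        this.input = input.replaceAll("\\s+", "");
    }

    /** Returns true if the whole input matches <expr>. */
    static boolean isValid(String input) {
        SyntaxCheckSketch parser = new SyntaxCheckSketch(input);
        try {
            parser.expr();
            return parser.pos == parser.input.length();
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    private void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            pos++;
            term();
        }
    }

    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            pos++;
            factor();
        }
    }

    private void factor() {
        if (peek() == '(') {
            pos++;
            expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++;
        } else if (Character.isDigit(peek())) {
            while (Character.isDigit(peek())) {
                pos++;
            }
        } else {
            throw new IllegalArgumentException("unexpected input at position " + pos);
        }
    }

    /** Current character, or '\0' once the end of the input is reached. */
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(isValid("(1 + 2) * 3"));
        System.out.println(isValid("1 + * 2"));
    }
}
```

Each grammar rule maps to one method, so a change to the grammar is a change to one method, which is the property that a generated parser would provide automatically.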

9.2 Return Results in Input Format

In the final solution, most operators parse their operands to a predefined type and therefore return a different type from the one that was passed in. For example, when adding two binary numbers, 00001101 + 11010000, the decimal 221.0 is returned, as the addition operator parses the binary digits to decimals and returns the sum.

While this method works, it would be better to return an object that matches the type passed into the operator. One reason that the implementation is in its current format is that it makes chaining operations together straightforward. If, for example, an operator returned a binary result, which is a String in Java, the next operator to parse this result would read the binary digits as a decimal number and so get the value wrong. For instance, 7 in binary, "00000111", would be interpreted as the decimal number 111.
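The misreading described above, and one possible way of returning a result in the format of its operands, can be sketched as follows. The class BinaryFormatSketch and its methods are hypothetical and only illustrate the idea; they do not reflect how the project's operators are written.

```java
/** Hypothetical sketch of reading and returning values in binary format. */
class BinaryFormatSketch {

    /** Reads a binary literal as base 2 rather than base 10. */
    static int parseBinary(String bits) {
        return Integer.parseInt(bits, 2);
    }

    /** Formats a non-negative value as binary, left-padded with zeros to the given width. */
    static String toBinary(int value, int width) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(value));
        while (sb.length() < width) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    /** Adds two binary literals and returns the sum in the same format. */
    static String addBinary(String left, String right) {
        int width = Math.max(left.length(), right.length());
        return toBinary(parseBinary(left) + parseBinary(right), width);
    }

    public static void main(String[] args) {
        // Parsing the String as a decimal gives 111.0, the misreading described in the text.
        System.out.println(Double.parseDouble("00000111"));
        // Parsing with radix 2 gives 7.
        System.out.println(parseBinary("00000111"));
        // 13 + 208 = 221, returned here as the binary String 11011101.
        System.out.println(addBinary("00001101", "11010000"));
    }
}
```

For chained operations to work, the next operator would also need to know that its operand is binary, for example by carrying the format alongside the value instead of passing a bare String.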

9.3 Full Recursion

Because the domain uses the Singleton design pattern to keep only one instance of the domain class within the whole system, once a variable is assigned a value it cannot be changed. To run a recursive function, the domain would need an extra map to keep track of multiple instances of the same variable, with a result being passed between these levels back to the start of the recursive call.

The for loop feature mitigates this slightly by changing the value directly in the domain, although it is still susceptible to unexpected output if a function is called in the expression of the for loop.
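One way the extra bookkeeping might look is a stack of maps, one per active call, so that each level of a recursive call has its own binding for the same variable name. The sketch below is an assumption about a possible design; ScopedDomainSketch and its methods are hypothetical names and are not part of the project's Domain class.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: variable bindings kept per call rather than once globally. */
class ScopedDomainSketch {
    // The innermost call frame is at the head of the deque.
    private final Deque<Map<String, Object>> frames = new ArrayDeque<>();

    ScopedDomainSketch() {
        frames.push(new HashMap<>()); // global frame
    }

    void enterCall() {
        frames.push(new HashMap<>());
    }

    void exitCall() {
        if (frames.size() > 1) {
            frames.pop(); // the global frame is never discarded
        }
    }

    /** Binds a name in the innermost frame only. */
    void assign(String name, Object value) {
        frames.peek().put(name, value);
    }

    /** Looks a name up from the innermost frame outwards; null if unbound. */
    Object lookup(String name) {
        for (Map<String, Object> frame : frames) {
            if (frame.containsKey(name)) {
                return frame.get(name);
            }
        }
        return null;
    }

    /** Factorial evaluated with a fresh binding of "n" at each level of recursion. */
    static long factorial(ScopedDomainSketch domain, long n) {
        domain.enterCall();
        domain.assign("n", n);
        long result;
        if (n <= 1) {
            result = 1;
        } else {
            long inner = factorial(domain, n - 1);
            // The binding of n at this level is intact after the inner call returns.
            long outerN = (Long) domain.lookup("n");
            result = outerN * inner;
        }
        domain.exitCall();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(new ScopedDomainSketch(), 5));
    }
}
```

The result of each level is passed back to its caller as an ordinary return value, which corresponds to the result being passed between levels back to the start of the recursive call.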

9.4 Memoization within the Domain

The domain could be made more efficient by keeping a store of the expressions that have been executed and their results. This would be particularly beneficial when

implementing a recursive function that would run the same expression multiple

times, an example of which would be the Fibonacci Sequence function.

This feature would not take much to implement: the Domain would simply hold a Map with the infix String of an expression as the key and the Object resulting from its evaluation as the value. When an expression is input by the user, the Map could be checked for the infix expression, and if a result exists in the map it could be returned without the expression having to be executed again.
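A minimal sketch of such a store is shown below, assuming the evaluator can be passed in as a function from the infix String to its result. The class MemoSketch, its methods and the evaluation counter are hypothetical and exist only to illustrate the lookup-before-execute step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a result store keyed by the infix expression. */
class MemoSketch {
    private final Map<String, Object> results = new HashMap<>();
    private int evaluations = 0;

    /**
     * Returns the stored result for the expression if there is one;
     * otherwise runs the evaluator, stores the result and returns it.
     */
    Object evaluate(String infix, Function<String, Object> evaluator) {
        if (results.containsKey(infix)) {
            return results.get(infix);
        }
        evaluations++;
        // An explicit get/put is used instead of computeIfAbsent so that an
        // evaluator which itself calls evaluate() does not modify the map
        // while it is being updated.
        Object value = evaluator.apply(infix);
        results.put(infix, value);
        return value;
    }

    /** Number of times the evaluator was actually run. */
    int evaluationCount() {
        return evaluations;
    }

    public static void main(String[] args) {
        MemoSketch memo = new MemoSketch();
        memo.evaluate("1 + 2", expression -> 3.0);
        memo.evaluate("1 + 2", expression -> 3.0); // served from the store
        System.out.println(memo.evaluationCount());
    }
}
```

A store like this is only safe while the result of an expression cannot change between executions, so any expression that depends on a variable changed by the for loop would need to be excluded or invalidated.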

9.5 How to: Add a New Operator or Function

As an example of how another operator could be added, let us assume we want to create a function called average, which will take a list of any length and find the average of the values within. There are five base classes that can be extended, depending on how many operands are being dealt with: operators allow between zero and three operands, while functions allow any number of operands to be evaluated.

Operands    Base Class
0           NullaryOperator
1           UnaryOperator
2           BinaryOperator
3           TernaryOperator
Any         Function

Table 7 - Base classes used for Operators

All of these classes contain the same methods that have to be implemented,

getToken and execute. There are some additional methods that must be

implemented depending on the base class, which include getPrecedence, evaluate

and toPostFix.

Figure 14 (page 45) in the appendix demonstrates the code addition that would be required to add an Average class to the programming language. The function class created would be registered in the domain by adding the line registerFunction(new Average()); in the constructor for the Domain class. For an operator class, the method registerOperator would be used.
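A minimal sketch of what such a function class might look like is given below. The base class FunctionSketch is a stand-in defined here purely for illustration, and the operand type is assumed to be a list of doubles; the project's actual Function class, its method signatures and the code in Figure 14 may differ.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-in for the project's Function base class; the real signatures are assumptions. */
abstract class FunctionSketch {
    /** The name by which the lexer would recognise the function. */
    abstract String getToken();

    /** Applies the function to any number of operands. */
    abstract Object execute(List<Double> operands);
}

/** Hypothetical average function built on the stand-in base class. */
class AverageSketch extends FunctionSketch {
    @Override
    String getToken() {
        return "average";
    }

    @Override
    Object execute(List<Double> operands) {
        if (operands.isEmpty()) {
            throw new IllegalArgumentException("average needs at least one value");
        }
        double sum = 0;
        for (double value : operands) {
            sum += value;
        }
        return sum / operands.size();
    }

    public static void main(String[] args) {
        System.out.println(new AverageSketch().execute(Arrays.asList(2.0, 4.0, 9.0)));
    }
}
```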

10 References

[1] M. J. C. Gordon, Programming Language Theory and its Implementation, Citeseer,

1988.

[2] Bubenik and Zwaenepoel, “Performance of Optimistic Make,” Performance

Evaluation Review, vol. 17, pp. 39-48, 1989.

[3] J. C. Reynolds, Theories of Programming Languages, Cambridge: Cambridge University Press, 1998.

[4] R. Harper, “Memoization and Laziness,” [Online]. Available:

https://www.cs.cmu.edu/~rwh/introsml/techniques/memoization.htm. [Accessed

12 Mar 2016].

[5] E. A. Edmonds, “A Process for the Development of Software for Nontechnical Users

as an Adaptive System,” General Systems, vol. 19, no. 215C18, p. 8, 1974.

[6] P. J. Landin, “The Mechanical Evaluation of Expressions,” The Computer Journal, vol.

6, no. 4, pp. 308-320, 1964.

[7] A. M. Turing, “Computability and λ-Definability,” The Journal of Symbolic Logic, vol.

2, no. 4, pp. 153-163, 1937.

[8] E. W. Dijkstra, “Algol-60 Translation,” Mathematisch Centrum, pp. 0-32, 1961.

[9] M. Huntbach, “Aldwych_prog_model.pdf,” [Online]. Available:

http://www.eecs.qmul.ac.uk/~mmh/aldwych/papers/Aldwych_prog_model.pdf.

[Accessed 10 Oct 2015].

[10] E. Gamma, R. Helm, R. Johnson and J. Vlissides, Design Patterns - Elements of

Reusable Object-Oriented Software, Addison-Wesley, 1995.

[11] Jetbrains, “IntelliJ IDEA,” Jetbrains, [Online]. Available:

https://www.jetbrains.com/idea/. [Accessed 27 Oct 2015].

[12] GitHub, “GitHub - Where software is built,” GitHub, [Online]. Available:

https://github.com/. [Accessed 28 Oct 2015].

[13] Google, “Google Commons Library,” MVNRepository, [Online]. Available:

http://mvnrepository.com/artifact/com.google.collections/google-collections.

[Accessed 21 Oct 2015].

[14] JUnit, “JUnit Testing Framework,” [Online]. Available: http://junit.org/. [Accessed 1

Nov 2015].

[15] Apache, Apache Common Lang, Apache.

[16] cognitolearning, “Writing a Parser in Java: Creating the Expression Tree | Cognito

Learning,” CognitoLearning, 7 May 2013. [Online]. Available:

http://cogitolearning.co.uk/?p=630. [Accessed 23 Oct 2015].

[17] T. O. M. Center, “The Shunting Yard Algorithm | The Oxford Math Center,” [Online].

Available: http://www.oxfordmathcenter.com/drupal7/node/628. [Accessed 10 Nov

2015].

[18] Free Software Foundation, “GNU Emacs Lisp Reference Manual: Simple Lambda,”

[Online]. Available:

http://www.gnu.org/software/emacs/manual/html_node/elisp/Simple-

Lambda.html#Simple-Lambda. [Accessed 8 Dec 2015].

[19] A. J. Field and P. G. Harrison, Functional Programming, London: Addison-Wesley

Publishing Company, Inc., 1988.

[20] cppreference.com, “C Operator Precedence - cppreference.com,” [Online]. Available:

http://en.cppreference.com/w/c/language/operator_precedence. [Accessed 10 Dec

2015].

[21] scriptasylum.com, “Postfix Evaluation,” [Online]. Available:

http://scriptasylum.com/tutorials/infix_postfix/algorithms/postfix-evaluation/.

[Accessed 10 Dec 2015].

[22] Admin, “Postfix Evaluation ~ EveryBrickMatters,” EveryBrickMatters, Nov 2013.

[Online]. Available: http://www.everybrickmatters.com/2013/11/postfix-

evaluation.html. [Accessed 21 Nov 2015].

[23] T. Parr, “ANTLR,” Antlr / Terence Parr, [Online]. Available: http://www.antlr.org/.

[Accessed 12 Jan 2016].

Referencing Style: IEEE Standard

11 Appendices

Figure 13 - Test coverage run at the end of the final implementation

Figure 14 - Example code for an Average function class

Figure 15 - Sample execution of the final solution

Figure 16 - Sample output of the JUnit tests