
UNIVERSIDAD POLITÉCNICA DE MADRID

FACULTAD DE INFORMÁTICA

UNA NOTACIÓN FORMAL ORIENTADA A OBJETOS: ESPECIFICACIONES EJECUTABLES CON CLAY

AN OBJECT-ORIENTED FORMAL NOTATION: EXECUTABLE SPECIFICATIONS IN CLAY

DOCTORAL THESIS

Ángel Herranz Nieva

January 2011

UNA NOTACIÓN FORMAL ORIENTADA A OBJETOS: ESPECIFICACIONES EJECUTABLES CON CLAY

DOCTORAL THESIS

SUBMITTED TO THE FACULTAD DE INFORMÁTICA OF THE UNIVERSIDAD POLITÉCNICA DE MADRID FOR THE DEGREE OF DOCTOR IN COMPUTER SCIENCE

Candidate: Ángel Herranz

Ingeniero en Informática (Computer Engineer), Universidad Politécnica de Madrid, Spain

Supervisor: Julio Mariño Carballo

Associate Professor (Profesor Titular de Universidad)

Madrid, January 2011

This work is licensed under the Creative Commons Attribution-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Errata

• Page I (preface in Spanish), line 2: De-sde ↦ Des-de.

• Page 23, line 8: orphaned sentence removed.

• Page 99, line 28: the translation of the postcondition of the method “remove” is corrected (the translation of the left-hand side of the conjunction was missing).

To Silvia

Acknowledgements

Cervantes said that “it is the mark of well-born people to be grateful for the benefits they receive”. So began the acknowledgements of the doctoral thesis of the person to whom I owe the most: Professor Juan José Moreno Navarro. For years he has shown me a support and a trust that at times I did not deserve. I hope he recognizes in this thesis a small but representative sample of his vision of computer science and of everything I have learned at his side.

To my thesis supervisor Julio Mariño Carballo, a friend in spite of everything, I owe the order he brought to my head and, therefore, to this work. At the last minute I learned that perfection is a continuum, and Julio has helped this thesis pass a threshold that lets me feel comfortable with it.

I want to thank the members of Babel, my research group, for their help and their always constructive criticism of my work, and especially those colleagues who reviewed this thesis: Pablo Nogueira, Jamie Murdoch Gabbay, Lars-Åke Fredlund, Clara Benac and Emilio Gallego. I thank my colleagues Manuel Carro and Germán Puebla for their valuable comments after the pre-reading.

To the creators of the theories, logics, calculi, programming languages, compilers, theorem provers and tools that I have used throughout my years in research, I am grateful for their magnificent and invaluable contribution.

I want to thank Rafael Corchuelo for a support that has always shown itself in that image he sent me several years ago.

I have always been one to save the best for last. I want to thank Silvia for her unconditional love. To her I dedicate all of this work.

Preface

In the summer before my second year at university, not quite aware of what I was looking for, I borrowed the book Art of Prolog from the library. From then on I gradually stopped speaking the language of my fellow professionals: I grew more declarative by the day (logic languages, functional languages and, finally, formal specification languages and formal methods) and they grew more imperative. How could such intelligent people fail to see that we had to raise the level of abstraction of the languages in which we describe our software systems?

Today it is evident to me that the low adoption of formal techniques in software development is mainly caused by sociological and economic factors that are far more complex and subtle, and that are hard to influence directly from academia.

Nevertheless, the academic community can exert an indirect influence, always in the medium and long term. A first way is to incorporate formal techniques into computer engineering curricula; a good example is the work of Meyer [107]. The second way is to look for explanations of why software engineering does not use mathematical tools in the way other engineering disciplines do.

In my view, the lack of effective tools (useful from the developers' point of view) is the first problem to attack in order to break the following vicious circle: well-trained professionals are needed to create success stories, success stories are needed to encourage the creation of tools, and tools are needed to train the professionals. I also consider that the duties of the scientific community include helping with the technology transfer of sufficiently mature conceptual tools.


The motivation of this thesis, and of much of my work as a researcher, is to do my bit to bring declarativeness closer to the software development industry.

The contributions of this thesis are Clay, an object-oriented formal specification language; its associated theoretical and practical tools; and a demonstration of how different formal techniques can be integrated into current software production processes.

History

In 1996, with the introduction of a new computer science curriculum at the Universidad Politécnica de Madrid, Professor Juan José Moreno Navarro designed and launched a strategy for the programming curriculum that went beyond the now classic functional-first or logic-first approaches. Students would use a formal specification language to describe problems, and they would be taught algorithms that correctly implemented those specifications.

Three years later I had abandoned my doctoral thesis topic (abstract machines for functional logic languages) to work with Professor Moreno Navarro on the design of a new specification language. The language was based on first-order logic and captured syntactic patterns that favored the description of both problems and their solutions, while easing code generation from both descriptions. The language ended up being called SLAM-SL¹ [63] and incorporated a host of syntactic constructs to support different paradigms (such as the functional or the object-oriented one) and design patterns (iterators, visitors, quantifiers). By 2003 we had added so many new constructs to the language and explored so many possible branches that, when I attempted to formalize it, I found it impossible, especially while trying to keep everything under the single umbrella of first-order logic.

Until that moment, moreover, my research had taken on a profile very close to classic “software engineering”, leaving aside theoretical aspects that I did not want to give up.

¹ The name, an acronym for Specification Language for Abstract Machines, predates Thomas Ball's SLAM project [144] and is unrelated to it, beyond the fact that both fall within the area of formal methods.


In 2007, after Professor Juan José Moreno Navarro left for the Ministry of Science and Innovation, Julio Mariño Carballo took over the supervision of my thesis. It was then that Clay emerged, after a conceptual refactoring of SLAM-SL. The most idiosyncratic features of SLAM-SL were selected and kept in Clay, while the syntactic constructs created for the sole purpose of generating code in imperative languages such as Java were removed.

Keeping first-order logic as the logical framework for the language, and finding a theory of equality able to reflect object-oriented concepts such as inheritance and method overriding, have been among my main obsessions. Next comes my eagerness to embody all the theoretical work in automatic tools, which have allowed me to find and correct conceptual errors.

Contributions

Design of Clay. Clay is a stateless object-oriented formal notation: a class-based language with a nominal type system that integrates algebraic types, inheritance, a Scandinavian-style interpretation of equality and of method overriding, and a very permissive overloading scheme.

Nominal type system. A type system for Clay based on names (harder to define than systems based on the structure of types), which is used to rule out illegal specifications and to guide the translation processes used in the definition of the first-order semantics and in the generation of Prolog code from specifications.

First-order logic semantics for Clay. A semantics for specifications with a first-order logic interpretation of the main object-oriented constructs: class definition by cases, inheritance, permissive overloading, dynamic binding and static equality. In addition, using the concrete syntax of the Prover9/Mace4 provers, progress has been made in mechanizing both the theory of Clay and the specifications themselves. For example, some of Clay's theorems have been proved automatically.


A generator of executable prototypes. A compilation scheme from Clay specifications to logic programs. The scheme handles implicit and recursive specifications and uses several programming techniques to achieve acceptable efficiency: the Lloyd-Topor transformation, constructive negation, breadth-first search with iterative deepening, finite-domain constraints, etc.

The Clay compiler. A tool that goes beyond the mere mathematical definition of the translations into first-order logic and the synthesis of logic programs. I have built a real tool that performs parsing of modular Clay specifications, type checking, translation of Clay specifications into theories of classical logic in Prover9/Mace4, and the synthesis of executable Prolog prototypes.

Bringing formal methods closer to engineering. Digressions that illustrate different ways of applying our previous work on SLAM-SL to mainstream software engineering: integration into agile methodologies, automatic generation of Java code and formalization of design patterns.

I want to stress that all the theoretical work presented in my thesis is mechanized, that is, it has been described using formal languages and tools with which one can interact: the theories of Clay in Prover9/Mace4 (first-order logic) and Prolog (logic programming), and the translation functions in Haskell. I believe that all this work can be regarded as a methodological contribution that brings us closer to the ideals of proposals such as the POPLmark challenge and the QED manifesto.

Influences

When one sets out to create a logical framework able to capture the notions of object orientation without introducing behavior that experts in that area would find strange, the first step is to learn the language of the experts. I can name Budd [21], Fowler [48] and the Gang of Four [52] as my main sources for that learning. After them, I must add the countless legion of designers and programmers who pour their expertise into newsgroups, weblogs and discussion forums on object-oriented programming and object-oriented design.


Back on formal ground, I have analyzed the logical frameworks of several object-oriented specification languages. I will begin with VDM++ [149] and Object-Z [116], the object-oriented versions of the classic VDM and Z. In both cases the resulting frameworks are more or less faithful adaptations to object orientation of logics that are ill-suited to that paradigm. The next line analyzed is that of languages built on typed logics (with subtypes), of which CASL [100], COLD [74, 79], OBJ [53] and Maude [33] seem to me the most relevant. Their frameworks are extraordinarily elegant, but in them the mathematical concepts do not always correspond to what object orientation dictates. The last source of influence on the formal side is that of authors working on object calculi, among whom I will highlight Abadi and Cardelli [1] and Castagna [25]. Their work, of enormous theoretical influence, proposes frameworks less expressive than the previous logics and always based on structural type systems, which treat the crucial aspect of type names very superficially.

I will finish by naming Daniel Jackson and the Alloy language [71], probably the main representative of lightweight formal methods. Based on the logical framework of relational theory, it has a syntax whose look and semantics are similar to object orientation, with notions of navigability similar to those of UML. I will compare my approach of generating logical theories and prototypes from Clay with the model-checking approach followed by Alloy.

How to read this thesis

Chapter 1 gives the reader the opportunity to go deeper into the motivation, justification and contributions of my work.

The reader more interested in the formal aspects of the syntactic and semantic description of the Clay language and in the generation of executable prototypes can continue with Chapter 2. It gives an informal description of the language, which is then formalized in Chapters 3 and 4, and the essential aspects of the generation of logic programs follow in Chapter 5. Chapters 7, 8 and 9 are of special interest for understanding the origin of some of Clay's constructs and their implications at the engineering level.

The reader more interested in the contributions to software engineering can jump, after Chapter 1, to Chapters 7 and 8, in which we analyze the compatibility of formal techniques with some current practices of the software production industry. They can then continue with Chapter 9, which shows the formalization of several design patterns using our own language. The formal aspects in Chapters 3, 4 and 5 require reading Chapter 2 first, where Clay is presented informally.


Summary†

The thesis presents the Clay language, an object-oriented formal notation that seeks to bring formal methods closer to the programming languages and software development processes most widely used today. Along with the formal definition of the language, tools and applications are provided that demonstrate the viability of the project.

1. Motivation

The starting point of this thesis is a reflection on software engineering and the role that mathematics can or should play in its practice, taking as reference the other engineering disciplines, which are some centuries ahead of us, and the current state of the use of formal methods in software development.

Much has been written about why formal methods are not used in software development to an extent matching the technical sophistication achieved by the various academic proposals. Part of Chapter 1 is devoted precisely to recapitulating some of the most qualified opinions on this matter and trying to draw some conclusions. Among them, we can highlight:

1. the lack of professionals qualified in formal techniques [19],

2. the scarcity of tools adapted to the most common development processes [4],

3. the language gap between specification formalisms and programming formalisms [3], and

4. the perception that formal methods do increase reliability, but at the cost of reducing productivity and increasing development costs [37].

† This summary of the doctoral thesis, which is submitted in English for its defense before an international committee, is mandatory under the doctoral regulations in force at the Universidad Politécnica de Madrid.

All these problems are, in one way or another, interrelated. Thus, the lack of qualified professionals is fundamentally due to the absence of established de facto standards, which cannot emerge without a certain critical mass of success stories, and this leads to the resulting vicious circle.

The fourth point is also caused, in part, by the poor fit between the proposed techniques and the most deeply rooted development models. As a result, formal methods are perceived as obstacles that add new phases to development without really helping with the usual ones. It is thought that using formality in requirements specification will delay the appearance of the first working code, giving both clients and project managers a sense of delay; it makes no difference that, in the end, the total development time is reduced.

Something similar happens with formal specifications and the testing phases: even assuming that formal techniques gave total reliability of the produced code with respect to the specification, nothing protects against errors in the formalization of the requirements itself, so the traditional validation processes cannot be skipped.

All this has kept formal methods (at least in their most classical sense) restricted to niches where the advantages of their use clearly outweigh the perceived drawbacks. We are talking about critical software of limited size that admits a fully exhaustive formal specification of its requirements. Success stories of this kind include both companies [134, 141, 131, 130, 132] and projects [12, 15, 126, 84, 117, 57, 29, 20, 18]. However, the use of formal techniques in the development of general-purpose software remains scarce.

Recently there has been a trend towards breaking this inertia by sacrificing some of the dogmas of verified development (supposedly leading to 100% correct software) in exchange for immediate, tangible benefits that improve the quality of the software produced, at least on average. Thus we are witnessing the emergence of the so-called lightweight formal methods, which are much easier for developers to assimilate and which produce immediate results in the form of errors detected in requirements capture, generated test cases, etc. Some of these systems, of relative simplicity, are achieving a popularity that was hard to imagine a few years ago [71, 86, 128, 142, 146, 139].

2. Objectives

With these recent experiences very much in mind, we considered the possibility of updating our work on object-oriented formal notations: SLAM-SL [63]. That work was mainly aimed at reducing the linguistic gap pointed out in the third of the problems mentioned in the previous section, with the possibility of generating executable prototypes.

One advantage of this approach is that these prototypes can be integrated into software development both to reduce the time needed to produce tangible code and to help in the early validation of requirements. In fact, an executable prototype has a greater capacity for validating requirements than the techniques (generally based on model checking) implemented by the lightweight methods that exist today.

To reduce that linguistic gap we adopted the object-oriented paradigm for SLAM-SL, since it had very high commercial relevance. It was considered reasonable to give SLAM-SL object-oriented features [61, 60, 63] while also trying to get closer to successful practices in software engineering [64, 65, 66, 69]. The language also incorporated syntactic constructs to support other paradigms (such as the functional or the logic one) and design patterns (iterators, visitors, quantifiers).

The problem with SLAM-SL was twofold:

• It had so many constructs, and we had explored so many possible branches, that the task of formalization proved impossible, so SLAM-SL never had a formalized semantics.

• Executable prototypes were generated from fairly strict syntactic patterns that did not admit implicit or recursive specifications.

The main objective of this thesis has been to solve both problems.

3. Our proposal: Clay

Our proposal, Clay, is an evolution of SLAM-SL and has been designed around two basic ideas. First, it must have a core able to accommodate the main constructs of object orientation, with a formal semantics close to the one experts expect. Second, every specification must admit at least one translation into executable prototypes that allows developers to validate the requirements themselves early on.

The requirement of an accessible semantics leads us to consider, in the first place, an underlying logic that is classical and first-order. Besides the goal of formalizing the language, the aim is to take advantage of the advanced state of automated theorem-proving technology for first-order theories, with all that this can mean both for debugging the meta-theory of Clay and for reasoning about the specifications themselves.

One of the traditional areas of our research group is logic programming, where a subset of first-order logic admits a systematic deduction procedure that allows an experienced programmer to express a logical specification as an algorithm of more than reasonable efficiency. Despite the expectations raised by logic programming since the 1970s of being a kind of universal specification language, its apparent linguistic distance from traditional programming languages has always made developers turn their backs on it.

For the generation of executable prototypes we decided to combine Clay and its high-level object-oriented specifications with the automatic deduction capability offered by logic programming.

3.1 Clay

Clay is an object-oriented formal notation, close to the way developers think and simple enough to allow the synthesis of logic programs from its specifications.


[Figure 1: Our proposal: Clay. Diagram labels: Developer, Cell.cly, clayc, Cell.pl, Clay.pl, Prolog, Cell.p9, Clay.p9, Prover9, Mace4.]

Clay is stateless and class-based, and its type system is based on names. Classes can be defined as algebraic types by means of case classes: disjoint and complete subclasses. The behavior of classes can be extended through inheritance. Methods are specified by pre- and post-conditions, which are first-order formulas that relate the receiver of the message and its parameters to its response. The most relevant atomic formulas are class membership and equality indexed by the least superclass of both sides of the equality.
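As a rough illustration of these ideas, the following Python sketch (not Clay syntax) models a stateless algebraic type with two disjoint cases and a method described by a pre- and a post-condition that relate the receiver to the response. The names `Empty`, `Push`, `pop`, `pre_pop` and `post_pop` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical algebraic type in the spirit of case classes:
# a stack is either Empty or a Push, and nothing else.
@dataclass(frozen=True)
class Empty:
    pass


@dataclass(frozen=True)
class Push:
    top: int
    rest: "Stack"


Stack = Union[Empty, Push]


def pre_pop(receiver: Stack) -> bool:
    """Precondition of pop: the receiver must be a Push."""
    return isinstance(receiver, Push)


def post_pop(receiver: Stack, result: Stack) -> bool:
    """Postcondition of pop: the receiver is some value pushed onto the result."""
    return isinstance(receiver, Push) and receiver.rest == result


def pop(receiver: Stack) -> Stack:
    """One possible implementation; there is no state, so a new value is returned."""
    assert pre_pop(receiver), "precondition violated"
    result = receiver.rest
    assert post_pop(receiver, result), "postcondition violated"
    return result


s = Push(2, Push(1, Empty()))
print(pop(s))  # Push(top=1, rest=Empty())
```

The predicates `pre_pop` and `post_pop` play the role of the specification, while `pop` is just one implementation that happens to satisfy it.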

Figure 1 shows how our proposal would work: the specifier writes Clay specifications, which the Clay compiler transforms into first-order theories and logic programs. The specifier can then interact with Prover9/Mace4 and with Prolog to validate the requirements. Prover9/Mace4 lets them check the consistency of their own specifications, while Prolog lets them analyze whether the results of certain checks are the expected ones.


4. Contributions of this thesis

We summarize below the main contributions of this thesis. Some results of the work have already been published; in those cases we also give the full reference of the publications.

4.1 Design of an object-oriented formal notation

One of the motivations of this work was the study and integration of object-oriented concepts, both the most established and some of the most novel, into a formal notation that would help in the adoption of rigorous software development methodologies. To that end, the specification language SLAM-SL was developed at first. Clay is a slight evolution of SLAM-SL that allows a better formal and mechanizable treatment of its essential features.

The most notable aspects of Clay are:

• It is an object-oriented formal notation, with no notion of state and with a type system based on names.

• It supports algebraic types in the form of case classes.

• Inheritance is safe: the properties of a class cannot be invalidated in the process of inheritance, method overriding has Scandinavian semantics, and overloading always adds extra behavior to the existing one.

• Methods are specified by pre- and post-conditions, which are first-order formulas that relate the receiver of the message and its parameters to its response.

• Equality is indexed by the least superclass of both sides of the equality.

The following publications contain the main ideas of SLAM-SL, the precursor of Clay, as well as interesting digressions on the features mentioned:


• A. Herranz and J. J. Moreno-Navarro. On the design of an object-oriented formal notation. In Fourth Workshop on Rigorous Object Oriented Methods, ROOM 4. King’s College, London, March 2002.

• A. Herranz and J. J. Moreno-Navarro. Towards automating the iterative rapid prototyping process with the SLAM system. In V Spanish Conference on Software Engineering, pages 217–228, November 2000.

• A. Herranz and J. J. Moreno-Navarro. On the role of functional-logic languages for the debugging of imperative programs. In 9th International Workshop on Functional and Logic Programming (WFLP 2000), Benicassim, Spain, September 2000. Universidad Politécnica de Valencia.

• A. Herranz and J. J. Moreno-Navarro. Generation of and debugging with logical pre and post conditions. In M. Ducasse, editor, Automated and Algorithmic Debugging 2000. TU Munich, August 2000.

4.2 Nominal type system for object orientation

As Abadi and Cardelli [1] already pointed out, subtyping based on type names, rather than on type structure, is hard to define precisely. One of the main disadvantages of structural typing is that two types can become accidentally related when the developer would consider them unrelated, something avoided by our contribution of a nominal type system for Clay. This type system is used both to rule out illegal specifications and to guide the translation processes used in the definition of the first-order semantics and in the generation of Prolog code from specifications.
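The accidental relation that structural typing can introduce may be illustrated with a small Python sketch. It is not taken from the thesis and says nothing about how Clay's type checker works; the classes `Shape`, `Circle`, `Gunslinger` and the protocol `Drawable` are hypothetical names invented for the example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    """Structural type: anything with a draw() method conforms."""

    def draw(self) -> str: ...


class Shape:
    """Nominal type: only declared subclasses of Shape are shapes."""

    def draw(self) -> str:
        return "rendering a shape"


class Circle(Shape):
    pass


class Gunslinger:
    """Unrelated to Shape, but happens to have a method named draw."""

    def draw(self) -> str:
        return "drawing a pistol"


def structurally_related(obj: object) -> bool:
    # Structural check: looks only at the members the object offers.
    return isinstance(obj, Drawable)


def nominally_related(obj: object) -> bool:
    # Nominal check: looks at the declared class hierarchy.
    return isinstance(obj, Shape)


for obj in (Circle(), Gunslinger()):
    print(type(obj).__name__, structurally_related(obj), nominally_related(obj))
```

Under the structural check both objects are accepted, whereas the nominal check accepts only `Circle`, whose relation to `Shape` was declared by name.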

4.3 Formal first-order semantics for Clay

Another of the main contributions is a first-order semantics for Clay specifications, whose virtues include:

• An interpretation in logic of the main object-oriented constructs: class definition by cases, inheritance, permissive overloading, dynamic binding and static equality.


• Using the concrete syntax of the Prover9/Mace4 provers, progress has been made in mechanizing both the meta-theory of Clay and the specifications themselves. For example, some of Clay's theorems have been proved automatically.

4.4 A generator of executable prototypes

We have presented a compilation scheme from Clay specifications to logic programs. The main conclusion we can draw is that generating executable prototypes from an object-oriented formal notation is feasible, which opens up the possibility of applying object-oriented software development methods to the formal specification of requirements and to their agile validation through early prototypes.

Some notable features of this generator are:

• Code generation from implicit specifications, even in the presence of recursive definitions, something unusual in other tools.

• Our implementation uses several programming techniques to achieve acceptable efficiency: the Lloyd-Topor transformation, constructive negation, breadth-first search with iterative deepening, etc.

This contribution is a substantial improvement on the results reported in the following publication:

• Ángel Herranz and Julio Mariño. Executable specifications in an object oriented formal notation. In 20th International Symposium on Logic-Based Program Synthesis and Transformation, LOPSTR 2010, Hagenberg, Austria, July 2010. Research Institute for Symbolic Computation (RISC) of the Johannes Kepler University Linz.
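To give a feel for what executing an implicit specification by search can look like, here is a minimal Python sketch of iterative deepening. It is only an illustration of the general idea under simplifying assumptions: the generator described above emits Prolog, and nothing here reproduces its compilation scheme. The specification `is_sorted_permutation` and the functions `depth_limited` and `solve` are hypothetical names.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence


def is_sorted_permutation(xs: Sequence[int], result: Sequence[int]) -> bool:
    """Implicit specification of sorting: states what the result is, not how to build it."""
    same_elements = Counter(xs) == Counter(result)
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return same_elements and ordered


def depth_limited(candidate: List[int], alphabet: Sequence[int], limit: int,
                  goal: Callable[[Sequence[int]], bool]) -> Optional[List[int]]:
    """Depth-first search over lists built from `alphabet`, adding at most `limit` elements."""
    if goal(candidate):
        return candidate
    if limit == 0:
        return None
    for x in alphabet:
        found = depth_limited(candidate + [x], alphabet, limit - 1, goal)
        if found is not None:
            return found
    return None


def solve(xs: Sequence[int], max_depth: int = 8) -> Optional[List[int]]:
    """Iterative deepening: repeat the depth-limited search with growing limits."""
    alphabet = list(dict.fromkeys(xs))  # distinct input elements, in input order
    for limit in range(max_depth + 1):
        found = depth_limited([], alphabet, limit,
                              lambda r: is_sorted_permutation(xs, r))
        if found is not None:
            return found
    return None


print(solve([3, 1, 2]))  # [1, 2, 3]
```

The search is exponential and only meant for tiny inputs; it shows that a witness for the postcondition can be found shallowest-first without ever writing a sorting algorithm.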

4.5 The Clay compiler

Having identified the lack of formal tools as one of the main causes of the low penetration of formal methods in general-purpose software development, we could not stop at a mere mathematical definition of the translation; we wanted to build a real tool that would tangibly incorporate the developments mentioned above. The Clay compiler supports:

• parsing of Clay specifications structured into different modules,

• type checking and annotation of Clay specifications,

• translation of Clay specifications into first-order theories in the concrete syntax accepted by Prover9/Mace4, and

• synthesis of executable Prolog prototypes.

Implemented in Haskell, the Clay compiler has been developed by applying state-of-the-art functional programming methods and techniques.

4.6 Formal methods and common practices in software development

As part of the general goal of contributing to the use of formal techniques in mainstream software engineering, several chapters have been included that illustrate different ways of applying these technologies. These chapters collect our previous work on the design of SLAM-SL which, as we have said, should be considered the precursor language of Clay.

• We show how formal methods can be integrated with agile methodologies such as extreme programming (XP). In particular, we have studied unit testing, refactoring and, especially, incremental development from the standpoint of formal methods.

The publications that give rise to this contribution are:

– A. Herranz and J.J. Moreno-Navarro. Formal extreme (and extremely formal) programming. In Michele Marchesi and Giancarlo Succi, editors, 4th International Conference on Extreme Programming and Agile Processes in Software Engineering, XP 2003, number 2675 in LNCS, pages 88–96, Genova, Italy, May 2003.

– A. Herranz and J.J. Moreno-Navarro. Formal agility. How much of each? In Taller de Metodologías Ágiles en el Desarrollo del Software. VIII Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2003, pages 47–51, Alicante, Spain, November 2003. Grupo ISSI.

– A. Herranz and J.J. Moreno-Navarro. Rapid prototyping and incremental evolution using SLAM. In 14th IEEE International Workshop on Rapid System Prototyping (RSP 2003), San Diego, California, USA, June 2003.

• We have defined a syntactic characterisation of a class of specifications from which efficient code (Java, C++) can be synthesised, so that the iterative process of rapid prototyping and formal methods can be integrated in a cost-effective way.

– A. Herranz, N. Maya, and J.J. Moreno-Navarro. From executable specifications to Java. In Juan José Moreno-Navarro and Manuel Palomar, editors, III Jornadas sobre Programación y Lenguajes, PROLE 2003, pages 33–44, Alicante, España, November 2003. Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante. Depósito Legal MU-2299-2003.

– A. Herranz and J.J. Moreno-Navarro. Specifying in the large: Object-oriented specifications in the software development process. In B. J. Krämer, H. Ehrig, and A. Ertas, editors, The Sixth Biennial World Conference on Integrated Design and Process Technology (IDPT’02), volume 1, Pasadena, California, June 2002. Society for Design and Process Science. ISSN 1090-9389.

• Finally, we have shown how to formalise design patterns by treating them as operators between classes. The idea itself had long been part of the folklore of the design patterns community, but we have fully developed it for the first time.

– Juan José Moreno-Navarro and Ángel Herranz. Design Pattern Formalization Techniques, chapter Modeling and Reasoning about Design Patterns in SLAM-SL. IGI Publishing, March 2007. ISBN: 978-1-59904-219-0, ISBN: 978-1-59904-221-3.

– A. Herranz, J.J. Moreno-Navarro, and N. Maya. Declarative reflection and its application as a pattern language. In Marco Comini and Moreno Falaschi, editors, Electronic Notes in Theoretical Computer Science, volume 76. Elsevier Science Publishers, November 2002.

– A. Herranz and J.J. Moreno-Navarro. Design patterns as class operators. Workshop on High Integrity Software Development at V Spanish Conference on Software Engineering, JISBD’01, November 2001.

5. Structure of the thesis

We close this summary with an outline of the structure of the thesis and the content of each chapter.

Part I: Introduction

Chapter 1: Introduction

This chapter motivates the thesis, describes in detail the state of the art of formal methods in relation to software development, and presents the fundamental points of the Clay proposal.

Chapter 2: Clay

This chapter is an informal presentation of the specification language. All features of Clay are introduced through examples, and the less obvious decisions are justified with practical and methodological arguments.

Part II: Semantics

Chapter 3: Static semantics

This chapter presents the abstract syntax, a name-based type system for Clay, and theoretical results about it, such as its decidability.

Chapter 4: Dynamic semantics based on first-order logic

This chapter gives Clay a dynamic semantics by translating specifications into first-order theories.

Part III: The Clay system

Chapter 5: Synthesis of logic programs

This chapter presents a refinement of the dynamic semantics of the previous chapter that allows Prolog code to be generated from Clay specifications.

Chapter 6: The Clay compiler

The Clay compiler implements the functions presented in the previous chapters: syntax analysis, type checking, generation of first-order theories and connection with theorem provers, and generation of Prolog code.

Part IV: Applications

Chapter 7: Formal agility in Clay

This chapter deals with the integration of formal methods into agile development processes, extreme programming in particular.

Chapter 8: Specifying in the large

This chapter discusses ideas for generating imperative code from syntactic patterns in Clay specifications. The aim is to ease the incorporation of formal techniques into the rapid and iterative prototyping process.

Chapter 9: Modelling design patterns in Clay

The formalisation of design patterns is presented as a specification exercise in Clay.

Part V: Conclusions

Chapter 10: Conclusions and future work

Part VI: Appendices

Appendix A: Clay language reference

Includes the concrete syntax and a detailed description of all constructs.

Appendix B: Clay theory in logic programming

Contains, in full, the Clay theory in the form of a logic program presented in Chapter 5.

Appendix C: Mathematical conventions

Presents all the mathematical conventions and notations used throughout the thesis.

UNIVERSIDAD POLITÉCNICA DE MADRID

FACULTAD DE INFORMÁTICA

AN OBJECT-ORIENTED FORMAL

NOTATION: EXECUTABLE

SPECIFICATIONS IN CLAY

PHD THESIS

Ángel Herranz Nieva

January 2011

AN OBJECT-ORIENTED FORMAL NOTATION:

EXECUTABLE SPECIFICATIONS IN CLAY

A PHD THESIS

PRESENTED AT THE COMPUTER SCIENCE SCHOOL

OF THE TECHNICAL UNIVERSITY OF MADRID

IN PARTIAL FULFILLMENT OF THE DEGREE OF

DOCTOR IN COMPUTER SCIENCE

PhD Candidate: Ángel Herranz

Ingeniero en Informática

Universidad Politécnica de Madrid

España

Advisor: Julio Mariño Carballo

Profesor Titular de Universidad

Madrid, January 2011

This work is licensed under the Creative Commons Attribution-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

List of Errata in the Hardcover Binding

• Page 23, line 8: dangling sentence removed.

• Page 99, line 28: translation of the postcondition of method “remove” fixed

(the translation of the left hand side of the conjunction was missing).

To Silvia

Acknowledgements

Cervantes said that “to be grateful for benefits received is the part of persons of good birth”. So started the acknowledgements in the PhD thesis of the person to whom I have the most to be grateful for: Professor Juan José Moreno Navarro. For years he gave me support and confidence that were sometimes undeserved. I hope he sees in this thesis a small but relevant part of his vision of informatics and of all I have learnt beside him.

I want to express my gratitude to my PhD supervisor and real friend, Dr. Julio Mariño. He brought order to my mind and, therefore, to this thesis. I learnt late that perfection is a continuum, and Julio helped me advance past the threshold beyond which I started to feel proud of my own work.

It is a pleasure to thank the members of Babel, my research group, for their helpful and constructive comments, in particular the fellow group members who reviewed this thesis: Pablo Nogueira, Jamie Murdoch Gabbay, Lars-Åke Fredlund, Clara Benac, and Emilio Gallego. I would like to thank my colleagues Manuel Carro and Germán Puebla for their valuable comments after the pre-defence.

To the creators of theories, logics, calculi, algorithms, programming languages, compilers, theorem provers, and tools that helped me during my research, I want to express my gratitude for their enormous and invaluable contribution.

I wish to thank Rafael Corchuelo for his support, which has always been present in the form of that image he sent me years ago.

I have always been one to save the best for last. To Silvia, I wish to thank her for her unconditional love. To her I dedicate all this work.

Preface

In the summer after my first undergraduate year, not quite knowing what I was looking for, I picked up a book from the School’s library: The Art of Prolog. I gradually stopped speaking the language of my fellow students. Every day I became more declarative (logic languages, functional languages and, finally, formal specification languages and formal methods) while every day they became more imperative. How was it possible that such intelligent people didn’t realise that we must raise the level of abstraction of the languages we use for describing software systems?

Today, it’s evident to me that the scarce adoption of formal techniques by industry is due, in essence, to rather complex and subtle social and economic issues which are difficult to influence from academia.

However, academia can exert an indirect, medium- to long-term influence. One way is by introducing formal techniques in Computer Science curricula; a good example is the work of Meyer [107]. Another way is by finding out why software engineering does not rely on formal mathematics and tools in the same way other engineering disciplines do.

In my view, the lack of effective tools (actually useful for developers) is the main problem we must address in order to break the following vicious circle: well-trained professionals are needed to create success stories, success stories are needed to promote the creation of tools, and without tools it is hard to train professionals. Furthermore, one of the missions of the scientific community should be to help in the technology transfer of conceptual tools of sufficient maturity.

The motivation of this thesis, and of most of my work as a researcher, is to contribute my part in bringing the declarative world and the software development industry closer together.


In particular, the contributions of this thesis are: Clay (an object-oriented for-

mal notation), its associated theory and tools, and a demonstration that different

formal techniques can be integrated in current software production processes.

History

In 1996, Professor Juan José Moreno-Navarro began an ambitious plan for the

programming content of the computer science degree of Universidad Politécnica

de Madrid. The strategy went beyond a classical functional-first or logic-first ap-

proach in programming courses: students would learn to specify problems for-

mally before deriving the algorithms that solved them.

Three years later I abandoned my thesis topic (abstract machines for functional-logic languages) and joined Professor Juan José Moreno-Navarro in the design of a new formal specification language, based on first-order logic and with syntactic patterns that helped in the description of problems and solutions. The name of the language is SLAM-SL [63], an acronym for Specification Language for Abstract Machines. The name was chosen before Thomas Ball’s SLAM project at Microsoft, with which it has nothing in common save for both being formal methods languages.

SLAM-SL incorporated a panoply of syntactic constructs to support various paradigms (such as functional and object-oriented) and patterns (quantifiers, iterators, and visitors). By 2003, we had incorporated so many new features into SLAM-SL and explored so many possible options that the language had become almost impossible to formalise, in particular in terms of first-order logic.

My work had also acquired a taste of classic “software engineering” in the

sense of having to put aside theoretical aspects I didn’t want to give up. In this

thesis I have tried to factor out and to extract the rigorous mathematical essence

behind SLAM-SL. The result is Clay.

Julio Mariño Carballo took charge as thesis supervisor after Juan José Moreno-Navarro’s appointment to the Ministry of Science and Innovation. It is at this point that Clay is born, after a conceptual refactoring of SLAM-SL. The latter’s distinguishing features are carried over to Clay, but not those syntactic constructions that were created with the sole goal of generating code for object-oriented and imperative languages such as Java.

My main obsessions have been to keep first-order logic as the logical framework, to provide a theory for equality in the language capable of managing paramount object-oriented concepts such as inheritance and overriding, and to embody the theory in automatic tools, with the latter playing a major role in finding minor and major errors in the theory itself.

Contributions

Design of Clay. Clay is a stateless object-oriented formal notation with the following features: it is class-based, has a nominal type system that integrates algebraic types and inheritance, and has equality, method overriding with Scandinavian semantics, dynamic binding, and rather permissive overloading.

A nominal type system. A class-name-based type system for Clay, which is trickier to define than a structural one. The type system is used to reject illegal specifications, and also to help guide the translation schemes that define the Clay semantics and the generation of executable prototypes.

A first-order formal semantics for Clay. A first-order semantics for Clay that gives an interpretation in first-order logic of the main object-oriented constructions: inheritance, defining classes by cases, overloading, dynamic binding and static equality. Furthermore, the use of the concrete syntax of an automatic theorem prover (Prover9/Mace4) has allowed the mechanisation of both Clay’s meta-theory and Clay specifications. For example, some of the theorems about Clay in this thesis have been proved semi-automatically.

Executable prototype generation. A compilation scheme of Clay specifications into Prolog programs. Code can be generated from implicit specifications, even recursive ones, something hard to find in other tools. My implementation takes advantage of various logic programming techniques in order to achieve reasonable efficiency: constraints, constructive negation, Lloyd-Topor transforms, incremental deepening search, etc.

The Clay compiler. A tool that goes beyond the mathematical presentation of the translations into first-order logic and the synthesis of logic programs. I have built a compiler that supports syntax analysis of modular Clay specifications, type checking, translation of Clay specifications into first-order theories in Prover9/Mace4, and synthesis of executable Prolog prototypes.

FM in common practices in software development. Digressions that illustrate different ways of applying my previous work on SLAM-SL to common practice in software development nowadays: integration of formal methods in agile processes, automatic synthesis of Java artifacts and formalisation of design patterns.
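To make the "incremental deepening search" named under executable prototype generation more concrete, here is a minimal sketch in Python. It is not the Prolog code that the Clay compiler generates, and it reads the term as what is commonly called iterative deepening, which is an assumption. The search space (tuples of bits) and the names `bit_successors`, `depth_limited` and `iterative_deepening` are purely illustrative. The sketch shows why bounding the depth and raising the bound step by step helps: a solution is found even though a plain depth-first search would descend forever along the all-zeros branch.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # hypothetical search state: a finite sequence of bits


def bit_successors(state: State) -> List[State]:
    """Each state has two children: the same bits extended with 0 or with 1."""
    return [state + (0,), state + (1,)]


def depth_limited(state: State,
                  is_goal: Callable[[State], bool],
                  successors: Callable[[State], List[State]],
                  limit: int) -> Optional[List[State]]:
    """Depth-first search that refuses to go more than `limit` steps deeper."""
    if is_goal(state):
        return [state]
    if limit == 0:
        return None  # bound reached: give up on this branch
    for child in successors(state):
        path = depth_limited(child, is_goal, successors, limit - 1)
        if path is not None:
            return [state] + path
    return None


def iterative_deepening(start: State,
                        is_goal: Callable[[State], bool],
                        successors: Callable[[State], List[State]],
                        max_depth: int) -> Optional[List[State]]:
    """Retry the depth-limited search with bounds 0, 1, ..., max_depth."""
    for limit in range(max_depth + 1):
        path = depth_limited(start, is_goal, successors, limit)
        if path is not None:
            return path
    return None


# The goal plays the role of an implicit specification: it says what a
# solution looks like, not how to construct it.
path = iterative_deepening((), lambda s: s == (1, 0, 1), bit_successors, 5)
# path == [(), (1,), (1, 0), (1, 0, 1)]
```

The price of this completeness is that shallow levels are explored again at every new bound, which is usually acceptable because the deepest level dominates the cost.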

I want to stress that all the theoretical work in this thesis is mechanised, i.e. it has been described using languages and formal tools that the specifier can interact with: first-order theories in Prover9/Mace4, logic programs in Prolog and translation functions in Haskell. In my opinion, this work can be considered a methodological contribution towards the ideals behind proposals like the POPLmark challenge and the QED manifesto.

Influences

The first step to design a logical framework that captures object-oriented notions

with no alien behaviour for the experts is to speak the practitioners’ language. I

can name Budd [21], Fowler [48] or the gang of four [52] as the main sources of

my learning, as well as designers and programmers that are legion in weblogs and

newsgroups on programming and design techniques.

Regarding more formal aspects, I have analysed the logical frameworks of object-oriented formal notations. The object-oriented versions of classical specification languages, such as VDM++ [149] and Object-Z [116], are typically built on top of semantics largely unrelated to the object-oriented paradigm. Another strand comprises those object-oriented formal languages created on top of typed logics with subtyping, such as CASL [100], COLD [74, 79], OBJ [53], and Maude [33]. These frameworks are extraordinarily elegant, but their associated mathematical concepts do not directly correspond to those that the object-oriented paradigm dictates. The last source of influence I will name is the work on object-oriented calculi of Abadi and Cardelli [1] and Castagna [25]. These works have had enormous theoretical influence, but they propose less expressive frameworks than those mentioned earlier in this paragraph, and they treat only marginally the crucial role of names in types because their type systems are structural.
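The difference between nominal and structural typing that this paragraph relies on can be illustrated with a small sketch in Python. It is only an illustration under stated assumptions, not Clay's type system: the classes `Point`, `ColourPoint` and `Vector`, the `ClassDecl` record and both checking functions are hypothetical, and structure is reduced to a set of method names.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ClassDecl:
    """A hypothetical class declaration: a name, declared parents, method names."""
    name: str
    parents: Tuple[str, ...] = ()
    methods: FrozenSet[str] = frozenset()


Env = Dict[str, ClassDecl]


def nominal_subtype(env: Env, sub: str, sup: str) -> bool:
    """Nominal view: `sub` is below `sup` only via declared inheritance links."""
    if sub == sup:
        return True
    return any(nominal_subtype(env, parent, sup) for parent in env[sub].parents)


def structural_subtype(env: Env, sub: str, sup: str) -> bool:
    """Structural view: `sub` is below `sup` if it offers every method of `sup`."""
    return env[sup].methods <= env[sub].methods


ENV: Env = {
    "Point": ClassDecl("Point", (), frozenset({"x", "y"})),
    "ColourPoint": ClassDecl("ColourPoint", ("Point",),
                             frozenset({"x", "y", "colour"})),
    # Vector has the same shape as Point but never declares a relationship.
    "Vector": ClassDecl("Vector", (), frozenset({"x", "y"})),
}

# Declared inheritance is accepted by both views.
declared = (nominal_subtype(ENV, "ColourPoint", "Point"),
            structural_subtype(ENV, "ColourPoint", "Point"))
# An accidental match in shape is accepted only by the structural view.
accidental = (nominal_subtype(ENV, "Vector", "Point"),
              structural_subtype(ENV, "Vector", "Point"))
```

In this toy setting a nominal checker keeps `Vector` and `Point` apart because their names were never related by a declaration, which is the kind of information a purely structural system does not record.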

Finally, we have to mention Daniel Jackson and his language Alloy [71]. Alloy is, arguably, the main representative of lightweight formal methods. It is based on relational theory and has a syntax and semantics similar to, though not exactly those of, object-oriented modelling languages like UML. I compare my approach of synthesising prototypes with the model checking approach in Alloy.

How to Read this Thesis

Readers interested in the formal aspects involved in the syntactic and semantic description of the Clay language and executable prototype generation can start with the introduction in Chapter 1 and the informal account of the language in Chapter 2. Then they could move to the formalisation of the semantics of Clay: the static one in Chapter 3 and the dynamic one in first-order logic in Chapter 4. The essentials of logic program synthesis in Chapter 5 should be read next. Chapter 6 describes the implementation of my prototype.

Readers interested in our contributions to software engineering can, after reading the introduction in Chapter 1, jump directly to Chapters 7 and 8, where the integrability of formal techniques into common practice is studied. Then they could go to Chapter 9, in which several design patterns are formalised using our own language.

I have tried to adhere to several conventions throughout this thesis in order to make the reader’s life easier. Let us summarise them:

• I have dealt with several formal languages. For each kind of language I use a different font face convention:

– For specification languages such as Alloy, Clay or SLAM-SL, I use a sans serif face.

– For first-order logic languages and languages of automatic theorem provers I use a bold face.

– For programming languages such as Prolog, Haskell and Java, I use a typewriter face.

• In order to make the thesis as self-contained as possible, Appendix C introduces the mathematical preliminaries and the definitions and conventions used.

• Meta-symbols and variables in the mathematical definitions of domains and translation functions usually encode the domain they belong to. For example, for a domain named class environment I have tried to consistently use ce.

Contents

Preface i

Contents vii

I Introduction 1

1 Introduction 3

1.1 Software Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Formal Methods in Software Engineering . . . . . . . . . . . . . . . . 5

1.3 A Taxonomy of Formal Methods . . . . . . . . . . . . . . . . . . . . . 7

1.4 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.5 Lightweight Formal Methods . . . . . . . . . . . . . . . . . . . . . . . 10

1.6 Executable Prototypes and Logic Programming . . . . . . . . . . . . 11

1.7 Our Proposal: Clay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.8 Contributions of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . 13

1.8.1 Design of an Object-Oriented Formal Notation . . . . . . . . 14

1.8.2 A Nominal Type System for Object-Orientation . . . . . . . . 15

1.8.3 A First-Order Formal Semantics for Clay . . . . . . . . . . . . 15

1.8.4 Executable Prototype Generation . . . . . . . . . . . . . . . . 16

1.8.5 The Clay Compiler . . . . . . . . . . . . . . . . . . . . . . . . . 16

1.8.6 FM in Common Practise in Software Development . . . . . . 17

vii

1.9 Thesis Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2 A Taste of Clay 21

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.2 Classes and Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.4 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.4.1 No Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.5 Case Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

2.6 Generics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.7 Pre- and Post-conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.8 Assertions and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.9 Clay Idiosyncrasy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

2.9.1 The Meaning of a Specification . . . . . . . . . . . . . . . . . . 33

2.9.2 Booleans and Formulae . . . . . . . . . . . . . . . . . . . . . . 38

2.9.3 Multiple Inheritance . . . . . . . . . . . . . . . . . . . . . . . . 38

2.9.4 Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.9.5 Invariants and Consistency . . . . . . . . . . . . . . . . . . . . 41

2.9.6 Other Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

II Semantics 43

3 Static Semantics 45

3.1 Abstract Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.1.1 Clay Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 46

3.1.2 State Environments . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.1.3 Method Environments . . . . . . . . . . . . . . . . . . . . . . . 49

3.1.4 Formulae and Expressions . . . . . . . . . . . . . . . . . . . . 50

3.2 The Type System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

viii

3.2.1 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

3.2.2 Typing Environments . . . . . . . . . . . . . . . . . . . . . . . 53

3.2.3 Typing Judgements . . . . . . . . . . . . . . . . . . . . . . . . . 54

3.2.4 Typing Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.3 Typing Clay Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.3.1 Synthesis of Typing Environments . . . . . . . . . . . . . . . . 59

3.3.2 Decidability of the Type System . . . . . . . . . . . . . . . . . 61

3.3.3 A Typing Example . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4 A Dynamic Semantics Based on First-Order Logic 65

4.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.2 The Logic of Clay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.2.1 Sorts and Subsorts . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.2.2 Function Symbols . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.2.3 Predicate Symbols . . . . . . . . . . . . . . . . . . . . . . . . . 73

4.3 The Clay Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

4.3.1 Instanceof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

4.3.2 Subtyping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4.3.3 Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4.3.4 Pre- and Post-conditions . . . . . . . . . . . . . . . . . . . . . 78

4.4 TRANSClay,FOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4.4.1 Abstract Syntax for OOFOL . . . . . . . . . . . . . . . . . . . . 79

4.4.2 Translation of Spec . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.4.3 Translation of Class Specifications (CS) . . . . . . . . . . . . . 81

4.4.4 Translation of Algebraic Types (SE) . . . . . . . . . . . . . . . 82

4.4.5 Translation of Methods (ME) . . . . . . . . . . . . . . . . . . . 83

4.4.6 Translation of Formulae and Expressions . . . . . . . . . . . . 83

4.5 Mechanised Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

ix

4.5.1 Subject Reduction . . . . . . . . . . . . . . . . . . . . . . . . . 85

4.5.2 Consistency of the Clay Theory . . . . . . . . . . . . . . . . . . 86

4.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

III The Clay System 89

5 Synthesis of Logic Programs 91

5.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5.2 Interacting with Clay . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.3 Translating Clay Specifications into Logic Programs . . . . . . . . . . 94

5.3.1 Representing Clay Instances in Prolog . . . . . . . . . . . . . . 95

5.3.2 Instance of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

5.3.3 Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

5.3.4 Predefined Integers . . . . . . . . . . . . . . . . . . . . . . . . . 98

5.4 Formalised Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

5.4.1 Abstract Syntax of Clay . . . . . . . . . . . . . . . . . . . . . . . 100

5.4.2 Abstract Syntax of Logic Programs . . . . . . . . . . . . . . . . 100

5.4.3 Synthesis of Logic Programs . . . . . . . . . . . . . . . . . . . . 101

5.4.4 Synthesis of Extended Programs . . . . . . . . . . . . . . . . . 103

5.4.5 Lloyd-Topor Transformation . . . . . . . . . . . . . . . . . . . 107

5.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

5.6 Related Work and Conclusions . . . . . . . . . . . . . . . . . . . . . . 109

6 The Clay Compiler 111

6.1 Architecture of the Compiler . . . . . . . . . . . . . . . . . . . . . . . 111

6.2 More than Parsing (MTP) . . . . . . . . . . . . . . . . . . . . . . . . . 114

6.2.1 Class Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 115

6.2.2 Object Expressions . . . . . . . . . . . . . . . . . . . . . . . . . 116

6.3 Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

x

6.3.1 The Environment Definition . . . . . . . . . . . . . . . . . . . 117

6.3.2 The Environment Construction . . . . . . . . . . . . . . . . . 118

6.4 Type Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

6.5 Translators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

6.5.1 Translation into Prover9/Mace4 . . . . . . . . . . . . . . . . . 121

6.5.2 Synthesis of Prolog Programs . . . . . . . . . . . . . . . . . . . 124

IV Applications 127

7 Formal Agility 129

7.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

7.2 Formal Methods and SLAM . . . . . . . . . . . . . . . . . . . . . . . . 131

7.2.1 Data Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

7.2.2 Method Specification . . . . . . . . . . . . . . . . . . . . . . . 132

7.2.3 Support for Testing and Debugging . . . . . . . . . . . . . . . 134

7.3 XP Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

7.3.1 Unit Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

7.3.2 Incremental Development . . . . . . . . . . . . . . . . . . . . 136

7.3.3 Refactoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

7.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

8 Specifying in the Large 141

8.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

8.2 The SLAM Specification Language . . . . . . . . . . . . . . . . . . . . 144

8.2.1 Classes and Class Relationships . . . . . . . . . . . . . . . . . 145

8.2.2 Method Specifications . . . . . . . . . . . . . . . . . . . . . . . 146

8.2.3 SLAM-SL Predefined Classes . . . . . . . . . . . . . . . . . . . 149

8.2.4 SLAM-SL Formulas and Quantifiers . . . . . . . . . . . . . . . 150

8.3 Algebraic Types and Pattern Matching . . . . . . . . . . . . . . . . . . 152

xi

8.3.1 Compiling Algebraic Types . . . . . . . . . . . . . . . . . . . . 153

8.3.2 Compiling Pattern Matching . . . . . . . . . . . . . . . . . . . 156

8.4 Compiling SLAM-SL Solutions into Efficient Code . . . . . . . . . . . 158

8.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

9 Modelling Design Patterns 163

9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

9.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

9.2.1 Modelling Object-Oriented Specifications . . . . . . . . . . . 166

9.2.2 Other Formalizations of Design Patterns . . . . . . . . . . . . 180

9.3 Design Patterns as Class Operations . . . . . . . . . . . . . . . . . . . 181

9.3.1 Composite Pattern . . . . . . . . . . . . . . . . . . . . . . . . . 183

9.3.2 Decorator Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . 185

9.3.3 Different Modelling Possibilities . . . . . . . . . . . . . . . . . 186

9.3.4 Design Patterns Composition . . . . . . . . . . . . . . . . . . . 187

9.3.5 Application: Reasoning with Design Patterns . . . . . . . . . 187

9.3.6 Application: Integration in a Development Environment . . 191

9.4 Future Trends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

9.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

9.6 Appendix: Formalisation of DP in SLAM-SL . . . . . . . . . . . . . . 194

9.6.1 Abstract Factory (Figure 9.5) . . . . . . . . . . . . . . . . . . . 195

9.6.2 Bridge (Figure 9.6) . . . . . . . . . . . . . . . . . . . . . . . . . 196

9.6.3 Strategy (Figure 9.7) . . . . . . . . . . . . . . . . . . . . . . . . 196

9.6.4 Adapter (Figure 9.8) . . . . . . . . . . . . . . . . . . . . . . . . 197

9.6.5 Observer (Figure 9.9) . . . . . . . . . . . . . . . . . . . . . . . . 198

9.6.6 Template Method (Figure 9.10) . . . . . . . . . . . . . . . . . . 200

9.6.7 Decorator (Figure 9.11) . . . . . . . . . . . . . . . . . . . . . . 200

9.6.8 State (Figure 9.12) . . . . . . . . . . . . . . . . . . . . . . . . . . 201

xii

9.6.9 Builder (Figure 9.13) . . . . . . . . . . . . . . . . . . . . . . . . 203

V Conclusion 205

10 Conclusions and Future Work 207

10.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

10.1.1 Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

10.1.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

10.1.3 Clay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

10.1.4 Software Development . . . . . . . . . . . . . . . . . . . . . . . 210

10.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

10.2.1 Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

10.2.2 Improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

10.2.3 New Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

VI Appendices 215

A Clay Notation Reference 217

A.1 Lexical Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

A.1.1 Identifiers and Variables . . . . . . . . . . . . . . . . . . . . . . 217

A.1.2 Literals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

A.1.3 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

A.1.4 Keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

A.2 Grammar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

A.2.1 Compilation Unit . . . . . . . . . . . . . . . . . . . . . . . . . . 220

A.2.2 Module Declaration and Module Identifier . . . . . . . . . . . 220

A.2.3 Import Declaration and Class Identifier . . . . . . . . . . . . . 220

A.2.4 Class Specification . . . . . . . . . . . . . . . . . . . . . . . . . 221

A.2.5 Class Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . 221

xiii

A.2.6 Invariant

A.2.7 State Declaration (Case Classes)

A.2.8 Method Specification and Message Identifiers

A.3 Formulae and Expressions

A.3.1 Class Expression

A.3.2 Object Expression

A.3.3 Syntactic Sugar

A.3.4 Formulae

A.4 Precedence and Associativity

B Clay Theory in Logic Programming

C Mathematical Conventions

C.1 Sets

C.2 Relations

C.3 Functions

C.4 Composition

C.5 Projections

C.6 Natural Numbers

C.7 Sequences and Strings

C.8 Ellipsis

Bibliography

Part I

Introduction

1 Introduction

Abstract

This chapter motivates this thesis, describing the state of the art of formal

methods for software development, presents the main points of the Clay

proposal, and enumerates the contributions of the thesis.

1.1 Software Engineering

• Software engineering is the establishment and use of sound

methods for the efficient construction of efficient, correct, timely

and pleasing software that solves the problems such as users

identify them.

• Software engineering extends the field of computer science to in-

clude also the concerns of building of software systems that are

so large or so complex that they necessarily are built by a team or

teams of engineers.

Dines Bjørner [14].

Software engineers cannot construct their large programs in one shot. First

they need to find the right models, abstractions that capture key properties of

parts of the program under construction, abstractions that allow them to foresee

the future behaviour of their creation before actually building it. A civil engineer,

for instance, would represent a steel bridge by a system of equations and the

behaviour of a single truss by a differential equation. An electrical engineer can

use graph theory to represent a network, and so on. As in other engineering

disciplines, the application of mathematics to software engineering is crucial if

we want to use abstractions and to reason about them.

Software engineering, then, should be the discipline of building correct, re-

liable programs through the use of mathematics (sound methods in Bjørner’s

words), hopefully thanks to suitable models of the software system. However,

the term is more often associated with other aspects of engineering (project

management) than with mathematical modelling.

Possibly, the main cause for this lies in the inherent plasticity of software: the

possibility to introduce changes at any stage of development – even on a running

system – is definitely out of the question in other fields of engineering,

where a sharp separation between the design and production stages

is essential. In software development, however, design and coding can occur si-

multaneously. Iterative development methods where prototypes are commonly

used as a substitute for any kind of formal abstraction, and careful testing are

generally the only guarantees for quality in software. In short, using small-scale

prototypes of a complex software system can be very helpful but it sounds like

reducing architecture to just maquette building.

There are a few notable exceptions to the general picture. Some software sys-

tems are routinely built from formal models in such a way that all of the reasoning

is done at the model level and parts of the actual code are automatically gener-

ated. Some stages of a compiler, for instance, can be almost completely devel-

oped from formal grammars using so-called compiler generators and, although

software engineers might be able to develop a full compiler from scratch for any

programming language with no prior training on compiler technology, the effort

involved would be impractical. Moreover, as part of the compiler is generated

from its specification, changes can be introduced in an agile and less costly way,

as no errors can be introduced in the code due to changes in the grammars (at

least in principle, since generative approaches have some integration and

maintenance problems).

Another relevant exception is SQL, a widespread language for managing data

in relational database management systems. Thanks to its precise relational al-

gebra semantics, an SQL query can be transformed to an equivalent but more

efficient query.
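
As a rough illustration of why a precise algebraic semantics enables such rewriting, the following Python sketch models two relations as lists of dictionaries and checks that filtering after a join yields the same rows as filtering one operand before the join. The relation contents and the helper names select and join are hypothetical; they are not taken from any SQL engine.

```python
def select(rows, pred):
    """Relational selection: keep the rows that satisfy pred."""
    return [r for r in rows if pred(r)]


def join(left, right, key):
    """Natural join on a single shared attribute (nested-loop version)."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


# Hypothetical sample data, used only for illustration.
employees = [
    {"name": "Ada", "dept": 1},
    {"name": "Alan", "dept": 2},
    {"name": "Grace", "dept": 1},
]
departments = [
    {"dept": 1, "city": "Madrid"},
    {"dept": 2, "city": "Paris"},
]


def in_madrid(row):
    return row["city"] == "Madrid"


# Query as written: join everything, then filter.
naive = select(join(employees, departments, "dept"), in_madrid)

# Equivalent plan: filter the small relation first, then join.
pushed = join(employees, select(departments, in_madrid), "dept")
```

Both plans return the same rows, but the second one joins against a smaller relation, which is the kind of rewriting the relational algebra justifies.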

Relational database theory has had a deep influence on software design

methodologies over the last three decades. In fact, some popular object-oriented

design notations such as UML can be seen as a failed attempt at translating some

of the virtues of the relational database design to the problem of general software

design, perhaps because of their lack of a mathematical definition.

Despite the distinctive characteristics of software production, model-driven

engineering is possible when the right languages and tools are found and this

approach to the construction of software improves over craftsmanship, even economically.

Certainly, it may be the case that software engineering is just too young

to have found the right abstractions for its growth as a real engineering discipline, a

situation similar, perhaps, to that of civil engineering before and after the

development of differential calculus three centuries ago.

The starting point of this dissertation is that software engineering is, or can

be, an engineering discipline in the classical sense, as explained above, and that it is our

duty to look for the languages to represent those abstractions and to progress to-

wards the development of formal and software tools that make such an approach

practical.

Whilst these positions may not be shared by most of the software community,

it is our hope that the techniques developed in this dissertation will be a small

but significant step towards the practical use of formal models for the creation of

reliable software.

1.2 Formal Methods in Software Engineering

Awareness and adoption of the positions advocated in the preceding section by

industry are scarce and slow. The use of formal methods for software development

is still infrequent. Three arguments are often used to explain this situation:

1. insufficiently trained practitioners [19],

2. lack of suitable tools supporting formal methods [4], and

3. lack of integration of formal methods with existing practice and

methodology [4].

There is an obvious chicken-and-egg problem with the first two issues: well-trained

professionals are needed to create success stories, success stories are needed to

boost the creation of tools and without tools it is hard to train practitioners. The

first of these causes was already mentioned among the well-known seven myths

of formal methods in Anthony Hall’s seminal paper [56]. Hall’s arguments are

still valid: “the maths needed for writing specifications are quite simple”, or at

least no more complex than in other engineering disciplines. In fact, most formal

specification languages do not go beyond set theory and first-order logic.

There is, however, some controversy about this issue. Ian Sommerville [118,

Chapter 1] considers that the theoretical foundations of formal methods are

insufficient to support software engineering as mathematics supports classical

engineering. The aforementioned inherent complexity of software and doubts

about the scalability of formal methods are blamed for this. As Jean-Raymond Abrial

says [3], these doubts about scalability act as a bottleneck for the use of

automatic verification techniques like theorem proving.

However, despite the theoretical limits imposed by computability and com-

putational complexity, recent years have shown enough positive evidence of the

applicability of formal methods in several ways: creation of companies mainly

devoted to the application of formal methods to software [134, 141, 131, 130, 132],

tools [139, 146, 129, 130, 150] and success stories (e.g. the METEOR project in 1999

[12], the automation of Paris Metro line 14, is one of the most popular stories;

other references are [15, 126, 84, 117, 57, 29, 20, 18]) that somehow refute

Sommerville’s claims. Furthermore, the companies, tools and methods mentioned

have a definite application niche: hardware, embedded and critical systems.

Thus, it is hard to criticise the tools on the technical and maturity sides when

they have succeeded in such demanding environments.

In our opinion, the problem is subtler and, as David Crocker says [37], it

stems from a mistaken perception that using formal methods lowers productivity:

1. Productivity increase is often a more urgent need when developing software.

The perception that formal methods decrease productivity contributes

to restricting their usage to areas where the risks and costs associated

with malfunction are the determining factor.

2. Many formal methods are just applied at the architectural design level

while all or most of the code is crafted by hand. This contributes to the

perception of formal methods as mostly unproductive.

3. Often, applying formal methods increases the time required to deliver a

first running prototype. To managers, this can give the impression of

slower progress, even if the overall time to market, testing included, would

decrease.

We believe, like Daniel Jackson [71], that in order to be adopted, formal tools

must generate products perceived as useful in all stages of general-purpose

software development: coding, testing, documentation, requirement

acquisition, etc.

This is also one of the points stressed by the report The State-of-the-Art in

Formal Methods [9], whose conclusions were that “the industry needs support to

build error-free products, on time [. . . ]”. In order to achieve that “it is necessary

to work on the integration of formal tools with the rest of development tools.”

1.3 A Taxonomy of Formal Methods

In the following discussion, we will restrict ourselves to the area of formal meth-

ods and tools for the development of general-purpose software. We are leaving

aside many outstanding tools that have proved their value in very specific

application niches, such as hardware, firmware and communication protocols.

Moreover, we are not commenting on the state of the art of theorem proving

technology or model checking. While both technologies are crucial for the

implementation of efficient tools, they are not themselves the object of our research.

Classical Formal Methods

Their characteristic feature is the application of a rigorous, scarcely automated

process based on verified design, a systematic development method that uses the

concept of proof as a way of checking design steps that start with a formal

specification of the system and end with a correct program. Some methods that

have managed to survive to this day are VDM [72], Z [119, 38] and the

B-Method [2].

Lightweight Methods

The term is applied to formal methods that sacrifice verified design in favour of a

greater level of automation. We can mention the industry-level proposals SPARK

Ada [146] from Praxis [141] and Perfect Developer [139] from Escher

Technologies [134]. A more recent academic proposal is Daniel Jackson’s Alloy [71].

Some of our previous work could also be considered examples of the lightweight

approach, in particular [63, 65, 24].

Program Specification

Other proposals to introduce rigour in software development come directly from

the programming languages community, along the lines of Meyer’s work [97]. A

common feature of these proposals is that they are centred on a particular pro-

gramming language. This language is extended to support specifications of pro-

gram properties, more expressive type systems and non-automatic verification.

Among those worth mentioning we have, of course, Eiffel [133, 98], and more

recently JML [82, 83], Spec# [147, 10], Scala [143, 103] and Nice [138, 16].

There is a certain confluence of lightweight methods and program specification

languages. In fact, we could have included SPARK Ada among the latter

as well.

Object-Oriented Languages and Formal Notations

Object-oriented programming is a successful and commercially relevant

programming style. We cannot complete this overview without mentioning lan-

guages and formal notations based on, or influenced by, the object-oriented

paradigm. To begin with, we have those which are extensions of notations used

in the most established classic formal methods, like Object-Z [40] and VDM++ [77].

There are academic proposals inspired by the object-oriented paradigm that

are aimed at specifying general purpose software. We can mention OBLOG/Troll

[114, 35], OBJ/Maude [32], Larch [55, 137], CASL [7] and CCSL [112].

Finally, regarding the widely-used notations for (informal) modelling of

object-oriented software, such as OMT (Object Modelling Technique) [113] and

UML (Unified Modelling Language) [17], several specification languages have

been presented that aim at providing a fully formal meaning to parts of those

models [36, 125, 51, 124, 34].

1.4 Open Problems

All the preceding approaches suffer from various problems, which we summarise

in this section.

Classical approaches assume a certain “all or nothing” philosophy by requir-

ing a costly verified software development process. Their object-oriented incar-

nations are typically built on top of semantics largely unrelated to the object-

oriented paradigm. Moreover, the lack of integrated tools and, to a lesser extent,

a notation alien to developers are still deterrents to their use in real, large

projects.

Syntax and semantics of lightweight methods are usually elegant, relatively

simple and well established. Nevertheless, the tools, model checkers in general,

support neither infinite models nor loose specifications, and therefore

abstraction suffers. Another drawback is that they seldom reflect the paradigms

developers are familiar with. In Alloy, for instance, many object-oriented concepts

have semantics that are similar to the real thing, but not quite the same,

with unexpected outcomes (inheritance being the most remarkable example).

The success of program specification languages has to do with their nearness

to the developers’ programming languages. Their main problem is their low

abstraction level. This comes as no surprise, since they arise as extensions of

programming languages, not modelling ones. Besides, their verification tech-

niques are often based on model checking and model checkers give no support

for loose specifications, another deterrent for abstraction.

Modelling notations such as UML/OCL have been designed with developers

in mind. However, part of the reason for their successful adoption by developers

is the laxity of their syntax and semantics. Of course, such a permissive

approach is incompatible with any rigorous development process: no code gen-

eration, no early validation of requirements, etc. Although some researchers are

establishing rigorous semantic frameworks for UML/OCL [111, 43], advanced

tool support is still scarce.

In spite of these problems, these approaches represent small steps towards

the adoption of formal methods in the software development industry. In our

view, a combination of lightweight methods and object-oriented notations

deserves special attention because

• syntax and semantics of lightweight formal methods are elegant and rela-

tively simple, and because

• the syntax of object-oriented notations is familiar to ordinary developers.

The aim of our proposal is to solve problems of both approaches, and, in partic-

ular, to offer products perceived as useful, such as executable prototypes. Before

presenting our proposal in Section 1.7 we will study the lightweight formal meth-

ods in a more detailed way (Section 1.5) and an alternative to model checking for

early requirements validation: generation of executable prototypes (Section 1.6).

1.5 Lightweight Formal Methods

Lightweight formal methods have become relatively popular thanks to their suc-

cess in early requirements validation, a smooth learning curve, and the availabil-

ity of usable tools. This simplicity is obtained by replacing the formal proof –

which often demands human intervention – by model checking, giving up strict

correctness in favour of less stringent criteria for models.

Consider, for example, the stepwise specification of queues in Alloy [71]. The

specifier might start by just sketching the interface

module myQueue

sig Queue { root: Node }
sig Node { next: Node }

that is, stating that queues must have a root node and nodes will have a next

node to follow. The description can be “validated” by fixing a number of Queue

and Node individuals and letting a tool like the Alloy Analyzer [128] model check

the specification and show graphically the different instances found. Of course,

some of these instances will be inconsistent with the intuition in the specifier’s

mind – e.g. unreachable nodes or cyclic queues, which can be discovered with

very small models. Further constraints, like

fact allNodesBelongToOneQueue {
  all n: Node | one q: Queue | n in q.root.*next
}

fact nextNotCyclic {
  no n: Node | n in n.^next
}

can be added to the myQueue module in order to supply some of the missing

pieces in the original requirements. The first fact states that for every node there

must be exactly one queue such that the node lies somewhere in the reflexive-transitive

closure of the next relation starting with the root of that queue. The second one

states that no node can be in the transitive closure of the next relation starting

with itself. Model checking the refined specification will generate fewer instances,

thus allowing the specifier to explore more relevant examples, which will hopefully

reveal subtler corners in the requirements.
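
The effect of adding the two facts can be imitated outside Alloy by brute-force enumeration within a small scope. The Python sketch below is not how the Alloy Analyzer works internally; it simply lists every assignment of root and next for a given number of queues and nodes and counts those that satisfy the facts. It assumes, unlike the sig as written, that next is optional (None marks the last node), so that finite acyclic chains exist. All function names are illustrative.

```python
from itertools import product


def reachable(start, nxt):
    """Nodes in the reflexive-transitive closure of nxt, starting at start."""
    seen = set()
    cur = start
    while cur is not None and cur not in seen:
        seen.add(cur)
        cur = nxt[cur]
    return seen


def instances(num_queues, num_nodes):
    """Enumerate every (root, nxt) assignment within the given scope."""
    nodes = range(num_nodes)
    for root in product(nodes, repeat=num_queues):
        # Assumption: each node has either a successor or None.
        for nxt in product([None, *nodes], repeat=num_nodes):
            yield root, nxt


def all_nodes_belong_to_one_queue(root, nxt):
    # all n: Node | one q: Queue | n in q.root.*next
    return all(
        sum(n in reachable(r, nxt) for r in root) == 1
        for n in range(len(nxt))
    )


def next_not_cyclic(root, nxt):
    # no n: Node | n in n.^next
    return all(
        nxt[n] is None or n not in reachable(nxt[n], nxt)
        for n in range(len(nxt))
    )


def count_instances(num_queues, num_nodes, facts=()):
    """Count the instances in scope that satisfy every fact."""
    return sum(
        all(fact(root, nxt) for fact in facts)
        for root, nxt in instances(num_queues, num_nodes)
    )
```

With one queue and two nodes, the scope contains 18 candidate instances, of which 2 satisfy both facts: the two-node chains starting at either node.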

This approach is extremely attractive: requirements are refined stepwise,

guided by counterexamples found by means of model checking, and the whole

process is performed with the help of graphical tools.

However, there are also some limitations inherent to this approach. Leaving

aside the fact that strict correctness is abandoned in favour of a more relaxed

notion of being not yet falsified by a counterexample (which is unsuitable for

safety-critical domains), the use of model checking rather than proof-based techniques

also brings other negative consequences, such as limiting the choice of data types

in order to keep models finite, making it extremely difficult to model and reason

over recursive data types like naturals, lists, trees, etc. [71, Chapter 4, Section 8].

1.6 Executable Prototypes and Logic Programming

A natural alternative to model checking the initial requirements is to produce an

executable prototype from them. Using the right language it is possible to obtain

code and validation can be guided by testing, which might also be automated by

tools such as QuickCheck [31].

There are several possibilities for obtaining prototypes:

• One of them is to follow the correct by construction slogan and to produce

code from the specification, either by means of a transformational ap-

proach that often requires human intervention, or by casting the original

problem in some constructive type theory that will lead directly to an im-

plementation in a calculus by exploiting the Curry-Howard isomorphism

[105, 22, 13, 102].

• Another possibility is to use logic programming. In this case, executable

specifications are obtained free of charge, as resolution or narrowing

will deal with the existential variables involved in any implicit (i.e. non-

constructive) specification. Readers familiar with logic programming will

remember the typical examples – obtaining subtraction from addition for

free, sorting algorithms from a sortedness test, etc. – and those familiar with

logic program transformation techniques will also recognise that these can

be used to turn those naive implementations into decent prototypes.
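
The two typical examples can be imitated in Python, although only as a rough analogy: Python has no resolution or narrowing, so the sketch below replaces them with an explicit generate-and-test search. The function names and the search bound are assumptions made for illustration.

```python
from itertools import permutations


def add(x, y, z):
    """The addition relation: holds when x + y equals z."""
    return x + y == z


def subtract(z, x, bound=1000):
    """Find y with add(x, y, z), using only the addition test.

    The search is bounded; None means no natural y was found.
    """
    for y in range(bound + 1):
        if add(x, y, z):
            return y
    return None


def is_sorted(xs):
    """The sortedness test: every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def naive_sort(xs):
    """Return the first permutation of xs that passes the sortedness test."""
    for candidate in permutations(xs):
        if is_sorted(candidate):
            return list(candidate)
```

The specification (add, is_sorted) is written once and the "implementation" is obtained by search. The cost is factorial for naive_sort, which is why program transformation is needed to reach a decent prototype.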

When it comes to practical usage, the correct by construction approach cannot compete

with the lightweight methods above, due to the great distance separating it

from the notations used for modelling object-oriented software. Nevertheless,

the use of logic programming and its extensions with constraints and concur-

rency are generating great expectations [86, 88, 142].

1.7 Our Proposal: Clay

Clay is an evolution of an attempt we made to create an effective formal

tool: SLAM-SL. Since object-oriented programming is a highly successful

and commercially relevant programming style, we considered it reasonable to

design SLAM-SL to provide object-oriented developers with good object-oriented

features [61, 60, 63, 80], and to bridge the gap between formal methods and more

widespread software engineering processes [64, 65, 66, 69]. SLAM-SL included a

lot of syntactical constructions and patterns that helped our tools to provide the

user with very attractive products.

Nevertheless, we explored so many features in SLAM-SL and the notation was

so complex that it resisted a formal analysis, in particular if we wanted to keep the

semantics under the first-order logic umbrella.

In this thesis we tried to factor out and to extract the mathematical essentials

behind SLAM-SL. The result of this downsizing is Clay.

Clay is an object-oriented formal notation that can be both simple enough

and close to developers’ way of thinking, and formal enough to support the

synthesis of logic programs from its specifications. Clay has been designed from

three basic premises:

1. Clay had to be equipped with a core capable of expressing the main con-

structs of object-oriented languages.

2. Clay had to have an accessible semantics for software developers.

3. Clay had to admit one canonical translation into executable prototypes to

support early requirement validation.

Clay is stateless, class-based and has a nominal type system. Classes can define

algebraic data types by means of case definitions: disjoint and complete subclasses

of the superclass. Classes can be extended by means of subclassing (inheritance).

Methods are specified with pre- and post-conditions by first-order

formulae that relate the method subject, the parameters and the result of the

method. The most relevant atomic predefined predicates are class membership

and class-indexed equality.

The requirement of an accessible semantics leads us to keep

first-order logic as the logic of choice. An extra benefit is that we can take advantage of

the current state of FOL automatic theorem provers. These can be used to help

in debugging – logically speaking – Clay’s meta-theory, and to help in reasoning

about individual Clay specifications.

Regarding executable prototypes and their use as a tool for early requirement

validation, the goal is to take advantage of existing logic programming technology,

both on the compiler side and in program transformation techniques. The

approach is to refine the first-order encoding in order to generate Horn-clause

theories that can be executed – possibly with the help of some of the aforementioned

optimisations – in a Prolog system.

Figure 1.1 shows the process for using our proposal: the specifier writes specifications

in Clay, the Clay compiler transforms them into first-order theories and

Prolog programs, and the specifier can interact with Prover9/Mace4 and with Prolog

in order to validate the requirements.

1.8 Contributions of the Thesis

In this section we enumerate the main contributions of this thesis. Some results

of this work have already been published; in those cases we give the reference

to the paper.

Figure 1.1: Our Proposal: Clay. (Diagram labels: Developer, Cell.cly, clayc,

Cell.p9, Clay.p9, Prover9, Mace4, Cell.pl, Clay.pl, Prolog.)

1.8.1 Design of an Object-Oriented Formal Notation

One of the main motivations of this work was to study and integrate object ori-

entation concepts – both traditional and innovative ones – into a formal notation

that helped to implement rigorous software development methodologies. This is

why SLAM-SL was originally conceived. Clay is a lightweight evolution of SLAM-

SL which permits a better formal and mechanised treatment of its essential fea-

tures.

The key features of Clay are:

• A stateless object-oriented formal notation with a nominal type system.

• Case-based class definitions are allowed, thus supporting algebraic data

types.

• Safe inheritance: properties of a class cannot be invalidated by subclassing,

Scandinavian semantics for method overriding and dynamic binding, and

permissive overloading that adds behaviour instead of overriding it.

• Methods are specified with pre- and post-conditions by first-order formu-

lae that relate the method subject and the parameters and the result of the

method. The most relevant predefined predicates are class membership

and class-indexed equality.
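
To give a flavour of these features without introducing Clay syntax, the following Python sketch mimics them under loud assumptions: frozen dataclasses stand in for stateless objects, two subclasses stand in for the cases of a class, isinstance plays the role of class membership, and assert statements play the role of pre- and post-conditions. The class Nat and the function add are hypothetical examples, not part of Clay or of its libraries.

```python
from dataclasses import dataclass


class Nat:
    """Superclass whose two cases, Zero and Succ, are meant to be
    disjoint and complete (an algebraic data type)."""

    def to_int(self):
        # Helper used only to state the post-condition below.
        n, cur = 0, self
        while isinstance(cur, Succ):
            n, cur = n + 1, cur.pred
        return n


@dataclass(frozen=True)  # frozen: instances cannot be mutated
class Zero(Nat):
    pass


@dataclass(frozen=True)
class Succ(Nat):
    pred: Nat


def add(subject, other):
    # Pre-condition: class membership of subject and parameter.
    assert isinstance(subject, Nat) and isinstance(other, Nat)
    result, cur = other, subject
    while isinstance(cur, Succ):
        result, cur = Succ(result), cur.pred
    # Post-condition: relates subject, parameter and result.
    assert result.to_int() == subject.to_int() + other.to_int()
    return result


one = Succ(Zero())
two = Succ(one)
three = add(two, one)
```

Equality generated by the dataclass decorator compares the class as well as the fields, so Zero() and Succ(Zero()) are never equal. This is only a loose analogy with class-indexed equality.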

The reader can find the main ideas behind SLAM-SL and discussions of its main

features in these papers:

• A. Herranz and J. J. Moreno-Navarro. On the design of an object-oriented



• A. Herranz and J. J. Moreno-Navarro. Towards automating the iterative

rapid prototyping process with the slam system. In V Spanish Conference


• A. Herranz and J. J. Moreno-Navarro. On the role of functional-logic lan-




• A. Herranz and J. J. Moreno-Navarro. Generation of and debugging with

logical pre and post conditions. In M. Ducasse, editor, Automated and


1.8.2 A Nominal Type System for Object-Orientation

It is well known [1] that nominal subtyping can be trickier than structural sub-

typing. However, the latter has the disadvantage that two types with different

design purposes can be accidentally identified. This is precluded in our nominal

proposal.
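
The accidental identification mentioned above can be observed in Python, which offers both styles. In this sketch a runtime-checkable Protocol gives a structural check, while ordinary subclassing gives a nominal one. The classes Shape, Square and Farmland are invented for illustration and say nothing about how Clay implements its type system.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasArea(Protocol):
    """Structural type: anything with an area() method conforms."""

    def area(self) -> float: ...


class Shape:
    """Nominal type: only declared subclasses conform."""

    def area(self) -> float:
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


class Farmland:
    """Different design purpose, same method name; not a Shape."""

    def __init__(self, hectares: float) -> None:
        self.hectares = hectares

    def area(self) -> float:
        return self.hectares * 10_000.0


field = Farmland(1.0)
structurally_accepted = isinstance(field, HasArea)  # same structure
nominally_accepted = isinstance(field, Shape)       # never declared
```

The structural check accepts Farmland even though it was never meant to be a geometric shape; the nominal check rejects it because the relationship was never declared.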

The type system is used to reject illegal specifications, and also to help in the

generation of executable prototypes from them.

1.8.3 A First-Order Formal Semantics for Clay

Another key contribution is a first-order semantics for Clay specifications. Some

of its features are listed below:

• A logical interpretation of the main features of object-oriented languages:

inheritance, defining classes by cases, permissive overloading, dynamic

binding and static equality.

• The use of the automatic prover technology of Prover9/Mace4, which has

allowed mechanising both Clay’s meta-theory and specifications. For

example, some of the theorems about Clay in this thesis have been proved

automatically.

1.8.4 Executable Prototype Generation

We propose a scheme for the compilation of Clay specifications into logic pro-

grams.

Some of the key features of the prototype generator are:

• Code can be generated from implicit specifications, even recursive ones,

something hard to find in other tools.

• Our implementation takes advantage of various logic programming tech-

niques in order to achieve reasonable efficiency: constraints, constructive

negation, Lloyd-Topor transforms, incremental deepening search, etc.
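
One of the techniques listed, searching with an incrementally growing depth bound (commonly known as iterative deepening), can be sketched in a few lines. The Python code below is a generic illustration and not the thesis's Prolog implementation; the successor function and the depth cap are assumptions chosen so that a plain depth-first search would never terminate.

```python
def depth_limited(state, goal, successors, limit):
    """Depth-first search that never goes deeper than limit steps."""
    if state == goal:
        return [state]
    if limit == 0:
        return None
    for nxt in successors(state):
        path = depth_limited(nxt, goal, successors, limit - 1)
        if path is not None:
            return [state, *path]
    return None


def iterative_deepening(start, goal, successors, max_depth=50):
    """Retry the depth-limited search with bounds 0, 1, 2, ..."""
    for limit in range(max_depth + 1):
        path = depth_limited(start, goal, successors, limit)
        if path is not None:
            return path
    return None


def successors(n):
    # Hypothetical infinite search space: double or increment.
    # Unbounded depth-first search would follow 2n forever.
    return (2 * n, n + 1)


path = iterative_deepening(1, 10, successors)
```

Because each round is bounded, the search terminates whenever a solution exists within max_depth, and the first solution found is one of minimal depth.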

This contribution was initially published in the following paper and improved in

the thesis:

• Ángel Herranz and Julio Mariño. Executable specifications in an object ori-

ented formal notation. In 20th International Symposium on Logic-Based




1.8.5 The Clay Compiler

We have built an actual tool incorporating the proposal above. The Clay compiler

provides:

• syntax analysis of modular Clay specifications,

• type checking and annotation of Clay specifications,

• translation of Clay specifications into first-order theories in the concrete

syntax accepted by Prover9/Mace4, and

• synthesis of executable Prolog prototypes.

The Clay compiler has been implemented in Haskell using state-of-the-art

methods and techniques of compiler construction and functional programming.

1.8.6 FM in Common Practice in Software Development

Framed in the overall goal of promoting the use of formal techniques in soft-

ware engineering, several application chapters that illustrate possible uses of this

technology have been included. Those chapters contain previous work carried

out while SLAM-SL, the precursor to Clay, was being designed.

• We have shown how to integrate formal and agile methods, such as extreme

programming (XP). More specifically, we have studied unit testing, refactoring

and incremental development from the standpoint of formal methods.

Papers that led to this contribution are:

– A. Herranz and J.J. Moreno-Navarro. Formal extreme (and extremely

formal) programming. In Michele Marchesi and Giancarlo Succi, ed-

itors, 4th International Conference on Extreme Programming and Ag-

ile Processes in Software Engineering, XP 2003, number 2675 in LNCS,

pages 88–96, Genova, Italy, May 2003.

– A. Herranz and J.J. Moreno-Navarro. Formal agility. how much of

each? In Taller de Metodologías Ágiles en el Desarrollo del Software.

VIII Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2003,

pages 47–51, Alicante, España, November 2003. Grupo ISSI.

– A. Herranz and J.J. Moreno-Navarro. Rapid prototyping and incre-

mental evolution using SLAM. In 14th IEEE International Workshop

on Rapid System Prototyping (RSP 2003), San Diego, California, USA,

June 2003.

• We have defined a syntactic characterisation of a class of specifications that

supports efficient (imperative) code synthesis, allowing for an efficient

integration of the iterative rapid prototyping process and formal methods.


– A. Herranz, N. Maya, and J.J. Moreno-Navarro. From executable spec-

ifications to java. In Juan José Moreno-Navarro and Manuel Palomar,

editors, III Jornadas sobre Programación y Lenguajes, PROLE 2003,

pages 33–44, Alicante, España, November 2003. Departamento de

Lenguajes y Sistemas Informáticos, Universidad de Alicante. De-

pósito Legal MU-2299-2003.

– A. Herranz and J. J. Moreno-Navarro. Specifying in the large: Object-


B. J. Krämer H. Ehrig and A. Ertas, editors, The Sixth Biennial World

Conference on Integrated Design and Process Technology (IDPT’02),

volume 1, Pasadena, California, June 2002. Society for Design and

Process Science. ISSN 1090-9389.

• Finally, we have shown how to formalise design patterns by treating them

as class operators. While the core idea was in the design patterns commu-

nity folklore, we have developed it fully for the first time.


– Juan José Moreno-Navarro and Ángel Herranz. Design Pattern For-

malization Techniques, chapter Modeling and Reasoning about De-

sign Patterns in SLAM-SL. IGI Publishing, March 2007. ISBN: 978-1-

59904-219-0, ISBN: 978-1-59904-221-3.

– A. Herranz, J.J. Moreno-Navarro, and N. Maya. Declarative reflection

and its application as a pattern language. In Marco Comini and Mo-

reno Falaschi, editors, Electronic Notes in Theoretical Computer Sci-

ence, volume 76. Elsevier Science Publishers, November 2002.

– A. Herranz and J. J. Moreno-Navarro. Design patterns as class opera-

tors. Workshop on High Integrity Software Development at V Spanish

Conference on Software Engineering, JISBD’01, November 2001.

1.9 Thesis Organisation

Part I: Introduction

Chapter 1: Introduction We motivate the thesis, describing the state of the art

of formal methods for software development and present the main contri-

butions of the Clay proposal.

Chapter 2: A Taste of Clay This chapter is a quick tour of the specification language.

The main characteristics of Clay are introduced by means of examples,

and the less trivial design decisions are motivated on practical and

methodological grounds.

Part II: Semantics

Chapter 3: Static Semantics This chapter introduces the abstract syntax of the

notation, the nominal type system and some theoretical results.

Chapter 4: A Dynamic Semantics Based on First-order Logic We provide a dy-

namic semantics for Clay, by means of a translation of Clay specifications

into first-order logic.

Part III: The Clay System

Chapter 5: Synthesis of Logic Programs We present a refinement of the dy-

namic semantics of Chapter 4 that supports generating Prolog code from

the Clay specifications.

Chapter 6: The Clay Compiler This chapter describes the architecture of the

Clay compiler and the technology used to implement it. The compiler

functionality includes syntax analysis, type checking, first-order theory

generation, connection with automated provers and synthesis of exe-

cutable Prolog prototypes.

Part IV: Applications

Chapter 7: Formal Agility The chapter deals with the issue of integrating formal

methods in agile software development processes and, more specifically,

extreme programming.

Chapter 8: Specifying in the Large We study the possibility of generating imperative

code from SLAM-SL specifications. The goal, which follows naturally

from the ideas in the previous chapter, is to introduce formal techniques

into iterative rapid prototyping processes. The proposed method relies on

a study of the specification patterns that occur most often.

Chapter 9: Modelling Design Patterns We formalise design patterns as a case

study for SLAM-SL specifications.

Part V: Conclusion

Chapter 10: Conclusion and Future Work

Appendices

Appendix A: Clay Language Reference The appendix has all the concrete syntax

and a detailed description of the constructs.

Appendix B: Clay Theory in Logic Programming This appendix contains the

complete Clay Theory presented in Chapter 5 in the form of a logic pro-

gram.

Appendix C: Mathematical Conventions This appendix is devoted to mathe-

matical conventions and notation we have followed in this thesis.

2

A Taste of Clay

Abstract

This chapter presents the main features of Clay by means of examples

that will be used throughout the thesis. Sections 2.2–2.8 introduce the

basic constructs of the language. In Section 2.9 we will deal with some

nontrivial aspects of our notation that may differ from other object-

oriented specification and programming languages. Subsequent chap-

ters will provide a more formal presentation of the language. Chapter 3

is devoted to the type system of Clay, and Chapter 4 provides a logical

semantics. For a full language reference, including the concrete syntax,

the reader is referred to Appendix A.

2.1 Introduction

These are some of the features that we will exemplify in the subsequent sections.

Object-oriented. Clay is an object-oriented formal notation. Concepts such as

classes, objects (or instances), inheritance and composition are essential

in Clay. The object-oriented paradigm is extremely popular among ordinary

developers and it is widely accepted that it facilitates the creation of complex

software systems. We have tried to capture the core of these concepts in

the notation with no previous mathematical framework in mind.

Class-based and nominal subtyping. Clay is centred around the notion of clas-

ses as descriptions of objects, i.e. Clay is a class-based language. The Clay

type system is based on type names (in contrast to a structural one) and it

follows the approach of subtyping-is-subclassing. The intention is to prevent

two types with different design purposes from being accidentally

identified.

Stateless. This initial version of the language does not include the notion of

state. Although the language has been designed with it in mind, a classical

first-order logic based semantics made capturing state too complex and we

decided to study and introduce it in future versions.

Case classes. The State design pattern [52] illustrates an advisable solution to a

common problem. Algebraic types reflect that kind of solution and provide

a recursive decomposition mechanism via pattern matching. Inspired

by [104, 143], we have included case classes in Clay.

LSP. Clay follows the Liskov substitution principle [89] (LSP) and guarantees by

definition that all the properties described in a class are inherited by the objects

of a subclass. In other words, Clay has a Scandinavian semantics [21]

where the behaviour of the parent is preserved and augmented. This ap-

proach is essential when we are specifying in the large: the specifier needs

to reason locally within a class specification.

Pre-post. Methods are specified with pre- and post-conditions: first-order formulae

that relate the method subject, the parameters and the result of

the method. The most relevant predefined predicates are class member-

ship and class-indexed equality.

Equality. Each class introduces its own version of the equality predicate. The

equality in Clay is observational, i.e. two objects are equal if they are indis-

tinguishable under their reactions to received messages.

Permissive overloading. Clay follows a very permissive overloading of method

names to allow the use of substitutability and specialisation mechanisms

[26]. There is no contradiction in declaring a method in a subclass with ar-

guments being supertypes (contravariant arguments) and the result a sub-

type (covariant result).

Some readers will feel impatient reading this informal presentation. We hope

they understand that all the concepts are intertwined and it is difficult to satisfy

all the readers. We present the basic object-oriented concepts from Section 2.2 to

Section 2.8 and we deal with nontrivial, even controversial, aspects in Section 2.9.

2.2 Classes and Objects

The central notions in Clay are those of classes and objects. Clay specifications are

sets of class specifications which describe the structure and behaviour of objects.

Let us begin with an example adapted from [1]:

class Cell {
  state Cell_ { isEmpty : Bool,
                contents : Nat }
  invariant { self.isEmpty : True ⇒ self.contents = 0 }
  ...
}

Class specifications start with the keyword class. In our example, Cell describes

objects that contain, at most, one natural number. The keyword state¹ defines

part of the data model by means of object composition: any object of the class

Cell has two fields, isEmpty and contents. Field isEmpty is a Boolean field that indi-

cates whether the natural field contents is relevant or not.

Objects are the values described by classes. Objects interact with each other

by message passing. Message passing is expressed by the standard object-oriented

“dot”² syntax:

e.m(e1, ..., en)

Such an expression represents an object that is the response of the object represented

by e to the message m(e1, ..., en), where each ei is the representation of an

object.

Field names are messages. In the example, the intended meaning of c.isEmpty

is the Boolean object that decides if the cell c has no relevant information in

c.contents.

¹ The meaning of Cell_ is explained in Section 2.5.

² Left-associative.

Properties and constraints of the domain can be made explicit with the use

of invariants. Invariants are Clay formulae: statements involving logical connectives,

variables, auxiliary symbols and predefined symbols, whose syntax and intended

meaning are, basically, the syntax and the meaning of first-order logic

formulae.

The invariant of the class Cell establishes that whenever a cell is empty, its

content must be 0. The symbol self is a variable universally quantified over all

the objects of the class Cell. The formal meaning of the invariant is given by this

formula:

∀ c : Cell (c.isEmpty : True ⇒ c.contents = 0)

As in first-order logic languages, quantifiers (∀ and ∃), the logical connectives

for conjunction (∧), disjunction (∨), implication (⇒), and equivalence (⇔), and

predicate symbols are at the specifier’s disposal. Precedence and associativity of

the logical operators are the usual ones in first-order logic. The most relevant predicate

symbols are instance_of and eq, which will often be used in their infix form:

“ :” and “=”.

Predicate “ :” checks whether an object is an instance of a given class. In our

example, the Clay type system establishes that c.contents : Nat if c : Cell. More in-

teresting is its use in self . isEmpty : True, a construct that will be discussed in more

detail in Sections 2.5 and 2.9.2. For the moment, it will be enough to say that True

is the subclass of Bool that describes all the Boolean objects that are true (just

one).

Regarding the equality predicate, each class introduces its own version and

the symbol is overloaded. In general, equality is observational, i.e. two objects

are equal if they are indistinguishable when sending messages to them. In our

example, this boils down to equality of every component of the object: if we

have two cells c1 and c2, predicate c1 = c2 holds if c1.isEmpty = c2.isEmpty and

c1.contents = c2.contents hold.
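As an informal illustration outside Clay itself, the invariant and the component-wise equality of Cell can be mimicked in Python. This is only a sketch under our own assumptions: the field names is_empty and contents, and the use of exceptions to signal a violated invariant, are not part of the notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Immutable stand-in for the Clay class Cell (a sketch, not Clay semantics)."""
    is_empty: bool
    contents: int

    def __post_init__(self):
        # contents : Nat
        if self.contents < 0:
            raise ValueError("contents must be a natural number")
        # invariant: self.isEmpty : True ⇒ self.contents = 0
        if self.is_empty and self.contents != 0:
            raise ValueError("invariant violated: an empty cell must hold 0")


c1 = Cell(is_empty=False, contents=7)
c2 = Cell(is_empty=False, contents=7)

# The generated __eq__ compares field by field, mirroring
# c1 = c2 iff c1.isEmpty = c2.isEmpty and c1.contents = c2.contents.
same = (c1 == c2)
```

In this sketch a cell that breaks the invariant simply cannot be built, whereas in Clay the invariant is a formula that every instance is assumed to satisfy.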

2.3 Methods

We complete our example with the definition of some methods for Cell. In the

form of pre- and post-conditions, method definitions specify how instances of

Cell will respond to messages. These formulae relate the receiver of the message

(self) and the parameters of the message with its answer (result):

class Cell {
  ...
  modifier set (v : Nat) {
    post { result.isEmpty : False ∧ result.contents = v }
  }

  observer get : Nat {
    pre { self.isEmpty : False }
    post { result = self.contents }
  }
}

The definitions of methods get and set introduce messages that instances of Cell

respond to.

The keyword modifier indicates that the result of sending message set(n) to

the cell c, expression c.set(n), represents the modification of c, a new non-empty

cell with internal content n.

Observer methods introduce messages which allow observing calculated at-

tributes of objects and are declared with the keyword observer. Expression c.get

denotes the response of c to the message get: the internal value of field contents if

c is not empty, otherwise the response is an unknown instance of Nat (unknown

but consistently the same for any empty cell).
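The reading of modifiers and observers given above can be sketched in Python, again only as an analogy. We assume a hypothetical constant _UNKNOWN_NAT to stand for the unknown answer of get on an empty cell; its value 42 is an arbitrary choice of ours, the only requirement being that it is always the same.

```python
from typing import NamedTuple

# Arbitrary but fixed stand-in for the unknown answer (our assumption).
_UNKNOWN_NAT = 42


class Cell(NamedTuple):
    is_empty: bool
    contents: int

    def set(self, v: int) -> "Cell":
        # modifier: the answer is a new cell, the receiver is left untouched
        # post: result.isEmpty : False ∧ result.contents = v
        return Cell(False, v)

    def get(self) -> int:
        # observer; pre: self.isEmpty : False; post: result = self.contents
        if self.is_empty:
            # precondition not met: some natural number, always the same one
            return _UNKNOWN_NAT
        return self.contents


empty = Cell(True, 0)
full = empty.set(3)
```

Here empty.set(3) does not change empty; it denotes another cell, which is the stateless reading of a modifier.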

In Clay, classes are themselves objects that respond to messages introduced

by constructor methods (keyword constructor):

class Cell {
  ...
  constructor mkCell {
    post { result.isEmpty = Bool.mkTrue }
  }
  ...
}

The expression Cell.mkCell denotes an object of class Cell that represents an

empty cell and, as established by the invariant, has field contents equal to 0. The

reader can infer that the expression Bool.mkTrue represents the object true, the

response of the object Bool to the message mkTrue.

Since classes are objects, they are instances of other classes. In our example

the following predicates hold:

Cell : MetaCell ∧ MetaCell : Meta ∧ Meta : Meta

Summarising, for every class name C the user defines in Clay, a new class name

MetaC is automatically created. To avoid an infinite chain, every MetaC is an in-

stance of the predefined class Meta, which is an instance of itself.
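Readers familiar with Python may recognise a similar arrangement in its metaclasses. The sketch below is only an analogy: the names MetaCell and mk_cell are ours, and Python's built-in type plays the role that Meta plays in Clay.

```python
class MetaCell(type):
    """Hypothetical metaclass playing the role of Clay's MetaCell."""

    def mk_cell(cls):
        # A constructor message is answered by the class object itself.
        return cls(True, 0)


class Cell(metaclass=MetaCell):
    def __init__(self, is_empty, contents):
        self.is_empty = is_empty
        self.contents = contents


# Cell.mkCell in Clay corresponds to sending mk_cell to the class object.
empty = Cell.mk_cell()

# Cell : MetaCell, MetaCell : Meta, Meta : Meta, with type standing in for Meta.
chain = (type(Cell) is MetaCell, type(MetaCell) is type, type(type) is type)
```

As in Clay, the chain is closed by a class that is an instance of itself, and the constructor message is understood by the class but not by its instances.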

2.4 Inheritance

Let us extend the behaviour of the Cell class with the ability to restore the state of

cells to their previous content:

class ReCell extends Cell {
  state ReCell_ { wasEmpty : Bool,
                  backup : Nat }
  invariant { wasEmpty : True ⇒ backup = 0 }

  constructor mkReCell {
    post { result = Cell.mkCell
           ∧ result.wasEmpty : True }
  }
  ...
}

Class ReCell extends Cell with backup fields.

Inheritance is declared by means of keyword extends, which induces a sub-

class relation, a predefined predicate symbol that will be used in its infix form:

“<:”. The relation obeys the expected rules of reflexivity, transitivity and sub-

sumption. Clay adopts the inheritance-is-subtyping approach that in a class-

based language results in no distinction between types and classes.

Also, we have designed Clay to follow the Liskov substitution principle (LSP)

so all the properties of the instances of a superclass are inherited by the instances

of a subclass. In our example, the first property inherited is the invariant of Cell:

∀ c : ReCell (c.isEmpty : True ⇒ c.contents = 0)

In our view, the most important aspect of the LSP is that a subclass cannot in-

validate, by overriding for instance, any property specified in its superclasses. If

this happens the whole specification will be considered inconsistent in Clay. This

approach is essential when we are specifying in the large: the specifier needs to

reason locally within a class specification. Therefore, the language cannot allow

a subclass to describe a behaviour that forces the specifier to take it into account.

The approach adds another advantage: specifications can be much more con-

cise since it is not necessary to state properties already stated in superclasses.

The main drawback is a certain loss of flexibility but, in our view, the decision pays

off.

Note how in the specification of set we omit everything already specified in

the superclass:

class ReCell extends Cell {
  ...
  modifier set (v : Nat) {
    post { result.wasEmpty = self.isEmpty
           ∧ result.backup = self.contents }
  }
  ...
}

If r is an instance of ReCell then the postcondition of message set in Cell is inher-

ited and r.set(5).contents = 5 holds.

To end the example, we specify the message restore, which recovers the immediately

previous state of the cell:

class ReCell extends Cell {
  ...
  modifier restore {
    post { result.isEmpty = self.wasEmpty
           ∧ result.contents = self.backup
           ∧ result.wasEmpty = Bool.mkTrue }
  }
}
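The way in which ReCell accumulates the postconditions of Cell can be imitated in Python by letting the subclass call the inherited behaviour first and then establish its own conjuncts. This is a sketch under our own assumptions (snake-case field names, default values standing for mkCell and mkReCell); setting backup to 0 in restore is what the ReCell invariant forces once wasEmpty is true.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cell:
    is_empty: bool = True
    contents: int = 0

    def set(self, v):
        # Cell.set post: result.isEmpty : False ∧ result.contents = v
        return replace(self, is_empty=False, contents=v)


@dataclass(frozen=True)
class ReCell(Cell):
    was_empty: bool = True
    backup: int = 0

    def set(self, v):
        # The inherited postcondition still holds; ReCell only adds
        # result.wasEmpty = self.isEmpty ∧ result.backup = self.contents
        r = super().set(v)
        return replace(r, was_empty=self.is_empty, backup=self.contents)

    def restore(self):
        # post: result.isEmpty = self.wasEmpty ∧ result.contents = self.backup
        #       ∧ result.wasEmpty = Bool.mkTrue
        # invariant of ReCell: wasEmpty : True ⇒ backup = 0
        return replace(self, is_empty=self.was_empty,
                       contents=self.backup, was_empty=True, backup=0)


r = ReCell().set(5)
r2 = r.set(9)
```

With these definitions r.contents is 5, as the inherited postcondition requires, and r2.restore() answers a cell whose contents is again 5.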

2.4.1 No Top

Clay, like other object-oriented languages such as C++ [120], has no top class in

the inheritance hierarchy. The only operation common to all classes is equality.

Nevertheless, equality is not a method and it does not lead to any expression:

it is a predicate and lives in the world of Clay formulae. Having no top class, we

can detect errors in the use of equality, since the comparison of two unrelated

objects will be rejected by the type checker (see Section 2.9.4).

2.5 Case Classes

Case classes are subclasses which export their constructor parameters and which

provide a recursive decomposition mechanism via pattern matching. In Clay,

instances of a class are the disjoint and complete sum of the instances of its case

classes. Case classes are declared with the keyword state – the terminology comes

from its similarity to the design pattern State [52] – and this implicitly introduces

subclasses of the original class. Readers familiar with functional programming

will recognise here a way to define algebraic data types (also known as free types):

class Nat {
  state Zero {}
  state Succ { pred : Nat }
  ...
}

Three classes are introduced with the lines above: Nat, Zero and Succ. If n is an

instance of Nat then it is an instance of Zero or, exclusively, of Succ. The following

Clay formula expresses it:

∀ n : Nat ((n : Zero ∨ n : Succ) ∧ (n : Zero ⇔ ¬ n : Succ))

Moreover, Zero and Succ are subclasses of Nat, so the following formulae hold for every n:

n : Zero ⇒ n : Nat
n : Succ ⇒ n : Nat

In the example of cells presented in the previous sections, Cell_ and ReCell_

are case classes of Cell and ReCell, respectively.

The case classes Zero and Succ implicitly define the constructor methods

mkZero and mkSucc. Both are valid messages of class MetaNat (see Section 2.2)

and can be sent to the object Nat. The response of Nat to message mkZero, i.e.

Nat.mkZero, is an instance of Zero. When a case class defines fields, its constructor

uses them as the definition of the parameters. Constructor mkSucc has as its

parameter an instance of Nat so we can write Nat.mkSucc(Nat.mkZero) to repre-

sent number 1. We will use 0, 1, 2, etc. to abbreviate the following expressions:

Nat.mkZero, Nat.mkSucc(Nat.mkZero), Nat.mkSucc(Nat.mkSucc(Nat.mkZero)), etc.

The combination of predicate “ :”, case classes and fields can be used to emu-

late the effect of pattern matching, as shown in Listing 2.1.

class Nat {
  ...
  modifier add (n : Nat) {
    post { (self : Zero ⇒ result = n)
           ∧ (self : Succ ⇒ result = Nat.mkSucc(self.pred.add(n))) }
  }
  ...
}

Listing 2.1: Adding natural numbers

We now show the specification of class Bool, which explains why the

construct isEmpty : True establishes that isEmpty is true in the invariant of class

Cell:

class Bool {
  state True {}
  state False {}
  ...
}

Formulae isEmpty = Bool.mkTrue in the invariant of class Cell and isEmpty : True are

equivalent in Clay since Bool.mkTrue : Bool and the case classes define a free type:

∀ n : Bool ((n : True ∨ n : False) ∧ (n : False ⇔ ¬ n : True))

To finish with case classes we show a simple example of an enumeration:

class RGB {
  state Red {}
  state Green {}
  state Blue {}
  ...
}

The formula expressing that RGB is a free type is:

∀ c : RGB ((c : Red ∨ c : Green ∨ c : Blue)
  ∧ (c : Red ⇔ ¬ c : Green ∧ ¬ c : Blue)
  ∧ (c : Green ⇔ ¬ c : Red ∧ ¬ c : Blue)
  ∧ (c : Blue ⇔ ¬ c : Red ∧ ¬ c : Green)
)
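For readers who prefer an executable analogy, the case classes of Nat and the emulated pattern matching of Listing 2.1 can be sketched in Python. The helper to_int and the use of isinstance in place of the predicate “ :” are our own assumptions; the sketch does not claim to reproduce the Clay semantics.

```python
from dataclasses import dataclass


class Nat:
    """Base class; its instances are exactly the Zero and Succ instances."""

    def add(self, n: "Nat") -> "Nat":
        # self : Zero ⇒ result = n
        if isinstance(self, Zero):
            return n
        # self : Succ ⇒ result = Nat.mkSucc(self.pred.add(n))
        if isinstance(self, Succ):
            return Succ(self.pred.add(n))
        raise TypeError("Nat is the disjoint sum of Zero and Succ")


@dataclass(frozen=True)
class Zero(Nat):
    pass


@dataclass(frozen=True)
class Succ(Nat):
    pred: Nat


def to_int(n: Nat) -> int:
    """Helper (not in the source) that reads a numeral back as an int."""
    count = 0
    while isinstance(n, Succ):
        n, count = n.pred, count + 1
    return count


two = Succ(Succ(Zero()))
three = Succ(two)
five = two.add(three)
```

The two isinstance tests play the role of the two implications in the postcondition of add, and the final TypeError reflects that no instance of Nat lies outside its case classes.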

2.6 Generics

To address the problem of reusability and extendability of specifications Clay

supports three key techniques: inheritance (Section 2.4), disjoint sums (case

classes, Section 2.5) and parametric polymorphism by means of generics.

The following Clay specification defines generic pairs:

class Tup2 <x,y> {
  state Tup2_ { fst : x, snd : y }
}

Variables x and y are class parameters of the class constructor Tup2 and can be

bound to any two classes. We can then create the pair (true,0) with the follow-

ing expression Tup2<Bool,Nat>.mkTup2_(Bool.mkTrue,0) which responds to fst and

snd messages with the proper instances of Bool (class parameter x) and Nat (class

parameter y) observing the following properties:

Tup2<Bool,Nat>.mkTup2_(Bool.mkTrue,0).fst = Bool.mkTrue
Tup2<Bool,Nat>.mkTup2_(Bool.mkTrue,0).snd = 0

Generics, case classes and inheritance can be combined, as the following exam-

ple shows:

class Seq<x> extends Collection<x> {
  state Empty { }
  state NonEmpty { head : x, tail : Seq<x> }
  ...
}

The genericity in the container class is inherited by its case subclasses. Expressions

like Empty<x> and NonEmpty<x> have to be used if we want to do pattern

matching in the specification of message size:

class Seq<x> extends Collection<x> {
  ...
  observer size : Nat {
    post { self : Empty<x> ⇒ result : Zero
           ∧ self : NonEmpty<x> ⇒ result = self.tail.size.inc }
  }
  ...
}
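The combination of generics and case classes in Seq has a rough counterpart in Python's typing module. The sketch below is an analogy under our own assumptions: Python does not check the type parameter at run time, and Collection is omitted because its specification is not shown here.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

X = TypeVar("X")


class Seq(Generic[X]):
    def size(self) -> int:
        # self : Empty<x> ⇒ result : Zero
        if isinstance(self, Empty):
            return 0
        # self : NonEmpty<x> ⇒ result = self.tail.size.inc
        if isinstance(self, NonEmpty):
            return self.tail.size() + 1
        raise TypeError("Seq is the disjoint sum of Empty and NonEmpty")


@dataclass(frozen=True)
class Empty(Seq[X]):
    pass


@dataclass(frozen=True)
class NonEmpty(Seq[X]):
    head: X
    tail: Seq[X]


numbers = NonEmpty(1, NonEmpty(2, Empty()))
length = numbers.size()
```

Both case classes inherit the type parameter of Seq, which is the Python counterpart of writing Empty<x> and NonEmpty<x> in the postcondition of size.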

Generics in Clay can also be introduced using the mechanism of bounded type

parametrisation. Let us assume the following specification for a class Poset:³

class Poset {
  observer lte(other : Self) : Bool {
    post { (result : True ∧ other.lte(self) : True) ⇒ self = other
           ∧ result : False ⇒ other.lte(self) : True
           ∧ forall o : Self (result : True ∧ other.lte(o) ⇒ self.lte(o)) }
  }
}

³ We will offer details about Self in Section 2.9.1. For the moment it can be understood as a

class variable bounded by Poset.
Using it, a new generic class SortedSeq can be specified whose class parameter is

constrained to be a subclass of Poset:

class SortedSeq<x extends Poset> {
  state SortedSeq_ { s : Seq<x> }
  invariant { ∀ i : Nat ((s.dom.in(i) : True ∧ s.dom.in(i+1) : True)
                ⇒ s.elem(i).lte(s.elem(i+1))) }
  ...
}

Then, the SortedSeq class constructor can be instantiated with any class that inherits

from Poset, like NatPoset:⁴

class NatPoset extends Nat Poset {
  observer lte(other : Self) : Bool {
    post { self : Zero ⇒ result : True
           ∧ self : Succ ⇒ other : Zero
                         ⇒ result : False
           ∧ self : Succ ⇒ other : Succ
                         ⇒ result = self.pred.lte(other.pred) }
  }
}

However, SortedSeq<Nat> would not be a valid class name since Nat is not a sub-

class of Poset.
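Bounded type parametrisation can be approximated in Python with a bounded type variable plus a run-time check. The sketch is ours: NatPoset wraps a Python int instead of inheriting from Nat, and the two exceptions are our way of signalling an invalid instantiation and a violated invariant.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, TypeVar


class Poset(ABC):
    @abstractmethod
    def lte(self, other) -> bool:
        ...


# Type parameter constrained to subclasses of Poset (x extends Poset).
P = TypeVar("P", bound=Poset)


class NatPoset(Poset):
    def __init__(self, value: int):
        self.value = value

    def lte(self, other: "NatPoset") -> bool:
        return self.value <= other.value


class SortedSeq(Generic[P]):
    def __init__(self, items: List[P]):
        # Run-time stand-in for the bound on the class parameter.
        if not all(isinstance(e, Poset) for e in items):
            raise TypeError("elements must be instances of a Poset subclass")
        # invariant: s.elem(i).lte(s.elem(i+1)) for adjacent positions
        if not all(a.lte(b) for a, b in zip(items, items[1:])):
            raise ValueError("invariant violated: sequence is not sorted")
        self.items = list(items)


ok = SortedSeq([NatPoset(1), NatPoset(2), NatPoset(2)])
```

Attempting SortedSeq([1, 2]) fails in this sketch for the same reason that SortedSeq<Nat> is rejected in Clay: the elements do not belong to a subclass of Poset.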

2.7 Pre- and Post-conditions

Using postconditions we can state requirements on the result of a message, leav-

ing out any unnecessary algorithmic detail. Let us look at the specification of

message half in our natural numbers example:

class Nat {
  ...
  observer half : Nat {
    post { self = result.add(result) }
  }
  ...
}

The postcondition in the post clause forces the following property for every in-

stance n of Nat:

n.half.add(n.half) = n

⁴ Note that, although it is not relevant in this example, we are using multiple inheritance (see

Section 2.9.3).
Any actual implementation would find it impossible to comply with the property

for odd numbers. To avoid this kind of inconsistency the specifier can introduce

a precondition that states a restriction on the use of a message:

class Nat {
  ...
  observer half : Nat {
    pre { self.even : True }
    post { self = result.add(result) }
  }

  observer even : Bool {
    post { result : True ⇔ exists n : Nat (self = n.add(n)) }
  }
}

Now, the whole specification of half establishes the previous property but is pro-

tected by the precondition:

n.even : True ⇒ n.half.add(n.half) = n

In Clay, sending a message that does not satisfy the precondition would result in

an unknown object. In our example, 1.half is a natural number although we do

not know which one. More details about this in Section 2.9.1.
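The effect of the precondition can be checked on a small model written in Python. We assume, only for the sake of the example, that half is modelled by integer division; any other fixed answer for odd numbers would serve equally well.

```python
def even(n: int) -> bool:
    # post: result : True ⇔ exists k : Nat (n = k.add(k))
    return any(k + k == n for k in range(n + 1))


def half(n: int) -> int:
    # One possible model: exact for even n, a fixed arbitrary value otherwise.
    return n // 2


def guarded_property(n: int) -> bool:
    # n.even : True ⇒ n.half.add(n.half) = n
    return (not even(n)) or half(n) + half(n) == n


# The guarded property holds for every number we try ...
guarded_ok = all(guarded_property(n) for n in range(100))
# ... whereas the unguarded one cannot hold for odd numbers.
unguarded_ok = all(half(n) + half(n) == n for n in range(100))
```

The model gives 1.half a definite value, but nothing can be deduced about that value from the specification because the guard is false for 1.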

2.8 Assertions and Solutions

Clay allows the specifier to state conditions that postconditions must guarantee, in the form

of assertions:

class ReCell {
  ...
  modifier restore {
    post { ... }
  }
  assert { ¬ result.isEmpty ⇒ result.get = result.restore.get }
}

Assertions make the model easier to understand and, at the same time, can be

converted to either proof obligations for proof checking, or test cases for future

implementations, or run-time checks.

In order to document algorithmic details about the way in which the result

of a message can be calculated by an eventually synthesised prototype we use

solutions. A solution is a formula that entails the postcondition, generally written

in a way that allows a more straightforward translation into code:

class Nat {
  ...
  observer even : Bool {
    post { ... }
    sol {
      self : Zero ∧ result : True
      ∨ self : Succ ∧ result = self.pred.even.neg
    }
  }
}

When a sol clause is present, its content, rather than that of the post, can be used

to generate an executable prototype.

2.9 Clay Idiosyncrasy

In previous sections we have presented characteristics of Clay that should not

have led to controversy. In this section we keep the informal line of the chapter

but we will answer some questions about what we consider the most interesting

aspects of the notation.

2.9.1 The Meaning of a Specification

Although Chapter 4 is devoted to the formal semantics of Clay, we will informally

explain in this section the meaning of the main constructs of the language by

means of the following running example:

 1 class Map<key extends Poset, info> extends Collection<info> {
 2   state Map_ { map : Seq<Tup2<key,info>> }
 3   invariant { ... } −− sorted and no duplicated keys
 4
 5   observer get (k : key) : info {
 6     pre { exists i : info (self.map.in(Tup2.mkTup2(k,i)) : True) }
 7     post { self.map.in(Tup2.mkTup2(k,result)) : True }
 8   }
 9   ...
10 }

Let us introduce every construction:

Class declaration. Line 1 establishes that Map<A,B> is a valid class, subclass

of Collection<B>, if A and B are valid class expressions and

A <: Poset.

Case class declaration. Line 2 declares Map<A,B>.mkMap_(s) : Map<A,B> if

s : Seq<Tup2<A,B>>, and Map<A,B>.mkMap_(s) respects the invariant.

Method declaration. Line 5 establishes

∀ s : Map<A,B> ⇒ ∀ k : A ⇒ s.get(k) : B

In words, if s is a map and k a key, s.get(k) is an instance of B.

Method specification. Line 7 introduces the following fact about any instance s

of Map<A,B> and any instance k of A:

s.map.in(Tup2.mkTup2(k,s.get(k))) : True

guarded by the precondition of line 6:

exists i : info (s.map.in(Tup2.mkTup2(k,i)) : True)

Self, capitalised

Every class specification introduces a class variable Self bounded by the class un-

der definition. Self represents the “current” class. The specifier can use Self at any

moment. In Section 2.6 we used it as a parameter type. The keyword modifier im-

plicitly uses Self as the class of the resulting object.

In the specification of modifier add in class Nat, the resulting object is an in-

stance of Self. Any instance of a subclass C of Nat will answer with an instance of

class C.

Undefinedness

What is the meaning of 1.half or m.get(k) when key k is not in map m in Clay?

We follow the same approach as Z with respect to undefined expressions: every

expression has a value, though sometimes we do not know which one.

The approach allows us to take 1.half to be 42 consistently and everything will

go well. If we consistently take it to be 27 everything will go well. What Clay

will not do is to take 1.half to be sometimes 37 and sometimes 42. Otherwise we

would lose the reflexivity of the equality or collapse all values.


Whichever value 1.half is, it cannot be discovered in a deduction process because

the precondition 1.even : True acts as a guard when we try it.

Loose Specifications

When some aspects of the domain are not yet relevant to the requirements we

can leave them unspecified. The usual refinement process can be iteratively ap-

plied to tighten the specification while each new specification has to be proven

consistent. Loose specifications naturally appear in abstract classes or interfaces

in object-oriented specifications.

The class Collection<x> defines methods insert and includes in the following

manner:

class Collection<x> {
  ...
  modifier insert(e : x) {
    post { result.includes(e) : True }
  }
  modifier includes(e : x) : Bool {}
}

Subclasses of Collection<x> will guarantee that their instances respond with the

instance True to the sequence of messages insert(e) and includes(e).

If the user specifies a contradictory property in a subclass then the whole

specification becomes inconsistent.

Since the refinement process and inheritance require the same consistency

proofs, for this work we have not introduced a specific construct for declaring

refinement. Along the same lines, Clay does not distinguish between interfaces and

abstract classes, interfaces being completely loose specifications apart from the

typing aspects.

Non-determinism

As mentioned in the previous paragraph about undefinedness, in Clay every ex-

pression has a value though sometimes we do not know what it is. Given the

following specification of sqrt:

class Nat {
  ...
  modifier sqrt {
    post { ∃ r : Nat (r.mul(r) = self)
             ⇒ result.mul(result) = self
           ∧ ∀ x : Nat (self.lte(x) : True)
             ⇒ self.sqrt.lte(x.sqrt) : True }
  }
}

the square root of 25 is 5, but the square root of 21 could be 4 or 5. Clay will return

one of the values consistently and not 4 sometimes and 5 the others.
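The looseness of this specification can be made concrete with a small Python experiment. We read the second conjunct as monotonicity (if self ≤ x then self.sqrt ≤ x.sqrt), which is our interpretation, and we check three candidate models over a finite range; the function names and the bound of 200 are assumptions of the sketch.

```python
import math


def floor_sqrt(n: int) -> int:
    return math.isqrt(n)


def ceil_sqrt(n: int) -> int:
    r = math.isqrt(n)
    return r if r * r == n else r + 1


def mixed_sqrt(n: int) -> int:
    # Rounds down on even numbers and up on odd ones.
    return floor_sqrt(n) if n % 2 == 0 else ceil_sqrt(n)


def satisfies_sqrt_spec(sqrt, limit: int = 200) -> bool:
    for n in range(limit):
        # ∃ r : Nat (r.mul(r) = self) ⇒ result.mul(result) = self
        has_root = any(r * r == n for r in range(n + 1))
        if has_root and sqrt(n) * sqrt(n) != n:
            return False
        # self.lte(x) : True ⇒ self.sqrt.lte(x.sqrt) : True
        if any(sqrt(n) > sqrt(x) for x in range(n, limit)):
            return False
    return True


floor_ok = satisfies_sqrt_spec(floor_sqrt)
ceil_ok = satisfies_sqrt_spec(ceil_sqrt)
mixed_ok = satisfies_sqrt_spec(mixed_sqrt)
```

Rounding down and rounding up both satisfy the postcondition, answering 4 and 5 respectively for 21, while the mixed candidate is rejected because it answers 5 for 21 and 4 for 22.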

Binding

Binding is the process by which an expression is associated with a behaviour [21].

In Clay, behaviour is given by the properties that affect the expression during a

deductive process.

In the following example, method m2 in class A delegates its response to the

response of message m1:

class A {
  observer m1 : Nat {
    post { result = 4 ∨ result = 5 }
  }

  observer m2 : Nat {
    post { result = self.m1.half }
  }
}

The response of A.mkA.m2 will consistently be any natural number, 2 included.

Nevertheless, we don’t know which one, although we know that the following for-

mula holds:

A.mkA.m1 = 4 ⇒ A.mkA.m2 = 2

Let us override the method m1 of class A in subclass B:

class B extends A {
  observer m1 : Nat {
    post { result = 4 }
  }
}

Clay follows the Scandinavian semantics [21] for method overriding so the de-

duction process should reflect it by finding B.mkB.m2 = 2 to be true consistently:

• B.mkB.m1 = 4 ∨ B.mkB.m1 = 5 since B.mkB : A, and

• B.mkB.m1 = 4 since B.mkB : B.
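The deduction can be replayed by brute force in Python: an instance of B must satisfy the postcondition of m1 stated in A as well as the one stated in B. The helper names and the finite range of candidates are assumptions of this sketch, and half is modelled by integer division only because it is applied to an even number here.

```python
def post_m1_in_a(result: int) -> bool:
    # class A: post { result = 4 ∨ result = 5 }
    return result in (4, 5)


def post_m1_in_b(result: int) -> bool:
    # class B: post { result = 4 }
    return result == 4


def admissible(posts, candidates=range(10)):
    """Values of m1 that satisfy every postcondition in force."""
    return [r for r in candidates if all(p(r) for p in posts)]


def half(n: int) -> int:
    # Model of half, exact on even numbers.
    return n // 2


# An instance of A is only constrained by the postcondition in A.
m1_for_a = admissible([post_m1_in_a])
# An instance of B inherits the postcondition of A and adds its own.
m1_for_b = admissible([post_m1_in_a, post_m1_in_b])
m2_for_b = {half(r) for r in m1_for_b}
```

Only 4 survives for B.mkB.m1, so the postcondition of m2 inherited from A yields 2 for B.mkB.m2.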

Permissive Overloading

Clay follows a very permissive overloading of method names to allow the use of

substitutability and specialisation mechanisms [26]. There is no contradiction in

declaring a method in a subclass with arguments being supertypes (contravari-

ant arguments) and the result a subtype (covariant result).

For every class C, let SubC and SupC be classes such that C <: SupC and

SubC <: C. The meaning of the definition

class A {
  observer m (x : X) : R {}
}

is that for every instance a of A and every instance x of X the expression a.m(x) is

an instance of R.

We can define the subclass SubA that overrides method m in various ways

while respecting that arguments are contravariant and the result is covariant:

class SubA extends A {
  observer m (x : X) : R {}
  observer m (x : X) : SubR {}
  observer m (x : SupX) : R {}
  observer m (x : SupX) : SubR {}
}

The first definition does not add any information to the previous meaning. The

second definition is more interesting and establishes that the reply to the message

m of an instance of SubA is something more specific than an instance of R,

an instance of SubR. The third and fourth definitions establish that instances of

SubA are able to reply to message m if the argument is something more general

than an instance of X, an instance of SupX.

Covariant arguments are also supported in Clay and their interpretation is the

specialisation of the method behaviour in the instances of the subclass:

class SubA extends A {
  observer m (x : SubX) : R {}
  observer m (x : SubX) : SubR {}
}

Obviously, an instance of SubA will reply to any message m(x) where x is an instance

of X. Both definitions establish a specialised or refined behaviour in the case an

instance of SubX is used as argument. This allows the introduction of Self in the

arguments, allowing the definition of binary methods.

Contravariant results are not allowed in Clay. Not because they introduce any

inconsistency but because they provide no extra information:

class SubA extends A {
  observer m (x : X) : SupR { −− Not allowed
  }
}

If a′ is an instance of SubA we already know that a′.m(x) is an instance of R and,

obviously, it is already an instance of SupR.
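The cases discussed in this section can be summarised by a small classifier written in Python. It is a sketch of our reading of the rules, not of the Clay type checker: the function classify_override and its result strings are hypothetical, and redefinitions with an unrelated argument class are deliberately left out.

```python
class SupX: pass
class X(SupX): pass
class SubX(X): pass

class SupR: pass
class R(SupR): pass
class SubR(R): pass


def classify_override(parent, child):
    """parent and child are (argument class, result class) pairs."""
    (p_arg, p_res), (c_arg, c_res) = parent, child
    if not issubclass(c_res, p_res):
        # e.g. result SupR where the superclass promised R
        return "rejected: contravariant result"
    if issubclass(p_arg, c_arg):
        # same argument class or a more general one (SupX)
        return "accepted: same or more general argument"
    if issubclass(c_arg, p_arg):
        # more specific argument (SubX): refined behaviour for that case
        return "accepted: specialisation (covariant argument)"
    return "unrelated argument: outside this sketch"


in_a = (X, R)
general = classify_override(in_a, (SupX, SubR))
special = classify_override(in_a, (SubX, R))
bad = classify_override(in_a, (X, SupR))
```

The only redefinition that the sketch rejects is the one with a more general result, which matches the single case that Clay does not allow.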

2.9.2 Booleans and Formulae

In Clay, expressions and formulae live in different syntactic categories. To avoid

any confusion, in this work we prefer to make this distinction absolutely explicit.

This decision leads to cumbersome notation like

observer half : Nat {
  pre { self.even : True }
}

Allowing the specifier to write formulae as instances of Bool, as a kind of syntactic

sugar, could be possible. The previous example could be written as

observer half : Nat {
  pre { self.even }
}

Nevertheless, we prefer not to use it to avoid confusion.

2.9.3 Multiple Inheritance

We illustrate multiple inheritance in Clay using coloured numbers:

class RGBNat extends Nat RGB {
  constructor mkRedZero {
    post { self : Zero ∧ self : Red }
  }
}

Instances of RGBNat inherit all the properties of superclasses Nat and RGB. If

p : RGBNat we can use it in any circumstance an RGB or a Nat can be used. The

following constructions are valid:

Nat.mkZero.add(p)
p : Red

2.9.4 Equality

“One can’t do mathematics for more than ten minutes without

grappling, in some way or other, with the slippery notion of equal-

ity.” Mazur [93]

We will see that Clay is very much like mathematics regarding equality:

class Nat {
  ...
  modifier mul (n : Nat) {
    post { self = Nat.mkZero ∧ result = Nat.mkZero ∨ ... }
  }
  ...
}

Since RGBNat is a subclass of Nat with an instance that represents a red zero, we

expect the following property to hold:

∀ n : Nat (RGBNat.mkRedZero.mul(n) = Nat.mkZero)

Nevertheless, whether the property holds depends on the truth of the first equality in

the postcondition of mul:

RGBNat.mkRedZero = Nat.mkZero

Obviously RGBNat.mkRedZero and Nat.mkZero are not the same object. Should

both expressions be equal? In our view, they should. Otherwise, the spec-

ifier would have to make explicit all the relevant properties that reveal that

RGBNat.mkRedZero and Nat.mkZero behave indistinguishably in the context of

message mul. That decision would lead, in general, to much more cumbersome

specifications.

In Clay, the equality predicate is implicitly indexed by the minimum common supertype

of the compared instances in the context in which the formula appears.


In the example of mul, the minimum superclass of self and Nat.mkZero is Nat.

The semantics of Clay establishes that no properties of self other than those

reachable from Nat are checked when compared with Nat.mkZero.

Consider, for instance, adding a red zero to a zero: 0.add(RGBNat.mkRZ). Ac-

cording to Listing 2.1 the result is equal to RGBNat.mkRZ up to Nat, so the property

0.add(RGBNat.mkRZ) : Zero holds and 0.add(RGBNat.mkRZ) : Red is not even a valid

expression. Nevertheless, we can write a formula on an instance cn of RGBNat

establishing

cn = 0.add(RGBNat.mkRZ)

since equality is applied modulo Nat, the minimum common supertype of cn and

0.add(RGBNat.mkRZ).

As a result of statically indexing the equalities, some specifications may seem

strange at first sight, like that of the constructor mkRedZero:

class RGBNat extends RGB Nat {
  constructor mkRedZero {
    post { result = RGB.mkRed ∧ result = Nat.mkZero }
  }
}

The postcondition seemingly states that RGB.mkRed and Nat.mkZero are equal, by

transitivity:

RGBNat.mkRedZero = RGB.mkRed ∧ RGBNat.mkRedZero = Nat.mkZero

A deeper examination reveals that equalities are indexed in the following way:

class RGBNat extends RGB Nat {
  constructor mkRedZero {
    post { result =_RGB RGB.mkRed ∧ result =_Nat Nat.mkZero }
  }
}

The property is then

RGBNat.mkRedZero =_RGB RGB.mkRed ∧ RGBNat.mkRedZero =_Nat Nat.mkZero

and RGB.mkRed = Nat.mkZero is clearly invalid.

Informally, o1 =_A o2 holds in Clay if and only if:

• o1 and o2 are instances of the same case class C of A,

• o1.f =_B o2.f for each field f with class B declared in the case class C, and

• o1 =_A′ o2 for every superclass A′ of A.

2.9.5 Invariants and Consistency

In most formal methods, invariants are one of the main elements of the proof

obligation process, i.e. invariants are not assumed and specifiers need to prove

that their postconditions ensure the invariants.

In Clay, invariants augment the postconditions. We consider that this de-

cision leads to more concise specifications while the proof obligation is turned

into a mandatory consistency checking.

2.9.6 Other Features

Clay is a lightweight evolution of SLAM-SL [63]. In this work we have advanced in

the formal treatment of some aspects only superficially treated in the cited works

and, at the same time, many features and syntax from SLAM-SL [69, 66, 65, 59]

have been left behind. Our plan is to reintroduce them in Clay in the future.


Part II

Semantics

3 Static Semantics

Abstract

This chapter presents the Clay type system. The Clay type system is nominal: it is based on type names and does not support structural subtyping, which avoids accidental matching of unrelated types but makes the system harder to define than structural type systems like those in [1]. To give a precise definition we

detail an abstract syntax for the language in Section 3.1 and then, in Sec-

tion 3.2, a whole type system for Clay is provided. Section 3.3 presents

important results about the typechecking of Clay specifications.

3.1 Abstract Syntax

Textual representation of Clay specifications is not appropriate to describe the

semantics of the language. We need a mathematical object to describe and better manipulate Clay specifications. This section defines an abstract

syntax of Clay that will be used in Sections 3.2 and 3.3 to introduce a type system

that validates specifications and extracts important information used during the

translation processes described in Chapters 4 and 5.


Appendix C introduces the mathematical preliminaries, definitions and con-

ventions used in this chapter.

We introduce four disjoint sets of atoms, sets with no structure, to represent valid class identifiers (CI), message identifiers (MI), class variables (CV), and object variables (OV). The same font as in Clay specifications is used for identifiers or variables that come from Clay specifications (such as Map, k, and self). OV includes the elements self and result. In the mathematical definitions we use these conventions for the metavariables: ci for class identifiers (CI), mi for message identifiers (MI), cv for class variables (CV), and ov for object variables (OV).

3.1.1 Clay Specifications

A top-down account of Clay’s constructions follows. Let us start with the abstract

representation of Clay specifications. Clay specifications (Spec) can be repre-

sented by two partial functions: Bounds and Classes:

Spec = Bounds×Classes

Roughly speaking, Bounds represents the information in the header declarations

and Classes relates it to the actual specifications. Bounds is a partial function

from class identifiers (CI ) to sequences of sets of class expressions (CExpr) that

captures the arity and bounds of – possibly generic – class identifiers:

\[ Bounds = CI \mapsto (2^{CExpr})^{*} \]

Class expressions (CExpr) are terms that follow this concrete syntax:

\[ CExpr ::= CI < CExpr, CExpr, \ldots, CExpr > \;\mid\; CV \]

i.e. class variables or a class identifier applied to several class expressions.

Example 3.1. In the declaration of generic maps

class Map<key extends Poset, info> extends Collection<info>{...}

bnds, the first component of the Clay specification $spec = \langle bnds, clss \rangle$, would contain the entry $\texttt{Map} \mapsto [\{\texttt{Poset<>}\}, \emptyset]$.

Classes is a partial function from class expressions to class specifications (CS):

\[ Classes = CExpr \mapsto CS \]

A class specification is a tuple with a bounds environment (BE), the finite set of the (immediate) superclasses ($2^{CExpr}$) of the class(es) being defined, the class invariant (Form), a state environment (SE) providing the structure of algebraic data, and a method environment (ME) which provides the behaviour:

\[ CS = BE \times 2^{CExpr} \times Form \times SE \times ME \]

Bounds environments are finite mappings from class variables (CV) to their bounds ($2^{CExpr}$):

\[ BE = CV \mapsto 2^{CExpr} \]

Well-formedness Rules

For a given Clay specification $spec = \langle bnds, clss \rangle$, we introduce the well-formedness rules:

1. Clay specifications are finite; formally, $|\mathrm{dom}\ bnds| \in \mathbb{N}$.

2. No class variables occur in the domain of clss; formally, $CV \cap \mathrm{dom}\ clss = \emptyset$.

3. Free class variables are bounded; formally, if $C \in \mathrm{dom}\ clss$ and $be = (\pi_1 \circ clss)\,C$ then $FV\,C = \mathrm{dom}\ be$ (FV is the classical function that returns the free variables of expressions and formulae and is defined at the end of Section 3.1.4).

4. No class identifiers other than those in bnds are in the domain of clss. Formally, $\mathrm{dom}\ bnds = \{top\,C \mid C \in \mathrm{dom}\ clss\}$, where function top is

\[
\begin{aligned}
top &: CExpr \to CI \cup CV \\
top\,(ci < C_1, \ldots, C_n >) &= ci \\
top\,cv &= cv
\end{aligned}
\]

5. No cycles are allowed in the inheritance relation.

6. The clss and bnds components are related by the condition that all the class expressions in the domain of the former must meet the bounds constraints established by the latter.

7. The invariant is a formula where just the object variable self is allowed as a free variable. Formally, for each $cs \in \mathrm{range}\ clss$, $FV(\pi_3\,cs) = \{\texttt{self}\}$.
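The abstract syntax and rules 2 and 4 above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the names CVar, CApp, rule2_holds and rule4_holds are hypothetical, bnds and clss are modelled as plain dictionaries, and the class specification of Map is elided.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class CVar:
    """A class variable, i.e. an element of CV (hypothetical encoding)."""
    name: str


@dataclass(frozen=True)
class CApp:
    """A class identifier applied to class expressions: ci<C1, ..., Cn>."""
    ci: str
    args: Tuple["CExpr", ...] = ()


CExpr = Union[CVar, CApp]


def top(c: CExpr) -> str:
    """top(ci<C1, ..., Cn>) = ci and top(cv) = cv."""
    return c.ci if isinstance(c, CApp) else c.name


def rule2_holds(clss: dict) -> bool:
    """Rule 2: no class variable belongs to dom clss."""
    return not any(isinstance(c, CVar) for c in clss)


def rule4_holds(bnds: dict, clss: dict) -> bool:
    """Rule 4: dom bnds = { top C | C in dom clss }."""
    return set(bnds) == {top(c) for c in clss}


# Fragment of Example 3.1; the class specification itself is elided (None).
poset = CApp("Poset")
map_key_info = CApp("Map", (CVar("key"), CVar("info")))
bnds = {"Map": [frozenset({poset}), frozenset()]}
clss = {map_key_info: None}
```

Frozen dataclasses are used so that class expressions are hashable and can serve as dictionary keys, mirroring the role of CExpr as the domain of Classes.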

The formalisation of rules 5 and 6 follows.

Let us define the immediate superclass relation $\lhd_{spec}$ on $\mathrm{dom}\ clss$ for a given specification spec: $A \lhd_{spec} B$ iff $A, B \in \mathrm{dom}\ clss$ and $B \in (\pi_2 \circ clss)\,A$. An inheritance chain S in a specification spec is a sequence of class expressions $S_1 S_2 \ldots$ where $S_i \lhd_{spec} S_{i+1}$ for $0 < i < |S|$.

Well-formedness rule 5 is guaranteed by the following property: for any inheritance chain S and $i, j \in \mathrm{dom}\ S$, if $top\,S_i = top\,S_j$ then $i = j$. Constructively, it is enough to check the property for the inheritance chains starting with $ci < cv_1, \ldots, cv_n >$ (with $ci \in \mathrm{dom}\ bnds$) since those inheritance chains are finite, as we will prove in Lemma 3.1.
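A possible way to check rule 5 mechanically is sketched below in Python. The sketch assumes that the immediate superclass relation has already been projected onto class identifiers with top, so that a repeated identifier along a chain shows up as a cycle in a finite graph. The function name and the dictionary encoding are hypothetical.

```python
def has_inheritance_cycle(parents):
    """Detect a cycle in the projected immediate-superclass graph.

    parents maps each class identifier to the identifiers of its
    immediate superclasses (assumed encoding, not taken from the thesis).
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    colour = {ci: WHITE for ci in parents}

    def visit(ci):
        colour[ci] = GREY
        for sup in parents.get(ci, ()):
            state = colour.get(sup, WHITE)
            if state == GREY:             # sup is already on the current chain
                return True
            if state == WHITE and visit(sup):
                return True
        colour[ci] = BLACK
        return False

    return any(colour[ci] == WHITE and visit(ci) for ci in list(parents))


rgb_hierarchy = {"RGBNat": {"Nat", "RGB"}, "Nat": set(), "RGB": set()}
```

The depth-first search visits every identifier at most once, so it terminates on any finite set of declarations.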

Formally, the well-formedness rule 6 is:

for each class expression $ci < C_1, \ldots, C_n > \in \mathrm{dom}(clss)$, $|bnds(ci)| = n$ and for each $i \in 1..n$, $C_i \in \mathrm{dom}\ clss$ and for each $b \in (bnds\ ci)_i$, $b \in super_{spec}(C_i)$,

where function $super_{spec} : CExpr \mapsto 2^{CExpr}$ can be defined as the reflexive, transitive closure of the superclasses component of the class specifications:

\[ super_{spec}\,A = \{ C \mid A \lhd^{*}_{spec} C \} \]
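The closure can be computed by a simple graph traversal. The following Python sketch assumes monomorphic classes named by strings and an immediate superclass relation given as a dictionary; both the encoding and the function name super_spec are assumptions made for illustration.

```python
def super_spec(c, immediate):
    """Reflexive, transitive closure of the immediate superclass relation.

    immediate maps a class to the set of its immediate superclasses.
    Returns every class reachable from c, including c itself.
    """
    seen = {c}
    stack = [c]
    while stack:
        current = stack.pop()
        for sup in immediate.get(current, ()):
            if sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


immediate = {"RGBNat": {"Nat", "RGB"}, "Nat": set(), "RGB": set()}
```

With the coloured numbers of Section 2.9.3, super_spec("RGBNat", immediate) contains RGBNat, Nat and RGB, while super_spec("Nat", immediate) contains only Nat.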

3.1.2 State Environments

A state environment is a partial function from class expressions (the case classes) to tuples of field environments (FE) and formulae (Form):

\[ SE = CExpr \mapsto FE \times Form \]

The formula represents the invariant for that case and just the variable self is allowed as a free variable. Thus, for each $\langle fe, F \rangle \in \mathrm{range}\ se$, $FV(F) = \{\texttt{self}\}$.

A field environment is a partial function from method identifiers (MI) to class expressions:

\[ FE = MI \mapsto CExpr \]

3.1.3 Method Environments

A method environment is a partial function from method identifiers (MI) to method specifications (MS):

\[ ME = MI \mapsto MS \]

A method specification is a triple with a method declaration (MD) and two formulae (Form) that represent the precondition and the postcondition:

\[ MS = MD \times Form \times Form \]

The declaration acts as the call pattern and, thus, free variables in the pre- and post-conditions must be taken from those in the declaration:

\[ \text{for every } \langle d, pre, post \rangle \in MS,\quad FV(pre) \cup FV(post) \subseteq FV(d) \cup \{\texttt{result}, \texttt{self}\} \]

A method declaration is a pair with a sequence of pairs of object variables (OV) and class expressions (CExpr) and a class expression:

\[ MD = (OV \times CExpr)^{*} \times CExpr \]

Example 3.2. Getting back to the specification of generic maps:

class Map<key extends Poset,info> extends Collection<info> {
  state Map_ { map : Seq<Tup2<key,info>> }
  invariant { ... } −− sorted and no duplicated keys

  observer get (k : key) : info {
    pre { exists i : info ( self.map.in(Tup2.mkTup2(k,i)) : True) }
    post { self.map.in(Tup2.mkTup2(k,result)) : True }
  }
  ...
}


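The specification is a pair $spec = \langle bnds, clss \rangle$ where

\[
\begin{aligned}
clss &= \{\texttt{Map<key,info>} \mapsto cs_{map}, \ldots\} \\
cs_{map} &= \langle be_{map}, Cs_{map}, F_{map}, se_{map}, me_{map} \rangle \\
be_{map} &= \{\texttt{key} \mapsto \{\texttt{Poset<>}\}, \texttt{info} \mapsto \emptyset\} \\
Cs_{map} &= \{\texttt{Collection<info>}\} \\
F_{map} &= \{\top\} \\
se_{map} &= \{\texttt{Map\_} \mapsto \{\texttt{map} \mapsto \texttt{Seq<Tup2<key,info>>}\}\} \\
me_{map} &= \{\texttt{get} \mapsto ms_{get}, \ldots\} \\
ms_{get} &= \langle md_{get}, F_{pre}, F_{post} \rangle \\
md_{get} &= \langle [\langle \texttt{k}, \texttt{key} \rangle], \texttt{info} \rangle .
\end{aligned}
\]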

3.1.4 Formulae and Expressions

We complete the definition of the abstract syntax with the representation of for-

mulae and expressions in Clay. We consider that a concrete syntax is a more ap-

propriate representation:

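\[
\begin{array}{rcl}
Form & ::= & \top \mid \bot \mid Expr = Expr \mid Expr : CExpr \\
 & \mid & \neg\, Form \\
 & \mid & Form \wedge Form \mid Form \vee Form \mid Form \Rightarrow Form \mid Form \Leftrightarrow Form \\
 & \mid & \forall OV : CExpr(Form) \mid \exists OV : CExpr(Form) \\
Expr & ::= & CExpr \mid OV \mid Expr.MI(Expr, Expr, \ldots, Expr)
\end{array}
\]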

This is basically a first-order language where the interesting parts are:

• the equality (=) and instance_of (:) built-ins;

• quantification bounded by class expressions (∀ov : C(F), ∃ov : C(F));

• message invocation expressions (o.mi(o1, . . . ,on)).


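The formal definition of function FV follows:

\[
\begin{aligned}
FV_{Form}\,(o_1 = o_2) &= FV\,o_1 \cup FV\,o_2 \\
FV_{Form}\,(o : C) &= FV\,o \cup FV\,C \\
FV_{Form}\,(\neg F) &= FV\,F \\
FV_{Form}\,(F_1 \otimes F_2) &= FV\,F_1 \cup FV\,F_2 && \text{where } \otimes \in \{\wedge, \vee, \Rightarrow, \Leftrightarrow\} \\
FV_{Form}\,(Q\,ov : C\,(F)) &= FV\,F - \{ov\} && \text{where } Q \in \{\forall, \exists\} \\
FV_{Expr}\,C &= FV_{CExpr}\,C && \text{if } C \in CExpr \\
FV_{Expr}\,ov &= \{ov\} && \text{if } ov \in OV \\
FV_{Expr}\,(o.mi(o_1, \ldots, o_n)) &= FV\,o \cup \bigcup_{i \in 1..n} FV\,o_i \\
FV_{CExpr}\,cv &= \{cv\} && \text{if } cv \in CV \\
FV_{CExpr}\,(ci < C_1, \ldots, C_n >) &= \bigcup_{i \in 1..n} FV\,C_i
\end{aligned}
\]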
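The definition of FV translates almost literally into a recursive function. The Python sketch below uses a tagged-tuple encoding of class expressions, expressions and formulae that is purely an assumption of this example (the tags "cvar", "capp", "ovar", "send", "eq", "inst", "not", "bin", "quant", "top" and "bot" are hypothetical). The cases for ⊤ and ⊥, which the definition above leaves implicit, are assumed to have no free variables.

```python
def fv(t):
    """Free variables of a class expression, expression or formula."""
    tag = t[0]
    if tag in ("top", "bot"):                 # assumed: no free variables
        return set()
    if tag in ("cvar", "ovar"):               # FV cv = {cv}, FV ov = {ov}
        return {t[1]}
    if tag == "capp":                         # ci<C1, ..., Cn>
        return set().union(*(fv(c) for c in t[2]))
    if tag == "send":                         # o.mi(o1, ..., on)
        _, receiver, _mi, args = t
        return fv(receiver).union(*(fv(a) for a in args))
    if tag in ("eq", "inst"):                 # o1 = o2 and o : C
        return fv(t[1]) | fv(t[2])
    if tag == "not":
        return fv(t[1])
    if tag == "bin":                          # ("bin", op, F1, F2)
        return fv(t[2]) | fv(t[3])
    if tag == "quant":                        # ("quant", Q, ov, C, F)
        _, _q, ov, _c, body = t
        return fv(body) - {ov}
    raise ValueError(f"unknown node: {tag!r}")


# The pre- and postcondition of get in the Map example.
self_map = ("send", ("ovar", "self"), "map", [])
true = ("capp", "True", [])


def mk_tup(second):
    return ("send", ("capp", "Tup2", []), "mkTup2",
            [("ovar", "k"), ("ovar", second)])


f_post = ("inst", ("send", self_map, "in", [mk_tup("result")]), true)
f_pre = ("quant", "exists", "i", ("cvar", "info"),
         ("inst", ("send", self_map, "in", [mk_tup("i")]), true))
```

Under this encoding the precondition has free variables self and k, and the postcondition additionally mentions result, which agrees with the constraint on method specifications in Section 3.1.3.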

Example 3.3. Following with the Map example:

\[ F_{pre} = \exists i : \texttt{info}\,(\texttt{self.map.in(Tup2.mkTup2(k,i))} : \texttt{True}) \]

\[ F_{post} = \texttt{self.map.in(Tup2.mkTup2(k,result))} : \texttt{True} . \]

Figure 3.1 summarises the whole mathematical definition of the abstract syntax. Terms in parentheses after the names of the domains are our conventions on metavariables.

3.2 The Type System

In this section we follow a standard approach for describing type systems: a lan-

guage for types is defined, the typing environments are introduced, typing judge-

ments are defined and, finally, the type inference rules are presented.


Specifications (spec)          $Spec = Bounds \times Classes$
Bound Limits (bnds)            $Bounds = CI \mapsto (2^{CExpr})^{*}$
Class Environment (clss)       $Classes = CExpr \mapsto CS$
Class Specifications (cs)      $CS = BE \times 2^{CExpr} \times Form \times SE \times ME$
Bound Environments (be)        $BE = CV \mapsto 2^{CExpr}$
State Environments (se)        $SE = CExpr \mapsto FE \times Form$
Field Environments (fe)        $FE = MI \mapsto CExpr$
Method Environments (me)       $ME = MI \mapsto MS$
Method Specifications (ms)     $MS = MD \times Form \times Form$
Method Declarations (md)       $MD = (OV \times CExpr)^{*} \times CExpr$
Formulae (F)                   $Form ::= \top \mid \bot \mid o_1 = o_2 \mid o : C \mid \neg F \mid F_1 \wedge F_2 \mid F_1 \vee F_2 \mid F_1 \Rightarrow F_2 \mid F_1 \Leftrightarrow F_2 \mid \forall ov : C(F) \mid \exists ov : C(F)$
Expressions (o)                $Expr ::= C \mid ov \mid o.mi(o_1, \ldots, o_n)$
Class Expressions (C, A, B)    $CExpr ::= ci < C_1, \ldots, C_n > \mid cv$
Class Identifiers (ci)         CI = valid Clay class identifiers
Message Identifiers (mi)       MI = valid Clay message identifiers
Class Variables (cv)           CV = valid Clay class variables
Object Variables (ov)          OV = valid Clay object variables

Figure 3.1: Clay’s abstract syntax.

The design of the type system owes a lot to the one introduced by Abadi and

Cardelli in [1]. The main difference is that Clay has a nominal rather than struc-

tural type system. Keeping track of the name information is responsible for most

of the added complexity (basically, more environments) in the following presen-

tation.

3.2.1 Types

Types are represented by class expressions, which we will call object types, and by message types. We extend the conventions introduced in Figure 3.1 with the metavariable M to represent message types:

Object Types (A, B, C)     $ObjTy = CExpr$
Message Types (M)          $MsgTy ::= ObjTy \ldots ObjTy \to ObjTy$

Example 3.4. The monomorphic class expressions Nat<>, Bool<>, Cell<> are ob-

ject types, as well as Collection<Bool<>>, Collection<Nat<>>, Collection<Cell<>> and

Collection<x> where x is a class variable. The type of method isEmpty of class Cell is

the message type [] → Bool<>.

3.2.2 Typing Environments

To describe the type system of Clay we start with the definition of typing environ-

ments, a structure that relates classes, methods and parameters with their types.

A typing environment $E \in TEnv$ in Clay is a tuple $\langle \Gamma, \gamma, \beta, \Lambda \rangle$ where

• $\Gamma \in GTEnv$ is a global typing environment that maps object types (ObjTy) to class typing environments (CTEnv):

\[ GTEnv = ObjTy \mapsto CTEnv \]

• $\gamma \in CTEnv$ is a class typing environment, a structure with a set of supertypes ($2^{ObjTy}$) and a message typing environment (MTEnv) that relates message identifiers (MI) to message types (MsgTy):

\[ CTEnv = 2^{ObjTy} \times MTEnv \qquad MTEnv = MI \mapsto MsgTy \]

Two projections on CTEnv are defined:

\[ suptys : CTEnv \to 2^{ObjTy}, \quad suptys = \pi_1 \qquad mtenv : CTEnv \to MTEnv, \quad mtenv = \pi_2 \]

• $\beta \in BTEnv$ is a bounds typing environment that maps class variables (CV) to their bounds ($2^{ObjTy}$):

\[ BTEnv = CV \mapsto 2^{ObjTy} \]

• $\Lambda \in LTEnv$ is a local environment that maps object variables (OV) to object types (ObjTy):

\[ LTEnv = OV \mapsto ObjTy \]

Syntax                  Intended meaning
$E \vdash A$            The object type A is valid.
$E \vdash o : A$        The object expression o is well typed and A is one of its types.
$E \vdash A <: B$       The object type A is a subtype of B.
$E \vdash M$            The message type M is valid.
$\gamma \vdash mi : M$  The message identifier mi is well typed and M is one of its types.
$E \vdash M <: N$       The message type M is a subtype of N.
$E \vdash F : Prop$     The formula F is well typed.

Figure 3.2: Typing Judgements.

We will use the following metavariables for each kind of type environment:

E for typing environments (TEnv), Γ for global typing environments (GTEnv),

γ for class typing environments (CTEnv), µ for message typing environments

(MTEnv), β for bounds typing environments (BTEnv), and Λ for local typing en-

vironments (LTEnv). Each time E appears we refer to the tuple ⟨Γ,γ,β,Λ⟩. Each

time γ appears we refer to the tuple ⟨otys,µ⟩.

3.2.3 Typing Judgements

The language for type judgements and their intended semantics is given in Fig-

ure 3.2.

Example 3.5. The following are (apparently valid) judgements:

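\[ \langle \{\texttt{Nat<>} \mapsto \gamma\}, \gamma', \emptyset, \emptyset \rangle \vdash \texttt{Nat<>} \]

\[ \langle \{\texttt{Nat<>} \mapsto \gamma\}, \gamma', \emptyset, \emptyset \rangle \vdash \texttt{Nat<>} \to \texttt{Nat<>} \]

\[ \langle \{\texttt{Nat<>} \mapsto \langle \emptyset, \{\texttt{mkCell} \mapsto [\,] \to \texttt{Cell<>}\} \rangle\}, \gamma', \emptyset, \emptyset \rangle \vdash \texttt{mkCell} : [\,] \to \texttt{Cell<>} \]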


3.2.4 Typing Rules

The type inference rules are presented in several fragments that capture the fol-

lowing information: well-formed (valid) types, subtyping, typing of object ex-

pressions and messages, and typing of formulae.

Well-formed (Valid) Types

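Object types are well-formed if they are in the domain of the global environment:

\[ \frac{}{\langle \{A \mapsto \gamma\} \cup \Gamma, \gamma', \beta, \Lambda \rangle \vdash A} \quad \text{[TR-wfot]} \]

Every class identifier ci introduces a new class identifier Metaci:

\[ \frac{}{\langle \{ci < A_1, \ldots, A_n > \mapsto \gamma\} \cup \Gamma, \gamma', \beta, \Lambda \rangle \vdash \texttt{Meta}ci < A_1, \ldots, A_n >} \quad \text{[TR-wfmeta]} \]

Type variables are well-formed types if they are in the domain of the bounds typing environment:

\[ \frac{}{\langle \Gamma, \gamma, \{cv \mapsto Cs\} \cup \beta, \Lambda \rangle \vdash cv} \quad \text{[TR-wftv]} \]

Message types are well-formed if involved object types are well-formed:

\[ \frac{E \vdash A \qquad E \vdash A_i \ (i \in 1..n)}{E \vdash A_1 A_2 \ldots A_n \to A} \quad \text{[TR-wfmt]} \]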

Subtyping

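The first source of subtyping information is the declarations:

\[ \frac{E \vdash A \qquad E \vdash B \qquad B \in suptys\,(\Gamma\,A)}{E \vdash A <: B} \quad \text{[TR-subty]} \]

\[ \frac{E \vdash A \qquad E \vdash cv \qquad A \in \beta\,cv}{E \vdash cv <: A} \quad \text{[TR-subtv]} \]

Reflexivity, transitivity and subsumption are basic properties of subtyping:

\[ \frac{E \vdash A}{E \vdash A <: A} \quad \text{[TR-ref]} \]

\[ \frac{E \vdash A <: B \qquad E \vdash B <: C}{E \vdash A <: C} \quad \text{[TR-trans]} \]

\[ \frac{E \vdash o : A \qquad E \vdash A <: B}{E \vdash o : B} \quad \text{[TR-ssump]} \]

Message subtyping is covariant in the result type and contravariant in the arguments:

\[ \frac{E \vdash B_1 <: A_1 \quad \ldots \quad E \vdash B_n <: A_n \qquad E \vdash A <: B}{E \vdash A_1 \ldots A_n \to A <: B_1 \ldots B_n \to B} \quad \text{[TR-submt]} \]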
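Rule TR-submt can be read as a simple check on two message types. The Python sketch below assumes that object types are strings, that a message type is a pair of an argument tuple and a result type, and that the strict supertypes of each type are available in a dictionary; all of these encodings, and the class names borrowed from Section 2.9.1, are assumptions of the example.

```python
def is_subtype(a, b, supers):
    """a <: b iff b is a itself or one of its declared (transitive) supertypes."""
    return a == b or b in supers.get(a, set())


def is_msg_subtype(m, n, supers):
    """A1...An -> A <: B1...Bn -> B iff each Bi <: Ai and A <: B."""
    (args_m, res_m), (args_n, res_n) = m, n
    if len(args_m) != len(args_n):
        return False
    contravariant_args = all(
        is_subtype(b_i, a_i, supers) for a_i, b_i in zip(args_m, args_n)
    )
    covariant_result = is_subtype(res_m, res_n, supers)
    return contravariant_args and covariant_result


# Strict supertypes, already transitively closed (assumed hierarchy).
supers = {
    "SubX": {"X", "SupX"}, "X": {"SupX"},
    "SubR": {"R", "SupR"}, "R": {"SupR"},
}
general = (("SupX",), "SubR")   # accepts more, promises more
specific = (("X",), "R")
```

Here the message type SupX → SubR is a subtype of X → R, but not the other way round.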

The following rule is the subsumption for message typing:

\[ \frac{\Gamma\,A \vdash mi : M \qquad E \vdash M <: N}{\Gamma\,A \vdash mi : N} \quad \text{[TR-ssumpmt]} \]

Typing Object Expressions

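The type of an object variable is obtained from the local typing environment:

\[ \frac{E \vdash A \qquad \Lambda(ov) = A}{E \vdash ov : A} \quad \text{[TR-ovty]} \]

The type of a class expression $ci < A_1, \ldots, A_n >$ is $\texttt{Meta}ci < A_1, \ldots, A_n >$:

\[ \frac{E \vdash ci < A_1, \ldots, A_n >}{E \vdash ci < A_1, \ldots, A_n > : \texttt{Meta}ci < A_1, \ldots, A_n >} \quad \text{[TR-cety]} \]

\[ \frac{E \vdash B_1 \quad \ldots \quad E \vdash B_n \qquad E \vdash B \qquad mtenv\,(\Gamma\,A)(mi) = B_1 \ldots B_n \to B}{\Gamma\,A \vdash mi : B_1 \ldots B_n \to B} \quad \text{[TR-mity]} \]

The type of a send expression is obtained from the type of the receiver and the type of the message:

\[ \frac{E \vdash o : A \qquad E \vdash o_1 : A_1 \quad \ldots \quad E \vdash o_n : A_n \qquad \Gamma\,A \vdash mi : A_1 \ldots A_n \to B}{E \vdash o.mi(o_1, \ldots, o_n) : B} \quad \text{[TR-sndty]} \]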

Formulae

Formulae ⊤ and ⊥ always typecheck:

\[ \frac{}{E \vdash F : Prop} \quad F \in \{\top, \bot\} \quad \text{[TR-Fty]} \]

The following rule does not just type check an equality atomic formula, it is the rule that annotates the index of the equality with information extracted from the side condition:

\[ \frac{E \vdash o_1 : A_1 \qquad E \vdash o_2 : A_2 \qquad E \vdash A_1 <: B \qquad E \vdash A_2 <: B}{E \vdash o_1 =_B o_2 : Prop} \quad MinTys \quad \text{[TR-eqty]} \]

The side condition MinTys checks the existence of a minimum supertype of both sides of the equality:

if there exist $A'_1$, $A'_2$, $B'$ such that $E \vdash o_1 : A'_1$, $E \vdash o_2 : A'_2$, $E \vdash A'_1 <: B'$, and $E \vdash A'_2 <: B'$, then $A_1 <: A'_1$, $A_2 <: A'_2$, and $B <: B'$.
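One way to picture the side condition is as a search for the least element among the common supertypes of the two sides. The Python sketch below is a simplification under stated assumptions: it starts from the most specific type of each side (assumed to be known already), object types are strings, and supers maps each type to its strict supertypes, transitively closed. The function name is hypothetical.

```python
def minimum_common_supertype(a1, a2, supers):
    """Return the minimum common supertype of a1 and a2, or None.

    A candidate b is minimum when every other common supertype
    is itself a supertype of b.
    """
    up1 = supers.get(a1, set()) | {a1}
    up2 = supers.get(a2, set()) | {a2}
    common = up1 & up2
    for b in common:
        if common <= (supers.get(b, set()) | {b}):
            return b
    return None


supers = {"RGBNat": {"Nat", "RGB"}, "D1": {"Nat", "RGB"}, "D2": {"Nat", "RGB"}}
```

With the coloured numbers of Section 2.9.4, comparing an RGBNat with a Nat is indexed by Nat, whereas RGB and Nat have no common supertype in this hierarchy, which matches the observation that RGB.mkRed = Nat.mkZero is invalid. The hypothetical classes D1 and D2 show a case with two incomparable common supertypes, where no minimum exists.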

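An atomic formula with the predicate : will typecheck if the type is a subtype of some type of the expression:

\[ \frac{E \vdash o : B \qquad E \vdash A <: B}{E \vdash o : A : Prop} \quad \text{[TR-insty]} \]

The rest of the rules just decompose non atomic formulae:

\[ \frac{E \vdash F : Prop}{E \vdash \neg F : Prop} \quad \text{[TR-}\neg\text{ty]} \]

\[ \frac{E \vdash F_1 : Prop \qquad E \vdash F_2 : Prop}{E \vdash F_1 \otimes F_2 : Prop} \quad \otimes \in \{\wedge, \vee, \Rightarrow, \Leftrightarrow\} \quad \text{[TR-}\otimes\text{ty]} \]

\[ \frac{E \vdash A \qquad \langle \Gamma, \gamma, \beta, \Lambda \cup \{ov \mapsto A\} \rangle \vdash F : Prop}{E \vdash Q\,ov : A(F) : Prop} \quad Q \in \{\forall, \exists\},\ ov \notin \mathrm{dom}\ \Lambda \quad \text{[TR-Qty]} \]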

3.3 Typing Clay Specifications

The typing rules in Section 3.2.4 allow us to derive judgements that relate a typing environment with an object expression or a message identifier, as explained in Section 3.2.3. However, most of the benefits of the type system have to do with the processing of complete Clay specifications. In the following chapter, for example, it will be shown that the translation into first-order logic makes use of type annotations.

Therefore, it is necessary that those annotations can be obtained (compiled)

from the source using some constructive, finitary method. This, in turn, can be

split into two subproblems:

1. A procedure to synthesise typing environments from Clay specifications.

2. A decision procedure on typing judgements, using the typing rules presented in the previous sections, provided that the “right” typing environment has been provided somehow.

3.3.1 Synthesis of Typing Environments

The first question is how to obtain the typing environment from the source spec-

ification. Our typing environments have been carefully designed to allow an al-

most straightforward translation from the abstract syntax.

The function τ that transforms Clay specifications into global typing environ-

ments is defined as:

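\[
\begin{aligned}
\tau &: Spec \to GTEnv \\
\tau\,spec &= \{ A \mapsto \langle (cs2suptys \circ clss)\,A,\ (me2\mu \circ \pi_5 \circ clss)\,A \rangle \mid A \in \mathrm{dom}\ clss \} \\
&\quad \text{where } \langle bnds, clss \rangle = desugar\ spec
\end{aligned}
\]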

Auxiliary functions are defined as follows.

Without loss of generality, we assume the existence of a function desugar : Spec → Spec that desugars case class definitions by moving the case class declarations to new classes and the case constructors to new methods.

Function md2mt transforms method declarations into message types:

\[ md2mt : MD \mapsto MsgTy \]

\[ md2mt\,\langle [\langle ov_1, C_1 \rangle, \ldots, \langle ov_n, C_n \rangle], C \rangle = C_1 \ldots C_n \to C \]

Function me2µ transforms method environments into message typing environments:

\[ me2\mu : ME \mapsto MTEnv \]

\[ me2\mu\,me = md2mt \circ \pi_1 \circ me \]

Function cs2suptys transforms class specifications into sets of (super)types:

\[ cs2suptys : CS \mapsto 2^{ObjTy} \]

\[ cs2suptys = super_{spec} \circ \pi_2 \]
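The first two auxiliary functions are easy to mirror in code. The Python sketch below assumes that a method declaration is a pair of a parameter list and a result type, that a method specification is a triple whose first component is the declaration, and that types are plain strings; the spelling me2mu (for me2µ) and the placeholder formulae are choices made for this example only.

```python
def md2mt(md):
    """Method declaration -> message type: drop the parameter names."""
    params, result = md
    return (tuple(c for _ov, c in params), result)


def me2mu(me):
    """Method environment -> message typing environment.

    Each method specification is <md, pre, post>; only md is used,
    which corresponds to md2mt composed with the first projection.
    """
    return {mi: md2mt(ms[0]) for mi, ms in me.items()}


# Method get from Example 3.2; the formulae are placeholders.
md_get = ([("k", "key")], "info")
me_map = {"get": (md_get, "F_pre", "F_post")}
```

Applied to the Map example, me2mu(me_map) relates get to the message type key → info.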


The construction above collects all the type declarations present in a Clay specification and bundles them in the first component of a typing environment. The other three components get different values during the type checking process.

Corollary 3.1. Given a Clay specification spec, there exists a constructive, finitary procedure for obtaining a typing environment E that collects all the type information in it, if such an environment exists.

Proof Sketch. We give the procedure that has been actually implemented by our

type checker (see Chapter 6):

A subset of τ, τ′, is constructed:

\[
\begin{aligned}
\tau' &: Spec \to GTEnv \\
\tau'\,spec &= \{ A \mapsto \langle (cs2suptys \circ clss)\,A,\ (me2\mu \circ \pi_5 \circ clss)\,A \rangle \mid A \in \mathrm{dom}\ clss \text{ and } A = ci < cv_1, \ldots, cv_n > \} \\
&\quad \text{where } \langle bnds, clss \rangle = desugar\ spec
\end{aligned}
\]

τ′ is finite since dom bnds (with the valid class identifiers ci) is finite.

To calculate the application of τ to any class expression $C = ci < A_1, \ldots, A_n >$ we obtain the typing environment $\tau'\,(ci < cv_1, \ldots, cv_n >)$ and substitute each class variable $cv_i$ by $A_i$.
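The substitution step can be sketched as follows in Python. The tuple encoding of class expressions (("cv", name) for class variables and ("ci", name, args) for applied identifiers) and the function names subst and instantiate are assumptions of this example, not the representation used by the actual type checker.

```python
def subst(c, sigma):
    """Replace the class variables of c according to the mapping sigma."""
    if c[0] == "cv":
        return sigma.get(c[1], c)
    _, ci, args = c
    return ("ci", ci, tuple(subst(a, sigma) for a in args))


def instantiate(generic, formals, actuals):
    """Instantiate a generic class typing environment <suptys, mtenv>."""
    sigma = dict(zip(formals, actuals))
    suptys, mtenv = generic
    return (
        {subst(s, sigma) for s in suptys},
        {mi: (tuple(subst(a, sigma) for a in args), subst(res, sigma))
         for mi, (args, res) in mtenv.items()},
    )


# Generic entry for Map<key,info>, restricted to method get.
key, info = ("cv", "key"), ("cv", "info")
nat, boolean = ("ci", "Nat", ()), ("ci", "Bool", ())
generic_map = ({("ci", "Collection", (info,))}, {"get": ((key,), info)})
map_nat_bool = instantiate(generic_map, ["key", "info"], [nat, boolean])
```

In this sketch the entry for Map<Nat<>,Bool<>> is obtained from the single generic entry, so only finitely many entries ever need to be stored.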

It remains to check that all the formulae (invariants, pre- and post-conditions) in class specifications are type-consistent. A Clay specification $spec = \langle bnds, clss \rangle$ is type-consistent iff for every $A = ci < cv_1, \ldots, cv_n >$ with $ci \in \mathrm{dom}\ bnds$ and $cv_1, \ldots, cv_n$ different class variables

1. The class invariant can be assigned the type Prop:

\[ \langle \Gamma, \gamma, \beta, \Lambda \rangle \vdash (\pi_3 \circ clss)\,A : Prop \]

where $\Gamma = \tau\,spec$, $\gamma = \Gamma\,A$, $\beta = (\pi_1 \circ clss)\,A$, and $\Lambda = \{\texttt{self} \mapsto A\}$.

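2. Pre- and post-conditions of method specifications can be assigned the type Prop: for every $mi \in \mathrm{dom}((mtenv \circ \Gamma)\,A)$, with $ms = ((\pi_5 \circ clss)\,A)\,mi$ its method specification,

\[ \langle \Gamma, \gamma, \beta, \Lambda \rangle \vdash \pi_2\,ms : Prop \]

\[ \langle \Gamma, \gamma, \beta, \Lambda \rangle \vdash \pi_3\,ms : Prop \]

where $\Gamma = \tau\,spec$, $\gamma = \Gamma\,A$, $\beta = (\pi_1 \circ clss)\,A$, and $\Lambda = md2\Lambda\,(\pi_1\,ms) \cup \{\texttt{self} \mapsto A\}$.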

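Function md2Λ transforms method declarations into a local environment:

\[ md2\Lambda : MD \mapsto LTEnv \]

\[ md2\Lambda\,\langle [\langle ov_1, C_1 \rangle, \ldots, \langle ov_n, C_n \rangle], C \rangle = \{ov_1 \mapsto C_1, \ldots, ov_n \mapsto C_n, \texttt{result} \mapsto C\} \]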

3.3.2 Decidability of the Type System

The second problem is a standard decidability result for a deductive system [1,

108, 109]. We will just sketch the main parts of the proofs.

One of the interesting parts is the side condition to rule TR-eqty: the condi-

tion itself must be decidable:

Lemma 3.1. Given a Clay specification $spec = \langle bnds, clss \rangle$, for every expression $C \in CExpr$, the set $super_{spec}(C)$ is finite.

Proof sketch. We proceed by reductio ad absurdum.

1. Let us assume that there exists C such that $super_{spec}(C)$ is infinite.

2. If $super_{spec}(C)$ is infinite then there exists an infinite inheritance chain S starting at C ($S_1 = C$).

3. By Well-formedness Rule 5 in Section 3.1, there are no $S_i$, $S_j$ ($i \neq j$) such that $top\,S_i = top\,S_j$, so the set $\{top\,C \mid C \in \mathrm{range}\ S\}$ is infinite.

4. By the definition of inheritance chain, where $S_i \lhd_{spec} S_{i+1}$, $\mathrm{range}\ S \subseteq \mathrm{dom}\ clss$, so $\{top\,C \mid C \in \mathrm{dom}\ clss\}$ is also infinite.

5. By Well-formedness Rule 4 in Section 3.1, $\mathrm{dom}\ bnds = \{top\,C \mid C \in \mathrm{dom}\ clss\}$ is infinite, which contradicts Well-formedness Rule 1.

Proposition 3.1. Judgements $E \vdash A <: B$, with $E = \tau\,spec$, reflect the relation $\lhd^{*}_{spec}$ (the reflexive and transitive closure of the immediate inheritance relation $\lhd_{spec}$).

Proof sketch. Rule TR-subty is obtained from suptys, which is itself defined by super, the closure of $\lhd$. The rest of the rules (TR-ref and TR-trans) just reflect the reflexivity and transitivity properties.

Corollary 3.2. The side condition to rule TR-eqty is decidable.

Proof sketch. By Proposition 3.1 and Lemma 3.1, the search of types A′ and B′

runs on a finite set of types and the decision of the existence of the minimum

supertype is effective.

Theorem 3.1 (Decidability of type checking). Given a typing environment $E = \langle \Gamma, \gamma, \beta, \Lambda \rangle$, object types A and B, message type M, object expression o and message

identifier mi, it is decidable whether a proof for a judgement exists.

Proof sketch. We will show that for any kind of judgement (see Figure 3.2), a fi-

nite derivation tree can be constructed. We will proceed, mainly, by structural

induction on the consequent of the typing rules. For those typing rules that are

not structural we will apply some common techniques.

Case $E \vdash A$. All affected rules (TR-wfot, TR-wfmeta, TR-wftv) are base cases that require searches on finite domains.

Case $E \vdash o : A$. Rules TR-ref, TR-cety, and TR-sndty are structural. Non structural rules are:

Rule TR-ovty.

\[ \frac{E \vdash A \qquad \Lambda(ov) = A}{E \vdash ov : A} \quad \text{[TR-ovty]} \]

Its non structural part, the last antecedent, is decidable since local typing environments are finite.

Case $E \vdash A <: B$. All affected rules are non structural:

Rule TR-subty.

\[ \frac{E \vdash A \qquad E \vdash B \qquad B \in suptys\,(\Gamma\,A)}{E \vdash A <: B} \quad \text{[TR-subty]} \]

Its non structural part, the last antecedent, is decidable by Lemma 3.1 and Proposition 3.1.

Rule TR-subtv.

\[ \frac{E \vdash A \qquad E \vdash cv \qquad A \in \beta\,cv}{E \vdash cv <: A} \quad \text{[TR-subtv]} \]

Its non structural part, the third antecedent, is decidable since the set of bounds of a class variable is finite.

Rules TR-trans and TR-ssump. By Proposition 3.1 and Lemma 3.1 we have an effective decision procedure that consists in generating the finite set of subtypes of the right hand side of $A <: B$ and checking that the left hand side is in such a set.

Case $E \vdash M$. The only affected rule, TR-wfmt, is structural.

Case $\gamma \vdash mi : M$. All the affected rules are non structural:

Rule TR-ssumpmt.

\[ \frac{\Gamma\,A \vdash mi : M \qquad E \vdash M <: N}{\Gamma\,A \vdash mi : N} \quad \text{[TR-ssumpmt]} \]

If $N = A_1 \ldots A_n \to A$, by Proposition 3.1 and Lemma 3.1 we have an effective decision procedure that consists in generating the finite set of subtypes of A and the finite sets of supertypes of every $A_i$, and combining all of them to calculate any possible M.

Rule TR-mity.

\[ \frac{E \vdash B_1 \quad \ldots \quad E \vdash B_n \qquad E \vdash B \qquad mtenv\,(\Gamma\,A)(mi) = B_1 \ldots B_n \to B}{\Gamma\,A \vdash mi : B_1 \ldots B_n \to B} \quad \text{[TR-mity]} \]

Its non structural part, the last antecedent, is decidable since method typing environments are finite.

Case $E \vdash M <: N$. The only affected rule, TR-submt, is structural.

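Case $E \vdash F : Prop$. Rules TR-⊤ty, TR-⊥ty, TR-¬ty, TR-∧ty, TR-∨ty, TR-⇒ty, TR-⇔ty, TR-∀ty, and TR-∃ty are structural. Non structural rules are:

Rule TR-eqty.

\[ \frac{E \vdash o_1 : A_1 \qquad E \vdash o_2 : A_2 \qquad E \vdash A_1 <: B \qquad E \vdash A_2 <: B}{E \vdash o_1 =_B o_2 : Prop} \quad MinTys \quad \text{[TR-eqty]} \]

By Corollary 3.2, the side condition of this rule is decidable. By Lemma 3.1, exploring all subtypes of B is an effective procedure.

Rule TR-insty.

\[ \frac{E \vdash o : B \qquad E \vdash A <: B}{E \vdash o : A : Prop} \quad \text{[TR-insty]} \]

Decidable by Lemma 3.1.

3.3.3 A Typing Example

Example 3.6. Let us prove that type Prop can be assigned to the postcondition of method set in the specification of class Cell. Assume a specification spec containing, at least, definitions for classes Bool, True, False, Nat and Cell. Let $\Gamma = \tau\,spec$. Let $\gamma = \Gamma\,\texttt{Cell}$. Thus, what we want is to prove

\[ \langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.isEmpty} : \texttt{False} \wedge \texttt{result.contents} = v : Prop \]

The proof is in Figure 3.3.

a. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{Bool}$   (TR-wfot)
b. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{False}$   (TR-wfot)
c. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{False} <: \texttt{Bool}$   (TR-subty)
d. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{Nat}$   (TR-wfot)
e. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{Cell}$   (TR-wfot)
f. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result} : \texttt{Cell}$   (TR-ovty)
g. $\gamma \vdash \texttt{isEmpty} : [\,] \to \texttt{Bool}$   (TR-mity)
h. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.isEmpty} : \texttt{Bool}$   (f, g, TR-sndty)
i. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.isEmpty} : \texttt{False} : Prop$   (h, c, TR-insty)
j. $\gamma \vdash \texttt{contents} : [\,] \to \texttt{Nat}$   (TR-mity)
k. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.contents} : \texttt{Nat}$   (j, TR-sndty)
l. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash v : \texttt{Nat}$   (TR-ovty)
m. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{Nat} <: \texttt{Nat}$   (d, TR-ref)
n. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.contents} =_{\texttt{Nat}} v : Prop$   (k, l, m, TR-eqty)
o. $\langle \Gamma, \gamma, \emptyset, \{v \mapsto \texttt{Nat}\} \rangle \vdash \texttt{result.isEmpty} : \texttt{False} \wedge \texttt{result.contents} =_{\texttt{Nat}} v : Prop$   (i, n, TR-∧ty)

Figure 3.3: Deriving the type of a postcondition.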

4 A Dynamic Semantics Based on First-Order Logic

Abstract

This chapter presents the formal semantics of Clay. This semantics is given via an interlingua-based translation. The interlingua is a first-order language of subsorted first-order logic. The sorts of the first-order language represent Clay concepts. Clay specifications are specifically translated into axioms and constant symbols of the first-order language. We have used the concrete syntax of Prover9/Mace4 as the target language.

4.1 Methodology

A formal semantics is needed both to understand and write specifications unambiguously and to develop formal method tools for Clay. The main objectives of the semantics of Clay are:

• To give a formal and unambiguous meaning to Clay specifications.

• To mechanise reasoning about specifications.


One is tempted to use or to design an extremely powerful (in the sense of expres-

siveness) logic, but a very relevant aspect has to be taken into account: Clay is

a formal specification language and, as such, its specifications should be under-

stood by different kinds of stakeholders, from customers to developers. There is

an important difference between the semantics one needs to supply for a specification language and that for a programming language. The latter is designed for

experts: i.e. (advanced) programmers or automatic tools for safe program manip-

ulation. On the other hand, a specification language needs to be equipped with

a more intuitive and natural semantics because its specifications are going to be

read by non-experts, i.e. customers and ordinary developers.

It is often taught in introductory courses classes in Philosophy,

Mathematics, and Computer Science that logic is the universal lan-

guage of reasoning and rigorous representation of knowledge. This is

not unfounded: for example, the entire body of mathematics can be

formalised in classical first-order predicate logic (FOL).

[28]

Using first-order logic to give a formal semantics to Clay is, for us, both a need

and a challenge.

Interlingua-based Semantics

The methodology we have used to give a formal semantics to Clay is named interlingua-based translation. In [8], Van Baalen and Fikes describe a method for providing a declarative semantics for a new language in terms of its translation into a target language (the interlingua). To apply the method, the interlingua must have a declarative semantics and must include logical entailment and a set of top-level forms, and the translation must satisfy the following definition.

Definition 4.1 (Interlingua-based semantics). Let $L$ be a language and $L_i$ be an interlingua language with a formally defined declarative semantics. Let $\mathit{TRANS}_{L,L_i}$ be a binary relation between top-level forms of $L$ and top-level forms of $L_i$, and $BT_L$ be a set of top-level forms in $L_i$. The pair $\langle \mathit{TRANS}_{L,L_i}, BT_L \rangle$ is called an $L_i$-based semantics for $L$ when for every set $T_L$ of top-level forms in $L$, there is a set $T_{L_i}$ of top-level forms in $L_i$ such that

$$\forall s \in T_L\ (\exists i \in T_{L_i}\ (\mathit{TRANS}_{L,L_i}(s, i)))$$

$$\forall i \in T_{L_i}\ (\exists s \in T_L\ (\mathit{TRANS}_{L,L_i}(s, i)))$$

and the theory of $T_{L_i} \cup BT_L$ is equivalent to the theory represented by $T_L$.

Under this definition,

• $L$ is Clay,

• the interlingua language $L_i$ is first-order logic, and

• $BT_L$ is the set of axioms that define the semantics of Clay, which we call the Clay Theory.

When $\langle \mathit{TRANS}_{L,L_i}, BT_L \rangle$ is used to define the semantics of $L$, as it is in our case, the theory represented by $T_L$ is equivalent to the theory of $T_{L_i} \cup BT_L$ by definition.
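The two coverage conditions of Definition 4.1 can be checked mechanically when the sets involved are finite. The following Python sketch is only an illustration under that assumption: the function name `covers` and the toy forms are hypothetical, and the third condition (equivalence of the theories) is not modelled at all.

```python
def covers(t_l, t_li, trans):
    """Check the two coverage conditions of Definition 4.1 on finite sets.

    t_l   -- set of top-level forms of the source language L
    t_li  -- set of top-level forms of the interlingua L_i
    trans -- set of pairs (s, i): a finite stand-in for TRANS_{L,L_i}
    """
    # every source form s has at least one translation i in t_li
    every_source_translated = all(
        any((s, i) in trans for i in t_li) for s in t_l)
    # every interlingua form i is the translation of some s in t_l
    every_target_justified = all(
        any((s, i) in trans for s in t_l) for i in t_li)
    return every_source_translated and every_target_justified


# Hypothetical toy data: two source forms and their translations.
trans = {
    ("class Cell", "clsidS(cCell)."),
    ("class ReCell", "clsidS(cReCell)."),
}
t_l = {"class Cell", "class ReCell"}
good_t_li = {i for (_, i) in trans}
bad_t_li = good_t_li | {"clsidS(cGhost)."}  # no source form produces this
```

With this data, `covers(t_l, good_t_li, trans)` holds, while `bad_t_li` fails the second condition because one interlingua form has no source form behind it.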

In order to make the definition of the semantics of Clay easier to understand, we introduce an intermediate language in the translation process. The result is a description in three steps:

1. The definition of the Clay theory in a subsorted first-order logic that we call OOFOL (object-oriented first-order logic). The characteristics of the logic and the details of the first-order language used are given in Section 4.2. The Clay theory is then introduced and explained in Section 4.3.

2. The translation of Clay specifications, in the form of abstract syntax trees, into axioms of OOFOL (Section 4.4).

3. The encoding of OOFOL into untyped first-order logic following the guidelines in [45]. The encoding is explained piecemeal throughout Sections 4.2 and 4.3. We have used the syntax of Prover9/Mace4 as a concrete syntax for first-order logic.


4.2 The Logic of Clay

To present the interlingua we follow Gallier’s notation and style [50]. We start with the alphabet of our object-oriented first-order logic (OOFOL), a subsorted first-order logic:

• S is the set of sorts,

• FS is the set of function symbols,

• PS is the set of predicate symbols, and

• r is the rank function.

Our font face convention for symbols, terms and formulae of the first-order lan-

guage is bold face. For example, clsidS is a sort symbol, add can be a function

symbol, _self, _result, or X can be variables, and pre is a predicate symbol.

4.2.1 Sorts and Subsorts

The set of sorts that we have designed is:

S = {clsidS,msgidS,clsS,objS,msgS, clslstS,objlstS}∪ {anyS,boolS}

Intended Meaning

We are not using the sorts in the logic to reflect the classes of Clay but to group its syntactical categories. Class expressions, for instance, are encoded as terms of sort clsS. We will discuss this decision in Section 4.6. For the moment, let us describe every sort:

• clsidS groups constants that reflect class identifiers in Clay.

• msgidS groups constants that reflect message identifiers in Clay.

• clsS groups terms that represent classes in Clay.

• objS groups terms that represent objects in Clay. To reflect that every class

is an object we establish that sort clsS is a subsort of objS.

• clslstS groups terms that represent lists of classes.

• objlstS groups terms that represent lists of objects.

• boolS is not properly a sort and we use it to characterise the syntactical

family of formulae.

• anyS is the sort that groups all the universe and we are using it just to type

the equality predicate of the first-order logic language.



The reader can see that the subsorted part of the logic is used just to support the

design decision that establishes that classes are objects in Clay.

Encoding in Prover9/Mace4

To encode OOFOL (a subsorted first-order logic) in untyped first-order logic we

follow the standard procedure described in [45]. The essential idea to convert

a many-sorted language into a one-sorted language is to add domain predicate

symbols Ds, one for each sort s, and to modify quantified formulae recursively as

follows:

Every formula $A$ of the form $\forall_s x (B)$ (or $\exists_s x (B)$) is converted to the formula $A' = \forall x (D_s(x) \Rightarrow B')$, where $B'$ is the result of converting $B$.

This description of the encoding is slightly imprecise: read literally, it suggests that the translation of $\exists_s x (B)$ would be $A' = \exists x (D_s(x) \Rightarrow B')$. The meaning of that formula, however, is far from the meaning of $\exists_s x (B)$, since it already holds as soon as some element lies outside sort $s$. The proper transformation, and the one we have applied in this work, is $A' = \exists x (D_s(x) \wedge B')$.
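The relativisation of sorted quantifiers described above can be sketched in a few lines of Python. This is a minimal illustration, not the translator used in this work: the classes `Atom` and `Quant` and the function `relativise` are hypothetical names, atomic formulae are kept as opaque text, and the output uses the `all` and `exists` quantifier keywords of Prover9 with every implication and conjunction explicitly parenthesised.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    text: str          # an atomic formula, kept as opaque text


@dataclass(frozen=True)
class Quant:
    kind: str          # "all" or "exists"
    sort: str          # sort of the bound variable, e.g. "objS"
    var: str
    body: "Formula"


Formula = Union[Atom, Quant]


def relativise(f: Formula) -> str:
    """Turn sorted quantifiers into unsorted ones guarded by sort predicates."""
    if isinstance(f, Atom):
        return f.text
    body = relativise(f.body)
    guard = f"{f.sort}({f.var})"
    if f.kind == "all":
        return f"(all {f.var} ({guard} -> {body}))"      # guard implies body
    if f.kind == "exists":
        return f"(exists {f.var} ({guard} & {body}))"    # guard and body
    raise ValueError(f"unknown quantifier: {f.kind}")


# Why the existential case needs a conjunction: in a domain where no
# element has the sort, "exists x (D(x) -> B(x))" is vacuously true.
domain = ["o1", "o2"]
has_sort = lambda x: False
b = lambda x: False
with_implication = any((not has_sort(x)) or b(x) for x in domain)
with_conjunction = any(has_sort(x) and b(x) for x in domain)
```

In the small finite check at the end, `with_implication` is true and `with_conjunction` is false, which is the difference between the literal reading of the rule and the transformation actually applied.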

The concrete syntax for the untyped first-order language is the syntax for for-

mulae in Prover9/Mace4. The most relevant aspects of this syntax are:

• Variable symbols start with upper case (we activate the Prover9/Mace4 flag

set(prolog_style_variables)).

• Function symbols start with lowercase.

• Free variables are considered universally quantified.

• Precedence and associativity of the logical symbols are the usual ones. From lower to higher precedence: ⇔ (equivalence, infix operator, non-associative), ⇒ (implication, infix operator, right associative), ⇐ (backward implication, infix operator, left associative), ∨ (disjunction, infix operator, right associative), ∧ (conjunction, infix operator, right associative), and ¬ (negation, prefix operator).

Let us go into the initial details of our encoding in Prover9/Mace4. We start by establishing that sorts have non-empty carriers and that they are disjoint, except for clsS and objS (clsS is a subsort of objS):


∃ Cid clsidS(Cid).
∃ Mid msgidS(Mid).
∃ O objS(O).
∃ C clsS(C).
∃ M msgS(M).
∃ CL clslstS(CL).
∃ OL objlstS(OL).

Axioms 4.1: Non-empty sorts.

clsidS(X) ∧ msgidS(Y) ⇒ X ≠ Y.
clsidS(X) ∧ clsS(Y) ⇒ X ≠ Y.
clsidS(X) ∧ objS(Y) ⇒ X ≠ Y.
clsidS(X) ∧ msgS(Y) ⇒ X ≠ Y.
clsidS(X) ∧ clslstS(Y) ⇒ X ≠ Y.
clsidS(X) ∧ objlstS(Y) ⇒ X ≠ Y.
msgidS(X) ∧ clsS(Y) ⇒ X ≠ Y.
msgidS(X) ∧ objS(Y) ⇒ X ≠ Y.
msgidS(X) ∧ msgS(Y) ⇒ X ≠ Y.
msgidS(X) ∧ clslstS(Y) ⇒ X ≠ Y.
msgidS(X) ∧ objlstS(Y) ⇒ X ≠ Y.
clsS(X) ∧ msgS(Y) ⇒ X ≠ Y.
clsS(X) ∧ clslstS(Y) ⇒ X ≠ Y.
clsS(X) ∧ objlstS(Y) ⇒ X ≠ Y.
objS(X) ∧ msgS(Y) ⇒ X ≠ Y.
objS(X) ∧ clslstS(Y) ⇒ X ≠ Y.
objS(X) ∧ objlstS(Y) ⇒ X ≠ Y.
msgS(X) ∧ clslstS(Y) ⇒ X ≠ Y.
msgS(X) ∧ objlstS(Y) ⇒ X ≠ Y.

Axioms 4.2: Disjoint sorts.

clsS(C) ⇒ objS(C).

Axioms 4.3: Classes are objects.

Any element of the universe belongs to sort anyS:

anyS(X).

Axioms 4.4: The sort of any groups all terms.
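Axioms 4.1 and 4.2 are regular enough to be generated rather than typed by hand. The following Python sketch is one possible way to do so and is not part of the Clay tools; the names `SORTS`, `NOT_DISJOINT`, `non_empty_axioms` and `disjointness_axioms` are hypothetical. It emits Prover9-style text (`&`, `->`, `!=`) and leaves out the two pairs that Axioms 4.2 do not list: clsS/objS, and clslstS/objlstS, which share the empty list [] in the encoding.

```python
from itertools import combinations

SORTS = ["clsidS", "msgidS", "clsS", "objS", "msgS", "clslstS", "objlstS"]

# Pairs of sorts that Axioms 4.2 do not declare disjoint.
NOT_DISJOINT = {
    frozenset({"clsS", "objS"}),        # clsS is a subsort of objS
    frozenset({"clslstS", "objlstS"}),  # both contain the empty list
}


def non_empty_axioms():
    """One existence axiom per sort (Axioms 4.1)."""
    return [f"exists X {s}(X)." for s in SORTS]


def disjointness_axioms():
    """One axiom per pair of sorts that must not overlap (Axioms 4.2)."""
    return [
        f"{a}(X) & {b}(Y) -> X != Y."
        for a, b in combinations(SORTS, 2)
        if frozenset({a, b}) not in NOT_DISJOINT
    ]
```

Generating the axioms this way makes it harder to forget a pair when a new sort is added to `SORTS`.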

4.2.2 Function Symbols

The set of function symbols is:

FS = {cMetaClass,send,msg, cls,$nilc,$consc,$nilo,$conso}

The rank function on function symbols is defined in Figure 4.1.



r : FS∪PS → S∗×S

r(cMetaClass) = (ε,clsidS)

r(send) = (objS msgS,objS)

r(msg) = (msgidS objlstS,msgS)

r(cls) = (clsidS clslstS,clsS)

r($nilc) = (ε, clslstS)

r($consc) = (clsS clslstS, clslstS)

r($nilo) = (ε,objlstS)

r($conso) = (objS objlstS,objlstS)

Figure 4.1: Rank function for function symbols.

To improve the readability of terms that represent lists we follow the Prover9/Mace4 syntax for lists (a Prolog-like syntax): [] (the empty list $nil), [a,b,c] (a list with three elements, $cons(a,$cons(b,$cons(c,$nil)))) and [a:b] (the list $cons(a,b), where a is the first element and b is the rest).

Intended Meaning

A Clay specification will not introduce any function symbols other than class identifiers and message identifiers, with ranks (ε,clsidS) and (ε,msgidS) respectively (such as Nat and add in Clay). With these pieces we build the following language:

• Classes are terms of the form cls(cid,cs) such as cls(cMetaClass,[]),

cls(cNat,[]) and cls(cList,[ cls(cNat,[])]) representing cMetaClass, Nat and

List<Nat>, respectively.

• Messages are terms of the form msg(mid,os) like msg(mkZero,[]) or

msg(add,[m]) representing mkZero and addNat(n), respectively, where m

is the representation of n.

• Objects are terms of the form send(o,m)¹ like cls(Nat,[])←msg(mkZero,[]) or n←msg(add,[m]) representing Nat.mkZero and n.addNat(m), respectively.

¹ For the function symbol send we have introduced the infix version _←_, so we will write o←m in the logic.
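The term language built from cls, msg and send can be pictured as three immutable record types. The Python sketch below is an assumption-laden illustration: the class names `Cls`, `Msg` and `Send` and the helper `show` are hypothetical, and Python tuples stand in for the lists of the logic. Because frozen dataclasses compare structurally, two `Cls` or `Msg` values are equal exactly when their components are equal, which mirrors the requirement that these constructors be injections.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cls:
    cid: str                         # class identifier (sort clsidS)
    args: Tuple["Cls", ...] = ()     # class arguments (sort clslstS)


@dataclass(frozen=True)
class Msg:
    mid: str                         # message identifier (sort msgidS)
    args: Tuple[object, ...] = ()    # argument objects (sort objlstS)


@dataclass(frozen=True)
class Send:
    receiver: object                 # an object term: a Cls or another Send
    msg: Msg


def show(t) -> str:
    """Render a term using the list syntax of the text."""
    if isinstance(t, Cls):
        return f"cls({t.cid},[{','.join(show(a) for a in t.args)}])"
    if isinstance(t, Msg):
        return f"msg({t.mid},[{','.join(show(a) for a in t.args)}])"
    if isinstance(t, Send):
        return f"send({show(t.receiver)},{show(t.msg)})"
    raise TypeError(f"not a term: {t!r}")


nat = Cls("cNat")
list_of_nat = Cls("cList", (nat,))
zero = Send(nat, Msg("mkZero"))
```

Here `show(list_of_nat)` yields the term cls(cList,[cls(cNat,[])]) from the first bullet above, and `zero` corresponds to the expression Nat.mkZero.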



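All function symbols except send represent injections; therefore the following axioms are part of the theory:

$$\forall_{\mathbf{clsidS}}\,\mathit{Cid1}\ \forall_{\mathbf{clsidS}}\,\mathit{Cid2}\ \forall_{\mathbf{clslstS}}\,\mathit{CL1}\ \forall_{\mathbf{clslstS}}\,\mathit{CL2}\ (\mathbf{cls}(\mathit{Cid1},\mathit{CL1}) = \mathbf{cls}(\mathit{Cid2},\mathit{CL2}) \Rightarrow \mathit{Cid1} = \mathit{Cid2} \wedge \mathit{CL1} = \mathit{CL2})$$

$$\forall_{\mathbf{msgidS}}\,\mathit{Mid1}\ \forall_{\mathbf{msgidS}}\,\mathit{Mid2}\ \forall_{\mathbf{objlstS}}\,\mathit{OL1}\ \forall_{\mathbf{objlstS}}\,\mathit{OL2}\ (\mathbf{msg}(\mathit{Mid1},\mathit{OL1}) = \mathbf{msg}(\mathit{Mid2},\mathit{OL2}) \Rightarrow \mathit{Mid1} = \mathit{Mid2} \wedge \mathit{OL1} = \mathit{OL2})$$

$$\forall_{\mathbf{objS}}\,\mathit{O1}\ \forall_{\mathbf{objS}}\,\mathit{O2}\ \forall_{\mathbf{objlstS}}\,\mathit{OL1}\ \forall_{\mathbf{objlstS}}\,\mathit{OL2}\ (\mathbf{\$cons}_o(\mathit{O1},\mathit{OL1}) = \mathbf{\$cons}_o(\mathit{O2},\mathit{OL2}) \Rightarrow \mathit{O1} = \mathit{O2} \wedge \mathit{OL1} = \mathit{OL2})$$

$$\forall_{\mathbf{clsS}}\,\mathit{C1}\ \forall_{\mathbf{clsS}}\,\mathit{C2}\ \forall_{\mathbf{clslstS}}\,\mathit{CL1}\ \forall_{\mathbf{clslstS}}\,\mathit{CL2}\ (\mathbf{\$cons}_c(\mathit{C1},\mathit{CL1}) = \mathbf{\$cons}_c(\mathit{C2},\mathit{CL2}) \Rightarrow \mathit{C1} = \mathit{C2} \wedge \mathit{CL1} = \mathit{CL2})$$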


The following axioms establish the rank of every function symbol, including the sorts of the lists:

clsidS(cMetaClass).

objS(O) ∧ msgS(M) ⇒ objS(O ← M).

msgidS(Mid) ∧ objlstS(OL) ⇒ msgS(msg(Mid,OL)).

clsidS(Cid) ∧ clslstS(CL) ⇒ clsS(cls(Cid,CL)).

Axioms 4.5: Rank of function symbols.

objlstS([]).
clslstS([]).

objS(O) ∧ objlstS(OL) ⇒ objlstS([O:OL]).
clsS(C) ∧ clslstS(OL) ⇒ clslstS([C:OL]).

Axioms 4.6: Rank of function symbols that represent lists.

The injection property for every symbol is encoded as follows:

clsidS(Cid1) ∧ clslstS(CL1) ∧ clsidS(Cid2) ∧ clslstS(CL2) ⇒
    cls(Cid1,CL1) = cls(Cid2,CL2) ⇒ Cid1 = Cid2 ∧ CL1 = CL2.

msgidS(Mid1) ∧ objlstS(OL1) ∧ msgidS(Mid2) ∧ objlstS(OL2) ⇒
    msg(Mid1,OL1) = msg(Mid2,OL2) ⇒ Mid1 = Mid2 ∧ OL1 = OL2.

objS(O1) ∧ objlstS(Os1) ∧ objS(O2) ∧ objlstS(Os2) ⇒
    [O1:Os1] = [O2:Os2] ⇒ O1 = O2 ∧ Os1 = Os2.

clsS(C1) ∧ clslstS(Cs1) ∧ clsS(C2) ∧ clslstS(Cs2) ⇒
    [C1:Cs1] = [C2:Cs2] ⇒ C1 = C2 ∧ Cs1 = Cs2.

Axioms 4.7: Function symbols are injections.

4.2.3 Predicate Symbols

The set of predicate symbols is:

PS = {⊤,⊥,_=_}

∪ {subclass, instanceof,eq,eqs,pre,post}

∪ {instancesof}

The rank function on predicate symbols is defined in Figure 4.2.

r : FS∪PS → S∗×S

r(⊤) = (ε,boolS)

r(⊥) = (ε,boolS)

r(_=_) = (anyS anyS,boolS)

r(subclass) = (clsS clsS,boolS)

r(instanceof) = (objS clsS,boolS)

r(eq) = (clsS objS objS,boolS)

r(eqs) = (clsS objS objS,boolS)

r(pre) = (msgidS clsS clslstS clsS objS objlstS,boolS)

r(post) = (msgidS clsS clslstS clsS objS objlstS objS,boolS)

r(instancesof) = (objlstS clslstS,boolS)

Figure 4.2: Rank function for predicate symbols.

Intended Meaning

A Clay specification will not introduce any other predicate symbol. Let us explain

each one:



• ⊤, ⊥, _=_ are the standard symbols that represent truth, falsehood and equality in first-order logic.

• subclass is the subtyping relation (<:).

• instanceof is the relation that establishes when an object is an instance of a class (:).

• eq is the ternary equality predicate in Clay: eq(C,O1,O2) establishes that objects O1 and O2 are not distinguishable under the properties of class C.

• eqs is a ternary predicate that captures the structural equivalence between two objects. In Section 4.4 we will show how facts of this predicate are generated for every class. The idea is to check that both objects match the same case class and that they are recursively equivalent. Predicate eq is defined in terms of eqs; its definition is given in Section 4.3.3.

• pre is the predicate that contains the definition of the preconditions in the specifications: pre(M,C,Cs,T,S,Os) is the precondition of method M in class C, for parameters of classes Cs and a result of class T, where S is the recipient of the message and Os are the arguments.

• post is the predicate that contains the definition of the postconditions in the specifications: post(M,C,Cs,T,S,Os,R) is the postcondition of method M in class C, for parameters of classes Cs and a result of class T, where S is the recipient of the message, Os are the arguments and R is an instance of T that fulfils the postcondition.

• instancesof is the result of lifting the predicate instanceof to lists.


According to Enderton [45], the following axioms are not needed in the theory.

Nevertheless they are consistent with the rest of the theory and we have found

them very convenient for capturing mistakes while transcribing the theory:

subclass(A, B) ⇒ clsS(A) ∧ clsS(B).

instanceof(O, C) ⇒ clsS(C) ∧ objS(O).

eq(C, O1, O2) ⇒ clsS(C) ∧ objS(O1) ∧ objS(O2).

eqs(C, O1, O2) ⇒ clsS(C) ∧ objS(O1) ∧ objS(O2).

pre(Mid, C, CL, RC, O, OL) ⇒
    msgidS(Mid) ∧ clsS(C) ∧ clslstS(CL) ∧ clsS(RC) ∧ objS(O) ∧ objlstS(OL).

post(Mid, C, CL, RC, O, OL, RO) ⇒
    msgidS(Mid) ∧ clsS(C) ∧ clslstS(CL) ∧ clsS(RC) ∧ objS(O) ∧ objlstS(OL) ∧ objS(RO).

instancesof(OL,CL) ⇒ objlstS(OL) ∧ clslstS(CL).

Axioms 4.8: Rank of predicate symbols.

4.3 The Clay Theory

Up to now, we have just introduced the pieces that capture the way in which Clay concepts in particular, and object-oriented concepts in general, are represented in first-order logic. In this section we give the semantics of Clay in the form of axioms written directly in the untyped first-order logic. We present the axioms of the theory in a literate style [78], with a description preceding the formulae, and following some of the recommendations of [81] for laying them out.

4.3.1 Instanceof

The translation process of Clay specifications into the logic introduces the ax-

ioms that capture the type information of the object expressions with the pred-

icate instanceof. In particular, translation functions in Sections 4.4.4 and 4.4.5

reveal how expressions such as o←m are instances of the classes declared for the

message m.

In the theory, the predicate instancesof checks if objects in the first argument

are instances of classes in the second argument, one by one:

instancesof([],[]).

objS(O) ⇒ objlstS(OL) ⇒ clsS(C) ⇒ clslstS(CL) ⇒ (
    instancesof([O:OL],[C:CL]) ⇔ instanceof(O,C) ∧ instancesof(OL,CL)
).

Axioms 4.9: Definition of instancesof.
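Read operationally, Axioms 4.9 describe a recursion over two lists in lockstep. The Python sketch below follows that reading; the function takes the binary test as a parameter, and the fact table `FACTS` is hypothetical. The axioms say nothing about lists of different lengths, so returning false in that case is an assumption of the sketch (a closed-world reading), not a consequence of the theory.

```python
def instancesof(objs, classes, instanceof):
    """Lift a binary `instanceof` test to two lists, element by element."""
    if not objs and not classes:
        return True    # instancesof([],[]).
    if not objs or not classes:
        return False   # assumption: lists of different length never match
    # head of each list must be related, then recurse on the tails
    return (instanceof(objs[0], classes[0])
            and instancesof(objs[1:], classes[1:], instanceof))


# Hypothetical facts: which object is an instance of which class.
FACTS = {("zero", "Nat"), ("true", "Bool")}
is_instance = lambda o, c: (o, c) in FACTS
```

For example, `instancesof(["zero", "true"], ["Nat", "Bool"], is_instance)` succeeds, while replacing `"Bool"` by `"Nat"` makes it fail on the second element.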



4.3.2 Subtyping


The translation process of Clay specifications into the logic introduces the axioms that capture the subtype information of the class expressions with the predicate subclass. In particular, the translation function in Section 4.4.3 derives the superclasses of a class from its class declaration.

In the theory, the following axioms capture the main properties of the predi-

cate subclass and its relation with instanceof:

• Subclass is reflexive:

clsS(A) ⇒(

subclass(A, A)).

Axioms 4.10: subclass is reflexive.

• Subclass is transitive:

clsS(A) ⇒ clsS(B) ⇒ clsS(C) ⇒(

subclass(A, B) ∧ subclass(B, C) ⇒ subclass(A, C)).

Axioms 4.11: subclass is transitive.

• The subsumption property establishes that any instance of a class is an instance of its superclasses:

objS(O) ⇒ clsS(A) ⇒ clsS(B) ⇒(

instanceof(O, A) ∧ subclass(A, B) ⇒ instanceof(O, B)).

Axioms 4.12: Subsumption.
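On a finite class hierarchy, the consequences of Axioms 4.10 to 4.12 can be computed directly. The Python sketch below does so under two stated assumptions: it takes the smallest relation satisfying reflexivity and transitivity (the axioms themselves only require some such relation), and the hierarchy ReCell <: Cell <: Object, including the class name Object, is a hypothetical example.

```python
def subclass_closure(classes, declared):
    """Smallest reflexive and transitive relation containing `declared`."""
    rel = set(declared) | {(c, c) for c in classes}   # reflexivity
    changed = True
    while changed:                                     # transitivity, to a fixpoint
        changed = False
        for (a, b) in list(rel):
            for (b2, c) in list(rel):
                if b == b2 and (a, c) not in rel:
                    rel.add((a, c))
                    changed = True
    return rel


def instances_by_subsumption(direct, subclass):
    """Propagate instanceof facts upwards along `subclass`."""
    return {(o, b) for (o, a) in direct for (a2, b) in subclass if a == a2}


classes = {"ReCell", "Cell", "Object"}
declared = {("ReCell", "Cell"), ("Cell", "Object")}
rel = subclass_closure(classes, declared)
inst = instances_by_subsumption({("r", "ReCell")}, rel)
```

After running it, `rel` contains the derived pair ("ReCell", "Object"), and the single declared fact that r is a ReCell yields r as an instance of all three classes.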

4.3.3 Equality


The translation process of Clay specifications into the logic introduces the axioms that capture the structural equality for classes. Section 4.4.4 shows the translation of case classes into axioms for the predicate eqs that reflect the main properties of algebraic types.

In the theory, the definition of the equality predicate eq is given on top of eqs. Two objects are equal under a given class if

• both are instances of the class,

• both are structurally equivalent (below we will talk about the structural equality), and

• both are equal under every superclass of the class.

Let us show the axiom that captures this information:

objS(X) ⇒ objS(Y) ⇒ clsS(C) ⇒ (
    eq(C, X, Y) ⇔
        instanceof(X, C) ∧ instanceof(Y, C)
        ∧ eqs(C, X, Y)
        ∧ ∀ D (clsS(D) ⇒ (subclass(C, D) ⇒ eq(D, X, Y)))
).

Axioms 4.13: Clay’s equality.
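The shape of Axioms 4.13 can be explored on a small finite model. The Python sketch below is only an approximation of the axiom: since subclass is reflexive, the quantified clause also covers D = C, where it is trivially satisfied, so the sketch recurses over strict superclasses only and assumes the hierarchy is acyclic. The names `make_eq`, `SUPERS` and the triple encoding of objects are hypothetical; the classes Nat and ColoredNat are used purely as an example.

```python
def make_eq(instanceof, eqs, superclasses):
    """Build eq(c, x, y) following the shape of Axioms 4.13.

    instanceof(x, c) and eqs(c, x, y) are caller-supplied predicates;
    superclasses(c) returns the strict superclasses of c (assumed acyclic).
    """
    def eq(c, x, y):
        return (instanceof(x, c) and instanceof(y, c)
                and eqs(c, x, y)
                and all(eq(d, x, y) for d in superclasses(c)))
    return eq


# Hypothetical model: objects are (class, value, colour) triples.
SUPERS = {"Nat": [], "ColoredNat": ["Nat"]}


def instanceof(o, c):
    return o[0] == c or c in SUPERS[o[0]]


def eqs(c, a, b):
    # structural comparison of the field attributed here to class c
    return a[1] == b[1] if c == "Nat" else a[2] == b[2]


eq = make_eq(instanceof, eqs, lambda c: SUPERS[c])
red3 = ("ColoredNat", 3, "red")
blue3 = ("ColoredNat", 3, "blue")
plain3 = ("Nat", 3, None)
```

In this model `red3` and `blue3` are equal under Nat but not under ColoredNat, which illustrates the reading of eq(C,O1,O2) as "not distinguishable under the properties of class C".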

The theory also includes the axioms of any equality predicate:

• Eq is reflexive:

objS(X) ⇒ clsS(C) ⇒(

instanceof(X, C) ⇒ eq(C, X, X)).

Axioms 4.14: Equality is reflexive.

• Eq is symmetric:

objS(X) ⇒ objS(Y) ⇒ clsS(C) ⇒ (
    instanceof(X, C) ∧ instanceof(Y, C) ∧ eq(C, X, Y) ⇒ eq(C, Y, X)
).

Axioms 4.15: Equality is symmetric.

• Eq is transitive:

objS(X) ⇒ objS(Y) ⇒ objS(Z) ⇒ clsS(C) ⇒ (
    instanceof(X, C) ∧ instanceof(Y, C) ∧ instanceof(Z, C)
    ∧ eq(C, X, Y) ∧ eq(C, Y, Z)
    ⇒ eq(C, X, Z)
).

Axioms 4.16: Equality is transitive.



Although we do not use the equality of the untyped first-order logic, we inform our system that two terms that represent objects are the same element of the carrier if they cannot be distinguished with eq for any class:

objS(X) ⇒ objS(Y) ⇒ (
    X = Y ⇔ (∀ C (clsS(C) ⇒ ((instanceof(X, C) ⇔ instanceof(Y, C))
                              ∧ (instanceof(X, C) ⇒ eq(C, X, Y)))))
).

Axioms 4.17: Clay’s equality in first-order logic.

4.3.4 Pre- and Post-conditions

Together with the axioms of equality, the following axiom is perhaps the most important fact of our theory. It states that if the precondition holds for an expression o←m then the postcondition holds for that expression:

msgidS(Mid) ⇒ clsS(C) ⇒ clslstS(CL) ⇒ clsS(RC) ⇒ objS(O) ⇒ objlstS(OL) ⇒ (
    pre(Mid, C, CL, RC, O, OL) ⇒ post(Mid, C, CL, RC, O, OL, O ← msg(Mid,OL))
).

Axioms 4.18: Semantics of pre- and post-conditions.

The translation process of Clay specifications into the logic introduces the

axioms that capture the definitions of pre- and post-conditions. Section 4.4.5

shows the translation.

In the theory, the following axioms capture the information that pre- and

post-conditions are well typed:

• Preconditions are well-typed:

msgidS(Mid) ⇒ clsS(C) ⇒ clslstS(CL) ⇒ clsS(RC) ⇒ objS(O) ⇒ objlstS(OL) ⇒ (
    pre(Mid, C, CL, RC, O, OL) ⇒
        instanceof(O,C) ∧ instancesof(OL,CL) ∧ instanceof(O←msg(Mid,OL),RC)
).

Axioms 4.19: Preconditions are well typed.

• Postconditions are well-typed:

msgidS(Mid) ⇒ clsS(C) ⇒ clslstS(CL) ⇒ clsS(RC) ⇒ objS(O) ⇒ objlstS(OL) ⇒ (
    post(Mid, C, CL, RC, O, OL, R) ⇒
        instanceof(O,C) ∧ instancesof(OL,CL) ∧ instanceof(R, RC)
).

Axioms 4.20: Postconditions are well typed.

4.4 TRANSClay,FOL

In this section we formalise the translation of Clay abstract syntax trees (see

Chapter 3) into an OOFOL theory.

4.4.1 Abstract Syntax for OOFOL

We have already mentioned in Subsection 4.2.2 that the translation process only adds class identifiers and message identifiers to the OOFOL language, i.e. constants of sorts clsidS and msgidS. Our abstract structure for OOFOL theories (Theory) is the following Cartesian product:

$$\mathit{Theory} = 2^{\mathit{CI}} \times 2^{\mathit{MI}} \times 2^{\mathit{OOForm}}$$

For readability, we give the definition of OOForm in Figure 4.3 in the form of a concrete syntax.


OOForm ::= ⊤
         | ⊥
         | subclass(OOTerm_clsS, OOTerm_clsS)
         | instanceof(OOTerm_objS, OOTerm_clsS)
         | eq(OOTerm_clsS, OOTerm_objS, OOTerm_objS)
         | eqs(OOTerm_clsS, OOTerm_objS, OOTerm_objS)
         | pre(OOTerm_msgidS, OOTerm_clsS, OOTerm_clslstS, OOTerm_clsS,
               OOTerm_objS, OOTerm_objlstS)
         | post(OOTerm_msgidS, OOTerm_clsS, OOTerm_clslstS, OOTerm_clsS,
                OOTerm_objS, OOTerm_objlstS, OOTerm_objS)
         | ¬OOForm
         | OOForm ∧ OOForm
         | OOForm ∨ OOForm
         | OOForm ⇒ OOForm
         | OOForm ⇔ OOForm
         | ∀_s OOVar_s (OOForm)
         | ∃_s OOVar_s (OOForm)

where OOTerm_s is the set of terms of sort s and OOVar_s is the set of variables of sort s.

Figure 4.3: Syntax of OOFOL formulae.

4.4.2 Translation of Spec

Clay specifications (Spec) are represented by two partial functions: bnds and clss

(see Section 3.1). The definition of the translation starts at that point:

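$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{Spec}} : \mathit{Spec} \to \mathit{Theory}\\
&\mathit{TRANS}_{\mathit{Spec}}\llbracket\langle \mathit{bnds}, \mathit{clss}\rangle\rrbracket = \langle \mathrm{dom}\ \mathit{bnds},\ \mathit{mis},\ f\rangle\\
&\quad\text{where}\\
&\qquad \mathit{mis} = \bigcup_{ci \,\in\, \mathrm{dom}\ \mathit{Bounds}} (\pi_1 \circ \mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{CS}})\llbracket\langle g, C, cs\rangle\rrbracket\\
&\qquad f = \bigcup_{ci \,\in\, \mathrm{dom}\ \mathit{Bounds}} (\pi_2 \circ \mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{CS}})\llbracket\langle g, C, cs\rangle\rrbracket\\
&\qquad g = (\pi_1 \circ \mathit{TRANS}_{\mathit{CI}\times\mathit{Bounds}})\llbracket\langle ci, \mathit{bnds}\rangle\rrbracket\\
&\qquad C = (\pi_2 \circ \mathit{TRANS}_{\mathit{CI}\times\mathit{Bounds}})\llbracket\langle ci, \mathit{bnds}\rangle\rrbracket\\
&\qquad cs = \mathit{clss}\ C
\end{aligned}
$$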

From now on, we add the following conventions to those mentioned in Chap-

ter 3 and Appendix C:

• TRANS is an indexed family of functions on different domains. We will over-

load the name of the family to name each function, in other words, in gen-

eral we will omit the index since it is easy to derive.

• We use a special syntax for the application of a translation function in order to identify its application easily. For example, $\mathit{TRANS}_{\mathit{Spec}}\llbracket\langle c,b\rangle\rrbracket$.

• Metavariables g (guard), f (formula), ty (type information), gty (guarded

type information), algb (algebraic type information) are used for OOForm.

• Metavariables with s being the last letter will represent sets or sequences.

For example, mis for a set or a sequence of message identifiers, and fs for

sets of OOForms.

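$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{CI}\times\mathit{Bounds}} : \mathit{CI} \times \mathit{Bounds} \to \mathit{OOForm} \times \mathit{CExpr}\\
&\mathit{TRANS}\llbracket\langle ci, \mathit{bnds}\rangle\rrbracket = \langle g, C\rangle\\
&\quad\text{where}\\
&\qquad g = \mathbf{subclass}(\mathit{TRANS}\llbracket cv_1\rrbracket, \mathit{TRANS}\llbracket(\mathit{bnds}\ ci)_1\rrbracket)\\
&\qquad\qquad \wedge \ldots\\
&\qquad\qquad \wedge\ \mathbf{subclass}(\mathit{TRANS}\llbracket cv_n\rrbracket, \mathit{TRANS}\llbracket(\mathit{bnds}\ ci)_n\rrbracket)\\
&\qquad C = ci\langle cv_1, \ldots, cv_n\rangle\\
&\qquad cv_1 \ldots cv_n \text{ are different elements of } \mathit{CV}\\
&\qquad n = |(\mathit{bnds}\ ci)|
\end{aligned}
$$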

4.4.3 Translation of Class Specifications (CS)

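$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{CS}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{CS} \to 2^{\mathit{MI}} \times 2^{\mathit{OOForm}}\\
&\mathit{TRANS}\llbracket\langle g, C, cs\rangle\rrbracket = \langle \mathit{miss} \cup \mathit{mims},\ \{\mathit{inv}, \mathit{fsub}\} \cup \mathit{fss} \cup \mathit{fms}\rangle\\
&\quad\text{where}\\
&\qquad \mathit{miss} = (\pi_1 \circ \mathit{TRANS})\llbracket\langle g, C, se\rangle\rrbracket\\
&\qquad \mathit{mims} = (\pi_1 \circ \mathit{TRANS})\llbracket\langle g, C, me\rangle\rrbracket\\
&\qquad \mathit{fss} = (\pi_2 \circ \mathit{TRANS})\llbracket\langle g, C, se\rangle\rrbracket\\
&\qquad \mathit{fms} = (\pi_2 \circ \mathit{TRANS})\llbracket\langle g, C, me\rangle\rrbracket\\
&\qquad \langle be, \mathit{sups}, I, se, me\rangle = cs\\
&\qquad \mathit{inv} = g \Rightarrow \mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C\rrbracket) \Rightarrow \mathit{TRANS}\llbracket I\rrbracket\\
&\qquad \mathit{fsub} = g \Rightarrow \mathbf{subclass}(\mathit{TRANS}\llbracket C\rrbracket, \mathit{TRANS}\llbracket C_1\rrbracket)\\
&\qquad\qquad \wedge \ldots\\
&\qquad\qquad \wedge\ \mathbf{subclass}(\mathit{TRANS}\llbracket C\rrbracket, \mathit{TRANS}\llbracket C_n\rrbracket)\\
&\qquad \{C_1, \ldots, C_n\} = \mathit{sups}
\end{aligned}
$$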


4.4.4 Translation of Algebraic Types (SE)

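$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{SE}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{SE} \to 2^{\mathit{MI}} \times 2^{\mathit{OOForm}}\\
&\mathit{TRANS}\llbracket\langle g, C, se\rangle\rrbracket = \Big\langle \mathit{miss},\ \{f\} \cup \bigcup_{i \in 1..n} \mathit{inv}_i \cup \bigcup_{A \,\in\, \mathrm{dom}\ se} \mathit{TRANS}\llbracket\langle g, A, (\pi_1 \circ se)\,A\rangle\rrbracket \Big\rangle\\
&\quad\text{where}\\
&\qquad \mathit{miss} = \bigcup_{B \,\in\, \mathrm{dom}\ se} \mathrm{dom}((\pi_1 \circ se)\,B)\\
&\qquad f = g \Rightarrow \mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C\rrbracket) \Rightarrow \mathit{algb}\\
&\qquad \mathit{algb} = \mathit{conj}_1 \vee \ldots \vee \mathit{conj}_n\\
&\qquad \mathit{conj}_i = \neg\mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_1\rrbracket)\\
&\qquad\qquad \wedge \ldots\\
&\qquad\qquad \wedge\ \neg\mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_{i-1}\rrbracket)\\
&\qquad\qquad \wedge\ \mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_i\rrbracket)\\
&\qquad\qquad \wedge\ \neg\mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_{i+1}\rrbracket)\\
&\qquad\qquad \wedge \ldots\\
&\qquad\qquad \wedge\ \neg\mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_n\rrbracket) \qquad (\text{with } i \in \{1, \ldots, n\})\\
&\qquad I_i = (\pi_2 \circ se)\,C_i \qquad (\text{with } i \in \{1, \ldots, n\})\\
&\qquad \mathit{inv}_i = g \Rightarrow \mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C_i\rrbracket) \Rightarrow \mathit{TRANS}\llbracket I_i\rrbracket \qquad (\text{with } i \in \{1, \ldots, n\})\\
&\qquad \{C_1, \ldots, C_n\} = \mathrm{dom}\ se\\
&\qquad \mathbf{\_self} \text{ is ``fresh''}
\end{aligned}
$$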

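$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{FE}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{FE} \to 2^{\mathit{OOForm}}\\
&\mathit{TRANS}\llbracket\langle g, C, fe\rangle\rrbracket = \mathit{ty}_1 \cup \ldots \cup \mathit{ty}_n \cup \mathit{eqs}\\
&\quad\text{where}\\
&\qquad \mathit{ty}_j = g' \Rightarrow \mathbf{instanceof}(\mathbf{\_self1} \leftarrow \mathit{TRANS}\llbracket mi_j\rrbracket, \mathit{TRANS}\llbracket fe\ mi_j\rrbracket) \qquad (\text{with } j \in \{1, \ldots, n\})\\
&\qquad \mathit{eqs} = g''\\
&\qquad\qquad \Rightarrow (\mathbf{eqs}(\mathbf{\_self1}, \mathbf{\_self2})\\
&\qquad\qquad\quad \Leftrightarrow \mathbf{eq}(\mathbf{\_self1} \leftarrow \mathit{TRANS}\llbracket mi_1\rrbracket, \mathbf{\_self2} \leftarrow \mathit{TRANS}\llbracket mi_1\rrbracket)\\
&\qquad\qquad\quad \wedge \ldots\\
&\qquad\qquad\quad \wedge\ \mathbf{eq}(\mathbf{\_self1} \leftarrow \mathit{TRANS}\llbracket mi_n\rrbracket, \mathbf{\_self2} \leftarrow \mathit{TRANS}\llbracket mi_n\rrbracket))\\
&\qquad g' = g \Rightarrow \mathbf{instanceof}(\mathbf{\_self1}, \mathit{TRANS}\llbracket C\rrbracket)\\
&\qquad g'' = g' \Rightarrow \mathbf{instanceof}(\mathbf{\_self2}, \mathit{TRANS}\llbracket C\rrbracket)\\
&\qquad \{mi_1, \ldots, mi_n\} = \mathrm{dom}\ fe\\
&\qquad \mathbf{\_self1} \text{ is ``fresh''}\\
&\qquad \mathbf{\_self2} \text{ is ``fresh''}
\end{aligned}
$$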

4.4.5 Translation of Methods (ME)

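$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{ME}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{ME} \to 2^{\mathit{MI}} \times 2^{\mathit{OOForm}}\\
&\mathit{TRANS}\llbracket\langle g, C, me\rangle\rrbracket = \Big\langle \mathrm{dom}\ me,\ \bigcup_{mi \,\in\, \mathrm{dom}\ me} \mathit{TRANS}\llbracket\langle g, C, mi, me\ mi\rangle\rrbracket \Big\rangle
\end{aligned}
$$

$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{MI}\times\mathit{MS}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{MI} \times \mathit{MS} \to 2^{\mathit{OOForm}}\\
&\mathit{TRANS}\llbracket\langle g, C, mi, ms\rangle\rrbracket = \{\ \mathit{ty},\ g \wedge \mathit{gty} \Rightarrow p,\ g \wedge \mathit{gty} \Rightarrow q\ \}\\
&\quad\text{where}\\
&\qquad \langle \mathit{ty}, \mathit{gty}, \mathit{argtys}, \mathit{resty}\rangle = \mathit{TRANS}\llbracket\langle g, C, mi, md\rangle\rrbracket\\
&\qquad \langle md, \mathit{pre}, \mathit{post}\rangle = ms\\
&\qquad p = \mathbf{pre}(\mathit{TRANS}\llbracket mi\rrbracket, \mathit{TRANS}\llbracket C\rrbracket, \mathit{argtys}, \mathit{resty}, \mathit{TRANS}\llbracket \mathit{self}\rrbracket) \Leftrightarrow \mathit{TRANS}\llbracket \mathit{pre}\rrbracket\\
&\qquad q = \mathbf{post}(\mathit{TRANS}\llbracket mi\rrbracket, \mathit{TRANS}\llbracket C\rrbracket, \mathit{argtys}, \mathit{resty}, \mathbf{\_self}, \mathbf{\_result}) \Leftrightarrow \mathit{TRANS}\llbracket \mathit{post}\rrbracket
\end{aligned}
$$

$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{OOForm}\times\mathit{CExpr}\times\mathit{MI}\times\mathit{MD}} : \mathit{OOForm} \times \mathit{CExpr} \times \mathit{MI} \times \mathit{MD}\\
&\qquad \to \mathit{OOForm} \times \mathit{OOForm} \times \mathit{OOTerm}^{*}_{\mathbf{clslstS}} \times \mathit{OOTerm}_{\mathbf{clslstS}}\\
&\mathit{TRANS}\llbracket\langle g, C, mi, md\rangle\rrbracket = \langle \mathit{ty}, \mathit{gty}, ta_1 ta_2 \ldots ta_n, tb\rangle\\
&\quad\text{where}\\
&\qquad \langle \mathit{sig}, B\rangle = md\\
&\qquad \langle ov_1, A_1\rangle \ldots \langle ov_n, A_n\rangle = \mathit{sig}\\
&\qquad ta_i = \mathit{TRANS}\llbracket A_i\rrbracket \qquad (\text{with } i \in \{1, \ldots, n\})\\
&\qquad tb = \mathit{TRANS}\llbracket B\rrbracket\\
&\qquad \mathit{ty} = g \wedge \mathit{gty} \Rightarrow \mathbf{instanceof}(\mathit{TRANS}\llbracket \mathit{self}.mi(ov_1, \ldots, ov_n)\rrbracket, tb)\\
&\qquad \mathit{gty} = \mathbf{instanceof}(\mathbf{\_self}, \mathit{TRANS}\llbracket C\rrbracket)\\
&\qquad\qquad \wedge\ \mathbf{instanceof}(\mathit{TRANS}\llbracket ov_1\rrbracket, ta_1)\\
&\qquad\qquad \wedge \ldots\\
&\qquad\qquad \wedge\ \mathbf{instanceof}(\mathit{TRANS}\llbracket ov_n\rrbracket, ta_n)
\end{aligned}
$$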

4.4.6 Translation of Formulae and Expressions

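$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{Form}} : \mathit{Form} \to \mathit{OOForm}\\
&\mathit{TRANS}\llbracket o_1 =_C o_2\rrbracket = \mathbf{eqs}(\mathit{TRANS}\llbracket C\rrbracket, \mathit{TRANS}\llbracket o_1\rrbracket, \mathit{TRANS}\llbracket o_2\rrbracket)\\
&\mathit{TRANS}\llbracket o : C\rrbracket = \mathbf{instanceof}(\mathit{TRANS}\llbracket o\rrbracket, \mathit{TRANS}\llbracket C\rrbracket)\\
&\mathit{TRANS}\llbracket \neg F\rrbracket = \neg\,\mathit{TRANS}\llbracket F\rrbracket\\
&\mathit{TRANS}\llbracket F_1 \wedge F_2\rrbracket = \mathit{TRANS}\llbracket F_1\rrbracket \wedge \mathit{TRANS}\llbracket F_2\rrbracket\\
&\mathit{TRANS}\llbracket F_1 \vee F_2\rrbracket = \mathit{TRANS}\llbracket F_1\rrbracket \vee \mathit{TRANS}\llbracket F_2\rrbracket\\
&\mathit{TRANS}\llbracket F_1 \Rightarrow F_2\rrbracket = \mathit{TRANS}\llbracket F_1\rrbracket \Rightarrow \mathit{TRANS}\llbracket F_2\rrbracket\\
&\mathit{TRANS}\llbracket F_1 \Leftrightarrow F_2\rrbracket = \mathit{TRANS}\llbracket F_1\rrbracket \Leftrightarrow \mathit{TRANS}\llbracket F_2\rrbracket\\
&\mathit{TRANS}\llbracket \forall ov : C(F)\rrbracket = \forall_{\mathbf{objS}}\,\mathit{TRANS}\llbracket ov\rrbracket\ (\mathbf{instanceof}(\mathit{TRANS}\llbracket ov\rrbracket, \mathit{TRANS}\llbracket C\rrbracket) \Rightarrow \mathit{TRANS}\llbracket F\rrbracket)\\
&\mathit{TRANS}\llbracket \exists ov : C(F)\rrbracket = \exists_{\mathbf{objS}}\,\mathit{TRANS}\llbracket ov\rrbracket\ (\mathbf{instanceof}(\mathit{TRANS}\llbracket ov\rrbracket, \mathit{TRANS}\llbracket C\rrbracket) \wedge \mathit{TRANS}\llbracket F\rrbracket)
\end{aligned}
$$

$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{CExpr}} : \mathit{CExpr} \to \mathit{OOTerm}_{\mathbf{clsS}}\\
&\mathit{TRANS}\llbracket ci\langle C_1, \ldots, C_n\rangle\rrbracket = \mathbf{cls}(\mathit{TRANS}\llbracket ci\rrbracket, [\mathit{TRANS}\llbracket C_1\rrbracket, \ldots, \mathit{TRANS}\llbracket C_n\rrbracket])\\
&\mathit{TRANS}\llbracket cv\rrbracket = \mathit{TRANS}_{\mathit{CV}}\llbracket cv\rrbracket
\end{aligned}
$$

$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{Expr}} : \mathit{Expr} \to \mathit{OOTerm}_{\mathbf{objS}}\\
&\mathit{TRANS}\llbracket C\rrbracket = \mathit{TRANS}_{\mathit{CExpr}}\llbracket C\rrbracket\\
&\mathit{TRANS}\llbracket ov\rrbracket = \mathit{TRANS}_{\mathit{OV}}\llbracket ov\rrbracket\\
&\mathit{TRANS}\llbracket o.mi(o_1, \ldots, o_n)\rrbracket = \mathbf{send}(\mathit{TRANS}\llbracket o\rrbracket, \mathbf{msg}(\mathit{TRANS}\llbracket mi\rrbracket, [\mathit{TRANS}\llbracket o_1\rrbracket, \ldots, \mathit{TRANS}\llbracket o_n\rrbracket]))
\end{aligned}
$$

$$
\begin{aligned}
&\mathit{TRANS}_{\mathit{CV}} : \mathit{CV} \to \mathit{OOTerm}_{\mathbf{clsS}} &\qquad& \mathit{TRANS}\llbracket t\rrbracket = \mathbf{\_t}\\
&\mathit{TRANS}_{\mathit{OV}} : \mathit{OV} \to \mathit{OOTerm}_{\mathbf{objS}} && \mathit{TRANS}\llbracket x\rrbracket = \mathbf{\_x}\\
&\mathit{TRANS}_{\mathit{MI}} : \mathit{MI} \to \mathit{OOTerm}_{\mathbf{msgidS}} && \mathit{TRANS}\llbracket mi\rrbracket = \mathbf{mi}\\
&\mathit{TRANS}_{\mathit{CI}} : \mathit{CI} \to \mathit{OOTerm}_{\mathbf{clsidS}} && \mathit{TRANS}\llbracket ci\rrbracket = \mathbf{ci}
\end{aligned}
$$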
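The translation of formulae composes with the sort encoding of Section 4.2.1: a Clay quantifier over a class first becomes a sorted quantifier over objS guarded by instanceof, and the sort is then turned into a guard as well. The Python sketch below combines both steps for a fragment of the formula language. It is an illustration under stated assumptions: formulae are hypothetical nested tuples, object and class expressions are passed in already translated as text, equality is left out, and the output is fully parenthesised Prover9-style text.

```python
def trans_form(f) -> str:
    """Translate a tuple-encoded Clay formula into Prover9-style text."""
    tag = f[0]
    if tag == "instanceof":                  # o : C
        _, o, c = f
        return f"instanceof({o},{c})"
    if tag == "not":
        return f"-({trans_form(f[1])})"
    if tag in ("and", "or", "implies", "iff"):
        op = {"and": "&", "or": "|", "implies": "->", "iff": "<->"}[tag]
        return f"({trans_form(f[1])} {op} {trans_form(f[2])})"
    if tag in ("forall", "exists"):          # quantification over a class
        _, x, c, body = f
        v = "_" + x                          # object variables are prefixed with _
        inner = trans_form(body)
        if tag == "forall":
            return f"(all {v} (objS({v}) -> (instanceof({v},{c}) -> {inner})))"
        return f"(exists {v} (objS({v}) & (instanceof({v},{c}) & {inner})))"
    raise ValueError(f"unknown formula: {f!r}")


NAT = "cls(cNat,[])"
# "for every n of class Nat, n.add(n) is of class Nat"
closed_under_add = ("forall", "n", NAT,
                    ("instanceof", "send(_n,msg(add,[_n]))", NAT))
text = trans_form(closed_under_add)
```

The resulting `text` starts with `(all _n (objS(_n) -> (instanceof(_n,cls(cNat,[])) -> ` followed by the translated body, so the class bound and the sort guard both end up as antecedents of the universally quantified formula.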

4.5 Mechanised Reasoning

In this section we explain how we have used the automatic theorem prover

Prover9/Mace4 [95, 94] to mechanise some reasoning about the Clay theory

presented in Section 4.3.


Prover9 is a resolution/paramodulation automated theorem prover for first-

order and equational logic. Mace4 is a valuable complement to Prover9, looking

for counterexamples before (or at the same time as) using Prover9 to search for a

proof. It can also be used to help debug input clauses and formulae for Prover9.

Mace4 helps avoid wasting time searching for a proof with Prover9 by first finding

a counterexample or by first helping to debug logical specifications.

We have fed Prover9/Mace4 [95] with the Clay theory in order to check its consistency. This process helped us to debug and test our intuitions and their descriptions. Once we had gained enough confidence in the consistency of the Clay theory we started proving some theorems. Let us start with the main theorem that Prover9/Mace4 has automatically proved; we will then give some explanations about the consistency checking of the theory.

4.5.1 Subject Reduction

As an example of the mechanisation of the reasoning that results after the en-

coding of Clay in Prover9/Mace4, we would like to show a theorem automatically

proved. Subject reduction is a desirable property that establishes that if two ex-

pressions reduce to the same object, then they must belong to the same classes.

In our case we do not have a reduction relation so we have written the property

with the equality predicate:

∀ X ∀ Y ∀ C (
    objS(X) ⇒ objS(Y) ⇒ clsS(C) ⇒ (
        instanceof(X, C) ∧ eq(C, X, Y) ⇒ instanceof(Y, C))
)

Axioms 4.21: Subject reduction theorem.

Here is the resulting proof:

% Proof 1 at 0.07 (+ 0.00) seconds.

% Length of proof is 16.

% Level of proof is 3.

% Maximum clause weight is 13.000.

% Given clauses 150.

60 objS(X) -> objS(Y) -> clsS(C) -> (eq(C,X,Y) <->

instanceof(X,C) & instanceof(Y,C) & eqs(C,X,Y) &

(forall D (clsS(D) -> subclass(C,D) ->

eq(D,X,Y)))) # label(non_clause). [assumption].



65 subject_reduction <-> (forall X forall Y forall C

(objS(X) -> objS(Y) -> clsS(C) -> instanceof(X,C) &

eq(C,X,Y) -> instanceof(Y,C))) # label(non_clause).

[assumption].

66 subject_reduction # label(non_clause) # label(goal). [goal].

186 ~ objS(A) | ~ objS(B) | ~ clsS(C) | ~ eq(C,A,B) |

instanceof(B,C). [clausify(60)].

194 subject_reduction | objS(c8). [clausify(65)].

195 subject_reduction | objS(c9). [clausify(65)].

196 subject_reduction | clsS(c10). [clausify(65)].

198 subject_reduction | eq(c10,c8,c9). [clausify(65)].

199 subject_reduction | ~ instanceof(c9,c10). [clausify(65)].

200 ~ subject_reduction. [deny(66)].

345 ~ instanceof(c9,c10). [back_unit_del(199),unit_del(a,200)].

346 eq(c10,c8,c9). [back_unit_del(198),unit_del(a,200)].

348 clsS(c10). [back_unit_del(196),unit_del(a,200)].

349 objS(c9). [back_unit_del(195),unit_del(a,200)].

350 objS(c8). [back_unit_del(194),unit_del(a,200)].

469 $F. [resolve(346,a,186,d),unit_del(a,350),

unit_del(b,349),unit_del(c,348),unit_del(d,345)].

========================== end of proof =========================

THEOREM PROVED

Exiting with 1 proof.

Process 8172 exit (max_proofs) Tue Jul 27 16:50:54 2010
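An input file for an experiment of this kind consists of a list of assumptions and a list of goals. The Python sketch below only assembles such a file as text; it does not run the prover, and it is not the input used for the proof above. The function name `prover9_input` is hypothetical. Two deliberate differences from the listing are assumptions of the sketch: the nested implications are rewritten into the logically equivalent conjunctive form so that the text does not depend on how `->` associates, and the theorem is stated directly as a goal instead of through the propositional constant subject_reduction. Quantifiers use the `all` keyword.

```python
def prover9_input(assumptions, goals):
    """Assemble the text of a Prover9 input file from two lists of formulas."""
    lines = ["set(prolog_style_variables).", "", "formulas(assumptions)."]
    lines += [f"  {a}" for a in assumptions]
    lines += ["end_of_list.", "", "formulas(goals)."]
    lines += [f"  {g}" for g in goals]
    lines += ["end_of_list."]
    return "\n".join(lines) + "\n"


# Axioms 4.13 with the sort guards gathered into one antecedent.
EQ_AXIOM = (
    "objS(X) & objS(Y) & clsS(C) -> (eq(C,X,Y) <-> "
    "(instanceof(X,C) & instanceof(Y,C) & eqs(C,X,Y) & "
    "(all D (clsS(D) & subclass(C,D) -> eq(D,X,Y)))))."
)

# The subject reduction property, stated directly as a goal.
SUBJECT_REDUCTION = (
    "all X all Y all C (objS(X) & objS(Y) & clsS(C) & "
    "instanceof(X,C) & eq(C,X,Y) -> instanceof(Y,C))."
)

text = prover9_input([EQ_AXIOM], [SUBJECT_REDUCTION])
```

The resulting text could be written to a file and passed to the prover, typically with `prover9 -f` followed by the file name; only the axiom actually used in the proof above is included here, whereas the experiment in the text loaded the whole Clay theory.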

4.5.2 Consistency of the Clay Theory

During the process of writing the semantics of the language, we have created inconsistent theories, sometimes due to errors in the encoding and other times because of the introduction of properties that, although they looked intuitively correct, turned out to be inconsistent.

At this moment, feeding Prover9/Mace4 with the Clay theory results in an apparently endless execution. Since neither the prover nor the model checker finishes its execution, our confidence in the consistency of the theory is pretty high.


4.6 Related Work

At first sight, the decision not to use a specifically designed logic that directly captures object-oriented concepts might seem surprising. For a year, we studied the logical frameworks of several other object-oriented formal specification languages. We tried some approaches based on typed logics since we wanted the logic to manage all the typing issues for free. The results were disappointing, in particular because typed logics inevitably introduce information that is not dynamically usable. In other words, we were unable to find a logic suitable to capture and introduce dynamic type information, i.e. type information actually usable during deduction and not just statically.

VDM and Z have their own object-oriented versions: VDM++ [149] and Object-Z [116]. During their design processes many constraints were imposed by their ancestors, and their underlying logics were inherited. The resulting semantics are at least as complex as the original ones, and in any case more complex than first-order logic. In our view, knowledge transfer to industry is not effective enough.

Languages like CASL [100] or COLD [74, 79] are object-oriented. Their logics are claimed to be object-oriented in the sense that the main object-oriented concepts are directly supported. Nevertheless, a deeper analysis reveals that inheritance, for instance, is represented directly with subsorting, and subsorting is interpreted as set inclusion in the model. This is an elegant but unrealistic solution: the meaning of a class ColoredNat inheriting from Nat has nothing to do with a subset of Nat. Equational languages based on equational logic like OBJ [53] or Maude [33] present the same problem.

This interpretation is avoided in CASL [27] by introducing injections into subclasses and projections into superclasses. Nevertheless, this does not actually avoid understanding inheritance as inclusion. Its main advantage is that restrictions on signatures are not so strict and that strong semantic relationships are established between overloaded symbol names. These semantic restrictions are really interesting and we have included some of them in Clay as the permissive overloading schemes supported.

A different approach is the use of an executable calculus like Abadi and Cardelli’s [1]. Abadi and Cardelli’s calculi seem ideal for giving semantics to programming languages. Nevertheless, their expressiveness is, in principle, lower than that of a non-executable logic.

Furthermore, the work by Abadi and Cardelli is based on structural typing while, from our point of view, type names and explicit relations between them are extremely important in specifications. The difficulty of precisely defining a name-based type system does not justify not trying to manage types by names.

Part III

The Clay System


5 Synthesis of Logic Programs

Abstract

Early validation of requirements is crucial for the rigorous development

of software. Without it, even the most formal of the methodologies will

produce the wrong outcome. One possibility to validate requirements is

by constructing prototypes. In this chapter we study how to synthesise

prototypes from Clay specifications. We present a normal representation

of the Clay instances as Prolog terms. We define a set of basic predicates

like equality or instance of and we formalise the translation of Clay spec-

ifications into logic programs. Our synthesis can deal with non-trivial,

recursive and implicit specifications.

5.1 Methodology

In order to present the synthesis of Prolog programs from Clay specifications we

will follow the methodology in Chapter 4: the interlingua-based semantics. In

this case, Prolog will be the interlingua Li in Definition 4.1 and a Prolog module

will be BTL, the set of Prolog top-level forms that are the heart of our synthesis.

With respect to Clay, we have made some simplifying decisions in order to keep

the resulting theory tractable and readable: no multiple inheritance, no overloading

(only method refinement) and no parametric polymorphism.

Section 5.4 is devoted to the formalisation of the translation. The formali-

sation is hard to follow so Sections 5.2 and 5.3 discuss the intuitions behind the

main decisions.

5.2 Interacting with Clay

The prototype generated by our synthesiser supports interaction with Clay specifications:

the user asks it to reduce a Clay object expression to a normal representation.

We now describe some use cases, and in Section 5.5 we will check the actual

performance of the synthesised prototype on them.

Inheritance

Classes Cell and ReCell presented in Sections 2.2 and 2.4 will be our first guiding

example. We will interact with Clay to check that the compiler is enabling the

specifier to write concise specifications with safe inheritance, and we will see the

answers to expressions like ReCell.mkReCell.set(0).set(1).get and

ReCell.mkReCell.set(0).set(1).restore.get.

Recursive Specifications

A more interesting example is the specification of a binary search tree. Figure 5.1

shows the domain specification of the class BSTInt and the method that checks if

an element is in the tree.

We will interact with the synthesised prototype to check whether a recursive

definition of a method like that presented in Figure 5.2 can be executed.

Implicit Specifications

Our last guiding example will be the implicit specification of method remove in

the class BSTInt. We will execute some examples using the message remove.

class BSTInt {
  case Empty { }
  case Node { data : Int, left : BSTInt, right : BSTInt }

  observer contains (x : Int) : Bool {
    post { self : Empty ∧ result : False
         ∨ self : Node ∧ x = self.data ∧ result : True
         ∨ self : Node ∧ x < self.data : True ∧
           result = self.left.contains(x)
         ∨ self : Node ∧ x < self.data : False ∧
           result = self.right.contains(x) }
  }
  ...
}

Figure 5.1: Binary search trees in Clay.

class BSTInt {
  ...
  modifier insert (x : Int) {
    post { self : Empty ∧ result = BSTInt.mkNode(x, BSTInt.mkEmpty, BSTInt.mkEmpty)
         ∨ self : Node
           ∧ ( x < self.data : True
               ∧ result = BSTInt.mkNode(self.data, self.left.insert(x), self.right)
             ∨ x = self.data ∧ result = self
             ∨ x < self.data : False
               ∧ result = BSTInt.mkNode(self.data, self.left, self.right.insert(x))) }
  }
  ...
}

Figure 5.2: A recursive specification.

class BSTInt {
  ...
  modifier remove (x : Int) {
    post { result.contains(x) : False ∧ result.insert(x) = self }
  }
  ...
}

Figure 5.3: An implicit specification.

class(C)            C is a Clay class identifier.
inherits(A,[B])     B is the superclass of A.
cases(C,Cs)         Cs is the list of case classes of class C.
fields(C,Fs)        Fs is the association list of field names and
                    field types of case class C.
msgtype(C,M)        M is the message identifier of a method defined
                    or overridden in class C.
pre(C,S,M,As)       Precondition of sending message M with arguments
                    As to instance S in class C.
post(C,S,M,As,R)    Postcondition that establishes that R is the resulting
                    instance of sending message M with arguments As to
                    instance S in class C.

Figure 5.4: Representing Clay in Prolog.

Requirements Validation

The interaction with Clay should help the specifiers to gain confidence in their

specifications. We will detect an error in our previous specifications.

5.3 Translating Clay Specifications into Logic Programs

Given a Clay specification we will synthesise facts that represent its abstract syn-

tax tree: classes, inheritance, case classes, fields, and pre- and post-conditions of

methods. Figure 5.4 describes the meaning of the target predicates. Their precise

meaning is given in Section 5.4 where we give a formalisation of the translation

of Clay specifications into facts about the target predicates.

The heart of our translator is a common theory for all specifications: the Clay

theory. The most important predicates of this axiomatisation are instanceof/2,

reduce/2 and eq/3, whose definitions rely on the facts translated from the source

specifications (Figure 5.4). Their meaning is:

• Predicate instanceof(NF,A) is a generator of instances NF of a class A. NF

is a normal form of an instance of A. These normal forms are flexible representations

of instances as incomplete data structures and will be presented

in Sections 5.3.1 and 5.3.2.

• Predicate eq(A,NF1,NF2), Clay’s equality, decides if the representations

(NF1 and NF2) of two instances are indistinguishable in class A.

• Finally, predicate reduce(E,NF) reduces any Clay object expression E to

its normal form NF. Predicates eq and reduce will be presented in Section 5.3.3.

5.3.1 Representing Clay Instances in Prolog

We have mentioned that the predicate reduce/2 reduces a Clay object expression

to a normal form. Clay object expressions have a straightforward representation

in Prolog:

• A class expression ci<C1, ..., Cn> is represented by the Prolog term

$(ci{<}C_1, \ldots, C_n{>})^{\#} = ci^{\#}(C_1^{\#}, \ldots, C_n^{\#})$.

• A class identifier ci is represented by a valid Prolog constant $ci^{\#}$ by quoting

its lexeme: 'ci'.

• A class variable cv (an object variable ov) is represented by a valid Prolog

variable $cv^{\#}$ ($ov^{\#}$) by prefixing its name with “_”: _cv (_ov).

• A send expression o.mi(o1, ..., on) is represented by the Prolog term

$(o.mi(o_1, \ldots, o_n))^{\#} = o^{\#}\,\texttt{<--}\,mi^{\#}(o_1^{\#}, \ldots, o_n^{\#})$.

• A message identifier mi is represented by a valid Prolog constant $mi^{\#}$ by

using its lexeme.

To describe how the generated prototype represents the instances of our lan-

guage in normal form we will use the example of restorable cells (instances of

ReCell). We need to capture all the information of known superclasses (Cell) and

to capture all the information about the specific case class (ReCell).

With no multiple inheritance, a sorted linear structure can represent the classes

of an instance. Therefore, we can use a list where each element contains

the part of the representation for a given class of the instance: (C,S,F) where

C is the class, S is the particular case class, and F is an association list from field

names to the representation of their instances. Let us show the representation of

Cell.mkCell:

[('Cell','CellCase',[(contents,[('Int','Int',[42])])])]

The list contains one element since the object is an instance of just one class

(Cell).

Under subtyping, during a deduction process where a cell with 42 is expected,

an instance of ReCell could appear. If we follow our rules, the representation of

ReCell.mkReCell.set(42) would be:

[('Cell','CellCase',[(contents,[('Int','Int',[42])])]),
 ('ReCell','ReCellCase',[(backup,[('Int','Int',[0])])])]

The representation of the cell with 42 and that of the instance of ReCell are partially the

same, but the latter does not fit in the former. We would expect it to fit,

since both instances represent the same information with respect

to the properties of Cell.

We propose to make room for as yet unknown information of subclasses and to

use an incomplete data structure where the incomplete part represents the room

for the information of the potential subclasses. The representation of Cell.mkCell

would be

[('Cell','CellCase',[(contents,[('Int','Int',[42])|_])])|_]

and for the instance of ReCell we would have the following representation:

[('Cell','CellCase',[(contents,[('Int','Int',[42])|_])]),
 ('ReCell','ReCellCase',[(backup,[('Int','Int',[0])|_])])|_]

Apart from carrying all the information needed by methods specified in the

superclasses, our normal form has the following properties:

• Information about case classes allows us to reflect the disjoint sum (case

classes) of products (fields).

• The incomplete part might be instantiated with data of an instance of a

subclass (like the backup of ReCell) during the deduction process. The most

interesting benefit is that the instantiation can be implemented with the

unification of our logic language engine. The example above shows how

the instance of ReCell fits, by unification, in the cell with 42.
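The idea of leaving room for subclasses can be illustrated outside Prolog. The following Python sketch is only a model under simplifying assumptions: the names OPEN, nf and fits are hypothetical, field values are plain integers instead of nested Int normal forms, and unification is approximated by a prefix check on ground entries.

```python
# Hypothetical Python model of the normal forms of Section 5.3.1.
# A normal form is a list of (class, case, fields) entries, superclass first.
# OPEN marks the incomplete tail that Prolog writes as "|_".
OPEN = object()

def nf(*entries, open_tail=True):
    """Build a normal form; with open_tail=True it leaves room for subclasses."""
    return list(entries) + ([OPEN] if open_tail else [])

def fits(general, specific):
    """True if `specific` can instantiate `general`.

    Approximates Prolog unification for ground entries: the known part of
    `general` must be a prefix of `specific`, and extra entries of `specific`
    are accepted only when `general` ends with the OPEN tail.
    """
    is_open = bool(general) and general[-1] is OPEN
    known = general[:-1] if is_open else general
    other = [e for e in specific if e is not OPEN]
    if other[:len(known)] != known:
        return False
    return is_open or len(other) == len(known)

cell_part = ("Cell", "CellCase", (("contents", 42),))
recell_part = ("ReCell", "ReCellCase", (("backup", 0),))

closed_cell = nf(cell_part, open_tail=False)  # first attempt in the text
open_cell = nf(cell_part)                     # with room for subclasses
recell = nf(cell_part, recell_part)

print(fits(closed_cell, recell))  # False: the closed list has no room
print(fits(open_cell, recell))    # True: the open tail absorbs the ReCell part
```

In the synthesised prototype no such function is needed: the same effect is obtained for free from the unification of the Prolog engine.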

5.3.2 Instance of

The predicate “:” (instance of) is translated into the Prolog predicate instanceof/2,

which generates the representation of all instances (first argument) of all classes

(second argument) of a specification. Let us see some outputs of this predicate:

?- instanceof(O,C).
C = 'Int',
O = [('Int','Int',[_])|_] ;
C = 'Cell',
O = [('Cell','CellCase',[(contents,[('Int','Int',[_])|_])])|_] ;
C = 'CellCase',
O = [('Cell','CellCase',[(contents,[('Int','Int',[_])|_])])|_] ;
C = 'ReCellCase',
O = [('Cell','CellCase',[(contents,[('Int','Int',[_])|_])]),
     ('ReCell','ReCellCase',[(backup,[('Int','Int',[_])|_])])|_]

Thanks to our incomplete structures, every instance of a subclass is an instance

of a superclass, a technique that makes the desirable property of subsumption

a theorem in our Prolog axiomatisation.

5.3.3 Equality

Clay equality (=) is the other predicate used in the atomic formulae of Clay in

this work. Our translation of Clay equality into Prolog consists of two steps: a

reduction of the object expressions to normal form and the unification of the

obtained representations.

Let us see a description of the implementation of the reduction step and post-

pone the formalisation of the translation of the equality literals to Section 5.4.

Predicate reduce/2 relates terms that represent abstract syntax trees of Clay ex-

pressions with their normal form. The most important clause of reduce/2 defines

the reduction of sending a message (M) to an object expression O. Functor <--, in

infix form, represents the send operator of Clay:

reduce(O<--M,NF) :-
    M =.. [Mid|Args],
    reduce(O,ONF),
    reduceall(Args,ArgsNF),
    knownclasses(ONF,Cs),
    checkpreposts(Cs,ONF,Mid,ArgsNF,NF,defined).

Predicate reduceall/2 reduces a list of expressions; the second argument of

knownclasses/2 contains the known classes (Cs) of the recipient of the message;

and checkpreposts checks the pre- and post-conditions of every class of Cs in which

method Mid is defined.

Safe Inheritance. We already mentioned in Section 2.4 the danger of overriding

the properties of methods in subclasses: the practical impossibility of reasoning

in large programs. The above implementation of predicate reduce/2 will fail if

any postcondition in the inheritance hierarchy is inconsistent with the postconditions

specified in superclasses.
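The effect of checking every postcondition in the hierarchy can be sketched with a small generate-and-test model in Python. This is not the Clay theory: the table POST, the class BadCell, the function reduce_send and the finite domain are hypothetical, and the postcondition given for ReCell is only a plausible reading of a restorable cell.

```python
from itertools import product

# Hypothetical postconditions of message `set`, one per class that defines it.
# Each takes the receiver state s, the argument v and a candidate result r.
POST = {
    ("Cell", "set"): lambda s, v, r: r["contents"] == v,
    ("ReCell", "set"): lambda s, v, r: r["backup"] == s["contents"],
}

def reduce_send(classes, msg, state, arg, post, domain=range(3)):
    """Return every candidate result accepted by ALL known postconditions."""
    fields = sorted(state)
    checks = [post[(c, msg)] for c in classes if (c, msg) in post]
    results = []
    for values in product(domain, repeat=len(fields)):
        candidate = dict(zip(fields, values))
        if all(check(state, arg, candidate) for check in checks):
            results.append(candidate)
    return results

# A restorable cell holding 0 receives set(1): exactly one result survives.
print(reduce_send(["Cell", "ReCell"], "set", {"backup": 0, "contents": 0}, 1, POST))

# A subclass whose postcondition contradicts Cell leaves no result at all,
# which corresponds to the failure of reduce/2 described above.
BAD = dict(POST)
BAD[("BadCell", "set")] = lambda s, v, r: r["contents"] == v + 1
print(reduce_send(["Cell", "BadCell"], "set", {"contents": 0}, 1, BAD))
```

The first call prints a single state with contents 1 and backup 0; the second prints an empty list.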

5.3.4 Predefined Integers

The predefined class Int encapsulates integers that get translated into Prolog in-

tegers managed via finite domain constraints. This illustrates another technique

that can be applied in the translation when the target language has declarative

extensions. Previous versions of the same specification used a Peano represen-

tation for naturals (predefined class Nat) as a way of obtaining a complete theory

for numbers. The experiments in Section 5.5 show drastic gains over our previous

implementation presented in [68].

In the next section we formalise the translation of Clay specifications into

Prolog programs. The distance between Clay and Prolog is enough to make the

translation far from trivial and difficult to follow. Inspecting Figure 5.5 should be

more intuitive: it presents, almost side by side, the correspondence between

the Clay specification of methods insert and remove of BSTInt and the automati-

cally synthesised Prolog code.

5.4 Formalised Translation

The translation of Clay to logic programs is presented in two steps. The first step,

Section 5.4.4, translates the Clay specifications in the form of their abstract syntax

into extended programs [91, 90], logic programs that contain clauses with an

arbitrary first-order formula in their body. A reduced version of the Clay syntax

presented in Section 3.1 can be found in Section 5.4.1. A concise abstract syntax

of logic programs is given in Section 5.4.2.

The second step, Section 5.4.5, is to apply the Lloyd-Topor transformation to

obtain general programs from extended programs. General programs allow conjunctions

of positive and negative atoms in the body of their clauses. The requirement

for executing the transformed programs is a sound form of negation.

We use the work of our colleagues [92, 101] when safe negation is needed.

Mathematical definitions follow conventions described in Section 4.4.2.

modifier insert (x : Int) {
  post {
    self : Empty ∧
    result = BSTInt.mkNode(x, BSTInt.mkEmpty, BSTInt.mkEmpty)
    ∨
    self : Node ∧
    ( x < self.data : True ∧
      result = BSTInt.mkNode(self.data, self.left.insert(x), self.right)
      ∨
      ... ) }
}

modifier remove (x : Int) {
  post { result.contains(x) : False
       ∧ result.insert(x) = self }
}

post('BSTInt',_s,insert,[_x],_r) :-
    instanceof(_s,'Empty'),
    reduce('BSTInt'<--mkNode(_x,
                             'BSTInt'<--mkEmpty,
                             'BSTInt'<--mkEmpty),
           _NF_BSTInt_mkNode),
    eq('BSTInt',_r,_NF_BSTInt_mkNode).

post('BSTInt',_s,insert,[_x],_r) :-
    instanceof(_s,'Node'),
    reduce(_x < _s<--data,_NF__x_le),
    instanceof(_NF__x_le,'True'),
    reduce('BSTInt'<--mkNode(_s<--data,_s<--left<--insert(_x),_s<--right),
           _NF_BSTInt_mkNode),
    eq('BSTInt',_r,_NF_BSTInt_mkNode).

...

post('BSTInt',_s,remove,[_x],_r) :-
    reduce(_r<--contains(_x),_NF__r_contains),
    instanceof(_NF__r_contains,'False'),
    reduce(_r<--insert(_x),_NF__r_insert),
    eq('BSTInt',_NF__r_insert,_s).

Figure 5.5: Translation of insert and remove.

Specifications (spec)          Spec = CI ↦ CS
Class Specifications (cs)      CS   = CI × SE × ME
State Environments (se)        SE   = CI ↦ FE
Field Environments (fe)        FE   = MI ↦ CI
Method Environments (me)       ME   = MI ↦ MS
Method Specifications (ms)     MS   = MD × Form × Form
Method Declarations (md)       MD   = (OV × CI)* × CI
Formulae (F)                   Form ::= o1 = o2 | o : ci
                                      | ¬F | F1 ∧ F2 | F1 ∨ F2 | F1 ⇒ F2 | F1 ⇔ F2
                                      | ∀ov : ci (F) | ∃ov : ci (F)
Expressions (o)                Expr ::= ci | ov | o.m(o1, ..., on)
Class Identifiers (ci)         CI   = valid Clay class identifiers
Message Identifiers (mi)       MI   = valid Clay message identifiers
Class Variables (cv)           CV   = valid Clay class variables
Object Variables (ov)          OV   = valid Clay object variables

Figure 5.6: Simplified abstract syntax of Clay.

5.4.1 Abstract Syntax of Clay

To keep the translation function small we have simplified the abstract syntax

of Clay of Chapter 3 (Figure 3.1). The main changes are the following:

• No parametric polymorphism. This means that class identifiers are enough

to represent class expressions (CExpr).

• No multiple inheritance.

• No invariants.

Figure 5.6 contains the simplified version of the Clay abstract syntax.

5.4.2 Abstract Syntax of Logic Programs

Extended Programs (EP)

An extended program (EP) is a set of extended clauses (EC): implications with

an arbitrary first-order formula, which we call here an extended goal (EG), in the

antecedent and an atom in the consequent:

EP   = 2^EC
EC   ::= Atom ⇐ EG
EG   ::= Lit
       | EG ∧ EG | EG ∨ EG | EG ⇒ EG | EG ⇔ EG
       | ∀Var(EG) | ∃Var(EG)
Lit  ::= Atom | ¬Atom
Atom ::= PN(Term, ..., Term)
Term ::= Var | FN(Term, ..., Term)

where Var, FN and PN are the sets of valid variables, functor names and predicate

names.

General Programs (LP)

General programs (LP) are sets of clauses (Clause):

LP     = 2^Clause
Clause ::= Atom :- Goal
Goal   ::= Lit, ..., Lit

Figure 5.7 contains the mathematical definition of extended and general pro-

grams.

5.4.3 Synthesis of Logic Programs

Extended Programs     EP     = 2^EC
Extended Clauses      EC     ::= Atom ⇐ EG
Extended Goals        EG     ::= Lit | EG ∧ ... ∧ EG
                               | EG ∨ EG | EG ⇒ EG | EG ⇔ EG
                               | ∀x(EG) | ∃x(EG)
General Programs      LP     = 2^Clause
General Clauses       Clause ::= Atom :- Goal
General Goals         Goal   ::= Lit ∧ ... ∧ Lit
Literals              Lit    ::= A | ¬A
Atoms                 Atom   ::= p(t1, ..., tn)
Terms (s, t)          Term   ::= x | f(t1, ..., tn)
Predicate Symbols     p, q
Functors              f, g, h
Variables             x, y, z

Figure 5.7: Abstract Syntax of Extended and General Programs.

Let us start with the definition of the function trspec. The logic program synthesised

from a given specification s contains a set of clauses that is common to any

specification (clay_theory) and a set of clauses that are the result of applying the

Lloyd-Topor (lloyd_topor) transformation to the translation of the specification of

every class into an extended program (trclass).

\[
\mathit{trspec} : \mathit{Spec} \to \mathit{LP} \tag{5.1}
\]
\[
\mathit{trspec}\llbracket s \rrbracket = \mathit{clay\_theory} \cup \bigcup_{A \in \operatorname{dom} s} (\mathit{lloyd\_topor} \circ \mathit{trclass})\llbracket A, s\,A \rrbracket \tag{5.2}
\]

The definition of the translation functions follows conventions similar to those in

Section 4.4.2.

The Clay Theory

Appendix B shows the whole theory in the form of an SWI-Prolog program. Together with

the rule for reduce presented in Section 5.3.3, we stress the importance of the

generator instanceof:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% instanceof(NF,C) :- NF represents an instance of C.
instanceof(NF,C) :-
    nf(C,NF).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% nf(C,NF) :- NF is a normal form of class C.
nf(C,NF) :-
    superclasses(C,SuperCs),
    genstates(SuperCs,NF).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% superclasses(A,SuperAs) :-
%   SuperAs is the sorted list of superclasses of A, max
%   first.
superclasses(A,SuperAs) :-
    superclasses_acum(A,[A],SuperAs).

superclasses_acum(A,Acum,Acum) :-
    inherits(A,[]).
superclasses_acum(A,Acum,SuperAs) :-
    inherits(A,[Super]),
    superclasses_acum(Super,[Super|Acum],SuperAs).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% genstates([C1,C2,...],[S1,S2,...]) :-
%   for every i, state(Ci,Si).
genstates([],_) :- true.
genstates([Final],[State|_]) :-
    genstate(Final,State).
% Care with case classes: they are subclasses but their direct
% superclass is already generating their state (both rules)
genstates([Super,Case|Subs],[State|States]) :-
    caseclass(Case,Super),
    % To avoid exploration of all cases in next atom:
    State = (Super, (Case, _)),
    genstate(Super,State),
    genstates(Subs,States).
genstates([Super,Sub|Subs],[State|States]) :-
    inherits(Sub,[Super]),
    \+ caseclass(Sub,Super),
    genstates([Sub|Subs],States).
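The accumulator technique of superclasses/2 and the one-state-per-class idea of genstates/2 can be mirrored in a few lines of Python. The sketch below is an illustration only: the tables INHERITS and FIELDS and the function skeleton are hypothetical, case classes are ignored, and None stands in for an unbound Prolog variable.

```python
# Hypothetical inheritance facts: class -> direct superclass (None for a root).
INHERITS = {"Int": None, "Cell": None, "ReCell": "Cell"}

# Hypothetical field names declared by each class.
FIELDS = {"Int": [], "Cell": ["contents"], "ReCell": ["backup"]}

def superclasses(cls, inherits=INHERITS):
    """Return cls and its ancestors, topmost superclass first."""
    chain = [cls]
    while inherits[chain[0]] is not None:
        # Prepend the direct superclass, as superclasses_acum/3 does.
        chain.insert(0, inherits[chain[0]])
    return chain

def skeleton(cls, inherits=INHERITS, fields=FIELDS):
    """One (class, state) entry per class in the chain; values are unknown."""
    return [(c, dict.fromkeys(fields[c])) for c in superclasses(cls, inherits)]

print(superclasses("ReCell"))  # ['Cell', 'ReCell']
print(skeleton("ReCell"))
```

Sorting the chain from the topmost class down is what lets the representation of a superclass be a prefix of the representation of any of its subclasses.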

5.4.4 Synthesis of Extended Programs

This first stage starts with the function trclass. The extended program generated

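for a given class A contains clauses that establish that A is a class and the inheritance

relation derived from its declaration, together with the translation of the static part of the

specification (trse) and the translation of the dynamic part of the specification

(trme):

\[
\begin{aligned}
\mathit{trclass} &: CI \times CS \to EP \\
\mathit{trclass}\llbracket A, (\mathit{super}, \mathit{stts}, \mathit{meths}) \rrbracket &= \{\, \mathtt{class}(\mathit{trexp}\llbracket A \rrbracket) \Leftarrow \top, \\
&\qquad \mathtt{instanceof}(\mathit{trexp}\llbracket A \rrbracket, \mathit{trmeta}\llbracket A \rrbracket) \Leftarrow \top, \\
&\qquad \mathtt{inherits}(\mathit{trexp}\llbracket A \rrbracket, \mathit{trexp}\llbracket \mathit{super} \rrbracket) \Leftarrow \top \,\} \\
&\quad \cup\ \mathit{trse}\llbracket A, \mathit{stts} \rrbracket \cup \mathit{trme}\llbracket A, \mathit{meths} \rrbracket
\end{aligned}
\]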

Translation of State Environments

The translation of the static part synthesises a clause for the cases predicate and

then the clauses for constructors of case classes (trstt) and projections from fields

(trfe) for every case class:

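\[
\begin{aligned}
\mathit{trse} &: CI \times SE \to EP \\
\mathit{trse}\llbracket A, \mathit{states} \rrbracket &= \{\, \mathtt{cases}(\mathit{trexp}\llbracket A \rrbracket, \mathit{trset}\llbracket \operatorname{dom} \mathit{states} \rrbracket) \Leftarrow \top \,\} \\
&\quad \cup \bigcup_{B \in \operatorname{dom} \mathit{states}} \bigl( \mathit{trstt}\llbracket A, B, \mathit{states}\,B \rrbracket \cup \mathit{trfe}\llbracket A, B, \mathit{states}\,B \rrbracket \bigr)
\end{aligned}
\]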

The function trstt generates the clauses to represent the declaration (msgtype),

pre- (pre) and post-condition (post) of every constructor of a case class:

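\[
\begin{aligned}
\mathit{trstt} &: CI \times CI \times FE \to EP \\
\mathit{trstt}\llbracket A, B, \mathit{fe} \rrbracket &= \{\, \mathtt{msgtype}(\mathit{trmeta}\llbracket A \rrbracket, \mathit{trmk}\llbracket B \rrbracket) \Leftarrow \top, \\
&\qquad \mathtt{pre}(\mathit{trmeta}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{trmk}\llbracket B \rrbracket, \mathit{trfev}\llbracket \mathit{fe} \rrbracket) \Leftarrow \\
&\qquad\quad \mathtt{instanceof}(\mathtt{\_self}, \mathit{trmeta}\llbracket A \rrbracket) \wedge \mathit{trfety}\llbracket \mathit{fe} \rrbracket, \\
&\qquad \mathtt{post}(\mathit{trmeta}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{trmk}\llbracket B \rrbracket, \mathit{trfev}\llbracket \mathit{fe} \rrbracket, \mathtt{\_result}) \Leftarrow \\
&\qquad\quad \mathtt{instanceof}(\mathtt{\_result}, \mathit{trmeta}\llbracket B \rrbracket) \wedge \mathtt{project}(\mathtt{\_result}, \mathit{trexp}\llbracket B \rrbracket, \mathit{trfenv}\llbracket \mathit{fe} \rrbracket) \,\}
\end{aligned}
\]

The function trfe generates the clauses for every field of the case class through

the function trfield, which generates the clauses to represent the declaration (msgtype),

pre- (pre) and post-condition (post) of every field:

\[
\begin{aligned}
\mathit{trfe} &: CI \times CI \times FE \to EP \\
\mathit{trfe}\llbracket A, B, \langle mi_1, C_1 \rangle \ldots \langle mi_n, C_n \rangle \rrbracket &= \bigcup_{j \in \{1 \ldots n\}} \mathit{trfield}\llbracket A, B, C_j, mi_j, j \rrbracket
\end{aligned}
\]

\[
\begin{aligned}
\mathit{trfield} &: CI \times CI \times CI \times MI \times \mathbb{N} \to EP \\
\mathit{trfield}\llbracket A, B, C, mi, j \rrbracket &= \{\, \mathtt{msgtype}(\mathit{trexp}\llbracket A \rrbracket, \mathit{mkconst}\llbracket mi \rrbracket) \Leftarrow \top, \\
&\qquad \mathtt{pre}(\mathit{trexp}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{mkconst}\llbracket mi \rrbracket, []) \Leftarrow \mathtt{instanceof}(\mathtt{\_self}, \mathit{trexp}\llbracket B \rrbracket), \\
&\qquad \mathtt{post}(\mathit{trexp}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{mkconst}\llbracket mi \rrbracket, [], \mathtt{\_result}) \Leftarrow \\
&\qquad\quad \mathtt{instanceof}(\mathtt{\_result}, \mathit{trexp}\llbracket C \rrbracket) \wedge \mathtt{project}(\mathtt{\_self}, \mathit{trexp}\llbracket B \rrbracket, [\underbrace{\_, \ldots, \_}_{j-1}, (mi, \mathtt{\_result}) \mid \_]) \,\}
\end{aligned}
\]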

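The function trfev generates a Prolog list with a valid Prolog variable

obtained from every field name:

\[
\begin{aligned}
\mathit{trfev} &: (MI \times CI)^{*} \to \mathit{Term} \\
\mathit{trfev}\llbracket \langle mi_1, C_1 \rangle \ldots \langle mi_n, C_n \rangle \rrbracket &= [\mathit{mkvar}\llbracket mi_1 \rrbracket, \ldots, \mathit{mkvar}\llbracket mi_n \rrbracket]
\end{aligned}
\]

The function trfety generates instanceof calls given a sequence of fields:

\[
\begin{aligned}
\mathit{trfety} &: FE \to EG \\
\mathit{trfety}\llbracket \langle mi_1, C_1 \rangle \ldots \langle mi_n, C_n \rangle \rrbracket &= \mathtt{instanceof}(\mathit{mkvar}\llbracket mi_1 \rrbracket, \mathit{mkconst}\llbracket C_1 \rrbracket) \wedge \ldots \\
&\quad \wedge\ \mathtt{instanceof}(\mathit{mkvar}\llbracket mi_n \rrbracket, \mathit{mkconst}\llbracket C_n \rrbracket)
\end{aligned}
\]

The function trset generates a Prolog list with valid Prolog terms that represent

the set of class identifiers:

\[
\begin{aligned}
\mathit{trset} &: 2^{CI} \to \mathit{Term} \\
\mathit{trset}\llbracket \{C_1, \ldots, C_n\} \rrbracket &= [\mathit{mkconst}\llbracket C_1 \rrbracket, \ldots, \mathit{mkconst}\llbracket C_n \rrbracket]
\end{aligned}
\]

The function trfenv generates an association list from field names to the variables

generated by trfev:

\[
\begin{aligned}
\mathit{trfenv} &: (MI \times CI)^{*} \to \mathit{Term} \\
\mathit{trfenv}\llbracket \langle mi_1, C_1 \rangle \ldots \langle mi_n, C_n \rangle \rrbracket &= [(\mathit{mkconst}\llbracket mi_1 \rrbracket, \mathit{mkvar}\llbracket mi_1 \rrbracket), \ldots, (\mathit{mkconst}\llbracket mi_n \rrbracket, \mathit{mkvar}\llbracket mi_n \rrbracket)]
\end{aligned}
\]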

Translation of Method Environments

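The synthesis of the clauses for methods is self-explanatory:

\[
\begin{aligned}
\mathit{trme} &: CI \times ME \to EP \\
\mathit{trme}\llbracket A, \mathit{methods} \rrbracket &= \bigcup_{m \in \operatorname{dom} \mathit{methods}} \mathit{trms}\llbracket A, m, \mathit{methods}(m) \rrbracket
\end{aligned}
\]

\[
\begin{aligned}
\mathit{trms} &: CI \times MI \times MS \to EP \\
\mathit{trms}\llbracket A, m, \langle md, p, q \rangle \rrbracket &= \mathit{trmd}\llbracket A, m, md \rrbracket \\
&\quad \cup\ \{\, \mathtt{pre}(\mathit{trexp}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{mkconst}\llbracket m \rrbracket, \mathit{trmdv}\llbracket md \rrbracket) \Leftarrow \mathit{trform}\llbracket p \rrbracket, \\
&\qquad\ \ \mathtt{post}(\mathit{trexp}\llbracket A \rrbracket, \mathtt{\_self}, \mathit{mkconst}\llbracket m \rrbracket, \mathit{trmdv}\llbracket md \rrbracket, \mathtt{\_result}) \Leftarrow \mathit{trform}\llbracket q \rrbracket \,\}
\end{aligned}
\]

\[
\begin{aligned}
\mathit{trmd} &: CI \times MI \times MD \to EP \\
\mathit{trmd}\llbracket A, m, md \rrbracket &= \{\, \mathtt{msgtype}(\mathit{trexp}\llbracket A \rrbracket, \mathit{mkconst}\llbracket m \rrbracket) \Leftarrow \top \,\}
\end{aligned}
\]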

The function trmdv generates a list of variables, one variable per argument of a

message:

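\[
\begin{aligned}
\mathit{trmdv} &: MD \to \mathit{Term} \\
\mathit{trmdv}\llbracket \langle \langle x_1, C_1 \rangle \ldots \langle x_n, C_n \rangle, R \rangle \rrbracket &= [\mathit{mkvar}\llbracket x_1 \rrbracket, \ldots, \mathit{mkvar}\llbracket x_n \rrbracket]
\end{aligned}
\]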

Translation of Formulae and Expressions

The translation of formulae is straightforward for the non-atomic formulae.

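\[
\begin{aligned}
\mathit{trform} &: \mathit{Form} \to EG \\
\mathit{trform}\llbracket \neg F \rrbracket &= \mathit{negate}(\mathit{trform}\llbracket F \rrbracket) \\
\mathit{trform}\llbracket F * G \rrbracket &= \mathit{trform}\llbracket F \rrbracket * \mathit{trform}\llbracket G \rrbracket \quad \text{where } * \in \{\wedge, \vee, \Rightarrow, \Leftrightarrow\} \\
\mathit{trform}\llbracket \forall x : A\,(F) \rrbracket &= \forall\, \mathit{trexp}\llbracket x \rrbracket\, (\mathit{trform}\llbracket x : A \rrbracket \Rightarrow \mathit{trform}\llbracket F \rrbracket) \\
\mathit{trform}\llbracket \exists x : A\,(F) \rrbracket &= \exists\, \mathit{trexp}\llbracket x \rrbracket\, (\mathit{trform}\llbracket x : A \rrbracket \wedge \mathit{trform}\llbracket F \rrbracket)
\end{aligned}
\]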

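More interesting is the translation of equality and instanceof since the object

expressions have to be reduced before using the respective implementation of

both predicates (eq and instanceof):

\[
\begin{aligned}
\mathit{trform} &: \mathit{Form} \to EG \\
\mathit{trform}\llbracket e_1 = e_2 \rrbracket &= \mathtt{reduce}(\mathit{trexp}\llbracket e_1 \rrbracket, NF_1) \wedge \mathtt{reduce}(\mathit{trexp}\llbracket e_2 \rrbracket, NF_2) \\
&\quad \wedge\ \mathtt{eq}(\mathit{trexp}\llbracket A \rrbracket, NF_1, NF_2) \\
&\quad \text{where } A \text{ is the minimum common type of } e_1 \text{ and } e_2 \\
&\quad \text{and } NF_1 \text{ and } NF_2 \text{ are fresh variables} \\
\mathit{trform}\llbracket e : A \rrbracket &= \mathtt{reduce}(\mathit{trexp}\llbracket e \rrbracket, NF) \wedge \mathtt{instanceof}(NF, \mathit{trexp}\llbracket A \rrbracket) \\
&\quad \text{where } NF \text{ is a fresh variable}
\end{aligned}
\]

\[
\begin{aligned}
\mathit{trexp} &: \mathit{Expr} \to \mathit{Term} \\
\mathit{trexp}\llbracket A \rrbracket &= \mathit{mkconst}\llbracket A \rrbracket \\
\mathit{trexp}\llbracket x \rrbracket &= \mathit{mkvar}\llbracket x \rrbracket \\
\mathit{trexp}\llbracket e.m(e_1, \ldots, e_n) \rrbracket &= \mathtt{send}(\mathit{trexp}\llbracket e \rrbracket, \mathit{mkconst}\llbracket m \rrbracket(\mathit{trexp}\llbracket e_1 \rrbracket, \ldots, \mathit{trexp}\llbracket e_n \rrbracket))
\end{aligned}
\]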
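The translation of the two atomic formulae can be made concrete with a small Python sketch that emits Prolog text. It is an illustration under assumptions, not the Haskell implementation: the tuple encoding of expressions and formulae, the function make_trform and the fresh names NF1, NF2, ... are hypothetical, class identifiers are quoted while message identifiers keep their lexeme, and sends are printed with the infix <-- used in Figure 5.5.

```python
import itertools

def mkvar(name):
    """Clay variable -> Prolog variable (prefix with '_')."""
    return "_" + name

def mkconst(name):
    """Clay class identifier -> quoted Prolog constant."""
    return "'" + name + "'"

def trexp(e):
    """Translate an expression tuple into Prolog text.

    ("class", name) | ("var", name) | ("send", receiver, message, [args])
    """
    if e[0] == "class":
        return mkconst(e[1])
    if e[0] == "var":
        return mkvar(e[1])
    _, receiver, message, args = e
    call = message
    if args:
        call += "(" + ",".join(trexp(a) for a in args) + ")"
    return trexp(receiver) + "<--" + call

def make_trform():
    """Return a trform function with its own supply of fresh variables."""
    counter = itertools.count(1)

    def fresh():
        return "NF" + str(next(counter))

    def trform(f):
        if f[0] == "eq":   # ("eq", common_type, e1, e2)
            _, common_type, e1, e2 = f
            nf1, nf2 = fresh(), fresh()
            return [f"reduce({trexp(e1)},{nf1})",
                    f"reduce({trexp(e2)},{nf2})",
                    f"eq({mkconst(common_type)},{nf1},{nf2})"]
        if f[0] == "isa":  # ("isa", e, class_name)
            _, e, cls = f
            nf = fresh()
            return [f"reduce({trexp(e)},{nf})",
                    f"instanceof({nf},{mkconst(cls)})"]
        raise ValueError("only atomic formulae are handled in this sketch")

    return trform

trform = make_trform()
# Second conjunct of the postcondition of remove: result.insert(x) = self
goal = trform(("eq", "BSTInt",
               ("send", ("var", "result"), "insert", [("var", "x")]),
               ("var", "self")))
print(", ".join(goal))
# reduce(_result<--insert(_x),NF1), reduce(_self,NF2), eq('BSTInt',NF1,NF2)
```

Unlike the synthesised code of Figure 5.5, this sketch also emits a reduce call for plain variables, exactly as the equations above prescribe.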

Translation of Lexical Elements

mkvar      generates a valid Prolog variable from any Clay identifier
           by prefixing the identifier with _
mkconst    generates a valid Prolog constant from any Clay identifier
trmeta     generates the class a given class A is an instance of:
           $\mathtt{meta}\,\mathit{mkconst}\llbracket A \rrbracket$
trmk       generates the message that constructs instances of a case
           class A: $\mathtt{mk}\,\mathit{mkconst}\llbracket A \rrbracket$

5.4.5 Lloyd-Topor Transformation

\[
\begin{aligned}
\mathit{lloyd\_topor} &: EP \to LP \\
\mathit{lloyd\_topor}\llbracket ep \rrbracket &= \text{Lloyd-Topor transformation [91, 90] applied to } ep
\end{aligned}
\]

The Lloyd-Topor transformation is the application of the rules in Figure 5.8

to the extended program (EP) until no more transformations can be applied and

a general program (LP) is obtained.
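To give a feeling of how the rules are applied until a fixpoint is reached, here is a minimal Python sketch of the propositional subset of the transformation. It is not the compiler's implementation: the tuple encoding of formulae and the functions is_literal, rewrite and lloyd_topor are assumptions made for this example, quantifier and equivalence rules are left out, and the flattening of nested conjunctions is an extra step that the rules leave implicit.

```python
# Formulae: ("atom", name) | ("not", F) | ("and", F, G) | ("or", F, G)
#         | ("imp", W, V) for W ⇒ V.
# A clause is (head, body) where body is a list of conjoined formulae.

def is_literal(f):
    return f[0] == "atom" or (f[0] == "not" and f[1][0] == "atom")

def rewrite(f):
    """Replace a non-literal conjunct by one list of conjuncts per new clause."""
    tag = f[0]
    if tag == "and":                  # flatten a nested conjunction
        return [[f[1], f[2]]]
    if tag == "or":                   # V ∨ W: one clause per disjunct
        return [[f[1]], [f[2]]]
    if tag == "imp":                  # W ⇒ V: V, or else ¬W
        return [[f[2]], [("not", f[1])]]
    if tag != "not":
        raise ValueError(f"unsupported formula: {f!r}")
    g = f[1]
    if g[0] == "not":                 # ¬¬W
        return [[g[1]]]
    if g[0] == "and":                 # ¬(V ∧ W)
        return [[("not", g[1])], [("not", g[2])]]
    if g[0] == "or":                  # ¬(V ∨ W)
        return [[("not", g[1]), ("not", g[2])]]
    if g[0] == "imp":                 # ¬(W ⇒ V)
        return [[g[1], ("not", g[2])]]
    raise ValueError(f"unsupported formula: {f!r}")

def lloyd_topor(clauses):
    """Apply the rules until every body contains only literals."""
    done, todo = [], list(clauses)
    while todo:
        head, body = todo.pop()
        for i, f in enumerate(body):
            if not is_literal(f):
                for alternative in rewrite(f):
                    todo.append((head, body[:i] + alternative + body[i + 1:]))
                break
        else:
            done.append((head, body))
    return done

a, b, c, d = (("atom", n) for n in "abcd")
# p ⇐ a ∧ (b ∨ ¬(c ∧ d))
general = lloyd_topor([("p", [a, ("or", b, ("not", ("and", c, d)))])])
for head, body in general:
    print(head, ":-", body)
```

For the example clause the result is three general clauses with bodies a ∧ b, a ∧ ¬c and a ∧ ¬d; the order in which they are produced depends on the worklist and carries no meaning.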

5.5 Experimental Results

Let us get back to the problems posed in Section 5.2. Relevant parts of the code

obtained for the recursive definition of insert are shown in Figure 5.5. We have

automatically generated up to 4000 trees with up to 80 nodes with a maximum

depth of 10. The insertion of elements works properly and the execution time in

the worst case is less than 1000 milliseconds.

The second question was whether Clay would be able to generate an exe-

cutable prototype from the implicit specification of the remove method for binary

search trees. The code obtained is shown in Figure 5.5.

Running tests on this specification revealed several problems. First, the logic

program obtained from the specification seemed to be only partially correct.

Given a (valid) binary search tree and one of its elements, the prototype returned

a tree meeting the specification in some cases but failed in the rest.

Replace                           by
A ⇐ α ∧ ¬(V ∧ W) ∧ β              A ⇐ α ∧ ¬V ∧ β
                                  A ⇐ α ∧ ¬W ∧ β
A ⇐ α ∧ ∀x(W) ∧ β                 A ⇐ α ∧ ¬∃x(¬W) ∧ β
A ⇐ α ∧ ¬∀x(W) ∧ β                A ⇐ α ∧ ∃x(¬W) ∧ β
A ⇐ α ∧ (W ⇒ V) ∧ β               A ⇐ α ∧ V ∧ β
                                  A ⇐ α ∧ ¬W ∧ β
A ⇐ α ∧ ¬(W ⇒ V) ∧ β              A ⇐ α ∧ W ∧ ¬V ∧ β
A ⇐ α ∧ (V ∨ W) ∧ β               A ⇐ α ∧ V ∧ β
                                  A ⇐ α ∧ W ∧ β
A ⇐ α ∧ ¬(V ∨ W) ∧ β              A ⇐ α ∧ ¬V ∧ ¬W ∧ β
A ⇐ α ∧ ¬¬W ∧ β                   A ⇐ α ∧ W ∧ β
A ⇐ α ∧ ∃x(W) ∧ β                 A ⇐ α ∧ W ∧ β
A ⇐ α ∧ ¬∃x(W) ∧ β                A ⇐ α ∧ ¬P ∧ β
                                  P ⇐ α ∧ ∃x(W) ∧ β

where P is p(y1, ..., yk) with p being a new predicate name and y1, ..., yk
being the free variables in W. In these rules, A represents an atom (Atom),
α and β represent a conjoined formula of extended goals (EG), and V and W
represent extended goals (EG).

Figure 5.8: Lloyd-Topor transformation rules.

Analysis of the tests revealed that the synthesised prototype was working

properly exactly in those cases where the element to remove was at the leaves of

the structure – i.e. in a node with two empty subtrees as children. This solves the

mystery: the specification for remove uses predefined (structural) equality while

the specifier was probably thinking of the intended set semantics for the trees as

collections. The order in which elements are stored in the tree affects its actual

shape. That is why only elements that make their way down to the “bottom” of

the structure via method insert meet the specification of remove.
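The phenomenon is easy to reproduce outside Clay. The following Python sketch is only an illustration: the tuple encoding of trees and the helper names (build, remove_candidates, structural, as_sets) are assumptions made for this example, and the candidates are restricted to the valid search trees over the remaining elements.

```python
from itertools import permutations

# Trees as nested tuples: None is Empty, (data, left, right) is a Node.
def insert(t, x):
    if t is None:
        return (x, None, None)
    d, left, right = t
    if x < d:
        return (d, insert(left, x), right)
    if x == d:
        return t
    return (d, left, insert(right, x))

def contains(t, x):
    if t is None:
        return False
    d, left, right = t
    if x == d:
        return True
    return contains(left, x) if x < d else contains(right, x)

def flatten(t):
    """In-order list of the elements of the tree."""
    return [] if t is None else flatten(t[1]) + [t[0]] + flatten(t[2])

def build(xs):
    t = None
    for x in xs:
        t = insert(t, x)
    return t

def remove_candidates(tree, x, equal):
    """All search trees r with not r.contains(x) and r.insert(x) 'equal' to tree."""
    rest = [y for y in flatten(tree) if y != x]
    shapes = {build(p) for p in permutations(rest)}
    return [r for r in shapes
            if not contains(r, x) and equal(insert(r, x), tree)]

def structural(t1, t2):   # predefined equality
    return t1 == t2

def as_sets(t1, t2):      # equality after flattening both sides
    return flatten(t1) == flatten(t2)

tree = build([2, 1, 3])   # 2 at the root, 1 and 3 at the leaves
print(remove_candidates(tree, 1, structural))       # one solution: 1 is a leaf
print(remove_candidates(tree, 2, structural))       # []: 2 is not a leaf
print(len(remove_candidates(tree, 2, as_sets)))     # 2
```

With structural equality the root can never be removed, because reinserting it places it at the bottom of the candidate tree and the shapes differ; comparing the flattened trees accepts both possible results.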

In other words, the specification was flawed and the execution allowed us

to spot the bug. There are several ways to solve the problem. One of them is,

of course, to use a self-normalising data structure – balanced tree, heap... – for

which predefined equality behaves as set equality. A quicker fix – less efficient –

is to flatten both sides of the equality:

modifier remove (x : Int) {
  post { result.contains(x) : False ∧
         result.insert(x).flatten() = self.flatten() }
}

where flatten is an observer defined recursively in the obvious way:

observer flatten () : List<Int> {
  post { self : Empty ∧ result = []
       ∨ self : Node ∧
         result = self.left.flatten.append([].cons(self.data)).append(self.right.flatten) }
}

We now show the effects of safe inheritance:

?- reduce('Cell'<--mkCellCase(0),R).
R = 'CellCase'{contents : 0}

?- reduce('Cell'<--mkCellCase(0)<--set(1),R).
R = 'CellCase'{contents : 1}

?- reduce('Cell'<--mkCellCase(0)<--set(1)<--get,R).
R = 1

?- reduce('ReCell'<--mkReCell<--set(0)<--set(1)<--restore<--get,R).
R = 0

?- reduce('Cell'<--mkCell<--set(0)<--set(1)<--restore<--get,R).
no

We finish the presentation of our results with some performance figures. Our

experiments were run on an Ubuntu box running GNU/Linux 2.6.32-25 SMP

on a machine with an Intel Dual Core CPU [email protected], 4096KB of

cache and 2 GB of RAM. Our Prolog engine is SWI-Prolog version 5.10.0. In the

following table we show the performance of the generated prototypes. All the

tests have been executed with 38 as the limit in the depth for the iterative deep-

ening strategy for predicate instanceof.

Test                                  Time (ms)
Generation of trees (1000 trees)      202
Creation of trees (15 insertions)     929
Removing leaf from tree (1 node)      0
Removing leaf from tree (3 nodes)     391

The full Clay specifications for these examples, their translation into Prolog and

the Prolog implementation of the Clay theory can be found at

http://babel.ls.fi.upm.es/~angel/papers/2010lopstr-lncs-code.tgz.

5.6 Related Work and Conclusions

We have presented the compilation scheme of an object-oriented formal nota-

tion into logic programs. This allows the generation of executable prototypes that

can help in validating requirements, e.g. by means of automated test generation.

We have focused on the generation of code from implicit method specifications,

especially in the presence of recursive definitions, something which is seldom

supported by other lightweight methods and tools.

Early experiments with our compiler show the feasibility of the approach, but

also the limitations of a naive application of Prolog’s standard search mecha-

nisms. In fact, obtaining an efficient search scheme is one of the challenges for

future research. Our current implementation combines techniques such as the

Lloyd-Topor transforms of first-order formulae and iterative deepening search

for achieving completeness in some examples.

We expect efficiency gains from the use of constructive negation

and also from techniques that allow for lazy instance generation, that is, coroutining

the logic code that implements quantification via instance generation with

the code that implements the implicit postconditions. More mature tools, like

ProB [86, 87], already take advantage of these.

One improvement that has already been incorporated is the use of con-

straints for arithmetic. The previous version of this work [68] used a Peano

representation for naturals (predefined class Nat) as a way of obtaining a com-

plete theory for numbers. Here, the predefined class Int encapsulates integers

that get translated into Prolog integers managed via finite domain constraints.

The tests show drastic gains over our previous implementation.

Certain features of object-oriented programming (e.g. mutable state) have

been left out of this presentation. Studying the introduction of state in our code

generation scheme would help in applying the ideas presented in this paper to

other object-oriented formal notations like VDM++, Object-Z, Troll or OASIS [46,

115, 75, 106].

6 The Clay Compiler

Abstract

To test our approach in the real world with real specifications we have de-

veloped a compiler for Clay. The compiler that we present in this chapter

translates Clay specifications into first-order logic, in particular to the

concrete language of Prover9/Mace4, and synthesises prototypes in Pro-

log. Our implementation has been crucial in the development of the

theoretical material of this thesis. We have implemented the compiler

in Haskell, applying a methodological approach [70] and state-of-the-art

techniques of functional programming.

6.1 Architecture of the Compiler

To design the architecture of our compiler we have applied the typical modularisation

of compilers, following guidance from [6] and [70] and state-of-the-art

Haskell techniques [136]. At the top level the compiler has three components,

as the reader can see in Figure 6.1:

Figure 6.1: Clay Compiler Architecture. (Diagram: Cell.cly enters the front-end, whose output feeds the back-end to Prover9, producing Cell.p9, and the back-end to Prolog, producing Cell.pl.)

Front-end. This component knows about the static semantics of Clay (concrete

syntax, abstract syntax and type system described in Chapter 3). In Figure

6.2 the reader can see how Clay specifications are parsed and then

transformed into typed abstract syntax trees and environments.

Back-ends. After the typechecking stage we find two back-ends:

• The first one implements the translation of Clay environments into

first-order logic theories in Prover9/Mace4 as described in Chapter 4.

Figure 6.3 shows the internal components: a translator of Clay envi-

ronments into OOFOL abstract syntax, an encoder of OOFOL abstract

syntax into the abstract syntax of first-order logic, and a pretty printer

to the concrete syntax of Prover9/Mace4 [95].

• The second one implements the translation of Clay environments

into executable prototypes in Prolog as described in Chapter 5. Fig-

ure 6.4 shows the internal components: a translator of Clay environ-

ments into the abstract syntax of extended programs, a Lloyd-Topor

transformer to the abstract syntax of logic programs, and a pretty

printer to the concrete syntax of SWI Prolog [148].

Haskell [73] is an industrial-strength, purely functional programming language. Haskell offers the programmer a substantial increase in productivity, concise and clear code, higher reliability and shorter lead times. For this work, however, the most important characteristic is that Haskell has a very small semantic gap with the formalisation of the mathematical structures and translation functions to be implemented.

Figure 6.2: Clay Compiler Architecture: Front-end. (Cell.cly → MTP → environments and ASTs → TC → annotated environments and ASTs.)

Figure 6.3: Clay Compiler Architecture: Back-end to FOL. (Annotated environments and ASTs → translation to OOFOL → OOFOL AST → encoding to FOL → FOL AST → pretty printer → Cell.p9.)

Figure 6.4: Clay Compiler Architecture: Back-end to Prolog. (Annotated environments and ASTs → translation to EP → EP AST → Lloyd-Topor negation transformation → LP AST → pretty printer → Cell.pl.)

6.2 More than Parsing (MTP)

In the implementation of the Clay compiler we have applied our own techniques

described in [70]:

• Concrete syntax is specified in GONF (Generalised Object-Oriented Nor-

mal Form), a formalism that facilitates the automatic derivation of the ab-

stract syntax of the language from its concrete syntax.

• Concrete syntax is transformed into an LALR(1) grammar. Such a grammar is fed to Happy, the most popular parser generator for Haskell, to obtain the parser.

• Semantic actions within the parser just create the abstract syntax tree.

• Abstract syntax trees are traversed in order to construct the environments.

We illustrate the application of our techniques with a pair of productions used in

the concrete syntax of Clay.


6.2.1 Class Specifications

A class specification consists of a class declaration, optionally a class invariant, a possibly empty sequence of case class declarations (state declarations) and a possibly empty sequence of method specifications. Its GONF production is:

class_spec ::= class_decl { invariant?
                            state_decl∗ method_spec∗
                          }

In the definition of the abstract syntax constant lexemes like ’{’ and ’}’ are

ignored. The Haskell type that represents class specifications is:

data ClassSpec = ClassSpec

ClassDecl

(Maybe Invariant)

[StateDecl]

[MethodSpec]

After the application of some straightforward rules we obtain the following

productions in Happy:

class_spec :: { ClassSpec }

class_spec : class_decl ’{’

invariant_opt

state_decl_seq0

method_spec_seq0

’}’

{ ClassSpec $1 $3 $4 $5 }

invariant_opt :: { Maybe Invariant }

invariant_opt : {- empty -}

{ Nothing }

| invariant

{ Just $1 }

state_decl_seq0 :: { [StateDecl] }

state_decl_seq0 : {- empty -}

{ [] }

| state_decl_seq0 state_decl

{ $2:$1 }

method_spec_seq0 :: { [MethodSpec] }

method_spec_seq0 : {- empty -}

{ [] }

| method_spec_seq0 method_spec

{ $2:$1 }


As we mentioned, in the MTP techniques, semantic actions just create the ap-

propriate abstract syntax nodes.
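One consequence of the left-recursive sequence productions above is worth spelling out: an action of the form { $2:$1 } conses each newly parsed item onto the front of the list, so the items accumulate in reverse source order. The following sketch simulates that accumulation with a fold. The names snoc, parseSeq and inSourceOrder are illustrative and do not come from the Clay compiler, and whether the compiler reverses these lists at a later stage is not shown in this chapter.

```haskell
-- Simulates the semantic action { $2 : $1 } of a left-recursive
-- sequence production: the new item ($2) is consed onto the
-- items parsed so far ($1).
snoc :: [a] -> a -> [a]
snoc parsedSoFar item = item : parsedSoFar

-- Folding from the left mimics the order in which an LALR(1) parser
-- reduces a left-recursive sequence.
parseSeq :: [a] -> [a]
parseSeq = foldl snoc []

-- A single reverse restores the source order.
inSourceOrder :: [a] -> [a]
inSourceOrder = reverse . parseSeq
```

Left recursion keeps the parser stack shallow, which is why this idiom is common in Happy grammars; the price is one reverse per sequence if source order matters.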

6.2.2 Object Expressions

Object expressions consist of four alternatives: a class expression, a variable, a message send and a sugared expression. Their GONF productions are:

obj_expr ::= class_expr | var_id | sent_expr | sugared_expr

sent_expr ::= obj_expr . msg_expr

sugared_expr ::= bin_expr | una_expr | num_expr

In the derivation of the abstract syntax from the concrete syntax we have applied the directive collapse [70], which leads to a flattened hierarchy of the alternatives. Binary and unary expressions are sugared versions of sent expressions. The resulting type in Haskell is:

data ObjExpr = Class ClsExpr

(Maybe ObjType)

| Var VarId

(Maybe ObjType)

| Send ObjExpr MsgExpr

(Maybe ObjType)

| IntExpr IntLit

(Maybe ObjType)

| DecExpr DecLit

(Maybe ObjType)

deriving (Data, Typeable)

Observe that, in each case, a semantic value is added: Maybe ObjType. Some

abstract syntax nodes have been decorated with semantic values that are com-

puted by the successive stages in the compiler. In this case, each object expres-

sion will be annotated with its minimum type in the typechecking phase.
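The decoration pattern can be sketched in isolation. The code below uses simplified stand-ins (plain strings for ObjType, VarId and MsgExpr, and only two of the alternatives) and a hypothetical helper decorate; only the name decoration is taken from the type checker shown later in this chapter, and its definition here is our assumption about its behaviour.

```haskell
-- Simplified stand-ins for the compiler's types (assumptions).
type ObjType = String
type VarId   = String
type MsgExpr = String

data ObjExpr
  = Var  VarId (Maybe ObjType)
  | Send ObjExpr MsgExpr (Maybe ObjType)
  deriving (Eq, Show)

-- Project the decoration of a node; Nothing means "not typechecked yet".
decoration :: ObjExpr -> Maybe ObjType
decoration (Var _ ty)    = ty
decoration (Send _ _ ty) = ty

-- Replace the decoration of the root node, leaving subexpressions untouched.
decorate :: ObjType -> ObjExpr -> ObjExpr
decorate ty (Var v _)    = Var v (Just ty)
decorate ty (Send o m _) = Send o m (Just ty)
```

The parser fills every decoration with Nothing, and a later stage rebuilds the tree with Just values, so the same data type serves both the undecorated and the annotated trees.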

Let us show the resulting productions in Happy:

obj_expr :: { ObjExpr }

obj_expr : cls_expr

{ Class $1 Nothing }

| var_id


{ Var $1 Nothing }

| obj_expr ’<-’ msg_expr

{ Send $1 $3 Nothing }

| sugared_expr

{ $1 }

sugared_expr :: { ObjExpr }

sugared_expr : bin_expr

{ $1 }

| una_expr

{ $1 }

| num_expr

{ $1 }

6.3 Environments

To allow efficient access to the information in the abstract syntax tree and to avoid the duplication of some computations we introduce a more abstract representation of the Clay specifications in the form of environments. We explore, briefly, the definition of our environments and the techniques applied in their construction.

6.3.1 The Environment Definition

Data type GlbEnv represents a symbol table in the form of an association list from class identifiers to (polymorphic) class environments:

type GlbEnv = [(ClsId,PolyClsEnv)]

data PolyClsEnv = PolyClsEnv { bndEnv :: BndEnv

, clsEnv :: ClsEnv }

The bndEnv (of type [(VarId,[ObjType])]) is a bound environment that asso-

ciates class variables to their bounds. The clsEnv field, of type

data ClsEnv = ClsEnv { clsId :: ClsId

, isCaseCls :: Bool

, superTys :: [ObjType]

, clsInv :: Formula

, sttEnv :: SttEnv

, msgEnv :: MsgEnv }

deriving (Data, Typeable)

is the heart of the Clay environments and its fields represent the following information:

• clsId is the class identifier.

• isCaseCls states whether the class is a case class.

• superTys contains the supertypes of the class.

• clsInv represents the class invariant.

• sttEnv represents the case classes of the class and their definitions.

• msgEnv represents the specification of the methods specified in the class.

6.3.2 The Environment Construction

Component MTP in Figure 6.2 constructs the environments. The construction of the global environment follows a monadic approach to introduce arbitrary control flows in the incremental construction of the environment.

Every node of the abstract syntax that modifies an environment is declared as an instance of class EnvModifier.

class EnvModifier env node where
  modify :: Monad m => env -> node -> m env

Let us show the precise instantiation for the ClassSpec:

instance EnvModifier Env ClassSpec where
  modify env (ClassSpec cd inv sds mss) =
    do env <- setCurClsEnv env emptyClsEnv
       env <- setCurBndEnv env emptyBndEnv
       env <- modify env cd
       env <- modify env inv
       env <- pushClsInv env
       env <- modify env sds
       env <- modify env mss
       return env
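The instance above calls modify on an optional node (inv) and on lists of nodes (sds, mss), which presupposes instances for Maybe and for lists. Those instances are not shown in this chapter; the following self-contained sketch shows one plausible way to write them, using a toy environment. ToyEnv, Decl and ToySpec are hypothetical names introduced only for this illustration.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Control.Monad (foldM)

class EnvModifier env node where
  modify :: Monad m => env -> node -> m env

-- An absent optional node leaves the environment unchanged.
instance EnvModifier env node => EnvModifier env (Maybe node) where
  modify env Nothing     = return env
  modify env (Just node) = modify env node

-- A sequence of nodes threads the environment from left to right.
instance EnvModifier env node => EnvModifier env [node] where
  modify = foldM modify

-- Toy environment: the list of names declared so far (assumption).
newtype ToyEnv = ToyEnv [String] deriving (Eq, Show)
newtype Decl   = Decl String

instance EnvModifier ToyEnv Decl where
  modify (ToyEnv names) (Decl name) = return (ToyEnv (names ++ [name]))

-- Toy counterpart of ClassSpec: a declaration, an optional node, a sequence.
data ToySpec = ToySpec Decl (Maybe Decl) [Decl]

instance EnvModifier ToyEnv ToySpec where
  modify env (ToySpec cd inv sds) =
    do env1 <- modify env cd
       env2 <- modify env1 inv
       modify env2 sds
```

Because modify is polymorphic in the monad, the same traversal can run in Maybe, Either or IO, which is what allows the "arbitrary control flows" mentioned above, for example aborting on the first error.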

The type we attach to the calculation of environments is

data Env = Env { glbEnv :: GlbEnv

, toLoad :: [ClsId]

, curModId :: Maybe ModId

, curImps :: [ClsId]

, curClsId :: Maybe ClsId

, curBndEnv :: Maybe BndEnv

, curClsEnv :: Maybe ClsEnv


, curSttInfo :: Maybe SttInfo

, curMsgInfo :: Maybe MsgInfo

, curLocEnv :: Maybe LocEnv

, curFormula :: Maybe Formula }

Env contains, mainly, a global environment and the intermediate information that is being constructed:

• toLoad is the list of classes to be loaded.

• curModId is the module of the class whose environment is under construction (if any).

• curImps is the list of currently imported classes.

• curClsId is the current class identifier (if any).

• curBndEnv is the current environment with the formal type parameters of

the current class (if any).

• curClsEnv is the current class environment under construction (if any).

• curSttInfo is the current case class (state) being traversed (if any).

• curMsgInfo is the information of the current method being traversed (if

any).

• curLocEnv represents the local parameters of the node being traversed (if

any).

• curFormula contains a representative formula of the last traversed node (if

any).

Now we can understand the line env <- pushClsInv env: monadic function

pushClsInv pushes the current formula (curFormula) in the class invariant field

(clsInv) of the current class environment (curClsEnv).

During the construction of the internal environments these are traversed by a generic function that qualifies every class identifier with the current module in the environment. Here we make use of the Haskell SYB (scrap your boilerplate) technology:

qualifyClsIds :: (Monad m, Data a) => Env -> a -> m a

qualifyClsIds env = everywhereM (mkM (qualifyWrt (curImps env)))


6.4 Type Checking

With all the environments constructed, the typechecking stage is not very difficult to implement: we annotate object expressions with their types and we check the consistency of types with respect to the typing rules of Chapter 3.

The most important functions of the type checker are tcAtomicFormula and

tcObjExpr. The index of the equality predicate, one of the alternatives of an

atomic formula, is calculated with the function minCommonSupertype:

tcAtomicFormula :: Monad m
                => Env -> AtomicFormula -> m AtomicFormula
tcAtomicFormula env (Equal oe1 oe2 _) =
  do tcoe1 <- tcObjExpr env oe1
     tcoe2 <- tcObjExpr env oe2
     return $ Equal tcoe1 tcoe2 (do ty1 <- decoration tcoe1
                                    ty2 <- decoration tcoe2
                                    minCommonSupertype env ty1 ty2)

In tcObjExpr, the most relevant part is in the last but one line where the func-

tion checkResultType implements the typing rule for the syntax of sending mes-

sages.

tcObjExpr :: Monad m => Env -> ObjExpr -> m ObjExpr
tcObjExpr _env e@(Class ce _type) =
  do return $ Class ce (Just $ metaType [objTypeFromClsExpr ce])
tcObjExpr env e@(Var vid _type) =
  do lenv <- getCurLocEnv env
     case lookup vid lenv of
       Nothing -> do benv <- getCurBndEnv env
                     case lookup vid benv of
                       Nothing -> fail $ "Variable not found: ’"
                                         ++ show vid
                                         ++ "’ in "
                                         ++ (show env)
                       Just _ -> return $ Var vid (Just $ TVar vid)
       Just ot -> return $ Var vid (Just ot)
tcObjExpr env e@(Send oe me _type) =
  do tcoe <- tcObjExpr env oe
     let Just ot = decoration tcoe
     tcme <- tcMsgExpr env ot me
     rt <- checkResultType ot tcme
     return $ Send tcoe tcme rt


6.5 Translators

Both back-ends follow the same architecture: given the annotated environments

resulting from the front-end, a translation process generates an abstract syntax

tree that represents the sentences of the target language for the back-end.

6.5.1 Translation into Prover9/Mace4

Abstract Syntax of OOFOL

The logic presented in Chapter 4 is represented in Haskell using phantom types

to capture sort information of terms. We start with the introduction of the sorts:

data ClsIdS

data ClsS

data MsgIdS

data MsgS

data ObjS

class Sort s where

...

instance Sort ClsIdS where

...

instance ObjSort ClsS where

...

instance Sort MsgIdS where

...

instance Sort MsgS where

...

instance ObjSort ObjS where

...

Typed terms are then defined with the type

data Sort s => TypedTerm s = TT Term

data Term = Var String

| Cte String

| Cls (TypedTerm ClsIdS) [TypedTerm ClsS]

| Msg (TypedTerm MsgIdS) [TypedTerm ObjS]

| Send (TypedTerm ObjS) (TypedTerm MsgS)

The class Sort defines the creation of variables and the projection of the sort

name:

class Sort s where


var :: String -> TypedTerm s

var ident = TT (Var ident)

sortName :: TypedTerm s -> String

With this infrastructure we have an abstract syntax that does not allow ill-formed OOFOL formulae. The type that represents these formulae is

data Formula = Top

| Bot

| Wfo (TypedTerm ObjS)

| Wfc (TypedTerm ClsS)

| Subclass (TypedTerm ClsS) (TypedTerm ClsS)

| Instanceof (TypedTerm ObjS) (TypedTerm ClsS)

| Eq (TypedTerm ClsS) (TypedTerm ObjS) (TypedTerm ObjS)

| Pre (TypedTerm MsgIdS)

(TypedTerm ClsS)

[(TypedTerm ClsS)]

(TypedTerm ClsS)

(TypedTerm ObjS)

[(TypedTerm ObjS)]

| Post (TypedTerm MsgIdS)

(TypedTerm ClsS)

[(TypedTerm ClsS)]

(TypedTerm ClsS)

(TypedTerm ObjS)

[(TypedTerm ObjS)]

(TypedTerm ObjS)

| Neg Formula

| Conj Formula Formula

| Disj Formula Formula

| Impl Formula Formula

| Equiv Formula Formula

| forall s . Sort s => Forall (TypedTerm s) Formula

| forall s . Sort s => Exists (TypedTerm s) Formula
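The effect of the phantom sort parameter can be seen in a reduced sketch. The code below keeps only two sorts and one predicate, drops the Sort class, and introduces two hypothetical smart constructors (objVar and clsCte) that are not part of the thesis code.

```haskell
-- Empty data types used only as sort tags.
data ObjS
data ClsS

data Term = Var String | Cte String
  deriving (Eq, Show)

-- The parameter s does not occur on the right-hand side: it is a phantom type.
newtype TypedTerm s = TT Term
  deriving (Eq, Show)

-- Hypothetical smart constructors that fix the sort of the term they build.
objVar :: String -> TypedTerm ObjS
objVar = TT . Var

clsCte :: String -> TypedTerm ClsS
clsCte = TT . Cte

data Formula = Instanceof (TypedTerm ObjS) (TypedTerm ClsS)
  deriving (Eq, Show)

wellSorted :: Formula
wellSorted = Instanceof (objVar "o") (clsCte "Cell")

-- Rejected by the Haskell type checker, because the arguments have the
-- wrong sorts:
-- illSorted = Instanceof (clsCte "Cell") (objVar "o")
```

The sort tags exist only at compile time: both terms are stored as a plain Term, so the guarantee costs nothing at run time, but it holds only as long as terms are built through sort-fixing functions rather than by applying TT at an arbitrary type.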

Translation of Clay into OOFOL

The implementation of the translation function formalised in Chapter 4 is the

Haskell function trans of the type class Trans:

class Trans node fol where

trans :: node -> fol

Let us show a pair of examples of instantiation of the class for AtomicFormula and

ObjExpr.

instance Trans AtomicFormula Log.Formula where

trans (Equal x y oty) = Eq (trans oty) (trans x) (trans y)


trans (InstanceOf o c) = Instanceof (trans o) (trans c)

...

instance Trans ObjExpr (TypedTerm ObjS) where

trans (Class ce _type) = trans ce

trans (Var v _type) = trans v

trans (Send o m _type) = transSend (trans o) (trans m)

Encoding of OOFOL in Prover9/Mace4

Encoding OOFOL formulae in FOL is more or less straightforward. We have added a class that introduces a show9 function that translates a given construction into the syntax of Prover9/Mace4:

class ShowAsProver9 a where

show9 :: a -> String

show9asTree :: Int -> a -> String

Then, TypedTerm and Formula are instances of ShowAsProver9. The most relevant portion of code is the translation of quantifiers, which follows Enderton [45]:

instance ShowAsProver9 Formula where

show9 Top =

"$T"

show9 Bot =

"$F"

...

show9 (Forall v f) =

"(forall " ++ show9 v

++ "(" ++ whichSort v

++ " -> "

++ show9 f ++ ")"

++ ")"

show9 (Exists v f) =

"(exists " ++ show9 v

++ "(" ++ whichSort v

++ " & "

++ show9 f ++ ")"

++ ")"
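The two quantifier cases implement the usual relativisation of sorted quantifiers into unsorted first-order logic. Writing $s(x)$ for the sort predicate that whichSort produces for the variable $x$ (this is our reading of the code above, not a definition taken from the thesis) and $\varphi^{\ast}$ for the encoding of $\varphi$, the two cases can be summarised as:

```latex
\[
\begin{aligned}
(\forall x{:}s.\ \varphi)^{\ast} &= \forall x\,\bigl(s(x) \rightarrow \varphi^{\ast}\bigr)\\
(\exists x{:}s.\ \varphi)^{\ast} &= \exists x\,\bigl(s(x) \wedge \varphi^{\ast}\bigr)
\end{aligned}
\]
```

The asymmetry matters: the sort guard is an implication under the universal quantifier and a conjunction under the existential one. A conjunction under the universal quantifier would assert that every element of the domain has sort $s$.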


6.5.2 Synthesis of Prolog Programs

Abstract Syntax of Extended Programs

The Lloyd-Topor transformation transforms clauses with arbitrary first-order formulae in the body into general programs, i.e. programs where a single negation is allowed in front of each atom in the body of a clause. We have introduced one abstract grammar for extended programs and another one for general programs:

newtype EP = EP [EPC]

data EPC = EPC Atom [Literal] [Formula]

newtype GP = GP [GPC]

data GPC = GPC Atom [Literal]

The meaning of EPC h a b is a ∧ b → h (where a and b are interpreted as the

conjunction of the formulae in a and b).

The definitions of the types Atom, Literal and Formula are

data Atom = Atom String [Term]

data Term = Var String

| Structure String [Term]

| List [Term]

| Tuple [Term]

| Integer Integer

data Literal = Pos Atom

| Neg Atom

data Formula = Top

| Bot

| At Atom

| Not Formula

| Conj Formula Formula

| Disj Formula Formula

| Impl Formula Formula

| Equiv Formula Formula

| Forall String Formula

| Exists String Formula

Translation of Clay into General Programs

We have applied the same approach we used in the back-end to first-order logic

to implement the translation function of Chapter 5:


class Trans node fol where

trans :: node -> fol

Let us show a pair of examples of instantiation of the class for AtomicFormula and

ObjExpr.

instance Trans AtomicFormula Prolog.Formula where

trans (Equal x y oty) =

let (redX, xTr) = flatSend x

(redY, yTr) = flatSend y

in Conj redX

(Conj redY

(At (mkEqAtom (maybe (Prolog.Var "_") trans oty)

xTr

yTr)))

trans (InstanceOf o c) =

let (redO, oTr) = flatSend o

cTr = trans c

in Conj redO

(At (mkInstanceofAtom oTr cTr))

...

instance Trans ObjExpr Term where

trans (Class ce _type) =

trans ce

trans (Var v _type) =

trans v

trans (Send o m _type) =

send (trans o) (trans m)

trans (IntExpr il _type) =

Integer (intLitValue il)

The function flatSend introduces a formula with the predicate reduce that will reduce the translated expression to a normal form:

flatSend :: ObjExpr -> (Prolog.Formula, Term)
flatSend oe =
  let oeTr = trans oe
  in case oeTr of
       Structure "send" ts ->
         let nfVar = Prolog.Var ("_NF_" ++ headsOfTerms ts)
         in (At (mkReduceAtom oeTr nfVar), nfVar)
       Integer _i ->
         let nfVar = Prolog.Var ("_NF_" ++ headOfTerm oeTr)
         in (At (mkReduceAtom oeTr nfVar), nfVar)
       _ -> (Prolog.Top, oeTr)

Finally, function transLloydTopor implements the Lloyd-Topor transforma-

tion.
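The code of transLloydTopor is not reproduced in this chapter. To give an idea of what the transformation does, the following sketch covers only the quantifier-free rules of Lloyd and Topor on a propositional toy syntax: conjunctions are flattened into the body, disjunctions split a clause in two, and negations are pushed inwards until they sit in front of atoms. All names (bodies, lloydTopor) and the simplified types are assumptions made for this illustration; quantifiers, implications and the introduction of auxiliary predicates are deliberately left out.

```haskell
-- Toy extended bodies: atoms are plain strings (assumption).
data Formula
  = At String
  | Not Formula
  | Conj Formula Formula
  | Disj Formula Formula

data Literal = Pos String | Neg String
  deriving (Eq, Show)

-- Each inner list is the body of one general clause.
bodies :: Formula -> [[Literal]]
bodies (At a)           = [[Pos a]]
bodies (Conj v w)       = [bv ++ bw | bv <- bodies v, bw <- bodies w]
bodies (Disj v w)       = bodies v ++ bodies w
bodies (Not (At a))     = [[Neg a]]
bodies (Not (Not v))    = bodies v
bodies (Not (Conj v w)) = bodies (Not v) ++ bodies (Not w)
bodies (Not (Disj v w)) = bodies (Conj (Not v) (Not w))

-- A clause "h <- f" becomes one general clause per alternative body.
lloydTopor :: String -> Formula -> [(String, [Literal])]
lloydTopor h f = [(h, b) | b <- bodies f]
```

For instance, the clause p ← q ∧ ¬(r ∨ s) becomes the single general clause p ← q, ¬r, ¬s, while p ← q ∨ r becomes the two clauses p ← q and p ← r.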


Encoding of Extended Programs in Prolog

Encoding general programs into Prolog is straightforward and we implemented it by defining the following instances of Show:

instance Show GP where

show (GP gpcs) = showAnyListWith show "" "\n\n" "" gpcs

instance Show GPC where

show (GPC h b) = show h ++

if null b

then "."

else " :-\n"

++ showAnyListWith show " " ",\n " "." b

show (GPCDec d) = ":- " ++ d ++ "."

instance Show Literal where

show (Pos a) = show a

show (Neg a) = "\\+ " ++ show a

instance Show Atom where

show (Atom p ts) = p ++ if null ts

then ""

else showAnyListWith show "(" ", " ")" ts
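The helper showAnyListWith is used above but not defined in this chapter. Judging from its call sites, it takes a rendering function, an opening string, a separator, a closing string and a list. The definition below is a guess that is consistent with those call sites, not the author's code; showAtom is a hypothetical stand-in for the Show Atom instance, with plain strings as terms.

```haskell
import Data.List (intercalate)

-- Hypothetical definition: render each element, join the results with the
-- separator, and wrap everything between the opening and closing strings.
showAnyListWith :: (a -> String) -> String -> String -> String -> [a] -> String
showAnyListWith sh open sep close xs =
  open ++ intercalate sep (map sh xs) ++ close

-- Toy atom whose arguments are already rendered as strings (assumption).
data Atom = Atom String [String]

-- Mirrors the shape of the Show Atom instance above.
showAtom :: Atom -> String
showAtom (Atom p ts) =
  p ++ if null ts then "" else showAnyListWith id "(" ", " ")" ts
```

With this reading, an atom without arguments is printed as a bare predicate name, which is what Prolog expects for propositional atoms.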


Part IV

Applications

7 Formal Agility

Abstract

This chapter contains the published paper [65], an exploratory work where we study how the technology of Formal Methods (FM) can interact with agile processes in general and with Extreme Programming (XP) in particular. Our hypothesis is that most XP practices (pair programming, daily build, the simplest design or the metaphor) are technology independent and therefore can be used in FM-based developments. Additionally, other essential pieces like test first, incremental development and refactoring can be improved by using FM. In the paper we explore those practices in some detail: when you write a formal specification you are saying what your code must do, and when you write a test you are doing the same, so the idea is to use formal specifications as tests. Incremental development is quite similar to the refinement process in FM: specifications evolve to code maintaining previous functionality. Finally, FM can help to remove redundancy, eliminate unused functionality and transform obsolete designs into new ones, and this is refactoring.


7.1 Motivation

At first sight, XP [11] and FM [127, 72] are water and oil: an impossible mixture. Maybe the most relevant discrepancy is that, while one of the strategic motivations of XP is “spending later and earning sooner”, FM require “spending sooner and earning later”. However, a deeper analysis reveals that FM and XP can benefit each other.

The use of formal specifications is perceived as improving reliability at the cost of lower productivity. XP and other agile processes focus on productivity so, in principle, using FM following XP practices could improve the efficiency of FM. In particular, pair programming, daily build, the simplest design or the metaphor are XP practices that in our view are independent of the concrete development technology used to produce software, and declarative technology and FM are just a different development technology.

On the other hand, the main criticism of XP is that it has been called systematic hacking and, probably, the underlying problem is the lack of a formal or even semi-formal approach. But which XP practices are amenable to a formal approach? We think that unit testing, incremental development and refactoring are three main XP practices where FM can be successfully applied:

• When you write a formal specification you are saying what your code must

do, when you write a test you are doing the same so one idea is to use formal

specifications as tests.

• Incremental development is quite similar to the refinement process in FM:

specifications evolve to code maintaining previous functionality.

• Finally FM can help to remove redundancy, eliminate unused functionality

and transform obsolete designs into new ones, and this is refactoring.

After all, it might be possible to dilute FM in XP. We would like to point out that we are not claiming to formalise XP (as could be understood from the joke in the title), but just to study how declarative technology can be integrated in XP and how XP can take advantage of this technology.

Before exploring the above XP practices from a formal approach, SLAM (our formal tool) is presented in Section 7.2. In Sections 7.3.1 and 7.3.3 we briefly present how formal specifications can be used in the practices of testing and refactoring. Section 7.3.2 focuses on the formalisation of incremental development under the prism of FM.

7.2 Formal Methods and SLAM

In spite of the great expectations generated around declarative technologies (formal methods, and functional and logic programming) years ago, these have not penetrated the mass market of software development. One of the main causes is the deficient software tool support for integrating formal methods in the development process. Since 2001 we have been involved in the development of the SLAM [145] system, a modern, comfortable tool for specification, refinement and programming.

The formal notation SLAM-SL [63] embedded in the whole system is an object-oriented specification language valid for the design and programming stages in the software construction process. Although the main ideas in the paper could have been presented using any other FM and its associated notation, we think that the design of SLAM-SL gives our notation important advantages.

For this paper, another of the most relevant features of SLAM-SL is that it has been designed as a trade-off between the expressiveness of its underlying logic and the possibility of code synthesis. From a SLAM-SL specification the user can obtain code in a high-level programming language (let us say Java), a code that is readable and, of course, correct with respect to the specification. Because the code is readable, it can be modified and, we expect, improved by human programmers.

A complete SLAM-SL description is out of the scope of this work, but let us

sketch some relevant elements for the goals of the paper.

7.2.1 Data Modelling

SLAM-SL is a model based formal notation [44] where algebraic types (free types

under Z terminology) are used to specify a model for representing instances.

From the point of view of an object-oriented programmer, data are modelled fol-

lowing the design pattern State [52]:

class Order
  state pending (customer : Customer,
                 product : Product,
                 quantity : Positive)
  state delivered (customer : Customer,
                   product : Product,
                   quantity : Positive,
                   payment : Transfer)

Informally, an order instance can be in state pending, so members customer, product and quantity are meaningful, or in state delivered, so customer, product, quantity and payment are meaningful. Moreover, pending and delivered are order constructors, and customer, product, quantity and payment are getter methods (the last one is partial). The SLAM-SL compiler automatically synthesises the following human-understandable Java code:

class Order {

private OrderState state;

...

}

class OrderState {

private Customer customer;

private Product product;

private int quantity;

...

}

class PendingOrderState extends OrderState {

}

class DeliveredOrderState extends OrderState {

private Transfer payment;

}

Class invariants associated to every state are allowed, invariants that can be used statically (through a theorem prover) or dynamically (through assertions) to check the consistency of the specification and the implementation.

7.2.2 Method Specification

The general scheme of a method specification is this one:

class A
  ...
  method m (T1, ..., Tn) : R
    pre P(self, x1, ..., xn)
    call self.m (x1, ..., xn)
    post Q(self, x1, ..., xn, result)
    chk T1(self, x1, ..., xn, result)
    ...
    chk Tm(self, x1, ..., xn, result)
    sol S(self, x1, ..., xn, result)

As we can see, a method specification involves a guard or a precondition

(the formula P(self ,x1, . . . ,xn)) that indicates if the rule can be triggered, an op-

eration call scheme (self .m (x1, . . ., xn )); and a postcondition (given by the for-

mula Q(self ,x1, . . . ,xn,result)) that relates input state and output state. The for-

mal meaning of this specification is given by the following property:

∀s, x1, . . . , xn. pre-m(s, x1, . . . , xn) ⇒ post-m(s, x1, . . . , xn, s.m(x1, . . . , xn))

where pre and post predicates are defined in this way:

pre−m(s,x1, . . . ,xn) ≡ P(self ,x1, . . . ,xn)

post−m(s,x1, . . . ,xn,r) ≡ Q(self ,x1, . . . ,xn,r)

The procedure to calculate the result of the method is called a solution in the SLAM-SL terminology and it is indicated by the reserved word sol followed by the formula S(self, x1, . . . , xn, result). Notice that the formula is written in the same SLAM-SL notation, but must be an executable expression (a condition that can be syntactically checked). The SLAM-SL compiler synthesises efficient and readable imperative code from solutions. The key concept is the operational use of quantifiers (extending usual logic quantifiers). Quantifiers retain the expressiveness of logic while providing the basis for an efficient implementation as traversal operations on data.

Once it is proved that the postcondition entails the solution, the correctness of the obtained code is ensured. However, the automatically generated code might not be efficient enough and, as we mentioned previously, the programmer can modify the generated code.

Formulas prefixed with the reserved word chk are extra properties that will hold in the program. Each Ti must be an executable formula and can be considered as a test (for instance, that a prime number greater than 2 must be odd). Theoretically, they are not needed because they must be entailed by the postcondition; however, important errors in specifications can be caught. They can also be completed with some values (concrete values, intervals, etc.), which can provide automatic tests to be run during execution. Proof obligations are generated in order to prove that every Ti holds under the given postcondition, and assertions can be generated in order to check that hand-coded modifications fulfil those properties.
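The example mentioned above (a prime number greater than 2 must be odd) can be sketched as an executable check. The code below is written in Haskell rather than SLAM-SL, and the names isPrime, chkOddPrime and counterexamples are ours; it only illustrates the idea of an executable property completed with an interval of values.

```haskell
-- Trial division up to the square root of n.
isPrime :: Int -> Bool
isPrime n =
  n >= 2 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- chk-style property: a prime greater than 2 must be odd.
chkOddPrime :: Int -> Bool
chkOddPrime p = not (isPrime p && p > 2) || odd p

-- Completing the property with an interval yields an automatic test:
-- the list of values in the interval that violate the property.
counterexamples :: [Int] -> [Int]
counterexamples = filter (not . chkOddPrime)
```

An empty result from counterexamples [0 .. 200] does not prove the property, but a non-empty one would point to an error either in isPrime or in the property itself.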

7.2.3 Support for Testing and Debugging

Executable code is obtained from solutions and, using similar techniques, pre- and postconditions are used to generate debugging annotations (assertions and exceptions) [59]. Notice that the postcondition can be complex enough to prevent code generation. However, tests can always be checked. This feature can be used both to prevent errors in the case of programmer's modifications and to implement runtime tests. Furthermore, the SLAM system does not yet prove automatically that the postcondition entails the solution, so tests can help to find wrong solutions. Nevertheless, once this feature is incorporated into the system, the automatically generated code will always be correct and no test checking will be needed.

7.3 XP Practices

As mentioned in the motivation, most XP practices are technology independent.

In our opinion, the XP process could be adopted by using SLAM (or any other FM

tool) instead of an ordinary programming language and tool. In other words, we

propose to write formal specifications instead of programs. A number of advan-

tages appear:

• Rephrasing an XP rule, “The specification is the documentation”, because we have a high-level description with a formal specification of the intended semantics of the future code. One of the biggest efforts in the SLAM development has been to ensure that the generated code is readable enough. Therefore, the “answer is still in the code” (but also in the specification).

• FM tools (theorem provers, model checkers, etc.) help to maintain the con-

sistency of the specification and the correctness of the implementation.

• Important misunderstandings and errors can be captured in the early

stages of the development but close enough to code generation.

While in Agile Methods the emphasis is on staying light, quick, and low ceremony in the process, FM could make it sometimes heavier, sometimes not. Even in the first case: i) it can still be considered a light method in the FM area, and ii) the benefits should compensate in many cases for the increase of work.

Let us focus on three XP practices where we consider that FM can play an interesting role.

7.3.1 Unit Testing

In XP the role of writing the tests in advance is similar to the role of writing a

precise requirement: it is used to indicate what the program is expected to do.

Tests in XP solve two different problems:

• The detection of misunderstandings in the intended specifications.

• The detection of errors in the implementation.

The perspective on both problems is completely different when using FM. The detection of inconsistencies in formal specifications is supported by formal tools, mainly by a generator of proof obligations and by a theorem prover assistant. With both tools the user gets information about possible inconsistencies.

The detection of errors in the implementation becomes unnecessary thanks to the verified design process: a process that ensures that the code obtained from an original specification is correct with respect to it. Notice that the use of tests does not ensure that requirements are satisfied; it just “convinces” the programmer that this is the case. The FM approach overcomes this limitation.

So we propose to replace the tests by chk formulas expressed in SLAM-SL.

There are several advantages of this approach:

1. tests can be quite complex, but the SLAM system ensures that code generation remains feasible,

2. tests are executed automatically every time the program is run in debugging mode,

3. testing properties can be carried out in all the incremental versions of the code, i.e. they are automatically checked in all the iterations, and

4. automated formal tools can be used to improve the behaviour, for instance proving that some tests are inconsistent with the specification by using a theorem prover.


7.3.2 Incremental Development

In this section we present the logical properties that the iterative development of

software by the incremental addition of requirements must fulfil. We have called

the set of those properties the Combination Property and it formally establishes

that the combination of the code already obtained to solve the previous require-

ments and the code needed to solve the new one must fulfil all the requirements.

The incremental development of XP needs to ensure that: i) at every step we de-

velop the minimal code needed to solve the corresponding requirement, and ii)

this code is combined with the previous code in such a way that the old requirements still hold. To achieve this goal we establish the minimal properties that must be proved to ensure a correct behaviour.

We will call $story_i$ the formula expressing requirements at step $i$. At every step we want to develop a function $f_i$ that covers all the requirements $story_1, \ldots, story_i$. To obtain $f_i$ we depart from:

• the function $f_{i-1}(x, y)$ with postcondition $post_{i-1}(self, x, y, result)$, and

• a function $g_i(x, z)$ that solves requirement $story_i$.

Additionally, function $f_i$ computes “more things” than $f_{i-1}$, i.e. the result of $f_i$ includes the result of $f_{i-1}$, and maybe more data. Formally, there exists a projection $\pi_i$ that relates both results.

Let us make some remarks about these formulas before establishing the main properties. The fact that g_i is developed for requirement story_i means that its postcondition entails story_i(self,x,z,result). We assume that some of the arguments of g_i are already present in the previous code, i.e. the arguments represented by variables x are already present in f_{i−1}, while some previous arguments y are not needed for story_i and some new arguments z are required.

Now the main property to be proved can be formulated. Let us assume that the postcondition of function f_i(x,y,z) is post_i(self,x,y,z,result_i). To ensure that this function is correctly defined we must prove the Combination Property:

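\[
\mathit{post}_i(\mathit{self}, x, y, z, \mathit{result}_i) \Rightarrow \mathit{story}_i(\mathit{self}, x, z, \mathit{result}_i) \wedge \mathit{post}_{i-1}(\mathit{self}, x, y, \mathit{result}_{i-1}) \wedge \pi_i(\mathit{result}_i) = \mathit{result}_{i-1}
\]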


Now we can formally establish that this is the only property (at every step i)

needed to ensure that the final code (i.e. f_n) entails all the requirements.

Theorem 7.1. For every i ∈ {1, . . . ,n−1} the following formulas hold:

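\[
\mathit{post}_n(\mathit{self}, x, y, z, \mathit{result}_n) \Rightarrow \mathit{story}_i(\mathit{self}, x, z, \mathit{result}_i)
\]

\[
\mathit{post}_{i+1}(\mathit{self}, x, y, z, \mathit{result}_{i+1}) \wedge \mathit{post}_i(\mathit{self}, x, y, z, \mathit{result}_i) \Rightarrow \pi_{i+1}(\mathit{result}_{i+1}) = \mathit{result}_i
\]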

The proof proceeds by induction on i.
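
The induction can be pictured as unfolding the Combination Property from step n down to step i. The following is only a sketch: the arguments are suppressed, and the abbreviations post_k and story_k (standing for the formulas at step k applied to their arguments) are introduced here for illustration.

```latex
\begin{align*}
\mathit{post}_n     &\Rightarrow \mathit{story}_n \wedge \mathit{post}_{n-1}
   && \text{(Combination Property at step } n\text{)} \\
\mathit{post}_{n-1} &\Rightarrow \mathit{story}_{n-1} \wedge \mathit{post}_{n-2}
   && \text{(Combination Property at step } n-1\text{)} \\
                    &\;\;\vdots \\
\mathit{post}_{i}   &\Rightarrow \mathit{story}_{i} \wedge \mathit{post}_{i-1}
   && \text{(Combination Property at step } i\text{)}
\end{align*}
% Chaining the implications, using A \wedge B \Rightarrow B at each step:
\[
\mathit{post}_n \Rightarrow \mathit{post}_{n-1} \Rightarrow \cdots \Rightarrow \mathit{post}_i \Rightarrow \mathit{story}_i
\]
```

The projection equalities are obtained in the same way, by keeping the third conjunct of the Combination Property at each step instead of the first.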

A Simple Example

In the following example, we will show three customer stories for the devel-

opment of a small telephone database [119]. The customer wants a telephone

database where information can be added and looked up, maintaining two different

tables: one with the persons and another with the entries (pairs of person

and phone). The specification written by the developers is:

class Phone_DB

state (members : {Person}, phones : {(Person,Phone)})

constructor make_phone_DB
  call make_phone_DB
  post result.members = {}

modifier add_entry (Person, Phone)
  pre person in self.members and
      not (person, phone) in self.phones
  call add_entry(person, phone)
  post result.phones = self.phones + {(person,phone)}

modifier add_member (Person)
  pre not person in members
  call add_member(person)
  post members = self.members + {person}

observer find_phones (Person) : {Phone}
  pre person in dom(phones)
  call find_phones(person) = self.phones(person)


In the second story, the customer asks for a way to remove entries

from the database, and this is the result of the development task:

modifier remove_entry (Person, Phone)
  pre (person,phone) in phones
  call remove_entry(person, phone)
  post phones = self.phones − {(person,phone)}

The combination property in this case is trivial to prove because we have only

added a new operation. A consistency check is also trivial.

In the third customer story, the customer asks for a person to be removed from the

set of members when the removed entry is that person's last one:

modifier remove_entry (Person, Phone)
  pre (person,phone) in phones
  call remove_entry(person, phone)
  post phones = self.phones − {(person,phone)} and
       if (exists phone : Phones with (person, phone) in phones)
       then members = self.members
       else members = self.members − {person}
       end

In this step, the postcondition of remove_entry must be proved to entail the

previous postcondition. A theorem prover can automatically do the work: let A

be the formula phones = self.phones − {(person,phone)} and B the right hand side of

the conjunction, the proof obligation is

A ∧ B ⇒ A

which is directly an instance of an inference rule of first-order logic (conjunction elimination).
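
To make the third story more concrete, the following is a minimal Java sketch of how the final remove_entry might look once implemented, with the two conjuncts A and B checked at run time. It is not the code produced by any tool described here: the class name PhoneDbSketch, the use of String for persons and phones, and the exception-based checks are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical Java rendering of Phone_DB; all names are illustrative. */
class PhoneDbSketch {
    /** An entry is a (person, phone) pair. */
    static final class Entry {
        final String person;
        final String phone;
        Entry(String person, String phone) { this.person = person; this.phone = phone; }
        @Override public boolean equals(Object o) {
            return o instanceof Entry
                && ((Entry) o).person.equals(person)
                && ((Entry) o).phone.equals(phone);
        }
        @Override public int hashCode() { return Objects.hash(person, phone); }
    }

    final Set<String> members = new HashSet<>();
    final Set<Entry> phones = new HashSet<>();

    void addMember(String person) { members.add(person); }

    void addEntry(String person, String phone) {
        Entry e = new Entry(person, phone);
        if (!members.contains(person) || phones.contains(e)) {
            throw new IllegalArgumentException("precondition of add_entry violated");
        }
        phones.add(e);
    }

    /** Story 3: remove the entry and drop the person if no entry is left. */
    void removeEntry(String person, String phone) {
        Entry e = new Entry(person, phone);
        if (!phones.contains(e)) {
            throw new IllegalArgumentException("precondition of remove_entry violated");
        }
        Set<Entry> oldPhones = new HashSet<>(phones);     // plays the role of self.phones
        Set<String> oldMembers = new HashSet<>(members);  // plays the role of self.members

        phones.remove(e);
        boolean stillListed = phones.stream().anyMatch(p -> p.person.equals(person));
        if (!stillListed) members.remove(person);

        // A: the story-2 postcondition, phones = self.phones - {(person, phone)}
        Set<Entry> expectedPhones = new HashSet<>(oldPhones);
        expectedPhones.remove(e);
        boolean a = phones.equals(expectedPhones);
        // B: the conditional on members added by story 3
        Set<String> expectedMembers = new HashSet<>(oldMembers);
        if (!stillListed) expectedMembers.remove(person);
        boolean b = members.equals(expectedMembers);
        if (!(a && b)) throw new IllegalStateException("postcondition of remove_entry violated");
    }
}
```

Because the new postcondition is checked as the conjunction of A and B, any run that satisfies it also satisfies the story-2 postcondition A, which mirrors the proof obligation at run time rather than proving it.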

7.3.3 Refactoring

Declarative technology makes it easier to find and remove redundancy, eliminate unused functionality and transform obsolete designs into new ones, i.e. to refactor code [47]. Thanks to the reflective properties of SLAM-SL, generic patterns can be specified and it can be proved that a specification is an instantiation of such a generic pattern. The idea is to have a relevant collection of generic patterns and to trust the prover technology of FM to match specifications against the specifications in those patterns. Some work on formalising design patterns [52] has been done using SLAM-SL [67].

However, we need to be sure that the resulting code from refactoring is still


readable enough. In any case, since the code is obtained for free, the

programmer can spend some time documenting it.

7.4 Conclusions

We have presented how some XP practices can admit the integration of Formal

Methods and declarative technology. In particular, unit testing, refactoring, and,

in a more detailed way, incremental development have been studied from the

prism of FM.

Probably there is more room for FM ideas to help agile methodologies and

XP, and we will study this as future work.

One of the goals of the SLAM system is to bring FM and their advantages closer to any kind of software development. Obviously FM are especially needed for critical applications, but combining them with rapid prototyping and agile methodologies could make them affordable for any software construction. Up to now we have not equipped SLAM with an automatic interface generator, which precludes the use of our system for applications with heavy graphical interfaces. The automatic generation of graphical interfaces is another matter of future work.


8

Specifying in the Large

Abstract

This chapter contains the published paper [64], a digression on how formal

methods, and, in particular, object-oriented specification languages

can be integrated in the software development process in an effective

way. We depart from an object specification language in the SLAM sys-

tem that combines characteristics of algebraic languages as well as pre

and postconditions for class methods specification. We study how to

specify classes as well as the formal relations that class relationships

must hold (in particular, inheritance).

One of the main features of the specification language is that it is sup-

ported by an integrated environment. Among other facilities, it includes

the generation of readable imperative code. We address how this transla-

tion can be done and the extra capabilities of the environment regarding

the use of formal methods in the development process, for instance

program validation.


8.1 Motivation

One of the most important problems in software development is software in-

tegrity and reliability. In the recent history of computing several software errors

that caused considerable damage can be found. For instance, the failure of the

Ariane 501 was classified by the ESA official reports as a software error. Another

example is the loss of the Mars Climate Orbiter due to a software problem mixing

measurements in metric and US customary units. The presence of computers

in new activities, disciplines, and systems makes software correctness even more

important. Moreover, problems like Y2K or the Euro bug show that the

situation is not restricted to apparently safety critical applications. The situation

is well described in the PITAC report, developed by the USA President’s Informa-

tion Technology Advisory Committee as part of the Computer Science Research

plan of the USA government [140].

The standard solution to this problem is the use of specification languages

for software description and the use of formal methods for proving properties of

programs. There are advantages concerning program development, like the pos-

sibility of automatically obtaining a prototype, or the help in program evolution

by reusing and modifying existing specifications by (semi) automatic manipu-

lation. From the point of view of software validation there are some additional

benefits, like the formal verification of properties, program debugging by dynam-

ically checking formal specifications, and program maintenance, reuse and doc-

umentation.

However, there are some drawbacks, namely those concerning the reduced

number of automated tools, and the lack of use of formal methods because of the

extra cost.

Most of the advantages are present in the Iterative Rapid Prototyping Process

(IRPP): an iterative development of prototypes from some (partial) specifications

that can be modified from the user information, checking the prototype after a

validation step. A key role in this model is played by the specification language

used for requirement description. The IRPP assumes specifying-in-the-large, i.e.

the specification language must have a structure allowing for the convenient

specification of large systems and tools for managing them. This encompasses

two characteristics: one is a high expressiveness of the specification languages,

the other one is the ability of the development environment to produce a signifi-

cant amount of code from the specification.


Algebraic specification languages, like OBJ [53], FOOPS [110], Maude [33] as

well as information systems specification languages, like TROLL [75], Albert [39],

Oblog [114], and OASIS [85], permit animating the specification in such a way

that they are executable.

Despite the obvious benefits of the IRPP, some other authors have pointed

out some additional problems [118] like the difficulty to express non-functional

requirements, and that the cost of the prototype development could represent

an unacceptably large fraction of overall system cost taking into account that re-

implementation is recommended. Even in the case of using an executable

specification language, one of the problems remains: the prototype is still throw-away.

As prototypes can be executed in an interpreted manner, once the prototype is

accepted the developer needs to re-implement it in a productive language, usu-

ally an imperative one.

The situation would be improved if readable and efficient imperative code

could be generated from the specification language. Some years ago this scenario

was considered unfeasible, but in our opinion the current technology in pro-

gramming languages and specification languages is mature enough to achieve

this goal. Basically, the methodology departs from a formal specification of the

product in order to obtain systematically a program that fulfils the original

specification. Most of the relevant ideas for this objective come from declarative

languages development and implementation, for instance program transformation

techniques. It is an obvious fact that declarative systems have a very low pres-

ence in industrial development, and having been involved in the development

of declarative languages and techniques and in several development projects we

are very sceptical about the real possibility to introduce declarative languages in

software enterprises. From our point of view the only feasible way to incorpo-

rate those techniques into software production is to adapt them to imperative

systems (see for instance the success of ILOG, a constraint logic programming

library for C/C++).

Our proposal to fully implement the IRPP and to support specifying-in-the-

large is the SLAM system. The system contains several components based on

formal methods: an object-oriented specification language, an advanced devel-

opment environment including efficient and readable code generation and a li-

brary to support a high level specification.

The most novel feature of the SLAM specification language (SLAM-SL) is that it is designed as a trade-off between the high expressiveness of the underlying logic and the possibility of an efficient compilation. SLAM-SL formulas, an extension of logic formulas, are used to specify functions by means of a precondition (the condition under which the function can be applied successfully), a postcondition (that relates input arguments and the result), and a solution (an effective method to compute the function). One key concept is the operational use of ‘quantifiers’ (extending the usual logic quantifiers). Quantifiers provide the expressiveness of logic, while the basis for their efficient implementation is the characterization of the classes (in the sense of object orientation) that can be traversed and of the method to do it. By using program transformation techniques that will be discussed later, it is possible to obtain code in a high level programming language, preferably an object-oriented one like Java or C++, although SLAM is language independent. That code is efficient and readable, so it can be modified by the user. By readable, we understand code where the original modelling is translated into equivalent imperative types and function descriptions are turned into easy to follow code (in particular, using loops) with adequate declarative annotations.

With respect to related work, the closest proposals have been mentioned above: executable algebraic specification languages (the OBJ family) and animable information system description languages. The goal of automating the IRPP is present both in those proposals and in SLAM. There are some other experiences adding declarative features to imperative languages (the most recent and interesting is Pizza [104]) but the level of abstraction is not high enough for them to be considered specification languages. In any case, to our knowledge SLAM is the first complete approach to obtain full code from specifications without adding limitations to the original language.

The rest of the paper is organized as follows. Section 8.2 presents the main

characteristics of the specification language SLAM-SL. The next two sections are

devoted to code generation, showing how to compile algebraic types into imperative

ones in Section 8.3, and how to transform SLAM-SL specifications into efficient

imperative code in Section 8.4. Finally, we conclude and sketch some hints

for future work.

8.2 The SLAM Specification Language

This section presents the main constructions of the language SLAM-SL. SLAM-SL

is part of the SLAM project, a software construction development environment

that is able to synthesize reasonably efficient and readable code in different high



level object-oriented target languages like C++ or Java. Among other features,

the user can write specifications in a friendly way, track her hand-coded opti-

mizations, and check, in debug mode, those optimizations through automati-

cally synthesized assertions.

In order to facilitate the understanding of SLAM-SL we will show its elements

with a concrete syntax that does not necessarily correspond either to an internal

representation or to the environment presentation, so the reader should not

pay attention to the concrete syntax but to the abstract one.

A SLAM-SL program is a collection of specifications that defines classes and

class properties. The specification of method behaviour is given through precon-

ditions and postconditions but with a functional flavour as we will see.

8.2.1 Classes and Class Relationships

In SLAM-SL, a class is defined by specifying its properties: name, relationships

with other classes, and methods.

As in many object-oriented programming languages, the different kinds of relationships between classes that UML supports cannot be distinguished. For instance, aggregation cannot be distinguished from composition, and some associations are implicit in the semantics of methods. Nevertheless, the following relationships, which can be detected statically, are supported:

Aggregation: the state specification of a class defines an aggregation or compo-

sition among class instances.

Inheritance: class properties can be defined from scratch or by inheriting them

from already defined classes. Overriding of such properties is constrained

in SLAM-SL: not only the signatures but also the meaning (see

Subsection 8.2.2).

Polymorphism: generic polymorphism is introduced by allowing types to take

arguments. SLAM-SL allows deferring classes in the style of Eiffel

but adding some features from theories (in OBJ terminology [53]) as well

as type classes (à la Haskell [73]) playing a more powerful role than C++

templates.

Let us see a simple example:



class Stack inherits Collection
state empty
state non_empty (top : Object, rest : Stack)

The first line declares a new class called Stack and it establishes that class Stack

inherits properties from Collection. Lines starting with state define attributes that

are the internal representation of the class instances. SLAM-SL permits defin-

ing algebraic types to indicate that a syntactical construction represents class

instances. Syntactically algebraic types allow for alternative definitions of con-

structor rooted tuples. Semantically, different descriptions of type elements are

combined in a declarative fashion. In our example, the constructors empty and

non_empty cover the two descriptions of the state of a stack. The concrete values

empty and non_empty (5, empty) represent an empty stack and the state of a stack

with a single object (the constant 5) respectively.

Each state can be associated with an invariant over the attributes that the

state defines. Important properties of the class instances can be captured in the

invariants.

class Point
state polar (r : Float, a : Float)
invariant r > 0 and 0 ≤ a and a < 2 * pi
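
One way such a state invariant might surface in an implementation is as a check performed whenever an instance is built. The Java sketch below is only an illustration of that idea: the class name PolarPointSketch and the choice of IllegalArgumentException are assumptions, not the output of the SLAM code generator.

```java
/** Hypothetical Java counterpart of the Point class with its polar state. */
class PolarPointSketch {
    final double r; // radius
    final double a; // angle in radians

    PolarPointSketch(double r, double a) {
        // State invariant taken from the specification: r > 0 and 0 <= a < 2 * pi.
        if (!(r > 0 && 0 <= a && a < 2 * Math.PI)) {
            throw new IllegalArgumentException("invariant of state polar violated");
        }
        this.r = r;
        this.a = a;
    }
}
```

Since both fields are final, checking the invariant in the constructor is enough to guarantee that it holds for the whole lifetime of the object.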

In Section 8.3 algebraic types in SLAM-SL are shown in detail.

8.2.2 Method Specifications

SLAM-SL has a clear functional flavour, so methods are represented by functions

but the user can classify different kinds of methods: constructors, modifiers and

observers.

• Object constructors. An object constructor is a function designed to create

new instances of a class.

• Object observers. Observers allow access to properties of an object without

modifying it. SLAM-SL provides free observers for record fields (with the

name of the field).

• Object modifiers. Modifiers are designed to modify the value of an object.

In spite of the classification above, all the methods are functions that involve sev-

eral objects.



The standard methods over stack objects permit creating an empty stack,

deciding if a stack is empty, consulting the top of the stack, and pushing and popping

elements. Let us complete the stack class specification with the definition of its

methods:

constructor make_empty
  pre true
  call make_empty
  post result = empty

observer is_empty : Bool
  pre true
  call is_empty
  post result = (self = empty)

observer top : Object
  pre not self.is_empty
  call top
  post result = self.top

modifier push (Object)
  pre true
  call push(x)
  post result = non_empty(x, self)

modifier pop
  pre not self.is_empty
  call pop
  post result = self.rest

A method is specified by a set of rules, every rule involves a guard or a precon-

dition that indicates if the rule can be triggered, an operation call scheme, and a

postcondition that relates input state and output state. The general form of a rule

is the following:

function op(T) : R
  pre :− P(x,self)
  op(x)
  post :− Q(x,self,result)

where P(x,self ) is a SLAM-SL formula (see Section 8.2.4) involving variables in the

argument (x) and the recipient of the message (self) when the operation is

either an observer or a modifier. Q(x,self,result) is another formula involving

variables in the argument, the reserved symbol result, which represents the computed

value of the function, and self, which represents the state of the recipient of the

message before the method invocation.
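
The rule scheme can be read operationally: check P before the call, compute the result, and check Q afterwards. The following Java sketch applies that reading to the stack methods. It is a hand-written illustration under stated assumptions: the class name CheckedStackSketch, the helper check, and the use of IllegalStateException are hypothetical and are not taken from the SLAM environment.

```java
/** Hypothetical immutable stack whose methods check their own pre/postconditions. */
class CheckedStackSketch {
    private final boolean empty;
    private final Object top;               // meaningful only when not empty
    private final CheckedStackSketch rest;  // meaningful only when not empty

    private CheckedStackSketch(boolean empty, Object top, CheckedStackSketch rest) {
        this.empty = empty;
        this.top = top;
        this.rest = rest;
    }

    /** Fails loudly when a precondition or postcondition does not hold. */
    private static void check(boolean condition, String what) {
        if (!condition) throw new IllegalStateException(what + " violated");
    }

    static CheckedStackSketch makeEmpty() {
        CheckedStackSketch result = new CheckedStackSketch(true, null, null);
        check(result.isEmpty(), "post of make_empty");
        return result;
    }

    boolean isEmpty() { return empty; }

    Object top() {
        check(!isEmpty(), "pre of top");
        return top;
    }

    /** The receiver plays the role of self; it is left unchanged. */
    CheckedStackSketch push(Object x) {
        CheckedStackSketch result = new CheckedStackSketch(false, x, this);
        check(result.top == x && result.rest == this, "post of push");
        return result;
    }

    CheckedStackSketch pop() {
        check(!isEmpty(), "pre of pop");
        CheckedStackSketch result = rest;
        check(result == this.rest, "post of pop");
        return result;
    }
}
```

Modifiers return a new object instead of mutating the receiver, which keeps self available for the postcondition check without copying the old state.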

Some shorthands help the user to write formulas concisely and readably: self can be omitted for accessing attributes, explicit function definitions, as in VDM, are allowed, and unconditionally true preconditions can be skipped.

Method Overriding

Let us explain in some detail how SLAM-SL handles method overriding. Suppose you have a class C with a method m with precondition P and postcondition Q. Now, a subclass C′ of C is declared supplying a new specification for m: precondition P′ and postcondition Q′. As SLAM-SL is a formal specification language, it enforces that the following statements hold:

Inheritance Property:

P → P′

(P∧Q′) → Q
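
The two implications can be explored on small examples before attempting a proof. The Java sketch below tests them exhaustively over a bounded range of integers; it is a bounded check and not a proof, and the class InheritanceCheckSketch, the interface Post and the method respectsInheritance are hypothetical names introduced for illustration.

```java
import java.util.function.IntPredicate;

/** Hypothetical bounded checker for the Inheritance Property on integer methods. */
class InheritanceCheckSketch {
    /** A postcondition relating an argument x and a result. */
    interface Post { boolean holds(int x, int result); }

    /**
     * Returns true when P -> P' and (P and Q') -> Q hold for every
     * argument and result in the range [lo, hi].
     */
    static boolean respectsInheritance(IntPredicate p, Post q,
                                       IntPredicate pSub, Post qSub,
                                       int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            // P -> P': the subclass must accept every argument the superclass accepts.
            if (p.test(x) && !pSub.test(x)) return false;
            for (int r = lo; r <= hi; r++) {
                // (P and Q') -> Q: on old arguments, new results satisfy the old contract.
                if (p.test(x) && qSub.holds(x, r) && !q.holds(x, r)) return false;
            }
        }
        return true;
    }
}
```

For instance, with P as x > 0 and Q as result ≥ x, an overriding specification with P′ as x ≥ 0 and Q′ as result = x + 1 passes the check, whereas strengthening the precondition to x > 5 fails the first implication.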

Interfaces

In SLAM-SL it is quite easy to declaratively specify interfaces, i.e. classes with no state and with methods that must be redefined in the subclasses. The way to declare such a method is to indicate that its precondition is false. This means that the method is not applicable in any case. Notice that it is still possible to supply an adequate postcondition. This postcondition must be preserved in all derived classes. Those methods that have no definition are implicitly considered to have the precondition false and the postcondition true. For the practical use of SLAM-SL as a specification language a dedicated syntax for interfaces can be introduced. We just want to stress the point that they are just translated into this simplified form.

Encapsulation

Encapsulation is an important concept in programming languages which permits the user to control coupling and maximize cohesion. In general, encapsulation is not encouraged in formal methods. In SLAM-SL the user can indicate the visibility scope of each property: public, protected or private. If an attribute is declared as public the user gets an observer for free; for instance, in the stack example the definition of the observer top could have been avoided in this way:

state non_empty (public top : Object, rest : Stack)

If a complete state is declared as public then a free constructor is obtained

and the definition of make_empty could have been avoided with:

public state empty

‘Inheritance by Composition’

SLAM-SL also introduces a broad notion of inheritance by composition. Let us

see an example, the following SLAM-SL specification defines a read only wrapper

for stacks:

class ROStack

public state wrap (target : Stack accept
                     public top : Object
                     public is_empty : Bool)

Now, wrap creates instances of ROStack whose state is a stack and whose methods are top and is_empty, which resend those messages to the state object. This is a largely unexplored feature that we have called ‘inheritance by composition’.
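
In a conventional object-oriented language the same effect is usually obtained by hand-written delegation. The Java sketch below shows one possible shape; it is an assumption-laden illustration in which java.util.Deque stands in for the Stack class and the name ROStackSketch is hypothetical.

```java
import java.util.Deque;

/** Hypothetical read-only wrapper; java.util.Deque stands in for the Stack class. */
class ROStackSketch {
    private final Deque<Object> target;

    private ROStackSketch(Deque<Object> target) { this.target = target; }

    /** Counterpart of the public state constructor wrap. */
    static ROStackSketch wrap(Deque<Object> target) { return new ROStackSketch(target); }

    /** Accepted message: resent to the wrapped stack. */
    Object top() {
        if (target.isEmpty()) throw new IllegalStateException("pre of top violated");
        return target.peek();
    }

    /** Accepted message: resent to the wrapped stack. */
    boolean isEmpty() { return target.isEmpty(); }

    // push and pop are deliberately not exposed by the wrapper.
}
```

What the specification expresses with an accept clause has to be written out method by method here, which is the boilerplate the SLAM-SL construct avoids.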

8.2.3 SLAM-SL Predefined Classes

Like many other specification languages, SLAM-SL has a powerful toolkit with

predefined types representing booleans, numbers, characters and strings, records

and tuples, collections (sequences, sets, etc.), dictionaries (maps, relations, etc.).

SLAM-SL type syntax reflects value syntax, for instance, the type ‘sequence of

integers’ is written as [ Integer] and its values are written as [1,2,3] , a tuple type

can be written as (Char,Integer) and its values as (’ a ’,32) , a set type like {String}

groups together values like { "Hello" , "world"} or {} .

The most interesting feature of sets and sequences is that they inherit from

the predefined abstract class Collection. All the classes that inherit from it must

define a traversal—a specific method to traverse the collection from which iter-

ative or recursive code can be automatically generated. This will be shown in

detail in Section 8.4.



8.2.4 SLAM-SL Formulas and Quantifiers

SLAM-SL formulas are basically logic formulas built using the usual logical con-

nectives (and -conjunction, or -disjunction, not -negation, implies -implication,

and equiv - equivalence), predefined and user defined functions and predicates,

and quantified expressions. SLAM-SL formulas are typed in a similar way to

other expressions. In fact, expressions and formulas share the syntax: every formula

is an expression of type boolean. SLAM-SL expressions can combine objects

with their own operations. Operations can be combined in any consistent way

to produce new expressions.

Expressions can use quantifiers over elements of adequate types. Quantifiers

are a key feature in SLAM-SL and extend the notion of quantifier in logic. They

can compute not only the truth of an assertion but also any other value. In SLAM-

SL, a quantified expression is written in the following way:

q x in d [where F(x)] with E(x)

The above expression scheme is a quantified expression:

• q is the quantifier symbol that indicates the meaning of the quantification

by a binary operation and a starting value,

• d is an object of a special predefined class Collection,

• x is the variable the quantifier ranges over,

• F is an optional boolean expression that filters elements in the collection,

and

• E represents the function applied to elements in the collection previous to

computation.

Some predefined quantifiers are shown in the following table with an informal

description:



Symbol    Generalizes
exists    ∨ with false
exists1   as exists but limiting the count to 1
forall    ∧ with true
sum       + with 0
prod      × with 1
count     inc with 0 (counting)
select    searching
max       max
maxim     maximizers
map       apply a function to every element

Let us show some examples and their intended meaning:

Example                                        Meaning
forall x in {1,2,4,7,8} with x < 11            true
count x in {1,2,4,7,8} with x.isPrime          2
sum x in [1..10] where x < 5 with x.pow(2)     30
map x in {1..10} with x / 2                    {0,1,2,3,4,5}
map x in [1..10] with x / 2                    [0,1,1,2,2,3,3,4,4,5]
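
The reading of a quantifier as a binary operation plus a starting value corresponds to a fold over the collection. The Java sketch below implements that reading for the forall and sum examples. It is an illustration only: the class QuantifierSketch and the method quantify are hypothetical, and the real SLAM traversal mechanism is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical fold-based reading of "q x in d where F(x) with E(x)". */
class QuantifierSketch {
    /**
     * op and start define the quantifier q, d is the collection,
     * f is the optional filter F and e is the expression E.
     */
    static <T, R> R quantify(BinaryOperator<R> op, R start, Iterable<T> d,
                             Predicate<T> f, Function<T, R> e) {
        R acc = start;
        for (T x : d) {
            if (f.test(x)) acc = op.apply(acc, e.apply(x));
        }
        return acc;
    }

    /** forall x in {1,2,4,7,8} with x < 11 */
    static boolean forallExample() {
        List<Integer> d = Arrays.asList(1, 2, 4, 7, 8);
        return QuantifierSketch.<Integer, Boolean>quantify(
            (p, q) -> p && q, true, d, x -> true, x -> x < 11);
    }

    /** sum x in [1..10] where x < 5 with x.pow(2) */
    static int sumExample() {
        List<Integer> d = new ArrayList<>();
        for (int k = 1; k <= 10; k++) d.add(k);
        return QuantifierSketch.<Integer, Integer>quantify(
            Integer::sum, 0, d, x -> x < 5, x -> x * x);
    }
}
```

Here forallExample() evaluates to true and sumExample() to 30 (1 + 4 + 9 + 16), in agreement with the intended meanings listed above.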

Solutions

In SLAM-SL constructive formulas, those from which code can be generated, are

called solutions. They can be syntactically characterized by assigning a value to

the result variable and restricting quantifiers to finite collections. The user can

supply a solution when the postcondition does not offer a method to compute

the function.



8.3 Algebraic Types and Pattern Matching

Two important features peculiar to functional programming languages are alge-

braic types and function definition using pattern matching. ‘Variant records’ and

‘union types’ are the imperative version, but algebraic types and pattern match-

ing favour conciseness and readability.

We believe that algebraic types increase the language’s expressive power by extending standard type systems with a strong theoretical foundation. Because of this, algebraic types have been introduced in SLAM-SL, considering that they are a natural and abstract way to represent values. Moreover, algebraic types allow users to define their own types, just as they allow SLAM-SL developers to specify all the predefined types. This is an important advantage because the formal engines for refinement, transformation, verification and synthesis will be based on a smaller set of deduction rules.

The following example illustrates the preceding discussion. The predefined

SLAM-SL type for lists can be specified in the language itself by using no

other predefined domain but algebraic types:

class List

public state empty
state non_empty (public head : Object,
                 public tail : List)

modifier add_to_front (Object)
  call add_to_front(y) = non_empty(y, self)

observer length : Nat
  call (empty).length = 0
  call (non_empty(x,xs)).length = 1 + xs.length

observer includes (Object) : Bool
  call (empty).includes(y) = false
  call (non_empty(x,xs)).includes(y) =
    (x = y) or xs.includes(y)

modifier remove (Object)
  call (empty).remove(y) = empty
  call (non_empty(x,xs)).remove(y) =
    if x = y
    then xs
    else non_empty(x, xs.remove(y))
    end

modifier append (List)
  call (empty).append(ys) = ys
  call (non_empty(x,xs)).append(ys) =
    non_empty(x, xs.append(ys))

For those readers acquainted with functional programming the specification

above is ‘standard’. However, no experience with functional languages is required

because the previous specification is easy to understand: every operation is de-

fined by several rules and every rule specifies the behaviour of the operation de-

pending on the state of the object (empty or non_empty).

To appreciate the expressiveness of algebraic types versus ‘standard’ modelling in procedural languages we can discuss how to implement the above specification in one of those languages. The domain description can basically be kept by using union types or variant records. However, recursive domains require the inclusion of pointers. Of course, the different level of abstraction can be justified by the different goal: specification versus implementation. Our proposal tries to bring both steps closer by automatically generating the implementation.

8.3.1 Compiling Algebraic Types

In [104], a method for compiling algebraic types and pattern matching is given

based on the association of object types with algebraic types. Nevertheless,

the ideas in Pizza should be revisited because in strongly typed languages (like

Java), objects cannot dynamically change their class and functional interfaces

are forced. Some different compilation schemes that avoid that problem are

proposed.

Compilation scheme 1.

The first proposal is likely the most efficient one. Every state is

distinguished by a tag (the discriminant attribute) and the actual attributes of every

state are included as attributes in the target class. Attributes can be meaningless

depending on the tag. The compilation scheme of algebraic types can be found

in Figure 8.1.

SLAM-SL target code

class A
state K1
...
state Kn
state C1 (x1 : T1)
...
state Cm (xm : Tm)

Java object code

class A {
    private int state; // represents the object state
    private final static int K1 = 1;
    /* no attributes when state == K1 */
    ...
    private final static int Kn = n;
    /* no attributes when state == Kn */
    private final static int C1 = n+1;
    /* attributes when state == C1 */
    trans_decl (x1 : T1);
    ...
    private final static int Cm = n+m;
    /* attributes when state == Cm */
    trans_decl (xm : Tm);
}

Figure 8.1: Compilation scheme 1 of algebraic types.

The function trans_decl : [(String,Type_Name)] formalizes the translation scheme. trans_decl (xi : Ti) represents the translation of SLAM-SL attributes into Java. Following the scheme, the specification of the list class can be automatically translated into Java in the following way:

public class List {

private int state; // represents the object state

private static final int EMPTY = 1;

/* no attributes when state == EMPTY */

private static final int NON_EMPTY = 2;

/* attributes when state == NON_EMPTY */

private Object head;

private List tail;

private List () { }

/* A factory method for empty lists */

public static List empty () {

List result;

result = new List();

result.state = EMPTY;

result.head = null;

result.tail = null;

return result;

}

private static List nonEmpty (Object head, List tail) {

List result;

result = new List ();

result.state = NON_EMPTY;

result.head = head;

result.tail = (List)tail.clone();

return result;

}

}

Compilation scheme 2.

The main idea in this proposal is to introduce a new class AState for representing

the state of the target class A and as many subclasses as states have been declared

in A. Our compilation can be understood as the application of the state design

pattern [52].

For the list class example, the state of a list is represented by an instance of

ListState that can exclusively be an instance of ListStateEmpty or an instance of

ListStateNonEmpty. Attributes in ListState subclasses come from the state dec-

larations in SLAM-SL. The translation follows:

public class List {

ListState state;

private List () { }

private List (ListState state) {

this.state = state;

}

/* A factory method for empty lists */

public static List empty () {

return new List(ListState.emptyState());

}

}



abstract class ListState {

public static ListStateEmpty emptyState () {

return new ListStateEmpty();

}

public static ListStateNonEmpty nonEmptyState (

Object head,

ListState tail) {

return new ListStateNonEmpty(head, tail);

}

}

class ListStateEmpty extends ListState {

public ListStateEmpty () { }

}

class ListStateNonEmpty extends ListState {

public final Object head;

public final ListState tail;

private ListStateNonEmpty () {

this.head = null; this.tail = null;

}

public ListStateNonEmpty (Object head,

ListState tail) {

this.head = head; this.tail = tail;

}

}

Optimization.

The latter compilation scheme supports introducing an optimization: when the

algebraic type contains a constant case, then that case can be represented by null

and one of the classes representing states can be dropped. In the example, the

class ListStateEmpty can be avoided by representing the empty case with null in the

state attribute.
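
A possible shape of the optimized scheme is sketched below in Java. It is a hand-written illustration rather than generated code: the class names NullOptListSketch and NonEmptyState, and the choice of methods, are assumptions made for the example.

```java
/** Hypothetical list using the second scheme with the null optimization. */
class NullOptListSketch {
    /** The only remaining state class; the empty state is encoded as null. */
    private static final class NonEmptyState {
        final Object head;
        final NonEmptyState tail; // null when the tail is the empty list

        NonEmptyState(Object head, NonEmptyState tail) {
            this.head = head;
            this.tail = tail;
        }

        int length() {
            // The rule for the empty state is inlined as a null test.
            return 1 + (tail == null ? 0 : tail.length());
        }
    }

    private NonEmptyState state; // null represents the empty list

    /** A factory method for empty lists. */
    static NullOptListSketch empty() { return new NullOptListSketch(); }

    boolean isEmpty() { return state == null; }

    void addToFront(Object y) { state = new NonEmptyState(y, state); }

    int length() { return state == null ? 0 : state.length(); }
}
```

The price of dropping the class for the constant case is that its rules can no longer be selected by dynamic binding, so every call site needs an explicit null test.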

8.3.2 Compiling Pattern Matching

In order to decide whether a rule must be applied, every rule is compiled into Java code that checks its invocation pattern. The compilation of pattern matching depends on the translation scheme.

For the first scheme, pattern matching is compiled into switch statements that discriminate on the value of the tag. The translation of the length and append methods would be as follows:

public int length () {

int result = 0;

switch (this.state) {

case EMPTY:

result = 0;

break;

case NON_EMPTY:

result = 1 + this.tail.length();

break;

}

return result;

}

public void append (List ys) {

switch (this.state) {

case EMPTY:

this.state = ys.state;

this.head = ys.head;

this.tail = (List)ys.tail.clone();

break;

case NON_EMPTY:

this.tail.append(ys);

break;

}

}

For the second compilation scheme, pattern matching is compiled by taking advantage of dynamic binding in the object-oriented paradigm and of the characteristics of the state design pattern. Every rule is actually implemented in the corresponding state subclass. Let us see the translation of both methods, length and append:

public class List {

public int length () {

return this.state.length();

}

public void append (List ys) {

this.state = this.state.append(ys);

}

}

In the target class the same method is invoked over the state attribute. Every rule

is translated into a method in its corresponding (state) class:

abstract class ListState {

public abstract int length ();



public abstract ListState append (List ys);

}

class ListStateEmpty extends ListState {

public int length () {

return 0;

}

public ListState append (List ys) {

return ys.state;

}

}

class ListStateNonEmpty extends ListState {

public int length () {

return 1 + tail.length();

}

public ListState append(List ys) {

return ListState.nonEmptyState

(this.head,this.tail.append(ys));

}

}

8.4 Compiling SLAM-SL Solutions into Efficient Code

A solution in SLAM-SL is a constructive formula with the same meaning as a postcondition, but with the additional property that an effective method for computing the function can be extracted from it. The current characterization of solutions is syntactic.

Set comprehension is a powerful construct in many formal methods [72, 119, 2]. From collection construction to filtering, the concept of comprehension provides expressiveness and conciseness. Some programming languages have generalized the concept to their main data structures, as Haskell does with lists or Smalltalk does with collections. The notion of comprehension need not be restricted to sets: comprehension over any kind of 'collection' could be allowed. Several problems can be solved by specifying a way to traverse a collection while some result is computed. Besides expressiveness, another important advantage is that efficient code can be synthesized.

In SLAM-SL, traversal over collections is a generalization of 'collection comprehension'. Patterns for traversing collections have been introduced in the language. If the user wants to define a new collection, the class must be declared as a subclass of Collection and the way in which the values of the collection are traversed must be specified. Currently, our decision is that the user must specify an 'abstraction' function from collections to sequences. Let us see an example of the traversal definition in a class for representing trees:

class Tree (T) inherits Collection (T)
  state nil
  state node (left : Tree (T), root : T, right : Tree (T))

  /* Definition of the preorder traversal */
  public observer traversal : [T]
  (empty).traversal = []
  (node(ls, r, rs)).traversal = [r] + ls.traversal + rs.traversal

By using program transformation techniques it is possible to obtain code in a high-level programming language that is efficient and readable and that, consequently, can be modified by the user. Informally, the following quantified expression

q x in d where F(x) with E(x)

represents an 'iteration' over the elements of d.traversal that computes an accumulated result from the meaning of q and the values of E(x), skipping those elements that do not fulfil F(x).
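The informal reading of a quantified expression can be sketched in Java as a filtered fold over the traversal. This is only an illustration under stated assumptions: the method name quantify and the encoding of a quantifier q as a base value plus a binary operation are choices made here, not part of SLAM-SL or its compiler.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;

class QuantifiedExpression {
    // Reads "q x in d where F(x) with E(x)" as a fold: q is given by a base
    // value and a binary operation, and elements failing F are skipped.
    static <T, R> R quantify(R base, BinaryOperator<R> op, List<T> traversal,
                             Predicate<T> f, Function<T, R> e) {
        R acc = base;
        for (T x : traversal) {
            if (f.test(x)) {
                acc = op.apply(acc, e.apply(x));
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> d = Arrays.asList(1, 2, 3, 4);
        // A "sum" quantifier: sum x in d where x is even with x * x
        int sumOfEvenSquares = quantify(0, Integer::sum, d, x -> x % 2 == 0, x -> x * x);
        System.out.println(sumOfEvenSquares); // 2*2 + 4*4 = 20
        // The universal quantifier: forall x in d where x > 2 with x < 10
        boolean all = quantify(true, Boolean::logicalAnd, d, x -> x > 2, x -> x < 10);
        System.out.println(all); // 3 and 4 are both below 10: true
    }
}
```

This version still walks an explicit sequence; the transformation described next aims at removing that intermediate sequence.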

The main idea is to combine the recursive definition of the traversal with the recursive definition of the quantifier. The folding-simplify-unfolding technique from program transformation in functional programming is used in order to obtain efficient recursive code without the need for an intermediate sequence. In the case of multiple recursive traversals, this code is easily translated into imperative recursive code. However, when the traversal is defined by linear recursion, the usual methods for translating recursive definitions into sequential code are used in order to obtain efficient imperative code.

Let us see a couple of examples. The first one relies on trees. The tree traversal

was previously defined:

/* Definition of the preorder traversal */
public function traversal (Tree (T)) : [T]
call traversal (empty) = []
call traversal (node(ls, r, rs)) = [r] + ls.traversal + rs.traversal



On the other hand, we have the recursive definition of, let us say, the universal quantifier over sequences:

forall x in [] where F(x) with E(x) = true

forall x in [x1] + xs where F(x) with E(x) =
    (F(x1) ⇒ E(x1)) and forall x in xs where F(x) with E(x)

so the following equational reasoning can be made:

forall x in tree where F(x) with E(x)
= forall x in traversal(tree) where F(x) with E(x)
=
    true
        if s = []
    forall x in s.suffix(2) where F(x) with E(x)
        if not F(s(1)) and s /= []
    E(s(1)) and forall x in s.suffix(2) where F(x) with E(x)
        if F(s(1)) and s /= []
  where s = traversal(tree)
= (Unfolding of traversal(tree))
    true
        if tree = empty
    forall x in traversal(ls) + traversal(rs) where F(x) with E(x)
        if not F(r) and tree = node(ls,r,rs)
    E(r) and forall x in traversal(ls) + traversal(rs) where F(x) with E(x)
        if F(r) and tree = node(ls,r,rs)
= (Distribution of quantifiers over sequences)
    true
        if tree = empty
    forall x in traversal(ls) where F(x) with E(x)
    and forall x in traversal(rs) where F(x) with E(x)
        if not F(r) and tree = node(ls,r,rs)
    E(r) and forall x in traversal(ls) where F(x) with E(x)
    and forall x in traversal(rs) where F(x) with E(x)
        if F(r) and tree = node(ls,r,rs)
= (Folding of quantifier definition)
    true
        if tree = empty
    forall x in ls where F(x) with E(x)
    and forall x in rs where F(x) with E(x)
        if not F(r) and tree = node(ls,r,rs)
    E(r) and forall x in ls where F(x) with E(x)
    and forall x in rs where F(x) with E(x)
        if F(r) and tree = node(ls,r,rs)
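The final, folded definition is directly recursive over the tree. A possible Java rendering is sketched below; the class BinTree, the use of null for the empty tree, and the method name forall are assumptions made for this illustration and are not the code that SLAM generates.

```java
import java.util.function.Predicate;

// Hypothetical immutable binary tree; null stands for the empty tree.
final class BinTree<T> {
    final BinTree<T> left;
    final T root;
    final BinTree<T> right;

    BinTree(BinTree<T> left, T root, BinTree<T> right) {
        this.left = left;
        this.root = root;
        this.right = right;
    }

    // forall x in tree where F(x) with E(x), following the folded definition:
    // the tree is walked directly and no intermediate sequence is built.
    static <T> boolean forall(BinTree<T> tree, Predicate<T> f, Predicate<T> e) {
        if (tree == null) {
            return true;                                   // tree = empty
        }
        boolean atRoot = !f.test(tree.root) || e.test(tree.root);
        return atRoot
            && forall(tree.left, f, e)                     // forall x in ls ...
            && forall(tree.right, f, e);                   // forall x in rs ...
    }
}

class BinTreeDemo {
    public static void main(String[] args) {
        BinTree<Integer> t = new BinTree<>(
            new BinTree<>(null, 1, null), 2, new BinTree<>(null, 3, null));
        // Every even element is positive.
        System.out.println(BinTree.forall(t, x -> x % 2 == 0, x -> x > 0));
        // Not every odd element is smaller than 3.
        System.out.println(BinTree.forall(t, x -> x % 2 != 0, x -> x < 3));
    }
}
```

The two cases "not F(r)" and "F(r)" of the derivation are merged here into the single test `!F(r) || E(r)`, which is equivalent.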



Let us now show an example of linear recursion. Suppose the following traversal over list instances is defined:

function traversal (List (T)) : [T]
call traversal (non_empty(x, r)) = [x] + traversal(r)

Now the select quantifier is used as the running example:

select x in [] where F(x) with E(x) = UNDEFINED

select x in [x1] + xs where F(x) with E(x) =
    x1
        if F(x1) and E(x1)
    select x in xs where F(x) with E(x)
        otherwise

and the following equational reasoning can be made:

select x in l where F(x) with E(x)
= select x in traversal(l) where F(x) with E(x)
=
    s(1)
        if F(s(1)) and E(s(1))
    select x in s.suffix(2) where F(x) with E(x)
        otherwise
  where s = traversal(l)
= (Unfolding of traversal(l))
    x
        if F(x) and E(x)
    select x in traversal(r) where F(x) with E(x)
        otherwise
  where l = non_empty(x, r)
= (Folding of quantifier definition)
    x
        if F(x) and E(x)
    select x in r where F(x) with E(x)
        otherwise

This is a tail-recursive definition that yields the following imperative code:

x = l.head();
/* Loops until an element satisfying both F and E is found */
while (!(F(x) && E(x))) {
    l = l.tail;
    x = l.head();
}

In general, the transformation need not be carried out every time: a schema can be derived for each quantifier in the case of linear recursion.
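As an illustration of what such a schema could look like for select, here is a minimal Java sketch. It is an assumption of this example, not the SLAM schema itself: the name SelectSchema is hypothetical, the linear traversal is modelled by Iterable, and the UNDEFINED case is mapped to an exception.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

class SelectSchema {
    // select x in xs where F(x) with E(x): the first element, in traversal
    // order, that satisfies both F and E. The loop mirrors the tail-recursive
    // definition; the undefined case becomes an exception (an assumption).
    static <T> T select(Iterable<T> xs, Predicate<T> f, Predicate<T> e) {
        for (T x : xs) {
            if (f.test(x) && e.test(x)) {
                return x;
            }
        }
        throw new NoSuchElementException("select is undefined: no element matches");
    }

    public static void main(String[] args) {
        // First even element greater than 4.
        System.out.println(select(Arrays.asList(1, 4, 6, 7), x -> x % 2 == 0, x -> x > 4));
    }
}
```

Unlike the loop shown above, this version terminates on collections that contain no matching element, at the cost of one explicit failure case.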



8.5 Conclusion

We have presented how some formal methods and declarative techniques can be fully integrated in the software development process. The object specification language SLAM-SL is expressive enough to describe software applications. We have collected a small but significant set of examples (bank transactions, population simulation, seat assignment in elections, etc.) that can be found in our papers. It can also be used in program debugging. The expressiveness of SLAM-SL and that of algebraic specification languages are difficult to compare. While SLAM-SL includes logical formulas, the level of abstraction of the description of certain operations is higher in the language. However, SLAM-SL requires a model of the class (its attributes) in order to generate code, although this code is directly imperative. If we restrict ourselves to the specification and debugging facilities, every OBJ specification can be easily translated into a SLAM-SL specification. As future work we plan to generate the proofs needed to ensure the correctness of solutions w.r.t. postconditions, providing verified generated code.

Moreover, the SLAM system is able to generate efficient and readable code.

We have described how to obtain imperative data types from the SLAM-SL class

attributes even in the case of recursive definitions. Furthermore, some tech-

niques from program transformation used for program optimization can be

adapted to SLAM-SL for the generation of imperative code.

As SLAM can cope with these two issues (expressiveness and efficient code generation), it can be used for specifying-in-the-large and can promote the use of formal methods in industrial developments. The case of the B language and development method [2] and the experience of the formal specification and development of line 14 of the Paris underground [12] (among other examples of this kind) have shown that it is not only feasible but productive to use formal methods in software development. The novel approach of SLAM is mainly focused on the ability to generate readable and efficient code in such a way that the use of IRPP and formal methods is closer, cheaper, and more effective.


9 Modelling Design Patterns

Abstract

This chapter contains the published book chapter [69], in which a formal model for design patterns was studied. Although design patterns are informal, their formalisation is needed if tools that mechanise them are wanted. We present a formalisation of (some) design patterns as operators between classes. Since SLAM is object-oriented and reflective, this is done with the “standard” API of the language (slam.lang.reflective.Class, etc.). Potentially, we can check that a design follows a pattern, detect patterns in code, and guide a refactoring process.

9.1 Introduction

Design patterns [52] and refactoring [47] are two sides of the same coin: the aim of applying both concepts is to create software with an underlying high-quality architecture. Design patterns are “descriptions of communicating objects and classes that are customized to solve a general design problem in a particular context” [52]; theoretically, the domain of application of design patterns is the set of problems. Refactoring is “the process of changing a software system in such a way that it does not alter the external behaviour of the code yet improves its internal structure” [47]; its application domain is the set of solutions. In practice, design patterns are not directly applied to problems but to vain solutions, sometimes to conceptual solutions, sometimes to concrete solutions: in many cases design patterns are model refactoring descriptions.

The thesis of the works of Tokuda [122] and Cinnéide [30] is that automating the application of design patterns to an existing program in a behaviour-preserving way is feasible; in other words, refactoring processes can be automatically guided by design patterns. In this work we show our design pattern formalization and its relation with design and refactoring automation (as well as other useful applications).

The first unavoidable step is to introduce a formal reading of design patterns.

This will be done in terms of class operators. Then some practical applications

will be presented: how to use the formalism to reason about design patterns and

how to incorporate this model into design tools and software development envi-

ronments.

Let us provide an informal and intuitive description of our proposal. A given

(preliminary) design is the input of a design pattern. This design is modelled as

a collection of classes. The result of the operation is another design obtained by

modifying the input classes and/or by creating new ones, taking into account the

description of the design pattern.

For instance, suppose you have an interface Target that could be implemented by an existing class Adaptee, but the interface of Adaptee does not match the target one. The design pattern Adapter, considered as an operator, accepts classes Target and Adaptee as input, and returns a new class Adapter that connects the common functionality. Similarly, when a client needs different variants of an algorithm, it is possible to put each variant of the method in a different class and abstract them automatically, “inventing” the abstract class that configures the Strategy pattern.

In order to define design patterns as operations between collections of classes, we use specification languages as the working framework, like Z [119], VDM [72], OBJ [53], Maude [33], or Larch [137]: a function that models the design pattern is specified in terms of pre- and postconditions. A precondition collects the logical conditions required to apply the function successfully. In our case, it allows some aspects of the design pattern description to be specified unambiguously. In terms of the sections used to describe a pattern, the precondition of the pattern function establishes the applicability of the pattern. For instance, in the Strategy pattern mentioned above, the precondition needs to ensure that all the input classes define a method to abstract with the same signature. A postcondition relates the input arguments and the result. In the Adapter operator, the postcondition establishes that the input classes (Target and Adaptee) are not modified, and that a new class (Adapter) is introduced, inheriting from the input classes. The Adapter methods are described by adequate calls to the corresponding Adaptee methods. The postcondition encompasses most of the elements of the intent and consequences sections of the pattern description.
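To make the Adapter postcondition concrete, the following Java sketch shows the kind of class the operator is expected to introduce. The method names request and specificRequest are hypothetical (they are not taken from the SLAM-SL formalization), and Target is modelled as a Java interface so that Adapter can inherit from both inputs.

```java
// Input 1: the interface expected by clients (left unmodified).
interface Target {
    String request();
}

// Input 2: an existing class whose interface does not match (left unmodified).
class Adaptee {
    String specificRequest() {
        return "adaptee result";
    }
}

// Output of the operator: a new class inheriting from both inputs, whose
// methods are described by calls to the corresponding Adaptee methods.
class Adapter extends Adaptee implements Target {
    @Override
    public String request() {
        return specificRequest();
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        Target t = new Adapter();
        System.out.println(t.request());
    }
}
```

This is the class-adapter variant, which matches the description of a new class inheriting from the input classes; an object adapter that holds an Adaptee as a field would be a different design.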

These specification languages have well-established semantics and provide elements to describe software components formally. Among the wide variety of specification languages mentioned, we will use our own language SLAM-SL [145] throughout this work for practical reasons: basically, because we have a complete model of object-oriented aspects in the language itself through the use of reflection, and a firm knowledge of the tools around it. It will be clear to the reader that any other language could be used instead. One key point of SLAM-SL is its reflection capabilities [67], i.e. the ability of the language to represent and reason about aspects of itself (classes, methods, pre- and postconditions, etc.). As the work is not devoted to SLAM-SL, we restrict the presentation of SLAM-SL to an appendix, that could be omitted in the final version. In any case, the examples are relatively easy to understand.

Formalization is one of the main advantages of our approach because it allows for formal reasoning about design patterns. The purpose of formalization is to resolve questions about relationships between patterns (when a pattern is a particular case of another) and validation (when a piece of program implements a pattern) and, especially, to provide a mandatory basis for tool support. Additionally, the view of design patterns as class operators allows for a straightforward incorporation into object-oriented design and development environments: it can be used to modify an existing set of classes to adapt them to fulfil a design pattern, which is an example of refactoring.

The chapter is organized as follows: Section 9.2 provides an adequate background for the work, namely a brief presentation of our specification language SLAM-SL and, especially, its reflective capabilities. Section 9.3 is devoted to the main subject of the chapter: how design patterns can be described as class operations. Additionally, we describe two possible applications: how to reason about design patterns, and how a design can be automatically refactored using design patterns. Future and emerging trends are included in Section 9.4. Finally, we provide some concluding remarks (Section 9.5). Section 9.6 presents more examples of formalization of patterns.

9.2 Background

This section is focused on two aspects. The first one is to introduce our specific way of modelling object-oriented specifications. As we have mentioned, we use object-oriented specification languages and, in particular, our own proposal SLAM-SL. We focus on the reflective features of the language as they are crucial for the specification of design patterns. The second goal of the section is to discuss some related work that represents alternative formal approaches to design patterns.

9.2.1 Modelling Object-Oriented Specifications

Object-oriented concepts must be modelled in order to formalise design patterns

and refactoring. There are two options: to model all these characteristics in some

specification language (as an additional theory, or a library), or to use a reflective

object-oriented specification language. For this chapter we have decided to use

SLAM-SL, an object-oriented formal specification language that fulfils the sec-

ond option (see [67]).

Data Modelling

The two fundamental means of modelling in the object-oriented approach are composition and inheritance. The abstract syntax to specify composition and inheritance in SLAM-SL is virtually identical to that found in widespread object-oriented programming languages like Java and design notations like UML, so an ordinary developer should feel comfortable with the notation.

Class Declaration

A class is declared with the following construction:

class C


where the symbol C represents a valid class name. Once a class is declared, the

user can add properties for that class. We use the term property to name every

characteristic of instances of the class, including methods. Composition and in-

heritance relationships allow the user to model data.

Generics

SLAM-SL also supports generic classes by using bounded parametric polymorphism [23]. The syntax for generics is similar to that of generics in Java 1.5 and templates in C++. The declaration of a generic class in SLAM-SL follows this syntax:

class B<X inherits A>

X is a class variable: it can be used in the specification of B and must be instantiated when B is used. The syntax for instantiation is B<D>, where D is a subclass of A.
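For readers more familiar with Java, the declaration above corresponds roughly to a bounded type parameter. The sketch below is only an analogy: the members describe, value and show are hypothetical and serve to show what the bound makes available.

```java
class A {
    String describe() {
        return "A";
    }
}

class D extends A {
    @Override
    String describe() {
        return "D";
    }
}

// Java counterpart of "class B<X inherits A>": X is bounded by A.
class B<X extends A> {
    private final X value;

    B(X value) {
        this.value = value;
    }

    // Allowed because every possible X is known to be a subclass of A.
    String show() {
        return value.describe();
    }
}

class BoundedGenericsDemo {
    public static void main(String[] args) {
        B<D> b = new B<>(new D()); // instantiation B<D>, with D a subclass of A
        System.out.println(b.show());
    }
}
```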

Composition

In class-based object-oriented languages, a class is a template that defines the structure of the information of class instances. Usually, the data are modelled through a set of fields containing the information. The construction¹

state s (l1 : C1, . . ., ln : Cn)

in a class specification establishes that instances have n fields (li), each one being an instance of a class (Ci).

The syntax designed to access the properties of instances is the infix send operator (.)². Fields are instance properties so, in the example above, if o is an instance of the class C then o.li is an instance of the class Ci. Formally, li is a function from C to Ci.

SLAM-SL has an important toolkit with predefined classes like Boolean,

Integer, Float, String, Seq (generic sequences), Tup_2 (generic tuples of two

components), etc.

¹ The reader can ignore the metasymbol s.
² The standard notation in object-oriented languages is the dot (.).

Let us show a concrete simple class specification of a telephone database (adapted from an example in [127]):

class PhoneDB
  state phoneDB (members : Seq<Person>, phones : Seq<Tup_2<Person,Phone>>)

Every instance db of PhoneDB has two properties: members, a finite sequence of instances of Person, and phones, a finite sequence of pairs of instances of Person and Phone. The syntax to access the fields members and phones is db.members and db.phones.

The name given after the keyword state is a constructor of instances of

the class. This allows us to write an expression that represents an instance of

PhoneDB (assuming that the names mary and jones are persons and 7254 is a

telephone number, and that the syntax [x1, . . . ,xn ] represents sequences):

phoneDB([mary, jones], [(mary,7254)])

A specifically designed syntax for sequences and tuples has been introduced in SLAM-SL: [X] is a class for sequences of instances of X (Seq<X>) and (X, Y) is a class for tuples with instances of X and Y as components (Tup_2<X, Y>).

Invariants

In order to constrain the domain, invariants are allowed. In the case of the database, a reasonable constraint could be that every person in an entry of the collection of phones must be in the collection of members:

invariant ∀ e:(Person, Phone) (e ∈ phones ⇒ e.fst ∈ members)

where fst is a method that returns the first component of a tuple and ∈ is a

method that decides if an instance belongs to a sequence of instances.

Invariants establish which expressions are really valid instances. The expres-

sion

phoneDB([mary, jones], [(jim,7254)])

would not be valid because jim is not in the collection [mary,jones]. These invari-

ants can be understood as preconditions of the state constructor.
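The reading of invariants as preconditions of the state constructor can be sketched in Java as follows. This is an illustration only: persons are modelled as strings and phones as integers, and the class PhoneEntry is a hypothetical stand-in for Tup_2<Person,Phone>.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Tup_2<Person,Phone>.
final class PhoneEntry {
    final String person;
    final int phone;

    PhoneEntry(String person, int phone) {
        this.person = person;
        this.phone = phone;
    }
}

final class PhoneDB {
    final List<String> members;
    final List<PhoneEntry> phones;

    // Plays the role of the state constructor: the invariant
    // "e in phones => e.fst in members" is checked as its precondition.
    PhoneDB(List<String> members, List<PhoneEntry> phones) {
        for (PhoneEntry e : phones) {
            if (!members.contains(e.person)) {
                throw new IllegalArgumentException(
                    "invariant violated: " + e.person + " is not a member");
            }
        }
        this.members = members;
        this.phones = phones;
    }
}

class PhoneDBDemo {
    public static void main(String[] args) {
        List<String> members = Arrays.asList("mary", "jones");
        new PhoneDB(members, Arrays.asList(new PhoneEntry("mary", 7254))); // valid
        try {
            new PhoneDB(members, Arrays.asList(new PhoneEntry("jim", 7254)));
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

In SLAM-SL the invalid expression simply does not denote an instance; raising an exception is the choice made in this sketch to approximate that in Java.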

A second example is shown where the data representation of the class Point is specified by means of two fields that represent Cartesian coordinates:

class Point
  state cart (x : Float, y : Float)

For the moment, observable properties of any instance of Point are x and y and

points can be represented by expressions cart (a ,b) where a and b are instances

of Float.


Inheritance

The construct to specify that a class is a subclass of other classes is the following:

class C inherits C1 C2 . . . Cn

The informal meaning of the declaration above is that class C is a subclass of C1, C2, . . . , Cn (with n ≥ 0) and inherits properties from them. Although the meaning of subclassing is controversial, we do not aim to change its intended semantics in the object-oriented paradigm, so its actual meaning in SLAM-SL is similar to that in most object-oriented languages:

• an instance of a subclass can be used in every place in which an instance of

the superclass is allowed (subtyping), and

• properties in superclasses are directly inherited by subclasses (inheri-

tance).

A very important property, not so widespread in object-oriented languages, is that the behaviour of subclasses is consistent with the behaviour of the superclasses. For the moment, let us show a pair of paradigmatic examples:

class ColoredPoint inherits Point, Color

As expected from the declared inheritance relationship, ColoredPoint instances inherit properties from Point and Color, such as x in Point or is_red in Color.

The user can specify a condition that every instance of a subclass must fulfil

in order to be considered a valid instance. This invariant condition is given over

public properties of the superclasses. For instance in the specification

class NoRedColoredPoint inherits Point, Color
  invariant not self.is_red

the invariant establishes that for every instance of NoRedColoredPoint its prop-

erty is_red is false.

Predefined collections

Class Collection plays an important role in SLAM-SL. All predefined container classes in SLAM-SL inherit from the class Collection and all instances of Collection can be quantified. SLAM-SL introduces several predefined quantifiers like universal and existential quantifiers, counters, maximisers, etc. The most important predefined classes that inherit from Collection are sequences and sets.

Visibility

Reserved words public, private and protected have been introduced in SLAM-

SL with the usual meaning in object-oriented notations for design and program-

ming. By default, state names and fields are private while the methods are public.

Behaviour Modelling

The SLAM-SL notation allows the user to distinguish among several kinds of

methods:

Constructors. Constructors are class members that build new instances of the

class.

Observers. Observers are instance members that observe instance properties.

Modifiers. Modifiers are instance members that modify the state of an instance.

Functions. Functions are class members that observe properties of the class.

The SLAM-SL semantics is stateless. This means that the classification above is,

from the semantics point of view, artificial because methods will be interpreted

as functions. From a pragmatic point of view, the classification given by the spec-

ifier is used in static analysis and code synthesis stages.

Methods are specified by giving a precondition and a postcondition. Preconditions and postconditions are SLAM-SL formulae involving explicitly declared formal parameters and two implicit formal parameters, self and result, where self represents the state of the instance before the message is received and result represents the result of the method (the state of the instance after the execution of the method if the method is a modifier). Obviously, when constructors or functions are being specified the formal parameter self is not accessible.

The syntax of SLAM-SL formulae is similar to that of first-order logic formulae; in fact, SLAM-SL specifications are axiomatised into first-order logic.


The specification of a method in SLAM-SL has the following scheme³:

class A
  ...
  method m (T1, . . ., Tn) : R
    pre P(self, x1, . . . , xn)
    call m (x1, . . ., xn)
    post Q(self, x1, . . . , xn, result)
    sol S(self, x1, . . . , xn, result)

A method specification involves a precondition, the formula P(self ,x1, . . . ,xn),

that indicates if the rule can be triggered, an operation call scheme m(x1, . . . ,xn);

and a postcondition, given by the formula Q(self ,x1, . . . ,xn,result), that relates

input and output states. The informal meaning of this specification is given by

the following formula:

∀s,x1, . . . ,xn(s ← pre_m(x1, . . . ,xn) ⇒ s ← post_m(x1, . . . ,xn,s ← m(x1, . . . ,xn)))

where

s ← pre_m(x1, . . . ,xn) ≜ P(s,x1, . . . ,xn) ∧ inv(s)

s ← post_m(x1, . . . ,xn,r) ≜ Q(s,x1, . . . ,xn,r) ∧ inv(r)

The precondition, call scheme and postcondition must be considered the specification of the method. The procedure to calculate the result of the method is called a solution in SLAM-SL terminology and is indicated by the reserved word sol followed by the formula S(self, x1, . . . , xn, result). Notice that the formula is written in the same SLAM-SL notation, but must be an executable expression (a condition that can be syntactically checked). The objective is for the SLAM-SL compiler to synthesise efficient and readable imperative code from solutions. Solutions must be considered a refinement of the postcondition and the user, with the help of the system, must prove that every solution entails its postcondition post_m (i.e. invariant included).

Some shorthands help the user to write formulae concisely and readably: self

identifier can be omitted for accessing attributes, explicit function definitions, as

in VDM, are allowed, and unconditionally true preconditions can be skipped.

³ The reserved word method stands for any of the reserved words for the different kinds of methods: constructor, observer, modifier or function.

Let us show an example of the specification of a sortable sequence. A sortable sequence is a generic class and its type argument must inherit from the predefined class Ordered (that introduces a partial order relation):

class SortableSeq<X inherits Ordered> inherits Seq<X>

Constructor empty creates instances with the property length inherited from

Seq equal to 0:

constructor empty
  call empty
  post result.length = 0

The distinction between postconditions and solutions is crucial for code gen-

eration. A proof obligation establishes that the solution entails the postcondition.

Code is obtained from solutions. The fact that both formulae are written in the

same language has a number of advantages: i) it is a very abstract way of defining

operational specifications from the user point of view, ii) it is easier to manipulate

for optimisation of generated code, and iii) the task of ensuring the correctness

property is easier.

The following example is the specification of a sorting method for sortable sequences, where both a postcondition and a solution are given:

modifier sort
  call sort
  post result.isPermutation(self) and result.isSorted
  sol result = self
        if self.length < 2
      and result = self.tail.sort.insertSort(self.head)
        if self.length > 1

Methods isPermutation, isSorted and insertSort are specified below. Methods

length, tail and head are inherited from class Seq.

The mentioned proof obligation in the SLAM-SL underlying logic is

( (length(l) < 2 ⇒ sort(l) = l)
  ∧ (length(l) ≥ 2 ⇒ sort(l) = insertSort(sort(tail(l)), head(l))) )
⇒
isPermutation(sort(l), l) ∧ isSorted(sort(l))
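The relation between solution and postcondition can be explored by running the solution and testing the postcondition on sample inputs; this is testing, not the proof that the obligation requires. The Java sketch below is a hand-written approximation under stated assumptions: sequences are modelled as List<Integer>, and the helper implementations are choices of this example rather than generated code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortSolution {
    // sol: result = self if length < 2, else tail.sort.insertSort(head)
    static List<Integer> sort(List<Integer> l) {
        if (l.size() < 2) {
            return new ArrayList<>(l);
        }
        return insertSort(sort(l.subList(1, l.size())), l.get(0));
    }

    // Inserts x into an already sorted list, keeping it sorted.
    static List<Integer> insertSort(List<Integer> sorted, int x) {
        List<Integer> result = new ArrayList<>(sorted);
        int i = 0;
        while (i < result.size() && result.get(i) <= x) {
            i++;
        }
        result.add(i, x);
        return result;
    }

    static boolean isSorted(List<Integer> l) {
        for (int i = 0; i + 1 < l.size(); i++) {
            if (l.get(i) > l.get(i + 1)) {
                return false;
            }
        }
        return true;
    }

    // Both lists contain the same elements with the same multiplicities.
    static boolean isPermutation(List<Integer> a, List<Integer> b) {
        if (a.size() != b.size()) {
            return false;
        }
        List<Integer> rest = new ArrayList<>(b);
        for (Integer x : a) {
            if (!rest.remove(x)) { // remove(Object): drops one occurrence
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> l = Arrays.asList(3, 1, 2, 1);
        List<Integer> r = sort(l);
        // post: result.isPermutation(self) and result.isSorted
        System.out.println(r + " " + (isPermutation(r, l) && isSorted(r)));
    }
}
```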

The above specification of method sort uses the following method specifications:

observer isSorted : Boolean
  call isSorted =
    self.length < 2
    or (self.elementAt(1) <= self.elementAt(2)
        and self.tail.isSorted)


given as an explicit (functional) definition,

observer count (X) : Integer
  call count(x) = countQ quantifies x = y with y in self

by using the predefined quantifier that counts elements in a collection with a

given property,

observer isPermutation(Seq<X>) : Boolean
  call isPermutation
  post
    forallQ quantifies self.count(x) = result.count(x) with x in self
    and forallQ quantifies self.count(x) = result.count(x) with x in result

by using the universal quantifier, and

modifier insertSort(X)
  pre self.isSorted
  call insertSort (x) =
    if self.length = 0
    then self.cons(x)
    else if self.head <= x
         then self.tail.insertSort(x).cons(self.head)
         else self.cons(x)

Quantification

In SLAM-SL some constructs have been added in order to make executable specifications easier to write. One of those constructs consists of the generalisation of the quantifier concept.

In standard logic, the meaning of a quantified expression ∀x ∈ C(P(x)) is the

conjunction true∧P(x1)∧P(x2)∧... with each xi in C. The quantifier ∀ determines

the base value true and the binary operation ∧. In SLAM-SL we have extended

quantified expressions with the following syntax:

q quantifies e(x) with x in d

where q is a quantifier that indicates the meaning of the quantification by a binary operation (let us call it ⊗) and a starting value (let us call it b), d is an object of a special predefined class Collection, x is the variable the quantifier ranges over, and e represents the function applied to the elements of the collection prior to the computation. The informal meaning of the expression above is:

b ⊗ e(x1) ⊗ e(x2) ⊗ e(x3) ⊗ . . .

The abstract class Collection in SLAM-SL has the following interface⁴:

class Collection <T>
  observer traversal : Seq<T>

⁴ Seq is the generic predefined class for representing sequences.

In SLAM-SL the user can specify the way in which a collection is traversed by

inheriting from Collection and by specifying the way in which it is traversed. For

instance, if the collection is a tree, it can be traversed in three different depth-first

ways: preorder, inorder, and postorder.

The abstract base class for quantifiers has the following interface:

class Quantifier<Element, Result>
  state quantifier (public accumulated : Result)
  modifier next (Element)

and a pair of concrete quantifier specifications are:

class Forall inherits Quantifier<Boolean,Boolean>

  constructor forallQ
  post accumulated = true

  modifier next (Boolean)
  call next(c)
  post result.accumulated = accumulated and c

class Count inherits Quantifier<Boolean,Integer>

  constructor countQ
  post accumulated = 0

  modifier next (Boolean)
  call next(c)
  post result.accumulated = accumulated + if c then 1 else 0
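The generalised quantifier can be read as a left fold over the traversal of a collection. The following Python sketch is only an illustration of that reading, not SLAM-SL itself: the names `Quantifier`, `quantify`, `forall_q` and `count_q` are assumptions chosen to mirror the specifications above, and a plain Python iterable stands in for `Collection.traversal`.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

E = TypeVar("E")  # element type fed to the quantifier
R = TypeVar("R")  # type of the accumulated result
T = TypeVar("T")  # type of the items in the collection


@dataclass(frozen=True)
class Quantifier(Generic[E, R]):
    """Hypothetical stand-in for a SLAM-SL quantifier."""
    start: R                   # the starting value b
    step: Callable[[R, E], R]  # the binary operation, applied as acc (x) element


# Counterparts of forallQ and countQ (illustrative names).
forall_q = Quantifier(True, lambda acc, c: acc and c)
count_q = Quantifier(0, lambda acc, c: acc + (1 if c else 0))


def quantify(q: Quantifier[E, R], e: Callable[[T], E], traversal: Iterable[T]) -> R:
    """Compute b (x) e(x1) (x) e(x2) (x) ... over the given traversal order."""
    acc = q.start
    for x in traversal:
        acc = q.step(acc, e(x))
    return acc
```

For example, `quantify(count_q, lambda x: x % 2 == 0, [1, 2, 3, 4])` counts the even elements of the list, in the same way that `countQ quantifies e(x) with x in d` would.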

Reflection

A SLAM-SL program is a collection of specifications that define classes and their properties: name, relationships with other classes, and methods. The relationships with other classes are inheritance and aggregation (or composition), the latter being defined in the state specification.

This section presents the specification of SLAM-SL classes, properties, expressions, etc. Since the specification of any construct relies on the specification of the others, we will need to refer to constructs that have not been specified yet.

class Class

  public state mkClass (name : String,
                        inheritance : {Class},
                        inv : Formula,
                        states : {State},
                        methods : {Method})

  invariant
    forallQ quantifies s.noCycle({}) with s in inheritance
    and forallQ quantifies m1.differ(m2) if m1 /= m2
        with m1 in methods, m2 in methods

  observer noCycle ({Class}) : Boolean
  call noCycle(c) =
    not self in c
    and forallQ quantifies s.noCycle(c.add(self))
        with s in self.inheritance

To model classes, we have made a natural reading of ‘what a class is’: a name, an inheritance relationship, an invariant, and its properties (states and methods). These are modelled, respectively, as a string, a collection of instances of Class, an instance of Formula, a collection of instances of State and a collection of instances of Method. The syntax {X} denotes sets of type X and [X] denotes sequences of type X.

The invariant in Class establishes that

• there is no cycle in the inheritance relationship,

• properties are correctly specialized: method overloading is allowed, but overloaded methods must differ in the type of at least one argument. Notice that, thanks to this declarative specification, SLAM-SL is able to identify the properties that a class must fulfil, which is much more expressive and powerful than the merely syntactic reflective features of Java or C#.

Among the interesting methods of Class, let us show a couple. A class is just an interface if no states or constructors are defined among its properties and all its methods are undefined. A class is a subtype of another one if the latter can be found, directly or transitively, in the inheritance relationship of the former.

  public observer isInterface : Boolean
  call isInterface =
    states = {}
    and forallQ quantifies m.undefined with m in methods

  public observer isSubtype (Class) : Boolean
  call isSubtype(c) =
    c = self
    or existsQ quantifies cl.isSubtype(c) with cl in inheritance
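The two recursive observers noCycle and isSubtype can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Class` dataclass below is hypothetical and keeps only a name and the direct superclasses, and object identity plays the role of class equality.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality keeps instances hashable
class Class:
    """Hypothetical, heavily reduced model of the reflective class Class."""
    name: str
    inheritance: list = field(default_factory=list)  # direct superclasses

    def no_cycle(self, seen=frozenset()):
        # Mirrors noCycle: self has not been visited on this path, and every
        # superclass is cycle free once self is added to the visited set.
        return self not in seen and all(
            s.no_cycle(seen | {self}) for s in self.inheritance
        )

    def is_subtype(self, c):
        # Mirrors isSubtype: reflexive, and transitive through inheritance.
        # Only meaningful when the inheritance graph has no cycles.
        return c is self or any(cl.is_subtype(c) for cl in self.inheritance)
```

Note that `is_subtype` would not terminate on a cyclic hierarchy, which is one way to see why the invariant of Class rules cycles out.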

Formulae

SLAM-SL formulae and expressions are the heart of SLAM-SL specifications. We therefore discuss the reflective features related to formula management, which also gives an idea of what a SLAM-SL formula looks like. The SLAM-SL runtime environment can manage formulae in the same way the compiler does: formulae can be created and compiled at runtime, so the user can specify programs that manage classes and class behaviours. The following specification of formulae reflects their abstract syntax in SLAM-SL:

class Formula

  public state mkTrue
  public state mkFalse
  public state mkNot (f : Formula)
  public state mkAnd (left : Formula, right : Formula)
  public state mkOr (left : Formula, right : Formula)
  public state mkImpl (left : Formula, right : Formula)
  public state mkEquiv (left : Formula, right : Formula)
  public state mkForall (var : String, type : Class, qf : Formula)
  public state mkExists (var : String, type : Class, qf : Formula)
  public state mkEq (lexpr : Expr, rexpr : Expr, type : Class)
  public state mkPred (name : String, args : [Expr])

  public observer wellTyped (ValEnv) : Boolean
  call wellTyped(env) =
    (is_mkTrue or is_mkFalse)
    or ((is_mkAnd or is_mkOr or is_mkImpl or is_mkEquiv)
        and left.wellTyped(env) and right.wellTyped(env))
    or is_mkNot and f.wellTyped(env)
    or is_mkEq and lexpr.type.isSubtype(type) and rexpr.type.isSubtype(type)
    or (is_mkForall or is_mkExists) and qf.wellTyped(env.put(var,type))
    or is_mkPred and forallQ quantifies env.get(name).argSig(i)
                                        .isSubtype(args.type(env))
                     with i in [1..args.length]

  public modifier substitute (String, Expression)
  call substitute(var, expr)
  post result = self if is_mkTrue or is_mkFalse
       and result = mkNot(f.substitute(var,expr)) if is_mkNot
       and result = mkAnd(left.substitute(var,expr),
                          right.substitute(var,expr)) if self.is_mkAnd
       and result = mkOr(left.substitute(var,expr),
                         right.substitute(var,expr)) if self.is_mkOr
       . . .

  public observer isExecutable : Boolean
  call isExecutable =
    is_mkEq and lexpr = mkVar("result") and rexpr.isExecutable

Class Formula represents the abstract syntax of SLAM-SL formulae, which are those of the underlying logic plus meta names for formulae. Methods have been added for checking whether a formula is well typed, for substituting variables with expressions, and for checking whether a formula is executable.
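To make the shape of this abstract syntax concrete, here is a minimal Python sketch of a fragment of it. The class names (`MkTrue`, `MkNot`, `MkAnd`, `MkEq`, `Var`, `Const`) are illustrative assumptions that follow the state names above, only four formula constructors are covered, expressions are reduced to variables and constants, and `is_executable` drops the recursive check on the right-hand expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class MkTrue:
    pass


@dataclass(frozen=True)
class MkNot:
    f: object


@dataclass(frozen=True)
class MkAnd:
    left: object
    right: object


@dataclass(frozen=True)
class MkEq:
    lexpr: object
    rexpr: object


def substitute(formula, var, expr):
    """Replace every occurrence of the variable named var by expr."""
    if isinstance(formula, MkTrue):
        return formula
    if isinstance(formula, MkNot):
        return MkNot(substitute(formula.f, var, expr))
    if isinstance(formula, MkAnd):
        return MkAnd(substitute(formula.left, var, expr),
                     substitute(formula.right, var, expr))
    if isinstance(formula, MkEq):
        def swap(e):
            return expr if e == Var(var) else e
        return MkEq(swap(formula.lexpr), swap(formula.rexpr))
    raise TypeError(f"unsupported formula: {formula!r}")


def is_executable(formula):
    """Simplified check: the formula has the shape result = <expression>."""
    return isinstance(formula, MkEq) and formula.lexpr == Var("result")
```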

Properties

The classes modelling properties are called State and Method. Their models are the following:

class State
  state mkState (name : String, attributes : {Attribute}, inv : Maybe<Formula>)
  invariant forallQ quantifies a1.differ(a2) if a1 /= a2
            with a1 in attributes, a2 in attributes

In SLAM-SL, a composition relationship among classes is defined by the state

specification. A state defines attributes that are the internal representation of the

class instances. A state can have an invariant that establishes properties of the

attributes and/or relationships between them.

class Method
  public state mk_method (kind : MethodKind,
                          visibility : Visibility,
                          name : String,
                          signature : Signature,
                          precondition : Formula,
                          postcondition : Formula,
                          solution : Maybe<Formula>)

  public observer type_sig : [Class]
  call type_sig = mapQ quantifies d.type with d in sig

  observer invokation : [String]
  call invokation = mapQ quantifies d.name with d in signature

In the class Method, we have also introduced a couple of useful operations:

constructing a method, abstracting the type signature just using the argument

types (the names are almost irrelevant except for the pre and postconditions),

and composing a method call with the argument names.



On top of them, we can describe a number of interesting operations on methods. The first one (is_compatible) indicates when two methods are equivalent (same name, same types, and equivalent pre- and postconditions). The second one (can_inherit) specifies when a method can override another definition: both must have a coherent definition (same name and argument/return types) and the inheritance property must hold.

  public observer is_compatible (Method) : Boolean
  call is_compatible(m) =
    kind = m.kind
    and name = m.name
    and type_sig = m.type_sig
    and return = m.return
    and (prec1 implies prec2)
    and (post2 implies post1)
    where
      prec1 = orallQ quantifies r.get_prec
              with r in rules;
      post1 = andallQ quantifies r.get_prec implies r.get_postc
              with r in rules;
      prec2 = orallQ quantifies r.get_prec
              with r in m.rules;
      post2 = andallQ quantifies r.get_prec implies r.get_postc
              with r in m.rules

  public observer can_inherit (Method) : Boolean
  call can_inherit(m) =
    kind = m.kind
    and name = m.name
    and sig.length = m.sig.length
    and (forallQ quantifies sig(i).is_subclass_of(m.sig(i))
         with i in sig.dom)
    and return = m.return
    and (prec1 implies prec2)
    and (post2 implies post1)
    where
      prec1 = orallQ quantifies r.get_prec
              with r in rules;
      post1 = andallQ quantifies r.get_prec implies r.get_postc
              with r in rules;
      prec2 = orallQ quantifies r.get_prec
              with r in m.rules;
      post2 = andallQ quantifies r.get_prec implies r.get_postc
              with r in m.rules

Finally, we specify operations to decide when two methods are really different

(up to argument names) and when a method implements an interface method

(i.e. precondition false):

  public observer differ (Method) : Boolean
  call differ(m) =
    name /= m.name
    or sig.length /= m.sig.length
    or existsQ quantifies sig(i).type /= m.sig(i).type
       with i in sig.dom

  public observer do_nothing : Boolean
  call do_nothing =
    existsQ quantifies (r.get_prec = false and r.get_postc = true)
    with r in rules

For the sake of simplicity, we assume that all record components of classes Class, Method and State are public. Good object-oriented methodologies actually recommend making them private and declaring adequate methods to access them. We omit such definitions to avoid an overloaded specification.

Notice that what we have presented is only a subset of the full SLAM-SL specification, selected to show the main elements of the language and to make the description of design patterns easy to follow. The full reflective specification is included in the reflect module of the SLAM-SL distribution. More details can be found in [67].

SLAM-SL sentences and substitution

An instance of the class Class represents a SLAM-SL class, an instance of the class

Method represents a SLAM-SL method, an instance of the class Formula repre-

sents a SLAM-SL formula. Instead of using expressions based on constructors

and methods, the user can write those instances by using SLAM-SL sentences di-

rectly. This makes the specification much more concise. Let us show an example

for representing the class Point by using constructors and methods:

mkClass("Point", {}, mkTrue,
        {mkState("cart", [mkField("x","Float"),
                          mkField("y","Float")])},
        {})

SLAM-SL introduces a syntax that lets the user give the class using SLAM-SL's own syntax for classes:

<scode>
class Point
  state cart (x : Float, y : Float)
</scode>

Both expressions are equivalent.

Every SLAM-SL (sub)sentence representing an object-oriented concept can be written between <scode> and </scode>, and its meaning is an instance of the class that models that concept.

Substitution has been added to SLAM-SL as a metalanguage capability. Its syntax is S[x := e], where S is an instance that represents a SLAM-SL (sub)sentence, x is a string to be substituted and e is an instance that represents another SLAM-SL subsentence. The SLAM-SL compiler checks that substitutions are well typed.

Let us show an example of substitution: the following expression

<scode>
class Point
  state cart (x : Float, y : Float)
</scode>["Float" := <scode>Integer</scode>]

is equivalent to this one

<scode>
class Point
  state cart (x : Integer, y : Integer)
</scode>
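The effect of this substitution on the underlying instances can be illustrated with a small Python sketch. The dataclasses `Field`, `State` and `Class` and the function `substitute_type` are hypothetical reductions of the reflective model (types are plain strings, and only field types are rewritten); they are meant to show the idea, not the SLAM-SL implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    name: str
    type: str


@dataclass(frozen=True)
class State:
    name: str
    fields: tuple


@dataclass(frozen=True)
class Class:
    name: str
    states: tuple


def substitute_type(cls, old, new):
    """Return a copy of cls in which every field of type old has type new."""
    return replace(cls, states=tuple(
        replace(s, fields=tuple(
            replace(f, type=new) if f.type == old else f for f in s.fields))
        for s in cls.states))


# The Point class with Cartesian coordinates, as in the example above.
point = Class("Point", (State("cart", (Field("x", "Float"), Field("y", "Float"))),))
int_point = substitute_type(point, "Float", "Integer")
```

Because the model is immutable, `point` is left untouched and `int_point` is a new instance, which matches the reading of S[x := e] as an expression that denotes a new sentence.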

9.2.2 Other Formalizations of Design Patterns

The LePus project [41, 42] develops an ambitious idea: a visual language for specifying design patterns. A design pattern is described by a (limited form of) verbal specification and a diagram that includes the constructs and relations the design pattern involves and the constraints it imposes on conforming implementations. A tool can read it and produce what they call a trick, basically an algorithm to manipulate programs. That work is in principle more general than our approach, but we claim that we can get similar power with a simpler technique, and that SLAM-SL can also be considered a pattern language.

Some other papers [5, 99, 121] differ in their goals, and are more interested in

describing temporal behaviour and relations between design patterns, by using

variations of temporal logic: [5] is focused on the formalization of architectural

design patterns based on an object-oriented model integrated with a process ori-

ented method to describe the design pattern; [99] is concentrated on communi-

cation between objects; [121] has the aim to describe the structural aspect of a

design pattern.

As we said in Section 9.1, the work of Tokuda and Batory [122, 123] already points out that some design patterns can be expressed as a series of program transformations applied to an initial software state, where these program transformations are primitive object-oriented transformations.

The work of Cinnéide [30] also points out that design patterns can guide the refactoring process. It presents a methodology for constructing automated transformations that introduce design patterns into an existing program while preserving its behaviour. The main difference between this approach and our proposal is that we can detect the patterns to apply in a given design.

The detection of situations in which refactoring can be applied is what Mens calls bad smells in his paper [96].

Finally, [54] uses UML and OCL as specification languages for design patterns. While the paper contains some useful ideas for developing a tool, it also honestly shows the severe limitations of UML and OCL for this goal, and proposes particular extensions.

9.3 Design Patterns as Class Operations

A design pattern consists of the description of a valuable design that solves a general problem. Strictly speaking, design patterns cannot be formalized because their domain of application is problems. Nevertheless, relevant parts of design patterns are amenable to formalization: structure, participants and, with more difficulty, collaborations. Our proposal is to view design patterns as operators on (sets of) classes that receive a collection of classes that will be instances of (some) participants and return a collection of classes that represents a new design.

In our model, a given (preliminary) design is the input of a design pattern.

This design is modelled as a collection of classes. The result of the operation is

another design obtained by (possibly) modifying the old classes, and potentially

creating new ones, according to the description of the design pattern.

For instance, suppose you have a collection of classes leafs (e.g. Line, Circle, Rectangle, . . . ) that share some operations (e.g. draw, rotate, resize, . . . ) and you want to compose all of them into a wider object that both has all of them as particular cases and can collect some of them inside (e.g. a Figure). The Composite pattern, considered as an operator, accepts classes (leafs) as input and returns two new classes, Component (merely an interface) and Composite (for the collection of components), with the common operations as methods, and modifies the classes in leafs to inherit from Component.

More specifically, a design pattern is modelled as a class with a single function apply that is a class operator. The precondition of this function collects the logical conditions required to use the pattern successfully. In terms of the sections of a pattern description, the precondition establishes the applicability of the pattern. For instance, in the Composite pattern mentioned above, the precondition needs to ensure that all the classes in leafs define the common methods with the same signature.

On the other hand, the postcondition encompasses most of the elements of

the intent and consequences sections of the pattern description. In the Composite

pattern, the postcondition establishes that the input classes leafs now inherit

from Component and classes Composite and Component are introduced, the

first one inheriting from the second one. The Composite state is a collection of

Components and its methods are described by iterative calls to the corresponding

leafs methods.
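The reading of Composite as a class operator with a precondition and a postcondition can be sketched in Python. Everything here is an illustrative assumption rather than the SLAM-SL specification: classes are reduced to a name, a set of superclass names and a set of method names, the common methods are taken to be the intersection of the leafs' method sets, and the bodies of the generated methods are not modelled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Class:
    """Hypothetical reduced class model: names only, no formulae."""
    name: str
    inheritance: frozenset = frozenset()
    methods: frozenset = frozenset()


CHILD_OPS = frozenset({"add", "remove", "getChild"})


def common_methods(leafs):
    """Methods shared by every leaf (assumed meaning of the common methods)."""
    sets = [cl.methods for cl in leafs]
    return frozenset.intersection(*sets) if sets else frozenset()


def pre_apply(leafs):
    """Applicability: some leafs exist and they share at least one method."""
    return bool(leafs) and bool(common_methods(leafs))


def apply(leafs):
    """Return the new design: Component, Composite and the modified leafs."""
    if not pre_apply(leafs):
        raise ValueError("Composite is not applicable to this design")
    common = common_methods(leafs)
    component = Class("Component", frozenset(), common | CHILD_OPS)
    composite = Class("Composite", frozenset({"Component"}), common | CHILD_OPS)
    new_leafs = [replace(c, inheritance=c.inheritance | {"Component"})
                 for c in leafs]
    return [component, composite] + new_leafs
```

Applied to two leafs `Line` and `Circle` that share `draw` and `rotate`, the operator returns four classes in which both leafs now list `Component` among their superclasses.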

In order to describe all these elements, the reflective features play a signif-

icant role because they allow inspecting argument classes and describing new

classes as result [67]. Design patterns can be described by a (polymorphic) class

DPattern. The method apply describes the behaviour of the pattern by accept-

ing a collection of classes as arguments (the previous design) and returning a

new collection of classes. This method can describe a general behaviour of the

pattern, or can describe different applications of the pattern with different con-

sequences, each one in a different rule. The class argument (coming from the

polymorphic definition) is occasionally needed to instruct the pattern about the

selection of classes, methods, etc. that take part in the pattern. This argument is

stored in the internal state of the class DP:

class DP <Arg>

private state dp (protected arg : Arg)

public observer apply ([Class]) : [Class]

Inheritance is used to derive concrete design patterns; the derived class must also instantiate the type argument and supply a value for the state. Notice that design pattern variants are easily supported in our model through inheritance.

Let us describe the method in detail by using a couple of examples taken from [52]. A graphical description, using a UML-based notation also taken from [52], complements the formal definition. A preliminary version of these ideas can be found in [62] (with no contribution about how to reason with design patterns or how to develop a tool), where a good number of examples (AbstractFactory, Bridge, Strategy, Adapter, Observer, TemplateMethod, . . . ) are described. Most of them can be found in Appendix 9.6. This collection shows the feasibility of our approach.

Figure 9.1: Composite class diagram (panels: Composite arguments, Composite results).

9.3.1 Composite Pattern

The Composite pattern is one of the object structural patterns. It is used to compose objects into tree structures to represent part-whole hierarchies. Using the pattern, clients treat individual objects and compositions of objects uniformly.

When we treat it as a class operator, we have the collection of basic objects

as argument (called the leafs). The result “invents” two new classes Component

and Composite. Component is just an interface for all the common methods in all

the leaf classes plus some methods to add, remove and consult internal objects.

Composite inherits from Component and stores the collection of components.

The result also collects all the classes in leafs that are modified by inheriting from

Component. The methods in Composite can be grouped in two parts. On one

hand, we have methods to add and remove a component, and also to consult

the ith element in the component collection (getChild). On the other hand, we

have all the common methods of the leafs, which have a very simple specification: iteratively calling the same operation in all the components. See Figure 9.1 and Figure 9.2 for a complete SLAM-SL specification.


class Composite inherits DP<Unit>

  public constructor composite (Unit)
  call composite(unit)
  post result.arg = unit

  public observer apply ([Class]) : [Class]
  let common_meths = {m with cl in leafs | m in cl.methods}
  pre (not leafs.isEmpty) and (not common_meths.isEmpty)
  call apply(leafs)
  post result = [component, composite]
                + [c \ inheritance.insert(component) with c in leafs]
  where
    component =
      mkClass("Component", {Component}, <slamcode>true = true</slamcode>,
              {},
              {m \ prec = <slamcode>false and q = true</slamcode>[q := postc]
               | m in common_meths}
              + {create, add, remove, getChild})

    composite =
      mkClass("Composite", {}, <slamcode>true = true</slamcode>,
              {children},
              {create, add, remove, getChild}
              + {gen(m) | m in common_meths})

    children = <slamcode>state mkComposite (children : [C])</slamcode>[C := component]

    create = <slamcode>
               constructor create
               pre true = true
               call create
               post result = {}
             </slamcode>

    add = <slamcode>
            modifier add (C)
            pre true = true
            call add(c)
            post result = children.insert(c)
          </slamcode>

    remove = <slamcode>
               modifier remove (C)
               pre true = true
               call remove(c)
               post result = children.remove(c)
             </slamcode>[C := component]

    getChild = <slamcode>
                 modifier getChild (Nat)
                 pre true = true
                 call getChild(i)
                 post result = self.children(i)
               </slamcode>

  function gen (Property) : Property
  call gen(p) =
    p \ prec = <slamcode>all quantifies p with p in children</slamcode>[p := m.prec[self := c]]
      \ postc = <slamcode>result = [mkCall(n,[c] + i) | c in children]</slamcode>[n := m.name, i := m.invokation]

Figure 9.2: Composite pattern specification.


9.3.2 Decorator Pattern

The Decorator pattern is classified as object structural and is used to attach additional responsibilities to an object dynamically. It can be seen as the following class operator: a collection of concrete components and a collection of decorators are used as arguments. They share some operations that the pattern abstracts in two steps. First, a new Decorator class abstracts the operations of the decorators. Then another newly created class, Component, abstracts the operations both for the concrete components and for the decorator.

The class argument is used to split the sequence of classes into the concrete components and the decorators. Concrete components are forced to inherit from Component, while decorators inherit from Decorator and modify the common methods to add a call to the decorator operation. The Decorator class contains a

Component in the state and offers the common methods as public. They are im-

plemented as simple calls to the equivalent operations in the stored component.

Finally, Component is merely an interface for the common methods.

class Decorator_DP inherits DPattern<Natural>

  public constructor decorator (Natural)
  call decorator(n)
  post result.arg = n

  public observer apply ([Class]) : [Class]
  let common_meths = {m with cl in classes | m in cl.methods}
  pre (classes.length > arg) and (not common_meths.isEmpty)
  call apply(classes)
  post
    result = [component, decorator]
             + mapQ quantifies c \ inh.insert(decorator)
                                 \ methods = mapQ quantifies add_call(m, c.methods)
                                             with m in c.methods
               with c in decorators
             + mapQ quantifies c \ inh.insert(component) with c in concrete_classes
  where
    concrete_classes = classes.prefix(arg);
    decorators = classes.suffix(arg);
    component = mk_Class(
      "Component", {}, {},
      mapQ quantifies
        m \ (mapQ quantifies r \ prec = $false$
                               \ postc = $true$
             with r in rules)
      with m in common_meths);
    decorator = mk_Class(
      "Decorator",
      {mk_State([mk_Field("component", component)])},
      {component},
      mapQ quantifies
        m \ mapQ quantifies
              r \ postc = <slamcode>
                            result = component.mk_Call(m.name, m.parameters)
                          </slamcode>
            with r in rules
      with m in common_meths);

As we can see, thanks to its declarative reflection features, SLAM-SL can be

considered as a design patterns language. Once you can model a pattern as a

class operator, SLAM-SL can be used to specify it and this specification can be

used to instruct the associated tool to apply the pattern to existing designs and

programs.

9.3.3 Different Modelling Possibilities

For several patterns, such as Factory Method, specifying the most general case is a thorny issue. Nevertheless, one can specify more simply the different situations in which the pattern can be applied, each with different consequences. As we said in Section 9.3, this is done by writing a different rule, i.e. a different precondition-postcondition pair, for each situation to be managed.

For example, you can find the situation in which several classes in a hierarchy implement a common method similarly except for an object creation step (this example, “Introduce Polymorphic Creation With Factory Method”, is taken from [76]), so you can apply a refactoring with the Factory Method pattern. The rule precondition specifies this situation easily by reflection. The postcondition establishes that a new abstract class is created; that an inheritance relation is created between the initial classes and the new one; that two methods are placed in the new class, namely a new abstract factory method and the common method, in which the object creation step is replaced by a call to the former; and that in the initial classes the common method is removed and the factory method is added with a call to the concrete object constructor.

Now, if the designer finds a new application of the Factory Method pattern, she will only have to add a new rule to describe this application and its consequences.

9.3.4 Design Patterns Composition

Viewing design patterns as operators over classes allows us to create new design

patterns by composition. For instance, the composite design pattern can be ap-

plied to a collection of leafs and then a decorator can be applied to the new de-

sign.

In the case study presented in chapter two in [52], the design of a document

editor is guided by the application of several design patterns. Some of those de-

sign patterns are applied to (a part of) the result of a previous one. Because de-

sign patterns have been modelled as class operators, we can specify the compo-

sition of them:

composite = instance (Empty);
glyph = composite.apply([border, scroll, character, rectangle, polygon]);
decorator = instance (3);
mono_glyph = decorator.apply(glyph.prefix(3))
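Since patterns are operators from designs to designs, composing them amounts to function composition. The Python sketch below illustrates only that point: designs are reduced to lists of class names, and `composite` and `decorator_on_prefix` are toy stand-ins (assumptions for illustration) that merely prepend the names of the classes each pattern would introduce.

```python
from functools import reduce


def compose(*patterns):
    """Build the operator that applies the given patterns from left to right."""
    def composed(design):
        return reduce(lambda d, p: p(d), patterns, design)
    return composed


def composite(design):
    # Toy stand-in: Composite introduces Component and Composite.
    return ["Component", "Composite"] + design


def decorator_on_prefix(n):
    # Toy stand-in: Decorator is applied to the first n classes of the design,
    # in the spirit of decorator.apply(glyph.prefix(3)) above.
    def apply(design):
        return ["Decorator"] + design[:n]
    return apply


mono_glyph = compose(composite, decorator_on_prefix(3))
```

The composed operator `mono_glyph` can then be treated as a pattern in its own right and applied to any design.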

9.3.5 Application: Reasoning with Design Patterns

An immediate application of the formalisation of design patterns is reasoning about certain properties. This section presents some properties that can be stated with our formalism.

Commuting Patterns

Proving that the application of two patterns commutes is of limited interest to the user at the design level, but it is useful for a software team to know that the two tasks can be performed in either order if project planning so recommends.

Additionally, a design can become cluttered or complicated after the application of a pattern. In that case, it may be desirable to postpone its application to the end of a commutative sequence of pattern applications.

Therefore, given two design patterns dp1 and dp2, we say that they commute

if the following property written in SLAM-SL holds:

forall design : [Class] (
  dp2.apply(dp1.apply(design)) = dp1.apply(dp2.apply(design))
  if (dp1.pre_apply(design) and dp2.pre_apply(design))
)
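The commutation property quantifies over all designs, so establishing it requires a proof. A weaker but cheap check is to test it on a finite sample of designs, which can refute commutation but never establish it. The Python sketch below does exactly that under stated assumptions: designs are frozensets of class names, and `AddClass` and `Suffix` are invented toy patterns exposing the `pre_apply` and `apply` operations used in the formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddClass:
    """Toy pattern: adds one new class to the design."""
    name: str

    def pre_apply(self, design):
        return self.name not in design

    def apply(self, design):
        return design | {self.name}


@dataclass(frozen=True)
class Suffix:
    """Toy pattern: renames every class by appending a suffix."""
    suffix: str

    def pre_apply(self, design):
        return len(design) > 0

    def apply(self, design):
        return frozenset(c + self.suffix for c in design)


def commute_on(dp1, dp2, designs):
    """Check the commutation formula on the sample designs only."""
    for design in designs:
        if dp1.pre_apply(design) and dp2.pre_apply(design):
            if dp2.apply(dp1.apply(design)) != dp1.apply(dp2.apply(design)):
                return False
    return True


samples = [frozenset({"Shape"}), frozenset({"Shape", "TextView"})]
```

On these samples the two `AddClass` patterns commute, while two `Suffix` patterns with different suffixes do not, because the order of application shows up in the resulting names.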

Adapter and Decorator are an example of two patterns that commute. We omit the proof to save space, but it is straightforward. Let us discuss the consequences of this fact. Consider the example of a drawing editor, as in [52, Chapter 4], that lets you draw and manipulate graphical elements (lines, polygons, text, etc.). The interface for graphical objects is defined by an abstract Shape class. Each elementary geometric subclass of Shape is able to implement a draw method, but TextShape is not. Meanwhile, an off-the-shelf user interface toolkit provides a sophisticated TextView class for displaying and editing text. In addition, this toolkit should let you add properties like borders or scrolling to any user interface component. In this example, we can apply two design patterns: Adapter, in order to define TextShape so that it adapts the TextView interface to Shape's; and Decorator, in order to attach “decorating” responsibilities (scroll, border, etc.) to individual objects dynamically.

In the previous example, the application of the Adapter design pattern only adds an association relation between the TextView class and TextShape (as we said in Section 9.1), whereas the application of the Decorator design pattern transforms the design into a more complicated one (as we said in Section 9.3.2). So, in order to obtain simpler intermediate designs, Decorator would be applied last.

In general, it is unusual for two patterns to commute directly; more frequently they commute after some trivial modifications (e.g. permutation of arguments, renaming of operations, etc.). As these modifications can be specified in the specification language itself, more general properties can be proved.

More General Patterns

Another interesting property is that a design pattern is an instance of a more general one. In the Pattern Languages of Program Design meetings, a pattern proposal is often rejected with the argument that it is an instance of an existing one. We offer a basis for formally proving (or refuting) such statements. However, this does not mean that the concrete pattern is useless: firstly, because we are not specifying all the components of a pattern, and two operationally similar patterns can differ in their suggested usage, a difference that could be crucial for a software engineer; secondly, because the general pattern could be complicated, or rarely used in full, so that the simpler version is more suitable as part of the practitioner's expertise. Nevertheless, the tool can detect that storing the concrete design pattern is not needed because it is an instance of the other pattern, whose specification can be used instead.

That a design pattern cdp of type CDP<CArgs> is an instance of a more general design pattern gdp of type GDP<GArgs> (where CArgs is a subtype of GArgs) can be characterised through the following SLAM-SL formula:

forall design : [Class] (
  cdp.apply(design) = gdp.apply(design)
  if cdp.pre_apply(design)
)

Our specification of the design pattern Composite is an instance of the design

pattern Composite presented in [52, Chapter 4]. The general version allows sev-

eral Composite classes each one with its own behaviour. We can specify it in the

following way:

class CompositeGOF inherits DPattern<[Class]>

  public constructor compositeGOF ([Class])
  pre ...
  call compositeGOF(composites)
  post result.arg = composites

  public apply ([Class])
  pre ...
  call apply(leafs)
  post ...

and formally prove that composite is an instance of compositeGOF([]). Again we

omit the proof.

Other properties

Other interesting examples of properties that can be easily stated for reasoning

about design patterns and systems are:

Pattern Composition. It can be interesting to find out that a pattern is the composition of two patterns. This does not mean that the composed pattern must be excluded from the catalogue, but an implementation can take advantage of this fact. A design pattern dp is the composition of two design patterns dp1 and dp2 if:

forall design : [Class] (
  dp.apply(design) = dp2.apply(dp1.apply(design))
  if dp.pre_apply(design)
)

Pattern Implementation. An additional usage, out of the scope of this work, is

to prove that a concrete piece of software really implements a pattern. A design

design is the result of the application of a design pattern dp if:

exists original : [Class] (
  dp.post_apply(original, design)
)

Refactoring. Given a system design, we can explore whether a subsystem can be refactored by applying any design pattern dp from a collection of previously specified design patterns:

filterQ quantifies dp.pre_apply(subsystem)
with subsystem in design.subSequencies
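A search of this kind can be sketched in Python as a filter over the subsystems of a design. The sketch rests on several assumptions that are not in the specification: subSequencies is read as the non-empty contiguous subsequences of the design, a class is a pair of a name and a frozenset of method names, and `composite_pre` is a toy precondition (at least two classes sharing some method).

```python
def sub_sequences(design):
    """Non-empty contiguous subsequences (assumed meaning of subSequencies)."""
    n = len(design)
    return [design[i:j] for i in range(n) for j in range(i + 1, n + 1)]


def composite_pre(sub):
    """Toy applicability test: at least two classes sharing some method."""
    if len(sub) < 2:
        return False
    return bool(frozenset.intersection(*(methods for _, methods in sub)))


def refactoring_candidates(patterns, design):
    """Pairs (pattern name, subsystem) whose precondition holds."""
    return [(name, sub)
            for name, pre in patterns.items()
            for sub in sub_sequences(design)
            if pre(sub)]


design = [
    ("Line", frozenset({"draw"})),
    ("Circle", frozenset({"draw"})),
    ("Logger", frozenset({"log"})),
]
candidates = refactoring_candidates({"composite": composite_pre}, design)
```

In this toy design the only candidate is the subsystem formed by `Line` and `Circle`, since `Logger` shares no method with them.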

Pattern’s piece. In some designs a piece of a design pattern has been applied, but not the whole pattern, so the design can be refactored by applying only the remaining part. In these cases, we can find out whether a design pattern cdp is a component (or a piece) of another design pattern wdp:

forall design : [Class] (
  wdp.pre_apply(design)
  and exists sub_design : [Class] (
    sub_design.is_in(design)
    and wdp.pre_apply(design) implies cdp.pre_apply(sub_design)
    and wdp.apply(design) implies cdp.apply(sub_design)
  )
)

In the same way as when looking for a pattern implementation, we can find out whether a piece of a pattern has been applied to a design and then find the remaining part to be applied.


Figure 9.3: A tool for using design patterns

9.3.6 Application: Integration in a Development Environment

Once design patterns are modelled as class operations, it is relatively easy to incorporate them as a refactoring tool into development and design environments. Let us describe how to achieve this goal. An additional feature of your favourite development environment (Visual Studio, Visual Age, Rational Rose, ...) can allow the user to select a design pattern and provide its arguments. Figure 9.3 shows an example in C++, where the Decorator pattern has been chosen to organize flexibly the responsibilities of three existing display classes: Border, Scroll, and TextView. The first two are selected as “decorators” (they just allow things to be displayed in different ways), while the third one is classified as a concrete component (it is just a concrete way to display something, in this case a text).

Once the pattern is applied, the existing code is automatically modified and

the new classes (if any) are generated as depicted in figure 9.4. The pattern pre-

conditions are checked and in case of failure a message explaining the reason is

displayed.

Figure 9.4: Result of the application of the pattern.

Tools for incorporating design patterns into a project have already been
developed, but they start from the idea that the designer has the pattern in mind
before generating any code. The tool generates a code/design skeleton, and then
the user provides the particular details for each class. Obviously, our modelling
can also be used for this purpose (and the tool modified accordingly with little
effort), but we have preferred to focus on a refactoring point of view. The
designer rarely selects a design pattern from the very beginning; patterns are
usually inserted later, when the design complexity grows. Additionally, our ideas
reinforce the reusability of existing code, because the argument classes can be
part of a library.

9.4 Future Trends

Although we consider our approach very promising, some additional work can

be done. Our future work will address the following issues:

• Obviously, it is important to provide a formalization of a more significant
collection of patterns (even if we have already described a good number of
them). We also plan to reformulate the descriptions in other, more widely used
specification languages, like Z or Maude. Notice that this is a simple
translation (except that the reflective features need to be modelled either
from scratch, in Z, or by augmenting the existing ones, in Maude). However,
the new formulations can make it easier to include design patterns in
existing tools for these languages.

• One of the most promising applications is the development of tools. We plan
to develop the tools described fully and efficiently, exploring the real impact
of our approach in concrete applications.

Although we have shown how to incorporate design patterns into a development
tool, the same can be done in a design tool, like Rational Rose for UML. In
this case, the system would generate new diagrams and OCL specifications.

In fact, we have only shown the simplest possible tool, and many extensions
are possible. In particular, an additional feature could be to select some
classes and then let the system find the patterns that can be applied to
them (i.e. those whose preconditions are fulfilled).

• Although the reflective features of SLAM-SL allow for many semantic
treatments of specifications, it is possible to go deeper with this approach.
Many interesting issues of SLAM-SL (for instance, proving that solutions imply
the postconditions, or that the inheritance relation is fulfilled) need “hard”
reasoning on formulae. This means that some non-trivial mathematical proofs are
needed. Either we leave them to the person responsible for the specification
(human), or we use automatic theorem proving tools (computer, or mixed). We
want to explore this second approach in the near future. This would allow us to
include more semantic conditions in our modelling of object-oriented aspects.
For instance, the do_nothing method just checks syntactically that the
postcondition is exactly the atom false, while it could be checked that the
postcondition is logically equivalent to false.
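The difference between the syntactic and the semantic check in the last item can be illustrated with a small Python sketch. It is a deliberate simplification: formulae are assumed to be propositional and are encoded as nested tuples, whereas SLAM-SL postconditions are first-order and would need a theorem prover. All names below are hypothetical.

```python
from itertools import product

# Hypothetical propositional formulae as nested tuples:
# ("const", False), ("var", "p"), ("not", f), ("and", f, g), ("or", f, g)


def variables(f):
    """Collect the variable names occurring in a formula."""
    tag = f[0]
    if tag == "var":
        return {f[1]}
    if tag == "const":
        return set()
    return set().union(*(variables(sub) for sub in f[1:]))


def evaluate(f, env):
    """Evaluate a formula under an assignment of booleans to variables."""
    tag = f[0]
    if tag == "const":
        return f[1]
    if tag == "var":
        return env[f[1]]
    if tag == "not":
        return not evaluate(f[1], env)
    if tag == "and":
        return evaluate(f[1], env) and evaluate(f[2], env)
    if tag == "or":
        return evaluate(f[1], env) or evaluate(f[2], env)
    raise ValueError(f"unknown connective: {tag}")


def is_syntactically_false(f):
    """The syntactic check: the formula is exactly the atom false."""
    return f == ("const", False)


def is_equivalent_to_false(f):
    """The semantic check: no assignment satisfies the formula."""
    names = sorted(variables(f))
    return not any(
        evaluate(f, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )


contradiction = ("and", ("var", "p"), ("not", ("var", "p")))
```

The formula `contradiction` is rejected by the syntactic check but accepted by the semantic one, which is the kind of case the bullet above refers to.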

9.5 Conclusion

We have proposed a formalization of design patterns by viewing them as
operators between classes. The idea is not new and has circulated in the design
patterns community for some time; for instance, Prof. John Vlissides mentioned
it in a panel at POPL’00. However, to the best of our knowledge, the technique
had not been developed.

The precise definition of software design patterns is a prerequisite for allow-

ing tool support in their implementation. Thus coherent specifications of pat-

terns are essential not only to improve their comprehension and to reason about

their properties, but also to support and automate their use.

If we measure our proposal following the criteria of A.H. Eden in his FAQ page
on Formal and precise software pattern representation languages [135], we can
establish that our approach is expressive, because it conveys the abstraction
observed in patterns; concise, at least more so than other formalizations;
compact, because it is heavily focused on relevant aspects of patterns; and
descriptive, in the sense that we can apply our model to any pattern, although
for some patterns modelling the most general version may lead to a less concise
formalization than formalizing the specific ones.

It is worth mentioning that we are not claiming that our approach is the
“unique” or “the most appropriate” way to formalize design patterns. In fact,
different formalizations focused on particular aspects lead to different tools,
properties to prove, aspects to understand, etc.

Our formal understanding of patterns gives support for tools that integrate
with existing object-oriented environments. The main difference between our
tool and the one proposed in [41, 42] is that our method is adapted to existing
“everyday” CASE environments instead of requiring a totally new application, so
the user of existing tools can benefit from design patterns almost for free. We
can also apply our tool to existing code, even code stored in libraries.
However, the two kinds of tools are not alternatives but complementary, as the
output of the tool in [41, 42] could be used to instruct our tool.

In summary, our work tries to add some value to design pattern modelling:

the possibility of reasoning with them, understanding, refactoring, etc.

9.6 Appendix: Formalisation of DP in SLAM-SL

In this appendix we show the SLAM-SL formalization of some design patterns (DP).



[Diagrams: Abstract Factory arguments; Abstract Factory results]

class Abstract_Factory_DP inherits DP <{Method}>

public observer apply ([Class]) : [Class]
  pre not factories.is_empty and
      forall m in arg with
        forall f in factories with
          exists cm in c.meths with not m.differs (cm)
  call apply (factories)
  post result = [abstract_factory] +
                map f in factories with f \ inh.insert (abstract_factory)
  where
    abstract_factory = <slamcode>class Abstract_Factory</slamcode>
                       \ meths = map m in args
                                 with m \ prec = <slamcode>false</slamcode>
                                        \ postc = <slamcode>true</slamcode>

Figure 9.5: Abstract Factory pattern specification.

9.6.1 Abstract Factory (Figure 9.5)

The Abstract Factory pattern is part of the creational series. It provides an inter-

face for creating families of related or dependent objects without specifying the

concrete classes. We can see this pattern as an operator that takes the
“factories” classes as argument and produces a new abstract class
AbstractFactory for interfacing them. The old classes are modified to inherit
from this new class.

The class argument collects the methods to abstract. For simplicity we have
considered it as a set of methods, although the pre- and postconditions and the
argument names are not relevant.

The precondition of apply basically establishes that the methods to abstract

are present with the same format in all the factory classes.
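The operator view of Abstract Factory can be mimicked in Python as a function from classes to classes. This is a minimal sketch under stated assumptions, not the SLAM-SL semantics: the `Method` and `Class` records are hypothetical, pre- and postconditions are plain strings, and methods are compared by name only instead of with the differs operation used in Figure 9.5.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, List


@dataclass(frozen=True)
class Method:
    name: str
    prec: str = "true"
    postc: str = "true"


@dataclass(frozen=True)
class Class:
    name: str
    meths: FrozenSet[Method] = frozenset()
    inh: FrozenSet[str] = frozenset()  # names of the superclasses


def pre_apply(to_abstract: FrozenSet[str], factories: List[Class]) -> bool:
    """Every method to abstract is present, by name, in every factory."""
    return bool(factories) and all(
        any(cm.name == m for cm in f.meths)
        for m in to_abstract
        for f in factories
    )


def apply(to_abstract: FrozenSet[str], factories: List[Class]) -> List[Class]:
    """Return the new abstract class followed by the factories, now inheriting from it."""
    if not pre_apply(to_abstract, factories):
        raise ValueError("precondition of Abstract Factory does not hold")
    abstract_factory = Class(
        name="Abstract_Factory",
        meths=frozenset(Method(m, prec="false", postc="true") for m in to_abstract),
    )
    return [abstract_factory] + [
        replace(f, inh=f.inh | {abstract_factory.name}) for f in factories
    ]


# Hypothetical factories sharing one creation method.
motif = Class("MotifFactory", frozenset({Method("create_window")}))
pm = Class("PMFactory", frozenset({Method("create_window")}))
result = apply(frozenset({"create_window"}), [motif, pm])
```

As in the figure, the abstracted methods get the precondition false and the postcondition true, which marks them as methods without an implementation in the new class.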



[Diagrams: Bridge arguments; Bridge results]

class Bridge_DP inherits DP <Unit>

public observer apply ([Class]) : [Class]
  pre classes.length = 1
  call apply (classes)
  post result =
    [ impl,
      cl \ st = <slamcode>state (imp : Impl)</slamcode>
         \ meths = map m in cl.meths
                   with m \ postc = <slamcode>result = x</slamcode>
                                    [x := imp.m.name(m.Call)] ]
  where impl = cl \ name = cl.name + "_Impl",
        cl = classes.first

Figure 9.6: Bridge pattern specification.

9.6.2 Bridge (Figure 9.6)

Bridge is one of the object structural patterns and is used to decouple an
abstraction from its implementation. The operator takes a class as argument and
returns two classes. One is the implementation class, which is basically the
original one. The other is an abstract class created by modifying the state of
the original one: the state is replaced by a single attribute belonging to the
implementation class. Methods are rewritten as mere calls to the corresponding
ones in the implementation class.
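The description above can be sketched in Python as follows. The `Method` and `Class` records, the string encoding of postconditions and the example class are assumptions made for illustration; in particular the forwarded call is written with a literal "..." instead of the real argument list.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Method:
    name: str
    postc: str


@dataclass(frozen=True)
class Class:
    name: str
    state: Tuple[Tuple[str, str], ...]  # (attribute name, type name) pairs
    meths: Tuple[Method, ...]


def apply(classes: List[Class]) -> List[Class]:
    """Split the single input class into an implementation and an abstraction."""
    if len(classes) != 1:
        raise ValueError("Bridge expects exactly one class")
    cl = classes[0]
    # The implementation class is the original one under a new name.
    impl = replace(cl, name=cl.name + "_Impl")
    # The abstraction keeps the name, stores an implementation and forwards calls.
    abstraction = replace(
        cl,
        state=(("imp", impl.name),),
        meths=tuple(
            replace(m, postc=f"result = imp.{m.name}(...)") for m in cl.meths
        ),
    )
    return [impl, abstraction]


window = Class("Window", (("width", "Int"),), (Method("draw", "result = ..."),))
impl, abstraction = apply([window])
```

The precondition of the figure, classes.length = 1, becomes the length check at the start of `apply`.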

9.6.3 Strategy (Figure 9.7)

This object behavioural pattern can be used when we have a family of algorithms
for the same purpose and we want to encapsulate each one and make them
interchangeable. Thus, the input classes share some methods and a new class
Strategy is created to provide an interface for them. The old classes need to
inherit from the Strategy class.

The apply precondition needs to ensure that there really are common methods to
abstract. Again, no argument class is really needed and the empty class is used
instead.



[Diagrams: Strategy arguments; Strategy results]

module examples.dps

class Strategy_DP inherits DP (Empty)

public observer apply ([Class]) : [Class]
  let (m in common_methods equiv forall cl in leafs with m in cl.methods)
  pre :− not common_methods.is_empty
  call apply (classes)
  post :− result = [strategy] +
                   map cl in classes with cl \ inh.insert (strategy)
  where
    strategy = <slamcode>class Strategy</slamcode>
               \ methods = map m in common_methods
                           with m \ prec = <slamcode>false</slamcode>
                                  \ postc = <slamcode>false</slamcode>

Figure 9.7: Strategy pattern specification.

9.6.4 Adapter (Figure 9.8)

The Adapter pattern belongs to the object structural patterns. It converts the

interface of a class adaptee into another interface target clients expect. It is done

by introducing a class adapter that inherits from both classes. Every method that

needs to be adapted is specified by making a call to the corresponding adaptee

method.

The apply precondition states that target is an interface and the methods to

be adapted are present in both classes.

The class argument relates methods between the target and the adaptee. It
consists of a pair of functions: the first one accepts a target method as
parameter and returns the corresponding method in the adaptee that implements
this interface, while the second one adapts the arguments between both calls.
Of course, the latter is a simplification of the problem, because in many cases
adapting one method call to the other is not merely a reordering of the
arguments.



[Diagrams: Adapter arguments; Adapter results]

class Adapter_DP inherits DP <Method → Method, [String] → [String]>

public observer apply ([Class]) : [Class]
  let (m in common_methods equiv forall cl in leafs with m in cl.methods),
      (relates, adapts) = arg,
      to_adapt = filter m in target.meths with (m in relates.dom) in
  pre target.is_interface and
      forall m in to_adapt with (relates (m) in adaptee.meths)
  call apply ([target, adaptee])
  post result = [target, adaptee, adapter]
  where adapter = <slamcode> class Adapter</slamcode>
                  \ inh = {target, adaptee}
                  \ meths = map m in to_adapt
                            with m \ prec = relates (m).prec
                                   \ postc = (result = makeCall
                                       (relates (m).name, adapts (m.Call)))

Figure 9.8: Adapter pattern specification.

9.6.5 Observer (Figure 9.9)

The Observer pattern is one of the behavioural patterns. It defines a
one-to-many dependency between objects so that when one object changes state,
all its dependants are notified and updated automatically. We assume that we
already have a collection of concrete subject classes and a concrete observer.
The pattern invents a class to store observers that is a superclass of all the
concrete subjects, and also a class Observer to abstract the update operation
of the concrete observer.

The class argument simply identifies the update method in the concreteObserver
class. A class Observer is created just for interfacing this method and the
concreteObserver inherits from this new class. Another class, Subject, is
invented to store observers. The classes in concreteSubjects are forced to
inherit from Subject, and all their constructor and modifier methods include a
call to the Notify operation of the parent. The specification is a bit long but
easy to follow:



[Diagrams: Observer arguments; Observer results]

class Observer_DP inherits DPattern <Method → Bool>

public observer apply ([Class]): [Class]
  pre exists1 m in concreteObserver.meths
        with (isUpdate (m) and updateMethods.sig = [])
  call apply (classes)
  post result = [subject, observ,
                 concreteObserver \ inh.insert (observer)] +
                map cl in concreteSubjects
                with cl \ inh.insert (subject) and
                     meths = forall m in meths with addNotify(m)
  where isUpdate = arg
        concreteObserver = classes (1)
        concreteSubjects = classes.suffix(1)
        updateMethod = select m in concreteObserver.meths with isUpdate (m)
        subject = makeClass ("Subject", [makeDec ("observers", [observ])],
                             {observ}, true, {attach, detach, create, notify})
        attach = makeMethod(modifier, public, "attach",
                            [makeDec("ob", Observer)],
                            true, (result = observ.insert (ob)))
        detach = makeMethod(modifier, public, "detach",
                            [makeDec("ob", Observer)],
                            true, (result = observ.remove (ob)))
        create = makeMethod(constructor, public, "create",
                            emptyDec, true, (result = []))
        notify = makeMethod(modifier, "notify", emptyDec, true,
                            (result = map o in observers
                                      with o.makeCall(updateMethod.name())))
        observ = makeClass("Observer", emptyDec, {}, true,
                           {updateMethod \ prec = false and postc = true})
        addNotify (m) = if m.kind.isConstructor or m.kind.isModifier
                        then m \ postc = m.postc and self.notify
                        else m

Figure 9.9: Observer pattern specification.



9.6.6 Template Method (Figure 9.10)

The TemplateMethod (class behavioural) pattern is applied to a single class to
abstract a method that can be used as a skeleton of similar algorithms. We
assume that the method is already implemented in the class. The pattern
abstracts this method into a new class and eliminates it from the original one.

The formalization is more or less straightforward: it invents a new class
TemplateAbstractClass with the same functionality as the original one but with
empty code for those methods that are not classified as templates. The concrete
class is forced to inherit from the new class and the template methods are
removed from it.

The class argument is a boolean function which decides the methods to abstract
as template methods. The precondition plays an important role because the
template methods must really be templates, i.e. they may only use predefined
and public operations of the class.
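A Python sketch of this operator, following the prose description rather than the exact SLAM-SL text, could look as follows. The records, the string-valued conditions and the example class are hypothetical, and the state of the classes is left out for brevity.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Method:
    name: str
    prec: str = "true"
    postc: str = "true"


@dataclass(frozen=True)
class Class:
    name: str
    meths: Tuple[Method, ...] = ()
    inh: Tuple[str, ...] = ()


def apply(concrete: Class, is_template: Callable[[Method], bool]) -> List[Class]:
    """Move the template methods of `concrete` into a new abstract superclass."""
    if not any(is_template(m) for m in concrete.meths):
        raise ValueError("no template method to abstract")
    abstract = Class(
        name="TemplateAbstractClass",
        # Template methods keep their specification; the others become abstract.
        meths=tuple(
            m if is_template(m) else replace(m, prec="false", postc="true")
            for m in concrete.meths
        ),
    )
    refined = replace(
        concrete,
        inh=concrete.inh + (abstract.name,),
        meths=tuple(m for m in concrete.meths if not is_template(m)),
    )
    return [abstract, refined]


# Hypothetical class whose render method is a template over header and body.
report = Class("Report", (Method("render", postc="result = header + body"),
                          Method("header"), Method("body")))
abstract, refined = apply(report, lambda m: m.name == "render")
```

The sketch only checks that at least one template method exists; the stronger condition that a template method uses only predefined and public operations would require inspecting the method bodies.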

9.6.7 Decorator (Figure 9.11)

The Decorator pattern is classified as object structural and is used to attach
additional responsibilities to an object dynamically. It can be seen as the
following class operator: a collection of concrete components and a collection
of decorators are used as arguments. They share some operations that the
pattern abstracts in two steps. First, a new Decorator class abstracts the
operations of the decorators. Then another newly created class, Component,
abstracts the operations both for the concrete components and for the
decorator.

The class argument is used to split the sequence of classes into the concrete
components and the decorators. Concrete components are forced to inherit from
Component, while decorators inherit from Decorator and modify the common
methods to add a call to the decorator operation. The Decorator class contains
a Component in its state and offers the common methods as public. They are
implemented as simple calls to the equivalent operations in the stored
component. Finally, Component is merely an interface for the common methods.



[Diagrams: Template Method arguments; Template Method results]

class Template_Method_DP inherits DP <Method → Bool>

public observer apply ([Class]): [Class]
  pre exists m in concreteClass.meths with isTemplate(m)
  call apply ([concreteClass])
  post result = [templateAbstracClass,
                 concreteClass \ st = [] and
                                 inh.insert (templateAbstracClass) and
                                 meths = filter m in concreteClass.meths
                                         with not isTemplate(m)]
  where templateMethods = filter m concreteClass.meths with not isTemplate(m)
        templateAbstracClass = makeClass ("TemplateAbstractClass",
                                          concreteClass.st, {}, true,
                                          map m in concreteClass.meths
                                          with modify (m))
        modify (m) = if m in templateMethods
                     then m
                     else m \ prec = false and postc = true end
        isTemplate = arg

Figure 9.10: Template Method pattern specification.

9.6.8 State (Figure 9.12)

The State pattern belongs to the object behavioural classification. It can be
used to allow an object to modify its behaviour when its internal state
changes. The object will appear to change its class. When studied as a class
operator, it takes a collection of concrete state classes as argument. All
these classes are present in the result, except that they now inherit from the
State class described below. The result adds two classes. The first one, called
State, abstracts the behaviour of all the concrete states: it is an interface
containing the methods common to all of them. The second one is Context, which
is designed for calling state operations. It contains a State as attribute and
all the common methods, described as mere calls to the corresponding operation
of the attribute. This class can be refined by inheritance to introduce more
functionality. The complete specification can be found in Figure 9.12.



[Diagrams: Decorator arguments; Decorator results]

class Decorator_DP inherits DPattern<Natural>

public constructor decorator (Natural)
  call decorator(n)
  post result.arg = n

public observer apply ([Class]): [Class]
  let common_meths = {m with cl in classes | m in cl.methods}
  pre (classes.length > arg) and (not common_meths.isEmpty)
  call apply (classes)
  post
    result = [component, decorator]
             + mapQ quantifies c \ inh.insert(decorator)
                                 \ methods = mapQ quantifies add_call(m, c.methods)
                                             with m in c.methods
               with c in concrete_classes
             + mapQ quantifies c \ inh.insert(component) with c in concrete_classes
  where
    concrete_classes = classes.prefix (arg);
    decorators = classes.suffix (arg);
    component = mk_Class(
                  "Component", {}, {},
                  mapQ quantifies
                    m \ (mapQ quantifies r \ prec = $false$
                                           \ postc = $true$
                         with r in rules)
                  with m in common_methods);
    decorator = mk_Class(
                  "Decorator",
                  {mk_State([mk_Field("component", component)])},
                  {component},
                  mapQ quantifies
                    m \ mapQ quantifies
                          r \ postc = <slamcode>
                                result = component.mk_Call(m.name,m.parameters)
                              </slamcode>
                        with r in rules
                  with m in common_methods);

Figure 9.11: Decorator pattern specification.



[Diagrams: State arguments; State results]

class State_DP inherits DP <Empty>

public observer apply ([Class]) : [Class]
  let (m in common_methods equiv forall cl in leafs with m in cl.methods)
  pre not concrete_states.is_empty and not common_methods.is_empty
  call apply (concrete_states)
  post result = [context, abs_state] +
                map c in concrete_states with c \ inh.insert (abs_state)
  where
    abs_state = make_class ("State", {}, {},
                  map m in common_methods
                  with m \ map r in rules with r \ prec = $false$
                                                 \ postc = $true$),
    context = make_class ("Context",
                {make_state (make_attribute (make_declaration ("stt", abs_state)))},
                {}, map m in common_methods with transfer (m)),
    transfer (m) =
      m \ map r in rules
          with r.put_postc ($result = stt.make_call(m.name,m.parameters)$)

Figure 9.12: State pattern specification.

9.6.9 Builder (Figure 9.13)

The Builder pattern (belonging to the object creational patterns) is designed to

separate the construction of a complex object from its representation, so that the

same construction process can create different representations.

As a class operator, the Builder pattern takes a collection of concrete
builders as an argument. Another class, director, is part of the arguments and
it is assumed that it contains the algorithm to construct objects. It is also
assumed that all the concrete builders share some operations that are used to
build objects. Those methods are called builders and need to be defined in all
the concrete builders. The argument class is a boolean function isBuilder that
is applied to the methods of the concrete builders to detect whether they are
builders or not. Methods classified as builders are abstracted into the Builder
class. The concrete builders appear in the result but they are forced to
inherit from Builder. The director class is modified in the following way:
every attribute whose type is one of the concrete builder classes is abstracted
to the Builder class. See Figure 9.13 for the detailed description.

[Diagrams: Builder arguments; Builder results]

class Builder_DP inherits DP <Method → Boolean>

public observer apply ([Class]) : [Class]
  let concrete_builders = classes.prefix (1)
      (m in common_methods equiv (forall cl in concrete_builders
                                  with m in cl.methods)),
      is_builder = arg,
      builder_methods = filter m in common_methods with is_builder (m)
  pre (classes.length > 2) and (not builder_methods.is_empty)
  call apply (classes)
  post result = [builder] +
                [director \ states = map d in director.states
                                     with abstract_to_builder (d)] +
                map c in concrete_builders
                with c \ inheritance.insert (builder)
  where
    director = classes.prefix (1),
    builder = make_class ("Builder", {}, {},
                map m in builder_methods
                with m \ map r in rules
                         with r \ prec = <slamcode>false</slamcode>
                                \ postc = <slamcode>true</slamcode>),
    abstract_to_builder (d) = map a in d.attributes
                              with if a.type in concrete_builders
                                   then d \ type = builder
                                   else d end

Figure 9.13: Builder pattern specification.


Part V

Conclusion


10 Conclusions and Future Work

Abstract

In this chapter we state the conclusions of this thesis and present the

main lines of future work.

10.1 Conclusions

The motivation of this thesis was to contribute to bridging logic-based methods
and software development practices. The main outcomes have been the Clay

object-oriented formal notation, its associated theory and tools, and a demon-

stration that different formal techniques can be integrated in current software

production processes.

10.1.1 Tools

In our opinion, one remarkable aspect of this thesis is that all the theoretical work

has been mechanised. All the formalisation has been described using languages

and formal tools already available for a potential user of Clay: first-order theories


in Prover9/Mace4, executable prototypes in Prolog and translation functions in

Haskell.

We think that this makes this thesis a methodological contribution towards
the goals behind proposals like the POPLmark challenge and the QED manifesto,
and that it could serve as a guide for other projects.

The use of automatic prover technology allowed us to mechanise both Clay’s

meta-theory and specifications. For example, some of the theorems about Clay

have been proved automatically.

Our compiler is able to synthesise executable prototypes from implicit method
specifications, especially in the presence of recursive definitions, something
which is seldom supported by other lightweight methods and tools, which demand
to be fed with executable specifications [49, 58].

Tools that rely on model checking, such as Alloy and ProB, make an exhaus-

tive search of models that act as counterexamples. Even with small models, these

tools find errors in specifications. In the worst case, our executable prototypes

can act as model checkers but Clay does not require the specifications to be re-

fined to be programs.

The main drawback of the prototypes synthesised by our tool is performance. If
the response times of the prototypes are not good enough, their executability
is worthless. Nevertheless, we are excited about the dramatic improvements in
the results presented in [68], obtained after encoding Clay integers and their
methods as SWI-Prolog integers and finite domain operations using constraints.

We found the interaction with the classical logic automated prover quite un-

pleasant. Particularly annoying were those undetected errors that could be found

with a sorted logic and with the declaration of symbols. These difficulties forced

us to take some intricate decisions in Chapter 4 such as reflecting the sorts in

axioms at the cost of having a less readable theory.

It is very likely that our formalisations and, therefore, our implementation
still contain errors. With all our formalisation implemented, we know that
finding and fixing such errors will not be that difficult. Nevertheless, we
missed some kind of tool that kept the implementation and the formalisations
synchronised.


10.1.2 Theory

We have modelled the main features of object-oriented languages in classical

logic. A common theory describes the semantics of inheritance, overloading

and dynamic binding, and a translation function defines how to encode object-

oriented specifications (Clay) into axioms.

The result, apart from the formalisation itself, is that the specifier can inter-

act with an automatic prover (Prover9/Mace4) to detect inconsistencies in her

specifications or get higher confidence on them.

We think that the use of Clay or Prover9/Mace4, with respect to the conclusions
above, is incidental, and that the ideas can be applied to other
object-oriented notations and other prover technologies.

We have mentioned in the previous subsection that the interaction with the
classical logic automated prover was quite unpleasant. Much more important,
however, was not having introduced induction schemes from the inductive data
definitions (case classes). Without such schemes the theorems proved
automatically cannot be very strong. We will go into some details in the future
work section.

10.1.3 Clay

We have designed Clay, a formal notation with the following features: object-

oriented, class-based with nominal subtyping, stateless, algebraic types, Scandi-

navian semantics and permissive overloading.

From these features, we would like to highlight the implications of its mono-

tonic semantics that relates concepts like the Scandinavian semantics (the be-

haviour of the parent is preserved and augmented), permissive overloading (sub-

stitutability and specialisation), the Liskov substitution principle (properties de-

scribed in a class are inherited in the objects of a subclass), and open world as-

sumptions. In Clay, a subclass cannot invalidate, by overriding for instance, any

property specified in its superclasses. This approach is essential when we are

specifying in the large because the specifier needs to reason locally within a class

specification. The drawback is a certain loss of flexibility but, in our view,
the decision pays off.

We wanted Clay to be simple enough and close to the developers’ way of
thinking. We think that we have succeeded in some aspects but not in all. The
most controversial feature left out is state. The reader can observe that, even
without state, the formalisation in classical logic gets rather complex. The
introduction of state would have required us to move to another kind of logical
framework. We explore this issue in the future work section.

10.1.4 Software Development

The compilation scheme from Clay specifications into logic programs supports

the generation of executable prototypes that help in validating requirements. It

was a strong requirement that executability of specifications was not obtained

by sacrificing expressiveness, e.g. by forcing the specifier to use some intricate

idiom. This raises the abstraction level at which the specifier can write the re-

quirements.

After several years writing specifications and programs, we have learnt that

both tasks, specifying and programming, are extraordinarily similar. One con-

clusion that can be drawn from Part IV is that, as our published papers show,

there is a mutual interest in exchanging concepts between both fields.

10.2 Future Work

In the previous section we mentioned some limitations of our contributions. In

this one we present our future work projects. Some of them are a natural evo-

lution of those parts of our work that can be considered unfinished, some are

improvements and the rest are new research topics we discovered in the process

of writing this thesis.

10.2.1 Evolution

Correctness of the Prolog Generator

The results of the synthesised prototypes are not valuable if they are not
correct. The most obvious reason for getting incorrect results is that a
synthesised prototype is not correct with respect to the formal semantics of
Clay. To avoid this we will work on a correctness result relating the formal
semantics of Chapter 4 and the translation of Chapter 5.

Induction Schemes

The data model of Clay is inductive (algebraic types). In general, to prove the-

orems the automatic theorem prover will need inductive axioms for every class,

but these cannot be encoded in first-order logic and Prover9/Mace4 does not
support any kind of induction scheme. Therefore, every property to be proved

by induction demands the introduction of its own inductive axioms. We statically

generate inductive axioms for known predicate names (e.g. invariants, pre- and

post-conditions in the class specification) but not for every intermediate lemma

that requires induction and that might not be known statically. Without giving

up first-order logic, we plan to explore the intermediate results of the deduction

process in order to incrementally add new inductive axioms that help in the proof

of theorems stronger than, say, subject reduction.

Introduction of State

Clay is stateless while most of the current modelling languages used by software

engineers have mutable state. Our plan is to modify Clay’s semantics to support

the notion of state. The most promising tools to support this extension are
evolving algebras (also known as abstract state machines) and dynamic logic.

10.2.2 Improvements

Improving the Efficiency of Prototypes

Comparison between the Prolog code obtained from our compiler and that

crafted by hand shows room for improvement. There exist more mature tools

(see, for instance, ProB [86, 87]) which also generate logic programs from formal

specifications. These works show that certain extensions of logic programming

(constraints, coroutining, etc.) can help in dramatically improving the efficiency

of the resulting code in an automated way. We plan to try new extensions in the

same way we have encoded Clay integers as predefined Prolog integers.



Feedback to Experts

Our expertise allows us to easily interact with the external tools (Prover9/Mace4

and Prolog). We cannot expect, however, that the average Clay user can under-

stand the answers returned by those tools.

To allow the specifier to interact with Clay, a decompiler that translates back

the answers of the tools into Clay is absolutely essential. The distance between

our Prolog encoding of Clay and Clay is very short, thus we do not expect great

difficulties. Nevertheless, interpreting proofs and interpretations returned by

Prover9/Mace4 will require hard work and we will need to include extra infor-

mation about Clay in the synthesised axioms.

Advanced Constructions in Clay

Clay is a lightweight evolution of SLAM-SL designed to focus on the formal
aspects treated only superficially in SLAM-SL. In this desugaring process some
interesting constructions of SLAM-SL were lost. Bringing iterators, patterns
and collection comprehensions back to Clay is one of our lines of potential
future work.

10.2.3 New Topics

The topics displayed in this subsection were uncovered while working on this
thesis.

Other Provers

Prover9/Mace4 is a very powerful tool. Nevertheless, encoding object-oriented

concepts such as classes and messages in an untyped logic has been quite difficult
and error prone. We have introduced many mistakes that could have been
detected with even a minimal notion of types in the prover.

We consider it interesting to study the representation of object-oriented concepts
in some kind of higher-order logic and to try proof assistants such as Coq
[13]. One of the advantages of moving to a higher-order setting would be a more
flexible treatment of induction schemes.


Concrete and Abstract Syntax

We think that the way we have formalised Clay is a methodological contribution.

We would like to go further and construct tools for describing

• concrete and abstract syntax,

• the injection of concrete sentences into abstract trees (parsing),

• the construction of environments as efficient structures for managing the

abstract trees,

• the predicates that check wellformedness rules (such as type checking),

• the translation between abstract trees (translation functions), and

• the preferred function included in the reversed injection (pretty printers).

It would be particularly interesting to obtain automatically a reverse translation function
that supports the translation of feedback in the object language to the target
language.


Part VI

Appendices


A Clay Notation Reference

Abstract

This appendix contains a fairly detailed description of the syntax of Clay.

A.1 Lexical Issues

Clay lexemes are grouped in identifiers, literals, operators and keywords. White

space and comments are just used as lexeme separators and are ignored for any

other purpose. White spaces are blanks, tabs and new lines. Comment lines start
with “//” and (non-nested) comments appear between these strings: “/*” and “*/”.

A.1.1 Identifiers and Variables

Clay has four types of names: class identifiers, class variables, message identifiers

and object variables. At the syntactic layer, message identifiers, class variables

and object variables are not distinguished. Module identifiers are introduced in

order to qualify class identifiers. Modules define hierarchical namespaces similar
to those in object-oriented programming languages such as Java.


The regular expressions that recognise identifiers and variables are defined

on top of the auxiliary regular expressions

$lcalpha = a-z

$ucalpha = A-Z

$alpha = [$lcalpha $ucalpha]

$digit = 0-9

$underline = \_

$prime = \’

$dot = \.

$dollar = \$

@lcid = $dollar? $lcalpha [$alpha $digit $underline]* $prime* $dollar?

@ucid = $dollar? $ucalpha [$alpha $digit $underline]* $prime* $dollar?

Class Identifiers

The regular expression that recognises class identifiers is

@clsid = ((@qmodid|@mmvid) $dot)? @ucid

Message and Module Identifiers, and Variables

The regular expression that recognises message identifiers, one level namespace

identifiers and variables is

@mmvid = @lcid

The regular expression that recognises two or more level namespace identifiers

is

@qmodid = @lcid ($dot @lcid)+
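As an illustration, the identifier expressions above can be transcribed into Python's `re` syntax. This is only a sketch and not the lexer used for Clay: the names `TAIL`, `LCID`, `UCID`, `MMVID`, `QMODID`, `CLSID` and `classify` are hypothetical, the prime is assumed to be the ASCII apostrophe, and reserved words are not filtered out.

```python
import re

# Transcription of the auxiliary definitions
# (assumption: the prime character is the ASCII apostrophe).
TAIL = r"[A-Za-z0-9_]*'*\$?"      # [$alpha $digit $underline]* $prime* $dollar?
LCID = r"\$?[a-z]" + TAIL         # @lcid
UCID = r"\$?[A-Z]" + TAIL         # @ucid
MMVID = LCID                      # @mmvid
QMODID = LCID + r"(?:\." + LCID + r")+"                      # @qmodid
CLSID = r"(?:(?:" + QMODID + "|" + MMVID + r")\.)?" + UCID   # @clsid


def classify(lexeme):
    """Return the name of the first identifier class matching the whole lexeme."""
    for name, pattern in (("clsid", CLSID), ("qmodid", QMODID), ("mmvid", MMVID)):
        if re.fullmatch(pattern, lexeme):
            return name
    return None
```

Under these assumptions, `classify("clay.lang.Int")` yields `"clsid"`, `classify("clay.lang")` yields `"qmodid"`, and `classify("push")` yields `"mmvid"`.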

A.1.2 Literals

Literals for integers and decimal numbers are defined by these regular expres-

sions:

@int = $digit+

@dec = $digit+ ($dot $digit*)?

A.1.3 Operators

In Figure A.1, the sequences of characters on the left are recognised as logical,

relational and numerical operators. On the right we show a literate version and


ASCII   Literate   Meaning
<=>     ⇔          logical equivalence
=>      ⇒          logical implication
<=      ⇐          reversed logical implication
\/      ∨          logical disjunction
/\      ∧          logical conjunction
~       ¬          logical negation
=       =          equality
<:      <:         subclass
<       <          less than
>       >          greater than
=<      ≤          less than or equal
>=      ≥          greater than or equal
+       +          sum operator
-       −          sub operator
*       *          prod operator
/       /          div operator
<-      .          message passing

Figure A.1: Operators in Clay.

its meaning. Section A.4 shows the precedence of each operator.

A.1.4 Keywords

The following are reserved words and cannot be used as identifiers or variables.

Some of them are synonyms, which is indicated with “=”:

assert = assertion, bot, case, class, constructor, else, exists, forall, if, import,
inv = invariant, modifier, module, observer, post = postcondition, pre = precondition,
sol = solution, state = case, then, top

The following symbols are used as special lexemes to structure the specifica-

tion and facilitate parsing: “ ,”, “ :”, “<”, “>”, “(”, “)”, “{”, “}”.



A.2 Grammar

We express the context free grammar for Clay in Extended Backus–Naur Form

(EBNF) using these conventions:

• A sentential form α that may be omitted is represented with the suffix ?: α?.

• A sentential form α that may be repeated zero or more times is represented
with the suffix ∗: α∗.

• A sentential form α that may be repeated at least once is represented with

the suffix +: α+.

The axiom of the grammar is CompilationUnit.

A.2.1 Compilation Unit

A compilation unit contains a class specification. A module declaration introduces
the namespace of the specified class, and import declarations declare
which other classes are used in the specification.

CompilationUnit ::= ModDecl?
                    ImportDecl∗
                    ClassSpec

A.2.2 Module Declaration and Module Identifier

This is the syntax of a module declaration:

ModDecl ::= module ModId

and a module identifier is, directly, a lexeme defined in Section A.1.1:

ModId ::= qmodid | mmvid

A.2.3 Import Declaration and Class Identifier

This is the syntax of an import declaration:

ImportDecl ::= import ClsId


and a class identifier is, directly, a lexeme defined in Section A.1.1:

ClsId ::= clsid

A.2.4 Class Specification

A class specification has a class declaration, an invariant, a (possibly empty)

sequence of state (case class) declarations and a (possibly empty) sequence of

method specifications.

ClassSpec ::= ClassDecl {
                Invariant?
                StateDecl∗
                MethodSpec∗
              }

A.2.5 Class Declaration

A class declaration has a class identifier, its class formal parameters and its su-

perclasses.

ClassDecl ::= class ClsId ClassFormalParams? ClassSupertypes?

ClassFormalParams ::= < (ClassFormalParam , )∗ ClassFormalParam >

A class formal parameter introduces a class variable and its bounds.

ClassFormalParam ::= VarId ClassSupertypes?

ClassSupertypes ::= extends ClsExpr+

A.2.6 Invariant

Invariant ::= invariant { Formula }

A.2.7 State Declaration (Case Classes)

A state declaration introduces a class identifier and the fields defined for that case

class.

StateDecl ::= state ClsId { ((StateField , )∗StateField)? }

StateField ::= MsgId : ClsExpr



A.2.8 Method Specification and Message Identifiers

A method specification starts with a method declaration and has an optional

precondition, an optional postcondition, an optional solution, and a (possibly

empty) sequence of assertions.

MethodSpec ::= MethodDecl {
                 Precondition?
                 Postcondition?
                 Solution?
                 Assertion∗
               }

A method declaration introduces a message identifier with a method signa-

ture and optionally a result type (mandatory for observers and forbidden for con-

structors and modifiers).

MethodDecl ::= MethodKind MsgId MethodSignature? ( : ClsExpr)?

MethodKind ::= constructor | modifier | observer

A message identifier is, directly, a lexeme defined in Section A.1.1:

MsgId ::= mmvid

A method signature introduces formal parameters: object variables and their

types.

MethodSignature ::= ( (FormalParam , )∗FormalParam )

FormalParam ::= VarId : ClsExpr

Here is the syntax for preconditions, postconditions, solutions, and assertions.

Precondition ::= precondition { Formula }

Postcondition ::= postcondition { Formula }

Solution ::= solution { Formula }

Assertion ::= assertion { Formula }

A.3 Formulae and Expressions

A.3.1 Class Expression

Class expressions are class variables or a class identifier applied to a (possibly
empty) sequence of class expressions (class actual parameters).


ClsExpr ::= ClsId ClsActualParams?
          | VarId

ClsActualParams ::= < (ClsExpr , )∗ ClsExpr >

A.3.2 Object Expression

An object expression is a class expression, an object variable, a send expression,

or a sugared expression.

ObjExpr ::= ClsExpr
          | VarId
          | SendExpr
          | SugaredExpr

The send expression is the most frequently used expression. It consists of an
object expression (the receiver), the send operator, and the message.

SendExpr ::= ObjExpr.MsgExpr

Messages have a message identifier and several actual parameters that are

object expressions.

MsgExpr ::= MsgId MsgParams?

MsgParams ::= ( (ObjExpr , )∗ObjExpr )

A.3.3 Syntactic Sugar

Sugared expressions are used to introduce human-readable conventions for integers,
decimal numbers and their usual operations. Their syntax is the standard one in
any programming or specification language, and their internal representation is
given by send expressions.

SugaredExpr ::= BinExpr
              | UnaExpr
              | LitExpr

BinExpr ::= ObjExpr < ObjExpr
          | ObjExpr > ObjExpr
          | ObjExpr ≤ ObjExpr
          | ObjExpr ≥ ObjExpr
          | ObjExpr + ObjExpr
          | ObjExpr − ObjExpr


A.4 Precedence and Associativity

• * and / (left associative).

• . (the message passing symbol is left associative).

The precedence order for logical operators, lowest first, is:

• ⇔ (equivalence, left associative).

• ⇐ (reverse implication, right associative).

• ⇒ (implication, left associative).

• ∨ (disjunction, left associative).

• ∧ (conjunction, left associative).

• ¬ (negation, right associative).


B Clay Theory in Logic Programming

Abstract

This appendix contains the complete Clay Theory presented in Chapter 5

in the form of a logic program.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%% Clay theory %%%%%%%%%%%%%%%%%%%%%%%%

:- use_module(library(clpfd)).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% A bit of sugar for Clay expressions

:- op(200,yfx,'<--').

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% class(C) :- C is a class.

class_limited(C) :-

call_with_depth_limit(class(C),5,R),

number(R).

:- discontiguous class/1.

% Built-in classes


class(clay_lang_Meta).

class(clay_lang_Int).

class(clay_lang_MetaInt).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% inherits(A,SuperA) :- A is a "direct" subclass of SuperA.

:- discontiguous inherits/2.

% Built-in classes (no meta classes are needed here)

inherits(clay_lang_Int,[]).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% cases(A,[C1,C2,...,Cn]) :- each Ci is a case class of class A.

:- discontiguous cases/2.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% fields(C,[(FieldName1,FieldType1), ...]) :-

% Case class C has a field with fieldname FieldNamei and

% type FieldTypei.

:- discontiguous fields/2.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% reduce(E,NF) :- NF is the reduced form of Clay expression E.

:- discontiguous reduce/2.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% msgtype(A,Mid) :- Mid is a message defined in A.

:- discontiguous msgtype/2.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% pre(C,Self,Mid,[Arg1, Arg2...]) :- precondition

% holds for message Self<--m(Arg1, Arg2, ...) in class C

:- discontiguous pre/4.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% post(C,Self,Mid,[Arg1, Arg2...],Result) :- postcondition

% holds for Result (normal form) in class C for message

% Self<--m(Arg1, Arg2, ...)

:- discontiguous post/5.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% sol(C,Self,Mid,[Arg1, Arg2...],Result) :- solution

% holds for Result (normal form) in class C for message

% Self<--m(Arg1, Arg2, ...)

:- discontiguous sol/5.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% caseclass(A,B) :- A is a case class of B.

caseclass(A,B) :-


cases(B,Cases),

member(A,Cases).

class(A) :-

caseclass(A,Super),

class(Super).

inherits(A,[Super]) :-

caseclass(A,Super).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% superclasses(A,SuperAs) :-

% SuperAs is the sorted list of superclasses of A, max

% first.

superclasses(A,SuperAs) :-

superclasses_acum_limited(A,[A],SuperAs).

superclasses_acum_limited(A,Acum,Super) :-

call_with_depth_limit(superclasses_acum(A,Acum,Super),

10,

R),

number(R).

superclasses_acum(A,Acum,Acum) :-

inherits(A,[]).

superclasses_acum(A,Acum,SuperAs) :-

inherits(A,[Super]),

superclasses_acum(Super,[Super|Acum],SuperAs).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% nf(C,NF) :- NF is a normal form of class C.

nf_limited(C,NF) :-

call_with_depth_limit(nf(C,NF),38,R),

number(R).

:- discontiguous nf/2.

nf(C,NF) :-

superclasses(C,SuperCs),

genstates(SuperCs,NF).

nf(clay_lang_Meta, clay_lang_Meta).

nf(clay_lang_Meta, clay_lang_MetaInt).

nf(clay_lang_MetaInt, clay_lang_Int).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% genstates([C1,C2,...],[S1,S2,...]) :-

% for every i, state(Ci,Si).

genstates([],_) :- true.

genstates([Final],[State|_]) :-

genstate(Final,State).



% Care with case classes: they are subclasses, but their direct
% superclass is already generating their state (both rules)

genstates([Super,Case|Subs],[State|States]) :-

caseclass(Case,Super),

% To avoid exploration of all cases in next atom:

State = (Super, (Case, _)),


genstates(Subs,States).

genstates([Super,Sub|Subs],[State|States]) :-

inherits(Sub,[Super]),

\+ caseclass(Sub,Super),


genstates([Sub|Subs],States).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% genstate(C,S) :-

% S represents the "basic" state of class C, ie. the initial

% algebra generated by case classes.

genstate(C,(C,(C,[]))) :-

cases(C,[]).

genstate(C,(C,State)) :-

% To avoid blind generation of cases:

State = (Case, _),

cases(C,Cases),

Cases \= [],

caseclass(Case,C), % member(Case,Cases),

genfields(Case,State).

% Built-ins

genstate(clay_lang_Int,(clay_lang_Int,(clay_lang_Int,[I]))) :-

I in inf..sup.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% genfields(Case,(Case,Data)) :-
% Data represents a normal form of the Cartesian product of
% the types of the fields of the case class Case.

genfields(Case,(Case,Data)) :-

fields(Case,Fields),

fields2nf(Fields,Data).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% fields2nf([(FieldName1,FieldType1),...],

% [(FieldName1,NF1),...]) :-

% Each NFi is a normal form of FieldTypei.

fields2nf([],[]).

fields2nf([(FieldName,FieldType)|Fields],

[(FieldName,NF)|FieldsNF]) :-

nf(FieldType,NF),

fields2nf(Fields,FieldsNF).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


% instanceof(NF,C) :- NF represents an instance of C.

instanceof(NF,C) :-

nf_limited(C,NF).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% reduce(E,NF) :- NF is the normal form of expression E

% reduce(V,V) :-

% var(V).

reduce(I,[(clay_lang_Int, (clay_lang_Int, [I]))|_]) :-

number(I).

reduce(NF,NF) :-

NF = [_|_]. % Already in normal form

reduce(O<--M,NF) :-

M =.. [Mid|Args],

reduce(O,ONF),

reduceall(Args,ArgsNF),

knownclasses(ONF,Cs), % findall(C,instanceof(ONF,C),Cs),

checkpreposts(Cs,ONF,Mid,ArgsNF,NF,defined).

reduce(E1 < E2,NF) :-

reduce(E1,NF1),

reduce(E2,NF2),

eq(clay_lang_Int,

NF1,

[(clay_lang_Int, (clay_lang_Int, [I1]))|_]),

eq(clay_lang_Int,

NF2,

[(clay_lang_Int, (clay_lang_Int, [I2]))|_]),

clpfd2bool(I1 #< I2, NF).

reduce(C,C) :-

class_limited(C). % Classes are normal forms

clpfd2bool(C,NF) :-

C,

reduce(clay_lang_Bool<--mkTrue,NF).

clpfd2bool(C, NF) :-

#\ C,

reduce(clay_lang_Bool<--mkFalse,NF).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% reduceall(Es,NFs) :- NFs is the result of mapping reduce over
% the list of expressions Es.

reduceall([],[]).

reduceall([E|Es],[NF|NFs]) :-

reduce(E,NF),

reduceall(Es,NFs).



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% knownclasses(NF,Cs) :- Known classes of a normal form.

knownclasses(NF,[]) :-

var(NF),

!.

knownclasses(C,Cs) :-

class_limited(C),

findall(M,nf(M,C),Cs).

knownclasses([(C,_,_)|NFs],[C|Cs]) :-

knownclasses(NFs,Cs).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% checkpreposts(Cs,ONF,Mid,ArgsNF,NF,D) :- check
% pre(C,ONF,Mid,ArgsNF) and post(C,ONF,Mid,ArgsNF,NF)
% for every class C in Cs. D reflects whether the message Mid
% was defined for at least one class in Cs.

checkpreposts([],_ONF,_Mid,_ArgsNF,_NF, undefined).

checkpreposts([C|Cs],ONF,Mid,ArgsNF,NF, Defined) :-

\+ msgtype(C,Mid),

checkpreposts(Cs,ONF,Mid,ArgsNF,NF, Defined).

checkpreposts([C|Cs],ONF,Mid,ArgsNF,NF, defined) :-

msgtype(C,Mid),

pre(C,ONF,Mid,ArgsNF),

(

sol(C,ONF,Mid,ArgsNF,NF), !

;

post(C,ONF,Mid,ArgsNF,NF)

),

checkpreposts(Cs,ONF,Mid,ArgsNF,NF,_Defined).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% project(NF,Case,Fields) :- Fields is the projection of NF

% wrt case class Case

project([(Class,State,Fields)|_],State,Fields) :-

nonvar(Class).

project([_|Rest],State,Fields) :-

nonvar(Rest),

project(Rest,State,Fields).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% eq(C,NF1,NF2) :- NF1 and NF2 are equal up to class C.

eq(C,NF1,NF2) :-

eq_upto(C,NF1,NF2).

eq_upto(_,X,Y) :-

var(X),

var(Y),

!.

eq_upto(C,[A|Sub1],[B|Sub2]) :-

% nonvar(A),


% nonvar(B),

eqs(C,A,B),

(A = (C,_,_) ->

true

;

eq_upto(C,Sub1,Sub2)).

eqs(_C,A,B) :-

A = B.

% eqs(C,(C,(C,[])),(C,(C,[]))).

% eqs(C,(C,(S,AFs)),(C,(S,BFs))) :-

% AFs = BFs. % TODO: unification cannot be used,

% % eq_upto needed with static information.

% eqs(clay_lang_Int,

% (clay_lang_Int,(clay_lang_Int,[I1])),

% (clay_lang_Int,(C,[I2]))) :-

% I1 #= I2.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Some auxiliary functions

writenf(C) :-

class_limited(C),

!,

write(C).

writenf([(C,(C,[]))|Sub]) :-

writenf(Sub).

writenf([(C,(S,Fs))|Sub]) :-

C \= S,

write(S),

write('{'),

writefields(Fs),

write('}'),

(var(Sub) -> true; write(' < '), writenf(Sub)).

writenf([(clay_lang_Int,(clay_lang_Int,[I]))|_]) :-

!,

write(I).

writefields([]).

writefields([(FN,F)|Rest]) :-

write(FN),

write(' : '),

writenf(F),

(Rest = [] -> true; write('; '), writefields(Rest)).

nfsize(C,0) :-

class_limited(C),

!.

nfsize([],0) :-

!.

nfsize([(C,(C,[]))|Sub],Size) :-



!,

nfsize(Sub,Size).

nfsize([(C,(S,Fs))|Sub],Size) :-

C \= S,

fieldssize(Fs,SizeFs),

nfsize(Sub,SizeS),

Size is SizeFs + SizeS.

nfsize([(clay_lang_Int,(clay_lang_Int,[_I]))|_],1).

fieldssize([],0) :-

!.

fieldssize([(_FN,F)|Rest],Size) :-

nfsize(F,SizeS),

fieldssize(Rest,SizeFs),

Size is SizeS + SizeFs + 1.


C Mathematical Conventions

Abstract

This appendix contains mathematical conventions and notation we have

followed in this thesis, mainly in Chapters 3, 4, and 5. The appendix has

been included in order to make the thesis as self-contained as possible.

It is assumed that the reader is familiar with the basic properties of sets.

For the definition of standard concepts we have followed [50, Chapter 1].

C.1 Sets

The notation {a,b,c} is used to allow the explicit construction of finite sets as an

enumeration of its (finitely many) elements (the set with the elements a, b and

c in the example). The notation ∅ is used as an alternative representation of the
empty set {}.


Standard predicates for checking membership (∈) and inclusion (⊂, ⊆):¹

x ∈ A iff x is an element of the set A

A ⊆ B iff x ∈ A implies x ∈ B

A ⊂ B iff x ∈ A implies x ∈ B and A ≠ B

Set comprehensions allow a set to be defined from a “characteristic property”.
The notation {x | P(x)} denotes the set of all x satisfying P(x), following the intuitive
notion of membership

x ∈ {x | P(x)} iff P(x)

Operations on sets are union (∪), intersection (∩), and difference (−):

A∪B = {x | x ∈ A or x ∈ B}

A∩B = {x | x ∈ A,x ∈ B}

A−B = {x | x ∈ A, x ∉ B}

The family union

$\bigcup_{i \in A} b(i)$

where b(i) represents a set that depends on i, is the set

{x | ∃i ∈ A (x ∈ b(i))}

Given a set A, the powerset of A, denoted by $2^A$, is the set of sets

{X | ∀x ∈ X (x ∈ A)}
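For finite sets, the family union and the powerset can be sketched in Python as follows. The function names `family_union` and `powerset` are hypothetical, the family union is read as the set of elements belonging to some b(i), and subsets are represented as `frozenset` values so that they can themselves be members of a set.

```python
from itertools import combinations


def family_union(index_set, b):
    """Union of the sets b(i) for every i in index_set."""
    result = set()
    for i in index_set:
        result |= set(b(i))
    return result


def powerset(a):
    """Set of all subsets of the finite set a, each represented as a frozenset."""
    elems = list(a)
    return {
        frozenset(c)
        for k in range(len(elems) + 1)
        for c in combinations(elems, k)
    }
```

A finite set with n elements has 2^n subsets, which is consistent with the notation used for the powerset.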

¹ iff = if and only if


C.2 Relations

Given two sets A and B, their Cartesian product denoted by A×B is the set of

ordered pairs

{⟨a,b⟩ | a ∈ A,b ∈ B}

Given any finite number of sets $A_1, \ldots, A_n$, the Cartesian product $A_1 \times \cdots \times A_n$
is the set of ordered n-tuples

$\{\langle a_1, \ldots, a_n \rangle \mid a_i \in A_i,\ 1 \le i \le n\}$

The notation $A^2$ is used as an alternative to A×A.

A binary relation (or relation) between A and B is any subset R of A×B. Given

a relation R between A and B, the set

{x ∈ A | ∃y ∈ B (⟨x,y⟩ ∈ R)}

is called the domain of R and denoted by dom R. The set

{y ∈ B | ∃x ∈ A (⟨x,y⟩ ∈ R)}

is called the range of R and is denoted by range R.

A relation between A and A is called a relation on (or over) A.

A relation R on A is reflexive iff {⟨x,x⟩ | x ∈ A} ⊆ R. A relation R on A is transitive iff

⟨x,y⟩ ∈ R and ⟨y,z⟩ ∈ R implies ⟨x,z⟩ ∈ R

A relation R on A is symmetric iff

⟨x,y⟩ ∈ R implies ⟨y,x⟩ ∈ R

A relation R on A is antisymmetric iff

⟨x,y⟩ ∈ R and x ≠ y implies ⟨y,x⟩ ∉ R.

The transitive closure of a binary relation R on a set A, denoted by $R^+$, is the
minimal transitive relation on A that contains R. Intuitively, $R^+$ can be
constructed step by step:

$R_0 = R$

$R_i = R_{i-1} \cup \{\langle x,z\rangle \mid \exists y\, (\{\langle x,y\rangle, \langle y,z\rangle\} \subseteq R_{i-1})\}$

and $R^+$ is all the $R_i$ together:

$R^+ = \bigcup_{i \in \mathbb{N}} R_i$

The transitive and reflexive closure of a binary relation R on a set A, denoted
by $R^*$, is the transitive and reflexive relation $R^+ \cup \{\langle x,x\rangle \mid x \in A\}$.

The notation xRy is used as an alternate to ⟨x,y⟩ ∈ R.
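For a finite relation, the step-by-step construction of the transitive closure can be sketched in Python. Relations are represented as sets of pairs; the names `transitive_closure` and `reflexive_transitive_closure` are hypothetical, and the reflexive part is taken to be the set of pairs ⟨x,x⟩ for x in A.

```python
def transitive_closure(r):
    """Smallest transitive relation containing r (r is a finite set of pairs)."""
    closure = set(r)
    while True:
        # One construction step: add <x,z> whenever <x,y> and <y,z> are present.
        step = {(x, z) for (x, y) in closure for (y2, z) in closure if y == y2}
        if step <= closure:
            return closure
        closure |= step


def reflexive_transitive_closure(r, a):
    """Transitive closure of r plus the pair <x,x> for every x in the carrier a."""
    return transitive_closure(r) | {(x, x) for x in a}
```

The loop stops as soon as a step adds no new pair, which for a finite relation always happens after finitely many steps.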

C.3 Functions

A relation R between two sets A and B is functional iff, for all x ∈ A and y,z ∈ B,
⟨x,y⟩ ∈ R and ⟨x,z⟩ ∈ R imply that y = z.

A partial function is a triple ⟨A, f ,B⟩, where A and B are arbitrary sets and f is

a functional relation between A and B.

The notation f : A ↦ B is used as an alternative to denote the partial function
⟨A, f ,B⟩. The partial function ⟨A, f ,B⟩ and f are usually identified.

For every element x in the domain of a partial function f : A ↦ B, the function

application of f to x denoted by f x (the juxtaposition of f and x) is the unique

element y in the range of f such that ⟨x,y⟩ ∈ f .

When the mere juxtaposition of expressions representing a function (e1) and
a value (e2) is ambiguous, we use parentheses: e1 (e2), (e1) e2, and (e1) (e2).
Without parentheses, the juxtaposition must be interpreted as left associative:

e1 e2 e3 = (e1 e2) e3

A partial function f : A ↦ B is a total function iff dom f = A. It is customary

to call a total function simply a function. The notation f : A → B is used as an

alternative to denote that f is a total function.

A function f : A → B is injective iff, for all x,y ∈ A, f x = f y implies that x = y.

A function f : A → B is surjective iff, for all y ∈ B, there is some x ∈ A such that


f x = y, i.e. the range of f is the set B.

A function is bijective iff it is both injective and surjective.
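The definitions of this section can be checked mechanically on finite relations. The following Python sketch represents a relation as a set of pairs; all function names are hypothetical, and `is_injective` and `is_surjective` assume that their argument is already functional.

```python
def is_functional(r):
    """No element is related to two different values."""
    image = {}
    for x, y in r:
        # setdefault records the first value seen for x and returns it afterwards.
        if image.setdefault(x, y) != y:
            return False
    return True


def is_total(f, a):
    """f is functional and its domain is the whole set a."""
    return is_functional(f) and {x for x, _ in f} == set(a)


def is_injective(f):
    """f x = f y implies x = y, i.e. the inverse relation is functional."""
    return is_functional({(y, x) for x, y in f})


def is_surjective(f, b):
    """The range of f is the whole set b."""
    return {y for _, y in f} == set(b)


def is_bijective(f, a, b):
    return is_total(f, a) and is_injective(f) and is_surjective(f, b)
```

For instance, under this representation `{(1, 'a'), (2, 'b')}` is a bijection between `{1, 2}` and `{'a', 'b'}`, whereas `{(1, 'a'), (1, 'b')}` is not even functional.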

C.4 Composition

Given two binary relations R between A and B, and S between B and C, their com-

position denoted by R◦S is a relation between A and C defined by the following

set of ordered pairs

{⟨a,c⟩ | ∃b ∈ B (⟨a,b⟩ ∈ R and ⟨b,c⟩ ∈ S)}

Given two partial or total functions f : A ↦ B and g : B ↦ C, their composition
denoted by f ◦ g is a partial or total function. According to our notation,
(f ◦ g) x = g (f x), that is, f is applied first. Composition is associative.
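A minimal Python sketch of composition on finite relations, following the set of ordered pairs given above (the name `compose` is hypothetical). Note the order: the first argument is applied first.

```python
def compose(r, s):
    """R∘S: the pairs <a,c> such that <a,b> is in r and <b,c> is in s for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# f maps 1 to 2 and 2 to 3; g maps 2 to 'x' and 3 to 'y'.
f = {(1, 2), (2, 3)}
g = {(2, "x"), (3, "y")}
fg = compose(f, g)  # applying f first and then g
```

Here `fg` relates 1 to `'x'` and 2 to `'y'`, which matches (f ◦ g) x = g (f x).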

C.5 Projections

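Given a Cartesian product $A_1 \times \cdots \times A_n$, the projections $\pi_i$, i ∈ 1..n, are the functions

$\pi_i = \{\langle\langle x_1, \ldots, x_i, \ldots, x_n\rangle, x_i\rangle \mid \langle x_1, \ldots, x_i, \ldots, x_n\rangle \in A_1 \times \cdots \times A_n\}$.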

C.6 Natural Numbers

The set of natural numbers (or nonnegative integers) is denoted by N and is the

set {0,1,2,3, . . .}. The set of positive integers is denoted by N+.

Given two natural numbers n and m the enumeration from n to m denoted

by n..m is the set

{i ∈ N | n ≤ i, i ≤ m}

C.7 Sequences and Strings

Given two sets I and X, an I-indexed sequence (or sequence) is any function
A : I → X, usually denoted by $(A_i)_{i \in I}$. The set I is called the index set. If X is a set of
sets, $(A_i)_{i \in I}$ is called a family of sets.

A set A is finite iff there is a bijection h : 1..n → A for some natural number n.
The natural number n is called the cardinality of the set A, which is also denoted
by |A|. When I is the set N of natural numbers, a sequence $(A_i)_{i \in I}$ is called a
countable sequence, and when I is some set 1..n with n ∈ N, $(A_i)_{i \in I}$ is a finite
sequence.

Given any set A (even infinite), a string over A is any finite sequence u : 1..n → A,
where n is a natural number. It is customary to call the set A an alphabet.

Given a string u : 1..n → A, the natural number n is called the length of u and

is denoted by | u |. For n = 0, we have the string corresponding to the unique

function from the empty set to A, called the empty string, and denoted by $\varepsilon_A$, or

for simplicity by ε when the set A is understood.

Given any set A (even infinite), the set of all strings over A is denoted by A∗

and the set of all nonempty strings over A is denoted by A+.

If u : 1..n → A is a string and n > 0, for every i ∈ 1..n, u (i) is some element of A

also denoted by $u_i$, and the string u is also denoted by $u_1 \ldots u_n$.

Strings can be concatenated as follows. Given any two strings u : 1..m → A

and v : 1..n → A, their concatenation denoted by uv is the string w : 1..(m+n) → A

such that

$w_i = \begin{cases} u_i & \text{if } 1 \le i \le m \\ v_{i-m} & \text{if } m+1 \le i \le m+n \end{cases}$
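The case definition of concatenation can be sketched in Python with strings represented as tuples. The name `concat` is hypothetical; the only subtlety is that the definition indexes from 1 while Python tuples index from 0.

```python
def concat(u, v):
    """Concatenate the strings u and v (tuples) using the 1-based case definition."""
    m, n = len(u), len(v)

    def w(i):
        # u_i if 1 <= i <= m, and v_(i-m) if m+1 <= i <= m+n.
        return u[i - 1] if 1 <= i <= m else v[i - m - 1]

    return tuple(w(i) for i in range(1, m + n + 1))
```

With this representation the empty tuple plays the role of the empty string ε, which is neutral for concatenation.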

The notation [a,b,c] is used to allow the explicit construction of strings as

an enumeration of the elements. In the example, [a,b,c] represents the string

u : 1..3 → {a,b,c} where $u_1 = a$, $u_2 = b$, and $u_3 = c$. The notation [] is used as an
alternative representation of the empty string ε.

Given a string u, a string v is a prefix (or head) of u if there is a string w such

that u = vw. A string v is a suffix (or tail) of u if there is a string w such that u = wv.

A string v is a substring of u if there are strings x and y such that u = xvy. A prefix

v (suffix, substring) of a string u is proper if v ≠ u.


C.8 Ellipsis

We will use ellipses in different forms.

For strings, the notation e . . . e′ (or [e, . . . , e′]), where e ∈ A is an expression
in which index 1 is used and e′ ∈ A is the result of substituting 1 by a given natural
number n in e, denotes a string r : 1..n → A where each $r_i$ is the result of
substituting 1 by i in e.

When a separator symbol ⊗ is used, the notation e ⊗ . . . ⊗ e′ (a string over A∪{⊗}),
where e ∈ A is an expression in which index 1 is used and e′ ∈ A is the result of
substituting 1 by a given natural number n in e, denotes a string r : 1..(2n−1) → A∪{⊗}
where each $r_i$ is the result of substituting 1 by (i+1)/2 in e if i is odd, and the symbol ⊗ if
i is even.
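The two ellipsis conventions for strings can be illustrated with a small Python sketch. The expression e is modelled as a function of the index that replaces 1; the names `expand` and `expand_with_separator` are hypothetical, and the sketch assumes that the odd position i of the separated string carries the instance of e with index (i+1)/2.

```python
def expand(e, n):
    """e ... e': the string whose i-th element is e with index 1 replaced by i."""
    return [e(i) for i in range(1, n + 1)]


def expand_with_separator(e, n, sep):
    """e sep ... sep e': a string of length 2n-1 (empty when n is 0)."""
    length = max(2 * n - 1, 0)
    return [e((i + 1) // 2) if i % 2 == 1 else sep for i in range(1, length + 1)]
```

For example, with `e = lambda i: f"x{i}"` and n = 3, the first function produces the elements x1, x2, x3 and the second interleaves them with the separator.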

For sets, the notation {e, . . .e′}, where e ∈ A is an expression where index 1 is

used and e′ ∈ A is the result of substituting 1 by a given natural number n in e,

denotes the set

{e[i\1] | i ∈ 1..n}

where the expression e[i\1] represents e after substituting 1 by a given natural
number i. Note that this should be the same as e′[i\n].


Bibliography

[1] Martín Abadi and Luca Cardelli. A Theory of Objects. Springer-Verlag, 1996.

[2] Jean-Raymond Abrial. The B-Book. Cambridge University Press, 1996.

[3] Jean-Raymond Abrial. Discrete system models. Internal Notes (http://

www-lsr.imag.fr/B), February 2002.

[4] Jean-Raymond Abrial. Faultless systems: Yes we can! Computer, 42:30–36,

September 2009.

[5] P. S. C. Alencar, D. D. Cowan, and C. J. P. Lucena. A Formal Approach to

Architectural Design Patterns. In M.C. Gaudel and J. Woodcock, editors,

FME’96: Industrial Benefit and Advances in Formal Methods, LNCS, pages

576–594. Springer Verlag, 1996.

[6] Andrew W. Appel. Modern Compiler Implementation in ML. Cambridge

University Press, 1998.

[7] Egidio Astesiano, Michel Bidoit, Hélène Kirchner, Bernd Krieg-Brückner,

Peter D. Mosses, Donald Sannella, and Andrzej Tarlecki. CASL: The Com-

mon Algebraic Specification Language. Theoretical Computer Science,

286(2):153–196, 2002.

[8] Jeffrey Van Baalen and Richard E. Fikes. The role of reversible grammars in

translating between representation languages. In Proceedings of the 4th In-

ternational Conference on Principles of Knowledge Representation and Rea-

soning, pages 562–571, San Francisco, CA, USA, 1994. Morgan Kaufmann

Publishers Inc.

[9] Milica Barjaktarovic and WetStone Technologies, Inc. The state-of-the-art

in formal methods.


http://www.cs.utexas.edu/users/csed/formal-methods/docs/StateFM.pdf,

January 1998. For Michael Nassiff, Rome Research Site, AFRL/IFGB, 525

Brooks Rd. Rome, NY 13441-4505.

[10] Mike Barnett, K. Rustan M. Leino, and Wolfram Schulte. The Spec# programming
system: An overview. In Proceedings of CASSIS 2004: Construction
and Analysis of Safe, Secure and Interoperable Smart devices. Springer,

2004.

[11] Kent Beck. Extreme Programming Explained: Embrace Change. Addison-

Wesley, Pearson Education, 2000. ISBN 201-61641-6.

[12] P. Behm, P. Benoit, and J.M. Meynadier. Meteor: A Successful Application

of B in a Large Project. In FM 99 — World Conference on Formal Methods

in the Development of Computing Systems, number 1708 in LNCS, pages

369–387. Springer Verlag, 1999.

[13] Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program

Development. Coq’Art: The Calculus of Inductive Constructions. Texts in

Theoretical Computer Science. Springer Verlag, 2004.

[14] Dines Bjørner. Software Engineering 3. Domains, Requirements, and Soft-

ware Design. Springer, 2006.

[15] Paul Boca, Jonathan Bowen, and Jawed Siddiqi, editors. Formal Methods:

State of the Art and New Directions. Springer, 2010.

[16] Daniel Bonniot. Using kinds to type partially polymorphic multi-methods.

In Workshop on Types in Programming (TIP’02), Dagstuhl, Germany, July

2002.

[17] Grady Booch, James Rumbaugh, and Ivar Jacobson. The Unified Model-

ing Language user guide. Addison Wesley Longman Publishing Co., Inc.,

Redwood City, CA, USA, 1999.

[18] Jonathan P. Bowen and Michael G. Hinchey, editors. Applications of Formal

Methods. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1995.

[19] Jonathan P. Bowen and Michael G. Hinchey. Ten commandments of formal

methods. . . ten years later. Computer, 39(1):40–48, 2006.


[20] J.P. Bowen and M.G. Hinchey. The use of industrial-strength formal meth-

ods. Computer Software and Applications Conference, Annual Interna-

tional, 0:332, 1997.

[21] Timothy Budd. An Introduction to Object Oriented Programming.

Addison-Wesley, second edition, 1998.

[22] James L. Caldwell. Extracting general recursive program schemes in

Nuprl’s type theory. In LOPSTR ’01: Selected papers from the 11th Inter-

national Workshop on Logic Based Program Synthesis and Transformation,

pages 233–244, London, UK, 2001. Springer-Verlag.

[23] Luca Cardelli and Peter Wegner. On understanding types, data abstraction

and polymorphism. Computing Surveys, 17(4):471–522, December 1985.

[24] Manuel Carro, Julio Mariño, Ángel Herranz, and Juan José Moreno Nava-

rro. Teaching how to derive correct concurrent programs (from state-based

specifications and code patterns). In C.N. Dean and R.T. Boute, editors,

Teaching Formal Methods, CoLogNET/FME Symposium, TFM 2004, Ghent,

Belgium, volume 3294 of LNCS, pages 85–106. Springer, 2004. ISBN 3-540-

23611-2.

[25] Giuseppe Castagna. Object-Oriented Programming. A Unified Foundation.

Birkhäuser, 1996.

[26] Giuseppe Castagna. Object-Oriented Programming. A Unified Founda-

tion, chapter Covariance and contravariance: conflict without a cause.

Birkhäuser, 1996.

[27] M. Cerioli, A. Haxthausen, B. Krieg-Brückner, and T. Mossakowski. Permis-

sive subsorted partial logic in CASL. In Algebraic Methodology and Software

Technology (AMAST’97), number 1349 in LNCS, pages 91–107. Springer,

1997.

[28] Iliano Cervesato. Logical frameworks - why not just classical logic? In

Martina Faller, Stefan Kaufman, and Marc Pauly, editors, Proceedings of the

Seventh CSLI Workshop on Logic, Language and Computation. CSLI Publi-

cations, 1999.

[29] Emanuele Ciapessoni, Piergiorgio Mirandola, Alberto Coen-Porisini, Dino

Mandrioli, and Angelo Morzenti. From formal models to formally based

methods: an industrial experience. ACM Trans. Softw. Eng. Methodol.,

8:79–113, January 1999.

[30] Mel Ó Cinnéide. Automated Application of Design Patterns: a Refactoring

Approach. PhD thesis, University of Dublin, Trinity College, 2001.

[31] Koen Claessen and John Hughes. QuickCheck: a lightweight tool for random

testing of Haskell programs. In ICFP, pages 268–279, 2000.

[32] Manuel Clavel, Francisco Durán, Steven Eker, Patrick Lincoln, Narciso

Martí-Oliet, José Meseguer, and José F. Quesada. Using Maude. In Tom

Maibaum, editor, Fundamental Approaches to Software Engineering, Third

International Conference, FASE 2000, Held as Part of ETAPS 2000, Berlin,

Germany, March/April 2000, Proceedings, volume 1783 of Lecture Notes in

Computer Science, pages 371–374. Springer-Verlag, 2000.

[33] Manuel Clavel, Francisco Durán, Steven Eker, Patrick Lincoln, Narciso

Martí-Oliet, José Meseguer, and José Quesada. A Maude Tutorial. CSL,

SRI International, March 2000.

[34] Manuel Clavel and Marina Egea. ITP/OCL: A rewriting-based validation

tool for UML+OCL static class diagrams. In AMAST, pages 368–373, 2006.

[35] S. Conrad, M. Gogolla, and R. Herzig. Troll-light: A core language for

specifying objects. Technical Report 92-02, Technische Universität Braun-

schweig, Informatik, Abt. Datenbanken, Postfach 3329, W-3300 Braun-

schweig, Germany, December 1992.

[36] Steve Cook and John Daniels. Designing object systems: object-oriented

modelling with Syntropy. Prentice-Hall, Inc., Upper Saddle River, NJ, USA,

1994.

[37] David Crocker. Safe object-oriented software: The verified design-by-

contract paradigm. In F. Redmill and T Anderson, editors, Proceedings of

12th Safety-Critical Systems Symposium, Birmingham, UK, February 2004.

Springer Verlag. Chapter 2.

[38] A. Diller. Z: An Introduction to Formal Methods. John Wiley & Sons, 2nd

edition, 1994.

[39] E. Dubois, P. Du Bois, and M. Petit. Object Oriented Requirement Analy-

sis: an Agent Perspective. In Proceedings of the ECOOP’93, pages 458–481,

1993.

[40] R. Duke, P. King, G. Rose, and G. Smith. The Object-Z specification lan-

guage. In T. Korson, V. Vaishnavi, and B. Meyer, editors, Technology of

Object-Oriented Languages and Systems TOOLS 5, pages 465–483, 1994.

[41] A. H. Eden, A. Yehudai, and J. Gil. Precise Specification and Automatic Ap-

plication of Design Pattern. In Proc. 12th Annual Conference on Automated

Software Engineering, 1997.

[42] A. H. Eden, A. Yehudai, and J. Gil. LePUs - a Declarative Pattern Specifi-

cation Language. Technical Report 326/98, Department of Computer Sci-

ence, Tel Aviv University, Israel, 1998.

[43] Marina Egea. An executable formal semantics for OCL with Applications to

Formal Analysis and Validation. PhD thesis, Universidad Complutense de

Madrid, 2008.

[44] H. Ehrig, F. Orejas, and J. Padberg. Relevance, integration and classifica-

tion of specification formalism and formal specification techniques. In

Proc. FORMS, Formale Techniken für die Eisenbahnsicherung, Fortschritt-

Berichte VDI, Reihe 12, Nr. 436, VDI Verlag, 2000, pages 31 – 54, 1999.

[45] Herbert Enderton. A Mathematical Introduction to Logic. New York: Aca-

demic Press, 1972.

[46] John Fitzgerald, Peter Gorm Larsen, Paul Mukherjee, Nico Plat, and Marcel

Verhoef. Validated Designs For Object-oriented Systems. Springer-Verlag

TELOS, Santa Clara, CA, USA, 2005.

[47] M. Fowler, K. Beck, J. Brant, and W.F. Opdyke. Refactoring: Improving the

Design of Existing Code. Addison-Wesley, 1999.

[48] Martin Fowler. UML Distilled: A Brief Guide to the Standard Object Model-

ing Language. Addison-Wesley Longman Publishing Co., Inc., Boston, MA,

USA, 3rd edition, 2003.

[49] N. E. Fuchs. Specifications are (preferably) executable. Software Engineer-

ing Journal, September 1992.

[50] Jean H. Gallier. Logic for Computer Science. Foundations of Automatic Theorem

Proving. John Wiley and Sons, Inc., 1987.

[51] Francisco José Galán Morillo. Formalizaciones para Sintetizar Software

Orientado a Objetos. PhD thesis, Universidad de Sevilla, October 2000.

[52] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns - Elements

of Reusable Object Oriented Software. Addison-Wesley, 1995.

[53] Joseph A. Goguen, Timothy Winkler, Jose Meseguer, Kokichi Futatsugi, and

Jean-Pierre Jouannaud. Introducing OBJ. Technical report, Oxford + SRI,

October 1993.

[54] A. Le Guennec, G. Sunyé, and J.M. Jezequel. Precise modeling of design patterns.

In Third International Conference on the Unified Modeling Language

(UML2000). University of York, 2000.

[55] John V. Guttag and James J. Horning. Larch: languages and tools for formal

specification. Springer-Verlag New York, Inc., New York, NY, USA, 1993.

[56] Anthony Hall. Seven myths of formal methods. IEEE Softw., 7(5):11–19,

1990.

[57] Anthony Hall. Correctness by construction: Integrating formality into a

commercial development process. In Lars-Henrik Eriksson and Peter Lindsay,

editors, FME 2002: Formal Methods—Getting IT Right, volume 2391 of

Lecture Notes in Computer Science, pages 139–157. Springer Berlin / Hei-

delberg, 2002.

[58] I. J. Hayes and C. B. Jones. Specifications are not (necessarily) executable.

Technical Report 148, Key Center for Software Technology, Department of

Computer Science, The University of Queensland, St. Lucia 4072. Australia,

January 1990.

[59] A. Herranz and J. J. Moreno-Navarro. Generation of and debugging with

logical pre- and post-conditions. In M. Ducasse, editor, Automated and


[60] A. Herranz and J. J. Moreno-Navarro. On the role of functional-logic lan-




[61] A. Herranz and J. J. Moreno-Navarro. Towards automating the iterative

rapid prototyping process with the SLAM system. In V Spanish Conference


[62] A. Herranz and J. J. Moreno-Navarro. Design patterns as class operators.

Workshop on High Integrity Software Development at V Spanish Confer-

ence on Software Engineering, JISBD’01, November 2001.

[63] A. Herranz and J. J. Moreno-Navarro. On the design of an object-oriented



[64] A. Herranz and J. J. Moreno-Navarro. Specifying in the large: Object-


B. J. Krämer H. Ehrig and A. Ertas, editors, The Sixth Biennial World Con-

ference on Integrated Design and Process Technology (IDPT’02), volume 1,

Pasadena, California, June 2002. Society for Design and Process Science.

ISSN 1090-9389.

[65] A. Herranz and J.J. Moreno-Navarro. Formal extreme (and extremely for-

mal) programming. In Michele Marchesi and Giancarlo Succi, editors, 4th

International Conference on Extreme Programming and Agile Processes in

Software Engineering, XP 2003, number 2675 in LNCS, pages 88–96, Gen-

ova, Italy, May 2003.

[66] A. Herranz and J.J. Moreno-Navarro. Rapid prototyping and incremental

evolution using SLAM. In 14th IEEE International Workshop on Rapid System

Prototyping (RSP 2003), San Diego, California, USA, June 2003.

[67] A. Herranz, J.J. Moreno-Navarro, and N. Maya. Declarative reflection

and its application as a pattern language. In Marco Comini and Moreno

Falaschi, editors, 11th. International Workshop on Functional and Logic

Programming (WFLP’02), Grado, Italy, June 2002. University of Udine.

[68] Ángel Herranz and Julio Mariño. Executable specifications in an object ori-

ented formal notation. In 20th International Symposium on Logic-Based




[69] Ángel Herranz and Juan José Moreno-Navarro. Design Pattern Formaliza-

tion Techniques, chapter Modeling and Reasoning about Design Patterns

in SLAM-SL. IGI Publishing, March 2007. ISBN: 978-1-59904-219-0, ISBN:

978-1-59904-221-3.

[70] Ángel Herranz and Pablo Nogueira. More than parsing. In Francisco Javier

López Fraguas, editor, Spanish Conference on Programming and Lan-

guages (CEDI-PROLE’05), pages 193–202. Thomson Paraninfo, September

2005.

[71] Daniel Jackson. Software Abstractions: Logic, Language, and Analysis. The

MIT Press, 2006.

[72] Cliff B. Jones. Systematic Software Development Using VDM. Prentice Hall,

1986.

[73] S. Peyton Jones and J. Hughes. Report on the Programming Language

Haskell 98. A Non-strict Purely Functional Language, February 1999.

[74] H.B.M. Jonkers and J.H. Obbink. COLD: A common object-oriented lan-

guage for design. Technical report, Philips Research Laboratories, Eind-

hoven, 1983. Working document.

[75] R. Jungclaus, G. Saake, T. Hartmann, and C. Sernadas. Troll – a language for

object-oriented specification of information systems. ACM Transactions

on Information Systems, 14(2):175–211, April 1996.

[76] Joshua Kerievsky. Refactoring to Patterns. Addison-Wesley, 2004.

[77] K. Lano and S. Goldsack. Integrated formal and object-oriented methods:

The VDM++ approach. In Proceedings of Methods Integration Workshop.

Leeds Metropolitan University, (Supported by BCS FACS), March 1996.

[78] Donald E. Knuth. Literate programming. Technical report STAN-CS-83-

981, Stanford University, Department of Computer Science, 1983.

[79] C. P. J. Koymans and G. R. Renardel de Lavalette. The logic MPLω. In Al-

gebraic Methods: Theory, Tools and Applications (Part I), number 394 in

Lecture Notes in Computer Science. Springer Verlag, 1989.

[80] Leslie Lamport. How to write a proof. American Mathematical Monthly,

102(7):600–608, August-September 1995.

[81] Leslie Lamport. How to write a long formula (short communication). For-

mal Asp. Comput., 6(5):580–584, 1994.

[82] Gary T. Leavens, Albert L. Baker, and Clyde Ruby. JML: A notation for de-

tailed design. In Haim Kilov, Bernhard Rumpe, and Ian Simmonds, editors,

Behavioral Specifications of Businesses and Systems, pages 175–188. Kluwer

Academic Publishers, 1999.

[83] Gary T. Leavens, Albert L. Baker, and Clyde Ruby. Preliminary design of

JML: a behavioral interface specification language for Java. SIGSOFT Softw.

Eng. Notes, 31(3):1–38, 2006.

[84] T. Lecomte, T. Servat, and G. Pouzancre. Formal methods in safety-critical

railway systems. In Proceedings of Brazilian Symposium on Formal Methods:

SBMF 2007, pages 26–30, Ouro Preto, Brazil, August 2007.

[85] P. Letelier, P. Sánchez, I. Ramos, and O. Pastor. OASIS 3.0: Un enfoque

formal para el modelado conceptual orientado a objetos. SPUPV-98-4011,

1998.

[86] Michael Leuschel and Michael Butler. ProB: A model checker for B. In Kei-

jiro Araki, Stefania Gnesi, and Dino Mandrioli, editors, FME 2003: Formal

Methods, LNCS 2805, pages 855–874. Springer-Verlag, 2003.

[87] Michael Leuschel, Dominique Cansell, and Michael Butler. Validating and

animating higher-order recursive functions in B. In Jean-Raymond Abrial

and Uwe Glässer, editors, Festschrift for Egon Börger, 2007.

[88] Michael Leuschel, Jérôme Falampin, Fabian Fritz, and Daniel Plagge. Au-

tomated property verification for large scale B models. In Proceedings of

the 2nd World Congress on Formal Methods, FM ’09, pages 708–723, Berlin,

Heidelberg, 2009. Springer-Verlag.

[89] Barbara Liskov. Keynote address - data abstraction and hierarchy. In Ad-

dendum to the proceedings on Object-oriented programming systems, lan-

guages and applications (Addendum), OOPSLA ’87, pages 17–34, New York,

NY, USA, 1987. ACM.

[90] John W. Lloyd and Rodney W. Topor. Making Prolog more expressive. J.

Log. Program., 1(3):225–240, 1984.

[91] John Wylie Lloyd. Foundations of Logic Programming. Springer-Verlag New

York, Inc., Secaucus, NJ, USA, 1993.

[92] Julio Mariño, Juan José Moreno-Navarro, and Susana Muñoz Hernández.

Implementing constructive intensional negation. New Generation Com-

puting, 27(1):25–56, January 2009.

[93] Barry Mazur. When is one thing equal to some other thing? In Proof

and other dilemmas, MAA Spectrum, pages 221–241. Math. Assoc. Amer-

ica, Washington, DC, 2008.

[94] William McCune. OTTER 3.3 Reference Manual. Argonne National Labora-

tory, 2003. Technical Memorandum ANL/MCS-TM-263.

[95] William McCune. Prover9 and Mace4 Website. http://www.prover9.org,

2005-2010.

[96] Tom Mens, Tom Tourwé, and Francisca Muñoz. Beyond the refactoring

browser: Advanced tool support for software refactoring, 2003.

[97] Bertrand Meyer. Applying “design by contract”. Computer, 25(10):40–51,

1992.

[98] Bertrand Meyer. Eiffel: the language. Prentice-Hall, Inc., Upper Saddle

River, NJ, USA, 1992.

[99] T. Mikkonen. Formalizing Design Patterns. In Proc. ICSE’98, pages 115–

124. IEEE Computer Society Press, 1998.

[100] Till Mossakowski, Anne Elisabeth Haxthausen, Donald Sannella, and An-

drzej Tarlecki. CASL - the common algebraic specification language: Se-

mantics and proof theory. Computers and Artificial Intelligence, 22(3-

4):285–321, 2003.

[101] Susana Muñoz. A Negation System for Prolog. PhD thesis, Facultad de In-

formática, Universidad Politécnica de Madrid, 2003.

[102] Ulf Norell. Dependently typed programming in Agda. In TLDI ’09: Pro-

ceedings of the 4th international workshop on Types in language design and

implementation, pages 1–2, New York, NY, USA, 2009. ACM.

[103] Martin Odersky, Philippe Altherr, Vincent Cremet, Iulian Dragos, Gilles

Dubochet, Burak Emir, Sean McDirmid, Stéphane Micheloud, Nikolay Mi-

haylov, Michel Schinz, Erik Stenman, Lex Spoon, and Matthias Zenger. An

overview of the Scala programming language. Technical Report LAMP-

REPORT-2006-001, École Polytechnique Fédérale de Lausanne, 1015 Lau-

sanne, Switzerland, 2006.

[104] Martin Odersky and Philip Wadler. Pizza into Java: Translating theory into

practice. In Proc. 24th ACM Symposium on Principles of Programming Lan-

guages, January 1997.

[105] Nicolas Oury and Wouter Swierstra. The power of Pi. SIGPLAN Not.,

43(9):39–50, 2008.

[106] Oscar Pastor López, Fiona Hayes, and Stephen Bear. OASIS: An Object-

Oriented Specification Language, volume 593 of Lecture Notes in Computer

Science, pages 348–363. Springer, Berlin, Heidelberg, January 1992.

[107] Michela Pedroni and Bertrand Meyer. The inverted curriculum in practice.

In SIGCSE, pages 481–485, 2006.

[108] Benjamin C. Pierce. Types and Programming Languages. MIT Press, 2002.

[109] Benjamin C. Pierce, editor. Advanced Topics in Types and Programming

Languages. MIT Press, 2005.

[110] Lucia Rapanotti and Adolfo Socorro. Introducing FOOPS. Technical Report

PRG-TR-28-92, Oxford University Computing Laboratory, 1992.

[111] Mark Richters. A Precise Approach to Validating UML Models and OCL Con-

straints. PhD thesis, Universität Bremen, 2002.

[112] J. Rothe, H. Tews, and B. Jacobs. The coalgebraic class specification lan-

guage CCSL. Journal of Universal Computer Science, 7(2):175–193, March

2001.

[113] James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, and

William Lorensen. Object-oriented modeling and design. Prentice-Hall,

Inc., Upper Saddle River, NJ, USA, 1991.

[114] C. Sernadas, P. Gouveia, and A. Sernadas. Oblog: Object oriented logic

based conceptual modelling. Technical report, Instituto Superior Tecnico,

Lisboa, 1992.

[115] Graeme Smith. The Object-Z specification language. Kluwer Academic

Publishers, Norwell, MA, USA, 2000.

[116] Graeme Smith. Reasoning about Object-Z specifications. In Asia-Pacific

Software Engineering Conference (APSEC ’95). IEEE Computer Society

Press, 1995.

[117] A.E. Kelley Sobel and M.R. Clarkson. Formal methods application: An em-

pirical tale of software development. IEEE Transactions on Software Engi-

neering, 28:308–320, 2002.

[118] Ian Sommerville. Software Engineering. Pearson Education, 8th edition,

June 2006. ISBN13: 9780321313799, ISBN10: 0321313798.

[119] J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall Interna-

tional Series in Computer Science, 2nd edition, 1992.

[120] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley,

2004. ISBN 0-201-88954-4 and 0-201-70073-5.

[121] T. Taibi and D. C. L. Ngo. Formal specification of design patterns - a balanced

approach. Journal of Object Technology, 2(4):127–140, July 2003.

[122] L. Tokuda. Evolving Object-Oriented Designs with Refactorings. PhD thesis,

University of Texas, December 1999.

[123] L. Tokuda and D. Batory. Automated software evolution via design pat-

tern transformations. In 3rd International Symposium on Applied Corpo-

rate Computing, Monterrey, Mexico, October 1995.

[124] Ambrosio Toval, José Sáez, and Francisco Maestre. Automated property

verification in UML models. In Michael Leuschel, Stefan Gruner, and

Stéphane Lo Presti, editors, Proceedings of the 3rd Automated Verification

of Critical Systems (AVoCS’03), Southampton (GB), 2003.

[125] Jos Warmer and Anneke Kleppe. The object constraint language: pre-

cise modeling with UML. Addison-Wesley Longman Publishing Co., Inc.,

Boston, MA, USA, 1999.

[126] Jim Woodcock, Peter Gorm Larsen, Juan Bicarregui, and John Fitzgerald.

Formal methods: Practice and experience. ACM Comput. Surv., 41:19:1–

19:36, October 2009.

[127] J. B. Wordsworth. Software Development with Z. Addison-Wesley, 1992.

[128] Alloy. http://alloy.mit.edu/.

[129] Atelier B. http://www.atelierb.eu/.

[130] B-Core (UK) Limited. http://www.b-core.com/.

[131] ClearSy System Engineering. http://www.clearsy.com/.

[132] CSK Systems Corp. http://www.csk.com/systems/.

[133] Eiffel software. http://www.eiffel.com/.

[134] Escher Technologies Limited. http://www.eschertech.com/.

[135] Formal and precise software pattern representation languages FAQ.

http://www.cs.concordia.ca/%7efaculty/eden/precise_and_formal/faq.htm.

[136] The Haskell programming language. http://www.haskell.org/.

[137] Larch. http://www.sds.lcs.mit.edu/Larch/.

[138] The Nice programming language. http://nice.sourceforge.net/.

[139] Perfect Developer. http://www.eschertech.com/.

[140] PITAC - Interim Report to the President.

http://www.ccic.gov/ac/interim/, August 1998.

[141] Praxis High Integrity Systems Limited. http://www.praxis-his.com/.

[142] The ProB animator and model checker.

http://www.stups.uni-duesseldorf.de/ProB/.

[143] The Scala programming language. http://www.scala-lang.org/.

[144] SLAM Project. http://research.microsoft.com/en-us/projects/slam/.

[145] The SLAM Project. http://babel.ls.fi.upm.es/slam/.

[146] SPARKAda. http://www.praxis-his.com/sparkada/.

[147] Spec#. http://research.microsoft.com/SpecSharp/.

[148] SWI-Prolog. http://www.swi-prolog.org/.

[149] VDM portal. http://www.vdmportal.org/.

[150] VDM Tools. http://www.vdmtools.jp/.
