Domain-specific Languages for Cellular Interactions

Bill Harrison, Department of Computer Science, University of Missouri at Columbia

This work partially supported by: NIH1 R0l GM62920-04A1, NIH1 P20 GM065762-01A1, the Georgia Research Alliance and the Georgia Cancer Coalition.

Page 1: Domain-specific Languages for Cellular Interactions

Domain-specific Languages for Cellular Interactions

Bill Harrison
Department of Computer Science
University of Missouri at Columbia

This work partially supported by: NIH1 R0l GM62920-04A1, NIH1 P20 GM065762-01A1, the Georgia Research Alliance and the Georgia Cancer Coalition.

Page 3: Domain-specific Languages for Cellular Interactions

Ph.D. 2001, UIUC
  Thesis: Modular Compilers and Their Correctness Proofs
  Thesis Advisor: Sam Kamin
Post-doc, Oregon Graduate Inst. (OGI)
  Three years on the Programatica Project, using the Haskell programming language as a basis for formal methods
Assistant Professor, University of Missouri-Columbia, since Fall 2003

Page 4: Domain-specific Languages for Cellular Interactions

Systems Biology asks…
  Can static biological structure be related to dynamic biological behavior with mathematical clarity, precision, & rigor?
  Can biological systems be viewed as the “sum of their parts”?
    Can component-level models be integrated into precise system-level models of biological behavior?
  What techniques from Mathematics and Computer Science apply to this composition problem?

Page 5: Domain-specific Languages for Cellular Interactions

Rhodobacter Sphaeroides
  Photosynthetic bacterium; seeks out regions of greater light
  Roughly the size of the wavelength of light, so it cannot sense local light differences directly
  Applies a random walk

Page 6: Domain-specific Languages for Cellular Interactions

Simulations of Biological Systems

Simulations provide qualitative feedback, but are not models per se
  How accurate/faithful is a simulation?
  What does the feedback mean?
  Can one reason about the biological phenomenon based on the simulation?
  Can you identify the biology by inspecting the text of the simulation program?

Page 7: Domain-specific Languages for Cellular Interactions

R. Sphaeroides in C++

Contains 1000 LOC
To understand requires expertise in C++
  …and biological model
  …and critical system details, e.g., how is concurrency implemented?

    bool global_state::register_state(void *apointer)
    {
        // Grow the table of registered states by 1000 slots whenever it is full.
        if (number_of_states == mother_of_all_states.size())
            mother_of_all_states.resize(number_of_states + 1000);
        mother_of_all_states[number_of_states++] = apointer;
        return true;
    }

Page 8: Domain-specific Languages for Cellular Interactions

R. Sphaeroides in C++
Program structure does not reflect biological model
  Can you look at the source code and recognize the underlying biology?
Difficult to comprehend
  …and write correctly
  …and modify
  …and maintain
  …and re-use


Page 9: Domain-specific Languages for Cellular Interactions

Systems Biology as Programming Language Design

The Problem: General-purpose programming languages do not have the “right vocabulary”
  Biological model: Concurrent Markov chains
  C++: classes, pointers, etc.
  …nor are they mathematics
Our Solution: Design small, special-purpose languages with exactly the right vocabulary
  Each is called a Domain-specific Language (DSL) [Sheard99, Thiemann01, Leijen01]
  The mathematical semantics of a DSL gives a formal model of the biology

Page 10: Domain-specific Languages for Cellular Interactions

Language Model of R. Sphaeroides

Executing: cell1 || … || celln

Produces animation: [Figure: frame of the resulting animation]

Page 11: Domain-specific Languages for Cellular Interactions

Outline
  Language Design and Domain-specific Languages
    design, definition, and implementation
  Systems Biology as Language Design
    Case Study for Rhodobacter Sphaeroides
      Design: what are the appropriate abstractions for R. Sphaeroides?
      Definition: how do we specify exactly what R. Sphaeroides programs mean?
      Implementation: how do we run R. Sphaeroides programs?
  Conclusions

Page 12: Domain-specific Languages for Cellular Interactions

Cardinal Rule of Language Design

Application programmers should choose languages with abstractions most suited to their task; language designers must provide languages with those abstractions…

Domain                     Central Activities       Reasonable Language
System Programming         “bit-fiddling”           C
Artificial Intelligence    List processing          LISP
System Admin.              Text processing, etc.    PERL

Page 13: Domain-specific Languages for Cellular Interactions

Ex: “Parsec” Parser DSL

DSLs are small languages w/ “domain abstractions”

BNF for language:

    <Stmt> ::= <ident> := <Expr>

translates directly to Parsec code:

    assignStmt :: Parser Stmt
    assignStmt = do { id <- ident
                    ; symbol ":="
                    ; s <- expr
                    ; return (Assign id s) }

Page 14: Domain-specific Languages for Cellular Interactions

“Why a language and not a library?”

The Slogan: “What is excluded from a DSL is as important as what is included in it”

Libraries in a general-purpose language still require considerable expertise & self-discipline on the part of the programmer
Lack of generality in a DSL means fewer things to “go wrong”
A DSL may have desirable properties that a general-purpose language will not
  e.g., implementation techniques specialized to the DSL that do not apply to general-purpose languages
  small size makes rigorous specification tractable

Page 15: Domain-specific Languages for Cellular Interactions

DSL Design

DSL design for R. Sphaeroides: what are our domain abstractions?
  How does this organism behave?
  What modeling techniques are used by biologists to describe this behavior?

Page 16: Domain-specific Languages for Cellular Interactions

Bacterial Commands

adjustspeed, grow*, divide, tumble, die, laze

*Probability of growth varies with light concentration

Page 17: Domain-specific Languages for Cellular Interactions

Chapman-Kolmogorov Equation*

P_{i,j}: probability of transition from i to j

probability of being in state m

*Commonly used framework for modeling biological systems [Bremaud99, Dailey02, Mao02, Shah00]

Page 18: Domain-specific Languages for Cellular Interactions

Chapman-Kolmogorov Equation

A row in the above matrix encodes the transition function from state i of a Markov chain
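For reference, the standard discrete-time form of the Chapman-Kolmogorov equation is sketched below. The step counts n and m, the intermediate state k, and the distribution symbol \pi are assumptions chosen to match the P_{i,j} notation used on these slides; they are not taken from the original figure.

```latex
% n-step transition probabilities compose through every intermediate state k
\[
  P^{(n+m)}_{i,j} \;=\; \sum_{k} P^{(n)}_{i,k} \, P^{(m)}_{k,j}
\]

% one-step update of the state distribution; each row of P sums to 1
\[
  \pi_j(t+1) \;=\; \sum_{i} \pi_i(t) \, P_{i,j},
  \qquad \sum_{j} P_{i,j} = 1 \ \text{for every } i
\]
```

Row i of the matrix P lists the probabilities of every move out of state i, which is why a single row can be read as the transition function for that state.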

Page 19: Domain-specific Languages for Cellular Interactions

Bacteria as Markov Chains

[Figure: State i with transitions to State 0 … State m, labeled P_{i,0} … P_{i,m}]

• Non-deterministic state machines with probabilistic transitions induced by the Chapman-Kolmogorov equation
• P_{i,j} in terms of environmental factors, organism state, etc.
• Executing concurrently
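A minimal Haskell sketch of the Markov-chain view, assuming a fixed transition matrix. In the talk's model P_{i,j} depends on environmental factors and organism state, so the constant matrix `example` and the names `stepDist` and `validRow` are illustrative assumptions only.

```haskell
import Data.List (transpose)

-- Hypothetical types: these names are not taken from the CellSys source.
type Prob   = Double
type Dist   = [Prob]     -- probability of being in each state 0..m
type Matrix = [[Prob]]   -- row i holds P_{i,0} .. P_{i,m}

-- One step of the chain: new_j = sum over i of old_i * P_{i,j}.
stepDist :: Matrix -> Dist -> Dist
stepDist p d = [ sum (zipWith (*) d col) | col <- transpose p ]

-- A row is a valid transition function when its entries are
-- non-negative and sum to 1.
validRow :: [Prob] -> Bool
validRow row = all (>= 0) row && abs (sum row - 1) < 1e-9

-- A two-state example matrix (made-up numbers).
example :: Matrix
example = [ [0.5,  0.5 ]
          , [0.25, 0.75] ]
```

Starting with all probability in state 0, `stepDist example [1, 0]` evaluates to `[0.5, 0.5]`, which is simply row 0 of the matrix.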

Page 20: Domain-specific Languages for Cellular Interactions

Domain Abstractions for R. Sphaeroides

Individual cells: Markov-chain abstraction
  choose P1 Action1 … Pn Actionn
  Actions: Tumble, Divide, AdjSpeed, Laze, Grow, etc.
Concurrency: cell1 || cell2
Environmental Factors: light, size

Page 21: Domain-specific Languages for Cellular Interactions

Abstract syntax for CellSys

choose is our principal domain abstraction; it behaves like the Markov chain transition function

Cell-level environment variables: light, size
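The abstractions listed above could be written down as Haskell data types roughly as follows. This is a sketch under stated assumptions: the constructor names follow the action list on the slides, but the types `Env`, `Cell`, `CellSys` and the function `pick` are hypothetical and are not the actual CellSys abstract syntax.

```haskell
-- Actions named on the slides; the datatype itself is an assumption.
data Action = Tumble | Divide | AdjSpeed | Laze | Grow | Die
  deriving (Eq, Show)

-- Cell-level environment variables: light, size.
data Env = Env { light :: Double, size :: Double }

-- choose P1 Action1 ... Pn Actionn, where each probability may
-- depend on the environment.
newtype Cell = Choose [(Env -> Double, Action)]

-- Concurrency: cell1 || cell2.
data CellSys = Single Cell | Par CellSys CellSys

-- Select an action from a uniform sample r in [0,1).
-- Returns Nothing when r falls outside the listed probability mass.
pick :: Env -> Double -> Cell -> Maybe Action
pick env r (Choose alts) = go r [ (p env, a) | (p, a) <- alts ]
  where
    go _ [] = Nothing
    go x ((p, a) : rest)
      | x < p     = Just a
      | otherwise = go (x - p) rest

-- Illustrative cell: growth becomes more likely in brighter light.
demo :: Cell
demo = Choose [ (\e -> 0.5 * light e, Grow)
              , (const 0.25, Tumble)
              , (const 0.25, Laze) ]
```

`pick` takes the random sample as an argument, so the selection stays a pure function; a probability effect would supply that sample in a fuller definition.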

Page 22: Domain-specific Languages for Cellular Interactions

DSL Definition

Background: Programming languages are “collections of effects”
  Java = OO + Threads + State + …
  LISP = Higher-order Functions + …
  Prolog = Backtracking + …
Corresponding to each such effect is an algebraic construction called a monad
  used for the development of modular semantic theories of programming languages [Moggi89]
  monads may be constructed using “monad transformers”
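To make “monad transformer” concrete, here is a hand-written state transformer using only the Haskell Prelude. It mirrors the well-known `StateT` construction but is defined locally so the sketch is self-contained; the `decrement` example is an assumption added for illustration.

```haskell
-- A state transformer: adds a mutable state s on top of any monad m.
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

instance Monad m => Functor (StateT s m) where
  fmap f (StateT g) = StateT $ \s -> do
    (a, s') <- g s
    return (f a, s')

instance Monad m => Applicative (StateT s m) where
  pure a = StateT $ \s -> return (a, s)
  StateT f <*> StateT g = StateT $ \s -> do
    (h, s')  <- f s
    (a, s'') <- g s'
    return (h a, s'')

instance Monad m => Monad (StateT s m) where
  StateT g >>= k = StateT $ \s -> do
    (a, s') <- g s
    runStateT (k a) s'

-- Operations contributed by the state layer.
get :: Monad m => StateT s m s
get = StateT $ \s -> return (s, s)

put :: Monad m => s -> StateT s m ()
put s = StateT $ \_ -> return ((), s)

-- Re-use an operation of the underlying monad.
lift :: Monad m => m a -> StateT s m a
lift m = StateT $ \s -> m >>= \a -> return (a, s)

-- State layered over Maybe: a counter that fails rather than go negative.
decrement :: StateT Int Maybe ()
decrement = do
  n <- get
  if n <= 0 then lift Nothing else put (n - 1)
```

Running `runStateT (decrement >> decrement) 2` yields `Just ((), 0)`, while starting from `1` yields `Nothing`: the state effect and the failure effect come from separate layers.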

Page 23: Domain-specific Languages for Cellular Interactions

Periodic Table of Effects

Transformer    Effect             Operations
StateT         imperative         :=
EnvT           binding            @ v
ErrorT         exceptions         raise/catch
ContT          continuations      callcc
NondetT        non-determinism    choose
ResT           threads            step, pause
DebugT         debugging          rollback
BackT          backtracking       cut
ProbT          probability        random
ReactT         reactivity         send, recv, …

Prog. languages are collections of effects captured as monads [Moggi]
  Monads assembled from constructors (monad transformers)
Our view: Systems are collections of effects captured as monads
  “Systems” broadly construed: Compilers [Harrison00,98,01,02], Secure system software [Harrison05,03], and Biology [Harrison04]

Page 24: Domain-specific Languages for Cellular Interactions

Periodic Table of Effects


Mathematical definitions for any language are created by combining monad transformers (MTs)
  CellSys = StateT + ResT + ProbT + ReactT
Such definitions are flexible: modular, extensible, and easily refactored

Page 25: Domain-specific Languages for Cellular Interactions

DSL definition similar to traditional RTS

In a traditional run-time system (RTS), threads request services like
  “send a message”
  “output on device”
  “consume resource”
The RTS mediates, ensuring that
  the threads do not interfere
  global system state remains consistent
and it schedules threads

[Figure: threads running on top of the Run-time System]
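The thread-scheduling role described above can be sketched with a resumption monad transformer, in the spirit of the `ResT` entry (threads; step, pause). The definitions below are a from-scratch illustration, not the talk's actual code; `roundRobin`, `say`, and `thread` are hypothetical names, and the tuple monad `(,) [String]` stands in for a log of observable events.

```haskell
-- A resumption: either finished, or one atomic step in m
-- followed by the rest of the computation.
data ResT m a = Done a | Pause (m (ResT m a))

instance Monad m => Functor (ResT m) where
  fmap f (Done a)  = Done (f a)
  fmap f (Pause r) = Pause (fmap (fmap f) r)

instance Monad m => Applicative (ResT m) where
  pure = Done
  rf <*> ra = rf >>= \f -> fmap f ra

instance Monad m => Monad (ResT m) where
  Done a  >>= k = k a
  Pause r >>= k = Pause (fmap (>>= k) r)

-- Run one action of the underlying monad as a single atomic step.
step :: Monad m => m a -> ResT m a
step m = Pause (fmap Done m)

-- Round-robin scheduler: run one step of the first thread,
-- then move that thread to the back of the queue.
roundRobin :: Monad m => [ResT m ()] -> m ()
roundRobin []             = return ()
roundRobin (Done () : ts) = roundRobin ts
roundRobin (Pause r : ts) = do
  t' <- r
  roundRobin (ts ++ [t'])

-- Each step appends one message to a log.
say :: String -> ResT ((,) [String]) ()
say s = step ([s], ())

-- A thread that logs two messages tagged with its name.
thread :: String -> ResT ((,) [String]) ()
thread name = mapM_ (\n -> say (name ++ n)) ["1", "2"]
```

`fst (roundRobin [thread "a", thread "b"])` evaluates to `["a1","b1","a2","b2"]`: the scheduler interleaves the two threads one step at a time.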

Page 26: Domain-specific Languages for Cellular Interactions

High-level view of definition

In CellSys, cells are threads with physical components as well: size, velocity, …
Cells request services like
  “consume nutrients”
  “move me here”
  “want to divide”
The Global Environment (GE) mediates like an RTS, and also:
  preserves physical integrity
  updates global world view
  performs scheduling

[Figure: cells running on top of the Global Environment]
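The request/mediate pattern above might look like the following in Haskell. Everything here is a simplified assumption for illustration: the request and response types, the `World` record, and the decision rules in `handle` are invented, and division only increments a counter instead of spawning a new cell thread.

```haskell
-- Requests a cell can make of the global environment (hypothetical names).
data Req = Consume Double | MoveTo (Int, Int) | WantDivide
data Rsp = Ok | Denied

-- A cell is finished, or has issued a request and waits for the response.
data Cell = Finished | Ask Req (Rsp -> Cell)

-- A deliberately tiny global world view.
data World = World { nutrients :: Double, cells :: Int }

-- The global environment decides each request against the world state.
handle :: Req -> World -> (Rsp, World)
handle (Consume x) w
  | x <= nutrients w = (Ok, w { nutrients = nutrients w - x })
  | otherwise        = (Denied, w)
handle (MoveTo _) w  = (Ok, w)                       -- no collision model here
handle WantDivide w  = (Ok, w { cells = cells w + 1 })

-- Round-robin scheduling: serve one request per cell per turn.
runGE :: World -> [Cell] -> World
runGE w []              = w
runGE w (Finished : cs) = runGE w cs
runGE w (Ask q k : cs)  =
  let (r, w') = handle q w
  in  runGE w' (cs ++ [k r])

-- A cell that tries to eat one unit and divides only if fed.
feeder :: Cell
feeder = Ask (Consume 1.0) $ \r -> case r of
  Ok     -> Ask WantDivide (const Finished)
  Denied -> Finished
```

With `World 1.5 2` and two `feeder` cells, only the first cell is fed, so `runGE` ends with `nutrients` at `0.5` and `cells` at `3`. Because every request passes through `handle`, the world state is updated in one place.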

Page 27: Domain-specific Languages for Cellular Interactions

DSL Implementation

Because CellSys is defined in terms of monad transformers, it may be implemented directly as a Haskell program
  i.e., the monadic language definition may be transcribed “symbol for symbol” into Haskell
The Haskell implementation is easily instrumented to output system “snapshots”:
  it prints snapshots in POV (Persistence of Vision) format, which are then converted into MPEG

Page 28: Domain-specific Languages for Cellular Interactions

Q: What are appropriate languages for modeling?

Integrate techniques from programming languages
  models of concurrency
  language semantics, i.e., precise, mathematical language definitions
  efficient language implementation
…into a special-purpose language called a “Domain-Specific Language”
  abstractions taken directly from biology
  comprehensible by biologists
DSLs and DSL programs
  hide technical details irrelevant/uninteresting to biologists
  are “tunable” by computer scientist to reflect discovery/refinement
  execute to provide “reality check” by biologists

Page 29: Domain-specific Languages for Cellular Interactions

Bioinformatics = Computer Science + Biology

Computer Science
  models of concurrency
  efficient implementation
  mathematical models of programs
  reasoning about programs
Biology
  organism structure & behavior
  modeling techniques: cellular automata, systems of PDEs, numerical techniques

Hard Problem: How do you effect a technology transfer from CS to Biology?

Page 30: Domain-specific Languages for Cellular Interactions

Interdisciplinary Process

CellSys (version 1.0) → feedback/discussion → CellSys (version 2.0)

Biologist evaluates DSL model for accuracy, expressiveness, etc.
Language expert refactors language as needed

Page 31: Domain-specific Languages for Cellular Interactions

Summary

modular monadic semantics: precise description of biological phenomena through DSL semantics
domain-specific languages: comprehensibility, reusability, & ease of use
systems biology: large body of work providing domain abstractions & models

* Harrison & Harrison, “Domain Specific Languages for Cellular Interactions,” in Proceedings of the International Conference IEEE Engineering in Medicine and Biology, 2004.