An Introduction to SPITBOL - Programming Languages - Robert Dewar
Post on 20-Dec-2015
An Introduction to SPITBOL
Programming Languages
Robert Dewar

SPITBOL Background
A series of string processing languages developed at Bell Labs (Griswold et al.)
    SNOBOL
    SNOBOL-3
    SNOBOL-4
Later Griswold developed ICON
    Based on SNOBOL-4
SPITBOL is a fast compiled implementation of SNOBOL-4 (Dewar et al.)

Silly Acronyms
StriNg Oriented symBOlic Language
SPeedy ImplemenTation of snoBOL4
Strictly speaking, SPITBOL is a dialect
    Removes a few very marginal features
    Adds a number of extensions

Dynamic Typing
        A = 123          ;* A has an integer
        A = "BCD"        ;* A has a string
        A = array(10)    ;* A has an array
Full typing information available
Full type checking done
But types can vary dynamically
No static declarations

Datatypes (partial list)
INTEGER (typically a 32-bit signed integer)
REAL
STRING
    Varying length string as first class type
    Not in any sense an array of characters
ARRAY
TABLE
PATTERN
CODE

Basic Syntactic Form
Line oriented
Labels in column 1
Rest of line free format (keep to 80 cols)
Continuation lines have . (period) in col 1
Comment line starts with *
Multiple statements on line using ;
    But no ; normally after a statement
The combination ;* makes a line comment

More on basic syntax
Assignment uses =
    Must have spaces around =
Must have spaces around binary operators
Must not have space after unary operator
Null operator (i.e. space) is concatenation
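
The spacing rules matter because a blank is itself an operator. A minimal sketch, with variable names A to D chosen only for illustration (they are not from the slides): with blanks on both sides, - is binary subtraction; with a blank only in front, - is unary and the blank means concatenation.

```snobol
        A = 5
        B = 3
*       Blanks on both sides: binary minus
        C = A - B
*       Blank only before the minus: A concatenated with (unary minus B)
        D = A -B
        OUTPUT = C
        OUTPUT = D
END
```

Under these rules the first line printed should be the integer 2 and the second the string 5-3.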

Simple Arithmetic
Normal arithmetic operators
        A = 123
        A = A + 2
        B = 126
        A = (A + B) / (A * B)
Note: precedence of / is lower than * so we could have written last line as:
        A = (A + B) / A * B

Real arithmetic
Same set of operators
        A = 123.45
        B = 27.55
        C = A / B
Automatic widening of integers
        C = C + 1        ;* 1 treated as 1.0 here

Strings
Strings can be any length
String literals have two forms
    Surrounded by " can contain embedded '
    Surrounded by ' can contain embedded "
Examples:
        A = "123'ABC"
        N = 'b"c'
        C = A N A        ;* concatenation
* C has value 123'ABCb"c123'ABC

Strings and Integers
Can auto-convert between string/integer
        X = 123
        K = X "abc"      ;* K = string 123abc
        K = X ""         ;* K = integer 123
* concatenating with null is special as above
        X = "123"        ;* X = string "123"
        M = X + 1        ;* M = integer 124
        M = X + "a"      ;* run-time error

Predicates
Predicates are functions that either return the null string (on true) or "fail" (on false)
Integer predicates: eq le lt ne gt ge
        eq(1,2)          fails
        ne(1,2)          succeeds, returns null
Note: no space between function name and left parenthesis (rule applies to all functions)

Gotos and labels
A label is an identifier in column one
Any statement can end with a goto field in one of five forms:
        :(Label)         unconditional goto Label
        :S(B1)           on success goto B1, on failure fall through
        :F(B2)           on success fall through, on failure goto B2
        :S(F1)F(X)       on success goto F1, on failure goto X
        :F(F1)S(X)       on failure goto F1, on success goto X

Example of use of Labels
A simple loop (add numbers from 1 to 10)
        N = 1
        S = 0
SUM     S = S + N
        N = LT(N, 10) N + 1    :S(SUM)
Note that if LT(N,10) succeeds it returns null
The null is concatenated with the value of N + 1
Now you see why the special rule that concatenating null does nothing at all!

Comparing Strings
Cannot compare using eq, ne
    Since these work only for Integer, Real
    For example EQ("123","00123") succeeds
    But EQ("ABC","ABC") is a run-time error
So to compare two strings
    Use IDENT(A,B) or DIFFER(A,B) to compare
    Missing args are null so IDENT(A) or DIFFER(A) checks for being equal to null or not equal to null
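
A minimal sketch of the string comparisons described above. The variables A and B and the label NOTSAME are illustrative only; the program uses nothing beyond IDENT, DIFFER, OUTPUT and goto fields.

```snobol
        A = 'ABC'
        B = 'ABC'
*       IDENT succeeds (returning null) when both arguments are the same string
        IDENT(A,B)                      :F(NOTSAME)
        OUTPUT = 'A and B are identical'
NOTSAME
*       DIFFER with one argument succeeds when that argument is not null
        DIFFER(A)                       :F(END)
        OUTPUT = 'A is not null'
END
```

Both messages should be printed here, since A equals B and A is not null.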

Input-Output
To write to standard output:
        OUTPUT = string
To write to standard error:
        TERMINAL = string
To read from standard input:
        LINE = TERMINAL
    fails if no more input (end of file)

To Read/Write Files
Dynamically associate variables with the files; subsequent assignments write to the file and subsequent references read from it.
Here is a file copy program
        INPUT('IN',1,"filename1")
        OUTPUT('OUT',2,"filename2")
CL      OUT = IN         :S(CL)
END
END label ends program (always true)
1 and 2 are unit numbers, must be unique

Pattern Matching
General format is
        subject ? pattern
        subject ? pattern = value
The ? can be omitted
Match may fail
If match succeeds in second form, value replaces matched part of subject
Pattern can contain strings or special pattern primitives

Pattern Matching Examples
Example:
        X = "123AABCTHECAT"
        X ? "A" ARB "THE" = "HELLO"
Here ARB matches anything (special primitive)
Match is to leftmost occurrence
So ARB matches "ABC"
Resulting value in X is "123HELLOCAT"

Other primitives
These can be used as pattern components
        LEN(int)         matches int characters
        ANY("AB")        matches A or B
        SPAN(" ")        matches longest string of spaces
        BREAK("A")       matches up to but not including 'A'
        REM              matches rest of string
        BAL              matches paren balanced string
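
As a rough illustration of how these primitives combine, the sketch below splits a "key = value" style string. The subject string and the variable names S, K, V and FIRST3 are made up for this example.

```snobol
        S = 'key   =value(a,b)'
*       BREAK stops before the first blank or equals sign, SPAN skips the blanks
        S ? BREAK(' =') . K SPAN(' ') '=' REM . V      :F(END)
        OUTPUT = K
        OUTPUT = V
*       LEN(3) matches exactly three characters
        V ? LEN(3) . FIRST3
        OUTPUT = FIRST3
END
```

With this subject the three output lines should be key, value(a,b) and val.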

Pattern Constructors
Alternation
        P1 | P2
    Matches either P1 or P2, try P1 first
Concatenation
        P1 P2
    Matches P1 then P2
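
A small sketch of the two constructors working together. Concatenation binds more tightly than alternation, so the second alternation needs parentheses. COLOR, ITEM and HIT are illustrative names, not part of the slides.

```snobol
        COLOR = 'red' | 'green' | 'blue'
*       A colour, one blank, then either of two nouns
        ITEM = COLOR ' ' ('car' | 'bike')
        'a green bike' ? ITEM . HIT           :F(END)
        OUTPUT = HIT
END
```

The match should succeed at the word green, so the program prints green bike.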

Pattern Output
The use of the dot operator
        STM = "label x = terminal"
        STM ? BREAK(' ') . L SPAN(' ') REM . S
If match succeeds (only if) period results in assigning matched part to given variable
After above match
        L = "label"
        S = "x = terminal"

Pattern Output
The $ operator is like the dot operator, but assignment is immediate
        "ABC" ? ARB $ TERMINAL 'x'
END
Output is ten lines:
    (blank line)   (arb matches null string before A)
    A
    AB
    ABC
    (blank line)   (arb matches null string between A and B)
    B
    BC
    (blank line)   (arb matches null string between B and C)
    C
    (blank line)   (arb matches null string after C)

Patterns as Values
Patterns can be assigned etc
        Vowel = 'oe' | 'ae' | 'a' | 'e' | 'i' | 'o' | 'u'
        Cons = Notany("aeiou");
Now can use Vowel in a pattern
So a big pattern can be built up
Using a series of assignments to build it from component parts
        Vowelseq = Arbno(Vowel)
        Isolatedcons = Vowelseq Cons Vowelseq
etc.

Fancy Recursive Patterns
Here is a BNF grammar for simple expressions
        EXPR   ::= TERM | EXPR + TERM
        TERM   ::= PRIM | PRIM * TERM
        PRIM   ::= LETTER | ( EXPR )
        LETTER ::= 'a' | 'b' | 'c' … 'z'
Generates strings like
        a+b*(c+d)

First attempt at pattern
Here is a pattern matching that grammar
        EXPR = TERM | EXPR '+' TERM
        TERM = PRIM | PRIM '*' TERM
        PRIM = LETTER | '(' EXPR ')'
        LETTER = ANY("abc .. xyz")
Neat
But wrong
Why? Because when you execute the assignment to EXPR, both EXPR and TERM are still null, so it is equivalent to
        EXPR = '' | '' '+' ''
That's not what you want

Second attempt at pattern
Here is a pattern that works
        EXPR = *TERM | *EXPR '+' *TERM
        TERM = *PRIM | *PRIM '*' *TERM
        PRIM = *LETTER | '(' *EXPR ')'
        LETTER = ANY("abc .. xyz")
This works, because unary * means don't look in the variable until pattern matching time.

More neat patterns
Match palindromes (as written, those of even length: a string followed by its reverse)
        PAL = POS(0) ARB $ STR *REVERSE(STR) RPOS(0)
POS(0) matches null string at start
RPOS(0) matches null string at end
The unary * actually means don't evaluate expression until pattern matching time, so REVERSE is called during the pattern match.
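
The PAL pattern above matches a string followed by its own reverse, which covers palindromes of even length. One possible extension, offered only as a sketch, allows an optional middle character so that odd lengths match too. The (LEN(1) | '') component and the label NO are additions for this example, not part of the slides.

```snobol
*       Optional single middle character between the two halves
        PAL = POS(0) ARB $ STR (LEN(1) | '') *REVERSE(STR) RPOS(0)
        'racecar' ? PAL                        :F(NO)
        OUTPUT = 'palindrome'                  :(END)
NO      OUTPUT = 'not a palindrome'
END
```

Here ARB eventually tries rac, LEN(1) takes the e, and REVERSE(STR) supplies car, so the match should succeed.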

Arrays
Array created by call to array function
        AR = ARRAY(50)
To index, we use <>, fail if out of range
To fill AR with integers 1 .. 50
        N = 0
LP      AR<N = N + 1> = N    :S(LP)
Multidimensional arrays allowed etc.
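
A short sketch of a two-dimensional array and of an out-of-range reference failing. The dimension string '2,3', the initial value 0, and the names M and OOPS are choices made for this example only.

```snobol
*       2 rows by 3 columns, every element initialised to 0
        M = ARRAY('2,3',0)
        M<1,2> = 7
        OUTPUT = M<1,2> + M<2,3>
*       Row 3 does not exist, so the reference fails
        M<3,1> = 1                 :F(OOPS)
        OUTPUT = 'not reached'     :(END)
OOPS    OUTPUT = 'subscript out of range'
END
```

The first output line should be 7, followed by the out-of-range message.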

Tables
Like arrays but subscript can be anything
Implemented typically by hash tables
        R = TABLE(100)
LP      S = TERMINAL                                   :F(END)
        TERMINAL = NE(R<S>) S " given " R<S> " times"
        R<S> = R<S> + 1                                :(LP)
END

Functions
Functions are defined dynamically
    Everything in SNOBOL4 is dynamic
Factorial function
        DEFINE("FACT(X)")
        TERMINAL = FACT(6)           :(END)
FACT    FACT = EQ(X,1) 1             :S(RETURN)
        FACT = X * FACT(X - 1)       :(RETURN)
END
RETURN is a special label to return from a function

More on functions
Wrong modification of previous program
        DEFINE("FACT(X)")
FACT    FACT = EQ(X,1) 1             :S(RETURN)
        FACT = X * FACT(X - 1)       :(RETURN)
        TERMINAL = FACT(6)
END
This is wrong because execution "falls into" the definition of the function. If you run the above program you get a message like "RETURN from outer level"

More on functions
Correct modification of previous program
        DEFINE("FACT(X)")            :(FACT_END)
FACT    FACT = EQ(X,1) 1             :S(RETURN)
        FACT = X * FACT(X - 1)       :(RETURN)
FACT_END
        TERMINAL = FACT(6)
END
That's a very standard style for defining functions
Similar to jumping past data in assembler

More on Functions
Can have multiple arguments
        DEFINE("ACKERMAN(X,Y)")
Can have local variables
        DEFINE("MYFUNC(A,B,C)L1,L2");
No static scoping
The way both arguments and locals work
    On entry, save old values, set arguments, set locals to all null values
    On return, restore saved values
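
The save and restore behaviour can be seen in a small sketch. SUMTO, its local I and the label names are hypothetical, chosen for this example. The local I starts out null on entry (so it counts as 0 in arithmetic), and the outer value of I is restored on return.

```snobol
        DEFINE('SUMTO(N)I')                    :(SUMTO_END)
*       Sum the integers 1 .. N using local counter I
SUMTO   SUMTO = 0
SUMTO_1 I = LT(I,N) I + 1                      :F(RETURN)
        SUMTO = SUMTO + I                      :(SUMTO_1)
SUMTO_END
        I = 'outer'
        OUTPUT = SUMTO(10)
*       I was saved on entry and restored on return
        OUTPUT = I
END
```

This should print 55 and then outer, showing that the function's use of I did not disturb the caller's I.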

The EVAL function
The function EVAL takes a string and evaluates it as a SNOBOL-4 expression
Here is a simple calculator program
LP      TERMINAL = EVAL(TERMINAL)    :S(LP)
END
Note that since we are within a single program, variables etc stick around, so this is more powerful than it looks
Also assignments are expressions in SPITBOL!

Running the Calculator Program
b = 12
12
a = 32
32
b + a
44
c = "str"
str
c ? arb . q 'r'
str
q
st

The CODE function
Even more fun and games
The function CODE(str) takes a string and treats it as a sequence of SNOBOL-4 statements and compiles them.
The result is an object of type CODE
The special goto form :<obj> will jump to the compiled code.

A More Powerful Calculator
Here is a more interesting calculator
LP      C = CODE(TERMINAL "; :(LP)")    :S<C>
END
Here we take the input from the terminal, concatenate a goto LP so that control will return to the loop, and, if there is no end of file and the code compiled successfully, execute the code

Calculator 2 at Work
a = 6
b = 5
terminal = a + b
11
define("f(x)") :(e);f f = eq(x,1) 1 :s(return); f = x * f(x - 1) :(return) ;e
terminal = f(6)
720

Use of Predicates in Patterns
Here is a pattern that matches a^n b^n c^n
That is: a string of a's, b's and c's with an equal number of each
        abc = Pos(0)
.             Span('a') $ a
.             Span('b') $ b *eq(size(a),size(b))
.             Span('c') $ c *eq(size(a),size(c))
.             Rpos(0)
The calls to eq are made at pattern matching time and either fail or return the null string.

Using Your Own Predicates
Here is a pattern that matches only strings of digits where the value is prime
        prime = span('0123456789') $ n
.               *Is_Prime(n)
You now write an Is_Prime function that returns the null string on true and fails on false.
A function fails by branching to FRETURN
        eq(a,b)          :s(return)f(freturn)
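
The slides leave Is_Prime to the reader. One possible version, a sketch by trial division rather than the author's own code, is shown below in upper case. The argument name K, the local D, the label IP_LOOP and the test subject are all assumptions made for this example; REMDR is the standard remainder function.

```snobol
        DEFINE('IS_PRIME(K)D')                 :(IS_PRIME_END)
*       Fail for values below 2, otherwise try divisors 2, 3, 4, ...
IS_PRIME LT(K,2)                               :S(FRETURN)
        D = 2
*       No divisor found up to the square root: succeed, returning null
IP_LOOP GT(D * D,K)                            :S(RETURN)
        EQ(REMDR(K,D),0)                       :S(FRETURN)
        D = D + 1                              :(IP_LOOP)
IS_PRIME_END
        PRIME = SPAN('0123456789') $ N *IS_PRIME(N)
        'abc 97 xyz' ? PRIME . P               :F(END)
        OUTPUT = P
END
```

With this subject the pattern should match 97 and print it. Because the result variable IS_PRIME is never assigned, the function returns the null string on success, which is what a predicate is expected to do.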

Let's end with a fun pattern
Find longest numeric string
        DIG = '0123456789'
        LI = NULL $ W FENCE
.            BREAKX(DIG)
.            (SPAN(DIG) $ N *GT(SIZE(N),SIZE(W))) $ W
.            FAIL
        T = 'abc123def1234789xyz99!'
        T LI
        TERMINAL = W
END
Output of running this program is 1234789