introductionwrite functions!example: calculate...
TRANSCRIPT
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 1 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Writing Functions In RNecessity Really is a Mother
Paul E. Johnson12
1University of Kansas, Department of Political Science 2Center for ResearchMethods and Data Analysis
2013
Writing Functions I 2 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 3 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 4 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Did You Ever Write a Program?
If Yes: R’s different than that.
If No: Its not like other things you’ve done.
In either case, don’t worry about it :)
Writing Functions I 5 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
R is a little bit like an elephant
Its a tree trunk! Its a snake! Its a brush!
Writing Functions I 6 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The R Language is like S, of course
The S Language– John Chambers, etal. at Bell Labs, mid 1970s.See Richard Becker’s “Brief History ofS” about the AT&T years
There have been 4 generations of theS language.
Many packages now were written inS3, but S4 has been recommended fornew packages for at least 5 years.
A new framework known as “referenceclasses” is now being developed (wasjokingly referred to as “R5” at onetime)
S3: The New S Language 1988
Writing Functions I 7 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Is R a Branch from S?
R can be seen as a competingimplementation of S.Ross Ihaka and Robert Gentleman.1996. “R: A language for data analysisand graphics.” Journal ofComputational and GraphicalStatistics, 5(3):299-314.
Open Source, Open Community, openrepository CRAN
S pioneers now work to advance R.
S4: John Chambers,Software forData Analysis: Programming with R,Springer, 2008
Writing Functions I 8 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 9 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Overview: 3 reasons to write functions
Preserve your sanity: isolate specific work.
Side benefit: preserve sanity of others who have to read yourcode
Facilitate re-use of your ideas
Co-operate with R functions like lapply() and boot() whichREQUIRE functions from us.
Writing Functions I 10 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions: Separate Calculations Meaningfully
New programmers tempted to craft a giant sequence of 1000commands
Just Don’t!
Problem No other human can comprehend that messSolution Write functions to calculate separate parts
I don’t feel comfortable with any function until I have a small“working example” to explore it (many availablehttp://pj.freefaculty.org/R)
Writing Functions I 11 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Re-use your work
If you write a function, you can put it to use in many differentcontexts
If you write a gigantic stream of 1000 commands, you can’t.
Writing Functions I 12 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Avoid for loops with lots of meat inside
Instead of this:
f o r ( i i n 1 :1000}{1000 s o f l i n e s he r e f u l l o f x [ i ] , z [ i ] , and so f o r t h}
We want:
fn1 <− f u n c t i o n ( arguments ) { . . . }fn2 <− f u n c t i o n ( arguments ) { . . . }f o r ( i i n 1 : 1000) {
y <− fn1 ( x , . . . )z <− fn2 ( y , . . . )}
This is easier to read, understand, and more re-usable!
Writing Functions I 13 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Personal confession
My first attack at any problem is often a long string ofcommands that are not separable, not readable
The revision process usually causes me to segregate code intoseparate pieces
One hint that you need a function: constant cutting andpasting of code scraps from one place to another
Writing Functions I 14 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
When finished, I Wish Your R program would look like this
myfn1 <− f u n c t i o n ( argument1 , argument2 , . . . ) {## lines here using argument1 , argument2
}myfn2 <− f u n c t i o n ( argument3 , argument4 ) {
## lines here
}## Try to perfect the above. Then use them
##
a <− 7b <− c (4 , 4 , 4 , 4 , 2)g r e a t 1 <− myfn1 ( a , b , parm3 = TRUE)g r e a t 2 <− myfn2 (b , g r e a t 1 )
Writing Functions I 15 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Create a Function
R allows us to create functions “on the fly”. This is theessential difference between a compiled language like C and aninterpreted language like R. While an R session is running, wecan add new capabilities to it.
The artist Escher would like this one:The word function is a function that createsfunctions!
A new function somethingGood() is defined by a stanza thatbegins like so:
somethingGood <− f u n c t i o n ( arguments ) {
somethingGood is a name we choose
Writing Functions I 16 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Create a Function ...
The terms arguments and parameters are interchangeable. Ioften say inputs. In R, do not say “options”, that confusespeople because R has a function called options() thatgoverns the working session.
arguments may be specified with default values, as in
somethingGood <− f u n c t i o n ( x1 = 0 , x2 = NULL) {
After the squiggly brace, any valid R code can be used. Wecan even define functions inside the function!
What happens in the function stays in the function. Thingsyou create inside there do not escape the closure unless youreally want them to.
Return results: When when the function’s work is finished, asingle object’s name is included on the last line.
Writing Functions I 17 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Create a Function ...
somethingGood <− f u n c t i o n ( x1 = 0 , x2 = NULL) {## suppose really interesting calculations create
res , a result
r e s}
There are some little wrinkles about returns that will bediscussed later. But, for now, I plant some seeds in your mind.
The return includes one objectThe result will ordinarily print in the R terminal when thefunction runs, but we can prevent that by using the last line as
somethingGood <− f u n c t i o n ( x1 = 0 , x2 = NULL) {## suppose really interesting calculations create \
texttt{res}, a result
i n v i s i b l e ( r e s )}
Writing Functions I 18 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Create a Function ...
It is possible to exit sooner, to short-circuit the calculationsbefore they have all run their course. That is what thereturn() function is for.
somethingGood <− f u n c t i o n ( x1 = 0 , x2 = NULL) {## suppose you created res
i f ( s omeLog i c a lCond i t i o n ) r e t u r n ( i n v i s i b l e ( r e s ) )## otherwise , go on and revise res further.
i n v i s i b l e ( r e s )}
Writing Functions I 19 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
R Functions pass information “by value”
The basic premise is that users should organize theirinformation “here”, in the current environment, and it isimportant that the function should not accidentally damage it.
Thus, we send info “over there” to a function
We get back a new something.
The function DOES NOT change variables we give to thefunction
The super assignment << − allows an exception to this, butR Core recommends we avoid it. Only experts should use it.
Writing Functions I 20 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
A simple example of a new function: doubleMe
doubleMe <− f u n c t i o n ( i n pu t = 0) {newva l <− 2 * i n pu t
}
The function’s name is “doubleMe”I choose a name for the incoming variable “input”.Other names would do (must start with a letter, etc.):
doubleMe <− f u n c t i o n ( x ) {out <− 2 * x
}
The last named thing is the one that comes back from the function.Note, explicit use of a return function is NOT REQUIRED. Thelast named thing comes back to us.
Writing Functions I 21 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Key Elements of doubleMe
doubleMe <− f u n c t i o n ( i n pu t = 0) {newva l <− 2 * i n pu t
}
doubleMe a name with which toaccess this function.Because of mybackground in“Objective C”, I like thisstyle of name. Don’tput periods in functionnames unless you knowabout “classes” and areusing them.
input a name usedINTERNALLY whilemaking calculations
= 0 An optional defaultvalue.
newval Last calculation isreturned.
Writing Functions I 22 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Call doubleMe
What is 2 * 7?
( doubleMe (7) )
[ 1 ] 14
The caller may name the arguments explicitly:
( doubleMe ( i npu t = 8) )
[ 1 ] 16
Wonder why I use parentheses around everything? Its just apresentational trick. The default action is “print”, and that’swhat happens when you put something in parentheses withouta function name.
The alternatives are:
p r i n t ( doubleMe ( i npu t = 3) )
Writing Functions I 23 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How to Call doubleMe ...
[ 1 ] 6
or
x <− doubleMe ( i npu t = 2)x
[ 1 ] 4
Writing Functions I 24 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Generally, I Prefer Clarity in the Call
The “call” is the code statement that puts a function to use.The call includes the function’s name and all arguments.
This works
doubleMe (10)
But wouldn’t you rather be clear?
doubleMe ( i npu t = 10)
When there are many arguments, naming them often helpsprevent accidental matching of input to arguments (R’spositional matching can be fooled).
Writing Functions I 25 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Function Calls
But if you name the argument wrong, it breaks
> doubleMe ( myInput = 7)E r r o r i n doubleMe ( myInput = 7) :unused argument ( s ) ( myInput = 7)
What if you feed it something unsuitable?
> doubleMe ( lm ( rnorm (100) ∼ rnorm (100) ) )E r r o r i n 2 * i n pu t : non−numeric argument to b i n a r y
op e r a t o rI n a d d i t i o n : Warning messages :1 : I n mo d e l .m a t r i x . d e f a u l t (mt , mf , c o n t r a s t s ) :
the r e s pon s e appeared on the r ight−hand s i d e and wasdropped
2 : I n mo d e l .m a t r i x . d e f a u l t (mt , mf , c o n t r a s t s ) :problem with term 1 i n mode l .ma t r i x : no columns a r e
a s s i g n e d
Writing Functions I 26 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Vectorization
This is not always true, but OFTEN:
We get “free”“vectorization”
( doubleMe ( c ( 1 , 2 , 3 , 4 , 5 ) ) )
[ 1 ] 2 4 6 8 10
But it won’t allow you to specify too many inputs:
> doubleMe (1 , 2 , 3 , 4 , 5 )E r r o r i n doubleMe (1 , 2 , 3 , 4 , 5) :
unused argument ( s ) (2 , 3 , 4 , 5)
Vectorization: vector in =⇒ vector out
Oops. I forgot the input
doubleMe ( )
Gives the default value.
Writing Functions I 27 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
print.function magic
Oops. I forgot the parentheses
doubleMe
f u n c t i o n ( i n pu t = 0) {newva l <− 2 * i n pu t
}
Similarly, type “lm” and hit return. Or “predict.glm”. Don’tadd parentheses.
When you type a function’s name, R thinks you want to printthat thing, and it invokes a “print method” for you, calledprint.function(). (That’s print as applied to an object ofclass function.)
Generally, print.function() will display the R internalfunctions in a “tidied up” format. Your functions–the ones youhave created in your session–are generally not tidied up. Thatis discussed in the Rstyle vignette distributed with rockchalk.
Writing Functions I 28 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
predict.glm, for example
p r e d i c t . g lm
f u n c t i o n ( ob j e c t , newdata = NULL , type = c ( ” l i n k ” , ”r e s pon s e” ,”terms ”) , s e . f i t = FALSE , d i s p e r s i o n = NULL , terms =
NULL ,n a . a c t i o n = na .pa s s , . . . )
{t ype <− match .a rg ( type )n a . a c t <− o b j e c t $ n a . a c t i o no b j e c t $ n a . a c t i o n <− NULLi f ( ! s e . f i t ) {
i f ( m i s s i n g ( newdata ) ) {pred <− sw i t c h ( type , l i n k = ob j e c t $
l i n e a r . p r e d i c t o r s ,r e s pon s e = ob j e c t $ f i t t e d . v a l u e s , te rms =
p r e d i c t . l m ( ob j e c t ,s e . f i t = s e . f i t , s c a l e = 1 , type = ”terms ”
,te rms = terms ) )
i f ( ! i s . n u l l ( n a . a c t ) )
Writing Functions I 29 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
predict.glm, for example ...pred <− n a p r e d i c t ( na . ac t , p red )
}e l s e {
pred <− p r e d i c t . l m ( ob j e c t , newdata , s e . f i t ,s c a l e = 1 ,type = i f e l s e ( type == ” l i n k ” , ”r e s pon s e ” ,
type ) ,te rms = terms , n a . a c t i o n = n a . a c t i o n )
sw i t c h ( type , r e s pon s e = {pred <− f am i l y ( o b j e c t ) $ l i n k i n v ( pred )
} , l i n k = , terms = )}
}e l s e {
i f ( i n h e r i t s ( ob j e c t , ”s u r v r e g ”) )d i s p e r s i o n <− 1
i f ( i s . n u l l ( d i s p e r s i o n ) | | d i s p e r s i o n == 0)d i s p e r s i o n <− summary ( ob j e c t , d i s p e r s i o n =
d i s p e r s i o n ) $ d i s p e r s i o nr e s i d u a l . s c a l e <− a s . v e c t o r ( s q r t ( d i s p e r s i o n ) )pred <− p r e d i c t . l m ( ob j e c t , newdata , s e . f i t , s c a l e =
r e s i d u a l . s c a l e ,
Writing Functions I 30 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
predict.glm, for example ...
t ype = i f e l s e ( type == ” l i n k ” , ”r e s pon s e ” , type ) ,te rms = terms , n a . a c t i o n = na . a c t i o n )
f i t <− pred $ f i ts e . f i t <− pred $ s e . f i tsw i t c h ( type , r e s pon s e = {
s e . f i t <− s e . f i t * abs ( f am i l y ( o b j e c t ) $mu.eta ( f i t) )
f i t <− f am i l y ( o b j e c t ) $ l i n k i n v ( f i t )} , l i n k = , terms = )i f ( m i s s i n g ( newdata ) && ! i s . n u l l ( n a . a c t ) ) {
f i t <− n a p r e d i c t ( na . ac t , f i t )s e . f i t <− n a p r e d i c t ( na . ac t , s e . f i t )
}pred <− l i s t ( f i t = f i t , s e . f i t = s e . f i t ,
r e s i d u a l . s c a l e = r e s i d u a l . s c a l e )}pred
}<bytecode : 0 x2bc3fb8><environment : namespace : s t a t s>
Writing Functions I 31 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Function Calls: Local versus Global
The variables we create in “our” session are generally in theGlobal Environment.
Local variables in functions.
Function arguments are local variablesVariables created inside are local variables
Local variables cease to exist once R returns from the function
After playing with doubleMe(), we note that the variableinput does not exist in the current environment.
> l s ( )[ 1 ] ”doubleMe ”> i n pu tE r r o r : o b j e c t ' i n pu t ' not found
Writing Functions I 32 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Check Point: write your own function
Write a function “myGreatFunction” that takes a vector andreturns the arithmetic average.
Generate your own input data, x1, like so
s e t . s e e d (234234)x1 <− rnorm (10000 , m = 7 , sd = 19)
In myGreatFunction(), you can use any R functions youwant, it is not necessary to re-create the mean() function(unless you really want to :))
After you’ve written myGreatFunction(), use it:
x1mean <− myGreatFunct ion ( x1 )x1mean
Now stress test your function by changing x1
x1 [ c (13 , 44 , 99 , 343 , 555) ] <− NAmyGreatFunct ion ( x1 )
Writing Functions I 33 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 34 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Entropy can summarize diversity for a categorical variable
Entropy in Physics means disorganization
Sometimes called Shannon’s Information Index
Basic idea. List the possible outcomes and their probabilities
The amount of diversity in a collection of observationsdepends on the equality of the proportions of cases observedwithin each type.
Writing Functions I 35 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
A Reasonable Person Would Agree . . .
This distribution is “less diverse”outcome name t1 t2 t3 t4 t5
prob(outcome) 0.1 0.3 0.05 0.55 0.0
than this distribution:outcome name t1 t2 t3 t4 t5
prob(outcome) 0.2 0.2 0.2 0.2 0.2
Writing Functions I 36 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The Information Index
For each type, calculate the following information (or can Isay “diversity”?) value
−pt ∗ log2(pt) (1)
Note that if pt = 0, the diversity value is 0
If pt = 1, then diversity is also 0
Sum those values across the m categories
m∑t=1
−pt ∗ log2(pt) (2)
Diversity is at a maximum when pt are all equal
Writing Functions I 37 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Calculate Diversity for One Type
d i v r <− f u n c t i o n ( p = 0) {i f e l s e ( p > 0 & p < 1 , −p * l o g2 ( p ) , 0)
}
Writing Functions I 38 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Let’s plot that
pseq <− seq (0 .001 , 0 .999 , l e n g t h =999)p s e q . d i v r <− d i v r ( pseq )p l o t ( p s e q . d i v r ∼ pseq , x l a b = ”p ” , y l a b = ”D i v e r s i t y
Con t r i b u t i o n o f One Obse r va t i on ” , main = e x p r e s s i o n (pa s t e ( ”D i v e r s i t y : ” , −p %*% log [ 2 ] ( p ) ) ) , t ype = ” l ”)
Writing Functions I 39 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.1
0.2
0.3
0.4
0.5
Diversity: − p × log2(p)
p
Div
ersi
ty C
ontr
ibut
ion
of O
ne O
bser
vatio
n
Writing Functions I 40 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Diversity Function
Define an Entropy function that sums those values
en t ropy <− f u n c t i o n ( p ) {sum( d i v r ( p ) )
}
Calculate some test cases
en t ropy ( c (1 / 5 , 1/ 5 , 1/ 5 , 1/ 5 , 1/ 5) )
[ 1 ] 2 .321928
en t ropy ( c (3 / 5 , 1/ 5 , 1/ 5 , 0/ 5 , 0/ 5) )
[ 1 ] 1 .370951
Writing Functions I 41 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
There’s a Little Problem With This Approach
Diversity is sensitive to the number of categories8 equally likely outcomes (rep(x,y): repeats x y times.)
en t ropy ( rep (1 / 8 , 8) )
[ 1 ] 3
14 equally likely outcomes
en t ropy ( rep (1 / 14 , 14) )
[ 1 ] 3 .807355
Write it out for a 3 category case
−1
3log2(
1
3)− 1
3log2(
1
3)− 1
3log2(
1
3) = −log2(
1
3) (3)
The highest possible diversity with 3 types is −log2(13)
The highest possible diversity for N types is −log2( 1N )
Writing Functions I 42 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Might As Well Plot That
maximumEntropy <− f u n c t i o n (N) − l o g2 (1 /N)Nmax <− 15M <− 2 :Nmaxp l o t (M, maximumEntropy (M) , x l a b = ”N o f P o s s i b l e Types ” ,
y l a b = ”Maximum Po s s i b l e D i v e r s i t y ” , main = ”MaximumPo s s i b l e Entropy For N Ca t e g o r i e s ” , t ype = ”h ” , axe s =FALSE)
a x i s (1 )a x i s (2 )p o i n t s (M, maximumEntropy (M) , pch=19)
Writing Functions I 43 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Maximum Entropy as a Function of the Number of Types
Maximum Possible Entropy For N Categories
N of Possible Types
Max
imum
Pos
sibl
e D
iver
sity
2 4 6 8 10 12 14
1.0
1.5
2.0
2.5
3.0
3.5
4.0
●
●
●
●
●
●
●
●
●
●
●
●
●
●
Writing Functions I 44 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Final Result: Normed Entropy as a Diversity Summary
normedEntropy <− f u n c t i o n ( x ) en t ropy ( x ) / maximumEntropy (l e n g t h ( x ) )
Compare some cases with 4 possible outcomes
normedEntropy ( c (1 / 4 ,1 / 4 ,1 / 4 ,1 / 4) )
[ 1 ] 1
normedEntropy ( c (1 / 2 , 1/ 2 , 0 , 0) )
[ 1 ] 0 . 5
normedEntropy ( c (1 , 0 , 0 , 0) )
[ 1 ] 0
Writing Functions I 45 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How about 7 types of outcomes:
normedEntropy ( rep (1 / 7 , 7) )
[ 1 ] 1
normedEntropy ( ( 1 : 7 ) / ( sum ( 1 : 7 ) ) )
[ 1 ] 0 .9297027
normedEntropy ( c (2 / 7 , 2/ 7 , 3/ 7 , 0 , 0 , 0 , 0) )
[ 1 ] 0 .5544923
normedEntropy ( c (5 / 7 , 2/ 7 , 0 , 0 , 0 , 0 , 0) )
[ 1 ] 0 .3074497
Writing Functions I 46 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Compare 3 test cases
1 2 3 4 5 6 7 8 9 10
0.0
0.1
0.2
0.3
0.4
0.5
Normed Entropy= 0.82
0.1 0.1 0.1 0.1
0.2 0.2 0.2
0 0 0
1 2 3 4 5
0.0
0.1
0.2
0.3
0.4
0.5
Normed Entropy= 0.66
0.2
0.4 0.4
0 0
1 2 3 4 5 6 7
0.0
0.1
0.2
0.3
0.4
0.5
Normed Entropy= 0.93
0.04
0.07
0.11
0.14
0.18
0.21
0.25
Subjectively, I wrestle with the question of whether comparisonacross variables with different M is meaningful.
Writing Functions I 47 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Entropy is reported in summarize() andsummarizeFactors() in the rockchalk package
Manufacture a variable to re-produce testcase3
round ( t e s t c a s e 3 , 2 )
[ 1 ] 0 . 04 0 .07 0 .11 0 .14 0 .18 0 .21 0 .25
l i b r a r y ( r o c k ch a l k )t e s t c a s e 3 v <− f a c t o r ( c ( 1 , 2 , 2 , 3 , 3 , 3 , 4 , 4 , 4 , 4 , 5 , 5 , 5 , 5 , 5 ,
6 , 6 , 6 , 6 , 6 , 6 , 7 , 7 , 7 , 7 , 7 , 7 , 7 ) )round ( ( t a b l e ( t e s t c a s e 3 v ) / l e n g t h ( t e s t c a s e 3 v ) ) , 2)
t e s t c a s e 3 v1 2 3 4 5 6 7
0 .04 0 .07 0 .11 0 .14 0 .18 0 .21 0 .25
dat <− da t a . f r ame ( t e s t c a s e 3 v )summar i zeFacto r s ( dat )
Writing Functions I 48 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Entropy is reported in summarize() andsummarizeFactors() in the rockchalk package ...
t e s t c a s e 3 v7 : 7 .00006 : 6 .00005 : 5 .00004 : 4 .0000( A l l Others ) : 6 .0000NA ' s : 0 .0000en t ropy : 2 .6100normedEntropy : 0 .9297N :28 .0000
Writing Functions I 49 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 50 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Function Anatomy
Arguments: The Arguments of a function
someWork <− f u n c t i o n ( what1 , what2 , what3 )
what1, what2, and what3 become “local variables” inside thefunction.
named arguments must begin with letters, may includenumbers, or “ ” or “.”, but no “” or “,” or “*” or ”-”
Writing Functions I 51 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Arguments: named and unnamed R variables
someWork <− f u n c t i o n ( what1 , what2 , what3 , . . . )
In R, everything is an “object”. what1, what2, what3 can beANYTHING!
That is a blessing and a curse
Blessing: Flexibility! Let an argument be a vector, matrix, list,whatever. The function declaration does not care.Curse: Difficulty managing the infinite array of possible waysusers might break your functions.
It is your choice, whether your function should receive
2 integers (x, y), orone vector of 2 integers (x)For examples of this, look at the arguments of plot.default,arrows, segments, lines, and matplot.
Writing Functions I 52 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Peculiar spellings like . . .
Functions may have an argument “. . .”
It seems funny, but “...” is actually a word made up of threelegal characters
If the caller includes any named arguments that are notwhat1, what2, or what3, then the R runtime will put them ina list and give them to “...”
The “...” argument requires special effort.
Writing Functions I 53 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Function Arguments: R is VERY Lenient
R is lenient. Perhaps too lenient!
Arguments are not type-defined. In
myF <− f u n c t i o n ( x1 , x2 ) {
x1 could be an integer vector, a character string, a regressionmodel, or anythingFunction writer may declare default values, but need not.These are acceptable definitions
someWork <− f u n c t i o n ( what1 , what2 , what3 , what4 ,what5 )
someWork <− f u n c t i o n ( what1 = 0 , what2 = NULL , what3= c (1 , 2 , 3 ) , what4 = 3 * what1 , what5 )
someWork <− f u n c t i o n ( what1 = 0 , what2 , what3 , what4= 3 * what1 , what5 )
Writing Functions I 54 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
R is Lenient toward Users as Well
R is lenient on format of function calls
someWork (1 )someWork ( what=1, someObject )someWork ( what5 = f r ed , what4 = jim , what3 = j o e )
Not all parameters must be providedPartial argument matching
Writing Functions I 55 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
R is Lenient About Undefined Variables
This is a convenience and a curse
Variables inside functions might not be resolved the way youexpect.
If a variable is used, but not defined inside the function, R will“look outward” for it
This is “lexical scope” in action. The runtime engine searchesthrough a sequence of “frames”, ending up at the user’sworkspace (which is the Global Environment).
Writing Functions I 56 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Example of the Undefined Variable Problem
Suppose “dat” exists in your workspace.
Here is a function that will find “dat”, even if that’s not whatyou intend.
myFn <− f u n c t i o n ( x , y , z ) {dat1 <− 2 * x + 3 * yr e s <− s q r t ( dat )
}
Note my typographical error (dat where dat1 should be). Thefunction should crash, but it will not.
R will find the wrong “dat” and the result we get will beunhelpful
In my experience, this is the single most important cause ofhard-to-find flaws in user code.
Writing Functions I 57 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inside the function, Check Arguments
Check and Re-format
“argument checking”: diagnose what arguments the userprovidedFigure out if they are “correct” for what the function requires.
You choose how sympathetic you want to be toward users. Doyou want to accept incomplete input and re-format it? (Inrockchalk/R/genCorrelatedData.R, find may “re-formatter”functions like makeVec(), vech2mat()).
Writing Functions I 58 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Arguments: When to Use Defaults?
I don’t know, but . . .
As an R beginner, I took a very conservative strategy ofputting defaults on everything!
f u n c t i o n ( what1 = NULL , what2 = NULL , what3 = 3 , what4 =``a ' ' )
That way, if a user forgets to provide“what3”, then the systemwill not go looking for it.
If the defaults usually work, this is concise, easy to read.
Most functions in R base code don’t set defaults for all, oreven most, variables.
Instead, we manage arguments with more delicacy. Insistingon NULL defaults is “throwing the baby out with the bathwater.”
Writing Functions I 59 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions that Help while Checking Arguments
missing()
Inside a function, missing() is a way to ask if the usersupplied a value for an argument.
doSomething <− f u n c t i o n ( what1 , what2 , what3 , what4 ) {i f ( m i s s i n g ( what1 ) ) s top (`` you f o r g o t to s p e c i f y
what1 ' ' )}
I’ve found this conservative approach to be an error-preventer.If x is not provided, or if the user gave us a NULL, we betteradapt!
i f ( m i s s i n g ( x ) | | i s . n u l l ( x ) ) ## do something
Writing Functions I 60 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions that Help while Checking Arguments ...
Type Checks
There are many “is.” functions. Ask if x is a certain type of thing:
is.vector() TRUE or FALSE: are you a vector?
is.list() TRUE or FALSE: are you a list?
is.numeric() TRUE or FALSE: are you numeric?
You get the general idea, I hope
Writing Functions I 61 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions that Help while Checking Arguments ...
Look at all of these is. functions!
i s . a r r a y i s . l o a d e d i s . o b j e c ti s . a t om i c i s . l o g i c a l i s . o r d e r e di s . c a l l i s . m a t r i x i s . p a c k a g e v e r s i o ni s . c h a r a c t e r i s .m t s i s . p a i r l i s ti s . c omp l e x i s . n a i s . p r i m i t i v ei s . d a t a . f r am e i s . n a<− i s . q ri s . d o u b l e i s . n a . d a t a . f r am e i s . Ri s . e l em e n t i s . n a<−. d e f a u l t i s . r a s t e ri s . emp t y .mode l i s . n a<−. f a c t o r i s . r a wi s . e n v i r o nmen t i s . name i s . r e c u r s i v ei s . e x p r e s s i o n i s . n a n i s . r e l i s t a b l ei s . f a c t o r i s . n a . n um e r i c v e r s i o n i s . s i n g l ei s . f i n i t e i s . n a . POS IX l t i s . s t e p f u ni s . f u n c t i o n i s . n u l l i s . s ymb o li s . i n f i n i t e i s . n um e r i c i s . t a b l ei s . i n t e g e r i s . n ume r i c .D a t e i s . t si s . l a n g u a g e i s . n um e r i c . d i f f t i m e i s . t s k e r n e li s . l e a f i s . nume r i c .POS IX t i s . u n s o r t e di s . l i s t i s . n um e r i c v e r s i o n i s . v e c t o r
Writing Functions I 62 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions that Help while Checking Arguments ...
Size Checks
length()
nrow(), ncol() number of rows (columns) from a matrix
NROW(), NCOL() Works if input is matrix, but will treat a vectoras a one column matrix.
dim() Returns 2 dimensional vector, works with matricesand arrays
Stop, Warn
stop(), stopifnot() Ways to abend the function
warning Will return, but throws a message onto the R warninglog. User can run warnings() to see messages.
Writing Functions I 63 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The Return Has To Be Singular
When you use a function, it is necessary to “catch” the outputwith a single object name, as in
newth ing <− doubleMe (32)newth ing
[ 1 ] 64
i s . n um e r i c ( newth ing )
[ 1 ] TRUE
i s . v e c t o r ( newth ing )
[ 1 ] TRUE
We expect “doubleMe(32)” should return 64, and it does.
Writing Functions I 64 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The Return Has To Be Singular ...
The“vectorization for free”gives us a hint of what is to follow.
newth ing <− doubleMe ( c (1 , 2 , 3 , 4 , 5 ) )newth ing
[ 1 ] 2 4 6 8 10
i s . n um e r i c ( newth ing )
[ 1 ] TRUE
i s . v e c t o r ( newth ing )
[ 1 ] TRUE
R allows us to return one “thing”, but “thing” can be a rich,informative thing!
Writing Functions I 65 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Generalization. Return a list
A list may include numbers, characters, vectors ,etc
or data frames or other lists
Read code for function “lm”
Almost ALL interesting R functions return a list of elements.
Writing Functions I 66 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Check Point: revise your function
Create “myGreatFunction2” by revising “myGreatFunction”.Make it return a vector of 3 values: the maximum, theminimum, and the median.
Generate your own input data, x1, like so
x1 <− rnorm (10000 , m=7, sd=19)
After you’ve written myGreatFunction, use it:
myGreatFunct ion ( x1 )
Now stress test your function by changing x1
x1 [ c (13 , 44 , 99 , 343 , 555) ] <− NAmyGreatFunct ion ( x1 )
Writing Functions I 67 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Check Point: Run this example function
Admittedly, this is a dumb example, but . . .
This function returns a regression object
aRegMod <− f u n c t i o n ( x in1 , x in2 , yout ) { lm ( yout ∼ x i n1+ x i n2 ) }
Generate some data, run aRegMod()
dat <− r o c k c h a l k : : g enCo r r e l a t edData (N = 100 , rho = 0 .3, beta = c (1 , 1 .0 , −1.1 , 0 . 0 ) )
m1 <− with ( dat , aRegMod( x1 , x2 , y ) )
m1 is a “single object”
Run “class(m1)”, “attributes(m1)”, “summary(m1)”, ‘str(m1)”,
Writing Functions I 68 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 69 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The Unofficial Official R Style Guide
This is discussed in rockchalk vignette Rstyle
There is not much “official style guidance” from the R CoreTeam
Don’t mistake that as permission to write however you want.
There ARE very widely accepted standards for the way thatcode should look
Writing Functions I 70 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inductive Style Guide
To see that R really does have an implicitly stated Style,inspect the R source code.
The code follows a uniform pattern of indentation, the use ofwhite space, and so forth.
Recall that print.function() will be called if you type thename of a function without parentheses. That is a tidied upview of R’s internal structural represenation of a function.
. This time, lets look at “lm” inside R:
lm
Writing Functions I 71 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inductive Style Guide ...
f u n c t i o n ( formula , data , subse t , we ights , n a . a c t i o n , method= ”qr ” ,model = TRUE, x = FALSE , y = FALSE , qr = TRUE,
s i n g u l a r . o k = TRUE,c o n t r a s t s = NULL , o f f s e t , . . . )
{r e t . x <− xr e t . y <− yc l <− ma t c h . c a l l ( )mf <− ma t c h . c a l l ( e xpand .do t s = FALSE)m <− match ( c ( ”fo rmu la ” , ”data ” , ”s ub s e t ” , ”we i gh t s ” , ”
n a . a c t i o n ” ,” o f f s e t ”) , names (mf ) , 0L )
mf <− mf [ c (1L , m) ]mf$ d r o p . u n u s e d . l e v e l s <− TRUEmf [ [ 1 L ] ] <− as.name ( ”mode l . f r ame ”)mf <− e v a l (mf , p a r e n t . f r ame ( ) )i f ( method == ”mode l . f r ame ”)
r e t u r n (mf )e l s e i f ( method != ”qr ”)
Writing Functions I 72 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inductive Style Guide ...
warn ing ( g e t t e x t f ( ”method = '%s ' i s not s u ppo r t e d .Us ing ' qr ' ” ,method ) , domain = NA)
mt <− a t t r (mf , ”terms ”)y <− mode l . r e s pon s e (mf , ”numer ic ”)w <− a s . v e c t o r ( mode l .we i gh t s (mf ) )i f ( ! i s . n u l l (w) && ! i s . n um e r i c (w) )
s top ( ” ' weights ' must be a numer ic v e c t o r ”)o f f s e t <− a s . v e c t o r ( mo d e l . o f f s e t (mf ) )i f ( ! i s . n u l l ( o f f s e t ) ) {
i f ( l e n g t h ( o f f s e t ) != NROW( y ) )s top ( g e t t e x t f ( ”number o f o f f s e t s i s %d , shou ld
equa l %d ( number o f o b s e r v a t i o n s ) ” ,l e n g t h ( o f f s e t ) , NROW( y ) ) , domain = NA)
}i f ( i s . emp t y .mode l (mt) ) {
x <− NULLz <− l i s t ( c o e f f i c i e n t s = i f ( i s . m a t r i x ( y ) ) mat r i x ( ,
0 ,3) e l s e numer ic ( ) , r e s i d u a l s = y , f i t t e d . v a l u e s
= 0 *
Writing Functions I 73 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inductive Style Guide ...y , we i gh t s = w, rank = 0L , d f . r e s i d u a l = i f ( !
i s . n u l l (w) ) sum(w !=0) e l s e i f ( i s . m a t r i x ( y ) ) nrow ( y ) e l s e l e n g t h ( y )
)i f ( ! i s . n u l l ( o f f s e t ) ) {
z$ f i t t e d . v a l u e s <− o f f s e tz$ r e s i d u a l s <− y − o f f s e t
}}e l s e {
x <− mode l .ma t r i x (mt , mf , c o n t r a s t s )z <− i f ( i s . n u l l (w) )
l m . f i t ( x , y , o f f s e t = o f f s e t , s i n g u l a r . o k =s i n g u l a r . o k ,. . . )
e l s e lm .w f i t ( x , y , w, o f f s e t = o f f s e t , s i n g u l a r . o k =s i n g u l a r . o k ,
. . . )}c l a s s ( z ) <− c ( i f ( i s . m a t r i x ( y ) ) ”mlm” , ”lm ”)z$ n a . a c t i o n <− a t t r (mf , ”n a . a c t i o n ”)z$ o f f s e t <− o f f s e t
Writing Functions I 74 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Inductive Style Guide ...
z$ c o n t r a s t s <− a t t r ( x , ”c o n t r a s t s ”)z$ x l e v e l s <− . g e t X l e v e l s (mt , mf )z$ c a l l <− c lz$ terms <− mti f ( model )
z$model <− mfi f ( r e t . x )
z$x <− xi f ( r e t . y )
z$y <− yi f ( ! qr )
z$ qr <− NULLz
}<bytecode : 0 x2a512f8><environment : namespace : s t a t s>
Writing Functions I 75 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Why Neatness Counts
You can write messy code, but you can’t make anybody readit.
Finding bugs in a giant undifferentiated mass of commands isdifficult
Chance of error rises as clarity decreases
If you want people to help you, or use your code, you shouldwrite neatly!
Writing Functions I 76 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Style Fundamentals
WHITE SPACE:
indentation. R Core Team recommends 4 spacesone space around operators like <- = *
“< −” should be used for assignments. “=” was used bymistake so often by novices that the R interpreter wasre-written to allow =. However, it may still fail in some cases.
Use helpful variable names
Separate calculations into functions. Sage advice from one ofmy programming mentors:
Don’t allow a calculation to grow longer than thespace on one screen. Break it down into smaller,well defined pieces.
Writing Functions I 77 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Be Careful about line endings
Unlike C (or other languages), R does not require an “end ofline character” like “;”.
That’s convenient, but sometimes code can “fool” R intobelieving that a command is finished.
From the help page for “if”
Note that it is a common mistake to forget to putbraces (‘{ .. }’) around your statements, e.g., after‘if(..)’ or ‘for(....)’. In particular, you should nothave a newline between ‘}’ and ‘else’ to avoid asyntax error in entering a ‘if ... else’ construct at thekeyboard or via ‘source’. For that reason, one(somewhat extreme) attitude of defensiveprogramming is to always use braces, e.g., for ‘if’clauses.
Writing Functions I 78 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Be Careful about line endings ...
The else “ Policy. I strongly recommend this format:
i f ( a− l o g i c a l− c ond i t i o n ) {## if TRUE , do this
} e l s e {## if FALSE , the other thing
}
to close the previous if and begin else on same line.
“if-else” is very troublesome. If R thinks the “if” is finished, itmay not notice the else.
i f ( x > 7) y <− 3e l s e y <− 2
Causes “Error: unexpected ’else’ in “else”
When running code line-by-line, the “naked else” always causesan error.
Writing Functions I 79 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Google Doc on R Coding is Just “somebody’s” OpinionThe Easily Googled Google R Standards
Writing Functions I 80 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How To Name Functions
Don’t use names for functions that are already in widespreaduse, like lm, seq, rep, etc.
I like Objective C style variable and function names thatsmash words together, as in myRegression myCode
R uses periods in function names to represent “objectorientation” or “subclassing”, thus I avoid periods for simplepunctuation.Ex: doSomething() is better than do.something
Underscores are now allowed in function names.Ex: do something() would be OK
Writing Functions I 81 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How To Name Variables
Use clear names always
Use short names where possible
Never use names of common functions for variables
Never name a variable T or F (doing so tramples R symbols)
“Name by suffix” strategy I’m using now:
m1 <− lm ( y ∼ x , data=dat )m1sum <− summary (m1)m1v i f <− v i f (m1)m1inf <− i n f l u e n c e .m e a s u r e s (m1)
Writing Functions I 82 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Expect Some Variations in My Code
I don’t mind adding squiggly braces, even if not requiredif (whatever == TRUE) {x <- y}
Sometimes I will use 3 lines where one would do
if (whatever == TRUE){
x <- y
}
When in doubt, I like to explicitly name arguments, but notalways.
I sometimes forget to write TRUE and FALSE when T and Fwill suffice
Here’s why that’s a big deal. Suppose a some user mistakenlyredefines T or F:
T <− 7F <− myTe r r i f i c F un c t i o n
then my functions that use “T” and “F” will die.
Writing Functions I 83 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 84 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
What is this Section?
Tips from the school of hard knocks
Criticisms of R code I’ve found or written
Writing Functions I 85 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Functions replace “cut and paste” editing
If you find yourself using “copy and paste” to repeat stanzas withslight variations, you are almost certainly doing the wrongthing.
Re-conceptualize, write a function that does the right thing
Use the function over and over
Why is this important: AVOIDING mistakes due to editing mistakes
Writing Functions I 86 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Learn By Criticizing
I keep a folder of R code that troubles me.This example is a more-or-less literal translation from SAS into Rthat does not use R’s special features.
# ---------------SPECIFICATIONS------------------------ #
i t e r =1000 #how many iterations per
condition
s e t . s e e d (7913025) # set random seed
# --------------END SPECIFICATIONS--------------------- #
#n per cluster sample size
f o r ( p e r c l u s t i n c (100) ) {#number of clusters - later for MLM application
f o r ( n c l u s t i n c (1 ) ) {#common correlation
f o r ( s e t c o r r i n c ( 1 : 8 ) ) {i f ( s e t c o r r ==1) {
c o r r . 1 0 <−1c o r r . 2 0 <−0c o r r . 3 0 <−0c o r r . 4 0 <−0
Writing Functions I 87 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Learn By Criticizing ...c o r r . 5 0 <−0c o r r . 6 0 <−0c o r r . 7 0 <−0c o r r . 8 0 <−0}
i f ( s e t c o r r ==2) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−0c o r r . 4 0 <−0c o r r . 5 0 <−0c o r r . 6 0 <−0c o r r . 7 0 <−0c o r r . 8 0 <−0}
i f ( s e t c o r r ==3) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−0c o r r . 5 0 <−0c o r r . 6 0 <−0
Writing Functions I 88 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Learn By Criticizing ...c o r r . 7 0 <−0c o r r . 8 0 <−0}
i f ( s e t c o r r ==4) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−1c o r r . 5 0 <−0c o r r . 6 0 <−0c o r r . 7 0 <−0c o r r . 8 0 <−0}
i f ( s e t c o r r ==5) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−1c o r r . 5 0 <−1c o r r . 6 0 <−0c o r r . 7 0 <−0c o r r . 8 0 <−0
Writing Functions I 89 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Learn By Criticizing ...}
i f ( s e t c o r r ==6) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−1c o r r . 5 0 <−1c o r r . 6 0 <−1c o r r . 7 0 <−0c o r r . 8 0 <−0}
i f ( s e t c o r r ==7) {c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−1c o r r . 5 0 <−1c o r r . 6 0 <−1c o r r . 7 0 <−1c o r r . 8 0 <−0}
i f ( s e t c o r r ==8) {
Writing Functions I 90 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
We Learn By Criticizing ...
c o r r . 1 0 <−1c o r r . 2 0 <−1c o r r . 3 0 <−1c o r r . 4 0 <−1c o r r . 5 0 <−1c o r r . 6 0 <−1c o r r . 7 0 <−1c o r r . 8 0 <−1
}
That project defined several variables in that way, consuming 100sof lines.
Writing Functions I 91 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Quick Exercise
How would you convert that into a vector in R, without writing300 lines.Hint. You want to end up with a vector setcorr with 0 elements,all either 0 or 1.Requirement: You need some easy, flexible way to adjust thenumber of 0’s and 1’s
Writing Functions I 92 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example
Note in the following
1 the author does not declare functionsRather, treats comments ahead of blocks of code as if theywere function declarations.
2 repeated use of cat() with same file argument is an exampleof cut-and-paste coding.
# #########################################
# WRITE GENERATION CODE #
# #########################################
pathGEN <− pa s t e ( d i r r o o t , p e r c l u s t , n c l u s t , s e t c o r r , sep=”\\ ”)gen <− pa s t e ( pathGEN , ”g e n e r a t e c o r r d a t a . i n p ” , sep=”\\ ”)t e s t d a t a 1 <− pa s t e ( pathGEN , ”da t a1 . d a t ” , sep=”\\ ”)
ca t ( 'MONTECARLO: \n ' , f i l e=gen )ca t ( 'NAMES ARE x y q1−q8 ; \n ' , f i l e=gen , append=T)ca t ( 'NOBSERVATIONS = 1000 ; \n ' , f i l e=gen , append=T)
Writing Functions I 93 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example ...ca t ( ' NREPS = ' , i t e r , ' ; \n ' , f i l e=gen , append=T)ca t ( ' SEED = ' , round ( r u n i f ( 1 ) * 10000000) , ' ; \n ' , f i l e=gen ,
append=T)ca t ( ' REPSAVE=ALL ; \n ' , f i l e=gen , append=T)ca t ( ' SAVE=\n ' , pathGEN , ' \\ data * . d a t ; \n ' , f i l e=gen , append=
T, sep=””)ca t ( 'MODEL POPULATION: \n ' , f i l e=gen , append=T)ca t ( ' [ q1−q8*0 x*0 y* 0 ] ; \n ' , f i l e=gen , append=T)ca t ( 'q1−q8* 1 ; x* 1 ; y* 1 ; \n ' , f i l e=gen , append=T)ca t ( 'q1−q8 wi th q1−q8* . 50 ; \n ' , f i l e=gen , append=T)ca t ( ' x wi th y* . 50 ; \n ' , f i l e=gen , append=T)ca t ( ' x wi th q1−q8* . 50 ; \n ' , f i l e=gen , append=T)ca t ( ' y wi th q1−q8* ' , . 0+( c o r r . 1 0 * ( . 10+c o r r . 2 0 * . 10+c o r r . 3 0 *
. 10+c o r r . 4 0 *+co r r . 4 0 * . 10+c o r r . 5 0 * . 10+c o r r . 6 0 * . 10+c o r r . 7 0* . 10+c o r r . 8 0 * . 10 ) ) , ' ; \n ' , f i l e=gen , append=T, sep=””)
i f ( f i l e . e x i s t s ( t e s t d a t a 1 ) ) {} e l s e {s h e l l ( pa s t e ( ”mp lu s . e x e ” , gen , pa s t e ( pathGEN , ”s a v e 1 . o u t ” , sep
=”\\ ”) , sep=” ”) )}
Writing Functions I 94 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example ...#
---------------------------------------------------------------------
#
# #########################################
# GENERATE SAS CODE #
# #########################################
pathSAS <− pa s t e ( d i r r o o t , p e r c l u s t , n c l u s t , s e t c o r r , s e t p a t t e r n ,p e r c en tm i s s , aux , auxnumber , sep=”\\ ”)
pathSASdata <− pa s t e ( d i r r o o t , p e r c l u s t , n c l u s t , s e t c o r r ,s e t p a t t e r n , p e r c en tm i s s , sep=”\\ ”)
smcarmiss <− pa s t e ( pathSAS , ”mod i f y d a t a . s a s ” , sep=”\\ ”)t e s t d a t a 2 <− pa s t e ( pathSASdata , ”da t a1 . d a t ” , sep=”\\ ”)
# ------------------------------------------------------- #
# ------------- import SIM data into SAS ---------------- #
# ------------------------------------------------------- #
i f ( f i l e . e x i s t s ( t e s t d a t a 2 ) ) {} e l s e {
ca t ( ' proc p r i n t t o \n ' , f i l e=smcarmiss , append=T)
Writing Functions I 95 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example ...ca t ( ' l o g=”R:\\ u s e r s \\ username \\ data \\simLOG2\\LOGLOG.log ” \n
' , f i l e=smcarmiss , append=T)ca t ( ' p r i n t =”R:\\ u s e r s \\ username \\ data \\simLOG2\\ LSTLST. lst ”
\n ' , f i l e=smcarmiss , append=T)ca t ( 'new ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' run ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' \n ' , f i l e=smcarmiss , append=T)
ca t ( '%macro importMPLUS ; \n ' , f i l e=smcarmiss , append=T)ca t ( '%do i=1 %to ' , f i l e=smcarmiss , append=T)ca t ( pa s t e ( i t e r ) , f i l e=smcarmiss , append=T)ca t ( ' ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' data work .da ta&i ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' i n f i l e ' , f i l e=smcarmiss , append=T)ca t ( ' ” ' , f i l e=smcarmiss , append=T)ca t ( pa s t e ( pathGEN , ”data& i . . d a t ” , sep=”\\ ”) , f i l e=smcarmiss ,
append=T)ca t ( ' ” ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' INPUT x y q1 q2 q3 q4 q5 q6 q7 q8 ; /*<−−− insert
v a r i a b l e s he r e */ \n ' , f i l e=smcarmiss , append=T)ca t ( 'RUN; \n ' , f i l e=smcarmiss , append=T)ca t ( '%end ; \n ' , f i l e=smcarmiss , append=T)
Writing Functions I 96 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example ...ca t ( '%mend ; \n ' , f i l e=smcarmiss , append=T)ca t ( '%importMPLUS \n ' , f i l e=smcarmiss , append=T)ca t ( ' \n ' , f i l e=smcarmiss , append=T)
# #########################################
# Draw random sample of size N #
# #########################################
ca t ( '%macro s amp l e s i z e ; \n ' , f i l e=smcarmiss , append=T)ca t ( '%do i=1 %to ' , f i l e=smcarmiss , append=T)ca t ( pa s t e ( i t e r ) , f i l e=smcarmiss , append=T)ca t ( ' ; \n ' , f i l e=smcarmiss , append=T)ca t ( ' Proc s u r v e y s e l e c t data=work .da ta&i out=work . samp leda ta&
i method=SRS \n ' , f i l e=smcarmiss , append=T)i f ( p e r c l u s t ==50) {ca t ( ' samps i ze=50 \n ' , f i l e=smcarmiss , append=T)}e l s e i f ( p e r c l u s t ==75) {ca t ( ' samps i ze=75 \n ' , f i l e=smcarmiss , append=T)}e l s e i f ( p e r c l u s t ==100) {ca t ( ' samps i ze=100 \n ' , f i l e=smcarmiss , append=T)}
Writing Functions I 97 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Another “noisy code” example ...
e l s e i f ( p e r c l u s t ==200) {ca t ( ' samps i ze=200 \n ' , f i l e=smcarmiss , append=T)}e l s e i f ( p e r c l u s t ==400) {ca t ( ' samps i ze=400 \n ' , f i l e=smcarmiss , append=T)}e l s e i f ( p e r c l u s t ==800) {ca t ( ' samps i ze=800 \n ' , f i l e=smcarmiss , append=T)}e l s e i f ( p e r c l u s t ==1000) {ca t ( ' samps i ze=1000 \n ' , f i l e=smcarmiss , append=T)}ca t ( 'SEED = ' , round ( r u n i f ( 1 ) * 10000000) , ' ; \n ' , f i l e=
smcarmiss , append=T)ca t ( 'RUN; \n ' , f i l e=smcarmiss , append=T)ca t ( '%end ; \n ' , f i l e=smcarmiss , append=T)ca t ( '%mend ; \n ' , f i l e=smcarmiss , append=T)ca t ( '%samp l e s i z e \n ' , f i l e=smcarmiss , append=T)ca t ( ' \n ' , f i l e=smcarmiss , append=T)
Writing Functions I 98 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
This could be much better
Weird indentation
Use ”/”, not ”backslash”, even on Windows
Use vectors
Prolific copying and pasting of ”cat” lines.
Avoid system except where truly necessary. R has OS neutralfunctions to create directorys and such.
Writing Functions I 99 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
I found the prototype for that code in a previous project
gen <− pa s t e ( path , ”g e n e r a t e . i n p ” , sep=”\\ ”)
ca t ( 'MONTECARLO: \n ' , f i l e=gen )ca t ( 'NAMES ARE y1−y6 ; \n ' , f i l e=gen , append=T)ca t ( ' NOBSERVATIONS = ' , p e r c l u s t * nc l u s t , ' ; \n ' , f i l e=gen ,
append=T)i f ( p e r c l u s t != 7 . 5 ) {ca t ( ' NCSIZES = 1 ; \n ' , f i l e=gen , append=T)ca t ( ' CSIZES = ' , n c l u s t , ' ( ' , p e r c l u s t , ' ) ; \n ' , f i l e=gen ,
append=T)}i f ( p e r c l u s t ==7 . 5 ) {ca t ( ' NCSIZES = 2 ; \n ' , f i l e=gen , append=T)ca t ( ' CSIZES = ' , n c l u s t / 2 , ' (7 ) ' , n c l u s t / 2 , ' (8 ) ; \n ' ,
f i l e=gen , append=T)}
# user-specified iterations
ca t ( ' NREPS = ' , i t e r , ' ; \n ' , f i l e=gen , append=T)ca t ( ' SEED = 791305; \n ' , f i l e=gen , append=T)
Writing Functions I 100 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
I found the prototype for that code in a previous project ...ca t ( ' REPSAVE=ALL ; \n ' , f i l e=gen , append=T)ca t ( ' SAVE=\n ' , path , ' \\ data * . d a t ; \n ' , f i l e=gen , append=T,
sep=””)ca t ( ' ANALYSIS : TYPE = TWOLEVEL; \n ' , f i l e=gen , append=T)
ca t ( 'MODEL POPULATION: \n ' , f i l e=gen , append=T)
ca t ( '%WITHIN% \n ' , f i l e=gen , append=T)# baseline loadings are all .3 , add .4 if 'within ' = 1, but
change based on mod/strongca t ( 'FW BY y1−y2* ' , . 3+(w i t h i n * ( . 4+s t r ong * . 1 ) ) , ' \n ' , f i l e=
gen , append=T, sep=””)ca t ( ' y3−y4* ' , . 3+(w i t h i n * ( .4−mod* . 3 ) ) , ' \n ' , f i l e=gen ,
append=T, sep=””)ca t ( ' y5−y6* ' , . 3+(w i t h i n * ( .4− s t rong * . 1 ) ) , ' ; \n ' , f i l e=gen ,
append=T, sep=””)ca t ( 'FW@1; \n ' , f i l e=gen , append=T)# residuals are just 1-loading∧2
ca t ( ' y1−y2* ' , 1−( . 3+(w i t h i n * ( . 4+s t r ong * . 1 ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T, sep=””)
ca t ( ' y3−y4* ' , 1−( . 3+(w i t h i n * ( .4−mod* . 3 ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T, sep=””)
Writing Functions I 101 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
I found the prototype for that code in a previous project ...
ca t ( ' y5−y6* ' , 1−( . 3+(w i t h i n * ( .4− s t rong * . 1 ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T, sep=””)
ca t ( ' \n ' , f i l e=gen , append=T, sep=””)ca t ( '%BETWEEN% \n ' , f i l e=gen , append=T)
# the ICC bit multiplies by .053 when ICC is low , by 1 when
ICC is high
ca t ( 'FB BY y1−y2* ' , s q r t ( ( . 3+(between* ( . 4+s t r ong * . 1 ) ) )∧2*(1+( i c c * ( .053−1 ) ) ) ) , ' \n ' , f i l e=gen , append=T, sep=””)
ca t ( ' y3−y4* ' , s q r t ( ( . 3+(between* ( .4−mod* . 3 ) ) )∧2 * (1+( i c c * (.053−1 ) ) ) ) , ' \n ' , f i l e=gen , append=T, sep=””)
ca t ( ' y5−y6* ' , s q r t ( ( . 3+(between* ( .4− s t rong * . 1 ) ) )∧2 * (1+( i c c *
( .053−1 ) ) ) ) , ' ; \n ' , f i l e=gen , append=T, sep=””)ca t ( 'FB@1 ; \n ' , f i l e=gen , append=T)
#residuals are total variance (1 or .053 , depending on ICC)
loading∧2
ca t ( ' y1−y2* ' , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* ( . 4+s t r ong *
. 1 ) ) )∧2* (1+( i c c * ( .053−1 ) ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T, sep=””)
Writing Functions I 102 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
I found the prototype for that code in a previous project ...
ca t ( ' y3−y4* ' , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* ( .4−mod* . 3 )) )∧2 * (1+( i c c * ( .053−1 ) ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T,sep=””)
ca t ( ' y5−y6* ' , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* ( .4− s t rong *
. 1 ) ) )∧2 * (1+( i c c * ( .053−1 ) ) ) )∧2 , ' ; \n ' , f i l e=gen , append=T, sep=””)
ca t ( ' \n ' , f i l e=gen , append=T, sep=””)
#run the above syntax using Mplus
#shell (paste ("cd", mplus , sep =" "))
s h e l l ( pa s t e ( ”mp lu s . e x e ” , gen , pa s t e ( path , ”s a v e . o u t ” , sep=”\\”) , sep=” ”) )
Writing Functions I 103 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Here was my Suggestion
Create separate functions to do separate parts of the work.Avoid so much “cut and paste” coding. The cat function caninclude MANY separate quoted strings or values, there is no reasonto write separate cat statements for each line.Here was my suggestion for part of the revision
##Create one MPlus Input file corresponding to following
parameters.
c r e a t e I n p F i l e <− f u n c t i o n ( path=”apath ” , gen=”a f i l e n am e . i n p ”, p e r c l u s t =2, n c l u s t =100 , i t e r =1000 , mod=1, s t r ong =1,between=1, w i t h i n =1){ca t ( ”MONTECARLO:
NAMES ARE y1−y6 ;NOBSERVATIONS = ” , p e r c l u s t * nc l u s t , ” ; \n ” ,
i f e l s e ( p e r c l u s t != 7 . 5 ,pa s t e ( ”NCSIZES = 1 ; \n CSIZES =” , n c l u s t ,
”( ” , p e r c l u s t , ”) ;\ n ”) ,pa s t e ( ” NCSIZES = 2 ; \n CSIZES = ” ,
n c l u s t / 2 , ” (7 ) ” , n c l u s t /2 , ” (8 ) ; \n ”) ) , f i l e=gen , append=T,
Writing Functions I 104 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Here was my Suggestion ...sep=””)
## user-specified iterations
ca t ( ”NREPS = ” , i t e r , ” ;SEED = 791305;REPSAVE=ALL ;SAVE=” , path , ”\\ data * . d a t ;ANALYSIS : TYPE = TWOLEVEL;MODEL POPULATION:%WITHIN% \n ” , f i l e=gen , append=T, sep=””)
## baseline loadings are all .3 , add .4 if "within" =
1, but change based on mod/strong
ca t ( ”FW BY y1−y2* ” , . 3+(w i t h i n * ( . 4+s t r ong * . 1 ) ) ,”y3−y4* ” , . 3+(w i t h i n * ( .4−mod* . 3 ) ) ,”y5−y6* ” , . 3+(w i t h i n * ( .4− s t rong * . 1 ) ) ,”FW@1; \n ” ,”y1−y2* ” , 1−( . 3+(w i t h i n * ( . 4+s t r ong * . 1 ) ) )∧2 , ” ; \n ” ,”y3−y4* ” , 1−( . 3+(w i t h i n * ( .4−mod* . 3 ) ) )∧2 , ” ; \n ” ,”y5−y6* ” , 1−( . 3+(w i t h i n * ( .4− s t rong * . 1 ) ) )∧2 , ” ; \n ” ,” %BETWEEN% ” ,
Writing Functions I 105 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Here was my Suggestion ...
”FB BY y1−y2* ” , s q r t ( ( . 3+(between* ( . 4+s t r ong * . 1 ) ) )∧
2* (1+( i c c * ( .053−1 ) ) ) ) ,”y3−y4* ” , s q r t ( ( . 3+(between* ( .4−mod* . 3 ) ) )∧2 * (1+(
i c c * ( .053−1 ) ) ) ) ,”y5−y6* ” , s q r t ( ( . 3+(between* ( .4− s t rong * . 1 ) ) )∧2 *
(1+( i c c * ( .053−1 ) ) ) ) , ” ; FB@1 ; \n ” ,”y1−y2* ” , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* ( . 4+
s t r ong * . 1 ) ) )∧2* (1+( i c c * ( .053−1 ) ) ) )∧2 , ” ; ” ,”y3−y4* ” , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* (
.4−mod* . 3 ) ) )∧2 * (1+( i c c * ( .053−1 ) ) ) )∧2 , ” ; ” ,”y5−y6* ” , 1+( i c c * ( .053−1 ) ) −sqrt ( ( . 3+(between* (
.4− s t rong * . 1 ) ) )∧2 * (1+( i c c * ( .053−1 ) ) ) )∧2 , ” ; ” ,”\n ” , f i l e=gen , append=T, sep=””)
}
Writing Functions I 106 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Critique that!
Benefit of re-write is isolation of code writing into a separatefunction
We need to work on cleaning up use of “cat” to write files.
Writing Functions I 107 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Outline
1 Introduction
2 Write Functions!
3 Example: Calculate Entropy
4 Arguments and Returns
5 R Style
6 Writing Better R Code
7 Object Oriented Programming
Writing Functions I 108 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Object Oriented Programming
A re-conceptualization of programming to reduce programmererror
OO includes a broad set of ideas, only a few of which aredirectly applicable to R programming
The “rise” to pre-eminance of OO indicated by the
introduction of object frameworks in existing languages (C++,Objective-C)growth of wholly new object-oriented languages (Java)
Writing Functions I 109 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Decipher R OO by Intuition
Run the command
methods ( p r i n t )
What do you see? I see 170 lines, like so:
[ 1 ] p r i n t . a c f *[ 2 ] p r i n t . a n o v a[ 3 ] p r i n t . a o v *
[ 4 ] p r i n t . a o v l i s t *. . .[ 1 6 7 ] p r i n t . w a r n i n g s[ 1 6 8 ] p r i n t . x g e t t e x t *[ 1 6 9 ] p r i n t . x n g e t t e x t *[ 1 7 0 ] p r i n t . x t a b s *
Non−v i s i b l e f u n c t i o n s a r e a s t e r i s k e d
Writing Functions I 110 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
170 print.??? Methods.
Yes: there are really 170 print “methods”
No: the R team does not expect or want you to know all ofthem. Users just run
p r i n t ( x )
Try not to worry about “how” the printing is achieved.
Yes: R team wants package writers to create specialized printmethods to control presentation of their output for the objectthey create.
Writing Functions I 111 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
The R Runtime System Handles the Details
The user runs
p r i n t ( x )
The R runtime system1 notices that x is of a certain type, say “classOfx”2 and then the runtime system uses print.classOfX to handle the
user’s request
Writing Functions I 112 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
print is a “Generic Function”
Definition of “Generic Function”: the function that users call whichcauses an object-specific function to be called for the work to getdone. Examples:“print”, “plot”, “summary”, “anova”
Generic Function is terminology unique to R(AFAIK)
In the standard case, a generic function does not do any work.It sends the work to the appropriate “implementation” in amethod.
“A standard generic function does nocomputation other than dispatching a method, but Rgeneric functions can do other coumputations as wellbefore and/or after method dispatch”(Chambers,Software for Data Analysis, p. 398)
Writing Functions I 113 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
print is a “Generic Function” ...
UseMethod() is the function that declares a function asgeneric: The R runtime system is warned to “be alert” tousage of the function.
Example: the print generic function from R source (basepackage).
p r i n t <− f u n c t i o n ( x , . . . ) UseMethod ( ”p r i n t ”)
Example: the plot generic function from R source (graphicspackage).
p l o t <− f u n c t i o n ( x , y , . . . ) UseMethod ( ”p l o t ”)
Writing Functions I 114 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Here’s Where R Gains its Analytical Power
The generic is just a place holder. User runs print(x), then Rknows it is supposed to ask x for its class and then theappropriate thing is supposed to happen. No Big Deal.
But the statisticians in the S & R projects saw enormoussimplifying potential in developing a battery is standardgeneric accessor functions
summary()anova()predict()plot()aic()
Writing Functions I 115 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Object
Object: self-contained “thing”. A container for data.
Operationally, in R: just about anything on the left hand sidein an assignment “<-”
Each “thing” in R carries with it enough information so thatgeneric functions “know what to do.”.
If there is no function specific to an object, the work is sent toa default method (see print.default).
Writing Functions I 116 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Class
Definition: As far as R with S3 is concerned, class is acharacteristic label assigned to an object (an object can have avector of classes, such as c(“lm”, “glm”)).
The class information is used by R do decide which methodshould be used to fulfill the request.
Run class(x), ask x what class it inherits from.
In R, the principal importance of the “class” of an object isthat it is used to decide which function should be used tocarry out the required work when a generic function is used.
Classes called “numeric”, “integer”, “character” are all vectorclasses
Writing Functions I 117 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Class ...
> y <− c (1 , 10 , 23)> c l a s s ( y )[ 1 ] ”numer ic ”> x <− c ( ”a ” , ”b ” , ”c ”)> x[ 1 ] ”a ” ”b ” ”c ”> c l a s s ( x )[ 1 ] ”c h a r a c t e r ”> x <− f a c t o r ( x )> c l a s s ( x )[ 1 ] ” f a c t o r ”> m1 <− lm ( y ∼ x )> c l a s s (m1)[ 1 ] ”lm ”> m2 <− glm ( y ∼ x , f am i l y=Gamma)> c l a s s (m2)[ 1 ] ”glm ” ”lm ”
Writing Functions I 118 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Method, a.k.a, “Method Function”
Definition: The “implementation”: the function that does the workfor an object of a particular type.
When the user runs print(m1), and m1 is from class “lm”, thework is sent to a method print.lm()
Methods are always named in the format “generic.class”, suchas “print.default”, “print.lm”, etc.
Note: Most methods do not “double-check” whether theobject they are given is from the proper class. They count inR’s runtime system to check and then call print.whatever forobejcts of type whatever
That’s why many methods are“hidden” (can only access via :::notation)
Accessing methods directly
Writing Functions I 119 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Method, a.k.a, “Method Function” ...
If a method is “exported”, can be called directly via“package::method.class()” format
If a package is “attached” to the search path, then“method.class()” will suffice, but is not as clear
If a method is NOT exported, then user can reach into thepackage and grab it by running “package:::method.class()”
Writing Functions I 120 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Detour: attributes() Function and Confusing Output
The class is stored as an attribute in many object types. Runattributes()
> a t t r i b u t e s ( x )$ l e v e l s[ 1 ] ”a ” ”b ” ”c ”
$ c l a s s[ 1 ] ” f a c t o r ”
> a t t r i b u t e s (m1)$names[ 1 ] ” c o e f f i c i e n t s ” ” r e s i d u a l s ” ” e f f e c t s ” ”rank ”[ 5 ] ” f i t t e d . v a l u e s ” ”a s s i g n ” ”qr ” ”
d f . r e s i d u a l ”[ 9 ] ”c o n t r a s t s ” ” x l e v e l s ” ” c a l l ” ”terms ”
[ 1 3 ] ”model ”
$ c l a s s[ 1 ] ”lm ”
> a t t r i b u t e s ( y )
Writing Functions I 121 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Detour: attributes() Function and Confusing Output ...
NULL> i s . o b j e c t ( y )[ 1 ] FALSE
puzzle: why has y no attribute? Why is it not an object?
Honestly, I’m baffled, I thought ”everything in R is an object.”(And I still do.)
If the object does not have a class attribute, it has animplicit class, “matrix”, “array” or the result of “mode(x)”(except that integer vectors have the implicit class“integer”). (from ?class in R-2.15.1)
Writing Functions I 122 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How Objects get “into” Classes
In older S3 terminology, user is allowed to simply claim that xis from one or more classes
c l a s s ( x ) <− c (`` lm ' ' , ``glm ' ' , c l a s s ( x ) )
That would say x’s class includes “lm” and “glm” as newclasses, and also would keep x’s old classes as well.
The class is an attribute, can be set thusly
a t t r ( x , `` c l a s s ' ' ) <− c (`` lm ' ' , ``whateve r ISay ' ' )
When a generic method “run” is called with x, the R runtimewill first try to use run.lm. If run.lm is not found, thenrun.whateverISay will be tried, and if that fails, it falls back torun.default.
Writing Functions I 123 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How Objects get “into” Classes: S4
S4 has more structure, makes classes & methods work morelike truly object oriented programs.
S4 classes are defined with a list of variables BEFORE objectsare created.
Variables are typed!
Example imitates Matloff, p. 223
s e t C l a s s ( ”p j f r i e n d ” , r e p r e s e n t a t i o n (name=”c h a r a c t e r ” ,gender=”f a c t o r ” ,food=”f a c t o r ” ,age=” i n t e g e r ”) )
Create an instance of class pjfriend (Note: to declare aninteger, add letter “L” to end of number).
Writing Functions I 124 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How Objects get “into” Classes: S4 ...
w i l l i am <− new( ”p j f r i e n d ” , name = ”w i l l i am ” , gender =f a c t o r ( ”male ”) , food=f a c t o r ( ”p i z z a ”) , age=33L)
w i l l i am
An ob j e c t o f c l a s s ”p j f r i e n d ”S l o t ”name ”:[ 1 ] ”w i l l i am ”
S l o t ”gender ” :[ 1 ] maleL e v e l s : male
S l o t ”food ” :[ 1 ] p i z z aL e v e l s : p i z z a
S l o t ”age ” :[ 1 ] 33
Writing Functions I 125 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How Objects get “into” Classes: S4 ...j a n e <− new( ”p j f r i e n d ” , name=”pumpkin ” , gender =
f a c t o r ( ”f ema l e ”) , food=f a c t o r ( ”hamburger ”) , age=21L)
j an e
An ob j e c t o f c l a s s ”p j f r i e n d ”S l o t ”name ”:[ 1 ] ”pumpkin ”
S l o t ”gender ” :[ 1 ] f ema l eL e v e l s : f ema l e
S l o t ”food ” :[ 1 ] hamburgerL e v e l s : hamburger
S l o t ”age ” :[ 1 ] 21
jane and william are instances of class “pjfriend”
Writing Functions I 126 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
How Objects get “into” Classes: S4 ...
The variables inside an S4 object are called “slots” in R
“slot” would be called “instance variable” in mostOO-languages)
values in slots can be retrieved with symbol @, not $
Writing Functions I 127 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Implement an S4 method
Step 1. Write a function that can receive a function of type“pjfriend” and do something with it.
Step 2. Use setMethod to tell the R system that the functionimplements the method that is called for.
setMethod “wraps” a function.
setMethod ( ”some−generic− function−name ” , ”p j f r i e n d ” ,f u n c t i o n ( x ) {
#do something with x
})
Writing Functions I 128 / 129 University of Kansas
Introduction Write Functions! Example: Calculate Entropy Arguments and Returns R Style Writing Better R Code Object Oriented Programming
Difficult to Account For Changes between S3 and S4
I think it is difficult to explain some of the notational andterminological changes between S3 and S4.
If you type an S4 object’s name on the command line
> x
the R runtime looks for a method “show.class” (where class isthe class of x).
Why change from “print” to “show”? (IDK)
Why change the “accessor” symbol from $ to @ ?
Why call things accessed with @ “slots” rather than instancevariables?
Writing Functions I 129 / 129 University of Kansas