Poetry with R -- Dissecting the code


Upload: peter-solymos

Post on 27-Jan-2015


DESCRIPTION

ERUG meeting -- December 11, 2012

TRANSCRIPT

Page 1: Poetry with R -- Dissecting the code

introduction()

# Poetry is considered a form of literary art in which
# language is used for its aesthetic and evocative qualities. It
# contains multiple interpretations and therefore resonates
# differently in each reader.
#
# Code is the language used to communicate with computers. It has its
# own rules (syntax) and meaning (semantics). Like literature writers
# or poets, coders also have their own style, which includes strategies
# for optimizing the code being read by a computer, and facilitating
# its understanding through visual organization and comments for other
# coders.
#
# Code can speak literature, logic, maths. It contains different
# layers of abstraction and it links them to the physical world of
# processors and memory chips. All these resources can contribute to
# expanding the boundaries of contemporary poetry by using code as a
# new language. Code to speak about life or death, love or hate. Code
# meant to be read, not run.

url("http://code-poems.com")

Page 2: Poetry with R -- Dissecting the code

R version 2.15.2 (2012-10-26) -- "Trick or Treat"

Copyright (C) 2012 The R Foundation for Statistical Computing

ISBN 3-900051-07-0

Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.

You are welcome to redistribute it under certain conditions.

Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.

Type 'contributors()' for more information and

'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or

'help.start()' for an HTML browser interface to help.

Type 'q()' to quit R.

>

Page 3: Poetry with R -- Dissecting the code

> print("Hello World")

[1] "Hello World"

>

A character vector of length 1 (its mode and its type, see typeof(), come with it).
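The annotation can be checked directly at the prompt. This is a minimal sketch; the variable name `hello` is introduced here only for illustration and is not part of the talk.

```r
# "Hello World" is not a special string type, just a character vector
hello <- "Hello World"

length(hello)   # number of elements in the vector
mode(hello)     # mode of the object
typeof(hello)   # internal type of the object
nchar(hello)    # number of characters in the single element

# A longer character vector has the same mode and type
c(hello, "Goodbye")
```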

Page 4: Poetry with R -- Dissecting the code

> print

function (x, ...)

UseMethod("print")

<bytecode: 0x0000000010a7a2c8>

<environment: namespace:base>

>

Page 5: Poetry with R -- Dissecting the code

The <bytecode: ...> line means that the function

is byte-compiled, not interpreted from its source

-- thus faster, but not for today…

Page 6: Poetry with R -- Dissecting the code

<environment: namespace:base> is the environment where

the function is defined. Not for now…

Page 7: Poetry with R -- Dissecting the code

> formals("print")

$x

$...

> ?print

starting httpd help server ... done

>

formals() gives the arglist; help (?print) also helps. x is the input object.

print prints its argument and returns it invisibly (via invisible(x)).

It is a generic function which means that new printing methods can be easily added for new classes.
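What "new printing methods can be easily added for new classes" looks like in practice can be sketched as follows. The class name `poem` and its constructor are hypothetical, made up for this illustration; only `print`, `UseMethod` dispatch, `cat` and `invisible` come from R itself.

```r
# Constructor for a hypothetical S3 class "poem" (not part of R or the talk)
poem <- function(lines) {
    structure(list(lines = lines), class = "poem")
}

# print() dispatches here via UseMethod("print") for objects of class "poem"
print.poem <- function(x, ...) {
    cat(x$lines, sep = "\n")
    invisible(x)   # return the argument invisibly, as the help page describes
}

p <- poem(c("Code meant to be read,", "not run."))
p                  # autoprinting calls print(), which calls print.poem()
out <- print(p)    # the invisible return value can still be assigned
identical(out, p)
```

The method name follows the pattern generic.class, which is all the dispatch mechanism needs to find it.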

Page 8: Poetry with R -- Dissecting the code

Generic function: UseMethod("print") dispatches on the class of x.

Page 9: Poetry with R -- Dissecting the code

summary(so_far)

# o R is an interpreted language, but
#   byte compiling is possible (see
#   compiler package).
# o In the background, everything is
#   about environments (which are
#   similar to lists), but luckily, this
#   is hidden from the average user.
# o Everything is an object -- OO.
# o Objects come in classes.
# o Methods can be defined for objects.
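The first two points of the summary can be tried out in a few lines. This is only a sketch: the function `square` and the environment `e` are made-up names, while `cmpfun()` comes from the compiler package mentioned above.

```r
library(compiler)   # ships with R

# A hypothetical function used only for illustration
square <- function(x) x^2
square_c <- cmpfun(square)   # byte-compiled copy of the same function

square_c(4)                  # same result as square(4)

# Environments behave a bit like lists: names map to values
e <- new.env()
assign("a", 1, envir = e)
e$b <- 2                     # list-like access works too
sort(ls(e))
mget(c("a", "b"), envir = e)

# Every closure remembers the environment where it was defined
environment(square)
environmentName(environment(print))
```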

Page 10: Poetry with R -- Dissecting the code

> set.seed(1234)

> x <- runif(10)

> y <- 2 + 5 * x + rnorm(10)

> plot(x, y)

>

Random numbers are important.

runif is r + unif (random uniform), not "run if".

rnorm is r + norm (random normal).

plot is another generic function.
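Why the session starts with set.seed() can be shown with a short sketch. The object names `a`, `b` and `c1` are arbitrary and chosen only for this illustration.

```r
set.seed(1234)
a <- runif(3)      # random uniform numbers between 0 and 1

set.seed(1234)     # reset the generator to the same state
b <- runif(3)

identical(a, b)    # same seed, same numbers

c1 <- runif(3)     # no reset: the random stream simply continues
identical(a, c1)

# rnorm() takes a mean and a standard deviation; the defaults are 0 and 1
rnorm(2, mean = 10, sd = 0.1)
```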

Page 11: Poetry with R -- Dissecting the code

> (n <- cor.test(x, y))

	Pearson's product-moment correlation

data: x and y
t = 5.9327, df = 8, p-value = 0.0003487
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6325328 0.9770136
sample estimates:
      cor
0.9026646

>

Parentheses around the assignment: print in short.
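The effect of the parentheses can be seen on a smaller example. A minimal sketch, with `z` as an arbitrary variable name:

```r
z <- 1:3        # plain assignment: nothing is printed
(z <- 1:3)      # wrapped in parentheses: the assigned value is printed

# withVisible() reports whether a value would be printed at the prompt
withVisible(z <- 1:3)$visible
withVisible((z <- 1:3))$visible
```

An assignment returns its value invisibly; the parentheses make that value visible again, so it is printed as if print() had been called.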

Page 12: Poetry with R -- Dissecting the code

> class(n)

[1] "htest"

>

Page 13: Poetry with R -- Dissecting the code

> str(n)

List of 9
 $ statistic  : Named num 5.93
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named int 8
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 0.000349
 $ estimate   : Named num 0.903
  ..- attr(*, "names")= chr "cor"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "correlation"
 $ alternative: chr "two.sided"
 $ method     : chr "Pearson's product-moment correlation"
 $ data.name  : chr "x and y"
 $ conf.int   : atomic [1:2] 0.633 0.977
  ..- attr(*, "conf.level")= num 0.95
 - attr(*, "class")= chr "htest"

>

It is a list, nothing more.

Attributes are important.

Class is an attribute.
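The three remarks above can be verified on the same object. The sketch below rebuilds `n` from the session so that it runs on its own; the name `plain` is introduced only for illustration.

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
n <- cor.test(x, y)

is.list(n)                       # the htest object is stored as a list
attr(n, "class")                 # the class is just an attribute
n$estimate                       # components are reached like any list element
attr(n$conf.int, "conf.level")   # components can carry attributes too

plain <- unclass(n)              # drop the class attribute
class(plain)                     # an ordinary list, so print.htest is no longer used
names(plain)
```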

Page 14: Poetry with R -- Dissecting the code

> methods(class="htest")

[1] print.htest*

Non-visible functions are asterisked

>

Page 15: Poetry with R -- Dissecting the code

> print.htest

Error: object 'print.htest' not found

>

Page 16: Poetry with R -- Dissecting the code

> getAnywhere("print.htest")

A single object matching ‘print.htest’ was found

It was found in the following places

registered S3 method for print from namespace stats

namespace:stats

with value

function (x, digits = 4, quote = TRUE, prefix = "", ...)

{

cat("\n")

Gimme that damn thing!

Page 17: Poetry with R -- Dissecting the code

    cat(strwrap(x$method, prefix = "\t"), sep = "\n")
    cat("\n")
    cat("data: ", x$data.name, "\n")
    out <- character()
    if (!is.null(x$statistic))
        out <- c(out, paste(names(x$statistic), "=", format(round(x$statistic,
            4))))
    if (!is.null(x$parameter))
        out <- c(out, paste(names(x$parameter), "=", format(round(x$parameter,
            3))))
    if (!is.null(x$p.value)) {
        fp <- format.pval(x$p.value, digits = digits)
        out <- c(out, paste("p-value", if (substr(fp, 1L, 1L) ==
            "<") fp else paste("=", fp)))
    }
    cat(strwrap(paste(out, collapse = ", ")), sep = "\n")
    if (!is.null(x$alternative)) {
        cat("alternative hypothesis: ")
        if (!is.null(x$null.value)) {
            if (length(x$null.value) == 1L) {
                alt.char <- switch(x$alternative, two.sided = "not equal to",
                  less = "less than", greater = "greater than")
                cat("true", names(x$null.value), "is", alt.char,
                  x$null.value, "\n")
            }
            else {
                cat(x$alternative, "\nnull values:\n")
                print(x$null.value, ...)
            }
        }
        else cat(x$alternative, "\n")
    }

Page 18: Poetry with R -- Dissecting the code

    if (!is.null(x$conf.int)) {
        cat(format(100 * attr(x$conf.int, "conf.level")), "percent confidence interval:\n",
            format(c(x$conf.int[1L], x$conf.int[2L])), "\n")
    }
    if (!is.null(x$estimate)) {
        cat("sample estimates:\n")
        print(x$estimate, ...)
    }
    cat("\n")
    invisible(x)
}

<bytecode: 0x0000000010f7a3e0>

<environment: namespace:stats>

>

Return the value invisibly.

Defined in stats pkg.
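Besides getAnywhere(), there are two other common ways to reach a method that is registered but not visible at the prompt. A short sketch, assuming the method lives in the stats namespace as the output above shows; the names `f1` and `f2` are arbitrary.

```r
# In the session above, typing print.htest failed: the method is not exported
f1 <- getS3method("print", "htest")   # look up the registered S3 method
f2 <- stats:::print.htest             # reach into the namespace with :::

is.function(f1)
is.function(f2)
environmentName(environment(f1))      # the namespace where it is defined
names(formals(f1))                    # its argument list
```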

Page 19: Poetry with R -- Dissecting the code

> m <- lm(y ~ x)

>

lm: linear regression (linear model).

y ~ x: a formula; let's get back to this later.

Page 20: Poetry with R -- Dissecting the code

> m

Call:

lm(formula = y ~ x)

Coefficients:

(Intercept)            x
      1.740        4.857

>

This makes life so much easier,

talk about it next year.

Page 21: Poetry with R -- Dissecting the code

> names(m)

 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"

>


Page 22: Poetry with R -- Dissecting the code

> print.lm

function (x, digits = max(3, getOption("digits") - 3), ...)
{
    cat("\nCall:\n", paste(deparse(x$call), sep = "\n", collapse = "\n"),
        "\n\n", sep = "")
    if (length(coef(x))) {
        cat("Coefficients:\n")
        print.default(format(coef(x), digits = digits), print.gap = 2,
            quote = FALSE)
    }
    else cat("No coefficients\n")
    cat("\n")
    invisible(x)
}

<bytecode: 0x0000000010542380>

<environment: namespace:stats>

>
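The `digits` argument in the listing above is reached through the generic: anything passed to print() beyond `x` travels via `...` to the method. A minimal sketch that rebuilds the model from the session; `m2` is an arbitrary name.

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(y ~ x)

print(m)               # default number of digits
print(m, digits = 6)   # digits is passed through the generic to print.lm

# print.lm returns its argument invisibly, so the value can be reused
m2 <- print(m)
identical(m2, m)
```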

Page 23: Poetry with R -- Dissecting the code

> (s <- summary(m))

Call:

lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-0.7372 -0.4189 -0.2076  0.2832  1.2928

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.7401     0.4539   3.834 0.004990 **
x             4.8571     0.8187   5.933 0.000349 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6751 on 8 degrees of freedom

Multiple R-squared: 0.8148, Adjusted R-squared: 0.7917

F-statistic: 35.2 on 1 and 8 DF, p-value: 0.0003487

>

Page 24: Poetry with R -- Dissecting the code

> coef(m)

(Intercept)           x
   1.740131    4.857122

>

Page 25: Poetry with R -- Dissecting the code

> coef(s)

            Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 1.740131  0.4538773 3.833924 0.0049900329
x           4.857122  0.8186989 5.932733 0.0003486669

>
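The two calls to coef() return differently structured results because the method is chosen by the class of the argument. A sketch of how each result can be used, with the model rebuilt from the session; `cm` and `cs` are arbitrary names.

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(y ~ x)
s <- summary(m)

cm <- coef(m)     # named numeric vector
cs <- coef(s)     # numeric matrix, one row per coefficient

is.matrix(cm)
is.matrix(cs)

cs["x", "Std. Error"]             # pick a single cell by row and column name
all.equal(cs[, "Estimate"], cm)   # the estimates agree with coef(m)

m$coefficients                    # direct list access gives the same vector
```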

Page 26: Poetry with R -- Dissecting the code

> class(s)

[1] "summary.lm"

>

Page 27: Poetry with R -- Dissecting the code

> getAnywhere("print.summary.lm")

A single object matching ‘print.summary.lm’ was found

It was found in the following places

registered S3 method for print from namespace stats

namespace:stats

with value

function (x, digits = max(3, getOption("digits") - 3), symbolic.cor = x$symbolic.cor,
    signif.stars = getOption("show.signif.stars"), ...)
{
    cat("\nCall:\n", paste(deparse(x$call), sep = "\n", collapse = "\n"),
        "\n\n", sep = "")
    resid <- x$residuals
    df <- x$df
    rdf <- df[2L]
    cat(if (!is.null(x$weights) && diff(range(x$weights)))

Page 28: Poetry with R -- Dissecting the code

print(summary(presentation))

# o Print method is simplest.
# o It conveys meaning to user.
# o Results are usually structured in
#   different ways,
# o need methods to access information:

> methods(class="lm")

 [1] add1.lm*           alias.lm*          anova.lm
 [4] case.names.lm*     confint.lm*        cooks.distance.lm*
 [7] deviance.lm*       dfbeta.lm*         dfbetas.lm*
[10] drop1.lm*          dummy.coef.lm*     effects.lm*
[13] extractAIC.lm*     family.lm*         formula.lm*
[16] hatvalues.lm       influence.lm*      kappa.lm
[19] labels.lm*         logLik.lm*         model.frame.lm
[22] model.matrix.lm    nobs.lm*           plot.lm
[25] predict.lm         print.lm           proj.lm*
[28] qr.lm*             residuals.lm       rstandard.lm
[31] rstudent.lm        simulate.lm*       summary.lm
[34] variable.names.lm* vcov.lm*
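A few of the methods in this list, called through their generics, show what "methods to access information" means. This sketch rebuilds the model from the session; the new x values passed to predict() are made up for illustration.

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(y ~ x)

res <- residuals(m)    # dispatches to residuals.lm
fit <- fitted(m)       # fitted values

all.equal(unname(fit + res), y)   # fitted values plus residuals give back y

ci <- confint(m)       # dispatches to confint.lm: 95% intervals by default
ci

# predict.lm with hypothetical new x values
pr <- predict(m, newdata = data.frame(x = c(0.25, 0.75)))
pr
```

Calling the generic (residuals, confint, predict) rather than the method by name also works for the methods marked with an asterisk, which cannot be typed directly.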

Page 29: Poetry with R -- Dissecting the code

> if (questions) {

+ answer(questions)

+ } else q("no")

# ERUG meeting -- December 11, 2012

# Poetry with R -- Dissecting the code

# P. Solymos