Down the (Clojure) Rabbit Hole FP Days 2014 @cgrand

Upload: cgrand

Post on 13-Jul-2015

Who am I?

• Independent software dev

• Early adopter (6+ years)

• Authored a couple of libs

• Co-authored Clojure Programming

Where I come from

• Growing pains: Logo, 8-bit BASIC, Assembly, Pascal, C/C++

• Enlightenment: Caml Light

• Dark ages: Enterprise Java

A New Hope

• Wait, there’s more to Javascript than copy-pasting incantations!?

• 1st class functions? Party on!

It can’t go wrong…

• Concurrent server-side Javascript

• Left me with a lot of scar tissue

Wish list

• Functional

• Dynamic

• JVM

• Meta-programming

• Concurrency story

Wish list

✓ Functional

✓ Dynamic

✓ JVM

✓ Meta-programming

✓ Concurrency story

When the pupil is ready to learn,

a teacher will appear.

Parentheses!

Pragmatic Parentheses!

• Don’t commit to a syntax

• Don’t spend time budget on syntax

• Open to extension

• Metaprogramming and serialization for free!
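A minimal sketch of what "for free" can mean here, using only `clojure.core` and `clojure.edn`: source text reads into ordinary lists and symbols that can be inspected, rewritten and evaluated, and plain values print to text that reads back as an equal value. The names `form`, `rewritten`, `config` and `round-tripped` are illustrative.

```clojure
(require '[clojure.edn :as edn])

;; Code is data: reading source text yields an ordinary list.
(def form (read-string "(+ 1 2)"))

(list? form)   ; the form is a list
(first form)   ; whose head is the symbol +
(eval form)    ; evaluating the list runs the code

;; Metaprogramming is list manipulation: swap the operator, then evaluate.
(def rewritten (cons '* (rest form)))
(eval rewritten)

;; Serialization: a value prints as text that reads back as an equal value.
(def config {:name "demo" :tags #{:a :b} :sizes [1 2 3]})
(def round-tripped (edn/read-string (pr-str config)))
```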

Immutability

Persistent Data Structures

• More than immutable

• Want! for years

• But didn’t know the word to look for
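A small sketch of "more than immutable", assuming nothing beyond `clojure.core`: an update returns a new value, the old one stays valid, and the parts that did not change are shared instead of copied. The names `v1`, `v2`, `big` and `bumped` are illustrative.

```clojure
(def v1 [1 2 3])
(def v2 (conj v1 4))   ; a new vector; v1 is still [1 2 3]

(def big {:payload (vec (range 1000)) :version 1})
(def bumped (assoc big :version 2))

;; The untouched :payload is the very same object in both maps.
(identical? (:payload big) (:payload bumped))
```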

What is immutable with log ops?

Numbers!

Okasaki and base-10 cubes

• Persistent Data Structures as numeral systems

• Base-10 cubes

Collections as numbers

• inc is cons

• random access list

• elementary school material!

• log472
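A sketch of "inc is cons" in the spirit of Okasaki's binary random-access lists, not code from the talk: the list is a sequence of binary digits, least significant first, where a 0 digit is `nil` and a 1 digit at position i is a tree holding 2^i elements. Adding an element is then binary increment with carry. The names `carry`, `rcons` and `shape` are hypothetical, and lookup is left out to keep the sketch short.

```clojure
(defn- carry
  "Adds a tree of the current rank to digits, like adding 1 with carry."
  [digits tree]
  (let [[d & more] digits]
    (cond
      (empty? digits) (list tree)                         ; no digits left: new top digit
      (nil? d)        (cons tree more)                    ; 0 + 1 = 1
      :else           (cons nil (carry more [tree d]))))) ; 1 + 1 = 0, carry a tree twice as big

(defn rcons
  "Conses x onto the list: increments the numeral."
  [x digits]
  (carry digits {:leaf x}))

(defn shape
  "Binary numeral of the list, most significant digit first."
  [digits]
  (apply str (map #(if % 1 0) (reverse digits))))

(def five-items (reduce #(rcons %2 %1) () [:a :b :c :d :e]))
(shape five-items) ; => "101"
```

Each `rcons` touches at most one digit per position, so the work is logarithmic in the number of elements.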

Data structures for free

• Devise a twisted numeral system (no zero, extra digits, ambiguous, etc.)

• Get a persistent data structure for free!

ROMANA DATASTRVCTVRA

Strict 2-3 Finger Trees

• 4 digits: 0, 1, 2, 3

• Positional AND symmetric

• 2n+1 digits

• digit d_k has weight 2^k for k ≤ n and weight 2^(2n-k) for k ≥ n

• 0 and 1 only allowed for the most significant digit

Inc by the right

0

1

2

3

202

203

212

213

222

223

232

233

22022

22023

22032

22033

22122

22123

Dec by the left

22123

33023

23023

32023

22023

333

233

323

223

313

213

303

203

202

3

2

1

0
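The two sequences above can be checked mechanically. This is a sketch that assumes a numeral is written as a string of 2n+1 digits and that digit k weighs 2^k up to the middle and 2^(2n-k) after it; `numeral-value`, `inc-steps` and `dec-steps` are illustrative names.

```clojure
(defn numeral-value
  "Value of a numeral such as \"22123\" under the symmetric weights."
  [s]
  (let [digits (mapv #(- (int %) (int \0)) s)
        n      (quot (dec (count digits)) 2)             ; the numeral has 2n+1 digits
        weight (fn [k] (bit-shift-left 1 (if (<= k n) k (- (* 2 n) k))))]
    (reduce + (map-indexed (fn [k d] (* d (weight k))) digits))))

(def inc-steps
  ["0" "1" "2" "3" "202" "203" "212" "213" "222" "223" "232" "233"
   "22022" "22023" "22032" "22033" "22122" "22123"])

(def dec-steps
  ["22123" "33023" "23023" "32023" "22023" "333" "233" "323" "223"
   "313" "213" "303" "203" "202" "3" "2" "1" "0"])

(map numeral-value inc-steps) ; counts up from 0 to 17
(map numeral-value dec-steps) ; counts down from 17 to 0
```

The same value has several spellings (17 is both 22123 and the start of the decrement run), which is what lets each end be updated without rewriting the other.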

Back to data

Dynamic + Persistent

• Low ceremony modeling:

• easily map external data to internal values

• 1:1 most of the time

• one model end-to-end

• or several identical ones…

Evolution

• As you gain knowledge

• Introduce layers

• Recursive process

• Models for each layer may diverge

• Converters are enough

• Must Ignore semantics
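A sketch of a converter with must-ignore semantics, under the assumption that layers exchange plain maps: the converter reads only the keys it knows, so the upstream model can grow without breaking it. The function `todo->row` and the keys `:done?`, `:label`, `:checked` and `:priority` are hypothetical.

```clojure
(defn todo->row
  "Converts one layer's todo map into another layer's row map,
  ignoring any key it does not know about."
  [{:keys [desc done?]}]
  {:label desc :checked (boolean done?)})

(todo->row {:desc "Invoice Client XXX" :done? false})
;; An extra key added upstream changes nothing downstream.
(todo->row {:desc "Invoice Client XXX" :done? false :priority :high})
```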

Loose Coupling

• Sharing values ≠ sharing mutable objects

• Sharing schemas ≠ sharing classes

• Coupling is caused by behavior and mutability

• Dumber is better

Who Needs Encapsulation?

• When things can’t change, why continuously check invariants?

• Use validators to enforce invariants

• Safe publication

• How a value is built is of no importance
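One possible reading of "validators", offered as an assumption about what the slide means, is Clojure's reference validators: the invariant is checked once, when a new value is about to be published, and values themselves need no guarding. The `account` atom and its `:balance` key are illustrative.

```clojure
(def account
  (atom {:balance 0}
        :validator (fn [{:keys [balance]}]
                     (and (integer? balance) (not (neg? balance))))))

;; A valid new value is published.
(swap! account update :balance + 10)

;; An invalid one is rejected with IllegalStateException and never becomes visible.
(def rejected?
  (try
    (swap! account update :balance - 100)
    false
    (catch IllegalStateException _ true)))

@account ; still the last valid value
```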

Beware of Postel’s Law

Be conservative in what you send, be liberal in what you accept.

This defines de facto validity

Hard to reverse-engineer, not declarative enough

Modularity

• Easily serializable values (map, vectors etc.)

• Low coupling

• Allows for an easier transition from internal module to external service

Data as API

• Blind spots

• process must be agreed upon out of band

• process is closed

• coupling to a version of the process

• Bring objects and behavior back!

Who needs objects?

• We have closures! (or the other way round)

• Let’s put closures in the map!

• But closures don’t print!

• Pesky behaviors…
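A sketch of the problem, assuming values travel as printed EDN text: a map of plain data survives printing and reading, a map holding a closure does not. The helper `round-trips?` and the two todo maps are hypothetical.

```clojure
(require '[clojure.edn :as edn])

(defn round-trips?
  "True when v prints to text that reads back as an equal value."
  [v]
  (try
    (= v (edn/read-string (pr-str v)))
    (catch Exception _ false)))

(def plain-todo {:desc "Work on Enliven" :done? false})

(def todo-with-behavior
  (assoc plain-todo :mark-as-done (fn [todo] (assoc todo :done? true))))

;; In-process, the closure in the map works fine...
((:mark-as-done todo-with-behavior) plain-todo)

;; ...but only the plain map can cross a process boundary as text.
(round-trips? plain-todo)
(round-trips? todo-with-behavior)
```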

What do you call a system that sends data and process in-band?

A web server!

Data+Process=HTML

• Content is data

• Links and forms define processes

• Javascript too, but it is beyond the Turing horizon

Anatomy of a Form

<form action="http://example.org/xyz" method=post>
  <input type=hidden name=secret value="0xDECAF">
  <label>First name: <input name=fname></label>
</form>

• Function pointer

• Invocation protocol impl.

• Closed over environment

A form is a closure!
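A sketch of that reading with the form written as a plain map. The mapping of parts is an assumption: the action is taken as the function pointer, the method as the invocation protocol, the hidden input as the closed-over environment, and the visible input as a named argument. The map shape and the functions `form->request` and `partial-form` are hypothetical, and no HTTP call is made; the result is only a description of the request.

```clojure
(def first-name-form
  {:action "http://example.org/xyz"   ; function pointer
   :method :post                      ; invocation protocol
   :env    {:secret "0xDECAF"}        ; closed-over environment (the hidden input)
   :params #{:fname}})                ; named arguments left to the caller

(defn form->request
  "Applies the form to named args: returns a request description.
  Closed-over values win over args, and unknown args are refused."
  [{:keys [action method env params]} args]
  (let [unknown (remove params (keys args))]
    (when (seq unknown)
      (throw (ex-info "Unknown arguments" {:unknown (vec unknown)})))
    {:url action :method method :form-params (merge args env)}))

(defn partial-form
  "Partial application: moves the given args into the closed-over environment."
  [form args]
  (-> form
      (update :env merge args)
      (update :params #(apply disj % (keys args)))))

(form->request first-name-form {:fname "Alice"})
(form->request (partial-form first-name-form {:fname "Alice"}) {})
```

Both calls describe the same request, which is the sense in which partials are good form candidates: the result of partial application is still printable data.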

Hypermedia API?

• Cambrian explosion (HAL, Siren, Uber…) but no uptake

• A good starting point may be to look at closures that are good form candidates

• named arguments

• partials

• arguments shuffling/renaming

What’s to be gained?

• Not only do you get data as usual

• That you can manipulate as you wish from your side of the fence

• But you also get dynamic introspection of the other side!

Let’s pretend it exists

• all closures are lifted out of the data and mapped by names to generic functions

• names should be namespaced and shared

{:todos [{:desc "Invoice Client XXX"
          :mark-as-done (form-inspired-serialization-goes-here)
          :delete (form-inspired-serialization-goes-here)}
         {:desc "Work on Enliven"
          :mark-as-done (form-inspired-serialization-goes-here)
          :delete (form-inspired-serialization-goes-here)}]
 :create (form-inspired-serialization-goes-here)}

Benefits

• Coupling between producer and consumer is now only an agreement on names and schemas

• The process is dynamic

• REPL experience

• Worth exploring: closures as arguments

Abusive Simplifications

• Using “closure” for an endpoint is an abuse

• HTTP or synchronicity are not required

• May apply to a messaging system

• as long as some fns are addressable

Computation

Sequences

• Iteration model

• Should not be confused with collections

• or only very locally

• The (original) way to compose computations

Indices are a smell

• Indices may look efficient

• but that’s only because we picked data structures for which they are

Pet peeve: strings

• Strings as chunks of 16-bit chars

• Waste of space because many code points < 256

• chars are not even characters…

• Forces you to copy and decode bytes to chars

Pet peeve: strings

• An abstraction with cursors (local moves/searches) but no indexed lookup

• Would allow dealing with encoded UTF-8 directly

• AFAIK all algorithms that require indexed lookup (e.g. Boyer-Moore) could work on bytes and be composed with encoding
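A sketch of the mismatch on the JVM, using only `java.lang.String` and `java.lang.Character` methods: chars, code points and encoded bytes give three different lengths for the same text. The names `s` and `ascii` are illustrative.

```clojure
;; "h", "é" (U+00E9) and an emoji outside the 16-bit range (U+1F600).
(def ^String s (str "h\u00E9" (apply str (Character/toChars 0x1F600))))

(count s)                         ; 16-bit chars: the emoji takes two of them
(.codePointCount s 0 (count s))   ; Unicode code points
(count (.getBytes s "UTF-8"))     ; bytes once encoded as UTF-8

;; Plain ASCII text doubles in size as 16-bit chars.
(def ^String ascii "hello")
(count (.getBytes ascii "UTF-8"))
(count (.getBytes ascii "UTF-16BE"))
```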

Enliven

• Work in progress templating library

• I explored composing the encoding with the template: static parts are precomputed as bytes, possibly in direct ByteBuffers

• I even tried composing with gzip

GZON

• Directly emits compressed JSON

• Avoids repeated conversions and writes of common expressions (constants, field names, template values)

• Avoids having to find repetitions right after having emitted them

• Coming to Github soon

Towards Efficiency

• By avoiding unnecessary copies and allocations

• By composing/merging computations

In Clojure too

• Sequences

• Chunked sequences

• Reducers

• Transducers
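A sketch of the same computation written three of these ways, assuming Clojure 1.7 or later for transducers. The pipeline (increment, keep evens, sum) and the names `xs`, `via-seqs`, `via-reducers`, `xform` and `via-transducers` are illustrative.

```clojure
(require '[clojure.core.reducers :as r])

(def xs (vec (range 10)))

;; Sequences: each step produces an intermediate lazy (chunked) sequence.
(def via-seqs (->> xs (map inc) (filter even?) (reduce +)))

;; Reducers: the steps are fused into the final reduction.
(def via-reducers (r/fold + (r/filter even? (r/map inc xs))))

;; Transducers: the composed transformation is a value of its own,
;; independent of where the input comes from.
(def xform (comp (map inc) (filter even?)))
(def via-transducers (transduce xform + xs))
```

All three give the same sum; what changes is how many intermediate collections are allocated along the way.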

Efficiency perspectives

• GPUs

• Cache-Oblivious data structures

Ordering considered harmful

Ordering

• Incidental vs Essential

• Incidental ordering breaks local reasoning

• deeply nested ifs

• short-circuiting and/or

• pattern matching clauses

Short-circuiting and/or

• Refactor with care!

• Swapping two expressions may:

• Cause exceptions (and (not= x 0) (/ 1 x))

• Change the returned value (and (pred x) (lookup x))

Short-circuiting and/or

• You can’t know when order matters

• The compiler can’t either
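A sketch of both hazards, with hypothetical names (`safe-inverse`, `swapped-inverse`, `table`, `lookup-if-even`, `swapped-lookup`) standing in for the expressions on the slide; `even?` plays the role of `pred` and a map plays the role of `lookup`.

```clojure
;; Swapping can cause exceptions.
(defn safe-inverse [x] (and (not= x 0) (/ 1 x)))
(defn swapped-inverse [x] (and (/ 1 x) (not= x 0)))

(safe-inverse 0) ; => false
(def swapped-throws?
  (try (swapped-inverse 0) false
       (catch ArithmeticException _ true)))

;; Swapping can change the returned value.
(def table {2 :two})
(defn lookup-if-even [x] (and (even? x) (table x)))
(defn swapped-lookup [x] (and (table x) (even? x)))

(lookup-if-even 2) ; => :two
(swapped-lookup 2) ; => true
```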

Regexes

• Choice is ordered

• You can’t locally fix a regex

=> (re-seq #"a|ab" "abababab")
("a" "a" "a" "a")
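Swapping the two alternatives is enough to change every match, since the regex engine takes the first alternative that succeeds at each position:

```clojure
(re-seq #"a|ab" "abababab") ; => ("a" "a" "a" "a")
(re-seq #"ab|a" "abababab") ; => ("ab" "ab" "ab" "ab")
```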

CFG vs RDP

• L ::= "a" L "a" | "a"

• Which strings are matched?

• CFG: any string of 2N+1 "a"s

• RDP: any string of 2^N-1 "a"s

CFG vs RDP

• L ::= "a" | "a" L "a"

• Which strings are matched?

• CFG: any string of 2N+1 "a"s

• RDP: just "a"
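A sketch of a naive recursive descent parser for L, assuming ordered choice without backtracking into an alternative that already succeeded. It reproduces both RDP answers above for strings of up to 16 "a"s. The names `lit`, `parse-l` and `matched-lengths`, and the `:nested-first` / `:single-first` keywords, are hypothetical.

```clojure
(defn lit
  "Consumes one \"a\" at index i; returns the next index, or nil on failure."
  [s i]
  (when (and (< i (count s)) (= \a (nth s i)))
    (inc i)))

(defn parse-l
  "Parses L at index i with ordered choice: the first alternative that
  succeeds wins and is never revisited. Returns the end index or nil."
  [order s i]
  (let [single (fn [] (lit s i))                        ; "a"
        nested (fn []                                   ; "a" L "a"
                 (when-let [i1 (lit s i)]
                   (when-let [i2 (parse-l order s i1)]
                     (lit s i2))))]
    (case order
      :nested-first (or (nested) (single))
      :single-first (or (single) (nested)))))

(defn matched-lengths
  "Lengths n up to max-n for which a string of n \"a\"s is fully consumed."
  [order max-n]
  (filter (fn [n]
            (let [s (apply str (repeat n \a))]
              (= n (parse-l order s 0))))
          (range 1 (inc max-n))))

(matched-lengths :nested-first 16) ; => (1 3 7 15)
(matched-lengths :single-first 16) ; => (1)
```

The grammar is the same in both calls; only the order of the alternatives differs.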

miniKanren/core.logic

• Depending on the ordering of your disjunctions and conjunctions...

• ...your program may run endlessly without ever returning a single answer

• Defaults fixed for disjunction in the draft of the Reasoned Schemer 2nd edition and in core.logic

• Fair conjunction is harder

• ASP (Answer Set Programming) is really pure

ACId is declarative

• Associative

• Ordering is not execution

• Commutative

• Accidental ordering should not matter

• Idempotent

• Factorisation as an optimisation
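Set union is one familiar operation with all three properties, used here only as an illustration and not as how Enliven works: whatever the grouping, the order or the repetitions, the merged result is the same. The names `a`, `b`, `c` and `merge-all` are hypothetical.

```clojure
(require '[clojure.set :as set])

(def a #{:x})
(def b #{:y})
(def c #{:x :z})

(defn merge-all
  "Merges a collection of sets by union."
  [sets]
  (reduce set/union #{} sets))

(merge-all [a b c])                           ; one order
(merge-all [c a b])                           ; commutative: another order
(set/union a (set/union b c))                 ; associative: another grouping
(merge-all [a a b c c])                       ; idempotent: repeats do not matter

;; By contrast, building a vector is order-sensitive.
(= (into [] cat [[1] [2]]) (into [] cat [[2] [1]]))
```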

ACId in practice

• I designed Enliven following ACId principles

• Easier to reason about a template than with Enlive

• Way faster: as fast as print

• Point-free

Misc

• Local state is the danger, global state is fine

• Nano VMs are fun

Thanks!