Ankara JUG - Practical Functional Programming with Scala
Post on 15-Jul-2015
Ensar Basri Kahveci
Software Engineer, Hazelcast
- Java SE for a living, Scala for fun
- Concurrent and distributed programming
- NoSQL
http://basrikahveci.com
metanet
ebkahveci@gmail.com
basrikahveci
Roadmap
■ Why did OOP become so popular?
■ Why did FP become so popular?
■ Essence of FP
■ Introduction to Scala
    ○ How it stands between OOP and FP
■ Practical FP with Scala
    ○ Step by step examples
■ The good, the bad, the ugly Scala
■ Resources to learn Scala
Early OOP Languages
Simula, 60s
- Used for simulations.
- Considered the first OOP language.
- Introduced objects, classes, inheritance and subclasses, virtual methods etc.
Smalltalk, late 70s
- Alan Kay, Xerox Palo Alto Research Center.
- Considered the most influential OOP language.
- Byte Magazine, August 1981 issue.
- OOP software design patterns: MVC etc.
- GUI, WYSIWYG, debugging tools.
Why did OOP become popular?
■ Encapsulation? NO
■ Code re-use? NO
■ Dynamic binding? NO
■ Dependency inversion? NO
■ Liskov substitution principle? NO
■ Open-closed principle? NO
■ It’s because of the things you could do with OOP!
How OOP got started
- A traditional data structure: Linked list
    - A well defined structure
        - Two cases: Empty, Non-empty
    - Unbounded number of operations:
        - map, reverse, print, filter, get element, add element etc.
- A use-case: Simulation
    - Fixed number of operations:
        - nextStep, toString, aggregate
    - Unbounded number of implementations:
        - Car, Road, Molecule, Cell, Person, Building etc.
- Another use-case: GUI widgets
    - Fixed number of operations:
        - move, redraw, boundingRectangle
    - Unbounded number of implementations:
        - Window, Menu, Letter, Shape, Image, Video
What do Simulation & GUIs have in common?
■ Both need a way to execute a fixed API with an unknown implementation.
■ It is possible to do that in a procedural language such as C.
    ○ But it is too cumbersome!
■ So, people preferred OOP languages.
    ○ Because they wanted to write cool GUI widgets!
    ○ The other things came later.
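The "fixed API, unknown implementation" idea can be sketched in Scala with a trait. This is only an illustration: the operation names come from the slide, while `Rect`, the method signatures and the two widget classes are assumptions made up for the example.

```scala
// Hypothetical helper type, not part of the talk.
final case class Rect(x: Int, y: Int, width: Int, height: Int)

// Fixed API: every widget supports exactly these three operations.
trait Widget {
  def move(dx: Int, dy: Int): Widget
  def boundingRectangle: Rect
  def redraw(): String
}

// Unbounded implementations: new widget kinds are added without touching Widget.
final case class Window(x: Int, y: Int, w: Int, h: Int) extends Widget {
  def move(dx: Int, dy: Int): Widget = copy(x = x + dx, y = y + dy)
  def boundingRectangle: Rect = Rect(x, y, w, h)
  def redraw(): String = s"window at ($x, $y)"
}

final case class Letter(ch: Char, x: Int, y: Int) extends Widget {
  def move(dx: Int, dy: Int): Widget = copy(x = x + dx, y = y + dy)
  def boundingRectangle: Rect = Rect(x, y, 1, 1)
  def redraw(): String = s"letter '$ch' at ($x, $y)"
}

// The caller only knows the fixed API, never the concrete classes.
def redrawAll(widgets: List[Widget]): List[String] = widgets.map(_.redraw())

val screen: List[Widget] = List(Window(0, 0, 80, 24), Letter('a', 3, 4).move(1, 1))
val drawn: List[String] = redrawAll(screen)
```

`redrawAll` keeps working unchanged when a `Menu` or `Image` class is added later, which is the property that simulations and GUI toolkits needed.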
What about Functional Programming?
■ Just like OOP, FP has many advantages:
    ○ Fewer errors
    ○ Better modularity
    ○ High-level abstractions
    ○ Shorter code
    ○ Increased developer productivity
■ These alone are not enough for mainstream adoption.
    ○ After all, FP has been around for 50 years!
■ Needs a catalyzer
    ○ Something that sparks the initial adoption until other advantages become clear to everyone
    ○ Just as what OOP had for GUI widgets
A catalyzer
- Two forces driving software complexity
    - Multicores (= parallel programming)
    - Cloud computing (= distributed computing)
- Current languages and frameworks have trouble keeping up!
    - Locks / threads don’t scale!
    - It is very hard to scale with locks and threads!
- We need better tools with the right level of abstraction!
Triple Challenge
■ Parallel
    ○ How to make use of multi-cores, multi-machines?
■ Reactive
    ○ How to deal with asynchronous events?
■ Distributed
    ○ How to deal with delays and failures?
■ Mutable state is a liability for each of these!
    ○ Cache coherence
    ○ Races
    ○ Versioning
The root of the problem
var x = 0
async { x = x + 1 }
async { x = x * 2 }
// can print 0, 1, 2
println(x)
■ Non-determinism
    ○ Caused by concurrent accesses to mutable state
■ Non-determinism = parallel programming + mutable state
■ To overcome it, avoid mutable state!
    ○ Avoiding mutable state means programming functionally.
This is the catalyzer for FP.
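One way to read "avoid mutable state" for the `x = x + 1` / `x = x * 2` example: write the two updates as pure functions and make the order explicit through composition. A minimal sketch, with names chosen only for this illustration:

```scala
// The two updates from the slide, written as pure functions of the old value.
val increment: Int => Int = x => x + 1
val double: Int => Int = x => x * 2

// The order is now part of the program text instead of a scheduling accident.
val incrementThenDouble: Int => Int = increment.andThen(double)
val doubleThenIncrement: Int => Int = double.andThen(increment)

val x = 0                        // never reassigned
val a = incrementThenDouble(x)   // (0 + 1) * 2 == 2
val b = doubleThenIncrement(x)   // (0 * 2) + 1 == 1
```

Both results are still possible, but each one now belongs to a named expression, so running the program twice cannot give different answers.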
The Essence of Functional Programming
Concentrate on transformations of immutable values,
instead of stepwise modifications of mutable state.
An example
case class Person(name: String, height: Double, weight: Double)
def calculateBMI(height: Double, weight: Double) = weight / Math.pow(height, 2)
def classify(bmi: Double) = {
if (bmi <= 18.5) "underweight"
else if (bmi <= 25) "normal"
else if (bmi <= 30) "overweight"
else "obese"
}
val people = List(Person("A", 1.7, 80), Person("B", 1.8, 70), Person("C", 1.9, 50))
people
.map({ person => (person, calculateBMI(person.height, person.weight)) })
.map({ case (person, bmi) => (person.name, classify(bmi)) })
.foreach(println)
(A,overweight)
(B,normal)
(C,underweight)
An example
case class Person(name: String, height: Double, weight: Double)
def calculateBMI(height: Double, weight: Double) = weight / Math.pow(height, 2)
def classify(bmi: Double) = {
if (bmi <= 18.5) "underweight"
else if (bmi <= 25) "normal"
else if (bmi <= 30) "overweight"
else "obese"
}
val people = List(Person("A", 1.7, 80), Person("B", 1.8, 70), Person("C", 1.9, 50))
people.par // parallel collection: elements are processed concurrently, so the print order can vary
.map({ person => (person, calculateBMI(person.height, person.weight)) })
.map({ case (person, bmi) => (person.name, classify(bmi)) })
.foreach(println)
(A,overweight)
(C,underweight)
(B,normal)
Functional Programming
■ A programming style that models computations as the evaluation of expressions, in contrast with imperative programming where programs are composed of statements that change global state. Functional programming typically avoids mutable state and promotes immutability.
■ Immutable data: Instead of altering existing values, altered copies are created and the original is preserved.
■ Functions are first-class citizens. They are treated like any other value.
■ Purity: A function is pure if it depends only on its arguments and has no side effects.
    ○ Side-effect: modifying global state, modifying an argument value, performing IO etc.
■ Referential transparency: Pure computations yield the same result each time they are invoked.
■ Lazy evaluation: Since pure computations are referentially transparent, they can be performed at any time and still yield the same result.
    ○ This makes it possible to defer the computation of values until they are needed, that is, to compute them lazily.
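The terms above can be seen side by side in a few lines. This is a sketch for illustration only; `nextId`, `square` and `expensive` are made-up names, not code from the talk.

```scala
// Impure: the result depends on, and changes, state outside the arguments.
var counter = 0
def nextId(): Int = { counter += 1; counter }

// Pure: the result depends only on the argument.
def square(n: Int): Int = n * n

// Referential transparency: square(4) can be replaced by its value 16 anywhere.
val viaCalls = square(4) + square(4)
val viaValue = 16 + 16

// The same substitution is not valid for the impure function.
val first = nextId()    // 1
val second = nextId()   // 2

// Lazy evaluation: the right-hand side runs on first access only.
var evaluations = 0
lazy val expensive: Int = { evaluations += 1; square(1000) }
val before = evaluations            // 0, nothing has been computed yet
val total = expensive + expensive   // forces the computation
val after = evaluations             // 1, computed once and then cached
```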
Previous example, Revisited
case class Person(name: String, height: Double, weight: Double)
def calculateBMI(height: Double, weight: Double) = weight / Math.pow(height, 2)
def classify(bmi: Double) = {
if (bmi <= 18.5) "underweight"
else if (bmi <= 25) "normal"
else if (bmi <= 30) "overweight"
else "obese"
}
val people = List(Person("A", 1.7, 80), Person("B", 1.8, 70), Person("C", 1.9, 50))
people
.map({ person => (person, calculateBMI(person.height, person.weight)) })
.map({ case (person, bmi) => (person.name, classify(bmi)) })
.foreach(println)
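(A,overweight)
(B,normal)
(C,underweight)
Annotations on the code above:
- people: an immutable list
- calculateBMI, classify: pure functions, referentially transparent
- println: a side-effect
- the lambdas passed to map: functions as first-class citizens
- map, foreach: higher-order functions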
Scala at first glance
■ Led by Martin Odersky
    ○ Previously worked on javac and Java generics
■ Appeared in 2003
    ○ Latest version is 2.11.6, released in March 2015.
■ Commercially supported by Typesafe Inc.
■ A multi-paradigm language
    ○ Object-oriented
    ○ Functional
    ○ Concurrent
■ Compiles to Java bytecode and runs on the JVM
■ Java interoperability
■ Statically-typed
■ Type inference
■ Everything is an object. Every call is a method call.
1 + 5 // shorthand for 1.+(5)
■ Everything is an expression with a value.
val comment: String = if (people.isEmpty) "no-one" else "someone"
■ Semicolons are mostly optional.
■ Type inference.
val name = "Basri" // name's type is String
■ The value of a block or a function is the value of its last expression. return is usually omitted.
val name = { println("Running name block"); "Basri" }
■ Exceptions are supported but there are no checked exceptions.
    ○ throw expressions have a special type: Nothing
■ Separation of variables (var) and values (val).
var age = 26
age = 27
val name = "Basri"
name = "Kahveci" // doesn't compile
■ Supports while and do-while, but there is no for (initialize; test; update).
    ○ There are no break and continue keywords.
■ for loops are just syntactic sugar.
for (n <- List(1, 2)) yield n * 2 // converted to List(1, 2) map { n => n * 2 }
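The same rewriting applies to larger for expressions. The sketch below, with made-up input lists, shows roughly what the compiler produces for two generators plus a guard, and for a loop without yield.

```scala
val xs = List(1, 2, 3)
val ys = List(10, 20)

// Two generators and a guard ...
val viaFor: List[Int] =
  for {
    x <- xs
    if x % 2 == 1
    y <- ys
  } yield x * y

// ... are rewritten into roughly this chain of method calls.
val viaMethods: List[Int] =
  xs.withFilter(x => x % 2 == 1).flatMap(x => ys.map(y => x * y))

// Without yield, the for loop becomes a foreach call.
def total(numbers: List[Int]): Int = {
  var sum = 0
  for (n <- numbers) sum += n   // same as numbers.foreach(n => sum += n)
  sum
}
```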
Object-Oriented Programming with Scala
trait Moveable {
  def move(f: (Int, Int) => (Int, Int)): Unit
}

class Shape(var x: Int, var y: Int) extends Moveable {
  def move(f: (Int, Int) => (Int, Int)): Unit = {
    val (newX, newY) = f(x, y)
    x = newX
    y = newY
  }
}

// Companion object: lets callers write Square(3, 4, 5) without new
object Square {
  def apply(x: Int, y: Int, w: Int, color: String = "Black") = new Square(x, y, w, color)
}

class Square(x: Int, y: Int, private var width: Int, val color: String) extends Shape(x, y) {
  def area: Int = width * width
  def w = width // getter
  def w_=(value: Int): Unit = { // setter, invoked by s.w = ...
    if (value > 0) this.width = value else throw new IllegalArgumentException()
  }
}

val s = Square(3, 4, 5)
s.w = 20
s.move({ (x: Int, y: Int) => (x + 10, y + 10) })
println(s"x: ${s.x} y: ${s.y} w: ${s.w}") // prints x: 13 y: 14 w: 20
Today’s Menu for FP with Scala
■ Collections, Functions
■ Options, Functions
■ Pattern Matching and Functional Error Handling
■ Lazy Evaluation
    ○ Call by name
    ○ Streams
■ Async programming with Futures
■ Data processing with FP
Scala Collections
■ All collections extend the Iterable trait
■ Three major categories: sequences, sets, maps
■ Immutable and mutable versions
■ +: prepends, :+ appends an element to a sequence
■ + adds an element to an unordered collection
■ ++ concatenates two collections
■ - and -- remove elements
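A short sketch of these operators on immutable collections, using made-up values. Every operator returns a new collection and leaves the original untouched.

```scala
val seq = List(2, 3)
val prepended = 1 +: seq                  // List(1, 2, 3)
val appended = seq :+ 4                   // List(2, 3, 4)
val joined = prepended ++ List(4, 5)      // List(1, 2, 3, 4, 5)

val set = Set("a", "b")
val bigger = set + "c"                    // Set(a, b, c)
val smaller = bigger - "a"                // Set(b, c)
val fewer = bigger -- List("a", "b")      // Set(c)

val ages = Map("ann" -> 30)
val moreAges = ages + ("bob" -> 25)       // Map(ann -> 30, bob -> 25)
val lessAges = moreAges - "ann"           // Map(bob -> 25)
```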
The Main Scala Collection Traits
Set(Color.RED, Color.GREEN, Color.BLUE)
Map(Color.RED -> 0xFF0000, Color.GREEN -> 0xFF00, Color.BLUE -> 0xFF)
Seq("Hello", "World")
collection.mutable.Set(Color.RED, Color.GREEN, Color.BLUE)
collection.mutable.Map(Color.RED -> 0xFF0000, Color.GREEN -> 0xFF00, Color.BLUE -> 0xFF)
collection.mutable.Seq("Hello", "World")
Immutable Sequences
List
A Scala List is either empty or an object with a head element and a tail.
List(4, 2) == 4 :: List(2)
List(2) == 2 :: Nil
List(4, 2).head == 4
List(4, 2).tail == List(2)
4 :: 2 :: Nil == 4 :: (2 :: Nil)
List(4, 2) == Nil.::(2).::(4)
Range
A Scala Range represents an integer sequence.
0 to 6
0.to(6)
0 to 6 by 2
0.to(6).by(2)
0.to(6, 2)
0 to 6 by 2 foreach println
0.to(6).by(2).foreach(println)
for ( i <- 0 to 6) println(i)
Useful Methods on the Iterable trait
■ head, last, tail (all but first), init (all but last)
■ map(f), foreach(f), flatMap(f)
■ reduceLeft(op), reduceRight(op), foldLeft(init)(op), foldRight(init)(op)
■ reduce(op), fold(init)(op), aggregate(init)(op, combineOp) (applied in an unspecified order)
■ count(pred), exists(pred), filter(pred), filterNot(pred)
val n: List[Int] = List(1, 2, 3, 4)
n.reduceLeft(Integer.sum) // (((1 + 2) + 3) + 4)
n.foldLeft(-1)(Integer.sum) // ((((-1 + 1) + 2) + 3) + 4)
n.reduce(Integer.max) // max(max(max(1, 2), 3), 4) == 4
n./:(-1)(_ + _) // == n.foldLeft(-1)(Integer.sum)
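The remaining methods from the list can be tried on the same kind of list. A small sketch with a subtraction fold added to show why the left and right variants are listed separately:

```scala
val n = List(1, 2, 3, 4)

val allButFirst = n.tail                      // List(2, 3, 4)
val allButLast = n.init                       // List(1, 2, 3)
val evens = n.filter(_ % 2 == 0)              // List(2, 4)
val odds = n.filterNot(_ % 2 == 0)            // List(1, 3)
val evenCount = n.count(_ % 2 == 0)           // 2
val hasBig = n.exists(_ > 3)                  // true
val mirrored = n.flatMap(i => List(i, -i))    // List(1, -1, 2, -2, 3, -3, 4, -4)

// With a non-associative operation the fold direction matters.
val left = n.foldLeft(0)(_ - _)               // ((((0 - 1) - 2) - 3) - 4) == -10
val right = n.foldRight(0)(_ - _)             // (1 - (2 - (3 - (4 - 0)))) == -2
```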
Scala Option Class: A fancier null value?
■ Option[A] is a container for an optional value of type A.
■ If the value of type A is present, it is an instance of Some[A], containing the value. Otherwise, it is None.
■ It behaves just like a collection! So, you can map it, flatMap it, filter it, chain it etc.
■ Two states: Some[A], None
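A minimal sketch of the collection-like behaviour, assuming a made-up lookup table (`ages` and `ageOf` are illustrative names only):

```scala
// Hypothetical data used only for this illustration.
val ages: Map[String, Int] = Map("ann" -> 30, "bob" -> 17)

// Map.get already returns an Option: Some(value) or None.
def ageOf(name: String): Option[Int] = ages.get(name)

val annNextYear = ageOf("ann").map(_ + 1)        // Some(31)
val adultBob = ageOf("bob").filter(_ >= 18)      // None, the value fails the predicate
val unknown = ageOf("zed").map(_ + 1)            // None, map on None does nothing
val fallback = ageOf("zed").getOrElse(0)         // 0

// Options can be chained in a for expression; any None makes the whole result None.
val combined = for {
  a <- ageOf("ann")
  b <- ageOf("bob")
} yield a + b                                    // Some(47)

val asList = ageOf("ann").toList                 // List(30)
```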
Option examples - 1
case class Employee(name: String, gender: GENDER, profession: PROFESSION, birthYear: Int)
val employees = List(
  Employee("01, F, DOC", FEMALE, DOCTOR, 1990),
  Employee("02, M, ENG", MALE, ENGINEER, 1985),
  Employee("03, F, TCH", FEMALE, TEACHER, 1980)
)
val youngDocOpt: Option[Employee] =
  employees.find({ e => e.profession == DOCTOR && e.birthYear >= 1995 }) // None
val youngEngOpt: Option[Employee] =
  employees.find({ e => e.profession == ENGINEER && e.birthYear >= 1985 }) // Some(Employee(02...))
val youngTchOpt: Option[Employee] =
  employees.find({ e => e.profession == TEACHER && e.birthYear >= 1975 }) // Some(Employee(03...))
val youngDocOrEngOpt: Option[Employee] = youngDocOpt.orElse(youngEngOpt)
youngDocOrEngOpt.map(_.name).foreach(println) // 02, M, ENG
youngEngOpt.orElse(youngTchOpt).map(_.name).foreach(println) // 02, M, ENG
Option examples - 2: Currying
case class Employee(name: String, gender: GENDER, profession: PROFESSION, birthYear: Int)
val employees = List(
  Employee("01, F, DOC", FEMALE, DOCTOR, 1990),
  Employee("02, M, ENG", MALE, ENGINEER, 1985),
  Employee("03, F, TCH", FEMALE, TEACHER, 1980)
)
def byProfessionAndBirthYear(p: PROFESSION, b: Int)(e: Employee): Boolean = {
  e.profession == p && e.birthYear == b
}
// Currying: supplying only the first parameter list yields an Employee => Boolean predicate
val docYoungerThan20: (Employee) => Boolean = byProfessionAndBirthYear(DOCTOR, 1995)
val engYoungerThan30: (Employee) => Boolean = byProfessionAndBirthYear(ENGINEER, 1985)
val youngDoc: Option[Employee] = employees.find(docYoungerThan20) // None
val youngEng: Option[Employee] = employees.find(engYoungerThan30) // Some(Employee(02…))
youngDoc.orElse(youngEng).map(_.name).foreach(println) // 02, M, ENG
Option examples - 3: flatMap
case class Employee(name: String, gender: GENDER, profession: PROFESSION, birthYear: Int)
val employees = List(
  Employee("01, F, DOC", FEMALE, DOCTOR, 1990),
  Employee("02, M, ENG", MALE, ENGINEER, 1985),
  Employee("03, F, TCH", FEMALE, TEACHER, 1980)
)
def byProfessionAndBirthYear(p: PROFESSION, b: Int)(emp: Employee) = {
  emp.profession == p && emp.birthYear == b
}
val engYoungerThan30: (Employee) => Boolean = byProfessionAndBirthYear(ENGINEER, 1985)
val youngEng: Option[Employee] = employees.find(engYoungerThan30) // Some(Employee(02…))
def getAgeIfMale(e: Employee): Option[Int] = {
  e match {
    case Employee(_, MALE, _, birthYear) => Some(2015 - birthYear)
    case _ => None
  }
}
val ageOptOpt: Option[Option[Int]] = youngEng.map(getAgeIfMale) // Some(Some(30))
ageOptOpt.foreach(println)
val ageOpt: Option[Int] = youngEng.flatMap(getAgeIfMale) // Some(30)
ageOpt.foreach(println)
Handling Failures Functionally with scala.util.Try
import scala.util.{Try, Success, Failure}
def divide(n1: Int, n2: Int) = n1 / n2
def handleResult(result: Try[Int]): Unit = {
result match {
case Success(value) =>
println("Result is: " + value)
case Failure(t) =>
println("Failed with: " + t)
}
}
handleResult(Try(divide(4, 2))) // Result is: 2
handleResult(Try(divide(4, 0))) // Failed with: java.lang.ArithmeticException: / by zero
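Besides pattern matching, a Try can be transformed and chained like an Option. A short sketch that reuses the `divide` function from the slide; the other names are made up for the illustration.

```scala
import scala.util.{Success, Try}

def divide(n1: Int, n2: Int): Int = n1 / n2

val doubled = Try(divide(4, 2)).map(_ * 2)     // Success(4)
val failed = Try(divide(4, 0)).map(_ * 2)      // still a Failure, map is skipped
val recovered = Try(divide(4, 0)).recover {    // turn a known failure into a value
  case _: ArithmeticException => 0
}                                              // Success(0)
val orDefault = Try(divide(4, 0)).getOrElse(-1) // -1

// Chaining: the first Failure short-circuits the rest.
val chained = for {
  a <- Try(divide(8, 2))
  b <- Try(divide(a, 2))
} yield a + b                                  // Success(6)
```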
Handling Failures Functionally with scala.util.Either
trait GentleDivisionError
case object NegativeDivisor extends GentleDivisionError
case object ZeroDivisor extends GentleDivisionError
def divideGently(n1:Int, n2:Int):Either[Int, GentleDivisionError] = {
if(n2 < 0) Right(NegativeDivisor)
else if(n2 == 0) Right(ZeroDivisor)
else Left(n1 / n2)
}
def handleResult(result:Either[Int, GentleDivisionError]) = {
result match {
case Left(value) => println("Result is: " + value)
case Right(error) => println("Failed with: " + error)
}
}
handleResult(divideGently(4, 2)) // Result is: 2
handleResult(divideGently(4, -1)) // Failed with: NegativeDivisor
handleResult(divideGently(4, 0)) // Failed with: ZeroDivisor
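The slide keeps the result on the Left and the error on the Right. The more common convention is the opposite: Right for the result, Left for the error. Assuming Scala 2.12 or later, where Either is right-biased, that orientation also makes map, flatMap and for expressions work on the result directly. A sketch of the same function with the sides swapped:

```scala
sealed trait GentleDivisionError
case object NegativeDivisor extends GentleDivisionError
case object ZeroDivisor extends GentleDivisionError

// Same checks as on the slide, but errors go Left and results go Right.
def divideGently(n1: Int, n2: Int): Either[GentleDivisionError, Int] =
  if (n2 < 0) Left(NegativeDivisor)
  else if (n2 == 0) Left(ZeroDivisor)
  else Right(n1 / n2)

// Right-biased chaining: each step sees the successful value of the previous one.
val chained: Either[GentleDivisionError, Int] =
  for {
    a <- divideGently(100, 5)
    b <- divideGently(a, 2)
  } yield a + b                    // Right(30)

// The first Left stops the chain and is returned as is.
val stopped: Either[GentleDivisionError, Int] =
  for {
    a <- divideGently(100, 0)
    b <- divideGently(a, 2)
  } yield a + b                    // Left(ZeroDivisor)
```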
Lazy Evaluation
■ Strict Evaluation
    ○ Expressions can have a value only when their subexpressions have a value.
# Python (strict): building the list evaluates noreturn(5), so this never terminates
def noreturn(x):
    while True:
        x = -x
    return x # not reached
4 in [2, 4, noreturn(5)]
■ Non-Strict Evaluation
    ○ Expressions can have a value even if some of their subexpressions do not.
-- Haskell (non-strict): evaluates to True without ever evaluating noreturn 5
elem 4 [2, 4, noreturn 5]
■ Eager Evaluation
    ○ An expression is evaluated as soon as it is bound to a variable.
■ Lazy Evaluation
    ○ An expression is evaluated only when its result is needed.
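Scala is strict by default, but the non-strict membership test can be imitated by wrapping each element in a function. This is only a sketch: here `noreturn` fails loudly instead of looping forever, so that accidental evaluation is visible.

```scala
import scala.util.Try

// Stand-in for the slide's noreturn: throws instead of looping forever.
def noreturn(x: Int): Int = sys.error(s"noreturn($x) was evaluated")

// Each element is wrapped in a function, so nothing is computed up front.
val elements: List[() => Int] = List(() => 2, () => 4, () => noreturn(5))

// exists stops at the first match, so the third function is never called.
val found: Boolean = elements.exists(thunk => thunk() == 4)

// A strict list evaluates noreturn(5) while the list is being built.
val strictAttempt: Try[Boolean] = Try(List(2, 4, noreturn(5)).contains(4))
```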
Scala Streams
import scala.math.BigInt
lazy val fibs: Stream[BigInt] = // (0, 1, Stream(...))
BigInt(0) #::
BigInt(1) #::
fibs.zip(fibs.tail).map { n =>
println("Evaluating: %s -> %s".format(n._1, n._2))
n._1 + n._2
}
fibs.take(5).foreach(println)
println("------")
fibs.take(5).foreach(println)
0
1
Evaluating: 0 -> 1
1
Evaluating: 1 -> 1
2
Evaluating: 1 -> 2
3
------
0
1
1
2
3
■ The class Stream implements lazy lists where elements are only evaluated when they are needed.
■ The Stream class also employs memoization such that previously computed values are converted from Stream elements to concrete values.
Call By Name
class Logger(val debugEnabled:Boolean) {
def debug(msg: String) {
if (debugEnabled) println(msg)
}
def debugLazy(msg: => String) {
if (debugEnabled) println(msg)
}
}
val logger = new Logger(false)
logger.debug({
println("Evaluated !")
"basri kahveci"
}) // Prints: Evaluated !
println("------")
logger.debugLazy({
println("Evaluated !")
"basri kahveci"
}) // Nothing happens
■ Evaluate a function argument only when it is needed.
scala.concurrent.Future
■ A container class for an eventually available value
■ The computation of the value might go wrong or time out
    ○ When the future is completed, it holds either a successful value or an exception
■ A write-once container. Once the future is completed, it is effectively final.
■ Much more powerful than Java’s Future
    ○ You can map it, flatMap it, filter it, chain it etc.
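A small sketch of these properties, using the global execution context and made-up values. Blocking with Await is used here only so that the results can be inspected; it is usually avoided in real code.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val answer: Future[Int] = Future { 40 + 2 }
val doubled: Future[Int] = answer.map(_ * 2)                       // runs when answer completes
val total: Future[Int] = doubled.flatMap(d => Future { d + 1 })    // chains a second async step

// A failed computation completes the future with an exception ...
val broken: Future[Int] = Future { "not a number".toInt }
// ... which can be handled without leaving the Future.
val safe: Future[Int] = broken.recover { case _: NumberFormatException => -1 }

val totalValue = Await.result(total, 5.seconds)    // 85
val safeValue = Await.result(safe, 5.seconds)      // -1

// Write-once: a Promise is the writable side of a Future and accepts one result only.
val promise = Promise[String]()
val firstWrite = promise.trySuccess("first")       // true
val secondWrite = promise.trySuccess("second")     // false, already completed
val kept = Await.result(promise.future, 5.seconds) // "first"
```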
Let’s make a cup of cappuccino (sequentially)
import scala.util.Try
type CoffeeBeans = String
type GroundCoffee = String
type Milk = String
type FrothedMilk = String
type Espresso = String
type Cappuccino = String
case class Water(temperature: Int)
def grind(beans: CoffeeBeans): GroundCoffee = s"ground coffee of $beans"
def heatWater(water: Water): Water = water.copy(temperature = 85)
def frothMilk(milk: Milk): FrothedMilk = s"frothed $milk"
def brew(coffee: GroundCoffee, heatedWater: Water): Espresso = "espresso"
def combine(espresso: Espresso, frothedMilk: FrothedMilk): Cappuccino = "cappuccino"
case class GrindingException(msg: String) extends Exception(msg)
case class FrothingException(msg: String) extends Exception(msg)
case class WaterBoilingException(msg: String) extends Exception(msg)
case class BrewingException(msg: String) extends Exception(msg)
// going through these steps sequentially:
def prepareCappuccino(): Try[Cappuccino] = for {
  ground <- Try(grind("arabica beans"))
  water <- Try(heatWater(Water(25)))
  espresso <- Try(brew(ground, water))
  foam <- Try(frothMilk("milk"))
} yield combine(espresso, foam)
Let’s make a cup of cappuccino (asynchronously)
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def grindAsync(beans: CoffeeBeans): Future[GroundCoffee] = Future { s"ground coffee of $beans" }
def heatWaterAsync(water: Water): Future[Water] = Future { water.copy(temperature = 85) }
def frothMilkAsync(milk: Milk): Future[FrothedMilk] = Future { s"frothed $milk" }
def brewAsync(coffee: GroundCoffee, heatedWater: Water): Future[Espresso] = Future { "espresso" }
def prepareCappuccinoAsync(): Future[Cappuccino] = {
  // Start the three independent steps first, so they run concurrently
  val groundCoffee = grindAsync("arabica beans")
  val heatedWater = heatWaterAsync(Water(20))
  val frothedMilk = frothMilkAsync("milk")
  for {
    ground <- groundCoffee
    water <- heatedWater
    foam <- frothedMilk
    espresso <- brewAsync(ground, water)
  } yield combine(espresso, foam)
}
FP is good for concurrency and parallelism
■ A programming style that models computations as the evaluation of expressions, in contrast with imperative programming where programs are composed of statements that change global state. Functional programming typically avoids mutable state and promotes immutability.
    ○ No locks, no cache invalidation, no synchronization problems
■ Immutable data: Instead of altering existing values, altered copies are created and the original is preserved.
    ○ You can share data between threads freely without data-race problems
■ Functions, purity and lazy evaluation: A function is pure if it depends only on its arguments and has no side effects. Since pure computations are referentially transparent, they can be performed at any time and still yield the same result.
    ○ Relaxed execution order
■ Referential transparency: Pure computations yield the same result each time they are invoked.
    ○ You can cache the result of a computation and use it afterwards
Map-Reduce Paradigm
■ A programming model for processing large data sets with a parallel and distributed approach.
■ It consists of two steps: Map and Reduce
■ The map and reduce functions are inspired by functional programming, but they are not the same as their FP counterparts
■ The key contribution is not the map and reduce functions themselves, but a scalable and fault-tolerant execution engine that can be built on this computation model
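The two steps of the Map-Reduce paradigm can be sketched on a single machine with plain Scala collections. The input lines are made up, and this word count only shows the shape of the computation, not the distributed execution engine.

```scala
// Hypothetical input, standing in for a large distributed data set.
val lines = List("to be or not to be", "to do is to be")

// Map step: emit one (key, value) pair per word.
val pairs: List[(String, Int)] =
  lines.flatMap(_.split(" ").toList).map(word => (word, 1))

// Shuffle: bring all values for the same key together.
val grouped: Map[String, List[Int]] =
  pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2)) }

// Reduce step: combine the values of each key into one result.
val counts: Map[String, Int] =
  grouped.map { case (word, ones) => (word, ones.reduce(_ + _)) }
```

In a real Map-Reduce system the map and reduce steps run on many machines and the shuffle moves data across the network. The functions passed in stay as simple as they are here.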
Apache Spark
■ A brand-new engine for large scale data processing
■ Utilizes in-memory computing
■ Provides high-level Java, Scala and Python APIs with operations such as map, reduce, flatMap, filter, count etc.
■ Also used for SQL, machine learning, graph processing and streaming
Data processing with functional approach (Scala Collections)
queryJsonsWithUid // Instances of search logs. Type is Seq[JValue]
  .map({ json =>
    val JString(uid) = json \ "uid"
    (uid, 1)
  })
  .groupBy({ case (uid, _) => uid }) // the map step
  .map({
    case (uid, occurrences) => // occurrences: Seq[(String, Int)]
      val o: Seq[Int] = occurrences.map(_._2) // (these are the 1s emitted by the initial map function)
      (uid, o.sum)
  }) // the reduce step
  .filter({ case (_, queryCount) => queryCount > 10 })
  .foreach(println)
Data processing with functional approach (Apache Spark)
val spark = new SparkContext("local[4]", "UserStats")
val queryLogs: RDD[JValue] = // A distributed collection
  spark
    .wholeTextFiles("data/new-*/*_m4.json") // gives (fileName, content)
    .map(_._2) // (fileName, content) -> content
    .map({ content => parse(content) })
    .filter({ json => json.\("uid") != JNothing })
val uidToQueryCount: RDD[(String, Int)] =
  queryLogs
    .map({ json =>
      val JString(uid) = json \ "uid"
      (uid, 1)
    })
    .reduceByKey(Integer.sum)
    .filter({ case (uid, queryCount) => queryCount > 10 })
uidToQueryCount
  .collect()
  .foreach(println)
Data processing with functional approach (Apache Spark)
val spark = new SparkContext("local[4]", "UserStats")
val queryLogs: RDD[JValue] = // A distributed collection
  spark
    .wholeTextFiles("data/new-*/*_m4.json") // gives (fileName, content)
    .map(_._2) // (fileName, content) -> content
    .map({ content => parse(content) })
    .filter({ json => json.\("uid") != JNothing })
val noUser: (String, Int) = (null, 0)
val mostActiveUser: (String, Int) = // fold returns a single value, not an RDD
  queryLogs
    .map({ json =>
      val JString(uid) = json \ "uid"
      (uid, 1)
    })
    .reduceByKey(Integer.sum)
    .fold(noUser)({
      case (user1, user2) => // user: (uid, queryCount)
        if (user1._2 > user2._2) user1 else user2
    })
println(mostActiveUser)
The Good, The Bad, The Ugly Scala
■ Too many features because of being multi-paradigm
    ○ classes, case classes, traits, abstract classes, inheritance, functions, mutable collections, immutable collections, pattern matching, extractors, parsers, async programming, futures, actors, implicits, macros ...
■ Powerful but very complex type system
    ○ Type aliases, paths, structural types, compound types, infix types, existential types, self types, abstract types, higher-kinded types
■ Slow compilation
    ○ Type inference comes with a cost
■ Cryptic ScalaDoc
■ Expressiveness may cause weird syntax
■ Many ways to solve a problem
■ IDE support
■ Backward incompatibility
Resources
■ Scala for the Impatient, Addison-Wesley 2012
■ Scala in Depth, Manning 2012
■ Scala Cookbook, O’Reilly 2013
■ Programming in Scala, Artima 2011
■ Functional Programming in Scala, Manning 2014
■ Useful blogs and blog post series:
    ○ The Neophyte’s Guide to Scala
    ○ mauricio.github.io
    ○ james-iry.blogspot.com.br
    ○ Twitter’s Scala School
    ○ Twitter’s Effective Scala
References
■ https://www.parleys.com/talk/51704efce4b095cc56d8d4b5/ Slide material is used
■ Scala for the Impatient and Functional Programming in Scala books
■ http://danielwestheide.com/blog
■ http://wiki.haskell.org
■ http://wikipedia.org