
The GPars User Guide

The Whole GPars Team <[email protected]>

Version 1.2.1, 2015-12-14

Table of Contents

Introducing Our User Guide
  Enter GPars
  Credits
User Guide To Getting Started
  A Few Assumptions
  Ready?
  Download and Install
  The GPars Artifact
  Transitive Dependencies
  A Hello World Example
  Code Conventions
  Usage
  Getting Set Up In An IDE
  Applicability of Concepts
  What’s New
  Java API – Using GPars from Java
  Actors
  Convenience Factory Methods
  Agents
  Dataflow Concurrency
  Dataflow Operators
  Performance
  Prerequisites
User Guide To Data Parallelism
  Parallel Collections
  Meet Parallel Arrays
  GParsPool
  Avoid Side-Effects in Functions
  GParsExecutorsPool
  Usage of GParsExecutorsPool
  Avoid Side-effects in Functions
  Memoize
  Examples Of Use
  Fibonacci Example
  Available Variants
  Map-Reduce
  Avoid Side-effects in Functions
  Combine
  Parallel Arrays
  Asynchronous Invocations
  Composable Asynchronous Functions
  Are We Concurrent Yet?
  Fork-Join
  Parallel Speculations
  Parallel Speculations
  Alternatives Using Dataflow Variables and Streams
User Guide To CSP
  Communicating Sequential Processes
  The CSP Model Principles
  CSP with GPars Dataflow
  Processes
  Channels
  Composition
  Alternatives
  Components
User Guide To Actors
  Types of Actors
  Actor Threading Model
  Usage of Actors
  Actors Principles
  Stateless Actors
  Tips and Tricks
  Active Objects
  Classic Examples
User Guide To Agents
  Introduction
  Concepts
  Examples
  Factory Methods
  Listeners and Validators
  Validator Gotchas
  Grouping
  Reading The Value
  State Copy Strategy
  Error Handling
  Fair and Non-fair Agents
User Guide to Dataflow
  Introduction
  Implementation Detail
  Benefits
  Concepts
  Principles
  Dataflow Queues and Broadcasts
  DataflowStream
  Bind Handlers
  Bind Handlers Grouping
  Bind Handler Chaining
  Lazy Dataflow Tasks and Variables
  Dataflow Expressions
  Bind Error Notification
  Further Reading
  Tasks
  Deterministic Deadlocks
  Dataflows Map
  Returning Values From A Task
  Joining Tasks
  Selects
  Guards
  Priority Select
  Collecting Results of Asynchronous Computations
  Timeouts
  Cancellations
  Operators
  Concepts
  Constructing Operators
  Holding State in Operators
  Parallelize Operators
  Selectors
  Shutting Down Dataflow Networks
  Emergency Shutdown
  PoisonPill
  Immediate Poison Pill
  Poison With Counting
  Poison Strategies
  Termination Tips and Tricks
  Keeping PoisonPill Inside a Given Network
  Graceful Shutdown
  Application Frameworks
  Pipeline DSL
  Overriding the Default PGroup
  The Pipeline Builder
  Passing Construction Parameters Through the Pipeline DSL
  Implementation
  Using Plain Java Threads
  Synchronous Variables and Channels
  Synchronous Dataflow Broadcast
  Synchronous Dataflow Variable
  Kanban Flow
  KanbanFlow Description
  Classic Examples
User Guide To STM
  Software Transactional Memory
  Running A Piece of Code Atomically
  Customizing the Transactional Properties
  Using the Transaction Object
  Data Structures
  More Information
User Guide To GAE
  Google App Engine Integration
User Guide To Remoting
  Introduction
  Remote Serialization
  Dataflows
  DataflowVariable
  DataflowBroadcast
  DataflowQueue
  Actors
  Remote Actor Names
  Agents
General GPars Tips
  Grouping
  Java API
  Performance
  Parallel Collections
  Actors
  Agents
  Hosted Environments
  Compatibility
User Guide To The Conclusion


Introducing Our User Guide

The world of mainstream computing is changing rapidly these days. If you open the case and look under the covers of your computer, you’ll most likely see a dual-core processor there, or a quad-core one, if you have a high-end computer. We all now run our software on multi-processor systems.

Why do people still create single-threaded code?

The code we write today and tomorrow will probably never run on a single processor system: parallel hardware has become standard. Not so with the software though, at least not yet. People still create single-threaded code, even though it will not be able to leverage the full power of current and future hardware.

The code we write today will probably never run on a single processor system!

Some developers experiment with low-level concurrency primitives, like threads and locks or synchronized blocks. However, it has become obvious that the shared-memory multi-threading approach used at the application level causes more trouble than it solves. Low-level concurrency handling is usually hard to get right, and it’s not much fun either.

With such a radical change in hardware, software inevitably has to change dramatically too. Higher-level concurrency and parallelism concepts like map/reduce, fork/join, actors and dataflow provide natural abstractions for different types of problem domains while leveraging the multi-core hardware.

Enter GPars

Meet GPars, an open-source concurrency and parallelism library for Java and Groovy that gives you a number of high-level abstractions for writing concurrent and parallel code in Groovy (map/reduce, fork/join, asynchronous closures, actors, agents, dataflow concurrency and other concepts), which can make your Java and Groovy code concurrent and/or parallel with little effort.

With GPars, your Java and/or Groovy code can easily utilize all the available processors on the target system. You can run multiple calculations at the same time, request network resources in parallel, safely solve hierarchical divide-and-conquer problems, perform functional style map/reduce or data parallel collection processing or build your applications around the actor or dataflow model.

The GPars project is open sourced under the Apache 2 License (http://www.apache.org/licenses/LICENSE-2.0.html).

If you’re working on a commercial, open-source, educational or any other type of software project in Groovy, download the binaries or integrate them from the Maven repository and get going. The door to writing highly concurrent and/or parallel Java and Groovy code is wide open. Enjoy!

Credits

This project could not have reached the point where it stands currently without all the great help and contributions from many individuals, who have devoted their time, energy and expertise to make GPars a solid product. First, it’s the people in the core team who should be mentioned:

• Václav Pech

• Dierk Koenig

• Alex Tkachman

• Russel Winder

• Paul King

• Jon Kerridge

• Rafał Sławik

Over time, many other people have contributed their ideas, provided useful feedback or helped GPars in one way or another. There are many people in this group, too many to name them all, but let’s list at least the most active:

• Hamlet d’Arcy

• Hans Dockter

• Guillaume Laforge

• Robert Fischer

• Johannes Link

• Graeme Rocher

• Alex Miller

• Jeff Gortatowsky

• Jiří Kropáček

• Jim Northrop

Many thanks to everyone!


User Guide To Getting Started

A Few Assumptions

Let’s set out a few assumptions before we start:

• You know and use Groovy and/or Java: otherwise you’d not be investing your valuable time studying a concurrency and parallelism library for Groovy and/or Java.

• You definitely want to write code employing concurrency and parallelism concepts.

• If you are not using Groovy, you are prepared to pay the inevitable verbosity tax of using Java.

• You target multi-core hardware with your code.

• You appreciate that in concurrent and parallel code things can happen at any time, in any order, and, more likely, with more than one thing happening at once.

Ready?

With those assumptions in place, we can get started.

It’s becoming more and more obvious that dealing with concurrency and parallelism at the thread/synchronized/lock level, as provided by the JVM, is far too low a level to be safe and comfortable.
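To make the point concrete, here is a minimal plain-Java sketch of that low-level style: a counter shared between threads and guarded by hand. It uses only JDK classes, not GPars, and the names `LowLevelCounter` and `runWorkers` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: a counter shared between threads and
// protected by hand with a lock object and synchronized blocks.
class LowLevelCounter {
    private final Object lock = new Object();
    private int count = 0;

    // Every access to the shared field has to go through the same lock.
    // Forgetting it in a single place silently introduces a data race.
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int current() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts `threads` workers that each increment one shared counter
    // `incrementsPerThread` times, waits for all of them, and returns the total.
    static int runWorkers(int threads, final int incrementsPerThread) {
        final LowLevelCounter counter = new LowLevelCounter();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        counter.increment();
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // the thread lifecycle is managed by hand as well
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return counter.current();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // prints 40000
    }
}
```

Even this tiny example needs a lock object, two guarded accessors and explicit start/join bookkeeping, and nothing in the compiler checks that the lock is used consistently.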

Many high-level concepts, such as actors and dataflow, have been around for quite some time. Parallel-chipped computers were in use, at least in data centres if not on the desktop, long before multi-core chips hit the hardware mainstream.

So now is the time to adopt these higher-level abstractions into the mainstream software industry.

This is what GPars enables for the Groovy and Java languages, allowing them to use higher-level abstractions and, therefore, make development of concurrent and parallel software easier and less error-prone.

The concepts available in GPars can be categorized into three groups:

• Code-level helpers - Constructs that can be applied to small parts of your code-base, such as an individual algorithm or data structure, without any major changes in the overall project architecture
    • Parallel Collections
    • Asynchronous Processing
    • Fork/Join (Divide/Conquer)

• Architecture-level concepts - Constructs that need to be taken into account when designing the project structure
    • Actors
    • Communicating Sequential Processes (CSP)
    • Dataflow
    • Data Parallelism

• Shared Mutable State Protection - More than 95% of the current use of shared mutable state can be avoided using proper abstractions. Good abstractions are still necessary for the remaining 5% of those use cases, i.e. when shared mutable state cannot be avoided.
    • Agents
    • Software Transactional Memory (not fully implemented in GPars as yet)

Download and Install

GPars is now distributed as part of Groovy. So if you have a Groovy installation, you should already have GPars. Your exact version of GPars will, of course, depend on which version of Groovy you use.

If you don’t already have GPars, and you do have Groovy, then perhaps you should upgrade your Groovy!

If you need it, you can download Groovy from http://groovy-lang.org/download.html, and GPars from the GPars download page.

If you don’t have a Groovy installation, but use Groovy through build dependencies or, perhaps, just by having the groovy-all artifact, then you will need to get GPars. Also, if you want to use a different version of GPars from the one bundled with Groovy, or have an old GPars-free Groovy that you cannot upgrade, you will need to get GPars. The ways to download GPars are:

• Download the artifact from a repository and add it and all the transitive dependencies manually.

• Specify a dependency in Gradle, Maven, or Ivy (or Gant, or Ant) build files.

• Use Grapes (especially useful for Groovy scripts).

• Download and install it from here.

If you’re building a Grails or a Griffon application, you can use the appropriate plugins to fetchour jar files for you.

The GPars ArtifactAs noted above, GPars is now distributed as standard with Groovy. If however, you have tomanage this dependency manually, the GPars artifact is in the main Maven repository and was inthe Codehaus main and snapshots repositories before they closed.

5

http://groovy-lang.org/download.html



../Download.html

../Download.html

../Download.html

Release versions can be found in the Maven main repositories but, for now, the currentdevelopment version (SNAPSHOT) was in the Codehaus snapshots repository. We’re moving it toanother location.

To use GPars from Gradle or Grapes, use the specification:

A Gradle Example

"org.codehaus.gpars:gpars:1.2.0"

You may need to add our snapshot repository manually to the search list in this latter case. Using Maven, the dependency is:

A Sample Maven Declaration

<dependency>
    <groupId>org.codehaus.gpars</groupId>
    <artifactId>gpars</artifactId>
    <version>1.2.0</version>
</dependency>

Transitive Dependencies

GPars as a library depends on Groovy versions later than 2.2.1. Also, the Fork/Join concurrency library must be available. This comes as standard with Java 7.

GPars 2.0 will depend on Java 8 and will only be usable with Groovy 3.0 and later.

Please visit the Integration page on our GPars website for more details.


A Hello World Example

Once you’re set up, try the following Groovy script to confirm your setup is functioning properly.

A Groovy Sample

import static groovyx.gpars.actor.Actors.actor

/**
 * A demo showing two cooperating actors. The decryptor decrypts received messages
 * and replies them back. The console actor sends a message to decrypt, prints out
 * the reply and terminates both actors. The main thread waits on both actors to
 * finish using the join() method to prevent premature exit, since both actors use
 * the default actor group, which uses a daemon thread pool.
 * @author Dierk Koenig, Vaclav Pech
 */

def decryptor = actor {
    loop {
        react { message ->
            if (message instanceof String) reply message.reverse()
            else stop()
        }
    }
}

def console = actor {
    decryptor.send 'lellarap si yvoorG'
    react {
        println 'Decrypted message: ' + it
        decryptor.send false
    }
}

[decryptor, console]*.join()

You should receive a message "Decrypted message: Groovy is parallel" on the console.


Java API

GPars has been designed primarily for use with the Groovy programming language. Of course, all Java and Groovy programs are just bytecode running on the JVM, so GPars can be used with Java source.

Despite being aimed at Groovy, the solid technical foundation and good performance characteristics of GPars make it an excellent library for Java programs too. In fact, much of GPars is written in Java, so there’s no performance penalty for Java applications using GPars.

For details please refer to the Java API section.

To quick-test GPars with the Java API, compile and run the following Java code:


Another Java Sample

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.actor.DynamicDispatchActor;

public class StatelessActorDemo {

    public static void main(String[] args) throws InterruptedException {
        final MyStatelessActor actor = new MyStatelessActor();
        actor.start();
        actor.send("Hello");
        actor.sendAndWait(10);

        actor.sendAndContinue(10.0, new MessagingRunnable<String>() {
            @Override
            protected void doRun(final String s) {
                System.out.println("Received a reply " + s);
            }
        });
    }
}

class MyStatelessActor extends DynamicDispatchActor {
    public void onMessage(final String msg) {
        System.out.println("Received " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Integer msg) {
        System.out.println("Received a number " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Object msg) {
        System.out.println("Received an object " + msg);
        replyIfExists("Thank you");
    }
}

Artifacts May Be Needed

Remember though that you will almost certainly have to add the Groovy artifact to the build as well as the GPars artifact. GPars may well work at Java speeds with Java applications, but it still has some compilation dependencies on Groovy.

Code Conventions

We follow certain conventions in our code samples. Understanding these conventions may help you read and comprehend GPars code samples better.

• The leftShift operator '<<' has been overloaded on actors, agents and dataflow expressions (both variables and streams) to mean send a message or assign a value.

Using The leftShift Operator

myActor << 'message'

myAgent << {account -> account.add('5 USD')}

myDataflowVariable << 120332

• On actors and agents, the default call() method has also been overloaded to mean send. So sending a message to an actor or agent may look like a regular method call.

myActor "message"

myAgent {house -> house.repair()}

• The rightShift operator '>>' in GPars has the when bound meaning. So

myDataflowVariable >> {value -> doSomethingWith(value)}

will schedule the closure to run only after myDataflowVariable is bound to a value, with the value as a parameter.
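If you know the plain JDK better than GPars, the when bound idea is loosely comparable to registering a callback on a java.util.concurrent.CompletableFuture. The sketch below is only an analogy under that assumption: it uses no GPars classes, and WhenBoundAnalogy and bindAndObserve are names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Analogy only: a CompletableFuture stands in for a dataflow variable.
// This is plain JDK code, not the GPars API.
final class WhenBoundAnalogy {

    // Registers a callback, then binds the value; returns what the callback received.
    static List<Integer> bindAndObserve(int value) {
        List<Integer> seen = new ArrayList<>();
        CompletableFuture<Integer> variable = new CompletableFuture<>();

        // Roughly comparable to: myDataflowVariable >> { v -> ... }
        variable.thenAccept(seen::add);

        // The callback cannot have run yet, because no value has been bound.
        if (!seen.isEmpty()) {
            throw new IllegalStateException("callback ran before a value was bound");
        }

        // Roughly comparable to: myDataflowVariable << value
        variable.complete(value);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("Callback received: " + bindAndObserve(120332));
    }
}
```

The point of the analogy is the ordering: the handler is declared up front and fires only once a value exists, so the code that produces the value and the code that consumes it never need to coordinate explicitly.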

Usage

In samples, we tend to statically import frequently used factory methods:

• GParsPool.withPool()

• GParsPool.withExistingPool()

• GParsExecutorsPool.withPool()

• GParsExecutorsPool.withExistingPool()

• Actors.actor()

• Actors.reactor()

• Actors.fairReactor()

• Actors.messageHandler()

• Actors.fairMessageHandler()

• Agent.agent()

• Agent.fairAgent()

• Dataflow.task()

• Dataflow.operator()

It’s more a matter of style preferences and personal taste, but we think static imports make the code more compact and readable.

Getting Set Up In An IDE

Adding the GPars jar files to your project or defining the appropriate dependencies in pom.xml should be enough to get you started with GPars in your IDE.

GPars DSL recognition

IntelliJ IDEA, in both the free Community Edition and the commercial Ultimate Edition, will recognize the GPars domain specific languages, complete methods like eachParallel(), reduce() or callAsync(), and validate them. GPars uses the Groovy DSL mechanism (http://www.jetbrains.net/confluence/display/GRVY/Scripting+IDE+for+DSL+awareness), which teaches IntelliJ IDEA the DSLs as soon as the GPars jar file is added to the project.

Applicability of Concepts

GPars provides a lot of concepts to pick from. We’re continuously building and updating our documents to help users choose the right level of abstraction for the task at hand. Please refer to Concepts Compared (../Concepts_Compared.html) for details.

To briefly summarize the suggestions, here are some basic guidelines:

• You’re looking at a collection, which needs to be iterated or processed using one of the many beautiful Groovy collection methods, like each(), collect(), find(), etc. If processing each element of the collection is independent of the other items, then using GPars parallel collections can be appropriate.

• If you have a long-lasting calculation, which may safely run in the background, use the asynchronous invocation support in GPars. Since GPars asynchronous functions can be composed, you can quickly parallelize these complex functional calculations without having to mark independent calculations explicitly.

• Say you need to parallelize an algorithm. You can identify a set of tasks with their mutual dependencies. The tasks typically do not need to share data, but instead some tasks may need to wait for other tasks to finish before starting. Now you’re ready to express these dependencies explicitly in code. With GPars dataflow tasks, you create internally sequential tasks, each of which can run concurrently with the others. Dataflow variables and channels provide the tasks with the capability to declare their dependencies and to exchange data safely.

• Perhaps you can’t avoid using shared mutable state in your logic. Multiple threads will be accessing shared data and (some of them) modifying it. A traditional locking and synchronized approach feels too risky or unfamiliar? Then go for agents to wrap your data and serialize all access to it.

• You’re building a system with high concurrency demands. Tweaking a data structure here or task there won’t cut it. You need to build the architecture from the ground up with concurrency in mind. Message-passing might be the way to go. Your choices could include:

  • Groovy CSP to give you highly deterministic and composable models for concurrent processes. A model is organized around the concept of calculations or processes, which run concurrently and communicate through synchronous channels.

  • If you’re trying to solve a complex data-processing problem, consider GPars dataflow operators to build a data flow network. The concept is organized around event-driven transformations wired into pipelines using asynchronous channels.

  • Actors and Active Objects will shine if you need to build a general-purpose, highly concurrent and scalable architecture following the object-oriented paradigm.
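To make the "wrap your data and serialize all access" advice about agents a little more tangible, here is a minimal plain-JDK sketch of the underlying pattern: the state lives on one worker thread and every update arrives as a queued message. This is not the GPars Agent class; SketchAgent and its methods are hypothetical names chosen for illustration, and the sketch assumes Java 8 or later.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the agent idea; this is NOT the GPars Agent class.
final class SketchAgent<T> {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private T state; // read and written only on the single worker thread

    SketchAgent(T initial) {
        worker.execute(() -> state = initial);
    }

    // Queue an update function; updates run one at a time, in the order they were queued.
    void send(UnaryOperator<T> update) {
        worker.execute(() -> state = update.apply(state));
    }

    // Queue a read behind all pending updates and block until it is answered.
    T getVal() {
        try {
            return worker.submit(() -> state).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Let the worker drain its queue and then exit.
    void stop() {
        worker.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        SketchAgent<Integer> counter = new SketchAgent<>(0);
        Thread[] senders = new Thread[4];
        for (int t = 0; t < senders.length; t++) {
            senders[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) counter.send(n -> n + 1);
            });
            senders[t].start();
        }
        for (Thread sender : senders) sender.join();
        // 4 senders x 1000 increments, with no locks in the calling code
        System.out.println("Final value: " + counter.getVal());
        counter.stop();
    }
}
```

Callers never touch the state directly, so no locking is needed on their side; the price is that reads are answered only after the updates queued ahead of them.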

Now you may have a better idea of what concepts to use on your current project. Go check out more details on them in our User Guide.

What’s New

The next GPars 1.3.0 release introduces several enhancements and improvements on top of the previous release, mainly in the dataflow area.

Check out the JIRA release notes.

Project Changes

Breaking Changes

See the Breaking Changes listing for the lists of bug fixes and improvements.

Asynchronous Functions

TBD

Parallel Collections

TBD

Fork / Join

TBD

Actors

• Remote actors

• Exception propagation from active objects


Dataflow

• Remote dataflow variables and channels

• Dataflow operators accepting a variable number of arguments

• Select made @CompileStatic compatible

Agent

• Remote agents

STM

TBD

Other

• Raised the JDK dependency to version 1.7

• Raised the Groovy dependency to version 2.2

• Replaced the jsr-166y fork-join pool implementation with the one from JDK 1.7

• Removed the dependency on jsr-166y

Java API – Using GPars from Java

Using GPars is very addictive, I guarantee. Once you get hooked, you won’t be able to code without it. If the world forces you to write code in Java, you’ll still be able to benefit from many of the GPars features.

Java API specifics

Some parts of GPars are irrelevant in Java and it’s better to use the underlying Java libraries directly:

• Parallel Collection – use jsr-166y library’s Parallel Array directly until GPars 1.3.0 becomes available

• Fork/Join – use jsr-166y library’s Fork/Join support directly until GPars 1.3.0 becomes available

• Asynchronous functions – use Java executor services directly

The other parts of GPars can be used from Java just as from Groovy, although most will miss the Groovy DSL capabilities.

GPars Closures in Java API

To overcome the lack of closures as a language element in Java and to avoid forcing users to use Groovy closures directly through the Java API, a few handy wrapper classes have been provided to help you define callbacks, actor bodies or dataflow tasks.

• groovyx.gpars.MessagingRunnable - used for single-argument callbacks or actor body

• groovyx.gpars.ReactorMessagingRunnable - used for ReactiveActor body

• groovyx.gpars.DataflowMessagingRunnable - used for dataflow operators' body

These classes can be used in places where the GPars API expects a Groovy closure.

Actors

The DynamicDispatchActor as well as the ReactiveActor classes can be used just like in Groovy:

A DynamicDispatchActor Sample

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.actor.DynamicDispatchActor;

public class StatelessActorDemo {
    public static void main(String[] args) throws InterruptedException {
        final MyStatelessActor actor = new MyStatelessActor();
        actor.start();
        actor.send("Hello");
        actor.sendAndWait(10);
        actor.sendAndContinue(10.0, new MessagingRunnable<String>() {
            @Override
            protected void doRun(final String s) {
                System.out.println("Received a reply " + s);
            }
        });
    }
}

class MyStatelessActor extends DynamicDispatchActor {
    public void onMessage(final String msg) {
        System.out.println("Received " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Integer msg) {
        System.out.println("Received a number " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Object msg) {
        System.out.println("Received an object " + msg);
        replyIfExists("Thank you");
    }
}

There are few differences between Groovy and Java for GPars use, but notice the callbacks instantiating the MessagingRunnable class in place of a Groovy closure.

A MessagingRunnable Sample

import groovy.lang.Closure;
import groovyx.gpars.ReactorMessagingRunnable;
import groovyx.gpars.actor.Actor;
import groovyx.gpars.actor.ReactiveActor;

public class ReactorDemo {
    public static void main(final String[] args) throws InterruptedException {

        final Closure handler = new ReactorMessagingRunnable<Integer, Integer>() {
            @Override
            protected Integer doRun(final Integer integer) {
                return integer * 2;
            }
        };
        final Actor actor = new ReactiveActor(handler);
        actor.start();

        System.out.println("Result: " + actor.sendAndWait(1));
        System.out.println("Result: " + actor.sendAndWait(2));
        System.out.println("Result: " + actor.sendAndWait(3));
    }
}

Convenience Factory Methods

Obviously, all the essential factory methods to build actors quickly are available where you’d expect them.

Factory Samples

import groovy.lang.Closure;
import groovyx.gpars.ReactorMessagingRunnable;
import groovyx.gpars.actor.Actor;
import groovyx.gpars.actor.Actors;

public class ReactorDemo {
    public static void main(final String[] args) throws InterruptedException {
        final Closure handler = new ReactorMessagingRunnable<Integer, Integer>() {
            @Override
            protected Integer doRun(final Integer integer) {
                return integer * 2;
            }
        };
        final Actor actor = Actors.reactor(handler);

        System.out.println("Result: " + actor.sendAndWait(1));
        System.out.println("Result: " + actor.sendAndWait(2));
        System.out.println("Result: " + actor.sendAndWait(3));
    }
}

Agents

Agent Samples

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.agent.Agent;

public class AgentDemo {

    public static void main(final String[] args) throws InterruptedException {

        final Agent counter = new Agent<Integer>(0);
        counter.send(10);
        System.out.println("Current value: " + counter.getVal());
        counter.send(new MessagingRunnable<Integer>() {
            @Override
            protected void doRun(final Integer integer) {
                counter.updateValue(integer + 1);
            }
        });

        System.out.println("Current value: " + counter.getVal());
    }
}

Dataflow Concurrency

Both DataflowVariables and DataflowQueues can be used from Java without any hiccups. Just avoid the handy overloaded operators and go straight to the methods, like bind, whenBound, getVal and others.

You may also continue to use dataflow tasks, passing them instances of Runnable or Callable just like Groovy closures.

Dataflow Samples

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.dataflow.DataflowVariable;
import groovyx.gpars.dataflow.Promise;
import groovyx.gpars.group.DefaultPGroup;

import java.util.concurrent.Callable;

public class DataflowTaskDemo {

    public static void main(final String[] args) throws InterruptedException {
        final DefaultPGroup group = new DefaultPGroup(10);

        final DataflowVariable a = new DataflowVariable();

        // The first task binds the variable
        group.task(new Runnable() {
            public void run() {
                a.bind(10);
            }
        });

        // The second task waits for the variable and builds on its value
        final Promise result = group.task(new Callable() {
            public Object call() throws Exception {
                return (Integer)a.getVal() + 10;
            }
        });

        result.whenBound(new MessagingRunnable<Integer>() {
            @Override
            protected void doRun(final Integer integer) {
                System.out.println("arguments = " + integer);
            }
        });

        System.out.println("result = " + result.getVal());
    }
}

Dataflow Operators

The sample below should illustrate the main differences between Groovy and Java APIs for dataflow operators.

• Use the convenience factory methods that accept lists of channels to create operators or selectors

• Use DataflowMessagingRunnable to specify the operator body

• Call getOwningProcessor() to get hold of the operator from within the body in order to e.g. bind output values

More Dataflow Samples

import groovyx.gpars.DataflowMessagingRunnable;
import groovyx.gpars.dataflow.Dataflow;
import groovyx.gpars.dataflow.DataflowQueue;
import groovyx.gpars.dataflow.operator.DataflowProcessor;

import java.util.Arrays;
import java.util.List;

public class DataflowOperatorDemo {

    public static void main(final String[] args) throws InterruptedException {
        final DataflowQueue stream1 = new DataflowQueue();
        final DataflowQueue stream2 = new DataflowQueue();
        final DataflowQueue stream3 = new DataflowQueue();
        final DataflowQueue stream4 = new DataflowQueue();

        final DataflowProcessor op1 = Dataflow.selector(Arrays.asList(stream1), Arrays.asList(stream2), new DataflowMessagingRunnable(1) {
            @Override
            protected void doRun(final Object... objects) {
                getOwningProcessor().bindOutput(2 * (Integer) objects[0]);
            }
        });

        final List secondOperatorInput = Arrays.asList(stream2, stream3);

        final DataflowProcessor op2 = Dataflow.operator(secondOperatorInput, Arrays.asList(stream4), new DataflowMessagingRunnable(2) {
            @Override
            protected void doRun(final Object... objects) {
                getOwningProcessor().bindOutput((Integer) objects[0] + (Integer) objects[1]);
            }
        });

        stream1.bind(1);
        stream1.bind(2);
        stream1.bind(3);
        stream3.bind(100);
        stream3.bind(100);
        stream3.bind(100);
        System.out.println("Result: " + stream4.getVal());
        System.out.println("Result: " + stream4.getVal());
        System.out.println("Result: " + stream4.getVal());
        op1.stop();
        op2.stop();
    }
}

Performance

In general, GPars overhead is identical irrespective of whether you use it from Groovy or Java, and it tends to be very low anyway. GPars actors, for example, can compete head-to-head with other JVM actor options, like Scala actors.

Since Groovy code, in general, runs a little slower than Java code, due to dynamic method invocations, you might consider writing your code in Java to improve performance.

Typically numeric operations or frequent fine-grained method calls within a task or actor body may benefit from a rewrite into Java.

Prerequisites

All the GPars integration rules apply equally to Java projects and Groovy projects. You only need to include the Groovy distribution jar file in your project and you are good to go.

You may also want to check out our sample Java-Maven project for tips on how to integrate GPars into a Maven-based pure Java application: Java Sample Maven Project (../Demos.html)

User Guide To Data Parallelism

Focusing on data instead of processes helps us create robust concurrent programs. As a programmer, you define your data together with functions that should be applied to it and then let the underlying machinery process the data. Typically, a set of concurrent tasks will be created and submitted to a thread pool for processing.

In GPars, the GParsPool and GParsExecutorsPool classes give you access to low-level data parallelism techniques. The GParsPool class relies on the Fork/Join implementation introduced in JDK 7 and offers excellent functionality and performance. The GParsExecutorsPool is provided for those who still need to use the older Java executors.

There are three fundamental domains covered by the GPars low-level data parallelism:

• Processing collections concurrently

• Running functions (closures) asynchronously

• Performing Fork/Join (Divide/Conquer) algorithms

The API described here is based on using GPars with JDK7. It can be used with later JDKs, but JDK8 introduced the Streams framework which can be used directly from Groovy and, in essence, replaces the GPars features covered here. Work is underway to provide the API described here based on the JDK8 Streams framework for use with JDK8 and later to provide a simple upgrade path.

Parallel Collections

Dealing with data frequently involves manipulating collections: lists, arrays, sets, maps, iterators, strings. A lot of other data types can also be viewed as collections of items. The common pattern to process such collections is to take elements sequentially, one-by-one, and perform an action for each of the items in the series.

Take, for example, the min function, which is supposed to return the smallest element of a collection. When you call the min method on a collection of numbers, a variable (minVal say) is created to store the smallest value seen so far, initialized to some reasonable value for the given type, typically the first element of the collection or the largest value the type can represent. The elements of the collection are then iterated through as each is compared to the stored value. Should a value be less than the one currently held in minVal then minVal is changed to store the newly seen smaller value.

Once all elements have been processed, the minimum value in the collection is stored in minVal.

However simple, this solution is totally wrong on multi-core and multi-processor hardware. Running the min function on a dual-core chip can leverage at most 50% of the computing power of the chip. On a quad-core, it would be only 25%. So in this latter case, this algorithm effectively wastes 75% of the computing power of the chip.

Tree-like structures prove to be more appropriate for parallel processing.

The min function in our example doesn’t need to iterate through all the elements in a row and compare their values with the minVal variable. What it can do, instead, is rely on the multi-core/multi-processor nature of our hardware.

A parallel_min function can, for example, compare pairs (or tuples of a certain size) of neighboring values in the collection and promote the smallest value from the tuple into a next round of comparisons. Searching for the minimum in different tuples can safely happen in parallel, so tuples in the same round can be processed by different cores at the same time without races or contention among threads.
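The tree-shaped idea behind parallel_min can be sketched directly on the JDK Fork/Join framework. The code below is a minimal illustration, not how GPars implements minParallel: the class name ParallelMin, the THRESHOLD value and the use of the common pool (Java 8 or later) are all assumptions made for the example. Instead of pairs, each leaf scans a small chunk sequentially, and the winners of two halves are compared on the way back up.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide-and-conquer minimum on the JDK Fork/Join framework.
// Illustrative sketch only; names and the threshold are assumptions.
final class ParallelMin extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 1_000; // assumed cut-off; tune per workload

    private final int[] values;
    private final int from; // inclusive
    private final int to;   // exclusive

    ParallelMin(int[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Integer compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: scan sequentially, starting from a real element.
            int min = values[from];
            for (int i = from + 1; i < to; i++) {
                if (values[i] < min) min = values[i];
            }
            return min;
        }
        int middle = from + (to - from) / 2;
        ParallelMin left = new ParallelMin(values, from, middle);
        ParallelMin right = new ParallelMin(values, middle, to);
        left.fork();                     // schedule the left half asynchronously
        int rightMin = right.compute();  // work on the right half in this thread
        return Math.min(left.join(), rightMin); // promote the smaller winner
    }

    static int min(int[] values) {
        if (values.length == 0) throw new IllegalArgumentException("empty array");
        return ForkJoinPool.commonPool().invoke(new ParallelMin(values, 0, values.length));
    }

    public static void main(String[] args) {
        int[] data = new int[100_000];
        for (int i = 0; i < data.length; i++) data[i] = (i * 31) % 9_973 + 10;
        data[54_321] = -7;
        System.out.println("min = " + min(data));
    }
}
```

No task writes to shared state: each one reads its own slice and returns a value, which is why the halves can run on different cores without locks.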

Meet Parallel Arrays

Although not part of JDK7, the extra166y library brings a very convenient abstraction called Parallel Arrays (http://groovy.dzone.com/articles/parallelize-your-arrays-with-j), and GPars has harnessed this mechanism to provide a very Groovy API.

What is extra166y ?

extra166y is an implementation of Java collections supporting parallel operations using the Fork/Join concurrency framework provided by JSR-166. It was never made part of the JDK, unlike the jsr166y library. In fact, extra166y was made redundant from JDK8 onwards by the Streams framework. Therefore, to continue to support JDK7, GPars includes a copy of extra166y, so there is no external dependency.

As noted earlier, work is underway to rewrite the GPars API in terms of Streams for users of JDK8 onwards. Of course people using JDK8 onwards can simply use Streams directly from Groovy.

How ?

GPars leverages the Parallel Arrays implementation in several ways. The GParsPool and GParsExecutorsPool classes provide parallel variants of the common Groovy iteration methods like each, collect, findAll, etc.

A Parallel Sample

def selfPortraits = images.findAllParallel{it.contains me}.collectParallel{it.resize()}

It also allows for a more functional map/reduce style of collections processing.

A Map/Reduce Sample

def smallestSelfPortrait = images.parallel.filter{it.contains me}.map{it.resize()}.min{it.sizeInMB}

GParsPool

Use of GParsPool, the JSR-166y-based concurrent collection processor

Usage

The GParsPool class provides a ParallelArray-based (from JSR-166y) concurrency DSL for collections and objects.

Examples of use:

Some Parallel Samples

import groovyx.gpars.GParsPool
import java.util.concurrent.atomic.AtomicInteger

// Summarize numbers concurrently.
GParsPool.withPool {
    final AtomicInteger result = new AtomicInteger(0)
    [1, 2, 3, 4, 5].eachParallel{result.addAndGet(it)}

    assert 15 == result
}

// Multiply numbers asynchronously.
GParsPool.withPool {
    final List result = [1, 2, 3, 4, 5].collectParallel{it * 2}

    assert ([2, 4, 6, 8, 10].equals(result))
}

The passed-in closure takes an instance of a ForkJoinPool as a parameter, which can then be freely used inside the closure.

A ForkJoinPool Sample

// Check whether all elements within a collection meet certain criteria.
GParsPool.withPool(5) {ForkJoinPool pool ->
    assert [1, 2, 3, 4, 5].everyParallel{it > 0}

    assert ![1, 2, 3, 4, 5].everyParallel{it > 1}
}

The GParsPool.withPool method takes optional parameters for the number of threads in the created pool plus an unhandled exception handler.

An Exception Handler Sample With Number of Threads Required

withPool(10){...}
withPool(20, exceptionHandler){...}

Pool Reuse

The GParsPool.withExistingPool method takes an already existing ForkJoinPool instance to reuse. The DSL is valid only within the associated block of code and only for the thread that has called the withPool or withExistingPool methods. The withPool method returns only after all the worker threads have finished their tasks and the pool has been destroyed, returning the resulting value of the associated block of code. The withExistingPool method doesn’t wait for the pool threads to finish.
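The difference between the two lifecycles is easy to picture in plain Java. The sketch below is not the GPars implementation; it only mirrors the behaviour described above, and the helper names, the Function-based signature and the 30-second wait are assumptions made for illustration (Java 8 or later).

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Plain-JDK sketch of the two pool lifecycles; helper names are hypothetical.
final class WithPoolSketch {

    // Creates a pool, runs the block, then shuts the pool down and waits for its workers.
    static <T> T withPool(int threads, Function<ForkJoinPool, T> block) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return block.apply(pool);
        } finally {
            pool.shutdown();
            try {
                pool.awaitTermination(30, TimeUnit.SECONDS); // assumed time limit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Runs the block on a pool owned by the caller; the pool is left running.
    static <T> T withExistingPool(ForkJoinPool pool, Function<ForkJoinPool, T> block) {
        return block.apply(pool);
    }

    public static void main(String[] args) {
        Integer answer = withPool(4, pool -> pool.submit(() -> 6 * 7).join());
        System.out.println("withPool result: " + answer);

        ForkJoinPool shared = new ForkJoinPool(2);
        Integer again = withExistingPool(shared, pool -> pool.submit(() -> 6 * 7).join());
        System.out.println("withExistingPool result: " + again
                + ", pool shut down: " + shared.isShutdown());
        shared.shutdown(); // the owner decides when the shared pool goes away
    }
}
```

Creating and destroying a pool per block is convenient but not free, which is the usual reason to hold on to one pool and pass it to the existing-pool variant when the same code runs often.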

Alternatively, the withPool method can be statically imported with import static groovyx.gpars.GParsPool.withPool, so we can omit the GParsPool class name.

A Pool Sample

withPool {
    assert [1, 2, 3, 4, 5].everyParallel{it > 0}
    assert ![1, 2, 3, 4, 5].everyParallel{it > 1}
}

The following methods are currently supported on all objects in Groovy:

• eachParallel

• eachWithIndexParallel

• collectParallel

• collectManyParallel

• findAllParallel

• findAnyParallel

• findParallel

• everyParallel

• anyParallel

• grepParallel

• groupByParallel

• foldParallel

• minParallel

• maxParallel

• sumParallel

• splitParallel

• countParallel


Meta-class Enhancer

As an alternative, you can use the ParallelEnhancer class to enhance meta-classes of any classes or individual instances with the parallel methods.

An Enhanced Sample

import groovyx.gpars.ParallelEnhancer

def list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ParallelEnhancer.enhanceInstance(list)
println list.collectParallel {it * 2 }

def animals = ['dog', 'ant', 'cat', 'whale']
ParallelEnhancer.enhanceInstance animals
println (animals.anyParallel {it ==~ /ant/} ? 'Found an ant' : 'No ants found')
println (animals.everyParallel {it.contains('a')} ? 'All animals contain a' : 'Some animals can live without an a')

When using the ParallelEnhancer class, you’re not restricted to a withPool block when using the GParsPool DSLs. The enhanced classes or instances remain enhanced until they are garbage collected.

Exception Handling

If an exception is thrown while processing any of the passed-in closures, the first exception is re-thrown from the xxxParallel methods and the algorithm stops as soon as possible.

Exception Handling

The exception handling mechanism of GParsPool builds on the one provided in the Fork/Join framework. Since Fork/Join algorithms are by nature hierarchical, once any part of the algorithm fails, there’s usually little benefit continuing the computation, since some branches of the algorithm will never return a result.

Bear in mind that the GParsPool implementation doesn’t give any guarantees about its behavior after a first unhandled exception occurs, beyond stopping the algorithm and re-throwing the first detected exception to the caller. This behavior, after all, is consistent with what the traditional sequential iteration methods do.
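The same shape of behaviour can be observed with plain JDK parallel streams, which also run on a Fork/Join pool. The sketch below is an analogy rather than GPars code: FirstFailureSketch and sumOfDoubles are made-up names, and it assumes Java 8 or later. A failure inside one element's processing surfaces in the calling thread as an exception of the same type.

```java
import java.util.Arrays;
import java.util.List;

// Plain-JDK analogy: a failure in one parallel task is re-thrown to the caller.
// Names are hypothetical; this is not the GPars API.
final class FirstFailureSketch {

    // Doubles each element in parallel and sums the results.
    // A negative element makes the task processing it fail.
    static int sumOfDoubles(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToInt(n -> {
                    if (n < 0) throw new IllegalArgumentException("negative input: " + n);
                    return n * 2;
                })
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("Sum: " + sumOfDoubles(Arrays.asList(1, 2, 3)));
        try {
            sumOfDoubles(Arrays.asList(1, -2, 3, -4));
        } catch (IllegalArgumentException e) {
            // With several failing elements, which failure is reported is not guaranteed.
            System.out.println("Caller saw: " + e);
        }
    }
}
```

As with the xxxParallel methods, the caller should treat the whole computation as failed and should not assume anything about which other elements were processed before the exception arrived.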

Transparently Parallel Collections

On top of adding new xxxParallel methods, GPars can also let you change the semantics of the original iteration methods.

For example, you may be passing a collection into a library method, which will process your collection in a sequential way, let’s say, by using the collect method. Then by changing the semantics of the collect method on your collection, you can effectively parallelize this sequential library code.

A makeConcurrent() Sample

GParsPool.withPool {

    //The selectImportantNames() will process the name collections concurrently
    assert ['ALICE', 'JASON'] == selectImportantNames(['Joe', 'Alice', 'Dave', 'Jason'].makeConcurrent())
}

/**
 * A function implemented using standard sequential collect() and findAll() methods.
 */
def selectImportantNames(names) {
    names.collect {it.toUpperCase()}.findAll{it.size() > 4}
}

The makeSequential method will reset the collection back to the original sequential semantics.

A Sequential Sample

import static groovyx.gpars.GParsPool.withPool

def list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

println 'Sequential: '
list.each { print it + ',' }
println()

withPool {

    list.makeConcurrent()

    println 'Concurrent: '
    list.each { print it + ',' }
    println()

    list.makeSequential()

    println 'Sequential: '
    list.each { print it + ',' }
    println()
}

println 'Sequential: '
list.each { print it + ',' }
println()

The asConcurrent() convenience method allows us to specify code blocks, where the collection maintains concurrent semantics.

An asConcurrent() Sample

import static groovyx.gpars.GParsPool.withPool

def list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

withPool {

    list.asConcurrent {
        println 'Concurrent: '
        list.each { print it + ',' }
        println()
    }

    println 'Sequential: '
    list.each { print it + ',' }
    println()
}


Code Samples

Transparent parallelism, including the makeConcurrent(), makeSequential() and asConcurrent() methods, is also available in combination with our ParallelEnhancer.

A ParallelEnhancer Sample

import groovyx.gpars.ParallelEnhancer

/**
 * A function implemented using standard sequential collect() and findAll() methods.
 */
def selectImportantNames(names) {
    names.collect {it.toUpperCase()}.findAll{it.size() > 4}
}

def names = ['Joe', 'Alice', 'Dave', 'Jason']
ParallelEnhancer.enhanceInstance(names)

//The selectImportantNames() will process the name collections concurrently
assert ['ALICE', 'JASON'] == selectImportantNames(names.makeConcurrent())

Another ParallelEnhancer Sample

import groovyx.gpars.ParallelEnhancer

def list = [1, 2, 3, 4, 5, 6, 7, 8, 9]


ParallelEnhancer.enhanceInstance(list)


list.asConcurrent {
    println 'Concurrent: '
    list.each { print it + ',' }
    println()

}
list.makeSequential()


Avoid Side-Effects in Functions

We have to warn you. Since the closures that are provided to the parallel methods like eachParallel() or collectParallel() may be run in parallel, you have to make sure that each of the closures is written in a thread-safe manner. The closures must hold no internal state, share no data and have no side-effects beyond the boundaries of the single element that they’ve been invoked on. Violations of these rules will open the door for race conditions and deadlocks, the most severe enemies of a modern multi-core programmer.

Don’t do this !

Concurrently Accessing a non-Thread-Safe Collection

def thumbnails = []
images.eachParallel {thumbnails << it.thumbnail} //Concurrently accessing a not-thread-safe collection of thumbnails? Don't do this!

At least, you’ve been warned.
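The usual way out is to let each element produce a value and leave the assembling of the result to the framework, which in GPars terms means returning from collectParallel rather than appending inside eachParallel. The same idea, sketched with plain JDK parallel streams (an analogy, not GPars code; the class name, the method name and the "thumb-" prefix are invented for illustration, Java 8 or later):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Side-effect-free alternative: each element is mapped independently and the
// framework builds the result list. Plain JDK analogy; names are hypothetical.
final class SideEffectFreeSketch {

    static List<String> thumbnails(List<String> images) {
        return images.parallelStream()
                .map(name -> "thumb-" + name)   // touches only its own element
                .collect(Collectors.toList());  // thread-safe assembly, order preserved
    }

    public static void main(String[] args) {
        System.out.println(thumbnails(Arrays.asList("cat.png", "dog.png", "owl.png")));
    }
}
```

Nothing outside the mapping function is written to, so there is no shared list to corrupt and no lock to forget.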

It May Not Execute The Way You Expect

Because GParsPool uses a Fork/Join pool (with work stealing), threads may not be applied to a waiting processing task even though they may appear idle.

With a work-stealing algorithm, worker threads that run out of things to do can steal tasks from other threads that are still busy.

If you use GParsExecutorsPool (which doesn’t use Fork/Join), you’ll get the thread allocation behavior that you would naively expect.
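To make the fork/join idea more tangible, here is a small sketch that uses the JDK's own java.util.concurrent.ForkJoinPool directly rather than anything GPars-specific. SumTask and its 1000-element threshold are hypothetical, chosen only for illustration. Forked subtasks are queued, and idle workers may steal them, which is why thread usage can differ from what a plain executor would show.

```groovy
import java.util.concurrent.ForkJoinPool
import java.util.concurrent.RecursiveTask

// Hypothetical task: sums data[from..<to] by repeatedly splitting the range in half
class SumTask extends RecursiveTask<Long> {
    List<Integer> data
    int from
    int to

    @Override
    protected Long compute() {
        if (to - from <= 1000) {
            long sum = 0
            for (int i = from; i < to; i++) sum += data[i]
            return sum
        }
        int mid = (from + to).intdiv(2)
        def left = new SumTask(data: data, from: from, to: mid)
        def right = new SumTask(data: data, from: mid, to: to)
        left.fork()                           // queued; an idle worker may steal it
        return right.compute() + left.join()  // work on one half, then wait for the other
    }
}

def numbers = (1..100000).toList()
def pool = new ForkJoinPool(4)
long total = pool.invoke(new SumTask(data: numbers, from: 0, to: numbers.size()))
pool.shutdown()

assert total == 5000050000L
```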

GParsExecutorsPool

Use of GParsExecutorsPool - the Java Executors-based concurrent collection processor

Usage of GParsExecutorsPool

The GParsExecutorsPool class enables a Java Executors-based concurrency DSL for collections and objects.

The GParsExecutorsPool class can be used as a pure-JDK-based collections parallel processor. Unlike the GParsPool class, GParsExecutorsPool doesn’t require fork/join thread pools but, instead, leverages the standard JDK executor services to parallelize closures to process a collection or an object iteratively.

It needs to be stated, however, that GParsPool typically performs much better than GParsExecutorsPool does.
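For orientation, the sketch below shows the kind of plain JDK executor plumbing that such a DSL spares you from writing. It is an assumption-laden illustration using only java.util.concurrent, not the GParsExecutorsPool implementation.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

def executor = Executors.newFixedThreadPool(3)
def results
try {
    // One Callable per element, which is roughly what a parallel collect has to arrange
    def tasks = [1, 2, 3, 4, 5].collect { n -> ({ n * 10 } as Callable) }
    // invokeAll() returns the Futures in task order; get() blocks for each value
    results = executor.invokeAll(tasks)*.get()
} finally {
    executor.shutdown()
}

assert results == [10, 20, 30, 40, 50]
```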


Examples of Use

A GParsExecutorsPool Example

//multiply numbers asynchronously
GParsExecutorsPool.withPool {
    Collection<Future> result = [1, 2, 3, 4, 5].collectParallel{it * 10}

    assert new HashSet([10, 20, 30, 40, 50]) == new HashSet((Collection)result*.get())
}

//multiply numbers asynchronously using an asynchronous closure
GParsExecutorsPool.withPool {
    def closure={it * 10}
    def asyncClosure=closure.async()

    Collection<Future> result = [1, 2, 3, 4, 5].collect(asyncClosure)

    assert new HashSet([10, 20, 30, 40, 50]) == new HashSet((Collection)result*.get())
}

The passed-in closure takes an instance of an ExecutorService as a parameter, which can then be used freely inside the closure.

Another GParsExecutorsPool Example

//submit a long-running task directly to the supplied executor service
GParsExecutorsPool.withPool(5) {ExecutorService service ->
    service.submit({performLongCalculation()} as Runnable)
}

The GParsExecutorsPool.withPool() method takes optional parameters declaring the number of threads in the created pool and a thread factory.

An Example Declaring Required Thread Count

withPool(10) {...}
withPool(20, threadFactory) {...}

The GParsExecutorsPool.withExistingPool() takes an already existing executor service instance to reuse. The DSL is only valid within the associated block of code and only for the thread that has called the withPool() or withExistingPool() method.

Did you know the withExistingPool() method doesn’t wait for executor service threads to finish?

The withPool() method returns control only after all the worker threads have finished their tasks and the executor service has been destroyed, returning the resulting value of the associated block of code.

Statically import the GParsExecutorsPool class as import static groovyx.gpars.GParsExecutorsPool.* to omit the GParsExecutorsPool class name.

A FindParallel Example

withPool {
    def result = [1, 2, 3, 4, 5].findParallel{Number number -> number > 2}
    assert result in [3, 4, 5]
}

The following methods are currently supported on all objects that support iteration in Groovy:

• eachParallel()

• eachWithIndexParallel()

• collectParallel()

• findAllParallel()

• findParallel()

• allParallel()

• anyParallel()

• grepParallel()

• groupByParallel()

Meta-class Enhancer

As an alternative, you can use the GParsExecutorsPoolEnhancer class to enhance meta-classes for any classes or individual instances having asynchronous methods.


Enhancing Your Code

import groovyx.gpars.GParsExecutorsPoolEnhancer

def list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
GParsExecutorsPoolEnhancer.enhanceInstance(list)
println list.collectParallel {it * 2 }

def animals = ['dog', 'ant', 'cat', 'whale']
GParsExecutorsPoolEnhancer.enhanceInstance animals

println (animals.anyParallel {it ==~ /ant/} ? 'Found an ant' : 'No ants found')
println (animals.allParallel {it.contains('a')} ? 'All animals contain a' : 'Some animals can live without an a')

When using the GParsExecutorsPoolEnhancer class, you’re not restricted to a withPool() block with the use of the GParsExecutorsPool DSLs. The enhanced classes or instances remain enhanced until they are garbage collected.

Exception Handling

Exceptions can be thrown while processing any of the passed-in closures. An instance of the AsyncException class will wrap any or all of the original exceptions and be re-thrown from the xxxParallel methods.

Avoid Side-effects in Functions

Once again we need to warn you about using closures with side effects. Please avoid logic that affects objects beyond the scope of the single, currently processed element. Please avoid logic or closures that keep state. Don’t do that! It’s dangerous to pass them to any of the xxxParallel() methods.

Memoize

The memoize function enables caching of a function’s return values. Repeated calls to the memoized function with the same argument values will, instead of invoking the calculation encoded in the original function, retrieve the resulting value from an internal, transparent cache.

Provided the calculation is considerably slower than retrieving a cached value from the cache, developers can trade off memory for performance.

Check out the example below, where we attempt to scan multiple websites for particular content.

The memoize functionality of GPars was donated to Groovy for version 1.8 and if you run on Groovy 1.8 or later, we recommend you use the Groovy functionality.

Memoize, in GPars, is almost identical, except that it searches the memoized caches concurrently using the surrounding thread pool. This may give performance benefits in some scenarios.

Memoize Me Up, Scotty

The GPars memoize functionality has been renamed to avoid future conflicts with the memoize functionality in Groovy.

GPars now calls these methods with a preceding letter g, such as gmemoize().

Examples Of Use

A GParsPool Example With gmemoize()

GParsPool.withPool {
    def urls = ['http://www.dzone.com', 'http://www.theserverside.com', 'http://www.infoq.com']

    Closure download = {url ->
        println "Downloading $url"
        url.toURL().text.toUpperCase()
    }

    Closure cachingDownload = download.gmemoize()

    println 'Groovy sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('GROOVY')}
    println 'Grails sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('GRAILS')}
    println 'Griffon sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('GRIFFON')}
    println 'Gradle sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('GRADLE')}
    println 'Concurrency sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('CONCURRENCY')}
    println 'GPars sites today: ' + urls.findAllParallel {url -> cachingDownload(url).contains('GPARS')}
}

Notice how closures are enhanced inside the GParsPool.withPool() blocks with a gmemoize() function. This returns a new closure wrapping the original closure with a cache.

In the previous example, we’re calling the cachingDownload function in several places in the code. However, each unique url is downloaded only once, the first time it’s needed. The values are then cached and available for subsequent calls. Additionally, these values are also available to all threads, no matter which thread originally came first with a download request for that particular url and had to handle the actual calculation/download.

So, to wrap up, a memoize call shields a function by using a cache of past return values.

However, memoize can do even more! In some algorithms, adding a little memory may have a dramatic impact on the computational complexity of the calculation. Let’s look at a classical example of Fibonacci numbers.

Fibonacci Example

A purely functional, recursive implementation that follows the definition of Fibonacci numbers is exponentially complex:

A Fibonacci Example

Closure fib = {n -> n > 1 ? call(n - 1) + call(n - 2) : n}

Try calling the fib function with numbers around 30 and you’ll see how slow it is.

Now with a little twist and an added memoize cache, the algorithm magically turns into a linearly complex one:

A Better Version of the Fibonacci Example

Closure fib
fib = {n -> n > 1 ? fib(n - 1) + fib(n - 2) : n}.gmemoize()

The extra memory we added has now cut off all but one recursive branch of the calculation. And all subsequent calls to the same fib function will also benefit from the cached values.

Look below to see how the memoizeAtMost variant can reduce memory consumption in our example, yet preserve the linear complexity of the algorithm.

Available Variants

Memoize

The basic variant keeps values in the internal cache for the whole lifetime of the memoized function. It provides the best performance characteristics of all the variants.

memoizeAtMost

Allows us to set a hard limit on the number of items cached. Once the limit has been reached, all subsequently added values will eliminate the oldest value from the cache using the LRU (Least Recently Used) strategy.

So for our Fibonacci number example, we could safely reduce the cache size to two items:

A Cached Fibonacci Example

Closure fib
fib = {n -> n > 1 ? fib(n - 1) + fib(n - 2) : n}.memoizeAtMost(2)

Setting an upper limit on the cache size serves two purposes:

• Keeps the memory footprint of the cache within defined boundaries

• Preserves desired performance characteristics of the function. Too large a cache increases the time to retrieve a cached value, compared to the time it would have taken to calculate the result directly.

memoizeAtLeast

Allows unlimited growth of the internal cache until the JVM’s garbage collector decides to step in and evict SoftReference entries (used by our implementation) from the memory.

The single parameter to the memoizeAtLeast() method indicates the minimum number of cached items that should be protected from gc eviction. The cache will never shrink below the specified number of entries. The cache ensures it only protects the most recently used items from eviction using the LRU (Least Recently Used) strategy.

memoizeBetween

Combines the memoizeAtLeast and memoizeAtMost methods to allow the cache to grow and shrink in the range between the two parameter values depending on available memory and the gc activity.

The cache size will never exceed the upper size limit to preserve desired performance characteristics of the cache.
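The effect of the cache on the Fibonacci example can be observed by counting how often the closure body really runs. The sketch below deliberately uses Groovy's built-in memoize() and memoizeAtMost() (which the guide recommends on Groovy 1.8 or later) so that it runs without GPars. The calls counter and the boundedFib name are illustrative additions, not part of the guide's examples.

```groovy
int calls = 0

Closure fib
fib = { int n ->
    calls++    // counts real invocations; cache hits never reach this line
    n > 1 ? fib(n - 1) + fib(n - 2) : n
}.memoize()

assert fib(30) == 832040
assert calls == 31    // one computation for each n in 0..30

// Bounded variant: the cache keeps at most two entries
Closure boundedFib
boundedFib = { int n -> n > 1 ? boundedFib(n - 1) + boundedFib(n - 2) : n }.memoizeAtMost(2)

assert boundedFib(20) == 6765
```

Without the cache, the same call to fib(30) would invoke the closure well over a million times.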

Map-Reduce

The Parallel Collection Map/Reduce DSL gives GPars a more functional flavor. In general, the Map/Reduce DSL may be used for the same purpose as the xxxParallel() family of methods and has very similar semantics. On the other hand, Map/Reduce can perform considerably faster, if you need to chain multiple methods together to process a single collection in multiple steps:

A Map-Reduce Example

println 'Number of occurrences of the word GROOVY today: ' + urls.parallel
        .map {it.toURL().text.toUpperCase()}
        .filter {it.contains('GROOVY')}
        .map{it.split()}
        .map{it.findAll{word -> word.contains 'GROOVY'}.size()}
        .sum()


The xxxParallel() methods must follow the same contract as their non-parallel peers. So a collectParallel() method must return a legal collection of items, which you can treat as a Groovy collection.

Internally, the parallel collect method builds an efficient parallel structure, called a Parallel Array. It then performs the required operation concurrently. Before returning, it destroys the Parallel Array as it builds a collection of results to return to you. A potential call to, for example, findAllParallel() on the resulting collection would repeat the whole process of construction and destruction of a Parallel Array instance under the covers.

With Map/Reduce, you turn your collection into a Parallel Array and back again only a single time. The Map/Reduce family of methods do not return Groovy collections, but can freely pass along the internal Parallel Arrays directly.

Invoking the parallel property of a collection will build a Parallel Array for the collection and then return a thin wrapper around the Parallel Array instance. Then you can chain any of these methods together to get an answer:

• map()

• reduce()

• filter()

• size()

• sum()

• min()

• max()

• sort()

• groupBy()

• combine()

Returning a plain Groovy collection instance is always just a matter of retrieving the collection property.

A Map-Reduce Example

def myNumbers = (1..1000).parallel.filter{it % 2 == 0}.map{Math.sqrt it}.collection
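A slightly longer chain may help to show how the steps pass one Parallel Array along before a result is extracted. This is only a sketch: the words list is made up for the illustration, and GPars 1.2.1 is assumed to be resolvable through @Grab.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

// Hypothetical input data
def words = ['groovy', 'gpars', 'java', 'fork', 'join']

def total = GParsPool.withPool {
    words.parallel
            .filter { it.size() > 4 }    // keeps 'groovy' and 'gpars'
            .map { it.size() }           // 6 and 5
            .sum()                       // a single value, no intermediate Groovy collections
}
assert total == 11

def lengths = GParsPool.withPool {
    // The collection property converts back to a plain Groovy collection
    words.parallel.map { it.size() }.collection.toList()
}
assert lengths.sort() == [4, 4, 4, 5, 6]
```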

Avoid Side-effects in Functions

Once again we need to warn you. To avoid nasty surprises, please keep any closures you pass to the Map/Reduce functions stateless and free of side effects.

To avoid nasty surprises keep your closures stateless


Availability

This feature is only available when using the Fork/Join-based GParsPool, not the GParsExecutorsPool.

Classical Example

A classical example, inspired by thevery, counts occurrences of words in a string:

A Telling Example


def words = "This is just plain text to count words in"
print count(words)

def count(arg) {
    withPool {
        return arg.parallel
            .map{[it, 1]}
            .groupBy{it[0]}.getParallel()
            .map {it.value=it.value.size();it}
            .sort{-it.value}.collection
    }
}

The same example can be implemented with the more general combine operation:

A Combine Example

def words = "This is just plain text to count words in"
print count(words)

def count(arg) {
    withPool {
        return arg.parallel
            .map{[it, 1]}
            .combine(0) {sum, value -> sum + value}.getParallel()
            .sort{-it.value}.collection
    }
}


http://github.com/thevery

Combine

The combine operation expects an input list of tuples (two-element lists), often considered to be key-value pairs (such as [ [key1, value1], [key2, value2], [key1, value3], [key3, value4] … ] ). These might have potentially repeating keys.

When invoked, the combine method merges the values of identical keys using the provided accumulator function. This produces a map of the original (unique) keys and their (now) accumulated values.

E.g. [[a, b], [c, d], [a, e], [c, f]] will be combined into [a : b+e, c : d+f]. Some logic like the '+' operation for the values will need to be provided as the accumulation closure logic.

The accumulation function argument needs to specify a function to use when combining (accumulating) values belonging to the same key. An initial accumulator value needs to be provided as well.

Since the combine method processes items in parallel, the initial accumulator value will be reused multiple times. Thus the provided value must allow for reuse.

It should either be a cloneable (or immutable) value or a closure returning a fresh initial accumulator each time it’s requested. Good combinations of accumulator functions and reusable initial values include:

Some Examples of a Combining-Accumulator Function and Reusable Initial Value

accumulator = {List acc, value -> acc << value} initialValue = []
accumulator = {List acc, value -> acc << value} initialValue = {-> []}
accumulator = {int sum, int value -> sum + value} initialValue = 0
accumulator = {int sum, int value -> sum + value} initialValue = {-> 0}
accumulator = {ShoppingCart cart, Item value -> cart.addItem(value)} initialValue = {-> new ShoppingCart()}

The return type is a map.

E.g. [['he', 1], ['she', 2], ['he', 2], ['me', 1], ['she', 5], ['he', 1]] with an initial value of zero will combine into ['he' : 4, 'she' : 7, 'me' : 1]
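The same input can be written out as a small script. This is a hedged sketch that assumes GPars 1.2.1 is resolvable through @Grab; the variable names pairs and totals are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

def pairs = [['he', 1], ['she', 2], ['he', 2], ['me', 1], ['she', 5], ['he', 1]]

def totals = GParsPool.withPool {
    // 0 is immutable, so it can safely be reused as the initial accumulator
    pairs.parallel.combine(0) { sum, value -> sum + value }
}

assert totals == ['he': 4, 'she': 7, 'me': 1]
```

The keys here are plain Strings and the initial value is an immutable Integer, which keeps the example clear of the reuse concerns described above.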

Comparison Logic

The keys will be mutually compared using their equals and hashCode methods. Consider using @Canonical or @EqualsAndHashCode annotations to annotate objects you use as keys.

As with all hash maps in Groovy, be sure you’re using a String not a GString as a key!

For more involved scenarios when you combine() complex objects, a good strategy here is to have a complete class to use as a key for common use cases and to apply different keys for uncommon cases.

A Complex Example

import groovy.transform.ToString
import groovy.transform.TupleConstructor

// declare a complete class to use in combination processing
@TupleConstructor @ToString
class PricedCar implements Cloneable { // either Cloneable or Immutable
    String model
    String color
    Double price

    // declare a way to resolve comparison logic
    boolean equals(final o) {
        if (this.is(o)) return true
        if (getClass() != o.class) return false

        final PricedCar pricedCar = (PricedCar) o

        if (color != pricedCar.color) return false
        if (model != pricedCar.model) return false

        return true
    }

    int hashCode() {
        int result
        result = (model != null ? model.hashCode() : 0)
        result = 31 * result + (color != null ? color.hashCode() : 0)
        return result
    }

    @Override
    protected Object clone() {
        return super.clone()
    }
}

// some data
def cars = [new PricedCar('F550', 'blue', 2342.223),
            new PricedCar('F550', 'red', 234.234),
            new PricedCar('Da', 'white', 2222.2),
            new PricedCar('Da', 'white', 1111.1)]

withPool {
    //Combine by model
    def result = cars.parallel.map {
        [it.model, it]
    }.combine(new PricedCar('', 'N/A', 0.0)) {sum, value ->
        sum.model = value.model
        sum.price += value.price
        sum
    }.values()

    println result

    //Combine by model and color (using the PricedCar's equals and hashCode)
    result = cars.parallel.map {
        [it, it]
    }.combine(new PricedCar('', 'N/A', 0.0)) {sum, value ->
        sum.model = value.model
        sum.color = value.color
        sum.price += value.price
        sum
    }.values()

    println result
}

Parallel Arrays

As an alternative, the efficient tree-based data structures defined in JSR-166y - Java Concurrency can be used directly. The parallelArray property on any collection or object will return a ParallelArray instance holding the elements of the original collection. These can then be manipulated through the jsr166y API.

Please refer to the jsr166y documentation for API details.


https://en.wikipedia.org/wiki/Java_concurrency




A Parallel Array Example

import groovyx.gpars.extra166y.Ops

groovyx.gpars.GParsPool.withPool {
    assert 15 == [1, 2, 3, 4, 5].parallelArray.reduce({a, b -> a + b} as Ops.Reducer, 0)    //summarize

    assert 55 == [1, 2, 3, 4, 5].parallelArray.withMapping({it ** 2} as Ops.Op).reduce({a, b -> a + b} as Ops.Reducer, 0)    //summarize squares

    assert 20 == [1, 2, 3, 4, 5].parallelArray.withFilter({it % 2 == 0} as Ops.Predicate)    //summarize squares of even numbers
        .withMapping({it ** 2} as Ops.Op)
        .reduce({a, b -> a + b} as Ops.Reducer, 0)

    assert 'aa:bb:cc:dd:ee' == 'abcde'.parallelArray    //concatenate duplicated characters with separator
        .withMapping({it * 2} as Ops.Op)
        .reduce({a, b -> "$a:$b"} as Ops.Reducer, "")
}

Asynchronous Invocations

Long-running background tasks happen a lot in most systems.

Typically, a main thread of execution wants to initialize a few calculations, start downloads, do searches, etc. even when the results may not be needed immediately.

GPars gives developers the tools to schedule asynchronous activities for background processing and collect the results later, when they’re needed.

Usage of GParsPool and GParsExecutorsPool Asynchronous Processing Facilities

Both the GParsPool and GParsExecutorsPool classes provide nearly identical services while leveraging different underlying machinery.

Closures Enhancements

The following methods are added to closures inside the GPars(Executors)Pool.withPool() blocks:

• async() - To create an asynchronous variant of the supplied closure which, when invoked, returns a future object for the potential return value

• callAsync() - Calls a closure in a separate thread while supplying the given arguments, returning a future object for the potential return value

An async() Example

GParsPool.withPool() {
    Closure longLastingCalculation = {calculate()}
    Closure fastCalculation = longLastingCalculation.async()    //create a new closure, which starts the original closure on a thread pool

    Future result=fastCalculation()    //returns almost immediately

    //do stuff while calculation performs ...
    println result.get()
}

A callAsync() Example

GParsPool.withPool() {
    /**
     * The callAsync() method is an asynchronous variant of the default call() method to invoke a closure.
     * It will return a Future for the result value.
     */
    assert 6 == {it * 2}.call(3)
    assert 6 == {it * 2}.callAsync(3).get()
}

Timeouts

The callTimeoutAsync() methods, taking either a long value or a Duration instance, provide a timer mechanism.

A Timed Example

{->
    while(true) {
        Thread.sleep 1000    //Simulate a bit of interesting calculation
        if (Thread.currentThread().isInterrupted()) break;    //We've been cancelled
    }
}.callTimeoutAsync(2000)

To allow cancellation, our asynchronously running code must keep checking the interrupted flag of its own thread and stop calculating when/if the flag is set to true.
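The cooperative nature of cancellation can be tried out with nothing but the JDK. The sketch below is an assumption-based illustration using a plain ExecutorService and Future.cancel(true), not the callTimeoutAsync() implementation. The two latches only make the demonstration deterministic.

```groovy
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

def executor = Executors.newSingleThreadExecutor()
def started = new CountDownLatch(1)
def stopped = new CountDownLatch(1)

def future = executor.submit({
    started.countDown()
    // Cooperative loop: keep working until this thread's interrupted flag is set
    while (!Thread.currentThread().isInterrupted()) {
        Thread.yield()
    }
    stopped.countDown()
} as Runnable)

started.await()
future.cancel(true)    // interrupts the worker thread
boolean stoppedInTime = stopped.await(5, TimeUnit.SECONDS)
executor.shutdown()

assert stoppedInTime
assert future.isCancelled()
```

If the loop never looked at the flag, cancel(true) would mark the future as cancelled while the worker kept spinning.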

Executor Service Enhancements

The ExecutorService and ForkJoinPool classes are enhanced with the '<<' (leftShift) operator to submit tasks to the pool and return a Future for the result.

A Convenient Example Using '<<'

GParsExecutorsPool.withPool {ExecutorService executorService ->
    executorService << {println 'Inside parallel task'}
}

Running Functions (closures) in Parallel

The GParsPool and GParsExecutorsPool classes also provide handy methods executeAsync() and executeAsyncAndWait() to easily run multiple closures asynchronously.

Example:

An Example

GParsPool.withPool {
    assert [10, 20] == GParsPool.executeAsyncAndWait({calculateA()}, {calculateB()})    //waits for results
    assert [10, 20] == GParsPool.executeAsync({calculateA()}, {calculateB()})*.get()    //returns Futures instead and doesn't wait for results to be calculated
}

Composable Asynchronous Functions

Functions are to be composed. In fact, composing side-effect-free functions is very easy. Much easier and more reliable than composing objects, for example.

Given the same input, functions always return the same result. They never change their behavior unexpectedly, nor do they break when multiple threads call them at the same time.

Functions in Groovy

We can treat Groovy closures as functions. They take arguments, do their calculation and return a value. Provided you don’t let your closures touch anything outside their scope, your closures are well-behaved, just like pure functions. Functions that you can combine for a higher good.

A Higher Good Example

def sum = (0..100000).inject(0, {a, b -> a + b})

For this example, by combining a function adding two numbers {a, b -> a + b} with the inject function, which iterates through the whole collection, you can quickly sum up all items. Then, replacing the adding function with a comparison function immediately gives you a combined function to calculate maximums.

Find The Maximums

def max = myNumbers.inject(0, {a, b -> a>b?a:b})

You see, functional programming is popular for a reason.

Are We Concurrent Yet?

This all works just fine until you realize you’re not using the full power of your expensive hardware. These functions are just plain sequential! No parallelism is used! All but one of the processor cores are doing nothing. They’re idle, totally wasted!

A Generic Way to Use Asynchronous Functions

Those paying attention might decide to use the Parallel Collection techniques described earlier and they would certainly be correct.

For our scenario described here, where we process a collection, using those parallel methods would be the best choice. However, we’re now looking for a generic way to create and combine asynchronous functions. This would help us, not only for collections processing, but mostly in other, more generic, cases like the one right below.

All but one processor core is doing nothing! They’re idle! Totally wasted!

To make things more obvious, here’s an example of combining four functions, which are supposed to check whether a particular web page matches the contents of a local file. We need to download the page, load the file, calculate hashes of both and finally compare the resulting numbers.

An Example

Closure download = {String url ->
    url.toURL().text
}

Closure loadFile = {String fileName ->
    ... //load the file here
}

Closure hash = {s -> s.hashCode()}

Closure compare = {int first, int second ->
    first == second
}

def result = compare(hash(download('http://www.gpars.org')), hash(loadFile('/coolStuff/gpars/website/index.html')))
println "The result of comparison: " + result

We need to download the page, load up the file, calculate hashes of both and finally compare the resulting numbers. Each of the functions is responsible for one particular job. One function downloads the content, a second loads the file, a third calculates the hashes and finally the fourth one does the comparison.

Combining the functions is as simple as nesting their calls.

Making It All Asynchronous

The downside of our code is that we haven’t leveraged the independence of the download() and the loadFile() functions. Neither have we allowed the two hashes to be run concurrently. They could well run in parallel, but our approach to combining functions restricts parallelism.

Obviously not all of the functions can run concurrently. Some functions depend on results of others. They cannot start before the other function finishes. We need to block them until their parameters are available. The hash() function needs a string to work on. The compare() function needs two numbers to compare.

So we can only parallelize some of the functions, while blocking those that must wait for the results of others. Seems like a challenging task.

Things Are Bright in the Functional World

Luckily, the dependencies between functions are already expressed implicitly in the code. There’s no need to duplicate that dependency information. If one function takes parameters and the parameters need to be calculated first by another function, we implicitly have a dependency here.

The hash() function depends on loadFile() as well as on the download() functions in our example.

The inject function in our earlier example depended on the results of the addition functions gradually invoked on all elements of the collection.

Our task is, in fact, very simple!

However difficult it may seem at first, our task is, in fact, very simple. We only need to teach our functions to return a promise of their future results. And we need to teach the other functions to accept those promises as parameters so that they will wait for the real values before they start their work.

And if we convince the functions to release the threads they hold, while waiting for the values, we get directly to where the magic can happen.

In the best traditions of GPars, we’ve made it very straightforward for you to convince any function to believe in the promises of other functions. Call the asyncFun() function on a closure and you’re asynchronous!

Promises, Promises

withPool {
    def maxPromise = numbers.inject(0, {a, b -> a>b?a:b}.asyncFun())

    println "Look Ma, I can talk to the user while the math is being done for me!"
    println maxPromise.get()
}

The inject function doesn’t really care what objects are returned from the addition function. Maybe it’s a little surprised that each call to the addition function returns so fast, but it doesn’t moan much, keeps iterating and finally returns the overall result we expect.

Now is the time you should stand behind what you say and do what you want others to do. Don’t frown at the result and just accept that you got back just a promise. A promise to get the answer delivered as soon as the calculation is complete. The extra heat from your laptop is an indication that the calculation exploits natural parallelism in your functions and makes its best effort to deliver the result to you quickly.

A Promise Is A Promise

The promise is a good old DataflowVariable, so you can query its status, register some notification hooks or even make it an input to a Dataflow algorithm!

A Promising Example

withPool {
    def sumPromise = (0..100000).inject(0, {a, b -> a + b}.asyncFun())

    println "Are we done yet? " + sumPromise.bound

    sumPromise.whenBound {sum -> println sum}
}

Do You Need A Timeout?

The get() method also has a variant with a timeout parameter, if you want to avoid the risk of waiting indefinitely.
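A minimal sketch of the timed variant follows. It assumes GPars 1.2.1 is resolvable through @Grab and uses a made-up slowAnswer function; the five-second limit is arbitrary. What happens when the timeout expires is deliberately not shown here.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool
import java.util.concurrent.TimeUnit

def answer = GParsPool.withPool {
    // Hypothetical slow function, made asynchronous with asyncFun()
    Closure slowAnswer = { -> sleep 200; 42 }.asyncFun()
    def promise = slowAnswer()
    // Wait at most five seconds instead of blocking indefinitely
    promise.get(5, TimeUnit.SECONDS)
}

assert answer == 42
```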

Can Things Go Wrong?

Sure. But you’ll get an exception thrown from the promise get() method.

An Exceptional Example

try {
    sumPromise.get()
} catch (MyCalculationException e) {
    println "Guess, things are not ideal today."
}

This Is All Fine, But What Functions Can Really Be Combined?

There are no limits to your ambitions. Take any sequential functions you need to combine and you should be able to combine their asynchronous variants as well.

Review our initial example comparing the content of a file with a web page. We simply make all the functions asynchronous by calling the asyncFun() method on them and we are ready to set off.

Using The asyncFun() Example

Closure download = {String url ->
    url.toURL().text
}.asyncFun()

Closure loadFile = {String fileName ->
    ... //load the file here
}.asyncFun()

Closure hash = {s -> s.hashCode()}.asyncFun()

Closure compare = {int first, int second ->
    first == second
}.asyncFun()

def result = compare(hash(download('http://www.gpars.org')), hash(loadFile('/coolStuff/gpars/website/index.html')))

println 'Allowed to do something else now'
println "The result of comparison: " + result.get()

Calling Asynchronous Functions from Within Asynchronous Functions

Another very valuable attribute of asynchronous functions is that promises can be combined.

Promises can be combined !


An Asynchronous Function Within Another


withPool {
    Closure plus = {Integer a, Integer b ->
        sleep 3000
        println 'Adding numbers'
        a + b
    }.asyncFun();  // ok, here's one func

    Closure multiply = {Integer a, Integer b ->
        sleep 2000
        a * b
    }.asyncFun()  // and second one

    Closure measureTime = {->
        sleep 3000
        4
    }.asyncFun();  // and another

    // declare a function within a function
    Closure distance = {Integer initialDistance, Integer velocity, Integer time ->
        plus(initialDistance, multiply(velocity, time))
    }.asyncFun();  // and another

    Closure chattyDistance = {Integer initialDistance, Integer velocity, Integer time ->
        println 'All parameters are now ready - starting'
        println 'About to call another asynchronous function'
        def innerResultPromise = plus(initialDistance, multiply(velocity, time))
        println 'Returning the promise for the inner calculation as my own result'
        return innerResultPromise
    }.asyncFun();  // and declare (but not run) a final asynch. function

    // fine, now let's execute those previous asynch. functions
    println "Distance = " + distance(100, 20, measureTime()).get() + ' m'
    println "ChattyDistance = " + chattyDistance(100, 20, measureTime()).get() + ' m'
}

If an asynchronous function (such as the distance function in this example) calls another asynchronous function (e.g. plus) in its body and returns the promise of the invoked function, the inner function’s (plus) resulting promise will combine with the outer function’s (distance) result promise.

The inner function (plus) will now bind its result to the outer function’s (distance) promise, once the inner function (plus) finishes its calculation. This ability of promises to combine allows functions to cease their calculation without blocking a thread. This happens not only when waiting for parameters, but also whenever they call another asynchronous function anywhere in their code body.

Methods as Asynchronous Functions

Methods can be referred to as closures using the .& operator. These closures can then be transformed using the asyncFun method into composable asynchronous functions just like ordinary closures.

An Example

class DownloadHelper {
    String download(String url) {
        url.toURL().text
    }

    int scanFor(String word, String text) {
        text.findAll(word).size()
    }

    String lower(s) {
        s.toLowerCase()
    }
}

//now we'll make the methods asynchronous
withPool {
    final DownloadHelper d = new DownloadHelper()
    Closure download = d.&download.asyncFun()    // notice the .& syntax
    Closure scanFor = d.&scanFor.asyncFun()      // and here
    Closure lower = d.&lower.asyncFun()          // and here

    //asynchronous processing
    def result = scanFor('groovy', lower(download('http://www.infoq.com')))
    println 'Doing something else for now'
    println result.get()
}
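The .& operator itself can be explored without any network access or thread pool. The following plain-Groovy sketch uses a hypothetical TextHelper class and shows only the method-pointer step, before any asyncFun() call is applied.

```groovy
// Hypothetical helper class, used only to demonstrate method pointers
class TextHelper {
    String lower(String s) { s.toLowerCase() }
    int scanFor(String word, String text) { text.findAll(word).size() }
}

def helper = new TextHelper()
Closure lower = helper.&lower        // a closure that calls helper.lower()
Closure scanFor = helper.&scanFor

assert lower instanceof Closure
assert scanFor('groovy', lower('Groovy, GROOVY and groovy')) == 3
```

Because the method pointers are ordinary closures, they compose by nesting calls just like the closures in the earlier examples.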

Using Annotations to Create Asynchronous Functions

Instead of calling the asyncFun() function, the @AsyncFun annotation can be used to annotate Closure-typed fields. The fields have to be initialized in-place and the containing class needs to be instantiated within a withPool block.

An Annotation Example

import static groovyx.gpars.GParsPool.withPool
import groovyx.gpars.AsyncFun

class DownloadingSearch {
    @AsyncFun Closure download = {String url ->
        url.toURL().text
    }

    @AsyncFun Closure scanFor = {String word, String text ->
        text.findAll(word).size()
    }

    @AsyncFun Closure lower = {s -> s.toLowerCase()}

    void scan() {
        def result = scanFor('groovy', lower(download('http://www.infoq.com')))  //asynchronous processing
        println 'Allowed to do something else now'
        println result.get()
    }
}

withPool {
    new DownloadingSearch().scan()
}

Alternative Pools

The AsyncFun annotation, by default, uses an instance of GParsPool from the wrapping withPool block. You may, however, specify the type of pool explicitly:

An Explicit Example

@AsyncFun(GParsExecutorsPoolUtil) def sum6 = {a, b -> a + b }

Blocking Functions Through Annotations

The AsyncFun annotation also allows us to specify whether the resulting function should have blocking (true) or non-blocking (false, the default) semantics.

An Example of Blocking Semantics

@AsyncFun(blocking = true)
def sum = {a, b -> a + b }

Explicit and Delayed Pool Assignment

When using the GPars(Executors)PoolUtil.asyncFun() function directly to create an asynchronous function, you have two additional ways to assign a thread pool to the function.

1. The thread pool to be used by the function can be specified explicitly as an additional argument at creation time

2. The implicit thread pool can be obtained from the surrounding scope at invocation time rather than at creation time

When specifying the thread pool explicitly, the call doesn’t need to be wrapped in a withPool() block:

To Specify Thread Pools Explicitly

Closure sPlus = {Integer a, Integer b ->
    a + b
}

Closure sMultiply = {Integer a, Integer b ->
    sleep 2000
    a * b
}

println "Synchronous result: " + sMultiply(sPlus(10, 30), 100)

final pool = new FJPool()

Closure aPlus = GParsPoolUtil.asyncFun(sPlus, pool)
Closure aMultiply = GParsPoolUtil.asyncFun(sMultiply, pool)

def result = aMultiply(aPlus(10, 30), 100)

println "Time to do something else while the calculation is running"
println "Asynchronous result: " + result.get()

With a delayed pool assignment, only the function invocation must be surrounded with a withPool() block:

A Delayed Pool Assignment Example

Closure aPlus = GParsPoolUtil.asyncFun(sPlus)
Closure aMultiply = GParsPoolUtil.asyncFun(sMultiply)

withPool {
    def result = aMultiply(aPlus(10, 30), 100)

    println "Time to do something else while the calculation is running"
    println "Asynchronous result: " + result.get()
}

For us, this is a very interesting domain to explore, so any comments, questions or suggestions on combining asynchronous functions, or hints about its limits, are welcome.

Fork-Join

Fork/Join, or Divide-and-Conquer, is a very powerful abstraction for solving hierarchical problems.

The Abstraction

When talking about hierarchical problems, think about quick sort, merge sort, file system or general tree navigation problems.

• Fork/Join algorithms essentially split a problem into several smaller sub-problems and then recursively apply the same algorithm to each of the sub-problems.

• Once a sub-problem is small enough, it is solved directly.

• The solutions of all sub-problems are combined to solve their parent problem, which in turn helps solve its own grand-parent problem.
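The three steps above can be sketched with the JDK's own Fork/Join classes. This is a plain Java illustration rather than GPars code; the class name SumTask and the THRESHOLD value are assumptions chosen for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 4; // assumed cut-off for "small enough"
    private final int[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {            // small enough: solve directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;             // split into two sub-problems
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                             // schedule the left half asynchronously
        long rightSum = right.compute();         // solve the right half in this thread
        return left.join() + rightSum;           // combine the children's results
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}));
    }
}
```

The boilerplate here (explicit task class, pool lookup, fork and join calls) is the kind of detail a convenience layer can hide.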

A Picture Is Worth A Thousand Words

Check out the fancy interactive Fork/Join visualization demo (http://blog.krecan.net/2011/03/27/visualizing-forkjoin/). It shows you how threads co-operate to solve a common divide-and-conquer algorithm.

The mighty JSR-166y library co-ordinates Fork/Join orchestration rather nicely, but leaves a few rough edges, which can hurt you if you don’t pay enough attention. You must still deal with threads, pools and/or synchronization barriers.

The GPars Abstraction Convenience Layer

GPars can hide the complexities of dealing with threads, pools and recursive tasks from you, yet let you leverage the powerful Fork/Join implementation in jsr166y.


A Complex Example to Walk A File Directory

import static groovyx.gpars.GParsPool.runForkJoin
import static groovyx.gpars.GParsPool.withPool

withPool() {
    println """Number of files: ${
        runForkJoin(new File("./src")) {file ->
            long count = 0
            file.eachFile {
                if (it.isDirectory()) {
                    println "Forking a child task for $it"
                    forkOffChild(it)       //fork a child task
                } else {
                    count++
                }
            }
            return count + (childrenResults.sum(0))
            //use results of children tasks to calculate and store own result
        }
    }""".toString();
}

The runForkJoin() factory method uses the supplied recursive code together with the provided values to build a hierarchical Fork/Join calculation. The number of values passed to the runForkJoin() method must match the number of parameters the closure expects. The same number of arguments must also be passed to the forkOffChild() or runChildDirectly() methods.

An Example

def quicksort(numbers) {
    withPool {
        runForkJoin(0, numbers) {index, list ->
            def groups = list.groupBy {it <=> list[list.size().intdiv(2)]}
            if ((list.size() < 2) || (groups.size() == 1)) {
                return [index: index, list: list.clone()]
            }
            (-1..1).each {forkOffChild(it, groups[it] ?: [])}
            return [index: index, list: childrenResults.sort {it.index}.sum {it.list}]
        }.list
    }
}

It’s Asynchronous, Mate !

The important piece of the puzzle to note here is that forkOffChild() doesn’t wait for the child to run. It merely schedules it for execution at a future time. If a child task throws an exception, don’t expect the exception to be fired from the forkOffChild() method itself. The exception will have happened long after the parent has called forkOffChild().

It’s the getChildrenResults() method that will re-throw any child sub-task exceptions back to the parent.
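The JDK's Fork/Join classes behave in a comparable way, which the following plain Java sketch illustrates: fork() returns at once and the failure only surfaces when the parent collects the result with join(). The class names are hypothetical and the code does not use the GPars API.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class ChildFailureDemo {
    // A child task that always fails.
    static final class FailingChild extends RecursiveTask<Integer> {
        @Override
        protected Integer compute() {
            throw new IllegalStateException("child failed");
        }
    }

    static final class Parent extends RecursiveTask<String> {
        @Override
        protected String compute() {
            FailingChild child = new FailingChild();
            child.fork();                              // only schedules the child, nothing is thrown here
            String status = "forked without error";
            try {
                child.join();                          // collecting the result re-throws the failure
            } catch (IllegalStateException e) {
                status += ", failure seen when collecting the result";
            }
            return status;
        }
    }

    static String run() {
        return ForkJoinPool.commonPool().invoke(new Parent());
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```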

Alternative Approach

Alternatively, the underlying mechanism of nested Fork/Join worker tasks can be used directly. Custom-tailored workers can eliminate the performance overhead associated with parameter spreading imposed when using the generic workers.

Also, custom workers can be implemented in Java for further increases in performance.

A Custom Worker

public final class FileCounter extends AbstractForkJoinWorker<Long> {
    private final File file;

    def FileCounter(final File file) {
        this.file = file
    }

    @Override
    protected Long computeTask() {
        long count = 0;
        file.eachFile {
            if (it.isDirectory()) {
                println "Forking a thread for $it"
                forkOffChild(new FileCounter(it))           //fork a child task
            } else {
                count++
            }
        }
        return count + ((childrenResults)?.sum() ?: 0)  //use results of children tasks to calculate and store own result
    }
}

withPool(1) {pool ->  //feel free to experiment with the number of fork/join threads in the pool
    println "Number of files: ${runForkJoin(new FileCounter(new File("..")))}"
}

The AbstractForkJoinWorker subclasses can be written in both Java and Groovy. Either choice lets you optimize for execution speed if low performance of the worker becomes a bottleneck.

Fork / Join Saves Your Resources

Fork/Join operations can safely be run with small numbers of threads thanks to internal use of the TaskBarrier class to synchronize the threads.

While a thread is blocked inside an algorithm waiting for its sub-problems to be calculated, the thread is silently returned to its pool to take on any other available sub-problems from the task queue and process them. Although the algorithm creates as many tasks as there are sub-directories and tasks wait for the sub-directory tasks to complete, often as few as a single thread is enough to keep the computation going and eventually calculate a valid result.

Mergesort Example

Come on Punk, Merge my day !

import static groovyx.gpars.GParsPool.runForkJoin
import static groovyx.gpars.GParsPool.withPool

/**
 * Splits a list of numbers in half
 */
def split(List<Integer> list) {
    int listSize = list.size()
    int middleIndex = listSize / 2
    def list1 = list[0..<middleIndex]
    def list2 = list[middleIndex..listSize - 1]
    return [list1, list2]
}

/**
 * Merges two sorted lists into one
 */
List<Integer> merge(List<Integer> a, List<Integer> b) {
    int i = 0, j = 0
    final int newSize = a.size() + b.size()
    List<Integer> result = new ArrayList<Integer>(newSize)

    while ((i < a.size()) && (j < b.size())) {
        if (a[i] <= b[j]) result << a[i++]
        else result << b[j++]
    }

    if (i < a.size()) result.addAll(a[i..-1])
    else result.addAll(b[j..-1])
    return result
}

final def numbers = [1, 5, 2, 4, 3, 8, 6, 7, 3, 4, 5, 2, 2, 9, 8, 7, 6, 7, 8, 1, 4, 1, 7, 5, 8, 2, 3, 9, 5, 7, 4, 3]

withPool(3) {  //feel free to experiment with the number of fork/join threads in the pool
    println """Sorted numbers: ${
        runForkJoin(numbers) {nums ->
            println "Thread ${Thread.currentThread().name[-1]}: Sorting $nums"
            switch (nums.size()) {
                case 0..1:
                    return nums                                   //store own result
                case 2:
                    if (nums[0] <= nums[1]) return nums           //store own result
                    else return nums[-1..0]                       //store own result
                default:
                    def splitList = split(nums)
                    [splitList[0], splitList[1]].each {forkOffChild it}  //fork a child task
                    return merge(* childrenResults)  //use results of children tasks to calculate and store own result
            }
        }
    }"""
}

Mergesort Example Using A Custom-tailored Worker Class

An Example

public final class SortWorker extends AbstractForkJoinWorker<List<Integer>> {
    private final List numbers

    def SortWorker(final List<Integer> numbers) {
        this.numbers = numbers.asImmutable()
    }

    /**
     * Splits a list of numbers in half
     */
    def split(List<Integer> list) {
        int listSize = list.size()
        int middleIndex = listSize / 2
        def list1 = list[0..<middleIndex]
        def list2 = list[middleIndex..listSize - 1]
        return [list1, list2]
    }

    /**
     * Merges two sorted lists into one
     */
    List<Integer> merge(List<Integer> a, List<Integer> b) {
        int i = 0, j = 0
        final int newSize = a.size() + b.size()
        List<Integer> result = new ArrayList<Integer>(newSize)

        while ((i < a.size()) && (j < b.size())) {
            if (a[i] <= b[j]) result << a[i++]
            else result << b[j++]
        }

        if (i < a.size()) result.addAll(a[i..-1])
        else result.addAll(b[j..-1])
        return result
    }

    /**
     * Sorts a small list or delegates to two children, if the list contains more than two elements.
     */
    @Override
    protected List<Integer> computeTask() {
        println "Thread ${Thread.currentThread().name[-1]}: Sorting $numbers"
        switch (numbers.size()) {
            case 0..1:
                return numbers                                   //store own result
            case 2:
                if (numbers[0] <= numbers[1]) return numbers     //store own result
                else return numbers[-1..0]                       //store own result
            default:
                def splitList = split(numbers)
                [new SortWorker(splitList[0]), new SortWorker(splitList[1])].each {forkOffChild it}  //fork a child task
                return merge(* childrenResults)  //use results of children tasks to calculate and store own result
        }
    }
}

final def numbers = [1, 5, 2, 4, 3, 8, 6, 7, 3, 4, 5, 2, 2, 9, 8, 7, 6, 7, 8, 1, 4, 1, 7, 5, 8, 2, 3, 9, 5, 7, 4, 3]

withPool(1) {  //feel free to experiment with the number of fork/join threads in the pool
    println "Sorted numbers: ${runForkJoin(new SortWorker(numbers))}"
}

Running Child Tasks Directly

The forkOffChild method has a sibling, the runChildDirectly method. This method will run the child task directly and immediately within the current thread instead of scheduling the child task for asynchronous processing on the thread pool. Typically you’d call forkOffChild on every sub-task but the last, which you invoke directly without the scheduling overhead.

A Fork-In-Time-Saves-Nine

Closure fib = {number ->
    if (number <= 2) {
        return 1
    }
    forkOffChild(number - 1)  // This task will run asynchronously, probably in a different thread
    final def result = runChildDirectly(number - 2)  // This task is run directly within the current thread
    return (Integer) getChildrenResults().sum() + result
}

withPool {
    assert 55 == runForkJoin(10, fib)
}

Availability

This feature is only available when using the Fork/Join-based GParsPool, but not GParsExecutorsPool.

Parallel Speculations

With processor cores having become plentiful, some algorithms might benefit from brute-force parallel duplication. Instead of deciding up-front how to solve a problem, what algorithm to use or which location to connect to, you run all potential solutions in parallel.

Parallel Speculations

Imagine you need to perform a task such as calculating an expensive function or reading data from a file, a database or the internet. Luckily, you know several good ways (e.g. functions or urls) to reach your goal. However, not all of them are equal.

Although they return the same result (as far as your needs are concerned), the elapsed time of each will differ and some may even fail (e.g. because of network issues). What’s worse, no-one’s going to tell you which choice gives you the single best solution nor which paths might lead to no solution at all.

1. Shall I run quick sort or merge sort on my list?

2. Which url will work best?

3. Is this service available at its primary location or should I use the backup one?

GPars Speculations give you the option to try all the available alternatives in parallel and receive the result from the fastest functional path, silently ignoring the slow or broken ones.

This is what the speculate methods on GParsPool and GParsExecutorsPool can do for you.
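For readers who know the JDK better than GPars, a comparable "first successful result wins" behaviour is available through ExecutorService.invokeAny(). The sketch below is plain Java and only an analogy; the class name Speculation and the three sample alternatives are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class Speculation {
    // Runs all alternatives in parallel and returns the first successful result.
    // Expects at least one alternative; one thread per alternative avoids starvation.
    static <T> T speculate(List<Callable<T>> alternatives) {
        ExecutorService pool = Executors.newFixedThreadPool(alternatives.size());
        try {
            return pool.invokeAny(alternatives);   // failed alternatives are skipped
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while speculating", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("all alternatives failed", e);
        } finally {
            pool.shutdownNow();                    // cancel the slower alternatives
        }
    }

    // Hypothetical alternatives: one slow, one broken, one fast.
    static String demo() {
        Callable<String> slow = () -> { Thread.sleep(2000); return "slow"; };
        Callable<String> broken = () -> { throw new IllegalStateException("unreachable"); };
        Callable<String> fast = () -> "fast";
        return speculate(Arrays.asList(slow, broken, fast));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```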

A Sort Example

def numbers = ...
def quickSort = ...
def mergeSort = ...
def sortedNumbers = speculate(quickSort, mergeSort)

So we’re performing both a quick sort and a merge sort at the same time (concurrently), while getting the result of the faster one.

Given the parallel resources available these days on mainstream hardware, running the two functions in parallel will not have a dramatic impact on the speed of calculation of either one, and thus we get the result in about the same time as if we had run only the faster of the two calculations. The result also arrives sooner than when running only the slower one. Yet we didn’t have to know up-front which of the two sorting algorithms would perform better on our data. Thus we speculated (guessed).

Similarly, downloading a document from several sources with different speeds and/or reliability might look like this:

A Document Download Example

import static groovyx.gpars.GParsPool.speculate
import static groovyx.gpars.GParsPool.withPool

def alternative1 = {
    'http://www.dzone.com/links/index.html'.toURL().text
}

def alternative2 = {
    'http://www.dzone.com/'.toURL().text
}

def alternative3 = {
    'http://www.dzzzzzone.com/'.toURL().text  //wrong url
}

def alternative4 = {
    'http://dzone.com/'.toURL().text
}

withPool(4) {
    println speculate([alternative1, alternative2, alternative3, alternative4]).contains('groovy')
}

Thread Starvation

Make sure the surrounding thread pool has enough threads to process all alternatives in parallel. The size of the pool should match the number of closures supplied.

Alternatives Using Dataflow Variables and Streams

In some use cases, we can ignore failing alternatives, so Dataflow variables or Streams may be used to obtain the results of the winning speculation.

See this User Guide’s topic on Dataflow Concurrency

Please refer to the Dataflow Concurrency section of this User Guide for details on Dataflow Variables and Streams.

An Example

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.task

def alternative1 = {
    'http://www.dzone.com/links/index.html'.toURL().text
}

def alternative2 = {
    'http://www.dzone.com/'.toURL().text
}

def alternative3 = {
    'http://www.dzzzzzone.com/'.toURL().text  //will fail due to wrong url
}

def alternative4 = {
    'http://dzone.com/'.toURL().text
}

//Pick either one of the following, both will work:
final def result = new DataflowQueue()
// final def result = new DataflowVariable()

[alternative1, alternative2, alternative3, alternative4].each {code ->
    task {
        try {
            result << code()
        } catch (ignore) { }  // We deliberately ignore unsuccessful urls.
    }
}

println result.val.contains('groovy')

User Guide To CSP

Communicating Sequential Processes

The CSP (Communicating Sequential Processes) abstraction builds on independent composable processes, which exchange messages in a synchronous manner. GPars leverages the JCSP library developed at the University of Kent, UK.

Jon Kerridge, the author of the CSP implementation in GPars, provides exhaustive examples of GroovyCSP use at www.soc.napier.ac.uk or here on our local mirror page.

Purpose

The GroovyCSP implementation leverages JCSP, a Java-based CSP library, which is licensed under LGPL. There are some differences between the Apache 2 license, which GPars uses, and LGPL. Please make sure your application conforms to the LGPL rules before enabling the use of JCSP in your code.

If the LGPL license is not adequate for your use, you might consider checking out the Dataflow Concurrency chapter of this User Guide to learn about tasks, selectors and operators, which may help you resolve concurrency issues in ways similar to the CSP approach. In fact, the dataflow and CSP concepts, as implemented in GPars, are very close to each other.

Apache 2 License

By default, without actively adding an explicit dependency on JCSP in your build file or downloading and including the JCSP jar file in your project, the standard commercial-software-friendly Apache 2 License terms apply to your project. GPars directly depends only on software licensed under licenses compatible with the Apache 2 License.

The CSP Model Principles

In essence, the CSP model builds on independent concurrent processes, which mutually communicate through channels using synchronous (i.e. rendezvous) message passing. Unlike actors or dataflow operators, which revolve around the event-processing pattern, CSP processes place the focus of their activities (aka sequences of steps) around the use of communications to remain mutually in sync along the way.

Since the addressing is indirect through channels, the processes do not need to know about one another. They typically consist of a set of input and output channels and a body. Once a CSP process is started, it obtains a thread from a thread pool and starts processing its body, pausing only when reading from a channel or writing into a channel. Some implementations (e.g. Go) can also detach the thread from the CSP process when blocked on a channel.

CSP programs are deterministic. The same data on the program’s input will always generate the same output, irrespective of the actual thread-scheduling scheme used. This helps a lot when debugging CSP programs as well as analyzing deadlocks.

JCSP library: http://www.cs.kent.ac.uk/projects/ofa/jcsp/

GroovyCSP examples: http://www.soc.napier.ac.uk/~cs10/#_Toc271192596

Determinism combined with indirect addressing results in a great level of composability of CSP processes. You can combine small CSP processes into bigger ones just by connecting their input and output channels and then wrapping them in another, bigger containing process.
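Rendezvous channels and composition by channel wiring can be illustrated with nothing but the JDK. The sketch below is plain Java, not GPars or JCSP; it uses java.util.concurrent.SynchronousQueue as an assumed stand-in for a synchronous channel, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.function.UnaryOperator;

final class RendezvousPipeline {
    // A "process": repeatedly reads from its input channel, applies its body,
    // and writes to its output channel. It knows only its channels, not its peers.
    static Thread spawn(BlockingQueue<String> in, BlockingQueue<String> out, UnaryOperator<String> body) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    out.put(body.apply(in.take())); // take() and put() block until the partner arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interruption stops the process
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Composes two processes into a bigger one simply by sharing channel b.
    static List<String> greet(List<String> names) {
        SynchronousQueue<String> a = new SynchronousQueue<>();
        SynchronousQueue<String> b = new SynchronousQueue<>();
        SynchronousQueue<String> c = new SynchronousQueue<>();
        Thread formatter = spawn(a, b, String::toUpperCase);
        Thread greeter = spawn(b, c, name -> "Hello " + name);
        List<String> greetings = new ArrayList<>();
        try {
            for (String name : names) {
                a.put(name);               // rendezvous with the formatter
                greetings.add(c.take());   // rendezvous with the greeter
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            formatter.interrupt();
            greeter.interrupt();
        }
        return greetings;
    }

    public static void main(String[] args) {
        System.out.println(greet(java.util.Arrays.asList("Joe", "Dave")));
    }
}
```

Because every hand-over is a rendezvous, the order of the greetings depends only on the order of the inputs, not on thread scheduling.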

The CSP model introduces non-determinism using Alternatives. A process can attempt to read a value from multiple channels at the same time through a construct called Alternative or Select. The first value that becomes available in any of the channels involved in the Select will be read and consumed by the process. Since the order of messages received through a Select depends on unpredictable conditions during program run-time, the value that will be read is non-deterministic.

CSP with GPars Dataflow

GPars provides all the necessary building blocks to create CSP processes.

• CSP processes can be modelled through GPars tasks using a Closure, a Runnable or a Callable to hold the actual implementation of the process

• CSP Channels should be modelled with SyncDataflowQueue and SyncDataflowBroadcast classes

• CSP Alternative is provided through the Select class with its select and prioritySelect methods

Processes

To start a process, simply use the task factory method.

Start A Process

import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.scheduler.ResizeablePool

group = new DefaultPGroup(new ResizeablePool(true))

def t = group.task {
    println "I am a process"
}

t.join()

Since each process consumes a thread for its lifetime, it’s advisable to use resizeable thread pools as in the example above.

A process can also be created from a Runnable or Callable object:

A Runnable Sample

class MyProcess implements Runnable {

    @Override
    void run() {
        println "I am a process"
    }
}
def t = group.task new MyProcess()

t.join()

Using Callable allows values to be returned through the get() method:

A Callable Sample


import java.util.concurrent.Callable

class MyProcess implements Callable<String> {

    @Override
    String call() {
        println "I am a process"
        return "CSP is great!"
    }
}

def t = group.task new MyProcess()

println t.get()

Channels

Processes typically need channels to communicate with their companion processes as well as with the outside world:

A Channel Sample

import groovy.transform.TupleConstructor
import groovyx.gpars.dataflow.DataflowReadChannel
import groovyx.gpars.dataflow.DataflowWriteChannel
import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.scheduler.ResizeablePool

import java.util.concurrent.Callable
import groovyx.gpars.dataflow.SyncDataflowQueue

@TupleConstructor
class Greeter implements Callable<String> {
    DataflowReadChannel names
    DataflowWriteChannel greetings

    @Override
    String call() {
        while(!Thread.currentThread().isInterrupted()) {
            String name = names.val
            greetings << "Hello " + name
        }
        return "CSP is great!"
    }
}

def a = new SyncDataflowQueue()
def b = new SyncDataflowQueue()

group.task new Greeter(a, b)

a << "Joe"
a << "Dave"
println b.val
println b.val

Which Delivery Technique To Use for Messages ?

The CSP model uses synchronous messaging. In GPars, however, you may consider using asynchronous channels as well as synchronous ones.

You can also combine these two types of channels within the same process.

Composition

Grouping processes simply becomes a matter of connecting them with channels:

A Grouping Sample


@TupleConstructor
class Formatter implements Callable<String> {
    DataflowReadChannel rawNames
    DataflowWriteChannel formattedNames

    @Override
    String call() {
        while(!Thread.currentThread().isInterrupted()) {
            String name = rawNames.val
            formattedNames << name.toUpperCase()
        }
    }
}

@TupleConstructor
class Greeter implements Callable<String> {
    DataflowReadChannel names
    DataflowWriteChannel greetings

    @Override
    String call() {
        while(!Thread.currentThread().isInterrupted()) {
            String name = names.val
            greetings << "Hello " + name
        }
    }
}

def a = new SyncDataflowQueue()
def b = new SyncDataflowQueue()
def c = new SyncDataflowQueue()

group.task new Formatter(a, b)
group.task new Greeter(b, c)

a << "Joe"
a << "Dave"
println c.val
println c.val

Alternatives

To introduce non-determinism, GPars offers the Select class with its select and prioritySelect methods:

A Select Sample

import groovy.transform.TupleConstructor
import groovyx.gpars.dataflow.SyncDataflowQueue
import groovyx.gpars.dataflow.DataflowReadChannel
import groovyx.gpars.dataflow.DataflowWriteChannel
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.scheduler.ResizeablePool

import static groovyx.gpars.dataflow.Dataflow.select

@TupleConstructor
class Receptionist implements Runnable {
    DataflowReadChannel emails
    DataflowReadChannel phoneCalls
    DataflowReadChannel tweets
    DataflowWriteChannel forwardedMessages

    private final Select incomingRequests = select([phoneCalls, emails, tweets])
    //prioritySelect() would give highest precedence to phone calls

    @Override
    void run() {
        while(!Thread.currentThread().isInterrupted()) {
            String msg = incomingRequests.select()
            forwardedMessages << msg.toUpperCase()
        }
    }
}

def a = new SyncDataflowQueue()
def b = new SyncDataflowQueue()
def c = new SyncDataflowQueue()
def d = new SyncDataflowQueue()

group.task new Receptionist(a, b, c, d)

a << "my email"
b << "my phone call"
c << "my tweet"

//The values come in random order since the process uses a Select to read its input
3.times{
    println d.val.value
}

Components

CSP processes can be composed into larger entities. Suppose you already have a set of CSP processes (aka Runnable/Callable classes); you can compose them into a larger process:

A Larger Sample

final class Prefix implements Callable {
    private final DataflowChannel inChannel
    private final DataflowChannel outChannel
    private final def prefix

    def Prefix(final inChannel, final outChannel, final prefix) {
        this.inChannel = inChannel;
        this.outChannel = outChannel;
        this.prefix = prefix
    }

    public def call() {
        outChannel << prefix
        while (true) {
            sleep 200
            outChannel << inChannel.val
        }
    }
}

Another Building Block

final class Copy implements Callable {
    private final DataflowChannel inChannel
    private final DataflowChannel outChannel1
    private final DataflowChannel outChannel2

    def Copy(final inChannel, final outChannel1, final outChannel2) {
        this.inChannel = inChannel;
        this.outChannel1 = outChannel1;
        this.outChannel2 = outChannel2;
    }

    public def call() {
        final PGroup group = Dataflow.retrieveCurrentDFPGroup()
        while (true) {
            def i = inChannel.val
            group.task {
                outChannel1 << i
                outChannel2 << i
            }.join()
        }
    }
}

A Sample

import groovyx.gpars.dataflow.DataflowChannel
import groovyx.gpars.dataflow.SyncDataflowQueue
import groovyx.gpars.group.DefaultPGroup

group = new DefaultPGroup(6)

def fib(DataflowChannel out) {
    group.task {
        def a = new SyncDataflowQueue()
        def b = new SyncDataflowQueue()
        def c = new SyncDataflowQueue()
        def d = new SyncDataflowQueue()
        [new Prefix(d, a, 0L), new Prefix(c, d, 1L), new Copy(a, b, out), new StatePairs(b, c)].each { group.task it}
    }
}

final SyncDataflowQueue ch = new SyncDataflowQueue()
group.task new Print('Fibonacci numbers', ch)
fib(ch)

sleep 10000

User Guide To Actors

Actors offer a message passing-based concurrency model: programs are collections of independent active objects that exchange messages and have no mutable shared state.

Actors can help us avoid issues such as deadlock, live-lock and starvation, which are common problems for shared-memory-based approaches.

Actors are a way of leveraging the multi-core nature of today’s hardware without all the problems traditionally associated with shared-memory multi-threading, which is why programming languages such as Erlang and Scala have taken up this model.

The actor support in GPars was originally inspired by the Actors library in Scala, but has since gone well beyond what Scala offers as standard.

A nice article summarizing the key concepts behind actors has been written by Ruben Vermeersch (http://ruben.savanne.be/articles/concurrency-in-erlang-scala).

Actors always guarantee that at most one thread processes the actor’s body at any one time and also, under the covers, that the memory is synchronized each time a thread is assigned to an actor, so the actor’s state can be safely modified by code in the body without any extra synchronization or locking effort.

Ideally, an actor’s code should never be invoked directly from outside, so that all the code of the actor class can only be executed by the thread handling the last received message and hence all the actor’s code is implicitly thread-safe.

If any of the actor’s methods are allowed to be called by other objects directly, the thread-safety guarantee for the actor’s code and state is no longer valid.

Types of Actors

In general, you can find two types of actors in the wild: ones that hold implicit state and ones that don’t.

GPars gives you both options.

Stateless actors, represented in GPars by the DynamicDispatchActor and the ReactiveActor classes, keep no track of what messages have arrived previously. You may think of these as flat message handlers, which process messages as they come. Any state-based behavior has to be implemented by the user.

The stateful actors, represented in GPars by the DefaultActor class (and previously also by the AbstractPooledActor class), allow us to handle implicit state directly. After receiving a message, the actor moves into a new state with different ways to handle future messages.

To give you an example, a freshly started actor may only accept some types of messages, e.g. encrypted messages for decryption, only after it has received the encryption keys. The stateful actors allow you to encode such dependencies directly in the structure of the message-handling code. Implicit state management, however, comes at a slight performance cost, mainly due to the lack of continuations support on the JVM.

Actor Threading Model

Since actors are detached from the system threads, a large number of actors can share a relatively small thread pool.

This can go as far as having many concurrent actors share a single pooled thread while avoiding some of the threading limitations of the JVM.

In general, while the JVM can only give you a limited number of threads (typically around a couple of thousand), the number of actors is only limited by the available memory. If an actor has no work to do, it doesn’t consume any threads.

Actor code is processed in chunks separated by quiet periods of waiting for new events (messages). This can be naturally modeled through continuations.

As the JVM doesn’t support continuations directly, they have to be simulated in the actor frameworks, which has a slight impact on the organization of the actors' code. However, the benefits in most cases outweigh the difficulties.
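To make the threading model more concrete, here is a deliberately small plain Java sketch of the general idea: a mailbox per actor, a shared pool, and a thread borrowed only while messages are pending. It is not how GPars implements actors; the class name MiniActor and everything inside it are assumptions made for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class MiniActor<T> {
    private final Queue<T> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor pool;
    private final Consumer<T> handler;

    MiniActor(Executor pool, Consumer<T> handler) {
        this.pool = pool;
        this.handler = handler;
    }

    void send(T message) {
        mailbox.add(message);
        trySchedule();
    }

    // Borrow a pool thread only if there is work and nobody is processing already.
    private void trySchedule() {
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            pool.execute(this::drain);
        }
    }

    // At most one thread runs this at a time, so the handler needs no locking.
    private void drain() {
        try {
            T message;
            while ((message = mailbox.poll()) != null) {
                handler.accept(message);
            }
        } finally {
            scheduled.set(false);  // give the thread back to the pool
            trySchedule();         // a message may have arrived after the last poll
        }
    }

    // Runs many actors on a two-thread pool and returns the per-actor message counts.
    static int[] demo(int actors, int messages) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        int[] handled = new int[actors];   // plain, unsynchronized actor state
        CountDownLatch done = new CountDownLatch(actors * messages);
        try {
            for (int i = 0; i < actors; i++) {
                final int id = i;
                MiniActor<String> actor = new MiniActor<>(pool, msg -> {
                    handled[id]++;
                    done.countDown();
                });
                for (int m = 0; m < messages; m++) actor.send("message " + m);
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(demo(100, 50)[0]);
    }
}
```

An idle MiniActor holds only its mailbox in memory and no thread, which is the property that lets the number of actors grow well beyond the number of threads.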

An Actors Sample

import groovyx.gpars.actor.Actor
import groovyx.gpars.actor.DefaultActor

class GameMaster extends DefaultActor {
    int secretNum

    void afterStart() {
        secretNum = new Random().nextInt(10)
    }

    void act() {
        loop {
            react { int num ->
                if (num > secretNum) {
                    reply 'too large'
                } else if (num < secretNum) {
                    reply 'too small'
                } else {
                    reply 'you win'
                    terminate()
                }
            }
        }
    }
}

class Player extends DefaultActor {
    String name
    Actor server
    int myNum

    void act() {
        loop {
            myNum = new Random().nextInt(10)
            server.send myNum
            react {
                switch (it) {
                    case 'too large': println "$name: $myNum was too large"; break
                    case 'too small': println "$name: $myNum was too small"; break
                    case 'you win': println "$name: I won $myNum"; terminate(); break
                }
            }
        }
    }
}

def master = new GameMaster().start()
def player = new Player(name: 'Player', server: master).start()

// This forces the main thread to wait until both actors have terminated.
[master, player]*.join()

example by Jordi Campos i Miralles, Departament de Matemàtica Aplicada i Anàlisi, MAiA Facultat de Matemàtiques, Universitat de Barcelona

Usage of Actors

GPars provides consistent Actor APIs and DSLs. Actors, in principle, perform three specific operations: send messages, receive messages and create new actors. Although not specifically enforced by GPars, messages should be immutable or at least follow the hands-off policy, whereby the sender never touches a message after it has been sent off.

Sending Messages

Messages can be sent to actors using the send method.

A Sample

def passiveActor = Actors.actor{
    loop {
        react { msg -> println "Received: $msg"; }
    }
}
passiveActor.send 'Message 1'
passiveActor << 'Message 2'    //using the << operator
passiveActor 'Message 3'       //using the implicit call() method

Alternatively, the << operator or the implicit call method can be used. A family of sendAndWait methods is available to block the caller until a reply from the actor is available. The reply is returned from the sendAndWait method as a return value. The sendAndWait methods may also return after a timeout expires or in case of termination of the called actor.

A Sample

def replyingActor = Actors.actor{
    loop {
        react { msg ->
            println "Received: $msg";
            reply "I've got $msg"
        }
    }
}

def reply1 = replyingActor.sendAndWait('Message 4')

def reply2 = replyingActor.sendAndWait('Message 5', 10, TimeUnit.SECONDS)

use (TimeCategory) {
    def reply3 = replyingActor.sendAndWait('Message 6', 10.seconds)
}

The sendAndContinue method allows the caller to continue its processing while the supplied closure is waiting for a reply from the actor.

A Sample

friend.sendAndContinue 'I need money!', {money -> pocket money}
println 'I can continue while my friend is collecting money for me'

The sendAndPromise method returns a Promise (aka Future) to the final reply and so allows the caller to continue its processing while the actor is handling the submitted message.

A Sample

Promise loan = friend.sendAndPromise 'I need money!'
println 'I can continue while my friend is collecting money for me'
loan.whenBound {money -> pocket money}  // Asynchronous waiting for a reply.
println "Received ${loan.get()}"  // Synchronous waiting for a reply.

All send, sendAndWait or sendAndContinue methods will throw an exception if invoked on a non-active actor.
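The friend and pocket names in the samples above are placeholders. A self-contained sketch of the blocking and the promise-based variants might look as follows; it assumes GPars is on the classpath, and the lender actor (which simply doubles the requested amount) is a hypothetical stand-in invented for illustration.

```groovy
import groovyx.gpars.actor.Actors
import groovyx.gpars.dataflow.Promise
import java.util.concurrent.TimeUnit

// Hypothetical actor for illustration: replies with twice the requested amount.
def lender = Actors.actor {
    loop {
        react { amount -> reply amount * 2 }
    }
}

// Blocking request, giving up after 5 seconds at the latest.
def viaWait = lender.sendAndWait(10, 5, TimeUnit.SECONDS)

// Non-blocking request: the Promise gets bound once the actor replies.
Promise viaPromise = lender.sendAndPromise(21)
def promised = viaPromise.get()    // blocks only at the point the value is needed

println "sendAndWait: $viaWait, sendAndPromise: $promised"

lender.stop()
lender.join()
```

The promise-based call is usually the better fit when the caller has other work to do between sending the request and needing the answer.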

Receiving Messages

Non-blocking Message Retrieval

Calling the react method, optionally with a timeout parameter, from within the actor’s code will consume the next message from the actor’s inbox, potentially waiting, if there is no message to be processed immediately.

A Sample

println 'Waiting for a gift'
react {gift ->
    if (mySpouse.likes gift) reply 'Thank you!'
}

Under the covers, the supplied closure is not invoked directly, but scheduled for processing by any thread in the thread pool once a message is available. After scheduling, the current thread will then be detached from the actor and freed to process any other actor, which has received a message already.

To permit detaching actors from threads, the react method requires code to be written in a special continuation style.

A react Sample

Actors.actor {
    loop {
        println 'Waiting for a gift'
        react {gift ->
            if (mySpouse.likes gift) reply 'Thank you!'
            else {
                reply 'Try again, please'
                react {anotherGift ->
                    if (myChildren.like anotherGift) reply 'Thank you!'
                }
                println 'Never reached'
            }
        }
        println 'Never reached'
    }
    println 'Never reached'
}

The react method has special semantics to allow actors to be detached from threads when no messages are available in their mailbox. Essentially, react schedules the supplied code (closure) to be executed upon next message arrival and returns. The closure supplied to the react method is the code where the computation should resume. This is a continuation style.

Since actors have to preserve the guarantee that at most one thread is active within the actor’s body, the next message cannot be handled before the current message processing finishes. Typically, there shouldn’t be a need to put code after calls to react. Some actor implementations even enforce this. GPars does not, for performance reasons. The loop method allows iterations within the actor body. Unlike typical looping constructs, like for or while loops, loop cooperates with nested react blocks and will ensure looping across subsequent message retrievals.

Sending Replies

The reply and replyIfExists methods are defined not only on the actors themselves. For AbstractPooledActor (but not for the DefaultActor, DynamicDispatchActor or ReactiveActor classes) they are also available on the processed messages themselves upon their reception, which is particularly handy when handling multiple messages in a single call. In such cases, reply() invoked on the actor will send a reply to the authors of all the currently processed messages (the last one), whereas reply() called on a message sends a reply to the author of that particular message only.

See DemoMultiMessage.groovy in our sample demos.

The Sender Property

Messages, upon retrieval, offer the sender property to identify the originator of the message. The property is available inside the actor’s closure:

A Sample

react {tweet ->
    if (isSpam(tweet)) ignoreTweetsFrom sender
    sender.send 'Never write to me again!'
}

Forwarding

When sending a message, a different actor can be specified as the sender so that potential replies to the message will be forwarded to the specified actor and not to the actual originator.

A Sample

def decryptor = Actors.actor {
    react {message ->
        reply message.reverse()
//        sender.send message.reverse()    //An alternative way to send replies
    }
}

def console = Actors.actor {  //This actor will print out decrypted messages, since the replies are forwarded to it
    react {
        println 'Decrypted message: ' + it
    }
}

decryptor.send 'lellarap si yvoorG', console  //Specify an actor to send replies to
console.join()

Creating Actors

Actors share a pool of threads, which are dynamically assigned to actors when the actors need to react to messages sent to them. The threads are returned to the pool once a message has been processed and the actor is idle waiting for some more messages to arrive.

For example, this is how you create an actor that prints out all messages that it receives.

Actor Sample

def console = Actors.actor {
    loop {
        react {
            println it
        }
    }
}

Notice the loop() method call, which ensures that the actor doesn’t stop after having processed the first message.

Here’s an example with a decryptor service, which can decrypt submitted messages and send the decrypted messages back to the originators.

A Sample

final def decryptor = Actors.actor {
    loop {
        react {String message ->
            if ('stopService' == message) {
                println 'Stopping decryptor'
                stop()
            }
            else reply message.reverse()
        }
    }
}

Actors.actor {
    decryptor.send 'lellarap si yvoorG'
    react {
        println 'Decrypted message: ' + it
        decryptor.send 'stopService'
    }
}.join()

Here’s an example of an actor that waits for up to one second (1000 ms) to receive a reply to its message.

A Sample

def friend = Actors.actor {
    react {
        //this doesn't reply -> caller won't receive any answer in time
        println it
        //reply 'Hello' //uncomment this to answer conversation
        react {
            println it
        }
    }
}

def me = Actors.actor {
    friend.send('Hi')
    //wait for answer 1sec
    react(1000) {msg ->
        if (msg == Actor.TIMEOUT) {
            friend.send('I see, busy as usual. Never mind.')
            stop()
        } else {
            //continue conversation
            println "Thank you for $msg"
        }
    }
}

me.join()

Undelivered Messages

Sometimes messages cannot be delivered to the target actor. To allow special action to be taken for undelivered messages, all unprocessed messages left in an actor's queue at its termination have their onDeliveryError() method called. The onDeliveryError() method or closure defined on the message can, for example, send a notification back to the original sender of the message.

Handling Undelivered Messages

final DefaultActor me
me = Actors.actor {
    def message = 1

    message.metaClass.onDeliveryError = {->
        //send message back to the caller
        me << "Could not deliver $delegate"
    }

    def actor = Actors.actor {
        react {
            //wait 2sec in order next call in demo can be emitted
            Thread.sleep(2000)
            //stop actor after first message
            stop()
        }
    }

    actor << message
    actor << message

    react {
        //print whatever comes back
        println it
    }

}

me.join()

Alternatively, the onDeliveryError() method can be specified on the sender itself. The method can be added both dynamically

A Dynamic Sample

final DefaultActor me
me = Actors.actor {
    def message1 = 1
    def message2 = 2

    def actor = Actors.actor {
        react {
            //wait 2sec in order next call in demo can be emitted
            Thread.sleep(2000)
            //stop actor after first message
            stop()
        }
    }

    me.metaClass.onDeliveryError = {msg ->
        //callback on actor inaccessibility
        println "Could not deliver message $msg"
    }

    actor << message1
    actor << message2

    actor.join()

}

me.join()

and statically in the actor definition:

A Static Sample

class MyActor extends DefaultActor {
    public void onDeliveryError(msg) {
        println "Could not deliver message $msg"
    }
    ...
}

Joining Actors

Actors provide a join() method to allow callers to wait for the actor to terminate. A variant accepting a timeout is also available. The Groovy spread-dot ( *. ) operator comes in handy when joining multiple actors at a time.

A Sample to Join Actors

def master = new GameMaster().start()
def player = new Player(name: 'Player', server: master).start()

[master, player]*.join()
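The timeout variant of join() is not shown above. A minimal sketch, assuming GPars is on the classpath and using two hypothetical actors (worker and idle) invented for illustration, could look like this:

```groovy
import groovyx.gpars.actor.Actors
import java.util.concurrent.TimeUnit

// Terminates by itself after handling a single message.
def worker = Actors.actor {
    react { msg -> println "Processed $msg" }
}

// Keeps waiting for a message that never arrives.
def idle = Actors.actor {
    react { msg -> println "Unexpected $msg" }
}

worker << 'job'
worker.join()                            // returns once worker has terminated
idle.join(500, TimeUnit.MILLISECONDS)    // stops waiting after 500 ms

def workerActive = worker.isActive()
def idleActive = idle.isActive()
println "worker active: $workerActive, idle active: $idleActive"

// Clean up the actor that is still waiting.
idle.stop()
idle.join()
```

A timed join() returning does not by itself say whether the actor has terminated, which is why the sketch checks isActive() afterwards.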

Conditional and Counting Loops

The loop() method allows for either a condition or a number of iterations to be specified, optionally accompanied by a closure to invoke once the loop finishes (the After Loop Termination Code Handler).

The following actor will loop three times to receive 3 messages and then print out the maximum of the received messages.

A Sample

final Actor actor = Actors.actor {
    def candidates = []
    def printResult = {-> println "The best offer is ${candidates.max()}"}

    loop(3, printResult) {
        react {
            candidates << it
        }
    }
}

actor 10
actor 30
actor 20
actor.join()

The following actor will receive messages until a value greater than or equal to 30 arrives.

A Sample

final Actor actor = Actors.actor {
    def candidates = []
    final Closure printResult = {-> println "Reached best offer - ${candidates.max()}"}

    loop({-> candidates.max() < 30}, printResult) {
        react {
            candidates << it
        }
    }
}

actor 10
actor 20
actor 25
actor 31
actor 20
actor.join()

The After Loop Termination Code Handler can use an actor’s react{} but not loop().

Fair Vs Non-fair Actor Behavior

DefaultActor can be set to behave in a fair or non-fair (default) manner. Depending on the strategy chosen, the actor either makes the thread available to other actors sharing the same parallel group (fair), or keeps the thread for itself until the message queue becomes empty (non-fair). Generally, non-fair actors perform 2 - 3 times better than fair ones.

Use either the fairActor() factory method or the actor’s makeFair() method.
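Both options can be sketched as follows. The snippet assumes GPars is on the classpath; the EchoActor class and the message texts are hypothetical and exist only for illustration.

```groovy
import groovyx.gpars.actor.Actors
import groovyx.gpars.actor.DefaultActor

// Variant 1: the factory method creates and starts a fair actor.
def fair = Actors.fairActor {
    loop {
        react { msg -> reply "fair: $msg" }
    }
}

// Variant 2: a hand-written actor, switched to fair mode here before it is started.
class EchoActor extends DefaultActor {
    protected void act() {
        loop {
            react { msg -> reply "echo: $msg" }
        }
    }
}
def echo = new EchoActor()
echo.makeFair()
echo.start()

def fairReply = fair.sendAndWait('ping').toString()
def echoReply = echo.sendAndWait('ping').toString()
println "$fairReply / $echoReply"

[fair, echo]*.stop()
[fair, echo]*.join()
```

From the sender's point of view the two actors behave identically; fairness only affects how the actors share threads with others in the same group.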

Custom Schedulers

Actors leverage the standard JDK concurrency library by default. To provide a custom thread scheduler, use the appropriate constructor parameter when creating a parallel group (PGroup class). The supplied scheduler will orchestrate threads in the group’s thread pool.

Please also see the numerous Actor sample demo programs.

Actors Principles

Actors share a pool of threads, which are dynamically assigned to actors when the actors need to react to messages sent to them. The threads are returned to the pool once a message has been processed and the actor is idle waiting for some more messages to arrive. Actors become detached from the underlying threads and so a relatively small thread pool can serve a potentially unlimited number of actors. Virtually unlimited scalability in the number of actors is the main advantage of event-based actors, which are detached from the underlying physical threads.

Here are some examples of how to use actors. This is how you create an actor that prints out all messages that it receives.

A Sample


import static groovyx.gpars.actor.Actors.actor

def console = actor {
    loop {
        react {
            println it
        }
    }
}

Notice the loop() method call, which ensures that the actor doesn’t stop after having processed the first message.

As an alternative you can extend the DefaultActor class and override the act() method. Once you instantiate the actor, you need to start it so that it attaches itself to the thread pool and can start accepting messages. The actor() factory method, in contrast, takes care of starting the actor for you.

A Sample

class CustomActor extends DefaultActor {
    @Override
    protected void act() {
        loop {
            react {
                println it
            }
        }
    }
}

def console=new CustomActor()
console.start()

Messages can be sent to the actor using multiple methods:

A Sample

console.send('Message')
console 'Message'
console.sendAndWait 'Message'    //Wait for a reply
console.sendAndContinue 'Message', {reply -> println "I received reply: $reply"}    //Forward the reply to a function

Creating An Asynchronous Service

A Sample


import static groovyx.gpars.actor.Actors.actor

final def decryptor = actor {
    loop {
        react {String message->
            reply message.reverse()
        }
    }
}

def console = actor {
    decryptor.send 'lellarap si yvoorG'
    react {
        println 'Decrypted message: ' + it
    }
}

console.join()

As you can see, you create new actors with the actor() method, passing in the actor’s body as a closure parameter. Inside the actor’s body, you can use loop() to iterate, react() to receive messages and reply() to send a message to the actor which has sent the currently processed message. The sender of the current message is also available through the actor’s sender property. When the decryptor actor doesn’t find a message in its message queue at the time when react() is called, the react() method gives up the thread and returns it to the thread pool for other actors to pick up.

Only after a new message arrives in the actor’s message queue is the closure of the react() method scheduled for processing with the pool. Event-based actors internally simulate continuations: an actor’s work is split into sequentially run chunks, which are invoked once a message is available in the inbox. Each chunk for a single actor can be performed by a different thread from the thread pool.

Groovy's flexible syntax with closures allows our library to offer multiple ways to define actors. For instance, here’s an example of an actor that waits for up to one second (1000 ms) to receive a reply to its message. Actors also allow the time DSL, defined by the groovy.time.TimeCategory class, to be used for timeout specification to the react() method, provided the user wraps the call within a TimeCategory use block.

A Sample

def me = Actors.actor {
    friend.send('Hi')
    //wait for answer 1sec
    react(1000) {msg ->
        if (msg == Actor.TIMEOUT) {
            friend.send('I see, busy as usual. Never mind.')
            stop()
        } else {
            //continue conversation
            println "Thank you for $msg"
        }
    }
}

me.join()
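The TimeCategory form of the timeout can be sketched as follows. This assumes GPars is on the classpath and that the TimeCategory class lives in the groovy.time package, as it does in current Groovy versions; the waiter actor and the outcome variable are hypothetical names used only for illustration.

```groovy
import groovy.time.TimeCategory
import groovyx.gpars.actor.Actor
import groovyx.gpars.actor.Actors

def outcome = null

def waiter = Actors.actor {
    use(TimeCategory) {
        // 1.second is a duration made available by TimeCategory
        react(1.second) { msg ->
            outcome = (msg == Actor.TIMEOUT) ? 'timed out' : 'got a reply'
            stop()
        }
    }
}

// Nobody sends anything to waiter, so the timeout is expected to fire.
waiter.join()
println outcome
```

Only the react() call itself needs to sit inside the use block, because the duration expression is evaluated there; the handler closure runs later without needing the category.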

When a timeout expires while waiting for a message, the Actor.TIMEOUT message arrives instead. Also, the onTimeout() handler is invoked, if present on the actor:

An Actor Sample

def me = Actors.actor {
    friend.send('Hi')

    delegate.metaClass.onTimeout = {->
        friend.send('I see, busy as usual. Never mind.')
        stop()
    }

    //wait for answer 1sec
    react(1000) {msg ->
        if (msg != Actor.TIMEOUT) {
            //continue conversation
            println "Thank you for $msg"
        }
    }
}

me.join()

Notice the possibility to use Groovy meta-programming to define an actor’s lifecycle notification methods (e.g. onTimeout() ) dynamically. Obviously, the lifecycle methods can be defined the usual way when you decide to define a new class for your actor.

A Sample

class MyActor extends DefaultActor {
    public void onTimeout() {
        ...
    }

    protected void act() {
        ...
    }
}

Actors Guarantee Thread-safety For Non-thread-safe Code

Actors guarantee that at most one thread processes the actor’s body at any time. Under the covers the memory is synchronized each time a thread is assigned to an actor. Therefore, the actor’s state can be safely modified by code in the body without any extra (synchronization or locking) effort.

A Sample

class MyCounterActor extends DefaultActor {
    private Integer counter = 0

    protected void act() {
        loop {
            react {
                counter++
            }
        }
    }
}

Ideally, an actor’s code should never be invoked directly from outside, so that all code of the actor class can only be executed by the thread handling the last received message. Therefore, all the actor’s code is implicitly thread-safe. If any of the actor’s methods are allowed to be called by other objects directly, the thread-safety guarantee for the actor’s code and state is no longer valid.
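The guarantee can be observed with a small experiment. The sketch below assumes GPars is on the classpath; the SafeCounter class and the 'inc' and 'get' messages are hypothetical and chosen only for illustration. Several plain threads send increments concurrently, yet the unsynchronized field ends up with the exact total because only the actor ever touches it.

```groovy
import groovyx.gpars.actor.DefaultActor

class SafeCounter extends DefaultActor {
    private int count = 0    // plain state, no locks or atomics

    protected void act() {
        loop {
            react { msg ->
                if (msg == 'get') reply count
                else count++
            }
        }
    }
}

def counter = new SafeCounter().start()

// Four plain threads send 1000 increments each.
def senders = (1..4).collect {
    Thread.start { 1000.times { counter << 'inc' } }
}
senders*.join()

// Sent after all increments were enqueued, so it is processed last.
def total = counter.sendAndWait('get')
println "Total: $total"

counter.stop()
counter.join()
```

Note that the state is read through a message as well; exposing a getter and calling it from another thread would step outside the guarantee described above.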

Simple Calculator

Here is a slightly more realistic example of an event-driven actor that receives two numeric messages, sums them up and sends the result to the console actor.

A Calculator

import groovyx.gpars.group.DefaultPGroup

//not necessary, just showing that a single-threaded pool can still handle multiple actors
def group = new DefaultPGroup(1);

final def console = group.actor {
    loop {
        react {
            println 'Result: ' + it
        }
    }
}

final def calculator = group.actor {
    react {a ->
        react {b ->
            console.send(a + b)
        }
    }
}

calculator.send 2
calculator.send 3

calculator.join()
group.shutdown()

Notice that event-driven actors require special care regarding the react() method. Since event-driven actors need to split the code into independent chunks assignable to different threads sequentially and continuations are not natively supported on the JVM, the chunks are created artificially. The react() method creates the next message handler. As soon as the current message handler finishes, the next message handler (continuation) is scheduled.

Concurrent Merge Sort Example

For comparison, I’m also including a more involved example performing a concurrent merge sort of a list of integers using actors. You can see that, thanks to the flexibility of Groovy, we came pretty close to the Scala model, although I still miss Scala pattern-matching for message handling.

A Sort Sample

import groovyx.gpars.group.DefaultPGroup
import static groovyx.gpars.actor.Actors.actor

Closure createMessageHandler(def parentActor) {
    return {
        react {List<Integer> message ->
            assert message != null
            switch (message.size()) {
                case 0..1:
                    parentActor.send(message)
                    break
                case 2:
                    if (message[0] <= message[1]) parentActor.send(message)
                    else parentActor.send(message[-1..0])
                    break
                default:
                    def splitList = split(message)

                    def child1 = actor(createMessageHandler(delegate))
                    def child2 = actor(createMessageHandler(delegate))
                    child1.send(splitList[0])
                    child2.send(splitList[1])

                    react {message1 ->
                        react {message2 ->
                            parentActor.send merge(message1, message2)
                        }
                    }
            }
        }
    }
}

def console = new DefaultPGroup(1).actor {
    react {
        println "Sorted array:\t${it}"
        System.exit 0
    }
}

def sorter = actor(createMessageHandler(console))
sorter.send([1, 5, 2, 4, 3, 8, 6, 7, 3, 9, 5, 3])
console.join()

def split(List<Integer> list) {
    int listSize = list.size()
    int middleIndex = listSize / 2
    def list1 = list[0..<middleIndex]
    def list2 = list[middleIndex..listSize - 1]
    return [list1, list2]
}

List<Integer> merge(List<Integer> a, List<Integer> b) {
    int i = 0, j = 0
    final int newSize = a.size() + b.size()
    List<Integer> result = new ArrayList<Integer>(newSize)

    //take the smaller head element until one of the lists is exhausted
    while ((i < a.size()) && (j < b.size())) {
        if (a[i] <= b[j]) result << a[i++]
        else result << b[j++]
    }

    if (i < a.size()) result.addAll(a[i..-1])
    else result.addAll(b[j..-1])
    return result
}

Since actors reuse threads from a pool, the script will work with virtually any size of thread pool, no matter how many actors are created along the way.

Actor Lifecycle Methods

Each Actor can define lifecycle observing methods, which will be called whenever a certain lifecycle event occurs.

• afterStart() - called right after the actor has been started.

• afterStop(List undeliveredMessages) - called right after the actor is stopped, passing in all the unprocessed messages from the queue.

• onInterrupt(InterruptedException e) - called when the actor’s thread gets interrupted. Thread interruption will result in stopping the actor in any case.

• onTimeout() - called when no messages are sent to the actor within the timeout specified for the currently blocking react method.

• onException(Throwable e) - called when an exception occurs in the actor’s event handler. The actor will stop after return from this method.

You can either define the methods statically in your Actor class or add them dynamically to the actor’s metaclass:

An Actor Sample

class MyActor extends DefaultActor {
    public void afterStart() {
        ...
    }
    public void onTimeout() {
        ...
    }

    protected void act() {
        ...
    }
}

Another Sample

def myActor = actor {
    delegate.metaClass.onException = {
        log.error('Exception occurred', it)
    }

    ...
}
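The samples above leave the method bodies elided. A runnable sketch that records lifecycle events in a list might look as follows; it assumes GPars is on the classpath, and the AuditedActor class, its events list and the 'boom' message are hypothetical names introduced only for illustration.

```groovy
import groovyx.gpars.actor.DefaultActor

class AuditedActor extends DefaultActor {
    // Synchronized list, since it is read from outside the actor.
    final List<String> events = [].asSynchronized()

    void afterStart() { events << 'started' }
    void onException(Throwable e) { events << "failed: ${e.message}".toString() }
    void afterStop(List undeliveredMessages) { events << 'stopped' }

    protected void act() {
        loop {
            react { msg ->
                if (msg == 'boom') throw new IllegalStateException('boom')
                events << "handled $msg".toString()
            }
        }
    }
}

def audited = new AuditedActor().start()
audited << 'hello'
audited << 'boom'
audited.join()    // as described above, the actor stops after onException() returns

println audited.events
```

The exception thrown from the message handler is what ends this actor, so no explicit stop() call is needed before join().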

Performance Tips

To help performance, you may consider using the silentStart() method instead of start() when starting a DynamicDispatchActor or a ReactiveActor. Calling silentStart() will by-pass some of the start-up machinery and as a result will also avoid calling the afterStart() method. Due to its stateful nature, DefaultActor cannot be started silently.

Pool Management

Actors can be organized into groups and, as a default, there’s always an application-wide pooled actor group available. Just as the Actors abstract factory can be used to create actors in the default group, custom groups can be used as abstract factories to create new actor instances belonging to these groups.

A Group Sample

def myGroup = new DefaultPGroup()

def actor1 = myGroup.actor {...}

def actor2 = myGroup.actor {...}

The parallelGroup property of an actor points to the group it belongs to. By default, it points to the default actor group, which is Actors.defaultActorPGroup, and can only be changed before the actor is started.

A Sample

class MyActor extends StaticDispatchActor<Integer> {
    private static PGroup group = new DefaultPGroup(100)

    MyActor(...) {
        this.parallelGroup = group
        ...
    }
}

The actors belonging to the same group share the underlying thread pool of that group. The pool, by default, contains n + 1 threads, where n stands for the number of CPUs detected by the JVM. The pool size can be set explicitly, either by setting the gpars.poolsize system property or individually for each actor group by specifying the appropriate constructor parameter.

A Sample

def myGroup = new DefaultPGroup(10) //the pool will contain 10 threads

The thread pool can be manipulated through the appropriate DefaultPGroup class, which delegates to the Pool interface of the thread pool. For example, the resize() method allows you to change the pool size at any time and the resetDefaultSize() method sets it back to the default value. The shutdown() method can be called when you need to safely finish all tasks, destroy the pool and stop all the threads in order to exit the JVM in an organized manner.

A Sample

... (n+1 threads in the default pool after startup)

Actors.defaultActorPGroup.resize 1 //use one-thread pool

... (1 thread in the pool)

Actors.defaultActorPGroup.resetDefaultSize()

... (n+1 threads in the pool)

Actors.defaultActorPGroup.shutdown()
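The same operations can be tried on a private group, which avoids touching the shared default pool. The sketch below assumes GPars is on the classpath and that the group exposes the pool size through a poolSize property, as the Pool delegation described above suggests; the echo actor is a hypothetical helper.

```groovy
import groovyx.gpars.group.DefaultPGroup

def group = new DefaultPGroup(3)    // pool created with 3 threads
def initial = group.poolSize

group.resize 1                      // shrink to a single thread
def resized = group.poolSize

// Actors in the group keep working on the resized pool.
def echo = group.actor {
    react { msg -> reply msg }
}
def answer = echo.sendAndWait('still works')

println "initial=$initial resized=$resized answer=$answer"

echo.join()
group.shutdown()                    // release the threads once the group is no longer needed
```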

As an alternative to the DefaultPGroup, which creates a pool of daemon threads, the NonDaemonPGroup class can be used when non-daemon threads are required.

A Sample

def daemonGroup = new DefaultPGroup()

def actor1 = daemonGroup.actor {...}

def nonDaemonGroup = new NonDaemonPGroup()

def actor2 = nonDaemonGroup.actor {...}

class MyActor extends DefaultActor {
    def MyActor() {
        this.parallelGroup = nonDaemonGroup
    }

    void act() {...}
}

Actors belonging to the same group share the underlying thread pool. With pooled actor groups, you can split your actors to leverage multiple thread pools of different sizes and so assign resources to different components of your system and tune their performance.

A Sample

def coreActors = new NonDaemonPGroup(5)  //5 non-daemon threads pool
def helperActors = new DefaultPGroup(1)  //1 daemon thread pool

def priceCalculator = coreActors.actor {...}

def paymentProcessor = coreActors.actor {...}

def emailNotifier = helperActors.actor {...}

def cleanupActor = helperActors.actor {...}

//increase size of the core actor group
coreActors.resize 6

//shutdown the group's pool once you no longer need the group to release resources
helperActors.shutdown()

Do not forget to shut down custom pooled actor groups once you no longer need them and their actors, to preserve system resources.

The Default Actor Group

Actors that didn’t have their parallelGroup property changed or that were created through any of the factory methods on the Actors class share a common group, Actors.defaultActorPGroup. This group uses a resizeable thread pool with an upper limit of 1000 threads. This gives you the comfort of having the pool automatically adjust to the demand of the actors. On the other hand, with a growing number of actors, the pool may become too big and inefficient. It’s advisable to group your actors into your own PGroups with fixed-size thread pools for all but trivial applications.

Common Trap: App Terminates While Actors Do Not Receive Messages

Most likely you’re using daemon threads and pools, which is the default setting, and your main thread finishes. Calling actor.join() on any, some or all of your actors would block the main thread until the actor terminates and thus keep all your actors running.

Alternatively, use instances of NonDaemonPGroup and assign some of your actors to these groups.

A Sample

def nonDaemonGroup = new NonDaemonPGroup()
def myActor = nonDaemonGroup.actor {...}

Alternatively:

A Sample

def nonDaemonGroup = new NonDaemonPGroup()

class MyActor extends DefaultActor {
    def MyActor() {
        this.parallelGroup = nonDaemonGroup
    }

    void act() {...}
}

def myActor = new MyActor()

Blocking Actors

Instead of event-driven continuation-styled actors, you may in some scenarios prefer using blocking actors. Blocking actors hold a single pooled thread for their whole life-time, including the time when waiting for messages. They avoid some of the thread management overhead, since they never fight for threads after start, and also let you write straight code without the necessity of continuation style. Since they only do blocking, message reads are via the receive method. Obviously, the number of blocking actors running concurrently is limited by the number of threads available in the shared pool. On the other hand, blocking actors typically provide better performance compared to continuation-style actors, especially when the actor’s message queue is rarely empty.

A Sample

def decryptor = blockingActor {
    while (true) {
        receive {message ->
            if (message instanceof String) reply message.reverse()
            else stop()
        }
    }
}

def console = blockingActor {
    decryptor.send 'lellarap si yvoorG'
    println 'Decrypted message: ' + receive()
    decryptor.send false
}

[decryptor, console]*.join()

Blocking actors increase the number of options to tune performance of your applications. They may, in particular, be good candidates for high-traffic positions in your actor network.

Stateless Actors

Dynamic Dispatch Actor

The DynamicDispatchActor class is an actor allowing for an alternative structure of the message handling code.

In general, DynamicDispatchActor repeatedly scans for messages and dispatches arrived messages to one of the onMessage(message) methods defined on the actor. The DynamicDispatchActor leverages the Groovy dynamic method dispatch mechanism under the covers. Since, unlike DefaultActor descendants, neither DynamicDispatchActor nor ReactiveActor (discussed below) needs to implicitly remember an actor’s state between subsequent message receptions, they provide much better performance characteristics, generally comparable to other actor frameworks, such as Scala Actors.

A Dynamic Dispatch Actor Sample

import groovyx.gpars.actor.Actors
import groovyx.gpars.actor.DynamicDispatchActor

final class MyActor extends DynamicDispatchActor {

    void onMessage(String message) {
        println 'Received string'
    }

    void onMessage(Integer message) {
        println 'Received integer'
        reply 'Thanks!'
    }

    void onMessage(Object message) {
        println 'Received object'
        sender.send 'Thanks!'
    }

    void onMessage(List message) {
        println 'Received list'
        stop()
    }
}

final def myActor = new MyActor().start()

Actors.actor {
    myActor 1
    myActor ''
    myActor 1.0
    myActor(new ArrayList())
    myActor.join()
}.join()

In some scenarios, typically when no implicit conversation-history-dependent state needs to be preserved for the actor, the dynamic dispatch code structure may be more intuitive than the traditional one using nested loop and react statements.

The DynamicDispatchActor class also provides a handy facility to add message handlers dynamically at actor construction time or any time later using the when handlers, optionally wrapped inside a become method:

A DynamicDispatchActor Sample

final Actor myActor = new DynamicDispatchActor().become {
    when {String msg -> println 'A String'; reply 'Thanks'}
    when {Double msg -> println 'A Double'; reply 'Thanks'}
    when {msg -> println 'A something ...'; reply 'What was that?';stop()}
}
myActor.start()
Actors.actor {
    myActor 'Hello'
    myActor 1.0d
    myActor 10 as BigDecimal
    myActor.join()
}.join()

Obviously the two approaches can be combined:

A Combined Sample

final class MyDDA extends DynamicDispatchActor {

    void onMessage(String message) {
        println 'Received string'
    }

    void onMessage(Integer message) {
        println 'Received integer'
    }

    void onMessage(Object message) {
        println 'Received object'
    }

    void onMessage(List message) {
        println 'Received list'
        stop()
    }
}

final def myActor = new MyDDA().become {
    when {BigDecimal num -> println 'Received BigDecimal'}
    when {Float num -> println 'Got a float'}
}.start()

Actors.actor {
    myActor 'Hello'
    myActor 1.0f
    myActor 10 as BigDecimal
    myActor.send([])
    myActor.join()
}.join()

The dynamic message handlers registered via when will take precedence over the static onMessage handlers.

Fair or non-fair Behavior of DynamicDispatchActors

DynamicDispatchActor can be set to behave in a fair or non-fair (default) manner. Depending on the strategy chosen, the actor either makes the thread available to other actors sharing the same parallel group (fair), or keeps the thread for itself until the message queue gets empty (non-fair).

Generally, non-fair actors perform 2 - 3 times better than fair ones.

Use either the fairMessageHandler() factory method or the actor’s makeFair() method.


A Fair Sample

def fairActor = Actors.fairMessageHandler {...}

Static Dispatch Actor

While DynamicDispatchActor dispatches messages based on their run-time type and so pays an extra performance penalty for each message, StaticDispatchActor avoids run-time message checks and dispatches the message solely based on the compile-time information.

A Sample

final class MyActor extends StaticDispatchActor<String> {
    void onMessage(String message) {
        println 'Received string ' + message

        switch (message) {
            case 'hello':
                reply 'Hi!'
                break
            case 'stop':
                stop()
        }
    }
}

Instances of StaticDispatchActor have to override the onMessage method appropriate for the actor’s declared type parameter. The onMessage(T message) method is then invoked with every received message.

A shorter route towards both fair and non-fair static dispatch actors is available through the helper factory methods:

A Sample

final actor = staticMessageHandler {String message ->
    println 'Received string ' + message

    switch (message) {
        case 'hello':
            reply 'Hi!'
            break
        case 'stop':
            stop()
    }
}

println 'Reply: ' + actor.sendAndWait('hello')
actor 'bye'
actor 'stop'
actor.join()

When compared to the DynamicDispatchActor, the StaticDispatchActor class is limited to a single handler method.

This simplified creation without any when handlers, plus the considerable performance benefits, should make StaticDispatchActor your default choice for straightforward message handlers. Use it when dispatching based on message run-time type is not necessary.

For example, StaticDispatchActors make dataflow operators four times faster than the DynamicDispatchActor.

Reactive Actor

The ReactiveActor class, constructed typically by calling Actors.reactor() or DefaultPGroup.reactor(), allows for a more event-driven approach.

When a reactive actor receives a message, the supplied block of code, which makes up the reactive actor’s body, is run with the message as a parameter. The result returned from the code is sent in reply.

A Sample

final def group = new DefaultPGroup()

final def doubler = group.reactor {
    2 * it
}

group.actor {
    println 'Double of 10 = ' + doubler.sendAndWait(10)
}

for(i in (1..10)) {
    println "Double of $i = ${doubler.sendAndWait(i)}"
}

doubler.stop()
doubler.join()

Here’s an example of an actor that submits a batch of numbers to a ReactiveActor for processing and then prints the results gradually as they arrive.

A Sample

import groovyx.gpars.actor.Actor
import groovyx.gpars.actor.Actors

final def doubler = Actors.reactor {
    2 * it
}

Actor actor = Actors.actor {
    (1..10).each {doubler << it}
    int i = 0
    loop {
        i += 1
        if (i > 10) stop()
        else {
            react {message ->
                println "Double of $i = $message"
            }
        }
    }
}

actor.join()
doubler.stop()
doubler.join()

Essentially, reactive actors provide a convenience shortcut for an actor that would wait for messages in a loop, process them and send back the result. This is schematically how the reactive actor looks inside:

A Sample

public class ReactiveActor extends DefaultActor {
    Closure body

    void act() {
        loop {
            react {message ->
                reply body(message)
            }
        }
    }
}

Fair or Non-fair Behavior of ReactiveActors

ReactiveActor can be set to behave in a fair or unfair (default) manner.

Depending on the strategy chosen, the actor either makes the thread available to other actors sharing the same parallel group (fair), or keeps the thread for itself until the message queue is empty (non-fair). Generally, non-fair actors perform 2 - 3 times better than fair ones.

Use either the fairReactor() factory method or the actor’s makeFair() method.

A Fair Sample

def fairActor = Actors.fairReactor {...}

Tips and Tricks

Structuring Actor’s Code

When extending the DefaultActor class, you can call any of the actor’s methods from within the act() method and use the react() or loop() methods in them.

A Sample

class MyDemoActor extends DefaultActor {

    protected void act() {
        handleA()
    }

    private void handleA() {
        react {a ->
            handleB(a)
        }
    }

    private void handleB(int a) {
        react {b ->
            println a + b
            reply a + b
        }
    }
}

final def demoActor = new MyDemoActor()
demoActor.start()

Actors.actor {
    demoActor 10
    demoActor 20
    react {
        println "Result: $it"
    }
}.join()

Bear in mind that the methods handleA() and handleB() in all our examples will only schedule the supplied message handlers to run as continuations of the current calculation in reaction to the next message arriving.

Alternatively, when using the actor() factory method, you can add event-handling code through the meta class as closures.

A Sample

Actor demoActor = Actors.actor {
    delegate.metaClass {
        handleA = {->
            react {a ->
                handleB(a)
            }
        }

        handleB = {a ->
            react {b ->
                println a + b
                reply a + b
            }
        }
    }

    handleA()
}


Closures, which have the actor set as their delegate, can also be used to structure event-handling code.

A Sample

Closure handleB = {a ->
    react {b ->
        println a + b
        reply a + b
    }
}

Closure handleA = {->
    react {a ->
        handleB(a)
    }
}

Actor demoActor = Actors.actor {
    handleA.delegate = delegate
    handleB.delegate = delegate

    handleA()
}


Event-Driven Loops

When coding event-driven actors, please keep in mind that calls to react() and loop() methods have slightly different semantics.

This becomes a bit of a challenge once you try to implement any type of loop in your actors. On the other hand, if you leverage the fact that react() only schedules a continuation and returns, you may call methods recursively without fear of stack overflow. The examples below use the three techniques described above for structuring an actor’s code.

A Subclass Of DefaultActor

A Sample

class MyLoopActor extends DefaultActor {

    protected void act() {
        outerLoop()
    }

    private void outerLoop() {
        react {a ->
            println 'Outer: ' + a
            if (a != 0) innerLoop()
            else println 'Done'
        }
    }

    private void innerLoop() {
        react {b ->
            println 'Inner ' + b
            if (b == 0) outerLoop()
            else innerLoop()
        }
    }
}

final def actor = new MyLoopActor().start()
actor 10
actor 20
actor 0
actor 0
actor.join()

Enhancing The Actor’s MetaClass

A Sample

Actor actor = Actors.actor {

    delegate.metaClass {
        outerLoop = {->
            react {a ->
                println 'Outer: ' + a
                if (a != 0) innerLoop()
                else println 'Done'
            }
        }

        innerLoop = {->
            react {b ->
                println 'Inner ' + b
                if (b == 0) outerLoop()
                else innerLoop()
            }
        }
    }

    outerLoop()
}

actor 10
actor 20
actor 0
actor 0
actor.join()

Using Groovy Closures

A Groovy Sample

Closure innerLoop

Closure outerLoop = {->
    react {a ->
        println 'Outer: ' + a
        if (a != 0) innerLoop()
        else println 'Done'
    }
}

innerLoop = {->
    react {b ->
        println 'Inner ' + b
        if (b == 0) outerLoop()
        else innerLoop()
    }
}

Actor actor = Actors.actor {
    outerLoop.delegate = delegate
    innerLoop.delegate = delegate

    outerLoop()
}

actor 10
actor 20
actor 0
actor 0
actor.join()

Also remember that you can use the actor’s loop() method to create a loop that runs until the actor terminates.

A Sample

class MyLoopingActor extends DefaultActor {

    protected void act() {
        loop {
            outerLoop()
        }
    }

    private void outerLoop() {
        react {a ->
            println 'Outer: ' + a
            if (a != 0) innerLoop()
            else println 'Done for now, but will loop again'
        }
    }

    private void innerLoop() {
        react {b ->
            println 'Inner ' + b
            if (b == 0) outerLoop()
            else innerLoop()
        }
    }
}

final def actor = new MyLoopingActor().start()
actor 10
actor 20
actor 0
actor 0
actor 10
actor.stop()
actor.join()

Active Objects

Active objects provide an OO facade on top of actors. This allows you to avoid dealing directly with the actor machinery, having to match messages, wait for results and send replies. Ouch!

Actors With a Friendly Facade

A Sample

import groovyx.gpars.activeobject.ActiveObject
import groovyx.gpars.activeobject.ActiveMethod

@ActiveObject
class Decryptor {
    @ActiveMethod
    def decrypt(String encryptedText) {
        return encryptedText.reverse()
    }

    @ActiveMethod
    def decrypt(Integer encryptedNumber) {
        return -1*encryptedNumber + 142
    }
}

final Decryptor decryptor = new Decryptor()
def part1 = decryptor.decrypt(' noitcA ni yvoorG')
def part2 = decryptor.decrypt(140)
def part3 = decryptor.decrypt('noitide dn')

print part1.get()
print part2.get()
println part3.get()

You mark active objects with the @ActiveObject annotation. This ensures that a hidden actor instance is created for each instance of your class. You can then mark methods with the @ActiveMethod annotation, indicating that you want the method to be invoked asynchronously by the target object’s internal actor. An optional boolean blocking parameter to the @ActiveMethod annotation specifies whether the caller should block until a result is available, or whether the caller should instead receive only a promise for a future result, in the form of a DataflowVariable, and so not be blocked waiting.

Blocking or Not?

By default, all active methods are non-blocking. However, methods that declare their return type explicitly must be configured as blocking, otherwise the compiler will report an error. Only def, void and DataflowVariable are permissible return types for non-blocking methods.

Under the covers, GPars translates your method call into a message sent to the internal actor. The actor eventually handles that message by invoking the desired method on behalf of the caller and, once finished, sends a reply back to the caller. Non-blocking methods return promises for results, aka DataflowVariables.

But Blocking Means We’re Not Really Asynchronous, Are We?

Indeed, if you mark your active methods as blocking, the caller will be blocked waiting for the result, just as with a plain method invocation. All we’ve achieved is thread-safety inside the active object under concurrent access, something the synchronized keyword could give you as well. So it is the non-blocking methods that should drive your decision towards using active objects. Blocking methods then provide the usual synchronous semantics, yet still give consistency guarantees across concurrent method invocations, and so remain very useful when combined with non-blocking ones.

A Sample

import groovyx.gpars.activeobject.ActiveMethod
import groovyx.gpars.activeobject.ActiveObject
import groovyx.gpars.dataflow.DataflowVariable

@ActiveObject
class Decryptor {
    @ActiveMethod(blocking=true)
    String decrypt(String encryptedText) {
        encryptedText.reverse()
    }

    @ActiveMethod(blocking=true)
    Integer decrypt(Integer encryptedNumber) {
        -1*encryptedNumber + 142
    }
}

final Decryptor decryptor = new Decryptor()
print decryptor.decrypt(' noitcA ni yvoorG')
print decryptor.decrypt(140)
println decryptor.decrypt('noitide dn')
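
To illustrate how blocking and non-blocking methods can work together, here is a small sketch that assumes GPars is on the classpath. The HitCounter class is hypothetical and not part of GPars. It relies on the internal actor handling messages from one caller in the order they were sent, so the blocking read sees all earlier non-blocking updates:

```groovy
import groovyx.gpars.activeobject.ActiveMethod
import groovyx.gpars.activeobject.ActiveObject

// Hypothetical class, for illustration only
@ActiveObject
class HitCounter {
    private int hits = 0

    @ActiveMethod                    // non-blocking: the caller continues immediately
    void hit() {
        hits += 1
    }

    @ActiveMethod(blocking = true)   // blocking: handled after the earlier messages
    Integer total() {
        hits
    }
}

final counter = new HitCounter()
100.times { counter.hit() }

def total = counter.total()
println "Total hits: $total"
```

The updates are fire-and-forget, while the read gives the caller a consistent snapshot without any explicit locking.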

Non-Blocking Semantics

Calling the non-blocking active method will return as soon as the actor has been sent a message. The caller is now allowed to do whatever it likes, while the actor is taking care of the calculation.

The state of the calculation can be polled using the bound property on the promise. Calling the get() method on the returned promise will block the caller until a value is available. The call to get() will eventually return a value or throw an exception, depending on the outcome of the actual calculation.

The get() method has a variant with a timeout parameter, to avoid the risk of waiting indefinitely.
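
The polling and timeout options described above might be used as in the following sketch. The SlowReverser class and the 200 ms delay are invented for illustration, and the sketch assumes the returned promise is a DataflowVariable, as the text states:

```groovy
import groovyx.gpars.activeobject.ActiveMethod
import groovyx.gpars.activeobject.ActiveObject
import java.util.concurrent.TimeUnit

// Hypothetical class, for illustration only
@ActiveObject
class SlowReverser {
    @ActiveMethod
    def reverse(String text) {
        Thread.sleep 200             // simulate a lengthy calculation
        text.reverse()
    }
}

def promise = new SlowReverser().reverse('sraPG')

// Poll without blocking; the result may or may not be ready at this point
println "Bound yet? ${promise.bound}"

// Wait for at most 5 seconds; the timed variant returns null on timeout
def result = promise.get(5, TimeUnit.SECONDS)
println "Result: $result"
```

Because the timed get() signals a timeout with null, a null result needs to be told apart from a calculation that legitimately returns null.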


Annotation Rules

There are a few rules to follow when annotating your objects:

• The ActiveMethod annotations are only accepted in classes annotated as ActiveObject

• Only instance (non-static) methods can be annotated as ActiveMethod

• You can override active methods with non-active ones and vice versa

• Subclasses of active objects can declare additional active methods, provided they are themselves annotated as ActiveObject

• Combining concurrent use of active and non-active methods may result in race conditions. Ideally design your active objects as completely encapsulated classes with all non-private methods marked as active

Inheritance

The @ActiveObject annotation can appear on any class in an inheritance hierarchy. The actor field will only be created in the top-most annotated class in the hierarchy; the subclasses will reuse the field.

An Annotated Sample

import groovyx.gpars.activeobject.ActiveObject
import groovyx.gpars.activeobject.ActiveMethod
import groovyx.gpars.dataflow.DataflowVariable

@ActiveObject
class A {
    @ActiveMethod
    def fooA(value) {
        ...
    }
}

class B extends A {
}

@ActiveObject
class C extends B {
    @ActiveMethod
    def fooC(value1, value2) {
        ...
    }
}

In our example, the actor field will be generated into class A. Class C has to be annotated with @ActiveObject since it holds the @ActiveMethod annotation on method fooC(), while class B does not need the annotation, since none of its methods is active.

Groups

Just like actors can be grouped around thread pools, active objects can be configured to use threads from particular parallel groups.

A Group Sample

@ActiveObject("group1")
class MyActiveObject {
    ...
}

The value parameter to the @ActiveObject annotation specifies the name of a parallel group to bind the internal actor to. Only threads from the specified group will be used to run internal actors of instances of the class.

The groups, however, need to be created and registered prior to creation of any of the active object instances belonging to that group. If not specified explicitly, an active object will use the default actor group, Actors.defaultActorPGroup.

A Sample

final DefaultPGroup group = new DefaultPGroup(10)
ActiveObjectRegistry.instance.register("group1", group)
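
Putting the two fragments together, a complete script could look like the sketch below. The Greeter class, the pool size of 2 and the package assumed for ActiveObjectRegistry (groovyx.gpars.activeobject) are choices made for this illustration. The point is the ordering: the group is registered before the first instance is created:

```groovy
import groovyx.gpars.activeobject.ActiveMethod
import groovyx.gpars.activeobject.ActiveObject
import groovyx.gpars.activeobject.ActiveObjectRegistry
import groovyx.gpars.group.DefaultPGroup

// Register the group first; instances look it up by name when they are created
final DefaultPGroup group = new DefaultPGroup(2)
ActiveObjectRegistry.instance.register('group1', group)

// Hypothetical class, for illustration only
@ActiveObject('group1')
class Greeter {
    @ActiveMethod(blocking = true)
    String greet(String name) {
        'Hello, ' + name
    }
}

def greeting = new Greeter().greet('GPars')
println greeting

group.shutdown()
```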

Alternative Names For The Internal Actor

You will probably only rarely run into name collisions with the default name for the active object’s internal actor field. Should you need to change the default name internalActiveObjectActor, use the actorName parameter to the @ActiveObject annotation.

A Named Sample

@ActiveObject(actorName = "alternativeActorName")
class MyActiveObject {
    ...
}

Actor Naming Conventions

Alternative names for internal actors, as well as their desired groups, cannot be overridden in subclasses.

Make sure you only specify these values in the top-most active objects in your inheritance hierarchy. The top-most active object is still allowed to subclass other classes, as long as none of its ancestors is an active object.

Classic Examples

A Few Examples of Actors Usage

• The Sieve of Eratosthenes

• Sleeping Barber

• Dining Philosophers

• Word Sort

• Load Balancer

The Sieve of Eratosthenes

Problem description: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

A Sample

import groovyx.gpars.actor.DynamicDispatchActor

/**
 * Demonstrates concurrent implementation of the Sieve of Eratosthenes using actors
 *
 * In principle, the algorithm consists of concurrently run chained filters,
 * each of which detects whether the current number can be divided by a single prime number.
 * (generate nums 1, 2, 3, 4, 5, ...) -> (filter by mod 2) -> (filter by mod 3) -> (filter by mod 5) -> (filter by mod 7) -> (filter by mod 11) -> (caution! Primes falling out here)
 * The chain is built (grows) on the fly, whenever a new prime is found.
 */

int requestedPrimeNumberBoundary = 1000

final def firstFilter = new FilterActor(2).start()

/**
 * Generating candidate numbers and sending them to the actor chain
 */
(2..requestedPrimeNumberBoundary).each {
    firstFilter it
}
firstFilter.sendAndWait 'Poison'

/**
 * Filter out numbers that can be divided by a single prime number
 */
final class FilterActor extends DynamicDispatchActor {
    private final int myPrime
    private def follower

    def FilterActor(final myPrime) { this.myPrime = myPrime; }

    /**
     * Try to divide the received number with the prime. If the number cannot be divided, send it along the chain.
     * If there's no-one to send it to, I'm the last in the chain, the number is a prime and so I will create and chain
     * a new actor responsible for filtering by this newly found prime number.
     */
    def onMessage(int value) {
        if (value % myPrime != 0) {
            if (follower) follower value
            else {
                println "Found $value"
                follower = new FilterActor(value).start()
            }
        }
    }

    /**
     * Stop the actor on poison reception
     */
    def onMessage(def poison) {
        if (follower) {
            def sender = sender
            follower.sendAndContinue(poison, {this.stop(); sender?.send('Done')}) //Pass the poison along and stop after a reply
        } else { //I am the last in the chain
            stop()
            reply 'Done'
        }
    }
}

Sleeping Barber

Problem description: http://en.wikipedia.org/wiki/Sleeping_barber_problem

A Sample

import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.actor.DefaultActor
import groovyx.gpars.actor.Actor

final def group = new DefaultPGroup()

final def barber = group.actor {
    final def random = new Random()
    loop {
        react {message ->
            switch (message) {
                case Enter:
                    message.customer.send new Start()
                    println "Barber: Processing customer ${message.customer.name}"
                    doTheWork(random)
                    message.customer.send new Done()
                    reply new Next()
                    break
                case Wait:
                    println "Barber: No customers. Going to have a sleep"
                    break
            }
        }
    }
}

private def doTheWork(Random random) {
    Thread.sleep(random.nextInt(10) * 1000)
}

final Actor waitingRoom

waitingRoom = group.actor {
    final int capacity = 5
    final List<Customer> waitingCustomers = []
    boolean barberAsleep = true

    loop {
        react {message ->
            switch (message) {
                case Enter:
                    if (waitingCustomers.size() == capacity) {
                        reply new Full()
                    } else {
                        waitingCustomers << message.customer
                        if (barberAsleep) {
                            assert waitingCustomers.size() == 1
                            barberAsleep = false
                            waitingRoom.send new Next()
                        }
                        else reply new Wait()
                    }
                    break
                case Next:
                    if (waitingCustomers.size() > 0) {
                        def customer = waitingCustomers.remove(0)
                        barber.send new Enter(customer:customer)
                    } else {
                        barber.send new Wait()
                        barberAsleep = true
                    }
            }
        }
    }
}

class Customer extends DefaultActor {
    String name
    Actor localBarbers

    void act() {
        localBarbers << new Enter(customer:this)
        loop {
            react {message ->
                switch (message) {
                    case Full:
                        println "Customer: $name: The waiting room is full. I am leaving."
                        stop()
                        break
                    case Wait:
                        println "Customer: $name: I will wait."
                        break
                    case Start:
                        println "Customer: $name: I am now being served."
                        break
                    case Done:
                        println "Customer: $name: I have been served."
                        stop()
                        break
                }
            }
        }
    }
}

class Enter { Customer customer }
class Full {}
class Wait {}
class Next {}
class Start {}
class Done {}

def customers = []
customers << new Customer(name:'Joe', localBarbers:waitingRoom).start()
customers << new Customer(name:'Dave', localBarbers:waitingRoom).start()
customers << new Customer(name:'Alice', localBarbers:waitingRoom).start()

sleep 15000
customers << new Customer(name: 'James', localBarbers: waitingRoom).start()
sleep 5000
customers*.join()
barber.stop()
waitingRoom.stop()

Dining Philosophers

Problem description: http://en.wikipedia.org/wiki/Dining_philosophers_problem

A Sample

import groovyx.gpars.actor.DefaultActor
import groovyx.gpars.actor.Actors

Actors.defaultActorPGroup.resize 5

final class Philosopher extends DefaultActor {
    private Random random = new Random()

    String name
    def forks = []

    void act() {
        assert 2 == forks.size()
        loop {
            think()
            forks*.send new Take()
            def messages = []
            react {a ->
                messages << [a, sender]
                react {b ->
                    messages << [b, sender]
                    if ([a, b].any {Rejected.isCase it}) {
                        println "$name: \tOops, can't get my forks! Giving up."
                        final def accepted = messages.find {Accepted.isCase it[0]}
                        if (accepted != null) accepted[1].send new Finished()
                    } else {
                        eat()
                        reply new Finished()
                    }
                }
            }
        }
    }

    void think() {
        println "$name: \tI'm thinking"
        Thread.sleep random.nextInt(5000)
        println "$name: \tI'm done thinking"
    }

    void eat() {
        println "$name: \tI'm EATING"
        Thread.sleep random.nextInt(2000)
        println "$name: \tI'm done EATING"
    }
}

final class Fork extends DefaultActor {

    String name
    boolean available = true

    void act() {
        loop {
            react {message ->
                switch (message) {
                    case Take:
                        if (available) {
                            available = false
                            reply new Accepted()
                        } else reply new Rejected()
                        break
                    case Finished:
                        assert !available
                        available = true
                        break
                    default: throw new IllegalStateException("Cannot process the message: $message")
                }
            }
        }
    }
}

final class Take {}
final class Accepted {}
final class Rejected {}
final class Finished {}

def forks = [
    new Fork(name:'Fork 1'),
    new Fork(name:'Fork 2'),
    new Fork(name:'Fork 3'),
    new Fork(name:'Fork 4'),
    new Fork(name:'Fork 5')
]

def philosophers = [
    new Philosopher(name:'Joe', forks:[forks[0], forks[1]]),
    new Philosopher(name:'Dave', forks:[forks[1], forks[2]]),
    new Philosopher(name:'Alice', forks:[forks[2], forks[3]]),
    new Philosopher(name:'James', forks:[forks[3], forks[4]]),
    new Philosopher(name:'Phil', forks:[forks[4], forks[0]]),
]

forks*.start()
philosophers*.start()

sleep 10000
forks*.stop()
philosophers*.stop()

Word Sort

Given a folder name, the script will sort words in all files in the folder. The SortMaster actor creates a given number of WordSortActors, splits among them the files to sort words in and collects the results.

Inspired by the Scala Concurrency blog post by Michael Galpin (http://fupeg.blogspot.com/2009/06/scala-concurrency.html)

A Sample, Sort-Of

import groovyx.gpars.actor.Actor
import groovyx.gpars.actor.DefaultActor
import java.util.concurrent.CountDownLatch

//Messages
private final class FileToSort { String fileName }
private final class SortResult { String fileName; List<String> words }

//Worker actor
class WordSortActor extends DefaultActor {

    private List<String> sortedWords(String fileName) {
        parseFile(fileName).sort {it.toLowerCase()}
    }

    private List<String> parseFile(String fileName) {
        List<String> words = []
        new File(fileName).splitEachLine(' ') {words.addAll(it)}
        return words
    }

    void act() {
        loop {
            react {message ->
                switch (message) {
                    case FileToSort:
                        println "Sorting file=${message.fileName} on thread ${Thread.currentThread().name}"
                        reply new SortResult(fileName: message.fileName, words: sortedWords(message.fileName))
                }
            }
        }
    }
}

//Master actor
final class SortMaster extends DefaultActor {

    String docRoot = '/'
    int numActors = 1

    List<List<String>> sorted = []
    private CountDownLatch startupLatch = new CountDownLatch(1)
    private CountDownLatch doneLatch

    private void beginSorting() {
        int cnt = sendTasksToWorkers()
        doneLatch = new CountDownLatch(cnt)
    }

    private List createWorkers() {
        return (1..numActors).collect {new WordSortActor().start()}
    }

    private int sendTasksToWorkers() {
        List<Actor> workers = createWorkers()
        int cnt = 0
        new File(docRoot).eachFile {
            workers[cnt % numActors] << new FileToSort(fileName: it)
            cnt += 1
        }
        return cnt
    }

    public void waitUntilDone() {
        startupLatch.await()
        doneLatch.await()
    }

    void act() {
        beginSorting()
        startupLatch.countDown()
        loop {
            react {
                switch (it) {
                    case SortResult:
                        sorted << it.words
                        doneLatch.countDown()
                        println "Received results for file=${it.fileName}"
                }
            }
        }
    }
}

//start the actors to sort words
def master = new SortMaster(docRoot: 'c:/tmp/Logs/', numActors: 5).start()
master.waitUntilDone()
println 'Done'

File file = new File("c:/tmp/Logs/sorted_words.txt")
file.withPrintWriter { printer ->
    master.sorted.each { printer.println it }
}

Load Balancer

Demonstrates work balancing among an adaptable set of workers. The load balancer receives tasks and queues them in a temporary task queue. When a worker finishes its assignment, it asks the load balancer for a new task.

If the load balancer doesn’t have any tasks available in the task queue, the worker is stopped. If the number of tasks in the task queue exceeds a certain limit, a new worker is created to increase the size of the worker pool.

A Load Balancer Sample

import groovyx.gpars.actor.Actor
import groovyx.gpars.actor.DefaultActor

/**
 * Demonstrates work balancing among adaptable set of workers.
 * The load balancer receives tasks and queues them in a temporary task queue.
 * When a worker finishes his assignment, it asks the load balancer for a new task.
 * If the load balancer doesn't have any tasks available in the task queue, the worker is stopped.
 * If the number of tasks in the task queue exceeds certain limit, a new worker is created
 * to increase size of the worker pool.
 */

final class LoadBalancer extends DefaultActor {
    int workers = 0
    List taskQueue = []
    private static final QUEUE_SIZE_TRIGGER = 10

    void act() {
        loop {
            react { message ->
                switch (message) {
                    case NeedMoreWork:
                        if (taskQueue.size() == 0) {
                            println 'No more tasks in the task queue. Terminating the worker.'
                            reply DemoWorker.EXIT
                            workers -= 1
                        } else reply taskQueue.remove(0)
                        break
                    case WorkToDo:
                        taskQueue << message
                        if ((workers == 0) || (taskQueue.size() >= QUEUE_SIZE_TRIGGER)) {
                            println 'Need more workers. Starting one.'
                            workers += 1
                            new DemoWorker(this).start()
                        }
                }
                println "Active workers=${workers}\tTasks in queue=${taskQueue.size()}"
            }
        }
    }
}

final class DemoWorker extends DefaultActor {
    final static Object EXIT = new Object()
    private static final Random random = new Random()

    Actor balancer

    def DemoWorker(balancer) {
        this.balancer = balancer
    }

    void act() {
        loop {
            this.balancer << new NeedMoreWork()
            react {
                switch (it) {
                    case WorkToDo:
                        processMessage(it)
                        break
                    case EXIT: terminate()
                }
            }
        }
    }

    private void processMessage(message) {
        synchronized (random) {
            Thread.sleep random.nextInt(5000)
        }
    }
}
final class WorkToDo {}
final class NeedMoreWork {}

final Actor balancer = new LoadBalancer().start()

//produce tasks
for (i in 1..20) {
    Thread.sleep 100
    balancer << new WorkToDo()
}

//produce tasks in a parallel thread
Thread.start {
    for (i in 1..10) {
        Thread.sleep 1000
        balancer << new WorkToDo()
    }
}

Thread.sleep 35000  //let the queues get empty
balancer << new WorkToDo()
balancer << new WorkToDo()
Thread.sleep 10000

balancer.stop()
balancer.join()

User Guide To Agents

The Agent class is a thread-safe, non-blocking wrapper around shared mutable state, inspired by Agents in Clojure.

Shared Mutable State can’t be avoided

A lot of the concurrency problems disappear when you eliminate the need for Shared Mutable State with your architecture. Indeed, concepts like actors, CSP or dataflow concurrency avoid or isolate mutable state completely.

In some cases, however, sharing mutable data is either inevitable or makes the design more natural and understandable. For example, think of a shopping cart in a typical e-commerce application, when multiple AJAX requests may hit the cart with read or write requests concurrently.

Introduction

In the Clojure programming language, you can find a concept of Agents, the purpose of which is to protect mutable data that need to be shared across threads. Agents hide the data and protect it from direct access. Clients can only send commands (functions) to the agent. The commands will be serialized and processed against the data one-by-one in turn.

With the commands being executed serially, the commands do not need to care about concurrency and can assume the data is all theirs when run. Although implemented differently, GPars Agents, called Agent, fundamentally behave like actors. They accept messages and process them asynchronously. The messages, however, must be commands (functions or Groovy closures) and will be executed inside the agent. After reception, the received function is run against the internal state of the Agent. In Clojure, the return value of the function becomes the new internal state; GPars commands instead modify the state in place or replace it through updateValue(), as described under Basic Rules.

Essentially, agents safe-guard mutable values by allowing only a single agent-managed thread to make modifications to them. The mutable values are not directly accessible from outside; instead, requests have to be sent to the agent, and the agent is guaranteed to process the requests sequentially on behalf of the callers. Agents guarantee sequential execution of all requests and so consistency of the values.

Schematically -

agent = new Agent(0)  //created a new Agent wrapping an integer with initial value 0
agent.send {increment()}  //asynchronous send operation, sending the increment() function
...
//after some delay to process the message the internal Agent's state has been updated
...
assert agent.val == 1

To wrap integers, we can certainly use AtomicXXX types on the Java platform, but when the state is a more complex object we need more support.

Concepts

GPars provides an Agent class, which is a special-purpose, thread-safe, non-blocking implementation inspired by Agents in Clojure.

An Agent wraps a reference to mutable state, held inside a single field, and accepts code (closures or commands) as messages, which can be sent to the Agent just like to any other actor using the '<<' operator, the send() methods or the implicit call() method.

At some point after reception of a closure / command, the closure is invoked against the internal mutable field and can make changes to it. The closure is guaranteed to be run without intervention from other threads and so may freely alter the internal state of the Agent held in the internal data field.

The whole update process is of the fire-and-forget type since, once the message (closure) is sent to the Agent, the caller thread can go off to do other things and come back later to check the current value with Agent.val or Agent.valAsync(closure).

Basic Rules

• When executed, the submitted commands obtain the agent’s state as a parameter.

• The submitted commands / closures can call any methods on the agent’s state.

• Replacing the state object with a new one is also possible and is done using the updateValue() method.

• The return value of the submitted closure doesn’t have a special meaning and is ignored.

• If the message sent to an Agent is not a closure, it is considered to be a new value for the internal reference field.

• The val property of an Agent will wait until all preceding commands in the agent’s queue are consumed and then safely return the value of the Agent.

• The valAsync() method will do the same without blocking the caller.

• The instantVal property will return an immediate snapshot of the agent’s internal state.

• All Agent instances share a default daemon thread pool. Setting the threadPool property of an Agent instance will allow it to use a different thread pool.

• Exceptions thrown by the commands can be collected using the errors property.
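
Several of these rules can be seen working together in a short sketch, assuming GPars is on the classpath. The list contents and the failing command are made up for illustration, and the final assertions assume that a command which throws leaves the state as it was:

```groovy
import groovyx.gpars.agent.Agent

final agent = new Agent<List<String>>(['a'])

agent << { it << 'b' }                 // call methods on the state in place
agent << { updateValue(it + 'c') }     // replace the state with a new list
agent << { 'ignored' }                 // the return value has no effect
assert agent.val == ['a', 'b', 'c']    // val waits for the commands queued so far

agent << ['x', 'y']                    // a non-closure message becomes the new state
assert agent.val == ['x', 'y']

agent << { throw new IllegalStateException('Failed command') }
agent.await()                          // block until the queue has been processed

def failed = agent.hasErrors()
def errorCount = agent.errors.size()   // reading errors also clears the history
println "Errors collected: $errorCount, state: ${agent.instantVal}"
```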

Examples

Shared List of Members

The Agent wraps a list of members, who have been added to the club. To add a new member, a message (command to add a member) has to be sent to the clubMembers Agent.

A Sample -

import groovyx.gpars.agent.Agent
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

/**
 * Create a new Agent wrapping a list of strings
 */
def clubMembers = new Agent<List<String>>(['Me'])  //add Me

clubMembers.send {it.add 'James'}  //add James

final Thread t1 = Thread.start {
    clubMembers.send {it.add 'Joe'}  //add Joe
}

final Thread t2 = Thread.start {
    clubMembers << {it.add 'Dave'}  //add Dave
    clubMembers {it.add 'Alice'}    //add Alice (using the implicit call() method)
}

[t1, t2]*.join()
println clubMembers.val
clubMembers.valAsync {println "Current members: $it"}

clubMembers.await()

Shared Conference Counting Number of Registrations

The Conference class allows registration and un-registration, however these methods can only be called from the commands sent to the conference Agent.

A Conference Sample -

import groovyx.gpars.agent.Agent

/**
 * Conference stores number of registrations and allows parties to register and unregister.
 * It inherits from the Agent class and adds the register() and unregister() private methods,
 * which callers may use in the commands they submit to the Conference.
 */
class Conference extends Agent<Long> {
    def Conference() { super(0) }
    private def register(long num) { data += num }
    private def unregister(long num) { data -= num }
}

final Agent conference = new Conference()  //new Conference created

/**
 * Three external parties will try to register/unregister concurrently
 */

final Thread t1 = Thread.start {
    conference << {register(10L)}  //send a command to register 10 attendees
}

final Thread t2 = Thread.start {
    conference << {register(5L)}  //send a command to register 5 attendees
}

final Thread t3 = Thread.start {
    conference << {unregister(3L)}  //send a command to unregister 3 attendees
}

[t1, t2, t3]*.join()

assert 12L == conference.val

Factory Methods

Agent instances can also be created using the Agent.agent() factory method.

A Sample to Make an Agent Instance

def clubMembers = Agent.agent(['Me'])  //add Me

Listeners and Validators

Agents allow the user to add listeners and validators. While listeners are notified each time the internal state changes, validators get a chance to reject or veto a coming change by throwing an exception.

A Concrete Example -

final Agent counter = new Agent()

counter.addListener {oldValue, newValue -> println "Changing value from $oldValue to $newValue"}
counter.addListener {agent, oldValue, newValue -> println "Agent $agent changing value from $oldValue to $newValue"}

counter.addValidator {oldValue, newValue -> if (oldValue > newValue) throw new IllegalArgumentException('Things can only go up in Groovy')}
counter.addValidator {agent, oldValue, newValue -> if (oldValue == newValue) throw new IllegalArgumentException("Things never stay the same for $agent")}

counter 10
counter 11
counter {updateValue 12}
counter 10  //Will be rejected

counter {updateValue it - 1}  //Will be rejected
counter {updateValue it}  //Will be rejected
counter {updateValue 11}  //Will be rejected
counter 12  //Will be rejected

counter 20
counter.await()

Both listeners and validators are essentially closures taking two or three arguments. Exceptions thrown from the validators will be logged inside the agent and can be tested using the hasErrors() method or retrieved through the errors property.

Testing for Errors Sample

assert counter.hasErrors()
assert counter.errors.size() == 5

Validator Gotchas

Groovy is not very strict on variable data types and immutability, so agent users should be aware of potential bumps on the road.

If the submitted code modifies the state directly, validators will not be able to un-do the change in case of a validation rule violation. There are two possible solutions available:

• Make sure you never change the supplied object representing current agent state

• Use custom copy strategy on the agent to allow the agent to create copies of the internal state

In both cases you need to call updateValue() to set and validate the new state properly.

The problem as well as both of the solutions follows:

A Validator Sample -

//Create an agent storing names, rejecting 'Joe'
final Closure rejectJoeValidator = {oldValue, newValue -> if ('Joe' in newValue) throw new IllegalArgumentException('Joe is not allowed to enter our list.')}

Agent agent = new Agent([])
agent.addValidator rejectJoeValidator

agent {it << 'Dave'}  //Accepted
agent {it << 'Joe'}   //Erroneously accepted, since by-passes the validation mechanism
println agent.val

//Solution 1 - never alter the supplied state object
agent = new Agent([])
agent.addValidator rejectJoeValidator

agent {updateValue(['Dave', *it])}  //Accepted
agent {updateValue(['Joe', *it])}   //Rejected
println agent.val

//Solution 2 - use custom copy strategy on the agent
agent = new Agent([], {it.clone()})
agent.addValidator rejectJoeValidator

agent {updateValue it << 'Dave'}  //Accepted
agent {updateValue it << 'Joe'}   //Rejected, since 'it' is now just a copy of the internal agent's state
println agent.val

Grouping

By default, all Agent instances belong to the same group sharing its daemon thread pool.

Custom groups can also create instances of Agent. These instances will belong to the group that created them and will share its thread pool. To create an Agent instance belonging to a group, call the agent() factory method on the group. This way you can organize and tune performance of agents.

Create Groups Around a Thread Pool

final def group = new NonDaemonPGroup(5)  //create a group around a thread pool
def clubMembers = group.agent(['Me'])  //add Me

Custom Thread Pools for Agents

The default thread pool for agents contains daemon threads. Make sure that your custom thread pools use daemon threads too, either by using DefaultPGroup or by providing your own thread factory to the thread pool constructor.

Alternatively, if your thread pools use non-daemon threads, such as when using the NonDaemonPGroup group class, make sure you shut down the group or the thread pool explicitly by calling its shutdown() method; otherwise your application will never exit.
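The shutdown advice above can be sketched as follows. This is a minimal sketch, not an excerpt from the GPars demos: the @Grab coordinates assume GPars 1.2.1 is resolved through Grape, and the names counter and finalCount are purely illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.group.NonDaemonPGroup

// A group backed by two non-daemon threads
final group = new NonDaemonPGroup(2)
def finalCount
try {
    def counter = group.agent(0)            // the agent runs on the group's thread pool
    counter << { updateValue(it + 1) }
    counter << { updateValue(it + 1) }
    finalCount = counter.val                // val is queued behind the two updates
} finally {
    group.shutdown()                        // releases the non-daemon threads so the JVM can exit
}
assert finalCount == 2
```

Placing shutdown() in a finally block keeps the pool from outliving the script even if one of the submitted commands fails.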

Direct Pool Replacement

Alternatively, by calling the attachToThreadPool() method on an Agent instance, a custom thread pool can be specified for it.

attachToThreadPool() Example

def clubMembers = new Agent<List<String>>(['Me'])  //add Me

final ExecutorService pool = Executors.newFixedThreadPool(10)
clubMembers.attachToThreadPool(new DefaultPool(pool))

Remember, like actors, a single Agent instance (aka agent) can never use more than one thread at a time.

The Shopping Cart Example

A Sample -


class ShoppingCart {
    private def cartState = new Agent([:])

    //----------------- public methods below here ----------------------------------
    public void addItem(String product, int quantity) {
        cartState << {it[product] = quantity}  //the << operator sends a message to the Agent
    }

    public void removeItem(String product) {
        cartState << {it.remove(product)}
    }

    public Object listContent() {
        return cartState.val
    }

    public void clearItems() {
        cartState << performClear
    }

    public void increaseQuantity(String product, int quantityChange) {
        cartState << this.&changeQuantity.curry(product, quantityChange)
    }

    //----------------- private methods below here ---------------------------------
    private void changeQuantity(String product, int quantityChange, Map items) {
        items[product] = (items[product] ?: 0) + quantityChange
    }

    private Closure performClear = { it.clear() }
}

//----------------- script code below here -------------------------------------
final ShoppingCart cart = new ShoppingCart()
cart.addItem 'Pilsner', 10
cart.addItem 'Budweisser', 5
cart.addItem 'Staropramen', 20

cart.removeItem 'Budweisser'
cart.addItem 'Budweisser', 15

println "Contents ${cart.listContent()}"

cart.increaseQuantity 'Budweisser', 3
println "Contents ${cart.listContent()}"

cart.clearItems()
println "Contents ${cart.listContent()}"

You might have noticed two implementation strategies in the code.

1. Public methods may internally just send the required code off to the Agent, instead of executing the same functionality directly

And so Typically Sequential Code Like This

public void addItem(String product, int quantity) {
    cartState[product] = quantity
}

Becomes

public void addItem(String product, int quantity) {
    cartState << {it[product] = quantity}
}

2. Public methods may send references to internal private methods or closures, which hold the desired functionality to perform the deed.

A Public-to-Private Sample

public void clearItems() {
    cartState << performClear
}

private Closure performClear = { it.clear() }

Currying might be necessary if the closure takes other arguments besides the current internal state instance. See the increaseQuantity method.

The Printer Service Example

Another example - suppose a printer service that is not thread-safe is shared by multiple threads. The printer needs to have the document and quality properties set before printing. Obviously we have a potential for race conditions if the service is not guarded properly. Callers don’t want to block until the printer is available, which the fire-and-forget nature of actors solves very elegantly.

A Sample Printer Service


/**
 * A non-thread-safe service that slowly prints documents one at a time
 */
class PrinterService {
    String document
    String quality

    public void printDocument() {
        println "Printing $document in $quality quality"
        Thread.sleep 5000
        println "Done printing $document"
    }
}

def printer = new Agent<PrinterService>(new PrinterService())

final Thread thread1 = Thread.start {
    for (num in (1..3)) {
        final String text = "document $num"
        printer << {printerService ->
            printerService.document = text
            printerService.quality = 'High'
            printerService.printDocument()
        }
        Thread.sleep 200
    }
    println 'Thread 1 is ready to do something else. All print tasks have been submitted'
}

final Thread thread2 = Thread.start {
    for (num in (1..4)) {
        final String text = "picture $num"
        printer << {printerService ->
            printerService.document = text
            printerService.quality = 'Medium'
            printerService.printDocument()
        }
        Thread.sleep 500
    }
    println 'Thread 2 is ready to do something else. All print tasks have been submitted'
}

[thread1, thread2]*.join()
printer.await()

For the latest updates, see the respective Demos

Reading The Value

To follow the Clojure philosophy closely, the Agent class gives reads higher priority than writes. By using the instantVal property, your read request will bypass the incoming message queue of the Agent and return the current snapshot of the internal state. The val property will wait in the message queue for processing, just like the non-blocking variant valAsync(Closure cl), which will invoke the provided closure with the internal state as a parameter.

Bear in mind that the instantVal property may return correct but seemingly random results, since the internal state of the Agent at the time of instantVal execution is non-deterministic and depends on the messages that have been processed before the thread scheduler executes the body of instantVal.

The await() method lets you wait until all the messages submitted to the Agent so far have been processed, and so may block the calling thread.
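The difference between the read variants can be illustrated with a small counter. This is a hedged sketch under the assumption that GPars 1.2.1 is available through Grape; the names counter, snapshot and seen are illustrative only.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.agent.Agent
import groovyx.gpars.dataflow.DataflowVariable

def counter = new Agent<Integer>(0)
10.times { counter << { updateValue(it + 1) } }

// instantVal bypasses the queue, so any value from 0 to 10 may show up here
def snapshot = counter.instantVal
assert snapshot in (0..10)

// valAsync() queues the read and passes the state to the closure without blocking the caller
def seen = new DataflowVariable<Integer>()
counter.valAsync { seen << it }

// await() blocks until the messages submitted so far have been processed
counter.await()
assert counter.val == 10
assert seen.val == 10
assert counter.instantVal == 10   // deterministic only because the queue has been drained
```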

State Copy Strategy

To avoid leaking the internal state, the Agent constructor accepts a copy strategy as its second argument. With the copy strategy specified, the internal state is processed by the copy strategy closure and the output of the closure is returned to the caller instead of the actual internal state. This applies to instantVal, val as well as to valAsync().
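A minimal sketch of the effect, assuming GPars 1.2.1 is resolved through Grape; the copy closure and the variable names are illustrative choices, not the only way to copy a list.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.agent.Agent

// The second constructor argument copies the state before it is handed out
def members = new Agent<List<String>>(['Me'], { new ArrayList<String>(it) })

def leaked = members.val      // a copy, not the internal list
leaked << 'Intruder'          // only the copy is modified

assert members.val == ['Me']
assert members.instantVal == ['Me']
```

Without the copy strategy, the list returned by val would be the agent's own list, and the stray append would silently change the protected state.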

Error Handling

Exceptions thrown from within the submitted commands are stored inside the agent and can be obtained from the errors property. The property is cleared once read.

A Sample of Error Handling

def clubMembers = new Agent<List>()
assert clubMembers.errors.empty

clubMembers.send {throw new IllegalStateException('test1')}
clubMembers.send {throw new IllegalArgumentException('test2')}
clubMembers.await()

List errors = clubMembers.errors
assert 2 == errors.size()
assert errors[0] instanceof IllegalStateException
assert 'test1' == errors[0].message
assert errors[1] instanceof IllegalArgumentException
assert 'test2' == errors[1].message

assert clubMembers.errors.empty

Fair and Non-fair Agents

Agents can be either fair or non-fair. Fair agents give up the thread after processing each message, whereas non-fair agents keep a thread until their message queue is empty. As a result, non-fair agents tend to perform better than fair ones.

The default setting for all Agent instances is to be non-fair; however, an instance can be made fair by calling its makeFair() method.

A Sample To Make It Fair

def clubMembers = new Agent<List>(['Me'])  //add Me
clubMembers.makeFair()

User Guide to Dataflow

Dataflow concurrency offers an alternative concurrency model, which is inherently safe and robust.

Introduction

Check out this small example written in Groovy using GPars to sum results of calculations performed by three concurrently run tasks:

A Simple Sample

import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

final def x = new DataflowVariable()
final def y = new DataflowVariable()
final def z = new DataflowVariable()

task {
    z << x.val + y.val
}

task {
    x << 10
}

task {
    y << 5
}

println "Result: ${z.val}"

The same algorithm rewritten using the Dataflows class looks like this:

A Dataflows Sample

final def df = new Dataflows()

task {
    df.z = df.x + df.y
}

task {
    df.x = 10
}

task {
    df.y = 5
}

println "Result: ${df.z}"

We start three logical tasks, which can run in parallel and perform their particular activities. The tasks need to exchange data and they do so using Dataflow Variables. Think of Dataflow Variables as one-shot channels safely and reliably transferring data from producers to their consumers.

The Dataflow Variables have pretty straightforward semantics. When a task needs to read a value from a DataflowVariable (through the val property), it will block until the value has been set by another task or thread (using the '<<' operator). Each Dataflow Variable can be set only once in its lifetime.

Notice that you don’t have to bother with ordering and synchronizing the tasks or threads and their access to shared variables. The values are transferred among tasks and threads at the right time, without your intervention or care.

Implementation Detail

The three tasks in the example do not necessarily need to be mapped to three physical threads. Tasks represent so-called "green" or "logical" threads and can be mapped under the covers to any number of physical threads. The actual mapping depends on the scheduler, but the outcome of dataflow algorithms doesn’t depend on the actual scheduling.

Re-binding Is Possible

The bind operation of dataflow variables silently accepts re-binding to a value which is equal to an already bound value. We can call the bindUnique method to reject equal values on already-bound variables.
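The re-binding rules can be sketched as follows. The sketch assumes GPars 1.2.1 resolved through Grape and assumes that rejected binds surface as IllegalStateException; the names answer and rejected are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.dataflow.DataflowVariable

def answer = new DataflowVariable<Integer>()
answer << 42
answer << 42                       // equal value: silently accepted

def rejected = []
try {
    answer << 43                   // different value: rejected
} catch (IllegalStateException e) {
    rejected << 'different value'
}
try {
    answer.bindUnique(42)          // bindUnique rejects even an equal value
} catch (IllegalStateException e) {
    rejected << 'bindUnique'
}

assert answer.val == 42
assert rejected == ['different value', 'bindUnique']
```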

Benefits

Here’s what you gain by using Dataflow Concurrency (by Jonas Bonér, http://www.jonasboner.com):

• No race-conditions

• No live-locks

• Deterministic deadlocks

• Completely deterministic programs

• BEAUTIFUL code.

This doesn’t sound bad, does it?

Concepts

Dataflow Programming

Quoting Wikipedia

Operations (in Dataflow programs) consist of "black boxes" with inputs and outputs, all of which are always explicitly defined. They run as soon as all of their inputs become valid, as opposed to when the program encounters them. Whereas a traditional program essentially consists of a series of statements saying "do this, now do this", a dataflow program is more like a series of workers on an assembly line, who will do their assigned task as soon as the materials arrive.

This is why dataflow languages are inherently parallel: the operations have no hidden state to keep track of, and the operations are all "ready" at the same time.

Principles

With Dataflow Concurrency, you can safely share variables across tasks. These variables (in Groovy, instances of the DataflowVariable class) can only be assigned a value (using the '<<' operator) once in their lifetime. The values of the variables, on the other hand, can be read multiple times (in Groovy through the val property), even before the value has been assigned. In such cases, the reading task is suspended until the value is set by another task. So you can simply write your code for each task sequentially using Dataflow Variables and the underlying mechanics will make sure you get all the values you need in a thread-safe manner.

In brief, you generally perform three operations with Dataflow variables:

• Create a dataflow variable

• Wait for the variable to be bound (read it)

• Bind the variable (write to it)

And these are the three essential rules your programs have to follow:

• When the program encounters an unbound variable it waits for a value.

• It’s not possible to change the value of a dataflow variable once it’s bound.

• Dataflow variables make it easy to create concurrent stream agents.

Dataflow Queues and Broadcasts

Before you check our samples of Dataflow Variables, Tasks and Operators, you should learn a bit about streams and queues to have a full picture of Dataflow Concurrency. Besides dataflow variables, there are also the concepts of DataflowQueue and DataflowBroadcast that you can leverage in your code.

You may think of them as thread-safe buffers or queues for message transfer among concurrent tasks or threads. Check out a typical producer-consumer demo:

A Producer-Consumer Demo


def words = ['Groovy', 'fantastic', 'concurrency', 'fun', 'enjoy', 'safe', 'GPars', 'data', 'flow']
final def buffer = new DataflowQueue()

task {
    for (word in words) {
        buffer << word.toUpperCase()  //add to the buffer
    }
}

task {
    while (true) println buffer.val  //read from the buffer in a loop
}

Both DataflowBroadcasts and DataflowQueues, just like DataflowVariables, implement the DataflowChannel interface with common methods allowing us to write to them and read values from them.

The ability to treat both types identically through the DataflowChannel interface comes in handy once you start using them to wire tasks, operators or selectors together.

DataflowChannels Combine Two Interfaces

The DataflowChannel interface combines two interfaces, each serving its purpose:

• DataflowReadChannel holds all the methods necessary for reading values from a channel - getVal(), getValAsync(), whenBound(), etc.

• DataflowWriteChannel holds all the methods necessary for writing values into a channel - bind(), '<<'

You may prefer using these dedicated interfaces instead of the general DataflowChannel interface, to better express your intended usage.

Please refer to the API doc for more details about the channel interfaces.

Point-to-point Communication

The DataflowQueue class can be viewed as a point-to-point (1 to 1, many to 1) communication channel. It allows one or more producers to send messages to one reader. If multiple readers read from the same DataflowQueue, they will each consume different messages.

Or to put it a different way, each message is consumed by exactly one reader. You can easily imagine a simple load-balancing scheme built around a shared DataflowQueue with readers being added dynamically when the consumer part of your algorithm needs to scale up. This is also a useful default choice when connecting tasks or operators.
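The "exactly one reader per message" behaviour can be sketched with two competing readers. This is an illustrative sketch assuming GPars 1.2.1 resolved through Grape; which reader gets which message is not deterministic, so only the combined result is checked.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.task

def queue = new DataflowQueue<Integer>()

// Two readers compete for messages; each one takes three of them
def reader1 = task { (1..3).collect { queue.val } }
def reader2 = task { (1..3).collect { queue.val } }

for (i in 1..6) {
    queue << i
}

// Every message was delivered to exactly one of the readers
def consumed = (reader1.val + reader2.val).sort()
assert consumed == [1, 2, 3, 4, 5, 6]
```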

Publish-subscribe Communication

The DataflowBroadcast class offers a publish-subscribe (1 to many, many to many) communication model. One or more producers write messages, while all registered readers will receive all the messages. Each message is thus consumed by all readers with a valid subscription at the point when the message is written to the channel. The readers subscribe by calling the createReadChannel() method.

A Pub-Sub Sample

DataflowWriteChannel broadcastStream = new DataflowBroadcast()
DataflowReadChannel stream1 = broadcastStream.createReadChannel()
DataflowReadChannel stream2 = broadcastStream.createReadChannel()

broadcastStream << 'Message1'
broadcastStream << 'Message2'
broadcastStream << 'Message3'

assert stream1.val == stream2.val
assert stream1.val == stream2.val
assert stream1.val == stream2.val

Under the covers, DataflowBroadcast uses the DataflowStream class to implement the message delivery.

DataflowStream

The DataflowStream class represents a deterministic dataflow channel. It’s built around the concept of a functional queue and so provides a lock-free thread-safe implementation for message passing.

Essentially, you may think of DataflowStream as a 1-to-many communication channel, since when a reader consumes a message, other readers will still be able to read the same message. Also, all messages arrive at all readers in the same order.

Since the DataflowStream is implemented as a functional queue, its API requires users to traverse the values in the stream themselves. On the other hand, DataflowStream offers handy methods for value filtering or transformation together with interesting performance characteristics.
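Traversing a stream by hand might look like the sketch below. It assumes GPars 1.2.1 resolved through Grape, and it relies on the first and rest properties and on '<<' returning the rest of the stream, as we understand the DataflowStream API; please check the API doc before building on it. The names seen and cursor are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.dataflow.stream.DataflowStream

def stream = new DataflowStream<Integer>()
stream << 1 << 2 << 3      // each << binds one cell and continues with the rest of the stream

// The reader walks the stream cell by cell; reading does not remove anything
def seen = []
def cursor = stream
3.times {
    seen << cursor.first   // blocks until this cell has been bound
    cursor = cursor.rest
}

assert seen == [1, 2, 3]
assert stream.first == 1   // the head stays readable for other readers
```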

Semantics for the DataflowStream Differ From The DataflowChannel Interface

The DataflowStream class, unlike the other communication elements, does not implement the DataflowChannel interface, since the semantics of its use are different. Use the DataflowStreamReadAdapter and DataflowStreamWriteAdapter classes to wrap instances of the DataflowStream class in DataflowReadChannel or DataflowWriteChannel implementations.

A Sample of DataflowStream Usage

import groovyx.gpars.dataflow.stream.DataflowStream
import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.scheduler.ResizeablePool

/**
 * Demonstrates concurrent implementation of the Sieve of Eratosthenes using dataflow tasks
 *
 * In principle, the algorithm consists of a concurrently run chained filters,
 * each of which detects whether the current number can be divided by a single prime number.
 * (generate nums 1, 2, 3, 4, 5, ...) -> (filter by mod 2) -> (filter by mod 3) -> (filter by mod 5) -> (filter by mod 7) -> (filter by mod 11) -> (caution! Primes falling out here)
 * The chain is built (grows) on the fly, whenever a new prime is found
 */

/**
 * We need a resizeable thread pool, since tasks consume threads while waiting, blocked for values from the DataflowQueue.val
 */
group = new DefaultPGroup(new ResizeablePool(true))

final int requestedPrimeNumberCount = 100

/**
 * Generating candidate numbers
 */
final DataflowStream candidates = new DataflowStream()
group.task {
    candidates.generate(2, {it + 1}, {it < 1000})
}

/**
 * Chain a new filter for a particular prime number to the end of the Sieve
 * @param inChannel The current end channel to consume
 * @param prime The prime number to divide future prime candidates with
 * @return A new channel ending the whole chain
 */
def filter(DataflowStream inChannel, int prime) {
    inChannel.filter { number ->
        group.task {
            number % prime != 0
        }
    }
}

/**
 * Consume Sieve output and add additional filters for all found primes
 */
def currentOutput = candidates
requestedPrimeNumberCount.times {
    int prime = currentOutput.first
    println "Found: $prime"
    currentOutput = filter(currentOutput, prime)
}

For convenience and for the ability to use DataflowStream objects with other dataflow constructs, such as operators, you can wrap them with DataflowStreamReadAdapter for read access or DataflowStreamWriteAdapter for write access.

The DataflowStream class is designed for single-threaded producers and consumers. If multiple threads are supposed to read or write values to the stream, their access to the stream must be serialized externally or adapters should be used.

DataflowStream Adapters

The DataflowStream API as well as the semantics of its use are very different from those defined by Dataflow(Read/Write)Channel. Adapters have to be used in order to allow DataflowStreams to work with other dataflow elements. The DataflowStreamReadAdapter class will wrap a DataflowStream with the necessary methods to read values, while the DataflowStreamWriteAdapter class provides write methods around the wrapped DataflowStream.

Thread Safety

It’s important to mention that the DataflowStreamWriteAdapter is thread safe. It allows multiple threads to add values to the wrapped DataflowStream through the adapter. On the other hand, the DataflowStreamReadAdapter is designed to be used by a single thread.

The DataflowStreamWriteAdapter is thread safe

To minimize overhead and stay in line with DataflowStream semantics, the DataflowStreamReadAdapter class is not thread-safe and should only be used from within a single thread.

If multiple threads need to read from a DataflowStream, each of them should create its own DataflowStreamReadAdapter wrapper.

Thanks to the adapters, DataflowStream can be used for communication between operators or selectors, as these expect Dataflow(Read/Write)Channels.

DataflowStreamAdapters Sample

import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.stream.DataflowStream
import groovyx.gpars.dataflow.stream.DataflowStreamReadAdapter
import groovyx.gpars.dataflow.stream.DataflowStreamWriteAdapter
import static groovyx.gpars.dataflow.Dataflow.selector
import static groovyx.gpars.dataflow.Dataflow.operator

/**
 * Demonstrates the use of DataflowStreamAdapters to allow dataflow operators to use DataflowStreams
 */

final DataflowStream a = new DataflowStream()
final DataflowStream b = new DataflowStream()
def aw = new DataflowStreamWriteAdapter(a)
def bw = new DataflowStreamWriteAdapter(b)
def ar = new DataflowStreamReadAdapter(a)
def br = new DataflowStreamReadAdapter(b)

def result = new DataflowQueue()

def op1 = operator(ar, bw) {
    bindOutput it
}
def op2 = selector([br], [result]) {
    result << it
}

aw << 1
aw << 2
aw << 3
assert([1, 2, 3] == [result.val, result.val, result.val])
op1.stop()
op2.stop()
op1.join()
op2.join()

Also the ability to select a value from multiple DataflowChannels can only be used through an adapter around a DataflowStream.

A DataflowStream Sample

import groovyx.gpars.dataflow.Select
import groovyx.gpars.dataflow.stream.DataflowStream
import groovyx.gpars.dataflow.stream.DataflowStreamReadAdapter
import groovyx.gpars.dataflow.stream.DataflowStreamWriteAdapter
import static groovyx.gpars.dataflow.Dataflow.select
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Demonstrates the use of DataflowStreamAdapters to allow dataflow select to select on DataflowStreams
 */

final DataflowStream a = new DataflowStream()
final DataflowStream b = new DataflowStream()

def aw = new DataflowStreamWriteAdapter(a)
def bw = new DataflowStreamWriteAdapter(b)
def ar = new DataflowStreamReadAdapter(a)
def br = new DataflowStreamReadAdapter(b)

final Select<?> select = select(ar, br)
task {
    aw << 1
    aw << 2
    aw << 3
}

assert 1 == select().value
assert 2 == select().value
assert 3 == select().value

task {
    bw << 4
    aw << 5
    bw << 6
}

def result = (1..3).collect{select()}.sort{it.value}

assert result*.value == [4, 5, 6]
assert result*.index == [1, 0, 1]

If you don’t need any of the functional-queue functionality specific to DataflowStream, like generation, filtering or mapping, you might consider using the DataflowBroadcast class instead.

This class offers the publish-subscribe communication model through the DataflowChannel interface.

Bind Handlers

What A Bind

def a = new DataflowVariable()

a >> {println "The variable has just been bound to $it"}

a.whenBound {println "Just to confirm that the variable has been really set to $it"}
...

Bind handlers can be registered on all dataflow channels (variables, queues or broadcasts) using the '>>' operator, the then() method or the whenBound() method. They will be run only after a value is bound to the variable.

Dataflow queues and broadcasts also support a wheneverBound method to register a closure or a message handler to run each time a value is bound to them.

A DataflowQueue().wheneverBound Sample

def queue = new DataflowQueue()
queue.wheneverBound {println "A value $it arrived to the queue"}

Obviously, nothing prevents you from having more than a single handler for a single promise. They will all trigger in parallel once the promise has a concrete value:

A whenBound Sample

Promise bookingPromise = task {
    final data = collectData()
    return broker.makeBooking(data)
}

bookingPromise.whenBound {booking -> printAgenda booking}
bookingPromise.whenBound {booking -> sendMeAnEmailTo booking}
bookingPromise.whenBound {booking -> updateTheCalendar booking}

Parallel Speculations Anyone ?

Dataflow variables and broadcasts are one of several possible ways to implement Parallel Speculations. For details, please check out Parallel Speculations in the Parallel Collections section of the User Guide.

Bind Handlers Grouping

When you need to wait for multiple DataflowVariables or Promises to be bound, you can benefit from calling the whenAllBound() function. It’s available on the Dataflow class as well as on PGroup instances.

whenAllBound() Sample

final group = new NonDaemonPGroup()

//Calling asynchronous services and receiving back promises for the reservations
Promise flightReservation = flightBookingService('PAR <-> BRU')
Promise hotelReservation = hotelBookingService('BRU:Feb 24 2015 - Feb 29 2015')
Promise taxiReservation = taxiBookingService('BRU:Feb 24 2015 10:31')

//when all reservations have been made, we need to build an agenda for our trip
Promise agenda = group.whenAllBound(flightReservation, hotelReservation, taxiReservation) {flight, hotel, taxi ->
    "Agenda: $flight | $hotel | $taxi"
}

//since this is a demo, we only print the agenda and block when it's ready
println agenda.val

If you don’t know the number of parameters the whenAllBound() handler needs, then use a closure with one argument of type List:

whenAllBound() Sample

Promise module1 = task {
    compile(module1Sources)
}
Promise module2 = task {
    compile(module2Sources)
}

//We don't know the number of modules that will be jarred together, so use a List
final jarCompiledModules = {List modules -> ...}

whenAllBound([module1, module2], jarCompiledModules)
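A self-contained variant of the List-based form, with trivial tasks standing in for real work. It is a sketch assuming GPars 1.2.1 resolved through Grape; the names parts and total are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.dataflow.Promise
import static groovyx.gpars.dataflow.Dataflow.task
import static groovyx.gpars.dataflow.Dataflow.whenAllBound

// Three independent calculations, each represented by a Promise
List<Promise> parts = [task { 2 }, task { 3 }, task { 5 }]

// A handler declaring a single List parameter receives all results at once
Promise total = whenAllBound(parts) { List values -> values.sum() }

assert total.val == 10
```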

Bind Handler Chaining

All dataflow channels also support the then() method to register a callback handler to invoke when a value becomes available. Unlike whenBound(), the then() method allows us to use chaining, giving us the option to transfer resulting values between functions asynchronously.

Groovy allows us to leave out some of the dots in the then() method chains.

A Pointless Sample - No Need To Join The Dots !

final DataflowVariable variable = new DataflowVariable()
final DataflowVariable result = new DataflowVariable()

variable.then {it * 2} then {it + 1} then {result << it}
variable << 4
assert 9 == result.val

This could be nicely combined with Asynchronous functions


final doubler = {it * 2}
final adder = {it + 1}

variable.then doubler then adder then {result << it}

Thread.start {variable << 4}

assert 9 == result.val

or ActiveObjects

@ActiveObject
class ActiveDemoCalculator {
    @ActiveMethod
    def doubler(int value) {
        value * 2
    }

    @ActiveMethod
    def adder(int value) {
        value + 1
    }
}

final DataflowVariable result = new DataflowVariable()
final calculator = new ActiveDemoCalculator();

calculator.doubler(4).then {calculator.adder it}.then {result << it}


Motivation for Chaining Promises

Chaining can save quite some code when calling other asynchronous services from within whenBound() handlers.

Asynchronous services, such as Asynchronous Functions or Active Methods, return Promises for their results. To obtain the actual results, your handlers would have to block to wait for the value to be bound. This locks the current thread in an unproductive state.

An Unproductive Sample

variable.whenBound {value ->
    Promise promise = asyncFunction(value)
    println promise.get()
}

or, alternatively, it could register another (nested) whenBound() handler, which would result in unnecessarily complex code.

An Unnecessarily Complex Nested Sample

variable.whenBound {value ->
    asyncFunction(value).whenBound {
        println it
    }
}

For an illustration, compare the following two code snippets. One uses whenBound() and the other uses then() chaining. They’re both equivalent in terms of functionality and behavior.

A whenBound() Sample Plus a then() Example

final DataflowVariable variable = new DataflowVariable()

final doubler = {it * 2}
final inc = {it + 1}

//Using whenBound()
variable.whenBound {value ->
    task {
        doubler(value)
    }.whenBound {doubledValue ->
        task {
            inc(doubledValue)
        }.whenBound {incrementedValue ->
            println incrementedValue
        }
    }
}

//Using then() chaining
variable.then doubler then inc then this.&println


Chaining Promises solves both of these issues elegantly:

variable >> asyncFunction >> {println it}

The RightShift '>>' operator has been overloaded to call the then() method and, therefore, can be chained the same way:

A Chaining Sample

final doubler = {it * 2}
final adder = {it + 1}

variable >> doubler >> adder >> {result << it}



Error Handling for Promise Chaining

Asynchronous operations may obviously throw exceptions. It’s important to be able to handle them easily and with little effort. GPars Promise objects can implicitly propagate exceptions from asynchronous calculations across promise chains.

• Promises propagate result values as well as exceptions. The blocking get() method re-throws any exception that was bound to the Promise so the caller can handle it.

• For asynchronous notifications, the whenBound() handler closure gets the exception passed in as an argument.

• The then() method accepts two arguments - a value handler and an optional error handler. These will be invoked depending on whether the result is a regular value or an exception. If no errorHandler is specified, the exception is re-thrown to the Promise returned by then().

• Exactly the same behavior as for then() holds true for the whenAllBound() method, which listens on multiple Promises to get bound.
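The first and third rules above can be sketched in a few lines. This is a hedged illustration assuming GPars 1.2.1 resolved through Grape; the names input, unhandled, recovered and rethrown are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')   // assumption: GPars is fetched through Grape
import groovyx.gpars.dataflow.DataflowVariable

def input = new DataflowVariable<Integer>()

// No error handler on this step: an exception is passed on to the returned Promise
def unhandled = input.then { 100.intdiv(it) }

// The second argument to then() receives the exception and supplies a replacement value
def recovered = unhandled.then({ "ok: $it" }, { Throwable e -> "failed: ${e.class.simpleName}" })

input << 0   // triggers the chain; dividing by zero throws ArithmeticException

def rethrown = null
try {
    unhandled.get()                 // get() re-throws the exception bound to the Promise
} catch (ArithmeticException e) {
    rethrown = e
}

assert rethrown != null
assert recovered.get().toString() == 'failed: ArithmeticException'
```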

A Sample of Error Handling

final int num = 0

Promise<Integer> initial = new DataflowVariable<Integer>()
Promise<String> result = initial.then {it * 2}
        .then {100 / it}  // Will throw exception for 0
        .then {println "Log the value $it as it passes by"; return it}  // No error handler is defined, so exceptions are ignored
                                                                         // and silently re-thrown to the next handler in chain
        .then({"The result for $num is $it"}, {"Error detected for $num: $it"})  // Here the exception is caught

initial << num

println result.get()

ErrorHandler is a closure that accepts instances of Throwable as its only (optional) argument. It returns a value that should be bound to the result of the then() method call, i.e. the returned Promise. If an exception is thrown from within an error handler, it’s bound to the resulting Promise as an error.

Re-throwing Potential Exceptions

promise.then({it+1})                  // Implicitly re-throws potential exceptions bound to promise
promise.then({it+1}, {e -> throw e})  // Explicitly re-throws potential exceptions bound to promise

// Explicitly re-throws a new exception wrapping a potential exception bound to a Promise
promise.then({it+1}, {e -> throw new RuntimeException('Error occurred', e)})

Where Do You Want This Exception ?

Exception handling in Java relies on try-catch statements. GPars Promise objects give asynchronous invocations the freedom to handle exceptions wherever it’s most convenient. You can freely ignore exceptions in parts of your code and just assume things work, yet exceptions are never accidentally swallowed.

An Exceptional Sample

task {
    'gpars.org'.toURL().text  //should throw MalformedURLException
}
.then {page -> page.toUpperCase()}
.then {page -> page.contains('GROOVY')}
.then({mentionsGroovy -> println "Groovy found: $mentionsGroovy"}, {error -> println "Error: $error"}).join()

Handling Concrete Exception Types

You may also be more specific about the handled exception types like this:

A Specific Exception Handling Example

url.then(download)
    .then(calculateHash, {MalformedURLException e -> return 0})  // <- specific !
    .then(formatResult)
    .then(printResult, printError)
    .then(sendNotificationEmail);

Customer-site Exception Handling

You may wish to leave an exception completely un-handled, then let clients (consumers) handle it:

A Delayed Exception Handling Example

Promise<Object> result = url.then(download).then(calculateHash).then(formatResult).then(printResult);
try {
    result.get()
} catch (Exception e) {
    //handle exceptions here
}

Putting It All Together

By combining whenAllBound() and then() (or '>>') methods, we can easily manage large asynchronous scenarios in a convenient way:

A Large Asynchronous Sample

withPool {

    Closure download = {String url ->
        sleep 3000  //Simulate a web read
        'web content'
    }.asyncFun()

    Closure loadFile = {String fileName ->
        'file content'  //simulate a local file read
    }.asyncFun()

    Closure hash = {s -> s.hashCode()}

    Closure compare = {int first, int second ->
        first == second
    }

    Closure errorHandler = {println "Error detected: $it"}

    def all = whenAllBound([
            download('http://www.gpars.org') >> hash,
            loadFile('/coolStuff/gpars/website/index.html') >> hash
    ], compare).then({println it}, errorHandler)
    all.join()  //optionally block until the calculation is all done
}

Notice that only the initial action (function) needs to be asynchronous. The functions further down the pipeline will be invoked asynchronously by your Promise, even if they are synchronous.


Implementing the Fork/join Pattern With Promises

Promises are very flexible and can be used as an implementation vehicle for many different scenarios. Here’s one handy additional capability of a Promise.

The thenForkAndJoin() method triggers one or several activities once the current Promise becomes bound. It returns a new Promise, which is bound only after all the activities have finished.

Let’s see how this fits into the picture:

• then() - permits activity chaining, so that one activity is performed after another

• whenAllBound() - allows joining multiple activities; a new activity is started only after they all finish

• task() - allows us to create (fork) multiple asynchronous activities

• thenForkAndJoin() - a short-hand syntax for forking several activities and joining on them

So with thenForkAndJoin() you simply create multiple activities that should be triggered by a shared (triggering) Promise.

A Sample of Multiple Activities

promise.thenForkAndJoin(task1, task2, task3).then{...}

Once all the activities return a result, the results are collected into a list and bound into the Promise returned by thenForkAndJoin().

A thenForkAndJoin() Sample

task {
    2
}.thenForkAndJoin({ it ** 2 }, { it**3 }, { it**4 }, { it**5 }).then({ println it}).join()
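To inspect the joined list instead of printing it, the result can be read with get(). This is a small sketch under the assumption that results arrive in the same order as the closures were passed; the variable name powers is made up, and the @Grab line is only there for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import static groovyx.gpars.dataflow.Dataflow.task

// Fork three activities from the value 2 and wait for the list of their results
def powers = task { 2 }.thenForkAndJoin({ it ** 2 }, { it ** 3 }, { it ** 4 }).get()

println "Joined results: $powers"
```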

Lazy Dataflow Tasks and Variables

Sometimes you may need to combine the qualities of Dataflow Variables with lazy initialization.

A Lazy Sample

Closure<String> download = {url ->
    println "Downloading"
    url.toURL().text
}

def pageContent = new LazyDataflowVariable(download.curry("http://gpars.org"))


Instances of LazyDataflowVariable have an initializer declared at construction time. An instance is only triggered when someone asks for its value, either through the blocking get() method or using any of the non-blocking callback methods, such as then(). Since LazyDataflowVariables preserve all the goodness of ordinary DataflowVariables, you can chain them together easily with other lazy or ordinary Dataflow Variables.
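The laziness can be made visible with a counter. The sketch below is illustrative only: the counter initializations and the string 'module loaded' are invented for the example, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.LazyDataflowVariable
import java.util.concurrent.atomic.AtomicInteger

def initializations = new AtomicInteger()

// The initializer is stored, not run, at construction time
def lazyValue = new LazyDataflowVariable({ ->
    initializations.incrementAndGet()
    'module loaded'
})

def countBeforeFirstRead = initializations.get()   // nothing has asked for the value yet

def first = lazyValue.get()     // first request triggers the initializer
def second = lazyValue.get()    // later requests reuse the bound value

println "Initializer runs before first read: $countBeforeFirstRead, after two reads: ${initializations.get()}"
```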

A Bigger Example

This discussion deserves a more practical example. So, taking inspiration from this long post (http://blog.jcoglan.com/2013/03/30/callbacks-are-imperative-promises-are-functional-nodes-biggest-missed-opportunity/), the following piece of code demonstrates how to use LazyDataflowVariables to lazily and asynchronously load mutually dependent components into memory. The component modules will be loaded in the order of their dependencies and concurrently, if possible.

Each module will only be loaded once, irrespective of the number of modules that depend on it. Thanks to laziness, only the modules that are transitively needed will be loaded. Our example uses a simple "diamond" dependency scheme:

• D depends on B and C

• C depends on A

• B depends on A

When loading D, A will get loaded first. B and C will be loaded concurrently once A has been loaded. D will start loading once both B and C have been loaded.

A Diamond Sample

def moduleA = new LazyDataflowVariable({->
    println "Loading moduleA into memory"
    sleep 3000
    println "Loaded moduleA into memory"
    return "moduleA"
})

def moduleB = new LazyDataflowVariable({->
    moduleA.then {
        println "->Loading moduleB into memory, since moduleA is ready"
        sleep 3000
        println "  Loaded moduleB into memory"
        return "moduleB"
    }
})

def moduleC = new LazyDataflowVariable({->
    moduleA.then {
        println "->Loading moduleC into memory, since moduleA is ready"
        sleep 3000
        println "  Loaded moduleC into memory"
        return "moduleC"
    }
})

def moduleD = new LazyDataflowVariable({->
    whenAllBound(moduleB, moduleC) { b, c ->
        println "-->Loading moduleD into memory, since moduleB and moduleC are ready"
        sleep 3000
        println "   Loaded moduleD into memory"
        return "moduleD"
    }
})

println "Nothing loaded so far"
println "==================================================================="
println "Load module: " + moduleD.get()
println "==================================================================="
println "All requested modules loaded"

Making Tasks Lazy

The lazyTask() method is available alongside the task() method to give us a task-oriented abstraction for delayed activities. A lazy task returns an instance of LazyDataflowVariable (like a Promise) with the initializer set to the provided closure. As soon as someone asks for the value, the task will start asynchronously and eventually deliver a value into the LazyDataflowVariable.


A Lazy Sample

import groovyx.gpars.dataflow.Dataflow

def pageContent = Dataflow.lazyTask {
    println "Downloading"
    "http://gpars.org".toURL().text
}

println "No-one has asked for the value just yet. Bound = ${pageContent.bound}"
sleep 1000
println "Now going to ask for a value"
println pageContent.get().size()
println "Repetitive requests will receive the already calculated value. No additional downloading."
println pageContent.get().size()

Dataflow Expressions

Look at the magic below:

A Dataflow Sample

def initialDistance = new DataflowVariable()
def acceleration = new DataflowVariable()
def time = new DataflowVariable()

task {
    initialDistance << 100
    acceleration << 2
    time << 10
}

def result = initialDistance + acceleration*0.5*time**2
println 'Total distance ' + result.val

We use DataflowVariables that represent several parameters to a mathematical equation calculating the total distance of an accelerating object. In the equation itself, however, we use the DataflowVariables directly. We do not refer to the values they represent and yet we are able to do the math correctly. This shows that DataflowVariables can be very flexible.

For example, you can call methods on them and these methods are dispatched to the bound values:


A DataflowVariable Sample

def name = new DataflowVariable()
task {
    name << '  adam   '
}
println name.toUpperCase().trim().val

You can pass other DataflowVariables as arguments to such methods and the real values will be passed automatically instead:

Another DataflowVariable as An Argument Sample

def title = new DataflowVariable()
def searchPhrase = new DataflowVariable()
task {
    title << ' Groovy in Action 2nd edition   '
}

task {
    searchPhrase << '2nd'
}

println title.trim().contains(searchPhrase).val

You can also query properties of the bound value directly through the DataflowVariable:

A DataflowVariable Sample To Query Book Title Properties

def book = new DataflowVariable()
def searchPhrase = new DataflowVariable()
task {
    book << [
            title: 'Groovy in Action 2nd edition   ',
            author: 'Dierk Koenig',
            publisher: 'Manning']
}

task {
    searchPhrase << '2nd'
}

book.title.trim().contains(searchPhrase).whenBound {println it}  //Asynchronous waiting

println book.title.trim().contains(searchPhrase).val  //Synchronous waiting

Please note that the result is still a DataflowVariable (a DataflowExpression, to be precise), from which you can get the real value both synchronously and asynchronously.


Bind Error Notification

DataflowVariables offer the ability to send notifications to registered listeners whenever a bind operation fails. The getBindErrorManager() method allows listeners to be added and removed. The listeners are notified in case of a failed attempt to bind a value (through bind(), bindSafely(), bindUnique() or leftShift()) or an error (through bindError()).

Reporting a Failed Bind Operation

final DataflowVariable variable = new DataflowVariable()

variable.getBindErrorManager().addBindErrorListener(new BindErrorListener() {
    @Override
    void onBindError(final Object oldValue, final Object failedValue, final boolean uniqueBind) {
        println "Bind failed!"
    }

    @Override
    void onBindError(final Object oldValue, final Throwable failedError) {
        println "Binding an error failed!"
    }

    @Override
    public void onBindError(final Throwable oldError, final Object failedValue, final boolean uniqueBind) {
        println "Bind failed!"
    }

    @Override
    public void onBindError(final Throwable oldError, final Throwable failedError) {
        println "Binding an error failed!"
    }

})

This lets us customize responses to any attempt to bind an already bound Dataflow Variable. For example, when using bindSafely(), bind exceptions are not thrown back to the caller. Instead, a registered BindErrorListener is notified.

Further Reading

• Scala Dataflow library by Jonas Bonér: http://github.com/jboner/scala-dataflow/tree/f9a38992f5abed4df0b12f6a5293f703aa04dc33/src

• JVM concurrency presentation slides by Jonas Bonér: http://jonasboner.com/talks/state_youre_doing_it_wrong/html/all.html

• Dataflow Concurrency library for Ruby: http://github.com/larrytheliquid/dataflow/tree/master

Tasks

Dataflow Tasks give us an easy-to-grasp abstraction of mutually-independent logical tasks or threads. These can run concurrently and exchange data solely through Dataflow Variables, Queues, Broadcasts and Streams. A Dataflow Task, with its easy-to-express mutual dependencies and inherently sequential body, could also be used as a practical implementation of UML Activity Diagrams.

Check out the examples.

A Simple Mashup Example

In this example, we’re downloading the front pages of three popular web sites, each in its own task, while in a separate task we’re filtering out sites talking about Groovy today and forming the output. The output task synchronizes automatically with the three download tasks on the three Dataflow Variables through which the content of each website is passed to the output task.


What A Wonderful Mashup!

import static groovyx.gpars.GParsPool.withPool
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * A simple mashup sample, downloads content of three websites
 * and checks how many of them refer to Groovy.
 */

def dzone = new DataflowVariable()
def jroller = new DataflowVariable()
def theserverside = new DataflowVariable()

task {
    println 'Started downloading from DZone'
    dzone << 'http://www.dzone.com'.toURL().text
    println 'Done downloading from DZone'
}

task {
    println 'Started downloading from JRoller'
    jroller << 'http://www.jroller.com'.toURL().text
    println 'Done downloading from JRoller'
}

task {
    println 'Started downloading from TheServerSide'
    theserverside << 'http://www.theserverside.com'.toURL().text
    println 'Done downloading from TheServerSide'
}

task {
    withPool {
        println "Number of Groovy sites today: " +
                ([dzone, jroller, theserverside].findAllParallel {
                    it.val.toUpperCase().contains 'GROOVY'
                }).size()
    }
}.join()

Grouping Tasks

Dataflow tasks can be organized into groups for performance fine-tuning. Groups provide a handy task() factory method to create tasks attached to these groups. Using groups allows us to organize tasks or operators around different thread pools (wrapped inside the group). While the Dataflow.task() command schedules the task on a default thread pool (java.util.concurrent.Executor, fixed size=#cpu+1, daemon threads), we might prefer defining our own thread pool(s) to run these tasks.

A Personal Thread Pool Sample


def group = new DefaultPGroup()

group.with {
    task {
        ...
    }

    task {
        ...
    }
}

Custom Thread Pools for Dataflow

The default thread pool for dataflow tasks has daemon threads. This means our application will exit as soon as the main thread finishes and won’t wait for all tasks to complete!

When grouping tasks, make sure the custom thread pools either:

1. use daemon threads as well (achieved by using DefaultPGroup or by providing your own thread factory to a thread pool constructor),

2. or, in case the thread pools use non-daemon threads (as with the NonDaemonPGroup group class), are shut down explicitly by calling the shutdown() method of the group or the thread pool. Otherwise our application will not exit.
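The non-daemon case can be sketched as follows. This is an illustration, not the guide's own sample: the pool size of 2, the trivial task body and the variable names are arbitrary, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.group.NonDaemonPGroup

// Non-daemon threads keep the JVM alive until the group is shut down
def nonDaemonGroup = new NonDaemonPGroup(2)
def answer
try {
    def promise = nonDaemonGroup.task { 21 * 2 }
    answer = promise.get()              // wait for the task to finish
    println "Task returned $answer"
} finally {
    nonDaemonGroup.shutdown()           // without this call the application would not exit
}
```

Putting the shutdown() call into a finally block releases the pool even when a task fails.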

We can selectively override the default group used for tasks, operators, callbacks and other dataflow elements inside a code block using the Dataflow.usingGroup() method:

A Sample

Dataflow.usingGroup(group) {
    task {
        'http://gpars.codehaus.org'.toURL().text  //should throw MalformedURLException
    }
    .then {page -> page.toUpperCase()}
    .then {page -> page.contains('GROOVY')}
    .then({mentionsGroovy -> println "Groovy found: $mentionsGroovy"}, {error -> println "Error: $error"}).join()
}


You can always override the default group by being specific:

A Sample

Dataflow.usingGroup(group) {
    anotherGroup.task {
        'http://gpars.codehaus.org'.toURL().text  //should throw MalformedURLException
    }
    .then(anotherGroup) {page -> page.toUpperCase()}
    .then(anotherGroup) {page -> page.contains('GROOVY')}
    .then(anotherGroup) {println Dataflow.retrieveCurrentDFPGroup(); it}
    .then(anotherGroup, {mentionsGroovy -> println "Groovy found: $mentionsGroovy"}, {error -> println "Error: $error"}).join()
}

A Mashup Variant With Methods

To avoid giving you the wrong impression about how to structure Dataflow code, here’s a rewrite of the mashup example, with a downloadPage() method performing the actual download in a separate task. It returns a DataflowVariable instance, so that the main application thread can eventually get hold of the downloaded content.

Dataflow variables can obviously be passed around as parameters or return values.


A Sample

package groovyx.gpars.samples.dataflow

import static groovyx.gpars.GParsExecutorsPool.withPool
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * A simple mashup sample, downloads content of three websites and checks how many of them refer to Groovy.
 */
final List urls = ['http://www.dzone.com', 'http://www.jroller.com', 'http://www.theserverside.com']

task {
    def pages = urls.collect { downloadPage(it) }
    withPool {
        println "Number of Groovy sites today: " +
                (pages.findAllParallel {
                    it.val.toUpperCase().contains 'GROOVY'
                }).size()
    }
}.join()

def downloadPage(def url) {
    def page = new DataflowVariable()
    task {
        println "Started downloading from $url"
        page << url.toURL().text
        println "Done downloading from $url"
    }
    return page
}

A Physical Calculation Example

Dataflow programs naturally scale with the number of processors. Up to a certain level, the more processors you have, the faster the program runs. Check out, for example, the following script, which calculates parameters of a simple physical experiment and prints out the results.

Each task performs its part of the calculation. It may depend on values calculated by some other tasks, and its results might in turn be needed by other tasks. With Dataflow Concurrency you can split the work between tasks or reorder the tasks themselves as you like, and the dataflow mechanics will ensure the calculation is accomplished correctly.



import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

final def mass = new DataflowVariable()
final def radius = new DataflowVariable()
final def volume = new DataflowVariable()
final def density = new DataflowVariable()
final def acceleration = new DataflowVariable()
final def time = new DataflowVariable()
final def velocity = new DataflowVariable()
final def decelerationForce = new DataflowVariable()
final def deceleration = new DataflowVariable()
final def distance = new DataflowVariable()
final def author = new DataflowVariable()  //bound by the name-reading task, read by the printing task

def t = task {
    println """
Calculating distance required to stop a moving ball.
....................................................
The ball has a radius of ${radius.val} meters and is made of a material with ${density.val} kg/m3 density,
which means that the ball has a volume of ${volume.val} m3 and a mass of ${mass.val} kg.
The ball has been accelerating with ${acceleration.val} m/s2 from 0 for ${time.val} seconds and so reached a velocity of ${velocity.val} m/s.

Given our ability to push the ball backwards with a force of ${decelerationForce.val} N (Newton), we can cause a deceleration
of ${deceleration.val} m/s2 and so stop the ball at a distance of ${distance.val} m.
...................................................

This example has been calculated asynchronously in multiple tasks using *GPars* Dataflow concurrency in Groovy.
Author: ${author.val}
"""

    System.exit 0
}

task {
    mass << volume.val * density.val
}

task {
    volume << Math.PI * (radius.val ** 3)
}

task {
    radius << 2.5
    density << 998.2071  //water
    acceleration << 9.80665  //free fall
    decelerationForce << 900
}

task {
    println 'Enter your name:'
    def name = new InputStreamReader(System.in).readLine()
    author << (name?.trim()?.size() > 0 ? name : 'anonymous')
}

task {
    time << 10
    velocity << acceleration.val * time.val
}

task {
    deceleration << decelerationForce.val / mass.val
}

task {
    distance << deceleration.val * ((velocity.val / deceleration.val) ** 2) * 0.5
}

t.join()

I did my best to make all the physical calculations right. Feel free to change the values and see how long a distance you need to stop the rolling ball.

Deterministic Deadlocks

If you happen to introduce a deadlock in your dependencies, the deadlock will occur each time you run the code. No randomness is allowed. That’s one of the benefits of Dataflow Concurrency. Irrespective of the actual thread scheduling scheme, if you don’t get a deadlock in tests, you won’t get one in production.

A Deterministic Deadlock Sample

task {
    println a.val
    b << 'Hi there'
}

task {
    println b.val
    a << 'Hello man'
}
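One way to observe such a cycle without hanging the caller is to read with a timeout. The sketch below assumes that getVal(timeout, unit) returns null when the timeout elapses before a value is bound; the 500 ms limit and the variable names are arbitrary, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowVariable
import java.util.concurrent.TimeUnit
import static groovyx.gpars.dataflow.Dataflow.task

def a = new DataflowVariable()
def b = new DataflowVariable()

// Each task waits for the variable that only the other task can bind
task {
    def value = a.val
    b << "Hi there, $value"
}

task {
    def value = b.val
    a << "Hello man, $value"
}

// A bounded read instead of a.val, so the main thread does not join the deadlock
def observed = a.getVal(500, TimeUnit.MILLISECONDS)
println(observed == null ? 'Deadlock: a was never bound' : "a = $observed")
```

Because the cycle is part of the dependency structure, the read times out on every run, regardless of how the threads are scheduled. The two blocked tasks run on daemon threads, so the script still exits.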


Dataflows Map

As a handy shortcut, the Dataflows class can help you reduce the amount of code you need to leverage Dataflow Variables.

A Convenience Example

def df = new Dataflows()
df.x = 'value1'

assert df.x == 'value1'

Dataflow.task {df.y = 'value2'}

assert df.y == 'value2'

Think of Dataflows as a map with Dataflow Variables as keys storing their bound values as appropriate map values. The semantics of reading a value (e.g. df.x) and binding a value (e.g. df.x = 'value') are identical to the semantics of plain Dataflow Variables (x.val and x << 'value' respectively).
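Since reads block exactly like x.val does, tasks can be declared in any order. A minimal sketch with made-up property names x, y and z (the @Grab line is an assumption for standalone runs):

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.Dataflows
import static groovyx.gpars.dataflow.Dataflow.task

def df = new Dataflows()

// Declared first, but it blocks until both x and y have been bound
task { df.z = df.x + df.y }

task { df.x = 10 }
task { df.y = 5 }

def total = df.z        // blocks until the first task has bound z
println "Result: $total"
```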

Mixing Dataflows and Groovy with blocks

When inside a with block of a Dataflows instance, the Dataflow Variables stored inside the Dataflows instance can be accessed directly without the need to prefix them with the Dataflows instance identifier.

A with block Makes Coding Easier

new Dataflows().with {
    x = 'value1'
    assert x == 'value1'

    Dataflow.task {y = 'value2'}

    assert y == 'value2'
}

Returning Values From A Task

Typically, Dataflow tasks communicate through Dataflow Variables. On top of that, tasks can also return values, again through a Dataflow Variable. When you invoke the task() factory method, you get back an instance of a Promise (implemented as DataflowVariable), through which you can listen for the task’s return value, just like when using any other Promise or DataflowVariable.


A Task Returns A Value Using a Promise

final Promise t1 = task {
    return 10
}
final Promise t2 = task {
    return 20
}

def results = [t1, t2]*.val

println 'Both sub-tasks finished and returned values: ' + results

The value can also be obtained without blocking the caller using the whenBound() method.

A Sample

def task = task {
    println 'The task is running and calculating the return value'
    30  // the value to be returned
}
task >> {value -> println "The task finished and returned $value"}

Joining Tasks

Using the join() operation on the resulting Dataflow Variable of a task, you can block until the task finishes.

A Blocking Sample

task {
    final Promise t1 = task {
        println 'First sub-task running.'
    }
    final Promise t2 = task {
        println 'Second sub-task running'
    }
    [t1, t2]*.join()
    println 'Both sub-tasks finished'
}.join()


Selects

Frequently, a value needs to be obtained from one of several dataflow channels (such as variables, queues, broadcasts or streams). The Select class is suitable for these scenarios.

Select can scan several dataflow channels and pick, from all the input channels, one channel with a value that’s ready to be read. The value from that chosen channel is read and returned to the caller together with the index of the originating channel. Picking the channel is either random or based on channel priority, in which case channels with a lower position index in the Select constructor have higher priority.

Selecting A Value From Multiple Channels

import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.select
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Shows a basic use of Select, which monitors a set of input channels for values and makes these values
 * available on its output irrespective of their original input channel.
 * Note that dataflow variables and queues can be combined for Select.
 *
 * You might also consider checking out the prioritySelect method, which prioritizes values by the index of their input channel
 */
def a = new DataflowVariable()
def b = new DataflowVariable()
def c = new DataflowQueue()

task {
    sleep 3000
    a << 10
}

task {
    sleep 1000
    b << 20
}

task {
    sleep 5000
    c << 30
}

def select = select([a, b, c])

println "The fastest result is ${select().value}"


A select() Method Returns What ?

Note that the return type from select() is SelectResult, holding the value as well as the original channel index.
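Both parts of the SelectResult can be inspected directly. In this sketch only the second of two queues holds a value, so the outcome does not depend on the random channel choice; the queue names are invented, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.select

def idle = new DataflowQueue()      // never receives a value
def busy = new DataflowQueue()
busy << 'ready'

def sel = select(idle, busy)
def picked = sel.select()           // a SelectResult

println "Channel ${picked.index} delivered '${picked.value}'"
```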

There are several ways to read values from a Select:

How Do I Select Thee ? Let Me Count The Ways !

def sel = select(a, b, c, d)
def result = sel.select()                                   //Random selection
def result = sel()                                          //Random selection (a short-hand variant)
def result = sel.select([true, true, false, true])          //Random selection with guards specified
def result = sel([true, true, false, true])                 //Random selection with guards specified (a short-hand variant)

def result = sel.prioritySelect()                           //Priority selection
def result = sel.prioritySelect([true, true, false, true])  //Priority selection with guards specified

By default, the select() method blocks the caller until a value is available to be read. The alternative selectToPromise() and prioritySelectToPromise() methods give us a way to obtain a Promise of a value that will be selected later. Through the returned Promise, you can register a callback to be invoked asynchronously whenever the next value is selected.

Random And Priority Selections

def sel = select(a, b, c, d)

Promise result = sel.selectToPromise()                                   //Random selection
Promise result = sel.selectToPromise([true, true, false, true])          //Random selection with guards specified

Promise result = sel.prioritySelectToPromise()                           //Priority selection
Promise result = sel.prioritySelectToPromise([true, true, false, true])  //Priority selection with guards specified

Another Way ?

Alternatively, the select() method can send its value to a supplied MessageStream (e.g. an Actor) without blocking the caller.


A Sample

def handler = actor {...}
def sel = select(a, b, c, d)

sel.select(handler)                                      //Random selection
sel(handler)                                             //Random selection (a short-hand variant)
sel.select(handler, [true, true, false, true])           //Random selection with guards specified
sel(handler, [true, true, false, true])                  //Random selection with guards specified (a short-hand variant)

sel.prioritySelect(handler)                              //Priority selection
sel.prioritySelect(handler, [true, true, false, true])   //Priority selection with guards specified

Guards

Guards let the caller omit some input channels from the selection. Guards are specified as a List of boolean flags passed to the select() or prioritySelect() methods.

A Useful Filter Tool

def sel = select(leaders, seniors, experts, juniors)
def teamLead = sel([true, true, false, false]).value  //Only 'leaders' and 'seniors' qualify for becoming a teamLead here

A typical use for Guards is to make Selects flexible enough to adapt to changes in user state.


Guards Enable/Disable Channels During Value Selection

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.select
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Demonstrates the ability to enable/disable channels during a value selection on a Select by providing boolean guards.
 */
final DataflowQueue operations = new DataflowQueue()
final DataflowQueue numbers = new DataflowQueue()

def t = task {
    final def select = select(operations, numbers)
    3.times {
        def instruction = select([true, false]).value
        def num1 = select([false, true]).value
        def num2 = select([false, true]).value
        final def formula = "$num1 $instruction $num2"
        println "$formula = ${new GroovyShell().evaluate(formula)}"
    }
}

task {
    operations << '+'
    operations << '+'
    operations << '*'
}

task {
    numbers << 10
    numbers << 20
    numbers << 30
    numbers << 40
    numbers << 50
    numbers << 60
}

t.join()
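The effect of a guard is easiest to see when every channel already holds a value. The following sketch uses two invented queues, first and second, and is only meant as an illustration; the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.select

def first = new DataflowQueue()
def second = new DataflowQueue()
first << 'from first'
second << 'from second'

def sel = select(first, second)

// Both channels are ready, yet the guard leaves only one of them eligible
def guardedSecond = sel.select([false, true]).value
def guardedFirst = sel.select([true, false]).value

println "$guardedSecond, then $guardedFirst"
```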

Priority Select

When certain channels should have precedence over others when selecting, the prioritySelect methods should be used instead.


A prioritySelect Sample

import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.select
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Here's a simple use case for Priority Select. It monitors a set of input channels for values and makes these values
 * available on its output irrespective of their original input channel.
 *
 * Note that dataflow variables, queues and broadcasts can be combined for Select.
 *
 * Unlike a plain select method call, the prioritySelect call gives precedence to input channels with a lower index.
 * Available messages from high-priority channels will be served before messages from lower-priority channels.
 * Messages received through a single input channel will have their mutual order preserved.
 *
 */
def critical = new DataflowVariable()
def ordinary = new DataflowQueue()
def whoCares = new DataflowQueue()

task {
    ordinary << 'All working fine'
    whoCares << 'I feel a bit tired'
    ordinary << 'We are on target'
}

task {
    ordinary << 'I have just started my work. Busy. Will come back later...'
    sleep 5000
    ordinary << 'I am done for now'
}

task {
    whoCares << 'Huh, what is that noise'
    ordinary << 'Here I am to do some clean-up work'
    whoCares << 'I wonder whether unplugging this cable will eliminate that nasty sound.'
    critical << 'The server room runs on UPS!'
    whoCares << 'The sound has disappeared'
}

def select = select([critical, ordinary, whoCares])

println 'Starting to monitor our IT department'

sleep 3000
10.times {println "Received: ${select.prioritySelect().value}"}
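A deterministic, single-threaded variant of the same behaviour, given as a sketch: all values are written before the selection starts, so the order of the results depends only on channel priority. The queue names urgent and routine are made up, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.select

def urgent = new DataflowQueue()
def routine = new DataflowQueue()

// The routine channel is written first, the urgent one in between
routine << 'routine 1'
urgent << 'urgent 1'
routine << 'routine 2'

def sel = select(urgent, routine)   // urgent has index 0, hence the higher priority

def order = (1..3).collect { sel.prioritySelect().value }
println order
```

The urgent message overtakes the routine ones, while the two routine messages keep their mutual order.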


Collecting Results of Asynchronous Computations

No matter whether they are dataflow tasks, active objects' methods or asynchronous functions, asynchronous activities always return a Promise. Promises implement the SelectableChannel interface and so can be passed into selects for selection together with other Promises as well as read channels.

Similarly to Java's CompletionService, the GPars Select enables you to obtain results of asynchronous activities as soon as each becomes available. Also, we can use a Select to give us the first (fastest) result of several computations running in parallel.

How To Pick The Fastest Result

import groovyx.gpars.dataflow.Promise
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup

/**
 * Demonstrates the use of dataflow tasks and selects to pick the fastest result of concurrently run calculations.
 */

final group = new DefaultPGroup()
group.with {
    Promise p1 = task {
        sleep(1000)
        10 * 10 + 1
    }
    Promise p2 = task {
        sleep(1000)
        5 * 20 + 2
    }
    Promise p3 = task {
        sleep(1000)
        1 * 100 + 3
    }

    final alt = new Select(group, p1, p2, p3)

    def result = alt.select()
    println "Result: " + result
}

Timeouts

The Select.createTimeout() method will create a DataflowVariable that gets bound to a value after a declared time period. This can be leveraged in Selects so that they unblock (resume processing) after a desired delay, if none of the other channels delivers a value before that time. Simply pass a timeout channel as another input channel to the Select.

A Timeout Channel Helps Pick The Fastest Answer

import groovyx.gpars.dataflow.Promise
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup

/**
 * Demonstrates the use of dataflow tasks and selects to pick the fastest result of concurrently run calculations.
 */

final group = new DefaultPGroup()
group.with {
    Promise p1 = task {
        sleep(1000)
        10 * 10 + 1
    }
    Promise p2 = task {
        sleep(1000)
        5 * 20 + 2
    }
    Promise p3 = task {
        sleep(1000)
        1 * 100 + 3
    }

    final timeoutChannel = Select.createTimeout(500)

    final alt = new Select(group, p1, p2, p3, timeoutChannel)

    def result = alt.select()
    println "Result: " + result
}
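Whether the timeout won can be told from the index of the selected channel. The sketch below assumes nothing about the value the timeout channel carries and only checks the index; the 200 ms and 2000 ms delays and the variable names are arbitrary, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup

final pool = new DefaultPGroup()

// A computation that is deliberately slower than the timeout
def slowAnswer = pool.task {
    sleep 2000
    'slow answer'
}

def timeoutChannel = Select.createTimeout(200)
def alt = new Select(pool, slowAnswer, timeoutChannel)

def winner = alt.select()
def timedOut = (winner.index == 1)      // index 1 is the position of the timeout channel

println(timedOut ? 'Gave up waiting' : "Got ${winner.value}")
```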

Cancellations

OK, so we have our answer. What about the other tasks that continue to work on their answers? If we need to cancel those other tasks once an answer has been found or a timeout has expired, then the best way is to set a flag that our tasks periodically monitor.

Intentionally, there’s no cancellation machinery built into DataflowVariables or Tasks.


A Sample To Pick The Fastest Answer And Cancel The Others

import groovyx.gpars.dataflow.Promise
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup

import java.util.concurrent.atomic.AtomicBoolean

/**
 * Demonstrates the use of dataflow tasks and selects to pick the fastest result of concurrently run calculations.
 * It shows a way to cancel the slower tasks once a result is known
 */

final group = new DefaultPGroup()
final done = new AtomicBoolean()

group.with {
    Promise p1 = task {
        sleep(1000)
        if (done.get()) return
        10 * 10 + 1
    }
    Promise p2 = task {
        sleep(1000)
        if (done.get()) return
        5 * 20 + 2
    }
    Promise p3 = task {
        sleep(1000)
        if (done.get()) return
        1 * 100 + 3
    }

    final alt = new Select(group, p1, p2, p3, Select.createTimeout(500))

    def result = alt.select()
    done.set(true)

    println "Result: " + result
}

Operators

Dataflow Operators and Selectors provide a full Dataflow implementation with all the usual ceremony.


Concepts

Full Dataflow Concurrency builds on the concept of channels connecting operators and selectors. These objects consume values coming through input channels, transform them into new values and output the new values into their output channels.

Operators wait for every input channel to have a value before they start processing, whereas Selectors only wait for the first available value on any of their input channels.

An Operator Sample

operator(inputs: [a, b, c], outputs: [d]) {x, y, z ->
    ...
    bindOutput 0, x + y + z
}
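A complete, runnable variant of such an operator might look as follows. It is a sketch with invented channel names (left, right, sums) and two inputs instead of three; the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.operator

def left = new DataflowQueue()
def right = new DataflowQueue()
def sums = new DataflowQueue()

// The body fires each time BOTH inputs have a value available
def adder = operator(inputs: [left, right], outputs: [sums]) { x, y ->
    bindOutput 0, x + y
}

left << 1
left << 2       // waits in the queue until a matching value arrives on 'right'
right << 10
right << 20

def firstSum = sums.val
def secondSum = sums.val
println "Sums: $firstSum, $secondSum"

adder.terminate()   // stop the operator once we are done with it
```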


A Sample Cache

/**
 * CACHE
 *
 * Caches sites' contents. Accepts requests for url content, outputs the content. Outputs requests for download
 * if the site is not in cache yet.
 */
operator(inputs: [urlRequests], outputs: [downloadRequests, sites]) {request ->

    if (!request.content) {
        println "[Cache] Retrieving ${request.site}"

        def content = cache[request.site]

        if (content) {
            println "[Cache] Found in cache"
            bindOutput 1, [site: request.site, word: request.word, content: content]
        } else {
            def downloads = pendingDownloads[request.site]
            if (downloads != null) {
                println "[Cache] Awaiting download"
                downloads << request
            } else {
                pendingDownloads[request.site] = []
                println "[Cache] Asking for download"
                bindOutput 0, request
            }
        }

    } else {
        println "[Cache] Caching ${request.site}"

        cache[request.site] = request.content
        bindOutput 1, request

        def downloads = pendingDownloads[request.site]

        if (downloads != null) {
            for (downloadRequest in downloads) {
                println "[Cache] Waking up"
                bindOutput 1, [site: downloadRequest.site, word: downloadRequest.word, content: request.content]
            }
            pendingDownloads.remove(request.site)
        }
    }
}


Exception Handling Explained

The standard error handling prints an error message to the standard error output and terminates the operator whenever an uncaught exception is thrown from within the operator’s body. To alter this behavior, you can register your own event listener. See the Operator Lifecycle section for more details.

Exception Handling Sample

def listener = new DataflowEventAdapter() {

    @Override
    boolean onException(final DataflowProcessor processor, final Throwable e) {
        logChannel << e
        return false  //Indicate whether to terminate the operator or not
    }
}

op = group.operator(inputs: [a, b], outputs: [c], listeners: [listener]) {x, y ->
    ...
}

Types of Operators

There are specialized versions of operator methods for specific purposes:

• operator - the basic general-purpose operator

• selector - operator triggered by a value being available on any input channel

• prioritySelector - a selector that prefers delivering messages from lower-indexed input channels over higher-indexed ones

• splitter - a single-input operator copying its input values to all of its output channels
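The difference between these types shows up in a small network. In the sketch below a selector forwards whatever arrives on either input, and a splitter copies the selector's output to two channels. All channel names are invented for the example, and the @Grab line is an assumption for standalone runs.

```groovy
// Fetches GPars for a standalone run; drop it if GPars is already on the classpath
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.selector
import static groovyx.gpars.dataflow.Dataflow.splitter

def inputA = new DataflowQueue()
def inputB = new DataflowQueue()
def merged = new DataflowQueue()
def copy1 = new DataflowQueue()
def copy2 = new DataflowQueue()

// A selector fires as soon as ANY input has a value; an operator would wait for both
def merge = selector(inputs: [inputA, inputB], outputs: [merged]) { value ->
    bindOutput 0, value
}

// A splitter copies each value of its single input to all of its outputs
def fanOut = splitter(merged, [copy1, copy2])

inputB << 'only B has data'     // inputA stays empty on purpose

def seenOnCopy1 = copy1.val
def seenOnCopy2 = copy2.val
println "$seenOnCopy1 / $seenOnCopy2"

merge.terminate()
fanOut.terminate()
```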

Wiring Operators Together

Operators are typically combined into networks, in which some operators consume the output produced by other operators.

Using Several Operators

operator(inputs:[a, b], outputs:[c, d]) {...}
splitter(c, [e, f])
selector(inputs:[e, d], outputs:[]) {...}

You may alternatively refer to output channels through operators themselves:


A More Complex Operator Example

def op1 = operator(inputs:[a, b], outputs:[c, d]) {...}
def sp1 = splitter(op1.outputs[0], [e, f])   //takes the first output of op1

selector(inputs:[sp1.outputs[0], op1.outputs[1]], outputs:[]) {...}   //takes the first output of sp1 and the second output of op1

Grouping Operators

Dataflow operators can be organized into groups for performance fine-tuning. Groups provide a handy operator() factory method to create operators attached to the groups.

For Performance Fine-tuning? Use Groups


def group = new DefaultPGroup()

group.with {
    operator(inputs: [a, b, c], outputs: [d]) {x, y, z ->
        ...
        bindOutput 0, x + y + z
    }
}

Custom Thread Pools For Dataflow

The default thread pool for dataflow operators contains daemon threads, which means your application will exit as soon as the main thread finishes and won’t wait for all tasks to complete.

When grouping operators, make sure your custom thread pools also use daemon threads. You can achieve this by using DefaultPGroup or by providing your own thread factory to a thread pool constructor. If your thread pools use non-daemon threads, such as when using the NonDaemonPGroup group class, shut down the group or the thread pool explicitly by calling its shutdown() method; otherwise your application will not exit.

You may selectively override the default group used for tasks, operators, callbacks and other dataflow elements inside a code block using the Dataflow.usingGroup() method:


A Sample

Dataflow.usingGroup(group) {
    operator(inputs: [a, b, c], outputs: [d]) {x, y, z ->
        ...
        bindOutput 0, x + y + z
    }
}

You can always override the default group by being specific:

A Sample

Dataflow.usingGroup(group) {
    anotherGroup.operator(inputs: [a, b, c], outputs: [d]) {x, y, z ->
        ...
        bindOutput 0, x + y + z
    }
}

Constructing Operators

The construction properties of an operator, such as inputs, outputs, stateObject or maxForks, cannot be modified once the operator has been built. You may find the groovyx.gpars.dataflow.ProcessingNode class helpful when gradually collecting channels and values into lists before you finally build an operator.


A Sample

import groovyx.gpars.dataflow.Dataflow
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.ProcessingNode.node

/**
 * Shows how to build operators using the ProcessingNode class
 */

final DataflowQueue aValues = new DataflowQueue()
final DataflowQueue bValues = new DataflowQueue()
final DataflowQueue results = new DataflowQueue()

//Create a config and gradually set the required properties - channels, code, etc.
def adderConfig = node {valueA, valueB ->
    bindOutput valueA + valueB
}
adderConfig.inputs << aValues
adderConfig.inputs << bValues
adderConfig.outputs << results

//Build the operator
final adder = adderConfig.operator(Dataflow.DATA_FLOW_GROUP)

//Now the operator is running and processing the data
aValues << 10
aValues << 20
bValues << 1
bValues << 2

assert [11, 22] == (1..2).collect {
    results.val
}

Holding State in Operators

Although operators can frequently do without keeping state between subsequent invocations, GPars allows operators to maintain state if the developer so desires. One obvious way is to leverage Groovy closures, which can close over their context:

A Sample

int counter = 0
operator(inputs: [a], outputs: [b]) {value ->
    counter += 1
}


Another way, which allows you to avoid declaring the state object outside of the operator definition, is to pass the state object into the operator as a stateObject parameter at construction time:

A Sample

operator(inputs: [a], outputs: [b], stateObject: [counter: 0]) {value ->
    stateObject.counter += 1
}
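A runnable variation on the stateObject approach is sketched below: the operator numbers each incoming word using a counter kept in its state object. The channel names and the labelling format are assumptions made for this example, and GPars is assumed to be on the classpath.

```groovy
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.operator

// Hypothetical channels, named for this example only
final DataflowQueue words = new DataflowQueue()
final DataflowQueue numbered = new DataflowQueue()

// The map passed as stateObject belongs to this operator; nothing outside needs to declare it
def op = operator(inputs: [words], outputs: [numbered], stateObject: [counter: 0]) { word ->
    stateObject.counter += 1
    bindOutput("${stateObject.counter}: ${word}".toString())
}

['a', 'b', 'c'].each { words << it }

// With the default single-threaded operator, messages are handled one at a time
def labelled = (1..3).collect { numbered.val }

op.terminate()
```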

Parallelize Operators

By default an operator’s body is processed by a single thread at a time. While this is a safe setting allowing the operator’s body to be written in a non-thread-safe manner, once an operator becomes "hot" and data starts to accumulate in the operator’s input queues, you might consider allowing multiple threads to run the operator’s body concurrently. Bear in mind that in such a case you need to avoid shared resources or protect them from multi-threaded access. To enable multiple threads to run the operator’s body concurrently, pass an extra maxForks parameter when creating an operator:

A Sample

def op = operator(inputs: [a, b, c], outputs: [d, e], maxForks: 2) {x, y, z ->
    bindOutput 0, x + y + z
    bindOutput 1, x * y * z
}

The value of the maxForks parameter indicates the maximum number of threads running the operator concurrently. Only positive numbers are allowed, with 1 being the default.

Thread Starvation

Please always make sure the group serving the operator holds enough threads to support all requested forks. Using groups allows you to organize tasks or operators around different thread pools (wrapped inside the group). While the Dataflow.task() command schedules the task on a default thread pool (java.util.concurrent.Executor, fixed size=#cpu+1, daemon threads), you may prefer being able to define your own thread pool(s) to run your tasks.

A Sample

def group = new DefaultPGroup(10)
group.operator(inputs: [a, b, c], outputs: [d, e], maxForks: 5) {x, y, z -> ...}

The default group uses a resizeable thread pool and so will never run out of threads.


Synchronizing Outputs

When enabling internal parallelization of an operator by setting maxForks to a value greater than 1, it is important to remember that without explicit or implicit synchronization in the operator’s body race conditions may occur. In particular, bear in mind that values written to multiple output channels are not guaranteed to be written atomically in the same order to all the channels.

A Sample

operator(inputs:[inputChannel], outputs:[a, b], maxForks:5) {msg ->
    bindOutput 0, msg
    bindOutput 1, msg
}
inputChannel << 1
inputChannel << 2
inputChannel << 3
inputChannel << 4
inputChannel << 5

This may result in the output channels having their values mixed up, for example:

A Sample

a -> 1, 3, 2, 4, 5
b -> 2, 1, 3, 5, 4

Explicit synchronization is one way to bind all output channels consistently and to protect any operator state that is not thread-local:

A Sample

def lock = new Object()
operator(inputs:[inputChannel], outputs:[a, b], maxForks:5) {msg ->
    doStuffThatIsThreadSafe()

    synchronized(lock) {
        doSomethingThatMustNotBeAccessedByMultipleThreadsAtTheSameTime()
        bindOutput 0, msg
        bindOutput 1, 2*msg
    }
}

Obviously you need to weigh the pros and cons here, since synchronization may defeat the purpose of setting maxForks to a value greater than 1.

To set values of all the operator’s output channels in one atomic step, you may also consider calling either the bindAllOutputsAtomically method, passing in a single value to write to all output channels, or the bindAllOutputValuesAtomically method, which takes multiple values, each of which will be written to the output channel with the same position index.

A Sample

operator(inputs:[inputChannel], outputs:[a, b], maxForks:5) {msg ->
    doStuffThatIsThreadSafe()
    bindAllOutputValuesAtomically msg, 2*msg
}

Which Bind Do I Use ?

Using the bindAllOutputs or the bindAllOutputValues methods will not guarantee atomicity of writes across all the output channels when using internal parallelism.

If preserving the order of messages in multiple output channels is not an issue, bindAllOutputs as well as bindAllOutputValues will provide better performance than the atomic variants.
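The effect of the atomic variant can be observed with a small experiment, sketched below under a few assumptions: the channel names, the pool size of 6 and the message count of 20 are arbitrary choices for this example, and GPars is assumed to be on the classpath. Three forks write each message to both outputs in one atomic step, so both channels should end up holding the messages in the same relative order, even though that order need not match the input order.

```groovy
import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.group.DefaultPGroup

// Hypothetical channels, named for this example only
final DataflowQueue input = new DataflowQueue()
final DataflowQueue a = new DataflowQueue()
final DataflowQueue b = new DataflowQueue()

def group = new DefaultPGroup(6)   // enough threads to serve all three forks

def op = group.operator(inputs: [input], outputs: [a, b], maxForks: 3) { msg ->
    bindAllOutputsAtomically msg   // one value written to every output in a single atomic step
}

(1..20).each { input << it }

def fromA = (1..20).collect { a.val }
def fromB = (1..20).collect { b.val }

op.terminate()
group.shutdown()
```

Swapping bindAllOutputsAtomically for bindAllOutputs in this sketch removes the ordering guarantee between the two channels.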

Operator Lifecycle

Dataflow operators and selectors fire several events during their lifecycle, which allows interested parties to obtain notifications and potentially alter the operator’s behavior. The DataflowEventListener interface offers a number of callback methods:

A Sample

public interface DataflowEventListener {
    /**
     * Invoked immediately after the operator starts by a pooled thread before the first message is obtained
     *
     * @param processor The reporting dataflow operator/selector
     */
    void afterStart(DataflowProcessor processor);

    /**
     * Invoked immediately after the operator terminates
     *
     * @param processor The reporting dataflow operator/selector
     */
    void afterStop(DataflowProcessor processor);

    /**
     * Invoked if an exception occurs.
     * If any of the listeners returns true, the operator will terminate.
     * Exceptions outside of the operator's body or listeners' messageSentOut() handlers will terminate the operator irrespective of the listeners' votes.
     *
     * @param processor The reporting dataflow operator/selector
     * @param e         The thrown exception
     * @return True, if the operator should terminate in response to the exception, false otherwise.
     */
    boolean onException(DataflowProcessor processor, Throwable e);

    /**
     * Invoked when a message becomes available in an input channel.
     *
     * @param processor The reporting dataflow operator/selector
     * @param channel   The input channel holding the message
     * @param index     The index of the input channel within the operator
     * @param message   The incoming message
     * @return The original message or a message that should be used instead
     */
    Object messageArrived(DataflowProcessor processor, DataflowReadChannel<Object> channel, int index, Object message);

    /**
     * Invoked when a control message (instances of ControlMessage) becomes available in an input channel.
     *
     * @param processor The reporting dataflow operator/selector
     * @param channel   The input channel holding the message
     * @param index     The index of the input channel within the operator
     * @param message   The incoming message
     * @return The original message or a message that should be used instead
     */
    Object controlMessageArrived(DataflowProcessor processor, DataflowReadChannel<Object> channel, int index, Object message);

    /**
     * Invoked when a message is being bound to an output channel.
     *
     * @param processor The reporting dataflow operator/selector
     * @param channel   The output channel to send the message to
     * @param index     The index of the output channel within the operator
     * @param message   The message to send
     * @return The original message or a message that should be used instead
     */
    Object messageSentOut(DataflowProcessor processor, DataflowWriteChannel<Object> channel, int index, Object message);

    /**
     * Invoked when all messages required to trigger the operator become available in the input channels.
     *
     * @param processor The reporting dataflow operator/selector
     * @param messages  The incoming messages
     * @return The original list of messages or a modified/new list of messages that should be used instead
     */
    List<Object> beforeRun(DataflowProcessor processor, List<Object> messages);

    /**
     * Invoked when the operator completes a single run
     *
     * @param processor The reporting dataflow operator/selector
     * @param messages  The incoming messages that have been processed
     */
    void afterRun(DataflowProcessor processor, List<Object> messages);

    /**
     * Invoked when the fireCustomEvent() method is triggered manually on a dataflow operator/selector
     *
     * @param processor The reporting dataflow operator/selector
     * @param data      The custom piece of data provided as part of the event
     * @return A value to return from the fireCustomEvent() method to the caller (event initiator)
     */
    Object customEvent(DataflowProcessor processor, Object data);
}

A default implementation is provided through the DataflowEventAdapter class.

Listeners provide a way to handle exceptions when they occur inside operators. A listener may typically log such exceptions, notify a supervising entity, generate an alternative output or perform any steps required to recover from the situation. If there’s no listener registered or if any of the listeners returns true, the operator will terminate, preserving the contract of afterStop(). Exceptions that occur outside the actual operator’s body, i.e. at the parameter preparation phase before the body is triggered or at the clean-up and channel subscription phase after the body finishes, always lead to operator termination.

The fireCustomEvent() method available on operators and selectors may be used to communicate back and forth between the operator’s body and the interested listeners:


A Sample

final listener = new DataflowEventAdapter() {
    @Override
    Object customEvent(DataflowProcessor processor, Object data) {
        println "Log: Getting quite high on the scale $data"
        return 100   //The value to use instead
    }
}

op = group.operator(inputs: [a, b], outputs: [c], listeners: [listener]) {x, y ->
    final sum = x + y
    if (sum > 100) bindOutput(fireCustomEvent(sum))   //Reporting that the sum is too high, binding the lowered value that comes back
    else bindOutput sum
}

Selectors

A selector’s body should be a closure consuming either one or two arguments.

A Sample

selector (inputs : [a, b, c], outputs : [d, e]) {value -> ....}

The two-argument closure will get a value plus an index of the input channel, the value of which is currently being processed. This allows the selector to distinguish between values coming through different input channels.

A Sample

selector (inputs : [a, b, c], outputs : [d, e]) {value, index -> ....}

Priority Selector

When priorities need to be preserved among input channels, a DataflowPrioritySelector should be used.


A Sample

prioritySelector(inputs : [a, b, c], outputs : [d, e]) {value, index -> ...}

The priority selector will always prefer values from channels with lower position index over values coming through the channels with higher position index.

Join Selector

A selector without a body closure specified will copy all incoming values to all of its output channels.

A Sample

def join = selector (inputs : [programmers, analysis, managers], outputs : [employees, colleagues])
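A runnable version of the same idea is sketched below, merging two channels into one. The channel names and the values sent are assumptions made for this example, and GPars is assumed to be on the classpath. Since the selector picks up values from whichever input has one available, the sketch makes no claim about the order in which the merged values arrive.

```groovy
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.selector

// Hypothetical channels, named for this example only
final DataflowQueue programmers = new DataflowQueue()
final DataflowQueue managers = new DataflowQueue()
final DataflowQueue employees = new DataflowQueue()

// No body closure: every incoming value is copied to every output channel
def join = selector(inputs: [programmers, managers], outputs: [employees])

programmers << 'Ann'
managers << 'Bob'
programmers << 'Cid'

def merged = (1..3).collect { employees.val }

join.terminate()
```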

Internal Parallelism

The maxForks attribute, which enables internal parallelism, is also available for selectors.

A Sample

selector (inputs : [a, b, c], outputs : [d, e], maxForks : 5) {value -> ....}

Guards

Just like Selects, Selectors also allow the users to temporarily include/exclude individual input channels from selection. The guards input property can be used to set the initial mask on all input channels and the setGuards and setGuard methods are then available in the selector’s body.


A Sample

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.selector
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Demonstrates the ability to enable/disable channels during a value selection on a select by providing boolean guards.
 */
final DataflowQueue operations = new DataflowQueue()
final DataflowQueue numbers = new DataflowQueue()

def instruction
def nums = []

selector(inputs: [operations, numbers], outputs: [], guards: [true, false]) {value, index ->   //initial guards is set here
    if (index == 0) {
        instruction = value
        setGuard(0, false)   //setGuard() used here
        setGuard(1, true)
    }
    else nums << value
    if (nums.size() == 2) {
        setGuards([true, false])   //setGuards() used here
        final def formula = "${nums[0]} $instruction ${nums[1]}"
        println "$formula = ${new GroovyShell().evaluate(formula)}"
        nums.clear()
    }
}

task {
    operations << '+'
    operations << '+'
    operations << '*'
}

task {
    numbers << 10
    numbers << 20
    numbers << 30
    numbers << 40
    numbers << 50
    numbers << 60
}


Warning

Avoid combining guards and maxForks greater than 1. Although the Selector is thread-safe and won’t be damaged in any way, the guards are likely not to be set the way you expect. The multiple threads running the selector’s body concurrently will tend to overwrite each other’s settings to the guards property.

Shutting Down Dataflow Networks

Shutting down a network of dataflow processors (operators and selectors) may sometimes be a non-trivial task, especially if you need a generic mechanism that will not leave any messages unprocessed.

Dataflow operators and selectors can be terminated in three ways:

• by calling the terminate() method on all operators that need to be terminated

• by sending a poison message

• by setting up a network of activity monitors that will shut down the network after all messages have been processed

The details of each of these ways are described below.

Shutting down the thread pool

If you use a custom PGroup to maintain a thread pool for your dataflow network, you should not forget to shut down the pool once the network is terminated. Otherwise the thread pool will consume system resources and, if it uses non-daemon threads, it will prevent the JVM from exiting.

Emergency Shutdown

You can call terminate() on any operator/selector to immediately shut it down. Provided you keep track of all your processors, perhaps by adding them to a list, the fastest way to stop the network would be:

A Sample

allMyProcessors*.terminate()

This should, however, be treated as an emergency exit, since no guarantees can be given regarding processed messages or finished work. Operators will simply terminate instantly, leaving work unfinished and abandoning messages in the input channels. The lifecycle event listeners hooked to the operators/selectors will still have their afterStop() event handlers invoked in order to, for example, release resources or output a note into the log.


A Sample

def op1 = operator(inputs: [a, b, c], outputs: [d, e]) {x, y, z -> }

def op2 = selector(inputs: [d], outputs: [f, out]) { }

def op3 = prioritySelector(inputs: [e, f], outputs: [b]) {value, index -> }

[op1, op2, op3]*.terminate()   //Terminate all operators by calling the terminate() method on them
op1.join()
op2.join()
op3.join()

Shutting down the whole JVM through System.exit() will obviously shut down the dataflow network as well; however, no lifecycle listeners will be invoked.

Stopping Operators Gently

Operators handle incoming messages repeatedly. The only safe moment for stopping an operator without the risk of losing any messages is right after the operator has finished processing messages and is just about to look for more messages in its incoming pipes. This is exactly what the terminateAfterNextRun() method does. It will schedule the operator for shutdown after the next set of messages gets handled.

The unprocessed messages will stay in the input channels, which allows you to handle them later, perhaps with a different operator/selector or in some other way. Using terminateAfterNextRun() you will not lose any input messages. This may be particularly handy when you use a group of operators/selectors to load-balance messages coming from a channel. Once the workload decreases, the terminateAfterNextRun() method may be used to safely reduce the pool of load-balancing operators.
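The calling pattern for a gentle stop might look like the following sketch. The channel names and the doubling body are assumptions made for this example, and GPars is assumed to be on the classpath. The sketch deliberately makes no claim about exactly which of the later messages gets handled before the operator stops; per the description above, anything not handled remains in the input channel.

```groovy
import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.operator

// Hypothetical channels, named for this example only
final DataflowQueue input = new DataflowQueue()
final DataflowQueue output = new DataflowQueue()

def doubler = operator(inputs: [input], outputs: [output]) { value ->
    bindOutput value * 2
}

input << 1
def first = output.val            // the operator is running and has produced a result

doubler.terminateAfterNextRun()   // ask for a gentle stop rather than calling terminate()

// Further messages may still be arriving while the stop request is pending
input << 2
input << 3

doubler.join()                    // blocks until the operator has stopped
```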

Detecting shutdown

Operators and selectors offer a handy join() method for those who need to block until the operator terminates.

A Sample

allMyProcessors*.join()

This is the easiest way to wait until the whole dataflow network shuts down, irrespective of the shutdown method used.


PoisonPill

PoisonPill is a common term for a strategy that uses special-purpose messages to stop the entities that receive them. GPars offers the PoisonPill class, which has exactly this effect on operators and selectors. Since PoisonPill is a ControlMessage, it is invisible to the operator’s body and custom code does not need to handle it in any way. DataflowEventListeners may react to ControlMessages through the controlMessageArrived() handler method.

A Sample

def op1 = operator(inputs: [a, b, c], outputs: [d, e]) {x, y, z -> }

def op2 = selector(inputs: [d], outputs: [f, out]) { }

def op3 = prioritySelector(inputs: [e, f], outputs: [b]) {value, index -> }

a << PoisonPill.instance   //Send the poison

op1.join()
op2.join()
op3.join()

After receiving a poison pill, an operator terminates right after it finishes the current calculation and makes sure the poison is sent to all its output channels. This is so that the poison can spread to the connected operators. Also, although operators typically wait for all inputs to have a value, in the case of PoisonPills the operator will terminate as soon as a PoisonPill appears on any of its inputs. The values already obtained from the other channels will be lost. If these messages were supposed to be processed, this can be considered an error in the design of the network: they would need a proper value as their peer, and not a PoisonPill, in order to be processed normally.
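The propagation described here can be tried out with a two-stage pipeline, sketched below. The channel names and the arithmetic in the operator bodies are assumptions made for this example, and GPars is assumed to be on the classpath. The poison is sent after the regular values on the same channel, so both operators handle those values first and then stop in turn.

```groovy
import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.operator.PoisonPill
import static groovyx.gpars.dataflow.Dataflow.operator

// Hypothetical channels, named for this example only
final DataflowQueue a = new DataflowQueue()
final DataflowQueue b = new DataflowQueue()
final DataflowQueue c = new DataflowQueue()

def doubler = operator(inputs: [a], outputs: [b]) { value -> bindOutput value * 2 }
def incrementer = operator(inputs: [b], outputs: [c]) { value -> bindOutput value + 1 }

a << 1
a << 2
a << PoisonPill.instance   // queued behind the two regular values

// The poison stops doubler, travels down b and then stops incrementer
[doubler, incrementer]*.join()

def results = [c.val, c.val]
```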

Selectors, on the other hand, will patiently wait for a PoisonPill to be received from all their input channels before sending it on to the output channels. This behavior prevents networks containing feedback loops involving selectors from being shut down using PoisonPill. A selector would never receive a PoisonPill from the channel that comes back from behind the selector. A different shutdown strategy should be used for such networks.

Operators and Selectors Should Only Terminate Themselves

Given the potential variety of operator networks and their asynchronous nature, a good termination strategy is that operators and selectors should only ever terminate themselves.

All ways of terminating them from outside (either by calling the terminate() method or by sending poison down the stream) may result in messages being lost somewhere in the pipes. This happens when the reading operators terminate before they fully handle the messages waiting in their input channels.


Immediate Poison Pill

To allow selectors in particular to shut down immediately after receiving a poison pill, the notion of an immediate poison pill has been introduced. Since normal, non-immediate poison pills merely close the input channel, leaving the selector alive as long as at least one input channel remains open, the immediate poison pill closes the selector instantly. Obviously, unprocessed messages from the selector’s other input channels will not be handled by the selector once it reads an immediate poison pill.

With an immediate poison pill you can safely shut down networks with selectors involved in feedback loops.

A Sample

def op1 = selector(inputs: [a, b, c], outputs: [d, e]) {value, index -> }
def op2 = selector(inputs: [d], outputs: [f, out]) { }
def op3 = prioritySelector(inputs: [e, f], outputs: [b]) {value, index -> }

a << PoisonPill.immediateInstance

[op1, op2, op3]*.join()

Poison With Counting

When sending a poison pill down the operator network you may need to be notified when all the operators or a specified number of them have been stopped. The CountingPoisonPill class serves exactly this purpose:

A Sample

operator(inputs: [a, b, c], outputs: [d, e]) {x, y, z -> }
selector(inputs: [d], outputs: [f, out]) { }
prioritySelector(inputs: [e, f], outputs: [b]) {value, index -> }

//Send the poison indicating the number of operators that need to be terminated before we can continue
final pill = new CountingPoisonPill(3)
a << pill

//Wait for all operators to terminate
pill.join()
//At least 3 operators should be terminated by now

The termination property of the CountingPoisonPill class is a regular Promise<Boolean> and so has a lot of handy properties.


A Sample

//Send the poison indicating the number of operators that need to be terminated before we can continue
final pill = new CountingPoisonPill(3)
pill.termination.whenBound {println "Reporting asynchronously that the network has been stopped"}
a << pill

if (pill.termination.bound) println "Wow, that was quick. We are done already!"
else println "Things are being slow today. The network is still running."

//Wait for all operators to terminate
assert pill.termination.get()
//At least 3 operators should be terminated by now

An immediate variant of CountingPoisonPill is also available - ImmediateCountingPoisonPill .

A Sample

def op1 = selector(inputs: [a, b, c], outputs: [d, e]) {value, index -> }
def op2 = selector(inputs: [d], outputs: [f, out]) { }
def op3 = prioritySelector(inputs: [e, f], outputs: [b]) {value, index -> }

final pill = new ImmediateCountingPoisonPill(3)
a << pill
pill.join()

ImmediateCountingPoisonPill will safely and instantly shut down dataflow networks even with selectors involved in feedback loops, which a normal non-immediate poison pill would not be able to do.

Poison Strategies

To correctly shut down a network using PoisonPill you must identify the appropriate set of channels to send PoisonPill to. PoisonPill will spread in the network the usual way through the channels and processors down the stream. Typically the right channels to send PoisonPill to will be those that serve as data sources for the network. This may be difficult to achieve for general cases or for complex networks. On the other hand, for networks with a prevalent direction of message flow PoisonPill provides a very straightforward way to shut down the whole network gracefully.


Load-balancing Prevents Poison Shutdown

Load-balancing architectures, which use multiple operators reading messages off a shared channel (queue), will also prevent poison shutdown from working properly, since only one of the reading operators will get to read the poison message. You may consider using forked operators instead, by setting the maxForks property to a value greater than 1. Another alternative is to manually split the message stream into multiple channels, each of which would be consumed by one of the original operators.

Termination Tips and Tricks

Notice that GPars tasks return a DataflowVariable, which is bound to a value as soon as the task finishes. The 'terminator' operator below leverages the fact that DataflowVariables are implementations of the DataflowReadChannel interface and thus can be consumed by operators. As soon as both tasks finish, the operator will send a PoisonPill down the q channel to stop the consumer as soon as it processes all data.


A Sample

import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.operator.PoisonPill
import groovyx.gpars.group.NonDaemonPGroup

def group = new NonDaemonPGroup()

final DataflowQueue q = new DataflowQueue()

// final destination
def customs = group.operator(inputs: [q], outputs: []) { value ->
    println "Customs received $value"
}

// big producer
def green = group.task {
    (1..100).each {
        q << 'green channel ' + it
        sleep 10
    }
}

// little producer
def red = group.task {
    (1..10).each {
        q << 'red channel ' + it
        sleep 15
    }
}

def terminator = group.operator(inputs: [green, red], outputs: []) { t1, t2 ->
    q << PoisonPill.instance
}

customs.join()
group.shutdown()

Keeping PoisonPill Inside a Given Network

If your network passes values through channels to entities outside of it, you may need to stop the PoisonPill messages on the network boundaries. This can be easily achieved by putting a single-input single-output filtering operator on each such channel.


A Sample

operator(networkLeavingChannel, otherNetworkEnteringChannel) {value ->
    if (!(value instanceof PoisonPill)) bindOutput value
}

The Pipeline DSL may be also helpful here:

A Sample

networkLeavingChannel.filter { !(it instanceof PoisonPill) } into otherNetworkEnteringChannel

Check out the Pipeline DSL section to find out more on pipelines.

Graceful Shutdown

GPars provides a generic way to shut down a dataflow network. Unlike the previously mentioned mechanisms, this approach will keep the network running until all the messages get handled and then gracefully shut all operators down, letting you know when this happens. You have to pay a modest performance penalty, though. This is unavoidable since we need to keep track of what’s happening inside the network.

A Graceful Sample

import groovyx.gpars.dataflow.DataflowBroadcast
import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.operator.component.GracefulShutdownListener
import groovyx.gpars.dataflow.operator.component.GracefulShutdownMonitor
import groovyx.gpars.group.DefaultPGroup
import groovyx.gpars.group.PGroup

PGroup group = new DefaultPGroup(10)
final a = new DataflowQueue()
final b = new DataflowQueue()
final c = new DataflowQueue()
final d = new DataflowQueue<Object>()
final e = new DataflowBroadcast<Object>()
final f = new DataflowQueue<Object>()
final result = new DataflowQueue<Object>()

final monitor = new GracefulShutdownMonitor(100);

def op1 = group.operator(inputs: [a, b], outputs: [c], listeners: [new GracefulShutdownListener(monitor)]) {x, y ->
    sleep 5
    bindOutput x + y
}
def op2 = group.operator(inputs: [c], outputs: [d, e], listeners: [new GracefulShutdownListener(monitor)]) {x ->
    sleep 10
    bindAllOutputs 2*x
}
def op3 = group.operator(inputs: [d], outputs: [f], listeners: [new GracefulShutdownListener(monitor)]) {x ->
    sleep 5
    bindOutput x + 40
}
def op4 = group.operator(inputs: [e.createReadChannel(), f], outputs: [result], listeners: [new GracefulShutdownListener(monitor)]) {x, y ->
    sleep 5
    bindOutput x + y
}

100.times{a << 10}
100.times{b << 20}

final shutdownPromise = monitor.shutdownNetwork()

100.times{assert 160 == result.val}

shutdownPromise.get()
[op1, op2, op3, op4]*.join()

group.shutdown()

First, we need an instance of GracefulShutdownMonitor, which will orchestrate the shutdown process. It relies on instances of GracefulShutdownListener attached to all operators/selectors. These listeners observe their respective processors together with their input channels and report to the shared GracefulShutdownMonitor. Once shutdownNetwork() is called on GracefulShutdownMonitor, it will periodically check for reported activities, query the state of operators as well as the number of messages in their input channels.

Please make sure that no new messages enter the dataflow network after the shutdown has been initiated, since this may cause the network to never terminate. The shutdown process should only be started after all data producers have ceased sending additional messages to the monitored network.

The shutdownNetwork() method returns a Promise so that you can do the usual set of tricks with it: block waiting for the network to terminate using the get() method, register a callback using the whenBound() method or make it trigger a whole set of activities through the then() method.


Limitations of Graceful Shutdown

• For GracefulShutdownListener to work correctly, its messageArrived() event handler must see the original value that has arrived through the input channel. Since some event listeners may alter the messages as they pass through the listeners, it is advisable to add the GracefulShutdownListener first to the list of listeners on each dataflow processor.

• Also, graceful shutdown will not work for those rare operators that have listeners which turn control messages into plain value messages in the controlMessageArrived() event handler.

• Third and last, load-balancing architectures, which use multiple operators reading messages off a shared channel (queue), will also prevent graceful shutdown from working properly. You may consider using forked operators instead, by setting the maxForks property to a value greater than 1. Another alternative is to manually split the message stream into multiple channels, each of which would be consumed by one of the original operators.

Application Frameworks

Dataflow Operators and Selectors can be successfully used to build high-level domain-specific frameworks for problems that naturally fit the flow model.

Building Flow Frameworks on Top of GPars Dataflow

GPars dataflow can be viewed as bottom-line language-level infrastructure.

Operators, selectors, channels and event listeners can be very useful at the language level to combine, for example, with actors or parallel collections. Whenever a need arises for asynchronous handling of events that come through one or more channels, a dataflow operator or a small dataflow network could be a very good fit. Unlike tasks, operators are lightweight and release threads when there’s no message to process. Unlike actors, operators are addressed indirectly through channels and may easily combine messages from multiple channels into one action.

Alternatively, operators can be looked at as continuous functions, which instantly and repeatedly transform their input values into output. We believe that a concurrency-friendly general-purpose programming language should provide this type of abstraction.

At the same time, dataflow elements can be easily used as building blocks for constructing domain-specific workflow-like frameworks. These frameworks can offer higher-level abstractions specialized to a single problem domain, which would be inappropriate for a general-purpose language-level library. Each of the higher-level concepts is then mapped to (potentially several) GPars concepts.

For example, a network solving data-mining problems may consist of several data sources, data cleaning nodes, categorization nodes, reporting nodes and others. Image processing networks, on the other hand, may need nodes specialized in image compression and format transformation. Similarly, there are networks for data encryption, mp3 encoding and workflow management, as well as many other domains that would benefit from dataflow-based solutions. These will differ in many aspects: the type of nodes in the network, the type and frequency of events, the load-balancing scheme, potential constraints on branching, the need for visualization, debugging and logging, the way users define the networks and interact with them, as well as many others.

The higher-level application-specific frameworks should put effort into providing abstractions best suited for the given domain and hide GPars complexities.

For example, the visual graph of the network that the user manipulates on the screen should typically not show all the channels that participate in the network. Debugging or logging channels, which rarely contribute to the core of the solution, are among the first good candidates to consider for exclusion. Also channels and lifecycle-event listeners, which orchestrate aspects such as load balancing or graceful shutdown, will probably not be exposed to the user, although they will be part of the generated and executed network.

Similarly, a single channel in the domain-specific model will in reality translate into multiple channels, perhaps with one or more logging/transforming/filtering operators connecting them together. The function associated with a node will most likely be wrapped with some additional infrastructural code to form the operator’s body.

GPars gives you the underlying components, from which the end user may be abstracted away completely by the application-specific framework. This keeps GPars domain-agnostic and universal, yet useful at the implementation level.
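As a rough sketch of this layering, a domain-level facade can keep the channels entirely private. The TextWorkflow class and its step(), submit() and nextResult() methods are hypothetical names invented for this illustration; only DataflowQueue and chainWith() come from GPars.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.DataflowReadChannel

// Hypothetical facade: users talk about "steps" and never see a channel
class TextWorkflow {
    private final DataflowQueue input = new DataflowQueue()
    private DataflowReadChannel tail = input

    TextWorkflow step(Closure transformation) {
        // Each domain-level step is mapped to one chained operator
        tail = tail.chainWith(transformation)
        return this
    }

    void submit(String text) { input << text }

    String nextResult() { tail.val }
}

def workflow = new TextWorkflow()
        .step { it.trim() }
        .step { it.toUpperCase() }

workflow.submit('  hello dataflow ')
def result = workflow.nextResult()
assert result == 'HELLO DATAFLOW'
```

A real framework would probably also wrap each step with logging or error handling, which is exactly the kind of infrastructure the facade can hide.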

Pipeline DSL

A DSL for Building Operators Pipelines

Building dataflow networks can be further simplified. GPars offers handy shortcuts for the common scenario of building (mostly linear) pipelines of operators.

A Sample

def toUpperCase = {s -> s.toUpperCase()}

final encrypt = new DataflowQueue()
final DataflowReadChannel encrypted = encrypt | toUpperCase | {it.reverse()} | {'###encrypted###' + it + '###'}

encrypt << "I need to keep this message secret!"
encrypt << "GPars can build linear operator pipelines really easily"

println encrypted.val
println encrypted.val

This saves you from directly creating, wiring and manipulating all the channels and operators that are to form the pipeline. The pipe operator lets you hook an output of one function/operator/process to the input of another one, just like chaining system processes on the command line.

The pipe operator is a handy shorthand for a more generic chainWith() method:

A Sample

def toUpperCase = {s -> s.toUpperCase()}

final encrypt = new DataflowQueue()
final DataflowReadChannel encrypted = encrypt.chainWith toUpperCase chainWith {it.reverse()} chainWith {'###encrypted###' + it + '###'}

encrypt << "I need to keep this message secret!"
encrypt << "GPars can build linear operator pipelines really easily"

println encrypted.val
println encrypted.val

Combining Pipelines with Straight Operators

Since each operator pipeline has an entry and an exit channel, pipelines can be wired into more complex operator networks. Only your imagination can limit your ability to mix pipelines with channels and operators in the same network definitions.

A Sample

def toUpperCase = {s -> s.toUpperCase()}
def save = {text ->
    //Just pretending to be saving the text to disk, database or whatever
    println 'Saving ' + text
}

final toEncrypt = new DataflowQueue()
final DataflowReadChannel encrypted = toEncrypt.chainWith toUpperCase chainWith {it.reverse()} chainWith {'###encrypted###' + it + '###'}

final DataflowQueue fork1 = new DataflowQueue()
final DataflowQueue fork2 = new DataflowQueue()
splitter(encrypted, [fork1, fork2])  //Split the data flow

fork1.chainWith save  //Hook in the save operation

//Hook in a sneaky decryption pipeline
final DataflowReadChannel decrypted = fork2.chainWith {it[15..-4]} chainWith {it.reverse()} chainWith {it.toLowerCase()}
        .chainWith {'Groovy leaks! Check out a decrypted secret message: ' + it}

toEncrypt << "I need to keep this message secret!"
toEncrypt << "GPars can build operator pipelines really easy"

println decrypted.val
println decrypted.val

Type of Channel Preservation

The type of the channel is preserved across the whole pipeline. E.g. if you start chaining off a synchronous channel, all the channels in the pipeline will be synchronous. In that case, obviously, the whole chain blocks, including the writer who writes into the channel at the head, until someone reads data off the tail of the pipeline.

A Sample

final SyncDataflowQueue queue = new SyncDataflowQueue()
final result = queue.chainWith {it * 2}.chainWith {it + 1} chainWith {it * 100}

Thread.start {
    5.times {
        println result.val
    }
}

queue << 1
queue << 2
queue << 3
queue << 4
queue << 5

Joining Pipelines

Two pipelines (or channels) can be connected using the into() method:

A Sample

final encrypt = new DataflowQueue()
final DataflowWriteChannel messagesToSave = new DataflowQueue()
encrypt.chainWith toUpperCase chainWith {it.reverse()} into messagesToSave

task {
    encrypt << "I need to keep this message secret!"
    encrypt << "GPars can build operator pipelines really easy"
}

task {
    2.times {
        println "Saving " + messagesToSave.val
    }
}

The output of the encryption pipeline is directly connected to the input of the saving pipeline (a single channel in our case).

Forking the Dataflow

When a need comes to copy the output of a pipeline/channel into more than one following pipeline/channel, the split() method will help you:

A Sample

final encrypt = new DataflowQueue()
final DataflowWriteChannel messagesToSave = new DataflowQueue()
final DataflowWriteChannel messagesToLog = new DataflowQueue()

//Explicit parentheses make sure split() is invoked on the pipeline, not on the closure
encrypt.chainWith(toUpperCase).chainWith {it.reverse()}.split(messagesToSave, messagesToLog)

Tapping into a Pipeline

Like split(), the tap() method allows you to fork the data flow into multiple channels. Tapping, however, is slightly more convenient in some scenarios, since it treats one of the two new forks as the successor of the pipeline.

A Sample

queue.chainWith {it * 2}.tap(logChannel).chainWith{it + 1}.tap(logChannel).into(PrintChannel)
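A self-contained variant of the tapping idea may make the data flow easier to follow. The channel names below are illustrative only, and the @Grab line is included just so the script runs on its own.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue

final queue = new DataflowQueue()
final logChannel = new DataflowQueue()

// The tapped value (after doubling) is copied into logChannel,
// while the pipeline itself continues with the "+ 1" step
final results = queue.chainWith { it * 2 }.tap(logChannel).chainWith { it + 1 }

queue << 10

def finalValue = results.val
def loggedValue = logChannel.val
assert finalValue == 21
assert loggedValue == 20
```

The log channel sees the intermediate value, not the final one, because the tap sits between the two chained steps.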

Merging Channels

Merging allows you to join multiple read channels as inputs for a single dataflow operator. The function passed as the second argument needs to accept as many arguments as there are channels being merged. Each argument will hold a value of the corresponding channel.

A Sample

maleChannel.merge(femaleChannel) {m, f -> m.marry(f)}.into(mortgageCandidatesChannel)
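For a version that can be run as is, the following sketch merges two channels of numbers. The channel names and the multiplication are assumptions chosen purely for illustration.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue

final widths = new DataflowQueue()
final heights = new DataflowQueue()

// The closure takes one argument per merged channel, in channel order
final areas = widths.merge(heights) { w, h -> w * h }

widths << 3
heights << 4

def area = areas.val
assert area == 12
```

The merging operator fires only once a value is available on both inputs, so writing to widths alone would not produce an area.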

Separation

Separation is the opposite operation to merge. The supplied closure returns a list of values, each of which will be output into an output channel with the corresponding position index.

A Sample

queue1.separate([queue2, queue3, queue4]) {a -> [a-1, a, a+1]}
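The positional mapping between the returned list and the output channels can be checked with a small standalone script. The channel names are made up for this sketch.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue

final source = new DataflowQueue()
final lower = new DataflowQueue()
final same = new DataflowQueue()
final upper = new DataflowQueue()

// Element 0 of the returned list goes to lower, element 1 to same, element 2 to upper
source.separate([lower, same, upper]) { a -> [a - 1, a, a + 1] }

source << 10

def separated = [lower.val, same.val, upper.val]
assert separated == [9, 10, 11]
```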


Choices

The binaryChoice() and choice() methods allow you to send a value to one out of two (or many) output channels, as indicated by the return value from a closure.

A Sample

queue1.binaryChoice(queue2, queue3) {a -> a > 0}
queue1.choice([queue2, queue3, queue4]) {a -> a % 3}
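The routing rules can be sketched in a runnable form as follows. The example deliberately uses a separate input channel for each selector, since two operators reading the same queue would compete for its messages. All channel names are illustrative.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowQueue

// binaryChoice(): a true result selects the first branch, false the second
final numbers = new DataflowQueue()
final positive = new DataflowQueue()
final nonPositive = new DataflowQueue()
numbers.binaryChoice(positive, nonPositive) { it > 0 }

numbers << 5
numbers << -3
def routedPositive = positive.val
def routedNonPositive = nonPositive.val

// choice(): the closure returns the index of the target channel
final values = new DataflowQueue()
final remainder0 = new DataflowQueue()
final remainder1 = new DataflowQueue()
final remainder2 = new DataflowQueue()
values.choice([remainder0, remainder1, remainder2]) { it % 3 }

values << 7   // 7 % 3 == 1
values << 9   // 9 % 3 == 0
def routedToIndex1 = remainder1.val
def routedToIndex0 = remainder0.val

assert routedPositive == 5
assert routedNonPositive == -3
assert routedToIndex1 == 7
assert routedToIndex0 == 9
```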

Filtering

The filter() method allows you to filter data in the pipeline using boolean predicates.

A Sample

final DataflowQueue queue1 = new DataflowQueue()
final DataflowQueue queue2 = new DataflowQueue()

final odd = {num -> num % 2 != 0 }

queue1.filter(odd) into queue2
(1..5).each {queue1 << it}
assert 1 == queue2.val
assert 3 == queue2.val
assert 5 == queue2.val

Null Values

If a chained function returns a null value, it is normally passed along the pipeline as a valid value. To indicate to the operator that no value should be passed further down the pipeline, a NullObject.nullObject instance must be returned.

A Sample

final DataflowQueue queue1 = new DataflowQueue()
final DataflowQueue queue2 = new DataflowQueue()

final odd = {num ->
    if (num == 5) return null  //null values are normally passed on
    if (num % 2 != 0) return num
    else return NullObject.nullObject  //this value gets blocked
}

queue1.chainWith odd into queue2
(1..5).each {queue1 << it}
assert 1 == queue2.val
assert 3 == queue2.val
assert null == queue2.val

Customizing Thread Pools

All of the Pipeline DSL methods allow for custom thread pools or PGroups to be specified:


A Sample

channel | {it * 2}

channel.chainWith(closure)
channel.chainWith(pool) {it * 2}
channel.chainWith(group) {it * 2}

channel.into(otherChannel)
channel.into(pool, otherChannel)
channel.into(group, otherChannel)

channel.split(otherChannel1, otherChannel2)
channel.split(otherChannels)
channel.split(pool, otherChannel1, otherChannel2)
channel.split(pool, otherChannels)
channel.split(group, otherChannel1, otherChannel2)
channel.split(group, otherChannels)

channel.tap(otherChannel)
channel.tap(pool, otherChannel)
channel.tap(group, otherChannel)

channel.merge(otherChannel)
channel.merge(otherChannels)
channel.merge(pool, otherChannel)
channel.merge(pool, otherChannels)
channel.merge(group, otherChannel)
channel.merge(group, otherChannels)

channel.filter(otherChannel)
channel.filter(pool, otherChannel)
channel.filter(group, otherChannel)

channel.binaryChoice(trueBranch, falseBranch)
channel.binaryChoice(pool, trueBranch, falseBranch)
channel.binaryChoice(group, trueBranch, falseBranch)

channel.choice(branches)
channel.choice(pool, branches)
channel.choice(group, branches)

channel.separate(outputs)
channel.separate(pool, outputs)
channel.separate(group, outputs)

Overriding the Default PGroup

To avoid the necessity to specify a PGroup for each Pipeline DSL method separately, you may override the value of the default Dataflow PGroup.

A Sample

Dataflow.usingGroup(group) {
    channel.choice(branches)
}
//Is identical to
channel.choice(group, branches)

The Dataflow.usingGroup() method resets the value of the default dataflow PGroup for the given code block to the value specified.
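A runnable sketch of this override might look as follows. The pool size of 2 and the arithmetic steps are arbitrary choices for the example, not recommendations.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.Dataflow
import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.group.DefaultPGroup

final group = new DefaultPGroup(2)   // pool size chosen arbitrarily for the sketch
final input = new DataflowQueue()
def output

// Both chained operators are created while the custom group is the default,
// so neither call has to name the group explicitly
Dataflow.usingGroup(group) {
    output = input.chainWith { it * 2 }.chainWith { it + 1 }
}

input << 20
def computed = output.val
assert computed == 41

group.shutdown()
```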

The Pipeline Builder

The Pipeline class offers an intuitive builder for operator pipelines. The greatest benefit of using the Pipeline class compared to chaining the channels directly is the ease with which a custom thread pool/group can be applied to all the operators along the constructed chain. The available methods and overloaded operators are identical to the ones available on channels directly.

The greatest benefit of using the Pipeline class is easy usage


A Sample

import groovyx.gpars.dataflow.DataflowQueue
import groovyx.gpars.dataflow.operator.Pipeline
import groovyx.gpars.scheduler.DefaultPool
import groovyx.gpars.scheduler.Pool

final DataflowQueue queue = new DataflowQueue()
final DataflowQueue result1 = new DataflowQueue()
final DataflowQueue result2 = new DataflowQueue()
final Pool pool = new DefaultPool(false, 2)

final negate = {-it}

final Pipeline pipeline = new Pipeline(pool, queue)

pipeline | {it * 2} | {it + 1} | negate
pipeline.split(result1, result2)

queue << 1
queue << 2
queue << 3

assert -3 == result1.val
assert -5 == result1.val
assert -7 == result1.val

assert -3 == result2.val
assert -5 == result2.val
assert -7 == result2.val

pool.shutdown()

Passing Construction Parameters Through the Pipeline DSL

You are likely to frequently need the ability to pass additional initialization parameters to the operators, such as the listeners to attach or the value for maxForks. Just like when building operators directly, the Pipeline DSL methods accept an optional map of parameters to pass in.

A Sample

new Pipeline(group, queue1).merge([maxForks: 4, listeners: [listener]], queue2) {a, b -> a + b}.into queue3

Implementation

The Dataflow Concurrency in GPars builds on the same principles as the Actor support. All of the dataflow tasks share a thread pool, so the number of threads created through the Dataflow.task() factory method doesn’t need to correspond to the number of physical threads required from the system. The PGroup.task() factory method can be used to attach the created task to a group. Since each group defines its own thread pool, you can easily organize tasks around different thread pools just like you do with actors.
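Organizing tasks around separate groups could look roughly like this. The group names and pool sizes are assumptions made for the sketch; the tasks themselves are trivial stand-ins for real work.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.group.DefaultPGroup

// Two independent groups, each backed by its own thread pool
final ioGroup = new DefaultPGroup(2)
final cpuGroup = new DefaultPGroup(4)

// The first task runs on the ioGroup pool and yields a promise
def loaded = ioGroup.task { (1..10).toList() }

// The second task runs on the cpuGroup pool and waits for the first result
def summed = cpuGroup.task { loaded.val.sum() }

def total = summed.val
assert total == 55

[ioGroup, cpuGroup]*.shutdown()
```

Keeping slow, blocking work in its own group means it cannot starve the pool used by computational tasks.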

Combining Actors and Dataflow Concurrency

The good news is that you can combine actors and Dataflow Concurrency in any way you feel fit for your particular problem at hand. You can freely use Dataflow Variables from Actors.

A Sample

final DataflowVariable a = new DataflowVariable()

final Actor doubler = Actors.actor {
    react {message ->
        a << 2 * message
    }
}

final Actor fakingDoubler = actor {
    react {
        doubler.send it  //send a number to the doubler
        println "Result ${a.val}"  //wait for the result to be bound to 'a'
    }
}

fakingDoubler << 10

In the example you see the fakingDoubler using both messages and a DataflowVariable to communicate with the doubler Actor.

Using Plain Java Threads

The DataflowVariable as well as the DataflowQueue classes can obviously be used from any thread of your application, not only from the tasks created by Dataflow.task(). Consider the following example:


import groovyx.gpars.dataflow.DataflowVariable

final DataflowVariable a = new DataflowVariable<String>()
final DataflowVariable b = new DataflowVariable<String>()

Thread.start {
    println "Received: $a.val"
    Thread.sleep 2000
    b << 'Thank you'
}

Thread.start {
    Thread.sleep 2000
    a << 'An important message from the second thread'
    println "Reply: $b.val"
}

We’re creating two plain java.lang.Thread instances, which exchange data using the two dataflow variables. Obviously, neither the Actor lifecycle methods, nor the send/react functionality or thread pooling take effect in such scenarios.

Synchronous Variables and Channels

When using asynchronous dataflow channels, apart from the fact that readers have to wait for a value to be available for consumption, the communicating parties remain completely independent. Writers don’t wait for their messages to get consumed. Readers obtain values immediately as they come and ask. Synchronous channels, on the other hand, can synchronize writers with the readers as well as multiple readers among themselves.

This is particularly useful when you need to increase the level of determinism.

The writer-to-reader partial ordering imposed by asynchronous communication is complemented with reader-to-writer partial ordering when using synchronous communication. In other words, you are guaranteed that whatever the reader did before reading a value from a synchronous channel preceded whatever the writer did after writing the value. Also, with synchronous communication writers can never get too far ahead of readers, which simplifies reasoning about the system and reduces the need to manage data production speed in order to avoid system overload.
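The reader-to-writer ordering can be observed with a small experiment. This is only a sketch: the 300 ms pause is an arbitrary delay meant to give the writer time to reach the blocking write, and the flag name is invented for the example.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.SyncDataflowQueue
import java.util.concurrent.atomic.AtomicBoolean

final channel = new SyncDataflowQueue()
final writeCompleted = new AtomicBoolean(false)

def writer = Thread.start {
    channel << 'ping'          // blocks until a reader takes the value
    writeCompleted.set(true)   // happens only after the read
}

sleep 300                      // arbitrary pause; the writer should now be blocked
def completedBeforeRead = writeCompleted.get()

def message = channel.val      // reading releases the writer
writer.join()
def completedAfterRead = writeCompleted.get()

assert !completedBeforeRead
assert message == 'ping'
assert completedAfterRead
```

With an asynchronous DataflowQueue the first flag check would typically see the write already completed, since the writer would not wait for anyone.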

Synchronous Dataflow Queue

The SyncDataflowQueue class should be used for point-to-point (1:1 or n:1) communication. Each message written to the queue will be consumed by exactly one reader. Writers are blocked until their message is consumed, readers are blocked until there’s a value available for them to read.

Synchronous channels block both the writer and the readers until all parties are ready

A Synchronous Channel Sample

import groovyx.gpars.dataflow.SyncDataflowQueue
import groovyx.gpars.group.NonDaemonPGroup

/**
 * Shows how synchronous dataflow queues can be used to throttle fast producer
 * when serving data to a slow consumer. Unlike when using asynchronous channels,
 * synchronous channels block both the writer and the readers until all parties
 * are ready to exchange messages.
 */

//The group definition is missing from this listing; a NonDaemonPGroup matches the import above
final NonDaemonPGroup group = new NonDaemonPGroup()

final SyncDataflowQueue channel = new SyncDataflowQueue()

def producer = group.task {
    (1..30).each {
        channel << it
        println "Just sent $it"
    }
    channel << -1
}

def consumer = group.task {
    while (true) {
        sleep 500  //simulating a slow consumer
        final Object msg = channel.val
        if (msg == -1) return
        println "Received $msg"
    }
}

consumer.join()

group.shutdown()

Synchronous Dataflow Broadcast

The SyncDataflowBroadcast class should be used for publish-subscribe (1:n or n:m) communication.

Each message written to the broadcast will be consumed by all subscribed readers. Writers are blocked until their message is consumed by all readers, readers are blocked until there’s a value available for them to read and all the other subscribed readers ask for the message as well. With SyncDataflowBroadcast, you get all readers processing the same message at the same time and waiting for one another before getting the next one.

A SyncDataflowBroadcast Sample

import groovyx.gpars.dataflow.SyncDataflowBroadcast
import groovyx.gpars.group.NonDaemonPGroup

/**
 * Shows how synchronous dataflow broadcasts can be used to throttle fast producer
 * when serving data to slow consumers. Unlike when using asynchronous channels,
 * synchronous channels block both the writer and the readers
 * until all parties are ready to exchange messages.
 */

//The group definition is missing from this listing; a NonDaemonPGroup matches the import above
final NonDaemonPGroup group = new NonDaemonPGroup()

final SyncDataflowBroadcast channel = new SyncDataflowBroadcast()

def subscription1 = channel.createReadChannel()
def fastConsumer = group.task {
    while (true) {
        sleep 10  //simulating a fast consumer
        final Object msg = subscription1.val
        if (msg == -1) return
        println "Fast consumer received $msg"
    }
}

def subscription2 = channel.createReadChannel()
def slowConsumer = group.task {
    while (true) {
        sleep 500  //simulating a slow consumer
        final Object msg = subscription2.val
        if (msg == -1) return
        println "Slow consumer received $msg"
    }
}

def producer = group.task {
    (1..30).each {
        println "Sending $it"
        channel << it
        println "Sent $it"
    }
    channel << -1
}

[fastConsumer, slowConsumer]*.join()

group.shutdown()

Synchronous Dataflow Variable

Unlike DataflowVariable, which is asynchronous and only blocks the readers until a value is bound to the variable, the SyncDataflowVariable class provides a one-shot data exchange mechanism that blocks the writer and all readers until a specified number of waiting parties is reached.

A Sample

import groovyx.gpars.dataflow.SyncDataflowVariable
import groovyx.gpars.group.NonDaemonPGroup

final NonDaemonPGroup group = new NonDaemonPGroup()

//two readers required to exchange the message
final SyncDataflowVariable value = new SyncDataflowVariable(2)

def writer = group.task {
    println "Writer about to write a value"
    value << 'Hello'
    println "Writer has written the value"
}

def reader = group.task {
    println "Reader about to read a value"
    println "Reader has read the value: ${value.val}"
}

def slowReader = group.task {
    sleep 5000
    println "Slow reader about to read a value"
    println "Slow reader has read the value: ${value.val}"
}

[reader, slowReader]*.join()

group.shutdown()

Kanban Flow

Table 1. Kanban Flow API Links

API                Link
Kanban Flow        ../api/groovydoc/groovyx/gpars/dataflow/KanbanFlow.html
Kanban Link        ../api/groovydoc/groovyx/gpars/dataflow/KanbanLink.html
Kanban Tray        ../api/groovydoc/groovyx/gpars/dataflow/KanbanTray.html
Processing Node    ../api/groovydoc/groovyx/gpars/dataflow/ProcessingNode.html

KanbanFlow Description

A KanbanFlow is a composed object that uses dataflow abstractions to define dependencies between multiple concurrent producer and consumer operators.

Each link between a producer and a consumer is defined by a KanbanLink.

Inside each KanbanLink, the communication between producer and consumer follows the KanbanFlow pattern as described in The KanbanFlow Pattern (http://people.canoo.com/mittie/kanbanflow.html). They use objects of type KanbanTray to send products downstream and to signal requests for further products back to the producer.

The figure below illustrates a KanbanLink with one producer, one consumer and five trays numbered 0 to 4. Tray number 0 has been used to take a product from producer to consumer, has been emptied by the consumer and is now sent back to the producer’s input queue. Trays 1 and 2 wait to carry products waiting for consumption, while trays 3 and 4 wait to be used by producers.

A KanbanFlow object links producers to consumers, thus creating KanbanLink objects. In the course of this activity, a second link may be constructed where the producer is the same object that acted as the consumer in a formerly created link, such that the two links become connected to build a chain.

Here is an example of a KanbanFlow with only one link, i.e. one producer and one consumer. The producer always sends the number 1 downstream and the consumer prints this number.

A Kanban Sample

import static groovyx.gpars.dataflow.ProcessingNode.node
import groovyx.gpars.dataflow.KanbanFlow

def producer = node { down -> down 1 }
def consumer = node { up -> println up.take() }

new KanbanFlow().with {
    link producer to consumer
    start()
    // run for a while
    stop()
}

To put a product into a tray and send the tray downstream, one can either use the send() method, the << operator, or use the tray as a method object. The following lines are equivalent:

A Sample of Equivalent Choices

node { down -> down.send 1 }
node { down -> down << 1 }
node { down -> down 1 }

When a product is taken from the input tray with the take() method, the empty tray is automatically released.

You should call take() only once!

If you prefer not to use an empty tray to send products downstream (as is typically the case when a ProcessingNode acts as a filter), you must release the tray in order to keep it in play. Otherwise, the number of trays in the system decreases.

You can release a tray either by calling the release() method or by using the ~ operator (think "shake it off"). The following lines are equivalent:

More Equivalent Choices

node { down -> down.release() }
node { down -> ~down }

Tray Release

Trays are automatically released, if you call any of the take() or send() methods.


Various Linking Structures

In addition to linear chains, a KanbanFlow can also link a single producer to multiple consumers (like a tree) or link multiple producers to a single consumer (collector) or any combination of the above that results in a directed acyclic graph (DAG).

The KanbanFlowTest class has many examples for such structures, including scenarios where a single producer delegates work to multiple consumers with:

• a work-stealing strategy where all consumers get their pick from the downstream,

• a master-slave strategy where a producer chooses from the available consumers, and

• a broadcast strategy where a producer sends all products to all consumers.

Cycles are forbidden by default but when enabled, they can be used as so-called generators. A producer can even be its own consumer that increases a product value every cycle. The generator itself remains state-free since the value is only stored as a product riding on a tray. Such a generator can be used for e.g. lazy sequences or as the heartbeat of a subsequent flow.

The approach of generator loops can be equally applied to collectors, where a collector does not maintain any internal state but sends a collection to itself, adding products at each call.

Generally speaking, a ProcessingNode can link to itself to export state to the tray/product that it sends to itself. Access to the product is then thread-safe by design.

Composing KanbanFlows

Just as KanbanLink objects can be chained together to form a KanbanFlow, flows themselves can be composed again to form new, greater flows from existing smaller ones.

A KanbanFlow Sample

def firstFlow = new KanbanFlow()
def producer = node(counter)
def consumer = node(repeater)
firstFlow.link(producer).to(consumer)

def secondFlow = new KanbanFlow()
def producer2 = node(repeater)
def consumer2 = node(reporter)
secondFlow.link(producer2).to(consumer2)

flow = firstFlow + secondFlow

flow.start()


Customizing Concurrency Characteristics

The amount of concurrency in a kanban system is determined by the number of trays (sometimes called WIP = work-in-progress). With no trays in the streams, the system does nothing:

• With one tray only, the system is confined to sequential execution.

• With more trays, concurrency begins.

• With more trays than available processing units, the system begins to waste resources.

The number of trays can be controlled in various ways. They are typically set when starting the flow.

Setting Tray Counts

flow.start(0) // start without trays
flow.start(1) // start with one tray per link in the flow
flow.start()  // start with the optimal number of trays

In addition to the trays, the KanbanFlow may also be constrained by its underlying ThreadPool. A pool of size 1, for example, will not allow much concurrency.

KanbanFlows use a default pool that is dimensioned by the number of available cores. This can be customized by setting the pooledGroup property.

Tests

• Kanban Flow Test in Groovy: https://github.com/GPars/GPars/blob/master/src/test/groovy/groovyx/gpars/dataflow/KanbanFlowTest.groovy

Kanban Code Samples in Groovy

• Demo Kanban Flow: https://github.com/GPars/GPars/blob/master/src/test/groovy/groovyx/gpars/samples/dataflow/kanban/DemoKanbanFlow.groovy

• Demo Kanban Flow Broadcast: https://github.com/GPars/GPars/blob/master/src/test/groovy/groovyx/gpars/samples/dataflow/kanban/DemoKanbanFlowBroadcast.groovy

• Demo Kanban Flow Cycle: https://github.com/GPars/GPars/blob/master/src/test/groovy/groovyx/gpars/samples/dataflow/kanban/DemoKanbanFlowCycle.groovy

• Demo Kanban Lazy Prime Sequence Loops: https://github.com/GPars/GPars/blob/master/src/test/groovy/groovyx/gpars/samples/dataflow/kanban/DemoKanbanLazyPrimeSequenceLoops.groovy

Classic Examples

The Sieve of Eratosthenes Implementation using Dataflow Tasks

A Sample Solution of The Sieve of Eratosthenes

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Demonstrates concurrent implementation of the Sieve of Eratosthenes using dataflow tasks
 */

//The definition is missing from this listing; 100 is an assumed value
final int requestedPrimeNumberCount = 100

final DataflowQueue initialChannel = new DataflowQueue()

/**
 * Generating candidate numbers
 */
task {
    (2..10000).each {
        initialChannel << it
    }
}

/**
 * Chain a new filter for a particular prime number to the end of the Sieve
 * @param inChannel The current end channel to consume
 * @param prime The prime number to divide future prime candidates with
 * @return A new channel ending the whole chain
 */
def filter(inChannel, int prime) {
    def outChannel = new DataflowQueue()

    task {
        while (true) {
            def number = inChannel.val
            if (number % prime != 0) {
                outChannel << number
            }
        }
    }
    return outChannel
}

/**
 * Consume Sieve output and add additional filters for all found primes
 */
def currentOutput = initialChannel
requestedPrimeNumberCount.times {
    int prime = currentOutput.val
    println "Found: $prime"
    currentOutput = filter(currentOutput, prime)
}

The Sieve of Eratosthenes using both Dataflow Tasks and Operators


A More Complex Solution

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.operator
import static groovyx.gpars.dataflow.Dataflow.task

/**
 * Demonstrates concurrent implementation of the Sieve of Eratosthenes using dataflow tasks and operators
 */

//The definition is missing from this listing; 100 is an assumed value
final int requestedPrimeNumberCount = 100

final DataflowQueue initialChannel = new DataflowQueue()

/**
 * Generating candidate numbers
 */
task {
    (2..1000).each {
        initialChannel << it
    }
}

/**
 * Chain a new filter for a particular prime number to the end of the Sieve
 * @param inChannel The current end channel to consume
 * @param prime The prime number to divide future prime candidates with
 * @return A new channel ending the whole chain
 */
def filter(inChannel, int prime) {
    def outChannel = new DataflowQueue()

    operator([inputs: [inChannel], outputs: [outChannel]]) {
        if (it % prime != 0) {
            bindOutput it
        }
    }
    return outChannel
}

/**
 * Consume Sieve output and add additional filters for all found primes
 */
def currentOutput = initialChannel
requestedPrimeNumberCount.times {
    int prime = currentOutput.val
    println "Found: $prime"
    currentOutput = filter(currentOutput, prime)
}

User Guide To STM

Software Transactional Memory

Software Transactional Memory (STM) gives developers transactional semantics for accessing in-memory data. This is similar to database concepts.

When multiple threads share data in memory, by marking blocks of code as transactional (atomic), the developer delegates the responsibility for data consistency to the STM engine. GPars leverages the Multiverse STM engine.

Running A Piece of Code Atomically

When using STM, developers organize their code into transactions. A transaction is a piece of code which is executed atomically: either 1) all the code is run or 2) none at all.

The data used by the transactional code remains consistent irrespective of whether the transaction finishes normally or abruptly. While running inside a transaction, the code is given an illusion of being isolated from other concurrently running transactions, so that changes to data in one transaction are not visible in the other ones until the transactions commit. This gives us the ACI part of the ACID characteristics of database transactions. The durability aspect, so typical for databases, is not typically mandated for STM.

GPars allows developers to specify transaction boundaries by using the atomic closures.

Sample of ACI Transaction Boundaries

import groovyx.gpars.stm.GParsStm
import org.multiverse.api.references.TxnInteger
import static org.multiverse.api.StmUtils.newTxnInteger

public class Account {
    private final TxnInteger amount = newTxnInteger(0);

    public void transfer(final int a) {
        GParsStm.atomic {
            amount.increment(a);
        }
    }

    public int getCurrentAmount() {
        GParsStm.atomicWithInt {
            amount.get();
        }
    }
}

There are several types of atomic closures, each for a different type of return value:

• atomic - returning Object

• atomicWithInt - returning int

• atomicWithLong - returning long

• atomicWithBoolean - returning boolean

• atomicWithDouble - returning double

• atomicWithVoid - no return value

Multiverse (https://github.com/pveentjer/Multiverse), by default, uses an optimistic locking strategy and automatically rolls back and retries colliding transactions.

Since a transaction may be re-run, developers should refrain from irreversible actions (e.g. writing to the console, sending e-mails, launching a missile, etc.) in their transactional code. To increase flexibility, the default Multiverse settings can be customized through custom atomic blocks.
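To illustrate why the retry behaviour is safe for in-memory data, here is a minimal sketch of concurrent transfers between two transactional integers. The account names and amounts are invented for the example, and it assumes Multiverse is available on the classpath as a GPars dependency.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.stm.GParsStm
import static org.multiverse.api.StmUtils.newTxnInteger

// Hypothetical balances, chosen for the example
final checking = newTxnInteger(100)
final savings = newTxnInteger(0)

// Both updates happen in one transaction: either both apply or neither does
def transfer = { int amount ->
    GParsStm.atomic {
        checking.increment(-amount)
        savings.increment(amount)
    }
}

// Ten concurrent transfers; colliding transactions are retried by the STM engine
def threads = (1..10).collect { Thread.start { transfer(5) } }
threads*.join()

def total = GParsStm.atomicWithInt { checking.get() + savings.get() }
def checkingBalance = checking.atomicGet()
def savingsBalance = savings.atomicGet()

assert total == 100
assert checkingBalance == 50
assert savingsBalance == 50
```

The transaction body contains only reversible in-memory updates, so it does not matter how many times the engine has to re-run it.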

Customizing the Transactional Properties

Frequently it’s desirable to specify different values for some of the transaction properties (e.g. read-only transactions, locking strategy, isolation level, etc.). The createTxnExecutor method will create a new TxnExecutor configured with the supplied values:

Create a TxnExecutor with Custom Parameters

import groovyx.gpars.stm.GParsStm
import org.multiverse.api.PropagationLevel
import org.multiverse.api.TxnExecutor

final TxnExecutor block = GParsStm.createTxnExecutor(maxRetries: 3000, familyName: 'Custom', PropagationLevel: PropagationLevel.Requires, interruptible: false)
assert GParsStm.atomicWithBoolean(block) {
    true
}

The customized TxnExecutor can then be used to create transactions using the specified settings.

TxnExecutor instances are thread-safe and can be freely reused among threads and transactions

Using the Transaction Object

Atomic closures take the current transaction as a parameter. The Txn object representing the transaction can be used to control it manually. This is illustrated in the example below, where we use the retry() method to block the current transaction until the counter reaches the desired value:

A Sample

    import groovyx.gpars.stm.GParsStm
    import org.multiverse.api.PropagationLevel
    import org.multiverse.api.TxnExecutor

    import static org.multiverse.api.StmUtils.newTxnInteger

    final TxnExecutor block = GParsStm.createTxnExecutor(maxRetries: 3000, familyName: 'Custom',
            PropagationLevel: PropagationLevel.Requires, interruptible: false)

    def counter = newTxnInteger(0)
    final int max = 100

    // Background thread that slowly increments the counter
    Thread.start {
        while (counter.atomicGet() < max) {
            counter.atomicIncrementAndGet(1)
            sleep 10
        }
    }

    // The transaction retries (blocks) until the counter has reached max
    assert max + 1 == GParsStm.atomicWithInt(block) { tx ->
        if (counter.get() == max) return counter.get() + 1
        tx.retry()
    }

Data Structures

You might have noticed in previous examples that we use dedicated data structures to hold values. The fact is that normal Java classes do not support transactions and thus cannot be used directly, since Multiverse would not be able to share them safely among concurrent transactions, commit them or roll them back.

We need to use data that knows about transactions:

• TxnIntRef

• TxnLongRef

• TxnBooleanRef

• TxnDoubleRef

• TxnRef


You typically create these through the factory methods of the org.multiverse.api.StmUtils class.

More Information

We decided not to duplicate the information that was already available on the Multiverse website.

Unfortunately, with the closure of Codehaus, that website is no longer available. You may try to gather more information from the Multiverse source code (https://github.com/pveentjer/Multiverse).

As we are unclear about the future of the Multiverse project, we will consider using a different STM implementation in a future GPars 2.0.


User Guide To GAE

Google App Engine Integration

GPars can be run on the Google App Engine (GAE, https://developers.google.com/appengine/). It can be made part of Groovy and Java GAE applications as well as plugged into Gaelyk.

The small GPars App Engine integration library provides all the necessary infrastructure to hook GAE services into GPars. Although you’ll be running on GAE threads and leveraging GAE timer services, the high-level abstractions remain the same. With a few restrictions you can still use GPars actors, dataflow, agents, parallel collections and other handy concepts.

Please refer to the GPars App Engine library (https://github.com/musketyr/gpars-appengine) for details on how to proceed with GPars on GAE.

User Guide To Remoting

Concepts like Actors, Dataflows and Agents are not restricted to a single VM. They provide an abstraction layer for concurrent programming that allows us to separate logic from low-level synchronization code. These concepts can easily be extended to multiple nodes in a network.

The following notes describe Remoting in GPars.

Remoting for GPars was a Google Summer of Code 2014 project.

Introduction

To use Actors, Dataflows or Agents remotely, new remote proxy objects were introduced, named with the Remote prefix.

The proxy object usually has an identical interface to its local counterpart. This allows us to use it in place of the local counterpart. Under the covers, a proxy object just sends messages over the wire to the original instance.

To transport messages across the network, the Netty library (http://netty.io) is used.

To create a proxy object, the instance serialization mechanism is used (see Remote Serialization below).
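The proxy idea can be sketched in plain Java with a JDK dynamic proxy. This is only an in-process illustration under stated assumptions: the Counter interface, LocalCounter class and proxyFor method are hypothetical names, and a list of method names stands in for the messages that the real GPars proxies would send over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-process stand-in for a remote proxy, not the GPars implementation.
class ProxySketch {

    /** Hypothetical interface shared by the original object and its proxy. */
    interface Counter {
        void add(int amount);
        int total();
    }

    /** The "original instance" that does the real work. */
    static final class LocalCounter implements Counter {
        private int total;
        public void add(int amount) { total += amount; }
        public int total() { return total; }
    }

    /** Returns a proxy that records each call as a "message" and forwards it to the original. */
    static Counter proxyFor(Counter original, List<String> wireLog) {
        InvocationHandler forward = (proxy, method, args) -> {
            wireLog.add(method.getName());        // stand-in for sending a message over the wire
            return method.invoke(original, args); // the original instance handles the request
        };
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(), new Class<?>[] {Counter.class}, forward);
    }

    public static void main(String[] args) {
        List<String> wireLog = new ArrayList<>();
        LocalCounter original = new LocalCounter();
        Counter remote = proxyFor(original, wireLog);
        remote.add(2);
        remote.add(3);
        System.out.println(remote.total() + " via " + wireLog);
        // prints "5 via [add, add, total]"
    }
}
```

Because the proxy implements the same interface as the original, calling code does not need to know which of the two it holds.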

The general approach to using remotes is as follows (details below):

At host A:

1. Create remoting context and start a server to handle incoming requests.

2. Publish an instance under a specified name

At host B:

1. Create remoting context

2. Ask for an instance with specified name from hostA:port. A promise object is returned.

3. Get a proxy object from the promise.

At this point, a new connection is created for each request


Remote Serialization

The following mechanism is used to create proxy objects:

object ←(serialization)→ handle ---- [network] ---- handle ←(serialization)→ proxy-object

One of the main advantages of this mechanism is that a proxy-object reference sent back to its origin is deserialized into the original instance.

As all messages are serialized before being sent over the wire, they must implement the Serializable interface.

This is a consequence of using the built-in Java serialization mechanism and the Netty ObjectDecoder/ObjectEncoder. On the other hand, it gives us the flexibility to send any custom object as a message to an Actor or to use DataflowVariable(s) of any type.
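The Serializable requirement can be checked locally before a message type is ever used with remoting. The sketch below uses only JDK serialization (no Netty and no GPars); the Greeting and PlainNote classes and the roundTrip and isSendable helpers are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative only: plain JDK serialization, without Netty or GPars.
class SerializableMessageSketch {

    /** A hypothetical custom message; implementing Serializable makes it transportable. */
    static final class Greeting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int priority;

        Greeting(String text, int priority) {
            this.text = text;
            this.priority = priority;
        }
    }

    /** A hypothetical message type that does not implement Serializable. */
    static final class PlainNote {
        final String text = "not sendable";
    }

    /** Serializes and then deserializes a message, as would happen on the two ends of the wire. */
    static Object roundTrip(Object message) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(message);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("message cannot be transported", e);
        }
    }

    /** Returns true if the message survives Java serialization. */
    static boolean isSendable(Object message) {
        try {
            roundTrip(message);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Greeting copy = (Greeting) roundTrip(new Greeting("hello", 1));
        System.out.println(copy.text + " / " + copy.priority); // prints "hello / 1"
        System.out.println(isSendable(new PlainNote()));       // prints "false"
    }
}
```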

Dataflows

In order to use remoting for Dataflows, a context (RemoteDataflows class) has to be created. Within this context, Dataflows can be published and retrieved from remote hosts.

A Sample

def remoteDataflows = RemoteDataflows.create()

In all subsections we assume that a context has already been created as shown above.

After creating a context, if we want to allow other hosts to retrieve published Dataflows, we need to start a server. We need to provide an address and port to listen on (for example, localhost:11222 or 10.0.0.123:11333).

Start A Remote Server

    remoteDataflows.startServer HOST, PORT

To stop the server, we have a stopServer() method. Note that both the start and stop methods are asynchronous and don’t block; the server is started/stopped in the background.

Calling either of these methods more than once, or calling them in the wrong order, will raise an exception.

To only retrieve instances from remote hosts, starting a server is not necessary.

DataflowVariable

The DataflowVariable is a core part of the Dataflows subsystem that gains remoting abilities. Other structures and subsystems depend on it.

Publishing a variable within a context is done simply by:

Publishing a Variable

    def variable = new DataflowVariable()
    remoteDataflows.publish variable, "my-first-variable"

This registers the variable under the given name, so when a request for a variable named my-first-variable arrives, the variable can be sent to the remote host.

It’s important to remember that publishing another variable under the same name will override the previous one, and subsequent requests will receive the newly published one.

Variable retrieval is done by:

Variable Retrieval

    def remoteVariablePromise = remoteDataflows.getVariable HOST, PORT, "my-first-variable"
    def remoteVariable = remoteVariablePromise.get()

The getVariable method is non-blocking and returns a promise object that will eventually hold a proxy object to that variable. This proxy has the same interface as a DataflowVariable and can be used seamlessly as a regular variable.

To explore a full example see our: groovyx.gpars.samples.remote.dataflow.variable code

DataflowBroadcast

It’s possible to subscribe to a DataflowBroadcast on a remote host. To do this, it must be published first (assuming the context already exists):

A DataflowBroadcast Sample

    def stream = new DataflowBroadcast()
    remoteDataflows.publish stream, "my-first-broadcast"

Then on the other host, it can be retrieved:

A Retrieval Sample

    def readChannelPromise = remoteDataflows.getReadChannel HOST, PORT, "my-first-broadcast"
    def readChannel = readChannelPromise.get()

The proxy object has the same interface as a ReadChannel and can be used in the same fashion as a ReadChannel of a regular DataflowBroadcast.

To explore a full example, please see: groovyx.gpars.samples.remote.dataflow.broadcast

DataflowQueue

The DataflowQueue received similar functionality and is published like this:

A Publish Sample

    def queue = new DataflowQueue()
    remoteDataflows.publish queue, "my-first-queue"

In a similar way, it can be retrieved from the remote host:

Retrieval from Remote Sources

    def queuePromise = remoteDataflows.getQueue HOST, PORT, "my-first-queue"
    def queue = queuePromise.get()

New items can be pushed into the queue through the remote proxy. Such elements are sent over the wire to the original instance and pushed into it.

Retrieval commands send a request for an element to the original instance.

Conceptually, the remote proxy is an interface - it just sends requests to an original instance.

To explore a full example see: groovyx.gpars.samples.remote.dataflow.queue or groovyx.gpars.samples.remote.dataflow.queuebalancer

Actors

The Remote Actors subsystem is designed in a similar way.

First, a RemoteActors context must be created. Then, within this context, Actor instances can be published or retrieved from a remote host.

Remote Creation

    def remoteActors = RemoteActors.create()

Publishing :

    def actor = ...
    remoteActors.publish actor, "actor-name"

Retrieval :

    def actorPromise = remoteActors.get HOST, PORT, "actor-name"
    def remoteActor = actorPromise.get()

It’s possible to join a remote Actor, but this will block until the original Actor ends its work. Sending replies and the sendAndWait method are supported as well.

One can send any object as a message to an Actor, but keep in mind it has to be Serializable.

See example: groovyx.gpars.samples.remote.actor

Remote Actor Names

A RemoteActors context may be identified by a name. To create one with a name use:

Create A Named Context

def remoteActors = RemoteActors.create "test-group-1"

Actors published within this context may be accessed by providing a special Actor URL.

For example: publishing an actor under the name of actor within this context makes it accessible under the URL "test-group-1/actor".

A Named Sample

def actor = remoteActors.get "test-group-1/actor"

The host and port of the instance holding this actor are determined automatically.

Invoking the get method will send a broadcast query to 255.255.255.255 with a search for an actor within a context with that specific name. A matching instance responds to that query with the necessary information, such as host and port.

Allowed Actor and Context Names

As the URL uses "/" (slash) as the separator between the context and actor names, as in "test-group-1/actor", we cannot use slashes in an actor’s name, but a context name can contain any UTF characters.

Agents

The Remote Agents subsystem is designed in a similar fashion.

First, a RemoteAgents context must be created. Within this context, Agents can be published or retrieved from remote hosts.

A Remote Create Sample

def remoteAgents = RemoteAgents.create()

Publishing :

    def agent = ...
    remoteAgents.publish agent, "agent-name"

Retrieval :

    def agentPromise = remoteAgents.get HOST, PORT, "agent-name"
    def remoteAgent = agentPromise.get()

There are two ways to execute closures used to update the state of a remote Agent instance:

• remote - closure is serialized and sent to original instance and executed in that context

• local - current state is retrieved and closure is executed where the update originated, then the updated value is sent to the original instance. Concurrent changes to the Agent wait until this process ends.

By default, a remote Agent uses the remote execution policy, but we can change it if necessary:

Changing Policy to LOCAL

    def agentPromise = remoteAgents.get HOST, PORT, "agent"
    def remoteAgent = agentPromise.get()
    remoteAgent.executionPolicy = AgentClosureExecutionPolicy.LOCAL

General GPars Tips

Grouping

High-level concurrency concepts, like Agents, Actors or Dataflow tasks and operators, can be grouped around shared thread pools. The PGroup class and its sub-classes represent convenient GPars wrappers around thread pools. Objects created using the group’s factory methods will share the group’s thread pool.

A xxxPGroup Sample

    def group1 = new DefaultPGroup()
    def group2 = new NonDaemonPGroup()

    group1.with {
        task {...}
        task {...}
        def op = operator(...) {...}
        def actor = actor{...}
        def anotherActor = group2.actor{...}  //will belong to group2
        def agent = safe(0)
    }

Groups For Thread Pools

When customizing the thread pools for groups, consider using the existing GPars implementations - the DefaultPool or ResizeablePool classes. Or you may wish to create your own implementation of the groovyx.gpars.scheduler.Pool interface to pass to the DefaultPGroup or NonDaemonPGroup constructors.

Java API

Much of GPars functionality can be used from Java just as well as from Groovy. Check out the Java API - Using GPars from Java section of this User Guide. Then experiment with the Maven-based stand-alone Java demo applications.

Take GPars with you wherever you go!

Performance

Your code in Groovy can be just as fast as code written in Java, Scala or any other programming language. This should not be surprising, since GPars is technically a solid tasty Java-made cake with a Groovy DSL frosting on it.

Unlike Java, however, with GPars, as well as with other DSL-friendly languages, you are very likely to experience a useful code speed-up for free. This speed-up comes from a better and cleaner design of your application.

Coding with a concurrency DSL will give you a smaller code-base with code using the concurrency primitives as language constructs. So it’s much easier to build robust concurrent applications, identify potential bottle-necks or errors and then eliminate them.

While this whole User Guide describes how to use Groovy and GPars to create beautiful and robust concurrent code, we want to use some of these tips to highlight a few places where some code tuning or minor design compromises could give you interesting performance gains.

Parallel Collections

Parallel collection processing methods, such as eachParallel(), collectParallel() and the like, use Parallel Array, an efficient tree-like data structure, behind the scenes. This data structure has to be built from the original collection each time you call any of the parallel collection methods. Thus, when chaining parallel method calls, you might consider using the map/reduce API instead or, alternatively, use the ParallelArray API directly to avoid the Parallel Array creation overhead.

A Sample of Parallel Finds

    import groovyx.gpars.GParsPool;

    GParsPool.withPool {
        // Chained xxxParallel() calls: the Parallel Array is rebuilt for each call
        people.findAllParallel{it.isMale()}.collectParallel{it.name}.any{it == 'Joe'}
        // The map/reduce API
        people.parallel.filter{it.isMale()}.map{it.name}.filter{it == 'Joe'}.size() > 0
        // The ParallelArray API used directly
        people.parallelArray.withFilter({it.isMale()} as Predicate).withMapping({it.name} as Mapper).any{it == 'Joe'} != null
    }
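The same trade-off can be illustrated, loosely, with JDK parallel streams. This is an analogy only, not GPars code: the Person class and both methods are hypothetical, and java.util.stream is used here as a stand-in for the GPars parallel collection and map/reduce APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: JDK parallel streams standing in for the GPars parallel collection APIs.
class ChainingSketch {

    static final class Person {
        final String name;
        final boolean male;

        Person(String name, boolean male) {
            this.name = name;
            this.male = male;
        }
    }

    /** Each step sets up parallel processing again and materializes an intermediate list. */
    static boolean stepByStep(List<Person> people) {
        List<Person> males = people.parallelStream()
                .filter(p -> p.male)
                .collect(Collectors.toList());
        List<String> names = males.parallelStream()
                .map(p -> p.name)
                .collect(Collectors.toList());
        return names.parallelStream().anyMatch("Joe"::equals);
    }

    /** One pipeline: parallel processing is set up once and no intermediate lists are built. */
    static boolean singlePipeline(List<Person> people) {
        return people.parallelStream()
                .filter(p -> p.male)
                .map(p -> p.name)
                .anyMatch("Joe"::equals);
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Joe", true), new Person("Ann", false), new Person("Bob", true));
        System.out.println(stepByStep(people) + " " + singlePipeline(people)); // prints "true true"
    }
}
```

Both methods return the same answer; the second simply avoids rebuilding intermediate structures between the steps.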

In many scenarios, changing the pool size from the default value can give you performance benefits, especially if your tasks perform IO operations, such as file or database access, networking, etc. In such cases, increasing the number of threads in the pool is likely to help performance.

Bump PoolSize To Boost Performance

    import groovyx.gpars.GParsPool;

    GParsPool.withPool(50) {
        ...
    }
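As a rough illustration of why a larger pool helps blocking tasks, the following JDK-only Java sketch times the same batch of blocking tasks on a small and on a large fixed thread pool. It is not GPars code: PoolSizeSketch and runBlockingTasks are hypothetical names, and Thread.sleep stands in for real IO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: a JDK fixed thread pool, with sleep() simulating blocking IO.
class PoolSizeSketch {

    /** Runs taskCount blocking tasks on a pool of the given size; returns elapsed milliseconds. */
    static long runBlockingTasks(int poolSize, int taskCount, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                tasks.add(() -> {
                    Thread.sleep(blockMillis); // the thread is idle while "waiting for IO"
                    return id;
                });
            }
            long start = System.nanoTime();
            pool.invokeAll(tasks); // returns once every task has completed
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool of 2:  " + runBlockingTasks(2, 20, 50) + " ms");
        System.out.println("pool of 20: " + runBlockingTasks(20, 20, 50) + " ms");
    }
}
```

With two threads the twenty tasks run in ten sequential rounds, while twenty threads can wait on all of them at once. For CPU-bound work the effect would differ, so pool sizes are best chosen by measurement.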

Since the closures you provide to the parallel collection processing methods are executed frequently and concurrently, you may gain a further slight benefit from rewriting them in Java.

Actors

GPars actors are fast. DynamicDispatchActors and ReactiveActors are about twice as fast as the DefaultActors, since they don’t have to maintain an implicit state between subsequent message arrivals. The DefaultActors are, in fact, on a par in performance with actors from Scala, which you rarely hear of as being slow.

If top performance is what you’re looking for, a good start is to identify the following patterns in your actor code:

A Pattern To Look For

    actor {
        loop {
            react { msg ->
                switch (msg) {
                    case String: ...
                    case Integer: ...
                }
            }
        }
    }

A Better Replacement: DynamicDispatchActor

    messageHandler {
        when { String msg -> ... }
        when { Integer msg -> ... }
    }

The loop and react methods are rather costly to call.

Defining a DynamicDispatchActor or ReactiveActor as classes instead of using the messageHandler and reactor factory methods will also give you some more speed:

A Dynamic Sample

    class MyHandler extends DynamicDispatchActor {
        public void handleMessage(String msg) {
            ...
        }

        public void handleMessage(Integer msg) {
            ...
        }
    }

Now, convert that MyHandler class to Java to squeeze the last bit of performance from GPars.

Pool Adjustment

GPars allows you to group actors around thread pools, giving you the freedom to organize actors any way you like. It’s always worthwhile to experiment with the actor pool size and type.

FJPool usually gives better characteristics than DefaultPool, but seems to be more sensitive to the number of threads in the pool. Sometimes, using a ResizeablePool or ResizeableFJPool can help performance by automatically eliminating unneeded threads.

A Sample

    def attackerGroup = new DefaultPGroup(new ResizeableFJPool(10))
    def defenderGroup = new DefaultPGroup(new DefaultPool(5))

    def attacker = attackerGroup.actor {...}
    def defender = defenderGroup.messageHandler {...}
    ...

Agents

GPars Agents are even a bit faster in processing messages than actors. The advice to group agents wisely around thread pools, and then tune the pool sizes and types, applies to agents as well as actors. With agents, you may also benefit from submitting Java-written closures as messages.

Share Your Experience

The more we hear about GPars uses in the wild, the better we can adapt it for the future. Let us know how you use GPars and how it performs. Send us your benchmarks, performance comparisons or profiling reports to help us tune GPars for you. See the User Voices page (../User_Voices.html) for more details.

Hosted Environments

Hosted environments, such as Google App Engine, can impose additional restrictions on threading. For GPars to better integrate with these environments, the default thread factory and timer factory can be customized.

The GParsConfig class provides static initialization methods allowing third parties to register their own implementations of the PoolFactory and TimerFactory interfaces. These can then be used to create default pools and timers for Actors, Dataflow and PGroups.

Some Static Methods To Initialize Objects

    public final class GParsConfig {
        private static volatile PoolFactory poolFactory;
        private static volatile TimerFactory timerFactory;

        public static void setPoolFactory(final PoolFactory pool)

        public static PoolFactory getPoolFactory()

        public static Pool retrieveDefaultPool()

        public static void setTimerFactory(final TimerFactory timerFactory)

        public static TimerFactory getTimerFactory()

        public static GeneralTimer retrieveDefaultTimer(final String name, final boolean daemon)

        public static void shutdown()
    }

Custom factories should be registered immediately after application startup in order for Actors and Dataflow to be able to use them for their default groups.

Shutdown

The GParsConfig.shutdown() method can be used in managed environments to properly shut down all asynchronously running timers and free up the memory from all thread-local variables.

After a call to this method, the GPars library can no longer provide the declared services.

Compatibility

Some further compatibility issues can occur when running GPars in a hosted environment.

The most noticeable one is probably the lack of ForkJoinThreadPool support in GAE. Mechanisms like Fork/Join and GParsPool may not be available on some services as a result. However, GParsExecutorsPool, Dataflow, Actors, Agents and STM should work normally even when using managed non-Java SE thread pools.

User Guide To The Conclusion

This was quite a wild ride, wasn’t it?

Now, after reading this User Guide, you’re certainly ready to build fast, robust, reliable and concurrent applications.

You’ve seen that there are many concepts you can choose from and each has its own areas of applicability. The ability to pick the right concept to apply is key to being a successful developer.

If you feel you can do this with GPars, the mission of this User Guide has been accomplished.

Now, go ahead, use GPars and have fun!
