extending ruby by harnessing other languages

Post on 06-Aug-2015


Extending Ruby by harnessing other languages

Brendon McLean (@brendon9x)

What if Ruby doesn’t do everything well?

But you want to use Ruby anyway

A long time ago, in a startup far, far away…

The Challenge a.k.a “The value proposition”

To bring meaningful insight and intelligence to market research*

*In two weeks please

Solution: MVP on Ruby on Rails

The Requirement a.k.a “The Problem”

a.k.a Numerical Ruby Please

Dimensionality* *lots of it

RESP AGE GENDER USE_IOS USE_ANDROID CAT_PERSON
1 18-24 M 1 1 N
2 25-35 M N
3 35-50 F 1 1 Y
4 35-50 F Y
5 18-24 M 1 1 Y
6 25-35 M 1 Y
7 25-35 M 1 1 Y
8 35-50 F 1 N
9 18-24 F 1 Y
10 18-24 F 1 N
11 35-50 F 1 N
12 35-50 F 1 N
13 18-24 F 1 Y
14 35-50 M 1 N
15 18-24 F 1 N
16 35-50 M 1 N
17 35-50 M 1 1 N
N 25-35 M Y

up to 40GB* *per dataset

Loosely structured

Interrogate Anything* *goodbye clever caching strategy

Long story short…

Approx FLOPS (chart; axis ticks at 1,000; 1,000,000; 1,000,000,000; 1,000,000,000,000; series: Plain Ruby (2011), Bitset (2012), GSL (2012), NArray (2013))

Complexity inflection point

Performance Abstraction Power Complexity

Time to consider other options

If, hypothetically, we weren’t using Ruby, what else is out there?

Haskell: comes with free beard*

*beard must compile

NOBODY EXPECTED PYTHON!

Why Python?

Size of scientific community

Depth of ecosystem

Similar* attitude to usability and expressiveness first

*ish

Performance

Numpy

• Lineage goes back to 1995

• Array computing — vectorised operations for Python

• NArray is based on Numpy

• Is the bedrock upon which the rest of scientific Python is built

Vectorisation

Element by element:

$> array.reduce(&:+)

$> a.zip(b).map do |l, r| l * r end

Vectorised:

$> array.sum

$> a * b
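To make the contrast concrete, here is a toy sketch in plain Ruby. The `Vec` class is hypothetical and exists only for illustration: it shows the shape of a vectorised API, where the loop moves out of the caller's code and into the library. A real array library such as Numpy or NArray runs that loop in compiled code, which this toy does not, so it says nothing about speed.

```ruby
# Toy stand-in for an array library (hypothetical, illustration only).
# The element-wise loop lives inside the class, so callers write `a * b`.
class Vec
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Element-wise product of two vectors of equal length.
  def *(other)
    unless values.length == other.values.length
      raise ArgumentError, "length mismatch"
    end
    Vec.new(values.zip(other.values).map { |l, r| l * r })
  end

  # Sum of all elements.
  def sum
    values.reduce(0, :+)
  end
end

a = Vec.new([1, 2, 3])
b = Vec.new([4, 5, 6])

# Caller-side loop versus library-side loop: same result.
explicit   = a.values.zip(b.values).map { |l, r| l * r }
vectorised = (a * b).values
puts explicit == vectorised
puts (a * b).sum
```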

Pandas

• Built on Numpy

• Basically ports the best bits of R into Python

• Fast

• Cognitively simpler for general programmers

• Munging!

Bonus extras

• Scipy: Linear Algebra, FFT, Clustering, Stats

• IPython Notebooks

• Sympy: Computer Algebra System

• nltk: Natural Language Toolkit

• scikit-learn: Machine Learning

Strength of community

Total commits (chart; series: NArray, GSL, Pandas, Numpy; value labels: 12,588; 10,865; 193141)

Contributors (chart; series: NArray, GSL, Numpy, Pandas; value labels: 310; 249; 44)

Issues: Open and Closed (chart; series: GSL, NArray, Numpy, Pandas; value labels: 1051; 651; 181; 4681; 2691; 185)

Using Python from Ruby ❤️

Problem statement

Flexibility of Pandas

Speed of Numpy

Scales horizontally

Ruby ❤️ API

API Inspiration

ActiveRecord scopes: deferred, composable

API implementation problem: getting Ruby to run Python === getting Python to run Ruby

Ruby => Data => Python*

*or other

“Code is data”— people with LISP personality disorder

S-Expressions: (function arg1 arg2 arg3 ...)

Simple s-expression example

$> (+ 1 1) => 2

$> (find User 1 2)

$> 1 + 1 => 2

$> User.find(1, 2)

$> User.send(:find, 1, 2)

Example with nesting

$> 1 + 2 * 2 => 5

$> (+ 1 (* 2 2)) => 5
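A nested s-expression such as (+ 1 (* 2 2)) can be evaluated with a few lines of Ruby, in the same spirit as `User.send(:find, 1, 2)`. This is a minimal sketch, not the talk's implementation. It assumes an s-expression is written as a nested Ruby array of the form `[operator, receiver, *arguments]`; the `evaluate` helper is a hypothetical name.

```ruby
# Evaluate an s-expression written as a nested Ruby array.
# Assumed convention: [operator, receiver, *arguments].
def evaluate(sexp)
  # Atoms (numbers, strings, ...) evaluate to themselves.
  return sexp unless sexp.is_a?(Array)

  operator, receiver, *args = sexp
  # Evaluate the inner expressions first, then dispatch the operator
  # as an ordinary method call on the evaluated receiver.
  evaluate(receiver).public_send(operator, *args.map { |arg| evaluate(arg) })
end

puts evaluate([:+, 1, 1])          # (+ 1 1)
puts evaluate([:+, 1, [:*, 2, 2]]) # (+ 1 (* 2 2))
```

Because inner arrays are evaluated before the outer call, nesting needs no special handling.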

ActiveLISP

User.select(:state).
  where(no_spam: false).
  group(:state).
  count

(count
  (group :state
    (where :no_spam false
      (select :state User))))
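One way to get from the chained Ruby call to the nested s-expression is a deferred query object: each method wraps the previous expression instead of running anything. The sketch below is an assumption about how such a builder could look, not the code shown in the talk. `LazyQuery`, `wrap`, and `to_lisp` are hypothetical names, and the root `"User"` is just a string standing in for a data source.

```ruby
# Hypothetical deferred query: each call wraps the previous expression
# and returns a new query. Nothing is executed.
class LazyQuery
  attr_reader :sexp

  def initialize(sexp)
    @sexp = sexp
  end

  def select(*columns)
    wrap(:select, *columns)
  end

  def where(conditions)
    # { no_spam: false } becomes the flat pair :no_spam, false
    wrap(:where, *conditions.to_a.flatten(1))
  end

  def group(*columns)
    wrap(:group, *columns)
  end

  def count
    wrap(:count)
  end

  private

  # Build [function, *args, inner_expression] around the current expression.
  def wrap(function, *args)
    LazyQuery.new([function, *args, sexp])
  end
end

# Render nested arrays in Lisp notation: function names are printed bare,
# symbol arguments keep their leading colon.
def to_lisp(node)
  return node.is_a?(Symbol) ? node.inspect : node.to_s unless node.is_a?(Array)

  function, *args = node
  "(#{[function.to_s, *args.map { |arg| to_lisp(arg) }].join(' ')})"
end

query = LazyQuery.new("User").select(:state).where(no_spam: false).group(:state).count
puts to_lisp(query.sexp)
```

Since every step returns a new `LazyQuery`, partial queries can be stored and extended later, which is the composability that ActiveRecord scopes offer.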

Ruby => S-Expressions

S-Expressions => Python
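Once a query is plain data, handing it to another runtime is a serialisation problem. The talk does not say which wire format was used; the sketch below assumes JSON purely for illustration, using Ruby's standard `json` library. The `encode` helper is a hypothetical name.

```ruby
require "json"

# Serialise an s-expression (nested arrays) to a JSON string that any
# JSON-capable runtime, Python included, can parse.
# Assumption: JSON is the wire format; the talk does not specify one.
def encode(sexp)
  JSON.generate(sexp)
end

sexp = [:count, [:group, :state, [:where, :no_spam, false, [:select, :state, "User"]]]]
payload = encode(sexp)
puts payload

# Decoding shows what the receiving side would see: Ruby symbols
# arrive as plain strings.
p JSON.parse(payload)
```

Note that JSON has no symbol type, so `:state` and `"User"` both arrive as strings; a real protocol would need its own convention to tell column names from values.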

Does have limitations*

Added benefits

Optimisation (tree rewrites)

Automatic query sharding

Target multiple backends through a common API
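Because the query is a tree of plain data, it can be rewritten before it is sent anywhere. As a small sketch of what a tree rewrite could look like, the hypothetical rule below fuses directly nested `where` calls into one, so a backend would filter once instead of twice. The rule and the `fuse_wheres` name are assumptions for illustration; the talk does not list its actual optimisations. Expressions use the layout `[function, *args, inner_expression]`.

```ruby
# Hypothetical rewrite rule:
#   (where k2 v2 (where k1 v1 source))  =>  (where k1 v1 k2 v2 source)
def fuse_wheres(node)
  return node unless node.is_a?(Array) && !node.empty?

  # Rewrite children first so chains of any depth collapse bottom-up.
  function, *args = node.map { |child| fuse_wheres(child) }
  inner = args.last

  if function == :where && inner.is_a?(Array) && inner.first == :where
    # Keep the inner conditions first, then append the outer ones.
    [:where, *inner[1...-1], *args[0...-1], inner.last]
  else
    [function, *args]
  end
end

query = [:count, [:where, :b, 2, [:where, :a, 1, "User"]]]
p fuse_wheres(query)
```

The same traversal pattern could host other rules, since each one only has to match a node shape and return a replacement.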

Enough talking…

Live Demo

Thanks

• Min RK for the initial iRuby Kernel

• Daniel Mendler for continued work on iRuby Kernel

• My Team @ Intellection

• And all the gems!

London, Cape Town

http://www.public-domain-image.com/architecture/bridge/slides/bridge-london-england.html

intellection
