Hosting and Using “R” with Azure ML

November 18, 2014

Meetup Azure Machine Learning

Why host “R” in Azure ML

• R has great depth and breadth in many areas

• Very high value but not always easy to transition to a pluggable solution callable by other processes or services

• Can be combined with existing Azure ML modules (mix and match)

• Or simply host a working R solution as a web service

Common challenge

Focus on right hand side…

One Azure ML module to learn and use

More input ports than usual

Output ports for “data” and R device output

Adding additional “R” packages/scripts…

From script to running web service

Using the web service…






Next meetup – will be mid-January 2015

• Will post slides and script from tonight to


• Remaining slides have R script and illustrations.

R scripts used last night

• Loading and referencing an external R package

• Note: make sure to follow the steps on slide 8

# this is trivial and just used to show package load and test

install.packages("src/",lib=".",repos=NULL,verbose=TRUE)

install.packages("src/",lib=".",repos=NULL,verbose=TRUE)

install.packages("src/",lib=".",repos=NULL,verbose=TRUE)

library(skmeans,lib.loc=".",verbose=TRUE)

# our package and libraries should be loaded up

#stuff <- packages.installed(skmeans)

#dataset1 <- maml.mapInputPort(1) # class: data.frame

#dataset2 <- maml.mapInputPort(2) # class: data.frame

samp <- matrix(,size=20*50,replace=TRUE),nrow=20,ncol=500,dimnames=list(1:20,1:500))

fit <- skmeans(samp,5)

result <- data.frame(list(rownames(samp),fit$cluster),row.names=NULL)

colnames(result) <- c("sample row","cluster")

print(result) # R console output
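The pattern in the script above (build a sample matrix, cluster its rows, report one cluster per row) can be tried without the external package. This is only a sketch under stated assumptions: the pool `0:1` passed to `sample()` is an assumption, since the slide does not show what was sampled, the matrix is 20 by 50 so that `size` matches its dimensions, and base R `kmeans` stands in for `skmeans`.

```r
set.seed(1)  # make the random sample reproducible

# ASSUMPTION: the pool 0:1 is illustrative; the slide does not show it.
samp <- matrix(sample(0:1, size = 20 * 50, replace = TRUE),
               nrow = 20, ncol = 50,
               dimnames = list(1:20, 1:50))

# ASSUMPTION: base R kmeans is used only as a stand-in for skmeans(samp, 5).
fit <- kmeans(samp, centers = 5)

# One line per sample row with its assigned cluster.
result <- data.frame(rownames(samp), fit$cluster, row.names = NULL)
colnames(result) <- c("sample row", "cluster")
print(result)  # R console output
```

Swapping `kmeans` back to `skmeans` should only require the package to be installed and loaded as on the slide.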

Simple kmeans cluster

mydata <- maml.mapInputPort(1) # get our data from the R script input module instead of inline – this is the web service input signature

# parse and structure the input data to become a dataframe for the clustering

data.split <- strsplit(mydata[1,1], ",")[[1]]

data.split <- sapply(data.split, strsplit, ";", simplify = TRUE)

data.split <-

data.split <- data.matrix(data.split)

data.split <- data.frame(data.split)

# K-Means Cluster Analysis

fit <- kmeans(data.split, mydata$k) # k-cluster solution

# get cluster means


# append cluster assignment

mydatafinal <- data.frame(t(fit$cluster))


colnames(mydatafinal) <- paste("V",1:ncol(mydatafinal),sep="")


maml.mapOutputPort("mydatafinal") # this will become the web service publishing port – i.e. what is returned – output must be a dataframe…
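The parsing and clustering steps can be checked in a plain R session before wiring them into Azure ML. The following is a minimal sketch, not the script from the slides: it assumes the input is a one-row data frame whose `value` column separates rows with commas and values with semicolons, and the names `input`, `rows`, `values`, `points` and `assignments` are illustrative.

```r
# Sample input in the "rows separated by commas, values by semicolons" layout.
input <- data.frame(
  value = "1; 3; 5; 6; 7; 7, 5; 5; 6; 7; 2; 1, 3; 7; 2; 9; 56; 6, 1; 4; 5; 26; 4; 23, 15; 35; 6; 7; 12; 1, 32; 51; 62; 7; 21; 1",
  k = 5,
  stringsAsFactors = FALSE
)

# Split into rows, then into values, and convert the text to numbers.
rows <- strsplit(input$value[1], ",")[[1]]
values <- lapply(strsplit(rows, ";"), function(x) as.numeric(trimws(x)))
points <- as.data.frame(do.call(rbind, values))

set.seed(42)  # kmeans starts from randomly chosen centers
fit <- kmeans(points, centers = input$k[1])

# One row, with one column per input row holding its cluster number.
assignments <- data.frame(t(fit$cluster))
colnames(assignments) <- paste("V", seq_len(ncol(assignments)), sep = "")
print(assignments)
```

Returning a single row keeps the web service response flat: one value per clustered input row.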

Input schema and sample data for kmeans: this is hosted in the “R” script module

mydata <- data.frame(value = "1; 3; 5; 6; 7; 7, 5; 5; 6; 7; 2; 1, 3; 7; 2; 9; 56; 6, 1; 4; 5; 26; 4; 23, 15; 35; 6; 7; 12; 1, 32; 51; 62; 7; 21; 1", k=5, stringsAsFactors=FALSE)

maml.mapOutputPort("mydata"); # this is the key as it wires the sample schema above to the downstream receiver (see the next illustration)
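To rehearse this wiring outside Azure ML, the two port functions can be replaced with local stand-ins. The definitions below are hypothetical: they are not the Azure ML implementations, and they only mimic the behaviour the slides rely on, namely that the input port returns a data frame and the output port takes the name of a data frame variable as a string. The `local_ports` environment is likewise an invented helper.

```r
# HYPOTHETICAL local stand-ins, not the Azure ML implementations.
local_ports <- new.env()

maml.mapInputPort <- function(port) {
  # Return the data frame registered for the given input port number.
  get(paste0("in", port), envir = local_ports)
}

maml.mapOutputPort <- function(name) {
  # Look the variable up by name in the caller and insist on a data frame.
  value <- get(name, envir = parent.frame())
  stopifnot(is.data.frame(value))
  assign("out", value, envir = local_ports)
  invisible(value)
}

# Register a small schema sample on input port 1.
assign("in1",
       data.frame(value = "1; 3, 5; 5", k = 2, stringsAsFactors = FALSE),
       envir = local_ports)

mydata <- maml.mapInputPort(1)
maml.mapOutputPort("mydata")  # the string must name an existing data frame
print(local_ports$out)
```

With these stand-ins, passing a name that does not refer to a data frame stops with an error, which makes a mismatch between the variable name and the string easy to spot before publishing.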

A model that is ready to be published as a web service. Note the publishing icons on the lower module's input and output ports.

Input Schema

Simple kmeans cluster script

