EMbC: Expectation Maximization binary Clustering

EMbC R-package (v.1.1) Tutorial

J.Garriga, J.R.Palmer, A.Oltra, F.Bartumeus

ICREA - Movement Ecology Lab (CEAB-CSIC)

28 April 2014



Behavioural Annotation of Movement Trajectories

➲ segmentation/labelling of a trajectory into behavioural modes

➲ approach
✔ estimate movement variables (e.g. velocity, turning angle, tortuosity, first-passage time) and apply a classification algorithm

➲ challenges
✔ unsupervised problem
✔ parameterization and prior assumptions
✔ uncertainty in the data (geopositioning inaccuracies, sampling heterogeneity)
✔ capturing sufficiently general and biologically meaningful semantics

EMbC

➲ variant of the Gaussian mixture model (GMM) maximum likelihood estimation algorithm (EMC)

➲ EMC strengths
✔ unsupervised, multivariate clustering algorithm

➲ EMC weaknesses
✔ needs prior assumptions or restarts
✔ limited interpretability of the output

➲ EMbC novelty
✔ accounts for potential uncertainty in the data
✔ binary clustering: each variable takes either a High or a Low value (meaningful clustering)
✔ number of clusters bounded by 2^(number of variables)
✔ parameter free, no prior assumptions
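As background for the GMM bullet above, the standard EM loop for a Gaussian mixture can be sketched in base R. This is a minimal illustration of plain EM clustering on one variable with two components, not the EMbC variant itself; the function name `em_gmm_1d`, its arguments and the simulated data are assumptions made for this example only.

```r
# Hypothetical helper (not part of EMbC): plain EM for a 1-D, two-component GMM
em_gmm_1d <- function(x, max_itr = 200, tol = 1e-8) {
  n  <- length(x)
  mu <- as.numeric(quantile(x, c(0.25, 0.75)))  # crude initial means
  sg <- rep(sd(x), 2)                           # initial standard deviations
  pk <- c(0.5, 0.5)                             # initial mixing proportions
  loglik <- numeric(0)
  for (itr in seq_len(max_itr)) {
    # E-step: weighted densities, then normalised membership weights
    dens <- cbind(pk[1] * dnorm(x, mu[1], sg[1]),
                  pk[2] * dnorm(x, mu[2], sg[2]))
    loglik[itr] <- sum(log(rowSums(dens)))
    W <- dens / rowSums(dens)
    # M-step: re-estimate means, standard deviations and proportions
    nk <- colSums(W)
    mu <- colSums(W * x) / nk
    sg <- sqrt(colSums(W * outer(x, mu, "-")^2) / nk)
    pk <- nk / n
    # stop when the log-likelihood no longer improves
    if (itr > 1 && abs(loglik[itr] - loglik[itr - 1]) < tol) break
  }
  list(mu = mu, sd = sg, prop = pk, W = W, L = loglik)
}

set.seed(42)
x   <- c(rnorm(100, mean = 0), rnorm(100, mean = 10))  # simulated data
fit <- em_gmm_1d(x)
print(round(fit$mu, 2))
```

In this sketch `fit$W` holds the membership weights of each point and `fit$L` the log-likelihood at each step; the initial guesses for `mu` and `sg` are the kind of prior choice listed above as an EMC weakness.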

EMbC R-package

API

➲ classes
✔ objects with attributes (slots) containing input/output variables

➲ constructors
✔ commands to build the objects (process inputs and compute outputs)

➲ methods
✔ commands to manipulate the objects (give access to attributes for analysis or visualization)

➲ parameters
✔ modifiers of the commands

EMbC R-package API

Classes

➲ binClst (dataSet)
➲ binClstPath (trajectory)
➲ binClstMove (Move obj.)
➲ binClstPathStck (set of trajectories)
➲ binClstMoveStck (set of Move objs.)
➲ Move (Move R-package)

EMbC R-package API

Constructors

➲ general (broad applicability)

binClst ← embc(signature,parameters)
(v.1.1 only bivariate)

➲ specific for movement trajectory analysis (behavioural annotation)

binClstPath ← stbc(signature,parameters)
binClstMove ← stbc(signature,parameters)
(v.1.1 only speed/turn analysis)

constructor

binClst ← embc(signature,params)

➲ signature
✔ (matrix) data points matrix (n data points, m variables)

➲ returns
✔ binClst

➲ example
> mybc <- embc(myDataMtx)

➲ parameters
✔ (U,stDv,maxItr,vrb)
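The input expected by `embc` is an n x m numeric matrix, which can be prepared as in the sketch below. The simulated speed and turn values are assumptions made for illustration, and the call is only attempted if an EMbC package happens to be installed; later releases may differ from the v.1.1 interface described in these slides.

```r
# Simulated bivariate data set (illustrative values only)
set.seed(1)
n <- 200
myDataMtx <- cbind(
  speed = c(rexp(n / 2, rate = 2), rexp(n / 2, rate = 0.2)),  # slow and fast regimes
  turn  = runif(n, min = 0, max = pi)                         # turning angles in radians
)

# The constructor takes a numeric matrix: n data points (rows), m variables (columns)
stopifnot(is.matrix(myDataMtx), is.numeric(myDataMtx))

# Run the clustering only when the package is available
if (requireNamespace("EMbC", quietly = TRUE)) {
  mybc <- EMbC::embc(myDataMtx)
  print(table(mybc@A))  # number of data points per cluster
}
```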

constructor

binClstPath/binClstMove ← stbc(signature,params)

➲ signature
✔ (data.frame) trajectory as (timeStamps, lons, lats, ...)

➲ returns
✔ binClstPath

➲ example
> mybcp <- stbc(myPathDF)

➲ signature
✔ (Move object)

➲ returns
✔ binClstMove

➲ example
> mybcm <- stbc(myMoveObj)

➲ parameters
✔ inherited (stDv,maxItr,vrb) (not U!)
✔ specific (spdLim,smth)

binClst@slots

➲ input
✔ mybc@X:: matrix(n,m); data points

➲ internal
✔ mybc@n:: integer; number of data points
✔ mybc@m:: integer; number of variables
✔ mybc@k:: integer; max. number of clusters (2^m)

➲ output
✔ mybc@W:: matrix(n,k); likelihood weights
✔ mybc@L:: numeric(iters); step likelihoods
✔ mybc@R:: matrix(k,2*m); delimiters
✔ mybc@A:: numeric(n); cluster assignments (1:LL, 2:LH, 3:HL, 4:HH)
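The cluster codes listed for `mybc@A` can be read as a binary number over the High/Low state of each variable. The sketch below reproduces that encoding with fixed, hand-picked thresholds; EMbC itself estimates its delimiters (slot `R`) rather than taking them as input, so the helper `binary_assign`, the `thr` values, and the assumption that the first letter refers to the first variable are all illustrative.

```r
# Hypothetical helper (not part of EMbC): encode High/Low flags as cluster numbers
binary_assign <- function(X, thr) {
  m    <- ncol(X)
  high <- sweep(X, 2, thr, ">") * 1      # 1 where a value is High, 0 where Low
  as.vector(1 + high %*% 2^((m - 1):0))  # first variable is the most significant bit
}

X <- cbind(speed = c(0.2, 0.3, 5.0, 6.0),
           turn  = c(0.1, 2.5, 0.2, 2.8))
A <- binary_assign(X, thr = c(1, 1))     # thresholds chosen by hand for this example
labels <- c("LL", "LH", "HL", "HH")[A]
print(data.frame(X, A = A, label = labels))
```

With m variables the same scheme yields at most 2^m distinct codes, which is the bound stored in `mybc@k`.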

binClstPath@slots

➲ all inherited from binClst

➲ self

✔ mybcp@pth:: data.frame(dTM,lons,lats,...); trajectory
✔ mybcp@span:: numeric(n); span times
✔ mybcp@dist:: numeric(n); distances
✔ mybcp@bear:: numeric(n); bearings
✔ mybcp@U:: matrix(n,m); uncertainty
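To make the span, distance and bearing slots concrete, here is one common way to derive such step quantities, and from them speed and turning angle, from a (timeStamps, lons, lats) data.frame. It is a sketch under stated assumptions: the helper names `path_steps` and `turn_angles`, the haversine formula, the Earth radius value and the n-1 step convention are choices made for this example, and the package's internal computation may differ.

```r
# Hypothetical helper: per-step time span (s), distance (m) and bearing (rad)
path_steps <- function(pth, radius = 6371000) {
  tm  <- as.numeric(pth[[1]])        # time stamps in seconds
  lon <- pth[[2]] * pi / 180         # longitudes in radians
  lat <- pth[[3]] * pi / 180         # latitudes in radians
  i   <- seq_len(nrow(pth) - 1)
  dlon <- lon[i + 1] - lon[i]
  dlat <- lat[i + 1] - lat[i]
  # haversine great-circle distance
  a    <- sin(dlat / 2)^2 + cos(lat[i]) * cos(lat[i + 1]) * sin(dlon / 2)^2
  dist <- 2 * radius * asin(pmin(1, sqrt(a)))
  # initial bearing, measured clockwise from north
  bear <- atan2(sin(dlon) * cos(lat[i + 1]),
                cos(lat[i]) * sin(lat[i + 1]) -
                  sin(lat[i]) * cos(lat[i + 1]) * cos(dlon))
  data.frame(span = diff(tm), dist = dist, bear = bear)
}

# Hypothetical helper: absolute change of bearing between steps, wrapped to [0, pi]
turn_angles <- function(bear) {
  d <- diff(bear)
  abs((d + pi) %% (2 * pi) - pi)
}

# Toy trajectory: one degree east along the equator, then one degree north
pth <- data.frame(
  dTm = as.POSIXct(c(0, 3600, 7200), origin = "1970-01-01", tz = "UTC"),
  lon = c(0, 1, 1),
  lat = c(0, 0, 1)
)
steps <- path_steps(pth)
speed <- steps$dist / steps$span     # metres per second
turn  <- turn_angles(steps$bear)     # radians
print(steps)
```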

binClstMove@slots

➲ all inherited from binClstPath

➲ all inherited from Move

EMbC R-package API

Methods

➲ clustering information (statistics)

➲ plotting (input/output variables)

➲ visualization (clustering, annotated path)

➲ comparison (clusters, labels)

methods

stts(signature)

➲ clustering statistics (mean, stDv, marginal distribution)

➲ signature
✔ (binClst), (binClstPath), (binClstMove)

➲ parameters
✔ none

➲ example
> stts(mybc)

methods

sctr(signature)

➲ bivariate clustering scatterplot

➲ signatures
✔ (matrix)
✔ (binClst), (binClstPath), (binClstMove)

➲ parameters
✔ none

➲ example
> sctr(mybcp)
> sctr(mybc@X)

binClstPath methods

lblp(signature)

➲ plots labeling profile

➲ signature
✔ (binClstPath), (binClstMove)

➲ parameters
✔ none

➲ example
> lblp(mybcp)
> lblp(bcp1,bcp2)
> lblp(mybcp,expLbl)

methods

binClstPath/binClstMove ← smth(signature)

➲ a posteriori smoothing of local labelling

➲ signatures
✔ (binClstPath), (binClstMove)

➲ parameters
✔ none

➲ returns
✔ a smoothed copy of the input binClstPath/binClstMove instance

➲ example
> mysmoothedbcp <- smth(mybcp)

binClst methods

comparing/validating results

➲ comparison of clustering scatterplots
✔ sctr(binClst,binClst)
✔ sctr(binClst,numeric)

➲ comparison of labellings
✔ lblp(binClst,binClst)
✔ lblp(binClst,numeric)

➲ parameters
✔ none

➲ example
> sctr(bc1,bc2)
> sctr(bc1,exp)
> lblp(bc1,exp)
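Alongside the visual comparisons above, agreement between a clustering and a numeric expert labelling can also be summarised numerically. The sketch below uses base R only; the vectors `A` and `expLbl` are made-up values standing in for a `@A` slot and an expert labelling, and the summary is not a method provided by the package.

```r
# Made-up labellings on the 1:LL, 2:LH, 3:HL, 4:HH scale
A      <- c(1, 1, 2, 3, 4, 4, 3, 2)   # stands in for a cluster assignment vector
expLbl <- c(1, 1, 2, 3, 4, 3, 3, 2)   # stands in for an expert labelling

# Confusion table: rows are clustering labels, columns are expert labels
confusion <- table(clustering = factor(A, levels = 1:4),
                   expert     = factor(expLbl, levels = 1:4))
print(confusion)

# Fraction of data points on which both labellings agree
agreement <- mean(A == expLbl)
print(agreement)
```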

binClstPath methods

pkml(signature) / bkml(signature)

➲ generates pointwise/burstwise kmlDoc (Google-Earth)

➲ signature
✔ (binClstPath), (binClstMove)

➲ parameters
✔ folder::character ('~/embcDocs')
path where to save the kmlDoc; file name “system_datetime.kml”
✔ markerRadius::numeric (5)
size of markers in pixels
✔ display::bool (FALSE)
open Google-Earth

➲ example
> pkml(mybcp,display=TRUE)
> bkml(mybcp)

binClstPath methods

pmap(signature) / bmap(signature)

➲ generates pointwise/burstwise htmlDoc (browser)

➲ signature
✔ (binClstPath), (binClstMove)

➲ parameters
✔ folder::character ('~/embcDocs')
path where to save the htmlDoc; file name “system_datetime.html”
✔ markerRadius::numeric (5)
size of markers in pixels
✔ display::bool (FALSE)
open browser
✔ mapType::character ('SATELLITE')
background map (see doc. help)

➲ example
> pmap(mybcp,display=TRUE,markerRadius=25)
> bmap(mybcp,display=TRUE)

binClstPath methods

varp(signature)

➲ plots binClstPath/binClstMove data

➲ signature
✔ (binClstPath), (binClstMove)
✔ (matrix)
✔ (numeric)

➲ parameters
✔ none

➲ example
> varp(mybcp)
> varp(mybcp@X)
> varp(mybcp@U)
> varp(mybcp@L)

binClstPath methods

view(signature)

➲ fast visualization of the trajectory

➲ signature
✔ (binClstPath), (binClstMove)

➲ parameters
✔ none

➲ example
> view(mybcp)

binClst methods

setc(signature,params)

➲ sets a k-color palette from a color family in the RColorBrewer R-package

➲ signature
✔ (binClst), (binClstPath), (binClstMove)

➲ parameters
✔ fam:: character
a color family name from the RColorBrewer R-package

➲ examples
> brewer.pal.info
lists all color families (name, type, max.colors)
> display.brewer.pal(8,'RdYlBu')
> setc(mybc,'RdYlBu')
alternatively:
> mybc@C <- c("#FDAE61","#D7191C","#ABD9E9","#2C7BB6","#525252")

embc/stbc constructors

parameters (iteration process control)

➲ maxItr = integer
✔ stopping criterion in case of slow convergence
✔ default, (200)
✔ example
> mybc <- stbc(myPthDf,maxItr=1)

➲ vrb = integer
✔ if vrb=0 shows compact step information
✔ if vrb=1 shows expanded (statistics) step information
✔ default, (0)
✔ example
> mybc <- embc(X=data,stDv=c(1,1),vrb=1)

stbc constructor

specific parameters

➲ spdLim:: float
✔ speed limit for outliers (m/s)
✔ default, (40)
✔ example
> mybcp <- stbc(myPthDf,spdLim=20)

➲ smth:: float
✔ smoothing time window (hours)
✔ default, (0)
✔ example
> mybcp <- stbc(myPthDf,smth=6)
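The role of `spdLim` as a speed threshold can be illustrated with a few lines of base R. This is only a sketch of the thresholding idea: the helper `flag_speed_outliers` and the speed values are hypothetical, and the slides do not state how the package treats points that exceed the limit.

```r
# Hypothetical helper: TRUE where a speed (m/s) exceeds the limit
flag_speed_outliers <- function(speed, spdLim = 40) {
  speed > spdLim
}

speed   <- c(1.2, 3.5, 55.0, 0.8, 41.0)                 # made-up speeds in m/s
outlier <- flag_speed_outliers(speed)                   # default limit of 40 m/s
strict  <- flag_speed_outliers(speed, spdLim = 3)       # a much lower limit
print(which(outlier))
```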

embc/stbc constructors

parameters (clustering control)

➲ U = matrix(n,m)
✔ uncertainty matrix (same dimensions as X!)
✔ default, (matrix(rep(1,n*m),c(n,m)))
✔ example
> mybc <- embc(myDataMtx,U=myUncMtx)

➲ stDv = numeric(m)
✔ minimum standard deviation (variable specific)
✔ default, (rep(10e-32,m))
✔ example
> mybc <- embc(myDataMtx,stDv=myMinStDv)
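The two clustering-control inputs can be built as in the following sketch, which starts from the default values quoted above. The data values are made up, and treating entries below 1 as marking less reliable data points is an assumption of this example, since the slides do not spell out the scale of `U`.

```r
# Made-up bivariate data: 4 data points, 2 variables
myDataMtx <- cbind(speed = c(0.4, 0.6, 7.5, 8.1),
                   turn  = c(2.9, 0.2, 0.1, 3.0))
n <- nrow(myDataMtx)
m <- ncol(myDataMtx)

# Default uncertainty matrix as quoted in the slides: all ones, same shape as X
U_default <- matrix(rep(1, n * m), c(n, m))

# Custom matrix: lower values for data point 3 (assumed to mean "less reliable")
myUncMtx <- U_default
myUncMtx[3, ] <- 0.5

# Default minimum standard deviation, one value per variable
myMinStDv <- rep(10e-32, m)

# U must have the same dimensions as the data matrix
stopifnot(identical(dim(myUncMtx), dim(myDataMtx)), length(myMinStDv) == m)

# Call form shown in the slides (v.1.1), left commented out here:
# mybc <- embc(myDataMtx, U = myUncMtx, stDv = myMinStDv)
```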

Thanks for your attention!

➲ acknowledgments
✔ Raymond Klaassen (Migration Ecology Group, Department of Animal Ecology, Lund University, Sweden), who has kindly provided the Osprey trajectory
✔ Dina Dechmann (Max Planck Institute for Ornithology, MPIO, Germany), who has kindly provided the Straw-colored fruit bat data

➲ test package
✔ ask at [email protected]
✔ available for linux/windows (please, indicate your OS)
✔ any feedback welcome

➲ questions?