testing the efficacy of unsupervised machine learning techniques to … · 2019-09-02 ·...

83
I Testing the Efficacy of Unsupervised Machine Learning Techniques to Infer Shark Behaviour from Accelerometry Data Submitted by Courtney E. Norris, BSc This thesis is presented for the degree of Bachelor of Science Honours in Conservation and Wildlife Biology School of Veterinary and Life Sciences, of Murdoch University, 2019

Upload: others

Post on 11-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

I

Testing the Efficacy of Unsupervised Machine Learning

Techniques to Infer Shark Behaviour from

Accelerometry Data

Submitted by

Courtney E. Norris, BSc

This thesis is presented for the degree of

Bachelor of Science Honours in Conservation and Wildlife Biology

School of Veterinary and Life Sciences, of Murdoch University, 2019

Page 2: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

II

Declaration

I declare that this thesis is my own account of my research and contains

as its main content work which has not previously been submitted for a

degree at any tertiary education institution.

Courtney Norris

Page 3: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

III

Table of contents

Abstract ...................................................................................................................... VI

Acknowledgements .................................................................................................. VII

Introduction .................................................................................................................. 1

Animal behaviour ..................................................................................................... 1

Biologging ................................................................................................................ 1

Using accelerometry to study animal behaviour ...................................................... 2

Machine learning for animal behaviour classification ............................................. 5

Supervised machine learning ................................................................................... 5

Unsupervised machine learning ............................................................................... 8

K-means clustering............................................................................................. 10

Hidden Markov models ...................................................................................... 14

Current study and objectives .................................................................................. 17

Methods ...................................................................................................................... 18

Study site and subject species ................................................................................ 18

Tag package and procedure .................................................................................... 19

Capture and attachment method ............................................................................. 20

Data collection ....................................................................................................... 21

Data pre-processing ................................................................................................ 22

K-means clustering................................................................................................. 24

K-means performance evaluation ...................................................................... 27

Page 4: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

IV

Mixed model analysis ........................................................................................ 28

Hidden Markov model ........................................................................................... 31

The HMM framework ........................................................................................ 31

Model formulation ............................................................................................. 33

Fitting the HMM ................................................................................................ 34

Validation of the HMM ...................................................................................... 35

Global decoding ................................................................................................. 36

HMM performance evaluation ........................................................................... 36

Using a GLMM to analyse the performance of the HMM................................. 36

Results ........................................................................................................................ 38

Behavioural trials ................................................................................................... 38

K-means clustering................................................................................................. 39

Classification performance................................................................................. 39

Mixed effects model to analyse the pseudo-predictions from K-means clustering

............................................................................................................................ 42

Hidden Markov model ........................................................................................... 45

Classification performance................................................................................. 50

Mixed effects model to analyse the pseudo-predictions from the HMM........... 52

Discussion .................................................................................................................. 56

Classifier performance for unsupervised ML methods .......................................... 56

K-means clustering............................................................................................. 56

Page 5: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

V

Hidden Markov model ....................................................................................... 58

Comparing the performance of unsupervised machine learning methods to a

previously supervised machine learning method for behavioural classification of

shark behaviour ...................................................................................................... 61

Limitations and future research .............................................................................. 63

Conclusion ............................................................................................................. 65

Page 6: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

VI

Abstract

Biologging is becoming a powerful tool in the study of free-ranging animal behaviour.

Accelerometers play an important role particularly for cryptic aquatic species by

facilitating the measurement of animal body movement and thus, behaviour. However,

our ability to collect large and complex data sets is surpassing our ability to analyse

them, prompting a need to develop methodologies for automated behavioural

classification. Unsupervised machine learning is particularly useful for behavioural

classification where direct observations to link patterns of acceleration to animal

behaviour are not always attainable. We tested the ability of unsupervised machine

learning to classify shark behaviour by applying two common unsupervised

approaches, K-means clustering and Hidden Markov models (HMM), to ground-

truthed accelerometry data collected from captive juvenile lemon sharks (Negaprion

brevirostris). Although K-means clustering demonstrated low classification

performance, the HMM performed well in distinguishing broad categories in

behaviour (resting vs swimming), but generally had poor performance in rare and more

complex behaviours (e.g. prey handling or burst swimming). This study is one of the

first to validate the use of common unsupervised machine learning algorithms and

lends further support to their use in the study of behaviour in free-ranging animals,

while also showing limitations in their ability to discern complex behaviours.

Page 7: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

VII

Acknowledgements

Firstly, I would like to thank my supervisors, Adrian Gleiss and Nicola Armstrong,

for your inspiration, support and encouragement. Thank you for all your knowledge

and advice and thank you for being so patient with my migraines. I truly loved this

work and I am extremely grateful for the huge push into statistics – you have helped

open doors for me that would have been otherwise non-existent. Thank you for

everything.

Thank you to all my colleagues at Murdoch University. Evan Byrnes for the great trip

to Germany with Lauran, and all your help with the HMM’s, this paper would not

have happened without you. Jenna Hounslow and Oliver Jewell for the constant

support, the hours you have spent reading my work, and all of your advice on honours

life, thank you so much. And Renae, for going through this year with me every step of

the way, thank you for your friendship and support. And of course, Vianey Leos-

Barajas for all your help with the HMM, I cannot thank you enough.

To my wonderful fiancé, thank you for all the emotional support this year, you have

been my rock and I am so lucky to have a man like you by my side. To my wonderful

San Churro family, particularly Chelsea, Mac, Jo, Indi and Anna - thank you for all

your support and understanding with my migraines this year, it means the world to

me. Mummy and daddy dearest, even from miles away you continue to be the best

support system and I thank you for that. Lastly, Lauren and Rebecca, I cannot thank

you enough for your constant love and support.

Page 8: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

1

Introduction

Animal behaviour

Understanding how animals interact with, and respond to their surrounding

environment has become increasingly significant as the number of threatened species

continues to rise, and our ecosystems collapse (Brooker et al., 2016). Studying

behavioural aspects such as foraging, reproduction, migration and habitat use provide

significant insight for wildlife management and conservation (Yoda, 2019). For

example, by investigating nestling development, migration and recruitment of the

short-tailed albatross (Phoebastria albatrus), Deguchi et al. (2014) applied adaptive

management techniques and enabled the successful translocation of the threatened

seabird. Animal behaviour was traditionally studied in captivity, mostly due to rare or

cryptic species, migratory movements, or environments that limit direct observation

of animal behaviour (Boyd et al., 2004; Campbell et al., 2013). However, it quickly

became clear that captive environments can hinder the physical and psychological

needs of an animal, and result in the display of abnormal behaviours, or limited

representation of natural behaviours (Boyd et al., 2004; Mason, 1991). Therefore, the

development of new techniques was required to remotely observe the behaviour of

free-ranging wildlife.

Biologging

Biologging involves attaching miniaturised tags to animals for remote

measurement of their movement, behaviour and physiology (Boyd et al., 2004; Rutz

and Hays, 2009; Wilmers et al., 2015; Wilson et al., 2008). The first biologger

Page 9: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

2

emerged in the 1930s (Scholander, 1940), but it was the ground-breaking development

of microprocessors in the 1960s, that revolutionised the way we study animals

(Kooyman, 1966, 2004). Biologging technology has advanced rapidly in the last 20

years, with devices available that weigh only 0.34 g (Virens and Cree, 2018), to

loggers sampling at increasingly fine temporal resolutions (500 Hz) (Nishiumi et al.,

2018). These advancements have been extremely important in the marine realm, where

they have facilitated recordings of wildlife that would be otherwise unobservable

(Bograd et al., 2010; Chapple et al., 2015). For marine wildlife, biologgers were

traditionally used to study movement ecology and habitat use (Whitney et al., 2007).

One of the most recent and most significant advancements in biologging technology

is the accelerometer, where its implementation has enabled quantitative movement

analyses and revolutionised the study of animal behaviour (Ropert-Coudert et al.,

2009; Rutz and Hays, 2009).

Using accelerometry to study animal behaviour

An accelerometer is an electromechanical device that provides animal

orientation data and detailed movement kinematics (Brewster et al., 2018; Shepard et

al., 2008). The sensor acts as a spring that produces an electric charge when deformed

(Brown et al., 2013). The electric charge produced is relative to the acceleration that

the device experiences from gravity and animal movement, and has enabled fine-scale

behaviours to be monitored, unrestrained by visibility or observer bias (Brown et al.,

2013). Accelerometers are a potentially effective tool that can be applied to an array

of conservation concerns (Shepard et al., 2008). For example, the combination of

global positioning system (GPS) measurements and accelerometry data can provide

Page 10: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

3

high-resolution information on habitat use and activity budgets of wildlife (Berlincourt

et al., 2015; Bouten et al., 2013; Nams, 2014; Shepard et al., 2008).

Early accelerometry studies used mono- and bi-axial accelerometers that

record acceleration in one and two spatial axes respectively (Hochscheid and Wilson,

1999; Yoda et al., 2001; Yoda et al., 1999). In the last 15 years, however, the majority

of acceleration studies record acceleration in three dimensions: along longitudinal,

lateral and dorsoventral axes (Figure 1) (Leos‐Barajas et al., 2017; Shepard et al.,

2008; Wilson et al., 2008). Tri-axial accelerometers provide an informative estimate

of the animals complete body acceleration over time, which can be used in conjunction

with the corresponding posture to identify animal behaviour (Shepard et al., 2008)

In addition to animal movement and behaviour, researchers are now using

three-dimensional biologgers to study complex ecological processes such as migration

and foraging strategies (Bäckman et al., 2017; Hernández-Pliego et al., 2017). For

example, tri-axial accelerometers were used to study the mating behaviour of free-

ranging nurse sharks (Ginglymostoma cirratum) (Whitney et al., 2010). The

Figure 1: Schematic diagram of the three axes used to record acceleration

with a tri-axial accelerometer; longitudinal: Y-axis; lateral: Z-axis; Dorso-

ventral: X-axis

Page 11: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

4

accelerometers proved effective by easily identifying basic active and inactive

behaviours that produce relatively low-amplitude spectra and consequently, low

acceleration (Figure 2). Behaviours associated with mating such as head-down

orientation or rolling from side to side were also identified and were made apparent

by sudden changes in amplitude and acceleration. Despite the accelerometers ability

to record both simple and complex behaviours, all behaviours had to be directly

observed before they could be interpreted with certainty (Whitney et al., 2010).

Figure 2: An example of elasmobranch behavioural classification from tri-axial accelerometry

data. The example portrays five elasmobranch behaviours for classification. (a) overall dynamic

body acceleration (ODBA) is calculated by the sum of the absolute acceleration in all three axes.

(b) Dynamic acceleration in all three axes: sway (blue), heave (red) and surge (grey) for the time

of each behaviour. (c) The corresponding wavelet spectrum produced from the sway axis alone

which illustrates increased amplitude during the burst and headshake episodes. Reproduced from

(Brewster et al., 2018)

Page 12: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

5

Machine learning for animal behaviour classification

Advances in biologging technology have resulted in the acquisition of

correspondingly large and complex data sets (Boyd et al., 2004; Valletta et al., 2017).

Our ability to collect these high-resolution data sets are quickly surpassing our ability

to analyse them (Valletta et al., 2017). As a result, the necessary analytical tools need

to be developed to automatically analyse accelerometry data (Brewster et al., 2018;

Whitney et al., 2010). One such tool is machine learning (ML).

Machine learning is a category of artificial intelligence within computer

science. It is the ability of computers to learn without specific instructions; rather, they

learn primarily from or discern patterns in data (Choy et al., 2018; Robert, 2014;

Samuel, 1959). These models can then be used to make efficient and accurate

predictions of future data (Mohri et al., 2012). ML has been applied to a variety of

disciplines, including bioinformatics, marketing, and robotics. Biologists are now also

using ML techniques to classify animal behaviour when computers can identify and

quantify behavioural states from data (Campbell et al., 2013; Chimienti et al., 2016;

Resheff et al., 2014).

Supervised machine learning

There are two types of machine learning used for classification: supervised and

unsupervised (Table 1). Supervised ML is an algorithmic model that predicts the future

output from previously labelled input (Sathya and Abraham, 2013) such as support

vector machines (SVM) (Martiskainen et al., 2009) or K-Nearest Neighbour (KNN)

(Bidder et al., 2014). For biologging data, this begins with a training data set whereby

the accelerometry data is the input variable, and the outcome variable is the known

Page 13: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

6

behaviour labels for the associated acceleration measures (Brewster et al., 2018). In

which case, the known behaviour labels are created through manual annotation from

the direct observation of behaviours (observed in captivity or otherwise) (Brewster et

al., 2018; Norouzzadeh et al., 2018). The accelerometry data are then used as a map

to the labelled behaviours, and this map enables the model to predict unseen data from

animals that are not part of the training procedure (Brewster et al., 2018; Valletta et

al., 2017).

In behavioural ecology, supervised ML methods are commonly used because

the training data provides a set of pre-defined behaviours that enable simple

interpretation (Leos‐Barajas et al., 2017). Supervised ML can also be highly accurate

and precise. This is because the existing labels (from direct observation) in the training

data set were manually classified, and enable the validation of labels assigned in the

test data set (Collingwood and Wilkerson, 2012; Ladds et al., 2016). Furthermore, the

trained data set itself is a benefit as it trains the classifier to implement automatic

behavioural classification and allows for quick identification of known behaviours in

the test set (Chimienti et al., 2016).

Despite its simplicity and ease of use, selecting the most appropriate ML

method can be difficult and time-consuming (Brewster et al., 2018; Ladds et al., 2016).

Supervised learning is also often subjected to sample selection bias. This bias stems

from the assumption that all data points are independent and identically distributed

across the population (Dundar et al., 2007; Ladds et al., 2016). However, as the model

learns from one distribution (the training set) and then predicts on another distribution

(test set), there is a high degree of correlation between the two data sets and the

independence assumption is violated (Dundar et al., 2007; Lorbach et al., 2016;

Zadrozny, 2004).

Page 14: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

7

Supervised ML also relies heavily on input from the ‘trainer’ to manually label

behaviours and ‘train’ the classifier to identify the individual behaviours (Ladds et al.,

2016). The dependency on the training data set makes it more difficult to identify new

or unknown behaviours as the supervised model can only classify behaviours that have

appeared in the training data set (Chimienti et al., 2016; Ladds et al., 2016; Sakamoto

et al., 2009). For example, if a foraging behaviour was not observed in the respective

training data set, the supervised classifier will have difficulty in identifying the

foraging behaviour if it arises in the test data set. Furthermore, extensive training data

sets can often be expensive and difficult to obtain for some species (Brewster et al.,

2018).

Despite the drawbacks, supervised ML is still relatively popular for

behavioural classification. Bao and Intille (2004) attached five bi-axial accelerometers

to human subjects as they executed a variety of behaviours. The accelerometry data

was then applied to train a group of classifiers including decision tables, decision trees,

K-nearest neighbours, support vector machine and naïve Byers. The most favourable

classifier were decision trees that successfully identified 84% of behaviours (Bao and

Intille, 2004). This research was one of the earliest to validate behavioural

classification from acceleration data in natural conditions. Despite the contrast in test

subjects and selecting bi-axial accelerometers instead of tri-axial, this study was

imperative to the classification of wildlife behaviour because studying animal

behaviour in captivity or controlled conditions is not often feasible or a reliable

portrayal of natural behaviour (Bao and Intille, 2004).

A large majority of animal behaviour studies rely on pairing input data with

direct observation (Bidder et al., 2014; Brewster et al., 2018; Chimienti et al., 2016;

Ladds et al., 2016). They require existing knowledge of animal behaviour to confirm

Page 15: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

8

behavioural classification (Chimienti et al., 2016). Classifying animal behaviour

without prior ecological or biological knowledge has been little studied. Schwager et

al. (2007) refrained from using any information on the movement or behaviour they

may have had prior when applying the K-means clustering algorithm to the tracking

data of free-ranging cows. Without observations or prior information, the training data

set does not exist, and such studies have led to the advancement of unsupervised ML.

Unsupervised machine learning

Unsupervised ML is an algorithmic model that identifies the inherent structure

solely from unlabelled input data (Brewster et al., 2018; Sathya and Abraham, 2013).

There is no training data set or known responses to data that are seen in supervised

ML (from direct observation); rather the model seeks to find repetitive patterns in the

observations alone (Chimienti et al., 2016; Ladds et al., 2016). Examples of

unsupervised ML include neural networks (Reby et al., 1997), expectation-

maximisation algorithms (EM) (Garriga et al., 2016) or various clustering methods

(Fallucchi et al., 2017; Sakamoto et al., 2009).

One of the benefits of using unsupervised ML methods in comparison to

supervised methods is that unsupervised methods are often less time consuming

(Jonsen et al., 2013; Ladds et al., 2016). In exchange for speed, however, the

performance of unsupervised classification methods are dramatically reduced (Guerra

et al., 2011; Ladds et al., 2016; Nazeer and Sebastian, 2009). Unsupervised methods

can also be challenging for biologists to implement or appreciate because the results

of behavioural classification cannot be ascertained with confidence. That is, without

prior information or behavioural observations, the classifications produced by an

unsupervised ML method cannot be validated (Sur et al., 2017).

Page 16: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

9

Unsupervised ML is beneficial for the study of highly mobile species who

often inhabit inaccessible locations where direct observations for ground-truthed

training data are unobtainable (Brewster et al., 2018; Nakamura et al., 2011; Payne et

al., 2016). Therefore, being able to classify behaviours without prior information

would be highly advantageous for cryptic wildlife such as elasmobranchs. As most

shark species swim continuously, there is continuous lateral motion in the sway axis

that produces statistical noise and makes fine-scale behaviours particularly difficult to

identify (Brewster et al., 2018; Hounslow et al., 2019; Williams et al., 2017). However,

the application of an unsupervised ML method to elasmobranch acceleration data

could expose new or rare behaviours that would be otherwise indiscernible to a

supervised ML method due to the reliance on a possibly biased training data set.

(Brewster et al., 2018; Chimienti et al., 2016; Ladds et al., 2016; Sathya and Abraham,

2013).

Page 17: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

10

Table 1: Advantages and disadvantages of supervised and unsupervised machine learning

Advantages Disadvantages

Supervised

ML • Can be highly accurate and

precise

• Simple interpretation

• The trainer can define the

label (usually a limited

number value)

• Comprehensive training data

set

• Time-consuming

• Not suited for species that are

not easily observed in the wild

or cannot be kept in captivity

• Assumes an independent and

identically distributed training

data set which is often violated

Unsupervised

ML • No prior knowledge of data

or training set required

• Speed

• Has the potential to reveal

new or rare behaviours

• Less accuracy

• States will not always point to

specific behaviours

• Results cannot be validated

K-means clustering

One of the most common and widespread methods of classifying observations

into distinct categories is K-means clustering. The unsupervised algorithmic model

aims to separate n observations into groups represented by the vector K (Figure 3)

(Zhang et al., 2015). The observations are clustered by minimising the within-cluster

total of the squared Euclidean distance from the cluster centroids (Sakamoto et al.,

2009). Simply put, individual observations are allocated to their respective groups

whereby intercluster distance is maximised, and the intracluster similarity is

minimised (Valletta et al., 2017; Zhang et al., 2015).

Page 18: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

11

The omnipresence of K-means clustering is primarily due to it being

mathematically simple and efficient on large data sets (Jain et al., 1999; Kumar and

Prasad, 2013; Zhang et al., 2015). However, K-means clustering can be limited by the

number of data points that a computer can process (Sakamoto et al., 2009). It also

tends to work well only on data sets with condensed or outlying clusters (Jain et al.,

1999).

Although K-means clustering can be computationally inexpensive (Valletta et

al., 2017), it is particularly sensitive to the result of the initial partition (Birodkar and

Edla, 2014; Jain et al., 1999; Morissette and Chartier, 2013; Nazeer and Sebastian,

Figure 3: The process of the K-means clustering algorithm (Sakamoto et al., 2009) used

on the wavelet spectra from accelerometer-equipped juvenile lemon sharks

Page 19: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

12

2009). This is because K-means uses a local search heuristic to efficiently find the first

partition, which results in the estimated optimal result rather than just the optimal

result (Jain et al., 1999; Sakamoto et al., 2009; Xiong and Zhang, 2016). This also

means that the algorithm is unstable and will provide different results depending on

the starting partitions (Figure 4) (Jain et al., 1999; Xiong and Zhang, 2016).

Figure 4: A graphical display of the sensitivity of K-means clustering to the selection of the

initial partition. If you begin with A, B and C as the initial means, then the resulting partition

will be [[A] [B, C], [D, E, F, G]]. However, the squared error criterion value is greater than

for the optimal partition [[A, B, C], [D, E], [F, G]] that is seen with the rectangles (this

contains the global optimum). The accurate three cluster solution would be achieved by

selecting A, D and F as the initial means. Reproduced from Jain et al. (1999)

Page 20: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

13

Unlike other clustering methods (Langfelder et al., 2007), the number of

expected clusters must be specified before K-means clustering can begin (Brewster et

al., 2018). Determining the optimum number of clusters that best represent the

associated data set is a universal issue. If the selected cluster number is too small,

comparable behaviours can be classified together, whereas electing too many clusters

can synthetically separate behaviours (Gleiss et al., 2017; Sakamoto et al., 2009;

Valletta et al., 2017; Whitney et al., 2010).

To further improve the accuracy and efficiency of this classifier, the data can

be transformed before applying the K-means algorithm (Al Shalabi et al., 2006;

Mohamad and Usman, 2013). Pre-processing techniques can include normalisation

(scaling the data so that each component of the data set is within a small established

range such as [0:1]), standardisation (where the mean in each dimension is made to be

0 and the variance to be 1), outlier removal and more (Mohamad and Usman, 2013;

Patel and Mehta, 2011; Pavel and Sattarov). Data transformation of highly distributed

or ‘noisy’ data can increase the likelihood of clusters being identified, produce higher-

quality clusters and fix any inconsistencies in the data (Mohamad and Usman, 2013;

Patel and Mehta, 2011; Van Moorter et al., 2010).

Due to its simple nature, K-means clustering has also been used as a starting

point or within a combination of methods to classify animal behaviour. In 2015, Zhang

and colleagues (Zhang et al., 2015) applied both K-means clustering and ‘behavioural

change point analysis’ (BCPA) to segregate the movement trajectories of the Little

Penguin (Eudyptula minor) into distinct behaviours. They aimed to increase the

accuracy and efficacy in clustering animal behaviour data without prior information.

Consequently, this method correctly classified 92.5% of the data and enabled

successful discrimination of behavioural states (Zhang et al., 2015).

Page 21: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

14

K-means clustering has been used in many animal behaviour studies, mostly

focused on location data such as GPS (Christensen‐Dalsgaard et al., 2018; Van

Moorter et al., 2010) or acoustic telemetry (Brodie et al., 2018; Xydes et al., 2013).

However, there are comparatively few studies that have applied K-means clustering

to accelerometry data (Dewhirst et al., 2017; Gleiss et al., 2017; Hart et al., 2016;

Patterson et al., 2019; Sakamoto et al., 2009). One method of analysing accelerometry

data with K-means clustering is to use the package Ethographer (Sakamoto et al.,

2009) and create spectra of acceleration signals (derived from animal movement) that

are contingent on the continuous wavelet transformation (Sakamoto et al., 2009). The

K-means algorithm is then applied to the spectra and each second is grouped into “K”

clusters. Wavelet transformation enables easy detection of small changes over short

periods of time and provides a strong method for automated behavioural classification

(Hart et al., 2016; Sakamoto et al., 2009).

Hidden Markov models

Hidden Markov models (HMM) are a class of algorithms that have recently

gained popularity in ecological applications. Although this model can be used in both

a supervised and unsupervised context, only the unsupervised model will be discussed

here. The primary concept of the model was first published in the 1960s (Baum and

Eagon, 1967; Baum and Petrie, 1966) and gained popularity in the 1970s when it was

applied to speech recognition (Rabiner, 1989). HMM’s have applications in many

disciplines, but despite the age, it was only recently that scientists began to apply the

complex ML model to classify animal behaviour (Grünewälder et al., 2012; Patterson

et al., 2008).

Page 22: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

15

Hidden Markov models are non-deterministic algorithms that differ from other

mathematical models by allowing multiple outcomes (Koski, 1996). That is, several

variables are characterised by probability distributions with an uncertain outcome

(Bolano, 2014; Zucchini et al., 2016). The model comprises of two stochastic

processes of which one is observed, and the other hidden (Figure 5) (Morales et al.,

2004; Pohle et al., 2017).

Like K-means clustering, first-order HMM’s use unlabelled input data (Leos‐

Barajas et al., 2017). The stochastic model assumes that a sequence of observations

such as acceleration data are produced by an underlying series of hidden states (Franke

et al., 2006; Tucker and Anand, 2005). These hidden states are produced by a Markov

process by which future states are independent of past states, yet entirely dependent

on the penultimate state (Dean et al., 2012; Jurafsky and Martin, 2014). Many other

statistical methods (K-means clustering, for example), assume independence between

successive observations of animal movement (Leos‐Barajas et al., 2017; Swihart and

Slade, 1985). However, there is an instinctive dependence between observations of

Figure 5: A graphical representation of a first order hidden Markov model (HMM)

that includes two stochastic processes; one hidden (state process) and the other is

observed (state-dependent process). Reproduced from Pohle et al. (2017)

Page 23: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

16

movement once the behaviour has initiated. That is, at time t, the states are influenced

by the state at time t-1 (Leos‐Barajas et al., 2017).

As HMM’s account for the sequential correlation that is observed in animal

movement data, they are practical for the challenging and often unmeasurable aspects

of behavioural analysis (Lawler et al., 2019; Zucchini et al., 2016). HMM’s have been

recently applied to accelerometry data of blacktip reef sharks (Leos‐Barajas et al.,

2017). With the aim to classify the state switching aspects of shark behaviour, Leos‐

Barajas et al. (2017) confirmed that the sharks were more active throughout high tides

and at water temperatures of 28-29 °C. This research highlighted the significance of

consecutive correlation when identifying animal movement, an area that has often

been neglected or incorrectly addressed in previous studies (Carroll et al., 2014; Leos‐

Barajas et al., 2017; Li and Bolker, 2017).

It is highly likely that the computational complexity of HMM’s is a deterrent

to many ecologists studying animal behaviour. Due to their random nature, HMM’s

can be statistically noisy (Tucker and Anand, 2005). They are also quite rich in

mathematical structure, where the complexity of the model increases substantially as

the number of states rises, or covariates are added to the model (McKellar et al., 2014).

Estimating the model's parameters can also prove difficult (Mannini and Sabatini,

2010; Wang, 2019). However, HMM’s have few statistical assumptions and

estimating the maximum likelihood is relatively straightforward and quick to optimise

(Lawler et al., 2019).

Although HMM’s are computationally complex, they can perform

sophisticated behaviour classification (Leos‐Barajas et al., 2017). HMM’s are also

relatively flexible. They can manage different types of data within the one model

Page 24: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

17

(including missing data) and have a variety of extensions including the hidden semi-

Markov model (HSMM) and the hierarchical HMM (HHMM) (Leos‐Barajas et al.,

2017).

Current study and objectives

Observing and understanding the behaviour of wildlife is particularly difficult

for free-ranging aquatic species (Brewster et al., 2018; Gleiss et al., 2010; Whitney et

al., 2012). Many elasmobranchs, for example, live cryptic lifestyles, where direct

observations are logistically challenging or unattainable (Brewster et al., 2018).

Unsupervised ML methods provide a promising solution to the pressing challenge of

automated behavioural classification. However, unsupervised ML needs to be

compared to supervised ML to determine how effectively animal behaviour can be

estimated without captive trials. To examine the validity of unsupervised ML methods,

behavioural trials were carried out to observe and record the per second behaviours of

juvenile lemon sharks (Negaprion brevirostris). The behavioural labels were used

solely for ground-truthing purposes to compare with the per second inferred clusters

or states from the unsupervised ML methods.

This present study set out to address the following objectives: (1) Determine

the efficacy of unsupervised ML methods in the identification of shark behaviour from

accelerometry data; (2) To compare the performance of two unsupervised ML

methods, K-means clustering and hidden Markov models, to a previously used

supervised ML technique; (3) To test if unsupervised ML methods produce the same

results as supervised learning when applied to wild data.

Page 25: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

18

Methods

The data used in this dissertation were gathered and disclosed by Brewster et

al. (2018). Data collection included the capture of individuals, ethogram (behavioural)

trials and individual release. The behavioural trials took place in a controlled

environment where Brewster et al. (2018) could induce the animals to perform a wider

range of acceleration signatures. Additional data analysis was performed specifically

for this thesis to test the ability of unsupervised ML methods to classify shark

behaviour.

Study site and subject species

The study site was a cluster of islands that fall within the Commonwealth of

the Bahamas. The Bimini Islands are located approximately 85 km east of Miami in

Florida, USA (Figure 6). The islands are fringed with mangrove forests that provide

well-documented nursery grounds for a variety of species, including the lemon shark

(Chapman et al., 2009; Trave and Sheaves, 2014). The lemon shark was chosen as the

subject species for its abundance, its resilience in captivity, and high site fidelity to the

nursery grounds in Bimini, Bahamas (Brewster et al., 2018).

Page 26: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

19

Tag package and procedure

The tagging equipment included a G6a+ triaxial acceleration data logger (40

mm x 28 mm x 17 mm) (CEFAS Technologies Ltd) and an acoustic transmitter (for a

separate component of the study) (Figure 7). The accelerometers logged acceleration

at 30 Hz and contained 56 MB for data storage. For tag attachment, a hypodermic

needle (1.5 mm in diameter) was used to perforate two holes in the base of the first

dorsal fin, where nylon monofilament was threaded through the accelerometer and the

fin. The monofilament line was then secured to the opposite side of the first dorsal fin

with stainless steel crimps and two thin plastic plates. Medical-Grade polyurethane

Figure 6: Satellite image of the study site in the Bimini Islands

25°44’ N, 79° 16’ W (Google Earth, 2019)

Page 27: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

20

Figure 7: Juvenile lemon shark (Negaprion brevirostris) equipped with the tri-axial

accelerometer (Image credit: T.J.Ostendorf, Bimini Biological Field Station, 2015)

foam was positioned between the plates and the accelerometer to prevent irritation or

damage to the fin.

Capture and attachment method

A 180m x 2m rectangular gillnet was deployed perpendicular to the shoreline

of South Bimini. The net was checked in 15-minute intervals or in the event of a

disturbance. The size and weight of the tag package governed the minimum total

length (TL) of the sharks that could be used for this study (75 cm TL). The maximum

total length was delineated by the size limit of sharks that could be chambered in the

respirometry element of a separate study (Lear et al., 2017). Four juvenile lemon

sharks were collected and examined for a passive integrated transponder (PIT) tag that

Page 28: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

21

is positioned within the muscle at the base of their dorsal fin. If the shark had not been

previously tagged with a PIT tag, one was inserted before the captive behavioural trials

commenced.

Data collection

Sharks were transferred to a 10m x 6m rectangular enclosure that was

constructed in shallow waters on nearby sand flats. The semi-enclosed enclosure was

constructed in such a way as to reduce the incessant circular swimming patterns that

have been previously observed in captive sharks (Gleiss et al., 2009a). The enclosure

was made of plastic diamond-shaped mesh to ensure the animals experienced natural

environmental conditions. Before accelerometer attachment, each shark was given a

minimum of 48 hours to recover and adapt to the fenced enclosure.

Behavioural trials were carried out approximately 24 hours after accelerometer

attachment to enable acclimation. For the behavioural trials, an observer was

positioned next to the enclosure on a 3 m high wooden tower to record each lemon

shark’s behaviour. To ensure sufficient replicates of rarely performed behaviours,

Brewster et al. (2018) encouraged such behaviours by feeding an individual fish

(headshake) or creating large movements to the side of an individual (burst event).

Each second of the trial was manually annotated with a behaviour label (Table 2) to

produce an ethogram. The sharks were fed to repletion with fresh or thawed fish every

three days (except during trials). Once the behavioural trial was completed, sharks

were recaptured with a dip net, and the dorsally mounted tag was removed. All sharks

were then monitored during recovery prior to release.

Page 29: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

22

Table 2: Description of the observed behaviours performed in semi-captive juvenile lemon

sharks (Negaprion brevirostris) (Brewster et al., 2018; Myrberg Jr and Gruber, 1974;

Whitney et al., 2012)

Behaviour Definition

Swim Steady swimming

Burst Increased speed of swimming (usually during hunting or in response to

disturbance)

Chafe Rolling motion to “scratch” dorsal surface

Headshake Shaking of the head (generally prey handling)

Rest Lying motionless on the substrate

Data pre-processing

Statistical analysis for testing the efficacy of the unsupervised ML methods

was conducted in IGOR Pro version 7.08 (WaveMetrics Inc, Lake Oswego, Oregon,

USA) and R statistical software version 3.5.3 (R Development Core Team, 2019)

(Figure 8). As K-means clustering is a particularly simple algorithm, the raw

acceleration data from the sway axis was the only measurement used for K-means

clustering. For ground-truthing purposes, the raw acceleration data from the sway axis

was then segmented into one-second averages using Ethographer (Sakamoto et al.,

2009).

For HMM analysis, ODBA was the only measurement used for classification.

The change in metric is attributed to previous studies which found that ODBA is an

important metric for behaviour classification (Brewster et al., 2018; Gleiss et al., 2010;

Lear et al., 2017; Leos‐Barajas et al., 2017). The raw acceleration data (sampled at 30

Hz) was loaded into Igor Pro, where ODBA was then calculated using Ethographer

(Sakamoto et al., 2009). Firstly, each axis was smoothed over three seconds, and then

static and dynamic acceleration were separated. Static acceleration measures the

Page 30: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

23

acceleration forces on the device that emanate from the earth’s gravitational pull. Once

static acceleration was removed from the raw values, dynamic acceleration (a measure

of the animals' movement) was all that remains (Graf et al., 2015; Wilson et al., 2008).

ODBA was then calculated for each shark by summing the absolute dynamic

acceleration of all three axes. For ground-truthing purposes, ODBA was then

condensed into one-second averages using Ethographer (Sakamoto et al., 2009).

Figure 8:Methods process for testing the efficacy of two unsupervised machine

learning methods for behaviour classification of accelerometry data from juvenile

lemon sharks (Negaprion brevirostris)

Page 31: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

24

K-means clustering

For the purpose of this study, a cluster is defined as an assemblage of data

points grouped due to certain similarities in their acceleration signal. In addition,

clustering was not individually applied to each shark’s acceleration data because the

K-means clusters for one individual, will not necessarily correspond to clusters in

another individual. The raw acceleration data (sampled at 30 Hz) was loaded into Igor

Pro version 7.08 (WaveMetrics Inc, Lake Oswego, Oregon, USA). To ensure the

clusters could be compared between sharks, the per-second acceleration data of all

four individuals were combined in Ethographer to be clustered together (Sakamoto et

al., 2009).

A continuous wavelet transformation was used to decompose the sway (z) axis

acceleration signal waveform into time-frequency space (Figure 2) (Torrence and

Compo, 1998). The sway axis was used because the lateral movement represents

individual tail-beats and therefore, the primary acceleration axes for sharks (Brewster

et al., 2018). By using the wavelet transformation, the cyclicity of the movements of

the caudal fin by the sharks can be analysed (Polansky et al., 2010). The wavelet

spectrum will display wavelet power plotted against time and frequency, which will

represent the acceleration amplitude and frequency of the sway axis (Sakamoto et al.,

2009; Westra and Sharma, 2006).

Before running the algorithm, K-means clustering requires the number of

clusters to be specified (Brewster et al., 2018). The Calinski Harabasz index (CH

criterion) (Caliński and Harabasz, 1974) was used to select the optimal number of

clusters with the R package “fpc” (Hennig, 2010). The CH criterion is an internal

clustering validation method that uses Euclidean distance to determine the optimum

Page 32: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

25

number of ‘K’ clusters (Caliński and Harabasz, 1974; Hämäläinen et al., 2017). In

Ethographer, the K-means clustering algorithm was then applied to the continuous

wavelet spectrum where it separated the clusters based on the temporal changes in the

amplitude of the sway axis (Figure 9) (Sakamoto et al., 2009; Whitney et al., 2010)).

The classification performance was then quantified using the ground-truthed

acceleration data and a mixed effects model (Table 3).

Page 33: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

26

Figure 9: Schematic diagram of K-means clustering classifying the wavelet spectrum. Inferred behaviours from

the raw acceleration data of the sway axis: Red – Rest; Yellow – Chafe; Green – Swim; Blue – Burst; Purple –

Headshake. (A) Wavelet spectrum generated from the raw acceleration from the sway axis (B) The corresponding

K-means clusters (C) The corresponding local behaviour spectrum (D) The percentage of time spent representing

each cluster

Page 34: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

27

Table 3: The variables included when investigating the efficacy of K-means clustering to

identify shark behaviour from accelerometry data (the variables will be italicised for easier

interpretation in the methods and results)

K-means performance evaluation

Confusion matrices provide a simple approach to establish how many

behaviours are ‘correctly’ assigned (Plakhov et al., 2015). In order to assess the

efficacy of K-means clustering for behaviour classification, the performance of the

clustering method was assessed by generating a confusion matrix (Figure 8). While a

confusion matrix typically assesses classification performance by comparing actual

behavioural states to predicted behavioural states (Brewster et al., 2018), the K-means

clusters generated here did not necessarily correspond directly to a known behaviour.

Hence, performance could only be inferred. Therefore, the Observed Behavioural state

contained the cluster number as “pseudo- predictions”.

Statistical models were applied to compare the fit of observed behavioural

state to the assigned cluster (pseudo-prediction) and to account for possible sources of

variation at both the individual and group level.

Variable Variable type Description Range

Shark Nominal Shark ID and the influence of individual

Shark

1-4

Observed

behavioural

state

Categorical The observed behaviour/body movements

of captive sharks

Rest-burst-

chafe-

headshake-

swim

Cluster Categorical The cluster number as determined by K-

means Clustering

0-4

K-means

clustering

assignment

Discrete The number of cases in which a cluster

was assigned to an observed behavioural

state

0-14031

Page 35: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

28

Mixed model analysis

Generalised linear mixed models (GLMM’s) are a flexible extension on the

basic linear regression model. They are beneficial for exploring the relationship

between the predictors and the response variable in data sets that have multiple sources

of variability (Johnson et al., 2015; Krueger and Tian, 2004). GLMM’s include both

fixed and random effects that contribute linearly to the dependent variable (Barr et al.,

2013; Cheng et al., 2010). For clarity purposes, the competing mixed effects models

for K-means were labelled model 1a, 1b, 1c, and 1d. Subsequently, the competing

mixed effects models for the HMM were labelled model 2a, 2b, 2c, 2d, and 2e.

Using a GLMM to analyse the performance of K-means clustering

Here, a GLMM was used to identify a significant relationship between the

observed behavioural state and clusters in K-means clustering, while also accounting

for individual-specific heterogeneity across individuals (Figure 8) (Baayen et al.,

2008; Bolker et al., 2009). The packages lme4 (Bates et al., 2014) and glmmTMB

(Brooks et al., 2017) were used to fit the GLMM’s.

Poisson regression was applied to the K-means clustering assignment with a

random intercept. As fixed effects, the observed behavioural state and cluster (without

an interaction) were entered into the model, and a random intercept was included for

the effect of individual shark. The random intercept model allows the intercepts (K-

means clustering assignments) to deviate between individual sharks, while the

observed behavioural state and cluster are restricted by a common slope (meaning the

random effect has the same effect across all other variables) (Duncan et al., 1998; Loy

et al., 2017; Winter, 2013).

Page 36: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

29

Model 1a comprised the aforementioned Poisson regression. The

overdispersion displayed in the deviance residuals of model 1a suggests that negative

binomial model could provide a better fit by accounting for any heterogeneity among

the data (Wedel et al., 1993; Zou et al., 2013). Therefore, model 1b involved a negative

binomial regression with an identical fixed and random effects structure as model 1a.

Model 1c, however, aimed to consider an interaction between the two

categorical variables: observed behavioural state and cluster. An interaction with two

or more independent variables is when the effect that one independent variable has on

the dependent variable is dependent on the value of the second independent variable.

A significant interaction term would suggest that the number of Cluster assignments

would respond differently to the combination of observed behavioural state and

cluster than would be predicted by simply adding either independent variable

separately. The fixed effects structure of model 1c included the interaction term

between the observed behavioural state and cluster. The random effect structure

remained the same as previous models with a random intercept for shark.

Model 1d was constructed to test for an association between the observed

behavioural state and cluster. An association differs from interaction in that the first

independent variable is correlated to the values of the second independent variable.

However, this association does not affect the dependent variable (Nathans et al., 2012).

A significant association term would suggest that the number of K-means clustering

assignment allocated to the observed behavioural state are dependent on what the

underlying cluster is. For this model, the cluster variable was removed from the fixed

effects structure, and only the observed behaviour remained. The random effect

structure has remained the same as previous models.

Page 37: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

30

Mixed model selection and validation

A backwards step-like model selection process was used to determine the best

of the four models (Table 8) (Broman and Speed, 2002). The package DHARMa

(Hartig, 2017) was used to produce QQ plots and observe any violations in the data.

The Akaike Information Criterion (AIC) was then used to evaluate the fit of

the competing models. The AIC works as a relative indicator for the goodness of fit,

while simultaneously penalising a plethora of parameters used to achieve that fit

(flexibility over accuracy) (Akaike, 1994; Bozdogan, 1987; Sclove, 1987). The AIC

provides the ability to compare nested and non-nested models and accounts for model

selection uncertainty (Bolker et al., 2009; Loy et al., 2017; Posada and Buckley, 2004).

While there are complexities in defining the ‘best’ model (Burnham et al., 2011;

Richards et al., 2011; Symonds and Moussalli, 2011), in the present study, the best

model was considered to possess the lowest AIC.

To eliminate problematic levels of collinearity in model 1d, the package usdm

(Naimi et al., 2014) was used to calculate the variance inflation factor (VIF) for each

independent variable. The VIF illustrates the proportion of variance in one

independent variable that can be explained by the second independent variable, due to

the correlation between the two (Craney and Surles, 2002; Zuur et al., 2010). In the

literature, there is no uniformly agreed threshold value for unacceptable collinearity.

However, a threshold as low as 3-4 is suggested for a rigorous approach (Kock and

Lynn, 2012; Olague et al., 2007; Zuur et al., 2010) and collinearity values above three

will be excluded.

Page 38: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

31

Hidden Markov model

The HMM framework

A basic two-state HMM involves two stochastic processes (one observed, and

the other hidden). First, is the observed state process that is governed by an underlying

mixture distribution. For example, in an HMM with N states, each observation has

been produced by one of N probability distributions and these are then mixed to form

a new distribution (Zucchini et al., 2016). The observed states of the HMM, however,

are influenced by an underlying hidden state that follows a Markovian (Markov) chain

(Bolano, 2014; Zucchini et al., 2016). The Markov chain is the second part of the

HMM that uses probability to statistically model random processes over time (Bolano,

2014; Zucchini et al., 2016).

In HMM analysis, there are two primary aspects to investigate: the relationship

between the hidden states and the observed states; and the transitions between the

states over time (Bolano, 2014). In the present study, the former is emphasised, using

an unsupervised HMM to infer biologically interpretable behavioural states from the

accelerometry data collected from juvenile lemon sharks (Brewster et al., 2018). There

are five main components required to compute the HMM (Table 4), and the

classification performance was then quantified using the ground-truthed acceleration

data and a mixed effects model (Table 5).

Page 39: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

32

Table 4: The five main components required for HMM computation (Bolano, 2014; Leos‐

Barajas et al., 2017; Zucchini et al., 2016)

Elements

Hidden state sequence (S1, S2,…, ST)

Observed state sequence (X1, X2,…, XT)

Initial state distribution (denoted by δ and includes the mean and standard deviation (SD)

of each state distribution)

State-dependent probabilities (the probability of an observation that was produced by a

particular hidden state, yet independent of time t)

Transition probability matrix (TPM; the probability of moving from one state to another)

Table 5: The variables included when investigating the efficacy of the HMM to identify shark

behaviour from accelerometry data (the variables will be italicised for easier interpretation

in the methods and results)

Variable Variable type Description Range

Shark Nominal Shark ID and the influence of

individual shark

1-4

Observed

behavioural

state

Categorical The observed behaviour/body

movements of captive Sharks

Rest-burst-

chafe-

headshake-

swim

Hidden

state

Categorical The hidden state number as determined

by the HMM

1-4

HMM

assignment

Discrete The number of cases in which a hidden

state was assigned to a specific

observed behavioural state

0-14031

Page 40: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

33

Model formulation

A pragmatic step by step approach was used to select the number of states for

the HMM (Figure 8) (Pohle et al., 2017). In previous studies, lemon sharks around the

Bimini Islands have been observed to exhibit five behaviours (burst, chafe, headshake,

rest and steady swim) (Brewster et al., 2018). This prior knowledge and a histogram

of the collective ODBA (Figure 10) were used to determine what possible distributions

may be present. As the collective histogram was highly skewed, a log transformation

was performed on the per-second ODBA values. The ODBA histogram was uniformly

segmented, and the mean and variances of each state were estimated (DeRuiter et al.,

2017). Three to five general movement classes were most plausible.

Figure 10: Histogram of the combined ODBA values that were calculated from the accelerometry data

of juvenile lemon sharks (Negaprion brevirostris)

Page 41: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

34

Fitting the HMM

Three HMM candidate models were constructed based on three, four or five

hidden behavioural states (Figure 8). The best candidate model for the ODBA data

was the four-state HMM. There were not enough states in the three-state HMM to

capture the subtle statistical patterns and separate the states accordingly (Ngô et al.,

2019; Schliep, 2001). The five-state model was biologically uninterpretable due to an

excessive number of parameters that produced a highly multimodal likelihood

function (Quick et al., 2017; Stephens, 1996).

The four-state HMM had a joint Gaussian distribution with four distinct sets

of parameters (mean and SD), one for each hidden state (Quick et al., 2017). The

HMM was fit via a direct maximum likelihood estimation (MLE) with the nlm

optimiser in R (Dennis Jr and Schnabel, 1996). Here; the aim was to find the most

likely parameters that have given rise to the ODBA observations (Liisberg et al., 2016;

Quick et al., 2017). For parameter estimation, a transition matrix where all state

transitions were possible was assumed. The forward algorithm was used to transverse

along the series of observations and update the likelihood and state probabilities at

each step (Leos‐Barajas et al., 2017). Multiple sets of the initial parameters were also

considered to enhance the chances of finding the global maximum (and not the local

maximum) (Figure 11) (Adam et al., 2019; Liu and Lemeire, 2017). After the initial

parameters had been selected, the model was then run 1000 times to check for

numerical stability and robustness against the differing initial parameters (Quick et al.,

2017). Checking the stability of the initial parameters is often extremely time-

consuming, particularly with higher state or multivariate HMM’s. To reduce

computation time, the packages Rcpp and RcppArmadillo were combined with C++

Page 42: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

35

to enhance the speed of the forward algorithm (Eddelbuettel et al., 2011; Eddelbuettel

and Sanderson, 2014). The resulting computation time was over 30 hours.

Validation of the HMM

The pseudo-residuals were used to assess the four-state HMM’s goodness of

fit. There are two types of pseudo-residuals that are useful for HMMs: ordinary

pseudo-residuals and the forecast pseudo-residuals. If the fitted model is accurate, the

ordinary pseudo-residuals are distributed standard normal, given all other observations

(Zucchini et al., 2016). The forecast pseudo-residuals, however, are mainly used to

assess the models forecasting ability because they are distributed standard normal

given all preceding observations (prior to t) (Rooney, 2016; Zucchini et al., 2016). In

our case, the forecast pseudo-residuals were used because the ODBA observations are

the ‘future values’ that are needed to determine the ‘number of successes’ for this time

series (Zucchini et al., 2016).

Figure 11: Schematic diagram of the forward algorithm searching for the global

maximum (blue arrows represent the optimisation process)

Page 43: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

36

Global decoding

Global decoding creates a map to uncover the hidden states of the HMM. Here,

the recursive Viterbi algorithm was used to calculate the most probable (hidden) paths

of the sequence given the observations (Pohle et al., 2017; Viterbi, 1967; Zucchini et

al., 2016).

HMM performance evaluation

Similar to the previous performance assessment for unsupervised K-means

clustering, a confusion matrix was generated to assess the performance of the HMM

for the behavioural classification of accelerometry data (Figure 8). Here, the observed

behavioural state is compared to the hidden state that is predicted by the HMM. Also,

the pseudo-predictions did not necessarily correspond directly to a known behaviour,

and therefore, performance could only be inferred. The HMM inferred states were

therefore assigned to observed behavioural state as “pseudo-predictions”. Mixed

effects models were then applied as per the analysis of K-means clustering, to compare

the fit of the observed behavioural state against the hidden state, and to once again,

account for the possible sources of variation at both the individual and group level.

Using a GLMM to analyse the performance of the HMM

The GLMM was used to identify a significant relationship between the

observed behavioural state and hidden state in the HMM, while also accounting for

individual-specific heterogeneity across individual sharks (Figure 8) (Baayen et al.,

2008; Bolker et al., 2009). The packages lme4 and glmmTMB were used to fit the

GLMM’s (Bates et al., 2014; Brooks et al., 2017).

Page 44: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

37

For model 2a, a Poisson regression was applied to HMM assignment with a

random intercept. As fixed effects, the categorical variables observed behavioural

state and hidden state (without an interaction) were entered into the model, and a

random intercept was included for the effect of individual shark. The random intercept

model allowed the intercepts (HMM assignments) to deviate between individual

sharks, while the observed behavioural state and hidden state were restricted by a

common slope (Duncan et al., 1998; Loy et al., 2017; Winter, 2013).

As the deviance residuals of model 2a were overdispersed, model 2b

comprised of a negative regression with an identical fixed and random effects structure

to model one. Model 2c included a negative binomial regression with an identical fixed

and random effects structure and an interaction term between the observed

behavioural state and the hidden state of the HMM. Model 2d also involved an

interaction term between the observed behavioural state and the hidden state, but this

time with a quasi-Poisson regression. A significant interaction term would suggest that

the number of HMM assignments would respond differently to a combination of

observed behavioural state and hidden state than would be predicted by simply adding

either independent variable separately.

Model 2e was constructed to test for an association between the observed

behavioural state and the hidden state of the HMM. A significant association term

would suggest that the number of HMM assignments allocated to each observed

behavioural state are dependent on the underlying hidden state. For the fixed effects

structure in model 2e, the categorical variable hidden state was removed, and only the

observed behavioural state remained. The random effect structure remained the same

as previous models.

Page 45: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

38

Mixed model selection and validation

For the HMM, the model selection and validation used in the GLMM followed

similar methods as the validation of the K-means GLMM. The package DHARMa

(Hartig, 2017) was used to produce QQ plots and observe any violations in the data.

Results

Behavioural trials

Four juvenile lemon sharks were tagged in the behavioural trials, totalling over

38 hours of accelerometry data. Of these, nearly ten hours of accelerometer data was

ground-truthed with the per-second behavioural observations (Brewster et al., 2018)

(Table 6). The data set was imbalanced with swimming behaviour representing

97.57% and burst events representing 0.13%. The further imbalance was attributed to

shark one, representing 82.75% of the entire data set. All five behaviours were

identified throughout the study. However, not all sharks were observed performing all

behaviours. Burst behaviour was the least observed behavioural state for sharks 1

(0.01%), 2 (0.18%) and 3 (4.32%) and headshake was the least observed for shark 4

(7.48%).

Page 46: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

39

Table 6: Ethogram illustrating the number of per second observations of each behaviour

recorded for each juvenile lemon shark (Negaprion brevirostris) (n = 4) fitted with an

accelerometer during the semi-captive trials

Shark ID and Total Length (TL) B

eh

av

iou

ra

l st

ate

1 (82.6 cm) 2 (80.5 cm) 3 (79.2 cm) 4 (85.2 cm) Total

Burst 4 10 12 22 48

Chafe 139 83 49 0 271

Headshake 57 14 26 16 113

Rest 82 315 43 0 440

Swim 29406 5273 148 176 35003

Total 29688 5695 278 214 35875

K-means clustering

Five clusters were identified as the optimal number of clusters to best represent

the sway acceleration data (Table 7). The majority of observations were represented

by clusters zero and one (18162 and 12309 seconds respectively), with the least

amount of observations represented by clusters three and four (2801 and 1086 seconds

respectively).

Classification performance

The categorical variables observed behavioural state and cluster were

compared using a pseudo-confusion matrix (Table 7). Overall, K-means clustering

provided poor performance in the classification of lemon shark behaviour. There were

few patterns in the pseudo-confusion matrix, and all cluster subtypes were labelled to

represent all observed behavioural states at one point or another. With shark one, for

example, four of the five total Observed Behaviour States were represented by two

clusters (cluster one and two) at one time or another. For shark two, cluster zero

Page 47: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

40

represented four of five behaviours, and clusters one and two represented all five

observed behaviours. For shark three, cluster two also represented the five behaviours

at one point or another, and cluster two represented four of the five observed

behaviours.

For shark one, clusters zero and one were assigned to similar observed

behaviours, with the majority representing rest and swim. There were only 139

seconds of headshake behaviour for shark one, 133 of which were represented by

cluster two. Similarly, of the 26 seconds of headshaking behaviour observed in shark

three, 23 seconds were assigned to cluster two. For sharks two, three and four, clusters

three and four were not labelled to represent any of the observed behaviours.

Page 48: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

41

Table 7: The confusion matrix generated to evaluate the performance of K-means clustering

for behaviour classification of lemon shark (N.brevirostris) accelerometry data; where rows

indicate actual behavioural states and columns indicate clusters membership as pseudo-

predictions

Shark Observed Behaviour Cluster Total

0 1 2 3 4

1 Burst 0 3 1 0 0 4

Chafe 0 0 133 0 6 139

Headshake 5 4 48 0 0 57

Rest 60 22 0 0 0 82

Swim 14031 10730 764 2801 1080 29406

Total 14096 10759 946 2801 1086 29688

2 Burst 2 4 4 0 0 10

Chafe 0 4 79 0 0 83

Headshake 3 5 11 0 0 19

Rest 303 0 7 0 0 310

Swim 3587 1371 315 0 0 5273

Total 3895 1384 416 0 0 5695

3 Burst 5 5 2 0 0 12

Chafe 0 8 41 0 0 49

Headshake 3 0 23 0 0 26

Rest 0 38 5 0 0 43

Swim 34 48 66 0 0 148

Total 42 99 137 0 0 278

4 Burst 13 6 3 0 0 22

Chafe 0 0 0 0 0 0

Headshake 0 7 9 0 0 16

Rest 0 0 0 0 0 0

Swim 116 54 6 0 0 176

Total 129 67 18 0 0 214

Page 49: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

42

Mixed effects model to analyse the pseudo-predictions from K-means

clustering

Model selection

From the four candidate models, the AIC’s (Table 8) and QQ plots (Figure 12)

were compared to select the best fitting mixed-effects model to explain the K-means

pseudo-predictions. Model 1a held the highest AIC score, and the QQ plots showed a

significant deviation from the normal distribution (P = 9.60e-04). Model 1b was the

negative binomial regression where the AIC score dropped significantly in comparison

to the first model. Visual inspection of the QQ plot displayed no significant deviation

from the normal distribution and homoscedasticity. We used DHARMa (Hartig, 2017)

to directly test the uniformity and dispersion assumptions and further validate the

improved fit of model two (P = 0.34 and P = 0.37, respectively).

Model 1c involved the interaction term between the two fixed effects, and

despite the lower AIC, the model did not converge. Model 1d tested whether both State

and cluster were needed to predict the counts of K-means clustering assignments. This

model had the second-highest AIC score, and visual inspection of the QQ plot revealed

a slight deviation from the expected distribution. A likelihood ratio test between model

1b and 1d revealed a statistically significant difference between the two models,

suggesting that the cluster number is not important in predicting the counts of K-means

clustering assignments (P = 8.57 e-06). The VIF for all factors in model 1d were below

the threshold (3), and multicollinearity was not a concern (Table 9).

Page 50: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

43

Table 8: The four generalised linear mixed models constructed to identify a significant

relationship between the observed behavioural state and clusters in K-means clustering

# Model Degrees of

freedom

AIC

1a Poisson – observed behavioural state + clusters +

(1 | shark)

90 6957.5

1b Negative binomial – observed behavioural state +

clusters + (1 | shark)

89 613.3

1c Negative binomial - observed behavioural state *

clusters + (1 | shark)

73 604.8

1d Negative binomial – observed behavioural state +

(1 | shark)

93 634.1

Figure 12: Goodness of fit (QQ plots) for the generralised linear mixed models constructed to identify the

relationship between the Observed Behaioural State and Clusters in K-means clustering: (A) QQ plot for

the null model (model 1a). (B) QQ plot for the final model (1b)

Page 51: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

44

Table 9: GLMM analysis of K-means clustering: The variance inflation factors (VIF) for

observed behavioural state and cluster of model 1d (correlation among both variables is <3)

Factor VIF

Burst 1.50

Chafe 1.58

Headshake 1.53

Swim 1.63

Cluster 2 1.53

Cluster 3 1.53

Cluster 4 1.41

Cluster 5 1.42

The selected mixed effects model

Model 1b was the final model used to analyse the K-means confusion matrix

(Table 10). Rest behaviour and cluster zero were the reference groups for observed

behavioural state and cluster, respectively. The validity of model assumptions was

checked in both the full and reduced models (Bolker et al., 2009; Cheng et al., 2010;

Loy et al., 2017). There was no significant relationship found between the clusters and

the observed behavioural state. Therefore, K-means clustering could not correctly

identify elasmobranch behaviour.

Among the observed behavioural state subtypes, burst behaviour was

significantly different from rest behaviour (P = 0.03), lowering the number of assigned

cluster by an estimated 1.82 units. Swim behaviour was also significantly different

from rest behaviour (P = 2e-16), increasing the number of clusters assigned by an

estimated 6.16 units. Inspecting the clusters, clusters three and four were significantly

different to cluster zero (P = 7.87e-07 and P = 1.26e-06 respectively). Cluster three was

Page 52: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

45

associated with a 4.07 unit decrease in the number of cluster assignments, and cluster

four was associated with a 3.95 unit decrease in cluster assignments. There were no

additional differences found to be statistically significant across the observed

behavioural state and cluster subtypes.

Table 10: Results of the final generalised linear mixed model constructed to identify the

relationship between the observed behavioural state and clusters in K-means clustering

Fixed effects Estimate Standard

error

Z value P value

Intercept 3.34 0.70 4.76 1.99e-06

Burst -1.82 0.82 -2.23 0.03

Chafe -0.07 0.76 -0.09 0.9

Headshake -1.27 0.79 -1.60 0.11

Swim 6.16 0.73 8.34 2e-16

Cluster 1 -0.77 0.72 -1.06 0.29

Cluster 2 0.17 0.72 0.24 0.81

Cluster 3 -4.07 0.82 -4.94 7.87e-07

Cluster 4 -3.95 0.82 -4.85 1.26e-06

Hidden Markov model

Four states were identified as the optimal number of hidden states to best

represent the ODBA values. The majority of observations were represented by the

hidden states three and four (25,384 and 9045 seconds respectively) and the least

amount of observations represented by hidden state one (435 seconds).

The initial state distribution (denoted by δ) of the HMM was estimated as -4.5,

-2.8, -2.2 and -1 for the means, and 1, 2, 1, 2 for the standard deviations. The maximum

likelihood estimate computed for the four-state HMM was 9349.73. The nlm optimiser

Page 53: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

46

was run 1000 times to test the numerical stability of the MLE (24-hour computation

time). During the 1000 iterations, the nlm optimiser revealed a global maximum of

0.04 above the previously suggested MLE. However, the parameter estimates of the

four-state HMM were consistent with the initial state distribution and the higher MLE

was not considered a concern (Leos-Barajas and Michelot, 2018).

The final four distributions are illustrated (Figure 13). State one captured the

extremely low ODBA values while state two captured the highest ODBA values.

States three and four both captured the middle range ODBA values with an overlap

between the two states. The QQ plot revealed that the four state HMM followed the

expected line well except at the tails where there was a slight divergence from the

distribution (Figure 14).

Page 54: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

47

Figure 13: Illustration of the fitted state dependent distributions for the four-state HMM plotted over the

histogram of the combined ODBA values that were calculated from the accelerometry data of juvenile lemon

sharks (N.brevirostris). (A) State one (B) State two (C) State three (D) State four

Page 55: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

48

The Viterbi algorithm was used to determine the most likely sequence of

hidden states (Figure 15) and the most likely transition probabilities given the

sequence of ODBA values (Table 11). The transition probabilities provided are

labelled state persistence and state switching. State persistence is the probability that

an animal will remain in the same behavioural state as a time series continues. For

example, as most sharks swim continuously, a particularly high probability of state

persistence for shark swimming behaviour would be expected. State switching,

however, is the probability that animal will move from one behavioural state to the

next behavioural state (Leos‐Barajas et al., 2017; Quick et al., 2017). State switching

and state persistence were identified within and across all four states. However, the

probabilities for state persistence were substantially higher in comparison to state

Figure 14: Goodness of fit (QQ plot) of the four-state HMM’s application on accelerometry data

of juvenile lemon sharks (forecast pseudo-residuals are shown

Page 56: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

49

switching. States one and three had the highest transition probabilities for state

persistence (9.86e-01 and 9.41e-01, respectively). The lowest transition probabilities

were states three, for switching to state two (2.69e-16) and state one switching to state

four (1.02e-15).

Table 11: The transition probability matrix (TPM) given the sequence of combined ODBA

values (calculated by the Viterbi algorithm)

Time t + 1

Tim

e t

State 1 2 3 4

1 9.86e-01 1.39e-02 1.72e-10 5.01e-11

2 1.42e-13 7.38e-01 2.69e-16 2.62e-01

3 2.45e-04 2.11e-03 9.41e-01 5.70e-02

4 1.02e-15 2.48e-02 1.46e-01 8.29e-01

Figure 15: Histogram of the combined ODBA values that were calculated from the accelerometry data of

juvenile lemon sharks (Negaprion brevirostris) (N = 4); The colours correspond to the most probable

hidden states given the sequence of combined ODBA values: Black – State one; Red – State two; Green –

State three; Blue – State four

Page 57: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

50

Classification performance

Overall, the classification of the HMM was homogenous across the pseudo-

confusion matrix (Table 12). There was high congruity across the four individuals with

hidden state one and resting behaviour, and with hidden state three and swimming

behaviour. Hidden state four was assigned mostly to swim behaviour, with the

exception being 17 chafe events. Hidden state two was the only major exception to

the clarity of classification mentioned above. In sharks one, two, and three, hidden

state two was assigned to all the observed behaviours, and in shark four, this state was

assigned to three of five observed behaviours.

Page 58: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

51

Table 12: The confusion matrix generated to evaluate the performance of the HMM for

behaviour classification of lemon shark accelerometry data; where rows indicate actual

behavioural states and columns indicate clusters membership as pseudo-predictions

Shark Observed behavioural

state

Hidden state Total

1 2 3 4

1 Burst 0 4 0 0 4

Chafe 0 126 1 12 139

Headshake 0 57 0 0 57

Rest 81 1 0 0 82

Swim 0 494 20653 8259 29406

Total 81 682 20654 8271 29688

2 Burst 0 10 0 0 10

Chafe 0 81 0 0 81

Headshake 0 14 0 0 14

Rest 313 2 0 0 315

Swim 2 44 4650 579 5275

Total 315 151 4650 579 5695

3 Burst 0 12 0 0 12

Chafe 0 44 0 5 49

Headshake 0 26 0 0 26

Rest 39 2 1 1 43

Swim 0 29 48 71 148

Total 39 113 49 77 278

4 Burst 0 22 0 0 22

Chafe 0 0 0 0 0

Headshake 0 16 0 0 16

Rest 0 0 0 0 0

Swim 0 27 31 118 166

Total 0 65 31 118 214

Page 59: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

52

Mixed effects model to analyse the pseudo-predictions from the HMM

Model selection

From the five candidate models, the AIC’s (Table 13) and QQ plots (Figure

16) were compared. Model 2a held the highest AIC score and was significantly

overdispersed (P = 8.48e-05). Model 2b had a much lower AIC but was also

significantly overdispersed (P = 2.26e-13). Model 2c was the negative binomial model

with the interaction between the hidden state and the observed behavioural state and

had the lowest AIC score of the competing models. However, as the model did not

converge (due to complete separation), it was disregarded.

Model 2d the quasi-Poisson model that tested for the interaction term, which

had the second-lowest AIC and displayed no significant deviation from the normal

distribution. Model 2e was the quasi-Poisson model that tested for the association

term, which had an increased AIC in comparison to models 2c and 2d. Overall, model

2d was selected as the best-fitting model.

Page 60: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

53

Table 13: The five generalised linear mixed models constructed to identify the relationship

between the observed behavioural state and hidden state in the HMM

# Model Degrees of freedom AIC

2a Poisson – observed behavioural state + hidden state +

(1 | shark)

71 10962.9

2b Negative binomial – observed behavioural state +

hidden state + (1 | shark)

70 840.2

2c Negative binomial - observed behavioural state *

hidden state + (1 | shark)

58 437.4

2d Quasi-Poisson – observed behavioural state *

hidden state + (1 | shark)

58 492.6

2e Quasi-Poisson – observed behavioural state +

(1 | shark)

73 516.5

Figure 16: Goodness of fit (QQ plots) for the generralised linear mixed models constructed to identify the

relationship between the Observed Behaioural State and hidden state in the HMM: (A) QQ plot for the

null model (model 1a). (B) QQ plot for the final model (1d)

Page 61: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

54

A generalised linear mixed model for the HMM pseudo-predictions

Model 2d was the final model used to analyse the HMM confusion matrix

(Table 14). Again, rest behaviour and cluster zero were the reference groups for the

observed behavioural state and hidden state, respectively. The validity of model

assumptions was checked across all five models (Bolker et al., 2009; Cheng et al.,

2010; Loy et al., 2017). Model 2d identified a significant interaction between the

hidden states and the observed behavioural state. Therefore, the unsupervised HMM

could accurately identify elasmobranch behaviour.

Among the observed behaviours, the swim was the only observed behavioural

state to have statistically significant interaction with a hidden state. The combination

of swim behaviour and hidden state two was statistically significant (P = 2.99e-02),

increasing the number of HMM assignments by 3.02 units. The combination of swim

behaviour and hidden state three was also statistically significant (P = 1.51e-03),

increasing the number of HMM assignments by 5.46 units. Lastly, the combination of

swim behaviour and hidden state four was statistically significant (P = 2.33e-03),

increasing the number of HMM assignments by 5.04 units.

Page 62: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

55

Table 14: Results of the final generalised linear mixed model that identified a significant

interaction between the observed behavioural state and hidden state in the HMM

Fixed effects Estimate Standard error Z value P value

Intercept 6.15 0.77 7.95 1.81e-15

Burst -20.12 9130.00 -2.00e-03 9.98e-15

Chafe -27.31 331900.00 0.00 1.00

Headshake -19.67 7296.00 -3.00e-03 9.98e-01

Swim -1.85 1.15 -1.61 1.07e-01

Hidden state 2 -0.73 0.81 -0.91 3.65e-01

Hidden state 3 -1.88 1.15 -1.64 0.10

Hidden state 4 -1.88 1.15 -1.64 0.10

Burst: Hidden state 2 20.69 9130.00 2.00e-03 9.98e-01

Chafe: Hidden state 2 28.01 331900.00 0.00 1.00

Headshake: Hidden state 2 20.51 7296.00 3.00e-03 9.98e-01

Swim: Hidden state 2 3.02 1.39 2.17 2.99e-02

Burst: Hidden state 3 -7.74 1124000.00 0.00 1.00

Chafe: Hidden state 3 27.31 331900.00 0.00 1.00

Headshake: Hidden state 3 -7.27 708500.00 0.00 1.00

Swim: Hidden state 3 5.46 1.72 3.17 1.51e-03

Burst: Hidden state 4 -6.67 656400.00 0.00 1.00

Chafe: Hidden state 4 28.23 331900.00 0.00 1.00

Headshake: Hidden state 4 -6.17 408500.00 0.00 1.00

Swim: Hidden state 4 5.04 1.65 3.05 2.33e-03

Page 63: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

56

Discussion

To the best of our knowledge, the current study is the first to validate the use

of unsupervised ML methods to classify elasmobranch behaviour from accelerometry

data using ground-truthed data. Both unsupervised algorithms had poor performance

in distinguishing rare behaviours but performed well in discerning broad categories of

behaviour.

Classifier performance for unsupervised ML methods

K-means clustering

The benefits of K-means clustering include rapid classification of large data

sets with comprehensible results. In the current study, K-means clustering performed

poorly in the classification of lemon shark behaviour. There was no significant

relationship between clusters and the observed behavioural states, which was

illustrated with little homogeneity in the pseudo-confusion matrix. Furthermore, shark

one was the only individual to exhibit all five clusters during captive trials. Clusters

three and four were non-existent for sharks two, three and four, and suggest that five

clusters were not appropriate for K-means to accurately classify lemon shark

behaviour.

The CH criterion was the index used to select the appropriate number of

clusters for the ground-truthed observations, and it was chosen for its robust

performance with K-means clustering (Arbelaitz et al., 2013; Hämäläinen et al., 2017;

Maulik and Bandyopadhyay, 2002; Steinley and Brusco, 2011). As five clusters were

suggested, and there were five observed behaviours exhibited in the behavioural trials,

five clusters were deemed appropriate. However, the pseudo-predictions suggest that

Page 64: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

57

the CH criterion was incorrect and that the five clusters were not suited to the per

second accelerometry data. It is important to note that the CH criterion is a data-driven

validation method, meaning that the selected optimum number of clusters is dictated

only from the data itself. Therefore, with shark one representing 82.75% of the ground-

truthed observations alone, the CH criterion would have been significantly influenced

by this shark alone.

Although shark one contributed more than 29,000 seconds of the ground-

truthed observations, the poor performance of K-means clustering was consistent

across individuals. In shark one, cluster zero and one were repeatedly classified to

represent both rest and swim behaviours. This was expected given most shark species

swim continuously, which is represented by continuous acceleration in the waveform

signal of the sway axis (Gleiss et al., 2009a; Gleiss et al., 2009b). These regular

oscillations in the waveform signal only increase the difficulty in classifying shark

behaviour. For behaviours like rest and swim, misclassification is expected because

the only differences between the two behaviours are the frequency and intensity of

their acceleration waveform signal (Wilson et al., 2015).

As previously mentioned, clusters three and four were only assigned a

classification for shark one, representing <15% of the total observations. K-means has

classified these clusters to represent mostly swim behaviour (except for six seconds of

headshaking behaviour). Again, this may be a possible result of the imbalanced data

set, but it could also indicate that K-means clustering is artificially separating

swimming behaviour (Gleiss et al., 2017). This has been previously observed with K-

means clustering, where Whitney et al. (2010) was required to restrict the maximum

cluster number to prevent artificial separation of the same behaviour. However,

Page 65: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

58

without comparable class sizes for the individual sharks and their behaviours, it would

be impossible to tell.

Behavioural classification from K-means clustering was equally poor for

sharks two, three and four. There was no significant relationship between the observed

behaviours and K-means clustering, which was illustrated in the absence of cluster

uniformity across the pseudo-confusion matrix (Table 7). The poor performance in

behavioural classification may indicate that the underlying structure of the

accelerometry data is too complex for K-means clustering. Although K-means

clustering has successfully classified animal behaviour in the past (Schwager et al.,

2007; Watanabe et al., 2012), these studies demonstrate a relatively simple

implementation of the clustering method, with a limited number of behavioural states

(such as active or inactive) (Sakamoto et al., 2009; Schwager et al., 2007; Watanabe

et al., 2012). Conversely, applying K-means clustering to complex biological data

such as accelerometry data can produce impractical results due to its high

dimensionality that often surpasses the computational capability of the clustering

method (Ronan et al., 2016). That is, as the complexity of data increases, the

unsupervised clustering method is often confused by statistical noise, and fails to

create useful results (Ronan et al., 2016). Similarly, the pseudo-predictions classified

by K-means clustering were inconsistent. Little behavioural inference could be drawn

for basic behaviours such as rest and swim, and even less for short-lived behaviours

such as headshake or burst.

Hidden Markov model

Overall, the HMM performed relatively well. There was a significant

interaction between the hidden state and the observed behavioural state, which was

Page 66: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

59

illustrated through relatively homogeneous classifications across the pseudo-

confusion matrix. The significant interaction indicated that the performance of the

HMM was improved when classifying swimming behaviour for all four states. Of the

four states identified, state one contained the smallest ODBA values and was generally

associated with resting behaviour. State two, however, comprised of the higher ODBA

values and was labelled to represent all five observed behaviours at one point or

another. Despite a few exceptions (twelve seconds of chafe and one second of rest),

state three was mostly associated with swim behaviour, and state four was labelled to

represent both swim and chafe. As the state distributions of hidden states three and

four slightly overlapped, the pseudo-predictions were hardly surprising and can be

explained by the similar characteristics observed in the acceleration waveforms of

swim and chafe behaviour (Figure 8).

As previously mentioned, HMM’s are rich in mathematical structure and are

computationally complex. However, they also provide exhaustive and sophisticated

results. They provide information on two relationships: first is the relationship

between the hidden states and the observed states; Second is the transitions between

the states over time (Bolano, 2014). The second relationship was described using the

Viterbi algorithm and displayed with the TPM (Table 11). Based on the acceleration

data, the TPM illustrated much higher state persistence than state switching across all

states. Similar state persistence has been revealed in other marine wildlife (Bacheler

et al., 2019; Quick et al., 2017), and highlights the high correlation in lemon shark

behaviour.

The high accuracy of the unsupervised HMM was only further supported by

the TPM. As previously mentioned, hidden states one and three respectively

correspond to rest and swim behaviours. For resting behaviour, the extremely high

Page 67: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

60

probability for state persistence (0.99) is expected because once resting has been

initiated, it is extremely likely that the shark will continue to rest. For swimming

behaviour, the probability for state persistence had dropped slightly (0.94) in

comparison to resting behaviour but remains relatively high overall. This probability

is also anticipated because as most sharks continuously swim, it is more likely that the

shark will transition from swimming behaviour to a burst event (for example), and

back into swimming behaviour again. Therefore, the probability of state persistence

for swimming behaviour should remain high, while accounting for the inevitable

probability of state transitions. Comparatively, the accuracy in the HMM was poorest

with the classification of short-lived behaviours such as headshake, chafe and burst.

This classification confusion was illustrated not only through high heterogeneity in

the pseudo-confusion matrix for hidden state two but also through the much lower

state persistence probability for this hidden state (0.74).

State transition and persistence probabilities provide a valuable metric to

animal behaviour studies. The probabilities can also be used with biological

knowledge to identify environmental or spatiotemporal patterns in animal behaviour

(Goodall et al., 2019). For example, by comparing a Viterbi decoded ODBA time

series with the corresponding tides and temperatures, Leos‐Barajas et al. (2017)

revealed that blacktip reef sharks (Carcharhinus melanopterus) were most active in

warmer surface waters and were 30% more likely to be active in the late evenings.

Unsupervised ML methods are often criticised for their decreased accuracy and

reliability (Collingwood and Wilkerson, 2012; Ladds et al., 2016). The Viterbi

algorithm, however, is an undervalued tool in behavioural ecology that already

provides unsupervised HMM’s with the intelligence they require for improving the

automatic classification of animal behaviour.

Page 68: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

61

Comparing the performance of unsupervised machine learning

methods to a previously supervised machine learning method for

behavioural classification of shark behaviour

Brewster et al. (2018) used both captive and free-ranging lemon sharks to

collect accelerometry data for the development and validation of ML models for the

classification of animal behaviour. The latter study, an ensemble classifier (uses

multiple ML methods to construct an optimal model based on weighted model

selection) (Dietterich, 2000; Wang et al., 2018) was used to assess the performance of

supervised ML methods to predict the behaviour of wild lemon sharks (Brewster et

al., 2018). In the current study, the ground-truthed accelerometry data set collected by

Brewster et al. (2018), was used to create a pseudo-confusion matrix to analyse the

performance of two unsupervised ML methods: K-means clustering and HMM’s.

Overall, the performance of behavioural classification from K-means

clustering was comparatively lower than the unsupervised HMM, and the supervised

ensemble classifier. The HMM outperformed K-means clustering with improved

accuracy in the pseudo-predictions for swim and rest behaviours. In a similar pattern,

the ensemble classifier correctly classified both swim and rest behaviours with very

little error (Brewster et al., 2018). However, the accuracy in classifying lemon shark

behaviour was comparatively lower for short-lived behaviours such as headshake and

burst, for both the unsupervised HMM and the supervised ensemble classifier

(Brewster et al., 2018).

Despite the amount of time and the additional parameters used to train the

supervised ensemble classifier, the pseudo-predictions from the unsupervised HMM

Page 69: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

62

provided similar results to the ensemble classifier for the classification of lemon shark

behaviour.

Here, three ML methods were compared in their ability to classify

elasmobranch behaviour from accelerometry data (Table 15). Following this

overview, ecologists should base their choice of behavioural classification method on

both the available data and the objectives of the study. For example, when the aim is

to classify animal behaviour for a species that is easily observed, and the behaviour of

interest belongs to a set of previously defined behaviours, the supervised ensemble

classifier would enable precise and accurate behavioural classification. In contrast,

direct observations are often unattainable for species that live in aquatic environments,

and behaviour classification is inferred, not validated (Brewster et al., 2018). In which

case, K-means clustering would not be preferred, particularly because it disregards the

serial correlation often associated with animal movement data (Brewster et al., 2018;

Franke et al., 2006). Therefore, computational complexity is a small price to pay for a

highly informative unsupervised ML method that can accurately classify the behaviour

of free-ranging aquatic wildlife.

Page 70: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

63

Table 15: Concluding comparison of the performance of two unsupervised machine learning

techniques to a previously supervised machine learning technique

Supervised Unsupervised

Ensemble classifier K-means clustering HMM

Efficiency + +++ +

Analysis complexity +++ + +++

Accuracy +++ + +++

Precision +++ + ++

New/unknown behaviours + ++ +++

Requires validation data + - -

Limitations and future research

There were several limitations associated with the current study, some of

which have been mentioned earlier. As previously discussed, the methodology used

for data collection was designed with the aims listed by Brewster et al. (2018). Time,

equipment and funding for the ethogram trials were restricted, which lead to small

sample size (N = 4) and a highly imbalanced data set (for both individual sharks and

behaviours). Animal movement and behaviour studies are often constrained to small

sample sizes. For ML, in particular, small sample sizes have been associated with

lower model performance and can decrease the statistical power behind the

corresponding conclusions (Garamszegi, 2016; Guida et al., 2017; Hays et al., 2016).

Furthermore, the ground-truthed observations were restricted by the pre-defined

behaviours of juvenile lemon sharks which meant that novel behavioural patterns

would not be identified (Brewster et al., 2018; Chimienti et al., 2016).

There were also several limitations in the present study that was specific to

ML. For example, the imbalanced data set meant that swim behaviour represented

Page 71: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

64

over 95% of the ground-truthed observations. In which case, it was expected to see

higher misclassification rates for rare behaviours that were poorly represented. While

the overall classification performance of the HMM and ensemble classifier was

relatively high, the classification of headshake, chafe and burst behaviours were

particularly poor (Brewster et al., 2018). Misclassification of rare behaviours has been

previously observed in domestic cats (Watanabe et al., 2005), seals (Ladds et al., 2017;

Shuert et al., 2018) and vultures (Khosravifard et al., 2018). Similar to the

misclassification of swimming and resting behaviour, rare behaviours are repeatedly

misclassified because they share similar patterns in the acceleration waveform. For

rare behaviours, the main distinguishing feature is the variation in temporal patterns

across the acceleration waveform. Therefore, misclassification is a direct result of the

varying time both within and between rare behaviours (Hounslow et al., 2019; Ladds

et al., 2017; McClune et al., 2014).

Due to their high complexity, ecologists tend to shy away from HMM’s.

However, several R packages have been recently created to ease the complexity of the

models and make HMM’s more accessible for the analysis of movement data (Lawler

et al., 2019; McClintock and Michelot, 2018; Michelot et al., 2016; Visser and

Speekenbrink, 2010). For example, the packages moveHMM, momentuHMM and

CarHMM involve pre-processing of movement data, fitting an HMM and even

analysing the fitted model (Lawler et al., 2019; McClintock and Michelot, 2018;

Michelot et al., 2016). However, there is currently no R package that provides a

straightforward application of an HMM to accelerometry data, and no packages could

be used for this study.

For future studies, a better balance of ground-truthed observations such as

more burst and headshake could significantly improve the classification of lemon

Page 72: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

65

shark behaviour. For model performance, it is important to note that the pseudo-

predictions were not necessarily matched one to one with a biologically interpretable

behaviour. The inference from unsupervised ML methods is an inevitable issue.

Further investigation into the efficacy of unsupervised HMM’s is suggested to better

quantify the misclassification experienced in this study.

Conclusion

This study has highlighted the value of unsupervised ML in the study of free-

ranging animal behaviour. Overall, K-means clustering demonstrated weak

performance that was clearly surpassed by the HMM. The HMM performed

particularly well in the identification of basic behaviours (swim and rest) and provided

sophisticated metrics for the analysing animal behaviour without direct observation.

The HMM had a lower rate of performance for the classification of short-lived

behaviours such as burst and headshake. However, the compared supervised ML

method was also unable to discern short behaviours. Overall, the results from this

thesis demonstrated that the unsupervised HMM is a valuable method for the

automatic classification of elasmobranch behaviour from high-resolution movement

data while highlighting the limitations of classifying complex behaviours with ML.

Page 73: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

66

Literature Cited

Adam, T., Langrock, R., Weiß, C.H., 2019. Penalized estimation of flexible hidden Markov models for time series of counts. arXiv preprint arXiv:1901.03275.

Akaike, H., 1994. Implications of informational point of view on the development of statistical science, Selected Papers of Hirotugu Akaike. Springer, pp. 421-432.

Al Shalabi, L., Shaaban, Z., Kasasbeh, B., 2006. Data mining: A preprocessing engine. Journal of Computer Science 2, 735-739.

Arbelaitz, O., Gurrutxaga, I., Muguerza, J., PéRez, J.M., Perona, I., 2013. An extensive comparative study of cluster validity indices. Pattern Recognition 46, 243-256.

Baayen, R.H., Davidson, D.J., Bates, D.M., 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59, 390-412.

Bacheler, N.M., Michelot, T., Cheshire, R.T., Shertzer, K.W., 2019. Fine-scale movement patterns and behavioral states of gray triggerfish Balistes capriscus determined from acoustic telemetry and hidden Markov models. Fisheries Research 215, 76-89.

Bäckman, J., Andersson, A., Pedersen, L., Sjöberg, S., Tøttrup, A.P., Alerstam, T., 2017. Actogram analysis of free-flying migratory birds: new perspectives based on acceleration logging. Journal of Comparative Physiology A 203, 543-564.

Bao, L., Intille, S.S., 2004. Activity recognition from user-annotated acceleration data, International Conference on Pervasive Computing. Springer, pp. 1-17.

Barr, D.J., Levy, R., Scheepers, C., Tily, H.J., 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language 68, 255-278.

Bates, D., Mächler, M., Bolker, B., Walker, S., 2014. Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823.

Baum, L.E., Eagon, J.A., 1967. An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology. Bulletin of the American Mathematical Society 73, 360-363.

Baum, L.E., Petrie, T., 1966. Statistical inference for probabilistic functions of finite state Markov chains. The annals of mathematical statistics 37, 1554-1563.

Berlincourt, M., Angel, L.P., Arnould, J.P., 2015. Combined use of GPS and accelerometry reveals fine scale three-dimensional foraging behaviour in the Short-Tailed Shearwater. PloS one 10, e0139351.

Bidder, O.R., Campbell, H.A., Gómez-Laich, A., Urgé, P., Walker, J., Cai, Y., Gao, L., Quintana, F., Wilson, R.P., 2014. Love thy neighbour: automatic animal behavioural classification of acceleration data using the k-nearest neighbour algorithm. PloS one 9, e88609.

Birodkar, V., Edla, D.R., 2014. Enhanced K-Means Clustering Algorithm using A Heuristic Approach. Journal of Information and Computing Science 9, 277-284.

Bograd, S.J., Block, B.A., Costa, D.P., Godley, B.J., 2010. Biologging technologies: new tools for conservation. Introduction. Endangered Species Research 10, 1-7.

Bolano, D., 2014. Hidden Markov models: an approach to sequence analysis in population studies, Annual meeting of the Population Association of America.

Page 74: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

67

Bolker, B.M., Brooks, M.E., Clark, C.J., Geange, S.W., Poulsen, J.R., Stevens, M.H.H., White, J.-S.S., 2009. Generalized linear mixed models: a practical guide for ecology and evolution. Trends in ecology & evolution 24, 127-135.

Bouten, W., Baaij, E.W., Shamoun-Baranes, J., Camphuysen, K.C., 2013. A flexible GPS tracking system for studying bird behaviour at multiple scales. Journal of Ornithology 154, 571-580.

Boyd, I.L., Kato, A., Ropert-Coudert, Y., 2004. Bio-logging science: sensing beyond the boundaries.

Bozdogan, H., 1987. Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika 52, 345-370.

Brewster, L., Dale, J., Guttridge, T., Gruber, S., Hansell, A., Elliott, M., Cowx, I., Whitney, N., Gleiss, A., 2018. Development and application of a machine learning algorithm for classification of elasmobranch behaviour from accelerometry data. Marine Biology 165, 62.

Brodie, S., Lédée, E.J., Heupel, M.R., Babcock, R.C., Campbell, H.A., Gledhill, D.C., Hoenner, X., Huveneers, C., Jaine, F.R., Simpfendorfer, C.A., 2018. Continental-scale animal tracking reveals functional movement classes across marine taxa. Scientific Reports 8, 3717.

Broman, K.W., Speed, T.P., 2002. A model selection approach for the identification of quantitative trait loci in experimental crosses. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64, 641-656.

Brooker, R.M., Feeney, W.E., White, J.R., Manassa, R.P., Johansen, J.L., Dixson, D.L., 2016. Using insights from animal behaviour and behavioural ecology to inform marine conservation initiatives. Animal Behaviour 120, 211-221.

Brooks, M.E., Kristensen, K., van Benthem, K.J., Magnusson, A., Berg, C.W., Nielsen, A., Skaug, H.J., Machler, M., Bolker, B.M., 2017. glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R journal 9, 378-400.

Brown, D.D., Kays, R., Wikelski, M., Wilson, R., Klimley, A.P., 2013. Observing the unwatchable through acceleration logging of animal behavior. Animal Biotelemetry 1, 20.

Burnham, K.P., Anderson, D.R., Huyvaert, K.P., 2011. AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behavioral ecology and sociobiology 65, 23-35.

Caliński, T., Harabasz, J., 1974. A dendrite method for cluster analysis. Communications in Statistics-theory and Methods 3, 1-27.

Campbell, H.A., Gao, L., Bidder, O.R., Hunter, J., Franklin, C.E., 2013. Creating a behavioural classification module for acceleration data: using a captive surrogate for difficult to observe species. Journal of Experimental Biology 216, 4501-4506.

Carroll, G., Slip, D., Jonsen, I., Harcourt, R., 2014. Supervised accelerometry analysis can identify prey capture by penguins at sea. Journal of Experimental Biology 217, 4295-4302.

Chapman, D.D., Babcock, E.A., Gruber, S.H., Dibattista, J.D., Franks, B.R., Kessel, S.A., Guttridge, T., Pikitch, E.K., Feldheim, K.A., 2009. Long‐term natal site‐fidelity by immature lemon sharks (Negaprion brevirostris) at a subtropical island. Molecular Ecology 18, 3500-3507.

Chapple, T.K., Gleiss, A.C., Jewell, O.J., Wikelski, M., Block, B.A., 2015. Tracking sharks without teeth: a non-invasive rigid tag attachment for large predatory sharks. Animal Biotelemetry 3, 14.

Page 75: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

68

Cheng, J., Edwards, L.J., Maldonado‐Molina, M.M., Komro, K.A., Muller, K.E., 2010. Real longitudinal data analysis for real people: building a good enough mixed model. Statistics in Medicine 29, 504-520.

Chimienti, M., Cornulier, T., Owen, E., Bolton, M., Davies, I.M., Travis, J.M., Scott, B.E., 2016. The use of an unsupervised learning approach for characterizing latent behaviors in accelerometer data. Ecology and Evolution 6, 727-741.

Choy, G., Khalilzadeh, O., Michalski, M., Do, S., Samir, A.E., Pianykh, O.S., Geis, J.R., Pandharipande, P.V., Brink, J.A., Dreyer, K.J., 2018. Current applications and future impact of machine learning in radiology. Radiology 288, 318-328.

Christensen‐Dalsgaard, S., May, R., Lorentsen, S.H., 2018. Taking a trip to the shelf: Behavioral decisions are mediated by the proximity to foraging habitats in the black‐legged kittiwake. Ecology and Evolution 8, 866-878.

Collingwood, L., Wilkerson, J., 2012. Tradeoffs in accuracy and efficiency in supervised learning methods. Journal of Information Technology & Politics 9, 298-318.

Dean, B., Freeman, R., Kirk, H., Leonard, K., Phillips, R.A., Perrins, C.M., Guilford, T., 2012. Behavioural mapping of a pelagic seabird: combining multiple sensors and a hidden Markov model reveals the distribution of at-sea behaviour. Journal of the Royal Society Interface, rsif20120570.

Deguchi, T., Suryan, R.M., Ozaki, K., Jacobs, J.F., Sato, F., Nakamura, N., Balogh, G.R., 2014. Translocation and hand-rearing of the short-tailed albatross Phoebastria albatrus: early indicators of success for species conservation and island restoration. Oryx 48, 195-203.

Dennis Jr, J.E., Schnabel, R.B., 1996. Numerical methods for unconstrained optimization and nonlinear equations. Siam.

DeRuiter, S.L., Langrock, R., Skirbutas, T., Goldbogen, J.A., Calambokidis, J., Friedlaender, A.S., Southall, B.L., 2017. A multivariate mixed hidden Markov model for blue whale behaviour and responses to sound exposure. The Annals of Applied Statistics 11, 362-392.

Dewhirst, O.P., Roskilly, K., Hubel, T.Y., Jordan, N.R., Golabek, K.A., McNutt, J.W., Wilson, A.M., 2017. An exploratory clustering approach for extracting stride parameters from tracking collars on free-ranging wild animals. Journal of Experimental Biology 220, 341-346.

Dietterich, T.G., 2000. Ensemble methods in machine learning, International workshop on multiple classifier systems. Springer, pp. 1-15.

Duncan, C., Jones, K., Moon, G., 1998. Context, composition and heterogeneity: using multilevel models in health research. Social science & medicine 46, 97-117.

Dundar, M., Krishnapuram, B., Bi, J., Rao, R.B., 2007. Learning Classifiers When the Training Data Is Not IID, IJCAI, pp. 756-761.

Eddelbuettel, D., François, R., Allaire, J., Ushey, K., Kou, Q., Russel, N., Chambers, J., Bates, D., 2011. Rcpp: Seamless R and C++ integration. Journal of Statistical Software 40, 1-18.

Eddelbuettel, D., Sanderson, C., 2014. RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics & Data Analysis 71, 1054-1063.

Fallucchi, F., Luccasen III, R.A., Turocy, T.L., 2017. Identifying discrete behavioural types: A re-analysis of public goods game contributions by hierarchical clustering. CBESS Discussion Paper.

Franke, A., Caelli, T., Kuzyk, G., Hudson, R.J., 2006. Prediction of wolf (Canis lupus) kill-sites using hidden Markov models. Ecological modelling 197, 237-246.

Page 76: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

69

Garamszegi, L.Z., 2016. A simple statistical guide for the analysis of behaviour when data are constrained due to practical or ethical reasons. Animal behaviour 120, 223-234.

Garriga, J., Palmer, J.R., Oltra, A., Bartumeus, F., 2016. Expectation-maximization binary clustering for behavioural annotation. PLoS One 11, e0151984.

Gleiss, A.C., Dale, J.J., Holland, K.N., Wilson, R.P., 2010. Accelerating estimates of activity-specific metabolic rate in fishes: testing the applicability of acceleration data-loggers. Journal of Experimental Marine Biology and Ecology 385, 85-91.

Gleiss, A.C., Gruber, S.H., Wilson, R.P., 2009a. Multi-channel data-logging: towards determination of behaviour and metabolic rate in free-swimming sharks, Tagging and tracking of marine animals with electronic devices. Springer, pp. 211-228.

Gleiss, A.C., Morgan, D.L., Whitty, J.M., Keleher, J.J., Fossette, S., Hays, G.C., 2017. Are vertical migrations driven by circadian behaviour? Decoupling of activity and depth use in a large riverine elasmobranch, the freshwater sawfish (Pristis pristis). Hydrobiologia 787, 181-191.

Gleiss, A.C., Norman, B., Liebsch, N., Francis, C., Wilson, R.P., 2009b. A new prospect for tagging large free-swimming sharks with motion-sensitive data-loggers. Fisheries Research 97, 11-16.

Goodall, V.L., Ferreira, S.M., Funston, P.J., Maruping-Mzileni, N., 2019. Uncovering hidden states in African lion movement data using hidden Markov models. Wildlife Research.

Google Earth, 2019. North Bimini, The Bahamas Google Earth.

Graf, P.M., Wilson, R.P., Qasem, L., Hackländer, K., Rosell, F., 2015. The use of acceleration to code for animal behaviours; a case study in free-ranging Eurasian beavers Castor fiber. PloS one 10, e0136751.

Grünewälder, S., Broekhuis, F., Macdonald, D.W., Wilson, A.M., McNutt, J.W., Shawe-Taylor, J., Hailes, S., 2012. Movement activity based classification of animal behaviour with an application to data from cheetah (Acinonyx jubatus). PloS one 7, e49120.

Guerra, L., McGarry, L.M., Robles, V., Bielza, C., Larranaga, P., Yuste, R., 2011. Comparison between supervised and unsupervised classifications of neuronal cell types: a case study. Developmental neurobiology 71, 71-82.

Guida, L., Dapp, D.R., Huveneers, C.P., Walker, T.I., Reina, R.D., 2017. Evaluating time-depth recorders as a tool to measure the behaviour of sharks captured on longlines. Journal of experimental marine biology and ecology 497, 120-126.

Hämäläinen, J., Jauhiainen, S., Kärkkäinen, T., 2017. Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering. Algorithms 10, 105.

Hart, K.M., White, C.F., Iverson, A.R., Whitney, N., 2016. Trading shallow safety for deep sleep: juvenile green turtles select deeper resting sites as they grow. Endangered Species Research 31, 61-73.

Hartig, F., 2017. DHARMa: residual diagnostics for hierarchical (multi-level/mixed) regression models. R package version 0.1 5.

Hays, G.C., Ferreira, L.C., Sequeira, A.M., Meekan, M.G., Duarte, C.M., Bailey, H., Bailleul, F., Bowen, W.D., Caley, M.J., Costa, D.P., 2016. Key questions in marine megafauna movement ecology. Trends in ecology & evolution 31, 463-475.

Hennig, C., 2010. fpc: Flexible procedures for clustering. R package version 2, 0-3.

Page 77: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

70

Hernández-Pliego, J., Rodríguez, C., Dell’Omo, G., Bustamante, J., 2017. Combined use of tri-axial accelerometers and GPS reveals the flexible foraging strategy of a bird in relation to weather conditions. PloS one 12, e0177892.

Hochscheid, S., Wilson, R.P., 1999. A new method for the determination of at-sea activity in sea turtles. Marine ecology progress series 185, 293-296.

Hounslow, J., Brewster, L., Lear, K., Guttridge, T., Daly, R., Whitney, N., Gleiss, A., 2019. Assessing the effects of sampling frequency on behavioural classification of accelerometer data. Journal of Experimental Marine Biology and Ecology 512, 22-30.

Jain, A.K., Murty, M.N., Flynn, P.J., 1999. Data clustering: a review. ACM computing surveys (CSUR) 31, 264-323.

Johnson, P.C., Barry, S.J., Ferguson, H.M., Müller, P., 2015. Power analysis for generalized linear mixed models in ecology and evolution. Methods in ecology and evolution 6, 133-142.

Jonsen, I., Basson, M., Bestley, S., Bravington, M., Patterson, T., Pedersen, M.W., Thomson, R., Thygesen, U.H., Wotherspoon, S., 2013. State-space models for bio-loggers: A methodological road map. Deep Sea Research Part II: Topical Studies in Oceanography 88, 34-46.

Jurafsky, D., Martin, J.H., 2014. Speech and language processing. Pearson London:.

Khosravifard, S., Venus, V., Skidmore, A.K., Bouten, W., Munoz, A.R., Toxopeus, A.G., 2018. Identification of Griffon Vulture’s Flight Types Using High-Resolution Tracking Data. International journal of environmental research 12, 313-325.

Kooyman, G.L., 1966. Maximum diving capacities of the Weddell seal, Leptonychotes weddeili. Science 151, 1553-1554.

Kooyman, G.L., 2004. Genesis and evolution of bio-logging devices: l963-2002.

Koski, A., 1996. Modelling ECG signals with hidden Markov models. Artificial intelligence in medicine 8, 453-471.

Krueger, C., Tian, L., 2004. A comparison of the general linear mixed model and repeated measures ANOVA using a dataset with multiple missing data points. Biological research for nursing 6, 151-157.

Kumar, R.R., Prasad, A.B., 2013. K Means clustering algorithm for partitioning data sets evaluated from horizontal aggregations. IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN, 2278-0661.

Ladds, M.A., Thompson, A.P., Kadar, J.-P., Slip, D.J., Hocking, D.P., Harcourt, R.G., 2017. Super machine learning: improving accuracy and reducing variance of behaviour classification from accelerometry. Animal Biotelemetry 5, 8.

Ladds, M.A., Thompson, A.P., Slip, D.J., Hocking, D.P., Harcourt, R.G., 2016. Seeing It All: Evaluating Supervised Machine Learning Methods for the Classification of Diverse Otariid Behviours. PLoS ONE 11, 1-17.

Langfelder, P., Zhang, B., Horvath, S., 2007. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics 24, 719-720.

Lawler, E., Whoriskey, K., Aeberhard, W.H., Field, C., Flemming, J.M., 2019. The conditionally autoregressive hidden Markov model (CarHMM): Inferring behavioural states from animal tracking data exhibiting conditional autocorrelation. arXiv preprint arXiv:1903.04999.

Page 78: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

71

Lear, K.O., Whitney, N.M., Brewster, L.R., Morris, J.J., Hueter, R.E., Gleiss, A.C., 2017. Correlations of metabolic rate and body acceleration in three species of coastal sharks under contrasting temperature regimes. Journal of Experimental Biology 220, 397-407.

Leos-Barajas, V., Michelot, T., 2018. An Introduction to Animal Movement Modeling with Hidden Markov Models using Stan for Bayesian Inference. arXiv preprint arXiv:1806.10639.

Leos‐Barajas, V., Photopoulou, T., Langrock, R., Patterson, T.A., Watanabe, Y.Y., Murgatroyd, M., Papastamatiou, Y.P., 2017. Analysis of animal accelerometer data using hidden Markov models. Methods in Ecology and Evolution 8, 161-173.

Li, M., Bolker, B.M., 2017. Incorporating periodic variability in hidden Markov models for animal movement. Movement ecology 5, 1.

Liisberg, J., Møller, J.K., Bloem, H., Cipriano, J., Mor, G., Madsen, H., 2016. Hidden Markov Models for indirect classification of occupant behaviour. Sustainable cities and society 27, 83-98.

Liu, T., Lemeire, J., 2017. Efficient and effective learning of hmms based on identification of hidden states. Mathematical Problems in Engineering 2017.

Lorbach, M., Poppe, R., van Dam, E., Veltkamp, R., Noldus, L., 2016. Transfer Learning for Rodent Behavior Recognition, Proceedings of Measuring Behavior 2016, pp. 461-489.

Loy, A., Hofmann, H., Cook, D., 2017. Model Choice and Diagnostics for Linear Mixed-Effects Models Using Statistics on Street Corners. Journal of Computational and Graphical Statistics 26, 478-492.

Mannini, A., Sabatini, A.M., 2010. Machine learning methods for classifying human physical activity from on-body accelerometers. Sensors 10, 1154-1175.

Martiskainen, P., Järvinen, M., Skön, J.-P., Tiirikainen, J., Kolehmainen, M., Mononen, J., 2009. Cow behaviour pattern recognition using a three-dimensional accelerometer and support vector machines. Applied animal behaviour science 119, 32-38.

Mason, G.J., 1991. Stereotypies: a critical review. Animal behaviour 41, 1015-1037.

Maulik, U., Bandyopadhyay, S., 2002. Performance evaluation of some clustering algorithms and validity indices. IEEE Transactions on pattern analysis and machine intelligence 24, 1650-1654.

McClintock, B.T., Michelot, T., 2018. momentuHMM: R package for generalized hidden Markov models of animal movement. Methods in Ecology and Evolution 9, 1518-1530.

McClune, D.W., Marks, N.J., Wilson, R.P., Houghton, J.D., Montgomery, I.W., McGowan, N.E., Gormley, E., Scantlebury, M., 2014. Tri-axial accelerometers quantify behaviour in the Eurasian badger (Meles meles): towards an automated interpretation of field data. Animal Biotelemetry 2, 5.

McKellar, A.E., Langrock, R., Walters, J.R., Kesler, D.C., 2014. Using mixed hidden Markov models to examine behavioral states in a cooperatively breeding bird. Behavioral Ecology 26, 148-157.

Michelot, T., Langrock, R., Patterson, T.A., 2016. moveHMM: an R package for the statistical modelling of animal movement data using hidden Markov models. Methods in Ecology and Evolution 7, 1308-1315.

Mohamad, I.B., Usman, D., 2013. Standardization and its effects on K-means clustering algorithm. Research Journal of Applied Sciences, Engineering and Technology 6, 3299-3303.

Page 79: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

72

Mohri, M., Rostamizadeh, A., Talwalkar, A., 2012. Foundations of machine learning. MIT press.

Morales, J.M., Haydon, D.T., Frair, J., Holsinger, K.E., Fryxell, J.M., 2004. Extracting more out of relocation data: building movement models as mixtures of random walks. Ecology 85, 2436-2445.

Morissette, L., Chartier, S., 2013. The k-means clustering technique: General considerations and implementation in Mathematica. Tutorials in Quantitative Methods for Psychology 9, 15-24.

Myrberg Jr, A.A., Gruber, S.H., 1974. The behavior of the bonnethead shark, Sphyrna tiburo. Copeia, 358-374.

Naimi, B., Hamm, N.A., Groen, T.A., Skidmore, A.K., Toxopeus, A.G., 2014. Where is positional uncertainty a problem for species distribution modelling? Ecography 37, 191-203.

Nakamura, I., Watanabe, Y.Y., Papastamatiou, Y.P., Sato, K., Meyer, C.G., 2011. Yo-yo vertical movements suggest a foraging strategy for tiger sharks Galeocerdo cuvier. Marine Ecology Progress Series 424, 237-246.

Nams, V.O., 2014. Combining animal movements and behavioural data to detect behavioural states. Ecology letters 17, 1228-1237.

Nazeer, K.A., Sebastian, M., 2009. Improving the Accuracy and Efficiency of the k-means Clustering Algorithm, Proceedings of the world congress on engineering, pp. 1-3.

Ngô, M.C., Heide-Jørgensen, M.P., Ditlevsen, S., 2019. Understanding narwhal diving behaviour using Hidden Markov Models with dependent state distributions and long range dependence. PLoS computational biology 15, e1006425.

Nishiumi, N., Matsuo, A., Kawabe, R., Payne, N., Huveneers, C., Watanabe, Y.Y., Kawabata, Y., 2018. A miniaturized threshold-triggered acceleration data-logger for recording burst movements of aquatic animals. Journal of Experimental Biology 221, jeb172346.

Norouzzadeh, M.S., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M.S., Packer, C., Clune, J., 2018. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. Proceedings of the National Academy of Sciences 115, E5716-E5725.

Patel, V.R., Mehta, R.G., 2011. Impact of outlier removal and normalization approach in modified k-means clustering algorithm. International Journal of Computer Science Issues (IJCSI) 8, 331.

Patterson, A., Gilchrist, H.G., Chivers, L., Hatch, S., Elliott, K., 2019. A comparison of techniques for classifying behavior from accelerometers for two species of seabird. Ecology and Evolution 9, 3030-3045.

Patterson, T.A., Thomas, L., Wilcox, C., Ovaskainen, O., Matthiopoulos, J., 2008. State–space models of individual animal movement. Trends in ecology & evolution 23, 87-94.

Pavel, M.-S., Sattarov, T., On the importance of preprocessing and initialization in k-means.

Payne, N.L., Iosilevskii, G., Barnett, A., Fischer, C., Graham, R.T., Gleiss, A.C., Watanabe, Y.Y., 2016. Great hammerhead sharks swim on their side to reduce transport costs. Nature communications 7, 12289.

Plakhov, A., Tchemisova, T., Freitas, A., 2015. Optimization in the Natural Sciences: 30th Euro Mini-Conference, EmC-ONS 2014, Aveiro, Portugal, February 5-9, 2014: revised selected papers. Springer.

Page 80: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

73

Pohle, J., Langrock, R., van Beest, F.M., Schmidt, N.M., 2017. Selecting the number of states in hidden Markov models: pragmatic solutions illustrated using animal movement. Journal of Agricultural, Biological and Environmental Statistics 22, 270-293.

Polansky, L., Wittemyer, G., Cross, P.C., Tambling, C.J., Getz, W.M., 2010. From moonlight to movement and synchronized randomness: Fourier and wavelet analyses of animal location time series data. Ecology 91, 1506-1518.

Posada, D., Buckley, T.R., 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic biology 53, 793-808.

Quick, N.J., Isojunno, S., Sadykova, D., Bowers, M., Nowacek, D.P., Read, A.J., 2017. Hidden Markov models reveal complexity in the diving behaviour of short-finned pilot whales. Scientific reports 7, 45765.

R Development Core Team, 2019. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Rabiner, L.R., 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257-286.

Reby, D., Lek, S., Dimopoulos, I., Joachim, J., Lauga, J., Aulagnier, S., 1997. Artificial neural networks as a classification method in the behavioural sciences. Behavioural processes 40, 35-43.

Resheff, Y.S., Rotics, S., Harel, R., Spiegel, O., Nathan, R., 2014. AcceleRater: a web application for supervised learning of behavioral modes from acceleration measurements. Movement ecology 2, 27.

Richards, S.A., Whittingham, M.J., Stephens, P.A., 2011. Model selection and model averaging in behavioural ecology: the utility of the IT-AIC framework. Behavioral Ecology and Sociobiology 65, 77-89.

Robert, C., 2014. Machine learning, a probabilistic perspective. Taylor & Francis.

Ronan, T., Qi, Z., Naegle, K.M., 2016. Avoiding common pitfalls when clustering biological data. Sci. Signal. 9, re6-re6.

Rooney, T.J., 2016. On improving the forecast accuracy of the hidden Markov model. University of Cape Town.

Ropert-Coudert, Y., Beaulieu, M., Hanuise, N., Kato, A., 2009. Diving into the world of biologging. Endangered Species Research 10, 21-27.

Rutz, C., Hays, G.C., 2009. New frontiers in biologging science. The Royal Society.

Sakamoto, K.Q., Sato, K., Ishizuka, M., Watanuki, Y., Takahashi, A., Daunt, F., Wanless, S., 2009. Can Ethograms Be Automatically Generated Using Body Acceleration Data from Free-Ranging Birds? PLoS ONE 4.

Samuel, A.L., 1959. Some studies in machine learning using the game of checkers. IBM Journal of research and development 3, 210-229.

Sathya, R., Abraham, A., 2013. Comparison of supervised and unsupervised learning algorithms for pattern classification. International Journal of Advanced Research in Artificial Intelligence 2, 34-38.

Schliep, A., 2001. A bayesian approach to learning hidden markov model topology with applications to biological sequence analysis. Universität zu Köln.

Page 81: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

74

Scholander, P.F., 1940. Experimental investigations on the respiratory function in diving mammals and birds.

Schwager, M., Anderson, D.M., Butler, Z., Rus, D., 2007. Robust classification of animal tracking data. Computers and Electronics in Agriculture 56, 46-59.

Sclove, S.L., 1987. Application of model-selection criteria to some problems in multivariate analysis. Psychometrika 52, 333-343.

Shepard, E.L.C., Wilson, R.P., Quintana, F., Laich, A.G., Liebsch, N., Albareda, D.A., Halsey, L.G., Gleiss, A., Morgan, D.T., Myers, A.E., Newman, C., Macdonald, D.W., 2008. Identification of animal movement patterns using tri-axial accelerometry. Endangered Species Research 10, 47-60.

Shuert, C.R., Pomeroy, P.P., Twiss, S.D., 2018. Assessing the utility and limitations of accelerometers and machine learning approaches in classifying behaviour during lactation in a phocid seal. Animal Biotelemetry 6, 14.

Steinley, D., Brusco, M.J., 2011. Choosing the number of clusters in Κ-means clustering. Psychological Methods 16, 285.

Stephens, M., 1996. Dealing with the multimodal distributions of mixture model parameters. Available from the MCMC Preprint Service at http://www.stats.bris.ac.uk/MCMC.

Sur, M., Suffredini, T., Wessells, S.M., Bloom, P.H., Lanzone, M., Blackshire, S., Sridhar, S., Katzner, T., 2017. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds. PloS one 12, e0174785.

Swihart, R.K., Slade, N.A., 1985. Testing for independence of observations in animal movements. Ecology 66, 1176-1184.

Symonds, M.R., Moussalli, A., 2011. A brief guide to model selection, multimodel inference and model averaging in behavioural ecology using Akaike’s information criterion. Behavioral Ecology and Sociobiology 65, 13-21.

Torrence, C., Compo, G.P., 1998. A practical guide to wavelet analysis. Bulletin of the American Meteorological society 79, 61-78.

Trave, C., Sheaves, M., 2014. Bimini Islands: a characterization of the two major nursery areas; status and perspectives. SpringerPlus 3, 270.

Tucker, B.C., Anand, M., 2005. On the use of stationary versus hidden Markov models to detect simple versus complex ecological dynamics. Ecological Modelling 185, 177-193.

Valletta, J.J., Torney, C., Kings, M., Thornton, A., Madden, J., 2017. Applications of Machine Learning in Animal Behaviour Studies. Animal Behaviour 124, 203-220.

Van Moorter, B., Visscher, D.R., Jerde, C.L., Frair, J.L., Merrill, E.H., 2010. Identifying movement states from location data using cluster analysis. Journal of Wildlife Management 74, 588-594.

Virens, J., Cree, A., 2018. Further miniaturisation of the Thermochron iButton to create a thermal bio-logger weighing 0.3 g. Journal of Experimental Biology 221, jeb176354.

Visser, I., Speekenbrink, M., 2010. depmixS4: an R package for hidden Markov models. Journal of Statistical Software 36, 1-21.

Viterbi, A., 1967. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE transactions on Information Theory 13, 260-269.

Page 82: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

75

Wang, G., 2019. Machine learning for inferring animal behavior from location and movement data. Ecological Informatics 49, 69-76.

Wang, J., He, Z., Zheng, G., Gao, S., Zhao, K., 2018. Development and validation of an ensemble classifier for real-time recognition of cow behavior patterns from accelerometer data and location data. PloS one 13, e0203546.

Watanabe, S., Izawa, M., Kato, A., Ropert-Coudert, Y., Naito, Y., 2005. A new technique for monitoring the detailed behaviour of terrestrial animals: a case study with the domestic cat. Applied Animal Behaviour Science 94, 117-131.

Watanabe, S., Sato, K., Ponganis, P.J., 2012. Activity time budget during foraging trips of emperor penguins. PloS one 7, e50357.

Wedel, M., DeSarbo, W.S., Bult, J.R., Ramaswamy, V., 1993. A latent class Poisson regression model for heterogeneous count data. Journal of Applied Econometrics 8, 397-411.

Westra, S., Sharma, A., 2006. Dominant modes of interannual variability in Australian rainfall analyzed using wavelets. Journal of Geophysical Research: Atmospheres 111.

Whitney, N.M., Papastamatiou, Y.P., Gleiss, A.C., 2012. Integrative multi-sensor tagging: emerging techniques to link elasmobranch behavior, physiology and ecology. Biology of sharks and their relatives 1, 265-290.

Whitney, N.M., Papastamatiou, Y.P., Holland, K.N., Lowe, C.G., 2007. Use of an acceleration data logger to measure diel activity patterns in captive whitetip reef sharks, Triaenodon obesus. Aquatic Living Resources 20, 299-305.

Whitney, N.M., Pratt Jr, H.L., Pratt, T.C., Carrier, J.C., 2010. Identifying shark mating behaviour using three-dimensional acceleration loggers. Endangered Species Research 10, 71-82.

Williams, H.J., Holton, M.D., Shepard, E.L., Largey, N., Norman, B., Ryan, P.G., Duriez, O., Scantlebury, M., Quintana, F., Magowan, E.A., 2017. Identification of animal movement patterns using tri-axial magnetometry. Movement ecology 5, 6.

Wilmers, C.C., Nickel, B., Bryce, C.M., Smith, J.A., Wheat, R.E., Yovovich, V., 2015. The golden age of bio‐logging: how animal‐borne sensors are advancing the frontiers of ecology. Ecology 96, 1741-1753.

Wilson, A.D., Brownscombe, J.W., Krause, J., Krause, S., Gutowsky, L.F., Brooks, E.J., Cooke, S.J., 2015. Integrating network analysis, sensor tags, and observation to understand shark ecology and behavior. Behavioral Ecology 26, 1577-1586.

Wilson, R.P., Shepard, E., Liebsch, N., 2008. Prying into the intimate details of animal lives: use of a daily diary on animals. Endangered Species Research 4, 123-137.

Winter, B., 2013. A very basic tutorial for performing linear mixed effects analyses. arXiv preprint arXiv:1308.5499.

Xiong, Z., Zhang, Z., 2016. A data preprocessing method applied to cluster analysis on stock data by K-means, International Conference on Intelligent Control and Computer Application, Atlantis Press, pp. 142-145.

Xydes, A., Moline, M., Lowe, C.G., Farrugia, T.J., Clark, C., 2013. Behavioral characterization and particle filter localization to improve temporal resolution and accuracy while tracking acoustically tagged fishes. Ocean engineering 61, 1-11.

Yoda, K., 2019. Advances in bio-logging techniques and their application to study navigation in wild seabirds. Advanced Robotics 33, 108-117.

Page 83: Testing the Efficacy of Unsupervised Machine Learning Techniques to … · 2019-09-02 · accelerometers proved effective by easily identifying basic active and inactive behaviours

76

Yoda, K., Natio, Y., Sato, K., Takahashi, A., Nishikawa, J., Robert-Coudert, Y., Kurita, M., Maho, Y.L., 2001. A New Technique for Monitoring the Behviour of Free-Ranging Adélie Penguins. The Journal of Experimental Biology 204, 685-690.

Yoda, K., Sato, K., Niizuma, Y., Kurita, M., Bost, C., Le Maho, Y., Naito, Y., 1999. Precise monitoring of porpoising behaviour of Adélie penguins determined using acceleration data loggers. Journal of Experimental Biology 202, 3121-3126.

Zadrozny, B., 2004. Learning and evaluating classifiers under sample selection bias, Proceedings of the twenty-first international conference on Machine learning. ACM, p. 114.

Zhang, J., O’Reilly, K.M., Perry, G.L., Taylor, G.A., Dennis, T.E., 2015. Extending the functionality of behavioural change-point analysis with k-means clustering: a case study with the little penguin (Eudyptula minor). PloS one 10, e0122811.

Zou, Y., Zhang, Y., Lord, D., 2013. Application of finite mixture of negative binomial regression models with varying weight parameters for vehicle crash data analysis. Accident Analysis & Prevention 50, 1042-1051.

Zucchini, W., MacDonald, I.L., Langrock, R., 2016. Hidden Markov models for time series: an introduction using R. Chapman and Hall/CRC.