


Differential Mapping Spiking Neural Network for Sensor-Based Robot Control

Omar Zahra, Silvia Tolu, Senior Member, IEEE and David Navarro-Alarcon, Senior Member, IEEE

Abstract—In this work, a spiking neural network is proposed for approximating differential sensorimotor maps of robotic systems. The computed model is used as a local Jacobian-like projection that relates changes in sensor space to changes in motor space. The network consists of an input (sensory) layer and an output (motor) layer connected through plastic synapses, with inter-inhibitory connections at the output layer. Spiking neurons are modeled as Izhikevich neurons with a synaptic learning rule based on spike timing-dependent plasticity. Feedback data from proprioceptive and exteroceptive sensors are encoded and fed into the input layer through a motor babbling process. As the main challenge in building an efficient SNN is to tune its parameters, we present an intuitive tuning method that enables us to considerably reduce the number of neurons and the amount of data required for training. Our proposed architecture represents a biologically plausible neural controller that is capable of handling noisy sensor readings to guide robot movements in real-time. Experimental results are presented to validate the control methodology with a vision-guided robot.

Index Terms—Robotics, spiking neural networks, sensor-based control, visual servoing.

I. INTRODUCTION

Sensor-guided object manipulation is one of the most fundamental tasks in robotics, with many possible approaches to perform it [1]. Conventional methods typically rely on mathematical modeling of the observed end-effector pose and its related joint configuration. These methods provide accurate solutions; however, they require exact knowledge of the analytical sensor-motor relations (which might not be known). Furthermore, these conventional methods are generally not provided with adaptation mechanisms to cope with uncertainties/changes in the sensor setup.

Among the many interesting cognitive abilities of humans (and animals, in general) is the motor babbling process that leads to the formation of sensorimotor maps [2]. By relying on such adaptive maps, humans can learn to perform many motion tasks in an efficient way. These advanced capabilities have motivated many research studies that attempt to artificially replicate such performance in robots and machines [3], [4]. Our aim in this work is to develop a bio-inspired adaptive computational method to guide the motion of robots with real-time sensory information and a limited amount of data.

This work is supported in part by the Research Grants Council (RGC) of Hong Kong under grant number 14203917, and in part by the PROCORE-France/Hong Kong Joint Research Scheme sponsored by the RGC and the Consulate General of France in Hong Kong under grant F-PolyU503/18.

O. Zahra and D. Navarro-Alarcon are with The Hong Kong Polytechnic University, Department of Mechanical Engineering, Kowloon, Hong Kong. Corresponding author e-mail: [email protected].

S. Tolu is with the Technical University of Denmark, Department of Electrical Engineering, Copenhagen, Denmark.

Data-driven computational maps have been previously built for approximating unknown sensorimotor relations [5]–[7]. Among the common limitations of these classical approaches are the demand for a large number of training data points and high computational power, which is impractical in many cases.

By drawing inspiration from the central nervous system, artificial neural networks (ANN) have been built and used for many decades. The first generation of ANN started with the McCulloch-Pitts model, which used binary computing units [8]; the second generation mainly utilized sigmoidal (or tanh) activation functions, making networks more capable of approximating non-linear functions [9]. Taking one step further, spiking neural networks (SNN), representing the third generation of ANN, are designed to incorporate most of the (biological) neuronal dynamics, which provides them with more complex and realistic spike-based firing patterns [10]. Our aim in this work is to build an adaptive SNN-based model to guide robots with sensor feedback.

The main additional property of SNN is the incorporation of a temporal dimension: the relative timing of spikes (and spike sequences) enables us to encode useful information. As in real biological systems, SNNs hold an advantage for real-time processing (as concluded in [11] for the visual system) and multiplexing of information (such as amplitude and frequency in the auditory system [12]). While SNN allows building more biologically plausible systems, its complex behavior makes it more difficult to analyze and, thus, to predict its behavior. To this end, various methods can be used, e.g. simplifying the network's mathematical description [13] or applying techniques for tuning its parameters (to achieve the desired performance). The latter approach is the one we follow in this work.

Several studies have provided examples of the application of SNN in robotics [14]. However, in most studies, robots were only controlled in a simulation environment [15]. The main reason is the large size of the neural networks used in these works, which makes them impractical for real-time operation. In [16], a cognitive architecture is used for controlling a robotic hand to perform grasping motions. In [17], an SNN was used for learning motion primitives of a robot so as to move in three axes (left-right, up-down and far-near). In [18], a self-organizing architecture was used to build an SNN representing spatio-motor transformations (i.e. kinematics) of a two degrees of freedom (DOF) robot. Despite its good performance and biological plausibility, the network's size and limited scalability make it impractical for real-time control.

In [19], an SNN with learning based on spike timing-dependent plasticity (STDP) was used to build the kinematics



Fig. 1: A schematic diagram of both the training and control phases of the DMSNN. The signals introduced to each neuron bundle are depicted. During the training phase, the robot motion is guided by motor babbling in joint space. During the control phase, the robot motion is guided by motor commands decoded from the activity of motor neurons at the output layer.

of a robot. Although the synaptic connections illustrate the ability to approximate the kinematic relation, the approximation error is not evident.

In this paper, we propose an SNN-based control architecture to guide robots with sensor feedback. Our proposed neural controller adaptively builds the differential map that correlates end-effector velocities (as measured by an external vision sensor) with joint angular velocities. In other words, it effectively works as a local Jacobian-like transformation between different (sensor-motor) spaces. This new method is characterized by its inter-inhibitory connections at the output layer, real-time operation and smaller number of neurons, compared to previous works in the literature [16], [17]. To the best of the authors' knowledge, this is the first study to report an SNN-based method capable of forming the sensor-motor differential map in a computationally efficient way (resulting from our intuitive approach to adjusting its parameters). To validate the proposed theory, we present a detailed experimental study with robotic platforms performing vision-guided manipulation tasks.

The rest of this paper is structured as follows: Sec. II describes the developed spiking neural network; Sec. III presents the verification of the proposed method through a dummy test; Sec. IV presents the experimental results; Sec. V discusses the methods and results; Sec. VI gives conclusions.

II. METHODS

Many studies have suggested that humans use internal models to represent perception and action [20]–[22]. Some have built computational models of brain areas (e.g. the cerebellum) responsible for the generation and coordination of fine motor actions [23], [24]. These models represent transformations from sensory perception to motion commands, but unlike in traditional robot control (where such transformations are solved analytically), the human brain has neural circuits that encode these sensorimotor relations. To carry out a typical (eye-to-hand) visual servoing task, we must first establish the kinematic relation between the joint velocities and the measured end-effector motion [25]. This model can be formulated

as $\dot{x} = J(\theta)\dot{\theta}$, such that:

$$J(\theta) = \begin{bmatrix} \frac{\partial x_1}{\partial \theta_1} & \cdots & \frac{\partial x_1}{\partial \theta_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial x_n}{\partial \theta_1} & \cdots & \frac{\partial x_n}{\partial \theta_m} \end{bmatrix} \quad (1)$$

where $\dot{x} \in \mathbb{R}^n$, $\theta \in \mathbb{R}^m$, $\dot{\theta} \in \mathbb{R}^m$ and $J(\theta) \in \mathbb{R}^{n \times m}$ are the (observed) spatial velocity, joint angles, joint velocities and the Jacobian matrix, respectively (without loss of generality, we assume that m ≥ n). For velocity-based control it is necessary to estimate the joint velocities that achieve a certain spatial velocity. This expression is denoted as:

$$\dot{\theta} = J^{\#}(\theta)\,\dot{x} \quad (2)$$

where $J^{\#}$ is the pseudo-inverse of $J$. In this work, a differential mapping spiking neural network (DMSNN) is proposed to build a computational model analogous to the Jacobian transformation relating changes in one joint-space DoF to one task-space DoF.
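To make Eq. (2) concrete, the sketch below applies the pseudo-inverse of an analytical Jacobian for a hypothetical two-link planar arm; the link lengths, joint configuration and desired velocity are illustrative values, not taken from the paper.

```python
import numpy as np

# Forward-kinematics Jacobian of the end-effector position of a
# hypothetical 2-DOF planar arm with link lengths l1, l2.
def jacobian(theta, l1=1.0, l2=1.0):
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([
        [-l1 * np.sin(t1) - l2 * np.sin(t12), -l2 * np.sin(t12)],
        [ l1 * np.cos(t1) + l2 * np.cos(t12),  l2 * np.cos(t12)],
    ])

# Resolved-rate step of Eq. (2): theta_dot = J#(theta) x_dot
theta = np.array([0.3, 0.8])                 # current joint angles (rad)
x_dot_desired = np.array([0.05, -0.02])      # desired spatial velocity
theta_dot = np.linalg.pinv(jacobian(theta)) @ x_dot_desired
```

In this square, non-singular configuration the pseudo-inverse reduces to the ordinary inverse, so executing `theta_dot` reproduces the desired spatial velocity exactly.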

As shown in Fig. 1, the network is first subject to the training phase, in which both sensory readings and motor commands are fed to the network through motor babbling by executing random motions. After training the network for several iterations, the differential mapping is formed by the modulation of the connections between the input and output layers. Then, the network can be used in the control phase to guide the robot through the estimation of the motor command required to reach the desired target, by feeding the sensory information only to the input layer. Details of this network are described in the rest of this section.

A. Neuron Model

Among the models available for spiking neurons is the Hodgkin and Huxley model, which explains the mechanism of triggering an action potential and its propagation [26]. The mathematical model is composed of a set of non-linear differential equations, which makes it computationally intense.


However, this model is the most biologically plausible among all available models.

Another model that is widely used is the Leaky Integrate and Fire model. In this model, the behavior of a neuron is approximated as a simple RC circuit with a low-pass filter and a switch with a thresholding effect [27]. The RC circuit is first charged by an input current, which causes the voltage at the capacitor to increase until it reaches the threshold value. The switch then opens and lets a pulse (spike) be generated; the capacitor then starts to charge up again. This model is simple and has a low computational cost; however, compared to the Hodgkin-Huxley model, it is not very biologically plausible.

In 2003, Izhikevich developed a model that can reproduce the various firing patterns recorded from different neurons in different brain regions [28]. This model is chosen for our study as it strikes a balance between a reasonable computational cost and biological plausibility. The model can be described by the following set of differential equations:

$$\dot{v} = f(v, u) = 0.04v^2 + 5v + 140 - u + I \quad (3)$$

$$\dot{u} = g(v, u) = a(bv - u) \quad (4)$$

After a spike occurs, the membrane potentials are reset as:

$$\text{if } v \geq 30\ \text{mV, then } v \leftarrow c,\quad u \leftarrow u + d \quad (5)$$

where v is the membrane potential and u is the membrane recovery variable. The parameter a determines the time constant for recovery, b determines the sensitivity to fluctuations below the threshold value, c gives the value of the membrane potential after a spike is triggered, and d gives the value of the recovery variable after a spike is triggered. The term I represents the summation of the external currents introduced.
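As a minimal sketch, Eqs. (3)–(5) can be integrated with a simple forward-Euler scheme. The parameter values below are Izhikevich's published regular-spiking values, not the tuned values derived later in this section, and the step size and input current are illustrative assumptions.

```python
import numpy as np

# Euler simulation of Eqs. (3)-(5) for a regular-spiking Izhikevich neuron
# (a=0.02, b=0.2, c=-65, d=8 from the original model; I and dt are assumed).
def izhikevich(I=10.0, a=0.02, b=0.2, c=-65.0, d=8.0, T=1000.0, dt=0.5):
    v, u = c, b * c                        # start at the resting state
    spikes = []
    for step in range(int(T / dt)):
        if v >= 30.0:                      # spike detected: reset per Eq. (5)
            spikes.append(step * dt)
            v, u = c, u + d
        v += dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)   # Eq. (3)
        u += dt * a * (b * v - u)                            # Eq. (4)
    return spikes

spikes = izhikevich()   # spike times (ms) under a constant input current
```

A constant supra-threshold input produces the tonic spiking expected of a regular-spiking neuron; changing a, b, c, d as described below reshapes the firing pattern.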

B. Synaptic Connections

Synapses are the connections that transmit signals between two neurons. Let us denote by εij the weight/strength of the synapse. The transmission acts only in one direction, such that signals are carried from the ith presynaptic to the jth postsynaptic neuron. The information transmitted through synapses is usually encoded in the form of spikes (or action potentials). Different theories have been presented for the way such information is encoded and decoded in our brains [29]. Synaptic connections can be either excitatory or inhibitory, and plastic or non-plastic. Excitatory synapses are connections that are more likely to increase the activity of the postsynaptic neuron with an increase in the activity of the presynaptic neuron, while inhibitory synapses decrease that likelihood. The synapses connecting the input layer to the output layer are plastic (which means that their strength is subject to change). Let us denote by ∆εij[t] the synapse's change of strength at time instance t, which satisfies the following discrete update rule:

$$\varepsilon_{ij}[t+1] = \varepsilon_{ij}[t] + \Delta\varepsilon_{ij}[t] \quad (6)$$

One of the very first learning rules to update the synaptic weights is the Hebbian learning rule [30], whose basic formulation is:

$$\Delta\varepsilon_{ij} = \eta a_i a_j \quad (7)$$


Fig. 2: (a) Asymmetric and (b) symmetric STDP learning rules. For the asymmetric rule, Sa and Sb are plotted at different values while keeping τa and τb constant. For the symmetric one, τ2 is plotted at different values while keeping τ1 constant.

where the scalar η is the learning rate, and ai and aj are the activities (or average firing rates) of the pre- and postsynaptic neurons, respectively. This rule strengthens the connections between strongly correlated variables, and has been shown to perform principal component analysis (PCA) [31]. However, this type of learning does not take into consideration the time difference between spikes, which is the main feature of SNNs.

Another learning rule that is more appealing from a biological perspective is STDP, where potentiation (increase) or depression (decrease) in the strength of the connections depends upon the relative timing of spikes that occur in presynaptic and postsynaptic neurons [32]. STDP is considered to be a temporal form of Hebbian learning [33]; e.g. in [34], it is shown that it can perform 'kernel spectral component analysis' (kSCA), which resembles PCA. This attribute makes it suitable for mapping two spaces while successfully updating the synaptic weights.

In the literature [35], STDP commonly appears in one of two patterns: symmetric [36] or antisymmetric [37], as depicted in Fig. 2. The antisymmetric model can be formulated as:

$$\Delta\varepsilon_{ij} = \begin{cases} -S_a \exp(\Delta t/\tau_a) & \Delta t \leq 0 \\ S_b \exp(-\Delta t/\tau_b) & \Delta t > 0 \end{cases} \quad (8)$$

where Sa and Sb are coefficients that control the magnitude of the synaptic depression and potentiation, respectively, while τa and τb determine the time window within which depression and potentiation occur. The asymmetric STDP is thus suitable for a learning process whenever the sequence of signals matters; e.g. if a spike from the presynaptic neuron arrives before the spike from the postsynaptic neuron, ∆t is positive and the synapse's weight is potentiated.
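The antisymmetric window of Eq. (8) can be sketched directly; the amplitude and time-constant values below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Antisymmetric STDP window of Eq. (8), with dt = t_post - t_pre.
# Sa, Sb, tau_a, tau_b are assumed illustrative values.
def antisymmetric_stdp(dt, Sa=0.04, Sb=0.05, tau_a=10.0, tau_b=10.0):
    if dt <= 0:
        return -Sa * np.exp(dt / tau_a)   # depression, decaying with |dt|
    return Sb * np.exp(-dt / tau_b)       # potentiation, decaying with dt
```

The sign of ∆t alone decides between depression and potentiation, which is exactly why this form suits sequence-sensitive learning.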

While the firing sequence is not so crucial in the case of continuous firing at both input and output layers (as in our case), the symmetric model is found to be more appropriate (see [38]) and can be described by:

$$\Delta\varepsilon_{ij} = S\left(1 - \left(\frac{\Delta t}{\tau_1}\right)^2\right)\exp\left(-\frac{|\Delta t|}{\tau_2}\right) \quad (9)$$

where S is a coefficient that controls the magnitude of the synaptic change, the ratio between τ1 and τ2 decides the time


Fig. 3: The layout of the proposed spiking neural network. Input (sensory) neurons are connected to output (motor) neurons through excitatory and inhibitory plastic synapses εij; neurons in each motor bundle are interconnected through inhibitory non-plastic synapses ε̃ij.

window through which potentiation and depression occur, and ∆t = tpost − tpre is the difference between the spike times of the postsynaptic and presynaptic neurons. In this study, we use S = 0.05, τ1 = 20 ms and τ2 = 18 ms. In this learning model, the change in the synapse's weight is controlled by the absolute value of ∆t, but not by its sign, i.e. the sequence of firing. The chosen time window for the pre- and postsynaptic neurons for STDP is 30 ms, which means that as long as −30 ≤ ∆t ≤ 30, a spike pair will still contribute to the modification of the synaptic strength.
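Using the reported values S = 0.05, τ1 = 20 ms and τ2 = 18 ms, the symmetric window of Eq. (9) can be sketched as follows (assuming the exponential decays with |∆t|, so the update vanishes for widely separated spikes):

```python
import numpy as np

# Symmetric STDP window of Eq. (9) with the paper's values.
def symmetric_stdp(dt, S=0.05, tau1=20.0, tau2=18.0):
    return S * (1.0 - (dt / tau1) ** 2) * np.exp(-np.abs(dt) / tau2)

dts = np.arange(-30, 31)      # the 30 ms coincidence window used in the paper
dw = symmetric_stdp(dts)      # potentiation for |dt| < tau1, depression beyond
```

The window is even in ∆t: near-coincident spikes (|∆t| < τ1) potentiate the synapse regardless of firing order, while spike pairs separated by more than τ1 depress it.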

C. Network Layout

The network layout of the proposed neural controller is shown in Fig. 3, where each dimension of the sensory input (joint angles and spatial velocity) and the motor output (joint velocity) is represented by a one-dimensional array (bundle) of neurons. The complete network consists of n + 2m bundles of neurons. The input sensory layer consists of bundles $l^{\theta}_{1:m}$ encoding the joint angles and $l^{\dot{x}}_{1:n}$ encoding the spatial velocity, while the output motor layer consists of $l^{\dot{\theta}}_{1:m}$ encoding the joint velocity controls. Each sensory neuron is connected through an excitatory and an inhibitory synapse (the latter omitted in Fig. 3 for clarity) to each motor neuron. This acts as a substitute for adding an inhibitory interneuron and allows for stable learning dynamics while avoiding an unbounded increase in connection strength and neuron activity. Additionally, each of the neurons encoding the output $\dot{\theta}_k$ is connected to the neurons of the same bundle $l^{\dot{\theta}}_k$ through non-plastic inhibitory connections, whose strength is determined as:

$$\tilde{\varepsilon}_{ij} = \exp\left(\frac{-(i-j)^2}{(\sigma_n N_l)^2}\right) - 1 \quad (10)$$

where ε̃ij denotes the strength of the non-plastic connection between neurons i and j, σn is the standard deviation, and Nl is the number of neurons in the bundle. Therefore, the farther apart the neurons are, the stronger the inhibition. This approximates a winner-take-all effect, but with the inhibitory value changing depending on the proximity to the winner neuron (ensuring a continuous and more robust output).
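The lateral-inhibition weights of Eq. (10) for one motor bundle can be sketched as a matrix; the bundle size Nl = 21 and σn = 0.1 are assumed illustrative values (the paper does not state σn here):

```python
import numpy as np

# Non-plastic lateral inhibition of Eq. (10) within one motor bundle:
# weight 0 for a neuron onto itself, approaching -1 for distant neurons.
def lateral_inhibition(Nl=21, sigma_n=0.1):
    i = np.arange(Nl)
    dist2 = (i[:, None] - i[None, :]) ** 2        # pairwise squared distances
    return np.exp(-dist2 / (sigma_n * Nl) ** 2) - 1.0

W = lateral_inhibition()   # Nl x Nl matrix of inhibitory strengths
```

The diagonal is zero (no self-inhibition) and the off-diagonal entries fall smoothly toward −1 with distance, producing the soft winner-take-all behavior described above.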

The initial guess for this layout was that the non-linearity of the neuronal units, as well as the features of STDP, make it possible to learn the differential map without the need


Fig. 4: (a) Computational and (b) analytical analysis of a spiking neuron. In (a), an integrator neuron receives spikes from four neurons to cross a threshold value and trigger a spike, while in (b) a phase portrait of a spiking neuron is plotted.

for a hidden layer. This approach is supported by previous studies (see [38] and [16]) and turned out to be accurate, as discussed in Section IV. However, to perform the vision-guided motion task with the network, a careful setting of its parameters is needed.

D. Tuning the Network Parameters

Tuning the various parameters of the neurons and synapses is a difficult task. Some basic rules can be used to guide the trial-and-error approach for choosing a set of appropriate values. For example, as u is the membrane recovery variable, it is responsible for delivering negative feedback to v, such that it resists changes in the value of v. The parameters a, b, c and d tune how v and u change and interact over time.

Fig. 4b shows the phase portrait of the output neurons, plotting the recovery variable versus the membrane potential. The two black curves represent the nullclines of v and u, i.e. the lines along which the corresponding derivative equals zero, which separate the planes of variation of v and u. To obtain these nullclines and analyze the system, equations (3) and (4) are set equal to zero to obtain the lines along which there is no change in v and u, respectively. The intersections of these nullclines form attractor (stable) points or repeller (unstable) points [39]. From the location of attractors and repellers, the stability regions can be concluded. The equilibrium points (v∗, u∗) are obtained by solving the equations of the parabola and the line that result from setting f(v, u) and g(v, u) to zero. The stability of these points is determined by the eigenvalues of the linearization matrix L:

$$L(v^*, u^*) = \begin{bmatrix} \frac{\partial f}{\partial v}(v^*, u^*) & \frac{\partial f}{\partial u}(v^*, u^*) \\ \frac{\partial g}{\partial v}(v^*, u^*) & \frac{\partial g}{\partial u}(v^*, u^*) \end{bmatrix} = \begin{bmatrix} 0.08v^* + 5 & -1 \\ ab & -a \end{bmatrix} \quad (11)$$

As the v-nullcline shifts upward, the attractor and repeller annihilate each other and merge into a saddle; any further upward shift leads to the disappearance of the saddle point.
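As a sketch of this analysis, the equilibria of Eqs. (3)–(4) and the eigenvalues of the linearization in Eq. (11) can be computed numerically; the fast-spiking values a = 0.1, b = 0.2 with I = 0 are used here for illustration.

```python
import numpy as np

# Equilibria of Eqs. (3)-(4): substitute u = b*v (from g = 0) into f = 0,
# leaving the quadratic 0.04 v^2 + (5 - b) v + (140 + I) = 0.
def equilibria(I, b):
    roots = np.roots([0.04, 5.0 - b, 140.0 + I])
    return [float(np.real(v)) for v in roots if abs(np.imag(v)) < 1e-9]

# Eigenvalues of the linearization matrix L of Eq. (11) at equilibrium v*.
def eigenvalues(v_star, a, b):
    L = np.array([[0.08 * v_star + 5.0, -1.0],
                  [a * b, -a]])
    return np.linalg.eigvals(L)

v_eq = sorted(equilibria(I=0.0, b=0.2))   # two equilibria below threshold
```

For these values the quadratic yields equilibria at v* = −70 and v* = −50; the eigenvalues show the lower one is the attractor (resting state) and the upper one is unstable, matching the attractor/repeller picture of Fig. 4b.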

As shown in Algorithm 1, to use an SNN to model a certain system, initial values should be assigned to the neuron parameters a, b, c, d based on their role within the neuronal layer. In this work, the motor neurons are modeled as integrator neurons, which act as coincidence detectors, triggering a spike by accumulating closely timed signals as illustrated in Fig. 4a. The initial values for the neuron parameters are a = 0.02, b =


Algorithm 1: Network Parameter Tuning

Input:
n, m = number of dimensions of the spaces to be encoded
Nl = number of neurons to encode each dimension
Itr∗ = number of iterations to build the map

Output:
a, b, c, d = neuron parameters
I∗ = minimum input to have continuous spikes
CI, CE = maximum strength of inhibitory and excitatory connections, respectively
S = learning rate for STDP-based synaptic connections

Routine:
1: Assign initial values for neuron parameters a, b, c, d
2: while ξ ≥ eth do
3: Build the neuron model f(v, u) and g(v, u)
4: Solve for nullclines at f(v, u) = 0 and g(v, u) = 0
5: Get v∗ and u∗ where f(v∗, u∗) = g(v∗, u∗) = 0
6: Evaluate L(v∗, u∗) to check stability
7: Obtain I∗ and hence CE
8: Set S for given Itr∗
9: Update values for a, b, c, d
10: end while

−0.1, c = −55, and d = 6. The sensory neurons are modeled as fast-spiking neurons providing high-frequency spikes and are initialized with a = 0.1, b = 0.2, c = −65, and d = 2. The above-mentioned initial values are suggested in [40].

The parameters' variation produces specific properties in the system, which can be used as guidelines to adjust the neuron's firing pattern: (i) The value of a controls the decay rate of u. This can be clearly noticed when comparing the firing behavior of low-threshold spiking (LTS) neurons with resonator (RZ) neurons, where both have the same values of b, c and d, while RZ has a higher value of a (to set sub-threshold oscillations and resonate at a narrow band of frequencies). (ii) The parameter b controls the sensitivity of u to changes in the membrane potential v below the threshold value. The integrator neuron has a low value of b, which is why many spikes with small time intervals in between are needed to trigger a spike. (iii) The parameter c describes the value to which v is reset after firing. Therefore, decreasing its value enables the creation of spike bursts, since it makes the recovery to the original threshold value faster. (iv) The parameter d describes the reset in the value of u after a spike occurs; thus, lowering it creates a higher firing frequency for the same input current.

After setting the neuron parameters, the values of I∗ and CE can be concluded by analyzing the stability at v∗ and u∗ such that a specific firing rate at the output neurons is obtained for a certain input from multiple input neurons. I∗ is even more critical to estimate in our study, as selective disinhibition is to be achieved, where the neuron is to be maintained at the verge of firing to avoid excessive firing [41]. This ensures that only the correct motor neurons fire upon excitation of the corresponding sensory neurons.

To obtain I∗, we first equate g(v∗, u∗) = a(bv∗ − u∗) = 0 and solve for u∗ = bv∗. Then, we substitute u∗ into f(v∗, u∗) = 0, such that 0.04v∗² + (5 − b)v∗ + 140 + I = 0. The u- and v-nullclines intersect at exactly one point when the neuron starts to give continuous spikes, which means there is only one solution to the quadratic equation. The equilibrium points are then given by v∗ = −(5 − b)/(2 × 0.04) and u∗ = bv∗. The value of I∗ is finally computed by solving f(v∗, u∗) = 0. The parameter CE must be chosen such that at the end of the training phase the selected firing behavior is still maintained. After setting CE, the spiking behavior is tested. The chosen values for the motor neurons must not allow spikes to be evoked by low-frequency spiking from the sensory neurons. Note that increasing the frequency makes the learning process more susceptible to noise. Therefore, the motor neuron parameters are to be modified instead. The parameter b is then incremented slightly until a satisfactory performance is obtained.
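The derivation of I∗ above can be sketched in a few lines; `rheobase` is a hypothetical helper name, and the final assertion-style check against the discriminant confirms the single-root (tangency) condition.

```python
# Tangency of the u- and v-nullclines (Sec. II-D): the quadratic
# 0.04 v^2 + (5 - b) v + (140 + I) = 0 has a single (double) root at I = I*.
def rheobase(b):
    v_star = -(5.0 - b) / (2.0 * 0.04)     # location of the double root
    u_star = b * v_star                    # from g(v*, u*) = 0
    # Solve f(v*, u*) = 0 for the input current:
    I_star = -(0.04 * v_star**2 + 5.0 * v_star + 140.0 - u_star)
    return v_star, u_star, I_star

# e.g. with the fast-spiking sensitivity b = 0.2:
v_s, u_s, I_s = rheobase(0.2)   # v* = -60, u* = -12, I* = 4
```

The same I∗ follows from requiring a zero discriminant, (5 − b)² = 4 × 0.04 × (140 + I∗), which the helper reproduces.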

Depending on the size of the data set and the number of neurons in each bundle, the number of iterations Itr∗ must be defined to build the map, which in turn determines the value of the learning rate gain S. If Itr∗ is relatively small, it will lead to a large S, which often leads to instability and noise sensitivity. A large Itr∗ results in an exhaustive (computationally demanding) training process.

To quantify the accuracy of the differential map approximated by the network, we define the following metric:

ξ = (1/N) Σ_{n=1}^{N} √((~vd − ~vest)²)    (12)

which is simply the difference between the desired spatial velocity ~vd and the spatial velocity obtained upon execution of the estimated motor command, ~vest. The scalar N > 0 is the number of trials over which the difference is measured to obtain a mean error value. The tuning process continues until the value of ξ is below some threshold value eth, indicating that the network’s performance is acceptable.
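A direct implementation of this metric might look as follows (a sketch; the array shapes are an assumption):

```python
import numpy as np

def map_error(v_desired, v_estimated):
    """Mean Euclidean error xi (Eq. 12) between the desired and the obtained
    spatial velocities over N trials. Inputs have shape (N, dim)."""
    diff = np.asarray(v_desired) - np.asarray(v_estimated)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

v_d   = np.array([[0.1, 0.0], [0.0, 0.1]])   # desired velocities (two trials)
v_est = np.array([[0.1, 0.0], [0.0, 0.0]])   # velocities actually obtained
print(map_error(v_d, v_est))  # 0.05
```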

E. Training Phase and Control Phase

For the proposed network to form the desired differential map, the information needs to be input/encoded into the network and extracted/decoded in a proper way. The input to the sensory layers (during the training and control phases) and to the motor layers (during the training phase only) is calculated for each neuron based on its preferred (central) value ψc. A Gaussian distribution models the network’s tuning curve, so the firing rate for a certain input can be formulated as:

αi(t) = exp(−‖ψ − ψc‖² / (2σ²))    (13)

where ψ is the input value, and σ is calculated based on the number of neurons per layer Nl and the range of the encoded variable, from Ψmin to Ψmax. This leads the whole layer to contribute to encoding a particular value (a process that can be interpreted as “population coding” [29]).
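A minimal sketch of this encoding step, assuming evenly spaced preferred values and one plausible choice of σ tied to the grid spacing (the paper derives σ from Nl and the range [Ψmin, Ψmax]):

```python
import numpy as np

def encode_population(psi, n_neurons, psi_min, psi_max):
    """Gaussian population coding (Eq. 13): each neuron fires at a rate set
    by the distance of the input psi from its preferred value psi_c."""
    centers = np.linspace(psi_min, psi_max, n_neurons)  # preferred values
    sigma = (psi_max - psi_min) / n_neurons             # one plausible width
    rates = np.exp(-(psi - centers) ** 2 / (2.0 * sigma ** 2))
    return rates, centers

rates, centers = encode_population(0.3, n_neurons=21, psi_min=-1.0, psi_max=1.0)
print(centers[np.argmax(rates)])  # the neuron preferring ~0.3 fires strongest
```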

To get the estimated output from the network, a proper decoding function has to be chosen. Among the various decoding methods, the central neuron voting scheme is selected to calculate the decoded value corresponding to the firing rates of all the neurons of a layer. This can be modeled, for a specific time window, as follows:

ψest = (Σ ψi αi) / (Σ αi)    (14)

where ψi is the central value of neuron i in the bundle lΨ, and ψest is the estimated (decoded) value of the output [29].
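The corresponding decoding step is a firing-rate-weighted average of the preferred values, for example:

```python
import numpy as np

def decode_population(rates, centers):
    """Central-neuron voting (Eq. 14): the decoded value is the
    firing-rate-weighted average of the neurons' preferred values."""
    rates = np.asarray(rates, dtype=float)
    return float(np.dot(rates, centers) / rates.sum())

centers = np.array([0.0, 0.5, 1.0])
print(decode_population([0.5, 1.0, 0.5], centers))  # 0.5 (symmetric activity)
```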

Fig. 1 depicts a schematic diagram of the proposed SNN-based method. Firstly, the motor babbling process initiates the training phase by providing motor commands for the robot to move linearly in the joint space through numerous random targets, which can be formulated as:

ωd = κθ eθ/‖eθ‖    (15)

where θd denotes the randomly generated desired joint angles, θ the current joint angles, and eθ = θd − θ the error between them. This joint velocity command is scaled by the gain κθ > 0, which is also varied during the babbling to generate richer training data. During babbling motions, sensory information and motor commands are fed to the network to guide the modulation of the plastic synapses between the input and output layers through STDP. Sensory information is introduced from proprioception (θ and θ̇) and the external sensors (the visual velocity ẋ), and is then encoded. The difference between the times of the spikes generated in sensory and motor neurons, ∆t = tpost − tpre, controls the amount of change in the synapses’ strength. After a sufficient number of iterations, the robot is ready to perform a sensor-guided motion task.
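Eq. (15) amounts to a constant-speed, straight-line motion in joint space; a minimal sketch (the gain value is illustrative):

```python
import numpy as np

def babbling_command(theta, theta_d, kappa=0.5):
    """Joint-velocity command of Eq. (15): constant-speed motion along the
    straight joint-space line towards the random target theta_d. The gain
    kappa is varied during babbling to enrich the training data."""
    e = np.asarray(theta_d, dtype=float) - np.asarray(theta, dtype=float)
    n = np.linalg.norm(e)
    return kappa * e / n if n > 1e-9 else np.zeros_like(e)

w = babbling_command([0.0, 0.0], [0.3, 0.4], kappa=0.5)
print(w, np.linalg.norm(w))  # the speed equals kappa regardless of distance
```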

The desired position xd is then compared to the current position x provided by the external vision sensor. With this information, the following motion command is computed:

vd = κx ex/‖ex‖    (16)

where ex = xd − x denotes the feedback spatial error and κx > 0 a variable gain that regulates the velocity with which the robot is driven towards the target xd. Finally, the motor command θcmd is decoded from the spikes generated at the output bundles lθ1:m as follows:

θcmd = ψest    (17)

This value is then fed into the robot’s servo controller to guide its motion towards xd.

III. SIMULATION RESULTS

A. Performing Summation

To test the proposed network and the tuning method, the summation of two variables (n1 + n2 = nsum) is carried out using two different approaches. In Fig. 5a, the sum is estimated using two 1D layers that encode the two variables (“n1” and “n2”), connected through plastic synapses to a 1D layer encoding the result (“nsum”). Random numbers are generated within a range and introduced to the input layers, with their summation introduced into the output layer. The network parameters are tuned as described in Sec. II-D, such that during the training phase the firings in both the input and output layers allow the plastic connections to represent the desired summation function. After training, the sum is estimated from the decoded values at the output layer. This approach gives a mean error of 7.5%.

Fig. 5: A schematic of the two possible ways to perform summation of two variables (n1 and n2): (a) separate encoding; (b) joint encoding. In this paper, (a) is the chosen approach.

Fig. 5b shows another network layout used to test our method. A 2D input layer is used instead of two separate 1D layers. The difference from the previous layout is that the neurons’ activity is estimated by multiplying the normally distributed activities of the two variables. The mean error for this layout is around 2%. To build the proposed DMSNN, we use the former approach. The rationale behind this choice is elaborated in Sec. V.

B. 2DOF Planar Robot

To verify the effectiveness of our method for visual servoing tasks, a simulation model of a 2DOF planar robot with revolute joints is built. The differential kinematics of this system are:

[ẋ1]   [−l1 sin(θ1) − l2 sin(θ12)   −l2 sin(θ12)] [θ̇1]
[ẋ2] = [ l1 cos(θ1) + l2 cos(θ12)    l2 cos(θ12)] [θ̇2]    (18)

where li denotes the length of link i, and θ12 = θ1 + θ2. With the above definition of the Jacobian matrix, we can derive the inverse differential mapping θ̇ = J#ẋ, which is then used to generate training data for the SNN. During the training process, normalized spatial velocity vectors at random joint angles are fed to the corresponding bundles of the input layer, along with the corresponding angular velocities to the bundles of the output layer. Around singular configurations, a small value is added to the determinant of the Jacobian matrix to avoid unbounded joint velocities.
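For illustration, the Jacobian of Eq. (18) and the generation of one training pair can be sketched as follows; the link lengths are placeholders (not the paper's values), and a damped least-squares inverse stands in for the paper's determinant perturbation near singularities:

```python
import numpy as np

def jacobian_2dof(theta1, theta2, l1=0.3, l2=0.25):
    """Jacobian of the planar 2DOF arm in Eq. (18)."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def training_pair(theta, v, damping=1e-3):
    """One (sensory, motor) training sample: joint velocities obtained by
    inverting the map for a normalized spatial velocity v at angles theta.
    The damping regularizes the inverse near singular configurations."""
    J = jacobian_2dof(*theta)
    J_pinv = J.T @ np.linalg.inv(J @ J.T + damping**2 * np.eye(2))
    return J_pinv @ v

theta = np.array([0.4, 0.9])
theta_dot = training_pair(theta, np.array([0.0, 1.0]))
print(np.allclose(jacobian_2dof(*theta) @ theta_dot, [0.0, 1.0], atol=1e-3))  # True
```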

After training, the network is tested by providing a random target xd in the Cartesian workspace, starting from a random initial position xi. By generating the normalized spatial velocity vector ~vd from the current position x towards the target, the estimated angular velocity of each joint can be decoded from the bundles of the output layer.

To quantify the performance, we calculate the minimum distance δ from the current position to the line segment xixd. The target path is a straight line; thus, the shortest distance at each point is defined by the normal to xixd. In other cases, however, a more complex path may be required. For a robot moving along a path C, divided into NC points, given a reference target path Ω composed of NΩ points, the mean error is calculated by averaging the maximum error over Ntrials trials by applying Algorithm 2.


Fig. 6: Simulation of the 2DOF planar robot driven by the DMSNN: (a) the robot, initially at xi, reaches a randomly generated target at xd; at each time step both the actual velocity (~vest) and the desired velocity vector (~vd) are calculated; (b) end-effector position error ex.

IV. EXPERIMENTS

A. Setup

The proposed SNN is simulated using the NeMo library, a tool developed to simulate SNNs [42]. A UR3 robot is set up with an Intel RealSense D415 camera as shown in Fig. 7a. The D415 is an RGB-D camera providing real-time color frames and depth maps. Image registration is carried out by aligning the color and depth frames; the end-effector position can then be measured through color filtering of the manipulated object. For effective color filtering, the image is blurred to remove high-frequency noise, and the color frame is converted to the HSV color space for a more robust performance independent of the light intensity. A mask in the desired color range is applied to obtain a binary image, which is then subjected to some morphological operators to remove small blobs. These operations filter the image and make it easier to discriminate the colored end-effector as the biggest contour. Afterwards, the moments of this contour are calculated as Mij = Σ φ(x1, x2) x1^i x2^j. Then, the centroid (ρx, ρy) is calculated in pixel units from the moments as ρx = M10/M00 and ρy = M01/M00, see [43].
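As a sketch of the centroid computation on a binary mask (using the raw moments defined above; the mask here is synthetic):

```python
import numpy as np

def centroid_from_moments(mask):
    """Centroid of a binary blob via raw image moments
    M_ij = sum_{x1,x2} phi(x1, x2) x1^i x2^j, so that
    rho_x = M10 / M00 and rho_y = M01 / M00 (pixel units)."""
    ys, xs = np.nonzero(mask)          # pixel coordinates where phi = 1
    m00 = xs.size                      # M00: blob area
    return float(xs.sum()) / m00, float(ys.sum()) / m00

mask = np.zeros((5, 6), dtype=bool)
mask[1:4, 2:5] = True                  # a 3x3 blob centred at column 3, row 2
print(centroid_from_moments(mask))     # (3.0, 2.0)
```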

Algorithm 2 Calculating Mean Error
1: e = 0
2: for i = 1 to Ntrials do
3:   ej = 0
4:   for j = 1 to NC do
5:     ek = ∞
6:     for k = 1 to NΩ do
7:       ejk = ‖C(j) − Ω(k)‖
8:       if ejk ≤ ek then
9:         ek = ejk
10:      end if
11:    end for
12:    if ek ≥ ej then
13:      ej = ek
14:    end if
15:  end for
16:  Stack error e(i) ← ej
17: end for
18: emean = (1/Ntrials) Σ_{n=1}^{Ntrials} e(n)
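Algorithm 2 can be written compactly as follows (a sketch; the paths are assumed to be sampled as arrays of points):

```python
import numpy as np

def mean_max_error(executed_paths, reference):
    """Algorithm 2: for every executed path C, find for each of its points
    the distance to the nearest point of the reference path Omega, keep the
    maximum over the path, and average these maxima over the trials."""
    errors = []
    for C in executed_paths:                         # C has shape (N_C, dim)
        nearest = [min(np.linalg.norm(p - q) for q in reference) for p in C]
        errors.append(max(nearest))
    return float(np.mean(errors))

omega = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])   # sampled reference
path = np.array([[0.0, 0.1], [0.5, 0.2], [1.0, 0.0]])    # one executed path
print(mean_max_error([path], omega))  # ~0.2, the largest deviation
```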

Fig. 7: The UR3 performing (a) planar motion while controlling 3 DOFs (θS, θE, θW) and (b) spatial motion while controlling 4 DOFs (θB, θS, θE, θW).

The location of the center of the object is then converted to world coordinates by using the camera’s intrinsic parameters as follows [44]:

x1 = (ρx − cx) x3/F,   x2 = (ρy − cy) x3/F    (19)

where ρx and ρy denote the end-effector position in pixels, x3 is the end-effector’s depth, cx and cy denote the principal point, and F is the focal length. The following first-order filter is used to remove noise from these visual measurements:

sf(t+1) = sf(t) − λ(sf(t) − s(t+1))    (20)

where s and sf are the variable before and after filtering, respectively, and λ > 0 denotes the filter’s gain.
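A minimal implementation of the filter in Eq. (20), with an illustrative gain:

```python
def low_pass(samples, lam=0.3):
    """First-order filter of Eq. (20):
    s_f(t+1) = s_f(t) - lam * (s_f(t) - s(t+1))."""
    sf = samples[0]                  # initialize with the first raw sample
    filtered = [sf]
    for s in samples[1:]:
        sf = sf - lam * (sf - s)     # move a fraction lam towards the new sample
        filtered.append(sf)
    return filtered

print(low_pass([0.0, 1.0, 1.0, 1.0], lam=0.5))  # [0.0, 0.5, 0.75, 0.875]
```

Larger λ tracks the raw signal more closely; smaller λ suppresses more of the camera noise at the cost of added lag.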

To train the network, the robot is driven linearly in joint space between randomly generated angles. Data is collected at a constant sampling rate of 25 Hz and then filtered. Fig. 8c shows the position obtained from the camera before and after filtering. Each neuron bundle is fed with the corresponding data to let the plastic connections develop and form the required differential mapping. Once the training phase ends, the strength of the synaptic connections is kept constant. Random targets are given across the robot workspace, as shown in Fig. 8a and Fig. 8b. The current joint angles and the desired spatial velocity are updated every 20 ms.

B. 3DOF Planar Robot

For this case, three joints are controlled to guide the UR3 robot in planar motion, as shown in Fig. 10. The network consists of five input bundles (lθ1:3 and lx1:2) and three output bundles (lθ̇1:3). The network parameters, as well as the number of neurons in each bundle, are shown in Table I. The three joints have ranges of [−180,−90], [−45, 0] and [90, 180]. From the motor babbling data collected, the maximum and minimum values of both the spatial and angular variables are obtained. For each layer with N3D neurons, the central value ψc encoded by each neuron is assigned by dividing the range of each variable evenly over the whole layer.

The update of the synaptic strength is shown as heatmaps in Fig. 9a and Fig. 9b. Each heatmap depicts the relation


Fig. 8: Plot of the data collected for training the DMSNN for (a) 3DOFs and (b) 4DOFs, and velocities as measured from the camera (after filtering) against those estimated by the internal sensors of the robot for (c) 3DOFs and (d) 4DOFs.

between one bundle from the sensory layer and one bundle from the motor layer, where each pixel gives the strength of an excitatory synapse connecting one sensory neuron to one motor neuron. After running the simulation for 6000 iterations, the strength of the synapses between the sensory layer and the motor layer is modulated to represent the required differential map. The robot is then given random points to reach with its end-effector through a visual servoing process. As shown in Fig. 10, a target point is provided

Fig. 9: Heatmaps of the weight update process for the 3DOF case of study at (a) 3000 and (b) 6000 iterations, and for the 4DOF case at (c) 4000 and (d) 9000 iterations. Dark blue corresponds to zero weight, while dark red corresponds to the maximum weight.


Fig. 10: The UR3 while performing planar motions in (a) and (b), with the norm of the error ex plotted during motion.

and the robot automatically moves towards it by using the decoded motion command θcmd. The distance between the end-effector and the target is represented as the norm ‖x − xd‖.

C. 4DOF Spatial Robot

In this case, four joints are controlled to guide the UR3 robot as shown in Fig. 7b. The network consists of seven input groups (lθ1:4 and lx1:3) and four output groups (lθ̇1:4). The network parameters and the number of neurons in each layer are shown in Table I. The four joint ranges are [−200,−170], [−75,−45], [90, 110] and [−200,−160]. The heatmaps depicting the weight updates at 4000 and 9000 iterations are shown in Fig. 9c and Fig. 9d, respectively. Similar to the previous setup, several end-effector targets are given to the robot in sensor space. Representative results for these 4DOF visual servoing tasks are depicted in Fig. 11. The accompanying multimedia video demonstrates the performance of our new method with many experimental results.

V. DISCUSSION

As shown in Section III-A, the accuracy of the summation of two numbers by the network in the case of multiple one-dimensional arrays is lower than that obtained in the case of a multi-dimensional array. However, this result comes at the expense of the network size, where the 1D layers require only

Fig. 11: The UR3 while performing spatial motions in (a) and (b), with the norm of the error along the 3 axes ex plotted during motion.

2N neurons, where N is the number of neurons per layer, while the 2D layer needs N² neurons. This means that for an O-dimensional case, N^O neurons would be needed. However, note that the explicit use of real-time visual feedback in our formulation provides a valuable rectifying property to the controller (i.e. it corrects for small approximation errors, see e.g. [45]) while demanding moderate computational power.

Moreover, in this study the SNN is used to guide the robot in a visual servoing process, which requires fast state updates; hence, the proposed network is used, as it offers a compromise between real-time operation and sufficient accuracy. However, when an accurate mapping of robot states and commands is required, the second approach should be used, or additional hidden layers should be added to the network for higher accuracy and better estimates. Another feature, which is not evident from the summation test studied, is that the SNN better approximates real-life operations in which the values to be encoded/decoded are continuous, i.e., with no discontinuities or jumps. For a neuron in the network to evoke a spike, its membrane potential must first build up over a series of time steps. Combined with the Gaussian activity, this leads to sharing excitation with neurons in the neighborhood, and it becomes more likely for the adjacent neurons to be triggered in subsequent cycles as well. As adjacent neurons encode values close to each other, this achieves a continuous, smooth operation and avoids sudden changes and jerky motions.

This work also exploits the potential of an SNN based on STDP, following some procedures for the fine-tuning of the network parameters as described in Section II-D, where only


TABLE I: Network Parameters

Neuronal layers:

Layer  | a    | b    | c   | d | N3D | N4D
lθi    | 0.1  | 0.2  | -65 | 2 | 68  | 136
lxi    | 0.1  | 0.2  | -65 | 2 | 68  | 136
lθ̇i   | 0.02 | 0.15 | -55 | 6 | 68  | 136

Synaptic connections:

DOF         | S    | τ1 | τ2 | CI | CE | Itr∗
3 (Planar)  | 0.05 | 20 | 18 | -4 | 4  | 6000
4 (Spatial) | 0.03 | 20 | 18 | -5 | 5  | 9000

two layers (input and output) and a low number of neurons are needed to develop a differential map for a multi-DOF robot. Following the proposed tuning method, each neuron bundle consists of around 136 neurons in the case of the spatial configuration, compared to around 1000 neurons in [38]. In [16], each bundle is constructed of around 100 neurons only; however, a hidden layer is needed, which increases the size of the network and the complexity of the tuning process. Moreover, the robot in this work needed to approach only 100 points, and the data recorded while moving to these target points was sufficient for the training process, compared to 6426 target points in [17]. The robot succeeded in all trials to reach the destined targets through visual servoing.

The time required to reach the target varies depending on the distance from the current position and on the richness of the training data available along the path between the current and target pose. This can be seen in the two examples in Fig. 10, where in Fig. 10a the robot takes less time to cover a bigger distance compared to Fig. 10b. This can be explained by the training data being collected from motor babbling in joint space, with more examples of the robot moving in waving-like trajectories and less abundant data for linear motion in task space. Similarly, the trial in Fig. 11a needs less time compared to the trial in Fig. 11b, as the target in the latter is close to the boundary of the studied section of the workspace, which has fewer training examples. Richer data collected from motions in both joint space and task space can be used in a future study to rule out the effect of varying the data on different motion types, and to study the possibility of motion primitives emerging from such behavior.

It can be noticed from Fig. 8c and Fig. 8d that the training data is filtered using only a first-order filter and contains more noise in the case of the spatial motion, due to the noise in the depth readings; however, the DMSNN is still able to approximate the differential relationship and build the desired map. This can be justified by the depression in the strength of the synapses between uncorrelated neurons due to the chosen STDP rule.

VI. CONCLUSIONS

The proposed DMSNN provides a way to build a differential map to drive robots with many DOFs through multi-modal servoing tasks. The network features real-time operation and needs only a small amount of data for training. The current limitations of this method are related to the limited resolution of the network output. Future work will focus on improving the output resolution, as well as on deriving a mathematical formula to determine the optimal parameters for neuron firing and STDP learning. We also plan to use this method for conducting vision-guided shape control tasks with deformable objects, as in [46].

REFERENCES

[1] D. Navarro-Alarcon, A. Cherubini, and X. Li, “On model adaptation for sensorimotor control of robots,” in 2019 Chinese Control Conf. (CCC). IEEE, 2019, pp. 2548–2552.

[2] J. Fagard, R. Esseily, L. Jacquey, K. O'Regan, and E. Somogyi, “Fetal origin of sensorimotor behavior,” Frontiers in Neurorobotics, vol. 12, p. 23, 2018. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnbot.2018.00023

[3] T. Aoki, T. Nakamura, and T. Nagai, “Learning of motor control from motor babbling,” IFAC-PapersOnLine, vol. 49, no. 19, pp. 154–158, 2016.

[4] X. Xiong, F. Worgotter, and P. Manoonpong, “Adaptive and energy efficient walking in a hexapod robot under neuromechanical control and sensorimotor learning,” IEEE Trans. on Cybernetics, vol. 46, no. 11, pp. 2521–2534, 2015.

[5] L. Pinto and A. Gupta, “Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours,” in 2016 IEEE Int. Conf. on Robot. and Automat. (ICRA). IEEE, 2016, pp. 3406–3413.

[6] H. A. Pierson and M. S. Gashler, “Deep learning in robotics: a review of recent research,” Advanced Robot., vol. 31, no. 16, pp. 821–835, 2017.

[7] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Mag., vol. 34, no. 6, pp. 26–38, 2017.

[8] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” The Bull. of Math. Biophys., vol. 5, no. 4, pp. 115–133, 1943.

[9] B. Widrow and S. D. Stearns, Adaptive Signal Processing. USA: Prentice-Hall, Inc., 1985.

[10] W. Maass, “Networks of spiking neurons: the third generation of neural network models,” Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.

[11] S. Thorpe, D. Fize, and C. Marlot, “Speed of processing in the human visual system,” Nature, vol. 381, no. 6582, pp. 520–522, 1996.

[12] G. Wang and M. Pavel, “A spiking neuron representation of auditory signals,” in Proceedings. 2005 IEEE Int. Joint Conf. on Neural Networks, vol. 1. IEEE, 2005, pp. 416–421.

[13] G. Schoner, Dynamic Thinking: A Primer on Dynamic Field Theory. Oxford University Press, 2016.

[14] Z. Bing, C. Meschede, F. Rohrbein, K. Huang, and A. C. Knoll, “A survey of robotics control based on learning-inspired spiking neural networks,” Frontiers in Neurorobotics, vol. 12, p. 35, 2018.

[15] C. Corchado, A. Antonietti, M. C. Capolei, C. Casellato, and S. Tolu, “Integration of paired spiking cerebellar models for voluntary movement adaptation in a closed-loop neuro-robotic experiment. A simulation study,” in 2019 IEEE Int. Conf. on Cyborg and Bionic Syst. IEEE, 2019.

[16] J. C. V. Tieck, H. Donat, J. Kaiser, I. Peric, S. Ulbrich, et al., “Towards grasping with spiking neural networks for anthropomorphic robot hands,” in Int. Conf. on Artif. Neural Networks. Springer, 2017, pp. 43–51.

[17] J. C. V. Tieck, L. Steffen, J. Kaiser, A. Roennau, and R. Dillmann, “Controlling a robot arm for target reaching without planning using spiking neurons,” in 2018 IEEE 17th Int. Conf. on Cogn. Inform. & Cogn. Computing (ICCI*CC). IEEE, 2018, pp. 111–116.

[18] N. Srinivasa and Y. Cho, “Self-organizing spiking neural model for learning fault-tolerant spatio-motor transformations,” IEEE Trans. on Neural Networks and Learn. Syst., vol. 23, no. 10, pp. 1526–1538, 2012.

[19] Q. Wu, T. M. McGinnity, L. Maguire, A. Belatreche, and B. Glackin, “2D co-ordinate transformation based on a spike timing-dependent plasticity learning mechanism,” Neural Networks, vol. 21, no. 9, pp. 1318–1327, 2008.

[20] D. M. Wolpert and M. Kawato, “Multiple paired forward and inverse models for motor control,” Neural Networks, vol. 11, no. 7-8, pp. 1317–1329, 1998.

[21] S.-J. Blakemore, D. Wolpert, and C. Frith, “Why can't you tickle yourself?” Neuroreport, vol. 11, no. 11, pp. R11–R16, 2000.

[22] A. Maravita and A. Iriki, “Tools for the body (schema),” Trends in Cogn. Sciences, vol. 8, no. 2, pp. 79–86, 2004.

[23] I. Abadía, F. Naveros, J. A. Garrido, E. Ros, and N. R. Luque, “On robot compliance: A cerebellar control approach,” IEEE Trans. on Cybernetics, 2019.

[24] F. Naveros, N. R. Luque, E. Ros, and A. Arleo, “VOR adaptation on a humanoid iCub robot using a spiking cerebellar model,” IEEE Trans. on Cybernetics, 2019.

[25] F. Chaumette and S. Hutchinson, “Visual servo control. Part I: Basic approaches,” IEEE Robot. Autom. Mag., vol. 13, no. 4, pp. 82–90, 2006.

[26] A. L. Hodgkin and A. F. Huxley, “A quantitative description of membrane current and its application to conduction and excitation in nerve,” The Journal of Physiology, vol. 117, no. 4, pp. 500–544, 1952.

[27] L. F. Abbott, “Lapicque's introduction of the integrate-and-fire model neuron (1907),” Brain Research Bull., vol. 50, no. 5-6, pp. 303–304, 1999.

[28] E. M. Izhikevich, “Simple model of spiking neurons,” IEEE Trans. on Neural Networks, vol. 14, no. 6, pp. 1569–1572, 2003.

[29] S. Amari et al., The Handbook of Brain Theory and Neural Networks. MIT Press, 2003.

[30] D. Hebb, “The organization of behavior. 1949,” New York: Wiley, vol. 2, no. 7, p. 8, 2002.

[31] E. Oja, “Simplified neuron model as a principal component analyzer,” Journal of Math. Biol., vol. 15, no. 3, pp. 267–273, 1982.

[32] D. Buonomano and T. Carvalho, “Spike-timing-dependent plasticity (STDP),” in Encyclopedia of Neuroscience, L. R. Squire, Ed. Oxford: Academic Press, 2009, pp. 265–268.

[33] N. Caporale and Y. Dan, “Spike timing-dependent plasticity: a hebbian learning rule,” Annu. Rev. Neurosci., vol. 31, pp. 25–46, 2008.

[34] M. Gilson, T. Fukai, and A. N. Burkitt, “Spectral analysis of input spike trains by spike-timing-dependent plasticity,” PLoS Computational Biol., vol. 8, no. 7, 2012.

[35] V. Cutsuridis, S. Cobb, and B. P. Graham, “A Ca2+ dynamics model of the STDP symmetry-to-asymmetry transition in the CA1 pyramidal cell of the hippocampus,” in Int. Conf. on Artif. Neural Networks. Springer, 2008, pp. 627–635.

[36] M. A. Woodin, K. Ganguly, and M.-m. Poo, “Coincident pre- and postsynaptic activity modifies GABAergic synapses by postsynaptic changes in Cl- transporter activity,” Neuron, vol. 39, no. 5, pp. 807–820, 2003.

[37] K. Buchanan and J. Mellor, “The activity requirements for spike timing-dependent plasticity in the hippocampus,” Frontiers in Synaptic Neuroscience, vol. 2, p. 11, 2010. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnsyn.2010.00011

[38] A. Bouganis and M. Shanahan, “Training a spiking neural network to control a 4-DoF robotic arm based on spike timing-dependent plasticity,” in The 2010 Int. Joint Conf. on Neural Networks (IJCNN). IEEE, 2010, pp. 1–8.

[39] E. M. Izhikevich and J. Moehlis, “Dynamical systems in neuroscience: The geometry of excitability and bursting,” SIAM Review, vol. 50, no. 2, p. 397, 2008.

[40] E. M. Izhikevich, “Which model to use for cortical spiking neurons?” IEEE Trans. on Neural Networks, vol. 15, no. 5, pp. 1063–1070, 2004.

[41] D. Sridharan and E. I. Knudsen, “Selective disinhibition: a unified neural mechanism for predictive and post hoc attentional selection,” Vision Research, vol. 116, pp. 194–209, 2015.

[42] A. K. Fidjeland, E. B. Roesch, M. P. Shanahan, and W. Luk, “NeMo: a platform for neural modelling of spiking neurons using GPUs,” in 2009 20th IEEE Int. Conf. on Application-specific Syst., Architectures and Processors. IEEE, 2009, pp. 137–144.

[43] M.-K. Hu, “Visual pattern recognition by moment invariants,” IRE Trans. on Information Theory, vol. 8, no. 2, pp. 179–187, 1962.

[44] P. Sturm, Pinhole Camera Model. Boston, MA: Springer US, 2014, pp. 610–613.

[45] C. C. Cheah, M. Hirano, S. Kawamura, and S. Arimoto, “Approximate Jacobian control for robots with uncertain kinematics and dynamics,” IEEE Trans. on Robotics and Automation, vol. 19, no. 4, pp. 692–702, 2003.

[46] D. Navarro-Alarcon and Y.-H. Liu, “Fourier-based shape servoing: A new feedback method to actively deform soft objects into desired 2D image shapes,” IEEE Trans. Robot., vol. 34, no. 1, pp. 272–279, 2018.