A graph theoretical approach for a multistep mapping software for the FACETS project

Karsten Wendt
Technische Universität Dresden
ETIT/IEE - HPSN
D-01062 Dresden

[email protected]

Matthias Ehrlich
Technische Universität Dresden


[email protected]

René Schüffny
Technische Universität Dresden


[email protected]

Abstract: This paper discusses the mapping problem, i.e. the assignment of complex biological networks to a special hardware system. The problem is modeled as a map between two disjoint sets. Further, a graph model is introduced to overcome the lack of adequate system representations for the mapping software framework, which controls the complex mapping process. The characteristics of the graph model are described and possible applications for both the biological networks and the hardware systems are shown. Consequently the mapping is expanded to a graph relation. After a short consideration of implementation aspects, the mapping process, i.e. the handling of the graph models during creation of the mapping itself, is presented.

Key-Words: Graph Model, Mapping, FACETS, Software Framework

1 Introduction

The project Fast Analog Computing with Emergent Transient States (FACETS) [1] aims at exploring various computational aspects of biological neural networks. The motivation is to emulate complex pulsing neural systems with up to $5 \cdot 10^4$ neurons and $200 \cdot 10^6$ synapses with a speed-up between $10^3$ and $10^4$ compared to biological real-time.

This encompasses the development of a massively parallel neural hardware system consisting of configurable pulse-coupled neural network ICs as well as software that allows for the configuration of said system in reasonable time.

The challenge of the work accomplished so far is to make the configuration and mapping process controllable by using a flexible framework of models and algorithms.

Section 2 gives a short overview of the dimensions of the reconfigurable FACETS system, which is currently in development. Section 3 considers the resulting mapping problem, and hence configuration problem, together with the mapping framework of the FACETS project. Then a graph model with its biological and hardware applications is introduced in section 4. The paper is completed by a short summary of implementation issues in section 5 and a presentation of the entire mapping process based on the graph models in section 6.

2 Current State of the FACETS System from the Mapping Viewpoint

The FACETS system¹ consists of analog-modeled IF (integrate-and-fire) neurons and synapses, which implement an STDP (spike-timing-dependent plasticity) mechanism [2]. The behavior of the neurons and synapses is defined by a set of configurable parameters. A parameter set is shared by several neurons or synapses respectively.

Functional blocks, so-called HICANNs (High Input Count Analog Neural Network)², contain up to $2^{17}$ (131072) synapses each, which are shared equally by a configurable set of hardware neurons, so that a single neuron can be connected to up to $2^{14}$ (16384) synapses. The input signals, i.e. neuron spikes from other HICANNs, and the output signals of each HICANN are encoded in a digital time-continuous network, the so-called layer-1 network. It consists of buses, which carry the neuron spikes of up to 64 neurons each.
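The synapse figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that "shared equally" means an integer division of the synapse pool by the number of configured neurons, capped at the per-neuron maximum; the function name `synapsesPerNeuron` and the constants are illustrative and not part of the FACETS software.

```cpp
#include <cstdint>

// Figures taken from the text: up to 2^17 synapses per HICANN,
// up to 2^14 synapses per single neuron.
constexpr std::uint32_t kSynapsesPerHicann    = 1u << 17;  // 131072
constexpr std::uint32_t kMaxSynapsesPerNeuron = 1u << 14;  // 16384

// Assumption: the synapse pool is divided evenly among the configured
// neurons and the result is capped at the per-neuron maximum.
constexpr std::uint32_t synapsesPerNeuron(std::uint32_t neuronCount) {
    return neuronCount == 0
               ? 0u
               : (kSynapsesPerHicann / neuronCount < kMaxSynapsesPerNeuron
                      ? kSynapsesPerHicann / neuronCount
                      : kMaxSynapsesPerNeuron);
}

// With 8 neurons the pool is used up exactly at the maximum fan-in.
static_assert(synapsesPerNeuron(8) == kMaxSynapsesPerNeuron,
              "2^17 / 8 equals 2^14");
```

Under this assumption, configuring more than 8 neurons on a HICANN reduces the number of synapses available to each of them proportionally.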

The input of each HICANN is decoded by a static configuration from up to $n_{maxInput}$ layer-1 buses. Therefore the number of different input signals is limited to $64 \cdot n_{maxInput}$ per HICANN and thus also for a single neuron. The analog HICANN's output is transformed by a 64-to-1 priority encoder to digital pulse signals of 1 up to 8 layer-1 buses, which are fed into the layer-1 network at predefined positions.

¹ Status quo of September 2007.
² Currently in development at the Kirchhoff Institut of Physik, Ruprecht-Karls-Universitaet, Heidelberg, Germany.

2nd WSEAS Int. Conf. on COMPUTER ENGINEERING and APPLICATIONS (CEA'08), Acapulco, Mexico, January 25-27, 2008. ISSN: 1790-5117, ISBN: 978-960-6766-33-6

A grid of up to 400 HICANNs is arranged on an entire wafer, i.e. these blocks have to be placed and connected on a single cohesive plane. $n_{l1horizontal}$ horizontal and $n_{l1vertical}$ vertical layer-1 buses between each HICANN provide one part of the intra-wafer communication system. At each cross point there is a crossbar, which realizes the connectivity between horizontal and vertical buses. The crossbars are populated only sparsely with switches to reduce the amount of memory. At the borders of each HICANN repeaters are placed, connecting the layer-1 network of each adjacent HICANN to stop or relay single buses. The input selection of each HICANN is done by a select switch, which is also sparsely populated.

Finally the FACETS system consists of several wafers connected by an additional digital, not time-continuous, packet-based communication system, the so-called layer-2 network. The signal output points of each HICANN are connected to a DNC (Digital Network Core), which transforms the necessary layer-1 signals into packets, including time and destination. Then the packets are routed to their entry points, passing FPGAs, which provide access to the inter-wafer bus, and are decoded back by a DNC to layer-1 pulses. Thus long-range intra-wafer and inter-wafer communication is realized. The configuration of the DNCs and FPGAs is done statically by routing tables [3].

3 Mapping Problem

Obviously, setting up this hardware system to simulate a specific neural network is a difficult task. Generally, given a biological system consisting of elements of $B$ and a hardware system consisting of elements of $H$, the assignment of $B$ to $H$ can be represented by a map $m$:

$$m : B_p \rightarrow H_p \tag{1}$$

where

$$B_p \subseteq B \tag{2}$$

and

$$H_p \subseteq H \tag{3}$$

The different constraints of this map can be described informally as follows:

• minimize not realized neurons and synapses

• minimize violations of parameters

• minimize violations of timing

• maximize hardware efficiency

• realize adequate mapping time

Formally, distinguishing hard and soft constraints, it can be described as follows. Given 2 functions, so-called cost functions:

$$c_{h,s} : (B, H, B_p \rightarrow H_p) \rightarrow \mathbb{R} \tag{4}$$

a mapping $m$ has to be found so that:

$$c_h(B, H, m) = 0 \tag{5}$$

$$c_s(B, H, m) \rightarrow \min. \tag{6}$$

Then the task can be characterized as a multi-objective optimization problem [6] [7], which makes a set of adequate algorithms, e.g. genetic searches, necessary. The optimization performed by the algorithms is based on efficient cost functions, which evaluate the mapping with regard to the given constraints. The cost functions in turn depend on flexible models of the biological and hardware systems. The complexity of this problem requires a software framework, which decouples the involved elements as far as possible, as shown in figure 1.

Figure 1: Mapping Framework Scheme
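To illustrate the role of the two cost functions in equations (4) to (6), the following toy sketch selects, from a set of candidate mappings, the one with $c_h = 0$ and the smallest $c_s$. All types and names (`Problem`, `hardCost`, `softCost`, `selectBest`) are hypothetical; the hard constraint used here (a per-element capacity) and the soft constraint (number of unrealized elements) are simplified stand-ins for the real FACETS constraints.

```cpp
#include <cstddef>
#include <limits>
#include <map>
#include <vector>

using BioId = int;
using HwId = int;
using Mapping = std::map<BioId, HwId>;  // partial map m : B_p -> H_p

struct Problem {
    std::vector<BioId> bio;                // B
    std::map<HwId, std::size_t> capacity;  // H with an assumed capacity per element
};

// Hard cost c_h: counts capacity violations and assignments to unknown
// hardware elements. A valid mapping needs c_h == 0.
double hardCost(const Problem& p, const Mapping& m) {
    std::map<HwId, std::size_t> load;
    double violations = 0.0;
    for (const auto& entry : m) {
        if (p.capacity.count(entry.second) == 0) { violations += 1.0; continue; }
        ++load[entry.second];
    }
    for (const auto& entry : load) {
        const std::size_t cap = p.capacity.at(entry.first);
        if (entry.second > cap) violations += static_cast<double>(entry.second - cap);
    }
    return violations;
}

// Soft cost c_s: number of biological elements that are not realized.
double softCost(const Problem& p, const Mapping& m) {
    double missing = 0.0;
    for (BioId b : p.bio)
        if (m.count(b) == 0) missing += 1.0;
    return missing;
}

// Returns the index of the feasible candidate with minimal soft cost,
// or -1 if no candidate satisfies the hard constraints.
int selectBest(const Problem& p, const std::vector<Mapping>& candidates) {
    int best = -1;
    double bestCost = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (hardCost(p, candidates[i]) != 0.0) continue;
        const double c = softCost(p, candidates[i]);
        if (c < bestCost) { bestCost = c; best = static_cast<int>(i); }
    }
    return best;
}
```

A real optimization algorithm would generate the candidates itself, e.g. by a genetic search; the sketch only shows how hard constraints filter and soft constraints rank.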

4 Modeling

4.1 Graph Model

A special graph model was chosen as a universal data representation for the entire mapping framework, on which the cost functions and the algorithms are based. Generally, a graph is a pair of 2 finite sets $V$ and $E$, which represent a set of nodes and a set of edges. $V$ and $E$ satisfy the constraints $V \neq \emptyset$ and $V \cap E = \emptyset$. Furthermore every graph is related to a map $\Psi$ with $\Psi : E \rightarrow V \times V$ [4] [5].

The graph model $G$ used here consists of nodes $v_i \in V$, which can hold a name or a value as a data item respectively. It contains 3 types of directed edges $e_i \in E$:



• hierarchical edges modeling a hierarchy of nodes, i.e. representing a container-component relationship

• named edges modeling a certain relationship between 2 nodes

• hyper edges connecting a named edge with a node, i.e. modeling a detailed description of a node-node relationship.

To clarify the role of the graph model components, they are shown by way of example in figure 2. Thus the graph model can be characterized as a special form of a directed hypergraph [4] [5].

Figure 2: Graph Model

So it is possible to create flexible models with respect to the complexity of the biological network and the hardware system. Because of its structure the model is fully navigable, which means every point, i.e. every node or edge, is reachable from every position in the graph. By comparison a non-graph model with 2 disjoint data sets, e.g. the synaptical connectivity as a (sparse) matrix or a list separated from the neural parameters as a vector, offers no direct access to the relations between these data sets.

Thus the graph model provides a universal data interface for the optimization algorithms and cost functions, which transform the needed data into adequate formats and store them back into the graph model, making the data accessible for other components. Furthermore the decentralized structure makes the model suitable for massive parallelization.

On the other hand this kind of data representation consumes more memory and more build-up time than more compact models.

4.2 Biological Model

The biological systems, which have to be mapped to the FACETS hardware, can be considered as networks of neurons and synapses. Each neuron is connected to a number of synapses and furthermore characterized by a set of parameters. Each synapse connects a source and a target neuron and is also assigned to parameters.

The biological graph model is referred to as $G_B = (V_B, E_B)$. To obtain a simple structure, all neurons as well as the neural and synaptical parameters are assigned hierarchically to different nodes. A synaptical connection is modeled as a named edge, where the name classifies the edge's role. The parameter assignments are also done by named connections and, for the synapses, by hyper connections. A sample of this global view is shown in figure 3.

Figure 3: Hierarchical view of the biological modelGB

Given this structure a neural network can be automatically transformed into the graph model representation $G_B$, as shown in figure 4. There, a system of 3 neurons $N_{0..2}$ with 2 parameter sets $P_N^{0,1}$ and 5 synapses $S_{0..4}$, also with 2 parameter sets $P_S^{0,1}$, connecting the neurons is modeled as a graph as described above. Each neuron and each parameter set is realized by a node. The synapses are formed by named connections, where the connection names are not shown in this figure. The elements are assigned to their parameter nodes by named connections or hyper connections.

Figure 4: Simplified biological network transformed into a graph model $G_B$

4.3 Hardware Model

The hardware system as described in section 2 can be modeled as a hierarchical structure of hardware elements. The hardware graph model is referred to as $G_H = (V_H, E_H)$. A reduced model is shown in figure 5. On the top level the FACETS system consists of a digital bus connecting several wafers. A single wafer contains a grid of HICANNs, connected by abstracted connection elements (CE). A HICANN in the last level holds a number of hardware neurons.

Furthermore each level also contains data about the optimization algorithm and the cost function to use, which abstracts the constraints of the hardware to an algorithm-compatible form. The information is used by the mapping framework to accomplish the mapping process and is shown only for the highest node in the figure.

Figure 5: Hierarchical view of the FACETS hardware model $G_H$

As an example the hardware system can be modeled in more detail at the wafer level. In figure 6 a single HICANN, consisting of neurons $N_{0,1}$ and synapses $S_{0..3}$ characterized by the parameter sets $P_N^{0}$ and $P_S^{0,1}$, is embedded in layer-1 relevant hardware elements. It receives its spike signals via a select switch $SW^{1}$ from a layer-1 section $l1^{1}$. Its output is transformed by a WTA-encoder $WTA^{0}$ and can be routed through a crossbar $CB^{0}$. All these elements are characterizable by their parameter sets.

The transformation step to a graph model representation $G_H$ converts all hardware elements to graph nodes and connects them to their parameters via named edges. Only the simple signal-transporting components like synapses and layer-1 buses are themselves modeled by named edges, which are assigned to parameters via hyper edges.

Figure 6: Simplified hardware system transformed into a graph model $G_H$

4.4 Graph Mapping

Consequently the mapping as a map from the set $B$ to the set $H$, see section 3, must be expanded to a graph relation $r_m$:

$$r_m \subseteq V_B \times V_H \tag{7}$$

with

$$V_B \times V_H := \{ (b, h) \mid (b \in V_B) \land (h \in V_H) \} \tag{8}$$

where $V_B$ is the set of nodes in $G_B$ and $V_H$ is the set of nodes in $G_H$. Thus the mapping relation $r_m$ can be generalized to a set of tuples

$$r_m = \left\{ (b_0, h_0^{i_0}),\ (b_0, h_0^{i_1}),\ \ldots,\ (b_0, h_0^{i_{max}}),\ (b_1, h_1^{j_0}),\ \ldots,\ (b_n, h_n^{k_{max}}) \right\} \tag{9}$$

which assigns every mapped node of the biological model to one or more nodes of the hardware system. In the graph model the mapping is done by named edges, so-called mapping edges, which represent the assignment. An example is shown in figure 7, where 2 neurons are mapped to 2 different HICANNs and a synaptical parameter set is also mapped to both hardware nodes.
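The relation in equations (7) to (9) can be pictured as a plain set of (biological node, hardware node) pairs. The sketch below encodes the situation described for figure 7: 2 neurons on 2 different HICANNs and one synaptical parameter set mapped to both. The node names (`N0`, `PS0`, `HICANN0`, ...) and the helper `imageOf` are illustrative assumptions, not identifiers from the FACETS software.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

using NodeName = std::string;
// r_m as a subset of V_B x V_H: each pair stands for one mapping edge.
using MappingRelation = std::set<std::pair<NodeName, NodeName>>;

// All hardware nodes a given biological node is mapped to.
std::vector<NodeName> imageOf(const MappingRelation& rm, const NodeName& b) {
    std::vector<NodeName> result;
    for (const auto& edge : rm)
        if (edge.first == b) result.push_back(edge.second);
    return result;
}

// Example in the spirit of figure 7 (names are hypothetical).
MappingRelation exampleRelation() {
    return {
        {"N0", "HICANN0"},   // neuron 0 -> first HICANN
        {"N1", "HICANN1"},   // neuron 1 -> second HICANN
        {"PS0", "HICANN0"},  // shared synaptical parameter set ...
        {"PS0", "HICANN1"},  // ... is mapped to both hardware nodes
    };
}
```

Because a relation rather than a function is used, one biological node such as the parameter set can appear in several pairs, which a map $m : B_p \rightarrow H_p$ could not express.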

Figure 7: Mapping representation in the graph model

5 Implementation

The graph model was implemented in C++. To reduce the memory consumption and thus be able to model large models, the classes were implemented in a lightweight manner. The model consists of 2 elements, nodes and connections, where the connections stand for the named edges. A node contains:

• a value, which holds the data item of the node

• a pointer to its superordinate node, which models the hierarchical edges backward

• a list of pointers to its subordinate nodes, which models the hierarchical edges forward

• a list of pointers to outgoing and incoming connections, which models the named connections forward and backward

• a list of pointers to connections, which models the hyper edges backward

A connection contains:

• a value or a name ID respectively, which holds the semantic meaning of the edge

• a pointer to the target and the source node, which models the named edges forward and backward

• a pointer to a hyper-connected node, which models the hyper edges forward

Thus the memory consumption of a connection is always 26 Bytes, and a node consumes at least 96 Bytes, depending on the size of the data item and the length of the lists.
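The two element lists above translate fairly directly into a pair of structs. The following is a minimal sketch under several assumptions: the member names, the 16-bit name ID, the use of `std::string` and `std::vector`, and the owning `Graph` class are all illustrative choices, and the sketch makes no attempt to reproduce the byte sizes quoted in the text.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Connection;

struct Node {
    std::string value;                     // data item of the node
    Node* parent = nullptr;                // hierarchical edge, backward
    std::vector<Node*> children;           // hierarchical edges, forward
    std::vector<Connection*> outgoing;     // named edges, forward
    std::vector<Connection*> incoming;     // named edges, backward
    std::vector<Connection*> hyperOwners;  // hyper edges, backward
};

struct Connection {
    std::uint16_t nameId = 0;   // semantic meaning of the edge (assumed width)
    Node* source = nullptr;     // named edge, backward
    Node* target = nullptr;     // named edge, forward
    Node* hyperNode = nullptr;  // hyper edge, forward (may be null)
};

// Hypothetical owner that keeps all back-pointers consistent.
class Graph {
public:
    Node* addNode(std::string value, Node* parent = nullptr) {
        nodes_.push_back(std::make_unique<Node>());
        Node* n = nodes_.back().get();
        n->value = std::move(value);
        n->parent = parent;
        if (parent) parent->children.push_back(n);
        return n;
    }

    Connection* connect(Node* source, Node* target, std::uint16_t nameId,
                        Node* hyperNode = nullptr) {
        connections_.push_back(std::make_unique<Connection>());
        Connection* c = connections_.back().get();
        c->nameId = nameId;
        c->source = source;
        c->target = target;
        c->hyperNode = hyperNode;
        source->outgoing.push_back(c);
        target->incoming.push_back(c);
        if (hyperNode) hyperNode->hyperOwners.push_back(c);
        return c;
    }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
    std::vector<std::unique_ptr<Connection>> connections_;
};
```

Storing every edge in both directions is what makes the model fully navigable, at the price of the additional memory discussed in section 4.1.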

For example a biological network of $10^4$ neurons and $13.5 \cdot 10^6$ synapses causes a memory consumption of 866.4 MBytes in total, which means every neuron consumes 38.1 KBytes on average, while the synaptical connections and their parameter assignments allocate 35.9 Bytes per synapse.

Furthermore the graph model provides an interface for the optimization algorithms and cost functions to interact with the model. On the one hand it gives access to functions to set up and control models, i.e. to node and connection creation and deletion functions. On the other hand it offers functions to navigate inside the graph model, e.g. to filter all subordinate nodes of a given node with regard to their values, to find all outgoing or incoming connections with the same name ID, or to filter the sources or targets of given connections regarding a given value. These access functions can be combined to locate certain positions via a "path" along the edges and nodes, which is what navigating inside the model means.
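The navigation functions described above might look like the following sketch. It uses a reduced version of the node and connection structure, and all function names (`childrenWithValue`, `outgoingWithName`, `targetsWithValue`, `followPath`) are hypothetical; the actual interface of the FACETS graph model is not given in the text.

```cpp
#include <string>
#include <vector>

struct Connection;

// Reduced node and connection types, just enough for navigation.
struct Node {
    std::string value;
    std::vector<Node*> children;
    std::vector<Connection*> outgoing;
};

struct Connection {
    int nameId;
    Node* source;
    Node* target;
};

// Filter the subordinate nodes of a node by their value.
std::vector<Node*> childrenWithValue(const Node& n, const std::string& value) {
    std::vector<Node*> result;
    for (Node* child : n.children)
        if (child->value == value) result.push_back(child);
    return result;
}

// Find all outgoing connections with a given name ID.
std::vector<Connection*> outgoingWithName(const Node& n, int nameId) {
    std::vector<Connection*> result;
    for (Connection* c : n.outgoing)
        if (c->nameId == nameId) result.push_back(c);
    return result;
}

// Filter the targets of given connections by their value.
std::vector<Node*> targetsWithValue(const std::vector<Connection*>& cs,
                                    const std::string& value) {
    std::vector<Node*> result;
    for (Connection* c : cs)
        if (c->target->value == value) result.push_back(c->target);
    return result;
}

// A "path": child with a given value, then all edges with a given name ID.
std::vector<Node*> followPath(const Node& root, const std::string& childValue,
                              int nameId) {
    std::vector<Node*> result;
    for (Node* child : childrenWithValue(root, childValue))
        for (Connection* c : outgoingWithName(*child, nameId))
            result.push_back(c->target);
    return result;
}
```

For instance, `followPath(root, "N0", synapseId)` would return all neurons reached from the node named `N0` via edges carrying the (assumed) synapse name ID.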

6 Mapping Process

The global flow of the mapping process is built on a recursively called function MappingStep, which performs the mapping of a partial biological model with an adequate optimization algorithm and cost function to the current position, i.e. to the subordinate nodes of the current node in the hardware model. Figure 8 shows the global mapping process as a structure chart.

Figure 8: Global mapping process

6.1 Global Model Creation

The first stage of the mapping process is the creation of the hardware and the biological model. While the hardware model creation is a trivial task, building up the representation of the biological model is not.

The textual representation of the neural network follows a self-defined neural syntax similar to a context-free syntax of a formal language, which makes it possible to use a parsing process for constructing an internal graph representation.

To generate the structure of the biological network a scanner reads the input stream of the neural net list and identifies the tokens. The subsequent low-level parsing process recognizes the neural semantics of the tokens read according to our self-defined syntax and returns a collapsed statement. The extracted information is stored in a statement tree or "root" tree, from which the global syntactic tree representing the bio-model is finally built.

This split-parser approach allows separate optimization of the scanning/low-level parsing process and the high-level parser. The use of parsing methods enables an implicit error check during net list read-in and also handles the semantic recognition.

Finally 2 models as described in sections 4.2 and 4.3 are created.

6.2 Mapping Step

Initially this function is called with the global biological model $G_B$ and the highest hardware node $v_H^{*}$ as arguments.

First, as defined in the hardware model, an instance of the current optimization algorithm $OA_{v_H}$ and an instance of the current cost function $CF_{v_H}$ for this hardware node are created. These 2 components use the interface as described in section 5 to gain access to the models and transform the data into adequate computation formats.

Then the algorithm is executed with the cost function and creates a mapping between the graph models as shown in section 4.4. Subsequently the mapping is expanded to dependent nodes, e.g. parameter sets, to complete the mapping for this hardware element $v_H$.

Finally for each subordinate node $v'_H$ a partial biological model $G_B^{v'}$ is extracted, which contains only the biological elements that are relevant for the subsequent mapping steps, each holding a connection to its origin. The function MappingStep is called recursively with the model $G_B^{v'}$ and $v'_H$ as arguments. It creates a more detailed mapping for the subordinate nodes of $v_H$. After that the refined mapping is transferred to the current biological model $G_B^{v}$, i.e. the mapping edges are redirected to lower-level hardware nodes, and the partial model is deleted.

The return of the highest MappingStep call is followed by a routing step, which calculates the routing tables of the hardware depending on the placement of the biological elements. This will be outlined in a later publication.
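The recursion described in this section can be sketched as follows. This is a strongly simplified illustration, not the FACETS implementation: the biological model is reduced to a list of neuron ids, the per-node optimization algorithm and cost function are replaced by a greedy fill against an assumed `capacity` field, and all names (`HwNode`, `greedyAssign`, `mappingStep`) are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct HwNode {
    std::string name;
    std::size_t capacity;          // assumed: max biological elements below this node
    std::vector<HwNode> children;  // subordinate hardware nodes
};

using BioModel = std::vector<int>;              // neuron ids stand in for G_B
using Mapping = std::map<int, const HwNode*>;   // mapping edges: bio id -> hardware node

// Stand-in for the optimization algorithm and cost function of one
// hardware node: fill the subordinate nodes greedily up to their capacity.
Mapping greedyAssign(const BioModel& bio, const HwNode& hw) {
    Mapping local;
    std::size_t next = 0;
    for (const HwNode& child : hw.children)
        for (std::size_t used = 0; used < child.capacity && next < bio.size(); ++used)
            local[bio[next++]] = &child;
    return local;  // elements that do not fit stay unassigned at this level
}

void mappingStep(const BioModel& bio, const HwNode& hw, Mapping& mapping) {
    if (hw.children.empty()) return;  // lowest hardware level reached

    const Mapping local = greedyAssign(bio, hw);

    for (const HwNode& child : hw.children) {
        // Extract the partial model: elements assigned to this subordinate node.
        BioModel partial;
        for (int b : bio) {
            auto it = local.find(b);
            if (it != local.end() && it->second == &child) partial.push_back(b);
        }
        // Redirect the mapping edges to the lower-level hardware node ...
        for (int b : partial) mapping[b] = &child;
        // ... and refine the mapping below it.
        mappingStep(partial, child, mapping);
    }  // each partial model is destroyed at the end of its loop iteration
}
```

Called with the full model and the top hardware node, the function leaves every element that fits mapped to a node of the lowest hardware level; elements that do not fit at some level keep their mapping edge to the last node they were assigned to.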

7 Conclusion

A graph model classified as a hypergraph and consisting of 3 different edge types was introduced. It was shown that it is capable of modeling complex biological and hardware systems with the same underlying structure. The graph models were developed and tested with various biological networks, ranging from $10^2$ to $2 \cdot 10^5$ neurons, and different hardware systems, which mimic the described FACETS system.

Its implementation was optimized regarding access speed, memory usage and mapping process functions, see section 6.2. The graph model offers a basis for independently developing optimization algorithms, efficient cost functions and more complex models for the introduced mapping framework. Furthermore it is well suited to massive parallelization to achieve additional performance. The next task is to examine the memory behaviour in more detail, investigate the performance of single mapping steps and extend the graph model interface for further algorithms, cost functions and external routing functions.

Acknowledgements: The research project FACETS is financed by the European Union as Integrated Project (Nr. 15879) in the framework of the Information Society Technologies program.

References:

[1] K. Meier - Fast Analog Computing with Emergent Transient States in Neural Architectures, Integrated project proposal, FP6-2004-IST-FETProactive, Part B, Kirchhoff Institut of Physik, Ruprecht-Karls-Universitaet, Heidelberg 2004.

[2] J. Schemmel, A. Gruebl, K. Meier and E. Mueller - Implementing Synaptic Plasticity in a VLSI Spiking Neural Network Model, Kirchhoff Institut of Physik, Ruprecht-Karls-Universitaet, Heidelberg 2006.

[3] M. Ehrlich, C. Mayr, H. Eisenreich, S. Henker, A. Srowig, A. Gruebl, J. Schemmel and R. Schueffny - Wafer-scale VLSI implementations of pulse coupled neural network, International Conference on Sensors, Circuits and Instrumentation Systems SSD07, Hammamet, Tunisia 2007.

[4] R. Diestel - Graph Theory, Springer 2005.

[5] M.I. Jordan - Graphical Models, Computer Science Division and Department of Statistics, University of California, Berkeley, California 94720-3860, USA 2004.

[6] D.F. Jones - Multi-objective meta-heuristics: An overview of the current state-of-the-art, School of Computer Science and Mathematics, University of Portsmouth, UK 2001.

[7] E. Zitzler - Comparison of Multiobjective Evolutionary Algorithms: Empirical Results, Department of Electrical Engineering, Swiss Federal Institute of Technology, Switzerland 2000.

