Reiner Hartenstein, University of Kaiserslautern
July 8, 2002, ENST, Paris, France. Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design, Part 3: Resources for RC.
Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design
Part 3: Resources for RC
Reiner Hartenstein
University of Kaiserslautern
July 8, 2002, ENST, Paris, France
© 2002, [email protected] http://kressaray.de
University of Kaiserslautern
Xputer Lab
Schedule
time slot
xx.30 – xx.00 Reconfigurable Computing (RC)
xx.00 – xx.30 coffee break
xx.30 – xx.00 Design / Compilation Techniques
xx.00 – xx.00 lunch break
xx.00 – xx.30 Resources for Data-Stream-based RC
xx.30 – xx.00 coffee break
xx.00 – xx.30 FPGAs: recent developments
Opportunities by new patent laws?
• To clever guys keen on patents:
• do not file patents on the following details!
• Everything shown in this presentation was published years ago.
>> Configware Industry
• Configware Industry
• Terminology
• MoPL data-procedural language
• Anti architecture and circuitry
• Stream-based Memory Architecture
http://www.uni-kl.de
Configware heading for mainstream
• Configware market taking off for mainstream
• FPGA-based designs more complex, even SoC
• No design productivity and quality without good configware libraries (soft IP cores) from various application areas
• Growing number of independent configware houses (soft IP core vendors) and design services
• AllianceCORE & Reference Design Alliance
• Currently the top FPGA vendors are the key innovators and meet most configware demand.
OS for PLDs
• separate EDA software market, comparable to the compiler / OS market in computers
• Cadence, Mentor, Synopsys just jumped in
• < 5% of Xilinx / Altera income comes from EDA SW
Xilinx Alliances
• The Software Alliance EDA Program
• ... Xilinx Inc.'s Foundation...
• free WebPACK downloadable tool palette
• The Xilinx XtremeDSP Initiative (with Mentor Graphics)
• MathWorks / Xilinx Alliance
• The Wind River / Xilinx alliance
The Software Alliance EDA Program
provides a wide selection of EDA tools:
Acugen Software, Agilent EEsof EDA, Aldec, Aptix, Auspy Development, Cadence, Celoxica, Dolphin Integration, Elanix, Exemplar, Flynn Systems, Hyperlynx, IKOS Systems, Innoveda, Mentor Graphics, MiroTech, Model Technology, Protel International, Simucad, SynaptiCAD, Synopsys, Synplicity, Translogic, Virtual Computer Corporation.
helps leading EDA vendors to integrate Xilinx Alliance software tightly into their tools
The Xilinx AllianceCORE program
A cooperation between Xilinx and third-party core developers to produce a broad selection of industry-standard solutions for use in Xilinx platforms. Partners are:
Amphion Semiconductor, Ltd.; ARC Cores; CAST, Inc.; DELTATEC; Derivation Systems, Inc.; Dolphin Integration (Grenoble); Eureka Technology Inc.; Frontier Design Inc.; GV & Associates, Inc.; inSilicon Corporation; iCODING Technology Inc.; Loarant Corporation; Mindspeed Technologies - A Conexant Business (formerly Applied Telecom); MemecCore; Mentor Graphics Inventra; NewLogic Technologies, Inc. (Europe); NMI Electronics; Paxonet Communications, Inc.; Perigee, LLC; Rapid Prototypes Inc.; sci-worx GmbH (Hannover, Germany); SysOnChip; TILAB (Telecom Italia Lab); VAutomation; Virtual IP Group, Inc.; XYLON.
The Xilinx Reference Design Alliance Program
The Xilinx Reference Design Alliance Program helps the development of multi-component reference designs that incorporate Xilinx devices and other semiconductors. The designs are fully functional, but come with no warranties and no liability. Partners are:
ADI Engineering; Innovative Integration; JK microsystems, Inc.; LYR Technologies; NetLogic Microsystems
The Xilinx University Program
The Xilinx University Program provides
• Xilinx Student Edition Software
• Professor Workshops
• a Xilinx University User Group
• Presentation Materials and Lab Files
• Course Examples
• Research
• Books, etc.
Altera offers over a hundred IP cores (1)
Altera offers over a hundred IP cores like, for example:
• controller • UART • microprocessor • decoder • bus control • USB controller • PCI bus interface • Viterbi controller • fast Ethernet • MAC receiver or transmitter
• modulator • synchronizer • DDR SDRAM controller • Hadamard transform • interrupt controller • Real86 16 bit microprocessor • floating point • FIR filter • discrete cosine • ATM cell processor • and many others.
Altera offers over a hundred IP cores (2)
from Altera and from: AMIRIX Systems, Inc.; Amphion Semiconductor, Ltd.; Arasan Chip Systems, Inc.; CAST, Inc.; Digital Core Design; Eureka Technology Inc.; HammerCores; Innocor; Ktech Telecommunications, Inc.; Lexra Computing Engines; Mentor Graphics - Inventra; Modelware; Ncomm, Inc.; NewLogic Technologies; Northwest Logic; Nova Engineering, Inc.; Palmchip Corporation; Paxonet Communications; PLD Applications; Sciworx; Simple Silicon; Tensilica; TurboConcept.
Altera IP core design services
Altera IP core design services are available from:
• Northwest Logic
Altera Certified Design Center (CDC) Program
Certified Design Center (CDC) Program:
• Barco Silex • El Camino GmbH • Excel Consultants • Plextek • Reflex Consulting • Sci-worx • Tality • Zaiq Technologies.
The Altera Consultants Alliance Program (ACAP)
The Altera Consultants Alliance Program (ACAP) lists
• 41 offices in North America and
• 29 in the rest of the world.
Development boards
Development boards are offered by: • Altera • El Camino GmbH • Gid'el Limited • Nova Engineering, Inc. • PLD Applications • Princeton Technology Group • RPA Electronics Design, LLC • Tensilica.
Consultants and services not listed by Xilinx nor Altera (index)
Algotronix, Edinburgh; Andraka Consulting Group; Arkham Technology, Pasadena, CA; Barco Silex, Louvain-la-Neuve, Belgium; Bottom Line Technologies, Milford, NJ; Codelogic, Helderberg, South Africa; Coelacanth Engineering, Norwell, MA; Comit Systems, Inc., Santa Clara, CA; EDTN Programmable Logic Design Center; Flexibilis, Tampere, Finland; Geoff Bostock Designs, Wiltshire, England; Great River Technology, Albuquerque, NM; New Horizons GB Ltd, United Kingdom; North West Logic; Silicon System Solutions, Canterbury, Australia; Smartech, Tampere, Finland; Tekmosv, Austin, Texas; The Rockland Group, Garden Valley, CA; Nick Tredennick, Los Gatos, California; Vitesse
Consultants and services not listed by Xilinx nor Altera (1)
Algotronix, Edinburgh: Reconfigurable Computing and FPL in software radio, communications and computer security
Andraka Consulting Group: high performance FPGA designs for DSP applications
Arkham Technology, Pasadena: low cost IP cores for Xilinx and Atmel, embedded processor, DSP, wireless communication, COM / CORBA / DirectX, client-server database programming, software internationalization, PCB design
Barco Silex, Louvain-la-Neuve, Belgium: IP integration boards for ASIC and FPGA, consultancy, design, sub-contracting
Consultants and services not listed by Xilinx nor Altera (2)
Bottom Line Technologies, Milford, New Jersey: FPGA design, training, designing Xilinx parts since 1985
Codelogic, Helderberg, South Africa: consulting, FPGA design services
Coelacanth Engineering, Norwell, Massachusetts: design services, test development services, in wireless communication, DSP-based instrumentation, mixed-signal ATE
Comit Systems, Inc., Santa Clara, California: DSP, ASIC, networking, embedded control in avionics; FPGA / ASIC design and system software
EDTN Programmable Logic Design Center
Consultants and services not listed by Xilinx nor Altera (3)
FirstPass, Castle Rock, Colorado
Vitesse: ASIC design
Flexibilis, Tampere, Finland: VHDL IP cores for Xilinx products
Geoff Bostock Designs, Wiltshire, England: FPGA design services
Great River Technology, Albuquerque, New Mexico: FPGA design services in digital video and point-to-point data transmission for aerospace, military, and commercial broadcasters
New Horizons GB Ltd, United Kingdom: FPGA design and training, Xilinx specialist
North West Logic: FPGA and embedded processor design in digital communications, digital video
Consultants and services not listed by Xilinx nor Altera (4)
Silicon System Solutions, Canterbury, Australia: VHDL IP cores for the ASIC and FPGA/CPLD/EPLD markets
Smartech, Tampere, Finland: ASIC and FPGA design
Tekmosv, Austin, Texas: Multiple Designs on a Single Gate Array, HDL synthesis, design conversions, chip debug, test generation
The Rockland Group, Garden Valley, California: a TeleConsulting organization about logic design for FPGAs
Nick Tredennick, Los Gatos, California: investor and consultant
>> Terminology
• Configware Industry
• Terminology
• MoPL data-procedural language
• Anti architecture and circuitry
• Stream-based Memory Architecture
http://www.uni-kl.de
Terminology
Paradigm | Platform | Programming source
"von Neumann" | Hardware | Software
Soft Machine (w. soft datapaths) | coarse grain Flexware | high level Configware
RL (FPGA etc.) | fine grain Flexware | netlist level Configware
Terminology & Acronyms
• Software (SW): procedural sources*
• Configware (CW): structural sources
• Hardware (HW): hardwired platforms
• ASIC: customizable hardwired platforms
• Flexware (FW): reconfigurable platforms
• FPGA: field-programmable gate array
• FPL: field-programmable logic
• RC: reconfigurable computing
• RL: reconfigurable logic
*) note: firmware is SW!
Stream-based Computing (2)
terms:
• DPU: datapath unit
• DPA: datapath array
• rDPU: reconfigurable DPU
• rDPA: reconfigurable DPA
• stream-based computing: using complex pipe network (super-systolic: Kress et al.)
Confusing Terminology
Computer Science and EE, as well as their R&D and application areas, suffer from a Babylonian confusion.
Communication is made difficult not only between Computer Science and EE, but also between their special areas, and even between their different abstraction levels, mainly because of immature terminology relating to reconfigurable circuits and their applications.
Terms are rarely standardized and are often used with drastically different meanings, even within the same special area.
Terms have often been coined so badly that they are not self-explanatory but misleading. A demonstrative example is the comparison of the terms used in VHDL and Verilog.
"Intuitive" terms are ideal, but intuition often yields the wrong idea. Whenever a new term appears in teaching, I often have to tell the students that the term does not mean what they believe.
Terms (1)
Term | Meaning | Example
Hardware | hardwired | ASIC, CPU, DPU, DPA
Morphware | reconfigurable (structurally programmable) | FPLA, FPGA, rDPU, rDPA
Firmware | microprogram (rarely used after introduction of RISC processors) | IBM 360 Computer Family
Software | procedural programs (instruction streams executed by a CPU) | Word, C, OS, Compiler, etc.
Streamware | data-procedural programs (data streams executed by a DPU or DPA) | data schedules, data streams, e.g. MoPL programs
Configware | structural programs, soft IP cores, personalizing CPLD, FPGA, or other Flexware | for configuration of rDPA, FPGA, e.g. as a logic circuit, state machine, datapath, function
[à la Ingo Kreuz]
Terms (2)
Term | Meaning | Example
data | objects of computing; the "data" property depends on the moment of observation | bits, numbers, operands, results, any text (also compiler input), lists, graphs, tables, images, ...
data stream | ordered, also parallel, data word lists, obtained by scheduling | I/O data streams for systolic or other arrays, also DSP
programming | personalisation by loading program code | procedural code or structural code: for (re)configuration
program | source text or object code for programming | procedural or structural
[à la Ingo Kreuz]
Terms (3)
Term | Meaning | Example
boot program | simple program to enable programming, usually saved in non-volatile memory | comparable to the starter of the engine of a car
booting | load and execute a boot program |
[à la Ingo Kreuz]
Hardware Terms (1)
Term | Meaning | Example
machine | execution unit, driven by deterministic sequencer, + memory | von Neumann, or anti machine
"dataflow machine" | not a machine, since without a deterministic sequencer (exotic concept) | (dead research area)
CPU | instruction stream processor ("von Neumann"): program counter (instruction sequencer) and DPU; mode of operation: deterministically instruction-driven | ARM, Pentium core
DPU, rDPU | (reconfigurable) data path unit* |
DPA, rDPA | (reconfigurable) DPU array* | KressArray
*) processing data streams (transport-triggered), not yet a machine: autosequencing memory missing
[à la Ingo Kreuz]
Hardware Terms (2)
Term | Meaning | Example
DPU | data path unit, processes operands; not a CPU, since without sequencer; not a machine | ALU with registers, multiplexers etc.
Computer | CPU with RAM and interfaces |
Parallel Computer | ensemble of several Computers |
Xputer | deterministically data-driven machine (transport-triggered); data counter(s) used instead of a program counter | MoM architectures (Kaiserslautern)
dataflow machine | indeterministically data-driven (execution sequence unpredictable) | (sleeping research area)
[à la Ingo Kreuz]
Terms on Parallelism (1)
Term | Meaning | Example
parallelism | several levels of parallelism distinguished | parallel processes, parallelism at instruction set level, pipelines
concurrent | parallel processes run on different CPUs of a parallel computer; may occasionally exchange signals or data | weather forecasting, complex simulations, etc.
ISP (instruction set parallelism) | several CPUs run in parallel by clocked synchronization | VLIW (very long instruction word) computer
[à la Ingo Kreuz]
Terms on Parallelism (2)
Term | Meaning | Example
pipelining | several uniform or different DPUs running simultaneously, connected to form a pipeline by buffer registers | pipelined CPUs, pipe networks, systolic arrays, etc.
chaining | several uniform or different DPUs running simultaneously, connected to form a pipeline without buffer registers | combinational circuits, complex arithmetic operators
Pipe network | ensemble of DPUs, also multiple pipelines, also with irregular or wild structures | systolic arrays, stream-based computing arrays
[à la Ingo Kreuz]
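The difference between pipelining and chaining in the table above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names `chained` and `pipelined`, the use of `None` as an "empty register" marker, and the three example stages are hypothetical and do not come from the presentation.

```python
from functools import reduce

def chained(stages, value):
    """Chaining: the DPUs are connected without buffer registers,
    so one operand passes through all of them in a single step."""
    return reduce(lambda v, stage: stage(v), stages, value)

def pipelined(stages, stream):
    """Pipelining: one buffer register after every DPU; all DPUs work
    simultaneously on different operands, one clock tick per loop pass."""
    regs = [None] * len(stages)          # buffer registers, None = empty
    results = []
    # Feed the stream, then empty slots to flush the pipeline.
    for item in list(stream) + [None] * len(stages):
        if regs[-1] is not None:
            results.append(regs[-1])     # content of the last register leaves
        # Update from the back so each stage reads last tick's value.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stages[i](regs[i - 1])
        regs[0] = None if item is None else stages[0](item)
    return results

# Three hypothetical DPUs: increment, double, subtract 3.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(pipelined(stages, [1, 2, 3]))
print([chained(stages, v) for v in [1, 2, 3]])
```

Both variants compute the same results; the pipelined version needs extra ticks to fill and flush the registers but keeps every stage busy once the stream is flowing.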
Terms on Parallelism (3)
Term | Meaning | Example
Systolic Array | pipe network with only linear (straight-on, no branching), uniform pipelines (all DPUs hardwired and with same functionality) | matrix computation, DSP, DNA sequencing, etc.
stream-based computing arrays (super-systolic arrays) | pipe network, configured before fabrication | image processing, DSP, complex functions and algorithms
(coarse grain) reconf. stream-based arrays | stream-based arrays, configurable after fabrication | KressArray
[à la Ingo Kreuz]
Counterparts
category | property | counterpart
programming mode | procedural (classical) | structural (synthesis, design): "field-programmable", PLA "programming", etc.
machine: principle of operation | control-flow-driven (instruction-driven): von Neumann | data-driven: Xputer machine
system: principle of operation | instruction-flow-driven (parallel computer etc.) | data-stream-based (systolic array, DPU array, KressArray)
set-up time (datapaths switched through) | during run time (instruction-driven) | before run time: FPGA (at compile time), Gate Array (at fabrication)
[à la Ingo Kreuz]
>> MoPL data-procedural language
• Configware Industry
• Terminology
• MoPL data-procedural language
• Anti architecture and circuitry
• Stream-based Memory Architecture
http://www.uni-kl.de
Fundamental Ideas available (1)
• Data Sequencer Methodology
• Data-procedural Languages (duality with von Neumann)
• ... supporting memory bandwidth optimization
• Soft Data Path Synthesis Algorithms
• Parallelizing Loop Transformation Methods
• Compilers supporting Soft Machines
• SW / CW Partitioning Co-Compilers
Fundamental Ideas available (2)
• Programming Xputers
• Similarities to programming computers
• How not to get confused by similarities
• What benefits vs. computers?
Programming Language Paradigms
language category | Computer Languages | Xputer Languages
both: deterministic procedural sequencing: traceable, checkpointable
operation sequence driven by | read next instruction, goto (instr. addr.), jump (to instr. addr.), instr. loop, loop nesting (no parallel loops), escapes, instruction stream branching | read next data item, goto (data addr.), jump (to data addr.), data loop, loop nesting, parallel loops, escapes, data stream branching
state register | program counter | data counter(s)
address computation | massive memory cycle overhead | overhead avoided
instruction fetch | memory cycle overhead | overhead avoided
parallel memory bank access | interleaving only | no restrictions
easy to learn
Similar Programming Language Paradigms
language category | Computer Languages | Xputer Languages
both: deterministic procedural sequencing: traceable, checkpointable
sequencing driven by | read next instruction, goto (instruction addr.), jump (to instruction addr.), instruction loop, instruction loop nesting (no parallel loops), instruction loop escapes, instruction stream branching | read next data object, goto (data addr.), jump (to data addr.), data loop, data loop nesting, parallel data loops, data loop escapes, data stream branching
very easy to learn
JPEG zigzag scan pattern (published in 1993)
[Figure: zigzag scan of the data counter over an x/y pixel map, in four numbered segments: HalfZigZag, SouthWestScan, uturn (HalfZigZag)]
*> Declarations
EastScan is
  step by [1,0]
end EastScan;
SouthScan is
  step by [0,1]
end SouthScan;
NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;
SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;
HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;
goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)
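To make the data-counter movement of the MoPL zigzag program easier to follow, here is a minimal Python sketch that replays it on an 8 by 8 block. It is not a MoPL implementation. Two points are assumptions of this sketch, not statements from the slide: the loop condition (`until [1,*]`, `until [*,1]`) is checked after each step, and `uturn (HalfZigZag)` is read as replaying the step vectors of HalfZigZag in reverse order. The function name `zigzag_path` is hypothetical.

```python
def zigzag_path():
    """Return the [x, y] addresses visited by the data counter (1-based)."""
    x, y = 1, 1                          # goto PixMap[1,1]
    path = [(x, y)]
    taken = []                           # step vectors, recorded for the u-turn

    def step(dx, dy):                    # "step by [dx,dy]"
        nonlocal x, y
        x, y = x + dx, y + dy
        path.append((x, y))
        taken.append((dx, dy))

    def loop_until(dx, dy, reached):     # "loop 8 times until ... step by ..."
        for _ in range(8):
            step(dx, dy)
            if reached():                # assumed: condition tested after the step
                break

    def south_west_scan():
        loop_until(-1, 1, lambda: x == 1)    # until [1,*]

    def north_east_scan():
        loop_until(1, -1, lambda: y == 1)    # until [*,1]

    def half_zigzag():
        step(1, 0)                       # EastScan
        for _ in range(3):               # loop 3 times
            south_west_scan()
            step(0, 1)                   # SouthScan
            north_east_scan()
            step(1, 0)                   # EastScan

    half_zigzag()
    first_half = list(taken)
    south_west_scan()                    # the long anti-diagonal
    # Assumed meaning of uturn: replay HalfZigZag's steps in reverse order.
    for dx, dy in reversed(first_half):
        step(dx, dy)
    return path

path = zigzag_path()
print(len(path), path[:6], path[-1])
```

Under these assumptions every address of the block is visited exactly once, starting at [1,1] and ending at [8,8]; no instruction fetch is involved in deciding the next address, only the data counter moves.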
>> Anti architecture and circuitry
• Configware Industry
• Terminology
• MoPL data-procedural language
• Anti architecture and circuitry
• Stream-based Memory Architecture
http://www.uni-kl.de
GAU (generic address unit) Scheme (published in 1990)
GAG = Generic Address Generator
[Figure: the GAU consists of a Base Slider (B0, B), a Limit Slider (L0, L) and an Address Stepper (A, between base B and limit L); all 3 are copies of the same BSU stepper circuit]
GAG: Address Stepper (published in 1990)
GAG = Generic Address Generator
BSU = Basic Stepper Unit
[Figure: address stepper circuit; labels: Address A, + / –, stepVector, Base B, Limit L, B0, limit, StepCounter, maxStepCount, init, tag, Escape Clause, End Detect, endExec, stepper sequencing]
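The interplay of base slider, limit slider and address stepper can be sketched in software. The following is a behavioural illustration only, not the published circuit: the class `StepperUnit`, the function `gau_addresses` and the rule "stop when the limit is passed or the step count is exhausted" are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class StepperUnit:
    """Hypothetical model of one basic stepper unit (BSU)."""
    start: int        # initial value (e.g. B0 or L0)
    limit: int        # stepping stops once this value is passed
    step: int         # step vector (here one-dimensional)
    max_steps: int    # plays the role of maxStepCount

    def run(self):
        value = self.start
        for _ in range(self.max_steps):
            passed = (self.step > 0 and value > self.limit) or \
                     (self.step < 0 and value < self.limit)
            if passed:                   # end detected
                break
            yield value
            value += self.step

def gau_addresses(base_slider, limit_slider, step, max_steps):
    """Three copies of the same unit: two sliders move base B and limit L,
    the address stepper walks address A from B to L for every pair."""
    for b, l in zip(base_slider.run(), limit_slider.run()):
        yield from StepperUnit(b, l, step, max_steps).run()

# Example: a 4-word-wide window scanned line by line in a memory
# whose lines are 10 words long (a simple "video scan").
base = StepperUnit(start=0, limit=20, step=10, max_steps=8)
limit = StepperUnit(start=3, limit=23, step=10, max_steps=8)
print(list(gau_addresses(base, limit, step=1, max_steps=8)))
```

Changing only the slider parameters (for instance letting the limit slide faster than the base) yields sheared or non-rectangular scans without touching the stepper itself, which is the attraction of building all three from one unit.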
Generic Sequence Examples (published in 1990)
[Figure: GAU with Base Slider, Limit Slider and Address Stepper (B0, A, L0), and example scan patterns a) to g):]
• atomic scan
• linear scan
• video scan
• -90º rotated video scan
• -45º rotated (mirx (v scan))
• sheared video scan
• non-rectangular video scan
• zigzag video scan
• spiral scan
• perfect shuffle
• feed-back-driven scans (until)
Slider Animation Demo (published in 1990)
[Figure: slider animation over x and y; labels: floor F, address A, ceiling C, B0, L0, B, L]
GAG Complex Sequencer Implementation (all published in 1990)
GAG = Generic Address Generator
[Figure: the GAG contains several GAUs, each consisting of a Base Slider, a Limit Slider and an Address Stepper (B0, A, L0); further labels: SDS, GAG, VLIW stack]
>> Stream-based Memory Architecture
• Configware Industry
• Terminology
• MoPL data-procedural language
• Anti architecture and circuitry
• Stream-based Memory Architecture
http://www.uni-kl.de
MoM Xputer Architecture (published in 1990)
[Figure: rDPA, Scan Window "Cache", smart memory interface, multiple RAM banks]
Antimachine: MoM architecture
[Figure: a scan window moving over the x/y memory space (example). The Handle Position Generator yields the handle positions along the scan pattern (high level sequencing); the Scan Window Generator yields the intra scan window accesses (low level sequencing). x-GAG and y-GAG produce the memory accesses to banks 0, 1, ..., n.]
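The two sequencing levels named on this slide can be sketched as two nested address generators. This is a software illustration under assumptions, not the MoM hardware: the function names, the plain video scan used as scan pattern, the 0-based coordinates and the bank mapping `x % n_banks` are all hypothetical.

```python
def handle_positions(mem_w, mem_h, win_w, win_h):
    """High level sequencing: the scan pattern moves the window handle.
    Here a plain video scan (line by line) is assumed."""
    for hy in range(mem_h - win_h + 1):
        for hx in range(mem_w - win_w + 1):
            yield hx, hy

def window_accesses(hx, hy, win_w, win_h):
    """Low level sequencing: accesses inside the scan window,
    relative to the handle position."""
    for dy in range(win_h):
        for dx in range(win_w):
            yield hx + dx, hy + dy

def memory_accesses(mem_w, mem_h, win_w, win_h, n_banks):
    """Combine both levels and attach a bank number to each access.
    The mapping x % n_banks is an assumed interleaving scheme."""
    for hx, hy in handle_positions(mem_w, mem_h, win_w, win_h):
        for x, y in window_accesses(hx, hy, win_w, win_h):
            yield x % n_banks, x, y

# 4 x 3 memory space, 2 x 2 scan window, 2 banks
for bank, x, y in list(memory_accesses(4, 3, 2, 2, 2))[:4]:
    print(f"bank {bank}: [{x},{y}]")
```

With this assumed mapping, the two words of each window row fall into different banks and could be fetched in the same memory cycle, which hints at why the bank layout and the scan pattern have to be chosen together.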
Linear Filter Application
[Figure: b) scan window of the linear filter at successive scan steps, with read (r), write (w) and read/write (r/w) accesses distributed over memory banks a and b]
Scanline unrolling
[Figure: read (r) and read/write (r/w) positions of the scan windows after scanline unrolling]
90º Rotation of Scan Pattern
[Figure: read (r), write (w) and read/write (r/w) positions of the scan windows in banks a and b for the original and the rotated scan pattern, with the scan window overlap area marked]
Linear Filter Application
[Figure: panel labels: initial design; after scan line unrolling; after inner scan line loop unrolling; hardware level access optimization; final design]
Parallelized Merged Buffer Linear Filter Application with example image of x=22 by y=11 pixels
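A rough count shows why the overlap between neighbouring scan window positions matters for memory traffic. The sketch below is a simplified model, not the optimization flow of the slides: it assumes a 3 by 3 scan window (suggested by the read patterns in the figures), a video scan that moves one pixel per step, and reuse of already fetched pixels only within one scan line. The function names are hypothetical.

```python
def reads_without_reuse(img_w, img_h, win=3):
    """Every window position fetches all win*win pixels again."""
    positions = (img_w - win + 1) * (img_h - win + 1)
    return positions * win * win

def reads_with_scanline_reuse(img_w, img_h, win=3):
    """Along a scan line, each step only fetches the new window column;
    the overlapping columns are kept in the scan window buffer."""
    first_window = win * win
    later_steps = (img_w - win) * win
    return (img_h - win + 1) * (first_window + later_steps)

# Example image from the slide: x=22 by y=11 pixels
print(reads_without_reuse(22, 11))        # 1620
print(reads_with_scanline_reuse(22, 11))  # 594
```

Under these assumptions the number of memory reads drops from 1620 to 594 for the example image; distributing the remaining accesses over several banks reduces the number of memory cycles further.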
XMDS Scan Pattern Editor GUI
MoM Architecture Features
• Scan Cache Size adjustable at run time
• Any other shape than square supported
• 2-dimensional memory space
• Supports generic "scan patterns"
– Subject of parallel access transformations
– compare Francky Catthoor et al.
• Supports visualization
Hot Research Topic: Memory Architectures
• High Performance Embedded Memory Architectures [Catthoor et al.]
• High Performance Memory Communication Architectures [Herz]
• Custom Memory Management Methodology [Catthoor et al.]
• Data Reuse Transformations [Kougia et al.]
• Data Reuse Exploration [Soudris, Wuytack]
• Rapidly growing market: IP cores, module generators etc.
Processor Memory Performance Gap
[Figure: performance (log scale, 1 to 1000) over the years 1980 to 2000. CPU (µProc): 60%/yr; DRAM: 7%/yr; Processor-Memory Performance Gap: grows 50% / year]
von Neumann bottleneck
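The 50% per year figure follows from the two growth rates on the chart: the gap grows by the ratio of the yearly improvement factors. A small check (variable and function names are chosen for this example only):

```python
cpu_growth = 1.60    # µProc performance: +60% per year
dram_growth = 1.07   # DRAM performance: +7% per year

# The gap grows by the ratio of both factors each year.
gap_growth = cpu_growth / dram_growth

def gap_after(years):
    """Relative processor-memory gap after the given number of years."""
    return gap_growth ** years

print(f"gap grows by about {100 * (gap_growth - 1):.0f}% per year")
print(f"after 20 years (1980 to 2000) the gap exceeds 1000x: {gap_after(20) > 1000}")
```

Because the gap compounds, two decades at these rates separate processor and DRAM performance by more than three orders of magnitude.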
rDPAs: classical cache does not help
• the memory bandwidth problem is often more dramatic than for microprocessors
• classical interleaving is not practicable, since it is based on sequential instruction streams
• classical caches do not help, since instruction sequencing is not used
• the problem: throughput of parallel data streams, not instruction streams
• super pipe networks, no parallel computers!
• Stream-based arrays are a memory bandwidth problem
however, the anti machine has no vN bottleneck!
Data-Stream-based Soft Anti Machine
[Figure: Compiler and Scheduler; rDPA with "instructions"; Sequencers (data stream generator); Memory (data memory) consisting of several memory banks]
The Disk Farm? or a System On a Card?
• The 500GB disc card (14"): LOTS of bandwidth; a few disks replaced by >10s Gbytes RAM and a processor
• MicroDrive: 1.7" x 1.4" x 0.2"
– 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
– 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
• Integrated IRAM processor: 2x height
• Connected via crossbar switch; growing like Moore's law
• 16 Mbytes; 1.6 Gflops; 6.4 Gops
• 10,000+ nodes in one rack! 100/board = 1 TB; 0.16 Tflops
[Gordon Bell, Jim Gray, ISCA2000]
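The 2006 projections on this slide can be reproduced from the 1999 figures and the stated yearly factors. The board totals are checked under the assumption (a reading of the slide, not stated on it) that "100/board" multiplies the projected 9 GB drive and the 1.6 Gflops processor. All names below are chosen for this example.

```python
def project(value, factor_per_year, years):
    """Compound growth: value * factor^years."""
    return value * factor_per_year ** years

years = 2006 - 1999

capacity_2006_mb = project(340, 1.6, years)   # 340 MB at 1.6X per year
bandwidth_2006 = project(5, 1.4, years)       # 5 MB/s at 1.4X per year

# Assumed reading of "100/board = 1 TB; 0.16 Tflops"
board_tb = 100 * 9 / 1000                     # 100 drives of 9 GB each
board_tflops = 100 * 1.6 / 1000               # 100 processors of 1.6 Gflops

print(f"capacity 2006: about {capacity_2006_mb / 1000:.1f} GB")
print(f"bandwidth 2006: about {bandwidth_2006:.0f} MB/s")
print(f"per board: {board_tb} TB, {board_tflops} Tflops")
```

The results (roughly 9 GB, roughly 50 MB/s, about 1 TB and 0.16 Tflops per board) agree with the rounded numbers given on the slide.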
MoM Application Examples
• Image Processing
• Grid-based design rule check [1983*]
– 4 by 4 word scan cache
– Pattern-matching based
– Our own nMOS "DPLA" design
– design rule violation pixel map automatically generated from textual design rules
– 256 M&C nMOS, 800 single metal CMOS
– Speed-up > 10000 vs. Motorola 68000
*) "machine" not yet discovered
Schedule
time slot
08.30 – 10.00 Reconfigurable Computing (RC)
10.00 – 10.30 coffee break
10.30 – 12.00 Stream-based Computing for RC
12.00 – 14.00 lunch break
14.00 – 15.30 Resources for RC
15.30 – 16.00 coffee break
16.00 – 17.30 FPGAs: recent developments
>>> Coarse Grain
- END -
Synthesizable Memory Communication
http://kressarray.de
Efficient Memory Communication should be directly supported by the Mapper Tools
[Figure: an example by Nageldinger's KressArray Xplorer, with an Optimized Parallel Memory Controller. Legend: sequencers, memory ports, application, not used]
Memory Communication Architecture
• hot research topic in embedded systems
• storage context transformations [Herz, others]
• for low power
• for high performance
• startups provide memory IP or generators