SLAAC: Systems Level Applications of Adaptive Computing

DESCRIPTION

SLAAC: Systems Level Applications of Adaptive Computing. DARPA/ITO Adaptive Computing Systems PI Meeting, Napa, California, April 13-14. Presented by: Bob Parker, Deputy Director, Information Sciences Institute.

TRANSCRIPT
SLAAC
Systems Level Applications of Adaptive Computing

DARPA/ITO Adaptive Computing Systems PI Meeting
Napa, California
April 13-14

Presented by:
Bob Parker, Deputy Director,
Information Sciences Institute
Bob Parker, USC Information Sciences Institute

System Level Applications of Adaptive Computing

Team Members: USC/ISI (Lead), BYU, UCLA, Sandia National Labs

Significant reduction in power, weight, volume, and cost for several challenging DoD embedded applications:
•SAR ATR
•Sonar Beamforming
•IR ATR
•Others

Utilizing three phases of adaptive computing components:
•Large current-generation FPGAs
•Rapidly reconfigurable and/or fine-grain FPGAs
•Hybrid FPGAs

Integrating multiple constituent technologies:
•Scalable embedded baseboard
•Gigabit/sec networking
•Modular adaptive compute modules
•Smart network-based control software
•Algorithm mapping tools

Developing reference platforms:
•Flight-worthy deployable system
•Low-cost researcher's kit

Timeline ('97-'01):
•'97: Lab demo of an ACS-implemented SAR ATR algorithm
•'98: Embedded SAR ATR demo of ACS HW (clear, 1 Mpixel/s, 6 TT); first generation of reference platforms
•'00-'01: Embedded SAR ATR demos (CC&D, 1 Mpixel/s and 10 Mpixel/s, 6 TT)
SLAAC Objectives

•Define a system-level, open, distributed, heterogeneous adaptive systems architecture
•Design, develop, and evolve scalable reference platforms implementing the adaptive systems architecture
•Validate the approach by deploying reference platforms in multiple defense application domains:
–SAR ATR
–Sonar Beamforming
–IR ATR
–Others
SLAAC Affiliates

SLAAC developers: ISI, BYU, UCLA, and Sandia, working with the ACS research community and component developers.

Applications / challenge problem owners:
•Sandia: SAR/ATR
•NVL: IR ATR
•NUWC: Sonar Beamforming
•LANL: Ultra Wide-Band Coherent RF
•LANL: Multi-dimensional Image Processing
•Lockheed Martin: Electronic Countermeasures
SLAAC Architecture

[Diagram: hosts and a sensor connect over a network, through network interface processors, to nodes that pair a control processor with an ACS or DSP device. Boards shown: Myricom L4 Orca board, SLAAC1 board, UCLA board, and Myricom L5 baseboard. SLAAC1 board detail: X0, X1, and X2 FPGAs linked by X0_LEFT/X0_RIGHT and XP_LEFT/XP_RIGHT crossbar paths (X0_XBAR, XP_XBAR), plus PROM/CLK and the PMC and PCI buses.]
SLAAC Programming Model

A single host program controls a distributed system of nodes and channels:
•the system is dynamically allocated at runtime
•multiple hosts compete for nodes
•channels stream data between host and nodes
Runtime System

Software stack: Application, over the Runtime System, over the Messaging Layer, over the Network Layer.

•System Layer: high-level programming interface (e.g., ACS_Create_System(Node_list, Channel_list))
•Node Layer: hides device-specific information (e.g., FPGA configuration)
•Control Layer: node-independent communication commands (i.e., blocking and non-blocking message-passing primitives)
Remote Node Processing Alternatives

[Diagram: two host/remote-node software stacks connected by the network. In one alternative the application and runtime live on the host, with only messaging and low-level runtime control in front of the remote node's FPGA; in the other, application code also runs on the remote node, directly above its runtime and FPGA.]

•Less power required from compute node
•Less latency between application and low-level control
Runtime and Debugging

Interactive debugger:
•all system-layer C functions provided in a command-line interface
•symbolic VHDL debugging support using readback
•single-step clock
•scriptable

SLAAC Runtime:
•monitor system state
•hardware diagnostics

Other tools:
•network traffic monitors (MPI based?)
•load balancing
•visualization tools
Runtime Status

Complete:
•System Layer API specification
•Control Layer API specification, partially simulated

Scheduled:
•May: VHDL simulation of SLAAC board
•June: implementation of basic runtime system functions
Development Platform Path

[Diagram: four platforms arranged along axes of improved compute density and improved development environment: a low-cost COTS development platform; an SBC with external network (SBC + SLAAC1 PMC board + Myrinet PMC card); an SBC with embedded network (SBC + SLAAC double-wide PMC card + Myrinet); and a fully embedded platform (L5 baseboard + SLAAC over P0/Myrinet). The SLAAC runtime system (System, Node, and Control Layers) is common to all four.]
Hardware Platforms and Software Development

With a node O.S. (COTS OS on the node):
•Low-risk development path
•Standards compliant (MPI, VxWorks)
•Recompile to change platforms
•GP programming environment at node level
•Bandwidth limited by MPI

Without a node O.S. (custom network interface on the node):
•Custom network interface program (exploits GM)
•Direct network/compute connection
•Immature development environment
•SLAAC provides programming environment
•Maximum bandwidth

Hardware Platform          Node O.S.                 No Node O.S.
Cluster of workstations    MPI, Linux or NT, PCI     GM, PCI
SBC w/ external network    MPI, VxWorks, PMC         MPI, PMC
SBC w/ embedded network    MPI, VxWorks, VME P0      GM, VME P0
Fully embedded             ?                         GM, VME P0

[Diagram: the node-O.S. stacks run application/runtime over a COTS OS on both host and node; the no-node-O.S. stacks replace the node's COTS OS with a custom network interface, trading higher risk for higher performance.]
BYU/UCLA Domain-Specific Compilers for ATR

BYU: Focus of Attention (FOA)
[Flow: image morphology "CYTO" code → Neighborhood Processor Generator (optimization here, drawing on hand-optimized neighborhood operators from a Viewlogic library) → structural VHDL → Synopsys logic synthesis → Xilinx place & route → FPGA.]
•Maps "CYTO" neighborhood operations to pre-defined FPGA blocks
•High packing density to enable a single configuration

UCLA: Template Matching
[Flow: templates → Correlator Generator (optimization here) → structural VHDL → Synopsys logic synthesis → Xilinx place & route → multiple FPGAs.]
•Optimizes VHDL using template overlap
•Creates an optimized template subset with a minimum number of reconfigurations
The UCLA Testbench

[Diagram: the Mojave board, with its static FPGA, sits on an external PCI bus beside a system processor, connected through a bus connector to a PCI slot on the host processor.]

•The Mojave board can interface to the i960 development system for in-house testing (as shown), or with the Myricom LANai board.
SLAAC1 Board

[Board diagram: X0, X1, and X2 FPGAs linked by X0_LEFT/X0_RIGHT and XP_LEFT/XP_RIGHT crossbar paths (X0_XBAR, XP_XBAR). Interfaces: FIFO data (64 pins), FIFO control (~16 pins), clock/configuration/inhibit signals, and an external memory bus to 256K x 18 SRAM with a jumper block; PROM/CLK; PMC bus and PCI bus.]
Surveillance Challenge Problem - SAR/ATR

System Parameter                                Challenge   Current   Scale Factor
SAR area coverage rate (sqnm/day @ 1 ft res.)   40,000*     1,000     40X (FOA, Indexer, Ident.)
Level/difficulty of CC&D                        High        Low       100X (Indexer), 10X (Ident.)
Number of target classes                        30          6         5X (Indexer, Ident.)

*Corresponds to a data rate of 40 Megapixels/sec
Project Benefit Includes Improved Compute Density
(Scaled to challenge size; assuming FOA, Indexer, and 1 Identifier)

[Chart: number of 6U VME chassis (VME chassis = 3.5 cft, 80 lbs, 700 W, $400,000) on a log scale from 0.01 to 1,000,000, versus year, '91-'01, for both clear and CC&D cases. Past STARLOS systems and the JSTARS '96 and '97 demos mark the baseline (early two-level multicomputers + algorithms, DARPA EHPC program). Under the DARPA ACS program, the '98 demo and '99 demo range track 5X over Moore's Law (ACS large FPGAs with on-chip SRAM blocks + algorithms), and the '01 demo range tracks 10X over Moore's Law (ACS hybrid chips + algorithms).]
ATR Flight Demo System

For 1 Mpixel/sec with 6 target configurations (targets-in-the-clear scenario).

Baseline 1996 system hardware architecture:
•Systolic: 3 algorithm modules
•SIMD: 1 algorithm module
•Early 2-level multiprocessors/DSP: 3 algorithm modules

1997 flight demo system hardware architecture:
•2-level multiprocessor/DSP: 8 algorithm modules (1 additional algorithm module implemented over the baseline system)

                                          Baseline 1996           1997 Flight Demo
Power (W)                                 1680                    453
Weight (lbs)                              354                     124
Volume (ft3)                              17.5 (5 VME chassis)    7 (2 VME chassis)
Power-Volume-Weight product (W-ft3-lbs)   10,407,600              393,204 (26.47 ratio)

The 2-level multiprocessor/DSP configuration implements the algorithms (with an additional algorithm module) with better performance and significantly lower power, size, and weight than the baseline implementation.
Common SAR/ATR - DARPA ACS & EHPC FY97 Testbed

[Diagram: a laboratory development element and a real-time deployable element (Common ATR Model Year 1) on Joint STARS. Components include MIMD machines (Intel Paragon, SGI, PowerPC HPSC multicomputers), SIMD machines (CPP DAP, CNAPS), systolic machines (Datacube/Custom, CYTO), SUN/SGI/PC workstations, RAID data storage, SBCs, and Myrinet/HIPPI interconnects as next-generation embeddable HPC technologies.]
JSTARS ATR Processor

PowerPC multicomputer:
•13 commercial Motorola VMEbus CPU boards
–200 MHz 603e PowerPC per board
–5.2 GFLOPS peak
•Commercial Myrinet high-speed communications
–1.28 Gbit/sec full duplex
–Crosspoint topology

SHARC multicomputer:
•4 Sanders HPSC processor boards
–8 33 MHz Analog Devices SHARC DSP processors per board
–3.2 GFLOPS peak
•Myrinet high-speed communications
TMD/RTM Real-Time ATR

Delivered 6/97:
•FOA, SLD, MPM, MSE, CRM, LPM, & PGA
•Supported 5 real-time ESAR/ATR airborne flight exercises
–2 engineering check-out flights
–3 Phase III evaluation flights

Features:
•1 Mpixel/sec, 6 configurations, targets-in-the-clear scenario
•Large-scale dynamic range capability
•Modular, scalable, plug & play architecture
•2 VMEbus chassis ATR system
•Heterogeneous two-level multicomputer, COTS PowerPC and Sanders SHARC
DARPA SAR ATR EHPC Testbed Experiments in Action
RTM ATR Advanced Technology Demonstration
This work performed under the sponsorship of the Air Force Aeronautical Systems Center and the Air Force Research Laboratory (formerly Rome Laboratory)
Joint STARS SAR/ATR Transition

Description:
•Jointly managed, USAF ASC/FBXT and AFRL/IF
•Provided JSTARS with a real-time ATR capability
•Leveraged prior Service & DARPA investments
•Sandia developed the ATR system; Northrop Grumman developed the ESAR system and led the integration of both systems onto the aircraft
•The ATR system enables image analysts to identify threats in real time by prescreening large amounts of data for potential targets

Accomplishments:
•Developed airborne real-time SAR/ATR system
•Demonstrated initial system at Pentagon (Sep 96)
•All-COTS system implementation (Apr 97)
•Full system integrated on T3 aircraft (Aug 97)
•Engineering/integration flights completed with fully operational system (Sep 97)
•Three real-time demonstration flights (Oct 97)
•Operationally significant Pd/FAR performance
BYU - FOA and Indexing

[Processing chain: SAR image → sensor preprocessor → Focus of Attention (detection, on coarse data) → Indexer (on fine data; location and angle estimate) → Identification (target ID with confidence).]

Superquant FOA:
•1-pass adaptive threshold technique
•Produces ROI blocks for indexing
•>7.8 Gbops/second/image
•1 Mpixel/second FY98; 40 Mpixel/second FY01

CC&D indexing:
•Algorithm definition in process
BYU - SAR ATR Status: Non-Adaptive FOA

•Wildforce PCI platform
•3 months to retarget to SLAAC board

Compilation strategies:
•Current approach
–"VHDL synthesis from scratch"
–Traditional tool flow
•Planned approach (July 1999)
–"Gate array" approach
–Fixed chip floorplan, regular arrays
–30x speedup, compile < 1 hour, ~10% efficiency loss
Sonar Beamforming with 96-Element TB-23

Goals:
•First real-time matched field algorithm deployment
•1000x scale-up: 51 beams to 51,000 beams
•Ranging + "look forward" capability
•Demonstrate adaptation among algorithms at sea
•Validate FPGAs for signal processing

Computation:
•2-stage (coarse and fine)
•16 Gop/sec, 2.5 GB memory, 80 MB/sec I/O

Approach:
•Use k-ω and matched field algorithms
•Leverage ACS to provide coarse-grain run-time reconfiguration
–Environmental adaptability
–Multiple-resolution processing
Stealth Advancements

[Chart: broadband quieting comparison; noise levels versus year, 1960-2010. Russian classes: VICTOR I, ALFA, VICTOR III, AKULA, IMPROVED VICTOR III, IMPROVED AKULA, SEVERODVINSK. U.S. classes: 594, 637, 688, 688I, SSN-21, and NSSN (lead ship keel laid Dec 93).]
Credit: NUWC
Sonar Status

•Algorithm identified by NUWC 4/3/98; validation in process
•BYU mapping of NUWC algorithm underway for Wildforce board; map to SLAAC boards when available
•Sonar module generation: operational generators include
–pipelined multipliers & CORDIC units
–C/C++ programs generating VHDL code
–generators used in Wildforce mapping
Timeline - Baseline + Option

FY98:
•NUWC specifies algorithm
•Feasibility study

FY99:
•1st SLAAC ACS board available
•1st mapping: (1) BYU preliminary mapping to ACS; (2) submarine I/F specified and I/F construction begun
•BYU end-to-end lab demo complete and delivered to ISI

FY00:
•(1) Advanced algorithm development on RRP; (2) sonar module generators
•Advanced SLAAC ACS boards available
•ISI delivers demo system to NUWC for testing
•SEA TEST (summer 2000)

FY01:
•(1) Advanced algorithm development; (2) sonar subcompilers
•Top-level compiler
SLAAC Conclusions

•Great early success in deployed capability
•Interesting runtime tradeoffs
•Significant risk reduction through COTS standards
•Promising simulation results - headed for hardware
•Adaptive systems are hard - but getting easier
http://www.east.isi.edu/SLAAC