benchmarking porting costs of the skynet high...



Page 1: Benchmarking Porting Costs of the SKYNET High …ieee-hpec.org/2014/CD/index_htm_files/FinalPapers/70.pdfMiddleware developer can easily configure virtual nodes per processor to accommoda

Benchmarking Porting Costs of the SKYNET High Performance Signal Processing Middleware

Michael J. Linnig
Engineering Fellow
Space and Airborne Systems
Raytheon
Dallas, Texas, USA
[email protected]

Gary R. Suder
Engineering Fellow
Space and Airborne Systems
Raytheon
Dallas, Texas, USA
Gary_R_Suder@Raytheon.com

Abstract: This paper gives an overview of the portable high performance signal processing middleware nicknamed SKYNET (with a nod to James Cameron). The paper will discuss the costs of porting SKYNET based radar modes to six different parallel processor architectures.

The SKYNET Open Middleware has been used in multiple embodiments for more than 10 years on U.S. surveillance radars. Seven organizations, including Raytheon, Northrop Grumman and two Federally Funded Research and Development Centers make use of the SKYNET Middleware for radar mode development. The middleware defines a supercomputer class open architecture for massively parallel computing, allowing hundreds of individual COTS processors to be efficiently netted together to solve the real-time signal processing needs of surveillance radars.

Keywords: Open Architecture; High Performance Computing; Middleware; Radar; Third Party Mode Development; Linux

978-1-4799-6233-4/14/$31.00 ©2014 IEEE

I. INTRODUCTION

Surveillance Radar applications require significant computing resources to process their sizeable data rates in real time. Surveillance Radars have high data rates because they have a large number of simultaneous receive beams, high bandwidth, and large swath widths.

High bandwidth radars generate prodigious quantities of data. Each receive channel’s analog-to-digital converter can produce hundreds of millions of samples per second if high range resolution is needed. If digital beamforming is used, there may be multiple simultaneous receive beams. The total receive data rate may be on the order of tens of gigabytes per second or more and may require many tens of teraflops of processing power. This is more compute power than is available in most Von Neumann computers. As sketched out in Figure 1, a parallel processing solution is called for.

Figure 1: The Surveillance Radar Processing Problem Needs a Parallel Solution

A. “Embarrassingly Parallel Problem”

Signal processing for radars is often an “Embarrassingly Parallel Problem” that is solvable by decomposing the computations and allocating the processing to a large number of identical processors running in parallel, each working on a subset of the data. These processors need not share memory. In the taxonomy of parallel processing techniques, there is an approach called Single Program Multiple Data (SPMD) [1] programming. In SPMD programming, a copy of a single application executes on multiple processors such that each copy operates on a portion of the data. While working to obtain a solution, such a distributed application may have to share some intermediate results between tasks.

The SKYNET computing infrastructure uses the proven SPMD parallel processing abstraction. The SKYNET Middleware makes SPMD programming easier and it coordinates sharing of data between processors. Programmers develop parallel applications targeted to a rectangular grid of processors, shown in Figure 1. This approach allows efficient algorithm execution spanning many processors. The resulting radar mode software can be largely independent of the number and the kind of processors.

B. Current User Community and History

Third party developers use the SKYNET middleware to create radar modes. There is a robust development community based around SKYNET. Seven organizations, including Raytheon, Northrop Grumman and two Federally Funded Research and Development Centers employ the middleware.

The non-proprietary SKYNET API leverages heavily from the Spectron SPOX Operating System produced by Texas Instruments in 1998. Raytheon chose the SPOX API for SKYNET to minimize the amount of software rework to port existing signal processing implementations (built using SPOX O/S) to new architectures not supported by SPOX.

As is, the SPOX operating system did not provide support for distributed applications on multiple processors. Raytheon developed the Array Operating System (AOS) portion of the SKYNET middleware to add inter-processor communication between applications. The combination of the SPOX and AOS APIs forms the core SKYNET Middleware API. The original SKYNET radar modes and the middleware ran on an array of hundreds of COTS Digital Signal Processors (DSPs).

In 2003, Northrop Grumman Corporation released a version of the SKYNET middleware that used the Message Passing Interface (MPI) protocol [2] internally to support inter-processor communications.

Since 2003, Raytheon has refined and improved the SKYNET Middleware to keep pace with evolving processor technologies. Raytheon easily ported the existing radar modes to an array of Linux based COTS high performance processors by using the SKYNET Middleware.

C. Open Systems Approach to Signal Processing Middleware

Our approach to signal processing middleware, shown in Figure 4, decomposes the middleware into two components:

(1) SKYNET MetaWare: A thin, non-proprietary, Government owned API that decouples the radar applications from hardware and operating system dependency; and

(2) Open Middleware: An open compute environment, based on Linux and open standards, which can be easily targeted to new hardware platforms.

We decomposed the middleware this way because it provides an open systems core while preserving the high-performance processing needed for Surveillance Radar systems.

The foundation of the SKYNET MetaWare is the Linux based open middleware. This foundation includes open standards such as the Message Passing Interface (MPI) [3], Data Distribution Service (DDS) [4] and the Linux Operating System itself. Linux and these open standards are a powerful toolbox and provide parallel computing building blocks, but for high-performance computing, a well defined parallel computing infrastructure such as SKYNET is needed. The Linux base of SKYNET allows it to be ported to almost any modern processor easily, breaking processor vendor lock.

“Each component will share a common component interface to the infrastructure through a thin “applications isolation” layer. It is desired to eliminate direct component-to-component communication links. Instead, all component-to-component communications will be mediated by the infrastructure to allow and enable component modularity through abstraction and encapsulation.” --DoD Radar Open System Architecture Defense Support Team

Figure 2: Motivation for Isolation Layers

The SKYNET MetaWare layer provides isolation from any changes to the operating system and also meets the desires of the DoD Radar Open System Architecture Defense Support Team [1], shown in Figure 2. The MetaWare layer is a thin “applications isolation” layer. The radar modes are not directly dependent on Linux; future developers can port modes to non-Linux processors by only changing the MetaWare.

Figure 3: Approach scales up or down to hundreds of processors or cores

Figure 4: Portable Signal Processing Middleware Approach

The SKYNET Middleware includes both a Sensor Manager (radar scheduler) and a signal-processing component. This paper only discusses the signal-processing component.

II. SKYNET MIDDLEWARE ABSTRACTIONS

The SKYNET computing infrastructure makes use of Single Program, Multiple Data parallel processing. As shown in Figure 3, programmers develop parallel applications targeted to a rectangular grid of processors. This approach allows efficient algorithms spanning hundreds of processors. The SKYNET Middleware is the signal processing infrastructure that allows the radar mode code to be scalable to a variable number of processors or arrays of processors with hundreds of cores. The resulting radar mode software is largely independent of the number and the kind of processors.

A. The Virtual Node Abstraction

To support the development of SPMD applications, the middleware provides an abstraction called a Virtual Node, which is layered upon the real-time OS abstraction and the Message Passing Interface abstraction. A Virtual Node is a task that is defined to have four NEWS neighbors (North, East, West and South, hence NEWS). As shown in Figure 5, the neighbors are Virtual Node tasks.

Figure 5: NEWS -- Cartesian Communication Optimized

The Virtual Nodes logically form a grid superimposed on processors, as shown in Figure 6. The programmer writes the application from a perspective of needing data from one of four possible neighbors, in the North, East, South or West directions (hence the NEWS moniker) as shown in Figure 6. A SKYNET Middleware developer can easily configure the number of virtual nodes per processor to accommodate multiple core processors or for algorithmic convenience.

Figure 6: Virtual Nodes are Multi-core Friendly

Multiple processors are arranged to form grids of Virtual Nodes. The middleware internally characterizes transfers between Virtual Nodes as either off-chip or on-chip depending on whether the neighbor Virtual Node resides on the same processor. In the current implementation, off-chip transfers are handled via MPI while on-chip transfers are simply memory-to-memory copies. The SKYNET AOS middleware library provides the API to establish these large grids and provides communication between the Virtual Nodes. The application developer does not need to know whether the neighbor is an off-chip or on-chip neighbor; the middleware handles the details.

B. The Array OS Abstraction

The Array OS (AOS) Abstraction provides operations to the signal processing application that do operations on a rectangular connection topology across processor boundaries. It provides Barrier Synchronization primitives allowing simple multiple processor operations.

C. North-East-South-West Communications

NEWS communications may be performed when the application needs to share data between neighboring Virtual Nodes. For example, if the data being processed is spatial in nature, then a Virtual Node may need its neighbor’s data to perform seamless calculations (i.e., calculations with a seam at the Virtual Node boundaries). This data sharing is performed in multiple steps. First the data is sent east (received from the west) and then it is sent west (received from the east). The complexity of this communication is hidden from the application programmer by the Array OS portion of the middleware.

Very large arrays of data can be processed by distributing the storage across many Virtual Nodes (and their processors) and doing most of the processing locally. NEWS based math libraries execute distributed array operations that can span all processors in the array. The necessary communication for distributed processing operations is provided efficiently by the SKYNET AOS middleware without the application program having to have knowledge of the details.

D. The Operating System Abstraction

The SKYNET Middleware has an Operating System Abstraction API that isolates the application program from the details of the underlying Operating System. Functionality in this API includes task management, semaphores, mailboxes, memory management, queues and periodic alarms and timers. The Operating System Abstraction allows the underlying operating system to be changed without impacting the application. Even non-POSIX based operating systems can be accommodated without changing radar mode source code.

E. Other Abstractions

The Middleware defines all the APIs needed for radar mode development. This includes high performance math libraries, processor affinity control, stream I/O and instrumentation.

III. PORTABILITY AND PERFORMANCE STUDY

As part of an extensive processor selection trade study, Raytheon ported the SKYNET Middleware to six different hardware architectures, including large arrays of COTS DSPs, COTS Power and Intel processors.

Originally, the SKYNET radar modes and the Middleware were developed for arrays of COTS Digital Signal Processors (DSPs). It was time for a processor refresh. The trade study was designed to find candidate processor architectures for the next generation system. Two of the key discriminators for this study were (1) radar mode portability cost, since portability drives the cost of future processor technology refreshes, and (2) processor efficiency, since efficiency drives Size, Weight, and Power (SWaP).

During the study, our engineers ported two fielded radar applications onto homogeneous arrays of processors (arrays of identical processors). The study evaluated the costs of porting the radar mode to each architecture, noting what fraction of the mode software had to change to work on the processor being examined.

The portability percentages shown in Figure 7 are the fraction of radar application source lines of code (SLOC) changed to port to the new processor (definition from the High Performance Embedded Computing Software Initiative).

As shown in Figure 7, we found that for most architectures, little or no change was needed to the radar mode software. For many architectures, only the middleware had to change to accommodate a new processor. A notable exception is the Mercury Cell Processor. The very constrained limitations of that processor required significant changes to the radar mode software to allow it to run on the Cell Processor. Note (*) that the team was not able to run Radar App 2 on the TigerSharc processor array.

Raytheon engineers developed performance-monitoring tools that allowed the team to tune the middleware to achieve optimal throughput from each processor being evaluated. We found that the middleware is an important contributor to ultimate radar mode performance. A well designed middleware does not require signal processing application changes when radar modes are migrated to a new processor.

Figure 7: Radar Application Percent Change Needed in Port (chart title: Middleware Allowed Radar Modes to be Ported to Most Processor Architectures with Few Changes)
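As a rough illustration of the portability metric reported in Figure 7, the sketch below computes the percentage of original source lines that do not survive unchanged in a ported version. It is not the tooling used in the study. The `sloc` and `percent_changed` helpers, the simplistic comment filter, and the two sample snippets are all hypothetical, and counting only modified or deleted original lines is an assumption about how such a metric might be tallied.

```python
import difflib


def sloc(text: str) -> list[str]:
    """Return stripped source lines, skipping blanks and '//' comment lines.

    This is a deliberately simplistic notion of SLOC (assumption).
    """
    lines = (ln.strip() for ln in text.splitlines())
    return [ln for ln in lines if ln and not ln.startswith("//")]


def percent_changed(original: str, ported: str) -> float:
    """Percent of the original SLOC that was modified or deleted in the port."""
    before, after = sloc(original), sloc(ported)
    if not before:
        return 0.0
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * (len(before) - unchanged) / len(before)


# Hypothetical four-line radar mode fragment before and after a port.
ORIGINAL = """
int n = 4;
float x[4];
dsp_fft(x, n);
send(x);
"""
PORTED = """
int n = 4;
float x[4];
cpu_fft(x, n);
send(x);
"""

if __name__ == "__main__":
    print(f"{percent_changed(ORIGINAL, PORTED):.1f}% of SLOC changed")
```

Applied per radar application and per target architecture, a figure of this kind makes it easy to see whether a port touched only the middleware (near zero percent in the application) or reached into the radar mode code itself.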


The team also measured the efficiency of the processors and discovered that the peak throughput quoted by vendors was often unachievable in practice. Efficiency here is defined as the ratio of Total Algorithm GFLOPS to Peak GFLOPS (sometimes called marketing FLOPS).
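The efficiency ratio can be written down directly. The following is a minimal sketch, assuming the algorithm's floating-point operation count and its measured run time are already known; the function name, its parameters, and the sample numbers are hypothetical and are not measurements from the study.

```python
def efficiency(algorithm_flops: float, elapsed_s: float, peak_gflops: float) -> float:
    """Ratio of achieved algorithm GFLOPS to vendor-quoted peak GFLOPS."""
    if elapsed_s <= 0 or peak_gflops <= 0:
        raise ValueError("elapsed_s and peak_gflops must be positive")
    achieved_gflops = algorithm_flops / elapsed_s / 1e9
    return achieved_gflops / peak_gflops


if __name__ == "__main__":
    # Hypothetical: 2.5e9 useful operations completed in 1 s
    # on a processor advertised at 100 GFLOPS peak.
    eff = efficiency(algorithm_flops=2.5e9, elapsed_s=1.0, peak_gflops=100.0)
    print(f"efficiency = {eff:.1%}")
```

Counting only the operations the algorithm requires, rather than every instruction the processor retires, keeps the numerator comparable across architectures.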

The study found that the measured processor efficiency was dramatically uneven from family to family. We are reluctant to quote exact efficiency numbers because they are application dependent and will vary as technology changes. Note that, for our applications, many processors delivered well below 5 percent of the advertised marketing FLOPS. We believe that processor efficiency should be evaluated when choosing a processor; we found that some processors that were more powerful in theory had poor efficiency in practice.
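To see why efficiency matters for processor selection, and therefore for SWaP, consider a back-of-the-envelope sizing sketch. The `Candidate` class, the `processors_needed` helper, and both candidate processors are invented for illustration; none of the figures come from the study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical processor under evaluation."""
    name: str
    peak_gflops: float   # vendor-quoted peak
    efficiency: float    # measured fraction of peak, between 0 and 1

    @property
    def sustained_gflops(self) -> float:
        return self.peak_gflops * self.efficiency


def processors_needed(required_gflops: float, candidate: Candidate) -> int:
    """Smallest processor count whose sustained throughput meets the requirement."""
    if candidate.sustained_gflops <= 0:
        raise ValueError("candidate must have positive sustained throughput")
    return math.ceil(required_gflops / candidate.sustained_gflops)


if __name__ == "__main__":
    required = 20_000.0  # hypothetical 20 TFLOPS sustained requirement
    for c in (Candidate("A", 200.0, 0.04), Candidate("B", 100.0, 0.20)):
        print(c.name, processors_needed(required, c))
```

In this made-up comparison, the candidate with half the advertised peak needs fewer than half as many processors, because its sustained throughput is higher.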

IV. SUMMARY

We have described the SKYNET Middleware and its long history in high performance real-time signal processing. The SKYNET Middleware is in use in a robust community of developers producing signal processing applications. Because the interfaces in the SKYNET Middleware are U.S. Government owned, applications built using the middleware are free of vendor lock. There is a thriving third party development culture in place.

We have described how the SKYNET Middleware is based on an Open Architecture approach using Linux. We have shown how having a thin middleware layer over Linux provides a very portable solution. That portability was demonstrated when fielded radar applications were ported to many different processor architectures with little or no change.

V. REFERENCES

[1] DoD’s Perspective on Radar Open Architectures, June 2010, Scott Lucero et al.

[2] Algorithms and Theory of Computation Handbook, 1999 by CRC Press LLC.

[3] Using MPI, 2nd Edition: Portable Parallel Programming with the Message Passing Interface. Cambridge, MA, USA: MIT Press Scientific And Engineering Computation Series, 1999, By Gropp, William; Lusk, Ewing; Skjellum, Anthony

[4] Data Distribution Service for Real-time Systems, Version 1.2, 2007, Object Management Group