
Page 1: Data Reorganization Interface

Data Reorganization Interface

Kenneth Cain, Mercury Computer Systems, Inc.

Phone: (978) 967-1645, Email: [email protected]

Abstract: This presentation will update the HPEC community on the latest status of the standard Data Reorganization Interface (DRI). DRI is a software interface for performing data-parallel distribution and reorganization operations (e.g., transpose, reshape) that are frequently required in scalable HPEC applications. DRI provides increased ease of use compared to point-to-point middleware by providing abstractions for multi-dimensional datasets, partitioning and distribution methods (e.g., block, block-cyclic, overlapped elements), and a high-level interface that frees applications from having to orchestrate the multitude of individual transfers required in a single data reorganization. The planned-transfer approach in DRI enables high-performance data transfers, and its multi-buffering semantics enable (with hardware support) time overlap of an application's communication and computation operations. DRI is designed to enhance existing standard and proprietary middleware by adding a standard, easy-to-use interface without compromising high performance. The DRI-1.0 API was ratified and published in September 2002 by the Data Reorganization Forum and announced at the HPEC 2002 workshop. This presentation covers DRI-related activities since that announcement, including current vendor implementation status, a summary of results from the first use of DRI in a realistic application demonstration (SAR image formation), and candidate features that could be added to an enhanced DRI standard. The DRI-1.0 document can be accessed on the World Wide Web at http://www.data-re.org.
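To make the multi-buffering point concrete, here is a minimal double-buffered producer/consumer pipeline in plain C with MPI (not DRI API code; the block sizes, rank roles, and buffer names are assumptions for this sketch). The consumer computes on one buffer while the next block is in flight, which is the communication/computation overlap that DRI's multi-buffering semantics expose through a single interface:

#include <mpi.h>
#include <stdlib.h>

#define NBLOCKS 8            /* data blocks in the pipeline (illustrative) */
#define BLK     1024         /* floats per block (illustrative)            */

static void compute(float *buf)       /* stand-in for real signal work */
{
    for (int i = 0; i < BLK; ++i)
        buf[i] += 1.0f;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    if (nproc < 2) { MPI_Finalize(); return 0; }   /* needs two ranks */

    float *buf[2] = { malloc(BLK * sizeof(float)),
                      malloc(BLK * sizeof(float)) };
    MPI_Request req;

    if (rank == 0) {                   /* producer: compute then send */
        for (int b = 0; b < NBLOCKS; ++b) {
            compute(buf[b & 1]);
            MPI_Send(buf[b & 1], BLK, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        }
    } else if (rank == 1) {            /* consumer: receive block b+1
                                          while computing on block b  */
        MPI_Irecv(buf[0], BLK, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &req);
        for (int b = 0; b < NBLOCKS; ++b) {
            MPI_Wait(&req, MPI_STATUS_IGNORE);     /* block b has arrived */
            if (b + 1 < NBLOCKS)
                MPI_Irecv(buf[(b + 1) & 1], BLK, MPI_FLOAT, 0, 0,
                          MPI_COMM_WORLD, &req);   /* next block in flight */
            compute(buf[b & 1]);
        }
    }
    free(buf[0]); free(buf[1]);
    MPI_Finalize();
    return 0;
}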

Page 2: Data Reorganization Interface

Report Documentation Page (Form Approved OMB No. 0704-0188)

Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number.

1. REPORT DATE: 20 AUG 2004
2. REPORT TYPE: N/A
3. DATES COVERED: -
4. TITLE AND SUBTITLE: Data Reorganization Interface
5a. CONTRACT NUMBER:
5b. GRANT NUMBER:
5c. PROGRAM ELEMENT NUMBER:
5d. PROJECT NUMBER:
5e. TASK NUMBER:
5f. WORK UNIT NUMBER:
6. AUTHOR(S):
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): Mercury Computer Systems, Inc.
8. PERFORMING ORGANIZATION REPORT NUMBER:
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES):
10. SPONSOR/MONITOR’S ACRONYM(S):
11. SPONSOR/MONITOR’S REPORT NUMBER(S):
12. DISTRIBUTION/AVAILABILITY STATEMENT: Approved for public release, distribution unlimited
13. SUPPLEMENTARY NOTES: See also ADM001694, HPEC-6-Vol 1 ESC-TR-2003-081; High Performance Embedded Computing (HPEC) Workshop (7th). The original document contains color images.
14. ABSTRACT:
15. SUBJECT TERMS:
16. SECURITY CLASSIFICATION OF: a. REPORT: unclassified; b. ABSTRACT: unclassified; c. THIS PAGE: unclassified
17. LIMITATION OF ABSTRACT: UU
18. NUMBER OF PAGES: 17
19a. NAME OF RESPONSIBLE PERSON:

Standard Form 298 (Rev. 8-98), prescribed by ANSI Std Z39-18

Page 3: Data Reorganization Interface


Data Reorganization Interface (DRI)

Kenneth Cain Jr., Mercury Computer Systems, Inc.

High Performance Embedded Computing (HPEC) Conference, September 2003

On behalf of the Data Reorganization Forum, http://www.data-re.org


Page 4: Data Reorganization Interface


Outline, Acknowledgements

Status update for the DRI-1.0 standard since its Sep. 2002 publication:

• DRI overview
• Highlights of the first DRI demonstration: Common Imagery Processor (Brian Sroka, MITRE)
• Vendor status: Mercury Computer Systems, Inc.; MPI Software Technology, Inc. (Anthony Skjellum); SKY Computers, Inc. (Stephen Paavola)


Page 5: Data Reorganization Interface


What is DRI?

Partition for data-parallel processing:
• Divide a multi-dimensional dataset across processes
• Whole, block, block-cyclic partitioning
• Overlapped data elements in partitioning
• Process group topology specification

Redistribute data to the next processing stage:
• Multi-point data transfer with a single function call
• Multi-buffered to enable communication / computation overlap
• Planned transfers for higher performance

A standard API that complements existing communication middleware (see the sketch below).
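To show what the "multitude of individual transfers" looks like when orchestrated by hand, the following sketch performs a cornerturn (distributed matrix transpose) in plain C with MPI. The pack, all-to-all exchange, and unpack steps below are what a single planned DRI reorganization call would encapsulate; the matrix size and row-block layout are illustrative assumptions, not DRI-1.0 API code:

#include <mpi.h>
#include <stdlib.h>

/* Hand-rolled cornerturn (distributed transpose) of an N x N float
 * matrix that is row-block distributed across nproc ranks. DRI
 * collapses the pack / all-to-all / unpack below into one planned
 * reorganization call. Illustrative sizes; assumes nproc divides N. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    const int N = 16;                  /* global matrix edge */
    if (N % nproc != 0) { MPI_Finalize(); return 1; }
    const int B = N / nproc;           /* rows owned per rank */

    float *mine   = malloc((size_t)B * N * sizeof *mine);   /* my rows    */
    float *packed = malloc((size_t)B * N * sizeof *packed); /* send tiles */
    float *recvd  = malloc((size_t)B * N * sizeof *recvd);  /* recv tiles */
    float *turned = malloc((size_t)B * N * sizeof *turned); /* my columns */

    for (int r = 0; r < B; ++r)        /* fill with global element ids */
        for (int c = 0; c < N; ++c)
            mine[r * N + c] = (float)((rank * B + r) * N + c);

    /* Pack: tile j = my rows restricted to column block j. */
    for (int j = 0; j < nproc; ++j)
        for (int r = 0; r < B; ++r)
            for (int c = 0; c < B; ++c)
                packed[(j * B + r) * B + c] = mine[r * N + j * B + c];

    /* The "multitude of individual transfers": one tile per rank pair. */
    MPI_Alltoall(packed, B * B, MPI_FLOAT,
                 recvd,  B * B, MPI_FLOAT, MPI_COMM_WORLD);

    /* Unpack with a local transpose; I now own column block 'rank'. */
    for (int i = 0; i < nproc; ++i)
        for (int r = 0; r < B; ++r)
            for (int c = 0; c < B; ++c)
                turned[c * N + i * B + r] = recvd[(i * B + r) * B + c];

    free(mine); free(packed); free(recvd); free(turned);
    MPI_Finalize();
    return 0;
}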


Page 6: Data Reorganization Interface

First DRI-based Demonstration

Common Imagery Processor (CIP), conducted by Brian Sroka of The MITRE Corporation.


Page 7: Data Reorganization Interface


CIP and APG-73 Background

CIP:
• The primary sensor processing element of the Common Imagery Ground/Surface System (CIGSS)
• Processes imagery data into exploitable images, output to other CIGSS elements
• A hardware-independent software architecture supporting multi-sensor processing capabilities
• Prime contractor: Northrop Grumman, Electronic Sensor Systems Sector
• Enhancements directed by the CIP Cross-Service IPT, Wright-Patterson AFB

APG-73:
• SAR component of the F/A-18 Advanced Tactical Airborne Reconnaissance System (ATARS)
• Imagery from airborne platforms is sent to the TEG via the Common Data Link

Source: “HPEC-SI Demonstration: Common Imagery Processor (CIP) Demonstration -- APG73 Image Formation”, Brian Sroka, The MITRE Corporation, HPEC 2002


Page 8: Data Reorganization Interface


APG-73 Data Reorganization (1)

[Figure: cornerturn between two block data distributions of a cross-range x range dataset across processes P0 through PN; the range extent is a multiple of 16 (%16).]

Source (block data distribution):
• No overlap
• Blocks of cross-range
• Full range

Destination (block data distribution):
• No overlap
• Full cross-range
• Mod-16 length blocks of range

Source: “CIP APG-73 Demonstration: Lessons Learned”, Brian Sroka, The MITRE Corporation, March 2003 HPEC-SI meeting
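One plausible reading of the mod-16 destination blocks, written as simple C arithmetic (the rounding rule, names, and sizes are assumptions for illustration, not the normative DRI-1.0 partitioning specification):

#include <stdio.h>

/* Extent [lo, hi) owned by process p when n range cells are split into
 * contiguous blocks whose length is rounded up to a multiple of `grain`
 * (16 here), with the tail absorbed by the last non-empty block.
 * Illustrative arithmetic only, not the normative DRI-1.0 rule. */
static void block_extent(int n, int np, int p, int grain, int *lo, int *hi)
{
    int per = ((n / np + grain - 1) / grain) * grain;  /* rounded block */
    *lo = (p * per > n) ? n : p * per;
    *hi = (*lo + per > n) ? n : *lo + per;
}

int main(void)
{
    int lo, hi;
    for (int p = 0; p < 4; ++p) {   /* 100 range cells, 4 processes */
        block_extent(100, 4, p, 16, &lo, &hi);
        printf("P%d owns range cells [%d, %d)\n", p, lo, hi);
    }
    return 0;
}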


Page 9: Data Reorganization Interface


APG-73 Data Reorganization (2)

[Figure: overlap exchange of a cross-range x range dataset across processes P0 through PN, from a block data distribution to a block-with-overlap distribution; the range extent is a multiple of 16 (%16).]

Source (block data distribution):
• No overlap
• Full cross-range
• Mod-16 length blocks of range cells

Destination (block with overlap):
• 7 points right overlap
• Full cross-range
• Blocked portion of range cells

Source: “CIP APG-73 Demonstration: Lessons Learned”, Brian Sroka, The MITRE Corporation, March 2003 HPEC-SI meeting
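The overlap case changes only the destination arithmetic: each process reads its own block plus a few neighboring points. A sketch under the same illustrative assumptions (7-point right overlap as on the slide; even division assumed for brevity):

#include <stdio.h>

/* Destination extent for process p: its own block of n range cells plus
 * `right` extra cells of right overlap (7 in the APG-73 case), clamped
 * at the global edge. Assumes np divides n for brevity; illustrative. */
static void overlap_extent(int n, int np, int p, int right, int *lo, int *hi)
{
    int base = n / np;
    *lo = p * base;
    *hi = *lo + base + right;
    if (*hi > n)
        *hi = n;        /* the last block has no right neighbor */
}

int main(void)
{
    int lo, hi;
    for (int p = 0; p < 4; ++p) {   /* 128 range cells, 4 processes */
        overlap_extent(128, 4, p, 7, &lo, &hi);
        printf("P%d reads range cells [%d, %d)\n", p, lo, hi);
    }
    return 0;
}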


Page 10: Data Reorganization Interface


DRI Use in CIP APG-73 SAR

Simple transition to DRI: a #pragma previously split the loop over global data among threads; with DRI, the same loop runs over local data (see the sketch below).
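A rough sketch of that transition (process_row, the image array, and the local-partition variables are hypothetical stand-ins; a real port would obtain the local extent from the DRI distribution rather than hard-coding it):

#include <stdio.h>

#define GLOBAL_ROWS 8
#define COLS        4

static void process_row(float *row)    /* stand-in for per-row kernel */
{
    for (int c = 0; c < COLS; ++c)
        row[c] *= 2.0f;
}

int main(void)
{
    static float image[GLOBAL_ROWS][COLS];   /* whole dataset */

    /* Shared-memory form: one pragma splits the global loop. */
    #pragma omp parallel for
    for (int r = 0; r < GLOBAL_ROWS; ++r)
        process_row(image[r]);

    /* DRI-style form: the middleware hands each process its partition,
     * so the loop body is unchanged but runs over local data only.
     * local_rows / local are hypothetical stand-ins for what a DRI
     * blockinfo query would supply (here: the first half of the rows). */
    int local_rows = GLOBAL_ROWS / 2;
    float (*local)[COLS] = &image[0];
    for (int r = 0; r < local_rows; ++r)
        process_row(local[r]);

    return 0;
}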

[Figure: APG-73 image-formation pipeline and the DRI implementations used. Pipeline: range compression / inverse weighting; DRI-1 (cornerturn); azimuth compression / inverse weighting; DRI-2 (overlap exchange); side-lobe clutter removal, amplitude detection, image data compression. Implementations: MITRE DRI over MPI (demonstration completed; both MPI/Pro from MSTI and MPICH demonstrated); Mercury PAS/DRI and SKY MPICH/DRI (demonstrations underway).]

Source: “HPEC-SI Demonstration: Common Imagery Processor (CIP) Demonstration -- APG73 Image Formation”, Brian Sroka, The MITRE Corporation, HPEC 2002 workshop


Page 11: Data Reorganization Interface


Portability: SLOC Comparison

[Bar chart: source lines of code (SLOCs), split into Main, Support, Kernel, and Total, for four variants: Sequential, VSIPL, Shared-Memory VSIPL, and DRI VSIPL; y-axis 0 to 35,000 SLOCs. Annotations note a ~1% increase (shared memory) and a ~5% increase (DRI).]

The ~5% SLOC increase for DRI includes code for:
• 2 scatter / gather reorgs
• 3 cornerturn data reorg cases (1 for interleaved complex + 2 for split complex data format)
• 3 overlap exchange data reorg cases
• managing interoperation between the DRI and VSIPL libraries

Using DRI requires much less source code than a manual distributed-memory implementation.

Source: “HPEC-SI Demonstration: Common Imagery Processor (CIP) Demonstration -- APG73 Image Formation”, Brian Sroka, The MITRE Corporation, HPEC 2002 workshop


Page 12: Data Reorganization Interface


CIP APG-73 DRI Conclusions

Sources: “HPEC-SI Demonstration: Common Imagery Processor (CIP) Demonstration -- APG73 Image Formation”, Brian Sroka, The MITRE Corporation, HPEC 2002; “High Performance Embedded Computing Software Initiative (HPEC-SI)”, Dr. Jeremy Kepner, MIT Lincoln Laboratory, June 2003 HPEC-SI meeting

Applying DRI to operational software does not greatly affect software lines of code

DRI greatly reduces the complexity of developing portable distributed-memory software (the shared-memory transition is easy)

Communication code written with DRI is an estimated 6x fewer SLOCs than if implemented manually with MPI

No code changed to retarget application (MITRE DRI on MPI)

Future needs (features missing from DRI):
• Split complex
• Dynamic (changing) distributions
• Round-robin distributions
• Piecemeal data production / consumption
• Non-CPU endpoints


Page 13: Data Reorganization Interface

Vendor DRI Status

Mercury Computer Systems, Inc.
MPI Software Technology, Inc.
SKY Computers, Inc.


Page 14: Data Reorganization Interface


Mercury Computer Systems (1/2)

Commercially available in PAS-4.0.0 (Jul-03):
• Parallel Acceleration System (PAS) middleware product
• DRI interface to existing PAS features
• The vast majority of DRI-1.0 is supported
• Not yet supported: block-cyclic, toroidal, some replication

Additional PAS features are compatible with DRI:
• Optional: applications can use the PAS and DRI APIs together

Applications can use MPI & PAS/DRI:
• Example: independent use of the PAS/DRI and MPI libraries by the same application is possible (the libraries are not integrated)


Page 15: Data Reorganization Interface


Mercury Computer Systems (2/2)

DRI adds no significant overhead; DRI achieves PAS performance!

PAS communication features:
• User-driven buffering & synchronization
• Dynamically changing transfer attributes
• Dynamic process sets
• I/O or memory device integration
• Transfer only a Region of Interest (ROI)

[Diagram: hybrid use of the PAS and DRI APIs; standard DRI_Distribution and DRI_Blockinfo objects built on existing PAS performance.]

[Chart: 1K x 1K complex matrix transpose, standard DRI overhead relative to PAS; y-axis 0.00% to 0.10% overhead, x-axis 2, 4, 8, 12, 16 processors.]


Page 16: Data Reorganization Interface


MPI Software Technology, Inc.

• MPI Software Technology released its ChaMPIon/Pro (MPI-2.1 product) this spring
• Work is now under way to provide DRI “in MPI clothing” as an add-on to ChaMPIon/Pro
• Confirmed targets:
  • Linux clusters with TCP/IP, Myrinet, InfiniBand
  • Mercury RACE/RapidIO multicomputers
• Access to early adopters: 1Q04
• More info available from: [email protected] (Tony Skjellum)


Page 17: Data Reorganization Interface


SKY Computers, Inc. (1/2)

Initial Implementation:
• Experimental version implemented for SKYchannel
• Integrated with MPI
• Achieving excellent performance for system sizes at least through 128 processors


Page 18: Data Reorganization Interface


Sky Computers, Inc. (2/2)

SKY’s Plans:
• Fully supported implementation with SMARTpac
• Part of SKY’s plans for standards compliance
• Included with the MPI library
• Optimized InfiniBand performance
