Copyright © 2005, SAS Institute Inc. All rights reserved. Getting the Best Performance from V9 Threaded PROC SORT Scott Mebust System Developer Base Information Technology

Post on 21-Dec-2015



Getting the Best Performance from V9 Threaded PROC SORT

Scott Mebust
System Developer
Base Information Technology


The (Unofficial) SAS Skydiving Team


Keys to Sorting Performance

Know the conditions

Observe actual performance

Understand theoretical performance

Make adjustments


Know the Conditions

System

SAS

Sort job


Observe Actual Performance

Monitor System Activity

Examine the SAS Log

Measure System Capabilities


Identify and Observe Sorting Phases

Sort Phase

Merge Phase

I/O Bound, External, Single-Threaded


Measure Storage Device Sequential Transfer Rates From Within SAS

Create a large dataset (e.g. 4 × RAM)

Read dataset, dumping to _NULL_

Ensure real time ≫ CPU time

Compute transfer rates (R)

$R_{read} = \frac{F}{t_{read}}$

$R_{write} = \frac{F}{t_{write}}$

Where

F: size of the dataset (bytes)

t: real time (seconds)
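The rate computation on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `transfer_rate` and the sample file size and timings are hypothetical, standing in for real-time figures that would be read from the SAS log of the write and read steps.

```python
def transfer_rate(file_size_bytes: float, real_time_seconds: float) -> float:
    """Return the sequential transfer rate R = F / t in bytes per second."""
    if real_time_seconds <= 0:
        raise ValueError("real time must be positive")
    return file_size_bytes / real_time_seconds


if __name__ == "__main__":
    # Hypothetical measurements: an 8 GiB dataset and made-up real times.
    F = 8 * 1024**3        # dataset size in bytes
    t_write = 160.0        # real time to create the dataset (seconds)
    t_read = 100.0         # real time to read it into _NULL_ (seconds)

    r_write = transfer_rate(F, t_write)
    r_read = transfer_rate(F, t_read)
    print(f"R_read  = {r_read / 1e6:.1f} MB/s")
    print(f"R_write = {r_write / 1e6:.1f} MB/s")
```

The rates are only meaningful when the step was I/O bound, which is why the slide asks for real time to be much larger than CPU time.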


Measure In-Core Sorting Costs

[Figure: CPU Time Per Observation. Y axis: Normalized CPU Time (seconds), 0.00E+00 to 5.00E-07. X axis: Number of Observations, 10,000 to 1E+08. Series: Actual, Predicted. Annotation: small job overhead.]


Understand Theoretical Performance

Classify the job

Estimate SORT running time

Consider estimation hazards


Classify the Job Performance Limitation

Compute Bound

I/O Bound

Mixed


Classify the Job Size

Internal: $\frac{F + O}{M} \le 1$

External: $\frac{F + O}{M} > 1$

Single-pass: $M \ge \sqrt{(F + O)\,B}$

Multi-pass: $M < \sqrt{(F + O)\,B}$

Where

F: size of input dataset

O: size of internal sorting overhead

M: size of RAM

B: utility file page (block) size
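A rough Python sketch of this classification follows. It rests on two assumptions that are not spelled out in the transcript: a job is internal when (F + O) / M is at most 1, and an external job is single-pass when the number of sorted runs, (F + O) / M, does not exceed the number of merge buffers, M / B, which is the same as (F + O) × B ≤ M². The function name `classify_job_size` is hypothetical.

```python
def classify_job_size(F: int, O: int, M: int, B: int) -> str:
    """Classify a sort job by size.

    F: size of input dataset (bytes)
    O: size of internal sorting overhead (bytes)
    M: memory available to the sort (bytes)
    B: utility file page (block) size (bytes)
    """
    if min(F, O) < 0 or min(M, B) <= 0:
        raise ValueError("sizes must be non-negative; M and B must be positive")

    # Assumed threshold: everything fits in memory at once.
    if F + O <= M:
        return "internal"

    # Assumed threshold: runs (F+O)/M <= buffers M/B, i.e. (F+O)*B <= M*M.
    if (F + O) * B <= M * M:
        return "external, single-pass"

    return "external, multi-pass"


if __name__ == "__main__":
    # Hypothetical job: 10 GB input, 1 GB overhead, 1 GB memory, 64 KB pages.
    print(classify_job_size(10 * 10**9, 10**9, 10**9, 65536))
```

Integer arithmetic is used for the single-pass test so that the comparison does not depend on floating-point rounding of a square root.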


Estimate the Running Time

Internal Sort, I/O Bound

$t = t_{read} + t_{write}$

$t_{read} = \frac{F}{R_{read}}$

$t_{write} = \frac{F}{R_{write}}$

Input → (Sequential Read) → RAM → (Sequential Write) → Output

Where

t: real time (sec)

F: dataset size (bytes)

R: transfer rate (bytes/sec)
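As a worked illustration of this estimate, the sketch below adds the sequential read and write times for an internal, I/O-bound sort. The function name and the sample numbers are hypothetical; the rates would come from measurements on the actual storage devices.

```python
def internal_io_bound_time(F: float, r_read: float, r_write: float) -> float:
    """Estimate real time (seconds) for an internal, I/O-bound sort.

    F: dataset size (bytes)
    r_read, r_write: sequential transfer rates (bytes/sec)
    """
    if r_read <= 0 or r_write <= 0:
        raise ValueError("transfer rates must be positive")
    t_read = F / r_read      # sequential read of the input
    t_write = F / r_write    # sequential write of the output
    return t_read + t_write


if __name__ == "__main__":
    # Hypothetical: 2 GB dataset, 80 MB/s read, 50 MB/s write.
    estimate = internal_io_bound_time(2e9, 80e6, 50e6)
    print(f"estimated real time: {estimate:.1f} s")
```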


Estimate the Running Time

Single-Pass External, I/O Bound

$t = t_1 + t_2 + t_3 + t_4$

Input → (Sequential Read) → RAM → (Sequential Write) → Temp → (Random Read) → RAM → (Sequential Write) → Output

$t_1 = \frac{F}{R_{read}}$

$t_2 = \frac{U}{R_{write}}$

$t_3 = \;?$

$t_4 = \frac{F}{R_{write}}$

Where

U: utility file size (bytes)


Utility File Read Time

File Size

Single-threaded: $U = F + o$

Multi-threaded: $U = F$

where

F: size of input dataset

o: # of observations × sort key length

Number of Pages

$n_{pages} = \frac{U}{B}$

where

B: utility file page (block) size

Best Case (Sequential) Read Time

$t_{read} = \frac{U}{R_{read}}$

Worst Case (Random) Read Time

$t_{read} = n_{pages}\left(s + r + \frac{B}{R_{read}}\right)$

where

s: average positional latency

r: average rotational latency
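The best-case and worst-case utility file read times bracket the unknown third term of the single-pass external, I/O-bound estimate (read input, write utility file, read utility file, write output). The following Python sketch computes both bounds and the resulting total. All function names and sample values are hypothetical, and the page count is taken as U / B without rounding.

```python
def utility_file_size(F: float, o: float, multithreaded: bool) -> float:
    """U = F for the multi-threaded sort, U = F + o for the single-threaded sort."""
    return F if multithreaded else F + o


def best_case_read_time(U: float, r_read: float) -> float:
    """Sequential read of the whole utility file."""
    return U / r_read


def worst_case_read_time(U: float, B: float, s: float, r: float, r_read: float) -> float:
    """Random read: every page pays positional and rotational latency."""
    n_pages = U / B
    return n_pages * (s + r + B / r_read)


def single_pass_external_io_time(F, U, r_read, r_write, t_utility_read):
    """t = t1 + t2 + t3 + t4 with t3 supplied by the caller."""
    return F / r_read + U / r_write + t_utility_read + F / r_write


if __name__ == "__main__":
    # Hypothetical job: 20 GB input, 64 KB pages, 4 ms positional and
    # 2 ms rotational latency, 80 MB/s read, 50 MB/s write.
    F, B = 20e9, 65536
    U = utility_file_size(F, o=0.0, multithreaded=True)
    best = best_case_read_time(U, 80e6)
    worst = worst_case_read_time(U, B, 0.004, 0.002, 80e6)
    print(f"best-case total:  {single_pass_external_io_time(F, U, 80e6, 50e6, best):.0f} s")
    print(f"worst-case total: {single_pass_external_io_time(F, U, 80e6, 50e6, worst):.0f} s")
```

A larger page size B lowers the worst case because fewer pages each pay the fixed latency cost.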


Multi-Pass External Sorting

Number of Sorted Runs

$n_{runs} = \frac{F + O}{M}$

Number of Utility File Passes

$n_{passes} = \frac{\ln(n_{runs})}{\ln(n_{buffers})}$

where

$n_{buffers} = \frac{M}{B}$ is the Maximum External Merge Order

and

F: size of input dataset

O: size of internal sorting overhead

M: SORTSIZE

B: utility file page (block) size
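These three quantities can be computed directly. The Python sketch below is a minimal illustration with hypothetical names and inputs; rounding the pass count up to a whole number is an added assumption, since a partial pass still has to read and write the data.

```python
import math


def merge_plan(F: float, O: float, M: float, B: float):
    """Return (n_runs, n_buffers, n_passes_raw, n_passes_rounded).

    F: size of input dataset, O: internal sorting overhead,
    M: SORTSIZE, B: utility file page (block) size (all in bytes).
    """
    n_runs = (F + O) / M        # number of sorted runs
    n_buffers = M / B           # maximum external merge order
    if n_runs <= 1 or n_buffers <= 1:
        raise ValueError("expects an external job with more than one merge buffer")
    n_passes_raw = math.log(n_runs) / math.log(n_buffers)
    # Assumption: round up to whole passes, with at least one pass.
    n_passes_rounded = max(1, math.ceil(n_passes_raw))
    return n_runs, n_buffers, n_passes_raw, n_passes_rounded


if __name__ == "__main__":
    # Hypothetical: 50 GB of data plus overhead, 64 MB SORTSIZE, 64 KB pages.
    runs, buffers, raw, passes = merge_plan(48e9, 2e9, 64 * 2**20, 64 * 2**10)
    print(f"runs={runs:.0f} buffers={buffers:.0f} passes={passes} (raw {raw:.2f})")
```

Raising SORTSIZE helps twice: it reduces the number of runs and increases the merge order.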


Estimate the Running Time

Single-Pass External, Compute Bound

$t = t_{sort} + t_{merge} + t_{write}$

Input → RAM → (Sequential Write) → Temp → (Random Read) → RAM → Output

$t_{sort} = \;?$

$t_{merge} = \;?$

$t_{write} = \frac{F}{R_{write}}$


Single-Pass External, Compute Bound

Utility File Creation Time

$t_{sort} = n_{runs} \cdot t_{run}$

where

$n_{runs} = \frac{F + O}{M}$

$t_{run}$ is the time required to perform an in-memory sort of the number of observations in a single run

$n_{obs/run} = \frac{n_{obs}}{n_{runs}}$

where $n_{obs}$ is the total number of observations in the dataset

Utility File Merge Time, Compute Bound

$t_{merge} \approx t_{sort}$

Utility File Merge Time, I/O Bound

As previously described for I/O bound Utility File Read Time

Best Case: $t_{merge} = \frac{U}{R_{read}}$

Worst Case: $t_{merge} = n_{pages}\left(s + r + \frac{B}{R_{read}}\right)$
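The compute-bound estimate can be assembled the same way as the I/O-bound ones. In the Python sketch below the per-run sort time and the merge time are inputs that the caller measures or estimates separately; the function names and sample values are hypothetical.

```python
def observations_per_run(n_obs: float, n_runs: float) -> float:
    """Number of observations sorted in memory for each run."""
    return n_obs / n_runs


def compute_bound_single_pass_time(F, O, M, t_run, t_merge, r_write):
    """t = t_sort + t_merge + t_write for a single-pass external sort.

    t_run:   measured time (sec) to sort one run's worth of observations in memory
    t_merge: caller-supplied merge time (sec)
    """
    n_runs = (F + O) / M        # number of sorted runs
    t_sort = n_runs * t_run     # utility file creation time
    t_write = F / r_write       # sequential write of the output
    return t_sort + t_merge + t_write


if __name__ == "__main__":
    # Hypothetical: 9 GB input, 1 GB overhead, 2.5 GB memory, 100M observations.
    n_runs = (9e9 + 1e9) / 2.5e9
    print(f"observations per run: {observations_per_run(100e6, n_runs):.0f}")
    total = compute_bound_single_pass_time(9e9, 1e9, 2.5e9, t_run=12.0,
                                           t_merge=40.0, r_write=50e6)
    print(f"estimated real time: {total:.0f} s")
```

The in-core sorting cost measured per observation is what supplies `t_run` here: multiply it by the observations per run.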


Consider Estimation Hazards

File cache effects

Pseudo-internal sorting (thrashing)

Pseudo-external sorting (file cache)

Limitations within each sorting phase


Make Adjustments

Determine if there is a problem

Identify the problem

Alter the conditions

Re-evaluate


Identify the Problem

Processing speed

Memory

External Storage


Alter the Conditions

Memory settings

Library to storage device mappings

Utility file location

Utility file page size
