Working with Administrative Databases: Tips and Tricks

Canadian Institute for Health Information

Emerging Issues Team

Simon Tavasoli


Administrative Databases

> Administrative databases are often used to synthesize information about the health care system or to investigate health research questions

> The data may be derived from population registries, vital statistics or other records of life events, or from health claims and services data

> The Canadian Institute for Health Information (CIHI) collects and receives essential data and prepares analyses on Canada’s health system and the health of Canadians

> Currently CIHI holds more than 27 databases with millions of records (e.g. the National Ambulatory Care Registry receives millions of records each year)


Working with Administrative Databases: General Tips and Tricks

> Each day, hundreds of employees conduct analyses using SAS

> Given the workload on the CIHI server, using resources wisely is important

> Efficiency can be measured in many ways:

– Real time
– CPU time
– Memory
– Input/Output
– Original programmer time
– Maintenance programmer time

> There is always a trade-off among these measures


System Options for Measuring Performance

> Options STIMER; (default)

NOTE: DATA statement used:
      real time                      1.16 seconds
      cpu time                       0.09 seconds

> Options FULLSTIMER;

NOTE: The SAS System used:
      real time                      0.14 seconds
      user cpu time                  0.01 seconds
      system cpu time                0.05 seconds
      Memory                         1452k
      Page Faults                    1
      Page Reclaims                  2349
      Page Swaps                     0
      Voluntary Context Switches     53
      Involuntary Context Switches   5
      Block Input Operations         1
      Block Output Operations        0
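
A minimal sketch of how the detailed statistics above can be switched on for a single run. The data set WORK.DEMO and its contents are hypothetical, and the figures SAS writes to the log will differ by system.

```sas
/* Turn on detailed resource reporting for every subsequent step */
options fullstimer;

/* Hypothetical workload: 100,000 rows of simple arithmetic */
data work.demo;
    do i = 1 to 100000;
        x = sqrt(i);
        output;
    end;
run;

/* Return to the shorter default timing note */
options nofullstimer stimer;
```

Comparing the log notes for two versions of the same step is a simple way to see which of the efficiency measures a change actually improves.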


Optimizing performance

* Optimize performance by reducing CPU time

- Check the program using DATA _NULL_ or the OBS= option
- Use WHERE vs. IF
- Use DROP and KEEP statements
- Issues with merging data
- Avoid unnecessary DATA steps or sorting
- Manipulation of data with IF/THEN/ELSE statements
- Dealing with resource-intensive calculations

* Keep the libraries clean

* Reduce the size of the tables using COMPRESS=YES


When checking your programs, use a null data set or limit the number of observations
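
Both checks can be sketched as follows, assuming a hypothetical table WORK.VISITS that stands in for a large administrative file. The variable names are illustrative only.

```sas
/* Hypothetical sample data standing in for a large table */
data work.visits;
    input patient_id age province $;
    datalines;
1 34 ON
2 71 BC
3 58 ON
4 12 QC
;
run;

/* 1. DATA _NULL_ runs the logic but writes no output data set */
data _null_;
    set work.visits;
    length age_group $ 8;
    if age >= 65 then age_group = 'Senior';
    else age_group = 'Other';
    put patient_id= age_group=;   /* inspect the results in the log */
run;

/* 2. The OBS= data set option reads only the first n observations */
data work.test;
    set work.visits (obs=2);
run;
```

The OBS= system option (for example `options obs=100;`) applies the same limit to every step, and should be reset with `options obs=max;` before the full run.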


Subsetting Datasets: WHERE vs. IF statements
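
The difference can be illustrated with a short sketch. WORK.VISITS and its variables are hypothetical. A WHERE statement selects observations before they are brought into the program data vector, whereas a subsetting IF reads every observation and then tests it.

```sas
/* Hypothetical sample data */
data work.visits;
    input patient_id age province $;
    datalines;
1 34 ON
2 71 BC
3 58 ON
4 12 QC
;
run;

/* WHERE: rows are filtered before they enter the program data vector */
data work.on_where;
    set work.visits;
    where province = 'ON';
run;

/* Subsetting IF: every row is read first, then tested */
data work.on_if;
    set work.visits;
    if province = 'ON';
run;
```

Both steps keep the same rows here. WHERE can only reference variables that already exist in the input data set, so a condition on a variable computed inside the step still needs IF.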


Process only the variables that you need


Need only two variables

Social Sciences computing cooperative
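
A minimal sketch of reading only two variables, assuming a hypothetical five-variable table WORK.WIDE. The KEEP= data set option on the SET statement stops the other variables from being read at all.

```sas
/* Hypothetical wide table */
data work.wide;
    input patient_id age province $ los cost;
    datalines;
1 34 ON 2 1200
2 71 BC 9 8400
3 58 ON 4 3100
;
run;

/* Only patient_id and cost are brought into the step */
data work.narrow;
    set work.wide (keep=patient_id cost);
run;
```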


Subsetting datasets


Subsetting datasets: KEEP Statement
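
As an illustration, the KEEP statement controls which variables are written to the output data set, while the KEEP= option on the input controls which are read. The table WORK.STAYS and its variables are hypothetical.

```sas
/* Hypothetical input table */
data work.stays;
    input patient_id age los cost;
    datalines;
1 34 2 1200
2 71 9 8400
3 58 4 3100
;
run;

data work.summary;
    set work.stays (keep=patient_id los cost);   /* limit what is read    */
    cost_per_day = cost / los;
    keep patient_id cost_per_day;                /* limit what is written */
run;
```

A DROP statement is the mirror image and is usually shorter when most variables are wanted and only a few should be excluded.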


Some other Shortcuts


Merging data
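
A minimal sketch of a match-merge, assuming two hypothetical tables that share the key patient_id. Both inputs must be sorted by the BY variable, and the IN= flags show which table contributed each observation.

```sas
/* Hypothetical tables sharing the key patient_id */
data work.patients;
    input patient_id age;
    datalines;
1 34
2 71
3 58
;
run;

data work.claims;
    input patient_id cost;
    datalines;
1 120
1 80
3 300
4 50
;
run;

/* A match-merge requires both inputs sorted by the BY variable */
proc sort data=work.patients; by patient_id; run;
proc sort data=work.claims;   by patient_id; run;

data work.matched;
    merge work.patients (in=in_p)
          work.claims   (in=in_c);
    by patient_id;
    if in_p and in_c;   /* keep only IDs present in both tables */
run;
```

Without the IN= test, unmatched IDs from either table are kept with missing values for the other table's variables, which is a common source of unexpected rows.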


When only one condition can be true for a given observation, write a series of IF-THEN/ELSE statements.

Social Sciences computing cooperative
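
A short sketch of the pattern, using a hypothetical table WORK.PEOPLE. With ELSE, SAS stops testing as soon as one condition is true, whereas a series of independent IF statements evaluates every condition for every observation.

```sas
/* Hypothetical sample data */
data work.people;
    input patient_id age;
    datalines;
1 34
2 71
3 .
4 12
;
run;

data work.grouped;
    set work.people;
    length age_group $ 8;
    /* Missing sorts below every number, so test for it first */
    if missing(age) then age_group = 'Unknown';
    else if age < 18 then age_group = 'Child';
    else if age < 65 then age_group = 'Adult';
    else age_group = 'Senior';
run;
```

Placing the most frequent condition first reduces the average number of comparisons per observation, provided the conditions are mutually exclusive.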


Perform resource-intensive calculations and comparisons only once

Social Sciences computing cooperative
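
One way to read this tip is to store an intermediate result in a variable and reuse it, rather than repeating the expression in every condition. The table WORK.STAYS and the thresholds are hypothetical.

```sas
/* Hypothetical sample data */
data work.stays;
    input patient_id admit_day discharge_day;
    datalines;
1 10 12
2 5 48
3 20 20
;
run;

data work.flags;
    set work.stays;
    length stay_type $ 5;
    los = discharge_day - admit_day;   /* computed once, reused below */
    if los > 30 then stay_type = 'Long';
    else if los <= 1 then stay_type = 'Short';
    else stay_type = 'Usual';
run;
```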


Assign many values in one statement

Social Sciences computing cooperative
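
The slide does not say which statement is meant, so the following is only one possible reading: a single CALL MISSING routine call with a variable list replaces a separate assignment for each variable. The table WORK.SURVEY and the rule for resetting respondent 2 are hypothetical.

```sas
/* Hypothetical survey responses */
data work.survey;
    input id q1-q5;
    datalines;
1 3 4 2 2 5
2 1 5 1 4 3
;
run;

data work.reset;
    set work.survey;
    /* One call replaces q1 = .; q2 = .; q3 = .; q4 = .; q5 = .; */
    if id = 2 then call missing(of q1-q5);
run;
```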


Dealing with Missing Values

Put missing values last in expressions.

Check for missing values before using a variable in multiple statements.

Social Sciences computing cooperative
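
Both points can be sketched together, assuming a hypothetical table WORK.LABS in which optional_fee and weight are sometimes missing.

```sas
/* Hypothetical sample data; a period marks a missing value */
data work.labs;
    input id weight height_m base_cost optional_fee;
    datalines;
1 70 1.75 100 .
2 .  1.60 250 20
3 82 1.80 90  .
;
run;

data work.calc;
    set work.labs;
    /* optional_fee is the operand most often missing, so it goes last.
       The result is still missing whenever any operand is missing. */
    total_cost = base_cost + optional_fee;
    /* Test weight once, then use it in several statements */
    if not missing(weight) then do;
        bmi       = weight / (height_m ** 2);
        weight_lb = weight * 2.20462;
    end;
run;
```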


Avoid unnecessary sorting
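
One common case, sketched with a hypothetical table WORK.VISITS: grouped statistics requested with a CLASS statement do not need sorted input, so the PROC SORT that a BY statement would require can be dropped.

```sas
/* Hypothetical sample data, deliberately unsorted by province */
data work.visits;
    input patient_id age province $;
    datalines;
1 34 ON
2 71 BC
3 58 ON
4 12 QC
;
run;

/* CLASS groups the output without a prior PROC SORT */
proc means data=work.visits n mean;
    class province;
    var age;
run;
```

When a sort cannot be avoided, sorting once and reusing the sorted table in later steps avoids repeating the cost.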


If several different subsets are needed, avoid rereading the data for each subset
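
A minimal sketch, assuming a hypothetical table WORK.VISITS: one DATA step names several output data sets and routes each observation with an OUTPUT statement, so the input is read only once.

```sas
/* Hypothetical sample data */
data work.visits;
    input patient_id age province $;
    datalines;
1 34 ON
2 71 BC
3 58 ON
4 12 QC
;
run;

/* One pass over the input creates all three subsets */
data work.ont work.bc work.other;
    set work.visits;
    if province = 'ON' then output work.ont;
    else if province = 'BC' then output work.bc;
    else output work.other;
run;
```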


Keep your SAS environment clean
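
As an illustration, intermediate tables can be removed with PROC DATASETS once they are no longer needed. The names TEMP1 and TEMP2 are hypothetical.

```sas
/* Hypothetical intermediate tables */
data work.temp1 work.temp2;
    x = 1;
run;

/* Delete specific tables from the WORK library */
proc datasets library=work nolist;
    delete temp1 temp2;
quit;
```

The KILL option on the PROC DATASETS statement deletes every member of the library instead, so it should be used only when nothing in that library is still needed.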


COMPRESS=
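
A minimal sketch of the COMPRESS= data set option, using a hypothetical table WORK.BIG whose long character variable is mostly padding. When a compressed data set is created, SAS writes a note to the log reporting the change in size.

```sas
/* Compress one table: long, mostly blank character values shrink well */
data work.big (compress=yes);
    length note $ 200;
    do i = 1 to 10000;
        note = 'short text';
        output;
    end;
run;

/* Or make compression the default for data sets created afterwards */
options compress=yes;
```

Compression trades disk space and I/O for extra CPU time on every read and write, and it can make small or narrow tables larger, so the log note is worth checking.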
