IOWA STATE UNIVERSITY
Department of Animal Science

Working with Your Data (Chapter 2 in the Little SAS Book)

Animal Science 500
Lecture No. 4

September 9, 2010

Working with Your Data

To this point we have identified:

1. Many forms in which data can be stored and ultimately imported into SAS
   1. Spreadsheets: Excel, Lotus, Quattro Pro, etc.
   2. Databases: Access, SQL, others
   3. Text files: from Word, WordPad, Notepad, others
   4. Other file formats
2. Many ways to import our data into SAS
   1. Import Wizard
   2. INFILE statement
   3. Others

There are also many options to use when importing the data, formatting the input data, etc.

Modifying your Data

- DATA step
  - reads and modifies data
  - creates a new dataset
  - performs actions on rows
- PROC step
  - uses an existing dataset
  - produces output/results
  - performs actions on columns
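The split between the two kinds of step can be sketched as follows. This is a minimal illustration, not part of the original lecture: the dataset name pigs, the variable names, and the three data rows are all made up.

```sas
/* DATA step: builds a new dataset and works one row at a time. */
/* Dataset name, variable names, and values are hypothetical.   */
DATA pigs;
    INPUT id $ startwt endwt;
    gain = endwt - startwt;   /* computed separately for each row */
    DATALINES;
A1 50 250
A2 55 262
A3 48 241
;
RUN;

/* PROC step: uses the existing dataset and summarizes its columns. */
PROC MEANS DATA=pigs;
    VAR startwt endwt gain;
RUN;
```

The DATA step produces one value of gain per pig, while PROC MEANS reports statistics for each of the listed variables across all pigs.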

Modifying your Data

- Creating and redefining variables is straightforward in a SAS DATA step
  - variable = expression;
- Examples
  - Newvariable = constant;
  - Newvariable = oldvariable * constant;
  - Adjusted backfat, growth rate, loin muscle area = predetermined equation
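As a hedged illustration of the assignment forms above, the sketch below creates three new variables. The dataset name growth, the variable names, the target weight of 250, and the data values are assumptions chosen only for the example; 0.4536 is the approximate pounds-to-kilograms factor.

```sas
/* All names and values below are hypothetical. */
DATA growth;
    INPUT id $ beginwt finalwt days;
    targetwt   = 250;                        /* Newvariable = constant               */
    finalwt_kg = finalwt * 0.4536;           /* Newvariable = oldvariable * constant */
    adg        = (finalwt - beginwt) / days; /* Newvariable = equation               */
    DATALINES;
A1 50 250 120
A2 55 262 118
;
RUN;

PROC PRINT DATA=growth;
RUN;
```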

Arithmetic Operators

Symbol  Operation       Example   Result
+       addition        5 + 3     add two numbers together
-       subtraction     5 - 3     subtract 3 from 5
*       multiplication  2*y       multiply 2 by the value of Y
/       division        var/5     divide the value of VAR by 5
**      exponentiation  a**2      raise A to the second power

Notes:
- Subtraction and division also work with two variables, for example endingwt - beginningwt or weightgain / daysontest.
- The * is always required for multiplication; you cannot write 2(y) or 2y.
- Use ** for exponentiation in the DATA step. The ^ symbol is one of the NOT symbols in SAS (as in ^=), so a^2 should not be relied on to raise A to a power.

Comparison Operators

- Comparison operators set up a comparison, operation, or calculation with two variables, constants, or expressions within the dataset being used.
  - If the comparison is true, the result is 1.
  - If the comparison is false, the result is 0.
- Comparison operators can be expressed as symbols or with their mnemonic equivalents, which are shown in the following table:

Symbol  Mnemonic equivalent  Definition                Example
=       EQ                   equal to                  a=3
^=      NE                   not equal to              a ne 3
¬=      NE                   not equal to
~=      NE                   not equal to
>       GT                   greater than              num>5
<       LT                   less than                 num<8
>=      GE                   greater than or equal to  sales>=300
<=      LE                   less than or equal to     sales<=100
        IN                   equal to one of a list    num in (3, 4, 5)

Logical (Boolean) Operators and Expressions

Symbol  Mnemonic equivalent  Example
&       AND                  (a>b & c>d)
|       OR                   (a>b or c>d)
!       OR
¦       OR
¬       NOT                  not(a>b)
^       NOT
~       NOT

Logical operators, also called Boolean operators, are usually used in expressions to link sequences of comparisons.
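Because a comparison evaluates to 1 (true) or 0 (false), its result can be stored directly in a variable, and logical operators can combine several comparisons. The sketch below is only an illustration: the dataset name flags, the new variable names, and the data rows are made up, while num and sales echo the examples in the tables.

```sas
/* Dataset name, flag names, and data values are hypothetical. */
DATA flags;
    INPUT num sales;
    big     = (num > 5);                      /* 1 if true, 0 if false     */
    in_list = (num IN (3, 4, 5));             /* 1 if num is 3, 4, or 5    */
    both    = (num > 5 AND sales >= 300);     /* both comparisons true     */
    either  = (num > 5 OR sales <= 100);      /* at least one is true      */
    neither = NOT (num > 5 OR sales <= 100);  /* reverses the value above  */
    DATALINES;
4 350
6 350
7 90
2 200
;
RUN;

PROC PRINT DATA=flags;
RUN;
```

Printing the dataset shows a column of 0s and 1s for each expression, which is a quick way to check a condition before using it in an IF statement.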

Order of calculations

- Calculations follow the standard mathematical rules of precedence.
- Use parentheses to override that order.
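A small sketch of how parentheses change the result. The variable names a through d are arbitrary, and DATA _NULL_ is used only so that no dataset is created; the values are written to the SAS log.

```sas
DATA _NULL_;
    a = 2 + 3 * 4;      /* multiplication is done first: 14 */
    b = (2 + 3) * 4;    /* parentheses are done first: 20   */
    c = 10 - 4 / 2;     /* division is done first: 8        */
    d = (10 - 4) / 2;   /* parentheses are done first: 3    */
    PUT a= b= c= d=;    /* writes a=14 b=20 c=8 d=3 to the log */
RUN;
```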

Modifying your Data

- The DROP or KEEP statements
  - Used to decrease the number of variables
  - Usually not a concern with the datasets normally encountered
  - Remember that the variables are dropped or retained (kept) within the SAS dataset unless you specify otherwise

Modifying your Data

    DATA new2;
        SET new;
        /* average daily gain = weight gained / days on test */
        ADG = (Finalwt - Beginningwt) / DaysOnTest;
        DROP Beginningwt DaysOnTest;
    RUN;

    PROC MEANS DATA=new2;
    RUN;

Using IF-THEN Statements

- Use IF-THEN when you want to apply a statement to some of your observations but not all.
  - For example, adjustment factors for backfat, loin muscle area, and growth rate that differ by sex of animal
- Called condition-action statements
- IF condition THEN action;

Using IF-THEN Statements

- Example 1
  - If job='banker' then highsal=1;
- IF condition AND condition THEN action;
- Example 2
  - If job='banker' and age>65 then ret_banker=1;
  - If job eq 'banker' and age ge 65 then ret_banker=1;  (mnemonic form; note that ge also includes age 65)

Using the IN Operator

- The IN operator makes comparisons inside an IF-THEN statement and gives a bit more flexibility, because a single condition can test a whole list of values
- IF Model IN ('Corvette', 'Camaro') THEN Make = 'Chevrolet';
  - Assumes you have a column or variable titled Model
  - Creates a new variable or column titled Make, set to 'Chevrolet' for the matching observations
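A runnable sketch of the IN operator, under a few assumptions: the dataset name cars, the LENGTH values, the extra model 'Focus', and the data rows are made up for illustration. The second IF shows the longer form that IN replaces.

```sas
/* Dataset name, lengths, and data rows are hypothetical. */
DATA cars;
    LENGTH Model $ 10 Make $ 10;   /* set lengths so values are not truncated */
    INPUT Model $;
    IF Model IN ('Corvette', 'Camaro') THEN Make = 'Chevrolet';
    /* The same kind of test without IN needs one comparison per value */
    IF Model = 'Mustang' OR Model = 'Focus' THEN Make = 'Ford';
    DATALINES;
Corvette
Camaro
Mustang
Focus
;
RUN;

PROC PRINT DATA=cars;
RUN;
```

Character comparisons are case sensitive, so 'corvette' would not match 'Corvette'.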

Using the IN Operator

- Example using animal data
- IF sex IN ('gilt', 'barrow') THEN adjustedBF = BF + ((actualwt - 250) * (actualbf / (actualwt - constant1)));
- IF sex IN ('boar') THEN adjustedBF = BF + ((actualwt - 250) * (actualbf / (actualwt - constant2)));

Using IF-THEN Statements

- A single IF-THEN statement can have only one action
- Using the keywords DO and END, it is possible to execute more than one action
- Example: IF condition THEN DO; action; action; END;

Using IF-THEN Statements

- Example with DO and END:
    IF Model = 'Mustang' THEN DO;
        Make = 'Ford';
        Size = 'Compact';
    END;

Using IF-THEN Statements

- The AND and OR keywords can be used to specify multiple conditions
  - IF condition AND condition THEN action;
- Examples
  - IF Model = 'Mustang' AND Year < 1975 THEN Status = 'Classic';
    Both conditions must be met to reach the 'Classic' status.
  - IF Model = 'Mustang' OR Year < 1975 THEN Status = 'Classic';
    Only one of the conditions must be met to reach the 'Classic' status.

Using IF-THEN/ELSE Statements

- The IF-THEN/ELSE statement is typically used to group observations
- Basic form of the statement
    IF condition THEN action;
    ELSE IF condition THEN action;
    ELSE IF condition THEN action;

Advantages compared with a series of regular IF-THEN statements:
1. Computationally more efficient: it uses less computer time because once a condition is satisfied SAS skips the rest of the steps.
2. The ELSE branches are mutually exclusive, which prevents an observation from ending up in more than one group.

The DO-END statement

The DO-END statement is useful if you want to make several changes or create new variables for a subgroup or under certain conditions.

Note that the DO group continues until you end it using END;

Example:
    IF sex = 'female' THEN DO;
        AdjustedBF = equation;
        AdjustedLMA = equation;
        AdjustedDAYS = equation;
    END;

The DO-END statement

The example continues with an ELSE branch that has its own DO group:

    ELSE IF sex EQ 'male' THEN DO;
        ...
        ...
    END;

Using IF-THEN/ELSE Statements

- Example:
    IF CowBC = . THEN AdjustedCowBC = .;
    ELSE IF CowBC GE 9 THEN AdjustedCowBC = 5;
    ELSE IF CowBC GE 7 AND CowBC LT 9 THEN AdjustedCowBC = 4;
    ELSE IF CowBC GE 5 AND CowBC LT 7 THEN AdjustedCowBC = 3;
    ELSE IF CowBC GE 3 AND CowBC LT 5 THEN AdjustedCowBC = 2;
    ELSE IF CowBC GE 1 AND CowBC LT 3 THEN AdjustedCowBC = 1;
    ELSE IF condition THEN action;

  Note that each comparison needs its own variable name: CowBC GE 7 AND CowBC LT 9, not CowBC GE 7 AND LT 9.

Using IF-THEN/ELSE Statements

- When a condition is true, SAS assigns the stated value to AdjustedCowBC and then skips the remaining ELSE statements. The last ELSE is a trash bin: anything that is not covered by the previous conditions is set to missing.
- Look at what gets assigned by the last ELSE statement, as it may identify an error in the dataset or other problems.
- Make sure the conditions include all possibilities; otherwise you might get missing values or hidden errors.
- Examine the log: it reveals the number of missing values created, but it gives no indication of observations that were not covered by your programming!
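One way to make uncovered observations visible is to have the final ELSE write them to the log. The sketch below is an assumption-laden illustration rather than the lecture's own code: the dataset name cows, the id variable, the data rows, and the PUT message are made up. Because each ELSE IF is reached only when the earlier conditions failed, the upper bounds do not need to be repeated.

```sas
/* Dataset name, id variable, and data values are hypothetical. */
DATA cows;
    INPUT id CowBC;
    IF CowBC = . THEN AdjustedCowBC = .;
    ELSE IF CowBC GE 9 THEN AdjustedCowBC = 5;
    ELSE IF CowBC GE 7 THEN AdjustedCowBC = 4;   /* reached only when CowBC < 9 */
    ELSE IF CowBC GE 5 THEN AdjustedCowBC = 3;
    ELSE IF CowBC GE 3 THEN AdjustedCowBC = 2;
    ELSE IF CowBC GE 1 THEN AdjustedCowBC = 1;
    ELSE DO;                                     /* catch-all: nothing above matched  */
        AdjustedCowBC = .;
        PUT 'Uncovered value: ' id= CowBC=;      /* writes the problem row to the log */
    END;
    DATALINES;
1 9
2 6.5
3 .
4 0
5 2
;
RUN;
```

In this made-up data, cow 4 has a score of 0, which no condition covers, so it falls into the final ELSE and is reported in the log. Cow 3 is genuinely missing and is handled by the first IF without a message.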

Subsetting your data

- Sometimes researchers or programmers want to look at only a portion of the data that is collected.
- This can be accomplished using the IF statement
- Example: only interested in gilts in a dataset that includes data from boars and barrows.
  - IF sex = 'barrow' THEN DELETE;
  - IF sex = 'boar' THEN DELETE;

Subsetting your data

- Another way to use the IF statement
- Rather than making it a deletion statement, make it an inclusionary statement
- Example: only interested in gilts in a dataset that includes data from boars and barrows.
  - IF sex = 'gilt';
    This results in the program looking only at data where sex = 'gilt'

Subsetting your data

What is the difference, when looking at only gilts, between these statements?

    IF sex = 'gilt';

    IF sex = 'barrow' THEN DELETE;
    IF sex = 'boar' THEN DELETE;

- With the DELETE statements, the dataset would include anything coded incorrectly, that is, any value other than 'barrow' or 'boar'.
- Using the inclusionary statement (IF sex = 'gilt';) requires that every line of data be examined, and only observations coded exactly 'gilt' are kept.
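The difference between the two approaches shows up as soon as one value is miscoded. In this sketch the dataset names, the id variable, and the data rows are invented, and observation 4 is deliberately misspelled as 'gitl'.

```sas
/* Dataset names, id variable, and data rows are hypothetical. */
DATA pigs;
    LENGTH sex $ 8;
    INPUT id sex $;
    DATALINES;
1 gilt
2 barrow
3 boar
4 gitl
5 gilt
;
RUN;

/* Deletion approach: removes only the values that are named. */
DATA by_delete;
    SET pigs;
    IF sex = 'barrow' THEN DELETE;
    IF sex = 'boar' THEN DELETE;
RUN;   /* keeps ids 1, 4, and 5: the miscoded 'gitl' slips through */

/* Inclusionary approach: keeps only exact matches. */
DATA by_include;
    SET pigs;
    IF sex = 'gilt';
RUN;   /* keeps ids 1 and 5 only */
```

Comparing the observation counts of the two datasets in the log is a simple check for miscoded values.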

The RETAIN Statement

- Use the RETAIN statement when you want to keep the value of a variable from the previous iteration of the DATA step.
  - The statement RETAIN variable-list; can appear anywhere in the DATA step
  - You can specify an initial value instead of missing for the variables as follows:
    RETAIN variable-list initial-value;

The Sum Statement

- The sum statement is used when you want a cumulative total for some variable. Example used in the book:

    RETAIN MaxRuns;
    MaxRuns = MAX(MaxRuns, Runs);
    RunsToDate + Runs;

- You might want to use something similar if you are totaling milk production across the days in lactation or pigs born alive across parities.
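Applied to the milk production case, the same pattern might look like the sketch below. The dataset name milk, the variable names, and the three daily yields are invented for illustration.

```sas
/* Dataset name, variable names, and data values are hypothetical. */
DATA milk;
    INPUT day milk_kg;
    RETAIN max_milk;                     /* keep the value across iterations        */
    max_milk = MAX(max_milk, milk_kg);   /* MAX ignores the initial missing value   */
    total_milk + milk_kg;                /* sum statement: running total from 0     */
    DATALINES;
1 28.5
2 30.1
3 29.4
;
RUN;

PROC PRINT DATA=milk;
RUN;
```

With these made-up values the running total is 28.5, 58.6, and 88.0 over the three days, and max_milk is 28.5, 30.1, and 30.1. The sum statement retains total_milk automatically, so it does not need its own RETAIN statement.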

Using Arrays in SAS

- An array is a temporary holding site for a collection of variables upon which the same operations will be performed.
  - Arrays provide convenient shortcuts in programming.
  - An array is a user-defined group of variables.
  - The array is defined in the DATA step.
  - All the variables in an array must be either character or numeric; you CANNOT mix character and numeric variables in the same array.

Using Arrays in SAS

- ARRAY name (n) $ variable-list;   (the $ may or may not be included)
  - The number (n) must match the number of variables in the list.
  - An array by itself does nothing.
  - You create the array to perform some function on all of the array variables (the book's example changes a missing-value code of 9 to .).
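A minimal sketch of that kind of recoding, assuming three numeric variables q1 to q3 in which 9 was used as a missing-value code. The dataset name scores, the array name items, and the data rows are hypothetical, and this is not the book's exact program.

```sas
/* Dataset name, array name, and data values are hypothetical. */
DATA scores;
    INPUT id q1 q2 q3;
    ARRAY items (3) q1 q2 q3;      /* (3) matches the three variables listed */
    DO i = 1 TO 3;
        IF items(i) = 9 THEN items(i) = .;   /* recode 9 to missing */
    END;
    DROP i;                        /* the loop counter is not needed in the output */
    DATALINES;
1 3 9 5
2 9 9 1
3 2 4 9
;
RUN;

PROC PRINT DATA=scores;
RUN;
```

The array itself is not stored in the dataset; only q1, q2, and q3 are, with every 9 replaced by a missing value.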

Using Shortcuts for Lists of Variable Names

- Provides a shorthand for listing variables that have very similar names
  - Variable1, Variable2, Variable3, Variable4, and so forth
  - You could use an INPUT statement that includes all of the names:
    INPUT Variable1 Variable2 Variable3 Variable4;
  - Alternatively, you could write it as
    INPUT Variable1 - Variable4;
    and all of the variables will have been input.
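The numbered range list can be tried with a sketch like the one below. The dataset name trial, the variable total, and the data rows are assumptions; the second use of the range, inside SUM(OF ...), is added only to show that the same shortcut works outside the INPUT statement.

```sas
/* Dataset name, total, and data values are hypothetical. */
DATA trial;
    INPUT Variable1-Variable4;             /* same as listing all four names */
    total = SUM(OF Variable1-Variable4);   /* the range also works in functions */
    DATALINES;
1 2 3 4
5 6 7 8
;
RUN;

PROC PRINT DATA=trial;
RUN;
```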