BAS 150 Lesson 4: Creating and Managing SAS Datasets & Formats and Labels

Upload: wake-tech-bas

Post on 12-Apr-2017


Page 1: BAS 150 Lesson 4 Lecture

BAS 150 Lesson 4: Creating and Managing SAS Datasets & Formats and Labels

Page 2: BAS 150 Lesson 4 Lecture

This Lesson’s Learning Objectives

• Create a permanent SAS data set
• Effectively manage multiple SAS data sets and libraries
• Modify SAS data sets
• Evaluate the difference in data formats and labels
• Create a new variable in SAS using formats

Page 3: BAS 150 Lesson 4 Lecture

SAS Data Sets

SAS datasets can be permanent or temporary.

Previously, we’ve used the DATA statement to create temporary SAS datasets.

In this lesson, we’ll learn how to create permanent SAS datasets using the DATA statement.

Page 4: BAS 150 Lesson 4 Lecture

Understanding SAS Libraries (1 of 3)

• SAS libraries are subdirectories or folders
• They store SAS datasets
• A library is assigned with a LIBNAME statement

Page 5: BAS 150 Lesson 4 Lecture

Understanding SAS Libraries (2 of 3)

The libref is the name you give to the subdirectory where SAS needs to look for the dataset.

You will refer to this libref later in your programs.

The libref for this libname statement is:
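The slide's own LIBNAME statement is not reproduced in this transcript. As a minimal sketch, assuming a hypothetical folder path and the libref soccer (the library name used later in this lesson), a LIBNAME statement could look like this:

```sas
/* Assumption: this folder path is hypothetical. Point it at a folder   */
/* that already exists on your system.                                  */
/* "soccer" is the libref: the short name used to refer to that folder. */
libname soccer '/folders/myfolders/soccer';
```

In this sketch the libref is soccer, the word that comes right after LIBNAME.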

Page 6: BAS 150 Lesson 4 Lecture

Understanding SAS Libraries (3 of 3)

Libref

o Naming rules
  • 8 characters or fewer
  • Begins with a letter or underscore
  • No special characters or spaces
o If you do not specify a libref, the dataset is saved temporarily to the WORK library
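To illustrate the last point, here is a small sketch with made-up data (the dataset name scores and its values are assumptions, not from the lecture). The DATA statement gives no libref, so SAS stores the dataset in the temporary WORK library:

```sas
/* No libref given: "scores" is stored as work.scores (temporary). */
data scores;
    input player $ goals;
    datalines;
Alex 12
Sam 7
;
run;

/* The two-level name work.scores refers to the same dataset. */
proc contents data=work.scores;
run;
```

Datasets in WORK are deleted when the SAS session ends.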

Page 7: BAS 150 Lesson 4 Lecture

Creating a Permanent SAS Dataset
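The slide's code is not included in the transcript. A minimal sketch of the idea, assuming a hypothetical folder path, made-up player data, and the library and dataset names used later in the lesson (Soccer and Soccer_Scores):

```sas
/* Assumption: hypothetical path; the folder must already exist. */
libname soccer '/folders/myfolders/soccer';

/* The two-level name libref.dataset makes the dataset permanent: */
/* it is written to the soccer folder instead of WORK.            */
data soccer.soccer_scores;
    input player $ goals age years_playing;
    datalines;
Alex 12 9 3
Sam 7 6 1
Jo 15 11 5
;
run;
```

The only difference from a temporary dataset is the libref in front of the dataset name.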

Page 8: BAS 150 Lesson 4 Lecture

Format of SAS Datasets

Two parts
o Descriptor
o Data

PROC statements
o PROC Print
o PROC Contents

Page 9: BAS 150 Lesson 4 Lecture

PROC Contents (1 of 2)

PROC Contents can be used to display the metadata, or descriptor portion, of the SAS dataset.

Page 10: BAS 150 Lesson 4 Lecture

PROC Contents (2 of 2)

Note the key parts:
• Data set name
• File name
• Number of variables
• Number of observations
• Variable list

Proc contents is an easy way to know what’s in your dataset
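As a quick sketch, PROC CONTENTS can be run on sashelp.class, a small sample dataset that ships with SAS (using it here is an assumption; the lecture's own example is not in the transcript):

```sas
/* Displays the descriptor portion: dataset name, number of      */
/* observations and variables, and the alphabetic variable list. */
proc contents data=sashelp.class;
run;
```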

Page 11: BAS 150 Lesson 4 Lecture

PROC Contents: VARNUM Option

By default, PROC CONTENTS lists all variables alphabetically.

Sometimes it is useful to look at the variable list in the order the variables were created.

This can be done using the VARNUM option.

(The slide shows the variable list before and after applying VARNUM.)
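A minimal sketch of the option, again assuming the sashelp.class sample dataset rather than the lecture's data:

```sas
/* VARNUM lists variables in creation order instead of alphabetically. */
proc contents data=sashelp.class varnum;
run;
```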

Page 12: BAS 150 Lesson 4 Lecture

PROC Print

PROC PRINT is useful for showing the results of your code.

However, running PROC PRINT with its default settings on a large dataset will generate many pages of output and can even make SAS freeze.

It’s better to specify options when using PROC PRINT, which we’ll discuss next.

Page 13: BAS 150 Lesson 4 Lecture

PROC Print Options

There are several options for PROC PRINT:

o Var – specifies the variables you want to print
o Noobs – suppresses the default observation column
o Obs – limits the number of observations
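The three options can be combined in one step. This sketch assumes the sashelp.class sample dataset and its variables name and age; note that OBS= is written in parentheses after the dataset name, VAR is its own statement, and NOOBS goes on the PROC PRINT statement:

```sas
proc print data=sashelp.class (obs=5) noobs;  /* first 5 rows, no Obs column */
    var name age;                             /* print only these variables  */
run;
```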

Page 14: BAS 150 Lesson 4 Lecture

Importing a SAS Data Set

Soccer is the library.

Soccer_Scores is the dataset within the library.

No need for infile statements to retrieve data.

The set statement is like an input statement, but instead of reading in a raw data file, it reads observations from a SAS dataset.
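A sketch of reading from the library, assuming the hypothetical folder path below and that soccer.soccer_scores has already been saved there:

```sas
/* Assumption: hypothetical path; it must contain soccer_scores. */
libname soccer '/folders/myfolders/soccer';

/* SET reads observations from an existing SAS dataset, */
/* so no INFILE or INPUT statement is needed.           */
data work.goals;
    set soccer.soccer_scores;
run;
```

This creates a temporary copy, work.goals, that can be modified without touching the permanent dataset.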

Page 15: BAS 150 Lesson 4 Lecture

Modifying a SAS Data Set

There are many different ways to modify a SAS data set.

The example on the left uses the “Where” statement to subset the work.goals data set AND creates a new SAS data set called work.topgoals.

Work.topgoals only includes observations where the goal variable is greater than or equal to 10 goals.
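The slide's code is not in the transcript; the sketch below reconstructs the idea with made-up data. The variable name goals is an assumption taken from the variable list given later in the lesson.

```sas
/* Made-up sample data so the example runs on its own. */
data work.goals;
    input player $ goals age years_playing;
    datalines;
Alex 12 9 3
Sam 7 6 1
Jo 15 11 5
;
run;

/* WHERE keeps only the observations with 10 or more goals. */
data work.topgoals;
    set work.goals;
    where goals >= 10;
run;
```

With this sample data, work.topgoals keeps Alex and Jo and drops Sam.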

Page 16: BAS 150 Lesson 4 Lecture

Formats and Labels

Page 17: BAS 150 Lesson 4 Lecture

Using Labels (1 of 3)

Label statement

Apply in a DATA or PROC step
o DATA step or PROC DATASETS: permanent
o PROC steps: temporary

Page 18: BAS 150 Lesson 4 Lecture

Using Labels (2 of 3)

• The first label statement is in the DATA step.
• It creates permanent descriptors for the variables player, goals, age and years_playing.
• The second label statement is in the PROC PRINT step.
• PROC PRINT requires the LABEL option when you want to display labels (instead of variable names) in the column headers, because the default is the variable name.

Page 19: BAS 150 Lesson 4 Lecture

Using Labels (3 of 3)

• To the right is the output using our label statement

• The headings are now the labels, instead of variable names.
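The two label statements described above could look roughly like this. The data values and the label text are assumptions for illustration; only the variable names come from the lecture.

```sas
data work.goals;
    input player $ goals age years_playing;
    /* Labels set in a DATA step are stored with the dataset (permanent). */
    label player        = 'Player Name'
          goals         = 'Goals Scored'
          age           = 'Age of Player'
          years_playing = 'Years Playing';
    datalines;
Alex 12 9 3
Sam 7 6 1
Jo 15 11 5
;
run;

/* The LABEL option tells PROC PRINT to show labels as column headers. */
proc print data=work.goals label;
    /* A label set in a PROC step applies to this step only (temporary). */
    label goals = 'Goals This Season';
run;
```

Without the LABEL option on the PROC PRINT statement, the headers fall back to the variable names.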

Page 20: BAS 150 Lesson 4 Lecture

Formats

• Similar to labels
• Define appearance of data
• Grouping
• Predefined or custom
• Both Data and Proc steps
  o Data: Permanent
  o Proc: Temporary

Page 21: BAS 150 Lesson 4 Lecture

Format Rules

<$>format<w>.<d>

$        Indicates a character format.
format   Names the SAS format.
w        Specifies the total format width, including decimal places and special characters.
.        Is required syntax. Formats always contain a period (.) as part of the name.
d        Specifies the number of decimal places to display in numeric formats.

Page 22: BAS 150 Lesson 4 Lecture

Commonly Used SAS Formats

Format      Definition
$w.         Writes standard character data.
w.d         Writes standard numeric data.
COMMAw.d    Writes numeric values with a comma that separates every three digits and a period that separates the decimal fraction.
DOLLARw.d   Writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction.
COMMAXw.d   Writes numeric values with a period that separates every three digits and a comma that separates the decimal fraction.
EUROXw.d    Writes numeric values with a leading euro symbol (€), a period that separates every three digits, and a comma that separates the decimal fraction.

Page 23: BAS 150 Lesson 4 Lecture

Examples of Formats

Pre-formatted value   Format       Formatted value
2125854               comma10.     2,125,854
52115                 dollar14.2   $52,115.00
17526                 mmddyy8.     12/26/07
17526                 weekdate.    Wednesday, December 26, 2007
M                     $Gender.     Male
12                    AgeGroup.    Under 18
C                     $PassFail.   Passing Grade
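The built-in formats in the table can be tried out with a short DATA _NULL_ step that writes each formatted value to the SAS log. This is only a sketch; the variable names are made up, and the custom formats ($Gender., AgeGroup., $PassFail.) are left out because they would first have to be defined with PROC FORMAT.

```sas
data _null_;
    amount = 2125854;
    salary = 52115;
    day    = 17526;        /* SAS date value: days since January 1, 1960 */
    put amount comma10.;
    put salary dollar14.2;
    put day mmddyy8.;
    put day weekdate.;
run;
```

The log lines should correspond to the "Formatted value" column above, possibly with leading blanks where the value is narrower than the format width.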

Page 24: BAS 150 Lesson 4 Lecture

Format Names

Page 25: BAS 150 Lesson 4 Lecture

Specifying Ranges of Values

We can create our own formats to simplify data presentation by creating groups.

From the Soccer data, we can create a format for age so that each value represents a “Playing Level”.

Example: Soccer players who are 5 or 6 years old are considered “Level 1”, while those who are 7, 8 or 9 are considered “Level 2”.

Age of Player   Value
5-6             Level 1
7-9             Level 2
10-12           Level 3

Page 26: BAS 150 Lesson 4 Lecture

Defining a Numeric Format Using PROC Format

We create a new format called “level”.

This format will convert the age variable into playing levels 1 – 3.

Notice we do not reference age at all in the PROC FORMAT step. Why? Because a format is not tied to one variable; a numeric format like this one can be applied to any numeric variable.
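Based on the age ranges from the previous slide, the format definition could be sketched as follows (the slide's exact code is not in the transcript):

```sas
proc format;
    /* No period after the name when the format is created. */
    value level
        5 - 6   = 'Level 1'
        7 - 9   = 'Level 2'
        10 - 12 = 'Level 3';
run;
```

By default the format is stored in the WORK catalog, so it lasts for the current SAS session.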

Page 27: BAS 150 Lesson 4 Lecture

Applying a Numeric Format

Once we’ve created the format, we can use it in PROC PRINT.

Notice the period after the format name in the second group of statements (the PROC PRINT step).

This is important and an easy place to make an error!

Remember, there is no period when you create the format, but a period is required when you use it.
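A self-contained sketch of both groups of statements, using made-up data (the format definition is repeated here so the example runs on its own):

```sas
/* First group: create the format (no period after "level"). */
proc format;
    value level
        5 - 6   = 'Level 1'
        7 - 9   = 'Level 2'
        10 - 12 = 'Level 3';
run;

/* Made-up sample data. */
data work.goals;
    input player $ goals age years_playing;
    datalines;
Alex 12 9 3
Sam 7 6 1
Jo 15 11 5
;
run;

/* Second group: use the format (period required: "level."). */
proc print data=work.goals noobs;
    format age level.;
run;
```

Because the FORMAT statement is in a PROC step, the stored age values are unchanged; only their display in this output is affected.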

Page 28: BAS 150 Lesson 4 Lecture

Viewing the Output

Note: I changed the age variable label to ‘Playing Level’ for reporting purposes

Page 29: BAS 150 Lesson 4 Lecture

Creating a New Variable in SAS

Instead of changing the variable label for age, we can create a new variable called “Playing_Level”.

By using the PUT function we can combine the age variable and the format level to create “Playing_Level”.

(The slide shows the resulting output.)
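A minimal sketch of this step with made-up data; the slide's code is not in the transcript, so the details below are assumptions apart from the names Playing_Level, age, and level:

```sas
proc format;
    value level
        5 - 6   = 'Level 1'
        7 - 9   = 'Level 2'
        10 - 12 = 'Level 3';
run;

data work.goals;
    input player $ goals age years_playing;
    /* PUT(value, format.) returns the formatted value as character text. */
    Playing_Level = put(age, level.);
    datalines;
Alex 12 9 3
Sam 7 6 1
Jo 15 11 5
;
run;

proc print data=work.goals noobs;
run;
```

Here age keeps its numeric value, and Playing_Level is a new character variable holding the level text.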

Page 30: BAS 150 Lesson 4 Lecture

Summary - Learning Objectives

• Create a permanent SAS data set
• Effectively manage multiple SAS data sets and libraries
• Modify SAS data sets
• Evaluate the difference in data formats and labels
• Create a new variable in SAS using formats

Page 31: BAS 150 Lesson 4 Lecture

Copyright Information

“This workforce solution was funded by a grant awarded by the U.S. Department of Labor’s Employment and Training Administration. The solution was created by the grantee and does not necessarily reflect the official position of the U.S. Department of Labor. The Department of Labor makes no guarantees, warranties, or assurances of any kind, express or implied, with respect to such information, including any information on linked sites and including, but not limited to, accuracy of the information or its completeness, timeliness, usefulness, adequacy, continued availability, or ownership.”

Except where otherwise stated, this work by Wake Technical Community College Building Capacity in Business Analytics, a Department of Labor, TAACCCT funded project, is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/