TRANSCRIPT
Accessing Data and Creating Data Structures
SAS Global Certification Webinar Series
Copyright © SAS Institute Inc. All rights reserved.
Michele Ensor, Senior Manager, SAS Education
Becky Gray, Certification Exam Developer, SAS Global Certification
SAS Global Certification Webinar Series
https://communities.sas.com/t5/SAS-Certification/bd-p/certification
February 8, 2017; September 14, 2017; Today: January 18, 2018
Upcoming SAS Global Certification Webinars
• February 15: Managing Data, 11:00 a.m. – 12:00 p.m. ET
• March 13: Generating Reports and Test Taking Strategies, 11:00 a.m. – 12:00 p.m. ET
Base Programming Exam Content Areas
The intended candidate for the SAS Base Programming exam is someone with current SAS programming experience in the following five content areas:
1. Accessing Data
2. Creating Data Structures
3. Managing Data
4. Generating Reports
5. Handling Errors
In addition, candidates should be familiar with the enhancements and new functionality available in SAS 9.4.
Base Programming Exam Specifics
The following are specifics of the SAS Base Programming for SAS®9 exam:
• 60-65 multiple-choice and short-answer questions
• 110 minutes to complete exam
• Closed book
• Exam taken on a computer
• Score received after completing the exam
• Must achieve a score of 70% correct to pass
1. Accessing Data
2. Creating Data Structures
Accessing Data
The content area of Accessing Data includes the following topics:
• Use FORMATTED and LIST input to read raw data files.
• Use INFILE statement options to control processing when reading raw data files.
• Use various components of an INPUT statement to process raw data files including column and line pointer controls, and trailing @ controls.
• Combine SAS data sets.
• Access an Excel workbook.
Using Formatted and List Input
Formatted input is used to read standard and nonstandard raw data in fixed columns.
DATA output-SAS-data-set;
    INFILE 'raw-data-file' <options>;
    INPUT pointer-control variable informat … ;
RUN;
The following are some of the common SAS informats:
$w. $CHARw. $UPCASEw.
w.d COMMAw.d DOLLARw.d COMMAXw.d DOLLARXw.d EUROXw.d PERCENTw.d
MMDDYYw. DDMMYYw. DATEw. YEARw.
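As a minimal sketch of formatted input, the step below reads two fixed-column records. The data set name, variable names, and values are hypothetical, and DATALINES stands in for an external raw data file so the step is self-contained.

```sas
/* Hypothetical data: Name in columns 1-5, Amount in 7-13, SaleDate in 15-23 */
data work.sales;
    input @1  Name     $5.
          @7  Amount   comma7.     /* COMMAw.d strips the $ and the comma */
          @15 SaleDate date9.;     /* DATEw. converts ddMMMyyyy to a SAS date */
    format SaleDate date9.;
    datalines;
Sue    $1,250 12MAR2013
Bobby    $980 01APR2013
;
run;

proc print data=work.sales noobs;
run;
```

Each @n pointer control positions the pointer before the informat reads a fixed number of columns, so the layout of the raw data must match the widths in the INPUT statement.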
Using Formatted and List Input
List input is used to read standard and nonstandard raw data separated by a delimiter.
• A space (blank) is the default delimiter.
• Variables must be specified in the order in which they appear in the raw data file.
• The default length for character and numeric variables is eight bytes unless using a LENGTH statement.
DATA output-SAS-data-set;
    LENGTH variable(s) $ length;
    INFILE 'raw-data-file' <options>;
    INPUT variable <$> variable <:informat> … ;
RUN;
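The following sketch illustrates list input with a LENGTH statement and the colon modifier. The data set name, variable names, and values are hypothetical, and DATALINES replaces the external file.

```sas
/* Hypothetical blank-delimited data read with list input */
data work.orders;
    length Product $ 12;                         /* override the 8-byte default */
    input OrderDate :date9. Product $ Quantity;  /* colon: informat plus delimiter scan */
    format OrderDate date9.;
    datalines;
12MAR2013 Shirt 10
1APR2013 Sweatshirt 2
;
run;

proc print data=work.orders noobs;
run;
```

Without the LENGTH statement, Product would be truncated to eight characters. Because the LENGTH statement appears before the INPUT statement, Product becomes the first variable in the output data set.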
Question 1
Given the following fixed column raw data file:
Sue   F $26
Bobby M $35
Given the following DATA step:
data people;
    infile 'people.dat';
    input @1 Name $5. @7 Gender $1.
          @9 Amt dollar3.;
run;
Which SAS data set is created?
a. b. [Answer choices were shown as images on the slide.]
Question 2
Given the following delimited raw data file:
12MAR2013 Shirt 101APR13 Toy 2
Given the desired SAS data set:
[Data set was shown as an image on the slide.]
Which statement correctly reads the raw data file?
a. input Date:date. Product Quantity;
b. input Date date. Product $ Quantity;
c. input Date:date. Product $ Quantity;
Using INFILE Statement Options
Options can be specified in the INFILE statement:
INFILE 'raw-data-file' <options>;
The following are some of the common INFILE statement options:
DLM= DSD FIRSTOBS= OBS= MISSOVER
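A minimal sketch of three of these options working together is shown below. The data and names are hypothetical, and INFILE DATALINES is used so that the options can be applied to in-stream records.

```sas
/* Hypothetical comma-delimited data with a header row */
data work.staff;
    infile datalines dsd firstobs=2 missover;
    length Name $ 10 City $ 12;
    input Name $ City $ Salary;
    datalines;
Name,City,Salary
Sue,"Cary, NC",52000
Bobby,,48000
Ann,Raleigh
;
run;

proc print data=work.staff noobs;
run;
```

FIRSTOBS=2 skips the header record. DSD sets the delimiter to a comma, treats two consecutive commas as a missing value, and keeps the comma inside the quoted city. MISSOVER leaves Salary missing for the short last record instead of reading the next record.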
Question 3
Which statement is true?
a. The DSD option sets the default delimiter to a comma.
b. The DLM= option is specified in the INPUT statement.
c. The OBS= option specifies the record number of the first record to read in an input file.
d. The MISSOVER option causes SAS to read a new input data record if it does not find values on the current data record.
Using INPUT Statement Components
Column and line pointer controls can be specified in the INPUT statement.
@n +n / #n
In addition, the trailing @ and double trailing @ can be specified in the INPUT statement.
@ @@
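The two steps below sketch how the single and double trailing @ behave. All names and values are hypothetical.

```sas
/* Single trailing @: hold the record, test it, then finish reading it */
data work.us_sales;
    input @1 Country $2. @;
    if Country = 'US';                       /* other records are discarded here */
    input @4 SaleDate mmddyy10. @15 Amount 6.;
    format SaleDate date9.;
    datalines;
US 03/12/2013   1250
CA 12/03/2013    980
US 04/01/2013    310
;
run;

/* Double trailing @: hold the record across iterations of the DATA step */
data work.scores;
    input Name $ Score @@;
    datalines;
Sue 90 Bobby 85 Ann 77
;
run;
```

In the first step the held record is released when the DATA step begins its next iteration, so each record is tested once. In the second step one input record supplies several observations because the record is held across iterations.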
Question 4
Which statement is false?
a. The @n column pointer control moves the pointer to column n.
b. The double trailing @ is specified in the INPUT statement.
c. The / line pointer control advances the pointer to column 1 of the next input record.
d. The single trailing @ holds the input record for the execution of the next INPUT statement across iterations of the DATA step.
Accessing and Combining SAS Data Sets
The SET statement is used to read a SAS data set in a DATA step.
DATA output-SAS-data-set;
    SET input-SAS-data-set;
RUN;
Also, the SET statement is used to concatenate data sets. Concatenating copies all observations from the first data set and then copies all observations from one or more successive data sets.
DATA output-SAS-data-set;
    SET input-SAS-data-set1 input-SAS-data-set2 … ;
RUN;
Accessing and Combining SAS Data Sets
The MERGE statement with a BY statement enables match-merging.
Match-merging combines observations from two or more SAS data sets into a single observation by matching up values of one or more common variables.
DATA output-SAS-data-set;
    MERGE input-SAS-data-set1 input-SAS-data-set2 … ;
    BY <DESCENDING> by-variable(s);
RUN;
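A small match-merge can be sketched as follows. The data sets, variables, and values are hypothetical. The input data sets are created already sorted by ID; data that is not in BY-variable order would need a PROC SORT step (or an index) first.

```sas
/* Hypothetical input data sets, both in ascending ID order */
data work.one;
    input ID Name $;
    datalines;
1 Sue
2 Bobby
3 Ann
;
run;

data work.two;
    input ID Amount;
    datalines;
1 26
3 35
4 12
;
run;

/* Keep only IDs that appear in both data sets */
data work.matched;
    merge work.one(in=InOne) work.two(in=InTwo);
    by ID;
    if InOne and InTwo;
run;
```

The IN= variables are temporary flags that record which data set contributed to the current observation. They are available during the step but are not written to the output data set.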
Accessing and Combining SAS Data Sets
The following are a few of the data set options that can be used with the input SAS data set when accessing and combining data sets:
IN= RENAME=
KEEP= DROP=
FIRSTOBS=
OBS=
Question 5
Given the SAS data sets Nc and Sc:
[Data sets were shown as images on the slide.]
Given the following SAS program:
data both;
    set nc sc;
run;
How many variables are in the final data set Both?
Answer: 3
Question 6
Given the SAS data sets One and Two:
[Data sets were shown as images on the slide.]
Given the following SAS program:
data all;
    merge one(in=a) two(in=b);
    by ID;
    if a and not b;
run;
How many observations are in the final data set All?
Answer: 1
Accessing an Excel Workbook
The LIBNAME statement can be used to assign a library reference name to a Microsoft Excel workbook, assuming SAS/ACCESS Interface to PC Files is available.
LIBNAME libref EXCEL 'Excel-workbook';
LIBNAME libref PCFILES PATH='Excel-workbook';
LIBNAME libref XLSX 'Excel-workbook';
PROC CONTENTS DATA=libref._ALL_;
RUN;
LIBNAME libref CLEAR;
Accessing an Excel Workbook
If using the EXCEL or PCFILES engine, worksheet names appear with a dollar sign at the end of the name. A SAS name literal is used to reference the worksheet.
libref.'Excel-worksheet$'n
If using the XLSX engine, worksheet names do not appear with a dollar sign at the end of the name.
libref.Excel-worksheet
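As a sketch of the XLSX engine in use, the program below reads one worksheet into a temporary data set. The libref, the workbook path, and the sheet name Sales are hypothetical, and the step assumes SAS/ACCESS Interface to PC Files is licensed.

```sas
/* Hypothetical workbook path and sheet name */
libname myxl xlsx 'C:\data\orders.xlsx';

/* List the worksheets that SAS can see in the workbook */
proc contents data=myxl._all_;
run;

/* XLSX engine: no dollar sign and no name literal needed */
data work.sales;
    set myxl.Sales;
run;

/* With the EXCEL or PCFILES engine the reference would be myxl.'Sales$'n */

libname myxl clear;
```

Clearing the libref at the end releases the workbook so that it can be opened in Excel again.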
Question 7
Which program reads a sheet named July from the Excel workbook YR2017.xlsx?
a.
libname myxls excel 'YR2017.xlsx';
data july;
    set myxls.'July$'n;
run;
b.
libname excel pcfiles 'YR2017.xlsx';
data july;
    set excel.'July$'n;
run;
c.
libname myexcel xlsx 'YR2017.xlsx';
data july;
    set xlsx.July;
run;
Creating Data Structures
The content area of Creating Data Structures includes the following topics:
• Create temporary and permanent SAS data sets.
• Create and manipulate SAS date values.
• Export data to create standard and comma-delimited raw data files.
• Control which observations and variables in a SAS data set are processed and output.
Creating SAS Data Sets
The DATA step can be used to create temporary or permanent SAS data sets.
DATA libref.SAS-data-set;
    <additional SAS statements>
RUN;
The LIBNAME statement is used to assign a library reference name (libref) to a SAS library.
LIBNAME libref 'SAS-data-library';
Note: Work is the default temporary library.
Creating SAS Data Sets
Multiple SAS data sets can be created in a DATA step.
DATA SAS-data-set-1 < … SAS-data-set-n>;
The OUTPUT statement is used to direct output to a specific SAS data set.
OUTPUT <SAS-data-set-1 … SAS-data-set-n>;
Note: An OUTPUT statement without arguments writes to every SAS data set listed in the DATA statement.
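A short sketch of directing output to more than one data set follows. It reads the SASHELP.CLASS sample data set; the output data set names and the age cutoff are chosen only for illustration.

```sas
/* Split one input data set into two temporary data sets */
data work.teens work.preteens;
    set sashelp.class;
    if Age >= 13 then output work.teens;
    else output work.preteens;
run;
```

Because an explicit OUTPUT statement is present, the implicit output at the end of the DATA step is turned off, and each observation is written only where an OUTPUT statement sends it.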
Creating SAS Data Sets
The DATA step is processed in two phases.
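Compilation Phase
• Compile the step.
• Success? Yes: continue to the execution phase. No: go to the next step.
Execution Phase
• Initialize PDV to missing.
• Execute SET statement.
• End of file? Yes: go to the next step. No: continue.
• Execute other statements.
• Output to SAS data set, then return to the start of the execution phase.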
Question 8
Given the following SAS data set and program:
[Data set was shown as an image on the slide.]
data both;
    set project.orders;
    Total=Quantity*Price;
run;
Which statement is true?
a. A permanent data set is created.
b. The final data set contains six observations.
c. The final data set contains three variables.
d. The DATA step stops execution during the sixth iteration.
Question 9
Which statement is true?
a. During the compilation phase, variables are assigned missing values.
b. During the compilation phase, the program data vector (PDV) is created.
c. During the execution phase, the descriptor portion of the SAS data set is created.
d. During the execution phase, an input buffer is created if reading data from a SAS data set.
Working with SAS Date Values
A SAS date value is stored as the number of days between January 1, 1960 and a specific date.
A SAS date constant is used to refer to a SAS date value in a program.
'ddMMMyyyy'd
Working with SAS Date Values
Functions are used to extract information from a SAS date value or create a SAS date value.
YEAR QTR MONTH DAY WEEKDAY TODAY DATE MDY
Formats are used to display SAS date values.
MMDDYYw. DDMMYYw. DATEw. WORDDATE. WEEKDATE. YEARw. MONYYw.
Informats are used to convert data to a SAS date value.
MMDDYYw. DDMMYYw. DATEw. YEARw.
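The step below is a minimal sketch that combines a date constant, several date functions, and date formats. The data set and variable names are hypothetical.

```sas
data work.dates;
    Birth   = '15MAY1964'd;        /* date constant */
    Days    = Birth;               /* unformatted copy: days since January 1, 1960 */
    Yr      = year(Birth);
    Qtr     = qtr(Birth);
    Dow     = weekday(Birth);      /* 1=Sunday through 7=Saturday */
    Rebuilt = mdy(5, 15, 1964);    /* same date built from month, day, year */
    format Birth weekdate. Rebuilt date9.;
run;

proc print data=work.dates noobs;
run;
```

Birth and Days hold the same stored number; only the format attached to Birth changes how it is displayed in the report.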
Question 10
Given the following SAS program:
data dates;
    Birth='15MAY1964'd;
    Dow=weekday(Birth);
    Age=(today()-Birth)/365.25;
    format Birth weekdate.;
run;
proc print data=dates noobs;
run;
Which report output is correct?
a. b. c. d. [Answer choices were shown as images on the slide.]
Exporting to Raw Data Files
ODS statements can be used to create a raw data file.
ODS CSVALL FILE='raw-data-file';
SAS code to generate a report(s)
ODS CSVALL CLOSE;
The EXPORT procedure can be used to create a raw data file.
PROC EXPORT DATA=SAS-data-set
            OUTFILE='raw-data-file'
            DBMS=CSV|DLM|TAB <REPLACE> <LABEL>;
    DELIMITER='char';
RUN;
Exporting to Raw Data Files
The DATA step can also be used to create a raw data file.
DATA _NULL_;
    SET input-SAS-data-set;
    FILE 'raw-data-file' <DSD>;
    PUT variable variable-n <:format> … ;
RUN;
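Two of these export approaches can be sketched side by side. The output file names are hypothetical, and SASHELP.CLASS is used only as convenient sample data.

```sas
/* DATA step: write one comma-delimited record per observation */
data _null_;
    set sashelp.class;
    file 'class.csv' dsd;
    put Name Sex Age Height Weight;
run;

/* PROC EXPORT: write the same data, including a header row */
proc export data=sashelp.class
            outfile='class_export.csv'
            dbms=csv replace;
run;
```

The DATA _NULL_ step gives full control over which variables are written and in what order, while PROC EXPORT writes every variable and adds the variable names as the first record.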
Question 11
Which statement is true?
a. ODS CSVALL is part of Base SAS.
b. A style template is used when specifying the STYLE= option in the ODS CSVALL statement.
c. SAS/ACCESS Interface to PC Files is needed to use the EXPORT procedure to create a delimited file.
d. The EXPORT procedure can reference a raw data file with the DATA= option and the OUTFILE= option.
Controlling Observations and Variables
The DROP and KEEP statements can be used to control the variables in the output SAS data set.
DROP variable-list;
KEEP variable-list;
The DROP= and KEEP= data set options can also be used to determine the variables in the output data set.
SAS-data-set(DROP=variable-1 < … variable-n>)
SAS-data-set(KEEP=variable-1 < … variable-n>)
Controlling Observations and Variables
The WHERE statement and IF statement can be used to control the observations in the output SAS data set.
• The WHERE statement selects observations before they are brought into the program data vector.
• The IF statement selects observations that were read into the program data vector.
WHERE expression;
IF expression;
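The difference between the two statements can be sketched with SASHELP.CLASS. The computed variable Ratio and the cutoff value are hypothetical and serve only to show where each statement can be used.

```sas
data work.subset;
    set sashelp.class;
    where Sex = 'F';            /* filters on an input variable before the PDV */
    Ratio = Weight / Height;    /* new variable created in this step */
    if Ratio > 1.5;             /* subsetting IF can test the new variable */
    keep Name Age Ratio;
run;
```

Moving the Ratio condition into the WHERE statement would produce an error, because WHERE can reference only variables that exist in the input data set.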
Controlling Observations and Variables
An expression is a sequence of operands and operators that form a set of instructions that define a condition.
Operands: character constants, numeric constants, date constants, character variables, and numeric variables
Arithmetic Operators: exponentiation (**), multiplication (*), division (/), addition (+), and subtraction (-)
Comparison Operators: EQ (=), NE, GT (>), LT (<), GE (>=), LE (<=), and IN
Logical Operators: AND (&), OR (|), and NOT
Special WHERE Operators: CONTAINS (?), BETWEEN-AND, IS NULL, IS MISSING, LIKE, SAME AND, and ALSO
Question 12
Given the following input SAS data set:
[Data set was shown as an image on the slide.]
Given the following SAS program:
data class;
    set sashelp.class(drop=Sex);
    keep Name Height Weight;
    if Age >= 13 then Group='Teen';
run;
How many variables are in the final SAS data set?
Answer: 3
Question 13
Given the following input SAS data set:
[Data set was shown as an image on the slide.]
Which step produces an error?
a.
data subset;
    set employees;
    Month=month(birth);
    if state='CA' and Month=3;
run;
b.
data subset;
    set employees;
    Month=month(birth);
    where state='NC' and Month=3;
run;
Base Programming Exam Preparation
Multiple resources are available for your exam preparation.
Books – SAS Product Documentation
http://support.sas.com/documentation/onlinedoc/base/
Free SAS Product Documentation is available.
Books – Certification Prep Guide
https://www.sas.com/sas/books.html
The official prep guide covers all of the objectives tested in the exam.
BASE Programmer Certification Review Series
communities.sas.com > Find a Community > Learn SAS > SAS Certification
• Preparing for the Base Programming Exam: September 14, 2017
• Accessing Data and Creating Data Structures: January 18, 2018
• Managing Data: February 15, 2018
• Generating Reports and Test-Taking Strategies: March 13, 2018
How to Reach Certification
Web: sas.com/certify
Exam Registration: pearsonvue.com/sas
Thanks for attending this Certification Webinar!
Q&A
Please submit your questions in the Q&A window.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
Thank You!