Copyright © SAS Institute Inc. All rights reserved.
Tips and Tricks for SAS Programmers
SUCCESS, March 2018, Mary Harding
Agenda
Tip 1 Sorting tip
Tip 2 Don’t read data over and over again: SASFILE
Tip 3 Pattern Matching
Tip 4 Substr (left side)
Tip 5 Create Native EXCEL files from ODS... It’s FINALLY HERE!
Tip 1: Save on Sorting: Use the NOEQUALS Option
PROC SORT: change the default behavior
Example:
o First sort 13 cards from highest to lowest:
o K K Q Q J J 10 9 8 7 6 5 3
o Then sort by suit.
Solution: PROC SORT with the NOEQUALS option
For observations with identical BY-variable values, EQUALS maintains the relative order of the observations from the input data set in the output data set.
EQUALS is the default; when that relative order does not matter, maintaining it uses unnecessary CPU.
Within a suit, PROC SORT maintains the relative order of the K, Q, J, and 5, even though we didn’t need it to.
Tip 1: Save on Sorting: Use the NOEQUALS Option
• First sort the table by order_date.
• Now sort the same table by order_type.
• Notice how SAS maintains the previous order_date order.
Tip 1: Save on Sorting: Use the NOEQUALS Option
Use the NOEQUALS option to save processing time.
"I just care about the beer can size."
Option placement: PROC SORT DATA=prod.customers NOEQUALS ;
Proc Sort data=orders out=Orders_ordertype Noequals;
   by Order_type;
Run;
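To try the two-step sort described above on a small scale, here is a minimal sketch. The table work.orders, its variables, and its values are hypothetical sample data, not the table shown on the slides. It sorts by order_date first, then by order_type with and without NOEQUALS so the two outputs can be compared.

```sas
/* Hypothetical sample data: names and values are illustrative only. */
data work.orders;
   infile datalines dlm=',';
   input order_id order_type order_date :date9.;
   format order_date date9.;
   datalines;
1,2,05JAN2018
2,1,03JAN2018
3,2,01JAN2018
4,1,02JAN2018
;
run;

/* Step 1: sort by date. */
proc sort data=work.orders out=work.by_date;
   by order_date;
run;

/* Step 2a: default (EQUALS). Rows within each order_type keep date order. */
proc sort data=work.by_date out=work.by_type_equals;
   by order_type;
run;

/* Step 2b: NOEQUALS. Order within each order_type is no longer guaranteed. */
proc sort data=work.by_date out=work.by_type_noequals noequals;
   by order_type;
run;
```

On a table this small the two outputs may well look identical. NOEQUALS only removes the guarantee, so it fits best when nothing downstream depends on the order within a BY group.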
Tip 2: Don’t read data over and over again: SASFILE
SASFILE Statement
▪ Loads a table and keeps its data in memory
▪ Use when processing the same data across many steps
▪ Works only if the table fits in memory
▪ Remember to use the SASFILE CLOSE statement to release the memory
▪ Cannot write to the table while it is loaded
Tip 2: Don’t read data over and over again: SASFILE
SASFILE SAS-data-set LOAD;
SASFILE SAS-data-set CLOSE;
Reduces processing

sasfile orion.customerdim load;

proc freq data=orion.customerdim;
   tables CustomerCountry CustomerType;
run;

proc print data=orion.customerdim noobs;
   where CustomerType='Orion Club Gold members high activity';
   var CustomerID CustomerName CustomerAgeGroup;
run;

proc means data=orion.customerdim mean median max min;
   var CustomerAge;
   class CustomerGroup;
run;

proc tabulate data=orion.customerdim format=8.;
   class CustomerAgeGroup CustomerType;
   table CustomerType All=Total,
         CustomerAgeGroup*n=' ' All=Total*n=' ' / rts=45;
run;

sasfile orion.customerdim close;
Tip 2: Don’t read data over and over again: SASFILE
Note: Cannot write to a table that is in-memory
Tip 3: Pattern Matching: PERL Regular Expressions (PRX)
• Open source language available in SAS
• Specifies character patterns to search for or replace
• PRXMATCH returns the start position at which the pattern is found
• PRXMATCH returns 0 if the pattern is not found
Let’s look at an example: I need to verify SIN (Social Insurance Number) patterns.
Tip 3: Pattern Matching : PERL Regular Expressions
Data SIN;
infile datalines dlm=",";
input name:$30. sin:$20.;
Datalines;
Cary Grant,345-667-888
Humphrey Bogart,56-444-890
Lauren Bacall, 458-897-987
Sophie Loren, jhu-987-90
James Dean,789-378-123HI
Jerry Lewis,987-99-089
Catherine Hepburn,883-268-169
Cary Grant,564-878-LK8
Marlon Brando,Toronto234-582-873
Janet Lee,849-74N-e89
;
Run;
1. Verify the pattern: XXX-XXX-XXX
2. Verify that each X is a digit
3. Verify that there are no additional characters
7 invalid SINs
Tip 3: Pattern Matching PERL Regular Expressions
Data Invalid Good;
set sin;
If prxmatch('/\d{3}-\d{3}-\d{3}/',trim(SIN))=0 then
output Invalid /*did not find a pattern match*/;
else output Good;
Run;
Metacharacter   Role
/               Starts or ends the regular expression
\d{3}           Matches three digits
-               Matches the dash between digit groups
It almost works
Tip 3: Pattern Matching PERL Regular Expressions
What happened to:
James Dean,789-378-123HI
Marlon Brando,Toronto234-582-873
Hmm… Why are these still considered VALID?
PRXMATCH performs a sliding search, so it found XXX-XXX-XXX inside longer values such as Toronto234-582-873. We need to qualify the search further.
Force PRXMATCH to match ONLY when the pattern occupies positions 1 to 11, rather than searching for the pattern anywhere in the string.
Metacharacter   Role
^               Represents the beginning of the string, before the first character
$               Represents the end of the string, after the last character
Tip 3: Pattern Matching PERL Regular Expressions
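Data Invalid Good;
   set sin;
   /* TRIM removes the trailing blanks that pad SIN; left in place, they sit before $ and block every match */
   if prxmatch('/^\d{3}-\d{3}-\d{3}$/',trim(SIN))=0 then output Invalid;
   /* If equal to zero, it did not find the pattern match and is INVALID */
   else output Good;
Run;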
It works
^ matches the position at the
beginning of the input string
$ matches the position at the
end of the input string.
7 Invalid SINs, problem
solved.
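The effect of the ^ and $ anchors, and of trailing blanks, can be checked with a short sketch like the one below. The variable names anywhere, whole, and padded are illustrative, and the three test values are taken from the SIN data above. Because SIN is a fixed-length character variable, its value is padded with blanks, so the anchored pattern is expected to fail on the untrimmed value.

```sas
/* Sketch: compare unanchored, anchored, and untrimmed anchored matches. */
data _null_;
   length sin $20;
   do sin = '345-667-888', '789-378-123HI', 'Toronto234-582-873';
      /* Sliding search: finds the pattern anywhere in the value. */
      anywhere = prxmatch('/\d{3}-\d{3}-\d{3}/', trim(sin));
      /* Anchored search on the trimmed value: the whole value must match. */
      whole = prxmatch('/^\d{3}-\d{3}-\d{3}$/', trim(sin));
      /* Anchored search without TRIM: padding blanks sit before $. */
      padded = prxmatch('/^\d{3}-\d{3}-\d{3}$/', sin);
      put sin= anywhere= whole= padded=;
   end;
run;
```

The log should show a nonzero value of anywhere for all three values, a nonzero value of whole only for 345-667-888, and padded=0 throughout. That last column is why the anchored check should be applied to TRIM(SIN).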
Tip 3: Perl Regular Expressions
PRX can accomplish much more:
Use it in the DATA step to enhance search-and-replace text options.
Use it to perform the following tasks:
o Validate data
o Replace text
o Extract a substring from a string
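As a rough illustration of the replace and extract tasks listed above, the sketch below uses PRXCHANGE, PRXPARSE, and PRXPOSN. The slides name only PRXMATCH and PRXCHANGE, so the other two functions, the variable names, and the masking rule are assumptions chosen for the example.

```sas
/* Sketch: replace text and extract a substring with PRX functions. */
data _null_;
   length sin $20 masked $20 first3 $3;
   sin = '345-667-888';

   /* Replace: mask the last three digits (-1 means replace every match). */
   masked = prxchange('s/\d{3}$/XXX/', -1, trim(sin));

   /* Extract: capture the first three digits with a capture group. */
   re = prxparse('/^(\d{3})-/');
   if prxmatch(re, sin) then first3 = prxposn(re, 1, sin);

   put masked= first3=;
run;
```

PRXPARSE compiles the pattern once and returns an identifier that PRXMATCH and PRXPOSN can then share.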
Tip 3: Pattern Matching: PRX vs VERIFY and LIKE functions
Is there another way?
o The VERIFY function has some properties similar to Perl regular expressions.
o Let’s use the VERIFY function with the LIKE operator to check SINs.
Tip 3: Pattern Matching: PRX vs VERIFY and LIKE functions
Data Invalid_SIN;
Set sin;
Where SIN not Like '___-___-___' or
Verify(trim(compress(SIN,'-')),'0123456789') ne 0;
Run;
VERIFY(target-expression, search-expression <, search-expression ...>)
Returns the position of the first character in target-expression that is not present in any search-expression. If target-expression contains no characters other than those in the search-expressions, VERIFY returns 0. Here, a 0 means every character left after removing the dashes is a digit.
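A quick way to see what VERIFY returns is a sketch like this one. The two literal values are SINs from the data above with the dashes already removed, and the variable names are illustrative.

```sas
/* Sketch: VERIFY returns 0 when every character is in the allowed list. */
data _null_;
   pos_ok  = verify('345667888', '0123456789');  /* all digits */
   pos_bad = verify('564878LK8', '0123456789');  /* 'L' is the first non-digit */
   put pos_ok= pos_bad=;
run;
```

The log should show pos_ok=0 and pos_bad=7, the position of the first character that is not a digit.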
Tip 3: Pattern Matching: PRX vs VERIFY and LIKE functions
o An alternate solution that doesn’t require learning Perl regular expression syntax.
o May not work for more complicated matching problems.
Tip 4: Substr(left side)
You need to replace several characters in a string.
"Easy," you say, "I know several ways":
o TRANWRD function
o TRANSLATE function
o PRXCHANGE function
o Conditional logic
Easy problem:
Change values of the variable Pet. If Pet='Cat', then change it to 'Dog'.
Newpet=Tranwrd(Pet,'Cat','Dog');
Tip 4: Substr(left side)
What if you have millions of values in which certain characters need to be changed to a specific value?
More difficult problem:
We need to confidentialize (mask) credit card numbers.
Any ideas?
Hint: it is a one-line solution.
Tip 4: Substr(left side) and Picture Format
SUBSTR(variable, position <, length>)=characters to replace
Replaces character values based on their position in a string,
not their value.
The one-line solution:
Example: SUBSTR(creditcard,13,4)='XXXX';
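To see the left-side SUBSTR in a complete step, here is a minimal sketch. The table work.masked and the two 16-digit values are hypothetical test data. Positions 13 through 16 are overwritten in place, whatever characters they currently hold.

```sas
/* Sketch: mask the last four characters of a 16-character card number. */
data work.masked;
   length creditcard $16;
   input creditcard $16.;
   /* Left-side SUBSTR: overwrite positions 13-16 based on position, not value. */
   substr(creditcard, 13, 4) = 'XXXX';
   datalines;
4111111111111111
5500000000000004
;
run;

proc print data=work.masked noobs;
run;
```

Because the assignment works by position, it needs no search step, which is what makes it attractive for very large tables.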
Tip 5: Create native EXCEL files: ODS EXCEL
▪ It’s finally here: create native xlsx files from SAS reports.
▪ Production in SAS 9.4 M3 (experimental in SAS 9.4 M1 and M2)
ODS EXCEL FILE="C:\Temp\colour_cells.xlsx" ;
Proc report data=orion.emps(obs=20);
   column Name JobTitle Country Gender HireDate Salary;
   /* Shade every second row light green */
   compute Name;
      count+1;
      if mod(count,2)=0 then
         call define(_ROW_,'STYLE',
                     'style={background=lightgreen}');
   endcomp;
Run;
ODS EXCEL CLOSE;
Want to learn more? Papers and Documentation
o Paper 042-2007: Jolley, Linda T., and Stroupe, Jane S. Dear Miss SASAnswers: A Guide to SAS® Efficiency
o Paper 209-2007: Virgile, Robert. The Most Important Efficiency Techniques
o PERL Regular Expressions: http://www2.sas.com/proceedings/forum2007/223-2007.pdf
o http://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf
o ODS EXCEL Documentation: http://support.sas.com/documentation/cdl/en/odsug/67921/HTML/default/viewer.htm#p09n5pw9ol0897n1qe04zeur27rv.htm
o https://blogs.sas.com/content/sgf/2017/02/20/tips-for-using-the-ods-excel-destination/
Questions?
sas.com
Thank You