SAS Basics
Windows
Program Editor: Write and edit all your statements here.
Log: Watch this for any errors in the program as it runs.
Output: Pops to the front automatically when there is output. It does not need to occupy screen space during program editing.
File Organization
Create subfolders in your Project folder for:
Data: Contains SAS datasets, with a .sd2 extension.
Formats: Compiled version of formats, a file with a .sc2 extension. Used for building classes of variables for looking at frequencies.
Output: Save output files here. These are text files with a .lst extension.
Programs: All programs are text files with a .sas extension.
Creating a dataset
Internal Data
DATA datasetname;
  INPUT name $ sex $ age;
  CARDS;
John M 23
Betty F 33
Joe M 50
;
RUN;
Creating a dataset
External Data
DATA datasetname;
  INFILE 'c:\folder\subfolder\file.txt';
  INPUT name $ sex $ age;
RUN;
Creating from an existing one
DATA save.data2 (KEEP = age income);
  SET save.data1;
RUN;

DATA save.data2;
  SET save.data1;
  DROP age;
  TAX = income*0.28;
RUN;
Permanent Data Sets
LIBNAME save 'c:\project\data';
DATA save.data1;
  X = 25;
  Y = X*2;
RUN;
Note that save is merely a name you make up to point to the location where you wish to save the dataset called data1. (It will be saved as data1.sd2.)
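To make the role of the libref concrete, the sketch below writes one temporary and one permanent dataset. It assumes the folder c:\project\data already exists; the dataset names temp1 and perm1 are made up for illustration.

```sas
LIBNAME save 'c:\project\data';

/* One-level name: stored in the temporary WORK library and
   deleted when the SAS session ends. */
DATA temp1;
  X = 25;
  Y = X*2;
RUN;

/* Two-level name: stored permanently in the folder that
   the libref save points to. */
DATA save.perm1;
  SET temp1;
RUN;
```

Only the dataset written with the two-level name is still available after you close and reopen SAS.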
What’s in my SAS dataset?
PROC CONTENTS data=save.data1;
RUN;

PROC CONTENTS data=save.data1 POSITION;
RUN;
The POSITION option prints the variable list sorted alphabetically, followed by a second list sorted by position (the sequence in which the variables actually exist in the file).
Viewing file contents
PROC PRINT data=save.data1;
RUN;

PROC PRINT data=save.data1 (obs=5);
  VAR name age;
RUN;

PROC PRINT data=save.data1 (obs=12);
  VAR age -- income;
RUN;
Frequencies/Crosstabs
PROC FREQ data=save.data1;
  TABLES age income trades;
RUN;

PROC FREQ data=save.data1;
  TABLES age*sex;
RUN;
Scatter Plot
PROC PLOT data=save.data1;
  PLOT Y*X;
RUN;
Creating a Format Library
PROC FORMAT LIBRARY=LIBRARY;
  VALUE BG
     0 = 'BAD'
     1 = 'GOOD'
    -1 = 'MISSING'
  ;
  VALUE TWO
    -1 = 'MISSING'
    -2 = 'NO RECORD'
    -3 = 'INQS. ONLY'
    -4 = 'PR ONLY'
     0 = '0'
     1 = '1'
     1<-HIGH = '2+'
  ;
RUN;
Applying a format to a variable
PROC DATASETS library=save;
  MODIFY data1;
  FORMAT trades ten.;
RUN;
QUIT;
This applies the format called ten to the variable trades. A subsequent PROC FREQ statement for trades will show the format applied. Note that ten must already exist in the format library for this to work.
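As a sketch of the full sequence, the example below first defines a format named ten, then applies it and checks the result with PROC FREQ. The ranges and labels for ten are hypothetical, since its definition is not given here, and the folder c:\project\formats is an assumed location for the format catalog. LIBRARY=LIBRARY needs a libref named library to be assigned before PROC FORMAT runs.

```sas
/* Assumed locations; adjust to your own project folders. */
LIBNAME library 'c:\project\formats';
LIBNAME save 'c:\project\data';

/* Hypothetical definition of ten: ranges and labels are illustrative only. */
PROC FORMAT LIBRARY=LIBRARY;
  VALUE ten
    LOW -< 0  = 'SPECIAL CODE'
    0 -< 10   = '0-9'
    10 - HIGH = '10+'
  ;
RUN;

/* Attach the format to trades permanently in save.data1. */
PROC DATASETS library=save;
  MODIFY data1;
  FORMAT trades ten.;
RUN;
QUIT;

/* The frequency table now shows the format labels instead of raw values. */
PROC FREQ data=save.data1;
  TABLES trades;
RUN;
```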
Applying a format: Method 2
DATA save.data2;
  SET save.data1;
  FORMAT trades bktrds ten. totbal mileage.;
RUN;
This is another way to apply formats when creating a new dataset (data2) from a previous one (data1) that has unformatted variables.
Random Selection of Obs.
DATA save.new;
  SET save.old;
  Random1 = RANUNI(254987)*100;
  IF Random1 > 50 THEN OUTPUT;
RUN;
The function RANUNI requires a seed number and then produces random values between 0 and 1. Multiplying by 100 gives a value between 0 and 100, stored under the variable name Random1 (you can choose any name). The above program will create new.sd2, with about half the observations of old.sd2, randomly chosen.
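The same idea works for other sampling fractions by changing the cutoff. A minimal sketch, assuming you want roughly 30% of the observations; the dataset name sample30 is made up for illustration:

```sas
DATA save.sample30;
  SET save.old;
  /* RANUNI returns a uniform value between 0 and 1;
     a fixed seed (254987) makes the selection repeatable. */
  IF RANUNI(254987) < 0.30 THEN OUTPUT;
RUN;
```

Because each observation is kept or dropped independently, the result holds about 30% of save.old, not exactly 30%.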
Sorting and Merging Datasets
PROC SORT data=save.junk;
  BY Age Income;
RUN;

PROC SORT data=save.junk OUT=save.neat;
  BY acctnum;
RUN;

PROC SORT data=save.junk NODUPKEY;
  BY something;
RUN;
Sorting and Merging Datasets
PROC SORT data=save.one;
  BY Acctnum;
RUN;

PROC SORT data=save.two;
  BY Acctnum;
RUN;

DATA save.three;
  MERGE save.one save.two;
  BY Acctnum;
RUN;
Sorting and Merging Datasets
DATA save.three;
  MERGE save.one (IN = a) save.two;
  BY Acctnum;
  IF a;
RUN;
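The IN= option creates a temporary 0/1 variable (here a) that equals 1 when the current observation has a contribution from that dataset, so IF a; keeps every account in save.one whether or not it has a match in save.two. The sketch below shows two other common variations. It assumes both datasets are already sorted by Acctnum; the output dataset names both and only_one are made up for illustration.

```sas
DATA save.both save.only_one;
  MERGE save.one (IN = a) save.two (IN = b);
  BY Acctnum;
  /* Accounts found in both datasets. */
  IF a AND b THEN OUTPUT save.both;
  /* Accounts found in save.one but not in save.two. */
  ELSE IF a AND NOT b THEN OUTPUT save.only_one;
RUN;
```

The variables a and b exist only while the step runs and are not written to the output datasets.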
Using Arrays
DATA save.new;
  SET save.old;
  ARRAY vitamin(6) a b c d e k;
  DO i = 1 TO 6;
    IF vitamin(i) = -5 THEN vitamin(i) = .;
  END;
RUN;
This assumes you have 6 variables called a, b, c, d, e, and k in save.old. This program will modify all 6 such that any instance of a -5 value is converted to a missing value.
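A slightly more defensive variation of the same step is sketched below. It uses DIM so the loop bound follows the array size, and it drops the loop counter i, which would otherwise be stored as an extra variable in save.new. The variable names are the same assumed a, b, c, d, e, and k.

```sas
DATA save.new;
  SET save.old;
  ARRAY vitamin(6) a b c d e k;
  /* DIM returns the number of elements in the array, 6 here. */
  DO i = 1 TO DIM(vitamin);
    IF vitamin(i) = -5 THEN vitamin(i) = .;
  END;
  /* Keep the loop counter out of the output dataset. */
  DROP i;
RUN;
```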
Simple Correlations
PROC CORR data=save.relative;
  VAR tvhours study;
RUN;

PROC CORR data=save.relative;
  VAR tvhours study;
  WITH Score;
RUN;