proc transpose a simple tutorial by charles patridge the hartford 860-547-6644...
TRANSCRIPT
Proc Transpose A Simple Tutorial
By
Charles Patridge
The Hartford
860-547-6644
http://www.sconsig.com
A Simple Transposition
The Input Data Set 1OBS # Tester1 Tester2 Tester3 Tester4
1 22 25 21 21
2 15 19 18 17
3 17 19 19 19
4 20 19 16 19
5 14 15 13 13
6 15 17 18 19
7 10 11 9 10
8 22 24 23 21
The Output Data Set 2_NAME_ COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8
Tester1 22 15 17 20 14 15 10 22
Tester2 25 19 19 19 15 17 11 24
Tester3 21 18 19 16 13 18 9 23
Tester4 21 17 19 19 13 19 10 21
Columns become Rows
&
Rows become Columns
PROC TRANSPOSE <DATA=input-data-set> <LABEL=label> <LET> <NAME=name> <OUT=output-data-set> <PREFIX=prefix>;
BY <DESCENDING> variable-1 <…<DESCENDING> variable-n> <NOTSORTED>;
COPY variable(s);
ID variable;
IDLABEL variable;
VAR variable(s);
OptionsDATA= input-data-set names the SAS data set to transpose.Default: most recently created SAS data set LABEL= label specifies a name for the variable in the output data set that contains the label of the variable that is being transposed to create the current observation.Default: _LABEL_ LET allows duplicate values of an ID variable. PROC TRANSPOSE transposes the observation containing the last occurrence of a particular ID value within the dataset or BY group. NAME= name specifies the name for the variable in the output data set that contains the name of the variable being transposed to create the current observation.Default: _NAME_
Options - continued
OUT= output-data-set names the output data set. If output-data-set does not exist, PROC TRANSPOSE creates it using the DATAn naming convention.
Default: DATAn
PREFIX= prefix specifies a prefix to use in constructing names for transposed variables in the output data set. For example, if PREFIX=VAR, the names of the variables are VAR1, VAR2, . . . ,VARn.
Interaction: when you use PREFIX= with an ID statement, the value prefixes to the ID value.
Note: If a BY group in the input data set has more observations than other BY groups, PROC TRANSPOSE assigns missing values in the output data set to the variables that have no corresponding input observations.
Proc Transpose A simple example
proc format; value mymths
1="JAN" 2="FEB" 3="MAR" 4="APR" 5="MAY" 6="JUN”
7="JUL" 8="AUG" 9="SEP" 10="OCT" 11="NOV" 12="DEC"; run;
Data rawdata; infile cards missover;
input @1 gender $1. @3 date date9. amount;
year = year(date); month = month(date);
cards;
F 01Feb2003 123
M 01Mar2003 57
F 01Mar2003 121
M 01Apr2003 63
;;;;
run;
Proc Transpose – A simple example - continued
proc sort data=rawdata out=rawdata; by gender year; run;
proc transpose data=rawdata out=tranpose (drop=_name_);
by gender year;
id month ;
format month mymths3.;
var amount;
run;
proc print;
run;
Proc Transpose A simple example
Obs gender year FEB MAR APR
1 F 2003 123 121 .
2 M 2003 . 57 63
F 01Feb2003 123 RAWDATA
M 01Mar2003 57
F 01Mar2003 121
M 01Apr2003 63
Proc TransposeA simple example - continued
proc transpose data=tranpose out=rawout;
by gender year; var feb mar apr;
run;
data rawout; set rawout;
tmpdate = '01'||trim(_name_)||put(year,4.);
date = input(tmpdate,date9.);
amount = col1;
if amount = . then amount = 0;
drop tmpdate _name_ col1; format date date9.;
run;
proc print data=rawout; run;
Proc TransposeA simple example
Obs gender year date amount
1 F 2003 01FEB2003 123
2 F 2003 01MAR2003 121
3 F 2003 01APR2003 0
4 M 2003 01FEB2003 0
5 M 2003 01MAR2003 57
6 M 2003 01APR2003 63
Proc TransposeCan create records for missing OBS
BUT only inclusive of existing end points!
Proc TransposeMore Uses
/*** What if you need to have a full matrix of Data ***/
data template (keep=gender date amount year);
length gender $ 1.; format date date9.;
tmpdate = '01dec2002'd; amount = 0;
do g = 1 to 2;
do d = 1 to 12;
if g = 1 then gender = 'F';
if g = 2 then gender = 'M';
date = intnx('month', tmpdate, d );
output;
end;
end;
run; continued….
proc sort data=rawdata out=rawdata ; by gender date; run;
proc sort data=template out=template; by gender date; run;
/*** merge template with raw data to create a full matrix of data ***/
data rawdataa;
merge template (in=t)
rawdata (in=r);
by gender date;
year = year(date);
month= month(date);
run; continued….
/*** now you would like to take vertical data to make horizontal data ***/
proc transpose data=rawdataa out=tranpose;
by gender year;
id month ;
format month mymths3.;
var amount;
run;
/*** print to see what data now looks like ***/
proc print data=tranpose; run;
Obs gender year _NAME_ JAN FEB MAR APR MAY JUN JUL AUG SEP …….
1 F 2003 amount 0 123 121 0 0 0 0 0 0 ….
2 M 2003 amount 0 0 57 63 0 0 0 0 0 ….
You can see we have created buckets for all the months of the year – JAN thru DEC
Now, let’s do some simple math for percentages, ytd #’s and ytd %’s.
Continued from prior data step
data rawdatab; set tranpose; array mth(*) jan feb mar apr may jun jul aug sep oct nov dec ; array pct(*) janpct febpct marpct aprpct maypct junpct julpct augpct seppct octpct novpct decpct; array ytd(*) janytd febytd marytd aprytd mayytd junytd julytd augytd sepytd octytd novytd decytd; array ytdp(*) janytp febytp marytp aprytp mayytp junytp julytp augytp sepytp octytp novytp decytp;
Continued…
Continued …
/*** get yearly totals ***/
total = sum(of jan--dec);
/*** get monthly percentages to yearly totals***/
do i = 1 to 12; pct(i) = mth(i) / total; end;
/*** get ytd totals ***/
do i = 1 to 12;
if i = 1 then ytd(i) = mth(i);
else ytd(i) = mth(i) + ytd(i-1);
end;
/*** get ytd percentages ***/
do i = 1 to 12; ytdp(i) = ytd(i) / total; end;
drop i;
run; continued…
Continued …
/*** convert horizontal data to vertical data ***/
proc transpose data=rawdatab out=tranposeb;
by gender year;
var _numeric_;
run;
proc print data=tranposeb; run;
Obs gender year _NAME_ amt
1 F 2003 year 2003.00
2 F 2003 JAN 0.00
3 F 2003 FEB 123.00
4 F 2003 MAR 121.00
5 F 2003 APR 0.00
6 F 2003 MAY 0.00
7 F 2003 JUN 0.00
8 F 2003 JUL 0.00
9 F 2003 AUG 0.00
10 F 2003 SEP 0.00
11 F 2003 OCT 0.00
12 F 2003 NOV 0.00
13 F 2003 DEC 0.00
Obs gender year _NAME_ amt
14 F 2003 janpct 0.00
15 F 2003 febpct 0.50
16 F 2003 marpct 0.50
17 F 2003 aprpct 0.00
18 F 2003 maypct 0.00
19 F 2003 junpct 0.00
20 F 2003 julpct 0.00
21 F 2003 augpct 0.00
22 F 2003 seppct 0.00
23 F 2003 octpct 0.00
24 F 2003 novpct 0.00
25 F 2003 decpct 0.00
continued….
Continued … output
Continued … output
Obs gender year _NAME_ amt
26 F 2003 janytd 0.00
27 F 2003 febytd 123.00
28 F 2003 marytd 244.00
29 F 2003 aprytd 244.00
30 F 2003 mayytd 244.00
31 F 2003 junytd 244.00
32 F 2003 julytd 244.00
33 F 2003 augytd 244.00
34 F 2003 sepytd 244.00
35 F 2003 octytd 244.00
36 F 2003 novytd 244.00
37 F 2003 decytd 244.00
Obs gender year _NAME_ amt
38 F 2003 janytp 0.00
39 F 2003 febytp 0.50
40 F 2003 marytp 1.00
41 F 2003 aprytp 1.00
42 F 2003 mayytp 1.00
43 F 2003 junytp 1.00
44 F 2003 julytp 1.00
45 F 2003 augytp 1.00
46 F 2003 sepytp 1.00
47 F 2003 octytp 1.00
48 F 2003 novytp 1.00
49 F 2003 decytp 1.00
50 F 2003 total 244.00
Let’s Process this Last Transposed Dataset
data rawdatac;
set tranposeb;
length month $ 3. ;
_name_ = upcase(_name_); /*** make contents upper case ***/
if _name_ = "YEAR" then delete; /*** do not need this ***/
if _name_ = "TOTAL" then delete; /*** do not need this ***/
month = substr(_name_,1,3); /*** Get Name of Month ***/
if ' ' = substr(_name_,4,3) then mthly = amt; /*** get monthly amts ***/
if 'YTD' = substr(_name_,4,3) then YTD = amt; /*** get YTD amts ***/
if 'PCT' = substr(_name_,4,3) then PCT = amt; /*** get PCT amts ***/
if 'YTP' = substr(_name_,4,3) then YTDP = amt; /*** get YTD PCT amts ***/
/*** convert date field to sas date field ***/
tmpdate = '01' || month || put(year,4.);
date = input( tmpdate, date9.);
run; continued….
Continued
/*** summarize over the new variables created ***/
proc summary data=rawdatac nway missing;
classes gender date;
var mthly pct ytd ytdp;
output out=summary (drop=_type_ _freq_)
sum=;
run;
proc print;
format date date7.;
run;
Final Output
Obs gender date mthly PCT YTD YTDP
1 F 01JAN03 0 0.00000 0 0.00000
2 F 01FEB03 123 0.50410 123 0.50410
3 F 01MAR03 121 0.49590 244 1.00000
4 F 01APR03 0 0.00000 244 1.00000
5 F 01MAY03 0 0.00000 244 1.00000
6 F 01JUN03 0 0.00000 244 1.00000
7 F 01JUL03 0 0.00000 244 1.00000
8 F 01AUG03 0 0.00000 244 1.00000
9 F 01SEP03 0 0.00000 244 1.00000
10 F 01OCT03 0 0.00000 244 1.00000
11 F 01NOV03 0 0.00000 244 1.00000
12 F 01DEC03 0 0.00000 244 1.00000
13 M 01JAN03 0 0.00000 0 0.00000
14 M 01FEB03 0 0.00000 0 0.00000
15 M 01MAR03 57 0.47500 57 0.47500
…
Proc Transpose A Simple Tutorial
The End
By
Charles Patridge
The Hartford
860-547-6644
http://www.sconsig.com