Tabulations in SAS with Time Series –
A Perspective
Presenters
Karine Désilets, Statistics Canada, Ottawa, Canada.
Jun Li, Statistics Canada, Ottawa, Canada.
Abstract
Presenting efficient ways to calculate growth rates and include Seasonal Data,
Raking and Benchmarking in Tabulation Tools. The pros and cons of tabulating with
Proc Means, Proc Summary, Proc Tabulate and Proc Report will also be compared.
Telling Canada’s story in numbers
Karine Désilets
Jun Li
System Engineering Division
Statistics Canada
May, 2018
www.statcan.gc.ca
Today’s Topics
40 minutes, with the presentation of:
Our Generalized Tabulation Tools
Some Procs for Tabulations
Growth Rate Calculations
Seasonal Adjustment with X-12-ARIMA, 2 Days Class, H-0434
Theory and Application of Benchmarking, 2 Days Class, H-0436
Theory and Application of Raking for Time Series, 2 Days Class, H-0437
Agenda
Introduction to Time Series and Tabulation Tools
Econometrics with SAS/ETS
Background and Strategic Fit of Unadjusted/Seasonally Adjusted Data
Economic World: X12-ARIMA, Raking and Benchmarking
Pros / Cons :
Means, Summary, Tabulate, Report Procs
New Generation: Threaded Means Proc with Viya and CAS
Growth Rate / Percentage Change
Conclusion
What is a Time Series? (Industry)
Source: Twitter, Visme
What is a Time Series? (Key Indicators)
Source: Statistics Canada Official Web Site
What is a Time Series? (Cansim)
Source: Statistics Canada Official Web Site - Cansim
Frequencies shown: Annual, Quarterly, Monthly
What is a Tabulation Tool?
Input Microdata
ID  SEX  AGE GROUP  WEIGHT  INCOME
1   0    1          5       0
2   0    2          10      900
3   0    3          15      -5
4   0    4          1       3
5   1    1          5       6
6   1    2          5       10
7   1    3          12      4
8   1    4          1       2
XML Input File
Metadata Definitions
TABULATION
Injection File
Categorical Data
Dimensions
Weighted / Unweighted
Statistics to Compute
Confidentiality / Rounding
Tabulated Data
SEX  AGE GROUP  SUM(WEIGHT)  Weighted SUM(INCOME)
.    .          54           9058
0    .          31           8928
1    .          23           130
.    1          10           30
.    2          15           9050
.    3          27           -27
.    4          2            5
0    1          5            0
0    2          10           9000
0    3          15           -75
0    4          1            3
1    1          5            30
1    2          5            50
1    3          12           48
1    4          1            2
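The tabulated figures on this slide can be reproduced from the eight microdata records. The following is a minimal sketch in Python rather than SAS, and it is not the Generalized Tabulation Tool itself; the function name `tabulate` and the use of `None` for the "." (all categories) marker are assumptions made for illustration.

```python
from itertools import combinations

# Microdata from the slide: (ID, SEX, AGE_GROUP, WEIGHT, INCOME)
MICRODATA = [
    (1, 0, 1, 5, 0),
    (2, 0, 2, 10, 900),
    (3, 0, 3, 15, -5),
    (4, 0, 4, 1, 3),
    (5, 1, 1, 5, 6),
    (6, 1, 2, 5, 10),
    (7, 1, 3, 12, 4),
    (8, 1, 4, 1, 2),
]

# Dimension name -> column index in a microdata record
DIMENSIONS = {"SEX": 1, "AGE_GROUP": 2}


def tabulate(rows, dimensions):
    """Return {(sex, age_group): (sum_weight, weighted_sum_income)}.

    None in a key stands for "." on the slide (all categories combined).
    """
    names = list(dimensions)
    table = {}
    # Every subset of the dimensions is one level of aggregation.
    for size in range(len(names) + 1):
        for kept in combinations(names, size):
            for row in rows:
                key = tuple(
                    row[dimensions[n]] if n in kept else None for n in names
                )
                weight, income = row[3], row[4]
                sum_w, sum_wi = table.get(key, (0, 0))
                table[key] = (sum_w + weight, sum_wi + weight * income)
    return table


TABLE = tabulate(MICRODATA, DIMENSIONS)
print(TABLE[(None, None)])  # grand total: (54, 9058)
print(TABLE[(0, None)])     # SEX = 0:     (31, 8928)
```

The 15 keys of `TABLE` correspond to the 15 rows of the tabulated data: one grand total, two rows by sex, four by age group and eight by sex and age group.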
Injection File
From continuous microdata there is a need to create categories.
Example: Age to Age Group
Age        Description                Group
0 - 17     Persons under 18 years     1
18 - 64    Persons 18 to 64 years     2
65 and up  Persons 65 years and over  3
.          Not Responded              4
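The injection step above amounts to a lookup from a continuous value to a category code. A minimal Python sketch, assuming age is held as a whole number of years and a missing response as `None` (both are assumptions for illustration; the function name `age_group` is hypothetical):

```python
def age_group(age):
    """Map an age in years to the group code used in the example table."""
    if age is None:   # "." in the table: not responded
        return 4
    if age <= 17:     # persons under 18 years
        return 1
    if age <= 64:     # persons 18 to 64 years
        return 2
    return 3          # persons 65 years and over


GROUPS = [age_group(a) for a in (5, 17, 18, 64, 65, None)]
print(GROUPS)  # [1, 1, 2, 2, 3, 4]
```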
Tabulation Tool: Current State
A Generalized Tabulation Tool has been developed for Social Surveys,
Administrative Data and, soon, Census Data:
Create Tabulated Data Tables;
Calculate Precision Measures;
Apply confidentiality rules and/or rounding consistently across data sources;
Disseminate output or custom products for internal and/or external clients
Dynamically produce tabulation for a specific period or for time series
Unadjusted Data at :
Annual level
Infra annual level (quarters or months)
Including socio-economic and economic fields would require Seasonally
Adjusted Data to be created and then tabulated.
Statistics Calculations Currently Available
Level – 1 Statistics
All Statistics available in Proc Means.
Examples:
Sum, median, percentile, max,
min, weighted sum, count
Level – 2 Statistics
Gini
Geomean
Level – 3 Statistics
Ratio
Share
Distribution
Level – 4 Statistics
Moving Average
Level – 5 Statistics
Level Change
Percentage Change
Significance Test
What is a Tabulation Tool for Time Series?
Input Categorical Data
Year  ID  SEX  AGE GROUP  WEIGHT  INCOME
2001  1   0    1          5       0
2001  2   0    2          10      900
2001  3   0    3          15      -5
2001  4   0    4          1       3
2001  5   1    1          5       6
2001  6   1    2          5       10
2001  7   1    3          12      4
2001  8   1    4          1       2
2002  1   0    1          5       0
2002  2   0    2          10      900
2002  3   0    3          15      -5
2002  4   0    4          1       3
2002  5   1    1          5       6
2002  6   1    2          5       10
2002  7   1    3          12      4
2002  8   1    4          1       2
Year  Month  ID  SEX  AGE GROUP  WEIGHT  INCOME
2001  1      1   0    1          5       0
2001  2      2   0    2          10      516
2001  3      3   0    3          15      -4
2002  1      1   0    4          1       2
2002  2      2   1    1          5       6
2002  3      3   1    2          5       9
2003  1      1   1    3          12      0
2003  2      2   1    4          1       1
2003  3      3   0    1          5       0
2004  1      1   0    2          10      323
2004  2      2   0    3          15      -4
2004  3      3   0    4          1       1
2005  1      1   1    1          5       4
2005  2      2   1    2          5       9
2005  3      3   1    3          12      4
Quarter?
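A quarter can be derived from the month, so monthly records can be tabulated at the quarterly level without a separate input column. A minimal Python sketch, assuming calendar quarters; the names `quarter_of` and `tabulate_by_quarter` are hypothetical, and the records are the first six rows of the monthly table reduced to (Year, Month, WEIGHT, INCOME):

```python
def quarter_of(month):
    """Return the calendar quarter (1 to 4) of a month (1 to 12)."""
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month}")
    return (month - 1) // 3 + 1


def tabulate_by_quarter(records):
    """Sum weight and weighted income per (year, quarter)."""
    table = {}
    for year, month, weight, income in records:
        key = (year, quarter_of(month))
        sum_w, sum_wi = table.get(key, (0, 0))
        table[key] = (sum_w + weight, sum_wi + weight * income)
    return table


# (Year, Month, WEIGHT, INCOME) from the first six rows of the monthly table
MONTHLY = [
    (2001, 1, 5, 0),
    (2001, 2, 10, 516),
    (2001, 3, 15, -4),
    (2002, 1, 1, 2),
    (2002, 2, 5, 6),
    (2002, 3, 5, 9),
]

QUARTERLY = tabulate_by_quarter(MONTHLY)
print(QUARTERLY[(2001, 1)])  # (30, 5100)
```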
Econometrics Calculations with SAS/ETS
SAS/ETS software, a component of the SAS System, provides SAS
procedures for:
econometric analysis
time series analysis
time series forecasting
systems modeling and simulation
discrete choice analysis
analysis of qualitative and limited dependent variable models
seasonal adjustment of time series data
financial analysis and reporting
access to economic and financial databases
time series data management
Looking for Time Series related calculations?
The answer is most likely in the documentation.
Economic Calculations with SAS/EG
Background and Strategic Fit
Time Series are part of everyday life and drive the economy.
Tabulation tools already exist and can support Unadjusted Time Series :
• Annual level
• Infra annual level (Quarters or Months)
Next Steps:
• Unadjusted Data
• Seasonally Adjusted Data (PROC X12-ARIMA)
• Raking
• Benchmarking
• Incorporation of Growth Rates (Percentage Change) as a Statistic
• Which PROC to use to Tabulate ?
Could we tabulate Time Series microdata using basic Time Series concepts?
Briefly: What is Seasonally Adjusted Data?
Unadjusted Series:
Trend-Cycle
Seasonal Component
Trading-Day/Easter Effects
Irregular
Seasonally Adjusted Series :
Combination of trend-cycle and
irregular components
Estela Dagum from Statistics Canada created the X11-ARIMA method in the 1970s.
[Figure: GDP at market prices, quarterly, Q1 2007 to Q1 2017; vertical axis from 350000 to 550000]
Source of graph: Statistics Canada, CSMA foundations, module 8 extension, with the approval of
Jim Tebrake.
How to Produce Seasonally Adjusted Data?
With PROC X-12-ARIMA:
The U.S. Census Bureau method is called X-13-ARIMA-SEATS*.
The main goal is to apply moving averages to the calendar-adjusted time series to
smooth out the seasonal fluctuations*.
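To illustrate how a moving average smooths out seasonal fluctuations, here is a minimal Python sketch of a 2x4 centered moving average for quarterly data. It is one of the simplest filters of this kind and is not the full X-12-ARIMA procedure, which chains several more elaborate filters. The series is synthetic and the function name `centered_ma_2x4` is hypothetical.

```python
def centered_ma_2x4(series):
    """Apply a 2x4 centered moving average to a quarterly series.

    The weights 1/8, 1/4, 1/4, 1/4, 1/8 span exactly one year, so a stable
    seasonal pattern averages out. The first two and last two points have
    no full window and are returned as None.
    """
    weights = (0.125, 0.25, 0.25, 0.25, 0.125)
    smoothed = [None] * len(series)
    for t in range(2, len(series) - 2):
        window = series[t - 2 : t + 3]
        smoothed[t] = sum(w * x for w, x in zip(weights, window))
    return smoothed


# Synthetic quarterly series: linear trend plus a repeating seasonal pattern
SEASONAL = (1.0, -2.0, 3.0, -2.0)  # sums to zero over one year
SERIES = [10.0 + t + SEASONAL[t % 4] for t in range(12)]
SMOOTHED = centered_ma_2x4(SERIES)

for t, (raw, smooth) in enumerate(zip(SERIES, SMOOTHED)):
    print(t, raw, smooth)
```

On this synthetic series the smoothed values recover the linear trend `10 + t` wherever a full window is available, because the seasonal pattern cancels over any four consecutive quarters.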
Our objective today is to have a system that can handle X12-ARIMA with
Tabulations.
Options:
1. Seasonally adjust the raw microdata directly and tabulate (Direct Bottom-up
Approach); caveat: multiplicative model with zero values
2. Seasonally adjust the lowest cuboid of the Tabulated Data and aggregate to
higher levels (Semi Bottom-up Approach)
3. Seasonally adjust each cuboid of the Tabulated Data (Loss of Additivity)
* Extract of: Canadian System of Macroeconomic Accounts (CSMA), Module 8
SEATS = Signal Extraction in ARIMA Time Series
How to Produce Seasonally Adjusted Data?
With PROC X12:
Source: https://support.sas.com/documentation/onlinedoc/ets/132/x12.pdf
Proc X12 Results and Database
Components to store in the Dataset :
Date
Dimensions (ex: Province, Sex, Age Group)
Weighted / Unweighted
Raw Statistics
Seasonally Adjusted Statistics
Trend-Cycle
Seasonal Component
Trading-Day/Easter Effects
Irregular
Tabulations and Seasonally Adjusted Data
Computational Lattice of Cuboids
(Sex, Age Group, Province)
(Sex, Age Group)   (Sex, Province)   (Age Group, Province)
(Sex)   (Age Group)   (Province)
()
Tabulated Data
Date    Sex  Age group  Province  Types  SUM(Sales)  SEAS SUM(Sales)
JAN-18  .    .          .         000    9058        Computed from option 1, 2 or 3
JAN-18  0    .          .         100    8928        …
JAN-18  1    .          .         100    130         …
JAN-18  .    1          .         010    30          …
JAN-18  .    2          .         010    9050        …
JAN-18  .    3          .         010    -27         …
JAN-18  .    4          .         010    5           …
JAN-18  .    .          Quebec    001    0           …
JAN-18  .    .          Quebec    001    9000        …
JAN-18  .    .          Quebec    001    -75         …
JAN-18  .    .          Quebec    001    3           …
JAN-18  .    .          Quebec    001    30          …
JAN-18  .    .          Quebec    001    50          …
JAN-18  .    .          Quebec    001    48          …
…       …    …          …         …      …
What will we do in the case of a hierarchy?
Country Region Province City
O(n) where n is the
number of dimensions
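The lattice shown for Sex, Age Group and Province has eight cuboids, one per subset of the three dimensions, and the Types column of the tabulated data flags which dimensions a row keeps. A minimal Python sketch that enumerates them; it ignores hierarchies such as Country / Region / Province / City, and the function name `cuboids` is hypothetical:

```python
from itertools import combinations

DIMS = ("Sex", "Age Group", "Province")


def cuboids(dims):
    """Return a list of (types_flag, kept_dimensions), one per cuboid.

    The flag has one character per dimension: "1" if the dimension is kept,
    "0" if it is aggregated over, as in the Types column of the slide.
    """
    result = []
    for size in range(len(dims) + 1):
        for kept in combinations(dims, size):
            flag = "".join("1" if d in kept else "0" for d in dims)
            result.append((flag, kept))
    return result


LATTICE = cuboids(DIMS)
for flag, kept in LATTICE:
    print(flag, kept)
```

Without hierarchies, each additional dimension doubles the number of cuboids, which matters when every cuboid has to be seasonally adjusted or raked separately.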
What is Raking? When to Apply Raking?
Also known as Reconciliation, Balancing, Spreading or Dispersing.
Simplest Form
Province AppleSold
PEI 1,850
Nova Scotia 548
New Brunswick 761
Newfoundland 4,091
Quebec 8,871
Ontario 34,333
Manitoba 5,866
Saskatchewan 13,632
Alberta 13,096
British Columbia 14,624
Nunavut 21
NWT 292
Yukon 4
CANADA TOTAL 97,990
New Total: 103,500
Formula :
\hat{c}_i = c_i \times \frac{\text{Bench}}{\sum_j c_j}
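The simplest form of raking is a proration: every component is scaled by the ratio of the new total to the sum of the components. A minimal Python sketch applied to the provincial figures above; the function name `prorate` is hypothetical, and the sketch divides by the sum of the listed figures rather than by the printed Canada total.

```python
# Provincial figures from the slide
APPLES_SOLD = {
    "PEI": 1850,
    "Nova Scotia": 548,
    "New Brunswick": 761,
    "Newfoundland": 4091,
    "Quebec": 8871,
    "Ontario": 34333,
    "Manitoba": 5866,
    "Saskatchewan": 13632,
    "Alberta": 13096,
    "British Columbia": 14624,
    "Nunavut": 21,
    "NWT": 292,
    "Yukon": 4,
}
NEW_TOTAL = 103500


def prorate(components, benchmark):
    """Scale every component by benchmark / sum(components)."""
    total = sum(components.values())
    return {name: value * benchmark / total for name, value in components.items()}


RAKED = prorate(APPLES_SOLD, NEW_TOTAL)
for province, value in RAKED.items():
    print(f"{province:18s} {value:12.2f}")
print(f"{'TOTAL':18s} {sum(RAKED.values()):12.2f}")
```

Proration preserves the share of each province while forcing the components to add up to the new total; as with the other raking results, the output would still need to be rounded.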
Results of One-Dimensional Raking
Two types of constraint:
Binding Total vs
Non-binding
18/05/2018
Quick Example: One-dimensional Raking
Initial table
              East  Center  West  Sum  Control total (Canada)
Q1            12    14      13    39   40
Q2            10    9       15    34   25
Q3            12    8       17    37   40
Q4            9     9       14    32   37
Annual Total  43    40      59
Result
              East   Center  West   Control total (Canada)
Q1            11.34  14.82   13.84  40
Q2            10     5.59    9.41   25
Q3            12.02  8.92    19.06  40
Q4            9.64   10.67   16.69  37
Annual Total  43     40      59
Source: Statistics Canada - Proc Ts-Raking Course Notes: An in-house SAS procedure for
Balancing Time Series
Quick Example: Two-dimensional Raking
Initial table
               East  Center  West  Sum  Control total (Canada)
Cars           12    14      13    39   40
Vans           20    20      24    64   53
Sum            32    34      37
Control total  30    31      32
Result
               East   Center  West  Control total (Canada)
Cars           12.72  14.38   12.9  40
Vans           17.28  16.62   19.1  53
Control total  30     31      32
Results need to be rounded.
Source: Statistics Canada - Proc Ts-Raking Course Notes: An in-house SAS procedure for
Balancing Time Series
How to do Raking?
1. Manual Adjustments (based on subject-matter expertise)
2. Iterative Proportional Fitting, a.k.a. RAS (the basic algorithm), from the 1960s-70s
3. PROC OPTMODEL – SAS/OR with equations and linear constraints.
4. PROC TS-RAKING – Statistics Canada Generalized System
5. Macro GSeriesTSBalancing – Statistics Canada Generalized System
6. A Prorate on FAME - Forecasting Analysis and Modeling Environment from
SunGard
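Of the options above, Iterative Proportional Fitting is the easiest to sketch: rows and columns are alternately scaled to their control totals until both sets of margins are met. The following minimal Python sketch applies it to the Cars / Vans table of the two-dimensional example. The function name `ipf` and its tolerance parameters are assumptions for illustration, and the cell values it produces need not coincide with the PROC TS-RAKING result printed on the slide, since the procedures are different methods.

```python
def ipf(table, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rake a table of positive cells to row and column control totals."""
    cells = [[float(x) for x in row] for row in table]
    for _ in range(max_iter):
        # Step 1: scale each row to its control total.
        for i, target in enumerate(row_targets):
            row_sum = sum(cells[i])
            cells[i] = [x * target / row_sum for x in cells[i]]
        # Step 2: scale each column to its control total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / col_sum
        # Columns are now exact; stop once the rows agree as well.
        if all(abs(sum(cells[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return cells


# Initial table from the slide: rows Cars, Vans; columns East, Center, West
INITIAL = [[12, 14, 13], [20, 20, 24]]
ROW_TOTALS = [40, 53]      # control totals for Cars, Vans
COL_TOTALS = [30, 31, 32]  # control totals for East, Center, West

RAKED_TABLE = ipf(INITIAL, ROW_TOTALS, COL_TOTALS)
for label, row in zip(("Cars", "Vans"), RAKED_TABLE):
    print(label, [round(x, 2) for x in row])
```

The two sets of control totals must agree on the grand total (here 40 + 53 = 30 + 31 + 32 = 93) for the iteration to converge.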
How to use Raking in Tabulation?
Tabulated Data
Date    Sex  Age group  Province  Types  SUM(Sales)  NEW SUM(Sales)
JAN-18  .    .          .         000    9058
JAN-18  0    .          .         100    8928        8500
JAN-18  1    .          .         100    130         140
JAN-18  .    1          .         010    30          30
JAN-18  .    2          .         010    9050        8200
JAN-18  .    3          .         010    -27         400
JAN-18  .    4          .         010    5           10
JAN-18  .    .          Quebec    001    0           …
JAN-18  .    .          Quebec    001    9000        …
JAN-18  .    .          Quebec    001    -75         …
JAN-18  .    .          Quebec    001    3           …
JAN-18  .    .          Quebec    001    30          …
JAN-18  .    .          Quebec    001    50          …
JAN-18  .    .          Quebec    001    48          …
…       …    …          …         …      …
Full Lattice of Cuboids
(Sex, AgeG, Prov)
(Sex, AgeG)   (Sex, Prov)   (AgeG, Prov)
(Sex)   (AgeG)   (Prov)
()
Lattice annotations: one cuboid is labelled AddUp, the other seven are labelled Raked.
What is Benchmarking?
Benchmarking is typically needed when a new benchmark series is provided at the
annual level and the sub-annual estimates need to be adjusted accordingly.
A Prorate or One-Dimensional Raking is not always suitable for a time
series, because it can cause:
A “level-jump” (downward or upward) at the beginning of each year.
An opposite jump at the end of the year.
Benchmarking that avoids these jumps is known as the Denton-Cholette Quadratic Minimization Method.
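The level-jump is easy to reproduce with a year-by-year proration. The Python sketch below uses a synthetic flat quarterly series and two made-up annual benchmarks; the function name `prorate_by_year` is hypothetical. It shows the problem only, not the Denton-Cholette solution.

```python
def prorate_by_year(quarters, benchmarks):
    """Scale each block of four quarters so that it sums to its annual benchmark."""
    adjusted = []
    for year, bench in enumerate(benchmarks):
        block = quarters[4 * year : 4 * year + 4]
        total = sum(block)
        adjusted.extend(q * bench / total for q in block)
    return adjusted


# Synthetic example: a flat quarterly series and two annual benchmarks
QUARTERS = [100.0] * 8
BENCHMARKS = [440.0, 400.0]

ADJUSTED = prorate_by_year(QUARTERS, BENCHMARKS)
JUMP = ADJUSTED[4] - ADJUSTED[3]  # change from Q4 of year 1 to Q1 of year 2

print(ADJUSTED)
print("Jump at the year boundary:", JUMP)
```

The original series has no movement at all between quarters, yet the prorated series drops by 10 between the fourth quarter of the first year and the first quarter of the second. A movement-preserving method instead spreads the correction gradually across the quarters.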
Step-adjustment with the Denton-Cholette Method
[Figure: quarterly series, 2010q1 to 2014q1, vertical axis from 1000 to 1600; legend: Old ann, New ann, Old qtr, New qtr]
Source of graph: Statistics Canada, CSMA foundations, module 8 extension, with the approval of
Jim Tebrake
What to use to apply a Benchmark?
PROC Benchmarking – Statistics Canada Generalized System
Quadminz in FAME - Forecasting Analysis and Modeling Environment from
SunGard
How to use Benchmarking in Tabulation?
New Tabulated Data to the Annual Benchmark
New Provincial Benchmark
Date   Industry  Province  Types  RAW SUM(Sales)  SEAS SUM(Sales)
Q1-14  1001      Quebec    11     9058
Q2-14  1001      Quebec    11     8928            8500
Q3-14  1001      Quebec    11     130             140
Q4-14  1001      Quebec    11     30              30
Q1-15  1001      Quebec    11     9050            8200
Q2-15  1001      Quebec    11     -27             400
Q3-15  1001      Quebec    11     5               10
Q4-15  1001      Quebec    11     0               …
Q1-16  1001      Quebec    11     9000            …
Q2-16  1001      Quebec    11     -75             …
Q3-16  1001      Quebec    11     3               …
Q4-16  1001      Quebec    11     30              …
Q1-17  1001      Quebec    11     50              …
Q2-17  1001      Quebec    11     48              …
Q3-17  …         …         …      …               …
Date  Industry  Province  Types  New Annual
14    1001      Quebec    111    9058
15    1001      Quebec    111    8928
16    1001      Quebec    111    130
17    1001      Quebec    111    30
+ PROC Benchmarking options
=
Original Tabulated Data
Date   Industry  Province  Types  RAW SUM(Sales)  SEAS SUM(Sales)
Q1-14  1001      Quebec    11
Q2-14  1001      Quebec    11
Time Series Dataset:
Summary Statistics:
Time Series and Their Summary Statistics
Candidate Procs
1. Proc Means: creating printed tables of summary statistics;
2. Proc Summary: creating datasets of summary statistics;
3. Proc Tabulate: creating tabular reports of summary
statistics; (reports can be either simple or highly customized
tables.)
4. Proc Report: creating both detail and summary reports
containing both summary statistics and computed data
using “compute” blocks.
Code Examples
1. Means proc:
2. Summary proc:
3. Tabulate proc:
Code Examples (cont.)
4. Report proc:
Comparison of Four Procs
• All four procedures can create reports of summary
statistics with the same standard suite, and output the
reports to SAS datasets.
• By default, the MEANS proc displays output; the SUMMARY proc
does not (it needs the PRINT option to display output).
Comparison of Four Procs (cont.)
• When the VAR statement is omitted, Proc MEANS analyzes all numeric variables that are not listed in
other statements;
Proc SUMMARY produces observation counts only, and does not
work if statistics are specified in the OUTPUT statement.
• Proc TABULATE has more flexibility than the others in
displaying summary statistics within groups in either rows
or columns.
Comparison of Four Procs (cont.)
• REPORT proc is capable of calculating and displaying
information based on other columns (compute block in
proc report). It can provide both detail reporting and
summary reporting.
Comparison of Four Procs (cont.)
• The TABULATE and REPORT procs place more emphasis on
visually displaying summary statistics.
• For summary statistics reports that are output to SAS
datasets, the MEANS and SUMMARY procs are easier to
implement and more appropriate for use in a
generalized tabulation tool.
Observation
Threaded Proc Means with Cloud Analytic Services (CAS)
Source: SAS Cloud Analytic Services 3.1: Fundamentals,
available on documentation.sas.com.
What is a Growth Rate ?
The Economic Times Wrote Statistics Canada
Percentage Change refers to
the actual change between the
old value and the new one,
expressed relative to the old
value.
Formula
\text{Percentage Change}_t = \frac{Value_t - Value_{t-1}}{Value_{t-1}} \times 100
How to calculate Growth Rate in SAS (1)
With a Simple Dataset:
Equivalent Formula:
\left( \frac{Value_t}{Value_{t-1}} - 1 \right) \times 100
How to calculate Growth Rate in SAS (2)
Dataset with group:
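The SAS code shown on these two slides did not survive in the transcript. The logic can be sketched in Python under stated assumptions: the data are sorted by group and period, the previous value is carried forward, and it is reset at each group boundary so that the first period of a group has no growth rate. The function names and the sample figures are made up for illustration.

```python
def pct_change(values):
    """Period-to-period percentage change; None where it is undefined."""
    changes = [None]  # the first period has no previous value
    for prev, cur in zip(values, values[1:]):
        changes.append(None if prev == 0 else (cur - prev) / prev * 100)
    return changes


def pct_change_by_group(records):
    """Growth rate within each group.

    records: (group, period, value) tuples sorted by group, then period.
    Uses the equivalent formula (value / previous - 1) * 100.
    """
    results = []
    prev_group = prev_value = None
    for group, period, value in records:
        if group != prev_group or not prev_value:
            growth = None  # first period of a group, or previous value is zero
        else:
            growth = (value / prev_value - 1) * 100
        results.append((group, period, growth))
        prev_group, prev_value = group, value
    return results


# Simple dataset (synthetic)
SIMPLE = pct_change([100, 110, 99])

# Dataset with group (synthetic)
RECORDS = [("A", 2016, 200), ("A", 2017, 210), ("B", 2016, 50), ("B", 2017, 40)]
GROUPED = pct_change_by_group(RECORDS)

print(SIMPLE)
for row in GROUPED:
    print(row)
```

Resetting at the group boundary is what keeps the first value of group B from being compared with the last value of group A.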
Conclusion
A tabulation tool system is needed for dissemination (Pre-Cansim Data) or Special
Tables. Integration of Seasonally Adjusted Data would be needed, along with statistics
related to Time Series such as:
• Growth Rate / Percent Change
• Period to Period Change
• Year over Year Change
• Data at Annual Rate
• Index, Linked Index
• Laspeyres, Paasche, Fisher, Walsh, Tornqvist, etc.
SAS/EG tasks or custom tasks could be a good way to analyze and explore
those data.
MEANS and SUMMARY are more appropriate procs than TABULATE and
REPORT for use in a generalized tabulation tool.
Questions
References
• Seasonal Adjustment with X-12-ARIMA, Class Notes, H-0434.
• Theory and Application of Benchmarking, Class Notes, H-0436.
• Theory and Application of Raking for Time Series, Class Notes, H-0437.
• Han, J. & Kamber, M. Data Mining: Concepts and Techniques, 499 pages.
• Canadian System of Macroeconomic Accounts, Modules 1 to 8, Statistics Canada.
• SAS website and documentation.
A Special Thanks to :
Philip Smith – Alumni, Statistics Canada – Educator, Mentor, Consultant
Jim Tebrake – Executive Director Macroeconomic Accounts , Statistics Canada
Steve Holder – National Strategy, SAS Canada for SAS Viya + CAS expertise/discussions