academia sinica jan-2015

Rob J Hyndman Visualizing and forecasting big time series data


Post on 16-Jul-2015

[Title-slide figure: kite diagrams of Victorian tourism ("Victoria: scaled"), with columns Holiday, VFR, Business and Other and rows for regions BAA to BEG, over 2000 to 2010.]

Outline

1 Examples of big time series

2 Time series visualisation

3 BLUF: Best Linear Unbiased Forecasts

4 Application: Australian tourism

5 Application: Australian labour market

6 Fast computation tricks

7 hts package for R

8 References

Visualising and forecasting big time series data Examples of big time series 2

1. Australian tourism demand


Quarterly data on visitor nights from 1998:Q1 – 2013:Q4
From: National Visitor Survey, based on annual interviews of 120,000 Australians aged 15+, collected by Tourism Research Australia.
Split by 7 states, 27 zones and 76 regions (a geographical hierarchy)
Also split by purpose of travel
  Holiday
  Visiting friends and relatives (VFR)
  Business
  Other
304 bottom-level series

2. Labour market participation

Australia and New Zealand Standard Classification of Occupations

8 major groups
  43 sub-major groups
    97 minor groups
      359 unit groups
        1023 occupations

Example: statistician
2 Professionals
  22 Business, Human Resource and Marketing Professionals
    224 Information and Organisation Professionals
      2241 Actuaries, Mathematicians and Statisticians
        224113 Statistician


3. PBS sales

ATC drug classification

A Alimentary tract and metabolism
B Blood and blood forming organs
C Cardiovascular system
D Dermatologicals
G Genito-urinary system and sex hormones
H Systemic hormonal preparations, excluding sex hormones and insulins
J Anti-infectives for systemic use
L Antineoplastic and immunomodulating agents
M Musculo-skeletal system
N Nervous system
P Antiparasitic products, insecticides and repellents
R Respiratory system
S Sensory organs
V Various

ATC drug classification, for example:

A Alimentary tract and metabolism (14 classes at this level)
  A10 Drugs used in diabetes (84 classes at this level)
    A10B Blood glucose lowering drugs
      A10BA Biguanides
        A10BA02 Metformin

4. Spectacle sales

Monthly sales data from 2000 – 2014
Provided by a large spectacle manufacturer
Split by brand (26), gender (3), price range (6), materials (4), and stores (600)
About a million bottom-level series
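The counts of bottom-level series quoted in these examples follow from multiplying the number of levels of each splitting factor, assuming every combination occurs. A minimal Python sketch of that arithmetic (the function name `count_bottom_series` is illustrative and not from the talk):

```python
from math import prod

def count_bottom_series(*levels):
    """Number of bottom-level series if every combination of factor levels occurs."""
    return prod(levels)

# Tourism: 76 regions x 4 purposes of travel.
tourism = count_bottom_series(76, 4)

# Spectacles: brand (26) x gender (3) x price range (6) x materials (4) x stores (600).
spectacles = count_bottom_series(26, 3, 6, 4, 600)

print(tourism)     # 304
print(spectacles)  # 1123200, i.e. "about a million"
```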

Hierarchical time series

A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.

Total
  A: AA, AB, AC
  B: BA, BB, BC
  C: CA, CB, CC

Examples
  Net labour turnover
  Pharmaceutical sales
  Tourism by state and region


Grouped time series

A grouped time series is a collection of time series that can be grouped together in a number of non-hierarchical ways.

Total
  A: AX, AY
  B: BX, BY

Total
  X: AX, BX
  Y: AY, BY

Examples
  Tourism by state and purpose of travel
  Glasses by brand and store


2 Time series visualisation

Victorian tourism data

[Figure: time plots of the Victorian tourism series, one panel per region and purpose of travel (BAAHol, BAAVis, BAABus, BAAOth through to BEGOth).]

Kite diagrams

Line graph profile
Duplicate & flip around the horizontal axis
Fill the colour
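The three construction steps can be sketched in Python as building a closed outline that a plotting library could then fill. This is only an illustration of the steps as listed on the slide; the function name `kite_polygon` and the sample values are hypothetical.

```python
def kite_polygon(times, values):
    """Closed outline of a kite diagram for one series.

    The line profile (time, value) is traced forwards, then its mirror image
    (time, -value) is traced backwards, giving a polygon that can be filled.
    """
    upper = list(zip(times, values))
    lower = [(t, -v) for t, v in zip(times, values)]
    return upper + lower[::-1]

# Hypothetical values for one series, purely for illustration.
times = [2000, 2001, 2002]
values = [1.0, 3.0, 2.0]
polygon = kite_polygon(times, values)
print(polygon)
```

The resulting band is symmetric about the horizontal axis, so its width at each time point is proportional to the value of the series.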

Kite diagrams: Victorian tourism

[Figure: kite diagrams of Victorian tourism by purpose of travel (Holiday, VFR, Business, Other) and region (BAA to BEG), time axis 2000 to 2010. Two versions are shown: "Victoria" and "Victoria: scaled".]

An STL decomposition

STL decomposition of tourism demand for holidays in Peninsula

[Figure: STL decomposition with panels data, seasonal, trend and remainder, plotted against time (2000 to 2010).]

Seasonal stacked bar chart

Place positive values above the origin and negative values below the origin
Map the bar length to the magnitude
Encode quarters by colours

[Figure: seasonal stacked bar chart for Holiday travel; x-axis: regions (BAA to BEG); y-axis: seasonal component (−1.0 to 1.0); colours: quarters Q1 to Q4.]
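One way to read the stacking rule is sketched below in Python: positive seasonal values are stacked upwards from the origin, negative values downwards, and each segment's length equals the magnitude. The function name `stack_segments` and the seasonal values are hypothetical; the colour encoding of quarters is left to the plotting step.

```python
def stack_segments(seasonal):
    """Map each quarter to the (bottom, top) of its bar segment for one region.

    Positive values are stacked above the origin, negative values below it,
    so the length of each segment equals the magnitude of the value.
    """
    positive_top = 0.0
    negative_bottom = 0.0
    segments = {}
    for quarter, value in seasonal.items():
        if value >= 0:
            segments[quarter] = (positive_top, positive_top + value)
            positive_top += value
        else:
            segments[quarter] = (negative_bottom + value, negative_bottom)
            negative_bottom += value
    return segments

# Hypothetical seasonal components for one region.
seasonal = {"Q1": 0.5, "Q2": -0.25, "Q3": -0.5, "Q4": 0.25}
segments = stack_segments(seasonal)
print(segments)
```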

Seasonal stacked bar chart: VIC

[Figure: seasonal stacked bar charts for Victoria, one panel per purpose of travel (Holiday, VFR, Business, Other); x-axis: regions (BAA to BEG); y-axis: seasonal component (−1.0 to 1.0); colours: quarters Q1 to Q4.]

Corrgram of remainder

Compute the correlations among the remainder components
Render both the sign and magnitude using a colour mapping of two hues
Order variables according to the first principal component of the correlations
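The first and third steps can be sketched in plain Python. This is an illustration under stated assumptions, not the code behind the slides: the ordering is taken to mean sorting variables by their loading on the leading eigenvector of the correlation matrix, which is found here by power iteration, and the three series are made up.

```python
from math import sqrt

def correlation(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(series):
    """Names and correlation matrix for a dict of name -> remainder series."""
    names = list(series)
    matrix = [[correlation(series[i], series[j]) for j in names] for i in names]
    return names, matrix

def first_principal_component(matrix, iterations=200):
    """Leading eigenvector of a correlation matrix by power iteration.

    Starts from the first column, which is non-zero because its first entry is 1.
    """
    n = len(matrix)
    vector = [row[0] for row in matrix]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    if vector[0] < 0:  # fix the sign so the first variable loads non-negatively
        vector = [-v for v in vector]
    return vector

def order_by_first_pc(series):
    """Variable names sorted by their loading on the first principal component."""
    names, matrix = correlation_matrix(series)
    loadings = dict(zip(names, first_principal_component(matrix)))
    return sorted(names, key=lambda name: loadings[name])

# Hypothetical remainder components: a and b move together, c moves against them.
remainders = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
order = order_by_first_pc(remainders)
print(order)  # c is placed apart from a and b
```

Ordering in this way puts positively correlated series next to each other, so blocks of similar colour appear along the diagonal of the corrgram.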

Corrgram of remainder: VIC

[Figure: corrgrams of the remainder components for Victoria, colour scale from −1 to 1: (a) all region-by-purpose series; (b) holiday series only.]

Corrgram of remainder: TAS

[Figure: corrgram of the remainder components for the Tasmanian region-by-purpose series (regions FAA, FBA, FBB, FCA, FCB; purposes Hol, Vis, Bus, Oth); colour scale from −1 to 1.]

Principal components decomposition

[Figure: first three principal components (PC1, PC2, PC3) plotted against time, 2000 to 2010.]

[Figure: season plots of PC1, PC2 and PC3 by month (Jan to Dec).]

[Figure: scatterplots of Loading 2 against Loading 1, coloured by state (NSW, VIC, QLD, SA, TAS, NT, WA) and by purpose of travel (Hol, Vis, Bus, Oth).]

Feature analysis

Summarize each time series with a feature vector:
  strength of trend
  summer seasonality
  winter seasonality
  Box-Pierce statistic on remainder of STL
  Lumpiness (variance of annual variances of remainder)

Do PCA on feature matrix
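As an illustration of one of these features, lumpiness can be sketched in Python directly from its definition, the variance of the annual variances of the remainder. The function name, the use of the sample variance and the choice of four observations per year (quarterly data) are assumptions; the remainder values are made up.

```python
from statistics import variance

def lumpiness(remainder, period=4):
    """Variance of the annual variances of an STL remainder series.

    The series is cut into consecutive full years of `period` observations;
    any incomplete final year is ignored. At least two full years are needed.
    """
    years = [remainder[i:i + period]
             for i in range(0, len(remainder) - period + 1, period)]
    annual_variances = [variance(year) for year in years]
    return variance(annual_variances)

# Hypothetical quarterly remainders: the first has the same spread every year,
# the second becomes more variable from year to year.
steady = lumpiness([1.0, -1.0, 1.0, -1.0] * 3)
lumpy = lumpiness([0.0, 0.0, 0.0, 0.0,
                   1.0, -1.0, 1.0, -1.0,
                   2.0, -2.0, 2.0, -2.0])
print(steady, lumpy)
```

A series whose year-to-year variability is constant scores zero, while one whose variability changes across years scores higher.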

[Figure: PCA biplots of the feature matrix, PC1 (39.1% explained var.) against PC2 (23.6% explained var.), with feature loadings trend, summer, winter, corr and lumpy; points grouped by purpose of travel (Bus, Hol, Oth, Vis) and by state (NSW, NT, QLD, SA, TAS, VIC, WA).]

3 BLUF: Best Linear Unbiased Forecasts

Hierarchical/grouped time series

Forecasts should be “aggregate consistent”, unbiased, minimum variance.

Existing methods:
  Bottom-up
  Top-down
  Middle-out

How to compute forecast intervals?

Most research is concerned with the relative performance of existing methods.


Top-down method

Advantages
  Works well in presence of low counts.
  Single forecasting model easy to build.
  Provides reliable forecasts for aggregate levels.

Disadvantages
  Loss of information, especially individual series dynamics.
  Distribution of forecasts to lower levels can be difficult.
  No prediction intervals.


Bottom-up method

Advantages
  No loss of information.
  Better captures dynamics of individual series.

Disadvantages
  Large number of series to be forecast.
  Constructing forecasting models is harder because of noisy data at bottom level.
  No prediction intervals.


The BLUF approach

Hyndman et al (CSDA 2011) proposed a new statistical framework for forecasting hierarchical time series which:

1 provides point forecasts that are consistent across the hierarchy;
2 allows for correlations and interaction between series at each level;
3 provides estimates of forecast uncertainty which are consistent across the hierarchy;
4 allows for ad hoc adjustments and inclusion of covariates at any level.


Hierarchical data

Total
  A, B, C

$Y_t$: observed aggregate of all series at time $t$.
$Y_{X,t}$: observation on series $X$ at time $t$.
$\boldsymbol{B}_t$: vector of all series at bottom level in time $t$.

\[
\boldsymbol{y}_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}]' =
\underbrace{\begin{bmatrix}
1 & 1 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}}_{\boldsymbol{S}}
\underbrace{\begin{bmatrix}
Y_{A,t} \\ Y_{B,t} \\ Y_{C,t}
\end{bmatrix}}_{\boldsymbol{B}_t}
\]

\[
\boldsymbol{y}_t = \boldsymbol{S}\boldsymbol{B}_t
\]
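The relationship between the full vector of series, the summing matrix S and the bottom-level vector for this one-level hierarchy can be checked numerically. A minimal Python sketch, with hypothetical observations for series A, B and C and an illustrative helper `matvec`:

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Summing matrix S for the hierarchy Total -> A, B, C.
S = [
    [1, 1, 1],  # Total = A + B + C
    [1, 0, 0],  # A
    [0, 1, 0],  # B
    [0, 0, 1],  # C
]

# Hypothetical bottom-level observations at one time point.
b_t = [10.0, 20.0, 30.0]

# Full vector of series, stacked as [Total, A, B, C].
y_t = matvec(S, b_t)
print(y_t)  # [60.0, 10.0, 20.0, 30.0]
```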

Hierarchical data

Total
  A: AX, AY, AZ
  B: BX, BY, BZ
  C: CX, CY, CZ

\[
\boldsymbol{y}_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{\boldsymbol{S}}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}}_{\boldsymbol{B}_t}
\]

\[
\boldsymbol{y}_t = \boldsymbol{S}\boldsymbol{B}_t
\]
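For larger hierarchies the summing matrix is tedious to write by hand, so it helps to generate it from the tree. The Python sketch below does this for a two-level hierarchy described as a mapping from each middle-level series to its children; the function name `summing_matrix` and this input format are assumptions for illustration and are not taken from the hts package.

```python
def summing_matrix(children):
    """Summing matrix S for a two-level hierarchy.

    `children` maps each middle-level series to its ordered bottom-level series.
    Rows are stacked as: Total, middle-level series, bottom-level series.
    """
    bottom = [leaf for leaves in children.values() for leaf in leaves]
    rows = [[1] * len(bottom)]  # Total sums every bottom-level series
    for leaves in children.values():
        rows.append([1 if leaf in leaves else 0 for leaf in bottom])
    for leaf in bottom:  # identity block: each bottom-level series is itself
        rows.append([1 if other == leaf else 0 for other in bottom])
    labels = ["Total"] + list(children) + bottom
    return labels, rows

children = {
    "A": ["AX", "AY", "AZ"],
    "B": ["BX", "BY", "BZ"],
    "C": ["CX", "CY", "CZ"],
}
labels, S = summing_matrix(children)
for label, row in zip(labels, S):
    print(f"{label:5s}", row)
```

For the nine bottom-level series here, the result has 13 rows and 9 columns: one row of ones, three rows of group indicators, and a 9 × 9 identity block.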

Grouped data

AX  AY  | A
BX  BY  | B
X   Y   | Total

\[
\boldsymbol{y}_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{X,t} \\ Y_{Y,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\boldsymbol{S}}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}}_{\boldsymbol{B}_t}
\]

\[
\boldsymbol{y}_t = \boldsymbol{S}\boldsymbol{B}_t
\]
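With grouped data the same bottom-level series are aggregated along more than one factor, so S gains one block of indicator rows per factor. A minimal Python sketch for two crossed factors (the function name `grouped_summing_matrix` is hypothetical):

```python
from itertools import product

def grouped_summing_matrix(first, second):
    """Summing matrix S for series cross-classified by two factors.

    Rows are stacked as: Total, levels of `first`, levels of `second`,
    then every bottom-level combination.
    """
    bottom = list(product(first, second))
    rows = [[1] * len(bottom)]
    rows += [[1 if a == level else 0 for a, _ in bottom] for level in first]
    rows += [[1 if b == level else 0 for _, b in bottom] for level in second]
    rows += [[1 if other == cell else 0 for other in bottom] for cell in bottom]
    labels = ["Total"] + list(first) + list(second) + [a + b for a, b in bottom]
    return labels, rows

labels, S = grouped_summing_matrix(["A", "B"], ["X", "Y"])
for label, row in zip(labels, S):
    print(f"{label:5s}", row)
```

For factors {A, B} and {X, Y} this reproduces the 9 × 4 matrix on the slide.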

Forecasting notation

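Let $\hat{\boldsymbol{y}}_n(h)$ be the vector of initial $h$-step forecasts, made at time $n$, stacked in the same order as $\boldsymbol{y}_t$. (They may not add up.)

Hierarchical forecasting methods are of the form:
\[
\tilde{\boldsymbol{y}}_n(h) = \boldsymbol{S}\boldsymbol{P}\hat{\boldsymbol{y}}_n(h)
\]
for some matrix $\boldsymbol{P}$.

$\boldsymbol{P}$ extracts and combines base forecasts $\hat{\boldsymbol{y}}_n(h)$ to get bottom-level forecasts.
$\boldsymbol{S}$ adds them up.
Revised reconciled forecasts: $\tilde{\boldsymbol{y}}_n(h)$.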
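To see how the two matrices interact, the Python sketch below reconciles a set of base forecasts that do not add up, for the hierarchy Total → A, B, C. The base forecasts are made up, and the P used is a top-down choice with hypothetical proportions (0.25, 0.5, 0.25), picked only as one example of "some matrix P".

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def reconcile(S, P, base_forecasts):
    """Reconciled forecasts S P y_hat: P gives bottom-level forecasts, S adds them up."""
    return matvec(S, matvec(P, base_forecasts))

# Summing matrix for Total -> A, B, C.
S = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# Hypothetical base forecasts, stacked as [Total, A, B, C]; 30 + 45 + 20 != 100.
y_hat = [100.0, 30.0, 45.0, 20.0]

# Example P (an assumption): split the Total forecast using fixed proportions.
P = [
    [0.25, 0, 0, 0],
    [0.50, 0, 0, 0],
    [0.25, 0, 0, 0],
]

y_tilde = reconcile(S, P, y_hat)
print(y_tilde)  # [100.0, 25.0, 50.0, 25.0]
```

Whatever P is chosen, the final multiplication by S guarantees that the reconciled forecasts are aggregate consistent.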


Bottom-up forecasts

$$\tilde{y}_n(h) = S P \hat{y}_n(h)$$

Bottom-up forecasts are obtained using
$$P = [\,0 \mid I\,],$$
where $0$ is a null matrix and $I$ is an identity matrix.

The $P$ matrix extracts only the bottom-level forecasts from $\hat{y}_n(h)$.

$S$ adds them up to give the bottom-up forecasts.
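A minimal sketch of the bottom-up choice of P, assuming a tiny hierarchy Total = A + B. The hierarchy and the forecast values are made up for illustration.

```python
# Sketch: bottom-up reconciliation for a tiny hierarchy Total = A + B.
# The hierarchy and the forecast values are illustrative assumptions.

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Summing matrix: rows are Total, A, B; columns are the bottom series A, B.
S = [[1, 1],
     [1, 0],
     [0, 1]]

# P = [0 | I]: a zero column for the Total, then an identity block.
P_bottom_up = [[0, 1, 0],
               [0, 0, 1]]

base = [100, 45, 50]    # base forecasts; note 45 + 50 != 100
revised = mat_vec(mat_mul(S, P_bottom_up), base)
```

The base forecast of the Total is discarded and replaced by the sum of the bottom-level forecasts.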

Top-down forecasts

$$\tilde{y}_n(h) = S P \hat{y}_n(h)$$

Top-down forecasts are obtained using
$$P = [\,p \mid 0\,],$$
where $p = [p_1, p_2, \dots, p_{m_K}]'$ is a vector of proportions that sum to one.

$P$ distributes forecasts of the aggregate to the lowest level series.

Different methods of top-down forecasting lead to different proportionality vectors $p$.
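A minimal sketch of the top-down choice of P, again assuming a tiny hierarchy Total = A + B. The proportions 0.25 and 0.75 and the forecast values are hypothetical; how p is chosen depends on the particular top-down method.

```python
# Sketch: top-down reconciliation for a tiny hierarchy Total = A + B.
# The proportions and forecast values are illustrative assumptions.

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

S = [[1, 1],
     [1, 0],
     [0, 1]]

p = [0.25, 0.75]                      # hypothetical proportions, summing to one
P_top_down = [[p[0], 0, 0],           # P = [p | 0]: only the Total forecast is used
              [p[1], 0, 0]]

base = [100, 45, 50]                  # base forecasts for Total, A, B
revised = mat_vec(mat_mul(S, P_top_down), base)
```

Only the base forecast of the Total survives; the bottom-level base forecasts are ignored entirely.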

General properties: bias

$$\tilde{y}_n(h) = S P \hat{y}_n(h)$$

Assume the base forecasts $\hat{y}_n(h)$ are unbiased:
$$E[\hat{y}_n(h) \mid y_1, \dots, y_n] = E[y_{n+h} \mid y_1, \dots, y_n].$$

Let $\hat{B}_n(h)$ be the bottom-level base forecasts, with $\beta_n(h) = E[\hat{B}_n(h) \mid y_1, \dots, y_n]$.
Then $E[\hat{y}_n(h)] = S\beta_n(h)$.
We want the revised forecasts to be unbiased: $E[\tilde{y}_n(h)] = S P S \beta_n(h) = S\beta_n(h)$.
The result will hold provided $SPS = S$.
True for bottom-up, but not for any top-down method or middle-out method.
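The condition SPS = S can be checked numerically. The sketch below assumes a tiny hierarchy Total = A + B and hypothetical top-down proportions; it is an illustration of the condition, not a proof.

```python
# Sketch: check the unbiasedness condition SPS = S for two choices of P.
# The hierarchy Total = A + B and the proportions are illustrative assumptions.

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

S = [[1, 1],
     [1, 0],
     [0, 1]]

P_bottom_up = [[0, 1, 0],         # P = [0 | I]
               [0, 0, 1]]
P_top_down = [[0.25, 0, 0],       # P = [p | 0] with p = (0.25, 0.75)
              [0.75, 0, 0]]

sps_bottom_up = mat_mul(mat_mul(S, P_bottom_up), S)
sps_top_down = mat_mul(mat_mul(S, P_top_down), S)

bottom_up_ok = sps_bottom_up == S     # condition holds
top_down_ok = sps_top_down == S       # condition fails
```

For the top-down P, the rows of SPS for A and B become (0.25, 0.25) and (0.75, 0.75) rather than (1, 0) and (0, 1), so the condition fails.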

General properties: variance

$$\tilde{y}_n(h) = S P \hat{y}_n(h)$$

Let the variance of the base forecasts $\hat{y}_n(h)$ be given by
$$\Sigma_h = \operatorname{Var}[\hat{y}_n(h) \mid y_1, \dots, y_n].$$

Then the variance of the revised forecasts is given by
$$\operatorname{Var}[\tilde{y}_n(h) \mid y_1, \dots, y_n] = S P \Sigma_h P' S'.$$

This is a general result for all existing methods.
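The variance result follows in one step from the linearity of the revised forecasts. Writing the base forecasts with a hat and the revised forecasts with a tilde, and treating S and P as fixed matrices:

```latex
\operatorname{Var}[\tilde{y}_n(h) \mid y_1,\dots,y_n]
  = \operatorname{Var}[S P \hat{y}_n(h) \mid y_1,\dots,y_n]
  = (SP)\,\operatorname{Var}[\hat{y}_n(h) \mid y_1,\dots,y_n]\,(SP)'
  = S P \Sigma_h P' S'.
```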

BLUF via trace minimization

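Theorem
For any $P$ satisfying $SPS = S$,
$$\min_P \; \operatorname{trace}[S P \Sigma_h P' S']$$
has solution
$$P = (S'\Sigma_h^{\dagger} S)^{-1} S' \Sigma_h^{\dagger}.$$

$\Sigma_h^{\dagger}$ is a generalized inverse of $\Sigma_h$.

Equivalent to the GLS estimate of the regression $\hat{y}_n(h) = S\beta_n(h) + \varepsilon_h$ where $\varepsilon_h \sim N(0, \Sigma_h)$.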
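As a quick check, the stated solution does satisfy the unbiasedness constraint, assuming the inverse in the formula exists:

```latex
S P S
  = S\,(S'\Sigma_h^{\dagger} S)^{-1} S'\Sigma_h^{\dagger}\, S
  = S\,(S'\Sigma_h^{\dagger} S)^{-1} (S'\Sigma_h^{\dagger} S)
  = S.
```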

Optimal combination forecasts

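$$\tilde{y}_n(h) = S P \hat{y}_n(h) = S(S'\Sigma_h^{\dagger} S)^{-1} S' \Sigma_h^{\dagger} \hat{y}_n(h)$$

Here $\tilde{y}_n(h)$ are the revised forecasts and $\hat{y}_n(h)$ are the initial forecasts.

$\Sigma_h^{\dagger}$ is a generalized inverse of $\Sigma_h$.

$$\operatorname{Var}[\tilde{y}_n(h) \mid y_1, \dots, y_n] = S(S'\Sigma_h^{\dagger} S)^{-1} S'$$

Problem: $\Sigma_h$ is hard to estimate.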
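The simplified variance can be recovered from the general form S P Σ_h P′ S′ by substituting the optimal P. The sketch below assumes a symmetric generalized inverse with the reflexive property Σ_h† Σ_h Σ_h† = Σ_h†, which holds for the Moore-Penrose inverse:

```latex
S P \Sigma_h P' S'
  = S\,(S'\Sigma_h^{\dagger} S)^{-1} S'\,
    \Sigma_h^{\dagger}\Sigma_h\Sigma_h^{\dagger}\,
    S\,(S'\Sigma_h^{\dagger} S)^{-1} S'
  = S\,(S'\Sigma_h^{\dagger} S)^{-1} (S'\Sigma_h^{\dagger} S)
    (S'\Sigma_h^{\dagger} S)^{-1} S'
  = S\,(S'\Sigma_h^{\dagger} S)^{-1} S'.
```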

Optimal combination forecasts

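$$\tilde{y}_n(h) = S(S'\Sigma_h^{\dagger} S)^{-1} S' \Sigma_h^{\dagger} \hat{y}_n(h)$$

Here $\tilde{y}_n(h)$ are the revised forecasts and $\hat{y}_n(h)$ are the base forecasts.

Solution 1: OLS
Assume $\varepsilon_h \approx S\varepsilon_{B,h}$ where $\varepsilon_{B,h}$ is the forecast error at the bottom level.

Then $\Sigma_h \approx S\Omega_h S'$ where $\Omega_h = \operatorname{Var}(\varepsilon_{B,h})$.

If the Moore-Penrose generalized inverse is used, then $(S'\Sigma_h^{\dagger} S)^{-1} S'\Sigma_h^{\dagger} = (S'S)^{-1}S'$.

$$\tilde{y}_n(h) = S(S'S)^{-1}S'\hat{y}_n(h)$$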

Optimal combination forecasts

$$\tilde{y}_n(h) = S(S'S)^{-1}S'\hat{y}_n(h)$$

Hierarchy:
Total
    A   B   C

Weights:
$$
S(S'S)^{-1}S' =
\begin{pmatrix}
0.75 & 0.25 & 0.25 & 0.25 \\
0.25 & 0.75 & -0.25 & -0.25 \\
0.25 & -0.25 & 0.75 & -0.25 \\
0.25 & -0.25 & -0.25 & 0.75
\end{pmatrix}
$$
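The 4×4 weight matrix for the Total, A, B, C hierarchy can be reproduced with exact arithmetic. This is a small sketch using Python's `fractions` module and a hand-written Gauss-Jordan inverse (the helper names are assumptions); it is not how a large hierarchy would be handled.

```python
# Sketch: compute the OLS weights S (S'S)^{-1} S' for the hierarchy Total = A + B + C.
from fractions import Fraction

def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse with exact Fractions (assumes m is invertible)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Rows: Total, A, B, C. Columns: bottom-level series A, B, C.
S = [[1, 1, 1],
     [1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

St = transpose(S)
weights = mat_mul(mat_mul(S, inverse(mat_mul(St, S))), St)
weights_float = [[float(w) for w in row] for row in weights]
```

Each row of `weights_float` gives the weights applied to the four base forecasts when forming one revised forecast.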

Optimal combination forecasts

Hierarchy:
Total
    A: AA  AB  AC
    B: BA  BB  BC
    C: CA  CB  CC

Weights:
$$
S(S'S)^{-1}S' =
\left(\begin{array}{*{13}{r}}
0.69 & 0.23 & 0.23 & 0.23 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \\
0.23 & 0.58 & -0.17 & -0.17 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & 0.58 & -0.17 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & -0.17 & 0.58 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 \\
0.08 & 0.19 & -0.06 & -0.06 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73
\end{array}\right)
$$

Features

Covariates can be included in initial forecasts.

Adjustments can be made to initial forecasts at any level.

Very simple and flexible method. Can work with any hierarchical or grouped time series.

$SPS = S$, so reconciled forecasts are unbiased.

Conceptually easy to implement: OLS on base forecasts.

Weights are independent of the data and of the covariance structure of the hierarchy.

Challenges

Computational difficulties in big hierarchies due to the size of the $S$ matrix and singular behavior of $(S'S)$.

Need to estimate the covariance matrix to produce prediction intervals.

Ignores the covariance matrix in computing point forecasts.

$$\tilde{y}_n(h) = S(S'S)^{-1}S'\hat{y}_n(h)$$

Optimal combination forecasts

$$\tilde{y}_n(h) = S(S'\Sigma_1^{\dagger} S)^{-1} S' \Sigma_1^{\dagger} \hat{y}_n(h)$$

Solution 1: OLS
Approximate $\Sigma_1^{\dagger}$ by $cI$.

Solution 2: Rescaling
Suppose we approximate $\Sigma_1$ by its diagonal.
Let $\Lambda = [\operatorname{diagonal}(\Sigma_1)]^{-1}$ contain the inverse one-step forecast variances.

$$\tilde{y}_n(h) = S(S'\Lambda S)^{-1} S' \Lambda \hat{y}_n(h)$$
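The rescaling solution can be sketched on a tiny hierarchy Total = A + B. The forecast variances and base forecasts below are hypothetical, and the dense matrix algebra is only suitable for small examples.

```python
# Sketch: rescaled (weighted least squares) reconciliation with a diagonal Lambda.
# Hierarchy, variances and base forecasts are illustrative assumptions.
from fractions import Fraction

def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse with exact Fractions (assumes m is invertible)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

S = [[1, 1],     # Total
     [1, 0],     # A
     [0, 1]]     # B

variances = [Fraction(4), Fraction(1), Fraction(1)]   # hypothetical one-step variances
Lam = [[1 / variances[i] if i == j else Fraction(0) for j in range(3)]
       for i in range(3)]                             # Lambda = inverse variances

base = [[Fraction(100)], [Fraction(45)], [Fraction(50)]]   # column vector of base forecasts

St_Lam = mat_mul(transpose(S), Lam)
P = mat_mul(inverse(mat_mul(St_Lam, S)), St_Lam)      # (S' Lam S)^{-1} S' Lam
revised = [row[0] for row in mat_mul(mat_mul(S, P), base)]
```

With these numbers the less reliable Total forecast (variance 4) moves most: it drops from 100 to 290/3, while A and B each rise by 5/6.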

Optimal reconciled forecasts

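$$\tilde{y}_n(h) = S\hat{\beta}_n(h) = S(S'\Lambda S)^{-1} S' \Lambda \hat{y}_n(h)$$

Here $\tilde{y}_n(h)$ are the revised forecasts and $\hat{y}_n(h)$ are the initial forecasts.

Easy to estimate, and places weight where we have best forecasts.
Ignores covariances.
For large numbers of time series, we need to do the calculation without explicitly forming $S$ or $(S'\Lambda S)^{-1}$ or $S'\Lambda$.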
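To illustrate what avoiding the explicit matrices can look like, consider the simplest special case: one Total over k bottom-level series with a diagonal weight matrix. Solving the weighted least squares problem by hand gives a closed form in which each series absorbs a share of the gap proportional to its forecast variance. The sketch below covers only this special case and is not the general fast algorithm from the talk; the function name and all values are assumptions.

```python
# Sketch: weighted reconciliation for a single Total over k bottom series,
# computed without forming S, (S' Lam S)^{-1} or S' Lam.
# Special case only; names and numbers are illustrative assumptions.

def reconcile_one_level(total_forecast, bottom_forecasts, var_total, var_bottom):
    """Return (revised_total, revised_bottom) for a one-level hierarchy.

    Each bottom series i is adjusted by var_bottom[i] * gap / (var_total + sum(var_bottom)),
    where gap is the Total base forecast minus the sum of bottom base forecasts.
    """
    gap = total_forecast - sum(bottom_forecasts)
    adjust = gap / (var_total + sum(var_bottom))
    revised_bottom = [f + v * adjust for f, v in zip(bottom_forecasts, var_bottom)]
    revised_total = sum(revised_bottom)     # coherent by construction
    return revised_total, revised_bottom

revised_total, revised_bottom = reconcile_one_level(
    total_forecast=100.0,
    bottom_forecasts=[30.0, 40.0, 20.0],
    var_total=4.0,
    var_bottom=[1.0, 2.0, 1.0],
)
```

The cost is linear in the number of series, whereas the dense formula requires inverting a k×k matrix.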

Australian tourism

Hierarchy:
States (7)
Zones (27)
Regions (82)

Base forecasts: ETS (exponential smoothing) models

Base forecasts

[Figures: Domestic tourism forecasts, visitor nights by year (1998 to 2008), shown for Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands and X809.Daly.]

Reconciled forecasts

[Figures: reconciled forecasts by year (2000 to 2010). First slide: Total. Second slide: NSW, VIC, QLD, Other. Third slide: Sydney, Other NSW, Melbourne, Other VIC, GC and Brisbane, Other QLD, Capital cities, Other.]

Forecast evaluation

Select models using all observations;

Re-estimate models using the first 12 observations and generate 1- to 8-step-ahead forecasts;

Increase the sample size one observation at a time, re-estimate models, and generate forecasts until the end of the sample;

In total 24 1-step-ahead, 23 2-steps-ahead, up to 17 8-steps-ahead forecasts for forecast evaluation.
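The forecast counts can be reproduced by enumerating the rolling forecast origins. The sketch assumes a sample of 36 observations, a figure chosen because it yields the counts quoted above; the actual length of the evaluation sample is not stated on the slide.

```python
# Sketch: count the forecasts available at each horizon in a rolling-origin evaluation.
# n_obs = 36 is an assumption consistent with the counts quoted on the slide.

def rolling_origin_counts(n_obs, initial_train, max_h):
    """Return {h: number of h-step-ahead forecasts with an observed target}."""
    counts = {h: 0 for h in range(1, max_h + 1)}
    # Train on observations 1..train_end, then grow the training set by one each time.
    for train_end in range(initial_train, n_obs):
        for h in range(1, max_h + 1):
            if train_end + h <= n_obs:      # the target must lie inside the sample
                counts[h] += 1
    return counts

counts = rolling_origin_counts(n_obs=36, initial_train=12, max_h=8)
```

Longer horizons lose one forecast each because the last few origins have no observed target that far ahead.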

Hierarchy: states, zones, regions

MAPE                  h = 1   h = 2   h = 4   h = 6   h = 8   Average

Top Level: Australia
Bottom-up              3.79    3.58    4.01    4.55    4.24     4.06
OLS                    3.83    3.66    3.88    4.19    4.25     3.94
Scaling (st. dev.)     3.68    3.56    3.97    4.57    4.25     4.04

Level: States
Bottom-up             10.70   10.52   10.85   11.46   11.27    11.03
OLS                   11.07   10.58   11.13   11.62   12.21    11.35
Scaling (st. dev.)    10.44   10.17   10.47   10.97   10.98    10.67

Level: Zones
Bottom-up             14.99   14.97   14.98   15.69   15.65    15.32
OLS                   15.16   15.06   15.27   15.74   16.15    15.48
Scaling (st. dev.)    14.63   14.62   14.68   15.17   15.25    14.94

Bottom Level: Regions
Bottom-up             33.12   32.54   32.26   33.74   33.96    33.18
OLS                   35.89   33.86   34.26   36.06   37.49    35.43
Scaling (st. dev.)    31.68   31.22   31.08   32.41   32.77    31.89
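For reference, MAPE (mean absolute percentage error) can be computed as below. This is the standard definition, with made-up numbers; how the table averages across series and forecast origins is not specified here.

```python
# Sketch: mean absolute percentage error, in percent.
# The sample values are illustrative; actual values must be non-zero.

def mape(actual, forecast):
    """Return 100 * mean(|actual - forecast| / |actual|)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

example = mape(actual=[100, 200], forecast=[75, 250])
```

Because errors are scaled by the actual values, series with small values, such as the bottom-level regions, tend to show much larger MAPE than the aggregates.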

ANZSCO

Australia and New Zealand Standard Classification of Occupations

8 major groups
    43 sub-major groups
        97 minor groups
            359 unit groups
                1023 occupations

Example: statistician
2 Professionals
    22 Business, Human Resource and Marketing Professionals
        224 Information and Organisation Professionals
            2241 Actuaries, Mathematicians and Statisticians
                224113 Statistician

Australian Labour Market data

[Figure: Australian labour market series in five panels (Level 0, Level 1, Level 2, Level 3 and Level 4), plotted against Time, 1990 to 2010. The Level 1 panel shows the eight major groups: 1. Managers; 2. Professionals; 3. Technicians and trade workers; 4. Community and personal services workers; 5. Clerical and administrative workers; 6. Sales workers; 7. Machinery operators and drivers; 8. Labourers. The lower three panels show the largest sub-groups at each level.]

[Figure: base forecasts and reconciled forecasts in five panels (Level 0 to Level 4), plotted against Year, 2010 to 2015. Base forecasts are from auto.arima(). The largest changes are shown for each level.]

Forecast evaluation (rolling origin)

RMSE          h = 1    h = 2    h = 3    h = 4    h = 5    h = 6    h = 7    h = 8   Average

Top level
  Bottom-up   74.71   102.02   121.70   131.17   147.08   157.12   169.60   178.93   135.29
  OLS         52.20    77.77   101.50   119.03   138.27   150.75   160.04   166.38   120.74
  WLS         61.77    86.32   107.26   119.33   137.01   146.88   156.71   162.38   122.21

Level 1
  Bottom-up   21.59    27.33    30.81    32.94    35.45    37.10    39.00    40.51    33.09
  OLS         21.89    28.55    32.74    35.58    38.82    41.24    43.34    45.49    35.96
  WLS         20.58    26.19    29.71    31.84    34.36    35.89    37.53    38.86    31.87

Level 2
  Bottom-up    8.78    10.72    11.79    12.42    13.13    13.61    14.14    14.65    12.40
  OLS          9.02    11.19    12.34    13.04    13.92    14.56    15.17    15.77    13.13
  WLS          8.58    10.48    11.54    12.15    12.88    13.36    13.87    14.36    12.15

Level 3
  Bottom-up    5.44     6.57     7.17     7.53     7.94     8.27     8.60     8.89     7.55
  OLS          5.55     6.78     7.42     7.81     8.29     8.68     9.04     9.37     7.87
  WLS          5.35     6.46     7.06     7.42     7.84     8.17     8.48     8.76     7.44

Bottom Level
  Bottom-up    2.35     2.79     3.02     3.15     3.29     3.42     3.54     3.65     3.15
  OLS          2.40     2.86     3.10     3.24     3.41     3.55     3.68     3.80     3.25
  WLS          2.34     2.77     2.99     3.12     3.27     3.40     3.52     3.63     3.13
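As a small worked check on the top-level rows of the RMSE table, the sketch below recomputes the Average column as the mean over h = 1, ..., 8 and expresses the OLS and WLS averages as a percentage reduction relative to bottom-up. The numbers are copied from the table; the variable names and the choice of "percentage reduction" as a summary are assumptions made for illustration only.

```r
# Top-level RMSE values copied from the table (h = 1, ..., 8).
bottom_up <- c(74.71, 102.02, 121.70, 131.17, 147.08, 157.12, 169.60, 178.93)
ols       <- c(52.20,  77.77, 101.50, 119.03, 138.27, 150.75, 160.04, 166.38)
wls       <- c(61.77,  86.32, 107.26, 119.33, 137.01, 146.88, 156.71, 162.38)

# The Average column is the mean across the eight horizons.
avg <- c(bottom_up = mean(bottom_up), ols = mean(ols), wls = mean(wls))
print(round(avg, 2))

# Percentage reduction in average RMSE relative to bottom-up.
pct_reduction <- 100 * (avg[["bottom_up"]] - avg[c("ols", "wls")]) / avg[["bottom_up"]]
print(round(pct_reduction, 1))
```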


Fast computation: hierarchical data

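Total
  A: AX, AY, AZ
  B: BX, BY, BZ
  C: CX, CY, CZ

$$
y_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{S}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}}_{B_t}
$$

$$ y_t = S B_t $$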

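The same hierarchy with the rows of $y_t$ and $S$ reordered so that each subtree is contiguous:

$$
y_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{B,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{C,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{S}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}}_{B_t}
$$

$$ y_t = S B_t $$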
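The summing matrix for this 3 x 3 hierarchy can be built in a few lines of base R. The sketch below is illustrative only: the bottom-level values 1, ..., 9 are hypothetical, and the construction via `kronecker()` is just one convenient way to write the matrix. It builds S in the level-by-level ordering, checks that y = S b aggregates correctly, and then permutes the rows into the subtree-by-subtree ordering.

```r
bottom <- c("AX", "AY", "AZ", "BX", "BY", "BZ", "CX", "CY", "CZ")

# Level-by-level ordering: Total, then A, B, C, then the nine bottom series.
S <- rbind(
  rep(1, 9),                                          # Total sums everything
  kronecker(diag(3), matrix(1, nrow = 1, ncol = 3)),  # A, B, C each sum three series
  diag(9)                                             # bottom level: identity
)
dimnames(S) <- list(c("Total", "A", "B", "C", bottom), bottom)

# Hypothetical bottom-level observations at one time point.
b <- setNames(as.numeric(1:9), bottom)
y <- drop(S %*% b)          # all 13 series, coherent by construction
print(y[c("Total", "A", "B", "C")])   # 45, 6, 15, 24

# Subtree-by-subtree ordering: the same rows, permuted.
by_subtree <- c("Total", "A", "AX", "AY", "AZ",
                "B", "BX", "BY", "BZ",
                "C", "CX", "CY", "CZ")
S_reordered <- S[by_subtree, ]
y_reordered <- drop(S_reordered %*% b)
```

Reordering rows changes nothing about the aggregation itself; it only makes the block structure of S visible, which is what the recursive computations exploit.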

Fast computation: hierarchies

Think of the hierarchy as a tree of trees:

Total
  $T_1$  $T_2$  ...  $T_K$

Then the summing matrix contains $K$ smaller summing matrices:

$$
S =
\begin{bmatrix}
1'_{n_1} & 1'_{n_2} & \cdots & 1'_{n_K} \\
S_1 & 0 & \cdots & 0 \\
0 & S_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & S_K
\end{bmatrix}
$$

where $1_n$ is an $n$-vector of ones and tree $T_i$ has $n_i$ terminal nodes.


Fast computation: hierarchies

$$
S'\Lambda S =
\begin{bmatrix}
S_1'\Lambda_1 S_1 & 0 & \cdots & 0 \\
0 & S_2'\Lambda_2 S_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & S_K'\Lambda_K S_K
\end{bmatrix}
+ \lambda_0 J_n
$$

$\lambda_0$ is the top left element of $\Lambda$;
$\Lambda_k$ is a block of $\Lambda$, corresponding to tree $T_k$;
$J_n$ is an $n \times n$ matrix of ones;
$n = \sum_k n_k$.

Now apply the Sherman-Morrison formula ...
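For reference, the step can be written out using the standard Sherman-Morrison identity for a rank-one update. The symbol $A$ is introduced here only for illustration and denotes the block-diagonal part of $S'\Lambda S$; the identity assumes $A$ is invertible and $\lambda_0 \neq 0$. Since $J_n = 1_n 1_n'$, the term $\lambda_0 J_n$ is a rank-one update of $A$:

$$
\left(A + \lambda_0 1_n 1_n'\right)^{-1}
= A^{-1} - \frac{A^{-1} 1_n 1_n' A^{-1}}{\lambda_0^{-1} + 1_n' A^{-1} 1_n},
\qquad
A = \operatorname{diag}\left(S_1'\Lambda_1 S_1, \ldots, S_K'\Lambda_K S_K\right).
$$

Because $A^{-1}$ is itself block diagonal, the scalar in the denominator splits into a sum over trees, $1_n' A^{-1} 1_n = \sum_k 1'_{n_k} (S_k'\Lambda_k S_k)^{-1} 1_{n_k}$, so only the $K$ small blocks ever need to be inverted.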


Fast computation: hierarchies

$$
(S'\Lambda S)^{-1} =
\begin{bmatrix}
(S_1'\Lambda_1 S_1)^{-1} & 0 & \cdots & 0 \\
0 & (S_2'\Lambda_2 S_2)^{-1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & (S_K'\Lambda_K S_K)^{-1}
\end{bmatrix}
- c S_0
$$

$S_0$ can be partitioned into $K^2$ blocks, with the $(k, \ell)$ block (of dimension $n_k \times n_\ell$) being

$$
(S_k'\Lambda_k S_k)^{-1} J_{n_k,n_\ell} (S_\ell'\Lambda_\ell S_\ell)^{-1}
$$

$J_{n_k,n_\ell}$ is an $n_k \times n_\ell$ matrix of ones.

$$
c^{-1} = \lambda_0^{-1} + \sum_k 1'_{n_k} (S_k'\Lambda_k S_k)^{-1} 1_{n_k}.
$$

Each $S_k'\Lambda_k S_k$ can be inverted similarly.
$S'\Lambda y$ can also be computed recursively.


The recursive calculations can be done in such a way that we never store any of the large matrices involved.
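The block formula for the inverse can be checked numerically on a tiny example. The base R sketch below is an illustration, not the implementation used in hts: it assumes K = 2 subtrees with 2 and 3 terminal nodes, each a single-level tree, and arbitrary positive diagonal weights; the helper names `bdiag2` and `subtree_S` are hypothetical. For clarity it does form the small matrices explicitly, so it demonstrates the algebra rather than the storage-free recursion.

```r
# Hypothetical helper: block-diagonal matrix from two blocks.
bdiag2 <- function(A, B) {
  rbind(cbind(A, matrix(0, nrow(A), ncol(B))),
        cbind(matrix(0, nrow(B), ncol(A)), B))
}

# Summing matrix of a one-level subtree with n terminal nodes.
subtree_S <- function(n) rbind(rep(1, n), diag(n))

S1 <- subtree_S(2)                        # tree T1: 3 rows, 2 columns
S2 <- subtree_S(3)                        # tree T2: 4 rows, 3 columns
S  <- rbind(rep(1, 5), bdiag2(S1, S2))    # full summing matrix: 8 rows, 5 columns

# Arbitrary positive diagonal weights (assumed values).
lambda0 <- 0.5
L1 <- diag(c(1.0, 2.0, 3.0))
L2 <- diag(c(0.7, 1.5, 2.5, 4.0))
Lambda <- bdiag2(matrix(lambda0), bdiag2(L1, L2))

# Direct inversion of the full matrix.
direct <- solve(t(S) %*% Lambda %*% S)

# Block formula: invert only the small per-tree matrices.
A1_inv <- solve(t(S1) %*% L1 %*% S1)
A2_inv <- solve(t(S2) %*% L2 %*% S2)
v <- c(A1_inv %*% rep(1, 2), A2_inv %*% rep(1, 3))  # stacked (S_k' L_k S_k)^{-1} 1
c_const <- 1 / (1 / lambda0 + sum(v))               # c^{-1} = 1/lambda0 + sum_k 1' A_k^{-1} 1
S0 <- tcrossprod(v)                                 # (k, l) block = A_k^{-1} J A_l^{-1}
fast <- bdiag2(A1_inv, A2_inv) - c_const * S0

print(max(abs(direct - fast)))   # should be at rounding-error level
```

The matrix `S0` is an outer product because each block $(S_k'\Lambda_k S_k)^{-1}$ is symmetric, which is what makes the correction term cheap to apply.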

Fast computation

When the time series are not strictly hierarchical and have more than two grouping variables:

Use sparse matrix storage and arithmetic.

Use iterative approximation for inverting large sparse matrices.

Paige & Saunders (1982), ACM Trans. Math. Software
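Both ideas can be illustrated on a toy grouped structure in base R. This is a sketch under several assumptions, none of which come from the talk: the structure is a 2 x 2 cross-classification (groups A/B crossed with X/Y), the base forecasts are made up, the weights are OLS (identity), and the iterative solver is plain conjugate gradients on the least-squares problem (CGLS), used here as a simple stand-in for the Paige & Saunders method. S is stored only as the positions of its non-zero entries, and the solver touches S only through the products S b and S'y.

```r
# Non-zero positions of S (all non-zeros equal 1); toy 2 x 2 grouped structure.
# Columns: (A,X), (A,Y), (B,X), (B,Y). Rows: Total, A, B, X, Y, then the 4 bottom series.
rows <- c(1, 1, 1, 1,  2, 2,  3, 3,  4, 4,  5, 5,  6, 7, 8, 9)
cols <- c(1, 2, 3, 4,  1, 2,  3, 4,  1, 3,  2, 4,  1, 2, 3, 4)
m <- 9   # number of series in total
n <- 4   # number of bottom-level series

# Sparse products that use only the stored positions.
S_times  <- function(b) as.vector(rowsum(b[cols], rows))   # S b
St_times <- function(y) as.vector(rowsum(y[rows], cols))   # S' y

# CGLS: minimise ||y - S b||^2 without forming or inverting S'S.
cgls <- function(y, max_iter = 50, tol = 1e-20) {
  b <- numeric(n)
  r <- y - S_times(b)          # residual in the space of all series
  s <- St_times(r)             # gradient direction S' r
  p <- s
  gamma  <- sum(s^2)
  gamma0 <- gamma
  for (iter in seq_len(max_iter)) {
    if (gamma <= tol * gamma0) break
    q     <- S_times(p)
    alpha <- gamma / sum(q^2)
    b     <- b + alpha * p
    r     <- r - alpha * q
    s     <- St_times(r)
    gamma_new <- sum(s^2)
    p     <- s + (gamma_new / gamma) * p
    gamma <- gamma_new
  }
  b
}

# Hypothetical (incoherent) base forecasts for the 9 series.
set.seed(42)
y_hat <- c(400, 190, 210, 220, 180, 100, 90, 120, 90) + rnorm(m)

b_iter  <- cgls(y_hat)
y_tilde <- S_times(b_iter)     # reconciled forecasts under OLS weights

# Dense check, feasible only because the example is tiny.
S_dense <- matrix(0, m, n)
S_dense[cbind(rows, cols)] <- 1
b_direct <- solve(crossprod(S_dense), crossprod(S_dense, y_hat))
print(max(abs(b_iter - b_direct)))
```

For real problems a sparse-matrix package and a tested iterative solver would replace the hand-written pieces; the point is only that neither S'S nor its inverse has to be stored.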



hts package for R

hts: Hierarchical and grouped time series

Methods for analysing and forecasting hierarchical and grouped time series

Version: 4.3
Depends: forecast (≥ 5.0)
Imports: SparseM, parallel, utils
Published: 2014-06-10
Author: Rob J Hyndman, Earo Wang and Alan Lee
Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
BugReports: https://github.com/robjhyndman/hts/issues
License: GPL (≥ 2)

Example using R

library(hts)

# bts is a matrix containing the bottom level time series
# nodes describes the hierarchical structure
y <- hts(bts, nodes=list(2, c(3,2)))

Hierarchy described by nodes=list(2, c(3,2)):

Total
  A: AX, AY, AZ
  B: BX, BY

# Forecast 10-step-ahead using WLS combination method
# ETS used for each series by default
fc <- forecast(y, h=10)
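To make the meaning of the `nodes` argument concrete, the base R sketch below turns a nodes list into the corresponding summing matrix. It does not call hts; the function name `smatrix_from_nodes` is hypothetical and the code only mirrors the convention shown above, where each list element gives the number of children of every node at the level above.

```r
# Hypothetical helper: build the summing matrix implied by a nodes list.
smatrix_from_nodes <- function(nodes) {
  n <- sum(nodes[[length(nodes)]])   # number of bottom-level series
  blocks <- list(diag(n))            # bottom level: identity block
  counts <- rep(1L, n)               # bottom series under each node of the current level
  for (k in rev(seq_along(nodes))) {
    # Parent index of each node at the current level.
    parent <- rep(seq_along(nodes[[k]]), times = nodes[[k]])
    # Number of bottom series under each parent.
    counts <- as.vector(tapply(counts, parent, sum))
    # Which parent each bottom series belongs to.
    membership <- rep(seq_along(counts), times = counts)
    block <- 1 * outer(seq_along(counts), membership, "==")
    blocks <- c(list(block), blocks)  # higher levels go on top
  }
  do.call(rbind, blocks)
}

# The hierarchy from the example: Total -> A, B; A -> AX, AY, AZ; B -> BX, BY.
S <- smatrix_from_nodes(list(2, c(3, 2)))
dimnames(S) <- list(c("Total", "A", "B", "AX", "AY", "AZ", "BX", "BY"),
                    c("AX", "AY", "AZ", "BX", "BY"))
print(S)
```

The result has 8 rows (one per series at any level) and 5 columns (one per bottom-level series), with the Total row all ones and the A and B rows picking out their own children.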

forecast.gts function

Usage

forecast(object, h,
  method = c("comb", "bu", "mo", "tdgsf", "tdgsa", "tdfp"),
  fmethod = c("ets", "rw", "arima"),
  weights = c("sd", "none", "nseries"),
  positive = FALSE,
  parallel = FALSE, num.cores = 2, ...)

Arguments

object      Hierarchical time series object of class gts.
h           Forecast horizon.
method      Method for distributing forecasts within the hierarchy.
fmethod     Forecasting method to use.
positive    If TRUE, forecasts are forced to be strictly positive.
weights     Weights used for "optimal combination" method. When weights = "sd", it takes account of the standard deviation of forecasts.
parallel    If TRUE, allow parallel processing.
num.cores   If parallel = TRUE, specify how many cores are going to be used.


References

RJ Hyndman, RA Ahmed, G Athanasopoulos, and HL Shang (2011). "Optimal combination forecasts for hierarchical time series". Computational Statistics & Data Analysis 55(9), 2579–2589.

RJ Hyndman, AJ Lee, and E Wang (2014). Fast computation of reconciled forecasts for hierarchical and grouped time series. Working paper 17/14. Department of Econometrics & Business Statistics, Monash University.

RJ Hyndman, AJ Lee, and E Wang (2014). hts: Hierarchical and grouped time series. cran.r-project.org/package=hts.

RJ Hyndman and G Athanasopoulos (2014). Forecasting: principles and practice. OTexts. OTexts.org/fpp/.

Papers and R code: robjhyndman.com

Email: [email protected]