AVMs and CAMA The robots are taking over

Upload: wyatt-leon

Post on 31-Dec-2015


TRANSCRIPT

Page 1: AVMs and CAMA

AVMs and CAMA

The robots are taking over

Page 2: AVMs and CAMA

What is CAMA?

• Computer-assisted mass appraisal
• Uses a statistical model and a large amount of property data to estimate the market values of large numbers of properties
• Usually used for tax purposes

What is an AVM?

• Automated Valuation Model
• Uses a statistical model and a large amount of property data to estimate the market value of an individual property or portfolio of properties (RMBS – remember those?!)
• A confidence level is also usually produced to indicate how accurate the valuation is
• Usually used for lending purposes

Page 3: AVMs and CAMA

Property Analytics

• Data
– Land Registry, Registers of Scotland
– Surveyor reports
– BCIS (for reinstatement valuations)
– Royal Mail, Ordnance Survey, credit referencing co’s
• Range of price estimation techniques
– Surveyor emulation (comps search engine)
– Multi-variate linear and non-linear regression
– Repeat sales regression analysis

Page 4: AVMs and CAMA

AVM accuracy

• The most commonly used benchmark in measuring the performance of the AVM is surveyor valuations, e.g. a property can be valued using both the Rightmove AVM and a Surveyor:
– 3 Badger Lane, Durham, DH1 3LN
– Type: House
– Style: Detached
– Bedrooms: 3
– Surveyor valuation: £340,000
– AVM valuation: £324,500
• The difference between these two valuations is (AVM - Surveyor valuation) / Surveyor valuation = - £15,500 / £340,000 = - 4.6% "error"
• If this measure is replicated across many properties, a spread of "errors" can be plotted...
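The error measure above can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slide; the function name `avm_error` is an assumption, not part of any AVM product.

```python
def avm_error(avm_value, surveyor_value):
    """Return (difference in pounds, difference as % of the surveyor valuation)."""
    diff = avm_value - surveyor_value
    pct = 100.0 * diff / surveyor_value
    return diff, pct


# The worked example from the slide: AVM £324,500 against surveyor £340,000.
diff, pct = avm_error(324_500, 340_000)
print(f"{diff:+,} ({pct:+.1f}%)")  # prints: -15,500 (-4.6%)
```

Repeating this over a batch of valuations gives the spread of "errors" that the slide goes on to plot.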

Page 5: AVMs and CAMA

Batch valuations from Rightmove analysed by Standard & Poor's, Moody's, Fitch and DBRS

Page 6: AVMs and CAMA

Statistical model

• Use multiple regression analysis (MRA) to infer a mathematical (e.g. linear) relationship between several property attributes and the price that a dwelling might trade for
• Property attributes: size, type, age, location, etc.
• Mathematical relationship encapsulated in an equation which can be used to estimate price in cases where the attributes are known but the price isn’t

Page 7: AVMs and CAMA

Different from conventional valuation

• Relies on large data set – big problem in UK
• Provides a valuation and an estimate of variance
• Quick
• Cheap
• Difficult to defend
• Difficult to sue an AVM

Page 8: AVMs and CAMA

Building the model

Name   Description            Type          Sub-type    Values
ID     Identification number  Quantitative  Category    Unique identifiers
TYPE   Type of dwelling       Qualitative   Category    D - Detached; SD - Semi-detached; ET - End-terrace; MT - Mid-terrace
ROOMS  Number of rooms        Quantitative  Interval    Ranges from 3 to 8 rooms
HTG    Type of heating        Qualitative   Category    G - Gas; AD - Air duct; E - Electricity; SF - Solid fuel; O - Oil
PRICE  Capital value          Quantitative  Continuous  Capital value (£000s)
RENT   Rateable value         Quantitative  Continuous  Rental value (£ per month)

ID   PRICE (£000s)  RENT (£ pmth)  TYPE  ROOMS  HTG
1    341            268            D     6      G
2    242            130            D     4      AD
3    242            130            D     4      AD
4    297            253            D     6      G
5    297            211            D     5      G
6    396            343            D     7      G
7    270            134            D     5      G
8    462            378            D     8      G
9    176            157            ET    3      E
...
50   407            317            D     8      O
51   231            191            SD    4      O
52   231            191            SD    4      O
53   226            178            ET    4      E
54   215            178            ET    4      E
55   220            178            ET    4      E
56   209            178            ET    4      E
57   220            178            ET    4      E
58   264            211            D     6      G
59   330            297            MT    6      E
60   341            303            MT    6      G

Page 9: AVMs and CAMA

Building the model – Start with simple linear regression model

Response variable: PRICE (ave = £302,000, sd = £65,000)

Predictor variable: Monthly rent (ave = £251, sd = £63)

Frequency distribution is slightly positively skewed

Page 10: AVMs and CAMA

Simple linear regression model

Ordinary least squares (OLS)...

$y_i = b_0 + b_1 x_i + u_i$

where:

y = estimate of the average sale price corresponding to a given value of x

x = actual value of the monthly rent

b0 = estimate of the intercept of the regression line

b1 = estimate of the gradient of the regression line

u = random component (residual error term)

Valuers are being replaced by GCSE maths!

Page 11: AVMs and CAMA

Simple linear regression model

• Using the least squares principle (which minimises the sum of the squared differences between actual and predicted values of y) the regression line can be derived by solving for coefficients b1 and b0 using the variance of x and the covariance of x and y.

• The expression from which b1 can be calculated is

$b_1 = \frac{\mathrm{Cov}_{xy}}{\mathrm{Var}_x} = \frac{s_{xy}}{s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

• For b0 the expression is

$b_0 = \frac{\sum y - b_1 \sum x}{n} = \bar{y} - b_1 \bar{x}$
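The two expressions for b1 and b0 translate directly into code. The sketch below is a minimal illustration, assuming a helper named `ols_fit` (not from the slides) and using only the 20 rows that are visible in the data table (IDs 1 to 9 and 50 to 60). Because the other 40 observations are not shown, the coefficients it prints will not match the full-sample figures quoted in the deck.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Visible rows only (a subset of the 60 sales): monthly rent (£) and price (£000s).
rent = [268, 130, 130, 253, 211, 343, 134, 378, 157, 317,
        191, 191, 178, 178, 178, 178, 178, 211, 297, 303]
price = [341, 242, 242, 297, 297, 396, 270, 462, 176, 407,
         231, 231, 226, 215, 220, 209, 220, 264, 330, 341]

b0, b1 = ols_fit(rent, price)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}")
```

Dividing both sums by n - 1 would give the covariance and variance themselves; the factor cancels in the ratio, so the sketch skips it.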

Page 12: AVMs and CAMA

(un-standardised) coefficients

b1 = 215747/233509 = 0.9239

b0 = 302 – 0.9239 * 251 = 70.10

Both are significantly different from 0 at the 0.01% level

y = 70.10 + 0.9239x

So that’s £70,100 plus about £924 for every £1 of monthly rent (price is in £000s, so 0.9239 x monthly rent is in thousands of pounds)...

ID    PRICE (y)  RENT (x)  x – xbar (a)  y – ybar (b)  (a) * (b)  (a)^2
...   ...        ...       ...           ...           ...        ...
50    407        317       65.41         105           6878       4279
51    231        191       -59.99        -71           4251       3598
52    231        191       -59.99        -71           4251       3598
53    226        178       -73.19        -76           5588       5356
54    215        178       -73.19        -87           6393       5356
55    220        178       -73.19        -82           5991       5356
56    209        178       -73.19        -93           6796       5356
57    220        178       -73.19        -82           5991       5356
58    264        211       -40.19        -38           1521       1615
59    330        297       45.61         28            1284       2081
60    341        303       51.11         39            2001       2613
sum                                                    215747     233509
mean  302        251
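As a quick check, the column sums and means in the table reproduce the slide's coefficients, and the fitted line can then be used to estimate a price from a rent. This is a sketch only: `predict_price` and the £300 rent are hypothetical, and b1 is rounded to four decimal places before computing b0 because that is how the slide's arithmetic is laid out.

```python
# Figures taken from the worked table on the slide.
SUM_AB = 215747     # sum of (x - xbar) * (y - ybar)
SUM_A2 = 233509     # sum of (x - xbar)^2
MEAN_PRICE = 302    # ybar, in £000s
MEAN_RENT = 251     # xbar, in £ per month

b1 = round(SUM_AB / SUM_A2, 4)       # gradient, rounded as on the slide
b0 = MEAN_PRICE - b1 * MEAN_RENT     # intercept, in £000s


def predict_price(rent_per_month):
    """Estimated price in £000s for a given monthly rent in £ (hypothetical helper)."""
    return b0 + b1 * rent_per_month


print(f"y = {b0:.2f} + {b1:.4f}x")
# A hypothetical property letting at £300 per month:
print(f"Estimated price: £{predict_price(300) * 1000:,.0f}")
```

By construction the line passes through the means, so a rent of £251 returns the mean price of £302,000.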

Page 13: AVMs and CAMA

Intercept (£69,593)

Page 14: AVMs and CAMA

Interpretation of the model

[Figure: scatter-plot of the response variable, y, against the predictor variable, x, showing the mean sale price line and the regression line y = b0 + b1x + ui. Annotations: total variation of observed y from mean y; regression model variation of predicted y from mean y; residual variation of predicted y from observed y.]

The mean value of the dependent variable y is a straight line on a scatter-plot as it would be the same for all values of an independent variable x

Page 15: AVMs and CAMA

Total variation (SST) of each value of y about the mean value of y is calculated by taking the sum of the squared differences between observed values of y and the mean value of y

$SS_T = \sum (y_i - \bar{y})^2$

Where $y_i$ = sale price of property i

$\bar{y}$ = average sale price

i = 1, … , n (where n is the number of sales)

Each point on the regression line (which slopes) varies from the mean value of y. This regression model variation (SSM) can be calculated as the sum of the squared differences between mean value of y and the regression line.

$SS_M = \sum (\hat{y}_i - \bar{y})^2$

Where $\hat{y}_i$ = modelled sale price of property i

Finally, residual variation (SSR) (variation unexplained by the regression model) can be calculated as the sum of the squared differences between observed values of y and the regression line.

$SS_R = \sum (y_i - \hat{y}_i)^2$

We would expect the total variation to comprise variation explained by the regression model plus residual variation, i.e. SST = SSM + SSR
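The decomposition SST = SSM + SSR can be checked numerically. The sketch below assumes a helper called `sums_of_squares` (not from the slides) and, for illustration, reuses only the 20 rows visible in the data table, so the totals it prints differ from the full 60-sale figures used elsewhere in the deck.

```python
from statistics import mean


def sums_of_squares(x, y):
    """Fit y = b0 + b1 * x by least squares and return (SST, SSM, SSR)."""
    x_bar, y_bar = mean(x), mean(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]                        # modelled prices
    ss_t = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    ss_m = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by the model
    ss_r = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # residual variation
    return ss_t, ss_m, ss_r


# Visible rows only (a subset of the 60 sales): monthly rent (£) and price (£000s).
rent = [268, 130, 130, 253, 211, 343, 134, 378, 157, 317,
        191, 191, 178, 178, 178, 178, 178, 211, 297, 303]
price = [341, 242, 242, 297, 297, 396, 270, 462, 176, 407,
         231, 231, 226, 215, 220, 209, 220, 264, 330, 341]

ss_t, ss_m, ss_r = sums_of_squares(rent, price)
print(f"SST = {ss_t:.1f}, SSM + SSR = {ss_m + ss_r:.1f}")
```

The identity holds exactly (up to floating-point error) only when the line is the least squares fit with an intercept, which is the case here.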

Page 16: AVMs and CAMA
Page 17: AVMs and CAMA

Model performance

As a measure of the size of the relationship between the two variables we can calculate the amount of variance in the values of the dependent variable (SST) which is explained by the model (SSM), i.e. explained variation divided by total variation. This is known as the coefficient of determination, R²

$R^2 = \frac{SS_M}{SS_T} = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} = \frac{199336}{250355} = 0.7962$

R² ranges from 0 to 1 and the smaller the residual variation as a percentage of total variation, the larger the R²

The F-ratio is the regression model variation (SSM) divided by the residual mean squares (MSR) and is a measure of how much the model has improved the prediction of the outcome compared to the level of inaccuracy in the model. Strictly, the numerator is the model mean square, which equals SSM here because there is only one predictor. A good model will have a high F-ratio.

$F = \frac{SS_M}{MS_R} = \frac{199336}{880} = 226.78$
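Both statistics follow from the two sums of squares quoted on the slide. The sketch below assumes n = 60 sales and one predictor, so the residual degrees of freedom are n - 2 = 58; the sample size is inferred from the data table rather than stated on this slide. Small differences from the slide's F figure can arise from rounding in the published sums.

```python
SS_T = 250355   # total variation, from the slide
SS_M = 199336   # variation explained by the model, from the slide
N = 60          # assumption: number of sales in the sample
K = 1           # number of predictors (monthly rent only)

ss_r = SS_T - SS_M              # residual variation, since SST = SSM + SSR
r_squared = SS_M / SS_T         # coefficient of determination
ms_m = SS_M / K                 # model mean square
ms_r = ss_r / (N - K - 1)       # residual mean square
f_ratio = ms_m / ms_r

print(f"R^2 = {r_squared:.4f}")
print(f"MSR = {ms_r:.1f}, F = {f_ratio:.2f}")
```

With a single predictor the F-ratio is the square of the predictor's t-statistic, which gives a second way to sanity-check the regression output.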

Page 18: AVMs and CAMA

Model parameters

• Un-standardised coefficients are in the source units for the variable
• If x significantly predicts y it should have a b significantly different from zero. This is tested using a t-test:

$t = \frac{b}{s}$

• For samples >= 60 observations (plus one additional observation for each parameter to be estimated) a predictor variable with a t-stat beyond +/-2.00 indicates 95% confidence that b does not equal 0 and therefore x is significant in predicting y (if beyond +/-2.58 then 99% confident)

           Un-standardised   Standard
           coefficients (b)  error (s)  t stat    p-value   Lower 95.0%  Upper 95.0%
Intercept  69.59342          15.89699   4.377774  5.07E-05  37.77214     101.4147
RENT_pmth  0.923935          0.061376   15.05379  1.08E-21  0.801078     1.046791
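The t-statistics and 95% bounds in the table can be reproduced from the coefficient and its standard error. In this sketch the helper name `t_stat_and_ci` is an assumption, and the critical value 2.0017 is hard-coded as the two-sided 95% value of Student's t with 58 degrees of freedom (n = 60 sales less two estimated parameters), rather than looked up from a statistics library.

```python
T_CRIT_95 = 2.0017  # assumption: two-sided 95% critical value of t with 58 df


def t_stat_and_ci(b, se, t_crit=T_CRIT_95):
    """Return (t statistic, lower bound, upper bound) for a coefficient b."""
    t = b / se
    return t, b - t_crit * se, b + t_crit * se


# Coefficients and standard errors from the regression output table.
for name, b, se in [("Intercept", 69.59342, 15.89699),
                    ("RENT_pmth", 0.923935, 0.061376)]:
    t, lower, upper = t_stat_and_ci(b, se)
    print(f"{name}: t = {t:.3f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Neither interval contains zero, which is the same conclusion as a t-stat beyond +/-2.00.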

Page 19: AVMs and CAMA

Model residuals

Residuals (difference between observed and predicted outcomes) should be normally distributed about the predicted responses with a mean of zero. A normal P-P plot of standardised residuals is a check on normality: plotted points should follow a straight line.

Page 20: AVMs and CAMA

So rent is a pretty good predictor of price. This is unsurprising as investors (buy-to-let) pay prices that bear a relationship (expressed as a yield or multiple) to the rent.

When the model fit is appropriate a scatter-plot of standardised residuals against predicted responses should be random, centred on the line of zero standardised residual value
– Standardised residuals with z-scores beyond +/-3 are outliers and therefore concerning
– If more than 1% of standardised residuals have a z-score beyond +/-2.5 the error in the model is unacceptable
– If more than 5% of standardised residuals have a z-score beyond +/-2 this is also evidence that the model poorly represents the data
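These three rules of thumb are easy to automate. The sketch below is one possible reading, not a standard routine: `residual_checks` is a hypothetical helper, residuals are standardised simply by subtracting their mean and dividing by their standard deviation, and the sample residuals are made up for illustration.

```python
from statistics import mean, pstdev


def residual_checks(residuals):
    """Apply the z-score rules of thumb to a list of raw residuals."""
    mu, sd = mean(residuals), pstdev(residuals)
    z = [(r - mu) / sd for r in residuals]  # simple standardisation (assumption)

    def share_beyond(limit):
        # Fraction of standardised residuals with |z| above the limit.
        return sum(abs(v) > limit for v in z) / len(z)

    return {
        "outliers": [i for i, v in enumerate(z) if abs(v) > 3],
        "share_beyond_2_5": share_beyond(2.5),
        "share_beyond_2": share_beyond(2),
        "fails_1pct_rule": share_beyond(2.5) > 0.01,
        "fails_5pct_rule": share_beyond(2) > 0.05,
    }


# Made-up residuals: 99 small errors and one large one at position 99.
sample = [1.0, -1.0] * 49 + [1.0, 20.0]
report = residual_checks(sample)
print(report)
```

Statistical packages usually standardise with the residual standard error from the fit instead, so the exact shares can differ slightly from this simplified version.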