Statistical Theory of Estimation

The Phone Keypad Number Puzzle

Upload: acquate

Post on 10-Jan-2017

Category:

Business


TRANSCRIPT

Page 1: Statistical Theory of Estimation

The Phone Keypad Number Puzzle

Page 2: Statistical Theory of Estimation

The Phone Keypad Number Puzzle

Page 3: Statistical Theory of Estimation

The Phone Keypad Number Puzzle

Presenter
Presentation Notes
The sum of the numbers in any line of the keypad, vertical, horizontal or diagonal, is always divisible by 3. The key is in the remainders.
Page 4: Statistical Theory of Estimation

1 / 3 => remainder 1

2 / 3 => remainder 2

3 / 3 => remainder 0

Presenter
Presentation Notes
The sum of the numbers in any line of the keypad, vertical, horizontal or diagonal, is always divisible by 3. The key is in the remainders.
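A quick way to check this claim is to enumerate the eight lines of the keypad. The sketch below assumes the standard 3 x 3 layout of the digits 1 to 9 and leaves out the 0 key; the names KEYPAD and keypad_lines are only illustrative.

```python
# Standard phone keypad digits 1-9 (assumed layout; the 0 key is ignored).
KEYPAD = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]


def keypad_lines(grid):
    """Return the 3 rows, 3 columns and 2 diagonals of a 3 x 3 grid."""
    rows = [list(row) for row in grid]
    cols = [[grid[r][c] for r in range(3)] for c in range(3)]
    diagonals = [
        [grid[i][i] for i in range(3)],
        [grid[i][2 - i] for i in range(3)],
    ]
    return rows + cols + diagonals


for line in keypad_lines(KEYPAD):
    remainders = [n % 3 for n in line]
    print(line, "sum =", sum(line), "remainders =", remainders)
```

Every line either contains the remainders 1, 2 and 0 once each, or the same remainder three times, so the remainders always add up to a multiple of 3.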
Page 5: Statistical Theory of Estimation

Why Does a Circle Have 360 Degrees?

Page 6: Statistical Theory of Estimation

Babylonians

Presenter
Presentation Notes
Why not 100 or 500 or even 720? Could be Greeks as well, but these guys
Page 7: Statistical Theory of Estimation

Babylonians

Presenter
Presentation Notes
Why not 100 or 500 or even 720? Could be Greeks as well
Page 8: Statistical Theory of Estimation

What does 360 resemble?

Presenter
Presentation Notes
Why not 100 or 500 or even 720? Could be Greeks as well
Page 9: Statistical Theory of Estimation

What does 360 resemble?

Presenter
Presentation Notes
An Earth year is roughly 365 days. So every day the sun appears to move about 1/365 of the way along a huge circle all the way around the Earth (called the ecliptic). If you lived a few millennia ago and did not have modern instruments to calculate this accurately, you would think it is 360. Make sense?
Page 10: Statistical Theory of Estimation

Babylonian Calendar

Presenter
Presentation Notes
Also, 360 is a lovable number. It is divisible by 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 18, 20, 24, 30, 36, 40, 45, 60, 72, 90, 120, 180 and of course 360. That makes 360 a really convenient number, because it means we can divide a circle into 2, 3, 4, 5, 6, 8, 9, 10, 12, and so on equal parts. It makes solving problems by hand (which, mind you, was the only way to solve problems thousands of years ago) much easier.
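The divisor list can be reproduced with a few lines of Python. This is only a sketch; the helper name divisors and the choice of comparison numbers (100 and 500, taken from the earlier question) are illustrative.

```python
def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


print(divisors(360))

# Compare with the round numbers asked about earlier in the notes.
for candidate in (100, 360, 500):
    print(candidate, "has", len(divisors(candidate)), "divisors")
```

Counting 1 as well, 360 has 24 divisors, while 100 has 9 and 500 has 12.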
Page 11: Statistical Theory of Estimation

TJ Gokcen @tjgokcen

Page 12: Statistical Theory of Estimation

Estimating the maximum of a discrete uniform distribution from sampling data

TJ Gokcen @tjgokcen

Page 13: Statistical Theory of Estimation

German Tank Problem

Presenter
Presentation Notes
During the course of the war, the Western Allies made sustained efforts to determine the extent of German production and approached this in two major ways: conventional intelligence gathering and statistical estimation. In many cases, statistical analysis substantially improved on conventional intelligence. In some cases, conventional intelligence was used in conjunction with statistical methods, as was the case in estimation of Panther tank production just prior to D-Day.
Page 14: Statistical Theory of Estimation
Presenter
Presentation Notes
The Allied command structure had thought the Panzer V (Panther) tanks seen in Italy, with their high-velocity, long-barreled 75 mm/L70 guns, were unusual heavy tanks and would only be seen in northern France in small numbers. But it was important to know how many of these tanks were being produced and then determine where they could be sent. Shortly before D-Day, rumors indicated that large numbers of Panzer V tanks were being used.
Page 15: Statistical Theory of Estimation
Presenter
Presentation Notes
The Allies attempted to estimate the number of tanks being produced. To do this, they used the serial numbers on captured or destroyed tanks. The principal numbers used were gearbox numbers, as these fell in two unbroken sequences. Chassis and engine numbers were also used, though their use was more complicated. Various other components were used to cross-check the analysis. Similar analyses were done on tires, which were observed to be sequentially numbered.
Page 16: Statistical Theory of Estimation
Presenter
Presentation Notes
Analysis of wheels from two tanks (32 road wheels each, 64 road wheels total) yielded an estimate of 270 tanks produced in February 1944, substantially more than had previously been suspected. German records after the war showed production for the month of February 1944 was 276. The statistical approach proved to be far more accurate than conventional intelligence methods, and the phrase "German tank problem" became accepted as a descriptor for this type of statistical analysis. So how successful were they at estimating?
Page 17: Statistical Theory of Estimation
Page 18: Statistical Theory of Estimation
Page 19: Statistical Theory of Estimation
Presenter
Presentation Notes
As can be seen, they were very close to the real numbers. So how did they do it?
Page 20: Statistical Theory of Estimation

Estimators

a rule telling you how to calculate a special type of statistic that tells you not only about the properties of a sample of data, but also about the properties of the entire population from which the sample was drawn.

Page 21: Statistical Theory of Estimation

Population Maximum

an estimator rule that will help us estimate the value of the largest integer in the bag using only the values in the sample

Page 22: Statistical Theory of Estimation
Presenter
Presentation Notes
Population vs Sample data
Page 23: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

A bag of tiles with numbers on them

Page 24: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

Keep in mind we have 42 tiles in the bag

A bag of tiles with numbers on them

Page 25: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

Keep in mind we have 42 tiles in the bag

A bag of tiles with numbers on them

How do we come up with population maximum?

Page 26: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

- Twice the biggest integer: 2 x 35 = 70

A bag of tiles with numbers on them

Page 27: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

- Twice the biggest integer: 2 x 35 = 70

A bag of tiles with numbers on them

- Twice the mean value: 2 x 16.2 ≈ 32

Page 28: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

- Twice the median value

A bag of tiles with numbers on them

- Put all the numbers in numerical order: 3, 9, 10, 17, 23, 35

Presenter
Presentation Notes
- To find the median value, take the number in the middle or, if there are two numbers in the middle, take their mean.
Page 29: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

- Twice the median value

A bag of tiles with numbers on them

- Put all the numbers in numerical order: 3, 9, 10, 17, 23, 35

Page 30: Statistical Theory of Estimation

10, 23, 17, 9, 35, 3

- Twice the median value

A bag of tiles with numbers on them

- Median is (10 + 17) / 2 = 13.5, and 2 x 13.5 = 27

Presenter
Presentation Notes
All of the values are way off. So how do we calculate this?
Page 31: Statistical Theory of Estimation

pop max = sample max + (sample max / sample size) - 1

Presenter
Presentation Notes
Right, so let’s plug in our numbers
Page 32: Statistical Theory of Estimation

pop max = sample max + (sample max / sample size) - 1

pop max = 35 + (35 / 6) - 1

Our numbers: 10, 23, 17, 9, 35, 3

Page 33: Statistical Theory of Estimation

pop max = sample max + (sample max / sample size) - 1

pop max = ~40

Our numbers: 10, 23, 17, 9, 35, 3

Presenter
Presentation Notes
The population maximum is estimated to be equal to the sample maximum, plus a little bit more. And that little bit more is basically equal to the average gap between the numbers in the sample. Of course, the real formula is a bit more complicated than this.
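The four candidate rules from the slides can be compared side by side on the sample 10, 23, 17, 9, 35, 3. This is a sketch only: it assumes the 42 tiles are numbered 1 to 42, so that the true maximum is 42, and the names gap_estimate and estimates are illustrative.

```python
from statistics import mean, median

sample = [10, 23, 17, 9, 35, 3]  # tiles drawn in the slides
true_max = 42  # assumption: the 42 tiles are numbered 1..42


def gap_estimate(values):
    """Sample max plus the average gap: m + m/k - 1."""
    m, k = max(values), len(values)
    return m + m / k - 1


estimates = {
    "twice the sample max": 2 * max(sample),
    "twice the mean": 2 * mean(sample),
    "twice the median": 2 * median(sample),
    "max + max/size - 1": gap_estimate(sample),
}

for name, value in estimates.items():
    print(f"{name:22s} {value:6.1f}  (error {value - true_max:+.1f})")
```

Under that assumption, the last rule lands closest to 42 for this particular sample.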
Page 34: Statistical Theory of Estimation
Presenter
Presentation Notes
Frequentist approach to estimating the number of tanks
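The slide itself is not part of the transcript. The usual frequentist estimate for this problem is the rule m + m/k - 1, where m is the sample maximum and k the sample size, and its appeal is that it is right on average. The sketch below checks that exactly for a hypothetical bag of N tiles numbered 1..N, drawn without replacement; the function name expected_estimate is illustrative.

```python
from fractions import Fraction
from math import comb


def expected_estimate(N, k):
    """Exact expected value of m + m/k - 1 when k tiles are drawn
    without replacement from tiles numbered 1..N."""
    total = Fraction(0)
    for m in range(k, N + 1):
        # Probability that the largest of the k drawn tiles is exactly m.
        p_max = Fraction(comb(m - 1, k - 1), comb(N, k))
        total += p_max * (m + Fraction(m, k) - 1)
    return total


print(expected_estimate(42, 6))
```

For the bag from the earlier slides (N = 42, k = 6) the expected value comes out as exactly 42, which is what it means for the estimator to be unbiased.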
Page 35: Statistical Theory of Estimation
Presenter
Presentation Notes
Bayesian analysis to estimate the number of tanks
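The slide content is not in the transcript either. As a hedged sketch of the standard Bayesian treatment, assume a flat prior on the population size: the posterior is then proportional to 1 / C(n, k) for n >= m, and its mean is (m - 1)(k - 1)/(k - 2) when k >= 3. The function names below are illustrative, and the numeric version is there only to cross-check the closed form.

```python
from math import comb


def posterior_mean(m, k):
    """Closed-form posterior mean of the population size, valid for k >= 3."""
    if k < 3:
        raise ValueError("the posterior mean is only finite for k >= 3")
    return (m - 1) * (k - 1) / (k - 2)


def posterior_mean_numeric(m, k, upper=5000):
    """Same quantity from the unnormalised posterior weights 1 / C(n, k),
    summed over n = m..upper (the tail beyond `upper` is negligible here)."""
    weights = [(n, 1 / comb(n, k)) for n in range(m, upper + 1)]
    norm = sum(w for _, w in weights)
    return sum(n * w for n, w in weights) / norm


print(posterior_mean(35, 6))
print(posterior_mean_numeric(35, 6))
```

For the tile sample (m = 35, k = 6) the closed form gives 42.5, slightly above the frequentist figure of about 40.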
Page 36: Statistical Theory of Estimation
Presenter
Presentation Notes
Final formula (or thereabouts)
Page 37: Statistical Theory of Estimation

• Number of bugs
• Number of user stories
• Number of user story points
• Team Capacity

Page 38: Statistical Theory of Estimation

• It’s abused

Page 39: Statistical Theory of Estimation

• It’s abused
• Never taken as an estimate, always as the final number

Page 40: Statistical Theory of Estimation

• It’s abused
• Never taken as an estimate, always as the final number
• Used to stress out developers

Page 41: Statistical Theory of Estimation

• It’s abused
• Never taken as an estimate, always as the final number
• Used to stress out developers
• Scope Creep

Page 42: Statistical Theory of Estimation

• What is the aim of the project?
• What do we expect to get out of it?
• Where does the project fit within the organization?
• What other areas does it impact?

Page 43: Statistical Theory of Estimation

• What is my team’s capacity?
• Do we need to hire more people or outsource?
• Launch a start up for this project?
• Where does marketing come in?

Page 44: Statistical Theory of Estimation

• For iterations etc. estimating is sufficient, because you will be making granular decisions

• Otherwise, budgeting is a better fit, especially when granularity is lacking

Presenter
Presentation Notes
For more accurate estimates, you need more granularity. And the more granular it gets, the more waterfall it gets.
Page 45: Statistical Theory of Estimation

• Budget using a top-down approach
• Let’s say we’re building an online bookstore

Page 46: Statistical Theory of Estimation

• Shopping Cart
• Browse Books
• Search Books
• Manage Inventory
• Preview Inside of Book

Presenter
Presentation Notes
Do we have enough information to answer “How much is this going to cost?” Probably not. We need to get more granular. This is too high level. Let’s break down Search Books into details like By Author, By Title, etc. If we have some experience building such a component, and a balanced team should, then we can come up with some timelines. Let’s take a look at them.
Page 47: Statistical Theory of Estimation
Page 48: Statistical Theory of Estimation

• Should we build this software?

• Do we have enough info to answer: Should we make this software?

Presenter
Presentation Notes
If our budget is $500k, then we do have enough information. The answer is no, we can’t afford it. If our budget is $5M, then we do have enough information. The answer is yes, we can afford it. If our budget is $2.5M, then we do not have enough information.
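The three cases amount to comparing the budget with a ballpark cost range. The sketch below makes that explicit; the range of $1M to $4M is purely hypothetical (the slides do not state the actual figures), as is the function name should_we_build.

```python
def should_we_build(budget, low, high):
    """Compare a budget with a ballpark cost range [low, high]."""
    if budget < low:
        return "no, we can't afford it"
    if budget >= high:
        return "yes, we can afford it"
    return "not enough information, refine the estimate"


# Hypothetical ballpark range; the slides do not state the actual figures.
LOW, HIGH = 1_000_000, 4_000_000

for budget in (500_000, 5_000_000, 2_500_000):
    print(f"${budget:,}: {should_we_build(budget, LOW, HIGH)}")
```

Only a budget that falls inside the range calls for spending more effort on the estimate.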
Page 49: Statistical Theory of Estimation
Presenter
Presentation Notes
If your budget is somewhere in between, then more information is needed. Prioritize topics: Required vs Nice to Have. Then we get the confidence levels. http://www.stridenyc.com/ballpark
Page 50: Statistical Theory of Estimation
Presenter
Presentation Notes
http://www.stridenyc.com/ballpark
Page 51: Statistical Theory of Estimation
Presenter
Presentation Notes
You are essentially building your MVP and spending as little time on budgeting and estimating as possible.