Statistics: An Intro to Our World of Data, by S. Middleton, M.A.



Post on 14-Dec-2015


Statistics: An Intro to Our World of Data

by S. Middleton, M.A.

WHY STUDY STATS?

There are lots of reasons one takes a course in school but…

Statistics is Historical . . .

Statistical census taking goes back to Babylonia and ancient Egypt; however, the Roman Empire gives us its name. The word "statistics" is derived from the Latin word status, which means “state.”

An Example of a Census:

“In those days Caesar Augustus issued a decree that a census should be taken of the entire Roman world. This was the first census that took place while Quirinius was governor of Syria. And everyone went to his own town to register.” Recorded by the physician Luke in the Bible (Luke 2:1-3).

Statistics is Current . . .

Modern-day executives use statistics to shape decision making. Take, for example:

Moneyball is the story of Billy Beane, a Mt. Carmel graduate, who changed the way Major League Baseball evaluates its prospects by relying on statistical information.

Moneyball

Why Study Stats?

As students, we study statistics to be able to read and understand the statistical studies we will meet in our careers.

Statistical procedures are basic to research in all fields.

To become better citizens and consumers.

WHAT WILL I LEARN?

In this presentation and in the Stats course…

Goals for our study…

We will answer these questions:

• What are the branches of Statistics?
• What are data?
• Where do we get data?

Two Branches of Statistics

Descriptive Statistics, which uses numerical and graphical methods to look for patterns in a data set and to summarize the information in a convenient form.

Inferential Statistics, which uses sample data to make predictions, estimates, decisions, or other generalizations about a larger set of data.

DESCRIPTIVE STATISTICS

The Nilson company collected data on credit or debit purchases recorded in the U.S. during the first six months of 1998.

US Market Share for Credit Cards

VISA                50%
MASTER CARD         25%
AMERICAN EXPRESS    18%
DISCOVER             6%
DINERS CLUB          1%

• The U.S. market for credit cards.

DESCRIPTIVE STATISTICS

With descriptive statistics, the statistician tries to DESCRIBE a situation.

Often the data is presented in some meaningful form, such as charts, graphs, or tables.
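Here is a small sketch of what "describing" looks like in practice. It uses only the five market-share percentages from the credit-card slide; the function names and the plain-text bar chart are illustrative choices, not part of the original presentation.

```python
# Market shares (percent) copied from the credit-card slide.
market_share = {
    "Visa": 50,
    "MasterCard": 25,
    "American Express": 18,
    "Discover": 6,
    "Diners Club": 1,
}

def describe_shares(shares):
    """Return (total, largest_brand, rows sorted from largest to smallest share)."""
    rows = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    total = sum(shares.values())
    largest_brand = rows[0][0]
    return total, largest_brand, rows

def text_bar_chart(rows, scale=2):
    """Build a plain-text bar chart: one '#' per `scale` percentage points (at least one)."""
    lines = []
    for brand, pct in rows:
        bar = "#" * max(1, pct // scale)
        lines.append(f"{brand:<17}{pct:>3}% {bar}")
    return "\n".join(lines)

total, largest_brand, rows = describe_shares(market_share)
print(text_bar_chart(rows))
print(f"Total: {total}%   Largest share: {largest_brand}")
```

Nothing here goes beyond the data: the code only sorts, totals, and draws what was collected, which is exactly the job of descriptive statistics.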

A second branch is…

INFERENTIAL STATISTICS

Family Home Journal Study

A group of 1,017 men, all 48 years old, was studied for 18 years. Among the unmarried men, 60% to 70% were still alive at age 65. Among the married men, 90% were still alive at age 65. The researchers concluded that marriage contributes to the length of one’s life.

What are data?

• The KINDS of data.
• The SOURCE of data.
• TECHNIQUES in obtaining data.
• The MEASUREMENT classifications.

Data can be … (classified by Kind)

QUANTITATIVE - involves numbers

OR

QUALITATIVE - sometimes called “categorical”

DATA . . . Classified for you

Data
    Qualitative (Categorical)
    Quantitative (Measured)
        Discrete (can be counted): 1 or 2, but not 1.2
        Continuous (values): 1, 1.01, 1.07, 1.1, etc.

Data has a source …

A POPULATION - “All of the Observations or Measurements”

A SAMPLE - “A Portion of the Population”

An Example from Hawaii

The Department of Agriculture wants to know if this crop of pineapples is undersized. They take the individual weights of a sample of 100 pineapples from an experimental field for study.

What’s the Population? The weights of all the pineapples in the field.

What’s the Sample? The weights of the 100 pineapples that were selected.
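The pineapple example can be sketched in a few lines of Python. Everything numeric here is made up for illustration: the field size of 5,000, the average weight of 4.0 pounds, and the spread of 0.5 pounds are assumptions, not figures from the study. Only the sample size of 100 comes from the slide.

```python
import random
import statistics

def simulate_field(num_pineapples, seed):
    """Create a hypothetical population of pineapple weights (pounds)."""
    rng = random.Random(seed)
    # Assumed values: mean 4.0 lb, standard deviation 0.5 lb.
    return [rng.gauss(4.0, 0.5) for _ in range(num_pineapples)]

def draw_sample(population, sample_size, seed):
    """Pick `sample_size` pineapples at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

population = simulate_field(5000, seed=1)       # every pineapple in the field
sample = draw_sample(population, 100, seed=2)   # the 100 that get weighed

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population size: {len(population)}, mean weight: {population_mean:.2f} lb")
print(f"Sample size:     {len(sample)}, mean weight: {sample_mean:.2f} lb")
```

In real life the population mean is unknown, which is the whole point: the sample mean is used to estimate it. The simulation lets us peek at both.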

TECHNIQUES to Produce data

• Observation
• Experiment
• Simulation
• Survey

Data can be Produced by… Observation

The researcher merely observes what is happening or what has happened in the past.

Motorcycle Industry Council - The council collected data on the ages and incomes of motorcycle owners in 1980 and then again in 1998. The researchers merely stated that motorcycle owners were getting older (USA Today). There was no research intervention.

Data can be Produced by… Experimentation

The researcher manipulates one of the variables and tries to determine how the manipulation influences the other variables.

Virginia Polytechnic University (Psychology Today) - Researchers divided female undergraduate students into two groups and had them do as many sit-ups as possible in 90 seconds. The first group was only told “to do your best,” while the second group was told to try to increase their best by 10% each day. They were measured again after 4 days to see what happened.
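How might the students be divided into two groups? One common approach is random assignment, sketched below. This is an assumption for illustration: the slide does not say how the researchers formed their groups, and the student labels and group size of 20 are hypothetical.

```python
import random

def assign_groups(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels: student_01 ... student_20.
students = [f"student_{i:02d}" for i in range(1, 21)]

# Group 1 is told "do your best"; group 2 is told to improve by 10% each day.
do_your_best, improve_by_10_percent = assign_groups(students, seed=42)

print("Do your best:  ", do_your_best)
print("Improve by 10%:", improve_by_10_percent)
```

Letting chance decide who lands in which group helps keep the two groups alike in everything except the instruction they receive, so a difference in sit-ups can be credited to the instruction.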

Data can be Produced by… Simulation

A researcher may use probability experiments to mimic real-life situations that would be too costly, dangerous, or time-consuming to study directly.

NASA - Space shuttle pilots are trained using a simulator rather than learning on the real shuttle.
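A probability experiment can also be simulated on a computer. The sketch below is not from the slides: it picks a simple stand-in question (how often do two dice add up to 7?) and answers it by rolling virtual dice many times, then compares the estimate with the exact answer of 6/36.

```python
import random

def estimate_probability_of_seven(num_rolls, seed=None):
    """Roll two simulated dice `num_rolls` times; return the fraction that sum to 7."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_rolls):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / num_rolls

estimate = estimate_probability_of_seven(100_000, seed=0)
exact = 6 / 36   # six of the 36 equally likely outcomes sum to 7

print(f"Simulated estimate: {estimate:.4f}")
print(f"Exact probability:  {exact:.4f}")
```

The same idea scales up: when the real experiment is too expensive or too risky, run a model of it many times and study the results.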

Data can be Produced by… Survey

One of the most common methods for obtaining information is a survey. There are many types, but 4 common methods are: telephone, mailed questionnaire, personal interview, and surveying records.

Literary Digest - In 1932, Literary Digest conducted a survey by mailing questionnaires to subscribers, asking questions about the upcoming election.

Which Technique is Best ?

In the following slides, decide which of the techniques just presented (Observation, Experimentation, Simulation, or Survey) seems most appropriate for each scenario.

A study of the effect of stopping the cooling process of a nuclear reactor.

Probably SIMULATION.

I don’t think you want a meltdown!

Study the effect of calcium supplements given to young girls on their bone mass.

EXPERIMENTATION

This is very similar to studies the AMA performs regularly. In fact, this one was done by Tom Lloyd, who studied 94 girls, half of whom were given the calcium and half of whom were given a placebo. He found that 1.3% more bone mass was gained by girls using the calcium treatment. He published his findings in the “Journal of the American Medical Association.”

Study the credits earned of each student enrolled at MCHS.

Surveying Records - The registrar keeps these records for every student, so you could request a report of this data.

Data can be CLASSIFIED by how it is measured.

There are 4 levels of measurement:

• Nominal Level
• Ordinal Level
• Interval Level
• Ratio Level

Nominal Level

Nominal means “in name only.” It refers to data that cannot be organized or ranked: no data value is greater than another.

Examples include: color, names of cities, ski areas, etc.

Ordinal Level

Ordinal Level includes categorical data that can be ranked or placed in an order, but the actual differences between data values cannot be determined or are meaningless.

Examples include: rankings of NBA teams, AAA ratings of motels, ratings of digital cameras, etc.

Interval Level

Interval Level data can be ranked, and the differences between the values are meaningful. There is NO real ZERO involved, however, so ratios between values are not meaningful.

Examples include: temperature readings, years when Democrats won elections, etc.

Ratio Level

Ratio Level allows for ranking, taking differences, and finding a “ratio” between the data values. It makes sense to say that one data value is “twice” as large as the other.

Examples include: length, time lapse between checks at a bank, temperature measured in kelvins (K).
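Temperature is a handy way to see the difference between Interval and Ratio. The sketch below assumes the everyday temperature readings are in degrees Celsius (the slides do not name a scale) and uses two made-up readings, 10 and 20 degrees. The conversion K = C + 273.15 is the standard one.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to kelvins."""
    return celsius + 273.15

low_c, high_c = 10.0, 20.0   # hypothetical readings in degrees Celsius

# Differences are meaningful on both scales and come out the same size.
difference_c = high_c - low_c
difference_k = celsius_to_kelvin(high_c) - celsius_to_kelvin(low_c)

# Ratios only make sense where zero means "none": kelvins, not Celsius.
naive_ratio = high_c / low_c
kelvin_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(f"Difference: {difference_c:.1f} C = {difference_k:.1f} K")
print(f"Celsius 'ratio': {naive_ratio:.2f}   Kelvin ratio: {kelvin_ratio:.3f}")
```

Dividing the Celsius numbers suggests the second day was "twice as hot," but on the Kelvin scale, where zero is a true zero, it is only about 3.5% warmer. That is why Celsius readings are Interval and kelvin readings are Ratio.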

Levels of Measurement

Classify each of the following data values as Nominal, Ordinal, Interval, or Ratio.

• The Senator’s name is Sam Wilson. (Nominal)
• The Senator is 58 yrs old. (Ratio)
• He was elected to the Senate in 1980, 1986, and 1992. (Interval)
• His income is $878,314. (Ratio)
• A leading magazine claims he is ranked 7th based on his voting record on bills regarding schools. (Ordinal)
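One way to remember the four levels is that each level keeps everything the level below it allows and adds one more meaningful operation. The sketch below encodes that idea; the `Level` class, the operation labels, and the dictionary of senator facts are illustrative names, and the classifications are the ones given for the Senator example.

```python
from enum import IntEnum

class Level(IntEnum):
    """The four levels of measurement, lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

def meaningful_operations(level):
    """List the comparisons that make sense at a given level."""
    operations = ["same / different"]
    if level >= Level.ORDINAL:
        operations.append("rank or order")
    if level >= Level.INTERVAL:
        operations.append("differences")
    if level >= Level.RATIO:
        operations.append("ratios")
    return operations

# The Senator example, with each fact paired to its level.
senator_facts = {
    "name (Sam Wilson)": Level.NOMINAL,
    "age (58 years)": Level.RATIO,
    "election years (1980, 1986, 1992)": Level.INTERVAL,
    "income ($878,314)": Level.RATIO,
    "magazine ranking (7th)": Level.ORDINAL,
}

for fact, level in senator_facts.items():
    print(f"{fact}: {level.name} -> {', '.join(meaningful_operations(level))}")
```

For instance, it makes sense to say the Senator's income is twice someone else's (Ratio), and that 1992 is 6 years after 1986 (Interval), but not that 1992 is "1.003 times" 1986.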

In Summary, Here’s What YOU should know.

• The KINDS of data: Quantitative & Qualitative
• The SOURCE of data: Population or Sample
• Techniques for PRODUCING data: Observation, Survey, Experiment, Simulation
• CLASSIFICATION by measurement level: Nominal, Ordinal, Interval, or Ratio

It’s YOUR turn!!!!!!

Take each of the questions we answered in the survey at the beginning of this unit and classify the level of measurement of the data values collected.