© Boardworks Ltd 2005 1 of 49 D1 Planning and collecting data KS4 Mathematics
Contents
D1 Planning and collecting data
D1.1 Specifying the problem and planning
D1.2 Types of data
D1.3 Collecting data
D1.4 Sampling
D1.5 The stages of research
Formulating a hypothesis
The first step in planning a statistical enquiry is to decide what problem you want to explore.
This can be done by asking questions that you want your data to answer and by stating a hypothesis.
A hypothesis is a statement that you believe to be true but that you have not yet tested.
The plural of hypothesis is hypotheses.
For example,
“Year Eleven pupils with paid jobs don’t do as well in their exams.”
Forming a hypothesis
“Year Eleven pupils with paid jobs don’t do as well in their exams.”
How could you find out if this statement is true?
Think about:
What data (information) would you need to collect?
How will you collect it?
Which Year Elevens does this statement cover? How could you ensure the data you collect represents all of these Year Elevens?
What would you do with the data?
What would you expect to find?
Key vocabulary
hypothesis – a statement that can be tested
population – the group (often of people) referred to in the hypothesis
sample – a selection from the population
biased sample – an unfair selection
representative sample – a fair selection
cross section – a selection that reflects all the subgroups within the population
objective data – information that is not affected by people’s opinions
subjective data – information that is affected by people’s opinions
primary data – information you collect yourself, by asking people, measuring, carrying out experiments, and so on
secondary data – information that has been collected already, that you get from books, the internet, and so on
ethical issues – problems to do with confidentiality and personal questions
reliable results – results that will be repeated if the experiment or survey is carried out again with a new sample
Reliable results
“The bigger the sample size, the more reliable the results.”
Do you agree with this statement?
Generally, the statement is true so long as the sample is fair and the conditions in which the data was collected were normal.
For example, suppose we had an experiment into reaction times using a class of ten-year-olds.
This sample is not representative of the population at large, so the results are not reliable if they are applied to the wider population.
Using key words
“Year Eleven pupils with paid jobs don’t do as well in their exams.”
What decisions will you have to make about the population?
You decide on a sample size of 20. What are the risks of choosing a small sample?
You decide to ask 20 of your friends. What kind of sample is this likely to produce?
You need to have a cross section of the population. You decide to have 10 boys and 10 girls. What other factors do you need to take into account?
You ask people to estimate the number of hours they spend on housework as well. What is the problem with this?
If you were to do a survey, what questions would you ask?
Planning how to test a hypothesis
“People feel stressed when they have exams.”
“You get less work done when it is noisy.”
“Sleep deprivation affects concentration.”
“Coffee can help you revise better.”
“The more revision you do, the better your exam results.”
Choose one of these hypotheses and discuss how you would decide on:
the population
the sample size
how you will ensure the sample is representative
what data you will collect and how you will collect it
any problems you might encounter.
Extending a hypothesis
Once you have collected data and drawn conclusions about your hypothesis, you could ask further questions and pursue other lines of enquiry.
You will need to plan what these might be beforehand if you are carrying out a survey.
How could you extend these hypotheses?
What extra information might it be worth collecting?
“People feel stressed when they have exams.”
“You get less work done when it is noisy.”
“Sleep deprivation affects concentration.”
“Coffee can help you revise better.”
“The more revision you do, the better your exam results.”
D1.2 Types of data
Measuring stress
“People feel stressed when they have exams.”
Kelly decides to ask 30 Year Eleven pupils how stressed out they are during their mocks.
What are the problems with this approach?
Think about:
the subjectivity of the data
issues of confidentiality
how she will record their answers
whether the data will enable her to decide whether her hypothesis is correct or not.
Using a scale
“People feel stressed when they have exams.”
Kelly decides to use the questionnaire below.
Circle the most appropriate number for each statement. The statements refer to the time of your mocks.
The ends of the scale are labelled “strongly agree” and “strongly disagree”.
• I am not sleeping well. 1 2 3 4 5
• I feel anxious. 1 2 3 4 5
• I feel sick or have stomach problems. 1 2 3 4 5
• I often get upset or angry. 1 2 3 4 5
What could she do with her results?
Collecting numerical data
“People feel stressed when they have exams.”
Kelly decides to add the numbers circled by each participant to give them a total score. She calls this their “stress score”. These are her results.
10 15 11 10 15
14 14 12 7 16
8 16 12 9 14
18 12 17 15 17
16 11 18 14 12
14 10 15 13 11
Does Kelly have enough information to confirm her hypothesis?
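The scoring described above can be sketched in Python. This is only an illustration: the function name stress_score and the single example answer sheet are assumptions, while the 30 totals are Kelly's results as listed on the slide.

```python
from statistics import mean, median


def stress_score(circled):
    """Add the four circled numbers (each from 1 to 5) to give one total."""
    if len(circled) != 4 or any(n not in range(1, 6) for n in circled):
        raise ValueError("expected four answers, each from 1 to 5")
    return sum(circled)


# Hypothetical answers from one participant (not taken from the slides).
example_total = stress_score([3, 4, 2, 5])

# Kelly's 30 stress scores, copied row by row from the slide.
scores = [
    10, 15, 11, 10, 15,
    14, 14, 12, 7, 16,
    8, 16, 12, 9, 14,
    18, 12, 17, 15, 17,
    16, 11, 18, 14, 12,
    14, 10, 15, 13, 11,
]

average = mean(scores)      # arithmetic mean of the 30 totals
middle = median(scores)     # middle value when the totals are sorted
lowest, highest = min(scores), max(scores)
```

Summaries like these describe the group, but on their own they give nothing to compare the scores against.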
Collecting numerical data
Kelly issues the same questionnaire to the same participants a month later. Two participants are unable to take part, and their results are removed. She now has two items of data for each participant. Here are the results (a hyphen marks a missing value):
Mock After   Mock After   Mock After   Mock After   Mock After
10    -      15    8      11   12      10    2      15   14
14   13      14    8      12   10       7    4      16   10
 8    8      16    5      12   10       9    -      14    8
18   14      12    2      17   12      15    7      17    9
16   12      11    5      18    9      14    3      12    6
14   11      10   10      15    5      13    1      11    4
What could Kelly do with her results?
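One possibility is to compare each participant's two scores. The sketch below is a hedged illustration in Python that uses only the 28 participants with both a "Mock" and an "After" score; the variable names are assumptions.

```python
from statistics import mean

# (mock, after) pairs for the 28 participants who have both scores.
pairs = [
    (15, 8), (11, 12), (10, 2), (15, 14),
    (14, 13), (14, 8), (12, 10), (7, 4), (16, 10),
    (8, 8), (16, 5), (12, 10), (14, 8),
    (18, 14), (12, 2), (17, 12), (15, 7), (17, 9),
    (16, 12), (11, 5), (18, 9), (14, 3), (12, 6),
    (14, 11), (10, 10), (15, 5), (13, 1), (11, 4),
]

# A positive change means the score was higher during the mocks.
changes = [mock - after for mock, after in pairs]

mean_change = mean(changes)
fell = sum(1 for c in changes if c > 0)   # score dropped after the mocks
rose = sum(1 for c in changes if c < 0)   # score went up after the mocks
same = changes.count(0)                   # no change
```

Working with differences keeps each participant's two scores linked, which a comparison of the two overall averages would not do.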
Numerical and non-numerical data
Sidra has carried out a survey too. She has asked participants to answer the question “Do you feel stressed?” during and after the mocks. Here are both girls’ results:
Kelly’s results
Mock After   Mock After   Mock After
10    7      11    5       9    6
14   13      10   10      15    7
 8    8      11   12      14    3
18   14      12   10      13    1
16   12      12   10      15   14
14   11      17   12      16   10
15    8      18    9      14    8
14    8      15    5      17    9
16    5      10    2      12    6
12    2       7    4      11    4
Sidra’s results
Mock After   Mock After   Mock After
Y     N      Y     Y      Y     N
Y     N      Y     Y      Y     Y
N     N      Y     N      N     N
Y     N      Y     Y      Y     N
Y     N      Y     Y      N     Y
N     Y      Y     Y      Y     N
Y     Y      N     N      N     N
Y     N      N     N      Y     Y
N     N      Y     Y      N     Y
Y     Y      Y     Y      N     Y
Whose results are better and why?
Types of data
Data can be either:
numerical or quantitative data
non-numerical or qualitative data
Examples of quantitative data include heights, time and age.
Examples of qualitative data include opinions, favourite subjects, eye colour and gender.
Types of data
Which kind of data is each of these?
1) people’s opinions about third world debt
2) how much sleep you have had each night this month
3) whether people are left handed or right handed
4) the number of full stops in different books
5) how you felt after your last exam
6) how popular your favourite band is among your friends
7) which supermarket people prefer
8) the number of Year Elevens who are stressed
Now think of your own examples of numerical and non-numerical data.
Measurements
Numerical data can be either:
continuous
discrete
Which of the examples of numerical data given below would need to be rounded off?
Shoe size
The number of goals in a football match
The temperature of a classroom
The time taken to complete a task
The number of GCSE grade A*s achieved in your school last year
The number of marks gained in a dance exam
The height of a mountain
Discrete data
Discrete data jumps from one measurement to the next. The measurements in between have no meaning.
Shoe size
You can have a shoe size of 4 or 4½ but not 4¼.
Number of goals in a football match
You can score 2 goals but not 2.5.
The number of GCSE grade A*s achieved in your school last year
There could have been 40 or 41 A* grades but not 40.1.
The number of marks gained in a dance exam
You could get 60 but not 60.8 in the exam.
Continuous data
Continuous data does not jump from one measurement to the next, but passes smoothly through all the measurements in between.
The temperature of a classroom
The temperature could be 21 °C, 21.1 °C, 21.01 °C or ….
The time taken to complete a task
The time could be 57 secs, 57.1 secs, 57.01 secs or ….
The height of a mountain
The height could be 300 m, 300.6 m, 300.0006 m or ….
D1.3 Collecting data
Writing a questionnaire
“Year Eleven pupils with paid jobs don’t do as well in their exams.”
Task 1
You are about to be shown a questionnaire designed to investigate this hypothesis. Discuss how it could be improved.
Think about it from the point of view of
the participants
the researcher collating and analysing the data.
Task 2
Write an improved questionnaire.
Questionnaire about jobs in Year Eleven
Name: …………… Form: ……………
1. Do you have a paid job? ……………
2. If so, how many hours do you do in a week? ……………
3. What is your job? ……………
4. How long have you had it? ……………
5. Do your parents make you do any jobs at home? ……………
6. If so, how many hours do you do in a week? ……………
7. How many hours of revision did you do for your mocks? ……………
8. What were your mock results like? ……………
9. Do you think you could have done better if you didn’t have a job? ……………
Guidelines for writing a questionnaire
When writing your own questionnaire, it can be helpful to use the following guidelines:
Give participants the option of remaining anonymous.
Ask for the participant’s gender.
Anticipate problems such as participants working different hours each week.
Think about whether participants will have the information you ask for, such as mock results.
Think about using tick boxes or scales to make it easy to fill in as well as easy to analyse.
Be specific: ask for mock grades for named subjects.
Don’t ask leading questions.
Make sure questions are not ambiguous or misleading.
Carry out a “pilot study” by testing the questionnaire on a few friends first to see if there are any problems.
Only ask relevant questions.
Data collection sheets
A data collection sheet is a table, or series of tables, where the data on the questionnaire is collated.
Gender   Hours of work   Maths mock   Science mock
M        3               A            B
M        1               B            A
M        6               E            C
F        2               C            B
F        0               B            A*
The table above is an example of a part of a data collection sheet. Draw up a data collection sheet for your questionnaire.
D1.4 Sampling
Soap wars
How are TV viewing figures compiled?
[Chart: “Viewing figures” in millions (axis marked from 2 to 12) for Westenders and Carnation Street, by month from JAN to AUG]
Television viewing figures
When compiling television viewing figures, it is impractical to find out what everyone in the country is watching at a particular time.
Instead, the viewing habits of a sample of households are carefully monitored and the data collected is used to compile the figures.
To avoid bias, it is important that the sample is representative of all television viewing households across the country.
This is done by dividing households into categories and taking samples in proportion to the size of each category.
This is an example of a stratified sample.
Different sampling methods
Random sampling
People are chosen at random, e.g. names picked from a hat or using a random number generator on a calculator. Every member of the population has an equal chance of being chosen.
Systematic sampling
Members of the population are chosen at regular intervals, such as every 100th person from a telephone directory.
Quota sampling
You keep asking until you have enough people from each category. An example would be a survey in the street where you stop when you have enough people from each age category.
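Random and systematic sampling can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the list of 1000 numbered people, the function names and the seed value are all assumptions.

```python
import random


def random_sample(population, size, seed=None):
    """Pick `size` different members; each member is equally likely."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    return rng.sample(population, size)


def systematic_sample(population, interval, start=0):
    """Pick every `interval`-th member, beginning at index `start`."""
    return population[start::interval]


# Hypothetical population: 1000 numbered people, as if from a directory.
people = [f"Person {i}" for i in range(1, 1001)]

chosen_at_random = random_sample(people, 10, seed=1)
every_100th = systematic_sample(people, 100, start=99)  # Person 100, 200, ...
```

Quota sampling is left out because it depends on who happens to be asked, not on a rule applied to a list.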
Stratified sampling
Imagine you are going to investigate the hypothesis
“People who get married at 20 are more likely to divorce than people who get married at 30”
Assume the population is that of married people in Great Britain.
How would you select a sample?
Relevant factors could be:
the year when they got married
socio-economic class
parents’ marital status
The relevant factors are also called variables.
Year of marriage is one variable. We could split this into decades.
The different decades are called strata. For example, the 1970s might be one stratum.
What are the strata for the variable “parents’ marital status”?
The strata for “parents’ marital status” could be:
married
divorced
separated
widowed
single
living with partner
If we want the sample to represent a cross section of the population, the size of each stratum in the sample needs to reflect its proportion in the population.
For example, if 10% of all married people in Great Britain were married in the 1970s, then 10% of the sample needs to come from this stratum.
The actual number then depends on the number of people in the sample.
For example, if there are 3000 people in the sample, 300 of them should be people married in the 1970s.
Using stratified sampling
“Pupils from Town A are more likely to be late for school than pupils from Town B.”
40% of pupils in a school are boys and 60% are girls.
30% live in Town A and 70% in Town B.
There are 450 pupils in the school.
You want a sample of 60.
Construct a stratified sample which reflects both the proportions of male and female and where they live.
If there are 60 pupils in our sample then 30% must come from Town A and 70% from Town B.
Number of pupils from Town A = 30% of 60 = 18
Number of pupils from Town B = 70% of 60 = 42
Of the 18 pupils from Town A, 40% are boys and 60% are girls.
Number of boys from Town A = 40% of 18 = 7.2 ≈ 7
Number of girls from Town A = 60% of 18 = 10.8 ≈ 11
Of the 42 pupils from Town B, 40% are boys and 60% are girls.
Number of boys from Town B = 40% of 42 = 16.8 ≈ 17
Number of girls from Town B = 60% of 42 = 25.2 ≈ 25
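The same calculation can be written as a short Python sketch. The function name stratified_counts and the dictionary layout are assumptions; like the worked example, it assumes the boy/girl split is the same in both towns.

```python
def stratified_counts(sample_size, town_percent, gender_percent):
    """Share a sample between strata in proportion to the population.

    Percentages are whole numbers, so 30 means 30%.
    """
    counts = {}
    for town, t in town_percent.items():
        for gender, g in gender_percent.items():
            exact = sample_size * t * g / 10000  # e.g. 60 * 30 * 40 / 10000 = 7.2
            counts[(town, gender)] = round(exact)
    return counts


counts = stratified_counts(
    60,
    {"Town A": 30, "Town B": 70},
    {"boys": 40, "girls": 60},
)
total = sum(counts.values())  # worth checking against the sample size
```

Rounding each stratum separately will not always give counts that add up to the sample size, so the total is worth checking; with these figures it does come to 60.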
A guide to using stratified sampling
First decide which variables are relevant to your hypothesis.
Decide how many strata there are for each variable.
Find out what percentage of the population each of the different strata makes up.
Decide on your sample size.
Calculate how many people you need for each of the strata.
To select the people in each of the strata, use another sampling method, for example random sampling.
Using a calculator to generate a random sample
Suppose you have a list of 100 people and want to select 20 of them randomly. This can be done using the random number generator on your calculator.
Number each person from 0 to 99.
Key 100 into your calculator, followed by the RAN # button.
Press equals. This gives you a number between 0 and 99. The number may have a decimal. This should be rounded down (or the decimal ignored).
Press the equals button twenty times, making a note of each number.
Find the people on your list of 100 people that match your twenty numbers.
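The calculator method can be imitated in Python. This is a sketch under stated assumptions: random.random() stands in for the RAN # button, and repeated numbers are skipped so that 20 different people are chosen, a detail the slide does not spell out.

```python
import math
import random


def calculator_random_sample(list_size=100, sample_size=20, seed=None):
    """Imitate keying 100 x RAN# and rounding down, until enough numbers."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    chosen = []
    while len(chosen) < sample_size:
        # A random decimal from 0 up to (not including) 1, scaled and rounded down.
        number = math.floor(list_size * rng.random())
        if number not in chosen:  # assumption: ignore a number already noted
            chosen.append(number)
    return chosen


numbers = calculator_random_sample(seed=2005)
```

Python's built-in random.sample(range(100), 20) gives an equivalent result in one step; the longer version is shown only because it mirrors the calculator steps.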
Evaluating different sampling methods
Random sampling
Every member of the population has an equal chance of being chosen, which makes it fair.
It can be very time consuming and usually impractical.
Systematic sampling
You are unlikely to get a biased sample.
It is not strictly random: some members of the population cannot be chosen once you have decided where to start on the list.
Quota sampling
This is easier to manage.
It could be biased. For example, if you are only asking people on the street or in a shop, the sample might not represent people at work all day.
Stratified sampling
It is the best way to reflect the population accurately.
It is time consuming and you have to limit the number of relevant variables to make it practical.
D1.5 The stages of research
The stages of research
There are several stages in carrying out a project or piece of research. These include:
Developing your hypothesis and planning how to test it.
Collecting data.
Using graphs and calculations to describe your results.
Analysing your results; drawing conclusions about whether your hypothesis has been supported by the data.
Evaluating and recognizing the limitations of your methods; and deciding how reliable your conclusions are.
Extending your hypothesis and pursuing new lines of enquiry.
The data collection cycle
These stages can be shown by the data collection cycle as follows:
Specify the problem and plan
Collect the data from a variety of sources
Process and display the data
Interpret and discuss the results
GCSE coursework
Your GCSE coursework will be assessed on three strands. Each one is worth the same number of marks:
Planning and collecting data
Processing and representing data
Interpreting and evaluating data
Discuss what each of these strands involves.
What kind of evidence will the marker be looking for?
Review
A key skill in Handling Data is the correct use of vocabulary.
Act out or mime your word.
Think up some ways of remembering the sampling methods.
Choose a hypothesis and write a paragraph about how you would research it. Compete with your partner to get as many key words in as possible.
Give your partner a definition of a word to guess.