practice questions for exam 1 - university of...

Upload: nguyenphuc

Post on 24-Jul-2018

Page 1: Practice Questions for Exam 1 - University of Floridammeece/Spring2023/Spring2012/practice_exam1... · Practice Questions for Exam 1 1. A used car lot evaluates their cars on a number

Practice Questions for Exam 1

1. A used car lot evaluates its cars on a number of features as they arrive on the lot in order to determine their worth. Among the features examined are miles per gallon (MPG), make and model (e.g., Toyota Camry), number of cylinders, horsepower, weight, and year made. List these variables and state whether each is quantitative or categorical.

Quantitative      Categorical
MPG               make
# of cylinders    model
horsepower
weight
year

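2. High temperatures for 35 major US cities were collected for January 23, 2006 and were put into the stem plot below with leaf unit = 1.0. The first column is the depth (the cumulative count from the nearer end), the second is the stem, and the third holds the leaves.

  1   3 | 4
  5   3 | 6689
 11   4 | 001122
 17   4 | 566678
 (3)  5 | 113        <- the 18th position is 51
 15   5 | 678899
  9   6 | 04
  7   6 | 8
  6   7 | 12
  4   7 | 889
  1   8 | 3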

a) What is the minimum, median and maximum for this dataset?

min: 34 x 1.0 = 34

position of median: (n + 1)/2 = (35 + 1)/2 = 18, so M = 51 x 1.0 = 51

max: 83 x 1.0 = 83

b) Find the range for this dataset.

Range = max – min

83 – 34 = 49
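The answers to question 2 can be checked with a short Python sketch. It rebuilds the 35 temperatures from the stems and leaves shown in the stem plot; the variable names are illustrative and not part of the original exam.

```python
from statistics import median

# (stem, leaves) pairs copied from the stem plot (leaf unit = 1.0).
# Each stem from 3 to 7 is split into two rows, so it appears twice.
stemplot = [
    (3, "4"), (3, "6689"),
    (4, "001122"), (4, "566678"),
    (5, "113"), (5, "678899"),
    (6, "04"), (6, "8"),
    (7, "12"), (7, "889"),
    (8, "3"),
]

# Each value is 10 * stem + leaf.
temps = sorted(10 * stem + int(leaf) for stem, leaves in stemplot for leaf in leaves)

n = len(temps)
median_position = (n + 1) // 2  # exact because n is odd
data_range = max(temps) - min(temps)

print(n, median_position)                      # 35 18
print(min(temps), median(temps), max(temps))   # 34 51 83
print(data_range)                              # 49
```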

3. Test scores for a class of 15 economics students were as follows:

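86 95 78 93 34 58 65 68 72 98 92 84 73 84 91

Sorted: 34 58 65 68 72 73 78 84 84 86 91 92 93 95 98 (the median M is the 8th value, 84)

a) Find the mean, median, and mode.

Mean: \( \bar{x} = 1171/15 = 78.07 \)

Position of median = (n + 1)/2 = (15 + 1)/2 = 8, so M = 84

Mode = 84

b) Find the IQR, range, standard deviation and variance.

IQR = Q3 - Q1 = 92 - 68 = 24

Range: 98 - 34 = 64

Standard deviation: s = 17.07

Variance: \( s^2 = 291.50 \)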
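As a check on question 3, here is a minimal Python sketch using the standard `statistics` module. The `quartiles` helper is a hypothetical name; it assumes the textbook convention of taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data, leaving out the middle value when n is odd. Other software may use a different quartile rule.

```python
from statistics import mean, median, mode, stdev, variance

scores = [86, 95, 78, 93, 34, 58, 65, 68, 72, 98, 92, 84, 73, 84, 91]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2          # drops the middle value when n is odd
    return median(data[:half]), median(data[-half:])

q1, q3 = quartiles(scores)
iqr = q3 - q1
data_range = max(scores) - min(scores)

print(round(mean(scores), 2), median(scores), mode(scores))  # 78.07 84 84
print(q1, q3, iqr, data_range)                               # 68 92 24 64
print(round(stdev(scores), 2))                               # 17.07
print(round(variance(scores), 2))                            # 291.5
```

Squaring the rounded standard deviation (17.07) gives a slightly smaller variance than computing it from the data, which is why the unrounded calculation is used here.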

c) Create a stem and leaf plot.

3 | 4
4 |
5 | 8
6 | 5 8
7 | 2 3 8
8 | 4 4 6
9 | 1 2 3 5 8

d.) Create a boxplot.

[Boxplot not reproduced; it was drawn on an axis running from 30 to 100.]

4. As they left the movie theater in Gainesville, 17 people were asked how long they had

to wait in line for their tickets.

a.) What is the population of interest?

All Gainesville residents that go to the movie theater

b.) What is the sample?

17 people

c.) What is the variable being measured?

Time spent in line

d.) Is this variable discrete quantitative, continuous quantitative or categorical?

continuous quantitative

The results of the above question were as follows:

[Histogram of Minutes: x-axis Minutes (ticks at 0, 4, 8, 12, 16), y-axis Frequency (0 to 6).]

e) Describe the shape, center, and spread of this histogram.

Shape: Skewed right

Center: 4 to 6

Spread: 0 to 18

f) Are there any outliers?

The value at 18 might be considered a moderate outlier.

5. Last year a small accounting firm paid each of its five clerks $22,000, two junior

accountants $50,000 each, and the firm’s owner $270,000.

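a) What is the mean salary paid at the firm?

\( \bar{x} = \frac{5(22{,}000) + 2(50{,}000) + 270{,}000}{8} = \frac{480{,}000}{8} = 60{,}000 \)

b) How many employees earn less than the mean?

7 employees earn less than the mean

c) What is the median salary?

With 8 salaries in sorted order, the median is the average of the 4th and 5th values: M = (22,000 + 22,000)/2 = 22,000

d) What does this tell us about the mean and the median?

The mean is affected by outliers, but the median is not.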
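The contrast between the mean and the median in question 5 can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names are not from the exam.

```python
from statistics import mean, median

# Five clerks, two junior accountants, and the owner.
salaries = [22_000] * 5 + [50_000] * 2 + [270_000]

avg = mean(salaries)
below_mean = sum(s < avg for s in salaries)
med = median(salaries)

print(avg)         # 60000
print(below_mean)  # 7
print(med)         # 22000.0
```

Seven of the eight employees earn less than the mean because the owner's salary pulls the mean upward, while the median stays at the typical clerk's salary.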

6. Two students took the same English course. Their grades were based on five

compositions. The grades are as follows.

           Student A   Student B
Comp. 1       78          84
Comp. 2       52          63
Comp. 3       84          80
Comp. 4       95          92
Comp. 5       92          89

a) Find the mean and standard deviation for each student.

Student A: \( \bar{x} = 80.2 \), s = 17.12

Student B: \( \bar{x} = 81.6 \), s = 11.37

b) Which student has more variability in their scores?

Student A, because the standard deviation is higher.
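The means and standard deviations in question 6 can be verified with the sketch below, which uses the sample standard deviation (dividing by n - 1). The dictionary layout is an illustrative choice.

```python
from statistics import mean, stdev

grades = {
    "Student A": [78, 52, 84, 95, 92],
    "Student B": [84, 63, 80, 92, 89],
}

# (mean, sample standard deviation) for each student
summary = {name: (mean(g), stdev(g)) for name, g in grades.items()}

for name, (avg, s) in summary.items():
    print(f"{name}: mean = {avg:.2f}, s = {s:.2f}")
# Student A: mean = 80.20, s = 17.12
# Student B: mean = 81.60, s = 11.37

# The student with the larger standard deviation has more variable scores.
more_variable = max(summary, key=lambda name: summary[name][1])
print(more_variable)  # Student A
```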

Sorted salaries for question 5: 22,000 22,000 22,000 22,000 22,000 50,000 50,000 270,000

7. The following are the golf scores of 12 members of a women’s golf team in

tournament play:

89 90 87 95 86 81 102 105 83 88 91 79

a) Display the distribution by a stemplot and describe its main features.

 7 | 9
 8 | 1 3 6 7 8 9
 9 | 0 1 5
10 | 2 5

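b) Compute the mean, variance and standard deviation of these golf scores.

Mean: \( \bar{x} = 1076/12 = 89.67 \)

Standard deviation: s = 7.83

Variance: \( s^2 = 61.33 \)

c) Then compute the median, the quartiles and the IQR.

79 81 83 86 87 88 89 90 91 95 102 105

Q1: (83 + 86)/2 = 84.5

M: (88 + 89)/2 = 88.5

Q3: (91 + 95)/2 = 93

IQR: Q3 - Q1 = 93 - 84.5 = 8.5

d) Are there any outliers?

No. By the 1.5 x IQR rule the fences are 84.5 - 12.75 = 71.75 and 93 + 12.75 = 105.75, and every score lies between them.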
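The quartile and outlier work in question 7 can be sketched in Python as follows. The `quartiles` helper is a hypothetical name and assumes the same convention as the hand calculation (Q1 and Q3 are the medians of the lower and upper halves); the outlier check assumes the common 1.5 x IQR rule.

```python
from statistics import mean, median, stdev, variance

scores = [89, 90, 87, 95, 86, 81, 102, 105, 83, 88, 91, 79]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

q1, q3 = quartiles(scores)
iqr = q3 - q1

# 1.5 x IQR fences: anything outside them is flagged as an outlier.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in scores if x < low_fence or x > high_fence]

print(round(mean(scores), 2), round(stdev(scores), 2), round(variance(scores), 2))
# 89.67 7.83 61.33
print(median(scores), q1, q3, iqr)  # 88.5 84.5 93.0 8.5
print(low_fence, high_fence)        # 71.75 105.75
print(outliers)                     # []
```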

Main features of the stemplot in part a): center in the upper 80s; spread from 79 to 105; roughly bell-shaped; no outliers.

8. Colleges and universities are requiring an increasing amount of information about

applicants before making acceptance and financial aid decisions. Classify each of the

following types of data required on a college application as discrete quantitative,

continuous quantitative or categorical.

a) High School GPA Continuous quantitative

b) Gender of applicant Categorical

c) Parent’s income Continuous quantitative

d) High School class rank Discrete quantitative

9. Answer the following questions:

a) What is the primary disadvantage of using the range to compare the variability of

data sets?

The range is heavily influenced by outliers

b) Can the variance of a data set ever be negative?

No

c) The variable of interest is height which is measured in inches. What is the unit of

the standard deviation? What is the unit of the variance?

The unit of the standard deviation is inches.

The unit of the variance is squared inches.

d) Give an example of a dataset where the standard deviation equals 0.

Any dataset that has all of the same numbers

Example: 5 5 5 5 5
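A quick check of parts b) and d) in Python, as an illustrative sketch: the variance is an average of squared deviations, so it cannot be negative, and it is exactly zero when every value is the same. The `heights` list is made-up data used only to show a positive variance.

```python
from statistics import stdev, variance

constant = [5, 5, 5, 5, 5]
heights = [60, 65, 70]   # hypothetical heights in inches

print(stdev(constant))     # 0.0
print(variance(constant))  # 0
print(variance(heights))   # 25 (squared inches)
```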

10. Describe the following scatterplots.

a) Elevation (in meters) versus mean annual temperature (in Centigrade).

[Scatterplot not reproduced.]

Negative, linear, strong, no outliers.

b). Year vs. Gas Price Average from 1976 to 2004.

[Scatterplot: x-axis Year (1980 to 2005), y-axis Avg. Gas Price (0.50 to 2.00).]

Positive, linear, moderate, no outliers.

c). Age (in months) vs. Score on a Cognitive Abilities Test

[Scatterplot: x-axis Score (15 to 35), y-axis Age (20 to 36).]

Positive, linear, strong, with 2 potential outliers.

11. A least squares regression was fit to the data shown above of Year vs. Gas Prices.

The result was the following:

[Fitted line plot: Avg. Gas Price = -44.78 + 0.02310 Year, with S = 0.210031, R-Sq = 47.3%, R-Sq(adj) = 45.3%; x-axis Year (1980 to 2005), y-axis Avg. Gas Price (0.50 to 2.00).]

a). Identify the explanatory and response variables.

explanatory(x) = year response(y) = avg. gas price

b). What is \( R^2 \) and what is its interpretation?

\( R^2 = 47.3\% \): 47.3% of the variation in average gas price is explained by year.

c). What is the slope and what is its interpretation?

Slope = 0.02310: on average, the gas price rises by 0.02310 for each additional year.

d). What is the y-intercept and what is its interpretation?

-44.78; do not interpret it, because there are no data near x = 0.

e). What would you predict the gas price to be in 2030? Is this reliable?

-44.78 + 0.02310(2030) = 2.113

Not reliable, 2030 is too far away from our data.
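The prediction in part e) can be sketched in Python using the coefficients from the fitted line plot. The function name and the range check are illustrative; the check assumes the data span 1976 to 2004, as stated in question 10 b).

```python
INTERCEPT = -44.78   # from the fitted line plot
SLOPE = 0.02310      # from the fitted line plot
DATA_YEARS = (1976, 2004)  # range of the observed data (question 10 b)

def predict_gas_price(year):
    """Predicted average gas price from the least squares line."""
    return INTERCEPT + SLOPE * year

def is_extrapolation(year):
    """True when the year lies outside the range of the observed data."""
    return not (DATA_YEARS[0] <= year <= DATA_YEARS[1])

prediction = predict_gas_price(2030)
print(round(prediction, 3))    # 2.113
print(is_extrapolation(2030))  # True
```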

12. Eleven members of a golf team play two rounds apiece. A bystander wants to predict the round 2 score based on the round 1 score. Their scores are as follows:

Round 1   Round 2
   89        94
   90        85
   87        89
   95        89
   86        81
   81        76
  105        89
   83        87
   88        91
   91        88
   79        80

a). The least squares regression line minimizes the sum of the

squared residuals.

OR

The least squares regression line minimizes the sum of the squared vertical distances of the points to the line.

b). Identify the explanatory and response variables.

Explanatory variable: Round 1

Response variable: Round 2

c). What is \( R^2 \) and what is its interpretation?

\( R^2 = (0.549)^2 \times 100 = 30.1\% \)

OR

1.) Square r: \( 0.549^2 = 0.301 \)

2.) Convert it to a percentage by moving the decimal point two places to the right: 30.1%

30.1% of the variation in round 2 scores can be explained by round 1 scores.

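d). What is the slope and what is its interpretation?

Round 1 = x, Round 2 = y

\( \bar{x} = 88.55 \), \( \bar{y} = 86.27 \)

\( s_x = 7.13 \), \( s_y = 5.31 \)

\( b = r \frac{s_y}{s_x} = 0.549 \times \frac{5.31}{7.13} = 0.409 \), taken as 0.40 in the calculations for this question.

Slope: 0.40 is the average change in round 2 golf scores for a one point change in round 1 golf scores.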

e). What is the y-intercept and what is its interpretation?

\( a = \bar{y} - b\bar{x} = 86.27 - 0.40(88.55) = 50.85 \)

No round 1 scores are near zero, so do not interpret it.

f). What is the least squares regression equation?

\( \hat{y} = a + bx \)

LSR line: \( \hat{y} = 50.85 + 0.40x \)

g). Find the residual for golfer 7 (round 1 = 105, round 2 = 89).

obs x = 105, obs y = 89

\( \hat{y} = 50.85 + 0.40(105) = 92.85 \)

Residual = obs y - pred y = 89 - 92.85 = -3.85

h.) When this point is removed, the value of r changes to 0.661. Is that point an influential outlier?

Yes, because r changed a lot (from 0.549 to 0.661).
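The chain of calculations in question 12 (from r and the summary statistics to the slope, intercept, fitted value, and residual) can be sketched in Python. The summary values are the ones given in the worked answer; `b_used` mirrors the slope of 0.40 used in the hand calculation, which is an assumption about how that answer was rounded.

```python
# Summary statistics from the worked answer (x = round 1, y = round 2).
r = 0.549
x_bar, y_bar = 88.55, 86.27
s_x, s_y = 7.13, 5.31

r_squared_pct = r ** 2 * 100      # percent of variation explained
b = r * s_y / s_x                 # slope before rounding
b_used = 0.40                     # slope as used in the hand calculation
a = y_bar - b_used * x_bar        # y-intercept

def predict(x):
    """Predicted round 2 score from the least squares line."""
    return a + b_used * x

residual = 89 - predict(105)      # observed y minus predicted y for golfer 7

print(round(r_squared_pct, 1))    # 30.1
print(round(b, 3))                # 0.409
print(round(a, 2))                # 50.85
print(round(predict(105), 2))     # 92.85
print(round(residual, 2))         # -3.85
```

Carrying more digits in the slope would shift the intercept and the residual slightly, so answers computed by software from the raw scores will not match these rounded values exactly.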

13) The following plot shows how much water (cubic kilometers) was released by the

Mississippi River in the years 1954-1980.

[Fitted line plot: Water Release = -14837 + 7.808 Year, with S = 120.066, R-Sq = 21.7%, R-Sq(adj) = 18.6%; x-axis Year (1955 to 1980), y-axis Water Release (300 to 900).]

a). What is the correlation between year and water release?

1) Change \( R^2 \) to a decimal: 21.7% = 0.217

2) Take the square root: \( \sqrt{0.217} \approx 0.466 \)

3) Determine the sign: the slope is positive, so r is about +0.466

b). In 1973, a major flood occurred. That year, the river discharged 880 cubic kilometers of water. Find the residual for this point.

observed x = 1973, observed y = 880

\( \hat{y} = -14837 + 7.808(1973) = 568.184 \)

residual = obs y - pred y

= 880 - 568.184

= 311.816
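The three steps for recovering r from R-Sq, and the residual for 1973, can be written out as a short Python sketch. The coefficients come from the fitted line plot; the variable names are illustrative.

```python
import math

r_squared = 21.7 / 100       # step 1: change the percentage to a decimal
slope = 7.808                # from the fitted line plot
intercept = -14837           # from the fitted line plot

# Steps 2 and 3: take the square root, then give r the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)

predicted = intercept + slope * 1973
residual = 880 - predicted   # observed y minus predicted y

print(round(r, 3))           # 0.466
print(round(predicted, 3))   # 568.184
print(round(residual, 3))    # 311.816
```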

If we remove the observation from 1973, our new plot is:

[Fitted line plot: Water Release = -12468 + 6.598 Year, with S = 103.614, R-Sq = 21.3%, R-Sq(adj) = 18.0%; x-axis Year (1955 to 1980), y-axis Water Release (300 to 800).]

c). How did the LSR equation change? \( R^2 \)?

The slope went down, the y-intercept went up, and \( R^2 \) went down a little.

d). Was 1973 an influential outlier?

No, the line didn’t change very much and \( R^2 \) only changed a little.

14. For each of the descriptions below, determine what type of mistake is taking

place: extrapolation or misuse of cause and effect.

a). There is a high correlation between being a newspaper subscriber and having a high

income. Should I subscribe to the newspaper if I want to make more money?

Misuse of cause and effect, high correlation does not mean causation

b). 15 year-old Abby is an aspiring professional golfer. For the last 5 years, she has

recorded her average score at a local course. Using this information, she predicts what

her average score will be when she is 25.

Extrapolation, 25 is too far away from the data observed

c)In any given city, the number of churches and the number of bars are highly correlated.

Does church attendance cause drinking?

Misuse of cause and effect, high correlation does not mean causation. There could

be lurking variables such as: the number of people in a city will cause both the

number of churches and the number of bars to increase.

15. A student who waits on tables at a Chinese restaurant in a college neighborhood records the cost of meals and the tip left by single diners. The student wants to predict the tip based on the price of the meal. r = 0.954

(x) Meal   $4.50   $5.79   $6.24   $4.62   $6.35     \( \bar{x} = 5.5 \)     \( s_x = 0.884 \)

(y) Tip    $0.50   $0.75   $0.85   $0.60   $1.00     \( \bar{y} = 0.74 \)    \( s_y = 0.1981 \)

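a) Compute the least-squares regression line for these data.

\( b = r \left( \frac{s_y}{s_x} \right) = 0.954 \left( \frac{0.1981}{0.884} \right) = 0.2138 \)

\( a = \bar{y} - b\bar{x} = 0.74 - 0.2138(5.5) = -0.4359 \)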

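b) Make a scatterplot of the data and draw the regression line on your plot.

LSR line: \( \hat{y} = -0.4359 + 0.2138x \)

\( \hat{y} = -0.4359 + 0.2138(4) = 0.4193 \)

\( \hat{y} = -0.4359 + 0.2138(6) = 0.8469 \)

Now plot (4, .4193) and (6, .8469), and draw a line through the points.

[Scatterplot with the LSR line: x-axis Cost of meal in $ (4 to 6.5), y-axis from .25 to 1.00.]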

c) The next diner orders a meal costing $4.89. Use your regression line to predict the tip.

\( \hat{y} = -0.4359 + 0.2138(4.89) = 0.6096 \)

Predicted tip: 61 cents
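Question 15 can be reproduced from the given summary statistics with the Python sketch below. The names are illustrative. Because the code keeps the slope unrounded, the intercept may differ from the hand-computed -0.4359 in the fourth decimal place, while the predictions agree to four decimals.

```python
# Summary statistics given in the question (x = meal cost, y = tip).
r = 0.954
x_bar, s_x = 5.5, 0.884
y_bar, s_y = 0.74, 0.1981

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # y-intercept

def predict_tip(meal_cost):
    """Predicted tip in dollars from the least squares line."""
    return a + b * meal_cost

print(round(b, 4))                  # 0.2138
print(round(predict_tip(4), 4))     # 0.4193
print(round(predict_tip(6), 4))     # 0.8469
print(round(predict_tip(4.89), 2))  # 0.61
```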

16. Below are boxplots for the amount of calories for the different types of cereals based

on whether they are on the bottom, middle or top shelf. Shelf 1 is the bottom shelf, shelf

2 is the middle shelf and shelf 3 is the top shelf.

[Boxplot of calories vs shelf: x-axis shelf (1, 2, 3), y-axis calories (50 to 175).]

1.) What is the median amount of calories for boxes on the top shelf? About 110 calories

2.) Which shelf has the largest spread? Top

3.) What is the shape of the distribution of calories for boxes sold on the top? (right, left, symmetric) Symmetric (You can tell from a boxplot that a distribution is symmetric, but you cannot determine from a boxplot that it is normal.)

17. Answer the following questions.

a.) What is the strongest value of correlation? 1 or -1

b.) What is the measure of center that is most influenced by outliers? Mean

c.) What is the measure of spread that is the most influenced by outliers? Range

d.) What percentage of the data is less than Q1? 25%

e.) What percentage of the data is less than Q3? 75%

f.) What is an influential outlier? This is a point outside the trend of most of the data such that, when the point is removed, the slope of the line changes drastically and the value of \( R^2 \) changes substantially.

g.) What type of graph would you use to explore the relationship between two

quantitative variables? scatterplot

h.) What type of graph would you use to explore the relationship between a

quantitative variable and a categorical variable? boxplots

i.) What type of graph would you use to explore the relationship between two

categorical variables? Contingency table

18. Below are boxplots of foal weights divided by gender(M=Males and F=Females).

Answer the following questions about the plot.

[Boxplot of Weight vs Gender: x-axis Gender (F, M), y-axis Weight (90 to 130).]

a.) What is the median of the male foals? About 115

b.) Approximate the IQR for the female foals. Q3 - Q1 = 128 - 95 = 33

c.) Compare the centers of the weights of the male and female foals. Use a complete

sentence.

The median weight for the female foals is slightly lower than the median weight of

the male.

d.) Compare the spread of the weights of the male and female foals. Use a complete

sentence.

The IQR and the range for the female foals are larger than for the male foals.

e.) Compare the shapes of the two distributions. Use a complete sentence.

The shape of the distribution for the female foals is fairly symmetric, but the shape

of the distribution for the male foals is slightly right skewed.