### Week 7 Confidence interval estimation

STAM4000
Quantitative Methods
COMMONWEALTH OF AUSTRALIA
WARNING
This material has been reproduced and communicated to you by or on behalf of Kaplan
Business School pursuant to Part VB of the
The material in this communication may be subject to copyright under the Act. Any further
reproduction or communication of this material by you may be the subject of copyright
protection under the Act.
Do not remove this notice.

Learning Outcomes
#1 Distinguish between point and interval estimators
#2 Interval estimation of the population mean
#3 Interval estimation of the population proportion

Why does this matter?
In the real world, we are concerned with the confidence of our estimates.
#1 Distinguish between point and interval estimators

#1 Point and interval estimators
• Point estimator: a statistic, measured at one point in time, to estimate a parameter.
• Interval estimator: a range of values, based on a statistic, to estimate a parameter.
#1 Point estimators for μ and p
Two types of point estimators this week:
• Sample mean, $\bar{x}$, to estimate the population mean, μ
• Sample proportion, $\hat{p}$, to estimate the population proportion, p
#1 Interval estimators
#1 General form of an interval estimator
#1 What is the margin of error?
General form of CI:
point estimate ± margin of error
The margin of error:
• The amount by which sample results are likely to differ from population results.
• Is related to the sampling error, as we are using a sample from the population.
E.g. Each week, you go to your local delicatessen and ask for, on average, 100 grams of Provolone cheese.
Say you are willing to accept a 10 gram margin of error.
This tells us that the average lies between 90 grams and 110 grams.
#2 Interval estimation of the population mean
#2 Confidence interval for the population mean

Confidence intervals for μ:
• σ is known: use the Z distribution, $\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}}$
• σ is unknown: use the t distribution, $\bar{x} \pm t_{crit}\,\frac{s}{\sqrt{n}}$
#2 Conditions to check before creating a CI for μ

#2 The CI for μ may or may not actually include μ
Is μ in the CI?
We are only a specified percentage confident that μ lies in the interval.

#2 Understand interval estimation for μ when σ is known
The confidence interval formula to estimate μ, when σ is KNOWN:
$\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}}$
$\bar{x}$ is the sample mean
$z_{crit}$ is the Z critical value
σ is the KNOWN population standard deviation
n is the sample size
± is read as “plus or minus” and may be written as +/-

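#2 Finding Z critical from the Z tables
The percentage (%) of the confidence interval determines the Z critical value.
Example: Find the Z critical value for a 95% CI.
[Figure: standard normal curve centred at 0, with central area 95% = 0.95 (total area 100% = 1) between Z crit = -1.96 and Z crit = 1.96; the cumulative area to the left of 1.96 is 0.975.]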

#2 Summary table of common Z critical values for CI
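The table on this slide did not survive extraction. As a hedged sketch, the Z critical values used elsewhere in these notes (for 90%, 95% and 99% CIs) can be recomputed with Python's standard library. The helper name `z_critical` is an assumption made for illustration and is not part of the course material.

```python
from statistics import NormalDist

def z_critical(confidence_pct):
    """Two-sided Z critical value for a CI with the given confidence percentage."""
    tail_area = (1 - confidence_pct / 100) / 2   # area in each tail
    # The Z tables are read at the cumulative area to the left of Z crit.
    return NormalDist().inv_cdf(1 - tail_area)

for pct in (90, 95, 99):
    print(f"{pct}% CI: Z critical = {z_critical(pct):.3f}")
```

For these three levels the values round to 1.645, 1.960 and 2.576, matching the critical values quoted in the worked examples.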
#2 Margin of error (ME) of a CI for μ when σ is KNOWN
$\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}} = \bar{x} \pm ME$, where $ME = z_{crit}\,\frac{\sigma}{\sqrt{n}}$
The size of ME depends on:
• Z critical, based on the percentage of the CI
• sample size, n
• population standard deviation, σ
#2 Some facts about confidence intervals
• Holding all else constant, if the sample size (n) is increased, the margin of error decreases, making the confidence interval narrower. Thus, increasing the sample size is a way to counteract the loss of precision associated with high confidence.
• Holding all else constant, if the percentage of confidence increases, the critical value increases. This increases the margin of error and the confidence interval becomes wider.
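Both facts follow directly from ME = Z crit × σ / √n. The sketch below is illustrative only: the function name `margin_of_error` and the values σ = 3.5 and n = 49, 100, 400 are assumptions chosen for the demonstration.

```python
from math import sqrt

def margin_of_error(z_crit, sigma, n):
    """Margin of error of a CI for the mean when sigma is known."""
    return z_crit * sigma / sqrt(n)

sigma = 3.5  # illustrative population standard deviation

# Fact 1: a larger sample size gives a smaller ME (narrower CI).
for n in (49, 100, 400):
    print(f"n = {n:3d}: ME = {margin_of_error(1.96, sigma, n):.3f}")

# Fact 2: a higher confidence level gives a larger ME (wider CI).
for label, z_crit in (("90%", 1.645), ("95%", 1.96), ("99%", 2.576)):
    print(f"{label} CI: ME = {margin_of_error(z_crit, sigma, 49):.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.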

#2 Two common interpretations of the CI for μ
Interpretation 1: We are CI% confident that the population mean lies between the lower confidence limit (LCL) and the upper confidence limit (UCL).
Note:
• Use of “confidence”
• Use the name of the variable
• Use the values of the LCL and UCL
• Include units
Interpretation 2: If all possible samples of the same size n are taken, CI% of those CIs would contain the population mean and (100 – CI)% would not contain the population mean.
Note:
• The sample size is the same
• Each CI is different
• We are theoretically creating an infinite number of CIs
#2 Example
BUSINESS QUESTION: What is the mean age of a bungee jumper?
In 1989 A.J. Hackett started the world’s first commercial bungee jumping site in Queenstown, New Zealand. In 2020 a random sample of 49 bungee jumpers had a mean age of 26.7 years. Assume the population standard deviation is known to be 3.5 years.
a) Check the conditions to find a confidence interval for the population mean age of a bungee jumper.
b) Find and interpret a 95% CI for the mean age.
c) Holding all else the same, explain whether a 99% CI for the mean age would be narrower, the same width or wider than the 95% CI. Do not create another CI.
d) What is the business application here?
#2 Example solution
a) Random Sample Condition: Satisfied, as we are told a random sample of 49 jumpers was taken.
10% Condition: We are not told if the sample was taken without replacement. As n = 49, we must assume that there are at least 490 bungee jumpers in the population.
Normal or Large Enough Sample Condition: As n = 49 > 30, we can use the Central Limit Theorem and conclude that $\bar{X}$ ~ Normal.
The conditions are satisfied, so we have normality.
As σ is known, we can use the Z tables.
#2 Example solution
b)
$\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}}$
$= 26.7 \pm 1.96 \times \frac{3.5}{\sqrt{49}}$
$= 26.7 \pm 1.96 \times \frac{3.5}{7}$
$= 26.7 \pm 1.96 \times 0.5$
$= 26.7 \pm 0.98$
= (25.72 years, 27.68 years)
Interpretation: We are 95% confident the population mean age of a
bungee jumper lies between 25.72 and 27.68 years.
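The arithmetic in part b) can be checked with a few lines of Python. This is a minimal sketch; the function name `z_confidence_interval` is hypothetical, and the Z critical value of 1.96 is passed in as read from the Z tables.

```python
from math import sqrt

def z_confidence_interval(x_bar, sigma, n, z_crit):
    """CI for the population mean when sigma is known: x_bar +/- z_crit * sigma / sqrt(n)."""
    me = z_crit * sigma / sqrt(n)   # margin of error
    return x_bar - me, x_bar + me   # (LCL, UCL)

# Bungee example: x_bar = 26.7 years, sigma = 3.5 years, n = 49, 95% CI
lcl, ucl = z_confidence_interval(26.7, 3.5, 49, 1.96)
print(f"({lcl:.2f} years, {ucl:.2f} years)")  # (25.72 years, 27.68 years)
```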

#2 Example solution
c) Holding all else the same, the 95% confidence interval (CI) will be narrower than the 99% CI, because its Z critical value of 1.96 is smaller than the 99% CI's Z critical value of 2.576. The 99% CI gives a wider interval with more confidence, but less precision.
d) The confidence interval tells us that the population mean age for a bungee jumper lies within a narrow range, between 25.72 years and 27.68 years. To capture a wider age group, a marketing campaign targeting younger and older individuals could be used to increase sales and profit.
#2 Exercise
BUSINESS QUESTION: What is the average number of almonds per 30 gm bag of a healthy snack?
A company sells 40 gram bags of almonds as a healthy snack and wants to estimate the
number of almonds that are packed into each bag. A random sample of 36 bags was selected
from a production run and the number of almonds counted in each bag. The sample mean number
of almonds is 20.6. Assume the population standard deviation is known to be 1.7 almonds per bag.
a) What is the point estimate of the number of almonds per bag?
b) Check the conditions to create a confidence interval for μ.
c) Find and interpret a 90% confidence interval for the population mean.
d) Say, now that 100 bags were randomly selected and assume that the mean was miraculously the
same at 20.6 almonds. Without calculating another confidence interval, explain what happens to
the width of the 90% confidence interval created in part c)?
#2 Understand interval estimation for μ when σ is unknown
We can use the sample standard deviation (s), as we do not know the population standard deviation (σ).
The confidence interval formula to estimate μ when σ is unknown:
$\bar{x} \pm t_{crit}\,\frac{s}{\sqrt{n}}$
$\bar{x}$ is the sample mean
$t_{crit}$ is the t critical value
s is the sample standard deviation, an estimate of σ
n is the sample size
± is read as “plus or minus”, sometimes written as +/-
#2 No σ? No problem, just use s and the t-tables
We can use the sample standard deviation (s) as an estimate of σ.
Now, we are using the sample mean and the sample standard deviation to estimate μ, so we have more variability. We can no longer use Z.
We need a new distribution called the Student’s t distribution.
t distribution: a family of t curves that depend on the sample size.
#2 Comparison of Z and t curves
[Figure: the standard normal (Z) curve and t curves for n = 10 and n = 36, all centred at 0. The t curves are bell-shaped and symmetric, with “thicker” tails than the Z curve.]
As df → ∞, the Student’s t distribution → Z, the standard normal distribution.
#2
• t-table row: degrees of freedom, df = n – 1, for a CI for μ.
• t-table column: $\frac{(100 - CI\%)/100}{2}$ is the area in the right tail of the t curve, and this area is denoted in the subscript of “t” in the first row of the t-table.
• Read off the required t critical value, where the row and column intersect.
#2 Example
Determine the t critical value for each of the following:
a) 95% CI and n = 10
Use row: df = n – 1 = 10 – 1 = 9
Column: $\frac{(100 - CI\%)/100}{2} = \frac{(100 - 95)/100}{2} = 0.025$
Use column $t_{0.025}$
t critical = ± 2.262
b) 99% CI and n = 10
Use row df = 9 and column $t_{0.005}$
t critical = ± 3.250
c) 90% CI and n = 64
df = 63, use row 60 and column $t_{0.05}$
t critical = ± 1.671
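The table lookups above can be cross-checked numerically. The sketch below is not the table method taught here: it integrates the Student's t density with Simpson's rule and finds the critical value by bisection, using only the standard library. The names `t_pdf`, `t_cdf` and `t_critical`, the step count, and the search bound of 100 are all assumptions made for illustration.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] plus the left half (0.5)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence_pct, df):
    """Two-sided t critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence_pct / 100) / 2   # area to the left of t crit
    lo, hi = 0.0, 100.0                           # assumes t crit < 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for pct, df in ((95, 9), (99, 9), (90, 60)):
    print(f"{pct}% CI, df = {df}: t critical = {t_critical(pct, df):.3f}")
```

The results should agree with the table values 2.262, 3.250 and 1.671 to about three decimal places. Part c) uses row 60 because df = 63 is not in the table; calling `t_critical(90, 63)` gives a slightly smaller value.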
#2 Example
The council of a city wants to attract more shoppers to the
city centre by proposing the building of a new public carpark.
The council plans to pay for the carpark through parking fees.
They employed a consultant, who found a similar carpark, in
a similar city, and randomly sampled 44 weekdays. The
consultant found daily fees collected averaged \$4326, with a
standard deviation of \$1500. Assume the conditions are
satisfied.
a) Find a 90% confidence interval for the mean daily income this new carpark is estimated to generate.
b) Interpret your confidence interval.
c) The consultant who advised the council on this car park
proposal, predicted that parking revenues would average
\$4000 per day. Based on your confidence interval, what
do you think of the consultantโs prediction?
#2 Example solution
a) Told to assume the conditions are satisfied. As σ is unknown, we must use the t distribution.
For a 90% CI, $t_{0.05}$, df = 43, use 40, t critical = ± 1.684
$\bar{x} \pm t_{crit}\,\frac{s}{\sqrt{n}}$
$= 4326 \pm 1.684 \times \frac{1500}{\sqrt{44}}$
$= 4326 \pm 380.809$
= (\$3945.19, \$4706.81)
b) We are 90% confident the population mean daily income of this new carpark will lie between \$3945.19 and \$4706.81.
c) The consultant’s prediction of \$4000 seems reasonable, as \$4000 lies inside this 90% confidence interval.
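As a quick check of this solution, the sketch below recomputes the interval and tests whether \$4000 lies inside it. The function name `t_confidence_interval` is hypothetical; the t critical value of 1.684 is taken from the t-table row for df = 40, as in the solution.

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_crit):
    """CI for the population mean when sigma is unknown: x_bar +/- t_crit * s / sqrt(n)."""
    me = t_crit * s / sqrt(n)       # margin of error
    return x_bar - me, x_bar + me   # (LCL, UCL)

# Carpark example: x_bar = 4326, s = 1500, n = 44, 90% CI
lcl, ucl = t_confidence_interval(4326, 1500, 44, 1.684)
print(f"(${lcl:.2f}, ${ucl:.2f})")                # ($3945.19, $4706.81)
print("4000 inside the CI:", lcl <= 4000 <= ucl)  # True
```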
#2 Exercise
BUSINESS QUESTION: What is the average number of puppies per litter for a cavoodle?
Demand for pet puppies has increased with the onset of COVID-19; the companionship of a pet is comforting to many. In Australia, one of the most popular breeds of dog is the cavoodle, a cross between a cavalier spaniel and a poodle. Of 25 recent cavoodle puppy litters, the mean was 3.65 puppies with a standard deviation of 1.56 puppies. Assume conditions are satisfied.
a) Find and interpret a 95% confidence interval.
b) What is the width of your confidence interval in part a)?
c) Holding all else constant, and without doing the calculations, would a 99% confidence interval be narrower, wider or the same width as your confidence interval from part a)? Explain.
#3 Interval estimation of the population proportion
#3 Interval estimation of the population proportion
Confidence interval formula to estimate the population proportion, p:
$\hat{p} \pm z_{critical}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
$\hat{p}$ = sample proportion of interest
$\hat{q} = 1 - \hat{p}$
$z_{critical}$ = Z value related to the CI %
n = sample size
• To estimate p, we only use Z
• Use the decimal form of the proportions
• Work to at least 3 decimal places
Equivalently: $\hat{p} \pm z\sqrt{\hat{p}(1 - \hat{p})/n} = \hat{p} \pm ME$
#3 Conditions to check before creating a CI for p
#3 Example
Owners of a start-up business want to open a market stall to sell their products.
They are trying to decide whether to accept credit card payments or rely solely
on cash. They took a random sample of 100 market customer purchases for
other stalls and found 70 of these were paid by credit card.
a) Describe in words what p and $\hat{p}$ are, in the context of this example.
b) Check the conditions.
c) Find and interpret a 95% confidence interval.
Solution:
a) p = population proportion of market customers who pay by credit card
$\hat{p}$ = sample proportion of market customers who paid by credit card = 70/100 = 0.7, and $\hat{q} = 1 - \hat{p}$ = 0.3
b) Check conditions: Told random sample; must assume there are at least 1000 market customers in the population; $n\hat{p}$ = 100(0.7) = 70 > 10 and $n\hat{q}$ = 100(0.3) = 30 > 10. Conditions satisfied; use Z.
c)
$0.7 \pm 1.96\sqrt{\frac{0.7 \times 0.3}{100}}$
= 0.7 ± 0.0898
= (0.6102, 0.7898)
Interpretation: The owners of the start-up can be 95% confident that
the population proportion of market customers who pay by credit card
lies between 61.02% and 78.98%.
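The calculation in part c) can be reproduced as follows. This is a minimal sketch; the function name `proportion_confidence_interval` is hypothetical, and the Z critical value of 1.96 is supplied from the Z tables.

```python
from math import sqrt

def proportion_confidence_interval(successes, n, z_crit):
    """CI for a population proportion: p_hat +/- z_crit * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    me = z_crit * sqrt(p_hat * q_hat / n)   # margin of error
    return p_hat - me, p_hat + me           # (LCL, UCL)

# Market stall example: 70 of 100 purchases paid by credit card, 95% CI
lcl, ucl = proportion_confidence_interval(70, 100, 1.96)
print(f"({lcl:.4f}, {ucl:.4f})")  # (0.6102, 0.7898)
```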
#3 Exercise
A journalist, for an adventure sports magazine, is writing an article on the proportion of bungee jumpers who sustain an injury. He takes a random sample of 200 bungee jumpers and finds 10 of these claimed to have sustained an injury from their jump. Assume the conditions are satisfied.
a) Describe p and $\hat{p}$ in the context of this exercise.
b) Construct and interpret a 90% Confidence Interval (CI).
c) Without calculating another confidence interval, what happens to the width of your 90% CI from part b), if the sample size was increased to 400 but the sample
Supplementary Exercises
• Students are advised that Supplementary Exercises to this topic may be found on the subject portal under “Weekly materials”.
• Solutions to the Supplementary Exercises may be available on the portal under “Weekly materials” at the end of each week.
• Time permitting, the lecturer may ask students to work through some of these exercises in class.
• Otherwise, it is expected that all students work through all Supplementary Exercises outside of class time.
Extension
• The following slides are an extension to this week’s topic.
• The work covered in the extension:
o Is not covered in class by the lecturer.
o May be assessed.
Quick quiz: interpretation of a CI for μ
A researcher has calculated a 95% confidence interval (CI) for the population
mean number of screens (smartphones, television, laptops, etc.) per
household to be (2.0, 5.4) screens.
Are any of the following interpretations of
this CI correct?
a) The probability that the population mean is greater than 1 is at least 0.95.
b) There is a 95% probability that the population mean lies between 2.0 and
5.4 screens.
c) If we were to repeat the experiment over and over, then 95% of the time
the population mean would fall between 2.0 and 5.4 screens.
d) We are 95% confident the sample mean lies between 2.0 and 5.4 screens
per household.
e) 95% of all households have between 2.0 and 5.4 screens.
Quick quiz solution
a) The probability that the population mean is greater than 1 is at least 95%.
Incorrect: as probability ≠ confidence. The population mean is either in the CI, with a probability of 1, or not in the CI, with a probability of 0.
b) There is a 95% probability that the population mean lies between 2.0 and 5.4 screens.
Incorrect: as probability ≠ confidence.
c) If we were to repeat the experiment over and over, then 95% of the time the population mean would fall between 2.0 and 5.4 screens.
Incorrect: as 95% of the confidence intervals created cannot possibly have these exact same values of (2.0, 5.4).
d) We are 95% confident the sample mean lies between 2.0 and 5.4 screens per household.
Incorrect: the CI is to estimate the population mean. We had to know the value of the sample mean to create the CI to estimate the population mean.
e) 95% of all households have between 2.0 and 5.4 screens.
Incorrect: the CI is to estimate the population mean number of screens, and is not about the measurements of individual households.
Summarise factors affecting the width of a CI for μ
• The margin of error (ME) determines the width of the confidence interval.
• Note: ME = half the width.
• Holding all else constant, if the sample size, n, is increased, the confidence interval becomes narrower, and more precise. Why?
• Holding all else constant, if the percentage of confidence is decreased, the critical value (either Z or t) is smaller and the confidence interval becomes narrower, and more precise. Why?
• Holding all else constant, if the standard deviation decreases, the confidence interval becomes narrower and more precise. Why?
$\bar{x} \pm t_{crit}\,\frac{s}{\sqrt{n}}$
$\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}}$
• Increasing n increases information. This decreases the width of the CI, making the CI more precise, but at the cost of collecting a larger sample.
• Decreasing CI % decreases confidence but also decreases the width of the CI, increasing precision of the CI.
• A balance must be struck between:
o cost
o confidence
o precision
Can we ever be 100% confident with a CI?
Determine the minimum sample size for a CI about μ
$\bar{x} \pm z_{crit}\,\frac{\sigma}{\sqrt{n}} = \bar{x} \pm ME$
Rearranging $ME = z_{crit}\,\frac{\sigma}{\sqrt{n}}$ for n:
$n = \frac{z_{crit}^{2}\,\sigma^{2}}{ME^{2}} = \left(\frac{z_{crit}\,\sigma}{ME}\right)^{2}$
As this will give us the minimum sample size needed, we must always round up to the nearest integer.
Example:
If σ = 45, what is the minimum sample size needed to estimate the mean within ± 5 with 90% confidence?
Solution:
$n = \left(\frac{z_{crit}\,\sigma}{ME}\right)^{2} = \left(\frac{1.645(45)}{5}\right)^{2} = 219.19$, rounded up to 220
The minimum sample size needed is 220.
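The rounding-up step is easy to get wrong by hand, so here is a minimal sketch of the calculation. The function name `min_sample_size_mean` is an assumption for illustration; `math.ceil` performs the "round up, not off" step.

```python
from math import ceil

def min_sample_size_mean(z_crit, sigma, me):
    """Minimum n so that a CI for the mean has margin of error at most me."""
    n_exact = (z_crit * sigma / me) ** 2   # may be fractional, e.g. 219.19
    return ceil(n_exact)                   # always round up

# sigma = 45, ME = 5, 90% confidence (Z critical = 1.645)
print(min_sample_size_mean(1.645, 45, 5))  # 220
```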

Illustration of confidence intervals for p
Say, a bowl contains 1,000 different coloured balls.
We want to estimate p, the population proportion of white balls in the bowl.
Forty students were each asked to do the following:
• Take a random sample of 25 balls from the bowl
• Calculate $\hat{p}$, the sample proportion of white balls in their sample
• Use their $\hat{p}$ to create a 95% confidence interval for the population proportion of white balls in the bowl of balls, p.
The next slide shows the forty confidence intervals created.
• The real population proportion of white balls in the bowl is p = 50% = 0.50
• How many CIs did NOT include p of 0.50?
Illustration continued
Only one student created a confidence interval that did not contain the true (population) proportion, p of 0.5.
The student did everything correctly.
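The classroom experiment can be imitated with a small simulation. This is a hedged sketch, not the data behind the slide: the function name `count_misses`, the random seed, and the use of 500 white balls (implied by p = 0.50 and 1,000 balls) are assumptions, so the number of misses will generally differ from the single miss reported above.

```python
import random
from math import sqrt

def count_misses(num_students, seed=0, n=25, z_crit=1.96):
    """Count how many simulated 95% CIs for p fail to contain the true p."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    bowl = [1] * 500 + [0] * 500       # 1 = white ball, so p = 0.50
    p = sum(bowl) / len(bowl)
    misses = 0
    for _ in range(num_students):
        sample = rng.sample(bowl, n)   # draw n balls without replacement
        p_hat = sum(sample) / n
        me = z_crit * sqrt(p_hat * (1 - p_hat) / n)
        if not (p_hat - me <= p <= p_hat + me):
            misses += 1
    return misses

print("Misses among 40 students:", count_misses(40))
print("Miss rate over 4000 students:", count_misses(4000) / 4000)
```

A student whose interval misses p has not made an error; with 95% confidence, a small share of correctly built intervals is expected to miss.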

Determine the minimum sample size for a CI about p
• The CI for the proportion, $\hat{p} \pm z_{critical}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, can be rearranged to find n:
$n = \left(\frac{z_{critical}\sqrt{\hat{p}\hat{q}}}{ME}\right)^{2}$
Z critical is the Z value for the % CI
ME = margin of error in decimal form
$\hat{q} = 1 - \hat{p}$, in decimal form
$\hat{p}$ is either:
o The estimated sample proportion provided by a previous study
o If there is no value of $\hat{p}$ provided, use $\hat{p}$ = 0.50 as this gives the largest standard error and the largest minimum value of n.
Example: A researcher is estimating the proportion of students who buy their lunch every day. A recent survey found the sample proportion to be 0.30. What sized sample, n, would be needed to ensure a 95% CI has a margin of error of 0.04?
$n = \left(\frac{z_{critical}\sqrt{\hat{p}\hat{q}}}{ME}\right)^{2}$
$n = \left(\frac{1.96\sqrt{0.3(1 - 0.3)}}{0.04}\right)^{2}$
$n = (22.455)^{2}$
n = 504.21 students
Minimum sample size needed is 505 students.
Example
A researcher is estimating the proportion of students who buy their lunch every day.
What sized sample, n, would be needed to ensure a 95% CI has a margin of error of 0.04?
As $\hat{p}$ is not provided, we must use 0.5, as this gives the largest minimum value of n.
$n = \left(\frac{z_{critical}\sqrt{\hat{p}\hat{q}}}{ME}\right)^{2}$
$n = \left(\frac{1.96\sqrt{0.5(1 - 0.5)}}{0.04}\right)^{2}$
$n = \left(\frac{1.96\sqrt{0.25}}{0.04}\right)^{2}$
$n = (24.5)^{2}$
n = 600.25 students
Minimum sample size is 601 students.
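Both sample-size examples for p can be checked with one short function. This is a sketch under stated assumptions: the name `min_sample_size_proportion` is hypothetical, and the default of 0.5 for `p_hat` mirrors the rule for when no prior estimate is provided. The formula is written as z² p̂ q̂ / ME², which is algebraically the same as the squared ratio used above.

```python
from math import ceil

def min_sample_size_proportion(z_crit, me, p_hat=0.5):
    """Minimum n so that a CI for p has margin of error at most me."""
    q_hat = 1 - p_hat
    n_exact = z_crit ** 2 * p_hat * q_hat / me ** 2
    return ceil(n_exact)   # round up, not off

print(min_sample_size_proportion(1.96, 0.04, p_hat=0.30))  # 505
print(min_sample_size_proportion(1.96, 0.04))              # 601
```

Using 0.5 gives the larger answer because p̂ q̂ is largest when p̂ = 0.5.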
Things to note with finding n:
1. Round up not off (that is, here we round up for any decimal value).
Why? Because this is the minimum sample size required.
2. Remember that the sample size required does not depend on the
population size.
3. Calculating the sample size is part of the design of a survey and should be
done at the start of the survey process. The sample size is a guideline of the
minimum sample size required to give you the data you need.
4. Note: If doing a survey, the sample size, n is the number of respondents, not
the number of individuals surveyed.