Week 8 Hypothesis testing

STAM4000
Quantitative Methods
Week 8
Hypothesis testing
COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to you by or on behalf of Kaplan
Business School pursuant to Part VB of the
Copyright Act 1968 (the Act).
The material in this communication may be subject to copyright under the Act. Any further
reproduction or communication of this material by you may be the subject of copyright
protection under the Act.
Do not remove this notice.
Week 8 Hypothesis testing: Learning Outcomes
#1 Describe the steps of hypothesis testing
#2 Construct hypothesis tests for one population mean
#3 Examine errors in hypothesis testing

Why does this matter?
We sometimes need to determine if there is significant evidence to support a claim.
#1 Describe the steps of hypothesis testing
#1 What is a hypothesis?
A hypothesis is an idea, claim or belief about a population, that we want to test, using a sample.
#1 Steps in hypothesis testing
#1 Step 1: Write the hypotheses

Null hypothesis, denoted by Ho
•Expresses what we initially ASSUME to be true.

Alternative hypothesis, denoted by Ha, HA, or H1
•Expresses our claim as a statement that we are trying to gather enough
evidence to SUPPORT.
#1 Introductory example
An individual is thought to have committed a crime and is brought before a court of law.
In Australia, we presume the individual is innocent, then gather evidence to try and prove they are guilty.
We have two hypotheses:
Ho: individual is innocent
Ha: individual is guilty
Objective of testing: gather evidence to reject Ho and accept Ha.

#1 Possible outcomes in our introductory example?
Four possible outcomes:
Two correct outcomes
•Innocent individual is freed
•Guilty individual is imprisoned
Two incorrect outcomes (errors)
•Innocent individual is imprisoned
•Guilty individual is freed
#1 Quick quiz
Let’s use the previous example, with the following hypotheses:
Ho: individual is innocent
Ha: individual is guilty
Which is worse: an innocent individual going to prison, or a guilty individual going free?
#1 Step 2: Find the calculated test statistic and/or the p-value

Calculated test statistic
•Value from a formula, quantifying the difference between what is hypothesised
about the population and what is observed in the sample.

p-value
•The probability of getting our sample results, or results more extreme, if our null
hypothesis were really true.
#1 Step 3: Find the critical value

Critical value
•As a minimum, we need:
o Relevant statistical tables or technology
o Level of significance, α
o Number of tails in Ha

Level of significance, α
•Area of rejection region
•Probability of Type I error (later)
#1 Step 4: Sketch a curve

Ha determines the number of tails in the curve.
o On the axis of the curve:
  Insert critical value(s)
  Label rejection region(s)
  Insert calculated value
o In the area beneath the curve:
  Insert α
  Insert p-value (if relevant)
#1 Step 5: Decision and Step 6: Conclusion

Decision
i. Critical value method: compare the calculated test statistic with the critical value(s).
or
ii. p-value method: compare the p-value of the calculated test statistic with α.
If we reject Ho, then we may accept Ha.
BUT, if we fail to reject Ho, then we must RETAIN Ho; we NEVER accept Ho.

Conclusion
•Use your decision to answer the original question.
#2 Construct hypothesis tests for one population mean
#2 Test hypotheses about one population mean
#2 The level of significance, α
α is read as “alpha”.
α is also called the significance level.
α, and the direction in Ha, help us define the rejection region by finding the critical value(s).
Usually, 1% ≤ α ≤ 10% or 0.01 ≤ α ≤ 0.10.
α of 1% is a very strict test, as it has a very small rejection region.
α of 10% is a more lenient test, as it has a larger rejection region.
α should be selected by the researcher before a sample is selected.
α is the probability of incorrectly rejecting Ho, i.e., rejecting Ho when Ho is actually true.
#2 Writing hypotheses and sketches to test μ
α is the level of significance. In each sketch, the critical value(s) mark the start of the
shaded rejection region(s), and the total area under the curve = 100% or 1.
One-tailed test in the left or lower tail
Ho: μ = value
Ha: μ < value
Rejection region: area α in the left tail.
One-tailed test in the right or upper tail
Ho: μ = value
Ha: μ > value
Rejection region: area α in the right tail.
Two-tailed test
Ho: μ = value
Ha: μ ≠ value
Rejection regions: area α/2 in each tail.
#2 Example
i) The manager of a soda drink company has set the bottling machine to fill bottles to an
average volume of 250 ml. An assembly line worker claims the bottling machine is not
filling the bottles correctly. The worker samples 40 bottles and finds an average of 230 ml
per bottle.
Write the hypotheses to test the worker’s claim.
Ho: µ = 250
Ha: µ ≠ 250
ii) A phone company executive announced that the customer call waiting time for their
helpline is less than an average of 12 minutes per call. The company takes a random
sample of 36 calls and finds an average wait time of 11 minutes.
Write the hypotheses to test the executive’s announcement.
Ho: µ = 12
Ha: µ < 12
iii) The CEO of a fast-food franchise reported that the average weekly sales, per
franchise, is $45,000. The marketing team believes this can be increased and runs an
advertising campaign. The CEO sampled the weekly sales of 10 franchises, after the campaign,
and found average sales were $47,000.
Write the hypotheses to test the marketing team’s belief.
Ho: µ = 45,000
Ha: µ > 45,000

#2 Test statistic formula to test μ
σ is KNOWN: use Z
\( Z_{calc} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)
σ is UNKNOWN: use t
\( t_{calc} = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \)
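
Both formulas have the same shape, so one small function can cover them. The following is a minimal Python sketch, not part of the original slides: the function name `calc_test_statistic` and the numbers in the usage line are hypothetical, chosen only to make the arithmetic easy to check by hand.

```python
import math

def calc_test_statistic(sample_mean, hypothesised_mean, std_dev, n):
    """Return (x-bar - mu) / (std_dev / sqrt(n)).

    Pass sigma (population standard deviation) to get Z calc,
    or s (sample standard deviation) to get t calc.
    """
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error

# Hypothetical illustrative numbers (not from the slides):
# x-bar = 52, hypothesised mu = 50, sigma = 8, n = 64
z_calc = calc_test_statistic(sample_mean=52.0, hypothesised_mean=50.0, std_dev=8.0, n=64)
print(round(z_calc, 2))  # (52 - 50) / (8 / 8) = 2.0
```

Whether the result is compared with a Z or a t critical value depends only on whether σ was known.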

#2 Critical value to test μ
For a Z-test, to find the Z critical we need:
i) Direction in Ha
ii) Level of significance, α
For a t-test, to find the t critical we need:
i) Direction in Ha
ii) Level of significance, α
iii) Degrees of freedom = n – 1

#2 Summary table of common Z critical values for tests

α One-tailed test Two-tailed test
1% = 0.01 2.33 2.576
5% = 0.05 1.645 1.96
10% = 0.10 1.28 1.645
Z critical may be negative, positive or both
depending on the direction in the
alternative hypothesis, Ha.
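
If technology is used instead of tables, the Z critical values in the summary table can be reproduced from the standard normal distribution. This is a sketch using Python's standard library `statistics.NormalDist`; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_critical(alpha, tails):
    """Positive Z critical value; attach the sign(s) according to Ha."""
    tail_area = alpha / tails  # alpha in one tail, or alpha/2 in each of two tails
    return standard_normal.inv_cdf(1 - tail_area)

# Print alpha, the one-tailed value and the two-tailed value
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(z_critical(alpha, 1), 3), round(z_critical(alpha, 2), 3))
```

The printed values carry three decimals, so the one-tailed entries for 1% and 10% show slightly more precision than the table's 2.33 and 1.28.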

24
#2 Conditions to check before testing μ
25
#2 Example: Two-tailed test of μ, when σ is KNOWN
BUSINESS QUESTION: Has the average time, to cook and glaze a Krispy Kreme donut, changed?
The amount of time required to cook and glaze a Krispy Kreme donut, on an assembly line, is thought to be
normally distributed with a mean of 130 seconds and a standard deviation of 15 seconds. A sample of 100
randomly selected Krispy Kreme donuts is drawn and the cooking and glazing processing time recorded. The
sample mean is found to be 126.8 seconds.
Using the critical value method, test the belief that the average time to
cook and glaze a donut has changed. Use α = 0.10. Assume conditions are satisfied.

Solution: σ is known, so we can use Z.
Step 1:
Ho: μ = 130
Ha: μ ≠ 130
Step 2:
\( Z_{calc} = \dfrac{126.8 - 130}{15 / \sqrt{100}} = \dfrac{-3.2}{1.5} = -2.13 \)
Step 3:
Two-tailed test with α/2 = 0.10/2 = 0.05 in each tail, so Z critical = ±1.645.
Step 4:
Sketch of the Z curve, with –2.13, –1.645, 0 and 1.645 marked on the axis: REJECT Ho below –1.645
(area α/2 = 0.05) and above 1.645 (area α/2 = 0.05); do not reject Ho in between.
Step 5:
Decision: As Z calculated of -2.13 is less than Z critical of -1.645, we reject Ho and accept Ha, at α = 0.10.
Step 6:
Conclusion: There is significant evidence that the population average time to cook and glaze a Krispy
Kreme donut has changed and is not 130 seconds.
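
The arithmetic in this example can be checked with a few lines of Python. This is a sketch only, using the numbers given in the example and the standard library's `statistics.NormalDist` in place of Z tables; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values from the Krispy Kreme example
mu_0, sigma, n, x_bar, alpha = 130, 15, 100, 126.8, 0.10

# Step 2: calculated test statistic
z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Step 3: two-tailed test, so alpha/2 sits in each tail
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: reject Ho if Z calculated falls in either rejection region
reject_h0 = abs(z_calc) > z_crit
print(f"Z calculated = {z_calc:.2f}, Z critical = ±{z_crit:.3f}")
print("Reject Ho" if reject_h0 else "Retain Ho")
```

Comparing the absolute value of Z calculated with the positive critical value handles both tails in one step.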

#2 Example: Left-tailed test for μ, when σ is KNOWN
BUSINESS QUESTION: Has the average price of petrol decreased since the onset of COVID-19?
A radio announcer claimed the average price of petrol has decreased due to
COVID-19 restrictions. Say, the average price of unleaded petrol in 2019 was
$1.35 per litre with a standard deviation of $0.10 per litre. To investigate
the radio announcer’s claim, a researcher, in late 2020, takes a random
sample of 64 different petrol stations and records the price of unleaded
petrol for each. This gave an average of $1.30 per litre.
a) Check the conditions before testing for μ.
b) Write the hypotheses to test the radio announcer’s claim.
c) Calculate the test statistic.
d) Find the critical value at the 5% level of significance.
e) Sketch the curve, showing the rejection region.
f) Give your decision using the critical value method.
g) Give your conclusion.

#2 Example solution
a) Random Sample Condition? We are told it is a random sample.
10% Condition? As n = 64, we need at least 640 different petrol stations in
Australia. This is satisfied as there are thousands of petrol stations in Australia.
Normal or Large Enough Sample Condition? As n = 64 > 30, by the Central Limit
Theorem, the sample mean is approximately normally distributed, i.e., \( \bar{X} \sim \text{Normal} \).
All the conditions are satisfied. As σ is known, use Z.
b) Ho: μ = 1.35
Ha: μ < 1.35
c) \( Z_{calc} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{1.30 - 1.35}{0.10 / \sqrt{64}} = \dfrac{-0.05}{0.0125} = -4 \)
d) We are told α = 5% = 0.05, and Ha is one-tailed in the left tail. Using the Z critical summary
table for testing, Z critical is -1.645.
e) Sketch of the Z curve, with –4, –1.645 and 0 marked on the axis: REJECT Ho below –1.645
(area α = 5% = 0.05); do not reject Ho above –1.645.
f) Decision: With this left-tailed test, as Z calculated of -4 is less than Z critical of
-1.645, we can reject Ho and accept Ha, at the 5% level of significance.
g) Conclusion: There is significant evidence to support the radio announcer’s claim
that the average price of petrol has decreased since COVID-19.
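
The critical value method for a Z-test can be written once and reused for left-tailed, right-tailed and two-tailed tests. The sketch below is one possible way to do this in Python; the function `z_test` and its `tail` argument are hypothetical names introduced here, and the call at the end uses the petrol example's numbers.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha, tail):
    """Critical value method for one population mean when sigma is known.

    tail is "left" (Ha: mu < mu_0), "right" (Ha: mu > mu_0) or "two" (Ha: mu != mu_0).
    Returns (z_calc, z_crit, reject_h0); for "two", z_crit is the positive value.
    """
    z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        z_crit = std_normal.inv_cdf(alpha)          # negative critical value
        reject_h0 = z_calc < z_crit
    elif tail == "right":
        z_crit = std_normal.inv_cdf(1 - alpha)      # positive critical value
        reject_h0 = z_calc > z_crit
    elif tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        reject_h0 = abs(z_calc) > z_crit
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z_calc, z_crit, reject_h0

# Petrol example: mu_0 = 1.35, sigma = 0.10, n = 64, x-bar = 1.30, alpha = 0.05, left-tailed
z_calc, z_crit, reject_h0 = z_test(1.30, 1.35, 0.10, 64, 0.05, "left")
print(f"Z calculated = {z_calc:.2f}, Z critical = {z_crit:.3f}, reject Ho: {reject_h0}")
```

The result should agree with the decision in part f): Z calculated lies well inside the left-tail rejection region.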
29
#2 Exercise
BUSINESS QUESTION: Has the average price of a cup of coffee increased?
You are told that last year, the mean price of a cup of coffee was
$4.00 with a standard deviation of $0.40. A random sample of 49
cafes found a mean price of $4.10. Is there significant evidence that
the mean price has increased?
a) Check the conditions before testing for μ.
b) Write the hypotheses for this test.
c) Calculate the test statistic.
d) Find the critical value at a 1% level of significance.
e) Sketch the curve, showing the rejection region.
f) Give your decision using the critical value method.
g) Give your conclusion.
#2 Example: Right-tailed test for μ, when σ is UNKNOWN
Did online shopping become more popular in 2020? Say, in 2019, Australian
households spent an average of $346 per week online. A random sample of 51
Australian households in 2020 revealed an average expenditure online of $368 per
week with a standard deviation of $95. Assume the conditions are satisfied.
a) Test using α of 10%
b) Test using α of 5%
#2 Example solution
μ = $346, n = 51, x̄ = $368, s = $95. As σ is unknown, do a t-test.
a) Test using α of 10%
Step 1:
Ho: μ = 346
Ha: μ > 346
Step 2:
\( t_{calc} = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{368 - 346}{95 / \sqrt{51}} = 1.65 \)
Step 3:
For α = 10% = 0.10 in the right tail, and df = 51 – 1 = 50, t critical = 1.299.
Step 4:
Sketch of the t curve, with 0, 1.299 and 1.65 marked on the axis: REJECT Ho above 1.299
(area α = 10% = 0.10); do not reject Ho below 1.299.
Step 5:
Decision: As t calculated of 1.65 > t critical of 1.299, we can reject Ho and accept Ha.
Step 6:
Conclusion: At α of 10%, there is significant evidence that online shopping became
more popular in 2020.

#2 Example solution
b) Test using α of 5%
From part a): same hypotheses and the same t calculated of 1.65.
New t critical for α = 5% = 0.05: using t0.05 in the right tail and df = 51 – 1 = 50, t critical = 1.676.
Sketch of the t curve, with 0, 1.65 and 1.676 marked on the axis: REJECT Ho above 1.676
(area α = 5% = 0.05); do not reject Ho below 1.676.
New decision: As t calculated of 1.65 is now less than t critical of 1.676,
t calculated does NOT lie in the rejection region, so we cannot reject Ho.
Now, we must retain Ho.
New conclusion: At α of 5%, there is NO significant evidence that online
shopping became more popular in 2020.
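
Parts a) and b) differ only in the critical value, which a short Python sketch makes easy to see. Python's standard library has no t distribution, so the two t critical values below are simply copied from the worked solution (df = 50, right tail) rather than computed; the variable names are illustrative.

```python
import math

# Values from the online shopping example
mu_0, n, x_bar, s = 346, 51, 368, 95

# Calculated test statistic (sigma unknown, so s is used)
t_calc = (x_bar - mu_0) / (s / math.sqrt(n))

# Right-tail t critical values for df = 50, taken from the worked solution
# (read from t tables, not computed here)
t_critical_df50 = {0.10: 1.299, 0.05: 1.676}

for alpha, t_crit in t_critical_df50.items():
    decision = "reject Ho" if t_calc > t_crit else "retain Ho"
    print(f"alpha = {alpha:.2f}: t calc = {t_calc:.2f}, t crit = {t_crit}, {decision}")
```

The same sample leads to opposite decisions at the two significance levels, which is why α should be fixed before sampling.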

#2 Exercise
BUSINESS QUESTION: Have households stopped hoarding toilet paper?
In March 2020, Australia saw the demand for toilet paper increase due
to impending isolation from COVID-19. By October 2020, Australian
households felt less threatened, and demand for toilet paper seemed
to revert back to pre-COVID-19 levels.
Say, in March 2020, Australian households stored an average of 12.5
toilet rolls per adult.
In October 2020, a random sample of 36 Australian households, found
an average of 10.8 toilet rolls stored per adult, with a standard
deviation of 5 toilet rolls.
In October 2020, did Australian households feel less urgency to store
toilet paper, as compared to March 2020?
Assume the conditions are satisfied. Test using α of 5%.

#2 p-values
A different way of doing a test – the p-value method
A p-value is the probability of obtaining a test statistic at least as extreme (≤ or ≥) as the
observed sample value, given Ho is true.
The p-value is also called the observed level of significance.
The p-value is the smallest value of α for which Ho can be rejected.
We can obtain the p-value from statistical tables or technology.
We set a level of significance (α), which is like a threshold for the p-value.
Rule: p-value LOW, NULL must GO.
If the p-value ≤ α, we reject Ho and accept Ha.
If the p-value > α, we retain Ho.
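
For a Z-test, the p-value and the decision rule can be sketched in Python as follows. The function names `z_p_value` and `decide` are hypothetical, and the Z calculated value in the usage lines is an illustrative number, not one from the slides.

```python
from statistics import NormalDist

def z_p_value(z_calc, tail):
    """p-value for a Z-test; tail is "left", "right" or "two" (the direction in Ha)."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z_calc)             # P(Z <= z_calc)
    if tail == "right":
        return 1 - std_normal.cdf(z_calc)         # P(Z >= z_calc)
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z_calc))   # both tails beyond |z_calc|
    raise ValueError("tail must be 'left', 'right' or 'two'")

def decide(p_value, alpha):
    """Rule: p-value LOW, NULL must GO."""
    return "reject Ho" if p_value <= alpha else "retain Ho"

# Hypothetical right-tailed test with Z calculated = 1.0 and alpha = 0.05
p = z_p_value(1.0, "right")
print(round(p, 4), decide(p, 0.05))
```

Because the rule uses ≤, a p-value exactly equal to α still leads to rejecting Ho.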
#2 Recall the earlier Krispy Kreme example …
The amount of time required to cook and glaze a Krispy Kreme donut on an assembly line is thought to be
normally distributed with a mean of 130 seconds and standard deviation of 15 seconds. A sample of 100
randomly selected Krispy Kreme donuts is drawn and the cooking and glazing processing time recorded. The
sample mean is found to be 126.8 seconds.
Using the p-value method, test the belief that the average time to
cook and glaze a donut is no longer 130 seconds. Use α = 0.10. Assume conditions are satisfied.
Ho: μ = 130
Ha: μ ≠ 130
\( Z_{calc} = \dfrac{126.8 - 130}{15 / \sqrt{100}} = -2.13 \)
Two-tailed test:
p-value = P(Z ≤ -2.13) + P(Z ≥ 2.13)
= 0.0166 + 0.0166
= 0.0332
Sketch of the Z curve, with -2.13, -1.645, 0, 1.645 and 2.13 marked on the axis: REJECT Ho below -1.645
and above 1.645 (area α/2 = 0.10/2 = 0.05 in each tail); do not reject Ho in between. The area beyond
±2.13 in each tail is p-value/2 = 0.0166.
Decision: As the p-value of 0.0332 is less than α of 0.10, we can reject Ho and accept Ha.
Same conclusion as with the critical value method: there is significant evidence that the population
average time to cook and glaze a Krispy Kreme donut has changed, and is not 130 seconds.
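
The two-tailed p-value in this example can be reproduced with technology rather than Z tables. The Python sketch below assumes the standard library's `statistics.NormalDist`; it rounds Z calculated to two decimals first, as a table lookup would, so the unrounded statistic would give a marginally smaller p-value.

```python
import math
from statistics import NormalDist

# Values from the Krispy Kreme example
mu_0, sigma, n, x_bar, alpha = 130, 15, 100, 126.8, 0.10

z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))
z_rounded = round(z_calc, 2)  # Z tables work with two decimals

# Two-tailed test: add the areas beyond -|z| and +|z|
p_value = 2 * NormalDist().cdf(-abs(z_rounded))

decision = "reject Ho" if p_value <= alpha else "retain Ho"
print(f"p-value = {p_value:.4f}, {decision}")
```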
#2 Exercise
BUSINESS QUESTION: Has the average price of a cup of coffee increased?
Recall the earlier coffee exercise:
Last year, the mean price of a cup of coffee was $4.00 with a
standard deviation of $0.40. A random sample of 49 cafes found a
mean price of $4.10. Is there significant evidence that the mean
price has increased?
Now, test at the 1% level of significance, using the p-value method.
Recall, from the earlier exercise:
Ho: μ = 4
Ha: μ > 4
\( Z_{calc} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{4.1 - 4}{0.40 / \sqrt{49}} = 1.75 \)
#3 Examine errors in hypothesis testing
#3 Understand errors in hypothesis testing
44
#3 Summary of errors in hypothesis testing
#3 Example
Can there be errors in COVID-19 test results?
Ho: person is COVID-19 positive
Ha: person is COVID-19 negative
Type I error:
Incorrectly rejecting Ho. Claiming the person is COVID-19 negative, but in reality
they are COVID-19 positive.
Type II error:
Incorrectly retaining Ho. Concluding the person is COVID-19 positive, but in
reality the person is COVID-19 negative.
#3 Exercise
Cars emit harmful air pollutants. For a car to be marketed in Australia, it must meet minimum
requirements for emissions. Suppose government regulators are suspicious of tests done by a car
manufacturing company and retest a sample of their cars. The company will get a large fine and have to
fix all cars if they are found not to satisfy the minimum requirements. The hypotheses are set up as follows:
Ho: The cars satisfy the minimum requirements for emissions
Ha: The cars do not satisfy the minimum requirements for emissions
a) In this context, what is a Type I error?
Is this more serious for the car manufacturer or for the consumer?
b) In this context, what is a Type II error?
Is this more serious for the car manufacturer or for the consumer?
Supplementary Exercises
Students are advised that Supplementary Exercises to this topic may be found on the
subject portal under “Weekly materials”.
Solutions to the Supplementary Exercises may be available on the portal under “Weekly
materials” at the end of each week.
Time permitting, the lecturer may ask students to work through some of these exercises
in class.
Otherwise, it is expected that all students work through all Supplementary Exercises
outside of class time.

49
Extension
The following slides are an extension to this week’s topic.
The work covered in the extension:
o Is not covered in class by the lecturer.
o May be assessed.
Example
Write the hypotheses for the following:
a) The latest version of a smartphone claims that the battery life per
charge has increased from the previous model. The previous model,
had a battery that lasted an average of 10 hours of continuous internet,
per charge. Thirty of the latest version phones were randomly selected
and tested. These had an average battery life of 12 hours of continuous
internet, per charge.
Ho: μ = 10    or we could write    Ho: μ ≤ 10
Ha: μ > 10                         Ha: μ > 10
b) Does social media lead to less sleep? Researchers investigating the
impact of social media on sleep, took a random sample of 100 adults and
found an average of 7.5 hours sleep per night. The recommended average
number of hours of sleep per night is 8 hours.
Ho: μ = 8     or we could write    Ho: μ ≥ 8
Ha: μ < 8                          Ha: μ < 8
Example
Find the critical values for the following hypothesis tests. Assume conditions are
satisfied.
a) Ho: µ = 10
Ha: µ > 10
Where α is 5%, n is 46 and σ is unknown, use t.
α is 5% or 0.05. Right-tailed, use t0.05.
df = n – 1 = 46 – 1 = 45
t critical = 1.679
b) Ho: µ = 500
Ha: µ ≠ 500
Where α is 10%, n is 35 and σ is known, use Z.
Two-tailed, so α/2 = 0.05 in each tail.
Z critical = ± 1.645
c) Ho: µ = 100
Ha: µ < 100
Where α is 10%, n is 36 and σ is unknown, use t.
Left-tailed, use t0.10.
df = n – 1 = 36 – 1 = 35
t critical = -1.306
Summary of errors in hypothesis testing
The significance level, α, should be selected before sampling begins as it
determines the rejection region for a hypothesis test. It is unethical to choose
α after testing begins.
Choice of α must take into account the relative seriousness of the two types of
error.
There is a trade-off between the probability of a Type I error and the probability of a Type II
error. As we reduce P(Type I error), we decrease the size of the rejection
region, and therefore increase the size of the retain region and the chance of a Type
II error.
As α is the probability of a Type I error and can be fixed before testing, we
should write the hypotheses such that a Type I error is the more serious of the
two errors.
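
The trade-off between the two error probabilities can be illustrated numerically. The Python sketch below is an illustration only: it assumes a right-tailed Z-test with σ known, and every number in it (hypothesised mean 100, true mean 103, σ = 10, n = 25) is hypothetical and not taken from the slides. P(Type II error) can only be calculated once a specific true mean is assumed.

```python
import math
from statistics import NormalDist

def type_ii_error_prob(mu_0, mu_true, sigma, n, alpha):
    """P(retain Ho | true mean is mu_true) for a right-tailed Z-test of Ho: mu = mu_0."""
    standard_error = sigma / math.sqrt(n)
    # Smallest sample mean that would lead to rejecting Ho
    x_bar_crit = mu_0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Chance the sample mean falls below that cut-off when the true mean is mu_true
    return NormalDist(mu_true, standard_error).cdf(x_bar_crit)

# Hypothetical values: Ho: mu = 100, true mean 103, sigma = 10, n = 25
betas = {}
for alpha in (0.10, 0.05, 0.01):
    betas[alpha] = type_ii_error_prob(100, 103, 10, 25, alpha)
    print(f"P(Type I error) = {alpha:.2f} -> P(Type II error) = {betas[alpha]:.3f}")
```

Under these assumptions, each reduction in α shrinks the rejection region and raises the chance of retaining a false Ho.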

Testing hypotheses and interval estimators
Interval estimators can be used to test hypotheses.
Calculate the (1 – α) confidence level interval estimator, then
•if the hypothesised parameter value falls within the interval, do not
reject the null hypothesis, while
•if the hypothesised parameter value falls outside the interval, conclude
that the null hypothesis can be rejected (μ is not equal to the
hypothesised value).
Drawbacks
•Two-tail interval estimators may not provide the right answer to the question
posed in one-tail hypothesis tests.
•The interval estimator does not yield a p-value.
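
As an illustration of testing with an interval estimator, the Python sketch below reuses the numbers from the two-tailed Krispy Kreme example (hypothesised mean 130, σ = 15, n = 100, sample mean 126.8, α = 0.10). It assumes the usual interval for a mean when σ is known, x̄ ± Z critical × σ/√n, which is not written out in these slides.

```python
import math
from statistics import NormalDist

# Values from the Krispy Kreme example
mu_0, sigma, n, x_bar, alpha = 130, 15, 100, 126.8, 0.10

# (1 - alpha) confidence interval for mu, sigma known
z = NormalDist().inv_cdf(1 - alpha / 2)
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

# Reject Ho if the hypothesised mean lies outside the interval
reject_h0 = not (lower <= mu_0 <= upper)
print(f"{1 - alpha:.0%} interval: ({lower:.2f}, {upper:.2f})")
print("Reject Ho" if reject_h0 else "Retain Ho")
```

The hypothesised value of 130 lies above the upper limit, which matches the two-tailed test decision for that example.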