Hypothesis tests in linear regression

STAM4000
Quantitative Methods
Week 11
Chi-square tests
Kaplan Business School (KBS), Australia
UP until now:
Hypothesis tests of the population mean, μ, (TESTS FOR ONE QUANTITATIVE
VARIABLE) based on the symmetric, bell-shaped distributions of either Z (standard
normal) or t.
Hypothesis tests in linear regression (TESTS FOR TWO OR MORE QUANTITATIVE
VARIABLES): these used the t distribution, which is symmetric and bell-shaped like
the normal distribution.
Today:
Hypothesis tests about TWO CATEGORICAL VARIABLES: specifically, we will be
testing for “INDEPENDENCE”. This uses the Chi-square distribution, which is positively
skewed (skewed to the right).
Way back in Week 3, “probability”, we looked at the independence between only TWO EVENTS.
In Week 11, we will concentrate on INDEPENDENCE OF TWO CATEGORICAL
VARIABLES OVERALL.
COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to you by or on behalf of Kaplan
Business School pursuant to Part VB of the
Copyright Act 1968 (the Act).
The material in this communication may be subject to copyright under the Act. Any further
reproduction or communication of this material by you may be the subject of copyright
protection under the Act.
Do not remove this notice.
Week 11
Chi-square tests
Learning Outcomes
#1 Introduction to Chi-square tests
#2 Chi-square test of independence
#3 Standardized (Z score) Chi-square residuals (difference/deviation)

Why does this matter?
If we have categorical variables, and our data are counts (or frequencies), we
can still examine whether the variables are independent.
#1 Introduction to Chi-square tests
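Note:
As with all calculated test statistics, we can be given a p-value.
However, it can be difficult to find the p-value for a Chi-square calculated value using
the Chi-square statistical tables, because the tables list only a few tail areas and so
give a range for the p-value rather than an exact value.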
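Software can give the p-value directly. The block below is a minimal sketch using only the Python standard library. The function name chi2_p_value is made up for this illustration, and it relies on the standard closed-form expressions for the upper-tail area of a Chi-square distribution with whole-number degrees of freedom.

```python
import math

def chi2_p_value(x, df):
    """Upper-tail area P(Chi-square >= x) for whole-number df >= 1.

    Hypothetical helper for illustration; uses the closed-form series
    for even and odd degrees of freedom.
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k of (x/2)^k / k!, k = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a correction series
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))          # P(|Z| >= chi)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * density * total

# 3.841 is the table critical value for df = 1 at the 5% level,
# so the p-value for a calculated value of 3.841 should be close to 0.05.
print(round(chi2_p_value(3.841, 1), 4))
```

If a statistics library is available, its Chi-square upper-tail function can be used in place of this helper.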
#1 What are we testing here?
Chi-square tests are about one or more categorical variables.
We will follow the familiar process of hypothesis testing:
Check conditions, but now we will have conditions for Chi-square.
Follow the steps of hypothesis testing:
o Write hypotheses: these now include the names of our categorical variables
o Find the Chi-square calculated test statistic, from a formula
o Find the Chi-square critical value, from Chi-square statistical tables
o Sketch a Chi-square curve, which is positively skewed (skewed to the right)
o Decision: compare the Chi-square calculated test statistic WITH the Chi-square
critical value
o Conclusion: tie our decision back to the original question
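The steps above can be sketched in a few lines of Python. The counts are made up purely for illustration and the variable names are hypothetical. The expected counts use the usual rule for a test of independence (row total × column total ÷ grand total), and 3.841 is the table critical value for 1 degree of freedom at the 5% level of significance.

```python
# Hypothetical 2 x 2 table of counts (made-up data):
# rows = two levels of categorical variable A, columns = two levels of variable B.
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if Ho (the variables are independent) is TRUE
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square calculated test statistic: sum of (observed - expected)^2 / expected
chi2_calc = sum((o - e) ** 2 / e
                for obs_row, exp_row in zip(observed, expected)
                for o, e in zip(obs_row, exp_row))

# Degrees of freedom = (rows - 1) x (columns - 1)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Chi-square critical value from statistical tables: df = 1, alpha = 0.05
chi2_crit = 3.841

# Decision: compare the calculated test statistic WITH the critical value
decision = "Reject Ho" if chi2_calc > chi2_crit else "Do not reject Ho"
print(f"Chi-square calc = {chi2_calc:.3f}, df = {df}, critical = {chi2_crit}: {decision}")
```

With these made-up counts every expected count is 25, so the calculated value is 4.0, which is larger than 3.841, and the decision is to reject Ho.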
Chi-square is read as “ki square”.

Ho is our “Null hypothesis”, which we ASSUME TO BE TRUE.
Ha is our “Alternative hypothesis”, for which we try to gather evidence. Sample
evidence can SUPPORT Ha, but it cannot PROVE it.
#1 Three different types of Chi-square tests
Goodness-of-fit test
Compares the observed distribution of one categorical
variable, to an expected distribution of that categorical
variable.
Test of homogeneity
Compares the distribution of several groups for the same
categorical variable.
Test of independence
Examines the difference between observed and expected
counts of two categorical variables, to determine if there
is an association between the two variables.
We will cover the test of independence and standardized residuals in STAM4000.
#1 Chi-square tests: Theory
We assume:
The outcome of each of the identical trials falls into one of two categories.
The probability of each outcome is constant throughout the experiment.
If p is the probability of success and n is the number of trials, the expected
frequency of an event X with success rate p is E[X] = np.
o The expected frequencies are calculated assuming the null hypothesis, Ho,
is TRUE.
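As a small worked example of E[X] = np, the sketch below uses made-up values for n and p. The numbers and variable names are assumptions for illustration only.

```python
# Hypothetical values: n identical trials, probability of success p stated in Ho
n = 200
p = 0.25

# Expected frequencies, calculated assuming Ho is TRUE
expected_success = n * p          # E[X] = np
expected_failure = n * (1 - p)    # the remaining trials fall in the other category

print(expected_success, expected_failure)
```

The two expected frequencies always add up to n, because each trial falls into exactly one of the two categories.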
#1 Chi-square tests: Theory continued …
Our test compares the observed frequencies, from the sample, with the
expected frequencies, from the hypothesised model in Ho.
We ask:
“Is the difference between what we expected and what we observed
due to sampling variability, or is the difference large enough to be
due to a change from the hypothesised model in Ho?”
We square the difference between the observed and expected frequencies, to
make it positive, AND then we divide this by the expected frequency, to get
an idea of the relative size of the difference.
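To see why we divide by the expected frequency, here is a minimal sketch with made-up numbers. The function name contribution is hypothetical. The same raw difference of 5 counts for a lot when we expected 25, and for very little when we expected 500.

```python
def contribution(observed, expected):
    """Squared difference between observed and expected, relative to expected."""
    return (observed - expected) ** 2 / expected

# Same raw difference of 5, very different relative size (made-up counts)
small_expected = contribution(30, 25)     # difference of 5 when we expected 25
large_expected = contribution(505, 500)   # difference of 5 when we expected 500

# Squaring makes the result positive whichever way the difference goes
below_expected = contribution(20, 25)     # observed is 5 BELOW expected

print(small_expected, large_expected, below_expected)
```

Here the first and third values are both 1.0, while the second is only 0.05.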
Chi-square calculator link:
http://www.socscistatistics.com/tests/chisquare2/Default2.aspx
Chi-square notes link: archive.bio.ed.ac.uk/jdeacon/statistics/tress9.html
#1 Chi-square calculated test statistic