Nonparametric Test
Chi-Square Test for Independence
The test is used to determine whether two categorical variables are independent.
Notation for the Chi-Square Test for Independence (Please note that the notation varies
depending on the text)
O represents the observed frequency of an outcome
E represents the expected frequency of an outcome
r represents the number of rows in the contingency table
c represents the number of columns in the contingency table
n represents the total number of trials
Test Statistic
\[ \chi^2 = \sum \frac{(O - E)^2}{E}, \qquad df = (r - 1)(c - 1) \]
The Chi-Square test is a hypothesis test. There are seven steps for a hypothesis test.
1. State the null hypothesis
2. State the alternative hypothesis
3. State the level of significance
4. State the test statistic
5. Calculate
6. Statistical Conclusion
7. Experimental Conclusion
Example
A university wants to know whether the choice of major is related to gender. A
random sample of 200 incoming freshman students was taken (100 male and 100
female). Their major and gender were recorded. The results are shown in the
contingency table below.
Major Female Male
Math 5 15
Nursing 44 10
English 10 10
Pre-Med 17 20
History 4 5
Education 15 20
Undecided 5 20
To determine whether there is a relationship between the gender of a freshman student and
their declared major, perform the hypothesis test (use level of significance $\alpha = 0.05$).
Step 1: Null Hypothesis
$H_0$: Gender and major of freshman students are independent
Step 2: Alternative Hypothesis
$H_A$: Gender and major of freshman students are not independent
Step 3: Level of Significance
$\alpha = 0.05$
Step 4: Test Statistic
\[ \chi^2 = \sum \frac{(O - E)^2}{E}, \qquad df = (r - 1)(c - 1) \]
Step 5: Calculations
There are several calculations for this test. First, find the expected frequency for
each cell in the contingency table. The expected frequency is the probability under the
null hypothesis times the total frequency for the given row. Here the probability under
the null hypothesis is .5 for each gender, because the sample contains equal numbers of
males and females (100 of each out of 200).
\[ E = p \, n_r \]
Major      Female                      Male
Math       $E_1 = .5(20) = 10$         $E_2 = .5(20) = 10$
Nursing    $E_3 = .5(54) = 27$         $E_4 = .5(54) = 27$
English    $E_5 = .5(20) = 10$         $E_6 = .5(20) = 10$
Pre-Med    $E_7 = .5(37) = 18.5$       $E_8 = .5(37) = 18.5$
History    $E_9 = .5(9) = 4.5$         $E_{10} = .5(9) = 4.5$
Education  $E_{11} = .5(35) = 17.5$    $E_{12} = .5(35) = 17.5$
Undecided  $E_{13} = .5(25) = 12.5$    $E_{14} = .5(25) = 12.5$
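The expected frequencies can also be obtained from the usual contingency-table rule, E = (row total × column total) / n, which reduces to .5 times the row total here because each column total is 100 out of n = 200. The following is a minimal Python sketch of that rule; the names `observed` and `expected_frequencies` are illustrative and do not come from the original text.

```python
# Observed counts from the contingency table: major -> (female, male).
observed = {
    "Math": (5, 15),
    "Nursing": (44, 10),
    "English": (10, 10),
    "Pre-Med": (17, 20),
    "History": (4, 5),
    "Education": (15, 20),
    "Undecided": (5, 20),
}


def expected_frequencies(table):
    """Return E = row_total * column_total / n for every cell of the table."""
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    n = sum(col_totals)
    return {
        label: tuple(sum(row) * col_total / n for col_total in col_totals)
        for label, row in table.items()
    }


expected = expected_frequencies(observed)
for major, cells in expected.items():
    print(major, cells)
```

Because both column totals equal 100, every row gets the same expected count in the Female and Male columns, matching the table.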
Now calculate the test statistic.
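\begin{align*}
\chi^2_{obs} &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(5 - 10)^2}{10} + \frac{(15 - 10)^2}{10} + \frac{(44 - 27)^2}{27} + \frac{(10 - 27)^2}{27} \\
&\quad + \frac{(10 - 10)^2}{10} + \frac{(10 - 10)^2}{10} + \frac{(17 - 18.5)^2}{18.5} + \frac{(20 - 18.5)^2}{18.5} \\
&\quad + \frac{(4 - 4.5)^2}{4.5} + \frac{(5 - 4.5)^2}{4.5} + \frac{(15 - 17.5)^2}{17.5} + \frac{(20 - 17.5)^2}{17.5} \\
&\quad + \frac{(5 - 12.5)^2}{12.5} + \frac{(20 - 12.5)^2}{12.5} \\
&= 2.5 + 2.5 + 10.7 + 10.7 + 0 + 0 + .1216 + .1216 + .0556 + .0556 + .357 + .357 + 4.5 + 4.5 \\
&= 36.4684
\end{align*}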
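As a check on the arithmetic, the sum can be reproduced in a few lines of Python. This is only a sketch: the variable names are illustrative, and it assumes the same expected frequencies as above (.5 times each row total). The hand calculation rounds each term before summing, which gives 36.4684; without intermediate rounding the value is about 36.476, and the conclusion is unchanged.

```python
# Observed (female, male) counts, one tuple per major, in table order.
observed = [(5, 15), (44, 10), (10, 10), (17, 20), (4, 5), (15, 20), (5, 20)]
p = 0.5  # probability of each gender under the null hypothesis

chi2_obs = 0.0
for row in observed:
    e = p * sum(row)  # expected frequency for both cells in this row
    for o in row:
        chi2_obs += (o - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi2_obs, 3), df)
```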
The calculation for degrees of freedom is as follows:
\[ df = (r - 1)(c - 1) = (7 - 1)(2 - 1) = 6 \cdot 1 = 6 \]
The critical value for the Chi-Square with 6 degrees of freedom at a level of significance
0.05 is 12.592. This is found by using the Chi Square table.
Step 6: Statistical Conclusion
Since $\chi^2_{obs} = 36.4684 > 12.592 = \chi^2_{df = 6,\ \alpha = 0.05}$, reject the null hypothesis.
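The table lookup can be cross-checked without a statistics library. For an even number of degrees of freedom, the upper-tail probability of the Chi-Square distribution has the closed form P(χ² > x) = e^(−x/2) Σ_{j=0}^{df/2 − 1} (x/2)^j / j!. The sketch below uses that identity; the function name `chi2_sf_even_df` is hypothetical, and the approach only applies when df is even, as it is here (df = 6).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi-square > x); valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


alpha = 0.05
p_at_critical = chi2_sf_even_df(12.592, 6)   # should be close to alpha
p_value = chi2_sf_even_df(36.4684, 6)        # tail area beyond the observed value
reject = p_value < alpha

print(p_at_critical, p_value, reject)
```

Comparing the p-value with the level of significance leads to the same decision as comparing the observed statistic with the critical value.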
Step 7: Experimental Conclusion
There is sufficient evidence to indicate that gender and choice of major are not
independent for the incoming freshmen.
Mann-Whitney
The Mann-Whitney test is a nonparametric version of the independent-samples t-test.
This test is used when there are two independent samples of ordinal scores.
Test Statistic
\[ U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \]
\[ U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2 \]
Notation:
$n_1$ = number of scores in group 1
$n_2$ = number of scores in group 2
$R_1$ = sum of ranks for scores in group 1
$R_2$ = sum of ranks for scores in group 2
The Mann-Whitney test is a hypothesis test. There are seven steps for a hypothesis
test.
1. State the null hypothesis
2. State the alternative hypothesis
3. State the level of significance
4. State the test statistic
5. Calculate
6. Statistical Conclusion
7. Experimental Conclusion
Example:
Suppose there was a race between an antelope and a mountain lion. The antelope won,
but you want to know whether this can be extended to a general statement that antelopes
will win a race against mountain lions. A random sample of 10 antelopes and 10
mountain lions is put in a race. The order in which they finish is recorded (A for
antelope and L for mountain lion).
AALALLLAAALAALLAALLL
Step 1: Null Hypothesis
$H_0$: The probability of the antelope winning is no different from the probability of the
mountain lion winning the race.
Step 2: Alternative Hypothesis
$H_A$: The probability of the antelope winning is different from the probability of the
mountain lion winning the race.
Step 3: Level of Significance
$\alpha = 0.05$
Step 4: Test Statistic
\[ U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \]
\[ U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2 \]
Step 5: Calculations
The results are ranked according to the order in which they finished the race.
Animal Rank
A 1
A 2
L 3
A 4
L 5
L 6
L 7
A 8
A 9
A 10
L 11
A 12
A 13
L 14
L 15
A 16
A 17
L 18
L 19
L 20
Let the antelopes be sample 1 and the mountain lions be sample 2.
$n_1$ = number of scores for the antelopes
$n_2$ = number of scores for the mountain lions
$R_1$ = sum of ranks for the antelopes' scores
$R_2$ = sum of ranks for the mountain lions' scores
Find the sum of the ranks for each group.
Antelope Mountain Lion
1 3
2 5
4 6
8 7
9 11
10 14
12 15
13 18
16 19
17 20
Sum 92 118
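The rank sums in the table can be recomputed directly from the recorded finish order. The short Python sketch below does this; the names `finish_order` and `rank_sums` are illustrative, and it assumes no two animals tie, so the rank is simply the finishing position.

```python
finish_order = "AALALLLAAALAALLAALLL"  # A = antelope, L = mountain lion


def rank_sums(order):
    """Sum the finishing positions (ranks starting at 1) for each label."""
    sums = {}
    for rank, label in enumerate(order, start=1):
        sums[label] = sums.get(label, 0) + rank
    return sums


sums = rank_sums(finish_order)
print(sums)  # {'A': 92, 'L': 118}
```

As a sanity check, the two sums add up to 210, the total of the ranks 1 through 20.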
The number of scores for each animal is 10 because 10 antelopes and 10 mountain lions
raced.
Now plug the information into the test statistic
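\begin{align*}
U_1 &= n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \\
&= (10)(10) + \frac{10 (10 + 1)}{2} - 92 \\
&= 100 + 55 - 92 \\
&= 63
\end{align*}
\begin{align*}
U_2 &= n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2 \\
&= (10)(10) + \frac{10 (10 + 1)}{2} - 118 \\
&= 100 + 55 - 118 \\
&= 37
\end{align*}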
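The two substitutions can be written as a small Python function that follows the test statistic formulas term by term. This is a sketch with a hypothetical function name, `mann_whitney_u`, and the sample sizes and rank sums from this example.

```python
def mann_whitney_u(n1, n2, r1, r2):
    """Return (U1, U2) from the sample sizes and rank sums."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# 10 antelopes with rank sum 92, 10 mountain lions with rank sum 118.
u1, u2 = mann_whitney_u(10, 10, 92, 118)
print(u1, u2)  # 63.0 37.0
```

A useful check on hand calculations is that the two statistics always add up to n1 × n2, here 63 + 37 = 100.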
Find the critical value for the test statistic. Use the appropriate U table found in the
appendix of the textbook. Look for alpha to be 0.05 and then the number of scores for
each category to be 10. Then you see the following critical values.
\[ U_{1,crit} = 23, \qquad U_{2,crit} = 77 \]
Step 6: Statistical Conclusion
The statistical conclusion tells whether to reject or fail to reject the null hypothesis. The
rejection rule is as follows.
If $U_1 \le U_{1,crit}$ then reject the null hypothesis
If $U_2 \ge U_{2,crit}$ then reject the null hypothesis
Since $U_1 = 63 > 23 = U_{1,crit}$, fail to reject the null hypothesis; likewise, since
$U_2 = 37 < 77 = U_{2,crit}$, fail to reject the null hypothesis.
Step 7: Experimental Conclusion
Since we failed to reject the null hypothesis, we can say that there is not sufficient
evidence to support the claim that the probability of the antelope winning is any different
from the probability of the mountain lion winning the race.
Kruskal-Wallis Test
The Kruskal-Wallis Test is a nonparametric version of the one-way ANOVA. This test,
however, does not assume normality or homogeneity of variance. It requires
ordinal scaling of the dependent variable and at least 5 data values in each
sample.
Notation
$k$ = number of samples or groups
$n$ = number of data values in each group
$N$ = number of data values in all samples combined
$R$ = sum of the ranks for each sample
Test Statistic
\[ \chi^2 = \frac{12}{N(N + 1)} \sum \frac{R^2}{n} - 3(N + 1), \qquad df = k - 1 \]
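To make the formula concrete, here is a minimal Python sketch that ranks the pooled data and evaluates the statistic. The function name `kruskal_wallis` and the three small groups are hypothetical and chosen only for illustration. The sketch assumes tied values share the average of their ranks, and it applies no tie correction; neither choice is stated in the text above.

```python
def kruskal_wallis(groups):
    """Return (statistic, df, rank_sums) for a list of samples."""
    # Pool all values, remembering which group each one came from.
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    total = len(pooled)  # N in the formula
    rank_sums = [0.0] * len(groups)

    i = 0
    while i < total:
        # Find the run of tied values starting at position i.
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; ties get the average rank (assumption).
        midrank = (i + 1 + j) / 2
        for _, g in pooled[i:j]:
            rank_sums[g] += midrank
        i = j

    weighted = sum(r * r / len(group) for r, group in zip(rank_sums, groups))
    statistic = 12 / (total * (total + 1)) * weighted - 3 * (total + 1)
    return statistic, len(groups) - 1, rank_sums


# Hypothetical data: three fully separated groups of five values each.
stat, df, sums = kruskal_wallis([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
print(stat, df, sums)
```

For these made-up groups the rank sums are 15, 40, and 65, so the statistic works out to (12 / 240)(1210) − 48 = 12.5 with 2 degrees of freedom.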
The Kruskal-Wallis test is a hypothesis test. There are seven steps for a hypothesis
test.
1. State the null hypothesis
2. State the alternative hypothesis
3. State the level of significance
4. State the test statistic
5. Calculate
6. Statistical Conclusion
7. Experimental Conclusion
Example
A gym has decided to recommend a diet to its clients. It has narrowed the
choice down to three diets and wants to test which diet is best. The gym randomly selected 21
volunteers to participate in the study. All the participants have similar health
and body type as well as general exercise routine. The amount of weight loss was
recorded.
Diet A Diet B Diet C
10 11 1
12 4 8
13 2 17
7 6 0
3 16 16
15 9 18
5 14 19
Step 1: Null Hypothesis
$H_0$: There is no difference among the diets
Step 2: Alternative Hypothesis
$H_A$: There is a difference among the diets
Step 3: Level of Significance
$\alpha = 0.05$
Step 4: Test Statistic
\[ \chi^2 = \frac{12}{N(N + 1)} \sum \frac{R^2}{n} - 3(N + 1), \qquad df = k - 1 \]