Morgan, Leech, Gloeckner, & Barrett (2013). Cross-Tabulation, Chi-Square, and Nonparametric Measures of Association (Chapter 8).

Morgan, G. A., Leech, N. L., Gloeckner, G. W., & Barrett, K. C. (2013). Cross-Tabulation, Chi-Square, and Nonparametric Measures of Association (Chapter 8). In IBM SPSS for Introductory Statistics: Use and Interpretation (5th edition, pp. 136–148). New York: Routledge.

“In this chapter, you will learn how to make cross-tabulation tables from two variables, both of which have a few levels or values of categorical data. You will learn how to decide if there is a statistically significant relationship between two nominal variables using chi-square and you will learn how to assess the strength of this relationship (i.e., the effect size) using phi (or Cramer’s _V_) and odds ratios. You will also compute and interpret Kendall’s tau-b for ordinal variables and eta for one nominal and one normal/scale variable. We will see eta again in Chapter 11 as an effect size measure for ANOVAs. The statistics demonstrated in this chapter are called nonparametric statistics because they are designed to be used with data that are not normally distributed.” (p 136)

[“Problem 8.1: Chi-Square and Phi (or Cramer’s V)” (p 136) …]

“Chi-square (χ²) or phi/Cramer’s V are good choices for statistics when analyzing two nominal variables.” (p 136)

“Chi-square requires a relatively large sample size and/or a relatively even split of the subjects among the levels because the expected counts in 80% of the cells should be greater than five. Fisher’s exact test should be reported instead of chi-square for small samples if each of the two variables being related has only two levels (2 x 2 cross-tabulation). Chi-square and the Fisher’s exact test provide similar information about relationships among variables; however, they only tell us whether the relationship is statistically significant (i.e., not likely to be due to chance). They do not tell the effect size (i.e., the strength of the relationship).” (p 136)
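
The chapter works in SPSS; as a rough cross-check outside SPSS, the rule in the passage above might be sketched in Python with SciPy. The function name `choose_test` and the example counts are hypothetical, and the "5 or larger" cut-off follows the book's assumptions list rather than any SciPy default.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

def choose_test(table):
    """Run chi-square, or Fisher's exact test for a sparse 2 x 2 table."""
    observed = np.asarray(table)
    # correction=False requests the uncorrected Pearson chi-square
    chi2, p, dof, expected = chi2_contingency(observed, correction=False)
    # Share of cells whose expected count is 5 or larger
    share_ok = float(np.mean(expected >= 5))
    if observed.shape == (2, 2) and expected.min() < 5:
        # Small 2 x 2 table: report Fisher's exact test instead
        _, p_exact = fisher_exact(observed)
        return {"test": "fisher", "p": float(p_exact)}
    return {
        "test": "chi-square",
        "chi2": float(chi2),
        "dof": int(dof),
        "p": float(p),
        "rule_met": share_ok >= 0.8,  # at least 80% of cells are adequate
    }

# Hypothetical counts, not taken from the book
print(choose_test([[10, 20], [30, 40]]))
print(choose_test([[3, 1], [1, 3]]))
```

The first table has expected counts well above 5, so it is tested with chi-square; the second has expected counts of 2 in every cell, so the sketch falls back to Fisher's exact test.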

“Phi and Cramer’s _V_ provide a test of statistical significance and also provide information about the strength of the association between two categorical variables. They can be used as measures of the effect size. If one has a 2 x 2 cross-tabulation, phi is the appropriate statistic. For larger cross-tabs, Cramer’s _V_ is used.” (p 136)
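
As an illustration of the effect-size idea, here is a minimal sketch that computes Cramer's V from the uncorrected chi-square using the standard formula V = sqrt(χ² / (N · min(r − 1, c − 1))). The function name `cramers_v` and the counts are hypothetical. For a 2 x 2 table this value equals the absolute value of phi; SPSS may additionally report a sign for phi, which this sketch does not reproduce.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramer's V for an r x c table (equals |phi| when the table is 2 x 2)."""
    observed = np.asarray(table)
    chi2, _, _, _ = chi2_contingency(observed, correction=False)
    n = observed.sum()                 # total sample size
    k = min(observed.shape) - 1        # min(rows - 1, columns - 1)
    return float(np.sqrt(chi2 / (n * k)))

# Hypothetical counts, not taken from the book
print(cramers_v([[10, 20], [30, 40]]))   # weak association
print(cramers_v([[10, 0], [0, 10]]))     # perfect association
```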

[“Assumptions and Conditions for the Use of Chi-square, Phi, Cramer’s V, and Odds Ratios” (p 137) …]

  • “The data for the variables must be independent. Each subject is assessed only once.
  • “Data are treated as nominal, even if ordered.
  • “For chi-square, if the expected frequencies are less than 5, the test of significance is too liberal. At least 80% of the expected frequencies should be 5 or larger. All should be at least 5 if you have a 2 x 2 chi-square; if they are not, use Fisher’s exact test.
  • “Odds ratios and risk ratios are problematic to interpret when the probability of an event is near zero (i.e., < .1) or near 1.” (p 137)
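
To make the last two measures concrete, the following sketch computes a risk ratio and an odds ratio from a 2 x 2 table. The cell layout is an assumption made for illustration (rows are the two groups, the first column counts cases where the event occurred), and the function name and counts are hypothetical. It does not guard against zero cells.

```python
def risk_and_odds_ratio(table):
    """Risk ratio and odds ratio for [[a, b], [c, d]].

    Assumed layout: each row is a group; the first column is the count
    with the event, the second column the count without it.
    """
    (a, b), (c, d) = table
    risk_1 = a / (a + b)   # probability of the event in group 1
    risk_2 = c / (c + d)   # probability of the event in group 2
    return {
        "risk_ratio": risk_1 / risk_2,
        "odds_ratio": (a * d) / (b * c),   # (a/b) divided by (c/d)
    }

# Hypothetical counts: 20 of 100 in group 1 and 10 of 100 in group 2
print(risk_and_odds_ratio([[20, 80], [10, 90]]))
```

With these counts the risks are .20 and .10, giving a risk ratio of 2.0 and an odds ratio of 2.25.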

“A question answered by the chi-square test is whether these discrepancies between observed and expected counts are bigger than one might expect by chance.” (p 140)
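
The comparison of observed and expected counts can be worked through by hand. In this sketch, which uses hypothetical counts, each expected count is (row total × column total) / N, and the chi-square statistic sums the squared discrepancies divided by the expected counts.

```python
import numpy as np

observed = np.array([[10, 20], [30, 40]])   # hypothetical counts

row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
n = observed.sum()

# Counts expected if the two variables were unrelated
expected = row_totals * col_totals / n

# Each cell's contribution: (observed - expected)^2 / expected
contributions = (observed - expected) ** 2 / expected
chi2 = contributions.sum()

print(expected)
print(round(float(chi2), 4))
```

Here the expected counts are 12, 18, 28, and 42, and the statistic comes to about 0.79, a small discrepancy for a table of 100 cases.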

[“Problem 8.2: Risk Ratios and Odds Ratios” (p 141)]

[“Problem 8.3: Other Nonparametric Associational Statistics” (p 143) …]

“If both variables are nominal and you have a 2 x 2 cross-tabulation, like the one in Output 8.1, phi is the appropriate statistic to use from the symmetric measures table. For larger cross-tabulations (like a 3 x 3) with nominal data, Cramer’s _V_ is the appropriate statistic. … The primary assumption of Kendall’s tau-b is that data are at least ordinal.” (p 143)

[Adapted from SPSS output for Problem 8.3 (p 145) …]


[Notes to above chart …]

  • “Phi is not appropriate for a 3 x 3 table.
  • “Cramer’s _V_ measures the strength of a relationship of two nominal variables when one or both have three or more levels/values.
  • “Kendall’s tau-b measures the strength of the association if both variables are ordinal.” (p 145)
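
For the ordinal case, SciPy's `kendalltau` computes tau-b by default, which adjusts for tied ranks. The wrapper name `tau_b` and the ratings below are hypothetical and only meant to show the call.

```python
from scipy.stats import kendalltau

def tau_b(x, y):
    """Kendall's tau-b and its p value for two ordinal variables."""
    tau, p = kendalltau(x, y)   # SciPy's default variant is tau-b
    return float(tau), float(p)

# Hypothetical ordinal ratings (1 = low, 3 = high) with ties
x = [1, 1, 2, 2, 3, 3]
y = [1, 2, 2, 3, 3, 3]
print(tau_b(x, y))
```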

[“Problem 8.4: Cross-Tabulation and Eta” (p 146) …]

“There is an important associational statistic, eta, that is used when one variable is nominal and the other is approximately normal or scale.” (p 146)
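
Eta is not defined in the excerpt above; the sketch below uses the standard definition, the square root of the between-groups sum of squares divided by the total sum of squares. The function name `eta` and the data are hypothetical.

```python
import numpy as np

def eta(groups, values):
    """Eta for a nominal grouping variable and a scale variable."""
    groups = np.asarray(groups)
    values = np.asarray(values, dtype=float)
    grand_mean = values.mean()
    ss_total = ((values - grand_mean) ** 2).sum()
    # Between-groups sum of squares: group size times squared distance
    # of the group mean from the grand mean, summed over groups
    ss_between = sum(
        values[groups == g].size * (values[groups == g].mean() - grand_mean) ** 2
        for g in np.unique(groups)
    )
    return float(np.sqrt(ss_between / ss_total))

# Hypothetical data: group membership fully determines the score
print(eta(["a", "a", "b", "b"], [1, 1, 3, 3]))
```

Eta ranges from 0 (group means are identical) to 1 (all variation lies between groups).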
