Categorical Data Analysis in MerQur: From Cross-Tabulation to Log-Linear Models
DOI:
https://doi.org/10.53463/mjdsm.20260460
Keywords:
categorical data, chi-square, cross table, Fisher test, McNemar, Cohen's Kappa
Abstract
Categorical data take class or label values (species, region, yes/no, design style) rather than values on a numeric scale, and they make up a large share of the data in social, environmental and health research. This study describes in detail the 12 categorical-data methods offered by the MerQur desktop software: cross-tabulation, the chi-square test of independence, the chi-square goodness-of-fit test, Fisher’s exact test, McNemar’s test, Cohen’s Kappa, the Cochran-Mantel-Haenszel (CMH) test, log-linear analysis, three multiple-response analyses (frequency, crosstab and multiple-by-multiple) and Cochran’s Q. For each method, the following are presented: (i) the hypothesis tested and the application context, (ii) the required assumptions (expected frequency, matching, independence, stratification), (iii) the MerQur form fields, (iv) the reported statistics and effect sizes (Cramér’s V, Phi, odds ratio, Cohen’s g/κ), and (v) an interpretation guide for a typical research question. All worked examples were produced with actual MerQur output on the synthetic Landscape Architecture dataset distributed with MerQur. Overall, MerQur’s categorical toolkit covers a broad spectrum, from a simple two-variable association table to the log-linear analysis of three-way contingency structures and multiple-response surveys, together with correct effect sizes and assumption warnings within a single graphical interface.
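As a rough illustration of the quantities named above, the following Python sketch computes the Pearson chi-square statistic, its degrees of freedom, Cramér’s V and a count of cells with expected frequency below 5 for a cross-tabulation. It is not MerQur’s implementation, which is a graphical desktop application; the function name `chi_square_independence`, the threshold of 5 and the counts in `observed` are assumptions made up for this example and are not taken from the Landscape Architecture dataset. No continuity correction is applied and no p-value is computed.

```python
from math import sqrt


def chi_square_independence(table):
    """Pearson chi-square summary for an r x c table of observed counts.

    Hypothetical helper for illustration only; not part of MerQur.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Pearson statistic: sum over cells of (observed - expected)^2 / expected.
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Cramér's V rescales chi-square by n and the smaller table dimension.
    k = min(len(row_totals), len(col_totals))
    cramers_v = sqrt(chi2 / (n * (k - 1)))

    # Assumption check: number of cells with expected frequency below 5
    # (a common rule of thumb, assumed here).
    small_cells = sum(exp < 5 for exp_row in expected for exp in exp_row)

    return {
        "chi2": chi2,
        "df": df,
        "cramers_v": cramers_v,
        "expected": expected,
        "small_cells": small_cells,
    }


# Made-up 2 x 2 counts, used only to exercise the function.
observed = [[30, 10], [20, 40]]
result = chi_square_independence(observed)
print(result["chi2"], result["df"], result["cramers_v"], result["small_cells"])
```

For a 2 x 2 table such as this one, Cramér’s V equals the absolute value of Phi. When `small_cells` is greater than zero, an exact procedure such as Fisher’s exact test is the usual alternative.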
References
Agresti, A. (2013). Categorical data analysis (3rd ed.). Wiley.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis: Theory and practice. MIT Press.
Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika, 37(3–4), 256–266. https://doi.org/10.1093/biomet/37.3-4.256
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104
Cramér, H. (1946). Mathematical methods of statistics. Princeton University Press.
Fisher, R. A. (1922). On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1), 87–94. https://doi.org/10.2307/2340521
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22(4), 719–748.
McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2), 153–157. https://doi.org/10.1007/BF02295996
Pearson, K. (1900). On the criterion that a given system of deviations from the probable… Philosophical Magazine, 50(302), 157–175. https://doi.org/10.1080/14786440009463897
License
Copyright (c) 2026 MerQur Journal of Data Science and Methods

This article is published under a Creative Commons Attribution 4.0 International License (CC-BY 4.0). Under this license you may:
- Share: Copy and redistribute the material in any medium or format.
- Adapt: Remix, transform and build upon the material for any purpose, including commercial use.
- Attribution (required condition): You must give appropriate credit, provide a link to the license, and indicate if changes were made.