MerQur: An Integrated Platform for Academic Data Analysis and Reporting

Authors

  • Ömer K. Örücü

DOI:

https://doi.org/10.53463/merqur.2026001

Keywords:

statistical analysis, data science, academic reporting

Abstract

MerQur is a multilingual (Turkish, English, Spanish) desktop platform for data analysis and reporting that enables academic researchers to perform advanced statistical analyses without writing code. Within a single graphical interface it offers more than 110 analyses organized into tab-based categories: descriptive statistics, parametric and non-parametric tests, correlation and association analyses, linear and generalized regression, machine-learning classification and clustering, dimensionality reduction, survival analysis, Bayesian methods, mixed models, categorical-data analyses, and spatial analyses in a dedicated map tab (kernel density, Moran’s I, Getis-Ord hot spots, spatial clustering). For each analysis the platform automatically runs assumption checks, reports appropriate effect sizes, produces publication-ready charts, and compiles the results into automated reports. MerQur is built on open-source scientific Python libraries, including NumPy, pandas, SciPy, statsmodels, scikit-learn, PySAL, and GeoPandas, and unifies their methods in an integrated, accessible interface that requires no expertise in programming or statistical infrastructure. Its aim is to accelerate the analytical workflow while preserving methodological rigor (assumption verification, effect sizes, multiple-comparison correction) and to facilitate reproducible academic reporting.
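As a rough illustration of two of the rigor steps named above (effect sizes and multiple-comparison correction), the following minimal sketch computes Cohen's d (Cohen, 1988) and Benjamini-Hochberg adjusted p-values (Benjamini & Hochberg, 1995) by hand. It is not MerQur's code: the function names `cohens_d` and `benjamini_hochberg`, the sample values, and the raw p-values are all hypothetical, and the sketch uses only the Python standard library to stay dependency-free rather than the SciPy or statsmodels routines the platform is described as building on.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data for illustration only; not taken from MerQur.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.4, 4.0, 4.6, 4.8]
raw_p = [0.01, 0.04, 0.03, 0.20]

d = cohens_d(group_a, group_b)
adj_p = benjamini_hochberg(raw_p)

print(f"Cohen's d: {d:.2f}")
print("BH-adjusted p-values:", [round(p, 4) for p in adj_p])
```

In a full workflow these two steps would typically follow an assumption check (for example, a normality or variance-homogeneity test) and the hypothesis test itself, which are omitted here for brevity.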

References

Software and Libraries

Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785

Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585, 357–362. https://doi.org/10.1038/s41586-020-2649-2

Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90–95. https://doi.org/10.1109/MCSE.2007.55

Jordahl, K., Van den Bossche, J., Fleischmann, M., Wasserman, J., McBride, J., Gerard, J., Tratner, J., Perry, M., Badaracco, A. G., Farmer, C., Hjelle, G. A., Snow, A. D., Cochran, M., Gillies, S., Culbertson, L., Bartos, M., Eubank, N., Bilogur, A., Rey, S., … Leblanc, F. (2020). geopandas/geopandas: v0.8.1 [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.3946761

Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30, 3146–3154.

McInnes, L., Healy, J., Saul, N., & Großberger, L. (2018). UMAP: Uniform Manifold Approximation and Projection. Journal of Open Source Software, 3(29), 861. https://doi.org/10.21105/joss.00861

McKinney, W. (2010). Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference, 56–61. https://doi.org/10.25080/Majora-92bf1922-00a

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

Plotly Technologies Inc. (2015). Collaborative data science [Computer software]. Plotly Technologies Inc. https://plot.ly

Rey, S. J., & Anselin, L. (2007). PySAL: A Python library of spatial analytical methods. The Review of Regional Studies, 37(1), 5–27. https://doi.org/10.52324/001c.8285

Riverbank Computing Limited. (2024). PyQt6 [Computer software]. https://www.riverbankcomputing.com/software/pyqt/

Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. Proceedings of the 9th Python in Science Conference, 92–96. https://doi.org/10.25080/Majora-92bf1922-011

Servén, D., & Brummitt, C. (2018). pyGAM: Generalized Additive Models in Python [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.1208723

Vallat, R. (2018). Pingouin: Statistics in Python. Journal of Open Source Software, 3(31), 1026. https://doi.org/10.21105/joss.01026

Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., … SciPy 1.0 Contributors. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272. https://doi.org/10.1038/s41592-019-0686-2

Waskom, M. L. (2021). seaborn: Statistical data visualization. Journal of Open Source Software, 6(60), 3021. https://doi.org/10.21105/joss.03021

Statistical and Methodological References

Agresti, A. (2013). Categorical data analysis (3rd ed.). Wiley.

Anselin, L. (1988). Spatial econometrics: Methods and models. Kluwer Academic Publishers. https://doi.org/10.1007/978-94-015-7799-1

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57(1), 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B, 34(2), 187–202. https://doi.org/10.1111/j.2517-6161.1972.tb00899.x

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Chapman & Hall.

Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), 226–231.

Fisher, R. A. (1925). Statistical methods for research workers. Oliver & Boyd.

Getis, A., & Ord, J. K. (1992). The analysis of spatial association by use of distance statistics. Geographical Analysis, 24(3), 189–206. https://doi.org/10.1111/j.1538-4632.1992.tb00261.x

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer. https://doi.org/10.1007/978-0-387-84858-7

Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457–481. https://doi.org/10.1080/01621459.1958.10501452

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310

Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika, 37(1–2), 17–23. https://doi.org/10.1093/biomet/37.1-2.17

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(sup1), 234–240. https://doi.org/10.2307/143141

Tukey, J. W. (1949). Comparing individual means in the analysis of variance. Biometrics, 5(2), 99–114. https://doi.org/10.2307/3001913

Welch, B. L. (1947). The generalization of “Student’s” problem when several different population variances are involved. Biometrika, 34(1–2), 28–35. https://doi.org/10.1093/biomet/34.1-2.28

Published

2026-05-11

Section

Software