Unification of Numerical and Ordinal Survey Data for Clustering-based Inferencing


Bhupendera Kumar
Rajeev Kumar


Surveys now cover almost every issue governing our lives, with many parameters and a variety of data types, so a researcher must unify these data before extracting inferences from a survey. Data from quantitative surveys are clustered to reveal respondents' divergent and dominant tendencies and to investigate the general trends among respondent categories. Because of the unique characteristics of survey data, popular clustering techniques based on value similarity are inadequate.
In this paper, we attempt to unify the numerical data of a survey with its ordinal data. We model the numerical data with a Gaussian distribution and, following that distribution, first convert the numerical data to ordinal data; these may be the governing attributes for deciding the clusters. We then apply $K$-means clustering with varying numbers of clusters. We implement the proposed methodology on real survey data and compare clustering efficiency before and after applying it, across different numbers of clusters. More crucially, the approach appropriately uses both the order information of the ordinal attributes and the statistical information of the numerical attributes for clustering. Extensive testing demonstrates that the suggested unification works better on real data sets than its contemporaries.
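
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cut points, the number of ordinal levels, or the $K$-means variant, so the z-score thresholds, the five-level scale, the plain Lloyd-style `kmeans` helper, and the synthetic data below are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_to_ordinal(x, n_levels=5):
    """Map numeric values to ordinal levels 1..n_levels via z-score cut points.

    Assumption: cut points are evenly spaced between -1.5 and +1.5 standard
    deviations; the paper may use different thresholds.
    """
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    if sigma == 0:
        # Constant column: assign the middle level to every respondent.
        return np.full(x.shape, (n_levels + 1) // 2, dtype=int)
    z = (x - mu) / sigma
    cuts = np.linspace(-1.5, 1.5, n_levels - 1)
    return np.digitize(z, cuts) + 1

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means (illustrative helper, not a library API)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct rows of the data.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute assignments so labels match the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    inertia = float((dists[np.arange(len(X)), labels] ** 2).sum())
    return labels, centers, inertia

# Synthetic stand-in for survey data (made up for this sketch).
rng = np.random.default_rng(42)
age = rng.normal(40, 12, size=200)           # numerical attribute
likert = rng.integers(1, 6, size=(200, 3))   # three ordinal items on a 1..5 scale

# Unify: every column is now on the same 1..5 ordinal scale.
unified = np.column_stack([gaussian_to_ordinal(age), likert])

results = {}
for k in (2, 3, 4):
    labels, centers, inertia = kmeans(unified, k)
    results[k] = (labels, inertia)
    print(f"k={k}: within-cluster sum of squares = {inertia:.1f}")
```

Placing the converted numerical attribute on the same scale as the ordinal items keeps any single column from dominating the Euclidean distance, which is one plausible reading of why the unification precedes clustering.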

Kumar, B., & Kumar, R. (2023). Unification of Numerical and Ordinal Survey Data for Clustering-based Inferencing. INFOCOMP Journal of Computer Science, 22(1). Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492
