Quantification of Ordinal Variables: A Critical Inquiry into Polychoric and Canonical Correlation. Nishisato, S. & Hemsworth, D. In Baba, Y., Hayter, A. J., Kanefuji, K., & Kuriki, S., editors, Recent Advances in Statistical Research and Data Analysis, pages 49--84. Springer Japan, January, 2002.
@incollection{ nishisato_quantification_2002,
  title = {Quantification of {Ordinal} {Variables}: {A} {Critical} {Inquiry} into {Polychoric} and {Canonical} {Correlation}},
  copyright = {©2002 Springer-Verlag Tokyo},
  isbn = {978-4-431-68546-3, 978-4-431-68544-9},
  shorttitle = {Quantification of {Ordinal} {Variables}},
  url = {http://link.springer.com/chapter/10.1007/978-4-431-68544-9_3},
  abstract = {“Scaling” or “quantification” was re-examined with respect to its main objectives and requirements. Among other things, the attention was directed to the condition that the variance-covariance matrix of scaled quantities must be positive definite or semi-definite in order for the variables to be mapped in Euclidean space. Dual scaling was chosen to guide us through the search for identifying problems, understanding the basic aspects of those problems and practical remedies for them. In particular, the pair-wise quantification approach and the global quantification approach to multivariate analysis were used to identify some tricky theoretical problems, associated with the failure of identifying coordinates of variables in Euclidean hyper-space. One of the problems, arising from the pair-wise quantification, was the lack of a geometric definition of correlation between two sets of categorical variables. This absence of a geometric definition was attributed to the lack of a single data matrix, often leading to negative eigenvalues of the correlation matrix. Then the attention was shifted to the calculation of polychoric correlation and canonical correlation for categorized ordinal variables, the practice often seen in the study of structural equation modeling (SEM). Of particular interest were the problems associated with the pair-wise determination of thresholds (polychoric correlation), the univariate determination of thresholds (polychoric correlation) and the pair-wise determination of category weights (canonical correlation). The study identified two possible causes for the failure of mapping variables in Euclidean space: the pair-wise determination of thresholds or categories and the lack of underlying multivariate normality of the distribution. The degree of this failure was noted as an increasing function of the number of variables in the data set. 
It was highlighted then that dual scaling could mitigate the problems due to these causes that the current SEM practice of using polychoric correlation and canonical correlation would constantly encounter. Numerical examples were provided to show what is at stake when scaling is not properly carried out. It was stressed that, when one cannot reasonably make the assumption of the latent multivariate normal distribution, dual scaling offers an excellent alternative to canonical correlation and polychoric correlation as used in SEM because dual scaling transforms the data towards the categorized normal distribution.},
  language = {en},
  urldate = {2014-01-14},
  booktitle = {Recent {Advances} in {Statistical} {Research} and {Data} {Analysis}},
  publisher = {Springer Japan},
  author = {Nishisato, Shizuhiko and Hemsworth, David},
  editor = {Baba, Yasumasa and Hayter, Anthony J. and Kanefuji, Koji and Kuriki, Satoshi},
  month = jan,
  year = {2002},
  pages = {49--84}
}
