Calibration of Remotely Sensed Proportion or Area Estimates for Misclassification Error

Calibration of Remotely Sensed Proportion or Area Estimates for Misclassification Error. Czaplewski, R. L. & Catts, G. P. Remote Sensing of Environment, 39(1):29–43, 1992.


Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the classical estimator using a simple random sample of reference plots. The effects of sample size of reference plots, detail of the classification system, and classification accuracy on the precision of the inverse estimator are discussed. If reference plots are a simple random sample of the study area, then a total sample size of 500-1000 independent reference plots is recommended for calibration.

[Excerpt] [...] True misclassification probabilities are unknown if they are estimated with a finite sample of reference plots. Thus, estimates of misclassification probabilities contain sampling errors. These sampling errors are propagated into errors in calibrated areal estimates. As the sample size of reference plots increases, propagated errors will decrease. Merits of alternative calibration estimators can be affected by the sample size used to estimate misclassification probabilities.

[] Brown (1982), in a key review of the multivariate calibration literature, identifies two classes of statistical calibration estimators that treat measurement error: 1) classical models that predict the known but imperfect measurements using the unknown true state; and 2) inverse models that predict the true but unknown state using known but imperfect measurements. Based on many simulation studies, neither the classical nor inverse estimator has been shown universally superior (Brown, 1982; Heldal and Spjotvoll, 1988). Much depends upon the specific application and the evaluation criteria. There have been no direct comparisons of alternative probabilistic estimators that calibrate for measurement errors caused by misclassification.

[] [...]

[Conclusions] The inverse calibration estimator was more precise and less biased for areal estimates than the classical estimator given the conditions of our simulation study. These conditions are typical of many remote sensing studies in which a simple random sample of homogeneous and accurately registered reference plots are available. However, other types of reference data are also used in remote sensing, such as heterogeneous reference sites, stratified sampling, and purposefully selected reference sites. Future studies are needed to evaluate estimators using these other types of reference data.

[] It is recommended that sample sizes of 500-1000 reference sites be used to calibrate areal estimates, if the reference sites are homogeneous and a simple random sample of the study area. More precise methods for determining the necessary sample size might be possible using approximate estimators of the covariance matrix for errors propagated from the calibration process, as given by Tenenbein (1972) and Grassia and Sundberg (1982). However, this assumes these estimators are reliable for small sample sizes. Future studies are needed to test this assumption. Also, additional work is required to recommend sample sizes for other types of reference data used in remote sensing, such as heterogeneous clusters of pixels.

[] If areal estimates are an important product of a remote sensing project, then the expense of 500-1000 unstratified, independent reference data plots will often be justified. However, this is more reference data than typical for most remote sensing studies. Efficiency of statistical areal calibration can be improved with a stratified sample of reference plots, and certain issues are discussed regarding the choice of the appropriate statistical estimators for a given stratification scheme, but this subject is beyond the scope of the present study. Efficiency might also be improved with larger, heterogeneous reference plots to estimate the misclassification error matrix. In this case, the inverse and classical estimator evaluated in this paper can be used to calibrate areal estimates; however, the estimators for the error covariance matrix given by Tenenbein (1972) and Grassia and Sundberg (1982) do not apply. However, this too is beyond the scope of the present study.
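
The distinction between the classical and inverse calibration models described in the excerpt can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the orientation of the error matrix (rows are the true class, columns the classified class), and the example counts are all assumptions made for illustration. The classical form estimates P(classified | true) from the reference plots and solves a linear system for the true proportions; the inverse form estimates P(true | classified) and applies it directly to the classified proportions.

```python
import numpy as np


def classical_estimate(error_matrix, classified_props):
    """Classical calibration (illustrative sketch, hypothetical name).

    error_matrix[i, j] is assumed to count reference plots whose true
    class is i and whose remotely sensed class is j.
    """
    counts = np.asarray(error_matrix, dtype=float)
    # P(classified = j | true = i): normalise each row by its true-class total.
    p_class_given_true = counts / counts.sum(axis=1, keepdims=True)
    # Classified proportions r satisfy r = P^T t, so solve for t.
    return np.linalg.solve(p_class_given_true.T, np.asarray(classified_props, dtype=float))


def inverse_estimate(error_matrix, classified_props):
    """Inverse calibration (illustrative sketch, hypothetical name)."""
    counts = np.asarray(error_matrix, dtype=float)
    # P(true = i | classified = j): normalise each column by its classified total.
    p_true_given_class = counts / counts.sum(axis=0, keepdims=True)
    # True proportions are predicted directly from the classified proportions.
    return p_true_given_class @ np.asarray(classified_props, dtype=float)


# Hypothetical 2-class error matrix from 200 reference plots (made-up counts).
error_matrix = np.array([[80, 20],
                         [10, 90]])
# Hypothetical classified proportions for the whole study area.
classified_props = np.array([0.45, 0.55])

classical = classical_estimate(error_matrix, classified_props)
inverse = inverse_estimate(error_matrix, classified_props)
```

Both functions return proportions that sum to one whenever the classified proportions do. The two estimators differ in how sampling error in the error matrix propagates into the result, which is the question the paper's Monte Carlo comparison addresses; this sketch does not reproduce that comparison.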

@article{czaplewskiCalibrationRemotelySensed1992,
  title = {Calibration of Remotely Sensed Proportion or Area Estimates for Misclassification Error},
  author = {Czaplewski, Raymond L. and Catts, Glenn P.},
  date = {1992-01},
  journaltitle = {Remote Sensing of Environment},
  volume = {39},
  pages = {29--43},
  issn = {0034-4257},
  doi = {10.1016/0034-4257(92)90138-a},
  url = {https://doi.org/10.1016/0034-4257(92)90138-a},
  abstract = {Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the classical estimator using a simple random sample of reference plots. The effects of sample size of reference plots, detail of the classification system, and classification accuracy on the precision of the inverse estimator are discussed. If reference plots are a simple random sample of the study area, then a total sample size of 500-1000 independent reference plots is recommended for calibration.

[Excerpt] [...] True misclassification probabilities are unknown if they are estimated with a finite sample of reference plots. Thus, estimates of misclassification probabilities contain sampling errors. These sampling errors are propagated into errors in calibrated areal estimates. As the sample size of reference plots increases, propagated errors will decrease. Merits of alternative calibration estimators can be affected by the sample size used to estimate misclassification probabilities.

[] Brown (1982), in a key review of the multivariate calibration literature, identifies two classes of statistical calibration estimators that treat measurement error: 1) classical models that predict the known but imperfect measurements using the unknown true state; and 2) inverse models that predict the true but unknown state using known but imperfect measurements. Based on many simulation studies, neither the classical nor inverse estimator has been shown universally superior (Brown, 1982; Heldal and Spjotvoll, 1988). Much depends upon the specific application and the evaluation criteria. There have been no direct comparisons of alternative probabilistic estimators that calibrate for measurement errors caused by misclassification.

[] [...]

[Conclusions] The inverse calibration estimator was more precise and less biased for areal estimates than the classical estimator given the conditions of our simulation study. These conditions are typical of many remote sensing studies in which a simple random sample of homogeneous and accurately registered reference plots are available. However, other types of reference data are also used in remote sensing, such as heterogeneous reference sites, stratified sampling, and purposefully selected reference sites. Future studies are needed to evaluate estimators using these other types of reference data.

[] It is recommended that sample sizes of 500-1000 reference sites be used to calibrate areal estimates, if the reference sites are homogeneous and a simple random sample of the study area. More precise methods for determining the necessary sample size might be possible using approximate estimators of the covariance matrix for errors propagated from the calibration process, as given by Tenenbein (1972) and Grassia and Sundberg (1982). However, this assumes these estimators are reliable for small sample sizes. Future studies are needed to test this assumption. Also, additional work is required to recommend sample sizes for other types of reference data used in remote sensing, such as heterogeneous clusters of pixels.

[] If areal estimates are an important product of a remote sensing project, then the expense of 500-1000 unstratified, independent reference data plots will often be justified. However, this is more reference data than typical for most remote sensing studies. Efficiency of statistical areal calibration can be improved with a stratified sample of reference plots, and certain issues are discussed regarding the choice of the appropriate statistical estimators for a given stratification scheme, but this subject is beyond the scope of the present study. Efficiency might also be improved with larger, heterogeneous reference plots to estimate the misclassification error matrix. In this case, the inverse and classical estimator evaluated in this paper can be used to calibrate areal estimates; however, the estimators for the error covariance matrix given by Tenenbein (1972) and Grassia and Sundberg (1982) do not apply. However, this too is beyond the scope of the present study.},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-4014141,~to-add-doi-URL,bias-correction,classification,confusion-matrix,modelling-uncertainty,monte-carlo,remote-sensing,statistics},
  number = {1}
}
