Comparability of Methods for Setting Standards. Skakun, E. N. & Kling, S. Journal of Educational Measurement, 17(3):229–235, 1980.
The Nedelsky and two modified versions of the Ebel procedure were used by a group of judges to set pass-fail levels on a certification examination in the medical specialty of General Surgery. The passing scores derived from these methods were compared with the passing score produced by using a norm-referenced approach in terms of similarity of pass-fail levels and effect on overall failure rates. In addition, the reliability of the mean ratings given to the test items by the judges for the three approaches, as well as the variability of the judges' passing scores, were investigated. Results indicated that the approaches produced different passing scores, with a difference of 5% between the lowest and highest passing scores. This difference of 5% is equivalent to ten items on the examination and although this small number of items might appear trivial, the failure rate would double if the approach with the higher cutting score were implemented. With regard to the reliability of the mean ratings assigned to the items by the judges for the three approaches, the two modified Ebel procedures displayed a higher reliability than the Nedelsky approach. The average difference in passing scores was highest for the Nedelsky technique.
@article{skakun_comparability_1980,
	title = {Comparability of {Methods} for {Setting} {Standards}},
	volume = {17},
	issn = {0022-0655},
	url = {http://links.jstor.org/sici?sici=0022-0655%28198023%2917%3A3%3C229%3ACOMFSS%3E2.0.CO%3B2-P},
	abstract = {The Nedelsky and two modified versions of the Ebel procedure were used by a group of judges to set pass-fail levels on a certification examination in the medical specialty of General Surgery. The passing scores derived from these methods were compared with the passing score produced by using a norm-referenced approach in terms of similarity of pass-fail levels and effect on overall failure rates. In addition, the reliability of the mean ratings given to the test items by the judges for the three approaches, as well as the variability of the judges' passing scores, were investigated. Results indicated that the approaches produced different passing scores, with a difference of 5\% between the lowest and highest passing scores. This difference of 5\% is equivalent to ten items on the examination and although this small number of items might appear trivial, the failure rate would double if the approach with the higher cutting score were implemented. With regard to the reliability of the mean ratings assigned to the items by the judges for the three approaches, the two modified Ebel procedures displayed a higher reliability than the Nedelsky approach. The average difference in passing scores was highest for the Nedelsky technique.},
	number = {3},
	urldate = {2008-02-20},
	journal = {Journal of Educational Measurement},
	author = {Skakun, Ernest N and Kling, Samuel},
	year = {1980},
	pages = {229--235},
}
