Assessment of the performance of electrocardiographic computer programs with the use of a reference data base. Willems, J. L., Arnaud, P., van Bemmel, J. H., Bourdillon, P. J., Brohet, C., Dalla Volta, S., Andersen, J. D., Degani, R., Denis, B., & Demeester, M. Circulation, 71(3):523–534, March 1985.
To allow an exchange of measurements and criteria between different electrocardiographic (ECG) computer programs, an international cooperative project has been initiated aimed at standardization of computer-derived ECG measurements. To this end an ECG reference library of 250 ECGs with selected abnormalities was established and a comprehensive reviewing scheme was devised for the visual determination of the onsets and offsets of P, QRS, and T waves. This task was performed by a group of cardiologists on highly amplified, selected complexes from the library of ECGs. With use of a modified Delphi approach, individual outlying point estimates were eliminated in four successive rounds. In this way final referee estimates were obtained that proved to be highly reproducible and precise. This reference data base was used to study measurement results obtained with nine vectorcardiographic and 10 standard 12-lead ECG analysis programs. The medians of program determinations of P, QRS, and T wave onsets and offsets were close to the final referee estimates. However, an important variability could be demonstrated between measurements from individual programs and mean differences from the referee estimates amounted to 10 msec for QRS for certain programs. In addition, the variances of all programs with respect to the referee point estimates were variable. Some programs proved to be more accurate and stable when the data from high- vs low-noise recordings were analyzed. Average Q wave durations calculated from ECGs for which programs agreed on the presence of a Q or QS wave differed by more than 8 msec in several program-to-program comparisons. Such differences may have important consequences with respect to diagnostic performance. Various factors that might explain these differences have been determined. 
The present study demonstrates that to allow an exchange of results and diagnostic criteria between different ECG computer programs, definitions, minimum wave requirements, and measurement procedures urgently need to be standardized.
@article{willems_assessment_1985,
	title = {Assessment of the performance of electrocardiographic computer programs with the use of a reference data base},
	volume = {71},
	issn = {0009-7322, 1524-4539},
	url = {http://circ.ahajournals.org/content/71/3/523},
	doi = {10.1161/01.CIR.71.3.523},
	abstract = {To allow an exchange of measurements and criteria between different electrocardiographic (ECG) computer programs, an international cooperative project has been initiated aimed at standardization of computer-derived ECG measurements. To this end an ECG reference library of 250 ECGs with selected abnormalities was established and a comprehensive reviewing scheme was devised for the visual determination of the onsets and offsets of P, QRS, and T waves. This task was performed by a group of cardiologists on highly amplified, selected complexes from the library of ECGs. With use of a modified Delphi approach, individual outlying point estimates were eliminated in four successive rounds. In this way final referee estimates were obtained that proved to be highly reproducible and precise. This reference data base was used to study measurement results obtained with nine vectorcardiographic and 10 standard 12-lead ECG analysis programs. The medians of program determinations of P, QRS, and T wave onsets and offsets were close to the final referee estimates. However, an important variability could be demonstrated between measurements from individual programs and mean differences from the referee estimates amounted to 10 msec for QRS for certain programs. In addition, the variances of all programs with respect to the referee point estimates were variable. Some programs proved to be more accurate and stable when the data from high- vs low-noise recordings were analyzed. Average Q wave durations calculated from ECGs for which programs agreed on the presence of a Q or QS wave differed by more than 8 msec in several program-to-program comparisons. Such differences may have important consequences with respect to diagnostic performance. Various factors that might explain these differences have been determined. 
The present study demonstrates that to allow an exchange of results and diagnostic criteria between different ECG computer programs, definitions, minimum wave requirements, and measurement procedures urgently need to be standardized.},
	language = {en},
	number = {3},
	urldate = {2013-09-24},
	journal = {Circulation},
	author = {Willems, J. L. and Arnaud, P. and van Bemmel, J. H. and Bourdillon, P. J. and Brohet, C. and Dalla Volta, S. and Andersen, J. D. and Degani, R. and Denis, B. and Demeester, M.},
	month = mar,
	year = {1985},
	pmid = {3838268},
	pages = {523--534}
}
