Learning from crowds in digital pathology using scalable variational Gaussian processes

Learning from crowds in digital pathology using scalable variational Gaussian processes. López-Pérez, M., Amgad, M., Morales-Álvarez, P., Ruiz, P., Cooper, L. A. D., Molina, R., & Katsaggelos, A. K. Scientific Reports, 11(1):11612, June 2021.


The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and in methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but particularly challenging in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper, we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset where pathologists, pathology residents, and medical students annotated tissue regions in breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.

@article{Miguel2021,
abstract = {The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and in methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but particularly challenging in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper, we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset where pathologists, pathology residents, and medical students annotated tissue regions in breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.},
author = {L{\'{o}}pez-P{\'{e}}rez, Miguel and Amgad, Mohamed and Morales-{\'{A}}lvarez, Pablo and Ruiz, Pablo and Cooper, Lee A. D. and Molina, Rafael and Katsaggelos, Aggelos K.},
doi = {10.1038/s41598-021-90821-3},
issn = {2045-2322},
journal = {Scientific Reports},
month = {jun},
number = {1},
pages = {11612},
pmid = {34078955},
title = {{Learning from crowds in digital pathology using scalable variational Gaussian processes}},
url = {https://www.nature.com/articles/s41598-021-90821-3},
volume = {11},
year = {2021}
}
