UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt – A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis

UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt – A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis. Vashisht, P., Lodha, A., Maddipatla, M., Yao, Z., Mitra, A., Yang, Z., Wang, J., Kwon, S., & Yu, H. May, 2024. arXiv:2404.17749 [cs]

This paper presents our team's participation in the MEDIQA-ClinicalNLP2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. Our investigation reveals that GPT-4V, when used as a retrieval agent, can accurately retrieve the correct skin condition 85% of the time using dermatological images and brief patient histories. Additionally, we empirically show that Naive Chain-of-Thought (CoT) works well for retrieval while Medical Guidelines Grounded CoT is required for accurate dermatological diagnosis. Further, we introduce a Multi-Agent Conversation (MAC) framework and show its superior performance and potential over the best CoT strategy. The experiments suggest that using naive CoT for retrieval and multi-agent conversation for critique-based diagnosis, GPT-4V can lead to an early and accurate diagnosis of dermatological conditions. The implications of this work extend to improving diagnostic workflows, supporting dermatological education, and enhancing patient care by providing a scalable, accessible, and accurate diagnostic tool.
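The retriever and re-ranker arrangement described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give an interface, so the names `diagnose`, `retrieval_hit_rate`, `Retriever`, and `Reranker`, the parameter `k`, and the idea of passing the model calls in as plain callables are all hypothetical and are not the authors' implementation.

```python
from typing import Callable, List

# Hypothetical interfaces: (image reference, patient history) -> candidate conditions.
Retriever = Callable[[str, str], List[str]]
# Hypothetical interface: reorders the candidates, most likely condition first.
Reranker = Callable[[str, str, List[str]], List[str]]


def diagnose(image_ref: str, history: str,
             retrieve: Retriever, rerank: Reranker, k: int = 5) -> str:
    """Two-stage sketch: retrieve up to k candidates, then re-rank and pick the top one."""
    candidates = retrieve(image_ref, history)[:k]
    if not candidates:
        raise ValueError("retriever returned no candidate conditions")
    ranked = rerank(image_ref, history, candidates)
    return ranked[0]


def retrieval_hit_rate(retrieved: List[List[str]], gold: List[str]) -> float:
    """Fraction of cases whose gold condition appears among the retrieved candidates."""
    if not gold:
        raise ValueError("gold labels must not be empty")
    hits = sum(1 for cands, label in zip(retrieved, gold) if label in cands)
    return hits / len(gold)
```

In this sketch the retrieval stage (for example a naive CoT prompt) and the re-ranking stage (for example a guideline-grounded or multi-agent prompt) would each be wrapped in a callable and passed to `diagnose`; `retrieval_hit_rate` corresponds to the kind of "correct condition retrieved" measurement the abstract reports, under the assumption that it is computed per case.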

@misc{vashisht_umass-bionlp_2024,
	title = {{UMass}-{BioNLP} at {MEDIQA}-{M3G} 2024: {DermPrompt} -- {A} {Systematic} {Exploration} of {Prompt} {Engineering} with {GPT}-{4V} for {Dermatological} {Diagnosis}},
	shorttitle = {{UMass}-{BioNLP} at {MEDIQA}-{M3G} 2024},
	url = {http://arxiv.org/abs/2404.17749},
	abstract = {This paper presents our team's participation in the MEDIQA-ClinicalNLP2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. Our investigation reveals that GPT-4V, when used as a retrieval agent, can accurately retrieve the correct skin condition 85\% of the time using dermatological images and brief patient histories. Additionally, we empirically show that Naive Chain-of-Thought (CoT) works well for retrieval while Medical Guidelines Grounded CoT is required for accurate dermatological diagnosis. Further, we introduce a Multi-Agent Conversation (MAC) framework and show its superior performance and potential over the best CoT strategy. The experiments suggest that using naive CoT for retrieval and multi-agent conversation for critique-based diagnosis, GPT-4V can lead to an early and accurate diagnosis of dermatological conditions. The implications of this work extend to improving diagnostic workflows, supporting dermatological education, and enhancing patient care by providing a scalable, accessible, and accurate diagnostic tool.},
	urldate = {2024-09-03},
	publisher = {arXiv},
	author = {Vashisht, Parth and Lodha, Abhilasha and Maddipatla, Mukta and Yao, Zonghai and Mitra, Avijit and Yang, Zhichao and Wang, Junda and Kwon, Sunjae and Yu, Hong},
	month = may,
	year = {2024},
	note = {arXiv:2404.17749 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}
