UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt – A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis. Vashisht, P., Lodha, A., Maddipatla, M., Yao, Z., Mitra, A., Yang, Z., Wang, J., Kwon, S., & Yu, H. May, 2024. arXiv:2404.17749 [cs]
This paper presents our team's participation in the MEDIQA-ClinicalNLP2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. Our investigation reveals that GPT-4V, when used as a retrieval agent, can accurately retrieve the correct skin condition 85% of the time using dermatological images and brief patient histories. Additionally, we empirically show that Naive Chain-of-Thought (CoT) works well for retrieval while Medical Guidelines Grounded CoT is required for accurate dermatological diagnosis. Further, we introduce a Multi-Agent Conversation (MAC) framework and show its superior performance and potential over the best CoT strategy. The experiments suggest that using naive CoT for retrieval and multi-agent conversation for critique-based diagnosis, GPT-4V can lead to an early and accurate diagnosis of dermatological conditions. The implications of this work extend to improving diagnostic workflows, supporting dermatological education, and enhancing patient care by providing a scalable, accessible, and accurate diagnostic tool.
@misc{vashisht_umass-bionlp_2024,
	title = {{UMass}-{BioNLP} at {MEDIQA}-{M3G} 2024: {DermPrompt} -- {A} {Systematic} {Exploration} of {Prompt} {Engineering} with {GPT}-{4V} for {Dermatological} {Diagnosis}},
	shorttitle = {{UMass}-{BioNLP} at {MEDIQA}-{M3G} 2024},
	url = {http://arxiv.org/abs/2404.17749},
	abstract = {This paper presents our team's participation in the MEDIQA-ClinicalNLP2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. Our investigation reveals that GPT-4V, when used as a retrieval agent, can accurately retrieve the correct skin condition 85\% of the time using dermatological images and brief patient histories. Additionally, we empirically show that Naive Chain-of-Thought (CoT) works well for retrieval while Medical Guidelines Grounded CoT is required for accurate dermatological diagnosis. Further, we introduce a Multi-Agent Conversation (MAC) framework and show its superior performance and potential over the best CoT strategy. The experiments suggest that using naive CoT for retrieval and multi-agent conversation for critique-based diagnosis, GPT-4V can lead to an early and accurate diagnosis of dermatological conditions. The implications of this work extend to improving diagnostic workflows, supporting dermatological education, and enhancing patient care by providing a scalable, accessible, and accurate diagnostic tool.},
	urldate = {2024-09-03},
	publisher = {arXiv},
	author = {Vashisht, Parth and Lodha, Abhilasha and Maddipatla, Mukta and Yao, Zonghai and Mitra, Avijit and Yang, Zhichao and Wang, Junda and Kwon, Sunjae and Yu, Hong},
	month = may,
	year = {2024},
	note = {arXiv:2404.17749 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}
