Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V

Yan, Zhiling; Zhang, Kai; Zhou, Rong; He, Lifang; Li, Xiang; Sun, Lichao

arXiv:2310.19061 (cs)

[Submitted on 29 Oct 2023]

Abstract: In this paper, we critically evaluate the capabilities of the state-of-the-art multimodal large language model, GPT-4 with Vision (GPT-4V), on the Visual Question Answering (VQA) task. Our experiments thoroughly assess GPT-4V's proficiency in answering questions paired with images, using both pathology and radiology datasets covering 11 modalities (e.g., Microscopy, Dermoscopy, X-ray, CT) and fifteen objects of interest (e.g., brain, liver, lung). Our datasets encompass a comprehensive range of medical inquiries, including sixteen distinct question types. Throughout our evaluations, we devised textual prompts for GPT-4V that direct it to combine visual and textual information. Based on the measured accuracy scores, we conclude that the current version of GPT-4V is not recommended for real-world diagnostics because its accuracy in responding to diagnostic medical questions is unreliable and suboptimal. In addition, we delineate seven unique facets of GPT-4V's behavior in medical VQA, highlighting its constraints within this complex arena. The complete details of our evaluation cases are accessible at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.19061 [cs.CV]
	(or arXiv:2310.19061v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2310.19061

Submission history

From: Zhiling Yan
[v1] Sun, 29 Oct 2023 16:26:28 UTC (16,618 KB)