Abstract
Chat Generative Pre-Trained Transformer (ChatGPT) has been a growing point of interest in medical education yet has not been assessed in the field of bioethics. This study evaluated the accuracy of ChatGPT-3.5 (April 2023 version) in answering text-based, multiple-choice bioethics questions at the level of US third-year and fourth-year medical students. A total of 114 bioethics questions were identified from the widely utilised question banks UWorld and AMBOSS. Accuracy, bioethical categories, difficulty levels, specialty data, error analysis and character count were analysed. We found that ChatGPT had an overall accuracy of 59.6%; it was more accurate on topics surrounding death and the patient–physician relationship, and performed poorly on questions pertaining to informed consent. Of all the specialties, it performed best in paediatrics, although certain specialties and bioethical categories were under-represented. Among the errors made, it tended towards content errors and application errors. There was no significant association between character count and accuracy. Nevertheless, this investigation contributes to the ongoing dialogue on artificial intelligence’s (AI) role in healthcare and medical education, advocating for further research to fully understand AI systems’ capabilities and constraints in the nuanced field of medical bioethics.
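The analysis described above — overall accuracy plus a test of association between question length and correctness — can be sketched as follows. This is a hypothetical, dependency-free illustration, not the authors' code: the records are invented, and the point-biserial correlation (Pearson's r with a binary variable) stands in for whatever association test the study used.

```python
from statistics import mean

# Hypothetical records: (character count of question, ChatGPT answered correctly).
# Illustrative values only, not the study's dataset of 114 questions.
results = [
    (850, True), (1200, False), (640, True), (990, True),
    (1430, False), (720, False), (1100, True), (880, False),
    (560, True), (1310, True),
]

# Overall accuracy: fraction of questions answered correctly.
accuracy = sum(correct for _, correct in results) / len(results)

# Point-biserial correlation between character count and correctness,
# computed by hand to keep the sketch dependency-free.
counts = [c for c, _ in results]
flags = [1.0 if ok else 0.0 for _, ok in results]
mc, mf = mean(counts), mean(flags)
cov = sum((c - mc) * (f - mf) for c, f in zip(counts, flags))
var_c = sum((c - mc) ** 2 for c in counts)
var_f = sum((f - mf) ** 2 for f in flags)
r = cov / (var_c * var_f) ** 0.5

print(f"accuracy = {accuracy:.1%}, r = {r:+.2f}")
```

A correlation near zero, as the study reports, would indicate that question length did not meaningfully predict whether ChatGPT answered correctly.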
- Ethics, Medical
- Decision Making
- Education
Data availability statement
Data may be obtained from a third party and are not publicly available. The bioethics questions used in this study are the property of UWorld, AMBOSS and their licensors. Access to the question sets can be acquired through purchasing subscriptions from both companies.
Footnotes
JC and AC contributed equally.
Contributors JC and AC conceptualised the project. The protocol was drafted by JC and AC, and further refined and overseen by BP. Data curation was undertaken by JC, AC and BP. Statistical support was provided by LK. JC, AC and BP were responsible for drafting the original manuscript. LK, JC and AC were responsible for visualisation. All authors contributed to, read and approved the final manuscript. JC and AC are the guarantors. There are no non-author contributors to disclose.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.