- Researchers agree: Orthopedic surgeons remain the most reliable source of information
- All chatbots had significant limitations and omitted important clinical steps.
- The researchers concluded that ChatGPT is not yet a sufficient resource for answering patient questions; further work is needed to develop accurate orthopedic-focused chatbots.
San Francisco, February 12, 2024 /PRNewswire/ — As large language model (LLM) chatbots, a type of artificial intelligence (AI) that includes ChatGPT, Google Bard, and BingAI, grow in popularity, it is important to characterize the accuracy of the musculoskeletal health information they provide. Three new studies presented at the 2024 Annual Meeting of the American Academy of Orthopedic Surgeons (AAOS) analyzed the validity of the information these chatbots give patients about specific orthopedic procedures and assessed their accuracy in describing treatments, summarizing research advances, and supporting clinical decision making.
The studies found that certain chatbots provided concise summaries of a wide range of orthopedic conditions, but each chatbot's accuracy varied by category. Researchers agree that orthopedic surgeons remain the most reliable source of information. The findings will help the field understand how effective these AI tools are, whether they may introduce bias or misunderstanding when used by patients or non-expert colleagues, and how future enhancements could make chatbots a valuable tool for patients and physicians.
Research overview and results
Potential misinformation and dangers associated with clinical use of LLM chatbots
This study, led by Branden Sosa, a fourth-year medical student at Weill Cornell Medicine, evaluated how accurately the OpenAI ChatGPT 4.0, Google Bard, and BingAI chatbots explain basic orthopedic concepts, integrate clinical information, and address patient questions. Each chatbot was asked to answer 45 orthopedic-related questions across the categories of “Bone Physiology,” “Referring Physician,” and “Patient Questions,” and its responses were assessed for accuracy. Two independent, blinded reviewers scored each response on a 0-4 scale for accuracy, completeness, and usability. Responses were analyzed for strengths and limitations within categories and across chatbots. The research team found the following trends:
- When asked orthopedic questions, OpenAI ChatGPT, Google Bard, and BingAI provided correct answers covering the most salient points in 76.7%, 33%, and 16.7% of queries, respectively.
- When providing recommendations on clinical management, all chatbots exhibited significant limitations, including deviating from the standard of care, recommending antibiotics before cultures were obtained, and omitting important studies from the diagnostic workup.
- For less complex patient questions, ChatGPT and Google Bard provided mostly accurate responses but often failed to elicit important medical history relevant to fully addressing the question.
- Careful analysis of the citations provided by the chatbots revealed oversampling of a small number of references and 10 defective links that either did not work or led to incorrect articles.
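To make the scoring design described above concrete, here is a minimal sketch, using hypothetical data and field names rather than the study's actual dataset, of how two blinded reviewers' 0-4 ratings could be averaged per chatbot and question category:

```python
# Minimal sketch (hypothetical data): aggregating two blinded reviewers'
# 0-4 accuracy scores per chatbot and question category.
from statistics import mean
from collections import defaultdict

# Each record: (chatbot, category, reviewer_1_score, reviewer_2_score), scored 0-4.
ratings = [
    ("ChatGPT 4.0", "Bone Physiology", 4, 3),
    ("Google Bard", "Patient Questions", 2, 3),
    ("BingAI", "Referring Physician", 1, 2),
    # ... one row per question per chatbot (45 questions per chatbot in the study)
]

scores = defaultdict(list)
for chatbot, category, r1, r2 in ratings:
    scores[(chatbot, category)].append(mean([r1, r2]))  # average the two reviewers

for (chatbot, category), vals in sorted(scores.items()):
    print(f"{chatbot:12s} | {category:20s} | mean score {mean(vals):.2f} (n={len(vals)})")
```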
Is ChatGPT ready for prime time? Assessing AI accuracy in answering common questions of arthroplasty patients
Researchers led by Jenna A. Bernstein, an orthopedic surgeon at Connecticut Orthopedics, investigated how accurately ChatGPT 4.0 answered patient questions by compiling a list of 80 frequently asked patient questions about knee and hip replacements. Each question was submitted to ChatGPT twice: first as written, and then with a prompt instructing ChatGPT to answer the patient's question “as an orthopedic surgeon.” Each surgeon on the team rated the accuracy of each answer set on a scale of 1 to 4. Agreement between the two surgeons' ratings of each ChatGPT answer set and the association between question prompting and response accuracy were assessed with two statistical tests (Cohen's kappa and the Wilcoxon signed-rank test, respectively). Findings include:
- When the quality of ChatGPT's answers was evaluated, 26% of responses to unprompted questions (21 of 80) received an average rating of 3 (partially accurate but incomplete) or less, and 8% of responses to prompted questions (6 of 80) received an average rating of 3 or less. The researchers therefore concluded that ChatGPT is not yet a sufficient resource for answering patient questions and that further work is needed to develop accurate orthopedic-focused chatbots.
- ChatGPT performed significantly better when prompted to answer patient questions “as an orthopedic surgeon,” achieving 92% accuracy.
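As an illustration of the statistical tools named in the study, here is a minimal sketch using hypothetical ratings (not the study's data) of how Cohen's kappa could quantify agreement between the two surgeons' ratings and how a Wilcoxon signed-rank test could compare scores for unprompted versus prompted answers:

```python
# Minimal sketch with hypothetical 1-4 ratings: inter-rater agreement and a
# paired comparison of unprompted vs. prompted answer scores.
from sklearn.metrics import cohen_kappa_score
from scipy.stats import wilcoxon

# Hypothetical ratings by two surgeons for the same set of answers.
surgeon_a = [4, 3, 4, 2, 3, 4, 3, 4, 2, 4]
surgeon_b = [4, 3, 3, 2, 3, 4, 3, 4, 3, 4]
kappa = cohen_kappa_score(surgeon_a, surgeon_b)
print(f"Inter-rater agreement (Cohen's kappa): {kappa:.2f}")

# Hypothetical mean ratings per question: unprompted vs. prompted "as an orthopedic surgeon".
unprompted = [3.0, 2.5, 3.5, 2.0, 3.0, 3.5, 2.5, 3.0, 2.0, 3.5]
prompted   = [3.5, 3.5, 4.0, 3.0, 3.5, 4.0, 3.5, 3.5, 3.0, 4.0]
stat, p_value = wilcoxon(unprompted, prompted)
print(f"Wilcoxon signed-rank: statistic={stat:.1f}, p={p_value:.3f}")
```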
Can ChatGPT 4.0 be used to answer patient questions about Latarjet surgery for anterior shoulder instability?
Researchers at the Hospital for Special Surgery in New York, led by Kyle Kunze, MD, assessed whether ChatGPT 4.0 provides reliable medical information about the Latarjet procedure for patients with anterior shoulder instability. The overall goal of the study was to determine whether this chatbot could serve as a clinical assistant, helping both patients and healthcare providers by providing accurate medical information.
To answer this question, the team first performed a Google search using the query “Latarjet” to extract the top 10 frequently asked questions (FAQs) and associated sources for this procedure. The team then asked ChatGPT to perform a similar FAQ search to identify the questions and sources the chatbot served. The main findings of the study are as follows:
- ChatGPT demonstrated the ability to present a wide range of clinically relevant questions and answers and drew 100% of its information from academic sources. This contrasts with Google, which combined some academic resources with information from surgeons' personal websites and large medical practices.
- The most common question category on both ChatGPT and Google was technical details (40%). However, ChatGPT also provided information regarding risks/complications (30%), recovery schedule (20%), and surgical evaluation (10%).
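For readers curious how such a chatbot query might be reproduced, here is a minimal sketch, not the study's actual protocol, that asks a GPT-4-class model for Latarjet FAQs via the OpenAI Python client so its questions and cited sources can be compared against a Google search; the model identifier and prompt wording are assumptions:

```python
# Minimal sketch (assumed model name and prompt): requesting Latarjet FAQs
# from a GPT-4-class model for comparison against Google search results.
# Requires an OpenAI API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

prompt = (
    "List the 10 most frequently asked patient questions about the Latarjet "
    "procedure for anterior shoulder instability, and cite a source for each answer."
)
response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```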
# # #
2024 AAOS Annual Meeting Disclosure Statement
About AAOS
With more than 39,000 members, the American Academy of Orthopedic Surgeons is the world's largest medical association of musculoskeletal specialists. AAOS is a trusted leader in advancing musculoskeletal health. It provides the highest quality, most comprehensive education to help orthopedic surgeons and allied health professionals at every career level best treat patients in their daily practice. AAOS is the authoritative source of information on bone and joint conditions, treatments, and related musculoskeletal healthcare issues, and it leads the healthcare discussion on advancing quality.
Follow AAOS on Facebook, X, LinkedIn, and Instagram.
SOURCE American Academy of Orthopedic Surgeons