GPT-4 answers to questions that required interpretation of experimental data were long and excessively wordy, and they often included accurate but unrequested information. In our study, students observed instances in which the model hallucinated scientific figures in detail and then offered realistic summative interpretations of those results. In the Advanced Virology exam, 7 of 13 questions were based on interpretation of figures, and GPT4-Expert received the highest marks. This result points to a potential limitation in our approach to assessing the model’s ability to generate correct answers to scientific questions.
Read more at Nature.com