Head of a marble statuette of Asklepios. 3rd–2nd century BCE. MET Museum, 17.194.836.
Apr. 22, 2025
"ChatGPT, could you please tell me what Galen would have thought about the ability of ChatGPT to pass the US Medical Licensing Exam?"
Katherine Van Schaik considers how Galen helps us think through AI in medicine today.
When discussions arise regarding the promise of artificial intelligence (AI) in the diagnosis and treatment of disease, I often think of Galen’s laments about the vastness of medical knowledge. The utility of AI in healthcare settings is predicated on its ability to recall and to draw inferences from enormous quantities of medical data. This skill is crucial today, given how much medical information is obtained about an individual over his or her lifetime, but even to Galen practicing 1,800 years ago, the volume of data related to an individual’s health seemed beyond human comprehension. He writes in On the Method of Medicine 3.7, “And if I knew how to discover precisely the nature of each person, like I think Asklepios did, I would myself be like him. But since this is impossible, I have decided to approach as closely as is possible for a man, and myself, to practice this, and to exhort others [to do so].”
Wellcome portrait of Galen: R. Burgess, Portraits of doctors & scientists in the Wellcome Institute, London 1973, no. 1069.4. Wellcome Collection, 3349i.
Despite his extraordinary efforts to learn all he could about human anatomy and physiology, and to treat as many patients as possible, Galen recognized that he was still unable to know all there was to know about each patient: habits, exposures, symptoms, medical history, intrinsic humoral balance (the patient’s precise nature), diet, exercise. Details from these categories of information would inevitably be missing from the practitioner’s fund of knowledge about a patient. Such gaps in knowledge show the need for medical theories, Galen argues, because these cognitive frameworks can help physicians make decisions when confronted with limited information. Although these theories facilitate diagnosis in presentations of illness that appear entirely novel, they are necessarily imperfect, because full knowledge of an individual’s condition could only truly belong to the god of healing, Asklepios.
What, then, would Galen think about ChatGPT, or about other, similar models that can conceivably assimilate all known medical data about not only one patient, but millions of patients? Every lab value, word in a physician note, piece of genetic data, image, and EKG tracing? Such technology presents an Asklepios-like capacity “to discover precisely the nature of each person.”
For fun, and in the spirit of a kind of meta-examination, I asked ChatGPT what Galen would think about its medical debut, with the following prompt: “Could you please tell me what Galen would have thought about ChatGPT recently passing the US Medical Licensing Exam?” Here is the response I received:
ChatGPT 4 generated this image based on the following prompt: "Generate an image you feel represents yourself; ChatGPT."
"Galen, as an ancient physician and philosopher, would likely have been both fascinated and skeptical about the ability of an artificial intelligence like ChatGPT to pass the USMLE (United States Medical Licensing Examination). On one hand, he would have been impressed by the idea that a non-human entity could acquire and process vast amounts of medical knowledge. Galen placed great importance on the study of texts, and the ability of AI to synthesize medical information from a vast array of sources would have intrigued him.
However, Galen also emphasized the importance of hands-on experience, observation, and the human element in diagnosing and treating patients. He might have been skeptical about an AI's capacity to fully grasp the complexities of human emotions, bedside manner, and the subtleties of patient care that rely on personal interaction and clinical intuition.
In short, Galen would likely have been impressed by the intellectual achievement but cautious about AI's capacity to replace human judgment in medical practice."
A decent answer, if a generic one. But it elides a critical detail. If Galen could know, in general terms, the basis on which AI makes its decisions (pattern recognition at a gargantuan scale), he would likely not only voice these objections about emotions, bedside manner, and subtleties, but also express concern about decision-making based solely on pattern recognition. ChatGPT is correct that Galen “emphasized the importance of hands-on experience, observation, and the human element in diagnosing and treating patients.” In fact, experience was so important that, when a doctor was unable to acquire it firsthand, Galen insisted that he study case histories as a substitute. Such descriptions of patients were second-best, but still acceptable, because they were fundamentally based on experience (ἐμπειρία), even if it was someone else’s. This practice mirrored the methodology of the Empiricist sect, founded by Philinus of Cos sometime in the third century BCE. Emphasizing the importance of direct observation and experience in medical practice, the Empiricists rejected theoretical speculation entirely and insisted that medical treatments be based on what had worked for past patients with similar symptoms. Since a doctor could not have experience of every kind of illness, he could rely on earlier case studies and reports. By contrast, Galen insisted that a broader theoretical framework of health and disease was essential for good medical practice, not least because it could help physicians understand novel cases. Experience alone was insufficient; theories were crucial for the interpretation and application of experience.
Much of the philosophical underpinning of the Empiricist approach involved what we might call pattern recognition, with limited, if any, construction of a broader theoretical framework. And this, at a basic level, is how many AI technologies work: pattern recognition. Thus, while Galen might have been surprised by the capabilities of medical AI, I think he would not have been surprised by one of the most serious and perhaps most obvious limitations of its current iterations: its inability, at least for now, to deal with patterns that do not fit past experience.
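To make the analogy concrete, here is a deliberately toy sketch, in Python, of decision-making by pattern matching alone. Everything in it is invented for illustration (the symptom vectors, the treatments, the distance threshold); it stands in for the Empiricist logic of recommending what worked for similar past cases, not for any real clinical model.

```python
# A toy "Empiricist" diagnostician: recommend whatever worked for the
# most similar past cases, with no underlying theory. All case data
# below is invented for illustration.
from math import dist

# Hypothetical past cases: (symptom vector, treatment that worked)
past_cases = [
    ((1.0, 0.0, 0.3), "treatment A"),
    ((0.9, 0.1, 0.4), "treatment A"),
    ((0.1, 1.0, 0.8), "treatment B"),
    ((0.2, 0.9, 0.7), "treatment B"),
]

def recommend(symptoms, max_distance=0.5):
    """Nearest-neighbour lookup over recorded experience."""
    nearest = min(past_cases, key=lambda case: dist(symptoms, case[0]))
    if dist(symptoms, nearest[0]) > max_distance:
        # A presentation unlike anything seen before: pure pattern
        # matching has nothing to say. This is the gap theory fills.
        return None
    return nearest[1]

print(recommend((0.95, 0.05, 0.35)))  # resembles past cases: "treatment A"
print(recommend((0.5, 0.5, 5.0)))     # novel presentation: None
```

When a new case resembles recorded experience, the lookup succeeds; when it does not, the method simply falls silent, which is precisely the situation Galen thought theory had to address.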
A study assessing AI’s ability to provide recommendations for cancer care found that, in 35% of cases, the model’s recommendations conflicted with published National Comprehensive Cancer Network guidelines. In 12.5% of cases, the model hallucinated its results (meaning they were not sound or based on available evidence, but rather crafted to look like they were). Another study has found that GPT-4 can propagate racial and gender biases in healthcare, presumably because these patterns are already incorporated into the information that is used to construct the models.
Copper alloy figure of Asklepios, wearing a traveller's cloak and leaning on a staff. 1st–4th century CE, from Chichester, West Sussex. British Museum, 168015001.
An assessment of the ability of a Large Language Model (LLM) to complete neonatal/perinatal medicine board examination questions (which require a much more advanced level of knowledge than that tested by the US Medical Licensing Exams) revealed limited performance on questions requiring multilogical reasoning. Such questions often require synthesizing the parts of a question together to create a plan of action, or recognizing the operational principles at work within the context of the question. In other words, these more challenging questions require the creation of a more abstract intellectual framework in order to solve them. We might say that they require the theory-building that Galen said was ultimately necessary for effective medical practice. And this, at the level of theory-building, is where humans outperform AI (again with the caveat: at least for now).
In many respects, AI actualizes the ideals of the Empiricists, whose emphasis on direct experience in medical training was praised and practiced by Galen. AI can hold in its memory, and make sense of, vast quantities of case histories in all their minute details. Yet we still do not fully understand how AI makes the decisions that it does, a limitation termed “the black box problem.” Understanding those decisions is especially important for medical AI, and this goal has led to the growth of a field called Explainable Artificial Intelligence, or XAI, which aims not only to provide answers to medical questions but also to track the pathway the AI took to reach them. Perhaps unsurprisingly, then, we find ourselves drawn back to an essential element of the way Galen believed medicine ought to be taught and practiced: not merely through the aggregation of facts, but also through theory-building. XAI is an attempt to understand what theories or constructs, if any, AI is using as it makes its decisions.
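One simple family of XAI techniques probes a model from the outside: nudge each input in turn and measure how the output shifts, attributing the answer to the inputs that move it most. The sketch below is a minimal, hypothetical illustration of that idea; the “model,” its weights, and the patient features are invented stand-ins, not any real medical system.

```python
# A minimal perturbation-based explanation: nudge each input and see
# how much the model's output shifts. The "model" and features here
# are invented stand-ins for illustration.

def model(features):
    """A toy black box: callers see only inputs and an output score."""
    weights = {"age": 0.2, "blood_pressure": 0.7, "heart_rate": 0.1}
    return sum(weights[name] * value for name, value in features.items())

def explain(features, delta=1.0):
    """Attribute the score to inputs by perturbing each one in turn."""
    baseline = model(features)
    effects = {}
    for name in features:
        perturbed = dict(features, **{name: features[name] + delta})
        effects[name] = model(perturbed) - baseline
    return effects

patient = {"age": 60.0, "blood_pressure": 140.0, "heart_rate": 80.0}
for name, effect in sorted(explain(patient).items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {effect:+.2f}")  # blood_pressure dominates the score
```

The attribution recovers which pattern actually drove the answer: a small step from raw output toward the kind of articulated reasoning Galen demanded of a medical art.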
Statue of Asklepios and Telesphoros, 2nd century CE, Miletus. Istanbul Archaeological Museum, ENV. 1995 (T).
Clearly, AI is an extraordinary technology with the potential to help physicians avoid harmful medical errors. Its ability to store and to integrate information surpasses that of all such assistive technologies to date, and indeed it can outperform humans in remarkable ways. Yet Galen’s vociferously argued concerns about the lack of cognitive dynamism and creative potential in the Empiricists’ approach to medical care might find new expression in this era of AI, which integrates and uses what it has seen, and “hallucinates” what it does not know. As Galen knew well, every human has their own physiology, concerns, goals, and motivations. And as Galen and so many other physicians have observed over millennia, it is the interaction of the human physician with the unique individual patient that can lead to extraordinary moments of healing.
Much of Galen’s reasoning about health and disease, we now know, was limited, such that his therapies led to dangerous outcomes for his patients. Within scant decades (to say nothing of two thousand years in the future), we will surely appreciate the limitations of today’s medical thinking and rectify current therapies that lead to dangerous outcomes. Such realizations will almost certainly be facilitated and enhanced by AI. Yet, as Galen’s warning suggests, with over-reliance on AI, and without medical epistemologies that explicitly create opportunities to respond creatively to the unknown, we risk becoming trapped within stunted and inward-looking models. AI is only as good as what humanity feeds it, and humanity, at its best, is still better than AI.
Moreover, as AI increasingly incorporates itself into our lives, and as people mindlessly use it to do the work of thinking, we are only now beginning to recognize the cost of such “cognitive offloading.” Early research shows, for example, that reliance on AI-based tools weakens spatial memory and critical-thinking skills. Galen’s warnings about what becomes of innovation and cognitive dynamism under simplistic models of reasoning echo in this age of AI.
This, in turn, encourages a further conclusion. How should a student today prepare for a career in medicine? In law? In business? Even in computer engineering? What could prepare a student to think creatively and independently in an AI-focused future? I would suggest: set aside the computer and read something difficult and unlike anything else you are reading, perhaps something Galen wrote. And perhaps read what Galen himself read. In short, the Classics.
The Rootcutter is the official blog of the Society for Ancient Medicine and Pharmacology. We welcome proposals for potential contributions. Please send all inquiries to cwebster@ucdavis.edu and freidin@umich.edu.