The conversational fine-tuning of LLMs into conversational agents, popularized by ChatGPT, brought the power of LLMs within reach of the general public, enabling straightforward applications in other domains, notably the clinical one. However, the enthusiasm for LLM applications is rapidly being tempered by apprehension about their unpredictable nature and their highly non-trivial and disturbing failure modes: from inventing facts, to failing to follow instructions, to gaslighting users into leaving their spouses and killing themselves.
Such failures are problematic in other domains, but would be critical in a clinical setting, whether they occur on the side of medical professionals or of patients. The goal of this project is to evaluate how serious this problem potentially is and to investigate some avenues of mitigation, by leveraging the combined expertise of the Gen Learn Center in LLMs, that of Profs. Schumacher and Calbimonte and Dr. Calvaresi in building clinical conversational agents in the domain of nutritional coaching, and, finally, that of Dr. Piguet of the Applied Ethics service in questions of clinical ethics.