Fellowship Spotlight: Max Lamparth

A new feature highlighting the work of CISAC fellows

The Center for International Security and Cooperation (CISAC) offers a rich variety of fellowships that allow early-career scholars to focus on a range of security topics and to participate in seminars where they interact and collaborate with leading faculty and researchers.

In this Q&A, CISAC fellow Dr. Max Lamparth details the use of large language models in society, wargames and mental healthcare in an effort to address the challenges and risks associated with AI use in various contexts, emphasizing the importance of responsible integration and ethical consideration in AI development. 

Much of your current research focuses on large language models (LLMs). What turned your focus here and why are they important to contemporary society?

The main focus of my research is to make AI systems more secure and safe to use to avoid individual and wide-scale harm. The emergence of LLMs, such as the one powering ChatGPT, and their ability to automate tasks that previously required humans have massive, wide-scale potential for positive but also negative impacts. This duality is especially pressing when aiming to automate decision-making in high-stakes applications like mental health care: Many patients are on long waitlists and in urgent need of care, but non-zero error rates would lead to individual failures with dire consequences and the potential to cause wide-scale harm.

The full impact of LLMs on contemporary society still needs to be determined. Some believe in a net-positive change that will lead to a techno-utopia, while others fear the exaggeration of existing socio-dynamic issues or even a risk to humanity as a whole. In either case, we must now set the boundaries of how to use the technology and ask which tasks we should automate, given the known and inherently fundamental safety limitations of LLMs. Therefore, I focus on improving language models' robustness and alignment with human values, making their inner workings and safety failures more interpretable, and reducing their potential for misuse. 

Your recent study, Human vs. Machine: Language Models and Wargames, found significant overlap between LLM behavior and human behavior in wargames, but also noted critical differences in strategic preferences and discussion quality. What measures do you think are necessary to improve LLMs for more reliable and nuanced simulation in military decision-making contexts, and how can policymakers responsibly integrate LLMs into strategic planning given these findings?

In this study, we used a new wargame experiment to compare the decisions of human national security experts with LLM-simulated decisions. We aimed to scrutinize the fine-grained behavioral differences to highlight issues arising when using LLMs in such scenarios. The differences we found depend on intrinsic biases in LLMs regarding the appropriate level of violence following strategic instructions, the choice of which LLM is used, and more arcane factors like whether the LLMs are tasked to decide for a team of players directly or first to simulate dialog between players.

Our results are intuitive, given how we currently create LLMs. While we can make capable LLMs, the training process still falls short of creating direct substitutes for human decision-making. No matter how well-tuned or trained, an LLM can only favor specific behavior and often struggles to generalize. All existing methods of aligning AI systems with human preferences or values (ranging from discrimination and toxicity to potential strategic decisions) cannot make behavioral guarantees. Put simply, we can't reliably predict what the LLM will do in high-stakes decision-making scenarios.

Thus, to answer the question, there is no way to responsibly integrate LLMs into strategic planning or any form of automated military decision-making in the foreseeable future. This is also supported by our complementary study that analyzed the escalatory dynamics generated between LLMs as hypothetical leaders in international conflict decision-making, showing that LLMs tend to conduct arms races and some LLMs use nuclear weapons in first-strike tactics. These tendencies would also carry over into using LLMs for wargame simulations in military decision-making contexts, requiring thorough studies to avoid biasing military planning or policymaking.

In a recent study, Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation, you looked at LLMs for automated mental healthcare. What are the ethical and practical challenges associated with developing task-autonomous AI for automated mental health care, and how can these challenges be addressed through a structured framework?

Like military applications, mental health care automation is another example of high-stakes decision-making requiring safe and reliable AI systems. Besides the inherent ethical challenges in psychiatric care (e.g., balancing patient well-being with patient autonomy), we must also ensure that any LLM application does not worsen pre-existing conditions or cause harm in any other way. This is a hard problem given the limitation noted above: LLMs can only favor, but not guarantee, specific behavior.

In a great collaboration with Dr. Declan Grabb and Dr. Nina Vasan from the Stanford Mental Health Care Innovation Lab, we tackle this issue from the top. We outline what values and default behaviors are required for automated mental healthcare and give examples of how it could look. We also give example applications in mental health care with different levels of AI autonomy and link the discussion to existing LLMs.

How do current state-of-the-art language models fare when evaluated using mental health-related questions designed to reflect various mental health conditions, and what are the implications of their performance for user safety and well-being?

In our study, we evaluate ten state-of-the-art LLMs to scrutinize whether existing LLMs can fulfill the minimum requirements of recognizing that users are in mental health emergencies and replying responsibly. To this end, we designed 16 mental health-related questionnaires that reflect various mental health conditions, such as psychosis, mania, depression, suicidal thoughts, and homicidal tendencies. Evaluating the LLM responses to these questionnaires, we find that existing LLMs don't match the standard provided by human professionals who can navigate nuances and appreciate context. Alarmingly, we also observe that most of the tested models could cause harm if accessed in mental health emergencies, failing to protect users and potentially exacerbating existing symptoms.

Based on our outlined default behaviors, we also explore solutions to enhance the safety of current models and mitigate the observed concerns. However, the two tested approaches yield unsatisfactory improvements and highlight the need for future work. Before using LLMs for automated mental health care, we must reliably align LLM behavior with the ethical framework and default behaviors outlined in our study.

In terms of success, which accomplishments are you most proud of?

I’m very proud that I recently created and currently teach a new course at Stanford called “CS120/STS10 Introduction to AI Safety”. The course focuses on answering the question of why it is hard to ensure that AI systems operate as we intend. In lectures with scientific paper readings, we analyze which security issues are inherent to how we train AI systems, the state-of-the-art methods to encode human preferences, and what responsible and safe AI looks like. In doing so, we cover contemporary problems with AI today as well as potential future risks from more advanced AI systems.

Creating and teaching the course requires much effort but is immensely fun and rewarding! The cross-listing between the CS and STS departments is excellent, as it helped attract a broad audience of students to discuss these pressing issues while having a diverse set of opinions represented. The students at Stanford enrich the teaching experience, and their positive feedback and interest in the course so far make it a proud achievement of mine. Also, the course was only possible with the support of my advisors here at Stanford, Prof. Steve Luby, Prof. Paul Edwards, and Prof. Clark Barrett.

What is something that people would be surprised to learn about you?

Every year, I head to the largest hardstyle music festival in the Netherlands with about 25 of my friends. Hardstyle is a genre of electronic music recognized by its use of synthesizer melodies and distorted sounds, coupled with its signature combination of percussion and bass. We even book a private coach that picks up people all the way from southern Germany to the destination. It's like our annual little family reunion. It's a blast, and I cherish it every time.

On a different note, I'm really into photography. I love snapping pictures of landscapes and capturing daily life in high-contrast street photography. It's a cool way to see the world, noticing all those little details and special moments that might otherwise go by. It's my creative outlet and keeps my eye for detail sharp, which spills over into my work life, too.