AutoToM

Automated Bayesian Inverse Planning and Model
Discovery for Open-ended Theory of Mind

Introducing AutoToM, a fully automated and open-ended Theory of Mind reasoning method. AutoToM is characterized by the following features:

Open-ended ToM: AutoToM is a model-based method that can operate in any domain, infer any mental variable, and conduct robust ToM reasoning of any order.
LLM Meets Bayesian Inference: AutoToM integrates the flexibility of LLMs with the robustness of Bayesian inverse planning.
Automated Bayesian Inverse Planning: AutoToM conducts inverse planning for any specified model, automating the hypothesis sampling and Bayesian inference.
Automated Model Discovery: AutoToM performs automated model proposals and iteratively refines the model by adjusting variables and timesteps.
Performance: AutoToM achieves state-of-the-art performance across multiple benchmarks, offering a scalable, robust, and interpretable approach to machine ToM.

Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. We introduce AutoToM, a fully automated and open-ended Theory of Mind reasoning method. As the first model-based ToM method that addresses open-ended scenarios, AutoToM achieves state-of-the-art performance across multiple benchmarks, offering a scalable, robust, and interpretable approach to machine ToM.



Model-based Theory of Mind

Understanding the Challenge of Theory of Mind

Theory of Mind (ToM), the ability to understand people's mental variables based on their behavior, is key to developing socially intelligent agents.

There are two current approaches to Theory of Mind reasoning:

  1. Directly applying LLMs to reason about people's mental states with prompting strategies such as perspective-taking, change-tracking, and temporal-spatial reasoning. However, even with these advanced prompting techniques, LLMs still make systematic errors in complex scenarios.
  2. Using model-based inference, particularly Bayesian Inverse Planning (BIP). Recent works have proposed to combine BIP and LLMs to achieve scalable yet robust model-based ToM inference. While these methods significantly outperform LLMs in specific domains, they use rigid, handcrafted models, which cannot generalize across different domains.

Bayesian Inverse Planning: A Robust Framework

Bayesian Inverse Planning (BIP) models how observers infer unobservable mental states—such as beliefs and goals—from an agent's behavior. It assumes that the agent acts rationally according to a Bayesian Theory of Mind (BToM) model, which specifies how internal variables lead to observable actions. BIP then inverts this generative process to infer which latent mental variables could have produced the observed behavior, serving as a robust solution to ToM challenges.
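For illustration, using the notation introduced in Figure 1 below, a typical BToM model assumes the agent forms a belief \( b^\tau \) from its observation \( o^\tau \) of the state \( s^\tau \) and acts rationally toward a goal \( g \) (here assumed fixed over time for simplicity). This is one possible factorization, not necessarily the exact structure AutoToM proposes for a given problem; inverting it yields a posterior of the form

\[
P(g, b^{1:t} \mid s^{1:t}, a^{1:t}) \;\propto\; P(g) \prod_{\tau=1}^{t} \sum_{o^\tau} P(o^\tau \mid s^\tau)\, P(b^\tau \mid b^{\tau-1}, o^\tau)\, P(a^\tau \mid b^\tau, g).
\]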

To conduct BIP in different scenarios, there are several key challenges: 1) different ToM inference problems require different BToM models (see Figure 5), but we do not know in advance which one is most suitable; 2) a given context may span many timesteps, and we need to determine which ones are relevant to the inference; 3) there is no predefined hypothesis space for each mental variable.

AutoToM: A Paradigm Shift

We introduce AutoToM, a fully automated and open-ended model-based Theory of Mind reasoning method. It automates every aspect of Bayesian inverse planning, including the proposal and adjustment of model structures, the identification of relevant timesteps, the generation of hypotheses, and the execution of Bayesian inference. It is designed to operate in any context, infer any mental state, reason about any number of agents, and support any order of recursive reasoning, which represents our vision of an open-ended and robust machine Theory of Mind.

Figure 1: An overview of AutoToM. \( X^{t_s:t} \) are observable variables, \( V^{t_s:t} \) are latent mental variables, and \( q \) is the query.
\( t_s:t \) denotes timesteps from \( t_s \) to \( t \) in the context that are considered for inference. Variables \( s^t, o^t, b^t, a^t, g^t \) represent state, observation, belief, action, and goal, respectively, with solid arrows indicating dependencies defined in the models.

Figure 1 provides an overview of AutoToM. Given a question, we extract the observable variables (information extraction) and propose an initial BToM model. This is followed by automated Bayesian inverse planning and iterative model adjustment. Once the model utility is high enough, we produce the final answer based on the inference result.
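A minimal sketch of this control loop is given below, with the individual modules passed in as callables. The function names, signatures, and the utility threshold are illustrative assumptions rather than the actual implementation.

```python
from typing import Callable, Dict, Tuple

def auto_tom_loop(
    extract: Callable[[str, str], Dict],                       # information extraction
    propose_model: Callable[[Dict, str], Dict],                # initial BToM model proposal
    run_bip: Callable[[Dict, Dict, str], Tuple[str, float]],   # automated BIP -> (answer, model utility)
    adjust_model: Callable[[Dict, Dict, str], Dict],           # variable / timestep adjustment
    context: str,
    question: str,
    utility_threshold: float = 0.9,                            # assumed value, not from the paper
    max_rounds: int = 10,                                      # assumed cap on refinement rounds
) -> str:
    observables = extract(context, question)       # observable variables along a timeline
    model = propose_model(observables, question)   # LLM-proposed initial BToM model
    answer, utility = run_bip(model, observables, question)
    for _ in range(max_rounds):
        if utility >= utility_threshold:           # model is good enough: accept the answer
            break
        model = adjust_model(model, observables, question)
        answer, utility = run_bip(model, observables, question)
    return answer
```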

Automated Bayesian Inverse Planning

Given a BToM model, we integrate LLMs as the computational backend to implement every aspect of the Bayesian inverse planning. This includes hypothesis sampling for latent mental variables, and probabilistic inference for the target mental variable (Figure 2). The construction, information flow, and computations within the BToM model are entirely automated.

Hypothesis Sampling. Conventional BIP assumes a manually defined hypothesis space as well as a hypothesis representation for each latent mental variable. Our hypothesis sampling module instead leverages an LLM to propose only a small set of quality hypotheses for each latent variable, conditioned on the observable variables and their values extracted from the context. We further apply hypothesis reduction to eliminate unlikely hypotheses and reduce the hypothesis space.
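A minimal sketch of this module, assuming a generic text-in/text-out `llm` callable and an LLM-based plausibility `score` function (both hypothetical; the actual prompts and interfaces are not specified here):

```python
from typing import Callable, List

def sample_hypotheses(
    llm: Callable[[str], str],        # any text-in / text-out LLM wrapper (hypothetical interface)
    variable: str,                    # latent variable to hypothesize about, e.g. "belief" or "goal"
    observables: dict,                # observable variables and values extracted from the context
    num_hypotheses: int = 5,
) -> List[str]:
    # Ask the LLM for a small set of quality hypotheses conditioned on the observables.
    prompt = (
        f"Observed information: {observables}\n"
        f"List {num_hypotheses} plausible values of the agent's {variable}, one per line."
    )
    lines = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
    return lines[:num_hypotheses]

def reduce_hypotheses(
    score: Callable[[str], float],    # LLM-estimated plausibility of a hypothesis, in [0, 1]
    hypotheses: List[str],
    threshold: float = 0.1,           # assumed cutoff for "unlikely"
) -> List[str]:
    # Hypothesis reduction: discard candidates judged very unlikely, shrinking the space.
    kept = [h for h in hypotheses if score(h) >= threshold]
    return kept or hypotheses         # keep at least something to infer over
```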

Bayesian Inference. We estimate each local conditional in the BToM model using an LLM. After marginalizing the joint distribution over non-target latent variables, we then produce the posterior probabilities of the target variable in the query. We greatly generalize prior methods by enabling any ToM inference based on any BToM model structure, simultaneously considering multiple non-target latent variables and supporting arbitrary levels of recursion for high-order ToM inference.
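The inference step itself reduces to enumerating joint hypotheses, scoring them with LLM-estimated local conditionals, and marginalizing. Below is a self-contained sketch; the `local_prob` scorer stands in for the LLM likelihood estimates and is an assumption of this example.

```python
import itertools
from typing import Callable, Dict, List

def posterior_over_target(
    hypotheses: Dict[str, List[str]],   # hypothesis set for each latent variable
    target: str,                        # the queried mental variable, e.g. "belief"
    local_prob: Callable[[str, str, Dict[str, str]], float],
    # local_prob(variable, value, full_assignment) returns the (LLM-estimated)
    # probability of that value given its parent variables in the BToM model.
) -> Dict[str, float]:
    names = list(hypotheses)
    posterior = {value: 0.0 for value in hypotheses[target]}

    # Enumerate every joint assignment of the latent variables.
    for values in itertools.product(*(hypotheses[n] for n in names)):
        assignment = dict(zip(names, values))
        joint = 1.0
        for var, val in assignment.items():
            joint *= local_prob(var, val, assignment)   # product of local conditionals
        posterior[assignment[target]] += joint          # marginalize non-target variables

    total = sum(posterior.values()) or 1.0
    return {value: p / total for value, p in posterior.items()}  # normalize
```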

Figure 2: Illustration of automated Bayesian inverse planning given a specified BToM model.

Automated Model Discovery

Prior works on Bayesian inverse planning rely on manually designed BToM models, which limits their applicability to domain-specific scenarios. In contrast, the Automated Model Discovery component automatically proposes a model and dynamically adjusts it to ensure both the effectiveness of the model—confidently inferring agents' mental states—and the efficiency of the inference by minimizing model complexity.

Information Extraction. The information extraction module processes the given context to identify the values of observable variables, including states, actions, and utterances, organized along a timeline. When there are multiple agents, we first identify whose mental state the question is asking about, and then construct the timesteps based on that agent's actions.
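One possible (purely illustrative) shape for the extracted information, with timesteps keyed to the queried agent's actions:

```python
# Illustrative only: a possible structure for the extracted observable variables.
extracted = {
    "queried_agent": "Sally",
    "timeline": [
        {"t": 1,
         "state": "The ball is in the basket.",
         "action": "Sally puts the ball in the basket.",
         "utterance": None},
        {"t": 2,
         "state": "Anne has moved the ball to the box.",
         "action": "Sally re-enters the room.",
         "utterance": None},
    ],
}
```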

Initial Model Proposal. We employ an LLM to propose an initial BToM model tailored to the available information and the query. Following this model, we conduct automated Bayesian inverse planning as described above. If the model utility exceeds a threshold, we accept the inference result as the final answer. Otherwise, we use the model utility to guide model adjustments.

Model Adjustment. We iteratively adjust the proposed model by considering two types of model adjustments: variable adjustment and timestep adjustment (Figure 3).

Variable Adjustment. We refine the model structure at a specific timestep by iteratively introducing new, relevant latent variables into the model to address uncertainty in the inference. For each adjustment, we compute the updated model utility and accept the modification that offers the biggest increase in utility.

Timestep Adjustment. If the model utility remains low and no significant improvements can be achieved through variable adjustment given the current timesteps \( t_s:t \), we may need to incorporate an additional timestep, \( t_s-1 \), to provide more context for the inference. When we add one more timestep, we first apply the model structure in the initial model proposal, and then conduct variable adjustments for this new timestep as well.
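A sketch of one adjustment round combining both operations; the helper callables and the minimum-gain tolerance are assumptions of this example, not the exact procedure.

```python
from typing import Callable, Dict, List

def one_adjustment_round(
    model: Dict,
    candidate_variables: List[str],             # latent variables not yet in the model
    utility_of: Callable[[Dict], float],        # runs automated BIP and returns the model utility
    add_variable: Callable[[Dict, str], Dict],  # copy of the model with one more latent variable
    extend_timesteps: Callable[[Dict], Dict],   # copy of the model also covering timestep t_s - 1
    min_gain: float = 1e-3,                     # assumed tolerance for "significant improvement"
) -> Dict:
    base = utility_of(model)

    # Variable adjustment: keep the single addition with the largest utility increase.
    best_model, best_gain = model, 0.0
    for var in candidate_variables:
        candidate = add_variable(model, var)
        gain = utility_of(candidate) - base
        if gain > best_gain:
            best_model, best_gain = candidate, gain
    if best_gain > min_gain:
        return best_model

    # Timestep adjustment: no variable helped enough, so include one earlier timestep.
    return extend_timesteps(model)
```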

Figure 3: We automatically refine the BToM model by alternating between variable adjustment and timestep adjustment.

Effective and Efficient. The results from our ablation study (Figure 4) highlight the benefits of variable adjustment, timestep adjustment, and hypothesis reduction. The automated model discovery in AutoToM can construct suitable BToM models that not only enable rich ToM inferences but also reduce compute, balancing accuracy and cost.

Figure 4: Averaged performance and compute of AutoToM and the ablated methods on all benchmarks.

State-of-the-art Performance

We evaluated our method on multiple Theory of Mind benchmarks, including ToMi, BigToM, MMToM-QA, MuMA-ToM, and Hi-ToM. As shown in Figure 5, these benchmarks encompass different mental variables, observable contexts, numbers of agents, the presence or absence of utterances, wording styles, and modalities. AutoToM autonomously discovers the appropriate BToM models for all of them.

Figure 5: Example questions and the necessary BToM models in diverse benchmarks.

The main results are summarized in Table 1. Unlike AutoToM, many recent ToM baselines can only be applied to specific benchmarks. Among general methods, AutoToM achieves state-of-the-art results across all benchmarks. This is because Bayesian inverse planning is more robust at inferring mental states from long contexts with complex environments and agent behavior. It is also more adept at recursive reasoning, which is key to higher-order inference.

Notably, AutoToM performs comparably to manually specified models (AutoToM w/ Model Spec.), showing that automatic model discovery without domain knowledge is as effective as human-provided models.

Method                   Type      ToMi   BigToM  MMToM-QA  MuMA-ToM  Hi-ToM  All
SymbolicToM              Specific  98.60  -       -         -         -       -
TimeToM                  Specific  87.80  -       -         -         -       -
PercepToM                Specific  82.90  -       -         -         -       -
BIP-ALM                  Specific  -      -       76.70     33.90     -       -
LIMP                     Specific  -      -       -         76.60     -       -
AutoToM w/ Model Spec.   Specific  88.80  86.75   79.83     84.00     74.00   82.68
LLaMA 3.1 70B            General   72.00  77.83   43.83     55.78     35.00   47.71
Gemini 2.0 Flash         General   66.70  82.00   48.00     55.33     52.50   60.91
Gemini 2.0 Pro           General   71.90  86.33   50.84     62.22     57.50   65.76
GPT-4o                   General   77.00  82.42   44.00     63.55     50.00   63.39
SimToM                   General   79.90  77.50   51.00     47.63     71.00   65.41
AutoToM                  General   88.30  86.92   75.50     81.44     72.50   80.93
Table 1: Results of AutoToM and baselines on all benchmarks. “-” indicates that the domain-specific method is not applicable to the benchmark.

Conclusion

To conclude, AutoToM is a novel framework for open-ended Theory of Mind. Given any ToM inference problem, AutoToM can automatically construct a suitable BToM model and conduct automated Bayesian inverse planning with an LLM backend. It suggests a promising direction toward cognitively grounded ToM modeling that is scalable, robust, and open-ended.

Acknowledgement

We would like to thank the Cambrian authors for providing this webpage template.

BibTeX

Coming soon.