AutoToM

Scaling Model-based Mental Inference
via Automated Agent Modeling

Introducing AutoToM, an automated and open-ended Theory of Mind reasoning method. AutoToM is characterized by the following features:

- Open-ended ToM: AutoToM is a model-based method that can operate in any domain, infer any mental variable, and conduct robust ToM reasoning of any order.
- LLM Meets Bayesian Inference: AutoToM integrates the flexibility of LLMs with the robustness of Bayesian inverse planning, automating both hypothesis sampling and Bayesian inference.
- Automated Agent Model Discovery: AutoToM performs automated model proposals and iteratively refines the model by adjusting variables and timesteps.
- Performance: AutoToM achieves state-of-the-art results on five benchmarks, produces human-like confidence estimates, and supports embodied decision-making.

Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. We introduce AutoToM, an automated agent modeling method for scalable, robust, and interpretable mental inference. AutoToM achieves state-of-the-art performance across five benchmarks, produces human-like confidence estimates, and enables online mental inference for embodied decision-making.

Figure 1: Overview of AutoToM's capacities and applications evaluated in this work.



Model-based Theory of Mind

Understanding the Challenge of Theory of Mind

Theory of Mind (ToM), the ability to understand people's mental variables based on their behavior, is key to developing socially intelligent agents.

There are two current approaches to Theory of Mind reasoning:

  1. Directly applying LLMs to reason about people's mental states with prompting strategies such as perspective-taking, change-tracking, and temporal-spatial reasoning. However, even with these advanced prompting techniques, LLMs still make systematic errors in complex scenarios.
  2. Using model-based inference, particularly Bayesian Inverse Planning (BIP). Recent works have proposed to combine BIP and LLMs to achieve scalable yet robust model-based ToM inference. While these methods significantly outperform LLMs in specific domains, they use rigid, handcrafted models, which cannot generalize across different domains.

Bayesian Inverse Planning: A Robust Framework

Bayesian Inverse Planning (BIP) models how observers infer unobservable mental states—such as beliefs and goals—from an agent's behavior. It assumes that the agent acts rationally according to a Bayesian Theory of Mind (BToM) agent model, which specifies how internal variables lead to observable actions. BIP then inverts this generative process to infer which latent mental variables could have produced the observed behavior, serving as a robust solution to ToM challenges.
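
In its simplest form (for instance, goal inference with fully observed states and a uniform prior over sampled goals, using the variable names from Figure 2), the BIP posterior can be written as:

\[ P(g \mid s^{1:t}, a^{1:t}) \;\propto\; P(g) \prod_{\tau=1}^{t} P(a^\tau \mid s^\tau, g) \]

AutoToM generalizes this template: neither the model structure nor the hypothesis spaces are fixed in advance.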

Conducting BIP in different scenarios raises several key challenges: 1) different ToM inference problems require different agent models (see Figure 1(a)), and we do not know in advance which is most suitable; 2) a given context contains many timesteps, and we need to determine which ones are relevant; 3) there is no predefined hypothesis space for each mental variable.

AutoToM: A Paradigm Shift

We introduce AutoToM, a fully automated and open-ended model-based Theory of Mind reasoning method. It automates every aspect of Bayesian inverse planning, including the proposal and adjustment of model structures, the identification of relevant timesteps, the generation of hypotheses, and the execution of Bayesian inference. It is designed to operate in any context, infer any mental state, reason about any number of agents, and support any order of recursive reasoning, which represents our vision of an open-ended and robust machine Theory of Mind.

Figure 2: An overview of AutoToM. \( X^{t_s:t} \) are observable variables, \( V^{t_s:t} \) are latent mental variables, and \( q \) is the query.
\( t_s:t \) denotes timesteps from \( t_s \) to \( t \) in the context that are considered for inference. Variables \( s^t, o^t, b^t, a^t, g^t \) represent state, observation, belief, action, and goal, respectively, with solid arrows indicating dependencies defined in the models.

Figure 2 provides an overview of AutoToM. Given a question, we extract the observable variables (information extraction) and propose an initial agent model. This is followed by automated Bayesian inverse planning and iterative model adjustment. Once the model utility is high enough, we produce the final answer based on the inference result.
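
As a concrete illustration, an agent model of the kind shown in Figure 2 can be thought of as a small graph over variables. The Python schema below is an illustrative assumption, not AutoToM's internal format:

agent_model = {
    "timesteps": (3, 5),            # the window t_s:t of context considered
    "observable": ["s", "a"],       # state and action, extracted from the context
    "latent": ["o", "b", "g"],      # observation, belief, goal
    "dependencies": [
        ("s", "o"),                 # the state determines what the agent observes
        ("o", "b"),                 # observations update the agent's belief
        ("b", "a"), ("g", "a"),     # belief and goal jointly drive the action
    ],
    "query": "b",                   # the mental variable the question asks about
}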

Automated Bayesian Inverse Planning

Given an agent model, we integrate LLMs as the computational backend to implement every aspect of Bayesian inverse planning, including hypothesis sampling for the latent mental variables and probabilistic inference for the target mental variable (Figure 3). The construction, information flow, and computations within the agent model are entirely automated.

Hypothesis Sampling. Conventional BIP assumes a manually defined hypothesis space as well as hypothesis representation for each latent mental variable. Our hypothesis sampling module instead leverages an LLM to propose only a small set of quality hypotheses for each latent variable, conditioned on observable variables and their values extracted from the context. We further apply hypothesis reduction to eliminate unlikely hypotheses and reduce the hypothesis space.
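
Below is a minimal Python sketch of this module; the prompts, the `llm` callable, and the function names are illustrative assumptions rather than AutoToM's actual implementation:

from typing import Callable, List

def sample_hypotheses(llm: Callable[[str], str], variable: str,
                      observations: str, k: int = 3) -> List[str]:
    """Ask the LLM to propose a small set of quality hypotheses for one latent variable."""
    prompt = (
        f"Context:\n{observations}\n\n"
        f"Propose {k} distinct, plausible hypotheses for the agent's {variable}. "
        "Return one hypothesis per line."
    )
    return [h.strip() for h in llm(prompt).splitlines() if h.strip()]

def reduce_hypotheses(llm: Callable[[str], str], variable: str,
                      observations: str, hypotheses: List[str]) -> List[str]:
    """Hypothesis reduction: drop hypotheses the LLM judges inconsistent with the context."""
    kept = []
    for h in hypotheses:
        verdict = llm(
            f"Context:\n{observations}\n\n"
            f"Is this {variable} hypothesis plausible given the context? Answer yes or no.\n"
            f"Hypothesis: {h}"
        )
        if verdict.strip().lower().startswith("yes"):
            kept.append(h)
    return kept or hypotheses  # never return an empty hypothesis space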

Bayesian Inference. We estimate each local conditional in the agent model using an LLM. After marginalizing the joint distribution over non-target latent variables, we then produce the posterior probabilities of the target variable in the query. We greatly generalize prior methods by enabling any ToM inference based on any agent model structure, simultaneously considering multiple non-target latent variables and supporting arbitrary levels of recursion for high-order ToM inference.
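
In the same sketch style, the inference step enumerates combinations of sampled hypotheses and marginalizes out the non-target latent variables. Here `joint_score` stands in for the product of LLM-estimated local conditionals along the model's dependency structure; it is an assumption of this sketch:

import itertools
from collections import defaultdict

def infer_posterior(target_hyps, latent_hyps, joint_score):
    """
    target_hyps: hypotheses for the queried mental variable (e.g., goals).
    latent_hyps: dict mapping each non-target latent variable to its hypotheses.
    joint_score: function(assignment) -> product of LLM-estimated local
                 conditionals P(x | parents) for one full assignment.
    """
    scores = defaultdict(float)
    names = list(latent_hyps)
    for tgt in target_hyps:
        # Marginalize the joint over all combinations of non-target latents.
        for combo in itertools.product(*(latent_hyps[n] for n in names)):
            assignment = {"target": tgt, **dict(zip(names, combo))}
            scores[tgt] += joint_score(assignment)
    z = sum(scores.values()) or 1.0
    return {t: s / z for t, s in scores.items()}  # normalized posterior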

Figure 3: Illustration of automated Bayesian inverse planning given a specified agent model.

Automated Agent Model Discovery

Prior works on Bayesian inverse planning rely on manually designed agent models, which limits their applicability to domain-specific scenarios. In contrast, the Automated Agent Model Discovery component automatically proposes a model and dynamically adjusts it to ensure both the effectiveness of the model—confidently inferring agents' mental states—and the efficiency of the inference by minimizing model complexity.

Information Extraction. The information extraction module processes the given context to identify the values of observable variables, including states, actions, and utterances, organized along a timeline. When there are multiple agents, we first identify whose mental state the question asks about, and then construct the timesteps based on that agent's actions.
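
A rough sketch of what this module might look like; the prompt format and the `Timestep` schema are illustrative assumptions:

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Timestep:
    state: str                     # scene description at this step
    action: Optional[str] = None   # the queried agent's action, if any
    utterance: Optional[str] = None

def extract_timeline(llm: Callable[[str], str], context: str, agent: str) -> List[Timestep]:
    """Ask the LLM to segment the context into timesteps keyed to one agent's actions."""
    prompt = (
        f"Story:\n{context}\n\n"
        f"List, step by step, each action taken by {agent}. For each step give the "
        "state of the world, the action, and any utterance, formatted as: "
        "state | action | utterance."
    )
    steps = []
    for line in llm(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            steps.append(Timestep(state=parts[0], action=parts[1] or None,
                                  utterance=parts[2] or None))
    return steps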

Initial Model Proposal. We employ an LLM to propose an initial agent model tailored to the available information and the query. Following this model, we conduct automated Bayesian inverse planning as described above. If the model utility exceeds a threshold, we accept the inference result as the final answer. Otherwise, we use the model utility to guide model adjustments.
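
One natural reading of this acceptance test treats the model utility as the confidence of the posterior; the threshold value and helper names below are illustrative, not the paper's exact definitions:

def model_utility(posterior: dict) -> float:
    """Confidence of the inference: the probability of the most likely hypothesis."""
    return max(posterior.values())

def accept_or_adjust(posterior: dict, threshold: float = 0.9):
    """Accept the inference result if the model utility clears the threshold."""
    if model_utility(posterior) >= threshold:
        return max(posterior, key=posterior.get)  # final answer
    return None  # low utility: fall through to model adjustment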

Model Adjustment. We iteratively adjust the proposed model by considering two types of model adjustments: variable adjustment and timestep adjustment (Figure 4).

Variable Adjustment. We refine the model structure at a specific timestep by iteratively introducing new, relevant latent variables into the model to address uncertainty in the inference. For each adjustment, we compute the updated model utility and accept the modification that offers the biggest increase in utility.

Timestep Adjustment. If the model utility remains low and no significant improvements can be achieved through variable adjustment given the current timesteps \( t_s:t \), we may incorporate an additional timestep, \( t_s-1 \), to provide more context for the inference. When we add one more timestep, we first apply the model structure in the initial model proposal, and then conduct variable adjustments for this new timestep as well.
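
Putting the pieces together, the discovery loop might be sketched as follows; all four callables and the threshold are illustrative placeholders:

def discover_model(model, infer, adjust_variables, add_timestep, threshold=0.9):
    """Alternate variable and timestep adjustment until the inference is confident.

    `infer` runs automated Bayesian inverse planning and returns a posterior dict;
    `adjust_variables` returns a higher-utility variant of the model, or None;
    `add_timestep` extends the context window backward to t_s - 1, or returns None.
    """
    while True:
        posterior = infer(model)
        if max(posterior.values()) >= threshold:      # model utility is high enough
            return max(posterior, key=posterior.get)  # accept the inference result
        variant = adjust_variables(model)             # variable adjustment
        if variant is not None:
            model = variant
            continue
        extended = add_timestep(model)                # timestep adjustment
        if extended is None:                          # no earlier context remains
            return max(posterior, key=posterior.get)  # fall back to the best hypothesis
        model = extended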

Figure 4: We automatically refine the agent model by alternating between variable adjustment and timestep adjustment.

Experiment 1: Evaluation on ToM Benchmarks

We evaluated our method on multiple Theory of Mind benchmarks, including ToMi, BigToM, MMToM-QA, MuMA-ToM, and Hi-ToM. As shown in Figure 1(a), these benchmarks encompass different mental variables, observable contexts, numbers of agents, the presence or absence of utterances, wording styles, and modalities. AutoToM autonomously discovers the appropriate agent models for them.

The main results are summarized in Table 1. AutoToM achieves the strongest overall performance among all methods, including large reasoning models. As shown in Figure 5, AutoToM also scales robustly and exhibits far less volatility across conditions than large reasoning models. This is because Bayesian inverse planning is more reliable when inferring mental states from long contexts with complex environments and agent behavior, and it is better suited to the recursive reasoning that higher-order inference requires.

Method                      ToMi    BigToM   MMToM-QA   MuMA-ToM   Hi-ToM   All

LLaMA 3.1 70B               72.00   77.83    43.83      55.78      35.00    56.89
GPT-4o                      77.00   82.42    44.00      63.55      50.00    63.39
Gemini 2.0 Flash            66.70   82.00    48.00      55.33      52.50    60.91
Gemini 2.0 Pro              71.90   86.33    50.84      62.22      57.50    65.76

SymbolicToM                 98.60   -        -          -          44.50    -
SimToM                      79.90   77.50    51.00      47.63      71.00    65.41

DeepSeek-R1                 89.40   86.25    49.67      63.44      56.50    69.05
Gemini 2.0 Flash Thinking   78.00   82.83    54.00      82.56      73.50    74.18
o3-mini-high                73.10   86.92    64.67      70.00      75.00    73.94

BIP-ALM                     55.60   50.33    56.17      33.90      14.50    42.10
LIMP                        44.60   61.67    55.33      76.60      6.50     48.94
AutoToM (w/ GPT-4o)         88.30   86.92    83.00      81.44      72.50    82.43
Table 1: Results of all methods on ToM benchmarks, grouped by model types: LLMs, ToM prompting, large reasoning models, and model-based inference.
Figure 5: Comparison of AutoToM and large reasoning models across various conditions.

Ablation Study. The results of our ablation study (Figure 6) highlight the benefits of variable adjustment, timestep adjustment, and hypothesis reduction. The automated agent model discovery in AutoToM constructs a suitable agent model that not only enables rich ToM inferences but also reduces compute, balancing accuracy and cost.

Figure 6: Averaged performance and compute of AutoToM and the ablated methods on all benchmarks.

Experiment 2: Evaluation on Classic Cognitive Studies

AutoToM produces posterior distributions over the hypothesis space, offering uncertainty estimates. This allows us to compare the model's uncertainty with human judgments. We adapted two well-known cognitive studies on human ToM: online goal inference, and desire and belief inferences in the food truck scenario. We computed the correlation between model responses and the human judgments reported in the original studies. As shown in Table 2, AutoToM aligns well with human confidence judgments on all three tasks.

Task                                AutoToM   GPT-4o   o3-mini-high   Gemini 2.0 Flash Thinking
Online goal inference (full obs.)   0.93**    0.81**   0.97**         0.95**
Desire inference (partial obs.)     0.88**    0.30     0.52*          0.58*
Belief inference (partial obs.)     0.73**    0.04     0.03           0.60*
Table 2: Correlations between model responses and human judgments on the three tasks (* and ** indicate statistical significance).
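
The comparison itself is a standard Pearson correlation between model confidence and mean human ratings; the arrays below are made-up placeholder values for illustration:

from scipy.stats import pearsonr

# Hypothetical paired values: model posterior probabilities and mean human
# confidence ratings for the same judgment trials (numbers are invented).
model_conf = [0.91, 0.40, 0.75, 0.12, 0.66]
human_conf = [0.88, 0.35, 0.80, 0.20, 0.60]

r, p = pearsonr(model_conf, human_conf)
print(f"Pearson r = {r:.2f} (p = {p:.3g})")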

Experiment 3: Embodied Assistance

We evaluated AutoToM on an embodied assistance benchmark, Online Watch-And-Help (O-WAH), in which a helper agent must observe a main agent's actions, infer its goal online, and assist it in reaching that goal faster in realistic household environments.
As shown in Figure 7, AutoToM achieves the highest speedup (27.7%), significantly outperforming all baselines.

Figure 7: Averaged speedup of AutoToM and baselines on the O-WAH benchmark.

Conclusion

To conclude, AutoToM is a novel framework for open-ended Theory of Mind. Given any ToM inference problem, AutoToM can automatically construct a suitable agent model and conduct automated Bayesian inverse planning with an LLM backend. It suggests a promising direction toward cognitively grounded ToM modeling that is scalable, robust, and open-ended.

Acknowledgement

We would like to thank Hyokun Yun and Tanya Roosta for their helpful comments, and the Cambrian authors for providing this webpage template.

BibTeX

@article{zhang2025autotom,
  title={AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind},
  author={Zhang, Zhining and Jin, Chuanyang and Jia, Mung Yao and Shu, Tianmin},
  journal={arXiv preprint arXiv:2502.15676},
  year={2025}
}