AutoToM

Scaling Model-based Mental Inference
via Automated Agent Modeling

Introducing AutoToM, an automated and open-ended Theory of Mind reasoning method. AutoToM is characterized by the following features:

- Open-ended ToM: AutoToM is a model-based method that can operate in any domain, infer any mental variable, and conduct robust ToM reasoning of any order.
- LLM Meets Bayesian Inference: AutoToM integrates the flexibility of LLMs with the robustness of Bayesian inverse planning, automating both hypothesis sampling and Bayesian inference.
- Automated Agent Model Discovery: AutoToM performs automated model proposals and iteratively refines the model by adjusting variables and timesteps.
- Performance: AutoToM achieves state-of-the-art results on five benchmarks, produces human-like confidence estimates, and supports embodied decision-making.

Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. We introduce AutoToM, an automated agent modeling method for scalable, robust, and interpretable mental inference. AutoToM achieves state-of-the-art performance across five benchmarks, produces human-like confidence estimates, and enables online mental inference for embodied decision-making.

Figure 1: Overview of AutoToM's capacities and applications evaluated in this work.



Model-based Theory of Mind

Understanding the Challenge of Theory of Mind

Theory of Mind (ToM), the ability to understand people's mental variables based on their behavior, is key to developing socially intelligent agents.

There are two current approaches to Theory of Mind reasoning:

  1. Directly applying LLMs to reason about people's mental states with prompting strategies such as perspective-taking, change-tracking, and temporal-spatial reasoning. However, even with these advanced prompting techniques, LLMs still make systematic errors in complex scenarios.
  2. Using model-based inference, particularly Bayesian Inverse Planning (BIP). Recent works have proposed to combine BIP and LLMs to achieve scalable yet robust model-based ToM inference. While these methods significantly outperform LLMs in specific domains, they use rigid, handcrafted models, which cannot generalize across different domains.

Bayesian Inverse Planning: A Robust Framework

Bayesian Inverse Planning (BIP) models how observers infer unobservable mental states—such as beliefs and goals—from an agent's behavior. It assumes that the agent acts rationally according to a Bayesian Theory of Mind (BToM) agent model, which specifies how internal variables lead to observable actions. BIP then inverts this generative process to infer which latent mental variables could have produced the observed behavior, serving as a robust solution to ToM challenges.
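
In its simplest form (for instance, goal inference with fully observed states and a uniform prior over sampled goals, using the variable names from Figure 2), the BIP posterior can be written as:

\[ P(g \mid s^{1:t}, a^{1:t}) \;\propto\; P(g) \prod_{\tau=1}^{t} P(a^\tau \mid s^\tau, g) \]

AutoToM generalizes this template: neither the model structure nor the hypothesis spaces are fixed in advance.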

Conducting BIP in different scenarios raises several key challenges: 1) different ToM inference problems require different agent models (see Figure 1(a)), and we do not know in advance which is most suitable; 2) a given context contains many timesteps, and we need to determine which ones are relevant; 3) there is no predefined hypothesis space for each mental variable.

AutoToM: A Paradigm Shift

We introduce AutoToM, a fully automated and open-ended model-based Theory of Mind reasoning method. It automates every aspect of Bayesian inverse planning, including the proposal and adjustment of model structures, the identification of relevant timesteps, the generation of hypotheses, and the execution of Bayesian inference. It is designed to operate in any context, infer any mental state, reason about any number of agents, and support any order of recursive reasoning, which represents our vision of an open-ended and robust machine Theory of Mind.

Figure 2: An overview of AutoToM. \( X^{t_s:t} \) are observable variables, \( V^{t_s:t} \) are latent mental variables, and \( q \) is the query.
\( t_s:t \) denotes timesteps from \( t_s \) to \( t \) in the context that are considered for inference. Variables \( s^t, o^t, b^t, a^t, g^t \) represent state, observation, belief, action, and goal, respectively, with solid arrows indicating dependencies defined in the models.

Figure 2 provides an overview of AutoToM. Given a question, we extract the observable variables (information extraction) and propose an initial agent model. This is followed by automated Bayesian inverse planning and iterative model adjustment. Once the model utility is high enough, we produce the final answer based on the inference result.
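
As a concrete illustration, an agent model of the kind shown in Figure 2 can be thought of as a small graph over variables. The Python schema below is an illustrative assumption, not AutoToM's internal format:

agent_model = {
    "timesteps": (3, 5),            # the window t_s:t of context considered
    "observable": ["s", "a"],       # state and action, extracted from the context
    "latent": ["o", "b", "g"],      # observation, belief, goal
    "dependencies": [
        ("s", "o"),                 # the state determines what the agent observes
        ("o", "b"),                 # observations update the agent's belief
        ("b", "a"), ("g", "a"),     # belief and goal jointly drive the action
    ],
    "query": "b",                   # the mental variable the question asks about
}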

Automated Bayesian Inverse Planning

Given an agent model, we integrate LLMs as the computational backend to implement every aspect of Bayesian inverse planning, including hypothesis sampling for the latent mental variables and probabilistic inference for the target mental variable (Figure 3). The construction, information flow, and computations within the agent model are entirely automated.

Hypothesis Sampling. Conventional BIP assumes a manually defined hypothesis space as well as hypothesis representation for each latent mental variable. Our hypothesis sampling module instead leverages an LLM to propose only a small set of quality hypotheses for each latent variable, conditioned on observable variables and their values extracted from the context. We further apply hypothesis reduction to eliminate unlikely hypotheses and reduce the hypothesis space.
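
Below is a minimal Python sketch of this module; the prompts, the `llm` callable, and the function names are illustrative assumptions rather than AutoToM's actual implementation:

from typing import Callable, List

def sample_hypotheses(llm: Callable[[str], str], variable: str,
                      observations: str, k: int = 3) -> List[str]:
    """Ask the LLM to propose a small set of quality hypotheses for one latent variable."""
    prompt = (
        f"Context:\n{observations}\n\n"
        f"Propose {k} distinct, plausible hypotheses for the agent's {variable}. "
        "Return one hypothesis per line."
    )
    return [h.strip() for h in llm(prompt).splitlines() if h.strip()]

def reduce_hypotheses(llm: Callable[[str], str], variable: str,
                      observations: str, hypotheses: List[str]) -> List[str]:
    """Hypothesis reduction: drop hypotheses the LLM judges inconsistent with the context."""
    kept = []
    for h in hypotheses:
        verdict = llm(
            f"Context:\n{observations}\n\n"
            f"Is this {variable} hypothesis plausible given the context? Answer yes or no.\n"
            f"Hypothesis: {h}"
        )
        if verdict.strip().lower().startswith("yes"):
            kept.append(h)
    return kept or hypotheses  # never return an empty hypothesis space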

Bayesian Inference. We estimate each local conditional in the agent model using an LLM. After marginalizing the joint distribution over non-target latent variables, we then produce the posterior probabilities of the target variable in the query. We greatly generalize prior methods by enabling any ToM inference based on any agent model structure, simultaneously considering multiple non-target latent variables and supporting arbitrary levels of recursion for high-order ToM inference.
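
In the same sketch style, the inference step enumerates combinations of sampled hypotheses and marginalizes out the non-target latent variables. Here `joint_score` stands in for the product of LLM-estimated local conditionals along the model's dependency structure; it is an assumption of this sketch:

import itertools
from collections import defaultdict

def infer_posterior(target_hyps, latent_hyps, joint_score):
    """
    target_hyps: hypotheses for the queried mental variable (e.g., goals).
    latent_hyps: dict mapping each non-target latent variable to its hypotheses.
    joint_score: function(assignment) -> product of LLM-estimated local
                 conditionals P(x | parents) for one full assignment.
    """
    scores = defaultdict(float)
    names = list(latent_hyps)
    for tgt in target_hyps:
        # Marginalize the joint over all combinations of non-target latents.
        for combo in itertools.product(*(latent_hyps[n] for n in names)):
            assignment = {"target": tgt, **dict(zip(names, combo))}
            scores[tgt] += joint_score(assignment)
    z = sum(scores.values()) or 1.0
    return {t: s / z for t, s in scores.items()}  # normalized posterior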

Figure 3: Illustration of automated Bayesian inverse planning given a specified agent model.

Automated Agent Model Discovery

Prior works on Bayesian inverse planning rely on manually designed agent models, which limits their applicability to domain-specific scenarios. In contrast, the Automated Agent Model Discovery component automatically proposes a model and dynamically adjusts it to ensure both the effectiveness of the model—confidently inferring agents' mental states—and the efficiency of the inference by minimizing model complexity.

Information Extraction. The information extraction module processes the given context to identify the values of observable variables, including states, actions, and utterances, organized along a timeline. When there are multiple agents, we first identify whose mental state the question asks about, and then construct the timesteps based on that agent's actions.
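
A rough sketch of what this module might look like; the prompt format and the `Timestep` schema are illustrative assumptions:

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Timestep:
    state: str                     # scene description at this step
    action: Optional[str] = None   # the queried agent's action, if any
    utterance: Optional[str] = None

def extract_timeline(llm: Callable[[str], str], context: str, agent: str) -> List[Timestep]:
    """Ask the LLM to segment the context into timesteps keyed to one agent's actions."""
    prompt = (
        f"Story:\n{context}\n\n"
        f"List, step by step, each action taken by {agent}. For each step give the "
        "state of the world, the action, and any utterance, formatted as: "
        "state | action | utterance."
    )
    steps = []
    for line in llm(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            steps.append(Timestep(state=parts[0], action=parts[1] or None,
                                  utterance=parts[2] or None))
    return steps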

Initial Model Proposal. We employ an LLM to propose an initial agent model tailored to the available information and the query. Following this model, we conduct automated Bayesian inverse planning as described above. If the model utility exceeds a threshold, we accept the inference result as the final answer. Otherwise, we use the model utility to guide model adjustments.
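
One natural reading of this acceptance test treats the model utility as the confidence of the posterior; the threshold value and helper names below are illustrative, not the paper's exact definitions:

def model_utility(posterior: dict) -> float:
    """Confidence of the inference: the probability of the most likely hypothesis."""
    return max(posterior.values())

def accept_or_adjust(posterior: dict, threshold: float = 0.9):
    """Accept the inference result if the model utility clears the threshold."""
    if model_utility(posterior) >= threshold:
        return max(posterior, key=posterior.get)  # final answer
    return None  # low utility: fall through to model adjustment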

Model Adjustment. We iteratively adjust the proposed model by considering two types of model adjustments: variable adjustment and timestep adjustment (Figure 4).

Variable Adjustment. We refine the model structure at a specific timestep by iteratively introducing new, relevant latent variables into the model to address uncertainty in the inference. For each adjustment, we compute the updated model utility and accept the modification that offers the biggest increase in utility.

Timestep Adjustment. If the model utility remains low and no significant improvements can be achieved through variable adjustment given the current timesteps \( t_s:t \), we may incorporate an additional timestep, \( t_s-1 \), to provide more context for the inference. When we add one more timestep, we first apply the model structure in the initial model proposal, and then conduct variable adjustments for this new timestep as well.
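
Putting the pieces together, the discovery loop might be sketched as follows; all four callables and the threshold are illustrative placeholders:

def discover_model(model, infer, adjust_variables, add_timestep, threshold=0.9):
    """Alternate variable and timestep adjustment until the inference is confident.

    `infer` runs automated Bayesian inverse planning and returns a posterior dict;
    `adjust_variables` returns a higher-utility variant of the model, or None;
    `add_timestep` extends the context window backward to t_s - 1, or returns None.
    """
    while True:
        posterior = infer(model)
        if max(posterior.values()) >= threshold:      # model utility is high enough
            return max(posterior, key=posterior.get)  # accept the inference result
        variant = adjust_variables(model)             # variable adjustment
        if variant is not None:
            model = variant
            continue
        extended = add_timestep(model)                # timestep adjustment
        if extended is None:                          # no earlier context remains
            return max(posterior, key=posterior.get)  # fall back to the best hypothesis
        model = extended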

Figure 4: We automatically refine the agent model by alternating between variable adjustment and timestep adjustment.

Experiment 1: Evaluation on ToM Benchmarks

We evaluated our method on multiple Theory of Mind benchmarks, including ToMi, BigToM, MMToM-QA, MuMA-ToM, and Hi-ToM. As shown in Figure 1(a), these benchmarks encompass different mental variables, observable contexts, numbers of agents, the presence or absence of utterances, wording styles, and modalities. AutoToM autonomously discovers the appropriate agent models for them.

The main results are summarized in Table 1. AutoToM achieves the strongest overall performance among all methods, including large reasoning models. As shown in Figure 5, AutoToM also scales robustly and exhibits far less volatility across conditions than large reasoning models. This is because Bayesian inverse planning is more reliable when inferring mental states from long contexts with complex environments and agent behavior, and it is better suited to the recursive reasoning that higher-order inference requires.

Method                      ToMi    BigToM   MMToM-QA   MuMA-ToM   Hi-ToM   All

LLaMA 3.1 70B               72.00   77.83    43.83      55.78      35.00    56.89
GPT-4o                      77.00   82.42    44.00      63.55      50.00    63.39
Gemini 2.0 Flash            66.70   82.00    48.00      55.33      52.50    60.91
Gemini 2.0 Pro              71.90   86.33    50.84      62.22      57.50    65.76

SymbolicToM                 98.60   -        -          -          44.50    -
SimToM                      79.90   77.50    51.00      47.63      71.00    65.41

DeepSeek-R1                 89.40   86.25    49.67      63.44      56.50    69.05
Gemini 2.0 Flash Thinking   78.00   82.83    54.00      82.56      73.50    74.18
o3-mini-high                73.10   86.92    64.67      70.00      75.00    73.94

BIP-ALM                     55.60   50.33    56.17      33.90      14.50    42.10
LIMP                        44.60   61.67    55.33      76.60      6.50     48.94
AutoToM (w/ GPT-4o)         88.30   86.92    83.00      81.44      72.50    82.43
Table 1: Results of all methods on ToM benchmarks, grouped by model types: LLMs, ToM prompting, large reasoning models, and model-based inference.
Figure 5: Comparison of AutoToM and large reasoning models across various conditions.

Ablation Study. The results of our ablation study (Figure 6) highlight the benefits of variable adjustment, timestep adjustment, and hypothesis reduction. The automated agent model discovery in AutoToM constructs a suitable agent model that not only enables rich ToM inferences but also reduces compute, balancing accuracy and cost.

Figure 6: Averaged performance and compute of AutoToM and the ablated methods on all benchmarks.

Experiment 2: Evaluation on Classic Cognitive Studies

AutoToM produces posterior distributions over the hypothesis space, offering uncertainty estimates. This allows us to compare the model's uncertainty with human judgments. We adapted two well-known cognitive studies on human ToM: online goal inference, and desire and belief inferences in the food truck scenario. We computed the correlation between model responses and the human judgments reported in the original studies. As shown in Table 2, AutoToM aligns well with human confidence judgments on all three tasks.

Task                                AutoToM   GPT-4o   o3-mini-high   Gemini 2.0 Flash Thinking
Online goal inference (full obs.)   0.93**    0.81**   0.97**         0.95**
Desire inference (partial obs.)     0.88**    0.30     0.52*          0.58*
Belief inference (partial obs.)     0.73**    0.04     0.03           0.60*
Table 2: Correlations between model responses and human judgments on the three tasks (* and ** indicate statistical significance).
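
The comparison itself is a standard Pearson correlation between model confidence and mean human ratings; the arrays below are made-up placeholder values for illustration:

from scipy.stats import pearsonr

# Hypothetical paired values: model posterior probabilities and mean human
# confidence ratings for the same judgment trials (numbers are invented).
model_conf = [0.91, 0.40, 0.75, 0.12, 0.66]
human_conf = [0.88, 0.35, 0.80, 0.20, 0.60]

r, p = pearsonr(model_conf, human_conf)
print(f"Pearson r = {r:.2f} (p = {p:.3g})")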

Experiment 3: Embodied Assistance

We evaluated AutoToM on an embodied assistance benchmark, Online Watch-And-Help (O-WAH), in which a helper agent must observe a main agent's actions, infer its goal online, and assist it in reaching that goal faster in realistic household environments.
As shown in Figure 7, AutoToM achieves the highest speedup (27.7%), significantly outperforming all baselines.

Figure 7: Averaged speedup of AutoToM and baselines on the O-WAH benchmark.

Conclusion

To conclude, AutoToM is a novel framework for open-ended Theory of Mind. Given any ToM inference problem, AutoToM can automatically construct a suitable agent model and conduct automated Bayesian inverse planning with an LLM backend. It suggests a promising direction toward cognitively grounded ToM modeling that is scalable, robust, and open-ended.

Acknowledgement

We would like to thank Hyokun Yun and Tanya Roosta for their helpful comments, and the Cambrian authors for providing this webpage template.

BibTeX

@article{zhang2025autotom,
  title={AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind},
  author={Zhang, Zhining and Jin, Chuanyang and Jia, Mung Yao and Shu, Tianmin},
  journal={arXiv preprint arXiv:2502.15676},
  year={2025}
}