Classical planning formulations like the Planning Domain Definition Language (PDDL) yield action sequences guaranteed to achieve a goal state from a given initial state, whenever such a sequence exists. However, PDDL problems do not capture temporal aspects of acting, such as two agents performing actions concurrently when no conditions conflict, without significant modification to existing PDDL domains. A human expert aware of such constraints can decompose a goal into subgoals, each reachable through single-agent planning, to take advantage of simultaneous actions. In contrast to classical planning, large language models (LLMs) used directly to infer plan steps rarely guarantee execution success, but are capable of leveraging commonsense reasoning to assemble action sequences. We combine the strengths of both classical planning and LLMs by approximating human intuitions for multi-agent planning goal decomposition. We demonstrate that LLM-based goal decomposition leads to faster planning times than solving multi-agent PDDL problems directly, while achieving fewer plan execution steps than a single-agent plan alone, as well as most multi-agent plans, and guaranteeing execution success. Additionally, we find that LLM-based approximations of subgoals result in multi-agent execution lengths similar to those specified by human experts.
We propose decomposing an $N$-agent planning problem into $N$ single-agent planning problems by leveraging the human-like commonsense reasoning capabilities of LLMs. Specifically, we consider a multi-agent scenario with $N-1$ helper agents and one main agent. For a given problem $\mathrm{P}$, containing object definitions, an initial state $i_0 = i$, and goal conditions $g$, each helper agent $h \in \{1, \ldots, N-1\}$ generates a plan $\pi_h = \Pi(i_{h-1}, g_h)$ to reach a subgoal state $g_h$ from its starting state $i_{h-1}$ using a classical planner $\Pi$. The resulting state $i_h = E(i_{h-1}, \pi_h)$, where $E$ is plan execution returning the state reached after starting from $i_{h-1}$ and executing the steps of $\pi_h$, serves as the starting point for the next helper agent. This iteration continues until the main agent executes $\pi_m = \Pi(i_{N-1}, g)$ to achieve the final goal $g$. Each helper agent's subgoal $g_h$ is generated through two modules: a subgoal generator that produces a subgoal in English, and a subgoal translator that converts it into a PDDL subgoal. We hypothesize that LLMs can infer helper subgoals that enable parallel execution alongside the main agent while assuming all agents will eventually achieve their respective goals.
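To make the pipeline concrete, the following Python sketch implements this loop under assumed interfaces: llm, planner, and executor are hypothetical stand-ins for an LLM call, the classical planner $\Pi$, and plan execution $E$, and the prompt wording is illustrative rather than the paper's actual prompts.

    # A minimal sketch of the decomposition loop above, assuming hypothetical
    # interfaces: llm(prompt) -> str, planner(state, pddl_goal) -> plan
    # (the classical planner Pi), and executor(state, plan) -> state
    # (plan execution E). Prompt wording is illustrative only.
    def twostep_plan(init_state, goal, num_agents, llm, planner, executor):
        state = init_state  # i_0 = i
        helper_plans = []
        for h in range(1, num_agents):  # helper agents 1 .. N-1
            # Subgoal generator: propose an English subgoal for helper h.
            english_subgoal = llm(
                f"Task: {goal}. Current state: {state}. "
                "Propose a helper subgoal that can be pursued in parallel "
                "with the main agent."
            )
            # Subgoal translator: convert the English subgoal into PDDL
            # goal conditions.
            pddl_subgoal = llm(
                f"Translate into PDDL goal conditions: {english_subgoal}"
            )
            plan_h = planner(state, pddl_subgoal)  # pi_h = Pi(i_{h-1}, g_h)
            state = executor(state, plan_h)        # i_h = E(i_{h-1}, pi_h)
            helper_plans.append(plan_h)
        # The main agent plans from the last helper's resulting state to the
        # full goal: pi_m = Pi(i_{N-1}, g).
        main_plan = planner(state, goal)
        return helper_plans, main_plan

Planning proceeds sequentially in this sketch, but the resulting plans can be executed in parallel when the LLM-proposed subgoals avoid conflicts with the main agent, which is the hypothesis stated above.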
@misc{bai2025twostepmultiagenttaskplanning,
  title={TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models},
  author={David Bai and Ishika Singh and David Traum and Jesse Thomason},
  year={2025},
  eprint={2403.17246},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2403.17246}
}