New LLM framework solves complex planning problems with zero training
Researchers at the Massachusetts Institute of Technology (MIT) have developed a new framework that harnesses large language models (LLMs) to tackle complex planning problems with unprecedented flexibility and precision. This innovation promises to transform how AI systems address real-world challenges, from optimizing supply chains to orchestrating robotic tasks, without requiring extensive human intervention or task-specific training. Announced on April 5, 2025, the development marks a leap forward in the quest for general-purpose AI planning tools capable of adapting to diverse scenarios.
The study, titled "Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-Based Formalized Programming," was published as a conference paper at the International Conference on Learning Representations (ICLR) 2025. Conducted by Yilun Hao, Yang Zhang, and Chuchu Fan, the research introduces LLMFP, a framework that leverages the reasoning and programming capabilities of LLMs to convert planning tasks into optimization problems, solving them from scratch with no prior examples. This approach has demonstrated remarkable success, achieving optimal solutions in over 83% of tested cases, far surpassing existing methods.
How does LLMFP address the flexibility-complexity trade-off?
Planning in real-world contexts often involves navigating a delicate balance between flexibility and complexity. Traditional LLM-based systems excel in simple, single-objective tasks like scheduling household chores, but falter when faced with multi-step, multi-constraint problems such as managing logistics or long-term robotic operations. Conversely, solutions designed for complex tasks typically rely on meticulously crafted, task-specific examples and pre-defined rules, limiting their ability to generalize across different domains. This trade-off has long hindered the deployment of AI planners in dynamic, unpredictable environments.
LLMFP breaks this deadlock by reimagining planning as a constrained optimization problem, a mathematical process of finding the best solution under given conditions. The framework uses LLMs to interpret natural language task descriptions, identify goals, decision variables, and constraints, and then encode them into a solvable format. For instance, in a coffee supply chain scenario, LLMFP determines how to source beans from suppliers, roast them at facilities, and distribute them to cafes at minimal cost, all while respecting capacity limits and demand fluctuations. By automating this process without task-specific preparation, LLMFP achieves what previous systems could not: zero-shot flexibility combined with high performance on complex tasks.
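To make the idea concrete, here is a minimal sketch of such a supply-chain task encoded as a constrained optimization problem, using the open-source Z3 solver's Python bindings (the z3-solver package). The supplier costs, capacities, and demand figure are invented for illustration and are not taken from the paper:

```python
from z3 import Optimize, Int, Sum, sat

# Hypothetical data: name -> (cost per kg, capacity in kg).
SUPPLIERS = {"A": (5, 100), "B": (7, 200)}

def plan(demand: int):
    """Minimize purchasing cost while meeting demand and capacity limits."""
    opt = Optimize()
    buy = {s: Int(f"buy_{s}") for s in SUPPLIERS}   # kg bought per supplier
    for s, (_, cap) in SUPPLIERS.items():
        opt.add(buy[s] >= 0, buy[s] <= cap)         # non-negativity + capacity
    opt.add(Sum(list(buy.values())) >= demand)      # meet total cafe demand
    cost = Sum([buy[s] * c for s, (c, _) in SUPPLIERS.items()])
    opt.minimize(cost)                              # objective: minimal cost
    if opt.check() == sat:
        m = opt.model()
        return {s: m[v].as_long() for s, v in buy.items()}, m.eval(cost).as_long()

print(plan(150))  # -> ({'A': 100, 'B': 50}, 850): exhaust the cheaper supplier first
```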
The process unfolds in five steps. First, the system defines the optimization problem by extracting key elements from the task description. Next, it formulates a detailed representation of variables and their relationships. Then, it generates executable code for a solver such as a satisfiability modulo theories (SMT) solver. After running the code, it formats the results into actionable plans and conducts a self-assessment to refine any errors. This structured approach ensures reliability, even for intricate problems like multi-step robot block stacking, where actions must be sequenced over time.
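The schematic below sketches how such a five-step loop could be wired together. The `llm` callable stands in for any chat-completion API, and every prompt and helper name here is hypothetical, not the authors' actual implementation:

```python
import io
import contextlib

def execute(code: str) -> str:
    """Run generated solver code naively and capture stdout or the error."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})          # a real system would sandbox this
    except Exception as e:
        return f"ERROR: {e}"
    return buf.getvalue()

def llmfp_plan(task, background, query, llm, max_rounds=3):
    # Step 1: define the problem -- goal, decision variables, constraints.
    definition = llm(f"Extract goal, variables, constraints:\n{task}\n{background}")
    # Step 2: formulate a detailed representation of variables and relations.
    formulation = llm(f"Formalize into a structured spec:\n{definition}\n{query}")
    for _ in range(max_rounds):
        # Step 3: generate executable code targeting an SMT solver.
        code = llm(f"Write Python code using an SMT solver for:\n{formulation}")
        # Step 4: run the code and collect the raw solver output.
        result = execute(code)
        # Step 5: self-assess; return a formatted plan or refine and retry.
        verdict = llm(f"Check this result against the query:\n{result}\n{query}")
        if verdict.strip().lower().startswith("pass"):
            return llm(f"Format the result as an actionable plan:\n{result}")
        formulation = llm(f"Revise the spec given this feedback:\n{verdict}")
```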
What makes LLMFP a game-changer for diverse applications?
The implications of LLMFP extend far beyond academic exercises. Across nine diverse planning problems, ranging from supply chain logistics to robotic block moving, the framework achieved an average optimal rate of 83.7% with GPT-4o and 86.8% with Claude 3.5 Sonnet. These figures represent a dramatic improvement over the best baseline method, direct planning with OpenAI's o1-preview, which scored only 46.1% and 46%, respectively. This leap underscores its ability to handle tasks that demand intricate reasoning and adherence to multiple constraints.
One standout feature is its adaptability. Unlike domain-specific tools that require extensive customization, LLMFP operates solely on natural language inputs: a task description, background information, and a user query. For example, it can respond to a query like “What happens if café demand rises by 23%?” by recalculating the entire supply chain plan. This universality makes it applicable to fields as varied as business management, logistics, and robotics, where planning often involves juggling competing priorities over extended horizons.
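Building on the earlier supply-chain sketch, a what-if query of this kind amounts to perturbing one parameter and re-solving; the 23% figure below simply scales the invented demand number:

```python
# Reusing plan() from the earlier sketch: a what-if query only perturbs
# a parameter and re-solves; the model itself is untouched.
baseline = plan(150)
what_if = plan(round(150 * 1.23))   # cafe demand up 23%
print(baseline, what_if)
```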
Another strength lies in its handling of implicit constraints: conditions not explicitly stated but inferred through common sense, such as ensuring non-negative quantities in shipping calculations. By prompting LLMs to systematically explore relationships between variables, LLMFP uncovers these hidden requirements, enhancing the realism and feasibility of its plans. Experiments further showed its robustness: when task descriptions were paraphrased, it maintained a 92% optimal rate on 50 Blocksworld queries, proving its insensitivity to wording variations.
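As a hedged illustration of an implicit constraint, consider a single timestep of a block-stacking encoding in Z3: no task description states that a block rests on at most one support, yet any feasible plan depends on it. All variable names here are invented:

```python
from z3 import Solver, Bool, Not, AtMost, sat

blocks, spots = ["b1", "b2"], ["table", "b1", "b2"]
s = Solver()

# on[b][p] is true iff block b rests on support p at one fixed timestep.
on = {b: {p: Bool(f"on_{b}_{p}") for p in spots} for b in blocks}

for b in blocks:
    s.add(Not(on[b][b]))                          # a block can't rest on itself
    s.add(AtMost(*[on[b][p] for p in spots], 1))  # at most one support per block

print(s.check())  # sat: the inferred physics constraints are consistent
```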
Why does LLMFP outperform existing methods?
The success of LLMFP stems from its fusion of LLM capabilities with formal optimization techniques. While LLMs alone struggle with complex planning, often producing invalid or incomplete plans, LLMFP capitalizes on their strengths in language understanding and code generation. By translating planning tasks into a format solvable by established tools such as SMT or even mixed-integer linear programming (MILP) solvers, it sidesteps the limitations of direct LLM planning.
Baseline comparisons highlight this edge. Methods like chain-of-thought prompting or code-based planning with LLMs alone faltered, with optimal rates as low as 0% in some cases. In contrast, LLMFP's structured pipeline (defining, formulating, coding, executing, and self-correcting) ensures both accuracy and scalability. Ablation studies confirmed that each component, from constraint identification to self-assessment, contributes to its superior performance.
Ease of adaptation is another advantage. Switching from SMT to MILP solvers required only minimal prompt adjustments, demonstrating LLMFP's flexibility across solver types. Even when tested with minimal input (no task-specific examples or external critics), it outperformed systems burdened by preparatory overhead. Adding examples could boost results further, but their absence didn't hinder its zero-shot prowess, a critical feature for real-world deployment where labeled data is scarce.
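To illustrate how little the encoding layer changes across solver types, here is the same toy supply-chain model re-expressed for a MILP solver, using the PuLP library and its bundled CBC backend; as before, all numbers are invented for illustration:

```python
import pulp

SUPPLIERS = {"A": (5, 100), "B": (7, 200)}  # name -> (cost per kg, capacity)
demand = 150

prob = pulp.LpProblem("coffee_supply", pulp.LpMinimize)
buy = {s: pulp.LpVariable(f"buy_{s}", lowBound=0, upBound=cap, cat="Integer")
       for s, (_, cap) in SUPPLIERS.items()}

prob += pulp.lpSum(buy[s] * c for s, (c, _) in SUPPLIERS.items())  # objective
prob += pulp.lpSum(buy.values()) >= demand                         # meet demand

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({s: v.value() for s, v in buy.items()}, pulp.value(prob.objective))
```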
FIRST PUBLISHED IN: Devdiscourse

