CodePDE: How LLMs Are Revolutionizing PDE Solving Without Specialized Training
The Challenge of PDE Solving
Partial Differential Equations (PDEs) are the backbone of modeling physical systems, from fluid dynamics to quantum mechanics. Yet solving them has long required deep domain expertise and significant computational resources. Classical numerical solvers, such as finite difference or finite element methods, demand meticulous tuning and extensive debugging. Neural-network-based solvers, while promising, require large training datasets and often lack interpretability.
Enter CodePDE, a new framework from researchers at Carnegie Mellon University and the Flatiron Institute that leverages large language models (LLMs) to generate PDE solvers without any task-specific training. The results? Solver code that matches or beats human expert solvers on 4 of 5 benchmark PDE families.
How CodePDE Works
CodePDE frames PDE solving as a code generation task. Given a natural language description of a PDE (such as the Burgers or Navier-Stokes equations), it instructs an LLM to generate executable solver code. The framework then iteratively refines that code through four mechanisms, sketched in code after the list:
- Reasoning: Chain-of-thought prompting explores numerical methods (finite difference, spectral, etc.).
- Debugging: Runtime errors are fed back to the LLM for autonomous correction (bug-free rate jumps from 42% to 86%).
- Refinement: Feedback on solution accuracy improves solver quality.
- Test-Time Scaling: Generating multiple solver variants and selecting the best boosts accuracy.
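To make the debugging and refinement steps concrete, here is a minimal sketch of the loop. The `RunResult` container, the `generate_solver` function, and the prompt wording are illustrative assumptions, not CodePDE's actual interface; the LLM call and the sandboxed execution are supplied by the caller as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class RunResult:
    error: Optional[str]       # traceback text if execution failed, else None
    accuracy: Optional[float]  # e.g. relative error vs. reference data, if it ran

def generate_solver(pde_description: str,
                    llm: Callable[[str], str],
                    run_solver: Callable[[str], RunResult],
                    max_rounds: int = 4) -> str:
    """Generate solver code with an LLM, then debug and refine it in a loop."""
    prompt = f"Write a self-contained Python solver for this PDE:\n{pde_description}"
    code = llm(prompt)
    for _ in range(max_rounds):
        result = run_solver(code)  # sandboxed execution, supplied by the caller
        if result.error is not None:
            # Debugging: feed the runtime traceback back to the model.
            feedback = f"The solver crashed with:\n{result.error}\nReturn corrected code."
        elif result.accuracy is not None:
            # Refinement: report measured accuracy and ask for a better solver.
            feedback = (f"The solver ran with relative error {result.accuracy:.2e} "
                        "against reference data. Return a more accurate solver.")
        else:
            break  # ran but produced no accuracy signal; stop refining
        code = llm(f"{prompt}\n\nCurrent code:\n{code}\n\n{feedback}")
    return code
```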
Crucially, CodePDE doesn't fine-tune any model: it uses off-the-shelf LLMs such as GPT-4o, Gemini 2.5 Pro, and Claude 3.7 Sonnet, showing that general-purpose models can excel at a specialized task with the right scaffolding.
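In practice, much of that scaffolding is careful prompting. Here is a hypothetical prompt template in the spirit of the reasoning step described above; the wording and the `solve` signature are my illustration, not taken from the paper.

```python
# Hypothetical prompt template; the structure mirrors the chain-of-thought
# reasoning step described above, but the exact wording is not the paper's.
PROMPT_TEMPLATE = """You are an expert in numerical methods for PDEs.

PDE description:
{pde_description}

Before writing any code:
1. List candidate discretizations (finite difference, finite volume, spectral).
2. Discuss stability constraints (e.g., a CFL condition) and expected accuracy.
3. Pick one scheme and justify the choice.

Then output a single self-contained Python function
    solve(initial_condition, t_final, nx, nt)
that returns the solution on the full space-time grid.
"""

# Example usage (the PDE description string is illustrative):
prompt = PROMPT_TEMPLATE.format(
    pde_description="1D viscous Burgers equation, nu=0.01, periodic boundaries")
```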
Key Results
- Outperforming Humans: On the Burgers equation, CodePDE achieved a 70% lower error than human expert solvers. It matched or beat experts on 4 of 5 PDE families.
- Debugging Matters: Without iterative debugging, only 42% of generated solvers worked. With it, 86% succeeded.
- Compute Scaling: Generating 32 solver variants and picking the best reduced error by up to 50% (see the sketch after this list).
- Efficiency: Gemini 2.5 Pro's solvers ran 10x faster than baseline neural operators such as the Fourier Neural Operator (FNO).
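The compute-scaling result is a best-of-N recipe at heart. A hedged sketch, reusing the hypothetical `llm` and `run_solver` callables from the earlier loop and assuming candidates are ranked by their measured error against reference data (the paper's exact selection criterion may differ):

```python
def best_of_n(pde_description: str, llm, run_solver, n: int = 32) -> str:
    """Sample n candidate solvers independently and keep the lowest-error one."""
    scored = []
    for _ in range(n):
        code = llm(f"Write a self-contained Python solver for this PDE:\n{pde_description}")
        result = run_solver(code)  # sandboxed execution, as before
        if result.error is None and result.accuracy is not None:
            scored.append((result.accuracy, code))
    if not scored:
        raise RuntimeError("No candidate solver executed successfully.")
    return min(scored, key=lambda pair: pair[0])[1]  # lowest measured error wins
```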
The Catch: Reaction-Diffusion
LLMs struggled with the Reaction-Diffusion equation, where human experts exploit an analytical solution for the reaction term. Generated solvers defaulted to numerical approximations, leading to higher errors. This highlights a limitation: LLMs lack deep physical intuition (for now).
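To see what "an analytical solution for the reaction term" buys, assume a logistic reaction term ρu(1 − u) of the kind used in common reaction-diffusion benchmarks (an assumption about the benchmark setup, not a detail stated above). Inside an operator-splitting scheme, the reaction sub-step can then be advanced exactly rather than with a generic time stepper; the Euler variant below is shown only for contrast.

```python
import numpy as np

def reaction_step_exact(u: np.ndarray, rho: float, dt: float) -> np.ndarray:
    # Exact solution of the logistic ODE du/dt = rho * u * (1 - u) over one
    # time step, applied pointwise inside an operator-splitting scheme.
    e = np.exp(rho * dt)
    return u * e / (1.0 - u + u * e)

def reaction_step_euler(u: np.ndarray, rho: float, dt: float) -> np.ndarray:
    # Generic first-order approximation of the same sub-step, the kind of
    # update a solver uses when it does not exploit the closed form.
    return u + dt * rho * u * (1.0 - u)
```

Skipping the closed form adds a time-discretization error in the reaction term on top of the splitting error, which is one plausible source of the accuracy gap.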
Why This Matters
CodePDE democratizes PDE solving. Instead of requiring months of manual solver development, researchers can describe their problem in plain English and get working code. The approach is:
- Interpretable: Unlike black-box neural solvers, the generated code is human-readable.
- Flexible: Works with any LLM, local or API-based.
- Scalable: Test-time compute can be traded for accuracy.
The Future
The team suggests fine-tuning LLMs on numerical analysis textbooks or integrating them with symbolic math tools to close gaps like the one on Reaction-Diffusion. Hybrid approaches that combine LLM-generated code with neural operators could marry interpretability with performance.
One thing’s clear: AI-generated scientific computing is here. And it’s fast, accurate, and (almost) effortless.
Read the full paper here.