SCENT: A Scalable Framework for Spatiotemporal Learning in Scientific Data

Spatiotemporal learning—modeling data that evolves across space and time—is a critical challenge in scientific domains, from climate modeling to epidemiology. But real-world scientific data is messy: sensors fail, measurements are sparse, and datasets are massive. Traditional approaches struggle with these complexities, often requiring trade-offs between accuracy, scalability, and flexibility.

Enter SCENT (Scalable Conditioned Neural Field for Spatiotemporal Learning), a new framework from researchers at Brookhaven National Laboratory and Cornell Tech. Published in a recent arXiv preprint, SCENT tackles spatiotemporal learning with a unified architecture that handles interpolation, reconstruction, and forecasting—all while scaling to high-dimensional scientific datasets.

The Challenges of Spatiotemporal Data

Scientific data is notoriously difficult to model:

  • Irregular sampling: Sensor failures or moving instruments (like air quality monitors on buses) create gaps.
  • High dimensionality: Climate simulations or fluid dynamics datasets can be terabytes in size.
  • Complex dependencies: Spatial patterns influence temporal evolution (and vice versa).

Existing methods, like Fourier Neural Operators (FNO) or transformer-based models, often specialize in one task (e.g., forecasting) but struggle with others (e.g., filling in missing data). SCENT aims to bridge this gap.

How SCENT Works

At its core, SCENT is built on a transformer-based encoder-processor-decoder architecture, enhanced with three key innovations (a short code sketch follows the list):

  1. Time-Targeted Spatial Encoder: Rather than treating time as an afterthought, SCENT explicitly encodes both the observation times and the target prediction time, letting attention focus on the patterns most relevant to the requested output.
  2. Temporal Warp Processor: This module learns the dynamics directly in continuous time, enabling forecasts at arbitrary horizons while avoiding the error accumulation of step-by-step rollouts in long-term predictions.
  3. Sparse Attention: To handle large datasets, SCENT uses sparse cross-attention, reducing computational overhead while preserving global context.
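
To make the list concrete, here is a minimal PyTorch-style sketch of time-targeted encoding combined with sparse (latent) cross-attention. It is illustrative only: the names (TimeTargetedEncoder, fourier_time_features), the Perceiver-style latent-token design, and every hyperparameter are our assumptions for exposition, not the paper's exact architecture or its released code.

```python
# Illustrative sketch only: module names, the latent-token design, and all
# hyperparameters are assumptions, not the SCENT paper's exact architecture.
import torch
import torch.nn as nn


def fourier_time_features(t: torch.Tensor, dim: int = 32) -> torch.Tensor:
    """Embed scalar times as sine/cosine features at multiple frequencies."""
    freqs = 2.0 ** torch.arange(dim // 2, device=t.device)   # (dim/2,)
    angles = t.unsqueeze(-1) * freqs                          # (..., dim/2)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)    # (..., dim)


class TimeTargetedEncoder(nn.Module):
    """Cross-attend a small set of latent tokens to the raw observations.

    Conditioning on both the observation times and the *target* time mirrors
    SCENT's time-targeted encoding; attending from num_latents << num_points
    latent tokens is one common way to realize sparse cross-attention.
    """

    def __init__(self, obs_dim: int, d_model: int = 128, num_latents: int = 64):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, d_model) * 0.02)
        # Each observation token = (value, spatial coords) + its time features.
        self.obs_proj = nn.Linear(obs_dim + 32, d_model)
        self.target_proj = nn.Linear(32, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4,
                                                batch_first=True)

    def forward(self, obs, obs_t, target_t):
        # obs: (B, N, obs_dim) values + coords; obs_t: (B, N); target_t: (B,)
        tokens = self.obs_proj(
            torch.cat([obs, fourier_time_features(obs_t)], dim=-1)
        )
        # Inject the target time into the latent queries, so attention can
        # weight observations by their relevance to the prediction time.
        query = self.latents.unsqueeze(0) + self.target_proj(
            fourier_time_features(target_t)
        ).unsqueeze(1)                                        # (B, L, d_model)
        latents, _ = self.cross_attn(query, tokens, tokens)   # (B, L, d_model)
        return latents
```

Because the number of latent tokens L is much smaller than the number of observation points N, the cross-attention step costs O(N·L) rather than the O(N²) of full self-attention over raw points, which is one plausible reading of how sparse attention keeps large inputs tractable.
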
[Image: SCENT architecture diagram]

Figure: SCENT's architecture integrates time-aware encoding and sparse attention for efficient spatiotemporal modeling.
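
The conditioned-neural-field framing also clarifies how one model can serve interpolation, reconstruction, and forecasting at once: after encoding, predictions are simply queried at arbitrary space-time coordinates. The sketch below continues the hypothetical modules above; FieldDecoder and query_coords are our illustrative names, not identifiers from the SCENT codebase.

```python
# Hypothetical decoder sketch: a conditioned neural field queried at arbitrary
# (x, y, t) points. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn


class FieldDecoder(nn.Module):
    """Decode latent tokens into values at any requested space-time point."""

    def __init__(self, d_model: int = 128, coord_dim: int = 3, out_dim: int = 1):
        super().__init__()
        # Lift raw (x, y, t) coordinates into the model dimension.
        self.coord_proj = nn.Sequential(
            nn.Linear(coord_dim, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4,
                                                batch_first=True)
        self.head = nn.Linear(d_model, out_dim)

    def forward(self, latents, query_coords):
        # latents: (B, L, d_model) from the encoder/processor.
        # query_coords: (B, Q, coord_dim), e.g. columns (x, y, t); t may be a
        # future time (forecasting) or an unobserved location (reconstruction).
        q = self.coord_proj(query_coords)               # (B, Q, d_model)
        out, _ = self.cross_attn(q, latents, latents)   # queries read latents
        return self.head(out)                           # (B, Q, out_dim)


# Shapes-only usage: same decoder, different query coordinates per task.
# latents = encoder(obs, obs_t, target_t)    # (B, L, 128) from sketch above
# coords  = torch.rand(2, 100, 3)            # 100 (x, y, t) queries per sample
# values  = FieldDecoder()(latents, coords)  # (B, 100, 1)
```

Under this reading, forecasting just means placing future times in the query coordinates, and reconstruction means querying a dense spatial grid at observed times, so no per-task architectural changes are needed.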

Key Results

The team tested SCENT across multiple benchmarks, including:

  • Simulated fluid dynamics: SCENT outperformed FNO and other baselines in reconstructing turbulent flows from sparse, noisy sensors.
  • AirDelhi PM2.5 dataset: SCENT achieved state-of-the-art forecasting accuracy for fine-grained air pollution measurements, even with moving sensors.
  • Scalability tests: On large-scale datasets, SCENT scaled near-linearly as data size grew, while competing methods plateaued.

Why This Matters

SCENT’s flexibility makes it a promising tool for:

  • Climate science: Filling gaps in satellite or sensor networks.
  • Epidemiology: Modeling disease spread across regions over time.
  • Industrial monitoring: Predicting equipment failures from sparse sensor data.

"The ability to unify interpolation, reconstruction, and forecasting in a single model is a game-changer," says lead author David Keetae Park. "It means scientists can spend less time wrangling data and more time extracting insights."

What’s Next

The team plans to extend SCENT to exabyte-scale datasets, like those from particle physics experiments, and explore real-world deployments in environmental monitoring. For now, the code and pre-trained models are available on GitHub, offering researchers a powerful new tool for spatiotemporal analysis.


Read the full paper on arXiv.