
Time Series Foundation Models: Use Cases & Benefits

Sıla Ermut
updated on Jun 12, 2026

Time series foundation models (TSFMs) are pre-trained models that forecast, classify, impute, and detect anomalies in time series data without requiring a separate model for every dataset or industry. TSFMs use transformer-based architectures and large-scale time-series datasets to generalize across domains such as finance, retail, energy, and healthcare.

This article covers the architecture, use cases, industry adoption, benefits, and challenges of time series foundation models, and how they compare with existing models:

What are Time Series Foundation Models?

TSFMs apply foundation model training methods to sequential numerical data, using architectures that can learn temporal patterns from large collections of time series. Instead of training a separate model for each forecasting problem, they can be adapted to tasks such as forecasting, anomaly detection, imputation, and classification.

Leading TSFMs are:

Amazon Chronos-2

Amazon Chronos-2 is an encoder-only model derived from the T5 encoder architecture and has reached tens of millions of downloads on Hugging Face.1

Salesforce Moirai-2

Salesforce Moirai-2 uses a decoder-only transformer architecture trained on the LOTSA dataset, which contains 27 billion observations.

Sundial

Sundial, developed by researchers at Tsinghua University, is pre-trained on the TimeBench corpus and reports state-of-the-art results on public forecasting benchmarks.

TimesFM-2.5

TimesFM-2.5 is the latest model in Google’s TimesFM series. It is a pretrained model with ~200M parameters and a 16k context length, trained on a corpus of real-world time series data points.2 Compared with large language models (LLMs), it offers a compact size, fast inference, and a focus on time series data.

Architecture and training

TimesFM borrows the decoder-only transformer architecture from language models: stacked causal self-attention and feedforward layers generate the next output conditioned on past context.

Unlike a language model, which operates on text tokens, TimesFM represents a sequence as patches of contiguous time points; each patch is embedded (via an MLP residual block plus positional encodings) and treated as a token. A key design choice is to make the output patch longer than the input patch, which reduces the number of iterative decoding steps at inference and limits error accumulation on long horizons.
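The effect of patching on sequence length and decoding steps can be illustrated with a minimal sketch. The patch lengths (32 for input, 128 for output), the toy series, and the helper names make_patches and decode_steps are illustrative assumptions and not TimesFM's actual code.

```python
import math


def make_patches(series, patch_len):
    """Split a series into non-overlapping patches, dropping any incomplete tail."""
    n_full = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_full)]


def decode_steps(horizon, output_patch_len):
    """Number of autoregressive steps needed to cover a forecast horizon."""
    return math.ceil(horizon / output_patch_len)


# Toy hourly series with a daily cycle (hypothetical data).
context = [float(t % 24) for t in range(512)]

# Each patch of 32 points becomes one token for the transformer.
patches = make_patches(context, patch_len=32)

# A longer output patch covers the same horizon in fewer decoding steps.
steps_long_output = decode_steps(horizon=256, output_patch_len=128)
steps_short_output = decode_steps(horizon=256, output_patch_len=32)

print(len(patches), steps_long_output, steps_short_output)
```

In this sketch, 512 context points become 16 tokens, and a 256-step horizon needs 2 decoding steps with the longer output patch versus 8 when the output patch matches the input patch. Fewer steps means fewer opportunities for earlier prediction errors to feed into later ones.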

For model training, Google mixes synthetic data (to teach basic temporal “grammar”) with a large, diverse dataset of real series (e.g., Google Trends and Wikipedia Pageviews) to improve transfer. The total pretraining scale is on the order of 100B time points.

Figure 1: Graph showing TimesFM’s architecture.3

Evaluation and results

Google evaluated TimesFM in pure zero-shot mode across public benchmarks. On the Monash Forecasting Archive, TimesFM outperforms most statistical models (e.g., ARIMA, ETS) and matches or exceeds several deep learning baselines trained on the target series.

On long-horizon tasks (e.g., ETT datasets), TimesFM’s zero-shot accuracy rivals supervised baselines (e.g., PatchTST trained per dataset) and beats prompt-based LLM forecasters. Reported metrics include scaled MAE and geometric-mean summaries across datasets.4
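As a rough illustration of how such metrics are computed, the sketch below scales a model's MAE by a baseline's MAE on each dataset and then summarizes the ratios with a geometric mean. The choice of baseline, the dataset names, and all numbers are hypothetical and are not taken from Google's evaluation.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def scaled_mae(actual, model_forecast, baseline_forecast):
    """Model MAE divided by baseline MAE; below 1.0 means the model beats the baseline."""
    return mae(actual, model_forecast) / mae(actual, baseline_forecast)


def geometric_mean(values):
    """Geometric mean of positive values, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical forecasts on two datasets with very different scales.
datasets = {
    "dataset_a": {"actual": [10, 12, 14, 16], "model": [11, 12, 13, 16], "baseline": [8, 8, 8, 8]},
    "dataset_b": {"actual": [100, 100, 100, 100], "model": [90, 110, 90, 110], "baseline": [95, 95, 95, 95]},
}

scores = {
    name: scaled_mae(d["actual"], d["model"], d["baseline"])
    for name, d in datasets.items()
}
summary = geometric_mean(list(scores.values()))

print(scores, round(summary, 3))
```

Scaling removes the effect of each dataset's units, and the geometric mean keeps a single poor ratio from dominating the summary the way it would in an arithmetic mean.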

Key characteristics and architecture of TSFMs

TSFMs’ transformer architecture uses self-attention, residual connections, and linear layers to model long-range dependencies and seasonality patterns. Input patches are transformed via a multilayer perceptron into embeddings, while positional encodings preserve temporal order.

Compared to other foundation models, these architectures are adapted for forecasting tasks, rather than text or image processing.

Figure 2: Diagram showing different adaptation techniques.5

What are the primary use cases?

Forecasting

TSFMs use historical time series values and, when supported by the model, additional inputs such as weather, promotions, holidays, or other variables. This allows them to model relationships across multiple signals rather than relying on a single target series.

Classification

As transformer-based models, TSFMs can recognize characteristic structures in a series, such as arrhythmias in medical data or unusual demand peaks in retail.

Imputation

TSFMs reconstruct missing intervals by leveraging patterns learned from diverse datasets during pretraining.

Unlike simple interpolation, they retain consistency with seasonality and trends. Applications include filling gaps in energy usage logs or medical monitoring data, where missing information can affect downstream forecasting tasks.
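The difference between simple interpolation and a seasonality-aware fill can be seen in a small sketch. Here a seasonal copy of the previous cycle stands in for a model's reconstruction; it is not how a TSFM imputes values, and the series, the period of 4, and the function names are assumptions chosen for illustration.

```python
def linear_interpolate(series):
    """Fill interior None gaps with a straight line between the nearest known values."""
    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            left = i - 1
            right = i
            while filled[right] is None:
                right += 1
            step = (filled[right] - filled[left]) / (right - left)
            for j in range(i, right):
                filled[j] = filled[left] + step * (j - left)
            i = right
        else:
            i += 1
    return filled


def seasonal_fill(series, season):
    """Fill None gaps with the value observed one full season earlier."""
    filled = list(series)
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = filled[i - season]
    return filled


# Toy series with a repeating cycle of length 4 and one missing cycle.
truth = [5.0, 9.0, 5.0, 1.0] * 4
observed = list(truth)
for idx in range(8, 12):
    observed[idx] = None

linear = linear_interpolate(observed)
seasonal = seasonal_fill(observed, season=4)

print(linear[8:12])
print(seasonal[8:12])
```

The straight line ignores the peak inside the gap, while the seasonal fill preserves it. A TSFM aims for the second behavior without being told the period in advance.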

Anomaly detection

TSFMs can support anomaly detection by converting their learned time series representations or reconstruction outputs into anomaly scores. For example, MOMENT uses a reconstruction-based setup in which the mean squared error between the observed and predicted time series is used as the anomaly criterion.6

This approach can reduce the need for task-specific anomaly labels, but it should still be benchmarked against traditional anomaly detection methods for each dataset.
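A reconstruction-based anomaly score of this kind can be sketched as follows. A rolling median stands in for the model's reconstruction, and the threshold rule (mean plus three standard deviations of the scores) is an assumption for illustration; neither is taken from MOMENT.

```python
import statistics


def rolling_median_reconstruction(series, window=5):
    """Stand-in for a model's reconstruction: centered rolling median."""
    half = window // 2
    reconstructed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        reconstructed.append(statistics.median(series[lo:hi]))
    return reconstructed


def anomaly_scores(observed, reconstructed):
    """Per-point squared error; averaging these over a window gives the MSE."""
    return [(o - r) ** 2 for o, r in zip(observed, reconstructed)]


def flag_anomalies(scores, k=3.0):
    """Indices whose score exceeds mean + k * standard deviation of all scores."""
    threshold = statistics.fmean(scores) + k * statistics.pstdev(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy sensor series with one injected spike (hypothetical data).
series = [10.0, 11.0, 10.0, 9.0] * 5
series[12] = 25.0

reconstruction = rolling_median_reconstruction(series)
scores = anomaly_scores(series, reconstruction)
flagged = flag_anomalies(scores)

print(flagged)
```

No anomaly labels are used anywhere in the sketch, which is the appeal of the approach. The threshold still has to be tuned and validated per dataset.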

Industries adopting TSFMs

Retail

In retail, TSFMs are primarily relevant for SKU- or store-level demand forecasting, where sales patterns can change due to holidays, pricing, promotions, stockouts, and regional seasonality.

Their usefulness depends on whether the model supports external variables. For example, TimeGPT can use exogenous variables such as prices, promotions, and holiday indicators.7

Another example, Lag-Llama, is designed as a univariate probabilistic forecasting model. This means TSFMs should not be described as a single class that always incorporates retail-specific drivers.8

A more practical retail use case is to test TSFMs as reusable forecasting baselines on demand datasets, then compare them with existing statistical, machine learning, or domain-specific forecasting models before deployment.
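One way to run such a comparison is a rolling-origin backtest, sketched below with two simple baselines. The demand series is synthetic, and the forecaster signature (history, horizon) is a hypothetical convention: a TSFM would be wrapped in a function with the same signature and added to the candidates dictionary.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season=7):
    """Repeat the most recent full season."""
    return [history[-season + (h % season)] for h in range(horizon)]


def rolling_origin_mae(series, forecaster, horizon, n_folds):
    """Average absolute error over several forecast origins near the end of the series."""
    errors = []
    for fold in range(n_folds):
        cutoff = len(series) - horizon * (n_folds - fold)
        history = series[:cutoff]
        actual = series[cutoff:cutoff + horizon]
        predicted = forecaster(history, horizon)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return sum(errors) / len(errors)


# Synthetic daily demand: a weekly pattern plus a slow upward trend.
weekly_pattern = [20, 22, 21, 23, 30, 45, 40]
demand = [weekly_pattern[t % 7] + t // 7 for t in range(70)]

candidates = {
    "naive": naive_forecast,
    "seasonal_naive": seasonal_naive_forecast,
}
results = {
    name: rolling_origin_mae(demand, forecaster, horizon=7, n_folds=3)
    for name, forecaster in candidates.items()
}

print(results)
```

Because every candidate is scored on the same folds, the resulting numbers are directly comparable, and a pretrained model only earns a place in production if it beats the cheap baselines on this kind of test.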

Finance

In financial time series, TSFMs are most relevant for tasks where historical data is limited, noisy, or affected by regime changes. These include forecasting newly listed assets, estimating short-term volatility, and identifying unusual transactions or market patterns.

Single-market models such as ARIMA, GARCH, or LSTM-based forecasters can become less reliable when the data distribution changes after interest rate shifts, liquidity shocks, macroeconomic announcements, or market stress events. TSFMs can partly address this limitation by transferring patterns learned from broader time series datasets, but their outputs still require backtesting because financial data is highly non-stationary.

Potential finance use cases include asset price forecasting, volatility forecasting, portfolio risk monitoring, and fraud or transaction anomaly detection.

Healthcare

TSFMs can learn from both clinical and synthetic data, which can support early warning systems that adapt to patient-specific baselines. Beyond monitoring, they can support research in drug trials by identifying subtle temporal patterns across large datasets.

Energy

Unlike traditional methods that assume fixed seasonal patterns, TSFMs handle variable conditions such as renewable generation.

They can combine consumption histories with exogenous variables such as temperature and wind speed, producing probabilistic forecasts for grid balancing. Computational efficiency is relevant here, as compact models such as IBM’s Tiny Time Mixers (TTM) can provide localized predictions at lower cost.

Transportation

TSFMs trained on diverse datasets can transfer across regions with minimal fine-tuning. Real-world examples include congestion forecasting in urban areas and delivery route optimization in logistics.

Manufacturing

TSFMs can model long-range dependencies across sensors and production cycles, which supports early fault detection.

When fine-tuned with facility-specific data, they can improve performance on tasks such as reducing downtime and maintaining quality control.

Weather and climate

Weather and climate modeling requires managing multiple forecast horizons, from hours to years. Statistical models and traditional methods often fail to capture multi-scale variability.

TSFMs, through their transformer architecture and self-attention mechanisms, can model both local and global dependencies. Examples include short-term precipitation forecasting and climate cycle predictions. Probabilistic time series forecasting helps quantify uncertainty in these outputs.
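Probabilistic forecasts are often expressed as quantiles and scored with the pinball (quantile) loss. The sketch below shows that loss together with empirical coverage for a single quantile; the precipitation values and the 0.9-quantile forecast are made-up numbers for illustration.

```python
def pinball_loss(actual, predicted, q):
    """Average pinball loss of a q-quantile forecast (lower is better)."""
    total = 0.0
    for y, f in zip(actual, predicted):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)


def coverage(actual, predicted):
    """Share of observations at or below the quantile forecast."""
    return sum(1 for y, f in zip(actual, predicted) if y <= f) / len(actual)


# Hypothetical daily precipitation (mm) and a 0.9-quantile forecast.
observed = [10.0, 12.0, 8.0, 15.0]
q90_forecast = [14.0, 14.0, 14.0, 14.0]

loss = pinball_loss(observed, q90_forecast, q=0.9)
hit_rate = coverage(observed, q90_forecast)

print(round(loss, 3), hit_rate)
```

A well-calibrated 0.9-quantile forecast should sit above roughly 90% of observations over a long evaluation window, so comparing hit_rate with q is a quick check on whether the stated uncertainty can be trusted.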

Benefits of time series foundation models

Key advantages of TSFMs compared to existing models include:

  • Zero-shot performance: Delivering strong results on unseen datasets without fine-tuning.
  • Reduced training costs: Reusing one model across domains instead of training separate models.
  • Domain generalization: Adapting to varied contexts through transfer learning and few-shot learning.
  • Computational efficiency: Smaller than large foundation models in NLP while still delivering competitive forecasting performance.
  • Versatility: Handling diverse forecast horizons, granularities, and output patch lengths.

Challenges of TSFMs

Technical challenges

Training data scarcity: Unlike text for language models, the available public datasets for time series data are smaller. However, there are now datasets such as the Large-scale Open Time Series Archive (LOTSA), with billions of observations across multiple domains.9

Lack of universal structure: No equivalent of vocabulary or grammar.

Complex temporal dynamics: Diverse seasonality patterns and histories.

Domain specificity: Different sampling rates and behaviors across industries.

Practical challenges

  • Privacy concerns in collecting diverse datasets.
  • High computational requirements for model training.
  • Distribution shift in evolving environments.
  • Interpretability and transparency in real-world applications.
  • Integration into legacy systems and existing pipelines.

Figure: Time series foundation models: Development and design factors

Figure: Time series foundation models: Outcomes and operational factors

Differences from other foundation models

TSFMs diverge from language models and vision foundation models in several ways:

  • Data modality: Sequential numeric data rather than text or images.
  • Architecture: Adapted transformer-based architectures with patching and normalization (e.g., reversible instance normalization).
  • Training approach: Incorporating both synthetic data and real-world corpora, such as Google Trends and Wikipedia Pageviews.
  • Scale: Smaller in size than large foundation models, yet delivering high-quality point forecasts.
  • Evaluation: Benchmarked on forecasting tasks, anomaly detection, and imputation instead of text understanding.
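The reversible instance normalization mentioned in the list above can be sketched in a few lines: each input window is standardized with its own statistics, and the same statistics are used to map model outputs back to the original scale. This simplified version omits the learnable affine parameters of the published method, and the function names and toy windows are assumptions.

```python
import math
import statistics


def instance_normalize(window, eps=1e-5):
    """Standardize one window with its own mean and standard deviation."""
    mean = statistics.fmean(window)
    std = math.sqrt(statistics.pvariance(window) + eps)
    normalized = [(x - mean) / std for x in window]
    return normalized, (mean, std)


def instance_denormalize(values, stats):
    """Map values from normalized space back to the window's original scale."""
    mean, std = stats
    return [v * std + mean for v in values]


# Two windows with the same shape but very different scales (hypothetical data).
low_scale = [1.0, 2.0, 3.0, 4.0]
high_scale = [1000.0, 2000.0, 3000.0, 4000.0]

norm_low, stats_low = instance_normalize(low_scale)
norm_high, stats_high = instance_normalize(high_scale)

# In a real pipeline the model's forecast, produced in normalized space,
# would be passed here; the input is reused only to show the round trip.
restored = instance_denormalize(norm_high, stats_high)

print([round(v, 3) for v in norm_low])
print([round(v, 3) for v in norm_high])
print([round(v, 3) for v in restored])
```

After normalization the two windows look nearly identical to the model, which is what lets one set of weights serve series measured in very different units.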

Conclusion

Time series foundation models represent a shift from domain-specific statistical models, regression models, and supervised deep learning toward a unified model for time series. By applying transformer-based architectures and leveraging pre-trained models, they offer scalable solutions for forecasting tasks, anomaly detection, and other applications across industries.

While challenges remain in training data availability, interpretability, and integration into existing workflows, the advantages in zero-shot forecasting, transfer learning, and cross-domain adaptability position TSFMs as a key step toward general-purpose forecasting.

Cite this research

Sıla Ermut (2026) - "Time Series Foundation Models: Use Cases & Benefits". Published online at AIMultiple.com. Retrieved June 12, 2026, from: https://aimultiple.com/time-series-foundation-models [Online Resource]
