Time Series Foundation Models: Use Cases & Benefits

updated on Jun 12, 2026

Time series foundation models (TSFMs) are pre-trained models that forecast, classify, impute, and detect anomalies in time series data without requiring a separate model for every dataset or industry. TSFMs use transformer-based architectures and large-scale time-series datasets to generalize across domains such as finance, retail, energy, and healthcare.

This article covers the architecture, use cases, industry adoption, benefits, and challenges of time series foundation models, and compares them with existing approaches:

What are Time Series Foundation Models?

TSFMs apply foundation model training methods to sequential numerical data, using architectures that can learn temporal patterns from large collections of time series. Instead of training a separate model for each forecasting problem, they can be adapted to tasks such as forecasting, anomaly detection, imputation, and classification.

Leading TSFMs are:

Amazon Chronos-2

Amazon Chronos-2 is an encoder-only model derived from the T5 encoder architecture; it has reached tens of millions of downloads on Hugging Face.¹

Salesforce Moirai-2

Salesforce Moirai-2 uses a decoder-only transformer architecture trained on the LOTSA dataset, which contains 27 billion observations.

Sundial

Sundial, developed by researchers at Tsinghua University, is pretrained on the TimeBench corpus and reports strong zero-shot forecasting results on public benchmarks.

TimesFM-2.5

TimesFM-2.5 is Google’s latest model in the TimesFM series. It is a pretrained model with ~200M parameters and a 16k context length, trained on a corpus of real-world time series data points.² Compared with large language models (LLMs), it is compact, fast at inference, and focused on time series data.

Architecture and training

TimesFM borrows the decoder-only transformer architecture from language models: stacked causal self-attention and feedforward layers generate the next output patch conditioned on past context.

Unlike language models, which tokenize text, TimesFM represents a sequence as patches of contiguous time points; each patch is embedded (via an MLP residual block plus positional encodings) and treated as a token. A key design choice is to make the output patch longer than the input patch, which reduces the number of autoregressive steps at inference and limits error accumulation on long horizons.
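The effect of the longer output patch can be shown with simple arithmetic. The sketch below is a minimal illustration, not TimesFM’s implementation: the function names are hypothetical, and the patch lengths (32 for input, 128 for output) and the zero left-padding are illustrative assumptions rather than values taken from this article.

```python
import math

def split_into_patches(series, patch_len):
    """Split a series into non-overlapping patches, left-padding with zeros
    when the length is not a multiple of patch_len (padding choice is an assumption)."""
    remainder = len(series) % patch_len
    pad = (patch_len - remainder) % patch_len
    padded = [0.0] * pad + list(series)
    return [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]

def decoding_steps(horizon, output_patch_len):
    """Number of autoregressive steps needed to cover the forecast horizon."""
    return math.ceil(horizon / output_patch_len)

context = [float(t) for t in range(512)]
patches = split_into_patches(context, patch_len=32)

print(len(patches))                               # 16 tokens for 512 points
print(decoding_steps(512, output_patch_len=32))   # 16 steps if output patch = input patch
print(decoding_steps(512, output_patch_len=128))  # 4 steps with a longer output patch
```

With these assumed sizes, a 512-step forecast needs 4 decoding steps instead of 16, so errors from earlier predictions are fed back into the model fewer times.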

For model training, Google mixes synthetic data (to teach basic temporal “grammar”) with a large, diverse dataset of real series (e.g., Google Trends and Wikipedia Pageviews) to improve transfer. The total pretraining scale is on the order of 100B time points.

Figure 1: Graph showing TimesFM’s architecture.³

Evaluation and results

Google evaluated TimesFM in pure zero-shot mode across public benchmarks. On the Monash Forecasting Archive, TimesFM outperforms most statistical models (e.g., ARIMA, ETS) and matches or exceeds several deep learning baselines trained on the target series.

On long-horizon tasks (e.g., ETT datasets), TimesFM’s zero-shot accuracy rivals supervised baselines (e.g., PatchTST trained per dataset) and beats prompt-based LLM forecasters. Reported metrics include scaled MAE, summarized across datasets with a geometric mean.⁴
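To make these metrics concrete, here is a minimal sketch of a scaled MAE and a geometric-mean summary. It assumes the scaling divides the model’s MAE by the MAE of a baseline forecast on the same data; the article does not specify the baseline used in the cited evaluation, and the per-dataset scores below are made-up values for illustration only.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def scaled_mae(actual, predicted, baseline):
    """Model MAE divided by a baseline's MAE (assumed scaling scheme).
    Values below 1.0 mean the model beats the baseline."""
    return mae(actual, predicted) / mae(actual, baseline)

def geometric_mean(values):
    """Geometric mean of positive values, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical scaled MAEs on three datasets (illustrative numbers only).
per_dataset_scores = [0.5, 2.0, 1.0]
print(geometric_mean(per_dataset_scores))  # close to 1.0
```

A geometric mean is a common choice for ratios like these because halving the error on one dataset and doubling it on another cancel out, which an arithmetic mean would not reflect.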

Key characteristics and architecture of TSFMs

TSFMs’ transformer architecture uses self-attention, residual connections, and linear layers to model long-range dependencies and seasonality patterns. Input patches are transformed via a multilayer perceptron into embeddings, while positional encodings preserve temporal order.

Compared to other foundation models, these architectures are adapted for forecasting tasks, rather than text or image processing.

Figure 2: Diagram showing different adaptation techniques.⁵

What are the primary use cases?

Forecasting

TSFMs use historical time series values and, when supported by the model, additional inputs such as weather, promotions, holidays, or other variables. This allows them to model relationships across multiple signals rather than relying on a single target series.

Classification

TSFMs use transformer-based models to recognize characteristic structures such as arrhythmias in medical data or unusual demand peaks in retail.

Imputation

TSFMs reconstruct missing intervals by leveraging patterns learned from diverse datasets during pretraining.

Unlike simple interpolation, they retain consistency with seasonality and trends. Applications include filling gaps in energy usage logs or medical monitoring data, where missing information can affect downstream forecasting tasks.
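The difference from simple interpolation can be seen on a toy seasonal series. The sketch below does not use a TSFM: a seasonal-naive fill stands in for a seasonality-aware imputer, and both function names and the data are hypothetical.

```python
def linear_fill(series):
    """Fill interior None gaps by straight-line interpolation between known
    neighbours. Leading or trailing gaps are left as None."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled

def seasonal_fill(series, period):
    """Fill None gaps with the value one season earlier, when it is available.
    This is a simple stand-in for a seasonality-aware imputer, not a TSFM."""
    filled = list(series)
    for i, v in enumerate(filled):
        if v is None and i >= period and filled[i - period] is not None:
            filled[i] = filled[i - period]
    return filled

# Two cycles of a period-4 pattern; the peak of the second cycle is missing.
series = [1.0, 5.0, 1.0, 0.0, 1.0, None, 1.0, 0.0]

print(linear_fill(series)[5])              # 1.0, the peak is flattened
print(seasonal_fill(series, period=4)[5])  # 5.0, the seasonal peak is kept
```

Linear interpolation only sees the two neighbouring points, so it erases the peak; any method that has learned the seasonal pattern can restore it.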

Anomaly detection

TSFMs can support anomaly detection by converting their learned time series representations or reconstruction outputs into anomaly scores. For example, MOMENT uses a reconstruction-based setup in which the mean squared error between the observed and predicted time series is used as the anomaly criterion.⁶

This approach can reduce the need for task-specific anomaly labels, but it should still be benchmarked against traditional anomaly detection methods for each dataset.
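A reconstruction-based anomaly score can be sketched in a few lines. The example below is an assumption-laden illustration: a rolling median stands in for the model’s reconstruction (MOMENT itself is not called), the per-point squared error is used as the score, and the threshold and data are hypothetical.

```python
from statistics import median

def rolling_median_reconstruction(series, window=5):
    """Stand-in for a model's reconstruction: centred rolling median."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(median(series[lo:hi]))
    return out

def anomaly_scores(observed, reconstructed):
    """Per-point squared error between observed and reconstructed values."""
    return [(o - r) ** 2 for o, r in zip(observed, reconstructed)]

def flag_anomalies(scores, threshold):
    """Indices whose score exceeds a chosen threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

series = [10.0, 10.5, 9.8, 10.2, 25.0, 10.1, 9.9, 10.3, 10.0, 10.4]
reconstruction = rolling_median_reconstruction(series)
scores = anomaly_scores(series, reconstruction)

print(flag_anomalies(scores, threshold=4.0))  # [4]
```

In practice the threshold would be tuned on held-out data, which is one reason the comparison against traditional detectors mentioned above remains necessary.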

Industries adopting TSFMs

Retail

In retail, TSFMs are primarily relevant for SKU- or store-level demand forecasting, where sales patterns can change due to holidays, pricing, promotions, stockouts, and regional seasonality.

Their usefulness depends on whether the model supports external variables. For example, TimeGPT can use exogenous variables such as prices, promotions, and holiday indicators.⁷

Another example, Lag-Llama, is designed as a univariate probabilistic forecasting model. This means TSFMs should not be described as a single class that always incorporates retail-specific drivers.⁸

A more practical retail use case is to test TSFMs as reusable forecasting baselines on demand datasets, then compare them with existing statistical, machine learning, or domain-specific forecasting models before deployment.
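Such a comparison is usually run as a rolling-origin backtest. The following sketch shows the evaluation loop with two simple baselines; the function names, the weekly period, and the synthetic sales data are all hypothetical. A TSFM could be plugged in by wrapping it in a function with the same (history, horizon) signature, which is an assumption about how the model would be called.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def seasonal_naive_forecast(history, horizon, period=7):
    """Repeat the last full season (assumes len(history) >= period)."""
    return [history[-period + (h % period)] for h in range(horizon)]

def rolling_origin_mae(series, forecaster, horizon, n_folds):
    """Average absolute error over n_folds consecutive forecast windows
    taken from the end of the series."""
    errors = []
    for fold in range(n_folds):
        cutoff = len(series) - horizon * (n_folds - fold)
        history = series[:cutoff]
        actual = series[cutoff:cutoff + horizon]
        predicted = forecaster(history, horizon)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return sum(errors) / len(errors)

# Synthetic daily sales with a perfectly repeating weekly pattern (8 weeks).
weekly_pattern = [20, 22, 21, 23, 30, 45, 40]
sales = [float(v) for _ in range(8) for v in weekly_pattern]

for name, model in [("naive", naive_forecast),
                    ("seasonal naive", seasonal_naive_forecast)]:
    print(name, rolling_origin_mae(sales, model, horizon=7, n_folds=3))
```

On this idealized series the seasonal-naive baseline is exact, which illustrates why a foundation model should be compared against strong simple baselines before deployment.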

Finance

In financial time series, TSFMs are most relevant for tasks where historical data is limited, noisy, or affected by regime changes. These include forecasting newly listed assets, estimating short-term volatility, and identifying unusual transactions or market patterns.

Single-market models such as ARIMA, GARCH, or LSTM-based forecasters can become less reliable when the data distribution changes after interest rate shifts, liquidity shocks, macroeconomic announcements, or market stress events. TSFMs aim to mitigate this limitation by transferring patterns learned from broader time series datasets, but their outputs still require backtesting because financial data is highly non-stationary.

Potential finance use cases include asset price forecasting, volatility forecasting, portfolio risk monitoring, and fraud or transaction anomaly detection.

Healthcare

TSFMs can be trained on both clinical and synthetic data, which can support early warning systems that adapt to patient-specific baselines. Beyond monitoring, they can support research in drug trials by identifying subtle temporal patterns across large datasets.

Energy

Unlike traditional methods that assume fixed seasonal patterns, TSFMs handle variable conditions such as renewable generation.

They can combine consumption histories with exogenous variables such as temperature and wind speed, producing probabilistic forecasts for grid balancing. Computational efficiency matters here: compact models such as Tiny Time Mixers (TTM) can provide localized predictions at lower cost.

Transportation

TSFMs trained on diverse datasets can transfer across regions with minimal fine-tuning. Real-world examples include congestion forecasting in urban areas and optimizing delivery routes in logistics.

Manufacturing

TSFMs handle long-range dependencies across sensors and production cycles, improving early fault detection.

When fine-tuned with facility-specific data, they can improve performance on tasks such as reducing downtime and quality control.

Weather and climate

Weather and climate modeling requires managing multiple forecast horizons, from hours to years. Statistical models and traditional methods often fail to capture multi-scale variability.

TSFMs, through their transformer architecture and self-attention mechanisms, can model both local and global dependencies. Examples include short-term precipitation forecasting and climate cycle predictions. Probabilistic time series forecasting helps quantify uncertainty in these outputs.
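Probabilistic forecasts are typically scored with quantile-based metrics. The sketch below shows two common ones, the pinball (quantile) loss and empirical interval coverage; the function names and the numbers are hypothetical and are not outputs of any model discussed here.

```python
def pinball_loss(actual, predicted_quantile, q):
    """Average pinball loss of a forecast for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, f in zip(actual, predicted_quantile):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)

def interval_coverage(actual, lower, upper):
    """Share of observations that fall inside the predicted interval."""
    inside = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return inside / len(actual)

# Hypothetical precipitation observations with 10% and 90% quantile forecasts.
actual = [10.0, 12.0, 9.0, 15.0]
lower = [8.0, 9.0, 8.0, 10.0]
upper = [12.0, 13.0, 11.0, 14.0]

print(interval_coverage(actual, lower, upper))  # 0.75
print(pinball_loss(actual, upper, q=0.9))
```

An 80% interval that covers only 75% of observations on a small sample is not alarming by itself, but systematic under-coverage on a larger backtest would indicate overconfident forecasts.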

Benefits of time series foundation models

Key advantages of TSFMs compared to existing models include:

Zero-shot performance: Delivering strong results on unseen datasets without fine-tuning.
Reduced training costs: Reuse of one model across domains instead of training separate models.
Domain generalization: A single model adapts to varied contexts through transfer learning and few-shot fine-tuning.
Computational efficiency: Smaller than large language models while still delivering competitive forecasting performance.
Versatility: Handling diverse forecast horizons, granularities, and output patch lengths.

Challenges of TSFMs

Technical challenges

Training data scarcity: Compared with the text available for language models, public time series data is scarce. However, datasets such as the Large-scale Open Time Series Archive (LOTSA) now provide billions of observations across multiple domains.⁹

Lack of universal structure: No equivalent of vocabulary or grammar.

Complex temporal dynamics: Diverse seasonality patterns and histories.

Domain specificity: Different sampling rates and behaviors across industries.

Practical challenges

Privacy concerns in collecting diverse datasets.
High compute requirements for model training.
Distribution shift in evolving environments.
Interpretability and transparency in real-world applications.
Integration into legacy systems and existing pipelines.

Time series foundation models: Development and design factors

Approach	Setup effort	Training data needs	Architecture
TimesFM (decoder-only foundation model)	Minimal; strong zero-shot forecasting across domains without retraining	Pretrained on ~100B time points from diverse datasets (real + synthetic data)	Decoder-only transformer with patch tokenization, self-attention, and a longer output patch length
Traditional methods (ARIMA, ETS, etc.)	Moderate; requires manual model selection, tuning, and stationarity assumptions	Uses only the target time series; no pretraining	Statistical models with linear and seasonal assumptions
Supervised deep learning models (PatchTST, Informer, etc.)	High; per-dataset model training and hyperparameter tuning	Requires sufficient historical data for each target dataset	Transformer-based architectures or CNN/RNN hybrids
LLM-based prompting (GPT, etc.)	Low (no training); requires careful text formatting of numeric sequences	Trained on text corpora, not time series data	Large language models; tokenization designed for text, not numeric dependencies
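The "careful text formatting" required by LLM-based prompting can be illustrated with a small serialization helper. This is a hypothetical sketch: the function names, the fixed-precision comma-separated format, and the example completion are assumptions, and no specific LLM API is called.

```python
def series_to_prompt(values, decimals=1):
    """Serialize a numeric series as fixed-precision, comma-separated text."""
    return ", ".join(f"{v:.{decimals}f}" for v in values)

def parse_completion(text):
    """Parse leading comma-separated numbers from a model's text output,
    stopping at the first token that is not a number."""
    parsed = []
    for token in text.split(","):
        try:
            parsed.append(float(token.strip()))
        except ValueError:
            break
    return parsed

history = [12.34, 15.0, 9.876]
print(series_to_prompt(history))                 # 12.3, 15.0, 9.9
print(parse_completion("10.1, 11.2, and then"))  # [10.1, 11.2]
```

The round trip loses precision and depends on the model emitting well-formed numbers, which is part of why this approach struggles with numeric accuracy compared with models that operate on the values directly.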

Time series foundation models: Outcomes and operational factors

Approach	Performance on benchmarks	Adaptability	Efficiency
TimesFM (decoder-only foundation model)	Matches or surpasses supervised baselines in zero-shot; better than statistical models and LLM prompting	Generalizes across finance, energy, retail, healthcare; supports transfer learning and few-shot adjustments	Compact (~200M parameters); lower compute requirements than large foundation models
Traditional methods (ARIMA, ETS, etc.)	Strong on stable, short-horizon series; weak on irregular or multivariate data	Little to no transfer across domains; must refit per dataset	Lightweight, fast; runs on limited hardware
Supervised deep learning models (PatchTST, Informer, etc.)	Often highest accuracy when trained on sufficient data; strong on domain-specific tasks	Poor generalization; needs retraining per dataset	Resource-intensive overall; each dataset requires its own training, tuning, and deployment
LLM-based prompting (GPT, etc.)	Weaker than TimesFM; struggles with long forecast horizons and numeric accuracy	Adaptable in principle, but heavily reliant on prompt engineering	Very costly at inference; inefficient due to scale and token length

Differences from other foundation models

TSFMs diverge from language models and vision foundation models in several ways:

Data modality: Sequential numeric data rather than text or images.
Architecture: Adapted transformer-based architectures with patching and normalization (e.g., reversible instance normalization).
Training approach: Incorporating both synthetic data and real-world corpora, such as the Google Trends and Wikipedia Pageviews data used for TimesFM.
Scale: Smaller in size than large foundation models, yet delivering high-quality point forecasts.
Evaluation: Benchmarked on forecasting tasks, anomaly detection, and imputation instead of text understanding.
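The normalization step in the list above can be sketched as follows. This is a simplified illustration of the idea behind reversible instance normalization: each series is scaled by its own statistics before the model sees it, and the forecast is mapped back afterwards. The learnable affine parameters of the published method are omitted, and the "model" is a hypothetical stand-in that repeats the last value.

```python
from statistics import fmean, pstdev

def instance_normalize(context, eps=1e-5):
    """Scale one series to zero mean and roughly unit variance.
    Returns the normalized values and the statistics needed to undo the step."""
    mean = fmean(context)
    std = pstdev(context) + eps  # eps guards against constant series
    return [(x - mean) / std for x in context], (mean, std)

def instance_denormalize(values, stats):
    """Map model outputs back to the original scale of the series."""
    mean, std = stats
    return [v * std + mean for v in values]

def repeat_last(normalized_context, horizon):
    """Hypothetical stand-in for a forecasting model working in normalized space."""
    return [normalized_context[-1]] * horizon

context = [1000.0, 1010.0, 1020.0, 1030.0]
normalized, stats = instance_normalize(context)
forecast = instance_denormalize(repeat_last(normalized, horizon=2), stats)

print(forecast)  # back on the original scale, close to [1030.0, 1030.0]
```

Because every series is presented to the model on a comparable scale, one set of weights can serve series whose raw magnitudes differ by orders of magnitude.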

Conclusion

Time series foundation models represent a shift from domain-specific statistical models, regression models, and supervised deep learning toward a unified model for time series. By applying transformer-based architectures and leveraging pre-trained models, they offer scalable solutions for forecasting tasks, anomaly detection, and other applications across industries.

While challenges remain in training data availability, interpretability, and integration into existing workflows, the advantages in zero-shot forecasting, transfer learning, and cross-domain adaptability position TSFMs as a key step toward general-purpose forecasting.

Cite this research

Sıla Ermut (2026) - "Time Series Foundation Models: Use Cases & Benefits". Published online at AIMultiple.com. Retrieved June 12, 2026, from: https://aimultiple.com/time-series-foundation-models [Online Resource]

Ermut, S. (2026, June 12). Time Series Foundation Models: Use Cases & Benefits. AIMultiple. https://aimultiple.com/time-series-foundation-models

@misc{ermut2026,
  author = {Ermut, Sıla},
  title  = {{Time Series Foundation Models: Use Cases \& Benefits}},
  year   = {2026},
  month  = jun,
  howpublished    = {\url{https://aimultiple.com/time-series-foundation-models}},
  note   = {AIMultiple. Retrieved June 12, 2026}
}

Reference Links

amazon/chronos-2 · Hugging Face

GitHub - google-research/timesfm: TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting. · GitHub

A decoder-only foundation model for time-series forecasting

https://arxiv.org/pdf/2403.14735v3

MOMENT: A Family of Open Time-series Foundation Models

https://arxiv.org/pdf/2310.03589

https://arxiv.org/pdf/2310.08278

Salesforce/lotsa_data · Datasets at Hugging Face

Salesforce AI Research

Sıla Ermut

Industry Analyst

Sıla Ermut is an industry analyst at AIMultiple focused on email marketing and sales videos. She previously worked as a recruiter in project management and consulting firms. Sıla holds a Master of Science degree in Social Psychology and a Bachelor of Arts degree in International Relations.


We follow ethical norms & our process for objectivity. This research does not feature any customers of AIMultiple.
