Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

Ekambaram, Vijay; Jati, Arindam; Nguyen, Nam H.; Dayama, Pankaj; Reddy, Chandra; Gifford, Wesley M.; Kalagnanam, Jayant

arXiv:2401.03955v4 (cs)

[Submitted on 8 Jan 2024 (v1), revised 6 Apr 2024 (this version, v4), latest version 5 Jun 2024 (v7)]

Abstract: Large pre-trained models for zero/few-shot learning excel in language and vision domains but encounter challenges in multivariate time series (TS) due to the diverse nature and scarcity of publicly available pre-training data. Consequently, there has been a recent surge in utilizing pre-trained large language models (LLMs) with token adaptations for TS forecasting. These approaches employ cross-domain transfer learning and, surprisingly, yield impressive results. However, these models are typically very slow and large (~billion parameters) and do not consider cross-channel correlations. To address this, we present Tiny Time Mixers (TTM), a significantly smaller model based on the lightweight TSMixer architecture. TTM marks the first success in developing fast and tiny general pre-trained models (<1M parameters), exclusively trained on public TS datasets, with effective transfer learning capabilities for forecasting. To tackle the complexity of pre-training on multiple datasets with varied temporal resolutions, we introduce several novel enhancements such as adaptive patching, dataset augmentation via downsampling, and resolution prefix tuning. Moreover, we employ a multi-level modeling strategy to effectively model channel correlations and infuse exogenous signals during fine-tuning, a crucial capability lacking in existing benchmarks. TTM shows significant accuracy gains (12-38%) over popular benchmarks in few/zero-shot forecasting. It also drastically reduces compute needs compared to LLM-TS methods, with a 14X cut in learnable parameters, 106X fewer total parameters, and substantial reductions in fine-tuning (65X) and inference time (54X). In fact, TTM's zero-shot results often surpass the few-shot results of many popular benchmarks, highlighting the efficacy of our approach. Code and pre-trained models will be open-sourced.
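The abstract names dataset augmentation via downsampling but does not describe how it is done. As a rough illustration only, the sketch below derives coarser-resolution copies of a multivariate series by averaging non-overlapping windows. The function names, the choice of mean pooling, and the handling of the trailing partial window are all assumptions made for this example, not the authors' implementation.

```python
from typing import List

# A multivariate series stored as [time step][channel].
Series = List[List[float]]


def downsample(series: Series, factor: int) -> Series:
    """Average non-overlapping windows of `factor` time steps, per channel.

    Assumption: mean pooling is used purely for illustration; the abstract
    does not say which downsampling scheme the paper uses.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    n_windows = len(series) // factor  # an incomplete trailing window is dropped
    out: Series = []
    for w in range(n_windows):
        window = series[w * factor:(w + 1) * factor]
        n_channels = len(window[0])
        out.append(
            [sum(row[c] for row in window) / factor for c in range(n_channels)]
        )
    return out


def augment_with_downsampling(series: Series, factors: List[int]) -> List[Series]:
    """Return the original series followed by one coarser copy per factor."""
    return [series] + [downsample(series, f) for f in factors]


# Hypothetical example: 8 time steps, 2 channels.
example = [[float(t), float(10 * t)] for t in range(8)]
augmented = augment_with_downsampling(example, factors=[2, 4])
for s in augmented:
    print(len(s), s[0])
```

Under these assumptions, a single high-resolution dataset yields several lower-resolution training series (for instance, an hourly series would also give 2-hourly and 4-hourly versions), which is one plausible way to widen the range of temporal resolutions seen during pre-training.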

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2401.03955 [cs.LG]
	(or arXiv:2401.03955v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2401.03955

Submission history

From: Vijay Ekambaram
[v1] Mon, 8 Jan 2024 15:21:21 UTC (2,236 KB)
[v2] Thu, 11 Jan 2024 18:21:27 UTC (2,241 KB)
[v3] Wed, 17 Jan 2024 16:27:24 UTC (2,249 KB)
[v4] Sat, 6 Apr 2024 17:16:18 UTC (2,250 KB)
[v5] Tue, 9 Apr 2024 07:19:41 UTC (2,250 KB)
[v6] Mon, 3 Jun 2024 17:57:22 UTC (9,468 KB)
[v7] Wed, 5 Jun 2024 13:46:37 UTC (9,468 KB)
