Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series. Ekambaram, V., Jati, A., Dayama, P., Mukherjee, S., Nguyen, N. H., Gifford, W. M., Reddy, C., & Kalagnanam, J. June, 2024. arXiv:2401.03955 [cs]


Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow performance, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the light-weight TSMixer architecture, incorporates innovations like adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on varied dataset resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by (4-40%), while reducing computational requirements significantly. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. Model weights for our initial variant (TTM-Q) are available at https://huggingface.co/ibm-granite/granite-timeseries-ttm-v1. Model weights for more sophisticated variants (TTM-B, TTM-E, and TTM-A) will be shared soon. The source code for TTM can be accessed at https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer.

@misc{ekambaram_tiny_2024,
	title = {Tiny {Time} {Mixers} ({TTMs}): {Fast} {Pre}-trained {Models} for {Enhanced} {Zero}/{Few}-{Shot} {Forecasting} of {Multivariate} {Time} {Series}},
	shorttitle = {Tiny {Time} {Mixers} ({TTMs})},
	url = {http://arxiv.org/abs/2401.03955},
	doi = {10.48550/arXiv.2401.03955},
	abstract = {Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow performance, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the light-weight TSMixer architecture, incorporates innovations like adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on varied dataset resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by (4-40\%), while reducing computational requirements significantly. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. Model weights for our initial variant (TTM-Q) are available at https://huggingface.co/ibm-granite/granite-timeseries-ttm-v1. Model weights for more sophisticated variants (TTM-B, TTM-E, and TTM-A) will be shared soon. The source code for TTM can be accessed at https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm\_public/models/tinytimemixer.},
	urldate = {2024-06-17},
	publisher = {arXiv},
	author = {Ekambaram, Vijay and Jati, Arindam and Dayama, Pankaj and Mukherjee, Sumanta and Nguyen, Nam H. and Gifford, Wesley M. and Reddy, Chandra and Kalagnanam, Jayant},
	month = jun,
	year = {2024},
	note = {arXiv:2401.03955 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning, notion},
}
