Sadiq, A. Z., Shen, H., Sen, T., Deng, N., & Xiang, S. Resource Overcommitment with Granular and Pattern-Based Machine Learning Predictions. In 2025 IEEE International Conference on Big Data (BigData), pages 229–238, Macau, China, December 2025. IEEE.
Cloud providers employ overcommitment, where the total resource demand of tasks assigned to a machine exceeds its physical capacity, thereby improving resource utilization. To guide the job scheduler to determine the tasks allocated to a machine, existing methods rely on conservative statistical methods, such as using the 98th percentile of resource usage of the machine over a time period, and predicting peak resource demand of the running tasks on the machine. Although these methods minimize the risk of task performance degradation, resources remain underutilized due to their inability to dynamically adapt to changing workloads. Also, many existing machine learning methods, such as traditional Long Short-Term Memory (LSTM) and transformer-based time-series prediction models, have been employed for future resource demand prediction. However, since these predictions are for the outcomes of the scheduler’s task assignment decisions, they cannot be used to guide the scheduler in determining the task allocation. To address these limitations, we experimentally evaluate the performance of LSTM, Bi-Directional LSTM with Attention (BDLA), iTransformer, and Chronos for predicting resource usage peak at the task level and find that BDLA achieves the best performance. We then extend BDLA to predict peak at the task-group and machine levels and observe that task-level and group-level predictions yield the highest accuracy. Furthermore, we enhance BDLA by augmenting its inputs with cluster information derived from the input time series. This augmented BDLA achieves up to 95.98% higher CPU savings and 29.63% higher memory savings than the state-of-the-art method at the same violation rate.
@inproceedings{sadiq_resource_2025,
	address = {Macau, China},
	title = {Resource {Overcommitment} with {Granular} and {Pattern}-{Based} {Machine} {Learning} {Predictions}},
	copyright = {https://doi.org/10.15223/policy-029},
	isbn = {979-8-3315-9447-3},
	url = {https://ieeexplore.ieee.org/document/11401586/},
	doi = {10.1109/BigData66926.2025.11401586},
	abstract = {Cloud providers employ overcommitment, where the total resource demand of tasks assigned to a machine exceeds its physical capacity, thereby improving resource utilization. To guide the job scheduler to determine the tasks allocated to a machine, existing methods rely on conservative statistical methods, such as using the 98th percentile of resource usage of the machine over a time period, and predicting peak resource demand of the running tasks on the machine. Although these methods minimize the risk of task performance degradation, resources remain underutilized due to their inability to dynamically adapt to changing workloads. Also, many existing machine learning methods, such as traditional Long Short-Term Memory (LSTM) and transformer-based time-series prediction models, have been employed for future resource demand prediction. However, since these predictions are for the outcomes of the scheduler’s task assignment decisions, they cannot be used to guide the scheduler in determining the task allocation. To address these limitations, we experimentally evaluate the performance of LSTM, Bi-Directional LSTM with Attention (BDLA), iTransformer, and Chronos for predicting resource usage peak at the task level and find that BDLA achieves the best performance. We then extend BDLA to predict peak at the task-group and machine levels and observe that task-level and group-level predictions yield the highest accuracy. Furthermore, we enhance BDLA by augmenting its inputs with cluster information derived from the input time series. This augmented BDLA achieves up to 95.98\% higher CPU savings and 29.63\% higher memory savings than the state-of-the-art method at the same violation rate.},
	language = {en},
	urldate = {2026-03-19},
	booktitle = {2025 {IEEE} {International} {Conference} on {Big} {Data} ({BigData})},
	publisher = {IEEE},
	author = {Sadiq, Ali Zafar and Shen, Haiying and Sen, Tanmoy and Deng, Nan and Xiang, Sunan},
	month = dec,
	year = {2025},
	pages = {229--238},
}