ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems

ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems. Zimelewicz, E., Kalinowski, M., Mendez, D., Giray, G., Santos Alves, A. P., Lavesson, N., Azevedo, K., Villamizar, H., Escovedo, T., Lopes, H., Biffl, S., Musil, J., Felderer, M., Wagner, S., Baldassarre, T., & Gorschek, T. In Bludau, P., Ramler, R., Winkler, D., & Bergsmann, J., editors, 16th International Conference on Software Quality, Software Quality Days SWQD 2024, Vienna, Austria, April 23-25, pages 112–131, Cham, 2024. Springer Nature Switzerland.

[Context] Systems that incorporate Machine Learning (ML) models, often referred to as ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited; this is especially true for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and the monitoring ML life cycle phases. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered. We gathered a total of 188 complete responses from 25 countries. We analyze the status quo and problems reported for the model deployment and monitoring phases. We analyzed contemporary practices using bootstrapping with confidence intervals and conducted qualitative analyses on the reported problems applying open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and legacy application integration. Concerning model monitoring, many models in production are not monitored. The main monitored aspects are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and problems in practice and support guiding ML deployment and monitoring research in a problem-driven manner.

@InProceedings{ZimelewiczEtAl24,
  author="Zimelewicz, Eduardo
  and Kalinowski, Marcos
  and Mendez, Daniel
  and Giray, G{\"o}rkem
  and Santos Alves, Antonio Pedro
  and Lavesson, Niklas
  and Azevedo, Kelly
  and Villamizar, Hugo
  and Escovedo, Tatiana
  and Lopes, Helio
  and Biffl, Stefan
  and Musil, Juergen
  and Felderer, Michael
  and Wagner, Stefan
  and Baldassarre, Teresa
  and Gorschek, Tony",
  editor="Bludau, Peter
  and Ramler, Rudolf
  and Winkler, Dietmar
  and Bergsmann, Johannes",
  title="ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems",
  booktitle="16th International Conference on Software Quality, Software Quality Days {SWQD} 2024, Vienna, Austria, April 23-25",
  year="2024",
  publisher="Springer Nature Switzerland",
  address="Cham",
  pages="112--131",
  abstract="[Context] Systems that incorporate Machine Learning (ML) models, often referred to as ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited; this is especially true for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and the monitoring ML life cycle phases. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered. We gathered a total of 188 complete responses from 25 countries. We analyze the status quo and problems reported for the model deployment and monitoring phases. We analyzed contemporary practices using bootstrapping with confidence intervals and conducted qualitative analyses on the reported problems applying open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and legacy application integration. Concerning model monitoring, many models in production are not monitored. The main monitored aspects are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and problems in practice and support guiding ML deployment and monitoring research in a problem-driven manner.",
  isbn="978-3-031-56281-5",
  urlAuthor_version = {http://www.inf.puc-rio.br/~kalinowski/publications/ZimelewiczEtAl24.pdf}  
}
