ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems. Zimelewicz, E., Kalinowski, M., Mendez, D., Giray, G., Santos Alves, A. P., Lavesson, N., Azevedo, K., Villamizar, H., Escovedo, T., Lopes, H., Biffl, S., Musil, J., Felderer, M., Wagner, S., Baldassarre, T., & Gorschek, T. In Bludau, P., Ramler, R., Winkler, D., & Bergsmann, J., editors, 16th International Conference on Software Quality, Software Quality Days SWQD 2024, Vienna, Austria, April 23-25, pages 112–131, Cham, 2024. Springer Nature Switzerland.
[Context] Systems that incorporate Machine Learning (ML) models, often referred to as ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited; this is especially true for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and monitoring phases of the ML life cycle. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered, collecting a total of 188 complete responses from 25 countries. We analyzed the status quo and the problems reported for the model deployment and monitoring phases: contemporary practices were analyzed using bootstrapping with confidence intervals, and the reported problems were analyzed qualitatively using open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and in integrating with legacy applications. Concerning model monitoring, many models in production are not monitored. The main monitored aspects are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and the problems encountered in practice, and support guiding ML deployment and monitoring research in a problem-driven manner.
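
The abstract mentions bootstrapping with confidence intervals but does not say which bootstrap variant was used. As a rough illustration only, the following Python sketch applies a percentile bootstrap to a single yes/no survey item. The function name `bootstrap_proportion_ci`, the resample count, and the answer data (70 "yes" out of 188) are hypothetical and are not taken from the paper; only the sample size of 188 matches the abstract.

```python
import random


def bootstrap_proportion_ci(responses, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the share of True values.

    Assumption: the percentile method is used here for simplicity;
    the paper's exact procedure is not described in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(responses)
    proportions = []
    for _ in range(n_resamples):
        # Resample the answers with replacement, keeping the sample size.
        resample = rng.choices(responses, k=n)
        proportions.append(sum(resample) / n)
    proportions.sort()

    # Read the interval bounds off the sorted resampled proportions.
    alpha = 1.0 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    point = sum(responses) / n
    return point, proportions[lower_idx], proportions[upper_idx]


# Hypothetical data: 188 respondents, 70 of whom answered "yes".
answers = [True] * 70 + [False] * 118
point, low, high = bootstrap_proportion_ci(answers)
print(f"proportion = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Reporting an interval rather than a bare percentage makes the sampling uncertainty of a survey of this size visible, which is presumably why this kind of analysis suits status quo questions.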
