Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding

Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding. Cabral, R. & Kalinowski, M. In Proceedings of the XXIII Brazilian Symposium on Software Quality (SBQS '24), pages 703–705, 2024. Association for Computing Machinery. Summary for the "Third Best Brazilian Software Quality MS Dissertation Award", received at SBQS 2024. Student: Raphael Cabral, Advisor: Marcos Kalinowski.

[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn’t adhere to software design best practices. [Goal] To better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The dissertation results provide statistically significant evidence that adopting the SOLID design principles can improve code understanding within ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the quality and maintainability of ML code.

@inproceedings{CabralK24,
  author = {Cabral, Raphael and Kalinowski, Marcos},
  title = {Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding},
  year = {2024},
  isbn = {9798400717772},
  publisher = {Association for Computing Machinery},
  urlAuthor_version = {http://www.inf.puc-rio.br/~kalinowski/publications/CabralK24.pdf},
  doi = {10.1145/3701625.3701695},
  abstract = {[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn’t adhere to software design best practices. [Goal] To better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The dissertation results provide statistically significant evidence that adopting the SOLID design principles can improve code understanding within ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the quality and maintainability of ML code.},
  booktitle = {Proceedings of the XXIII Brazilian Symposium on Software Quality},
  pages = {703--705},
  numpages = {3},
  keywords = {SOLID Design Principles, Machine Learning, Code Understanding},
  location = {Salvador, Brazil},
  note = {Summary for the "Third Best Brazilian Software Quality MS Dissertation Award", received at SBQS 2024. Student: Raphael Cabral, Advisor: Marcos Kalinowski.},
  series = {SBQS '24}
}
