A Machine Learning Based Approach to Detect Machine Learning Design Patterns. Pan, W., Washizaki, H., Yoshioka, N., Fukazawa, Y., Khomh, F., & Guéhéneuc, Y. In Yi, J. & Leavens, G. T., editors, Proceedings of the 30th Asia-Pacific Software Engineering Conference (APSEC), pages 574–578, December 2023. IEEE CS Press. 5 pages. Early Research Achievements Track.
As machine learning expands to various domains, the demand for reusable solutions to similar problems increases. Machine learning design patterns are reusable solutions to design problems of machine learning applications. They can significantly enhance programmers' productivity in programming that requires machine learning algorithms. Given the critical role of machine learning design patterns, the automated detection of them becomes equally vital. However, identifying design patterns can be time-consuming and error-prone. We propose an approach to detect their occurrences in Python files. Our approach uses an Abstract Syntax Tree (AST) of Python files to build a corpus of data and train a refined Text-CNN model to automatically identify machine learning design patterns. We empirically validate our approach by conducting an exploratory study to detect four common machine learning design patterns: Embedding, Multilabel, Feature Cross, and Hashed Feature. We manually label 450 Python code files containing these design patterns from repositories of projects in GitHub. Our approach achieves accuracy values ranging from 80% to 92% for each of the four patterns.
@INPROCEEDINGS{Pan23-APSEC-ERA-ML4MLDPDetection,
AUTHOR = {Weitao Pan and Hironori Washizaki and Nobukazu Yoshioka and
Yoshiaki Fukazawa and Foutse Khomh and Yann-Gaël Guéhéneuc},
BOOKTITLE = {Proceedings of the 30\textsuperscript{th} Asia-Pacific Software Engineering Conference (APSEC)},
TITLE = {A Machine Learning Based Approach to Detect Machine
Learning Design Patterns},
YEAR = {2023},
OPTADDRESS = {},
OPTCROSSREF = {},
EDITOR = {Joo-yong Yi and Gary T. Leavens},
MONTH = {December},
NOTE = {5 pages. Early Research Achievements Track.},
OPTNUMBER = {},
OPTORGANIZATION = {},
PAGES = {574--578},
PUBLISHER = {IEEE CS Press},
OPTSERIES = {},
OPTVOLUME = {},
KEYWORDS = {Topic: <b>Design patterns</b>, Venue: <c>APSEC</c>},
URL = {http://www.ptidej.net/publications/documents/APSEC23.doc.pdf},
PDF = {http://www.ptidej.net/publications/documents/APSEC23.ppt.pdf},
ABSTRACT = {As machine learning expands to various domains, the
demand for reusable solutions to similar problems increases. Machine
learning design patterns are reusable solutions to design problems of
machine learning applications. They can significantly enhance
programmers' productivity in programming that requires machine
learning algorithms. Given the critical role of machine learning
design patterns, the automated detection of them becomes equally
vital. However, identifying design patterns can be time-consuming and
error-prone. We propose an approach to detect their occurrences in
Python files. Our approach uses an Abstract Syntax Tree (AST) of
Python files to build a corpus of data and train a refined Text-CNN
model to automatically identify machine learning design patterns. We
empirically validate our approach by conducting an exploratory study
to detect four common machine learning design patterns: Embedding,
Multilabel, Feature Cross, and Hashed Feature. We manually label 450
Python code files containing these design patterns from repositories
of projects in GitHub. Our approach achieves accuracy values ranging
from 80\% to 92\% for each of the four patterns.}
}