Multimodal Deep Learning. Akkus, C., Chu, L., Djakovic, V., Jauch-Walser, S., Koch, P., Loss, G., Marquardt, C., Moldovan, M., Sauter, N., Schneider, M., Schulte, R., Urbanczyk, K., Goschenhofer, J., Heumann, C., Hvingelby, R., Schalk, D., & Aßenmacher, M. January 2023. arXiv:2301.04856 [cs, stat]
This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.
@misc{akkus_multimodal_2023,
title = {Multimodal {Deep} {Learning}},
url = {http://arxiv.org/abs/2301.04856},
doi = {10.48550/arXiv.2301.04856},
abstract = {This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.},
urldate = {2023-07-24},
publisher = {arXiv},
author = {Akkus, Cem and Chu, Luyang and Djakovic, Vladana and Jauch-Walser, Steffen and Koch, Philipp and Loss, Giacomo and Marquardt, Christopher and Moldovan, Marco and Sauter, Nadja and Schneider, Maximilian and Schulte, Rickmer and Urbanczyk, Karol and Goschenhofer, Jann and Heumann, Christian and Hvingelby, Rasmus and Schalk, Daniel and Aßenmacher, Matthias},
month = jan,
year = {2023},
note = {arXiv:2301.04856 [cs, stat]},
keywords = {Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning, notion},
}
{"_id":"ArLqf7mynNPgTgKCm","bibbaseid":"akkus-chu-djakovic-jauchwalser-koch-loss-marquardt-moldovan-etal-multimodaldeeplearning-2023","author_short":["Akkus, C.","Chu, L.","Djakovic, V.","Jauch-Walser, S.","Koch, P.","Loss, G.","Marquardt, C.","Moldovan, M.","Sauter, N.","Schneider, M.","Schulte, R.","Urbanczyk, K.","Goschenhofer, J.","Heumann, C.","Hvingelby, R.","Schalk, D.","Aßenmacher, M."],"bibdata":{"bibtype":"misc","type":"misc","title":"Multimodal Deep Learning","url":"http://arxiv.org/abs/2301.04856","doi":"10.48550/arXiv.2301.04856","abstract":"This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.","urldate":"2023-07-24","publisher":"arXiv","author":[{"propositions":[],"lastnames":["Akkus"],"firstnames":["Cem"],"suffixes":[]},{"propositions":[],"lastnames":["Chu"],"firstnames":["Luyang"],"suffixes":[]},{"propositions":[],"lastnames":["Djakovic"],"firstnames":["Vladana"],"suffixes":[]},{"propositions":[],"lastnames":["Jauch-Walser"],"firstnames":["Steffen"],"suffixes":[]},{"propositions":[],"lastnames":["Koch"],"firstnames":["Philipp"],"suffixes":[]},{"propositions":[],"lastnames":["Loss"],"firstnames":["Giacomo"],"suffixes":[]},{"propositions":[],"lastnames":["Marquardt"],"firstnames":["Christopher"],"suffixes":[]},{"propositions":[],"lastnames":["Moldovan"],"firstnames":["Marco"],"suffixes":[]},{"propositions":[],"lastnames":["Sauter"],"firstnames":["Nadja"],"suffixes":[]},{"propositions":[],"lastnames":["Schneider"],"firstnames":["Maximilian"],"suffixes":[]},{"propositions":[],"lastnames":["Schulte"],"firstnames":["Rickmer"],"suffixes":[]},{"propositions":[],"lastnames":["Urbanczyk"],"firstnames":["Karol"],"suffixes":[]},{"propositions":[],"lastnames":["Goschenhofer"],"firstnames":["Jann"],"suffixes":[]},{"propositions":[],"lastnames":["Heumann"],"firstnames":["Christian"],"suffixes":[]},{"propositions":[],"lastnames":["Hvingelby"],"firstnames":["Rasmus"],"suffixes":[]},{"propositions":[],"lastnames":["Schalk"],"firstnames":["Daniel"],"suffixes":[]},{"propositions":[],"lastnames":["Aßenmacher"],"firstnames":["Matthias"],"suffixes":[]}],"month":"January","year":"2023","note":"arXiv:2301.04856 [cs, stat]","keywords":"Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning, notion","bibtex":"@misc{akkus_multimodal_2023,\n\ttitle = {Multimodal {Deep} {Learning}},\n\turl = {http://arxiv.org/abs/2301.04856},\n\tdoi = {10.48550/arXiv.2301.04856},\n\tabstract = {This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. 
Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.},\n\turldate = {2023-07-24},\n\tpublisher = {arXiv},\n\tauthor = {Akkus, Cem and Chu, Luyang and Djakovic, Vladana and Jauch-Walser, Steffen and Koch, Philipp and Loss, Giacomo and Marquardt, Christopher and Moldovan, Marco and Sauter, Nadja and Schneider, Maximilian and Schulte, Rickmer and Urbanczyk, Karol and Goschenhofer, Jann and Heumann, Christian and Hvingelby, Rasmus and Schalk, Daniel and Aßenmacher, Matthias},\n\tmonth = jan,\n\tyear = {2023},\n\tnote = {arXiv:2301.04856 [cs, stat]},\n\tkeywords = {Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning, notion},\n}\n\n","author_short":["Akkus, C.","Chu, L.","Djakovic, V.","Jauch-Walser, S.","Koch, P.","Loss, G.","Marquardt, C.","Moldovan, M.","Sauter, N.","Schneider, M.","Schulte, R.","Urbanczyk, K.","Goschenhofer, J.","Heumann, C.","Hvingelby, R.","Schalk, D.","Aßenmacher, M."],"key":"akkus_multimodal_2023","id":"akkus_multimodal_2023","bibbaseid":"akkus-chu-djakovic-jauchwalser-koch-loss-marquardt-moldovan-etal-multimodaldeeplearning-2023","role":"author","urls":{"Paper":"http://arxiv.org/abs/2301.04856"},"keyword":["Computer Science - Computation and Language","Computer Science - Machine Learning","Statistics - Machine Learning","notion"],"metadata":{"authorlinks":{}},"html":""},"bibtype":"misc","biburl":"https://api.zotero.org/users/7461051/collections/YGWEDN7F/items?key=JesLwColmDamE3ak4jR0GxhE&format=bibtex&limit=100","dataSources":["p2uqc5vSxe2qN6jKS"],"keywords":["computer science - computation and language","computer science - machine learning","statistics - machine learning","notion"],"search_terms":["multimodal","deep","learning","akkus","chu","djakovic","jauch-walser","koch","loss","marquardt","moldovan","sauter","schneider","schulte","urbanczyk","goschenhofer","heumann","hvingelby","schalk","aßenmacher"],"title":"Multimodal Deep Learning","year":2023}