Performance evaluation of deep feature learning for RGB-D image/video classification

Performance evaluation of deep feature learning for RGB-D image/video classification. Shao, L., Cai, Z., Liu, L., & Lu, K. Information Sciences, 385-386:266–283, April 2017.

Deep Neural Networks for image/video classification have obtained much success in various computer vision applications. Existing deep learning algorithms are widely used on RGB images or video data. Meanwhile, with the development of low-cost RGB-D sensors (such as Microsoft Kinect and Xtion Pro-Live), high-quality RGB-D data can be easily acquired and used to enhance computer vision algorithms [14]. It would be interesting to investigate how deep learning can be employed for extracting and fusing features from RGB-D data. In this paper, after briefly reviewing the basic concepts of RGB-D information and four prevalent deep learning models (i.e., Deep Belief Networks (DBNs), Stacked Denoising Auto-Encoders (SDAE), Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) Neural Networks), we conduct extensive experiments on five popular RGB-D datasets including three image datasets and two video datasets. We then present a detailed analysis about the comparison between the learned feature representations from the four deep learning models. In addition, a few suggestions on how to adjust hyper-parameters for learning deep neural networks are made in this paper. According to the extensive experimental results, we believe that this evaluation will provide insights and a deeper understanding of different deep learning algorithms for RGB-D feature extraction and fusion.

@article{shao_performance_2017,
	title = {Performance evaluation of deep feature learning for {RGB}-{D} image/video classification},
	volume = {385-386},
	issn = {0020-0255},
	url = {http://www.sciencedirect.com/science/article/pii/S0020025517300191},
	doi = {10.1016/j.ins.2017.01.013},
	abstract = {Deep Neural Networks for image/video classification have obtained much success in various computer vision applications. Existing deep learning algorithms are widely used on RGB images or video data. Meanwhile, with the development of low-cost RGB-D sensors (such as Microsoft Kinect and Xtion Pro-Live), high-quality RGB-D data can be easily acquired and used to enhance computer vision algorithms [14]. It would be interesting to investigate how deep learning can be employed for extracting and fusing features from RGB-D data. In this paper, after briefly reviewing the basic concepts of RGB-D information and four prevalent deep learning models (i.e., Deep Belief Networks (DBNs), Stacked Denoising Auto-Encoders (SDAE), Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) Neural Networks), we conduct extensive experiments on five popular RGB-D datasets including three image datasets and two video datasets. We then present a detailed analysis about the comparison between the learned feature representations from the four deep learning models. In addition, a few suggestions on how to adjust hyper-parameters for learning deep neural networks are made in this paper. According to the extensive experimental results, we believe that this evaluation will provide insights and a deeper understanding of different deep learning algorithms for RGB-D feature extraction and fusion.},
	urldate = {2018-03-25},
	journal = {Information Sciences},
	author = {Shao, Ling and Cai, Ziyun and Liu, Li and Lu, Ke},
	month = apr,
	year = {2017},
	keywords = {Deep neural networks, Feature learning, Performance evaluation, RGB-D data},
	pages = {266--283}
}
