Modeling Video Dynamics with Deep Dynencoder. Yan, X., Chang, H., Shan, S., & Chen, X. In Fleet, D., Pajdla, T., Schiele, B., & Tuytelaars, T., editors, Computer Vision – ECCV 2014, Lecture Notes in Computer Science, pages 215–230, September 2014. Springer International Publishing.
Videos always exhibit various pattern motions, which can be modeled according to the dynamics between adjacent frames. Previous methods based on linear dynamic systems can model dynamic textures but have limited capacity for representing sophisticated nonlinear dynamics. Inspired by the nonlinear expressive power of deep autoencoders, we propose a novel model named dynencoder, which has an autoencoder at the bottom and a variant of it (named dynpredictor) at the top. It generates hidden states from raw pixel inputs via the autoencoder and then encodes the dynamics of state transitions over time via the dynpredictor. A deep dynencoder can be constructed by a proper stacking strategy and trained by layer-wise pre-training and joint fine-tuning. Experiments verify that our model can describe sophisticated video dynamics and synthesize endless video texture sequences with high visual quality. We also design classification and clustering methods based on our model and demonstrate their efficacy on traffic scene classification and motion segmentation.
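
For intuition, the following is a minimal PyTorch sketch of the single-layer model the abstract describes, not the authors' implementation: the encoder/decoder pair is the bottom autoencoder, the dynpredictor advances hidden states one step in time, and synthesis rolls the predictor forward from an encoded seed frame. Layer sizes, activations, the equal loss weighting, and all names here are illustrative assumptions; the deep variant's stacking and layer-wise pre-training are omitted.

import torch
import torch.nn as nn

class Dynencoder(nn.Module):
    # Sketch of a single-layer dynencoder: an autoencoder maps frames to
    # hidden states, and a dynpredictor maps the state at time t to the
    # state at time t+1. Dimensions and activations are assumptions.
    def __init__(self, frame_dim=1024, state_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(frame_dim, state_dim), nn.Sigmoid())
        self.decoder = nn.Linear(state_dim, frame_dim)
        self.dynpredictor = nn.Sequential(nn.Linear(state_dim, state_dim), nn.Sigmoid())

    def forward(self, frames):                 # frames: (T, frame_dim)
        states = self.encoder(frames)          # hidden states h_1 .. h_T
        recon = self.decoder(states)           # frame reconstructions
        pred = self.dynpredictor(states[:-1])  # predicted h_2 .. h_T
        return recon, states, pred

def dynencoder_loss(model, frames):
    # Joint objective: reconstruction term (autoencoder) plus state-transition
    # term (dynpredictor); the equal weighting is an assumption.
    recon, states, pred = model(frames)
    return ((recon - frames) ** 2).mean() + ((pred - states[1:]) ** 2).mean()

@torch.no_grad()
def synthesize(model, seed_frame, n_frames):
    # Encode a seed frame, then repeatedly predict the next state and decode
    # it, giving an arbitrarily long synthetic texture sequence.
    h = model.encoder(seed_frame)
    out = []
    for _ in range(n_frames):
        h = model.dynpredictor(h)
        out.append(model.decoder(h))
    return torch.stack(out)

Iterating the dynpredictor in this way is what enables the "endless" texture synthesis mentioned in the abstract: the output length is bounded only by how many states one chooses to roll out.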
@inproceedings{yan_modeling_2014,
	series = {Lecture {Notes} in {Computer} {Science}},
	title = {Modeling {Video} {Dynamics} with {Deep} {Dynencoder}},
	copyright = {©2014 Springer International Publishing Switzerland},
	isbn = {978-3-319-10592-5, 978-3-319-10593-2},
	url = {http://link.springer.com/chapter/10.1007/978-3-319-10593-2_15},
	abstract = {Videos always exhibit various pattern motions, which can be modeled according to the dynamics between adjacent frames. Previous methods based on linear dynamic systems can model dynamic textures but have limited capacity for representing sophisticated nonlinear dynamics. Inspired by the nonlinear expressive power of deep autoencoders, we propose a novel model named dynencoder, which has an autoencoder at the bottom and a variant of it (named dynpredictor) at the top. It generates hidden states from raw pixel inputs via the autoencoder and then encodes the dynamics of state transitions over time via the dynpredictor. A deep dynencoder can be constructed by a proper stacking strategy and trained by layer-wise pre-training and joint fine-tuning. Experiments verify that our model can describe sophisticated video dynamics and synthesize endless video texture sequences with high visual quality. We also design classification and clustering methods based on our model and demonstrate their efficacy on traffic scene classification and motion segmentation.},
	language = {en},
	urldate = {2017-02-20},
	booktitle = {Computer {Vision} – {ECCV} 2014},
	publisher = {Springer International Publishing},
	author = {Yan, Xing and Chang, Hong and Shan, Shiguang and Chen, Xilin},
	editor = {Fleet, David and Pajdla, Tomas and Schiele, Bernt and Tuytelaars, Tinne},
	month = sep,
	year = {2014},
	doi = {10.1007/978-3-319-10593-2_15},
	keywords = {Artificial Intelligence (incl. Robotics), Autoencoder, Computer Graphics, Deep Model, Dynamic Textures, Image Processing and Computer Vision, Pattern Recognition, Time Series, Video Dynamics},
	pages = {215--230}
}
