Sparse composition of body poses and atomic actions for human activity recognition in RGB-D videos. Lillo, I., Niebles, J., & Soto, A. Image and Vision Computing, 59(March):63-75, 2017.
This paper presents an approach to recognize human activities using body poses estimated from RGB-D data. We focus on recognizing complex activities composed of sequential or simultaneous atomic actions characterized by body motions of a single actor. We tackle this problem by introducing a hierarchical compositional model that operates at three levels of abstraction. At the lowest level, geometric and motion descriptors are used to learn a dictionary of body poses. At the intermediate level, sparse compositions of these body poses are used to obtain meaningful representations for atomic human actions. Finally, at the highest level, spatial and temporal compositions of these atomic actions are used to represent complex human activities. Our results show the benefits of using a hierarchical model that exploits the sharing and composition of body poses into atomic actions, and atomic actions into activities. A quantitative evaluation using two benchmark datasets illustrates the advantages of our model to perform action and activity recognition.
@Article{	  lillo:etal:2017,
  author	= {I. Lillo and J. C. Niebles and A. Soto},
  title		= {Sparse composition of body poses and atomic actions for
		  human activity recognition in {RGB-D} videos},
  journal	= {Image and Vision Computing},
  volume	= {59},
  month		= mar,
  pages		= {63--75},
  year		= {2017},
  abstract	= {This paper presents an approach to recognize human
		  activities using body poses estimated from RGB-D data. We
		  focus on recognizing complex activities composed of
		  sequential or simultaneous atomic actions characterized by
		  body motions of a single actor. We tackle this problem by
		  introducing a hierarchical compositional model that
		  operates at three levels of abstraction. At the lowest
		  level, geometric and motion descriptors are used to learn a
		  dictionary of body poses. At the intermediate level, sparse
		  compositions of these body poses are used to obtain
		  meaningful representations for atomic human actions.
		  Finally, at the highest level, spatial and temporal
		  compositions of these atomic actions are used to represent
		  complex human activities. Our results show the benefits of
		  using a hierarchical model that exploits the sharing and
		  composition of body poses into atomic actions, and atomic
		  actions into activities. A quantitative evaluation using
		  two benchmark datasets illustrates the advantages of our
		  model to perform action and activity recognition.},
  url		= {http://www.sciencedirect.com/science/article/pii/S0262885616301949}
}
