End-to-End Joint Semantic Segmentation of Actors and Actions in Video. Ji, J., Buch, S., Niebles, J., & Soto, A. In ECCV, 2018.

Traditional video understanding tasks include human action recognition and actor-object semantic segmentation. However, the joint task of providing semantic segmentation for different actor classes simultaneously with their action class remains a challenging but necessary task for many applications. In this work, we propose a new end-to-end architecture for tackling this joint task in videos. Our model effectively leverages multiple input modalities, contextual information, and joint multitask learning in the video to directly output semantic segmentations in a single unified framework. We train and benchmark our model on the large-scale Actor-Action Dataset (A2D) for joint actor-action semantic segmentation, and demonstrate state-of-the-art performance for both segmentation and detection. We also perform experiments verifying our joint approach improves performance for zero-shot understanding, indicating generalizability of our jointly learned feature space.
@InProceedings{ jingwei:etal:2018,
author = {J. Ji and S. Buch and J. C. Niebles and A. Soto},
title = {End-to-End Joint Semantic Segmentation of Actors and
Actions in Video},
booktitle = {{ECCV}},
year = {2018},
abstract = {Traditional video understanding tasks include human action
recognition and actor-object semantic segmentation.
However, the joint task of providing semantic segmentation
for different actor classes simultaneously with their
action class remains a challenging but necessary task for
many applications. In this work, we propose a new
end-to-end architecture for tackling this joint task in
videos. Our model effectively leverages multiple input
modalities, contextual information, and joint multitask
learning in the video to directly output semantic
segmentations in a single unified framework. We train and
benchmark our model on the large-scale Actor-Action Dataset
(A2D) for joint actor-action semantic segmentation, and
demonstrate state-of-the-art performance for both
segmentation and detection. We also perform experiments
verifying our joint approach improves performance for
zero-shot understanding, indicating generalizability of our
jointly learned feature space.},
url = {http://svl.stanford.edu/assets/papers/ji2018eccv.pdf}
}