AP-MTL: Attention Pruned Multi-Task Learning Model for Real-Time Instrument Detection and Segmentation in Robot-Assisted Surgery

Islam, M., Vs, V., & Ren, H.

Surgical scene understanding and multi-task learning are crucial for image-guided robotic surgery. Training a real-time robotic system for the detection and segmentation of high-resolution images is a challenging problem with limited computational resources. The perception drawn can be applied in effective real-time feedback, surgical skill assessment, and human-robot collaborative surgeries to enhance surgical outcomes. For this purpose, we develop a novel end-to-end trainable real-time Multi-Task Learning (MTL) model with a weight-shared encoder and task-aware detection and segmentation decoders. Optimization of multiple tasks at the same convergence point is vital and presents a complex problem. Thus, we propose an asynchronous task-aware optimization (ATO) technique to calculate task-oriented gradients and train the decoders independently. Moreover, MTL models are always computationally expensive, which hinders real-time applications. To address this challenge, we introduce global attention dynamic pruning (GADP), which removes less significant and sparse parameters. We further design a skip squeeze and excitation (SE) module, which suppresses weak features, excites significant features, and performs dynamic spatial and channel-wise feature re-calibration. Validated on the robotic instrument segmentation dataset of the MICCAI endoscopic vision challenge, our model significantly outperforms state-of-the-art segmentation and detection models, including the best-performing models in the challenge.

@article{islam_ap-mtl_nodate,
	title = {{AP}-{MTL}: {Attention} {Pruned} {Multi}-{Task} {Learning} {Model} for {Real}-{Time} {Instrument} {Detection} and {Segmentation} in {Robot}-{Assisted} {Surgery}},
	abstract = {Surgical scene understanding and multi-task learning are crucial for image-guided robotic surgery. Training a real-time robotic system for the detection and segmentation of high-resolution images is a challenging problem with limited computational resources. The perception drawn can be applied in effective real-time feedback, surgical skill assessment, and human-robot collaborative surgeries to enhance surgical outcomes. For this purpose, we develop a novel end-to-end trainable real-time Multi-Task Learning (MTL) model with a weight-shared encoder and task-aware detection and segmentation decoders. Optimization of multiple tasks at the same convergence point is vital and presents a complex problem. Thus, we propose an asynchronous task-aware optimization (ATO) technique to calculate task-oriented gradients and train the decoders independently. Moreover, MTL models are always computationally expensive, which hinders real-time applications. To address this challenge, we introduce global attention dynamic pruning (GADP), which removes less significant and sparse parameters. We further design a skip squeeze and excitation (SE) module, which suppresses weak features, excites significant features, and performs dynamic spatial and channel-wise feature re-calibration. Validated on the robotic instrument segmentation dataset of the MICCAI endoscopic vision challenge, our model significantly outperforms state-of-the-art segmentation and detection models, including the best-performing models in the challenge.},
	language = {en},
	author = {Islam, Mobarakol and Vs, Vibashan and Ren, Hongliang},
	pages = {7},
}

