Fixed Point Quantization of Deep Convolutional Networks

Fixed Point Quantization of Deep Convolutional Networks. Lin, D. D., Talathi, S. S., & Annapureddy, V. S. 33rd International Conference on Machine Learning, ICML 2016, 6:4166-4175, International Machine Learning Society (IMLS), November 2015.

In recent years, increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost performance on image recognition tasks. However, the gains in performance have come at the cost of a substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify the optimal fixed point bit-width allocation across DCN layers. Our experiments show that, in comparison to equal bit-width settings, fixed point DCNs with optimized bit-width allocation offer a >20% reduction in model size without any loss in accuracy on the CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error rate on the CIFAR-10 benchmark.
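
The abstract mentions two ideas that a small example may make more concrete: quantizing values to fixed point, and giving different layers different bit-widths to shrink the model. The Python sketch below illustrates both under stated assumptions. It is not the authors' quantizer design or their optimization procedure; the function names, the signed fixed-point format (total bits plus fractional bits), and the layer sizes are all hypothetical and chosen only for illustration.

```python
def quantize_fixed_point(x, bit_width, frac_bits):
    """Round x to a signed fixed-point value (illustrative helper).

    bit_width: total number of bits, including the sign bit.
    frac_bits: number of bits after the binary point.
    Values outside the representable range saturate.
    """
    step = 2.0 ** -frac_bits              # value of one least-significant bit
    q_min = -(2 ** (bit_width - 1))       # most negative integer code
    q_max = 2 ** (bit_width - 1) - 1      # most positive integer code
    code = round(x / step)                # nearest code (Python rounds ties to even)
    code = max(q_min, min(q_max, code))   # saturate instead of wrapping around
    return code * step


def model_size_bits(params_per_layer, bits_per_layer):
    """Total weight storage in bits for a per-layer bit-width assignment."""
    return sum(n * b for n, b in zip(params_per_layer, bits_per_layer))


# 8-bit format with 6 fractional bits: step 1/64, range [-2.0, 1.984375].
print(quantize_fixed_point(0.3, 8, 6))    # 0.296875
print(quantize_fixed_point(5.0, 8, 6))    # 1.984375 (saturated)

# Hypothetical three-layer network; these parameter counts are made up.
layer_params = [1000, 2000, 4000]
equal = model_size_bits(layer_params, [8, 8, 8])
mixed = model_size_bits(layer_params, [8, 8, 4])
print(equal, mixed)                       # 56000 40000
```

In this toy setting, lowering only the largest layer from 8 to 4 bits cuts storage from 56000 to 40000 bits. The paper's contribution is choosing such per-layer bit-widths by solving an optimization problem rather than by hand, which this sketch does not attempt.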

@article{
 title = {Fixed Point Quantization of Deep Convolutional Networks},
 type = {article},
 year = {2015},
 pages = {4166-4175},
 volume = {6},
 websites = {http://arxiv.org/abs/1511.06393},
 month = {11},
 publisher = {International Machine Learning Society (IMLS)},
 day = {19},
 id = {d7515001-2b52-3056-b6be-5083f956188c},
 created = {2021-06-14T08:26:23.197Z},
 accessed = {2021-06-14},
 file_attached = {true},
 profile_id = {48fc0258-023d-3602-860e-824092d62c56},
 group_id = {1ff583c0-be37-34fa-9c04-73c69437d354},
 last_modified = {2021-06-14T08:31:46.321Z},
 read = {false},
 starred = {false},
 authored = {false},
 confirmed = {false},
 hidden = {false},
 folder_uuids = {c9e2a751-ce83-45dd-9c0e-bdac57df3cf4,cf9189f6-f354-4337-8aaf-a5f12cbf8660},
 private_publication = {false},
 abstract = {In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify optimal fixed point bit-width allocation across DCN layers. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit width allocation offer >20% reduction in the model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error-rate on CIFAR-10 benchmark.},
 bibtype = {article},
 author = {Lin, Darryl D. and Talathi, Sachin S. and Annapureddy, V. Sreekanth},
 journal = {33rd International Conference on Machine Learning, ICML 2016}
}
