Diagnostic classifiers: Revealing how neural networks process hierarchical structure. Veldhoen, S., Hupkes, D., & Zuidema, W. CEUR Workshop Proceedings, 2016.
abstract   bibtex   
We investigate how neural networks can be used for hierarchical, compositional semantics. To this end, we define the simple but nontrivial artificial task of pro-cessing nested arithmetic expressions and study whether different types of neural networks can learn to add and subtract. We find that recursive neural networks can implement a generalising solution, and we visualise the intermediate steps: projection, summation and squashing. We also show that gated recurrent neural networks, which process the expressions incrementally, perform surprisingly well on this task: they learn to predict the outcome of the arithmetic expressions with reasonable accuracy, although performance deteriorates with increasing length. To analyse what strategy the recurrent network applies, visualisation techniques are less insightful. Therefore, we develop an approach where we formulate and test hypotheses on what strategies these networks might be following. For each hypoth-esis, we derive predictions about features of the hidden state representations at each time step, and train 'diagnostic classifiers' to test those predictions. Our results indicate the networks follow a strategy similar to our hypothesised 'incremental strategy'.
@article{Veldhoen2016,
abstract = {We investigate how neural networks can be used for hierarchical, compositional semantics. To this end, we define the simple but nontrivial artificial task of pro-cessing nested arithmetic expressions and study whether different types of neural networks can learn to add and subtract. We find that recursive neural networks can implement a generalising solution, and we visualise the intermediate steps: projection, summation and squashing. We also show that gated recurrent neural networks, which process the expressions incrementally, perform surprisingly well on this task: they learn to predict the outcome of the arithmetic expressions with reasonable accuracy, although performance deteriorates with increasing length. To analyse what strategy the recurrent network applies, visualisation techniques are less insightful. Therefore, we develop an approach where we formulate and test hypotheses on what strategies these networks might be following. For each hypoth-esis, we derive predictions about features of the hidden state representations at each time step, and train 'diagnostic classifiers' to test those predictions. Our results indicate the networks follow a strategy similar to our hypothesised 'incremental strategy'.},
author = {Veldhoen, Sara and Hupkes, Dieuwke and Zuidema, Willem},
file = {:Users/shanest/Documents/Library/Veldhoen, Hupkes, Zuidema/CEUR Workshop Proceedings/Veldhoen, Hupkes, Zuidema - 2016 - Diagnostic classifiers Revealing how neural networks process hierarchical structure.pdf:pdf},
issn = {16130073},
journal = {CEUR Workshop Proceedings},
keywords = {method: diagnostic classifier},
title = {{Diagnostic classifiers: Revealing how neural networks process hierarchical structure}},
volume = {1773},
year = {2016}
}

Downloads: 0