Separable Layers Enable Structured Efficient Linear Substitutions

G Gray, EJ Crowley, A Storkey - arXiv preprint arXiv:1906.00859, 2019 - arxiv.org
In response to the recent development of efficient dense layers, this paper shows that something as simple as replacing the linear components in pointwise convolutions with structured linear decompositions also produces substantial gains in the efficiency/accuracy tradeoff. Pointwise convolutions are fully connected layers applied at each spatial position, and are thus well suited to replacement by structured transforms. Networks using such layers are able to learn the same tasks as those using standard convolutions, and provide Pareto-optimal benefits in efficiency/accuracy, both in terms of computation (mult-adds) and parameter count (and hence memory). Code is available at https://github.com/BayesWatch/deficient-efficient.
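
To make the abstract's claim concrete, the following is a minimal sketch in plain Python of why a pointwise (1x1) convolution can be swapped for a structured linear map, and what that does to parameter count and mult-adds. A rank-r factorization W = U V is used here only because it is the simplest structured decomposition to write down; it is an assumption for illustration and not necessarily one of the decompositions the paper evaluates. All function names, layer sizes, and matrices are hypothetical.

```python
# Illustrative sketch only: a low-rank factorization stands in for the
# "structured linear decompositions" named in the abstract.

def pointwise_cost(c_in, c_out, height, width):
    """Parameters and mult-adds of a dense 1x1 convolution (no bias)."""
    params = c_in * c_out
    # The same c_out x c_in matrix is applied at every spatial position.
    mult_adds = params * height * width
    return params, mult_adds


def low_rank_cost(c_in, c_out, rank, height, width):
    """Cost of the same layer factored as W = U V with inner dimension `rank`."""
    params = rank * (c_in + c_out)
    mult_adds = params * height * width
    return params, mult_adds


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Hypothetical layer: 256 -> 256 channels on a 32x32 feature map, rank 32.
dense_params, dense_macs = pointwise_cost(256, 256, 32, 32)
fact_params, fact_macs = low_rank_cost(256, 256, 32, 32, 32)

# At a single pixel, applying V and then U gives the same output as applying
# the full matrix U V, so the factored layer is still a linear map on channels.
U = [[1, 2], [0, 1], [3, -1]]        # c_out=3, rank=2
V = [[1, 0, 2, 1], [0, 1, -1, 2]]    # rank=2, c_in=4
pixel = [1, 2, 3, 4]                 # channel vector at one spatial position
dense_out = matvec(matmul(U, V), pixel)
factored_out = matvec(U, matvec(V, pixel))

if __name__ == "__main__":
    print("dense:   ", dense_params, "params,", dense_macs, "mult-adds")
    print("factored:", fact_params, "params,", fact_macs, "mult-adds")
    print("outputs match:", dense_out == factored_out)
```

Under these assumed sizes the factored layer uses a quarter of the parameters and mult-adds of the dense one. Whether accuracy is preserved at a given rank is an empirical question that this sketch does not address.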