# Trained Rank Pruning for Efficient Deep Neural Networks

```bibtex
@article{Xu2019TrainedRP,
  title   = {Trained Rank Pruning for Efficient Deep Neural Networks},
  author  = {Yuhui Xu and Yuxi Li and Shuai Zhang and Wei Wen and Botao Wang and Yingyong Qi and Yiran Chen and Weiyao Lin and Hongkai Xiong},
  journal = {2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS)},
  year    = {2019},
  pages   = {14-17}
}
```

To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss. It is therefore suboptimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and…
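The core idea the abstract refers to, replacing a trained weight matrix with a low-rank factorization, can be illustrated with a minimal NumPy sketch (my own illustration, not the authors' code): a dense layer's weight matrix is replaced by a truncated SVD, trading a small reconstruction error for far fewer parameters.

```python
import numpy as np

def low_rank_approx(W, rank):
    """Truncated-SVD approximation of a weight matrix W (illustrative sketch)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the top-`rank` singular triplets: W ~= (U_r * s_r) @ Vt_r
    A = U[:, :rank] * s[:rank]   # (m, rank)
    B = Vt[:rank, :]             # (rank, n)
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = low_rank_approx(W, rank=8)
# Parameter count drops from 64*64 = 4096 to 64*8 + 8*64 = 1024,
# at the cost of a reconstruction error ||W - A @ B||.
```

Applying this directly to a pre-trained model is exactly the setting the abstract criticizes: the reconstruction error is ignored by training, which is why the paper instead imposes low-rank structure during training.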


#### 16 Citations

TRP: Trained Rank Pruning for Efficient Deep Neural Networks

- Computer Science
- IJCAI
- 2020

Trained Rank Pruning (TRP) is proposed, which alternates between low-rank approximation and training; it maintains the capacity of the original network while imposing low-rank constraints during training.

Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification

- Computer Science, Mathematics
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2020

SVD training is proposed, the first method to explicitly achieve low-rank DNNs during training without applying SVD at every step; experiments show that it can significantly reduce the rank of DNN layers and achieve a greater reduction in computational load at the same accuracy.
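The singular value sparsification mentioned above can be sketched as a soft-thresholding (L1 proximal) step on a layer's singular values; this is an illustrative stand-in under my own assumptions, not the paper's implementation:

```python
import numpy as np

def sparsify_singular_values(W, threshold):
    """Soft-threshold the singular values of W; small values are driven to
    exactly zero, lowering the rank (illustrative sketch, not the authors' code)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - threshold, 0.0)  # L1 proximal step zeroes small values
    return U @ np.diag(s) @ Vt, int(np.count_nonzero(s))

rng = np.random.default_rng(1)
W = rng.standard_normal((32, 32))
W_low, rank = sparsify_singular_values(W, threshold=2.0)
# `rank` is the number of surviving singular values; a larger threshold
# produces a lower-rank matrix.
```

In an actual training loop such a step would be interleaved with gradient updates, so the network learns to concentrate its weight energy in a few singular directions.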

Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer

- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020

It is shown that this approach selects ranks much better than existing approaches, making low-rank compression much more attractive than previously thought, and can make a VGG network faster than a ResNet with nearly the same classification error.

Compression-aware Continual Learning using Singular Value Decomposition

- Computer Science
- ArXiv
- 2020

This work employs compression-aware training and performs low-rank weight approximations using singular value decomposition (SVD) to achieve network compaction and introduces a novel shared representational space based learning between tasks.

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing

- Computer Science
- IEEE Signal Processing Letters
- 2021

This work proposes an effective one-stage pruning framework: introducing a trainable collaborative layer to jointly prune and learn neural networks in one go, and demonstrates very promising results against other state-of-the-art filter pruning methods.

Scalable Deep Neural Networks via Low-Rank Matrix Factorization

- Computer Science, Mathematics
- ArXiv
- 2019

A novel method is proposed that enables DNNs to flexibly change their size after training via singular value decomposition (SVD), compressing model complexity while increasing the error as little as possible.

Neural Network Compression via Additive Combination of Reshaped, Low-Rank Matrices

- Computer Science
- 2021 Data Compression Conference (DCC)
- 2021

This work considers a form of network compression that has not been explored before: an additive combination of reshaped low-rank matrices, which results in a "Learning-Compression" algorithm which alternates between a standard machine learning step and a step involving signal compression.

Optimal Selection of Matrix Shape and Decomposition Scheme for Neural Network Compression

- Computer Science
- ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2021

The algorithm automatically selects the most suitable ranks and decomposition schemes to efficiently reduce compression costs (e.g., FLOPs) of various networks.

Principal Component Networks: Parameter Reduction Early in Training

- Computer Science, Mathematics
- ArXiv
- 2020

This paper shows how to find small networks that exhibit the same performance as their overparameterized counterparts after only a few training epochs; it uses PCA to find a high-variance basis for layer inputs and represents layer weights in those directions.
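The PCA-based compression described above can be sketched in NumPy (an illustration under my own assumptions, not the paper's code): project layer inputs onto their top-k principal directions and re-express the weights in that basis, so the layer stores a small projection plus a small weight matrix.

```python
import numpy as np

def pcn_compress(X, W, k):
    """Project layer inputs onto their top-k PCA directions and re-express
    the layer weights W in that basis (illustrative sketch)."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of centered inputs = high-variance input basis.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                  # (d_in, k) projection basis
    W_small = P.T @ W             # (k, d_out) weights in the PCA basis
    return P, W_small

# Inputs that actually lie in a 4-dimensional subspace of R^32
rng = np.random.default_rng(2)
X = rng.standard_normal((256, 4)) @ rng.standard_normal((4, 32))
W = rng.standard_normal((32, 8))
P, W_small = pcn_compress(X, W, k=4)
# (X @ P) @ W_small approximates X @ W with far fewer weight parameters.
```

When the inputs genuinely concentrate in a low-dimensional subspace, as in this toy example, the projected layer reproduces the original layer's outputs almost exactly.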

Refining the Structure of Neural Networks Using Matrix Conditioning

- Mathematics, Computer Science
- ArXiv
- 2019

This work proposes a practical method that employs matrix conditioning to automatically design the structure of the layers of a feed-forward network, by first adjusting the proportion of neurons among the layers of a network and then scaling the size of the network up or down.

#### References

Showing 1-10 of 40 references

Coordinating Filters for Faster Deep Neural Networks

- Computer Science
- 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017

Force Regularization is proposed, which uses attractive forces on filters to coordinate more weight information into a lower-rank space; it is mathematically and empirically verified that, after applying this technique, standard LRA methods can reconstruct filters using a much smaller basis and thus yield faster DNNs.

Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks

- Computer Science
- ECCV
- 2018

COBLA, which is approximately solved by sequential quadratic programming, is empirically demonstrated to outperform prior art on the SqueezeNet and VGG-16 architectures on the ImageNet dataset.

Learning Structured Sparsity in Deep Neural Networks

- Computer Science, Mathematics
- NIPS
- 2016

The results show that, for CIFAR-10, regularization on layer depth can reduce a Deep Residual Network from 20 layers to 18 layers while improving the accuracy from 91.25% to 92.60%, which is still slightly higher than that of the original ResNet with 32 layers.

Convolutional neural networks with low-rank regularization

- Computer Science, Mathematics
- ICLR
- 2016

A new algorithm is proposed for computing a low-rank tensor decomposition that removes redundancy in the convolution kernels; it is more effective than iterative methods for speeding up large CNNs.

Training Quantized Nets: A Deeper Understanding

- Computer Science, Mathematics
- NIPS
- 2017

This work investigates training methods for quantized neural networks from a theoretical viewpoint, explores accuracy guarantees for training methods under convexity assumptions, and shows that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic.

ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

- Computer Science
- 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017

ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both the training and inference stages; it reveals that filters need to be pruned based on statistics computed from the next layer, not the current layer, which differentiates ThiNet from existing methods.

Wide Residual Networks

- Computer Science
- BMVC
- 2016

This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture where the depth of residual networks is decreased and the width is increased; the resulting network structures are called wide residual networks (WRNs) and are far superior to their commonly used thin and very deep counterparts.

Pruning Filters for Efficient ConvNets

- Computer Science
- ICLR
- 2017

This work presents an acceleration method for CNNs, where it is shown that even simple filter pruning techniques can reduce inference costs for VGG-16 and ResNet-110 by up to 38% on CIFAR10 while regaining close to the original accuracy by retraining the networks.

Compression-aware Training of Deep Networks

- Computer Science
- NIPS
- 2017

It is shown that accounting for compression during training allows us to learn much more compact, yet at least as effective, models than state-of-the-art compression techniques.

Speeding up Convolutional Neural Networks with Low Rank Expansions

- Computer Science
- BMVC
- 2014

Two simple schemes for drastically speeding up convolutional neural networks are presented, achieved by exploiting cross-channel or filter redundancy to construct a low rank basis of filters that are rank-1 in the spatial domain.