Thursday, April 28, 2016

[AMMAI] [Lecture 09] - "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

Paper Information:
  Han, Song, Huizi Mao, and William J. Dally. "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding." arXiv preprint arXiv:1510.00149 (2015).

Motivation:
    The demand for running neural networks on embedded systems keeps growing. However, limited hardware resources (storage, memory, and energy) are an obstacle to such applications.

Contributions:
   They reduce the storage and energy required by large networks with a three-stage pipeline: pruning, trained quantization (weight sharing), and Huffman coding.

Technical summarization:
The three stages of the compression pipeline are described below.

  Network pruning:
    First, the network learns which connections are important via normal training. Second, weights below a threshold are removed. Finally, the remaining sparse connections are retrained to recover accuracy.
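    A minimal sketch of the magnitude-threshold step, assuming one layer's weights live in a NumPy array; the threshold value here is hypothetical (in practice it is chosen per layer), and the returned mask would be reapplied during retraining so pruned connections stay at zero.

    import numpy as np

    def prune_by_magnitude(weights, threshold):
        """Zero out connections whose absolute weight falls below the threshold."""
        mask = (np.abs(weights) >= threshold).astype(weights.dtype)
        return weights * mask, mask

    # Toy example: a 4x4 "layer" with a hand-picked threshold (illustrative only).
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(4, 4))
    pruned_W, mask = prune_by_magnitude(W, threshold=0.08)
    print("kept %d of %d weights" % (int(mask.sum()), mask.size))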

  Trained quantization and weight sharing:
   They use k-means clustering to find the shared weights: all weights that fall into the same cluster share (and are fine-tuned through) one centroid value. Since centroid initialization impacts the quality of clustering, and the few large-magnitude weights are vital to accuracy, linear initialization is chosen: the centroids are spread evenly between the minimum and maximum weight, so the large weights keep centroids of their own.
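   A minimal sketch of this weight-sharing step, assuming a dense NumPy array for one layer; the function name, cluster count, and plain Lloyd iterations are my own illustration rather than the authors' code. The returned cluster indices are what the Huffman stage compresses, and during fine-tuning the paper updates each shared centroid with the summed gradients of its cluster.

    import numpy as np

    def share_weights(weights, n_clusters=16, n_iters=20):
        """Quantize one layer's surviving weights into n_clusters shared values."""
        w = weights[weights != 0]                               # only unpruned weights
        centroids = np.linspace(w.min(), w.max(), n_clusters)   # linear initialization

        for _ in range(n_iters):                                # plain Lloyd iterations
            assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
            for k in range(n_clusters):
                if np.any(assign == k):
                    centroids[k] = w[assign == k].mean()

        # Replace every surviving weight by its nearest shared value; the
        # cluster indices are what gets Huffman-coded in the next stage.
        idx = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        quantized = np.zeros_like(weights)
        quantized[weights != 0] = centroids[idx]
        return quantized, centroids, idx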

  Huffman coding:
   The main concept of Huffman coding is that more common symbols are represented with fewer bits. Because the quantized weight indices (and the sparse index differences) are far from uniformly distributed, this lossless stage saves additional storage on top of pruning and quantization.
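   Below is a minimal sketch of building such a prefix code over the quantized cluster indices; the dictionary-merging style and the toy index stream are illustrative only, not the paper's implementation.

    import heapq
    from collections import Counter

    def huffman_code(symbols):
        """Build a prefix code in which more frequent symbols get shorter bit strings."""
        freq = Counter(symbols)
        # Heap entries: (frequency, unique tie-breaker, {symbol: code-so-far}).
        heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        if len(heap) == 1:                       # degenerate single-symbol case
            return {s: "0" for s in heap[0][2]}
        while len(heap) > 1:
            f1, _, t1 = heapq.heappop(heap)
            f2, i2, t2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in t1.items()}
            merged.update({s: "1" + c for s, c in t2.items()})
            heapq.heappush(heap, (f1 + f2, i2, merged))
        return heap[0][2]

    # Toy example: skewed cluster indices like those produced by weight sharing.
    indices = [0, 0, 0, 0, 1, 1, 2, 3]
    code = huffman_code(indices)
    total_bits = sum(len(code[s]) for s in indices)
    print(code, "->", total_bits, "bits instead of", 2 * len(indices))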

My comment:
This paper makes extensive use of visualization to make the abstract weight distributions of a CNN concrete. Two examples are discussed below.

Viewing the weight distribution as a histogram is a straightforward way to inspect it. The histogram of the conv3 layer's weights, shown above, forms a bimodal distribution; that strong bias in the distribution is concrete evidence for why Huffman coding pays off.
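To reproduce this kind of plot for one's own model, a short matplotlib sketch is enough; the weights below are synthetic stand-ins for a pruned layer, not the paper's conv3 data.

    import numpy as np
    import matplotlib.pyplot as plt

    # Synthetic stand-in for a pruned layer: near-zero weights are removed,
    # so the surviving weights form two lobes on either side of zero.
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=100_000)
    w = w[np.abs(w) > 0.03]          # crude magnitude pruning, for illustration

    plt.hist(w, bins=100)
    plt.xlabel("weight value")
    plt.ylabel("count")
    plt.title("Weight histogram of a pruned layer (synthetic)")
    plt.show()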

The following picture shows that the overhead of the codebook is very small and often negligible. When I first saw the use of a codebook, I assumed it would cost noticeable space, but this figure shows the extra storage is tiny. Decoding time should likewise not be a problem, since the codebook itself is so small.
