Learning to compress neural networks in Caffe