Handwritten Digit Recognition using Capsule Networks
Overview:
My summer research internship at CSIR-CEERI Pilani was my first exposure to both Computer Vision and research, and both have been strong interests of mine ever since. I was tasked with implementing the Capsule Network (CapsNet) architecture for handwritten digit recognition, a classic Computer Vision problem. Since the original CapsNet paper had already reported excellent results on the MNIST dataset, I worked on the much lesser-known and much more challenging Kannada-MNIST dataset, which was created in India. The test set (Dig-10K) also posed some interesting challenges later on.
After matching the baseline, I improved on the state-of-the-art results on this dataset by modifying the reconstruction network, which CapsNet uses largely for regularisation, over the course of my experiments. The proposed reconstruction network, based on transposed convolutions, also used fewer than 7% of the parameters of the originally proposed dense reconstruction network.
Towards the end, I also conducted comparative studies between CNNs and CapsNets.
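To give a feel for where such a saving can come from, the sketch below counts parameters for two decoders in plain Python. The dense layout (160 -> 512 -> 1024 -> 784) is the decoder described in the CapsNet paper for 28x28 images. The transposed-convolution layout is a hypothetical one chosen purely for illustration; it is not the architecture used in this project, and its channel counts, kernel sizes, and helper names are all assumptions.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_transpose2d_params(c_in, c_out, kernel):
    """Weights plus biases of a 2D transposed convolution with a square kernel."""
    return c_in * c_out * kernel * kernel + c_out


def conv_transpose2d_out_size(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (no output padding, no dilation)."""
    return (size - 1) * stride - 2 * padding + kernel


# Dense decoder from the CapsNet paper: 10 capsules x 16 dims = 160 inputs,
# followed by layers of 512, 1024 and 784 (= 28 * 28) units.
dense_total = (
    dense_params(160, 512)
    + dense_params(512, 1024)
    + dense_params(1024, 784)
)

# Hypothetical transposed-convolution decoder (an assumption, not the project's design):
# project the 160 inputs to a 7x7 feature map with 8 channels, then upsample twice.
side = 7
channels = 8
deconv_total = dense_params(160, side * side * channels)

# 7x7 -> 14x14, keeping 8 channels
deconv_total += conv_transpose2d_params(channels, channels, kernel=4)
side = conv_transpose2d_out_size(side, kernel=4, stride=2, padding=1)

# 14x14 -> 28x28, reducing to a single output channel
deconv_total += conv_transpose2d_params(channels, 1, kernel=4)
side = conv_transpose2d_out_size(side, kernel=4, stride=2, padding=1)

ratio = deconv_total / dense_total
print(f"dense decoder:  {dense_total} parameters")
print(f"deconv decoder: {deconv_total} parameters ({ratio:.1%} of dense)")
```

Most of the saving comes from replacing the two large fully connected layers: a transposed convolution shares one small kernel across every spatial position, so its parameter count does not grow with the image size.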
Technical Details:
- Language: Python
- Framework: TensorFlow 1, PyTorch (reimplementation)