This is the homepage and blog of Dhruv Thakur, a Data Scientist in the making. For more about me, see here.


Evolution of Grad-CAM heat-maps along a ResNet-34

This exercise is a continuation of my last post, which was an exploration in generating class-discriminative localization maps for a convnet. In particular, I used the feature map activations of the last convolutional layer (after BatchNorm), along with the gradients of a specific class score with respect to these activations, to create heat-maps that help visualize the parts of the input image that contribute most to a prediction.
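For reference, the core computation can be sketched in a few lines of PyTorch. This is a minimal version of the idea rather than the notebook's exact code, assuming a torchvision ResNet-34 and a random tensor standing in for a preprocessed input image:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet34(pretrained=True).eval()
store = {}

def fwd_hook(module, inp, out):
    store["acts"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    store["grads"] = grad_out[0].detach()

# layer4 is the last convolutional stage in torchvision's ResNet-34
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)        # stand-in for a preprocessed input image
scores = model(x)
scores[0, scores.argmax()].backward()  # gradient of the predicted class's score

# average-pool the gradients into per-channel weights, then combine feature maps
weights = store["grads"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * store["acts"]).sum(dim=1)).squeeze(0)  # coarse 7x7 heat-map
```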

I wanted to extend that approach to see how these heat-maps shape up as we move deeper into the network, starting from the very first convolutional layer. As with the last post, the inspiration for this comes from a fastai Deep Learning MOOC lecture, which is itself inspired by Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization by Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra.
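One way to set that up, sketched below under the same assumptions (torchvision's layer names, a random stand-in input), is to hook every stage of the network and repeat the gradient-weighted combination at each depth:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet34(pretrained=True).eval()

# stages of torchvision's ResNet-34, from the first conv to the last block
stages = {"conv1": model.conv1, "layer1": model.layer1, "layer2": model.layer2,
          "layer3": model.layer3, "layer4": model.layer4}
acts = {}

def save(name):
    def hook(module, inp, out):
        out.retain_grad()  # keep the gradient of this intermediate tensor
        acts[name] = out
    return hook

for name, module in stages.items():
    module.register_forward_hook(save(name))

scores = model(torch.randn(1, 3, 224, 224))  # stand-in for a preprocessed image
scores[0, scores.argmax()].backward()

# one Grad-CAM map per depth: gradient-weighted sum of feature maps
heatmaps = {name: F.relu((a.grad.mean(dim=(2, 3), keepdim=True) * a).sum(dim=1)).detach()
            for name, a in acts.items()}
```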

Read more…

Generating class discriminative heat-maps using Grad-CAM

I’m quite interested in understanding and interpreting how convnets “see” and process the input images we feed them. I first got a taste of this kind of work after reading Visualizing and Understanding Convolutional Networks by Matthew D Zeiler and Rob Fergus, which is 5 years old as of today. I’m guessing a lot of work has been (and is being) done by the deep learning research community to make convnets more intuitive and understandable, and I’m trying to take strides towards understanding that work.

This post/notebook is an exercise in generating localization heat-maps to help visualize the areas of an image which contribute the most when making a prediction. The inspiration for this comes from a fastai Deep Learning MOOC (2018) lecture, which is itself inspired by Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization by Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra.
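To give a flavour of how such a map ends up on the image: the coarse Grad-CAM output is upsampled to the input resolution and drawn translucently over the picture. A minimal sketch, where `cam` and `img` are random stand-ins for a computed heat-map and a preprocessed image:

```python
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

cam = torch.rand(7, 7)                 # stand-in for a coarse Grad-CAM map
img = torch.rand(224, 224, 3).numpy()  # stand-in for the input image (HWC)

# upsample the coarse map to the input resolution
cam_up = F.interpolate(cam[None, None], size=img.shape[:2],
                       mode="bilinear", align_corners=False)[0, 0]

plt.imshow(img)
plt.imshow(cam_up.numpy(), alpha=0.5, cmap="magma")  # translucent overlay
plt.axis("off")
plt.show()
```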

Read more…

Understanding ResNets

I'm currently enrolled in fastai's Deep Learning MOOC (version 3), and loving it so far. It's only been 2 lectures as of today, but folks are already building awesome stuff based on the content taught so far.

The course starts with the application of DL in Computer Vision, and in the very first lecture, course instructor Jeremy teaches us how to leverage transfer learning by making use of pre-trained ResNet models. I've been meaning to dive into the details of ResNets for a while, and this seems like a good time to do so.
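As a rough illustration of the idea (using plain torchvision rather than fastai's API), transfer learning here amounts to freezing the pretrained backbone and training a fresh head on the new task:

```python
import torch.nn as nn
from torchvision import models

# start from ImageNet-pretrained weights and freeze the backbone
model = models.resnet34(pretrained=True)
for p in model.parameters():
    p.requires_grad = False

# swap the final fully connected layer for our own task's head;
# num_classes is illustrative, e.g. 37 for a 37-class dataset
num_classes = 37
model.fc = nn.Linear(model.fc.in_features, num_classes)  # trains from scratch
```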

This post is written in the vein of a summary note rather than a full-fledged introduction to ResNets; i.e., it's (sort of) written for my own future reference, and should be helpful for somebody with some background on the topic.

Read more…

Word Embeddings and RNNs

One of the simplest ways to convert words from a natural language into mathematical tensors is to represent them as one-hot vectors, where the length of these vectors is equal to the size of the vocabulary from which the words are drawn.

For example, if we have a vocabulary of size 8 containing the words:

"a", "apple", "has", "matrix", "pineapple", "python", "the", "you"

the word "matrix" can be represented as: [0,0,0,1,0,0,0,0]

Obviously, this approach becomes a pain when we have a huge vocabulary (say, millions of words) and have to train models with these representations as inputs. But apart from this issue, another problem with this approach is that there is no built-in mechanism to convey the semantic similarity of words. E.g., in the above example, "apple" and "pineapple" can be considered similar (both are fruits), but their vector representations don't convey that.

Word embeddings let us represent words or phrases as vectors of real numbers, where these vectors retain the semantic relationships between the original words. Instead of representing words as one-hot vectors, word embeddings map words to a continuous vector space of much lower dimension.
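In PyTorch, for instance, such a mapping is a learnable lookup table, nn.Embedding. A minimal sketch with a toy 3-dimensional space (the dimension is arbitrary here, and the vectors start out random; they only become semantically meaningful after training, e.g. as part of a language model):

```python
import torch
import torch.nn as nn

vocab = ["a", "apple", "has", "matrix", "pineapple", "python", "the", "you"]

# an 8-word vocabulary mapped into a 3-dimensional space (vs. 8 dims for one-hot)
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=3)

idx = torch.tensor([vocab.index("matrix")])
print(embedding(idx))  # a dense, trainable vector; random at init, learned in training
```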

Read more…