Summary Notes: GRU and LSTMs

This post is sort of a continuation of my last post, which was a Summary Note on the workings of basic Recurrent Neural Networks. As I mentioned in that post, I’ve been learning about the workings of RNNs for the past few days, and how they deal with sequential data, like text. An RNN can be built using either a basic RNN unit (described in the last post), a Gated Recurrent Unit (GRU), or an LSTM unit. This post will describe how GRUs/LSTMs learn long-term dependencies in the data, which is something basic RNN units are not so good at. ...
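As a minimal sketch of the point that a recurrent layer can be built from any of these three unit types, here is a toy example assuming PyTorch; the layer sizes and input are purely illustrative and not taken from the post itself:

```python
import torch
import torch.nn as nn

# Toy dimensions, chosen only for illustration
seq_len, batch, input_size, hidden_size = 5, 2, 8, 16
x = torch.randn(seq_len, batch, input_size)  # a dummy input sequence

# The same recurrent layer interface, with three different cell choices
for layer in (nn.RNN(input_size, hidden_size),
              nn.GRU(input_size, hidden_size),
              nn.LSTM(input_size, hidden_size)):
    out, _ = layer(x)  # only the internal recurrent cell differs
    print(type(layer).__name__, out.shape)  # -> (seq_len, batch, hidden_size)
```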

October 20, 2018

Word Embeddings and RNNs

One of the simplest ways to convert words from a natural language into mathematical tensors is to represent them as one-hot vectors, where the length of each vector is equal to the size of the vocabulary from which the words are drawn. For example, if we have a vocabulary of size 8 containing the words "a", "apple", "has", "matrix", "pineapple", "python", "the", "you", the word “matrix” can be represented as: [0,0,0,1,0,0,0,0] ...
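A small sketch of this encoding in plain Python (the vocabulary and the `one_hot` helper are just illustrative names, not from the post):

```python
# Fixed vocabulary of size 8, as in the example above
vocab = ["a", "apple", "has", "matrix", "pineapple", "python", "the", "you"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Return a one-hot vector (as a list) of length len(vocab) for the given word."""
    vec = [0] * len(vocab)
    vec[word_to_index[word]] = 1
    return vec

print(one_hot("matrix"))  # [0, 0, 0, 1, 0, 0, 0, 0]
```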

October 20, 2018

Summary Notes: Basic Recurrent Neural Networks

I’ve been learning about Recurrent Neural Nets this week, and this post is a “Summary Note” on them. A “Summary Note” is just a blog post version of the notes I make on a topic, primarily for my own reference (in case I need to come back to the material in the future). These summary notes won’t go into the very foundations of whatever they’re about, but rather serve as a quick and practical reference for that particular topic. ...

October 9, 2018