Thursday, November 26, 2020

Polar Codes










Just read a really nice article in WIRED magazine about Erdal Arikan, the inventor of the "Polar Codes", and Huawei. They spend an equal amount of time praising Erdal Arikan and dissing Huawei...


First they talk about how Erdal invented the "Polar Codes", and his obsession with the Shannon limit in Information Theory. This is the theoretical upper limit on how much information you can reliably push through a channel, factoring in the noise and redundancy. It was a very hard problem; even his mentor at MIT came close to a solution and then gave up. Erdal goes back to Turkey and helps set up the engineering college at Bilkent University, which he heads. This lets him work on the problem for over 20 years, while people in the US have to work on small problems for the sake of tenure.


Simultaneously, they talk about Huawei's rise through the support of the Chinese government and by stealing intellectual property. Apparently, Huawei screws Nortel, and when Nortel has to file for bankruptcy, Huawei takes over their research team. It is the Chinese-born head of the Nortel team who identifies Erdal's polar codes and starts working on them.


Now, even if one company develops the technology, the standards have to be agreed on by a lot of companies; the governing body is called 3GPP. Huawei has a lot of leverage there, because it holds most of the patents, and it has been able to push the Polar Codes based technology. The US doesn't even have an equivalent company - Europe has Ericsson, Japan has a few. And all the Chinese companies, including ZTE and Lenovo, are working together to push the Polar Codes based standard. They've succeeded :)


Now, to claim legitimacy, Huawei is honoring the inventor Erdal - the article actually opens by saying the ceremony and settings were corny and cold-war style :)


https://youtu.be/BE5HuqEg0oY

Tuesday, November 3, 2020

Deep Learning - Coursera

 


This course was an introduction to Neural Networks; I'll try to summarize it in as simple a manner as possible.

Logistic regression can be viewed as a simple, single layer neural network. Similarly, a neural network can be viewed as multiple layers of logistic regression.

The difference is that logistic regression can only detect linear patterns, i.e. it effectively fits a line (a linear decision boundary) through the training dataset.

Neural networks can detect non-linear patterns. This is because each layer of the neural network applies a non-linear activation function.
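
To see why the non-linearity matters, here is a tiny NumPy sketch (my own illustration, not from the course): without an activation function, stacking two layers collapses into a single linear map, so the extra layer adds nothing.

import numpy as np

W1 = np.random.randn(3, 4)   # "layer 1" weights
W2 = np.random.randn(2, 3)   # "layer 2" weights
x = np.random.randn(4)       # one input example

# Two linear layers with no activation collapse into one linear layer.
two_linear_layers = W2 @ (W1 @ x)
one_linear_layer = (W2 @ W1) @ x
print(np.allclose(two_linear_layers, one_linear_layer))   # True

# Inserting a non-linearity (tanh here) between the layers breaks this
# collapse, which is what lets the network fit non-linear patterns.
with_activation = W2 @ np.tanh(W1 @ x)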

Logistic Regression with Gradient Descent

The goal of Logistic Regression is to train a model that can make predictions, more specifically True or False predictions.

The input is anything that can be represented as a matrix, say X. We then need a matrix W and a vector b such that:

z = W * X + b

When an activation function, say Ω (typically the sigmoid), is applied to z, we get a = Ω(z), a value between 0 and 1.

Rounding a to the nearest integer gives the True or False prediction.
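
A minimal NumPy sketch of this prediction step (assuming Ω is the sigmoid function, which is the usual choice for True/False outputs; this is my own illustration, not the course's exact code):

import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-z))

def predict(W, b, X):
    # X: (n_features, m_examples), W: (1, n_features), b: a scalar bias
    a = sigmoid(np.dot(W, X) + b)   # a = Ω(W * X + b), values between 0 and 1
    return a > 0.5                  # rounding gives the True/False prediction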


Gradient Descent Algorithm

The training algorithm proceeds as follows:

We take a large number of training examples and start with a zero matrix and a zero vector for W and b.

For each training example X and training result Y, 

  1. we calculate a = Ω(W * X + b)
  2. Next, we calculate the cost by comparing a with the training result Y
  3. Next, we adjust W and b based on the gradient of the cost.
We repeat this until the cost stops changing across iterations, i.e. there is no gradient left to descend. A minimal sketch of this loop follows below.
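
Here is that loop in NumPy, under the same assumptions as before (sigmoid activation, cross-entropy cost, zero initialization); it is my own illustration rather than the course's exact code:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, Y, learning_rate=0.01, num_iterations=1000):
    # X: (n_features, m_examples), Y: (1, m_examples) of 0/1 labels
    n, m = X.shape
    W = np.zeros((1, n))   # start with a zero matrix for W ...
    b = 0.0                # ... and a zero bias
    for _ in range(num_iterations):
        A = sigmoid(np.dot(W, X) + b)                               # step 1: a = Ω(W * X + b)
        cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))    # step 2: compare A with Y
        dW = np.dot(A - Y, X.T) / m                                 # step 3: gradients ...
        db = np.mean(A - Y)
        W -= learning_rate * dW                                     # ... adjust W and b
        b -= learning_rate * db
    return W, b, cost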


Neural Network 

The goal of a Neural Network is similar, but now we train multiple layers, and we start with randomly initialized weight matrices instead of zeros.

The neural network training algorithm is similar to the gradient descent algorithm above, but it works across multiple layers.

  1. As an equivalent of step 1 in logistic regression, we have "Forward Propagation" through all the layers.
  2. Once again, we calculate the cost by comparing the final layer's output with the training result.
  3. As an equivalent of step 3 in logistic regression, we have "Backward Propagation", which adjusts the weights across all the layers. A small sketch follows below.
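
A small two-layer sketch of forward and backward propagation (my own illustration, assuming tanh in the hidden layer and sigmoid at the output, which is a common setup for this kind of True/False network):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_two_layer_net(X, Y, hidden_units=4, learning_rate=0.1, num_iterations=1000):
    # X: (n_features, m_examples), Y: (1, m_examples) of 0/1 labels
    n, m = X.shape
    # Random (small) initial weights instead of zeros, so the hidden units differ.
    W1 = np.random.randn(hidden_units, n) * 0.01
    b1 = np.zeros((hidden_units, 1))
    W2 = np.random.randn(1, hidden_units) * 0.01
    b2 = np.zeros((1, 1))
    for _ in range(num_iterations):
        # Forward propagation: layer 1 uses tanh, layer 2 uses sigmoid.
        Z1 = np.dot(W1, X) + b1
        A1 = np.tanh(Z1)
        Z2 = np.dot(W2, A1) + b2
        A2 = sigmoid(Z2)
        # Cost: the same cross-entropy comparison with the training labels.
        cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
        # Backward propagation: push the error back through both layers.
        dZ2 = A2 - Y
        dW2 = np.dot(dZ2, A1.T) / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dZ1 = np.dot(W2.T, dZ2) * (1 - A1 ** 2)   # derivative of tanh
        dW1 = np.dot(dZ1, X.T) / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m
        # Adjust all the weights at once.
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2
    return (W1, b1, W2, b2), cost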


Other Stuff

Hyper-parameters

Things like the number of layers, the learning rate, etc. Tuning these for optimal performance is a course of its own.

Vectorization

This is a computational optimization where we avoid explicit for-loops in the code and instead use NumPy's built-in features such as broadcasting; a small comparison follows below.
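
My own example of the loop version vs. the vectorized version of computing W * X + b for many examples at once:

import numpy as np

X = np.random.randn(1000, 500)   # 1000 examples, 500 features each
w = np.random.randn(500)
b = 2.0

# Explicit for-loop version: slow, because the loops run in pure Python.
z_loop = np.zeros(1000)
for i in range(1000):
    total = 0.0
    for j in range(500):
        total += X[i, j] * w[j]
    z_loop[i] = total + b

# Vectorized version: one NumPy call, with the scalar b broadcast across all rows.
z_vec = np.dot(X, w) + b

print(np.allclose(z_loop, z_vec))   # True - same result, much faster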

Learning Tip

Do the course with a friend; it makes it much easier and more fun..