Neural Network Basics
A neural network is an interconnected set of neurons, arranged in layers. Input goes in at one end, and output comes out at the other.
For example, the picture above shows a neural network with 4 nodes in the input layer and 3 nodes in the output layer. This is the exact structure we used for the iris classification problem we solved in our Hello World example on Day 1. The layer in between is called the hidden layer. This is what gives Deep Learning its name: the network is deep, with not just an input and output layer, but one or many hidden layers.
This is the basic structure of a neural network. The number of nodes or layers can change, but the overall shape of a typical neural network stays the same. To understand neural networks better, we have to start from the basics.
Neural networks were inspired by the brain. A human brain consists of billions of interconnected neurons. Here is a quick picture from Wikipedia.
x1, x2, ..., xn represent the inputs, and y1, y2, ..., yn are the outputs. So, essentially, a neuron transforms a set of inputs into a set of outputs. When many such neurons are connected, they form an intelligent system.
The simplest way to represent a neuron mathematically is with a perceptron.
A perceptron receives inputs, adds them up, and produces an output. What is the big deal about that? It is just basic addition, right? True. That's where the concept of weights comes in.
Each of the inputs is multiplied by a weight. So, instead of just summing up the inputs, you multiply each input by its weight and sum them up (a weighted sum of inputs). The weighted sum could be a number in a very large range, depending on the input range and the weights. What is the use of having a number that could be anywhere from −∞ to +∞? To normalize this, a bias or threshold is introduced.
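The perceptron described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the function name and the example numbers are my own.

```python
import numpy as np

def perceptron(inputs, weights, bias):
    # Weighted sum of inputs, plus the bias
    weighted_sum = np.dot(inputs, weights) + bias
    # Output 1 if the weighted sum crosses zero, otherwise 0
    return 1 if weighted_sum > 0 else 0

# Two inputs, each weighted 0.6, with a bias of -1
print(perceptron([1, 1], [0.6, 0.6], -1))  # 1, since 1.2 - 1 = 0.2 > 0
print(perceptron([1, 0], [0.6, 0.6], -1))  # 0, since 0.6 - 1 = -0.4 <= 0
```

The bias effectively sets the threshold: both inputs have to fire before the weighted sum clears it.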
What does a perceptron achieve?
The calculation above seems simple enough, but what exactly does it achieve? Think of it like a decision-making machine. It weighs input parameters and provides a Yes or No decision. For example, say you want to decide whether or not to learn Deep Learning. How do you go about it in your mind?
Input               Weight
Job Prospect        30%
Interesting enough  20%
Future Growth       30%
Salary              20%
You weigh your inputs (multiply the inputs by the corresponding weights) and arrive at a figure. In fact, each of these inputs is also given a number internally in your mind. However, the way a human brain functions is far more complicated. Like I said before, neural networks and deep learning are just "based on" how the human brain works; they are not an exact replica.
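To make the table concrete, here is a hypothetical version of that mental calculation. The 0-to-10 scores and the threshold are made up purely for illustration:

```python
# Hypothetical scores (0-10) for each factor -- illustrative only
scores  = {"job_prospect": 8, "interesting": 6, "future_growth": 9, "salary": 5}
# Weights from the table above, as fractions
weights = {"job_prospect": 0.30, "interesting": 0.20, "future_growth": 0.30, "salary": 0.20}

# Weighted sum of the inputs
decision_score = sum(scores[k] * weights[k] for k in scores)
print(decision_score)  # 7.3

# Compare against a threshold (say 5) to get a Yes/No decision
print("Yes" if decision_score > 5 else "No")  # Yes
```

This is exactly the perceptron computation: weighted sum, then a threshold.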
While a perceptron is good enough for simple tasks, it has its limitations when building complex neural networks. That is where sigmoid neurons come in. If you have seen logistic regression in Machine Learning before, you will already have an idea of what a sigmoid function does. It essentially maps a range of numbers between −∞ and +∞ to values between 0 and 1.
A perceptron outputs either a 0 or a 1 depending on the weighted inputs & threshold. A sigmoid neuron outputs a value between 0 and 1. This makes the sigmoid neuron much more useful in large scale neural networks.
The weighted sum of inputs + bias is calculated just as above.
Now, instead of outputting this directly, a sigmoid neuron applies the sigmoid function to the weighted sum plus bias, and outputs a value between 0 and 1.
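A minimal sketch of a sigmoid neuron, reusing the same inputs, weights, and bias as the perceptron example (the function name and numbers are my own, for illustration):

```python
import numpy as np

def sigmoid_neuron(inputs, weights, bias):
    # Same weighted sum as the perceptron...
    z = np.dot(inputs, weights) + bias
    # ...but passed through the sigmoid, giving a smooth output in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid_neuron([1, 1], [0.6, 0.6], -1))  # ~0.55, not a hard 0 or 1
```

Where the perceptron snapped to 1, the sigmoid neuron reports roughly 0.55: a soft "slightly more yes than no".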
You can see a plot of the sigmoid function below.
from scipy.special import expit
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

x = np.linspace(-1000, 1000)
y = expit(x)
plt.plot(x, y)
plt.grid()
This looks like a binary curve (one that can only take a value of 0 or 1), but if you observe the curve closely over a smaller range, say -10 to 10, you can clearly see the gradual progression.
x = np.linspace(-10, 10)
y = expit(x)
plt.plot(x, y)
plt.grid()
This is the logistic regression curve. Only when the value of the weighted sum plus bias stays close to 0 do you observe the gradual part of the logistic curve. For any extreme value, the output is pretty much either 0 or 1 (very much like a perceptron).
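You can check this saturation numerically with expit (the sample points are arbitrary):

```python
from scipy.special import expit

# Near zero, the output changes gradually...
print(expit(0.0))    # 0.5
print(expit(1.0))    # ~0.73
# ...but at extreme inputs it saturates, behaving like a perceptron
print(expit(10.0))   # ~0.99995
print(expit(-10.0))  # ~0.00005
```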
Advantages of Sigmoid Neuron over a Perceptron
Since the output of a sigmoid neuron is smooth, small changes in the inputs result in small changes in the output. So, instead of just flipping a switch (0 or 1), the sigmoid function acts more like a slider. This property of the sigmoid function's output makes it very useful for neural network learning.
Changes to your output are essentially a function of changes in the weights and biases. This is the basis of Neural Network learning.
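Here is a quick numerical check of that smoothness, again with made-up example numbers. Nudging a weight slightly produces a correspondingly slight change in the output:

```python
from scipy.special import expit
import numpy as np

inputs  = np.array([1.0, 1.0])
weights = np.array([0.6, 0.6])
bias    = -1.0

out_before = expit(np.dot(inputs, weights) + bias)

# Nudge one weight by a small amount
weights[0] += 0.01
out_after = expit(np.dot(inputs, weights) + bias)

print(out_after - out_before)  # a small change, roughly 0.0025
```

With a perceptron, the same nudge would either do nothing at all or flip the output from 0 to 1; the sigmoid neuron's gradual response is what makes gradient-based learning possible.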
However, to understand this mathematically, we have to understand a little bit of derivatives and partial derivatives, and then the actual learning machinery itself: back-propagation and Gradient Descent. These will be the topics of our next chapter.