Introduction
Machine learning is the process by which artificial intelligence uses data and algorithms to learn over time. Models run many simulations, adjusting how much weight a particular variable carries in a decision, and then compare the result to the expected result. Neural networks are a subclass of machine learning. Rather than making decisions based only on the data the system receives, neural networks can create new variables to factor into their decision making, which lets them attempt answers to more complex problems.
Basic Structure
Neural networks are designed to make decisions like the human brain. They use nodes/neurons that pass information to each other at different intensities, and then decide whether or not to discard that information. These networks are made up of layers that consist of nodes that pass data to each other. The strength of the connection between two nodes is called the connection weight. It determines how much impact the data coming from a node will have on the decision of the node receiving the information. Larger weights have more of an impact on a node's decision making. Weights are initially assigned at random, but change over time as the system learns which data is more important.
Input Layer
First, the data is passed into the nodes/neurons within the input layer. "Each neuron corresponds to a feature, and its value represents the feature's value" (Sarita, 'Basic Understanding of Neural Network Structure'). Each node then passes its value, scaled by the connection weight, to every node in the next layer.
Hidden Layer
The hidden layer is the next layer within the network; it receives the weighted sum from the input layer. The weighted sum is calculated by multiplying each node's output by the connection weight between the two nodes exchanging data, and then adding all the resulting values together.
In the example below (adapted from the Deep Learning Dictionary - Lightweight Crash Course), the weighted sum is calculated by:

- Value1 = 160, Weight1 = 0.35
- Value2 = 55, Weight2 = 0.2
- (Value1 * Weight1) + (Value2 * Weight2) = 56 + 11 = 67
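The weighted-sum calculation above can be sketched in a few lines of Python (using the same illustrative values and weights):

```python
# Weighted sum for the two-input example above: each input value is
# multiplied by its connection weight, and the products are summed.
values = [160, 55]
weights = [0.35, 0.2]

weighted_sum = sum(v * w for v, w in zip(values, weights))
print(weighted_sum)  # 160*0.35 + 55*0.2 = 56 + 11 = 67.0
```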
The bias of the node is then added to the weighted sum before being passed into an activator function. The bias is a constant that "is used to offset the result. It helps the models to shift the activation function towards the positive or negative side." (Turing, 'What Is the Necessity of Bias in Neural Networks?'). The value of the bias changes as the system learns.
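Putting the pieces together for a single hidden node, a minimal sketch might look like this (the bias value and the choice of a sigmoid activator are illustrative assumptions, not values from the text):

```python
import math

# Hypothetical single hidden node: the weighted sum from the example,
# plus a learned bias, passed through a sigmoid activator function.
weighted_sum = 67.0
bias = -65.0  # illustrative value; in practice the bias is learned

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

output = sigmoid(weighted_sum + bias)
print(round(output, 3))  # sigmoid(2.0) ≈ 0.881
```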
A network may have multiple hidden layers depending on how complex the problem is. In these cases, each hidden layer learns about a separate aspect of the data, breaking the overall task into smaller components.
Activator Functions
Activator functions are required to make networks non-linear, allowing them to model relationships that a simple straight line cannot capture. The purpose of an activator function is to tell a node whether or not to send data to the next layer (activate). Different activators perform different transformations on a node's output and have different return ranges. If the activator function does not tell the node to fire, the node outputs no information to the next layer.
A few popular functions are:
- Sigmoid
- Tanh
- ReLU
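The three activators listed above can be sketched with their standard mathematical forms (assumed here, since the post does not define them):

```python
import math

def sigmoid(x):
    # Squashes any input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Squashes any input into the range (-1, 1), centered on 0
    return math.tanh(x)

def relu(x):
    # Passes positive values through unchanged, zeroes out negatives
    return max(0.0, x)

print(sigmoid(0.0), tanh(0.0), relu(-3.0))  # 0.5 0.0 0.0
```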
Output Layer
The final hidden layer passes data to the output layer, which is where the final decision/output of the model will be made. The same calculations performed within the hidden layers are also performed here. The weighted sum is calculated and the bias is added. That number is then passed into an activator function that tells the node whether or not to fire.
Backward Propagation
This is where the model learns from the mistakes it made. The data is passed backwards through the system and evaluated to see where mistakes were made. The system changes the weights and biases of the nodes, and runs the simulation again. By changing the weights, the system learns which variables are more impactful when making a decision.
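The idea of adjusting a weight to reduce error can be sketched with plain gradient descent on a single linear node (a deliberate simplification; real backward propagation applies the chain rule through every layer, and all the values below are illustrative):

```python
# Minimal sketch: repeatedly nudge one weight so the node's
# prediction moves toward the expected result.
x = 2.0             # node input
target = 10.0       # expected output
weight = 1.0        # initial (random) weight
learning_rate = 0.1

for _ in range(50):
    prediction = weight * x
    error = prediction - target          # how wrong the model was
    gradient = 2 * error * x             # d(error^2) / d(weight)
    weight -= learning_rate * gradient   # adjust weight to reduce error

print(round(weight, 3))  # converges toward 5.0, since 5.0 * 2.0 = 10.0
```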
Summary
Neural networks are complex machines, with the ability to analyze data, compare results, and manipulate conditions in order to become better decision makers. The system is composed of three types of layers, each a collection of nodes that pass data to each other. The hidden and output layers perform calculations that decide whether or not information should continue to be passed through the network. A defining characteristic of these networks is the ability to pass data backwards, which gives them the ability to learn and improve.
Sources:
- AWS Neural Network
- IBM Neural Network
- Basic Understanding
- AI, Machine Learning, Neural Networks, and Deep Learning
- Function Activation