August 13, 2018
In a nutshell, neural networks (or artificial neural networks) are computation systems, loosely inspired by some parts of the animal brain, that are designed to mimic the way human brains process information.
To understand how neural networks work, it helps to first learn the basics of how a biological neural network works.
A biological neural network is composed of neurons, and each neuron has, among other parts, dendrites (which receive signals) and an axon (which sends them).
If a network can be defined as "a group or system of interconnected people or things", then a neural network is a group or system of interconnected neurons.
How do neurons connect with other neurons? Neurons connect when signals are sent from the axon of one neuron to the dendrites of another.
Information is received through the dendrites and, if the voltage is strong enough, an all-or-nothing electrical impulse called the action potential is generated. The action potential travels quickly down the axon and is passed on to other neurons through the synaptic connections at the end of the axon. This process repeats itself: if the voltage input a neuron receives through these synaptic connections is strong enough, an action potential is generated in that neuron and passed on, in turn, to the neurons connected to its axon.
(Make sense so far? If not, I would suggest you do not move forward until it does.)
Now that you know the basics of how a biological neural network works, let's learn the basics of an artificial neural network.
As we mentioned at the beginning of the post, an artificial neural network (or "neural network", as we'll refer to it) is loosely modeled after a biological neural network¹.
There are many different types of neural networks, but for the sake of this post, we will stick with a simple, fully-connected, feedforward neural network. (Don't worry, it's not as intimidating as it sounds!)
First, let's define the main components that make up a neural network: neurons and weights. A feedforward neural network has three types of neurons: input neurons, hidden neurons, and output neurons. Each type of neuron is organized into layers. There is only one input layer and one output layer, but there can be multiple hidden layers in between.
Each neuron is connected to every neuron in both the previous layer and the following layer by weights. The weights between neurons are illustrated in the diagram above as lines. The higher the weight between two neurons, the more influence the first neuron has on the second.
The flow is as follows: input information is received in the input layer, travels through the hidden layers, and is finally emitted through the output layer. The decision that the neural network makes for a given input depends on the values that the output neurons emit.
As the input information travels through the network, it gets multiplied by each weight in its path. Each neuron combines all the weighted outputs from the neurons in the previous layer that link to it and then applies an activation function to that combined total to determine its own output. This process continues for every single neuron in the neural network, all the way from the input layer to the output layer.
You might be wondering how a neural network sets its weight values to accomplish a given task. It's important to understand that neural networks aren't intelligent enough to know how to make the proper decisions for any given input right off the bat. They have to be trained to make proper decisions.
The process of training a neural network introduces a feedback mechanism that continually adjusts the weights in the neural network until they are well suited to the task at hand. Let's dig into how this works.
In order to train a neural network, you need a set of inputs and a matching set of labels (the intended outputs for those inputs).
During training, the inputs are processed through each layer of neurons and a loss is calculated by comparing the actual outputs of the neural network with the intended outputs (the set of labels). The loss is an indicator of how well the neural network is adjusted for the task at hand. The higher the loss, the worse the neural network is performing on the task.
The neural network then uses a feedback mechanism, called backpropagation, to adjust the weights between neurons so that the loss is likely to decrease after the next feedforward cycle. Without going into the math, it is worth knowing that the neural network uses calculus to determine the adjustment it has to make to each weight. (We will cover backpropagation in more detail in a separate blog post.)
And so, the cycle continues (input in, loss calculated, weights adjusted accordingly) until the difference between the actual output from the neural network and the intended output is small enough for our purposes. We know the loss is small enough for our purposes when it moves below a subjective threshold. Once we reach this point, we consider the neural network to be trained (ta-da!).
We can now test our neural network by passing in a fresh set of inputs and comparing the outputs of the neural network to the intended outputs (the labels).
One thing we want to be careful not to do is to overfit our model to the set of data that we trained on. A neural network needs to be able to generalize in order to make proper decisions on data that it has never seen before. If we overfit our neural network to the training data, it will get very good at making decisions on inputs that are very similar to the data it was trained on, but may have trouble making decisions on inputs that differ from that data. (We will cover overfitting in more detail in a separate blog post.)
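To make the feedforward-and-training cycle described in this post a bit more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a definitive implementation: the function names (`sigmoid`, `make_layer`, `forward`, `mean_loss`, `train_step`), the toy task (learning the logical OR of two inputs), the network size (2 inputs, 3 hidden neurons, 1 output), the sigmoid activation, the squared-error loss, the learning rate, and the loss threshold are all choices made for this example. Each neuron also carries a bias value, a common addition that the post does not discuss.

```python
import math
import random


def sigmoid(x):
    """Activation function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    """One list of weights per neuron; the last entry in each list is a bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(network, inputs):
    """Feed the inputs through every layer; return the outputs of each layer."""
    activations = [inputs]
    for layer in network:
        previous = activations[-1]
        # Each neuron sums its weighted inputs (plus bias), then applies the activation.
        activations.append([
            sigmoid(sum(w * x for w, x in zip(weights, previous)) + weights[-1])
            for weights in layer
        ])
    return activations


def mean_loss(network, samples):
    """Mean squared error between the actual outputs and the labels."""
    total = 0.0
    for inputs, label in samples:
        output = forward(network, inputs)[-1][0]
        total += (output - label) ** 2
    return total / len(samples)


def train_step(network, inputs, label, learning_rate):
    """One cycle: feedforward, measure the error, adjust weights (backpropagation)."""
    hidden_layer, output_layer = network
    _, hidden, (output,) = forward(network, inputs)
    output_weights = output_layer[0]

    # Calculus (the chain rule) gives how the squared error changes with each weight.
    delta_out = 2.0 * (output - label) * output * (1.0 - output)
    deltas_hidden = [delta_out * output_weights[j] * h * (1.0 - h)
                     for j, h in enumerate(hidden)]

    # Nudge every weight against its gradient (the last weight is the bias).
    for j, h in enumerate(hidden):
        output_weights[j] -= learning_rate * delta_out * h
    output_weights[-1] -= learning_rate * delta_out
    for weights, delta in zip(hidden_layer, deltas_hidden):
        for i, x in enumerate(inputs):
            weights[i] -= learning_rate * delta * x
        weights[-1] -= learning_rate * delta


# Toy task (an assumption for illustration): learn the logical OR of two inputs.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

rng = random.Random(0)  # fixed seed so that runs are repeatable
network = [make_layer(2, 3, rng), make_layer(3, 1, rng)]

initial_loss = mean_loss(network, samples)
threshold = 0.01    # the "subjective threshold" for a small-enough loss
max_epochs = 20000  # safety cap so the loop always ends
epochs_run = 0
while mean_loss(network, samples) > threshold and epochs_run < max_epochs:
    for inputs, label in samples:
        train_step(network, inputs, label, learning_rate=0.5)
    epochs_run += 1
final_loss = mean_loss(network, samples)

print(f"loss before training: {initial_loss:.4f}")
print(f"loss after {epochs_run} epochs: {final_loss:.4f}")
```

Running the script prints the loss before and after training, which should show the loss shrinking as the weights are adjusted. For brevity, the sketch measures the loss on the same four examples it trains on; a real project would hold back a separate set of inputs and labels for testing, for the overfitting reasons discussed above.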
¹ (It is important to keep in mind that biological neural networks and artificial neural networks are not the same thing. Saying that one is modeled after the other, even loosely, risks giving too much credit to artificial neural networks. As renowned A.I. researcher Yann LeCun stated in a public Facebook post, "AI researchers would be happy to build a machine as intelligent as a parrot, a crow, a rat or a cat. That would be a tremendous progress over the current state of the art.")
Haroon is the Co-founder and Executive Director at A.I. For Anyone. He holds a Master’s degree in Information and Data Science from UC Berkeley and a Bachelor's degree from Penn State University. He is currently a Product Marketer at a startup and has previously worked at firms—including Deloitte Consulting, Mark Cuban Companies, PayPal, Facebook, and NASA—at the intersection of product and data science. He is also a 2011 Gates Millennium Scholar.