When neural networks were first invented many decades ago, the original motivation was to write software that could mimic how the human brain, or more generally the biological brain, learns and thinks.
Today, neural networks, sometimes also called artificial neural networks, have become very different from how any of us might think the brain actually works and learns, but some of the biological motivations still remain in the way we think about artificial neural networks, or computer neural networks.
Let's start by taking a look at how the brain works and how that relates to neural networks. The human brain, or maybe more generally the biological brain, demonstrates a higher and more capable level of intelligence than anything we have been able to build so far.
Neural networks started with the motivation of trying to build software to mimic the brain. Work on neural networks started back in the 1950s, then gained popularity in the 1980s and 1990s with applications like handwritten digit recognition.
But then neural networks fell out of favor in the late 1990s, only to re-emerge around 2005, re-branded as deep learning. Deep learning and neural networks mean similar things, but deep learning sounds more appealing. This branding took off over the last decade and a half.
The first major breakthrough area for deep learning was speech recognition, followed by computer vision, with the ImageNet moment in 2012, and then natural language processing. Now, neural networks are used in applications ranging from climate change to medical imaging and online advertising.
Even though today's neural networks have almost nothing to do with how the brain learns, the early work was motivated by trying to mimic it. Let’s discuss how the brain works and how artificial neural networks model some aspects of it.
Here’s a diagram of neurons in a biological brain:
Neurons send electrical impulses and form connections with other neurons. A single neuron aggregates inputs, performs computations, and sends outputs, forming the basis of human thought.
Here’s a simplified diagram of a biological neuron. A neuron comprises a cell body; inputs are received via dendrites, and outputs are sent via the axon.
Artificial neural networks use a very simplified mathematical model of what a biological neuron does. It takes one or more inputs, does computations, and outputs a result to the next neuron.
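As a minimal sketch of that simplified model, the Python below treats a neuron as a weighted sum of its inputs passed through a sigmoid. The sigmoid activation, the function name `neuron`, and the weight and bias values are illustrative assumptions; the description above does not say which computation the neuron performs.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Hypothetical artificial neuron: weighted sum of the inputs, then sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


# Illustrative values only; real weights would be learned from data.
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(output)
```

The number that `neuron` returns is what would be passed on as an input to the next neuron.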
Now, instead of simulating just one neuron, we simulate many. This allows us to input numbers, compute, and generate outputs using a network of neurons.
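One way to picture simulating many neurons at once is sketched below: neurons are grouped into layers, and the outputs of one layer become the inputs of the next. The layer sizes, the sigmoid activation, and all weight values are assumptions chosen for illustration, not details given in this lecture.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """Return one output per neuron; each neuron has its own weights and bias."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, weights)) + b)
        for weights, b in zip(weight_rows, biases)
    ]


def network(inputs, layers):
    """Feed the outputs of each layer in as the inputs of the next layer."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical network: 2 inputs, a hidden layer of 3 neurons, then 1 output neuron.
hidden = ([[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]], [0.0, 0.1, -0.1])
out = ([[0.7, -0.8, 0.9]], [0.0])
prediction = network([1.0, 2.0], [hidden, out])
print(prediction)
```

Here the weights are fixed by hand; in practice they would be learned from training data.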
Even though neural networks were originally biologically motivated, today’s researchers don’t rely heavily on biological principles. Instead, they use engineering techniques to build effective algorithms.
So why have neural networks taken off only in the last several years? The answer lies in the availability of big data and the inability of traditional algorithms like linear and logistic regression to keep improving as the amount of data grows. Neural networks, when scaled up with more neurons, continue to show higher performance with larger datasets.
Training larger neural networks led to significant performance improvements in tasks like speech recognition, image recognition, and natural language processing.
The rise of faster processors and GPUs (originally designed for graphics) also contributed to deep learning’s rapid growth, enabling the training of very large networks with large datasets.
This is how neural networks started and why they’ve taken off in the last several years. Now, let’s dive deeper into the details of how a neural network actually works. Please proceed to the next part.