From the course: Generative AI: Introduction to Large Language Models

What is a neural network?

- [Narrator] The human brain is a massively parallel processor made up of a very large network of interconnected cells called neurons. This intricate web of neurons enables the complex information processing, communication, and cognitive functions that underlie human intelligence and behavior. Each neuron receives signals from multiple other neurons through its dendrites, the input channels of the neuron. These signals, often referred to as inputs, carry information in the form of chemical and electrical impulses. The neuron processes these inputs to compute a weighted sum, which is then evaluated against a certain threshold. If the sum surpasses this threshold, the neuron generates an output signal, or action potential, which travels along its axon to communicate with other neurons.

Artificial neural networks solve learning problems by modeling the relationship between a set of input signals and an output signal, similar to the way biological neurons work. As a set of inputs is received by an artificial node, each input is weighted by importance. These weighted signals are then summed and passed through an activation function, which determines what value is forwarded by the node. An activation function is a component within a neural network that transforms a neuron's combined input signals into a single output signal.

To understand how activation functions are used by an artificial neural network, let's assume we have a sample training data set that looks like this. Our objective is to provide a neural network with the independent variables, amount, grade, and purpose, so it learns to correctly predict the values of the dependent variable, whether a borrower will default on their loan or not. Neural networks learn through numbers, so we must represent our training data in numeric form, similar to what we have done here.
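The weighting-and-summing step described above can be sketched in a few lines of Python. The input values, weights, and bias here are hypothetical numbers chosen purely for illustration, not values from the course's data set:

```python
# A single artificial node: weighted sum of inputs plus a bias term.
# All numbers below are hypothetical, for illustration only.

def weighted_sum(inputs, weights, bias):
    """Multiply each input by its weight, sum the products, and add the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

inputs = [1.0, 0.5, 2.0]    # e.g. numerically encoded amount, grade, purpose
weights = [0.2, -0.4, 0.3]  # importance the network assigns to each input
bias = 0.1

print(round(weighted_sum(inputs, weights, bias), 2))  # 0.7
```

The weights start out as guesses; training is the process of adjusting them until the network's predictions match the known outcomes.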
As each row of the training data is passed to the input nodes of the neural network, the network assigns a weight to each input. These weights correspond to how important the neural network thinks the input data is. Each input signal is multiplied by its corresponding weight, the products are summed, and a bias weight is added to the sum. In this example, we see that the sum is 0.4. Now, this is where the activation function comes in. One of the most basic activation functions is known as a unit step, or threshold, activation function. The output of the function is zero if the sum of the input signals is less than zero, or one if the sum of the input signals is zero or more. Since the sum of our input signals is 0.4, the threshold activation function will fire, outputting one as the prediction of the neural network.

Now, let's see what happens when the second row of the training data is passed through the neural network. As the inputs come in, they're weighted and aggregated. The sum this time around is -0.4. Because the sum is less than zero, the activation function outputs zero as the prediction of the network. This process of weighted aggregation and activation across multiple interconnected nodes allows artificial neural networks to learn complex patterns and make predictions based on input data.

Besides the threshold activation function, there are several other activation functions we can choose from when designing a neural network. The function we choose depends on the learning task at hand. One option is the sigmoid activation function. This function is often used for binary response problems. Instead of an output of zero or one, the output of the sigmoid activation function can be any real number between zero and one. Another option is the hyperbolic tangent activation function. This function differs from the sigmoid in that it is zero-centered: its output falls between -1 and one.
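The three activation functions described above can be written out directly. This minimal sketch uses plain Python and reproduces the two sums from the walkthrough, 0.4 and -0.4:

```python
import math

def unit_step(x):
    """Threshold activation: 0 if the sum is less than zero, else 1."""
    return 0 if x < 0 else 1

def sigmoid(x):
    """Squashes any real number into the open range (0, 1)."""
    return 1 / (1 + math.exp(-x))

def tanh(x):
    """Zero-centered; output falls between -1 and 1."""
    return math.tanh(x)

# The two sums from the walkthrough:
print(unit_step(0.4))   # 1 -- first row: the node fires
print(unit_step(-0.4))  # 0 -- second row: the node does not fire

# The smoother alternatives applied to the same sums:
print(round(sigmoid(0.4), 3))
print(round(tanh(-0.4), 3))
```

Unlike the hard zero-or-one jump of the unit step, the sigmoid and hyperbolic tangent outputs vary smoothly, which is what makes them useful when the network needs graded outputs rather than a binary decision.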
The rectified linear unit (ReLU) activation function is the most commonly used activation function. It transforms the sum of inputs by setting all negative values to zero and leaving positive values unchanged. It is computationally efficient and has played a significant role in the success of deep neural networks.

The type of network introduced in this video is known as a perceptron, a very basic type of artificial neural network. It is one of many neural network architectures we can use to solve learning problems. The complexity of the task that a neural network can learn is determined by the topology of the network, so a perceptron, with its minimal topology, can only solve simple problems. In subsequent course videos, we will explore different neural network architectures and discuss the learning problems each one is best suited for.
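ReLU, and the full single-node perceptron that chains a weighted sum through an activation, can be sketched as follows. The weights and inputs are hypothetical; a real network would learn the weights from training data rather than use these hand-picked values:

```python
def relu(x):
    """Rectified linear unit: negative values become zero, positives pass through."""
    return max(0.0, x)

def perceptron(inputs, weights, bias, activation):
    """One node: weighted sum of inputs plus bias, passed through an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(total)

print(relu(-0.4))  # 0.0 -- negative sums are clipped to zero
print(relu(0.4))   # 0.4 -- positive sums pass through unchanged

# Hypothetical weights, for illustration only:
print(perceptron([1.0, 0.5, 2.0], [0.2, -0.4, 0.3], 0.1, relu))
```

Swapping the `activation` argument (unit step, sigmoid, tanh, or ReLU) changes the node's behavior without touching the weighted-sum logic, which is why activation functions are treated as an interchangeable design choice.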
