In deep learning, activation functions play a crucial role in determining the output of a neural network. They introduce non-linearity into the network, allowing it to learn complex patterns in the data. In this tutorial, we will discuss some of the commonly used activation functions in deep learning, such as sigmoid, tanh, ReLU, and softmax. We will also implement these activation functions using TensorFlow and Keras in Python.
- Sigmoid Activation Function:
The sigmoid activation function is a commonly used activation function in neural networks. It takes an input value and outputs a value between 0 and 1, which can be interpreted as the probability of the neuron being activated. The sigmoid function is defined as:
S(x) = 1 / (1 + e^(-x))
To implement the sigmoid activation function in TensorFlow, you can use the built-in tf.keras.activations.sigmoid() function or define it yourself from the formula above. Here is an example code snippet that defines it manually:
import tensorflow as tf
import numpy as np
# Input data
x = np.array([-1, 0, 1, 2, 3], dtype=np.float32)
# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + tf.exp(-x))
# Calculate the output
output = sigmoid(x)
print(output)
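For comparison, the built-in Keras function mentioned above gives the same result; a minimal sketch, reusing the same input values:
import tensorflow as tf
import numpy as np
x = np.array([-1, 0, 1, 2, 3], dtype=np.float32)
# Built-in Keras sigmoid; matches the manual definition up to floating-point precision
builtin_output = tf.keras.activations.sigmoid(tf.constant(x))
print(builtin_output)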
- Tanh Activation Function:
The tanh activation function is another commonly used activation function in neural networks. It takes an input value and outputs a value between -1 and 1. The tanh function is defined as:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
To implement the tanh activation function in TensorFlow, you can use the built-in tf.keras.activations.tanh() function or tf.math.tanh(). Here is an example code snippet, reusing the same input array x as above:
# Define the tanh activation function
def tanh(x):
    return tf.math.tanh(x)
# Calculate the output
output = tanh(x)
print(output)
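As a side note, tanh is just a rescaled and shifted sigmoid, which the identity tanh(x) = 2 * sigmoid(2x) - 1 makes explicit; a minimal sketch verifying this with the built-in Keras functions:
import tensorflow as tf
x = tf.constant([-1, 0, 1, 2, 3], dtype=tf.float32)
# Built-in Keras tanh
tanh_out = tf.keras.activations.tanh(x)
# tanh expressed through sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
rescaled = 2.0 * tf.keras.activations.sigmoid(2.0 * x) - 1.0
print(tanh_out)
print(rescaled)  # same values up to floating-point precision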
- ReLU Activation Function:
The ReLU (Rectified Linear Unit) activation function is a widely used activation function in deep learning. It replaces all negative values in the input with zero, while leaving positive values unchanged. The ReLU function is defined as:
ReLU(x) = max(0, x)
To implement the ReLU activation function in TensorFlow, you can use the built-in tf.keras.activations.relu() function or compute it directly with tf.maximum(). Here is an example code snippet:
# Define the ReLU activation function
def relu(x):
    return tf.maximum(0.0, x)
# Calculate the output
output = relu(x)
print(output)
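A closely related variant, covered later in the video's chapter on leaky ReLU, multiplies negative inputs by a small slope (0.1 in the video) instead of zeroing them; a minimal sketch using TensorFlow's built-in functions:
import tensorflow as tf
import numpy as np
x = np.array([-1, 0, 1, 2, 3], dtype=np.float32)
# Standard ReLU: negative values become 0
print(tf.nn.relu(x))                   # [0. 0. 1. 2. 3.]
# Leaky ReLU: negative values are scaled by alpha instead of being zeroed
print(tf.nn.leaky_relu(x, alpha=0.1))  # [-0.1  0.  1.  2.  3.]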
- Softmax Activation Function:
The softmax activation function is often used in the output layer of a neural network for multi-class classification problems. It takes an input vector and normalizes it into a probability distribution over all classes. The softmax function is defined as:
softmax(x_i) = e^(x_i) / sum_j e^(x_j)
To implement the softmax activation function in TensorFlow, you can use the built-in tf.keras.activations.softmax() function or tf.nn.softmax(), which the snippet below uses:
# Define the softmax activation function
def softmax(x):
    return tf.nn.softmax(x)
# Calculate the output
output = softmax(x)
print(output)
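Because softmax produces a probability distribution, the outputs are non-negative and sum to 1, which is easy to verify; a minimal sketch using the same input values:
import tensorflow as tf
import numpy as np
x = np.array([-1, 0, 1, 2, 3], dtype=np.float32)
# Softmax is applied along the last axis and yields a probability distribution
probs = tf.nn.softmax(x)
print(probs)
print(tf.reduce_sum(probs))  # sums to ~1.0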
In this tutorial, we discussed some commonly used activation functions in deep learning and implemented them using TensorFlow and Keras in Python. Activation functions play a crucial role in the performance of neural networks, so it is important to choose the right activation function based on the problem at hand. I hope this tutorial was helpful in understanding activation functions in deep learning.
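To see how these activations typically fit together in practice, here is a minimal sketch of a Keras model; the input size, layer widths, and 10-class output are illustrative assumptions, not values from the video:
import tensorflow as tf
# Hypothetical architecture: layer sizes and the 10-class output are illustrative assumptions
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),     # ReLU in the hidden layers
    tf.keras.layers.Dense(32, activation='tanh'),     # tanh also works in hidden layers
    tf.keras.layers.Dense(10, activation='softmax'),  # softmax output for multi-class classification
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()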
Check out our premium machine learning course with 2 Industry projects: https://codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
00:00 Activation functions are necessary in neural networks
02:04 Activation functions are necessary for building non-linear equations in neural networks.
04:06 Step function and sigmoid function are activation functions used in classification
05:57 Use the sigmoid function in the output layer and the tanh function in all other places.
07:54 Derivatives and the problem of vanishing gradients
10:02 The most popular activation function for hidden layers is ReLU.
12:04 Sigmoid and tanh functions are used to convert values into a range of 0 to 1 or -1 to 1 respectively.
14:30 ReLU keeps positive values unchanged and sets negative values to zero; leaky ReLU multiplies negative inputs by 0.1
Your explanation is great!!!
1.75x seems to be normal speed
Thanks bro, that really helped.
please, do more videos like this, it's so good for my brain development🧨🧨🧨🧨
It's really helpful, thanks
Hello Sir, thank you for your tutorials; I found them very interesting and easy. Previously I was very afraid of machine learning, but thanks to your simple explanations it has become my favourite and most interesting subject. I have a doubt regarding this activation function tutorial: we add hidden layers because real-world features have a non-linear relationship with the output, but if an activation function like ReLU is used, which looks linear, how does it capture the non-linearity of the features? Another question: if I use either the sigmoid or tanh function for the hidden layers but not for the output layer, and there is no vanishing gradient problem in a given case, how does it capture the linking patterns of the features/inputs, since for any problem we are fitting them to a sigmoid or tanh function? Am I missing something? Could you please help me with both questions, sir?
Thank you!
Great work!
Best Video on YouTube on this topic
Very structured and organic build-up of concepts, not throwing a bunch at you in a short timeframe and praying you gobble it up. I appreciate your hard work behind the animations too. Keep it up!
really amazing
Definitely you will go to great heights if you take this up as full-time work
thank you great work
BOSS BOSS, one of the best pedagogues
GREAT explanation... this video and all the others in the playlist.
Thank you, sir
Congrats on your lovely tutorial. Is C++ being used for deep learning, or is Python at the top of the industry list for AI transformation?
Nice series of tutorials. Super easy and time-efficient explanations.