Activation Functions in Neural Networks - Sigmoid, ReLU, Softmax Explained

⚡ Activation Functions in Neural Networks

In a neural network, the activation function plays the decision-making role: it determines whether, and how strongly, a neuron's signal is passed on. Without activation functions, a neural network can only compute linear transformations, no matter how many layers it has. Real-world problems (such as image recognition, NLP, and speech processing) are non-linear, which is why activation functions are needed.
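
To make the linearity claim concrete, here is a minimal sketch. The weight matrices `W1` and `W2` and the input `x` are hand-picked illustrative values, not taken from any real model, and biases are left out to keep it short. Two stacked linear layers collapse into a single linear layer, while putting a ReLU between them breaks that equivalence.

```python
import numpy as np

# Hypothetical weights for two linear layers (no biases, for brevity)
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
W2 = np.array([[1.0, 2.0]])
x = np.array([1.0, 0.0])

# Without an activation, two layers equal one layer with weights W2 @ W1
two_layers = W2 @ (W1 @ x)
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True

# With a ReLU in between, the result no longer matches the single linear map
with_relu = W2 @ np.maximum(0, W1 @ x)
print(np.allclose(with_relu, one_layer))   # False
```

The same collapse happens for any number of stacked linear layers, which is why depth alone adds no expressive power without a non-linearity.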

❓ Why Do We Need Activation Functions?

Introduces non-linearity
Creates complex decision boundaries
When differentiable, lets gradient descent and backpropagation train the network
Helps neurons pass relevant signals and ignore irrelevant ones

📌 Types of Activation Functions

1️⃣ Step Function

The most basic activation function. It returns 1 when the input is >= 0 and 0 otherwise. It is useful for simple binary classification, but it does not work with gradient-based training because its gradient is zero everywhere except at the jump.


import numpy as np

def step_function(x):
    # Returns 1 where x >= 0 and 0 elsewhere
    return np.where(x >= 0, 1, 0)

print(step_function(np.array([-2, -1, 0, 1, 2])))
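
Why the step function cannot be trained with gradients can be checked numerically. The following is a minimal sketch: `numerical_gradient` is a hypothetical helper introduced here for illustration, and the sample points deliberately avoid x = 0, where the function jumps.

```python
import numpy as np

def step_function(x):
    return np.where(x >= 0, 1, 0)

def numerical_gradient(f, x, h=1e-4):
    # Central difference approximation of df/dx (illustrative helper)
    return (f(x + h) - f(x - h)) / (2 * h)

x = np.array([-2.0, -0.5, 0.5, 2.0])
step_grad = numerical_gradient(step_function, x)
print(step_grad)  # all zeros: no signal for gradient descent to follow
```

Since the gradient is zero on both sides of the jump, backpropagation has nothing to propagate through a step unit.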

2️⃣ Sigmoid Function

Sigmoid compresses every input into the range (0, 1). It is used in logistic regression and binary classification. However, for large positive or negative inputs it saturates and the gradient vanishes.


import numpy as np

def sigmoid(x):
    # Logistic function: maps any real input into (0, 1)
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([-5, 0, 5])))
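
The vanishing gradient mentioned above follows from the sigmoid derivative, sigmoid(x) * (1 - sigmoid(x)). A minimal sketch, where `sigmoid_grad` is a helper name assumed here for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
grads = sigmoid_grad(x)
print(grads)  # largest (0.25) at x = 0, shrinking toward 0 at both ends
```

Because the gradient never exceeds 0.25, multiplying many such factors across layers during backpropagation shrinks the signal quickly.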

3️⃣ Tanh Function

Tanh compresses every input into the range (-1, 1). It is often preferred over sigmoid because its output is zero-centered. However, it also suffers from vanishing gradients for large-magnitude inputs.


import numpy as np

def tanh(x):
    # Thin wrapper around NumPy's element-wise hyperbolic tangent
    return np.tanh(x)

print(tanh(np.array([-3, 0, 3])))
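
As a rough illustration of "zero-centered", the sketch below compares the average output of tanh and sigmoid over a symmetric set of inputs, and also evaluates the tanh derivative, 1 - tanh(x)^2. The input grid is an arbitrary choice made for this example.

```python
import numpy as np

x = np.linspace(-3, 3, 7)  # symmetric inputs: -3, -2, ..., 3

tanh_out = np.tanh(x)
sigmoid_out = 1 / (1 + np.exp(-x))

print(tanh_out.mean())     # approximately 0: outputs are centered on zero
print(sigmoid_out.mean())  # approximately 0.5: outputs are always positive

# d/dx tanh(x) = 1 - tanh(x)^2, which is 1 at x = 0 and small for large |x|
tanh_grad = 1 - tanh_out ** 2
print(tanh_grad)
```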

4️⃣ ReLU (Rectified Linear Unit)

One of the most popular activation functions in deep learning. It sets negative values to 0 and passes positive values through unchanged. Because its gradient is 1 for every positive input, it largely mitigates the vanishing gradient problem.


import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives pass through
    return np.maximum(0, x)

print(relu(np.array([-5, -1, 0, 2, 5])))
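
The gradient behaviour of ReLU can be sketched the same way. `relu_grad` is a name assumed here, and treating the gradient at exactly x = 0 as 0 is one common convention rather than the only choice.

```python
import numpy as np

def relu_grad(x):
    # Gradient is 1 for positive inputs and 0 otherwise
    # (the value at exactly x == 0 is a convention; 0 is used here)
    return (x > 0).astype(float)

x = np.array([-5.0, -1.0, 0.0, 2.0, 5.0])
relu_grads = relu_grad(x)
print(relu_grads)  # [0. 0. 0. 1. 1.]
```

Positive inputs keep a gradient of exactly 1 however large they are, so the signal does not shrink there. Negative inputs get a gradient of 0, which is the source of the "dying ReLU" issue.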

5️⃣ Leaky ReLU

An improved version of ReLU. Instead of ignoring negative inputs completely, it gives them a small slope (0.01x). This mitigates the dying ReLU problem, where neurons that only receive negative inputs output 0 and stop learning.


import numpy as np

def leaky_relu(x, alpha=0.01):
    # Positive inputs pass through; negative inputs are scaled by alpha
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-5, -1, 0, 2, 5])))
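
For comparison with plain ReLU, here is a sketch of the Leaky ReLU gradient. `leaky_relu_grad` is a hypothetical helper, and `alpha=0.01` simply mirrors the default used above.

```python
import numpy as np

def leaky_relu_grad(x, alpha=0.01):
    # Gradient is 1 for positive inputs and alpha (not 0) otherwise
    return np.where(x > 0, 1.0, alpha)

x = np.array([-5.0, -1.0, 0.0, 2.0, 5.0])
leaky_grads = leaky_relu_grad(x)
print(leaky_grads)  # alpha for non-positive inputs, 1 for positive inputs
```

Because the gradient never becomes exactly zero, a neuron stuck in the negative region can still receive small weight updates.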

6️⃣ Softmax Function

Softmax is used in multi-class classification problems. It converts the outputs into a probability distribution: each class probability lies in the range (0, 1), and the probabilities sum to 1.


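import numpy as np

def softmax(x):
    # Subtract the max before exponentiating to avoid overflow (expects a 1-D array)
    exp_x = np.exp(x - np.max(x))
    return exp_x / exp_x.sum()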

print(softmax(np.array([2, 1, 0.1])))
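
The function above works on a single vector. A possible extension to a batch of score vectors is sketched below. `softmax_batch` and the `axis` parameter are assumptions of this sketch, and the second row uses deliberately large values to show why the max is subtracted.

```python
import numpy as np

def softmax_batch(x, axis=-1):
    # Subtract the per-row max for numerical stability; the result is unchanged
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp_x = np.exp(shifted)
    return exp_x / exp_x.sum(axis=axis, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 1001.0, 1002.0]])  # a naive exp() would overflow here
probs = softmax_batch(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```

Shifting by the maximum does not change the result, because the common factor cancels in the ratio. It only keeps the exponentials within floating-point range.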

📊 Comparison of Activation Functions

Function	Range	Pros	Cons
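Step	{0, 1}	Simple, fast	Zero gradient, cannot be trained with gradient descent
Sigmoid	(0, 1)	Probabilistic output	Vanishing gradient
Tanh	(-1, 1)	Zero-centered	Vanishing gradient
ReLU	[0, ∞)	Efficient, mitigates vanishing gradient	Dying ReLU
Leaky ReLU	(-∞, ∞)	Mitigates dying ReLU	Slope (alpha) is an extra hyperparameter
Softmax	(0, 1), sums to 1	Probability distribution for multi-class classification	Costlier than element-wise functions (exp plus normalization over all classes)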

🏆 Conclusion

Activation functions are one of the most important parts of a neural network. Without them, learning complex non-linear patterns is impossible. The right choice of activation function depends on the data and the task. In industry today, ReLU, Leaky ReLU, and Softmax are among the most widely used, while newer activation functions such as Swish and GELU continue to come out of research.