In the previous article, we calculated the derivatives of the cross-entropy loss. In this article, we begin optimizing the bias terms using backpropagation.
We start by setting the bias ( b_3 ) to an initial value. In this case, we choose
b_3 = -2
To verify that backpropagation is actually improving the model, we first compute the total cross entropy over the training data for this value of ( b_3 ).
We use the following bias values for the other nodes:

b_1 = 1.6, b_2 = 0.7, b_4 = 0, b_5 = 1

and we keep

b_3 = -2
Forward pass computation
For a single input example, we compute the intermediate values as follows.
Upper node:

upper = petal × (-2.5) + sepal × 0.6 + b_1

Bottom node:

bottom = petal × (-1.5) + sepal × 0.4 + b_2

Both node values are then passed through the ReLU activation, ReLU(x) = max(0, x), before reaching the output layer.
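For the first training row (petal = 0.04, sepal = 0.42, Setosa), this gives:

upper = 0.04 × (-2.5) + 0.42 × 0.6 + 1.6 = 1.752
bottom = 0.04 × (-1.5) + 0.42 × 0.4 + 0.7 = 0.808

Both values are positive, so the ReLU leaves them unchanged.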
Raw output values

raw_setosa = ReLU(upper) × (-0.1) + ReLU(bottom) × 1.5 + b_3
raw_versicolor = ReLU(upper) × 2.4 + ReLU(bottom) × (-5.2) + b_4
raw_virginica = ReLU(upper) × (-2.2) + ReLU(bottom) × 3.7 + b_5
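Continuing the same example with b_3 = -2:

raw_setosa = 1.752 × (-0.1) + 0.808 × 1.5 + (-2) ≈ -0.963
raw_versicolor = 1.752 × 2.4 + 0.808 × (-5.2) + 0 ≈ 0.003
raw_virginica = 1.752 × (-2.2) + 0.808 × 3.7 + 1 ≈ 0.135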
Softmax probabilities

Each raw value is converted into a probability:

p_species = exp(raw_species) / (exp(raw_setosa) + exp(raw_versicolor) + exp(raw_virginica))
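For the same example:

p_setosa = e^(-0.963) / (e^(-0.963) + e^(0.003) + e^(0.135)) ≈ 0.382 / 2.530 ≈ 0.15

which matches the first row of the table below.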
Cross-entropy loss

For each example, the loss is the negative log of the probability assigned to the true species:

CE = -log(p_true)

where log is the natural logarithm.
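For the Setosa example, CE = -log(0.15) ≈ 1.89, again matching the first row of the table below.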
| Petal | Sepal | Species | ( p ) of true species | Cross Entropy |
|---|---|---|---|---|
| 0.04 | 0.42 | Setosa | 0.15 | 1.89 |
| 1.00 | 0.54 | Virginica | 0.71 | 0.35 |
| 0.50 | 0.37 | Versicolor | 0.65 | 0.43 |
The total cross entropy when ( b_3 = -2 ) is
Total CE = 1.89 + 0.35 + 0.43 = 2.67
Visualizing the loss curve
We can visualize how the total cross entropy changes with different values of ( b_3 ) by plotting ( b_3 ) on the x-axis and the total cross entropy on the y-axis. If we evaluate many values of ( b_3 ), we obtain a smooth pink curve with a clear minimum.
We can generate this plot with the following Python code:
```python
import numpy as np
import matplotlib.pyplot as plt

def relu(x):
    return np.maximum(0, x)

def softmax(raws):
    exp_vals = np.exp(raws)
    return exp_vals / np.sum(exp_vals)

# Fixed biases (only b3 is varied below)
b1 = 1.6
b2 = 0.7
b4 = 0
b5 = 1

# Sample training data: (petal, sepal, true_class)
data = [
    (0.04, 0.42, 0),  # Setosa
    (1.00, 0.54, 2),  # Virginica
    (0.50, 0.37, 1),  # Versicolor
]

def total_cross_entropy(b3):
    """Total cross entropy over the training data for a given b3."""
    total_ce = 0.0
    for petal, sepal, target in data:
        # Hidden-layer pre-activations
        upper = petal * -2.5 + sepal * 0.6 + b1
        bottom = petal * -1.5 + sepal * 0.4 + b2
        # Output layer: ReLU activations times output weights, plus biases
        raw_setosa = relu(upper) * -0.1 + relu(bottom) * 1.5 + b3
        raw_versi = relu(upper) * 2.4 + relu(bottom) * -5.2 + b4
        raw_virg = relu(upper) * -2.2 + relu(bottom) * 3.7 + b5
        probs = softmax([raw_setosa, raw_versi, raw_virg])
        total_ce += -np.log(probs[target])
    return total_ce

# Evaluate the loss on a grid of b3 values and plot the curve
b3_values = np.linspace(-6, 4, 200)
losses = [total_cross_entropy(b3) for b3 in b3_values]

plt.plot(b3_values, losses, color="pink")
plt.xlabel("b3")
plt.ylabel("Total Cross Entropy")
plt.title("Cross Entropy vs b3")
plt.show()
```
Running this code produces the pink curve; its lowest point corresponds to the value of ( b_3 ) that minimizes the total cross entropy.
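As a quick check, evaluating the function at ( b_3 = -2 ) reproduces the hand-computed total from the table, and `np.argmin` reads the approximate minimizer off the same grid (this reuses `total_cross_entropy`, `b3_values`, and `losses` from the code above):

```python
# Sanity check: should print roughly 2.67, matching the table above
print(total_cross_entropy(-2))

# The grid value of b3 with the lowest total cross entropy
best_b3 = b3_values[np.argmin(losses)]
print(best_b3)
```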
In the next part of the article, we will use backpropagation to move ( b_3 ) toward this minimum and update it step by step.
Looking for an easier way to install tools, libraries, or entire repositories?
Try Installerpedia: a community-driven, structured installation platform that lets you install almost anything with minimal hassle and clear, reliable guidance.
Just run:
ipm install repo-name
… and you're done!