## Transcribed Text

Neural Network and Backpropagation

Let's first use a single neuron to recognize handwritten digits.
[Diagram: inputs x1, x2, x3, …, x784 feeding a neuron with parameters θ; weight matrix of size 784 × 10]
Let's use multiple neurons to recognize handwritten digits.
A neural network is put together by hooking together many of our simple "neurons," so that the output of a neuron can be the input of another.
[Diagram: input layer (x1 … x784), hidden layer (30 neurons), output layer (10 neurons)]
Let's use multiple neurons to recognize handwritten digits: define weights.
[Diagram: input layer (x1 … x784), hidden layer (30 neurons, weights W1: 784 × 30), output layer (10 neurons, weights W2: 30 × 10)]
Let's use multiple neurons to recognize handwritten digits: define biases.
[Diagram: input layer (x1 … x784), hidden layer (30 neurons, weights W1: 784 × 30, biases b1 … b30), output layer (10 neurons, weights W2: 30 × 10, biases b1 … b10)]
Can you recall the bias term from Lecture 1 or 2?
Let's use multiple neurons to recognize handwritten digits: define the weighted input.
[Diagram: the 784-30-10 network with its weights and biases, now labelled with weighted inputs z1 (hidden layer) and z2 (output layer)]

z^l = W^l a^(l−1) + b^l

10/02/2020 Ahsan Adeel - AI and ML Course (6CS012)
Let's use multiple neurons to recognize handwritten digits: define activations.
[Diagram: the 784-30-10 network with weighted inputs z1, z2 and activations a0 (input), a1 (hidden), a2 (output)]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)
Let's use multiple neurons to recognize handwritten digits: forward propagation.
[Diagram: the 784-30-10 network with biases, weighted inputs z1, z2, and activations a0, a1, a2]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)

• A_0 = ?
• Z_1 = ?
• A_1 = ?
• Z_2 = ?
• A_2 = ?
• error = ?
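The forward pass defined by z^l = W^l a^(l−1) + b^l and a^l = σ(z^l) can be sketched in NumPy. This is a minimal illustration rather than the WS6 solution: the weight shapes (30 × 784 and 10 × 30) follow the column-vector convention implied by the gradient formulas later in the deck, and a random vector stands in for a real MNIST image.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.standard_normal((30, 784)) * 0.01   # hidden-layer weights
b1 = np.zeros((30, 1))                       # hidden-layer biases
W2 = rng.standard_normal((10, 30)) * 0.01    # output-layer weights
b2 = np.zeros((10, 1))                       # output-layer biases

A_0 = rng.random((784, 1))   # stand-in for one flattened 28x28 image
Z_1 = W1 @ A_0 + b1          # weighted input of the hidden layer
A_1 = sigmoid(Z_1)           # hidden activations
Z_2 = W2 @ A_1 + b2          # weighted input of the output layer
A_2 = sigmoid(Z_2)           # network output: 10 scores, one per digit
```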
Let's use multiple neurons to recognize handwritten digits: backpropagation.
[Diagram: the 784-30-10 network with biases, weighted inputs z1, z2, and activations a0, a1, a2]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)

Update:
• W1?
• b1?
• W2?
• b2?
Backpropagation
Update (η is the learning rate):
W1 = W1 − η ∇J(W1)
b1 = b1 − η ∇J(b1)
W2 = W2 − η ∇J(W2)
b2 = b2 − η ∇J(b2)
How do we find these gradients?
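These updates are plain gradient descent: step each parameter against its gradient, scaled by the learning rate η (eta). A one-dimensional toy example of my own, not from the slides, shows the rule at work:

```python
# Minimise J(w) = (w - 3)^2; its gradient is dJ/dw = 2(w - 3).
eta = 0.1   # learning rate (eta)
w = 0.0     # initial parameter
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w = w - eta * grad   # same form as W1 = W1 - eta * grad J(W1)
# after 100 steps w is very close to the minimiser, 3.0
```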
Let's use multiple neurons to recognize handwritten digits: introduce an error function.
[Diagram: the 784-30-10 network with z1, z2, a0, a1, a2, and layer errors δ1, δ2, δ3]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)
Let's use multiple neurons to recognize handwritten digits: introduce an error function.
[Diagram: the 784-30-10 network with z1, z2, a0, a1, a2, and layer errors δ1, δ2]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)
δ3 = a2 − y
Four BP equations
σ(z) = 1 / (1 + exp(−z))
σ'(z) = σ(z) (1 − σ(z))
Introduce an error function: let's call the last layer "layer L" and all the others "layers l".
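Both identities are easy to sanity-check numerically: σ(0) = 0.5, σ'(0) = 0.25, and the derivative identity agrees with a finite-difference estimate of the slope.

```python
import numpy as np

def sigmoid(z):
    """The logistic sigmoid: sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative via the identity sigma'(z) = sigma(z) * (1 - sigma(z))."""
    return sigmoid(z) * (1 - sigmoid(z))

z = 0.7
h = 1e-6
# Central finite difference: (sigma(z+h) - sigma(z-h)) / 2h
finite_diff = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
# finite_diff closely matches sigmoid_prime(0.7)
```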
Let's use multiple neurons to recognize handwritten digits.
[Diagram: the 784-30-10 network with z1, z2, a0, a1, a2, and layer errors δ1, δ2]

z^l = W^l a^(l−1) + b^l
a^l = σ(z^l)
δ3 = a2 − y
δ1 = ?  δ2 = ?  δ3 = ?

∇J(W1) = δ1 · a0^T
∇J(W2) = δ2 · a1^T
∇J(b1) = δ1
∇J(b2) = δ2

W1 = W1 − η ∇J(W1)
W2 = W2 − η ∇J(W2)
b1 = b1 − η ∇J(b1)
b2 = b2 − η ∇J(b2)
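Putting the pieces together, one pass of forward and backward propagation with these updates might look as follows in NumPy. This is a hedged sketch, not the WS6 solution: it assumes column vectors, the shapes used earlier, and an output-layer error δ2 = a2 − y (exact for a cross-entropy cost with sigmoid outputs; note the slides' δ indexing mixes δ2 and δ3 for the output layer). The variable names are mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))

rng = np.random.default_rng(1)
eta = 0.5
W1 = rng.standard_normal((30, 784)) * 0.01; b1 = np.zeros((30, 1))
W2 = rng.standard_normal((10, 30)) * 0.01;  b2 = np.zeros((10, 1))

a0 = rng.random((784, 1))           # stand-in for one MNIST image
y = np.zeros((10, 1)); y[3] = 1.0   # one-hot target

# Forward pass: z^l = W^l a^(l-1) + b^l, a^l = sigmoid(z^l)
z1 = W1 @ a0 + b1; a1 = sigmoid(z1)
z2 = W2 @ a1 + b2; a2 = sigmoid(z2)

# Backward pass
delta2 = a2 - y                               # output-layer error
dW2 = delta2 @ a1.T                           # grad J(W2) = delta2 . a1^T
db2 = delta2                                  # grad J(b2) = delta2
delta1 = (W2.T @ delta2) * sigmoid_prime(z1)  # error pushed back to hidden layer
dW1 = delta1 @ a0.T                           # grad J(W1) = delta1 . a0^T
db1 = delta1                                  # grad J(b1) = delta1

# Gradient-descent updates
W2 -= eta * dW2; b2 -= eta * db2
W1 -= eta * dW1; b1 -= eta * db1
```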
Task 1 (MNIST classification using an SGD-driven ANN)
- Download the WS6 folder from Canvas.
- Use the "MNIST_Data" from WS5 and store it in the same folder.
- The provided code will extract and pre-process the dataset.
- The provided code also does the post-processing, including thresholding, accuracy, etc.
- For your ease, the following lines are left empty for you to fill in.
```python
# ForwardProp
Z_0 = x
A_0 = Z_0
Z_1 =
A_1 =
Z_2 =
A_2 =
error =

# Backprop
delta_3 =
delta_2 =
d_W_2 =   # derivative of 'J' w.r.t. 'W2'
d_b_2 =   # derivative of 'J' w.r.t. 'b2'
delta_1 =
d_W_1 =   # derivative of 'J' w.r.t. 'W1'
d_b_1 =   # derivative of 'J' w.r.t. 'b1'

W_1 = W_1 - eta * d_W_1
W_2 = W_2 - eta * d_W_2
b_1 = b_1 - eta * d_b_1
b_2 = b_2 - eta * d_b_2

# ForwardProp - testing on the test data
A_0 = np.reshape(x_t, (784, 500))
Z_1 =
A_1 =
Z_2 =
A_2 =
```
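At test time the same forward pass runs on a whole batch at once: with A_0 of shape (784, 500), the biases broadcast across the 500 columns and each Z and A simply gains a batch dimension. A sketch under those shape assumptions, with random data standing in for x_t and for the trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
W_1 = rng.standard_normal((30, 784)) * 0.01   # stand-in for trained weights
b_1 = np.zeros((30, 1))
W_2 = rng.standard_normal((10, 30)) * 0.01
b_2 = np.zeros((10, 1))

A_0 = rng.random((784, 500))          # 500 test images, one per column
Z_1 = W_1 @ A_0 + b_1                 # (30, 500); b_1 broadcasts over columns
A_1 = sigmoid(Z_1)
Z_2 = W_2 @ A_1 + b_2                 # (10, 500)
A_2 = sigmoid(Z_2)
predictions = np.argmax(A_2, axis=0)  # predicted digit for each column
```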
Task 1 (MNIST classification using an SGD-driven ANN)
1.1. Explain in detail (step by step) the forward- and backpropagation algorithms, with equations.
1.2. Complete the equations and run the code.
1.3. Test the model on the test data.
1.4. Compare the results with WS5 Task 2.
1.5. Comment on the results in detail.
References:
• Stanford UFLDL Tutorial: http://deeplearning.stanford.edu/tutorial
• Huff, Trevor, and Scott C. Dulebohn. "Neuroanatomy, Visual Cortex." (2017).
• Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org
• Khan Academy: https://www.khanacademy.org/math/statistics-probabilit
• 3Blue1Brown: https://www.3blue1brown.com/
• Michael Nielsen, Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/index.html


```python
import mnist_loader
import numpy as np
import matplotlib.pyplot as plt
import scipy.sparse
import input_data

def oneHotIt(Y):
    m = Y.shape[0]
    #Y = Y[:,0]
    OHX = scipy.sparse.csr_matrix((np.ones(m), (Y, np.array(range(m)))))
    OHX = np.array(OHX.todense()).T
    return OHX

def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))

mnist = input_data.read_data_sets("MNIST_Data/", one_hot=False)
batch = mnist.train.next_batch(5000)
Y = batch[1]
X = batch[0]
#X = X / 255
y_new = np.zeros(Y.shape)
y_new[np.where(Y == 0.0)[0]] = 1
Y = y_new
...
```
Y = y_new...