Homework 2: CSCE 970 (Spring 2017)


Assigned Friday, February 10
Due Sunday, February 19
Total points: 60

When you hand in your results from this homework, you should submit the following:

A single .pdf file with your writeup. Only PDF will be accepted, and you should submit only one PDF file, named username1-username2.pdf, where username1 and username2 are your team members' user names on cse.

Submit everything by the due date and time using the handin program.

On this homework, you must work with your homework partner.

You work for a tech start-up called i_C_Catz™ (slogan: "For all your cutting-edge cat detection needs"), which is trying to make money off this new "deep learning" fad. Your boss, who knows nothing about machine learning, hired you because of the stellar reference letter that you were given from your instructor. Because you are eminently qualified in machine learning, your boss tasks you with the job of implementing a convolutional neural network for cat detection in images.

Of course, you are up to the task. However, your boss sent all this quarter's development budget to a Nigerian prince who has not gotten back to him with his promised return on investment. So you don't have access to any deep learning libraries. Thus, one of the first tasks you will need to complete (and the only one for this assignment) is to determine how to perform weight updates.

In your machine learning class, your lecture notes on Backpropagation told you to do the following in training trial t:

Update each network weight w^t_j,i:
w^t_j,i ← w^t_j,i + Δ w^t_j,i
where
Δ w^t_j,i = η δ^t_j x^t_j,i

where η is the learning rate, w^t_j,i is the weight of the connection from node i to node j, and x^t_j,i is the signal transmitted from node i to node j in trial t. When hidden units used the sigmoid function as their activation function, we computed hidden node h's error term δ^t_h as

δ^t_h ← ŷ^t_h (1 - ŷ^t_h) Σ_{k ∈ down(h)} w^t_k,h δ^t_k

where ŷ^t_h is node h's output and down(h) is the set of nodes downstream of node h, i.e., the nodes that receive h's output as direct input.
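
As a reference point, the two sigmoid-case rules above can be sketched in plain Python. This is only an illustration of the formulas as stated; the function names (hidden_delta, update_weights) and the numbers are hypothetical and are not taken from the lecture notes.

```python
def hidden_delta(y_h, downstream):
    """Error term for a sigmoid hidden node h.

    y_h: output of node h on this trial.
    downstream: list of (w_kh, delta_k) pairs, one per node k in down(h).
    """
    return y_h * (1.0 - y_h) * sum(w_kh * delta_k for w_kh, delta_k in downstream)


def update_weights(weights, inputs, delta_j, eta):
    """Apply w_ji <- w_ji + eta * delta_j * x_ji to each incoming weight of node j."""
    return [w + eta * delta_j * x for w, x in zip(weights, inputs)]


# Hypothetical values, chosen only for illustration.
delta_h = hidden_delta(0.5, [(0.4, 0.1), (-0.2, 0.3)])
new_w = update_weights([0.1, -0.3], [1.0, 2.0], delta_h, eta=0.5)
```

Note that the factor y_h * (1 - y_h) is the derivative of the sigmoid; it is the part of the rule that is specific to the activation function.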

You are to summarize the changes to the algorithm when using a convolutional neural network architecture similar to what was described in class this week. In particular, when we change our sigmoid hidden units to the series

convolutional node → ReLU node → max pooling node

summarize what changes need to be made to correctly perform weight updates. You may assume that the loss function is square loss and the output units all use sigmoid activation functions, so there is no need to change the update process of the output nodes.
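
To fix ideas, here is a minimal forward-pass sketch of the series above in plain Python. It assumes a 1-D input, a single shared kernel applied without padding, and non-overlapping pooling windows of width 2; these choices, and the function names, are illustrative assumptions and may differ from the architecture described in class. It shows only the forward direction; the backward pass is what your writeup should address.

```python
def conv1d(x, kernel, bias=0.0):
    """Convolutional node: slide one shared kernel over x (no padding)."""
    k = len(kernel)
    return [sum(kernel[m] * x[i + m] for m in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(z):
    """ReLU node: pass positive values through, clamp the rest to zero."""
    return [max(0.0, v) for v in z]


def max_pool(a, width=2):
    """Max pooling node: keep the largest value in each non-overlapping window."""
    return [max(a[start:start + width])
            for start in range(0, len(a) - width + 1, width)]


# Hypothetical input and kernel, chosen only for illustration.
x = [1.0, 2.0, -1.0, 0.0, 3.0]
z = conv1d(x, [1.0, -1.0])
a = relu(z)
pooled = max_pool(a)
```

Each of the three stages differs from a sigmoid unit in a way that matters for the update: the kernel weights are shared across positions, the ReLU has a piecewise derivative, and the pooling node has no weights of its own.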

You may utilize any resource you wish for this assignment, including tutorials, scientific papers, or your own derivation (though not an existing implementation). However, you must cite any sources that you use. In a writeup of 1–2 pages, describe the changes that are required, and carefully explain your reasoning for these changes.

Finally, as part of your writeup, compare your results to an existing implementation such as TensorFlow. Highlight any similarities and differences that you notice.


Last modified 10 February 2017; please report problems to sscott.