Assigned Wednesday, February 28
Due Tuesday, March 13 at 11:59 p.m.
When one person from your group hands in your team's results from this homework, they should submit via handin the following, in separate files:
Also, one person from your group (the same one who handed in the .py and .pdf files) will submit your team's model files for the competition. On crane, you should copy your model files to $WORK/handin. This folder has been created for you. Do not delete the handin folder and recreate it, as that will break the permission settings needed for us to evaluate your model. You will submit two sets of six model files each, with the following names:
On this homework, you must work with your homework partner(s).
You recently got a job as a "web optimizer". Your employers want to reduce the volume of web traffic, and have tasked you to help. Specifically, your job is to build a new lossy image compression system. Your proficiency in deep learning technologies leads you to believe that you can accomplish this by applying autoencoders. You decide that your first prototype will process 32×32 RGB images.
Specifically, you are going to harvest from the web a variety of 32×32 RGB images (you might consider starting with CIFAR data, but there are other sources of appropriately sized data out there as well). Since you are training an autoencoder, you may ignore any labels of the images. The specific architecture of the encoder and decoder is up to you, so long as you meet the requirements of the input and output tensors and so long as you use numeric or bit values in your embedded representation (i.e., not strings, etc.). Your inputs and outputs should be in RGB formats, with each pixel in the range [0,255]. You should measure your end-to-end performance with peak signal-to-noise ratio (PSNR) between each input image and its output.
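Since PSNR is the official quality metric, it is worth being precise about its definition. The following is a minimal NumPy sketch of PSNR for images with pixel values in [0, 255]; the function name and signature are our own choices, not part of the assignment:

```python
import numpy as np

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both inputs are expected to hold pixel values in [0, 255].
    PSNR = 10 * log10(max_value^2 / MSE).
    """
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no error
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Note that an average pixel error of just 1 gray level (MSE = 1) already corresponds to roughly 48 dB, which gives a sense of how demanding a 45 dB threshold is.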
Design and implement at least two architectures for this problem. For each architecture, perform a hyperparameter/regularizer search, as with previous homeworks.
You are to submit a detailed, well-written report, with conclusions that you can justify with your results. Your report should include a description of the learning problem (what is to be learned), a detailed description of your architecture, activation functions, and regularizer(s), and the values of the hyperparameters you tested. All these design decisions must be justified. You should then describe your experimental results (including quality of reconstruction and rates of compression), and draw conclusions from your results (conclusions must be justified by your presented results). In particular, you should discuss the impact of your architecture choices and hyperparameter settings.
As part of your submission, you will include files representing your best models for the two competitions in this homework. The first competition is the Max Compression competition, in which the winning entry will have the largest amount of compression while maintaining a minimum level of image reproduction quality. Specifically, an entry to Max Compression must have average PSNR of at least 45dB on our test set. Among the entries achieving this level of quality, the winner will be the one with the smallest encoding, in terms of the number of bytes required to encode our entire held-out set. You have two ways to control this. One is n, the number of hidden nodes in the innermost layer, which is the dimension of your encoding. The second is the data type of the encoder's output (which, of course, must match the data type of the decoder's input). By default, we've been using tf.float32 as the data type for weights and node outputs, but you are free to use tf.cast() at the output of your encoder and the input of your decoder to achieve fewer bytes per dimension in embedded space. E.g., you could quantize your encoder output to tf.int16 or smaller (like tf.bool) to reduce your representation size.
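To see how n and the encoder's output data type interact, the arithmetic is simply bytes per image = n × bytes per value. The sketch below assumes each value occupies the full width of its dtype (a tf.bool vector packed as bits would be 8× smaller still); the dictionary and function names are illustrative, not required:

```python
# Bytes occupied by one value of each candidate embedding dtype,
# assuming no bit-packing beyond the dtype's natural width.
DTYPE_BYTES = {"float32": 4, "float16": 2, "int16": 2, "int8": 1}

def embedding_bytes(n_dims, dtype):
    """Size in bytes of one image's embedded representation."""
    return n_dims * DTYPE_BYTES[dtype]

# For example, 100 float32 dimensions and 400 int8 dimensions both
# cost 400 bytes per image, versus 3072 bytes for the raw RGB input.
```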
The second contest is Max Quality, in which the winning entry will have the maximum reproduction quality while maintaining a minimum level of compression. Specifically, an entry to Max Quality must use at most 400 bytes per image (compressed down from 3072 bytes per raw RGB image) to represent our test set. Among the entries achieving this level of compression, the winner will be the one with the largest PSNR on our test set.
To help us evaluate your models' reconstruction quality and determine their levels of compression, you will submit your encoders and decoders as separate models, where the dimension of the encoder's output tensor matches the dimension of the decoder's input tensor. The contest scripts will plug the encoder and decoder together, compute the total size of the embedded representation, and evaluate the encoder-decoder pair on the test set.
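The evaluation described above might be sketched as follows. This is our own illustrative NumPy harness, not the actual contest script: it treats the encoder and decoder as arbitrary compatible callables, plugs them together, and reports the total embedded size in bytes along with the mean PSNR:

```python
import numpy as np

def evaluate_pair(encoder, decoder, images, max_value=255.0):
    """Compose an encoder and decoder and score the pair.

    images: uint8 array of shape (N, 32, 32, 3).
    Returns (total embedded bytes, mean PSNR in dB over the batch).
    """
    codes = np.asarray(encoder(images))
    total_bytes = codes.nbytes  # size of the embedded representation
    recon = np.asarray(decoder(codes), dtype=np.float64)
    orig = images.astype(np.float64)
    psnrs = []
    for o, r in zip(orig, recon):
        mse = np.mean((o - r) ** 2)
        psnrs.append(float("inf") if mse == 0
                     else 10.0 * np.log10(max_value ** 2 / mse))
    return total_bytes, float(np.mean(psnrs))
```

A sanity check on your own model files with a harness like this, before submitting, can catch shape or dtype mismatches between the encoder's output and the decoder's input.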
To submit your program, you must submit the following three files, via handin:
To submit your model for the competition, you must submit the following model files by copying them to $WORK/handin on crane:
Finally, via handin, you must submit your report as username1-username2.pdf, where username1 and username2 are your team members' user names on cse.
Last modified 05 March 2018; please report problems to