CSCE 496/896 (Spring 2019) Homework 3

Assigned Friday, March 8
Due Sunday, April 7 at 11:59 p.m., with a 5-point bonus for final submission by April 5 (previous deadlines: Sunday, March 31; Friday, April 5)

When one person from your group hands in your team's results from this homework, they should submit via handin the following, in separate files:

  1. Three program files, with the following names:
    1. main.py: Code to specify and train the TensorFlow graph
    2. model.py: TensorFlow code that defines the network
    3. util.py: Helper functions
  2. A single .pdf file with your writeup of the results for all the homework problems. Only pdf will be accepted, and you should only submit one pdf file, with the name username1-username2.pdf, where username1 and username2 are your team members' user names on cse. Include all your plots in this file, and all discussion.

Also, one person from your group (the same one who handed in the .py and .pdf files) will submit your team's model files for the competition. On crane, you should copy your model files to $WORK/handin (or /work/cse496dl/$USER/handin). This folder has been created for you. Do not delete, move or rename the handin folder (even if you recreate it), as that will break the permission settings needed for us to evaluate your model. After you copy your files to your handin folder, run the script

/work/cse496dl/shared/bin/cse_upload.sh username
where username is your cse username. You will need to type your cse password to complete the submission of the files to cse for evaluation for the competition. You will submit three model files, with the following names:
  1. homework_3.data-00000-of-00001
  2. homework_3.index
  3. homework_3.meta
Your model must use the following two tensors:
  1. input_placeholder with shape=[None,84,84,4], which represents 4 preprocessed, greyscale frames concatenated by channel
  2. output with shape=[None,18], which are Q values over the 18 possible actions in Seaquest

On this homework, you must work with your homework partner(s).


  1. (150 pts)

    You will train a deep Q-learning network (DQN) to play the Atari game Seaquest. Specify the architecture however you wish, but your Markov decision process state must be four consecutive frames of the game (84 × 84 × 4) and your outputs must be an estimate of Q(s,a) for all possible actions a. You should update your DQN using either TD(λ) for some λ or n-step time differences for some n ≥ 2. E.g., for n = 2:

    $$Q^{(2)}(s_t, a_t) = r_t + \gamma r_{t+1} + \gamma^2 \max_a Q(s_{t+2}, a)$$
    Note that larger time differences require more information in each tuple used in experience replay.
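
    As a rough sketch of the n-step target above and of the extra information each replay tuple has to carry, the following pure-Python code collects the last n transitions and emits tuples (s_t, a_t, [r_t, ..., r_{t+n-1}], s_{t+n}, done). The names n_step_target, NStepBuffer, and push are hypothetical and are not part of the assignment or of atari_wrapper.py. The handling of episode ends (flushing shorter tuples and dropping the bootstrap term) is one common choice, assumed here rather than required by the assignment.

```python
from collections import deque


def n_step_target(rewards, gamma, next_q_values, done):
    """Return r_t + gamma*r_{t+1} + ... + gamma^n * max_a Q(s_{t+n}, a).

    rewards: the rewards r_t .. r_{t+n-1} stored in the replay tuple
    next_q_values: Q(s_{t+n}, a) for every action a
    done: True if the episode ended within these steps (no bootstrap term)
    """
    target = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if not done:
        target += (gamma ** len(rewards)) * max(next_q_values)
    return target


class NStepBuffer:
    """Hypothetical helper: turns 1-step transitions into n-step tuples."""

    def __init__(self, n):
        self.n = n
        self.pending = deque()

    def _emit(self):
        # Build one tuple from everything pending, then drop the oldest step.
        state, action = self.pending[0][0], self.pending[0][1]
        rewards = [step[2] for step in self.pending]
        next_state, done = self.pending[-1][3], self.pending[-1][4]
        self.pending.popleft()
        return (state, action, rewards, next_state, done)

    def push(self, state, action, reward, next_state, done):
        """Add one step; return the n-step tuples that became ready."""
        self.pending.append((state, action, reward, next_state, done))
        ready = []
        if len(self.pending) == self.n:
            ready.append(self._emit())
        if done:
            # Assumed policy: at episode end, flush the shorter leftover tuples.
            while self.pending:
                ready.append(self._emit())
        return ready
```

    For n = 2, each emitted tuple holds two rewards and the state two steps ahead, which is the additional information the note above refers to.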

    To test your agent, we'll be running it in the SeaquestNoFrameskip-v4 environment published by OpenAI's Gym for a fixed number of episodes, and the average score will be used as the model score. The greedy action will be selected and run for 4 frames. This should automatically be accounted for if you use the code from the hackathon using atari_wrapper.py. We'll also be using the pre-processing from atari_wrapper.py for the input frames in the same way it was used in the hackathon.
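
    The evaluation script itself is not shown here, but the procedure described above (greedy action, fixed number of episodes, average score) can be approximated locally along these lines. This is a sketch under assumptions: greedy_action, average_score, and CountdownEnv are hypothetical names, the environment is assumed to follow the classic Gym interface (reset() returns an observation, step(action) returns (observation, reward, done, info)), and the 4-frame action repeat is assumed to be handled by the wrapper rather than by this loop.

```python
def greedy_action(q_values):
    """Index of the largest Q value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def average_score(env, q_function, episodes):
    """Play `episodes` greedy episodes and return the mean total reward.

    q_function maps an observation to a list of Q values, one per action.
    In practice it would evaluate the model's `output` tensor (assumption).
    """
    total = 0.0
    for _ in range(episodes):
        observation = env.reset()
        done = False
        while not done:
            action = greedy_action(q_function(observation))
            observation, reward, done, _ = env.step(action)
            total += reward
    return total / episodes


class CountdownEnv:
    """Toy stand-in for a Gym environment: 3 steps, reward 1 for action 1."""

    def reset(self):
        self.steps_left = 3
        return self.steps_left

    def step(self, action):
        self.steps_left -= 1
        reward = 1.0 if action == 1 else 0.0
        return self.steps_left, reward, self.steps_left == 0, {}
```

    With the toy environment, average_score(CountdownEnv(), lambda obs: [0.0, 1.0], 5) plays five episodes always choosing action 1; swapping in the wrapped Seaquest environment and the trained network gives a local estimate of the model score.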

    Submission Requirements

    You are to submit a detailed, well-written report, with conclusions that you can justify with your results. Your report should include a description of the learning problem (what is to be learned), a detailed description of your architecture, activation functions, and regularizers, and the values of the hyperparameters you tested. All these design decisions must be justified. You should then describe your experimental results, and draw conclusions from your results (conclusions must be justified by your presented results). In particular, you should report your agent's average cumulative reward per episode, and how well it played the game (average final score).

    As part of your submission, you will include files representing the best of the models that you created and tested. After the deadline, we will evaluate each team's best model in the game environment. Bonus points will be awarded to the teams with the best submitted models, as measured by live game play. To help you determine how good your model is relative to your competitors, each night we will evaluate each team's submitted model on our data set and post the results.

    To submit your program, you must submit the following three files, via handin:

    1. main.py
    2. model.py
    3. util.py

    To submit your model for the competition standings, you must submit the following three files by copying them to $WORK/handin (or /work/cse496dl/$USER/handin) on crane and then running the script

    /work/cse496dl/shared/bin/cse_upload.sh username
    where username is your cse username:
    1. homework_3.data-00000-of-00001
    2. homework_3.index
    3. homework_3.meta
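
    Before running cse_upload.sh, it may help to confirm that all three files are actually in the handin folder. The following is a minimal sketch; missing_model_files is a hypothetical helper and not part of the course tooling.

```python
import os

# The three file suffixes listed above for the "homework_3" checkpoint.
REQUIRED_SUFFIXES = (".data-00000-of-00001", ".index", ".meta")


def missing_model_files(directory, prefix="homework_3"):
    """Return the required checkpoint file names not present in `directory`."""
    expected = [prefix + suffix for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]
```

    For example, missing_model_files(os.path.expandvars("$WORK/handin")) returns an empty list when the folder holds all three files.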

    You should use this command to save the files:
    saver.save(session, "homework_3")
    Your model must use the following two tensors:
    1. input_placeholder with shape=[None,84,84,4], which represents 4 preprocessed, greyscale frames concatenated by channel
    2. output with shape=[None,18], which are Q values over the 18 possible actions in Seaquest

    Finally, via handin, you must submit your report as username1-username2.pdf, where username1 and username2 are your team members' user names on cse.



Last modified 05 April 2019; please report problems to sscott