Assigned Wednesday, March 14
Due Tuesday, April 3 at 11:59 p.m.
When one person from your group hands in your team's results from this homework, they should submit via handin the following, in separate files:
On this homework, you must work with your homework partner(s).
You recently got a job as an "authorship automation consultant". Your employers want to automate the writing of books to rapidly increase production, and have tasked you with helping. Specifically, as a first step towards this automation task, your job is to use a recurrent architecture as part of a prototype sentence completion system that takes a sentence s as input and outputs the next k words of s.
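To make the input/output contract concrete, here is a minimal sketch of the completion loop: predict one word, append it to the sentence, and repeat k times. The names `complete_sentence` and `make_bigram_predictor` are hypothetical, and the bigram counter is only a stand-in so the example runs without a deep learning library; in your prototype the `next_word` function would wrap your trained recurrent network.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List


def complete_sentence(prefix: List[str], k: int,
                      next_word: Callable[[List[str]], str]) -> List[str]:
    """Greedily extend `prefix` by k words, feeding each prediction back in."""
    words = list(prefix)
    for _ in range(k):
        words.append(next_word(words))
    return words[len(prefix):]


def make_bigram_predictor(corpus: List[List[str]]) -> Callable[[List[str]], str]:
    """Stand-in for a trained recurrent model (assumption, not the assignment's
    architecture): predicts the most frequent follower of the last word."""
    followers: Dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        for current, following in zip(sentence, sentence[1:]):
            followers[current][following] += 1

    def next_word(words: List[str]) -> str:
        if not words or words[-1] not in followers:
            return "<unk>"  # no evidence for this context
        return followers[words[-1]].most_common(1)[0][0]

    return next_word


# Toy corpus, for illustration only.
toy_corpus = [
    "the cat sat on the mat".split(),
    "the cat sat on the sofa".split(),
    "a dog ran".split(),
]
predictor = make_bigram_predictor(toy_corpus)
completion = complete_sentence("the cat".split(), k=3, next_word=predictor)
print(completion)
```

Because each predicted word becomes part of the input for the next prediction, an early mistake can affect every later word, which is one reason completion quality may depend on k.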
You will download a variety of text corpora to train and evaluate your prototype. (You might consider starting with the PTB data that you used in the hack-a-thons, but there are other sources of appropriate data out there as well.) You don't need to worry about text labels, since you're only focused on generating the next k words of a given sentence. You may train your own word embedding mapping, or use an existing one from the web.
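Since no labels are needed, training examples can be cut directly from the raw sentences. The sketch below shows one possible way to do this, assuming the corpus is already tokenized into lists of words; `build_vocab`, `make_examples`, and the `<unk>` token are hypothetical names chosen for illustration. The resulting integer ids are what an embedding lookup (trained or pre-trained) would take as input.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_vocab(sentences: List[List[str]], min_count: int = 1) -> Dict[str, int]:
    """Map each sufficiently frequent word to an integer id; id 0 is <unk>."""
    counts = Counter(word for sentence in sentences for word in sentence)
    vocab = {"<unk>": 0}
    for word, count in counts.most_common():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab


def make_examples(sentences: List[List[str]], vocab: Dict[str, int],
                  k: int) -> List[Tuple[List[int], List[int]]]:
    """Split every sentence at each position into (prefix ids, next k ids)."""
    unk = vocab["<unk>"]
    examples = []
    for sentence in sentences:
        ids = [vocab.get(word, unk) for word in sentence]
        # Only keep splits that leave a full k-word target.
        for split in range(1, len(ids) - k + 1):
            examples.append((ids[:split], ids[split:split + k]))
    return examples


sentences = [["a", "b", "c", "d"]]
vocab = build_vocab(sentences)
examples = make_examples(sentences, vocab, k=2)
print(examples)
```

Sentences shorter than k + 1 words yield no examples under this scheme; whether to pad or drop them is a design decision to state in your report.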
Design and implement at least one architecture for this problem, and conduct a hyperparameter/regularizer search, as in previous homeworks.
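One simple way to organize the search is an exhaustive grid over the settings you want to compare. The sketch below is an illustration under stated assumptions: `grid_search` is a hypothetical helper, the grid values are placeholders rather than recommendations, and `fake_validation_loss` is a made-up function standing in for "train the model with this configuration and return its validation loss".

```python
import itertools
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, float]


def grid_search(grid: Dict[str, Sequence[float]],
                evaluate: Callable[[Config], float]) -> List[Tuple[float, Config]]:
    """Evaluate every combination in `grid`; return (loss, config) pairs, best first."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        results.append((evaluate(config), config))
    results.sort(key=lambda pair: pair[0])
    return results


def fake_validation_loss(config: Config) -> float:
    """Placeholder only: replace with real training plus validation."""
    return (abs(config["hidden_size"] - 256) / 256
            + abs(config["dropout"] - 0.5)
            + abs(config["learning_rate"] - 1e-3) * 100)


# Placeholder values, chosen only to make the example run.
grid = {
    "hidden_size": [128, 256],
    "dropout": [0.0, 0.5],
    "learning_rate": [1e-3, 1e-2],
}
results = grid_search(grid, fake_validation_loss)
best_loss, best_config = results[0]
print(best_loss, best_config)
```

Keeping the full sorted list of results, not just the best configuration, makes it easier to report every value you tested.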
You are to submit a detailed, well-written report, with conclusions that you can justify with your results. Your report should include a description of the learning problem (what is to be learned), a detailed description of your word embedding, architecture, activation functions, and regularizer(s), and the values of the hyperparameters you tested. All these design decisions must be justified. You should then describe your experimental results (including quality of sentence completion), and draw conclusions from your results (conclusions must be justified by your presented results). In particular, you should discuss the impact of your architecture choices and hyperparameter settings, as well as the effect that varying k has on accuracy.
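The assignment does not fix a definition of accuracy, so the sketch below uses one possible choice: for each k, the fraction of the first k predicted words that exactly match the reference continuation at the same position, pooled over all test sentences. The function name `accuracy_by_k` and the toy predictions are hypothetical; other measures may suit your report better.

```python
from typing import Dict, List


def accuracy_by_k(predictions: List[List[str]], references: List[List[str]],
                  max_k: int) -> Dict[int, float]:
    """Positional word accuracy over the first k words, for k = 1..max_k."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need equally many, at least one, predictions and references")
    matches_at_position = [0] * max_k
    for predicted, reference in zip(predictions, references):
        if len(predicted) < max_k or len(reference) < max_k:
            raise ValueError("every sequence must contain at least max_k words")
        for i in range(max_k):
            if predicted[i] == reference[i]:
                matches_at_position[i] += 1
    accuracy = {}
    running_matches = 0
    for k in range(1, max_k + 1):
        running_matches += matches_at_position[k - 1]
        accuracy[k] = running_matches / (k * len(predictions))
    return accuracy


# Toy data, for illustration only.
predictions = [["a", "b", "c"], ["x", "y", "z"]]
references = [["a", "b", "d"], ["x", "q", "z"]]
scores = accuracy_by_k(predictions, references, max_k=3)
print(scores)
```

Tabulating or plotting this dictionary for each model gives a direct view of how accuracy changes as k grows.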
Last modified 14 March 2018; please report problems to