CSCE 970 (Spring 2009) Homework 1

Assigned Thursday, January 22
Due Sunday, February 1
Total points: 60


When you hand in your results from this homework, you should submit the following, in separate files:

  1. A single .tar.gz or .tar.Z file called username.tar.gz where username is your username on cse. In this tar file, put:
  2. A single .pdf file with your writeup of the results for all the homework problems, including the last problem. Only PDF will be accepted, and you should submit exactly one PDF file, named username.pdf, where username is your username on cse. Include all your plots in this file, along with a detailed summary of your experimental setup, results, and conclusions. If you have many plots, put a few representative ones in the main text and defer the rest to an appendix. Remember that the quality of your writeup strongly affects your grade. See the web page on "Tips on Presenting Technical Material".

Submit everything by the due date and time using the web-based handin program.

On this homework, you must work on your own and submit your own results written in your own words.


  1. (15 pts) Do Exercise 3.3 on page 51 of Durbin's book.

  2. (40 pts) In this exercise, you will implement a program to infer basic hidden Markov models and analyze other sequences in the context of these models. The models will be similar to that on page 54 of Durbin's book, but there will be three dice rather than two, and you are not given the transition or emission probabilities of the models. These probabilities will be inferred from the training data.

    You will build one model for each of the following training sequences: train1.txt and train2.txt. Each of these files (consisting of one long line) contains a sequence of outcomes of dice rolls (numbered 0–5), interspersed with the notation :x:, which denotes a change to die x (x takes values from {0, 1, 2}). Thus in the training set you know which die produced each outcome. After your program infers each model, you will graphically display the model in your report. Your program will also run the Viterbi and forward algorithms on each model, using the following test sequences as input: test1.txt, test2.txt, and test3.txt. For each test sequence and each model, give the most likely path through the model as well as the probability of this path (from the Viterbi algorithm). Also give the probability that the sequence was generated by the model (from the forward algorithm).

    You are to submit a detailed, well-written report, with conclusions. In particular, you should answer the following questions. Did the test sequences come from the same model as the training sequences? Which ones did and which did not? Of course, this is merely the minimum that is required in your report. Other experiments that you run (including generating your own sequences and building models from them) and other interesting questions that you answer might yield extra points.

  3. (5 pts) State how many hours you spent on each problem of this homework assignment.
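
To make the pipeline in problem 2 concrete, here is a minimal sketch of one way it could be organized: read a labeled training line, estimate transition and emission probabilities by counting, then run Viterbi and forward in log space. It is not a reference solution. Several things in it are assumptions rather than part of the assignment: the function names (parse_labeled, estimate, viterbi, forward), the regular expression used to read the :x: notation, the add-one pseudocounts, the uniform starting distribution over the three dice, and the toy string in the demo, which is made up and not taken from train1.txt.

```python
import re
from math import exp, log

N_DICE, N_FACES = 3, 6


def parse_labeled(text):
    """Turn a line such as ':0:0312:1:55' into (die, outcome) pairs."""
    pairs, die = [], None
    # Assumed format: ':x:' switches to die x; any other digit 0-5 is a roll.
    for marker, roll in re.findall(r":([0-2]):|([0-5])", text):
        if marker:
            die = int(marker)
        elif die is None:
            raise ValueError("outcome seen before any die marker")
        else:
            pairs.append((die, int(roll)))
    return pairs


def estimate(pairs, pseudo=1.0):
    """Count-based estimates; 'pseudo' is an assumed smoothing constant."""
    trans = [[pseudo] * N_DICE for _ in range(N_DICE)]
    emit = [[pseudo] * N_FACES for _ in range(N_DICE)]
    for die, roll in pairs:
        emit[die][roll] += 1
    # One transition per consecutive pair of rolls (self-transitions included).
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        trans[a][b] += 1

    def normalize(rows):
        return [[c / sum(r) for c in r] for r in rows]

    return normalize(trans), normalize(emit)


def viterbi(obs, trans, emit):
    """Most likely die path and its log-probability (uniform start assumed)."""
    n = len(trans)
    score = [log(1.0 / n) + log(emit[k][obs[0]]) for k in range(n)]
    back = []
    for x in obs[1:]:
        # prev[k] is the best predecessor of state k at this step.
        prev = [max(range(n), key=lambda j: score[j] + log(trans[j][k]))
                for k in range(n)]
        score = [score[prev[k]] + log(trans[prev[k]][k]) + log(emit[k][x])
                 for k in range(n)]
        back.append(prev)
    last = max(range(n), key=lambda k: score[k])
    path = [last]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return path[::-1], score[last]


def log_sum_exp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    m = max(vals)
    return m + log(sum(exp(v - m) for v in vals))


def forward(obs, trans, emit):
    """Log-probability of obs summed over all paths (uniform start assumed)."""
    n = len(trans)
    f = [log(1.0 / n) + log(emit[k][obs[0]]) for k in range(n)]
    for x in obs[1:]:
        f = [log(emit[k][x])
             + log_sum_exp([f[j] + log(trans[j][k]) for j in range(n)])
             for k in range(n)]
    return log_sum_exp(f)


if __name__ == "__main__":
    # Toy data for illustration only; not the contents of any course file.
    trans, emit = estimate(parse_labeled(":0:0312:1:5545:2:0120:0:33"))
    obs = [0, 3, 5, 5, 1]
    path, log_p_path = viterbi(obs, trans, emit)
    print("Viterbi path:", path, "log P(path, obs) =", log_p_path)
    print("forward log P(obs) =", forward(obs, trans, emit))
```

Working in log space keeps both algorithms from underflowing on long test sequences. With pseudocounts every probability is positive, so log is always defined; with pseudo=0 a die or transition that never occurs in training would produce a zero (or a division by zero) and would have to be handled separately. The forward value is never smaller than the Viterbi value, since it sums over all paths rather than taking only the best one, which makes a convenient sanity check on an implementation.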

Last modified 16 August 2011; please report problems to sscott.