Homework 2 for CSCE 471/871 (Spring 2011)

Assigned Friday, March 4
Due Monday, March 21 at 11:59 p.m.
Total points: 90

When you hand in your results from this homework, you should submit the following, in separate files:

  1. A single zip file called username.zip, where username is your username on cse. In this zip file, put:
  2. A single .pdf file with your writeup of the results for all the homework problems. Only PDF will be accepted, and you should only submit one PDF file, with the name username.pdf, where username is your username on cse. Include all your plots in this file, as well as a detailed summary of your experimental setup, results, and conclusions. If you have several plots, you might put a few example ones in the main text and defer the rest to an appendix. Remember that the quality of your writeup strongly affects your grade. See the web page on "Tips on Presenting Technical Material".

Submit everything by the due date and time using the web-based handin program.

On this homework, you must work on your own and submit your own results written in your own words.

(90 pts) In this assignment, you will implement a program to infer a profile hidden Markov model from a multiple alignment and use it to search a database for related proteins.

You will build your model from this global multiple alignment. The file contains a multiple alignment of several related protein sequences (here's an overview of the MSF format; you can google "MSF format" for others). After parsing this file, your program will determine the model's architecture and then infer the model's parameters from the multiple alignment, using Laplace's rule as a prior. Once the model is inferred, your program will search this database for proteins related to the model.
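
As a rough illustration of the parameter-inference step, the sketch below estimates match-state emission probabilities with Laplace's rule, that is, by adding one pseudocount per amino acid before normalizing. It also marks a column as a match state when fewer than half of its rows are gaps; that threshold is an assumed heuristic, not something this assignment prescribes. The function names, the gap characters, and the toy alignment are all hypothetical, and reading the MSF file is left out.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
GAP_CHARS = "-."  # assumption: gaps are written as '-' or '.'


def match_columns(alignment, max_gap_fraction=0.5):
    """Return indices of the columns treated as match states.

    Assumed heuristic: a column is a match column when fewer than
    max_gap_fraction of its rows contain a gap.
    """
    n_rows = len(alignment)
    columns = []
    for j in range(len(alignment[0])):
        gaps = sum(1 for row in alignment if row[j] in GAP_CHARS)
        if gaps / n_rows < max_gap_fraction:
            columns.append(j)
    return columns


def laplace_emissions(alignment, column):
    """Estimate emission probabilities for one column with add-one pseudocounts."""
    # Count only residues; gap characters are skipped.
    counts = Counter(row[column] for row in alignment if row[column] in AMINO_ACIDS)
    # Laplace's rule: one extra count for each of the 20 amino acids.
    total = sum(counts.values()) + len(AMINO_ACIDS)
    return {aa: (counts[aa] + 1) / total for aa in AMINO_ACIDS}


# Hypothetical toy alignment: four rows, four columns.
toy_alignment = ["AC-G", "AC-G", "A-TG", "GC-G"]
cols = match_columns(toy_alignment)
probs = laplace_emissions(toy_alignment, 0)
```

In the toy alignment, the third column is mostly gaps, so it is not a match column. The first column holds three A's and one G, so Laplace's rule gives P(A) = (3 + 1) / (4 + 20), and every unseen amino acid still receives probability 1 / 24 rather than zero. Transition probabilities can be smoothed the same way, with one pseudocount per allowed transition out of each state.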

You are to submit a detailed, well-written report, with conclusions. In your report, discuss each hit, reporting all relevant information. If some of the alignments are especially long, you may omit them from your report, but you must still discuss them there and include them in a text file in your zip file. Of course, this is merely the minimum that is required in your report.


Last modified 16 August 2011; please report problems to sscott AT cse.