Course Announcement for Spring 2008
CSCE 471/871: Introduction to Bioinformatics

A new skill combining biology and computing has exploded into what is probably the hottest career opportunity for college graduates in the coming decade. Government and private research laboratories, including all the major drug makers, are desperately scouring universities for people trained in computational biology, also known as bioinformatics. ... Because of the demand, salaries for newly minted Ph.D.s competent in both biology and computer science average $90,000 a year.

- Robert Boyd, Knight-Ridder Newspapers

Instructor:
Stephen Scott
Avery 364
sscott AT cse
http://www.cse.unl.edu/~sscott/

Meeting Time and Place: Monday, Wednesday, Friday 10:30-11:20, Avery 112

Credits: 3 units

Prerequisites: CSCE 310 (Data Structures & Algorithms) or equivalent programming experience; STAT 380/880 (Prob. and Stats.) or equivalent background

Required Text:
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, by Richard Durbin, Sean Eddy, Anders Krogh, and Graeme Mitchison. Cambridge University Press, 1998. ISBN 0-521-62971-3.

Optional Texts:
Introduction to Mathematical Methods in Bioinformatics, by Alexander Isaev. Springer, 2004. ISBN 3540219730.
Problems and Solutions in Biological Sequence Analysis, by Mark Borodovsky and Svetlana Ekisheva. Cambridge University Press, 2006. ISBN 0521612306.

Course Description:

Biology easily has 500 years of exciting problems to work on.

- Donald E. Knuth, Professor Emeritus, Dept. of Comp. Sci., Stanford University

Bioinformatics is a discipline that applies computational science to molecular biology. The need for advanced computational biology tools is fueled by an explosion in the rate of genomic data acquisition and the absence of robust computational methods for storing and analyzing the information. Even analysis of relatively small genomes, such as those of bacteria, encounters a significant bottleneck at the level of genome annotation (identifying genes and ascribing function), which is due in part to limitations in computational methods.

In this course you will learn several fundamentals and current trends in bioinformatics. The course will not show you how to use existing computational biology tools, though you will probably learn some of that on your own as a side effect. Instead, you will acquire a deep understanding of how those tools work, to the point where you can adapt them to new problems and create new ones.

The biological problems we will study include sequence alignment, protein family modeling, and phylogeny. Our main focus will be hidden Markov models, though, time permitting, we will also discuss dynamic programming as well as machine learning models such as decision trees and artificial neural networks.

Grades in this course will be based on homework exercises, writing exercises, and a project.

For more information, see the previous offering's web page: http://cse.unl.edu/~sscott/teach/Classes/cse496F02/