CSCE 970 lecture slides
Entries in red do not have slides updated for
spring 2003. Refer to the spring 2001 offering
for old copies of slides.
-
Lectures 0 and 1: Administrivia and Introduction, Jan 14. Theodoridis
Chapter 1.
(ps,
pdf)
Topics:
Features (attributes)
and feature vectors, classification, supervised vs. unsupervised learning
-
Lecture 2: Bayesian-Based Classifiers, Jan 16-21. Theodoridis
Sections 2.1-2.4, 2.5.1, 2.5.2, 2.5.6, 2.6.
(ps,
pdf)
Topic summary 1 due Thursday, Feb 6
Topics:
Bayesian decision theory, discriminant
functions, Bayesian classification for Gaussian distributions, estimation of
unknown pdfs, k-nearest neighbor techniques
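As a concrete illustration of the k-nearest neighbor technique covered here, the sketch below classifies a query point by majority vote among its k closest training points. The data and the choice k=3 are made up for illustration; this is not code from the course materials.

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points."""
    # train: list of (feature_vector, label) pairs; squared Euclidean distance
    by_dist = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Two well-separated classes in the plane (illustrative data)
train = [((1, 1), 'A'), ((1, 2), 'A'), ((5, 5), 'B'), ((6, 5), 'B')]
print(knn_classify(train, (1.5, 1.5), k=3))  # → A
```

With k=3 the two nearby 'A' points outvote the single 'B' in the neighborhood; ties can be broken arbitrarily or by distance weighting.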
-
Lecture 3: Linear Classifiers, Jan 21-28. Theodoridis Sections 3.1-3.3, 3.4.1,
3.4.2, 3.5 (skim), pages 1-19 of
GD/EG paper.
(ps,
pdf)
Topic summary 2 due Tuesday, Feb 11
Topics:
Linear discriminant functions, perceptron
algorithm, Winnow, exponentiated gradient, least squares methods
Also see:
- Manfred K. Warmuth,
who has done much work on EG and Winnow. Many papers available on-line,
including:
- Nick Littlestone (creator of Winnow) and some of his papers:
- N. Littlestone. ``Learning Quickly When Irrelevant Attributes Abound: A
New Linear-threshold Algorithm''. Machine Learning, 2:285-318, 1988.
[original Winnow paper]
- N. Littlestone. ``Redundant noisy attributes, attribute errors, and
linear threshold learning using Winnow''. In
Proc. 4th Annu. Workshop on Comput. Learning Theory,
147-156, 1991. Morgan Kaufmann.
[agnostic Winnow results]
- A. Grove, N. Littlestone, and
D. Schuurmans.
General convergence
results for linear discriminant updates. Machine Learning, 43(1-3):173-210,
2001.
[gives nice presentation of Winnow with negative weights + very general
error bounds]
- Avrim Blum and his survey paper
``On-Line Algorithms in Machine Learning''
- Thomas G. Dietterich
and his paper
``Solving Multiclass Learning Problems via Error-Correcting Output Codes''
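The Winnow update referenced in the papers above can be sketched in a few lines. The target concept, example set, and parameter choices (promotion factor alpha = 2, threshold theta = n) below are illustrative, following the standard Winnow2 formulation rather than any one paper.

```python
def winnow_train(examples, n, alpha=2.0, epochs=20):
    """Winnow2: multiplicative weight updates for Boolean threshold concepts."""
    w = [1.0] * n
    theta = n  # a standard threshold choice
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if pred == y:
                continue
            mistakes += 1
            factor = alpha if y == 1 else 1.0 / alpha  # promote or demote
            for i in range(n):
                if x[i]:  # only weights of active attributes change
                    w[i] *= factor
        if mistakes == 0:
            break
    return w, theta

# Target concept: x0 OR x1, a monotone disjunction over 4 Boolean attributes
examples = [([1, 0, 0, 0], 1), ([0, 1, 0, 0], 1), ([0, 0, 1, 1], 0),
            ([1, 0, 1, 0], 1), ([0, 0, 0, 1], 0)]
w, theta = winnow_train(examples, n=4)
predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
```

The multiplicative update is what gives Winnow its mistake bound logarithmic in the number of irrelevant attributes, in contrast to the perceptron's additive update.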
-
Lecture 4: Nonlinear Classifiers. Theodoridis Sections 4.1-4.4, 4.6 (skip
proof), 4.7, 4.9, 4.10, 4.13-4.15, 4.17.
(ps,
pdf)
Topic summary 3 due Tuesday, Mar 11
Topics:
2- and 3-layer perceptrons, backpropagation,
setting network size (especially pruning), Cover's theorem, RBF networks,
decision trees
Also see:
- ANN growing and pruning:
- SVMs:
- Müller, K.-R., Mika, S., Rätsch, G., Tsuda, K., and Schölkopf, B.
An introduction
to kernel-based learning algorithms.
IEEE Transactions on Neural Networks, 12(2):181-201,
2001.
- Christopher Burges.
A
tutorial on support vector machines for pattern recognition
- Nello Cristianini
and John Shawe-Taylor.
An Introduction to Support Vector Machines.
Cambridge University Press, 2000.
- Richard Duda,
Peter Hart, and David Stork.
Pattern Classification, 2nd Edition.
[also see software supplements]
John Wiley, 2001.
(Section 5.11)
- SVM tutorial
- KernelMachines.org
- Sebastian Thrun's links
- Decision trees:
- My
work on learning geometric patterns (a generalization
of multiple-instance learning), especially the paper based on Winnow.
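A classic motivation for the multilayer perceptrons in this lecture is that XOR is not linearly separable. A hand-wired two-layer threshold network computes it; the weights below are one standard choice, set by hand rather than learned by backpropagation.

```python
def step(z):
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    """Two-layer threshold network computing XOR, which no single
    linear threshold unit can represent."""
    h1 = step(x1 + x2 - 0.5)    # hidden unit 1: fires if either input is on (OR)
    h2 = step(x1 + x2 - 1.5)    # hidden unit 2: fires only if both are on (AND)
    return step(h1 - h2 - 0.5)  # output: OR and not AND = XOR

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # → [0, 1, 1, 0]
```

The hidden layer carves the plane into the two half-planes whose intersection isolates the XOR-positive points, which is exactly what a single-layer perceptron cannot do.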
-
Lecture 5: Hidden Markov Models. Durbin Chapter 3, Theodoridis Sections
9.1-9.4, 9.6, 9.8.
(ps,
pdf)
Topic summary 4 due Thursday, March 27
Topics:
Markov models, the Viterbi algorithm, hidden Markov models, Baum-Welch
algorithm.
Also see:
- R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence
Analysis. Cambridge University Press, 1998. [see ch. 3]
- Ron
Shamir's course on computational biology [see the scribe notes on hidden Markov models]
- ISMB99 Tutorial on HMMs
-
K. Sjölander,
K. Karplus,
M. Brown,
R. Hughey,
A. Krogh,
I. S. Mian, and
D. Haussler.
Dirichlet mixtures: A method for improving detection of weak
but significant protein sequence homology. Computer Applications
in the Biosciences (CABIOS), Vol. 12, No. 4, Pages 327-345, 1996.
[compressed postscript]
[pdf]
- HMM Tutorial
- Source code (specific to biological sequence analysis):
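Durbin et al.'s running example, the "occasionally dishonest casino" (a fair die versus one loaded toward 6), makes a compact Viterbi sketch. The transition and emission probabilities below follow that example; the code itself is an illustrative reimplementation, not course-supplied source.

```python
def viterbi(obs, states, start, trans, emit):
    """Most probable state path through an HMM via dynamic programming."""
    V = [{s: start[s] * emit[s][obs[0]] for s in states}]  # column of path probs
    back = []                                              # backpointers
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda r: V[-1][r] * trans[r][s])
            col[s] = V[-1][prev] * trans[prev][s] * emit[s][o]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    path = [max(states, key=lambda s: V[-1][s])]           # best final state
    for ptr in reversed(back):                             # trace back
        path.append(ptr[path[-1]])
    return path[::-1]

# The occasionally dishonest casino: fair die 'F' vs. die 'L' loaded toward 6
states = ['F', 'L']
start = {'F': 0.5, 'L': 0.5}
trans = {'F': {'F': 0.95, 'L': 0.05}, 'L': {'F': 0.1, 'L': 0.9}}
emit = {'F': {r: 1 / 6 for r in range(1, 7)},
        'L': {**{r: 0.1 for r in range(1, 7)}, 6: 0.5}}
print(viterbi([6, 6, 6, 6, 6], states, start, trans, emit))  # → ['L', 'L', 'L', 'L', 'L']
```

A run of sixes decodes as the loaded die, while low rolls decode as fair. On sequences of realistic length the products underflow, so a real implementation works with log probabilities, as Durbin et al. note.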
-
Lecture 6: System Evaluation and Combining Classifiers. Theodoridis Chapter
10, selected papers.
(ps,
pdf)
Topic summary 5 due Tuesday, April 8
Topics:
Estimating classification error (confidence intervals, paired t tests,
cross-validation), improving performance (bagging, boosting, weighted majority).
Also see:
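The k-fold cross-validation underlying these error estimates reduces to index bookkeeping: partition the data into k disjoint test folds and average the test error over the k train/test splits. A minimal sketch (the classifier and error measure are left abstract; the shuffle seed is arbitrary):

```python
import random

def kfold_splits(n, k, seed=0):
    """Yield k (train, test) index splits; the test folds partition 0..n-1,
    so every point is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)       # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(kfold_splits(10, 5))
```

Each split would train a classifier on `train` and score it on `test`; the mean of the k test errors is the cross-validation estimate.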
-
Lecture 7: Clustering: Basic Concepts. Theodoridis Chapter 11, Sections
12.1-12.2.
(ps,
pdf)
Topics:
Applications, examples, cluster types, feature
types, proximity measures, categories of algorithms.
-
Lecture 8: Sequential Clustering Algorithms. Theodoridis Sections 12.3-12.6.
(ps,
pdf)
Topics:
BSAS, MBSAS, TTSAS, estimating the number of clusters.
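BSAS in one dimension, as a hedged sketch: the dissimilarity threshold theta, the cluster cap q, and the sample data are made up, and cluster representatives here are running means. The defining property is the single pass: a point farther than theta from every existing representative starts a new cluster (if fewer than q exist).

```python
def bsas(points, theta, q):
    """Basic Sequential Algorithmic Scheme: one pass over the data."""
    means, counts, labels = [], [], []
    for x in points:
        if means:
            j = min(range(len(means)), key=lambda i: abs(x - means[i]))
            d = abs(x - means[j])  # distance to nearest representative
        if not means or (d > theta and len(means) < q):
            means.append(x); counts.append(1)     # start a new cluster
            labels.append(len(means) - 1)
        else:
            counts[j] += 1
            means[j] += (x - means[j]) / counts[j]  # running-mean update
            labels.append(j)
    return labels, means

labels, means = bsas([1.0, 1.2, 5.0, 5.1, 9.0], theta=2.0, q=5)
print(labels)  # → [0, 0, 1, 1, 2]
```

Note the result depends on presentation order and on theta, which is why estimating the number of clusters (e.g., by sweeping theta) accompanies BSAS in this lecture.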
-
Lecture 9: Hierarchical Clustering Algorithms. Theodoridis Sections 13.1,
13.2.1-13.2.4, 13.5.
(ps,
pdf)
Topic summary 6 (on Lectures 7-9) due Tuesday, April 15
Topics:
Agglomerative schemes (dendrograms,
single link algorithm, complete link algorithm), determining the best number
of clusters.
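A brute-force sketch of the single link algorithm on 1-D points (illustrative data; a full implementation would record every merge to build the dendrogram rather than stop at k clusters):

```python
def single_link(points, k):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    closest members are nearest (single link) until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single-link distance: closest pair across the two clusters
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair of clusters
    return clusters

print(single_link([1.0, 2.0, 10.0, 11.0], k=2))  # → [[1.0, 2.0], [10.0, 11.0]]
```

Replacing the `min` with `max` in the distance computation gives the complete link algorithm; the sequence of merge distances is what the dendrogram records.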
-
Lecture 10: Clustering Algorithms Based on Cost Function Optimization.
Theodoridis Sections 14.1, 14.3.1, 14.3.6, 14.5, selected papers.
(ps,
pdf)
Topic summary 7 due Tuesday, April 22
Topics:
Isodata algorithm, fuzzy
clustering methods (also fuzzy classification, if time permits).
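The core of Isodata is the familiar assign-then-update loop it shares with k-means; the sketch below shows only that core on made-up 1-D data (Isodata's additional split/merge heuristics between iterations are omitted):

```python
def kmeans(points, centers, iters=100):
    """Assign each point to its nearest center, then move each center to
    its cluster mean; repeat to a fixed point. This is the cost-minimizing
    core that Isodata extends with cluster splitting and merging."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in points:
            j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            groups[j].append(x)
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:  # converged: assignments no longer change
            break
        centers = new
    return centers

print(kmeans([1.0, 2.0, 9.0, 10.0], centers=[1.0, 9.0]))  # → [1.5, 9.5]
```

Fuzzy clustering replaces the hard assignment step with graded memberships, but the alternate-and-update structure is the same.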
-
Lecture 11: Clustering Tendency and Cluster Validity. Theodoridis Chapter 16.
(ps,
pdf
NOTE: These are different from what was handed out in class;
two more slides are in this set.)
Topics:
Hypothesis testing,
internal criteria, external criteria, relative criteria, validity of
individual clusters, cluster tendency.
Also see:
- Section 5.3.1 ("Hypothesis Testing Basics") from the text
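One simple external criterion is the Rand index: the fraction of point pairs on which two partitions agree (placed in the same cluster by both, or in different clusters by both). A sketch with illustrative labelings:

```python
from itertools import combinations

def rand_index(a, b):
    """External validity criterion: pairwise agreement between two
    clusterings given as parallel label lists."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# The same partition under relabeled cluster ids → perfect agreement
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # → 1.0
```

Because it compares co-membership of pairs rather than raw labels, the index is invariant to renaming clusters, which is exactly what an external criterion needs.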
-
Special Lecture: How to Give a Good Research Talk.
(ps,
pdf)
Also see:
Return to the CSCE 970 Home Page