Assigned Tuesday, October 5
Due Sunday, October 24 at 11:59 p.m.
When you hand in your results from this homework, you should submit the following, in separate files:
You are to compare your ANN's results to those from ID3 on the same UCI data sets you used for Homework 1. Your goal is to convince the reader, for each data set, either that one of the two algorithms is superior to the other (giving a significance level as well) or that there is no statistically significant difference between them. To accomplish this task, you may use any tools from Lecture 5 that you wish, under two conditions: (1) you must use the tools correctly and thoroughly corroborate your assertion, and (2) you must have at least one confidence interval and at least one ROC curve in your report. Note that in order to use certain statistical tools correctly, you may need to run a few additional experiments with ID3.
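As a rough illustration of the two required report elements, the sketch below computes a normal-approximation confidence interval for a classifier's error rate and the points of an ROC curve. It is only one possible approach and is not necessarily the method presented in Lecture 5. The function names (`error_confidence_interval`, `roc_points`, `auc`) and the input layout (error counts, per-example scores, 0/1 labels) are assumptions made for this example.

```python
import math


def error_confidence_interval(num_errors, n, z=1.96):
    """Two-sided normal-approximation interval for the true error rate.

    z = 1.96 corresponds to roughly 95% confidence. The approximation
    is only reasonable when n is fairly large and the error rate is not
    too close to 0 or 1.
    """
    p = num_errors / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    scores: higher means "more likely positive" (assumed convention).
    labels: 1 for positive, 0 for negative; both classes must occur.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(error_confidence_interval(12, 100))
    curve = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print(curve, auc(curve))
```

An interval for a single classifier does not by itself establish a difference between two classifiers. When both algorithms are evaluated on the same test examples or the same cross-validation folds, a paired test is usually the more appropriate tool.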
You are to submit a detailed, well-written report, with real conclusions and everything. In particular, you should answer the following questions for both your new classifier and ID3. Did training error go to 0? Did overfitting occur? Should you have stopped training early? Was there a statistically significant difference between the performance of ID3 and that of the ANN? Which algorithm would you recommend for your data sets? Of course, this is merely the minimum that is required in your report.
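One way to approach the first three questions for the ANN is to record training and validation error after every epoch and inspect the two curves. The sketch below assumes you have logged such per-epoch errors. The function name `summarize_training` and the list-based inputs are hypothetical, and "overfitting" is judged here by a simple heuristic (validation error ends above its minimum), which is only one possible criterion.

```python
def summarize_training(train_err, val_err, tol=0.0):
    """Summarize per-epoch error curves (epochs are 0-indexed).

    train_err, val_err: lists of error rates, one entry per epoch.
    tol: training error at or below this value counts as "reached 0".
    """
    # Epoch at which validation error was lowest (first one on ties).
    best_epoch = min(range(len(val_err)), key=val_err.__getitem__)
    return {
        "train_error_reached_zero": min(train_err) <= tol,
        "best_epoch": best_epoch,
        # Heuristic: validation error ended above its minimum, which
        # suggests stopping at best_epoch would have helped.
        "validation_rose_after_best": val_err[-1] > val_err[best_epoch],
    }


if __name__ == "__main__":
    # Made-up curves, purely for illustration.
    train = [0.40, 0.20, 0.10, 0.00, 0.00]
    val = [0.45, 0.30, 0.25, 0.28, 0.33]
    print(summarize_training(train, val))
```

Plotting both curves against the epoch number in your report makes the same point visually and is usually more convincing than the summary alone.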
Extra credit opportunities include (but are not limited to) running on extra data sets, using other activation functions, using multiclass data, and running experiments on more ANN architectures and/or with more learning rates. As always, the amount of extra credit is commensurate with the level of extra effort and the quality of your report of the results.
The following problem is only for students registered for CSCE 878. CSCE 478 students who do it will receive extra credit, but the amount will be less than the number of points indicated.
Last modified 16 August 2011; please report problems to sscott AT cse.