CSCE 478/878 (Fall 2001) Homework 1
Assigned Wednesday, September 26
Originally due Wednesday, October 10 at 11:59:59 p.m.
Now due Friday, October 12 at 11:59:59 p.m.
You are to submit all files, including all source code (in the language of
your choice), data files, and a single PDF file containing your entire writeup
(only PDF will be accepted this time, and you should submit only one PDF
file). Submit everything by the due date and time using the handin program.
- (25 pts) Implement the ID3 algorithm from Table 3.1 (p. 56) and
run it at least 30 times on randomly generated data sets, where the
examples are generated and labeled the same way as described in Problem
2.10 (p. 50). For each of the 30 runs, use a different size for the training
set. Then evaluate each of the 30 decision trees on a randomly
generated test set of size 100 (use the same test set for each
different training set). Then plot the performance on the test set
versus the size of the training set. How did increasing the training
set size influence generalization error? Did overfitting occur?
If not, can you push the learner to the point of overfitting? Why or why not?
Hand in your source code and data sets as part of your solution to this
problem, as well as a brief report of your results.
- (10 pts) Do Problem 2.3 on p. 48.
- (5 pts) Do Problem 3.2 on p. 77.
- (15 pts) Do Problem 7.2 on p. 227.
- (5 pts) State how many hours you spent on each problem of
this homework assignment (for CSCE 878 students, this includes the next
two problems).
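As an illustration of the experimental setup in the ID3 problem above, the following is a minimal Python sketch of the train-and-evaluate loop only. It is not an ID3 implementation: `train_majority` is a hypothetical stand-in learner that you would replace with your own ID3 code. The target concept, the number of attributes (`N_ATTRS`), and the range of training set sizes are all assumptions made for illustration; use the example generator that Problem 2.10 actually specifies.

```python
import random

N_ATTRS = 6  # assumption: six binary attributes per example


def target(x):
    # Hypothetical target concept, for illustration only:
    # positive exactly when the first two attributes are both 1.
    return x[0] == 1 and x[1] == 1


def make_examples(n, rng):
    """Generate n random examples, each labeled by the target concept."""
    data = []
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(N_ATTRS))
        data.append((x, target(x)))
    return data


def train_majority(train):
    """Stand-in learner (NOT ID3): always predicts the majority label."""
    positives = sum(1 for _, y in train if y)
    label = positives * 2 >= len(train)
    return lambda x: label


def accuracy(hypothesis, test):
    """Fraction of test examples the hypothesis classifies correctly."""
    return sum(1 for x, y in test if hypothesis(x) == y) / len(test)


def learning_curve(train_fn, sizes, rng):
    """Train once per size and score every hypothesis on one fixed test set."""
    test = make_examples(100, rng)  # same test set for every run
    curve = []
    for n in sizes:
        hypothesis = train_fn(make_examples(n, rng))
        curve.append((n, accuracy(hypothesis, test)))
    return curve


# 30 different training set sizes: 5, 10, ..., 150 (an arbitrary choice).
curve = learning_curve(train_majority, range(5, 155, 5), random.Random(0))
for n, acc in curve:
    print(n, acc)
```

The resulting `(size, accuracy)` pairs can be passed to any plotting tool to produce the required plot. Seeding the random generator makes the data sets reproducible, which helps when handing them in alongside the code.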
The following two problems are only for students registered for CSCE 878.
CSCE 478 students who do these will receive extra credit, but the amount will
be substantially less than the number of points indicated.
- (20 pts) Do Problem 7.6 on p. 228.
Hand in your source code and data sets as part of your solution to this
problem, as well as a brief report of your results.
- (10 pts) A binary decision stump is a depth-1 decision tree, i.e.
it has a root node and two leaves. What is the VC dimension of the hypothesis
class of binary decision stumps defined over the real plane? Argue that your
answer is correct.
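To make the definitions in the decision stump problem concrete, here is a small Python sketch that enumerates the labelings a stump can produce on a finite set of points in the plane and checks whether that set is shattered. It assumes that the root tests a single coordinate against a real threshold and that each leaf may carry either label; the assignment does not spell out the form of the test, so treat that as an assumption. The function names are hypothetical. A brute-force check like this can help you explore candidate point sets, but it does not replace the argument the problem asks for.

```python
from itertools import product


def stump_labelings(points):
    """Return every labeling of `points` realizable by a single-coordinate stump.

    Assumed stump form: the root tests p[axis] <= t, and each of the two
    leaves is assigned either label.
    """
    labelings = set()
    for axis in (0, 1):
        values = sorted({p[axis] for p in points})
        # One threshold below all values, one between each pair of
        # neighboring values, and one above all values is enough to
        # produce every distinct split along this axis.
        thresholds = [values[0] - 1.0]
        thresholds += [(a + b) / 2 for a, b in zip(values, values[1:])]
        thresholds.append(values[-1] + 1.0)
        for t in thresholds:
            for left, right in product((False, True), repeat=2):
                labelings.add(
                    tuple(left if p[axis] <= t else right for p in points)
                )
    return labelings


def is_shattered(points):
    """True if stumps realize all 2^n labelings of the n given points."""
    return len(stump_labelings(points)) == 2 ** len(points)


print(is_shattered([(0.0, 0.0), (1.0, 1.0)]))
print(is_shattered([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]))
```

Note that showing one particular point set is not shattered says nothing about other sets of the same size; the VC dimension depends on whether any set of that size can be shattered.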
Last modified 16 August 2011; please report problems to
sscott AT cse.