CSCE 496/896-005 (Spring 2018) Project Ideas
In this course you and your team will do a substantial project, in which
you will characterize a significant problem amenable to a deep learning
solution, study the related work to this problem,
develop one or more deep learning approaches to this problem,
and evaluate your approaches.
You will summarize your project results in a written report and an oral
presentation. The written report must
use a professional writing style similar to that found in a refereed conference
or journal (e.g., ACM, IEEE, ICML, ICLR, NIPS),
including abstract, introduction, summary of related work, your
contribution, references, and an appendix (if necessary). The oral
presentation
will be to the entire class at the end of the semester: during the fifteenth
week (April 23–27), and if necessary, during the fourteenth week
(April 16–20). You will submit your written report
no later than 11:59 p.m. on April 25.
In accordance with UNL policies,
you have now been informed in
writing of the nature and scope of this project prior to the eighth week
of classes.
Later this semester (late February to early March) we will set a deadline
for submission of 1–3 paragraph proposals on your projects.
Also, in late March, you will submit to us a brief progress report and meet with us
for a check-in on your project. You must do both of these
in order to get full credit for your project, and you must get our approval on
your proposal before starting work on your project.
Projects are due by 11:59 pm on Wednesday, April 25.
See the rules on projects for more information.
✔ Project oral presentations are the weeks of April 16 and
April 23. See the
presentation schedule for more information.
✔ Project check-in reports are due Sunday, April 8.
You should submit it to sscott@cse and pquint@cse in text format in the body of an email before 11:59 pm on that day.
The check-in report should include:
- A summary of what you've accomplished on your project so far.
- A description of what major elements remain for your project, including
what is left over from your proposal as well as any new elements that you've
added since your proposal.
- An overview of any significant hurdles that have arisen.
- A list of what questions you need answered to move forward.
On Monday, April 9, we will spend the hack-a-thon period reviewing each group's
check-in report.
✔ PROPOSAL DEADLINE:
The proposal submission deadline is Sunday, March 4.
You should submit it to sscott@cse and pquint@cse in text format in the body of an email before 11:59 pm on that day.
The proposal should include:
- A brief statement of your project topic.
- Motivation for your topic (why it is important and interesting).
- A precise work plan: what you plan to do, what data sets you will test on, how you will evaluate performance, etc.
- At least three references (at least two published journal or conference papers).
Project ideas
The following is a list of possible projects, suggested by a variety of individuals. If you want
to know more about a particular one, let us know and we can put you in contact with those who can
provide more details.
You are welcome to suggest your own project as well, e.g., one related to your own research.
It should contain some form of an experimental component, be relevant to the course, and be
a non-trivial extension beyond what we covered in class.
Classification
- Object Detection on Aerial and Satellite Imagery
- An application of graph convolutional kernels
- Problem: Graph convolutional kernels are analogous to convolutions on images, but they try
to identify features in the adjacency matrices of graphs. They are useful for node labeling, edge
identification, and graph compression.
- Data: Graph data from the Stanford Large Network
Dataset Collection (SNAP), and other graphs
- Related papers and blog posts
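To make the analogy with image convolutions concrete, here is a minimal sketch of a single graph convolution step in plain Python. It assumes the propagation rule popularized by Kipf and Welling's GCN (add self-loops, symmetrically normalize the adjacency matrix, aggregate neighbor features, then apply a weight matrix and a ReLU). The project description does not prescribe this formulation, and the names `matmul` and `gcn_layer` are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weights):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adjacency: n x n list of lists (0/1 entries, assumed symmetric)
    features:  n x f node feature matrix H
    weights:   f x g weight matrix W (would be learned in practice)
    """
    n = len(adjacency)
    # Add self-loops so each node keeps its own features when aggregating.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are at least 1 because of the self-loops, so no division by zero.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    aggregated = matmul(norm, features)       # mix each node with its neighbors
    transformed = matmul(aggregated, weights)  # linear feature transform
    return [[max(0.0, x) for x in row] for row in transformed]  # ReLU
```

Stacking several such layers lets information flow across multi-hop neighborhoods, much as stacked image convolutions enlarge the receptive field.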
- Student progression through UNL's academic programs
- Problem: Modeling student progression, major migration and related success factors
- Data: Vanessa Roof, Director of Student Success Reporting & Analytics
- Notes: Anonymized UNL student data; specific problem(s) TBD; might leverage graphical models
- EEG analysis
- Problems:
- An EEG dataset where subjects are seeing a series of
face, scene, and object images on the screen. The classification problem is to predict
what category the person is seeing. They have 17 subjects with pretty good
data (and many more subjects with lower-quality data), with a few hundred instances per
subject per category.
- An EEG dataset where subjects are doing a visual short-term memory task: they see either
1 or 2 items (small colored discs), on either the left or right side of the
screen, and then they have to remember the locations of the items for a second
or two and then report the location of one of them. The classification problem
is to predict whether subjects are seeing 1 or 2 items, and whether those items are on the left or right.
- An fMRI dataset where subjects are viewing a movie that's about 5
minutes long, and they see the same movie several times throughout the
experiment. They have about 20 subjects' worth of data. An issue is that one
cannot easily combine data across human beings with fMRI the way one can with
EEG because the anatomy is so different between people, so there is no
straightforward way to correspond the features to each other across subjects.
However, they have an fMRI image every second, so each person probably has a couple thousand instances to work with, where each instance would correspond to 1 second of the video.
- Defense against adversarial examples
- Problem: Several image classification systems have been shown to be
tricked into gross misclassification by making small, sometimes imperceptible
changes to the input image, e.g., tricking the system into thinking a panda is
a gibbon. An open problem is how to modify an architecture or regularizer to
make classifiers more robust against such minor changes.
- Related papers
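As an illustration of how a small input change can flip a prediction, the sketch below applies a fast-gradient-sign-style perturbation to a tiny logistic regression classifier. This attack and the toy model are assumptions chosen for brevity, not something the project description specifies. The weights, input, and `epsilon` are made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that x belongs to class 1 under logistic regression."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, label, epsilon):
    """Move each input coordinate by epsilon in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the loss
    with respect to the input is (p - label) * weights.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input, for illustration only.
weights, bias = [2.0, -3.0, 1.0], 0.0
x = [0.3, 0.1, 0.2]
x_adv = fgsm_perturb(weights, bias, x, label=1, epsilon=0.2)
```

Here every coordinate moves by at most `epsilon`, yet the predicted class changes. In high-dimensional image inputs, a much smaller per-pixel `epsilon` can have the same effect, which is why the changes can be imperceptible.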
- Computer-aided smart user interface design
- Problem: Computer-aided smart user interface (UI) design is a promising deep learning
application. In practice, assessing and refactoring a UI involves human participants. However,
evaluating designs by looking at historical data can reduce the need for human intervention.
The problem can be posed in either a supervised or a semi-supervised setting.
- Related papers
- Human-robot interaction
- Problem: The objective is to use human-robot interaction data to predict the user's
locus of control (LOC), which is a user quality measuring the control a person feels they
have over their life. Large amounts of data can be used to train a model, but the classification
would ideally be based on a small number of interactions.
- Data: A study with 30 human participants had each participant control a robot to navigate
an obstacle course. Each participant made two runs, for a total number of 60 runs, each capturing
data from approximately 30,000 position points. Data gathered includes duration of each run,
distance of each run, number of commands sent to the robot each run, etc.
- Application of convolutional neural networks for genomic motif finding
- Problem: CNNs have been applied to natural language processing (NLP), where phrases are encoded into a format that looks like Braille and input to the network. This project is to determine whether that approach could be used to find genomic motifs for bacterial species. Conventionally, these motifs are k-mers, and their frequencies in a species' genome are considered a signature for that species. Perhaps one could use a CNN + autoencoder to come up with relevant motifs instead of enumerating all k-mers. The data in this case would be sequencing reads.
- Related links
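For reference, the conventional k-mer baseline mentioned above can be sketched in a few lines of Python. The function name `kmer_frequencies` and the choice to skip k-mers containing characters other than A, C, G, and T (such as ambiguous bases) are assumptions made for this illustration.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Return the relative frequency of every k-mer observed in a DNA sequence."""
    sequence = sequence.upper()
    kmers = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Skip k-mers with ambiguous or non-nucleotide characters (an assumption).
    counts = Counter(kmer for kmer in kmers if set(kmer) <= set("ACGT"))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy read, for illustration only: the 2-mers are AC, CG, GT, TA, AC.
signature = kmer_frequencies("ACGTAC", 2)
```

Because the number of possible k-mers grows as 4^k, enumerating them becomes expensive for larger k, which is the motivation for learning motifs instead.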
- Automated table analysis
- Problem:
A classification problem in which the rows and columns of a
complete table are segmented into header, data, and ancillary regions, solved with unsupervised learning over complete
tables. In this formulation, one would consider the tables as pictures with categorical cell pixels (each cell would be one
pixel, analogous to color pictures). The input to the program would be formatting/stylistic cell features derived from a .xls
representation of the table. Available are 1320 tables in .xls format, along with the classification results
produced by an alternate algorithmic method, against which results can be evaluated.
- Association training
- Problem: Assist an ongoing project to train a system to map inputs (images, text, sounds,
etc.) from input space X to an embedded vector in ℝ^d such that
two similar instances from X have embeddings that are near each other in
ℝ^d under some distance measure such as Euclidean distance.
This would potentially allow the use of locality-sensitive hashing on the embeddings to
enable very fast information retrieval of similar instances from X. This is similar to a
classification problem, but labels are associated with pairs of inputs. E.g.,
two instances x_1, x_2 ∈ X are a positively labeled
pair if they are known to be similar in input space (e.g., both pictures of cats), and a negatively
labeled pair if they are known not to be similar. The goal is to successfully train using a
relatively small number of positive and negative pairs.
- Data: MNIST, CIFAR-10, text data
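One way to express the pairwise training objective in the association training project is a contrastive loss, sketched below in plain Python. This particular loss and the `margin` parameter are assumptions for illustration; the ongoing project may use a different objective. Positive pairs are penalized by their squared distance, and negative pairs are penalized only when they are closer than the margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, similar, margin=1.0):
    """Loss for one labeled pair of embeddings.

    similar=True  -> pull the embeddings together (squared distance).
    similar=False -> push them apart until they are at least `margin` away.
    """
    d = euclidean(u, v)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

In training, this loss would be averaged over the available positive and negative pairs and minimized with respect to the parameters of the network that produces the embeddings.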
Autoencoding
- Hard instance generation
- Problem: Some computational problems are known to be intractable
in the worst case, but are still widely studied at a heuristic level.
One such problem is 3-CNF-SAT: given a Boolean 3-CNF formula φ, decide
whether there is an assignment to its variables that satisfies it.
Heuristics for 3-CNF-SAT have been studied extensively, but there are
still some formulas that are very difficult to solve (requiring days of
processing). This project's goal is to apply autoencoders to characterize
these hard problems, to lend insight on why hard instances are hard.
- Data from competitions
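To pin down the decision problem in the hard instance generation project, the sketch below checks an assignment against a 3-CNF formula and searches all assignments by brute force. The signed-integer encoding of literals (positive for a variable, negative for its negation) follows the common DIMACS convention and is assumed here; the function names are illustrative. The exhaustive search takes time exponential in the number of variables, so it is only usable on tiny formulas.

```python
from itertools import product


def satisfies(formula, assignment):
    """True if every clause has at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


def brute_force_sat(formula):
    """Return a satisfying assignment as a dict, or None if none exists."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(formula, assignment):
            return assignment
    return None


# (x1 or x2 or not x3) and (not x1 or x2 or x3) and (x1 or not x2 or not x3)
example = [(1, 2, -3), (-1, 2, 3), (1, -2, -3)]
model = brute_force_sat(example)
```

Practical solvers replace the exhaustive loop with heuristics, and the formulas on which those heuristics still take days are the hard instances this project aims to characterize.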
- Action-Conditional Video Prediction
- Create a model which, conditioned on the current observation of a stochastic MDP environment (e.g., Atari or robotic pushing environment) and the actions taken by an existing policy, predicts future states of the stochastic environment
- Very strongly related to reinforcement learning theory and applications
- Proposer: Paul Quint, who is very willing to collaborate long-term after the semester and to push ideas toward publication
- Related papers and blog posts
Reinforcement Learning
- Play a new game with AlphaZero
- Problem: Re-implement DeepMind's famous approach to Go/Chess/Shogi for a new game
- Data: Environment of students' choosing, ideally some difficult, perfect-information, deterministic, two-player game
- Proposer: Paul Quint, with willingness to collaborate after the semester
- Related papers and blog posts
- Play StarCraft II Minigames
- Problem: Play minigames in the StarCraft II Learning Environment, a cutting-edge problem in Deep RL
- Environment: https://github.com/deepmind/pysc2
- Notes: This project could go in a lot of directions involving planning, AI, multi-agent systems, and more
- Proposer: Paul Quint, with willingness to collaborate after the semester
- Related papers and blog posts
Last modified 01 May 2018; please report problems to
sscott.