CSCE 496/896-005 (Spring 2019) Project Ideas
In this course you and your team will do a substantial project, in which
you will characterize a significant problem amenable to a deep learning
solution, study the related work to this problem,
develop one or more deep learning approaches to this problem,
and evaluate your approaches.
You will summarize your project results in a written report and an oral
presentation.
The written report must
use a professional writing style similar to that found in a refereed conference
or journal (e.g., ACM, IEEE, ICML, ICLR, NIPS),
including abstract, introduction, summary of related work, your
contribution, references, and an appendix (if necessary).
The oral presentation
will be to the entire class at the end of the semester: during the fifteenth
week (April 22–26), and if necessary, during the fourteenth week
(April 15–19). You will submit your written report
no later than 11:59 p.m. on April 24.
In accordance with UNL policies,
you have now been informed in
writing of the nature and scope of this project prior to the eighth week
of classes.
Later this semester (late February to early March) we will set a deadline
for submission of 1–3 paragraph proposals on your projects.
Also, in late March, you will submit to us a brief progress report and meet with us
for a check-in on your project. You must do both of these
in order to get full credit for your project, and you must get our approval on
your proposal before starting work on your project.
Projects are officially due by 11:59 pm on Wednesday, April 24.
See the rules on projects for more information.
Project oral presentations are the weeks of April
15 and April 22. See the
presentation schedule for more information.
✔ Project check-in reports are due Sunday, April 7.
You should submit your report to sscott@cse and pquint@cse in text format in the body of an email before 11:59 pm on that day.
The check-in report should include:
- A summary of what you've accomplished on your project so far.
- A description of what major elements remain for your project, including
what is left over from your proposal as well as any new elements that you've
added since your proposal.
- An overview of any significant hurdles that have arisen.
- A list of what questions you need answered to move forward.
On Monday, April 8, we will spend the hack-a-thon period reviewing each group's
check-in report.
✔ PROPOSAL DEADLINE:
The proposal submission deadline is Sunday, March 3.
You should submit it to sscott@cse and pquint@cse in text format in the body of an email before 11:59 pm on that day.
The proposal should include:
- A brief statement of your project topic.
- Motivation for your topic (why it is important and interesting).
- A precise work plan: what you plan to do, what data sets you will test on, how you will evaluate performance, etc.
- At least three references (at least two published journal or conference papers).
The following is a list of possible projects, suggested by a variety of individuals. If you want
to know more about a particular one, let us know and we can put you in contact with those who can
provide more details.
You are welcome to suggest your own project as well, e.g., one related to your own research.
It should contain some form of an experimental component, be relevant to the course, and be
a non-trivial extension beyond what we covered in class.
Classification
- Classification of wildlife camera trap images
- Problem: The goal of this pilot project is to reduce the burden of manual viewing and
classification of thousands of wildlife images using image classification
tools. The images come from camera traps (motion-activated cameras),
which are widely used by ecologists to collect various information on wildlife
populations such as habitat use and prey vigilance.
The first part of this project is to identify the species of the animal in the
image. The second part is to indicate the location of each animal in the
image, and to count them.
The work will entail adapting existing approaches trained on African wildlife to identify Nebraska
wildlife. Could become a thesis and a paper.
- Proposer: Andrew Little
- Related papers:
- Classification of programmer expertise
- Problem: The goal of this project is to classify the expertise
level of a programmer based on eye tracking data (time series or fixed window)
collected while the programmer debugs code. Could become a thesis and a paper.
- Proposer: Bonita Sharif
- Related papers:
- Prediction of Stack Overflow tags
- Classification of animal feed types
- Problem: Given chemical analysis results on animal
feed, predict its type (e.g., soy-based, corn-based).
Could become a thesis and a paper.
- Deep learning in wireless networks
- Problem: Apply deep learning to problems in wireless networking.
Data are collected from the NEXTT testbed
of a cloud radio access network (C-RAN)-based experimental wireless network.
Possibilities include, but are not limited to,
channel estimation, bandwidth sharing, and others related to massive MIMO.
Could become a thesis and a paper.
- Defense against adversarial examples
- Problem: Several image classification systems have been shown to be
vulnerable to gross misclassification caused by small, sometimes imperceptible
changes to the input image, e.g., tricking the system into thinking a panda is
a gibbon. An open problem is how to modify an architecture or regularizer to
make classifiers more robust against such minor changes.
- Related papers:
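To make the attack concrete, the sketch below applies the fast gradient sign method (FGSM), one well-known way of constructing such perturbations, to a toy logistic-regression classifier. The weights, the input, and the step size `eps` are made-up values chosen only for illustration; a real project would attack (and defend) a deep image classifier instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """Move each input feature by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # For cross-entropy loss, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Toy model and input (illustrative values, not from any real system).
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.2, 0.1, 0.1], 1
eps = 0.1

x_adv = fgsm(w, b, x, y, eps)
p_clean = predict(w, b, x)    # above 0.5, so the clean input is classified as 1
p_adv = predict(w, b, x_adv)  # below 0.5, so the perturbed input is classified as 0
```

No feature moves by more than `eps`, yet the predicted class flips. A defense would aim to keep the prediction stable under any perturbation of that size.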
- Association training
- Problem: Assist an ongoing project to train a system to map inputs (images, text, sounds,
etc.) from input space X to an embedded vector in ℝ^d such that
two similar instances from X have embeddings that are near each other in
ℝ^d under some distance measure such as Euclidean distance.
This would potentially allow the use of locality-sensitive hashing on the embeddings to
enable very fast information retrieval of similar instances from X. Similar to a
classification problem, but labels are associated with pairs of inputs. E.g.,
two instances x1, x2 ∈ X are a positively labeled
pair if they are known to be similar in input space (e.g., both pictures of cats), and a negatively
labeled pair if they are known not to be similar. The goal is to successfully train using a
relatively small number of positive and negative pairs.
- Data: MNIST, CIFAR-10, text data
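One common way to train on labeled pairs is a contrastive loss, which pulls the embeddings of positive pairs together and pushes those of negative pairs at least some margin apart. The sketch below shows the per-pair loss only. It is an assumption about how such training could be set up, not a description of the ongoing project's method, and the `margin` parameter is a hypothetical hyperparameter.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, similar, margin=1.0):
    """Loss for one labeled pair of embeddings.

    Positive pairs are penalized by their squared distance.
    Negative pairs are penalized only if they are closer than `margin`.
    """
    d = euclidean(u, v)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

# A negative pair that is too close is penalized; one far apart is not.
close_negative = contrastive_loss([0.0, 0.0], [0.5, 0.0], similar=False)
far_negative = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False)
far_positive = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True)
```

In a full system, `u` and `v` would be the outputs of the same network applied to x1 and x2, and this loss would be averaged over a batch of pairs.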
- EEG analysis
- Problems:
- An EEG dataset where subjects see a series of
face, scene, and object images on the screen. The classification problem is
to predict which category the subject is seeing. They have 17 subjects with fairly good
data (and many more subjects with lower-quality data), with a few hundred instances per
subject per category.
- An EEG dataset where subjects are doing a visual short-term memory task: they see either
1 or 2 items (small colored discs), on either the left or right side of the
screen, and then they have to remember the locations of the items for a second
or two and then report the location of one of them. The classification problem
is to predict whether subjects are seeing 1 or 2 items, and whether those items are on the left or right.
- An fMRI dataset where subjects are viewing a movie that's about 5
minutes long, and they see the same movie several times throughout the
experiment. They have about 20 subjects' worth of data. An issue is that one
cannot easily combine data across human beings with fMRI the way one can with
EEG because the anatomy is so different between people, so there is no
straightforward way to match features to each other across subjects.
However, they have an fMRI image every second, so each person probably has a couple thousand instances to work with, where each instance would correspond to 1 second of the video.
Reinforcement Learning
- Learning how to apply genetic algorithm operators for software
assurance
- Problem: Use reinforcement learning to learn a policy to select
which genetic algorithm (GA) action is most appropriate for evolving solutions for
formal methods of software assurance.
- Data: Kodkod relational models are fed into a GA for solving.
The RL agent will learn which GA operators to select for the fastest solving.
- Proposer: Hamid Bagheri
- Related Paper:
- Play a new game with AlphaZero
- Problem: Re-implement DeepMind's famous approach to Go/Chess/Shogi for a new game
- Data: Environment of students' choosing, ideally some difficult, perfect-information, deterministic, two-player game
- Proposer: Eleanor Quint, with willingness to collaborate after the semester
- Related papers and blog posts:
- Play StarCraft II minigames
- Problem: Play minigames in the StarCraft II Learning Environment, a cutting-edge problem in deep RL
- Environment: https://github.com/deepmind/pysc2
- Notes: This project could go in a lot of directions involving planning, AI, multi-agent systems, and more
- Proposer: Eleanor Quint, with willingness to collaborate after the semester
- Related papers and blog posts:
Imputation
- Predicting missing flight recorder data
- Problem: Given data (time series or fixed window) from an
airplane's flight recorder, predict values that are missing in the sequence.
Could be a thesis or a paper.
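Any learned imputation model should be compared against a simple baseline. The sketch below fills gaps in a single sensor channel by linear interpolation between the nearest observed readings. The representation of missing readings as `None` and the sample values are assumptions made for illustration, not the actual flight recorder format.

```python
def interpolate_missing(series):
    """Fill None entries by linear interpolation between the nearest observed values.

    Gaps at either end of the series are filled with the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            span = series[right] - series[left]
            filled[i] = series[left] + span * (i - left) / (right - left)
    return filled

# Hypothetical readings with three kinds of gaps: leading, interior, trailing.
readings = [None, 10.0, None, None, 16.0, None]
filled = interpolate_missing(readings)
```

To evaluate a deep model, one could hide values that are actually observed, impute them with both methods, and compare the errors.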
Autoencoding
- Hard instance generation
- Problem: Some computational problems are known to be intractable
in the worst case, but are still widely studied at a heuristic level.
One such problem is 3-CNF-SAT: given a boolean 3-CNF formula φ, answer
whether there is an assignment to its variables that satisfies it.
Heuristics for 3-CNF-SAT have been studied extensively, but there are
still some formulas that are very difficult to solve (requiring days of
processing). This project's goal is to apply autoencoders to characterize
these hard problems, to lend insight on why hard instances are hard.
- Data from competitions
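For readers new to the problem, the sketch below shows one way to represent a 3-CNF formula and decide satisfiability by exhaustive search. The signed-integer encoding of literals (3 for x3, -3 for its negation) is a convention assumed here for illustration. The exhaustive search takes time exponential in the number of variables, which is exactly why heuristics are used in practice.

```python
from itertools import product

def satisfies(formula, assignment):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force_sat(formula):
    """Return a satisfying assignment as a dict, or None if the formula is unsatisfiable."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(formula, assignment):
            return assignment
    return None

# (x1 or x2 or not x3) and (not x1 or x2 or x3) and (x1 or not x2 or not x3)
formula = [[1, 2, -3], [-1, 2, 3], [1, -2, -3]]
solution = brute_force_sat(formula)

# All eight sign patterns over three variables together rule out every assignment.
unsat = [[s1 * 1, s2 * 2, s3 * 3] for s1, s2, s3 in product([1, -1], repeat=3)]
no_solution = brute_force_sat(unsat)
```

An autoencoder would need a fixed-size or graph-structured encoding of such formulas as input; choosing that encoding is part of the project.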
- Action-Conditional Video Prediction
- Problem: Create a model which, conditioned on the current
observation of a stochastic MDP environment (e.g., Atari or robotic pushing
environment) and the actions taken by an existing policy, predicts future
states of the stochastic environment.
Very strongly related to reinforcement learning theory and applications.
- Proposer: Eleanor Quint, who is very willing to collaborate long-term after the semester and to push ideas to publication
- Related papers and blog posts:
Miscellaneous Applications
- An application of graph convolutional kernels
- Problem: Graph convolutional kernels are analogous to convolutions on images, but they try
to identify features in the adjacency matrices of graphs. Useful in node labeling, edge
identification, and graph compression.
- Data: Graph data from the Stanford Large Network
Dataset Collection (SNAP), and other graphs
- Related papers and blog posts:
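As a rough illustration of the idea, the sketch below implements one graph convolution layer in a widely used form: add self-loops to the adjacency matrix, normalize it symmetrically by node degree, aggregate neighbor features, apply a weight matrix, and then a ReLU. This is one common formulation, assumed here for illustration, and the small graph, features, and weights are made up.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def gcn_layer(adj, features, weights):
    """One layer: ReLU(D^-1/2 (A + I) D^-1/2 H W), with D the degree matrix of A + I."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization keeps high-degree nodes from dominating.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

# A three-node path graph 0 - 1 - 2 with one-hot node features (illustrative).
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
hidden = gcn_layer(adj, features, weights)
```

In a trained network, `weights` would be learned by gradient descent, and several such layers would be stacked so that information propagates over longer paths.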
- Computer-aided smart user interface design
- Problem: Computer-aided smart user interface (UI) design can be a cool deep learning
application. In practice, assessing and refactoring a UI involves human participants. However,
evaluating designs by looking at historical data can reduce the need for human intervention.
Can be done in both supervised and semi-supervised settings.
- Related papers:
- Application of convolutional neural networks for genomic
motif finding
- Problem: CNNs have been applied to natural language processing
(NLP). The phrases are encoded into a format that looks like Braille and input
to the network. This project is to determine if that could be used for finding
genomic motifs for bacterial species. Conventionally, these motifs are
k-mers, and their frequencies in a species' genome are considered a
signature for that species. Perhaps one could use a CNN and autoencoder to come up with relevant motifs instead of enumerating all k-mers. The data in this case would be sequencing reads.
- Related links:
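The two representations discussed above can be sketched as follows: the conventional k-mer frequency signature, and a per-base one-hot encoding of a read that a CNN could take as input. We assume here that one-hot encoding is the Braille-like format the description alludes to. The example read is made up, and any base outside A, C, G, T (such as N) maps to an all-zero row.

```python
from collections import Counter

BASES = "ACGT"

def kmer_counts(read, k):
    """Count every length-k substring (k-mer) in a sequencing read."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def one_hot(read):
    """Encode a read as one row per base, with a 1 in the column of that base."""
    return [[1 if base == b else 0 for b in BASES] for base in read]

read = "ACGTACG"            # hypothetical read
counts = kmer_counts(read, 3)
encoded = one_hot(read)     # 7 rows, 4 columns
```

A convolutional filter of width k over `encoded` scores every k-mer position in the read, which is what lets a CNN learn motifs instead of enumerating all 4^k possible k-mers.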
Last modified 15 April 2019; please report problems to
sscott.