General instructions
Student presentations are the most important part of the course. Their goal is two-fold:
- Give you a taste of ML theory research, especially the many interesting areas and topics which we do not cover in class.
- Give you an opportunity to practice the skill of giving talks, which is an extremely important part of your PhD training (but is often overlooked).
Each presentation will be about 35 minutes long, with 15 minutes for questions and discussion, for a total of about 50 minutes. For the presentation you can use slides, the tablet I use in class or, if you prefer, the blackboard. To ensure all presentations are high-quality, you are asked to meet with me one week before your presentation to review your slides/material and your preparation. Therefore, you should try to be ready with your presentation a week before it is scheduled. This preparation review will be worth 10% of your grade, and the actual in-class presentation will be worth 20%.
Some general advice regarding the talk:
- Try to do one practice talk before your talk, with any of your friends, classmates etc. if possible. One of the surest ways of giving good talks is to get feedback and be reasonably well prepared.
- You are totally not expected to cover everything in your paper! In fact, most likely you will only have time to cover one or two main results. Try to convey one or more key insights or takeaways instead of a lot of technical details.
- On a related note, even if the paper is notation-heavy, it does not mean your talk has to be so. Try to use a minimum amount of notation to convey what you want to say. It is common to make simplifications in talks even if that means sacrificing generality or even rigor.
- To combine the above two points, when you read the paper and plan your presentation, try to identify one or two key ideas that you want to present. You should try to present at least one theoretical result in a self-contained way in the presentation, including the proof or proof sketch (you'll probably only have time to do this for one theoretical result). This means that you might have to present the result in the simplest and cleanest setting, avoiding all extra details and jargon. Try to also relate the presentation to material covered in class wherever possible.
- I encourage you to watch these extremely nice short videos by Uri Alon on how to give a good talk: [1] [2] [3] [4] [5] [6].
Some remarks regarding the papers:
- You are not required to understand every single detail of the papers! Instead, try to focus on the key ideas/messages. It is fine to only skim some parts of the paper, as long as you spend time carefully reading and understanding the other main parts. Importantly, if your paper is long, you only need to read about 20-30 pages for the presentation, enough to get some key ideas and present them. If your project report will be on the same paper, you can read a bit more for the final report, or read a bit more of some other related paper, but around 30 pages is still sufficient.
- If you need further help understanding your paper, feel free to ask on Piazza or schedule an appointment with me.
Schedule and papers
- Monday Nov 1
  - Slot 1 (starting around 11am): Fatih Erdem Kizilkaya [Computational/statistical tradeoffs]
- Wednesday Nov 3
  - Slot 1: Chandra Sekhar Mukherjee [Computational/statistical tradeoffs]
  - Slot 2: Jesse Zhang [Stability for understanding generalization]
- Monday Nov 8
  - Slot 1: Ta-Yang Wang [Generalization for deep neural networks]
  - Slot 2: Emir Ceyani [PAC-Bayes, Generalization for deep neural networks]
- Wednesday Nov 10
  - Slot 1: Sophie Hsu [Generalization for deep neural networks]
  - Slot 2: Di Zhang [Generalization for deep neural networks]
- Monday Nov 15
  - Slot 1: Grace Zhang [Generalization for deep neural networks]
  - Slot 2: Jiahao Wen [Optimization for deep neural networks]
- Wednesday Nov 17
  - Slot 1: Berk Tınaz [Out of distribution generalization]
  - Slot 2: Bhavya Vasudeva [Out of distribution generalization]
- Monday Nov 22
  - Slot 1: Navid Hashemi [Adversarial robustness]
  - Slot 2: Neel Patel [Robust ML]
- Monday Nov 29
  - Slot 1: Sid Devic [Clustering]
  - Slot 2: Zhengqi Wu [Beyond iid data]
- Wednesday Dec 1
  - Slot 1: Ali Omrani [Fairness]
  - Slot 2: Yingxiao Ye [Fairness]
Project report
The report summarizes and distills the main results of your assigned papers, other papers that you chose to read, or the research you chose to do. You should aim to read around 30 pages, so if your paper is longer than this you don't need to read all of it. The final report has to be written in LaTeX and should be 7-8 pages long, excluding references. Please use this LaTeX template based on the NeurIPS format. You can use the following format as a guide, but you don't need to follow it strictly. For instance, if you're including some research that you did, you will probably modify the outline accordingly.
- Abstract: A short abstract, summarizing the entire survey.
- Introduction: Introduce the main topic of your assigned paper(s). Try to put it into the context of general learning theory, and explain how it relates to topics covered in our lectures (to the extent possible). Briefly mention the high-level results and explain the significance of the results (such as improvement over prior work).
- Problem setup: For problem setup, you should describe the problems in detail using necessary notation. Once again, you are not asked to cover everything in the papers, so only describe in detail what you plan to cover in this short survey.
- Main results: Describe the main algorithms/theorems. For algorithms, describe what they are doing at each step and what the key idea is behind it. For theorems, after the formal statement, try to explain in words what the statement really means and what the implications are.
- Proofs: Try to distill some proofs from the paper and reproduce them in the report. Due to the space limit, most likely you can only fit 1-2 proofs into the report, so pick the ones that you think are most important. If the original proofs are long and complicated, try to break them down into several parts (in the form of lemmas, for example), and only present the proofs for some of these parts.
- Experiments: If the paper has experiments and you think they are an important component, mention them here.
- Conclusion: A short conclusion, highlighting the main message again.
- References.
The following two are optional and considered as bonus tasks. You can use up to 1 extra page for each of these two extra components:
- Open questions: Identify interesting and concrete open questions in the same direction that are not mentioned in the papers already. Mention briefly what you think the potential approaches are to tackle these open questions and/or why you think these are hard problems that require new techniques beyond what the papers present.
- More papers in the same direction: Read more related papers on the same topic and include an extended related work section that summarizes what the other papers you read are about and how they compare to your assigned papers. To find these other papers, you can use the reference list of your assigned papers, use Google Scholar to see which papers cite your assigned papers, or do a general online search on the related topic.
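The outline above, including the two optional components, could be laid out as a LaTeX skeleton along the following lines. This is only a rough sketch: it uses the standard `article` class as a stand-in for the course's NeurIPS-based template (an assumption, swap in the template's own preamble), and the title, author and section names are placeholders that you can rename or reorder.

```latex
% Assumption: the standard article class stands in for the NeurIPS-based
% course template; replace this preamble with the template's own.
\documentclass{article}

\title{Report title (placeholder)}
\author{Your name (placeholder)}

\begin{document}
\maketitle

\begin{abstract}
% A short summary of the entire survey.
\end{abstract}

\section{Introduction}
% Main topic, context within learning theory, relation to the lectures,
% high-level results and their significance.

\section{Problem setup}
% Only the notation and problems you actually cover.

\section{Main results}
% Algorithms and theorems, each followed by an explanation in words.

\section{Proofs}
% 1-2 key proofs, broken into lemmas where helpful.

\section{Experiments}
% Only if the paper's experiments are an important component.

\section{Conclusion}
% The main message again.

% Optional bonus components (up to 1 extra page each).
\section{Open questions}
\section{More papers in the same direction}

% References go here; they do not count towards the 7-8 page limit.

\end{document}
```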
The project report should be written in such a way that anyone else in the class who hasn't read the original paper(s) but directly reads the report can still follow along. Try to keep this in mind as you write, make the exposition as clean and clear as possible, and include sufficient intuition along with the formal details. The report will mainly be evaluated on the quality of the writing and the presentation, and how your understanding of the paper is reflected in it.
The report is due on Tuesday December 14th at 8am (a 24-hour extension from the previous deadline). No further extensions or late days are possible for the report, so please plan accordingly.