A collection of resources and information for concrete skills that are helpful when pursuing a PhD in computer science (specifically in ML/AI or related disciplines)
This section details how to effectively communicate your research, in particular through papers, figures, and presentations. Given that one major goal in writing papers is for reviewers and peers to read them, rate them highly, and cite them, in this section we’ll also discuss how to review papers so the parallels are clear.
As crazy as it sounds, in many cases when revising drafts (both those I’ve written and those written by others), one of the most effective tools in my toolkit is to ask the author “What are we trying to say here?” and then take whatever simple 1–3 sentence explanation comes out and write that at the top of the relevant paragraph, section, or draft.
This tool is so effective because it helps make your paper’s message apparent (by stating it clearly) and because it reflects the high priority of that information (by stating it first). The efficacy of this tool leads naturally into the second principle, which is…
Beyond just knowing what you’re trying to say, you need to know the entire framing of your paper. The framing (typically laid out in full in the introduction of the paper) will guide the entirety of the paper (and, ideally, the research process itself). When thinking through the framing of the paper, I personally really like a slight variation of the Stanford InfoLab tips mentioned below:
This framing both serves as an immediate outline for most introductions and serves to define the goals of most of the sections in a traditional paper structure.
A good recipe when using this framing is that for each point, you want to state it clearly, motivate it (ideally with citations, experimental results, arguments, or proofs), and connect it back to the main message of your paper. For example, if you argue that a problem is important because it affects industry applications, find a citation for it! If you argue that existing approaches fail because of an empirical property, then cite it, or, if nobody has noted it previously, show it empirically! To motivate why one of your experiments helps establish your claim, argue qualitatively why the particular result you observe for the particular test you run offers that support!
In general, you’ll include the following sections, in something like the following order. I’ve included a smattering of notes on each of these, but they should be expanded—feel free to contribute if you want to help expand on any of these points!
This guide is a great starting point for writing technical papers. I particularly value their discussion on the introduction, which states that an effective introduction must focus on answering five key questions:
This guide is more comprehensive than just focusing on the writing part, and is featured also in the research section of this guide, but it is very relevant here too! Note the synergy between this and the Stanford InfoLab tips above.
This is a series of 10 tips by Dr. Sebastian Nowozin that are more focused on the immediate technical craft of writing itself, but still very useful.
There are two, sometimes contradictory, goals in reviewing modern conference ML papers. First, you review to help the venue ensure it only accepts high-quality papers that will be impactful and (practically speaking) well cited. Second, you review to help the author improve their work, both in this paper and in general. It is important to keep in mind that you need to serve both goals at once in most reviewing contexts, even when they do not fully align. For example, it may become clear early in reading a paper that the paper will not be a good fit for a specific venue—stopping the review there may be suitable for helping the venue, but doesn’t serve the author well at all. Conversely, offering highly detailed feedback about effective writing techniques, figure suggestions, framing tips, etc. may help the author significantly, but if you do this at the expense of missing something critical (e.g., a poor statistical practice, a mistake in a proof, or the paper being highly redundant with existing work), you may be failing in the venue’s goal.
Of course, you also have your own objective: to satisfy both of these goals as effectively as possible in as little time as possible.
I like to structure my reviews in line with both common guidelines for reviewing (e.g., summary, key strengths, key weaknesses, presentation issues, etc.), but also in line with the guidelines for writing papers and the goals listed above. In particular, I do something like the following:
I often write my summary section using the framing outline given in the section above, identifying for each point in the outline my best guess at how the authors would respond to it. Interestingly, in my experience this is often both faster and more helpful than a free-form summary. It is helpful for the authors as it (tacitly) encourages them to focus on those framing questions in a revision and helps expose any areas where their intent wasn’t clearly communicated, and it is helpful for the venue as it can help align reviewers on a common understanding of the work.
Here, the focus is on the other minor strengths and weaknesses that, while nontrivial, would not alone change my decision either way (though many together certainly could). As these will not, in general, decide the recommendation, the primary goal here is to help the author, not the venue.
In some cases, either due to review forms or details of the paper, I’ll separately highlight any commentary on presentation quality or missing references. Missing references in particular, however, usually come up earlier, both in the key/minor weaknesses and in the summary. Note: to adequately check whether references are missing, you must attempt to find missing references! Usually, with a few quick Google searches I can determine whether any major holes exist in the references for a given submission, and I am consistently surprised at how few reviewers do this. I have also personally caught at least one case of plagiarism / dual submission violation this way that would have otherwise gone unnoticed, so it is an important, understated step!
Here I just synthesize the previously noted points into a single accept/reject recommendation. Some people state that you should always give extreme recommendations, but I disagree with this – skepticism and self-reflection are integral to any part of science, and accordingly I am always skeptical of my own opinions and tend to give less confident scores (e.g., minor accept/reject vs. extreme accept/reject). However, that is just my take, and other opinions may differ. If there is any ambiguity in the key strengths/weaknesses sections as to which way the recommendation will go (e.g., if both lists are equally populated), I’ll try to reconcile that here and clarify my thought process.
In this section, I include up to two subsections, splitting my questions for the authors / guidance on changes I’d want to see based on whether or not satisfactory answers would likely change my score. The motivation for breaking things up like this is two-fold:
If/as more reviewing systems move to open reviewing platforms like OpenReview, I think this will be increasingly important to help prevent authors from feeling like they’re being “strung along” by reviewers and to make reviewers think more about their scores in terms of “deltas” (what would need to change for my score to change) and less in terms of qualitative, ill-defined impressions.
The first place to start when thinking about how to be a better reviewer is by looking at the guidelines posted by the venues asking for reviews! Here are a few:
This guide, by Pablo Samuel Castro, lists good practices for reviewing technical ML papers. I haven’t gone through it in depth yet, but at a quick read it seems like excellent advice.
These are links that are interesting and relevant, but not as directly informative or essential as those in other sections.
This slide deck, by Professor Marinka Zitnik, walks through how to make impressive, well-designed figures for both conference and journal venues.