If you are reading this, then you probably finished the long and arduous journey to grad school. You emerged victorious, and this success is well-deserved. But which school should you choose? How do you make the right choice if all schools look great in their own way? This blog post is centered around these questions. It is most useful if you are a computer science student aiming to study machine learning and, in particular, natural language processing in the US, but most of the information here is equally valid for any field of research and any country.
The choice of grad school that is right for you can be tricky and confusing. We live in a time of hyper-competitiveness, where even undergrads need to optimize for metrics like paper count to make it to the next level — grad school. This heavily career-centered perspective was probably advantageous to get you into grad school, and it remains crucial to get you to the level after that: a great job in industry or academia. So choosing the school which is best for your career can feel like an obvious choice. However, a PhD is a very long journey, and choosing your grad school based on this perspective alone might make you more vulnerable to burn-out, disillusionment, and general dissatisfaction.
In this blog post, I will discuss this career-centered perspective in detail, but I will also provide three other views that will hopefully help you make a balanced choice, one that leads not only to academic success but also to long-term satisfaction and a full and rich life. Balancing your decision across all four perspectives probably leads to a better choice than looking at one angle alone. Before I go into the details, let me briefly introduce these four perspectives: the Career Perspective, the Identity Perspective, the Stability Perspective, and the Variability Perspective.
A quite intuitive perspective is the Career Perspective, which is about determining and weighing the factors that help you to be successful in your PhD and have a successful career.
A different perspective is the Identity Perspective: not looking at your career but at who you want to be and how your choice enables and facilitates that identity. The social environment that you are in has a strong causal effect on your development: We are strongly influenced by the people and culture around us, and the friends of friends that you do not even know will make you honest/deceitful, selfish/selfless, caring/exploiting, and so forth. If you choose a school where the unwritten motto is “The worth of a person is measured in papers and citations” you will slowly but surely grow to be a person that would live by such a motto. Would you like to be such a person? So by choosing a school you in some way also define and constrain the person that you can become.
The Stability Perspective says that choosing the “right” school is an illusion but that there are other choices that matter much more because they give you the stability that you need to succeed in the arduous PhD journey. It is well known that the effect of most moderately painful or enjoyable events that significantly affect your life will wear off within about two years and that you will return to your baseline happiness and stay there. However, some things are more stable. A great and friendly social environment where you always feel supported and not alone will meet your most basic human needs and will make a 5-year-or-so journey a breeze. On the other hand, a tiny research group with a distant advisor will make for an uncertain, lonely, and stressful 5 years.
Another valid way to select a school is by the variability of experience it will offer: this is the Variability Perspective. You probably sacrificed in some way to get into grad school. You neglected your passions outside work, neglected friends or your partner or your family, neglected self-development, neglected to work on your mental, physical, or spiritual health, or you neglected other things that are important to you. By choosing the school that is best for your career, you might very well continue on this path of neglect. When does it stop? Once you have completed an excellent PhD, you might labor on by choosing that super competitive assistant professor job, then tenure, then being a leading figure in your field, and so on. There is nothing wrong with such a path through life, but continuous exploitation will lead to local minima. The two most common regrets of the dying are “I wish I’d had the courage to live a life true to myself, not the life others expected of me” and “I wish I hadn’t worked so hard.” The dying probably would have avoided these regrets if they had known better. Making sure you have the time and opportunity for further exploration is very helpful in gathering the information necessary to make better choices in the future that do not lead to regret.
The career perspective looks at the most critical factors for your academic success and success beyond that and chooses the school that is best according to these factors. Let me go through each factor. I list the factors in order of importance, starting from the most important.
Advisor
Finding suitable advisors is probably the most crucial task when choosing between grad schools. One could even go further and argue that one should not choose a school, but one should choose an advisor. A lousy advisor can make you miserable, unproductive, and stressed, and might be the main reason why you would want to drop out of the program. The right advisor will help you to be productive, stay healthy, and enjoy doing your research. It is important to emphasize personal fit here: Some advisors are great for you and bad for others and vice versa. The following criteria will help you identify advisors that might be better for you than others. However, there is a great deal of gut feeling to this decision. It is a bit like dating: even if everything looks right on paper, that does not mean this is the right person for you.
Another important note is that you should be looking for advisors and not a single advisor. This complicates an already complicated process, but it is risky to choose a school based on a single person. Relationships are complicated, and things might not work out as expected with your advisor. If possible, you should have an alternative advisor to whom you can switch if it does not work out with the other advisor. This strategy also offers the possibility of being co-advised — two advisors that complement each other may provide a great fit even though a single advisor might not.
The following advisor-related factors do not have a particular order.
Research Style
Research style is probably the most elusive quality but also by far the most important quality that you acquire during your PhD. While many would say that the goal of a PhD is to become an independent researcher, the truth is that with the steep requirements for ML/NLP PhD positions, many students are already somewhat close to being independent. They can generate ideas with ease and execute them confidently in research projects. However, the actual quality that new students lack is research style.
Harriet Zuckerman is probably the person who studied scientific expertise to the greatest qualitative depth. For her work Scientific Elite, she interviewed a large share of the Nobel laureates working in the US at the time of her study. She found that these individuals often rose through the ranks through accumulated advantage. One advantage helped them secure the next position/grant/collaboration, which increased their advantage and helped them secure the next one, and so forth. Zuckerman found that the main advantage gained through this ladder-climbing was not necessarily more resources (money, equipment), but having the opportunity to culturally adopt the research style of other successful scientists. Consistent with this, many future Nobel laureates were advised by Nobel laureates or by scientists who would later become Nobel laureates. So good questions to ask are: Can my research advisor’s research style help me in my career? Do I want to be a researcher that follows the style of my advisor?
While your advisor will be the focal point of research culture, research culture is also created through interactions between your advisor’s lab students. It is usually subconsciously adopted over time. Most students might not be aware of how they were shaped by their advisor and research group. It happens automatically and does not necessarily require explicit thinking or effort.
To give you some examples of what facets of a research style might look like:
- Ideas are cheap and belong to the research group. Execution of those ideas as research projects is real research.
- Novel ideas are everything. If someone publishes something even remotely similar to what you have, you should give up the project and work on something nobody is working on.
- Good science is good math. A paper should be mathematically solid so that it will stand for years, holding valuable insights and generalizations that go beyond the current theoretical application.
- Good science is robust science. A paper should have careful claims with robust evidence. This will help make the field progress more quickly by providing reliable information to build on.
- Good science is a good research vision. A paper should be about what is possible in the future and where a line of research could lead to. Evidence augments vision, but a paper without vision is blind, incremental, and will be forgotten.
- Good science is good insight. Some insights can be extrapolated and be applied to many other scientific problems, many of which have not been formulated yet. Finding and expressing these insights is vital for scientific progress.
- It is all about productivity. Research is inherently noisy and messy, and it’s tough to predict the outcome of an idea or set of experiments in the development stage. Navigating this uncertainty is best done through fast iterations and balancing multiple projects to maximize the chances of a big success.
- Good science is collaborative. Different people can bring unique perspectives to a project and increase the chance of serendipitous insights. Collaborations bring the best out of people and can result in a sum that is larger than its parts.
- Good science is solitary. To gain the deepest insights into a problem, one has to understand a problem in its fullness without outside help. While collaborators can join later, the all-encompassing understanding of a problem through solitary pursuit is critical to tackling the most important scientific problems and for growing as a researcher.
These are just some examples, and usually, a research style is made up of a multitude of facets like these. Research style is complex but can best be encapsulated by the question “What does good/bad research look like?” However, if you ask this question during the visit days, you often find that people answer what they think good research is supposed to look like, rather than what it looks like for them. Therefore, better questions for visit days are:
- “What are examples of research papers you like?”
- “What research papers (in your area) do you think are the most important ones of the last few years?”
These questions often reveal what people think are important problems and the “correct” manner of approaching these problems. Both qualities are strongly related to research style.
Since the acquisition of research style is largely automatic and subconscious, it is crucial to understand which research style you will be adopting by joining a particular school and lab/advisor. So, what can such adoption look like?
Take as an example a friend who, at the start of his PhD, could be described as a minimalistic hacker-researcher: someone who tinkers with minimal changes to a system to improve it in a simple manner. He teamed up with an insight-driven neat professor for his PhD. After a couple of years, he had learned to be an insight-driven hacker. He builds hacks, understands the deep relationships of how his hack affects the system, and then extracts this insight in the most minimalistic and well-formulated way possible along with his practical hack. The combination is a pretty potent mix: the minimalistic insight-driven hacker-researcher. This person finds small hacks that yield robust results, along with insights into how the hack relates to other research and concepts.
One friend described me as a product-driven experimental hacker, meaning someone that rapidly prototypes changes to a system and tests them experimentally for reliable effects. If reliable effects are found, the hack is extracted into a product that other researchers can easily use. I was pretty surprised by that view at first, but I now think it pretty much hits the nail on the head.
Some friends I would describe as:
- concept-centered experimental visionary
- gregarious cool-stuff-can-be-good-science collaborator
- mind-the-gap collaborator
- principled neat-and-tidy collaborator
- I-like-cool-stuff researcher
It is important to note that there is no right or wrong, good or bad research style. For example, in Zuckerman’s work, two Nobel laureates in the same field would sometimes have radically opposite research styles, yet each style made both the laureates and their students successful. Similarly, while an I-like-cool-stuff style sounds unimpressive, the Feynman-like playfulness of an I-like-cool-stuff researcher might lead to significant discoveries that others overlook because they do not deem these problems serious enough to work on.
Looking at my friends, they often came in with a particular mindset, and looking at them now, they very clearly adopted the central cultural tenet of their research environment.
It can be very empowering to enter grad school with this view. A friend of mine entered grad school and, upon hearing this interpretation, actively thought about how he could augment his research style with a particular advisor’s style. He switched advisors until he found the right ones. Then he leaned in and tried to adopt the advisor’s central cultural research facet as quickly as possible. My friend’s primary research advisor told him four years into the PhD that there was nothing left that he could teach him and that he should graduate and move on to learn more. I was not surprised, and I think this is directly related to my friend’s viewpoint that adopting a research style and developing research taste is the most crucial element of a PhD.
So while elusive and hard to define, the research style of a particular advisor or department can be an essential consideration in choosing the grad school that is right for you.
While the following sections will dive into other angles on choosing potential advisors, they can also be interpreted as sub-components of research style, particularly the advisor values section.
Advisor Research Fit
Students often do not know what to look for in an advisor and often cling to the idea that they need to find an advisor who is interested in the same research that they want to do. There is some truth to this idea, but it is more dangerous than it is helpful. Of my friends in the 2nd year, about 66% changed their research direction completely, many of them in the first year. That number is higher if I look at later years. Most of them still work in their subfield (robotics/NLP/vision), but they switched to different research areas within those fields. Some examples:
- multilingual parsing -> multilingual models -> machine translation
- question answering -> dialog -> reinforcement learning -> semantic parsing
- NLP architectures -> machine translation -> model efficiency
- human pose recognition -> sim2real
- question answering -> model efficiency -> interpretability -> model efficiency
What you see from these transitions is that an exact fit is not needed with advisors since your research interests will change. The same is true for your potential advisors: they might no longer be interested in research that they are well known for, or they might be interested in a direction which they have not yet published in. Compared to students, advisors have much more breadth and might be equally interested in many different research directions at once. Furthermore, while new professors are often compelled to stick closer to a specific research direction until they get tenure, tenured professors can be very flexible in research directions, and their interest might also be influenced significantly by the interests of their students. More senior professors are often happy to take on a completely new research direction that is interesting to you and compelling to them — this can be the advantage of hands-off advisors, which I will talk about in the next section.
Despite the overall fluidity of the research interests of both you and your advisor, it is a good idea to have at least some overlap. It might be worth asking about the advisor’s long-term research vision, but be aware that such plans are often not well fleshed out and can change quickly based on changes in the field (e.g., BERT). It might also be worth looking at the values of an advisor because they are rather stable over time and can hint at which kind of research the advisor likes. More on values later.
Advising style: Hands-on vs Hands-off
Advising styles can be mainly separated into hands-on and hands-off styles. What does this mean?
In general, what you can expect from a hands-off advisor is that you do all the work, and your advisor gives you feedback on what you have produced. For a hands-on style, the advisor also helps with the producing in some way.
More concretely, a hands-on advisor might help you with many details of your research: brainstorming research ideas, discussing research ideas and problems in detail, helping define research problems and ideas, thinking about a narrative for your paper, formulating claims, structuring a research project into pieces with milestones, checking in frequently to discuss partial results, discussing programming problems and bugs, providing rapid feedback, steering the project to prevent failure, providing detailed feedback on the write-up, and providing detailed feedback on presentation slides. All of these are signs of a hands-on advising style.
A hands-off advisor will help you with the high-level aspects of your research: discussing the viability and impact of a research idea, discussing research narratives/pitches/claims, discussing research results, and providing (high-level) feedback on the final paper draft and slides. Working with a very hands-off advisor has many benefits, but in terms of direct help and interaction, you often cannot expect much more than what I list here.
The hands-on / hands-off distinction is really a continuum: usually, an advisor exhibits a mix of these traits. For example, some advisors might be very hands-off but very involved in idea generation, while others really like to give detailed feedback on writing. Usually, advisors also adjust slightly to the needs of each student and can be more hands-on in research areas where they are well established. It is useful to talk to students to find out exactly in which areas the advisor is hands-on or hands-off. Areas here can refer to activity areas (help with writing, brainstorming ideas, thinking about a research story, etc.), technical areas (helping with bugs/code, finding the right software framework), and research areas (machine translation, question answering, etc.). So you should not ask students, “Is your advisor hands-on or hands-off?”, but instead you should ask, “Is your advisor hands-on with giving feedback on writing?” and so forth. Ask about the areas that are most important to you (your weak areas).
A hands-on advisor is great if you are less experienced in research, need more structure and deadlines, are unsure about potential research topics, and are externally motivated. A hands-