Advice to NLP & CompLing students
- Sign up for relevant mailing lists, groups, etc.; many job and Ph.D. openings get posted on these lists.
- Become fluent in the command-line shell, and use Linux (Ubuntu is a popular choice). Learn a real text editor such as Vim or Emacs.
- Learn LaTeX, Git, and Python (version 3).
- Then learn a statically typed programming language, such as Go, Swift, C, Java, or Scala.
- Be active on GitHub: this is how you can show off your programming, documentation, and collaboration skills. Start by posting code from class projects, each with a well-developed README file and a free licence. Then learn how to contribute to other people's projects.
- On writing papers: be direct, and expect your audience mostly to skim your paper. This paper is a nice example, especially the first half. Simon Peyton Jones has some nice advice on how to write research papers and how to give presentations.
- Do as many internships as possible! This is really important.
- Read up on how to do a technical interview. Here are some examples: 1, 2, 3, 4, 5, 6
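To give a small taste of why Python 3 is recommended above for language work: its strings are Unicode by default, so everyday text processing handles non-English data correctly out of the box. This is a minimal, hypothetical sketch (the function name and example sentence are illustrative, not from any course material):

```python
# Minimal word-frequency sketch in Python 3.
# str is Unicode by default, so lowercasing and splitting
# non-ASCII text (here, Turkish) works without extra effort.
from collections import Counter

def word_counts(text):
    """Lowercase the text, split on whitespace, and count word frequencies."""
    return Counter(text.lower().split())

counts = word_counts("Çok güzel çok güzel bir gün")
print(counts.most_common(2))  # [('çok', 2), ('güzel', 2)]
```

Note that `"Çok".lower()` correctly yields `"çok"`; in Python 2 the same one-liner on a byte string would silently mishandle the accented characters.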
Theses
- Fraser Bowen (2017) - Neural network-based grammatical error correction
- Alina Karakanta (2017) - Neural MT using a resource-rich closely-related language
- Benjamin Peters (2017) - NMT-based phonological transcription. Think of it as multilingual text-to-speech
- Mahsa Vafaie (2017) - Acoustic dialect identification with deep-learning based ASR
- Laura Bostan (2017) - Deep learning-based generation of recipes
- Nikolaos Bampounis (2016) - Paraphrasing for machine translation using distributed word vectors
- Artuur Leeuwenberg (2015) - Minimally supervised synonym extraction using word embeddings
- Liling Tan (2017) - Induced ontologies in machine translation
- Santanu Pal (2017) - Automatic post-editing of machine translation
- Marcos Zampieri (2016) - Pluricentric languages: Automatic identification and linguistic variation
- Mojgan Seraji (2015), Uppsala University, advised by Joakim Nivre - Morphosyntactic corpora and tools for Persian
- Kata Naszádi (2017) - Image-sensitive language modeling