The Hugging Face team believes that we can reach our goals in NLP by building powerful open-source tools and by conducting impactful research. Our team has begun holding regular internal discussions about awesome papers and research areas in NLP. In the spirit of open science, we've decided to share these discussion materials with the community.
Note: These science day discussions are held offline, so we have no recorded presentation or discussion to share. However, some presentation materials do include limited comments from our team or summaries of internal discussions.
See planned future discussions below.
- Paper: Pre-training via Paraphrasing
- Authors: Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
- Presenter: Sam Shleifer
- Presentation: Forum Summary
- Community Discussion
- Paper: Weight Poisoning Attacks on Pre-trained Models
- Authors: Keita Kurita, Paul Michel, Graham Neubig
- Presenter: Joe Davison
- Presentation: Colab notebook/post
- Community Discussion
- Paper: Linformer: Self-Attention with Linear Complexity
- Authors: Sinong Wang, Belinda Li, Madian Khabsa, Han Fang, Hao Ma
- Presenter: Teven Le Scao
- Presentation: Tutorial Blog Post
- Community Discussion
- Paper: Evaluating NLP Models via Contrast Sets
- Authors: Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
- Presenter: Victor Sanh
- Presentation: Slides
- Paper: Movement Pruning: Adaptive Sparsity by Fine-Tuning
- Authors: Victor Sanh, Thomas Wolf, Alexander M. Rush
- Presenter: Victor Sanh
- Presentation: Slideshare
- Paper: Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs
- Authors: Sachin Kumar, Yulia Tsvetkov
- Presenter: Victor Sanh
- Presentation: Colab notebook
- Topic: Transfer Learning in Natural Language Processing (NLP): Open questions, current trends, limits, and future directions
- Presenter: Thomas Wolf
- Presentation: Video
- Topic: Overview of recent work on: Indexing and Retrieval for Open Domain Question Answering
- Presenter: Yacine Jernite
- Presentation: Slides
- Paper: Scaling Laws for Neural Language Models
- Authors: Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
- Presenter: Teven Le Scao
- Presentation: Google doc paper tutorial
- Paper: Representation Learning with Contrastive Predictive Coding
- Authors: Aaron van den Oord, Yazhe Li, Oriol Vinyals
- Presenter: Patrick von Platen
- Presentation: Slides
- Paper: Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
- Authors: R. Thomas McCoy, Ellie Pavlick, Tal Linzen
- Presenter: Victor Sanh
- Presentation: Slides
- Paper: REALM: Retrieval-Augmented Language Model Pre-Training
- Authors: Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
- Presenter: Joe Davison
- Presentation: Write-up
- Paper: Adaptively Sparse Transformers
- Authors: Gonçalo M. Correia, Vlad Niculae, André F.T. Martins
- Presenter: Sasha Rush
- Presentation: Colab notebook
No discussions are planned at the moment. Check back soon.