Abstract:
The methodology of pretraining large language models and then fine-tuning them on annotated examples has greatly advanced NLP. However, these advances come with rising computational costs at both training and inference time, growing data collection and annotation challenges, and a need for engineering expertise, all of which stand between state-of-the-art NLP models and real-world tasks. In this talk, we discuss recent steps towards democratizing NLP technology and making it applicable to low-resource settings. We cover new methods for efficient pretraining, trade-offs between annotating more data and scaling up models, and how to simulate target-task data from unlabeled text. Finally, we discuss a new paradigm that aims to let lay users define their own tasks on the fly, in natural language, without the intervention of engineers or data scientists.
https://technion.zoom.us/j/94950420992