Abstract
Large and anonymous social platforms are a breeding ground for dark jargon and abusive language. A hallmark of the euphemisms and innuendos that abound in such language is non-compositionality. Large-scale pretrained language models, though generally successful on many canonical NLP tasks, perform very poorly in the presence of non-compositionality. In this talk, we discuss robust, largely unsupervised methods to detect and interpret euphemisms in underground forums in a context-specific way. We also discuss robust mechanisms to generate counterspeech, bringing a computational lens to the active bystander effect from social psychology. The key technical innovation is the use of novel language modeling mechanisms that do not explicitly rely on fine-tuning large-scale pretrained language models.
Bio
Suma Bhat is an Assistant Professor in Electrical and Computer Engineering at the University of Illinois Urbana-Champaign. Her research interests include processing idiosyncratic and creative linguistic phenomena so as to make natural language processing more natural.