Reading Reflection 4

The authors of this paper conducted an insightful and comprehensive analysis of the costs, risks, and potential harms involved in the pursuit of ever-larger language models. They identified three categories of costs: environmental costs that are most likely to fall on marginalized populations; financial costs that “in turn erect barriers to entry”; and opportunity costs incurred when researchers direct their efforts away from potentially more effective and less resource-intensive solutions. They also delineated how biases and problematic values can be embedded in the training data in ways that enlarging the datasets cannot counter, and how these biases and values pose risks of substantial harm.

The authors also provided concrete recommendations for better research practices and directions. They urged researchers to “consider the financial and environmental costs of model development up front”; to curate datasets carefully, providing thorough documentation and noting potential users and stakeholders; and to focus energy on understanding how LMs achieve tasks and how they form part of socio-technical systems.

Overall, I think the authors did a great job illustrating their arguments and provided much food for thought. Some questions I have in mind include: are “stochastic parrots” inherently problematic? And how can NLU be achieved without incurring the same costs, risks, and harms described in this paper?

In Section 6.1, the authors argued that it is problematic that LMs produce an illusion of coherence without any actual meaning behind it. I find that argument somewhat weak. I believe that in any kind of communication between humans, our interpretation of the meaning conveyed by the other party is always based on our subjective understanding of their communicative context, rather than on their own subjective reality of that context. Thus, the fact that “the comprehension of the implicit meaning is an illusion arising from our singular human understanding of language” seems true for both human-human communication and human-AI communication.

Therefore, I believe the fact that there is no actual meaning behind LM-generated text isn’t really a problem in and of itself. Rather, the problem lies in presentation, as when LM-generated text is displayed without any labeling or context and thus leads the audience to believe that it was written by a human being with certain values and motives. Take the example of chatbots: when bots such as Microsoft’s Tay spout hateful language, people are aware to an extent that the bot doesn’t know what it’s talking about and is simply parroting the language of hateful human beings. In contrast, when bot accounts on social media spread auto-generated extremist propaganda, they can entice people to engage with them as if they were actual human beings, and thus cause substantial harm.

Furthermore, as someone unfamiliar with this field, I am curious as to what “natural language understanding” means for machines. To begin with, are there already algorithms and data structures available to encode “meanings”? If so, how do they operate? Then there is the task of interpreting the “meanings” of human languages, which are complex, layered, and contextual. One sentence can convey completely different meanings when uttered in different ways, in different circumstances, and by different people. It almost seems impossible for machines to capture these meanings without simplifying or distorting them. And even if this is possible, how do we ensure that the risks of bias and harm present in current LMs are mitigated in these models? And how can we ensure the models work across cultures, and especially for marginalized groups? Some “meanings” are unique to certain cultures, and when even linguists and anthropologists don’t have a clear understanding of them, how can we build computational models that do them justice?