Crap, I left my $199 yearly subscription info inside my butler’s Lamborghini. Could your personal valet sky-write your login credentials for nature.com above my Tuscan estate? Specifically, above the Eastern alpaca pens—this Murano glass monocle of mine isn’t a bi-focal. Cheers.
And don't worry, the people that did the research and wrote the article, and the person that reviewed the article aren't going to see a single cent of it.
This seems to be based on a racist assumption. Why is speaking improper English labelled as "African American english"?. I would want to see the LLM assumptions also for southern drawl and for general incorrectly spelled / grammared speech, to compare to the assumptions made for the African American english version.
Speaking with slang / incorrect grammar is of course, in general, inversely correlated with education level and/or preference for shorthand forms of speech over writing/speaking the full grammatically correct form. The LLM is saying speaking in slang = stupid/lazy.
The researcher is labelling slang as specifically African American speak, therefore interpreting the LLM response as assuming African Americans are stupid/lazy.
This [the article?] seems to be based on a racist assumption.
No, it isn't based on an assumption. The written features that were analysed are associated with AAE. From the article:
use of invariant ‘be’ for habitual aspect;
use of ‘finna’ as a marker of the immediate future;
use of (unstressed) ‘been’ for SAE [standard American English] ‘has been’ or ‘have been’ (present perfects);
absence of the copula ‘is’ and ‘are’ for present-tense verbs;
use of ‘ain’t’ as a general preverbal negator;
orthographic realization of word-final ‘ing’ as ‘in’;
use of invariant ‘stay’ for intensified habitual aspect; and
absence of inflection in the third-person singular present tense.
Why is speaking improper English labelled as “African American english”?.
Flip the question - why are those features associated with AAE labelled "improper English"?
I would want to see the LLM assumptions also for southern drawl and for general incorrectly spelled / grammared speech
The article tackles this: "Furthermore, we present experiments involving texts in other dialects (such as Appalachian English) as well as noisy texts, showing that these stereotypes cannot be adequately explained as either a general dismissive attitude towards text written in a dialect or as a general dismissive attitude towards deviations from SAE"
I always love when cough "educated" people (usually just what they like to say when they mean "not black") go on about how "black people don't speak proper English!" because certain vowels can be dropped here or there, grammar shifts, the works. Most of us have heard AAE (also maybe heard it called "Ebonics" if you're a little older) at one point or another, and likely don't have an issue understanding what anyone is saying. A few things that skew more metaphorical or slang words might slip by but you get the gist.
That's the point of language. Convey information. If the information is conveyed, then language has done its job. Yay language.
If anyone wants to continue saying "it's not PROPER English" well... I have bad news for you. Neither is any other modern form of English. So many words have been borrowed, or stolen, sentence structures have changed, entire words change meaning. And that's just in the last 100 years.
English is an amalgamation of many different root languages, and has so many borrowed words and phrases, along with nearly every other modern language, can any of them still be said to be "proper"?
When I think of the difference between "proper English" and "improper English" I'm reminded of My Fair Lady. "The rain in Spain stays mainly on the plain" Eliza vs Henry Higgins (or 'enry 'iggins if you're feeling improper)
Really good reply, thanks for the effort you put in. Its good to see they did compare with other dialects. It's interesting that the same bias was not seen.
I would still disagree with the statement that AAE could be considered equally proper to textbook, grammatically correct according to the Oxford English dictionary (or the American equivalent). A dialect by definition is an adaptation of the language from the standard 'proper' grammatical rules.
I don't know or hang around with many black people, but I do hear all of the stuff pointed out here on the regular any time I see a group of rednecks at the local farm supply.
Plus, internet meme culture has vastly changed the language landscape where, for example, phrases like "you don't think it be like it is, but it do" are used by people from all walks of life.
Why is speaking improper English labelled as “African American english”?.
Oh no, you're in the picture. It's a real dialect, just as valid as what they speak on the BBC, which I'm guessing is itself different from how you speak.
To be clear, I don't think you meant to be unkind here. I'm not trying to make you feel bad.
Hundreds of millions of people now interact with language models, with uses ranging from help with writing1,2 to informing hiring decisions3. However, these language models are known to perpetuate systematic racial prejudices, making their judgements biased in problematic ways about groups such as African Americans4,5,6,7. Although previous research has focused on overt racism in language models, social scientists have argued that racism with a more subtle character has developed over time, particularly in the United States after the civil rights movement8,9. It is unknown whether this covert racism manifests in language models. Here, we demonstrate that language models embody covert racism in the form of dialect prejudice, exhibiting raciolinguistic stereotypes about speakers of African American English (AAE) that are more negative than any human stereotypes about African Americans ever experimentally recorded. By contrast, the language models’ overt stereotypes about African Americans are more positive. Dialect prejudice has the potential for harmful consequences: language models are more likely to suggest that speakers of AAE be assigned less-prestigious jobs, be convicted of crimes and be sentenced to death. Finally, we show that current practices of alleviating racial bias in language models, such as human preference alignment, exacerbate the discrepancy between covert and overt stereotypes, by superficially obscuring the racism that language models maintain on a deeper level. Our findings have far-reaching implications for the fair and safe use of language technology.
Although nonstandard English and pidgins often demonstrate the same level of nuance and complexity as standard English, it's very common for there to be negative stereotypes. One has to wonder whether the LLMs generated from (stolen en masse) written output say as much about us as they do about their creators.
An LLM needs to evaluate and modify the preliminary output before actually sending it. In the context of a human mind that’s called thinking before opening your mouth.
Yeah it turns out when your entire tech industry is dominated by cishet white techbros and the entire foundation of their education and the production of such models is based on that then you get racist as fuck outcomes from any given algorithm that is a product of that same set of normative standards.
If you have the time I highly recommend reading Palo Alto by Malcolm Harris, it's a great primer on how all this shit got started and why we should frankly just burn Silicon Valley to the ground.