I mean... they didn't specify it had to be random (or even uniform)? But yeah, it's a good showcase of how GPT acquired the same biases as people, from people.
Reminds me of my previous job where our LLM was grading things too high. The AI "engineer" adjusted the prompt to tell the LLM that the average output should be 3. I had a hard time explaining that this wouldn't do anything at all, because all the chats were independent events, so the model never saw the other grades it had given.
Anyways, I quit that place and the project completely derailed.
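
For anyone who wants the independence point spelled out, here's a rough toy sketch. Nothing in it is from the actual project: `grade` is a made-up stand-in for a single LLM chat, the one-point upward bias is an assumption, and `recenter` is just one simple way to hit a target mean. The point is only that an average is a property of the whole batch, and the batch is something no single call can see.

```python
import random
from statistics import mean


def grade(item_quality: float, rng: random.Random) -> int:
    """Hypothetical stand-in for one independent LLM chat.

    It sees only the current item and has no memory of any other grade.
    """
    # Toy assumption: the grader is biased upward by about one point.
    raw = item_quality + 1.0 + rng.gauss(0, 0.5)
    # Clamp to a 1-5 scale.
    return max(1, min(5, round(raw)))


def recenter(scores, target=3.0):
    """Shift a whole batch so that its mean equals the target.

    This needs every score at once, which a single chat never has.
    """
    offset = target - mean(scores)
    return [s + offset for s in scores]


rng = random.Random(0)
qualities = [rng.uniform(1, 5) for _ in range(1000)]
scores = [grade(q, rng) for q in qualities]  # 1000 independent "chats"
adjusted = recenter(scores)                  # batch-level correction

print(f"raw mean:      {mean(scores):.2f}")
print(f"adjusted mean: {mean(adjusted):.2f}")
```

The correction happens outside the grader, after all the grades are in. A real fix would probably be more careful than a flat shift (a rubric with anchored examples, say), but either way the averaging has to live somewhere that can see more than one chat.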