What's the worst way to use statistics?
Probably to support racism. Like the black people crime statistics.
Yeah. Eugenics. It's convinced a lot of smart people.
Averages. They're almost always a bullshit flag if it's tied to anything remotely political. If you're not going to also give the standard deviation and skew then at least use median.
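To make that concrete, here is a minimal sketch with made-up numbers (the `incomes` list is purely hypothetical, not real data). It shows how one outlier can drag the mean far from what a typical person looks like, while the median stays put and the standard deviation gives the game away.

```python
import statistics

# Hypothetical incomes: nine modest earners and one very high earner.
incomes = [30_000] * 9 + [1_000_000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # what the typical person earns
stdev_income = statistics.stdev(incomes)    # sample standard deviation

print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
print(f"stdev:  {stdev_income:,.0f}")
```

Here the "average income" is more than four times what nine out of ten people actually earn, and the spread is bigger than the mean itself, which is a strong hint the mean alone is not telling you much.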
To cherry pick it and use it to promote fascist views
By using unrelated data to prove a point.
Or misrepresenting data.
For example, say your country has a 10% crime rate, meaning 10% of the population will commit a crime at some point. Due to worker immigration the country gains 20% more people. Then it is expected that about 10% of those workers will also commit a crime. That increases the total number of crimes committed in the country, but the crime rate is still 10%.
Now misrepresenting would be to cry out that the workers are bad because the number of crimes has gone up.
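The arithmetic can be sketched in a few lines. The starting population of 1,000,000 is an assumption picked only to make the numbers round; the 10% and 20% figures are the ones from the example.

```python
# Assumed starting population (hypothetical, chosen for round numbers).
population_before = 1_000_000
offender_percent = 10   # 10% of people commit a crime at some point
growth_percent = 20     # population grows by 20% through immigration

newcomers = population_before * growth_percent // 100
population_after = population_before + newcomers

# Same 10% share applied to everyone, old residents and newcomers alike.
count_before = population_before * offender_percent // 100
count_after = population_after * offender_percent // 100

rate_before = count_before / population_before
rate_after = count_after / population_after

print(f"offenders: {count_before:,} -> {count_after:,}")
print(f"rate:      {rate_before:.0%} -> {rate_after:.0%}")
```

The raw count goes up by 20,000 while the rate does not move at all, so quoting only the count tells you about population size, not about the newcomers.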
...as a drunken man uses lamp posts — for support rather than illumination.
The question makes me remember Daryl Bem, a celebrated social psychologist. He published a much cited article called "Writing the Empirical Journal Article". About 15 years ago, he used this advice to prove that humans can see into the future. His advice is probably still used to teach. That's probably the worst thing you can do.
correlation and causation. even useless stats comparing apples and oranges, the numbers generated are only as good as the study design and methods.
Blindly. People love to list them as evidence as if the numbers stand on their own. Reality is a person had some hand in assembling the numbers and there is no such thing as a bulletproof statistic. Good statistics ought to be scrutinized.
As a math guy, I hate when people say statistics is math. Like yeah, there are equations, and math plays a role, but the results so often speak more to the selection and interpretation choices made by the statistician than to any kind of mathematical rigor.
By training an algorithm that will have an impact on said statistics. Not only can the algorithm cheat (see Goodhart's law), but it can also repeat the biases that led to these statistics (like those law enforcement algorithms that became racist).
You are describing Google Ads right now. Algorithms are getting better and better at reaching people who are already on the purchase path. It's like giving restaurant flyers to people who are waiting for a waiter to show them to a table.
Aren't our ads amazing? Look, almost everyone who saw them made the purchase!
Analytics that ignore Goodhart's law ruin everything. Movies, HR, marketing (not much left to ruin, but you get the point), performance reviews, recommendations...
Well, to immanentize the eschaton. That's the worst thing to do with statistics.
When you mix statistics with marketing.
“Facts are stubborn things, but statistics are pliable.” ― Mark Twain
Not making sure the result even makes sense. There was a real example, where a ~2010 news article said that the number of crimes in their city has been doubling every year since ~1980.
That is not possible. Assume that there was one crime in 1980. After 30 years of doubling, there would have to be at least 2^30 crimes in 2010, which is over a billion.
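A quick sanity check of that claim, as a sketch: start from a single crime in 1980 (the most generous assumption possible) and double it once per year until 2010.

```python
start_year, end_year = 1980, 2010

crimes = 1  # most generous assumption: a single crime in the first year
for _ in range(start_year, end_year):
    crimes *= 2  # "doubling every year"

doublings = end_year - start_year
print(f"{doublings} doublings -> {crimes:,} crimes in {end_year}")
```

That is 30 doublings, so even from one crime you end up above a billion crimes a year in a single city, which no city could plausibly produce.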
I once saw a reddit post where some busybody counted how many people with dogs walked by in an hour and multiplied that by 24 and assumed that was how many walked by in a day (as if it would be the same amount at all times of day)
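A small sketch of why that goes wrong. The hourly counts below are invented for illustration (nobody in that post measured them); the only point is that foot traffic is not flat across the day.

```python
# Hypothetical dog walkers per hour, midnight to midnight (made-up numbers).
hourly_counts = (
    [0] * 6     # 00:00-06:00, almost nobody out
    + [10] * 3  # 06:00-09:00, morning walks
    + [5] * 7   # 09:00-16:00, quiet daytime
    + [30] * 4  # 16:00-20:00, after-work peak
    + [8] * 2   # 20:00-22:00
    + [1] * 2   # 22:00-24:00
)

observed_hour = 17  # the one hour someone happened to count
naive_estimate = hourly_counts[observed_hour] * 24
actual_total = sum(hourly_counts)

print(f"naive estimate: {naive_estimate}")
print(f"actual total:   {actual_total}")
```

With these numbers, counting during the after-work peak and multiplying by 24 overshoots the real daily total by more than three times. Count at 3 a.m. instead and you would conclude nobody owns a dog.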
On people who don't understand them, to paint an incomplete picture of reality. Misleadingly.
Truncated graphs. I hate them but they are so often (ab)used, even in professional situations.
Eugenics
White people are shot more by cops proving there is no police brutality against people of color.
(No, this isn't my actual opinion. This is an argument that racists and white supremacists use.)
Lying
Lies, damned lies, and statistics
More men are arrested for crime than women, proving that cops are sexist.
Mixing up correlation with causation. A while back I was having a discussion here on Lemmy because people were saying pitbulls are dangerous and pointing to the disproportionate number of deaths caused by pitbulls vs. the percentage of dogs that are pitbulls. The argument goes something like this: "Pitbulls are responsible for 55% of killings, but they're only 12% of all dogs, therefore pitbulls are dangerous".
Oh, and BTW if you agreed with that argument above, congratulations, you're officially a racist, because those are the numbers of murder convictions and demographics for Black people in the USA. The argument is the same, and the reason why it's flawed is the same: correlation does not imply causation. Just because something seems disproportionate out of context doesn't mean it has the most obvious cause. In both cases the reasons are much more complex and mostly have to do with education and opportunity (or lack thereof).