Mode is a kind of average. I infer that you mean "mean" when you say "average" here.
The mean takes into account outliers in a way that the mode doesn't.
The joke about the average number of legs among humans being less than 2 describes a situation where mode provides more meaning than mean. In the case of scattered values, mode makes less sense, such as the average net worth of the people in a country.
I don't know why the mean is the "default" average. In many situations, the mode or median makes more sense.
Mean is the default "average" because it's easy to calculate and were taught it first in school. Like, years before median or mode. That's it, that's the whole reason.
such as the average net worth of the people in a country.
And the mean average makes no sense here either, which is why incomes and wealth are almost always quoted as medians instead of means, unless they're totals which is a roundabout way of reporting the mean (if you know the population baseline).
The more unequal a country, the larger the difference between mean and median incomes.
Now I understand better how you're thinking. Indeed, the notion of "what the average person has" is answered better by the median, but the notion of "What's most typical" is answered by the mode.
Actually I think that the notion ‘what the average person has’ is bettered answered by mode, I feel like mean is better for kind of plotting a data or points to find a ‘trend’ or something like that, I am hella confused now actually
The mode can't hope to answer how much money the average person has, because there are far too many possible values.
The mean answers how much money people have on average, but the outliers exert too much influence to answer how much money the average person has.
The median moderates the influence of both the very rich and the very poor, so it better approximates the amount of money that those in the middle of the population have, which is what 'the average person" tends to be.
For populations where the number of possible values is much lower, the mode and the median tend to be closer to together. Emphasis on "tend".