Data centers contain 90% crap data

1980s-2000s: the information age.
2000s-present: the data age.
Information implies it's correct; data implies it can be anything, true or false.
The aughts weren't bad, but things were declining, and once we got into the teens, ugh. Oh, and an old-man thing: the pre-WWW internet was advertisement-free, which was awesome.
Sure. The cutoff can be somewhere around there, and the start can be earlier too.
Checks out, at least in my case.
I self-host my email and pretty much every other cloud service I'd otherwise be using. My Gmail account is literally a spam catcher address, so everything there and elsewhere I haven't already deleted is 100% crap.
You'll pry my kitten pictures from my cold dead hands!
Yes, but 90% of everything is crap. Why should we expect data centers to be any different?
Solutions?
sudo rm -rf /data
I’m imagining Data from Star Trek being deleted…
Captain, this is most illogical.
That depends on the problem.
I disagree w/ the author that storing blurry cat memes is what's "destroying our environment." Transportation is our biggest net polluter in terms of CO2, which is higher than all electrical generation combined. If we want to solve CO2 emissions, we have to solve transportation, since that's the 500-pound gorilla in the room.
If we look specifically at datacenters, storage makes up a tiny fraction of the overall energy use. That article mentions that datacenters probably have a CO2 footprint similar to that of the aviation industry, which makes up about 2.5% of the world's carbon emissions, or about 10% of the total transportation emissions from the above link.
If the goal is to fix climate change, data centers are pretty far down the list in terms of priorities. Higher priorities are, roughly in this order:
Changing anything about data centers is way down the list of priorities, and it'll be largely solved by something much higher up. So it's really the wrong target to attack.
Solutions?
Carbon tax.
In this micro example, imagine if you could access all of your data for free when there was abundant sunshine (carbon-free), but had to pay for carbon-based energy at night. You'd start to sort your data for what you really wanted so that you'd only be paying a small amount for a small amount of data.
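To put rough numbers on that, here's a quick back-of-the-envelope sketch. Everything in it is made up for illustration: the `night_cost` helper, the per-GB rate, the 12 hours of darkness, and the data sizes are all assumptions, not real pricing from any provider.

```python
def night_cost(gb_online_at_night, rate_per_gb_hour, night_hours=12, days=30):
    """Monthly cost of keeping data reachable during carbon-powered hours.

    Daytime (solar) access is treated as free, so only the night hours
    are billed. All parameters are hypothetical.
    """
    return gb_online_at_night * rate_per_gb_hour * night_hours * days


RATE = 0.0001  # hypothetical carbon surcharge per GB per night hour

# Keep the whole 1000 GB hoard reachable around the clock...
everything = night_cost(1000, RATE)

# ...or only the 50 GB you actually care about; the rest waits for sunrise.
curated = night_cost(50, RATE)

print(f"everything online at night: {everything:.2f} per month")
print(f"curated subset at night:    {curated:.2f} per month")
```

The cost scales linearly with how much you insist on reaching at night, which is the incentive to sort your data that the carbon tax is supposed to create.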
I don't see one unless our society becomes less dependent on bullshit and honors privacy. I don't know about anyone else, but I constantly bullshit specifics about myself online to dirty up any data collected on me.
We fully transition to clean energy like nuclear and build more power plants to allow us to store our online stuff.
The author of this article is not a serious person. He's in the same bucket as Greta Thunberg. They just like to scream and blame people instead of providing practical solutions. It's frankly tiring to hear them despite their honorable intentions.
Thunberg's solution has always been "listen to the experts who have been screaming at you for 50 years." You don't have to be an expert to care about things or to want to listen to people who are experts.
"He's in the same bucket as Greta Thunberg. They just like to scream and blame people instead of providing practical solutions."
Greta Thunberg is 22 years old right now, and was "screaming" and "blaming people" when she was 11 years old.
She saw the world she was going to inherit and forced a conversation to work toward solutions. Expecting an 11-year-old to provide answers that none of the established world has is silly.
Massive deduplication across all accounts on all servers of image, audio, and video data would theoretically be possible, but ain't gonna happen. Or we could just discourage people from posting cat videos and bad memes (even less likely to happen).
I would argue that duplication of content is a feature, not a bug. It adds resilience, and is explicitly built into systems like CDNs, git, and blockchain (yes I know, blockchains suck at being useful, but nevertheless the point is that duplication of data is intentional and serves a purpose).
Deduplication is trivial when applied at the block level, as long as the data is not encrypted, or is encrypted at rest by the storage system.
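For anyone curious what block-level dedup looks like, here's a minimal sketch. It assumes fixed 4 KiB blocks and SHA-256 as the block fingerprint; the `dedup_store` and `restore` names and the file names are hypothetical, and real storage systems are a lot more involved than this.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size in bytes


def dedup_store(files, block_size=BLOCK_SIZE):
    """Split each file into blocks and keep every distinct block only once."""
    blocks = {}   # SHA-256 hex digest -> block bytes (stored once)
    recipes = {}  # file name -> ordered list of digests to rebuild it
    for name, data in files.items():
        digests = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            blocks.setdefault(digest, block)  # no-op if block already stored
            digests.append(digest)
        recipes[name] = digests
    return blocks, recipes


def restore(name, blocks, recipes):
    """Rebuild a file by concatenating its blocks in order."""
    return b"".join(blocks[d] for d in recipes[name])


# Four distinct 4 KiB blocks standing in for a cat picture.
cat_meme = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))

files = {
    "alice/cat.jpg": cat_meme,
    "bob/cat_repost.jpg": cat_meme,  # exact duplicate
    "carol/cat_edit.jpg": cat_meme[:2 * BLOCK_SIZE] + bytes([9]) * BLOCK_SIZE,
}

blocks, recipes = dedup_store(files)
logical_bytes = sum(len(data) for data in files.values())
stored_bytes = sum(len(block) for block in blocks.values())
print(f"logical: {logical_bytes} bytes, stored: {stored_bytes} bytes")
```

The encryption caveat falls out of this directly: if each user encrypts with their own key before upload, identical pictures turn into different ciphertext, the block hashes no longer match, and nothing gets deduplicated.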
Sturgeon's Law in action again.