The Internet Archive and its 916 billion saved web pages are back online

Wayback Machine back in read-only mode after DDoS, may need further maintenance.
Maybe it’s time to federate the IA.
This is one of the rare cases where a blockchain would actually be useful: a federated internet archive that uses a blockchain to validate that the saved data has not been altered by a malicious actor trying to tamper with proofs.
That would be really cool but horribly inefficient because of the sheer amount of storage required.
The core feature of all blockchain tech.
I mean, you don’t need the blockchain for that, the same way that distro mirrors don’t need the blockchain. It can be federated, with each upload verified through hashes to confirm that it is in fact the real upload. I would argue that something like a blockchain would take authority away from the archive and hand it to any bad actor who spins up enough servers to poison the chain, claiming authority just because they had the computing power.
Isn't this what IPFS is?
The thing is, sometimes articles must be removed from the IA (for copyright reasons, which I disagree with, or when leaked information could threaten lives). With a blockchain this would be impossible.
We don't need a blockchain for that.
Having multiple servers that store file checksums would have much less overhead and would be easily repeatable and appendable, with no need for unnecessary computational labor. Linux Mint currently uses checksums to verify that a downloaded ISO has not been altered in any way, and the same process works for any file (preferably not humongous ones).
Strive for K.I.S.S. whenever possible.
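For what it's worth, the checksum approach can be sketched in a few lines of Python. This is only a rough illustration, not how the IA or Linux Mint actually does it: `sha256_of_file`, `verify_against_mirrors`, and the `quorum` parameter are made-up names, and the mirror digests are assumed to have already been fetched from independent servers.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_mirrors(local_digest, mirror_digests, quorum=2):
    """Accept the file only if at least `quorum` mirrors report the same digest.

    `mirror_digests` is a list of hex digests published by independent servers
    (hypothetical input; fetching them is outside the scope of this sketch).
    """
    local = local_digest.strip().lower()
    matches = sum(1 for d in mirror_digests if d.strip().lower() == local)
    return matches >= quorum
```

A client would hash its downloaded copy with `sha256_of_file` and pass the result, along with the digests the mirrors publish, to `verify_against_mirrors`. Requiring agreement from more than one server means a single compromised mirror cannot vouch for a tampered file on its own.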
You'd need 51% of the nodes to be good just to ensure that, making it even more wasteful.
I don't know if that's a good idea.
How would you go about implementing the infrastructure for that?
That’s an excellent question. Unfortunately I do not have an answer. But I believe it’s worth discussing some means of redundancy for the IA; even if it’s as simple as rsync to other hosts.
They’ve been using Filecoin.
The self-hostable search engine YaCy kind of has this feature and architecture, by way of DHT-based search between peers combined with local page caching. The caching feature is something a node operator needs to enable manually, though.