9 months ago

The Internet Archive and its 916 billion saved web pages are back online

arstechnica.com

Wayback Machine back in read-only mode after DDoS, may need further maintenance.

[MIGRATED TO DIFFERENT INSTANCE CHECK PIN POST] Internet is Beautiful @lemm.ee

ekZepp @lemmy.world

arstechnica.com/tech-policy/2024/10/the-internet-archive-and-its-916-billion-saved-webpages-are-back-online/

60 comments

Maybe it’s time to federate the IA.
- One of the rare use cases of a blockchain actually being useful. A federated internet archive that uses a blockchain to validate that the saved data has not been altered by a malicious actor trying to tamper with proofs.
  That would be really cool but horribly inefficient because of the sheer amount of storage required.
  
  > horribly inefficient
  The core feature of all blockchain tech.
  
  I mean, you don’t need a blockchain for that, the same way that distro mirrors don’t need one. It can be federated, with each upload verified through hashes to confirm that it is in fact the real upload. I would argue that something like a blockchain would take that authority away from the archive and hand it to any bad actor who spins up enough servers to poison the chain, claiming authority just because they had the computing power.
  
  Isn't this what IPFS is?
  
  The thing is, sometimes articles must be removed from the IA, either for copyright reasons (I disagree with that one) or when leaked information could threaten lives. With a blockchain this would be impossible.
  
  We don't need a blockchain for that.
  Having multiple servers that store file checksums would have much less overhead and would be easily repeatable and appendable, with no need for unnecessary computational labor. Linux Mint currently uses the checksum process to verify that a downloaded ISO has not been altered in any way, and it can work for any file (preferably not humongous files).
  Strive for K.I.S.S. whenever possible.
  
  You need a useless 51% of good nodes to assure that, making it even more wasteful.
- I don't know if that's a good idea.
  How would you go about implementing the infrastructure for that?
  
  That’s an excellent question. Unfortunately I do not have an answer. But I believe it’s worth discussing some means of redundancy for the IA; even if it’s as simple as rsync to other hosts.
- They’ve been using Filecoin
- YaCy self-hostable search engine kind of has this feature and architecture by way of a DHT inter-peer search, in combination with local page caching. Although the caching feature is something that a node operator needs to manually enable.
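The checksum idea raised in the replies above (several servers publishing file hashes, as with Linux Mint ISO verification) can be sketched roughly as follows. This is only a minimal illustration, not how the Internet Archive or any mirror network actually works: the mirror names, the `verify_against_mirrors` helper, and the simple "at least `quorum` mirrors agree" rule are all assumptions made up for the example.

```python
import hashlib
from typing import Mapping


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_against_mirrors(
    data: bytes,
    mirror_checksums: Mapping[str, str],  # hypothetical: mirror name -> published digest
    quorum: int = 2,                      # hypothetical agreement threshold
) -> bool:
    """Accept our copy only if at least `quorum` mirrors publish the same digest."""
    local_digest = sha256_of(data)
    matches = sum(1 for digest in mirror_checksums.values() if digest == local_digest)
    return matches >= quorum


# Usage: two honest mirrors and one that publishes a digest for altered content.
snapshot = b"<html>archived page</html>"
published = {
    "mirror-a.example": sha256_of(snapshot),
    "mirror-b.example": sha256_of(snapshot),
    "mirror-c.example": sha256_of(b"<html>altered page</html>"),
}
print(verify_against_mirrors(snapshot, published))
```

Unlike a blockchain, nothing here prevents a mirror from later deleting a file and its checksum, which is relevant to the takedown concern mentioned in the thread.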
A commenter on Ars suggested donating, so I did. You can too with this link! https://www.paypal.com/paypalme/internetarchive
- I verified this is indeed the method listed on the Internet Archive website.
  
  Nice. Wouldn’t want money going to |nternetArchive!
- I need to do this again. I donated last year, but it's one of my favorite sites and a pretty important one.
  
  Recurring!
- 🫡
It's worth noting that the saved pages are the only thing that is back for now. Their other services have not yet been brought back online.
This absolutely made my morning.
Edit: Never mind, already knew about the Wayback Machine. I thought it was the rest of the archive.
Still good news.
Such good news!
Okay, which one is missing?
- follow the ~~money~~chemtrail
I realize it's like the least important aspect of this, but yay! My podcast is back! I listen to Lawrence Manzo's Mahabharata podcast every night to go to sleep, and I haven't slept well since the attack.
- If you rely on it that much, maybe it's time to download it all and keep it.
  
  I honestly don't know how I'd get it until it comes back. I can download through the podcast app, but until then, to my knowledge, it's completely lost anywhere other than archive.org. Even the original blog it was posted to back in 2010 doesn't have the audio files anymore, just links to archive.org.
- This is why it breaks. It's not a streaming CDN. Do you torrent over Tor as well?
  
  I use a podcast app, and apparently it pulls from there. I never knew before it went down. But I tried a bunch of different apps over the course of this, and they all pull from that.
  
  Maybe if you're rude to more people you can fix everything.
Only the Wayback Machine has been restored for me, not the rest of archive.org.
