Internet Archive Back Online After Hack
This irreplaceable resource needs to be safeguarded
Oct 18, 2024
The Internet Archive, a nonprofit digital library, is back online in read-only mode after becoming the target of a series of cyberattacks over the last few weeks. The Archive hosts free access to huge collections of digitized materials, but is perhaps best known for its “Wayback Machine,” which allows users to view snapshots of websites as they appeared in the past.
These attacks have highlighted the importance of cybersecurity for nonprofits like the Archive, but have also shown just how many of us rely on the Archive for work, school, or our normal online activities.
The Internet Archive is hit by a pair of cyberattacks
On October 9, 2024, visitors to the Internet Archive were met with taunting messages left by attackers, revealing that hackers had breached the site’s systems. The hackers’ note claimed that the information of 31 million users had been stolen. Security experts would later confirm that this information included email addresses, screen names, hashed passwords, and other data.
Importantly, users’ actual passwords should not be directly readable, assuming they were properly hashed before being stored in the database.
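To see why a stolen database does not automatically expose passwords, here is a minimal sketch of salted password hashing. It assumes Python's standard library and uses PBKDF2 as a stand-in; the function names and the iteration count are illustrative and are not taken from the Archive's actual systems.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor (an assumption, not the Archive's real setting).
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)
```

An attacker who steals the salt and digest still has to guess each password and run the slow hash for every guess. That is why strong, unique passwords hold up well after a breach like this one, while short or reused passwords remain at risk.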
Just after the breach, the Internet Archive was hit by a distributed denial-of-service (DDoS) attack, in which huge numbers of bots flood a site with requests, overwhelming its servers and preventing legitimate users from reaching it. This attack temporarily took the site offline. It is currently back online in read-only mode, which allows users to visit the site (though it still seems to be struggling) but not to make changes or contribute new content.
Why the Internet Archive is important
Although not as well-known as other online resources like Wikipedia, the Internet Archive, started in 1996, is unique in its goal of preserving the history of the internet. If you’ve ever been looking up information from a website only to discover that the domain got sold to a different company or that something important got deleted, there’s a good chance that the Internet Archive has a copy of that site from before those changes were made.
These archived copies of websites can be accessed through the site’s “Wayback Machine,” which lets you search through the timeline of the internet. Not every site on the internet is archived, but the collection is surprisingly comprehensive, containing more than 860 billion web pages.
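For readers who want to look up snapshots from a script rather than a browser, the sketch below uses the Wayback Machine's public availability endpoint. It assumes the endpoint URL and JSON response shape as publicly documented at the time of writing; the helper names are made up for illustration, and the service may be slow or unavailable while the Archive recovers.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Public availability endpoint (assumed to match the Archive's current documentation).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url: str, timestamp: Optional[str] = None) -> str:
    """Build the lookup URL; timestamp is a YYYYMMDD-style string, e.g. '20010101'."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Pull the closest snapshot URL out of a decoded response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(url: str, timestamp: Optional[str] = None, timeout: float = 10.0) -> Optional[str]:
    """Query the live service. Requires network access and may fail during outages."""
    with urllib.request.urlopen(build_query(url, timestamp), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("cnn.com", "20010101") should return a link to the snapshot closest to January 1, 2001, if one exists, or None if the page was never archived.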
In addition to archiving web pages, the Internet Archive also contains millions of digitized print materials, videos, software programs, images, and audio files.
While it’s fun to go back and see what CNN or Wired looked like in 2001, these records are essential for researchers, journalists, lawyers, and many other professionals. Lawyers who specialize in trademark and copyright law regularly use records from the Internet Archive to prove when something was first published online. Likewise, journalists often need to rely on archived copies of websites to get information that organizations try to scrub from their online presence.
The Archive is also an important resource for other internet services. Wikipedia references make heavy use of archived websites to preserve the citation even if the original website changes. The Internet Archive also launched an initiative to make references to books on Wikipedia viewable in their collections. Google is even incorporating archived web pages into its search to give users historical context for the links in its results.
Building on a house of cards
A surprising amount of the technological infrastructure that makes our everyday lives possible is built on the volunteer labor of a few dedicated individuals. Multibillion-dollar tech companies often rely on open-source libraries that have been maintained by one guy in upstate New York for twenty years.
The Internet Archive is a much bigger project and has managed to support itself through donations for almost three decades, but it’s still a bit surprising to think that, for much of this information, it’s basically the only record that exists. It’s a bit like if the Library of Congress, instead of being run by the government, were run as a passion project by a group of dedicated librarians. To paraphrase one of the taunts from the hackers’ note, it can feel like the whole Internet Archive “runs on sticks” and is constantly on the verge of running out of funds or being taken out by a catastrophic security breach.
Ironically, the Internet Archive is an especially valuable tool if your website is ever hacked. I’ve worked for organizations that have had their servers hacked and lost information that I was later able to find safely archived in a record on the Internet Archive. The issue, then, is who archives the archive?
Final thoughts
The main takeaway, for a lot of people, is that now you know the Internet Archive exists. Hopefully the site is able to get back on its feet and you can take full advantage of its collections, whether that’s enjoying early 20th-century music recordings or finding your favorite band’s old blog.
More importantly, I hope that these attacks raise awareness about the importance of the Internet Archive and similar projects so that they get the support they need to continue their missions. It’s easy to take the systems that underlie our online activities for granted, but as a lot of people have realized over the last week, we would certainly notice if they were gone.
Author - Peter Christiansen
Peter Christiansen writes about telecom policy, communications infrastructure, satellite internet, and rural connectivity for HighSpeedInternet.com. Peter holds a PhD in communication from the University of Utah and has been working in tech for over 15 years as a computer programmer, game developer, filmmaker, and writer. His writing has been praised by outlets like Wired, Digital Humanities Now, and the New Statesman.
Editor - Kevin Parrish
Kevin Parrish has more than a decade of experience working as a writer, editor, and product tester. He began writing about computer hardware and soon branched out to other devices and services such as networking equipment, phones and tablets, game consoles, and other internet-connected devices. His work has appeared in Tom’s Hardware, Tom’s Guide, Maximum PC, Digital Trends, Android Authority, How-To Geek, Lifewire, and others. At HighSpeedInternet.com, he focuses on network equipment testing and review.