Parched Internet — Archive

This article was archived to the Wayback Machine at the time of publication. If you are reading this in the future, please consider that our present was just as fleeting as yours.

The Internet Archive has been fighting a high-profile copyright lawsuit brought by book publishers (Hachette v. Internet Archive). A loss there could cripple the organization and set a precedent that makes all archiving legally perilous. The Archive cannot be everywhere at once. But millions of internet users can. Tools like the Wayback Machine browser extension (by the Archive itself) and the self-hosted ArchiveBox allow individuals to save pages on demand. If you see something important (a news article, a government document, a friend’s blog), save it immediately. Do not assume the crawler will find it.
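For readers who would rather script this than click a button, here is a minimal sketch of saving a page on demand. It assumes the Wayback Machine's public "Save Page Now" address (`https://web.archive.org/save/` followed by the page URL) and its availability lookup (`https://archive.org/wayback/available?url=...`). The function names, the `User-Agent` string, and the timeout are illustrative choices, not part of any official client, and the service is rate-limited and may change its behavior.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Public Wayback Machine addresses assumed by this sketch.
SAVE_ENDPOINT = "https://web.archive.org/save/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_url(page_url):
    """Build the Save Page Now URL that asks for a capture of page_url."""
    # Keep URL punctuation intact; only escape characters such as spaces.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%")


def availability_url(page_url):
    """Build the lookup URL that reports the closest existing snapshot."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def request_capture(page_url, timeout=60):
    """Ask the Wayback Machine to capture page_url (performs a network call).

    Returns the HTTP status code. Captures can be slow or throttled,
    so callers should expect timeouts and retry politely.
    """
    req = Request(
        save_url(page_url),
        headers={"User-Agent": "personal-archiver-sketch/0.1"},  # hypothetical name
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


# Building the URLs needs no network access:
print(save_url("https://example.com/post"))
print(availability_url("https://example.com/post"))
```

Only `request_capture` touches the network. The two URL builders can be used on their own, for example to generate a list of links to open by hand.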

This is the story of the Parched Internet Archive: what it means, why it’s happening, and why you should be terrified.

The first delusion of the digital age is that “the cloud” means forever. We post photos to Instagram, compose thoughts on Twitter, and publish research on personal blogs, assuming that these artifacts will exist for our grandchildren to browse. After all, it’s not paper. It doesn’t burn or mold or yellow. It’s data: immortal, weightless, invincible.

The result: thousands of pages, perfectly legal and historically relevant, are being erased from the record because they contain an old phone number or a disputed photograph.

The Parched Internet Archive is not dry because it ran out of money for hard drives. It is dry because the cost of crawling has exploded. To archive a single modern web page, the crawler must download dozens of linked resources: CSS files, fonts, images, videos, tracking pixels, and third-party embeds. Many of these are hosted on different domains (e.g., a page on CNN.com might embed a Twitter widget, a YouTube video, and a Google Font). If any one of those external resources is blocked or changes, the archived page breaks.
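To make that fan-out concrete, here is a small sketch that lists the outside hosts a single page depends on. It is not how the Archive's crawler works. It only scans static HTML for a handful of tag and attribute pairs, so it misses anything loaded by scripts or stylesheets and counts every `<link href>` whether or not it is a stylesheet. The sample markup, the `news.example` address, and the names `ResourceCollector` and `third_party_hosts` are all invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag/attribute pairs that usually mean "one more thing to download".
RESOURCE_ATTRS = {
    ("link", "href"),
    ("script", "src"),
    ("img", "src"),
    ("iframe", "src"),
    ("video", "src"),
    ("source", "src"),
}


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of sub-resources referenced by a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in RESOURCE_ATTRS:
                # Resolve relative paths against the page's own address.
                self.resources.append(urljoin(self.page_url, value))


def third_party_hosts(page_url, html):
    """Return the sorted hosts, other than the page's own, that it pulls from."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    hosts = {urlparse(url).hostname for url in collector.resources}
    return sorted(h for h in hosts if h and h != page_host)


# Invented sample page: two local files and three outside dependencies.
SAMPLE = """
<html><head>
<link rel="stylesheet" href="/static/site.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto">
<script src="https://platform.twitter.com/widgets.js"></script>
</head><body>
<img src="/img/lead.jpg">
<iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>
</body></html>
"""

print(third_party_hosts("https://news.example/story", SAMPLE))
# → ['fonts.googleapis.com', 'platform.twitter.com', 'www.youtube.com']
```

Every host in that list is a separate point of failure: if it blocks the crawler or later changes what it serves, the archived copy of the page is incomplete.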