The Internet Archive
Presented by Alex Craig
Brewster Kahle
• Graduated from MIT in 1982 with a degree in Computer Science and Engineering
• Studied Artificial Intelligence
• Helped found Thinking Machines Corporation, a manufacturer of supercomputers using parallel processing
WAIS
• Wide Area Information Server
• Developed in 1988
• Early internet search software
• Offered searching of the contents and databases of computer servers on the internet
• Sold to AOL in 1995
Alexa Internet and the Internet Archive
• In 1996, following the sale of WAIS to AOL, Kahle and his WAIS partner Bruce Gilliat founded both the non-profit Internet Archive and the for-profit Alexa Internet simultaneously
Alexa Internet
• For-profit Internet toolbar
• Tracks user browsing information to aid future internet searches
• The toolbar makes an archive of each website as it is “crawled”; the archive is then donated to the Internet Archive
• Sold to Amazon in 1999
The Internet Archive: The Wayback Machine
• Archives “snapshots” of the Internet to create an “internet library”
• Originally received copies mainly from the Alexa Internet service, now includes other sources of “donations”
• Allows users to see archived versions of websites as they appeared in the past
• Because the average lifetime of a website is 100 days, the snapshot is retaken every two months.
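As a rough illustration of how a user might look up a past version of a website, the sketch below builds a Wayback Machine address by hand. It assumes the public address pattern of web.archive.org/web/ followed by a 14-digit timestamp (YYYYMMDDhhmmss) and the original address; the function name wayback_url and the example site are hypothetical and not part of the presentation.

```python
from datetime import datetime

# Assumed public address prefix of the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(site: str, when: datetime) -> str:
    """Build the address for the archived copy of `site` around `when`."""
    # Wayback timestamps are written as YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


if __name__ == "__main__":
    # Hypothetical example: a site as it looked at the start of 2005.
    print(wayback_url("http://example.com/", datetime(2005, 1, 1)))
```

The requested time does not need to match a snapshot exactly; the Wayback Machine serves the capture closest to the timestamp given.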
The Wayback Machine
• Today, a single copy of everything that's on the Net, equal to 15,000 copies of the Encyclopedia Britannica, is added every 2 months
• The National Science Foundation, Library of Congress, Markle Foundation, Compaq, and Alexa all donate money, software, and equipment to keep the Internet Archive up and running.
Internet Archive: Other Tools
• Open Library: Searchable database for books
• Archive-It: Fee-based subscription service that allows members to permanently archive their data
• Media Collections: Moving Image, Audio, Live Music, and Text
Internet Archive as Library
• Made an official library of the state of California in 2007
• The Archive is now mirrored at the Bibliotheca Alexandrina in Egypt; this is the only external backup of the archive
Why is the Internet Archive Important?
• Preservation of the past
• “Digitized information, especially on the Internet, has such rapid turnover these days that total loss is the norm. Civilization is developing severe amnesia as a result…The Internet Archive is the beginning of a cure - the beginning of complete, detailed, accessible, searchable memory for society, and not just scholars this time, but everyone.” - Stewart Brand (founder of The Long Now Foundation)
Mission Statement
• “Libraries exist to preserve society's cultural artifacts and to provide access to them. If libraries are to continue to foster education and scholarship in this era of digital technology, it's essential for them to extend those functions into the digital world… without cultural artifacts, civilization has no memory and no mechanism to learn from its successes and failures. And paradoxically, with the explosion of the Internet, we live in what Danny Hillis has referred to as our “digital dark age”. The Internet Archive is working to prevent the Internet - a new medium with major historical significance - and other "born-digital" materials from disappearing into the past…we are working to preserve a record for generations to come.” (from archive.org)
Controversies
• Suzanne Shell - 2005, demanded $100,000 from the Archive for archiving her website profane-justice.org. The two sides later settled, with the Archive offering this statement: “Internet Archive has no interest in including materials in the Wayback Machine of persons who do not wish to have their Web content archived”
• 2005 - Healthcare Advocates, Inc. - Attempted to sue the Internet Archive for violating the Digital Millennium Copyright Act.
• Settled out of court
DMCA Exemption
• 2006 ruling by the Librarian of Congress grants an exemption for "computer programs and video games distributed in formats that have become obsolete and that require the original media or hardware as a condition of access, when circumvention is accomplished for the purpose of preservation or archival reproduction of published digital works by a library or archive".
• "The Net is the No. 1 resource for people…this is how students learn, it's how business is done. If we don't have a memory, we're living in an Orwellian world of our own making." - Brewster Kahle