Web Archives

What are web archives?

Web pages, in case you haven’t noticed, change a lot. Researchers have estimated that 25% of web links in scholarly papers no longer work. This is a problem for people who want to study the culture of the past 20 years, or even of the last year. Controversies come and go at “internet time” and vanish into the ether unless someone archives them.

The Internet Archive was started by Brewster Kahle in the 1990s to save as many web pages as possible. Today its mission has expanded to becoming the universal library for human knowledge. To that end, it preserves and digitizes all sorts of material, including music, movies, and the web.

You can use the Wayback Machine to view the cached copies of old web pages collected by the Internet Archive.
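For readers who would rather look up archived copies from a script than from the browser, here is a minimal sketch in Python. It assumes the Internet Archive's public Wayback availability endpoint (`https://archive.org/wayback/available`) and a JSON response with an `archived_snapshots.closest` entry; both are assumptions about the service, not something described in this text, so check the current documentation before relying on them. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wayback availability service (not taken from this page).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the URL of the closest archived copy, or None if there is none.

    `response` is the decoded JSON body; the key names used here are assumed.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp=None):
    """Query the service over the network and return the closest snapshot URL."""
    query = build_availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("example.com", "20060101")` would ask for the copy of example.com captured closest to 1 January 2006. The network call is kept separate from the URL building and response parsing so that those two steps can be checked offline.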

Many libraries and cultural institutions also curate their own web archives. One popular service for managing these collections is Archive-It, a tool originally developed at the Internet Archive and now provided as a subscription service to libraries and other cultural institutions.

My research focuses on four questions:

  • How should libraries manage their web archiving activities?
  • What can researchers do with web archives?
  • What kind of tools do researchers need in order to use web archives to answer interesting questions?
  • How do you evaluate the quality of the information contained in a web archive?