Web Archives

AlNoamany, Y., AlSum, A., Weigle, M. C., & Nelson, M. L. (2014). Who and what links to the Internet Archive. International Journal on Digital Libraries, 14(3-4), 101–115. http://doi.org/10.1007/s00799-014-0111-5

A deep dive into the Internet Archive's web access logs turns up several intriguing details. The method is a sample of the IA logs (n=6M) from a single day. The group analyzed referrers, page language, availability on the live web, and existence in other archives. Robots make 92.4% of the HTTP requests to IA. The top three languages among requests returning HTTP 200 were English, Japanese, and German; the top three among archive requests returning HTTP 404 were English, Russian, and German. Wikipedia is the most common referrer. One interesting result: 86.4% of web pages linking to the Wayback Machine link to specific mementos, “which means they link to Web pages at a specific time.”