Web Archiving

Questions about Web Archiving:

How much of a website is collected in the Archives?

The Archives' goal is to create an archival copy (essentially a snapshot) of how the site appeared at a particular point in time. Depending on the collection, we preserve as much of the site as possible, including HTML pages, images, Flash, PDFs, audio, and video files, to provide context for future researchers. The crawler is currently unable to archive streaming media, "deep web" or database content requiring user input, and content requiring payment or a subscription for access. In addition, there will always be some websites that use emerging or unusual technologies that the crawler cannot anticipate.

Do you archive all identifying site documentation, including URL, trademark, copyright statement, ownership, publication date, etc.?

The Archives attempts to completely reproduce a site for archival purposes.

Is there any personal information in the web archive?

The Archives collects websites that are publicly accessible. These may include pages with personal information.

Information Especially for Webmasters and Site Owners

Why was my website selected?

Websites are selected by the Archives according to collection strategies developed for each thematic or event collection. The Library maintains a collections policy statement and other internal documents to guide the selection of electronic resources, including websites.

How often and for how long will you collect my site?

Typically, the Archives crawls a website annually or quarterly, depending on how frequently the content changes.

The Archives may crawl your site for a specific period of time or on an ongoing basis, depending on the scope of a particular project. Some archiving activities are tied to a time-sensitive event, such as the period before and immediately after a national election, or the period immediately following an event. Other archiving activities may be ongoing with no specified end date.

Still have questions? I am happy to help!
Jessika Drmacich, Records Manager and Digital Resources Archivist