Archiving web pages
We mention in "Best practices for contributing" that whenever an artifact includes a link to a website, you should also include a Wayback Machine link.
However, the Wayback Machine can’t crawl some pages, such as pages gated behind a confirmation prompt (“confirm you’re 18 years old to enter”) or pages that require you to log in.
For these cases, there’s an alternative approach to including archives of web pages in an artifact.
Most software that deals with web archiving uses a standardized file format called a WARC file to store archived web pages. To host a web archive on Ace Archive, you’ll need to:
- Use specialized software to generate a WARC archive for a website.
- Host that WARC file on Ace Archive.
- In the artifact, include a link to a different site that allows users to browse the site archived by the WARC file.
Generating a web archive
There are many tools you can use to generate a web archive. Any tool that can
generate a .warc file will work! If the goal is to archive web pages that
require user interaction (a situation that the Wayback Machine isn’t
well-suited for), a good tool for the job is
With this tool, you “record” a browsing session, and any pages you visit in the tool will be included in the archive. This allows you to do things like bypass confirmation prompts or log into sites.
To host the generated web archive on Ace Archive, list the .warc file as one of the files in the artifact file.
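Before adding the file to an artifact, it can help to confirm that your tool really produced an uncompressed WARC file. Here is a minimal sketch of such a check, assuming only that every WARC record starts with a version line such as `WARC/1.0` or `WARC/1.1`. The function name `looks_like_warc` is hypothetical and not part of any Ace Archive tooling.

```python
from pathlib import Path


def looks_like_warc(path: str) -> bool:
    """Rough sanity check: .warc extension and a leading "WARC/" version line.

    This does not validate the archive; it only catches obvious mistakes,
    such as uploading a gzip-compressed .warc.gz under the wrong name.
    """
    p = Path(path)
    if p.suffix != ".warc":
        return False
    with p.open("rb") as f:
        # The first record of an uncompressed WARC file begins with "WARC/".
        return f.read(5) == b"WARC/"
```

For example, `looks_like_warc("my-site.warc")` returns `False` for a file that is actually gzip-compressed, because its first bytes are the gzip magic number rather than `WARC/`.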
Browsing the web archive
Browsing the contents of a WARC archive requires special software. Luckily, there is a tool called ReplayWeb.page which can pull the WARC file from Ace Archive and allow users to browse its contents without downloading any software. Whenever you add a WARC file to an artifact, you should also include a link to this tool.
In the form below, enter the URL slug of the artifact, the file name you used in the artifact file (which must end in .warc for the tool to work), and the URL of the archived web page you want users to land on. The form then generates the URL you should include in the artifact file.
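If you'd like to see how such a link fits together, here is a minimal sketch that assembles one by hand. It rests on two assumptions that you should check against the live sites: that files are served from `https://files.acearchive.lgbt/artifacts/<slug>/<file name>`, and that ReplayWeb.page accepts the archive location in a `source` query parameter and the landing page in a `url` fragment parameter. The function name `replay_url` is hypothetical. When in doubt, prefer the URL the form gives you.

```python
from urllib.parse import quote, urlencode

# Assumed base URLs (not confirmed by this page); verify before relying on them.
FILES_BASE = "https://files.acearchive.lgbt/artifacts"
REPLAY_BASE = "https://replayweb.page/"


def replay_url(slug: str, filename: str, page_url: str) -> str:
    """Build a ReplayWeb.page link for a WARC file hosted on Ace Archive."""
    if not filename.endswith(".warc"):
        raise ValueError("file name must end in .warc")

    # Where the WARC file is assumed to be hosted.
    warc_url = f"{FILES_BASE}/{quote(slug)}/{quote(filename)}"

    # Percent-encode both URLs so they survive being nested inside another URL.
    query = urlencode({"source": warc_url}, quote_via=quote)
    fragment = urlencode({"view": "pages", "url": page_url}, quote_via=quote)
    return f"{REPLAY_BASE}?{query}#{fragment}"


print(replay_url("some-artifact", "site.warc", "https://example.com/"))
```

Both the WARC location and the landing page are percent-encoded because each is a full URL embedded inside another URL. Without the encoding, their own `?`, `&`, and `#` characters would be misread as part of the outer link.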