Portable Web Package

Overview

A Portable Web Package (PWP) is a self-contained collection of documents for web browsers that can be used without modification both locally and on web servers. A PWP may be packaged into a single file, for example a zip archive, to facilitate distribution; keeping many copies improves long-term survivability over periods exceeding 100 years.
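As a minimal sketch of the packaging step, assuming Python and its standard zipfile module, the function below writes every file under a package directory into a single archive. The name pack_pwp and the example paths are hypothetical and not part of the format.

```python
import zipfile
from pathlib import Path


def pack_pwp(root: str, archive: str) -> list[str]:
    """Write every file under `root` into a zip archive; return the stored names.

    `archive` should live outside `root`, otherwise the archive would be
    packed into itself. Empty directories are not stored in this sketch.
    """
    root_path = Path(root)
    stored = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root_path.rglob("*")):
            if path.is_file():
                # Store names relative to the package root, with "/" separators,
                # so the unpacked tree has the same relative URIs.
                name = path.relative_to(root_path).as_posix()
                zf.write(path, arcname=name)
                stored.append(name)
    return stored
```

A call such as pack_pwp("my_pwp", "my_pwp.zip") would produce one distributable file; unpacking it restores the directory structure unchanged.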

Structure

A PWP is implemented as a directory structure using browser-native mechanisms, essentially static pages without runtime processing. This approach improves robustness and supports long-term preservation without relying on complex systems. Only a simple filesystem such as exFAT is required; advanced mechanisms such as symbolic links or access control lists (ACLs) are not needed. The root directory must contain subdirectories and exactly one file, named index.html; this file serves as the entry point and also functions as the landing page on web servers.
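The root-directory rule can be checked mechanically. The sketch below assumes Python and reads "a single file" as "index.html is the only regular file in the root"; the helper name check_root is hypothetical, and the sketch does not check that subdirectories are present.

```python
from pathlib import Path


def check_root(root: str) -> list[str]:
    """Return a list of problems found in the root directory of a PWP."""
    problems = []
    # Names of regular files directly inside the root (subdirectories are ignored).
    files = sorted(e.name for e in Path(root).iterdir() if e.is_file())
    if "index.html" not in files:
        problems.append("missing index.html")
    for name in files:
        if name != "index.html":
            # Assumption: any other file in the root violates the rule.
            problems.append(f"unexpected file in root: {name}")
    return problems
```

An empty result means the root directory passes this particular check; it says nothing about the content of the subdirectories.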

Instantiations may be formally defined using schema languages such as RELAX NG, or more loosely with templates. A common practice is to create PWPs containing other PWPs.

Linking and Processing

To ensure that a PWP works both locally and on web servers, some technical constraints apply. Links should point to files rather than directories, since directory listings may not always be available. Because runtime processing is avoided, content is typically split into separate files, and elements within them are identified by URI fragments. JavaScript may be used for small helper tasks but should be kept to a minimum.
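The rule that links should point to files can be approximated with a small checker. The following sketch assumes Python's standard html.parser and urllib.parse modules; the names LinkCollector and directory_links are hypothetical. It flags only relative links whose path visibly names a directory (ending in "/", "." or ".."); a link such as "docs" that names a directory without a trailing slash cannot be detected without consulting the filesystem.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def directory_links(html: str) -> list[str]:
    """Return relative hrefs whose path names a directory rather than a file."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        if parts.scheme or parts.netloc:
            continue  # external link, out of scope for this check
        if not parts.path:
            continue  # fragment-only link within the same document
        last_segment = parts.path.rsplit("/", 1)[-1]
        if last_segment in ("", ".", ".."):
            flagged.append(href)
    return flagged
```

For example, "docs/" and ".." would be flagged, while "docs/index.html#intro" and "#top" would pass.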

File Formats

Files should be preserved in their original formats, even if browsers cannot display them directly. Additional versions should also be provided in widely supported formats. For example, a file such as "foo.docx" may also be provided as HTML, PDF/A, and plain text. The most reliable converter is usually the software that created the original file. Storing the same information in multiple formats improves resilience and long-term accessibility. Plain text is especially useful because it allows software tools to process the data easily.

Naming and Limits

The recommended maximum number of entries (files and directories) per directory is 10,000. File and directory names should be compact mnemonics that yield compact URIs, as defined in the URI standard. The character repertoire should be restricted to a single case (unicase): lowercase letters, digits, underscore, and period, that is, the character class [a-z0-9_.]. The "." character is reserved as the separator for format and language identifiers; examples: "foo.pdf", "foo.en.pdf".
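These naming rules lend themselves to automated checking. The sketch below, in Python, validates names against the restricted repertoire, splits a name at the "." separator, and walks a tree to report violations. The names NAME_RE, is_valid_name, split_name, and check_tree are hypothetical, and the sketch does not try to decide which identifier is a language and which is a format.

```python
import os
import re

# Lowercase letters, digits, underscore, and period only.
NAME_RE = re.compile(r"[a-z0-9_.]+")
MAX_ENTRIES = 10_000  # recommended limit of entries per directory


def is_valid_name(name: str) -> bool:
    """True if the whole name uses only the restricted character repertoire."""
    return NAME_RE.fullmatch(name) is not None


def split_name(name: str) -> tuple[str, list[str]]:
    """Split "foo.en.pdf" into the base name and its dot-separated identifiers."""
    base, *identifiers = name.split(".")
    return base, identifiers


def check_tree(root: str) -> list[str]:
    """Report over-full directories and invalid names below `root`."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        entries = dirnames + filenames
        if len(entries) > MAX_ENTRIES:
            problems.append(f"{dirpath}: {len(entries)} entries")
        for name in sorted(entries):
            if not is_valid_name(name):
                problems.append(f"{os.path.join(dirpath, name)}: invalid name")
    return problems
```

With these definitions, split_name("foo.en.pdf") returns ("foo", ["en", "pdf"]), and a name such as "Foo.pdf" is rejected because of the uppercase letter.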

Design Principle and Miscellaneous

Simplicity is the primary design principle. The technical barrier should remain low so that PWPs are easy to handcraft or generate programmatically. Data should be self-describing whenever possible. Simplicity improves long-term preservation.

In addition to modern graphical browsers such as Chrome, a PWP should remain usable with command-line interface (CLI) programs such as Lynx or cURL. Graceful degradation is recommended.

Metadata and cryptographic assurance (integrity, authenticity, time-stamping, etc.) may follow the general approach described in Long-Term Archive and Notary Services (LTANS).
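Integrity checking usually starts from a digest of each file. The sketch below is not an implementation of LTANS; it only illustrates, under the assumption that SHA-256 is an acceptable hash function, how a per-file digest list for a package could be computed in Python. The name sha256_manifest is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_manifest(root: str) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its SHA-256 hex digest."""
    root_path = Path(root)
    manifest = {}
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch,
            # chunked reading would suit very large files better.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root_path).as_posix()] = digest
    return manifest
```

Comparing a freshly computed manifest with a stored one reveals changed, missing, or added files; authenticity and time-stamping require additional mechanisms that this sketch does not cover.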

PWP is ready for practical use. Several implementations exist and the format is documented in detail. Exploring an existing PWP is often the easiest way to understand the format; see also Xdossier.