Offline Knowledge Preservation

The Internet is fragile. Websites disappear (link-rot studies show a significant share of links breaking within 5 years), services shut down, and access can be cut by infrastructure failures, censorship, or economic collapse.

Kiwix

Kiwix — offline reader for free, compressed ZIM files of major knowledge sources. No internet required once downloaded.

ZIM Package	Size	Content
Wikipedia English “mini”	~12 GB	Article introductions and infoboxes only (full text without images is the “nopic” variant)
Wikipedia English (with images)	~89 GB	Complete — fits on a USB drive
Computer Science “nopic”	~443 MB	Technical reference, no images
Project Gutenberg	varies	Classic literature
Stack Exchange dumps	varies	Programming Q&A

Also available: Wikisource, Wikiquote, Wikivoyage, Wikibooks, Wikiversity, TED talks, Crash Course, Khan Academy, medical wikis.

Full library: library.kiwix.org
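
The sizes in the table decide which drive a collection can live on. A minimal sketch of that check, assuming the ZIM files are already in one local directory; zim_fits is a hypothetical helper name, not part of Kiwix:

```shell
# Sketch: check whether a directory of ZIM files fits on a target drive.
# zim_fits is a hypothetical helper; sizes are compared in 1 KiB blocks.
zim_fits() {
  src_dir="$1"   # directory holding the downloaded ZIM files
  target="$2"    # any path on the destination drive
  need_kb=$(du -sk "$src_dir" | awk '{print $1}')
  # df -P prints one header line, then one line per filesystem;
  # column 4 is the available space.
  free_kb=$(df -Pk "$target" | awk 'NR==2 {print $4}')
  echo "need ${need_kb} KiB, free ${free_kb} KiB"
  [ "$need_kb" -le "$free_kb" ]
}
```

Example call: zim_fits /path/to/zim/files /path/to/usb/mount && echo "fits". The function returns a non-zero status when the collection is larger than the free space.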

Running in Docker:

docker run -d \
  -p 8080:8080 \
  -v /path/to/zim/files:/data \
  ghcr.io/kiwix/kiwix-serve:latest \
  --library /data/library.xml
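
The --library option above expects a library.xml that already lists the ZIM files. One way to build it is with kiwix-manage from kiwix-tools. This is a sketch under assumptions: kiwix-manage is installed on the host, and the function name build_library and the KIWIX_MANAGE override are illustrative only.

```shell
# Sketch: register every ZIM file in a directory in library.xml.
# Assumes kiwix-manage (from kiwix-tools) is on PATH; build_library and
# the KIWIX_MANAGE override are illustrative names.
build_library() {
  zim_dir="$1"
  library="$zim_dir/library.xml"
  for zim in "$zim_dir"/*.zim; do
    # Skip the unexpanded pattern when the directory has no ZIM files.
    [ -e "$zim" ] || continue
    # kiwix-manage syntax: LIBRARY_PATH add ZIM_PATH
    "${KIWIX_MANAGE:-kiwix-manage}" "$library" add "$zim" || return 1
  done
}
```

Example call: build_library /path/to/zim/files. Because library.xml records where each ZIM file lives, it is worth checking that those paths still resolve under the container's /data mount.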

ArchiveBox

ArchiveBox — personal web archiving. Already running in this homelab at archive.folk.zone. Preserves bookmarks, research papers, social media content, and legal evidence archives. Data is readable without needing to run ArchiveBox itself.

See ArchiveBox service page.

Internet Archive Offline Mirror

dweb-mirror — Internet Archive's open-source project for creating offline mirrors of their collection. Designed to address lack of internet access as a factor in educational outcomes and poverty.

HTTrack

HTTrack — downloads entire websites to a local directory, recursively building all structures and preserving link architecture.

httrack "https://example.com" -O "/path/to/mirror" "+*.example.com/*"

Project Gutenberg Mirror

Over 70,000 free books in 60+ languages. New texts added daily. Mirror via rsync:

rsync -av --del aleph.gutenberg.org::gutenberg /path/to/mirror/

Mirroring guidelines: gutenberg.org/help/mirroring
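
Because --del removes local files that no longer exist on the server, it can help to preview a sync before running it. A minimal sketch using rsync's -n (dry-run) option with the same source as above; sync_gutenberg and the RSYNC override are hypothetical names:

```shell
# Sketch: preview the Gutenberg mirror sync by default, apply on request.
# sync_gutenberg and the RSYNC override are illustrative names.
sync_gutenberg() {
  dest="$1"
  mode="${2:-preview}"
  src="aleph.gutenberg.org::gutenberg"
  if [ "$mode" = "apply" ]; then
    # Real run: transfers new files and deletes stale ones in dest.
    "${RSYNC:-rsync}" -av --del "$src" "$dest"
  else
    # -n (dry run) lists what would change without touching dest.
    "${RSYNC:-rsync}" -avn --del "$src" "$dest"
  fi
}
```

Example calls: sync_gutenberg /path/to/mirror/ to preview, then sync_gutenberg /path/to/mirror/ apply to perform the sync.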
