internet-archiving

Here are 25 public repositories matching this topic...

ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Updated Sep 27, 2024
Python

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

osint internet-archive web-archiving wayback-machine webarchiving cdx-api internet-archiving savepagenow archive-webpage archive-webpages wayback-machine-api wayback-machine-python

Updated Feb 26, 2024
Python

pirate / wikipedia-mirror

🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump

html docker nginx wiki docker-compose mediawiki wikipedia archiving datascience kiwix zim wikipedia-dump wikipedia-mirror openzim xowa internet-archiving mwdumper kiwix-offline-wikipedia

Updated Apr 7, 2021
Shell

ArchiveBox / archivebox-browser-extension

Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

chrome-extension archiving svelte firefox-extension browser-extension web-archiving digital-preservation digipres internet-archiving archivebox

Updated Jul 12, 2024
TypeScript

ArchiveBox / electron-archivebox

Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)

electron windows macos linux docker gui desktop web-archiving digipres internet-archiving archivebox desktop-electron

Updated Feb 28, 2023
JavaScript

ArchiveBox / readability-extractor

JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.

wrapper node readability internet-archiving archivebox

Updated Sep 16, 2024
JavaScript

ArchiveBox / docker-archivebox

Home of the official docker image for ArchiveBox

docker kubernetes image docker-compose docker-image container oci digipres podman internet-archiving archivebox

Updated Feb 19, 2024
Dockerfile

ArchiveBox / good-karma-kit

😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...

docker docker-compose ipfs distributed-computing tor distributed-storage sia boinc kiwix i2p foldingathome storj pywb internet-archiving archivebox good-karma archivewarrior zimfarm

Updated May 11, 2024

vegetableman / vandal

Navigator for Web Archive

chrome-extension firefox-addon wayback-machine webarchive internet-archiving

Updated Nov 23, 2023
JavaScript

pirate / internet-archiving-talk

🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.

slideshow wget talks warc censorship web-archiving ethics internet-archiving archivebox

Updated Aug 15, 2024
JavaScript

ArchiveBox / debian-archivebox

Home of the official apt/deb package for Ubuntu/Debian-based systems.

package debian apt ubuntu web-archiving aptitude digipres internet-archiving archivebox stdeb

Updated May 20, 2024
Python

ArchiveBox / homebrew-archivebox

Homebrew formula for the ArchiveBox self-hosted internet archiving solution.

macos homebrew package linuxbrew web-archiving digipres brew-tap internet-archiving archivebox

Updated Feb 19, 2024
Ruby

ArchiveBox / docs

Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution.

python cli community documentation ui rest wiki sphinx usage web-archiving digipres internet-archiving archivebox

Updated May 7, 2024
CSS

ArchiveBox / pip-archivebox

Official Python package for ArchiveBox, the self-hosted internet archiving solution.

python pypi wheel pip setuptools web-archiving digipres sdist internet-archiving archivebox

Updated Jul 15, 2024

mikwielgus / forum-dl

Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC.

python scraper forum discourse phpbb warc data-fetching simplemachines internet-archiving

Updated Jun 27, 2024
Python

itsliamdowd / WaybackBrowserMacOS

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jul 1, 2022
Swift

itsliamdowd / WaybackBrowserWindows

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jun 14, 2022
Python

Own-Data-Privateer / hoardy-web

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, mirroring, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

cli backups internet archiving snapshot self-hosted archive browser-extension archiver web-archiving wayback-machine web-browsing web-archive website-archive auto-save offline-reading internet-archiving

Updated Sep 21, 2024
JavaScript

gabldotink / sharkive.old

upload stuff to the Internet Archive using a shell script

youtube youtube-dl internet-archive youtube-downloader internet-archiving

Updated Jul 28, 2023
Shell

Fooftilly / RSS_archiver

Download and archive RSS feeds to the Wayback Machine. Saves a list of archived feeds in a local db.

rss archive internet-archive rss-feed archiver wayback-machine webarchive link-archiver internet-archiving rss-archive link-archive

Updated Oct 19, 2023
Python
