This repository has been archived by the owner on Jan 14, 2019. It is now read-only.

A simple tool for making snapshots from geoserver every day

vsimko/geolayerdump

Motivation

We want to download several layers from geoserver every day in order to create snapshots that can be used as geo-temporal data later on.

Requirements

  • support for GML files only
  • create snapshots once per day or less frequently
  • retry the download if it fails (but only up to a retry limit, e.g. at most 3 times)
  • use content from previous day if failed to download (but keep track of it)
  • the data can potentially be several GB of XML
  • store the snapshots efficiently (compressed, only diffs ...)
  • use cron to trigger the download every N minutes
  • when triggered, only a single GML file should be downloaded at a time
  • keep a low profile: avoid parallel downloads, limit the transfer rate
  • every file is downloaded only once per day
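
The retry limit and the "one download at a time" requirement could be sketched in plain shell as shown here. This is only an illustration, not the code used in this repository: the function names retry and with_lock, the RETRY_DELAY variable, the lock directory and the cron line in the comment are all assumptions.

```shell
#!/bin/sh
# Hypothetical helpers, not part of geolayerdump itself.

# retry MAX CMD [ARGS...]
# Runs CMD until it succeeds, but at most MAX times in total.
# Waits RETRY_DELAY seconds (default 5) between attempts.
retry() {
  max=$1; shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# with_lock LOCKDIR CMD [ARGS...]
# Runs CMD only if LOCKDIR can be created. mkdir is atomic, so a second
# cron invocation that starts while a download is still running gets
# exit code 75 instead of starting a parallel download.
with_lock() {
  lockdir=$1; shift
  mkdir "$lockdir" 2>/dev/null || return 75
  rc=0
  "$@" || rc=$?
  rmdir "$lockdir"
  return "$rc"
}

# Example crontab entry (assumed paths, triggers every 10 minutes):
# */10 * * * * /path/to/download_layer.sh URL DIR
```

Inside the script, a call such as with_lock /tmp/geolayerdump.lock retry 3 some_download_command would combine both helpers. A lock directory left behind by a killed process has to be removed by hand in this simple variant.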

Implementation

  • we need a URL of the geoserver and a directory DIR where the layers will be downloaded

script download_layer.sh URL DIR:

  • picks the oldest *.gml file from DIR
  • downloads the corresponding layer from URL using WFS as GML (XML)
  • formats the XML files using xmllint --format
  • removes fid XML attributes (because they are always newly generated by the geoserver)
  • generates a corresponding *.meta file for every *.gml file, containing accounting information about the download
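
The steps above could look roughly like the following sketch. It is not the actual download_layer.sh. It assumes that the layer name equals the file name without the .gml suffix, that the layer is requested with WFS 1.0.0 GetFeature, that 500k is an acceptable rate limit, and that the *.meta file holds a single date/status line. All function names are made up for illustration. The once-per-day check is left out.

```shell
#!/bin/sh
# Hypothetical sketch of the steps listed above.

# Print the *.gml file in DIR with the oldest modification time.
pick_oldest_gml() {
  oldest=
  for f in "$1"/*.gml; do
    [ -e "$f" ] || continue
    if [ -z "$oldest" ] || [ "$f" -ot "$oldest" ]; then
      oldest=$f
    fi
  done
  [ -n "$oldest" ] && printf '%s\n' "$oldest"
}

# Build a WFS GetFeature request URL (assumed: WFS 1.0.0, default GML output).
wfs_url() {
  printf '%s?service=WFS&version=1.0.0&request=GetFeature&typeName=%s\n' "$1" "$2"
}

# Remove fid="..." attributes, which change with every download.
strip_fid() {
  sed -E 's/[[:space:]]+fid="[^"]*"//g'
}

# download_layer URL DIR
download_layer() {
  url=$1; dir=$2
  target=$(pick_oldest_gml "$dir") || return 1
  layer=$(basename "$target" .gml)
  tmp=$(mktemp) || return 1
  if curl --fail --silent --show-error --limit-rate 500k \
          -o "$tmp" "$(wfs_url "$url" "$layer")" \
     && xmllint --format --output "$tmp.fmt" "$tmp" \
     && strip_fid < "$tmp.fmt" > "$target.new"; then
    mv "$target.new" "$target"
    status=ok
  else
    # Keep the previous content; touching the file moves it to the
    # end of the queue (an assumption of this sketch).
    rm -f "$target.new"
    touch "$target"
    status=failed
  fi
  rm -f "$tmp" "$tmp.fmt"
  printf 'date=%s status=%s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$status" \
    > "${target%.gml}.meta"
  [ "$status" = ok ]
}
```

Writing to a temporary file first means that a failed or partial download never overwrites the snapshot from the previous day.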

duplicity:

  • a useful tool for incremental backups (see usage examples below)

Incremental snapshots using duplicity:

duplicity -vi --allow-source-mismatch --no-encryption path/to/src/dir file://path/to/snapshot/dir
  • -vi = verbosity level is "info"
  • --allow-source-mismatch = do not abort if the source directory differs from the one used for earlier backups to the same target (e.g. after it was renamed)

Listing existing snapshots

duplicity collection-status file://path/to/my/snapshot/dir

Restoring a snapshot

duplicity restore --no-encryption --time 2016-06-30T11:00:00 file://path/to/snapshot/dir path/to/output/dir

Showing summary of differences

To compare the directories dir1 and dir2 while ignoring files that match the pattern *.meta:

diff -x '*.meta' dir1 dir2 | diffstat

The output should look similar to this (the file names are only illustrative):

include/net/bluetooth/l2cap.h |    6 ++++++
 net/bluetooth/l2cap.c         |   18 +++++++++---------
 2 files changed, 15 insertions(+), 9 deletions(-)

Rename stuff by removing prefix from filename

Assuming you are in a directory that contains the files and the prefix is "PREFIX" (this is just a quick and dirty method; there is certainly a better way to do it):

find . -maxdepth 1 -name 'PREFIX?*' | while IFS= read -r F; do mv -- "$F" "./${F#./PREFIX}"; done
