This repository has been archived by the owner on Jul 28, 2022. It is now read-only.
forked from projectgus/yamdwe

Yet Another Mediawiki to DokuWiki Exporter


cdruee/yamdwe

 
 


Yet Another Mediawiki to DokuWiki Exporter

Yamdwe is made up of two Python programs to export an existing Mediawiki install to a Dokuwiki install.

Features

  • Exports and recreates full revision history of all pages, including author information for correct attribution.
  • Exports images and maintains modification dates (but not past revisions of an image.)
  • Can optionally export user accounts to the default DokuWiki "basicauth" format (see below).
  • Parses MediaWiki syntax using the mwlib library (as used by Wikipedia), so it can convert most pages cleanly, with minimal manual cleanup.
  • Syntax support includes: tables, image embeds, code blocks.
  • Uses the MediaWiki API to export pages and images, so a MediaWiki install can be exported remotely and without admin privileges. (NB: Yamdwe does hit the API quite hard, so please do not export other people's wikis for fun. Or, at minimum, please read their Terms of Service first and comply with them.)
  • Supports logging in to Mediawiki to export, and also HTTP Basic Auth.

Compatible Versions

  • Dokuwiki 2014-09-29a "Hrun", but should work on any recent version. Exporting users only works on 2014-09-29a or newer (see below).
  • MediaWiki 1.13 or newer (i.e. any recent version; 1.13 is from 2008!)

Yamdwe has now been used successfully on many wikis of various sizes. If you've used it on a particularly large or unusual wiki, please let me know!

Requirements

If exporting users is required

Using yamdwe

Installing dependencies

1. Basic dependencies

For Debian/Ubuntu Linux:

sudo apt-get install python python-mysqldb python-pip python-lxml python-requests python-dev

2. Virtualenv (optional)

I suggest installing the remaining Python dependencies inside a virtualenv, as mwlib in particular has a lot of specific dependencies.

Some of the mwlib dependencies may be available as system Python packages, but they may have older/incompatible versions. Sandboxing these packages into a "virtualenv" avoids these version conflicts.

Virtualenv & virtualenvwrapper for Debian/Ubuntu:

sudo apt-get install python-virtualenv virtualenvwrapper
source /etc/bash_completion
mkvirtualenv --system-site-packages yamdwe

(Next time you log in, the virtualenvwrapper aliases will be added to your environment automatically, and you can run workon yamdwe to activate the yamdwe virtualenv.)

3. Pip dependencies

If you're using a virtualenv, make sure to run these inside it (i.e. run workon yamdwe first).

pip install http://pypi.python.org/packages/source/s/simplemediawiki/simplemediawiki-1.2.0b2.tar.gz
pip install -i http://pypi.pediapress.com/simple/ mwlib
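
A quick way to confirm the dependencies landed in the environment you expect is to try importing them. The following is a minimal sketch, not part of yamdwe; the import names in YAMDWE_MODULES are assumed from the packages installed above (MySQLdb is only relevant if you plan to export users).

```python
def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumed import names for the packages installed in the steps above.
YAMDWE_MODULES = ["requests", "lxml", "simplemediawiki", "mwlib", "MySQLdb"]

if __name__ == "__main__":
    missing = missing_modules(YAMDWE_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All dependencies importable")
```

Run it with the same interpreter (and virtualenv, if any) that you will use for yamdwe.py.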

Set up Dokuwiki

If you're creating a new DokuWiki then set up your DokuWiki installation and perform the initial installation steps (name the wiki, set up an admin user, etc.) You can also use yamdwe with an existing wiki, but any existing content with the same name will be overwritten.

Exporting pages & images

To start an export, you will need the URL of the mediawiki API (usually http://mywiki/wiki/api.php or similar) and the local path to the Dokuwiki installation.

yamdwe.py MEDIAWIKI_API_URL DOKUWIKI_ROOT_PATH
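
If you are unsure whether you have the right API URL, you can check it before starting a long export. The sketch below is independent of yamdwe and uses only the Python 3 standard library. It asks the MediaWiki API for its general site information (the siteinfo query); the helper names siteinfo_url and fetch_sitename are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def siteinfo_url(api_url):
    """Build a URL that asks the MediaWiki API for general site information."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def fetch_sitename(api_url, timeout=10):
    """Query the API and return the wiki's site name (performs a network request)."""
    with urlopen(siteinfo_url(api_url), timeout=timeout) as response:
        data = json.loads(response.read().decode("utf-8"))
    return data["query"]["general"]["sitename"]
```

Calling fetch_sitename("http://mywiki/wiki/api.php") should return the wiki's name if the URL is correct and the wiki allows anonymous reads. Wikis that require a login will need the authentication options described below instead.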

If you need to log in to your Mediawiki install (either with a Mediawiki username, or via HTTP Basic Auth) then run yamdwe.py -h to view the command line options for authentication.

If the export goes well, it should print the names of pages and images as it exports them, and finally print "Done". This process can be slow, and can put a heavy load on the Mediawiki server for large wikis.

Yamdwe may warn you at the end that it is unable to set correct permissions for the Dokuwiki data directories and files - regardless, you should check and correct these manually.
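
One way to do that manual check is to list everything under the data directory that is not owned by the web server's user. This is a rough sketch for POSIX systems, not something yamdwe provides; paths_not_owned_by is a hypothetical helper, and which user should own the files depends on your web server setup.

```python
import os


def paths_not_owned_by(root, uid):
    """Return all paths under `root` (including `root`) not owned by user id `uid`."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself plus every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)
```

For example, on Debian/Ubuntu the web server commonly runs as www-data (an assumption about your setup), so you could pass pwd.getpwnam("www-data").pw_uid as the uid and /srv/www/dokuwiki/data as the root, then fix any reported paths with chown.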

Inevitably some content will not import cleanly, so a manual check/edit/cleanup pass is almost certainly necessary.

Exporting users

This step is optional, but it's nice as it matches the user names in the imported revision history with actual users in dokuwiki.

For this step you need access to the MySQL database backing the mediawiki install, and local access to the dokuwiki root directory.

An example usage looks like this:

./yamdwe_users.py -u mediawiki --prefix wiki_ /srv/www/dokuwiki/

Run yamdwe_users.py with "-h" to see all options.

Any settings you're unsure about (like --prefix for table prefix) can be found in the LocalSettings.php file of your Mediawiki installation.

yamdwe_users exports mediawiki password hashes to a dokuwiki "basicauth" text file. These imported passwords require Dokuwiki version 2014-09-29 "Hrun" or newer. On older Dokuwiki installs the password file format is not compatible and it will break user auth. The best thing to do is to update to 2014-09-29 or newer before running yamdwe_users.py.
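
To give an idea of what ends up in that text file, here is a sketch of how a single user line could be assembled. It assumes DokuWiki's plain-text user file layout of login:passwordhash:Real Name:email:groups, with backslashes and colons inside a field escaped by a backslash. This is an illustration only, not the code yamdwe_users.py runs, and the hash value shown is a made-up placeholder in the style of an older MediaWiki salted hash.

```python
def escape_field(value):
    """Escape backslashes and colons so a field can sit in a colon-separated line."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def auth_line(login, password_hash, real_name, email, groups=("user",)):
    """Build one user line: login:passwordhash:Real Name:email:groups (assumed layout)."""
    fields = [login, password_hash, real_name, email, ",".join(groups)]
    return ":".join(escape_field(field) for field in fields)


if __name__ == "__main__":
    # Placeholder values; the hash is not a real password hash.
    print(auth_line("alice", ":B:1234abcd:0123456789abcdef",
                    "Alice Example", "alice@example.org"))
```

Note how the colons inside the hash have to be escaped to keep the line parseable.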

Post Import Steps

  • After the export please check for correct permissions on the dokuwiki data/conf/users.auth.php file and other data/conf files.

  • The search index needs to be manually rebuilt with the contents of the new pages. The searchindex plugin can do this.

Common Manual Cleanup Items

  • Page naming and namespaces will probably need some rearranging/renaming to seem "natural" in Dokuwiki. The move Plugin makes this straightforward.

  • Some uncommon URL schemes, such as file://, are not detected by Dokuwiki as links unless you add a scheme.local.conf file, as described in the DokuWiki documentation on URL schemes.
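
For the URL scheme item, a small helper can add the scheme for you. This sketch assumes the file lives at conf/scheme.local.conf under the DokuWiki root and lists one scheme name per line (without the ://), which is how the stock conf/scheme.conf is laid out; add_url_scheme is a hypothetical name, so please check the DokuWiki documentation for your version before relying on it.

```python
import os


def add_url_scheme(dokuwiki_root, scheme):
    """Append `scheme` to conf/scheme.local.conf unless it is already listed.

    Returns True if the file was changed, False if the scheme was present.
    """
    conf_path = os.path.join(dokuwiki_root, "conf", "scheme.local.conf")
    content = ""
    if os.path.exists(conf_path):
        with open(conf_path) as handle:
            content = handle.read()
    if scheme in [line.strip() for line in content.splitlines()]:
        return False
    with open(conf_path, "a") as handle:
        # Make sure the new entry starts on its own line.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(scheme + "\n")
    return True
```

For example, add_url_scheme("/srv/www/dokuwiki/", "file") would register the file scheme.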

Known Issues

Please check the Issues list on github to see what's going on.

If you do find a bug or have trouble exporting a wiki then please open an issue there and I (or other yamdwe users) can try and help you out.

Submitting Good Bug Reports

If the bug is with some Mediawiki markup that doesn't provide the expected Dokuwiki markup, for a good bug report please include:

  • Excerpt of the Mediawiki markup causing the problem.
  • Desired Dokuwiki markup output.
  • Actual (problematic) Dokuwiki output from yamdwe.

Better Bug Reports?

Want to put a huge smile on my face and get a massive karma dose by submitting an even better bug report? Are you comfortable using git & github?

  • Fork the yamdwe repository on github.
  • Add a test case directory under tests/ and place the problematic Mediawiki markup into a file mediawiki.txt, and the desired correct Dokuwiki output into dokuwiki.txt.
  • Run wikicontent_tests.py to verify that the incorrect output you expected is printed as part of the test failure.
  • Add a commit which adds the new test case directory.
  • Submit a Pull Request for the test failure. Use the Pull Request description field to explain the problem.
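
If it helps, the test case directory from the steps above can be created with a few lines of Python. This is only a convenience sketch: create_test_case and the directory name bold_text are invented for the example, and the markup shown (bold text) is a stand-in for whatever markup is causing your problem.

```python
import os


def create_test_case(tests_dir, name, mediawiki_markup, dokuwiki_markup):
    """Create tests_dir/name containing mediawiki.txt and dokuwiki.txt."""
    case_dir = os.path.join(tests_dir, name)
    os.makedirs(case_dir)
    with open(os.path.join(case_dir, "mediawiki.txt"), "w", encoding="utf-8") as handle:
        handle.write(mediawiki_markup)
    with open(os.path.join(case_dir, "dokuwiki.txt"), "w", encoding="utf-8") as handle:
        handle.write(dokuwiki_markup)
    return case_dir


if __name__ == "__main__":
    # Problematic input goes in mediawiki.txt, the desired output in dokuwiki.txt.
    print(create_test_case("tests", "bold_text", "'''bold'''\n", "**bold**\n"))
```

After creating the directory, run wikicontent_tests.py as described above to confirm the new case fails in the way you expect.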

Best Bug Reports?

If you want to outclass even that bug report, your commit could also add a fix for the conversion problem in yamdwe, so all tests pass including the new one you added! A+++ would accept Pull Request again!

Don't worry if you don't want to perform any extra steps though, any (polite) bug report is always welcome!
