Historical - Early 2000's hack to "fix" bad codes in UTF-8 encoded XML so that XML parsers will be able to parse it
zimeon/utf8conditioner
utf8conditioner
---------------

Takes UTF-8 input from stdin, writes processed UTF-8 to stdout and
errors/warnings to stderr. Developed as a tool for checking and
pre-processing UTF-8 encoded XML responses to OAI-PMH requests
(see https://www.openarchives.org/).

This program aims to "fix" bad codes in UTF-8 encoded XML so that XML
parsers will be able to parse it (albeit with some corruption introduced
by substitution of dummy characters in place of illegal codes). Even
though data may need to be discarded as unreliable, it may still be
possible to extract any <resumptionToken> element to see whether a
harvest was complete.

COMPILING

This program has been developed on Linux using gcc but should compile
on other systems with an ANSI C compiler. On Linux with gcc:

> make                   [should build utf8conditioner]
> make test              [will run several tests, should all say PASS]
> ./utf8conditioner -h   [will display help]

To use another compiler, change the CC = gcc line in the Makefile.

Simeon Warner, [email protected]
$Id: README,v 1.2 2005/10/25 23:18:23 simeon Exp $
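
ILLUSTRATION

The substitution idea described above can be sketched in a few lines of
Python. This is not the utf8conditioner source (which is C) and makes no
claim about its exact rules. It assumes that "bad codes" means malformed
UTF-8 byte sequences plus code points outside the XML 1.0 Char range.
The names condition, resumption_token and the dummy parameter are
hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET


def is_xml_char(cp):
    """True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def condition(raw, dummy="?"):
    """Return UTF-8 bytes that an XML parser will not reject for bad codes.

    Assumption for this sketch: malformed byte sequences become U+FFFD
    (Python's errors="replace" behaviour) and code points that XML 1.0
    forbids become the dummy character.
    """
    text = raw.decode("utf-8", errors="replace")
    cleaned = "".join(ch if is_xml_char(ord(ch)) else dummy for ch in text)
    return cleaned.encode("utf-8")


def resumption_token(xml_bytes):
    """Return the text of the first <resumptionToken>, or None if absent.

    The namespace prefix is ignored so the lookup also works on
    namespace-qualified OAI-PMH responses.
    """
    root = ET.fromstring(xml_bytes)
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "resumptionToken":
            return elem.text or ""
    return None


if __name__ == "__main__":
    # A control character (0x01) and a stray 0xFF byte would both stop a parser.
    damaged = b"<r><resumptionToken>abc\x01</resumptionToken><t>\xff</t></r>"
    print(resumption_token(condition(damaged)))
```

As the README notes for the real tool, the repaired text is slightly
corrupted by the substitutions, so the record data may still have to be
discarded even when the token can be recovered.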