I hate mappings!*
(*)That's why mopper tries to do the job as fast as possible!
A fast and lightweight data-to-RDF mapping tool. It executes an AlgeMapLoom mapping plan which, in turn, can be generated from RML or ShExML mappings.
This very early experimental version takes a mapping plan file in JSON format as input and generates RDF as N-Triples or N-Quads. Starting from an RML or ShExML mapping is on the roadmap.
Conceptually, every operator runs in its own thread, and data flows between operators as a stream of messages (a simplified kind of actor model). There is still plenty of room for optimization, though.
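To make the threading model concrete, here is a minimal sketch using only the Rust standard library. It is not mopper's actual code: the `Row` type, the `run_pipeline` function, the three-stage layout, and the example IRIs are all assumptions made for illustration. Each operator is a thread, and bounded channels carry the messages between them.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical message type; mopper's real message type is not shown in this README.
type Row = Vec<String>;

/// Runs a three-stage pipeline (source -> map -> sink), one thread per operator.
/// `capacity` is the number of messages each channel buffers before `send` blocks.
fn run_pipeline(rows: Vec<Row>, capacity: usize) -> Vec<String> {
    let (row_tx, row_rx) = sync_channel::<Row>(capacity);
    let (triple_tx, triple_rx) = sync_channel::<String>(capacity);

    // Source operator: emits one message per input row.
    let source = thread::spawn(move || {
        for row in rows {
            if row_tx.send(row).is_err() {
                break; // the downstream operator has hung up
            }
        }
        // `row_tx` is dropped here, which closes the channel.
    });

    // Map operator: turns each row into one N-Triples line.
    // Assumes every row has at least two columns; no literal escaping is done.
    let mapper = thread::spawn(move || {
        for row in row_rx {
            let line = format!(
                "<http://example.com/person/{}> <http://xmlns.com/foaf/0.1/name> \"{}\" .",
                row[0], row[1]
            );
            if triple_tx.send(line).is_err() {
                break;
            }
        }
    });

    // Sink operator: collects the serialized triples until the channel closes.
    let sink = thread::spawn(move || triple_rx.into_iter().collect::<Vec<String>>());

    source.join().expect("source thread panicked");
    mapper.join().expect("mapper thread panicked");
    sink.join().expect("sink thread panicked")
}

fn main() {
    let rows = vec![
        vec!["1".to_string(), "Alice".to_string()],
        vec!["2".to_string(), "Bob".to_string()],
    ];
    for line in run_pipeline(rows, 128) {
        println!("{line}");
    }
}
```

With `capacity` set to `0`, `sync_channel` becomes a rendezvous channel: each `send` blocks until the receiving thread takes the message, which is the behaviour the `--message-buffer-capacity` option describes for `0`.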
Basic usage:
mopper -m my-mapping-file.json
To see all options, run mopper --help:
Usage: mopper [OPTIONS] --mapping-file <FILE>
Options:
-m, --mapping-file <FILE> The path to the AlgeMapLoom mapping plan (JSON)
-v, --verbose... Increase log level
-q, --quiet Be quiet; no logging
--force-std-out Force output to standard out, ignoring the targets in the plan. Takes precedence over --force-to-file
--force-to-file <FILE> Force output to file, ignoring the targets in the plan
--message-buffer-capacity <N> Set the maximum number of messages each communication channel can hold before blocking the sender thread. `0` means no messages are held: 'send' and 'receive' must happen at the same time. The default is `128`
-d, --deduplicate Remove duplicate triples or quads. Note that currently deduplication only works on a per-sink basis and has a negative impact on speed and memory consumption
-h, --help Print help
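The per-sink limitation of `--deduplicate` can be pictured with the sketch below. It is an illustration only, not mopper's implementation: `DedupSink` and its fields are hypothetical names, and a `HashSet` of whole statements is just one plausible way to do it. Each sink remembers only what it has written itself, so the same statement can still appear in two different outputs, and memory grows with the number of distinct statements per sink.

```rust
use std::collections::HashSet;

/// Hypothetical sink that drops statements it has already written.
struct DedupSink {
    seen: HashSet<String>, // one entry per distinct statement: this is the memory cost
    written: Vec<String>,  // stands in for a file or standard out
}

impl DedupSink {
    fn new() -> Self {
        DedupSink {
            seen: HashSet::new(),
            written: Vec::new(),
        }
    }

    /// Writes the statement unless this sink has seen it before.
    /// Returns `true` if the statement was new to this sink.
    fn write(&mut self, statement: &str) -> bool {
        if self.seen.insert(statement.to_string()) {
            self.written.push(statement.to_string());
            true
        } else {
            false
        }
    }
}

fn main() {
    let triple = "<http://example.com/s> <http://example.com/p> \"o\" .";
    let mut sink_a = DedupSink::new();
    let mut sink_b = DedupSink::new();

    sink_a.write(triple);
    sink_a.write(triple); // duplicate within sink A: dropped
    sink_b.write(triple); // first time for sink B: kept

    println!("sink A wrote {} statement(s)", sink_a.written.len());
    println!("sink B wrote {} statement(s)", sink_b.written.len());
}
```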
You need Rust and Cargo to build mopper (install instructions).
Then, in the root directory, run
cargo build --release
The executable binary is placed in the target/release directory.
Mopper is a work in progress. Here's a rough overview of what is and is not yet implemented:
Input formats:
- CSV
- JSON
- XML
Input / output types:
- File
- Standard out
- Standard in
- Stream (e.g. Kafka, Websocket)
- Relational database
Output formats:
- N-Triples
- N-Quads
- More RDF serializations
Mapping features:
- IRI generation function
- Reference function
- IRI template function
- Constant IRI generation
- URL encode function
- IRI generation
- Projection operator
- Fragmenting
- Join operator (only inner join with equals condition)
- Blank node generation function
- Deduplication
- Concatenate function
- Replace function
- To uppercase / lowercase function
- FnO function handling
- Rename operator
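As an illustration of how two of the mapping features listed above fit together, the following sketch combines an IRI template function with a URL encode function. It does not reflect mopper's internals: `percent_encode`, `expand_template`, the `{field}` placeholder syntax, and the example IRI are assumptions, and escaped braces inside templates are not handled.

```rust
use std::collections::HashMap;

/// Percent-encodes every byte outside the RFC 3986 "unreserved" set.
fn percent_encode(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Replaces each `{field}` placeholder with the encoded value from `record`.
/// Returns `None` if a field is missing or a brace is left unclosed.
fn expand_template(template: &str, record: &HashMap<&str, &str>) -> Option<String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find('{') {
        out.push_str(&rest[..start]);
        let end = start + rest[start..].find('}')?;
        let field = &rest[start + 1..end];
        out.push_str(&percent_encode(record.get(field)?));
        rest = &rest[end + 1..];
    }
    out.push_str(rest);
    Some(out)
}

fn main() {
    let mut record = HashMap::new();
    record.insert("name", "Alice Smith");

    match expand_template("http://example.com/person/{name}", &record) {
        Some(iri) => println!("<{iri}>"),
        None => eprintln!("template references a field that is not in the record"),
    }
}
```

Returning `None` for a missing field lets the caller decide whether to skip the statement or report an error; which of the two mopper does is not stated here.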