xml2json

xml2json is a tool to transform XML files under an specific format to a C++ json parser.

With the new Boom C++ is having at the moment thanks to C++11/C++14 and the upcoming C++17, I felt there was something missing in this language. Working on cloud applications is not trivial, and usually involves several parts and components. REST APIs are the common mechanism to communicate against the servers, and the underlying content is usually JSON, so after creating your application model, data structures, etc... comes one big and annoying part, to parse the JSON message.

JSON messages should not be difficult to read by humans, and pretty lightweight for computers, that makes them the perfect choice to transmit data, and thanks to its "tree" structure, allows developers design quite complex models in an easily. That friendly and human like behavior of the JSON has a non-desired effect, we include a lot of fields, attributes, recursive methods, and the parsing even though is trivial is pretty time consuming.

So this library is as NHibernate is for SQL, or Google Protobuf but for JSON. We will create an xml model of our data, that will create and parse automatically the data we want to read.

This is not the same as other libraries like Cereal or tools like Google Protocol Buffer. Google Protocol buffer on version 3 supports encoding of the .proto file to JSON but doesn't provide decoding of it. Cereal is good library to serialize/deserialize data using C++11 but doesn't support the same logic as this tool or Google Protocol Buffer offer (conditions, optionals, etc...)

Version

1.0.1

Requirements

To be able to compile the project you need a version of boost that supports program options, I'm using 1.58 at the moment.

Installation

To install the tool you just need clone the repository on a local folder

$ git clone --recursive [git-repo-url] [folder-you-want]

Create a bin folder and call cmake

$ mkdir bin
$ cd bin
$ cmake ../

Usage

In order to make use of this program I recommend first to take a look at the help output

$ ./umison -h
umison allowed options:
    -h [ --help ]                   produce this help message
    -e [ --write-file-h ] arg       The .h output will be written to this file
    -p [ --write-file-cpp ] arg     The .cpp output will be written to this file
    -i [ --read-file ] arg          Use this file as input xml template
    -a [ --append-string ] arg      String to be appended in the internal
                                    namespace to avoid collisions with existing
                                    code
    --generate-custom-interface arg Specify an engine you want to create a custom
                                    read_data method

By default the program uses the standard input so you can redirect it like in the following case

$ cat test.xml | ./umison

To produce an output.

By default the program will use the standard output but you can specify different files for the generated code

$ cat test.xml | ./umison

/**
 * This file has been created automatically. Any change will be lost
 * Please modify your input template to create a different output
 * instead of modifying this file.
 *
 * Author: José Gerardo Palma Durán
 *
 * Disclaimer: The author of this software is not responsible of any
 * possible damage/problem caused by the usage of this software. This
 * code is provided as is, without any warranty.
 *
 * If you want to report any bug, please contact me at jpalma at barracuda dot com
 *
 * If you didn't get a copy of the code used to create these templates, you can always
 * download it for free from https://github.com/raistmaj/xml2json
 * */

#include <string>
#include <vector>

// Internal namespace declaration
namespace __internal__umison {

    // Forward declaration

You can specify a file as input template and files to print the output, remember if you only specify one of the files the other will use the standard output.

$ ./umison -i test.xml -e test.h -p test.cpp
XML parsing complete.
XML checking minimum requirements.
Done.
Checking references on Classes and Jsons.
Parsing complete.
Starting to build output.
$ ls
umison  test.cpp  test.h

If you specify the flag --generate-custom-interface with a valid engine, depending on it, we will create additional methods on the .h/.cpp where you can specify direct elements for that library.

$ ./umison --generate-custom-interface rapidjson -i riot_games.xml -e riot_games.h -p riot_games.cpp

At the moment this are the available engines

rapidjson

Template format

As mentioned before we use XML. XML was used as the structure is pretty close to json and additionally we can include attributes allowing us to specify names, conditions, etc...

The document must start with the root tag umison

<umison>
</umison>

At this level you can specify two possible entities, class and json, the first one is used to reference data structures by the rest, the second one is the parsers we want to generate and will be exposed on the header file.

<umison>
    <class>
        ...
    </class>
    ...
    <json>
        ...
    </json>
</umison>

One important note about classes and references is that the code will be created in the same order you specify on your template, so as in C/C++ you can't use an undefined data structure and the objects are designed to use RAII(no pointers are involved so automatic memory management and possible compile time optimizations).

Data types and attributes

We must identify basic type(the ones that are supported by the parser automatically), and composed types(the ones built with multiple types).

Basic types are:

integer: 64 bits integer
int32: 32 bits integer, the reason of this names is because I wanted the default be 64 bits
float: a floating point number.
string: a string value
list: a Json array
refclass: a named object which references a complex type.
map: an anonymous object where its name is not given in the json. One important note about maps is, they are the only allowed type on one level, we expect that level to be only a map. Maps are distinguish from the rest as the first element is a string and is not known at compile time, as consequence will be filled with the retrieved result, this is important so the solution to avoid collisions is to forbid different elementson that level. At the moment is user responsibility to avoid maps with other elements on one level within the xml.

{
    "22039182" : {
        ...
    }
}

In that fragment 22039182 is the user ID but we don't have something like this

{
    "user_data" : {
        "user_id" : 22039182,
        "internal_user_data" : {
            ...
        }
    }
}

Inside class and json we can use the following table to guide us on wich attributes are supported by each tag

Data type / Attribute	optional	optional_name	condition	name	refclass	value
integer	X	X	X	X
int32	X	X	X	X
float	X	X	X	X
string	X	X	X	X
list	X	X	X	X	X
refclass	X	X	X	X	X
map	X	X	X	X		X

Each attribute will have a different default value, and in the case of name, if we leave it empty, the behavior is undefined. On list and refclass if we left empty refclass attribute the behavior is undefined too.

optional: False by default, will mark if the node can be skipped in the parsing. An error will be reported if the node exists and the type is wrong.
optional_name: By default, optional values creates an additional bool field representing if the field was read or not, as for elements different of arrays there is no way to detect if there was a value, and even for arrays we may be interested on empty arrays. The default behavior is to append _umi_optional to the name, but in some cases maybe a collision may happen. For that reason, we give you the option to specify a custom name for that variable. Remember THIS IS NOT TO INDICATE IF THE VARIABLE IS OPTIONAL OR NOT, THIS MEANS; THE VARIABLE IS OPTIONAL AND I WANT TO USE THIS ATTRIBUTE TO DETECT IF . WAS READ.
condition: Empty by default, C/C++ code to be evaluate within an if condition to consider if we must read that node.
name: Name we want to use to identify the element in our data structure and in the json, is 1-1.
refclass: Class to be used in the referenced elements.
value: Type we want to use on the map.

The next types are supported by the refclass attribute

integer: The array/class will be an integer of 64 bits.
int32: The array/class will be an integer of 32 bits.
float: The array/class will be a float.
string: The array/class wll be a string.
class name: The array/class will be a class already defined in your template.

For a complete example please see the file test.xml or the test folder.

The final representation depends on the output engine.

Output engines

The system requires what is known as output engines to produce our .h and .cpp, depending on the one you select the results may differ(the .cpp).

The .h is always the same, will not be different no matter what, as this is a C++ creator we use STL for the data structures if necessary.

The next list shows the conversion from xml type to cpp type

integer: long long int
int32: int
float: double
string: std::string
list: std::list<refType>
map: std::multimap<std::string,refType>

Generated Parser

Once you have your template and you have created your custom parser for that template, it will be surprisingly easy and straightforward for you to start using it. The only requirement at the moment is you need to have rapidjson installed. Just include the .h, instantiate a new object of the json you want to parse and call read_data.

#include "test.h"

int main(int argc, char** argv) {
    umison::test1 instance;
    std::string json_response;
    // Fill our json_response with a valid json
    if(instance.read_data(json_response)){
        // Access the data automatically
    }
}

By default, the system will use std::cerr to report errors, each class provides a custom method where you can supply your own stream.

#include "test.h"
#include <sstream>

int main(int argc, char** argv) {
    umison::test1 instance;
    std::stringstream my_stream;
    std::string json_response;
    // Fill our json_response with a valid json
    if(instance.read_data(json_response, my_stream)){
        // Access the data automatically
    }
}

The visibility of the attributes in the object will be private, so in order to get access to each of them, you need to use the different methods provided. There are three different types, "get", "set" and "mutable" accesor. Get will be a const method, so no modification will be done, set will be used to modify the content using a parameter, and mutable is a get that returns a non-const reference.

Tested

The software and resulting files have been tested on Linux but should work on Windows.

The main binary is kinda fragile and will abort under some scenarios.

The resulting json parser MUST be pretty solid and avoid things like segmentation faults. I'm continuously working to improve the quality and readability of the generated code, so please, any suggestion/pull request is always welcome.

LICENSE

See LICENSE.txt for a copy of the license, but well 2 terms standard BSD so you have to read it if you don't want.

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
pugixml @ 2cf599b		pugixml @ 2cf599b
rapidjson @ 292714e		rapidjson @ 292714e
test/riot_games		test/riot_games
.gitmodules		.gitmodules
CMakeLists.txt		CMakeLists.txt
LICENSE.txt		LICENSE.txt
README.md		README.md
main.cpp		main.cpp
test.json		test.json
test.xml		test.xml
umison.cpp		umison.cpp
umison.h		umison.h
umisonfileprinter.h		umisonfileprinter.h
umisonforwarddeclaration.h		umisonforwarddeclaration.h
umisonoutenginerapidjson.h		umisonoutenginerapidjson.h
umisonoutputengine.h		umisonoutputengine.h
umisonoutputengines.h		umisonoutputengines.h
umisontypetocpp.h		umisontypetocpp.h
umixml.cpp		umixml.cpp
umixml.h		umixml.h
umixmltags.h		umixmltags.h
umixmltype.h		umixmltype.h
umixmltypeboolean.h		umixmltypeboolean.h
umixmltypeclass.h		umixmltypeclass.h
umixmltypefloat.h		umixmltypefloat.h
umixmltypeinteger.h		umixmltypeinteger.h
umixmltypeinteger32.h		umixmltypeinteger32.h
umixmltypelist.h		umixmltypelist.h
umixmltypemap.h		umixmltypemap.h
umixmltyperefclass.h		umixmltyperefclass.h
umixmltypes.h		umixmltypes.h
umixmltypestring.h		umixmltypestring.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

xml2json

Version

Requirements

Installation

Usage

Template format

Data types and attributes

Output engines

Generated Parser

Tested

LICENSE

About

Releases

Packages

Languages

License

raistmaj/xml2json

Folders and files

Latest commit

History

Repository files navigation

xml2json

Version

Requirements

Installation

Usage

Template format

Data types and attributes

Output engines

Generated Parser

Tested

LICENSE

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages