djson

package module

Version: v0.0.0-...-c02c5ae | Published: May 9, 2017 | License: MIT | Imports: 4 | Imported by: 10

Repository

github.com/a8m/djson

README ¶

DJSON is a JSON decoder for Go that is about 2 to 3 times faster than the standard encoding/json and the existing solutions when dealing with arbitrary JSON payloads. See the benchmarks below.
It is a good approach for people who are using json.Unmarshal together with interface{}, don't know what the schema is, and still want good performance with minimal changes.

Motivation

I was searching for a JSON parser for my projects that is faster than the standard library, performs zero reflection tests, allocates less memory, and is still safe (I didn't want the "unsafe" package in my production code just to reduce memory consumption).
I found that almost all implementations are just wrappers around the standard library and aren't fast enough for my needs.
I encountered two projects: ujson that is the UltraJSON implementation and jsonparser, that is a pretty awesome project.
ujson seems to be faster than encoding/json but still doesn't meet my requirements.
jsonparser seems to be really fast, and I even use it for some of my new projects.
However, its API is different, and I would need to change too much of my code in order to work with it.
Also, for my processing work that involves ETL, changing and setting new fields on the JSON object, I need to transform the jsonparser result to map[string]interface{} and it seems that it loses its power.

Advantages and Stability

As you can see in the benchmark below, DJSON is faster and allocates less memory than the other alternatives.
The current version is 1.0.0-alpha.1, and I'm waiting to hear about any issues or bug reports before marking it stable.
(Note: there is a test file named decode_test that contains a test case comparing the results to encoding/json. Feel free to add more values if you find them important.)
I'm also planning to add a DecodeStream(io.ReadCloser) method (or NewDecoder(io.ReadCloser)) to support stream decoding without hurting performance.

Benchmark

There are 3 benchmark types: small, medium and large payloads.
All three are taken from the jsonparser project, and they try to simulate real-life usage. The results for each benchmark type are shown in a metric table below. The lower the metrics, the better the result. Time/op is in nanoseconds, B/op is the number of bytes allocated per op, and allocs/op is the number of memory allocations per op.
Benchmark results that are better than encoding/json are marked in bold text.
The benchmark tests were run on an AWS EC2 instance (c4.xlarge). See: screenshots
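To make the three metrics concrete, here is a minimal sketch of how they can be collected with the standard testing package. It measures encoding/json only, on a made-up payload (not one of the jsonparser fixtures), and the function name benchUnmarshal is an assumption for illustration. It is not the benchmark suite used for the tables below.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// payload is a made-up sample, not one of the jsonparser fixtures.
var payload = []byte(`{"id": 1, "name": "a8m", "tags": ["x", "y"], "ok": true}`)

// benchUnmarshal decodes payload into an interface{} value b.N times.
func benchUnmarshal(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var v interface{}
		if err := json.Unmarshal(payload, &v); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	r := testing.Benchmark(benchUnmarshal)
	fmt.Printf("time/op:   %d ns\n", r.NsPerOp())
	fmt.Printf("B/op:      %d\n", r.AllocedBytesPerOp())
	fmt.Printf("allocs/op: %d\n", r.AllocsPerOp())
}
```

The absolute numbers depend on the machine and the Go version, so they are only comparable between libraries measured in the same run.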

Compared libraries:

Small payload

Each library in the test gets a small payload to process that weighs 134 bytes.
You can see the payload here, and the test screenshot here.

Library	Time/op	B/op	allocs/op
encoding/json	8646	1993	60
ugorji/go/codec	9272	4513	41
antonholmquist/jason	7336	3201	49
bitly/go-simplejson	5253	2241	36
Jeffail/gabs	4788	1409	33
mreiferson/go-ujson	3897	1393	35
a8m/djson	2534	1137	25
a8m/djson.AllocString	2195	1169	13

Medium payload

Each library in the test gets a medium payload to process that weighs 1.7KB.
You can see the payload here, and the test screenshot here.

Library	Time/op	B/op	allocs/op
encoding/json	42029	10652	218
ugorji/go/codec	65007	15267	313
antonholmquist/jason	45676	17476	224
bitly/go-simplejson	45164	17156	219
Jeffail/gabs	41045	10515	211
mreiferson/go-ujson	33213	11506	267
a8m/djson	22871	10100	195
a8m/djson.AllocString	19296	10619	87

Large payload

Each library in the test gets a large payload to process that weighs 28KB.
You can see the payload here, and the test screenshot here.

Library	Time/op	B/op	allocs/op
encoding/json	717882	212827	3247
ugorji/go/codec	1052347	239130	4426
antonholmquist/jason	751910	277931	3257
bitly/go-simplejson	753663	277628	3252
Jeffail/gabs	714304	212740	3241
mreiferson/go-ujson	599868	235789	4057
a8m/djson	437031	210997	2932
a8m/djson.AllocString	372382	214053	1413

LICENSE

MIT

Documentation ¶

Index ¶

Variables
func Decode(data []byte) (interface{}, error)
func DecodeArray(data []byte) ([]interface{}, error)
func DecodeObject(data []byte) (map[string]interface{}, error)
type Decoder
- func NewDecoder(data []byte) *Decoder
type SyntaxError
- func (e *SyntaxError) Error() string
type ValueType
- func Type(v interface{}) ValueType
- func (v ValueType) String() string

Constants ¶

This section is empty.

Variables ¶

var (
	ErrUnexpectedEOF    = &SyntaxError{"unexpected end of JSON input", -1}
	ErrInvalidHexEscape = &SyntaxError{"invalid hexadecimal escape sequence", -1}
	ErrStringEscape     = &SyntaxError{"encountered an invalid escape sequence in a string", -1}
)

Predefined errors

Functions ¶

func Decode ¶

func Decode(data []byte) (interface{}, error)

Decode parses the JSON-encoded data and returns an interface value. The interface value could be one of these:

bool, for JSON booleans
float64, for JSON numbers
string, for JSON strings
[]interface{}, for JSON arrays
map[string]interface{}, for JSON objects
nil for JSON null

Note that Decode is compatible with the following instructions:

var v interface{}
err := json.Unmarshal(data, &v)
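One practical consequence of the type list above is that every JSON number arrives as a float64, including integer-looking values such as timestamps. The sketch below uses encoding/json as the reference behavior (the one Decode is documented to be compatible with). asInt64 is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// asInt64 is a hypothetical helper: it converts a decoded JSON number to
// int64 when the float64 holds an integral value that float64 can represent
// exactly (magnitude up to 2^53).
func asInt64(v interface{}) (int64, bool) {
	f, ok := v.(float64)
	if !ok || f != math.Trunc(f) || math.Abs(f) > 1<<53 {
		return 0, false
	}
	return int64(f), true
}

func main() {
	var v interface{}
	if err := json.Unmarshal([]byte(`{"Date": 1475332371532, "Score": 99.5}`), &v); err != nil {
		panic(err)
	}
	obj := v.(map[string]interface{})

	date, ok := asInt64(obj["Date"])
	fmt.Println(date, ok) // 1475332371532 true

	_, ok = asInt64(obj["Score"])
	fmt.Println(ok) // false
}
```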

Example ¶

package main

import (
	"fmt"
	"log"

	"github.com/a8m/djson"
)

func main() {
	var data = []byte(`[
		{"Name": "Platypus", "Order": "Monotremata"},
		{"Name": "Quoll",    "Order": "Dasyuromorphia"}
	]`)

	val, err := djson.Decode(data)
	if err != nil {
		log.Fatal("error:", err)
	}

	fmt.Printf("%+v", val)

	// - Output:
	// [map[Name:Platypus Order:Monotremata] map[Name:Quoll Order:Dasyuromorphia]]
}

func DecodeArray ¶

func DecodeArray(data []byte) ([]interface{}, error)

DecodeArray is the same as Decode but it returns []interface{}. You should use it to parse JSON arrays.

Example ¶

package main

import (
	"fmt"
	"log"

	"github.com/a8m/djson"
)

func main() {
	var data = []byte(`[
		"John",
		"Dan",
		"Kory",
		"Ariel"
	]`)

	users, err := djson.DecodeArray(data)
	if err != nil {
		log.Fatal("error:", err)
	}
	for i, user := range users {
		fmt.Printf("[%d]: %v\n", i, user)
	}
}

func DecodeObject ¶

func DecodeObject(data []byte) (map[string]interface{}, error)

DecodeObject is the same as Decode but it returns map[string]interface{}. You should use it to parse JSON objects.

Example ¶

This example demonstrates the basic transformation I do on each incoming event. `lowerKeys` and `fixEncoding` are two generic methods, and they don't care about the schema. The other three (`maxMindGeo`, `dateFormat` and `refererURL`) process and extend the events dynamically based on the "APP_ID" field.

package main

import (
	"fmt"
	"log"

	"github.com/a8m/djson"
)

func main() {
	var data = []byte(`{
		"ID": 76523,
		"IP": "69.89.31.226",
		"APP_ID": "BD311",
		"Name": "Ariel",
		"Username": "a8m",
		"Score": 99,
		"Date": 1475332371532,
		"Image": {
			"Src": "images/67.png",
			"Height": 450,
			"Width":  370,
			"Alignment": "center"
		},
		"RefererURL": "https://..."
	}`)
	event, err := djson.DecodeObject(data)
	if err != nil {
		log.Fatal("error:", err)
	}

	fmt.Printf("Value: %v", event)

	// Process the event
	//
	// lowerKeys(event)
	// fixEncoding(event)
	// dateFormat(event)
	// maxMindGeo(event)
	// refererURL(event)
	//
	// pipeline.Pipe(event)
}
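The processing steps named in the comments above are not shown in the source. As an illustration of what a schema-agnostic step over a map[string]interface{} might look like, here is a hypothetical lowerKeys. The name comes from the example, but the body is an assumption and not the author's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// lowerKeys is a hypothetical version of the step named in the example:
// it lower-cases every object key in place, recursing into nested objects
// and arrays. If two keys differ only by case, one of them overwrites the other.
func lowerKeys(v interface{}) {
	switch t := v.(type) {
	case map[string]interface{}:
		// Snapshot the keys first so the map is not grown while ranging over it.
		keys := make([]string, 0, len(t))
		for k := range t {
			keys = append(keys, k)
		}
		for _, k := range keys {
			child := t[k]
			lowerKeys(child)
			if lk := strings.ToLower(k); lk != k {
				delete(t, k)
				t[lk] = child
			}
		}
	case []interface{}:
		for _, child := range t {
			lowerKeys(child)
		}
	}
}

func main() {
	event := map[string]interface{}{
		"APP_ID": "BD311",
		"Image":  map[string]interface{}{"Src": "images/67.png", "Height": 450.0},
	}
	lowerKeys(event)
	fmt.Println(event)
}
```

Because the step only relies on the dynamic types listed for Decode, it works on any decoded object without knowing its schema.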

Types ¶

type Decoder ¶

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder is the object that holds the state of the decoding.

func NewDecoder ¶

func NewDecoder(data []byte) *Decoder

NewDecoder creates a new Decoder from the JSON-encoded data.

func (*Decoder) AllocString ¶

func (d *Decoder) AllocString()

AllocString pre-allocates a string version of the data before decoding starts. It makes the decode operation faster (see below) by doing a single allocation for the bytes-to-string conversion and then slicing that string to create non-escaped strings in the "Decoder.string" method. However, a string is a read-only slice, and since each slice references the original array, the garbage collector can't release the array as long as any slice is kept around. For this reason, use this method only when the Decoder's result is "read-only" or when you only add more elements to it. See the example below.

Here are the improvements:

small payload  - about 13% faster, about 45% fewer memory allocations, but
		 about 3% more total bytes allocated

medium payload - about 16% faster, about 50% fewer memory allocations, but
		 about 5% more total bytes allocated

large payload  - about 13% faster, about 50% fewer memory allocations, but
		 about 2% more total bytes allocated

Here is an example to illustrate when you don't want to use this method:

str := fmt.Sprintf(`{"foo": "bar", "baz": "%s"}`, strings.Repeat("#", 1024*1024))
dec := djson.NewDecoder([]byte(str))
dec.AllocString()
ev, err := dec.DecodeObject()
if err != nil {
	log.Fatal("error:", err)
}

// inspect memory stats here; MemStats.Alloc ~= 1M

delete(ev, "baz") // or ev["baz"] = "qux"

// inspect memory stats again; MemStats.Alloc ~= 1M
// it means that the chunk that held the "baz" value is not freed

Example ¶

package main

import (
	"fmt"
	"log"

	"github.com/a8m/djson"
)

func main() {
	var data = []byte(`{"event_type":"click","count":"93","userid":"4234A"}`)
	dec := djson.NewDecoder(data)
	dec.AllocString()

	val, err := dec.DecodeObject()
	if err != nil {
		log.Fatal("error:", err)
	}

	fmt.Printf("Value: %+v", val)

	// - Output:
	// map[count:93 userid:4234A event_type:click]
}

func (*Decoder) Decode ¶

func (d *Decoder) Decode() (interface{}, error)

Decode parses the JSON-encoded data and returns an interface value. The interface value could be one of these:

bool, for JSON booleans
float64, for JSON numbers
string, for JSON strings
[]interface{}, for JSON arrays
map[string]interface{}, for JSON objects
nil for JSON null

Note that Decode is compatible with the following instructions:

var v interface{}
err := json.Unmarshal(data, &v)

func (*Decoder) DecodeArray ¶

func (d *Decoder) DecodeArray() ([]interface{}, error)

DecodeArray is the same as Decode but it returns []interface{}. You should use it to parse JSON arrays.

func (*Decoder) DecodeObject ¶

func (d *Decoder) DecodeObject() (map[string]interface{}, error)

DecodeObject is the same as Decode but it returns map[string]interface{}. You should use it to parse JSON objects.

type SyntaxError ¶

type SyntaxError struct {
	Offset int // error occurred after reading Offset bytes
	// contains filtered or unexported fields
}

A SyntaxError is a description of a JSON syntax error.
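Since Offset is a byte count, it is often more useful in a log message as a line and column. The following sketch shows one way to convert it. lineCol is a hypothetical helper that is not part of this package, and the offset passed in main is an assumed value chosen for illustration, not one reported by the decoder.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineCol is a hypothetical helper that converts a byte offset into a
// 1-based line and column. It reports ok=false for offsets outside data,
// such as the -1 carried by the predefined errors.
func lineCol(data []byte, offset int) (line, col int, ok bool) {
	if offset < 0 || offset > len(data) {
		return 0, 0, false
	}
	read := data[:offset]
	line = 1 + bytes.Count(read, []byte("\n"))
	// LastIndexByte returns -1 when there is no newline, which yields offset+1.
	col = offset - bytes.LastIndexByte(read, '\n')
	return line, col, true
}

func main() {
	// The comma after 1 is missing on purpose.
	data := []byte("{\n  \"a\": 1\n  \"b\": 2\n}")

	// Assumed offset: the position of the quote that opens "b".
	line, col, ok := lineCol(data, 13)
	fmt.Println(line, col, ok) // 3 3 true
}
```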

func (*SyntaxError) Error ¶

func (e *SyntaxError) Error() string

type ValueType ¶

type ValueType int

ValueType identifies the type of a parsed value.

const (
	Null ValueType = iota
	Bool
	String
	Number
	Object
	Array
	Unknown
)

func Type ¶

func Type(v interface{}) ValueType

Type returns the JSON-type of the given value
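For readers who want to see how a decoded value maps onto the kinds listed in the const block, the classification can be approximated with a type switch over the dynamic types that Decode returns. kindOf below is a hypothetical stand-in written for illustration. It is not the package's implementation, and the returned labels are assumptions, not the output of ValueType.String.

```go
package main

import "fmt"

// kindOf is a hypothetical stand-in that classifies a decoded value
// by its dynamic Go type.
func kindOf(v interface{}) string {
	switch v.(type) {
	case nil:
		return "null"
	case bool:
		return "bool"
	case string:
		return "string"
	case float64:
		return "number"
	case map[string]interface{}:
		return "object"
	case []interface{}:
		return "array"
	default:
		// Anything a JSON decoder would not produce, e.g. an int.
		return "unknown"
	}
}

func main() {
	values := []interface{}{nil, true, "a8m", 99.0, map[string]interface{}{}, []interface{}{}, 42}
	for _, v := range values {
		fmt.Printf("%v -> %s\n", v, kindOf(v))
	}
}
```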

func (ValueType) String ¶

func (v ValueType) String() string

Directories ¶

Path	Synopsis
benchmark