
Syrup is a simple binary way of preserving data on the wire, with perhaps a few extra calories.

ocapn/syrup

Syrup is a lightweight and easy-to-develop (and reasonably easy to read) serialization of Preserves. You can also think of it as an extension of the ideas of canonical s-expressions and Bencode.

Quick intro

For the sake of simplicity, let’s look at examples in Python.

We’ve got bytestrings and strings and integers and booleans:

>>> from syrup import syrup_encode, syrup_decode, Symbol

# bytestrings
>>> syrup_encode(b"a bytestring")
b'12:a bytestring'

# strings
>>> syrup_encode("a string")
b'8"a string'

# symbols (maybe only lispy people care)
>>> syrup_encode(Symbol('foo'))
b"3'foo"

# integers
>>> syrup_encode(42)
b'42+'
>>> syrup_encode(0)
b'0+'
>>> syrup_encode(-123)
b'123-'

# floats (single and double precision, but Python only supports double)
# (encoding looks ugly... floats are complicated, IEEE 754 bla bla)
>>> syrup_encode(123.456)
b'D@^\xdd/\x1a\x9f\xbew'

# booleans
>>> syrup_encode(True)
b't'
>>> syrup_encode(False)
b'f'
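
The float bytes above look less mysterious once you see where they come from. Here is a minimal sketch using only Python's standard `struct` module. The helper names `encode_double` and `encode_single` are made up for illustration and are not part of the `syrup` module; the `F` form is the single-precision layout listed in the pseudo-specification below.

```python
import struct

def encode_double(value: float) -> bytes:
    # 'D' tag followed by the 8-byte big-endian IEEE 754 double
    return b"D" + struct.pack(">d", value)

def encode_single(value: float) -> bytes:
    # 'F' tag followed by the 4-byte big-endian IEEE 754 single
    # (Python floats are doubles, so this may lose precision)
    return b"F" + struct.pack(">f", value)

print(encode_double(123.456))
# b'D@^\xdd/\x1a\x9f\xbew'
```

The printable characters in the output (`@`, `^`, `/`, `w`) are just bytes of the double that happen to fall in the ASCII range.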

But maybe you want to combine things. So ok, we have lists, sets, and dictionaries:

# lists
>>> syrup_encode(["foo", 123, True])
b'[3"foo123+t]'

# dictionaries
>>> syrup_encode({"species": "cat",
...               "name": "Tabatha",
...               "age": 12})
b'{3"age12+4"name7"Tabatha7"species3"cat}'

# sets
>>> syrup_encode({"cookie", "milk", "napkin"})
b'#4"milk6"cookie6"napkin$'

When reading, whitespace is ignored (but we recommend using the official Preserves syntax if you’re trying to make things human-readable).

>>> syrup_decode(b'{3"age 12+ 4"name 7"Tabatha 7"species 3"cat}')
{'age': 12, 'name': 'Tabatha', 'species': 'cat'}

Arbitrary nesting is allowed, including in keys and sets, if the platform supports it.

Sets and dictionaries are both written in sorted order: sets by their items, dictionaries by their keys. Items and keys are sorted by comparing them in serialized form. (This might make Tony Garnock-Jones say, “yuck!”) But this way, if two applications both agree to use Syrup, it is an acceptable form for canonicalization.
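
To see why "milk" comes before "cookie" in the set example, here is a small sketch of that ordering rule. The helper `encode_str` is hypothetical (it is not the library's API) and only handles strings; it assumes the sort is a plain bytewise comparison of the encoded items, as described in the pseudo-specification below.

```python
def encode_str(text: str) -> bytes:
    # <byte length>"<utf-8 bytes>
    data = text.encode("utf-8")
    return str(len(data)).encode("ascii") + b'"' + data

items = {"cookie", "milk", "napkin"}

# Encode every item first, then sort the encoded bytes.
encoded = sorted(encode_str(item) for item in items)
canonical_set = b"#" + b"".join(encoded) + b"$"

print(canonical_set)
# b'#4"milk6"cookie6"napkin$'
```

Since the length prefix is part of the compared bytes, shorter strings tend to sort first, and under a plain bytewise comparison a 10-byte string (prefix `10`) would sort before a 9-byte one (prefix `9`).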

Maybe this seems almost good enough, but you need your own types. For this purpose, Syrup also provides a Record type:

>>> from syrup import record

# We could encode a date as a single iso8601 string
>>> syrup_encode(record('isodate', '2020-05-01T14:08:11'))
b'<7"isodate19"2020-05-01T14:08:11>'

# But records permit multiple arguments,
# so we could encode it that way too
>>> syrup_encode(record('date', 2020, 5, 1, 14, 8, 11))
b'<4"date2020+5+1+14+8+11+>'
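
A record is just a label followed by its fields, wrapped in angle brackets. The following is a rough sketch of that layout, not the library's implementation: `encode_field` and `encode_record` are hypothetical helpers, and this version only handles string and integer fields.

```python
def encode_field(value) -> bytes:
    """Encode a string or integer (the only types this sketch handles)."""
    if isinstance(value, str):
        data = value.encode("utf-8")
        return str(len(data)).encode("ascii") + b'"' + data
    if isinstance(value, int) and not isinstance(value, bool):
        return str(abs(value)).encode("ascii") + (b"-" if value < 0 else b"+")
    raise TypeError(f"unsupported field type: {type(value).__name__}")

def encode_record(label, *fields) -> bytes:
    # The label is encoded like any other value, then the fields in order.
    body = b"".join(encode_field(item) for item in (label, *fields))
    return b"<" + body + b">"

print(encode_record("date", 2020, 5, 1, 14, 8, 11))
# b'<4"date2020+5+1+14+8+11+>'
```

Unlike sets and dictionaries, record fields are positional, so nothing is sorted here.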

Pseudo-specification

Here’s nearly everything you need to know, taken right from a comment in the Racket implementation:

;; Booleans: t or f
;; Single flonum: F<ieee-single-float>   (big endian)
;; Double flonum: D<ieee-double-float>   (big endian)
;; Positive integers: <int>+
;; Negative integers: <int>-
;; Bytestrings: 3:cat
;; Strings: 3"cat                        (utf-8 encoded)
;; Symbols: 3'cat                        (utf-8 encoded)
;; Dictionary: {<key1><val1><key2><val2>}
;; Lists: [<item1><item2><item3>]
;; Records: <<label><val1><val2><val3>>  (the outer <> for realsies tho)
;; Sets: #<item1><item2><item3>$

(Sorry, records look a bit confusing there, since <> are actually used rather than just a placeholder for a variable.)
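
To give a feel for how little machinery reading this format takes, here is a sketch of a decoder for a subset of the grammar above: booleans, integers, bytestrings, strings, lists, and dictionaries. It is not the decoder shipped in this repository; the function names are invented for illustration, and floats, symbols, records, and sets are left out to keep it short. It skips whitespace between values, as described earlier.

```python
def _skip_ws(data: bytes, pos: int) -> int:
    while pos < len(data) and data[pos:pos + 1].isspace():
        pos += 1
    return pos

def _read_until(data: bytes, pos: int, closer: bytes):
    """Read values until the closing delimiter; return (items, next_pos)."""
    items = []
    while True:
        pos = _skip_ws(data, pos)
        if pos >= len(data):
            raise ValueError("unterminated collection")
        if data[pos:pos + 1] == closer:
            return items, pos + 1
        item, pos = _read(data, pos)
        items.append(item)

def _read(data: bytes, pos: int):
    """Read one value starting at pos; return (value, next_pos)."""
    pos = _skip_ws(data, pos)
    ch = data[pos:pos + 1]
    if ch == b"t":
        return True, pos + 1
    if ch == b"f":
        return False, pos + 1
    if ch == b"[":
        return _read_until(data, pos + 1, b"]")
    if ch == b"{":
        flat, pos = _read_until(data, pos + 1, b"}")
        if len(flat) % 2:
            raise ValueError("dictionary has a key without a value")
        return dict(zip(flat[0::2], flat[1::2])), pos
    if ch.isdigit():
        end = pos
        while data[end:end + 1].isdigit():
            end += 1
        number = int(data[pos:end].decode("ascii"))
        tag = data[end:end + 1]
        if tag == b"+":
            return number, end + 1
        if tag == b"-":
            return -number, end + 1
        if tag in (b":", b'"'):
            payload = data[end + 1:end + 1 + number]
            if len(payload) != number:
                raise ValueError("truncated string")
            value = payload if tag == b":" else payload.decode("utf-8")
            return value, end + 1 + number
    raise ValueError(f"cannot parse value at byte offset {pos}")

def decode(data: bytes):
    value, pos = _read(data, 0)
    if _skip_ws(data, pos) != len(data):
        raise ValueError("trailing data after value")
    return value

print(decode(b'{3"age 12+ 4"name 7"Tabatha 7"species 3"cat}'))
# {'age': 12, 'name': 'Tabatha', 'species': 'cat'}
```

Every value is either a single tag byte, a delimited collection, or a run of digits followed by a tag that says what the digits mean, so one byte of lookahead after the digits is enough to dispatch.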

There’s only one other key detail. Writing out a Syrup structure should always canonicalize it. The good news: this is fairly easy to do via recursion, since only dictionaries and sets are unordered. Dictionaries are ordered by their keys, sets by their items (and dictionaries must not include the same key twice). Simply serialize each key or item first, then sort by the serialized bytes, from lowest to highest.

That’s it, really. Easy-peasy.
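
The recursion described above can be sketched in a few lines of Python. This is an illustrative encoder written for this README, not the code in impls/: the name `encode` is an assumption, and symbols, records, and single-precision floats are omitted.

```python
import struct

def _length_prefixed(data: bytes, tag: bytes) -> bytes:
    return str(len(data)).encode("ascii") + tag + data

def encode(value) -> bytes:
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return b"t" if value else b"f"
    if isinstance(value, int):
        return str(abs(value)).encode("ascii") + (b"-" if value < 0 else b"+")
    if isinstance(value, float):
        return b"D" + struct.pack(">d", value)
    if isinstance(value, bytes):
        return _length_prefixed(value, b":")
    if isinstance(value, str):
        return _length_prefixed(value.encode("utf-8"), b'"')
    if isinstance(value, (list, tuple)):
        return b"[" + b"".join(encode(item) for item in value) + b"]"
    if isinstance(value, dict):
        # Encode keys and values first, then sort the pairs by encoded key.
        pairs = [(encode(k), encode(v)) for k, v in value.items()]
        pairs.sort(key=lambda pair: pair[0])
        return b"{" + b"".join(k + v for k, v in pairs) + b"}"
    if isinstance(value, (set, frozenset)):
        # Encode items first, then sort the encoded bytes.
        return b"#" + b"".join(sorted(encode(item) for item in value)) + b"$"
    raise TypeError(f"cannot encode {type(value).__name__}")

print(encode({"species": "cat", "name": "Tabatha", "age": 12}))
# b'{3"age12+4"name7"Tabatha7"species3"cat}'
```

Because nested values are encoded before their container is sorted, inner sets and dictionaries are already canonical by the time the outer comparison happens.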

Relationship to csexps and bencode

Compared to canonical s-expressions, Syrup uses similar syntax when limited to lists and bytestrings, except that when writing it uses [] rather than (). But as you can see above, it also defines a lot of other types.

Syrup is also similar to Bencode, in that it supports integers, bytestrings, lists, and dictionaries. However, its syntax for lists and dictionaries is different when Syrup is written.

For funsies, the syrup decoders that ship in this repository will accept Bencode or canonical s-expression syntax (except for canonical s-expression “display hints” syntax, but those never made sense anyway… use records instead).

>>> syrup_decode(b'd3:agei12e4:name5:Missy7:species3:cate')
{b'age': 12, b'name': b'Missy', b'species': b'cat'}

But Syrup uses {} instead of Bencode's d and e delimiters for dictionaries when it does the encoding itself.

>>> syrup_encode({b'age': 12, b'name': b'Missy', b'species': b'cat'})
b'{3:age12+4:name5:Missy7:species3:cat}'

Implementations

Implementations in impls/ subdirectory:

External implementations:

License

Apache v2
